Machine Learning and Mechanical Trading with Genotick

[Figure: Genotick equity curve for IBM stock, 2004-2006, vs. buy-and-hold]

I’ve recently been experimenting with Genotick, which is open-source Java software that attempts to discover mechanical trading systems through the use of machine learning. You can run it on just about any Mac/Windows/Linux system (although you may have additional hurdles to get Java 8 working at the command-line level on a Mac). Thousands of tiny programs create random rules to predict the next day’s market move. The ones that have a better success rate are kept, and the ones that suck are booted to the curb. Every day the ‘robots’ all take a vote for up or down, and the majority wins. The process repeats each day, and the good ones evolve and the bad ones die out. Every time you do a new run, the robots evolve differently and you get different results.

Because it’s open source software (and is still at the very early stage of development), there’s a fair amount of heavy lifting on the user’s part to get this to work. I’m still wrapping my head around the Linux command-line interface and grepping the data I need out of the reports the software generates. Serves me right for just clicking on icons all these years I suppose. No matter, I’ve been able to get some results out of it. Not results I would trade, mind you, but this is more about the intellectual exercise at this point. The software does hold promise though!

The lead image shows an equity curve of IBM stock from 2004-2006 inclusive, vs. buy-and-hold of the same stock. No commissions deducted (which would be substantial, since this is a daily trade), and the total account is invested each trade. I fed Genotick the open/high/low/close/volume (OHLCV) data for each period and let it go to work. About an hour later, I had results.

As you can see, where owning IBM through this period would have left your account roughly where you started, the Genotick system would have been up about 18%. The bad news is that it would have been in drawdown from a peak gain of over 40%. It appears as if the software was no longer able to exploit an ‘edge’ after about the halfway point.

The funny thing about human nature is that if you don’t know why something like this works, it would be very difficult to keep trading when the results started to go poorly. Would Genotick have turned things around? No idea. One part of any trading plan using (a future version of) Genotick would be a method to check whether your trading system was still working.

As a simple measure of ‘system health’ I tried computing a 10-day moving average of the system’s ‘hit rate’. It varies quite a bit, but there’s a definite downward trend in the success rate over the run.
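Here is a minimal Python sketch of that hit-rate calculation. It is not the spreadsheet or script used for the chart; the function name, the +1/-1 encoding of up and down days, and the sample data are all assumptions made for illustration.

```python
def rolling_hit_rate(predictions, actual_moves, window=10):
    """Fraction of correct up/down calls over each trailing `window` days.

    Assumes both lists use the same encoding (here +1 = up, -1 = down).
    Returns one value per day once a full window is available.
    """
    hits = [1 if p == a else 0 for p, a in zip(predictions, actual_moves)]
    rates = []
    for i in range(window - 1, len(hits)):
        rates.append(sum(hits[i - window + 1 : i + 1]) / window)
    return rates


# Hypothetical twelve days of predictions and actual market moves.
predictions = [1, 1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1]
actual_moves = [1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1]

rates = rolling_hit_rate(predictions, actual_moves)
print(rates)
```

A steady slide in these values over a run is the kind of downward trend described above.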

[Figure: 10-day moving average of the system’s hit rate over the run]

The next run could have a completely different equity curve, and a correspondingly different hit-rate profile. As I move forward, I’ll be comparing multiple runs and incorporating more data besides OHLCV data for the ticker. But it’s a fun first step.

AmiBroker Code for the Breadth Indicator

[Figure: the 30% quarterly breadth indicator plotted in AmiBroker]

As per request, I’m including the AmiBroker code for the “30% up/down last quarter in the Russell 3000 index” indicator. I REALLY need to come up with a better name for it than that. How about the Haines Breadth Indicator? No, that’s stupid. Magic Matt’s Mystical Meter? Uh…sure.

It’s a two-step process. You must do a scan every day, or as frequently as you want accurate data. It doesn’t hurt if you’re away on vacation and miss a few days, as AmiBroker will fill in the gaps when you return. The scan counts up all the stocks meeting your criteria, and writes these to separate ‘composite’ files that are then read back in by the indicator. You use this with the ‘explore’ process.


AddToComposite(C/LLV(C,60)>=1.3 AND xx,"~POS30QTR","X",19);
AddToComposite(C/HHV(C,60)<=.7 AND xx,"~NEG30QTR","X",19);

// above is the breadth measurement that showed the most promise.
//You could comment out all the other indicators and
//possibly speed things up.

AddToComposite(C/Ref(C,-1)>=1.04 AND xx,"~POS4DAILY","X",19);
AddToComposite(C/Ref(C,-1)<=.96 AND xx,"~NEG4DAILY","X",19);
AddToComposite(C/LLV(C,20)>=1.2 AND xx,"~POS20MO","X",19);
AddToComposite(C/HHV(C,20)<=.8 AND xx,"~NEG20MO","X",19);
AddToComposite(L==LLV(L,60) AND xx,"~NEWLOW60","X",19);
AddToComposite(H==HHV(H,60) AND xx,"~NEWHI60","X",19);

AddToComposite(myrsi<10 AND xx,"~LO_RSI_COMP","X",19);
AddToComposite(myrsi>90 AND xx,"~HI_RSI_COMP","X",19);

AddToComposite(rsi14<30 AND xx,"~LO_RSI14_COMP","X",19);
AddToComposite(rsi14>70 AND xx,"~HI_RSI14_COMP","X",19);

AddToComposite(C>BBandTop(C) AND xx,"~HI_BB_COMP","X",19);
AddToComposite(C<BBandBot(C) AND xx,"~LO_BB_COMP","X",19);



Filter= IsIndexConstituent("$RUA");
AddColumn(pos4d,"+4% today",1);
AddColumn(neg4d,"-4% today",1);
AddColumn(pos30q,"+30% qtr",1);
AddColumn(neg30q,"-30% qtr",1);
AddColumn(pos20m,"+20% month",1);
AddColumn(neg20m,"-20% month",1);
AddColumn(spyop,"SPY Open",1.2);
AddColumn(spycl,"SPY Close",1.2);
AddColumn(newlow,"New 60d Low",1.0);
AddColumn(newhi,"New 60d Hi",1.0);
AddColumn(pos30q/(pos30q+neg30q)*100,"30q dif",1.2);
AddColumn(newhi/(newhi+newlow)*100,"hi-lo dif",1.2);

It occurs to me that I don’t know if the first line….


…works for all types of data sources. If you don’t get any results, change xx to equal “1” and use AmiBroker’s filter to select current Russell 3000 members.

I run this on a historical set of the Russell 3000 constituents, so it takes a while on my aging laptop. You could also just set AmiBroker’s filter (not the one in the code, but the one used when running an ‘explore’ job) to the current list of Russell 3000 members. Just be aware that your historical data will be subject to survivorship bias. This won’t have any effect on your current readings however…it just means your data will get less reliable as you look back in time.

If you don’t run the composite code regularly, you won’t have any recent data for your indicator to display. It won’t break, but it won’t be valid either.

Here’s the code for the diffusion indicator. It’s actually generic in the sense that you can compare any two composites (or tickers) you want. It defaults, however, to the file name used for the “30% quarterly” composite index. It also defaults to a 10-day requirement for a signal to be given.

thresh1=Param("buy thresh",75,1,99,1); 
thresh2=Param("sell thresh",75,1,99,1); 
Ticker1 = ParamStr("Symbol1", "~POS30QTR" ); 
Ticker2 = ParamStr("Symbol2", "~NEG30QTR" ); 
flgrng=Param("threshold count",10,1,50,1); 


Using Market Breadth to Gauge Market Health (Conclusion)


Let’s wrap this up! We established a baseline using a moving-average system on the price of SPY to determine when we enter and exit the market. Then we tested a variety of breadth indicators, using the diffusion calculation and requiring entries and exits to have ten days above or below the threshold before acting.

Our grand prize winner used a breadth indicator that counted all the stocks that were up at least 30% in the last 60 trading days, versus all the ones that were down at least 30% in the same period (based on the Russell 3000 index members). All results below are for the out-of-sample period of 2013-2015, except the last line (2000-2015, which is mostly OOS).

[Figure: results table comparing the baseline and the breadth indicator systems]

* The 4% Daily system doesn’t trade when ten days of signal are required. Removing that requirement gave us frequent trading, but the results were worth reporting.

I’ve highlighted the winner of each row above in green.

I tried some other breadth indicators that weren’t worth devoting a whole blog post to, because they were epic failures. For completeness’ sake, I’ll mention them here:

• Bollinger bands. This one surprised me, because I thought for sure it would be useful. The breadth indicator determines the lower and upper Bollinger Bands for each member of the Russell 3000 (with 15 periods and 2 standard deviations). I optimized for the thresholds and got 30 as a good number for each. However in the out-of-sample period, the system never exited a trade. I therefore pronounced it useless. You might find something different though.

• “20-monthly”. This one is similar to our winning system, so I’m surprised it didn’t show better performance. You record all the stocks that are up 20% or more in the last 20 trading days (approximately one month), and the ones that are down 20% or more, and do the diffusion calculation like all the rest. I got an optimized threshold of 20 for entrance and exit, but in the out-of-sample data, it never traded.

• RSI(2). This didn’t trade, much like the “4% Daily” system when using the 10-day signal requirement. It’s such a short-term indicator, it rarely stays below or above a threshold for 10 straight days.

Also, I initially started out this quest by calculating breadth diffusion values and then using a moving average of those values to determine buy and sell signals. The results were ok, but I realized one important fact: a moving average eventually catches up with your indicator if it’s range-bound. If the indicator goes from 0 to 100, and pegs 100 for ten straight days, then your 10-day moving average will also have a value of 100. If your indicator then dips to 99 (below its moving average), does that mean bad news and it’s time to sell? Probably not! During the big run up of 2013, these indicators that were compared to their moving averages all stopped trading at the worst possible times (i.e. when the markets took a tiny dip in an otherwise booming run). The simpler static thresholds proved to have better results.
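To make the catch-up problem concrete, here is a small Python sketch with made-up breadth readings. The readings, the helper name, and the static exit level of 75 are assumptions for illustration only.

```python
def simple_moving_average(values, window):
    """Trailing simple moving average, one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical breadth readings: a climb, ten days pegged at 100, then a dip to 99.
readings = [60, 80] + [100] * 10 + [99]
ma = simple_moving_average(readings, window=10)

last_reading = readings[-1]
ma_sell_signal = last_reading < ma[-1]   # indicator crossed below its own average
static_sell_signal = last_reading < 75   # hypothetical static exit threshold

print(ma[-2], ma[-1], ma_sell_signal, static_sell_signal)
```

The moving-average rule fires on a one-point dip from a pegged reading, while the static threshold stays quiet.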

This process gave me some ideas for future examination, and I’ll outline them here:

• There may be added benefit in examining the under- and over-achieving stocks independently. For example, can an increasing number of stocks making new 60-period lows be a better signal for exit, regardless of what the new-high stocks are doing? This is simply a matter of calculating the number of new-low stocks and dividing by the number of stocks in the index. Or conversely, perhaps an exit signal could be derived from the number of new-high stocks falling? This could be checked for all these types of breadth indicators.

• Mean reversion, baby! Some of these systems showed interesting behavior when reversing the signals and viewing them as short-term mean reversion strategies. I discovered them while working on this project. I have not yet compared them to a strict price-based baseline though (short-period MA or RSI(2) on the price of SPY perhaps). If they look like they’ll beat the best baseline I can find, I’ll post about them.
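The first idea in the list above (new lows as a share of the whole index) could be sketched in Python roughly as follows. This is untested as a trading signal; the function name, the dictionary layout, and the sample prices are assumptions, and the new-low test mirrors the `L==LLV(L,60)` condition from the AmiBroker scan code.

```python
def new_low_fraction(lows_by_ticker, lookback=60):
    """Share of index members whose latest low is the lowest of the last `lookback` bars.

    `lows_by_ticker` maps a ticker to its list of daily lows, oldest first.
    Returns None for an empty universe (an assumption of this sketch).
    """
    members = len(lows_by_ticker)
    if members == 0:
        return None
    new_lows = sum(
        1
        for lows in lows_by_ticker.values()
        if lows and lows[-1] == min(lows[-lookback:])
    )
    return new_lows / members


# Hypothetical four-stock universe with very short price histories.
universe = {
    "AAA": [10.0, 9.5, 9.0],    # latest bar is a new low
    "BBB": [20.0, 21.0, 22.0],
    "CCC": [5.0, 4.0, 4.5],
    "DDD": [7.0, 7.5, 7.2],
}
fraction = new_low_fraction(universe)
print(fraction)
```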



Using Market Breadth To Gauge Market Health (Part 5)

Alaskan otter.

This is part 5 of a multi-part series examining the use of market breadth indicators to judge the state of the market. For an overview of what I’m doing, you’d best start here so you can catch up:


And oh yeah, we finally have an indicator that beats our baseline! Just coincidence that I left this one until the end? Perhaps…

This next market breadth indicator counts all the stocks in the Russell 3000 that are up at least 30% over the last quarter (60 trading days), as well as all the stocks that are down at least 30% for the last 60 days. I’m using historical constituents of the Russell 3000 including delisted stocks to avoid survivorship bias. I am optimizing over the period of 2010-2012 inclusive, and our out-of-sample data is 2013-2015 inclusive (well, almost to the end of the year). I also take a look at a wider period of 2000-2015 to see if the system holds up.

We calculate diffusion as before:

dif30qtr = total_up_30 / ( total_up_30 + total_down_30 ) * 100

I’m multiplying by 100 to give it a percent-y feel, but that part isn’t strictly necessary.
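For readers who don’t use AmiBroker, here is a rough Python sketch of the same calculation. The up/down tests mirror the two `AddToComposite` conditions in the AmiBroker code post (`C/LLV(C,60)>=1.3` and `C/HHV(C,60)<=.7`); the function names, the sample prices, and the choice to return `None` when no stock qualifies on either side are assumptions.

```python
def is_up_30(closes, lookback=60):
    """Latest close is at least 30% above the lowest close of the last `lookback` bars."""
    return closes[-1] / min(closes[-lookback:]) >= 1.3


def is_down_30(closes, lookback=60):
    """Latest close is at least 30% below the highest close of the last `lookback` bars."""
    return closes[-1] / max(closes[-lookback:]) <= 0.7


def diffusion(total_up, total_down):
    """up / (up + down) * 100, or None when neither side has any members."""
    total = total_up + total_down
    if total == 0:
        return None
    return total_up / total * 100


# Hypothetical five-stock universe (closing prices, oldest first).
universe = {
    "AAA": [10.0, 11.0, 13.5],
    "BBB": [20.0, 18.0, 13.0],
    "CCC": [50.0, 52.0, 51.0],
    "DDD": [5.0, 6.0, 7.0],
    "EEE": [8.0, 9.0, 12.0],
}
total_up_30 = sum(is_up_30(c) for c in universe.values())
total_down_30 = sum(is_down_30(c) for c in universe.values())
dif30qtr = diffusion(total_up_30, total_down_30)
print(total_up_30, total_down_30, dif30qtr)
```

Note that a very volatile stock can satisfy both tests at once, in which case it is counted on both sides, just as it would be by the two separate composites.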

Now we need a single threshold for entry and exit as a starting point. And we will require that there be ten days of ‘signal’ (dif30qtr must be above or below the threshold for 10 trading days in a row before action is taken). We buy when we get ten days above the threshold, and sell when we have ten days below. If the indicator thrashes around above and below the threshold without giving 10 days on either side, we maintain the status quo.
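The ten-day rule is easy to get subtly wrong, so here is one way to express it in Python. This is a sketch, not the AmiBroker formula; it assumes we start out of the market, that a reading exactly on a threshold counts for neither side, and the sample readings are invented.

```python
def breadth_positions(readings, entry=75, exit_level=75, days=10):
    """Return one boolean per day: True if we are in the market after that day's reading."""
    in_market = False          # assumption: start flat
    above_run = below_run = 0  # consecutive days above entry / below exit
    positions = []
    for value in readings:
        above_run = above_run + 1 if value > entry else 0
        below_run = below_run + 1 if value < exit_level else 0
        if not in_market and above_run >= days:
            in_market = True
        elif in_market and below_run >= days:
            in_market = False
        positions.append(in_market)
    return positions


# Ten days above 75, six days of thrashing, then ten days below 75.
readings = [80] * 10 + [70, 80] * 3 + [70] * 10
positions = breadth_positions(readings)
print(positions.index(True), len(positions) - 1 - positions[::-1].index(True))
```

In this example we enter on the tenth day above the threshold, sit through the thrashing unchanged, and exit only on the tenth consecutive day below.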

Below are the optimization results for 2010-2012.

[Figure: 2010-2012 optimization results for a single entry/exit threshold, 30% quarterly indicator]

Thresholds at 10 or below never exited the market, so they were not suitable for a market-health indicator. Fortunately there’s a nice plateau of results in the 65-80 range. The best result was 75, which fortunately is right in the middle of the plateau as well. This means we can have greater confidence that the best value will hold up in out-of-sample testing.

Next we see if there is an exit threshold lower than 75 that increases performance:

[Figure: 2010-2012 optimization results for the exit threshold, with the entry threshold fixed at 75]

Nope! Looks like 75 works best for an exit as well. Now let’s take a look at our equity curve and results for a 75 in/out threshold in our in-sample testing:

[Figure: equity curve and results for the 75/75 thresholds, 2010-2012]

Five trades, three of which were winners. Average number of days per trade is 80 days. We are in the market just over half the time. CAR/MDD is 0.50.
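CAR/MDD is the compound annual return divided by the maximum drawdown, so higher is better. AmiBroker reports it for you; the Python below is only a sketch of the arithmetic, with hypothetical function names and a made-up equity curve, and it may differ in detail from how AmiBroker computes its own figure.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def compound_annual_return(equity, years):
    """Annualized growth rate from the first to the last equity value."""
    return (equity[-1] / equity[0]) ** (1 / years) - 1


def car_mdd(equity, years):
    """Compound annual return divided by maximum drawdown."""
    mdd = max_drawdown(equity)
    if mdd == 0:
        return float("inf")  # assumption: no drawdown at all
    return compound_annual_return(equity, years) / mdd


# Hypothetical equity curve spanning two years.
equity = [100.0, 120.0, 90.0, 121.0]
ratio = car_mdd(equity, years=2)
print(round(ratio, 2))
```

Here the account grows 10% a year but suffers a 25% drawdown along the way, for a ratio of about 0.4.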

And our out-of-sample testing…

[Figure: equity curve and results for the 75/75 thresholds, 2013-2015]

On the down side, it kept us out of 2014 for much of the time, even though the year was an up year. However it also kept us out of almost all of 2015, which in my book is a good thing. Total number of trades was 4, of which 3 were winners. The average trade lasted 83 bars, although that’s very skewed since one was a very long one and the others were very short. We were in the market just 44% of the time. CAR/MDD was…wait for it…1.05! Which handily beats our baseline moving-average indicator (which was 0.69 for the OOS period).

Our longer period of 2000-2015 looks like this:

[Figure: equity curve for the 75/75 thresholds, 2000-2015]

Note how well this indicator missed the 2008 crash.

This “30% up/down over 60 trading days” indicator really shines through, beating not only our baseline but all the other breadth indicators we tried as well. Here’s how it looks in action, using AmiBroker.

[Figure: the 30% quarterly breadth indicator in AmiBroker]

The blue line is the breadth reading. Red bars mean the previous ten bars were below threshold, green means the previous ten bars were above threshold, and no color means the breadth indicator was wobbling above and below the threshold line (which is also green).

There sure is a lot of red.

I like how this indicator said “get the heck out!” long before the market fell off a cliff in August. Coulda saved me some money!

Next post will wrap everything up. I’ll compare the breadth indicators side by side, talk briefly about some other breadth types I tested that were complete failures, and hint at some other ways to use breadth indicators.




Using Market Breadth to Gauge Market Health (part 4)

Arroyo Verde in Ventura, CA USA, May 2007. "Two Trees" in the background. Taken from the eastern rim of the canyon.

Welcome to Part 4 of this series. We’re still trying to find a market breadth indicator that gives a better health assessment than using a simple moving average on SPY. For a description of what the heck I’m doing, please go back and read the first post (and the subsequent ones too):

Using Market Breadth to Gauge Market Health (part 1)

Back when momentum and dinosaurs ruled the earth (instead of our mean-reverting robot overlords, all praise be to them), traders would often use the Relative Strength Index (RSI) to gauge the performance of a stock. RSI is calculated by adding up some numbers, and I think there might be some division involved too…oh go look it up, this ain’t no site for baby traders.

Traditionally one would select an RSI with a period of 14. Why? Because “14” is twice as lucky as “7”, that’s why. And then a stock was considered to have great positive momentum when it hit 70, and strong downward momentum when it hit 30. Unless you thought it was over- or undersold, in which case it meant something completely different.

We can also measure overall market breadth by using the RSI values of individual stocks. Count up the number of all the stocks (historical constituents of the Russell 3000 index, in my case) with their RSI(14) values currently above 70, and also all the stocks that have their RSI(14) level below 30. Compute the diffusion so that there’s no confusion:

breadth = total_above_70 / (total_above_70 + total_below_30) * 100

The ” * 100 ” is there just so it feels more like a percentage to me. Completely optional.
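If you’d rather see it than look it up, here is a Python sketch of a standard Wilder-smoothed RSI followed by the breadth count. AmiBroker’s built-in RSI may differ in detail; the function names, the sample price series, and the choice to return 100 when there are no losses at all are assumptions.

```python
def rsi(closes, period=14):
    """Wilder-smoothed RSI of the latest bar; None if there is too little data."""
    if len(closes) < period + 1:
        return None
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_loss == 0:
        return 100.0  # assumption: no losses in the window reads as 100
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def rsi_breadth(rsi_values, high=70, low=30):
    """Diffusion of stocks with RSI above `high` versus below `low`."""
    above = sum(1 for r in rsi_values if r is not None and r > high)
    below = sum(1 for r in rsi_values if r is not None and r < low)
    if above + below == 0:
        return None
    return above / (above + below) * 100


# Hypothetical universe: three steady risers, one steady faller, one choppy stock.
rising = [float(p) for p in range(10, 30)]
falling = [float(p) for p in range(40, 20, -1)]
choppy = [10.0, 11.0] * 10
universe = [rising, [p * 2 for p in rising], [p + 5 for p in rising], falling, choppy]
breadth = rsi_breadth([rsi(closes) for closes in universe])
print(breadth)
```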

RSI(14) is a slow-moving indicator, so this ought to give us a longer-term sense of how the market is feeling than say our previous attempt at using the up/down 4% daily breadth indicator.

We are using the period of 2010-2012 as our in-sample period for optimization, and we are using the buying and selling of SPY as a proxy for market health. As before, the breadth reading must be over its entry threshold for ten days for a trade to be triggered, and must be below the exit threshold for ten days to exit. If the breadth reading is whipsawing around the threshold, then the status quo is maintained.

So let’s go pick some entrance/exit thresholds!

As before, we are first looking for a single threshold that works best as both an exit and entry signal. After that, we look for a lower threshold that might work as a better exit signal. Doing it this way prevents the entrance threshold being lower than the exit threshold, which can result in small swing trades that don’t tell us anything about the state of the market.

[Figure: 2010-2012 optimization results for a single entry/exit threshold, RSI(14) breadth]

Before you get excited about the 5 and 10 levels, know that this simply resulted in no trading, and is equivalent to buy-and-hold. That doesn’t tell us much about the market’s health. We want to be in and out of the market at least a few times over a three year period for the breadth indicator to make any sense. Fortunately the high end of the optimization set looks promising too. It’s a little ‘peaky,’ and the drop off from 75 to 70 is very steep. It’s usually wise to pick something in the middle of a local plateau for more reliable results in out-of-sample (OOS) testing. So I went with a value of 80, which had only a tiny decrease in CAR/MDD.

Now let’s see if there’s a lower exit threshold that gives us better results:

[Figure: 2010-2012 optimization results for the exit threshold, with the entry threshold fixed at 80]

Well there’s that 75 value again, but this time there’s a much gentler slope downward. So we’ll make our exit threshold equal to 75. How did our in-sample results look?

[Figure: in-sample results for the 80 entry / 75 exit thresholds, 2010-2012]

We’re out of the market for a big chunk of the time, like 60%. Basically we take the summers off! We missed the summer ugliness of 2011 and 2012, which is a good thing. There were a total of 6 trades, averaging 51 days in length, and only two of them were winners. The losers were much shorter though, averaging a hold of 31 days rather than 91 days for the winners. CAR/MDD was 0.41.

Now for the actual results: our out-of-sample test period of 2013-2015 (or close enough to the end of 2015 anyway).

[Figure: out-of-sample results for the 80 entry / 75 exit thresholds, 2013-2015]

We captured most of the run-up in 2013, which is a Very Good Thing. We got in and out too much in 2014, and ended the year at a loss, when it was another up year for buy-and-hold. And 2015…we took most of this year off! Looking at my portfolio, sometimes I wish I’d taken 2015 off as well. So I guess that’s not unreasonable. As for the numbers:

6 trades again, and this time 4 were winners. Average trade length was 61 days. CAR/MDD was 0.58.

And for the big picture, here’s how 2000-2015 would have looked:

[Figure: results for the 80 entry / 75 exit thresholds, 2000-2015]

I look at this and I think to myself that the bear markets of 2000 and 2008 were very different beasts. Yes they were “bears,” but one was a polar bear and the other was a grizzly bear. 2000-2002 was still very painful using this breadth indicator, but look at 2008! We would have dodged a bullet/bear with this breadth indicator for sure.

Sadly, it still doesn’t beat our lowly moving-average baseline. The MA System beats this one on CAR/MDD for the OOS period, as well as total profit across all three time frames.

Surely there must be a breadth indicator that can help us? Surely not all information is contained solely in the price of the index?

Fortunately, there is. Next post, we finally get to see one that beats the pants off our MA System baseline.