I’ve been reading books by Michael Halls-Moore and my head hurts. Not having any formal training in statistics, I only understand about half of the material. Nonetheless, I found his discussion of ‘correlograms’ interesting. I even installed R on my computer (even though I haven’t fully grasped Python yet!) and was able to make some correlograms with R. However, not knowing anything about R (sensing a theme here?), I thought I’d come up with my own version of a correlogram using AmiBroker and Google Sheets. A ‘redneck correlogram’, if you will.
So what is a correlogram, you ask? Here’s a link to a wiki page on the subject. My interpretation: it’s a tool to see if a time series of data (e.g. stock prices) is autocorrelated (i.e. whether price movements on later days are connected to the movement on the day in question).
For each day being examined, which I’ll call the ‘trigger day’, I first look at its directional movement from the previous day. Was it up or down? Then I look at each of the next 20 days and record whether it moved in the same direction as the trigger day or the opposite direction. If it moved the same direction, that day’s column gets a ‘1’. If not, a ‘-1’. Finally, for each of those lagged days, I take the mean of its column across all trigger days.
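In code, the steps above might look something like this minimal Python sketch. It is an illustration rather than the actual AmiBroker/Google Sheets setup: the function names are made up, and it assumes each day’s direction is its close versus the previous close, with an unchanged close treated as its own direction (so it only ‘matches’ another unchanged day).

```python
from typing import List


def directions(closes: List[float]) -> List[int]:
    """+1 for an up close, -1 for a down close, 0 for unchanged (assumption)."""
    return [(cur > prev) - (cur < prev) for prev, cur in zip(closes, closes[1:])]


def sign_correlogram(closes: List[float], max_lag: int = 20) -> List[float]:
    """Mean of +1/-1 agreement scores for lags 0..max_lag.

    Every day that still has max_lag days of data after it is a trigger day.
    """
    d = directions(closes)
    n_triggers = len(d) - max_lag
    if n_triggers <= 0:
        raise ValueError("not enough data for the requested number of lags")

    bars = []
    for lag in range(max_lag + 1):
        # +1 if the lagged day moved the same way as the trigger day, else -1
        scores = [1 if d[t + lag] == d[t] else -1 for t in range(n_triggers)]
        bars.append(sum(scores) / len(scores))
    return bars


if __name__ == "__main__":
    # Toy series that alternates up/down, so odd lags disagree and even lags agree.
    print(sign_correlogram([1, 2, 1, 2, 1, 2, 1, 2], max_lag=2))
```

Multiply each value by 100 to get the percentages used in the charts.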
I include day 0 in there, which is perfectly correlated with itself, and so will always be at 100%. A perfect anti-correlation would show a value of -100%. No correlation at all would give a value very close to 0, because the -1s and +1s would be roughly equally distributed and average out to 0. Basically, white noise. Some days will show spikes, but if a spike doesn’t poke over the 5% threshold, then it’s just random. Even some spikes barely above the threshold are going to be meaningless.
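One way to sanity-check a threshold like that: if the +1s and -1s really are independent coin flips, the mean of N of them has a standard deviation of 1/sqrt(N), so roughly 95% of pure-noise bars should land inside ±1.96/sqrt(N). Here is a rough Python sketch. The function names are hypothetical, and the 1,500 figure is an assumed round number for about six years of trading days, not a count taken from the data.

```python
import math
from typing import List


def noise_band(n_triggers: int, z: float = 1.96) -> float:
    """Approximate 95% band for the mean of n independent +1/-1 values."""
    return z / math.sqrt(n_triggers)


def significant_lags(bars: List[float], n_triggers: int) -> List[int]:
    """Lags (skipping day 0) whose bar pokes outside the noise band."""
    band = noise_band(n_triggers)
    return [lag for lag, value in enumerate(bars) if lag > 0 and abs(value) > band]


if __name__ == "__main__":
    print(round(noise_band(1500), 4))   # about 0.05 for ~1,500 trigger days
    print(round(noise_band(100), 4))    # the band widens with fewer trigger days
```

The second call is the part worth remembering: the fewer trigger days there are, the wider the band gets, so a bar that looks big on a filtered chart may still be inside the noise.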
Now throw on your data pants and let’s look at SPY from 1/1/2010 to 12/31/2015, including the 20 lagging trading days after the last trigger day.
So this is how SPY is correlated with itself. Day 0 shows 100% correlation with itself. I’ve truncated the bounds of the graph at 0.5 to show more detail in the other values. Note that in this case, day 0 (the trigger day) could be either an up or down day, and the bars are showing correlation or anti-correlation, regardless of the direction of the trigger day.
The green lines are the +/-5% mark. Day 9 is just touching that mark, but does that mean day 9 tends to go the same direction as day 0? It’s probably just a false positive. No other days are showing anything significant. Bunch of noise!
What if we filter for just up days or just down days?
The first graph is for up days only, so the trigger day of course is always positive. Look at all those correlated days down the line! But wait…the down-days graph shows significant anti-correlation across the entire lagged series too. What gives? Well, this just shows that the market had a strong tendency to be positive during 2010-2015. So in general, whether you had a down day or an up day, you were likely to have up days following it.
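A little arithmetic shows why an upward drift alone produces this picture. Suppose every day is an independent coin flip that comes up ‘up’ with probability p (ignoring unchanged days). The sketch below works out the bar height you would expect at any lag after day 0. The function names and the p = 0.55 example are hypothetical, picked only to illustrate the effect, not measured from SPY.

```python
def expected_bar(p_up: float, trigger: str) -> float:
    """Expected +1/-1 mean at any lag > 0 if days are independent.

    trigger is "up" or "down": the direction every trigger day shares.
    """
    p_same = p_up if trigger == "up" else 1.0 - p_up
    return p_same * 1 + (1.0 - p_same) * -1


def expected_bar_all_days(p_up: float) -> float:
    """Same thing with no filter: weight each case by how often it triggers."""
    return p_up * expected_bar(p_up, "up") + (1.0 - p_up) * expected_bar(p_up, "down")


if __name__ == "__main__":
    p = 0.55  # hypothetical share of up days
    print(expected_bar(p, "up"))       # positive across every lag
    print(expected_bar(p, "down"))     # equally negative across every lag
    print(expected_bar_all_days(p))    # nearly cancels out: (2p - 1) squared
```

So with no real autocorrelation at all, the up-days chart sits at about 2p - 1, the down-days chart at about -(2p - 1), and the unfiltered chart at about (2p - 1)², which is tiny. The bars that matter on a filtered chart are the ones that stand out from that baseline, not from zero.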
I however am not as interested in the day-to-day random noise of the market. I am more interested in what happens when the market goes ‘bang’. The market normally hums along as background noise until something whacks it like a child banging on pots and pans. Then prices start vibrating loudly, and the juicy trading happens.
Let’s take a look at the autocorrelation when the market closes up at least 1% over the previous day’s close:
Compare this to the more generalized ‘all up days’ correlogram from earlier. Here, day 1 is pretty ambivalent, and slightly on the negative side. This seems logical if you’re a market watcher: after a big up day, we often have a slight down day as traders take profits. Day 2 is significant but no more so than the ‘all up days’ chart. Day 4 though, now we’re poppin’ over the 15% mark. What this means is that, more often than not, if you have a >1% day with SPY, you’ll have an up day of some sort 4 days later. At least for the time period we’re looking at.
And what’s with day 20 (a month later)? Is that a fluke?
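For anyone following along in code, the ‘big move’ filter can be sketched in Python like this. It is a minimal illustration with hypothetical names, assuming the move is measured close to close and that a trigger is any day whose return passes the filter. The same function handles big down days by swapping in `lambda r: r <= -0.01`.

```python
from typing import Callable, List, Tuple


def filtered_correlogram(
    closes: List[float],
    trigger: Callable[[float], bool],
    max_lag: int = 20,
) -> Tuple[List[float], int]:
    """Correlogram over only the trigger days whose return passes `trigger`.

    Returns the bars for lags 0..max_lag and the number of trigger days used.
    """
    returns = [cur / prev - 1.0 for prev, cur in zip(closes, closes[1:])]
    signs = [(r > 0) - (r < 0) for r in returns]

    # A trigger day needs max_lag days of data after it.
    usable = len(returns) - max_lag
    triggers = [t for t in range(usable) if trigger(returns[t])]
    if not triggers:
        raise ValueError("no trigger days matched the filter")

    bars = []
    for lag in range(max_lag + 1):
        scores = [1 if signs[t + lag] == signs[t] else -1 for t in triggers]
        bars.append(sum(scores) / len(scores))
    return bars, len(triggers)


if __name__ == "__main__":
    toy = [100, 102, 101, 103, 102.5, 104, 103]
    bars, n = filtered_correlogram(toy, lambda r: r >= 0.01, max_lag=1)
    print(bars, n)
```

Returning the trigger count alongside the bars is deliberate: a 1% filter throws away most days, and it helps to know how many are left before reading much into any single bar.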
And finally, let’s look at all the down days that were more than 1% from the previous close.
As expected, most days are negatively correlated, since we saw this in the ‘all down days’ graph. However, we don’t have anything quite as strong as the ‘1% up’ chart in the early days following the trigger. And yet, day 14…that’s the value farthest from zero in the whole series of graphs.
Does this seriously mean that if we have a down day bigger than 1%, we’re more likely to have an up day 14 trading days later? Can we make a trading system on that? It seems so…unlikely.
And yet…if we plot an equity graph for day 14 gain/loss vs day 16, we see this:
Ignoring such nuisances as commissions and fees, we get the following:
| avg gain/loss | win rate |
|---|---|
| +0.0895% | 54.80% |
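Numbers like these can be computed along the following lines. This is a hedged Python sketch, not the spreadsheet that produced the table: the names are hypothetical, and it assumes the ‘trade’ is simply the close-to-close return of the lagged day (in at the prior close, out at that day’s close), with the average taken as a plain mean and no commissions or fees.

```python
from typing import Callable, List, Tuple


def lag_day_stats(
    closes: List[float],
    lag: int = 14,
    trigger: Callable[[float], bool] = lambda r: r <= -0.01,
) -> Tuple[float, float, List[float]]:
    """Average return, win rate, and equity curve for holding day `lag`
    after each trigger day (by default, a close-to-close drop of 1% or more)."""
    returns = [cur / prev - 1.0 for prev, cur in zip(closes, closes[1:])]
    trades = [returns[t + lag] for t in range(len(returns) - lag) if trigger(returns[t])]
    if not trades:
        raise ValueError("no trigger days matched the filter")

    avg_return = sum(trades) / len(trades)
    win_rate = sum(1 for r in trades if r > 0) / len(trades)

    # Equity curve: start at 1.0 and compound each trade in order.
    equity, value = [], 1.0
    for r in trades:
        value *= 1.0 + r
        equity.append(value)
    return avg_return, win_rate, equity


if __name__ == "__main__":
    toy = [100, 98, 99, 100, 97, 98]
    avg, wins, curve = lag_day_stats(toy, lag=1)
    print(f"avg {avg:.4%}, win rate {wins:.2%}, trades {len(curve)}")
```

The length of the equity list is the number of trades, which is worth printing next to the win rate.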
So I have four questions for you:
- Is this enough data?
- If it’s not enough data, then is data before 2010 still relevant?
- Is this a pattern unique to the time period, or will it persist into the future?
- Am I doing something wonky with statistics that all you quant-types know is just foolish?