Yes I realize that might be the dullest blog post title I’ve ever come up with. Perhaps only to be topped by the dullest content ever…let’s see.
It’s quite typical when creating backtests to filter using some minimum measurement of trading volume. This ensures that you’re working with stocks that are liquid. Stocks that don’t have heavy trade volumes tend not to behave like your backtest software thinks they do. For very illiquid stocks, you may have to wait a long time to get your order filled, and market or stop orders may be subject to punitive bid/ask spreads. So when testing, you want to make sure your test eliminates these sketchy ne’er-do-well stocks.
Until recently I’ve been using a simple average (mean) of the volume. For example, I might require a stock to have an average 10-day volume greater than 100,000 shares. However, I recently noticed some weird outliers skewing my results.
Averages are very sensitive to ‘tail events’. If you have nine small numbers and one big number, the big number is going to skew the average in its direction in a way that you might not have intended.
For example, see the lead image above? There’s a spike of 21.8 million shares in one day’s volume, but you can see that it’s not at all typical of the stock. The 20-day average of volume is right around 1.5 million shares, yet on most days the volume is well below that number.
So I’ve started using the median value of the series instead. I’ll use an odd number for the sample size so that I get a true median, or ‘middle’, number. This gives me a value that isn’t heavily swayed by outliers, and seems more typical of the stock’s day-to-day behavior.
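Here’s a quick sketch of the idea in Python, using the standard library’s `statistics` module. The volume numbers are made up for illustration (the post itself doesn’t include code), but they follow the same shape as the chart: a handful of typical days plus one huge spike.

```python
import statistics

# Eight typical days around 200K shares, plus one 21.8M-share spike.
# Nine values total -- an odd sample size gives a true 'middle' median.
volumes = [180_000, 185_000, 190_000, 195_000,
           200_000, 205_000, 210_000, 215_000,
           21_800_000]

mean_vol = statistics.mean(volumes)      # dragged way up by the single spike
median_vol = statistics.median(volumes)  # unaffected by the outlier

min_volume = 1_000_000                   # hypothetical liquidity filter

print(f"mean:   {mean_vol:,.0f}")    # ~2.6M -- passes the filter
print(f"median: {median_vol:,.0f}")  # 200,000 -- correctly fails it
print(mean_vol >= min_volume)        # True
print(median_vol >= min_volume)      # False
```

With the mean, the one spike is enough to sneak an illiquid stock past a 1-million-share filter; the median sees right through it.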
The median volume for the stock you see above is actually around 209,000 shares, which makes a whole lot more sense. If I’d been filtering out stocks under an average of 1 million shares in volume, this stock would have made it through. But when using the median value, it would have been dropped (and rightly so).
Don’t be mean. Be median.