Noise Kills Profits (Machine Learning with Genotick)


A reader on my blog (Thanks Kris!) suggested that I explore how much noise is needed to send Genotick off the deep end. You’ll recall from my earlier post on the subject that I was looking for hidden biases that Genotick might have, and explored how it responded to pure and noisy sine waves of data.

For those just catching up, Genotick is a free, open-source machine-learning price-prediction application created by Lukasz Wojtow. You can read more about it here. It’s still in beta but it’s a very interesting concept.

First, I generated fresh sinusoidal data in Google Sheets, this time with a different sine wave period. I created the sine wave like this:

Column A was simply a counter. It went from 1 to 2000.

Column B contained:

=sin(radians(A2*12.64))

This uses the counter in column A to advance the sine wave through time. The sine wave moves between -1 and 1.

Column C contained:

=(B2*3)+30+($E$1*rand())

I increased the amplitude by multiplying column B by 3, then offset it by 30 to make it look more like a real stock price. Cell E1 contained the noise multiplier, which scales the noise source (a random number generator producing values between 0 and 1). With a noise multiplier of 0, no noise is added and the sine wave is pure.
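For anyone who would rather script this than use a spreadsheet, here is a rough Python equivalent of the two formulas above. It is only a sketch: the function name `noisy_sine`, its parameters, and the optional `seed` are made up for illustration, and Python's random generator stands in for the spreadsheet's rand(). I'm assuming the counter starts at 1, as in column A.

```python
import math
import random


def noisy_sine(n_rows=2000, step_degrees=12.64, scale=3.0, offset=30.0,
               noise_multiplier=0.0, seed=None):
    """Rebuild column C: (sine * 3) + 30 + (noise multiplier * rand())."""
    rng = random.Random(seed)  # seed is an addition, only for repeatable runs
    prices = []
    for counter in range(1, n_rows + 1):  # column A: 1 to 2000
        pure = math.sin(math.radians(counter * step_degrees))  # column B
        noise = noise_multiplier * rng.random()  # uniform in [0, 1), scaled
        prices.append(pure * scale + offset + noise)
    return prices


clean = noisy_sine(noise_multiplier=0.0)
noisy = noisy_sine(noise_multiplier=1.0, seed=42)
print(min(clean), max(clean))
print(min(noisy), max(noisy))
```

With a multiplier of 0 the series stays between 27 and 33. Because the noise source only produces values from 0 up to 1, any nonzero multiplier pushes prices upward, never downward.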

The sine wave has an amplitude of 6 peak to peak, so divide the noise multiplier by 6 to determine the ratio of noise to signal.
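That arithmetic fits in a tiny helper. The name `noise_ratio` and the sample multipliers below are just for illustration.

```python
PEAK_TO_PEAK = 6.0  # the sine wave scaled by 3 swings from -3 to +3


def noise_ratio(noise_multiplier, peak_to_peak=PEAK_TO_PEAK):
    """Return the noise range as a fraction of the signal's peak-to-peak swing."""
    return noise_multiplier / peak_to_peak


for multiplier in (0.5, 1.0, 2.0, 3.0):
    print(f"multiplier {multiplier}: {noise_ratio(multiplier):.1%} noise")
```

So a multiplier of 3 means the noise spans half the height of the sine wave.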

My spreadsheet looks like this:

[Screenshot: the spreadsheet]

I had Genotick do ten runs every time I tested a new noise multiplier. This was a lot of work! Since I didn’t know where the noise might start interfering, I started with wide-ranging values: 0, .001, .01, .1, 1 and 10.

Here are the results of that first toe-dip into the statistical water:

[Charts: Genotick results for the first six noise multipliers]

As you can see, a value of 10 killed Genotick dead! The values between 0 and .1 all show a worst-case scenario of at least 10^7 % profit. A noise multiplier of 1 shows a worst-performer return of ‘only’ 10^5 %. But the spread of the returns is getting wider at that point as well, which tells me the noise is starting to affect the results.

Then I focused in on the range between 1 and 10. It soon became apparent that by n=3, the noise was taking over (this would be a 50% noise ratio). So how about we take a closer look at the 1-3 range?

[Charts: Genotick results for noise multipliers in the 1-3 range]

The last chart, noise=2.0, is linear scale because the results were bad enough that log scaling became meaningless. With each noise-multiplier increase of .2, the worst and best cases got lower and lower. By the time we get to a value of 1.8, we see our first loss-making run. While I can’t say there’s a specific threshold of noise that causes Genotick to become confused, I can say that it’s definitely in the 1 to 2 range (a noise ratio of roughly 17-33%).

Earlier I had done some experiments with filtering actual price data using a simple short moving average. This is like putting the price data through a lowpass filter for you audio types, and removes some of the high frequency noise. Some initial tests on price data that had previously proven stubborn showed a big improvement when filtered. The only problem is, you can’t buy and sell a moving average! You have to buy and sell on the actual price.
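A short trailing moving average might look like the sketch below. It is not the exact filter I used: the function name and the window length of 5 are assumptions for illustration. It only uses the current and earlier prices, so it never peeks ahead.

```python
def simple_moving_average(prices, window=5):
    """Trailing moving average; early points average whatever data exists so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_total = 0.0
    for i, price in enumerate(prices):
        running_total += price
        if i >= window:
            # Drop the price that just fell out of the window.
            running_total -= prices[i - window]
        count = min(i + 1, window)
        smoothed.append(running_total / count)
    return smoothed


print(simple_moving_average([1, 2, 3, 4, 5], window=3))
# prints [1.0, 1.5, 2.0, 3.0, 4.0]
```

A longer window removes more of the high-frequency noise but also lags the real price more.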

Wouldn’t it be nice if Genotick could be set to ignore the price data for its learning inputs, and only look at other columns of data such as filtered price information? It would still use actual price data for the trade, but it would ignore it for the trading algorithm. I’ve put in a request for that feature.

Meanwhile, there may be something of a workaround. That’s my next step: feeding Genotick filtered data to see if it can successfully ignore the noise.

5 thoughts on “Noise Kills Profits (Machine Learning with Genotick)”

    1. Hi Ashu and thanks for your comment. I know what a box plot is, but not sure what you’re suggesting. Can you elaborate? Thanks.

  1. Matt,

    What was your final verdict on Genotick? Can it be used to trade and make money, or do you fear that it gets confused too easily?

    1. I was never confident that the results could be relied upon to trade with. I do know the developer continues to develop it, so I can’t speak to its current value to a trader. It seems like a good concept, so don’t write it off on my account!
