Created
January 15, 2016 11:00
Thomas Wiecki Strata Hadoop World London 2016 submission -- accepted
All that glitters is not gold: Comparing backtest and out-of-sample performance of 800,000 trading algorithms
“Past performance is no guarantee of future returns.” This cautionary message will certainly match the experience of many
investors. When automated trading strategies are developed and evaluated using backtests on historical pricing data, there
is always a tendency, intentional or not, to overfit to the past. As a result, strategies that show fantastic performance on
historical data often flounder when deployed with real capital.
Quantopian is an online platform that allows users to develop, backtest, and trade algorithmic investing strategies. By
pooling all strategies developed on our platform, we constructed a huge and unique dataset of over 800,000 trading
algorithms. Although we do not have access to source code, we have returns and portfolio allocations as well as the time
each algorithm was last edited. This allows us to compare returns over the period the author had access to, and could
potentially overfit on, with returns on true out-of-sample data that has accumulated since then. In this talk I will shed
light on the prevalence of backtest overfitting and debunk several common myths in quantitative finance based on empirical
findings. Moreover, I'll show how I trained a machine learning classifier on this dataset to predict whether an algorithm
is overfit or not and how its future performance will likely unfold.
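
The in-sample versus out-of-sample comparison described above can be illustrated with a minimal sketch. It is not the analysis from the talk: the function names, the choice of the annualized Sharpe ratio as the comparison metric, the zero risk-free rate, the 252-day annualization factor, and the toy return series are all assumptions made for illustration.

```python
import math
from datetime import date
from statistics import mean, stdev

TRADING_DAYS_PER_YEAR = 252  # assumed annualization factor


def annualized_sharpe(daily_returns):
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    if len(daily_returns) < 2:
        return float("nan")
    sd = stdev(daily_returns)
    if sd == 0:
        return float("nan")
    return mean(daily_returns) / sd * math.sqrt(TRADING_DAYS_PER_YEAR)


def split_at_last_edit(dated_returns, last_edit):
    """Split (date, return) pairs at the last-edit date.

    Returns up to and including the last edit count as in-sample (the author
    could have seen them); everything after counts as out-of-sample.
    """
    in_sample = [r for d, r in dated_returns if d <= last_edit]
    out_of_sample = [r for d, r in dated_returns if d > last_edit]
    return in_sample, out_of_sample


# Hypothetical toy data, not taken from the Quantopian dataset.
toy_returns = [
    (date(2015, 1, 5), 0.010),
    (date(2015, 1, 6), 0.012),
    (date(2015, 1, 7), 0.008),
    (date(2015, 1, 8), -0.004),
    (date(2015, 1, 9), 0.002),
    (date(2015, 1, 12), -0.006),
]
last_edit = date(2015, 1, 7)

in_sample, out_of_sample = split_at_last_edit(toy_returns, last_edit)
sharpe_in = annualized_sharpe(in_sample)
sharpe_out = annualized_sharpe(out_of_sample)
print(f"in-sample Sharpe: {sharpe_in:.2f}, out-of-sample Sharpe: {sharpe_out:.2f}")
```

A large drop from the in-sample to the out-of-sample figure is the kind of gap the talk is concerned with; on a series this short the numbers themselves are meaningless and serve only to show the mechanics of the split.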