
Non-linear machine learning classifiers outperform manual backtesting due to overfitting

An interesting paper published recently concludes that, judged by out-of-sample performance, most traders produce poor backtest optimizations compared to suitably trained machine learning classifiers.

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2745220

I'm sure I'm not alone in developing too-good-to-be-true algorithms that turn out to be just that in production trading. I notice there is an AWS ML classifier service, and I'm wondering whether it could be suitably adapted as a backtest classifier/optimizer. In general, does anyone with experience in this area have an estimate of the time investment required to produce an ML classifier model suitable for backtesting? What level of re-use across an entire portfolio could be achieved? Would it require a customized model for each algorithm, for instance?
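The too-good-to-be-true effect can be reproduced with pure noise. The following is a minimal sketch, not taken from the paper: it generates strategies with zero true edge, keeps whichever has the best in-sample Sharpe, and then scores that same strategy on fresh data. The function names, the Gaussian return model, and the 250-day window are all illustrative assumptions.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (mean / sample std); risk-free rate assumed zero."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def best_of_n(n_trials, n_days=250, seed=0):
    """Simulate n_trials zero-edge strategies (hypothetical Gaussian returns),
    select the one with the best in-sample Sharpe, and return that strategy's
    (in-sample Sharpe, out-of-sample Sharpe)."""
    rng = random.Random(seed)
    best_in, best_out = float("-inf"), 0.0
    for _ in range(n_trials):
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        s = sharpe(in_sample)
        if s > best_in:
            best_in, best_out = s, sharpe(out_sample)
    return best_in, best_out


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        s_in, s_out = best_of_n(n)
        print(f"{n:>5} trials: in-sample {s_in:+.3f}, out-of-sample {s_out:+.3f}")
```

The selected in-sample Sharpe can only go up as more candidates are tried, while the out-of-sample figure for the winner is just another draw from noise. Manual parameter tweaking is a slower version of the same selection loop.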

I posted a similar thread recently, and am in agreement with your opening statement; the main problem we face is how to ensure that an algorithm's performance isn't in fantasy land.

On some of your particular questions: retraining a classifier on a different training set isn't that difficult or work-intensive, assuming you've done it once already, but hyperparameter selection might be. Feature engineering, on the other hand, is work-intensive.
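To illustrate the difference, here is a rough sketch using a toy k-nearest-neighbour classifier on synthetic data (everything here is hypothetical, and kNN is chosen only because it fits in a few lines). "Retraining" amounts to swapping in a new training set, whereas choosing the hyperparameter k means re-scoring every candidate on held-out data.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, test, k):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def make_data(n, rng):
    """Synthetic data: label is 1 when a noisy sum of two features is positive."""
    data = []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        y = 1 if x[0] + x[1] + rng.gauss(0, 0.5) > 0 else 0
        data.append((x, y))
    return data


def select_k(train, valid, candidates):
    """Score each candidate k on the validation set; return (best_k, scores)."""
    scores = {k: accuracy(train, valid, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


rng = random.Random(42)
train, valid = make_data(200, rng), make_data(100, rng)
# Odd candidate values avoid tied votes in the binary case.
best_k, scores = select_k(train, valid, candidates=(1, 5, 15, 45))
print(best_k, scores)
```

With a real model the candidate grid usually spans several hyperparameters at once, which is where the extra work comes from. The validation set used for selection should also be kept separate from whatever data is used for the final out-of-sample check.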

I expect the primary problem with using AWS ML is that you can't stream QC data out of QC: it isn't allowed, and it wouldn't be very practical either. There's the option of acquiring data outside QC, of course, and I've done this for some daily algos I had previously (the reason I started with QC was as an execution interface for those). Sending signals to QC for daily ticks obviously isn't a problem; second-resolution ticks, on the other hand, would be problematic. Intraday data acquisition is also generally more difficult, which is another reason I'm using QC.

When it comes to the ML capability AWS has, I'd expect it to be sufficient for some rather good results. I'm basing that on people achieving alpha with far less than the algos AWS ML offers. Disclaimer: I do not have a trading system running live at the moment, nor have I used AWS ML for this, so my opinion on whether "it could be made to work" is comparatively speculative.

Might have more comments after I've had time to read that particular paper.


My lack of specialist knowledge in ML will be evident here, so I'm not sure whether the APIs have evolved to the low level I have in mind. I'm mostly just interested in backtest parameter optimization rather than alpha discovery and data mining.

Anyway, thinking at a high level, you might be able to stream data out using custom data requests. This is intended for pulling data in, but there's nothing stopping you passing the parameters of your choice to a remote optimization service, unless the right data is not exposed in the scope of the subscription in the first place. Maybe this is what you've been doing for your daily algos.

I'm mostly thinking about this in terms of being put together through the LEAN backend and then possibly adapted for use within QC, so most of these kinds of barriers could be overcome.

Do you think the bar is too high with AWS ML and there are adequate "roll-your-own" options? Could this all be run on a desktop or micro instance?


@James, "there's nothing stopping you passing the parameters of your choice" - exporting data is against the terms of service. We are working on a way to have AI-parameter optimization -- when it's available we'll announce it on the forums.


The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.


OK great, I will be following the progress of this feature.


I was inspired by the discussion in this thread and so decided to implement an optimizer to my liking. Details here:

https://www.quantconnect.com/forum/discussion/454/strategy-optimization

I'm able to use this to get useful optimization results but I'm still toying with different configurations. I find working with this is preferable to manually tweaking backtest parameters.


After some time to digest this paper, I see it makes a few complementary assertions beyond the main finding:

  • Backtests with high volatility suffer more from overfitting.
  • There is a positive correlation between Sharpe shortfall and amount of backtesting. More backtesting increases the inflation of Sharpe.
  • There may be a recency bias evident from higher accuracy of predictions from backtests covering just the last year.
  • Sharpe ratio alone is a mediocre predictor, but is still superior to common alternatives.
  • Several other unexpected measures were found to be effective predictors, including tail ratio, skew and kurtosis. The tail ratio was actually more predictive than Sharpe itself (in the chosen sample).
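
For anyone wanting to compute the measures listed above from a series of per-period returns, here is a minimal standard-library sketch. The definitions are assumptions on my part and should be checked against the paper: tail ratio is taken as the absolute 95th percentile over the absolute 5th percentile, skew and kurtosis use population moments, and the Sharpe ratio assumes a zero risk-free rate with 252 periods per year.

```python
import math
import statistics


def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def backtest_features(returns, periods_per_year=252):
    """Summary statistics of a return series (hypothetical helper).

    Raises ZeroDivisionError if the returns are constant or the
    5th percentile is exactly zero.
    """
    n = len(returns)
    mean = statistics.mean(returns)
    sd = statistics.pstdev(returns)  # population standard deviation
    centered = [r - mean for r in returns]
    m3 = sum(c ** 3 for c in centered) / n  # third central moment
    m4 = sum(c ** 4 for c in centered) / n  # fourth central moment
    return {
        "sharpe": mean / sd * math.sqrt(periods_per_year),
        "tail_ratio": abs(percentile(returns, 95)) / abs(percentile(returns, 5)),
        "skew": m3 / sd ** 3,
        "excess_kurtosis": m4 / sd ** 4 - 3.0,
    }


if __name__ == "__main__":
    sample = [0.012, -0.004, 0.003, -0.015, 0.007, 0.001, -0.002, 0.009]
    for name, value in backtest_features(sample).items():
        print(f"{name}: {value:.4f}")
```

A tail ratio above 1 means the best days are larger in magnitude than the worst days. These values could serve as input features to a classifier, but that step is not shown here.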

The main finding concerning the superiority of Machine Learning optimization is with respect to a Random Forest algorithm. Does anyone have experience with this? I'm also considering the merits of a naive Bayes optimizer, which I gather is fairly common in this scenario.



This discussion is closed