Non-linear machine learning classifiers outperform manual backtesting due to overfitting

An interesting paper was published recently. It concludes that most traders' backtest optimizations perform poorly compared with suitably trained machine learning classifiers, judged on out-of-sample performance.

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2745220

I'm sure I'm not alone in developing too-good-to-be-true algorithms that turn out to be just that in production trading. I notice there is an AWS ML classifier service, and I'm wondering whether it could be suitably adapted as a backtest classifier/optimizer. In general, does anyone with experience in this area have an estimate of the time investment required to produce an ML classifier model suitable for backtesting? What level of re-use across an entire portfolio could be achieved? Would it require a customized model for each algorithm, for instance?

James Smith

Petter Hansson (10.5k)

I posted a similar thread recently, and am in agreement with your opening statement; the main problem we face is how to ensure that an algorithm's performance isn't in fantasy land.

On some of your particular questions: retraining a classifier on a different training set isn't that difficult or work-intensive, assuming you've done it once already, but hyperparameter selection might be. Feature engineering, on the other hand, is.

I expect the primary problem with using AWS ML is that you can't stream QC data out of QC (it isn't allowed, and it wouldn't be very practical either). There's the option of acquiring data outside QC of course, and I've done this for some daily algos I had previously (the reason I started with QC was as an execution interface for those). Sending signals to QC for daily ticks obviously isn't a problem; second-resolution ticks, on the other hand, would be problematic. Intraday data acquisition is also generally more difficult, which is another reason I'm using QC.

When it comes to the ML capability AWS has, I'd expect it to be sufficient for some rather good results. I'm basing that on people achieving alpha with far less than the algos AWS ML offers. Disclaimer: I do not have a trading system running live at the moment, nor have I used AWS ML for this, so my opinion on whether "it could be made to work" is comparatively speculative.

Might have more comments after I've had time to read that particular paper.

James Smith (2.5k)

You'll see my lack of specialist knowledge in ML in evidence here, so I'm not sure whether the APIs have evolved to the low level I have in mind. I'm mostly just interested in backtest parameter optimization rather than alpha discovery and data mining.

Anyway, thinking at a high level, you might be able to stream data out using custom data requests. This is intended for pulling data in, but there's nothing stopping you passing the parameters of your choice to a remote optimization service, unless the right data is not exposed in the scope of the subscription in the first place. Maybe this is what you've been doing for your daily algos.

I'm mostly thinking about this in terms of it being put together through the LEAN backend and then possibly adapted for use within QC, so most of these kinds of barriers could be overcome.

Do you think the bar is too high with AWS ML and there are adequate "roll-your-own" options? Could this all be run on a desktop or micro instance?

Jared Broad (STAFF)

@James, "there's nothing stopping you passing the parameters of your choice" - exporting data is against the terms of service. We are working on a way to offer AI parameter optimization; when it's available we'll announce it on the forums.

James Smith (2.5k)

OK great, I will be following the progress of this feature.

James Smith (2.5k)

I was inspired by the discussion in this thread and so decided to implement an optimizer to my liking. Details here:

https://www.quantconnect.com/forum/discussion/454/strategy-optimization

I'm able to use this to get useful optimization results but I'm still toying with different configurations. I find working with this is preferable to manually tweaking backtest parameters.
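For readers who want a feel for the general idea, here is a minimal sketch of a parameter search that is scored on held-out data. It is not the optimizer linked above, and everything in it is an assumption made for illustration: the toy momentum rule, the function names (`sharpe`, `strategy_returns`, `optimize_with_holdout`), the 70/30 chronological split, and the synthetic random returns standing in for real market data.

```python
import math
import random

def sharpe(returns):
    """Per-period Sharpe ratio (zero risk-free rate); 0.0 if undefined."""
    n = len(returns)
    if n < 2:
        return 0.0
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0

def strategy_returns(asset_returns, lookback):
    """Toy momentum rule (hypothetical): hold the asset in period t only if
    the summed return over the previous `lookback` periods was positive."""
    out = []
    for t in range(lookback, len(asset_returns)):
        signal = sum(asset_returns[t - lookback:t]) > 0
        out.append(asset_returns[t] if signal else 0.0)
    return out

def optimize_with_holdout(asset_returns, lookbacks, train_fraction=0.7):
    """Pick the lookback with the best in-sample Sharpe, then score that
    single choice once on the held-out tail of the series."""
    split = int(len(asset_returns) * train_fraction)
    train, test = asset_returns[:split], asset_returns[split:]
    best = max(lookbacks, key=lambda lb: sharpe(strategy_returns(train, lb)))
    return {
        "lookback": best,
        "in_sample_sharpe": sharpe(strategy_returns(train, best)),
        "out_of_sample_sharpe": sharpe(strategy_returns(test, best)),
    }

# Synthetic, seeded noise: there is no real edge to find in this series.
rng = random.Random(42)
noise = [rng.gauss(0.0, 0.01) for _ in range(1000)]
result = optimize_with_holdout(noise, lookbacks=range(2, 60))
print(result)
```

Because the input is pure noise, any attractive in-sample Sharpe here comes from picking the best of many candidates, which is why the out-of-sample figure is the one worth looking at.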

James Smith (2.5k)

Having had some time to digest this paper, I see it makes a few complementary assertions beyond the main finding:

Backtests with high volatility suffer more from overfitting.
There is a positive correlation between Sharpe shortfall and amount of backtesting. More backtesting increases the inflation of Sharpe.
There may be a recency bias, evident from the higher accuracy of predictions from backtests covering just the last year.
Sharpe ratio alone is a mediocre predictor, but is still superior to common alternatives.
Several other unexpected measures were found to be effective predictors, including tail ratio, skew and kurtosis. The tail ratio was actually more predictive than Sharpe itself (in the chosen sample).

The main finding concerning the superiority of Machine Learning optimization is with respect to a Random Forest algorithm. Does anyone have experience with this? I'm also considering the merits of a naive Bayes optimizer, which I gather is fairly common in this scenario.
