XGBoost Forex Strategy

Hello guys!

I've been developing a Forex trading strategy based on XGBoost and Lopez de Prado's techniques adapted to hourly data.

Say I have a set of features that includes raw close data, returns, shifted returns, volume, technical indicators, ratios, fundamental data, and fractionally differentiated close data. My targets are the next hour's return (for regression) and a binary class (1, -1) derived from the sign of that return:

np.where(df['return'].shift(-1) > 0, 1, -1)
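One caveat with the expression above: after shift(-1) the last row has no next-hour return, and because NaN > 0 evaluates to False, that row is silently labeled -1. A minimal sketch of building both targets while dropping that row, assuming a plain array of hourly returns (the function name and sample values are made up for illustration):

```python
import numpy as np

def make_targets(returns):
    """Build next-hour targets for rows 0..n-2; the last row has no target."""
    returns = np.asarray(returns, dtype=float)
    next_returns = returns[1:]                   # regression target
    labels = np.where(next_returns > 0, 1, -1)   # classification target
    return labels, next_returns

hourly_returns = [0.001, -0.002, 0.0005, 0.003]  # made-up sample
labels, next_returns = make_targets(hourly_returns)
print(labels.tolist())  # one label fewer than the number of input rows
```

The feature matrix would need the same trimming (dropping its last row) so that features and targets stay aligned.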

After reducing the sample with a CUSUM filter and applying various preprocessing steps such as feature reduction based on feature importance, I obtain my final dataframe.
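For readers unfamiliar with the sampling step, here is a rough sketch of a symmetric CUSUM filter in the spirit of the one described by Lopez de Prado. The post does not say which series or threshold was used, so the log-price input and the threshold value below are assumptions:

```python
def cusum_events(values, threshold):
    """Return indices where the cumulative up or down drift exceeds the threshold."""
    events = []
    s_pos, s_neg = 0.0, 0.0
    for i in range(1, len(values)):
        change = values[i] - values[i - 1]
        s_pos = max(0.0, s_pos + change)   # accumulated upward drift
        s_neg = min(0.0, s_neg + change)   # accumulated downward drift
        if s_neg < -threshold:
            s_neg = 0.0
            events.append(i)
        elif s_pos > threshold:
            s_pos = 0.0
            events.append(i)
    return events

# Made-up log prices; the threshold is an arbitrary illustration value.
log_prices = [0.0, 0.001, 0.003, 0.002, -0.002, -0.0025, -0.002]
events = cusum_events(log_prices, threshold=0.0025)
print(events)
```

Only the rows at the returned indices would be kept for training, which is how the filter reduces the sample.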

After the train/test split, I train an XGBoost regressor as the first model. Cross-validation and grid search for hyperparameter tuning lead me to the testing step and the final score.

I repeat the same process with an XGBoost classifier as second model (confirmation model).

Skipping the backtesting phase for now, I stream real-time data through a broker API and load the two saved models. The first predicts the next hour's return; the second predicts the direction class (1 for buy, -1 for sell) and its prediction probability.

So the main question: how could I implement a position sizing/risk management strategy in a live Forex environment?

My idea was: if the regressor predicts a positive return and the classifier confirms it with a buy class and a sufficiently high predict_proba (a threshold is probably needed, let's say >70%), go long with take profit = predicted return. Go short under the mirrored conditions. I'll set the stop loss at a 1:2 risk/reward ratio, so it will be predicted return / 2, or maybe use a fixed stop loss derived from the standard deviation of the last n hours (20 pips, for example).
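The rule described above can be sketched as a small function. This is only one reading of the idea; the function name, argument names, and the convention that pred_proba is the probability of the predicted class are assumptions, not part of any broker or XGBoost API:

```python
def trade_signal(pred_return, pred_class, pred_proba,
                 proba_threshold=0.7, reward_to_risk=2.0):
    """Return a dict describing the trade, or None when the models do not agree."""
    if pred_proba < proba_threshold:
        return None  # classifier not confident enough
    if pred_return > 0 and pred_class == 1:
        side = "long"
    elif pred_return < 0 and pred_class == -1:
        side = "short"
    else:
        return None  # regressor and classifier disagree
    take_profit = abs(pred_return)              # target distance, in return units
    stop_loss = take_profit / reward_to_risk    # half the target for a 1:2 risk/reward
    return {"side": side, "take_profit": take_profit, "stop_loss": stop_loss}

signal = trade_signal(pred_return=0.002, pred_class=1, pred_proba=0.75)
print(signal)
```

Distances here are expressed as fractional returns; converting them to price levels or pips depends on the pair and the entry price.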

Lots are calculated with the classic formula: lots = (net liquidation × % risk per trade) / (stop loss in pips × $ value per pip).
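As a quick sanity check of that formula, here is a minimal sketch. The parameter names are mine, and the example figure of $10 per pip per standard lot is an assumption that only holds for some pairs and account currencies:

```python
def position_size_lots(net_liquidation, risk_fraction, stop_loss_pips, pip_value_per_lot):
    """Lots such that hitting the stop loses roughly risk_fraction of the account."""
    if stop_loss_pips <= 0 or pip_value_per_lot <= 0:
        raise ValueError("stop_loss_pips and pip_value_per_lot must be positive")
    risk_amount = net_liquidation * risk_fraction           # money at risk per trade
    return risk_amount / (stop_loss_pips * pip_value_per_lot)

# $10,000 account, 1% risk per trade, 20-pip stop, assumed $10 per pip per lot.
lots = position_size_lots(10_000, 0.01, 20, 10.0)
print(round(lots, 2))
```

With these inputs the risk budget is $100 and a one-lot position would lose $200 at the stop, so the result is half a lot.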

Please let me know if you have any ideas for improving this money management approach.

Final question: if I wanted to use only a classification model that gives me a binary prediction for the next hour, is there a method to filter out low predicted returns?

For example, if the prediction for the next hour is 1 (buy), how can I know the size of the predicted move and HOW MUCH to buy? If the predicted move is too small, the trade isn't worth taking. Maybe a multiclass label (1, 0, -1) with a threshold could solve this?
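A multiclass label of that kind could look like the sketch below. The min_move threshold is a hypothetical parameter; in practice it might be tied to spread and transaction costs, which the post does not specify:

```python
def ternary_label(next_return, min_move):
    """Label 1 / -1 only when the next-hour move exceeds min_move, else 0 (no trade)."""
    if next_return > min_move:
        return 1
    if next_return < -min_move:
        return -1
    return 0

next_returns = [0.0012, -0.0001, -0.0020, 0.0003]  # made-up sample
ternary = [ternary_label(r, min_move=0.0005) for r in next_returns]
print(ternary)
```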

Thank you very much!

> So the main question: how could I implement a position sizing/risk management strategy in a live Forex environment?

Since you have a trained classification model, you can simply size positions from the predicted probabilities. Alternatively, if you convert them to logits, log(p / (1 - p)), you can derive position sizes and stop losses from probabilistic arguments. I'm not sure the standard deviation of past prices makes sense here, since the ML model you are using assumes i.i.d. samples, but you could backtest a few variations and compare the results.
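One simple way to read "allocate via the predicted probabilities" is a linear map from the probability of an up move to a signed weight. This is an illustrative choice rather than the only option, and max_weight is a hypothetical cap:

```python
def probability_to_weight(p_up, max_weight=1.0):
    """Map P(up) in [0, 1] to a signed weight in [-max_weight, max_weight]."""
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be a probability")
    return max_weight * (2.0 * p_up - 1.0)   # 0.5 -> flat, 1.0 -> fully long, 0.0 -> fully short

for p in (0.30, 0.50, 0.55):
    print(p, round(probability_to_weight(p), 2))
```

A probability of 0.5 gives no position, and the weight grows in either direction as the classifier becomes more certain.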

> Final question: if I wanted to use only a classification model that gives me a binary prediction for next hours, is there a method to filter low predicted returns?

In theory, with the binary classification set-up you described, predictions that are "close" to 0.5 imply low predicted returns, and predictions closer to 0 or 1 imply higher predicted returns.

In practice, due to the low signal-to-noise ratio of financial data, your classifier likely won't be able to separate the classes well (a threshold of 0.7 will likely result in no trades being made at all). Using a multi-class setting with a threshold will let you filter out low predicted returns, but it complicates the task for the ML algorithm.
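Before committing to a threshold, it may help to check how many predictions would actually pass it. A small sketch, using made-up probabilities of the "up" class rather than real model output:

```python
def trade_fraction(probabilities, threshold):
    """Fraction of predictions whose confidence in either direction reaches the threshold."""
    if not probabilities:
        return 0.0
    confident = [p for p in probabilities if max(p, 1.0 - p) >= threshold]
    return len(confident) / len(probabilities)

# Made-up P(up) values clustered near 0.5, as is typical for noisy data.
p_up = [0.48, 0.52, 0.55, 0.61, 0.44, 0.58, 0.50, 0.53]
for threshold in (0.55, 0.60, 0.70):
    print(threshold, trade_fraction(p_up, threshold))
```

Running the same count on out-of-sample predictions shows how quickly the number of trades falls as the threshold rises.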

Thank you for your enlightening answer.

Let me see if I understand correctly: are the probabilities correlated with the size of the return? That is, does a higher predicted probability from the classifier correspond to a larger predicted return from the regressor?

Please show this step in straightforward code: "if you convert to logits (log p / 1-p ) you can implement position sizing/stop losses via probabilistic arguments." It's a little esoteric for me.

Adam W

Federico Juvara

I wouldn't think that a higher probability corresponds directly to a larger return - only that you are more certain that the true direction is up or down. However, assuming the two classes are somewhat separable based on X and the true mapping is approximated well by your ML model, small returns (near 0) are relatively "harder" for the classifier, so it has little predictive power there beyond the naive 0.5 case. Predictions closer to 0 or 1 suggest that the model is more certain, which means we can allocate positions based on relative certainty. You can then multiply your probabilities by the regression model's predictions to get expected returns and allocate positions based on those (or divide by the standard deviation to maximize Sharpe).
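A literal sketch of that last suggestion follows. It assumes proba_of_direction is the classifier's probability for the same direction the regressor predicts, and it uses the sample standard deviation of recent returns as the risk estimate, which is my assumption since the reply does not say which standard deviation is meant:

```python
import statistics

def confidence_weighted_score(pred_return, proba_of_direction, recent_returns=None):
    """Probability-weighted predicted return, optionally scaled by recent volatility."""
    score = proba_of_direction * pred_return
    if recent_returns is not None:
        vol = statistics.stdev(recent_returns)   # sample standard deviation
        if vol > 0:
            score /= vol                         # Sharpe-like, unitless score
    return score

recent = [0.001, -0.001, 0.002, -0.002]          # made-up recent hourly returns
raw = confidence_weighted_score(0.002, 0.6)
scaled = confidence_weighted_score(0.002, 0.6, recent)
print(raw, scaled)
```

Positions could then be allocated in proportion to the score, so a confident prediction of a small move and a hesitant prediction of a large move can end up with similar sizes.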

Logits are just log-odds. For instance, a probability of 0.55 implies odds of about 1.22:1, so you can set stop losses, take-profits, etc. using this ratio, or use it within portfolio choice models such as the Kelly criterion.
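In code, the conversion and one possible use of it could look like the sketch below. The Kelly formula shown is the standard one for a binary bet, f = p - (1 - p) / b, where b is the ratio of take-profit distance to stop-loss distance; treating a trade as such a bet, and trusting the classifier's probabilities as calibrated, are both assumptions:

```python
import math

def odds(p):
    """Odds in favor: p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds: log(p / (1 - p))."""
    return math.log(odds(p))

def kelly_fraction(p_win, payoff_ratio):
    """Kelly fraction for a binary bet that wins payoff_ratio per unit risked."""
    return p_win - (1.0 - p_win) / payoff_ratio

p = 0.55
print(round(odds(p), 2), round(logit(p), 3))
print(round(kelly_fraction(p, payoff_ratio=1.0), 3))  # take profit equal to stop loss
print(round(kelly_fraction(p, payoff_ratio=2.0), 3))  # take profit twice the stop loss
```

A negative Kelly fraction means the bet has negative expectation under these inputs and should be skipped; many practitioners also trade only a fraction of the Kelly size because the probabilities are estimates.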

Really exhaustive, thank you very much!
