Hello guys!
I've been developing a Forex trading strategy based on XGBoost and Lopez de Prado's techniques adapted to hourly data.
Let's say I have a bunch of features, including raw close data, returns, shifted returns, volume, technical indicators, ratios, fundamental data, and fractionally differentiated close data. My target labels are the next hour's returns (for regression) and a binary classification (1, -1) of those returns:
np.where(df['return'].shift(-1) > 0, 1, -1)
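One detail about the line above: `shift(-1)` leaves a NaN in the last row, and `NaN > 0` evaluates to False, so the last row silently gets the label -1. A minimal sketch of one way to build both targets and drop that row first, assuming a `return` column as in the snippet (the function name and the `target_*` column names are made up for illustration):

```python
import numpy as np
import pandas as pd

def make_labels(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with next-hour targets; the last row is dropped."""
    out = df.copy()
    # Regression target: the return realised over the next bar.
    out["target_return"] = out["return"].shift(-1)
    # The final row has no next bar, so its target is NaN; remove it.
    out = out.dropna(subset=["target_return"])
    # Classification target: 1 if the next return is positive, else -1
    # (a return of exactly zero is labelled -1, as in the original line).
    return out.assign(target_class=np.where(out["target_return"] > 0, 1, -1))

df = pd.DataFrame({"return": [0.001, -0.002, 0.0005, 0.003]})
labeled = make_labels(df)
```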
After sample reduction with a CUSUM filter and various preprocessing steps such as feature reduction/importance, I obtain my final dataframe.
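For reference, a rough sketch of a symmetric CUSUM filter in the spirit of Lopez de Prado's event sampling. This is an assumed variant that accumulates hourly returns directly; the function name and the threshold value are illustrative, not the exact code used here:

```python
import pandas as pd

def cusum_filter(returns: pd.Series, threshold: float) -> pd.Index:
    """Return the index labels where cumulative drift exceeds +/- threshold."""
    events = []
    s_pos, s_neg = 0.0, 0.0
    for ts, r in returns.dropna().items():
        # Running sums, floored/capped at zero so each side tracks only its own drift.
        s_pos = max(0.0, s_pos + r)
        s_neg = min(0.0, s_neg + r)
        if s_neg < -threshold:
            s_neg = 0.0          # reset after a downward event
            events.append(ts)
        elif s_pos > threshold:
            s_pos = 0.0          # reset after an upward event
            events.append(ts)
    return pd.Index(events)

returns = pd.Series([0.001, 0.002, 0.003, -0.001, -0.006, 0.0])
event_idx = cusum_filter(returns, threshold=0.004)
```

Only the bars in `event_idx` would be kept as training samples; a larger threshold keeps fewer samples.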
After the train/test split, I train an XGBoost regressor as the first model. Cross-validation, grid search and hyperparameter tuning lead me to the testing step and the final score.
I repeat the same process with an XGBoost classifier as second model (confirmation model).
Skipping the backtesting phase for now, I get real-time data streaming through a broker API and load the two saved models. One predicts the next hour's return; the other predicts the direction class (1 for buy, -1 for sell) along with the prediction probability.
So the main question: how could I implement a position sizing/risk management strategy in a live Forex environment?
My idea was: if the regressor predicts a positive return and the classifier confirms it with a buy class and a sufficiently high predict_proba (a threshold is probably needed, let's say >70%), go long with take profit = predicted return. Conversely, go short under the mirrored rules. I'll set the stop loss with a 1:2 risk/reward ratio, so it will be predicted return / 2, or maybe a fixed stop loss calculated from the standard deviation of the last n hours (20 pips, for example).
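A minimal sketch of this entry rule, assuming the regressor outputs a simple fractional return (0.002 = 0.2%), that `proba` is the classifier's probability for the class it predicted, and a pip size of 0.0001. All names here (`TradePlan`, `plan_trade`, the parameters) are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TradePlan:
    side: int                # 1 = long, -1 = short
    take_profit_pips: float
    stop_loss_pips: float

def plan_trade(pred_return: float, pred_class: int, proba: float, price: float,
               pip_size: float = 0.0001, proba_threshold: float = 0.70,
               reward_to_risk: float = 2.0) -> Optional[TradePlan]:
    """Return a trade plan only when both models agree with enough confidence."""
    reg_side = 1 if pred_return > 0 else (-1 if pred_return < 0 else 0)
    # No trade if the regressor is flat, the models disagree,
    # or the classifier's probability is below the threshold.
    if reg_side == 0 or reg_side != pred_class or proba < proba_threshold:
        return None
    # Take profit = predicted move, converted from a fractional return to pips.
    tp_pips = abs(pred_return) * price / pip_size
    # Stop loss = half the take-profit distance for a 1:2 risk/reward ratio.
    sl_pips = tp_pips / reward_to_risk
    return TradePlan(reg_side, tp_pips, sl_pips)

plan = plan_trade(pred_return=0.002, pred_class=1, proba=0.80, price=1.1000)
```

With these example inputs the take profit is about 22 pips and the stop about 11 pips. Swapping the stop for a volatility-based distance only changes the `sl_pips` line.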
Lots are calculated with the classical formula: (net liquidation × % risk per trade) / (stop loss in pips × $ value per pip)
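The formula above as a small function, assuming the stop loss is expressed in pips and the pip value is quoted per one standard lot in the account currency. The function name, the parameter names and the 0.01 lot step are illustrative assumptions; the real lot step depends on the broker:

```python
import math

def lot_size(net_liquidation: float, risk_fraction: float,
             stop_loss_pips: float, pip_value_per_lot: float,
             lot_step: float = 0.01) -> float:
    """Lots such that hitting the stop loses about risk_fraction of the account."""
    if stop_loss_pips <= 0 or pip_value_per_lot <= 0:
        raise ValueError("stop_loss_pips and pip_value_per_lot must be positive")
    risk_amount = net_liquidation * risk_fraction          # money at risk per trade
    raw_lots = risk_amount / (stop_loss_pips * pip_value_per_lot)
    # Round down to the lot step so the position never risks more than intended;
    # the tiny epsilon guards against floating-point noise at exact multiples.
    return math.floor(raw_lots / lot_step + 1e-9) * lot_step

# $10,000 account, 1% risk, 20-pip stop, $10 per pip per standard lot.
lots = lot_size(10_000, 0.01, 20, 10.0)
```

In this example the risk amount is $100 and each lot loses $200 at the stop, so the result is 0.5 lots.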
Please let me know if you have any ideas for improving this money management approach.
Final question: if I wanted to use only a classification model that gives me a binary prediction for the next hour, is there a method to filter out low predicted returns?
For example, if the prediction for the next hour is 1 (buy), how could I know the size of the predicted move and HOW MUCH to buy? If the predicted move is below some minimum variation, it isn't worth the trade. Maybe a multiclass label (1, 0, -1) with a threshold could solve this?
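A minimal sketch of the three-class labelling idea, assuming the input is the already shifted next-hour return and that `min_move` is a threshold chosen by hand (for instance from spread plus commission). The function name and the threshold value are hypothetical:

```python
import numpy as np
import pandas as pd

def three_class_labels(next_returns: pd.Series, min_move: float) -> pd.Series:
    """Label 1 / -1 only when the next return exceeds min_move in size, else 0."""
    values = np.select(
        [next_returns > min_move, next_returns < -min_move],
        [1, -1],
        default=0,   # small moves (and NaN) fall into the "no trade" class
    )
    return pd.Series(values, index=next_returns.index, name="target_class")

next_returns = pd.Series([0.002, -0.0001, -0.003, 0.0004])
labels = three_class_labels(next_returns, min_move=0.0005)
# Recent XGBoost versions appear to expect class ids 0..n-1, so a remap may be needed.
encoded = labels + 1
```

The 0 class then acts as the "not worth the trade" filter, and the predicted probability of the 1 or -1 class could be used as a rough input for how much to buy.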
Thank you very much!