LSTM and Reinforcement Learning

Hi QC, I want to share some work I'm doing with LSTMs and RL with the community, since I've seen quite a few great posts and comments that have helped me. Please let me know if you have any questions, suggested improvements, or general feedback on the attached.

I'm using an LSTM to predict price and volume, then using those predictions to train an RL agent.

The LSTM predicts price and volume, but it uses the dropout feature in Keras at prediction time so that repeated predictions give a probability distribution rather than a single point estimate.
(Monte Carlo Dropout; great video on this: https://youtu.be/eHT0raFtl1Q )
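To make the Monte Carlo Dropout idea concrete, here is a minimal NumPy sketch rather than the actual Keras model from the attached backtest. The toy weights, layer sizes, and the function name `mc_dropout_predict` are all assumptions for illustration. In Keras, the equivalent step is calling the model with `training=True` so the Dropout layers stay active during prediction.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, drop_rate=0.2, n_passes=200, seed=0):
    """Run repeated forward passes with dropout left ON.

    Returns the per-pass predictions, shape (n_passes, n_outputs).
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_rate
    preds = []
    for _ in range(n_passes):
        hidden = np.tanh(x @ w1)
        # Inverted dropout: zero random hidden units, rescale the survivors.
        mask = rng.random(hidden.shape) < keep
        hidden = hidden * mask / keep
        preds.append(hidden @ w2)
    return np.array(preds)

# Hypothetical toy weights standing in for a trained network.
init = np.random.default_rng(42)
w1 = init.normal(size=(4, 16))
w2 = init.normal(size=(16, 2))          # two outputs: price, volume
x = np.array([0.1, -0.3, 0.5, 0.2])     # one made-up feature vector

samples = mc_dropout_predict(x, w1, w2)
mean = samples.mean(axis=0)             # point estimate per output
std = samples.std(axis=0)               # spread = model uncertainty
```

Each pass drops a different random subset of hidden units, so the spread of `samples` acts as a rough uncertainty estimate around the mean prediction.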

Price and volume are multiplied together as an interaction effect, which also keeps the network simpler. In the future I plan to add further interaction effects around option chains and greeks so that these can be predicted and simulated against actuals.
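As a small illustration of the interaction term, the sketch below multiplies each bar's price by its volume to get a single feature. The function name and the sample numbers are made up, and the real feature pipeline in the backtest may scale or transform the values differently.

```python
def price_volume_interaction(prices, volumes):
    """Return one price * volume value per bar."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    return [p * v for p, v in zip(prices, volumes)]

# Made-up bars for illustration only.
prices = [10.0, 10.5, 11.0]
volumes = [100, 120, 90]
features = price_volume_interaction(prices, volumes)
```

Feeding the product as one input instead of two separate series is what keeps the network input narrower.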

The probability distribution produced by the LSTM prediction is used as input to the RL environment, with some additional abstraction. It is appended after each prediction, so the RL agent must trade profitably over a longer and longer time period.
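One way to picture the "additional abstraction" step is to collapse each batch of Monte Carlo samples into a few summary numbers and append them to a growing history. The choice of statistics here (mean, standard deviation, 5th and 95th percentile) and the names `summarize` and `extend_observation` are assumptions, not the abstraction used in the attached code.

```python
import statistics

def summarize(samples):
    """Collapse Monte Carlo samples into a small fixed-size feature tuple."""
    ordered = sorted(samples)
    n = len(ordered)
    return (
        statistics.fmean(ordered),
        statistics.pstdev(ordered),
        ordered[(5 * (n - 1)) // 100],    # approximate 5th percentile
        ordered[(95 * (n - 1)) // 100],   # approximate 95th percentile
    )

def extend_observation(history, samples):
    """Append the summary of the newest prediction to the episode history."""
    history.append(summarize(samples))
    return history

history = []
extend_observation(history, [float(v) for v in range(101)])
extend_observation(history, [float(v) for v in range(50, 151)])
```

Because `history` grows by one row per prediction, every new episode the agent sees covers a slightly longer period than the last.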

The gym environment for the RL agent is still in its early stages. I have used code from GitHub for the gym and plan to customize it so it can trade different option styles. So instead of just actions 0 and 1 it would have a 2, a 3, a 4, etc., which could include different options trading logic, such as covered call or put logic, that it could test and learn on against simulated data.
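A rough sketch of what a wider action set could look like is below. The meaning of the existing actions 0 and 1 is an assumption (sell and buy), and the two option actions, their names, and the `orders_for` helper are hypothetical. It does not depend on gym itself.

```python
from enum import IntEnum

class Action(IntEnum):
    SELL = 0            # assumed meaning of the existing action 0
    BUY = 1             # assumed meaning of the existing action 1
    COVERED_CALL = 2    # hypothetical new action
    PROTECTIVE_PUT = 3  # hypothetical new action

def orders_for(action, symbol):
    """Translate an action id into a list of (side, instrument) legs."""
    action = Action(action)  # raises ValueError for unknown ids
    if action is Action.SELL:
        return [("sell", symbol)]
    if action is Action.BUY:
        return [("buy", symbol)]
    if action is Action.COVERED_CALL:
        return [("buy", symbol), ("sell", f"{symbol} call")]
    return [("buy", symbol), ("buy", f"{symbol} put")]

# Size of the discrete action space, e.g. for gym.spaces.Discrete(N_ACTIONS).
N_ACTIONS = len(Action)
```

Keeping the mapping in one place means a new strategy only needs a new enum member and one extra branch, while the agent still just picks an integer.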

Issues that I am still working through:
Simulated data is stochastic, so repeated backtests won't reproduce the same results without setting a seed. Rather than relying on a seed, though, the better option would be improved sample coverage, which depends on making the codebase more scalable.
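The trade-off between seeding and sample coverage can be shown with a toy simulation. The normal distribution, its parameters, and the function names below are placeholders, not the simulator from the backtest.

```python
import random
import statistics

def simulate_mean(n_samples, seed=None):
    """Average of n_samples draws from a made-up price distribution."""
    rng = random.Random(seed)
    draws = [rng.gauss(100.0, 5.0) for _ in range(n_samples)]
    return statistics.fmean(draws)

def spread_across_runs(n_samples, n_runs=200):
    """How much the simulated mean varies from run to run."""
    means = [simulate_mean(n_samples, seed=run) for run in range(n_runs)]
    return statistics.pstdev(means)

# A fixed seed reproduces a run exactly.
reproducible = simulate_mean(50, seed=7) == simulate_mean(50, seed=7)

# More samples per run shrinks run-to-run variation without any seed tricks.
spread_small = spread_across_runs(10)
spread_large = spread_across_runs(1000)
```

A seed only hides the variation between runs, while more samples actually reduces it, which is why coverage is the more robust fix once the code can afford it.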

The RL trade logic should be improved in many ways. I would first like to test adding options and greeks. It would also be interesting to tie the RL agent's rewards to existing QC data such as the portfolio. The RL logic should also be able to return continuous values if you want it to decide how much to buy.
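For the continuous-value idea, one simple convention is to let the agent output a number in [-1, 1] and scale it into a signed share count. The function name, the clipping range, and the `max_fraction` cap are assumptions for illustration.

```python
def target_quantity(action, cash, price, max_fraction=0.25):
    """Map a continuous action in [-1, 1] to a signed number of shares.

    Positive values buy, negative values sell; the magnitude sets the size.
    """
    clipped = max(-1.0, min(1.0, action))
    budget = cash * max_fraction * clipped
    return int(budget / price)  # truncates toward zero

# Made-up account and price.
half_size_buy = target_quantity(0.5, cash=100_000, price=50.0)
full_size_sell = target_quantity(-1.0, cash=100_000, price=50.0)
```

Here an action of 0.5 spends half of the allowed quarter of cash, so the agent controls size as well as direction.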

The simulated data is still doing some simple things that could be improved. It just rolls a die from the predicted distribution to get the open and close, but many important factors act on open and close values, so I would need to understand those factors and include them in the prediction to simulate open/close better. Right now, though, price and volume themselves are being simulated in a strong manner for the RL agent. Blending of the data is also just an average of some random selections, to protect against overfitting to the real data, but this could be improved and tested.
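The two simple steps described above, drawing open and close at random from the predicted samples and blending by averaging random selections, can be sketched as follows. The function names, the number of paths blended, and the sample values are made up.

```python
import random

def simulate_bar(samples, rng):
    """Draw open and close independently from the predicted price samples."""
    return {"open": rng.choice(samples), "close": rng.choice(samples)}

def blend(paths, rng, k=3):
    """Average k randomly chosen simulated paths, element by element."""
    chosen = rng.sample(paths, k)
    return [sum(values) / k for values in zip(*chosen)]

rng = random.Random(0)
predicted = [99.0, 100.0, 101.0, 102.0]   # made-up Monte Carlo price samples
bar = simulate_bar(predicted, rng)

paths = [[100.0, 101.0], [102.0, 103.0], [98.0, 99.0], [101.0, 100.0]]
blended = blend(paths, rng)
```

Drawing open and close independently ignores how the two are related within a bar, which is one of the simplifications the post says still needs work.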

Scalability needs to keep improving. The LSTM can go for about 6 months before the weights become larger than my backtest machine's memory, and the RL agent uses additional memory on top of that. Scalability has to improve for this to be a more viable strategy. This could be solved by moving to custom hardware, but for now I will look to clean up and improve the codebase.

Buy and sell logic is still basic and just for debugging, but it could go many routes: either emit insights back to a portfolio manager, or continue down the custom trade logic path but write it to make better use of the capital on the table and spread risk, e.g. TSLA at 1% of available capital.
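As a sketch of the "1% of available" idea, the helper below caps each ticker at a fixed fraction of available capital and converts that budget into whole shares. The function name, the prices, and the account size are hypothetical.

```python
def shares_per_ticker(prices, available, fraction=0.01):
    """Whole shares to hold per ticker when each gets `fraction` of capital."""
    budget = available * fraction
    return {ticker: int(budget // price) for ticker, price in prices.items()}

# Made-up prices and account size.
prices = {"TSLA": 240.0, "AAPL": 180.0}
allocation = shares_per_ticker(prices, available=50_000)
```

A flat per-ticker cap like this spreads risk evenly but leaves most capital idle when the stock list is short, so the fraction would need tuning to the size of the universe.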

Paper trading: I am still debugging this to get it to paper trade, so there are likely some bugs here as well. To start with I need to make sure it does warmup and can begin trading immediately, so that stopping and restarting from saved state happens without issues.

Benchmarks haven't been set or compared against yet, and I'm just starting to test how the framework generalizes to other stock sets and selection methods.

Other minor bugs, like not trading the last stock in the list.

If you are testing it, make sure to clear out the cached models and datasets before the next test.


Thanks for the share Ben Kaylor!

Unfortunately, much of it is over my head, but this seems chock full of good information. I'll be bookmarking it to decipher / digest at some future date.

Spacetime


interesting share! I will check it out.

.ekz.

