Requesting a large amount of history data


I'm building a reinforcement learning algorithm, but before it can be used the model has to be trained on historical data. Training needs about 80,000 periods, so I had to use 15-minute bars (hourly resolution only gives me around 30,000 past periods).

I run the training in the Initialize event. With a small number of training periods there is no problem (e.g. with 1,000 training periods the algorithm runs fine). But when I try to use the required 80,000, I get the following error: "Algorithm took longer than 5 minutes on a single time loop."

When I call the training method inside the OnData event instead, something similar happens: "Algorithm took longer than 10 minutes on a single time loop."

I think the error occurs because it is not possible to build the history array for 80,000 past periods in less than 5 or 10 minutes. So I'm asking for help: is there another way to create the training array?
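To give a sense of the size of the request, here is a rough back-of-the-envelope calculation in plain Python (no QuantConnect API involved). The figure of 390 one-minute bars per session is an assumption based on a regular US equity trading day (09:30 to 16:00); the other numbers come from the snippet in this post.

```python
# Back-of-the-envelope size of the history request described in this post.
minutes_per_bar = 15        # self.TimeSpan
training_periods = 80_000   # self.t_steps
n_assets = 3                # AAPL, IBM, MSFT
n_features = 3              # close, high, low (columns kept after filtering)

# Minute bars requested per asset (this is back_bars in train_model)
minute_bars_per_asset = minutes_per_bar * training_periods
# Rows across all three assets, and values kept after column filtering
total_rows = minute_bars_per_asset * n_assets
total_values = total_rows * n_features

# Assumption: 390 one-minute bars in a regular US equity session.
minutes_per_session = 390
sessions_needed = minute_bars_per_asset / minutes_per_session

print(minute_bars_per_asset)   # 1200000
print(total_rows)              # 3600000
print(total_values)            # 10800000
print(round(sessions_needed))  # 3077
```

So a single call asks for roughly 1.2 million minute bars per asset, which under the 390-bar assumption is on the order of 3,000 trading sessions.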

I can't attach the failing backtest, because a backtest that ends with an error cannot be attached. So I ran a backtest with 100 training periods and put together a code snippet showing the code that generates the error.


import numpy as np

def Initialize(self):
    self.TimeSpan = 15                     # bar length in minutes
    self.BarPeriod = TimeSpan.FromMinutes(self.TimeSpan)
    self.assets = ["AAPL", "IBM", "MSFT"]
    self.features = ["close", "high", "low"]
    self.f = len(self.features)
    self.m = len(self.assets)
    self.n = 50
    self.t_steps = 80000                   # number of training periods

def train_model(self):
    # Request enough minute bars to build t_steps 15-minute bars
    back_bars = self.TimeSpan * self.t_steps
    h1 = self.History(self.Securities.Keys, back_bars, Resolution.Minute)

    # Filter the data for selected assets and features
    h1 = h1.loc[self.assets][self.features].ffill().bfill()
    # Note: self.t_batch is not defined in this snippet
    raw_prices = np.zeros(shape=(self.m, self.t_batch, self.f), dtype=np.float64)

    # Build the 15-minute bars by hand
    for asset in range(len(self.assets)):
        period = 0
        for i in range(self.t_batch * self.TimeSpan):
            if i % self.TimeSpan == 0 and i > 0:
                # Minute rows belonging to this 15-minute window
                ran = h1.loc[self.assets[asset]][:][i:i+self.TimeSpan].values
                close = ran[-1][0]         # close of the last minute
                high = np.max(ran[:, 1])   # column 1 = high
                low = np.min(ran[:, 2])    # column 2 = low
                raw_prices[asset][period][0] = close
                raw_prices[asset][period][1] = high
                raw_prices[asset][period][2] = low
                period += 1
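
As a possible alternative to the per-window loop, the aggregation could be done with a single NumPy reshape. This is only a minimal sketch on synthetic data, not a tested QuantConnect solution: `aggregate_bars` is a hypothetical helper name, and it assumes the minute rows for one asset have no gaps, start on a bar boundary, and are ordered close, high, low as in `self.features`. Like the loop, it groups rows by count rather than by clock time. Unlike the loop, it also builds a bar from the first window (the loop skips `i == 0`).

```python
import numpy as np

def aggregate_bars(minute_data: np.ndarray, span: int) -> np.ndarray:
    """Collapse minute rows into `span`-minute bars.

    minute_data: shape (n_minutes, 3), columns ordered close, high, low.
    Returns shape (n_minutes // span, 3) with the same column order.
    Trailing minutes that do not fill a complete bar are dropped.
    """
    n_bars = minute_data.shape[0] // span
    # View the data as (bar, minute within bar, feature)
    blocks = minute_data[: n_bars * span].reshape(n_bars, span, 3)
    out = np.empty((n_bars, 3), dtype=np.float64)
    out[:, 0] = blocks[:, -1, 0]             # close: close of the last minute
    out[:, 1] = blocks[:, :, 1].max(axis=1)  # high: max of the minute highs
    out[:, 2] = blocks[:, :, 2].min(axis=1)  # low: min of the minute lows
    return out

# Synthetic example: six minute rows aggregated into two 3-minute bars
minute_data = np.array([
    [10.0, 10.5,  9.5],
    [11.0, 11.5, 10.0],
    [12.0, 12.5, 11.0],
    [13.0, 13.5, 12.0],
    [12.0, 14.0, 11.5],
    [11.0, 12.5, 10.5],
])
bars = aggregate_bars(minute_data, span=3)
print(bars)
# [[12.  12.5  9.5]
#  [11.  14.  10.5]]
```

In the algorithm this would be called once per asset with `span=self.TimeSpan`, which replaces 80,000 small DataFrame slices per asset with one array operation. It does not reduce the time spent in the history request itself.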



