Taking big history data

Hi,

I'm creating a reinforcement learning algorithm, but the model has to be trained on historical data before it can be used. Training needs around 80000 periods, so I had to use 15-minute resolution (hourly resolution only provides around 30000 past periods).

I run the training in the Initialize event. With a small number of training periods there is no problem (e.g. with 1000 training periods the algorithm runs well). But when I try to use the required 80000, it shows the following error: "Algorithm took longer than 5 minutes on a single time loop."
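To give a sense of scale, here is a rough back-of-the-envelope calculation of how much minute data the request covers. The numbers come from the snippet further down (15-minute bars, 80000 periods, 3 assets, 3 features); the variable names are only for illustration, and the byte count assumes float64 values.

```python
# Rough sizing of the history request, using the values from the snippet in this post.
minutes_per_bar = 15       # self.TimeSpan
training_periods = 80000   # self.t_steps
assets = 3                 # AAPL, IBM, MSFT
features = 3               # close, high, low

# Minute bars requested per asset, and in total across all assets
minute_bars_per_asset = minutes_per_bar * training_periods
total_minute_rows = minute_bars_per_asset * assets

# Size of the final training array, assuming 8 bytes per float64 value
output_bytes = assets * training_periods * features * 8

print(minute_bars_per_asset)  # 1200000
print(total_minute_rows)      # 3600000
print(output_bytes)           # 5760000
```

So the final array is small (under 6 MB), while the minute-level input is about 3.6 million rows, which suggests the time goes into fetching and looping over the minute data rather than into storing the result.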

When I call the training method inside the OnData event, something similar happens: "Algorithm took longer than 10 minutes on a single time loop."

I think the error occurs because it is not possible to build the history array for 80000 past periods in less than 5 or 10 minutes. So I'm asking for help: is it possible to create the training array some other way?
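One possible direction, sketched here only as an idea, is to split the lookback into smaller pieces and process one piece per time loop instead of everything at once. The helper name `plan_chunks` and the chunk size of 100000 bars are hypothetical, and whether this actually keeps each time loop under the limit is an untested assumption.

```python
def plan_chunks(total_bars, chunk_size):
    """Split a lookback of total_bars into (start, end) offsets,
    each covering at most chunk_size bars (end is exclusive)."""
    if total_bars < 0 or chunk_size <= 0:
        raise ValueError("total_bars must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_bars))
        for start in range(0, total_bars, chunk_size)
    ]


# 80000 periods of 15 minutes = 1,200,000 minute bars per asset
chunks = plan_chunks(1_200_000, 100_000)

print(len(chunks))   # 12
print(chunks[0])     # (0, 100000)
print(chunks[-1])    # (1100000, 1200000)
```

Each `(start, end)` pair could then drive one smaller history request and one partial update of the training array.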

I can't attach the backtest because a backtest that ends with an error cannot be attached. So I ran a backtest with 100 training periods and prepared a code snippet to show the code that generates the error.

import numpy as np  # used by train_model below

def Initialize(self):
    self.TimeSpan = 15
    self.BarPeriod = TimeSpan.FromMinutes(self.TimeSpan)
    self.assets = ["AAPL", "IBM", "MSFT"]
    self.features = ["close", "high", "low"]
    self.f = len(self.features)
    self.m = len(self.assets)
    self.n = 50
    self.t_steps = 80000
    self.train_model()

def train_model(self):
    # Number of minute bars needed to build t_steps bars of TimeSpan minutes each
    back_bars = self.TimeSpan * self.t_steps
    h1 = self.History(self.Securities.Keys, back_bars, Resolution.Minute)

    # Filter the data for selected assets and features
    h1 = h1.loc[self.assets][self.features].ffill().bfill()
    # NOTE: self.t_batch is not defined in this snippet (presumably set elsewhere)
    raw_prices = np.zeros(shape=(self.m, self.t_batch, self.f), dtype=np.float64)

    # Build the 15-minute bars by hand
    for asset in range(len(self.assets)):
        # Look up this asset's rows once instead of on every iteration
        values = h1.loc[self.assets[asset]].values
        period = 0
        # Step one bar at a time; starting at TimeSpan keeps the original `i > 0` condition
        for i in range(self.TimeSpan, self.t_batch * self.TimeSpan, self.TimeSpan):
            ran = values[i:i+self.TimeSpan]
            self.Debug(str(ran.shape))
            close = ran[-1, 0]
            # Select columns with [:, k]; ran[:][k] would pick row k instead
            high = np.max(ran[:, 1])
            low = np.min(ran[:, 2])
            raw_prices[asset, period, 0] = close
            raw_prices[asset, period, 1] = high
            raw_prices[asset, period, 2] = low
            period += 1
     ......
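For comparison, the hand-written aggregation loop could probably be replaced by a vectorized NumPy version. The following is a minimal sketch on synthetic data, not the code from the backtest: the function name `aggregate_bars` is hypothetical, the columns are assumed to be ordered close, high, low as in `self.features`, and every window is assumed to contain exactly `bar_len` consecutive minute rows (no gaps). Unlike the loop above, it also includes the first window.

```python
import numpy as np


def aggregate_bars(minute_values, bar_len):
    """Aggregate minute rows into longer bars.

    minute_values: array of shape (n_minutes, 3), columns = close, high, low.
    Returns an array of shape (n_minutes // bar_len, 3) with the same columns.
    """
    n_bars = minute_values.shape[0] // bar_len
    # Drop trailing minutes that do not fill a whole bar, then group rows into windows
    windows = minute_values[:n_bars * bar_len].reshape(n_bars, bar_len, 3)

    out = np.empty((n_bars, 3), dtype=np.float64)
    out[:, 0] = windows[:, -1, 0]            # close: last close in each window
    out[:, 1] = windows[:, :, 1].max(axis=1)  # high: highest high in each window
    out[:, 2] = windows[:, :, 2].min(axis=1)  # low: lowest low in each window
    return out


# Synthetic example: six minute rows aggregated into two 3-minute bars
minutes = np.array([
    [10, 11, 9],
    [12, 13, 10],
    [11, 12, 8],
    [14, 15, 13],
    [13, 16, 12],
    [15, 15, 14],
], dtype=np.float64)

bars = aggregate_bars(minutes, 3)
print(bars.tolist())  # [[11.0, 13.0, 8.0], [15.0, 16.0, 12.0]]
```

With `bar_len = 15` this does the aggregation for one asset in a few array operations instead of one Python iteration per bar. It does not reduce the time spent in the history request itself.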

