When performing a history request on custom data, I notice that it seemingly goes through all the data in the dataset it is searching. The same goes for using the data in a backtest: the algorithm seems to iterate through all rows of the dataset until it reaches the date I have set as the start date. The data is sorted by date, so it should be easy to use a search algorithm to find the rows I am asking for. How can I define the custom data in a manner that lets it quickly find the correct rows during a History request?
I am also unable to fetch the n latest bars through qb.History(symbol, 10); it only works by date, e.g. qb.History(symbol, start=datetime(2021, 1, 3), end=datetime(2021, 1, 7)).
I load my custom data like this:
class MESData(PythonData):
    def GetSource(self, config, date, isLiveMode):
        source = os.path.join(Globals.DataFolder, "custom/future/minute/mes.csv")
        subscription_data_source = SubscriptionDataSource(source, SubscriptionTransportMedium.LocalFile, FileFormat.Csv)
        return subscription_data_source

    def Reader(self, config: SubscriptionDataConfig, line, date, isLiveMode):
        if not line.strip():
            return None
        data = MESData()
        data.Symbol = config.Symbol
        # Example Line Format:
        # Date Open High Low Close Volume
        # 2023-05-14 18:01 4132.25 4133.50 4131.25 4132.00 501.0
        splitted = line.split(',')
        # Add data
        data.EndTime = datetime.strptime(splitted[0], "%Y-%m-%d %H:%M:%S")
        data.Time = data.EndTime - timedelta(minutes=1)
        data.Value = float(splitted[4])
        data["Open"] = float(splitted[1])
        data["High"] = float(splitted[2])
        data["Low"] = float(splitted[3])
        data["Close"] = float(splitted[4])
        data["Volume"] = float(splitted[5])
        return data
Louis Szeto
Hi Haakon
We cannot blindly assume that all data is sorted, so although it may seem inefficient, there is a reason to iterate by date. Note that this will not affect the algorithm's performance after initialization, since by then the data is already loaded and waiting to be consumed. It is only a bulk load at the start.
Another way to solve it would be to split the data by day (one file per day) and load only the respective files. You may check out the sample minute data in Lean CLI to see how to do so.
Best
Louis
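As a rough illustration of the one-file-per-day suggestion, here is a minimal sketch of a standalone script that splits a single CSV into daily files. It assumes comma-separated rows whose first column is a timestamp in the "%Y-%m-%d %H:%M:%S" format that the Reader above parses. The YYYYMMDD.csv naming scheme and the helper names daily_file_path and split_by_day are hypothetical, not part of LEAN.

```python
import csv
import os
from collections import defaultdict
from datetime import datetime


def daily_file_path(folder, day):
    """Hypothetical naming scheme: <folder>/YYYYMMDD.csv."""
    return os.path.join(folder, day.strftime("%Y%m%d") + ".csv")


def split_by_day(source_csv, out_folder, time_format="%Y-%m-%d %H:%M:%S"):
    """Write each calendar day's rows of source_csv into its own file.

    Assumes the first column of every row is a timestamp in time_format.
    All rows are held in memory, so a very large file may need a
    streaming variant instead.
    """
    os.makedirs(out_folder, exist_ok=True)
    rows_by_day = defaultdict(list)
    with open(source_csv, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            day = datetime.strptime(row[0], time_format).date()
            rows_by_day[day].append(row)

    paths = []
    for day in sorted(rows_by_day):
        path = daily_file_path(out_folder, day)
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows_by_day[day])
        paths.append(path)
    return paths
```

With files laid out like this, GetSource could presumably build its path from the date argument it already receives (for example with something like daily_file_path(folder, date)) instead of always pointing at mes.csv, so that only the requested days are read. Treat that as an assumption to verify against the Lean CLI sample data rather than a confirmed recipe.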
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Haakon