Hi,

I have a custom data source for BTCUSDT from Binance in minute resolution that I have used successfully in the past via .AddData(). However, when I try to get consolidated history for ML model training, the pandas DataFrame returned by self.History() seems wrong.

Data setup:

self.ticker = "BTCUSDT"
self.symbols = [self.AddData(type=BinanceTradeBarData, ticker=self.ticker, resolution=Resolution.Minute).Symbol]

I call the ML model training function via `self.Train(self.Training)` within `Initialize()`:

def Training(self):
        ...
        # get historical data
        # hist_data:pd.DataFrame = self.History(self.symbols, 20) # works + returns dataframe with shape (20,5) in minute resolution
        """
                                                            close  ...    volume
        symbol                         time                           ...
        BTCUSDT.BinanceTradeBarData 2S 2022-03-14 23:41:00  39625.53  ...  16.35635
                                    2022-03-14 23:42:00  39638.02  ...  16.54291
                                    2022-03-14 23:43:00  39645.74  ...  12.47931
                                    2022-03-14 23:44:00  39648.04  ...  11.78058
                                    2022-03-14 23:45:00  39545.50  ...  53.39393
                                    2022-03-14 23:46:00  39555.17  ...  30.92626
        """
        # hist_data:pd.DataFrame = self.History(self.symbols, 20, Resolution.Hour) # works + returns dataframe with shape (1200,5) with duplicate timestamps
        """
                                                            close  ...    volume
        symbol                         time                           ...
        BTCUSDT.BinanceTradeBarData 2S 2022-03-14 04:01:00  38177.89  ...  15.30639
                                    2022-03-14 04:01:00  38150.91  ...  11.90438
                                    2022-03-14 04:01:00  38135.00  ...  17.80335
                                    2022-03-14 04:01:00  38185.00  ...   9.66609
                                    2022-03-14 04:01:00  38209.61  ...  36.55811
        ...                                                      ...  ...       ...
                                    2022-03-14 23:01:00  39689.81  ...  48.53675
                                    2022-03-14 23:01:00  39687.24  ...  21.09104
                                    2022-03-14 23:01:00  39691.64  ...  80.96595
        """
        hist_data:pd.DataFrame = self.History(self.symbols, timedelta(hours=5), Resolution.Hour) # works + returns dataframe with shape (300,5)
        """
                                                               close  ...    volume
        symbol                         time                           ...
        BTCUSDT.BinanceTradeBarData 2S 2022-03-14 19:01:00  38700.00  ...  30.80545
                                    2022-03-14 19:01:00  38711.01  ...  23.43854
                                    2022-03-14 19:01:00  38747.86  ...  61.53202
                                    2022-03-14 19:01:00  38706.69  ...   8.87105
                                    2022-03-14 19:01:00  38699.11  ...  25.44933
        ...                                                      ...  ...       ...
                                    2022-03-14 23:01:00  39689.81  ...  48.53675
                                    2022-03-14 23:01:00  39687.24  ...  21.09104
                                    2022-03-14 23:01:00  39691.64  ...  80.96595
                                    2022-03-14 23:01:00  39674.91  ...  22.56632
        """
        # hist_data = self.History(symbols = self.symbols, periods = 10, resolution = Resolution.Hour) # doesn't work: returns MemoizingEnumerable[Slice]
        # hist_data = self.History(symbols = self.symbols, periods = 10) # doesn't work: returns MemoizingEnumerable[Slice]
        # hist_data = self.History(symbols = self.symbols, span=timedelta(hours=5)) # doesn't work: returns MemoizingEnumerable[Slice]
        # hist_data = self.History(symbols = self.symbols, start=datetime(2022,3,15), end=datetime(2022,3,20)) # doesn't work, raises: Count of resolved generics 0 does not match method generic count 1.

 

As you can see, none of the returned DataFrames contain hourly consolidated bars.

Trial #1: returns data at the resolution I added the dataset with (minute resolution).

Trial #2: returns data with duplicate `time` values.

 

Main question:

To get consolidated history data, should I call History without setting a resolution and then use pandas groupby() (or similar) to build the consolidated bars myself? Or is there a bug in how History handles custom data sources?
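
For reference, this is roughly what I mean by the pandas route. It is only a sketch: the function name consolidate_to_hourly is mine, and I am assuming the five columns are open, high, low, close and volume, that the index levels are named symbol and time (as in the output above), and that each timestamp is the bar's end time, so a bar stamped 20:00 belongs to the 19:00-20:00 hour.

```python
import pandas as pd


def consolidate_to_hourly(minute_df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate minute bars into hourly OHLCV bars, per symbol.

    Assumes a (symbol, time) MultiIndex and the columns
    open/high/low/close/volume; timestamps are treated as bar end times.
    """
    aggregation = {
        "open": "first",   # first minute's open
        "high": "max",     # highest high in the hour
        "low": "min",      # lowest low in the hour
        "close": "last",   # last minute's close
        "volume": "sum",   # total traded volume
    }
    keys = [
        pd.Grouper(level="symbol"),
        # closed/label="right": bars ending in (19:00, 20:00] are labeled 20:00
        pd.Grouper(level="time", freq="1h", closed="right", label="right"),
    ]
    return minute_df.groupby(keys).agg(aggregation)


# Synthetic minute bars standing in for the History() result.
times = pd.date_range("2022-03-14 00:01", periods=120, freq="min")
index = pd.MultiIndex.from_product([["BTCUSDT"], times], names=["symbol", "time"])
steps = range(120)
minute_bars = pd.DataFrame(
    {
        "open": [float(i) for i in steps],
        "high": [float(i + 2) for i in steps],
        "low": [float(i - 1) for i in steps],
        "close": [float(i + 1) for i in steps],
        "volume": [1.0] * 120,
    },
    index=index,
)

hourly_bars = consolidate_to_hourly(minute_bars)
print(hourly_bars)
```

In the algorithm I would presumably pass in the minute-resolution result of self.History(self.symbols, ...) instead of the synthetic frame, but I would rather not maintain this by hand if History is supposed to do it.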

 

Any help is appreciated, thanks!
