Historical Data
Custom Data
Introduction
You can import external datasets into your algorithm to use alongside other datasets from the Dataset Market.
This page explains how to get historical data for custom datasets.
Before you can get historical data for the dataset, define the GetSource (get_source in Python) and Reader (reader in Python) methods of the custom data class.
For examples of custom dataset implementations, see Key Concepts.
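As a rough illustration of what a reader method does, the following standalone sketch parses one CSV line into a record and skips the header row. The column layout (date, symbol, weight) and the name `parse_line` are assumptions made for illustration; a real implementation would return an instance of your custom data class rather than a dictionary.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_line(line: str) -> Optional[dict]:
    """Parse one line of a hypothetical 'date,symbol,weight' CSV file."""
    # Skip empty lines and the header row (data rows start with a digit).
    if not line or not line[0].isdigit():
        return None
    items = line.strip().split(",")
    end_time = datetime.strptime(items[0], "%Y-%m-%d")
    return {
        # Assume each row covers the one-day period that ends at end_time.
        "time": end_time - timedelta(days=1),
        "end_time": end_time,
        "symbol": items[1],
        "weight": float(items[2]),
    }


print(parse_line("date,symbol,weight"))   # header row is skipped
print(parse_line("2016-01-05,SPY,0.25"))
```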
Slices
In C#, to request Slice objects of historical data, call the History method. If you pass a list of Symbol objects, it returns data for all the custom datasets that the Symbol objects reference.
    // Get the latest 3 data points of some custom datasets, packaged into Slice objects.
    var history = History(datasetSymbols, 3);
If you don't pass any Symbol objects, it returns data for all the data subscriptions in your algorithm, so the result may include more than just custom data.
In Python, to request Slice objects of historical data, call the history method without providing any Symbol objects.
    // Get the latest 3 data points of all the securities/datasets in the algorithm, packaged into Slice objects.
    var history = History(3);

    # Get the latest 3 data points of all the securities/datasets in the algorithm, packaged into Slice objects.
    history = self.history(3)
When your history request returns Slice objects, the Time (time in Python) properties of these objects are based on the algorithm time zone, but the EndTime (end_time) properties of the individual data objects are based on the data time zone. The EndTime (end_time) is the end of the sampling period and when the data is actually available.
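The difference between the two time zones can be seen in a small standalone sketch. The zones below (UTC for the data, a fixed UTC-4 offset for the algorithm) are hypothetical values chosen only for illustration; the point is that one instant reads differently depending on which time zone a property is expressed in.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical time zones, chosen only for illustration.
DATA_TZ = timezone.utc
ALGORITHM_TZ = timezone(timedelta(hours=-4), "EDT")

# A data point whose sampling period ends at midnight in the data time zone.
end_time_data_tz = datetime(2014, 7, 10, 0, 0, tzinfo=DATA_TZ)

# The same instant expressed in the algorithm time zone.
end_time_algorithm_tz = end_time_data_tz.astimezone(ALGORITHM_TZ)

print(end_time_data_tz.isoformat())       # 2014-07-10T00:00:00+00:00
print(end_time_algorithm_tz.isoformat())  # 2014-07-09T20:00:00-04:00
```

When you compare timestamps from a Slice with timestamps from the data objects inside it, it may help to convert both to one time zone first.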
Data Points
To get a list of historical data points for a custom dataset, call the History<customDatasetClass> method with the dataset Symbol. In Python, call the history method with the dataset Symbol instead. It returns a DataFrame that contains the data point attributes of the dataset class.
For an example definition of a custom data class, see the CSV Format Example.
    public class CustomSecurityHistoryAlgorithm : QCAlgorithm
    {
        public override void Initialize()
        {
            SetStartDate(2014, 7, 10);
            // Add a custom dataset and save a reference to its Symbol.
            var datasetSymbol = AddData<MyCustomDataType>("MyCustomDataType", Resolution.Daily).Symbol;
            // Get the trailing 5 days of MyCustomDataType data.
            var history = History<MyCustomDataType>(datasetSymbol, 5, Resolution.Daily);
        }
    }
    class CustomSecurityHistoryAlgorithm(QCAlgorithm):

        def initialize(self) -> None:
            self.set_start_date(2014, 7, 10)
            # Add a custom dataset and save a reference to its Symbol.
            dataset_symbol = self.add_data(MyCustomDataType, "MyCustomDataType", Resolution.DAILY).symbol
            # Get the trailing 5 days of MyCustomDataType data in DataFrame format.
            history = self.history(dataset_symbol, 5, Resolution.DAILY)
symbol | time | close | high | low | open | value |
---|---|---|---|---|---|---|
MYCUSTOMDATATYPE.MyCustomDataType | 2014-07-08 | 7787.15 | 7792.00 | 7755.10 | 7780.40 | 7787.15 |
MYCUSTOMDATATYPE.MyCustomDataType | 2014-07-09 | 7623.20 | 7808.85 | 7595.90 | 7804.05 | 7623.20 |
MYCUSTOMDATATYPE.MyCustomDataType | 2014-07-10 | 7585.00 | 7650.10 | 7551.65 | 7637.95 | 7585.00 |
If you intend to use the data in the DataFrame to create customDatasetClass objects, request the data type you need directly from the history request. Otherwise, LEAN consumes unnecessary computational resources populating the DataFrame.
To get a list of dataset objects instead of a DataFrame, call the history[customDatasetClass] method.
    # Get the trailing 5 days of MyCustomDataType data for an asset in MyCustomDataType format.
    history = self.history[MyCustomDataType](dataset_symbol, 5, Resolution.DAILY)
If the dataset provides multiple entries per time step, return a SubscriptionDataSource that uses FileFormat.UnfoldingCollection (FileFormat.UNFOLDING_COLLECTION in Python) from the GetSource (get_source) method of your custom data class.
To get the historical data of this custom data type in a DataFrame, set the flatten argument to True.
history = self.history(dataset_symbol, 1, Resolution.DAILY, flatten=True)
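To make the idea of flattening concrete, here is a standalone sketch that expands one time step holding several entries into one row per entry. It only mimics the shape of the result with plain Python objects. The field names and values are hypothetical, and this is not how LEAN implements the flatten argument.

```python
from datetime import date

# Hypothetical unfolded data: one time step that carries several entries.
collections = {
    date(2024, 1, 2): [
        {"symbol": "AAA", "weight": 0.6},
        {"symbol": "BBB", "weight": 0.4},
    ],
}

# Flatten: emit one row per (time, entry) pair instead of one row per time.
flat_rows = [
    {"time": time, **entry}
    for time, entries in collections.items()
    for entry in entries
]

for row in flat_rows:
    print(row)
```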
Universes
To get historical data for a custom data universe, call the History method (history in Python) with the Universe object.
For an example definition of a custom data universe class, see the CSV Format Example.
    public class CustomDataUniverseHistoryAlgorithm : QCAlgorithm
    {
        public override void Initialize()
        {
            SetStartDate(2017, 7, 9);
            // Add a universe from a custom data source and save a reference to it.
            var universe = AddUniverse<StockDataSource>(
                "myStockDataSource", Resolution.Daily, data => data.Select(x => x.Symbol)
            );
            // Get the historical universe data over the last 5 days.
            var history = History(universe, TimeSpan.FromDays(5)).Cast<StockDataSource>().ToList();
            // Iterate through each day in the universe history and count the number of constituents.
            foreach (var stockData in history)
            {
                var t = stockData.Time;
                var size = stockData.Symbols.Count;
            }
        }
    }
    class CustomDataUniverseHistoryAlgorithm(QCAlgorithm):

        def initialize(self) -> None:
            self.set_start_date(2017, 7, 9)
            # Add a universe from a custom data source and save a reference to it.
            universe = self.add_universe(
                StockDataSource, "my-stock-data-source", Resolution.DAILY, lambda data: [x.symbol for x in data]
            )
            # Get the historical universe data over the last 5 days in DataFrame format.
            history = self.history(universe, timedelta(5))
time | symbols |
---|---|
2017-07-05 | [SPY, QQQ, FB, AAPL, IWM] |
2017-07-06 | [SPY, QQQ, FB, AAPL, IWM] |
2017-07-07 | [QQQ, AAPL, IWM, FB, GOOGL] |
2017-07-08 | [IWM, AAPL, FB, BAC, GOOGL] |
2017-07-09 | [AAPL, FB, GOOGL, GOOG, BAC] |
    # Count the number of assets in the universe each day.
    universe_size_by_day = history.apply(lambda row: len(row['symbols']), axis=1)

    time
    2017-07-05    5
    2017-07-06    5
    2017-07-07    5
    2017-07-08    5
    2017-07-09    5
    Name: symbols, dtype: int64
Missing Data Points
History requests for a trailing number of data samples return data based on the market hours of the asset. By default, custom securities are always open. Therefore, if your dataset doesn't provide a data point for every period in which the market is considered open, these requests may return fewer samples than you expect. To set the market hours of the dataset, see Market Hours.
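One way this can happen is sketched below, under the assumption that the lookback window is sized by counting periods in which the market is open. With an always-open market, a request for 5 daily samples spans 5 calendar days, so a dataset that only has weekday entries yields fewer points. The functions `has_data` and `trailing_samples` are hypothetical helpers, not part of LEAN.

```python
from datetime import date, timedelta


def has_data(day: date) -> bool:
    """Hypothetical dataset: one data point per weekday, none on weekends."""
    return day.weekday() < 5  # Monday=0 ... Friday=4


def trailing_samples(end: date, count: int) -> list:
    """Return the days with data among the `count` calendar days before `end`.

    Assumes an always-open market, so every calendar day counts as one
    daily sample when the lookback window is sized.
    """
    window = [end - timedelta(days=i) for i in range(count, 0, -1)]
    return [day for day in window if has_data(day)]


# Thursday 2014-07-10: the 5 preceding calendar days include a weekend.
samples = trailing_samples(date(2014, 7, 10), 5)
print(len(samples))  # 3
```

If the dataset follows an exchange calendar, setting matching market hours should make the window and the available data line up.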
Examples
The following examples demonstrate some common practices for trading with historical custom data.
Example 1: Custom Universe Data
The following algorithm trades based on a custom universe dataset. It obtains historical data on the universe's weights and uses the exponential moving average as the investment weight.
    public class CustomUniverseExampleAlgorithm : QCAlgorithm
    {
        private const decimal SmoothingFactor = 2m / (1m + 5m);
        private Universe _universe;

        public override void Initialize()
        {
            SetStartDate(2016, 1, 1);
            SetEndDate(2025, 1, 1);

            // Add a universe that reads from the Object Store.
            _universe = AddUniverse<CustomUniverseData>("CustomUniverse", Resolution.Daily, (altCoarse) =>
            {
                return altCoarse.Select(d => d.Symbol);
            });

            // Add a Scheduled Event to rebalance the portfolio.
            var spy = QuantConnect.Symbol.Create("SPY", SecurityType.Equity, Market.USA);
            Schedule.On(
                DateRules.EveryDay(spy),
                TimeRules.AfterMarketOpen(spy, 1),
                Rebalance
            );
        }

        private void Rebalance()
        {
            // We invest by the EMA of the given weight.
            var history = History(_universe, 5, Resolution.Daily);
            var symbols = history.Select(x => x.Symbol).ToHashSet();
            var weights = symbols.Select(symbol =>
            {
                var ema = history.Where(x => x.Symbol == symbol)
                    .Select(x => (x as CustomUniverseData).Weight)
                    .Aggregate((ema, nextQuote) => SmoothingFactor * nextQuote + (1m - SmoothingFactor) * ema);
                return (symbol, ema);
            }).ToList();

            // Normalized weights.
            var weightSum = weights.Sum(x => x.Item2);
            var targets = weights.Select(x => new PortfolioTarget(x.Item1, x.Item2 / weightSum)).ToList();
            SetHoldings(targets, liquidateExistingHoldings: true);
        }
    }

    public class CustomUniverseData : BaseDataCollection
    {
        public decimal Weight;

        public override DateTime EndTime
        {
            // Set the data period to 1 day.
            get { return Time + QuantConnect.Time.OneDay; }
            set { Time = value - QuantConnect.Time.OneDay; }
        }

        public override SubscriptionDataSource GetSource(SubscriptionDataConfig config, DateTime date, bool isLiveMode)
        {
            // Define the location and format of the data file.
            return new SubscriptionDataSource(
                "portfolioTargets.csv",
                SubscriptionTransportMedium.ObjectStore,
                FileFormat.Csv
            );
        }

        public override BaseData Reader(SubscriptionDataConfig config, string line, DateTime date, bool isLiveMode)
        {
            // Skip the header row.
            if (!Char.IsDigit(line[0]))
            {
                return null;
            }
            // Split the line by each comma.
            var items = line.Split(",");
            // Parse the data from the CSV file.
            return new CustomUniverseData
            {
                EndTime = Parse.DateTimeExact(items[0], "yyyy-MM-dd"),
                Symbol = Symbol.Create(items[1], SecurityType.Equity, Market.USA),
                Weight = decimal.Parse(items[2], NumberStyles.Any, CultureInfo.InvariantCulture)
            };
        }
    }
    class CustomUniverseExampleAlgorithm(QCAlgorithm):

        def initialize(self) -> None:
            self.set_start_date(2016, 1, 1)
            self.set_end_date(2025, 1, 1)

            # Add a universe that reads from the Object Store.
            self._universe = self.add_universe(
                CustomUniverseData, "CustomUniverse", Resolution.DAILY, lambda x: [y.symbol for y in x]
            )

            # Add a Scheduled Event to rebalance the portfolio.
            spy = Symbol.create('SPY', SecurityType.EQUITY, Market.USA)
            self.schedule.on(
                self.date_rules.every_day(spy),
                self.time_rules.after_market_open(spy, 1),
                self.rebalance
            )

        def rebalance(self) -> None:
            # We invest by the EMA of the given weight.
            history = self.history(self._universe, 5, Resolution.DAILY).set_index('symbol', append=True).unstack('symbol').fillna(0)
            weights = history.ewm(5).mean().iloc[-1].droplevel(0)
            # Normalized weights.
            weights = weights / weights.sum()
            self.set_holdings([PortfolioTarget(symbol, weight) for symbol, weight in weights.items()], liquidate_existing_holdings=True)

    class CustomUniverseData(PythonData):

        def get_source(self, config: SubscriptionDataConfig, date: datetime, is_live_mode: bool) -> SubscriptionDataSource:
            # Define the location and format of the data file.
            return SubscriptionDataSource(
                "portfolio-targets.csv",
                SubscriptionTransportMedium.OBJECT_STORE,
                FileFormat.CSV
            )

        def reader(self, config: SubscriptionDataConfig, line: str, date: datetime, is_live_mode: bool) -> BaseData:
            # Skip the header row.
            if not line[0].isnumeric():
                return None
            # Split the line by each comma.
            items = line.split(",")
            # Parse the data from the CSV file.
            data = CustomUniverseData()
            data.end_time = datetime.strptime(items[0], "%Y-%m-%d")
            data.time = data.end_time - timedelta(1)
            data['symbol'] = Symbol.create(items[1], SecurityType.EQUITY, Market.USA)
            data["weight"] = float(items[2])
            return data
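The EMA-and-normalize step of the rebalance logic can also be sketched in isolation, without LEAN. This version mirrors the recursive form used in the C# example (smoothing factor 2 / (1 + 5), seeded with the first value). The symbols and weight histories are hypothetical, and the helper name `ema` is an assumption for illustration.

```python
# Smoothing factor for a 5-period EMA, as in the C# example: 2 / (1 + 5).
SMOOTHING_FACTOR = 2 / (1 + 5)


def ema(values: list) -> float:
    """Fold the values (oldest first) into an EMA seeded with the first value."""
    result = values[0]
    for value in values[1:]:
        result = SMOOTHING_FACTOR * value + (1 - SMOOTHING_FACTOR) * result
    return result


# Hypothetical weight history per symbol (oldest first).
weight_history = {
    "AAA": [0.2, 0.3, 0.4],
    "BBB": [0.8, 0.7, 0.6],
}

# Smooth each symbol's weights, then normalize so the targets sum to 1.
raw = {symbol: ema(values) for symbol, values in weight_history.items()}
total = sum(raw.values())
targets = {symbol: value / total for symbol, value in raw.items()}
print(targets)
```

Normalizing after smoothing keeps the total allocation at 100% even when the smoothed weights no longer sum to 1 on their own.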