Importing Data

Key Concepts

Introduction

Custom data is your own external data that's not from the Dataset Market. You can use custom data to inform trading decisions and to simulate trades on unsupported securities. To get custom data into your algorithms, you either download the entire file at once or read it line-by-line with a custom data reader. If you use a custom data reader, LEAN sends the data to the OnData method in your algorithm.

Using Custom Data Types

To receive your custom data in the OnData method, create a custom type and then create a data subscription. The custom data type tells LEAN where to get your data and how to read it.

All custom data types must extend the BaseData class (PythonData in Python) and override the GetSource and Reader methods.

public class MyCustomDataType : BaseData
{
    public decimal Property1 { get; set; } = 0;

    public override SubscriptionDataSource GetSource(
        SubscriptionDataConfig config,
        DateTime date,
        bool isLive)
    {
        return new SubscriptionDataSource("<sourceURL>", SubscriptionTransportMedium.RemoteFile);
    }

    public override BaseData Reader(
        SubscriptionDataConfig config,
        string line,
        DateTime date,
        bool isLive)
    {
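        // Skip blank lines and header rows (lines that don't start with a digit)
        if (string.IsNullOrWhiteSpace(line) || !char.IsDigit(line[0]))
        {
            return null;
        }

        var data = line.Split(',');
        // Parse the timestamp once so EndTime is based on this row, not on the reader instance
        var time = DateTime.ParseExact(data[0], "yyyyMMdd", CultureInfo.InvariantCulture);
        return new MyCustomDataType()
        {
            Time = time,
            EndTime = time.AddDays(1),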
            Symbol = config.Symbol,
            Value = data[1].IfNotNullOrEmpty(
                s => decimal.Parse(s, NumberStyles.Any, CultureInfo.InvariantCulture)),
            Property1 = data[2].IfNotNullOrEmpty(
                s => decimal.Parse(s, NumberStyles.Any, CultureInfo.InvariantCulture))
        };
    }
}

class MyCustomDataType(PythonData):
    def GetSource(self,
         config: SubscriptionDataConfig,
         date: datetime,
         isLive: bool) -> SubscriptionDataSource:
        return SubscriptionDataSource("<sourceURL>", SubscriptionTransportMedium.RemoteFile)

    def Reader(self,
         config: SubscriptionDataConfig,
         line: str,
         date: datetime,
         isLive: bool) -> BaseData:

        # Skip blank lines and header rows (lines that don't start with a digit)
        if not (line.strip() and line[0].isdigit()):
            return None

        data = line.split(',')

        custom = MyCustomDataType()
        custom.Symbol = config.Symbol
        custom.EndTime = datetime.strptime(data[0], '%Y%m%d') + timedelta(1)
        custom.Value = float(data[1])
        custom["property1"] = float(data[2])
        return custom
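
The parsing step inside Reader can be tried outside LEAN before you wire it into a custom data type. The following is a minimal sketch, assuming the same CSV layout as the readers above (a yyyyMMdd date, a value, and one extra property). The function name parse_custom_line, the dictionary keys, and the sample rows are hypothetical and are not part of the LEAN API.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_custom_line(line: str) -> Optional[dict]:
    """Parse one 'yyyyMMdd,value,property1' row; return None for rows to skip."""
    stripped = line.strip()
    # Skip blank lines and header rows (anything that doesn't start with a digit).
    if not (stripped and stripped[0].isdigit()):
        return None

    fields = stripped.split(',')
    time = datetime.strptime(fields[0], '%Y%m%d')
    return {
        'time': time,
        # A daily data point ends one day after it starts.
        'end_time': time + timedelta(days=1),
        'value': float(fields[1]),
        'property1': float(fields[2]),
    }


# Hypothetical sample rows: a header line and one data line.
header = parse_custom_line("date,value,property1")
row = parse_custom_line("20240102,101.5,3.2")
```

Returning None for the header row mirrors how the readers above skip lines that should not produce a data point.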

After you define the custom data class, create the subscription in the Initialize method of your algorithm. In C#, call AddData<T>(string ticker, Resolution resolution = Resolution.Daily). In Python, call self.AddData(Type class, string ticker, Resolution resolution = Resolution.Daily). This method gives LEAN the type factory to create the data objects, the name of the data, and the resolution at which to poll the data source for updates.

public class MyAlgorithm : QCAlgorithm
{
    private Symbol _symbol;

    public override void Initialize()
    {
        _symbol = AddData<MyCustomDataType>("<name>", Resolution.Daily).Symbol;
    }
}

class MyAlgorithm(QCAlgorithm):
    def Initialize(self) -> None:
        self.symbol = self.AddData(MyCustomDataType, "<name>", Resolution.Daily).Symbol

As your data reader reads your custom data file, LEAN adds the data points to the Slice it passes to your algorithm's OnData method. To collect the custom data, index the Slice with the Symbol or name of your custom data subscription. You can access the Value and custom properties of your custom data class from the Slice. To access a custom property, use the property itself in C#, or pass the property name to the GetProperty method in Python.

public class MyAlgorithm : QCAlgorithm
{
    public override void OnData(Slice slice)
    {
        if (slice.ContainsKey(_symbol))
        {
            var customData = slice[_symbol];
            var value = customData.Value;
            var property1 = customData.Property1;
        }
    }

    // You can also receive the data instances directly with a typed OnData method
    public void OnData(MyCustomDataType data)
    {
        var value = data.Value;
        var property1 = data.Property1;
    }
}

class MyAlgorithm(QCAlgorithm):
    def OnData(self, slice: Slice) -> None:
        if slice.ContainsKey(self.symbol):
            custom_data = slice[self.symbol]
            value = custom_data.Value
            property1 = custom_data.GetProperty('property1')

Downloading Bulk Data

The Download method downloads the content served from a local file or URL and then returns it as a string.

var content = Download("<filePathOrURL>");
content = self.Download("<filePathOrURL>")

The batch import technique is outside of LEAN's awareness or control, so LEAN can't enforce good practices. However, the technique is well suited to loading the following datasets:

  • Trained AI Models
  • Well-defined historical price datasets
  • Parameters and setting imports such as Symbol lists
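
Because Download returns the whole file as one string, your algorithm has to parse that string itself. The following is a minimal sketch for the Symbol list case, assuming a text file with one ticker per line and optional '#' comment lines. That file layout, the function name parse_symbol_list, and the sample content are assumptions for illustration, not something the Download method requires.

```python
def parse_symbol_list(content: str) -> list:
    """Turn downloaded text (one ticker per line) into a clean list of tickers."""
    tickers = []
    for raw in content.splitlines():
        ticker = raw.strip().upper()
        # Ignore blank lines and '#' comment lines; drop duplicates, keep order.
        if ticker and not ticker.startswith('#') and ticker not in tickers:
            tickers.append(ticker)
    return tickers


# Stand-in for the string returned by self.Download("<filePathOrURL>").
sample_content = "# watchlist\nspy\nAAPL\n\nspy\n"
tickers = parse_symbol_list(sample_content)
```

In an algorithm, you would pass the string returned by self.Download to the parser instead of the sample content.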

File Quotas

In a cloud backtest, you can download up to 100 files. Each file can be up to 200 MB in size and have a file name up to 200 characters long.

Rate Limits

The download methods can download 10 KB per second. To ensure your algorithms run fast, only use a small number of small custom data files.

Timeouts

In cloud algorithms, the download methods have a 10-second timeout period. If the methods don't download the data within 10 seconds, LEAN throws an error.
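
One way to keep a slow or failed download from surfacing as an unhandled error is to wrap the call and fall back to a default. The following is a rough sketch under two assumptions: that the error can be caught in algorithm code, and that a fallback value is acceptable for your use case. The helper download_or_default and the stand-in fake_timeout are hypothetical names, and the broad except clause is used only because this page doesn't name the error type.

```python
from typing import Callable, Optional


def download_or_default(download: Callable[[str], str],
                        source: str,
                        default: Optional[str] = None) -> Optional[str]:
    """Call download(source); return default if the call raises."""
    try:
        return download(source)
    except Exception:
        # The error type isn't documented here, so this catches broadly.
        return default


def fake_timeout(source: str) -> str:
    # Hypothetical stand-in that simulates a download exceeding the timeout.
    raise TimeoutError(f"timed out downloading {source}")


content = download_or_default(fake_timeout, "<filePathOrURL>", default="")
```

In an algorithm, you would pass self.Download as the first argument in place of the stand-in.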
