Defining Data Sources

Universe

Introduction

This page explains how to define a data source that loads alternative data from CSV files for use in universe selection. This type of data is used to select a universe of securities on a set schedule (for example, daily).

Data Format

You must create a file with data in CSV format, with one row per security. For example:

A R735QTJ8XC9X,A,17.19,109700,1885743,False,0.9904858,1
AA R735QTJ8XC9X,AA,71.25,513400,36579750,False,0.3992678,0.750075
AAB R735QTJ8XC9X,AAB,16.38,5000,81900,False,0.9902758,1
...
ZSEV R735QTJ8XC9X,ZSEV,10.5,800,8400,False,0.8981684,1
ZTR R735QTJ8XC9X,ZTR,9.56,102300,977988,False,0.0803037,3.97015016
ZVX R735QTJ8XC9X,ZVX,10,15600,156000,False,1,0.666667

The first column in your data file must be the security identifier and the second column must be the point-in-time ticker.
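
As a quick check of this layout before you write the data source class, the following minimal sketch reads rows in the format above and splits off the two required columns. It is plain Python using only the standard library, not part of LEAN; the function name parse_universe_rows and the SAMPLE constant are hypothetical, and the remaining columns are returned as raw strings because this page does not define their meaning.

```python
import csv
from io import StringIO

# Two rows copied from the sample file above.
SAMPLE = """A R735QTJ8XC9X,A,17.19,109700,1885743,False,0.9904858,1
AA R735QTJ8XC9X,AA,71.25,513400,36579750,False,0.3992678,0.750075
"""


def parse_universe_rows(text):
    """Yield (security_id, ticker, values) for each usable CSV row.

    Hypothetical helper: column 0 is the security identifier, column 1 is
    the point-in-time ticker, and everything after that is dataset-specific.
    """
    for row in csv.reader(StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        security_id, ticker, *values = row
        yield security_id, ticker, values


rows = list(parse_universe_rows(SAMPLE))
for security_id, ticker, values in rows:
    print(security_id, ticker, values)
```

A row with fewer than two columns is skipped here rather than raising an error; whether that is the right policy for your dataset is a design choice.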

Define Data Type

Follow these steps to define the data source class:

  1. Open the Lean.DataSource.<vendorNameDatasetName>/<vendorNameDatasetName>Universe.cs file.
  2. Follow these steps to define the properties of your dataset:
    1. Duplicate lines 33-36 or 38-41 (depending on the data type) for as many properties as there are in your dataset.
    2. Rename the SomeCustomProperty/SomeNumericProperty properties to the names of your dataset properties (for example, Destination/FlightPassengerCount).
    3. Replace the “Some custom data property” comments with a description of each property in your dataset.
  3. Define the GetSource method to point to the path of your dataset file(s).

    Use the date parameter as the file name so that each request resolves to the file for the date being requested. An example output file path is /output/alternative/xyzairline/ticketsales/universe/20200320.csv.

  4. Define the Reader method to return instances of your universe class.

    The first column in your data file must be the security identifier and the second column must be the point-in-time ticker. With this configuration, use new Symbol(SecurityIdentifier.Parse(csv[0]), csv[1]) to create the security Symbol.

    The date in your data file must be the date that the data point is available for consumption. With this configuration, set the Time to date - Period.

  5. Define the following methods in your dataset class:
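
The file-naming and timestamp rules from the GetSource and Reader steps above can be illustrated with a short sketch. This is plain Python rather than the C# data source class itself, and it does not call any LEAN API. The names OUTPUT_ROOT, PERIOD, universe_file_path, and data_point_time are hypothetical, the directory is taken from the example path above, and a one-day period is an assumption for a daily dataset.

```python
from datetime import date, datetime, timedelta
from pathlib import PurePosixPath

# Hypothetical constants for illustration only.
OUTPUT_ROOT = PurePosixPath("/output/alternative/xyzairline/ticketsales/universe")
PERIOD = timedelta(days=1)  # assumed: one data point per day


def universe_file_path(requested: date) -> PurePosixPath:
    """Build the per-date file path: the requested date is the file name."""
    return OUTPUT_ROOT / f"{requested:%Y%m%d}.csv"


def data_point_time(available: datetime, period: timedelta = PERIOD) -> datetime:
    """Return the start time of a data point: availability date minus period."""
    return available - period


print(universe_file_path(date(2020, 3, 20)))
print(data_point_time(datetime(2020, 3, 20)))
```

Subtracting the period means a data point that becomes available on a given date is stamped as starting one period earlier, which matches the "date - Period" rule stated in the steps.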

Example

To view an example data source class for universe data, see the QuiverWallStreetBetsUniverse.cs file in the Lean.DataSource.QuiverQuantWallStreetBets GitHub repository.
