Custom Data

Introduction

This page explains how to request, manipulate, and visualize historical user-defined custom data.

Prerequisites

Working knowledge of C#.

Working knowledge of Python and pandas. If you are not familiar with pandas, see the pandas documentation.

Define Custom Data

You must sort the data file into chronological order before you define the custom data class.
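As a minimal sketch of the sorting step, the following standard-library example reorders CSV rows by the date in the first column. The sample rows and the `sort_rows_chronologically` helper are hypothetical and only illustrate the idea; they are not part of the QuantConnect API.

```python
import csv
from datetime import datetime
from io import StringIO

# Hypothetical sample rows, deliberately out of order.
raw_csv = """Date,Open,High,Low,Close
2011-09-15,7800.0,7850.0,7790.0,7840.0
2011-09-13,7792.9,7799.9,7722.65,7748.7
2011-09-14,7750.0,7810.0,7740.0,7805.5
"""

def sort_rows_chronologically(text: str) -> str:
    """Return the CSV text with its data rows sorted by the date in column 0."""
    reader = csv.reader(StringIO(text))
    header = next(reader)  # keep the header row on top
    rows = sorted(reader, key=lambda row: datetime.strptime(row[0], "%Y-%m-%d"))
    out = StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()

sorted_csv = sort_rows_chronologically(raw_csv)
print(sorted_csv)
```

Parsing the date before sorting, rather than comparing raw strings, keeps the sketch correct even if the file uses a date format that does not sort lexicographically.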

Follow these steps to define a custom data security:

Load the required assembly files and data types.

#load "../Initialize.csx"
#load "../QuantConnect.csx"

using System;
using System.Globalization;
using QuantConnect;
using QuantConnect.Data;
using QuantConnect.Algorithm;
using QuantConnect.Research;

Define the Custom Data Class.

public class Nifty : BaseData
{
    public decimal Open;
    public decimal High;
    public decimal Low;
    public decimal Close;

    public Nifty()
    {
    }

    public override SubscriptionDataSource GetSource(SubscriptionDataConfig config, DateTime date, bool isLiveMode)
    {
        var url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv";
        return new SubscriptionDataSource(url, SubscriptionTransportMedium.RemoteFile);
    }

    public override BaseData Reader(SubscriptionDataConfig config, string line, DateTime date, bool isLiveMode)
    {
        var index = new Nifty();
        index.Symbol = config.Symbol;

        try
        {
            //Example File Format:
            //Date,Open,High,Low,Close,Volume,Turnover
            //2011-09-13,7792.9,7799.9,7722.65,7748.7,116534670,6107.78
            var data = line.Split(',');
            index.Time = DateTime.Parse(data[0], CultureInfo.InvariantCulture);
            index.EndTime = index.Time.AddDays(1);
            index.Open = Convert.ToDecimal(data[1], CultureInfo.InvariantCulture);
            index.High = Convert.ToDecimal(data[2], CultureInfo.InvariantCulture);
            index.Low = Convert.ToDecimal(data[3], CultureInfo.InvariantCulture);
            index.Close = Convert.ToDecimal(data[4], CultureInfo.InvariantCulture);
            index.Value = index.Close;
        }
        catch
        {
             // Do nothing
        }
        return index;
    }
}

class Nifty(PythonData):
    '''NIFTY Custom Data Class'''
    def GetSource(self, config: SubscriptionDataConfig, date: datetime, isLiveMode: bool) -> SubscriptionDataSource:
        url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv"
        return SubscriptionDataSource(url, SubscriptionTransportMedium.RemoteFile)

    def Reader(self, config: SubscriptionDataConfig, line: str, date: datetime, isLiveMode: bool) -> BaseData:
        if not (line.strip() and line[0].isdigit()): return None

        # New Nifty object
        index = Nifty()
        index.Symbol = config.Symbol

        try:
            # Example File Format:
            # Date,Open,High,Low,Close,Volume,Turnover
            # 2011-09-13,7792.9,7799.9,7722.65,7748.7,116534670,6107.78
            data = line.split(',')
            index.Time = datetime.strptime(data[0], "%Y-%m-%d")
            index.EndTime = index.Time + timedelta(days=1)
            # Convert to a number so Value matches the Close field
            index.Value = float(data[4])
            index["Open"] = float(data[1])
            index["High"] = float(data[2])
            index["Low"] = float(data[3])
            index["Close"] = float(data[4])

        except:
            pass

        return index
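To check the parsing logic outside of LEAN, the following standalone sketch mirrors the Reader steps using only the standard library. The `parse_nifty_line` function and its dict output are hypothetical stand-ins for the Nifty object, and the sample line assumes a comma-separated file.

```python
from datetime import datetime, timedelta
from typing import Optional

def parse_nifty_line(line: str) -> Optional[dict]:
    """Parse one CSV row into a plain dict; return None for headers or blank lines."""
    # Skip blank lines and the header row (it does not start with a digit).
    if not (line.strip() and line[0].isdigit()):
        return None
    data = line.strip().split(",")
    time = datetime.strptime(data[0], "%Y-%m-%d")
    return {
        "time": time,
        "end_time": time + timedelta(days=1),  # a daily bar ends one day later
        "open": float(data[1]),
        "high": float(data[2]),
        "low": float(data[3]),
        "close": float(data[4]),
        "value": float(data[4]),  # value mirrors the close
    }

header = parse_nifty_line("Date,Open,High,Low,Close,Volume,Turnover")
bar = parse_nifty_line("2011-09-13,7792.9,7799.9,7722.65,7748.7,116534670,6107.78")
print(header)
print(bar["end_time"], bar["value"])
```

Unlike the classes above, this sketch lets parsing errors surface as exceptions, which can make a malformed row easier to spot while you are developing the reader.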

Create Subscriptions

Follow these steps to subscribe to a custom data security:

  1. Instantiate a QuantBook.

    var qb = new QuantBook();
    qb = QuantBook()

  2. Call the AddData method with a ticker.

    var custom = qb.AddData<Nifty>("NIFTY");
    custom = qb.AddData(Nifty, "NIFTY")

    Custom data has its own resolution, so you don't need to specify it.

  3. Save a reference to the custom data Symbol.

    var symbol = custom.Symbol;
    symbol = custom.Symbol

Get Historical Data

You need a subscription before you can request historical data for a security. You can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined period of time.

(Optional) Call SetStartDate to set the algorithm Time:

qb.SetStartDate(2014, 7, 29);
qb.SetStartDate(2014, 7, 29)

To avoid look-ahead bias, the algorithm Time is the latest date that a historical data request can reach, and the following methods use its value internally to calculate the requested period. Since you have access to the data file, you can set the start date to the date of the last row, in this case, July 28, 2014.
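One plausible reading of why the example calls SetStartDate(2014, 7, 29) for a file whose last row is dated July 28, 2014 is that the custom data class sets each bar's EndTime to one day after its date. The sketch below works through that arithmetic; the sample rows and the `end_time_of_last_row` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical tail of the data file; only the date column matters here.
csv_text = """Date,Open,High,Low,Close
2014-07-24,7796.25,7835.65,7771.65,7830.6
2014-07-25,7828.2,7840.95,7748.6,7790.45
2014-07-28,7792.9,7799.9,7722.65,7748.7
"""

def end_time_of_last_row(text: str) -> datetime:
    """Return the end time of the last data row (row date plus one day)."""
    last_line = [line for line in text.splitlines() if line.strip()][-1]
    last_date = datetime.strptime(last_line.split(",")[0], "%Y-%m-%d")
    return last_date + timedelta(days=1)

start_date = end_time_of_last_row(csv_text)
print(start_date.year, start_date.month, start_date.day)
```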

Trailing Number of Bars

Call the History method with a symbol, an integer, and, optionally, a resolution to request historical data based on the given number of trailing bars.

var history = qb.History(symbol, 10);
history = qb.History(symbol, 10)

The call returns the most recent bars, including periods when there is no data, for example, when the exchange is closed.

Trailing Period of Time

Call the History method with a symbol, a TimeSpan (C#) or timedelta (Python), and, optionally, a resolution to request historical data based on the given trailing period of time.

var history = qb.History(symbol, TimeSpan.FromDays(10));
history = qb.History(symbol, timedelta(days=10))

The call returns the most recent bars, including periods when there is no data, for example, when the exchange is closed.

Defined Period of Time

Call the History method with a symbol, a start DateTime (C#) or datetime (Python), an end DateTime or datetime, and, optionally, a resolution to request historical data based on the defined period of time.

var startTime = new DateTime(2013, 7, 29);
var endTime = new DateTime(2014, 7, 29);
var history = qb.History(symbol, startTime, endTime);
start_time = datetime(2013, 7, 29)
end_time = datetime(2014, 7, 29)
history = qb.History(symbol, start_time, end_time)

The call returns the bars that are timestamped within the defined period of time.
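The three request styles differ only in how the time window is derived. The following sketch illustrates that on a plain list of timestamps. It is a simplification rather than LEAN's implementation: the helper names are hypothetical, and the boundary handling (start exclusive, end inclusive) is an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical daily bar end times: 2014-07-20 through 2014-07-29.
bar_times = [datetime(2014, 7, 20) + timedelta(days=i) for i in range(10)]
now = datetime(2014, 7, 29)  # stands in for the algorithm Time

def trailing_bars(times, now, count):
    """Return the last `count` bars at or before `now`."""
    if count <= 0:
        return []
    eligible = [t for t in times if t <= now]
    return eligible[-count:]

def trailing_period(times, now, span):
    """Return the bars whose time falls in (now - span, now]."""
    return [t for t in times if now - span < t <= now]

def defined_period(times, start, end):
    """Return the bars whose time falls in (start, end]."""
    return [t for t in times if start < t <= end]

last_three = trailing_bars(bar_times, now, 3)
last_three_days = trailing_period(bar_times, now, timedelta(days=3))
window = defined_period(bar_times, datetime(2014, 7, 22), datetime(2014, 7, 25))
print(last_three[0], last_three_days[0], window[0])
```

With one bar per day, the trailing-bars and trailing-period requests select the same bars. They diverge as soon as the data has gaps, which is why the docs distinguish between them.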

In all of the cases above, the History method returns a DataFrame with a MultiIndex in Python and an IEnumerable<Nifty> for single-security requests in C#.

Download Method

Call the Download method with the custom data URL.

var url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv";
var content = qb.Download(url);
url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv"
content = qb.Download(url)

Follow these steps to convert the downloaded content to a DataFrame:

  1. Import StringIO from the io library.

    from io import StringIO

  2. Create a StringIO with the downloaded content.

    data = StringIO(content)

  3. Call the read_csv method to convert the data to a DataFrame.

    dataframe = pd.read_csv(data, index_col=0)
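To try the conversion without a network call, the sketch below substitutes a short, hypothetical string for the value returned by qb.Download. It also passes `parse_dates=True`, which is an addition not used in the steps above, so that the index holds timestamps instead of strings.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the string returned by qb.Download(url).
content = """Date,Open,High,Low,Close
2011-09-13,7792.9,7799.9,7722.65,7748.7
2011-09-14,7750.0,7810.0,7740.0,7805.5
"""

# index_col=0 makes the Date column the index; parse_dates=True parses it.
dataframe = pd.read_csv(StringIO(content), index_col=0, parse_dates=True)
print(dataframe.index.dtype)
print(dataframe["Close"].iloc[-1])
```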

Wrangle Data

You need some historical data to perform wrangling operations. In Python, run a cell in a Jupyter Notebook with the pandas object as the last line to display the historical data. In C#, use LINQ to wrangle the data and then call the Console.WriteLine method in a Jupyter Notebook to display the data.

Select One Custom Data

In C#, iterate through the IEnumerable<Nifty> to get the historical custom data.

In Python, index the DataFrame with a Symbol to select the historical custom data.

foreach(var bar in history.TakeLast(5))
{
    Console.WriteLine($"{bar} EndTime: {bar.EndTime}");
}
history.loc[symbol]

In C#, the Jupyter Notebook displays the contents of the last 5 Nifty objects.

In Python, the Jupyter Notebook displays a DataFrame that contains the open, high, low, close, and value attributes of the Nifty object.

Select a Data Property

In C#, iterate through the IEnumerable<Nifty> and select a property of the Nifty object to get the historical values of the property.

In Python, index the DataFrame with a Symbol to select the historical data of the security and then select a property column to get the historical values of the property.

var values = history.Select(data => $"{data.Symbol} Value: {data.Value} EndTime: {data.EndTime}");
foreach(var value in values.TakeLast(5))
{
    Console.WriteLine(value);
}
history.loc[symbol]['value']

In C#, the Jupyter Notebook displays the last 5 values of the Nifty object.

In Python, the DataFrame is transformed into a Series of NIFTY values.

Unstack the DataFrame

If you request historical data for multiple symbols of a custom data class, you can transform the DataFrame so that it is a time series of values for all of the custom data. Select the column that you want to display for each custom data and then call the unstack method to transform the DataFrame into the desired format.

history['value'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each custom data and each row contains the value.
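The reshaping can be reproduced on a small synthetic DataFrame. In this sketch, the second ticker ("OTHER"), the values, and the index level names are hypothetical. It assumes the history DataFrame is indexed by symbol first and time second, which is consistent with the `history.loc[symbol]` and `unstack(level=0)` calls above.

```python
import pandas as pd

# Hypothetical two-symbol history with a (symbol, time) MultiIndex.
times = pd.to_datetime(["2014-07-25", "2014-07-28"])
index = pd.MultiIndex.from_product(
    [["NIFTY", "OTHER"], times], names=["symbol", "time"]
)
history = pd.DataFrame({"value": [7790.45, 7748.7, 100.0, 101.5]}, index=index)

# Move the symbol level (level 0) from the row index into the columns.
values = history["value"].unstack(level=0)
print(values)
```

After the call, each row is a timestamp and each column is a symbol, which is the layout the pandas plot method expects for one line per series.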

Plot Data

C# Jupyter Notebooks don't currently support libraries to plot historical data, but we are working on adding the functionality. Until then, use Python to plot historical custom data.

You need to get some historical custom data to plot it. You can use many of the supported plotting libraries to visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

  1. Import the plotly library.

    import plotly.graph_objects as go

  2. Select the data.

    history = history.loc[symbol]

  3. Create a Candlestick.

    candlestick = go.Candlestick(x=history.index,
                                 open=history['open'],
                                 high=history['high'],
                                 low=history['low'],
                                 close=history['close'])

  4. Create a Layout.

    layout = go.Layout(title=go.layout.Title(text='NIFTY OHLC'),
                       xaxis_title='Date',
                       yaxis_title='Price',
                       xaxis_rangeslider_visible=False)

  5. Create the Figure.

    fig = go.Figure(data=[candlestick], layout=layout)

  6. Show the Figure.

    fig.show()

    Candlestick charts display the open, high, low, and close prices of the security.

Line Chart

Follow these steps to plot line charts using built-in methods:

  1. Select data to plot.

    values = history['value'].unstack(level=0)

  2. Call the plot method on the pandas object.

    values.plot(title="Value", figsize=(15, 10))

  3. Show the plot.

    plt.show()

    Line charts display the value of the property you selected in a time series.
