QuantConnect
US Equity
Introduction
Download the US Equities dataset to your local machine. You can download the full dataset in bulk to avoid selection bias or download individual tickers to keep the cost low. The dataset contains data for every ticker and trading day. If the resolution you download provides trade and quote data, the download contains both data types. To check which data types each resolution provides, see Resolutions.
To use the CLI, you must be a member of an organization on a paid tier.
Prerequisites
The US Equities dataset depends on the US Equity Security Master dataset because the US Equity Security Master contains information on splits, dividends, and symbol changes. Before you download US Equities data, open the Pricing page of your organization and subscribe to the data package. You need billing permissions to change the organization's subscriptions.
Download in Bulk
After you subscribe to local access (see Prerequisites), open a terminal in your organization workspace and run the following commands to bulk download the data and its prerequisites.
To download the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
To download the US Equities data for a resolution, run the following command, replacing <resolution> with daily, hour, minute, second, or tick and adjusting the date range:
$ lean data download --dataset "US Equities" --data-type "Bulk" --resolution "<resolution>" --start "20230101" --end "20230105"
You can also use the CLI Command Generator on the dataset listing to generate the command. For more information about the interactive wizard and the CLI Command Generator, see Using the CLI.
After you bulk download the US Equities dataset, new daily updates are available at 7 AM Eastern Time (ET) after each trading day. Subscribe to at least one of the data packages to unlock the updates. Instead of directly calling the lean data download command, you can place a Python script in the data directory of your organization workspace and run it to update your data files. The following example script updates all data resolutions:
import os
import pandas as pd
from datetime import datetime, time, timedelta
from pytz import timezone
from os.path import abspath, dirname

# Run from the directory that contains this script (the data directory)
os.chdir(dirname(abspath(__file__)))

# Set to True to append --overwrite to every download command
OVERWRITE = False

# Define a method to download the data
def __download_data(resolution, start=None, end=None):
    print(f"Updating {resolution} data...")
    command = f'lean data download --dataset "US Equities" --data-type "Bulk" --resolution "{resolution}"'
    if start:
        # Without an explicit end date, request a single day
        end = end if end else start
        command += f" --start {start} --end {end}"
    if OVERWRITE:
        command += " --overwrite"
    print(command)
    os.system(command)

# Return the latest date (YYYYMMDD) the script expects to find in the cloud
def __get_end_date() -> str:
    now = datetime.now(timezone("US/Eastern"))
    if now.time() > time(7,30):
        return (now - timedelta(1)).strftime("%Y%m%d")
    print('New data is available at 07:30 AM ET')
    return (now - timedelta(2)).strftime("%Y%m%d")

# Minute, second, and tick data: one folder per ticker, one file per day
def __download_high_frequency_data(latest_on_cloud):
    for resolution in ["minute", "second", "tick"]:
        dir_name = f"equity/usa/{resolution}/spy".lower()
        if not os.path.exists(dir_name):
            __download_data(resolution, '19980101')
            continue
        # The part of the file name before the first underscore is the date
        latest_on_disk = sorted(os.listdir(dir_name))[-1].split('_')[0]
        if latest_on_disk >= latest_on_cloud:
            print(f"{resolution} data is already up to date.")
            continue
        __download_data(resolution, latest_on_disk, latest_on_cloud)

# Daily and hour data: one zip file per ticker
def __download_low_frequency_data(latest_on_cloud):
    for resolution in ["daily", "hour"]:
        file_name = f"equity/usa/{resolution}/spy.zip".lower()
        if not os.path.exists(file_name):
            __download_data(resolution)
            continue
        # The first 8 characters of the last row's first column are the date
        latest_on_disk = str(pd.read_csv(file_name, header=None)[0].iloc[-1])[:8]
        if latest_on_disk >= latest_on_cloud:
            print(f"{resolution} data is already up to date.")
            continue
        __download_data(resolution)

if __name__ == "__main__":
    latest_on_cloud = __get_end_date()
    __download_low_frequency_data(latest_on_cloud)
    __download_high_frequency_data(latest_on_cloud)
The preceding script checks the date of the most recent SPY data you have for all resolutions. If there is new data available for any of these resolutions, it downloads the new data files and overwrites your hourly and daily files. If you don't intend to download all resolutions, adjust this script to your needs.
To update your local copy of the US Equity Security Master, run:
$ lean data download --dataset "US Equity Security Master"
Download by Ticker
To download data for selected tickers instead of the full dataset, run a non-interactive lean data download command. First, download the prerequisite US Equity Security Master:
$ lean data download --dataset "US Equity Security Master"
Then download the US Equities data for the tickers, resolution, and date range you need. Tick, second, and minute resolutions provide separate trade and quote data, so run a command for each data type. For example, to download minute-resolution data for SPY:
$ lean data download --dataset "US Equities" --data-type "Trade" --ticker "SPY" --resolution "Minute" --start "20230101" --end "20230105"
$ lean data download --dataset "US Equities" --data-type "Quote" --ticker "SPY" --resolution "Minute" --start "20230101" --end "20230105"
Hour and daily resolutions provide only trade data, so download only the trade data type for those resolutions.
To download data interactively or to use the CLI Command Generator instead, see Using the CLI.
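If you need several tickers, it can help to generate the commands rather than type them. The following is a minimal sketch, not part of the CLI: `build_commands` is a hypothetical helper, `AAPL` is only an illustrative second ticker, and the sketch uses only the flags shown in the preceding commands. It emits a trade and a quote command for tick, second, and minute resolutions and a trade command alone for hour and daily resolutions.

```python
import shlex

# Resolutions that, per the text above, provide separate trade and quote data.
HIGH_FREQUENCY = {"tick", "second", "minute"}


def build_commands(tickers, resolution, start, end):
    """Build `lean data download` commands (hypothetical helper, not part of the CLI)."""
    if resolution.lower() in HIGH_FREQUENCY:
        data_types = ["Trade", "Quote"]
    else:
        data_types = ["Trade"]

    commands = []
    for ticker in tickers:
        for data_type in data_types:
            args = [
                "lean", "data", "download",
                "--dataset", "US Equities",
                "--data-type", data_type,
                "--ticker", ticker,
                "--resolution", resolution.capitalize(),
                "--start", start,
                "--end", end,
            ]
            # shlex.join quotes arguments that contain spaces
            commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Print the commands for review before you run them
    for command in build_commands(["SPY", "AAPL"], "minute", "20230101", "20230105"):
        print(command)
```

Printing the commands first lets you check the number of files you are about to buy before you run anything.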
Size and Format
The following table shows the size and format of the US Equities dataset for each resolution:
| Resolution | Size | Format |
|---|---|---|
| Daily | 2 GB | 1 file per ticker |
| Hour | 4 GB | 1 file per ticker |
| Minute | 500 GB | 1 file per ticker per day |
| Second | 1.5 TB | 1 file per ticker per day |
| Tick | 1.5 TB | 1 file per ticker per day |
For more information about the file format, see the Data / equity directory in the LEAN repository.
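Before you start a bulk download, you may want to check that the target drive has enough room. The following sketch sums the sizes from the preceding table and compares the total with the free space reported by the operating system. The helper names are hypothetical, the sizes are the approximate figures from the table, and 1 TB is assumed to equal 1,000 GB.

```python
import shutil

# Approximate bulk sizes in GB, copied from the table above.
# Assumption: 1 TB is treated as 1,000 GB.
DATASET_SIZE_GB = {
    "daily": 2,
    "hour": 4,
    "minute": 500,
    "second": 1500,
    "tick": 1500,
}


def required_gb(resolutions):
    """Return the approximate disk space the given resolutions need, in GB."""
    return sum(DATASET_SIZE_GB[resolution] for resolution in resolutions)


def has_room(resolutions, path="."):
    """Return True if the drive that holds `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb(resolutions)


if __name__ == "__main__":
    wanted = ["daily", "hour", "minute"]
    print(f"Need about {required_gb(wanted)} GB; enough room: {has_room(wanted)}")
```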
Price
Both the bulk and by-ticker downloads require the US Equity Security Master. The following table shows the cost of an annual subscription to the US Equity Security Master for each organization tier:
| Tier | Price ($/Year) |
|---|---|
| Quant Researcher | 600 |
| Team | 900 |
| Trading Firm | 1,200 |
| Institution | 1,800 |
Download in Bulk
To download the US Equities dataset in bulk, subscribe to it on the Pricing page of your organization. The price depends on your organization tier and the resolution you need. The following table shows the price ($/year) to download the historical data of each resolution for each organization tier:
| Resolution | Quant Researcher | Team | Trading Firm | Institution |
|---|---|---|---|---|
| Tick | 16,800 | 28,800 | 33,600 | 52,800 |
| Second | 15,360 | 28,800 | 33,600 | 48,000 |
| Minute | 11,760 | 16,800 | 31,200 | 43,200 |
| Hour | 2,136 | 3,480 | 3,480 | 3,480 |
| Daily | 2,136 | 3,480 | 3,480 | 3,480 |
After the first bulk subscription ends, subscribe to the updates to keep your local data current. The updates cost the same for all resolutions. The following table shows the price ($/year) of the updates for each organization tier:
| Tier | Price ($/Year) |
|---|---|
| Quant Researcher | 600 |
| Team | 840 |
| Trading Firm | 1,440 |
| Institution | 2,640 |
The following table shows the total cost of downloading the required datasets in bulk at minute resolution on the Quant Researcher tier. Other organization tiers apply their own rates, shown in the preceding tables.
| Dataset | Package | Historical | Updates |
|---|---|---|---|
| US Equity Security Master | Subscription | $600 | $600/year |
| US Equities | Minute | $11,760 | $600/year |
| Total | | $12,360 | $1,200/year |
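To repeat this calculation for another tier or resolution, you can encode the three price tables and add them the same way. The sketch below is illustrative only: `bulk_cost` is a hypothetical helper, the prices are copied from the preceding tables and may change, and it assumes, as the example table does, that the US Equity Security Master subscription price counts toward both the historical and the yearly totals.

```python
# Prices in $/year, copied from the tables above.
SECURITY_MASTER = {"Quant Researcher": 600, "Team": 900, "Trading Firm": 1200, "Institution": 1800}
UPDATES = {"Quant Researcher": 600, "Team": 840, "Trading Firm": 1440, "Institution": 2640}
HISTORICAL = {
    "tick": {"Quant Researcher": 16800, "Team": 28800, "Trading Firm": 33600, "Institution": 52800},
    "second": {"Quant Researcher": 15360, "Team": 28800, "Trading Firm": 33600, "Institution": 48000},
    "minute": {"Quant Researcher": 11760, "Team": 16800, "Trading Firm": 31200, "Institution": 43200},
    "hour": {"Quant Researcher": 2136, "Team": 3480, "Trading Firm": 3480, "Institution": 3480},
    "daily": {"Quant Researcher": 2136, "Team": 3480, "Trading Firm": 3480, "Institution": 3480},
}


def bulk_cost(tier, resolution):
    """Return (historical total, yearly updates total) in USD for one resolution."""
    historical = SECURITY_MASTER[tier] + HISTORICAL[resolution][tier]
    updates = SECURITY_MASTER[tier] + UPDATES[tier]
    return historical, updates


if __name__ == "__main__":
    historical, updates = bulk_cost("Quant Researcher", "minute")
    print(f"Historical: ${historical:,}  Updates: ${updates:,}/year")
```

For the Quant Researcher tier at minute resolution, the function returns the same totals as the table: $12,360 for the historical data and $1,200 per year for updates.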
Download by Ticker
The US Equities dataset is available in several resolutions. The resolution you need depends on the US Equity subscriptions you create in your algorithm and the resolution of data you get in history requests. The following table describes the file format and costs of each resolution:
| Resolution | File Format | Cost per file |
|---|---|---|
| Tick | One file per security per trading day per data format. Quote and trade data are separate files. | 6 QCC = $0.06 USD |
| Second | One file per security per trading day per data format. Quote and trade data are separate files. | 5 QCC = $0.05 USD |
| Minute | One file per security per trading day per data format. Quote and trade data are separate files. | 5 QCC = $0.05 USD |
| Hour | One file per security. | 300 QCC = $3 USD |
| Daily | One file per security. | 100 QCC = $1 USD |
For example, the following algorithm subscribes to minute resolution data for a US Equity:
public class USEquityDataAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2020, 1, 1);
        SetEndDate(2021, 1, 1);
        AddEquity("SPY", Resolution.Minute);
    }
}

class USEquityDataAlgorithm(QCAlgorithm):
    def initialize(self) -> None:
        self.set_start_date(2020, 1, 1)
        self.set_end_date(2021, 1, 1)
        self.add_equity("SPY", Resolution.MINUTE)
The following table shows the data cost of the preceding algorithm on the Quant Researcher tier:
| Dataset | Package | Initial Cost | Ongoing Cost |
|---|---|---|---|
| US Equity Security Master | Download On Premise | $600 USD | $600 USD/year |
| US Equity | Minute Download | 1 security over 252 trading days with 2 data formats => 1 * 252 * 2 files = 504 files; 504 files @ 5 QCC/file => 504 * 5 QCC = 2,520 QCC = $25.20 USD | 1 security with 2 data formats => 1 * 2 files/day = 2 files/day; 2 files/day @ 5 QCC/file => 2 * 5 QCC/day = 10 QCC/day = $0.10 USD/day |
The preceding table assumes you download trade and quote data, but you can run backtests with only trade data.
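The same arithmetic can be scripted to estimate the cost of other subscriptions. The following sketch is an illustration under stated assumptions: `ticker_cost_qcc` is a hypothetical helper, the per-file prices are copied from the preceding table, and 100 QCC is taken to equal $1 USD, as the table implies. The Security Master subscription is not included.

```python
# Cost per file in QCC, copied from the table above.
QCC_PER_FILE = {"tick": 6, "second": 5, "minute": 5, "hour": 300, "daily": 100}

# Per the table above, hour and daily data use one file per security.
ONE_FILE_PER_SECURITY = {"hour", "daily"}


def ticker_cost_qcc(resolution, securities, trading_days, include_quotes=True):
    """Estimate the by-ticker download cost in QCC (hypothetical helper)."""
    price = QCC_PER_FILE[resolution]
    if resolution in ONE_FILE_PER_SECURITY:
        # The date range does not change the number of files
        return securities * price
    # Tick, second, and minute: one file per security per day per data format
    formats = 2 if include_quotes else 1
    return securities * trading_days * formats * price


if __name__ == "__main__":
    qcc = ticker_cost_qcc("minute", securities=1, trading_days=252)
    # Assumption: 100 QCC = $1 USD
    print(f"{qcc:,} QCC = ${qcc / 100:.2f} USD")
```

With one security, 252 trading days, and both data formats, the function returns 2,520 QCC, which matches the table. Passing `include_quotes=False` halves the figure for a trade-only backtest.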