QuantConnect

US ETF Constituents

Introduction

Download the US ETF Constituents dataset to your local machine to get constituents data for all of the supported ETFs for every trading day without any selection bias.

This dataset selects US Equities, so you also need to purchase the US Equities dataset.

To use the CLI, you must be a member of an organization on a paid tier.

Prerequisites

The US ETF Constituents dataset depends on the US Equity Security Master dataset because the US Equity Security Master contains information on splits, dividends, and symbol changes. Before you download US ETF Constituents data, open the Pricing page of your organization and subscribe to the US Equity Security Master by QuantConnect data package. You need billing permissions to change the organization's subscriptions.

Download in Bulk

After you subscribe to local access (see Prerequisites), open a terminal in your organization workspace and run the following commands to bulk download the data and its prerequisites.

To download the US ETF Constituents data, run:

$ lean data download --dataset "US ETF Constituents" --data-type "Bulk" --start "20250701" --end "20260701"

To download the US Equity Security Master, run:

$ lean data download --dataset "US Equity Security Master"

After you bulk download the US ETF Constituents dataset, new daily updates are available at 7 AM Eastern Time (ET) after each trading day. Instead of directly calling the lean data download command, you can place the following Python script in the data directory of your organization workspace and run it to update your data files:

import os
import sys
from datetime import datetime, time, timedelta
from pytz import timezone
from os.path import abspath, dirname

# Run from the directory that contains this script (the data directory).
os.chdir(dirname(abspath(__file__)))

# Set to True to append the --overwrite flag to the download command.
OVERWRITE = False

def __get_start_date() -> str:
    # Start from the most recent local SPY file; fall back to 19980101 if there is none.
    dir_name = "equity/usa/universes/etf/spy"
    files = [] if not os.path.exists(dir_name) else sorted(os.listdir(dir_name))
    return files[-1].split(".")[0] if files else '19980101'

def __get_end_date() -> str:
    now = datetime.now(timezone("US/Eastern"))
    # After 7 AM ET, the previous day's file is available.
    if now.time() > time(7, 0):
        return (now - timedelta(1)).strftime("%Y%m%d")
    print('New data is available at 07:00 AM ET')
    return (now - timedelta(2)).strftime("%Y%m%d")

if __name__ == "__main__":
    start, end = __get_start_date(), __get_end_date()
    if start >= end:
        sys.exit("Your data is already up to date.")

    command = f'lean data download --dataset "US ETF Constituents" --data-type "Bulk" --start {start} --end {end}'
    if OVERWRITE:
        command += " --overwrite"
    print(command)
    os.system(command)

The preceding script checks the date of the most recent SPY data you have. If there is new data available for SPY, it downloads the new data files for all of the ETFs. You may need to adjust this script to fit your needs.
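
One possible adjustment, sketched here under assumptions, is to check every ETF folder instead of only SPY, so an ETF whose files lag behind SPY is still updated. The function name `oldest_latest_date` and the `etf_root` parameter are hypothetical, and the sketch assumes every ETF folder uses the same `YYYYMMDD.csv` file naming as the `spy` folder.

```python
import os
from typing import Optional

def oldest_latest_date(etf_root: str) -> Optional[str]:
    """Return the earliest 'most recent file date' (YYYYMMDD) across all ETF folders.

    Assumes each subfolder of etf_root holds files named YYYYMMDD.csv,
    like equity/usa/universes/etf/spy. Returns None if no files exist.
    """
    latest_dates = []
    for ticker in sorted(os.listdir(etf_root)):
        folder = os.path.join(etf_root, ticker)
        if not os.path.isdir(folder):
            continue
        # File names are dates, so the lexicographic maximum is the newest file.
        dates = [name.split(".")[0] for name in os.listdir(folder) if name.endswith(".csv")]
        if dates:
            latest_dates.append(max(dates))
    # The ETF that is furthest behind determines where the update should start.
    return min(latest_dates) if latest_dates else None
```

You could call `oldest_latest_date("equity/usa/universes/etf")` inside `__get_start_date` and fall back to '19980101' when it returns None. Starting from an earlier date re-requests files that other ETFs already have, so weigh that against the risk of missing files.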

To update your local copy of the US Equity Security Master, run:

$ lean data download --dataset "US Equity Security Master"

Download by Date

The US ETF Constituents dataset provides one file per ETF per trading day, so you download it by date and ETF instead of by individual ticker. First, download the prerequisite US Equity Security Master:

$ lean data download --dataset "US Equity Security Master"

Then download the US ETF Constituents data for the ETF and date range you need:

$ lean data download --dataset "US ETF Constituents" --data-type "Trade" --ticker "SPY" --start "20250701" --end "20260701"

To download data interactively or to use the CLI Command Generator, see Using the CLI.
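
If you need several ETFs, you can generate one command per ticker. The following is a minimal sketch that only builds the command strings, reusing the flags from the command above. The helper name `build_download_commands` is hypothetical, and QQQ is only an illustrative ticker, so confirm that the dataset supports each ETF you request.

```python
import shlex
from typing import List

def build_download_commands(tickers: List[str], start: str, end: str) -> List[str]:
    """Build one 'lean data download' command string per ETF ticker."""
    commands = []
    for ticker in tickers:
        args = [
            "lean", "data", "download",
            "--dataset", "US ETF Constituents",
            "--data-type", "Trade",  # same value as the command in this section
            "--ticker", ticker.upper(),
            "--start", start,
            "--end", end,
        ]
        # shlex.join quotes any argument that contains spaces.
        commands.append(shlex.join(args))
    return commands

if __name__ == "__main__":
    for command in build_download_commands(["SPY", "QQQ"], "20250701", "20260701"):
        print(command)  # review each command, then run it in your organization workspace
```

Printing the commands before running them lets you check the number of files, and therefore the cost, before you spend any QCC.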

Size and Format

The US ETF Constituents dataset is 50 GB in size. We structure the data files so there is one file per ETF per day. For an example, see the Data / equity / usa / universes / etf / spy / 20201201.csv file in the LEAN repository.

Price

The US ETF Constituents dataset selects US Equities, so you also need the US Equities data to use the selected constituents. Review the US Equity costs in addition to the prices below.

Download in Bulk

To download the US ETF Constituents dataset in bulk, subscribe to it on the Pricing page of your organization. The bulk download also requires the US Equity Security Master subscription. Your first bulk subscription is the historical subscription, which lasts one year and lets you download the full historical dataset. After that subscription ends, renew with the cheaper updates subscription to keep your data current. The following table shows the annual price ($/year) of each subscription for every organization tier:

Tier | Historical | Updates
Quant Researcher | 3,600 | 1,200
Team | 3,960 | 1,200
Trading Firm | 3,960 | 1,200
Institution | 3,960 | 1,200

The following table shows the total cost of downloading the required datasets, including minute US Equities data for the selected constituents, in bulk on the Quant Researcher tier. Other organization tiers apply their own rates.

Dataset | Package | Historical | Updates
US Equity Security Master | Subscription | $600 | $600/year
US ETF Constituents | Subscription | $3,600 | $1,200/year
US Equities | Minute | $11,760 | $600/year
Total | | $15,960 | $2,400/year

Download by Date

When you download by date, the US ETF Constituents data is one file per ETF per day and each file costs 50 QCC = $0.50 USD. The following table shows the cost of downloading one year of data for one ETF by date on the Quant Researcher tier, assuming you download minute US Equities data for all 500 constituents of an ETF such as SPY:

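Dataset | Package | Initial Cost | Ongoing Cost
US Equity Security Master | Subscription | $600 | $600/year
US ETF Constituents | Download | 12,600 QCC = $126 USD | 50 QCC/day = $0.50 USD/day
US Equities | Minute | 1,260,000 QCC = $12,600 USD | 5,000 QCC/day = $50 USD/day

The costs in the table are calculated as follows:

US ETF Constituents, initial cost:
1 ETF over 252 trading days => 252 files
252 files @ 50 QCC/file => 252 * 50 QCC = 12,600 QCC = $126 USD

US ETF Constituents, ongoing cost:
1 ETF => 1 file/day
1 file/day @ 50 QCC/file => 50 QCC/day = $0.50 USD/day

US Equities, initial cost:
500 constituents over 252 trading days with 2 data formats => 500 * 252 * 2 files = 252,000 files
252,000 files @ 5 QCC/file => 252,000 * 5 QCC = 1,260,000 QCC = $12,600 USD

US Equities, ongoing cost:
500 constituents with 2 data formats => 1,000 files/day
1,000 files/day @ 5 QCC/file => 5,000 QCC/day = $50 USD/day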

This example downloads all 500 constituents of an ETF such as SPY to show the maximum cost. The cost depends on your selection, so you can download fewer constituents (for example, 50) to reduce it. As the number of constituents grows, compare the by-date US Equities cost with the bulk cost above to determine when downloading in bulk is cheaper.
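
The comparison can be sketched in a few lines of Python. This is an illustration using only the Quant Researcher figures quoted on this page (252 trading days, 2 data formats, 5 QCC per US Equities file, and the $11,760 bulk historical price). The rate of 100 QCC per USD is inferred from "50 QCC = $0.50 USD", and the function names are hypothetical. The sketch compares first-year US Equities costs only and ignores the ongoing update costs and the US ETF Constituents files themselves.

```python
TRADING_DAYS = 252                     # trading days in the one-year example
DATA_FORMATS = 2                       # data formats per constituent per day
QCC_PER_EQUITY_FILE = 5                # by-date price of one US Equities file
QCC_PER_USD = 100                      # inferred from 50 QCC = $0.50 USD
BULK_EQUITIES_HISTORICAL_USD = 11_760  # bulk minute US Equities, Quant Researcher tier

def by_date_equities_cost_usd(constituents: int) -> float:
    """First-year by-date cost of minute US Equities data for the given number of constituents."""
    files = constituents * TRADING_DAYS * DATA_FORMATS
    return files * QCC_PER_EQUITY_FILE / QCC_PER_USD

def break_even_constituents() -> int:
    """Smallest number of constituents for which by-date costs more than bulk."""
    qcc_per_constituent = TRADING_DAYS * DATA_FORMATS * QCC_PER_EQUITY_FILE
    bulk_qcc = BULK_EQUITIES_HISTORICAL_USD * QCC_PER_USD
    # Integer arithmetic in QCC avoids floating-point rounding at the boundary.
    return bulk_qcc // qcc_per_constituent + 1

if __name__ == "__main__":
    for n in (50, 500):
        print(f"{n} constituents by date: ${by_date_equities_cost_usd(n):,.2f}")
    print(f"Bulk is cheaper from {break_even_constituents()} constituents")
```

Under these figures, each constituent costs $25.20 for the year by date, so downloading by date stays cheaper than the bulk historical package up to 466 constituents.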
