US Equity

Data Preparation

Introduction

The US Equities dataset provides price data for backtests and live trading.

Sourcing

The QuantConnect data provider consolidates US Equity market data across all of the exchanges. Over-the-Counter (OTC) trades are excluded. The data is powered by the Securities Information Processor (SIP), so it has 100% market coverage. In contrast, free platforms that display data feeds like the Better Alternative Trading System (BATS) only have about 6-7% market coverage.

We provide live splits, dividends, and corporate actions for US companies. We deliver them to your algorithm before the trading day starts.

Bar Building

We aggregate ticks to build bars.

The bar-building process can exclude ticks. An excluded tick still contributes its volume to the bar, but its price is not used when the bar's prices are computed. Ticks are excluded if any of the following statements are true:

  • The tick is suspicious.
  • The tick is from the FINRA exchange and meets our price and volume thresholds.
  • The trade has none of the following included TradeConditionFlags and at least one of the following excluded TradeConditionFlags:
  • The quote has a size of less than 100 shares.
  • The quote has none of the following included QuoteConditionFlags and at least one of the following excluded QuoteConditionFlags:
  • The quote has one of the following QuoteConditionFlags:

In the preceding tables, Participant refers to the entities on page 19 of the Consolidated Tape System Multicast Output Binary Specification.
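The volume-but-not-price rule described above can be illustrated with a minimal Python sketch. The `Tick` and `Bar` classes and the `excluded` flag are hypothetical names introduced for illustration only; they are not QuantConnect's actual types, and the real exclusion rules are the ones listed above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Tick:
    price: float
    size: int
    # Hypothetical flag: True if any of the exclusion rules matched.
    excluded: bool = False


@dataclass
class Bar:
    open: Optional[float] = None
    high: Optional[float] = None
    low: Optional[float] = None
    close: Optional[float] = None
    volume: int = 0


def build_bar(ticks: Iterable[Tick]) -> Bar:
    """Aggregate trade ticks into a single bar.

    Every tick adds to the volume; only non-excluded ticks update prices.
    """
    bar = Bar()
    for tick in ticks:
        bar.volume += tick.size
        if tick.excluded:
            continue  # volume already counted, price ignored
        if bar.open is None:
            bar.open = bar.high = bar.low = tick.price
        bar.high = max(bar.high, tick.price)
        bar.low = min(bar.low, tick.price)
        bar.close = tick.price
    return bar


ticks = [
    Tick(100.00, 200),
    Tick(105.00, 50, excluded=True),  # price ignored, volume kept
    Tick(100.10, 300),
]
bar = build_bar(ticks)
```

In this example, the excluded tick at 105.00 does not become the bar's high, but its 50 shares are still part of the bar's volume.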

Suspicious Ticks

Tick price data is raw and unfiltered, so it can contain a lot of noise. If a tick is not tradable, we flag it as suspicious. This process makes the bars a more realistic representation of what you could execute in live trading. If you use tick data, avoid using suspicious ticks in your algorithms as informative data points. We recommend only using tick data if you understand the risks and are able to perform your own tick filtering. Ticks are flagged as suspicious in the following situations:

  • The tick occurs below the best bid or above the best ask

    This image shows a tick that occurred above the best ask price of a security. The green line represents the best ask of the security, the blue line represents the best bid of the security, and the red dots represent trade ticks. The ticks between the best bid and ask occur from filling hidden orders. The tick that occurred above the best ask price is flagged as suspicious.

  • The tick occurs far from the current market price

    This image shows a tick that occurred far from the price of the security. The red dots represent trade ticks. The tick that occurred far from the market price is flagged as suspicious.

  • The tick occurs on a dark pool
  • The tick is rolled back
  • The tick is reported late
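If you filter ticks yourself, the situations listed above could be checked along the following lines. This is only a sketch under stated assumptions: the `TradeTick` fields, the `reference_price` argument, and the 5% `max_deviation` threshold are all hypothetical, and the thresholds actually used to flag ticks are not documented here.

```python
from dataclasses import dataclass


@dataclass
class TradeTick:
    price: float
    best_bid: float  # best bid at the time of the trade
    best_ask: float  # best ask at the time of the trade
    dark_pool: bool = False
    rolled_back: bool = False
    reported_late: bool = False


def is_suspicious(tick: TradeTick, reference_price: float,
                  max_deviation: float = 0.05) -> bool:
    """Return True if the tick matches any of the listed situations.

    max_deviation is an illustrative threshold (5%), not the real one.
    """
    outside_quote = tick.price < tick.best_bid or tick.price > tick.best_ask
    far_from_market = (
        abs(tick.price - reference_price) > max_deviation * reference_price
    )
    return (outside_quote or far_from_market or tick.dark_pool
            or tick.rolled_back or tick.reported_late)


inside_spread = TradeTick(price=100.02, best_bid=100.00, best_ask=100.05)
above_ask = TradeTick(price=100.20, best_bid=100.00, best_ask=100.05)
late_print = TradeTick(price=100.02, best_bid=100.00, best_ask=100.05,
                       reported_late=True)

flags = [is_suspicious(t, reference_price=100.03)
         for t in (inside_spread, above_ask, late_print)]
```

Here only the first tick, which trades between the best bid and ask with no other flags set, passes the filter.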

Market Auction Prices

The official opening and closing prices of the day are set by the opening and closing auction ticks on the exchange where the stock is listed. For example, Apple is listed on Nasdaq, so the opening auction tick on Nasdaq sets Apple's official opening price. NYSE, BATS, and other exchanges also have opening auctions, but the only official opening price for Apple is the one from the opening auction on its listing exchange.

We set the opening and closing prices of the first and last bars of the day to the official auction prices. This process is used for second, minute, hour, and daily bars for the 9:30 AM and 4:00 PM Eastern Time (ET) prices. In contrast, other platforms might not be using the correct opening and closing prices.

The official auction prices are usually emitted 2-30 seconds after the market open and close. We do our best to use the official opening and closing prices in the bars we build, but the delay can be so large that there isn't enough time to update the opening and closing price of the bar before it's injected into your algorithms. For example, if you subscribe to second resolution data, we wait until the end of the second for the opening price but most second resolution data won’t get the official opening price. If you subscribe to minute resolution data, we wait until the end of the minute for the opening auction price. Most of the time, you’ll get the actual opening auction price with minute resolution data, but there are always exceptions. Nasdaq and NYSE can have delays in publishing the opening auction price, but we don’t have control over those issues and we have to emit the data on time so that you get the bar you are expecting.
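The timing trade-off above can be sketched in Python. The function below is a simplified illustration, not the actual bar-building logic: `first_bar_open` and its parameters are hypothetical, and it assumes the auction price is used only if it arrives strictly before the end of the first bar.

```python
from datetime import datetime, timedelta
from typing import Optional


def first_bar_open(first_trade_price: float,
                   market_open: datetime,
                   bar_period: timedelta,
                   auction_price: Optional[float],
                   auction_time: Optional[datetime]) -> float:
    """Pick the opening price for the first bar of the day.

    Assumption: the official auction price replaces the first trade price
    only if it is published before the bar must be emitted.
    """
    bar_end = market_open + bar_period
    if (auction_price is not None and auction_time is not None
            and auction_time < bar_end):
        return auction_price
    return first_trade_price


market_open = datetime(2024, 1, 2, 9, 30, 0)
auction_time = market_open + timedelta(seconds=12)  # published 12 s after the open

second_bar_open = first_bar_open(185.10, market_open, timedelta(seconds=1),
                                 185.25, auction_time)
minute_bar_open = first_bar_open(185.10, market_open, timedelta(minutes=1),
                                 185.25, auction_time)
```

With the auction price published 12 seconds after the open, the first second bar keeps the first trade price, while the first minute bar picks up the official auction price.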

Live and Backtesting Differences

In live trading, bars are built using the exchange timestamps with microsecond accuracy. This microsecond-by-microsecond processing of the ticks can mean that the individual bars between live trading and backtesting can have slightly different ticks. As a result, it's possible for a tick to be counted in different bars between backtesting and live trading, which can lead to bars having slightly different open, high, low, close, and volume values.
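To see why a small timestamp difference can move a tick into a different bar, consider the following Python sketch. The `bar_start` helper and the two timestamps are hypothetical; they only illustrate how a trade close to a bar boundary can fall on either side of it.

```python
from datetime import datetime, timedelta


def bar_start(timestamp: datetime, period: timedelta) -> datetime:
    """Floor a timestamp to the start of the bar that contains it."""
    midnight = timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = timestamp - midnight
    return midnight + (elapsed // period) * period


minute = timedelta(minutes=1)

# Two hypothetical timestamps for the same trade, 200 microseconds apart.
timestamp_a = datetime(2024, 1, 2, 9, 30, 59, 999900)
timestamp_b = datetime(2024, 1, 2, 9, 31, 0, 100)

bar_a = bar_start(timestamp_a, minute)  # bar that starts at 9:30:00
bar_b = bar_start(timestamp_b, minute)  # bar that starts at 9:31:00
```

The trade lands in the 9:30 bar with one timestamp and in the 9:31 bar with the other, so the open, high, low, close, and volume of both bars can differ.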

Data Availability

In live trading, live data is available in real time. In backtests, live data is available at the following pre-market trading session.
