US Equity

Data Preparation

Introduction

The US Equities dataset provides price data for backtests and live trading.

Sourcing

The QuantConnect data provider consolidates US Equity market data across all of the exchanges. Over-the-Counter (OTC) trades are excluded. The data is powered by the Securities Information Processor (SIP), so it has 100% market coverage. In contrast, free platforms that display data feeds like the Better Alternative Trading System (BATS) only have about 6-7% market coverage.

We provide live splits, dividends, and corporate actions for US companies. We deliver them to your algorithm before the trading day starts.

Bar Building

We aggregate ticks to build bars.

The bar-building process can exclude ticks. If a tick is excluded, its volume still counts toward the bar's volume, but its price doesn't affect the bar's open, high, low, or close. Ticks are excluded if any of the following statements are true:

  • The tick is suspicious.
  • The tick is from the FINRA exchange and meets our price and volume thresholds.
  • The trade has none of the following included TradeConditionFlags and at least one of the following excluded TradeConditionFlags:
    | TradeConditionFlags | Status | Description |
    |---|---|---|
    | Regular / REGULAR | Included | A trade made without stated conditions is deemed the regular way for settlement on the third business day following the transaction date. |
    | FormT / FORM_T | Included | Trading in extended hours enables investors to react quickly to events that typically occur outside regular market hours, such as earnings reports. However, liquidity may be constrained during such Form T trading, resulting in wide bid-ask spreads. |
    | Cash / CASH | Included | A transaction that requires delivery of securities and payment on the same day the trade takes place. |
    | ExtendedHours / EXTENDED_HOURS | Included | Identifies a trade that was executed outside of regular primary market hours and is reported as an extended hours trade. |
    | NextDay / NEXT_DAY | Included | A transaction that requires the delivery of securities on the first business day following the trade date. |
    | OfficialClose / OFFICIAL_CLOSE | Included | Indicates the "official" closing value determined by a Market Center. This transaction report will contain the market center generated closing price. |
    | OfficialOpen / OFFICIAL_OPEN | Included | Indicates the "official" open value as determined by a Market Center. This transaction report will contain the market center generated opening price. |
    | ClosingPrints / CLOSING_PRINTS | Included | The transaction that constituted the trade-through was a single priced closing transaction by the Market Center. |
    | OpeningPrints / OPENING_PRINTS | Included | The trade that constituted the trade-through was a single priced opening transaction by the Market Center. |
    | IntermarketSweep / INTERMARKET_SWEEP | Excluded | The transaction that constituted the trade-through was the execution of an order identified as an Intermarket Sweep Order. |
    | TradeThroughExempt / TRADE_THROUGH_EXEMPT | Excluded | Denotes whether or not a trade is exempt (Rule 611). |
    | OddLot / ODD_LOT | Excluded | Denotes that the trade is an odd lot of fewer than 100 shares. |
  • The quote has a size of less than 100 shares.
  • The quote has one of the following QuoteConditionFlags:
    | QuoteConditionFlags | Description |
    |---|---|
    | Closing / CLOSING | Indicates that this quote was the last quote for a security for that Participant. |
    | NewsDissemination / NEWS_DISSEMINATION | Denotes a regulatory trading halt when relevant news influencing the security is being disseminated. Trading is suspended until the primary market determines that an adequate publication or disclosure of information has occurred. |
    | NewsPending / NEWS_PENDING | Denotes a regulatory Trading Halt due to an expected news announcement, which may influence the security. An Opening Delay or Trading Halt may be continued once the news has been disseminated. |
    | TradingRangeIndication / TRADING_RANGE_INDICATION | Denotes the probable trading range (Bid and Offer prices, no sizes) of a security that is not Opening Delayed or Trading Halted. The Trading Range Indication is used prior to or after the opening of a security. |
    | OrderImbalance / ORDER_IMBALANCE | Denotes a non-regulatory halt condition where there is a significant imbalance of buy or sell orders. |
    | Resume / RESUME | Indicates that trading for a Participant is no longer suspended in a security that had been Opening Delayed or Trading Halted. |
  • The quote has none of the following QuoteConditionFlags:
    | QuoteConditionFlags | Description |
    |---|---|
    | Regular / REGULAR | This condition is used for the majority of quotes to indicate a normal trading environment. |
    | Slow / SLOW | This condition is used to indicate that the quote is a Slow Quote on both the bid and offer sides due to a Set Slow List that includes high price securities. |
    | Gap / GAP | While in this mode, auto-execution is not eligible, the quote is then considered manual and non-firm in the bid and offer, and either or both sides can be traded through as per Regulation NMS. |
    | OpeningQuote / OPENING_QUOTE | This condition can be disseminated to indicate that this quote was the opening quote for a security for that Participant. |
    | FastTrading / FAST_TRADING | For extremely active periods of short duration. While in this mode, the UTP Participant will enter quotations on a best efforts basis. |
    | Resume / RESUME | Indicates that trading for a Participant is no longer suspended in a security that had been Opening Delayed or Trading Halted. |

In the preceding tables, Participant refers to the entities on page 19 of the Consolidated Tape System Multicast Output Binary Specification.
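
The trade-side rules above can be sketched in a few lines of Python. This is a minimal illustration, not the production bar builder: the `TradeCondition`, `Trade`, `Bar`, `is_excluded`, and `build_bar` names are hypothetical, the FINRA threshold rule is omitted because its thresholds aren't stated here, and quote ticks are not covered. The sketch shows the two behaviors described above: a trade is excluded when it is suspicious or when it has none of the included flags and at least one excluded flag, and an excluded trade still contributes its volume to the bar.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Iterable, Optional


class TradeCondition(Flag):
    """Hypothetical stand-in for the TradeConditionFlags listed above."""
    REGULAR = auto()
    FORM_T = auto()
    CASH = auto()
    EXTENDED_HOURS = auto()
    NEXT_DAY = auto()
    OFFICIAL_CLOSE = auto()
    OFFICIAL_OPEN = auto()
    CLOSING_PRINTS = auto()
    OPENING_PRINTS = auto()
    INTERMARKET_SWEEP = auto()
    TRADE_THROUGH_EXEMPT = auto()
    ODD_LOT = auto()


C = TradeCondition
INCLUDED = (C.REGULAR | C.FORM_T | C.CASH | C.EXTENDED_HOURS | C.NEXT_DAY
            | C.OFFICIAL_CLOSE | C.OFFICIAL_OPEN | C.CLOSING_PRINTS
            | C.OPENING_PRINTS)
EXCLUDED = C.INTERMARKET_SWEEP | C.TRADE_THROUGH_EXEMPT | C.ODD_LOT


@dataclass
class Trade:
    price: float
    size: int
    conditions: TradeCondition
    suspicious: bool = False


@dataclass
class Bar:
    open: Optional[float] = None
    high: Optional[float] = None
    low: Optional[float] = None
    close: Optional[float] = None
    volume: int = 0


def is_excluded(trade: Trade) -> bool:
    """Apply the suspicious-tick rule and the trade condition flag rule."""
    has_included = bool(trade.conditions & INCLUDED)
    has_excluded = bool(trade.conditions & EXCLUDED)
    return trade.suspicious or (not has_included and has_excluded)


def build_bar(trades: Iterable[Trade]) -> Bar:
    """Aggregate trades into one bar; excluded trades only add volume."""
    bar = Bar()
    for trade in trades:
        bar.volume += trade.size  # volume is always aggregated
        if is_excluded(trade):
            continue  # price of an excluded trade is ignored
        if bar.open is None:
            bar.open = bar.high = bar.low = trade.price
        bar.high = max(bar.high, trade.price)
        bar.low = min(bar.low, trade.price)
        bar.close = trade.price
    return bar


trades = [
    Trade(100.0, 200, C.REGULAR),
    Trade(105.0, 50, C.ODD_LOT),                      # excluded: odd lot only
    Trade(101.0, 100, C.REGULAR | C.INTERMARKET_SWEEP),  # kept: has an included flag
]
print(build_bar(trades))
```

In this example, the odd-lot trade at 105.0 adds 50 shares to the bar's volume but doesn't set the high, while the sweep trade that also carries the regular flag is aggregated normally.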

Suspicious Ticks

Tick price data is raw and unfiltered, so it can contain a lot of noise. If a tick is not tradable, we flag it as suspicious. This process makes the bars a more realistic representation of what you could execute in live trading. If you use tick data, avoid using suspicious ticks in your algorithms as informative data points. We recommend only using tick data if you understand the risks and are able to perform your own tick filtering. Ticks are flagged as suspicious in the following situations:

  • The tick occurs below the best bid or above the best ask
    [Image: A tick occurs above the best ask of a security]

    This image shows a tick that occurred above the best ask price of a security. The green line represents the best ask of the security, the blue line represents the best bid of the security, and the red dots represent trade ticks. The ticks between the best bid and ask occur from filling hidden orders. The tick that occurred above the best ask price is flagged as suspicious.

  • The tick occurs far from the current market price
    [Image: A tick occurs far below the market price of a security]

    This image shows a tick that occurred far from the price of the security. The red dots represent trade ticks. The tick that occurred far from the market price is flagged as suspicious.

  • The tick occurs on a dark pool
  • The tick is rolled back
  • The tick is reported late
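
If you filter ticks yourself, the situations above can be expressed as a simple predicate. The following is a rough sketch under stated assumptions: the `TradeTick` fields, the `is_suspicious` function, and especially the 10% `max_deviation` threshold are hypothetical, since this page doesn't define how far from the market price a tick must be to count as "far".

```python
from dataclasses import dataclass


@dataclass
class TradeTick:
    """Hypothetical trade tick with the quote that was live when it printed."""
    price: float
    best_bid: float
    best_ask: float
    dark_pool: bool = False
    rolled_back: bool = False
    reported_late: bool = False


def is_suspicious(tick: TradeTick, market_price: float,
                  max_deviation: float = 0.10) -> bool:
    """Return True if the tick matches any of the situations listed above.

    max_deviation is an assumed relative threshold (10% by default),
    not a documented value.
    """
    outside_quote = tick.price < tick.best_bid or tick.price > tick.best_ask
    far_from_market = abs(tick.price - market_price) > max_deviation * market_price
    return (outside_quote or far_from_market or tick.dark_pool
            or tick.rolled_back or tick.reported_late)


inside = TradeTick(price=100.05, best_bid=100.00, best_ask=100.10)
above_ask = TradeTick(price=100.20, best_bid=100.00, best_ask=100.10)
print(is_suspicious(inside, market_price=100.05))     # False
print(is_suspicious(above_ask, market_price=100.05))  # True
```

A trade between the best bid and ask, such as a hidden-order fill, passes the check, while the trade above the best ask is flagged.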

Market Auction Prices

The opening and closing prices of the day are set by specific opening and closing auction ticks on the exchange where the stock is listed. For example, Apple is listed on Nasdaq, so the opening auction tick on Nasdaq is the price that's used as the official open of the day. NYSE, BATS, and other exchanges also have opening auctions, but the only official opening price for Apple is the opening auction price on its listing exchange.

We set the opening price of the first bar of the day and the closing price of the last bar of the day to the official auction prices. We apply this process to second, minute, hour, and daily bars for the 9:30 AM and 4:00 PM Eastern Time (ET) prices. In contrast, other platforms might not use the correct opening and closing prices.

The official auction prices are usually emitted 2-30 seconds after the market open and close. We do our best to use the official opening and closing prices in the bars we build, but the delay can be so large that there isn't enough time to update the opening and closing price of the bar before it's injected into your algorithm. For example, if you subscribe to second-resolution data, we wait until the end of the second for the opening price, but most second-resolution bars won't get the official opening price. If you subscribe to minute-resolution data, we wait until the end of the minute for the opening auction price. Most of the time, you'll get the actual opening auction price with minute-resolution data, but there are exceptions. Nasdaq and NYSE can be late in publishing the opening auction price. We don't control those delays, and we have to emit the data on time so that you get the bar you are expecting.
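
The timing constraint can be illustrated with a short sketch. It assumes a hypothetical `Bar` type and `apply_opening_auction` helper, and it assumes that the auction price also widens the bar's high and low when it falls outside them, which this page doesn't specify. The point is only that the official price replaces the open if the auction print arrives before the bar has to be emitted.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Bar:
    """Hypothetical first bar of the day."""
    start: datetime
    period: timedelta
    open: float
    high: float
    low: float
    close: float


def apply_opening_auction(bar: Bar, auction_price: float,
                          auction_time: datetime) -> Bar:
    """Use the official auction price as the open if it arrived in time."""
    emit_time = bar.start + bar.period
    if auction_time > emit_time:
        return bar  # too late: the bar is emitted with its original open
    return replace(
        bar,
        open=auction_price,
        # Assumption: keep high/low consistent with the new open.
        high=max(bar.high, auction_price),
        low=min(bar.low, auction_price),
    )


market_open = datetime(2024, 1, 2, 9, 30)
auction_time = market_open + timedelta(seconds=12)  # within the 2-30 s window

second_bar = Bar(market_open, timedelta(seconds=1), 185.10, 185.12, 185.08, 185.11)
minute_bar = Bar(market_open, timedelta(minutes=1), 185.10, 185.40, 185.05, 185.30)

print(apply_opening_auction(second_bar, 185.00, auction_time).open)  # 185.1
print(apply_opening_auction(minute_bar, 185.00, auction_time).open)  # 185.0
```

With an auction print 12 seconds after the open, the one-second bar has already been emitted and keeps its original open, while the one-minute bar picks up the official price.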

Live and Backtesting Differences

In live trading, bars are built using the exchange timestamps with microsecond accuracy. Because the ticks are processed microsecond by microsecond, a tick can be counted in different bars in backtesting and live trading. As a result, the same bar can have slightly different open, high, low, close, and volume values in the two environments.
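
To see how timestamp handling can move a tick from one bar to another, consider a tick that prints just before a minute boundary. The sketch below is purely illustrative: it assumes a second timestamp for the same tick that has been rounded to the nearest millisecond, which is a made-up scenario and not a description of how backtest data is stamped. The `minute_bar_start` and `round_to_millisecond` helpers are hypothetical.

```python
from datetime import datetime, timedelta


def minute_bar_start(timestamp: datetime) -> datetime:
    """Return the start time of the minute bar that contains the timestamp."""
    return timestamp.replace(second=0, microsecond=0)


def round_to_millisecond(timestamp: datetime) -> datetime:
    """Round a timestamp to the nearest millisecond (illustrative only)."""
    rounded_micros = (timestamp.microsecond + 500) // 1000 * 1000
    return timestamp.replace(microsecond=0) + timedelta(microseconds=rounded_micros)


# A tick 400 microseconds before the 09:31:00 boundary.
exchange_timestamp = datetime(2024, 1, 2, 9, 30, 59, 999_600)
coarse_timestamp = round_to_millisecond(exchange_timestamp)

print(minute_bar_start(exchange_timestamp))  # 2024-01-02 09:30:00
print(minute_bar_start(coarse_timestamp))    # 2024-01-02 09:31:00
```

The same tick closes the 9:30 bar under one timestamp and opens the 9:31 bar under the other, which is enough to change the open, high, low, close, and volume of both bars.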

Data Availability

In live trading, the data is available in real time. In backtests, the data from live trading becomes available at the following pre-market trading session.
