Opening-Closing Auction Prices / 20TB, 200M files, and 640 Cores.


Hello Community,

We have recently finished reprocessing all US equity trade and quote data since 2007. The job worked through approximately 20TB of raw top-of-book tick data and converted it into more than 200M files across the 5 data resolutions we support! We ran the processing on the 640-CPU server cluster in QuantConnect's office.

This data is now available for your backtesting and live trading. 

We reprocessed the data to precisely reflect the opening and closing auction prices from each security's primary exchange. Every security has a primary exchange that sets the official prices used for Market On Open (MOO) and Market On Close (MOC) orders. Once we determined the correct auction price, we applied it to all resolutions. Previously, the opening price was set by the first tick in market hours. Although that was the most accurate choice for intraday strategies, it caused small (>0.1%) differences from other platforms for daily strategies and MOO/MOC orders.
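To make the before-and-after concrete, here is a minimal sketch of the idea, not our actual pipeline. The `Tick` class, its `auction` flag, the exchange names, and the prices are all hypothetical, and the 09:30 to 16:00 window is assumed as regular market hours. It prefers the primary exchange's auction print and falls back to the old first-tick/last-tick behaviour when no print is found.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tick:
    time: str                      # "HH:MM:SS", exchange-local
    price: float
    exchange: str
    auction: Optional[str] = None  # hypothetical flag: "open", "close", or None


def daily_open_close(ticks: List[Tick], primary_exchange: str) -> Tuple[float, float]:
    """Return (open, close), preferring the primary exchange's auction prints."""
    ticks = sorted(ticks, key=lambda t: t.time)
    # Assumed regular session for US equities.
    in_hours = [t for t in ticks if "09:30:00" <= t.time <= "16:00:00"]
    if not in_hours:
        raise ValueError("no ticks inside regular market hours")

    def auction_price(kind: str, fallback: float) -> float:
        # Use the first auction print of this kind from the primary exchange.
        for t in ticks:
            if t.auction == kind and t.exchange == primary_exchange:
                return t.price
        return fallback

    # Fallback is the old behaviour: first and last tick inside market hours.
    return (auction_price("open", in_hours[0].price),
            auction_price("close", in_hours[-1].price))


# Illustrative ticks only; these are not real prices.
ticks = [
    Tick("09:30:00", 100.00, "ARCA"),
    Tick("09:30:01", 100.10, "NYSE", auction="open"),
    Tick("12:00:00", 100.50, "ARCA"),
    Tick("15:59:59", 100.90, "ARCA"),
    Tick("16:00:00", 101.00, "NYSE", auction="close"),
]
print(daily_open_close(ticks, "NYSE"))
```

In this toy data the first in-hours tick (100.00) comes from a different venue than the primary exchange's opening print (100.10), which is the kind of small discrepancy described above.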

Depending on your strategy style, you may notice a difference in backtesting performance. Besides affecting trade prices, the daily close prices also set the asset prices used in coarse universe selection.

We hope this improves your backtesting, as your signals should now line up more closely with what you see on external platforms. Let us know how it goes and, as always, if you find data issues please report them with the data explorer so we can systematically fix them!

Best
JayJayD.

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.



 
0



Great piece of news, especially for intra-day research, JayJayD!

1

That's great news!

I have a few additional questions for clarification, if you don't mind:

  • Is this data already available in both the research environment and the backtesting engine?
  • In both daily bars and lower resolutions?
  • Does the extendedMarketHours parameter affect whether the auctions are included (e.g. are the auctions still included with extendedMarketHours=True)?
0

I quickly compared the daily bars from research with Yahoo Finance data and I still see some differences in the opening auctions.

It's definitely possible that the QuantConnect data is correct, since Yahoo Finance isn't a source of truth, but it's worth checking.

QuantConnect data from research for a few stocks:

time        symbol               open    close
2020-09-09  AAPL R735QTJ8XC9X  114.25   112.83
2020-09-09  MSFT R735QTJ8XC9X  206.51   202.66
2020-09-09  WORK X5HR3Q1EPJDX   28.18    29.32
2020-09-09  ZM X3RPXTZRW09X    350.00   350.88
2020-09-10  AAPL R735QTJ8XC9X  117.30   117.32
2020-09-10  MSFT R735QTJ8XC9X  207.87   211.29
2020-09-10  WORK X5HR3Q1EPJDX   24.70    25.24
2020-09-10  ZM X3RPXTZRW09X    369.45   389.65

 

Take, for example, the Yahoo Finance data for "WORK" on 2020-09-08:

(I'm comparing 2020-09-08 in Yahoo with 2020-09-09 in QuantConnect, because QuantConnect bars are labeled at the "right" end of the interval to avoid lookahead bias.)
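In case it helps anyone repeat the comparison, this is roughly how I line up the two date conventions. It is only a sketch: `session_date` is my own hypothetical helper, and it assumes a daily bar's label is exactly one calendar day after the session it covers. The prices are the WORK values quoted in this post.

```python
from datetime import date, timedelta


def session_date(qc_label: date) -> date:
    """Map an end-of-bar label to the session it covers (assumed: label minus 1 day)."""
    return qc_label - timedelta(days=1)


# WORK values quoted in this post, stored as (open, close).
quantconnect = {date(2020, 9, 9): (28.18, 29.32)}
yahoo = {date(2020, 9, 8): (28.29, 29.32)}

# Re-key the QuantConnect bars by session date so both sources share an index.
aligned = {session_date(label): bar for label, bar in quantconnect.items()}

for day, (qc_open, qc_close) in aligned.items():
    y_open, y_close = yahoo[day]
    print(day, "open:", qc_open, "vs", y_open, "| close:", qc_close, "vs", y_close)
```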

https://finance.yahoo.com/quote/WORK/history?p=WORK

I see the same close value (29.32) but a different open (28.18 on QuantConnect versus 28.29 on Yahoo).
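For scale, the gap between the two opens can be expressed as a percentage of the Yahoo value. This is plain arithmetic on the numbers above; `relative_difference_pct` is just a hypothetical helper name.

```python
def relative_difference_pct(value: float, reference: float) -> float:
    """Absolute difference between value and reference, as a percentage of reference."""
    return abs(value - reference) / reference * 100.0


open_diff = relative_difference_pct(28.18, 28.29)
close_diff = relative_difference_pct(29.32, 29.32)
print(f"open differs by {open_diff:.2f}%, close by {close_diff:.2f}%")
```

The open differs by roughly 0.39%, while the close matches exactly.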

Am I comparing these correctly? Or is the Yahoo data incorrect? Or are the bars that include the opening/closing auctions not yet available in the research environment?

Thanks!

 

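# Code for querying the research data:
qb = QuantBook()

# Subscribe to each ticker so it shows up in qb.Securities
syms = ["AAPL", "MSFT", "WORK", "ZM"]
for symn in syms:
    qb.AddEquity(symn)

# Request the last 100 daily bars for every subscribed security
history = qb.History(qb.Securities.Keys, 100, Resolution.Daily)

# Pivot the symbols into columns, then show open/close for the dates of interest
df = history.unstack(level="symbol")
df[['open', 'close']].loc["2020-09-06":"2020-09-10"]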

0

I'm curious, do you guys have more data-related projects in your pipeline? I fully understand the value that this data cleanup provides; the ultimate goal is to model reality as closely as possible. It is concerning, though, that a strategy may seem great and get through paper trials just fine, but then an overhaul of the data reveals it isn't as viable as previously thought. It would be nice to know which characteristics of the dataset the team is working to improve so we can keep that in mind when analyzing backtest results.

0

Hi Stephen, we have a full-time data team so it is a constant process. Our project for the next 1-2 months is to overlay splits and dividends datasets from 3-4 data vendors so we can create a more accurate "composite" splits-dividends dataset. Errors in splits and dividends produce different adjusted prices.

The next phase of making the platform more "accurate" is installing L2 data, but that is pretty heavy so I don't know if it'll be worthwhile for many people's backtests. We can also approximate L2 with fill models that adjust the fill price depending on the volume of the asset.

I don't know if there's anything you can do to "factor it in". There are no errors in the data "pending patches". This change is a style decision. Rather than continuing to deal with users comparing us to Yahoo, and explaining the intraday open-close over and over, we've opted to use the opening and closing auction prices, which will match 99.9% of the time.
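As a rough illustration of what a volume-dependent fill adjustment could look like (a sketch only, not the platform's fill model API): the function name, the linear impact formula, and the `impact` coefficient below are all assumptions made for the example.

```python
def volume_adjusted_fill(price: float, order_qty: int, bar_volume: int,
                         impact: float = 0.1) -> float:
    """Shift the fill price against the order in proportion to its share of bar volume.

    Hypothetical linear model: a buy that takes 1% of the bar's volume with
    impact=0.1 fills 0.1% above the reference price; sells fill below it.
    """
    if bar_volume <= 0:
        raise ValueError("bar_volume must be positive")
    # Share of the bar's volume consumed by this order, capped at 100%.
    participation = min(abs(order_qty) / bar_volume, 1.0)
    direction = 1 if order_qty > 0 else -1
    return price * (1 + direction * impact * participation)


# Buying 1,000 shares of an asset that traded 100,000 shares in the bar.
print(round(volume_adjusted_fill(100.0, 1_000, 100_000), 4))
# The same order in a thinner bar of 5,000 shares moves the price further.
print(round(volume_adjusted_fill(100.0, 1_000, 5_000), 4))
```

The point is only that thinly traded assets get penalised more than liquid ones for the same order size, without needing the full order book.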

I think you should always carefully review outlier trades in your transaction history for what might cause the jump or fall in equity over the trade. Gaps are either an overnight move or a data error.  It's tempting to ignore favorable gaps... but ignorance is not bliss in quant trading! =)
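One simple way to do that review is to scan the daily bars for large open-versus-prior-close moves and check each one by hand. The sketch below is illustrative: the 5% threshold, the `overnight_gaps` name, and the bar values are all made up for the example.

```python
from typing import List, Tuple

Bar = Tuple[str, float, float]  # (date, open, close)


def overnight_gaps(bars: List[Bar], threshold: float = 0.05) -> List[Tuple[str, float]]:
    """Return (date, gap) for bars whose open is far from the previous close."""
    flagged = []
    for (_, _, prev_close), (day, open_, _) in zip(bars, bars[1:]):
        gap = open_ / prev_close - 1.0
        if abs(gap) >= threshold:
            flagged.append((day, gap))
    return flagged


# Illustrative bars only; these are not real prices.
bars = [
    ("2021-01-04", 50.00, 50.50),
    ("2021-01-05", 50.60, 51.00),
    ("2021-01-06", 56.10, 56.00),  # opens 10% above the prior close
    ("2021-01-07", 55.90, 55.50),
]
for day, gap in overnight_gaps(bars):
    print(f"{day}: {gap:+.1%} gap versus the prior close")
```

Each flagged date is then either a genuine overnight move (news, earnings) or something to report through the data explorer.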

 

2



Hi Jared!

Overlaying the datasets of 3-4 data vendors will be a great improvement in the quality of the splits and dividends data, and that should also help reduce the small (but cumulative) discrepancies between the OOS and the live returns of Alpha Streams.

As for the L2 data, that would (will?) be very worthwhile for being as professional as possible!

However, may I respectfully ask whether this huge addition of data might be less needed than other data, for instance:
- the futures that are not currently available intraday (VIX futures are still not available in minute resolution)
- maybe the new micro futures on indices (more and more liquidity on these new contracts)
- and even maybe the options on futures later? (options on micro futures on S&P500 and Nasdaq100 are already rather liquid after only a few weeks!)
- ...

These datasets are tiny compared to the huge L2 quotes, and could get the community more interested in working on futures and, maybe, proposing Alpha Streams that trade futures (and options on futures?)

2

Jared Broad 

Regarding the "style" change of including auction prices (and MOC/MOO orders) in the QuantConnect data:

I believe this is a great decision, so thanks for that. The closing auction in particular is probably the "most important" price of the day, because something like 10%+ of total daily volume often seems to trade at this single price (and the share is growing):

https://www.fixglobal.com/home/global-volumes-shifting-toward-the-close/

 

0

@Oldrich S - Please use the data explorer to report specific data issues.

@Laurent - We're adding other data sources as sponsored integrations from commercial entities. I was more replying about changes in the current data. Future Options are one of these sponsored integrations. 

2



"Future Options are one of these sponsored integrations."
>> That is a great piece of news, really!

2

Hi CAPOCAPITAL,

L1 data is available for US equities. QuantConnect does not currently support L2 data.

Best,
Derek Melchin

0




This discussion is closed