We're pleased to share four major improvements to fundamental data research on QuantConnect. The first is, as promised, a 10x speed-up in fundamental data backtesting!

Our community often searches for value stocks using fundamental data to filter and rank assets by their investment potential. We created a test case by loading fundamental data for a universe of 5,000 equities into a strategy, filtering it using the Piotroski scoring technique, and investing in the selected assets. QuantConnect is still the only platform in the world to do event-driven, point-in-time, spread- and slippage-adjusted backtesting for the most accurate results possible.
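
For readers unfamiliar with the Piotroski score, here is a minimal sketch of its nine pass/fail signals computed from two consecutive fiscal years. The `Financials` class and its field names are hypothetical, not the QuantConnect data model, and the ratios use year-end total assets for simplicity, so treat it as an illustration rather than the benchmark's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Hypothetical container for one fiscal year of statement data."""
    net_income: float
    operating_cash_flow: float
    total_assets: float
    long_term_debt: float
    current_assets: float
    current_liabilities: float
    shares_outstanding: float
    gross_profit: float
    revenue: float


def piotroski_f_score(cur: Financials, prev: Financials) -> int:
    """Return the 0-9 score: one point per signal that passes."""
    roa = cur.net_income / cur.total_assets
    prev_roa = prev.net_income / prev.total_assets
    signals = [
        # Profitability
        roa > 0,
        cur.operating_cash_flow > 0,
        roa > prev_roa,
        cur.operating_cash_flow > cur.net_income,  # cash flow backs up earnings
        # Leverage, liquidity, and dilution
        cur.long_term_debt / cur.total_assets
        < prev.long_term_debt / prev.total_assets,
        cur.current_assets / cur.current_liabilities
        > prev.current_assets / prev.current_liabilities,
        cur.shares_outstanding <= prev.shares_outstanding,
        # Operating efficiency
        cur.gross_profit / cur.revenue > prev.gross_profit / prev.revenue,
        cur.revenue / cur.total_assets > prev.revenue / prev.total_assets,
    ]
    return sum(signals)


# Made-up numbers: the later year improves on every signal.
earlier = Financials(50, 60, 1000, 300, 200, 150, 100, 300, 900)
later = Financials(80, 100, 1000, 250, 240, 150, 100, 350, 1000)

improving_score = piotroski_f_score(later, earlier)
declining_score = piotroski_f_score(earlier, later)  # same data, order reversed
```

A strategy would typically keep only the assets whose score clears some threshold and rank the rest out.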

We applied several engineering leaps to win this 10x improvement. First, one of our core bottlenecks was loading the data. We removed this bottleneck by swapping from ZIP-JSON files to the Parquet file format, a massive leap in the efficiency of loading the data.

Next, once the data was ingested at the maximum possible speed, we reduced the data we were loading to only the requested properties. The intelligent requests reduced the fundamental data loaded from the 4,900 fields we host to only the 50 required for the Piotroski benchmark. This optimization allows strategies to scale according to client usage and may deliver even larger speed-ups for users with simpler backtests. Finally, by reviewing the types used by each data property, we doubled the effective speed again by carefully choosing which types were ints, floats, and decimals.
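
To make the field reduction concrete, here is a toy sketch of why a column-oriented layout lets a loader skip unrequested properties. The in-memory `ColumnStore` and the `field_N` names are hypothetical stand-ins; this is not how the LEAN engine reads its files, only an illustration of the idea with the 4,900 and 50 field counts quoted above.

```python
from typing import Dict, List, Sequence

# Hypothetical column-oriented store: one list of values per field name.
ColumnStore = Dict[str, List[float]]


def build_store(n_fields: int, n_assets: int) -> ColumnStore:
    """Create a toy store with n_fields columns of n_assets values each."""
    return {f"field_{i}": [float(i)] * n_assets for i in range(n_fields)}


def load_columns(store: ColumnStore, requested: Sequence[str]) -> ColumnStore:
    """Return only the requested columns; the others are never touched."""
    return {name: store[name] for name in requested}


store = build_store(n_fields=4900, n_assets=10)
requested = [f"field_{i}" for i in range(50)]
subset = load_columns(store, requested)

values_total = sum(len(column) for column in store.values())
values_loaded = sum(len(column) for column in subset.values())
reduction = values_total / values_loaded  # how much less data is handled
```

With these counts the loader handles 98 times fewer values, independent of how many assets are in the universe.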

These updates resulted in our speed metric, data points per second (DPS), leaping from 4K to 76K and our benchmark test dropping from 40 minutes to 4 minutes 20 seconds, up to 75% faster than Zipline. This leap in efficiency allows our community of 250,000 quants to explore new ideas that were previously inaccessible due to the computation requirements.
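
As a quick check on the benchmark runtimes quoted above, the arithmetic can be written out as:

```python
# Benchmark runtimes from the post, converted to seconds.
before_seconds = 40 * 60       # 40 minutes
after_seconds = 4 * 60 + 20    # 4 minutes 20 seconds

runtime_speedup = before_seconds / after_seconds
time_saved_fraction = 1 - after_seconds / before_seconds

print(f"Speed-up: {runtime_speedup:.1f}x")
print(f"Run time reduced by {time_saved_fraction:.0%}")
```

On wall-clock time alone, the benchmark finishes a little over nine times faster.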

Universes are now a single-pass filter, with all the potential data passed to a single event handler, allowing you to simplify your algorithm logic. Previously, universes were separated into coarse and fine stages due to the heavy nature of the data loaded. With a single-pass filter, you can combine all filtering into a single method, using factors like dollar volume alongside the fundamental properties.
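
The shape of a single-pass filter can be sketched in plain Python. The `FundamentalRow` class, its fields, and `select_universe` are hypothetical stand-ins rather than the LEAN API; the point is only that the liquidity filter and the fundamental ranking live in one function instead of two stages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FundamentalRow:
    """Hypothetical stand-in for one asset's daily universe data."""
    symbol: str
    dollar_volume: float
    pe_ratio: Optional[float]  # None when no fundamental data is available


def select_universe(
    rows: List[FundamentalRow], min_dollar_volume: float, top_n: int
) -> List[str]:
    """Apply liquidity and valuation filters in one pass, then rank."""
    candidates = [
        row for row in rows
        if row.pe_ratio is not None
        and row.pe_ratio > 0
        and row.dollar_volume >= min_dollar_volume
    ]
    # Rank the survivors from lowest to highest price-to-earnings ratio.
    candidates.sort(key=lambda row: row.pe_ratio)
    return [row.symbol for row in candidates[:top_n]]


rows = [
    FundamentalRow("AAA", 5e7, 12.0),
    FundamentalRow("BBB", 2e5, 6.0),    # fails the liquidity filter
    FundamentalRow("CCC", 9e7, None),   # no fundamental data
    FundamentalRow("DDD", 3e7, 8.5),
    FundamentalRow("EEE", 4e7, 25.0),
]
selected = select_universe(rows, min_dollar_volume=1e7, top_n=2)
```

Under the old coarse and fine split, the dollar volume check and the valuation ranking would have lived in separate handlers.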

[Video: 10x Fundamental]

#2. In addition to the massive speed boost, all fundamental data is now accessible on security objects at any point, or with just a Symbol object. This enables you to combine a fundamental filter with any other universe selection type. For example, you can now combine our massive library of ETF constituent universes with a Morningstar fundamental data filter with just one line of code.

#3. Fundamental data can now also be accessed with the history API, allowing you to quickly review long-term fundamental trends. This enables faster warm-up for your fundamental strategies and lets you access the data without adding a security to your algorithm universe.

#4. Finally, we've reviewed and loaded in fundamental data for all historically delisted assets, ensuring your universe selections avoid survivorship bias. This increased the coverage of fundamental data from 6,100 to 9,100 assets. Before deploying, we performed detailed spot and statistical testing to understand every difference in the data.
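
To illustrate why delisted assets matter, here is a small sketch of a point-in-time universe lookup. The `Listing` class, the symbols, and the dates are made up for the example and do not reflect QuantConnect's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Listing:
    """Hypothetical listing record with an optional delisting date."""
    symbol: str
    listed: date
    delisted: Optional[date]  # None while the asset is still trading


def universe_on(listings: List[Listing], day: date) -> List[str]:
    """Return the symbols that were actually tradable on the given day."""
    return sorted(
        item.symbol for item in listings
        if item.listed <= day and (item.delisted is None or day < item.delisted)
    )


listings = [
    Listing("LIVE", date(2005, 3, 1), None),
    Listing("GONE", date(2001, 6, 1), date(2012, 6, 1)),
    Listing("NEW", date(2015, 9, 1), None),
]

# Point-in-time view of 2010 includes the asset that was later delisted.
point_in_time_2010 = universe_on(listings, date(2010, 1, 4))

# A dataset holding only today's survivors would silently drop it.
survivors = [item for item in listings if item.delisted is None]
survivors_only_2010 = universe_on(survivors, date(2010, 1, 4))
```

A backtest built on the survivors-only view never sees the companies that failed, which tends to flatter the results.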

As always, this work is backward compatible and deployed to research, backtesting, optimization, and live trading. There should be no changes required to your code. The old data format for universes will be maintained until January 2024, so if you have live fundamental strategies, please stop and redeploy them to take advantage of this new technology.

We're excited to see how the community can benefit from these leaps in engineering. If you have further feedback or ideas to improve it, let us know in the comments!

Team @ QuantConnect