History of Non-Market Data Correlations
Over the history of the stock market, quantitative observers have built up a large collection of non-market data correlations, ranging from solar flares and hurricane cycles to biota growth and New York City temperature. Figures such as Consumer Sentiment and other widely distributed survey data have been used by market participants when deciding on purchases, and by economists to predict economic metrics. Investor interest has grown sharply around Twitter sentiment data, crowdsourced estimates such as Estimize (which integrates its data with QuantConnect), and other unconventional, seemingly unrelated data. All of these are being used to build stock trading algorithms based on correlations. This post gives an overview of the different types of non-market data used to predict security prices.
Calendar/Time Data’s Influence on Trading
For hundreds of years, speculators have blamed or praised seasons, months, days and hours for seemingly superstitious patterns in the distribution of market returns. A well-known and often-repeated Wall Street saying is “Sell in May and Go Away”. Surprisingly, the strategy would have performed well in the US since 1950, as the chart below indicates.[1] Other models are based on avoiding the recurring October declines, or “October Surprise”, and on tax-loss selling, which takes place at the end of the tax year (the last week of December).[1]
Statistically Proving the “Sell in May and Go Away” Strategy
We were able to verify the “Sell in May and Go Away” strategy with a backtest in QuantConnect covering 1998-2012. Here you can see the equity curve, which compares a flat return profile with the returns for the “Sell in May and Go Away” strategy:
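The seasonal rule itself is simple enough to sketch in a few lines. The snippet below is a minimal illustration in plain Python, not the QuantConnect backtest described above. It assumes the common reading of the rule (out of the market from May through October, invested from November through April), and the function name and the sample returns are made up purely to exercise the logic.

```python
from datetime import date


def seasonal_vs_buy_and_hold(daily_returns):
    """Compound simple returns for a seasonal rule and for buy-and-hold.

    daily_returns: iterable of (date, simple_return) pairs.
    Returns (seasonal_growth, buy_and_hold_growth), the growth of one
    unit of starting capital under each approach.
    """
    seasonal = 1.0
    buy_and_hold = 1.0
    for day, ret in daily_returns:
        buy_and_hold *= 1.0 + ret
        # Assumed rule: hold cash from May through October,
        # stay invested from November through April.
        if day.month < 5 or day.month > 10:
            seasonal *= 1.0 + ret
    return seasonal, buy_and_hold


# Synthetic returns for illustration only; these are not market data.
sample = [
    (date(2010, 1, 4), 0.01),
    (date(2010, 6, 1), -0.02),
    (date(2010, 11, 1), 0.03),
]
seasonal, buy_and_hold = seasonal_vs_buy_and_hold(sample)
```

A real test would feed in a long history of index returns and account for transaction costs and interest earned on cash, both of which this sketch ignores.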
Weather and Temperature Based Trading
Trading is largely emotional. Despite the efforts of small-time traders, it is nearly impossible to remove emotion from trading. When it comes to weather and other environmental factors, researchers have reason to believe, based on documented correlations, that an association exists: mood and behavior have been shown to change with environmental variables such as temperature, daylight and pollution.
The first correlation in the field was found in 1993 and showed that, from 1927 to 1989, the level of cloud cover outside a stock exchange was correlated with returns for major stock exchanges.[2] On days with less cloud cover, when market participants were presumably in a better mood, returns were higher. On cloudy days, when traders were presumably more depressed, returns were lower. More recently, the study has been replicated for 26 international markets, and the correlation holds.[2] Cloud cover is correlated with market returns across all of these markets, a remarkable finding.
Other research has found that the temperature outside a stock exchange is negatively correlated with returns: the lower the temperature, the higher the return, and vice versa. The researchers hypothesize that lower temperatures are associated with higher stock returns because of aggressive risk taking, which is psychologically linked with lower temperatures. Higher temperatures are associated with lower stock returns because both aggression (associated with risk taking) and apathy (associated with risk aversion) are possible consequences of higher temperatures.[2]
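To make "negatively correlated" concrete, here is a minimal sketch of a Pearson correlation computed in plain Python. The temperature and return figures are synthetic values invented for illustration, not data from the cited research, and the function name is our own.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Synthetic example: returns fall as the temperature rises.
temperatures_c = [2.0, 8.0, 15.0, 22.0, 30.0]
daily_returns = [0.004, 0.003, 0.001, -0.001, -0.002]
r = pearson_correlation(temperatures_c, daily_returns)  # negative for this data
```

A coefficient near -1 means the two series move in opposite directions almost in lockstep, while a value near 0 means there is no linear relationship. A correlation on its own says nothing about cause.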
The underlying security discussed here is the CME Hurricane Index (CHI). The CHI is driven by two factors: maximum sustained wind speed (V, in mph) and the radius of hurricane-force winds (R). This means the price of the security is tied to intrinsic properties of a hurricane. The preceding paragraphs covered correlations; this relationship is different, because the index is directly linked to the variables. The formula to compute CHI at a certain point in time is:
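As a minimal sketch, assuming the commonly published form of the index, CHI = (V / V0)^3 + 1.5 * (R / R0) * (V / V0)^2, with reference values V0 = 74 mph and R0 = 60 miles. The reference constants and the function name below are not stated in this post, so check them against the CME's contract specifications before relying on them.

```python
# Assumed reference values from the commonly published CHI definition;
# they do not appear in this post.
V0_MPH = 74.0    # reference wind speed (hurricane-force threshold)
R0_MILES = 60.0  # reference radius of hurricane-force winds


def hurricane_index(max_wind_mph, radius_miles):
    """Compute CHI = (V/V0)^3 + 1.5 * (R/R0) * (V/V0)^2.

    max_wind_mph: maximum sustained wind speed V, in mph.
    radius_miles: radius of hurricane-force winds R, in miles.
    """
    if max_wind_mph < 0 or radius_miles < 0:
        raise ValueError("wind speed and radius must be non-negative")
    wind_ratio = max_wind_mph / V0_MPH
    radius_ratio = radius_miles / R0_MILES
    return wind_ratio ** 3 + 1.5 * radius_ratio * wind_ratio ** 2


# A storm exactly at the reference values gives 1 + 1.5 = 2.5.
baseline = hurricane_index(74.0, 60.0)
```

Because wind speed enters as a cube and a square, doubling V at a fixed radius raises the index far more than doubling R does.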
The Chicago Mercantile Exchange offers four main hurricane index products: Hurricane futures and options, Hurricane Seasonal futures and options, Hurricane Seasonal Maximum futures and options, and Hurricane Index Binary Options. For more about these indexes, our friend Dvega wrote a great piece here.
Micro-Managing Earnings Estimates
In August 2010, news broke that UBS had been feeding satellite imagery into an algorithm to count the vehicles at Wal-Mart locations.[3] Based on this large-scale algorithmic process, UBS conjectured a year-over-year gain in same-store sales for Wal-Mart in the next quarter, as opposed to the decline that was expected. The hypothesis turned out to be correct, and algorithmic car counts have brought a new level of micro-managing of earnings estimates by large institutions.
Non-Market Data Correlations
In modern markets, with the fast advancement of algorithmic trading strategies that incorporate all types of data, finding new correlations is both important and lucrative. Nearly all forms of data can be used to build models of markets. This makes the task of building a model either much more difficult, as one has to look at huge fields of data, or much easier, as correlations can be found in nearly any data set. For us at QuantConnect, the democratization of algorithmic model creation is leading to exciting new correlations being found, iterated on and traded on. We’re looking forward to the backtesting, iteration and future correlations found on the QuantConnect platform!