From Research To Production: Kalman Filters and Pairs Trading

Hey Everyone,

In this installment, I'm going to walk you through how to apply Kalman filters in your algorithms. Before we start, I want to note that there are a few Python packages out there for Kalman filters, but here we adapt the example and the Kalman filter class code from this article and show how you can implement similar ideas using QuantConnect!

Briefly, a Kalman filter is a state-space model applicable to linear dynamic systems -- systems whose state is time-dependent and whose state variations are represented linearly. The model is used to estimate unknown states of a variable based on a series of past values. The procedure has two steps: first, the filter predicts the current state of a variable along with the uncertainty of that prediction; then, when new data becomes available, both estimates are updated. There is a lot of information available about Kalman filters, and the variety of their applications is pretty astounding, but for now, we're going to use a Kalman filter to estimate the hedge ratio between a pair of equities.
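
To make the predict/update cycle concrete, here is a minimal one-dimensional sketch. It is not the filter used later in this post: it assumes a single slowly drifting level observed with noise, and the function name, the noise variances, and the sample readings are all illustrative assumptions.

```python
def kalman_1d(observations, process_var=1e-4, obs_var=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly drifting level observed with noise.

    All parameter values are illustrative assumptions.
    """
    x, p = x0, p0  # state estimate and the variance of that estimate
    history = []
    for z in observations:
        # Predict: a random-walk state keeps its mean, but uncertainty grows
        p = p + process_var
        # Update: blend the prediction and the observation via the Kalman gain
        gain = p / (p + obs_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        history.append((x, p))
    return history


noisy_readings = [10.3, 9.8, 10.1, 10.4, 9.7, 10.0, 10.2, 9.9]
history = kalman_1d(noisy_readings)
final_estimate, final_variance = history[-1]
```

Each pass through the loop only needs the previous estimate and its variance, and the variance shrinks as more readings arrive.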

The idea behind the strategy is pretty straightforward: take two equities that are cointegrated and create a long-short portfolio. The premise of this is that the spread between the value of our two positions should be mean-reverting. Anytime the spread deviates from its expected value, one of the assets moved in an unexpected direction and is due to revert back. When the spread diverges, you can take advantage of this by going long or short on the spread.

To illustrate, imagine you have a long position in "AAPL" worth $2000 and a short position in "IBM" worth $2000. This gives you a net spread of $0. Since you expect AAPL and IBM to move together, if the spread increases significantly above $0, you would short the spread in the expectation that it will return to $0, its natural equilibrium. Similarly, if the value drops significantly below $0, you would long the spread and capture the profits as its value returns to $0. In our application, the Kalman filter will be used to track the hedge ratio between our equities to ensure that the portfolio value is stationary, which means it will continue to exhibit mean-reverting behavior.
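
The arithmetic in this example can be sketched in a few lines. The share counts and prices below are hypothetical, chosen only so that each leg starts at $2000; `spread_value` is an illustrative helper, not part of the strategy code.

```python
def spread_value(long_shares, long_price, short_shares, short_price):
    """Net value of a long-short pair: long leg minus short leg."""
    return long_shares * long_price - short_shares * short_price


# Hypothetical prices: 20 AAPL shares at $100 and 16 IBM shares at $125
aapl_shares, ibm_shares = 20, 16
initial = spread_value(aapl_shares, 100, ibm_shares, 125)   # both legs at $2000
widened = spread_value(aapl_shares, 105, ibm_shares, 125)   # AAPL rallies alone
narrowed = spread_value(aapl_shares, 100, ibm_shares, 130)  # IBM rallies alone
```

A positive value is the case where you would short the spread, and a negative value is the case where you would go long.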

First, we used a research notebook to test out the Kalman filter class from the article linked above.

import numpy as np
from math import floor

class KalmanFilter:
    def __init__(self):
        self.delta = 1e-4
        self.wt = self.delta / (1 - self.delta) * np.eye(2)
        self.vt = 1e-3
        self.theta = np.zeros(2)
        self.P = np.zeros((2, 2))
        self.R = None
        self.qty = 2000

    def update(self, price_one, price_two):
        # Create the observation matrix from the latest price
        # of the first asset and the intercept value (1.0)
        F = np.asarray([price_one, 1.0]).reshape((1, 2))
        y = price_two

        # The prior value of the states \theta_t is
        # distributed as a multivariate Gaussian with
        # mean a_t and variance-covariance R_t
        if self.R is not None:
            self.R = self.C + self.wt
        else:
            self.R = np.zeros((2, 2))

        # Calculate the Kalman Filter update
        # ----------------------------------
        # Calculate prediction of new observation
        # as well as forecast error of that prediction
        yhat = F.dot(self.theta)
        et = y - yhat

        # Q_t is the variance of the prediction of
        # observations and hence \sqrt{Q_t} is the
        # standard deviation of the predictions
        Qt = F.dot(self.R).dot(F.T) + self.vt
        sqrt_Qt = np.sqrt(Qt)

        # The posterior value of the states \theta_t is
        # distributed as a multivariate Gaussian with mean
        # m_t and variance-covariance C_t
        At = self.R.dot(F.T) / Qt
        self.theta = self.theta + At.flatten() * et
        self.C = self.R - At * F.dot(self.R)
        hedge_quantity = int(floor(self.qty*self.theta[0]))
        
        return et, sqrt_Qt, hedge_quantity

import numpy as np
from math import floor
import matplotlib.pyplot as plt
from KalmanFilter import KalmanFilter

qb = QuantBook()
symbols = [qb.AddEquity(x).Symbol for x in ['VIA', 'VIAB']]

# Initialize Kalman Filter imported from another file
kf = KalmanFilter()
# Fetch history
history = qb.History(qb.Securities.Keys, 10, Resolution.Daily)
# Get close prices
prices = history.unstack(level=1).close.transpose()
# Iterate over prices, update filter, and print results
for index, row in prices.iterrows():
    via = row.loc['VIA 2T']
    viab = row.loc['VIAB 2T']
    forecast_error, prediction_std_dev, hedge_quantity = kf.update(via, viab)
    print(f'{forecast_error} :: {prediction_std_dev} :: {hedge_quantity}')

The code above let us test the filter in the research environment, and now we can put it into practice. To do this, we built a simple pairs trading model that uses VIA and VIAB. We didn't test these two equities for cointegration; instead, we assumed that they will move together, as they are different share classes of Viacom and should, theoretically, move in the same direction with the same magnitude.

    def Initialize(self):
        self.SetStartDate(2016, 1, 1)  # Set Start Date
        self.SetCash(100000)  # Set Strategy Cash
        self.SetBrokerageModel(AlphaStreamsBrokerageModel())

        self.symbols = [self.AddEquity(x, Resolution.Minute).Symbol for x in ['VIA', 'VIAB']]
        self.kf = KalmanFilter()
        self.invested = None
    
        self.Schedule.On(self.DateRules.EveryDay('VIA'), self.TimeRules.BeforeMarketClose('VIA', 5), self.UpdateAndTrade)

We initialized an instance of the KalmanFilter class in Initialize(), which we then use in the UpdateAndTrade() method to size our positions -- this maintains a proper ratio between our long and short positions so that the spread remains stationary and mean-reverting. In the UpdateAndTrade() method, we update the Kalman filter with the new price data and get the forecast error, the prediction standard deviation, and the hedge quantity. The trade signals are as follows: we long the spread if the forecast error is less than the negative of the prediction standard deviation, and we exit this position once the forecast error is greater than or equal to that level. We short the spread (take the opposite positions in both equities) when the forecast error is greater than the standard deviation, and we close this position once the forecast error is less than or equal to the standard deviation.

This may seem a bit complicated or abstract, but the essential point is this: the forecast error is the difference between the current price of VIAB and the Kalman filter's estimate of that price given the current price of VIA. When the forecast error falls below the negative of the prediction standard deviation, the Kalman filter is saying that the current spread is lower than expected and so it is due to move back toward its expected value. Similarly, when the forecast error is larger than the standard deviation of the predictions, the spread is higher than expected and should drop back toward its expected value.

    def UpdateAndTrade(self):
        
        # Get recent price and holdings information
        via = self.CurrentSlice[self.symbols[0]].Close
        viab = self.CurrentSlice[self.symbols[1]].Close
        holdings = self.Portfolio[self.symbols[0]]
        
        forecast_error, prediction_std_dev, hedge_quantity = self.kf.update(via, viab)
        
        if not holdings.Invested:
            # Long the spread
            if forecast_error < -prediction_std_dev:
                insights = Insight.Group([Insight(self.symbols[0], timedelta(1), InsightType.Price, InsightDirection.Down),
                                           Insight(self.symbols[1], timedelta(1), InsightType.Price, InsightDirection.Up)])
                self.EmitInsights(insights)
                self.MarketOrder(self.symbols[1], self.kf.qty)
                self.MarketOrder(self.symbols[0], -hedge_quantity)

            # Short the spread
            elif forecast_error > prediction_std_dev:
                insights = Insight.Group([Insight(self.symbols[0], timedelta(1), InsightType.Price, InsightDirection.Up),
                                           Insight(self.symbols[1], timedelta(1), InsightType.Price, InsightDirection.Down)])
                self.EmitInsights(insights)
                self.MarketOrder(self.symbols[1], -self.kf.qty)
                self.MarketOrder(self.symbols[0], hedge_quantity)

        if holdings.Invested:
            # Close long position
            if holdings.IsShort and (forecast_error >= -prediction_std_dev):
                insights = Insight.Group([Insight(self.symbols[0], timedelta(1), InsightType.Price, InsightDirection.Flat),
                                           Insight(self.symbols[1], timedelta(1), InsightType.Price, InsightDirection.Flat)])
                self.EmitInsights(insights)
                self.Liquidate()
                self.invested = None
            
            # Close short position
            elif holdings.IsLong and (forecast_error <= prediction_std_dev):
                insights = Insight.Group([Insight(self.symbols[0], timedelta(1), InsightType.Price, InsightDirection.Flat),
                                           Insight(self.symbols[1], timedelta(1), InsightType.Price, InsightDirection.Flat)])
                self.EmitInsights(insights)
                self.Liquidate()
                self.invested = None
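
The entry and exit rules in UpdateAndTrade() can also be isolated from the QuantConnect API as a small pure function, which makes them easy to check by hand. This is a sketch under assumptions: the function name, the position labels, and the returned strings are hypothetical and do not appear in the algorithm itself.

```python
def spread_signal(forecast_error, prediction_std_dev, position):
    """Map the filter outputs to an action.

    position is one of "flat", "long", or "short" (position in the spread).
    """
    if position == "flat":
        if forecast_error < -prediction_std_dev:
            return "enter_long"    # spread is unusually low
        if forecast_error > prediction_std_dev:
            return "enter_short"   # spread is unusually high
        return "hold"
    if position == "long" and forecast_error >= -prediction_std_dev:
        return "exit"              # spread has recovered from the low side
    if position == "short" and forecast_error <= prediction_std_dev:
        return "exit"              # spread has recovered from the high side
    return "hold"


examples = [
    spread_signal(-2.0, 1.0, "flat"),
    spread_signal(2.0, 1.0, "flat"),
    spread_signal(-0.5, 1.0, "long"),
    spread_signal(1.5, 1.0, "short"),
]
```

Note that the exit thresholds equal the entry thresholds, so a position is closed as soon as the forecast error comes back inside the one-standard-deviation band.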

This is one application of a Kalman filter in finance, and there are countless others. We encourage you to explore building your own Kalman filter class, using the Python libraries, or applying this one to your own research and trading!

(The code for the Kalman filter was taken from an article posted here and the basic strategy is taken from Ernie Chan's book on algorithmic trading)

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.

DawnAmber

You did a great job. I am sure other users will benefit from you taking the time to write this up.

Valery T

What is the purpose of generating insights _and_ market orders at the same time? How will the insights be used? To generate more orders?

Alexandre Catarino

Hi Valery T,

NullPortfolioConstructionModel is the default QCAlgorithm.PortfolioConstruction. It does not create targets and, consequently, the Execution Model will not place orders.

On the other hand, we want to emit insights so that the predictions can be scored, since they will eventually be consumed by a hedge fund that subscribes to that Alpha Stream algorithm.

Note:
Alpha Stream Algorithm: a QCAlgorithm that emits insights; it can use either the Framework or the "Classic" pattern.

Andrew martin czeizler

Hi Alexandre Catarino and Jack Simonson,

Great article!

Just curious: if we want to use the Kalman filter as a replacement for the exponential moving average, how would you define the fast indicator (fast Kalman filter) and the slow indicator (slow Kalman filter) to mimic a "moving average" crossover strategy?

Best,

Andrew

Jack Simonson

Hi Andrew,

As far as I know, what you're hoping to accomplish isn't quite do-able. Kalman filters generally use single, discrete steps (especially when it comes to simple time series data) - as new data comes in, it updates. It is able to provide a smoothed series of values based on the observed data, but it doesn't aggregate the data and can't be lagged by multiple steps to offer the same features of a moving average. If you did "lag" it, say only input every n-th data point into the filter, you wouldn't achieve a result in the way that you would with a different period MA. It would give you different state estimates with different error than a 1-step filter, but I'm not sure this is what you're trying to achieve.

If I'm misunderstanding your question, let me know and perhaps I can still help!

Dave Mueller

Hi Jack (or Jared/Alex/Anyone):

I hope it's not too late to post some questions on this thread.

I'm trying to make sense of the Kalman filter and something doesn't click with me yet.

It does not appear that the filter stores more than 1 price value for each symbol at a time (please correct me if I interpreted this incorrectly). My understanding of the Kalman filter is that it predicts the state of a time series (at least in this case). How does it do this if it doesn't get "warmed up" the way a technical indicator would? For example, why does it not need something like a 200 minute history warmup (or any length of time)? I don't understand how it would predict state based only on one value. How does it know that the first data point for each symbol is any given state if it's the first point?

I would really appreciate some confirmation as it is definitely an intriguing concept, but I don't quite follow the logic. Perhaps I'm missing something simple - whatever it is I would like to know how this works.

Thanks in advance!

Hi Dave Mueller,

The Kalman filter implicitly stores a time series, since the class variables self.theta, self.C, and self.R are updated with each new pair of prices. In this respect, it works similarly to an exponential moving average.

We believe that you are right about the need to warm it up, because it takes some iterations for the estimates to settle. We can add the following logic in Initialize to warm it up:

df = self.History(self.symbols, 200, Resolution.Daily).close.unstack(0)
for time, value in df.iterrows():
    err, std, qnty = self.kf.update(value['VIA'], value['VIAB'])

The choice of 200 is ad hoc since we don't have a defined look-back period as we do for most indicators.
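
To illustrate the comparison with an exponential moving average: a recursive update keeps only one number as state, yet that number is a weighted sum of every past price. The sketch below uses made-up prices and a made-up smoothing factor; `ema_update` is a hypothetical helper, not a QuantConnect indicator.

```python
def ema_update(previous, price, alpha):
    """One recursive EMA step: only the previous value is stored."""
    return alpha * price + (1 - alpha) * previous


prices = [10.0, 11.0, 12.0, 13.0]  # made-up prices
alpha = 0.5                        # made-up smoothing factor

# Recursive form: seed with the first price, then fold in each new price
state = prices[0]
for price in prices[1:]:
    state = ema_update(state, price, alpha)

# Explicit form: the same value written as a weighted sum over all history
last = len(prices) - 1
explicit = (1 - alpha) ** last * prices[0] + sum(
    alpha * (1 - alpha) ** (last - k) * prices[k] for k in range(1, len(prices))
)
```

Both forms give the same number, which is why a filter that stores only its latest state still reflects the whole price history, and why early values matter less as more data arrives.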

Thank you Alex, that adds a lot of clarity. I appreciate the response!

Alex - a follow up question.

Is there a common best practice when using missing data? For example, is it more reliable to not call update in the case of a missing bar? To input the last price? To input the average price? Or is this something that varies widely from model to model?

Rahul Chowdhury

Hey Dave,

Since Kalman filters are meant to act on time series data with constant time steps, we shouldn't skip an update when data is missing. Instead, we can substitute an estimate of the missing price. The best estimate to use is the last price, which introduces the least bias because we are forward filling our data.

Best,
Rahul
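
A minimal sketch of the forward-fill idea described above, assuming missing bars are represented as None; the helper name and the sample prices are hypothetical.

```python
def forward_fill(prices):
    """Replace each missing price (None) with the last known price."""
    filled, last = [], None
    for price in prices:
        if price is not None:
            last = price
        filled.append(last)  # stays None until a first price is seen
    return filled


raw = [10.0, None, None, 10.4, None]
filled = forward_fill(raw)
```

The filled series can then be passed to the filter one value per time step, so the step size stays constant.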

Derek Melchin (STAFF)

Hi Kaushik,

When using the pykalman library, we can use the filter method to get the dynamic hedge ratio over a window of data. In a backtesting environment, where data points are made available over time, we use the filter_update method to get the latest estimate of the hedge ratio. Unfortunately, neither of these methods returns the forecast_error or prediction_std_dev.

Best,
Derek Melchin

Richard Thomas Harrison

Hey Andrew martin czeizler,

Regarding the fast and slow moving average equivalent for Kalman filtering: the delta parameter controls how fast the Kalman filter adapts to the most recent data, which is analogous to using different lookback windows for moving averages. Michael Halls-Moore talks about it in his book Advanced Algorithmic Trading, p.236-237. I am not sure how to go from a value of delta to a number of lookback periods or vice versa, but I guess you could experiment with some data and see which values of delta smooth the time series most similarly to whatever lookback period you want. I'm not even sure whether higher values of delta correspond to slower adaptation or vice versa, but I may try to implement it myself and make another post in here. I know that when delta=0, the Kalman filter doesn't update at all, and in this case the filter becomes traditional linear regression; see Ernie Chan's Algorithmic Trading book, p.78.

Richard

Hey Dave Mueller/Andrew Martin Czeizler,

Andrew martin czeizler, regarding my last post on the lookback-window equivalent for Kalman filtering: had I read a little further in Ernie's Algo Trading book (p.77-78), I would have found that higher values of delta correspond to shorter lookback windows for moving averages, at least as I interpret it. Ernie says that if delta=0, Kalman filtering reduces to ordinary least squares regression, and if delta=1, the parameters will fluctuate wildly based on the latest observation.

Dave Mueller, regarding the warm-up for Kalman filtering: besides the procedure proposed by Alexandre Catarino, where you simply run the Kalman filter over a set amount of data before actually using it, there is also a procedure to initialize the variance-covariance parameters of the measurement and state transition equations. Ernest Chan to the rescue again: in his Algo Trading book (p.77-78), he cites a paper by Rajamani and Rawlings (2007, 2009), who develop a method called autocovariance least squares that initializes the measurement and state transition equations and eliminates the need to arbitrarily pick a value for delta. However, this procedure requires data, so the outcome may not be much different from simply doing what Alexandre Catarino proposes.

Also, I said that we “arbitrarily” pick delta=1e-4 (Michael Halls-Moore used 1e-5 in his advanced trading book but uses 1e-4 in his article). This is what Ernie used in his book, and he said it was picked with the benefit of hindsight, so it's probably not “arbitrary” after all. I'm assuming that if you instead use the autocovariance least squares method, your Kalman filter parameters for either method will converge after a few data points.
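
One way to explore the link between delta and adaptation speed is to run the same regression filter with two values of delta on synthetic data whose true hedge ratio jumps partway through. The sketch below is an experiment under stated assumptions, not the class from the post: it starts from a non-zero covariance, uses an observation variance of 1.0, and the prices, the jump, and the two delta values are all made up.

```python
import numpy as np


def track_hedge_ratio(x, y, delta, obs_var=1.0):
    """Filter for y_t = beta_t * x_t + alpha_t + noise, with random-walk states."""
    theta = np.zeros(2)                   # [beta, alpha]
    C = np.eye(2)                         # assumed non-zero starting covariance
    W = delta / (1 - delta) * np.eye(2)   # state noise, as in the post's class
    betas = []
    for xt, yt in zip(x, y):
        F = np.array([xt, 1.0])
        R = C + W                         # prior covariance for this step
        error = yt - F @ theta            # forecast error
        Q = F @ R @ F + obs_var           # variance of the forecast
        A = R @ F / Q                     # Kalman gain
        theta = theta + A * error
        C = R - np.outer(A, F @ R)        # posterior covariance
        betas.append(theta[0])
    return np.array(betas)


rng = np.random.default_rng(0)
n = 300
x = rng.uniform(40.0, 60.0, size=2 * n)
true_beta = np.where(np.arange(2 * n) < n, 1.0, 1.5)   # ratio jumps at step n
y = true_beta * x + rng.normal(0.0, 1.0, size=2 * n)

window = slice(n, n + 50)                 # first 50 steps after the jump
err_slow = np.mean(np.abs(track_hedge_ratio(x, y, 1e-7)[window] - 1.5))
err_fast = np.mean(np.abs(track_hedge_ratio(x, y, 1e-2)[window] - 1.5))
```

On this synthetic series the larger delta tracks the jump more closely (err_fast is smaller than err_slow), which is consistent with reading higher delta as a shorter lookback. The absolute speeds depend on the noise scale, so treat this as a way to experiment rather than a conversion rule.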

Papa Bear

If anyone is considering comparing more than two assets, attached is an example with an increasing number of variations.

Louis Szeto

Hi Papa Bear

Thank you for sharing! We also have a Kalman Filter with statistical arbitrage tutorial (multiple-security version) here.😊

Best
Louis

Johnny Cash

The link above does not work.

I think this is the correct link. Why is the backtest equity result so different from the research?

Jack Simonson INVESTOR
