Securities

Filtering Data

Introduction

Raw data can be faulty for several reasons, such as invalid data entry. In addition, high-frequency traders can run bait-and-switch strategies, submitting bait orders to deceive other market participants, which makes raw data noisy and untradeable. To keep suspicious data points out of your trading logic and model training, you can filter raw data with a data filter.

Set Models

To set a data filter for a security, call the SetDataFilter (C#) or set_data_filter (Python) method of the Security object.

// Use the SetDataFilter method to use the SecurityDataFilter on the SPY ETF data.
var spy = AddEquity("SPY");
spy.SetDataFilter(new SecurityDataFilter());

# Use the set_data_filter method to use the SecurityDataFilter on the SPY ETF data.
spy = self.add_equity("SPY")
spy.set_data_filter(SecurityDataFilter())

You can also set the data filter model in a security initializer. If your algorithm has a universe, use the security initializer technique. To initialize single security subscriptions with the security initializer, call SetSecurityInitializer (C#) or set_security_initializer (Python) before you create the subscriptions.

public class BrokerageModelExampleAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        // In the Initialize method, set the security initializer to seed the initial prices and set the models of assets.
        SetSecurityInitializer(new MySecurityInitializer(BrokerageModel, new FuncSecuritySeeder(GetLastKnownPrices)));
    }
}

public class MySecurityInitializer : BrokerageModelSecurityInitializer
{
    public MySecurityInitializer(IBrokerageModel brokerageModel, ISecuritySeeder securitySeeder)
        : base(brokerageModel, securitySeeder) {}

    public override void Initialize(Security security)
    {
        // First, call the superclass definition.
        // This method sets the reality models of each security using the default reality models of the brokerage model.
        base.Initialize(security);

        // Next, overwrite some of the reality models
        security.SetDataFilter(new SecurityDataFilter());
    }
}

class BrokerageModelExampleAlgorithm(QCAlgorithm):
    def initialize(self) -> None:
        # In the initialize method, set the security initializer to seed the initial prices and set the models of assets.
        self.set_security_initializer(MySecurityInitializer(self.brokerage_model, FuncSecuritySeeder(self.get_last_known_prices)))

# Outside of the algorithm class
class MySecurityInitializer(BrokerageModelSecurityInitializer):

    def __init__(self, brokerage_model: IBrokerageModel, security_seeder: ISecuritySeeder) -> None:
        super().__init__(brokerage_model, security_seeder)

    def initialize(self, security: Security) -> None:
        # First, call the superclass definition.
        # This method sets the reality models of each security using the default reality models of the brokerage model.
        super().initialize(security)

        # Next, overwrite some of the reality models
        security.set_data_filter(SecurityDataFilter())

Default Behavior

The following table shows the default data filter for each security type:

Security Type    Default Filter
Equity           EquityDataFilter
Option           OptionDataFilter
Forex            ForexDataFilter
Index            IndexDataFilter
Cfd              CfdDataFilter
Others           SecurityDataFilter

None of the preceding filters filter out any data.

Model Structure

Data filtering models should implement the ISecurityDataFilter interface. Implementations of the ISecurityDataFilter interface must define the Filter method (filter in Python), which receives Security and BaseData objects and returns a boolean: true to keep the data point or false to filter it out.

// Return true to keep a data point or false to discard it.
public class MyDataFilter : ISecurityDataFilter
{
    public bool Filter(Security vehicle, BaseData data)
    {
        return true;
    }
}

# Return True to keep a data point or False to discard it.
class MyDataFilter(SecurityDataFilter):
    def filter(self, vehicle: Security, data: BaseData) -> bool:
        return True
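
To make the return-value convention concrete (the examples on this page return true to keep a data point and false to discard it), the following is a minimal sketch in plain Python that runs outside LEAN. DataPoint, DataFilter, PositivePriceFilter, and apply_filter are hypothetical stand-ins introduced for illustration; they are not LEAN types, and the engine's own dispatch of data through a filter is assumed rather than shown.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class DataPoint:
    """Hypothetical stand-in for a LEAN BaseData object."""
    symbol: str
    value: float


class DataFilter(Protocol):
    """Hypothetical stand-in for the filter contract described above."""

    def filter(self, vehicle: str, data: DataPoint) -> bool:
        ...


class PositivePriceFilter:
    """Keep only data points with a strictly positive value."""

    def filter(self, vehicle: str, data: DataPoint) -> bool:
        # True keeps the data point; False discards it.
        return data.value > 0


def apply_filter(data_filter: DataFilter, vehicle: str,
                 points: List[DataPoint]) -> List[DataPoint]:
    """Pass each point through the filter and return the ones it keeps."""
    return [p for p in points if data_filter.filter(vehicle, p)]


points = [
    DataPoint("SPY", 450.10),
    DataPoint("SPY", 0.0),    # Invalid entry.
    DataPoint("SPY", -1.0),   # Invalid entry.
    DataPoint("SPY", 450.25),
]
kept = apply_filter(PositivePriceFilter(), "SPY", points)
```

In this sketch, only the two positive prices remain in kept. A real model would put the same kind of check inside its filter method and read the value from the BaseData object.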

Examples

The following examples demonstrate some common practices for filtering data.

Example 1: Filter Out Outliers

When analyzing high-frequency price data, it's important to filter out potential outliers and anomalies that may skew the analysis. One effective method is to use simple moving average (SMA) and standard deviation indicators to identify ticks that deviate significantly from the short-term trend. By comparing each tick to the indicator values, you can flag any data points that fall outside a threshold (for example, three standard deviations). This filter prevents suspicious or erroneous price information from entering your algorithm, which gives you a cleaner dataset for trading.

public class CustomDataFilterAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        dynamic equity = AddEquity("AAPL", Resolution.Tick);
        // Create the indicators.
        equity.Sma = SMA(equity.Symbol, 100);
        equity.Std = IndicatorExtensions.Of(new StandardDeviation(100), equity.Sma, true);
        // Set the data filter.
        equity.SetDataFilter(new CustomDataFilter());
    }
}

class CustomDataFilter : SecurityDataFilter
{
    public CustomDataFilter() : base() { }
    public override bool Filter(Security vehicle, BaseData data)
    {
        // Wait until the indicators are ready.
        var security = vehicle as dynamic;
        if (!(security.Sma.IsReady && security.Std.Window.IsReady))
        {
            return true; // Keep the data point.
        }
        // Check if the current value is within 3 standard deviations of the mean.
        // Return true (keep) or false (discard).
        var sma = security.Sma.Current.Value;
        var std = security.Std.Current.Value;
        return sma - 3m*std <= data.Value && data.Value <= sma + 3m*std;
    }
}

class CustomDataFilterAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        equity = self.add_equity("AAPL", Resolution.TICK)
        # Create the indicators.
        equity.sma = self.sma(equity.symbol, 100)
        equity.std = IndicatorExtensions.of(StandardDeviation(100), equity.sma, True)
        # Set the data filter.
        equity.set_data_filter(CustomDataFilter())


class CustomDataFilter(SecurityDataFilter):

    def filter(self, vehicle: Security, data: BaseData) -> bool:
        # Wait until the indicators are ready.
        security = vehicle
        if not (security.sma.is_ready and security.std.is_ready):
            return True # Keep the data point.
        # Check if the current value is within 3 standard deviations of the mean.
        # Return True (keep) or False (discard).
        sma = security.sma.current.value
        std = security.std.current.value
        return sma - 3*std <= data.value <= sma + 3*std

Example 2: Filter Out Major Exchanges

When you trade illiquid financial instruments, it can be advantageous to focus on the BATS exchange since its quote data may not fully reflect the fair market value. Due to the lower trading volume and visibility of BATS, the quotes there may lag behind the true value of illiquid assets. A carefully designed algorithm can analyze the BATS feed to identify situations where the quotes appear undervalued relative to the asset's intrinsic worth. By trading on this disconnect, rather than arbitraging between exchanges, you may be able to profit from the market inefficiencies present in less liquid instruments. The following example demonstrates how to consume data from only the BATS exchange:

public class BatsDataFilterAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        dynamic equity = AddEquity("AAPL", Resolution.Tick);
        // Set the data filter.
        equity.SetDataFilter(new BatsDataFilter());
    }
}

class BatsDataFilter : SecurityDataFilter
{
    public BatsDataFilter() : base() { }
    public override bool Filter(Security vehicle, BaseData data)
    {
        // Get the tick object.
        var tick = data as Tick;
        // Return true (keep) or false (discard).
        return tick != null && tick.Exchange == Exchange.BATS;
    }
}

class BatsDataFilterAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        equity = self.add_equity("AAPL", Resolution.TICK)
        # Set the data filter.
        equity.set_data_filter(BatsDataFilter())


class BatsDataFilter(SecurityDataFilter):

    def filter(self, vehicle: Security, data: BaseData) -> bool:
        # Get the tick object.
        tick = Tick(data)
        # Return True (keep) or False (discard).
        return tick and tick.exchange == Exchange.BATS.name
        
