Securities
Filtering Data
Introduction
Raw data can be faulty for several reasons, including invalid data entry. Moreover, high-frequency traders can deploy bait-and-switch strategies, submitting bait orders to deceive other market participants, which makes raw data noisy and untradeable. To keep suspicious data points out of your trading logic and model training, you can remove them with a data filter.
Set Models
To set a data filter for a security, call the SetDataFilter (C#) or set_data_filter (Python) method on the Security object.
// Use the SetDataFilter method to use the SecurityDataFilter on the SPY ETF data.
var spy = AddEquity("SPY");
spy.SetDataFilter(new SecurityDataFilter());
# Use the set_data_filter method to use the SecurityDataFilter on the SPY ETF data.
spy = self.add_equity("SPY")
spy.set_data_filter(SecurityDataFilter())
You can also set the data filter model in a security initializer. If your algorithm has a universe, use the security initializer technique. To initialize single security subscriptions with the security initializer, call SetSecurityInitializer (C#) or set_security_initializer (Python) before you create the subscriptions.
public class BrokerageModelExampleAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        // In the Initialize method, set the security initializer to seed the initial prices and models of assets.
        SetSecurityInitializer(new MySecurityInitializer(BrokerageModel, new FuncSecuritySeeder(GetLastKnownPrices)));
    }
}

public class MySecurityInitializer : BrokerageModelSecurityInitializer
{
    public MySecurityInitializer(IBrokerageModel brokerageModel, ISecuritySeeder securitySeeder)
        : base(brokerageModel, securitySeeder) {}

    public override void Initialize(Security security)
    {
        // First, call the superclass definition.
        // This method sets the reality models of each security using the default reality models of the brokerage model.
        base.Initialize(security);
        // Next, overwrite some of the reality models.
        security.SetDataFilter(new SecurityDataFilter());
    }
}
class BrokerageModelExampleAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        # In the initialize method, set the security initializer to seed the initial prices and models of assets.
        self.set_security_initializer(MySecurityInitializer(self.brokerage_model, FuncSecuritySeeder(self.get_last_known_prices)))

# Outside of the algorithm class
class MySecurityInitializer(BrokerageModelSecurityInitializer):

    def __init__(self, brokerage_model: IBrokerageModel, security_seeder: ISecuritySeeder) -> None:
        super().__init__(brokerage_model, security_seeder)

    def initialize(self, security: Security) -> None:
        # First, call the superclass definition.
        # This method sets the reality models of each security using the default reality models of the brokerage model.
        super().initialize(security)
        # Next, overwrite some of the reality models.
        security.set_data_filter(SecurityDataFilter())
Default Behavior
The following table shows the default data filter for each security type:
Security Type | Default Filter |
---|---|
Equity | EquityDataFilter |
Option | OptionDataFilter |
Forex | ForexDataFilter |
Index | IndexDataFilter |
Cfd | CfdDataFilter |
Others | SecurityDataFilter |
None of the preceding filters filter out any data.
Model Structure
In C#, data filtering models should implement the ISecurityDataFilter interface. Implementations of the ISecurityDataFilter interface must define the Filter method, which receives Security and BaseData objects and returns a boolean: true to keep the data point or false to filter it out.
In Python, data filtering models must implement a filter method, which receives Security and BaseData objects and returns a boolean: True to keep the data point or False to filter it out.
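// Include or exclude a data point from the algorithm.
public class MyDataFilter : ISecurityDataFilter
{
    // No override keyword: this method implements an interface member.
    public bool Filter(Security vehicle, BaseData data)
    {
        // Return true to keep the data point or false to discard it.
        return true;
    }
}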
# Include or exclude a data point from the algorithm.
class MyDataFilter(SecurityDataFilter):

    def filter(self, vehicle: Security, data: BaseData) -> bool:
        return True
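The return-value convention can be seen in isolation with a small sketch that runs without LEAN. It assumes a hypothetical `DataPoint` stand-in for BaseData and a hypothetical `apply_filter` helper that imitates how an engine might consult the filter; neither name is part of the LEAN API.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    """Hypothetical stand-in for BaseData; only `value` is modeled."""
    value: float


class PositivePriceFilter:
    """Keeps a data point only when its value is strictly positive."""

    def filter(self, vehicle, data: DataPoint) -> bool:
        # True means "keep the data point"; False means "discard it".
        return data.value > 0


def apply_filter(data_filter, vehicle, points):
    """Hypothetical helper: pass each point through the filter and keep the survivors."""
    return [p for p in points if data_filter.filter(vehicle, p)]


points = [DataPoint(101.5), DataPoint(0.0), DataPoint(-3.0), DataPoint(102.0)]
kept = apply_filter(PositivePriceFilter(), vehicle=None, points=points)
print([p.value for p in kept])  # [101.5, 102.0]
```

In a real algorithm, the same `filter` body would live in a class that extends SecurityDataFilter, as in the snippet above.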
Examples
The following examples demonstrate some common practices for filtering data.
Example 1: Filter Out Outliers
When analyzing high-frequency price data, it's important to filter out potential outliers and anomalies that may skew the analysis. One effective method is to use simple moving average (SMA) and standard deviation indicators to identify ticks that deviate significantly from the short-term trend. By comparing each tick to the indicator values, you can flag any data points that fall outside a threshold (for example, three standard deviations). This filtering prevents suspicious or erroneous price information from entering your algorithm, which gives you a cleaner dataset for trading.
public class CustomDataFilterAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        dynamic equity = AddEquity("AAPL", Resolution.Tick);
        // Create the indicators.
        equity.Sma = SMA(equity.Symbol, 100);
        equity.Std = IndicatorExtensions.Of(new StandardDeviation(100), equity.Sma, true);
        // Set the data filter.
        equity.SetDataFilter(new CustomDataFilter());
    }
}

class CustomDataFilter : SecurityDataFilter
{
    public CustomDataFilter() : base() { }

    public override bool Filter(Security vehicle, BaseData data)
    {
        // Wait until the indicators are ready.
        var security = vehicle as dynamic;
        if (!(security.Sma.IsReady && security.Std.IsReady))
        {
            return true; // Keep the data point.
        }
        // Check if the current value is within 3 standard deviations of the mean.
        // Return true (keep) or false (discard).
        var sma = security.Sma.Current.Value;
        var std = security.Std.Current.Value;
        return sma - 3m*std <= data.Value && data.Value <= sma + 3m*std;
    }
}
class CustomDataFilterAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        equity = self.add_equity("AAPL", Resolution.TICK)
        # Create the indicators.
        equity.sma = self.sma(equity.symbol, 100)
        equity.std = IndicatorExtensions.of(StandardDeviation(100), equity.sma, True)
        # Set the data filter.
        equity.set_data_filter(CustomDataFilter())

class CustomDataFilter(SecurityDataFilter):

    def filter(self, vehicle: Security, data: BaseData) -> bool:
        # Wait until the indicators are ready.
        security = vehicle
        if not (security.sma.is_ready and security.std.is_ready):
            return True  # Keep the data point.
        # Check if the current value is within 3 standard deviations of the mean.
        # Return True (keep) or False (discard).
        sma = security.sma.current.value
        std = security.std.current.value
        return sma - 3*std <= data.value <= sma + 3*std
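To see the three-standard-deviation rule work on concrete numbers, the following stand-alone sketch applies it to a short list of prices using only the Python standard library. It is an illustration under stated assumptions, not the algorithm above: the `ThreeSigmaFilter` class is hypothetical, it computes a population standard deviation over the raw values in a rolling window rather than using LEAN indicators, and it only adds accepted values to the window.

```python
from collections import deque
from statistics import mean, pstdev


class ThreeSigmaFilter:
    """Hypothetical stand-alone filter: keeps values within k standard deviations of a rolling mean."""

    def __init__(self, period: int = 100, k: float = 3.0) -> None:
        self.k = k
        self.window = deque(maxlen=period)

    def filter(self, value: float) -> bool:
        # Warm-up: keep everything until the window is full.
        if len(self.window) < self.window.maxlen:
            self.window.append(value)
            return True
        mu = mean(self.window)
        sigma = pstdev(self.window)
        keep = mu - self.k * sigma <= value <= mu + self.k * sigma
        # Assumption of this sketch: only accepted values update the window,
        # so a single bad tick cannot widen the band.
        if keep:
            self.window.append(value)
        return keep


prices = [100.0, 100.2, 99.9, 100.1, 100.0, 150.0, 100.3]
three_sigma = ThreeSigmaFilter(period=5)
kept = [p for p in prices if three_sigma.filter(p)]
print(kept)  # [100.0, 100.2, 99.9, 100.1, 100.0, 100.3]
```

With a five-value window, the mean is 100.04 and the standard deviation is roughly 0.10, so the band is about 99.73 to 100.35 and the 150.0 print is discarded. One trade-off of updating the window only with accepted values is that a genuine, abrupt price shift would keep being rejected, so a production filter may need a reset or a wider threshold.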
Example 2: Filter Out Major Exchanges
When you trade illiquid financial instruments, it can be advantageous to focus on the BATS exchange since its quote data may not fully reflect the fair market value. Due to the lower trading volume and visibility of BATS, the quotes there may lag behind the true value of illiquid assets. A carefully designed algorithm can analyze the BATS feed to identify situations where the quotes appear to be undervalued compared to the asset's intrinsic worth. By executing trades to capture this disconnect, rather than arbitraging between exchanges, you may be able to profit from the market inefficiencies present in the less liquid instrument. The following example demonstrates how to consume data from only the BATS exchange:
public class BatsDataFilterAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        dynamic equity = AddEquity("AAPL", Resolution.Tick);
        // Set the data filter.
        equity.SetDataFilter(new BatsDataFilter());
    }
}

class BatsDataFilter : SecurityDataFilter
{
    public BatsDataFilter() : base() { }

    public override bool Filter(Security vehicle, BaseData data)
    {
        // Get the tick object.
        var tick = data as Tick;
        // Return true (keep) or false (discard).
        return tick != null && tick.Exchange == Exchange.BATS;
    }
}
class BatsDataFilterAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        equity = self.add_equity("AAPL", Resolution.TICK)
        # Set the data filter.
        equity.set_data_filter(BatsDataFilter())

class BatsDataFilter(SecurityDataFilter):

    def filter(self, vehicle: Security, data: BaseData) -> bool:
        # Get the tick object.
        tick = Tick(data)
        # Return True (keep) or False (discard).
        return tick and tick.exchange == Exchange.BATS.name
For more examples, see the following algorithms: