ExtractAlpha

Estimize

Introduction

The Estimize dataset by ExtractAlpha provides crowdsourced estimates of company financials, including EPS and revenue. The data covers over 2,800 US-listed Equities. It starts in January 2011 and is delivered at a daily frequency. The data is sparse, so it doesn't contain new updates every day. The estimates are crowdsourced from a community of 100,000+ contributors through the data provider’s web platform.

This dataset depends on the US Equity Security Master dataset, which contains information on splits, dividends, and symbol changes.

For more information about the Estimize dataset, including CLI commands and pricing, see the dataset listing.

About the Provider

ExtractAlpha was founded by Vinesh Jha in 2013 with the goal of providing alternative data for investors. ExtractAlpha's rigorously researched data sets and quantitative stock selection models leverage unique sources and analytical techniques, allowing users to gain an investment edge.

Getting Started

The following snippet demonstrates how to request data from the Estimize dataset:

from QuantConnect.DataSource import *

# Keep the Equity Symbol so the dataset subscriptions can reference it
self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
self.estimize_consensus_symbol = self.add_data(EstimizeConsensus, self.aapl).symbol
self.estimize_estimate_symbol = self.add_data(EstimizeEstimate, self.aapl).symbol
self.estimize_release_symbol = self.add_data(EstimizeRelease, self.aapl).symbol
using QuantConnect.DataSource;

_symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
_estimizeConsensusSymbol = AddData<EstimizeConsensus>(_symbol).Symbol;
_estimizeEstimateSymbol = AddData<EstimizeEstimate>(_symbol).Symbol; 
_estimizeReleaseSymbol = AddData<EstimizeRelease>(_symbol).Symbol; 

Data Summary

The following table describes the dataset properties:

Property          Value
Start Date        January 2011
Asset Coverage    2,800 US Equities
Data Density      Sparse
Resolution        Daily
Timezone          UTC

Requesting Data

To add Estimize data to your algorithm, call the AddData (C#) or add_data (Python) method. Save a reference to the dataset Symbol so you can access the data later in your algorithm.

class ExtractAlphaEstimizeDataAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 6, 1)
        self.set_cash(100000)

        # Keep the Equity Symbol so the dataset subscriptions can reference it
        self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
        self.estimize_consensus_symbol = self.add_data(EstimizeConsensus, self.aapl).symbol
        self.estimize_estimate_symbol = self.add_data(EstimizeEstimate, self.aapl).symbol
        self.estimize_release_symbol = self.add_data(EstimizeRelease, self.aapl).symbol
public class ExtractAlphaEstimizeDataAlgorithm : QCAlgorithm
{
    private Symbol _symbol, _estimizeConsensusSymbol, _estimizeEstimateSymbol, _estimizeReleaseSymbol;

    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2020, 6, 1);
        SetCash(100000);

        _symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
        _estimizeConsensusSymbol = AddData<EstimizeConsensus>(_symbol).Symbol;
        _estimizeEstimateSymbol = AddData<EstimizeEstimate>(_symbol).Symbol; 
        _estimizeReleaseSymbol = AddData<EstimizeRelease>(_symbol).Symbol;
    }
}

Accessing Data

To get the current Estimize data, index the current Slice with the dataset Symbol. Slice objects deliver unique events to your algorithm as they happen, but the Slice may not contain data for your dataset at every time step. To avoid issues, check if the Slice contains the data you want before you index it.

def on_data(self, slice: Slice) -> None:
    if slice.contains_key(self.estimize_consensus_symbol):
        data_point = slice[self.estimize_consensus_symbol]
        self.log(f"{self.estimize_consensus_symbol} mean at {slice.time}: {data_point.mean}")

    if slice.contains_key(self.estimize_estimate_symbol):
        data_point = slice[self.estimize_estimate_symbol]
        self.log(f"{self.estimize_estimate_symbol} EPS at {slice.time}: {data_point.eps}")

    if slice.contains_key(self.estimize_release_symbol):
        data_point = slice[self.estimize_release_symbol]
        self.log(f"{self.estimize_release_symbol} EPS at {slice.time}: {data_point.eps}")
public override void OnData(Slice slice)
{
    if (slice.ContainsKey(_estimizeConsensusSymbol))
    {
        var dataPoint = slice[_estimizeConsensusSymbol];
        Log($"{_estimizeConsensusSymbol} mean at {slice.Time}: {dataPoint.Mean}");
    }

    if (slice.ContainsKey(_estimizeEstimateSymbol))
    {
        var dataPoint = slice[_estimizeEstimateSymbol];
        Log($"{_estimizeEstimateSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }

    if (slice.ContainsKey(_estimizeReleaseSymbol))
    {
        var dataPoint = slice[_estimizeReleaseSymbol];
        Log($"{_estimizeReleaseSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }
}

To iterate through all of the dataset objects in the current Slice, call the Get (C#) or get (Python) method.

def on_data(self, slice: Slice) -> None:
    for dataset_symbol, data_point in slice.get(EstimizeConsensus).items():
        self.log(f"{dataset_symbol} mean at {slice.time}: {data_point.mean}")

    for dataset_symbol, data_point in slice.get(EstimizeEstimate).items():
        self.log(f"{dataset_symbol} EPS at {slice.time}: {data_point.eps}")

    for dataset_symbol, data_point in slice.get(EstimizeRelease).items():
        self.log(f"{dataset_symbol} EPS at {slice.time}: {data_point.eps}")
public override void OnData(Slice slice)
{
    foreach (var kvp in slice.Get<EstimizeConsensus>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} mean at {slice.Time}: {dataPoint.Mean}");
    }

    foreach (var kvp in slice.Get<EstimizeEstimate>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }

    foreach (var kvp in slice.Get<EstimizeRelease>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }
}

Historical Data

To get historical Estimize data, call the History (C#) or history (Python) method with the dataset Symbol. If there is no data in the period you request, the history result is empty.

# DataFrames
consensus_history_df = self.history(self.estimize_consensus_symbol, 100, Resolution.DAILY)
estimate_history_df = self.history(self.estimize_estimate_symbol, 100, Resolution.DAILY)
release_history_df = self.history(self.estimize_release_symbol, 100, Resolution.DAILY)
history_df = self.history([
    self.estimize_consensus_symbol,
    self.estimize_estimate_symbol,
    self.estimize_release_symbol], 100, Resolution.DAILY)

# Dataset objects
consensus_history_bars = self.history[EstimizeConsensus](self.estimize_consensus_symbol, 100, Resolution.DAILY)
estimate_history_bars = self.history[EstimizeEstimate](self.estimize_estimate_symbol, 100, Resolution.DAILY)
release_history_bars = self.history[EstimizeRelease](self.estimize_release_symbol, 100, Resolution.DAILY)
// Dataset objects
var consensusHistory = History<EstimizeConsensus>(_estimizeConsensusSymbol, 100, Resolution.Daily);
var estimateHistory = History<EstimizeEstimate>(_estimizeEstimateSymbol, 100, Resolution.Daily);
var releaseHistory = History<EstimizeRelease>(_estimizeReleaseSymbol, 100, Resolution.Daily);

// Slice objects
var history = History(new[]{_estimizeConsensusSymbol,
                            _estimizeEstimateSymbol,
                            _estimizeReleaseSymbol}, 10, Resolution.Daily);

For more information about historical data, see History Requests.

Remove Subscriptions

To remove a subscription, call the RemoveSecurity (C#) or remove_security (Python) method.

self.remove_security(self.estimize_consensus_symbol)
self.remove_security(self.estimize_estimate_symbol)
self.remove_security(self.estimize_release_symbol)
RemoveSecurity(_estimizeConsensusSymbol);
RemoveSecurity(_estimizeEstimateSymbol);
RemoveSecurity(_estimizeReleaseSymbol);

If you subscribe to Estimize data for assets in a dynamic universe, remove the dataset subscription when the asset leaves your universe. To view a common design pattern, see Track Security Changes.
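
The bookkeeping behind that pattern can be sketched in plain Python. This is a minimal illustration, not the platform's implementation: DatasetSubscriptionTracker and its subscribe/unsubscribe callables are hypothetical names, and plain strings stand in for Symbol objects so the sketch runs without the LEAN engine.

```python
from typing import Callable, Dict, Iterable, List


class DatasetSubscriptionTracker:
    """Hypothetical helper that maps each asset to its dataset subscriptions."""

    def __init__(self,
                 subscribe: Callable[[str], List[str]],
                 unsubscribe: Callable[[str], None]) -> None:
        # subscribe(asset) returns the dataset symbols it created.
        # unsubscribe(dataset_symbol) removes a single subscription.
        self._subscribe = subscribe
        self._unsubscribe = unsubscribe
        self._by_asset: Dict[str, List[str]] = {}

    def on_changes(self, added: Iterable[str], removed: Iterable[str]) -> None:
        # Subscribe once per asset that enters the universe.
        for asset in added:
            if asset not in self._by_asset:
                self._by_asset[asset] = self._subscribe(asset)
        # Drop every dataset subscription of an asset that leaves.
        for asset in removed:
            for dataset_symbol in self._by_asset.pop(asset, []):
                self._unsubscribe(dataset_symbol)

    def active(self) -> Dict[str, List[str]]:
        # Return a copy so callers can't mutate the internal state.
        return dict(self._by_asset)
```

In an algorithm, subscribe would presumably wrap the three add_data calls shown earlier and return their Symbol objects, unsubscribe would wrap remove_security, and on_changes would be called from the securities-changed handler.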

Example Applications

The Estimize dataset enables you to estimate a company's financials more accurately when you search for alpha. Examples include the following use cases:

  • Fundamental estimates for ML regression/classification models
  • Arbitrage/sentiment trading on market “surprises”, where reported results deviate from ordinary expectations, using the dataset's more accurate estimates
  • Using industry-specific KPIs to predict the returns of individual sectors
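
As an illustration of the “surprise” use case, the sketch below computes a relative earnings surprise. The formula is an assumption (one common way to define a surprise), not something the dataset prescribes, and earnings_surprise is a hypothetical helper. Its inputs would typically be the eps value of an EstimizeRelease and the mean value of an EstimizeConsensus, as logged in the snippets above.

```python
from typing import Optional


def earnings_surprise(actual_eps: Optional[float],
                      consensus_mean: Optional[float]) -> Optional[float]:
    """Relative gap between the reported EPS and the consensus mean.

    Returns None when either input is missing or the consensus is zero,
    because the ratio is undefined in those cases.
    """
    if actual_eps is None or consensus_mean is None or consensus_mean == 0:
        return None
    # Divide by the absolute value so the sign always follows (actual - consensus).
    return (actual_eps - consensus_mean) / abs(consensus_mean)


# A positive value means the company beat the crowd's expectation.
print(earnings_surprise(1.10, 1.00))
# A smaller loss than expected also counts as a positive surprise.
print(earnings_surprise(-0.50, -1.00))
```

Dividing by the absolute consensus keeps the sign meaningful for companies with negative expected earnings.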

Classic Algorithm Example

The following example algorithm creates a dynamic universe of the 500 most liquid US Equities. Each month, the algorithm forms an equal-weighted dollar-neutral portfolio with the 10 companies with the highest EPS estimate and the 10 companies with the lowest EPS estimate.

from AlgorithmImports import *
from QuantConnect.DataSource import *

class ExtractAlphaEstimizeAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 12, 31)
        self.set_cash(100000)
        
        # A variable to control the next rebalance time
        self.last_time = datetime.min
        
        self.add_universe(self.my_coarse_filter_function)
        self.universe_settings.resolution = Resolution.MINUTE
        
    def my_coarse_filter_function(self, coarse: List[CoarseFundamental]) -> List[Symbol]:
        # Select the non-penny stocks with the highest dollar volume, since they have more stable price (lower risk) and more informed insights from high market activities
        sorted_by_dollar_volume = sorted([x for x in coarse if x.has_fundamental_data and x.price > 4], 
                                key=lambda x: x.dollar_volume, reverse=True)
        selected = [x.symbol for x in sorted_by_dollar_volume[:500]]
        return selected

    def on_data(self, slice: Slice) -> None:
        if self.last_time > self.time: return
    
        # Access the Estimize data to use the crowdsourced insights as trading signals
        consensus = slice.get(EstimizeConsensus)
        estimate = slice.get(EstimizeEstimate)
        release = slice.get(EstimizeRelease)
        
        if not estimate: return
        
        # Long the ones with the highest earnings estimates and short the ones with the lowest, assuming the EPS estimate is a significant fundamental factor
        sorted_by_eps_estimate = sorted([x for x in estimate.items() if x[1].eps], key=lambda x: x[1].eps)
        long_symbols = [x[0].underlying for x in sorted_by_eps_estimate[-10:]]
        short_symbols = [x[0].underlying for x in sorted_by_eps_estimate[:10]]
        
        # Liquidate the ones that fall out of the earning extremes
        for symbol in [x.symbol for x in self.portfolio.values() if x.invested]:
            if symbol not in long_symbols + short_symbols:
                self.liquidate(symbol)
        
        # Invest equally and dollar-neutral to evenly dissipate capital risk and hedge systematic risk
        long_targets = [PortfolioTarget(symbol, 0.05) for symbol in long_symbols]
        short_targets = [PortfolioTarget(symbol, -0.05) for symbol in short_symbols]
        self.set_holdings(long_targets + short_targets)

        # Update the rebalance time to next month start
        self.last_time = Expiry.END_OF_MONTH(self.time)
        
    def on_securities_changed(self, changes: SecurityChanges) -> None:
        for security in changes.added_securities:
            # Requesting data for trading signal generation
            estimize_consensus_symbol = self.add_data(EstimizeConsensus, security.symbol).symbol
            estimize_estimate_symbol = self.add_data(EstimizeEstimate, security.symbol).symbol
            estimize_release_symbol = self.add_data(EstimizeRelease, security.symbol).symbol

            # Historical Data
            history = self.history([estimize_consensus_symbol,
                                    estimize_estimate_symbol,
                                    estimize_release_symbol
                                    ], 10, Resolution.DAILY)
            self.debug(f"We got {len(history)} items from our history request")
public class ExtractAlphaEstimizeAlgorithm : QCAlgorithm
{
    // A variable to control the next rebalance time
    private DateTime _time = DateTime.MinValue;
    
    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2020, 12, 31);
        SetCash(100000);
        
        AddUniverse(MyCoarseFilterFunction);
        UniverseSettings.Resolution = Resolution.Minute;
    }
    
    private IEnumerable<Symbol> MyCoarseFilterFunction(IEnumerable<CoarseFundamental> coarse)
    {
        // Select the non-penny stocks with the highest dollar volume, since they have more stable price (lower risk) and more informed insights from high market activities
        return (from c in coarse
                where c.HasFundamentalData && c.Price > 4
                orderby c.DollarVolume descending
                select c.Symbol).Take(500);
    }
    
    public override void OnData(Slice slice)
    {
        if (_time > Time) return;
        
        // Access the Estimize data to use the crowdsourced insights as trading signals
        var consensus = slice.Get<EstimizeConsensus>();
        var estimate = slice.Get<EstimizeEstimate>();
        var release = slice.Get<EstimizeRelease>();
        
        if (estimate.IsNullOrEmpty()) return;

        // Long the ones with the highest earnings estimates and short the ones with the lowest, assuming the EPS estimate is a significant fundamental factor
        var sortedByEpsEstimate = from value in estimate.Values
                        where (value.Eps != null)
                        orderby value.Eps descending
                        select value.Symbol.Underlying;
        var longSymbols = sortedByEpsEstimate.Take(10).ToList();
        var shortSymbols = sortedByEpsEstimate.TakeLast(10).ToList();
        
        // Liquidate the ones that fall out of the earning extremes
        foreach (var kvp in Portfolio)
        {
            var symbol = kvp.Key;
            if (kvp.Value.Invested && 
            !longSymbols.Contains(symbol) && 
            !shortSymbols.Contains(symbol))
            {
                Liquidate(symbol);
            }
        }
        
        // Invest equally and dollar-neutral to evenly dissipate capital risk and hedge systematic risk
        var targets = new List<PortfolioTarget>();
        targets.AddRange(longSymbols.Select(symbol => new PortfolioTarget(symbol, 0.05m)));
        targets.AddRange(shortSymbols.Select(symbol => new PortfolioTarget(symbol, -0.05m)));
        
        SetHoldings(targets);

        // Update the rebalance time to next month start
        _time = Expiry.EndOfMonth(Time);
    }
    
    public override void OnSecuritiesChanged(SecurityChanges changes)
    {
        foreach(var security in changes.AddedSecurities)
        {
            // Requesting data for trading signal generation
            var consensusSymbol = AddData<EstimizeConsensus>(security.Symbol).Symbol;
            var estimateSymbol = AddData<EstimizeEstimate>(security.Symbol).Symbol;
            var releaseSymbol = AddData<EstimizeRelease>(security.Symbol).Symbol;
            
            // Historical Data
            var history = History(new[]{
                consensusSymbol,
                estimateSymbol,
                releaseSymbol
            }, 10, Resolution.Daily);
            Debug($"We got {history.Count()} items from our history request");
        }
    }
}
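
The selection and weighting logic of the example can be isolated in plain Python. This is a minimal sketch under stated assumptions: select_long_short and dollar_neutral_weights are hypothetical helpers, symbols are plain strings, and the input is a dictionary of EPS estimates rather than a Slice. Unlike the example above, it keeps estimates of exactly zero and caps the leg size so that no symbol appears in both legs when fewer than 20 estimates are available.

```python
from typing import Dict, List, Optional, Tuple


def select_long_short(eps_by_symbol: Dict[str, Optional[float]],
                      count: int = 10) -> Tuple[List[str], List[str]]:
    """Return (longs, shorts): the highest and lowest EPS estimates."""
    # Drop missing estimates, then sort ascending by EPS.
    ranked = sorted(
        (item for item in eps_by_symbol.items() if item[1] is not None),
        key=lambda item: item[1],
    )
    # Cap the leg size so one symbol never lands in both legs.
    count = min(count, len(ranked) // 2)
    if count == 0:
        return [], []
    shorts = [symbol for symbol, _ in ranked[:count]]
    longs = [symbol for symbol, _ in ranked[-count:]]
    return longs, shorts


def dollar_neutral_weights(longs: List[str], shorts: List[str],
                           gross_exposure: float = 1.0) -> Dict[str, float]:
    """Split the gross exposure equally; the legs offset when equally sized."""
    names = len(longs) + len(shorts)
    if names == 0:
        return {}
    weight = gross_exposure / names
    weights = {symbol: weight for symbol in longs}
    weights.update({symbol: -weight for symbol in shorts})
    return weights
```

With 10 longs and 10 shorts, each position receives a weight of 0.05 or -0.05, which matches the portfolio targets in the example.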

Framework Algorithm Example

The following example algorithm creates a dynamic universe of the 500 most liquid US Equities. Each month, the algorithm forms an equal-weighted dollar-neutral portfolio with the 10 companies with the highest EPS estimate and the 10 companies with the lowest EPS estimate.

from AlgorithmImports import *
from QuantConnect.DataSource import *

class ExtractAlphaEstimizeFrameworkAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 12, 31)
        self.set_cash(100000)
        
        self.add_universe(self.my_coarse_filter_function)
        self.universe_settings.resolution = Resolution.MINUTE

        # Custom alpha model that emits signals according to Estimize data
        self.add_alpha(ExtractAlphaEstimizeAlphaModel())
        
        # Invest equally to evenly dissipate capital concentration risk
        self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())
        
        self.set_execution(ImmediateExecutionModel())
        
    def my_coarse_filter_function(self, coarse: List[CoarseFundamental]) -> List[Symbol]:
        # Select the non-penny stocks with the highest dollar volume, since they have more stable price (lower risk) and more informed insights from high market activities
        sorted_by_dollar_volume = sorted([x for x in coarse if x.has_fundamental_data and x.price > 4], 
                                key=lambda x: x.dollar_volume, reverse=True)
        selected = [x.symbol for x in sorted_by_dollar_volume[:500]]
        return selected
        
class ExtractAlphaEstimizeAlphaModel(AlphaModel):
    
    def __init__(self) -> None:
        # A variable to control the next rebalance time
        self.time = datetime.min
        
    def update(self, algorithm: QCAlgorithm, slice: Slice) -> List[Insight]:
        if self.time > algorithm.time: return []
        
        # Access the Estimize data to use the crowdsourced insights as trading signals
        consensus = slice.get(EstimizeConsensus)
        estimate = slice.get(EstimizeEstimate)
        release = slice.get(EstimizeRelease)
        
        if not estimate: return []

        # Long the ones with the highest earnings estimates and short the ones with the lowest, assuming the EPS estimate is a significant fundamental factor
        sorted_by_eps_estimate = sorted([x for x in estimate.items() if x[1].eps], key=lambda x: x[1].eps)
        long_symbols = [x[0].underlying for x in sorted_by_eps_estimate[-10:]]
        short_symbols = [x[0].underlying for x in sorted_by_eps_estimate[:10]]
        
        insights = []
        for symbol in long_symbols:
            insights.append(Insight.price(symbol, Expiry.END_OF_MONTH, InsightDirection.UP))
        for symbol in short_symbols:
            insights.append(Insight.price(symbol, Expiry.END_OF_MONTH, InsightDirection.DOWN))

        # Update the rebalance time to next month start
        self.time = Expiry.END_OF_MONTH(algorithm.time)
        
        return insights
        
    def on_securities_changed(self, algorithm: QCAlgorithm, changes: SecurityChanges) -> None:
        for security in changes.added_securities:
            # Requesting data for trading signal generation
            estimize_consensus_symbol = algorithm.add_data(EstimizeConsensus, security.symbol).symbol
            estimize_estimate_symbol = algorithm.add_data(EstimizeEstimate, security.symbol).symbol
            estimize_release_symbol = algorithm.add_data(EstimizeRelease, security.symbol).symbol
            
            # Historical Data
            history = algorithm.history([estimize_consensus_symbol,
                                         estimize_estimate_symbol,
                                         estimize_release_symbol
                                        ], 10, Resolution.DAILY)
            algorithm.debug(f"We got {len(history)} items from our history request")
public class ExtractAlphaEstimizeFrameworkAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2021, 1, 1);
        SetCash(100000);
        
        AddUniverse(MyCoarseFilterFunction);
        UniverseSettings.Resolution = Resolution.Minute;

        // Custom alpha model that emits signals according to Estimize data
        AddAlpha(new ExtractAlphaEstimizeAlphaModel());

        // Invest equally to evenly dissipate capital concentration risk
        SetPortfolioConstruction(new EqualWeightingPortfolioConstructionModel());
        
        SetExecution(new ImmediateExecutionModel());
    }
    
    private IEnumerable<Symbol> MyCoarseFilterFunction(IEnumerable<CoarseFundamental> coarse)
    {
        // Select the non-penny stocks with the highest dollar volume, since they have more stable price (lower risk) and more informed insights from high market activities
        return (from c in coarse
                where c.HasFundamentalData && c.Price > 4
                orderby c.DollarVolume descending
                select c.Symbol).Take(500);
    }
}

public class ExtractAlphaEstimizeAlphaModel: AlphaModel
{
    // A variable to control the next rebalance time
    public DateTime _time;
    
    public ExtractAlphaEstimizeAlphaModel()
    {
        _time = DateTime.MinValue;
    }
    
    public override IEnumerable<Insight> Update(QCAlgorithm algorithm, Slice slice)
    {
        if (_time > algorithm.Time) return new List<Insight>();
        
        // Access the Estimize data to use the crowdsourced insights as trading signals
        var consensus = slice.Get<EstimizeConsensus>();
        var estimate = slice.Get<EstimizeEstimate>();
        var release = slice.Get<EstimizeRelease>();
        
        if (estimate.IsNullOrEmpty()) return new List<Insight>();
        
        // Long the ones with the highest earnings estimates and short the ones with the lowest, assuming the EPS estimate is a significant fundamental factor
        var sortedByEpsEstimate = from s in estimate.Values
                        where (s.Eps != null)
                        orderby s.Eps descending
                        select s.Symbol.Underlying;
        var longSymbols = sortedByEpsEstimate.Take(10).ToList();
        var shortSymbols = sortedByEpsEstimate.TakeLast(10).ToList();
        
        var insights = new List<Insight>();
        insights.AddRange(longSymbols.Select(symbol => new Insight(symbol, Expiry.EndOfMonth, InsightType.Price, InsightDirection.Up)));
        insights.AddRange(shortSymbols.Select(symbol => new Insight(symbol, Expiry.EndOfMonth, InsightType.Price, InsightDirection.Down)));

        // Update the rebalance time to next month start
        _time = Expiry.EndOfMonth(algorithm.Time);
        
        return insights;
    }
    
    public override void OnSecuritiesChanged(QCAlgorithm algorithm, SecurityChanges changes)
    {
        foreach(var security in changes.AddedSecurities)
        {
            // Requesting data for trading signal generation
            var consensusSymbol = algorithm.AddData<EstimizeConsensus>(security.Symbol).Symbol;
            var estimateSymbol = algorithm.AddData<EstimizeEstimate>(security.Symbol).Symbol;
            var releaseSymbol = algorithm.AddData<EstimizeRelease>(security.Symbol).Symbol;
            
            // Historical Data
            var history = algorithm.History(new[]{
                consensusSymbol,
                estimateSymbol,
                releaseSymbol
            }, 10, Resolution.Daily);
            algorithm.Debug($"We got {history.Count()} items from our history request");
        }
    }
}
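
Both examples rebalance at most once per month by storing the next allowed rebalance time. The sketch below reproduces that gate with the standard library only. It assumes, following the comments in the examples, that the next rebalance time is the start of the following month; MonthlyGate and start_of_next_month are hypothetical names, not part of the platform API.

```python
from datetime import datetime


def start_of_next_month(now: datetime) -> datetime:
    """Midnight on the first day of the month after `now`."""
    if now.month == 12:
        return datetime(now.year + 1, 1, 1)
    return datetime(now.year, now.month + 1, 1)


class MonthlyGate:
    """Hypothetical helper that allows one rebalance per calendar month."""

    def __init__(self) -> None:
        # datetime.min lets the very first check pass.
        self._next_time = datetime.min

    def ready(self, now: datetime) -> bool:
        # Mirrors the guard in the examples: skip while next_time > now.
        return now >= self._next_time

    def mark_done(self, now: datetime) -> None:
        # Block further rebalances until the next month begins.
        self._next_time = start_of_next_month(now)


gate = MonthlyGate()
now = datetime(2019, 1, 15, 9, 31)
if gate.ready(now):
    # ... emit insights or place orders here ...
    gate.mark_done(now)
```

As in the examples, call mark_done only after estimates were actually available, so the gate stays open on days when the sparse dataset delivers nothing.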

Data Point Attributes

The Estimize dataset provides EstimizeConsensus, EstimizeEstimate, and EstimizeRelease objects.

EstimizeConsensus Attributes

EstimizeConsensus objects have the following attributes:

EstimizeEstimate Attributes

EstimizeEstimate objects have the following attributes:

EstimizeRelease Attributes

EstimizeRelease objects have the following attributes:
