ExtractAlpha
Estimize
Introduction
The Estimize dataset by ExtractAlpha provides crowdsourced estimates of company financials, including EPS and revenue. The data covers over 2,800 US-listed Equities. It starts in January 2011 and is delivered at a daily frequency, but the data is sparse, so not every day contains new updates. The estimates are crowdsourced from a community of more than 100,000 contributors via the data provider’s web platform.
This dataset depends on the US Equity Security Master dataset because the US Equity Security Master dataset contains information on splits, dividends, and symbol changes.
For more information about the Estimize dataset, including CLI commands and pricing, see the dataset listing.
About the Provider
ExtractAlpha was founded by Vinesh Jha in 2013 with the goal of providing alternative data for investors. ExtractAlpha's rigorously researched data sets and quantitative stock selection models leverage unique sources and analytical techniques, allowing users to gain an investment edge.
Getting Started
The following snippet demonstrates how to request data from the Estimize dataset:
from QuantConnect.DataSource import *

# Subscribe to the underlying Equity, then attach each Estimize data type to it
self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
self.estimize_consensus_symbol = self.add_data(EstimizeConsensus, self.aapl).symbol
self.estimize_estimate_symbol = self.add_data(EstimizeEstimate, self.aapl).symbol
self.estimize_release_symbol = self.add_data(EstimizeRelease, self.aapl).symbol
using QuantConnect.DataSource;

_symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
_estimizeConsensusSymbol = AddData<EstimizeConsensus>(_symbol).Symbol;
_estimizeEstimateSymbol = AddData<EstimizeEstimate>(_symbol).Symbol;
_estimizeReleaseSymbol = AddData<EstimizeRelease>(_symbol).Symbol;
Requesting Data
To add Estimize data to your algorithm, call the AddData (C#) or add_data (Python) method. Save a reference to the dataset Symbol so you can access the data later in your algorithm.
from AlgorithmImports import *
from QuantConnect.DataSource import *

class ExtractAlphaEstimizeDataAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 6, 1)
        self.set_cash(100000)

        # Subscribe to the underlying Equity, then attach each Estimize data type to it
        self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
        self.estimize_consensus_symbol = self.add_data(EstimizeConsensus, self.aapl).symbol
        self.estimize_estimate_symbol = self.add_data(EstimizeEstimate, self.aapl).symbol
        self.estimize_release_symbol = self.add_data(EstimizeRelease, self.aapl).symbol
public class ExtractAlphaEstimizeDataAlgorithm : QCAlgorithm
{
    private Symbol _symbol, _estimizeConsensusSymbol, _estimizeEstimateSymbol, _estimizeReleaseSymbol;

    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2020, 6, 1);
        SetCash(100000);

        _symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
        _estimizeConsensusSymbol = AddData<EstimizeConsensus>(_symbol).Symbol;
        _estimizeEstimateSymbol = AddData<EstimizeEstimate>(_symbol).Symbol;
        _estimizeReleaseSymbol = AddData<EstimizeRelease>(_symbol).Symbol;
    }
}
Accessing Data
To get the current Estimize data, index the current Slice with the dataset Symbol. Slice objects deliver unique events to your algorithm as they happen, but the Slice may not contain data for your dataset at every time step. To avoid issues, check if the Slice contains the data you want before you index it.
def on_data(self, slice: Slice) -> None:
    if slice.contains_key(self.estimize_consensus_symbol):
        data_point = slice[self.estimize_consensus_symbol]
        self.log(f"{self.estimize_consensus_symbol} mean at {slice.time}: {data_point.mean}")

    if slice.contains_key(self.estimize_estimate_symbol):
        data_point = slice[self.estimize_estimate_symbol]
        self.log(f"{self.estimize_estimate_symbol} EPS at {slice.time}: {data_point.eps}")

    if slice.contains_key(self.estimize_release_symbol):
        data_point = slice[self.estimize_release_symbol]
        self.log(f"{self.estimize_release_symbol} EPS at {slice.time}: {data_point.eps}")
public override void OnData(Slice slice)
{
    if (slice.ContainsKey(_estimizeConsensusSymbol))
    {
        var dataPoint = slice[_estimizeConsensusSymbol];
        Log($"{_estimizeConsensusSymbol} mean at {slice.Time}: {dataPoint.Mean}");
    }

    if (slice.ContainsKey(_estimizeEstimateSymbol))
    {
        var dataPoint = slice[_estimizeEstimateSymbol];
        Log($"{_estimizeEstimateSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }

    if (slice.ContainsKey(_estimizeReleaseSymbol))
    {
        var dataPoint = slice[_estimizeReleaseSymbol];
        Log($"{_estimizeReleaseSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }
}
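Because the Slice may not contain Estimize data at every time step, one possible pattern is to remember the most recent data point per dataset key and read from that cache. The following is a minimal, library-free sketch of the idea. The LatestValueCache class, the plain dictionaries standing in for Slice objects, and the key strings are all hypothetical and are not part of the LEAN API.

```python
from typing import Dict, Optional


class LatestValueCache:
    """Remembers the most recent value seen for each dataset key (hypothetical helper)."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}

    def update(self, slice_data: Dict[str, float]) -> None:
        # Only keys present in this time step are overwritten;
        # keys that are missing keep their previous value.
        self._latest.update(slice_data)

    def get(self, key: str) -> Optional[float]:
        # Returns None until the first data point for the key arrives.
        return self._latest.get(key)


cache = LatestValueCache()

# Hypothetical time steps: the second one carries no consensus data.
steps = [
    {"AAPL.EstimizeConsensus": 2.10},
    {},
    {"AAPL.EstimizeConsensus": 2.25},
]

seen = []
for step in steps:
    cache.update(step)
    seen.append(cache.get("AAPL.EstimizeConsensus"))
```

In an algorithm, the update step would run inside the data handler after the containment check, so the cache never stores a missing value.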
To iterate through all of the dataset objects in the current Slice, call the Get (C#) or get (Python) method.
def on_data(self, slice: Slice) -> None:
    for dataset_symbol, data_point in slice.get(EstimizeConsensus).items():
        self.log(f"{dataset_symbol} mean at {slice.time}: {data_point.mean}")

    for dataset_symbol, data_point in slice.get(EstimizeEstimate).items():
        self.log(f"{dataset_symbol} EPS at {slice.time}: {data_point.eps}")

    for dataset_symbol, data_point in slice.get(EstimizeRelease).items():
        self.log(f"{dataset_symbol} EPS at {slice.time}: {data_point.eps}")
public override void OnData(Slice slice)
{
    foreach (var kvp in slice.Get<EstimizeConsensus>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} mean at {slice.Time}: {dataPoint.Mean}");
    }

    foreach (var kvp in slice.Get<EstimizeEstimate>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }

    foreach (var kvp in slice.Get<EstimizeRelease>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} EPS at {slice.Time}: {dataPoint.Eps}");
    }
}
Historical Data
To get historical Estimize data, call the History (C#) or history (Python) method with the dataset Symbol. If there is no data in the period you request, the history result is empty.
# DataFrames
consensus_history_df = self.history(self.estimize_consensus_symbol, 100, Resolution.DAILY)
estimate_history_df = self.history(self.estimize_estimate_symbol, 100, Resolution.DAILY)
release_history_df = self.history(self.estimize_release_symbol, 100, Resolution.DAILY)
history_df = self.history([
    self.estimize_consensus_symbol,
    self.estimize_estimate_symbol,
    self.estimize_release_symbol], 100, Resolution.DAILY)

# Dataset objects
consensus_history_bars = self.history[EstimizeConsensus](self.estimize_consensus_symbol, 100, Resolution.DAILY)
estimate_history_bars = self.history[EstimizeEstimate](self.estimize_estimate_symbol, 100, Resolution.DAILY)
release_history_bars = self.history[EstimizeRelease](self.estimize_release_symbol, 100, Resolution.DAILY)
// Dataset objects
var consensusHistory = History<EstimizeConsensus>(_estimizeConsensusSymbol, 100, Resolution.Daily);
var estimateHistory = History<EstimizeEstimate>(_estimizeEstimateSymbol, 100, Resolution.Daily);
var releaseHistory = History<EstimizeRelease>(_estimizeReleaseSymbol, 100, Resolution.Daily);

// Slice objects
var history = History(new[]{_estimizeConsensusSymbol, _estimizeEstimateSymbol, _estimizeReleaseSymbol}, 10, Resolution.Daily);
For more information about historical data, see History Requests.
Remove Subscriptions
To remove a subscription, call the RemoveSecurity (C#) or remove_security (Python) method.
self.remove_security(self.estimize_consensus_symbol)
self.remove_security(self.estimize_estimate_symbol)
self.remove_security(self.estimize_release_symbol)
RemoveSecurity(_estimizeConsensusSymbol);
RemoveSecurity(_estimizeEstimateSymbol);
RemoveSecurity(_estimizeReleaseSymbol);
If you subscribe to Estimize data for assets in a dynamic universe, remove the dataset subscription when the asset leaves your universe. To view a common design pattern, see Track Security Changes.
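The bookkeeping behind this pattern can be sketched without the LEAN engine: keep a map from each asset to the dataset symbols you created for it, and pop the entry when the asset leaves the universe. The DatasetSubscriptionTracker class and the string symbols below are hypothetical stand-ins. In a real algorithm, each symbol returned by on_removed would be passed to remove_security.

```python
from typing import Dict, List


class DatasetSubscriptionTracker:
    """Maps each asset to the dataset symbols subscribed for it (hypothetical helper)."""

    def __init__(self) -> None:
        self._by_asset: Dict[str, List[str]] = {}

    def on_added(self, asset: str, dataset_symbols: List[str]) -> None:
        # Store a copy so later changes to the caller's list don't affect the tracker.
        self._by_asset[asset] = list(dataset_symbols)

    def on_removed(self, asset: str) -> List[str]:
        # Return the dataset symbols to unsubscribe; an empty list if the asset
        # was never tracked.
        return self._by_asset.pop(asset, [])

    def tracked_assets(self) -> List[str]:
        return sorted(self._by_asset)


tracker = DatasetSubscriptionTracker()
tracker.on_added("AAPL", ["AAPL.EstimizeConsensus", "AAPL.EstimizeEstimate", "AAPL.EstimizeRelease"])
tracker.on_added("MSFT", ["MSFT.EstimizeConsensus", "MSFT.EstimizeEstimate", "MSFT.EstimizeRelease"])

# AAPL leaves the universe: these are the subscriptions to remove.
to_remove = tracker.on_removed("AAPL")
```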
Example Applications
The Estimize dataset enables you to estimate a company's financial data more accurately when searching for alpha. Examples include the following use cases:
- Fundamental estimates for ML regression/classification models
- Arbitrage/sentiment trading on market “surprises” relative to ordinary expectations, using the dataset's estimates as the better expectation
- Using industry-specific KPIs to predict the returns of individual sectors
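As an illustration of the “surprise” use case, the sketch below measures how far a reported EPS figure lands from a prior consensus. The earnings_surprise function and its formula (the difference scaled by the absolute consensus) are a common convention assumed here for illustration. They are not defined by the dataset, and the input numbers are made up.

```python
from typing import Optional


def earnings_surprise(actual_eps: float, consensus_eps: float) -> Optional[float]:
    """Relative surprise of the reported EPS versus the consensus (assumed convention).

    Returns None when the consensus is zero, because the ratio is undefined.
    """
    if consensus_eps == 0:
        return None
    # Scale by the absolute consensus so the sign reflects beat (+) or miss (-),
    # even when the consensus itself is negative.
    return (actual_eps - consensus_eps) / abs(consensus_eps)


beat = earnings_surprise(1.10, 1.00)               # reported above consensus
miss = earnings_surprise(0.90, 1.00)               # reported below consensus
smaller_loss = earnings_surprise(-0.50, -1.00)     # loss was smaller than expected
undefined = earnings_surprise(0.25, 0.0)           # zero consensus
```

A positive value indicates a beat and a negative value indicates a miss. How such a value maps to a trading signal is a separate design decision.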
Classic Algorithm Example
The following example algorithm creates a dynamic universe of the 500 most liquid US Equities. Each month, the algorithm forms an equal-weighted dollar-neutral portfolio with the 10 companies with the highest EPS estimate and the 10 companies with the lowest EPS estimate.
from AlgorithmImports import *
from QuantConnect.DataSource import *

class ExtractAlphaEstimizeAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 12, 31)
        self.set_cash(100000)

        # A variable to control the next rebalance time
        self.last_time = datetime.min

        self.add_universe(self.my_coarse_filter_function)
        self.universe_settings.resolution = Resolution.MINUTE

    def my_coarse_filter_function(self, coarse: List[CoarseFundamental]) -> List[Symbol]:
        # Select the non-penny stocks with the highest dollar volume, since they have more stable prices (lower risk)
        # and more informed insights from high market activity
        sorted_by_dollar_volume = sorted(
            [x for x in coarse if x.has_fundamental_data and x.price > 4],
            key=lambda x: x.dollar_volume, reverse=True)
        selected = [x.symbol for x in sorted_by_dollar_volume[:500]]
        return selected

    def on_data(self, slice: Slice) -> None:
        if self.last_time > self.time:
            return

        # Access the Estimize data to collect the crowdsourced insights as trading signals
        consensus = slice.get(EstimizeConsensus)
        estimate = slice.get(EstimizeEstimate)
        release = slice.get(EstimizeRelease)

        if not estimate:
            return

        # Long the ones with the highest earnings estimates and short the ones with the lowest earnings estimates,
        # assuming the fundamental factor is significant
        sorted_by_eps_estimate = sorted([x for x in estimate.items() if x[1].eps], key=lambda x: x[1].eps)
        long_symbols = [x[0].underlying for x in sorted_by_eps_estimate[-10:]]
        short_symbols = [x[0].underlying for x in sorted_by_eps_estimate[:10]]

        # Liquidate the ones that fall out of the earnings extremes
        for symbol in [x.symbol for x in self.portfolio.values() if x.invested]:
            if symbol not in long_symbols + short_symbols:
                self.liquidate(symbol)

        # Invest equally and dollar-neutral to evenly dissipate capital risk and hedge systematic risk
        long_targets = [PortfolioTarget(symbol, 0.05) for symbol in long_symbols]
        short_targets = [PortfolioTarget(symbol, -0.05) for symbol in short_symbols]
        self.set_holdings(long_targets + short_targets)

        # Update the rebalance time to the start of next month
        self.last_time = Expiry.END_OF_MONTH(self.time)

    def on_securities_changed(self, changes: SecurityChanges) -> None:
        for security in changes.added_securities:
            # Request the data for trading signal generation
            estimize_consensus_symbol = self.add_data(EstimizeConsensus, security.symbol).symbol
            estimize_estimate_symbol = self.add_data(EstimizeEstimate, security.symbol).symbol
            estimize_release_symbol = self.add_data(EstimizeRelease, security.symbol).symbol

            # Historical data
            history = self.history([estimize_consensus_symbol,
                                    estimize_estimate_symbol,
                                    estimize_release_symbol], 10, Resolution.DAILY)
            self.debug(f"We got {len(history)} items from our history request")
public class ExtractAlphaEstimizeAlgorithm : QCAlgorithm
{
    // A variable to control the next rebalance time
    private DateTime _time = DateTime.MinValue;

    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2020, 12, 31);
        SetCash(100000);

        AddUniverse(MyCoarseFilterFunction);
        UniverseSettings.Resolution = Resolution.Minute;
    }

    private IEnumerable<Symbol> MyCoarseFilterFunction(IEnumerable<CoarseFundamental> coarse)
    {
        // Select the non-penny stocks with the highest dollar volume, since they have more stable prices (lower risk)
        // and more informed insights from high market activity
        return (from c in coarse
                where c.HasFundamentalData && c.Price > 4
                orderby c.DollarVolume descending
                select c.Symbol).Take(500);
    }

    public override void OnData(Slice slice)
    {
        if (_time > Time) return;

        // Access the Estimize data to collect the crowdsourced insights as trading signals
        var consensus = slice.Get<EstimizeConsensus>();
        var estimate = slice.Get<EstimizeEstimate>();
        var release = slice.Get<EstimizeRelease>();

        if (estimate.IsNullOrEmpty()) return;

        // Long the ones with the highest earnings estimates and short the ones with the lowest earnings estimates,
        // assuming the fundamental factor is significant
        var sortedByEpsEstimate = from value in estimate.Values
                                  where value.Eps.HasValue
                                  orderby value.Eps descending
                                  select value.Symbol.Underlying;
        var longSymbols = sortedByEpsEstimate.Take(10).ToList();
        var shortSymbols = sortedByEpsEstimate.TakeLast(10).ToList();

        // Liquidate the ones that fall out of the earnings extremes
        foreach (var kvp in Portfolio)
        {
            var symbol = kvp.Key;
            if (kvp.Value.Invested && !longSymbols.Contains(symbol) && !shortSymbols.Contains(symbol))
            {
                Liquidate(symbol);
            }
        }

        // Invest equally and dollar-neutral to evenly dissipate capital risk and hedge systematic risk
        var targets = new List<PortfolioTarget>();
        targets.AddRange(longSymbols.Select(symbol => new PortfolioTarget(symbol, 0.05m)));
        targets.AddRange(shortSymbols.Select(symbol => new PortfolioTarget(symbol, -0.05m)));
        SetHoldings(targets);

        // Update the rebalance time to the start of next month
        _time = Expiry.EndOfMonth(Time);
    }

    public override void OnSecuritiesChanged(SecurityChanges changes)
    {
        foreach (var security in changes.AddedSecurities)
        {
            // Request the data for trading signal generation
            var consensusSymbol = AddData<EstimizeConsensus>(security.Symbol).Symbol;
            var estimateSymbol = AddData<EstimizeEstimate>(security.Symbol).Symbol;
            var releaseSymbol = AddData<EstimizeRelease>(security.Symbol).Symbol;

            // Historical data
            var history = History(new[]{ consensusSymbol, estimateSymbol, releaseSymbol }, 10, Resolution.Daily);
            Debug($"We got {history.Count()} items from our history request");
        }
    }
}
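The ranking step of the example above can be isolated from the engine and tested on its own. The following is a minimal sketch using plain dictionaries instead of Estimize objects. The long_short_targets function, its parameters, and the sample tickers are hypothetical. Unlike the example algorithm, this sketch keeps estimates equal to zero and caps the leg size at half the ranked list so that no symbol lands in both legs when few estimates are available.

```python
from typing import Dict, Optional


def long_short_targets(
    eps_by_symbol: Dict[str, Optional[float]],
    n: int = 10,
    weight: float = 0.05,
) -> Dict[str, float]:
    """Return symbol -> target weight for an equal-weighted long/short portfolio."""
    # Drop symbols without an estimate, then rank ascending by EPS estimate.
    ranked = sorted(
        (s for s, eps in eps_by_symbol.items() if eps is not None),
        key=lambda s: eps_by_symbol[s],
    )
    # Assumption of this sketch: shrink the legs so that they never overlap.
    n = min(n, len(ranked) // 2)

    shorts = ranked[:n]                   # lowest estimates
    longs = ranked[len(ranked) - n:]      # highest estimates

    targets = {s: weight for s in longs}
    targets.update({s: -weight for s in shorts})
    return targets


# Made-up estimates for illustration only.
estimates = {"A": 1.0, "B": 3.0, "C": None, "D": -2.0, "E": 0.5}
targets = long_short_targets(estimates, n=1)
capped = long_short_targets(estimates, n=10)
```

With equal positive and negative weights and equally sized legs, the target weights sum to zero, which is what makes the portfolio dollar-neutral.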
Framework Algorithm Example
The following example algorithm creates a dynamic universe of the 500 most liquid US Equities. Each month, the algorithm forms an equal-weighted dollar-neutral portfolio with the 10 companies with the highest EPS estimate and the 10 companies with the lowest EPS estimate.
from AlgorithmImports import *
from QuantConnect.DataSource import *

class ExtractAlphaEstimizeFrameworkAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2019, 1, 1)
        self.set_end_date(2020, 12, 31)
        self.set_cash(100000)

        self.add_universe(self.my_coarse_filter_function)
        self.universe_settings.resolution = Resolution.MINUTE

        # Custom alpha model that emits signals according to the Estimize data
        self.add_alpha(ExtractAlphaEstimizeAlphaModel())

        # Invest equally to evenly dissipate capital concentration risk
        self.set_portfolio_construction(EqualWeightingPortfolioConstructionModel())

        self.set_execution(ImmediateExecutionModel())

    def my_coarse_filter_function(self, coarse: List[CoarseFundamental]) -> List[Symbol]:
        # Select the non-penny stocks with the highest dollar volume, since they have more stable prices (lower risk)
        # and more informed insights from high market activity
        sorted_by_dollar_volume = sorted(
            [x for x in coarse if x.has_fundamental_data and x.price > 4],
            key=lambda x: x.dollar_volume, reverse=True)
        selected = [x.symbol for x in sorted_by_dollar_volume[:500]]
        return selected

class ExtractAlphaEstimizeAlphaModel(AlphaModel):

    def __init__(self) -> None:
        # A variable to control the next rebalance time
        self.time = datetime.min

    def update(self, algorithm: QCAlgorithm, slice: Slice) -> List[Insight]:
        if self.time > algorithm.time:
            return []

        # Access the Estimize data to collect the crowdsourced insights as trading signals
        consensus = slice.get(EstimizeConsensus)
        estimate = slice.get(EstimizeEstimate)
        release = slice.get(EstimizeRelease)

        if not estimate:
            return []

        # Long the ones with the highest earnings estimates and short the ones with the lowest earnings estimates,
        # assuming the fundamental factor is significant
        sorted_by_eps_estimate = sorted([x for x in estimate.items() if x[1].eps], key=lambda x: x[1].eps)
        long_symbols = [x[0].underlying for x in sorted_by_eps_estimate[-10:]]
        short_symbols = [x[0].underlying for x in sorted_by_eps_estimate[:10]]

        insights = []
        for symbol in long_symbols:
            insights.append(Insight.price(symbol, Expiry.END_OF_MONTH, InsightDirection.UP))
        for symbol in short_symbols:
            insights.append(Insight.price(symbol, Expiry.END_OF_MONTH, InsightDirection.DOWN))

        # Update the rebalance time to the start of next month
        self.time = Expiry.END_OF_MONTH(algorithm.time)

        return insights

    def on_securities_changed(self, algorithm: QCAlgorithm, changes: SecurityChanges) -> None:
        for security in changes.added_securities:
            # Request the data for trading signal generation
            estimize_consensus_symbol = algorithm.add_data(EstimizeConsensus, security.symbol).symbol
            estimize_estimate_symbol = algorithm.add_data(EstimizeEstimate, security.symbol).symbol
            estimize_release_symbol = algorithm.add_data(EstimizeRelease, security.symbol).symbol

            # Historical data
            history = algorithm.history([estimize_consensus_symbol,
                                         estimize_estimate_symbol,
                                         estimize_release_symbol], 10, Resolution.DAILY)
            algorithm.debug(f"We got {len(history)} items from our history request")
public class ExtractAlphaEstimizeFrameworkAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2019, 1, 1);
        SetEndDate(2021, 1, 1);
        SetCash(100000);

        AddUniverse(MyCoarseFilterFunction);
        UniverseSettings.Resolution = Resolution.Minute;

        // Custom alpha model that emits signals according to the Estimize data
        AddAlpha(new ExtractAlphaEstimizeAlphaModel());

        // Invest equally to evenly dissipate capital concentration risk
        SetPortfolioConstruction(new EqualWeightingPortfolioConstructionModel());

        SetExecution(new ImmediateExecutionModel());
    }

    private IEnumerable<Symbol> MyCoarseFilterFunction(IEnumerable<CoarseFundamental> coarse)
    {
        // Select the non-penny stocks with the highest dollar volume, since they have more stable prices (lower risk)
        // and more informed insights from high market activity
        return (from c in coarse
                where c.HasFundamentalData && c.Price > 4
                orderby c.DollarVolume descending
                select c.Symbol).Take(500);
    }
}

public class ExtractAlphaEstimizeAlphaModel : AlphaModel
{
    // A variable to control the next rebalance time
    public DateTime _time;

    public ExtractAlphaEstimizeAlphaModel()
    {
        _time = DateTime.MinValue;
    }

    public override IEnumerable<Insight> Update(QCAlgorithm algorithm, Slice slice)
    {
        if (_time > algorithm.Time) return new List<Insight>();

        // Access the Estimize data to collect the crowdsourced insights as trading signals
        var consensus = slice.Get<EstimizeConsensus>();
        var estimate = slice.Get<EstimizeEstimate>();
        var release = slice.Get<EstimizeRelease>();

        if (estimate.IsNullOrEmpty()) return new List<Insight>();

        // Long the ones with the highest earnings estimates and short the ones with the lowest earnings estimates,
        // assuming the fundamental factor is significant
        var sortedByEpsEstimate = from s in estimate.Values
                                  where s.Eps.HasValue
                                  orderby s.Eps descending
                                  select s.Symbol.Underlying;
        var longSymbols = sortedByEpsEstimate.Take(10).ToList();
        var shortSymbols = sortedByEpsEstimate.TakeLast(10).ToList();

        var insights = new List<Insight>();
        insights.AddRange(longSymbols.Select(symbol =>
            new Insight(symbol, Expiry.EndOfMonth, InsightType.Price, InsightDirection.Up)));
        insights.AddRange(shortSymbols.Select(symbol =>
            new Insight(symbol, Expiry.EndOfMonth, InsightType.Price, InsightDirection.Down)));

        // Update the rebalance time to the start of next month
        _time = Expiry.EndOfMonth(algorithm.Time);

        return insights;
    }

    public override void OnSecuritiesChanged(QCAlgorithm algorithm, SecurityChanges changes)
    {
        foreach (var security in changes.AddedSecurities)
        {
            // Request the data for trading signal generation
            var consensusSymbol = algorithm.AddData<EstimizeConsensus>(security.Symbol).Symbol;
            var estimateSymbol = algorithm.AddData<EstimizeEstimate>(security.Symbol).Symbol;
            var releaseSymbol = algorithm.AddData<EstimizeRelease>(security.Symbol).Symbol;

            // Historical data
            var history = algorithm.History(new[]{ consensusSymbol, estimateSymbol, releaseSymbol }, 10, Resolution.Daily);
            algorithm.Debug($"We got {history.Count()} items from our history request");
        }
    }
}
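Both examples gate their logic so that it runs at most once per month, moving the next rebalance time to the start of the following month. The sketch below reproduces that gate with the standard library only. The next_month_start function and MonthlyGate class are hypothetical names, and the sketch assumes, as the example comments state, that the rebalance time moves to the first day of the next month. Unlike the examples, which advance the time only after Estimize data arrives, this gate advances every time it fires.

```python
from datetime import datetime


def next_month_start(now: datetime) -> datetime:
    """First instant of the month after `now`."""
    if now.month == 12:
        return datetime(now.year + 1, 1, 1)
    return datetime(now.year, now.month + 1, 1)


class MonthlyGate:
    """Lets a caller through at most once per calendar month (hypothetical helper)."""

    def __init__(self) -> None:
        # datetime.min guarantees that the first call passes the gate.
        self._next_rebalance = datetime.min

    def should_rebalance(self, now: datetime) -> bool:
        if self._next_rebalance > now:
            return False
        self._next_rebalance = next_month_start(now)
        return True


gate = MonthlyGate()
results = [
    gate.should_rebalance(datetime(2019, 1, 15)),  # first call passes
    gate.should_rebalance(datetime(2019, 1, 20)),  # same month, blocked
    gate.should_rebalance(datetime(2019, 2, 1)),   # next month begins, passes
]
```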
Data Point Attributes
The Estimize dataset provides EstimizeConsensus, EstimizeEstimate, and EstimizeRelease objects.
EstimizeConsensus Attributes
EstimizeConsensus objects have the following attributes:
EstimizeEstimate Attributes
EstimizeEstimate objects have the following attributes:
EstimizeRelease Attributes
EstimizeRelease objects have the following attributes: