Brain
Brain Wikipedia Page Views
Introduction
The Brain Wikipedia Page Views dataset by Brain tracks investor attention toward large U.S. equities by monitoring traffic on company Wikipedia pages. The dataset covers approximately 1,000 of the largest U.S. companies, broadly corresponding to the Russell 1000, and is available starting from February 2023. Data is delivered at a daily frequency.
The dataset is constructed by analyzing the number of views on each company’s Wikipedia page, including both the main page and any secondary pages that redirect to it. Only English-language Wikipedia pages are considered. The data is published daily, typically by 12:00 PM UTC, and reflects attention metrics calculated relative to the reporting date.
In addition to raw page-view counts, the dataset includes buzz metrics, which normalize current attention levels relative to historical behavior. Buzz scores are computed as the distance between current page views and the company’s own historical average over the previous six months, expressed in units of standard deviations. As a result, positive buzz values indicate unusually high attention, while negative values indicate lower-than-normal attention.
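As described, a buzz score is essentially a z-score of current views against a trailing six-month window. The following sketch illustrates the idea; the provider's exact window length and any smoothing are assumptions, not published details:

```python
import statistics

def buzz_score(current_views: float, trailing_views: list[float]) -> float:
    """Z-score of today's page views against a trailing window,
    e.g. roughly six months of daily observations."""
    mean = statistics.fmean(trailing_views)
    stdev = statistics.stdev(trailing_views)
    if stdev == 0:
        return 0.0
    return (current_views - mean) / stdev

# Views far above the historical average yield a large positive buzz score.
history = [100, 110, 90, 105, 95, 100]
print(round(buzz_score(200, history), 2))  # → 14.14
```

A score near zero means attention is in line with the stock's own norm; the sign alone separates unusually high from unusually low attention.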
Metrics are provided across multiple time horizons, including 1-day, 7-day, and 30-day rolling windows, enabling users to study short-term spikes in attention as well as more persistent trends. These features make the dataset well-suited for research into investor attention, behavioral finance signals, event-driven strategies, and alpha models that incorporate information demand.
This dataset depends on the US Equity Security Master dataset, which provides essential information such as symbol mappings, corporate actions, and identifier consistency required for accurate historical analysis and backtesting.
For more information about the Brain Wikipedia Page Views dataset, including CLI commands and pricing, see the dataset listing.
About the Provider
Brain is a research company that creates proprietary datasets and algorithms for investment strategies, combining experience in financial markets with strong competencies in statistics, machine learning, and natural language processing. The founders share a common academic background in physics research as well as extensive experience in financial markets.
Getting Started
The following snippet demonstrates how to request data from the Brain Wikipedia Page Views dataset:
self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
self.dataset_symbol = self.add_data(BrainWikipediaPageViews, self.aapl).symbol
_symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
_datasetSymbol = AddData<BrainWikipediaPageViews>(_symbol).Symbol;
Data Summary
The following table describes the dataset properties:
| Property | Value |
|---|---|
| Start Date | February 2023 |
| Asset Coverage* | ~1,000 US Equities (approximately Russell 1000) |
| Data Density | Sparse |
| Resolution | Daily |
| Timezone | UTC |
* The coverage includes the largest U.S. equities since the start date and may evolve over time.
Requesting Data
To add Brain Wikipedia Page Views data to your algorithm, call the add_data (C#: AddData) method. Save a reference to the dataset Symbol so you can access the data later in your algorithm.
class BrainWikipediaPageViewsAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2021, 1, 1)
        self.set_end_date(2021, 7, 8)
        self.set_cash(100000)
        self.aapl = self.add_equity("AAPL", Resolution.DAILY).symbol
        self.dataset_symbol = self.add_data(BrainWikipediaPageViews, self.aapl).symbol
public class BrainWikipediaPageViewsAlgorithm : QCAlgorithm
{
    private Symbol _symbol;
    private Symbol _datasetSymbol;

    public override void Initialize()
    {
        SetStartDate(2021, 1, 1);
        SetEndDate(2021, 7, 8);
        SetCash(100000);
        _symbol = AddEquity("AAPL", Resolution.Daily).Symbol;
        _datasetSymbol = AddData<BrainWikipediaPageViews>(_symbol).Symbol;
    }
}
Accessing Data
To get the current Brain Wikipedia Page Views data, index the current Slice with the dataset Symbol. Slice objects deliver unique events to your algorithm as they happen, but the Slice may not contain data for your dataset at every time step. To avoid issues, check whether the Slice contains the data you want before you index it.
def on_data(self, slice: Slice) -> None:
    if slice.contains_key(self.dataset_symbol):
        data_point = slice[self.dataset_symbol]
        self.log(f"{self.dataset_symbol} page views at {slice.time}: {data_point.NumberViews1}, {data_point.Buzz1}")
public override void OnData(Slice slice)
{
    if (slice.ContainsKey(_datasetSymbol))
    {
        var dataPoint = slice[_datasetSymbol];
        Log($"{_datasetSymbol} page views at {slice.Time}: {dataPoint.NumberViews1}, {dataPoint.Buzz1}");
    }
}
To iterate through all of the dataset objects in the current Slice, call the get (C#: Get<BrainWikipediaPageViews>) method.
def on_data(self, slice: Slice) -> None:
    for dataset_symbol, data_point in slice.get(BrainWikipediaPageViews).items():
        self.log(f"{dataset_symbol} metric at {slice.time}: {data_point.NumberViews1}, {data_point.Buzz1}")
public override void OnData(Slice slice)
{
    foreach (var kvp in slice.Get<BrainWikipediaPageViews>())
    {
        var datasetSymbol = kvp.Key;
        var dataPoint = kvp.Value;
        Log($"{datasetSymbol} metric at {slice.Time}: {dataPoint.NumberViews1}, {dataPoint.Buzz1}");
    }
}
Historical Data
To get historical Brain Wikipedia Page Views data, call the history (C#: History<BrainWikipediaPageViews>) method with the dataset Symbol. If there is no data in the period you request, the history result is empty.
history_df = self.history(self.dataset_symbol, 100, Resolution.DAILY)
var history = History<BrainWikipediaPageViews>(_datasetSymbol, 100, Resolution.Daily);
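In Python the history request returns a pandas DataFrame with a (symbol, time) MultiIndex. The sketch below shows one way to smooth the raw counts into a rolling attention trend; the frame here is constructed by hand, and the `numberviews1` column name is an assumption based on the research snippet later in this page:

```python
import pandas as pd

# Hypothetical frame shaped like a QuantConnect history result:
# a (symbol, time) MultiIndex with a 'numberviews1' column (name assumed).
idx = pd.MultiIndex.from_product(
    [["AAPL.BrainWikipediaPageViews"], pd.date_range("2024-01-01", periods=5)],
    names=["symbol", "time"],
)
history_df = pd.DataFrame({"numberviews1": [100.0, 120.0, 90.0, 300.0, 110.0]}, index=idx)

# A per-symbol 3-day rolling mean separates persistent attention from one-day spikes.
rolling = (
    history_df["numberviews1"]
    .groupby(level="symbol")
    .rolling(3)
    .mean()
)
print(rolling.dropna().round(2).tolist())  # → [103.33, 170.0, 166.67]
```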
Example Applications
The Brain Wikipedia Page Views dataset enables you to incorporate investor attention signals derived from Wikipedia traffic into your strategies. Examples include the following strategies:
- Buying securities that experience unusually high spikes in Wikipedia page views, indicating elevated public interest.
- Avoiding or reducing exposure to securities with declining or abnormally low attention relative to their historical baseline.
- Scaling position sizes based on short-term or long-term attention intensity using page view and buzz metrics.
- Building cross-sectional ranking strategies that favor stocks with sustained increases in public visibility.
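The example algorithms below convert a cross-sectional score ranking into portfolio weights by assigning rank n to the best name down to 1 for the worst and normalizing by the rank sum. That arithmetic can be isolated as a small helper (a sketch of the same scheme, not part of the dataset API):

```python
def rank_weights(scores: dict[str, float]) -> dict[str, float]:
    """Linearly decreasing weights: the best-ranked symbol gets n/denom,
    the worst gets 1/denom, where denom = n*(n+1)/2 so weights sum to 1."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    denom = n * (n + 1) / 2
    return {sym: (n - i) / denom for i, sym in enumerate(ranked)}

weights = rank_weights({"AAPL": 3.2, "TSLA": 5.1, "KO": 0.4})
# TSLA (best score) gets 3/6 = 0.5, AAPL 2/6, KO 1/6.
print(weights)
```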
Classic Algorithm Example
The following example algorithm ranks large-cap U.S. stocks daily using Brain Wikipedia Page Views attention and buzz metrics, then allocates portfolio weights proportionally to each stock’s relative attention score.
from AlgorithmImports import *
import math
class BrainWikipediaPageViewsRankingAlgorithm(QCAlgorithm):
def initialize(self) -> None:
self.set_start_date(2025, 1, 1)
self.set_end_date(2025, 11, 23)
self.set_cash(100000)
# Seed prices so we don't get "no price" trading errors
self.settings.seed_initial_prices = True
# Five large-cap, high-volume stocks; liquid names tend to have more reliable attention data
tickers = ["AAPL", "TSLA", "MSFT", "F", "KO"]
for ticker in tickers:
# Request the Wikipedia page views data for each equity
equity = self.add_equity(ticker, Resolution.DAILY)
equity.dataset_symbol = self.add_data(BrainWikipediaPageViews, equity.symbol).symbol
# Historical data
history = self.history(equity.dataset_symbol, 365, Resolution.DAILY)
self.debug(f"We got {len(history)} items from our history request for {equity.symbol}")
def on_data(self, slice: Slice) -> None:
points = slice.get(BrainWikipediaPageViews)
if not points:
return
scores = {}
for point in points.values():
# Example composite signal:
number_of_views = point.NumberViews1 if point.NumberViews1 else 0
buzz1day = point.Buzz1 if point.Buzz1 else 0
symbol = point.symbol.underlying
scores[symbol] = buzz1day * math.sqrt(number_of_views)
if len(scores) == 0:
return
# Rank by score (higher score = better)
ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)
# Convert ranks to linearly decreasing weights
ranks = list(range(len(ranked), 0, -1)) # best gets highest rank number
denom = sum(ranks)
top = ", ".join([f"{sym.value}:{sc:.4f}" for sym, sc in ranked[:3]])
self.debug(f"{self.time.date()} top: {top}")
# Allocate weights by rank: the higher the attention score, the larger the weight
for i, (sym, sc) in enumerate(ranked):
w = ranks[i] / denom
self.set_holdings(sym, w)
public class BrainWikipediaPageViewsRankingAlgorithm : QCAlgorithm
{
public override void Initialize()
{
SetStartDate(2025, 1, 1);
SetEndDate(2025, 11, 23);
SetCash(100000);
// Seed prices so we don't get "no price" trading errors
Settings.SeedInitialPrices = true;
// Five large-cap, high-volume stocks; liquid names tend to have more reliable attention data
var tickers = new[] { "AAPL", "TSLA", "MSFT", "F", "KO" };
foreach (var ticker in tickers)
{
// Request the Wikipedia page views data for each equity
var symbol = AddEquity(ticker, Resolution.Daily).Symbol;
var datasetSymbol = AddData<BrainWikipediaPageViews>(symbol).Symbol;
// Historical data (request dataset symbol)
var history = History(datasetSymbol, 365, Resolution.Daily);
Debug($"We got {history.Count()} items from our history request for {symbol}");
}
}
public override void OnData(Slice slice)
{
// Get all BrainWikipediaPageViews points in the slice
var points = slice.Get<BrainWikipediaPageViews>();
if (points == null || points.Count == 0)
{
return;
}
var ranked = points.Values.ToDictionary(
x => x.Symbol.Underlying,
x =>
{
// Example composite signal:
var numberOfViews = x.NumberViews1 ?? 0m;
var buzz1Day = x.Buzz1 ?? 0m;
return buzz1Day * (decimal)Math.Sqrt((double)numberOfViews);
})
// Rank by score (higher score = better)
.OrderByDescending(x => x.Value).ToList();
// Convert ranks to linearly decreasing weights
// best gets highest rank number
var n = ranked.Count;
var ranks = Enumerable.Range(1, n).Reverse().ToArray(); // n, n-1, ..., 1
var denom = ranks.Sum();
var top = string.Join(", ", ranked.Take(3).Select(x => $"{x.Key.Value}:{x.Value:0.0000}"));
Debug($"{Time.Date:yyyy-MM-dd} top: {top}");
// Allocate weights by rank: the higher the attention score, the larger the weight
for (var i = 0; i < ranked.Count; i++)
{
var w = (decimal)ranks[i] / denom;
SetHoldings(ranked[i].Key, w);
}
}
}
Framework Algorithm Example
The following example algorithm creates a dynamic universe of liquid U.S. equities, ranks them daily using Brain Wikipedia Page Views attention and buzz metrics, and emits weighted bullish insights for the most attention-driven stocks using the Alpha Framework.
from AlgorithmImports import *
import math
class BrainWikipediaPageViewsFrameworkRankingAlgorithm(QCAlgorithm):
def initialize(self) -> None:
self.set_start_date(2025, 1, 1)
self.set_end_date(2025, 11, 23)
self.set_cash(100000)
self.universe_settings.resolution = Resolution.DAILY
# Universe: pick liquid names
self.add_universe(self._select)
alpha = BrainWikipediaPageViewsRankingAlpha()
self.add_alpha(alpha)
self.schedule.on(
self.date_rules.every_day(),
self.time_rules.at(9, 0),
alpha.emit_insights
)
self.set_portfolio_construction(InsightWeightingPortfolioConstructionModel())
self.add_risk_management(NullRiskManagementModel())
self.set_execution(ImmediateExecutionModel())
self.set_warmup(30, Resolution.DAILY)
def _select(self, fundamental: List[Fundamental]) -> List[Symbol]:
# Liquid universe
filtered = [c for c in fundamental if c.market_cap and c.price > 10]
filtered.sort(key=lambda c: c.dollar_volume, reverse=True)
# Keep it manageable
return [c.symbol for c in filtered[:50]]
class BrainWikipediaPageViewsRankingAlpha(AlphaModel):
def __init__(self):
self.symbol_data_by_symbol = {}
self.symbol_by_dataset_symbol = {}
# Cache latest score per equity symbol
self.latest_scores = {} # Symbol -> float
self.last_score_date = None # date of the latest cache refresh (algorithm.time.date())
# Store algorithm reference for scheduled emission
self.algorithm = None
def update(self, algorithm: QCAlgorithm, slice: Slice) -> List[Insight]:
self.algorithm = algorithm
points = slice.get(BrainWikipediaPageViews)
if points is None:
return []
for point in points.values():
# Map dataset symbol -> equity symbol
if point.symbol not in self.symbol_by_dataset_symbol:
continue
sym = self.symbol_by_dataset_symbol[point.symbol]
number_of_views = point.NumberViews1 if point.NumberViews1 else 0
buzz1day = point.Buzz1 if point.Buzz1 else 0
score = buzz1day + math.sqrt(number_of_views)
self.latest_scores[sym] = score
self.last_score_date = algorithm.time.date()
return []
def emit_insights(self) -> None:
algorithm = self.algorithm
if not algorithm:
return
if self.last_score_date != algorithm.time.date():
return
paired = []
for sym, sc in self.latest_scores.items():
sec = algorithm.securities.get(sym)
if sec is None or not sec.has_data or sec.price is None or sec.price <= 0:
continue
if sc == 0:
continue
paired.append((sym, sc))
if not paired:
return
# Rank (higher is better)
paired.sort(key=lambda x: x[1], reverse=True)
n = len(paired)
ranks = list(range(n, 0, -1)) # best gets largest rank
denom = sum(ranks)
insights = []
for i, (sym, sc) in enumerate(paired):
weight = ranks[i] / denom
insights.append(
Insight.price(sym, timedelta(days=7), InsightDirection.UP, None, None, None, weight)
)
# Emit insights into the framework
algorithm.emit_insights(insights)
def on_securities_changed(self, algorithm: QCAlgorithm, changes: SecurityChanges) -> None:
for security in changes.added_securities:
symbol = security.symbol
if symbol in self.symbol_data_by_symbol:
continue
sd = SymbolData(algorithm, symbol)
self.symbol_data_by_symbol[symbol] = sd
self.symbol_by_dataset_symbol[sd.dataset_symbol] = symbol
for security in changes.removed_securities:
symbol = security.symbol
sd = self.symbol_data_by_symbol.pop(symbol, None)
if sd is not None:
sd.dispose()
to_remove = None
for ds_sym, eq_sym in self.symbol_by_dataset_symbol.items():
if eq_sym == symbol:
to_remove = ds_sym
break
if to_remove is not None:
self.symbol_by_dataset_symbol.pop(to_remove, None)
self.latest_scores.pop(symbol, None)
class SymbolData:
def __init__(self, algorithm: QCAlgorithm, symbol: Symbol):
self.algorithm = algorithm
# Subscribe to Brain Wikipedia Page Views metrics for this equity symbol
self.dataset_symbol = algorithm.add_data(BrainWikipediaPageViews, symbol, Resolution.DAILY).symbol
history = algorithm.history(self.dataset_symbol, 365, Resolution.DAILY)
algorithm.debug(f"We got {len(history)} items from our history request for {symbol}")
def dispose(self):
# Unsubscribe custom data subscription
self.algorithm.remove_security(self.dataset_symbol)
public class BrainWikipediaPageViewsFrameworkRankingAlgorithm : QCAlgorithm
{
public override void Initialize()
{
SetStartDate(2025, 1, 1);
SetEndDate(2025, 11, 23);
SetCash(100000);
UniverseSettings.Resolution = Resolution.Daily;
AddUniverse(f => f
.Where(c => c.HasFundamentalData && c.Price > 10)
.OrderByDescending(c => c.DollarVolume)
.Take(50)
.Select(c => c.Symbol));
var alpha = new BrainWikipediaPageViewsRankingAlpha();
AddAlpha(alpha);
Schedule.On(
DateRules.EveryDay(),
TimeRules.At(9, 0),
alpha.EmitInsights
);
SetPortfolioConstruction(new InsightWeightingPortfolioConstructionModel());
AddRiskManagement(new NullRiskManagementModel());
SetExecution(new ImmediateExecutionModel());
SetWarmUp(30, Resolution.Daily);
}
}
public class BrainWikipediaPageViewsRankingAlpha : AlphaModel
{
private readonly Dictionary<Symbol, SymbolData> _symbolDataBySymbol = new();
private readonly Dictionary<Symbol, Symbol> _symbolByDatasetSymbol = new();
private readonly Dictionary<Symbol, double> _latestScores = new();
private DateTime? _lastScoreDate;
private QCAlgorithm _algorithm;
public override IEnumerable<Insight> Update(QCAlgorithm algorithm, Slice slice)
{
_algorithm = algorithm;
var points = slice.Get<BrainWikipediaPageViews>();
if (points == null)
{
return Enumerable.Empty<Insight>();
}
foreach (var kvp in points)
{
var dsSymbol = kvp.Key;
var point = kvp.Value;
if (!_symbolByDatasetSymbol.TryGetValue(dsSymbol, out var sym))
{
continue;
}
var numberOfViews = point.NumberViews1.HasValue ? (double)point.NumberViews1.Value : 0.0;
var buzz1day = point.Buzz1.HasValue ? (double)point.Buzz1.Value : 0.0;
var score = buzz1day + Math.Sqrt(numberOfViews);
_latestScores[sym] = score;
}
_lastScoreDate = algorithm.Time.Date;
return Enumerable.Empty<Insight>();
}
public void EmitInsights()
{
var algorithm = _algorithm;
if (algorithm == null)
{
return;
}
if (!_lastScoreDate.HasValue || _lastScoreDate.Value != algorithm.Time.Date)
{
return;
}
var paired = new List<(Symbol Symbol, double Score)>();
foreach (var kvp in _latestScores)
{
var sym = kvp.Key;
var sc = kvp.Value;
if (!algorithm.Securities.TryGetValue(sym, out var sec) || sec == null || !sec.HasData || sec.Price <= 0)
{
continue;
}
if (sc == 0)
{
continue;
}
paired.Add((sym, sc));
}
if (paired.Count == 0)
{
return;
}
paired.Sort((a, b) => b.Score.CompareTo(a.Score));
var n = paired.Count;
var denom = n * (n + 1) / 2.0;
var insights = new List<Insight>(n);
for (int i = 0; i < n; i++)
{
var sym = paired[i].Symbol;
var rank = n - i;
var weight = rank / denom;
insights.Add(Insight.Price(sym, TimeSpan.FromDays(7), InsightDirection.Up, null, null, null, weight));
}
algorithm.EmitInsights(insights.ToArray());
}
public override void OnSecuritiesChanged(QCAlgorithm algorithm, SecurityChanges changes)
{
foreach (var security in changes.AddedSecurities)
{
var symbol = security.Symbol;
if (_symbolDataBySymbol.ContainsKey(symbol))
{
continue;
}
var sd = new SymbolData(algorithm, symbol);
_symbolDataBySymbol[symbol] = sd;
_symbolByDatasetSymbol[sd.DatasetSymbol] = symbol;
}
foreach (var security in changes.RemovedSecurities)
{
var symbol = security.Symbol;
if (_symbolDataBySymbol.TryGetValue(symbol, out var sd))
{
_symbolDataBySymbol.Remove(symbol);
sd.Dispose();
}
Symbol toRemove = default;
var found = false;
foreach (var kvp in _symbolByDatasetSymbol)
{
if (kvp.Value == symbol)
{
toRemove = kvp.Key;
found = true;
break;
}
}
if (found)
{
_symbolByDatasetSymbol.Remove(toRemove);
}
_latestScores.Remove(symbol);
}
}
}
public class SymbolData
{
private readonly QCAlgorithm _algorithm;
public Symbol DatasetSymbol { get; }
public SymbolData(QCAlgorithm algorithm, Symbol symbol)
{
_algorithm = algorithm;
DatasetSymbol = algorithm.AddData<BrainWikipediaPageViews>(symbol, Resolution.Daily).Symbol;
var history = algorithm.History(DatasetSymbol, 365, Resolution.Daily);
algorithm.Debug($"We got {history.Count()} items from our history request for {symbol}");
}
public void Dispose()
{
_algorithm.RemoveSecurity(DatasetSymbol);
}
}
Research Example
The following example retrieves historical Brain Wikipedia Page Views data for Apple and prints the daily number of Wikipedia page views over the past six months.
qb = QuantBook()
# Requesting data
aapl = qb.add_equity("AAPL", Resolution.DAILY).symbol
symbol = qb.add_data(BrainWikipediaPageViews, aapl).symbol
# Historical data
history = qb.history(BrainWikipediaPageViews, symbol, 180, Resolution.DAILY)
for (symbol, time), row in history.iterrows():
    print(f"{symbol} views at {time}: {row['numberviews1']}")
#r "../QuantConnect.DataSource.BrainSentiment.dll"
var qb = new QuantBook();
// Requesting data
var aapl = qb.AddEquity("AAPL", Resolution.Daily).Symbol;
var symbol = qb.AddData<BrainWikipediaPageViews>(aapl).Symbol;
// Historical data
var history = qb.History<BrainWikipediaPageViews>(symbol, 180, Resolution.Daily);
foreach (var pageMetrics in history)
{
    Console.WriteLine($"{pageMetrics.Symbol} views at {pageMetrics.EndTime}: {pageMetrics.NumberViews1}");
}
}