Detecting Impactful News In ETF Constituents

Abstract

In this tutorial, we build upon the natural language processing (NLP) approach from the previous strategy. This iteration monitors the Tiingo News Feed to estimate the intraday news sentiment of the largest constituents of the Nasdaq-100 index while avoiding look-ahead bias. Over a backtest from January 1, 2021 to January 1, 2023, this version of the strategy produced lower risk-adjusted returns than the QQQ exchange-traded fund (ETF).

Background

NLP is a subfield of artificial intelligence that strives to process unstructured text and understand its meaning. In most NLP trading strategies, the developer provides a set of pre-selected phrases and their sentiment scores, which usually introduces look-ahead bias into the strategy. In this algorithm, we circumvent this error by assigning sentiment scores to words on-the-fly based on how they impact the future returns of the security.

Method

Let’s review how we can implement this strategy as a framework algorithm with the LEAN trading engine.

Universe Selection

To get the largest constituents of the QQQ ETF, we add a custom ETF Constituents Universe Selection model and define the filter function to provide the 10 securities with the largest weight in the QQQ ETF.

def ETFConstituentsFilter(self, constituents: List[ETFConstituentData]) -> List[Symbol]:
    # Skip constituents without a reported weight, then keep the 10 largest by weight.
    selected = sorted([c for c in constituents if c.Weight],
        key=lambda c: c.Weight, reverse=True)[:10]
    return [c.Symbol for c in selected]

Requesting News Articles

Every time a security enters our ETF universe, we subscribe to its Tiingo News Feed.

self.dataset_symbol = algorithm.AddData(TiingoNews, symbol).Symbol

Training the NLP Models

To ensure the algorithm is fit using the most recent news releases, we train a model for each security when it enters the universe, and we schedule training sessions to re-fit the models at the beginning of every month.

algorithm.Train(algorithm.DateRules.MonthStart(), algorithm.TimeRules.At(7, 0), self.train_models)

During the training sessions, we use the following procedure for each security in the universe:

1. Make a history request to gather the news releases and trading prices of the security over the last 30 days.
2. Tokenize the news article text, drop the punctuation, and drop filler words like “the”, “a”, and “an”.
3. Create a dictionary that maps each word to the expected future return of the security over the following 30 minutes.
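
The three steps above can be sketched in plain Python, outside of LEAN. This is only an illustration under stated assumptions, not the strategy's actual implementation: the names `FILLER_WORDS`, `tokenize`, `fit_word_returns`, and `predict_return` are hypothetical, the filler-word list is abbreviated, and the sketch assumes that a word's "expected future return" is the average 30-minute return observed after the articles that contain it. The history request is replaced by a small hand-written sample.

```python
import string
from collections import defaultdict
from statistics import mean

# Hypothetical, abbreviated filler-word list (an assumption for this sketch).
FILLER_WORDS = {"the", "a", "an"}

def tokenize(text):
    """Lowercase the text, drop ASCII punctuation, and drop filler words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in cleaned.split() if word not in FILLER_WORDS]

def fit_word_returns(articles):
    """Map each word to the mean future return seen after articles containing it.

    `articles` is an iterable of (text, future_return) pairs, where
    future_return is the security's return over the 30 minutes that
    follow the release of the article.
    """
    returns_by_word = defaultdict(list)
    for text, future_return in articles:
        for word in set(tokenize(text)):  # count each word once per article
            returns_by_word[word].append(future_return)
    return {word: mean(values) for word, values in returns_by_word.items()}

def predict_return(text, word_returns):
    """Average the scores of the known words in a new article (assumed aggregation)."""
    scores = [word_returns[w] for w in tokenize(text) if w in word_returns]
    return mean(scores) if scores else 0.0

# Hand-written sample standing in for 30 days of news and price history.
training_data = [
    ("The company beats earnings estimates.", 0.02),
    ("Regulators open an investigation.", -0.03),
    ("Earnings call scheduled.", 0.01),
]
word_returns = fit_word_returns(training_data)
prediction = predict_return("Earnings investigation", word_returns)
```

Because every score comes from returns that were already realized when the model was fit, no hand-picked sentiment lexicon is needed, which is how the approach avoids the look-ahead bias described in the Background section.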

Detecting Significant News

The NLP models transform the text of news releases into a prediction on the future returns of the respective security. Instead of trading in response to every news release, we only trade when an NLP model provides a prediction that’s \(n\) standard deviations away from the mean of the last 30 predictions. A larger value of \(n\) translates to fewer trades, but the trades are in response to news that carries more significance.
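
A minimal sketch of this filter in plain Python follows. The class name `SignificanceFilter` is hypothetical, and several details are assumptions rather than facts about the strategy: the new prediction is compared against the 30 predictions that preceded it, nothing is flagged until the window is full, and the sample standard deviation is used.

```python
from collections import deque
from statistics import mean, stdev

class SignificanceFilter:
    """Flag predictions more than n standard deviations from the rolling mean."""

    def __init__(self, n=2.0, window=30):
        self.n = n
        self.predictions = deque(maxlen=window)  # oldest values drop off automatically

    def is_significant(self, prediction):
        significant = False
        # Only judge a prediction once a full window of history is available.
        if len(self.predictions) == self.predictions.maxlen:
            mu = mean(self.predictions)
            sigma = stdev(self.predictions)
            significant = sigma > 0 and abs(prediction - mu) > self.n * sigma
        self.predictions.append(prediction)
        return significant

# Usage: 30 small predictions build the window, then a large one stands out.
news_filter = SignificanceFilter(n=2.0, window=30)
warmup_flags = [news_filter.is_significant(0.001 if i % 2 == 0 else -0.001)
                for i in range(30)]
large_flag = news_filter.is_significant(0.05)
```

Raising `n` widens the band around the rolling mean, so fewer predictions fall outside it and fewer trades are placed.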

Emitting Insights

When an NLP model detects some significant news, the Alpha model emits an Insight with a duration of 30 minutes and a direction that matches the sentiment of the news release. That is, if the model determines the news release is positive, the insight has InsightDirection.Up. Otherwise, it has InsightDirection.Down.

direction = InsightDirection.Up if expected_return > 0 else InsightDirection.Down
insights.append(Insight.Price(asset_symbol, self.PREDICTION_INTERVAL, direction))

Portfolio Construction

The Tiingo News Feed delivers each article at the second it is released. In this strategy, the goal is to trade immediately in response to news articles and hold the position for 30 minutes. The position size should only change during the 30 minutes if another significant news article is released for the same security with sentiment in the opposite direction of our trade. To achieve this, we create a custom Portfolio Construction model (PCM) called the PartitionedPortfolioConstructionModel.

This PCM works by slicing the portfolio into \(p\) independent partitions. When the PCM receives an Insight for the first security, it allocates \(\frac{1}{p}\) of the portfolio capital to the security. The security price fluctuates over time, so its weight in the portfolio won’t stay fixed at \(\frac{1}{p}\) if \(p > 1\). When the model receives an Insight for another security, it calculates the number of vacant partitions \(v\) and then allocates \(\frac{1}{v}\) of the portfolio cash to the new security. The benefit of this design is that the PCM maintains the size of every trade until the Insight expires or the Alpha model emits a new insight in the other direction. The drawback of this design is that the portfolio can only hold up to \(p\) securities at any one time.
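
The allocation arithmetic can be illustrated with a short, LEAN-independent sketch. The function name `next_allocation` is hypothetical, fees and price changes are ignored, and returning zero when no partition is vacant is an assumption about how a full portfolio is handled.

```python
def next_allocation(cash, total_partitions, occupied_partitions):
    """Return the cash to commit to a new security: 1/v of the available cash."""
    vacant = total_partitions - occupied_partitions
    if vacant <= 0:
        return 0.0  # assumed behavior: no vacant partition, no new position
    return cash / vacant

# Worked example with p = 4 partitions and $100,000 of starting cash.
cash = 100_000.0
p = 4
allocations = []
for occupied in range(p):
    amount = next_allocation(cash, p, occupied)
    allocations.append(amount)
    cash -= amount
```

In this example each of the four positions receives $25,000: with \(v\) vacant partitions, \(\frac{1}{v}\) of the remaining cash equals \(\frac{1}{p}\) of the starting capital as long as earlier positions were funded the same way. Existing positions are never resized to make room for a new one.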

Results

We backtested the strategy from January 1, 2021 to January 1, 2023, and the algorithm achieved a Sharpe ratio of -0.659. For comparison, the following table shows the results of two benchmarks:

Benchmark	Sharpe Ratio
Buy-and-hold with the QQQ	-0.14
An equal-weighted portfolio of the same universe as the strategy	-0.11

In conclusion, the strategy underperforms both benchmarks in terms of risk-adjusted returns, and it needs further development before live trading.

