Stock Selection Strategy Based on Fundamental Factors
Abstract
In recent years, factor investing has gained significant popularity among global institutional investors. In this tutorial, we first develop a factor selection model to test whether factors can differentiate potential winners and losers in the stock market. We then use the preselected factors to implement a factor-ranking stock selection algorithm based on Factor Based Stock Selection Model for Turkish Equities, 2015, Ayhan Yüksel.
Factor Selection
QuantConnect provides Morningstar fundamentals data for US Equities. Valuation ratios are daily data. Other data, such as operation ratios and financial statements, is available for multiple periods depending on the property. Please refer to the Data Library for details of the available factors.
The algorithm is designed to test the significance of one factor at a time.

    # Assumes `import pandas as pd` at the top of the algorithm file.
    def Initialize(self):
        self.SetStartDate(2005, 1, 1)   # Set Start Date
        self.SetEndDate(2012, 3, 1)     # Set End Date
        self.SetCash(50000)             # Set Strategy Cash
        self.UniverseSettings.Resolution = Resolution.Daily
        self.AddUniverse(self.CoarseSelectionFunction, self.FineSelectionFunction)
        self.AddEquity("SPY")  # add benchmark
        self.numOfCourseSymbols = 200
        self.numOfPortfolio = 5
        self._changes = None
        self.flag1 = 1  # variable to control the monthly rebalance of coarse and fine selection function
        self.flag2 = 0  # variable to control the monthly rebalance of OnData function
        self.flag3 = 0  # variable to record the number of rebalancing times
        # store the monthly returns of different portfolios in a dataframe
        self.df_return = pd.DataFrame(index=range(self.numOfPortfolio + 1))
        # schedule an event to fire on the first trading day of each month for SPY
        self.Schedule.On(self.DateRules.MonthStart("SPY"),
                         self.TimeRules.AfterMarketOpen("SPY"),
                         Action(self.Rebalancing))

Step 1: Ranking the stocks by factor values
1. First, we sort the stocks by daily dollar volume and take the stocks with the highest dollar volumes as our candidates. The universe selection API offers a convenient way to do this. Universes are refreshed every day by default, but can be refreshed as often as required. This is controlled by the variable UniverseSettings.Resolution. You can refer to the documentation for more details. Here we use the Scheduled Events API to trigger code on the first trading day of each month, and use three flag variables to control the rebalancing of the CoarseSelection, FineSelection and OnData functions.
Coarse universe selection is the built-in universe data provided by QuantConnect, which lets you roughly filter a universe of over 16,000 symbols before your algorithm runs. Because the coarse selection function takes all equities into account, including ETFs, which have no fundamental data, we use the property x.HasFundamentalData to exclude them from our candidate stock pool.

    # sort the data by daily dollar volume and take the top entries
    def CoarseSelectionFunction(self, coarse):
        if self.flag1:
            CoarseWithFundamental = [x for x in coarse if x.HasFundamentalData]
            sortedByVolume = sorted(CoarseWithFundamental, key=lambda x: x.DollarVolume, reverse=True)
            top = sortedByVolume[:self.numOfCourseSymbols]
            return [i.Symbol for i in top]
        else:
            return []

2. We extract the factor values of the candidate stocks at the beginning of each month and sort the stocks in ascending order of their factor values. Here we use the 12-month total risk-based capital, x.FinancialStatements.TotalRiskBasedCapital.TwelveMonths, as an example. It is the sum of Tier 1 and Tier 2 capital. x.Symbol.Value gives the ticker string of the selected stock x. We then save the sorted symbols as self.symbol.

    def FineSelectionFunction(self, fine):
        if self.flag1:
            self.flag1 = 0
            self.flag2 = 1
            # filter the fine by deleting equities with zero factor value
            filtered_fine = [x for x in fine
                             if x.FinancialStatements.TotalRiskBasedCapital.TwelveMonths != 0]
            # sort the fine by reverse order of factor value
            sorted_fine = sorted(filtered_fine,
                                 key=lambda x: x.FinancialStatements.TotalRiskBasedCapital.TwelveMonths,
                                 reverse=True)
            self.symbol = [str(x.Symbol.Value) for x in sorted_fine]
            # factor_value = [x.ValuationRatios.PERatio for x in sorted_fine]
            self.flag3 = self.flag3 + 1
            return []
        else:
            return []

Step 2: Compute the monthly return of portfolios
1. At the end of each month, we extract one month of historical close prices for each stock and compute the monthly returns.

    sorted_symbol = self.symbol
    self.AddEquity("SPY")  # add benchmark
    for x in sorted_symbol:
        self.AddEquity(x)
    history = self.History(20, Resolution.Daily)
    monthly_return = []
    new_symbol_list = []
    for symbol in sorted_symbol:
        daily_price = []
        try:
            # collect the close price of this symbol from every daily slice
            for bars in history:
                daily_price.append(float(bars[symbol].Close))
            new_symbol_list.append(symbol)
            monthly_return.append(daily_price[-1] / daily_price[0] - 1)
        except Exception:
            self.Log("No history data for " + str(symbol))
    # the length of monthly_return list should be divisible by the number of portfolios
    usable = len(monthly_return) // self.numOfPortfolio * self.numOfPortfolio
    monthly_return = monthly_return[:usable]

2. We divide the stocks into 5 portfolios and compute the average monthly return of each portfolio. Then we append the monthly return of the benchmark "SPY" as the last row of the data frame df_return.

    # integer division: np.reshape requires integer dimensions
    reshape_return = np.reshape(monthly_return,
                                (self.numOfPortfolio, len(monthly_return) // self.numOfPortfolio))
    # calculate the average return of different portfolios
    port_avg_return = np.mean(reshape_return, axis=1).tolist()
    # add return of "SPY" as the benchmark to the end of the return list
    benchmark_syl = self.AddEquity("SPY").Symbol
    history_benchmark = self.History(20, Resolution.Daily)
    benchmark_daily_price = [float(bars[benchmark_syl].Close) for bars in history_benchmark]
    benchmark_monthly_return = (benchmark_daily_price[-1] / benchmark_daily_price[0]) - 1
    port_avg_return.append(benchmark_monthly_return)
    self.df_return[str(self.flag3)] = port_avg_return

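To make the bucketing step concrete outside of the QuantConnect environment, the following is a minimal standalone sketch. The function name portfolio_average_returns and the toy return values are hypothetical and chosen only for illustration; the input is assumed to be already ordered by factor rank.

```python
import numpy as np

def portfolio_average_returns(monthly_returns, num_portfolios=5):
    """Split returns (already ordered by factor rank) into equal-sized
    portfolios and return the mean return of each portfolio."""
    # Drop trailing entries so the list divides evenly into the portfolios.
    usable = len(monthly_returns) // num_portfolios * num_portfolios
    if usable == 0:
        raise ValueError("need at least one stock per portfolio")
    trimmed = np.asarray(monthly_returns[:usable], dtype=float)
    # One row per portfolio, one column per stock in that portfolio.
    buckets = trimmed.reshape(num_portfolios, usable // num_portfolios)
    return buckets.mean(axis=1).tolist()

# Hypothetical monthly returns for 11 stocks; the last one is dropped
# because 11 is not divisible by 5.
toy_returns = [0.05, 0.03, 0.04, 0.02, 0.01, 0.00,
               -0.01, -0.02, -0.03, -0.04, 0.10]
averages = portfolio_average_returns(toy_returns)
print(averages)
```

Truncating the list before reshaping mirrors the approach in the algorithm: the lowest-ranked leftover stocks are simply ignored so that every portfolio holds the same number of stocks.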
Step 3: Generate the metrics to test the factor significance
After getting the monthly returns of the portfolios and the benchmark, we compute the average annual return and the excess return over the benchmark of each portfolio across the whole backtesting period. Then we generate three metrics to judge the significance of each factor.
- The first metric is the correlation between the portfolios' returns and their ranks. The absolute value of the correlation coefficient should be larger than 0.8.
- If the return of the first-ranked portfolio is larger than that of the last-ranked portfolio, we call the former the win portfolio and the latter the loss portfolio, and vice versa. The win probability is the probability that the win portfolio's return outperforms the benchmark return. The loss probability is the probability that the loss portfolio's return underperforms the benchmark. If the factor is significant, both the loss and win probabilities should be greater than 0.4.
- The excess return of the win portfolio should be greater than 0.25, while the excess return of the loss portfolio should be lower than 0.05.

    def calculate_criteria(self, df_port_return):
        # cumulative return of each portfolio (rows) over all months (columns)
        total_return = (df_port_return + 1).T.cumprod().iloc[-1, :] - 1
        # annualize; the exponent hard-codes a 6-year horizon
        annual_return = (total_return + 1) ** (1. / 6) - 1
        # excess return over the benchmark, which is stored in the last row
        excess_return = annual_return - np.array(annual_return)[-1]
        correlation = annual_return[0:5].corr(
            pd.Series([5, 4, 3, 2, 1], index=annual_return[0:5].index))
        # higher factor with higher return
        if np.array(total_return)[0] > np.array(total_return)[-2]:
            loss_excess = df_port_return.iloc[-2, :] - df_port_return.iloc[-1, :]
            win_excess = df_port_return.iloc[0, :] - df_port_return.iloc[-1, :]
            loss_prob = loss_excess[loss_excess < 0].count() / float(len(loss_excess))
            win_prob = win_excess[win_excess > 0].count() / float(len(win_excess))
            win_port_excess_return = np.array(excess_return)[0]
            loss_port_excess_return = np.array(excess_return)[-2]
        # higher factor with lower return
        else:
            loss_excess = df_port_return.iloc[0, :] - df_port_return.iloc[-1, :]
            win_excess = df_port_return.iloc[-2, :] - df_port_return.iloc[-1, :]
            loss_prob = loss_excess[loss_excess < 0].count() / float(len(loss_excess))
            win_prob = win_excess[win_excess > 0].count() / float(len(win_excess))
            win_port_excess_return = np.array(excess_return)[-2]
            loss_port_excess_return = np.array(excess_return)[0]
        test_result = {}
        test_result["correlation"] = correlation
        test_result["win probability"] = win_prob
        test_result["loss probability"] = loss_prob
        test_result["win portfolio excess return"] = win_port_excess_return
        test_result["loss portfolio excess return"] = loss_port_excess_return
        return test_result

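The three criteria listed above can be combined into a single pass/fail check. The sketch below is only an illustration: the function name is_significant, its parameter names, and the sample inputs are hypothetical, while the default thresholds are the ones stated in the text.

```python
def is_significant(correlation, win_prob, loss_prob,
                   win_excess_return, loss_excess_return,
                   min_abs_corr=0.8, min_prob=0.4,
                   min_win_excess=0.25, max_loss_excess=0.05):
    """Return True only if a factor meets all three criteria."""
    # Criterion 1: strong rank correlation in either direction.
    strong_correlation = abs(correlation) > min_abs_corr
    # Criterion 2: both win and loss probabilities exceed the threshold.
    good_probabilities = win_prob > min_prob and loss_prob > min_prob
    # Criterion 3: the win portfolio clearly beats the benchmark,
    # while the loss portfolio does not.
    good_excess = (win_excess_return > min_win_excess
                   and loss_excess_return < max_loss_excess)
    return strong_correlation and good_probabilities and good_excess

# Hypothetical metric values, not taken from any backtest.
print(is_significant(-0.95, 0.60, 0.45, 0.30, 0.02))
print(is_significant(0.50, 0.60, 0.45, 0.30, 0.02))
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the factor selection is to each cutoff.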
Factor Significance Testing Result

Factor | FCFYield | BuyBackYield | PriceChange1M | TrailingDividendYield | EVToEBITDA | RevenueGrowth | BookValuePerShare |
---|---|---|---|---|---|---|---|
Correlation | -0.936 | -0.987 | 0.918 | -0.981 | 0.939 | 0.89 | -0.92 |
Win Probability | 0.630 | 0.639 | 1 | 0.667 | 0.722 | 0.69 | 0.69 |
Loss Probability | 0.426 | 0.472 | 1 | 0.518 | 0.472 | 0.42 | 0.40 |
Excess Return (Win) | 0.324 | 0.212 | 0.303 | 0.225 | 0.414 | 0.23 | 0.27 |
Excess Return (Loss) | 0.060 | 0.037 | -1.67 | 0.043 | 0.042 | 0.07 | 0.06 |

We choose 4 factors: FCFYield, PriceChange1M, BookValuePerShare and RevenueGrowth.
Stock Selection
Step 1: Rank the stocks by factor values
First, we remove the stocks that lack fundamental data or have a zero factor value. For each pre-selected factor, we rank the stocks by their factor values. The order is descending if the factor correlation is negative and ascending if the factor correlation is positive.
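The ranking rule can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the factor values are available as a dictionary keyed by ticker; the function name rank_by_factor and the sample tickers and values are hypothetical.

```python
def rank_by_factor(factor_values, correlation):
    """Return the symbols ordered as described in the text.

    factor_values: dict mapping symbol -> factor value (None if missing).
    correlation:   the factor's correlation from the significance test.
    """
    # Remove stocks with a missing or zero factor value.
    kept = {s: v for s, v in factor_values.items() if v is not None and v != 0}
    # Descending order for a negative correlation, ascending otherwise.
    descending = correlation < 0
    return sorted(kept, key=kept.get, reverse=descending)

# Hypothetical factor values for four tickers.
values = {"AAA": 0.08, "BBB": 0.0, "CCC": 0.02, "DDD": 0.05}
print(rank_by_factor(values, correlation=-0.9))
print(rank_by_factor(values, correlation=0.9))
```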
Step 2: Calculate equally weighted composite factor scores
The second step uses the selected factors to calculate an equally weighted composite factor score for each stock.
- First, according to the factor order, we place our universe stocks into 5 distinct quintile portfolios, named P1, P2, P3, P4 and P5. The ranking of the portfolios sets out the preference of the factor model, i.e. the first portfolio (P1) corresponds to the “most preferred” stocks, while the fifth (P5) corresponds to the “least preferred” stocks. Suppose there are n stocks in total. The stocks that fall into the first-ranked portfolio get score p, the stocks that fall into the second-ranked portfolio get score p-1, and so on, so that every stock receives a score. We repeat the same calculation for each factor.
- Second, we calculate a “Composite Factor Score” for each stock by combining its individual factor scores with an equal weighting scheme.
- Third, we rank the stocks in our universe according to their Composite Factor Scores and choose the 20 highest-ranked stocks to construct our portfolio at the beginning of each month.
- At the end of each month, we repeat the above steps to construct the new portfolio and adjust the holdings.
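The scoring steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tutorial's implementation: it takes p to be the number of portfolios (5), keeps only stocks that appear in every factor ranking, and breaks ties alphabetically. The function names and the toy rankings are hypothetical.

```python
def quintile_scores(ranked_symbols, num_portfolios=5):
    """Score stocks by portfolio: P1 gets num_portfolios points, P5 gets 1.
    ranked_symbols must be ordered from most to least preferred."""
    n = len(ranked_symbols)
    scores = {}
    for position, symbol in enumerate(ranked_symbols):
        bucket = position * num_portfolios // n  # 0 for P1, 4 for P5
        scores[symbol] = num_portfolios - bucket
    return scores

def composite_scores(rankings, num_portfolios=5):
    """Equally weighted average of the per-factor scores.
    rankings: dict mapping factor name -> ranked list of symbols."""
    if not rankings:
        raise ValueError("at least one factor ranking is required")
    per_factor = [quintile_scores(r, num_portfolios) for r in rankings.values()]
    # Assumption: keep only stocks that have a score for every factor.
    common = set.intersection(*(set(s) for s in per_factor))
    return {sym: sum(s[sym] for s in per_factor) / len(per_factor)
            for sym in common}

def top_stocks(rankings, count=20):
    """Pick the highest composite scores; ties are broken alphabetically."""
    scores = composite_scores(rankings)
    return sorted(scores, key=lambda s: (-scores[s], s))[:count]

# Hypothetical rankings of ten tickers under two factors.
rankings = {
    "factor_a": ["S0", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8", "S9"],
    "factor_b": ["S2", "S3", "S0", "S1", "S4", "S5", "S6", "S7", "S8", "S9"],
}
composite = composite_scores(rankings)
selected = top_stocks(rankings, count=4)
print(selected)
```

Because every factor contributes the same weight, a stock needs to rank well on most factors to reach the top of the composite ranking; a single strong factor is not enough.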
References
- Factor Based Stock Selection Model for Turkish Equities, 2015, Ayhan Yüksel Online Copy