This tutorial implements a strategy that trades stocks with low expected idiosyncratic skewness, based on a paper by Boyer, Mitton and Vorkink (2009, hereafter BMV) published in The Review of Financial Studies. Our implementation narrows the initial universe to liquid assets by selecting 200 stocks based on daily trading volume, price, and whether the stock has fundamental data in our data library. At the end of each month, we calculate the expected idiosyncratic skewness of each stock and sort the universe by that value. The strategy goes long the bottom 5%, holds the positions for the next month, and rebalances the portfolio monthly. It achieves a Sharpe ratio of 0.947, compared with 0.87 for the S&P 500 (SPY), over the period from July 1, 2009 to July 30, 2019.


BMV tests recent theories that stocks with low idiosyncratic skewness should have high expected returns. For example, Mitton and Vorkink (2007) develop a model in which some investors ("lotto investors") have a preference for positive skewness while others ("traditional investors") are mean-variance optimizers seeking to maximize the Sharpe ratio of their portfolios. Lotto investors accept lower average returns on stocks with high idiosyncratic skewness because they prefer stocks with lottery-like return properties. In equilibrium, markets clear at prices such that stocks with high idiosyncratic skewness have low expected returns, due to the different portfolio preferences of the two groups of investors.

Despite the theoretical basis for the pricing effects of skewness preference, empirically testing the relation is not straightforward because expected skewness is difficult to measure. Since lagged skewness alone does not adequately forecast skewness, BMV presents a cross-sectional model of expected skewness that uses additional predictive variables. Using their model, they reaffirm the existing theory that expected idiosyncratic skewness and returns are negatively correlated. Notably, they find the Fama-French alpha of a low-expected-skewness quintile exceeds the alpha of a high-expected-skewness quintile by 1.00% per month. Furthermore, the Fama-MacBeth cross-sectional regressions have statistically significant, negative coefficients. BMV also finds that expected skewness helps explain why stocks with low idiosyncratic volatility have high expected returns.

Data Description

To execute our algorithm, we will use daily data from Kenneth French's Data Library that captures the Fama-French three factors for the period July 1, 2009 to June 30, 2019. The raw data is delivered in a zip file, which is not directly importable into LEAN, so we need to unzip the file and upload the CSV to a GitHub repository. All other data used for this algorithm, including stock price, volume, and market capitalization, are from QuantConnect's Dataset Market. In the original paper, BMV also includes firm-specific variables such as momentum, turnover, and dummy variables for properties including Nasdaq listing, small size, medium size, and industry.
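As a rough illustration of preparing the factor file, the sketch below parses daily factor rows from CSV text using only the standard library. It assumes the usual layout of the daily three-factor file (a date in YYYYMMDD form followed by Mkt-RF, SMB, HML and RF, quoted in percent). The function name `parse_ff_daily` and the sample values are hypothetical and made up for the example; they are not real factor data.

```python
import csv
import io

def parse_ff_daily(csv_text):
    """Parse daily factor rows into a dict keyed by 'YYYYMMDD'.

    Assumed row layout: date, Mkt-RF, SMB, HML, RF (values in percent).
    Rows whose first field is not an 8-digit date (titles, headers,
    footnotes) are skipped.
    """
    factors = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue
        date = row[0].strip()
        if len(date) != 8 or not date.isdigit():
            continue
        # Convert percent to decimal returns.
        mkt_rf, smb, hml, rf = (float(x) / 100.0 for x in row[1:5])
        factors[date] = {"mkt_rf": mkt_rf, "smb": smb, "hml": hml, "rf": rf}
    return factors

# Made-up sample rows, for illustration only.
sample = """Illustrative header text
,Mkt-RF,SMB,HML,RF
20090701,0.50,1.00,-0.25,0.00
20090702,-2.75,-0.50,-1.25,0.00
"""
ff = parse_ff_daily(sample)
```

Skipping non-date rows keeps the parser tolerant of the descriptive text that surrounds the data in the raw file, so the CSV can be uploaded without hand-editing.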


We can develop a model of expected idiosyncratic skewness using the Fama-French three factors. Lower expected idiosyncratic skewness predicts a higher alpha. We set the investment horizon over which investors hope to experience an extreme positive outcome to 1 month. Let \(S(t)\) denote the set of trading days in the current month, and let \(N(t)\) denote the number of days in this set.

Step 1: Getting Fama-French three-factor regression residuals

Let \(\epsilon_{i,d}\) be the regression residual from the Fama and French (1993) three-factor model on day \(d\) for firm \(i\), where the regression coefficients that define this residual are estimated from daily data for days in \(S(t)\) using the time-series regression below.

\[\begin{equation} \begin{aligned} R_{i,d} - R_{f,d} = \ & \alpha_i \\ & + \beta_i [R_{M,d} - R_{f,d}] \\ & + s_i SMB_{d} \\ & + h_i HML_{d} + \epsilon_{i,d} \end{aligned} \end{equation}\]

for all days \(d \in S(t)\) and each firm \(i = 1,2,\dots,N\).
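The time-series regression in this step can be sketched with ordinary least squares in NumPy. This is a minimal sketch for one firm and one month, not the tutorial's actual algorithm code; the function name `ff3_residuals` and its argument names are hypothetical, and the inputs are assumed to be equal-length daily arrays already aligned by date.

```python
import numpy as np

def ff3_residuals(excess_returns, mkt_rf, smb, hml):
    """Fit the three-factor regression by OLS for one firm.

    excess_returns: daily R_i - R_f for the days in S(t)
    mkt_rf, smb, hml: daily factor returns for the same days
    Returns (coefficients [alpha, beta, s, h], residuals epsilon).
    """
    y = np.asarray(excess_returns, dtype=float)
    # Design matrix: intercept column plus the three factors.
    X = np.column_stack([np.ones(len(y)), mkt_rf, smb, hml])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return coef, residuals
```

Because the regression includes an intercept, the residuals sum to zero within each month, so the moments computed from them in the next step are centered by construction.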

Step 2: Estimating historical idiosyncratic moments

Let \(iv_{i,t}\) and \(is_{i,t}\) denote historical estimates of idiosyncratic volatility and skewness, respectively, for firm \(i\) using daily data for all days in \(S(t)\). We can then define \(iv_{i,t}\) and \(is_{i,t}\) as:

\[iv_{i,t} = \left( \frac{1}{N(t) - 1} \sum_{d\in S(t)} \epsilon_{i,d}^2 \right)^{1/2}\]

\[is_{i,t} = \frac{1}{N(t) - 2} \frac{ \sum_{d\in S(t)} \epsilon_{i,d}^3 } { iv_{i,t}^{3} }\]
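A minimal sketch of these two estimates, assuming the month's residuals for one firm are available as an array. Since \(iv_{i,t}\) is a standard deviation, the sketch normalizes the sum of cubed residuals by the cube of \(iv_{i,t}\). The function name `idiosyncratic_moments` is hypothetical.

```python
import numpy as np

def idiosyncratic_moments(residuals):
    """Return (iv, is) for one firm from one month of daily residuals.

    iv: square root of the sum of squared residuals over N(t) - 1
    is: sum of cubed residuals over (N(t) - 2) times iv cubed
    Requires at least 3 residuals that are not all zero.
    """
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    iv = np.sqrt(np.sum(eps ** 2) / (n - 1))
    isk = np.sum(eps ** 3) / ((n - 2) * iv ** 3)
    return iv, isk
```

For example, a symmetric set of residuals such as `[-1, 0, 1]` gives a skewness of zero, while one large positive residual among small negative ones gives a positive value.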

Step 3: Estimating expected idiosyncratic skewness

We need measures of expected skewness over a horizon of 1 month for firm \(i\) at the end of month \(t\), \(E_t[is_{i,t+1}]\), rather than the measures of historical skewness defined in the equation above. To model investor perceptions of expected skewness in a feasible manner, we first estimate a separate cross-sectional regression at the end of each month \(t\) in our sample,

\[is_{i,t} = \beta_0^t + \beta_1^t is_{i,t-1} + \beta_2^t iv_{i,t-1} + \varepsilon_{i,t}\]

Superscripts on the regression parameters emphasize that we estimate these parameters using information observable at the end of month \(t\). We then use the regression parameters from the equation above, along with information observable at the end of month \(t\), to estimate expected skewness for each firm,

\[ E_t[is_{i,t+1}] = \beta_0^t + \beta_1^t is_{i,t} + \beta_2^t iv_{i,t} \]

This approach provides feasible estimates of each month's expected skewness and accounts for variation between historical moments and expected skewness across time.
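The two equations in this step can be sketched as a single function: fit the cross-sectional regression across firms, then reuse the fitted coefficients with the current month's moments. This is an illustrative sketch; the function name `expected_skewness` and its arguments are hypothetical, and each array is assumed to hold one value per firm in the same firm order.

```python
import numpy as np

def expected_skewness(is_prev, iv_prev, is_curr, iv_curr):
    """Forecast next-month idiosyncratic skewness for every firm.

    is_prev, iv_prev: moments from month t-1 (regressors in the fit)
    is_curr, iv_curr: moments from month t (target in the fit, then
                      regressors in the forecast)
    Returns (beta, forecast), where beta = [beta_0, beta_1, beta_2].
    """
    is_prev, iv_prev, is_curr, iv_curr = (
        np.asarray(a, dtype=float)
        for a in (is_prev, iv_prev, is_curr, iv_curr)
    )
    # Cross-sectional fit: is_t on a constant, is_{t-1} and iv_{t-1}.
    X_fit = np.column_stack([np.ones(len(is_prev)), is_prev, iv_prev])
    beta, *_ = np.linalg.lstsq(X_fit, is_curr, rcond=None)
    # Forecast: apply the same coefficients to month-t moments.
    X_pred = np.column_stack([np.ones(len(is_curr)), is_curr, iv_curr])
    return beta, X_pred @ beta
```

The coefficients are re-estimated every month, so the forecast uses only information available at the rebalance date.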

Step 4: Generating trading signals

At the end of each month, we use the expected skewness estimates from the equation above to sort stocks by expected idiosyncratic skewness. We then go long the stocks in the lowest 5% of expected skewness, weighting each position by market capitalization to form a value-weighted portfolio.
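The selection and weighting rule can be sketched in plain Python as follows. The function name `low_skew_weights`, the dictionary inputs, and the choice to always keep at least one stock are assumptions made for this example rather than details taken from the original implementation.

```python
import math

def low_skew_weights(expected_skew, market_cap, fraction=0.05):
    """Pick the lowest-skewness stocks and value-weight them.

    expected_skew: dict of symbol -> expected idiosyncratic skewness
    market_cap:    dict of symbol -> market capitalization
    fraction:      share of the universe to hold (0.05 = bottom 5%)
    Returns a dict of symbol -> portfolio weight summing to 1.
    """
    # Sort symbols from lowest to highest expected skewness.
    ranked = sorted(expected_skew, key=expected_skew.get)
    # Assumption: hold at least one stock even in a tiny universe.
    n_select = max(1, math.floor(len(ranked) * fraction))
    selected = ranked[:n_select]
    total_cap = sum(market_cap[s] for s in selected)
    return {s: market_cap[s] / total_cap for s in selected}
```

With a 200-stock universe and the default `fraction`, this holds 10 stocks each month.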

Conclusion and Future Work

Before the BMV paper was published in 2009, a number of theories on the pricing premium for stocks with idiosyncratic skewness existed, but lacked supporting empirical evidence of the relationship between idiosyncratic skewness and returns. BMV fills this void by estimating a model of predicted skewness and using predicted skewness to explain the cross-section of returns. The paper finds that lagged idiosyncratic volatility is a stronger predictor of skewness than lagged idiosyncratic skewness.

In this implementation, we rely on idiosyncratic volatility and skewness to predict idiosyncratic skewness. Interested users can build from this implementation by trying the following extensions:

  1. Including a number of firm-specific variables to improve predictive power for expected idiosyncratic skewness;
  2. Using different investment horizons such as 3 months, 6 months, 1 year;
  3. Adding more lags in the time-series regression for both expected and historical idiosyncratic skewness.


References

  1. Boyer B, Mitton T, Vorkink K. Expected idiosyncratic skewness. The Review of Financial Studies. 2009 Jun 3;23(1):169-202.
  2. Fama EF, French KR. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics. 1993 Feb 1;33(1):3-56.
  3. Mitton T, Vorkink K. Equilibrium underdiversification and the preference for skewness. The Review of Financial Studies. 2007 Jan 29;20(4):1255-88.