The Promise of Machine Learning

Machine learning shows great promise for empirical asset pricing and has the potential to improve our understanding of expected asset returns.

Machine learning methods are growing in popularity in financial research, driven by greater computing power and data availability, cheaper data storage and algorithmic innovation. Several features of empirical asset pricing make it an attractive field for these methods. First, there is now a zoo of predictors that various researchers have argued possess forecasting power for returns, and many of them are highly correlated. With its emphasis on variable selection and dimension reduction, machine learning is well suited to such prediction problems because it reduces degrees of freedom and condenses redundant variation among predictors. Second, there is the question of whether the relationships between predictors and returns are linear or nonlinear. Many machine learning methods are designed to approximate complex nonlinear associations.

Doron Avramov, Si Cheng and Lior Metzker, authors of the October 2021 study “Machine Learning versus Economic Restrictions: Evidence from Stock Return Predictability,” examined whether investors could harvest the extra profits generated by various machine learning signals once plausible restrictions were placed on the investment universe. They considered a comprehensive set of both linear and nonlinear models (a diverse collection of high-dimensional models for statistical prediction). In doing so, they imposed several economic restrictions:

  • They limited the universe of stocks to those that were relatively cheap to trade by excluding microcaps or distressed firms.
  • In the time series, they examined whether investment profitability was more pronounced during high limits-to-arbitrage market states, such as high volatility and low liquidity.
  • They assessed the turnover and the corresponding transaction costs associated with implementing machine learning-based strategies.
  • They explored the economic foundations of trading strategies advocated by seemingly opaque machine learning methods.
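
To make the turnover and transaction cost restriction concrete, here is a minimal Python sketch. It is not the authors' procedure: the function names, the turnover definition (the sum of absolute weight changes between two rebalancing dates, ignoring price drift), the tickers and the flat cost of 20 basis points per unit traded are all assumptions chosen for illustration.

```python
def turnover(prev_weights, new_weights):
    """Sum of absolute weight changes across all tickers.

    Hypothetical definition for illustration; the study may measure
    turnover differently.
    """
    tickers = set(prev_weights) | set(new_weights)
    return sum(
        abs(new_weights.get(t, 0.0) - prev_weights.get(t, 0.0))
        for t in tickers
    )


def net_return(gross_return, traded, cost_per_unit):
    """Gross return minus a proportional trading cost (assumed flat)."""
    return gross_return - traded * cost_per_unit


# Made-up long-short weights: positive = long, negative = short.
prev = {"AAA": 0.5, "BBB": 0.5, "CCC": -1.0}
new = {"AAA": 0.5, "DDD": 0.5, "CCC": -0.5, "EEE": -0.5}

traded = turnover(prev, new)            # BBB, DDD, CCC and EEE each move by 0.5
net = net_return(0.015, traded, 0.002)  # 1.5% gross month, 20 bp assumed cost
print(f"turnover={traded:.2f}, net monthly return={net:.4f}")
```

The point of the sketch is only that a strategy with a high gross return can lose a large share of it when positions change a lot from one rebalancing date to the next.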

Their full sample covered U.S. stocks from 1957 through 2017 and was divided into three subperiods: a training sample, a validation sample (both of which varied in length depending on the machine learning method) and the remaining 31 years (1987 to 2017) for out-of-sample testing. They retrained the models every year, so the training sample grew with each retraining. Following is a summary of their findings:

  • The value-weighted long-short portfolio generated returns of between 0.95% and 2.18% per month, with corresponding returns adjusted for the Fama-French six factors (FF6: beta, size, value, momentum, profitability and investment) of between 0.62% and 1.87% per month: “Such large and significant figures reflect the impressive success of machine learning techniques in generating outstanding performance relative to traditional methods such as nonregulated regressions and portfolio sorts based on individual anomalies.”
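
For readers unfamiliar with the term, a value-weighted long-short return can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the field names (`pred`, `market_cap`, `ret`), the `n_groups` parameter and the sample numbers are hypothetical, and the source does not say how many groups the authors sorted stocks into.

```python
def value_weighted_return(stocks):
    """Average return with weights proportional to market capitalization."""
    total_cap = sum(s["market_cap"] for s in stocks)
    return sum(s["market_cap"] / total_cap * s["ret"] for s in stocks)


def long_short_return(stocks, n_groups=10):
    """Long the top group by predicted return, short the bottom group.

    n_groups is an assumption for illustration (10 would mean deciles).
    """
    ranked = sorted(stocks, key=lambda s: s["pred"])
    size = len(ranked) // n_groups
    if size == 0:
        raise ValueError("not enough stocks for the requested number of groups")
    short_leg = ranked[:size]
    long_leg = ranked[-size:]
    return value_weighted_return(long_leg) - value_weighted_return(short_leg)


# Made-up one-month cross-section: model prediction, market cap, realized return.
stocks = [
    {"pred": 0.03, "market_cap": 300.0, "ret": 0.04},
    {"pred": 0.02, "market_cap": 100.0, "ret": 0.00},
    {"pred": -0.01, "market_cap": 200.0, "ret": 0.01},
    {"pred": -0.02, "market_cap": 200.0, "ret": -0.01},
]

spread = long_short_return(stocks, n_groups=2)
print(f"long-short return: {spread:.4f}")
```

Value weighting matters here because it keeps small, costly-to-trade stocks from dominating the result, which is the same concern behind the economic restrictions the authors imposed.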