Why Artificial Intelligence Has Failed to Outperform
New research has documented the persistent failure of investing based on artificial intelligence (AI). This is unsurprising, given the challenges of active management and the widespread inability of human managers to outperform an index fund.
With all the hype around AI, I’ve been getting questions about whether it might outperform benchmark indices – something human active managers have persistently failed to accomplish. According to BlackRock CEO Laurence Fink, the likely reason for the relative underperformance of active equity funds and the resulting outflows is the fallacies inherent in human discretion in active portfolio management and stock‐picking. According to Fink, “the democratization of information has made it much harder for active management. We have to change the ecosystem – that means relying more on big data, AI, factors and models within quant and traditional investment strategies.” BlackRock executive Mark Wiseman added, “The old way of people sitting in a room picking stocks, thinking they are smarter than the next guy – that does not exist anymore.” Kiplinger provided another example of the hype (and hope) surrounding AI: “Artificial intelligence leveraging the raw power of Big Data might just be the edge tactical investors and traders need to navigate an increasingly uncertain market.”
To assess whether AI will outperform human active managers, I’ll first review the findings of Wojtek Buczynski, Fabio Cuzzolin and Barbara Sahakian, authors of the study, “A Review of Machine Learning Experiments in Equity Investment Decision‑Making: Why Most Published Research Findings Do Not Live Up to Their Promise in Real Life,” published in the April 2021 issue of the International Journal of Data Science and Analytics. They analyzed 27 peer-reviewed articles published by academic researchers between 2000 and 2018 describing experiments in AI market forecasting. They found that virtually all of them claimed great forecasting accuracy. They wanted to determine whether these forecasting techniques could be replicated in the real world. Here is a summary of their findings:
- Most of the experiments ran multiple versions (in extreme cases, hundreds) of their investment model in parallel. In almost all cases, the authors presented their highest-performing model as the primary product of their experiment – the best result was cherry-picked, and all the suboptimal results were ignored. In other words, they tortured the data until it confessed.
- Models in the papers reviewed achieved a high level of accuracy, about 95%. However, in the real world, even if an algorithm is wrong only 5% of the time, it could wipe out any profits and the entire underlying capital – as hedge fund manager Victor Niederhoffer demonstrated more than once.
- Most experiments did not account for trading costs.
- Most AI algorithms appeared to be “black boxes,” with no transparency on how they worked. In the real world, this isn’t likely to inspire investors’ confidence.
- The handful of AI-powered funds whose performance data were disclosed on publicly available market data sources generally underperformed the market.
- Based on hit rates (the percentage of times the forecast was directionally accurate in its predictions), there was no improvement in forecasting accuracy over time.
- From January 2011 to January 2020, the Eurekahedge AI Hedge Fund Index returned a cumulative 115%, substantially underperforming two global benchmark indices: the S&P 500, which returned 210%, and the MSCI World, which returned 133%.
- Preqin’s AI hedge fund universe generated a 27% return in the three years from August 2016 to August 2019. The S&P 500 returned 65% and the MSCI World returned 32%.
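The cherry-picking problem in the first finding above can be illustrated with a small simulation. This is a minimal sketch, not taken from the study: it assumes every "model" is a pure coin flip with no forecasting skill, and the function name, the 250-day sample and the 500-model count are hypothetical choices made only for illustration.

```python
import random


def best_of_n_hit_rate(n_models: int, n_days: int, seed: int = 0) -> float:
    """Return the best directional hit rate among n_models coin-flip forecasters."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_models):
        # Each "model" guesses up or down at random, so its true skill is exactly 50%.
        hits = sum(rng.random() < 0.5 for _ in range(n_days))
        best = max(best, hits / n_days)
    return best


# Compare a single skill-free model with the best of 500 skill-free models.
single = best_of_n_hit_rate(n_models=1, n_days=250)
cherry_picked = best_of_n_hit_rate(n_models=500, n_days=250)
print(f"one model: {single:.1%}, best of 500: {cherry_picked:.1%}")
```

Because none of the simulated models has any skill, whatever edge the best one shows is produced entirely by the selection step. Reporting only that model, as most of the reviewed experiments did, says little about how it would perform on new data.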
The authors cited the following examples of failure:
- Aidyia, created and run by AI legend Ben Goertzel, was a Hong Kong-based machine learning-driven hedge fund employing ensemble models. Aidyia delivered 12% on its first day – and was liquidated after less than a year due to poor performance.
- Sentient Technologies, a high-profile startup hedge fund that attracted $143 million in venture capital funding for its evolutionary algorithms-based trading strategies, returned 4% in 2017 and 0% in 2018 – and was liquidated.
- Rogers AI Global Macro ETF was launched in June 2018. It employed AI in an investment decision-making capacity. It operated for just over one year (from June 2018 to July 2019) and during that time made virtually no profit.
We can also examine the performance of EquBot’s AI Powered Equity ETF (AIEQ), another high-profile failure. The fund is powered by IBM’s Watson. Over the period November 2017 to June 2023, it returned 6.7% per annum with a standard deviation of 23.0%, underperforming Vanguard’s Total Stock Market Index Fund (VTSMX), which returned 11.2% per annum with a standard deviation of 18.5%. In other words, AIEQ delivered lower returns while taking more risk. Its five-year performance through June 2023 placed it in the 97th percentile according to Morningstar, whose percentile ranks run from 1 (best) to 100 (worst). The poor performance explains why the fund had only $115 million in assets under management as of the end of June 2023.
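To see what that annual gap means in dollar terms, the returns quoted above can be compounded over the period. This is a rough sketch under stated assumptions: the period is treated as 5 years and 8 months, the annual returns are taken as constant compound rates, and the helper names are hypothetical. The second helper is a plain return-to-volatility ratio, not a Sharpe ratio, because no risk-free rate is subtracted.

```python
def growth_of_dollar(annual_return: float, years: float) -> float:
    """Value of $1 after compounding at annual_return for the given number of years."""
    return (1.0 + annual_return) ** years


def return_per_unit_of_volatility(annual_return: float, annual_std: float) -> float:
    """Plain ratio of annual return to standard deviation (no risk-free rate subtracted)."""
    return annual_return / annual_std


# Assumption: November 2017 through June 2023 is treated as 5 years and 8 months.
YEARS = 5 + 8 / 12

aieq_growth = growth_of_dollar(0.067, YEARS)
vtsmx_growth = growth_of_dollar(0.112, YEARS)

aieq_ratio = return_per_unit_of_volatility(0.067, 0.230)
vtsmx_ratio = return_per_unit_of_volatility(0.112, 0.185)

print(f"$1 in AIEQ:  ${aieq_growth:.2f} (return/volatility {aieq_ratio:.2f})")
print(f"$1 in VTSMX: ${vtsmx_growth:.2f} (return/volatility {vtsmx_ratio:.2f})")
```

Under these assumptions, the index fund ends the period well ahead on both measures: it compounds to a larger ending value and earns more return per unit of volatility.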
Their findings led Buczynski, Cuzzolin and Sahakian to conclude: “There is no conclusive evidence of *any* ML-driven investment funds delivering spectacular returns at scale. All market data indicates substantial underperformance compared to benchmark indices.” They added: “Lack of explanation on how an algorithm arrived at a particular forecast or recommendation is suboptimal in the experimental (theoretical) context, but very risky (if not unacceptable) in practical context, where there would be real investors’ money at stake. It is also likely to raise concerns of regulatory and/or legal nature.”
Explaining the poor performance of AI funds
How do we explain the poor performance of AI funds such as AIEQ? As Andrew Berkin and I explained in our book, The Incredible Shrinking Alpha, while Watson can outwit individuals (even champions in their fields, e.g., chess), individuals are not the competition when it comes to investing. Instead, Watson competes with the collective wisdom of the millions of individuals trading in stock markets each day – a much tougher competitor. In addition, as soon as new information becomes available (such as the publication of a paper revealing a profitable anomaly), investors act on it, and it is incorporated into market prices very quickly. Today, the competition is the collective decision making of not just humans but also machines, algorithms, and algorithms predicting what other algorithms will do next. Thus, any successful AI strategy is likely to be very short lived.
As Dimensional explained in a research note: “Active investors have long attempted to get an informational edge on markets by using artificial intelligence (AI) processes to retrieve and process data. For example, tools that gauge sentiment from social media or scrape text from company financial reports predate ChatGPT by many years. Material information gleaned from running AI processes is very likely a subset of the vast information set known by the market in aggregate and reflected in market prices. If new information is obtained, the process of acting on that information incorporates it into market prices. Another reason to question AI’s role in helping with market timing is limitations with its predictions. AI’s forecasting ability fares well when assessing patterns that are relatively stable. The market is fantastically complex. So much so that no one knows exactly how much a particular piece of information impacts a price, because there are so many other simultaneous inputs. AI trying to predict market prices is like self-piloting cars trying to read stop signs with words, shapes, and colors that differ every day.”
Investor takeaways
There is no doubt that artificial intelligence has changed the way financial institutions execute trades. But the collective wisdom of the market is a powerful force that ensures that the price quoted is the best estimate of the value of a security. Thus, there’s no reason to think that the use of AI should lead to persistent fund outperformance.
Larry Swedroe is head of financial and economic research for Buckingham Wealth Partners.
For informational and educational purposes only and should not be construed as specific investment, accounting, legal, or tax advice. Certain information is based on third party data and may become outdated or otherwise superseded without notice. Third party information is deemed to be reliable, but its accuracy and completeness cannot be guaranteed. All investments involve risk, including loss of principal. By clicking on any of the links above, you acknowledge that they are solely for your convenience, and do not necessarily imply any affiliations, sponsorships, endorsements or representations whatsoever by us regarding third-party websites. We are not responsible for the content, availability or privacy policies of these sites, and shall not be responsible or liable for any information, opinions, advice, products or services available on or through them. Neither the Securities and Exchange Commission (SEC) nor any other federal or state agency have approved, determined the accuracy, or confirmed adequacy of this article. LSR-23-530.