Peter Coy writes for Bloomberg about how financial statistics are manipulated to tell stories that investors want to hear, even if they’re not true. He argues that the “core of the problem is that it’s hard to beat the market, but people keep trying anyway,” then explains how financial firms manufacture investment strategies designed to capitalize on statistical anomalies.
Coy gives two examples of how pure statistical analysis, without the benefit of logic and reason, can lead to badly flawed conclusions. In the first example, an analyst looking for a statistical relationship between jelly bean consumption and acne cannot find one, so he keeps slicing the data by jelly bean color until he can show a clear correlation between green jelly beans and acne. In the second example, Coy references a study that found that the best predictor of the S&P 500, out of all the time series in a collection of United Nations data, was butter production in Bangladesh. The trouble is that, while there may be a quantifiable relationship between the two data series, there is no causation. It’s purely coincidence.
There is a term for such over-fitting of statistical data to achieve a desired outcome: p-hacking, a reference to the p-value, which statistical analysis uses to judge how likely a result at least that strong would be if chance alone were at work. The risk is particularly high with financial time series, because there’s a powerful market incentive for practitioners to search for and identify previously unknown relationships.
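
To see how easily a search like the butter-in-Bangladesh one can turn up a "predictor," here is a minimal Python sketch. It is an illustration under stated assumptions, not a reconstruction of the study Coy cites: the target series and every candidate series are independent random walks, and the function names, the 1,000 candidates, and the 30 data points are all hypothetical choices made for the example.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def random_walk(rng, n_points):
    """A series built by accumulating independent Gaussian steps."""
    level, path = 0.0, []
    for _ in range(n_points):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def best_spurious_match(n_candidates=1000, n_points=30, seed=0):
    """Strongest absolute correlation found by brute-force search.

    The target and all candidates are generated independently, so any
    correlation the search finds is coincidence by construction.
    """
    rng = random.Random(seed)
    target = random_walk(rng, n_points)  # stand-in for a market index
    return max(
        abs(pearson(target, random_walk(rng, n_points)))
        for _ in range(n_candidates)
    )


if __name__ == "__main__":
    print("best of 1 candidate:    ", round(best_spurious_match(1), 3))
    print("best of 1000 candidates:", round(best_spurious_match(1000), 3))
```

Because every series here is unrelated to the target by design, the only thing that makes the winning correlation look impressive is the number of candidates searched. Widening the search can only raise the best match, which is the mechanism behind both of Coy's examples.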