Tuesday, July 7, 2020

Regression, "Oh so 20th century"; Machine learning, "Oh so 21st century"


The movement from regression and attribution statistics to prediction algorithms is not just a fad but a significant change in how we use information and in the purpose of data analysis. Appreciating this change in perspective is important for all investors evaluating systematic managers and quantitative analysis.

Regression is the framework of choice for most older quants who were trained in the 20th century. This world focuses on what could be called "surface plus noise," with the surface describing the model or scientific truths we wish to learn and the noise representing what obscures the truth hidden in the model. The emphasis is on model estimation and less on prediction. Develop a good estimated model, find factors, and the prediction will take care of itself.

Pure prediction algorithms are 21st century analysis and include neural nets, deep learning, and random forests. These algorithms have moved to the center of analytical attention given the increase in computing power and the explosion of large data sets. They are important advances on existing statistical analysis, but they also represent a change in orientation. Pure prediction algorithms focus on prediction, with less emphasis on estimation and attribution. Don't worry about model estimation. There is no focus on significance. It is all about accuracy and error reduction. The connection between prediction and attribution is not considered relevant.
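The contrast between the two cultures can be made concrete with a small sketch. This is an illustration under stated assumptions, not anything from Efron's paper: the data are simulated as a linear surface plus Gaussian noise, all function names are hypothetical, and a simple k-nearest-neighbors averager stands in for the heavier prediction algorithms (random forests, neural nets) purely to keep the example dependency-free. The regression side reports a slope and its t-statistic; the prediction side reports only out-of-sample error.

```python
import math
import random


def make_data(n, rng):
    """Simulate 'surface plus noise': y = 2.0 * x + Gaussian noise (assumed setup)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


def ols_fit(xs, ys):
    """Estimation and attribution: slope, intercept, and the slope's t-statistic."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope, intercept, slope / slope_se


def knn_predict(train_x, train_y, x, k=15):
    """Pure prediction: average the k nearest training responses; no coefficients."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(preds, ys):
    """Mean squared prediction error."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


rng = random.Random(0)  # fixed seed so the run is reproducible
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

# 20th century view: estimate the surface, then ask whether the factor is significant.
slope, intercept, t_stat = ols_fit(train_x, train_y)
ols_mse = mse([intercept + slope * x for x in test_x], test_y)

# 21st century view: skip the model and judge only by held-out accuracy.
knn_mse = mse([knn_predict(train_x, train_y, x) for x in test_x], test_y)

print(f"OLS:  slope={slope:.2f}, t-stat={t_stat:.1f}, test MSE={ols_mse:.3f}")
print(f"k-NN: test MSE={knn_mse:.3f} (no coefficient or significance to report)")
```

Both approaches produce a test error, but only the regression produces something to attribute: a factor loading with a significance level. When the true surface really is linear, as assumed here, the estimated model has little to lose; the case for pure prediction grows as the surface becomes too complex to write down.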

The table and a deeper discussion are available in "Prediction, Estimation, and Attribution" by Bradley Efron in the Journal of the American Statistical Association (2020).

Data analysis cultures are changing. The challenge is for the old guard to learn new tricks and for the new culture to appreciate the power of traditional estimation. Right now, the connection between these two cultures is stilted, and the gap needs to be bridged. Estimation may not work for all data, and the pure prediction culture may need to temper its use of complex algorithms. However, the old guard is going to have to accept prediction algorithms to be part of the 21st century.
