Sunday, November 7, 2021

Basics of data science - Machine learning is not econometrics


 

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. - Leo Breiman 

Most analysts and investors who have gone to business school have taken at least one statistics class and perhaps an econometrics course. Business schools have integrated data science into the mix over the last five years, but finance is still dominated by econometric thinking. In the econometric view, there is a proposed model for explaining markets, and data and statistics are used to test hypotheses and estimate its parameters. Machine learning does not start from an assumed model of how the data were generated; the focus is on using data to make accurate predictions. If data is useful for predictions, then it has meaning. 
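
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the simulated data, the true slope of 1.0, and the helper names (simulate, ols_fit, mse) are hypothetical and do not come from any real market data. The same linear fit is judged twice, first by the econometric question (is the slope statistically different from zero?) and then by the machine learning question (does it forecast better out of sample than a naive guess?). In practice a machine learning workflow would usually try more flexible algorithms; a straight line is used here only to keep the comparison simple.

```python
import math
import random


def simulate(n, beta=1.0, noise=1.0, seed=0):
    """Hypothetical data: y = beta * x + Gaussian noise (an assumption, not market data)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def ols_fit(xs, ys):
    """Return slope, intercept, and the slope's t-statistic for a one-variable regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(resid_ss / (n - 2) / sxx)
    return slope, intercept, slope / se_slope


def mse(xs, ys, slope, intercept):
    """Mean squared error of the linear forecast intercept + slope * x."""
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs, ys = simulate(1000)
train_x, test_x = xs[:700], xs[700:]
train_y, test_y = ys[:700], ys[700:]

# Econometric view: estimate the parameter and test the hypothesis slope = 0.
slope, intercept, t_stat = ols_fit(train_x, train_y)

# Machine learning view: compare forecast error on data the fit never saw
# against a naive forecast that always predicts the training mean.
model_mse = mse(test_x, test_y, slope, intercept)
naive_mse = mse(test_x, test_y, 0.0, sum(train_y) / len(train_y))

print(f"slope = {slope:.3f}, t-statistic = {t_stat:.1f}")
print(f"out-of-sample MSE: model = {model_mse:.3f}, naive = {naive_mse:.3f}")
```

The first print line is what an econometrician would report; the second is what a machine learning practitioner would look at. Neither number by itself says why the input predicts the output.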

From a practical perspective, data science techniques like machine learning focus on engineering solutions. The goal is to predict accurately. There is no attempt to fit a model that can explain, although there may be a desire for the inputs to be intuitive. 

The refocus on accuracy over causal reasoning should help with predictions; however, the order of reasoning is reversed. Rather than starting from a causal story and testing it against data, the analyst starts from inputs that forecast accurately and must then assess their causal relationship with the variable being predicted. Observational relationships obtained by machine learning will have to be given explanations. Nevertheless, data science should be embraced as a helpful investment tool. 
