Monday, May 9, 2016

Know your data, the fundamentals of analysis

Thanks to Ben Hunt's latest financial letter, I pulled out my old copies of work by Edward Tufte, the data visualization expert. He may have done more for good visualization and data analysis than anyone over the last 40 years. Anscombe's quartet says it all about what you miss if you don't look at the data: four data sets with nearly identical summary statistics that look completely different once plotted.
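
To make the Anscombe point concrete, here is a minimal sketch in plain Python that computes a few summary statistics for the four data sets. The function name summarize and the dictionary layout are my own choices for illustration. All four sets come out with roughly the same mean, correlation, and fitted line, which is exactly why the numbers alone cannot replace a graph.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet (Anscombe, 1973): four small x/y data sets.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return mean of y, Pearson correlation, and least-squares slope and intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_y": my,
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": my - slope * mx,
    }


summaries = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

# Print the rounded statistics side by side; they are nearly indistinguishable.
for name, s in summaries.items():
    print(name, {k: round(v, 2) for k, v in s.items()})
```

Plotting each set as a scatter chart is the natural next step, since that is where the four sets stop looking alike.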

I took the Tufte one-day course on the visualization of data more than 20 years ago, and it has had a profound effect on focusing my powers of observation and my ability to think of clean representations of data. Over the years, I have followed a set of rules when looking at data that has served me well.

My data rule checklist:

1. Graph the data - If you cannot see it, you cannot understand it. Pictures are critical.
2. Cut the data - Look over different time periods to see if there are changes. Data move.
3. Get the stats - Understand the distribution of the data only after seeing it first.
4. Look at the histogram - Look for outliers because this is where money is won or lost.
5. Match data with events - Go both ways - match outliers with events and big events with data.
6. Generate conditional tables - The power of a 2x2 contingency table is often missed.
7. Check the volatility and correlation - See whether volatility and correlations change.
8. Compare against good times and bad - Show the business cycle and big events on the graph.
9. Map theory and prior evidence - Use priors of other researchers to focus representation of data.
10. Run your regression tests - Be skeptical and focus on residuals.
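
Two of the checklist items, the 2x2 contingency table (6) and the check for changing volatility and correlation (7), are easy to sketch in plain Python. The return series below are made-up numbers for two hypothetical assets, and the function names and the four-day window are assumptions chosen only for illustration.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev

# Made-up daily returns for two hypothetical assets (illustration only).
asset_a = [0.010, -0.020, 0.015, 0.003, -0.007, 0.030,
           -0.025, 0.004, 0.012, -0.018, 0.022, -0.001]
asset_b = [0.008, -0.015, 0.010, -0.002, -0.004, 0.021,
           -0.030, 0.006, -0.003, -0.010, 0.015, 0.002]


def contingency_2x2(a, b):
    """Count days by return sign: key is (a was up, b was up)."""
    counts = Counter((ra > 0, rb > 0) for ra, rb in zip(a, b))
    return {(i, j): counts.get((i, j), 0)
            for i in (True, False) for j in (True, False)}


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


def rolling_vol(returns, window):
    """Sample standard deviation over each trailing window."""
    return [stdev(returns[i - window:i])
            for i in range(window, len(returns) + 1)]


def rolling_corr(a, b, window):
    """Pearson correlation over each trailing window."""
    return [pearson(a[i - window:i], b[i - window:i])
            for i in range(window, len(a) + 1)]


table = contingency_2x2(asset_a, asset_b)
vols = rolling_vol(asset_a, 4)
corrs = rolling_corr(asset_a, asset_b, 4)

print("both up:", table[(True, True)], "both down:", table[(False, False)])
print("rolling vol of A:", [round(v, 4) for v in vols])
print("rolling corr A/B:", [round(c, 2) for c in corrs])
```

If the rolling numbers drift or jump from window to window, that is the signal to go back to items 2 and 5 and ask what changed and when.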

Good data analysis is a skill that can be learned through following a process.

