The noise bottleneck is really a paradox. We think the more information we consume, the more signal we’ll consume. Only the mind doesn’t work like that. When the volume of information increases, our ability to comprehend the relevant from the irrelevant becomes compromised. We place too much emphasis on irrelevant data and lose sight of what’s really important.
Give me more information! The more data, the better the decision. In fact, this may not be the case. Cutting out the noise and focusing on less may be a better way to approach problem solving. Nassim Taleb, in his book Antifragile, describes this problem as the noise bottleneck. What is true of data in absolute terms may not hold in relative terms: as you consume more data, the amount of signal may rise, but the ratio of noise to signal may rise faster. More data creates false confidence as well as stress for investors. For models, more data also creates noise. The number of features will increase, but these features are often dynamic and move in and out of significance.
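One way to see why looking at data more often can raise the noise-to-signal ratio is a rough sketch in Python. It assumes prices follow a simple random walk with drift, so the expected move grows with time while the random dispersion grows with the square root of time. The model, the function name noise_to_signal, and the 10% drift and 15% volatility figures are illustrative assumptions, not numbers taken from Taleb's book.

```python
import math


def noise_to_signal(mu, sigma, horizon_years):
    """Typical noise divided by expected drift over one observation interval.

    Assumes a random walk with drift: annual drift `mu`, annual volatility `sigma`.
    """
    signal = mu * horizon_years                # expected change grows linearly with time
    noise = sigma * math.sqrt(horizon_years)   # dispersion grows with the square root of time
    return noise / signal


# Hypothetical parameters chosen only for illustration.
MU, SIGMA = 0.10, 0.15

for label, horizon in [("yearly", 1.0), ("monthly", 1 / 12), ("daily", 1 / 252)]:
    ratio = noise_to_signal(MU, SIGMA, horizon)
    print(f"{label:8s} noise/signal = {ratio:.1f}")
```

Under these assumptions the ratio climbs as the observation interval shrinks, so checking the same series daily instead of yearly delivers mostly noise even though the underlying signal has not changed.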
How do you solve this problem? First, realize that there is a signal-to-noise problem. More data is not always better. Second, prepare signals in advance to streamline data use. Just because data is available does not mean it should be used. Third, accept that if signals cannot be counted and tested as repeatable events, they should not be used. Reading one-off stories as signals is not appropriate.
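The idea of counting and testing a signal as a repeatable event can be sketched in a few lines of Python. This is only one possible reading: it treats each occurrence of the signal as a trial, counts how often the expected outcome followed, and asks whether that hit rate is distinguishable from a coin flip with a one-sided binomial test. The function names, the minimum of 30 occurrences, and the 5% significance level are hypothetical choices for illustration.

```python
from math import comb


def binomial_p_value(hits, trials, base_rate=0.5):
    """One-sided probability of at least `hits` successes in `trials` by chance alone."""
    return sum(
        comb(trials, k) * base_rate**k * (1 - base_rate) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def is_usable_signal(hits, trials, min_trials=30, alpha=0.05):
    """Accept a signal only if it has repeated often enough and beats chance.

    `min_trials` and `alpha` are illustrative thresholds, not a standard.
    """
    if trials < min_trials:
        return False  # too few occurrences to count as a repeatable event
    return binomial_p_value(hits, trials) < alpha


print(is_usable_signal(1, 1))    # a one-off story that happened to work
print(is_usable_signal(40, 60))  # 60 occurrences, 40 followed by the expected outcome
print(is_usable_signal(32, 60))  # 60 occurrences, hit rate close to a coin flip
```

A single vivid story fails the first check no matter how persuasive it sounds, while a signal with many occurrences still has to show a hit rate that chance alone would rarely produce.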
See our past posting on signal and noise: