Good and evil, art and science, nature and nurture – people seem to thrive on false dichotomies. Of course the concepts in these pairs differ; it’s just that they rarely, if ever, occur in a pure state. Nevertheless, we get a constant stream of oversimplified stories such as humans beating computers in the stock market or big data analysts abandoning causation in their pursuit of correlation. Trust me: the humans who are “beating” computers are actually using computers themselves, just as surely as any big data analysts who set aside theories of causation won’t be big data analysts for very long.
Both fundamental and technical traders (another false dichotomy, since anyone who isn’t at least a little bit of both is eventually doomed) rely heavily on computers; it would be virtually impossible to do otherwise these days, and that’s fine: computers are great at crunching numbers. In the case of big data, the end result of all that crunching can be some seemingly relevant correlations. But it’s as pointless to look at a correlation without a notion of causation as it would be to trade stocks without using computers. Fortunately, one thing humans still do far better than machines is to create stories.
To be sure, sometimes our reflexive need to wrap data in a narrative does us a disservice: it leads us to see patterns that aren’t there as well as overlook patterns that are (Kahneman, Mandelbrot and Taleb are particularly helpful on this point if you haven’t read them). And, of course, it is this very tendency that gives rise to false dichotomies (since they make stories easier to construct and share). But I think we’re at the peak of overcorrection when we try to take people out of the process. Without the people we don’t have a story, without the story we don’t have a context, and without the context those data don’t have useful meaning.
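The “patterns that aren’t there” problem is easy to demonstrate. As a sketch of my own (not an example from the article), the following generates many short series of pure noise and then scans every pair for a correlation; with enough comparisons, a “strong” relationship almost always turns up by chance alone:

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(42)

# Fifty series of twenty observations each -- all independent Gaussian noise,
# so any correlation between two of them is an illusion.
series = [[random.gauss(0, 1) for _ in range(20)] for _ in range(50)]

# Scan all ~1,200 pairs for the strongest apparent relationship.
best = max(
    abs(pearson(series[i], series[j]))
    for i in range(len(series))
    for j in range(i + 1, len(series))
)
print(f"strongest correlation found in pure noise: {best:.2f}")
```

The strongest pairwise correlation in this run comes out well above 0.5 even though every series is noise, which is why a correlation with no causal story behind it deserves suspicion.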
This need for meaning isn’t an existential or spiritual matter; it’s a purely practical one. It tells you which data are important and how those data need to be captured. Our company recently diagnosed a problem for a client that had discovered a historical correlation in their data that was failing to translate into ongoing business results. When we pressed them to tell us the story they thought they were “hearing” from the data and then tried to retell that story using the job sequence and data fields in their system, we identified two problems that were interacting – one of which was a simple choice about how they created their time code field. It wasn’t “wrong” in any objective sense, but it was completely wrong in the context of the story.
We know from the centuries of progress tied to the scientific method (somewhat formalized and socialized by Roger Bacon in the 13th century but nascent long before even then) that having a tentative “story” in mind can help you test data; if you form your story only after seeing the data, then you need to get more data before you can conclude anything. And the process is never really over. People who really understand the value of working hypotheses know they are always working on new ones.
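That discipline – form the story first, then test it on data the story has never seen – can be sketched in a few lines. This is my illustration, not the author’s example: a relationship “discovered” in one batch of data stays a working hypothesis until it holds up out of sample.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(7)

# A genuine relationship: y depends on x, plus noise.
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.8 * xi + random.gauss(0, 0.5) for xi in x]

# Form the "story" on the first half of the data only...
r_discovery = pearson(x[:100], y[:100])

# ...then test that story on data it has never seen.
r_holdout = pearson(x[100:], y[100:])

print(f"discovery r = {r_discovery:.2f}, hold-out r = {r_holdout:.2f}")
```

Here the correlation survives the hold-out test because the relationship is real; a pattern dredged from noise, like the ones in the earlier example, would typically evaporate at this step. And since the process is never really over, today’s hold-out data become tomorrow’s discovery data for the next working hypothesis.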