As I have said earlier, the world, epistemologically, is literally a different place to a bottom-up empiricist. We don’t have the luxury of sitting down to read the equation that governs the universe; we just observe data and make an assumption about what the real process might be, and “calibrate” by adjusting our equation in accordance with additional information. As events present themselves to us, we compare what we see to what we expected to see. It is usually a humbling process, particularly for someone aware of the narrative fallacy, to discover that history runs forward, not backward. As much as one thinks that businessmen have big egos, these people are often humbled by reminders of the differences between decision and results, between precise models and reality.
What I am talking about is opacity, incompleteness of information, the invisibility of the generator of the world. History does not reveal its mind to us—we need to guess what’s inside of it.
From Representation to Reality
The above idea links all the parts of this book. While many study psychology, mathematics, or evolutionary theory and look for ways to take it to the bank by applying their ideas to business, I suggest the exact opposite: study the intense, uncharted, humbling uncertainty in the markets as a means to get insights about the nature of randomness that is applicable to psychology, probability, mathematics, decision theory, and even statistical physics. You will see the sneaky manifestations of the narrative fallacy, the ludic fallacy, and the great errors of Platonicity, of going from representation to reality.
When I first met Mandelbrot I asked him why an established scientist like him who should have more valuable things to do with his life would take an interest in such a vulgar topic as finance. I thought that finance and economics were just a place where one learned from various empirical phenomena and filled up one’s bank account with f*** you cash before leaving for bigger and better things. Mandelbrot’s answer was, “Data, a gold mine of data.” Indeed, everyone forgets that he started in economics before moving on to physics and the geometry of nature. Working with such abundant data humbles us; it provides the intuition of the following error: traveling the road between representation and reality in the wrong direction.
The problem of the circularity of statistics (which we can also call the statistical regress argument) is as follows. Say you need past data to discover whether a probability distribution is Gaussian, fractal, or something else. You will need to establish whether you have enough data to back up your claim. How do we know if we have enough data? From the probability distribution—a distribution does tell you whether you have enough data to “build confidence” about what you are inferring. If it is a Gaussian bell curve, then a few points will suffice (the law of large numbers once again). And how do you know if the distribution is Gaussian? Well, from the data. So we need the data to tell us what the probability distribution is, and a probability distribution to tell us how much data we need. This causes a severe regress argument.
This regress does not occur if you assume beforehand that the distribution is Gaussian. It happens that, for some reason, the Gaussian yields its properties rather easily. Extremistan distributions do not do so. So selecting the Gaussian while invoking some general law appears to be convenient. The Gaussian is used as a default distribution for that very reason. As I keep repeating, assuming its application beforehand may work with a small number of fields such as crime statistics, mortality rates, matters from Mediocristan. But not for historical data of unknown attributes and not for matters from Extremistan.
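The asymmetry between the two regimes can be made concrete with a small simulation. This is an illustration of my own, not the author's: the sample size, seed, and the Pareto tail exponent of 1.1 are assumptions chosen for the sketch. Under a Gaussian, no single observation carries a meaningful share of the total, so even a modest sample "yields its properties" quickly; under a scalable distribution, one draw can dwarf everything seen so far, and no sample ever feels like enough.

```python
import random

random.seed(42)

def max_share(sample):
    """Fraction of the sample's total contributed by its single largest value."""
    return max(sample) / sum(sample)

n = 10_000
# Mediocristan stand-in: absolute values of Gaussian draws.
# No observation dominates; the sample settles down fast.
gauss = [abs(random.gauss(0, 1)) for _ in range(n)]

# Extremistan stand-in: Pareto draws with tail exponent 1.1 (assumed).
# A single observation can carry a large share of the whole sum.
pareto = [random.paretovariate(1.1) for _ in range(n)]

print(f"Gaussian, largest point's share of total: {max_share(gauss):.5f}")
print(f"Pareto,   largest point's share of total: {max_share(pareto):.5f}")
```

Run it a few times with different seeds: the Gaussian share barely moves, while the Pareto share swings wildly — which is exactly why the regress bites only in Extremistan.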
Now, why aren’t statisticians who work with historical data aware of this problem? First, they do not like to hear that their entire business has been invalidated by the problem of induction. Second, they are not confronted with the results of their predictions in rigorous ways. As we saw with the Makridakis competition, they are grounded in the narrative fallacy, and they do not want to hear it.
ONCE AGAIN, BEWARE THE FORECASTERS
Let me take the problem one step higher up. As I mentioned earlier, plenty of fashionable models attempt to explain the genesis of Extremistan. In fact, they are grouped into two broad classes, but there are occasionally more approaches. The first class includes the simple rich-get-richer (or big-get-bigger) style model that is used to explain the lumping of people around cities, the market domination of Microsoft and VHS (instead of Apple and Betamax), the dynamics of academic reputations, etc. The second class concerns what are generally called “percolation models,” which address not the behavior of the individual, but rather the terrain in which he operates. When you pour water on a porous surface, the structure of that surface matters more than does the liquid. When a grain of sand hits a pile of other grains of sand, how the terrain is organized is what determines whether there will be an avalanche.
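A minimal sketch of the first class, a Simon-style rich-get-richer process, shows how lopsided outcomes emerge from a trivially simple rule. The parameters here (20,000 arrivals, a 5 percent chance that an arrival founds a new cluster) are my illustrative assumptions, not anything from the literature: each newcomer either starts a new cluster or joins an existing one with probability proportional to its current size.

```python
import random
from collections import Counter

random.seed(7)

def rich_get_richer(steps, p_new=0.05):
    """Simon-style process: each arrival founds a new cluster with
    probability p_new, otherwise joins an existing cluster chosen with
    probability proportional to its current size."""
    # One token per inhabitant; the value is the cluster it belongs to.
    # Picking a token uniformly at random lands in big clusters more often,
    # which implements size-proportional attachment in O(1) per step.
    owners = [0]
    n_clusters = 1
    for _ in range(steps):
        if random.random() < p_new:
            owners.append(n_clusters)              # found a new cluster
            n_clusters += 1
        else:
            owners.append(random.choice(owners))   # join, rich-get-richer style
    return sorted(Counter(owners).values(), reverse=True)

sizes = rich_get_richer(20_000)
print("number of clusters:", len(sizes))
print("five largest:", sizes[:5])
print("median size:", sizes[len(sizes) // 2])
```

The early entrants end up gigantic while the median cluster stays tiny — Microsoft and VHS on one end, everyone else on the other.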
Most models, of course, attempt to be precisely predictive, not just descriptive; I find this infuriating. They are nice tools for illustrating the genesis of Extremistan, but I insist that the “generator” of reality does not appear to obey them closely enough to make them helpful in precise forecasting. At least to judge by anything you find in the current literature on the subject of Extremistan. Once again we face grave calibration problems, so it would be a great idea to avoid the common mistakes made while calibrating a nonlinear process. Recall that nonlinear processes have greater degrees of freedom than linear ones (as we saw in Chapter 11), with the implication that you run a great risk of using the wrong model. Yet once in a while you run into a book or article advocating the application of models from statistical physics to reality. Beautiful books like Philip Ball’s illustrate and inform, but they should not lead to precise quantitative models. Do not take them at face value.
But let us see what we can take home from these models.
Once Again, a Happy Solution
First, in assuming a scalable, I accept that an arbitrarily large number is possible. In other words, inequalities should not stop above some known maximum bound.
Say that the book The Da Vinci Code sold around 60 million copies. (The Bible sold about a billion copies but let’s ignore it and limit our analysis to lay books written by individual authors.) Although we have never known a lay book to sell 200 million copies, we can consider that the possibility is not zero. It’s small, but it’s not zero. For every three Da Vinci Code–style bestsellers, there might be one superbestseller, and though one has not happened so far, we cannot rule it out. And for every fifteen Da Vinci Codes there will be one superbestseller selling, say, 500 million copies.
Apply the same logic to wealth. Say the richest person on earth is worth $50 billion. There is a nonnegligible probability that next year someone with $100 billion or more will pop out of nowhere. For every three people with more than $50 billion, there could be one with $100 billion or more. There is a much smaller probability of there being someone with more than $200 billion—one third of the previous probability, but nevertheless not zero. There is even a minute, but not zero, probability of there being someone worth more than $500 billion.
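The arithmetic behind this can be sketched in a few lines. The tail exponent below is not an empirical estimate; it is simply back-solved from the illustrative ratio in the text (doubling the wealth threshold divides the tail probability by three), so treat it as an assumption for the sketch. The defining property of a scalable tail is that the ratio P(X > kx) / P(X > x) depends only on k, never on x — which is why the "one in three" keeps applying no matter how far out you go.

```python
import math

# Back out the tail exponent from the text's illustrative ratio:
# doubling the threshold divides the tail probability by three,
# so 2**(-alpha) = 1/3  =>  alpha = log(3) / log(2) ~ 1.585 (assumed).
alpha = math.log(3) / math.log(2)

def tail_ratio(x_hi, x_lo, alpha=alpha):
    """P(X > x_hi) / P(X > x_lo) for a scalable (Pareto-type) tail."""
    return (x_hi / x_lo) ** -alpha

print(tail_ratio(100, 50))   # $100bn vs $50bn: one in three
print(tail_ratio(200, 100))  # $200bn vs $100bn: one in three again
print(tail_ratio(500, 50))   # $500bn vs $50bn: minute, but not zero
```

Contrast this with a Gaussian, where each further doubling of the threshold makes the ratio collapse toward zero: there, the "invisible superbestseller" really is impossible for practical purposes; in Extremistan it is merely absent from the data so far.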
This tells me the following: I can make inferences about things that I do not see in my data, but these things should still belong to the realm of possibilities. There is an invisible bestseller out there, one that is absent from the past data but that you need to account for. Recall my point in Chapter 13: it makes investment in a book or a drug better than statistics on past data might suggest. But it can make stock market losses worse than what the past shows.