Alex's Adventures in Numberland - Alex Bellos [156]
I’ve been treating the bell curve as if it is one curve, when, in fact, it is a family of curves. They all look like a bell, but some are wider than others (see diagram overleaf).
Bell curves with different deviations.
Here’s an explanation for why we get different widths. If Galileo, for example, measured planetary orbits with a twenty-first-century telescope, the margin of error would be less than if he were using his seventeenth-century one. The modern instrument would produce a much thinner bell curve than the antique one. The errors would be much smaller, yet they would still be distributed normally.
The average value of a bell curve is called the mean. The width is called the deviation. If we know the mean and the deviation, then we know the shape of the curve. It is incredibly convenient that the normal curve can be described using only two parameters. Perhaps, though, it is too convenient. Often statisticians are overly eager to find the bell curve in their data. Bill Robinson, an economist who heads KPMG’s forensic-accounting division, admits this is the case. ‘We love to work with normal distributions because [the normal distribution] has mathematical properties that have been very well explored. Once we know it’s a normal distribution, we can start to make all sorts of interesting statements.’
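The claim that the mean and the deviation fix the whole curve can be made concrete with a short sketch. The function names and the two telescope deviations below are invented for illustration; only the standard formula for the normal curve is assumed.

```python
import math

def normal_density(x, mean, deviation):
    """Height of the bell curve at x, given its mean and deviation."""
    z = (x - mean) / deviation  # distance from the mean, measured in deviations
    return math.exp(-0.5 * z * z) / (deviation * math.sqrt(2 * math.pi))

def fraction_within(k):
    """Share of a normal distribution lying within k deviations of its mean."""
    return math.erf(k / math.sqrt(2))

# Invented figures: measurement errors centred on 0 with two different deviations.
antique_deviation = 2.0
modern_deviation = 0.5

for label, deviation in [("antique", antique_deviation), ("modern", modern_deviation)]:
    peak = normal_density(0.0, 0.0, deviation)
    print(f"{label}: deviation {deviation}, peak height {peak:.3f}")

# These shares do not depend on which mean or deviation the curve has.
print(f"within 1 deviation: {fraction_within(1):.4f}")
print(f"within 2 deviations: {fraction_within(2):.4f}")
```

With a deviation four times smaller, the curve is four times taller at its peak and correspondingly thinner. Whatever the two parameters are, roughly 68 per cent of the values fall within one deviation of the mean and roughly 95 per cent within two.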
Robinson’s job, in basic terms, is to deduce, by looking for patterns in huge data sets, whether someone has been cooking the books. He is carrying out the same strategy that Poincaré used when he weighed his loaves every day, except that Robinson is looking at gigabytes of financial data, and has much more sophisticated statistical tools at his disposal.
Robinson said that his department tends to work on the assumption that for any set of data the default distribution is the normal distribution. ‘We like to assume that the normal curve operates because then we are in the light. Actually, sometimes it doesn’t, and sometimes we probably should be looking in the dark. I think in the financial markets it is true that we have assumed a normal distribution when perhaps it doesn’t work.’ In recent years, in fact, there has been a backlash in both academia and finance against the historic reliance on the normal distribution.
When a distribution is less concentrated around the mean than the bell curve it is called platykurtic, from the Greek words platus, meaning ‘flat’, and kurtos, ‘bulging’. Conversely, when a distribution is more concentrated around the mean it is called leptokurtic, from the Greek leptos, meaning ‘thin’. William Sealy Gosset, a statistician who worked for the Guinness brewery in Dublin, drew the aide-memoire below in 1927 to remember which was which: a duck-billed platypus was platykurtic, and the kissing kangaroos were leptokurtic. He chose kangaroos because they are ‘noted for “lepping”, though, perhaps, with equal reason they should be hares!’ Gosset’s sketches are the origin of the term tail for describing the far-left and far-right sections of a distribution curve.
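One way to tell a platypus from a kangaroo in numbers is the excess kurtosis: the average fourth power of the distance from the mean, divided by the square of the variance, minus 3, so that the bell curve scores zero. What follows is a rough sketch rather than a definitive recipe. The function name, the sample size and the choice of example distributions (a flat uniform one and a sharply peaked Laplace one) are assumptions made for illustration.

```python
import random

def excess_kurtosis(samples):
    """Fourth moment about the mean over squared variance, minus 3 (normal = 0)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # variance
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 100_000

# Flat-topped: uniform on [-1, 1]; theoretical excess kurtosis is -1.2.
flat = [rng.uniform(-1, 1) for _ in range(N)]
# The bell curve itself; theoretical excess kurtosis is 0.
bell = [rng.gauss(0, 1) for _ in range(N)]
# Sharply peaked with long tails: a Laplace distribution, built as the
# difference of two exponentials; theoretical excess kurtosis is +3.
peaked = [rng.expovariate(1) - rng.expovariate(1) for _ in range(N)]

for label, data in [("platykurtic", flat), ("normal", bell), ("leptokurtic", peaked)]:
    print(f"{label}: excess kurtosis {excess_kurtosis(data):+.2f}")
```

A negative result marks a platykurtic distribution and a positive one a leptokurtic distribution. Sample estimates wobble around the theoretical values, especially for the long-tailed case.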
When economists talk of distributions that are fat-tailed or heavy-tailed, they are talking of curves that stay higher above the axis at the extremes than the normal curve does, as if Gosset’s animals have larger than average tails. These curves describe distributions in which extreme events are more likely than if the distribution were normal. For instance, if the variation in the price of a share were fat-tailed, it would mean there was more of a chance of a dramatic drop, or hike, in price than if the variation were normally distributed. For this reason, it can sometimes be reckless to assume a bell curve over a fat-tailed curve. The economist Nassim Nicholas Taleb’s position in his bestselling book The Black Swan is that we have tended to underestimate the size and importance of the tails in distribution curves. He argues that the bell curve is a historically defective model because it cannot anticipate the occurrence of, or predict the impact of, very rare, extreme events – such as a major scientific discovery