Drunkard's Walk - Leonard Mlodinow [71]
It is one thing to suspect that archers and astronomers, chemists and marketers, encounter the same error law; it is another to discover the specific form of that law. Driven by the need to analyze astronomical data, scientists like Daniel Bernoulli and Laplace postulated a series of flawed candidates in the late eighteenth century. As it turned out, the correct mathematical function describing the error law—the bell curve—had been under their noses the whole time. It had been discovered in London in a different context many decades earlier.
OF THE THREE PEOPLE instrumental in uncovering the importance of the bell curve, its discoverer is the one who least often gets the credit. Abraham De Moivre’s breakthrough came in 1733, when he was in his mid-sixties, and wasn’t made public until his book The Doctrine of Chances came out in its second edition five years later. De Moivre was led to the curve while searching for an approximation to the numbers that inhabit the regions of Pascal’s triangle far beneath the place where I truncated it, hundreds or thousands of lines down. In order to prove his version of the law of large numbers, Jakob Bernoulli had had to grapple with certain properties of the numbers that appeared in those lines. The numbers can be very large—for instance, one coefficient in the 200th row of Pascal’s triangle has fifty-nine digits! In Bernoulli’s day, and indeed in the days before computers, such numbers were obviously very hard to calculate. That’s why, as I said, Bernoulli proved his law of large numbers employing various approximations, which diminished the practical usefulness of his result. With his curve, De Moivre was able to make far better approximations to the coefficients and therefore greatly improve on Bernoulli’s estimates.
The approximation De Moivre derived is evident if, as I did for the registration cards, you represent the numbers in a row of the triangle by the height of the bars on a bar graph. For instance, the three numbers in the third line of the triangle are 1, 2, 1. In their bar graph the first bar rises one unit; the second is twice that height; and the third is again just one unit. Now look at the five numbers in the fifth line: 1, 4, 6, 4, 1. That graph will have five bars, again starting low, rising to a peak at the center, and then falling off symmetrically. The coefficients very far down in the triangle lead to bar graphs with very many bars, but they behave in the same manner. The bar graphs in the case of the 10th, 100th, and 1,000th lines of Pascal’s triangle are shown in chapter 7.
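For readers who want to check these rows for themselves, the following is a minimal Python sketch, not anything from De Moivre or Bernoulli. The helper name `pascal_row` is made up for illustration, and it assumes the common convention in which row n holds the coefficients C(n, 0) through C(n, n), so that the text's "third line" (1, 2, 1) is row 2 and its "fifth line" is row 4.

```python
import math

def pascal_row(n):
    """Return row n of Pascal's triangle, i.e. C(n, 0) ... C(n, n).

    Hypothetical helper; row 0 is the single 1 at the top of the triangle.
    """
    return [math.comb(n, k) for k in range(n + 1)]

# The two short rows used as bar-graph examples in the text.
third_line = pascal_row(2)   # [1, 2, 1]
fifth_line = pascal_row(4)   # [1, 4, 6, 4, 1]

# The largest coefficient in row 200 is the central one, C(200, 100).
largest = max(pascal_row(200))
digits = len(str(largest))   # 59

if __name__ == "__main__":
    print(third_line, fifth_line)
    print(f"largest entry in row 200 has {digits} digits")
```

Under the other counting convention, in which the 200th line is row 199, the largest entry is half as big and still has fifty-nine digits, so the claim holds either way.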
If you draw a curve connecting the tops of all the bars in each bar graph, it will take on a characteristic shape, a shape approaching that of a bell. And if you smooth the curve a bit, you can write a mathematical expression for it. That smooth bell curve is more than just a visualization of the numbers in Pascal’s triangle; it is a means for obtaining an accurate and easy-to-use estimate of the numbers that appear in the triangle’s lower lines. This was De Moivre’s discovery.
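One way to see how well the smooth curve estimates the numbers is to compare the two directly. The sketch below is a modern restatement under stated assumptions, not De Moivre's own derivation: it scales a normal curve with mean n/2 and variance n/4 by 2^n, the sum of the entries in row n. The function names `bell_estimate` and `relative_error` are invented for this illustration.

```python
import math

def bell_estimate(n, k):
    """Estimate C(n, k) from a bell curve.

    Assumption: the curve is a normal density with mean n/2 and
    variance n/4, scaled by 2**n (the sum of the entries in row n).
    """
    mean = n / 2
    variance = n / 4
    density = math.exp(-((k - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )
    return 2 ** n * density

def relative_error(n, k):
    """Relative gap between the bell-curve estimate and the exact value."""
    exact = math.comb(n, k)
    return abs(bell_estimate(n, k) - exact) / exact

if __name__ == "__main__":
    # Compare at the center of rows 10, 100, and 1,000.
    for n in (10, 100, 1000):
        print(n, math.comb(n, n // 2), relative_error(n, n // 2))
```

At the center of the row the relative error shrinks as the row number grows, which is why the estimate is most useful exactly where the exact numbers are hardest to compute by hand.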
Today the bell curve is usually called the normal distribution and sometimes the Gaussian distribution (we’ll see later where that term originated). The normal distribution is actually not a fixed curve but a family of curves, in which each depends on two parameters to set its specific position and shape. The first parameter determines where its peak is located, which is at 5, 50, and 500