Proofiness - Charles Seife [39]
The more people in your sample, the less likely you are to have weird random encounters like this that invalidate your data. If you poll ten people instead of one, a single sheep’s-blood fanatic won’t throw off the results as dramatically. Better yet, if you survey a hundred people, a single oddball will have a negligible effect on the outcome of the poll. So the larger the sample, the less error there is due to random weirdness.
In fact, random weirdness is a law of nature; no matter how careful you are in picking your sample, you’re going to get strange events. Since this random weirdness throws off the accuracy of your poll, it is a source of imprecision dubbed statistical error. Luckily, there are mathematical and statistical equations that quantify this weirdness; they dictate what level of weirdness (and thus statistical error) you expect to get in a sample of a given size. In other words, there’s a strict relationship between sample size and statistical error. The bigger the sample, the smaller the statistical error.36 And though this might seem like hand-waving mumbo-jumbo, statistical error is a very real and important phenomenon that is grounded in fundamental mathematical laws of probability. Statistical error is a verifiable source of error in every poll, and it’s a consequence of the randomness of nature, and a function of the size of the sample.37
The concept of margin of error relays this bizarre imprecision caused by randomness. By convention, it is a number larger than the imprecision caused by randomness 95 percent of the time. This is almost impossible to understand at first, so let’s use an example.
Say that a poll finds that 64 percent of Britons prefer tea to coffee. The pollster knows that the randomness of the universe—statistical error—might mess up the result of the poll; the real answer might not actually be 64 percent, but instead 62 percent or 66 percent or even 93 percent if there was a particularly weird random event that messed up the sample. When the pollster says that the margin of error is 3 percent, she is expressing confidence that the randomness of the universe can only mess up the answer by three percentage points up or down—that the real answer is somewhere between 61 percent and 67 percent. However, this confidence isn’t absolute. Randomness is, well, random, and sometimes a quirky and unlikely set of events can throw off the result of the poll by more than 3 percent. However, something so bizarre can occur only fairly rarely; only one in twenty polls like this can suffer from a strange event that messes up the result by more than 3 percent. Most of the time—in nineteen out of twenty polls like this—the randomness of the universe screws up the poll’s answer by no more than 3 percent.
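The nineteen-in-twenty claim can be checked with a small simulation. The sketch below is not from the book. It assumes a simple random sample of 1,000 respondents (a hypothetical size that happens to give a margin of roughly three points), a true tea preference of exactly 64 percent, and nothing but random chance acting on the result. The function name simulate_polls and all of its parameters are made up for illustration.

```python
import random

def simulate_polls(true_share=0.64, sample_size=1000, margin_points=30,
                   num_polls=2000, seed=1):
    """Return the fraction of simulated polls that land within the margin.

    margin_points is counted in respondents: 30 out of 1,000 is
    3 percentage points. All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    expected_count = round(true_share * sample_size)  # 640 tea drinkers on average
    hits = 0
    for _ in range(num_polls):
        # Each respondent independently prefers tea with probability true_share.
        tea_count = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(tea_count - expected_count) <= margin_points:
            hits += 1
    return hits / num_polls

coverage = simulate_polls()
print(f"{coverage:.1%} of simulated polls fell within 3 points of the truth")
```

Under these assumptions the printed share should sit close to 95 percent, which is the "nineteen out of twenty" of the text; the remaining polls are the rare ones that randomness pushes outside the margin.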
Still with me? If you don’t understand it fully, don’t worry. The margin of error is a really hard concept to wrap your head around, and many journalists, even those who regularly report on polls, don’t get it. There are two important things to remember about the margin of error. First, the margin of error reflects the imprecision in a poll caused by statistical error—it is an unavoidable consequence of the randomness of nature. Second, the margin of error is a function of the size of the sample—the bigger the sample, the smaller the margin of error. In fact, the margin of error can be considered pretty much as nothing more than an expression of how big the sample is.38
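To see how tightly the margin of error is tied to sample size, here is a minimal sketch using the standard textbook approximation for a simple random sample: about 1.96 times the square root of p(1 - p)/n at the 95 percent level. The book does not print this formula in the passage above, so treat it as an assumption brought in for illustration; the function name margin_of_error and the worst-case default p = 0.5 are likewise choices made here, not the author's.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95 percent margin of error for a simple random sample.

    share=0.5 is the worst case (it gives the widest margin); z=1.96 is the
    usual multiplier for 95 percent confidence under a normal approximation.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(share * (1.0 - share) / sample_size)

# The margin depends only on how many people were polled.
for n in (10, 100, 1000, 10000):
    print(f"sample of {n:>5}: margin of about {margin_of_error(n) * 100:.1f} points")
```

Because the sample size sits under a square root, quadrupling the number of respondents only halves the margin, which is why pollsters rarely go far beyond a thousand or so people.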
The margin of error is a direct result of the mathematical laws of probability and randomness. It describes a fundamental limitation to the precision of a poll, an unavoidable statistical error that faces pollsters when they use a sample of people to intuit the beliefs of an entire population. There’s no way to get around it; the moment a pollster makes a leap of faith and assumes that a sample of people has the same predilections as the entire