Drunkard's Walk - Leonard Mlodinow [73]
Coin toss guessing compared to stock-picking success
It is important, whenever assessing any kind of survey or poll, to realize that when it is repeated, we should expect the results to vary. For example, if in reality 40 percent of registered voters approve of the way the president is handling his job, it is much more likely that six independent surveys will report numbers like 37, 39, 39, 40, 42, and 42 than it is that all six surveys will agree that the president’s support stands at 40 percent. (Those six numbers are in fact the results of six independent polls gauging the president’s job approval in the first two weeks of September 2006.) [27] That’s why, as another rule of thumb, any variation within the margin of error should be ignored. But although The New York Times would not run the headline “Jobs and Wages Increased Modestly at 2 P.M.,” analogous headlines are common in the reporting of political polls. For example, after the Republican National Convention in 2004, CNN ran the headline “Bush Apparently Gets Modest Bounce.” [28] The experts at CNN went on to explain that “Bush’s convention bounce appeared to be 2 percentage points…. The percentage of likely voters who said he was their choice for president rose from 50 right before the convention to 52 immediately afterward.” Only later did the reporter remark that the poll’s margin of error was plus or minus 3.5 percentage points, which means that the news flash was essentially meaningless. Apparently the word apparently, in CNN-talk, means “apparently not.”
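The rule of thumb above can be checked with a few lines of arithmetic. What follows is only a rough sketch, not anything taken from the book or from CNN: it assumes a simple random sample and the usual 95 percent approximation (z = 1.96), and the sample size of 784 is a hypothetical figure chosen because it yields a margin of about 3.5 points. The function names are likewise made up for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error (as a proportion) for a simple random
    sample of size n when the observed share is p. z=1.96 corresponds to
    the conventional 95 percent level (an assumption, not from the text)."""
    return z * math.sqrt(p * (1 - p) / n)

def within_margin(before, after, margin):
    """True if the change between two readings is no larger than the margin,
    i.e. the change should be ignored under the rule of thumb."""
    return abs(after - before) <= margin

# Hypothetical sample size; the source does not state the poll's actual n.
n_hypothetical = 784
moe_points = 100 * margin_of_error(0.50, n_hypothetical)
print(f"margin of error: +/-{moe_points:.1f} points")

# The reported bounce: 50 percent before the convention, 52 percent after,
# against the stated margin of plus or minus 3.5 points.
print("bounce within margin:", within_margin(50, 52, 3.5))
```

Under these assumptions the 2-point bounce falls inside the 3.5-point margin, so the rule of thumb says to ignore it. A stricter treatment would note that the difference between two separate polls carries more uncertainty than either poll alone, which only strengthens the conclusion.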
For many polls a margin of error of more than 5 percent is considered unacceptable, yet in our everyday lives we make judgments based on far fewer data points than that. People don’t get to play 100 years of professional basketball, invest in 100 apartment buildings, or start 100 chocolate-chip-cookie companies. And so when we judge their success at those enterprises, we judge them on just a few data points. Should a football team lavish $50 million to lure a guy coming off a single record-breaking year? How likely is it that the stockbroker who wants your money for a sure thing will repeat her earlier successes? Does the success of the wealthy inventor of sea monkeys mean there is a good chance he’ll succeed with his new ideas of invisible goldfish and instant frogs? (For the record, he didn’t.) [29] When we observe a success or a failure, we are observing one data point, a sample from under the bell curve that represents the potentialities that previously existed. We cannot know whether our single observation represents the mean or an outlier, an event to bet on or a rare happening that is not likely to be reproduced. But at a minimum we ought to be aware that a sample point is just a sample point, and rather than accepting it simply as reality, we ought to see it in the context of the standard deviation or the spread of possibilities that produced it. The wine might be rated 91, but that number is meaningless if we have no estimate of the variation that would occur if the identical wine were rated again and again or by someone else.
It might help to know, for instance, that a few years back, when both The Penguin Good Australian Wine Guide and On Wine’s Australian Wine Annual reviewed the 1999 vintage of the Mitchelton Blackwood Park Riesling, the Penguin guide gave the wine five stars out of five and named it Penguin Best Wine of the Year, while On Wine rated it at the bottom of all the wines it reviewed, deeming it the worst vintage produced in a decade. [30] The normal distribution not only helps us understand such discrepancies,