Proofiness - Charles Seife
Psssssht . . . you could almost hear the economists’ egos deflating when the election returns came in. Fair, humbled, tweaked his equation to correct for the mistakes he had made. In a paper published in mid-1996, just before the next presidential election, he gently ventured another prediction: “The basic story from the equation is that the 1996 election will be close with a slight edge for the Republicans.” Whoops. Clinton defeated Dole almost as soundly, vote-wise, as Reagan beat Carter.
The problem is that Fair’s equation was a regression to the moon. It was an elaborate model that found a pattern in the data, but the pattern was all but meaningless. (What success Fair did have boiled down to the commonsense dictum that incumbents benefit from a good economy.) The formula did a great job explaining past elections, but it was pretty hopeless when it came to predicting future ones: a sure sign of a faux pattern. Almost all electoral predictions have the same problems; year after year, economists and other experts line up with their regression models and make predictions that as often as not are dead wrong. On a slow news day, they even make it onto the front page of a major paper: “It’s not even going to be close,” an economist trumpeted from page A1 of the Washington Post in 2000—Gore would win 56.2 percent of the votes cast for the two main candidates.25 Yeah, right.
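The failure Seife describes here — an equation that "explains" the past perfectly yet predicts nothing — is what statisticians call overfitting. A minimal sketch of the effect (my illustration, not the book's; the data and variable names are invented): fit a flexible-enough curve to pure noise and it will match the old data almost exactly while failing badly on new data.

```python
# Illustrative sketch of a "regression to the moon" (overfitting):
# a degree-7 polynomial through 8 points of pure noise fits the
# "past" perfectly but is useless on the "future".
import numpy as np

rng = np.random.default_rng(0)
x_train = np.arange(8, dtype=float)    # eight "past elections"
y_train = rng.normal(size=8)           # pure noise: no real pattern to find

# Eight parameters, eight points: the curve can thread every point.
fit = np.poly1d(np.polyfit(x_train, y_train, deg=7))

train_err = np.max(np.abs(fit(x_train) - y_train))   # essentially zero

x_new = np.arange(8, 12, dtype=float)  # four "future elections"
y_new = rng.normal(size=4)
test_err = np.max(np.abs(fit(x_new) - y_new))        # wildly wrong

print(f"error on past data:   {train_err:.2e}")
print(f"error on future data: {test_err:.2e}")
```

The in-sample error is near machine precision while the out-of-sample error explodes — exactly the signature Seife flags: great at explaining past elections, hopeless at predicting future ones.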
The prize for regression silliness, though, has to go to the academics who crank out equations or formulae for everything under the sun, whether or not there’s anything mathematically valid to make a formula about. It’s a favorite pastime of attention-seeking pundits, as the media seem to gobble up these phony formulas without even a little bit of skepticism that might give them indigestion. In 2003, the BBC trumpeted a formula that philosophers have been seeking for years: the formula for happiness. This formula is simply:
Happiness = P + (5 × E) + (3 × H)
This equation is supposed to make sense when you know what the variables mean. P is “Personal Characteristics”—you get a high score if you have an optimistic outlook. E is “Existence” and reflects your health. H stands for your “Higher-Order Needs,” such as the ego-stroking you get from fawning news organizations when you get them to publish your claptrap. The formula’s obviously nonsense; these ideas are not quantifiable—there is no way to measure P or E or H—so it’s a garbage equation that feeds on Potemkin numbers. This sort of equation is the lowest of all the varieties of regression to the moon; it isn’t even a good-faith attempt to try to explain data. Yet these phony equations emerge from the swamps of academia quite regularly. Want to know the most depressing day of the year? Use the formula that came from Cardiff University:
Misery = [W + (D − d)] × T^Q / (M × NA)
where W is weather, D is debt, M is motivation, NA is the need to take action, and so forth and so on.26 In 2005, when the formula was first revealed, it proved—scientifically—that the most miserable day of the year was January 24.
There are many ways to generate numerical falsehoods from data, many ways to create proofiness from even valid measurements. Causuistry distorts the relationship between two sets of numbers. Randumbness creates patterns where none are