Proofiness - Charles Seife [74]

accurate count: say you find that there are really 28 trout and 19 minnows.

This new information tells you how accurate your boat count really was. The data tell you that you did in fact overcount trout (you counted 30 from the boat, but there were really 28) and undercounted minnows (you counted 15 from the boat, but there were really 19). And now that the data tell you the nature of your measurement errors, you can correct for them. You now know that your original count of 599 trout is too large and should be adjusted downward—to about 560—to compensate for your tendency to overcount trout. Similarly, your count of 301 minnows is too small and should be adjusted upward—to about 380—to account for timid minnows that you were unable to see from the boat. Your new, adjusted total is 940 fish in the pond, about 60 percent of which are trout and 40 percent of which are minnows.
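The correction described above is simple proportional scaling, and it can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming that the error rates found in the netted-off section hold for the whole pond; the function name `adjust` is made up for this sketch.

```python
def adjust(pond_count, true_in_sample, observed_in_sample):
    """Scale a whole-pond boat count by the ratio of the accurate count
    to the boat count in the netted-off section (assumed representative)."""
    return pond_count * true_in_sample / observed_in_sample


# Netted-off section: 30 trout seen from the boat, 28 really there;
# 15 minnows seen from the boat, 19 really there.
trout = adjust(599, true_in_sample=28, observed_in_sample=30)
minnows = adjust(301, true_in_sample=19, observed_in_sample=15)

total = trout + minnows
trout_share = trout / total

print(f"trout:   {trout:.0f}")
print(f"minnows: {minnows:.0f}")
print(f"total:   {total:.0f}")
print(f"trout share: {trout_share:.0%}")
```

The unrounded results come out at roughly 559 trout and 381 minnows, which the text rounds to "about 560" and "about 380" for a total of about 940 fish.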

The new numbers aren’t perfect by any means. It’s possible that the small netted-off section of the pond was not truly representative of the entire pond. There might have been a particularly dense and hard-to-spot concentration of minnows in the area, for example.73 Also, since you’re extending your observations about a small number of fish to the entire pond, you have to worry about statistical errors that would be irrelevant in a direct count of the entire population. However, the increase in statistical error is more than compensated for by the decrease in systematic error—your measurement allows a dramatic reduction in the problems caused by miscounting certain segments of the population. In short, you’re trading large, systematic errors for (hopefully) smaller, mostly statistical errors—and the result is a better, more accurate count.

This is sampling in a nutshell. By looking extremely carefully at a sample of the population, the Census Bureau can generate data that allow it to correct for the systematic undercounts and overcounts in the census. From a statistician’s point of view, it’s a no-brainer. A corrected count would produce a much more accurate depiction of the population of the United States than a count-every-head census ever could. Instead of having censuses that are good to within a few percent, it would be possible to reduce the errors to a fraction of a percent. The most accurate tally of the population of the United States would not come from a straight head count; instead, it should be a census that is corrected by sampling. As an added bonus, a census that uses sampling is cheaper than a straight head count. Instead of spending billions of dollars to try to chase down that recalcitrant last few percent who don’t respond to census workers, the bureau can spend a few tens of millions doing the same thing, even more exhaustively, in a small number of communities and use those data to correct for the undercount. Sampling is more accurate and it’s cheaper. So every politician should be in favor of it, right?

Not quite. Unfortunately, sampling is caught up in the racial politics of voter suppression. The citizens who tend to be undercounted by the census tend to be poorer people who rent their homes rather than own them. A disproportionate number don’t speak English and are distrustful of government authorities (including the Census Bureau). They tend to be minorities—and they tend to vote Democratic. Conversely, the overcounted tend to be white and affluent, and are more likely than not to vote Republican. If the United States were a pond, minorities would be the minnows while whites would be the trout. The moment you use sampling to correct for the undercount, you suddenly add several million more minorities—Democrats—into your count of the population. It’s something that Republicans want to prevent so badly that they are forced to take an idiotic stance: they insist the proper way to conduct a census is the least accurate and most expensive method.74

The Census Bureau was reduced to reporting two population numbers to Congress every decade: a sampling-corrected number that statisticians and population experts use because
