Academic Legal Writing - Eugene Volokh [92]
You could try to get consumption data by surveying the public, but people may not know for sure just how much ice cream they've eaten, and might not be entirely candid even if they did know. So if you want to know how much ice cream people are eating, your best bet is probably to look at the production information. It's not perfect, but it probably isn't bad, and it's better than the alternatives.
But some such inferences are more dangerous. For instance, say you read in some article that there were 2.15 million burglaries in the U.S. in 2002. Sounds good, but of course you check the original source rather than relying on the article, and it turns out the original source is the FBI's Uniform Crime Reports, which reports information on burglaries that were reported to the police.51
Your intermediate source thus took one variable (burglaries reported to the police) and reported it as something else (burglaries actually committed). It seems likely, though, that only about two thirds of all burglaries are reported to the police. The National Crime Victimization Survey, which is based on surveys of victims rather than on police data, reports an estimated 3.05 million burglaries in 2002.52 Surveys, even ones conducted as well as the NCVS is, have their own problems; but they're probably more reliable measures of actual burglaries than the UCR, which only measures reported burglaries. (The UCR is seen as a fairly reliable measure of changes in crime rates over time, because people assume that the underreporting rate will be fairly similar each year, though that may not always be an accurate assumption.)
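To see how far apart the two measures are, you can do the arithmetic yourself. The following is only a rough sketch in Python using the two 2002 figures quoted above; the variable and function names are made up for illustration, and the calculation assumes the two counts are directly comparable, which is itself an inference worth questioning.

```python
# Figures quoted in the text for U.S. burglaries in 2002, in millions.
ucr_reported = 2.15    # burglaries reported to the police (UCR)
ncvs_estimated = 3.05  # burglaries estimated from victim surveys (NCVS)


def reporting_share(reported, estimated):
    """Fraction of the estimated incidents that appear in police reports.

    Hypothetical helper: it treats the two counts as measuring the same
    set of events, which is an assumption, not an established fact.
    """
    return reported / estimated


share = reporting_share(ucr_reported, ncvs_estimated)
undercount = ncvs_estimated - ucr_reported

print(f"Share of estimated burglaries reported to police: {share:.0%}")
print(f"Burglaries missing from the reported count: {undercount:.2f} million")
```

On these numbers the reported figure captures roughly 70% of the survey estimate, which is in line with the "about two thirds" approximation, and an article that treated the reported figure as the actual figure would be off by about 0.9 million burglaries.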
So, when reading sources, look closely at exactly what variable the original study measured (for instance, ice cream production or reported burglaries), and be skeptical of inferences from that variable to any other variable (ice cream consumption or actual burglaries). Sometimes, you might need to draw that inference—sometimes, the variable you're looking for just isn't measured directly, so you must infer it from other measurements. But recognize that you are drawing that inference.
And again, when making such inferences yourself, make clear to your readers what variable the data actually measures, and explain why it's proper to infer that the variable you're interested in is really going to be roughly the same as the variable you're measuring.
4. A summary plus an exercise
A brief summary, through an example: Say you're arguing for a proposed federal law, and you cite a study showing that when a similar state law was enacted in Ohio in 1991, robbery arrests fell by 25% in the following year. When you make this argument, you're implicitly making three assumptions:
a. The data is generalizable over time and space: You're assuming that results from Ohio in 1991–92 are generalizable to the whole country in the years after the federal law would take effect. Differences among states and changes over time may make this assumption incorrect.
b. The data shows causation and not just correlation: You're assuming that arrests fell as a result of the law. That might be true, but it might be a coincidence: arrest rates might have fallen because crime rates were generally falling, or because some other crime-reducing measure was implemented at the same time.
c. The data is generalizable from the measured variable to the important variable: You're assuming that a decline in arrests reflects a decline in the crime rate, since presumably the goal of the law is to cut crime (the important variable), and not just to cut arrests (the measured variable). A declining arrest rate doesn't necessarily mean a declining crime rate: maybe there was a surge in some other kind of crime, which caused the police to pay less attention to this crime; maybe police practices changed in some other way; maybe the law discouraged people from reporting the crime. This assumption is easy to miss, because the two terms (arrest rate and crime rate) sound similar, though