The Believing Brain - Michael Shermer [179]
Here is a relatively simple example of how this method of statistical significance works in relation to the null hypothesis to answer this question: can a psychic using ESP alone determine whether a playing card from a deck is red or black? Psychics typically claim that they can do this, but in my experience what people say they can do and what they can actually do are not always the same. How can we test this claim? If we place the cards down onto a table one by one with the psychic stating either red or black for each card, how many correct hits would the psychic need in order for us to conclude that the card color determinations were not due to chance? In this scenario, the null hypothesis is that the psychic will do no better than chance, and thus to reject the null hypothesis we will need to establish a figure for the number of correct hits needed in each round. By chance, we would expect the psychic to get about half correct. In a deck of 52 cards, half of which are red and half of which are black, random guessing or flipping a coin will produce, on average, 26 correct hits.
Of course, as anyone who has flipped coins for fun knows, 10 flips do not necessarily always result in 5 heads and 5 tails. There are streaks and deviations from symmetry—6 heads and 4 tails, or 3 heads and 7 tails—all within the realm of chance. Or as anyone who has gambled at a roulette wheel knows, sometimes red comes up more than black, or vice versa, without any violations of chance and randomness. In fact, we count on such asymmetrical streaks in our betting schemes and hope that we’re disciplined enough to walk away from the table during a temporary deviation from chance in our favor before the odds swing the other way.
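To put rough numbers on that intuition, here is a minimal sketch that computes the exact chance of a given heads count in 10 flips. It assumes a fair coin and independent flips; the function name `prob_heads` is purely illustrative and does not come from the text.

```python
from math import comb

def prob_heads(n_flips, n_heads):
    """Exact chance of n_heads heads in n_flips flips of a fair coin."""
    # Assumption: each flip is independent with a 50/50 chance of heads.
    return comb(n_flips, n_heads) / 2 ** n_flips

# The perfectly "symmetrical" 5-5 split, plus the two uneven splits mentioned above.
for heads in (5, 6, 3):
    print(heads, "heads:", round(prob_heads(10, heads), 3))
# 5 heads: 0.246
# 6 heads: 0.205
# 3 heads: 0.117
```

Under these assumptions, an even 5-5 split turns up only about one time in four, so a lopsided result in a short run is the norm rather than the exception.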
So we can’t just test our psychic on one short series of card guesses, because by chance alone the psychic may be expected to get a series of hits. We need to run multiple trials, in which some rounds may result in slightly below chance (say, 22, 23, 24, or 25 hits) and other rounds may result in slightly above chance (say, 27, 28, 29, or 30 hits). The variation may be even greater and still be due to nothing but chance. What we need to determine is the number of hits at which we can confidently reject the null hypothesis. In this example, that number is 35. The psychic would need to get at least 35 correct hits out of a 52-card deck in order for us to reject the null hypothesis at the 99 percent confidence level, meaning that pure guessing would produce a score that high in fewer than 1 round out of 100. The statistical method by which this figure is derived need not concern us here.1 The point is that even though 35 out of 52 doesn’t sound like it would be that hard to obtain, in fact by chance alone it would be so unusual that we could confidently state (“at the 99 percent confidence level”) that something else besides chance was going on here.
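For readers who do want to see where a figure like 35 can come from, the following is a minimal sketch and not necessarily the method the footnote refers to. It assumes each guess is an independent 50/50 coin flip (an exact one-tailed binomial test) and ignores the fact that a real deck has exactly 26 cards of each color. The function names `tail_probability` and `critical_hits` are illustrative only.

```python
from math import comb

def tail_probability(n_cards, min_hits):
    """Chance of at least min_hits correct out of n_cards by pure guessing."""
    # Assumption: every guess is an independent 50/50 call, like a coin flip.
    favorable = sum(comb(n_cards, k) for k in range(min_hits, n_cards + 1))
    return favorable / 2 ** n_cards

def critical_hits(n_cards, alpha):
    """Smallest hit count that guessing reaches with probability <= alpha."""
    for hits in range(n_cards + 1):
        if tail_probability(n_cards, hits) <= alpha:
            return hits
    return None  # no hit count is rare enough for this alpha

# 99 percent confidence corresponds to alpha = 0.01 (one-tailed).
print(critical_hits(52, 0.01))  # prints 35
```

Under these assumptions, 34 hits would not be enough: guessing reaches 34 or more a little under 2 percent of the time, while 35 or more happens a little under 1 percent of the time.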
What might that be? It could be ESP. But it could be something else as well. Perhaps our controls were not tight enough. Maybe the psychic was getting the red/black information by some other normal (as opposed to paranormal) means of which we were not aware (such as the reflection of the card face in the table surface). Possibly the psychic was cheating, and we don’t know how. I’ve seen James Randi do this very experiment with an entire deck of cards, resulting in two perfect piles of all red and all black cards. The magician Lennart Green shuffles and scrambles a deck of cards, fumbles with them for a while as if he’s all thumbs, clumsily pushes them back together, then proceeds to deal out four winning poker hands or an entire sequence of a suit in order, all while blindfolded.2 But Randi and Green