Complexity_ A Guided Tour - Melanie Mitchell [103]
Robert Axelrod. (Photograph courtesy of the Center for the Study of Complex Systems, University of Michigan.)
Since an effective central government seems out of reach, Axelrod wondered if and how cooperation could come about without one. He believed that exploring strategies for playing the simple repeated Prisoner’s Dilemma game could give insight into this question. For Axelrod, “cooperation coming about” meant that cooperative strategies must, over time, receive higher total payoff than noncooperative ones, even in the face of any changes opponents make to their strategies as the repeated game is being played. Furthermore, if the players’ strategies are evolving under Darwinian selection, the fraction of cooperative strategies in the population should increase over time.
COMPUTER SIMULATIONS OF THE PRISONER’S DILEMMA
Axlerod’s interest in determining what makes for a good strategy led him to organize two Prisoner’s Dilemma tournaments. He asked researchers in several disciplines to submit computer programs that implemented particular strategies for playing the Prisoner’s Dilemma, and then had the programs play repeated games against one another.
Recall from my discussion of Robby the Robot in chapter 9 that a strategy is a set of rules that gives, for any situation, the action one should take in that situation. For the Prisoner’s Dilemma, a strategy consists of a rule for deciding whether to cooperate or defect on the next turn, depending on the opponent’s behavior on previous turns.
The first tournament received fourteen submitted programs; the second tournament jumped to sixty-three programs. Each program played with every other program for 200 turns, collecting points according to the payoff matrix in figure 14.3. The programs had some memory—each program could store the results of at least some of its previous turns against each opponent. Some of the strategies submitted were rather complicated, using statistical techniques to characterize other players’ “psychology.” However, in both tournaments the winner (the strategy with the highest average score over games with all other players) was the simplest of the submitted strategies: TIT FOR TAT. This strategy, submitted by mathematician Anatol Rapoport, cooperates on the first turn and then, on subsequent turns, does whatever the opponent did for its move on the previous turn. That is, TIT FOR TAT offers cooperation and reciprocates it. But if the other player defects, TIT FOR TAT punishes that defection with a defection of its own, and continues the punishment until the other player begins cooperating again.
It is surprising that such a simple strategy could beat out all others, especially in light of the fact that the entrants in the second tournament already knew about TIT FOR TAT so could plan against it. However, out of the dozens of experts who participated, no one was able to design a better strategy.
Axelrod drew some general conclusions from the results of these tournaments. He noted that all of the top scoring strategies have the attribute of being nice—that is, they are never the first one to defect. The lowest scoring of all the nice programs was the “least forgiving” one: it starts out by cooperating but if its opponent defects even once, it defects against that opponent on every turn from then on. This contrasts with TIT FOR TAT, which will punish an opponent’s defection with a defection of its own, but will forgive that opponent by cooperating once the opponent starts to cooperate again.
Axelrod also noted that although the most successful strategies were nice and forgiving, they also were retaliatory—they punished defection soon after it occurred. TIT FOR TAT not only was nice, forgiving, and retaliatory, but it also had another