A heretical view of the prisoners' dilemma
Nov. 9th, 2013 10:35 pm

Reading: Brian Hayes, "New Dilemmas for the Prisoner", American Scientist 101(6), 422--424 (2013); http://www.americanscientist.org/issues/pub/2013/6/new-dilemmas-for-the-prisoner
This article gives further weird news about an old paradox in game theory called the prisoners' dilemma. Rid of the fancy story that gave it its name, the game is as follows: Two players, who cannot communicate except thru their moves, must each simultaneously choose either to cooperate or to defect. If both cooperate, each receives a modest reward. If both defect, each receives a booby prize. If one cooperates & the other defects, the cooperator receives nothing, and the defector receives a large reward. Evidently, the sensible thing is for both players to cooperate & rake in the modest reward. Each, however, is diabolically tempted to reason as follows: whether the other player cooperates or defects, I am better off defecting (better the booby prize than nothing; better the large reward than the modest one), and so I should defect. If both fall for that (and if I do, why shouldn't you?), each ends up with the booby prize.
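The payoff structure, and the "diabolical" reasoning it invites, can be checked mechanically. Here is a minimal sketch, using the dollar amounts from the thought experiment proposed below ($0 for the sucker, $100 booby prize, $300 modest reward, $500 large reward); the table and function names are mine, not Hayes's:

```python
# Payoffs (mine, yours) for each pair of moves: C = cooperate, D = defect.
PAYOFF = {
    ("C", "C"): (300, 300),   # both cooperate: modest reward each
    ("C", "D"): (0, 500),     # I cooperate, you defect: I get nothing
    ("D", "C"): (500, 0),     # I defect, you cooperate: large reward for me
    ("D", "D"): (100, 100),   # both defect: booby prize each
}

def my_payoff(me, you):
    return PAYOFF[(me, you)][0]

# The tempting argument: whatever the other player does, defecting pays more...
for you in ("C", "D"):
    assert my_payoff("D", you) > my_payoff("C", you)

# ...and yet mutual cooperation beats mutual defection.
assert my_payoff("C", "C") > my_payoff("D", "D")
```

Both assertions hold at once, which is the whole dilemma: defection strictly dominates for each player individually, while the outcome it produces is worse for both.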
A vast amount of experimental & theoretical research, some of it rather funny, has been done on iterations of this game, either between the same two players, or between various pairs of players in a group who can remember each other's past behavior. Thus, defectors can be chastened by the mistrust of others, and it turns out that, in various models, a stable pattern of cooperation may emerge. This line of investigation is interesting in that it suggests how cooperation of various kinds might have evolved within the Darwinian struggle for life. The article mentioned contains some odd surprises.
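One standard strategy from that iterated literature, tit for tat (cooperate first, then copy the opponent's last move), illustrates how memory changes the picture. A minimal sketch, with the same payoffs as above and strategy names of my own choosing:

```python
# Iterated prisoners' dilemma sketch: tit for tat vs. an unconditional defector.
PAYOFF = {("C", "C"): (300, 300), ("C", "D"): (0, 500),
          ("D", "C"): (500, 0), ("D", "D"): (100, 100)}

def tit_for_tat(opponent_history):
    # Cooperate on the first move; thereafter echo the opponent's last move.
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    return "D"

def play(strat_a, strat_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))     # → (3000, 3000): cooperation throughout
print(play(tit_for_tat, always_defect))   # → (900, 1400): exploited once, then mutual defection
```

Against itself, tit for tat cooperates every round; against a pure defector it is exploited exactly once and then withholds cooperation, which is the "chastening by mistrust" mechanism in miniature.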
The case of a single play between strangers, however, remains an embarrassment. Hayes says
In a single game against a player you'll never meet again, there's no escape from this doleful logic.
and that seems to have been the consensus among game theorists going back to Luce & Raiffa's classical textbook (1957). They say
Of course, it is slightly uncomfortable that two so-called irrational players will both fare much better than two so-called rational ones. Nevertheless, it remains true that a rational player . . . is always better off than an irrational player. . . .
. . . No, there appears to be no way around this dilemma. We do not believe there is anything irrational or perverse about the choice of [defection on both sides], and we must admit that if we were actually in this position we would make these choices.
Experiments, however, show that quite a lot of people know better. And indeed, I believe that the remarks I have quoted, which amount to saying that the Golden Rule is contrary to reason, constitute a reductio ad absurdum of existing game theory as a model of human rationality. If I were a mathematical logician and came up with an axiomatization of arithmetic that looked plausible on its face, but turned out to allow for a hitherto unsuspected integer >0 and <1, I would not publish it as a warning to schoolchildren & accountants; I would look for the mistake.
My suspicion is that the mistake here lies in supposing that the modeling of other players as rational analogs to oneself --- as maximizers of some value function --- can be entirely free of caring for those others: that empathy (theory of mind) & sympathy are entirely independent notions. Clearly, they are independent to some extent: In general, if I am wicked, I can use my insight into your state of mind to torment you; and if I believe that you are wicked, I can use it to frustrate you. But it ought to be possible to put something in the formalism to force the players' utility functions to stick to each other in cases of common interest like the prisoners' dilemma.
In the meantime, I suggest the following thought experiment (to perform it in actuality would be expensive): Let a couple of hundred naive subjects be recruited and assigned to random pairs to play the game once, anonymously at computer terminals, say for monetary rewards of 0, $100, $300, & $500. After the rewards are distributed, let the company be given a free lunch, at little square tables accommodating four each. There are place cards generated by the computer assigning to each table two of the pairs from the game, so that if I played you we are sitting across from each other. There are three kinds of pairs (cooperators, defectors, and mixed), and so there are six kinds of tables (cooperators + cooperators, cooperators + mixed, cooperators + defectors, mixed + mixed, mixed + defectors, and defectors + defectors); the computer makes the assignment so that the six kinds are present in as nearly equal numbers as possible. The lunch is a good one, with a choice of food & drink to encourage conviviality. In one version, the participants might be tagged C & D; in another, they might be allowed to divulge their action or conceal it or lie about it as they chose. A pleasant floral arrangement at the center of each table contains an omnidirectional microphone, and all conversations are recorded for subsequent study.
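The classification driving the seating plan can be spelled out: pairs sort into three kinds by outcome, and two pairs per table, unordered with repetition allowed, give the six table kinds. A sketch (the labels "coop", "mixed", "defect" are my shorthand):

```python
from itertools import combinations_with_replacement

def pair_kind(move_a, move_b):
    # Classify a pair by the outcome of its one game.
    if move_a == move_b:
        return "coop" if move_a == "C" else "defect"
    return "mixed"

# Three kinds of pairs, taken two at a time with repetition: six table kinds.
table_kinds = list(combinations_with_replacement(["coop", "mixed", "defect"], 2))
print(len(table_kinds))   # → 6
```

The computer's job is then just to draw pairs so that each of the six kinds is seated in as nearly equal numbers as the pool of pairs allows.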
Exercise: Imagine the conversation at one or another table. A table with four cooperators, of course, has an easy time. Four defectors, one may suppose, grimly congratulate each other on their rationality. A pair of defectors lectures a pair of cooperators on how irrational they have been, and one of the latter says "Thank you for your advice; we'll be happy to rave all the way to the bank". The mixed cases might result in some profanity.