Brad DeLong wrote:
Iterated Prisoner's Dilemma Blogging: Dyson and Press Really Are Very Clever Indeed...: Basically, consider the strategy space of one-period look-back mixed strategies. And take the strategy S = { p(C|CC)=1/2-ε, p(C|CD)=1/4+ε', p(C|DC)=0, p(C|DD)=0 }.
If the opponent cooperates all the time, their average score is a little better than the DD payoff 1--giving them an incentive to choose a strategy that cooperates some of the time. If they cooperate all the time, then your average score is a little worse than 4--pretty good. And if they do anything other than cooperate all the time, their score falls and is worse than if they cooperate all the time. Thus in this strategy space { P(C)=1 } is a dominant strategy for the opponent if you choose your strategy first. Of course, this is not a Nash equilibrium: if the opponent is playing C you should play D and really clean up.
And two opponents each thinking that they move first and committing to S do rather badly: they end up in (D,D) for all time.
And then comes the smackdown:
>anon said… I'm hoping someone can enlighten me about why this is particularly interesting. The folk theorem already told us that any outcome above the (D,D) forever payoffs could be implemented, even very extortionary outcomes. There is something interesting about only needing one period of history dependence in the strategies to do it, but I don't think that is what everyone is so excited about. I think I'm probably missing something crucial here, but I'm not sure what it is-any suggestions?
>Besttrousers said… Doesn't this conclusion flow out of the folk theorem? It seems like it's just a special case of it, but I haven't really sat down with the paper.
The reaction of "Besttrousers" and "anon" appears to be a very common one, especially among economists. The argument runs: (i) the folk theorem already taught us that the iterated prisoner's dilemma can support practically anything as an equilibrium; (ii) Press and Dyson simply present us with one of these equilibria; so (iii) why isn't this boring?
What is the "folk theorem"? Wikipedia:
A commonly referenced proof of a folk theorem was published in (Rubinstein 1979).
The method for proving folk theorems is actually quite simple. A grim trigger strategy is a strategy which punishes an opponent for any deviation from some certain behavior. So, all of the players of the game first must have a certain feasible outcome in mind. Then the players need only adhere to an almost GRIM trigger strategy under which any deviation from the strategy which will bring about the intended outcome is punished to a degree such that any gains made by the deviator on account of the deviation are exactly cancelled out. Thus, there is no advantage to any player for deviating from the course which will bring out the intended, and arbitrary, outcome, and the game will proceed in exactly the manner to bring about that outcome.
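A gloss on the quoted passage, not part of the Wikipedia text: the deviation check it describes can be sketched numerically. This is a minimal sketch under assumptions the passage does not state: the standard prisoner's dilemma payoffs R=3 (mutual cooperation), T=5 (temptation), P=1 (mutual defection), a discount factor `delta`, and a fully grim punishment (defect forever) rather than the "almost GRIM" one the passage mentions. The function names are hypothetical.

```python
def discounted_sum(first, rest, delta):
    """Payoff `first` now, then `rest` in every later round, discounted by delta."""
    return first + delta * rest / (1.0 - delta)


def grim_trigger_holds(delta, R=3.0, T=5.0, P=1.0):
    """True if cooperating forever is at least as good as deviating once
    and then being punished with mutual defection forever."""
    comply = discounted_sum(R, R, delta)   # cooperate in every round
    deviate = discounted_sum(T, P, delta)  # grab T once, then (D,D) forever
    return comply >= deviate


def critical_delta(R=3.0, T=5.0, P=1.0):
    """Smallest discount factor at which the grim threat deters deviation."""
    return (T - R) / (T - P)
```

With these assumed payoffs the threshold is 1/2: a sufficiently patient player has no profitable one-shot deviation, which is all the equilibrium argument needs. Whether anyone would actually carry out the punishment is a separate question.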
What response do I have to this?
First, basically, I don't believe that equilibria that exist only because any deviation is a trigger for a GRIM or a nearly-GRIM strategy are really there. To be credible, a strategy has to be chosen to elicit good behavior in the future, and not to punish bad behavior in the past, because sunk costs truly are sunk. If you want to support an equilibrium that requires a GRIM or a near-GRIM trigger, you need to specify that the players in your game like to engage in altruistic punishment--or have a taste for vengeance, which is more-or-less the same thing. Absent a strong taste for engaging in altruistic punishment in your agents, equilibria supported by GRIM out-of-equilibrium punishment just don't count.
Second, I don't believe in equilibria that simply descend from the sky. Some kind of historical process that starts with out-of-equilibrium play or some other set of factors has to be there to produce the coordinated common expectations on which any equilibrium must rest. Folk theorem arguments are absolutely and deliberately and necessarily silent on the disequilibrium foundations of equilibrium game theory.
My problem is that my native code running on my wetware believes these two points strongly, but is not equipped to argue for them. And the Cosma Shalizi emulation module that I could fire up and run on top of my native code is primitive and buggy--and I don't want to crash my entire brain and have to reboot it.
So I demand--I plead for--I beg for Cosma Shalizi to write the weblog post about the value of Dyson-Press that I cannot…
Iterated Prisoner's Dilemma Blogging: Dyson and Press Really Are Very Clever Indeed...
Cosma Shalizi directs me to:
[…]
Basically, consider the strategy space of one-period look-back mixed strategies. And take the strategy S = { p(C|CC)=1/2-ε, p(C|CD)=1/4+ε', p(C|DC)=0, p(C|DD)=0 }.
If the opponent cooperates all the time, their average score is a little better than the DD payoff 1--giving them an incentive to choose a strategy that cooperates some of the time. If they cooperate all the time, then your average score is a little worse than 4--pretty good. And if they do anything other than cooperate all the time, their score falls and is worse than if they cooperate all the time. Thus in this strategy space { P(C)=1 } is a dominant strategy for the opponent if you choose your strategy first.
Of course, this is not a Nash equilibrium: if the opponent is playing C you should play D and really clean up.
And two opponents each thinking that they move first and committing to S do rather badly: they end up in (D,D) for all time.
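The S-versus-S claim can be checked with a small Markov-chain sketch. Assumptions, none of them from the post: ε = ε' = 0.01, play starts at (C,C), each key is read as (own last move, opponent's last move), and the last entry of S is read as p(C|DD)=0, which is what makes (D,D) absorbing. The helper names are hypothetical.

```python
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]


def make_S(eps=0.01, eps2=0.01):
    """Memory-one strategy S: probability of cooperating next round,
    keyed by (own last move, opponent's last move). Assumed convention."""
    return {
        ("C", "C"): 0.5 - eps,
        ("C", "D"): 0.25 + eps2,
        ("D", "C"): 0.0,
        ("D", "D"): 0.0,
    }


def step(dist, strat1, strat2):
    """Advance the distribution over joint outcomes by one round."""
    new = {s: 0.0 for s in STATES}
    for (a, b), mass in dist.items():
        p1 = strat1[(a, b)]  # player 1 sees (own=a, opponent=b)
        p2 = strat2[(b, a)]  # player 2 sees the same round mirrored
        for m1, m2 in STATES:
            q1 = p1 if m1 == "C" else 1.0 - p1
            q2 = p2 if m2 == "C" else 1.0 - p2
            new[(m1, m2)] += mass * q1 * q2
    return new


def distribution_after(n, strat1, strat2, start=("C", "C")):
    """Distribution over joint outcomes after n rounds from `start`."""
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    for _ in range(n):
        dist = step(dist, strat1, strat2)
    return dist


if __name__ == "__main__":
    d = distribution_after(200, make_S(), make_S())
    print({state: round(mass, 6) for state, mass in d.items()})
```

Because neither copy of S ever cooperates after (D,D), that outcome is absorbing, and every other outcome leaks probability into it each round. So under these assumptions essentially all of the mass ends up on (D,D).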
Cosma also directs me to:
KARL SIGMUND & MARTIN NOWAK: A comment on Press-Dyson (PNAS)
Being close means not being there. We had known about strategies that allow to nail down the opponent's payoff to an arbitrary level [1,2], but not about the vast and fascinating realm of zero determinant (ZD) strategies that enforce a linear relation between the payoffs for the two players. This opens a new facet in the study of trigger strategies and folk theorems for iterated games, and offers a highly stimulating approach for moral philosophers enquiring about 'egoistic' and 'tuistic' viewpoints.
Our only quibble with the Press-Dyson paper is semantic. The title speaks of 'evolutionary opponents', which suggests evolutionary game theory. But biological or cultural evolution is not a phenomenon on the level of the individual. It requires a population. The 'evolutionary' players of Press and Dyson don't evolve but adapt. With their splendidly 'mischievous' extortionate strategies, Press and Dyson contribute to classical game theory, by considering two players who grapple with each other in a kind of mental jiu-jitsu. The leverage afforded by zero-determinant strategies offers a splendid new arsenal of throws, locks, and holds.
Which of these strategies can flourish in an evolutionary setting is less clear. Being successful, in this context, feeds back at the population level. It means that more and more players will act like you, be they your offspring or your epigones. Thus you are increasingly likely to encounter your own kind. If your 'extortionate' strategy guarantees that you do twice as well as your opponent, and your opponents' strategy guarantees that she does twice as well as you, this only means that both get nothing. The only norm which is not self-defeating through population dynamics requires players to guarantee each other as much as themselves. We are then back to Tit For Tat. Press and Dyson are perfectly aware of this, of course. In a nutshell, they have uncovered a vast set of strategies linking the scores of two players deterministically (as TFT does), but asymmetrically (unlike TFT). This enriches the canvas of individual interactions, but not necessarily the range of outcomes open to evolving populations.
[1] M.A. Nowak, M.C. Boerlijst, K. Sigmund, Equal pay for all prisoners, American Mathematical Monthly 104 (1997) 303-307.
[2] K. Sigmund, The Calculus of Selfishness, Princeton UP, Princeton, New Jersey (2010).
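To spell out the arithmetic behind "both get nothing" in the Sigmund-Nowak comment: a sketch, assuming the Press-Dyson extortion relation, in which a player with extortion factor χ enforces a surplus over the mutual-defection payoff P that is χ times the opponent's. The symbols s_X, s_Y, P and χ are not in the comment itself. If each player enforces that relation against the other:

```latex
\begin{align*}
s_X - P &= \chi\,(s_Y - P), \\
s_Y - P &= \chi\,(s_X - P) \\
\Longrightarrow\quad s_X - P &= \chi^2\,(s_X - P)
\quad\Longrightarrow\quad (\chi^2 - 1)\,(s_X - P) = 0 .
\end{align*}
```

For χ = 2, or any χ > 1, this forces s_X = s_Y = P: neither player earns any surplus over mutual defection, which is the sense in which both get nothing.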
What Should I Read to Very Quickly Get Up-to-Speed on Dyson and Press's Zero-Determinant Prisoner's Dilemma Strategies?
Bill Press taught me what very little general relativity I ever knew, and now he and Freeman Dyson have come up with something absolutely fascinating.
I view it as a generalization of TIT-FOR-TAT strategies in the iterated prisoner's dilemma. Their strategies do better than tit-for-tat in an evolutionary context in the sense that they create an environment in which other strategies evolve in ways that profit the ZD strategies. But of course the ZD strategies are not themselves evolutionarily stable--and a population of a single ZD strategy does quite badly indeed…
Am I right? What should I read first?