Some concepts in dynamic games

Behavioral strategy

A pure strategy in a dynamic game specifies the action chosen by a player at each of the player’s information sets. A mixed strategy is a probability distribution over the set of pure strategies. An alternative way of depicting the strategy of a player is called a behavioral strategy. In a behavioral strategy the player chooses a probability distribution over the set of actions at each of the player’s information sets.

 

Suppose player 1 chooses at two information sets. The action set at the first information set is (a1,a2) and at the second is (b1,b2). There are four pure strategies (a1b1, a1b2, a2b1, a2b2), and a mixed strategy for this player would be (k1,k2,k3,1-k1-k2-k3), where k1, for example, is the probability of playing strategy a1b1. A behavioral strategy for the player would be {[p(a1),1-p(a1)], [p(b1),1-p(b1)]}. One can be transformed into the other. For example, k1 = p(a1)p(b1), k2 = p(a1)[1-p(b1)], and so on.
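The transformation from a behavioral strategy to a mixed strategy can be sketched in a few lines of Python. This is only an illustration: the function name behavioral_to_mixed and the probabilities 0.6 and 0.3 are assumptions chosen for the example, not values from the text, and the sketch assumes the choices at the two information sets are made independently.

```python
from itertools import product


def behavioral_to_mixed(behavioral):
    """Map a behavioral strategy to the equivalent mixed strategy.

    `behavioral` is a list with one dict per information set; each dict
    maps an action label to the probability of choosing it at that set.
    The result maps each pure strategy (one action per information set)
    to the product of the corresponding action probabilities.
    """
    mixed = {}
    for combo in product(*(info_set.items() for info_set in behavioral)):
        label = "".join(action for action, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        mixed[label] = prob
    return mixed


# Hypothetical probabilities: p(a1) = 0.6 and p(b1) = 0.3.
example = behavioral_to_mixed(
    [{"a1": 0.6, "a2": 0.4}, {"b1": 0.3, "b2": 0.7}]
)
```

Here example["a1b1"] plays the role of k1 = p(a1)p(b1), and the four probabilities sum to one.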

 

See Gintis 4.13 and 4.45.

Perfect behavioral Nash equilibria and trembling hand equilibria

See Gintis 5.14. We will delay our discussion of these subjects until we have a chance to consider signaling and other games of incomplete information.

Repeated game

A repeated game is a dynamic game (usually of imperfect information) in which an interaction is modeled in a stage game and this interaction is repeated. The repetitions may be finite (e.g., 10 repeats) or infinite or be uncertain in number. A frequent way of modeling the uncertainty is that there is a probability that the game will end after the next interaction.

Players

The players of the stage game are the players of the repeated game.

Strategies

Describing a strategy in a repeated game requires specifying what the player will do in each stage game. A strategy can become very complicated because actions in, say, the tenth repetition of the stage game can be made conditional on prior decisions of the other player or on outcomes in preceding stage game encounters. A strategy may be a simple "always do the same thing in each stage game regardless of what has happened before" or a much more complex set of instructions. Some economical strategy descriptions are nevertheless useful in obtaining insights into repeated games.

 

Some strategies that are useful in analyzing the properties of repeated games:

Always play a particular pure strategy in the stage game

A strategy of “always defect” in a repeated prisoner’s dilemma game describes a situation in which a player defects in each of the repetitions of the stage game.

Permanent retaliation (PR) as an example of a trigger strategy

A trigger strategy requires that a player take the action in the stage game that generates the cooperative solution and continue to do so unless another player defects. If a defection occurs, the player adopts actions to punish the defector.

 

For example, a strategy of “permanent retaliation” (AKA the “grim reaper” strategy) in a repeated prisoner’s dilemma game requires that the player cooperate in the initial stage game and continue to cooperate in subsequent stage games as long as the other player has cooperated in every preceding stage game. If the other player defects in a stage game, the player adopting PR defects in all subsequent stage games. The punishment in this case is severe and independent of any subsequent actions by the other player. More forgiving punishment sequences can be adopted. Tit for tat is a strategy that exacts a measured punishment and then “forgives” if the other player has made a subsequent move of atonement.

Tit for tat (TT)

A strategy of “tit for tat” in a repeated prisoner’s dilemma game requires that the player cooperate in the first stage game. In subsequent stage games, the player does the same action as the other player did in the preceding stage game. A defection by the other player elicits a “punishing” defection from the TT player but if the other player cooperates the TT player returns to cooperation in the next stage game.
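The three strategies described above can be written as small decision rules and played against each other. The sketch below is illustrative only: the function names, the "C"/"D" labels, and the hypothetical opponent that defects once in the second round are all assumptions introduced for the example.

```python
def always_defect(own_history, other_history):
    """Defect in every stage game, whatever has happened before."""
    return "D"


def permanent_retaliation(own_history, other_history):
    """Cooperate until the other player has defected once, then defect forever."""
    return "D" if "D" in other_history else "C"


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the other player's previous move."""
    return other_history[-1] if other_history else "C"


def defect_once_in_round_two(own_history, other_history):
    """Hypothetical opponent: defects in the second stage game only."""
    return "D" if len(own_history) == 1 else "C"


def play(strategy_a, strategy_b, rounds):
    """Return the sequences of moves chosen by each player."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        # Both players move simultaneously, seeing only earlier rounds.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)


pr_path, _ = play(permanent_retaliation, defect_once_in_round_two, 4)
tt_path, _ = play(tit_for_tat, defect_once_in_round_two, 4)
```

Against the single defection, PR keeps defecting from the third round on, while TT punishes once and then returns to cooperation.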

Payoffs

As in other settings, the payoff for each player depends on the strategy choice of all players but it has a structure particular to repeated games. The payoff of the repeated game is the sum of the payoffs realized in each repetition of the stage game appropriately weighted by a discount factor.

 

Note 1: The discount factor reflects the rate of interest and in a game with uncertainty also reflects the probability of the game extending one more period.

 

Note 2: The structure of a repeated game of finite duration is revealed by the knowledge of the stage game, G, the number of repetitions, T, and the discount factor, D (= 1/(1+r)) if there is no uncertainty concerning the continuation of the game.

Example

Consider the following one-shot game. Firm A and Firm B can cooperate and price so as to realize the monopoly output for the industry, which they are assumed to share equally, earning a profit of 10 each. If one firm cooperates but the other undercuts the monopoly price, the undercutting firm realizes a profit of x, where 40>x>10, while the other firm has a loss of 20. If both undercut, both make no profit.

 

 

 

                              FIRM B
                      COOPERATE     UNDERCUT
FIRM A   COOPERATE    10,10         -20,x
         UNDERCUT     x,-20         0,0

 

  1. Assume that the two firms know that they will repeat this interaction 20 times. The rate of interest for each firm is 20%.
    1. Is "always undercut" by each player a Nash equilibrium for this repeated game?
    2. Is “always cooperate” by each player a Nash equilibrium for this repeated game?
    3. Is the adoption of "permanent retaliation" by each player a Nash equilibrium?
  2. Assume that the one-shot game is repeated and that the probability that the repeated game (also sometimes called the supergame) will end after any given round is 1/7. The rate of interest for each firm is still 20%.
    1. Calculate the values of x for which PR played by each player is a Nash equilibrium and for which this strategy profile is not a NE.
    2. Calculate the values of x for which TT played by each player is a Nash equilibrium and for which this strategy profile is not a NE.

 

Discussion of 1.

 

P_A({Always cooperate}_A, {Always cooperate}_B) = 10 + 10(1/(1+r)) + 10(1/(1+r))^2 + ... + 10(1/(1+r))^19 = 10 + 10(5/6) + 10(5/6)^2 + ... + 10(5/6)^19 = 58.434...

 

Note: The sum a + aD + aD^2 + ... + aD^(n-1) = a(1-D^n)/(1-D). Here a = 10, D = 5/6 and n = 20.
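As a quick numerical check of the note above, the sketch below compares the term-by-term discounted sum with the closed form for the 20-round game at a 20% rate of interest. The function names are illustrative assumptions.

```python
def discounted_total(stage_payoff, discount, repetitions):
    """Sum the stage payoff over all repetitions, discounting round t by discount**t."""
    return sum(stage_payoff * discount**t for t in range(repetitions))


def closed_form(stage_payoff, discount, repetitions):
    """Geometric-series formula a(1 - D^n)/(1 - D)."""
    return stage_payoff * (1 - discount**repetitions) / (1 - discount)


D = 1 / (1 + 0.20)  # discount factor 5/6
direct = discounted_total(10, D, 20)
formula = closed_form(10, D, 20)
```

Both values agree and are roughly 58.43, matching the payoff computed for mutual cooperation.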

 

Similarly the payoffs if each firm played {Always undercut} would be 0 for each player as "a" in the above formula would be 0.

 

We can find the subgame perfect NE (SPNE) for this repeated game by starting at the end and working backwards. The last stage game is a subgame of the repeated game. Its Nash equilibrium solution is that each player will undercut. Knowing that each player will get 0 in the last round, they decide what to do in the round before. The last two rounds form a repeated game of two repetitions. The subgame perfect equilibrium for it is for both firms to undercut in their two interactions. One works backwards in this way to the beginning of the game. The SPNE of the repeated game is for each firm to Undercut in each of the 20 repetitions regardless of what has transpired up to that point.
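The backward-induction argument rests on the stage game having (Undercut, Undercut) as its only Nash equilibrium. A minimal sketch that checks this by brute force over pure strategies follows; the function name and the "C"/"U" labels are assumptions for illustration, and mixed strategies are not examined.

```python
from itertools import product


def pure_nash_equilibria(x):
    """List the pure-strategy Nash equilibria of the stage game for a given x."""
    actions = ("C", "U")  # Cooperate, Undercut
    # Payoffs as (Firm A, Firm B), taken from the table in the example.
    payoff = {
        ("C", "C"): (10, 10),
        ("C", "U"): (-20, x),
        ("U", "C"): (x, -20),
        ("U", "U"): (0, 0),
    }
    equilibria = []
    for a, b in product(actions, repeat=2):
        # Neither firm can gain by changing its own action unilaterally.
        a_ok = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in actions)
        b_ok = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in actions)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria
```

For any x between 10 and 40, for instance pure_nash_equilibria(25), the only profile returned is ("U", "U").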

 

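This is the only subgame perfect Nash equilibrium. Other Nash equilibria differ from it only in what is prescribed after histories that are never reached; along the path of play both firms undercut in every round. Neither "always cooperate" nor "permanent retaliation" played by each firm is a Nash equilibrium of the 20-round game, because a firm gains by undercutting in the last round.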

 

In general a repeated game with a finite number of repetitions that has a unique NE in the stage game considered in isolation has a unique subgame perfect NE. In that SPNE for the repeated game each player plays the NE strategy for the stage game at each repetition regardless of the history of the game.

 

In general a repeated game with a finite number of repetitions that has many NE in the stage game considered in isolation has many SPNE. Without a focal point among the SPNE there is no way of telling how this game might be played. Pareto dominance or some concept of fairness might provide a way of distinguishing among NEs.

 

Discussion of 2.

 

If both play PR, the strategy profile that is a candidate for a NE is [PR_A, PR_B].

 

The expected payoff for player A is P_A(PR_A,PR_B) = 10 + 10(5/6)(6/7) + 10(5/6)^2(6/7)^2 + ... = 10/(1-(5/7)) = 35. The game is symmetric so B expects the same payoff for this strategy combination.

 

Note: The term D* = (5/6)(6/7) is called the generalized discount factor.
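The generalized discount factor and the resulting payoff from mutual permanent retaliation can be reproduced with exact fractions. This is a small sketch; the variable names are assumptions for illustration.

```python
from fractions import Fraction

interest_rate = Fraction(1, 5)      # 20% rate of interest
end_probability = Fraction(1, 7)    # chance the game ends after a round

# Generalized discount factor: time discounting times survival probability.
D_star = (1 / (1 + interest_rate)) * (1 - end_probability)

# Infinite geometric series 10 + 10 D* + 10 D*^2 + ... = 10 / (1 - D*).
cooperative_value = 10 / (1 - D_star)
```

With these inputs D_star equals 5/7 and cooperative_value equals 35, as in the text.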

 

To test for a NE see if an alternative is better for A to play if B is playing PRB.

 

Firm A knows that if it undercuts, the best strategy that can be followed from that point on is to always undercut, AU_A, because B is playing PR_B and therefore will react to the undercut by undercutting in every subsequent interaction. The question that then arises is when is the best time to defect. If, for example, A deviates from PR_A by undercutting at the third stage game, A would receive 10 + 10(5/6)(6/7) for the first two rounds plus the discounted value of P_A(AU_A,PR_B), or 10 + 10(5/6)(6/7) + (5/6)^2(6/7)^2 P_A(AU_A,PR_B). For defection to be attractive this must exceed P_A(PR_A,PR_B) = 10 + 10(5/6)(6/7) + (5/6)^2(6/7)^2 P_A(PR_A,PR_B). This will be true only if P_A(AU_A,PR_B) > P_A(PR_A,PR_B). Therefore we can concentrate on player A defecting on the first move.

 

Consider then the alternative of A undercutting on the first move and then continuing to undercut after that, i.e. A playing always undercut (AU) against B’s PR.

 

Calculate P_A(AU_A,PR_B) = x + 0(5/6)(6/7) + 0(5/6)^2(6/7)^2 + ... = x + 0 = x

 

If x ≤ 35, A will be no better off by choosing this alternative to PR_A.

 

By symmetry, if x ≤ 35, (PR_A,PR_B) is a NE; if x > 35, it is not.
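The threshold can be checked numerically by comparing the payoff from cooperating forever with the payoff from switching to permanent undercutting at various stages. The sketch below is illustrative: the function names are assumptions, and it only examines deviations of the form "undercut from stage k onward", relying on the argument above that always undercutting is the best continuation once B has been triggered.

```python
from fractions import Fraction

D_STAR = Fraction(5, 7)                # generalized discount factor (5/6)(6/7)
COOPERATE_FOREVER = 10 / (1 - D_STAR)  # payoff when both firms play PR


def payoff_if_undercut_from(stage, x):
    """A's payoff against PR when A cooperates for `stage` rounds, then undercuts forever."""
    before = sum(10 * D_STAR**t for t in range(stage))
    # A earns x once in the deviation round and 0 in every later round.
    return before + D_STAR**stage * x


def pr_is_nash(x, stages_to_check=10):
    """True if no checked deviation beats cooperating forever."""
    return all(
        payoff_if_undercut_from(k, x) <= COOPERATE_FOREVER
        for k in range(stages_to_check)
    )
```

The gain from deviating at stage k equals D*^k (x - 35), so its sign does not depend on k; pr_is_nash(35) holds while pr_is_nash(36) does not.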

 

Folk theorem

“Consider any two-player stage game with a Nash equilibrium with payoffs (a,b) to the two players. Suppose there exists a pair of strategies for the two players that gives the players (c,d). Then, if c ≥ a and d ≥ b, and the discount factors of the players are sufficiently close to unity, there is a subgame perfect Nash equilibrium of the repeated game with expected payoffs (c,d) in each period.” Gintis p. 127.

 

Two definitions from other texts:

 

Define a reservation utility as the minimum that a player can be assured of obtaining in a repeated game (the mini-max payoff). When players know what the other player played in the earlier stage games, the following folk theorem holds. For every feasible payoff vector that gives each player more than his or her reservation utility, there exists a lower bound on the discount factor, less than 1, above which a NE exists sustaining the payoff vector. See Fudenberg and Tirole p. 152.

 

In an infinitely repeated n-person game with finite action sets at each repetition, any combination of actions observed in any finite number of repetitions is the unique outcome of some subgame perfect equilibrium, provided that the generalized discount factor is one or sufficiently close to one and that the set of payoffs that exceed the mini-max payoff in the mixed extension of the one-shot game has a dimension equal to the number of players. (See Rasmusen, 92).

Competition policy implication of repeated games

Cartels can arise without an explicit agreement by focussing on a particular NE in an infinitely repeated game. This realization was reflected in a change in competition policy to make it unnecessary to have an overt agreement or produce evidence of planned collusion. In the United States, this development has been described: "The line of decisions commencing with Interstate Circuit, Inc. v. United States has established the doctrine that parallel action by distributors, each with knowledge of what the others were doing, can suffice to show that they impliedly had agreed upon the concert of action. .... It must be proved that the 'parallel decisions of the alleged conspirators were contrary, on the hypothesis of independent individual decision, to their apparent individual self-interest.'" "Blind Bidding," Harvard Law Review March 1979. In this article, the author argues that blind bidding is not a contravention of the antitrust laws in the United States because it would be in the self-interest of a distributor even if the other distributors were not following the practice. (Relate this point to the above analysis.) He also maintains that the practice has not generated damages for exhibitors. The latter point is more controversial.

 

Problems of interest

Subgame perfection

Rubinstein bargaining model

It is easier to see the subgame perfection in a finite-move version of this problem. I will do this in class, time permitting. A more difficult problem is the infinite-horizon problem discussed by Gintis in 5.6.

Nuisance suits 5.9

Cooperation in an overlapping generations model 5.10

Repeated games

Reputational equilibrium 6.13