Game Theory (pt. 2)
Enrico Franchi [email protected]
URLs
Slides 13-05-2010:
http://dl.dropbox.com/u/836013/slides/lectures/2009-2010/
game-theory/gametheory-handout.pdf
Slides 20-05-2010:
http://dl.dropbox.com/u/836013/slides/lectures/2009-2010/
game-theory/gametheory2.pdf
TOC
Computing Nash Equilibrium
Complexity
Two-Player Games
Minmax/Maxmin
Von Neumann Theorem
Regret
Linear Programming
Social Choice Theory
Plurality and other voting procedures
Paradoxes
Arrow’s Theorem and Gibbard-Satterthwaite’s Theorem
Computing the Nash Equilibrium
The general case is extremely hard:
We must “manually” guess the support of the equilibrium, that is to say the set of strategies with positive probability
We must solve a set of M non-linear equations in M variables, where:
If the players are only two, the equations are linear
We still have to guess the support
Complexity
Technically, computing a general Nash Equilibrium is PPAD-complete and PPAD-hard (2005)
PPAD stands for “polynomial parity argument, directed version”
Relations with NP are not known
However, it is known that PPAD ⊆ TFNP ⊆ FNP
Nash-related NP-Hard Problems
Uniqueness: given a game G, does there exist a unique equilibrium in G?
Pareto Optimality: Given a game G, does there exist a strictly Pareto efficient equilibrium in G?
Guaranteed Payoff: Given a game G and a value v, does there exist an equilibrium in G in which some player
obtains a payoff of at least v?
Guaranteed Social Welfare
Action Inclusion
Action Exclusion
Two-player games
If the two-player is a general sum game, the Lemke-
Howson algorithm can be used formulating the game as a linear complementarity problem [1]
If the game is a zero-sum game, it can be solved with linear programming
Maxmin
The maxmin strategy for player i in an n player, general sum game is a not necessarily unique (mixed) strategy that maximizes i’s worst case
The maxmin value (or security level) is the minimum payoff level guaranteed by a maxmin strategy
Minmax, two player
Minmax strategy and value are somewhat dual to the maxmin strategy and value
In two player general sum games the minmax strategy for player i against player –i is the strategy that keeps the
maximum payoff for –i at minimum
It is a punishment
Minmax, n-player
In an n-player game, the minmax strategy for player i
against player j ≠ i is i’s component of the mixed-strategy profile s-j in the expression
where –j denotes the set of players other that j.
Player i receives his minmax value if players –i choose their strategies to minimize i utility “after” he chose
strategy si
A player maxmin value is always less than or equal to his minmax value
Minmax Theorem
In any finite, two-player, zero-sum game, in any Nash equilibrium each player receives a payoff that is equal to both his maxmin value and his minmax value
Von Neumann, 1928
Each player’s maxmin equals his minmax (value of the game)
Maxmin strategies coincide with minmax strategies
Any maxmin strategy profile is a Nash equilibrium
and any Nash equilibrium is a maxmin strategy profile
Matching Pennies Example
Heads Tails
Heads 1, -1 -1, 1
Tails -1, 1 1, -1
Matching Pennies Graph
0
0.3
0.6 0.9
-1 -0.5
0 0.5
1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
P2 P(heads)
P1 Expected Utility
P1 P(heads)
Matching Pennies for P1
Regret
An agent i’s regret for playing an action ai if other agents adopt action profile a-i is defined as:
An agent i’s maximum regret for playing an action ai is defined as:
Minimax Regret
Minimax regret actions for agent i are defined as:
Minimax Regret vs. Minmax
L R
T 100, a 1-ε, b
B 2, c 1, d
LP-formulation
Consider the class of two-player zero-sum games
is the expected utility for player i in equilibrium
In the next slides, the LP for computing player 2 and player 1 strategies are given
Linear Programs are rather inexpensive to compute
Player 2 Strategy in Equilibrium
Constants: u1(...)
Variables: s2,
Rock-Paper-Scissors
Rock Paper Sciss.
Rock 0,0 -1,1 1,-1
Paper 1,-1 0,0 -1,1
Sciss. -1,1 1,-1 0,0
Social Choice Theory
Enrico Franchi [email protected]
Group Decisions
Agents are required to choose among a set of outcomes
Agents can choose one outcome in Ω
Agents can express a preference of outcomes
Let the set of preference orderings of outcomes
We also write to express that agent i prefers ω1 to ω2
Social Welfare
A social welfare function takes the voter preferences and produces a social preference order:
or in the slighly simplifies form:
We write to express that the first outcome ranked above the second in the social outcome
Plurality
Simplest voting procedure: used to select a single outcome (candidate)
Everyone submits his preference order, we count how many times each candidate was ranked first
Easy to implement and to understand
If the outcomes are just 2, it is called simple majority voting
If they are more than two, problems arise
Voting in the UK
Three main parties:
Labour Party (left-wing)
Liberal Democrats (center- left)
Conservative Party (right- wing)
Left-wing voter:
Center voter:
Right-wing voter:
Tactical Voting
Labour Party
44%
Liberal Democ
rats Conser
vative Party
44%
Voters
Condorcet’s Paradox
Consider this election:
No matters the outcome we choose: two thirds of the electors will be unhappy
Sequential Majority
Series of pairwise elections, the winner will go on to the next election
An agenda is the strategy we choose to order the elections (linear, binary tree)
An outcome is a possible winner if there is some agenda which would make that outcome the overall winner
An outcome is a Condorcet winner if it is the overall winner for every possible agenda
Can we choose the agenda to choose a winner?
Borda Count and Slater Ranking
Borda Count
We have K outcomes
Each time an outcome is in the j-th position for some agent, we increment its counter by K-j
We order the outcomes according to their counter
Good for single candidates
Slater ranking
Tries to be as close to the majority graph as possible
Unfortunately, is NP-hard
Desirable Properties
Pareto condition: if every agent ranks ωi above ωj, then
Plurality, Borda
Condorcet winner: if an outcome is a condorcet winner, then it should be ranked first
Sequential majority elections
Independence of Irrelevant Alternatives (IIA): social ranking of two outcomes should only be affected by the way that they are ranked in their preference orders
Almost no protocol satisfies IIA
Non-dictatorship
ω
i
*ω
jUndesirable properties
Dictatorship: a social welfare function f is a dictatorship if for some voter j we have that:
Me dispiace, ma io so io e voi non siete un …
(Alberto Sordi, Il Marchese del Grillo)
Arrow’s Theorem
Unrestricted Domain: for any set of individual voter preferences, the social welfare function should yield a unique and complete ranking of societal choices.
There is no voting procedure for elections with more than two outcomes that satisfies
Non-dictatorship
Unrestricted Domain
Pareto
Independence of Irrelevant Alternatives
Gibbard-Satterthwaite’s Theorem
Sometimes voters “lie” in order to obtain a better outcome
Is it possible to devise a voting procedure that is not subject to such manipulation?
Manipulation (i prefers ωi):
The only procedure that cannot be manipulated and satisfies the Pareto condition is dictatorship
Complexity and Manipulation
Even if all procedures can be manipulated, can we devise procedures which are hard to manipulate?
Hard means “difficult to compute” in an algorithmic sense, e.g., NP-Hard procedures
These procedures are easy (polynomial) to compute?
Second-order Copeland may be “difficult” to manipulate
In theory it is NP-Hard
However, it is only a worst case complexity
References
1. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations; Yoav Shoham and Kevin Leyton-
Brown; Cambridge Press
2. Game Theory: Analysis of Conflict; Roger B. Myerson;
Harvard Press
3. An Introduction to Multi-Agent Systems; Michael Wooldridge; Wiley Press