• Non ci sono risultati.

Game Theory (pt. 2)

N/A
N/A
Protected

Academic year: 2022

Condividi "Game Theory (pt. 2)"

Copied!
33
0
0

Testo completo

(1)

Game Theory (pt. 2)

Enrico Franchi [email protected]

(2)

URLs

  Slides 13-05-2010:

  http://dl.dropbox.com/u/836013/slides/lectures/2009-2010/

game-theory/gametheory-handout.pdf

  Slides 20-05-2010:

  http://dl.dropbox.com/u/836013/slides/lectures/2009-2010/

game-theory/gametheory2.pdf

(3)

TOC

  Computing Nash Equilibrium

  Complexity

  Two-Player Games

  Minmax/Maxmin

  Von Neumann Theorem

  Regret

  Linear Programming

  Social Choice Theory

  Plurality and other voting procedures

  Paradoxes

  Arrow’s Theorem and Gibbard-Satterthwaite’s Theorem

(4)

Computing the Nash Equilibrium

  The general case is extremely hard:

  We must “manually” guess the support of the equilibrium, that is to say the set of strategies with positive probability

  We must solve a set of M non-linear equations in M variables, where:

  If the players are only two, the equations are linear

  We still have to guess the support

(5)

Complexity

  Technically, computing a general Nash Equilibrium is PPAD-complete and PPAD-hard (2005)

  PPAD stands for “polynomial parity argument, directed version”

  Relations with NP are not known

  However, it is known that PPAD ⊆ TFNP ⊆ FNP

(6)

Nash-related NP-Hard Problems

  Uniqueness: given a game G, does there exist a unique equilibrium in G?

  Pareto Optimality: Given a game G, does there exist a strictly Pareto efficient equilibrium in G?

  Guaranteed Payoff: Given a game G and a value v, does there exist an equilibrium in G in which some player

obtains a payoff of at least v?

  Guaranteed Social Welfare

  Action Inclusion

  Action Exclusion

(7)

Two-player games

  If the two-player is a general sum game, the Lemke-

Howson algorithm can be used formulating the game as a linear complementarity problem [1]

  If the game is a zero-sum game, it can be solved with linear programming

(8)

Maxmin

  The maxmin strategy for player i in an n player, general sum game is a not necessarily unique (mixed) strategy that maximizes i’s worst case

  The maxmin value (or security level) is the minimum payoff level guaranteed by a maxmin strategy

(9)

Minmax, two player

  Minmax strategy and value are somewhat dual to the maxmin strategy and value

  In two player general sum games the minmax strategy for player i against player –i is the strategy that keeps the

maximum payoff for –i at minimum

  It is a punishment

(10)

Minmax, n-player

  In an n-player game, the minmax strategy for player i

against player j ≠ i is i’s component of the mixed-strategy profile s-j in the expression

where –j denotes the set of players other that j.

  Player i receives his minmax value if players –i choose their strategies to minimize i utility “after” he chose

strategy si

  A player maxmin value is always less than or equal to his minmax value

(11)

Minmax Theorem

 

In any finite, two-player, zero-sum game, in any Nash equilibrium each player receives a payoff that is equal to both his maxmin value and his minmax value

  Von Neumann, 1928

 

Each player’s maxmin equals his minmax (value of the game)

 

Maxmin strategies coincide with minmax strategies

 

Any maxmin strategy profile is a Nash equilibrium

and any Nash equilibrium is a maxmin strategy profile

(12)

Matching Pennies Example

Heads Tails

Heads 1, -1 -1, 1

Tails -1, 1 1, -1

(13)

Matching Pennies Graph

0

0.3

0.6 0.9

-1 -0.5

0 0.5

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

P2 P(heads)

P1 Expected Utility

P1 P(heads)

Matching Pennies for P1

(14)

Regret

  An agent i’s regret for playing an action ai if other agents adopt action profile a-i is defined as:

  An agent i’s maximum regret for playing an action ai is defined as:

(15)

Minimax Regret

  Minimax regret actions for agent i are defined as:

(16)

Minimax Regret vs. Minmax

L R

T 100, a 1-ε, b

B 2, c 1, d

(17)

LP-formulation

  Consider the class of two-player zero-sum games

  is the expected utility for player i in equilibrium

  In the next slides, the LP for computing player 2 and player 1 strategies are given

  Linear Programs are rather inexpensive to compute

(18)

Player 2 Strategy in Equilibrium

  Constants: u1(...)

  Variables: s2,

(19)

Rock-Paper-Scissors

Rock Paper Sciss.

Rock 0,0 -1,1 1,-1

Paper 1,-1 0,0 -1,1

Sciss. -1,1 1,-1 0,0

(20)

Social Choice Theory

Enrico Franchi [email protected]

(21)

Group Decisions

  Agents are required to choose among a set of outcomes

  Agents can choose one outcome in Ω

  Agents can express a preference of outcomes

  Let the set of preference orderings of outcomes

  We also write to express that agent i prefers ω1 to ω2

(22)

Social Welfare

  A social welfare function takes the voter preferences and produces a social preference order:

or in the slighly simplifies form:

  We write to express that the first outcome ranked above the second in the social outcome

(23)

Plurality

  Simplest voting procedure: used to select a single outcome (candidate)

  Everyone submits his preference order, we count how many times each candidate was ranked first

  Easy to implement and to understand

  If the outcomes are just 2, it is called simple majority voting

  If they are more than two, problems arise

(24)

Voting in the UK

  Three main parties:

  Labour Party (left-wing)

  Liberal Democrats (center- left)

  Conservative Party (right- wing)

  Left-wing voter:

  Center voter:

  Right-wing voter:

  Tactical Voting

Labour Party

44%

Liberal Democ

rats Conser

vative Party

44%

Voters

(25)

Condorcet’s Paradox

  Consider this election:

  No matters the outcome we choose: two thirds of the electors will be unhappy

(26)

Sequential Majority

  Series of pairwise elections, the winner will go on to the next election

  An agenda is the strategy we choose to order the elections (linear, binary tree)

  An outcome is a possible winner if there is some agenda which would make that outcome the overall winner

  An outcome is a Condorcet winner if it is the overall winner for every possible agenda

  Can we choose the agenda to choose a winner?

(27)

Borda Count and Slater Ranking

  Borda Count

  We have K outcomes

  Each time an outcome is in the j-th position for some agent, we increment its counter by K-j

  We order the outcomes according to their counter

  Good for single candidates

  Slater ranking

  Tries to be as close to the majority graph as possible

  Unfortunately, is NP-hard

(28)

Desirable Properties

  Pareto condition: if every agent ranks ωi above ωj, then

  Plurality, Borda

  Condorcet winner: if an outcome is a condorcet winner, then it should be ranked first

  Sequential majority elections

  Independence of Irrelevant Alternatives (IIA): social ranking of two outcomes should only be affected by the way that they are ranked in their preference orders

  Almost no protocol satisfies IIA

  Non-dictatorship

ω

i

*

ω

j

(29)

Undesirable properties

  Dictatorship: a social welfare function f is a dictatorship if for some voter j we have that:

Me dispiace, ma io so io e voi non siete un …

(Alberto Sordi, Il Marchese del Grillo)

(30)

Arrow’s Theorem

  Unrestricted Domain:  for any set of individual voter preferences, the social welfare function should yield a unique and complete ranking of societal choices.

  There is no voting procedure for elections with more than two outcomes that satisfies

  Non-dictatorship

  Unrestricted Domain

  Pareto

  Independence of Irrelevant Alternatives

(31)

Gibbard-Satterthwaite’s Theorem

  Sometimes voters “lie” in order to obtain a better outcome

  Is it possible to devise a voting procedure that is not subject to such manipulation?

  Manipulation (i prefers ωi):

  The only procedure that cannot be manipulated and satisfies the Pareto condition is dictatorship

(32)

Complexity and Manipulation

  Even if all procedures can be manipulated, can we devise procedures which are hard to manipulate?

  Hard means “difficult to compute” in an algorithmic sense, e.g., NP-Hard procedures

  These procedures are easy (polynomial) to compute?

  Second-order Copeland may be “difficult” to manipulate

  In theory it is NP-Hard

  However, it is only a worst case complexity

(33)

References

1.  Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations; Yoav Shoham and Kevin Leyton-

Brown; Cambridge Press

2.  Game Theory: Analysis of Conflict; Roger B. Myerson;

Harvard Press

3.  An Introduction to Multi-Agent Systems; Michael Wooldridge; Wiley Press

Riferimenti

Documenti correlati

 when “real” status (the correction) of the past is received from server.  compare it with corresponding

Client-side prediction + server correction, if game status not overly complex. or, slow

The rules also specify that the set of coins that may be turned over depends only on the position of the rightmost coin turned over and not otherwise on the position of the other

▶ For the unanimity games the first 3 axioms determine the Shapley value.. ▶ Using additivity we extend it to all

They could decide to nd the monopoly optimal quantity to produce (it is a−c 2 ), produce each half of it and thus share evenly the oligopoly payo which is ( a−c) 4 2.. In such a

The characteristic function of a TU Game may be obtained starting from the strategic or normal form of the game, taking into account the different possible total payoffs of a

Likewise, if your accomplice confesses while you remain silent, he will go free (freedom) while you do the time (4 years in jail). • If you both confess I get two convictions, but

Game Theory: lecture of May 3, 2019: Solving exercises on Game Theory and Evolutionary