Cooperation and Awareness through Networked Evolutionary Games
Chiara Mocenni
Department of Information Engineering and Mathematics University of Siena (Italy)
Pathways of Change - Pontignano (SI), April 24-27, 2019
Outline
Game Theory
Models grounded on Evolutionary Game Theory are used to tackle the problem of cooperation in social systems
Evolutionary Game Theory on Networks allowed to introduce a structure in the population and to solve the problem of cooperation in some cases
Evolutionary Game Theory on Networks with Self-regulation and Awareness may explain partial or full emergence of cooperation among individuals
A dominant game: Prisoner’s dilemma (PD)
Two players (the thieves)
Two strategies: confess (accuse), not confess (silent) Confess is a non cooperative, defective (D) behavior Not confess is a cooperative (C) behavior
Payoffs (years in jail, minimization problem):
Rational people say that best strategy is to confess (defect, D)...
Player 2
D C
D 2 0
Player 1 ⇑ ⇑
C 3 1
A dominant game: Prisoner’s dilemma (PD)
Player 2
D C
D 2,2 ⇐ 0,3
Player 1 ⇑ ⇑
C 3,0 ⇐ 1,1
...but a deeper investigation shows that it’s better for both not to confess (cooperate, C)!
Arrows represent the preferences of rational players
(D, D) is a pure Nash equilibrium*: no player has anything to gain by changing only his own strategy unilaterally
*J.F. Nash was awarded with the Noble Prize in 1994 thanks to his huge contribution to game theory
A bistable game: Stag Hunt (SH)
Two players (the hunters)
Two strategies: hunt a stag, hunt a hare
To hunt a stag, the hunters must cooperate (C) in order to be successful
An hunter can get a hare by himself, but a hare is worth less than a stag (defect, D)
Payoffs (rewards, maximization problem):
Best strategies are that both players hunt a stag (C) or a hare (D), but a stag is better than a hare
Player 2
C D
C 4 1
Player 1 ⇑ ⇓
D 3 2
A coexistence game: Hawk Dove (HD)
Two players (the contestants in a competition for a common resource)
Two strategies: conciliating (C), conflicting (D)
The contestants can choose conciliating (cooperation, C) Or, they can choose conflicting (defection, D)
Payoffs (rewards, maximization problem):
Best strategies are conciliating (C) when the opponent is conflicting or conflicting (D) when the opponent is conciliating
Player 2
C D
C 3 2
Player 1 ⇓ ⇑
D 4 1
Replication and Selection
Selection enforces one out of two or more different types of individuals to prevail in the population because they reproduce (Replication) faster
˙
xA = xA(a − φ), ˙xB = xB(b − φ)
To ensure that xA+ xB = 1, we have that φ = axA+ bxB
xA = x , xB = 1 − x , =⇒x = x (1 − x )(a − b)˙ The fitness landscape is constant
Mutation is also important, but not discussed here.
xA and xB are the abundances of types A and B
The reproduction rates a and b are constant
Density dependent fitness
Realistically, the fitness landscape changes according to the density of phenotypes (e.g. changing landscapes, bottelneck in traffic)
The selection mechanism re-reads as:
˙
x = x (1 − x )(fA(x ) − fB(x ))
Considering a large population, where Individuals (decision makers) are players Phenotypes are strategies
Players can calculate the fitness of each available strategy by a game payoff matrix
Evolutionary Game Theory
Games as social dilemmas
Payoff matrix P =
R S
T P
where
R is the reward for cooperation
T is the temptation to defect when the opponent cooperates S is the sucker payoff of cooperating when the opponent defects
P is the punishment for defection
Game Payoffs Tensions included in each dilemma
HD T > R > S > P Unilateral defection is preferred to mutual cooperation
SH R > T > P > S Mutual defection is preferred to mutual co- operation
PD T > R > P > S Both tensions above are incorporated in this dilemma
The Replicator Equation (RE)
Individuals are preprogrammed to play a pure strategy inherited from the parents
Social behavior: Evolutionary Game Equation on Networks
In human behavior, individuals are not identical agents May change their behavioral strategy over time
Are able to make decisions by optimizing a payoff function Are interconnected to each other
Let v and w be two individuals, and G = {gv ,w} the adjacency matrix of the graph of connections:
˙
xv = xv(1 − xv)
" N X
w =1
gv ,w(fv ,C(xw) − fv ,D(xw))
#
, v = 1, . . . , N
The graph dependence is explicit
The two players are distinct
D. Madeo, CM, Game Interactions and Dynamics on Networked Populations, IEEE TAC, 2015
The EGN optimizes the payoff
In EGN fv ,C(xw) is the payoff that player v gets when he plays C against a player who plays xw, provided that gv ,w 6= 0.
The equation can be rewritten as follows:
˙
xv = xv(1 − xv)∂fv
∂xv
,
Player v is maximizing his payoff over time
∂fv
∂xv
= kv[(σC + σD) ¯xv − σD]
where ¯xv = 1 kv
N
X
w =1
gv ,wxw and kv =
N
X
w =1
gv ,w is the degree of v
¯
xv is the equivalent player
Equivalent player on networks
Nash Equilibria of EGN
A profile of strategies x is a Nash Equilibrium (NE) if, ∀v , the payoff earned by v , when he plays xv against ¯xv, is bigger or equal than the payoff of any other strategy z against the same equivalent opponent ¯xv (all other players are assumed do not change their strategy)
xTvP¯xv ≥ zTP¯xv, ∀z ∈ ∆M
NEs are interesting solutions because they represent compromise among players
NEs prevent all players from changing their decision unilaterally
NE have a relevant role for the system dynamics
Simulation of EGN for the Prisoner’s dilemma game
Initial condition Dynamics
t xv(t)
0 (D) 1 (C)
Arbitrary network topology
Defective consensus (ALLD) is reached starting from any initial condition x(0) ∈ (0, 1)N.
Cooperation in the non strict Prisoner’s dilemma game
Initial Conditions
The payoff matrix of a PD game is:
P =
1 0
T P
with
T > 1 > P > 0 If P = 0 the NE
“D” is non strict
Steady state
Simulation of EGN for Stag Hunt and Hawk Dove games
Stag Hunt game (SH)
t xv(t)
0 (D) 1 (C)
σC > 0, σD > 0
Convergence to ALLD or to ALLC (depends on x(0))
Hawk Dove game (HD)
t xv(t)
0 (D) 1 (C)
σC < 0, σD < 0 No consensus
Inhomogeneous payoff matrices in 2-star networks (1/2)
Oscillations emerge in networks where nodes share coexistence and bistable payoff matrices
Unconnected stars Connected stars
Inhomogeneous payoff matrices in 2-star networks (2/2)
Numerical simulation of EGN where nodes share coexistence and bistable payoff matrices
Left star nodes:
|σC| = |σD| = 0.5
Intermediate nodes:
|σC| = |σD| = 1.5
Right star nodes:
|σC| = |σD| = 1.5
Cooperation vs competition
In Evolutionary Game Theory...
Natural selection, on its own, opposes cooperation
Non-cooperators will do better than cooperators and wipe them out (M. Nowak)
Biological evolution is grounded on selection and competition Anyway...
Cooperation is present in biology and in our everyday life Cooperation can spread out, sustain itself and be resilient
How does competition can coexist with cooperation?
Prisoner’s dilemma and Cooperation
The Prisoner’s dilemma game is a prototypical social dilemma:
defection out-performs cooperation, but cooperation provides a higher payoff (no NE because of high temptation to defect) In EGN for a PD game: σC < 0 and σD > 0, then
∂fv/∂xv ≤ 0 and ˙xv ≤ 0: the level of cooperation decreases over time towards full defection
Stag Hunt game is also related to a social dilemma, where cooperation for the two players depends on the initial conditions (bistability)
Hawk Dove game represents the social dilemma where one of the two players will always defect (coexistence)
Introducing the self-regulation in EGN
The willingness to pursue cooperation as a greater good follows from internal mechanisms correlated to personal awareness and culture (Vogel 2004)
These mechanisms should be able to reduce the rational temptation to defect
We introduce these mechanisms in the EGN equation by adding a term hv balancing ∂fv/∂xv (Leonard 2018) This term is weighted by a parameter βv which measures the inertia of a player with respect to his neighbors’ actions The term βvhv is an internal feedback describing a virtual game that each individual plays against himself, i.e. a self game
The SR-EGN equation
External feedback Internal feedback SR-EGN equation
Self regulation and Awareness
A generic player in the new model can be aware of the conflicting context where he participates, and he may know the importance of cooperation as a primal objective to be pursued
Remarkably, this term models a spontaneous learning internal process, thus representing a time varying feature of each individual
Self-regulated SH and HD games
t x
v(t )
0 (D) 1 (C)
Convergence to ALLM for σC = −3 and σD = −4
Heterogeneous networks
Damped oscillations emerge in networks where nodes share coexistence and bistable payoff matrices
Unconnected stars Connected stars
Heterogeneous 2-star networks with self-regulation
Numerical simulation of EGN where nodes share coexistence and bistable payoff matrices and self-regulation
Left star nodes:
|σC| = |σD| = 0.5
Intermediate nodes:
|σC| = |σD| = 1.5
Right star nodes:
|σC| = |σD| = 1.5
P-driven and T-driven PD games
When studying the SR-EGN equation for the PD game, we have to distinguish two cases:
When the effect of σC is stronger than σD (|σC| > σD), the game is more influenced by temptation
In a T-driven game σC + σD < 0
When (|σC| < σD) punishment is more effective In a P-driven game σC + σD > 0
P-driven and T-driven PD games
Theorem
If ∀v , βv > kv, then x∗ALLC is asymptotically stable. Moreover, x∗ALLD is unstable.
Theorem
If ∀v , βv > ρkv, where ρ = |σC/σD| ≥ 1 for a T-driven game and ρ = |σD/σC| ≥ 1 for a P-driven game, then x∗ALLC is globally asymptotically stable.
The proof follows by defining a suitable a
Lyapunov function
t xv(t)
0 (D) 1 (C)
Flow of EGN with self-regulation
Convincing individuals with high degree kv to be cooperative requires potentially large values of βv
Anyway, cooperation may be achieved for smaller values of βv
Average distribution of strategies in random networks
Average degree k = 10
Cooperation in SR-EGN equation may diffuse over the network
Reversed-diffusion in T-driven PD game
Selfishness and altruism within heterogeneous populations
We consider the difference between the level of cooperation of v and the average cooperation of his neighbors at steady state
cv = xv− ¯xv cv > 0: the level of
cooperation of v is higher than his neighbors (altruism) cv < 0 indicates selfishness Non central players are more altruistic than hubs
Intermediate players split into altruist and selfish (T-driven) Continuous distribution from altruist to selfish (P-driven)
β = 15, ∀v
Conclusions
We have discussed the problem of cooperation in networked systems modeled as evolutionary games
Cooperation can be promoted by the introduction of self games and awareness
Under strong punishment cooperation is a diffusive process, while it presents reversed diffusion under strong temptation Cooperation is associated to a learning process
Future studies will include breaking the symmetry on the payoff matrices, evaluating the presence of bifurcations, accounting for random disturbances over time and performing stochastic simulations of the self-regulated model
This study results from the cooperation with Dario Madeo
Thank you for your attention!