ChiaraMocenni CooperationandAwarenessthroughNetworkedEvolutionaryGames

(1)

Cooperation and Awareness through Networked Evolutionary Games

Chiara Mocenni

Department of Information Engineering and Mathematics University of Siena (Italy)

[email protected]

Pathways of Change - Pontignano (SI), April 24-27, 2019

(2)

Outline

Game Theory

Models grounded on Evolutionary Game Theory are used to tackle the problem of cooperation in social systems

Evolutionary Game Theory on Networks allowed to introduce a structure in the population and to solve the problem of cooperation in some cases

Evolutionary Game Theory on Networks with Self-regulation and Awareness may explain partial or full emergence of cooperation among individuals

(3)

A dominant game: Prisoner’s dilemma (PD)

Two players (the thieves)

Two strategies: confess (accuse), not confess (silent) Confess is a non cooperative, defective (D) behavior Not confess is a cooperative (C) behavior

Payoffs (years in jail, minimization problem):

Rational people say that best strategy is to confess (defect, D)...

Player 2

D C

D 2 0

Player 1 ⇑ ⇑

C 3 1

(4)

A dominant game: Prisoner’s dilemma (PD)

Player 2

D C

D 2,2 ⇐ 0,3

Player 1 ⇑ ⇑

C 3,0 ⇐ 1,1

...but a deeper investigation shows that it’s better for both not to confess (cooperate, C)!

Arrows represent the preferences of rational players

(D, D) is a pure Nash equilibrium*: no player has anything to gain by changing only his own strategy unilaterally

*J.F. Nash was awarded with the Noble Prize in 1994 thanks to his huge contribution to game theory

(5)

A bistable game: Stag Hunt (SH)

Two players (the hunters)

Two strategies: hunt a stag, hunt a hare

To hunt a stag, the hunters must cooperate (C) in order to be successful

An hunter can get a hare by himself, but a hare is worth less than a stag (defect, D)

Payoffs (rewards, maximization problem):

Best strategies are that both players hunt a stag (C) or a hare (D), but a stag is better than a hare

Player 2

C D

C 4 1

Player 1 ⇑ ⇓

D 3 2

(6)

A coexistence game: Hawk Dove (HD)

Two players (the contestants in a competition for a common resource)

Two strategies: conciliating (C), conflicting (D)

The contestants can choose conciliating (cooperation, C) Or, they can choose conflicting (defection, D)

Payoffs (rewards, maximization problem):

Best strategies are conciliating (C) when the opponent is conflicting or conflicting (D) when the opponent is conciliating

Player 2

C D

C 3 2

Player 1 ⇓ ⇑

D 4 1

(7)

Replication and Selection

Selection enforces one out of two or more different types of individuals to prevail in the population because they reproduce (Replication) faster

˙

x_A = x_A(a − φ), ˙x_B = x_B(b − φ)

To ensure that xA+ xB = 1, we have that φ = axA+ bxB

x_A = x , x_B = 1 − x , =⇒x = x (1 − x )(a − b)˙ The fitness landscape is constant

Mutation is also important, but not discussed here.

xA and xB are the abundances of types A and B

The reproduction rates a and b are constant

(8)

Density dependent fitness

Realistically, the fitness landscape changes according to the density of phenotypes (e.g. changing landscapes, bottelneck in traffic)

The selection mechanism re-reads as:

˙

x = x (1 − x )(fA(x ) − fB(x ))

Considering a large population, where Individuals (decision makers) are players Phenotypes are strategies

Players can calculate the fitness of each available strategy by a game payoff matrix

Evolutionary Game Theory

(9)

Games as social dilemmas

Payoff matrix P =

R S

T P

where

R is the reward for cooperation

T is the temptation to defect when the opponent cooperates S is the sucker payoff of cooperating when the opponent defects

P is the punishment for defection

Game Payoffs Tensions included in each dilemma

HD T > R > S > P Unilateral defection is preferred to mutual cooperation

SH R > T > P > S Mutual defection is preferred to mutual cooperation

PD T > R > P > S Both tensions above are incorporated in this dilemma

(10)

The Replicator Equation (RE)

(11)

Individuals are preprogrammed to play a pure strategy inherited from the parents

(12)

Social behavior: Evolutionary Game Equation on Networks

In human behavior, individuals are not identical agents May change their behavioral strategy over time

Are able to make decisions by optimizing a payoff function Are interconnected to each other

Let v and w be two individuals, and G = {g_{v ,w}} the adjacency matrix of the graph of connections:

˙

xv = xv(1 − xv)

" _N X

w =1

gv ,w(fv ,C(xw) − fv ,D(xw))

#

, v = 1, . . . , N

The graph dependence is explicit

The two players are distinct

D. Madeo, CM, Game Interactions and Dynamics on Networked Populations, IEEE TAC, 2015

(13)

The EGN optimizes the payoff

In EGN f_{v ,C}(xw) is the payoff that player v gets when he plays C against a player who plays x_w, provided that g_{v ,w} 6= 0.

The equation can be rewritten as follows:

˙

xv = xv(1 − xv)∂fv

∂xv

,

Player v is maximizing his payoff over time

∂f_v

∂xv

= k_v[(σ_C + σ_D) ¯x_v − σ_D]

where ¯x_v = 1 kv

N

X

w =1

g_{v ,w}x_w and k_v =

N

X

w =1

g_{v ,w} is the degree of v

¯

x_v is the equivalent player

(14)

Equivalent player on networks

(15)

Nash Equilibria of EGN

A profile of strategies x is a Nash Equilibrium (NE) if, ∀v , the payoff earned by v , when he plays x_v against ¯x_v, is bigger or equal than the payoff of any other strategy z against the same equivalent opponent ¯xv (all other players are assumed do not change their strategy)

x^T_vP¯x_v ≥ z^TP¯x_v, ∀z ∈ ∆_M

NEs are interesting solutions because they represent compromise among players

NEs prevent all players from changing their decision unilaterally

NE have a relevant role for the system dynamics

(16)

Simulation of EGN for the Prisoner’s dilemma game

Initial condition Dynamics

t xv(t)

0 (D) 1 (C)

Arbitrary network topology

Defective consensus (ALLD) is reached starting from any initial condition x(0) ∈ (0, 1)^N.

(17)

Cooperation in the non strict Prisoner’s dilemma game

Initial Conditions

The payoff matrix of a PD game is:

P =

1 0

T P

with

T > 1 > P > 0 If P = 0 the NE

“D” is non strict

Steady state

(18)

Simulation of EGN for Stag Hunt and Hawk Dove games

Stag Hunt game (SH)

t xv(t)

0 (D) 1 (C)

σ_C > 0, σ_D > 0

Convergence to ALLD or to ALLC (depends on x(0))

Hawk Dove game (HD)

t xv(t)

0 (D) 1 (C)

σ_C < 0, σ_D < 0 No consensus

(19)

Inhomogeneous payoff matrices in 2-star networks (1/2)

Oscillations emerge in networks where nodes share coexistence and bistable payoff matrices

Unconnected stars Connected stars

(20)

Inhomogeneous payoff matrices in 2-star networks (2/2)

Numerical simulation of EGN where nodes share coexistence and bistable payoff matrices

Left star nodes:

|σC| = |σD| = 0.5

Intermediate nodes:

|σC| = |σD| = 1.5

Right star nodes:

|σC| = |σD| = 1.5

(21)

Cooperation vs competition

In Evolutionary Game Theory...

Natural selection, on its own, opposes cooperation

Non-cooperators will do better than cooperators and wipe them out (M. Nowak)

Biological evolution is grounded on selection and competition Anyway...

Cooperation is present in biology and in our everyday life Cooperation can spread out, sustain itself and be resilient

How does competition can coexist with cooperation?

(22)

Prisoner’s dilemma and Cooperation

The Prisoner’s dilemma game is a prototypical social dilemma:

defection out-performs cooperation, but cooperation provides a higher payoff (no NE because of high temptation to defect) In EGN for a PD game: σC < 0 and σD > 0, then

∂fv/∂xv ≤ 0 and ˙x_v ≤ 0: the level of cooperation decreases over time towards full defection

Stag Hunt game is also related to a social dilemma, where cooperation for the two players depends on the initial conditions (bistability)

Hawk Dove game represents the social dilemma where one of the two players will always defect (coexistence)

(23)

Introducing the self-regulation in EGN

The willingness to pursue cooperation as a greater good follows from internal mechanisms correlated to personal awareness and culture (Vogel 2004)

These mechanisms should be able to reduce the rational temptation to defect

We introduce these mechanisms in the EGN equation by adding a term h_v balancing ∂f_v/∂x_v (Leonard 2018) This term is weighted by a parameter βv which measures the inertia of a player with respect to his neighbors’ actions The term βvhv is an internal feedback describing a virtual game that each individual plays against himself, i.e. a self game

(24)

The SR-EGN equation

External feedback Internal feedback SR-EGN equation

(25)

Self regulation and Awareness

A generic player in the new model can be aware of the conflicting context where he participates, and he may know the importance of cooperation as a primal objective to be pursued

Remarkably, this term models a spontaneous learning internal process, thus representing a time varying feature of each individual

(26)

Self-regulated SH and HD games

t x

v

(t )

0 (D) 1 (C)

Convergence to ALLM for σC = −3 and σD = −4

(27)

Heterogeneous networks

Damped oscillations emerge in networks where nodes share coexistence and bistable payoff matrices

Unconnected stars Connected stars

(28)

Heterogeneous 2-star networks with self-regulation

Numerical simulation of EGN where nodes share coexistence and bistable payoff matrices and self-regulation

Left star nodes:

|σC| = |σD| = 0.5

Intermediate nodes:

|σC| = |σD| = 1.5

Right star nodes:

|σC| = |σD| = 1.5

(29)

P-driven and T-driven PD games

When studying the SR-EGN equation for the PD game, we have to distinguish two cases:

When the effect of σC is stronger than σD (|σC| > σ_D), the game is more influenced by temptation

In a T-driven game σ_C + σ_D < 0

When (|σC| < σ_D) punishment is more effective In a P-driven game σ_C + σ_D > 0

(30)

P-driven and T-driven PD games

Theorem

If ∀v , βv > kv, then x^∗ALLC is asymptotically stable. Moreover, x^∗_ALLD is unstable.

Theorem

If ∀v , βv > ρkv, where ρ = |σ_C/σ_D| ≥ 1 for a T-driven game and ρ = |σ_D/σ_C| ≥ 1 for a P-driven game, then x^∗_ALLC is globally asymptotically stable.

The proof follows by defining a suitable a

Lyapunov function

t xv(t)

0 (D) 1 (C)

(31)

Flow of EGN with self-regulation

Convincing individuals with high degree kv to be cooperative requires potentially large values of βv

Anyway, cooperation may be achieved for smaller values of β_v

(32)

Average distribution of strategies in random networks

Average degree k = 10

(33)

Cooperation in SR-EGN equation may diffuse over the network

(34)

Reversed-diffusion in T-driven PD game

(35)

Selfishness and altruism within heterogeneous populations

We consider the difference between the level of cooperation of v and the average cooperation of his neighbors at steady state

cv = xv− ¯xv cv > 0: the level of

cooperation of v is higher than his neighbors (altruism) cv < 0 indicates selfishness Non central players are more altruistic than hubs

Intermediate players split into altruist and selfish (T-driven) Continuous distribution from altruist to selfish (P-driven)

β = 15, ∀v

(36)

Conclusions

We have discussed the problem of cooperation in networked systems modeled as evolutionary games

Cooperation can be promoted by the introduction of self games and awareness

Under strong punishment cooperation is a diffusive process, while it presents reversed diffusion under strong temptation Cooperation is associated to a learning process

Future studies will include breaking the symmetry on the payoff matrices, evaluating the presence of bifurcations, accounting for random disturbances over time and performing stochastic simulations of the self-regulated model

(37)

This study results from the cooperation with Dario Madeo

Thank you for your attention!