Multiscale problems in differential games
Martino Bardi
Department of Pure and Applied Mathematics University of Padua, Italy
EPSRC Symposium Workshop on Game theory for finance, social and biological sciences (GAM)
Warwick Mathematical Institute, 14th-17th April, 2010
Plan
1 Dynamic Programming and the Isaacs PDE for differential games:
a brief historical overview
2 DGs with random parameters:
  A two-scale model
  Examples from finance and marketing
  Averaging via Bellman-Isaacs equations
3 Singular Perturbations of differential games
Zero-sum differential games
We are given a (nonlinear) system with two controls
$$\dot x_s = f(x_s, \alpha_s, \beta_s), \qquad x_s \in \mathbb{R}^n,\ \alpha_s \in A,\ \beta_s \in B, \qquad x_0 = x,$$
with $A, B$ compact sets, and a cost functional
$$J(t, x, \alpha, \beta) := \int_0^t l(x_s, \alpha_s, \beta_s)\,ds + h(x_t).$$
Player 1, governing $\alpha_s$, wants to MINIMIZE $J$;
Player 2, governing $\beta_s$, wants to MAXIMIZE $J$.
Part I: Dynamic Programming and the Isaacs PDE
Classical idea: the value function of the game,
"$V(t,x) := J(t,x,\alpha^*,\beta^*)$, where $(\alpha^*,\beta^*)$ is a saddle point of the game within feedback controls",
should be a solution of the Isaacs Partial Differential Equation
$$\frac{\partial V}{\partial t} + \min_{b\in B} \max_{a\in A} \{-D_x V \cdot f(x,a,b) - l(x,a,b)\} = 0 \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n,$$
with the initial condition
$$V(0,x) = h(x) \ \text{ in } \mathbb{R}^n.$$
Moreover, from the Hamiltonian computed on V one can (in principle!) synthesize the saddle feedbacks.
To make this rigorous one must answer some questions:
1 Does the definition of V(t,x) make sense?
2 What happens at points where V is not differentiable?
3 Can the Cauchy problem for the Isaacs PDE be solved?
4 Does it determine the value function V?
5 How can we synthesize the saddle if $D_x V$ does not exist, or $\arg\min_b$, $\arg\max_a$ are discontinuous?
Mathematical tools: definitions of value
Question 1: rigorous definition of value.
by discretization: Fleming '60s, Friedman '71;
by nonanticipating strategies (causal maps from the open-loop controls of one player to those of the other player): Varaiya, Roxin, Elliott - Kalton '67-'74; the lower value is
$$V(t,x) := \inf_{\alpha \in \Gamma(t)} \sup_{\beta \in \mathcal{B}(t)} J(t,x,\alpha[\beta],\beta),$$
and the upper value is
$$\tilde V(t,x) := \sup_{\beta \in \Delta(t)} \inf_{\alpha \in \mathcal{A}(t)} J(t,x,\alpha,\beta[\alpha]);$$
if they coincide, the game has a value;
by generalized motions of the system: Krassovski - Subbotin '70s, Berkovitz '80s, ...
Mathematical tools: the viscosity method
Questions 2-3-4: V solves the Isaacs PDE in a generalized sense.
1980 Subbotin: minimax solutions and the Krassovski-Subbotin value;
1981 Crandall - P.-L. Lions: viscosity solutions, existence and uniqueness for the Cauchy problem;
1984 L.C. Evans - Souganidis: the V-R-E-K value is the viscosity solution to the Cauchy problem;
1989 M.B. - Soravia: viscosity solutions for pursuit-evasion games.
Surveys:
Subbotin's book 1995
M.B. - Capuzzo-Dolcetta book 1997
Fleming - Soner book, 2nd ed. 2006
Set $H(x,p) := \min_{b\in B} \max_{a\in A} \{-p \cdot f(x,a,b) - l(x,a,b)\}$
and assume the data are at least continuous, $f$ Lipschitz in $x$, ...
Consider the Cauchy problem
$$(CP) \qquad \frac{\partial u}{\partial t} + H(x, D_x u) = 0 \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n, \qquad u(0,x) = h(x).$$
Main results
i) Comparison Principle: a viscosity subsolution $u$ and a supersolution $v$ of (CP) satisfy $u \le v$ for all $t, x$; so (CP) has at most one viscosity solution;
ii) the lower value $V$ is the continuous viscosity solution of (CP);
iii) the VREK upper value $\tilde V$ is the continuous viscosity solution of $(\widetilde{CP})$ with $\tilde H := \max_{a\in A} \min_{b\in B} \{\dots\}$; so $V \le \tilde V$;
iv) if $H = \tilde H$ (Isaacs condition), then $V = \tilde V$ and the game has a value.
v) All other notions of (lower) value coincide with $V$;
vi) if $u^\varepsilon$ solves
$$\frac{\partial u^\varepsilon}{\partial t} + H(x, D_x u^\varepsilon) = \varepsilon \Delta u^\varepsilon \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n, \qquad u^\varepsilon(0,x) = h(x),$$
then $u^\varepsilon \to V$ locally uniformly;
vii) any monotone and consistent approximation scheme for (CP) converges to $V$.
Remark: vi) is the vanishing viscosity approximation of (CP). In game terms, $u^\varepsilon(t,x) = \inf_\alpha \sup_\beta E[J]$ for the stochastic system
$$dx_s = f(x_s, \alpha_s, \beta_s)\,ds + \sqrt{2\varepsilon}\,dW_s,$$
where $W_s$ is a Brownian motion, i.e., $u^\varepsilon$ is the value of the small noise approximation of the game.
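Item vii) can be illustrated numerically. Below is a minimal sketch (mine, not from the talk) of a monotone, consistent Lax-Friedrichs scheme for the model Hamilton-Jacobi equation $u_t + |u_x| = 0$: with initial datum $u(0,x) = |x|$ the viscosity solution is $u(t,x) = \max(|x| - t, 0)$ by the Hopf-Lax formula, and the scheme approaches it as the grid is refined.

```python
import numpy as np

# Monotone (Lax-Friedrichs) scheme for u_t + |u_x| = 0, u(0,x) = |x|;
# the viscosity solution is u(t,x) = max(|x| - t, 0).
def lax_friedrichs(T=0.5, L=2.0, nx=401, cfl=0.5):
    x = np.linspace(-L, L, nx)
    dx = x[1] - x[0]
    dt = cfl * dx            # CFL: dt * max|H'| <= dx keeps the scheme monotone
    u = np.abs(x)            # initial condition h(x) = |x|
    t = 0.0
    while t < T - 1e-12:
        step = min(dt, T - t)
        p = (u[2:] - u[:-2]) / (2 * dx)          # centered gradient
        u_new = u.copy()
        u_new[1:-1] = 0.5 * (u[2:] + u[:-2]) - step * np.abs(p)
        t += step
        u_new[0], u_new[-1] = L - t, L - t       # exact values on the boundary
        u = u_new
    exact = np.maximum(np.abs(x) - T, 0.0)
    return np.max(np.abs(u - exact))

err = lax_friedrichs()
print(f"max error: {err:.4f}")   # shrinks as the grid is refined
```

The scheme's built-in numerical diffusion plays exactly the role of the $\varepsilon \Delta u^\varepsilon$ term in vi): it smears the kinks slightly but selects the correct viscosity solution.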
A more general context: stochastic differential games
The theory of viscosity solutions also works for STOCHASTIC control and differential games, i.e.,
$$dx_s = f(x_s, \alpha_s, \beta_s)\,ds + \sigma(x_s, \alpha_s, \beta_s)\,dW_s, \qquad x_0 = x,$$
$$J(t,x,\alpha,\beta) := E_x\left[\int_0^t l(x_s, \alpha_s, \beta_s)\,ds + h(x_t)\right].$$
The value function is the unique solution of the Cauchy problem for the (degenerate) parabolic PDE
$$\frac{\partial u}{\partial t} + \min_{b\in B} \max_{a\in A} \mathcal{L}^{a,b} u = 0,$$
where $\mathcal{L}^{a,b}$ is the generator of the diffusion process with constant controls $\alpha_s = a$, $\beta_s = b$:
$$\mathcal{L}^{a,b} u := -\tfrac{1}{2}\,\mathrm{trace}(\sigma \sigma^T D^2 u) - f \cdot Du.$$
1 player: P.-L. Lions 1983; 2 players: Fleming - Souganidis 1989.
Constructive and computational methods
Questions 3 and 5: solve the Isaacs PDE explicitly or numerically, and compute the optimal (saddle) strategies.
Study of singular surfaces in low dimensions: Isaacs, Breakwell, Bernhard, ...
Surveys: Lewin's book 1994, Melikyan's book 1998.
Semi-discrete schemes: discretize time,
$$x_{n+1} = x_n + \Delta t\, f(x_n, a_n, b_n),$$
find the value function $V^{\Delta t}(n,x)$ and the feedback saddle from the D.P. finite-difference equation, then let $\Delta t \to 0$ and show convergence to $V(t,x)$: Fleming, Friedman, ..., Souganidis, M.B. - Falcone.
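The semi-discrete scheme can be sketched as value iteration on a spatial grid. The toy game below is my own illustration (dynamics $f = a + b$, running cost $l = 0$, terminal cost $h(x) = x^2$, and the discrete control sets are all assumptions, not from the talk): the minimizer is twice as fast as the maximizer, so the limit value is $V(t,x) = \max(|x| - t/2,\, 0)^2$; the Isaacs condition holds here, so the order of min and max is immaterial.

```python
import numpy as np

# Semi-discrete DP: V(0,x) = h(x),
# V(n+1, x) = min_a max_b { dt*l + V(n, x + dt*f(x,a,b)) }.
# Toy game: f = a + b, a in {-1,1} (minimizer), b in {-1/2,1/2} (maximizer),
# l = 0, h(x) = x^2, so V(t,x) ~ max(|x| - t/2, 0)^2.
def dp_value(t=1.0, dt=0.01, L=3.0, nx=601):
    x = np.linspace(-L, L, nx)
    A, B = [-1.0, 1.0], [-0.5, 0.5]
    V = x ** 2                        # V(0, x) = h(x)
    for _ in range(round(t / dt)):
        cand_a = []
        for a in A:
            # np.interp evaluates V off-grid (and clamps at the boundary)
            cand_b = [np.interp(x + dt * (a + b), x, V) for b in B]
            cand_a.append(np.maximum.reduce(cand_b))  # max over b
        V = np.minimum.reduce(cand_a)                 # min over a
    return x, V

x, V = dp_value()
approx = np.maximum(np.abs(x) - 0.5, 0.0) ** 2        # expected limit at t = 1
print(f"max deviation: {np.max(np.abs(V - approx)):.3f}")
```

Refining $\Delta t$ and the grid together drives the deviation to zero, which is the convergence statement above.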
Fully discrete schemes: discretize time and space, solve the game on a finite graph, then prove (by viscosity methods)
$$V^{\Delta t, \Delta x} \to V \quad \text{as } \Delta t, \Delta x \to 0;$$
survey in M.B. - Falcone - Soravia, Ann. ISDG 4 (1999).
Methods from the theory of positional differential games and minimax solutions: Krassovski, Subbotin, Patsko ...
Methods based on necessary conditions: Pesch, Breitner ...
Methods from viability theory: Cardaliaguet, Quincampoix, Saint Pierre
Part II: Games with random parameters
Usually the system and costs depend on a vector of parameters $y$:
$$f = f(x,y,a,b), \qquad l = l(x,y,a,b),$$
summarizing all the un-modeled variables.
In practical applications, for short time horizons, one often models the parameters as CONSTANTS: one takes some historical values $y_1, \dots, y_N$ and then estimates $\phi = f, l$ by
$$\phi \approx \frac{1}{N} \sum_{i=1}^N \phi_i, \qquad \phi_i := \phi(x, y_i, a, b),$$
the arithmetic mean of the observed data.
QUESTION: is this correct? And why?
Remark: the data $y_1, \dots, y_N$ often look like samples of a stochastic process. How can we model them?
A process $\tilde y_\tau$ is ergodic with invariant measure $\mu$ if, for all measurable $\phi$,
$$\lim_{T \to +\infty} E\left[\frac{1}{T} \int_0^T \phi(\tilde y_\tau)\,d\tau\right] = \int \phi(y)\,d\mu(y) =: E[\phi].$$
Define $y^\varepsilon_t := \tilde y_{t/\varepsilon}$. Suppose you observe $y^\varepsilon_t$ at the times $t = i/N$, $i = 1, \dots, N$, and want to estimate the system and cost $\phi = f, l$ by
$$\frac{1}{N} \sum_{i=1}^N \phi_i, \qquad \phi_i := \phi(x, y^\varepsilon_{i/N}, a, b).$$
For $N$ large and $\varepsilon$ small, setting $\tau = t/\varepsilon$ we get
$$\frac{1}{N} \sum_{i=1}^N \phi_i \approx \int_0^1 \phi(y^\varepsilon_t)\,dt = \varepsilon \int_0^{1/\varepsilon} \phi(\tilde y_\tau)\,d\tau \approx E[\phi].$$
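A quick numerical sanity check of this averaging heuristic (my own sketch): take an Ornstein-Uhlenbeck process as the fast ergodic parameter and $\phi(y) = y^2$ as test function, for which $E_\mu[\phi] = m^2 + \nu^2$ under the invariant law $\mathcal{N}(m, \nu^2)$.

```python
import numpy as np

# Fast-sampling check: for the ergodic OU process
# dy~ = (m - y~) dtau + sqrt(2)*nu dW, the invariant measure is N(m, nu^2),
# so the arithmetic mean of phi(y^eps) over many fast samples should
# approach E_mu[phi].
rng = np.random.default_rng(0)
m, nu, eps, N = 1.0, 0.5, 1e-3, 2000

dt = 1.0 / N                        # observation spacing in slow time t
y = m                               # start at the mean for simplicity
samples = []
for _ in range(N):
    # advance the sped-up process y_t^eps = y~_{t/eps} by one slow step,
    # using substeps so that dtau = dt/(eps*nsub) stays numerically stable
    nsub = 20
    dtau = dt / eps / nsub
    for _ in range(nsub):
        y += (m - y) * dtau + np.sqrt(2 * dtau) * nu * rng.standard_normal()
    samples.append(y)

phi = np.square(samples)            # phi(y) = y^2, with E_mu[phi] = m^2 + nu^2
print(np.mean(phi), m**2 + nu**2)   # the two numbers should be close
```

The agreement improves with more samples and a smaller $\varepsilon$, exactly as the chain of approximations above predicts.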
Conclusion:
The arithmetic mean of the data is a good approximation of a function of the random parameters if
there are many data, and
the parameters are an ergodic process evolving on a time scale much faster than the state variables.
QUESTION 1: What are the right quantities to average? The system dynamics $f$ and costs $l$ themselves, or something else?
QUESTION 2: Does this model fit real observed data in applications?
Two-scale model of DGs with random parameters
If $\tilde y_\tau$ solves
$$(FS) \qquad d\tilde y_\tau = g(\tilde y_\tau)\,d\tau + \nu(\tilde y_\tau)\,dW_\tau,$$
and $y_t = \tilde y_{t/\varepsilon}$, we get the two-scale system
$$(2SS) \qquad \begin{cases} \dot x_s = f(x_s, y_s, \alpha_s, \beta_s), & x_s \in \mathbb{R}^n, \\ dy_s = \frac{1}{\varepsilon} g(y_s)\,ds + \frac{1}{\sqrt{\varepsilon}} \nu(y_s)\,dW_s, & y_s \in \mathbb{R}^m. \end{cases}$$
We want to understand the limit as $\varepsilon \to 0$: a Singular Perturbation problem.
Main assumption: the fast subsystem (FS) is ergodic, i.e., it has a unique invariant measure $\mu$.
Example 1. Any non-degenerate diffusion
$$d\tilde y_\tau = g(\tilde y_\tau)\,d\tau + \nu(\tilde y_\tau)\,dW_\tau, \qquad \det \nu \ne 0,$$
on a compact manifold, e.g., the torus $\mathbb{T}^m$, is ergodic.
Example 2. The Ornstein-Uhlenbeck process
$$d\tilde y_t = (m - \tilde y_t)\,dt + \sqrt{2}\,\nu\,dW_t$$
($m, \nu$ constant) is ergodic with Gaussian invariant measure $\mu \sim \mathcal{N}(m, \nu^2)$.
It is also mean-reverting, i.e., the drift pulls the process back to its mean value $m$.
Fouque, Papanicolaou, and Sircar give empirical data showing that a good model for the volatility in financial markets is
$$\sigma = \sigma(y^\varepsilon_t), \qquad y^\varepsilon_t := \tilde y_{t/\varepsilon},$$
for some $\sigma(\cdot) > 0$.
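The claim in Example 2 is easy to check empirically; this is my own sketch, simulating one long Euler-Maruyama trajectory and comparing its statistics with $\mathcal{N}(m, \nu^2)$.

```python
import numpy as np

# Empirical check that dy~ = (m - y~) dt + sqrt(2)*nu dW is ergodic
# with invariant law N(m, nu^2): long-run mean -> m, long-run std -> nu.
rng = np.random.default_rng(1)
m, nu, dt, T = 2.0, 0.3, 0.01, 2000.0

n = int(T / dt)
y = np.empty(n)
y[0] = m
noise = rng.standard_normal(n - 1)
for i in range(n - 1):
    y[i + 1] = y[i] + (m - y[i]) * dt + np.sqrt(2 * dt) * nu * noise[i]

burn = y[int(5 / dt):]       # drop a short transient
print(f"mean ~ {burn.mean():.3f} (target {m}), std ~ {burn.std():.3f} (target {nu})")
```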
A natural candidate limit system is
$$\dot x_s = \langle f \rangle(x_s, \alpha_s, \beta_s), \qquad \langle f \rangle(x,a,b) = \int_{\mathbb{R}^m} f(x,y,a,b)\,d\mu(y).$$
More generally, we can consider a stochastic control system with random parameters, so the first equation in (2SS) becomes
$$(Sx) \qquad dx_s = f(x_s, y_s, \alpha_s, \beta_s)\,ds + \sigma(x_s, y_s, \alpha_s, \beta_s)\,dW_s,$$
and then the candidate limit system becomes
$$(S) \qquad dx_s = \langle f \rangle(x_s, \alpha_s, \beta_s)\,ds + \langle \sigma \rangle(x_s, \alpha_s, \beta_s)\,dW_s, \qquad \text{with} \quad \langle \sigma \rangle \langle \sigma \rangle^T(x,a,b) = \int_{\mathbb{R}^m} \sigma \sigma^T(x,y,a,b)\,d\mu(y).$$
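The averaging effect can also be seen at the trajectory level. The sketch below uses assumed, uncontrolled dynamics of my own choosing: slow equation $\dot x = -y^2 x$ driven by a fast OU process with mean $0$, so $\langle f \rangle(x) = -\nu^2 x$ and the candidate limit trajectory is $x_0 e^{-\nu^2 t}$.

```python
import numpy as np

# Two-scale system (no controls, toy data): x' = -y^2 * x with fast OU
# dy = -(y/eps) ds + sqrt(2/eps)*nu dW.  Invariant measure N(0, nu^2)
# gives <y^2> = nu^2, so the averaged system is x' = -nu^2 * x.
rng = np.random.default_rng(2)
nu, eps, T, x0 = 0.5, 1e-4, 1.0, 1.0

dt = 1e-5                    # resolves the fast scale: dt/eps = 0.1
x, y = x0, 0.0
for _ in range(int(T / dt)):
    x += -y**2 * x * dt
    y += -(y / eps) * dt + np.sqrt(2 * dt / eps) * nu * rng.standard_normal()

print(f"two-scale x(T) = {x:.4f}, averaged limit = {x0 * np.exp(-nu**2 * T):.4f}")
```

As $\varepsilon$ shrinks (with the time step refined accordingly), the slow trajectory tracks the averaged one more and more closely.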
For the initial conditions $x_0 = x$, $y_0 = y$ take the cost functional
$$J^\varepsilon(t,x,y,\alpha_\cdot,\beta_\cdot) := E\left[\int_0^t l(x_s, y_s, \alpha_s, \beta_s)\,ds + h(x_t)\right].$$
The value function is
$$V^\varepsilon(t,x,y) := \inf_{\alpha \in \Gamma(t)} \sup_{\beta \in \mathcal{B}(t)} J^\varepsilon(t,x,y,\alpha[\beta],\beta).$$
The candidate limit functional is
$$J(t,x,\alpha_\cdot,\beta_\cdot) = E\left[\int_0^t \langle l \rangle(x_s, \alpha_s, \beta_s)\,ds + h(x_t)\right],$$
with the effective cost $\langle l \rangle(x,a,b) := \int_{\mathbb{R}^m} l(x,y,a,b)\,d\mu(y)$, where $x_s$ is the trajectory of the limit system (S) with $x_0 = x$.
QUESTION: $\lim_{\varepsilon \to 0} V^\varepsilon(t,x,y) = V(t,x) := \inf_\alpha \sup_\beta J(t,x,\alpha[\beta],\beta)$?
Convergence for a split system with a single controller
An answer is known by probabilistic methods if $B$ is a singleton, so the problem is a minimization for the single player $a$.
Theorem [Kushner, book 1990]
If the system (Sx) for the slow variables $x_s$ has $\sigma = \sigma(x,y)$, possibly degenerate but independent of the control, and
$$f(x,y,a) = f_0(x,y) + f_1(x,a), \qquad l(x,y,a) = l_0(x,y) + l_1(x,a),$$
then
$$\lim_{\varepsilon \to 0} V^\varepsilon(t,x,y) = V(t,x) := \inf_{\alpha_\cdot} J(t,x,\alpha_\cdot).$$
Convergence for games with a split system
Theorem [M.B. et al. 2009]
If the system (Sx) for the slow variables $x_s$ has $\sigma = \sigma(x,y)$, possibly degenerate but independent of the control, and
$$f(x,y,a,b) = f_0(x,y) + f_1(x,a,b), \qquad l(x,y,a,b) = l_0(x,y) + l_1(x,a,b),$$
then
$$\lim_{\varepsilon \to 0} V^\varepsilon(t,x,y) = V(t,x) := \inf_{\alpha} \sup_{\beta} J(t,x,\alpha[\beta],\beta).$$
It is proved by PDE methods instead of probabilistic ones, and it is a special case of the general result we show later.
N.B.: the split system and the uncontrolled diffusion $\sigma$ are restrictive assumptions: see the next examples.
Financial models: option pricing
The evolution of a stock $S$ with stochastic volatility $\sigma$ is
$$d\log S_s = \gamma\,ds + \sigma(y_s)\,dW_s, \qquad dy_s = \frac{1}{\varepsilon}(m - y_s)\,ds + \sqrt{\frac{2}{\varepsilon}}\,\nu\,d\tilde W_s.$$
There is NO control, $W_\cdot$ and $\tilde W_\cdot$ can be correlated, $l \equiv 0$, and, e.g., the terminal cost at time $t$ is $h(S_t) = (S_t - K)^+$ for European call options.
Then, as $\varepsilon \to 0$,
$$V^\varepsilon(t,x,y) := E[h(S_t) \mid S_0 = x, y_0 = y] \to V(t,x),$$
the Black-Scholes formula of the model with (constant) mean historical volatility:
$$d\log S_s = \gamma\,ds + \langle \sigma \rangle\,dW_s, \qquad \langle \sigma \rangle^2 = \int_{\mathbb{R}} \sigma^2(y)\,\frac{1}{\sqrt{2\pi\nu^2}}\,e^{-(y-m)^2/2\nu^2}\,dy.$$
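The effective-volatility integral is a Gaussian average and can be evaluated by Gauss-Hermite quadrature. The choice $\sigma(y) = e^y$ below is an assumed illustration (not from the talk), convenient because then $\langle \sigma \rangle^2 = e^{2m + 2\nu^2}$ in closed form.

```python
import numpy as np

# <sigma>^2 = integral of sigma^2(y) against the N(m, nu^2) density,
# computed by Gauss-Hermite quadrature (weight e^{-z^2}, y = m + sqrt(2)*nu*z).
def effective_vol_sq(sigma, m, nu, n=40):
    z, w = np.polynomial.hermite.hermgauss(n)   # nodes/weights for e^{-z^2}
    y = m + np.sqrt(2.0) * nu * z               # change of variables
    return np.sum(w * sigma(y) ** 2) / np.sqrt(np.pi)

m, nu = 0.1, 0.4
approx = effective_vol_sq(np.exp, m, nu)
exact = np.exp(2 * m + 2 * nu ** 2)             # closed form for sigma(y)=e^y
print(approx, exact)                            # should agree closely
```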
Merton portfolio optimization problem
Invest $\beta_s$ in the stock $S_s$ and $1 - \beta_s$ in a bond with interest rate $r$. Then the wealth $x_s$ evolves as
$$dx_s = (r + (\gamma - r)\beta_s)x_s\,ds + x_s \beta_s \sigma(y_s)\,dW_s, \qquad dy_s = \frac{1}{\varepsilon}(m - y_s)\,ds + \sqrt{\frac{2}{\varepsilon}}\,\nu\,d\tilde W_s,$$
and we want to maximize the expected utility at time $t$, $E[h(x_t)]$, for some $h$ increasing and concave.
N.B.: the diffusion term depends on the control and is not in split form, so the previous theory does not apply.
QUESTIONS:
Is the limit as $\varepsilon \to 0$ a Merton problem with constant volatility?
If so, is the previous averaged system still correct?
An advertising model
Consider a duopoly: in a market with total sales $M$, the sales of firm 1 are $S_s$, those of firm 2 are $M - S_s$, and $\alpha_s, \beta_s \ge 0$ are the advertising efforts. Take the Lanchester dynamics
$$\dot S_s = (M - S_s)\alpha_s - \beta_s S_s$$
and the objective functionals ($r_i, \theta_i > 0$)
$$J_1 = \int_0^t \left(r_1 S_s - \theta_1 \alpha_s^2\right) ds, \qquad J_2 = \int_0^t \left(r_2(M - S_s) - \theta_2 \beta_s^2\right) ds.$$
This can be written as a zero-sum game with cost functional ($\theta > 0$)
$$J = \int_0^t \left(r S_s + \theta \alpha_s^2 - \beta_s^2\right) ds;$$
see Jorgensen and Zaccour, book 2004.
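With constant efforts the Lanchester dynamics relax to an explicit steady state, $S^* = M\alpha/(\alpha + \beta)$, which makes the model easy to explore numerically; here is a minimal Euler sketch (parameter values are my own, for illustration).

```python
import numpy as np

# Lanchester dynamics S' = (M - S)*alpha - beta*S with constant efforts:
# the decay rate is alpha + beta and the steady state is M*alpha/(alpha+beta).
def lanchester(M=100.0, alpha=0.3, beta=0.2, S0=10.0, T=50.0, dt=0.01):
    S = S0
    for _ in range(int(T / dt)):
        S += ((M - S) * alpha - beta * S) * dt
    return S

S_final = lanchester()
print(S_final, 100.0 * 0.3 / (0.3 + 0.2))   # converges to the steady state 60
```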
If the three parameters $M, r, \theta$ are random, assume they are functions of a fast ergodic process $y^\varepsilon_s$.
The system is not split because there is a term $M(y_s)\alpha_s$.
QUESTIONS:
Is the limit as $\varepsilon \to 0$ a Lanchester system with an objective functional linear in the state and quadratic in the controls?
If so, what are the effective parameters?
Convergence via Bellman-Isaacs equations
1 Write the Hamilton-Jacobi-Bellman-Isaacs equation for the value function $V^\varepsilon$;
2 find a limit (effective) PDE such that $V^\varepsilon$ converges to its solution $V$;
3 identify the limit PDE as a Bellman-Isaacs equation for a new system $\bar f$ and cost functional $\bar J$, so that $V$ is the value function of this new game.
Step 1: $V^\varepsilon$ solves
$$\frac{\partial V^\varepsilon}{\partial t} + H\!\left(x, y, D_x V^\varepsilon, D^2_{xx} V^\varepsilon\right) - \frac{1}{\varepsilon}\mathcal{L}V^\varepsilon = 0 \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}^m, \qquad V^\varepsilon(0,x,y) = h(x) \ \text{ in } \mathbb{R}^n \times \mathbb{R}^m,$$
where
$$H\!\left(x, y, D_x, D^2_{xx}\right) := \min_{b\in B} \max_{a\in A} \left\{-\mathrm{trace}(\sigma \sigma^T D^2_{xx}) - f \cdot D_x - l\right\}, \qquad \mathcal{L} := \mathrm{trace}(\nu \nu^T D^2_{yy}) + g \cdot D_y.$$
Step 2: look for an effective $\bar H$ such that the limit equation is
$$\frac{\partial V}{\partial t} + \bar H\!\left(x, D_x V, D^2_{xx} V\right) = 0 \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n.$$
Theorem [M.B., Cesaroni, Manca 2009]
$V^\varepsilon(t,x,y) \to V(t,x)$ as $\varepsilon \to 0$, locally uniformly, and $V$ solves (in the viscosity sense)
$$\frac{\partial V}{\partial t} + \int H\!\left(x, y, D_x V, D^2_{xx} V\right) d\mu(y) = 0 \ \text{ in } \mathbb{R}_+ \times \mathbb{R}^n,$$
where $\mu$ is the invariant measure of the fast subsystem (FS).
Step 3: if there exist an effective system and cost $\bar f, \bar\sigma, \bar l$ such that
$$\bar H := \int \min_{b\in B} \max_{a\in A} \left\{-\mathrm{trace}(\sigma \sigma^T D^2_{xx}) - f \cdot D_x - l\right\} d\mu(y) = \min_{b\in B} \max_{a\in A} \left\{-\mathrm{trace}(\bar\sigma \bar\sigma^T D^2_{xx}) - \bar f \cdot D_x - \bar l\right\},$$
then
$$V(t,x) := \inf_\alpha \sup_\beta E\left[\int_0^t \bar l(x_s, \alpha[\beta]_s, \beta_s)\,ds + h(x_t)\right],$$
with $x_s$ solving $dx_s = \bar f(x_s, \alpha[\beta]_s, \beta_s)\,ds + \bar\sigma(x_s, \alpha[\beta]_s, \beta_s)\,dW_s$.
Corollary
For split systems, i.e.,
$$\sigma = \sigma(x,y), \qquad f = f_0(x,y) + f_1(x,a,b), \qquad l = l_0(x,y) + l_1(x,a,b),$$
the limit (effective) system and cost are obtained by averaging w.r.t. $\mu(y)$:
$$\bar f = \langle f \rangle = \int f_0(x,y)\,d\mu(y) + f_1(x,a,b), \qquad \bar\sigma \bar\sigma^T = \langle \sigma \rangle \langle \sigma \rangle^T = \int \sigma \sigma^T\,d\mu(y), \qquad \bar l = \langle l \rangle = \int l_0(x,y)\,d\mu(y) + l_1(x,a,b).$$
Proof: under these assumptions $\int \cdot\,d\mu$ and $\min_{b\in B} \max_{a\in A}$ commute:
$$\bar H = \int \min_{b\in B} \max_{a\in A} \{\dots\}\,d\mu(y) = \min_{b\in B} \max_{a\in A} \int \{\dots\}\,d\mu(y).$$
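The commutation step can be checked numerically on toy data; in the sketch below everything (the sample "measure", the discretized control grids, and the functions $f_0, l_0, f_1, l_1$) is assumed for illustration, not taken from the talk.

```python
import numpy as np

# Split Hamiltonian: H(y,p) = min_b max_a { -(f0(y)+f1(a,b))*p - (l0(y)+l1(a,b)) }.
# Since the y-part enters additively, averaging over y commutes with min-max.
rng = np.random.default_rng(3)
ys = rng.standard_normal(200)              # samples standing in for mu
A = np.linspace(-1, 1, 11)                 # discretized control sets
B = np.linspace(-1, 1, 11)
p = 0.7

f0 = lambda y: np.sin(y)
l0 = lambda y: y ** 2
f1 = lambda a, b: a + 2 * b
l1 = lambda a, b: a ** 2 - b ** 2

Aa, Bb = np.meshgrid(A, B, indexing="ij")  # grids over (a, b)

def minmax(F):
    # min over b of max over a, with F[i, j] = value at (a_i, b_j)
    return np.min(np.max(F, axis=0))

# left: average of the min-max over y; right: min-max of the averaged data
lhs = np.mean([minmax(-(f0(y) + f1(Aa, Bb)) * p - (l0(y) + l1(Aa, Bb)))
               for y in ys])
rhs = minmax(-(np.mean(f0(ys)) + f1(Aa, Bb)) * p
             - (np.mean(l0(ys)) + l1(Aa, Bb)))
print(lhs, rhs)                            # equal, up to round-off
```

The two sides match to round-off precisely because, for split data, the $y$-dependent part is a constant with respect to $(a, b)$ and slides through the min-max.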