
Introduction to optimal control theory in continuous time

(with economic applications)

Salvatore Federico

June 26, 2017


Contents

1 Introduction to optimal control problems in continuous time
1.1 From static to dynamic optimization
1.2 Basic ODEs theory
1.3 Formulation of optimal control problems
1.3.1 State equation
1.3.2 Set of admissible control strategies
1.3.3 Objective functional
1.3.4 Feedback control strategies
1.4 Examples
1.4.1 Example 1: Cake's problem
1.4.2 Example 2: Optimal consumption in AK growth models
1.4.3 Example 3: Optimal investment

2 Dynamic Programming method
2.1 Dynamic Programming Principle
2.2 HJB equation
2.3 Verification Theorem and optimal feedbacks
2.3.1 Autonomous problems with infinite horizon

3 Application to economic problems
3.1 AK model: finite horizon
3.2 AK model: infinite horizon
3.3 Optimal investment model: infinite horizon
3.4 Exercises


Chapter 1

Introduction to optimal control problems in continuous time

1.1 From static to dynamic optimization

The problem of utility maximization is a classical problem of Economic Theory which in its simplest formulation can nicely illustrate the passage from static to dynamic optimization problems.

Static case

Consider a consumer with an initial amount of money x0 who may consume k different goods and aims at maximizing their satisfaction from consumption (without taking any debt). If c = (c1, ..., ck) denotes the vector of consumed quantities (clearly nonnegative), p = (p1, ..., pk) the vector of nonnegative prices of the k goods available, and U(c) the satisfaction from the consumption allocation c, where
\[
U : \mathbb{R}_+^k \to \mathbb{R}_+
\]
is a utility function (jointly concave in all the variables and nondecreasing in each component), then the problem is the following:
\[
\begin{cases}
\max\ U(c)\\
\text{s.t. } c \ge 0,\quad \langle p, c\rangle \le x_0.
\end{cases}
\tag{1.1.1}
\]

The main ingredients of the problem are:

• the objective function to maximize U;

• the constraints to be fulfilled.

In one dimension the problem simply becomes
\[
\max_{c \in [0,\, x_0/p]} U(c), \tag{1.1.2}
\]
where U : R+ → R is a utility (i.e. concave and nondecreasing) function.
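To make the static problem concrete, here is a minimal numerical sketch of (1.1.1); the utility function, the prices and the budget below are illustrative choices, not taken from the text, and scipy's constrained optimizer stands in for an analytical solution.

```python
# Minimal numerical sketch of the static problem (1.1.1); the utility and the
# prices below are illustrative choices, not taken from the text.
import numpy as np
from scipy.optimize import minimize

p = np.array([1.0, 2.0, 4.0])     # nonnegative prices of the k = 3 goods
x0 = 10.0                         # initial amount of money

def U(c):                         # concave, nondecreasing in each component
    return np.sum(np.log1p(c))

# maximize U(c) subject to c >= 0 and <p, c> <= x0  (minimize -U instead)
res = minimize(lambda c: -U(c), np.full(3, 1.0),
               bounds=[(0.0, None)] * 3,
               constraints=[{"type": "ineq", "fun": lambda c: x0 - p @ c}])
print(res.x, -res.fun)            # optimal bundle and maximal utility
```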


Dynamic case

The dynamic formulation of (1.1.2) is the following. The consumer may spread the consumption over a certain period of time: this means that the consumption c becomes now a time dependent function, and so may the price p. The set of times where the decision of consumption may be taken is usually a subset T of the positive half line R+ = [0, +∞).

How to choose the subset T? Usually one sets the initial time t = 0 and chooses a time horizon T ∈ [0, +∞]: if T < +∞, we are considering a finite horizon problem; if T = ∞, we are considering an infinite horizon problem. Typically:

• T = [0,T] ∩ N, i.e. the discrete time case.

• T = [0,T] ∩ R+, i.e. the continuous time case.

In these notes we deal with the continuous time case, so T = [0,T] ∩ R+.

The consumption c is now a function representing the rate of consumption at time t ∈ T, and the consumer will aim at maximizing an intertemporal utility from consumption:
\[
\int_0^T e^{-\rho t}\, U(c(t))\, dt,
\]
where ρ > 0 is a discount factor, according to a usual economic assumption: the consumer is “less satisfied” by postponing consumption.

Considering now the constraints on the decision variable in the static case, let us try to see how they can be formulated in this dynamic case. The nonnegativity constraint c ≥ 0 naturally becomes
\[
c(t) \ge 0 \quad \forall t \in \mathcal{T}.
\]
The budget constraint ⟨p, c⟩ ≤ x0 is naturally rephrased as
\[
x(t) \ge 0 \quad \forall t \in \mathcal{T},
\]
where x(t) is now the money in the pocket at time t, i.e.
\[
x(t) := x_0 - \int_0^t p(s)\, c(s)\, ds, \quad \forall t \in \mathcal{T}.
\]
Notice that, if p(·) and c(·) are continuous, then x'(t) = −p(t)c(t) for every t ∈ T. So the problem can be written in a compact form as

\[
\max \int_0^T e^{-\rho t}\, U(c(t))\, dt,
\]
subject to the pointwise constraints
\[
c(t) \ge 0 \ \text{ and } \ x(t) \ge 0 \quad \forall t \in \mathcal{T},
\]
and to the differential constraint
\[
\begin{cases}
x'(t) = -p(t)\, c(t), & \forall t \in \mathcal{T},\\
x(0) = x_0.
\end{cases}
\]

Remark 1.1.1. For simplicity we shall deal with one dimensional problems, but we stress that everything can be suitably reformulated and studied for n-dimensional problems, replacing R with R^n as state space (see afterwards for the notion of state space).


1.2 Basic ODEs theory

We recall some basic facts of one dimensional Ordinary Differential Equations (ODEs).

Given f : T × R → R and (t0, x0) ∈ T × R, we consider the so-called Cauchy problem:
\[
\begin{cases}
x'(t) = f(t, x(t)),\\
x(t_0) = x_0.
\end{cases}
\tag{1.2.1}
\]

A (classical) solution to (1.2.1) is a differentiable function x : T → R satisfying the two requirements:

1. x(t0) = x0;

2. x'(t) = f(t, x(t)) for every t ∈ T.

The first basic theorem in ODEs theory is the following:

Theorem 1.2.1 (Cauchy–Lipschitz). Let f ∈ C(T × R; R) be Lipschitz continuous with respect to space uniformly in time, i.e., for some L > 0,
\[
|f(t, x) - f(t, x')| \le L\,|x - x'|, \quad \forall t \in \mathcal{T},\ \forall x, x' \in \mathbb{R},
\]
and let (t0, x0) ∈ T × R. Then the Cauchy problem (1.2.1) has a unique solution.

To deal with optimal control problems we need a refinement of the theorem, allowing for f which are less regular in time. Given U ⊆ R, we consider the following space:
\[
L^1_{loc}(\mathcal{T}; U) := \left\{ u : \mathcal{T} \to U \ :\ \int_0^{R \wedge T} |u(t)|\, dt < \infty \quad \forall R > 0 \right\}.
\]

Remark 1.2.2. Actually, in the real definition of the spaces above, the functions are not allowed to be arbitrarily irregular; they need to be measurable. We skip this issue to avoid technical complications. We only notice here that

• not all functions are measurable;

• the functions “we can imagine” are measurable.

Moreover, the spaces above are usually defined not as spaces of functions but, rather, as spaces of equivalence classes of functions.

Finally, in the case T < ∞, we just have
\[
L^1_{loc}(\mathcal{T}; U) = L^1(\mathcal{T}; U) := \left\{ u : \mathcal{T} \to U \ :\ \int_0^T |u(t)|\, dt < \infty \right\}.
\]

Letting t0 ∈ T, we can consider the integral function
\[
U(t) = \int_{t_0}^t u(s)\, ds, \quad t \in \mathcal{T},
\]
where t0 is a reference point within T. As u is not continuous a priori, the function U is not in general differentiable; it is only absolutely continuous. Absolutely continuous functions are, however, almost everywhere (a.e.)¹ differentiable. For functions belonging to L1loc(T; U) the Fundamental Theorem of Calculus holds in a.e. form, i.e.
\[
\exists\, U'(t) = u(t) \quad \text{for a.e. } t \in \mathcal{T}.
\]

Theorem 1.2.3 (Carathéodory). Let f : T × R → R be Lipschitz continuous with respect to space uniformly in time, i.e., for some L > 0,
\[
|f(t, x) - f(t, x')| \le L\,|x - x'|, \quad \forall t \in \mathcal{T},\ \forall x, x' \in \mathbb{R},
\]
and such that f(·, 0) ∈ L1loc(T; R). Let (t0, x0) ∈ T × R. Then the Cauchy problem (1.2.1) has a unique integral solution, in the sense that there exists a unique (absolutely continuous) function x : T → R such that
\[
x(t) = x_0 + \int_{t_0}^t f(s, x(s))\, ds, \quad \forall t \in \mathcal{T}.
\]

Remark 1.2.4. For the solution x of Theorem 1.2.3, it holds that
\[
x'(t) = f(t, x(t)) \quad \text{for a.e. } t \in \mathcal{T}.
\]
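As a small numerical illustration of the Carathéodory setting, the sketch below integrates a Cauchy problem whose dynamics are Lipschitz in the state but discontinuous in time; the specific dynamics and parameter values are made up for the example.

```python
import numpy as np
from scipy.integrate import solve_ivp

# f(t, x) is Lipschitz in x (constant L = 1) but discontinuous in t:
# the time coefficient jumps at t = 1, so only an integral (Caratheodory)
# solution is expected; x(.) is absolutely continuous, not C^1.
def f(t, x):
    a = 1.0 if t < 1.0 else -1.0
    return a * x

t0, x0, T = 0.0, 2.0, 3.0
sol = solve_ivp(f, (t0, T), [x0], max_step=1e-3, dense_output=True)

print(sol.y[0, -1])  # numerical value of x(T)
```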

1.3 Formulation of optimal control problems

Optimal control problems (OCPs) are a special kind of dynamic constrained optimization problems. A control system is a physical or economic system whose behavior is described by a state variable and which can be controlled by a controller through a control variable; the latter enters into the dynamics of the system, affecting the evolution of the state.

The state and the control variables are required to satisfy an ODE, the so-called state equation, which represents the differential constraint of the control system. Optimal control aims at optimizing the behavior of the system by maximizing or minimizing a given functional of the state and the control variables.

To be more precise, the key ingredients of a continuous time optimal control problem are the following.²

• A set C ⊆ R: it is the control set, the set where the control variable takes values.

• The control variable c :T → C and the state variable x : T → R.

• The state equation, an ODE stating the dynamics of the state variable x :T → R for each given c :T → C in the given set of admissible control strategies.

• A set of admissible control strategies, i.e. a subset of {c :T → C}.

• An objective functional depending on time and on the paths of c(·) and x(·) to be optimized over c(·) ranging in the set of admissible strategies.

In these notes we want to deal with a class of optimal control problems that includes some important economic examples. The main goals are:



• give a quite general formulation for a wide class of optimal control problems;

• give a brief outline of the Dynamic Programming method used to treat such problems;

• show how to apply this method to some economic examples.

¹ This concept can be rigorously formalized.

² In the following, we will denote by x(·), y(·), c(·), etc. real functions defined on T.

For some reasons that will be clear in the following, it is convenient to define the problem also when starting from a time t0 ∈ T greater than 0. To deal with it we also define the sets
\[
\mathcal{T}_{t_0} := \mathcal{T} \setminus [0, t_0), \quad t_0 \in \mathcal{T},
\]
and the space
\[
L^1_{loc}(\mathcal{T}_{t_0}; U) := \left\{ u : \mathcal{T}_{t_0} \to U \ :\ \int_{t_0}^{R \wedge T} |u(t)|\, dt < \infty \quad \forall R > t_0 \right\}.
\]

1.3.1 State equation

Let t0 ∈ T and let C ⊆ R be a control set. A control strategy starting at t0 is a function c(·) ∈ L1loc(Tt0; C). Given a control strategy c(·) starting at t0, the evolution of the state variable (state trajectory) in Tt0 is determined by a state equation (SE), an ODE with dynamics specified by a function
\[
f : \mathcal{T} \times \mathbb{R} \times C \to \mathbb{R}.
\]
We will consider the following standing assumptions:

(H1) ∃ L ≥ 0 such that |f(t, x, c) − f(t, x', c')| ≤ L(|x − x'| + |c − c'|), ∀t ∈ T, ∀x, x' ∈ R, ∀c, c' ∈ C;

(H2) f(·; x, c) ∈ L1loc(T; R) for some (hence, by (H1), for all) x ∈ R, c ∈ C.

Given c(·) ∈ L1loc(Tt0; C) and an initial state x0 ∈ R, we can use (H1)–(H2) and apply Theorem 1.2.3 to get the existence and uniqueness of an absolutely continuous function solving (in integral sense) the Cauchy problem of the ODE (state equation)
\[
\mathrm{(SE)} \qquad
\begin{cases}
x'(t) = f(t, x(t), c(t)), & t \in \mathcal{T}_{t_0},\\
x(t_0) = x_0.
\end{cases}
\]
The state equation is the core of what we call a control system or a controlled dynamical system. The unique solution to (SE) will be denoted by x(·; t0, x0, c(·)) or simply by x(·) when no confusion may arise.

1.3.2 Set of admissible control strategies

The state variable may be required to lie in some interval S ⊆ R. Then, the set of all admissible control strategies c(·) starting from the initial couple (t0, x0) ∈ T × R is accordingly defined as
\[
\mathcal{A}(t_0, x_0) = \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; C) \ :\ x(t; t_0, x_0, c(\cdot)) \in S \ \ \forall t \in \mathcal{T}_{t_0} \right\}.
\]


1.3.3 Objective functional

The objective of the problem is to maximize/minimize a given functional over the set A(t0, x0). We provide a class of functionals that are commonly used. A function
\[
g : \mathcal{T} \times S \times C \to \mathbb{R},
\]
and, in case T < ∞, another function
\[
\phi : S \to \mathbb{R},
\]
are given. They represent, respectively, the instantaneous performance index of the system and the payoff from the final state in the case T < ∞. Then we define the functional for the case T = ∞
\[
J(t_0, x_0; c(\cdot)) := \int_{t_0}^{+\infty} g(t, x(t), c(t))\, dt, \quad c(\cdot) \in \mathcal{A}(t_0, x_0),
\]
and, for the case T < ∞,
\[
J(t_0, x_0; c(\cdot)) := \int_{t_0}^{T} g(t, x(t), c(t))\, dt + \phi(x(T)), \quad c(\cdot) \in \mathcal{A}(t_0, x_0).
\]
Usually the time dependence of the function g in economic problems is of the form
\[
g(t, x, c) = e^{-\rho t} g_0(x, c),
\]
where ρ > 0 is a discount factor and g0 : S × C → R. The problem is then
\[
\mathrm{(P)} \qquad \text{Max/Min } J(t_0, x_0; c(\cdot)) \ \text{ over } \ c(\cdot) \in \mathcal{A}(t_0, x_0).
\]

Remark 1.3.1. We always consider maximization problems here. Recalling that, for a given function F,
\[
\max F = -\min(-F),
\]
we can treat minimization problems with the same ideas.

The concept of optimality is naturally the following.

Definition 1.3.2. A control strategy c(·) ∈ A (t0, x0) is called an optimal control strategy at the starting point (t0, y0) if

J¡t0, y0; c(·)¢ ≥ J (t0, y0; c(·)) ∀c(·) ∈ A (t0, y0) .

The corresponding state trajectory x (·; t0, y0; c(·)) is called an optimal state trajectory and will be often denoted simply byx(·). The state-control couple (x(·), c(·)) is called an optimal couple.

The value function of the problem (P) is the optimum of the problem:
\[
V(t_0, x_0) := \sup_{c(\cdot) \in \mathcal{A}(t_0, x_0)} J(t_0, x_0; c(\cdot)). \tag{1.3.1}
\]

Remark 1.3.3. We observe that the definition of optimal control strategy at (t0, x0) makes sense if we know that the value function is finite at that point. Of course, it can happen that V = +∞ or −∞ at some points. This is the case, for example, in many problems with infinite horizon arising in economic applications for some values of the parameters. In these cases one has to introduce a more general concept of optimality. We do not treat this case.


1.3.4 Feedback control strategies

The concept of feedback strategy plays a crucial role in optimal control theory. The idea of feedback is just the one of looking at the system at any time t ∈ Tt0, observing its current state x(t) and then choosing in real time the control strategy c(t) as a function of the state (and maybe of the time) at the same time:
\[
c(t) = G(t, x(t))
\]
for a suitable map G : T × R → C. A key point is that the form of G does not depend on the initial time and state (t0, x0): this is more or less obvious in the philosophy of “controlling in real time”. To be more precise we introduce the following concepts.

Definition 1.3.4. A function G : T × R → C is called an admissible feedback map for problem (P) if for any initial data (t0, x0) ∈ T × R the closed loop equation
\[
\begin{cases}
x'(t) = f(t, x(t), G(t, x(t))), & t \in \mathcal{T}_{t_0},\\
x(t_0) = x_0,
\end{cases}
\]
admits a unique solution, denoted by xG(·; t0, x0), and the corresponding feedback control strategy
\[
c_{(t_0, x_0, G)}(t) := G(t, x_G(t; t_0, x_0)), \quad t \in \mathcal{T}_{t_0},
\]
is admissible, i.e. it belongs to A(t0, x0).

An admissible control strategy for problem (P) is usually called an “open loop” control strategy. A feedback control strategy will be called a “closed loop” control strategy.

Definition 1.3.5. An admissible feedback map G is optimal for problem (P) if, for every initial data (t0, x0) ∈ T × R, the state-control couple (xG(·; t0, x0), c(t0,x0,G)(·)) is optimal in the sense of Definition 1.3.2.

Remark 1.3.6. Observe that, if we know an optimal feedback map G, we are able to optimally control the system in real time without knowing the real input map c(t0,x0,G)(·). In fact it is enough to know G and to put a “feedback device” that reads the state x(t) and gives the value c(t) = G(t, x(t)) at any time t. This is a common technique in many real systems (especially in engineering).

Remark 1.3.7. The two philosophies of open loop and closed loop control strategies are substantially different, mostly for their different use of the information. Looking for open loop strategies means that at the starting point we look at the problem, assuming to have perfect foresight of the future, and we choose once and for all the optimal strategy without changing it. On the other hand, looking for closed loop strategies means that we adjust our policy at any time, depending on our observation of the system. This is clearly a better policy in terms of the use of information, and the two methods are equivalent if we are in a deterministic world with perfect foresight.
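As an illustration of the closed loop philosophy, the sketch below plugs a feedback map G directly into the dynamics, so that the control is recomputed from the observed state at every time; both G and the dynamics are illustrative choices, not taken from the text.

```python
# Minimal sketch of a closed loop simulation: the control is G(t, x(t)),
# recomputed from the current state at every time.  G and f are illustrative.
import numpy as np
from scipy.integrate import solve_ivp

def G(t, x):                     # a feedback map T x R -> C (here C = R_+)
    return max(0.0, 0.3 * x)

def f(t, x, c):                  # same structure as the state equation (SE)
    return 0.1 * x - c

t0, x0, T = 0.0, 2.0, 20.0
sol = solve_ivp(lambda t, x: f(t, x[0], G(t, x[0])), (t0, T), [x0], max_step=0.01)
print(sol.y[0, -1])              # state reached under the feedback strategy
```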


1.4 Examples

1.4.1 Example 1: Cake’s problem

1.4.2 Example 2: Optimal consumption in AK growth models

This is a classical growth model in Economic Theory. Consider an economy represented by an aggregate variable k(t) representing capital at time t ∈ T and denote by c(t) the consumption rate at time t ∈ T. We consider as state equation, stating the evolution of the economy,
\[
\begin{cases}
k'(t) = A\, k(t) - c(t), & t \in \mathcal{T}_{t_0},\\
k(t_0) = k_0,
\end{cases}
\tag{1.4.1}
\]
where A = Ã − δ, with Ã > 0 being a parameter representing the technological level of the economy and δ > 0 being the depreciation rate of capital. The natural control set to consider in this context is C = R+, which corresponds to requiring a nonnegative consumption rate. Moreover, it is natural to assume that the capital cannot become negative, so we require k(t) ∈ R+ (so S = R+ in the notation of the previous subsection) at any t ∈ Tt0. Hence the set of admissible strategies starting from (t0, k0) ∈ T × S is
\[
\mathcal{A}(t_0, k_0) := \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}_+) \ :\ k(t; t_0, k_0, c(\cdot)) \in \mathbb{R}_+ \ \ \forall t \in \mathcal{T}_{t_0} \right\}.
\]

The problem is
\[
\max_{c(\cdot) \in \mathcal{A}(t_0, k_0)} J(t_0, k_0; c(\cdot)),
\]
where
\[
J(t_0, k_0; c(\cdot)) = \int_{t_0}^{T} e^{-\rho t}\, u(c(t))\, dt + e^{-\rho T} \phi(k(T)), \quad \text{if } T < +\infty,
\]
\[
J(t_0, k_0; c(\cdot)) = \int_{t_0}^{+\infty} e^{-\rho t}\, u(c(t))\, dt, \quad \text{if } T = \infty,
\]

where ρ > 0 is the discount rate of the agent, u : R+ → R+ is the instantaneous utility from consumption, and φ : R+ → R+ is, possibly, the utility from the remaining capital stock. The functions u and φ are usually chosen strictly increasing, concave and twice differentiable. A standard choice of u is the so-called C.E.S. (Constant Elasticity of Substitution) utility function, which is given by
\[
u_\sigma(c) = \frac{c^{1-\sigma} - 1}{1-\sigma} \quad \text{if } \sigma > 0,\ \sigma \ne 1; \qquad u_1(c) = \log c \quad \text{if } \sigma = 1.
\]
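For concreteness, the following sketch simulates the state equation (1.4.1) under an arbitrary (not optimal) consumption rule proportional to capital and evaluates the discounted CES utility; all parameter values are illustrative choices.

```python
# Minimal sketch: simulate k'(t) = A k(t) - c(t) under c(t) = s*k(t) (an arbitrary
# admissible rule) and evaluate the discounted CES utility.  Parameters are illustrative.
import numpy as np
from scipy.integrate import quad

A, rho, sigma, k0, s = 0.04, 0.06, 0.8, 1.0, 0.03   # s = consumption/capital ratio

# With c(t) = s*k(t) the state equation is linear: k(t) = k0*exp((A - s)*t).
k = lambda t: k0 * np.exp((A - s) * t)
c = lambda t: s * k(t)
u = lambda z: (z**(1 - sigma) - 1) / (1 - sigma)     # CES utility, sigma != 1

J, _ = quad(lambda t: np.exp(-rho * t) * u(c(t)), 0, np.inf)
print(J)   # value of this particular (non-optimized) consumption strategy
```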

1.4.3 Example 3: Optimal investment

This is a classical optimal investment problem with adjustment costs.

Consider a firm that at time t ∈ T produces goods using a certain amount of capital stock k(t) (i.e. the machines used for production, or the cattle) and may invest at a rate i(t) at time t to increase its capital. A simple model (state equation) for the evolution of k(·) is
\[
\begin{cases}
k'(t) = -\delta k(t) + i(t), & t \in \mathcal{T}_{t_0},\\
k(t_0) = k_0,
\end{cases}
\tag{1.4.2}
\]
where δ > 0 is the depreciation rate of capital (the machines become older and can break down).

The firm has to choose the investment strategy respecting some constraints. For example, one could assume i(·) ∈ L1loc(Tt0; R), so C = R (allowing for negative investment, i.e. disinvestment), and then impose no state constraint, or the state constraint k(t; t0, k0, i(·)) ∈ R+ for every t ∈ Tt0 (so S = R+); otherwise, one could assume irreversibility of the investment, imposing i(·) ∈ L1loc(Tt0; R+), so C = [0, ∞): in this case it is automatically k(t; t0, k0, i(·)) ∈ R+ for every t ∈ Tt0 as soon as k0 ∈ R+. That is, we may have
\[
\mathcal{A}(t_0, k_0) := \left\{ i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}) \right\},
\]
or
\[
\mathcal{A}(t_0, k_0) := \left\{ i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}) \ :\ k(t; t_0, k_0, i(\cdot)) \in \mathbb{R}_+ \ \ \forall t \in \mathcal{T}_{t_0} \right\},
\]
or
\[
\mathcal{A}(t_0, k_0) := L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}_+)
\]
(in this last case, restricting to initial data k0 ∈ R+, otherwise A(t0, k0) is empty).

We model the behavior of the firm assuming that it wants to maximize the discounted intertemporal profit J(t0, k0; i(·)):
\[
\max_{i(\cdot) \in \mathcal{A}(t_0, k_0)} J(t_0, k_0; i(\cdot)),
\]
where
\[
J(t_0, k_0; i(\cdot)) := \int_{t_0}^{T} e^{-\rho t} g(k(t), i(t))\, dt + e^{-\rho T} \phi(k(T)), \quad \text{if } T < +\infty,
\]
\[
J(t_0, k_0; i(\cdot)) := \int_{t_0}^{+\infty} e^{-\rho t} g(k(t), i(t))\, dt, \quad \text{if } T = +\infty,
\]
where ρ > 0 is an interest rate, g(k, i) gives the instantaneous profit rate for the given levels of capital stock k and investment rate i, and φ(k) gives the profit for keeping a quantity of capital k at the end of the period (e.g. its market value). The function g : R+ × C → R might have an additive form
\[
g(k, i) = f_1(k) - f_2(i),
\]
where f1 : R+ → R is strictly increasing and concave and f2 : C → R is strictly convex and superlinear, i.e.
\[
\lim_{|i| \to +\infty} \frac{f_2(i)}{|i|} = +\infty
\]
(e.g. f1(k) = αk, f2(i) = βi + γi²). Similarly, φ is usually concave and strictly increasing (e.g. φ(k) = δk).
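A minimal sketch of this model follows: it simulates the capital dynamics (1.4.2) under a constant investment rate and evaluates the discounted profit for the additive form g(k, i) = f1(k) − f2(i); all parameters and the constant strategy are illustrative choices.

```python
# Minimal sketch (illustrative parameters and a constant investment rate) of the
# state equation (1.4.2) and the discounted profit with g(k, i) = alpha*k - (beta*i + gamma*i**2).
import numpy as np
from scipy.integrate import quad

delta, rho = 0.1, 0.05
alpha, beta, gamma = 1.0, 0.2, 0.5
k0, i_const = 1.0, 0.3                      # initial capital and a fixed investment rate

# k'(t) = -delta*k + i_const, k(0) = k0  has the explicit solution below.
k = lambda t: i_const / delta + np.exp(-delta * t) * (k0 - i_const / delta)

g = lambda t: alpha * k(t) - (beta * i_const + gamma * i_const**2)
J, _ = quad(lambda t: np.exp(-rho * t) * g(t), 0, np.inf)
print(J)   # discounted intertemporal profit of this (non-optimized) strategy
```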


Chapter 2

Dynamic Programming method

The starting point of the Dynamic Programming (DP) method is the idea of embedding a given optimal control problem (OCP) into a family of OCPs indexed by the initial data (t0, y0). This means that we keep the horizon T fixed and we let the data (t0, y0) vary, trying to establish a relation among such problems.

The core of such a relation is somehow hidden in the following sentence:

“The remaining part of an optimal trajectory is still optimal.”¹

The main idea of DP is the following. First, state precisely the relationship between problems with different data (Bellman's Optimality Principle). Then use this relationship (possibly in a modified form: this happens especially in the continuous time case, where the infinitesimal form is studied) to get information about optimal control strategies. The key tool to find this relationship is the value function of the problem, see (1.3.1).

Before passing to precise statements of theorems, we give an outline of the main ideas of (classical, i.e. based on the concept of classical solution of PDEs) Dynamic Programming (DP). The list is purely a rough indication.

1. Define the value function of the problem as in (1.3.1): this is a function of the initial time and the initial state (t, x) = (t0, x0).

2. Find a functional equation for V, the so-called Dynamic Programming Principle (DPP) or Bellman's principle, which is satisfied by V (Theorem 2.1.1).

3. Pass to the limit in the DPP to get its differential form: a PDE called the Hamilton-Jacobi-Bellman (HJB) equation.

4. Find a solution v, if possible, of the Bellman equation and prove that such a solution is the value function V through the so-called Verification Theorem (Theorem 2.3.2).

5. As a byproduct of step 4, find a feedback (candidate) optimal map and then, via the closed loop equation, the optimal couples (still Theorem 2.3.2).

¹ This was the first formulation (in the 1950s) of the celebrated Bellman's Optimality Principle.


The notation of this chapter will be the one used in Subsection 1.3, with the following changes. From now on the initial data (t0, x0) will be replaced by (t, x) for simplicity of notation. The running time index will be s in place of t and, in order to avoid confusion with the notation x for the state trajectory, we will rename the state variable x(·) as y(·). The assumptions (H1)–(H2) on f will be standing throughout the whole chapter.

The value function of the problem is defined as
\[
V(t, x) := \sup_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)),
\]
where, given an interval S ⊆ R, we set
\[
\mathcal{A}(t, x) = \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_t; C) \ :\ y(s; t, x, c(\cdot)) \in S \ \ \forall s \in \mathcal{T}_t \right\}.
\]

In the following, to simplify the treatment, we assume that:

• S is an open interval;

• the objects
\[
\int_t^{t'} g(s, y(s; t, x, c(\cdot)), c(s))\, ds, \qquad J(t, x; c(\cdot)), \qquad V(t, x)
\]
are well defined and finite for all (t, x) ∈ T × S, t' ∈ Tt, and c(·) ∈ A(t, x).

The first assumption avoids considering the cases when the state trajectory may touch the boundary of the state set (on the boundary the HJB equation is not defined).

The second assumption has to be checked case by case when dealing with the specific problems.

2.1 Dynamic Programming Principle

We start with Bellman's Optimality Principle.

Theorem 2.1.1 (Bellman's Optimality Principle). For every (t, x) ∈ T × S and t' ∈ Tt we have
\[
V(t, x) = \sup_{c(\cdot) \in \mathcal{A}(t, x)} \left\{ \int_t^{t'} g(s, y(s; t, x, c(\cdot)), c(s))\, ds + V\bigl(t', y(t'; t, x, c(\cdot))\bigr) \right\}. \tag{2.1.1}
\]

Remark 2.1.2. The proof of the above result is based on the following properties of admissible controls:

1. For every 0 ≤ t ≤ t' < T and x ∈ S,
\[
c(\cdot) \in \mathcal{A}(t, x) \ \Longrightarrow\ c(\cdot)|_{\mathcal{T}_{t'}} \in \mathcal{A}\bigl(t', y(t'; t, x, c(\cdot))\bigr)
\]
(i.e. the second part of an admissible strategy is admissible).

2. For every 0 ≤ t ≤ t' < T and x ∈ S,
\[
c_1(\cdot) \in \mathcal{A}(t, x),\ \ c_2(\cdot) \in \mathcal{A}\bigl(t', y(t'; t, x, c_1(\cdot))\bigr) \ \Longrightarrow\ c(\cdot) \in \mathcal{A}(t, x),
\]
where
\[
c(s) :=
\begin{cases}
c_1(s), & \text{if } s \in [t, t'),\\
c_2(s), & \text{if } s \in \mathcal{T}_{t'},
\end{cases}
\]
(i.e. the concatenation of two admissible strategies is admissible).


2.2 HJB equation

Equation (2.1.1) is a functional equation satisfied by the value function. This is an alternative representation of V that can be useful to determine its properties or even to calculate it. Of course the functional form of (2.1.1) is not easy to handle. It is then convenient to find a differential form of it, i.e. the so-called Hamilton-Jacobi-Bellman (HJB) equation.

Theorem 2.2.1. Assume that g is uniformly continuous. Assume moreover that V ∈ C¹([0, T) × S). Then V is a classical² solution of the following Partial Differential Equation (PDE):
\[
-v_t(t, x) = H_{\max}(t, x, v_x(t, x)), \quad (t, x) \in [0, T) \times S, \tag{2.2.1}
\]
where the function Hmax : [0, T) × S × R → R (the “maximum value Hamiltonian” or, simply, the “Hamiltonian”) is given by
\[
H_{\max}(t, x, p) := \sup_{c \in C} H_{cv}(t, x, p; c), \tag{2.2.2}
\]
where Hcv (the “current value Hamiltonian”) is
\[
H_{cv}(t, x, p; c) := f(t, x, c)\, p + g(t, x, c). \tag{2.2.3}
\]

² In the sense that all derivatives exist and that the equation is satisfied for every (t, x) ∈ [0, T) × S.

Remark 2.2.2. Equation (2.2.1) usually bears the names of Hamilton and Jacobi because such kinds of PDEs were first studied by them in connection with the calculus of variations and classical mechanics. Bellman was the first to discover its relationship with control problems. We will call it the Hamilton-Jacobi-Bellman (HJB) equation.

Remark 2.2.3. The function Hmax(t, x, p) is usually called (in the mathematics literature) the Hamiltonian of the problem. However, in many cases the Hamiltonian is defined differently. In particular, in the economic literature the name Hamiltonian (or “current value Hamiltonian”, while the other is the “maximum value Hamiltonian”) is often used for the function to be maximized in (2.2.2). To avoid misunderstandings we will then use the notation
\[
H_{cv}(t, x, p; c) := f(t, x, c)\, p + g(t, x, c)
\]
for the current value Hamiltonian and
\[
H_{\max}(t, x, p) := \sup_{c \in C} H_{cv}(t, x, p; c)
\]
for the maximum value Hamiltonian.
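As a concrete illustration of definitions (2.2.2)–(2.2.3), the sketch below evaluates Hcv pointwise and approximates Hmax by a grid search over a compact control set; the dynamics, running objective and control set are illustrative choices, not taken from the text.

```python
# Minimal sketch: current value Hamiltonian H_cv and its maximization over C,
# approximated on a grid.  f, g and C = [0, 1] are illustrative choices.
import numpy as np

C_grid = np.linspace(0.0, 1.0, 1001)        # control set C = [0, 1], discretized

def f(t, x, c):                              # dynamics (illustrative)
    return -0.5 * x + c

def g(t, x, c):                              # running objective (illustrative)
    return np.log(1.0 + x) - 0.5 * c**2

def H_cv(t, x, p, c):
    return f(t, x, c) * p + g(t, x, c)       # definition (2.2.3)

def H_max(t, x, p):
    vals = H_cv(t, x, p, C_grid)             # definition (2.2.2): sup over c in C
    return vals.max(), C_grid[vals.argmax()] # value and (approximate) argmax

print(H_max(0.0, 1.0, 0.8))                  # e.g. at (t, x, p) = (0, 1, 0.8)
```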

Remark 2.2.4. The key issue of the above result is to give an alternative characterization of the value function in terms of the PDE (2.2.1). In fact this gives a very powerful tool to study properties of V and to calculate it by numerical analysis (at least in low dimension). Knowing V one can get important information on the optimal state-control trajectories, as we will see below. However, to get a real characterization one needs a much more powerful result: here we assumed V ∈ C¹([0, T) × S) and we did not get uniqueness. A “good” result should state that the value function V is the unique solution of (2.2.1) under general hypotheses on the data. Such a result was a difficult problem for many years because the usual definitions of classical or generalized solution did not adapt to PDEs of HJB type. The problem was solved in the 1980s with the introduction of the concept of viscosity solution by Crandall and Lions. With this concept it is possible to state that the value function V is the unique viscosity solution of (2.2.1) under very weak assumptions on the data.

2.3 Verification Theorem and optimal feedbacks

The HJB equation is of crucial importance for solving the optimal control problem (P).

Before giving the main result we prove a fundamental identity in the next lemma.

Lemma 2.3.1 (Fundamental identity). Let v ∈ C(T × S) ∩ C¹([0, T) × S) be a classical solution of (2.2.1) and assume that g is continuous. Then the following fundamental identity holds: for every (t, x) ∈ T × S, for every c(·) ∈ A(t, x) and every t' ∈ Tt, setting y(s) := y(s; t, x, c(·)), we have
\[
\begin{aligned}
v(t, x) - v(t', y(t')) ={}& \int_t^{t'} g(s, y(s), c(s))\, ds \\
&+ \int_t^{t'} \bigl[ H_{\max}(s, y(s), v_x(s, y(s))) - H_{cv}(s, y(s), v_x(s, y(s)); c(s)) \bigr]\, ds.
\end{aligned}
\tag{2.3.1}
\]

Proof. For all s ∈ [t, t'), using that v is a classical solution of (2.2.1), we calculate
\[
\begin{aligned}
\frac{d}{ds} v(s, y(s)) &= v_t(s, y(s)) + v_x(s, y(s)) \cdot y'(s) \\
&= -H_{\max}(s, y(s), v_x(s, y(s))) + f(s, y(s), c(s))\, v_x(s, y(s)) \\
&= -H_{\max}(s, y(s), v_x(s, y(s))) + f(s, y(s), c(s))\, v_x(s, y(s)) + g(s, y(s), c(s)) - g(s, y(s), c(s)) \\
&= -H_{\max}(s, y(s), v_x(s, y(s))) + H_{cv}(s, y(s), v_x(s, y(s)); c(s)) - g(s, y(s), c(s)).
\end{aligned}
\]
In the case T = ∞ we just integrate the above identity on [t, t'], use the continuity of v, and then get (2.3.1). In the case T < ∞ we can do the same passage and get (2.3.1) for t' < T. To get (2.3.1) also for t' = T, one uses it for t'_ε = T − ε and passes to the limit as ε ↓ 0, using the continuity of v and g up to T and the fact that Hmax ≥ Hcv. □

Theorem 2.3.2 (Verification Theorem – Finite horizon case). Let T < ∞ and (t, x) ∈ T × S.

Assume that v ∈ C(T × S) ∩ C¹([0, T) × S) is a classical solution of (2.2.1) satisfying the terminal condition
\[
v(T, \cdot) = \phi(\cdot), \tag{2.3.2}
\]
and that g is continuous. Then we have the following claims.

1. v(t, x) ≥ V(t, x).


2. Assume that c*(·) ∈ A(t, x) is such that, denoting y*(·) := y(·; t, x, c*(·)), we have³
\[
H_{\max}\bigl(s, y^*(s), v_x(s, y^*(s))\bigr) - H_{cv}\bigl(s, y^*(s), v_x(s, y^*(s)); c^*(s)\bigr) = 0, \quad \forall s \in [t, T],
\]
i.e.
\[
c^*(s) \in \operatorname*{arg\,max}_{c \in C} H_{cv}\bigl(s, y^*(s), v_x(s, y^*(s)); c\bigr), \quad \forall s \in [t, T]. \tag{2.3.3}
\]
Then c*(·) is an optimal control strategy starting at (t, x) and v(t, x) = V(t, x).

3. If we know from the beginning that v(t, x) = V(t, x), then every optimal strategy starting at (t, x) satisfies (2.3.3).

4. Assume that:

• for every (s, z) ∈ [t, T] × S the map
\[
C \to \mathbb{R}, \quad c \mapsto H_{cv}(s, z, v_x(s, z); c)
\]
admits a unique maximum point G(s, z) ∈ C;

• the closed loop equation
\[
\begin{cases}
y'(s) = f(s, y(s), G(s, y(s))),\\
y(t) = x,
\end{cases}
\]
has a solution yG(·);

• the feedback control strategy
\[
c_G(s) := G(s, y_G(s)), \quad s \in [t, T],
\]
belongs to A(t, x).

Then v(t, x) = V(t, x) and (yG(·), cG(·)) is an optimal couple starting at (t, x).

³ Actually, it is sufficient to require the equality up to a null measure set.

Proof. Applying Lemma 2.3.1 with t' = T and the terminal condition (2.3.2), we get
\[
v(t, x) - \phi(y(T)) = \int_t^T g(s, y(s), c(s))\, ds + \int_t^T \bigl[ H_{\max}(s, y(s), v_x(s, y(s))) - H_{cv}(s, y(s), v_x(s, y(s)); c(s)) \bigr]\, ds,
\]
i.e.
\[
v(t, x) = J(t, x; c(\cdot)) + \int_t^T \bigl[ H_{\max}(s, y(s), v_x(s, y(s))) - H_{cv}(s, y(s), v_x(s, y(s)); c(s)) \bigr]\, ds. \tag{2.3.4}
\]
1. As (2.3.4) holds for every c(·) and Hmax(·) ≥ Hcv(·; c) for every c ∈ C, this shows v ≥ V.

2. Considering the first item, for such a c*(·) ∈ A(t, x) we get
\[
V(t, x) \ge J(t, x; c^*(\cdot)) = v(t, x) \ge V(t, x),
\]
and the claim follows.

3. If c*(·) ∈ A(t, x) is optimal starting at (t, x) and v(t, x) = V(t, x), from (2.3.4) we get
\[
V(t, x) = J(t, x; c^*(\cdot)) + \int_t^T \bigl[ H_{\max}(s, y^*(s), v_x(s, y^*(s))) - H_{cv}(s, y^*(s), v_x(s, y^*(s)); c^*(s)) \bigr]\, ds.
\]
Since the integrand in the equality above is nonnegative and J(t, x; c*(·)) = V(t, x), we conclude that the integrand must be null⁴.

4. A control strategy constructed in such a way automatically satisfies the assumptions of the second item, and we conclude. □

⁴ Actually, up to a null measure set.

Remark 2.3.3. In Lemma 2.3.1 and in Theorem 2.3.2 the function v is not necessarily the value function. Of course, if we know (for example from Theorem 2.2.1) that the value function V is a classical solution of equation (2.2.1), it is natural to choose v = V.

Remark 2.3.4. The above results can be applied only in a few cases, although interesting ones. Indeed, the HJB equation (2.2.1) does not in general admit a classical solution.

Remark 2.3.5. Note that the above results do not need uniqueness of solutions of (2.2.1) and that such uniqueness can be obtained as a consequence of Theorem 2.3.2.

2.3.1 Autonomous problems with infinite horizon

When dealing with autonomous (time-homogeneous) infinite horizon problems with discount factor, the HJB equation can be simplified. In this section we illustrate this fact.

Consider the problem (P̄) of maximizing the functional
\[
J(t, x; c(\cdot)) = \int_t^{+\infty} e^{-\rho s}\, g_0(y(s), c(s))\, ds,
\]
with ρ > 0, where y(·) := y(·; t, x, c(·)) is the solution of the state equation
\[
\begin{cases}
y'(s) = f_0(y(s), c(s)), & s \in [t, +\infty),\\
y(t) = x \in S,
\end{cases}
\]
and
\[
c(\cdot) \in \mathcal{A}(t, x) := \left\{ c \in L^1_{loc}([t, +\infty); C) \ :\ y(s; t, x, c) \in S \ \ \forall s \ge t \right\}.
\]
Problem (P̄) is nothing else than problem (P) with infinite horizon, time-homogeneous dynamics f(s, y, c) = f0(y, c) and current profit g(s, y, c) = e−ρs g0(y, c).

The value function satisfies
\[
\begin{aligned}
V(t, x) &= \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_t^{+\infty} e^{-\rho s}\, g_0(y(s), c(s))\, ds \\
&= e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_t^{+\infty} e^{-\rho (s-t)}\, g_0(y(s), c(s))\, ds \\
&= e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_0^{+\infty} e^{-\rho \tau}\, g_0(y(t+\tau), c(t+\tau))\, d\tau.
\end{aligned}
\]


Now, f0 being autonomous, we have
\[
y(s + t; t, x, c(\cdot)) = y(s; 0, x, c(t + \cdot)), \quad \forall c(\cdot) \in L^1_{loc}(\mathcal{T}_t; C),
\]
so also, in particular,
\[
c(\cdot) \in \mathcal{A}(t, x) \iff c(t + \cdot) \in \mathcal{A}(0, x).
\]
With this observation we get
\[
V(t, x) = e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(0, x)} \int_0^{+\infty} e^{-\rho s}\, g_0(y(s; 0, x, c(\cdot)), c(s))\, ds = e^{-\rho t}\, V(0, x). \tag{2.3.5}
\]

Let us write the HJB equation. The current value Hamiltonian is
\[
H_{cv}(t, x, p; c) = f_0(x, c)\, p + e^{-\rho t} g_0(x, c) = e^{-\rho t}\bigl[ f_0(x, c)\, p\, e^{\rho t} + g_0(x, c) \bigr].
\]
Setting
\[
H^0_{cv}(x, \tilde p; c) := f_0(x, c)\, \tilde p + g_0(x, c),
\]
it becomes
\[
H_{cv}(t, x, p; c) = e^{-\rho t}\, H^0_{cv}\bigl(x, e^{\rho t} p; c\bigr).
\]
The maximum value Hamiltonian is
\[
H_{\max}(t, x, p) = \sup_{c \in C} H_{cv}(t, x, p; c) = e^{-\rho t} \sup_{c \in C} H^0_{cv}\bigl(x, e^{\rho t} p; c\bigr).
\]
Then, inspired by (2.3.5), we reduce the HJB equation by restricting to solutions of the form
\[
v(t, x) = e^{-\rho t}\, v_0(x).
\]
Plugging this structure into (2.2.1), it becomes
\[
\rho e^{-\rho t}\, v_0(x) = e^{-\rho t}\, H^0_{\max}\bigl(x, e^{\rho t}\bigl(e^{-\rho t} v_0'(x)\bigr)\bigr),
\]
where
\[
H^0_{\max}(x, \tilde p) = \sup_{c \in C} H^0_{cv}(x, \tilde p; c).
\]
Hence, we get the equation for v0 formally associated to V0:
\[
\rho\, v_0(x) = H^0_{\max}\bigl(x, v_0'(x)\bigr). \tag{2.3.6}
\]
In studying the problem (P̄) it is convenient to study (2.3.6) instead of (2.2.1), since in this case (2.3.6) is just an ODE and V0 is just a function of one variable.

Due to time-stationarity, to implement the DP method we can consider the problem only for t = 0. Set

A0(x) := A (0, x); J0(x; c(·)) = J(0, x; c(·)); V0(x) = V (0, x).

Let us see what Lemma 2.3.1 provides in this case.


Theorem 2.3.6 (Verification Theorem – Infinite horizon case). Let T = ∞ and x ∈ S. Assume that v0 ∈ C¹(S) is a classical solution of (2.3.6) such that, setting y(s) := y(s; 0, x, c(·)), it satisfies the transversality condition
\[
\lim_{t' \to \infty} e^{-\rho t'}\, v_0(y(t')) = 0, \quad \forall c(\cdot) \in \mathcal{A}_0(x). \tag{2.3.7}
\]
Finally, assume that g is continuous. Then we have the following claims.

1. v0(x) ≥ V0(x).

2. Assume that c*(·) ∈ A0(x) is such that, denoting y*(·) := y(·; 0, x, c*(·)), we have⁵
\[
H^0_{\max}\bigl(y^*(s), v_0'(y^*(s))\bigr) - H^0_{cv}\bigl(y^*(s), v_0'(y^*(s)); c^*(s)\bigr) = 0, \quad \forall s \in \mathbb{R}_+,
\]
i.e.
\[
c^*(s) \in \operatorname*{arg\,max}_{c \in C} H^0_{cv}\bigl(y^*(s), v_0'(y^*(s)); c\bigr), \quad \forall s \in \mathbb{R}_+. \tag{2.3.8}
\]
Then c*(·) is an optimal control strategy starting at (0, x) and v0(x) = V0(x).

3. If we know from the beginning that v0(x) = V0(x), then every optimal strategy starting at (0, x) satisfies (2.3.8).

4. Assume that:

• for every z ∈ S the map
\[
C \to \mathbb{R}, \quad c \mapsto H^0_{cv}\bigl(z, v_0'(z); c\bigr)
\]
admits a unique maximum point G(z) ∈ C;

• the closed loop equation
\[
\begin{cases}
y'(s) = f_0(y(s), G(y(s))),\\
y(0) = x,
\end{cases}
\]
has a solution yG(·);

• the feedback control strategy
\[
c_G(s) := G(y_G(s)), \quad s \in \mathbb{R}_+,
\]
belongs to A0(x).

Then v0(x) = V0(x) and (yG(·), cG(·)) is an optimal couple starting at (0, x).

⁵ Again, it is sufficient to require the equality up to a null measure set.

Proof. Considering (2.3.1) with t = 0, passing to the limit as t' → ∞, and using the transversality condition (2.3.7), we get, setting y(s) := y(s; 0, x, c(·)),
\[
v_0(x) = J_0(x; c(\cdot)) + \int_0^{+\infty} e^{-\rho s} \bigl[ H^0_{\max}\bigl(y(s), v_0'(y(s))\bigr) - H^0_{cv}\bigl(y(s), v_0'(y(s)); c(s)\bigr) \bigr]\, ds.
\]
Then the proof follows the arguments of the proof of Theorem 2.3.2. □


Chapter 3

Application to economic problems

Now we want to apply the DP method to our examples. First we give an outline of the main steps of the DP method in the simplest cases (i.e. when the main assumptions of the above results are verified). We will try to do the following steps.

1. Calculate the Hamiltonians Hcv and Hmax together with arg max Hcv.

2. Write the HJB equation and find an explicit classical solution v.

3. Calculate the feedback map G, then solve the closed loop equation, finding the optimal state-control couple.

Remark 3.0.1. Of course in general it is impossible to perform such steps (notably, find an explicit solution to HJB).

3.1 AK model: finite horizon

The state equation is
\[
\begin{cases}
y'(s) = A\, y(s) - c(s), & s \in [t, T],\\
y(t) = x > 0,
\end{cases}
\]
and the problem is
\[
\max_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)),
\]
where
\[
\mathcal{A}(t, x) = \left\{ c(\cdot) \in L^1([t, T]; \mathbb{R}_+) \ :\ y(s; t, x, c(\cdot)) > 0 \ \ \forall s \in [t, T] \right\},
\]
and¹
\[
J(t, x; c(\cdot)) = \int_t^T \frac{c(s)^{1-\sigma}}{1-\sigma}\, ds + \alpha\, \frac{y(T)^{1-\sigma}}{1-\sigma},
\]
where α ≥ 0, σ > 0.

The value function of the problem is
\[
V(t, x) = \sup_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)).
\]
We treat the case σ ≠ 1.

¹ We adopt the convention that, if σ = 1, we read z^{1−σ}/(1−σ) := log z.


Hamiltonians

The current value Hamiltonian does not depend on t and is given by
\[
H_{cv}(t, x, p; c) = Axp - cp + \frac{c^{1-\sigma}}{1-\sigma}, \quad x > 0,\ p \in \mathbb{R},\ c \in \mathbb{R}_+.
\]
If p > 0, then the maximum point of Hcv(t, x, p; ·) is attained at
\[
c(t, x, p) := \operatorname*{arg\,max}_{c \ge 0} H_{cv}(t, x, p; c) = p^{-1/\sigma},
\]
so
\[
H_{\max}(t, x, p) = Axp + \frac{\sigma}{1-\sigma}\, p^{\frac{\sigma-1}{\sigma}}.
\]

HJB equation: classical solution

The HJB equation associated to our problem is
\[
\begin{cases}
-v_t(t, x) = H_{\max}(t, x, v_x(t, x)), & \forall (t, x) \in [0, T) \times (0, +\infty),\\
v(T, x) = \alpha \frac{x^{1-\sigma}}{1-\sigma}.
\end{cases}
\tag{3.1.1}
\]
We look for a solution to the HJB equation in the form
\[
v(t, x) = a(t)\, \frac{x^{1-\sigma}}{1-\sigma}, \tag{3.1.2}
\]
with a(·) > 0. In this case vx > 0 and (3.1.1) becomes
\[
\begin{cases}
-v_t(t, x) = Ax\, v_x(t, x) + \frac{\sigma}{1-\sigma}\, v_x(t, x)^{\frac{\sigma-1}{\sigma}}, & \forall (t, x) \in [0, T) \times (0, +\infty),\\
v(T, x) = \alpha \frac{x^{1-\sigma}}{1-\sigma}.
\end{cases}
\tag{3.1.3}
\]
Let us plug (3.1.2) into (3.1.3). We get
\[
\begin{cases}
-a'(t)\, \frac{x^{1-\sigma}}{1-\sigma} = Ax\, a(t)\, x^{-\sigma} + \frac{\sigma}{1-\sigma}\, a(t)^{\frac{\sigma-1}{\sigma}}\, \bigl(x^{-\sigma}\bigr)^{\frac{\sigma-1}{\sigma}},\\
a(T)\, \frac{x^{1-\sigma}}{1-\sigma} = \alpha\, \frac{x^{1-\sigma}}{1-\sigma},
\end{cases}
\]
i.e.
\[
\begin{cases}
-a'(t)\, \frac{x^{1-\sigma}}{1-\sigma} = A\, a(t)\, x^{1-\sigma} + \frac{\sigma}{1-\sigma}\, a(t)^{\frac{\sigma-1}{\sigma}}\, x^{1-\sigma},\\
a(T)\, \frac{x^{1-\sigma}}{1-\sigma} = \alpha\, \frac{x^{1-\sigma}}{1-\sigma}.
\end{cases}
\]
Dividing by x^{1−σ} and multiplying by 1 − σ we get the ODE
\[
\begin{cases}
-a'(t) = A(1-\sigma)\, a(t) + \sigma\, a(t)^{\frac{\sigma-1}{\sigma}},\\
a(T) = \alpha.
\end{cases}
\tag{3.1.4}
\]
Let us look for a solution of such a terminal value problem. Consider b(t) := (a(T − t))^{1/σ}.


We have
\[
b'(t) = -\frac{1}{\sigma}\,(a(T - t))^{\frac{1}{\sigma}-1}\, a'(T - t), \quad t \in [0, T].
\]
Hence, in terms of this function the ODE above rewrites as
\[
\begin{cases}
b'(t) = A\,\frac{1-\sigma}{\sigma}\, b(t) + 1,\\
b(0) = \alpha^{1/\sigma}.
\end{cases}
\tag{3.1.5}
\]
Set
\[
\delta := A\,\frac{1-\sigma}{\sigma}.
\]
Then
\[
b(t) = e^{\delta t}\left( \alpha^{1/\sigma} + \frac{1 - e^{-\delta t}}{\delta} \right)
\]
solves (3.1.5) and is strictly positive. Hence
\[
a(t) = (b(T - t))^{\sigma} \tag{3.1.6}
\]
solves (3.1.4) and is strictly positive, so we conclude that v given by (3.1.2), where a(·) is given by (3.1.6), solves the HJB equation (3.1.1).
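As a sanity check, the following sketch integrates the terminal value problem (3.1.4) backward in time and compares the result with the closed form a(t) = (b(T − t))^σ from (3.1.6); the parameter values are illustrative choices.

```python
# Minimal numerical check (illustrative parameters) that a(t) = b(T - t)^sigma
# from (3.1.6) solves the terminal value problem (3.1.4).
import numpy as np
from scipy.integrate import solve_ivp

A, sigma, alpha, T = 0.05, 2.0, 1.0, 10.0
delta = A * (1.0 - sigma) / sigma

def b(t):
    return np.exp(delta * t) * (alpha**(1.0/sigma) + (1.0 - np.exp(-delta * t)) / delta)

def a_closed(t):
    return b(T - t)**sigma

# Integrate (3.1.4) backward from t = T (where a(T) = alpha) down to t = 0.
rhs = lambda t, a: -(A * (1.0 - sigma) * a + sigma * a**((sigma - 1.0) / sigma))
sol = solve_ivp(rhs, (T, 0.0), [alpha], dense_output=True,
                rtol=1e-8, atol=1e-10)

ts = np.linspace(0.0, T, 5)
print(np.max(np.abs(sol.sol(ts)[0] - a_closed(ts))))  # small, up to integration error
```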

Optimal feedback control

Consider the feedback map coming from the optimization of Hcv(·; c) when vx(s, z) is plugged in place of the formal argument p. It is the map
\[
G(s, z) = c(s, z, v_x(s, z)) = a(s)^{-1/\sigma}\, z,
\]
where a(·) is given in (3.1.6). The closed loop equation associated to this feedback map is a linear ODE:
\[
\begin{cases}
y'(s) = A\, y(s) - G(s, y(s)) = \bigl(A - a(s)^{-1/\sigma}\bigr)\, y(s),\\
y(t) = x > 0,
\end{cases}
\]
with unique solution yG(·; t, x), which is strictly positive. The feedback strategy associated to the feedback map G is then
\[
c_G(s) := G(s, y_G(s)), \quad s \in [t, T].
\]
One verifies now that cG(·) ∈ A(t, x), as y(s; t, x, cG(·)) = yG(s; t, x) > 0. Hence, by Theorem 2.3.2, we conclude that v(t, x) = V(t, x) and that (yG(·), cG(·)) is an optimal couple for the problem starting at (t, x).

3.2 AK model: infinite horizon

The state equation is
\[
\begin{cases}
y'(s) = A\, y(s) - c(s), & s \in \mathbb{R}_+,\\
y(0) = x > 0,
\end{cases}
\]
and we want to maximize the intertemporal discounted utility
\[
J(x; c(\cdot)) = \int_0^{+\infty} e^{-\rho t}\, \frac{c(t)^{1-\sigma}}{1-\sigma}\, dt, \quad \text{where } \sigma \in (0, 1),
\]
over all consumption strategies c(·) ∈ A(x), where
\[
\mathcal{A}(x) = \left\{ c(\cdot) \in L^1_{loc}(\mathbb{R}_+; \mathbb{R}_+) \ :\ y(\cdot; 0, x, c(\cdot)) > 0 \right\}.
\]
The value function of the problem is
\[
V(x) = \sup_{c(\cdot) \in \mathcal{A}(x)} J(x; c(\cdot)).
\]
We treat the case σ ∈ (0, 1).

Hamiltonians

The current value Hamiltonian H⁰cv of our problem is
\[
H^0_{cv}(x, p; c) = Axp - cp + \frac{c^{1-\sigma}}{1-\sigma}, \quad x > 0,\ p \in \mathbb{R},\ c \in \mathbb{R}_+.
\]
If p > 0, the maximum of H⁰cv(x, p; ·) over R+ is attained at
\[
c(x, p) = \operatorname*{arg\,max}_{c \ge 0} H^0_{cv}(x, p; c) = p^{-1/\sigma},
\]
so
\[
H^0_{\max}(x, p) = Axp + \frac{\sigma}{1-\sigma}\, p^{\frac{\sigma-1}{\sigma}}.
\]

HJB equation: classical solution

The Hamilton-Jacobi equation associated to our problem is
\[
\rho\, v(x) = H^0_{\max}(x, v'(x)) \quad \forall x > 0, \tag{3.2.1}
\]
where, if p > 0,
\[
H^0_{\max}(x, p) = \sup_{c \ge 0} \left\{ Axp - cp + \frac{c^{1-\sigma}}{1-\sigma} \right\} = Axp + \frac{\sigma}{1-\sigma}\, p^{\frac{\sigma-1}{\sigma}}.
\]
We look for a solution to the HJB equation in the form
\[
v(x) = a\, \frac{x^{1-\sigma}}{1-\sigma}, \quad a > 0. \tag{3.2.2}
\]

This is motivated by the fact that, given the linearity of the state equation and the (1−σ)-homogeneity of the functional, one can actually show that V is (1 − σ)-homogeneous. Let us plug (3.2.2) into (3.2.1). We get
\[
\rho\, a\, \frac{x^{1-\sigma}}{1-\sigma} = Ax\, a x^{-\sigma} + \frac{\sigma}{1-\sigma}\,(a x^{-\sigma})^{\frac{\sigma-1}{\sigma}}
= x^{1-\sigma}\left[ Aa + \frac{\sigma}{1-\sigma}\, a^{\frac{\sigma-1}{\sigma}} \right],
\]
from which we see that (3.2.2) solves (3.2.1) if (and only if)
\[
a = \left[ \frac{\rho - A(1-\sigma)}{\sigma} \right]^{-\sigma}, \tag{3.2.3}
\]
provided that
\[
\rho > A(1-\sigma). \tag{3.2.4}
\]

We take the latter as an assumption. Note that, since c(·) ≥ 0, we have a maximal growth for y(·): indeed
\[
y(s; 0, x, c(\cdot)) \le y_M(s) := y(s; 0, x, c(\cdot) \equiv 0) = x\, e^{As}, \quad s \in \mathbb{R}_+. \tag{3.2.5}
\]
Hence, setting y(s) := y(s; 0, x, c(·)), we have by (3.2.4)
\[
0 \le \lim_{s \to \infty} e^{-\rho s}\, v(y(s)) \le \lim_{s \to \infty} e^{-\rho s}\, v(y_M(s)) = \lim_{s \to \infty} e^{-\rho s}\, a\, \frac{x^{1-\sigma}}{1-\sigma}\, e^{A(1-\sigma)s} = 0,
\]
so the transversality condition (2.3.7) is fulfilled: we can apply Theorem 2.3.6.

Optimal control in feedback form

Consider the feedback map coming from the optimization of H⁰cv(·; c) when v'(z) is plugged in place of the formal argument p. It is the linear map
\[
z \mapsto G(z) = c(z, v'(z)) = \frac{\rho - A(1-\sigma)}{\sigma}\, z, \quad z > 0.
\]
The closed loop equation associated to this feedback map is linear:
\[
\begin{cases}
y'(s) = A\, y(s) - G(y(s)) = -\frac{\rho - A}{\sigma}\, y(s),\\
y(0) = x > 0,
\end{cases}
\]
with unique solution
\[
y_G(s) = x\, e^{-\frac{\rho - A}{\sigma}\, s}.
\]
Note that yG(s) > 0 for every s ∈ R+, so the feedback strategy associated to the feedback map G,
\[
c_G(s) := G(y_G(s)) = \frac{\rho - A(1-\sigma)}{\sigma}\, x\, e^{-\frac{\rho - A}{\sigma}\, s},
\]
belongs to A(x). Then, by Theorem 2.3.6, we deduce that v(x) = V(x) and that (yG(·), cG(·)) is an optimal couple for the problem starting at x.
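The optimality claim can also be checked numerically: under illustrative parameters satisfying (3.2.4), the sketch below evaluates J(x; cG(·)) by quadrature and compares it with the candidate value v(x) = a x^{1−σ}/(1−σ).

```python
# Minimal numerical check (illustrative parameters satisfying rho > A(1-sigma))
# that the feedback consumption c_G attains the candidate value a*x^(1-sigma)/(1-sigma).
import numpy as np
from scipy.integrate import quad

A, rho, sigma, x = 0.03, 0.05, 0.5, 1.0
assert rho > A * (1.0 - sigma)              # condition (3.2.4)

theta = (rho - A * (1.0 - sigma)) / sigma   # slope of the feedback map (3.2.3)
a = theta**(-sigma)                         # coefficient in (3.2.2)-(3.2.3)

y_G = lambda s: x * np.exp(-(rho - A) / sigma * s)   # optimal state path
c_G = lambda s: theta * y_G(s)                        # optimal consumption

J, _ = quad(lambda s: np.exp(-rho * s) * c_G(s)**(1 - sigma) / (1 - sigma),
            0, np.inf)
print(J, a * x**(1 - sigma) / (1 - sigma))   # the two numbers should agree
```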

3.3 Optimal investment model: infinite horizon

Let us consider the classical optimal investment problem with quadratic adjustment costs and linear production function. The state equation is
\[
\begin{cases}
y'(s) = -\delta\, y(s) + c(s), & s \in \mathbb{R}_+,\\
y(0) = x > 0,
\end{cases}
\]
where δ > 0, and we want to maximize
\[
J(x; c(\cdot)) := \int_0^{+\infty} e^{-\rho t}\left( \alpha\, y(t) - \frac{\gamma}{2}\, c^2(t) \right) dt, \quad \alpha, \gamma, \rho > 0,
\]
over the set of admissible strategies
\[
\mathcal{A}(x) := L^1_{loc}(\mathbb{R}_+; [0, M]),
\]
where M > 0. The value function is
\[
V(x) = \sup_{c(\cdot) \in \mathcal{A}(x)} J(x; c(\cdot)).
\]

Hamiltonians

The current value Hamiltonian is
\[
H^0_{cv}(x, p; c) = (-\delta x + c)\, p + \alpha x - \frac{\gamma}{2} c^2 = -\delta x p + \alpha x + cp - \frac{\gamma}{2} c^2.
\]
Assuming p ≥ 0, the unique maximum point of H⁰cv(x, p; ·) over [0, M] is
\[
c(x, p) =
\begin{cases}
\frac{p}{\gamma}, & \text{if } \frac{p}{\gamma} \le M,\\
M, & \text{if } \frac{p}{\gamma} > M.
\end{cases}
\]
Therefore the maximum value Hamiltonian is
\[
H^0_{\max}(x, p) =
\begin{cases}
-\delta x p + \alpha x + \frac{p^2}{2\gamma}, & \text{if } \frac{p}{\gamma} \le M,\\
-\delta x p + \alpha x + Mp - \frac{\gamma}{2} M^2, & \text{if } \frac{p}{\gamma} > M.
\end{cases}
\]

HJB equation: classical solution

The HJB equation is
\[
\rho\, v(x) = -\delta x\, v'(x) + \alpha x +
\begin{cases}
\frac{(v'(x))^2}{2\gamma}, & \text{if } 0 \le v'(x) \le \gamma M,\\
M v'(x) - \frac{\gamma}{2} M^2, & \text{if } v'(x) > \gamma M,
\end{cases}
\quad x > 0. \tag{3.3.1}
\]
We look for a solution in affine form:
\[
v(x) = ax + b, \quad a \in [0, \gamma M],\ b \ge 0. \tag{3.3.2}
\]
Plugging (3.3.2) into (3.3.1) we get
\[
\rho(ax + b) = -\delta x a + \alpha x + \frac{a^2}{2\gamma}, \quad x > 0,
\]
i.e.
\[
[(\rho + \delta)\, a - \alpha]\, x + \rho b - \frac{a^2}{2\gamma} = 0, \quad x > 0.
\]
Hence, under the assumption
\[
\frac{\alpha}{\rho + \delta} \in [0, \gamma M],
\]
the function v given in (3.3.2) with
\[
a = \frac{\alpha}{\rho + \delta}, \qquad b = \frac{a^2}{2\rho\gamma},
\]
solves (3.3.1).


Optimal control in feedback form

The feedback map is simply constant now:
\[
G(z) = c(z, v'(z)) \equiv \frac{a}{\gamma} =: \bar c \in [0, M].
\]
The corresponding closed loop equation has solution
\[
y_G(s) = \frac{\bar c}{\delta} + e^{-\delta s}\left[ x - \frac{\bar c}{\delta} \right], \quad s \in \mathbb{R}_+. \tag{3.3.3}
\]
The “feedback” (it is constant!) control strategy associated to G is
\[
c_G(s) = \bar c \quad \forall s \in \mathbb{R}_+.
\]
The transversality condition (2.3.7) is easily checked. Therefore v(x) = V(x) and the couple (yG(·), cG(·)) is optimal starting at x.
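Here too the claim can be verified numerically: under illustrative parameters satisfying α/(ρ + δ) ≤ γM, the sketch below evaluates J(x; cG(·)) by quadrature and compares it with the affine candidate v(x) = ax + b.

```python
# Minimal numerical check (illustrative parameters with alpha/(rho+delta) <= gamma*M)
# that the constant feedback c_bar = a/gamma attains the affine candidate v(x) = a*x + b.
import numpy as np
from scipy.integrate import quad

alpha, gamma, rho, delta, M, x = 1.0, 2.0, 0.1, 0.05, 5.0, 1.0
a = alpha / (rho + delta)
assert 0.0 <= a <= gamma * M
b = a**2 / (2 * rho * gamma)
c_bar = a / gamma

y_G = lambda s: c_bar / delta + np.exp(-delta * s) * (x - c_bar / delta)
J, _ = quad(lambda s: np.exp(-rho * s) * (alpha * y_G(s) - 0.5 * gamma * c_bar**2),
            0, np.inf)
print(J, a * x + b)   # the two numbers should agree
```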

3.4 Exercises

Exercise 3.4.1. Solve the problem of Section 3.1 in the case σ = 1, i.e. with
\[
J(t, x; c(\cdot)) = \int_t^T \log c(s)\, ds + \alpha \log(y(T)), \quad \text{where } \alpha \ge 0.
\]
(Hint: Guess a solution to the HJB equation in the form v(t, x) = a(t) log x + b(t) with a(·) > 0.)

Solution. The current value Hamiltonian does not depend on t and is given by
\[
H_{cv}(t, x, p; c) = Axp - cp + \log c, \quad x > 0,\ p \in \mathbb{R},\ c \in \mathbb{R}_+.
\]
If p > 0, then the maximum point of Hcv(t, x, p; ·) is attained at
\[
c(t, x, p) := \operatorname*{arg\,max}_{c \ge 0} H_{cv}(t, x, p; c) = p^{-1},
\]
so
\[
H_{\max}(t, x, p) = Axp - 1 - \log p.
\]

The HJB equation associated to our problem is
\[
\begin{cases}
-v_t(t, x) = H_{\max}(t, x, v_x(t, x)), & \forall (t, x) \in [0, T) \times (0, +\infty),\\
v(T, x) = \alpha \log x.
\end{cases}
\tag{3.4.1}
\]
We look for a solution to the HJB equation in the form
\[
v(t, x) = a(t) \log x + b(t), \tag{3.4.2}
\]
with a(·) > 0. In this case vx > 0 and (3.4.1) becomes
\[
\begin{cases}
-v_t(t, x) = Ax\, v_x(t, x) - 1 - \log v_x(t, x), & \forall (t, x) \in [0, T) \times (0, +\infty),\\
v(T, x) = \alpha \log x.
\end{cases}
\tag{3.4.3}
\]


Let us plug (3.4.2) into (3.4.3). We get
\[
\begin{cases}
-a'(t) \log x - b'(t) = Ax\, \frac{a(t)}{x} - 1 - \log \frac{a(t)}{x},\\
a(T) \log x + b(T) = \alpha \log x,
\end{cases}
\]
i.e.
\[
\begin{cases}
-a'(t) \log x - b'(t) = A\, a(t) - 1 + \log x - \log a(t),\\
a(T) \log x + b(T) = \alpha \log x.
\end{cases}
\]
Equating the terms containing log x and the ones which do not contain it, we get two ODEs with terminal conditions:
\[
\begin{cases}
-a'(t) = 1,\\
a(T) = \alpha,
\end{cases}
\qquad
\begin{cases}
-b'(t) = A\, a(t) - 1 - \log a(t),\\
b(T) = 0.
\end{cases}
\tag{3.4.4}
\]
From the first one we get
\[
a(t) = \alpha + T - t.
\]
Plugging this expression into the second one we get
\[
\begin{cases}
b'(t) = -A(\alpha + T - t) + 1 + \log(\alpha + T - t),\\
b(T) = 0,
\end{cases}
\]
hence
\[
b(t) = -\int_t^T \bigl[ -A(\alpha + T - s) + 1 + \log(\alpha + T - s) \bigr]\, ds,
\]
which can be explicitly computed yielding

...

Consider the feedback map coming from the optimization of Hcv(·; c) when vx(s, z) is plugged in place of the formal argument p. It is the map
\[
G(s, z) = c(s, z, v_x(s, z)) = \frac{z}{a(s)} = \frac{z}{\alpha + T - s}.
\]
The closed loop equation associated to this feedback map is a linear ODE:
\[
\begin{cases}
y'(s) = A\, y(s) - G(s, y(s)) = \left( A - \frac{1}{\alpha + T - s} \right) y(s),\\
y(t) = x > 0,
\end{cases}
\]
with unique solution yG(·; t, x), which is strictly positive. The feedback strategy associated to the feedback map G is then
\[
c_G(s) := G(s, y_G(s)), \quad s \in [t, T].
\]
One verifies now that cG(·) ∈ A(t, x), as y(s; t, x, cG(·)) = yG(s; t, x) > 0. Hence, by Theorem 2.3.2, we conclude that v(t, x) = V(t, x) and that (yG(·), cG(·)) is an optimal couple for the problem starting at (t, x). □
