• Non ci sono risultati.

Differentiable Convex Functions and Legendre Trans-

Theorem 1.4.1. Let U be an open convex subset of Rn. If f : U → R is convex and differentiable at each point of U, then f is C1.

Proof. We fix x ∈ U . Let r ∈ ]0, ∞[ be such that the closed ball B(x, r) is contained in U . Let us set M = sup¯ y∈ ¯B(x,r)|f (y)| <

+∞. For h, k ∈ ¯B(0,r2), we have

f (x + h + k) − f (x + k) ≥ Df (x + k)(h), (*) taking the supremum over all h such that khk = r/2, we obtain kDf (x + k)k ≤ 4M /r. Since the ball {p ∈ Rn∗ | kpk ≤ 4M /r} is compact, it is enough to see that if kn→ 0 and Df (x + kn) → p, then p = Df (x). But taking the limit in the inequality (∗), we get

∀k ∈ ¯B(0, r/2), f (x + h) − f (x) ≥ hp, hi.

It results that Df (x) = p, since we have already seen that at a point where a function is differentiable only its derivative can be a supporting linear form, see Proposition 1.2.4.

Exercise 1.4.2. Let K be a compact topological space and U an open convex subset of Rn. If L : K × U → R is continuous and such that for each k ∈ K, the map U → R : v 7→ L(x, v) is convex and everywhere differentiable, then ∂L∂v : K × U → (Rn), (k, v) 7→

∂L

∂v(k, v) is continuous. [Indication: Adapt the proof of Theorem 1.4.1

Definition 1.4.3 (Legendre Transform). Let L : U → R be a C1 function, with U ⊂ Rn open. The Legendre transform associated with L is the map L : U → Rn∗, v 7→ DL(v).

21 We can rephrase part (ii) of Fenchel’s Theorem 1.3.7 and Corol-lary 1.3.10 in the following way:

Proposition 1.4.4. If L : Rn → R be C1 and superlinear, then its Legendre transform L : Rn → Rn∗ is surjective. Moreover, if we denote by H : Rn∗ → R its Fenchel transform then hp, vi = H(p) + L(v) if and only if p = DL(v), and we have

∀v ∈ Rn, H ◦ L(v) = DL(v)(v) − L(x, v).

In particular, the surjectivity of L is a consequence of super-linearity of L.

We are interested in finding out, for a C1 convex function L : Rn→ R, when its Legendre transform L : Rn → Rn∗ is bijective.

It is easy to understand when L is injective.

Theorem 1.4.5. Suppose L : U → R is a C1 convex function, defined on an open subset U of Rn. Its associated Legendre trans-form L is injective if and only if L is strictly convex.

Proof. Let p ∈ Rn∗. We have p = DL(x) if and only if DLp(x) = 0 where Lp(x) = L(x) − p(x). Hence x is a point where the function Lp reaches its minimum, see 1.2.11. If L is strictly convex Lp is too. However a strictly convex function can achieve its minimum at most at one point.

Conversely, if L is injective, the convex function L(x)−DL(x0)(x) has only x0 as a critical point and hence, and again by Corollary 1.2.11, it reaches its minimum only at x0. If x0 = tx + (1 − t)y with t ∈]0, 1[, x 6= x0 and y 6= x0, we therefore have

L(x) − DL(x0)(x) > L(x0) − DL(x0)(x0) L(y) − DL(x0)(y) > L(x0) − DL(x0)(x0).

Since t > 0 and (1 − t) > 0, we obtain tL(x)+(1−t)L(y)−DL(x0)(tx + (1 − t)y)

| {z }

x0

) > L(x0)−DL(x0)(x0),

hence tL(x) + (1 − t)L(y) > L(x0).

We would like now to prove the following theorem.

Theorem 1.4.6. Let L : Rn → R be C1, and convex. If is its Legendre transform, then the following statements are equivalent:

1 The function L is strictly convex, and superlinear.

2 Its Legendre transform L : Rn→ Rn∗ is a homeomorphism.

3 Its Legendre transform L : Rn→ Rn∗ is bijective.

Proof. We first show that (1) implies (3). If (1) is true then from Proposition 1.4.4, we know that L is surjective, and from Theorem 1.4.5 it is injective.

The fact that (3) implies (2) follows from Brouwer’s Theo-rem on the invariance of the domain see [Dug66, TheoTheo-rem 3.1, page 358]. (Note that one can obtain a proof independent from Brouwer’s Theorem by using Theorem 1.4.13 below.)

We now prove that (2) implies (3). Another application of Theorem 1.4.5 shows that L is strictly convex. It remains to show the superlinearity. Since L is a homeomorphism, the set AK = {x | kDL(x)k = K} is compact, and L(AK) = {p ∈ Rn∗ | kpk = K}, thus

∀v ∈ Rn, Kkvk = sup

x∈AK

DL(x)(v).

As L(v) ≥ DL(x)(v) + L(x) − DL(x)(x) we see that L(v) ≥ Kkvk + inf

x∈AK

[L(x) − DL(x)(x)],

but infx∈AK[L(x)− DL(x)(x)] > −∞, because AK is compact and L is of class C1.

When it comes to Lagrangians, Analysts like to assume that they are superlinear, and Geometers prefer to assume that its as-sociated Legendre transform is bijective. The following Corollary shows that for C2-strictly convex Lagrangians, these hypothesis are equivalent.

Corollary 1.4.7. Let L : Rn → R be a C2 convex function. Its associated Legendre transform L is a C1 diffeomorphism from Rn onto its dual space Rn∗ if and only if L is superlinear, and C2 -strictly convex.

23 Proof. Suppose that L = DL is a C1 diffeomorphism. By the previous Theorem 1.4.6, the map L is superlinear. Moreover, the derivative DL(v) = D2L(v) is an isomorphism, for each v ∈ Rn. Therefore D2L(v) is non degenerate as a bilinear form, for each v ∈ Rn. Since, by the convexity of L, the second derivative D2L(v) is non negative definite as a quadratic form, it follows that D2L(v) is positive definite as a quadratic form, for each v ∈ Rn.

Conversely, suppose L superlinear, and C2-strictly convex . Then DL : Rn → Rn∗ is a homeomorphism by Theorem 1.4.6.

Moreover, since DL(v) = D2L(v), the derivative DL(v) is thus an isomorphism at every point v ∈ Rn. By the Local Inversion Theorem, the inverse map L−1 is also C1.

In the sequel of this section, we will discuss some aspects of the Legendre transform that will not be used in this book. They nonetheless deserve to be better known.

We start with the notion of proper map.

Definition 1.4.8 (Proper Map). A map f : X → Y , between the topological spaces X and Y , is said to be proper if for ev-ery compact subset K of the target space Y , the inverse image f−1(K) ⊂ X is also compact.

The main properties of proper maps are recalled in the follow-ing exercise.

Exercise 1.4.9. Let f : X → Y be a proper continuous map between metric spaces.

1) Show that for each closed subset F ⊂ X, the image f (F ) is closed in Y . [Indication: Use the fact that if a sequence converges, then the subset formed by this sequence together with its limit is compact.

2) Conclude that f is a homeomorphism as soon as it is bijec-tive.

3) Show that a continuous map f : Rn → Rm is proper if and only if

kxk→+∞lim kf (x)k = +∞.

Theorem 1.4.10. Let L : U → R be a C1 convex function, where U is an open convex subset of Rn. If its associated Legendre transform L : U → Rn∗ is proper, then L is surjective.

We need some preliminaries in order to prove the theorem.

Lemma 1.4.11 (Of the Minimum). Let f : ¯B(x, r) → R be a function which has a derivative at each point of ¯B(x, r). If f achieves its minimum at x0 ∈ ¯B(x, r), a closed ball in a normed space, then Df (x0)(x0−x) = −kDf (x0)kr = −kDf (x0)kkx0−xk.

Proof. Without loss of generality, we can suppose x = 0. For all y ∈ ¯B(0, r) and for all t ∈ [0, 1], we have

f (ty + (1 − t)x0) ≥ f (x0),

thus, the function φy : [0, 1] → R, t 7→ f (ty + (1 − t)x0) has a minimum at t = 0, its derivative at 0, namely Df (x0)(y − x0), is thus ≥ 0. Hence Df (x0)(y − x0) ≥ 0, for each y ∈ ¯B(0, r), and consequently Df (x0)(x0) ≤ Df (x0)(y), for each y ∈ ¯B(0, r). It follows that

Df (x0)(x0) = inf

y∈ ¯B(0,r)Df (x0)(y) = −kDf (x0)kr.

If Df (x0) = 0, we also have the second part of the required equal-ities. If Df (x0) 6= 0, then x must be on the boundary ∂B(0, r) of B(0, r) and we again have the second part of the equalities.¯ Corollary 1.4.12. Let f : U → R be a C1convex function defined on the open convex subset U of Rn. If the derivative Df (x) is never the 0 linear form, for x ∈ U , then for each compact subset K ⊂ U , we have

x∈UinfkDf (x)k = inf

x∈U \KkDf (x)k.

Proof. The inequality infx∈UkDf (x)k ≤ infx∈U \KkDf (x)k is ob-vious. If we do not have equality for some compact subset K, then

x∈UinfkDf (x)k < inf

x∈U \KkDf (x)k, and therefore

x∈UinfkDf (x)k = inf

x∈U KkDf (x)k,

25 If we set K0 = {x ∈ U | kDf (x)k = infz∈UkDf (z)k}, it follows that K0 is closed, non-empty, and contained in K, therefore K0 is compact. Moreover

∀x ∈ U \ K0, kDf (x)k > inf

z∈UkDf (z)k.

Since K0 is compact there exists r > 0 such that the closed set V¯r(K0) = {x | d(x, K0) ≤ r} is contained in the open set U . As this set ¯Vr(K0) is also compact, there exists x0 ∈ ¯Vr(K0) such that f (x0) = infx∈ ¯Vr(K0)f (x). Necessarily x is on the boundary of V¯r(K0), because otherwise x0 would be a local minimum of f and therefore Df (x0) = 0, which is excluded. Hence d(x0, K0) = r.

By compactness of K0, we can find x ∈ K0 such that d(x0, x) = r.

Since ¯B(x, r) ⊂ ¯Vr(K0) and x0 ∈ ¯B(x, r), we also have f (x0) = infy∈ ¯B(x,r)f (y). By the previous Lemma 1.4.11, we must have Df (x0)(x0− x) = −kDf (x0)kr. The convexity of f gives

f (x) − f (x0) ≥ Df (x0)(x − x0) f (x0) − f (x) ≥ Df (x)(x0− x),

hence Df (x)(x − x0) ≥ Df (x0)(x − x0) = kDf (x0)kr. As kx − x0k = r, we get

Df (x)(x − x0) ≤ kDf (x)kkx − x0k = kDf (x)kr.

This implies kDf (x)k ≥ kDf (x0)k, which is absurd. In fact, we have kDf (x)k = infz∈UkDf (z)k, since x ∈ K0, and kDf (x0)k >

infz∈UkDf (z)k, because x0 ∈ K/ 0.

Proof of theorem 1.4.10. Fix p ∈ Rn∗, the Legendre transform of Lp= L−p is L−p, it is thus also proper. By the previous Corollary, it must vanish at some point in U , because infx∈ ¯B(0,r)kDLp(x)k →

∞, when r → ∞.

Theorem 1.4.13. Let L : Rn→ R be a C1 convex function. Its associated Legendre transform L : Rn→ Rn∗is proper if and only if L is superlinear.

Proof. Let us suppose L superlinear. By convexity we have L(0)−

L(x) ≥ DL(x)(0−x) and thus DL(x)(x) ≥ L(x)−L(0) from which

we obtain

kDL(x)k ≥ DL(x) x kxk



≥ L(x)

kxk − L(0) kxk,

by the superlinearity of L, we do have kDL(x)k → ∞, when kxk →

∞.

The proof of the converse is very close to the end of the proof of Theorem 1.4.6. If L is proper, the set AK = {x | kDL(x)k = K}

is compact. Moreover, since L is necessarily surjective, see 1.4.10, we have L(AK) = {p ∈ Rn∗| kpk = K}, and thus

∀v ∈ Rn, Kkvk = sup

x∈AK

DL(x)(v).

As L(v) ≥ DL(x)(v) + L(x) − DL(x)(x) we see that L(v) ≥ Kkvk + inf

x∈AK

[L(x) − DL(x)(x)],

but infx∈AK[L(x)− DL(x)(x)] > −∞, because AK is compact and L is of class C1.

We would like to conclude this section with a very nice theorem due to Minty see [Min64, Min61] (see also [Gro90, 1.2. Convexity Theorem]). In order to give the best statement, we recall some notions about convex subsets of a finite-dimensional vector space E. If C ⊂ F is a convex subset, we will denote by Aff(C) the affine subspace generated by C, the relative interior relint(C) of C is its interior as a subset of Aff(C).

Theorem 1.4.14 (Minty). If L : Rn→ R is a C1convex function, then the closure of the image L(Rn) = DL(Rn) of its associated Legendre transform L is convex. Moreover L contains the relative interior of its closure.

In order to prove this theorem, we will need the following lemma.

Lemma 1.4.15. Let L : Rn → R be a C1 convex function. If p /∈ L(Rn), then there exists v ∈ Rn\{0} such that L(x)(v) ≥ p(v) for all x ∈ Rn. If p is not in the closure of L(Rn), then, moreover, there exists ǫ > 0 such that L(x)(v) ≥ ǫ + p(v) for all x ∈ Rn.

27 Proof of Theorem 1.4.14. To simplify notations, we call C the clo-sure of L(Rn). Observe that a linear form on Rn∗ is of the form p 7→ p(v), where v ∈ Rn. Therefore the Lemma above 1.4.15 shows that a point in the complement of C, can be strictly separated by a hyperplane from L(Rn), and hence from its closure C. This implies the convexity of C.

It remains to prove the second statement.

We first assume that the affine subspace generated by L(Rn) is the whole of Rn∗. We have to prove that the interior of C is contained in L(Rn). Suppose that p0 ∈ ˚C is not contained in L(Rn), by Lemma 1.4.15 above we can find v ∈ Rn \ {0} with L(z)(v) ≥ p0(v) for all z ∈ Rn, therefore p(v) ≥ p0(v) for all p in the closure C of L(Rn). This is clearly impossible since p0 ∈ ˚C and v 6= 0.

To do the general case, call E the affine subspace generated by L(Rn). Replacing L by L − DL(0), we can assume that 0 ∈ E, and therefore E is a vector subspace of Rn∗. Changing bases, we can assume that E = Rk∗ × {0} ⊂ Rn∗. Since DL(x) ∈ E = Rk∗ × {0}, we see that ∂L/∂xi is identically 0 for i > 0, and therefore L(x1, . . . , xn) depends only on the first k variables, so we can write L(x1, . . . , xn) = ˜L(x1, . . . , xk), with ˜L : Rk→ R convex and C1. It is obvious that DL and D ˜L have the same image in Rk∗ = Rk∗ × {0} ⊂ Rn∗, therefore the image of D ˜L generates affinely Rk∗. We can therefore apply the first case treated above to finish the proof.

Proof of Lemma 1.4.15. We fix p0∈ L(R/ n) = DL(Rn). The func-tion Lp0 = L − p0 is convex. As its derivative is never the 0 linear map, it does not have a local minimum. To simplify the notations let us set f = L − p0. For each integer k ≥ 1, by Lemma 1.4.11 applied to f , and to the ball ¯B(0, k), we can find xkwith kxkk = k such that

Df (xk)(xk) = −kDf (xk)kkxkk.

The convexity of f gives

∀y ∈ Rn, Df (y)(y − xk) ≥ f (y) − f (xk) ≥ Df (xk)(y − xk), hence

∀y ∈ Rn, Df (y)(y − xk) ≥ Df (xk)(y − xk). (*)

In particular, we have Df (0)(−xk) ≥ Df (xk)(−xk) = kDf (xk)kkxkk, and thus kDf (0)k ≥ kDf (xk)k. Taking a subsequence, we can suppose that kxkk → ∞, that Df (xk) → p, and that xk/kxkk → v, with v of norm 1. Dividing both sides the inequality (∗) above by kxkk, and using the equality

Df (xk)(xk) = −kDf (xk)kkxkk, and taking limits we obtain

∀y ∈ Rn, Df (y)(−v) ≥ kpk, we rewrite it as

∀y ∈ RnDL(y)(−v) ≥ p0(−v) + kpk.

It then remains to observe that p = 0 implies that Df (xk) = DL(xk)−p0tends to 0 and thus p0is in the closure of DL(Rn).

Exercise 1.4.16. Suppose that L : Rn → R is C1, and strictly convex. Show that the image L(Rn) is a convex open subset of Rn∗.

Does this result remain true, for L : U → R C1, and strictly convex, on the open subset U Rn?