
NEAR THE IDENTITY CANONICAL TRANSFORMATIONS

This part should be considered as an intermezzo. It has a rather technical character, and can be seen as an extension of the chapter on canonical transformations, since it deals with transformations close to the identity. The aim is to introduce some useful tools for developing a constructive perturbation scheme based on the algorithms of Lie series and of Lie transform, which will be used as substitutes for the classical method of constructing near the identity canonical transformations via generating functions in mixed variables.

The usefulness of Lie transform methods relies on two remarks. First, one avoids the cumbersome procedure of inversion of functions and of substitution of variables.

Second, completing the formal methods based on Lie transforms with quantitative estimates — as typically required by modern perturbation theory — is not really difficult.

Concerning the first remark, one may easily notice that the operations of inversion of functions and substitution of variables are basically simple, and so introduce no real difficulty from the point of view of Analysis. Moreover, most of the rigorous methods of perturbation theory available in the current literature are based on powerful analytical tools such as the implicit function theorem and fixed point theorems, to quote just the most common ones. The drawback of these methods is that they are not constructive and not easily implemented in explicit calculations, in particular if one plans to perform series expansions up to a quite high order in a parameter. For example, the first few steps of an inversion are indeed quite simple, and often faster than the expansions required by Lie transforms. But one soon realizes that pushing the typical expansions of perturbation theory to higher orders becomes a really cumbersome and time-consuming procedure. In contrast, the methods based on Lie transforms allow us to express all the series expansions that we need in a very straightforward manner, using only algebraic operations that are easily implemented on computers: sums, products and derivatives.

The second remark is related to an apparent defect of Lie transform methods in algorithmic form: they are often used only as formal tools, skipping all questions related to convergence of the expansions. We could also say that there is a gap between the formal methods of perturbation theory that have been used by astronomers since the times of Euler and Lagrange and the rigorous methods that are now widely used in the framework of Analysis. The ambitious attempt of the present notes is to fill, at least partially, that gap.

It may be interesting to add a few historical remarks. The introduction of Lie series as a tool for writing the solutions of differential equations goes back to Marius Sophus Lie. The field of Lie algebras has been developed on the basis of his work; however, this matter will not be discussed in the present notes. The usefulness of Lie series in numerical calculations was pointed out by Wolfgang Gröbner in a series of papers, starting in 1957. The book [42], published in 1967, contains a complete theory including the discussion of the convergence of Lie series for analytic vector fields, when all functions are expressed as power series. The expansions are then used as a tool for numerically solving differential equations. It should be remarked that the approximation of a solution by Lie series has recently been used also as a tool for constructing symplectic integration algorithms, which are particularly useful for numerically investigating the dynamics of Hamiltonian systems.

The application of Lie series methods to perturbation expansions in Celestial Mechanics was first proposed by Gen-Ichiro Hori [50] and André Deprit [27], who generalized the method by introducing Lie transforms and producing explicit algorithms. Many other papers have been devoted to algorithms based on Lie transforms, mainly for applications in Celestial Mechanics. An attempt to collect several such algorithms may be found in the papers [48] and [49] by Jacques Henrard, where extensive references are given. However, all these papers were concerned only with the formal aspects, while convergence problems were neglected. An extension of the rigorous result of Gröbner to the Hamiltonian case, including the use of action–angle variables, was first given in [32].

The exposition here will be concerned only with the application of Lie methods in a Hamiltonian framework, often considering the particular case of action–angle variables. This is indeed what we need in order to establish a working background for the rest of the present notes. However, a similar theory may be developed also for general systems of differential equations.

6.1 Formal expansions

Let us start with a short discussion concerning the meaning to be assigned to the expression formal calculus. In chapter 5 we have used it in an intuitive, perhaps vague, sense, just saying: “we disregard questions relative to convergence”. Let us try to make the last sentence a little more precise, with the help of Poincaré.

The easiest and most natural case is concerned with polynomial expansions, or formal power series, that we have used in chapter 5 in order to construct a first integral for a Hamiltonian in a neighbourhood of an (elliptic) equilibrium. Following Poincaré ([87], Tome II, ch. VIII), we may introduce the concept of formal expansion as follows. Consider the truncated expansion Φ = Φ₀ + . . . + Φ_r, i.e., a non homogeneous polynomial of degree r + 1. We say that the equation {H, Φ} = 0 is satisfied in formal sense if by replacing the truncated polynomial we get

$$ \{H,\Phi\} = O\bigl(|(x,y)|^{r+3}\bigr) , \tag{6.1} $$

i.e., the difference is of degree at least r + 3, that we have identified as order r + 1. Let us call this difference the remainder. Henceforth we shall denote as O(r + 1) the remainder of order r + 1, in whatever sense. This is essentially what we did in chapter 5, and the propositions stated there should be interpreted in this sense.

A similar attitude may be taken when considering expansions in a small parameter ε, as considered in chapter 4: just interpret order O(r + 1) as meaning: terms that have a power ε^{r+1} (or higher) as a factor. More elaborate cases may occur, as we shall see in the rest of the present notes.

Such an attitude turns out to be useful in view of the fact, known after Poincaré, that most series usually considered in perturbation theory are not convergent. On the other hand there is a long standing and successful tradition in Celestial Mechanics (or Astronomy, as it was called in the past) based on the use of the first order terms of a series expansion, disregarding the rest of the series: the problem of convergence is not even mentioned, in most cases.

Such a matter of fact has been splendidly illustrated by Poincaré (see [87], tome II, chap. VIII).

“There is a kind of misunderstanding between geometers and astronomers about the meaning of the word convergence. Geometers, preoccupied with perfect rigour and often too indifferent to the length of the inextricable calculations whose possibility they conceive, without thinking of actually undertaking them, say that a series is convergent when the sum of its terms tends to a definite limit, even if the first terms decrease very slowly. Astronomers, on the contrary, are accustomed to saying that a series converges when the first twenty terms, for instance, decrease very rapidly, even if the subsequent terms should increase indefinitely.

Thus, to take a simple example, consider the two series whose general terms are

$$ \frac{1000^n}{1\cdot 2\cdot 3\cdots n} \quad\text{and}\quad \frac{1\cdot 2\cdot 3\cdots n}{1000^n} . $$

Geometers will say that the first series converges, and even that it converges rapidly, because the millionth term is much smaller than the 999 999th; but they will regard the second as divergent, because the general term can grow beyond any limit.

Astronomers, on the contrary, will regard the first series as divergent, because the first 1000 terms are increasing; and the second as convergent, because the first 1000 terms are decreasing and this decrease is at first very rapid.

Both rules are legitimate: the first in theoretical research; the second in numerical applications. Both must reign, but in two separate domains whose boundaries it is important to know well.

. . . .

The first example that clearly showed the legitimacy of certain divergent expansions is the classical example of Stirling's series.

Cauchy showed that the terms of this series first decrease and then increase, so that the series diverges; but if one stops at the smallest term, one represents the Eulerian function with an approximation that is the better the larger the argument.”
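The behaviour of the first of Poincaré's two series is easy to check numerically. The following Python sketch is an illustration added here, not part of Poincaré's text; the helper name `terms` is hypothetical. It computes the terms 1000ⁿ/n! exactly and locates the point where they stop growing.

```python
from fractions import Fraction

def terms(base, count):
    """Exact terms a_n = base**n / n! for n = 0, ..., count - 1."""
    a, out = Fraction(1), []
    for n in range(count):
        out.append(a)
        a = a * base / (n + 1)   # a_{n+1} = a_n * base / (n + 1)
    return out

a = terms(1000, 1200)

# Largest index n such that a_n is strictly larger than a_{n-1}.
last_growth = max(n for n in range(1, len(a)) if a[n] > a[n - 1])
print(last_growth)
```

The terms of the second series are the reciprocals of these, so they decrease exactly where the terms above increase, and start growing afterwards.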

An example of the usefulness of a divergent series has been given in chapter 5, where an exponential stability estimate has been obtained for an elliptic equilibrium.

A similar but more general application is the theorem of Nekhoroshev that will be the subject of chapter 8. In chapter 7 the method of Lie series will be used in order to prove the theorem of Kolmogorov on the existence of invariant tori for perturbed systems. In all cases it will be essential to implement a scheme of quantitative analytical estimates.

6.2 Lie series

This section is devoted to the formal aspects of the theory of Lie series. All questions concerning convergence will be neglected here and deferred to sections 6.5 and 6.6.

Particular care will instead be devoted to the formulation of constructive algorithms.

In this chapter all functions will be assumed to be holomorphic.

On a 2n–dimensional phase space endowed with canonical coordinates p, q, consider a holomorphic function χ(p, q), that will be called a generating function. Consider also the Lie derivative

$$ L_\chi\,\cdot = \{\cdot,\chi\} , \tag{6.2} $$

i.e., the time derivative along the Hamiltonian vector field generated by χ. It plays a basic role throughout the chapter.

6.2.1 The Lie series operator

The Lie series operator is defined as the exponential of εLχ, namely

$$ \exp(\varepsilon L_\chi) = \sum_{s\ge 0} \frac{\varepsilon^s}{s!}\, L_\chi^s \tag{6.3} $$

(see, e.g., [42]). It represents the evolution at time ε of the canonical flow generated by the autonomous Hamiltonian χ.

The basic use of Lie series is the explicit expression of the canonical flow. Consider χ(p′, q′) as a function of the new variables q′, p′, and write

$$
\begin{aligned}
p &= \exp(\varepsilon L_\chi)p' = p' - \varepsilon \left.\frac{\partial\chi}{\partial q'}\right|_{p',q'} - \frac{\varepsilon^2}{2} \left. L_\chi \frac{\partial\chi}{\partial q'}\right|_{p',q'} + \ldots \\
q &= \exp(\varepsilon L_\chi)q' = q' + \varepsilon \left.\frac{\partial\chi}{\partial p'}\right|_{p',q'} + \frac{\varepsilon^2}{2} \left. L_\chi \frac{\partial\chi}{\partial p'}\right|_{p',q'} + \ldots ;
\end{aligned}
\tag{6.4}
$$

This is the explicit expression of the flow at time ε, and is a near the identity canonical transformation in the sense that it depends analytically on ε as a parameter, and for ε = 0 is the identity.¹ The reader will recognize the power expansion in time of the local solution of a holomorphic system of differential equations, as used by Cauchy.
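The expansion of the flow can be computed mechanically, using only sums, products and derivatives. The following Python sketch illustrates this under simplifying assumptions that are not part of the text: one degree of freedom, polynomial functions stored as dictionaries, and the harmonic oscillator χ = (p² + q²)/2 chosen only as a test case. The bracket convention {f, g} = ∂f/∂q ∂g/∂p − ∂f/∂p ∂g/∂q is the one under which L_χ q = ∂χ/∂p and L_χ p = −∂χ/∂q; all function names are hypothetical.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c}, standing for the sum of c * q**i * p**j.

def add(f, g):
    h = dict(f)
    for k, c in g.items():
        h[k] = h.get(k, 0) + c
    return {k: c for k, c in h.items() if c != 0}

def scale(f, s):
    return {k: c * s for k, c in f.items() if c * s != 0}

def mul(f, g):
    h = {}
    for (a, b), c in f.items():
        for (i, j), d in g.items():
            h[(a + i, b + j)] = h.get((a + i, b + j), 0) + c * d
    return {k: c for k, c in h.items() if c != 0}

def dq(f):
    return {(a - 1, b): c * a for (a, b), c in f.items() if a > 0}

def dp(f):
    return {(a, b - 1): c * b for (a, b), c in f.items() if b > 0}

def lie(f, chi):
    """Lie derivative L_chi f = {f, chi}."""
    return add(mul(dq(f), dp(chi)), scale(mul(dp(f), dq(chi)), -1))

def lie_series(chi, f, order):
    """Coefficients of eps**s in exp(eps L_chi) f, for s = 0, ..., order."""
    out, cur = [f], f
    for s in range(1, order + 1):
        cur = scale(lie(cur, chi), Fraction(1, s))   # cur = L_chi^s f / s!
        out.append(cur)
    return out

chi = {(2, 0): Fraction(1, 2), (0, 2): Fraction(1, 2)}   # (q^2 + p^2)/2
q = {(1, 0): Fraction(1)}
p = {(0, 1): Fraction(1)}
q_terms = lie_series(chi, q, 4)
p_terms = lie_series(chi, p, 2)
```

For this choice of χ the coefficients reproduce the Taylor expansions of q cos ε + p sin ε and of p cos ε − q sin ε, i.e., the rotation generated by the harmonic oscillator.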

It is also an easy matter to write the inverse transformation: exploiting the fact that the flow is autonomous, it is enough to replace χ by −χ or, equivalently, ε by −ε.

Example 6.1: Translation and deformation. Consider the case of action–angle variables p ∈ Rⁿ and q ∈ Tⁿ. The function χ = Σ_l ξ_l q′_l, with ξ ∈ Rⁿ, generates through the Lie series operator at time ε the canonical transformation

$$ p = \exp(\varepsilon L_\chi)p' = p' - \varepsilon\xi , \quad q = \exp(\varepsilon L_\chi)q' = q' , $$

namely a translation in the action space that leaves the angles unchanged. Similarly, a generating function χ = χ(q′) (independent of p′) generates the transformation

$$ p = p' - \frac{\partial\chi}{\partial q'} , \quad q = q' , $$

namely a deformation of the action variables. Remark that here the flow at time ε = 1 has been considered.

Lemma 6.1: The Lie series operator has the following properties.

(i) Linearity: for any pair of functions f, g and for all α ∈ R we have

$$ \exp(\varepsilon L_\chi)(f + g) = \exp(\varepsilon L_\chi)f + \exp(\varepsilon L_\chi)g , \quad \exp(\varepsilon L_\chi)(\alpha f) = \alpha \exp(\varepsilon L_\chi)f . $$

(ii) Distributivity over the product (or conservation of product): for any pair of functions f, g we have

$$ \exp(\varepsilon L_\chi)(f\cdot g) = \exp(\varepsilon L_\chi)f \cdot \exp(\varepsilon L_\chi)g . $$

(iii) Distributivity over Poisson bracket (or conservation of Poisson bracket): for any pair of functions f, g we have

$$ \exp(\varepsilon L_\chi)\{f, g\} = \bigl\{\exp(\varepsilon L_\chi)f, \exp(\varepsilon L_\chi)g\bigr\} . $$

The easy proof is left to the reader. Remark in particular that the property (iii) means that the Hamiltonian flow is a canonical transformation, a property that we already know (see lemma 3.3).
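For the reader who wants a hint, properties (ii) and (iii) may be sketched as follows. This outline is added here for convenience and works at a formal level, i.e., the double series is rearranged without discussing convergence. It uses only the Leibniz rule and the Jacobi identity for the Poisson bracket.

```latex
% L_chi acts as a derivation on products and on Poisson brackets:
L_\chi (f g) = (L_\chi f)\, g + f\, (L_\chi g) , \qquad
L_\chi \{f, g\} = \{L_\chi f, g\} + \{f, L_\chi g\} .

% Iterating s times gives a binomial formula:
L_\chi^s (f g) = \sum_{j=0}^{s} \binom{s}{j}\, (L_\chi^j f)\, (L_\chi^{s-j} g) .

% Inserting this in the definition of the Lie series operator:
\exp(\varepsilon L_\chi)(f g)
  = \sum_{s\ge 0} \sum_{j=0}^{s}
    \frac{\varepsilon^{j}}{j!} L_\chi^j f \cdot
    \frac{\varepsilon^{s-j}}{(s-j)!} L_\chi^{s-j} g
  = \exp(\varepsilon L_\chi) f \cdot \exp(\varepsilon L_\chi) g .

% The same computation with the product replaced by the bracket gives (iii).
```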

1 We have already encountered an expression similar to this one in sect. 5.1. The flow of a linear system has been expressed there using an evolution operator which is the exponential of a matrix, which is the Lie series operator for a linear vector field.


Having defined a coordinate transformation we may ask how a function is transformed. Here the considerations of section 6.1 concerning the formal calculus play a relevant role. Given a function f(p, q), we should calculate the transformed function f′(p′, q′) by substitution of the transformation (6.4). In perturbation theory we also need to expand the transformed function in powers of the parameter ε up to a given order, which is clearly a long and boring procedure since we should apply Taylor's formula. A remarkable property is the following.

Lemma 6.2: Let a generating function χ(p, q) and a function f(p, q) be given. Then the following equality holds true:

$$ f(p, q)\Big|_{p=\exp(\varepsilon L_\chi)p',\ q=\exp(\varepsilon L_\chi)q'} = \exp(\varepsilon L_\chi)f\,\Big|_{p=p',\ q=q'} . \tag{6.5} $$

This lemma has been named by Gröbner the exchange theorem. A few comments are in order.

The claim is that the series expansion in ε of the transformed function may be calculated by applying the exponential operator of the Lie series directly to the function, with no need of making a substitution of variables.² This may appear trivial in view of the following considerations. The substitution of variables in the left member of (6.5) produces a function f′(q′, p′, ε), where ε is the time of the flow. We may expand the function in Taylor series by calculating the derivatives of f′ with respect to ε. The right member says essentially that the derivatives with respect to ε are calculated as Lie derivatives with respect to the flow generated by χ. This seems quite obvious.

However, some care is required if we remark that we are combining two operations: in the left member we first make the substitution, and then perform the expansion; in the right member we perform the operations in reverse order. To be rigorous, we should prove that the result is the same. This is evident for a polynomial in view of properties (i) and (ii) of lemma 6.1, and for an analytic function it follows via approximation by polynomials. The complete proof may be found in [42], § I.2.

2 A further comment on the role of the variables may be useful, if perhaps pedantic. The left member of (6.5) produces a function of the new variables q′, p′. On the other hand, the right member may be calculated by considering both functions χ and f as depending on the old variables q, p, so that the result is a function of q, p. The substitution p = p′, q = q′ means that the variables q, p must be simply renamed. The equality must be interpreted in the sense that we take care only of the form of the function, the name of the variables being irrelevant. This fact should be kept in mind when we use the algorithm: do not care about the names of the variables; just transform the functions.

6.2.2 The triangle of Lie series

An elegant and effective representation of the operation of transforming a function is found as follows. Assume, as is typical in perturbation theory, that the function to be transformed is expanded in power series of the parameter ε, namely f(p, q, ε) = f₀(p, q) + εf₁(p, q) + ε²f₂(p, q) + . . . , and that we want to write the transformed function g = exp(εL_χ)f as a power series g₀ + εg₁ + ε²g₂ + . . . in ε. Working at a formal level we may use the linearity of the Lie series operator, thus writing

$$ g = \exp(\varepsilon L_\chi)f_0 + \varepsilon \exp(\varepsilon L_\chi)f_1 + \varepsilon^2 \exp(\varepsilon L_\chi)f_2 + \ldots . $$

That is, we apply the Lie series to every term of the expansion of f. The action of the operator is represented by the triangular diagram

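$$
\begin{array}{cccccc}
g_0 & f_0 & & & & \\
 & \downarrow & & & & \\
g_1 & L_\chi f_0 & f_1 & & & \\
 & \downarrow & \downarrow & & & \\
g_2 & \frac{1}{2!}L_\chi^2 f_0 & L_\chi f_1 & f_2 & & \\
 & \downarrow & \downarrow & \downarrow & & \\
g_3 & \frac{1}{3!}L_\chi^3 f_0 & \frac{1}{2!}L_\chi^2 f_1 & L_\chi f_2 & f_3 & \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
\tag{6.6}
$$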

Terms of the same order in ε are aligned on the same line. The calculation may be performed by columns, as indicated by the arrows: if the function f and the generating function χ are known, then every column may be calculated proceeding from top to bottom until the line corresponding to the wanted order in ε is reached. Then it is enough to add together all terms appearing on the same line, and this gives every term of the expansion of g up to the wanted order. Everything not included in the diagram has higher order in ε, and need not be calculated. This is precisely what we mean when we say that g = exp(εL_χ)f in formal sense.

A compact form of the diagram is given by the recurrent formula

$$ g_0 = f_0 , \quad g_r = \sum_{j=0}^{r} \frac{1}{j!}\, L_\chi^j f_{r-j} \quad \text{for } r > 0 . \tag{6.7} $$

6.2.3 Composition of Lie series

As we have seen, the Lie series defines a near the identity transformation, which in our case is a canonical one. A natural question is whether the converse is also true, i.e., if every canonical transformation can be expressed as a Lie series.

The question may be formulated in more precise terms. We are actually considering a one parameter family of transformations. Suppose that we are given such a family in the form

$$ q = q' + \varepsilon\phi_1(p', q') + \varepsilon^2\phi_2(p', q') + \ldots , \quad p = p' + \varepsilon\psi_1(p', q') + \varepsilon^2\psi_2(p', q') + \ldots , \tag{6.8} $$

where the functions ϕ₁, ϕ₂, . . . and ψ₁, ψ₂, . . . are assumed to satisfy the necessary conditions for the transformation to be canonical. The question is whether a generating function χ(p′, q′) exists which produces exactly that transformation. The answer in general is negative, the limitation being that we are considering the flow of an autonomous system. However, more general transformations can be constructed by composition of Lie series.

Let us consider a sequence of generating functions χ = {χ₁, χ₂, χ₃, . . .}, and let the sequence of operators S⁽¹⁾, S⁽²⁾, S⁽³⁾, . . . be defined by recurrence as

$$ S^{(1)} = \exp(\varepsilon L_{\chi_1}) , \quad S^{(k)} = \exp(\varepsilon^k L_{\chi_k}) \circ S^{(k-1)} \quad \text{for } k > 1 . \tag{6.9} $$

If we work at formal level we interrupt the sequence at k = r for some r > 0, also truncating all expansions at order r; all the rest of the sequence produces only terms of order O(r + 1). If we can prove that the sequence converges to some limit then the limit

$$ S_\chi = \ldots \circ \exp(\varepsilon^3 L_{\chi_3}) \circ \exp(\varepsilon^2 L_{\chi_2}) \circ \exp(\varepsilon L_{\chi_1}) \tag{6.10} $$

is well defined. This procedure defines the composition of Lie series.

Inverting the operators so defined is not difficult. Let us define the sequence S̃⁽¹⁾, S̃⁽²⁾, S̃⁽³⁾, . . . as

$$ \tilde S^{(1)} = \exp(-\varepsilon L_{\chi_1}) , \quad \tilde S^{(k)} = \tilde S^{(k-1)} \circ \exp(-\varepsilon^k L_{\chi_k}) \quad \text{for } k > 1 . \tag{6.11} $$

It is an easy matter to check that the sequence defines the inverse in formal sense, namely

$$ \tilde S^{(r)} \circ S^{(r)} = \mathrm{Id} + O(r + 1) . \tag{6.12} $$

If the sequence S̃^(k) tends to some limit then we get the inverse of the operator S_χ, namely

$$ S_\chi^{-1} = \exp(-\varepsilon L_{\chi_1}) \circ \exp(-\varepsilon^2 L_{\chi_2}) \circ \exp(-\varepsilon^3 L_{\chi_3}) \circ \ldots . \tag{6.13} $$

In this case (6.12) is replaced by S̃_χ ∘ S_χ = Id.

Let us pay a little attention to the triangular diagram (6.6), making clear how the calculation should proceed when a composition of Lie series is considered. Keep in mind that all equalities must be considered in formal sense. Suppose we want to transform the function f = f₀ + εf₁ + ε²f₂ + . . . , which means that we want

$$ g = \ldots \circ \exp(\varepsilon^2 L_{\chi_2}) \circ \exp(\varepsilon L_{\chi_1}) f . $$

We first calculate f′ = exp(εL_χ₁)f = f₀′ + εf₁′ + ε²f₂′ + . . . as indicated in the diagram (6.6), just writing f′ in place of g and χ₁ in place of χ. Then we calculate f′′ = exp(ε²L_χ₂)f′ = f₀′′ + εf₁′′ + ε²f₂′′ + . . . by constructing a similar diagram, but paying attention to the correct alignment of the powers of ε. With a moment's thought we realize that the diagram should be represented as

$$
\begin{array}{ccccccc}
f_0'' & f_0' & & & & & \\
 & \downarrow & & & & & \\
f_1'' & 0 & f_1' & & & & \\
 & \downarrow & \downarrow & & & & \\
f_2'' & L_{\chi_2} f_0' & 0 & f_2' & & & \\
 & \downarrow & \downarrow & \downarrow & & & \\
f_3'' & 0 & L_{\chi_2} f_1' & 0 & f_3' & & \\
 & \downarrow & \downarrow & \downarrow & \downarrow & & \\
f_4'' & \frac{1}{2!}L_{\chi_2}^2 f_0' & 0 & L_{\chi_2} f_2' & 0 & f_4' & \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
$$

with a number of null elements, because we proceed by powers of ε², not of ε. Similarly, when we go on with our procedure we shall construct a diagram for exp(ε^r L_{χ_r}) which contains in every column just one non zero element out of r. A compact formula for the transformed function g = exp(ε^r L_{χ_r})f may be given, namely

$$ g_s = \sum_{j=0}^{k} \frac{1}{j!}\, L_{\chi_r}^j f_{s-jr} , \quad k = \left\lfloor \frac{s}{r} \right\rfloor , \tag{6.14} $$

where ⌊x⌋ denotes the maximal integer that does not exceed x.

Pedantically, let us see in some more detail how to proceed in a practical calculation, keeping in mind that we shall unavoidably truncate the series to some order ε^r, having chosen r ≥ 1 in a way that we consider suitable.

Let us pick a truncated function f = f0 + εf1 + . . . + ε^rfr and suppose that we want to construct the transformed function up to degree r in ε . With a little attention we realize that it is enough to construct every diagram until we reach the line corresponding to the power ε^r, and in particular we need to know only the generating functions χ1, . . . , χr. For, this includes all terms up to order ε^r, and neglects everything of higher order. Similar considerations apply to the inverse transformation: it is enough to consider only the action of the operator ˜S^(r) defined by (6.11).

A fact that may raise some perplexity in an actual calculation is the following. Suppose that we have constructed the generating functions χ₁, . . . , χ_r, so that we know how to construct the operators S_χ^(r) and S̃_χ^(r). Let us calculate the transformation

$$ q = S_\chi^{(r)} q' , \quad p = S_\chi^{(r)} p' $$

up to degree r in ε. This will give us expressions such as

$$ q = q' + \varepsilon\phi_1(p', q') + \ldots + \varepsilon^r\phi_r(p', q') , \quad p = p' + \varepsilon\psi_1(p', q') + \ldots + \varepsilon^r\psi_r(p', q') , \tag{6.15} $$

where the functions ϕ₁(p′, q′), . . . , ϕ_r(p′, q′) and ψ₁(p′, q′), . . . , ψ_r(p′, q′) may be explicitly calculated. We may then consider the inverse transformation

$$ q' = \tilde S_\chi^{(r)} q , \quad p' = \tilde S_\chi^{(r)} p . $$

This will give us the expressions

$$ q' = q + \varepsilon\tilde\phi_1(p, q) + \ldots + \varepsilon^r\tilde\phi_r(p, q) , \quad p' = p + \varepsilon\tilde\psi_1(p, q) + \ldots + \varepsilon^r\tilde\psi_r(p, q) , \tag{6.16} $$

which could be explicitly calculated. Suppose now that we substitute the expressions (6.16) into (6.15). By lemma 6.2 (the exchange theorem) this is equivalent to applying the operator S̃_χ^(r) to the functions in the right member of (6.15). We expect that this will give us the identity, which is definitely true if we consider the infinite series. If we work with truncated series at order r we shall realize that the result is the identity up to a term of order ε^{r+1}. This is the best we can expect in a practical calculation, and it is coherent with the rules of formal calculus.

Lemma 6.3: Let a near the identity canonical transformation in the form (6.8) be given. Then there exists a sequence χ = {χ₁(p′, q′), χ₂(p′, q′), . . .} of generating functions such that

$$ q = S_\chi q' , \quad p = S_\chi p' . $$

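Proof. Let us see how the sequence χ is constructed, step by step. In the formal approach, we want to determine χ₁ so that³

$$ \exp(\varepsilon L_{\chi_1})q_j - \bigl(q_j + \varepsilon\phi_{1,j}\bigr) = O(2) , \quad \exp(\varepsilon L_{\chi_1})p_j - \bigl(p_j + \varepsilon\psi_{1,j}\bigr) = O(2) . \tag{6.17} $$

The condition for the transformation (6.8) to be canonical (in formal sense at order 1) is

$$
\begin{aligned}
\{q_j + \varepsilon\phi_{1,j},\, q_k + \varepsilon\phi_{1,k}\} &= \{q_j, q_k\} + \varepsilon\{q_j, \phi_{1,k}\} + \varepsilon\{\phi_{1,j}, q_k\} + O(2) = O(2) , \\
\{p_j + \varepsilon\psi_{1,j},\, p_k + \varepsilon\psi_{1,k}\} &= \{p_j, p_k\} + \varepsilon\{p_j, \psi_{1,k}\} + \varepsilon\{\psi_{1,j}, p_k\} + O(2) = O(2) , \\
\{q_j + \varepsilon\phi_{1,j},\, p_k + \varepsilon\psi_{1,k}\} &= \{q_j, p_k\} + \varepsilon\{q_j, \psi_{1,k}\} + \varepsilon\{\phi_{1,j}, p_k\} + O(2) = \delta_{j,k} + O(2) .
\end{aligned}
$$

Hence we ask

$$ \frac{\partial\phi_{1,k}}{\partial p_j} - \frac{\partial\phi_{1,j}}{\partial p_k} = 0 , \quad -\frac{\partial\psi_{1,k}}{\partial q_j} + \frac{\partial\psi_{1,j}}{\partial q_k} = 0 , \quad \frac{\partial\psi_{1,k}}{\partial p_j} + \frac{\partial\phi_{1,j}}{\partial q_k} = 0 . $$

That is, there is a function χ₁(q, p) such that

$$ \phi_{1,j} = \frac{\partial\chi_1}{\partial p_j} , \quad \psi_{1,j} = -\frac{\partial\chi_1}{\partial q_j} , $$

so that χ₁ is determined by quadrature, and (6.17) is satisfied. Suppose now that we have determined {χ₁, . . . , χ_{r−1}} so that

$$
\begin{aligned}
\exp(\varepsilon^{r-1}L_{\chi_{r-1}}) \circ \ldots \circ \exp(\varepsilon L_{\chi_1})q_j - \bigl(q_j + \varepsilon\phi_{1,j} + \ldots + \varepsilon^{r-1}\phi_{r-1,j}\bigr) - \varepsilon^r\phi'_{r,j} &= O(r + 1) , \\
\exp(\varepsilon^{r-1}L_{\chi_{r-1}}) \circ \ldots \circ \exp(\varepsilon L_{\chi_1})p_j - \bigl(p_j + \varepsilon\psi_{1,j} + \ldots + \varepsilon^{r-1}\psi_{r-1,j}\bigr) - \varepsilon^r\psi'_{r,j} &= O(r + 1) .
\end{aligned}
$$

Remark that the expressions ϕ′_{r,j} and ψ′_{r,j} can be explicitly determined. On the other hand they must satisfy the conditions for canonicity in formal sense at order r. The calculation requires some patience and attention: one should exploit the fact that canonicity up to order r − 1 follows from the construction, for the transformation generated by a Lie series is canonical. We find that the canonicity conditions are, again,

$$ \frac{\partial\phi'_{r,k}}{\partial p_j} - \frac{\partial\phi'_{r,j}}{\partial p_k} = 0 , \quad -\frac{\partial\psi'_{r,k}}{\partial q_j} + \frac{\partial\psi'_{r,j}}{\partial q_k} = 0 , \quad \frac{\partial\psi'_{r,k}}{\partial p_j} + \frac{\partial\phi'_{r,j}}{\partial q_k} = 0 . $$

Thus, the function χ_r can be determined by quadrature. We conclude that the sequence can be constructed up to any wanted order, as claimed. Q.E.D.

3 The reader will notice that the argument above is essentially a reinterpretation of the near the identity canonical transformations that we have seen in example 2.12.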

6.3 Lie transform

The algorithm of Lie series may be generalized to the case of a time–dependent generating function χ(p, q, t), so that the corresponding time evolution is due to the flow of a non–autonomous Hamiltonian. This leads to an algorithm different from (6.3).

As already said at the beginning of the chapter, several different formulæ have been devised in order to give the Lie transform an algorithmic recurrent form: everybody, of course, has his favourite one. To make a definite choice, the algorithm used here is the favourite one of the author. It is related to the “algorithm of the inverse”, found by Henrard [48].

Hereafter particular attention will be paid to the algebraic aspect of the method, leaving somehow hidden the relation with a canonical flow. In particular the algorithm will be formulated after setting ε = 1.

Consider a generating sequence χ = {χs}_s≥1 of analytic functions on the phase space. The Lie transform operator Tχ is defined as

$$ T_\chi = \sum_{s\ge 0} E_s , \tag{6.18} $$

where the sequence {E_s}_{s≥0} of operators is recurrently defined as

$$ E_0 = \mathrm{Id} , \quad E_s = \sum_{j=1}^{s} \frac{j}{s}\, L_{\chi_j} E_{s-j} . \tag{6.19} $$

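A coordinate transformation defined by means of Lie transform is written as

$$
\begin{aligned}
q &= T_\chi q' = q' + L_{\chi_1}q' + \Bigl(\tfrac{1}{2}L_{\chi_1}^2 q' + L_{\chi_2}q'\Bigr) + \ldots , \\
p &= T_\chi p' = p' + L_{\chi_1}p' + \Bigl(\tfrac{1}{2}L_{\chi_1}^2 p' + L_{\chi_2}p'\Bigr) + \ldots .
\end{aligned}
\tag{6.20}
$$

This is a canonical transformation (see lemma 6.4 below).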

A direct connection with Lie series comes out by considering the case χ = {χ₁, 0, 0, . . .}, namely a generating sequence containing only the first term. Then the Lie transform generated by χ coincides with the Lie series generated by χ₁, i.e., T_χ = exp(L_χ₁) .

Lemma 6.4: The Lie transform T_χ defined by (6.18) and (6.19) has the following properties.

(i) Linearity: for any pair f, g of functions and for α ∈ R we have

$$ T_\chi(f + g) = T_\chi f + T_\chi g , \quad T_\chi(\alpha f) = \alpha T_\chi f . $$

(ii) Conservation of product: for any pair f, g of functions we have

$$ T_\chi(f\cdot g) = T_\chi f \cdot T_\chi g . $$

(iii) Conservation of Poisson brackets: for any pair f, g of functions we have

$$ T_\chi\{f, g\} = \{T_\chi f, T_\chi g\} . $$

For a proof see, e.g., [32]. The property (iii) is particularly relevant, since it means that the coordinate transformation (6.20) is canonical.

The reader will notice that the Lie transform has the same formal properties as the Lie series. This is true also for the exchange theorem, as stated by the following

Lemma 6.5: Let the generating sequence χ = {χ_s}_{s≥1} and a function f(p, q) be given. We have

$$ f(p, q)\Big|_{p=T_\chi p',\ q=T_\chi q'} = T_\chi f\,\Big|_{p=p',\ q=q'} . \tag{6.21} $$

The proof may be obtained using the linearity and the conservation of products.

This makes the claim evident for polynomials, and the result may be extended to analytic functions.⁴

4 See the remarks made for the Lie series, including the note 2 in this chapter.

6.3.1 The triangular diagram for the Lie transform

The calculation of a Lie transform has a nice graphical representation similar to that of Lie series. Suppose that we are given a function f = f₀ + f₁ + . . . , and denote by g = g₀ + g₁ + . . . its transformed function g = T_χ f. Using the linearity of the Lie transform we can apply T_χ separately to every term of f. It is useful to rearrange terms according to the triangular diagram

$$
\begin{array}{cccccc}
g_0 & f_0 & & & & \\
 & \downarrow & & & & \\
g_1 & E_1 f_0 & f_1 & & & \\
 & \downarrow & \downarrow & & & \\
g_2 & E_2 f_0 & E_1 f_1 & f_2 & & \\
 & \downarrow & \downarrow & \downarrow & & \\
g_3 & E_3 f_0 & E_2 f_1 & E_1 f_2 & f_3 & \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
\tag{6.22}
$$

where terms of the same order appear on the same line. Remark that the operator T_χ acts by columns, as indicated by the arrows: the knowledge of f_j and of the generating sequence allows one to construct the whole column below f_j. Thus, the first line gives g₀ = f₀, the second line gives g₁ = E₁f₀ + f₁, and so on. This shows how to practically perform the transformation.⁵ Again, truncating the diagram at line r allows us to calculate all contributions up to order r and forget everything of higher order, which is coherent with the rules of formal calculus.

Finding the inverse of the Lie transform may appear definitely more complicated than for Lie series. However, with a bit of attention we may realize that it is just a matter of using the triangular diagram (6.22) in a skilful manner. Assume that g is given, and f is unknown. Then, the first line gives f₀ = g₀; having determined f₀, all the column below f₀ can be constructed, and the second line gives immediately f₁ = g₁ − E₁f₀; having determined f₁, all the corresponding column can be constructed, so that f₂ can be determined from the third line as f₂ = g₂ − E₂f₀ − E₁f₁, and so on.
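The column-by-column use of the diagram, both for the direct transformation and for the inverse, can be sketched in a few lines of Python. The sketch below is only an illustration under assumptions that are not part of the text: one degree of freedom, polynomial functions stored as dictionaries, the bracket convention {f, g} = ∂f/∂q ∂g/∂p − ∂f/∂p ∂g/∂q, and arbitrarily chosen polynomials χ₁, χ₂, f₀, f₁, f₂. All function names are hypothetical.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c}, standing for the sum of c * q**i * p**j.
def add(f, g):
    h = dict(f)
    for k, c in g.items():
        h[k] = h.get(k, 0) + c
    return {k: c for k, c in h.items() if c != 0}

def scale(f, s):
    return {k: c * s for k, c in f.items() if c * s != 0}

def mul(f, g):
    h = {}
    for (a, b), c in f.items():
        for (i, j), d in g.items():
            h[(a + i, b + j)] = h.get((a + i, b + j), 0) + c * d
    return {k: c for k, c in h.items() if c != 0}

def dq(f):
    return {(a - 1, b): c * a for (a, b), c in f.items() if a > 0}

def dp(f):
    return {(a, b - 1): c * b for (a, b), c in f.items() if b > 0}

def lie(f, chi):
    """Lie derivative L_chi f = {f, chi}."""
    return add(mul(dq(f), dp(chi)), scale(mul(dp(f), dq(chi)), -1))

def column(chis, f, r):
    """[E_0 f, ..., E_r f] from the recurrence (6.19); chis[j - 1] is chi_j."""
    col = [f]
    for s in range(1, r + 1):
        acc = {}
        for j in range(1, min(s, len(chis)) + 1):
            acc = add(acc, scale(lie(col[s - j], chis[j - 1]), Fraction(j, s)))
        col.append(acc)
    return col

def lie_transform(chis, fs, r):
    """g_0, ..., g_r: sums along the lines of the triangular diagram."""
    g = [{} for _ in range(r + 1)]
    for i, f in enumerate(fs[: r + 1]):
        for s, term in enumerate(column(chis, f, r - i)):
            g[i + s] = add(g[i + s], term)
    return g

def inverse_lie_transform(chis, gs, r):
    """f_0, ..., f_r recovered from g_0, ..., g_r (gs needs r + 1 entries)."""
    fs, rest = [], [dict(g) for g in gs[: r + 1]]
    for i in range(r + 1):
        fs.append(rest[i])              # line i now contains exactly f_i
        col = column(chis, rest[i], r - i)
        for s in range(1, r - i + 1):   # subtract the column below f_i
            rest[i + s] = add(rest[i + s], scale(col[s], -1))
    return fs

chi1 = {(2, 1): Fraction(1)}            # q^2 p
chi2 = {(1, 3): Fraction(1, 2)}         # q p^3 / 2
fs = [{(2, 0): Fraction(1, 2), (0, 2): Fraction(1, 2)},
      {(3, 0): Fraction(1)},
      {(0, 4): Fraction(1)}]
g = lie_transform([chi1, chi2], fs, 3)
recovered = inverse_lie_transform([chi1, chi2], g, 3)
```

With a generating sequence containing only χ₁ the function `column` returns the terms L^s_{χ₁} f / s!, so the same code also reproduces the triangle of the Lie series.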

5 This scheme may look mysterious to a reader who is not familiar with the methods of expansion of perturbation theory. Let me try to clarify this matter. Introduce an expansion parameter, ε say, and write εχ₁, ε²χ₂, . . . in place of χ₁, χ₂, . . . . Then the Lie transform applied to a generic function f (independent of the expansion parameter ε) generates in a natural way a transformed function expanded in power series of ε. For, in view of L_{εχ₁} = εL_{χ₁}, L_{ε²χ₂} = ε²L_{χ₂}, . . . one easily sees that the operator E_s produces a factor ε^s. Let now f itself be a series in ε, namely f = f₀ + εf₁ + ε²f₂ + . . . , and look for the ε–expansion g = g₀ + εg₁ + ε²g₂ + . . . of its transformed function g = T_χ f. Then the triangular diagram (6.22) is easily constructed by putting on the same line the functions that have the same power of ε as coefficient. This should make the whole procedure natural. At this point, just set ε = 1 and leave everything in its place, remarking that the indexes play the role of the exponents of ε. Indeed, this is not just a formal game. We may consider the index s of a function, in our notation, as indicating that the function is “of order s” in some sense. The quantitative theory will be responsible for giving the expression a definite meaning, e.g., by assuring that with increasing values of the index s the size of the function decreases in some regular manner.

There is also an explicit formula for the inverse, namely

$$ T_\chi^{-1} = \sum_{s\ge 0} D_s , \tag{6.23} $$

where

$$ D_0 = \mathrm{Id} , \quad D_s = -\sum_{j=1}^{s} \frac{j}{s}\, D_{s-j} L_{\chi_j} . \tag{6.24} $$

However, this expression is actually useless for a practical computation: the algorithm described above is much more efficient. Nevertheless, the explicit recurrent formula is useful for quantitative estimates.

The property that makes the Lie transform quite useful is expressed by the following

Lemma 6.6: Let a canonical transformation be given in the form (6.8). Then there exists a generating sequence χ1(p^′, q^′), χ2(p^′, q^′), . . . such that we have

p = Tχp^′ , q = Tχq^′ .

Proof. We want T_χ p_j and T_χ q_j to coincide with the corresponding expressions of p_j and q_j, respectively, in (6.8) (recall that the name of the variables is irrelevant). Recalling the definition of T_χ we actually write the infinite system

$$ E_s p_j = \psi_{s,j} , \quad E_s q_j = \phi_{s,j} , \quad s \ge 1 . $$

For s = 1 the equation takes the simple form L_{χ₁}p_j = ψ_{1,j} and L_{χ₁}q_j = ϕ_{1,j}, i.e.,

$$ -\frac{\partial\chi_1}{\partial q_j} = \psi_{1,j} , \quad \frac{\partial\chi_1}{\partial p_j} = \phi_{1,j} . $$

Here we proceed as in the proof of lemma 6.3: the canonicity conditions give

$$ \frac{\partial\phi_{1,k}}{\partial p_j} - \frac{\partial\phi_{1,j}}{\partial p_k} = 0 , \quad -\frac{\partial\psi_{1,k}}{\partial q_j} + \frac{\partial\psi_{1,j}}{\partial q_k} = 0 , \quad \frac{\partial\psi_{1,k}}{\partial p_j} + \frac{\partial\phi_{1,j}}{\partial q_k} = 0 , $$

so that χ₁ is determined by quadratures. For s > 1, using the expression (6.19) of E_s and separating the last term of the sum we write the equations as

$$ L_{\chi_s}p_j + \sum_{l=1}^{s-1} \frac{l}{s}\, L_{\chi_l}E_{s-l}\,p_j = \psi_{s,j} , \quad L_{\chi_s}q_j + \sum_{l=1}^{s-1} \frac{l}{s}\, L_{\chi_l}E_{s-l}\,q_j = \phi_{s,j} . $$

Proceeding by induction, remark that the sum is determined by χ₁, . . . , χ_{s−1}, so that the equations read

$$ L_{\chi_s}p_j = \psi_{s,j} - F_{s,j} , \quad L_{\chi_s}q_j = \phi_{s,j} + G_{s,j} $$

with known functions F, G. On the other hand, we should recall that the transformation generated by the truncated sequence χ^(s−1) = {χ₁, . . . , χ_{s−1}} is canonical, so that the right members must satisfy the canonicity conditions. Proceeding again as in the proof of lemma 6.3 we conclude that χ_s may be determined. Q.E.D.

It is mandatory here to emphasize that the sequence of generating functions does not coincide with the sequence that generates the same transformation by composition of Lie series. This is because Lie derivatives do not commute, in general.
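The difference shows up for the first time at the third order. As a check, one may expand both operators using the same functions χ₁, χ₂, χ₃, and setting ε = 1 in the composition (6.10). The computation below is a sketch added for illustration; it uses only the recurrence (6.19) and the exponential series.

```latex
% Lie transform, from the recurrence (6.19):
E_1 = L_{\chi_1} , \qquad
E_2 = \tfrac{1}{2} L_{\chi_1}^2 + L_{\chi_2} , \qquad
E_3 = \tfrac{1}{6} L_{\chi_1}^3 + \tfrac{1}{3} L_{\chi_1} L_{\chi_2}
      + \tfrac{2}{3} L_{\chi_2} L_{\chi_1} + L_{\chi_3} .

% Composition exp(L_{chi_3}) o exp(L_{chi_2}) o exp(L_{chi_1}):
% the terms of order 1 and 2 coincide with E_1 and E_2, while the term of order 3 is
\tfrac{1}{6} L_{\chi_1}^3 + L_{\chi_2} L_{\chi_1} + L_{\chi_3} .

% The two third order terms differ by a commutator:
E_3 - \Bigl( \tfrac{1}{6} L_{\chi_1}^3 + L_{\chi_2} L_{\chi_1} + L_{\chi_3} \Bigr)
  = \tfrac{1}{3} \bigl( L_{\chi_1} L_{\chi_2} - L_{\chi_2} L_{\chi_1} \bigr) .
```

Hence the two algorithms produce the same transformation only if the third generating function is corrected so as to compensate the commutator, and similarly at higher orders.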

6.4 Analytic framework

It is now time to introduce the technical tools that will allow us to investigate the convergence or the asymptotic properties of perturbation series. The methods exposed here are in fact a variation on the classical method of majorants due to Cauchy.

6.4.1 Cauchy estimates

Consider an open disk ∆_ϱ(0), with ϱ > 0, centered at the origin of the complex plane C. Consider a function f analytic and bounded on the closure of the disk ∆_ϱ(0). The supremum norm |f|_ϱ of f in the domain ∆_ϱ(0) is defined as

$$ |f|_\varrho = \sup_{z\in\Delta_\varrho(0)} |f(z)| . \tag{6.25} $$

The estimate of Cauchy for the derivative f′ of f at the origin states that

$$ |f'(0)| \le \frac{1}{\varrho}\, |f|_\varrho . $$

More generally, for the s–th derivative f^(s) one has the estimate

$$ \bigl|f^{(s)}(0)\bigr| \le \frac{s!}{\varrho^s}\, |f|_\varrho . $$

For instance, let ϱ = 1, and consider the function f(z) = z^s. It is an easy matter to check that |f|₁ = 1, so that Cauchy's estimate gives |f^(s)(0)| ≤ s!. Since f^(s)(0) = s!, this shows that the estimate cannot be improved in general. The proof of the inequalities above is an easy consequence of Cauchy's formula

$$ f^{(s)}(z) = \frac{s!}{2\pi i} \oint \frac{f(\zeta)}{(\zeta - z)^{s+1}}\, d\zeta . $$

For, writing the contour of the disk as ζ = ϱe^{iϑ}, a straightforward calculation gives

$$ \bigl|f^{(s)}(0)\bigr| \le \frac{s!}{2\pi} \oint \frac{|f(\zeta)|}{\varrho^{s+1}}\, |d\zeta| \le \frac{s!}{2\pi\varrho^s}\, |f|_\varrho \int_0^{2\pi} d\vartheta = \frac{s!}{\varrho^s}\, |f|_\varrho . $$
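The estimate can be checked numerically on simple examples. The Python sketch below is an illustration added here, with hypothetical function names: it approximates f^(s)(0) by discretizing Cauchy's formula on the circle of radius ϱ with equally spaced points, and compares the result with s! |f|_ϱ / ϱ^s. The supremum norm is approximated by the maximum over the sample points on the circle, which is an assumption of the sketch; for the two examples used here that maximum coincides with |f|_ϱ up to rounding.

```python
import cmath
import math

def circle(rho, n):
    """n equally spaced points on the circle |zeta| = rho."""
    return [rho * cmath.exp(2j * math.pi * k / n) for k in range(n)]

def derivative_at_origin(f, s, rho, n=256):
    """Approximate f^(s)(0) from Cauchy's formula.

    With zeta = rho * exp(i * theta) one has d(zeta) = i * zeta * d(theta), so
    f^(s)(0) = s!/(2 pi) * integral of f(zeta) / zeta**s over theta in [0, 2 pi].
    """
    total = sum(f(z) / z**s for z in circle(rho, n))
    return math.factorial(s) * total / n

def cauchy_bound(f, s, rho, n=256):
    """s! / rho**s times the maximum of |f| over the sample points."""
    sup = max(abs(f(z)) for z in circle(rho, n))
    return math.factorial(s) * sup / rho**s

# Exponential function on the disk of radius 2: every derivative at 0 equals 1.
exp_checks = [(derivative_at_origin(cmath.exp, s, 2.0),
               cauchy_bound(cmath.exp, s, 2.0)) for s in range(6)]

# f(z) = z**3 on the unit disk: the bound is attained, both values equal 3! = 6.
cube_derivative = derivative_at_origin(lambda z: z**3, 3, 1.0)
cube_bound = cauchy_bound(lambda z: z**3, 3, 1.0)
```

For the exponential the bound s! e^ϱ / ϱ^s is far from sharp, while for f(z) = z^s on the unit disk the two sides coincide, in agreement with the remark that the estimate cannot be improved in general.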

The case of n variables requires a straightforward extension. Let the domain ∆_ϱ(0) be the polydisk of radius ϱ centered at the origin of Cⁿ, namely

$$ \Delta_\varrho(0) = \{z \in \mathbf{C}^n : |z| < \varrho\} , \tag{6.26} $$

where |z| = max_j |z_j| is the l_∞ norm on Cⁿ. This is nothing but the Cartesian product of complex disks of radius ϱ in the complex plane. Define the supremum norm of an