NEAR THE IDENTITY

CANONICAL TRANSFORMATIONS

This part should be considered as an intermezzo. It has a rather technical character, and can be seen as an extension of the chapter on canonical transformations, since it deals with transformations close to identity. The scope is to introduce some useful tools for developing a constructive perturbation scheme based on the algorithms of Lie series and of Lie transform, that will be used as substitutes for the classical method of constructing near the identity canonical transformations via generating functions in mixed variables.

The usefulness of Lie transform methods relies on two remarks. First, one avoids the cumbersome procedure of inversion of functions and of substitution of variables.

Second, completing the formal methods based on Lie transforms with quantitative estimates — as typically required by modern perturbation theory — is not really difficult.

Concerning the first remark, one may easily notice that the operations of inversion of functions and substitution of variables are basically simple, and so introduce no real difficulty from the point of view of Analysis. Moreover, most of the rigorous methods of perturbation theory available in the current literature are based on powerful analytical tools such as the implicit function theorem and fixed point theorems, to quote just the most common ones. The drawback of these methods is that they are not constructive and not easily implemented in explicit calculations, in particular if one plans to perform series expansions up to a quite high order in a parameter. For example, the first few steps of an inversion are indeed quite simple, and often faster than the expansions required by Lie transforms. But one soon realizes that pushing the typical expansions of perturbation theory to higher orders becomes a really cumbersome and time-consuming procedure. In contrast, the methods based on Lie transforms allow us to express all the series expansions that we need in a very straightforward manner, using only algebraic operations that are easily implemented on computers: sums, products and derivatives.

The second remark is related to an apparent defect of Lie transform methods in algorithmic form: they are often used only as formal tools, skipping all questions related to convergence of the expansions. We could also say that there is a gap between the formal methods of perturbation theory that have been used by astronomers since the times of Euler and Lagrange and the rigorous methods that are now widely used in the framework of Analysis. The ambitious attempt of the present notes is to fill, at least partially, that gap.

It may be interesting to add a few historical remarks. The introduction of Lie series as a tool for writing the solutions of differential equations goes back to Marius Sophus Lie. The field of Lie algebras has been developed on the basis of his work; however, this matter will not be discussed in the present notes. The usefulness of Lie series in numerical calculations was pointed out by Wolfgang Gröbner in a series of papers, starting in 1957. The book [42], published in 1967, contains a complete theory including the discussion of the convergence of Lie series for analytic vector fields, when all functions are expressed as power series. The expansions are then used as a tool for numerically solving differential equations. It should be remarked that the approximation of a solution by Lie series has more recently been used also as a tool for constructing symplectic integration algorithms, which are particularly useful for numerically investigating the dynamics of Hamiltonian systems.

The application of Lie series methods to perturbation expansions in Celestial Mechanics was first proposed by Gen-Ichiro Hori^{[50]} and André Deprit^{[27]}, who generalized the method by introducing Lie transforms and producing explicit algorithms. Many other papers have been devoted to algorithms based on Lie transforms, mainly for applications in Celestial Mechanics. An attempt to collect several such algorithms may be found in the papers [48] and [49] by Jacques Henrard, which also contain extensive references. However, all these papers were concerned only with the formal aspects, while convergence problems were neglected. An extension of the rigorous result of Gröbner to the Hamiltonian case, including the use of action–angle variables, was first given in [32].

The exposition here will be concerned only with the application of Lie methods in a Hamiltonian framework, often considering the particular case of action–angle variables. This is indeed what we need in order to establish a working background for the rest of the present notes. However, we remark that a similar theory may be developed also for general systems of differential equations.

6.1 Formal expansions

Let us start with a short discussion concerning the meaning to be assigned to the expression formal calculus. In chapter 5 we have used it in an intuitive, perhaps vague, sense, just saying: “we disregard questions relative to convergence”. Let us try to make the last sentence a little more precise, with the help of Poincaré.

The easiest and most natural case is concerned with polynomial expansions, or formal power series, that we have used in chapter 5 in order to construct a first integral for a Hamiltonian in a neighbourhood of an (elliptic) equilibrium. Following Poincaré ([87], Tome II, ch. VIII), we may introduce the concept of formal expansion as follows. Consider the truncated expansion Φ = Φ_{0} + . . . + Φ_{r}, i.e., a non homogeneous polynomial of degree r + 2 (the term of order s being of degree s + 2, in agreement with the identification of orders recalled below). We say that the equation {H, Φ} = 0 is satisfied in formal sense if by replacing the truncated polynomial we get

(6.1) {H, Φ} = O(|(x, y)|^{r+3}) ,

i.e., the difference is of degree at least r + 3, that we have identified as order r + 1. Let us call this difference the remainder. Henceforth we shall denote as O(r + 1) the remainder of order r + 1, in whatever sense. This is essentially what we did in chapter 5, and the propositions stated there should be interpreted in this sense.

A similar attitude may be taken when considering expansions in a small parameter ε, as considered in chapter 4: just interpret order O(r + 1) as meaning: terms that have a power ε^{r+1} (or higher) as a factor. More elaborate cases may occur, as we shall see in the rest of the present notes.

Such an attitude turns out to be useful in view of the fact, known after Poincaré, that most series usually considered in perturbation theory are not convergent. On the other hand there is a long standing and successful tradition in Celestial Mechanics (or Astronomy, as it was called in the past) based on the use of the first order terms of a series expansion, disregarding the rest of the series: the problem of convergence is not even mentioned, in most cases.

This state of affairs has been splendidly illustrated by Poincaré (see [87], tome II, chap. VIII).

“There is a kind of misunderstanding between geometers and astronomers about the meaning of the word convergence. Geometers, preoccupied with perfect rigour and often too indifferent to the length of inextricable calculations which they conceive to be possible, without thinking of actually undertaking them, say that a series is convergent when the sum of the terms tends to a definite limit, even if the first terms decrease very slowly. Astronomers, on the contrary, are accustomed to saying that a series converges when the first twenty terms, for instance, decrease very rapidly, even if the subsequent terms should grow indefinitely.

Thus, to take a simple example, consider the two series which have as general term

1000^{n} / (1 · 2 · 3 · . . . · n) and (1 · 2 · 3 · . . . · n) / 1000^{n} .

Geometers will say that the first series converges, and even that it converges rapidly, because the millionth term is much smaller than the 999 999th; but they will regard the second one as divergent, because the general term can grow beyond any limit.

Astronomers, on the contrary, will regard the first series as divergent, because the first terms are increasing; and the second one as convergent, because the first terms are decreasing and this decrease is at first very rapid.

Both rules are legitimate: the first one in theoretical research; the second one in numerical applications. Both must reign, but in two separate domains whose boundaries it is important to know well.

. . . .

The first example that clearly showed the legitimacy of certain divergent expansions is the classical example of the series of Stirling. Cauchy showed that the terms of this series at first decrease, then increase, so that the series diverges; but if one stops at the smallest term, one represents the Eulerian function with an approximation which is the better the larger the argument.”
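The behaviour of the two series quoted by Poincaré is easily explored numerically. The following Python sketch is only an illustration: the function name log_term and the choice of working with logarithms (to avoid overflow) are our own assumptions, not part of the quoted text.

```python
import math

def log_term(n):
    """Natural logarithm of 1000^n / n!, the general term of the first series.

    The general term of the second series, n!/1000^n, is the reciprocal,
    so its logarithm is -log_term(n).
    """
    return n * math.log(1000.0) - math.lgamma(n + 1)

# "Astronomer's" view: the early terms of the first series grow
# (equivalently, the early terms of the second series shrink).
early_growth = all(log_term(n + 1) > log_term(n) for n in range(1, 990))

# "Geometer's" view: far enough out, the terms of the first series decrease.
late_decay = log_term(10**6) < log_term(10**6 - 1)

print(early_growth, late_decay)
```

The ratio of two consecutive terms of the first series is 1000/(n + 1), which explains both observations: the terms grow as long as n + 1 < 1000 and decrease afterwards.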

An example of the usefulness of a divergent series has been given in chapter 5, where an exponential stability estimate has been obtained for an elliptic equilibrium.

A similar but more general application is the theorem of Nekhoroshev that will be the subject of chapter 8. In chapter 7 the method of Lie series will be used in order to prove the theorem of Kolmogorov on the existence of invariant tori for perturbed systems. In all cases it will be essential to implement a scheme of quantitative analytical estimates.

6.2 Lie series

This section is devoted to the formal aspects of the theory of Lie series. All questions concerning convergence will be neglected here and deferred to sections 6.5 and 6.6.

Particular care will instead be devoted to the formulation of constructive algorithms.

In this chapter all functions will be assumed to be holomorphic.

On a 2n–dimensional phase space endowed with canonical coordinates p, q, consider a holomorphic function χ(p, q), that will be called a generating function. Consider also the Lie derivative

(6.2) L_{χ}· = {·, χ} ,

i.e., the time derivative along the Hamiltonian vector field generated by χ. It plays a basic role throughout the chapter.

6.2.1 The Lie series operator

The Lie series operator is defined as the exponential of εL_{χ}, namely

(6.3) exp(εL_{χ}) = ∑_{s≥0} (ε^{s}/s!) L_{χ}^{s}

(see, e.g., [42]). It represents the evolution at time ε of the canonical flow generated by the autonomous Hamiltonian χ or, equivalently, the time one evolution of the flow generated by εχ.

The basic use of Lie series is the explicit expression of the canonical flow. Consider χ(p^{′}, q^{′}) as a function of the new variables q^{′}, p^{′}, and write

(6.4)
p = exp(εL_{χ})p^{′} = p^{′} − ε ∂χ/∂q^{′}|_{p^{′},q^{′}} − (ε^{2}/2) L_{χ}(∂χ/∂q^{′})|_{p^{′},q^{′}} + . . . ,
q = exp(εL_{χ})q^{′} = q^{′} + ε ∂χ/∂p^{′}|_{p^{′},q^{′}} + (ε^{2}/2) L_{χ}(∂χ/∂p^{′})|_{p^{′},q^{′}} + . . . .

The signs follow from L_{χ}p^{′} = {p^{′}, χ} = −∂χ/∂q^{′} and L_{χ}q^{′} = {q^{′}, χ} = ∂χ/∂p^{′}, together with the linearity of L_{χ}.

This is the explicit expression of the flow at time ε, and is a near the identity canonical transformation in the sense that it depends analytically on ε as a parameter, and for ε = 0 is the identity.^{1} The reader will recognize the power expansion in time of the local solution of a holomorphic system of differential equations, as used by Cauchy.

It is also an easy matter to write the inverse transformation: exploiting the fact that the flow is autonomous, it is enough to replace χ by −χ or, equivalently, ε by −ε.
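As an illustration of how the Lie series reproduces a known flow, here is a minimal Python sketch for one degree of freedom. The polynomial representation (dictionaries mapping exponent pairs to rational coefficients), the helper names and the choice χ = (q^2 + p^2)/2 are our own assumptions; the bracket convention {f, g} = ∂f/∂q ∂g/∂p − ∂f/∂p ∂g/∂q is the one consistent with the first order terms of the expansion of the flow.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def lie_series(chi, f, order):
    """Coefficients c_0..c_order of exp(eps L_chi) f, with c_s = L_chi^s f / s!."""
    coeffs, term = [], f
    for s in range(order + 1):
        coeffs.append(term)
        term = scale(lie(chi, term), Fraction(1, s + 1))
    return coeffs

Q = {(1, 0): Fraction(1)}
chi = {(2, 0): Fraction(1, 2), (0, 2): Fraction(1, 2)}  # harmonic oscillator
q_series = lie_series(chi, Q, 4)
for s, c in enumerate(q_series):
    print(s, c)
```

The coefficients found are q, p, −q/2, −p/6, q/24, i.e., the first terms of q cos ε + p sin ε, which is the flow of the harmonic oscillator at time ε. Replacing chi by scale(chi, -1) gives the expansion of the inverse transformation.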
Example 6.1: Translation and deformation. Consider the case of action–angle variables p ∈ R^{n} and q ∈ T^{n}. The function χ = ∑_{l} ξ_{l}q^{′}_{l}, with ξ ∈ R^{n}, generates through the Lie series operator at time ε the canonical transformation

p = exp(εL_{χ})p^{′} = p^{′} − εξ , q = exp(εL_{χ})q^{′} = q^{′} ,

namely a translation in the action space that leaves the angles unchanged. Similarly, a generating function χ = χ(q^{′}) (independent of p^{′}) generates the transformation

p = p^{′} − ∂χ/∂q^{′} , q = q^{′} ,

namely a deformation of the action variables. Remark that here the flow at time ε = 1 has been considered.

Lemma 6.1: The Lie series operator has the following properties.

(i) Linearity: for any pair of functions f, g and for all α ∈ R we have

exp(εL_{χ})(f + g) = exp(εL_{χ})f + exp(εL_{χ})g , exp(εL_{χ})(αf) = α exp(εL_{χ})f .

(ii) Distributivity over the product (or conservation of product): for any pair of functions f, g we have

exp(εL_{χ})(f · g) = exp(εL_{χ})f · exp(εL_{χ})g .

(iii) Distributivity over Poisson bracket (or conservation of Poisson bracket): for any pair of functions f, g we have

exp(εL_{χ}){f, g} = {exp(εL_{χ})f, exp(εL_{χ})g} .

The easy proof is left to the reader. Remark in particular that the property (iii) means that the Hamiltonian flow is a canonical transformation, a property that we already know (see lemma 3.3).
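For the reader who wants a hint, the following lines sketch one possible route to (ii) and (iii) at the formal level; it is our own outline, based only on the Leibniz rule for L_{χ} and on the Jacobi identity, and not necessarily the argument the text has in mind.

```latex
% (ii) L_\chi is a derivation, so the Leibniz rule holds; dividing by s! gives
\frac{1}{s!}\,L_\chi^{s}(f g)
  = \sum_{j=0}^{s} \frac{1}{j!}\,L_\chi^{j} f \cdot \frac{1}{(s-j)!}\,L_\chi^{s-j} g .
% Multiplying by \varepsilon^{s} and summing over s yields a Cauchy product:
\exp(\varepsilon L_\chi)(f g)
  = \Bigl(\sum_{j\ge 0} \frac{\varepsilon^{j}}{j!}\,L_\chi^{j} f\Bigr)
    \Bigl(\sum_{k\ge 0} \frac{\varepsilon^{k}}{k!}\,L_\chi^{k} g\Bigr) .
% (iii) By the Jacobi identity, L_\chi acts on the bracket with the same Leibniz structure,
L_\chi \{f, g\} = \{L_\chi f, g\} + \{f, L_\chi g\} ,
% so the argument used for (ii) applies with the product replaced by the Poisson bracket.
```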

1 We have already encountered an expression similar to this one in sect. 5.1. The flow of a linear system has been expressed there using an evolution operator which is the exponential of a matrix, which is the Lie series operator for a linear vector field.

Having defined a coordinate transformation we may ask how a function is transformed. Here the considerations of section 6.1 concerning the formal calculus play a relevant role. Given a function f(q, p), we should calculate the transformed function f^{′}(p^{′}, q^{′}) by substitution of the transformation (6.4). In perturbation theory we also need to expand the transformed function in powers of the parameter ε up to a given order, which is clearly a long and boring procedure since we should apply Taylor's formula. A remarkable property is the following.

Lemma 6.2: Let a generating function χ(p, q) and a function f (p, q) be given. Then the following equality holds true:

(6.5) f(p, q)|_{p=exp(εL_{χ})p^{′}, q=exp(εL_{χ})q^{′}} = exp(εL_{χ})f |_{p=p^{′}, q=q^{′}} .

This lemma has been named by Gröbner the exchange theorem. A few comments are in order.

The claim is that the series expansion in ε of the transformed function may be calculated by applying the exponential operator of the Lie series directly to the function, with no need of making a substitution of variables.^{2} This may appear trivial in view of the following considerations. The substitution of variables in the left member of (6.5) produces a function f^{′}(q^{′}, p^{′}, ε), where ε is the time of the flow. We may expand the function in Taylor series by calculating the derivatives of f^{′} with respect to ε. The right member says essentially that the derivatives with respect to ε are calculated as Lie derivatives with respect to the flow generated by χ. This seems quite obvious.

However, some care is required if we remark that we are combining two operations: in the left member we first make the substitution, and then perform the expansion; in the right member we perform the operations in reverse order. To be rigorous, we should prove that the result is the same. This is evident for a polynomial in view of properties (i) and (ii) of lemma 6.1, and for an analytic function it follows via approximation by polynomials. The complete proof may be found in [42], § I.2.
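The exchange theorem can be checked by hand on a case where the Lie series terminates. The Python sketch below is an illustration under our own assumptions: one degree of freedom, polynomials stored as dictionaries, the generating function χ = p^2/2 and the test function f = q^2 p, with the flow taken at time ε = 1. The helper names are hypothetical.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def exp_lie(chi, f, max_terms=50):
    """Sum of L_chi^s f / s!; exact only if the terms vanish within max_terms."""
    total, term, s = {}, f, 0
    while term and s < max_terms:
        total = add(total, term)
        s += 1
        term = scale(lie(chi, term), Fraction(1, s))
    return total

def power(a, n):
    out = {(0, 0): Fraction(1)}
    for _ in range(n):
        out = mul(out, a)
    return out

def substitute(f, new_q, new_p):
    """Replace q and p in f by the polynomials new_q and new_p."""
    out = {}
    for (i, j), c in f.items():
        out = add(out, scale(mul(power(new_q, i), power(new_p, j)), c))
    return out

chi = {(0, 2): Fraction(1, 2)}          # chi = p^2 / 2
f = {(2, 1): Fraction(1)}               # f = q^2 p
Q, P = {(1, 0): Fraction(1)}, {(0, 1): Fraction(1)}

lhs = substitute(f, exp_lie(chi, Q), exp_lie(chi, P))  # substitute, then expand
rhs = exp_lie(chi, f)                                  # apply the operator to f
print(lhs == rhs, rhs)
```

Here the transformation is q = q′ + p′, p = p′, and both members give q^2 p + 2 q p^2 + p^3, in agreement with (6.5). The right member required only derivatives, sums and products, with no substitution.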

6.2.2 The triangle of Lie series

An elegant and effective representation of the operation of transforming a function is found as follows. Assume, as is typical in perturbation theory, that the function to be transformed is expanded in power series of the parameter ε, namely f(p, q, ε) = f_{0}(p, q) + εf_{1}(p, q) + ε^{2}f_{2}(p, q) + . . . , and that we want to write the transformed function g = exp(εL_{χ})f as a power series g_{0} + εg_{1} + ε^{2}g_{2} + . . . in ε. Working at a formal level we may use the linearity of the Lie series operator, thus writing

g = exp(εL_{χ})f_{0} + ε exp(εL_{χ})f_{1} + ε^{2} exp(εL_{χ})f_{2} + . . . .

2 A further comment on the role of the variables may be useful, if perhaps pedantic. The left member of (6.5) produces a function of the new variables q^{′}, p^{′}. On the other hand, the right member may be calculated by considering both functions χ and f as depending on the old variables q, p, so that the result is a function of q, p. The substitution p = p^{′}, q = q^{′} means that the variables q, p must be simply renamed. The equality must be interpreted in the sense that we take care only of the form of the function, the name of the variables being irrelevant. This fact should be kept in mind when we use the algorithm: do not care about the names of the variables; just transform the functions.

That is, we apply the Lie series to every term of the expansion of f . The action of the operator is represented by the triangular diagram

(6.6)

g_{0}    f_{0}
         ↓
g_{1}    L_{χ}f_{0}              f_{1}
         ↓                       ↓
g_{2}    (1/2!)L_{χ}^{2}f_{0}    L_{χ}f_{1}              f_{2}
         ↓                       ↓                       ↓
g_{3}    (1/3!)L_{χ}^{3}f_{0}    (1/2!)L_{χ}^{2}f_{1}    L_{χ}f_{2}    f_{3}
         ↓                       ↓                       ↓             ↓
...      ...                     ...                     ...           ...

Terms of the same order in ε are aligned on the same line. The calculation may be performed by columns, as indicated by the arrows: if the function f and the generating function χ are known, then every column may be calculated proceeding up–down until the line corresponding to the wanted order in ε is reached. Then it is enough to add together all terms appearing on the same line, and this gives every term of the expansion of g up to the wanted order. Everything not included in the diagram has higher order in ε, and need not be calculated. This is precisely what we mean when saying that g = exp(εL_{χ})f in formal sense.

A compact form of the diagram is given by the recurrent formula

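(6.7) g_{0} = f_{0} , g_{r} = ∑_{j=0}^{r} (1/j!) L_{χ}^{j} f_{r−j} for r > 0 .

The sum starts from j = 0, so that the term f_{r} at the end of line r of the diagram is included.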
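The column-by-column use of the triangular diagram translates directly into a short program. The following Python sketch is an illustration under our own assumptions (one degree of freedom, polynomials stored as dictionaries, hypothetical helper names, and an arbitrarily chosen χ and f); each column below f_{k} is built by repeated application of L_{χ}, and the entries on the same line are added.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def lie_series_triangle(chi, f, r):
    """g_0..g_r from f_0..f_r, with g_s = sum_{j=0}^{s} L_chi^j f_{s-j} / j!."""
    g = [dict() for _ in range(r + 1)]
    for k in range(r + 1):              # column below f_k
        term = f[k]
        for j in range(r - k + 1):      # entry on line k + j
            g[k + j] = add(g[k + j], term)
            term = scale(lie(chi, term), Fraction(1, j + 1))
    return g

chi = {(0, 2): Fraction(1, 2)}                           # chi = p^2 / 2
f = [{(2, 0): Fraction(1)}, {(1, 0): Fraction(1)}, {}]   # f = q^2 + eps q
g = lie_series_triangle(chi, f, 2)
for s, g_s in enumerate(g):
    print(s, g_s)
```

With these choices one finds g_{0} = q^2, g_{1} = 2qp + q and g_{2} = p^2 + p, as can be checked by expanding (q + εp)^2 + ε(q + εp) up to order 2.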

6.2.3 Composition of Lie series

As we have seen, the Lie series defines a near the identity transformation, which in our case is a canonical one. A natural question is whether the converse is also true, i.e., if every canonical transformation can be expressed as a Lie series.

The question may be formulated in more precise terms. We are actually considering a one parameter family of transformations. Suppose that we are given such a family in the form

(6.8) q = q^{′} + εϕ_{1}(p^{′}, q^{′}) + ε^{2}ϕ_{2}(p^{′}, q^{′}) + . . . , p = p^{′} + εψ_{1}(p^{′}, q^{′}) + ε^{2}ψ_{2}(p^{′}, q^{′}) + . . . ,

where the functions ϕ_{1}, ϕ_{2}, . . . and ψ_{1}, ψ_{2}, . . . are assumed to satisfy the necessary conditions for the transformation to be canonical. The question is whether a generating function χ(p^{′}, q^{′}) exists which produces exactly that transformation. The answer in general is negative, the limitation being that we are considering the flow of an autonomous system. However, more general transformations can be constructed by composition of Lie series.

Let us consider a sequence of generating functions χ = {χ_{1}, χ_{2}, χ_{3}, . . .}, and let the sequence of operators S^{(1)}, S^{(2)}, S^{(3)}, . . . be defined by recurrence as

(6.9) S^{(1)} = exp(εL_{χ_{1}}) , S^{(k)} = exp(ε^{k}L_{χ_{k}}) ◦ S^{(k−1)} for k > 1 .

If we work at formal level we interrupt the sequence at k = r for some r > 0, also truncating all expansions at order r; all the rest of the sequence produces only terms of order O(r + 1). If we can prove that the sequence converges to some limit then the limit

(6.10) S_{χ} = . . . ◦ exp(ε^{3}L_{χ_{3}}) ◦ exp(ε^{2}L_{χ_{2}}) ◦ exp(εL_{χ_{1}})

is well defined. This procedure defines the composition of Lie series.

Inverting the operators so defined is not difficult. Let us define the sequence S̃^{(1)}, S̃^{(2)}, S̃^{(3)}, . . . as

(6.11) S̃^{(1)} = exp(−εL_{χ_{1}}) , S̃^{(k)} = S̃^{(k−1)} ◦ exp(−ε^{k}L_{χ_{k}}) for k > 1 .

It is an easy matter to check that the sequence defines the inverse in formal sense, namely

(6.12) S̃^{(r)} ◦ S^{(r)} = Id + O(r + 1) .

If the sequence S̃^{(k)} tends to some limit then we get the inverse of the operator S_{χ}, namely

(6.13) S_{χ}^{−1} = exp(−εL_{χ_{1}}) ◦ exp(−ε^{2}L_{χ_{2}}) ◦ exp(−ε^{3}L_{χ_{3}}) ◦ . . . .

In this case (6.12) is replaced by S_{χ}^{−1} ◦ S_{χ} = Id .

Let us pay a little attention to the triangular diagram (6.6), making clear how the calculation should proceed when a composition of Lie series is considered. Keep in mind that all equalities must be considered in formal sense. Suppose we want to transform the function f = f_{0} + εf_{1} + ε^{2}f_{2} + . . . , which means that we want

g = . . . ◦ exp(ε^{2}L_{χ_{2}}) ◦ exp(εL_{χ_{1}})f .

We first calculate f^{′} = exp(εL_{χ_{1}})f = f_{0}^{′} + εf_{1}^{′} + ε^{2}f_{2}^{′} + . . . as indicated in the diagram (6.6), just writing f^{′} in place of g and χ_{1} in place of χ. Then we calculate f^{′′} = exp(ε^{2}L_{χ_{2}})f^{′} = f_{0}^{′′} + εf_{1}^{′′} + ε^{2}f_{2}^{′′} + . . . by constructing a similar diagram, but paying attention to the correct alignment of the powers of ε. With a moment's thought we realize that the diagram should be represented as

f_{0}^{′′}    f_{0}^{′}
              ↓
f_{1}^{′′}    0                               f_{1}^{′}
              ↓                               ↓
f_{2}^{′′}    L_{χ_{2}}f_{0}^{′}              0                     f_{2}^{′}
              ↓                               ↓                     ↓
f_{3}^{′′}    0                               L_{χ_{2}}f_{1}^{′}    0                     f_{3}^{′}
              ↓                               ↓                     ↓                     ↓
f_{4}^{′′}    (1/2!)L_{χ_{2}}^{2}f_{0}^{′}    0                     L_{χ_{2}}f_{2}^{′}    0            f_{4}^{′}
              ↓                               ↓                     ↓                     ↓            ↓
...           ...                             ...                   ...                   ...          ...

with a number of null elements, because we proceed by powers of ε^{2}, not of ε. Similarly, when we go on with our procedure we shall construct a diagram for exp(ε^{r}L_{χ_{r}}) which contains in every column just one non zero element out of r. A compact formula for the coefficients of g = exp(ε^{r}L_{χ_{r}})f may be given, namely

(6.14) g_{s} = ∑_{j=0}^{k} (1/j!) L_{χ_{r}}^{j} f_{s−jr} , k = ⌊s/r⌋ ,

where ⌊x⌋ denotes the maximal integer that does not exceed x.
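Formula (6.14) may be coded in a few lines. The Python sketch below is an illustration under our own assumptions (one degree of freedom, polynomials stored as dictionaries, hypothetical helper names); it fills the columns of the diagram for exp(ε^{r}L_{χ_{r}}), skipping the null entries by stepping r lines at a time.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def apply_exp(chi_r, r, f, order):
    """Coefficients g_0..g_order of exp(eps^r L_chi_r) f, following (6.14)."""
    g = [dict() for _ in range(order + 1)]
    for s in range(order + 1):          # column below f_s
        term, j = f[s], 0
        while s + j * r <= order:       # non-null entries sit on lines s + j r
            g[s + j * r] = add(g[s + j * r], term)
            j += 1
            term = scale(lie(chi_r, term), Fraction(1, j))
    return g

chi_2 = {(0, 2): Fraction(1, 2)}                                   # chi_2 = p^2 / 2
f1 = [{(2, 0): Fraction(1)}, {(1, 0): Fraction(1)}, {}, {}, {}]    # f' = q^2 + eps q
f2 = apply_exp(chi_2, 2, f1, 4)
for s, c in enumerate(f2):
    print(s, c)
```

For this choice the lines of the diagram give f_{0}^{′′} = q^2, f_{1}^{′′} = q, f_{2}^{′′} = 2qp, f_{3}^{′′} = p and f_{4}^{′′} = p^2, which is the expansion of (q + ε^{2}p)^2 + ε(q + ε^{2}p) truncated at order 4.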

Pedantically, let us see in some more detail how to proceed in a practical calculation, keeping in mind that we shall unavoidably truncate the series to some order ε^{r}, having chosen r ≥ 1 in a way that we consider suitable.

Let us pick a truncated function f = f_{0} + εf_{1} + . . . + ε^{r}f_{r} and suppose that we want to construct the transformed function up to degree r in ε. With a little attention we realize that it is enough to construct every diagram until we reach the line corresponding to the power ε^{r}, and in particular we need to know only the generating functions χ_{1}, . . . , χ_{r}. Indeed, this includes all terms up to order ε^{r}, and neglects everything of higher order. Similar considerations apply to the inverse transformation: it is enough to consider only the action of the operator S̃^{(r)} defined by (6.11).

A fact that may raise some perplexity in an actual calculation is the following. Suppose that we have constructed the generating functions χ_{1}, . . . , χ_{r}, so that we know how to construct the operators S_{χ}^{(r)} and S̃_{χ}^{(r)}. Let us calculate the transformation

q = S_{χ}^{(r)}q^{′} , p = S_{χ}^{(r)}p^{′}

up to degree r in ε. This will give us expressions such as

(6.15) q = q^{′} + εϕ_{1}(p^{′}, q^{′}) + . . . + ε^{r}ϕ_{r}(p^{′}, q^{′}) , p = p^{′} + εψ_{1}(p^{′}, q^{′}) + . . . + ε^{r}ψ_{r}(p^{′}, q^{′}) ,

where the functions ϕ_{1}(p^{′}, q^{′}), . . . , ϕ_{r}(p^{′}, q^{′}) and ψ_{1}(p^{′}, q^{′}), . . . , ψ_{r}(p^{′}, q^{′}) may be explicitly calculated. We may then consider the inverse transformation

q^{′} = S̃_{χ}^{(r)}q , p^{′} = S̃_{χ}^{(r)}p .

This will give us the expressions

(6.16) q^{′} = q + εϕ̃_{1}(p, q) + . . . + ε^{r}ϕ̃_{r}(p, q) , p^{′} = p + εψ̃_{1}(p, q) + . . . + ε^{r}ψ̃_{r}(p, q) ,

which could be explicitly calculated. Suppose now that we substitute the expressions (6.16) into (6.15). By lemma 6.2 (the exchange theorem) this is equivalent to applying the operator S̃_{χ}^{(r)} to the functions in the right member of (6.15). We expect that this will give us the identity, which is definitely true if we consider the infinite series. If we work with truncated series at order r we shall realize that the result is the identity up to a term of order ε^{r+1}. This is the best we can expect in a practical calculation, and it is coherent with the rules of formal calculus.
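The statement that the truncated direct and inverse operators compose to the identity up to order r can be tested on an example. The Python sketch below is an illustration under our own assumptions: one degree of freedom, polynomials stored as dictionaries, hypothetical helper names, and three arbitrarily chosen generating functions χ_{1} = q^2 p, χ_{2} = p^3, χ_{3} = qp. Series in ε are stored as lists of coefficients, truncated at the chosen order.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def apply_exp(chi, k, f, order):
    """exp(eps^k L_chi) applied to f = [f_0, ..., f_order], truncated at eps^order."""
    g = [dict() for _ in range(order + 1)]
    for s in range(order + 1):
        term, j = f[s], 0
        while s + j * k <= order:
            g[s + j * k] = add(g[s + j * k], term)
            j += 1
            term = scale(lie(chi, term), Fraction(1, j))
    return g

def S(chis, f, order):
    """S^(r) = exp(eps^r L_chi_r) o ... o exp(eps L_chi_1), as in (6.9)."""
    for k, chi in enumerate(chis, start=1):
        f = apply_exp(chi, k, f, order)
    return f

def S_inv(chis, f, order):
    """S~^(r) = exp(-eps L_chi_1) o ... o exp(-eps^r L_chi_r), as in (6.11)."""
    for k in range(len(chis), 0, -1):
        f = apply_exp(scale(chis[k - 1], -1), k, f, order)
    return f

order = 3
chis = [{(2, 1): Fraction(1)}, {(0, 3): Fraction(1)}, {(1, 1): Fraction(1)}]
q = [{(1, 0): Fraction(1)}] + [dict() for _ in range(order)]
forward = S(chis, q, order)
back = S_inv(chis, forward, order)
print(forward[1], back == q)
```

The first order term of the direct transformation is L_{χ_{1}}q = q^2, and applying the truncated inverse to the truncated direct series gives back q with null coefficients up to order 3; whatever has been dropped is of order ε^{4} or higher.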

Lemma 6.3: Let a near the identity canonical transformation in the form (6.8) be given. Then there exists a sequence χ = {χ_{1}(p^{′}, q^{′}), χ_{2}(p^{′}, q^{′}), . . .} of generating functions such that

q = S_{χ}q^{′} , p = S_{χ}p^{′} .

Proof. Let us see how the sequence χ is constructed, step by step. In the formal approach, we want to determine χ_{1} so that^{3}

(6.17)
exp(εL_{χ_{1}})q_{j} − (q_{j} + εϕ_{1,j}) = O(2) ,
exp(εL_{χ_{1}})p_{j} − (p_{j} + εψ_{1,j}) = O(2) .

The condition for the transformation (6.8) to be canonical (in formal sense at order 1) is

{q_{j} + εϕ_{1,j}, q_{k} + εϕ_{1,k}} = {q_{j}, q_{k}} + ε{q_{j}, ϕ_{1,k}} + ε{ϕ_{1,j}, q_{k}} = O(2) ,
{p_{j} + εψ_{1,j}, p_{k} + εψ_{1,k}} = {p_{j}, p_{k}} + ε{p_{j}, ψ_{1,k}} + ε{ψ_{1,j}, p_{k}} = O(2) ,
{q_{j} + εϕ_{1,j}, p_{k} + εψ_{1,k}} = {q_{j}, p_{k}} + ε{q_{j}, ψ_{1,k}} + ε{ϕ_{1,j}, p_{k}} = δ_{j,k} + O(2) .

Hence we ask

∂ϕ_{1,k}/∂p_{j} − ∂ϕ_{1,j}/∂p_{k} = 0 , −∂ψ_{1,k}/∂q_{j} + ∂ψ_{1,j}/∂q_{k} = 0 , ∂ψ_{1,k}/∂p_{j} + ∂ϕ_{1,j}/∂q_{k} = 0 .

That is, there is a function χ_{1}(q, p) such that

ϕ_{1,j} = ∂χ_{1}/∂p_{j} , ψ_{1,j} = −∂χ_{1}/∂q_{j} ,

so that χ_{1} is determined by quadrature, and (6.17) is satisfied. Suppose now that we have determined {χ_{1}, . . . , χ_{r−1}} so that

exp(ε^{r−1}L_{χ_{r−1}}) ◦ . . . ◦ exp(εL_{χ_{1}})q_{j} − (q_{j} + εϕ_{1,j} + . . . + ε^{r−1}ϕ_{r−1,j}) − ε^{r}ϕ^{′}_{r,j} = O(r + 1) ,
exp(ε^{r−1}L_{χ_{r−1}}) ◦ . . . ◦ exp(εL_{χ_{1}})p_{j} − (p_{j} + εψ_{1,j} + . . . + ε^{r−1}ψ_{r−1,j}) − ε^{r}ψ^{′}_{r,j} = O(r + 1) .

3 The reader will notice that the argument below is essentially a reinterpretation of the near the identity canonical transformations that we have seen in example 2.12.

Remark that the expressions ϕ^{′}_{r,j} and ψ^{′}_{r,j} can be explicitly determined. On the other hand they must satisfy the conditions for canonicity in formal sense at order r. The calculation requires some patience and attention: one should exploit the fact that canonicity up to order r − 1 follows from the construction, for the transformation generated by a Lie series is canonical. We find that the canonicity conditions are, again,

∂ϕ^{′}_{r,k}/∂p_{j} − ∂ϕ^{′}_{r,j}/∂p_{k} = 0 , −∂ψ^{′}_{r,k}/∂q_{j} + ∂ψ^{′}_{r,j}/∂q_{k} = 0 , ∂ψ^{′}_{r,k}/∂p_{j} + ∂ϕ^{′}_{r,j}/∂q_{k} = 0 .

Thus, the function χ_{r} can be determined by quadrature. We conclude that the sequence can be constructed up to any wanted order, as claimed. Q.E.D.

6.3 Lie transform

The algorithm of Lie series may be generalized to the case of a time–dependent generating function χ(p, q, t), so that the corresponding time evolution is due to the flow of a non–autonomous Hamiltonian. This leads to an algorithm different from (6.3).

As already said at the beginning of the chapter, several different formulæ have been devised in order to give the Lie transform an algorithmic recurrent form: everybody, of course, has his favourite one. To make a definite choice, the algorithm used here is the favourite one of the author. It is related to the “algorithm of the inverse”, found by Henrard [48].

Hereafter particular attention will be paid to the algebraic aspect of the method, leaving somehow hidden the relation with a canonical flow. In particular the algorithm will be formulated after setting ε = 1.

Consider a generating sequence χ = {χ_{s}}_{s≥1} of analytic functions on the phase space. The Lie transform operator T_{χ} is defined as

(6.18) T_{χ} = ∑_{s≥0} E_{s} ,

where the sequence {E_{s}}_{s≥0} of operators is recurrently defined as

(6.19) E_{0} = Id , E_{s} = ∑_{j=1}^{s} (j/s) L_{χ_{j}}E_{s−j} .

A coordinate transformation defined by means of Lie transform is written as

(6.20)
q = T_{χ}q^{′} = q^{′} + L_{χ_{1}}q^{′} + ( (1/2)L_{χ_{1}}^{2}q^{′} + L_{χ_{2}}q^{′} ) + . . . ,
p = T_{χ}p^{′} = p^{′} + L_{χ_{1}}p^{′} + ( (1/2)L_{χ_{1}}^{2}p^{′} + L_{χ_{2}}p^{′} ) + . . . .

This is a canonical transformation (see lemma 6.4 below).

A direct connection with Lie series comes out by considering the case χ = {χ_{1}, 0, 0, . . .}, namely a generating sequence containing only the first term. Then the Lie transform generated by χ coincides with the Lie series generated by χ_{1}, i.e., T_{χ} = exp(L_{χ_{1}}) .
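The recurrence (6.19) is easily turned into a program. The Python sketch below is an illustration under our own assumptions (one degree of freedom, polynomials stored as dictionaries, hypothetical helper names, arbitrarily chosen generating functions); it computes E_{0}f, . . . , E_{r}f and checks both the second order term of (6.20) and the reduction to a Lie series when only χ_{1} is present.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def lie_transform_terms(chis, f, order):
    """[E_0 f, ..., E_order f] with E_s = sum_{j=1}^{s} (j/s) L_{chi_j} E_{s-j}."""
    E = [f]
    for s in range(1, order + 1):
        total = {}
        for j in range(1, s + 1):
            chi_j = chis[j - 1] if j <= len(chis) else {}   # missing chi_j = 0
            total = add(total, scale(lie(chi_j, E[s - j]), Fraction(j, s)))
        E.append(total)
    return E

Q = {(1, 0): Fraction(1)}

# Two generating functions (hypothetical choice): chi_1 = q^2 p, chi_2 = p^3.
E_two = lie_transform_terms([{(2, 1): Fraction(1)}, {(0, 3): Fraction(1)}], Q, 2)

# Only chi_1 = (q^2 + p^2)/2: the terms must be L^s q / s!, as for a Lie series.
osc = {(2, 0): Fraction(1, 2), (0, 2): Fraction(1, 2)}
E_one = lie_transform_terms([osc], Q, 3)
print(E_two[2], E_one)
```

In the first case E_{2}q = (1/2)L_{χ_{1}}^{2}q + L_{χ_{2}}q = q^3 + 3p^2, in agreement with (6.20); in the second case the terms are q, p, −q/2, −p/6, i.e., the Lie series of the harmonic oscillator at ε = 1.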

Lemma 6.4: The Lie transform T_{χ} defined by (6.18) and (6.19) has the following properties.

(i) Linearity: for any pair f, g of functions and for α ∈ R we have

T_{χ}(f + g) = T_{χ}f + T_{χ}g , T_{χ}(αf) = αT_{χ}f .

(ii) Conservation of product: for any pair f, g of functions we have

T_{χ}(f · g) = T_{χ}f · T_{χ}g .

(iii) Conservation of Poisson brackets: for any pair f, g of functions we have

T_{χ}{f, g} = {T_{χ}f, T_{χ}g} .

For a proof see, e.g., [32]. The property (iii) is particularly relevant, since it means that the coordinate transformation (6.20) is canonical.

The reader will notice that the Lie transform has the same formal properties as the Lie series. This is true also for the exchange theorem, as stated by the following

Lemma 6.5: Let the generating sequence χ = {χ_{s}}_{s≥1} and a function f(p, q) be given. We have

(6.21) f(p, q)|_{p=T_{χ}p^{′}, q=T_{χ}q^{′}} = T_{χ}f |_{p=p^{′}, q=q^{′}} .

The proof may be obtained using the linearity and the conservation of products.

This makes the claim evident for polynomials, and the result may be extended to
analytic functions.^{4}

6.3.1 The triangular diagram for the Lie transform

The calculation of a Lie transform has a nice graphical representation similar to that of Lie series. Suppose that we are given a function f = f_{0} + f_{1} + . . . , and denote by g = g_{0} + g_{1} + . . . its transformed function g = T_{χ}f. Using the linearity of the Lie transform we can apply T_{χ} separately to every term of f. It is useful to rearrange terms according to the triangular diagram

(6.22)

g_{0}    f_{0}
         ↓
g_{1}    E_{1}f_{0}    f_{1}
         ↓             ↓
g_{2}    E_{2}f_{0}    E_{1}f_{1}    f_{2}
         ↓             ↓             ↓
g_{3}    E_{3}f_{0}    E_{2}f_{1}    E_{1}f_{2}    f_{3}
         ↓             ↓             ↓             ↓
...      ...           ...           ...           ...

where terms of the same order appear on the same line. Remark that the operator T_{χ} acts by columns, as indicated by the arrows: the knowledge of f_{j} and of the generating sequence allows one to construct the whole column below f_{j}. Thus, the first line gives g_{0} = f_{0}, the second line gives g_{1} = E_{1}f_{0} + f_{1}, and so on. This shows how to practically perform the transformation.^{5} Again, truncating the diagram at line r allows us to calculate all contributions up to order r and forget everything of higher order, which is coherent with the rules of formal calculus.

4 See the remarks made for the Lie series, including the note 2 in this chapter.

Finding the inverse of the Lie transform may appear definitely more complicated than for Lie series. However, with a bit of attention we may realize that it is just a matter of using the triangular diagram (6.22) in a skilful manner. Assume that g is given, and f is unknown. Then, the first line gives f_{0} = g_{0}; having determined f_{0}, all the column below f_{0} can be constructed, and the second line gives immediately f_{1} = g_{1} − E_{1}f_{0}; having determined f_{1}, all the corresponding column can be constructed, so that f_{2} can be determined from the third line as f_{2} = g_{2} − E_{2}f_{0} − E_{1}f_{1}, and so on.
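The inversion by means of the triangular diagram may be sketched in Python as follows. The code is an illustration under our own assumptions (one degree of freedom, polynomials stored as dictionaries, hypothetical helper names, arbitrarily chosen generating functions and test function): the direct transformation fills the diagram by columns, and the inverse determines f_{0}, f_{1}, . . . line by line, subtracting each known column from the lines below.

```python
from fractions import Fraction

# Polynomials in (q, p) are dicts {(i, j): c} standing for c * q^i * p^j.
def add(a, b, sign=1):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def scale(a, c):
    return {k: c * v for k, v in a.items() if c * v != 0}

def lie(chi, f):
    """L_chi f = {f, chi} = df/dq dchi/dp - df/dp dchi/dq."""
    dq = lambda a: {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}
    dp = lambda a: {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}
    return add(mul(dq(f), dp(chi)), mul(dp(f), dq(chi)), sign=-1)

def column(chis, f, order):
    """[E_0 f, ..., E_order f] from the recurrence (6.19)."""
    E = [f]
    for s in range(1, order + 1):
        total = {}
        for j in range(1, min(s, len(chis)) + 1):
            total = add(total, scale(lie(chis[j - 1], E[s - j]), Fraction(j, s)))
        E.append(total)
    return E

def transform(chis, f, order):
    """g = T_chi f up to the given order: add the entries of every line."""
    g = [dict() for _ in range(order + 1)]
    for k in range(order + 1):
        for j, term in enumerate(column(chis, f[k], order - k)):
            g[k + j] = add(g[k + j], term)
    return g

def inverse_transform(chis, g, order):
    """Recover f from g = T_chi f: f_k is what is left on line k."""
    rows, f = [dict(x) for x in g[:order + 1]], []
    for k in range(order + 1):
        f.append(rows[k])                       # f_k = g_k - known entries
        col = column(chis, rows[k], order - k)  # column below f_k
        for j in range(1, len(col)):
            rows[k + j] = add(rows[k + j], col[j], sign=-1)
    return f

chis = [{(2, 1): Fraction(1)}, {(0, 3): Fraction(1)}]          # q^2 p, p^3
f = [{(1, 0): Fraction(1)}, {(0, 1): Fraction(1)}, {}, {}]     # f_0 = q, f_1 = p
g = transform(chis, f, 3)
f_back = inverse_transform(chis, g, 3)
print(g[1], f_back == f)
```

Here g_{1} = E_{1}f_{0} + f_{1} = q^2 + p, and the inversion returns exactly the coefficients f_{0}, . . . , f_{3} we started from, since every step only uses lines of the diagram that are already complete.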

5 This scheme may look mysterious to a reader who is not familiar with the methods of expansion of perturbation theory. Let me try to clarify this matter. Introduce an expansion parameter, ε say, and write εχ_{1}, ε^{2}χ_{2}, . . . in place of χ_{1}, χ_{2}, . . . . Then the Lie transform applied to a generic function f (independent of the expansion parameter ε) generates in a natural way a transformed function expanded in power series of ε. Indeed, in view of L_{εχ_{1}} = εL_{χ_{1}}, L_{ε^{2}χ_{2}} = ε^{2}L_{χ_{2}}, . . . one easily sees that the operator E_{s} produces a factor ε^{s}. Let now f itself be a series in ε, namely f = f_{0} + εf_{1} + ε^{2}f_{2} + . . . , and look for the ε–expansion g = g_{0} + εg_{1} + ε^{2}g_{2} + . . . of its transformed function g = T_{χ}f. Then the triangular diagram (6.22) is easily constructed by putting on the same line the functions that have the same power of ε as coefficient. This should make the whole procedure natural. At this point, just set ε = 1 and leave everything in its place, remarking that the indexes play the role of the exponents of ε. Indeed, this is not just a formal game. We may consider the index s of a function, in our notation, as indicating that the function is “of order s” in some sense. The quantitative theory will be responsible for giving the expression a definite meaning, e.g., by assuring that with increasing values of the index s the size of the function decreases in some regular manner.

There is also an explicit formula for the inverse, namely

(6.23) T_{χ}^{−1} = ∑_{s≥0} D_{s} ,

where

(6.24) D_{0} = Id , D_{s} = −∑_{j=1}^{s} (j/s) D_{s−j}L_{χ_{j}} .

However, this expression is actually useless for a practical computation: the algorithm described above is much more efficient. Nevertheless, the explicit recurrent formula is useful for quantitative estimates.
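As a consistency check of the recurrent formula for the inverse, one may verify by hand that the terms of order 1 and 2 of the product of the two series ∑ D_{s} and ∑ E_{s} vanish. The lines below are our own worked example, obtained by a direct application of the recurrences (6.19) and (6.24).

```latex
% First two operators from (6.19) and (6.24)
E_1 = L_{\chi_1}, \qquad E_2 = \tfrac12 L_{\chi_1}^2 + L_{\chi_2}, \qquad
D_1 = -L_{\chi_1}, \qquad D_2 = \tfrac12 L_{\chi_1}^2 - L_{\chi_2} .
% Order 1 of T_\chi^{-1} T_\chi
D_1 + E_1 = 0 ,
% Order 2 of T_\chi^{-1} T_\chi
D_2 + D_1 E_1 + E_2
  = \bigl(\tfrac12 L_{\chi_1}^2 - L_{\chi_2}\bigr) - L_{\chi_1}^2
    + \bigl(\tfrac12 L_{\chi_1}^2 + L_{\chi_2}\bigr) = 0 .
```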

The property that makes the Lie transform quite useful is expressed by the following

Lemma 6.6: Let a canonical transformation be given in the form (6.8). Then there
exists a generating sequence χ1(p^{′}, q^{′}), χ2(p^{′}, q^{′}), . . . such that we have

p = Tχp^{′} , q = Tχq^{′} .

Proof. We want T_{χ}p_{j} and T_{χ}q_{j} to coincide with the corresponding expressions of p_{j} and q_{j}, respectively, in (6.8) (recall that the name of the variables is irrelevant). Recalling the definition of T_{χ} we actually write the infinite system

E_{s}p_{j} = ψ_{s,j} , E_{s}q_{j} = ϕ_{s,j} , s ≥ 1 .

For s = 1 the equations take the simple form L_{χ_{1}}p_{j} = ψ_{1,j} and L_{χ_{1}}q_{j} = ϕ_{1,j}, i.e.,

−∂χ_{1}/∂q_{j} = ψ_{1,j} , ∂χ_{1}/∂p_{j} = ϕ_{1,j} .

Here we proceed as in the proof of lemma 6.3: the canonicity conditions give

∂ϕ_{1,k}/∂p_{j} − ∂ϕ_{1,j}/∂p_{k} = 0 , −∂ψ_{1,k}/∂q_{j} + ∂ψ_{1,j}/∂q_{k} = 0 , ∂ψ_{1,k}/∂p_{j} + ∂ϕ_{1,j}/∂q_{k} = 0 ,

so that χ_{1} is determined by quadratures. For s > 1, using the expression (6.19) of E_{s} and separating the last term of the sum we write the equations as

L_{χ_{s}}p_{j} + ∑_{l=1}^{s−1} (l/s) L_{χ_{l}}E_{s−l}p_{j} = ψ_{s,j} , L_{χ_{s}}q_{j} + ∑_{l=1}^{s−1} (l/s) L_{χ_{l}}E_{s−l}q_{j} = ϕ_{s,j} .

Proceeding by induction, remark that the sum is determined by χ_{1}, . . . , χ_{s−1}, so that the equations read

L_{χ_{s}}p_{j} = ψ_{s,j} − F_{s,j} , L_{χ_{s}}q_{j} = ϕ_{s,j} + G_{s,j}

with known functions F, G. On the other hand, we should recall that the transformation generated by the truncated sequence χ^{(s−1)} = {χ_{1}, . . . , χ_{s−1}} is canonical, so that the right members must satisfy the canonicity conditions. Proceeding again as in the proof of lemma 6.3 we conclude that χ_{s} may be determined. Q.E.D.

It is mandatory here to emphasize that the sequence of generating functions does not coincide with the sequence that generates the same transformation by composition of Lie series. This is because Lie derivatives do not commute, in general.

6.4 Analytic framework

It is now time to introduce the technical tools that will allow us to investigate the convergence or the asymptotic properties of perturbation series. The methods presented here are in fact a variation on the classical method of majorants due to Cauchy.

6.4.1 Cauchy estimates

Consider an open disk ∆_{ϱ}(0), with ϱ > 0, centered at the origin of the complex plane C. Consider a function f analytic and bounded on the closure of the disk ∆_{ϱ}(0). The supremum norm |f|_{ϱ} of f in the domain ∆_{ϱ}(0) is defined as

(6.25) |f|_{ϱ} = sup_{z∈∆_{ϱ}(0)} |f(z)| .

The estimate of Cauchy for the derivative f^{′} of f at the origin states that

|f^{′}(0)| ≤ (1/ϱ) |f|_{ϱ} .

More generally, for the s–th derivative f^{(s)} one has the estimate

|f^{(s)}(0)| ≤ (s!/ϱ^{s}) |f|_{ϱ} .

For instance, let ϱ = 1, and consider the function f(z) = z^{s}. It is an easy matter to check that |f|_{1} = 1 and f^{(s)}(0) = s!, so that Cauchy's estimate |f^{(s)}(0)| ≤ s! holds with the equality sign. This shows that the estimate cannot be improved in general. The proof of the inequalities above is an easy consequence of Cauchy's formula

f^{(s)}(z) = (s!/2πi) ∮ f(ζ)/(ζ − z)^{s+1} dζ .

Indeed, writing the contour of the disk as ζ = ϱe^{iϑ}, so that |dζ| = ϱ dϑ, a straightforward calculation gives

|f^{(s)}(0)| ≤ (s!/2π) ∮ (|f(ζ)|/ϱ^{s+1}) |dζ| ≤ (s!/(2πϱ^{s})) |f|_{ϱ} ∫_{0}^{2π} dϑ = (s!/ϱ^{s}) |f|_{ϱ} .
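Cauchy's estimate is easily checked numerically. The Python sketch below is an illustration under our own assumptions: the supremum norm is approximated by sampling the boundary circle (which is legitimate by the maximum principle), and the derivative at the origin is approximated by applying the trapezoidal rule to Cauchy's formula. The function names and the number of sample points are arbitrary choices.

```python
import cmath
import math

def circle_points(rho, samples):
    """Equally spaced points on the circle |zeta| = rho."""
    return [rho * cmath.exp(2j * math.pi * k / samples) for k in range(samples)]

def sup_norm(f, rho, samples=2000):
    """Approximate |f|_rho by the maximum of |f| over sampled boundary points."""
    return max(abs(f(z)) for z in circle_points(rho, samples))

def derivative_at_zero(f, s, rho, samples=2000):
    """Approximate f^(s)(0) = s!/(2 pi) * integral of f(zeta)/zeta^s d(theta)."""
    total = sum(f(z) / z**s for z in circle_points(rho, samples))
    return math.factorial(s) * total / samples

def cauchy_bound(f, s, rho):
    """Right member of Cauchy's estimate: s! |f|_rho / rho^s."""
    return math.factorial(s) * sup_norm(f, rho) / rho**s

s, rho = 3, 2.0
d_exp = derivative_at_zero(cmath.exp, s, rho)       # exact value is 1
b_exp = cauchy_bound(cmath.exp, s, rho)             # 3! e^2 / 2^3

cube = lambda z: z**3
d_cube = derivative_at_zero(cube, 3, 1.0)           # exact value is 3! = 6
b_cube = cauchy_bound(cube, 3, 1.0)                 # equality case

print(abs(d_exp), b_exp, abs(d_cube), b_cube)
```

For the exponential the estimate is satisfied with a wide margin, while for f(z) = z^{3} on the unit disk the two members coincide, as remarked in the text.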

The case of n variables requires a straightforward extension. Let the domain ∆_{ϱ}(0) be the polydisk of radius ϱ centered at the origin of C^{n}, namely

(6.26) ∆_{ϱ}(0) = {z ∈ C^{n} : |z| < ϱ} ,

where |z| = max_{j}|z_{j}| is the ℓ_{∞} norm on C^{n}. This is nothing but the Cartesian product of complex disks of radius ϱ in the complex plane. Define the supremum norm of an