NEAR THE IDENTITY
CANONICAL TRANSFORMATIONS
This part should be considered as an intermezzo. It has a rather technical character, and can be seen as an extension of the chapter on canonical transformations, since it deals with transformations close to the identity. The aim is to introduce some useful tools for developing a constructive perturbation scheme based on the algorithms of Lie series and of Lie transform, which will be used as substitutes for the classical method of constructing near the identity canonical transformations via generating functions in mixed variables.
The usefulness of Lie transform methods relies on two remarks. First, one avoids the cumbersome procedure of inversion of functions and of substitution of variables.
Second, completing the formal methods based on Lie transforms with quantitative estimates — as typically required by modern perturbation theory — is not really difficult.
Concerning the first remark, one may easily notice that the operations of inversion of functions and substitution of variables are basically simple, and so introduce no real difficulty from the point of view of Analysis. Moreover, most of the rigorous methods of perturbation theory available in the current literature are based on powerful analytical tools such as the implicit function theorem and fixed point theorems, to quote just the most common ones. The drawback of these methods is that they are not constructive and not easily implemented in explicit calculations, in particular if one plans to perform series expansions up to a quite high order in a parameter. For instance, the first few steps of an inversion are indeed quite simple, and often faster than the expansions required by Lie transforms. But one soon realizes that pushing the typical expansions of perturbation theory to higher orders becomes a really cumbersome and time-consuming procedure. In contrast, the methods based on Lie transforms allow us to express all the series expansions that we need in a very straightforward manner, using only algebraic operations that are easily implemented on computers: sums, products and derivatives.
The second remark is related to an apparent defect of Lie transform methods in algorithmic form: they are often used only as formal tools, skipping all questions
related to convergence of the expansions. We could also say that there is a gap between the formal methods of perturbation theory that have been used by astronomers since the times of Euler and Lagrange and the rigorous methods that are now widely used in the framework of Analysis. The ambitious attempt of the present notes is to fill, at least partially, that gap.
It may be interesting to add a few historical remarks. The introduction of Lie series as a tool for writing the solutions of differential equations goes back to Marius Sophus Lie. The field of Lie algebras has been developed on the basis of his work;
however, this matter will not be discussed in the present notes. The usefulness of Lie series in numerical calculations was pointed out by Wolfgang Gröbner in a series of papers, starting in 1957. The book [42], published in 1967, contains a complete theory including the discussion of the convergence of Lie series for analytic vector fields, when all functions are expressed as power series. The expansions are then used as a tool for numerically solving differential equations. It should be remarked that the approximation of a solution by Lie series has recently been used also as a tool for constructing symplectic integration algorithms, which are particularly useful for numerically investigating the dynamics of Hamiltonian systems.
The application of Lie series methods to perturbation expansions in Celestial Mechanics was first proposed by Gen-Ichiro Hori [50] and André Deprit [27], who generalized the method by introducing Lie transforms and producing explicit algorithms. Many other papers have been devoted to algorithms based on Lie transforms, mainly for applications in Celestial Mechanics. An attempt to collect several such algorithms may be found in the papers [48] and [49] by Jacques Henrard, where extensive references may be found. However, all these papers were concerned only with the formal aspects, while convergence problems were neglected. An extension of the rigorous result of Gröbner to the Hamiltonian case, including the use of action-angle variables, was first given in [32].
The exposition here will be concerned only with the application of Lie methods in a Hamiltonian framework, often considering the particular case of action-angle variables. This is indeed what we need in order to establish a working background for the rest of the present notes. We remark, however, that a similar theory may be developed also for general systems of differential equations.
6.1 Formal expansions
Let us start with a short discussion concerning the meaning to be assigned to the expression formal calculus. In chapter 5 we have used it in an intuitive, perhaps vague, sense, just saying: “we disregard questions relative to convergence”. Let us try to make the last sentence a little more precise, with the help of Poincaré.
The easiest and most natural case is concerned with polynomial expansions, or formal power series, that we have used in chapter 5 in order to construct a first integral for a Hamiltonian in a neighbourhood of an (elliptic) equilibrium. Following Poincaré ([87], Tome II, ch. VIII), we may introduce the concept of formal expansion as follows. Consider the truncated expansion $\Phi = \Phi_0 + \ldots + \Phi_r$, i.e., a non homogeneous polynomial of degree r + 1. We say that the equation $\{H, \Phi\} = 0$ is satisfied in formal sense if by replacing the truncated polynomial we get
(6.1) $\{H, \Phi\} = O(|(x, y)|^{r+3})$ ,
i.e., the difference is of degree at least r + 3, that we have identified as order r + 1. Let us call this difference the remainder. Henceforth we shall denote as O(r + 1) the remainder of order r + 1, in whatever sense. This is essentially what we did in chapter 5, and the propositions stated there should be interpreted in this sense.
A similar attitude may be taken when considering expansions in a small parameter ε, as considered in chapter 4: just interpret order O(r + 1) as meaning: terms that have a power $\varepsilon^{r+1}$ (or higher) as a factor. More elaborate cases may occur, as we shall see in the rest of the present notes.
Such an attitude turns out to be useful in view of the fact, known after Poincaré, that most series usually considered in perturbation theory are not convergent. On the other hand there is a long standing and successful tradition in Celestial Mechanics (or Astronomy, as it was called in the past) based on the use of the first order terms of a series expansion, disregarding the rest of the series: the problem of convergence is not even mentioned, in most cases.
This state of affairs has been splendidly illustrated by Poincaré (see [87], Tome II, ch. VIII).
“There is a kind of misunderstanding between geometers and astronomers about the meaning of the word convergence. Geometers, preoccupied with perfect rigour and often too indifferent to the length of the inextricable calculations whose possibility they conceive, without thinking of actually undertaking them, say that a series is convergent when the sum of its terms tends to a definite limit, even if the first terms decrease very slowly. Astronomers, on the contrary, are accustomed to saying that a series converges when the first twenty terms, for instance, decrease very rapidly, even if the following terms should increase indefinitely.
Thus, to take a simple example, consider the two series whose general terms are
$\frac{1000^n}{1 \cdot 2 \cdot 3 \cdots n}$ and $\frac{1 \cdot 2 \cdot 3 \cdots n}{1000^n}$ .
Geometers will say that the first series converges, and even that it converges rapidly, because the millionth term is much smaller than the 999 999th; but they will regard the second as divergent, because the general term can grow beyond all bounds.
Astronomers, on the contrary, will regard the first series as divergent, because the first terms are increasing; and the second as convergent, because the first terms are decreasing and this decrease is at first very rapid.
Both rules are legitimate: the first in theoretical research; the second in numerical applications. Both must reign, but in two separate domains whose boundaries it is important to know well.
. . . .
The first example that clearly showed the legitimacy of certain divergent expansions is the classical example of Stirling's series.
Cauchy showed that the terms of this series first decrease and then increase, so that the series diverges; but if one stops at the smallest term, one represents the Eulerian function with an approximation that is the better the larger the argument.”
An example of the usefulness of a divergent series has been given in chapter 5, where an exponential stability estimate has been obtained for an elliptic equilibrium.
A similar but more general application is the theorem of Nekhoroshev that will be the subject of chapter 8. In chapter 7 the method of Lie series will be used in order to prove the theorem of Kolmogorov on the existence of invariant tori for perturbed systems. In all cases it will be essential to implement a scheme of quantitative analytical estimates.
6.2 Lie series
This section is devoted to the formal aspects of the theory of Lie series. All questions concerning convergence will be neglected here and deferred to sections 6.5 and 6.6.
Particular care will instead be devoted to the formulation of constructive algorithms.
In this chapter all functions will be assumed to be holomorphic.
On a 2n-dimensional phase space endowed with canonical coordinates p, q, consider a holomorphic function χ(p, q), that will be called a generating function. Consider also the Lie derivative
(6.2) $L_\chi \cdot = \{\cdot, \chi\}$ ,
i.e., the time derivative along the Hamiltonian vector field generated by χ. It plays a basic role throughout the chapter.
6.2.1 The Lie series operator
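The Lie series operator is defined as the exponential of $\varepsilon L_\chi$, namely
(6.3) $\exp(\varepsilon L_\chi) = \sum_{s \ge 0} \frac{\varepsilon^s}{s!} L_\chi^s$
(see, e.g., [42]). It represents the time-ε evolution of the canonical flow generated by the autonomous Hamiltonian χ (equivalently, the time-one evolution of the flow generated by εχ).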
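The basic use of Lie series is the explicit expression of the canonical flow. Consider χ(p′, q′) as a function of the new variables q′, p′. Since $L_\chi p' = -\partial\chi/\partial q'$ and $L_\chi q' = \partial\chi/\partial p'$, we may write
(6.4)
$p = \exp(\varepsilon L_\chi) p' = p' - \varepsilon \frac{\partial\chi}{\partial q'}\Big|_{p',q'} - \frac{\varepsilon^2}{2} L_\chi \frac{\partial\chi}{\partial q'}\Big|_{p',q'} + \ldots$ ,
$q = \exp(\varepsilon L_\chi) q' = q' + \varepsilon \frac{\partial\chi}{\partial p'}\Big|_{p',q'} + \frac{\varepsilon^2}{2} L_\chi \frac{\partial\chi}{\partial p'}\Big|_{p',q'} + \ldots$ .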
This is the explicit expression of the flow at time ε, and is a near the identity canonical transformation in the sense that it depends analytically on ε as a parameter, and for ε = 0 is the identity (see footnote 1). The reader will recognize the power expansion in time of the local solution of a holomorphic system of differential equations, as used by Cauchy.
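The expansion (6.4) can be tried out on a simple case. The following is only a minimal sketch, assuming one degree of freedom and polynomial functions stored as dictionaries; the helper names (`add`, `mul`, `diff`, `poisson`, `lie`, `lie_series`) are hypothetical and are not taken from the text. For the generating function χ = (q² + p²)/2 the Lie series reproduces the Taylor coefficients of the rotation q cos ε + p sin ε, p cos ε − q sin ε.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def poisson(f, g):
    """Poisson bracket {f, g} = f_q g_p - f_p g_q."""
    return add(mul(diff(f, 0), diff(g, 1)), mul(diff(f, 1), diff(g, 0)), -1)

def lie(chi, f):
    """Lie derivative L_chi f = {f, chi}, as in (6.2)."""
    return poisson(f, chi)

def lie_series(chi, f, order):
    """Coefficients of eps**s, s = 0..order, of exp(eps L_chi) f, as in (6.3)."""
    terms = [f]
    for s in range(1, order + 1):
        # (1/s!) L^s f is obtained from (1/(s-1)!) L^(s-1) f by one more L and a division by s
        terms.append({k: v / s for k, v in lie(chi, terms[-1]).items()})
    return terms

half = Fraction(1, 2)
chi = {(2, 0): half, (0, 2): half}   # chi = (q**2 + p**2) / 2
q = {(1, 0): Fraction(1)}
p = {(0, 1): Fraction(1)}

Q = lie_series(chi, q, 4)   # q, p, -q/2, -p/6, q/24
P = lie_series(chi, p, 4)   # p, -q, -p/2, q/6, p/24
```

Exact rational coefficients are used so that the factorials in (6.3) are reproduced without rounding; the same representation would work for any polynomial generating function in one degree of freedom.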
It is also an easy matter to write the inverse transformation: exploiting the fact that the flow is autonomous, it is enough to replace χ by −χ or, equivalently, ε by −ε.
Example 6.1: Translation and deformation. Consider the case of action-angle variables $p \in \mathbb{R}^n$ and $q \in \mathbb{T}^n$. The function $\chi = \sum_l \xi_l q'_l$, with $\xi \in \mathbb{R}^n$, generates through the Lie series operator at time ε the canonical transformation
$p = \exp(\varepsilon L_\chi) p' = p' - \varepsilon \xi$ , $q = \exp(\varepsilon L_\chi) q' = q'$ ,
namely a translation in the action space that leaves the angles unchanged. Similarly, a generating function χ = χ(q′) (independent of p′) generates the transformation
$p = p' - \frac{\partial\chi}{\partial q'}$ , $q = q'$ ,
namely a deformation of the action variables. Remark that here the flow at time ε = 1 has been considered.
Lemma 6.1: The Lie series operator has the following properties.
(i) Linearity: for any pair of functions f, g and for all $\alpha \in \mathbb{R}$ we have
$\exp(\varepsilon L_\chi)(f + g) = \exp(\varepsilon L_\chi) f + \exp(\varepsilon L_\chi) g$ , $\exp(\varepsilon L_\chi)(\alpha f) = \alpha \exp(\varepsilon L_\chi) f$ .
(ii) Distributivity over the product (or conservation of product): for any pair of functions f, g we have
$\exp(\varepsilon L_\chi)(f \cdot g) = \exp(\varepsilon L_\chi) f \cdot \exp(\varepsilon L_\chi) g$ .
(iii) Distributivity over Poisson bracket (or conservation of Poisson bracket): for any pair of functions f, g we have
$\exp(\varepsilon L_\chi)\{f, g\} = \{\exp(\varepsilon L_\chi) f, \exp(\varepsilon L_\chi) g\}$ .
The easy proof is left to the reader. Remark in particular that the property (iii) means that the Hamiltonian flow is a canonical transformation, a property that we already know (see lemma 3.3).
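Properties (ii) and (iii) can be checked order by order in ε on a polynomial example. The sketch below is only illustrative: it assumes one degree of freedom, stores polynomials as dictionaries, and uses hypothetical helper names that do not come from the text. The right-hand sides are formed as truncated Cauchy products of the two series.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def poisson(f, g):
    """Poisson bracket {f, g} = f_q g_p - f_p g_q."""
    return add(mul(diff(f, 0), diff(g, 1)), mul(diff(f, 1), diff(g, 0)), -1)

def lie_series(chi, f, order):
    """Coefficients of eps**s, s = 0..order, of exp(eps L_chi) f, with L_chi f = {f, chi}."""
    terms = [f]
    for s in range(1, order + 1):
        terms.append({k: v / s for k, v in poisson(terms[-1], chi).items()})
    return terms

def series_op(op, F, G, order):
    """Truncated Cauchy combination: coefficient s is sum_j op(F[j], G[s-j])."""
    out = []
    for s in range(order + 1):
        acc = {}
        for j in range(s + 1):
            acc = add(acc, op(F[j], G[s - j]))
        out.append(acc)
    return out

one = Fraction(1)
chi = {(2, 1): one}             # chi = q**2 * p (an arbitrary cubic choice)
f = {(1, 1): one}               # f = q * p
g = {(0, 2): one, (1, 0): one}  # g = p**2 + q
N = 4
F, G = lie_series(chi, f, N), lie_series(chi, g, N)

product_ok = lie_series(chi, mul(f, g), N) == series_op(mul, F, G, N)
bracket_ok = lie_series(chi, poisson(f, g), N) == series_op(poisson, F, G, N)
```

Both flags come out true because the Lie derivative obeys the Leibniz rule for products and, through the Jacobi identity, for Poisson brackets; the exact rational arithmetic makes the comparison term by term meaningful.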
Footnote 1: We have already encountered an expression similar to this one in sect. 5.1. The flow of a linear system has been expressed there using an evolution operator which is the exponential of a matrix; this is the Lie series operator for a linear vector field.
Having defined a coordinate transformation we may ask how a function is transformed. Here the considerations of section 6.1 concerning the formal calculus play a relevant role. Given a function f(q, p), we should calculate the transformed function f′(p′, q′) by substitution of the transformation (6.4). In perturbation theory we also need to expand the transformed function in powers of the parameter ε up to a given order, which is clearly a long and boring procedure since we should apply Taylor's formula. A remarkable property is the following.
Lemma 6.2: Let a generating function χ(p, q) and a function f(p, q) be given. Then the following equality holds true:
(6.5) $f(p, q)\big|_{p = \exp(\varepsilon L_\chi) p',\; q = \exp(\varepsilon L_\chi) q'} = \exp(\varepsilon L_\chi) f \big|_{p = p',\; q = q'}$ .
This lemma has been named by Gröbner the exchange theorem. A few comments are in order.
The claim is that the series expansion in ε of the transformed function may be calculated by applying the exponential operator of the Lie series directly to the function, with no need of making a substitution of variables (see footnote 2). This may appear trivial in view of the following considerations. The substitution of variables in the left member of (6.5) produces a function f′(q′, p′, ε), where ε is the time of the flow. We may expand the function in Taylor series by calculating the derivatives of f′ with respect to ε. The right member says essentially that the derivatives with respect to ε are calculated as Lie derivatives with respect to the flow generated by χ. This seems quite obvious.
However, some care is required if we remark that we are combining two operations: in the left member we first make the substitution, and then perform the expansion; in the right member we perform the operations in reverse order. To be rigorous, we should prove that the result is the same. This is evident for a polynomial in view of properties (i) and (ii) of lemma 6.1, and for an analytic function it follows via approximation by polynomials. The complete proof may be found in [42], § I.2.
6.2.2 The triangle of Lie series
An elegant and effective representation of the operation of transforming a function is found as follows. Assume, as is typical in perturbation theory, that the function to be transformed is expanded in power series of the parameter ε, namely $f(p, q, \varepsilon) = f_0(p, q) + \varepsilon f_1(p, q) + \varepsilon^2 f_2(p, q) + \ldots$ , and that we want to write the transformed function $g = \exp(\varepsilon L_\chi) f$ as a power series $g_0 + \varepsilon g_1 + \varepsilon^2 g_2 + \ldots$ in ε. Working at a formal level we may use the linearity of the Lie series operator, thus writing
$g = \exp(\varepsilon L_\chi) f_0 + \varepsilon \exp(\varepsilon L_\chi) f_1 + \varepsilon^2 \exp(\varepsilon L_\chi) f_2 + \ldots$ .
Footnote 2: A further comment on the role of the variables may be useful, if perhaps pedantic. The left member of (6.5) produces a function of the new variables q′, p′. On the other hand, the right member may be calculated by considering both functions χ and f as depending on the old variables q, p, so that the result is a function of q, p. The substitution p = p′, q = q′ means that the variables q, p must be simply renamed. The equality must be interpreted in the sense that we take care only of the form of the function, the name of the variables being irrelevant. This fact should be kept in mind when we use the algorithm: do not care about the names of the variables; just transform the functions.
That is, we apply the Lie series to every term of the expansion of f . The action of the operator is represented by the triangular diagram
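(6.6)
$$
\begin{array}{cccccc}
g_0 & f_0 \\
 & \downarrow \\
g_1 & L_\chi f_0 & f_1 \\
 & \downarrow & \downarrow \\
g_2 & \frac{1}{2!} L_\chi^2 f_0 & L_\chi f_1 & f_2 \\
 & \downarrow & \downarrow & \downarrow \\
g_3 & \frac{1}{3!} L_\chi^3 f_0 & \frac{1}{2!} L_\chi^2 f_1 & L_\chi f_2 & f_3 \\
 & \downarrow & \downarrow & \downarrow & \downarrow \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
$$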
Terms of the same order in ε are aligned on the same line. The calculation may be performed by columns, as indicated by the arrows: if the function f and the generating function χ are known, then every column may be calculated proceeding from top to bottom until the line corresponding to the wanted order in ε is reached. Then it is enough to add together all terms appearing on the same line, and this gives every term of the expansion of g up to the wanted order. Everything not included in the diagram has higher order in ε, and need not be calculated. This is precisely what we mean when saying that $g = \exp(\varepsilon L_\chi) f$ in formal sense.
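A compact form of the diagram is given by the recurrent formula
(6.7) $g_0 = f_0$ , $g_r = \sum_{j=0}^{r} \frac{1}{j!} L_\chi^j f_{r-j}$ for r > 0 ,
where the term with j = 0 is $f_r$ itself, i.e., the last element of line r in the diagram (6.6).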
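The column-by-column procedure of the triangular diagram (6.6) can be written in a few lines. The following is a sketch under simplifying assumptions (one degree of freedom, polynomial functions stored as dictionaries); the function `triangle` and the other helper names are hypothetical. Each column below $f_i$ is built by repeated application of $L_\chi$, and every entry is added to the line it belongs to.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def lie(chi, f):
    """Lie derivative L_chi f = {f, chi} = f_q chi_p - f_p chi_q."""
    return add(mul(diff(f, 0), diff(chi, 1)), mul(diff(f, 1), diff(chi, 0)), -1)

def triangle(chi, f, order):
    """Lines g_0..g_order of the diagram (6.6) for f = [f_0, f_1, ...]."""
    g = [{} for _ in range(order + 1)]
    for i, fi in enumerate(f[:order + 1]):
        col = fi                                  # top of the column below f_i
        for j in range(order - i + 1):
            g[i + j] = add(g[i + j], col)         # entry (1/j!) L^j f_i sits on line i + j
            col = {k: v / (j + 1) for k, v in lie(chi, col).items()}
    return g

half = Fraction(1, 2)
chi = {(2, 0): half, (0, 2): half}   # chi = (q**2 + p**2) / 2
f = [{(1, 0): Fraction(1)}, {(0, 1): Fraction(1)}]   # f = q + eps * p
g = triangle(chi, f, 3)
# g[0] = q, g[1] = 2p, g[2] = -3q/2, g[3] = -2p/3
```

For this choice the result can be checked by hand, since exp(εL_χ)q and exp(εL_χ)p are the Taylor series of q cos ε + p sin ε and p cos ε − q sin ε, and line r collects the coefficient of $\varepsilon^r$ in their combination.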
6.2.3 Composition of Lie series
As we have seen, the Lie series defines a near the identity transformation, which in our case is a canonical one. A natural question is whether the converse is also true, i.e., if every canonical transformation can be expressed as a Lie series.
The question may be formulated in more precise terms. We are actually consid- ering a one parameter family of transformations. Suppose that we are given such a family in the form
(6.8) $q = q' + \varepsilon\varphi_1(p', q') + \varepsilon^2\varphi_2(p', q') + \ldots$ , $p = p' + \varepsilon\psi_1(p', q') + \varepsilon^2\psi_2(p', q') + \ldots$ ,
where the functions $\varphi_1, \varphi_2, \ldots$ and $\psi_1, \psi_2, \ldots$ are assumed to satisfy the necessary conditions for the transformation to be canonical. The question is whether a generating function χ(p′, q′) exists which produces exactly that transformation. The answer in general is negative, the limitation being that we are considering the flow of an autonomous system. However, more general transformations can be constructed by composition of Lie series.
Let us consider a sequence of generating functions $\chi = \{\chi_1, \chi_2, \chi_3, \ldots\}$ , and let the sequence of operators $S^{(1)}, S^{(2)}, S^{(3)}, \ldots$ be defined by recurrence as
(6.9) $S^{(1)} = \exp(\varepsilon L_{\chi_1})$ , $S^{(k)} = \exp(\varepsilon^k L_{\chi_k})\, S^{(k-1)}$ for k > 1 .
If we work at formal level we interrupt the sequence at k = r for some r > 0, also truncating all expansions at order r; all the rest of the sequence produces only terms of order O(r + 1). If we can prove that the sequence converges to some limit then the limit
(6.10) $S_\chi = \ldots \circ \exp(\varepsilon^3 L_{\chi_3}) \circ \exp(\varepsilon^2 L_{\chi_2}) \circ \exp(\varepsilon L_{\chi_1})$
is well defined. This procedure defines the composition of Lie series.
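Inverting the operators so defined is not difficult. Let us define the sequence $\tilde S^{(1)}, \tilde S^{(2)}, \tilde S^{(3)}, \ldots$ as
(6.11) $\tilde S^{(1)} = \exp(-\varepsilon L_{\chi_1})$ , $\tilde S^{(k)} = \tilde S^{(k-1)} \exp(-\varepsilon^k L_{\chi_k})$ for k > 1 .
It is an easy matter to check that the sequence defines the inverse in formal sense, namely
(6.12) $\tilde S^{(r)} \circ S^{(r)} = \mathrm{Id} + O(r + 1)$ .
If the sequence $\tilde S^{(k)}$ tends to some limit then we get the inverse of the operator $S_\chi$, namely
(6.13) $S_\chi^{-1} = \exp(-\varepsilon L_{\chi_1}) \circ \exp(-\varepsilon^2 L_{\chi_2}) \circ \exp(-\varepsilon^3 L_{\chi_3}) \circ \ldots$ .
In this case (6.12) is replaced by $\tilde S_\chi \circ S_\chi = \mathrm{Id}$ .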
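Let us pay a little attention to the triangular diagram (6.6), making clear how the calculation should proceed when a composition of Lie series is considered. Keep in mind that all equalities must be considered in formal sense. Suppose we want to transform the function $f = f_0 + \varepsilon f_1 + \varepsilon^2 f_2 + \ldots$ , which means that we want
$g = \ldots \circ \exp(\varepsilon^2 L_{\chi_2}) \circ \exp(\varepsilon L_{\chi_1}) f$ .
We first calculate $f' = \exp(\varepsilon L_{\chi_1}) f = f'_0 + \varepsilon f'_1 + \varepsilon^2 f'_2 + \ldots$ as indicated in the diagram (6.6), just writing f′ in place of g and χ1 in place of χ. Then we calculate $f'' = \exp(\varepsilon^2 L_{\chi_2}) f' = f''_0 + \varepsilon f''_1 + \varepsilon^2 f''_2 + \ldots$ by constructing a similar diagram, but paying attention to the correct alignment of the powers of ε. With a moment's thought we realize that the diagram should be represented as
$$
\begin{array}{ccccccc}
f''_0 & f'_0 \\
 & \downarrow \\
f''_1 & 0 & f'_1 \\
 & \downarrow & \downarrow \\
f''_2 & L_{\chi_2} f'_0 & 0 & f'_2 \\
 & \downarrow & \downarrow & \downarrow \\
f''_3 & 0 & L_{\chi_2} f'_1 & 0 & f'_3 \\
 & \downarrow & \downarrow & \downarrow & \downarrow \\
f''_4 & \frac{1}{2!} L_{\chi_2}^2 f'_0 & 0 & L_{\chi_2} f'_2 & 0 & f'_4 \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
$$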
with a number of null elements, because we proceed by powers of ε², not of ε. Similarly, when we go on with our procedure we shall construct a diagram for $\exp(\varepsilon^s L_{\chi_s})$ which contains in every column just one non zero element out of s. A compact formula for the action of $\exp(\varepsilon^r L_{\chi_r})$ may be given, namely
(6.14) $g_s = \sum_{j=0}^{k} \frac{1}{j!} L_{\chi_r}^j f_{s - jr}$ , $k = \lfloor s/r \rfloor$ ,
where ⌊x⌋ denotes the maximal integer that does not exceed x.
Pedantically, let us see in some more detail how to proceed in a practical calculation, keeping in mind that we shall unavoidably truncate the series to some order $\varepsilon^r$, having chosen r ≥ 1 in a way that we consider suitable.
Let us pick a truncated function $f = f_0 + \varepsilon f_1 + \ldots + \varepsilon^r f_r$ and suppose that we want to construct the transformed function up to degree r in ε. With a little attention we realize that it is enough to construct every diagram until we reach the line corresponding to the power $\varepsilon^r$, and in particular we need to know only the generating functions $\chi_1, \ldots, \chi_r$. For, this includes all terms up to order $\varepsilon^r$, and neglects everything of higher order. Similar considerations apply to the inverse transformation: it is enough to consider only the action of the operator $\tilde S^{(r)}$ defined by (6.11).
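The truncated procedure just described can be sketched as follows. This is only an illustration under stated assumptions: one degree of freedom, polynomial functions stored as dictionaries, and hypothetical helper names (`apply_exp` implements the formula (6.14) column by column). The generating functions χ1 = q²p and χ2 = p³ are arbitrary choices made for the example. The inverse is obtained as in (6.11), by applying the operators with opposite sign in reverse order.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def lie(chi, f):
    """Lie derivative L_chi f = {f, chi} = f_q chi_p - f_p chi_q."""
    return add(mul(diff(f, 0), diff(chi, 1)), mul(diff(f, 1), diff(chi, 0)), -1)

def apply_exp(chi, k, f, order):
    """Apply exp(eps**k L_chi) to the truncated series f = [f_0, ..., f_order]."""
    g = [{} for _ in range(order + 1)]
    for i, fi in enumerate(f):
        col, j = fi, 0
        while i + j * k <= order:
            g[i + j * k] = add(g[i + j * k], col)   # (1/j!) L^j f_i has order i + j*k
            j += 1
            col = {m: v / j for m, v in lie(chi, col).items()}
    return g

def neg(chi):
    """Return -chi."""
    return {m: -v for m, v in chi.items()}

one = Fraction(1)
chi1 = {(2, 1): one}            # chi_1 = q**2 * p
chi2 = {(0, 3): one}            # chi_2 = p**3
N = 4
f = [{(1, 0): one}] + [{} for _ in range(N)]   # f = q, as a series truncated at order N

# S^(2) f = exp(eps^2 L_chi2) exp(eps L_chi1) f
S = apply_exp(chi2, 2, apply_exp(chi1, 1, f, N), N)
# inverse: exp(-eps L_chi1) exp(-eps^2 L_chi2), applied to S
back = apply_exp(neg(chi1), 1, apply_exp(neg(chi2), 2, S, N), N)
```

With these choices the coefficient of ε² in `S` is q³ + 3p², i.e., the sum of ½L²q for χ1 and Lq for χ2, and `back` coincides with the truncated series of f, in agreement with (6.12).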
A fact that may raise some perplexity in an actual calculation is the following.
Suppose that we have constructed the generating functions $\chi_1, \ldots, \chi_r$, so that we know how to construct the operators $S_\chi^{(r)}$ and $\tilde S_\chi^{(r)}$. Let us calculate the transformation
$q = S_\chi^{(r)} q'$ , $p = S_\chi^{(r)} p'$
up to degree r in ε. This will give us expressions such as
(6.15) $q = q' + \varepsilon\varphi_1(p', q') + \ldots + \varepsilon^r\varphi_r(p', q')$ , $p = p' + \varepsilon\psi_1(p', q') + \ldots + \varepsilon^r\psi_r(p', q')$ ,
where the functions $\varphi_1(p', q'), \ldots, \varphi_r(p', q')$ and $\psi_1(p', q'), \ldots, \psi_r(p', q')$ may be explicitly calculated. We may then consider the inverse transformation
$q' = \tilde S_\chi^{(r)} q$ , $p' = \tilde S_\chi^{(r)} p$ .
This will give us the expressions
(6.16) $q' = q + \varepsilon\tilde\varphi_1(p, q) + \ldots + \varepsilon^r\tilde\varphi_r(p, q)$ , $p' = p + \varepsilon\tilde\psi_1(p, q) + \ldots + \varepsilon^r\tilde\psi_r(p, q)$ ,
which could be explicitly calculated. Suppose now that we substitute the expressions (6.16) into (6.15). By lemma 6.2 (the exchange theorem) this is equivalent to applying the operator $\tilde S_\chi^{(r)}$ to the functions in the right member of (6.15). We expect that this will give us the identity, which is definitely true if we consider the infinite series. If we work with truncated series at order r we shall realize that the result is the identity up to a term of order $\varepsilon^{r+1}$. This is the best we can expect in a practical calculation, and is coherent with the rules of formal calculus.
Lemma 6.3: Let a near the identity canonical transformation in the form (6.8) be given. Then there exists a sequence $\chi = \{\chi_1(p', q'), \chi_2(p', q'), \ldots\}$ of generating functions such that
$q = S_\chi q'$ , $p = S_\chi p'$ .
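Proof. Let us see how the sequence χ is constructed, step by step. In the formal approach, we want to determine χ1 so that (see footnote 3)
(6.17) $\exp(\varepsilon L_{\chi_1}) q_j - (q_j + \varepsilon\varphi_{1,j}) = O(2)$ , $\exp(\varepsilon L_{\chi_1}) p_j - (p_j + \varepsilon\psi_{1,j}) = O(2)$ .
The condition for the transformation (6.8) to be canonical (in formal sense at order 1) is
$\{q_j + \varepsilon\varphi_{1,j}, q_k + \varepsilon\varphi_{1,k}\} = \{q_j, q_k\} + \varepsilon\{q_j, \varphi_{1,k}\} + \varepsilon\{\varphi_{1,j}, q_k\} + O(2) = O(2)$ ,
$\{p_j + \varepsilon\psi_{1,j}, p_k + \varepsilon\psi_{1,k}\} = \{p_j, p_k\} + \varepsilon\{p_j, \psi_{1,k}\} + \varepsilon\{\psi_{1,j}, p_k\} + O(2) = O(2)$ ,
$\{q_j + \varepsilon\varphi_{1,j}, p_k + \varepsilon\psi_{1,k}\} = \{q_j, p_k\} + \varepsilon\{q_j, \psi_{1,k}\} + \varepsilon\{\varphi_{1,j}, p_k\} + O(2) = \delta_{jk} + O(2)$ .
Hence we ask
$\frac{\partial\varphi_{1,k}}{\partial p_j} - \frac{\partial\varphi_{1,j}}{\partial p_k} = 0$ , $-\frac{\partial\psi_{1,k}}{\partial q_j} + \frac{\partial\psi_{1,j}}{\partial q_k} = 0$ , $\frac{\partial\psi_{1,k}}{\partial p_j} + \frac{\partial\varphi_{1,j}}{\partial q_k} = 0$ .
That is, there is a function χ1(q, p) such that
$\varphi_{1,j} = \frac{\partial\chi_1}{\partial p_j}$ , $\psi_{1,j} = -\frac{\partial\chi_1}{\partial q_j}$ ,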
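so that χ1 is determined by quadrature, and (6.17) is satisfied. Suppose now that we have determined $\{\chi_1, \ldots, \chi_{r-1}\}$ so that
$\exp(\varepsilon^{r-1} L_{\chi_{r-1}}) \circ \ldots \circ \exp(\varepsilon L_{\chi_1}) q_j - (q_j + \varepsilon\varphi_{1,j} + \ldots + \varepsilon^{r-1}\varphi_{r-1,j}) - \varepsilon^r \varphi'_{r,j} = O(r + 1)$ ,
$\exp(\varepsilon^{r-1} L_{\chi_{r-1}}) \circ \ldots \circ \exp(\varepsilon L_{\chi_1}) p_j - (p_j + \varepsilon\psi_{1,j} + \ldots + \varepsilon^{r-1}\psi_{r-1,j}) - \varepsilon^r \psi'_{r,j} = O(r + 1)$ .
Footnote 3: The reader will notice that the argument here is essentially a reinterpretation of the near the identity canonical transformations that we have seen in example 2.12.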
Remark that the expressions $\varphi'_{r,j}$ and $\psi'_{r,j}$ can be explicitly determined. On the other hand they must satisfy the conditions for canonicity in formal sense at order r. The calculation requires some patience and attention: one should exploit the fact that canonicity up to order r − 1 follows from the construction, for the transformation generated by a Lie series is canonical. We find that the canonicity conditions are, again,
$\frac{\partial\varphi'_{r,k}}{\partial p_j} - \frac{\partial\varphi'_{r,j}}{\partial p_k} = 0$ , $-\frac{\partial\psi'_{r,k}}{\partial q_j} + \frac{\partial\psi'_{r,j}}{\partial q_k} = 0$ , $\frac{\partial\psi'_{r,k}}{\partial p_j} + \frac{\partial\varphi'_{r,j}}{\partial q_k} = 0$ .
Thus, the function $\chi_r$ can be determined by quadrature. We conclude that the sequence can be constructed up to any wanted order, as claimed. Q.E.D.
6.3 Lie transform
The algorithm of Lie series may be generalized to the case of a time-dependent generating function χ(p, q, t), so that the corresponding time evolution is due to the flow of a non-autonomous Hamiltonian. This leads to an algorithm different from (6.3).
As already said at the beginning of the chapter, several different formulæ have been devised in order to give the Lie transform an algorithmic recurrent form: everybody, of course, has his favourite one. To make a definite choice, the algorithm used here is the favourite one of the author. It is related to the “algorithm of the inverse”, found by Henrard [48].
Hereafter particular attention will be paid to the algebraic aspect of the method, leaving somehow hidden the relation with a canonical flow. In particular the algorithm will be formulated after setting ε = 1.
Consider a generating sequence χ = {χs}s≥1 of analytic functions on the phase space. The Lie transform operator Tχ is defined as
(6.18) $T_\chi = \sum_{s \ge 0} E_s$ ,
where the sequence $\{E_s\}_{s \ge 0}$ of operators is recurrently defined as
(6.19) $E_0 = \mathrm{Id}$ , $E_s = \sum_{j=1}^{s} \frac{j}{s} L_{\chi_j} E_{s-j}$ .
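A coordinate transformation defined by means of Lie transform is written as
(6.20)
$q = T_\chi q' = q' + L_{\chi_1} q' + \left( \frac{1}{2} L_{\chi_1}^2 q' + L_{\chi_2} q' \right) + \ldots$ ,
$p = T_\chi p' = p' + L_{\chi_1} p' + \left( \frac{1}{2} L_{\chi_1}^2 p' + L_{\chi_2} p' \right) + \ldots$ .
This is a canonical transformation (see lemma 6.4 below).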
A direct connection with Lie series comes out by considering the case χ = {χ1, 0, 0, . . .}, namely a generating sequence containing only the first term. Then the Lie transform generated by χ coincides with the Lie series generated by χ1, i.e., Tχ = exp(Lχ1) .
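The recurrence (6.19) can be illustrated with a short sketch. As before, this assumes one degree of freedom and polynomial functions stored as dictionaries; the helper names, in particular `lie_transform_terms`, and the choices χ1 = q²p, χ2 = p³ are hypothetical. The example checks the second order term of (6.20) and the remark that a generating sequence containing only χ1 reproduces the Lie series.

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def lie(chi, f):
    """Lie derivative L_chi f = {f, chi} = f_q chi_p - f_p chi_q."""
    return add(mul(diff(f, 0), diff(chi, 1)), mul(diff(f, 1), diff(chi, 0)), -1)

def lie_transform_terms(chis, f, order):
    """Return [E_0 f, ..., E_order f] from (6.19); chis[j-1] is chi_j, missing ones are zero."""
    E = [f]
    for s in range(1, order + 1):
        acc = {}
        for j in range(1, min(s, len(chis)) + 1):
            acc = add(acc, lie(chis[j - 1], E[s - j]), Fraction(j, s))
        E.append(acc)
    return E

def lie_series(chi, f, order):
    """Return [(1/s!) L_chi^s f for s = 0..order], i.e. the Lie series with eps = 1."""
    terms = [f]
    for s in range(1, order + 1):
        terms.append({k: v / s for k, v in lie(chi, terms[-1]).items()})
    return terms

one = Fraction(1)
chi1 = {(2, 1): one}   # chi_1 = q**2 * p
chi2 = {(0, 3): one}   # chi_2 = p**3
q = {(1, 0): one}

E = lie_transform_terms([chi1, chi2], q, 3)
second_order = add({k: v / 2 for k, v in lie(chi1, lie(chi1, q)).items()}, lie(chi2, q))
same_as_series = lie_transform_terms([chi1], q, 4) == lie_series(chi1, q, 4)
```

Here `E[2]` equals `second_order`, namely ½L²q for χ1 plus Lq for χ2, which is q³ + 3p² for these choices, and `same_as_series` is true.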
Lemma 6.4: The Lie transform $T_\chi$ defined by (6.18) and (6.19) has the following properties.
(i) Linearity: for any pair f, g of functions and for $\alpha \in \mathbb{R}$ we have
$T_\chi(f + g) = T_\chi f + T_\chi g$ , $T_\chi(\alpha f) = \alpha T_\chi f$ .
(ii) Conservation of product: for any pair f, g of functions we have
$T_\chi(f \cdot g) = T_\chi f \cdot T_\chi g$ .
(iii) Conservation of Poisson brackets: for any pair f, g of functions we have
$T_\chi\{f, g\} = \{T_\chi f, T_\chi g\}$ .
For a proof see, e.g., [32]. The property (iii) is particularly relevant, since it means that the coordinate transformation (6.20) is canonical.
The reader will notice that the Lie transform has the same formal properties as the Lie series. This is true also for the exchange theorem, as stated by the following
Lemma 6.5: Let the generating sequence $\chi = \{\chi_s\}_{s \ge 1}$ and a function f(p, q) be given. We have
(6.21) $f(p, q)\big|_{p = T_\chi p',\; q = T_\chi q'} = T_\chi f \big|_{p = p',\; q = q'}$ .
The proof may be obtained using the linearity and the conservation of products. This makes the claim evident for polynomials, and the result may be extended to analytic functions (see footnote 4).
6.3.1 The triangular diagram for the Lie transform
The calculation of a Lie transform has a nice graphical representation similar to that of Lie series. Suppose that we are given a function $f = f_0 + f_1 + \ldots$ , and denote by $g = g_0 + g_1 + \ldots$ its transformed function $g = T_\chi f$. Using the linearity of the Lie transform we can apply $T_\chi$ separately to every term of f. It is useful to rearrange terms according to the triangular diagram
Footnote 4: See the remarks made for the Lie series, including the note 2 in this chapter.
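(6.22)
$$
\begin{array}{cccccc}
g_0 & f_0 \\
 & \downarrow \\
g_1 & E_1 f_0 & f_1 \\
 & \downarrow & \downarrow \\
g_2 & E_2 f_0 & E_1 f_1 & f_2 \\
 & \downarrow & \downarrow & \downarrow \\
g_3 & E_3 f_0 & E_2 f_1 & E_1 f_2 & f_3 \\
 & \downarrow & \downarrow & \downarrow & \downarrow \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
$$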
where terms of the same order appear on the same line. Remark that the operator $T_\chi$ acts by columns, as indicated by the arrows: the knowledge of $f_j$ and of the generating sequence allows one to construct the whole column below $f_j$. Thus, the first line gives $g_0 = f_0$, the second line gives $g_1 = E_1 f_0 + f_1$, and so on. This shows how to practically perform the transformation (see footnote 5). Again, truncating the diagram at line r allows us to calculate all contributions up to order r and forget everything of higher order, which is coherent with the rules of formal calculus.
Finding the inverse of the Lie transform may appear definitely more complicated than for Lie series. However, with a bit of attention we may realize that it is just a matter of using the triangular diagram (6.22) in a skilful manner. Assume that g is given, and f is unknown. Then, the first line gives $f_0 = g_0$; having determined $f_0$, all the column below $f_0$ can be constructed, and the second line gives immediately $f_1 = g_1 - E_1 f_0$; having determined $f_1$, all the corresponding column can be constructed, so that $f_2$ can be determined from the third line as $f_2 = g_2 - E_2 f_0 - E_1 f_1$, and so on.
Footnote 5: This scheme may look mysterious to a reader who is not familiar with the methods of expansion of perturbation theory. Let me try to clarify this matter. Introduce an expansion parameter, ε say, and write $\varepsilon\chi_1, \varepsilon^2\chi_2, \ldots$ in place of $\chi_1, \chi_2, \ldots$ . Then the Lie transform applied to a generic function f (independent of the expansion parameter ε) generates in a natural way a transformed function expanded in power series of ε. For, in view of $L_{\varepsilon\chi_1} = \varepsilon L_{\chi_1}$, $L_{\varepsilon^2\chi_2} = \varepsilon^2 L_{\chi_2}$, . . . one easily sees that the operator $E_s$ produces a factor $\varepsilon^s$. Let now f itself be a series in ε, namely $f = f_0 + \varepsilon f_1 + \varepsilon^2 f_2 + \ldots$ , and look for the ε-expansion $g = g_0 + \varepsilon g_1 + \varepsilon^2 g_2 + \ldots$ of its transformed function $g = T_\chi f$. Then the triangular diagram (6.22) is easily constructed by putting on the same line the functions that have the same power of ε as coefficient. This should make the whole procedure natural. At this point, just set ε = 1 and leave everything in its place, remarking that the indexes play the role of the exponents of ε. Indeed, this is not just a formal game. We may consider the index s of a function, in our notation, as indicating that the function is “of order s” in some sense. The quantitative theory will be responsible for giving the expression a definite meaning, e.g., by assuring that with increasing values of the index s the size of the function decreases in some regular manner.
There is also an explicit formula for the inverse, namely
(6.23) $T_\chi^{-1} = \sum_{s \ge 0} D_s$ ,
where
(6.24) $D_0 = \mathrm{Id}$ , $D_s = -\sum_{j=1}^{s} \frac{j}{s} D_{s-j} L_{\chi_j}$ .
However, this expression is actually useless for a practical computation: the algorithm described above is much more efficient. Nevertheless, the explicit recurrent formula is useful for quantitative estimates.
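The two ways of inverting the Lie transform can be compared on a small example. The sketch below is a hedged illustration, assuming one degree of freedom and polynomial functions stored as dictionaries; all function names and the sample data are hypothetical. It builds g from f through the diagram (6.22), then recovers f both by the line-by-line use of the diagram and by the recurrence (6.24).

```python
from fractions import Fraction

# A polynomial in (q, p) is a dict {(i, j): c} meaning c * q**i * p**j.

def add(f, g, c=1):
    """Return f + c*g with zero coefficients removed."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + c * v
    return {k: v for k, v in h.items() if v != 0}

def mul(f, g):
    """Return the product f*g."""
    h = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            h[(a + c, b + d)] = h.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    h = {}
    for k, u in f.items():
        if k[var] > 0:
            new = (k[0] - 1, k[1]) if var == 0 else (k[0], k[1] - 1)
            h[new] = h.get(new, 0) + k[var] * u
    return h

def lie(chi, f):
    """Lie derivative L_chi f = {f, chi} = f_q chi_p - f_p chi_q."""
    return add(mul(diff(f, 0), diff(chi, 1)), mul(diff(f, 1), diff(chi, 0)), -1)

def E_terms(chis, f, order):
    """Return [E_0 f, ..., E_order f] from (6.19); chis[j-1] is chi_j."""
    E = [f]
    for s in range(1, order + 1):
        acc = {}
        for j in range(1, min(s, len(chis)) + 1):
            acc = add(acc, lie(chis[j - 1], E[s - j]), Fraction(j, s))
        E.append(acc)
    return E

def transform(chis, f, order):
    """Lines g_r = sum_j E_j f_{r-j} of the diagram (6.22)."""
    g = [{} for _ in range(order + 1)]
    for i, fi in enumerate(f[:order + 1]):
        for j, term in enumerate(E_terms(chis, fi, order - i)):
            g[i + j] = add(g[i + j], term)
    return g

def invert_by_diagram(chis, g, order):
    """Solve the diagram line by line: f_r = g_r - sum_{i<r} E_{r-i} f_i."""
    f = []
    for r in range(order + 1):
        fr = g[r]
        for i in range(r):
            fr = add(fr, E_terms(chis, f[i], r - i)[r - i], -1)
        f.append(fr)
    return f

def D_apply(chis, s, g):
    """D_s g from the recurrence (6.24)."""
    if s == 0:
        return g
    acc = {}
    for j in range(1, min(s, len(chis)) + 1):
        acc = add(acc, D_apply(chis, s - j, lie(chis[j - 1], g)), Fraction(-j, s))
    return acc

def invert_by_formula(chis, g, order):
    """f_r = sum_s D_s g_{r-s}, using (6.23) and (6.24)."""
    f = []
    for r in range(order + 1):
        acc = {}
        for s in range(r + 1):
            acc = add(acc, D_apply(chis, s, g[r - s]))
        f.append(acc)
    return f

one = Fraction(1)
chis = [{(2, 1): one}, {(0, 3): one}]                      # chi_1 = q**2*p, chi_2 = p**3
f = [{(1, 0): one}, {(0, 2): one}, {(1, 1): one}, {}]      # f_0 = q, f_1 = p**2, f_2 = q*p
g = transform(chis, f, 3)
f_diagram = invert_by_diagram(chis, g, 3)
f_formula = invert_by_formula(chis, g, 3)
```

Both `f_diagram` and `f_formula` coincide with f. In this sketch `D_apply` recomputes nested terms recursively, which is acceptable only at low order.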
The property that makes the Lie transform quite useful is expressed by the following
Lemma 6.6: Let a canonical transformation be given in the form (6.8). Then there exists a generating sequence $\chi_1(p', q'), \chi_2(p', q'), \ldots$ such that we have
$p = T_\chi p'$ , $q = T_\chi q'$ .
Proof. We want $T_\chi p_j$ and $T_\chi q_j$ to coincide with the corresponding expressions of $p_j$ and $q_j$, respectively, in (6.8) (recall that the name of the variables is irrelevant). Recalling the definition of $T_\chi$ we actually write the infinite system
$E_s p_j = \psi_{s,j}$ , $E_s q_j = \varphi_{s,j}$ , $s \ge 1$ .
For s = 1 the equations take the simple form $L_{\chi_1} p_j = \psi_{1,j}$ and $L_{\chi_1} q_j = \varphi_{1,j}$, i.e.,
$-\frac{\partial\chi_1}{\partial q_j} = \psi_{1,j}$ , $\frac{\partial\chi_1}{\partial p_j} = \varphi_{1,j}$ .
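Here we proceed as in the proof of lemma 6.3: the canonicity conditions give
$\frac{\partial\varphi_{1,k}}{\partial p_j} - \frac{\partial\varphi_{1,j}}{\partial p_k} = 0$ , $-\frac{\partial\psi_{1,k}}{\partial q_j} + \frac{\partial\psi_{1,j}}{\partial q_k} = 0$ , $\frac{\partial\psi_{1,k}}{\partial p_j} + \frac{\partial\varphi_{1,j}}{\partial q_k} = 0$ ,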
so that χ1 is determined by quadratures. For s > 1, using the expression (6.19) of $E_s$ and separating the last term of the sum we write the equations as
$L_{\chi_s} p_j + \sum_{l=1}^{s-1} \frac{l}{s} L_{\chi_l} E_{s-l} p_j = \psi_{s,j}$ , $L_{\chi_s} q_j + \sum_{l=1}^{s-1} \frac{l}{s} L_{\chi_l} E_{s-l} q_j = \varphi_{s,j}$ .
Proceeding by induction, remark that the sum is determined by $\chi_1, \ldots, \chi_{s-1}$, so that the equations read
$L_{\chi_s} p_j = \psi_{s,j} - F_{s,j}$ , $L_{\chi_s} q_j = \varphi_{s,j} + G_{s,j}$
with known functions F, G. On the other hand, we should recall that the transformation generated by the truncated sequence $\chi^{(s-1)} = \{\chi_1, \ldots, \chi_{s-1}\}$ is canonical, so that the right members must satisfy the canonicity conditions. Proceeding again as in the proof of lemma 6.3 we conclude that $\chi_s$ may be determined. Q.E.D.
It is mandatory here to emphasize that the sequence of generating functions does not coincide with the sequence that generates the same transformation by composition of Lie series. This is because Lie derivatives do not commute, in general.
6.4 Analytic framework
It is now time to introduce the technical tools that will allow us to investigate the convergence or the asymptotic properties of perturbation series. The methods presented here are in fact a variation on the classical method of majorants due to Cauchy.
6.4.1 Cauchy estimates
Consider an open disk $\Delta_\varrho(0)$, with $\varrho > 0$, centered at the origin of the complex plane $\mathbb{C}$. Consider a function f analytic and bounded on the closure of the disk $\Delta_\varrho(0)$. The supremum norm $|f|_\varrho$ of f in the domain $\Delta_\varrho(0)$ is defined as
(6.25) $|f|_\varrho = \sup_{z \in \Delta_\varrho(0)} |f(z)|$ .
The estimate of Cauchy for the derivative f′ of f at the origin states that
$|f'(0)| \le \frac{1}{\varrho} |f|_\varrho$ .
More generally, for the s-th derivative $f^{(s)}$ one has the estimate
$|f^{(s)}(0)| \le \frac{s!}{\varrho^s} |f|_\varrho$ .
For instance, let $\varrho = 1$, and consider the function $f(z) = z^s$. It is an easy matter to check that $|f|_1 = 1$, so that Cauchy's estimate gives $|f^{(s)}(0)| \le s!$ . Since in fact $f^{(s)}(0) = s!$, this shows that the estimate cannot be improved in general. The proof of the inequalities above is an easy consequence of Cauchy's formula
$f^{(s)}(z) = \frac{s!}{2\pi i} \oint \frac{f(\zeta)}{(\zeta - z)^{s+1}}\, d\zeta$ .
For, writing the contour of the disk as $\zeta = \varrho e^{i\vartheta}$, a straightforward calculation gives
$|f^{(s)}(0)| \le \frac{s!}{2\pi} \oint \frac{|f(\zeta)|}{\varrho^{s+1}}\, |d\zeta| \le \frac{s!}{2\pi\varrho^s} |f|_\varrho \int_0^{2\pi} d\vartheta = \frac{s!}{\varrho^s} |f|_\varrho$ .
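The estimate can be checked numerically. The following sketch is only an illustration: it approximates Cauchy's formula on the circle of radius ϱ by the trapezoidal rule, with hypothetical function names and an arbitrary number of sample points. It is applied to the exponential function with ϱ = 2, and to $z^s$ with ϱ = 1, where the bound is attained.

```python
import cmath
import math

def cauchy_derivative(f, s, rho, n=256):
    """Approximate f^(s)(0) by the trapezoidal rule on the circle |zeta| = rho.

    With zeta = rho * exp(i*theta) one has d(zeta) = i * zeta * d(theta), so that
    Cauchy's formula becomes (s! / 2 pi) * integral of f(zeta) / zeta**s over theta.
    """
    total = 0
    for k in range(n):
        zeta = rho * cmath.exp(2j * math.pi * k / n)
        total += f(zeta) / zeta**s
    return math.factorial(s) * total / n

def sup_norm_on_circle(f, rho, n=256):
    """Sampled maximum of |f| on the circle of radius rho (a lower estimate of the supremum norm)."""
    return max(abs(f(rho * cmath.exp(2j * math.pi * k / n))) for k in range(n))

rho = 2.0
norm = sup_norm_on_circle(cmath.exp, rho)          # equals e**2, attained at theta = 0
derivs = [cauchy_derivative(cmath.exp, s, rho) for s in range(6)]
bounds = [math.factorial(s) / rho**s * norm for s in range(6)]

# Sharpness: for f(z) = z**4 and rho = 1 the bound 4! is attained.
sharp = cauchy_derivative(lambda z: z**4, 4, 1.0)
```

All the derivatives of the exponential at the origin are equal to 1, and each of them stays below the corresponding entry of `bounds`; the value of `sharp` agrees with 4! = 24 up to rounding.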
The case of n variables requires a straightforward extension. Let the domain $\Delta_\varrho(0)$ be the polydisk of radius ϱ centered at the origin of $\mathbb{C}^n$, namely
(6.26) $\Delta_\varrho(0) = \{z \in \mathbb{C}^n : |z| < \varrho\}$ ,
where $|z| = \max_j |z_j|$ is the $l^\infty$ norm on $\mathbb{C}^n$. This is nothing but the Cartesian product of complex disks of radius ϱ in the complex plane. Define the supremum norm of an