Improving computations by using low complexity matrix algebras
FIRST LECTURE, Friday August 29, 2014 (Moscow, LMSU) Bernoulli polynomials and numbers
Set $B_0(x) = 1$, and for $n \ge 1$ let $B_n(x)$ be the degree-$n$ polynomial uniquely defined by the following two conditions:
$$B_n(x+1) - B_n(x) = n x^{n-1}, \qquad \int_0^1 B_n(x)\,dx = 0$$
(the proof of the fact that $B_n$ is uniquely defined is left to the reader; it could be a further exercise).
First observe that any $B_n$ is monic. In fact, if $B_n(x) = a_n x^n + a_{n-1} x^{n-1} + \dots$ then
$$B_n(x+1) - B_n(x) = (a_n (x+1)^n + a_{n-1}(x+1)^{n-1} + \dots) - (a_n x^n + a_{n-1} x^{n-1} + \dots)$$
$$= [a_n(x^n + n x^{n-1} + \dots) + a_{n-1}(x^{n-1} + (n-1)x^{n-2} + \dots) + \dots] - (a_n x^n + a_{n-1} x^{n-1} + \dots)$$
$$= a_n n x^{n-1} + (\cdot)\,x^{n-2} + \dots,$$
and thus the condition $B_n(x+1) - B_n(x) = n x^{n-1}$ implies $a_n n = n$, i.e. $a_n = 1$.
Let us compute $B_1(x)$ and $B_2(x)$.
$B_1(x)$ is of the form $B_1(x) = x + \beta$, thus $B_1(x+1) - B_1(x) = (x+1+\beta) - (x+\beta) = 1$, i.e. the first condition is satisfied $\forall\,\beta$. Moreover, $\int_0^1 (x+\beta)\,dx = [x^2/2 + \beta x]_0^1 = \frac12 + \beta$ must be zero, so $B_1(x) = x - \frac12$. $B_2(x)$ is of the form $B_2(x) = x^2 + \alpha x + \beta$, thus $B_2(x+1) - B_2(x) = ((x+1)^2 + \alpha(x+1) + \beta) - (x^2 + \alpha x + \beta) = 2x + 1 + \alpha$, and the first condition is satisfied if $\alpha = -1$. Moreover, $\int_0^1 (x^2 - x + \beta)\,dx = [x^3/3 - x^2/2 + \beta x]_0^1 = 1/3 - 1/2 + \beta$ must be zero, so $B_2(x) = x^2 - x + 1/6$.
The numbers $\{B_{2j}(0)\}_{j=0}^{+\infty}$ are known as Bernoulli numbers. We can say that $B_0(0) = 1$, $B_2(0) = 1/6$.
Euler guessed the following "explicit" formula for the generic Bernoulli number:
$$B_{2j}(0) = (-1)^{j+1}\,\frac{2\,(2j)!}{(2\pi)^{2j}}\,\zeta(2j), \qquad \zeta(s) = \sum_{k=1}^{+\infty} \frac{1}{k^s}.$$
The value at $0$ of the odd-degree Bernoulli polynomials is less interesting. In fact, $B_1(0) = -\frac12$, and $B_{2j+1}(0) = 0$, $j = 1, 2, \dots$ (see below).
Recall that $\sum_{x=1}^n x^j$ is $n$ for $j = 0$, is $n(n+1)/2$ if $j = 1$, is $(n(n+1)/2)^2$ if $j = 3$; and for $j = 2$ and $j$ generic? Note that
$$\sum_{x=1}^n x^j = \sum_{x=1}^n \frac{1}{j+1}\,(B_{j+1}(x+1) - B_{j+1}(x)) = \frac{1}{j+1}\,[B_{j+1}(n+1) - B_{j+1}(1)].$$
It follows that $\varphi_j(n) := \sum_{x=1}^n x^j$ can be written explicitly in terms of the values of $B_{j+1}$ at $n+1$ and $1$. For example, in order to write $\sum_{x=1}^n x^2$ it is sufficient to know the values of the third Bernoulli polynomial $B_3$ at $n+1$ and $1$.
Let us deduce $B_3$ in a way different from that used to deduce $B_1$ and $B_2$.
Exercise (IMP). Prove that
$$B_n(1-x) = (-1)^n B_n(x), \qquad B'_{n+1}(x) = (n+1)\,B_n(x)$$
(hint: note that the polynomials $(-1)^n B_n(1-x)$ and $\frac{1}{n+1} B'_{n+1}(x)$ also satisfy the conditions that uniquely define $B_n(x)$).
The first of the equalities in the exercise implies that the Bernoulli polynomials are, with respect to $x = \frac12$, symmetric if $n$ is even and antisymmetric if $n$ is odd. In particular, we have $B_n(1) = (-1)^n B_n(0)$, $n = 0, 1, 2, \dots$. But we also know that $B_n(1) = B_n(0)$, $n = 0, 2, 3, 4, \dots$ (by the first condition defining Bernoulli polynomials). Thus,
$$B_n(1) = B_n(0) = 0, \qquad n \text{ odd},\ n \ne 1.$$
The same equality in Exercise (IMP) also implies $B_n(\frac12) = (-1)^n B_n(\frac12)$, from which we deduce that $B_n(\frac12) = 0$, $n$ odd.
It follows that $B_1(x) = x - \frac12$, $B_3(x) = x(x - \frac12)(x - 1)$, and $B_{2j+1}(x) = x(x - \frac12)(x - 1)\,q_{2j-2}(x)$, $j = 2, 3, \dots$, where $q_{2j-2}$ is monic of degree $2j-2$. Now that we know $B_3$ we can say that
$$\sum_{x=1}^n x^2 = \frac13\,[B_3(n+1) - B_3(1)] = \frac13\,(n+1)\Bigl(n+1-\frac12\Bigr)(n+1-1) = \frac16\,n(n+1)(2n+1).$$
Exercise. Prove that $B_{2j+1}(x)$ ($j \ge 1$) is zero in $[0,1]$ if and only if $x = 0, \frac12, 1$.
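The formula for $\sum_{x=1}^n x^2$ is easy to check numerically. A quick Python sketch (the function names are ours), using exact rational arithmetic and the factored form $B_3(x) = x(x-\frac12)(x-1)$ derived above:

```python
from fractions import Fraction

def B3(x):
    # third Bernoulli polynomial, B_3(x) = x (x - 1/2) (x - 1)
    x = Fraction(x)
    return x * (x - Fraction(1, 2)) * (x - 1)

def sum_of_squares(n):
    # phi_2(n) = (1/3) [B_3(n+1) - B_3(1)]
    return Fraction(1, 3) * (B3(n + 1) - B3(1))

for n in range(1, 50):
    assert sum_of_squares(n) == sum(x * x for x in range(1, n + 1))
    assert sum_of_squares(n) == Fraction(n * (n + 1) * (2 * n + 1), 6)
```

The same check works for any $j$ once $B_{j+1}$ is known.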
The second equality in Exercise (IMP) lets us observe that the zeros of $B_n$ are exactly the stationary points of $B_{n+1}$, and vice versa. So, for example, in $[0,1]$ the only stationary points of the even-degree Bernoulli polynomials $B_n$ are $x = 0, \frac12, 1$ (for $n = 2$ only $x = \frac12$). Moreover, the second equality yields the following integral formula for $B_n(x)$:
$$B_n(x) = B_n(0) + n \int_0^x B_{n-1}(t)\,dt.$$
Finally, note that, by the Euler formula, the Bernoulli numbers $B_{2j}(0)$ are rational numbers, and $|B_{2j}(0)| \to +\infty$ as $j \to +\infty$.
Exercise. Prove that in $[0,1]$, if $j \to +\infty$, then $B_{2j}(x)/B_{2j}(0) \to \cos(2\pi x)$.
Exercise. Prove that for $x \in [0,1]$ one has $|B_{2j}(x)| \le |B_{2j}(0)|$, $\forall\, j$. (It would be sufficient to prove that $|B_{2j}(\frac12)| \le |B_{2j}(0)|$; why?)
Bernoulli numbers are involved in the well-known Euler–Maclaurin summation formula: for $m, n \in \mathbb{Z}$, $m < n$,
$$\sum_{r=m}^{n} f(r) = \frac12\,(f(m) + f(n)) + \int_m^n f(x)\,dx + \sum_{j=1}^{k} \frac{B_{2j}(0)}{(2j)!}\,[f^{(2j-1)}(n) - f^{(2j-1)}(m)] + u_{k+1},$$
$$u_{k+1} = \frac{1}{(2k+1)!} \int_m^n f^{(2k+1)}(x)\,\tilde B_{2k+1}(x)\,dx,$$
where $\tilde B_n(x)$ is the periodic extension to $\mathbb{R}$ of $B_n(x)|_{[0,1)}$.
Proof. The starting point is to integrate by parts $\int_k^{k+1} f'(x)\,\tilde B_1(x)\,dx$. The main steps of the proof are in the Appendix to the first lecture.
The Euler–Maclaurin formula can be used to compute approximations of $\zeta(s)$, $s > 1$, for example of the still mysterious number $\zeta(3) = \sum_{k=1}^{+\infty} 1/k^3$ (set $f(r) = 1/r^3$ and let $n$ go to infinity), or it can be used to obtain a formula for the error of the trapezoidal rule in approximating $\int_a^b g(x)\,dx$ (set $f(t) = g(a + th)$, $h = (b-a)/n$, $m = 0$; recall that the trapezoidal rule is $I_h = h[\frac{g(a)}{2} + \frac{g(b)}{2} + \sum_{i=1}^{n-1} g(a + ih)]$). For some more details, see the Appendix to the first lecture.
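As a concrete illustration of the first application (a sketch of ours, not part of the lecture): sum $1/k^3$ exactly up to some modest $m$, then add the Euler–Maclaurin tail terms for $f(x) = x^{-3}$ on $[m, +\infty)$, where $f^{(i)}(x) = (-1)^i \frac{(i+2)!}{2}\,x^{-3-i}$.

```python
import math

def zeta3(m=20, k=3):
    """Euler-Maclaurin approximation of zeta(3): exact sum up to m-1,
    then the tail sum_{r=m}^inf r^-3 estimated with k correction terms."""
    B = {1: 1/6, 2: -1/30, 3: 1/42}          # B_{2j}(0), j = 1, 2, 3
    s = sum(1.0 / r**3 for r in range(1, m))
    tail = 0.5 / m**3 + 0.5 / m**2           # f(m)/2 + integral_m^inf x^-3 dx
    for j in range(1, k + 1):
        i = 2 * j - 1                        # order of the derivative
        deriv = (-1)**i * math.factorial(i + 2) / 2 / m**(3 + i)
        tail -= B[j] / math.factorial(2 * j) * deriv
    return s + tail

print(zeta3())   # ~ 1.20205690315959... (close to zeta(3) to about 1e-14)
```

With only 19 exact terms and 3 corrections the error is already near machine precision, which is the point of the remark above.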
Bernoulli numbers solve a lower triangular semi-infinite linear system
Now let us show that the Bernoulli numbers solve a semi-infinite lower triangular linear system. It is known that
$$\frac{t e^{xt}}{e^t - 1} = \sum_{n=0}^{+\infty} \frac{B_n(x)}{n!}\,t^n, \qquad \frac{t}{e^t - 1} = \sum_{n=0}^{+\infty} \frac{B_n(0)}{n!}\,t^n = -\frac12\,t + \sum_{k=0}^{+\infty} \frac{B_{2k}(0)}{(2k)!}\,t^{2k}.$$
Multiply the latter identity by $e^t - 1$, expand $e^t$ in powers of $t$, and set to zero the coefficients of $t^j$ on the right-hand side, $j = 2, 3, 4, \dots$:
$$t = \Bigl(-\frac12\,t + \sum_{k=0}^{+\infty} \frac{B_{2k}(0)}{(2k)!}\,t^{2k}\Bigr) \sum_{r=0}^{+\infty} \frac{t^{r+1}}{(r+1)!}, \qquad t = -\frac12 \sum_{j=2}^{+\infty} \frac{t^j}{(j-1)!} + \sum_{k,r=0}^{+\infty} \frac{B_{2k}(0)\,t^{2k+r+1}}{(2k)!\,(r+1)!},$$
$$t = -\frac12 \sum_{j=2}^{+\infty} \frac{t^j}{(j-1)!} + \sum_{j=1}^{+\infty} \sum_{k=0}^{[(j-1)/2]} \frac{B_{2k}(0)\,t^j}{(2k)!\,(j-2k)!},$$
$$-\frac12\,\frac{1}{(j-1)!} + \sum_{k=0}^{[(j-1)/2]} \frac{B_{2k}(0)}{(2k)!\,(j-2k)!} = 0, \qquad j = 2, 3, 4, 5, \dots.$$
Thus
$$-\frac12\,j + \sum_{k=0}^{[\frac{j-1}{2}]} \binom{j}{2k} B_{2k}(0) = 0, \qquad j = 2, 3, 4, 5, \dots.$$
In particular, for $j = 2, 4, 6, 8, \dots$, we obtain the equations:
$$W b = \begin{pmatrix} \binom{2}{0} & & & & \\ \binom{4}{0} & \binom{4}{2} & & & \\ \binom{6}{0} & \binom{6}{2} & \binom{6}{4} & & \\ \binom{8}{0} & \binom{8}{2} & \binom{8}{4} & \binom{8}{6} & \\ \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix} \begin{pmatrix} B_0(0) \\ B_2(0) \\ B_4(0) \\ B_6(0) \\ \cdot \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ \cdot \end{pmatrix},$$
i.e. the lower triangular linear system we were looking for.
So, we can for instance easily compute the first Bernoulli numbers:
$$1,\ \frac16,\ -\frac{1}{30},\ \frac{1}{42},\ -\frac{1}{30},\ \frac{5}{66},\ -\frac{691}{2730},\ \frac76,\ \ B_{16}(0) = -\frac{3617}{510}.$$
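Forward substitution on the system $Wb = (1, 2, 3, \dots)^T$ produces these numbers exactly. A sketch with exact rational arithmetic (row $i$ reads $\sum_{j=1}^{i} \binom{2i}{2j-2} b_j = i$; the function name is ours):

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(N):
    """Solve W b = (1, 2, 3, ...)^T by forward substitution,
    where [W]_{ij} = C(2i, 2j-2): returns [B_0(0), B_2(0), B_4(0), ...]."""
    b = []
    for i in range(1, N + 1):
        s = sum((comb(2 * i, 2 * j - 2) * b[j - 1] for j in range(1, i)),
                Fraction(0))
        b.append((i - s) / comb(2 * i, 2 * i - 2))   # divide by the diagonal entry
    return b

b = bernoulli_numbers(9)
assert b[0] == 1 and b[1] == Fraction(1, 6)
assert b[6] == Fraction(-691, 2730) and b[8] == Fraction(-3617, 510)
```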
In the next lecture it will be shown that the coefficient matrix $W$ of the above linear system turns out to have an analytic representation. In order to prove this fact, it is enough to observe that $W$ is a suitable submatrix of the Tartaglia–Pascal matrix, which can be represented as a power series: $X = \sum_{k=0}^{+\infty} \frac{1}{k!}\,Y^k$, where
$$Y = Z \cdot \mathrm{diag}\,(i : i = 1, 2, 3, \dots), \qquad Z = \begin{pmatrix} 0 & & \\ 1 & 0 & \\ & 1 & \cdot \\ & & \cdot \end{pmatrix}.$$
The analytic representation of $W$ will let us introduce a lower triangular Toeplitz linear system solved by the Bernoulli numbers.
A FINAL REMARK on the first lecture.
In [Ramanujan, 1919] Ramanujan writes explicitly 11 sparse equations solved by the absolute values of the 11 Bernoulli numbers $B_2(0)$, $B_4(0)$, ..., $B_{22}(0)$. They are the first of an infinite set of sparse equations solved by the absolute values of all the Bernoulli numbers. The Ramanujan equations, written together and directly in terms of the $B_{2i} := B_{2i}(0)$, $i = 1, 2, \dots, 11$, form the linear system displayed here below:
$$\begin{pmatrix}
1 & & & & & & & & & & \\
0 & 1 & & & & & & & & & \\
0 & 0 & 1 & & & & & & & & \\
\frac13 & 0 & 0 & 1 & & & & & & & \\
0 & \frac52 & 0 & 0 & 1 & & & & & & \\
0 & 0 & 11 & 0 & 0 & 1 & & & & & \\
\frac15 & 0 & 0 & \frac{143}{4} & 0 & 0 & 1 & & & & \\
0 & 4 & 0 & 0 & \frac{286}{3} & 0 & 0 & 1 & & & \\
0 & 0 & \frac{204}{5} & 0 & 0 & 221 & 0 & 0 & 1 & & \\
\frac17 & 0 & 0 & \frac{1938}{7} & 0 & 0 & \frac{3230}{7} & 0 & 0 & 1 & \\
0 & \frac{11}{2} & 0 & 0 & \frac{7106}{5} & 0 & 0 & \frac{3553}{4} & 0 & 0 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot
\end{pmatrix}
\begin{pmatrix} B_2 \\ B_4 \\ B_6 \\ B_8 \\ B_{10} \\ B_{12} \\ B_{14} \\ B_{16} \\ B_{18} \\ B_{20} \\ B_{22} \\ \cdot \end{pmatrix}
=
\begin{pmatrix} \frac16 \\ -\frac{1}{30} \\ \frac{1}{42} \\ \frac{1}{45} \\ -\frac{1}{132} \\ \frac{4}{455} \\ \frac{1}{120} \\ -\frac{1}{306} \\ \frac{3}{665} \\ \frac{1}{231} \\ -\frac{1}{552} \\ \cdot \end{pmatrix}.$$
For example, by using the last three of the Ramanujan equations, from the Bernoulli numbers $B_2(0), \dots, B_{16}(0)$ listed previously, the following further Bernoulli numbers can be easily obtained:
$$B_{18}(0) = \frac{43867}{798}, \qquad B_{20}(0) = -\frac{174611}{330}, \qquad B_{22}(0) = \frac{854513}{138}.$$
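This computation can be carried out verbatim (rows 9, 10, 11 of the displayed system, in exact rational arithmetic; a sketch of ours):

```python
from fractions import Fraction as F

# B_2, B_4, ..., B_16 (already known from the triangular system above)
B = {2: F(1, 6), 4: F(-1, 30), 6: F(1, 42), 8: F(-1, 30),
     10: F(5, 66), 12: F(-691, 2730), 14: F(7, 6), 16: F(-3617, 510)}

# rows 9, 10, 11 of the Ramanujan system, solved for B_18, B_20, B_22
B[18] = F(3, 665)  - (F(204, 5) * B[6] + 221 * B[12])
B[20] = F(1, 231)  - (F(1, 7) * B[2] + F(1938, 7) * B[8] + F(3230, 7) * B[14])
B[22] = F(-1, 552) - (F(11, 2) * B[4] + F(7106, 5) * B[10] + F(3553, 4) * B[16])

assert B[18] == F(43867, 798)
assert B[20] == F(-174611, 330)
assert B[22] == F(854513, 138)
```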
Exercise. Let $R$ be the coefficient matrix of the above Ramanujan linear system. Prove that there exists a diagonal matrix $D$ such that $RD = D\tilde R$, where $\tilde R$ is lower triangular Toeplitz and sparse (for a more precise statement of this exercise see the Appendix to the first lecture).
APPENDIX TO THE FIRST LECTURE
On the Euler–Maclaurin formula:
Proof. First observe that
$$\int_k^{k+1} f'(x)\,\tilde B_1(x)\,dx = \frac12\,(f(k) + f(k+1)) - \int_k^{k+1} f(x)\,dx.$$
Then sum on $k$ from $m$ to $n-1$, call $u_1$ the integral $\int_m^n f'(x)\,\tilde B_1(x)\,dx$, and set
$$u_j = \frac{1}{(2j-1)!} \int_m^n f^{(2j-1)}(x)\,\tilde B_{2j-1}(x)\,dx.$$
Find a formula for $u_j$ in terms of $u_{j+1}$ and apply it for $j = 1, 2, \dots, k$. . . .
Remark. If $f^{(2k+2)}$ does not change sign in $[m, n]$, then
$$|u_{k+1}| \le 2\,\frac{|B_{2k+2}(0)|}{(2k+2)!}\,|f^{(2k+1)}(n) - f^{(2k+1)}(m)|.$$
This result allows one to bound the error in the approximations of $\zeta(s)$ calculated via the Euler–Maclaurin formula.
Remark. Set $I = \int_a^b g(x)\,dx$, $h = \frac{b-a}{n}$, $I_h = h[\frac{g(a)}{2} + \sum_{i=1}^{n-1} g(a+ih) + \frac{g(b)}{2}]$. An expression for $R$ in the equality $I = I_h + R$ can be found by using the Euler–Maclaurin formula for $m = 0$ and $f(t) = g(a + th)$. Note that $R = c_1 h^2 + c_2 h^4 + c_3 h^6 + \dots$, and this justifies the efficiency of the number
$$\tilde I_h = \frac{2^2 I_h - I_{2h}}{2^2 - 1}$$
as an approximation of $I$ ($I - \tilde I_h = O(h^4)$).
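A quick numerical sketch of this remark (test problem and names chosen by us): since the leading error term is $c_1 h^2$, the combination $\tilde I_h$ cancels it and gains two orders of accuracy.

```python
import math

def trapezoid(g, a, b, n):
    h = (b - a) / n
    return h * (g(a) / 2 + g(b) / 2 + sum(g(a + i * h) for i in range(1, n)))

g, a, b = math.exp, 0.0, 1.0
I = math.e - 1.0                        # exact value of the integral
n = 64
Ih, I2h = trapezoid(g, a, b, n), trapezoid(g, a, b, n // 2)
Ih_tilde = (4 * Ih - I2h) / 3           # Richardson extrapolation
print(abs(I - Ih), abs(I - Ih_tilde))   # O(h^2) error vs O(h^4) error
```

The second error is smaller by several orders of magnitude, as predicted by $R = c_1 h^2 + c_2 h^4 + \dots$.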
• $\zeta(3) = \sum_{r=1}^{+\infty} \frac{1}{r^3}$ is an irrational number (proved by Apéry in 1978). It is not known whether it is algebraic or transcendental. Show that the Euler–Maclaurin formula can be used to obtain approximations of $\zeta(3)$ as good as desired (hint: choose $m > 1$ and let $n$ go to infinity).
• The Euler–Mascheroni constant is
$$\gamma = \lim_{n\to+\infty} \Bigl(\sum_{k=1}^{n} \frac1k - \log n\Bigr).$$
Find approximations as good as possible of $\gamma$ by using the Euler–Maclaurin formula.
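For this exercise the same technique as for $\zeta(3)$ works with $f(x) = 1/x$: letting $n \to +\infty$ in the Euler–Maclaurin formula gives $\gamma \approx \sum_{k=1}^{m} \frac1k - \log m - \frac{1}{2m} + \sum_{j} \frac{B_{2j}(0)}{2j}\,m^{-2j}$. A sketch (of ours) of the resulting approximation:

```python
import math

def euler_gamma(m=20):
    # H_m - log m - 1/(2m) + B_2/2 m^-2 + B_4/4 m^-4 + B_6/6 m^-6
    H = sum(1.0 / k for k in range(1, m + 1))
    B = {1: 1/6, 2: -1/30, 3: 1/42}   # B_{2j}(0)
    g = H - math.log(m) - 1 / (2 * m)
    for j in (1, 2, 3):
        g += B[j] / (2 * j) / m**(2 * j)
    return g

print(euler_gamma())  # ~ 0.5772156649015329
```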
• Find a diagonal matrix $D$ and a lower triangular Toeplitz matrix $\tilde R$ of the following type
$$\tilde R = \sum_{k=0}^{+\infty} (\cdot)_k\,Z^{3k}, \qquad Z = \begin{pmatrix} 0 & & \\ 1 & 0 & \\ & \cdot & \cdot \end{pmatrix},$$
such that $RD = D\tilde R$, where $R$ is the matrix whose $11 \times 11$ upper-left part has been shown at the end of the first lecture (find $(\cdot)_k$ explicitly for all $k$).
SECOND LECTURE, Monday September 1, 2014 (Moscow, INM RAS)
Normalized Bernoulli numbers solve a lower triangular semi-infinite Toeplitz linear system
Note that
$$Z_{a_2,a_3,a_4,\dots} = \begin{pmatrix} 0 & & & \\ a_2 & 0 & & \\ & a_3 & 0 & \\ & & a_4 & \cdot \\ & & & \cdot \end{pmatrix}, \quad
Z_{a_2,a_3,a_4,\dots}^2 = \begin{pmatrix} 0 & & & \\ 0 & 0 & & \\ a_2 a_3 & 0 & 0 & \\ & a_3 a_4 & 0 & \cdot \\ & & a_4 a_5 & \cdot \\ & & & \cdot \end{pmatrix}, \quad
Z_{a_2,a_3,a_4,\dots}^3 = \begin{pmatrix} 0 & & & \\ 0 & 0 & & \\ 0 & 0 & 0 & \\ a_2 a_3 a_4 & 0 & 0 & \cdot \\ & a_3 a_4 a_5 & 0 & \cdot \\ & & a_4 a_5 a_6 & \cdot \\ & & & \cdot \end{pmatrix},$$
and, in general,
$$[Z_{a_2,a_3,a_4,\dots}^k]_{ij} = \begin{cases} a_{j+1} a_{j+2} \cdots a_{j+k} & \text{if } i = k + j,\ \ j = 1, 2, 3, \dots \\ 0 & \text{otherwise.} \end{cases}$$
Set
$$\varphi = Z_{2,12,30,56,\dots} = Z \cdot \mathrm{diag}\,((2i-1)2i : i = 1, 2, 3, \dots)$$
or, equivalently, $\varphi = Z_{a_2,a_3,a_4,\dots}$ where $a_r = (2r-3)(2r-2)$. Then, for $i = j + k$,
$$[\varphi^k]_{ij} = (2j-1)(2j)(2j+1)(2j+2) \cdots (2j+2k-3)(2j+2k-2) = \frac{(2j+2k-2)!}{(2j-2)!}.$$
Now observe that
$$W = (*), \qquad (*) = \mathrm{diag}\,(2, 12, 30, 56, \dots) \cdot \sum_{k=0}^{+\infty} \frac{1}{(2k+2)!}\,\varphi^k$$
[in fact,
$$[(*)]_{ij} = \frac{(2i)!}{(2i-2)!} \sum_{k=0}^{+\infty} \frac{1}{(2k+2)!}\,[\varphi^k]_{ij} = \frac{(2i)!}{(2i-2)!}\,\frac{1}{(2i-2j+2)!}\,[\varphi^{i-j}]_{ij} = \frac{(2i)!}{(2i-2)!}\,\frac{1}{(2i-2j+2)!}\,\frac{(2i-2)!}{(2j-2)!} = \binom{2i}{2j-2}\ ].$$
Thus the lower triangular system solved by the Bernoulli numbers can be rewritten as follows:
$$\sum_{k=0}^{+\infty} \frac{2}{(2k+2)!}\,\varphi^k\, b = q_e, \qquad b = \begin{pmatrix} B_0(0) \\ B_2(0) \\ B_4(0) \\ B_6(0) \\ \cdot \end{pmatrix}, \quad q_e = \begin{pmatrix} 1 \\ 1/3 \\ 1/5 \\ 1/7 \\ \cdot \end{pmatrix}. \qquad (**)$$
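The identity (**) can be checked on a finite section: truncating $\varphi$ to $N \times N$ makes it nilpotent, so the series becomes a finite sum. A sketch of ours in exact arithmetic:

```python
from fractions import Fraction as F
from math import factorial

N = 5
# phi = Z * diag((2i-1)2i : i = 1, 2, ...), i.e. subdiagonal 2, 12, 30, 56
phi = [[F(0)] * N for _ in range(N)]
for i in range(1, N):
    phi[i][i - 1] = F((2 * i - 1) * 2 * i)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

# M = sum_{k>=0} 2 phi^k / (2k+2)!  (finite sum, since phi^N = 0)
M = [[F(0)] * N for _ in range(N)]
P = [[F(int(i == j)) for j in range(N)] for i in range(N)]  # phi^0 = I
for k in range(N):
    for i in range(N):
        for j in range(N):
            M[i][j] += F(2, factorial(2 * k + 2)) * P[i][j]
    P = matmul(P, phi)

b = [F(1), F(1, 6), F(-1, 30), F(1, 42), F(-1, 30)]   # B_0(0), ..., B_8(0)
q = [sum(M[i][j] * b[j] for j in range(N)) for i in range(N)]
assert q == [F(1), F(1, 3), F(1, 5), F(1, 7), F(1, 9)]
```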
Set $D = \mathrm{diag}\,(d_1, d_2, d_3, \dots)$, $d_i \ne 0$. By investigating the nonzero entries of the matrix $D\varphi D^{-1}$, it is easy to observe that it can be forced to be equal to a matrix of the form $xZ$; just choose $d_k = x^{k-1} d_1 / (2k-2)!$, $k = 1, 2, 3, \dots$ ($d_1$ can be chosen equal to 1). So, if
$$D = \mathrm{diag}\,\Bigl(1,\ \frac{x}{2!},\ \frac{x^2}{4!},\ \cdot,\ \frac{x^{n-1}}{(2n-2)!},\ \cdot\Bigr),$$
then we have the equality $D\varphi D^{-1} = xZ$.
Now, since $D\varphi^k D^{-1} = (D\varphi D^{-1})^k = x^k Z^k$, it is easy to show the equivalence of (**) with the following lower triangular Toeplitz linear system:
$$\Bigl[\sum_{k=0}^{+\infty} \frac{2 x^k}{(2k+2)!}\,Z^k\Bigr] D b = D q_e.$$
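This Toeplitz form can also be checked directly on a finite section, for any sample rational $x$ (a sketch of ours; the Toeplitz entries are $2x^{i-j}/(2(i-j)+2)!$):

```python
from fractions import Fraction as F
from math import factorial

N, x = 5, F(3)                                            # any nonzero x works
D  = [x**k / F(factorial(2 * k)) for k in range(N)]       # d_{k+1} = x^k/(2k)!
b  = [F(1), F(1, 6), F(-1, 30), F(1, 42), F(-1, 30)]      # B_0(0), ..., B_8(0)
qe = [F(1, 2 * k + 1) for k in range(N)]                  # 1, 1/3, 1/5, ...

# [sum_k 2 x^k/(2k+2)! Z^k] (D b) compared with D q_e
Db = [D[i] * b[i] for i in range(N)]
lhs = [sum(F(2) * x**(i - j) / factorial(2 * (i - j) + 2) * Db[j]
           for j in range(i + 1)) for i in range(N)]
assert lhs == [D[i] * qe[i] for i in range(N)]
```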
Exercise . By investigating the powers of the matrix Z, verify that the matrix [·] is lower triangular Toeplitz.
The set of all ε-circulant matrices and, in particular, the set of all lower triangular Toeplitz matrices, form a low complexity matrix algebra
Consider the following two $n \times n$ matrices:
$$\Pi_\varepsilon = \begin{pmatrix} 0 & 1 & 0 & \cdot & 0 \\ \cdot & 0 & 1 & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & 0 \\ 0 & \cdot & \cdot & 0 & 1 \\ \varepsilon & 0 & \cdot & \cdot & 0 \end{pmatrix}, \ \varepsilon \in \mathbb{C}, \qquad J = \begin{pmatrix} & & 1 \\ & \cdot & \\ 1 & & \end{pmatrix}.$$
We want to study the algebra generated by $\Pi_\varepsilon$, i.e. $\mathcal{C}_\varepsilon = \{p(\Pi_\varepsilon) : p = \text{polynomials}\}$. By writing the powers $I, \Pi_\varepsilon, \Pi_\varepsilon^2, \dots, \Pi_\varepsilon^{n-1}$, one easily realizes that
$$C_\varepsilon(a) = \sum_{k=1}^{n} a_k \Pi_\varepsilon^{k-1} = \begin{pmatrix} a_1 & a_2 & a_3 & \cdot & a_n \\ \varepsilon a_n & a_1 & a_2 & \cdot & \cdot \\ \cdot & \varepsilon a_n & a_1 & \cdot & a_3 \\ \varepsilon a_3 & \cdot & \cdot & \cdot & a_2 \\ \varepsilon a_2 & \varepsilon a_3 & \cdot & \varepsilon a_n & a_1 \end{pmatrix}$$
(note that $\Pi_\varepsilon^n = \varepsilon I$).
ASSUME $\varepsilon \ne 0$ (the matrix algebra of all $n \times n$ $\varepsilon$-circulant matrices)
Eigenvalues of $\Pi_\varepsilon$:
$$\det(\lambda I - \Pi_\varepsilon) = \det \begin{pmatrix} \lambda & -1 & & & \\ 0 & \lambda & -1 & & \\ \cdot & \cdot & \cdot & \cdot & \\ 0 & & \cdot & & -1 \\ -\varepsilon & 0 & \cdot & 0 & \lambda \end{pmatrix} = \lambda\,\lambda^{n-1} + (-1)^{n+1}(-\varepsilon)(-1)^{n-1} = \lambda^n - \varepsilon$$
(the determinant has been computed by expansion along the first column). Thus, the eigenvalues of $\Pi_\varepsilon$ are the roots of the algebraic equation $\lambda^n - \varepsilon = 0$.
Eigenvectors of $\Pi_\varepsilon$:
If $\lambda$ is such that $\lambda^n = \varepsilon$, then
$$\begin{pmatrix} 0 & 1 & & \\ \cdot & 0 & \cdot & \\ 0 & & \cdot & 1 \\ \varepsilon & 0 & \cdot & 0 \end{pmatrix} \begin{pmatrix} 1 \\ \lambda \\ \cdot \\ \lambda^{n-1} \end{pmatrix} = \begin{pmatrix} \lambda \\ \cdot \\ \lambda^{n-1} \\ \varepsilon \end{pmatrix} = \lambda \begin{pmatrix} 1 \\ \lambda \\ \cdot \\ \lambda^{n-1} \end{pmatrix}.$$
Let $\lambda_0, \lambda_1, \dots, \lambda_{n-1}$ denote the eigenvalues of $\Pi_\varepsilon$ (note that they are distinct!). Then
$$\Pi_\varepsilon X = \begin{pmatrix} 0 & 1 & & \\ \cdot & 0 & \cdot & \\ 0 & & \cdot & 1 \\ \varepsilon & 0 & \cdot & 0 \end{pmatrix} \begin{pmatrix} 1 & \cdot & \cdot & 1 \\ \lambda_0 & \cdot & \cdot & \lambda_{n-1} \\ \cdot & \cdot & \cdot & \cdot \\ \lambda_0^{n-1} & \cdot & \cdot & \lambda_{n-1}^{n-1} \end{pmatrix} = \begin{pmatrix} 1 & \cdot & \cdot & 1 \\ \lambda_0 & \cdot & \cdot & \lambda_{n-1} \\ \cdot & \cdot & \cdot & \cdot \\ \lambda_0^{n-1} & \cdot & \cdot & \lambda_{n-1}^{n-1} \end{pmatrix} \begin{pmatrix} \lambda_0 & & & \\ & \lambda_1 & & \\ & & \cdot & \\ & & & \lambda_{n-1} \end{pmatrix} = X\Lambda.$$
Note that the matrix $X$ is invertible since its columns are eigenvectors of $\Pi_\varepsilon$ corresponding to distinct eigenvalues of $\Pi_\varepsilon$ (or since it is a Vandermonde matrix). Thus, from the above equality, we obtain the following spectral representations, of $\Pi_\varepsilon$:
$$\Pi_\varepsilon = X \Lambda X^{-1}$$
and of $C_\varepsilon(a)$, $a \in \mathbb{C}^n$:
$$C_\varepsilon(a) = \sum_{k=1}^{n} a_k \Pi_\varepsilon^{k-1} = X\,\Bigl(\sum_{k=1}^{n} a_k \Lambda^{k-1}\Bigr)\,X^{-1} = X\,\mathrm{diag}\,\Bigl(\sum_{k=1}^{n} a_k \lambda_i^{k-1},\ i = 0, 1, \dots, n-1\Bigr)\,X^{-1} = X\,d(X^T a)\,X^{-1}.$$
Remark. One can easily verify that the first row of $X\,d(X^T a)\,X^{-1}$ is indeed $a^T$:
$$e_1^T\,(X\,d(X^T a)\,X^{-1}) = (a^T X)\,d(X^T e_1)\,X^{-1} = a^T.$$
Remark. Set $\varepsilon = \rho_\varepsilon e^{i\theta_\varepsilon}$, $\theta_\varepsilon \in [0, 2\pi)$ ($\rho_\varepsilon > 0$). Then note that $[X]_{r,k} = \lambda_k^r = (\cdot)^r \omega_n^{rk}$, $0 \le r, k \le n-1$, that is,
$$X = \begin{pmatrix} 1 & & & \\ & (\cdot) & & \\ & & \cdot & \\ & & & (\cdot)^{n-1} \end{pmatrix} \begin{pmatrix} 1 & 1 & \cdot & 1 \\ 1 & \omega_n & \cdot & \omega_n^{n-1} \\ \cdot & \cdot & \cdot & \cdot \\ 1 & \omega_n^{n-1} & \cdot & \omega_n^{(n-1)(n-1)} \end{pmatrix},$$
where $\omega_n = e^{i\frac{2\pi}{n}}$ ($\omega_n^n = 1$, $\omega_n^i \ne 1$ if $0 < i < n$) and $(\cdot) = \rho_\varepsilon^{1/n} e^{i\theta_\varepsilon/n}$, so that $\lambda_k = (\cdot)\,\omega_n^k$, $k = 0, 1, \dots, n-1$.
Now let $F$ be the $n \times n$ Fourier matrix $F = \frac{1}{\sqrt n}\,(\omega_n^{ij})_{i,j=0}^{n-1}$. Note that $F$ is unitary, $F^H F = I$ (this is an exercise!), and that
$$X = \sqrt n\,D F, \qquad D_{ii} = (\cdot)^{i-1}, \quad i = 1, \dots, n.$$
Thus
$$X X^H = n\,D F F^H D^H = n \begin{pmatrix} 1 & & & \\ & |(\cdot)|^2 & & \\ & & \cdot & \\ & & & |(\cdot)|^{2(n-1)} \end{pmatrix},$$
i.e. the rows of $X$ are orthogonal. If $|\varepsilon| = 1$, then $|(\cdot)| = 1$ and $X X^H = nI$, i.e. the columns of $X$ are also orthogonal, and the matrix $\frac{1}{\sqrt n}\,X = DF$, $D_{kk} = e^{i\theta_\varepsilon(k-1)/n}$, $k = 1, \dots, n$, is unitary (because $D$ is unitary).
In other words, we have shown that if $|\varepsilon| = 1$ then the $\varepsilon$-circulant matrices are simultaneously diagonalized by a unitary transform, i.e. by $U = \frac{1}{\sqrt n}\,X = DF$. Actually, the condition $|\varepsilon| = 1$ is also necessary for $\mathcal{C}_\varepsilon$ to be diagonalized by a unitary transform. To prove this, simply observe that $\Pi_\varepsilon$ is a normal matrix if and only if $|\varepsilon| = 1$ (prove this!). Recall that $A \in \mathbb{C}^{n \times n}$ is normal ($A A^H = A^H A$) if and only if there exists $U$ unitary such that $U^H A U$ is diagonal.
In conclusion,
$$\varepsilon \in \mathbb{C} \ \Rightarrow\ C_\varepsilon(a) = D F\,d(\sqrt n\,F D a)\,F^H D^{-1},$$
$$|\varepsilon| = 1 \ \Rightarrow\ C_\varepsilon(a) = D F\,d(\sqrt n\,F D a)\,F^H D^H = U\,d(\sqrt n\,U^T a)\,U^H, \qquad U = DF \text{ unitary}.$$
From the above spectral representations of the matrices from $\mathcal{C}_\varepsilon$ it follows that $\mathcal{C}_\varepsilon$ is a space of low complexity, in the sense that
1) any matrix-vector product $C_\varepsilon(a)v$,
2) the solution of any linear system $C_\varepsilon(a)x = b$, and
3) the computation of the eigenvalues of any $C_\varepsilon(a)$
are all operations that can be done with $O(n \log_2 n)$ arithmetic operations. This is true for any value of $\varepsilon \ne 0$. (Check these assertions!)
Remark. Note however that the eigenvalue problem for matrices from $\mathcal{C}_\varepsilon$ is optimally conditioned if and only if $|\varepsilon| = 1$. This assertion is a consequence of the Bauer–Fike theorem, which states that for any eigenvalue $\tilde\lambda$ of $C_\varepsilon(a) + \Delta$ there is an eigenvalue $\lambda$ of $C_\varepsilon(a)$ such that
$$|\tilde\lambda - \lambda| \le \mu\,\|\Delta\|_2, \qquad \mu = \inf_{M :\, M^{-1} C_\varepsilon(a) M = \mathrm{diag}} \mu_2(M),$$
and of the fact that $\mu$ is equal to 1 if and only if $|\varepsilon| = 1$ (see the Exercise here below).
Exercise. Let $M$ be an $n \times n$ matrix. Prove that $\mu_2(M) = \|M\|_2 \|M^{-1}\|_2 = 1$ if and only if $M = cU$ for some unitary matrix $U$ and some $c \in \mathbb{C}$.
THIRD LECTURE, Tuesday September 2, 2014 (Moscow, INM RAS)
Exercise. Find a value of $x$ for which the entry $(Db)_n$ of the solution of the linear system $[\sum_{k=0}^{+\infty} 2x^k/(2k+2)!\,Z^k]\,Db = Dq_e$ remains bounded as $n \to +\infty$.
From yesterday, we have
$$C_\varepsilon(a) = \sum_{k=1}^{n} a_k \Pi_\varepsilon^{k-1} = D F\,d(\sqrt n\,F D a)\,F^H D^{-1} = D F\,d(\sqrt n\,F D a)\,(DF)^H,$$
where the last identity holds if and only if $|\varepsilon| = 1$ ($D_{kk} = (\cdot)^{k-1}$, $k = 1, \dots, n$, $(\cdot) = \rho_\varepsilon^{1/n} e^{i\theta_\varepsilon/n}$, $F_{ij} = \frac{1}{\sqrt n}\,\omega_n^{ij}$, $0 \le i, j \le n-1$, $\omega_n = e^{i\frac{2\pi}{n}}$, $\varepsilon = \rho_\varepsilon e^{i\theta_\varepsilon}$).
For $\varepsilon = 1$: $D = I$ $\Rightarrow$ $C_1(a) = F\,d(\sqrt n\,F a)\,F^H$. For $\varepsilon = -1$: $C_{-1}(a) = D F\,d(\sqrt n\,F D a)\,(DF)^H$, $D = \mathrm{diag}\,((e^{i\frac{\pi}{n}})^{k-1},\ k = 1, \dots, n)$.
1) $C_\varepsilon(a)v$: two or three FFTs, thus $O(n \log_2 n)$ arithmetic operations.
2) $C_\varepsilon(a)z = f$: $z = D F\,d(\sqrt n\,F D a)^{-1} F^H D^{-1} f$, as above.
3) eigenvalues of $C_\varepsilon(a)$: one FFT, as above.
ASSUME $\varepsilon = 0$ (the matrix algebra of all $n \times n$ upper triangular Toeplitz matrices)
Now we will prove that the operations in 1), 2), 3) can be done with $O(n \log_2 n)$ arithmetic operations also in the case $\varepsilon = 0$. Note that no representation of $C_0(a)$ of the type $M\,d(v)\,M^{-1}$ can hold since, for example, $\Pi_0$ is not diagonalizable.
Set
$$L(a) = C_0(a)^T = \begin{pmatrix} a_0 & & & \\ a_1 & a_0 & & \\ \cdot & a_1 & a_0 & \\ a_{n-1} & \cdot & a_1 & a_0 \end{pmatrix}.$$
3) The eigenvalues of $L(a)$: 0 computations.
1) $L(a)v$: $O(n \log_2 n)$ arithmetic operations (see the exercise here below).
Exercise. Prove that any Toeplitz matrix $T = (t_{i-j})_{i,j=1}^n$ can be written as $C_1(u) + C_{-1}(w)$ for suitable $u, w \in \mathbb{C}^n$.
2) $L(a)z = f$: $L(a)$ is not diagonalizable, since even $L(e_2)$ is not diagonalizable! However, the cost of solving a lower triangular Toeplitz system is $O(n \log_2 n)$ arithmetic operations. In the rest of the present lecture we shall prove this fact.
$L(a)z = f$, with $a, f \in \mathbb{C}^{\mathbb{N}}$ semi-infinite vectors, $a = (a_0\ \ a_1\ \ \cdot)^T$,
$$L(a) = \sum_{k=0}^{+\infty} a_k Z^k, \qquad Z = \begin{pmatrix} 0 & & \\ 1 & 0 & \\ & 1 & 0 \\ & & \cdot & \cdot \end{pmatrix}.$$
The space $\{L(a) : a \in \mathbb{C}^{\mathbb{N}}\}$ is closed under matrix multiplication and commutative. Moreover, it is closed under inversion, since $\{L(a) : a \in \mathbb{C}^{\mathbb{N}}\} = \{A : AZ = ZA\}$ (so $AZ = ZA \Rightarrow A^{-1}Z = ZA^{-1}$).
LEMMA 1. For $a, b \in \mathbb{C}^{\mathbb{N}}$: $L(a)b = c \iff L(a)L(b) = L(c)$.
Proof. $\Leftarrow$: check the first column. $\Rightarrow$: note that $L(a)L(b)$ is a lower triangular Toeplitz matrix and that its first column is $L(a)b$.
LEMMA 2. Set
$$E = \begin{pmatrix} 1 & 0 & 0 & \cdot \\ 0 & 0 & 0 & \cdot \\ 0 & 1 & 0 & \cdot \\ 0 & 0 & 0 & \cdot \\ 0 & 0 & 1 & \cdot \\ \cdot & \cdot & \cdot & \cdot \end{pmatrix}, \qquad Ev = [v_0\ 0\ v_1\ 0\ v_2\ 0\ \cdots]^T.$$
Then $E^s L(u)v = L(E^s u)\,E^s v$, $s \in \mathbb{N}$.
Proof. For $s = 0$ it is trivial. If we prove the thesis for $s = 1$, $E L(u)v = L(Eu)\,Ev$, then, by multiplying on the left by $E$, we obtain the thesis for all other $s$. But the fact that $E L(u)v = L(Eu)\,Ev$ follows from an easy check:
$$E L(u)v = E \begin{pmatrix} u_0 & & & \\ u_1 & u_0 & & \\ u_2 & u_1 & u_0 & \\ \cdot & \cdot & \cdot & \cdot \end{pmatrix} \begin{pmatrix} v_0 \\ v_1 \\ v_2 \\ \cdot \end{pmatrix} = \begin{pmatrix} u_0 v_0 \\ 0 \\ u_1 v_0 + u_0 v_1 \\ 0 \\ u_2 v_0 + u_1 v_1 + u_0 v_2 \\ 0 \\ \cdot \end{pmatrix},$$
$$L(Eu)\,Ev = \begin{pmatrix} u_0 & & & & \\ 0 & u_0 & & & \\ u_1 & 0 & u_0 & & \\ 0 & u_1 & 0 & u_0 & \\ u_2 & 0 & u_1 & 0 & u_0 \\ \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix} \begin{pmatrix} v_0 \\ 0 \\ v_1 \\ 0 \\ v_2 \\ \cdot \end{pmatrix} = \begin{pmatrix} u_0 v_0 \\ 0 \\ u_1 v_0 + u_0 v_1 \\ 0 \\ u_2 v_0 + u_1 v_1 + u_0 v_2 \\ 0 \\ \cdot \end{pmatrix}.$$
$L(a)z = f$, $a \in \mathbb{C}^{\mathbb{N}}$, $f \in \mathbb{C}^{\mathbb{N}}$.
STEP 1. Find $\hat a \in \mathbb{C}^{\mathbb{N}}$ such that $L(a)\hat a = E a^{(1)}$ for some $a^{(1)} \in \mathbb{C}^{\mathbb{N}}$. Then, by Lemma 1 and commutativity, $L(\hat a)L(a) = L(a)L(\hat a) = L(E a^{(1)})$.
Multiply the system on the left by $L(\hat a)$:
$$L(\hat a)L(a)z = L(\hat a)f, \qquad L(E a^{(1)})z = L(\hat a)f. \quad (*)$$
Note that
$$L(E a^{(1)}) = \begin{pmatrix} a^{(1)}_0 & & & & \\ 0 & a^{(1)}_0 & & & \\ a^{(1)}_1 & 0 & a^{(1)}_0 & & \\ 0 & a^{(1)}_1 & 0 & \cdot & \\ a^{(1)}_2 & 0 & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix}.$$
STEP 2. Find $\hat a^{(1)} \in \mathbb{C}^{\mathbb{N}}$ such that $L(a^{(1)})\hat a^{(1)} = E a^{(2)}$ for some $a^{(2)} \in \mathbb{C}^{\mathbb{N}}$. Then, by Lemma 2, $L(E a^{(1)})\,E\hat a^{(1)} = E L(a^{(1)})\hat a^{(1)} = E^2 a^{(2)}$, and, by Lemma 1 and commutativity,
$$L(E\hat a^{(1)})L(E a^{(1)}) = L(E a^{(1)})L(E\hat a^{(1)}) = L(E^2 a^{(2)}).$$
Multiply the system (*) on the left by $L(E\hat a^{(1)})$:
$$L(E\hat a^{(1)})L(E a^{(1)})z = L(E\hat a^{(1)})L(\hat a)f, \qquad L(E^2 a^{(2)})z = L(E\hat a^{(1)})L(\hat a)f. \quad (**)$$
Note that
$$L(E^2 a^{(2)}) = \begin{pmatrix} a^{(2)}_0 & & & & & \\ 0 & a^{(2)}_0 & & & & \\ 0 & 0 & a^{(2)}_0 & & & \\ 0 & 0 & 0 & a^{(2)}_0 & & \\ a^{(2)}_1 & 0 & 0 & 0 & a^{(2)}_0 & \\ 0 & a^{(2)}_1 & 0 & 0 & 0 & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix}.$$
STEP 3. Find $\hat a^{(2)} \in \mathbb{C}^{\mathbb{N}}$ such that $L(a^{(2)})\hat a^{(2)} = E a^{(3)}$ for some $a^{(3)} \in \mathbb{C}^{\mathbb{N}}$. Then, by Lemma 2, $L(E^2 a^{(2)})\,E^2\hat a^{(2)} = E^2 L(a^{(2)})\hat a^{(2)} = E^3 a^{(3)}$, and, by Lemma 1 and commutativity,
$$L(E^2 \hat a^{(2)})L(E^2 a^{(2)}) = L(E^2 a^{(2)})L(E^2 \hat a^{(2)}) = L(E^3 a^{(3)}).$$
Multiply the system (**) on the left by $L(E^2 \hat a^{(2)})$:
$$L(E^2 \hat a^{(2)})L(E^2 a^{(2)})z = L(E^2 \hat a^{(2)})L(E\hat a^{(1)})L(\hat a)f, \qquad L(E^3 a^{(3)})z = L(E^2 \hat a^{(2)})L(E\hat a^{(1)})L(\hat a)f.$$
Note that $L(E^3 a^{(3)})$ has $a^{(3)}_0$ on the main diagonal, and its next nonzero diagonal (with entries $a^{(3)}_1$) is the eighth subdiagonal.
Assume now that the lower triangular Toeplitz system we have to solve is made up of 8 equations. Then we stop the process here, and we note that, by commutativity,
$$L(E^3 a^{(3)})z = L(\hat a)L(E\hat a^{(1)})L(E^2 \hat a^{(2)})f.$$
Assume moreover that $f = E^2 v$ for some vector $v \in \mathbb{C}^{\mathbb{N}}$ (important: note that this assumption is satisfied by $f = e_1$). Then, by Lemma 2,
$$L(E^3 a^{(3)})z = L(\hat a)L(E\hat a^{(1)})L(E^2 \hat a^{(2)})E^2 v = L(\hat a)L(E\hat a^{(1)})E^2 L(\hat a^{(2)})v = L(\hat a)\,E\,L(\hat a^{(1)})\,E\,L(\hat a^{(2)})v,$$
and this equality between semi-infinite vectors implies the following equality between 8-dimensional vectors:
$$a^{(3)}_0 \{z\}_8 = \{L(\hat a)\}_{8,8}\{E\}_{8,8}\{L(\hat a^{(1)})\}_{8,8}\{E\}_{8,8}\{L(\hat a^{(2)})\}_{8,8}\{v\}_8 = \{L(\hat a)\}_{8,8}\{E\}_{8,4}\{L(\hat a^{(1)})\}_{4,4}\{E\}_{4,2}\{L(\hat a^{(2)})\}_{2,2}\{v\}_2$$
[the last four columns of $\{E\}_{8,8}$ are null; the last two columns of $\{E\}_{4,4}$ are null].
The latter result lets us state that the first column of the inverse of an $8 \times 8$ lower triangular Toeplitz matrix can be computed by performing two matrix-vector products, where the matrices are lower triangular Toeplitz of orders $4 \times 4$ and $8 \times 8$, respectively (set $v = e_1$). In the general case in which $n = 2^s$, the lower triangular Toeplitz matrices that have to be multiplied by vectors are of orders $4 \times 4$, $8 \times 8$, ..., $2^s \times 2^s$, respectively. So, we have the following
Result: If $O(j 2^j)$ is the cost of computing the product of a $2^j \times 2^j$ lower triangular Toeplitz matrix by a vector, then the cost of the computation of the first column of the inverse of a $2^s \times 2^s$ lower triangular Toeplitz matrix by the above described algorithm is $\sum_{j=2}^{s} O(j 2^j) = O(s 2^s) = O(n \log_2 n)$.
Finally, once $\{\tilde z\}_{2^s} = \{L(a)\}_{2^s,2^s}^{-1}\{e_1\}_{2^s}$ is known, in order to solve the system $\{L(a)\}_{2^s,2^s}\{z\}_{2^s} = \{f\}_{2^s}$, note that
$$\{z\}_{2^s} = \{L(a)\}_{2^s,2^s}^{-1}\{f\}_{2^s} = \{L(\tilde z)\}_{2^s,2^s}\{f\}_{2^s},$$
and that $\{L(\tilde z)\}_{2^s,2^s}\{f\}_{2^s}$ can be computed with $O(s 2^s) = O(n \log_2 n)$ arithmetic operations (via the representation of $\{L(\tilde z)\}_{2^s,2^s}$ as the sum of a circulant and a $(-1)$-circulant matrix).
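A compact sketch of the reduction just described, written recursively (our own formulation, not verbatim from the lecture): one step maps $L(a)z = f$ into $L(Eb)z = L(\hat a)f$ with $\hat a(z) = a(-z)$ and $b(z^2) = \hat a(z)a(z)$, and since $L(Eb)$ couples even-indexed and odd-indexed unknowns separately, the system splits into two half-size lower triangular Toeplitz systems. Plain `np.convolve` is used for the polynomial products, so this sketch is not $O(n\log_2 n)$; FFT-based products and the $f = e_1$ refinement above recover the stated bound.

```python
import numpy as np

def polymul_trunc(u, v, n):
    # product of the power series u(z) v(z), truncated to degree < n
    return np.convolve(u, v)[:n]

def ltt_solve(a, f):
    """Solve the n x n lower triangular Toeplitz system L(a) z = f,
    a = first column (a[0] != 0), n a power of two."""
    n = len(a)
    if n == 1:
        return f / a[0]
    ahat = a * (-1.0) ** np.arange(n)        # ahat(z) = a(-z)
    b = polymul_trunc(ahat, a, n)[0::2]      # b(z^2) = a(-z) a(z): even part
    g = polymul_trunc(ahat, f, n)            # new right-hand side L(ahat) f
    z = np.empty(n)
    z[0::2] = ltt_solve(b, g[0::2])          # even-indexed unknowns
    z[1::2] = ltt_solve(b, g[1::2])          # odd-indexed unknowns
    return z

n = 8
a = np.array([1.0, 2.0, -1.0, 0.5, 3.0, 1.0, -2.0, 1.0])
L = np.array([[a[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)])
f = np.arange(1.0, n + 1)
assert np.allclose(L @ ltt_solve(a, f), f)
```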
Important remark (for the proof of the above result). How do we find $\hat a$ such that $L(a)\hat a = E a^{(1)}$ for some $a^{(1)} \in \mathbb{C}^{\mathbb{N}}$? No computation is required to obtain such a vector $\hat a$; in fact (taking $a_0 = 1$),
$$\begin{pmatrix} 1 & & & & \\ a_1 & 1 & & & \\ a_2 & a_1 & 1 & & \\ a_3 & a_2 & a_1 & 1 & \\ \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix} \begin{pmatrix} 1 \\ -a_1 \\ a_2 \\ -a_3 \\ \cdot \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 2a_2 - a_1^2 \\ 0 \\ \cdot \end{pmatrix}. \qquad (\mathrm{RES})$$
FINAL REMARK.
Finding $\hat a$ such that $L(a)\hat a = E a^{(1)}$ is equivalent to finding $\hat a$ such that $L(\hat a)L(a) = L(E a^{(1)})$ (by Lemma 1 and by commutativity), i.e. to finding the $\hat a_k$ such that
$$\Bigl(\sum_{k=0}^{+\infty} \hat a_k Z^k\Bigr)\Bigl(\sum_{k=0}^{+\infty} a_k Z^k\Bigr) = \sum_{k=0}^{+\infty} a^{(1)}_k Z^{2k}.$$
More in general, how do we find the $\hat a_k$ such that
$$\Bigl(\sum_{k=0}^{+\infty} \hat a_k Z^k\Bigr)\Bigl(\sum_{k=0}^{+\infty} a_k Z^k\Bigr) = \sum_{k=0}^{+\infty} a^{(1)}_k Z^{rk},$$
or, equivalently, given a polynomial $a(z)$, how do we find a polynomial $\hat a(z)$ such that $\hat a(z)a(z) = a^{(1)}(z^r)$ for some polynomial $a^{(1)}$? An answer is in the following
Exercise. Prove that $\hat a(z) = a(zt)\,a(zt^2) \cdots a(zt^{r-1})$, where $t^r = 1$, $t^j \ne 1$ for $0 < j < r$, is such that $\hat a(z)a(z) = a^{(1)}(z^r)$.
For example, for r = 2 we have ˆ a(z) = a(−z), and we retrieve the result observed in (RES).
Exercise (not for evaluation). If $Ev = [v_0\ 0\ 0\ v_1\ 0\ 0\ v_2\ 0\ \cdots]^T$, then analogous Lemmas 1 and 2 hold, and an analogous algorithm for $n = 3^s$ of complexity $O(s 3^s)$ can be conceived . . .
FOURTH LECTURE, Monday September 8, 2014 (Rome)
Hessenberg algebras and displacement matrix formulas
Exercise 1. Prove the following assertion:
If $B$ is $n \times n$ real symmetric positive definite, and $y, s \in \mathbb{R}^n$ are such that $y^T s > 0$, then
$$A = B + \frac{1}{y^T s}\,y y^T - \frac{1}{s^T B s}\,B s s^T B$$
is real symmetric positive definite.
Exercise 2. Prove that eigenvectors $x$ and $y$ corresponding to distinct eigenvalues $\lambda$ and $\mu$ of a Hermitian (or unitary) matrix $A$ ($Ax = \lambda x$, $Ay = \mu y$) are such that $x^H y = 0$.
Exercise 3. Find values of $\alpha$ and $\beta$ such that $\sum_{i=1}^n \sin^2 \frac{ij\pi}{n+1} = \alpha n + \beta$ (for all $j = 1, 2, \dots, n$).
Hessenberg algebras.
$$X = \begin{pmatrix} r_{11} & b_1 & & \\ r_{21} & r_{22} & b_2 & \\ \cdot & \cdot & \cdot & b_{n-1} \\ r_{n1} & \cdot & \cdot & r_{nn} \end{pmatrix}, \qquad b_i \ne 0,\ \forall\, i. \qquad (\mathrm{Hess})$$
Note that the following matrices
$$J_1 = X^0 = I, \quad J_2 = \frac{1}{b_1}(X - r_{11} I), \quad J_3 = \frac{1}{b_2}(J_2 X - r_{21} I - r_{22} J_2), \quad \dots, \quad J_n = \frac{1}{b_{n-1}}(J_{n-1} X - r_{n-1,1} I - r_{n-1,2} J_2 - \dots - r_{n-1,n-1} J_{n-1})$$
are polynomials in $X$ (of degree $0, 1, \dots, n-1$) and have as first row, respectively, $e_1^T, e_2^T, e_3^T, \dots, e_n^T$. Then consider the set $H_X$ of all polynomials in $X$. Note that its dimension is less than or equal to $n$, since by the Cayley–Hamilton theorem $X^n$ must be a linear combination of the previous powers of $X$. But there are $n$ matrices in $H_X$ which are linearly independent, the $J_k$. So, the dimension of $H_X$ is $n$,
$$H_X = \{p(X) : p = \text{polynomials}\} = \mathrm{Span}\,\{J_1, J_2, \dots, J_n\} = \{A \in \mathbb{C}^{n \times n} : AX = XA\},$$
and any matrix of $H_X$ is uniquely determined by its first row, i.e., given any vector $a$ in $\mathbb{C}^n$, the matrix of $H_X$ with first row $a^T$ is well defined:
$$H_X(a) = \sum_{k=1}^{n} a_k J_k, \qquad e_1^T H_X(a) = a^T.$$
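The recurrence for the $J_k$ and the stated properties are easy to check numerically. A sketch of ours on a pseudo-random lower Hessenberg $X$ (with the notation above: $r_{ij}$ in the lower part, nonzero $b_i$ on the superdiagonal):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
X = np.tril(rng.standard_normal((n, n)))   # the r_ij, i >= j
for i in range(n - 1):
    X[i, i + 1] = 1.0 + rng.random()       # b_i != 0 on the superdiagonal

J = [np.eye(n)]                            # J_1 = I
for k in range(1, n):
    # J_{k+1} = (J_k X - sum_{j=1}^{k} r_{k,j} J_j) / b_k
    M = J[k - 1] @ X - sum(X[k - 1, j] * J[j] for j in range(k))
    J.append(M / X[k - 1, k])

for k in range(n):
    e = np.zeros(n); e[k] = 1.0
    assert np.allclose(J[k][0], e)         # first row of J_{k+1} is e_{k+1}^T
    assert np.allclose(J[k] @ X, X @ J[k]) # each J_k commutes with X
```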
Of course, the space $H_X$ is closed under matrix multiplication ($H_X$ is a matrix algebra), and is commutative. In particular, we have $J_k J_s = J_s J_k$ (for all $k, s$), and thus
$$e_i^T H_X(a) = e_i^T \sum_{k=1}^{n} a_k J_k = \sum_{k=1}^{n} a_k\,e_k^T J_i = a^T J_i, \qquad \forall\, i,$$
and, multiplying the previous identity by the scalar $v_i$ and summing on $i = 1, \dots, n$, we obtain the equality
$$v^T H_X(a) = a^T H_X(v), \qquad a, v \in \mathbb{C}^n.$$
Note that if a matrix $H_X(a)$ in $H_X$ is invertible, then its inverse must be a polynomial in $X$, i.e. there must exist a vector $z \in \mathbb{C}^n$ such that $H_X(a)^{-1} = H_X(z)$ (see below). From the equality $H_X(z)H_X(a) = I$ it follows that $z$ can be determined by solving the linear system $z^T H_X(a) = e_1^T$.
The space of $\varepsilon$-circulant matrices (for any value of $\varepsilon$) is of course an example of a Hessenberg algebra. We will also study another important example of Hessenberg algebra, the $\tau$ matrix algebra.
Now we want to state an example of a general displacement formula involving Hessenberg algebras: given any matrix $A$, we represent it as a sum of products of the type $M_i N_i$, where $M_i$ and $N_i$ are matrices of two general Hessenberg algebras $H_X$ and $H_{X'}$. The number of summands in such a sum is $\alpha + 1$, where $\alpha$ is the rank of $AX - XA$. Such a formula for $A$ can be extremely useful if $\alpha$ does not depend on the order $n$ of the matrix $A$.
Lemma. Let $A$ be $n \times n$ with complex entries. If $AX - XA = \sum_{m=1}^{\alpha} x_m y_m^T$ ($x_m, y_m \in \mathbb{C}^n$), then $\sum_{m=1}^{\alpha} x_m^T\,p(X^T)\,y_m = 0$ for any polynomial $p$.
Proof.
$$\sum_{m=1}^{\alpha} x_m^T\,p(X^T)\,y_m = \sum_{m=1}^{\alpha} \sum_{i,j=1}^{n} (x_m)_i\,(p(X^T))_{ij}\,(y_m)_j = \sum_{i,j=1}^{n} \Bigl(\sum_{m=1}^{\alpha} (x_m)_i (y_m)_j\Bigr)(p(X^T))_{ij} = \sum_{i,j=1}^{n} (AX - XA)_{ij}\,(p(X^T))_{ij}$$
$$= \sum_{i,j=1}^{n} \sum_k A_{ik} X_{kj}\,(p(X^T))_{ij} - \sum_{i,j=1}^{n} \sum_k X_{ik} A_{kj}\,(p(X^T))_{ij} = \sum_{i,k=1}^{n} A_{ik} \sum_j (p(X^T))_{ij}\,(X^T)_{jk} - \sum_{k,j=1}^{n} A_{kj} \sum_i (X^T)_{ki}\,(p(X^T))_{ij}$$
$$= \sum_{i,k=1}^{n} A_{ik}\,(p(X^T) X^T)_{ik} - \sum_{k,j=1}^{n} A_{kj}\,(X^T p(X^T))_{kj} = 0.$$
The displacement formula obtained in the following theorem involves persymmetric Hessenberg algebras. Such a formula, and the fact that $\mathrm{rank}(AX - XA) = 2$ for $A$ Toeplitz and $X = \Pi_\varepsilon$, will be used to obtain an efficient representation of the inverse of a Toeplitz matrix, due to Ammar and Gader.
Theorem. Let the lower Hessenberg matrix $X$ be persymmetric, i.e. symmetric with respect to the antidiagonal ($X^T = JXJ$), and define $X' \in \mathbb{C}^{n \times n}$ and $\beta \in \mathbb{C}$ by the following identity:
$$X = X' + (r_{n1} - \beta)\,e_n e_1^T.$$
Let $A$ be $n \times n$ with complex entries. If $AX - XA = \sum_{m=1}^{\alpha} x_m y_m^T$ ($x_m, y_m \in \mathbb{C}^n$), then
$$(r_{n1} - \beta)\,A = -\sum_{m=1}^{\alpha} H_X(\hat x_m)\,H_{X'}(y_m) + (r_{n1} - \beta)\,H_X(J A e_n)$$
(for $v \in \mathbb{C}^n$ the symbol $\hat v$ means $Jv$).
Exercise (not for evaluation). Prove that if instead $X$ has a Toeplitz structure, then, under the same assumption $AX - XA = \sum_{m=1}^{\alpha} x_m y_m^T$, we have $bA = -\sum_{m=1}^{\alpha} L(Z x_m)\,H_X(y_m) + b\,H_X(A^T e_1)$, where $b = b_i$ and $L(z) = C_0(z)^T$ (for $X = Z^T$ this result yields the famous Gohberg–Semencul formula).
Proof. Set $(*) = -\sum_{m=1}^{\alpha} H_X(\hat x_m)\,H_{X'}(y_m)$. Then
$$(*)X - X(*) = -\sum_{m=1}^{\alpha} H_X(\hat x_m)\bigl[H_{X'}(y_m)X - XH_{X'}(y_m)\bigr] = -(r_{n1} - \beta)\sum_{m=1}^{\alpha} H_X(\hat x_m)\bigl[H_{X'}(y_m)\,e_n e_1^T - e_n e_1^T\,H_{X'}(y_m)\bigr]$$
$$= -(r_{n1} - \beta)\sum_{m=1}^{\alpha} H_X(\hat x_m)\bigl[\hat y_m e_1^T - e_n y_m^T\bigr] = (r_{n1} - \beta)\sum_{m=1}^{\alpha} x_m y_m^T = (r_{n1} - \beta)(AX - XA).$$
Thus, $(r_{n1} - \beta)A - (*)$ must commute with $X$, i.e. it must be a polynomial in $X$:
$$(r_{n1} - \beta)A - (*) = H_X(z), \qquad z \in \mathbb{C}^n.$$
By imposing that the last columns of the left-hand and right-hand matrices in the previous equality are equal, one obtains that the vector $z$ is defined by the identity $Jz = (r_{n1} - \beta)Ae_n$.
Note that in the proof we have used twice the fact that $\sum_{m=1}^{\alpha} H_X(\hat x_m)\,\hat y_m$ is the null vector; in fact,
$$e_i^T\Bigl(\sum_{m=1}^{\alpha} H_X(\hat x_m)\,\hat y_m\Bigr) = \sum_{m=1}^{\alpha} \hat x_m^T J_i\,\hat y_m = \sum_{m=1}^{\alpha} \cdots$$