Rome-Moscow school of Matrix Methods and Applied Linear Algebra
Rome, Sept 19 - Oct 3, 2010; Moscow, Oct 10 - Oct 24, 2010

Carmine Di Fiore

In these notes the concepts of circulants, $\tau$ and Toeplitz matrices, Hessenberg algebras and displacement decompositions, spaces of class V and best least squares fits on such spaces, are introduced and investigated. As a consequence of the results presented, the choice of matrices involved in displacement decompositions, the choice of preconditioners in solving linear systems, and the choice of Hessian approximations in quasi-Newton minimization methods become possible in wider classes of low complexity matrix algebras.
The Fourier matrix, circulants, and fast discrete transforms

Consider the following $n \times n$ matrix
$$
P_1 = \begin{pmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
1 & & & & 0
\end{pmatrix}.
$$
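To get a feel for how $P_1$ acts, here is a minimal numerical sketch (Python with NumPy; the snippet is illustrative and not part of the original notes). It builds $P_1$ and shows that it shifts the entries of a vector cyclically upwards:

```python
import numpy as np

n = 5
# P_1: ones on the superdiagonal and a single 1 in position (n, 1)
P1 = np.zeros((n, n))
P1[np.arange(n - 1), np.arange(1, n)] = 1
P1[n - 1, 0] = 1

x = np.arange(1.0, n + 1)
# (P_1 x)_i = x_{i+1}, (P_1 x)_n = x_1: the entries are shifted cyclically
print(P1 @ x)   # [2. 3. 4. 5. 1.]
```

This cyclic-shift action is exactly why the vectors $(1, \omega^j, \ldots, \omega^{(n-1)j})^T$ below are eigenvectors of $P_1$.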
Let $\omega \in \mathbb{C}$. Note that
$$
P_1 \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
= 1 \cdot \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad
P_1 \begin{pmatrix} 1 \\ \omega \\ \vdots \\ \omega^{n-1} \end{pmatrix}
= \begin{pmatrix} \omega \\ \omega^2 \\ \vdots \\ \omega^{n-1} \\ 1 \end{pmatrix}
= \omega \begin{pmatrix} 1 \\ \omega \\ \vdots \\ \omega^{n-1} \end{pmatrix},
$$
where the latter identity holds if $\omega^n = 1$. More generally, if $\omega^n = 1$, we have the following vector identities
$$
P_1 \begin{pmatrix} 1 \\ \omega^j \\ \vdots \\ \omega^{(n-1)j} \end{pmatrix}
= \begin{pmatrix} \omega^j \\ \vdots \\ \omega^{(n-1)j} \\ 1 \end{pmatrix}
= \omega^j \begin{pmatrix} 1 \\ \omega^j \\ \vdots \\ \omega^{(n-1)j} \end{pmatrix},
\qquad j = 0, 1, \ldots, n-1,
$$
or, equivalently, the following matrix identity $P_1 W = W D_{1,\omega,\ldots,\omega^{n-1}}$,
$$
D_{1,\omega,\ldots,\omega^{n-1}} = \begin{pmatrix}
1 & & & \\
 & \omega & & \\
 & & \ddots & \\
 & & & \omega^{n-1}
\end{pmatrix}, \qquad
W = \begin{pmatrix}
1 & 1 & \cdots & 1 & \cdots & 1 \\
1 & \omega & \cdots & \omega^j & \cdots & \omega^{n-1} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & \omega^{n-1} & \cdots & \omega^{(n-1)j} & \cdots & \omega^{(n-1)(n-1)}
\end{pmatrix}.
$$
Proposition. If $\omega^n = 1$ and $\omega^j \neq 1$ for $0 < j < n$, then $W^* W = nI$.

proof: Since $W$ is symmetric, $W^* = \overline{W}$, and since $|\omega| = 1$, $\overline{\omega} = \omega^{-1}$. Hence
$$
[W^* W]_{ij} = [\overline{W} W]_{ij}
= \sum_{k=1}^n \overline{[W]_{ik}}\,[W]_{kj}
= \sum_{k=1}^n \overline{\omega}^{(i-1)(k-1)} \omega^{(k-1)(j-1)}
= \sum_{k=1}^n \omega^{(k-1)(j-i)}
= \sum_{k=1}^n (\omega^{j-i})^{k-1}.
$$
Thus $[W^* W]_{ij} = n$ if $i = j$, and
$$
[W^* W]_{ij} = \frac{1 - (\omega^{j-i})^n}{1 - \omega^{j-i}} = 0 \quad \text{if } i \neq j
$$
(note that the assumption $\omega^j \neq 1$ for $0 < j < n$ is essential in order to make $1 - \omega^{j-i} \neq 0$).
By the result of the above Proposition, we can say that the following (symmetric) Fourier matrix
$$
F = \frac{1}{\sqrt{n}}\, W
$$
is unitary, i.e. $F^* F = I$.
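Both the Proposition and the diagonalization below are easy to check numerically. The following sketch (Python with NumPy, with the standard choice $\omega = e^{-i2\pi/n}$, which satisfies the hypotheses of the Proposition) verifies $W^*W = nI$, the unitarity of $F$, and, using $P_1 W = W D_{1,\omega,\ldots,\omega^{n-1}}$ from above, that $F$ diagonalizes $P_1$:

```python
import numpy as np

n = 8
omega = np.exp(-2j * np.pi / n)     # ω^n = 1 and ω^j ≠ 1 for 0 < j < n
W = omega ** np.outer(np.arange(n), np.arange(n))
F = W / np.sqrt(n)

print(np.allclose(W.conj().T @ W, n * np.eye(n)))   # True: W*W = nI
print(np.allclose(F.conj().T @ F, np.eye(n)))       # True: F is unitary

# P_1 = F D F*, with D = diag(1, ω, ..., ω^{n-1})
P1 = np.roll(np.eye(n), -1, axis=0)
D = np.diag(omega ** np.arange(n))
print(np.allclose(P1, F @ D @ F.conj().T))          # True
```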
Exercise. Prove that $F^2 = J P_1$, where $J$ is the permutation matrix defined by $J e_k = e_{n+1-k}$, $k = 1, \ldots, n$ ($J$ is usually called the anti-identity).
The matrix identity satisfied by $P_1$ and $W$ can of course be rewritten in terms of $F$: $P_1 F = F D_{1,\omega,\ldots,\omega^{n-1}}$. Thus we obtain the equality
$$
P_1 = F D_{1,\omega,\ldots,\omega^{n-1}} F^*,
$$
which states that the Fourier matrix diagonalizes the matrix $P_1$, or, more precisely, that the columns of the Fourier matrix form a system of $n$ orthonormal eigenvectors of $P_1$ with corresponding eigenvalues $1, \omega, \ldots, \omega^{n-1}$.
But if $F$ diagonalizes $P_1$, then it diagonalizes all polynomials in $P_1$:
$$
P_1^{k-1} = F D_{1,\omega,\ldots,\omega^{n-1}}^{k-1} F^*,
$$
$$
\sum_{k=1}^n a_k P_1^{k-1}
= F \Bigl( \sum_{k=1}^n a_k D_{1,\omega,\ldots,\omega^{n-1}}^{k-1} \Bigr) F^*
= F \begin{pmatrix}
\sum_{k=1}^n a_k & & & \\
 & \sum_{k=1}^n a_k \omega^{k-1} & & \\
 & & \ddots & \\
 & & & \sum_{k=1}^n a_k \omega^{(n-1)(k-1)}
\end{pmatrix} F^*
= F\, d(W a)\, F^* = \sqrt{n}\, F\, d(F a)\, F^*,
$$
where by $d(z)$ we mean the diagonal matrix whose diagonal entries are $z_1, z_2, \ldots, z_n$.

Let us investigate the matrices $P_1^{k-1}$, $k = 1, \ldots, n$, and the matrix $\sum_{k=1}^n a_k P_1^{k-1}$ in the case $n = 4$:
$$
P_1^0 = I, \quad
P_1^1 = \begin{pmatrix} 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \\ 1&0&0&0 \end{pmatrix}, \quad
P_1^2 = \begin{pmatrix} 0&0&1&0 \\ 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \end{pmatrix}, \quad
P_1^3 = \begin{pmatrix} 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \end{pmatrix},
$$
$$
P_1^4 = P_1^3 P_1 = P_1^T P_1 = I = P_1^0,
$$
$$
\sum_{k=1}^4 a_k P_1^{k-1}
= \begin{pmatrix}
a_1 & a_2 & a_3 & a_4 \\
a_4 & a_1 & a_2 & a_3 \\
a_3 & a_4 & a_1 & a_2 \\
a_2 & a_3 & a_4 & a_1
\end{pmatrix}
= \sqrt{4}\, F\, d(F a)\, F^*, \qquad
F = \frac{1}{\sqrt{4}} \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & \omega & \omega^2 & \omega^3 \\
1 & \omega^2 & \omega^4 & \omega^6 \\
1 & \omega^3 & \omega^6 & \omega^9
\end{pmatrix},
$$
$$
\omega^4 = 1, \quad \omega^j \neq 1, \ 0 < j < 4 \quad (\omega = e^{\pm i 2\pi/4}).
$$
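The $n = 4$ identity above can be verified numerically; a minimal sketch (Python with NumPy, taking $\omega = e^{-i2\pi/4}$, one of the two admissible choices):

```python
import numpy as np

n = 4
omega = np.exp(-2j * np.pi / n)           # ω = e^{-i2π/4}
W = omega ** np.outer(np.arange(n), np.arange(n))
F = W / np.sqrt(n)
P1 = np.roll(np.eye(n), -1, axis=0)       # the 4×4 matrix P_1 above

a = np.array([1.0, 2.0, 3.0, 4.0])
C = sum(a[k] * np.linalg.matrix_power(P1, k) for k in range(n))
# the circulant with first row (a_1, a_2, a_3, a_4) equals √4 F d(Fa) F*
print(np.allclose(C, np.sqrt(n) * F @ np.diag(F @ a) @ F.conj().T))   # True
```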
Note that, for generic $n$, we have the identities $e_1^T P_1^{k-1} = e_k^T$, $k = 1, \ldots, n$, and $P_1^n = I$ (prove them!). So, the set $C = \{p(P_1)\}$ of all polynomials in $P_1$ is spanned by the matrices $J_k = P_1^{k-1}$; the particular polynomial $\sum_{k=1}^n a_k J_k$ is simply denoted by $C(a)$. Note that $C(a)$ is the matrix of $C$ with first row $a^T$:
$$
C(a) = \sum_{k=1}^n a_k J_k
= \begin{pmatrix}
a_1 & a_2 & \cdots & a_{n-1} & a_n \\
a_n & a_1 & \cdots & & a_{n-1} \\
\vdots & & \ddots & & \vdots \\
a_3 & & & & a_2 \\
a_2 & a_3 & \cdots & a_n & a_1
\end{pmatrix}
= F\, d(F^T a)\, d(F^T e_1)^{-1} F^{-1}.
$$
$C$ is known as the space of circulant matrices.
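Since $C(a)$ is diagonalized by $F$, the product $C(a)z$ reduces to discrete Fourier transforms, which NumPy's FFT routines compute. A sketch (illustrative code, not part of the original notes; it uses the standard identity that a circulant with first column $c$ equals $F^{-1} d(Wc) F$-type conjugation, i.e. `ifft(fft(c) * fft(z))`):

```python
import numpy as np

def circulant_matvec(a, z):
    """Compute C(a) z in O(n log n) operations via FFTs.

    C(a) is the circulant with first row a^T; its first column
    is therefore (a_1, a_n, a_{n-1}, ..., a_2).
    """
    c = np.roll(a[::-1], 1)                  # first column of C(a)
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(z))

# check against the explicit matrix: C(a)_{ij} = a_{((j-i) mod n)+1}
n = 6
rng = np.random.default_rng(0)
a = rng.standard_normal(n)
z = rng.standard_normal(n)
C = np.array([[a[(j - i) % n] for j in range(n)] for i in range(n)])
print(np.allclose(circulant_matvec(a, z), C @ z))   # True
```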
Exercise. (i) Repeat all, starting from the $n \times n$ matrix
$$
P_{-1} = \begin{pmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
-1 & & & & 0
\end{pmatrix}
$$
and arriving at the $(-1)$-circulant matrix whose first row is $a^T$, $a \in \mathbb{C}^n$:
$$
C_{-1}(a) = \begin{pmatrix}
a_1 & a_2 & \cdots & a_{n-1} & a_n \\
-a_n & a_1 & \cdots & & a_{n-1} \\
\vdots & & \ddots & & \vdots \\
-a_3 & & & & a_2 \\
-a_2 & -a_3 & \cdots & -a_n & a_1
\end{pmatrix}.
$$
(ii) Let $T$ be a Toeplitz $n \times n$ matrix, i.e. $T = (t_{i-j})_{i,j=1}^n$ for some $t_k \in \mathbb{C}$. Show that $T$ can be written as the sum of a circulant and a $(-1)$-circulant, that is, $T = C(a) + C_{-1}(b)$ for suitable $a, b \in \mathbb{C}^n$.
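As a numerical illustration of part (ii): equating entries gives $a_k + b_k = t_{-(k-1)}$ and $a_k - b_k = t_{n-k+1}$ for $k = 2, \ldots, n$, while $a_1 + b_1 = t_0$ (so the splitting is not unique; the sketch below, which is illustrative and fixes $b_1 = 0$, assumes Python with NumPy):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
t = rng.standard_normal(2 * n - 1)        # t_{-(n-1)}, ..., t_0, ..., t_{n-1}

def tk(k):                                # access t_k by its signed index
    return t[k + n - 1]

T = np.array([[tk(i - j) for j in range(n)] for i in range(n)])

# solve a_k + b_k = t_{-(k-1)}, a_k - b_k = t_{n-k+1} (k = 2..n); a_1 = t_0
a = np.empty(n); b = np.empty(n)
a[0], b[0] = tk(0), 0.0
for k in range(2, n + 1):
    a[k-1] = (tk(-(k-1)) + tk(n-k+1)) / 2
    b[k-1] = (tk(-(k-1)) - tk(n-k+1)) / 2

C  = np.array([[a[(j - i) % n] for j in range(n)] for i in range(n)])
Cm = np.array([[b[j-i] if j >= i else -b[n-(i-j)] for j in range(n)]
               for i in range(n)])
print(np.allclose(T, C + Cm))   # True: T = C(a) + C_{-1}(b)
```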
Why can circulant matrices be interesting in the applications of linear algebra? The main reason is that the matrix-vector product $C(a)z$ can be computed in at most $O(n \log_2 n)$ arithmetic operations (whereas, in general, a matrix-vector product requires $n^2$ multiplications).

Proposition FFT. Given $z \in \mathbb{C}^n$, the complexity of the matrix-vector product $F z$ is at most $O(n \log_2 n)$. This operation is called the discrete Fourier transform (DFT) of $z$. As a consequence, the matrix-vector product $C(a)z$ can be computed by two DFTs (after the preprocessing DFT $F a$).
proof: Since $\omega^{(i-1)(k-1)}$ is the $(i,k)$ entry of $W$ and $z_k$ is the $k$-th entry of $z \in \mathbb{C}^n$, we have
$$
\begin{aligned}
(W z)_i &= \sum_{k=1}^n \omega^{(i-1)(k-1)} z_k
= \sum_{j=1}^{n/2} \omega^{(i-1)(2j-2)} z_{2j-1} + \sum_{j=1}^{n/2} \omega^{(i-1)(2j-1)} z_{2j} \\
&= \sum_{j=1}^{n/2} (\omega^2)^{(i-1)(j-1)} z_{2j-1} + \sum_{j=1}^{n/2} \omega^{(i-1)(2(j-1)+1)} z_{2j} \\
&= \sum_{j=1}^{n/2} (\omega^2)^{(i-1)(j-1)} z_{2j-1} + \omega^{i-1} \sum_{j=1}^{n/2} (\omega^2)^{(i-1)(j-1)} z_{2j}.
\end{aligned}
$$
Note that $\omega$ is in fact a function of $n$, i.e. the right notation for $\omega$ would be $\omega_n$. Then $\omega^2 = \omega_n^2$ is such that $(\omega_n^2)^{n/2} = 1$ and $(\omega_n^2)^i \neq 1$ for $0 < i < n/2$; in other words, we may take $\omega_n^2 = \omega_{n/2}$. So, we have the identities
$$
(W_n z)_i = \sum_{j=1}^{n/2} \omega_{n/2}^{(i-1)(j-1)} z_{2j-1}
+ \omega_n^{i-1} \sum_{j=1}^{n/2} \omega_{n/2}^{(i-1)(j-1)} z_{2j},
\qquad i = 1, 2, \ldots, n. \qquad (\ast)
$$
It follows that, for $i = 1, \ldots, \frac{n}{2}$,
$$
(W_n z)_i
= \Bigl( W_{n/2} \begin{pmatrix} z_1 \\ z_3 \\ \vdots \\ z_{n-1} \end{pmatrix} \Bigr)_i
+ \omega_n^{i-1} \Bigl( W_{n/2} \begin{pmatrix} z_2 \\ z_4 \\ \vdots \\ z_n \end{pmatrix} \Bigr)_i.
$$
Moreover, by setting $i = \frac{n}{2} + k$, $k = 1, \ldots, \frac{n}{2}$, in $(\ast)$, we obtain
$$
\begin{aligned}
(W_n z)_{\frac{n}{2}+k}
&= \sum_{j=1}^{n/2} \omega_{n/2}^{\frac{n}{2}(j-1)} \omega_{n/2}^{(k-1)(j-1)} z_{2j-1}
+ \omega_n^{\frac{n}{2}} \omega_n^{k-1} \sum_{j=1}^{n/2} \omega_{n/2}^{\frac{n}{2}(j-1)} \omega_{n/2}^{(k-1)(j-1)} z_{2j} \\
&= \sum_{j=1}^{n/2} \omega_{n/2}^{(k-1)(j-1)} z_{2j-1}
- \omega_n^{k-1} \sum_{j=1}^{n/2} \omega_{n/2}^{(k-1)(j-1)} z_{2j} \\
&= \Bigl( W_{n/2} \begin{pmatrix} z_1 \\ z_3 \\ \vdots \\ z_{n-1} \end{pmatrix} \Bigr)_k
- \omega_n^{k-1} \Bigl( W_{n/2} \begin{pmatrix} z_2 \\ z_4 \\ \vdots \\ z_n \end{pmatrix} \Bigr)_k,
\qquad k = 1, \ldots, \tfrac{n}{2}
\end{aligned}
$$
(here $\omega_{n/2}^{\frac{n}{2}(j-1)} = 1$ since $\omega_{n/2}^{n/2} = 1$, and $\omega_n^{\frac{n}{2}} = -1$; think of $\omega = e^{\pm i 2\pi/n}$). Thus
$$
W_n z = \begin{pmatrix}
I & D_{1,\omega_n,\ldots,\omega_n^{\frac{n}{2}-1}} \\
I & -D_{1,\omega_n,\ldots,\omega_n^{\frac{n}{2}-1}}
\end{pmatrix}
\begin{pmatrix} W_{n/2} & 0 \\ 0 & W_{n/2} \end{pmatrix} Q z,
$$
$$
D_{1,\omega_n,\ldots,\omega_n^{\frac{n}{2}-1}}
$$