Eigenvalues and eigenvectors
Let A ∈ Rn×n. If 0 6= v ∈ Cn and λ ∈ C satisfy Av = λv
then λ is called eigenvalue, and v is called eigenvector.
Given a matrix, we want to approximate its eigenvalues and eigenvectors. Some applications:
Structural engineering (natural frequency, heartquakes ) Electromagnetics (resonance cavity)
Google’s Pagerank algorithm ...
Eigenvalues and eigenvectors
Let A ∈ Rn×n. If 0 6= v ∈ Cn and λ ∈ C satisfy Av = λv
then λ is called eigenvalue, and v is called eigenvector.
Given a matrix, we want to approximate its eigenvalues and eigenvectors.
Some applications:
Structural engineering (natural frequency, heartquakes ) Electromagnetics (resonance cavity)
Google’s Pagerank algorithm ...
The characteristic polynomial
The eigenvalues of a matrix are the roots of the characteristic polynomial
p(λ) := det (λI − A) = 0
However, computing the roots of a polynomial is a very ill-conditioned problem! We cannot use this approach to compute the eigenvalues.
Eigenvalues and eigenvectors
Algorithms that compute the eigenvalues/eigenvectors of a matrix are divided into two categories:
1 Methods that compute all the eigenvalues/eigenvectors at once.
2 Methods that compute only a few (possibly one) eigenvalues/eigenvectors.
The methods are also different whether the matrix is symmetric or not. In this lesson we will discuss methods of type 2 for symmetric matrices.
Eigenvalues/eigenvectors of a symmetric matrix (refresh)
Theorem
All the eigenvalues of a symmetric matrix are real. Moreover, there exists a basis of eigenvectors u1, . . . , un, i.e.
Aui = λiui
that are orthonormal, i.e.
(ui, uj) = δij
We assume that the eigenvalues are numbered in decreasing order (in absolute value), i.e.
|λ1| ≥ |λ2| ≥ . . . ≥ |λn|
Eigenvalues/eigenvectors of a symmetric matrix (refresh)
Theorem
All the eigenvalues of a symmetric matrix are real. Moreover, there exists a basis of eigenvectors u1, . . . , un, i.e.
Aui = λiui
that are orthonormal, i.e.
(ui, uj) = δij
We assume that the eigenvalues are numbered in decreasing order (in absolute value), i.e.
|λ1| ≥ |λ2| ≥ . . . ≥ |λn|
The power method
We want to approximate the eigenvalue of A that is largest in abs. value.
v0 = some vector with kv0k = 1.
for k = 1, 2, . . .
w = Avk−1 apply A
vk = w / kw k normalize
µk = (vk)TAvk Reyleigh quotient
end
Theorem
Assume |λ1| > |λ2| and (v0, u1) 6= 0. Then
k→+∞lim µk = λ1
and
k→+∞lim min {kvk − u1k , kvk+ u1k} = 0
The power method
We want to approximate the eigenvalue of A that is largest in abs. value.
v0 = some vector with kv0k = 1.
for k = 1, 2, . . .
w = Avk−1 apply A
vk = w / kw k normalize
µk = (vk)TAvk Reyleigh quotient
end Theorem
Assume |λ1| > |λ2| and (v0, u1) 6= 0. Then
k→+∞lim µk = λ1
and
k→+∞lim min {kvk − u1k , kvk+ u1k} = 0
Proof
We expand v0on the eigenvector basis:
v0= a1u1+ a2u2+ . . . + anun
Note that a1= (v0, u1) 6= 0.
vkis a scalar multiple of Akv0, and since it has unit norm vk= Akv0/ Akv0
Akv0 = a1Aku1+ a2Aku2+ . . . + anAkun
= a1λk1u1+ a2λk2u2+ . . . + anλknun
= λk1 a1u1+ a2
λ2
λ1
k
u2+ . . . + an
λn
λ1
k
un
!
Since λ1> λ2, for k 1, λ2 λ1
k
≈ 0, . . . , λn λ1
k
≈ 0. Thus,
Akv0≈ a1λk1u1 =⇒ vk≈ a1λk1u1
|a1λk1|ku1k = a1λk1
|a1λk1|u1= sign
a1λk1
u1
Thus,
vkTAvk≈ λ1 and min {kvk− u1k , kvk+ u1k} ≈ 0
Some observations
The fact
min {kvk− u1k , kvk+ u1k} ≈ 0 for k 1 means that vk is either a good approximation of u1 or a good
approximation of −u1 In either case, vk is a good approximation of an eigenvector.
Note that, if λ1 is negative, then at each iteration vk “jumps” from u1 to −u1 and back, since
vk ≈ sign a1λk1
u1 = −sign
a1λk+11
u1≈ vk+1
Some observations
The fact
min {kvk− u1k , kvk+ u1k} ≈ 0 for k 1 means that vk is either a good approximation of u1 or a good
approximation of −u1 In either case, vk is a good approximation of an eigenvector.
Note that, if λ1 is negative, then at each iteration vk “jumps” from u1 to −u1 and back, since
vk ≈ sign a1λk1
u1 = −sign
a1λk+11
u1 ≈ vk+1
Some observations
In the proof we observed that for k 1 vk ≈ sign
a1λk1
u1
More precisely,
min {kvk− u1k , kvk + u1k} =
vk− sign a1λk1
u1 = O
λ2 λ1
k!
Moreover, it can be shown that
|µk− λ1| =
vkTAvk − u1Au1
= O
λ2
λ1
2k!
In particular, if |λ2| |λ1| the convergence will be fast. On the other hand, if λ2 ≈ λ1 the convergence will be slow.
Some observations
In the proof we observed that for k 1 vk ≈ sign
a1λk1
u1
More precisely,
min {kvk− u1k , kvk + u1k} =
vk− sign a1λk1
u1 = O
λ2 λ1
k!
Moreover, it can be shown that
|µk− λ1| =
vkTAvk − u1Au1 = O
λ2
λ1
2k!
In particular, if |λ2| |λ1| the convergence will be fast. On the other hand, if λ2 ≈ λ1 the convergence will be slow.
Some observations
In the proof we observed that for k 1 vk ≈ sign
a1λk1
u1
More precisely,
min {kvk− u1k , kvk + u1k} =
vk− sign a1λk1
u1 = O
λ2 λ1
k!
Moreover, it can be shown that
|µk− λ1| =
vkTAvk − u1Au1 = O
λ2
λ1
2k!
In particular, if |λ2| |λ1| the convergence will be fast. On the other hand, if λ2 ≈ λ1 the convergence will be slow.
Stopping criterion
A simple stopping criterion for the power method is based on the residual:
Stop when kAvk − µkvkk ≤ tol
Inverse power method
Let µ ∈ R a user-specified parameter, we want to approximate the closest eigenvalue of A to µ, i.e.
λJ = argmin
i
|µ − λi|
v0 = some vector with kv0k = 1.
for k = 1, 2, . . .
w =(A − µI )−1vk−1 (equivalently, solve (A − µI ) w = vk−1) vk = w / kw k
µk = (vk)TAvk end
Theorem
Assume |µ − λJ| < |µ − λi| ∀ i = 1, . . . , n and (v0, uJ) 6= 0. Then
k→+∞lim µk = λJ and
k→+∞lim min {kvk − uJk , kvk+ uJk} = 0
proof
The proof is analogous to the previous case, observing that the largest (in absolute value) eigenvalue of (A − µI )−1 is λ 1
J−µ, and the relative eigenvector is uJ.
Note that if µ = 0, the method approximates the eigenvalue of A that is smallest in absolute value.
Theorem
Assume |µ − λJ| < |µ − λi| ∀ i = 1, . . . , n and (v0, uJ) 6= 0. Then
k→+∞lim µk = λJ and
k→+∞lim min {kvk − uJk , kvk+ uJk} = 0
proof
The proof is analogous to the previous case, observing that the largest (in absolute value) eigenvalue of (A − µI )−1 is λ 1
J−µ, and the relative eigenvector is uJ.
Note that if µ = 0, the method approximates the eigenvalue of A that is smallest in absolute value.
Theorem
Assume |µ − λJ| < |µ − λi| ∀ i = 1, . . . , n and (v0, uJ) 6= 0. Then
k→+∞lim µk = λJ and
k→+∞lim min {kvk − uJk , kvk+ uJk} = 0
proof
The proof is analogous to the previous case, observing that the largest (in absolute value) eigenvalue of (A − µI )−1 is λ 1
J−µ, and the relative eigenvector is uJ.
Note that if µ = 0, the method approximates the eigenvalue of A that is smallest in absolute value.