Eigenvalues and eigenvectors

(1)

Eigenvalues and eigenvectors

Let A ∈ R^n×n. If 0 6= v ∈ Cⁿ and λ ∈ C satisfy Av = λv

then λ is called eigenvalue, and v is called eigenvector.

Given a matrix, we want to approximate its eigenvalues and eigenvectors. Some applications:

Structural engineering (natural frequency, heartquakes ) Electromagnetics (resonance cavity)

Google’s Pagerank algorithm ...

(2)

Eigenvalues and eigenvectors

Let A ∈ R^n×n. If 0 6= v ∈ Cⁿ and λ ∈ C satisfy Av = λv

then λ is called eigenvalue, and v is called eigenvector.

Given a matrix, we want to approximate its eigenvalues and eigenvectors.

Some applications:

Structural engineering (natural frequency, heartquakes ) Electromagnetics (resonance cavity)

Google’s Pagerank algorithm ...

(3)

The characteristic polynomial

The eigenvalues of a matrix are the roots of the characteristic polynomial

p(λ) := det (λI − A) = 0

However, computing the roots of a polynomial is a very ill-conditioned problem! We cannot use this approach to compute the eigenvalues.

(4)

Eigenvalues and eigenvectors

Algorithms that compute the eigenvalues/eigenvectors of a matrix are divided into two categories:

1 Methods that compute all the eigenvalues/eigenvectors at once.

2 Methods that compute only a few (possibly one) eigenvalues/eigenvectors.

The methods are also different whether the matrix is symmetric or not. In this lesson we will discuss methods of type 2 for symmetric matrices.

(5)

Eigenvalues/eigenvectors of a symmetric matrix (refresh)

Theorem

All the eigenvalues of a symmetric matrix are real. Moreover, there exists a basis of eigenvectors u1, . . . , un, i.e.

Aui = λiui

that are orthonormal, i.e.

(ui, uj) = δij

We assume that the eigenvalues are numbered in decreasing order (in absolute value), i.e.

|λ₁| ≥ |λ₂| ≥ . . . ≥ |λ_n|

(6)

Eigenvalues/eigenvectors of a symmetric matrix (refresh)

Theorem

All the eigenvalues of a symmetric matrix are real. Moreover, there exists a basis of eigenvectors u1, . . . , un, i.e.

Aui = λiui

that are orthonormal, i.e.

(ui, uj) = δij

We assume that the eigenvalues are numbered in decreasing order (in absolute value), i.e.

|λ₁| ≥ |λ₂| ≥ . . . ≥ |λ_n|

(7)

The power method

We want to approximate the eigenvalue of A that is largest in abs. value.

v0 = some vector with kv0k = 1.

for k = 1, 2, . . .

w = Avk−1 apply A

v_k = w / kw k normalize

µ_k = (v_k)^TAv_k Reyleigh quotient

end

Theorem

Assume |λ₁| > |λ₂| and (v₀, u₁) 6= 0. Then

k→+∞lim µ_k = λ1

and

k→+∞lim min {kv_k − u₁k , kv_k+ u₁k} = 0

(8)

The power method

We want to approximate the eigenvalue of A that is largest in abs. value.

for k = 1, 2, . . .

w = Avk−1 apply A

v_k = w / kw k normalize

µ_k = (v_k)^TAv_k Reyleigh quotient

end Theorem

Assume |λ₁| > |λ₂| and (v₀, u₁) 6= 0. Then

k→+∞lim µ_k = λ1

and

k→+∞lim min {kv_k − u₁k , kv_k+ u₁k} = 0

(9)

Proof

We expand v0on the eigenvector basis:

v0= a1u1+ a2u2+ . . . + anun

Note that a1= (v0, u1) 6= 0.

vkis a scalar multiple of A^kv0, and since it has unit norm vk= A^kv0/ A^kv0

A^kv0 = a1A^ku1+ a2A^ku2+ . . . + anA^kun

= a1λ^k1u1+ a2λ^k2u2+ . . . + anλ^knun

= λ^k₁ a1u1+ a2

λ2

λ1

k

u2+ . . . + an

λn

λ1

k

un

!

Since λ1> λ2, for k 1, λ₂ λ1

k

≈ 0, . . . , λ_n λ1

k

≈ 0. Thus,

A^kv0≈ a1λ^k1u1 =⇒ vk≈ a1λ^k1u1

|a1λ^k₁|ku1k = a1λ^k1

|a1λ^k₁|u1= sign

a1λ^k1

u1

Thus,

vk^TAvk≈ λ1 and min {kvk− u1k , kvk+ u1k} ≈ 0

(10)

Some observations

The fact

min {kvk− u₁k , kv_k+ u1k} ≈ 0 for k 1 means that v_k is either a good approximation of u₁ or a good

approximation of −u1 In either case, vk is a good approximation of an eigenvector.

Note that, if λ1 is negative, then at each iteration vk “jumps” from u1 to −u1 and back, since

v_k ≈ sign a₁λ^k₁

u₁ = −sign

a₁λ^k+1₁

u₁≈ v_k+1

(11)

Some observations

The fact

min {kvk− u₁k , kv_k+ u1k} ≈ 0 for k 1 means that v_k is either a good approximation of u₁ or a good

approximation of −u1 In either case, vk is a good approximation of an eigenvector.

Note that, if λ1 is negative, then at each iteration vk “jumps” from u1 to −u1 and back, since

v_k ≈ sign a₁λ^k₁

u₁ = −sign

a₁λ^k+1₁

u₁ ≈ v_k+1

(12)

Some observations

In the proof we observed that for k 1 v_k ≈ sign

a1λ^k₁

u1

More precisely,

min {kv_k− u₁k , kv_k + u₁k} =

v_k− sign a₁λ^k₁

u₁ = O

λ₂ λ₁

k!

Moreover, it can be shown that

|µ_k− λ₁| =

v_k^TAv_k − u₁Au1

= O

λ2

λ₁

2k!

In particular, if |λ₂| |λ₁| the convergence will be fast. On the other hand, if λ₂ ≈ λ₁ the convergence will be slow.

(13)

Some observations

a1λ^k₁

u1

More precisely,

u₁ = O

λ₂ λ₁

k!

|µ_k− λ₁| =

v_k^TAv_k − u₁Au₁ = O

λ2

λ₁

2k!

(14)

Some observations

a1λ^k₁

u1

More precisely,

u₁ = O

λ₂ λ₁

k!

|µ_k− λ₁| =

v_k^TAv_k − u₁Au₁ = O

λ2

λ₁

2k!

(15)

Stopping criterion

A simple stopping criterion for the power method is based on the residual:

Stop when kAv_k − µ_kvkk ≤ tol

(16)

Inverse power method

Let µ ∈ R a user-specified parameter, we want to approximate the closest eigenvalue of A to µ, i.e.

λ_J = argmin

i

|µ − λ_i|

for k = 1, 2, . . .

w =(A − µI )⁻¹v_k−1 (equivalently, solve (A − µI ) w = v_k−1) vk = w / kw k

µ_k = (v_k)^TAv_k end

(17)

Theorem

Assume |µ − λ_J| < |µ − λ_i| ∀ i = 1, . . . , n and (v₀, u_J) 6= 0. Then

k→+∞lim µ_k = λ_J and

k→+∞lim min {kv_k − u_Jk , kv_k+ u_Jk} = 0

proof

The proof is analogous to the previous case, observing that the largest (in absolute value) eigenvalue of (A − µI )⁻¹ is _λ ¹

J−µ, and the relative eigenvector is u_J.

Note that if µ = 0, the method approximates the eigenvalue of A that is smallest in absolute value.

(18)

Theorem

proof

(19)

Theorem

proof