
Matrix Analysis Class Note

Contents

1 2 by 2 Matrix
  1.1 Standard Form and Numerical Range of 2 by 2 Matrix
  1.2 Normal Matrix and Square Root of 2 by 2 Matrix

2 Equivalence
  2.1 Unitary Equivalence
  2.2 Similarity
  2.3 Elementary Symmetric Form

3 Hermitian and Symmetric Matrices
  3.1 Eigenvalues of Hermitian Matrices
  3.2 Symmetric Matrices

4 The Polar Form and the Singular Value Decomposition
  4.1 Preliminary
  4.2 Polar Form and SVD

5 QR-Decomposition and LU-Decomposition
  5.1 QR-Decomposition
  5.2 LU-Decomposition

6 Gerschgorin Discs

Chapter 1

2 by 2 Matrix

1.1 Standard Form and Numerical Range of 2 by 2 Matrix


 
Theorem 1.1.1. If A ∈ M2, then A ≃ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} with b ≥ 0, and this form is unique (except that a and c may be interchanged).

Definition 1.1.2. Let A ∈ Mm×n. We define ∥A∥F = tr(A∗A)^{1/2}.
Theorem 1.1.3. Let A, B ∈ M2. Then the following are equivalent:
1. A ≃ B
2. tr(A) = tr(B), det(A) = det(B) and ∥A∥F = ∥B∥F .
3. tr(A) = tr(B), tr(A2 ) = tr(B 2 ) and tr(A∗ A) = tr(B ∗ B).
Definition 1.1.4. Let A ∈ Mn. The numerical range of A is defined by

W(A) = {⟨Ax, x⟩ : x ∈ Cn, ∥x∥ = 1}.

Proposition 1.1.5. Let A ∈ Mn . Then


1. W (A) is a nonempty compact subset of C.
2. W (aA + bIn ) = aW (A) + b for any a, b ∈ C.
3. W (ℜA) = ℜW (A), W (ℑA) = ℑW (A).
4. A ≃ B =⇒ W (A) = W (B).
5. For fixed a, b ∈ R, W(aℜA + ibℑA) = {ax + iby : x, y ∈ R, x + iy ∈ W(A)}.
6. W (A) is convex in C.
 
Theorem 1.1.6. If A = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}, then W(A) is the elliptic disk with foci at the eigenvalues a, c and with the length of the minor axis equal to |b|.
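A quick numerical illustration of Theorem 1.1.6 (a Python/NumPy sketch added for intuition; the entries a, b, c below are an arbitrary choice). Sampling ⟨Ax, x⟩ over random unit vectors fills the elliptic disk with foci a, c and minor axis |b|, so every sample stays within the semi-major axis √(|a − c|² + |b|²)/2 of the center (a + c)/2.

```python
import numpy as np

a, b, c = 1.0, 2.0, -1.0 + 1.0j               # arbitrary example entries
A = np.array([[a, b], [0.0, c]])

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 50000)) + 1j * rng.normal(size=(2, 50000))
x /= np.linalg.norm(x, axis=0)                # random unit vectors in C^2
w = np.einsum('ik,ij,jk->k', x.conj(), A, x)  # <Ax, x> for each column x

center = (a + c) / 2                          # midpoint of the foci
semi_major = np.sqrt(abs(a - c) ** 2 + abs(b) ** 2) / 2
print(np.max(np.abs(w - center)) <= semi_major + 1e-12)   # True
```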
Proposition 1.1.7. Let A, B ∈ M2 , then A ≃ B ⇐⇒ W (A) = W (B).
Definition 1.1.8. The spectral norm ∥·∥ is defined on Mn by ∥A∥ = max_{∥x∥=1} ∥Ax∥.

Proposition 1.1.9. If A ∈ Mn, then ∥A∥ = ∥A∗A∥^{1/2} = (largest eigenvalue of A∗A)^{1/2}.
Proposition 1.1.10. Let A, B ∈ Mn. Then A ≃ B =⇒ ∥A∥ = ∥B∥.

Theorem 1.1.11. Let A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. Then ∥A∥ = (√(α + 2β) + √(α − 2β))/2, where α = tr(A∗A) and β = |det(A)|.
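A minimal numerical check of this formula against the largest singular value (a sketch, not part of the notes; the matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])        # arbitrary 2 by 2 example
alpha = np.trace(A.conj().T @ A).real         # alpha = tr(A*A)
beta = abs(np.linalg.det(A))                  # beta = |det(A)|
formula = (np.sqrt(alpha + 2 * beta) + np.sqrt(alpha - 2 * beta)) / 2
print(formula, np.linalg.norm(A, 2))          # both ~ 5.4650
```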

1.2 Normal Matrix and Square Root of 2 by 2 Matrix
Proposition 1.2.1. Let A, B ∈ Mn . If A ≃ B, then At ≃ B t and A∗ ≃ B ∗ .
 
Theorem 1.2.2. Let A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. Then A ≃ At.
Definition 1.2.3. Let A ∈ Mn . Then A is normal if A∗ A = AA∗ .
Theorem 1.2.4. Let A ∈ Mn . Then
1. A is normal =⇒ ∑_{j=1}^n |aij|² = ∑_{j=1}^n |aji|² for each i.

2. A is normal ⇐⇒ A is unitarily similar to a diagonal matrix.

Proof.
1. Compare the (i, i) entries of A∗A and AA∗ for i = 1, · · · , n.
2. By Schur, A is unitarily similar to an upper triangular matrix B; then B is normal, and applying 1 forces B to be diagonal.

Theorem 1.2.5. Let A, B ∈ M2 . Then A, B are normal =⇒ AB ≃ BA.


Definition 1.2.6. B ∈ Mn is a square root of A ∈ Mn if B 2 = A.
Theorem 1.2.7. If A ∈ M2, then

1. A has no square root ⇐⇒ A is nonscalar and tr(A) = det(A) = 0 ⇐⇒ A ≃ \begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix} with a ≠ 0 ⇐⇒ A ≠ 0 and A² = 0.

2. A has a square root ⇐⇒ A = 0 or A² ≠ 0.


 
Theorem 1.2.8. Suppose that ρ = \begin{pmatrix} A & B \\ B∗ & D \end{pmatrix} ≥ 0, where A ∈ Mn, D ∈ Mk. Then det(ρ) ≤ det(A) det(D). The equality holds iff B = 0.
Theorem 1.2.9. Let A = [a1 · · · an] with columns ak. Then |det(A)| ≤ ∏_{k=1}^n ∥ak∥, with equality iff {a1, · · · , an} is an orthogonal set.
Proposition 1.2.10. Let A ∈ Mn .

1. If ⟨Ax, x⟩ = 0 for all x ∈ Cn , then A = 0.


2. A is Hermitian ⇐⇒ ⟨Ax, x⟩ ∈ R for any x ∈ Cn .
3. A ≥ 0 =⇒ A is Hermitian =⇒ A is normal.

4. A ≥ 0 =⇒ λ ≥ 0 for every eigenvalue λ of A.


Proof. Let A ∈ Mn .
1. 4⟨Ax, y⟩ = ⟨A(x + y), x + y⟩ − ⟨A(x − y), x − y⟩ + i⟨A(x + iy), x + iy⟩ − i⟨A(x − iy), x − iy⟩ = 0 for any x, y ∈ Cn. Then A = 0.
2. If A = A∗, then ⟨Ax, x⟩ = ⟨x, Ax⟩ = \overline{⟨Ax, x⟩}, so ⟨Ax, x⟩ ∈ R. Conversely, if ⟨Ax, x⟩ ∈ R, then ⟨Ax, x⟩ = \overline{⟨Ax, x⟩} = ⟨x, Ax⟩ = ⟨A∗x, x⟩ =⇒ ⟨(A − A∗)x, x⟩ = 0 =⇒ A = A∗ by 1.
3. By 2.
4. By 1.2.4, since A is normal by 3.
 
Proposition 1.2.11. Let A = \begin{pmatrix} B & C \\ D & E \end{pmatrix}, where B ∈ Mn, E ∈ Mk. Then
1. A is Hermitian ⇐⇒ B, E are Hermitian and D = C∗.
2. A ≥ 0 =⇒ B, E ≥ 0.
Proposition 1.2.12. Let A, B ∈ Mn .
1. If A > 0, then A is Hermitian and A ≃ diag(λ1 , · · · , λn ) with λ1 , · · · , λn > 0.

2. If A > 0, then there exists an invertible matrix C ∈ Mn such that C∗AC = In.


3. If A ≥ B and B ≥ A, then A = B.
4. If A ≥ B, then tr(A) ≥ tr(B).
Theorem 1.2.13. Let A, B ∈ Mn be Hermitian. Suppose there exists α, β ∈ R such that αA + βB > 0.
Then there exists invertible C ∈ Mn such that C ∗ AC and C ∗ BC are diagonal.
Proof. Let ρ = αA + βB. Suppose β ≠ 0. Then B = β−1(ρ − αA). Since ρ > 0, there exists invertible C1 ∈ Mn such that C1∗ρC1 = In. Then C1∗AC1 is Hermitian, so there exists unitary U ∈ Mn such that U∗C1∗AC1U = D is diagonal. Let C = C1U. Then C∗ρC = In and C∗AC = D, hence C∗BC = β−1(In − αD) is diagonal. The case α ≠ 0 is similar.
Corollary 1.2.13.1. If A = A∗ and B > 0, then there exists invertible C ∈ Mn such that C ∗ AC is diagonal
and C ∗ BC = In .
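For Corollary 1.2.13.1, scipy's generalized Hermitian eigensolver produces exactly such a C (a sketch under the assumption B > 0; the random matrices are arbitrary):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (X + X.conj().T) / 2                      # Hermitian A
Y = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = Y @ Y.conj().T + np.eye(4)                # B > 0

w, C = eigh(A, B)                             # solves A v = lambda B v
print(np.allclose(C.conj().T @ B @ C, np.eye(4)))    # C* B C = I
print(np.allclose(C.conj().T @ A @ C, np.diag(w)))   # C* A C diagonal
```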
Definition 1.2.14. Let A ∈ Mn.
1. The set of all eigenvalues of A is called the spectrum of A and is denoted by σ(A).
2. The spectral radius of A is ρ(A) = max{|λ| : λ ∈ σ(A)}.

Theorem 1.2.15. Let A, B ∈ Mn. Suppose that A > 0, B ≥ 0. Then
1. A ≥ B ⇐⇒ ρ(BA−1) ≤ 1.
2. A > B ⇐⇒ ρ(BA−1) < 1.
Corollary 1.2.15.1. Let A, B ∈ Mn . Suppose A > 0, B ≥ 0. If A ≥ B, then det(A) ≥ det(B).

Chapter 2

Equivalence

2.1 Unitary Equivalence


Theorem 2.1.1. Let A = [aij], B = [bij] ∈ Mm,n. If B = V AU, where V, U are unitary, then tr(A∗A) = tr(B∗B); equivalently, ∑_{i,j} |bij|² = ∑_{i,j} |aij|².
Theorem 2.1.2. Let A ∈ Mn. Then A is unitarily similar to an upper triangular matrix T = [aij] with a12, a23, · · · , a(n−1)n ≥ 0.

Proof. By Schur, A is unitarily similar to an upper triangular matrix T′ = [a′ij]. Write a′k(k+1) = |a′k(k+1)| e^{iθk} for k = 1, · · · , n − 1, and let

U = diag(exp(i(θ1 + · · · + θn−1)), exp(i(θ2 + · · · + θn−1)), · · · , exp(iθn−1), 1).

Then U∗T′U is what we want.

Proposition 2.1.3. Let U ∈ Mn be unitary. Then U is unitarily similar to a diagonal matrix diag(u1, · · · , un) with |uk| = 1 for all k = 1, · · · , n.

2.2 Similarity
Proposition 2.2.1. Let A ∈ Mn . Then A ≈ At . Therefore, if σ(A) ⊆ R, then A ≈ A∗ .

Theorem 2.2.2. Let A ∈ Mn. Then

A ≈ ⊕_j ½(∆(1, 2λj, 1) + i∇(−1, 0, 1)),

where ∆(a, b, c) denotes the tridiagonal matrix with constant subdiagonal a, diagonal b and superdiagonal c, ∇(a, b, c) the corresponding skew-tridiagonal (anti-tridiagonal) matrix read from top to bottom, and the sum runs over the Jordan blocks Jnj(λj) of A.

Proof. By the Jordan form, we only need to show the case A = Jn(λ). Let B = ∇(0, 1, 0) and U = (1/√2)(In + iB). Then B, U are symmetric and B² = In, hence UU∗ = In, so U is unitary. On the other hand, Jn(λ) = ∆(0, λ, 1), BJn(λ)B = ∆(1, λ, 0), BJn(λ) = ∇(0, λ, 1) and Jn(λ)B = ∇(1, λ, 0). Then

Jn(λ) ≃ UJn(λ)U∗ = ½(Jn(λ) + BJn(λ)B) + ½ i(BJn(λ) − Jn(λ)B) = ½ ∆(1, 2λ, 1) + ½ i∇(−1, 0, 1).
Corollary 2.2.2.1. Let A ∈ Mn. Then A = BC, where B, C ∈ Mn are symmetric.

Proof. By 2.2.2, A = XSX−1 for some symmetric matrix S. Then A = XSX^t (X^t)−1X−1 = BC, where B = XSX^t and C = (X−1)^t X−1 are symmetric.

Definition 2.2.3. Given a monic polynomial p(x) = x^n + a_{n−1}x^{n−1} + · · · + a1x + a0, the companion matrix of p is defined by

C(p) = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ −a_0 & −a_1 & \cdots & & −a_{n−1} \end{pmatrix}

or its transpose. It can be shown that the minimal polynomial and the characteristic polynomial of C(p) are both p.
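A small sketch (not from the notes) that builds C(p) as defined above and confirms its eigenvalues are the roots of p:

```python
import numpy as np

def companion(a):
    """Companion matrix of monic p(x) = x^n + a[n-1] x^{n-1} + ... + a[0]:
    ones on the superdiagonal, last row -a_0, ..., -a_{n-1}."""
    n = len(a)
    C = np.zeros((n, n))
    C[np.arange(n - 1), np.arange(1, n)] = 1.0
    C[-1, :] = -np.asarray(a)
    return C

C = companion([-6.0, 11.0, -6.0])             # p(x) = (x - 1)(x - 2)(x - 3)
print(np.sort(np.linalg.eigvals(C).real))     # [1. 2. 3.]
```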
Theorem 2.2.4. Let A ∈ Mn. Then A ≈ ⊕_j C(Pj), where the Pj are polynomials with Pj+1 | Pj, and this decomposition is unique.

Remark. pA(x) (the characteristic polynomial of A) equals ∏_j Pj(x), and mA(x) = P1(x).

Remark. Recall that A ≈ ⊕_{i=1}^{l} ⊕_j J_{k_j^{(i)}}(λi) with k_1^{(i)} ≥ k_2^{(i)} ≥ · · ·. Then C(P1) ≈ J_{k_1^{(1)}}(λ1) ⊕ · · · ⊕ J_{k_1^{(l)}}(λl), C(P2) ≈ J_{k_2^{(1)}}(λ1) ⊕ · · · ⊕ J_{k_2^{(l)}}(λl), and so on.

2.3 Elementary Symmetric Form


Definition 2.3.1. The kth elementary symmetric function of the numbers λ1, · · · , λn is

σk(λ1, · · · , λn) = ∑_{1≤i1<i2<···<ik≤n} λi1 λi2 · · · λik for k ≤ n, and σk(λ1, · · · , λn) = 0 for k > n.

The kth power sum of the numbers λ1, · · · , λn is Sk(λ1, · · · , λn) = ∑_{j=1}^n λj^k.

Remark. For p(x) = (x − λ1) · · · (x − λn) we have p(x) = x^n + ∑_{k=1}^n (−1)^k σk(λ1, · · · , λn) x^{n−k}. Hence, if A ∈ Mn has eigenvalues λ1, · · · , λn, then pA(x) = p(x) and tr(A^k) = Sk(λ1, · · · , λn). Notice that λ1, · · · , λn may not be distinct.
Lemma 2.3.2. Sm − Sm−1 σ1 + Sm−2 σ2 − · · · + (−1)m mσm = 0, for m ≥ 1.
Proposition 2.3.3. We can express σm (λ1 , · · · , λn ) in terms of S1 (λ1 , · · · , λn ), · · · , Sm (λ1 , · · · , λn ) and vice
versa.
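A short sketch of the recursion behind Lemma 2.3.2 and Proposition 2.3.3, recovering σ1, · · · , σm from S1, · · · , Sm (illustrative code, not part of the notes):

```python
import numpy as np

def sigma_from_power_sums(S):
    """Given S[m-1] = S_m for m = 1..n, recover sigma_1, ..., sigma_n via
    Newton's identity S_m - S_{m-1} sigma_1 + ... + (-1)^m m sigma_m = 0."""
    sigma = []
    for m in range(1, len(S) + 1):
        acc = S[m - 1]                        # k = 0 term (sigma_0 = 1)
        for k in range(1, m):                 # (-1)^k S_{m-k} sigma_k terms
            acc += (-1) ** k * S[m - 1 - k] * sigma[k - 1]
        sigma.append((-1) ** (m + 1) * acc / m)
    return sigma

lam = np.array([1.0, 2.0, 3.0])
S = [np.sum(lam ** m) for m in range(1, 4)]
print(sigma_from_power_sums(S))               # [6.0, 11.0, 6.0]
```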

Theorem 2.3.4. Let k ≥ l, A ∈ Mk and B ∈ Ml. Then tr(A^m) = tr(B^m) for m = 1, · · · , k if and only if k − l eigenvalues of A are 0 and the remaining eigenvalues of A agree with the eigenvalues of B, including algebraic multiplicity.
Proof. (⇐): Clear.
(⇒): By Schur, A is unitarily similar to an upper triangular matrix with diagonal λ1, · · · , λk, and B to one with diagonal β1, · · · , βl. Let S_m^{(1)} = Sm(λ1, · · · , λk), S_m^{(2)} = Sm(β1, · · · , βl), σ_m^{(1)} = σm(λ1, · · · , λk) and σ_m^{(2)} = σm(β1, · · · , βl) for m = 1, · · · , k. Since tr(A^m) = S_m^{(1)} and tr(B^m) = S_m^{(2)}, we have S_m^{(1)} = S_m^{(2)} for m = 1, · · · , k, so by 2.3.3, σ_m^{(1)} = σ_m^{(2)} for m = 1, · · · , k. Also, σ_m^{(2)} = 0 for l < m ≤ k, hence pA(x) = x^{k−l}pB(x), and we are done.
Corollary 2.3.4.1. Let A, B ∈ Mn. If tr(A^m) = tr(B^m) for m = 1, · · · , n, then σ(A) = σ(B).
Corollary 2.3.4.2. Let A, B ∈ Mn. Then AB and BA have the same characteristic polynomial, eigenvalues and algebraic multiplicities.
Theorem 2.3.5. Let A, B ∈ Mn. Then AB and BA have the same geometric multiplicity for each nonzero eigenvalue.
Proof. Let λ ∈ σ(AB) = σ(BA) with λ ̸= 0. We will show that dim ker(AB − λIn ) = dim ker(BA − λIn ).

1. “AB(ker(AB − λIn )) ⊆ A(ker(BA − λIn )) ⊆ ker(AB − λIn )”: let x ∈ ker(AB − λIn ), then

(BA − λIn )Bx = B(AB − λIn )x = 0.

Hence, B ker(AB − λIn ) ⊆ ker(BA − λIn ) =⇒ AB ker(AB − λIn ) ⊆ A ker(BA − λIn ). Let x ∈
ker(BA − λIn ), then (AB − λIn )Ax = A(BA − λIn )x = 0. Then A ker(BA − λIn ) ⊆ ker(AB − λIn ).

2. Restrict AB to ker(AB − λIn), so that AB : ker(AB − λIn) → ker(AB − λIn). If x ∈ ker(AB − λIn) satisfies ABx = 0, then λx = ABx = 0 =⇒ x = 0, so this restriction is injective, hence surjective. Therefore AB(ker(AB − λIn)) = ker(AB − λIn), and by 1, A(ker(BA − λIn)) = ker(AB − λIn); in particular dim ker(BA − λIn) ≥ dim ker(AB − λIn).
3. By symmetry, we are done.
Chapter 3

Hermitian and Symmetric Matrices

3.1 Eigenvalues of Hermitian Matrices


Theorem 3.1.1. Let A ∈ Mn be Hermitian and let λ1 ≤ · · · ≤ λn be the eigenvalues of A. Then λ1 = min{⟨Ax, x⟩ : ∥x∥ = 1} and λn = max{⟨Ax, x⟩ : ∥x∥ = 1}.

Proof. Consider the set W(A) = {⟨Ax, x⟩ : ∥x∥ = 1}. Since A is Hermitian, there exists a unitary U ∈ Mn such that U∗AU = diag(λ1, · · · , λn). Then

W(A) = {⟨AUx, Ux⟩ : ∥Ux∥ = 1 = ∥x∥} = {⟨U∗AUx, x⟩ : ∥x∥ = 1} = {λ1|x1|² + · · · + λn|xn|² : ∥x∥ = 1}.

Since |x1|² + · · · + |xn|² = 1, λ1 = min W(A) and λn = max W(A).
Theorem 3.1.2. Let A ∈ Mn be Hermitian, λ1 ≤ · · · ≤ λn be the eigenvalues of A, and let y1, · · · , yn be orthonormal eigenvectors corresponding to λ1, · · · , λn, respectively. Then
1. λj = min{⟨Ax, x⟩ : ∥x∥ = 1 and x ⊥ {y1, · · · , yj−1}}.
2. λj = max{⟨Ax, x⟩ : ∥x∥ = 1 and x ⊥ {yj+1, · · · , yn}}.

Proof. Take x = yj; then ⟨Ax, x⟩ = λj and x ⊥ {y1, · · · , yj−1}. Hence, λj ≥ min{⟨Ax, x⟩ : ∥x∥ = 1 and x ⊥ {y1, · · · , yj−1}}. Conversely, suppose ∥x∥ = 1 and x ⊥ {y1, · · · , yj−1}; then x = ∑_{i=j}^n ai yi with ∑_{i=j}^n |ai|² = 1, and

⟨Ax, x⟩ = ∑_{i,k=j}^n ai āk λi ⟨yi, yk⟩ = ∑_{i=j}^n |ai|² λi ≥ λj ∑_{i=j}^n |ai|² = λj.

This proves 1. For 2, apply 1 to B = −A.
Theorem 3.1.3. Let A ∈ Mn be Hermitian and let λ1 ≤ · · · ≤ λn be the eigenvalues of A. Then

λj = max_{w1,··· ,wj−1 ∈ Cn} min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}}
   = min_{wj+1,··· ,wn ∈ Cn} max{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {wj+1, · · · , wn}}.

Proof. “≥”: Taking wi = yi, 3.1.2 gives min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {y1, · · · , yj−1}} = λj, so the maximum is ≥ λj.
“≤”: Let y1, · · · , yn be orthonormal eigenvectors corresponding to λ1, · · · , λn respectively and let U = [y1 · · · yn], which is unitary. Given w1, · · · , wj−1 ∈ Cn, let x ∈ Cn with ∥x∥ = 1 and x ⊥ {w1, · · · , wj−1}, and set r = U∗x. Then ∥r∥ = 1, r ⊥ {U∗w1, · · · , U∗wj−1} and ⟨Ax, x⟩ = ⟨AUr, Ur⟩ = ⟨U∗AUr, r⟩ = ∑_{i=1}^n λi|ri|².

Then

min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}}
  = min{∑_{i=1}^n λi|ri|² : ∑_{i=1}^n |ri|² = 1, r ⊥ {U∗w1, · · · , U∗wj−1}}
  ≤ min{∑_{i=1}^n λi|ri|² : ∑_{i=1}^n |ri|² = 1, r ⊥ {U∗w1, · · · , U∗wj−1}, rj+1 = · · · = rn = 0}
  = min{∑_{i=1}^j λi|ri|² : ∑_{i=1}^j |ri|² = 1, r ⊥ {U∗w1, · · · , U∗wj−1}, rj+1 = · · · = rn = 0} ≤ λj.

(The constrained set in the middle is nonempty, being cut out of a j-dimensional space by j − 1 linear conditions.) Hence max_{w1,··· ,wj−1 ∈ Cn} min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}} = λj.

For the second formula, apply the first to B = −A.
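A numerical spot-check of the min part via Theorem 3.1.2 (a sketch; the random Hermitian matrix and the index j are arbitrary): the minimum of ⟨Ax, x⟩ over unit x ⊥ {y1, · · · , yj−1} equals the smallest eigenvalue of the compression of A to span(yj, · · · , yn).

```python
import numpy as np

rng = np.random.default_rng(2)
n, j = 6, 3
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (B + B.conj().T) / 2                      # random Hermitian matrix
lam, Y = np.linalg.eigh(A)                    # lam ascending, Y orthonormal

# x perpendicular to y_1..y_{j-1}  <=>  x in span(y_j, ..., y_n); there the
# minimal Rayleigh quotient is the smallest eigenvalue of the compression.
V = Y[:, j - 1:]
comp = V.conj().T @ A @ V
print(np.linalg.eigvalsh(comp)[0], lam[j - 1])    # both equal lambda_j
```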
Corollary 3.1.3.1. Let A ∈ Mn. Let σ(ℜ(A)) = {λ1 ≤ · · · ≤ λn} and σ(A∗A) = {σ1 ≤ · · · ≤ σn}. Then λk ≤ √σk for k = 1, · · · , n.

Proof. For ∥x∥ = 1, ⟨ℜ(A)x, x⟩ = ℜ⟨Ax, x⟩ ≤ ∥Ax∥ = ⟨A∗Ax, x⟩^{1/2}. Hence, by 3.1.3,

λk = max_{w1,··· ,wk−1 ∈ Cn} min{⟨((A + A∗)/2)x, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wk−1}}
  ≤ max_{w1,··· ,wk−1 ∈ Cn} min{⟨A∗Ax, x⟩^{1/2} : ∥x∥ = 1, x ⊥ {w1, · · · , wk−1}}
  = (max_{w1,··· ,wk−1 ∈ Cn} min{⟨A∗Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wk−1}})^{1/2} = √σk.
Theorem 3.1.4. Let A, B ∈ Mn be Hermitian with A ≤ B. Let σ(A) = {λ1 ≤ · · · ≤ λn} and σ(B) = {u1 ≤ · · · ≤ un}. Then λj ≤ uj for j = 1, · · · , n.

Proof.

uj = max_{w1,··· ,wj−1 ∈ Cn} min{⟨Bx, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}}
  ≥ max_{w1,··· ,wj−1 ∈ Cn} min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}} = λj.
Corollary 3.1.4.1. Let A, B ∈ Mn be Hermitian with A ≤ B. Let σ(A) = {λ1 ≤ · · · ≤ λn} and σ(B) = {u1 ≤ · · · ≤ un}. If rank(B − A) = 1, then

λ1 ≤ u1 ≤ λ2 ≤ u2 ≤ · · · ≤ λn ≤ un.

Proof. By the above, we only need to prove uj ≤ λj+1. Let F = B − A; then F = F∗, F ≥ 0 and rank(F) = 1. Let im(F) = span(y0) for some y0 ∈ Cn, and note that ⟨Fx, x⟩ = 0 whenever x ⊥ y0. Then

uj = max_{w1,··· ,wj−1 ∈ Cn} min{⟨(A + F)x, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}}
  ≤ max_{w1,··· ,wj−1 ∈ Cn} min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1, y0}}
  ≤ λj+1.
Theorem 3.1.5. Let A, F ∈ Mn with A Hermitian, F ≥ 0 and rank(F) ≤ k. Let σ(A) = {λ1 ≤ · · · ≤ λn} and σ(A + F) = {u1 ≤ · · · ≤ un}. Then

λj ≤ uj ≤ λj+k

for j = 1, · · · , n, where we set λj+k = ∞ when j + k > n.

 
Theorem 3.1.6. Let A = \begin{pmatrix} B & ∗ \\ ∗ & ∗ \end{pmatrix} ∈ Mn+1 be Hermitian, where B ∈ Mn. Let σ(A) = {λ1 ≤ · · · ≤ λn+1} and σ(B) = {u1 ≤ · · · ≤ un}. Then λ1 ≤ u1 ≤ λ2 ≤ · · · ≤ un ≤ λn+1.

Proof. For x ⊥ en+1 write x = (x̂, 0) with x̂ ∈ Cn, so that ⟨Ax, x⟩ = ⟨Bx̂, x̂⟩. Then

λj = max_{w1,··· ,wj−1} min{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1}}
  ≤ max_{w1,··· ,wj−1} min{⟨Bx̂, x̂⟩ : ∥x∥ = 1, x ⊥ {w1, · · · , wj−1, en+1}} = uj,

λj+1 = min_{wj+2,··· ,wn+1} max{⟨Ax, x⟩ : ∥x∥ = 1, x ⊥ {wj+2, · · · , wn+1}}
  ≥ min_{wj+2,··· ,wn+1} max{⟨Bx̂, x̂⟩ : ∥x∥ = 1, x ⊥ {wj+2, · · · , wn+1, en+1}} = uj.
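A quick check of the interlacing inequalities on a random Hermitian matrix (a sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
C = rng.normal(size=(n + 1, n + 1)) + 1j * rng.normal(size=(n + 1, n + 1))
A = (C + C.conj().T) / 2                      # Hermitian, (n+1) by (n+1)
lam = np.linalg.eigvalsh(A)                   # lambda_1 <= ... <= lambda_{n+1}
u = np.linalg.eigvalsh(A[:n, :n])             # eigenvalues of B, ascending
print(np.all(lam[:n] <= u), np.all(u <= lam[1:]))   # True True
```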
 
Theorem 3.1.7. Let A = \begin{pmatrix} B & ∗ \\ ∗ & ∗ \end{pmatrix} ∈ Mn be Hermitian, where B ∈ Mk. Let σ(A) = {λ1 ≤ · · · ≤ λn} and σ(B) = {u1 ≤ · · · ≤ uk}. Then λj ≤ uj ≤ λj+n−k for j = 1, · · · , k.
Theorem 3.1.8. Let B ∈ Mn be Hermitian with σ(B) = {u1 ≤ · · · ≤ un}. Assume λ1 ≤ u1 ≤ λ2 ≤ · · · ≤ un ≤ λn+1. Then there exists a Hermitian A = \begin{pmatrix} B & ∗ \\ ∗ & ∗ \end{pmatrix} ∈ Mn+1 with σ(A) = {λ1 ≤ · · · ≤ λn+1}.
Theorem 3.1.9. Let A ∈ Mn with A ≥ 0. Then there exists a unique B ≥ 0 such that B² = A.

Proof. Since A ≥ 0, A is Hermitian, so A can be diagonalized as U DU∗ with D = diag(λ1, · · · , λn) and λ1, · · · , λn ≥ 0. Let ui = √λi and T = diag(u1, · · · , un). Then A = U DU∗ = U T²U∗ = (U T U∗)(U T U∗) = B², where B = U T U∗ ≥ 0, which proves the existence.

For uniqueness, write U = [v1 · · · vn] and suppose B ≥ 0 satisfies B² = A; we shall prove that Bvk = ukvk for each k. Since B ≥ 0, B can be diagonalized as W ΛW∗ with Λ = diag(a1, · · · , an), a1, · · · , an ≥ 0 and W = [w1 · · · wn]. For each k write vk = ∑_j bjwj. Then

λk ∑_j bjwj = λkvk = Avk = B²vk = ∑_j bjaj²wj ⟺ ∑_j bj(aj² − λk)wj = 0 ⟺ for each j, bj = 0 or aj = uk.

Hence vk = ∑_{j: aj=uk} bjwj, and thus Bvk = uk ∑_{j: aj=uk} bjwj = ukvk. Since the vk form a basis, B is uniquely determined.
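A sketch of the construction used in the existence part (the helper name psd_sqrt is ours, not from the notes):

```python
import numpy as np

def psd_sqrt(A):
    """Unique positive semidefinite square root of a PSD matrix A,
    via the spectral decomposition A = U diag(lam) U*."""
    lam, U = np.linalg.eigh(A)
    lam = np.clip(lam, 0.0, None)             # guard tiny negative round-off
    return (U * np.sqrt(lam)) @ U.conj().T

rng = np.random.default_rng(4)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = X @ X.conj().T                            # A >= 0
B = psd_sqrt(A)
print(np.allclose(B @ B, A), np.allclose(B, B.conj().T))   # True True
```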
3.2 Symmetric Matrices


Lemma 3.2.1. If A ∈ Mn, then A = UBU^t for some unitary U ∈ Mn and upper triangular B ∈ Mn if and only if σ(AĀ) ⊆ [0, ∞). Moreover, in ⇐, B = [bij] can be chosen such that bii ≥ 0.

Proof. Assume A = UBU^t. Then AĀ = UB(U^tŪ)B̄U∗ = UBB̄U∗, so AĀ ≃ BB̄ and σ(AĀ) = σ(BB̄) = {|b11|², · · · , |bnn|²} ⊆ [0, ∞).

Conversely, assume that σ(AĀ) ⊆ [0, ∞). We claim that if λ1 ∈ σ(AĀ), there exists a unit vector v1 such that Av̄1 = √λ1 v1.
Proof of claim. Since λ1 ∈ σ(AĀ), there exists x ≠ 0 such that AĀx = λ1x.

1. If Ax̄ and x are dependent, then Ax̄ = µx for some µ ∈ C, and then AĀx = |µ|²x, so |µ|² = λ1. Write µ = e^{iθ}√λ1 for some θ ∈ R and take η = θ/2. Then A(e^{iη}x)‾ = e^{−iη}Ax̄ = e^{−iη}µx = √λ1 (e^{iη}x). Let v1 = e^{iη}x/∥x∥; then we are done.

2. If Ax̄ and x are independent, then y = Ax̄ + µx ≠ 0 for every µ ∈ C. Take y0 = Ax̄ + √λ1 x. Then

Aȳ0 = A(Āx + √λ1 x̄) = AĀx + √λ1 Ax̄ = λ1x + √λ1 Ax̄ = √λ1 (Ax̄ + √λ1 x) = √λ1 y0.

Take v1 = y0/∥y0∥; then we are done.

By the claim, extend v1 to an orthonormal basis (v1, · · · , vn) of Cn and let U1 = [v1 · · · vn] ∈ Mn. Since Av̄1 = √λ1 v1, the first column of U1∗AŪ1, namely (vi∗Av̄1)_{i=1}^n, equals √λ1 e1, so

U1∗AŪ1 = \begin{pmatrix} √λ1 & w^t \\ 0 & A2 \end{pmatrix},

where w ∈ Cn−1 and A2 ∈ Mn−1. Then since

(U1∗AŪ1)(U1∗AŪ1)‾ = U1∗AĀU1 = \begin{pmatrix} λ1 & √λ1 w̄^t + w^tĀ2 \\ 0 & A2Ā2 \end{pmatrix},

we get σ(A2Ā2) ⊆ σ(AĀ) ⊆ [0, ∞). By induction, there exists a unitary U such that U∗AŪ = B, where B is upper triangular with nonnegative diagonal. Therefore, A = UBU^t, as required.
Theorem 3.2.2 (Takagi). If A ∈ Mn is symmetric, then there exists a unitary U such that A = UDU^t, where D = diag(d1, · · · , dn) with d1, · · · , dn ≥ 0. Moreover, in this case, the columns of U are an orthonormal set of eigenvectors for AĀ and σ(AĀ) = {d1², · · · , dn²}.

Proof. Since A = A^t, Ā = A∗ and AĀ = AA∗ ≥ 0, so σ(AĀ) ⊆ [0, ∞). By 3.2.1, there exists a unitary U such that A = UDU^t with D = [dij] upper triangular with nonnegative diagonal. Since A = A^t, D = D^t, so D is diagonal; write D = diag(d1, · · · , dn). Finally, AĀ = UDU^tŪD̄U∗ = UD²U∗, so AĀ ≃ D².
Corollary 3.2.2.1. A ∈ Mn is symmetric if and only if there exists S ∈ Mn such that A = SS^t.

Proof. ⇐ is clear. For the other direction, by 3.2.2, A can be factorized as A = UDU^t with D = diag(d1, · · · , dn) and dj ≥ 0. Let S = U√D, where √D = diag(√d1, · · · , √dn); then SS^t = UDU^t = A, as required.
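A numerical check of the spectral claims around Takagi's theorem (a sketch, not part of the notes). Since A = UDU^t is also a singular value decomposition, the dj coincide with the singular values of A, and σ(AĀ) = {dj²}:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = B + B.T                                   # complex symmetric: A = A^t
d = np.linalg.svd(A, compute_uv=False)        # singular values = Takagi d_j
w = np.linalg.eigvals(A @ A.conj())           # sigma(A conj(A))
print(np.allclose(np.sort(w.real), np.sort(d ** 2)))   # True
print(np.max(np.abs(w.imag)) < 1e-10)                  # True
```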
Lemma 3.2.3. Let A ∈ Mn(R) with eigenvalue λ1 = a + bi (a, b ∈ R, b ≠ 0) and let z1 = x + iy (where x, y ∈ Rn) be an eigenvector belonging to λ1. If S = span(x, y), then dim(S) = 2 and S is invariant under A.

Proof. Since A ∈ Mn(R) and b ≠ 0, y ≠ 0. Since A(x − iy) = \overline{A(x + iy)} = \overline{(a + bi)(x + iy)} = (a − bi)(x − iy), z2 = x − iy is also an eigenvector of A, with eigenvalue λ2 = a − bi. If x = cy for some c ∈ R, then x + iy = (c + i)y and x − iy = (c − i)y, so z1, z2 would be dependent; but eigenvectors belonging to the distinct eigenvalues a + bi ≠ a − bi are independent, a contradiction. Thus x and y are independent and dim(S) = 2.

Now suppose z ∈ S. Then z = k1z1 + k2z2 for some k1, k2 ∈ C, and

Az = k1λ1z1 + k2λ2z2,

which is a real vector in spanC(z1, z2) = spanC(x, y); hence Az ∈ spanC(x, y) ∩ Rn = S.
Theorem 3.2.4. Let A ∈ Mn(R). Then there exists a real orthogonal matrix Q such that

Q^tAQ = \begin{pmatrix} A1 & & ∗ \\ & \ddots & \\ 0 & & Ak \end{pmatrix},

where each Aj is a real 1 by 1 matrix or a 2 by 2 real matrix with a nonreal pair of complex conjugate eigenvalues.

Proof. Since A ∈ Mn(R) ⊆ Mn, there exist a unit eigenvector v1 ∈ Cn and λ ∈ C such that Av1 = λv1. We consider two cases:

1. Suppose v1 is real; then λ is real. Extend v1 to an orthonormal basis (v1, · · · , vn) of Rn and let Q1 = [v1 · · · vn]. Then

Q1^tAQ1 = \begin{pmatrix} λ & w^t \\ 0 & A2 \end{pmatrix},

where w ∈ Rn−1 and A2 ∈ Mn−1(R).
2. Suppose v1 is not real, say v1 = x + iy with x, y ∈ Rn and y ≠ 0. Then λ is nonreal and, by 3.2.3, we can take orthonormal q1, q2 ∈ Rn such that span(q1, q2) = S = span(x, y). Extend q1, q2 to an orthonormal basis (q1, q2, · · · , qn) of Rn and let Q1 = [q1 · · · qn]. Then

Q1^tAQ1 = \begin{pmatrix} A1 & ∗ \\ 0 & A2 \end{pmatrix},

where A1 ∈ M2(R) and A2 ∈ Mn−2(R).
By induction, there exists a real orthogonal matrix Q such that

Q^tAQ = \begin{pmatrix} A1 & & ∗ \\ & \ddots & \\ 0 & & Ak \end{pmatrix},

where each Aj is a real 1 by 1 matrix or a 2 by 2 real matrix with a nonreal pair of complex conjugate eigenvalues.
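The real quasi-triangular form of Theorem 3.2.4 is available in scipy as the real Schur decomposition (a sketch on an arbitrary random real matrix):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(6)
A = rng.normal(size=(5, 5))
T, Q = schur(A, output='real')   # Q real orthogonal, T quasi-upper-triangular
                                 # with 1x1 / 2x2 diagonal blocks as in 3.2.4
print(np.allclose(Q @ T @ Q.T, A), np.allclose(Q.T @ Q, np.eye(5)))
```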
Theorem 3.2.5. Let A ∈ Mn(R). Then A is normal if and only if there exists a real orthogonal matrix Q such that

Q^tAQ = \begin{pmatrix} A1 & & 0 \\ & \ddots & \\ 0 & & Ak \end{pmatrix},

where each Aj is a real 1 by 1 matrix or a 2 by 2 matrix of the form Aj = \begin{pmatrix} αj & βj \\ −βj & αj \end{pmatrix} with βj ≠ 0.

Proof. For ⇐, since AA^t = Q diag(A1A1^t, · · · , AkAk^t)Q^t and A^tA = Q diag(A1^tA1, · · · , Ak^tAk)Q^t, we only need to check that Aj^tAj = AjAj^t for Aj = \begin{pmatrix} αj & βj \\ −βj & αj \end{pmatrix}, which is clear by direct computation.

For the other direction, by 3.2.4 there exists a real orthogonal matrix Q ∈ Mn(R) such that
 
Q^tAQ = \begin{pmatrix} R & A01 & A02 & \cdots & A0k \\ & A11 & A12 & \cdots & A1k \\ & & A22 & \cdots & A2k \\ & & & \ddots & \vdots \\ 0 & & & & Akk \end{pmatrix} = B,

where R = \begin{pmatrix} λ1 & & ∗ \\ & \ddots & \\ 0 & & λp \end{pmatrix} ∈ Mp(R) and A11, · · · , Akk ∈ M2(R) each have a nonreal pair of complex conjugate eigenvalues. We claim that B = diag(λ1, · · · , λp, A11, · · · , Akk).

Proof of claim. Since A is normal, B is also normal. Comparing the (1,1) blocks of B^tB = BB^t gives R^tR = RR^t + A01A01^t + · · · + A0kA0k^t. Taking traces and using tr(R^tR) = tr(RR^t),

∑_{j=1}^k tr(A0jA0j^t) = 0,

and since each A0jA0j^t ≥ 0 this forces A0j = 0 for all j. Then R^tR = RR^t, and a normal upper triangular matrix is diagonal, so R is diagonal. Repeating the argument on the remaining blocks (by induction),

B = diag(λ1, · · · , λp, A11, · · · , Akk).
 
Finally, suppose Aii = \begin{pmatrix} a & b \\ c & d \end{pmatrix} ∈ M2(R). Since Aii is normal, b² = c² and ac + bd = ab + cd. If b = c, then Aii is symmetric, so σ(Aii) ⊆ R, a contradiction. Hence c = −b ≠ 0, and then ac + bd = ab + cd =⇒ a = d.
Corollary 3.2.5.1. Let A ∈ Mn(R). Then A = A^t if and only if there exists a real orthogonal matrix Q such that Q^tAQ is diagonal.

Proof. ⇐ is clear. For the other direction, since A is normal, by 3.2.5 there exists a real orthogonal matrix Q such that Q^tAQ = diag(A1, · · · , Ak) with each Aj as in 3.2.5. Since A is symmetric, each Aj is symmetric, which rules out the 2 by 2 blocks; therefore Q^tAQ is diagonal.
Lemma 3.2.6. Let A, B ∈ Mn.
1. If S ⊆ Cn is an invariant subspace of A, then mA|S divides mA.
2. If A is diagonalizable and S ⊆ Cn is an invariant subspace of A, then A|S is diagonalizable.
3. If AB = BA and E(λ; A) is an eigenspace of A, then E(λ; A) is invariant under B.
4. If A, B are diagonalizable and AB = BA, then A and B can be diagonalized simultaneously.
5. If A, B ∈ Mn(R) are symmetric and AB = BA, then A and B can be orthogonally diagonalized simultaneously.
Proof.
1. Let µ(x) = mA(x). Since S is invariant under A, µ(A|S) = µ(A)|S = 0, so mA|S divides µ(x).
2. Since mA splits with no repeated factors and mA|S divides mA, mA|S also splits with no repeated factors; thus A|S is diagonalizable.
3. Let x ∈ E(λ; A). Then A(Bx) = B(Ax) = B(λx) = λ(Bx), hence Bx ∈ E(λ; A).
4. Since A is diagonalizable, Cn = ⊕_j E(λj; A). By 3, each E(λj; A) is invariant under B, and by 2, B|E(λj;A) is diagonalizable. Since A|E(λj;A) = λjI, diagonalizing B|E(λj;A) on each summand simultaneously diagonalizes A and B.
5. Similar to 4.
Theorem 3.2.7. Let A ∈ Mn. Then A = A^t and A is normal if and only if there exists a real orthogonal matrix Q such that Q^tAQ = D, where D is diagonal.

Proof. ⇐ is clear. For the other direction, write A = B + iC with B, C ∈ Mn(R). Since A = A^t, B = B^t and C = C^t. Since AA∗ = A∗A, BC = CB. Then by 3.2.6, there exists a real orthogonal matrix Q such that Q^tBQ and Q^tCQ are both diagonal. Then Q^tAQ = Q^tBQ + i(Q^tCQ) is diagonal.

Lemma 3.2.8. Let x1, · · · , xk ∈ Cn be given vectors with k ≤ n. Let X = [x1 · · · xk] ∈ Mn×k and r = rank(X^tX). Then there exist y1, · · · , yk ∈ Cn with span(x1, · · · , xk) = span(y1, · · · , yk) such that

Y^tY = \begin{pmatrix} Ir & 0 \\ 0 & 0 \end{pmatrix}, where Y = [y1 · · · yk].

Proof. Since X^tX is symmetric, by 3.2.2 there exists a unitary U such that X^tX = UΛU^t, where Λ = diag(λ1, · · · , λk) with λj ≥ 0. Since r = rank(X^tX) = rank(Λ), we may rearrange so that λ1 ≥ · · · ≥ λr > 0 = λr+1 = · · · = λk.

Let D1 = diag(√λ1, · · · , √λr, 1, · · · , 1) ∈ Mk and D2 = diag(1, · · · , 1, 0, · · · , 0) ∈ Mk with r ones. Then X^tX = UΛU^t = UD1D2D1U^t = S^tD2S, where S = D1U^t is invertible. Then (XS−1)^t(XS−1) = D2. Let Y = XS−1 = [y1 · · · yk]. Then span(y1, · · · , yk) = span(x1, · · · , xk) and Y^tY = D2.
Theorem 3.2.9. Let A ∈ Mn be symmetric. Then A = SDS−1 for a diagonal D and an invertible S ∈ Mn if and only if A = QDQ^t, where Q ∈ Mn with Q^tQ = In.

Proof. ⇐ is clear. For the other direction, suppose that A = A^t and A = SDS−1. We may assume that D = ⊕_{j=1}^d λjInj, where n1 + · · · + nd = n and λi ≠ λj for i ≠ j. Partition the columns of S = [s1 · · · sn] = [V1 · · · Vd] conformally with D, so Vj ∈ Mn×nj for j = 1, · · · , d. We claim that Vi^tVj = 0 ∈ Mni×nj for i ≠ j.

Proof of claim. Let xi be a column of Vi and xj a column of Vj with i ≠ j. Then λixi^txj = (Axi)^txj = xi^tAxj = xi^t(λjxj) = λjxi^txj =⇒ xi^txj = 0 since λi ≠ λj.

By the claim, S^tS = diag(V1^tV1, · · · , Vd^tVd), and each Vj^tVj is invertible since S^tS is. By 3.2.8, there exist Qj ∈ Mn×nj and invertible Rj ∈ Mnj such that Qj = VjRj and Qj^tQj = Inj. Then Qi^tQj = Ri^tVi^tVjRj = 0 for i ≠ j, and AQj = AVjRj = λjVjRj = λjQj. Let Q = [Q1 · · · Qd] ∈ Mn. Then Q^tQ = In and Q^tAQ = D, as required.

Chapter 4

The Polar Form and the Singular Value Decomposition

4.1 Preliminary
Lemma 4.1.1. Let A ∈ Mm×n with m ≤ n and rank(A) = k. Then there exist
1. a unitary X ∈ Mm,
2. D = diag(λ1, · · · , λm) ∈ Mm with λ1 ≥ · · · ≥ λk > λk+1 = · · · = λm = 0,
3. Y ∈ Mm×n with Y Y∗ = Im,
such that A = XDY.
Proof. Since AA∗ ≥ 0 and rank(AA∗) = k, there exists a unitary X = [x1 · · · xm] ∈ Mm such that

X∗AA∗X = diag(u1, · · · , um), u1 ≥ · · · ≥ uk > uk+1 = · · · = um = 0.

Let λi = √ui for 1 ≤ i ≤ m and define yi = λi^{-1}(A∗xi) ∈ Cn for 1 ≤ i ≤ k; then ⟨yi, yj⟩ = δij for 1 ≤ i, j ≤ k. Extend y1, · · · , yk to an orthonormal set y1, · · · , ym in Cn (possible since m ≤ n), and let Y ∈ Mm×n have rows y1∗, · · · , ym∗. Then Y Y∗ = Im, and we check that

X∗A = \begin{pmatrix} λ1y1∗ \\ \vdots \\ λkyk∗ \\ 0 \\ \vdots \\ 0 \end{pmatrix} = DY.

For 1 ≤ i ≤ k, yi = λi^{-1}(A∗xi) =⇒ λiyi∗ = xi∗A. For k < i ≤ m, ⟨A∗xi, A∗xi⟩ = xi∗AA∗xi = ui = 0 =⇒ A∗xi = 0 =⇒ xi∗A = 0. Hence A = XDY.
Lemma 4.1.2. In Lemma 4.1.1,
1. D is unique and {λ1², · · · , λm²} are the eigenvalues of AA∗.
2. the columns of X are eigenvectors of AA∗.
3. if λi ≠ λj for all i ≠ j and A = X′DY′, where X′ ∈ Mm is unitary and Y′ ∈ Mm×n with Y′(Y′)∗ = Im, then X′ = X diag(e^{iθ1}, · · · , e^{iθm}) for some θ1, · · · , θm ∈ R.
4. if rank(A) = m, then, given X, Y is unique.
5. if A ∈ Mm×n(R), then X and Y may be taken to be real.
Proof. Suppose that A = XDY, where X ∈ Mm is unitary, D ∈ Mm is diagonal with λ1 ≥ · · · ≥ λk > λk+1 = · · · = λm = 0 and Y ∈ Mm×n with Y Y∗ = Im.
1. AA∗ = XD²X∗ with X unitary, so AA∗ ≃ D² and D is unique.
2. X∗AA∗X = D² with X unitary, so the columns of X are eigenvectors of AA∗.
3. Let X = [x1 · · · xm], X′ = [x′1 · · · x′m]. By 2, the columns of X, X′ are eigenvectors of AA∗. Since λi ≠ λj for i ≠ j, each eigenspace of AA∗ is one-dimensional, so x′i = rixi with |ri| = 1, i.e., ri = e^{iθi} for some θi ∈ R. Hence X′ = X diag(e^{iθ1}, · · · , e^{iθm}).
4. If rank(A) = m, then rank(AA∗) = m, so AA∗, D² and D are invertible. Therefore A = XDY ⟺ Y = D−1X∗A is unique.
5. If A ∈ Mm×n(R) with rank(A) = k, then AA∗ is a real symmetric matrix, so X may be taken real; the rest is similar to 4.1.1.
4.2 Polar Form and SVD


Theorem 4.2.1. Let A ∈ Mm×n with m ≤ n. Then there exist P ∈ Mm with P ≥ 0 and U ∈ Mm×n with UU∗ = Im such that A = PU. Moreover,
1. rank(P) = rank(A).
2. P is unique (P = √(AA∗)).
3. U is unique if rank(A) = m.
4. if A is real, then P and U may be taken to be real.
Proof. By 4.1.1, A = XDY, where X is unitary, D = diag(λ1, · · · , λm) with λ1 ≥ · · · ≥ λm ≥ 0 and Y ∈ Mm×n with Y Y∗ = Im. Write A = XDY = (XDX∗)(XY) and let P = XDX∗, U = XY. Then P ≥ 0 and UU∗ = XY Y∗X∗ = XX∗ = Im.
Now suppose A = PU, where P ∈ Mm with P ≥ 0 and U ∈ Mm×n with UU∗ = Im.
1. rank(A) = rank(AA∗) = rank(PUU∗P∗) = rank(PP∗) = rank(P).
2. 0 ≤ AA∗ = P². Since P ≥ 0, 3.1.9 gives P = √(AA∗).
3. If rank(A) = m = rank(P), then P is invertible and U = P−1A is unique.
4. By 4.1.2, and we are done.
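A sketch of the polar form following the proof (P = √(AA∗), U = P−1A; it assumes A is square and invertible so that P−1 exists; the helper name is ours):

```python
import numpy as np

def polar_left(A):
    """Left polar form A = P U for square invertible A:
    P = sqrt(A A*) >= 0 and U = P^{-1} A satisfies U U* = I."""
    lam, V = np.linalg.eigh(A @ A.conj().T)
    P = (V * np.sqrt(np.clip(lam, 0.0, None))) @ V.conj().T
    U = np.linalg.solve(P, A)                 # U = P^{-1} A
    return P, U

rng = np.random.default_rng(7)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
P, U = polar_left(A)
print(np.allclose(P @ U, A), np.allclose(U @ U.conj().T, np.eye(4)))
```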
Corollary 4.2.1.1. If B ∈ Mm×n with m ≥ n, then there exist Q ∈ Mn with Q ≥ 0 and W ∈ Mm×n with W∗W = In such that B = WQ.
Theorem 4.2.2. Let A ∈ Mn and let A = PU be a polar form. Then A is normal if and only if PU = UP.

Proof. (⇐): If P and U commute, then AA∗ = PUU∗P∗ = P² and A∗A = U∗P∗PU = U∗P²U = U∗UP² = P², so A is normal.
(⇒): If A is normal, then P² = AA∗ = A∗A = U∗P²U = (U∗PU)². Since P, U∗PU ≥ 0, 3.1.9 gives P = U∗PU ⟺ UP = PU.
Theorem 4.2.3. Let A ∈ Mm×n with rank(A) = k and let q = min{m, n}. Then there exist unitary V ∈ Mm, W ∈ Mn and D = [dij] ∈ Mm×n with dij = 0 for all i ≠ j and d11 ≥ · · · ≥ dkk > d(k+1)(k+1) = · · · = dqq = 0 such that A = V DW∗. Moreover,
1. D is unique and {d11², · · · , dkk²} = σ(AA∗)\{0}.
2. The columns of V are eigenvectors of AA∗ and the columns of W are eigenvectors of A∗A.
3. If A = V1DW1∗ with V1 ∈ Mm, W1 ∈ Mn unitary, m ≤ n and AA∗ has distinct eigenvalues, then V1 = V diag(e^{iθ1}, · · · , e^{iθm}) and W1 = W diag(e^{iθ1}, · · · , e^{iθn}).
4. If m = n = k and V is given, then W is unique.
5. If A ∈ Mm×n(R), then V and W may be taken to be real.
Proof. We may assume that m ≤ n (otherwise, replace A by A∗). By 4.1.1, A = XD′Y, where X ∈ Mm is unitary, D′ = diag(λ1, · · · , λm) with λ1 ≥ · · · ≥ λk > λk+1 = · · · = λm = 0, and Y ∈ Mm×n with Y Y∗ = Im. Let V = X and D = [D′ 0] ∈ Mm×n. The columns of Y∗ ∈ Mn×m are orthonormal, so we can extend them by S ∈ Mn×(n−m) so that W = [Y∗ S] ∈ Mn is unitary. Then

V DW∗ = X [D′ 0] \begin{pmatrix} Y \\ S∗ \end{pmatrix} = XD′Y = A.

The rest is similar to 4.1.2.
Definition 4.2.4. The eigenvalues of √(AA∗) are called the singular values of A, with notation s1(A) ≥ · · · ≥ sm(A) ≥ 0.
Proposition 4.2.5. Let A ∈ Mn. Then
1. s1(A) = ∥A∥.
2. rank(A) = the number of si(A)'s which are positive.
3. σ(√(A∗A)) = σ(√(AA∗)) and AA∗ ≃ A∗A.
4. A is invertible ⟺ sn(A) > 0. In this case, ∥A−1∥ = 1/sn(A).
5. ∥A∥F = (∑_j sj(A)²)^{1/2} and |det(A)| = ∏_j sj(A).

Proof. Write A = V DW∗, where V, W ∈ Mn are unitary and D = diag(s1(A), · · · , sn(A)).
1. ∥A∥² = max_{∥x∥=1}⟨Ax, Ax⟩ = max_{∥x∥=1}⟨DW∗x, DW∗x⟩ = max_{∥x∥=1}∥Dx∥² = ∥D∥² = s1(A)².
2. rank(A) = rank(A∗A) = rank(WD²W∗) = rank(D²) = the number of si(A)'s which are positive.
3. Since AA∗ = V D²V∗ and A∗A = WD²W∗, AA∗ ≃ D² ≃ A∗A; therefore σ(√(AA∗)) = σ(√(A∗A)).
4. Clear from 2; moreover A−1 = WD−1V∗, whose largest singular value is 1/sn(A).
5. ∥A∥F² = tr(A∗A) = tr(WD²W∗) = tr(D²) = ∑_j sj(A)², and |det(A)| = |det(V)||det(D)||det(W)| = ∏_j sj(A).
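A numerical spot-check of Proposition 4.2.5 on a random invertible matrix (a sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
s = np.linalg.svd(A, compute_uv=False)        # s_1(A) >= ... >= s_n(A)

print(np.isclose(s[0], np.linalg.norm(A, 2)))                       # item 1
print(np.isclose(np.sqrt(np.sum(s ** 2)), np.linalg.norm(A, 'fro')))  # item 5
print(np.isclose(np.prod(s), abs(np.linalg.det(A))))                # item 5
print(np.isclose(1 / s[-1], np.linalg.norm(np.linalg.inv(A), 2)))   # item 4
```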
 
Corollary 4.2.5.1. Let A ∈ Mm×n, q = min{m, n}, and let B = \begin{pmatrix} 0 & A \\ A∗ & 0 \end{pmatrix} ∈ Mm+n. The ordered eigenvalues of B are

−s1(A) ≤ · · · ≤ −sq(A) ≤ 0 = · · · = 0 ≤ sq(A) ≤ · · · ≤ s1(A),

with |m − n| zeros in the middle.

Proof. We only show the case q = n ≤ m. By the SVD, A = V DW∗, where V ∈ Mm, W ∈ Mn, D ∈ Mm×n are as described in 4.2.3. Write V = [V1 V2] with V1 ∈ Mm×n, let V̂ = (1/√2)V1, Ŵ = (1/√2)W, D̂ = diag(s1(A), · · · , sn(A)), and define

U = \begin{pmatrix} V̂ & −V̂ & V2 \\ Ŵ & Ŵ & 0 \end{pmatrix} ∈ Mm+n.

Then UU∗ = Im+n, 2V̂D̂Ŵ∗ = V1D̂W∗ = A, and U diag(D̂, −D̂, 0m−n)U∗ = B. Hence B ≃ diag(D̂, −D̂, 0m−n), as required.

Chapter 5

QR-Decomposition and LU-Decomposition

5.1 QR-Decomposition
Theorem 5.1.1. Let A ∈ Mm×n with m ≥ n. Then there exist Q ∈ Mm×n with Q∗Q = In and an upper triangular matrix R ∈ Mn such that A = QR. Moreover, if A is a real matrix, then Q and R may be taken to be real.

Proof. Let A = [x1 · · · xn] ∈ Mm×n and apply the Gram–Schmidt process to x1, · · · , xn.
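A sketch of the Gram–Schmidt construction used in the proof (it assumes A has full column rank so that no rjj vanishes; the helper name is ours):

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR by classical Gram-Schmidt for full column rank A:
    Q has orthonormal columns, R is upper triangular, A = QR."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        v = A[:, j].astype(complex)
        for i in range(j):
            R[i, j] = Q[:, i].conj() @ A[:, j]   # projection coefficient
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.random.default_rng(9).normal(size=(5, 3))
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.conj().T @ Q, np.eye(3)))
```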
Corollary 5.1.1.1. Let A ∈ Mm×n with m ≥ n. If rank(A) = n, then there exist a unique Q ∈ Mm×n with Q∗Q = In and a unique upper triangular R = [rij] ∈ Mn with positive diagonal such that A = QR.

Proof. The existence is similar to 5.1.1. Assume that A = [x1 · · · xn] = QR = Q′R′, where Q = [z1 · · · zn], Q′ = [z′1 · · · z′n], R = [rij], R′ = [r′ij] are as described above.

For subspaces T ⊆ S of Cn, define S ⊖ T = {x ∈ S : x ⊥ T}. Then

zj ⊥ span(z1, · · · , zj−1) =⇒ zj ∈ span(x1, · · · , xj) ⊖ span(x1, · · · , xj−1),

and the latter space has dimension 1. Similarly, z′j ∈ span(x1, · · · , xj) ⊖ span(x1, · · · , xj−1). Then z′j = λjzj for some |λj| = 1, and since rjj, r′jj > 0, a direct computation gives zj = z′j. Hence Q = Q′ and therefore R = R′.
Theorem 5.1.2. Let A ∈ Mn with A ≥ 0. Then A = LL∗, where L = [ℓij] is lower triangular with nonnegative diagonal. Moreover, if A > 0, then L is unique.

Proof. By 3.1.9 and 5.1.1, write √A = Q′R′, where Q′ is unitary and

R′ = \begin{pmatrix} |r11|e^{iθ1} & & ∗ \\ & \ddots & \\ 0 & & |rnn|e^{iθn} \end{pmatrix}.

Choose D = diag(e^{iθ1}, · · · , e^{iθn}) and let Q = Q′D, R = D∗R′. Then Q is unitary,

R = \begin{pmatrix} |r11| & & ∗ \\ & \ddots & \\ 0 & & |rnn| \end{pmatrix},

and QR = Q′DD∗R′ = Q′R′ = √A. Therefore A = (√A)² = (√A)∗(√A) = R∗Q∗QR = R∗R, so L = R∗ works.

Assume that A > 0 and A = LL∗ = L′(L′)∗, where L = [ℓij], L′ = [ℓ′ij] are as described above. Since A > 0 is invertible, L and L′ are invertible, hence ℓii, ℓ′ii > 0. Then a direct computation gives ℓij = ℓ′ij.
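numpy computes the factor of Theorem 5.1.2 directly when A > 0 (a sketch; the random positive definite A is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(10)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = X @ X.conj().T + 4 * np.eye(4)            # A > 0
L = np.linalg.cholesky(A)                     # lower triangular, positive diag
print(np.allclose(L @ L.conj().T, A), np.allclose(L, np.tril(L)))
```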

5.2 LU-Decomposition
Definition 5.2.1. Let A = [aij] ∈ Mn. The leading principal submatrix of order k, denoted by ℓpk(A), is the matrix

\begin{pmatrix} a11 & \cdots & a1k \\ \vdots & & \vdots \\ ak1 & \cdots & akk \end{pmatrix}.

Theorem 5.2.2. Let A = [aij] ∈ Mn with rank(A) = k. If det(ℓpj(A)) ≠ 0 for all j = 1, · · · , k, then A = LU, where L, U ∈ Mn with L lower triangular and U upper triangular.

Proof. First, by direct computation (Gaussian elimination), A1 := ℓpk(A) = L1U1, where

L1 = \begin{pmatrix} ℓ11 & & 0 \\ & \ddots & \\ ∗ & & ℓkk \end{pmatrix}, U1 = \begin{pmatrix} 1 & & ∗ \\ & \ddots & \\ 0 & & 1 \end{pmatrix}

with ℓ11, · · · , ℓkk ≠ 0.

Write A = \begin{pmatrix} A1 & A2 \\ A3 & A4 \end{pmatrix}. Since rank(A1) = k, the rows of [A1 A2] are independent, so the rows of [A3 A4] are combinations of the rows of [A1 A2]: [A3 A4] = B[A1 A2] for some B ∈ M(n−k)×k, i.e., A3 = BA1 and A4 = BA2. Let

L = \begin{pmatrix} L1 & 0 \\ A3U1^{-1} & In−k \end{pmatrix} and U = \begin{pmatrix} U1 & L1^{-1}A2 \\ 0 & 0 \end{pmatrix}.

Then LU = \begin{pmatrix} A1 & A2 \\ A3 & A3U1^{-1}L1^{-1}A2 \end{pmatrix} = A, since A3U1^{-1}L1^{-1}A2 = BA1A1^{-1}A2 = BA2 = A4.

Corollary 5.2.2.1. Let A = [aij] ∈ Mn be invertible. Then A = LU if and only if det(ℓpj(A)) ≠ 0 for all 1 ≤ j ≤ n.

Proof. (⇐): By 5.2.2. (⇒): Write

A = \begin{pmatrix} A1 & A2 \\ A3 & A4 \end{pmatrix} = \begin{pmatrix} L1 & 0 \\ L2 & L3 \end{pmatrix} \begin{pmatrix} U1 & U2 \\ 0 & U3 \end{pmatrix},

where A1, L1, U1 ∈ Mj for 1 ≤ j ≤ n. Then A1 = L1U1. Since A is invertible, det(L), det(U) ≠ 0 =⇒ det(L1), det(U1) ≠ 0 =⇒ det(ℓpj(A)) = det(A1) ≠ 0.
Lemma 5.2.3. Let A ∈ Mn with det(A) ≠ 0. Then there exists a permutation matrix P ∈ Mn such that det(ℓpj(PA)) ≠ 0 for all 1 ≤ j ≤ n.

Proof. We prove it by induction on n:

1. Suppose n = 2, so A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} with det(A) = ad − bc ≠ 0. If a ≠ 0, let P = I2 and we are done. If a = 0, then b, c ≠ 0; let P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, so that PA = \begin{pmatrix} c & d \\ 0 & b \end{pmatrix}.

2. Assume the statement holds for n − 1.

3. Let A = [a1 · · · an−1 an] be invertible. Then [a1 · · · an−1] ∈ Mn×(n−1) has n − 1 independent rows, and we may permute these to the first n − 1 positions:

P1A = \begin{pmatrix} A′ & ∗ \\ ∗ & ∗ \end{pmatrix},

where A′ ∈ Mn−1 is invertible. Applying the induction hypothesis to A′, there exists a permutation matrix P2 ∈ Mn−1 such that det(ℓpj(P2A′)) ≠ 0 for j = 1, · · · , n − 1. Let P = \begin{pmatrix} P2 & 0 \\ 0 & 1 \end{pmatrix} P1 ∈ Mn. Then P is a permutation matrix and

PA = \begin{pmatrix} P2A′ & ∗ \\ ∗ & ∗ \end{pmatrix},

so det(ℓpj(PA)) ≠ 0 for j = 1, · · · , n − 1. Since det(ℓpn(PA)) = det(PA) = ±det(A) ≠ 0, we are done.
Theorem 5.2.4. Let A ∈ Mn with det(A) ̸= 0. Then there exists a permutation matrix P ∈ Mn such that
P A = LU .

Proof. By 5.2.3, there exists P such that det(ℓpj (P A)) ̸= 0 for 1 ≤ j ≤ n. Now by 5.2.2.1, we are done.
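scipy's pivoted LU realizes Theorem 5.2.4 (a sketch; scipy's convention returns A = PLU, i.e. P^tA = LU since P is a permutation matrix):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(11)
A = rng.normal(size=(5, 5))
P, L, U = lu(A)                  # L unit lower triangular, U upper triangular
print(np.allclose(P @ L @ U, A))
```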
Theorem 5.2.5. Let A ∈ Mn. Then there exist permutation matrices P, Q ∈ Mn such that A = PLUQ.

Proof. Suppose rank(A) = k. Then there exist permutation matrices P1, Q1 ∈ Mn such that

P1AQ1 = \begin{pmatrix} A1 & ∗ \\ ∗ & ∗ \end{pmatrix},

where A1 ∈ Mk is invertible. By 5.2.3, there exists a permutation matrix P2 = \begin{pmatrix} P2′ & 0 \\ 0 & In−k \end{pmatrix} ∈ Mn such that det(ℓpj(P2P1AQ1)) ≠ 0 for 1 ≤ j ≤ k. By 5.2.2, P2P1AQ1 = LU, so A = P1^tP2^tLUQ1^t, and we are done.

Chapter 6

Gerschgorin Discs
Definition 6.0.1. Let A = [aij] ∈ Mn and let ri(A) = ∑_{j=1, j≠i}^n |aij| for 1 ≤ i ≤ n. The Gerschgorin discs are Di = {z ∈ C : |z − aii| ≤ ri(A)}.
Theorem 6.0.2 (Gerschgorin). Let A ∈ Mn.
1. σ(A) ⊆ ∪_{i=1}^n Di =: G(A).
2. If k discs are disjoint from the others, their union contains exactly k eigenvalues.
 
Proof. Let λ ∈ σ(A) and suppose Ax = λx with x = (x1, · · · , xn)^t ≠ 0. Let |xp| = max_{1≤i≤n}|xi| > 0. Then Ax = λx =⇒ λxp = ∑_{j=1}^n apjxj ⟺ xp(λ − app) = ∑_{j≠p} apjxj. Hence,

|xp||λ − app| ≤ |xp|rp(A) =⇒ λ ∈ Dp ⊆ ∪_{i=1}^n Di.

For 2, write A = D + B, where D = diag(a11, · · · , ann), and let At = D + tB, so A1 = A and At → At0 as t → t0. Define Di(t) = {z ∈ C : |z − aii| ≤ ri(At) = t ri(A)}; then Di(t) ⊆ Di(1) = Di for all 1 ≤ i ≤ n and t ∈ [0, 1]. One can show that the eigenvalues λj(At) may be chosen continuously in t ∈ [0, 1]; we omit the proof. Now suppose (D1 ∪ · · · ∪ Dk) ∩ Dm = ∅ for k < m ≤ n. Since λj(At) ∈ ∪_{i=1}^n Di(t) ⊆ ∪_{i=1}^n Di for all t and λj(A0) = ajj ∈ Dj, continuity forces each eigenvalue path starting in D1 ∪ · · · ∪ Dk to remain in D1 ∪ · · · ∪ Dk, and we are done.
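A quick check of part 1 on a random matrix (a sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(12)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)    # r_i(A)
for lam in np.linalg.eigvals(A):
    print(np.any(np.abs(lam - centers) <= radii))      # True for each one
```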
Theorem 6.0.3. Let A ∈ Mn. Then σ(A) = ∩_{det(S)≠0} G(S−1AS).

Proof. Since σ(A) = σ(S−1AS) ⊆ G(S−1AS) for every invertible S ∈ Mn, σ(A) ⊆ ∩_{det(S)≠0} G(S−1AS).

For the reverse inclusion, by the Jordan form there exists an invertible S such that

S−1AS = ⊕_{j=1}^k Jnj(λj).

Let Dj(t) = diag(1, t, · · · , t^{nj−1}) for t > 0. Then

Dj(t)Jnj(λj)Dj(t)−1 = \begin{pmatrix} λj & t^{−1} & & \\ & λj & \ddots & \\ & & \ddots & t^{−1} \\ & & & λj \end{pmatrix}.

Take D(t) = ⊕_j Dj(t), so that D(t)S−1ASD(t)−1 = ⊕_j Dj(t)Jnj(λj)Dj(t)−1. Hence

∩_{det(S)≠0} G(S−1AS) ⊆ G(D(t)S−1ASD(t)−1) → σ(A)

as t → ∞, so ∩_{det(S)≠0} G(S−1AS) ⊆ σ(A).

Definition 6.0.4. We say that A = [aij] ∈ Mn is (strictly) diagonally dominant if |aii| ≥ ri(A) (resp. |aii| > ri(A)) for every 1 ≤ i ≤ n.

Theorem 6.0.5. Let A = [aij] ∈ Mn be strictly diagonally dominant.
1. A is invertible.
2. If aii > 0 for all i, then every eigenvalue of A has positive real part.
3. If aii > 0 for all i and A is Hermitian, then all the eigenvalues of A are real and positive.

Theorem 6.0.6. Let A = [aij] ∈ Mn be diagonally dominant with aii ≠ 0 for all i and |aii| > ri(A) for at least n − 1 values of i. Then A is invertible.

Proof. We may assume |akk| = rk(A) for exactly one index k. Given ϵ > 0, let Dϵ = diag(1, · · · , 1, 1 + ϵ, 1, · · · , 1), with 1 + ϵ in the kth position. Then Dϵ−1ADϵ agrees with A except that the off-diagonal entries of the kth column are multiplied by 1 + ϵ and the off-diagonal entries of the kth row are divided by 1 + ϵ:

Dϵ−1ADϵ = \begin{pmatrix} & & (1 + ϵ)a1k & & \\ & & \vdots & & \\ ak1/(1 + ϵ) & \cdots & akk & \cdots & akn/(1 + ϵ) \\ & & \vdots & & \\ & & (1 + ϵ)ank & & \end{pmatrix}.

Therefore the kth row satisfies ∑_{j≠k} |akj|/(1 + ϵ) = rk(A)/(1 + ϵ) < |akk|, and we can choose ϵ so small that ri(A) + ϵ|aik| < |aii| for every i ≠ k. Hence Dϵ−1ADϵ is strictly diagonally dominant, so by 6.0.5 it is invertible, and therefore A is invertible.