Linear Algebra: Lecture Notes
Rostyslav Hryniv
        1st term
      Autumn 2019
       go to socrative.com
       press student login
       enter the room LAUCU2019
       answer 10 questions on eigenvalues and eigenvectors
Outline
1    Singular Value Decomposition
        Spectral Theorem
        Low rank approximations
        Definition
        Explanation and proof
        Applications of SVD
2    LU decomposition
        Linear systems
        Elementary matrices
        LU factorization
3    Cholesky decomposition
        Motivation
        Applications of Cholesky decomposition
        Algorithm
4    QR and around
        Applications of QR
Spectral Theorem: a symmetric n × n matrix A with eigenvalues λ1, . . . , λn
and an ONB of eigenvectors u1, . . . , un can be written as

        A = λ1 u1 u1^T + · · · + λn un un^T

Example: let

        A = [ 15  10
              10   0 ]
Why low-rank?
Cost comparison:
       a full m × n matrix requires mn numbers to store;
       a rank-one matrix σuv^T requires only m + n + 1 numbers;
       important e.g. for image compression
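For instance, a 1000 × 1000 image takes 10^6 numbers at full storage, while a
rank-one approximation takes only 1000 + 1000 + 1 = 2001.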
       If A is non-symmetric, then
              its eigenvectors need not be orthogonal, or
              there may be too few eigenvectors (one has to use generalized EVc’s)
       If A is non-square, there are no eigenvalues and eigenvectors at all!
The matrix uv^T has rows u1 v^T, . . . , um v^T; if the rows of A are
a1^T, . . . , am^T, then

        ‖A − uv^T‖_F^2 = Σ_{j=1}^m ‖a_j^T − u_j v^T‖^2 = Σ_{j=1}^m ‖a_j − u_j v‖^2

This is minimal if u_j v is the projection P∥ a_j of a_j onto ls(v)
(for a unit vector v):

        Σ_{j=1}^m ‖a_j − P∥ a_j‖^2 = Σ_{j=1}^m ‖P⊥ a_j‖^2
                                   = Σ_{j=1}^m ‖a_j‖^2 − Σ_{j=1}^m ‖P∥ a_j‖^2

Thus we need to maximize

        Σ_{j=1}^m ‖P∥ a_j‖^2 = Σ_{j=1}^m |a_j^T v|^2 = Σ_{j=1}^m |v^T a_j|^2 = ‖Av‖^2
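This maximum is attained at the top right singular vector v1 and equals σ1^2.
A minimal numpy sketch of this fact (variable names are illustrative; the
matrix is the example above):

import numpy as np

A = np.array([[15.0, 10.0],
              [10.0,  0.0]])

# Top singular triple: v1 maximizes ||A v|| over unit vectors v
U, s, Vt = np.linalg.svd(A)
rank1 = s[0] * np.outer(U[:, 0], Vt[0])   # sigma_1 u_1 v_1^T

# No random unit vector beats v1
rng = np.random.default_rng(0)
for _ in range(5):
    v = rng.standard_normal(2)
    v /= np.linalg.norm(v)
    assert np.linalg.norm(A @ v) <= s[0] + 1e-12

print(np.linalg.norm(A - rank1, "fro"))   # equals sigma_2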
Motivation
                                            A = UΣV^T
Singular values
Definition: the singular values of an m × n matrix A are σ_j := √λ_j, where
λ_1 ≥ · · · ≥ λ_n ≥ 0 are the eigenvalues of A^T A.
Example

A = [ 1  1
      0  1
      1  0 ];    B = A^T A = [ 2  1
                               1  2 ]   has EV’s λ1 = 3 and λ2 = 1;

thus σ1 = √3, σ2 = 1
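A quick numerical check of this example (a sketch; numpy's standard routines
are the only assumption):

import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

B = A.T @ A                                # [[2, 1], [1, 2]]
eigvals = np.linalg.eigvalsh(B)            # ascending: [1, 3]
print(np.sqrt(eigvals[::-1]))              # [sqrt(3), 1]
print(np.linalg.svd(A, compute_uv=False))  # same values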
SVD theorem
Theorem (SVD)
Every m × n matrix A can be written as

        A = UΣV^T

with an orthogonal m × m matrix U, an orthogonal n × n matrix V, and an
m × n matrix Σ that is zero off its main diagonal.
Remark
This is an analogue of the diagonalization A = UDU^T of a symmetric
matrix A
SVD theorem
Theorem (SVD — expanded form)
Every m × n matrix A of rank r can be written as A = UΣV^T, where
       U = (u1 . . . ur | ur+1 . . . um),
       V = (v1 . . . vr | vr+1 . . . vn),
       Σ has σ_j on its main diagonal and zeros otherwise
       v_j are eigenvectors of A^T A with EV’s σ_j^2:    A^T A v_j = σ_j^2 v_j
       u_j := Av_j /‖Av_j‖ = Av_j /σ_j for j = 1, . . . , r form an ONB for the
       range of A
       u1, . . . , um is an ONB for R^m

        A = σ1 u1 v1^T + · · · + σr ur vr^T

The vectors u1, . . . , ur are the left singular vectors of A;
            v1, . . . , vr are the right singular vectors of A
Remark:    Av_j = σ_j u_j,  A^T u_j = σ_j v_j,  AA^T u_j = σ_j^2 u_j
        UΣ = (σ1 u1 . . . σr ur | 0 . . . 0)
           = (Av1 . . . Avr | 0 . . . 0) = A(v1 . . . vn) = AV

(with n − r zero columns)
Example

For
A = [ 1  1
      0  1
      1  0 ],
we find that
       σ1 = √3 and σ2 = 1
       v1 = (1/√2, 1/√2)^T and v2 = (1/√2, −1/√2)^T
       u1 = (1/√3)(√2, 1/√2, 1/√2)^T,
       u2 = (0, −1/√2, 1/√2)^T,
       u3 = (1/√3)(−1, 1, 1)^T

        σ1 u1 v1^T + σ2 u2 v2^T = [ 1    1          [ 0      0
                                    1/2  1/2    +     −1/2   1/2    = A
                                    1/2  1/2 ]        1/2   −1/2 ]

       σ1 u1 v1^T is the best rank-one approximation of A in the Frobenius
       norm ‖A − B‖_F = (Σ_{i,j} (a_ij − b_ij)^2)^{1/2}
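A numpy sketch verifying this worked example (illustrative variable names
only):

import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

U, s, Vt = np.linalg.svd(A)        # s = [sqrt(3), 1]

# Defining relations A v_j = sigma_j u_j and A^T u_j = sigma_j v_j
for j in range(2):
    assert np.allclose(A @ Vt[j], s[j] * U[:, j])
    assert np.allclose(A.T @ U[:, j], s[j] * Vt[j])

# sigma_1 u_1 v_1^T is the best rank-one approximation; its error is sigma_2
A1 = s[0] * np.outer(U[:, 0], Vt[0])
print(np.linalg.norm(A - A1, "fro"))   # 1.0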
Interpretation of SVD

        x ↦ y := V^T x,    y ↦ z := Σy,    z ↦ Ax = Uz
Reduced SVD
       In the SVD representation, some part is uninformative:
              vr+1, . . . , vn are chosen arbitrarily in the nullspace of A
              ur+1, . . . , um are chosen arbitrarily in the nullspace of A^T
              Σ has zero rows or columns
       The reduced SVD removes that uninformative part:

        A = (u1 · · · ur) diag(σ1, . . . , σr) (v1 · · · vr)^T

       with factors of sizes m × r, r × r, and r × n
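In numpy, the economy form is obtained with full_matrices=False; for a
full-rank A it coincides with the reduced SVD above (a sketch):

import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# Reduced ("economy") SVD: U is m x r and Vt is r x n when rank A = min(m, n)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)            # (3, 2) (2,) (2, 2)
print(np.allclose(A, U @ np.diag(s) @ Vt))   # True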
To summarize:
       The SVD for arbitrary rectangular matrices is an analogue of the
       spectral decomposition for square (especially symmetric) matrices
       the factorization
                                    A = UΣV^T
       means:
              rotation V^T: change of basis to v1, . . . , vn
              stretch Σ: multiplication by singular values along the v_j
              rotation U: change of basis to u1, . . . , um
       in particular, Ax = b is equivalent to Σc = d, where
              c := V^T x is the coordinate vector of x in the ONB v1, . . . , vn
              d := U^T b is the coordinate vector of b in the ONB u1, . . . , um;
              thus x = Σ_k c_k v_k and b = Σ_k d_k u_k with d_k = σ_k c_k for
              k = 1, . . . , r
       Geometrically this means that A maps the unit ball B_n of R^n into the
       “degenerate” ellipsoid E_m of R^m:
                                  Σ_k (d_k /σ_k)^2 ≤ 1
                               v1 ↦ e1 ↦ σ1 e1 ↦ σ1 u1
                               v2 ↦ e2 ↦ σ2 e2 ↦ σ2 u2
Polar decomposition
Theorem: every square matrix A can be written as A = QS with an orthogonal
Q and a symmetric positive semidefinite S.
Why polar?
z = re^{iθ} =⇒ z̄z = |z|^2 = r^2
A = QS =⇒ A^T A = S(Q^T Q)S = S^2
Proof.
Write A = UΣV^T = (UV^T)(V ΣV^T) =: QS
      Q := UV^T is orthogonal
      S := V ΣV^T is symmetric and positive semidefinite
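A minimal numpy sketch of this construction (the helper name polar is
illustrative; scipy.linalg.polar provides the same factorization):

import numpy as np

def polar(A):
    """Polar decomposition A = Q S built from the SVD, as in the proof above."""
    U, s, Vt = np.linalg.svd(A)
    Q = U @ Vt                    # orthogonal factor
    S = Vt.T @ np.diag(s) @ Vt    # symmetric positive semidefinite factor
    return Q, S

A = np.array([[15.0, 10.0],
              [10.0,  0.0]])
Q, S = polar(A)
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: Q is orthogonal
print(np.allclose(Q @ S, A))             # True: A = Q S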
Image compression
Instead of storing all m × n numerical entries, one can store the best rank-r
approximation of A; this needs only

                            r (1 + m + n)

numbers
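A compression sketch on synthetic data (illustrative only; a real image would
compress far better than random noise, but the storage count is the point):

import numpy as np

rng = np.random.default_rng(1)
img = rng.standard_normal((200, 300))   # stand-in for an image
m, n = img.shape
r = 20

U, s, Vt = np.linalg.svd(img, full_matrices=False)
approx = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]   # best rank-r approximation

print(m * n)                 # 60000 numbers at full storage
print(r * (1 + m + n))       # 10020 numbers for the rank-r factors
print(np.linalg.norm(img - approx, "fro") / np.linalg.norm(img, "fro"))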
Pseudo-inverse
A rectangular A cannot be inverted!
However, a pseudo-inverse A^+ can be defined s.t.

                        A^+ A ≈ I_n and AA^+ ≈ I_m

for Σ, the pseudo-inverse Σ^+ should satisfy

              Σ^+ Σ = I_r ⊕ 0_{n−r},    Σ Σ^+ = I_r ⊕ 0_{m−r}

       i.e. Σ^+ is the transpose of Σ with each σ_j replaced by 1/σ_j
if A = UΣV^T, then its pseudo-inverse is A^+ := V Σ^+ U^T: indeed,
A^+ A = V(Σ^+ Σ)V^T and AA^+ = U(Σ Σ^+)U^T are the orthogonal projectors
onto the row space and the column space of A, respectively
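A numpy sketch of this formula next to the built-in pseudo-inverse
(illustrative names; the example A has full column rank, so A^+ A = I_2
exactly):

import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# Pseudo-inverse via the SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_plus = Vt.T @ np.diag(1.0 / s) @ U.T

print(np.allclose(A_plus, np.linalg.pinv(A)))  # True
print(np.round(A_plus @ A, 10))                # I_2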
Best solution to Ax = b:
       Recall that Ax = b is solvable ⇐⇒ b belongs to the range (i.e., the
       column space) of A
       if n > rank A, then there are many solutions (or none)
       (the homogeneous equation Ax = 0 has nontrivial solutions)
       any solution has the form x = x0 + x1 with x0 a particular solution
       and x1 any solution of Ax = 0
       if b is not in the range, solve the normal equation A^T Ax = A^T b to
       get the least-squares solution
       if rank A = n, then A^T A is invertible
       otherwise, the least-squares solution is not unique
       look for the shortest solution x̂
       Claim: if A = UΣV^T, then x̂ = V Σ^+ U^T b
       indeed, ‖Ax − b‖ = ‖UΣV^T x − b‖ = ‖ΣV^T x − U^T b‖ = ‖Σy − U^T b‖
       with y := V^T x (orthogonal matrices preserve norms);
       the shortest solution: y = Σ^+ (U^T b) and x = V y = (V Σ^+ U^T) b
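A sketch comparing this pseudo-inverse solution with numpy's least-squares
solver (the vector b is an arbitrary illustrative choice, generally not in the
range of A):

import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
b = np.array([1.0, 2.0, 3.0])

x_hat = np.linalg.pinv(A) @ b       # least-squares solution via A^+
x_lsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_hat, x_lsq))    # True

# x_hat satisfies the normal equation A^T A x = A^T b
print(np.allclose(A.T @ A @ x_hat, A.T @ b))  # True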
SVD vs PCA
       Observe that the largest value of ‖Ax‖ with ‖x‖ ≤ 1 is attained at
       x = v1 and is equal to σ1;
       this is the first principal axis for A^T A:
              indeed, A^T A = V Σ^T U^T UΣV^T = V Σ^T ΣV^T = VDV^T is the
              spectral decomposition of the symmetric matrix B := A^T A
              B has eigenvalues σ_k^2 with eigenvectors v_k
              the quadratic form Q(x) := x^T Bx is equal to ‖Ax‖^2
              by the minimax properties of the eigenvalues,
              σ_2^2 = max{‖Ax‖^2 : ‖x‖ = 1, x ⊥ v1},  σ_3^2 = . . .
Linear systems

                                    Ax = b
              A an m × n matrix
              x ∈ R^n, b ∈ R^m
       Solution: x = A^{−1} b for invertible A (m = n)
       Computation of A^{−1} is costly, ∼ O(n^3), and not always necessary!
       Alternatively, use the Gaussian elimination method
       In matrix form, it amounts to an LU representation of A
       L stands for “lower”- and U for “upper”-triangular
Definition
The matrices that perform the elementary row operations (scaling a row,
interchanging two rows, adding a multiple of one row to another) are
called elementary matrices
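For instance, for 3 × 3 matrices, adding (−2) × (row 1) to row 3 is performed
by left multiplication with the elementary matrix

        E = [  1  0  0
               0  1  0
              −2  0  1 ],

and E^{−1} is the same matrix with −2 replaced by 2.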
Lemma
The product of two lower-triangular (upper-triangular) matrices is
lower-triangular (upper-triangular)
Proof.
Use the row or column form of the matrix-matrix product
LU factorization
Theorem
Assume that an m × n matrix A can be reduced to row echelon form U
using only row replacement operations (adding a multiple of one row to a
row below it). Then A = LU with a lower-triangular m × m matrix L.
Proof.
E_k · · · E_1 A = U =⇒ L = (E_k · · · E_1)^{−1} = E_1^{−1} · · · E_k^{−1}
Definition
The above representation A = LU, with an m × m lower-triangular
matrix L and an upper-triangular∗ m × n matrix U, is the LU-factorization
of A.
Remark (∗ )
       L is unique if all its diagonal entries are 1
       If row interchanges are needed, use PA = LU, with P encoding all
       row interchanges
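A sketch of an LU factorization in scipy (the example matrix reappears in the
Cholesky section below; scipy's convention absorbs the permutation as
A = P L U):

import numpy as np
from scipy.linalg import lu

A = np.array([[4.0,  0.0,  2.0],
              [0.0,  1.0, -1.0],
              [2.0, -1.0,  6.0]])

P, L, U = lu(A)                    # A = P @ L @ U, P encodes row interchanges
print(np.allclose(A, P @ L @ U))   # True
print(np.diag(L))                  # unit diagonal of L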
With A = LU, the system splits into two triangular systems:

                               Ax = b ⇐⇒ { Ux = y
                                          { Ly = b
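A sketch of this two-step solve with scipy's triangular solver (illustrative
matrix and names; solve Ly = b by forward substitution, then Ux = y by back
substitution):

import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[4.0,  0.0,  2.0],
              [0.0,  1.0, -1.0],
              [2.0, -1.0,  6.0]])
b = np.array([1.0, 2.0, 3.0])

P, L, U = lu(A)                                # A = P L U
y = solve_triangular(L, P.T @ b, lower=True)   # forward substitution
x = solve_triangular(U, y, lower=False)        # back substitution
print(np.allclose(A @ x, b))                   # True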
Uniqueness: if A = L1 U1 = L2 U2 are two such factorizations (with unit
diagonals in L1, L2 and invertible U1, U2), then
              L2^{−1} L1 is lower-triangular with 1 on the diagonal
              U2 U1^{−1} is upper-triangular
              thus L2^{−1} L1 = U2 U1^{−1} = I
further:
       if A is nonsingular, U has a nonzero diagonal
       one can “factor it out” as D to get A = LDU with U having 1’s on the
       main diagonal
       for symmetric matrices, U = L^T and A = LDL^T
       reason: A^T = U^T DL^T = LDU = A and use uniqueness
If A = LL^T, then x^T Ax = x^T LL^T x = (L^T x)^T (L^T x) = ‖L^T x‖^2 ≥ 0,
so only positive semidefinite matrices can admit such a factorization
Applications
The algorithm
Idea:
As in Gaussian elimination, make entries below diagonal zero
Recursive algorithm:
   1   start with i := 1 and A^(1) := A
   2   At step i, the matrix A^(i) has the following form:

              A^(i) = [ I_{i−1}   0       0
                        0         a_ii    b_i^T
                        0         b_i     B^(i) ],

       with the identity matrix I_{i−1} of size i − 1
   3   Set

              L_i := [ I_{i−1}   0             0
                       0         √a_ii         0
                       0         b_i /√a_ii    I_{n−i} ]
   4   Then A^(i) = L_i A^(i+1) L_i^T with

              A^(i+1) = [ I_i   0
                          0     B^(i) − (1/a_ii) b_i b_i^T ]

   5   Finally, A^(n+1) = I_n and so we get A = LL^T
       with L := L1 L2 . . . Ln
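A minimal Python sketch of this recursion (the helper name cholesky is
illustrative; assumes A is symmetric positive definite, no pivoting or error
checks):

import numpy as np

def cholesky(A):
    """Return L with A = L L^T, following the recursion above."""
    A = np.array(A, dtype=float)   # work on a copy; A plays the role of A^(i)
    n = A.shape[0]
    L = np.eye(n)
    for i in range(n):
        Li = np.eye(n)
        Li[i, i] = np.sqrt(A[i, i])                # sqrt(a_ii)
        Li[i + 1:, i] = A[i + 1:, i] / Li[i, i]    # b_i / sqrt(a_ii)
        # Pass to A^(i+1): update the trailing block, clear row/column i
        A[i + 1:, i + 1:] -= np.outer(Li[i + 1:, i], Li[i + 1:, i])
        A[i, i] = 1.0
        A[i + 1:, i] = 0.0
        A[i, i + 1:] = 0.0
        L = L @ Li                 # accumulate L = L_1 L_2 ... L_n
    return L

A = np.array([[4.0,  0.0,  2.0],
              [0.0,  1.0, -1.0],
              [2.0, -1.0,  6.0]])
print(cholesky(A))   # matches the worked example that follows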
Example

A = A^(1) = [ 4   0   2
              0   1  −1
              2  −1   6 ]
  =⇒  L1 = [ 2  0  0
             0  1  0
             1  0  1 ]

A^(1) = L1 A^(2) L1^T
  =⇒  A^(2) = [ 1   0   0
                0   1  −1
                0  −1   5 ]
  =⇒  L̃2 = [  1  0
              −1  1 ]

Ã^(2) = L̃2 Ã^(3) L̃2^T
  =⇒  Ã^(3) = [ 1  0
                0  4 ]
  =⇒  L̃3 = [ 1  0
             0  2 ]

L = L1 L2 L3 = [ 2  0  0     [ 1   0  0     [ 1  0  0     [ 2   0  0
                 0  1  0  ·    0   1  0  ·    0  1  0  =    0   1  0
                 1  0  1 ]     0  −1  1 ]     0  0  2 ]     1  −1  2 ]

A = LL^T = [ 2   0  0     [ 2  0   1
             0   1  0  ·    0  1  −1
             1  −1  2 ]     0  0   2 ]
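The same factor comes out of numpy's built-in routine (a quick check of the
example above):

import numpy as np

A = np.array([[4.0,  0.0,  2.0],
              [0.0,  1.0, -1.0],
              [2.0, -1.0,  6.0]])

L = np.linalg.cholesky(A)         # lower-triangular factor with A = L L^T
print(L)                          # [[2, 0, 0], [0, 1, 0], [1, -1, 2]]
print(np.allclose(L @ L.T, A))    # True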
Advantages of QR decomposition:
The orthogonal columns of Q make the algorithm numerically stable
(multiplication by an orthogonal matrix does not increase or decrease norms)
Applications of QR
       QR eigenvalue algorithm (see the sketch below)
              On each step, factorize A_k = Q_k R_k and set A_{k+1} := R_k Q_k
              as R_k = Q_k^{−1} A_k, one gets A_{k+1} = Q_k^{−1} A_k Q_k
              thus A_k and A_{k+1} have the same eigenvalues
              typically, A_k converges to an upper-triangular matrix R
              (the Schur form of A)
              eigenvalues of A = diagonal entries of R
       Fourier transform and Fast Fourier transform
       . . .
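A minimal sketch of the unshifted QR iteration (the helper name and the
symmetric test matrix are illustrative; practical implementations add shifts
and deflation):

import numpy as np

def qr_eigenvalues(A, steps=200):
    """Unshifted QR iteration: A_{k+1} = R_k Q_k keeps the spectrum."""
    Ak = np.array(A, dtype=float)
    for _ in range(steps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.diag(Ak)   # approximate eigenvalues once Ak is nearly triangular

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(qr_eigenvalues(A))       # approx [3, 1]
print(np.linalg.eigvalsh(A))   # exact: [1, 3]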