
L. Vandenberghe ECE133A (Fall 2024)

12. Cholesky factorization

• positive definite matrices


• examples
• Cholesky factorization
• complex positive definite matrices
• kernel methods

12.1
Definitions

• a symmetric matrix 𝐴 ∈ R𝑛×𝑛 is positive semidefinite if

𝑥𝑇 𝐴𝑥 ≥ 0 for all 𝑥

• a symmetric matrix 𝐴 ∈ R𝑛×𝑛 is positive definite if

𝑥𝑇 𝐴𝑥 > 0 for all 𝑥 ≠ 0

the positive definite matrices are a subset of the positive semidefinite matrices

note: if 𝐴 is symmetric and 𝑛 × 𝑛, then 𝑥𝑇 𝐴𝑥 is the function

x^T A x = \sum_{i=1}^n \sum_{j=1}^n A_{ij} x_i x_j = \sum_{i=1}^n A_{ii} x_i^2 + 2 \sum_{i > j} A_{ij} x_i x_j

this is called a quadratic form
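a quick numerical check of this identity, as a sketch in NumPy (the symmetric matrix A and the vector x below are arbitrary random examples):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2                 # symmetrize
x = rng.standard_normal(4)

quad = x @ A @ x                  # x^T A x
diag = np.sum(np.diag(A) * x**2)  # sum_i A_ii x_i^2
off = 2 * sum(A[i, j] * x[i] * x[j] for i in range(4) for j in range(i))
assert np.isclose(quad, diag + off)
```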

Cholesky factorization 12.2


Example

A = \begin{bmatrix} 9 & 6 \\ 6 & a \end{bmatrix}

x^T A x = 9x_1^2 + 12 x_1 x_2 + a x_2^2 = (3x_1 + 2x_2)^2 + (a - 4) x_2^2

• 𝐴 is positive definite for 𝑎 > 4

𝑥𝑇 𝐴𝑥 > 0 for all nonzero 𝑥

• 𝐴 is positive semidefinite but not positive definite for 𝑎 = 4

𝑥𝑇 𝐴𝑥 ≥ 0 for all 𝑥, 𝑥𝑇 𝐴𝑥 = 0 for 𝑥 = (2, −3)

• 𝐴 is not positive semidefinite for 𝑎 < 4

𝑥𝑇 𝐴𝑥 < 0 for 𝑥 = (2, −3)

Cholesky factorization 12.3


Simple properties

• every positive definite matrix 𝐴 is nonsingular

𝐴𝑥 = 0 =⇒ 𝑥𝑇 𝐴𝑥 = 0 =⇒ 𝑥=0

(last step follows from positive definiteness)

• every positive definite matrix 𝐴 has positive diagonal elements

A_{ii} = e_i^T A e_i > 0

• every positive semidefinite matrix 𝐴 has nonnegative diagonal elements

A_{ii} = e_i^T A e_i ≥ 0

Cholesky factorization 12.4


Schur complement

partition 𝑛 × 𝑛 symmetric matrix 𝐴 as


 
A = \begin{bmatrix} A_{11} & A_{2:n,1}^T \\ A_{2:n,1} & A_{2:n,2:n} \end{bmatrix}

• the Schur complement of 𝐴11 is defined as the (𝑛 − 1) × (𝑛 − 1) matrix

S = A_{2:n,2:n} - \frac{1}{A_{11}} A_{2:n,1} A_{2:n,1}^T

• if 𝐴 is positive definite, then 𝑆 is positive definite


to see this, take any x ≠ 0 and define y = -(A_{2:n,1}^T x)/A_{11}; then

x^T S x = \begin{bmatrix} y \\ x \end{bmatrix}^T \begin{bmatrix} A_{11} & A_{2:n,1}^T \\ A_{2:n,1} & A_{2:n,2:n} \end{bmatrix} \begin{bmatrix} y \\ x \end{bmatrix} > 0

because 𝐴 is positive definite
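a small numerical illustration of this property, as a sketch in NumPy (the positive definite matrix A below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M.T @ M + np.eye(5)                                  # positive definite by construction

S = A[1:, 1:] - np.outer(A[1:, 0], A[1:, 0]) / A[0, 0]   # Schur complement of A_11
np.linalg.cholesky(S)                                    # succeeds, so S is positive definite
```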

Cholesky factorization 12.5


Singular positive semidefinite matrices

• we mentioned that positive definite matrices are nonsingular (page 12.4)


• if 𝐴 is positive semidefinite, but not positive definite, then it is singular

to see this, suppose 𝐴 is positive semidefinite but not positive definite

• there exists a nonzero 𝑥 with 𝑥𝑇 𝐴𝑥 = 0


• since 𝐴 is positive semidefinite the following function is nonnegative:

f(t) = (x - tAx)^T A (x - tAx)
     = x^T A x - 2t\, x^T A^2 x + t^2\, x^T A^3 x
     = -2t \|Ax\|^2 + t^2\, x^T A^3 x

• 𝑓 (𝑡) ≥ 0 for all 𝑡 is only possible if ∥ 𝐴𝑥∥ = 0; therefore 𝐴𝑥 = 0


• hence there exists a nonzero 𝑥 with 𝐴𝑥 = 0, so 𝐴 is singular

Cholesky factorization 12.6


Outline

• positive definite matrices


• examples
• Cholesky factorization
• complex positive definite matrices
• kernel methods
Exercise: resistor circuit

[circuit diagram: sources y_1, y_2 drive currents x_1, x_2 through resistors R_1, R_2, with a shared resistor R_3 carrying x_1 + x_2]

\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} R_1 + R_3 & R_3 \\ R_3 & R_2 + R_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

show that the matrix

A = \begin{bmatrix} R_1 + R_3 & R_3 \\ R_3 & R_2 + R_3 \end{bmatrix}

is positive definite if R_1, R_2, R_3 are positive

Cholesky factorization 12.7


Solution

Solution from physics

• 𝑥𝑇 𝐴𝑥 = 𝑦𝑇 𝑥 is the power delivered by sources, dissipated by resistors


• power dissipated by the resistors is positive unless both currents are zero

Algebraic solution

x^T A x = (R_1 + R_3) x_1^2 + 2 R_3 x_1 x_2 + (R_2 + R_3) x_2^2
        = R_1 x_1^2 + R_2 x_2^2 + R_3 (x_1 + x_2)^2
        ≥ 0

and x^T A x = 0 only if x_1 = x_2 = 0

Cholesky factorization 12.8


Gram matrix

recall the definition of Gram matrix of a matrix 𝐵 (page 4.20):

𝐴 = 𝐵𝑇 𝐵

• every Gram matrix is positive semidefinite:

x^T A x = x^T B^T B x = \|Bx\|^2 ≥ 0 for all x

• a Gram matrix is positive definite if

x^T A x = x^T B^T B x = \|Bx\|^2 > 0 for all x ≠ 0

in other words, 𝐵 has linearly independent columns
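a sketch illustrating both cases with NumPy (the example matrices are arbitrary): a Cholesky factorization (page 12.14) of the Gram matrix succeeds exactly when the columns of B are linearly independent:

```python
import numpy as np

B = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])     # independent columns
np.linalg.cholesky(B.T @ B)                             # succeeds: Gram matrix is PD

B2 = np.array([[1.0, 2.0], [1.0, 2.0], [0.0, 0.0]])    # dependent columns
try:
    np.linalg.cholesky(B2.T @ B2)
except np.linalg.LinAlgError:
    print("B2.T @ B2 is positive semidefinite but not positive definite")
```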

Cholesky factorization 12.9


Graph Laplacian

recall definition of node–arc incidence matrix of a directed graph (page 3.29)

 1

 if edge 𝑗 ends at vertex 𝑖
𝐵𝑖 𝑗 = −1 if edge 𝑗 starts at vertex 𝑖

 0 otherwise

assume there are no self-loops and at most one edge between any two vertices

[figure: directed graph with vertices 1, 2, 3, 4 and edges 1: 1→2, 2: 1→4, 3: 2→3, 4: 3→1, 5: 3→4]

B = \begin{bmatrix} -1 & -1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & -1 \\ 0 & 1 & 0 & 0 & 1 \end{bmatrix}

Cholesky factorization 12.10


Graph Laplacian

the positive semidefinite matrix 𝐴 = 𝐵𝐵𝑇 is called the Laplacian of the graph

A_{ij} = \begin{cases} \text{degree of vertex } i & \text{if } i = j \\ -1 & \text{if } i \ne j \text{ and vertices } i \text{ and } j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}

the degree of a vertex is the number of edges incident to it

[figure: the same directed graph as on page 12.10]

A = BB^T = \begin{bmatrix} 3 & -1 & -1 & -1 \\ -1 & 2 & -1 & 0 \\ -1 & -1 & 3 & -1 \\ -1 & 0 & -1 & 2 \end{bmatrix}
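a sketch reproducing this example with NumPy (the edge list is read off the incidence matrix above); the weighted Laplacian of page 12.13 is formed the same way with B diag(w) Bᵀ:

```python
import numpy as np

edges = [(1, 2), (1, 4), (2, 3), (3, 1), (3, 4)]   # edge k: (start, end), 1-based
n = 4
B = np.zeros((n, len(edges)))
for j, (start, end) in enumerate(edges):
    B[start - 1, j] = -1.0                          # edge j starts at this vertex
    B[end - 1, j] = 1.0                             # edge j ends at this vertex

A = B @ B.T
print(A)                                            # matches the Laplacian above

w = np.ones(len(edges))                             # example edge weights
A_w = B @ np.diag(w) @ B.T                          # weighted Laplacian (page 12.13)
```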

Cholesky factorization 12.11


Laplacian quadratic form

recall the interpretation of matrix–vector multiplication with 𝐵𝑇 (page 3.31)

• if y is a vector of node potentials, then B^T y contains the potential differences:

(B^T y)_j = y_k - y_l \quad \text{if edge } j \text{ goes from vertex } l \text{ to } k

• y^T A y = y^T BB^T y is the sum of squared potential differences:

y^T A y = \|B^T y\|^2 = \sum_{\text{edges } i \to j} (y_j - y_i)^2

this is also known as the Dirichlet energy function

Example: for the graph on the previous page

y^T A y = (y_2 - y_1)^2 + (y_4 - y_1)^2 + (y_3 - y_2)^2 + (y_1 - y_3)^2 + (y_4 - y_3)^2

Cholesky factorization 12.12


Weighted graph Laplacian

• we associate a nonnegative weight 𝑤 𝑘 with edge 𝑘


• the weighted graph Laplacian is the matrix 𝐴 = 𝐵 diag(𝑤)𝐵𝑇
(diag(𝑤) is the diagonal matrix with vector 𝑤 on its diagonal)

A_{ij} = \begin{cases} \sum_{k \in \mathcal{N}_i} w_k & \text{if } i = j \text{ (where } \mathcal{N}_i \text{ is the set of edges incident to vertex } i\text{)} \\ -w_k & \text{if } i \ne j \text{ and edge } k \text{ is between vertices } i \text{ and } j \\ 0 & \text{otherwise} \end{cases}



[figure: the same directed graph as on page 12.10]

A = \begin{bmatrix} w_1 + w_2 + w_4 & -w_1 & -w_4 & -w_2 \\ -w_1 & w_1 + w_3 & -w_3 & 0 \\ -w_4 & -w_3 & w_3 + w_4 + w_5 & -w_5 \\ -w_2 & 0 & -w_5 & w_2 + w_5 \end{bmatrix}

this is the conductance matrix of a resistive circuit (𝑤 𝑘 is conductance in branch 𝑘 )


Cholesky factorization 12.13
Outline

• positive definite matrices


• examples
• Cholesky factorization
• complex positive definite matrices
• kernel methods
Cholesky factorization

every positive definite matrix 𝐴 ∈ R𝑛×𝑛 can be factored as

𝐴 = 𝑅𝑇 𝑅

where 𝑅 is upper triangular with positive diagonal elements

• complexity of computing 𝑅 is (1/3)n³ flops


• 𝑅 is called the Cholesky factor of 𝐴
• can be interpreted as “square root” of a positive definite matrix
• gives a practical method for testing positive definiteness
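for instance, a minimal sketch of such a test, assuming NumPy (np.linalg.cholesky raises LinAlgError when the factorization fails):

```python
import numpy as np

def is_positive_definite(A: np.ndarray) -> bool:
    """Return True if the symmetric matrix A is positive definite."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

# the 2x2 example of page 12.3: positive definite exactly when a > 4
print(is_positive_definite(np.array([[9.0, 6.0], [6.0, 5.0]])))   # True  (a = 5)
print(is_positive_definite(np.array([[9.0, 6.0], [6.0, 3.0]])))   # False (a = 3)
```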

Cholesky factorization 12.14


Cholesky factorization algorithm

\begin{bmatrix} A_{11} & A_{1,2:n} \\ A_{2:n,1} & A_{2:n,2:n} \end{bmatrix}
= \begin{bmatrix} R_{11} & 0 \\ R_{1,2:n}^T & R_{2:n,2:n}^T \end{bmatrix} \begin{bmatrix} R_{11} & R_{1,2:n} \\ 0 & R_{2:n,2:n} \end{bmatrix}
= \begin{bmatrix} R_{11}^2 & R_{11} R_{1,2:n} \\ R_{11} R_{1,2:n}^T & R_{1,2:n}^T R_{1,2:n} + R_{2:n,2:n}^T R_{2:n,2:n} \end{bmatrix}

1. compute the first row of R:

R_{11} = \sqrt{A_{11}}, \qquad R_{1,2:n} = \frac{1}{R_{11}} A_{1,2:n}

2. compute the 2,2 block R_{2:n,2:n} from

A_{2:n,2:n} - R_{1,2:n}^T R_{1,2:n} = R_{2:n,2:n}^T R_{2:n,2:n}

this is a Cholesky factorization of order n − 1
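a minimal NumPy sketch of this recursive algorithm, unrolled into a loop over the rows of R (for illustration only; assumes A is symmetric positive definite, no pivot checks):

```python
import numpy as np

def cholesky_factor(A: np.ndarray) -> np.ndarray:
    A = np.array(A, dtype=float)      # work on a copy
    n = A.shape[0]
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.sqrt(A[k, k])                          # step 1: R_11
        R[k, k+1:] = A[k, k+1:] / R[k, k]                   # step 1: R_{1,2:n}
        A[k+1:, k+1:] -= np.outer(R[k, k+1:], R[k, k+1:])   # step 2: new subproblem
    return R

A = np.array([[25.0, 15, -5], [15, 18, 0], [-5, 0, 11]])
R = cholesky_factor(A)                # matches the worked example that follows
assert np.allclose(R.T @ R, A)
```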

Cholesky factorization 12.15


Discussion

the algorithm works for positive definite 𝐴 of size 𝑛 × 𝑛

• step 1: if 𝐴 is positive definite then 𝐴11 > 0


• step 2: if 𝐴 is positive definite, then

A_{2:n,2:n} - R_{1,2:n}^T R_{1,2:n} = A_{2:n,2:n} - \frac{1}{A_{11}} A_{2:n,1} A_{2:n,1}^T

is positive definite (see page 12.5)


• hence the algorithm works for 𝑛 = 𝑚 if it works for 𝑛 = 𝑚 − 1
• it obviously works for 𝑛 = 1; therefore it works for all 𝑛

Cholesky factorization 12.16


Example

 25 15 −5   𝑅11 0 0   𝑅11 𝑅12 𝑅13 


 15 18 0  0  0
     
=  𝑅12 𝑅22  𝑅22 𝑅23 
 −5 0 11   0 0
    
 𝑅13 𝑅23 𝑅33  𝑅33 
     

 5 0 0   5 3 −1 
 3 3 0  0 3 1
  
= 
 −1 1 3   0 0 3 
  
 

Cholesky factorization 12.17


Example

 25 15 −5   𝑅11 0 0   𝑅11 𝑅12 𝑅13 


 15 18 0  =  𝑅12 0  0
     
𝑅22  𝑅22 𝑅23 
 −5 0 11   𝑅13  0 0
   
   𝑅23 𝑅33 
  𝑅33 

• first row of 𝑅
 25 15 −5   5 0 0   5 3 −1 
 15 18 0  =  3 0  0
     
𝑅22  𝑅22 𝑅23 
 −5 0 11   −1  0 0
   
   𝑅23 𝑅33 
  𝑅33 

• second row of 𝑅

18 0 3  0
      
𝑅22 𝑅22 𝑅23
3 −1 =


0 11 −1 𝑅23 𝑅33 0 𝑅33

9 3 3 0 3 1
    
=
3 10 1 𝑅33 0 𝑅33
2 , i.e., 𝑅 = 3
• third column of 𝑅 : 10 − 1 = 𝑅33 33

Cholesky factorization 12.18


Solving equations with positive definite 𝐴

solve 𝐴𝑥 = 𝑏 with 𝐴 a positive definite 𝑛 × 𝑛 matrix

Algorithm

• factor 𝐴 as 𝐴 = 𝑅𝑇 𝑅
• solve 𝑅𝑇 𝑅𝑥 = 𝑏
– solve 𝑅𝑇 𝑦 = 𝑏 by forward substitution
– solve 𝑅𝑥 = 𝑦 by back substitution

Complexity: (1/3)n³ + 2n² ∼ (1/3)n³ flops

• factorization: (1/3)n³ flops
• forward and backward substitution: 2n² flops
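a sketch of these steps with SciPy's triangular solvers (the matrix is the one factored on page 12.17; the right-hand side b is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

A = np.array([[25.0, 15, -5], [15, 18, 0], [-5, 0, 11]])
b = np.array([1.0, 2.0, 3.0])

R = cholesky(A)                         # factor: A = R^T R (R upper triangular)
y = solve_triangular(R, b, trans='T')   # forward substitution: R^T y = b
x = solve_triangular(R, y)              # back substitution:    R x = y
assert np.allclose(A @ x, b)
```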

Cholesky factorization 12.19


Cholesky factorization of Gram matrix

• suppose 𝐵 is an 𝑚 × 𝑛 matrix with linearly independent columns


• the Gram matrix 𝐴 = 𝐵𝑇 𝐵 is positive definite (page 4.20)

two methods for computing the Cholesky factor of 𝐴, given 𝐵

1. compute 𝐴 = 𝐵𝑇 𝐵, then Cholesky factorization of 𝐴

𝐴 = 𝑅𝑇 𝑅

2. compute QR factorization 𝐵 = 𝑄𝑅 ; since

𝐴 = 𝐵𝑇 𝐵 = 𝑅𝑇 𝑄𝑇 𝑄𝑅 = 𝑅𝑇 𝑅

the matrix 𝑅 is the Cholesky factor of 𝐴
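a sketch comparing the two methods on the example of the next page (NumPy; the signs of the QR factor's diagonal are normalized to match the Cholesky convention of a positive diagonal):

```python
import numpy as np

B = np.array([[3.0, -6], [4, -8], [0, 1]])

R1 = np.linalg.cholesky(B.T @ B).T           # method 1 (transpose of the lower factor)
Q, R2 = np.linalg.qr(B)                      # method 2
R2 = np.sign(np.diag(R2))[:, None] * R2      # flip rows with a negative diagonal entry
print(np.allclose(R1, R2))                   # True
```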

Cholesky factorization 12.20


Example

 3 −6 
25 −50
 
𝐵 =  4 −8  ,
 
𝐴 = 𝐵𝑇 𝐵 =
−50 101

 0 1 

1. Cholesky factorization:

5 0 5 −10
  
𝐴=
−10 1 0 1

2. QR factorization

 3 −6   3/5 0  
 5 −10

𝐵 =  4 −8  =  4/5 0 
  
 0 1   0 1  0 1
   

Cholesky factorization 12.21


Comparison of the two methods

Numerical stability: QR factorization method is more stable

• see the example on page 8.16


• QR method computes 𝑅 without “squaring” 𝐵 (i.e., forming 𝐵𝑇 𝐵)
• this is important when the columns of 𝐵 are “almost” linearly dependent

Complexity

• method 1: cost of symmetric product 𝐵𝑇 𝐵 plus Cholesky factorization

mn² + (1/3)n³ flops

• method 2: 2𝑚𝑛2 flops for QR factorization


• method 1 is faster but only by a factor of at most two (if 𝑚 ≫ 𝑛)

Cholesky factorization 12.22


Sparse positive definite matrices

Cholesky factorization of dense matrices

• (1/3)n³ flops
• on a standard computer: a few seconds or less, for 𝑛 up to several 1000

Cholesky factorization of sparse matrices

• if 𝐴 is very sparse, 𝑅 is often (but not always) sparse


• if 𝑅 is sparse, the cost of the factorization is much less than (1/3)n³
• exact cost depends on 𝑛, number of nonzero elements, sparsity pattern
• very large sets of equations can be solved by exploiting sparsity

Cholesky factorization 12.23


Sparse Cholesky factorization

if 𝐴 is sparse and positive definite, it is usually factored as

𝐴 = 𝑃𝑅𝑇 𝑅𝑃𝑇

𝑃 a permutation matrix; 𝑅 upper triangular with positive diagonal elements

Interpretation: we permute the rows and columns of 𝐴 and factor

𝑃𝑇 𝐴𝑃 = 𝑅𝑇 𝑅

• the choice of permutation greatly affects the sparsity of 𝑅 (see the sketch below)


• there exist several heuristic methods for choosing a good permutation
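the effect of the ordering can be seen on a small dense example; the following NumPy sketch factors an "arrow" matrix with a dense first row and column, then the reversed ordering, and counts nonzeros in the factors:

```python
import numpy as np

n = 6
A = 10.0 * np.eye(n)
A[0, :] = A[:, 0] = 1.0
A[0, 0] = 10.0                                     # diagonally dominant, hence PD

R = np.linalg.cholesky(A).T                        # factor of A: completely filled in
p = np.arange(n)[::-1]                             # reverse ordering: dense row/col last
Rp = np.linalg.cholesky(A[np.ix_(p, p)]).T         # factor of P^T A P: no fill-in
print(np.count_nonzero(R), np.count_nonzero(Rp))   # 21 versus 11 nonzeros
```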

Cholesky factorization 12.24


Example

[figure: sparsity pattern of 𝐴 and its Cholesky factor; pattern of P^T AP and its Cholesky factor]

Cholesky factorization 12.25


Solving sparse positive definite equations

solve 𝐴𝑥 = 𝑏 with 𝐴 a sparse positive definite matrix

Algorithm

1. compute sparse Cholesky factorization 𝐴 = 𝑃𝑅𝑇 𝑅𝑃𝑇


2. permute right-hand side: 𝑐 := 𝑃𝑇 𝑏
3. solve 𝑅𝑇 𝑦 = 𝑐 by forward substitution
4. solve 𝑅𝑧 = 𝑦 by back substitution
5. permute solution: 𝑥 := 𝑃𝑧
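a sketch of the five steps (NumPy/SciPy; a dense factorization stands in for a sparse Cholesky routine, and p is an arbitrary example permutation stored as an index vector):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)                 # an arbitrary positive definite matrix
b = rng.standard_normal(6)
p = np.arange(6)[::-1]                      # example permutation

R = cholesky(A[np.ix_(p, p)])               # 1. P^T A P = R^T R
c = b[p]                                    # 2. c := P^T b
y = solve_triangular(R, c, trans='T')       # 3. forward substitution: R^T y = c
z = solve_triangular(R, y)                  # 4. back substitution:    R z = y
x = np.empty_like(z); x[p] = z              # 5. x := P z
assert np.allclose(A @ x, b)
```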

Cholesky factorization 12.26


Outline

• positive definite matrices


• examples
• Cholesky factorization
• complex positive definite matrices
• kernel methods
Quadratic form

suppose A is n × n and Hermitian (A_{ij} = \bar{A}_{ji})

x^H A x = \sum_{i=1}^n \sum_{j=1}^n A_{ij} \bar{x}_i x_j
        = \sum_{i=1}^n A_{ii} |x_i|^2 + \sum_{i>j} \left( A_{ij} \bar{x}_i x_j + \bar{A}_{ij} x_i \bar{x}_j \right)
        = \sum_{i=1}^n A_{ii} |x_i|^2 + 2 \sum_{i>j} \operatorname{Re}\left( A_{ij} \bar{x}_i x_j \right)

note that x^H A x is real for all x ∈ C^n

Cholesky factorization 12.27


Complex positive definite matrices

• a Hermitian 𝑛 × 𝑛 matrix 𝐴 is positive semidefinite if

𝑥 𝐻 𝐴𝑥 ≥ 0 for all 𝑥 ∈ C𝑛

• a Hermitian 𝑛 × 𝑛 matrix 𝐴 is positive definite if

𝑥 𝐻 𝐴𝑥 > 0 for all nonzero 𝑥 ∈ C𝑛

Cholesky factorization

every positive definite matrix 𝐴 ∈ C𝑛×𝑛 can be factored as

A = R^H R

where 𝑅 is upper triangular with positive real diagonal elements
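NumPy's Cholesky routine also covers the complex Hermitian case; a small sketch (the example matrix is arbitrary; NumPy returns the lower factor L with A = LLᴴ):

```python
import numpy as np

A = np.array([[4.0, 1 + 2j], [1 - 2j, 6.0]])   # Hermitian positive definite
L = np.linalg.cholesky(A)
R = L.conj().T                                  # upper factor with A = R^H R
assert np.allclose(R.conj().T @ R, A)
```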

Cholesky factorization 12.28


Outline

• positive definite matrices


• examples
• Cholesky factorization
• complex positive definite matrices
• kernel methods
Regularized least squares model fitting

• we revisit the data fitting problem with linear-in-parameters model (page 9.9)

f̂(x) = θ_1 f_1(x) + θ_2 f_2(x) + · · · + θ_p f_p(x) = θ^T F(x)

• F(x) = (f_1(x), . . . , f_p(x)) is the p-vector of basis functions

Regularized least squares model fitting (page 10.7)

\text{minimize} \quad \sum_{k=1}^N \left( \theta^T F(x^{(k)}) - y^{(k)} \right)^2 + \lambda \sum_{j=1}^p \theta_j^2

• (𝑥 (1) , 𝑦 (1) ) , . . . , (𝑥 (𝑁) , 𝑦 (𝑁) ) are 𝑁 examples


• to simplify notation, we add regularization for all coefficients 𝜃 1, . . . , 𝜃 𝑝
• the next discussion can be modified to handle f_1(x) = 1, with regularization \sum_{j=2}^p \theta_j^2

Cholesky factorization 12.29


Regularized least squares problem in matrix notation

minimize ∥ 𝐴𝜃 − 𝑏∥ 2 + 𝜆∥𝜃 ∥ 2

• 𝐴 has size 𝑁 × 𝑝 (number of examples × number of basis functions)

A = \begin{bmatrix} F(x^{(1)})^T \\ F(x^{(2)})^T \\ \vdots \\ F(x^{(N)})^T \end{bmatrix}
  = \begin{bmatrix} f_1(x^{(1)}) & f_2(x^{(1)}) & \cdots & f_p(x^{(1)}) \\ f_1(x^{(2)}) & f_2(x^{(2)}) & \cdots & f_p(x^{(2)}) \\ \vdots & \vdots & & \vdots \\ f_1(x^{(N)}) & f_2(x^{(N)}) & \cdots & f_p(x^{(N)}) \end{bmatrix}

• 𝑏 is the 𝑁 -vector 𝑏 = (𝑦 (1) , . . . , 𝑦 (𝑁) )


• we discuss methods for problems with 𝑁 ≪ 𝑝 ( 𝐴 is very wide)
• the equivalent “stacked” least squares problem (p.10.3) has size ( 𝑝 + 𝑁) × 𝑝
• QR factorization method may be too expensive when 𝑁 ≪ 𝑝

Cholesky factorization 12.30


Solution of regularized LS problem

from the normal equations:

θ̂ = (A^T A + λI)^{-1} A^T b = A^T (AA^T + λI)^{-1} b

• the second expression follows from the “push-through” identity

(A^T A + λI)^{-1} A^T = A^T (AA^T + λI)^{-1}

this is easily proved by writing it as A^T (AA^T + λI) = (A^T A + λI) A^T


• from the second expression for θ̂ and the definition of A,

f̂(x) = θ̂^T F(x) = w^T A F(x) = \sum_{i=1}^N w_i F(x^{(i)})^T F(x)

where w = (AA^T + λI)^{-1} b
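the push-through identity is easy to verify numerically; a sketch with NumPy, using arbitrary random data with N ≪ p (a very wide A):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, lam = 5, 50, 0.1
A = rng.standard_normal((N, p))
b = rng.standard_normal(N)

theta1 = np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ b)   # p x p system
theta2 = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(N), b)   # N x N system
assert np.allclose(theta1, theta2)
```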

Cholesky factorization 12.31


Algorithm

1. compute the 𝑁 × 𝑁 matrix 𝑄 = 𝐴𝐴𝑇 , which has elements

Q_{ij} = F(x^{(i)})^T F(x^{(j)}), \qquad i, j = 1, . . . , N

2. use a Cholesky factorization to solve the equation

(𝑄 + 𝜆𝐼)𝑤 = 𝑏

Remarks

• θ̂ = A^T w is not needed; w is sufficient to evaluate the function f̂(x):

f̂(x) = \sum_{i=1}^N w_i F(x^{(i)})^T F(x)

• complexity: (1/3)N³ flops for the factorization, plus the cost of computing Q

Cholesky factorization 12.32


Example: multivariate polynomials

f̂(x) is a polynomial of degree d (or less) in n variables x = (x_1, . . . , x_n)

• f̂(x) is a linear combination of all possible monomials

x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}

where k_1, . . . , k_n are nonnegative integers with k_1 + k_2 + · · · + k_n ≤ d

• the number of different monomials is

\binom{n+d}{n} = \frac{(n+d)!}{n! \, d!}

Example: for n = 2, d = 3 there are ten monomials

1, x_1, x_2, x_1^2, x_1 x_2, x_2^2, x_1^3, x_1^2 x_2, x_1 x_2^2, x_2^3

Cholesky factorization 12.33


Multinomial formula

(x_0 + x_1 + · · · + x_n)^d = \sum_{k_0 + k_1 + \cdots + k_n = d} \frac{d!}{k_0! \, k_1! \cdots k_n!} \, x_0^{k_0} x_1^{k_1} \cdots x_n^{k_n}

the sum is over all nonnegative integers k_0, k_1, . . . , k_n with sum d

• setting x_0 = 1 gives

(1 + x_1 + x_2 + · · · + x_n)^d = \sum_{k_1 + \cdots + k_n \le d} c_{k_1 k_2 \cdots k_n} \, x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}

• the sum includes all monomials of degree d or less in the variables x_1, . . . , x_n
• the coefficient c_{k_1 k_2 \cdots k_n} is defined as

c_{k_1 k_2 \cdots k_n} = \frac{d!}{k_0! \, k_1! \, k_2! \cdots k_n!} \qquad \text{with } k_0 = d - k_1 - \cdots - k_n

Cholesky factorization 12.34


Vector of monomials

write a polynomial of degree d or less, with variables x ∈ R^n, as

f̂(x) = θ^T F(x)

• F(x) is the vector of basis functions

\sqrt{c_{k_1 \cdots k_n}} \, x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n} \qquad \text{for all } k_1 + k_2 + \cdots + k_n \le d

• the length of F(x) is p = (n + d)!/(n! d!)

• the multinomial formula gives a simple formula for the inner products F(u)^T F(v):

F(u)^T F(v) = \sum_{k_1 + \cdots + k_n \le d} c_{k_1 k_2 \cdots k_n} (u_1^{k_1} \cdots u_n^{k_n})(v_1^{k_1} \cdots v_n^{k_n}) = (1 + u_1 v_1 + \cdots + u_n v_n)^d

• only 2n + 1 flops are needed for this inner product of vectors of length p = (n + d)!/(n! d!)
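the identity can be checked directly; this sketch (NumPy and itertools) enumerates the exponents, builds F from the coefficients c_{k_1···k_n} defined above, and compares the inner product with the kernel value for n = 2, d = 3:

```python
import numpy as np
from itertools import product
from math import factorial, prod

def F(x, d=3):
    n = len(x)
    entries = []
    for k in product(range(d + 1), repeat=n):        # candidate exponents k_1, ..., k_n
        if sum(k) <= d:
            k0 = d - sum(k)
            c = factorial(d) / (factorial(k0) * prod(factorial(ki) for ki in k))
            entries.append(np.sqrt(c) * prod(xi**ki for xi, ki in zip(x, k)))
    return np.array(entries)

u, v = np.array([0.3, -1.2]), np.array([0.5, 2.0])
assert np.isclose(F(u) @ F(v), (1 + u @ v) ** 3)
print(len(F(u)))                                     # 10 monomials, as in the example below
```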

Cholesky factorization 12.35


Example

vector of monomials of degree d = 3 or less in n = 2 variables:

F(u)^T F(v) = \begin{bmatrix} 1 \\ \sqrt{3}\, u_1 \\ \sqrt{3}\, u_2 \\ \sqrt{3}\, u_1^2 \\ \sqrt{6}\, u_1 u_2 \\ \sqrt{3}\, u_2^2 \\ u_1^3 \\ \sqrt{3}\, u_1^2 u_2 \\ \sqrt{3}\, u_1 u_2^2 \\ u_2^3 \end{bmatrix}^T \begin{bmatrix} 1 \\ \sqrt{3}\, v_1 \\ \sqrt{3}\, v_2 \\ \sqrt{3}\, v_1^2 \\ \sqrt{6}\, v_1 v_2 \\ \sqrt{3}\, v_2^2 \\ v_1^3 \\ \sqrt{3}\, v_1^2 v_2 \\ \sqrt{3}\, v_1 v_2^2 \\ v_2^3 \end{bmatrix} = (1 + u_1 v_1 + u_2 v_2)^3

Cholesky factorization 12.36


Least squares fitting of multivariate polynomials

fit polynomial of 𝑛 variables, degree ≤ 𝑑 , to points (𝑥 (1) , 𝑦 (1) ) , . . . , (𝑥 (𝑁) , 𝑦 (𝑁) )

Algorithm (see page 12.32)


1. compute the 𝑁 × 𝑁 matrix 𝑄 with elements

Q_{ij} = K(x^{(i)}, x^{(j)}) \qquad \text{where } K(u, v) = (1 + u^T v)^d

2. use a Cholesky factorization to solve the equation (𝑄 + 𝜆𝐼)𝑤 = 𝑏

• the fitted polynomial is

f̂(x) = \sum_{i=1}^N w_i K(x^{(i)}, x) = \sum_{i=1}^N w_i (1 + (x^{(i)})^T x)^d

• complexity: nN² flops for computing Q, plus (1/3)N³ for the factorization, i.e.,

nN² + (1/3)N³ flops
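a sketch of this algorithm on synthetic data (NumPy/SciPy; the target function, the sizes N, n, d, and λ below are arbitrary illustrative choices):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
N, n, d, lam = 200, 2, 3, 1e-3
X = rng.standard_normal((N, n))                       # rows are the x^(i)
y = X[:, 0] ** 3 - 2 * X[:, 0] * X[:, 1] + 1.0        # an exact degree-3 polynomial

Q = (1 + X @ X.T) ** d                                # Q_ij = K(x^(i), x^(j))
w = cho_solve(cho_factor(Q + lam * np.eye(N)), y)     # solve (Q + lam*I) w = b

def fhat(x):
    return w @ (1 + X @ x) ** d                       # sum_i w_i (1 + (x^(i))^T x)^d

x0 = np.array([0.5, -0.7])
print(fhat(x0), x0[0] ** 3 - 2 * x0[0] * x0[1] + 1.0)   # nearly equal
```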
Cholesky factorization 12.37
Kernel methods

Kernel function: a generalized inner product 𝐾 (𝑢, 𝑣)

• 𝐾 (𝑢, 𝑣) is inner product of vectors of basis functions 𝐹 (𝑢) and 𝐹 (𝑣)


• 𝐹 (𝑢) may be infinite-dimensional
• kernel methods work with 𝐾 (𝑢, 𝑣) directly, do not require 𝐹 (𝑢)

Examples

• the polynomial kernel function 𝐾 (𝑢, 𝑣) = (1 + 𝑢𝑇 𝑣) 𝑑


• the Gaussian radial basis function kernel

K(u, v) = \exp\left( -\frac{\|u - v\|^2}{2\sigma^2} \right)

• kernels exist for computing with graphs, texts, strings of symbols, . . .
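for instance, the Gaussian kernel above takes one line (a sketch; σ is a user-chosen width parameter):

```python
import numpy as np

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian radial basis function kernel K(u, v)."""
    return np.exp(-np.linalg.norm(u - v) ** 2 / (2 * sigma ** 2))

print(rbf_kernel(np.zeros(2), np.zeros(2)))   # 1.0, since K(u, u) = 1
```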

Cholesky factorization 12.38


Example: handwritten digit classification
we apply the method of page 12.37 to least squares classification

• training set is 10000 images from MNIST data set (≈ 1000 examples per digit)
• vector 𝑥 is the vector of pixel intensities (size n = 28² = 784)
• we use the polynomial kernel with degree 𝑑 = 3:

𝐾 (𝑢, 𝑣) = (1 + 𝑢𝑇 𝑣) 3

hence 𝐹 (𝑧) has length 𝑝 = (𝑛 + 𝑑)!/(𝑛! 𝑑!) = 80931145


• we calculate ten Boolean classifiers

f̂_k(x) = sign(f̃_k(x)), \qquad k = 1, . . . , 10

f̂_k(x) distinguishes digit k − 1 (outcome +1) from the other digits (outcome −1)
• the Boolean classifiers are combined in the multi-class classifier

f̂(x) = \underset{k=1,\ldots,10}{\operatorname{argmax}} \, f̃_k(x)

Cholesky factorization 12.39


Least squares Boolean classifier

Algorithm: compute Boolean classifier for digit 𝑘 − 1 versus the rest


1. compute the N × N matrix Q with elements

Q_{ij} = (1 + (x^{(i)})^T x^{(j)})^d, \qquad i, j = 1, . . . , N

2. define the N-vector b = (y^{(1)}, . . . , y^{(N)}) with elements

y^{(i)} = \begin{cases} +1 & \text{if } x^{(i)} \text{ is an example of digit } k - 1 \\ -1 & \text{otherwise} \end{cases}

3. solve the equation (Q + λI)w = b

the solution w gives the Boolean classifier for digit k − 1 versus the rest:

f̃_k(x) = \sum_{i=1}^N w_i (1 + (x^{(i)})^T x)^d

Cholesky factorization 12.40


Complexity

• the matrix 𝑄 is the same for each of the ten Boolean classifiers
• hence, only the right-hand side of the equation

(Q + λI)w = b

is different for each Boolean classifier

Complexity

• constructing Q requires N²/2 inner products of length n: nN² flops

• Cholesky factorization of Q + λI: (1/3)N³ flops
• solving (Q + λI)w = b for the 10 right-hand sides: 20N² flops
• total is (1/3)N³ + nN² flops (see the sketch below)
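a sketch of this shared-factorization scheme (NumPy/SciPy; the arrays X and labels below are random stand-ins for the MNIST training data, not the real data set):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
N, n, d, lam = 500, 784, 3, 1e4
X = rng.standard_normal((N, n))                  # stand-in "images"
labels = rng.integers(0, 10, N)                  # stand-in digit labels

Q = (1 + X @ X.T) ** d                           # one kernel matrix ...
factor = cho_factor(Q + lam * np.eye(N))         # ... and one factorization
Y = np.where(labels[:, None] == np.arange(10), 1.0, -1.0)   # ten +/-1 target vectors
W = cho_solve(factor, Y)                         # ten solves, 2N^2 flops each

def classify(x):
    scores = W.T @ (1 + X @ x) ** d              # the ten values f~_k(x)
    return int(np.argmax(scores))                # predicted digit k - 1
```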

Cholesky factorization 12.41


Classification error

[figure: classification error (%) on the training set and the test set versus λ, for λ from 10⁻² to 10⁶]

percentage of misclassified digits versus λ

Cholesky factorization 12.42


Confusion matrix

Predicted digit
Digit 0 1 2 3 4 5 6 7 8 9 Total

0 965 1 0 0 0 1 8 2 3 0 980
1 0 1127 2 1 1 0 2 1 1 0 1135
2 6 2 988 4 1 1 5 16 8 1 1032
3 0 0 7 973 0 12 0 8 6 4 1010
4 1 3 0 0 957 0 3 1 3 14 982
5 3 0 0 5 0 874 5 2 2 1 892
6 9 4 0 0 5 2 937 0 1 0 958
7 0 13 13 1 5 0 0 987 2 7 1028
8 3 1 3 11 4 4 3 5 934 6 974
9 3 4 2 7 13 3 1 6 4 966 1009
All 990 1155 1015 1002 986 897 964 1028 964 999 10000

• multiclass classifier (λ = 10⁴) on 10000 test examples


• 292 digits are misclassified (2.9% error)

Cholesky factorization 12.43


Examples of misclassified digits

[figure: examples of misclassified digits, arranged by true digit and predicted digit]

Cholesky factorization 12.44


Examples of misclassified digits

[figure: further examples of misclassified digits, arranged by true digit and predicted digit]

Cholesky factorization 12.45
