
MATH 5330: Computational Methods of Linear Algebra

Lecture Note 10: Householder Transformation


Xianyi Zeng
Department of Mathematical Sciences, UTEP

1 The QR Decomposition
Using Givens rotations allows us to write A = QE where Q is orthogonal and E is of the row
echelon form. Note that the lower-triangular part of E is always zero, i.e. eij = 0 if i > j; thus this
decomposition is also known as the QR decomposition, where “R” stands for right-triangular or
upper triangular. Henceforth we will write A = QR instead of A = QE. In the special case when
A is square and non-singular, we immediately see that R = Q−1 A is also non-singular; hence all its
diagonal elements are non-zero. Furthermore, we can compute the inverse of A easily by:

A−1 = R−1 Qt .

The advantage of using the QR decomposition over the LU decomposition to invert a non-singular
matrix is that all its components have an a priori estimate. However, the QR decomposition requires
much more computational cost (∼ 2n3 ) than the Gaussian elimination (∼ 2n3 /3). Robustness does
not come free!
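As a numerical illustration of the inversion formula above, here is a minimal NumPy sketch; it uses NumPy's built-in QR routine rather than any particular construction of Q, and the matrix A is an arbitrary example:

```python
import numpy as np

# A small non-singular test matrix (illustrative values only).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# QR decomposition: Q orthogonal, R right- (upper-) triangular.
Q, R = np.linalg.qr(A)

# A^{-1} = R^{-1} Q^t: solve R X = Q^t for X.
A_inv = np.linalg.solve(R, Q.T)

# Check against the direct inverse.
assert np.allclose(A_inv @ A, np.eye(2))
```

Solving against the triangular factor R is the cheap step here; the a priori estimate mentioned above is on the factors Q and R themselves.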
Givens rotations belong to one of three widely used methods to compute A = QR:

• Gram Schmidt: Since every column of A is a linear combination of the columns of Q, we have
col(A) ⊆ col(Q); thus in the end the QR decomposition can be reduced to orthogonalization
of the column vectors of A. We have already seen in the Arnoldi’s method that, this can be
achieved by the Gram Schmidt process. The major problem of the Gram Schmidt process is
similar to that of Gaussian elimination: The algorithm involves dividing by a number that
cannot be estimated a priori, hence it can break down unexpectedly.

• Givens rotations: This method is more robust than the Gram Schmidt process, and each rotation
involves only two adjacent rows, so it is more bandwidth efficient and reduces cache misses.
A major objection to using Givens rotations is the complexity of implementation; in partic-
ular, it was found that the ordering of the rotations actually matters in practice [1], and
determining the optimal order is a non-trivial problem.

• Householder transformation: This method is robust like the one using Givens rotations, easier
to implement, and requires less computation. However, it is less bandwidth efficient and more
difficult to parallelize than the latter.

At the end of this section, we prove that if A is square and non-singular, the QR decomposition
is unique, if we require in addition that all diagonal entries of R are positive. In fact, if A = Q1 R1 =
Q2 R2 where both Q1 and Q2 are orthogonal and both R1 and R2 are upper-triangular, we then
have:
Qt2 Q1 = R2 R1−1 .

The left-hand side is orthogonal whereas the right-hand side is upper-triangular. It is not difficult
to see that the only upper-triangular matrices that are also orthogonal are diagonal matrices with
diagonal elements ±1. By the assumption that the diagonal elements of both R1 and R2 are positive,
the only possibility is R2 R1−1 = I, or equivalently Q1 = Q2 and R1 = R2 .
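The sign-normalization used in this argument can be carried out explicitly. The NumPy sketch below (with an arbitrary random matrix) rescales the factors by a diagonal sign matrix D, using A = (QD)(DR), so that all diagonal entries of R become positive:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # generic, hence non-singular

Q, R = np.linalg.qr(A)

# D has entries +-1 matching the signs of diag(R); since D^2 = I,
# A = Q R = (Q D)(D R) is again a QR factorization.
D = np.diag(np.sign(np.diag(R)))
Q1, R1 = Q @ D, D @ R

assert np.all(np.diag(R1) > 0)    # normalized: positive diagonal
assert np.allclose(Q1 @ R1, A)    # still a factorization of A
```

By the uniqueness result just proved, (Q1, R1) is the only QR factorization of this A with a positive diagonal.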

2 The Householder Transformation


A major motivation for using Givens rotations to construct the QR decomposition is that rotations
preserve the L2 -norm of vectors. Rotations are, however, not the only operations with this
property: for example, the reflection about any plane through the origin also preserves the L2 -norm
of vectors in Rn . The method utilizing this latter property is built on the Householder transformation.
There are at least two ways to describe a Householder matrix. Algebraically, a Householder
matrix differs from the identity matrix by a rank one matrix as follows:
Hv = I − 2vv t , (2.1)
where v is a unit vector. Furthermore, Hv is symmetric and orthogonal; we check the latter:
Hvt Hv = (I − 2vv t )(I − 2vv t ) = I − 4vv t + 4v(v t v)v t
= I − 4vv t + 4vv t = I .
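A quick numerical sanity check of (2.1) and the computation above (a NumPy sketch with an arbitrary random unit vector v):

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(5)
v /= np.linalg.norm(v)                  # v must be a unit vector

# H_v = I - 2 v v^t, a rank-one modification of the identity.
H = np.eye(5) - 2.0 * np.outer(v, v)

assert np.allclose(H, H.T)              # symmetric
assert np.allclose(H.T @ H, np.eye(5))  # orthogonal
```

Note that symmetry plus orthogonality gives H_v H_v = I, i.e. reflecting twice returns every vector to itself.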
Now let u be any vector, and we compute Hv u:
Hv u = (I − 2vv t )u = u − 2(u · v)v . (2.2)
The second term looks familiar: (u · v)v is the projection of u onto the straight line spanned by
the vector v. Indeed, we see that:
u = (u · v)v + [u − (u · v)v]
is the orthogonal decomposition of u w.r.t. V = span(v):
v · [u − (u · v)v] = v · u − (u · v)||v||2 = 0 .
Thus (2.2) gives the reflection of u w.r.t. V ⊥ , in the sense that the midpoint of the line connecting
u and Hv u is the orthogonal projection of u onto V ⊥ . This is the geometrical description of a
Householder transform, and a 2D example is depicted in Figure 1.

Figure 1: Reflection of u by the line V^⊥ in R^2 .

In 2D, V ⊥ itself is a straight line; in Rn , V ⊥ has dimension n − 1 and it has normal vector v.
We call V ⊥ a hyperplane with unit normal vector v.
As an immediate application of the Householder transform, we can essentially map u to any
other direction by choosing v carefully. In particular, we would like to find v such that Hv u = ||u|| e1 ,
i.e., all the components except the first one are zero. Let α = u · v for now; then we need:

u1 − 2αv1 = ||u|| ;
ui − 2αvi = 0 , i = 2,···,n .

Hence v is given by:

v_1 = (u_1 − ||u||)/(2α) ,    v_i = u_i/(2α) ,    i = 2,···,n .
To proceed, note that v needs to be a unit vector; we thus have:

4α^2 = (u_1 − ||u||)^2 + ∑_{i=2}^{n} u_i^2 = 2||u||^2 − 2u_1 ||u|| .
Choosing 2α = √(2||u||^2 − 2u_1 ||u||), we obtain the desired unit vector v:

v = [ u_1 − ||u|| , u_2 , ··· , u_n ]^t / √(2||u||^2 − 2u_1 ||u||) .    (2.3)

Finally, it is not difficult to check that:

v · u = (u_1^2 − u_1 ||u||)/√(2||u||^2 − 2u_1 ||u||) + ∑_{i=2}^{n} u_i^2 /√(2||u||^2 − 2u_1 ||u||)
      = (||u||^2 − u_1 ||u||)/√(2||u||^2 − 2u_1 ||u||) = 2α^2/(2α) = α .

For convenience, we shall denote this Householder matrix by H(u), i.e., I − 2vv t where v is given
by (2.3).
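The construction (2.3) can be sketched in NumPy as follows. The helper name `householder_vector` is ours; note that the formula breaks down when u is already a positive multiple of e1 (then 2α = 0), a case production codes avoid by choosing the sign of the reflection appropriately:

```python
import numpy as np

def householder_vector(u):
    """Unit vector v of (2.3) such that (I - 2 v v^t) u = ||u|| e_1.

    Assumes u is not already a positive multiple of e_1 (else gamma = 0).
    """
    norm_u = np.linalg.norm(u)
    gamma = np.sqrt(2.0 * norm_u**2 - 2.0 * u[0] * norm_u)  # gamma = 2*alpha
    v = np.array(u, dtype=float)
    v[0] -= norm_u                 # numerator of v_1 is u_1 - ||u||
    return v / gamma

u = np.array([3.0, 4.0, 0.0, 12.0])      # ||u|| = 13
v = householder_vector(u)

# Apply H_v implicitly: H_v u = u - 2 (u . v) v, without forming H_v.
Hu = u - 2.0 * np.dot(u, v) * v
assert np.allclose(Hu, [13.0, 0.0, 0.0, 0.0])
```

As expected, all components of H(u)u except the first vanish, and the first equals ||u||.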

3 The QR Decomposition using Householder Transformations


The idea is to use Householder transformations to construct a sequence of matrices H1 , H2 , ···, Hn
such that Hn Hn−1 ···H1 A is of the row echelon form. Again, the purpose of each Hi is to put the
i-th column of A in the desired form without touching the previous i − 1 columns.
The purpose of H1 is to have the first column of H1 A possess at most one non-zero, which
is the first component. This is clearly achieved by defining H1 as the Householder matrix that
corresponds to (2.3), where u is replaced by the first column of A. Let a_1^{(0)} be the first column of
A^{(0)} := A; following our previous notation we have H1 = H(a_1^{(0)}):
[A]_1 ↦ [H_1 A]_1 is given by:

[ a_{11} , a_{21} , a_{31} , ··· , a_{m1} ]^t ↦ [ √(a_{11}^2 + a_{21}^2 + ··· + a_{m1}^2) , 0 , 0 , ··· , 0 ]^t .
Let A^{(1)} = H_1 A; now we look at the second column. Again, there are two situations: if a_{11}^{(1)} =
√(a_{11}^2 + ··· + a_{m1}^2) = 0, we define H_2 = H(a_2^{(1)}), where a_2^{(1)} is the entire second column of A^{(1)}:

[A^{(1)}]_2 ↦ [H_2 A^{(1)}]_2 is given by:

[ a_{12}^{(1)} , a_{22}^{(1)} , a_{32}^{(1)} , ··· , a_{m2}^{(1)} ]^t ↦ [ √((a_{12}^{(1)})^2 + (a_{22}^{(1)})^2 + ··· + (a_{m2}^{(1)})^2) , 0 , 0 , ··· , 0 ]^t .

If, however, a_{11}^{(1)} is not zero, we do not want to touch the first row of A^{(1)}, and we only need to set
the last (m−2) elements of the second column of A^{(1)} to zero. To this end, we abuse the notation
and denote:

a_2^{(1)} = [ a_{22}^{(1)} , a_{32}^{(1)} , ··· , a_{m2}^{(1)} ]^t

and define:

H_2 = diag( 1 , H(a_2^{(1)}) ) .

It is not difficult to check that H_2 A^{(1)} keeps the first column of A^{(1)} unchanged; furthermore,
[A]_{1:2} ↦ [H_2 H_1 A]_{1:2} maps the first two columns of A to:

[ a_{11}^{(1)} , 0 , 0 , ··· , 0 ]^t   and   [ a_{12}^{(1)} , √((a_{22}^{(1)})^2 + (a_{32}^{(1)})^2 + ··· + (a_{m2}^{(1)})^2) , 0 , ··· , 0 ]^t .

Now suppose we have already computed H_1 , ··· , H_k . In order to construct H_{k+1} , we first identify
the vector a_{k+1}^{(k)} , which is composed of the last m−p elements of the (k+1)-th column of A^{(k)} that
we need to transform, and then define:

H_{k+1} = diag( I_p , H(a_{k+1}^{(k)}) ) .

Here Ip is the identity matrix in Rp×p .
To summarize, the Householder version of QR decomposition is given by Algorithm 3.1.

Algorithm 3.1 Orthogonal Reduction by Householder Transformation

1: Set p = 1 and Q = I_m
2: for i = 1,2,···,n do
3:   Compute s = a_{p+1,i}^2 + ··· + a_{m,i}^2
4:   if s ≠ 0 then
5:     Compute β = √(s + a_{pi}^2)
6:     Compute γ = √(2(β^2 − a_{pi} β))
7:     Compute v = [ (a_{pi} − β)/γ , a_{p+1,i}/γ , ··· , a_{mi}/γ ]^t
8:     A[p : m, i : n] ← H_v A[p : m, i : n]
9:     Q[1 : m, p : m] ← Q[1 : m, p : m] H_v
10:  end if
11:  if a_{pi} ≠ 0 then
12:    Set p ← p + 1
13:  end if
14: end for

Note that line 9 must update all rows of the trailing columns of Q: since Q is a product of full
Householder matrices, its first p−1 rows are in general non-zero in columns p through m.

Here we use the notation M [I, J ] to denote a sub-matrix of M : I is a subset of the row indices
of M and J is a subset of the column indices of M ; M [I, J ] is the |I| × |J | matrix composed
of elements at the intersections of the rows indexed by I and the columns indexed by J .
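Algorithm 3.1 can be sketched in NumPy as below. This is a sketch, not a reference implementation: it uses 0-based indexing, the function and variable names are ours, and the Q update is applied to all rows of its trailing columns, since Q is a product of full Householder matrices:

```python
import numpy as np

def householder_qr(A):
    """Orthogonal reduction A = Q R in the spirit of Algorithm 3.1.

    Returns Q (m x m, orthogonal) and R (m x n, row echelon form).
    """
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    p = 0                                   # 0-based counterpart of p = 1
    for i in range(n):
        s = np.sum(A[p + 1:, i] ** 2)       # s = a_{p+1,i}^2 + ... + a_{m,i}^2
        if s != 0.0:
            beta = np.sqrt(s + A[p, i] ** 2)               # beta = ||A[p:, i]||
            gamma = np.sqrt(2.0 * (beta ** 2 - A[p, i] * beta))
            v = A[p:, i].copy()
            v[0] -= beta
            v /= gamma                                     # unit Householder vector
            # Apply H_v = I - 2 v v^t implicitly to the trailing blocks.
            A[p:, i:] -= 2.0 * np.outer(v, v @ A[p:, i:])
            Q[:, p:] -= 2.0 * np.outer(Q[:, p:] @ v, v)
        if A[p, i] != 0.0:
            p += 1
            if p == m:                      # no rows left to reduce
                break
    return Q, A

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
```

Since β > |a_{pi}| whenever s ≠ 0, the divisor γ is strictly positive and the algorithm cannot break down, matching the robustness claim of Section 1.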

4 Analysis and Other Aspects


First of all, we note that updating the matrices A[p : m, i : n] and Q[1 : m, p : m] in Algorithm 3.1
does not require constructing Hv explicitly. Let us assume for simplicity that Hv ∈ Rm×m and we want
to compute Hv A. Note that:

Hv A = (I − 2vv t )A = A − 2v(v t A) ,

can be accomplished column by column. That is, for each column ai of A, we replace ai by:

ai ← ai − 2(ai · v)v ,

which requires 4m flops and essentially no additional storage. Similarly, computing QHv leads to:

QHv = Q(I − 2vv t ) = Q − 2(Qv)v t ,

and this can be done row by row, with 4m flops for updating each row of Q.
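Both rank-one update formulas can be checked against the explicit product (a NumPy sketch with arbitrary random data):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 4
v = rng.standard_normal(m)
v /= np.linalg.norm(v)                 # unit Householder vector
A = rng.standard_normal((m, n))
Q = rng.standard_normal((m, m))

# Explicit m x m matrix, for reference only.
H = np.eye(m) - 2.0 * np.outer(v, v)

# Implicit updates: rank-one corrections, no m x m matrix is formed.
HA = A - 2.0 * np.outer(v, v @ A)      # H_v A = A - 2 v (v^t A)
QH = Q - 2.0 * np.outer(Q @ v, v)      # Q H_v = Q - 2 (Q v) v^t

assert np.allclose(HA, H @ A)
assert np.allclose(QH, Q @ H)
```

The implicit form costs O(mn) flops instead of the O(m^2 n) of an explicit matrix product, which is exactly the 4m-flops-per-column count above.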
Finally, we look at the complexity of Algorithm 3.1. Note that line 3 is executed in every outer
iteration, unless we smartly terminate the program when p reaches r = min(m,n). In any case,
the computational cost associated with line 3 is bounded by (2m − 1)n ∼ 2mn flops. The execution
of the block from line 4 to line 10, however, leads to an increase in the value of p (after the
transformation, a_{pi} = β > 0, so line 12 is triggered); hence this

block can be run at most r times, with the complexity bounded by 2r square root operations and
the number of flops:
∑_{p=1}^{r} [ 2 + 3 + m − p + 2 + 4(m − p + 1)(n − i_p + 1) ]

where i_p is the value of i for which the if block is executed with this particular value of p. Clearly
i_p ≥ p, and we obtain the estimate:
∑_{p=1}^{r} [ 2 + 3 + m − p + 2 + 4(m − p + 1)(n − i_p + 1) ] ≤ ∑_{p=1}^{r} [ m − p + 7 + 4(m − p + 1)(n − p + 1) ]
∼ 2mr(n − r) + 2nr(m − r) + (2/3) r^3 .
Comparing the Householder transformation with Givens rotations, the former requires only about
two thirds of the computational cost of the latter; however, because each Householder transformation
works on (almost) an entire column of A simultaneously, it is less friendly to parallelization.

References
[1] M. I. Gillespie and D. O. Olesky. Ordering Givens rotations for sparse QR factorization. SIAM
J. Matrix Anal. Appl., 16(3):1024–1041, 1995.
