3 Orthogonal Vectors and Matrices
The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular
value decomposition (SVD), and LU factorization. The first two of these factorizations involve orthogonal
matrices. These matrices play a fundamental role in many numerical methods. This lecture first considers
orthogonal vectors and then defines orthogonal matrices.
3.1 Orthogonal Vectors
A pair of vectors $u, v \in \mathbb{R}^m$ is said to be orthogonal if
\[
(u, v) = 0.
\]
In view of formula (11) in Lecture 1, orthogonal vectors meet at a right angle. The zero vector $0$ is orthogonal
to all vectors, but we are more interested in nonvanishing orthogonal vectors.
A set of vectors $S_n = \{v_j\}_{j=1}^n$ in $\mathbb{R}^m$ is said to be orthonormal if each pair of distinct vectors in $S_n$ is
orthogonal and all vectors in $S_n$ are of unit length, i.e., if
\[
(v_j, v_k) = \begin{cases} 0, & j \neq k, \\ 1, & j = k. \end{cases} \qquad (1)
\]
Here we have used that $(v_k, v_k) = \|v_k\|^2$. It is not difficult to show that orthonormal vectors are linearly
independent; see Exercise 3.1 below. It follows that the $m$ vectors of an orthonormal set $S_m$ in $\mathbb{R}^m$ form a
basis for $\mathbb{R}^m$.
Example 3.1
The set $S_3 = \{e_j\}_{j=1}^3$ in $\mathbb{R}^5$ is orthonormal, where the $e_j$ are axis vectors; cf. (15) of Lecture 1. □
Example 3.2
The set $S_2 = \{v_1, v_2\}$ in $\mathbb{R}^2$, with
\[
v_1 = \frac{1}{\sqrt{2}}[1, 1]^T, \qquad v_2 = \frac{1}{\sqrt{2}}[-1, 1]^T,
\]
is orthonormal. Moreover, the set $S_2$ forms a basis for $\mathbb{R}^2$. □
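Orthonormality is easy to check numerically: collecting the vectors of a set as the columns of a matrix $V$, condition (1) says precisely that $V^T V$ equals the identity. A minimal MATLAB sketch for Example 3.2 (the variable name V is our own):

    V = [1 -1; 1 1]/sqrt(2);   % columns are v1 and v2 of Example 3.2
    disp(V'*V)                 % the 2-by-2 identity matrix, up to rounding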
An arbitrary vector $v \in \mathbb{R}^m$ can be decomposed into orthogonal components. Consider the set $S_n = \{v_j\}_{j=1}^n$ of orthonormal vectors in $\mathbb{R}^m$, and regard the expression
\[
r = v - \sum_{j=1}^{n} (v_j, v)\, v_j. \qquad (2)
\]
The vector $(v_j, v)v_j$ is referred to as the orthogonal component of $v$ in the direction $v_j$. Moreover, the
vector $r$ is orthogonal to the vectors $v_j$. This can be seen by computing the inner products $(v_k, r)$ for all $k$.
We obtain
\[
(v_k, r) = \Big( v_k,\, v - \sum_{j=1}^{n} (v_j, v) v_j \Big) = (v_k, v) - \sum_{j=1}^{n} (v_j, v)(v_k, v_j).
\]
Using (1), the sum in the right-hand side simplifies to
\[
\sum_{j=1}^{n} (v_j, v)(v_k, v_j) = (v_k, v),
\]
which shows that
\[
(v_k, r) = 0.
\]
Thus, $v$ can be expressed as a sum of the orthogonal vectors $r, v_1, \ldots, v_n$,
\[
v = r + \sum_{j=1}^{n} (v_j, v)\, v_j.
\]
Example 3.3
Let $v = [1, 2, 3, 4, 5]^T$ and let the set $S_3$ be the same as in Example 3.1. Then $r = [0, 0, 0, 4, 5]^T$. Clearly, $r$ is
orthogonal to the axis vectors $e_1, e_2, e_3$. □
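The decomposition (2) is immediate to compute. A minimal MATLAB sketch reproducing Example 3.3 (the variable names v, V, and r are our own):

    v = [1; 2; 3; 4; 5];
    V = eye(5); V = V(:, 1:3);   % columns of V are e1, e2, e3, cf. Example 3.1
    r = v - V*(V'*v);            % subtract the orthogonal components (v_j, v) v_j
    disp(r')                     % prints 0 0 0 4 5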
Exercise 3.1
Let $S_n = \{v_j\}_{j=1}^n$ be a set of orthonormal vectors in $\mathbb{R}^m$. Show that the vectors $v_1, v_2, \ldots, v_n$ are linearly
independent. Hint: Assume this is not the case. For instance, assume that $v_1$ is a linear combination of the
vectors $v_2, v_3, \ldots, v_n$, and apply (1). □
3.2 Orthogonal Matrices
A square matrix $Q = [q_1, q_2, \ldots, q_m] \in \mathbb{R}^{m \times m}$ is said to be orthogonal if its columns $\{q_j\}_{j=1}^m$ form an
orthonormal basis of $\mathbb{R}^m$. Since the columns $q_1, q_2, \ldots, q_m$ are linearly independent, cf. Exercise 3.1, the
matrix $Q$ is nonsingular. Thus, $Q$ has an inverse, which we denote by $Q^{-1}$. It follows from the orthonormality
of the columns of $Q$ that
\[
Q^T Q = \begin{bmatrix}
(q_1, q_1) & (q_1, q_2) & \cdots & (q_1, q_m) \\
(q_2, q_1) & (q_2, q_2) & \cdots & (q_2, q_m) \\
\vdots & \vdots & & \vdots \\
(q_m, q_1) & (q_m, q_2) & \cdots & (q_m, q_m)
\end{bmatrix} = I,
\]
where $I$ denotes the identity matrix. Multiplying this identity by the inverse $Q^{-1}$ from the right shows that
\[
Q^T = Q^{-1}.
\]
Thus, the transpose of an orthogonal matrix is its inverse.
Example 3.4
The identity matrix $I$ is orthogonal. □
Example 3.5
The matrix
\[
Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
\]
is orthogonal. Its inverse is its transpose,
\[
Q^{-1} = Q^T = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}.
\]
□
Geometrically, multiplying a vector by an orthogonal matrix reflects the vector in some plane and/or
rotates it, and such a transformation does not change the vector's length. Thus, the norm of a vector $u$ is
invariant under multiplication by an orthogonal matrix $Q$, i.e.,
\[
\|Qu\| = \|u\|. \qquad (3)
\]
This can be seen by using the properties (18) and (16) of Lecture 1. We have
\[
\|Qu\|^2 = (Qu)^T (Qu) = u^T Q^T (Qu) = u^T (Q^T Q) u = u^T u = \|u\|^2.
\]
Taking square roots of the left-hand side and right-hand side, and using that $\|\cdot\|$ is nonnegative, yields
equation (3).
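Both $Q^T Q = I$ and the invariance (3) are easy to confirm numerically. A minimal MATLAB sketch, using the matrix $Q$ of Example 3.5 and an arbitrarily chosen vector u:

    Q = [1 1; -1 1]/sqrt(2);     % the orthogonal matrix of Example 3.5
    u = [3; -4];
    disp(norm(Q'*Q - eye(2)))    % essentially zero, since Q'*Q = I
    disp([norm(Q*u), norm(u)])   % both print 5: Q preserves length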
3.3 Householder Matrices
Matrices of the form
\[
H = I - \rho u u^T \in \mathbb{R}^{m \times m}, \qquad u \neq 0, \qquad \rho = \frac{2}{u^T u}, \qquad (4)
\]
are known as Householder matrices. They are used in numerical methods for least-squares approximation
and eigenvalue computations. We will discuss the former application in the next lecture.
Householder matrices are symmetric, i.e., $H = H^T$, and orthogonal. The latter property follows from
\[
\begin{aligned}
H^T H = H^2 &= (I - \rho u u^T)(I - \rho u u^T) = I - \rho u u^T - \rho u u^T + (\rho u u^T)(\rho u u^T) \\
&= I - \rho u u^T - \rho u u^T + \rho u (\rho u^T u) u^T = I - \rho u u^T - \rho u u^T + 2\rho u u^T = I,
\end{aligned}
\]
where we have used that $\rho u^T u = 2$; cf. (4).
Our interest in Householder matrices stems from the fact that they are orthogonal and that the vector $u$ in
their definition can be chosen so that an arbitrary (but fixed) vector $w \in \mathbb{R}^m \setminus \{0\}$ is mapped by $H$ onto a
multiple of the axis vector $e_1$. We will now show how this can be done. Let $w \neq 0$ be given. We would like
\[
Hw = \sigma e_1 \qquad (5)
\]
for some scalar $\sigma$. It follows from (3) that
\[
\|w\| = \|Hw\| = \|\sigma e_1\| = |\sigma| \|e_1\| = |\sigma|.
\]
Therefore,
\[
\sigma = \pm \|w\|. \qquad (6)
\]
Moreover, using the definition (4) of $H$, we obtain
\[
\sigma e_1 = Hw = (I - \rho u u^T) w = w - \tau u, \qquad \tau = \rho u^T w,
\]
from which it follows that
\[
\tau u = w - \sigma e_1.
\]
The matrix $H$ is independent of the scaling factor $\tau$ in the sense that the entries of $H$ do not change if we
replace $\tau u$ by $u$; indeed, $\rho u u^T = 2 u u^T / (u^T u)$ is unaffected by a rescaling of $u$. We therefore may choose
\[
u = w - \sigma e_1. \qquad (7)
\]
This choice of $u$ and either one of the choices (6) of $\sigma$ give a Householder matrix that satisfies (5).
Nevertheless, in finite precision arithmetic, the choice of sign in (6) may be important. To see this, let
$w = [w_1, w_2, \ldots, w_m]^T$ and write the vector (7) in the form
\[
u = \begin{bmatrix} w_1 - \sigma \\ w_2 \\ w_3 \\ \vdots \\ w_m \end{bmatrix}.
\]
If the components $w_j$ for $j \geq 2$ are of small magnitude compared to $w_1$, then $\|w\| \approx |w_1|$ and, therefore, the
first component of $u$ satisfies
\[
u_1 = w_1 - \sigma = w_1 \pm \|w\| \approx w_1 \pm |w_1|. \qquad (8)
\]
We would like to avoid the situation that $|w_1|$ is large and $|u_1|$ is small, because then $u_1$ is determined with
low relative accuracy; see Exercise 3.3 below. We therefore let
\[
\sigma = -\operatorname{sign}(w_1) \|w\|, \qquad (9)
\]
which yields
\[
u_1 = w_1 + \operatorname{sign}(w_1) \|w\|. \qquad (10)
\]
Thus, $u_1$ is computed by adding numbers of the same sign.
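The choices (7), (9), and (10) translate directly into code. The following MATLAB sketch (the function name house and its interface are our own choice, not a standard library routine) returns the vector u and scalar rho that define $H$:

    function [u, rho] = house(w)
    % HOUSE  Vector u and scalar rho such that H = I - rho*u*u'
    %        maps w onto sigma*e1; cf. (4)-(7).
    s = sign(w(1));
    if s == 0, s = 1; end        % MATLAB's sign(0) is 0; treat w1 = 0 as positive
    sigma = -s*norm(w);          % sign choice (9) avoids cancellation in u(1)
    u = w;
    u(1) = w(1) - sigma;         % u = w - sigma*e1; cf. (7) and (10)
    rho = 2/(u'*u);
    end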
Example 3.6
Let $w = [1, 1, 1, 1]^T$. We are interested in determining the Householder matrix $H$ that maps $w$ onto a
multiple of $e_1$. The parameter $\sigma$ in (5) is chosen such that $|\sigma| = \|w\| = 2$, i.e., $\sigma$ is $2$ or $-2$. The vector $u$
in (4) is given by
\[
u = w - \sigma e_1 = \begin{bmatrix} 1 - \sigma \\ 1 \\ 1 \\ 1 \end{bmatrix}.
\]
We choose $\sigma$ negative, because the first entry of $w$ is positive; cf. (9). The choice $\sigma = -2$ yields
$u = [3, 1, 1, 1]^T$. Finally, we obtain $\rho = 2/\|u\|^2 = 1/6$ and
\[
H = I - \rho u u^T = \begin{bmatrix}
-1/2 & -1/2 & -1/2 & -1/2 \\
-1/2 & 5/6 & -1/6 & -1/6 \\
-1/2 & -1/6 & 5/6 & -1/6 \\
-1/2 & -1/6 & -1/6 & 5/6
\end{bmatrix}.
\]
It is easy to verify that $H$ is orthogonal and maps $w$ to $-2e_1$. □
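With the house sketch above, Example 3.6 can be reproduced in a few lines:

    w = [1; 1; 1; 1];
    [u, rho] = house(w);
    disp(u')                     % prints 3 1 1 1
    H = eye(4) - rho*(u*u');     % forming H explicitly, for illustration only
    disp((H*w)')                 % prints -2 0 0 0, i.e., H*w = -2*e1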
In most applications it is not necessary to explicitly form the Householder matrix $H$. Matrix-vector
products with a vector $v$ are computed by using the definition (4) of $H$, i.e.,
\[
Hv = (I - \rho u u^T) v = v - (\rho u^T v) u.
\]
The left-hand side is computed by first evaluating the scalar $\tau = \rho u^T v$ and then computing the vector
scaling and addition $v - \tau u$. This way of evaluating $Hv$ requires fewer arithmetic floating-point operations
than straightforward computation of the matrix-vector product; see Exercise 3.6. Moreover, the entries of $H$
do not have to be stored, only the vector $u$ and the scalar $\rho$. The savings in arithmetic operations and storage
are important for large problems.
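In MATLAB, with u and rho as returned by the house sketch above, the implicit product reads (the vector v is an arbitrary example of our own):

    v = [4; 3; 2; 1];
    tau = rho*(u'*v);            % the scalar tau = rho*u'*v
    Hv  = v - tau*u;             % the product H*v, without ever forming H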
Exercise 3.2
Let $w = [1, 2, 3]^T$. Determine the Householder matrix that maps $w$ to a multiple of $e_1$. Only the vector $u$ in
(4) has to be computed. □
Exercise 3.3
This exercise illustrates the importance of the choice of the sign of $\sigma$ in (6). Let $w = [1, 0.5 \cdot 10^{-8}]^T$ and let
$u$ be the vector in the definition of Householder matrices (4), chosen such that $Hw = \sigma e_1$. MATLAB yields
>> w=[1; 5e-9]
w =
    1.0000
    0.0000
>> sigma=norm(w)
sigma =
     1
>> u1=w(1)+sigma
u1 =
     2
>> u1=w(1)-sigma
u1 =
     0
where u1 denotes the computed approximation of the first component, $u_1$, of the vector $u$. How large
are the absolute and relative errors in these computed approximations of $u_1$? □
Exercise 3.4
Show that the product $U_1 U_2$ of two orthogonal matrices is an orthogonal matrix. Is the product of $k > 2$
orthogonal matrices an orthogonal matrix? □
Exercise 3.5
Let $Q$ be an orthogonal matrix, i.e., $Q^T Q = I$. Show that $Q Q^T = I$. □
Exercise 3.6
What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with an $n \times n$
Householder matrix $H$ when the representation (4) of $H$ is used? Only $u$ and $\rho$ are stored, not the entries
of $H$. What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with
$H$ when the entries of $H$ (but not $u$ and $\rho$) are available? The correct order of magnitude of the arithmetic
work for large $n$ suffices. □
Exercise 3.7
Let the nonvanishing $m$-vectors $w_1$ and $w_2$ be given. Determine an orthogonal matrix $\widetilde{H}$ such that
$w_2 = \widetilde{H} w_1$. Hint: Use Householder matrices. □