MATH 2101 Linear Algebra I
Chapter 7
Orthogonality
Department of Mathematics, HKU
Chapter 7: Orthogonality
A Vector geometry
B Famous results in vector form
C Orthogonal and orthonormal sets
D The Gram-Schmidt process
E Orthogonal complement
F Orthogonal projection
G Applications of orthogonal projections
A1. A glance at R^2
Given two vectors u = (u1, u2) and v = (v1, v2) in R^2, how do we determine
  the length of each vector?
  the distance between the two vectors?
  whether the two vectors are perpendicular?
[Figure: two sketches in the xy-plane: one shows a vector u with components x1 and x2; the other shows u, v and the difference u − v, with a 90° angle marked.]
A2. Extending the notions to R^n
The notions on the previous page can be naturally generalised to R^3 with the same geometric meanings.
While it is much harder to attach geometric meaning in higher dimensions, we can still extend the algebraic definitions to R^n. Let u, v be vectors in R^n with entries u1, u2, ..., un and v1, v2, ..., vn.
(a) The norm (or length) of v is defined by √(v1^2 + v2^2 + · · · + vn^2) = √(v · v) and denoted by ‖v‖.
(b) The distance between u and v in R^n is defined by ‖u − v‖ and denoted by d(u, v).
(c) We say that u and v are orthogonal (or perpendicular) if u · v = 0, where u · v = u1 v1 + u2 v2 + · · · + un vn as before (in Chapter 2 we noted that it is a particular entry of a matrix product).
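These definitions translate directly into NumPy. The snippet below is an illustrative sketch, not part of the original notes; the vectors u and v are arbitrary examples.

    import numpy as np

    u = np.array([1.0, 2.0, 2.0])
    v = np.array([2.0, -2.0, 1.0])

    norm_u = np.sqrt(u @ u)               # ‖u‖ = √(u·u); same value as np.linalg.norm(u)
    dist = np.linalg.norm(u - v)          # distance between u and v, ‖u − v‖
    perpendicular = np.isclose(u @ v, 0)  # u·v = 2 − 4 + 2 = 0, so True here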
A3. Properties of dot product and norm
The dot product and the norm satisfy many nice properties. For u, v, w ∈ R^n and any scalar c, we have
(a) u · u = ‖u‖^2
(b) u · u ≥ 0, with equality if and only if u = 0
(c) u · v = v · u
(d) u · (v + w) = u · v + u · w
(e) (cu) · v = c (u · v) = u · (cv)
(f) ‖cu‖ = |c| ‖u‖
A4. Implications of the properties
The properties of the dot product and the norm enable us to carry out various
algebraic manipulations:
(a) We can expand and simplify expressions of the following forms:
    (au + bv) · (cu + dv) = ac ‖u‖^2 + (ad + bc)(u · v) + bd ‖v‖^2
    ‖au + bv‖^2 = a^2 ‖u‖^2 + 2ab (u · v) + b^2 ‖v‖^2
(b) For any non-zero vector v, consider u = (1/‖v‖) v.
Then ‖u‖ = 1, and u is called a unit vector. This process is known as normalising the vector v; it produces a unit vector in the same direction as v.
A5. Projection
How do we find the distance from the point (2, 5) to the line y = 2x?
[Figure: the line y = 2x with direction vector (1, 2), the point (2, 5), and its orthogonal projection k(1, 2) on the line; the required distance is the length of (2, 5) − k(1, 2).]
In general, the orthogonal projection of u on a non-zero vector v is given by
    w = ((u · v)/‖v‖^2) v.
[Figure: u, v and the projection w of u on v.]
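As an illustration (not part of the original notes), the formula answers the opening question of A5 in two lines; the helper name proj and the sample vectors are our own.

    import numpy as np

    def proj(u, v):
        """Orthogonal projection of u on a non-zero vector v: ((u·v)/‖v‖^2) v."""
        return (u @ v) / (v @ v) * v

    # Distance from the point (2, 5) to the line y = 2x, whose direction vector is (1, 2):
    u = np.array([2.0, 5.0])
    v = np.array([1.0, 2.0])
    w = proj(u, v)                     # (12/5, 24/5), the foot of the perpendicular
    print(np.linalg.norm(u - w))       # ≈ 0.4472, i.e. 1/√5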
B1. Pythagoras’ Theorem
We are all familiar with Pythagoras’ Theorem. We may formulate it in vector form
as follows:
[Figure: a right-angled triangle with legs a, b and hypotenuse c, alongside the vector triangle formed by u, v and u + v.]
    c^2 = a^2 + b^2                ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2
How about the converse of the theorem?
The same result holds not only in R^2 but also in R^n. We have
Pythagoras’ Theorem (and its converse)
Two vectors u and v in R^n are orthogonal if and only if
    ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.
Proof:
We have ‖u + v‖^2 = (u + v) · (u + v) = ‖u‖^2 + 2(u · v) + ‖v‖^2, which equals ‖u‖^2 + ‖v‖^2 if and only if u · v = 0.
The theorem can also be generalised to more than two vectors. For instance, if u,
v and w are orthogonal to each other, then we have
    ‖u + v + w‖^2 = ‖u‖^2 + ‖v‖^2 + ‖w‖^2.
[Figure: mutually orthogonal vectors u, v, w and their sum u + v + w.]
B2. Cauchy-Schwarz inequality
The Cauchy-Schwarz inequality states that for real numbers u1, u2, ..., un and v1, v2, ..., vn, we have
    (u1 v1 + u2 v2 + · · · + un vn)^2 ≤ (u1^2 + u2^2 + · · · + un^2)(v1^2 + v2^2 + · · · + vn^2).
Equality holds if and only if ui vj = uj vi for all i, j. In vector form, we have
Cauchy-Schwarz inequality
For any u, v ∈ R^n, we have |u · v| ≤ ‖u‖ ‖v‖.
In vector form, equality holds if and only if one of u and v is a scalar multiple of the other.
Proof:
The case where u = 0 or v = 0 is trivial. Suppose ‖u‖ = a > 0 and ‖v‖ = b > 0. Then
    0 ≤ ‖b u − a v‖^2 = b^2 ‖u‖^2 − 2ab (u · v) + a^2 ‖v‖^2 = 2a^2 b^2 − 2ab (u · v),
so u · v ≤ ab. Applying the same argument to ‖b u + a v‖^2 gives u · v ≥ −ab. Hence |u · v| ≤ ab = ‖u‖ ‖v‖.
B3. Triangle inequality
We are all familiar with the triangle inequality. We may formulate it in vector form:
[Figure: a triangle ABC; the vector triangle formed by u, v and u + v; and a path of vectors x, y, z ending at x + y + z.]
    AB + BC ≥ AC        ‖u‖ + ‖v‖ ≥ ‖u + v‖        ‖x‖ + ‖y‖ + ‖z‖ ≥ ‖x + y + z‖
More generally, we have
Triangle inequality
For any v1, v2, ..., vk ∈ R^n, we have
    ‖v1‖ + ‖v2‖ + · · · + ‖vk‖ ≥ ‖v1 + v2 + · · · + vk‖.
Proof:
(LHS)^2 = (‖v1‖ + · · · + ‖vk‖)^2 = Σi ‖vi‖^2 + 2 Σ_{i<j} ‖vi‖ ‖vj‖
(RHS)^2 = ‖v1 + · · · + vk‖^2 = Σi ‖vi‖^2 + 2 Σ_{i<j} vi · vj
By the Cauchy-Schwarz inequality, vi · vj ≤ ‖vi‖ ‖vj‖ for each pair i < j, so (LHS)^2 ≥ (RHS)^2. Since both sides are non-negative, LHS ≥ RHS.
C1. Definitions
Let S be a subset of R^n.
(a) S is said to be an orthogonal set if any two vectors in S are orthogonal.
(b) Furthermore, if every vector in S has unit length (i.e. norm 1), then S is said
to be an orthonormal set.
Examples:
                           Orthogonal set?   Orthonormal set?
  {e1, e2, . . . , en}          Yes               Yes
  {e1, 2e2, . . . , nen}        Yes               No (e.g. ‖2e2‖ = 2)
C2. Normalising an orthogonal set
Suppose {v1 , . . . , vk } is an orthogonal set of non-zero vectors.
By normalising each vector, we get an orthonormal set. Why? Each normalised vector (1/‖vi‖) vi has norm 1, and scaling does not destroy orthogonality: ((1/‖vi‖) vi) · ((1/‖vj‖) vj) = (vi · vj)/(‖vi‖ ‖vj‖) = 0 for i ≠ j.
C3. Orthogonal and orthonormal basis
Let v1 = (1, 1, 1, −1), v2 = (1, 0, 1, 2), v3 = (−1, 0, 1, 0), v4 = (−1, 3, −1, 1) and x = (1, 2, 3, 4), all regarded as column vectors.
Check that B = {v1, v2, v3, v4} is an orthogonal basis for R^4.
Can you express the vector x as a linear combination of the vectors in B?
By setting w1 = (1/2) v1, w2 = (1/√6) v2, w3 = (1/√2) v3 and w4 = (1/√12) v4, it is easy to see that B′ = {w1, w2, w3, w4} is an orthonormal basis for R^4.
Can you express the vector x as a linear combination of the vectors in B′?
In general, suppose B = {v1, v2, ..., vk} is an orthogonal basis for a subspace V of R^n. Then for any v ∈ V, we have
    v = ((v · v1)/‖v1‖^2) v1 + ((v · v2)/‖v2‖^2) v2 + · · · + ((v · vk)/‖vk‖^2) vk.
Furthermore, if the basis B is orthonormal, the above expression can be simplified to
    v = (v · v1) v1 + (v · v2) v2 + · · · + (v · vk) vk.
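A quick NumPy check of these formulas on the example above (an illustration only; the expected coefficients are worked out by hand):

    import numpy as np

    B = [np.array([1, 1, 1, -1]), np.array([1, 0, 1, 2]),
         np.array([-1, 0, 1, 0]), np.array([-1, 3, -1, 1])]
    x = np.array([1, 2, 3, 4])

    coeffs = [(x @ v) / (v @ v) for v in B]          # (x·vi)/‖vi‖^2 for the orthogonal basis B
    x_rebuilt = sum(c * v for c, v in zip(coeffs, B))
    print(coeffs)                                    # expected [0.5, 2.0, 1.0, 0.5]
    print(np.allclose(x_rebuilt, x))                 # True: the coefficients reassemble x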
C4. Linear independence and orthogonality
A linearly independent set may not be an orthogonal set. (Example: {(1, 0), (1, 1)} in R^2 is linearly independent but not orthogonal.)
However, we have the following:
Every orthogonal set of non-zero vectors is linearly independent.
Proof:
Let {v1, v2, ..., vk} be an orthogonal set of non-zero vectors. Suppose
    c1 v1 + c2 v2 + · · · + ck vk = 0.
Taking the dot product of both sides with vi kills every term except the i-th and gives ci ‖vi‖^2 = 0; since vi ≠ 0, this forces ci = 0. As this holds for every i, the set is linearly independent.
D1. Overview
In the previous part, we have seen that having an orthonormal basis for a subspace
enables us to express vectors in the subspace as a linear combination of the basis
vectors easily.
Does every non-zero subspace have an orthonormal basis? The answer is affirmative. The Gram-Schmidt process can be used to turn any given basis into an orthogonal basis, which can then be normalised into an orthonormal basis.
    Arbitrary basis --(Gram-Schmidt process)--> Orthogonal basis --(Normalisation)--> Orthonormal basis
D2. The algorithm
Suppose {u1, u2, ..., uk} is a basis for a subspace W of R^n. The Gram-Schmidt process turns this basis into an orthogonal basis {v1, v2, ..., vk} by the formulas
    v1 = u1
    vi = ui − ((ui · v1)/‖v1‖^2) v1 − ((ui · v2)/‖v2‖^2) v2 − · · · − ((ui · v_{i-1})/‖v_{i-1}‖^2) v_{i-1}    for 2 ≤ i ≤ k
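The formulas transcribe directly into NumPy. The sketch below is illustrative (it assumes the input vectors are linearly independent and makes no attempt at numerical robustness):

    import numpy as np

    def gram_schmidt(us):
        """Turn a list of linearly independent vectors into an orthogonal list."""
        vs = []
        for u in us:
            v = u.astype(float)
            for w in vs:
                v = v - (u @ w) / (w @ w) * w   # subtract the projection of u on each earlier w
            vs.append(v)
        return vs

    def orthonormalise(us):
        """Gram-Schmidt followed by normalisation of each vector."""
        return [v / np.linalg.norm(v) for v in gram_schmidt(us)]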
D3. Example
Suppose u1 = (1, 1, 1, 1), u2 = (2, 1, 0, 1) and u3 = (1, 1, 2, 1).
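Feeding these vectors to the gram_schmidt sketch from D2 should reproduce the orthogonal basis one obtains by hand (illustrative only; it assumes that function is in scope):

    import numpy as np

    u1 = np.array([1, 1, 1, 1])
    u2 = np.array([2, 1, 0, 1])
    u3 = np.array([1, 1, 2, 1])

    v1, v2, v3 = gram_schmidt([u1, u2, u3])   # gram_schmidt as defined in D2
    print(v1, v2, v3)
    # Expected: v1 = (1, 1, 1, 1), v2 = (1, 0, -1, 0), v3 = (1/4, -1/4, 1/4, -1/4)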
D4. Proof of validity
Suppose {u1, u2, ..., uk} is a basis for some subspace. Set v1 = u1 and
    vi = ui − ((ui · v1)/‖v1‖^2) v1 − ((ui · v2)/‖v2‖^2) v2 − · · · − ((ui · v_{i-1})/‖v_{i-1}‖^2) v_{i-1}    for 2 ≤ i ≤ k.
Then for 1 ≤ i ≤ k, the set Vi = {v1, v2, ..., vi} is an orthogonal set of non-zero vectors having the same span as the set Ui = {u1, u2, ..., ui}.
Proof:
The statement is true when i = 1. Suppose 2 ≤ i ≤ k and that the result holds for i − 1, i.e. V_{i-1} is an orthogonal set of non-zero vectors with Span V_{i-1} = Span U_{i-1}.
For each j < i, the orthogonality of V_{i-1} leaves only the j-th term in the expansion: vi · vj = ui · vj − ((ui · vj)/‖vj‖^2)(vj · vj) = 0. Moreover vi ≠ 0, for otherwise ui would lie in Span V_{i-1} = Span U_{i-1}, contradicting the linear independence of Ui. Then Vi is an orthogonal set of non-zero vectors.
Finally, since vi differs from ui by a vector in Span V_{i-1} = Span U_{i-1}, we have Span Vi = Span Ui.
E1. Definition
Let S be a subset of R^n.
The orthogonal complement of S, denoted by S⊥, is the set of vectors in R^n that are orthogonal to every vector in S. In other words,
    S⊥ = {v ∈ R^n : v · u = 0 for every u ∈ S}.
E2. Examples
Find the orthogonal complement of each of the following sets.
(a) S = {(1, 2)}
(b) T = {(1, 2), (3, 4)}
[Figure: the lines x + 2y = 0 and 3x + 4y = 0 in the xy-plane.]
Here S⊥ is the line x + 2y = 0, while T⊥ consists of the vectors satisfying both x + 2y = 0 and 3x + 4y = 0, i.e. T⊥ = {0}.
(c) U = {(1, 2, 3)}
(d) V = {(1, 2, 3), (4, 5, 6)}
[Figure: the planes x + 2y + 3z = 0 and 4x + 5y + 6z = 0 in R^3.]
Here U⊥ is the plane x + 2y + 3z = 0, while V⊥ is the line in which the two planes intersect, namely Span{(1, −2, 1)}.
(e) W = the xy-plane in R^3. Here W⊥ is the z-axis.
E3. Properties
(a) Let S be a subset of R^n. Then S⊥ is a subspace of R^n.
Proof:
0 ∈ S⊥ since 0 · u = 0 for every u ∈ S.
Suppose x, y ∈ S⊥. Hence x + y ∈ S⊥ since (x + y) · u = x · u + y · u = 0 + 0 = 0 for every u ∈ S.
Suppose x ∈ S⊥ and c ∈ R. Hence cx ∈ S⊥ since (cx) · u = c(x · u) = c · 0 = 0 for every u ∈ S.
(b) Let S be a finite subset of R^n. Then S⊥ = (Span S)⊥.
Proof:
(⊆) Suppose x ∈ S⊥. For any v ∈ Span S, write v = c1 u1 + · · · + ck uk with each ui ∈ S; then x · v = c1 (x · u1) + · · · + ck (x · uk) = 0. Hence x ∈ (Span S)⊥.
(⊇) Suppose x ∈ (Span S)⊥. For any v ∈ S, we have x · v = 0 since S ⊆ Span S. Hence x ∈ S⊥.
(c) Let A be a matrix. Then (Row A)⊥ = Null A. (Here we identify the row
vectors in Row A as column vectors in the natural way.)
For example, when A has rows (1, 2, 3) and (4, 5, 6), Null A = Span{(1, −2, 1)}, and every vector in Null A is indeed orthogonal to both rows; compare example (d) in E2.
Proof:
(⊆) Suppose x ∈ (Row A)⊥. Then x is orthogonal to each row of A. Hence every entry of Ax, being the dot product of a row of A with x, equals 0, so Ax = 0 and x ∈ Null A.
(⊇) Suppose x ∈ Null A. In view of (b), it suffices to show that x is orthogonal to each row of A. This holds because each such dot product is an entry of Ax = 0.
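A numerical sanity check of (Row A)⊥ = Null A for the example above (illustrative only; it assumes SciPy is available for its null_space helper):

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1, 2, 3],
                  [4, 5, 6]])
    N = null_space(A)   # columns form an orthonormal basis of Null A; here a multiple of (1, -2, 1)
    print(A @ N)        # ≈ 0: every vector of Null A is orthogonal to each row of A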
F1. Orthogonal decomposition theorem
Orthogonal decomposition theorem
Let W be a subspace of R^n. Then every vector u in R^n can be written in the form u = w + z, where w ∈ W and z ∈ W⊥, in a unique way.
Before proving the theorem, let’s look at its geometric meaning.
[Figure: in R^2, a line W through the origin and a vector u decomposed as w ∈ W plus z ∈ W⊥; in R^3, the same decomposition for a plane W.]
Proof of uniqueness:
Suppose that u = w + z = w′ + z′, where w, w′ ∈ W and z, z′ ∈ W⊥. Then w − w′ = z′ − z lies in both W and W⊥, so it is orthogonal to itself: ‖w − w′‖^2 = (w − w′) · (z′ − z) = 0. Hence w = w′, and consequently z = z′.
Proof of existence:
Take an orthonormal basis {w1, w2, ..., wk} of W. Set
    w = (u · w1) w1 + (u · w2) w2 + · · · + (u · wk) wk    and    z = u − w.
Then we have u = w + z, w ∈ W, and z ∈ W⊥: for each i, z · wi = u · wi − w · wi = u · wi − u · wi = 0, so z is orthogonal to every vector of W = Span{w1, ..., wk}.
F2. Dimension of W⊥
For any subset W of R^n, we have shown in E3 that W⊥ is always a subspace. So it is natural to ask what the dimension of W⊥ is.
Since we also proved in E3 that W⊥ = (Span W)⊥, it suffices to consider the case where W itself is a subspace. Indeed, the orthogonal decomposition theorem leads to the following result:
Let W be a subspace of R^n. Then dim W + dim W⊥ = n.
To establish the result, it suffices to prove the following statement.
Let W be a subspace of R^n. If B = {w1, . . . , wk} is a basis for W and B′ = {z1, . . . , zℓ} is a basis for W⊥, then B ∪ B′ is a basis for R^n.
Proof (B ∪ B′ generates R^n):
Let u ∈ R^n. By the orthogonal decomposition theorem, we may write u = w + z where w ∈ W and z ∈ W⊥. Since w is a linear combination of the vectors in B and z is a linear combination of the vectors in B′, u lies in Span(B ∪ B′).
Proof (B ∪ B′ is linearly independent):
Suppose c1 w1 + · · · + ck wk + d1 z1 + · · · + dℓ zℓ = 0. Writing w = c1 w1 + · · · + ck wk ∈ W and z = d1 z1 + · · · + dℓ zℓ ∈ W⊥, this says w + z = 0 = 0 + 0. By the uniqueness part of the orthogonal decomposition theorem, w = 0 and z = 0, and the linear independence of B and of B′ then forces every ci and every dj to be 0.
F3. Orthogonal projection as a function
Let W be a subspace of R^n. By the orthogonal decomposition theorem, for each u ∈ R^n there exist unique w ∈ W and z ∈ W⊥ such that u = w + z. The vector w is said to be the orthogonal projection of u on W.
Orthogonal projection can thus be thought of as a function UW : R^n → R^n sending u to w.
As an example, consider projecting the point (1, 3, 4) onto the plane W : x − y + 2z = 0.
[Figure: the point (1, 3, 4), the plane W : x − y + 2z = 0, and the foot of the perpendicular from the point to the plane.]
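One way to compute this projection (a sketch, not the method prescribed in the notes) is to subtract from u its component along the normal vector of the plane:

    import numpy as np

    u = np.array([1.0, 3.0, 4.0])
    n = np.array([1.0, -1.0, 2.0])   # normal vector of W : x − y + 2z = 0

    w = u - (u @ n) / (n @ n) * n    # remove the component of u along n; what remains lies in W
    print(w)                         # (0, 4, 2) = UW(u)
    print(u - w)                     # (1, -1, 2), a vector in W⊥ as expected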
F4. Orthogonal projection as a linear transformation
Let W be a subspace of R^n. The orthogonal projection function UW : R^n → R^n is a linear transformation.
Proof:
We prove that UW preserves addition (the case of scalar multiplication is similar).
Let u1, u2 ∈ R^n and suppose UW(u1) = w1 and UW(u2) = w2. Then u1 = w1 + z1 and u2 = w2 + z2 with z1, z2 ∈ W⊥, so u1 + u2 = (w1 + w2) + (z1 + z2), where w1 + w2 ∈ W and z1 + z2 ∈ W⊥. By the uniqueness of the decomposition, UW(u1 + u2) = w1 + w2 = UW(u1) + UW(u2).
Let W be a subspace of R^n. The standard matrix PW of the linear transformation UW : R^n → R^n is given by
    PW = C (C^T C)^{-1} C^T
where C is a matrix whose columns form a basis for W.
Proof:
[Figure: u, its projection w = Cv on W, and the difference u − Cv, which is perpendicular to W.]
Write the projection of u as w = Cv for some vector v. Since u − Cv ∈ W⊥ and each column of C lies in W, we have C^T (u − Cv) = 0, i.e. (C^T C) v = C^T u. As C^T C is invertible (see the next result), v = (C^T C)^{-1} C^T u, and hence w = Cv = C (C^T C)^{-1} C^T u.
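The formula is easy to try numerically. In this sketch (not part of the notes), C holds one possible basis of the plane W : x − y + 2z = 0 from F3, so PW applied to (1, 3, 4) should return (0, 4, 2) again.

    import numpy as np

    def projection_matrix(C):
        """PW = C (C^T C)^{-1} C^T for a matrix C whose columns form a basis for W."""
        return C @ np.linalg.inv(C.T @ C) @ C.T

    C = np.array([[1.0, 0.0],
                  [1.0, 2.0],
                  [0.0, 1.0]])            # columns (1,1,0) and (0,2,1) both satisfy x − y + 2z = 0
    P = projection_matrix(C)
    print(P @ np.array([1.0, 3.0, 4.0]))  # (0, 4, 2)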
Let C be a matrix whose columns are linearly independent. Then C^T C is invertible.
Proof:
Note that C^T C is a square matrix. To show that it is invertible, consider the homogeneous system (C^T C) x = 0. If (C^T C) x = 0, then ‖Cx‖^2 = x^T C^T C x = 0, so Cx = 0. Since the columns of C are linearly independent, x = 0. Hence the system has only the trivial solution, and C^T C is invertible.
F5. Closest vector property
UW (u) is the vector in W that is closest to u.
Proof:
[Figure: u above the subspace W, its projection w = UW(u), another vector w′ in W, and the right-angled triangle formed by u − w and w − w′.]
Let w = UW(u) and let w′ be any vector in W. Since u − w ∈ W⊥ and w − w′ ∈ W, Pythagoras’ Theorem gives
    ‖u − w′‖^2 = ‖(u − w) + (w − w′)‖^2 = ‖u − w‖^2 + ‖w − w′‖^2 ≥ ‖u − w‖^2,
with equality exactly when w′ = w. Hence w is the vector in W closest to u.
G1. Least squares fitting
Given n points (x1, y1), (x2, y2), ..., (xn, yn) in the plane, we often want to find a ‘best fit’ straight line y = a0 + a1 x for the points. The usual ‘best’ criterion is defined in the least squares sense.
[Figure: data points and a fitted line y = a0 + a1 x.]
In other words, we want to find a0 and a1 which minimise the quantity
    E = [y1 − (a0 + a1 x1)]^2 + [y2 − (a0 + a1 x2)]^2 + · · · + [yn − (a0 + a1 xn)]^2.
Writing A for the n × 2 matrix whose i-th row is (1, xi), a = (a0, a1) and y = (y1, y2, ..., yn), this quantity is ‖y − Aa‖^2. Hence we want to look for the vector in Col A that is closest to y. In view of the closest vector property, we should consider the orthogonal projection of the vector y on Col A.
As an example, we find the best fit straight line for the points (0,0), (1,1) and
(2,3) as follows:
[Figure: the points (0, 0), (1, 1), (2, 3) and the fitted line y = a0 + a1 x.]
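A NumPy version of this computation (an illustrative sketch with our own variable names): solving the normal equations A^T A a = A^T y gives the least squares coefficients.

    import numpy as np

    xs = np.array([0.0, 1.0, 2.0])
    ys = np.array([0.0, 1.0, 3.0])

    A = np.column_stack([np.ones_like(xs), xs])   # i-th row is (1, x_i)
    a0, a1 = np.linalg.solve(A.T @ A, A.T @ ys)   # normal equations
    print(a0, a1)                                 # should print -1/6 and 3/2, i.e. y = -1/6 + (3/2) x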
G2. Inconsistent linear systems
Essentially, the best fit straight line problem amounts to finding approximate solutions to inconsistent linear systems. Ideally we want to solve for a0 and a1 so that the line y = a0 + a1 x passes through all data points, but in most cases such a0 and a1 do not exist. So we turn to finding a0 and a1 which give the ‘best approximation’.
More generally, suppose that the linear system Ax = b is inconsistent. Then the best we can do is to look for a ‘best approximate solution’ z so that Az = b′ is as close to b as possible. This amounts to solving the equation Ax = b′, where b′ is the orthogonal projection of b on Col A.
As an example, we consider the inconsistent system Ax = b where
    A = [  1   1   1 ]                [  1 ]
        [  2   1   4 ]    and    b =  [  7 ]
        [ −1   0  −3 ]                [ −4 ]
        [  3   2   5 ]                [  8 ] .
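As an illustrative check (not part of the notes), we can project b on Col A numerically; the first two columns of A happen to form a basis for Col A, since the third column equals 3·(column 1) − 2·(column 2).

    import numpy as np

    A = np.array([[ 1, 1,  1],
                  [ 2, 1,  4],
                  [-1, 0, -3],
                  [ 3, 2,  5]], dtype=float)
    b = np.array([1.0, 7.0, -4.0, 8.0])

    C = A[:, :2]                                   # basis for Col A
    b_proj = C @ np.linalg.solve(C.T @ C, C.T @ b)
    print(b_proj)                                  # ≈ (5/3, 19/3, -14/3, 8), the vector b' closest to b in Col A
    print(np.linalg.matrix_rank(A))                # 2, so the columns of A are linearly dependent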
G3. Solution of least norm
When a linear system Ax = b has infinitely many solutions (or it is inconsistent but there are infinitely many best approximate solutions to Ax = b′, as in the previous example), it is often useful to select a solution of least norm.
The general form of the solution in this case is x = x0 + z, where x0 is a particular solution to Ax = b and z is any solution to the associated homogeneous system Ax = 0. To find a solution of least norm means looking for a choice of z ∈ Null A so that x = x0 + z is as close to 0 as possible. Hence we should consider the orthogonal projection of x0 on (Null A)⊥ = Row A: this projection is itself a solution, and it is the unique solution of least norm.
We return to the inconsistent system Ax = b of the previous example.
The best approximate solution of least norm works out to x = (8/21, −1/7, 10/7).
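Numerically, the minimum-norm least squares solution is what the pseudoinverse produces; the snippet below is a sketch confirming the value quoted above.

    import numpy as np

    A = np.array([[ 1, 1,  1],
                  [ 2, 1,  4],
                  [-1, 0, -3],
                  [ 3, 2,  5]], dtype=float)
    b = np.array([1.0, 7.0, -4.0, 8.0])

    x = np.linalg.pinv(A) @ b     # least squares solution of smallest norm
    print(x)                      # ≈ (0.381, -0.143, 1.429) = (8/21, -1/7, 10/7)
    # np.linalg.lstsq(A, b, rcond=None)[0] should return the same vector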