
Chapter 3

Linear Transformations



3.1 Linear Transformations
Vector spaces have two operations on them, and the nice
maps (morphisms) between them have to respect both:
Definition 3.1 Suppose V and W are vector spaces over the
field F. A function T : V → W is a linear transformation or a
linear map (or simply linear) if
• T (u + v) = T (u) + T (v), and
• T (λv) = λT (v),

for all u, v ∈ V and for all λ ∈ F.


Note that we only define linear maps between vector spaces
over the same field.



Lemma 3.1 Let V and W be vector spaces over field F.
a) The identity map, id : V → V defined by id(v) = v, is linear.
b) If T : V → W is linear then T (0) = 0 and T (−v) = −T (v).
Proof:
a) is trivial.
b) The first part follows from T (0v) = 0T (v) for any v and
Lemma 2.1 part (a).
The second part follows from Lemma 2.1 part (b). □

We next want a result that saves us doing two proofs, along the lines of the Subspace Test Lemma. . .


Lemma 3.2 (Linearity Test Lemma) A function T : V → W
between vector spaces over the same field F is linear if and
only if
T (λu + v) = λT (u) + T (v)
for all λ ∈ F, and u, v ∈ V .
Proof: If T is linear then T (λu + v) = λT (u) + T (v) follows
easily from the definition.
Conversely suppose T (λu + v) = λT (u) + T (v) for all λ ∈ F,
and u, v ∈ V .
Putting λ = 1 we get T (u + v) = T (u) + T (v).
Putting λ = −1 and u = v gives T (0) = 0.
Putting v = 0 now gives T (λu) = λT (u).
Hence T is linear. □


Example 3.1 Let V = Fp , W = Fq .
Let T : V → W, T (v) = Av where A is any q × p matrix.
Then T is linear since

T (λx + y) = A(λx + y)
= λAx + Ay
= λT (x) + T (y)

As we shall see this is the example which, in effect, includes all linear maps between finite dimensional vector spaces.
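We can check the Linearity Test numerically; here is a minimal Python/NumPy sketch (the matrix and the vectors are arbitrary examples, not anything special):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))      # arbitrary 3 x 2 matrix, so T : R^2 -> R^3
x, y = rng.standard_normal((2, 2))   # two vectors in R^2
lam = 1.7
lhs = A @ (lam * x + y)              # T(lam*x + y)
rhs = lam * (A @ x) + (A @ y)        # lam*T(x) + T(y)
print(np.allclose(lhs, rhs))         # True: the Linearity Test holds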
Example 3.2 Many important geometric transformations of
R2 and R3 (and, if you can imagine it, Rn ) are linear:
e.g. rotations, reflections and projections.
But translations are not linear. Why?
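One quick way to see it: by lemma 3.1(b) every linear map sends 0 to 0, but a non-trivial translation does not. A Python sketch with an arbitrary translation vector t:

import numpy as np

t = np.array([1.0, 2.0])     # fixed non-zero translation vector
T = lambda x: x + t          # translation by t in R^2
print(T(np.zeros(2)))        # [1. 2.], not the zero vector, so T is not linear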



Theorem 3.3 Let V and W be two vector spaces over field F.
The set L(V, W ) of all linear transformations from V to W is a
vector space under the operations

(S + T )(v) = S(v) + T (v), (λS)(v) = λS(v) .

Proof: The first step is to prove that with our definitions S + T and λS are actually linear.
I suggest you try this yourself: it is good practice at using the
definitions carefully.
The rest is an easy extension of the result for F (X) we saw in
chapter 2. □

We can do more with maps than just add them of course . . .



Lemma 3.4 Let T : V → W and S : W → X be linear maps
between vector spaces.
Then S ◦ T : V → X is also linear.
Proof: EXERCISE. □
Lemma 3.5 Let T : V → W be an invertible linear map
between two vector spaces over field F.
Then T −1 : W → V is linear.
Proof: For vectors u and v in W , let x = T −1 (u) and
y = T −1 (v).
Then for any λ ∈ F,

T −1 (λu + v) = T −1 (λT (x) + T (y)) = T −1 ◦ T (λx + y) = λx + y = λT −1 (u) + T −1 (v)

and so T −1 is linear by the Linearity Test Lemma. □
Theorem 3.3 implies that L(V, V ), the set of linear maps from V to itself, is a vector space.
But also. . .
Theorem 3.6 The invertible linear maps in L(V, V ) form a
group under composition.
Proof: Since composition of maps is always associative, we
only need to prove closure and existence of an identity and an
inverse.
Let G be the set of invertible maps in L(V, V ).
Then by lemma 3.4, G is closed under composition.
Lemma 3.1 a) tells us the identity map is linear, and it is
clearly invertible, so is in G.
Finally, by definition every map in G has an inverse, and by
lemma 3.5, that inverse is linear.
Hence G is a group under composition. □
Example 3.3 On a suitable set of functions, differentiation,
definite integration and multiplication by a fixed function are all
linear.
Evaluation can also give rise to linear maps:
Example 3.4 Show that T : P(R) → R2 defined by

T (f ) = \begin{pmatrix} f(1) \\ f(2) \end{pmatrix}

is linear.
SOLUTION: For any f, g ∈ P(R) and λ ∈ R,

T (λf + g) = \begin{pmatrix} (λf + g)(1) \\ (λf + g)(2) \end{pmatrix} = \begin{pmatrix} λf(1) + g(1) \\ λf(2) + g(2) \end{pmatrix} = λ \begin{pmatrix} f(1) \\ f(2) \end{pmatrix} + \begin{pmatrix} g(1) \\ g(2) \end{pmatrix} = λT (f ) + T (g)

so T is linear. □
Another vital linear transformation is taking coordinates:
Lemma 3.7 Let V be a (finite-dimensional) vector space over
F with a basis B = {v1 , . . . , vp }.
Then the function S : V → Fp defined by S(x) = [x]B is linear.
Proof: Let x = \sum_{i=1}^{p} x_i v_i and y = \sum_{i=1}^{p} y_i v_i .
Then for any λ ∈ F,

λx + y = λ \sum_{i=1}^{p} x_i v_i + \sum_{i=1}^{p} y_i v_i = \sum_{i=1}^{p} (λx_i + y_i ) v_i

But this last equation says

[λx + y]B = λ[x]B + [y]B

as required. □
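When V = Fp the coordinates can be computed by solving the linear system whose coefficient matrix has the basis vectors as its columns. A NumPy sketch (the basis is an arbitrary example), which also illustrates the linearity of S:

import numpy as np

B = np.column_stack([[1.0, 0.0], [1.0, 1.0]])  # basis {(1,0), (1,1)} of R^2 as columns
x = np.array([3.0, 2.0])
y = np.array([1.0, 1.0])
lam = 4.0
cx = np.linalg.solve(B, x)                     # [x]_B, since B @ cx = x
cy = np.linalg.solve(B, y)                     # [y]_B
print(cx)                                      # [1. 2.]: x = 1*(1,0) + 2*(1,1)
print(np.allclose(np.linalg.solve(B, lam * x + y), lam * cx + cy))  # True: S is linear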
3.2 Kernel and Image
Definition 3.2 Let T : V → W be a linear transformation.
The kernel (or nullspace) of T is the set

ker T = {v ∈ V : T (v) = 0 } .

If U ≤ V then the image of U is the set

T (U ) = {T (u) : u ∈ U } .

We also define the image of T (or range of T ), im(T ), as the image of all of V : im(T ) = T (V ).
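For the matrix maps of example 3.1 both subspaces can be computed numerically. A sketch assuming SciPy is available (the matrix is an arbitrary rank-one example):

import numpy as np
from scipy.linalg import null_space, orth

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # arbitrary example; second row is twice the first
K = null_space(A)                 # orthonormal basis for ker(A), here 2-dimensional
C = orth(A)                       # orthonormal basis for im(A), here 1-dimensional
print(K.shape[1], C.shape[1])     # 2 1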



Example 3.5 a) In example 3.4, T : P(R) → R2 defined by T (f ) = \begin{pmatrix} f(1) \\ f(2) \end{pmatrix}, it is clear that im(T ) = R2 .
On the other hand, ker(T ) consists of all polynomials that are zero at both 1 and 2, and so

f ∈ ker(T ) if and only if f (t) = (t − 1)(t − 2)p(t)

for some polynomial p. So ker(T ) is all polynomial multiples of (t − 1)(t − 2).
b) Consider differentiation on quadratics, D : P2 (R) → P2 (R).
Then ker(D) is the set (subspace) of constant polynomials, and im(D) the set (subspace) of polynomials of degree at most 1.
That these kernels and images were subspaces in these
examples is no accident. . .
Theorem 3.8 Let T : V → W be a linear transformation
between vector spaces over F and U ≤ V . Then
a) ker T is a subspace of V .
b) T (U ) is a subspace of W , and so im(T ) ≤ W .
c) If U is finite-dimensional, so is T (U ), so if V is finite
dimensional, so is im(T ).
Proof: Part (a) is a simple EXERCISE.
For (b), we firstly note that T (U ) is not empty as U is not.
Let w1 , w2 ∈ T (U ); by definition we have w1 = T (u1 ) and
w2 = T (u2 ) for some u1 , u2 ∈ U . Then for any λ ∈ F

λw1 + w2 = λT (u1 ) + T (u2 ) = T (λu1 + u2 ) ∈ T (U ) ,

so T (U ) is a subspace.
The second part of (b) follows easily.
For (c), if U has a finite spanning set {u1 , u2 , . . . , un }, then for
every vector w in T (U ) there are scalars α1 , α2 , . . . , αn for
which

w = T (u) = T (α1 u1 + α2 u2 + · · · + αn un )
= α1 T (u1 ) + α2 T (u2 ) + · · · + αn T (un )

So {T (u1 ), T (u2 ), . . . , T (un )} is a (finite) spanning set for T (U ).


This shows that if U is finite-dimensional, then T (U ) is
finite-dimensional.
The second part of (c) follows trivially. □


Theorem 3.8 means the following now makes sense:
Definition 3.3 If T is a linear transformation, then the
dimension of the kernel of T is called the nullity of T , and the
dimension of its image is called the rank of T .
Example 3.6 From example 3.5, differentiation on P2 (R) has
nullity 1 and rank 2.
Lemma 3.9 A linear map T : V → W is one-to-one if and only
if nullity(T ) = 0.
Proof: Since T (0) = 0, if T is one-to-one then ker(T ) = {0},
i.e. nullity(T ) = 0.
Conversely if nullity(T ) = 0, i.e. ker(T ) = {0}, suppose
T (x) = T (y).
Then T (x − y) = 0, so x − y = 0 and T is one-to-one. □
Note that this proof is nearly identical to that of lemma 1.8 for
group homomorphisms – not surprisingly!
Theorem 3.10 (Rank-Nullity Theorem) If V is a finite
dimensional vector space over F and T : V → W is linear then

rank(T ) + nullity(T ) = dim(V ).

Proof: Since ker(T ) ≤ V , it is finite dimensional, so with k = nullity(T ), let A = {a1 , . . . , ak } be a basis for ker T ≤ V .
Now extend A to a basis for V (theorem 2.10):

B = {a1 , . . . , ak , b1 , . . . , bs } .

Our result will hold if rank(T ) = dim(im(T )) = s, and the obvious way to prove this is to show that

C = {T (b1 ), . . . , T (bs )}

is a basis for im(T ).



So let w = T (v) ∈ im(T ). Then for some scalars αi and βj we
have
v = α1 a1 + · · · + αk ak + β1 b1 + · · · + βs bs .
Applying T to both sides and using linearity

w = T (v) = β1 T (b1 ) + · · · + βs T (bs ) ,

so that C is a spanning set for im(T ).


Now suppose that

β1 T (b1 ) + · · · + βs T (bs ) = 0 .

Then by linearity the vector β1 b1 + · · · + βs bs is in ker(T ), which is span(A): say β1 b1 + · · · + βs bs = α1 a1 + · · · + αk ak .
But B is a basis for V , so all these coefficients are 0; in particular βj = 0 for all j , and hence C is l.i.
Thus C is a basis for im(T ), completing the proof. □
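For matrix maps the theorem is easy to test numerically; a NumPy/SciPy sketch with an arbitrary example:

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 2.0]])   # arbitrary example; row 3 = row 1 + row 2
rank = np.linalg.matrix_rank(A)        # dim im(T) = 2
nullity = null_space(A).shape[1]       # dim ker(T) = 2
print(rank + nullity == A.shape[1])    # True: rank + nullity = dim V = 4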


Theorem 3.11 Let V , W be vector spaces over F with
dim(V ) = dim(W ) finite and T : V → W be linear. The
following are equivalent:
a) T is invertible (bijective).
b) T is one-to-one (injective) i.e. nullity(T ) = 0.
c) T is onto (surjective) i.e. rank(T ) = dim(V ).
Proof: The definition of a bijection gives that a) implies b).
If T is one-to-one, then by lemma 3.9 and the Rank-Nullity
Theorem, rank(T ) = dim(V ) = dim(W ).
Since im(T ) ≤ W and they are the same dimension,
im(T ) = W and T is onto, so b) implies c).
Finally, if T is onto its rank is dim(W ) = dim(V ) and the
Rank-Nullity theorem implies nullity(T ) = 0, so by lemma 3.9 it
is one-to-one.
Thus T is a bijection and c) implies a), completing the proof. □
Example 3.7 Define the map T : P2 (R) → R3 by

T (p) = \begin{pmatrix} p(1) \\ p(0) \\ p(-1) \end{pmatrix} .
Prove that T is invertible.
SOLUTION: Linearity of T is clear (see example 3.4).
Since dim(P2 (R)) = 3 = dim(R3 ), T will be invertible if it is
either one-to-one or onto: the former is the easier to use.
Suppose T (p) = 0. Then p(1) = p(0) = p(−1) = 0, so p is a
degree 2 polynomial with three distinct roots.
But the only such polynomial is the zero polynomial, hence
ker(T ) is trivial.
Thus by lemma 3.9, T is one-to-one and so by theorem 3.11,
T is invertible. □
Definition 3.4 An invertible linear map T : V → W is called an
isomorphism of the vector spaces V and W .
If there is an isomorphism between V and W we call the two
spaces isomorphic.
Isomorphism is clearly an equivalence relation on vector
spaces:
• Reflexive: The identity map id : V → V is an
isomorphism from V to itself.
• Symmetric: If T : V → W is an isomorphism, then T −1 is
an isomorphism from W to V .
• Transitive: If T : V → W and S : W → X are
isomorphisms, then S ◦ T is an isomorphism from V to X .
I’ll leave the proofs of these as easy EXERCISES.
Example 3.7 shows that P2 (R) and R3 are isomorphic, but there is a simpler way to show this . . .
Theorem 3.12 Finite dimensional vector spaces V and W over F are isomorphic if and only if they have the same dimension.
Proof: Let p = dim(V ) and q = dim(W ).
Suppose that p = q and choose B , a basis for V .
Define f : V → Fp by f (v) = [v]B .
Then f is linear by lemma 3.7 and as ker(f ) is clearly trivial, a
bijection by theorem 3.11.
So V is isomorphic to Fp , and similarly W is isomorphic to Fp .
Thus V and W are isomorphic.
Conversely, suppose T : V → W is an isomorphism.
Then as nullity(T ) = 0 from theorem 3.11, for any basis
B = {vi } of V , the set {T (vi )} is a basis of im(T ) = W from the
proof of the Rank-Nullity theorem.
Thus p = q . □


As a special case of the previous result, note that
If V is a p-dimensional vector space over F then it is
isomorphic to Fp .
We proved this by showing that taking coordinates in some
chosen basis is an isomorphism.
In fact, all isomorphisms to Fp can be described this way:
Why?
We note as a further special case that the vector spaces Fp
and Mp,1 (F) and M1,p (F) are all isomorphic.
This is why we can treat vectors of p components as if they
were p × 1 matrices, and why it doesn’t matter (out of context)
whether we regard them as row vectors or column vectors.



3.3 Spaces Associated with Matrices
Let A be a p × q matrix over field F, and define a map
T : Fq → Fp by T (x) = Ax.
The kernel, image, nullity and rank of A are by definition the
same as those of this map T .
Now suppose A has columns c1 , . . . , cq (all in Fp ). Then

im(A) = {Ax : x ∈ Fq }
= {x1 c1 + · · · + xq cq : xi ∈ F}
= span ({c1 , . . . , cq })

That is, im(A) is the space spanned by the columns of A: the column space of A, col(A), a subspace of Fp .
The rank of A is thus the dimension of the column space of A.



As an immediate corollary of the Rank-Nullity Theorem for
maps we have the Rank-Nullity Theorem for matrices:
For A ∈ Mp,q (F), rank(A) + nullity(A) = q , the number of
columns of A.

The row space of A, row(A), is defined similarly as the space spanned by the rows: it is a subspace of Fq .
Note that row(A) = col(AT ) = im(AT ).
Our definition for the rank of a matrix makes it the same as the
dimension of the column space.
This leaves open the question of the dimension of the row
space — which is usually called the row rank.
In fact, the row rank of a matrix is always the same as its rank
(“column rank”), a rather surprising result that is quite fiddly to
prove rigorously. . .



Theorem 3.13 Let A ∈ Mp,q (F). The spaces row(A) and
col(A) have the same dimension.
Proof (outline): Consider row reducing A to echelon form by
always taking multiples of an earlier row from each row.
This is just taking linear combinations of rows and so will not
change the row space.
It follows that the row rank will be the number of non-zero rows
in the echelon form of A: call it r.
Since each such non-zero row has exactly one pivot entry, the
row rank is the number of pivot columns in the row echelon
form, i.e. r.
But if we were solving Ax = 0, reducing to echelon form and
having r pivot columns implies there are q − r free variables in
the back substitution.
Thus nullity(A) = q − r and by the Rank-Nullity Theorem for
matrices, rank(A) = r, the row rank. □
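A quick numerical illustration (the matrix is an arbitrary example):

import numpy as np

rng = np.random.default_rng(2)
A = rng.integers(-2, 3, (4, 6)).astype(float)
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))   # True: rank = row rank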
3.4 The Matrix of a Linear Map
We now go back to the hint we had after example 3.1: for finite
dimensional vector spaces, all linear maps are “really” matrix
multiplication.
Theorem 3.14 Let V, W be two finite dimensional vector
spaces over F. Suppose dim(V ) = q and V has basis B and
also dim(W ) = p and W has basis C .
If T : V → W is linear then there is a unique A ∈ Mp,q (F) with

[T (v)]C = A[v]B . (1)

Conversely, for any A ∈ Mp,q (F), equation (1) defines a unique linear map from V to W .
What this theorem is saying is that once we pick bases in the
domain V and codomain W of T , coordinates in those bases
allow us to identify L(V, W ) with Mp,q (F).
Proof: Suppose B = {v1 , v2 , . . . , vq } and C = {w1 , w2 , . . . , wp }.
Now T (v1 ) ∈ W so T (v1 ) = \sum_{j=1}^{p} a_{j1} w_j , or using coordinates with respect to C :

[T (v1 )]_C = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{p1} \end{pmatrix} , [T (v2 )]_C = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{p2} \end{pmatrix} , . . .

Now let v = \sum_{i=1}^{q} x_i v_i , so that [v]_B = \begin{pmatrix} x_1 \\ \vdots \\ x_q \end{pmatrix} ; then

T (v) = T \left( \sum_{i=1}^{q} x_i v_i \right) = \sum_{i=1}^{q} x_i T (v_i ) .
Since taking coordinates is linear (lemma 3.7), we have

[T (v)]_C = \sum_{i=1}^{q} x_i [T (v_i )]_C = A[v]_B

where A = \begin{pmatrix} a_{11} & \dots & a_{1q} \\ \vdots & \ddots & \vdots \\ a_{p1} & \dots & a_{pq} \end{pmatrix} , proving such an A exists and giving us a method of finding it.
For uniqueness, if there were two such matrices A and B ,
then by uniqueness of coordinates, we would have Ax = Bx
for all x ∈ Fq .
But this means that (A − B)x = 0 for all x ∈ Fq , so that nullity(A − B) = q , thus rank(A − B) = 0 and A − B is the zero matrix.
The converse follows from uniqueness of coordinates. □
Definition 3.5 We call A in the above theorem the matrix of
T with respect to B and C .
A useful notation is to denote this matrix by [T ]^B_C and then equation (1) takes the form

[T (v)]_C = [T ]^B_C [v]_B .

Our theorem tells us that to find the first column of [T ]^B_C we take the first element of B (the basis of the domain V ), calculate T of this vector, and the column is then the coordinate vector of this image with respect to C , the basis of the codomain W .
The other columns of [T ]^B_C are then found by a similar method.

Corollary 3.15 If dim(V ) = q and dim(W ) = p then dim(L(V, W )) = pq .


Example 3.8 Find the matrix of the evaluation map from
example 3.7 (evaluating at 1, 0, −1 in order) with respect to the
standard bases in P2 (R) and R3 .
SOLUTION: Evaluating on the basis {1, t, t2 } gives

T (1) = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , T (t) = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} , T (t2 ) = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}

Fortunately, these image vectors are exactly the same as their coordinates in the standard basis of R3 , so no more work is needed.

The matrix of T in the standard bases is \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & -1 & 1 \end{pmatrix} . □
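The same computation can be done in NumPy (a sketch; note that np.polyval expects coefficients with the highest degree first):

import numpy as np

pts = np.array([1.0, 0.0, -1.0])              # evaluation points, in order
basis = [[1.0], [1.0, 0.0], [1.0, 0.0, 0.0]]  # the polynomials 1, t, t^2
A = np.column_stack([np.polyval(p, pts) for p in basis])
print(A)   # [[ 1.  1.  1.]
           #  [ 1.  0.  0.]
           #  [ 1. -1.  1.]]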


Theorem 3.16 Let T : V → W and S : W → X be linear maps
between vector spaces and suppose V , W and X have bases
A, B and C respectively.
Then the matrix of S ◦ T : V → X is the product of the matrices
of T and S , all taken with respect to the appropriate bases:

[S ◦ T ]^A_C = [S ]^B_C · [T ]^A_B

This theorem is really why we define the matrix product the way we do.
Proof: Let A = {vi }, B = {wj }, C = {xk }; [T ]^A_B = (a_{ij} ), [S ]^B_C = (b_{km} ) and [S ◦ T ]^A_C = (c_{pq} ) so that

T (v_i ) = \sum_j a_{ji} w_j , S(w_j ) = \sum_k b_{kj} x_k , and (S ◦ T )(v_i ) = \sum_p c_{pi} x_p .
Then by linearity

(S ◦ T )(v_i ) = \sum_j a_{ji} S(w_j ) = \sum_j a_{ji} \left( \sum_k b_{kj} x_k \right)

As addition is associative and commutative:

(S ◦ T )(v_i ) = \sum_k \left( \sum_j b_{kj} a_{ji} \right) x_k .

Comparing with the previous definitions, we get

c_{ki} = \sum_j b_{kj} a_{ji}

and this proves the result. □
It is obvious that the identity map on vector space V will have
as matrix the identity matrix if we use the same basis in
domain and codomain.
Putting this idea together with theorem 3.16:
Corollary 3.17 If T : V → W is linear and invertible, the matrix of T −1 is the inverse of the matrix of T .
Thus the group of invertible linear maps on an n-dimensional vector space over F is isomorphic to GL(n, F).
A more formal way of saying the first half of this (and note the position of the labels) is:

[T −1 ]^C_B = \left( [T ]^B_C \right)^{-1} .


Example 3.9 Repeat example 3.8 but using B = {3 + 2t, 1 + 2t2 , 1 − t − t2 } and

C = \left\{ \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} , \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} , \begin{pmatrix} 1 \\ -1 \\ -1 \end{pmatrix} \right\}

in domain and codomain respectively.
SOLUTION: Evaluating on the elements of B gives

T (3 + 2t) = \begin{pmatrix} 5 \\ 3 \\ 1 \end{pmatrix} , T (1 + 2t2 ) = \begin{pmatrix} 3 \\ 1 \\ 3 \end{pmatrix} , T (1 − t − t2 ) = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} .

Our problem now is to write these three vectors as linear combinations of the elements of C .
This is just solving three sets of three linear equations in three unknowns: simple but tedious.
However, we note that the equations we need to solve will all have the same matrix of coefficients: only the right hand sides will change.
So we can do all three with one big row reduction:

\left( \begin{array}{ccc|ccc} 1 & 2 & 1 & 5 & 3 & -1 \\ 1 & 1 & -1 & 3 & 1 & 1 \\ -1 & 1 & -1 & 1 & 3 & 1 \end{array} \right) ∼ \left( \begin{array}{ccc|ccc} 1 & 2 & 1 & 5 & 3 & -1 \\ 0 & 1 & 2 & 2 & 2 & -2 \\ 0 & 0 & 1 & 0 & 0 & -1 \end{array} \right)

Back-substitution for each column on the right will give us the coordinate vectors.
But we could also do this by going to reduced echelon form:

\left( \begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & 2 & 2 & 0 \\ 0 & 0 & 1 & 0 & 0 & -1 \end{array} \right)


From the reduced echelon form we can read off the
coordinates: they are simply the columns to the right of the
bar.
You can and should check this mentally, of course.
Thus the matrix is

[T ]^B_C = \begin{pmatrix} 1 & -1 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix} . □

You may recall that going to reduced echelon form is the same
as multiplying the columns to the right of the bar by the
inverse of the coefficient matrix (where that exists):

Ax = b ⇔ x = A−1 b so (A | b) ⇔ (I | A−1 b) .
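Indeed, NumPy will solve all three systems in one call; a sketch confirming the matrix we just found:

import numpy as np

C = np.array([[ 1.0,  2.0,  1.0],
              [ 1.0,  1.0, -1.0],
              [-1.0,  1.0, -1.0]])       # elements of C as columns
images = np.array([[ 5.0,  3.0, -1.0],
                   [ 3.0,  1.0,  1.0],
                   [ 1.0,  3.0,  1.0]])  # T of the elements of B, as columns
M = np.linalg.solve(C, images)           # solves C @ M = images, all columns at once
print(M)   # [[ 1. -1.  0.]
           #  [ 2.  2.  0.]
           #  [ 0.  0. -1.]]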



Indeed, the method we taught you in first year to find the
inverse is exactly this, since you put the identity matrix to the
right of the bar.
This suggests that (at least when the codomain is Fn ) we
ought to use the inverse of the matrix whose columns are the
basis vectors directly instead of messing about with systems
of equations.
In fact, we can always find a suitable short cut to make use of
this observation, at least in spaces where there is some basis
for which coordinates are easy to find.
This short cut relies on the idea in the next example. . .



Example 3.10 Find the matrix of the identity mapping on R3
with the basis C from example 3.9 in the domain and the
standard basis S in the codomain.
I commented earlier that for the identity map, if we use the
same basis in domain and codomain we get the identity
matrix. Here we free ourselves of that restriction.
SOLUTION: We follow the same pattern as before: map the
elements of C and find their coordinates.
But since the identity map does nothing, and vectors in Rn are
their own coordinate vectors in the standard basis of Rn , the
matrix we want just has the elements of C as columns in order:

[id]^C_S = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & -1 \end{pmatrix} . □


Definition 3.6 If vector space V has two bases B and C , the matrix [id]^B_C of the identity map is called the change of basis matrix (from B to C ).
The point here is that a change of basis matrix can be used to change coordinates:

[v]_C = [id]^B_C [v]_B .

Corollary 3.17 tells us that the inverse matrix will change the
coordinates in the opposite direction.



Example 3.11 Find the change of basis matrix from the
standard basis S of P2 (R) to the basis B = {1, 1 + t, 1 + t + t2 }
of P2 (R).
SOLUTION: The change of basis matrix from B to S will be
simple, so we find that and invert.
We have

[1]_S = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} , [1 + t]_S = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} , [1 + t + t2 ]_S = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

So

[id]^B_S = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} and [id]^S_B = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} . □
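A sketch checking this numerically:

import numpy as np

P = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])   # change of basis from B to S; columns are [1]_S, [1+t]_S, [1+t+t^2]_S
print(np.linalg.inv(P))           # [[ 1. -1.  0.]
                                  #  [ 0.  1. -1.]
                                  #  [ 0.  0.  1.]]  -- the change of basis from S to B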


To find the matrix of T : V → W with respect to bases B in the
domain and C in the codomain, we may write T as the
composition of three mappings:
Given any v in V ,
a) calculate idV (v) = v, where idV is the identity map on V ;
b) calculate T (idV (v)) = T (v);
c) calculate idW (T (idV (v))) = T (v), where idW is the identity
map on W .
The point is that if we find coordinates in B for initial vector v
and coordinates in C for the final value of T (v), but switch to
standard bases for all the intermediate steps with the identity
maps, then the matrices of all three functions will be quite
easy to find.
It is usually most helpful to display the triple composition in
diagrammatic form. . .
basis S                 A                 basis S′
  T : V ────────────────────────────────→ W
        ↑ P                               ↑ Q
  T : V ────────────────────────────────→ W
basis B                 M                 basis C

In this commutative diagram:
a) S and S′ are standard bases.
b) The vertical arrows represent identity maps: so P = [id_V ]^B_S and Q = [id_W ]^C_{S′} .
c) The matrix A = [T ]^S_{S′} , which we assume is easy to find.
d) The matrix M = [T ]^B_C and
e) M = Q−1 AP , i.e. [T ]^B_C = [id_W ]^{S′}_C [T ]^S_{S′} [id_V ]^B_S .
Example 3.12 Define T : R3 → R2 by

T (x) = \begin{pmatrix} x_1 + 2x_2 + 3x_3 \\ 4x_1 + 3x_2 + 2x_3 \end{pmatrix} .

Find the matrix of T with respect to bases

B = \left\{ \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} , \begin{pmatrix} 1 \\ 1 \\ -3 \end{pmatrix} , \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \right\} and C = \left\{ \begin{pmatrix} 1 \\ -2 \end{pmatrix} , \begin{pmatrix} -3 \\ 1 \end{pmatrix} \right\}

Answer:

\frac{1}{5} \begin{pmatrix} -4 & 3 & 2 \\ 2 & 11 & -1 \end{pmatrix}
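We can verify this answer with the commutative diagram method, M = Q−1 AP ; a NumPy sketch:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 3.0, 2.0]])      # matrix of T in the standard bases
P = np.array([[ 1.0,  1.0, -1.0],
              [ 0.0,  1.0,  1.0],
              [-1.0, -3.0,  0.0]])   # columns: the basis B
Q = np.array([[ 1.0, -3.0],
              [-2.0,  1.0]])         # columns: the basis C
M = np.linalg.inv(Q) @ A @ P         # M = Q^{-1} A P
print(M * 5)   # [[-4.  3.  2.]
               #  [ 2. 11. -1.]]  -- five times M, matching the stated answer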


We can now clarify rank and nullity for matrices:
Lemma 3.18 Let T : V → W be a linear map between finite
dimensional vector spaces over F and A its matrix with
respect to any two bases in V and W .
Then

nullity(A) = nullity(T ) and rank(A) = rank(T ) .

Proof: Let B be the basis of V .
Then v ∈ ker(T ) iff [v]B ∈ ker(A).
From the proof of theorem 3.12, taking coordinates is an
isomorphism, so nullity(T ) = nullity(A).
That the ranks are equal follows from the two versions of the Rank-Nullity Theorem. □
I will leave you to write down a matrix version of theorem 3.11 using this result.
Definition 3.7 Let V be a vector space over F and T : V → V
a linear map.
If X ≤ V is such that T (X) ≤ X , we call X an invariant
subspace of T .
(We know from theorem 3.8 that T (X) is a subspace of V .)
The trivial subspace is always invariant, as is V itself of
course.
There are cases where they are the only invariant subspaces.
Can you think of an example?
Given an invariant subspace X of map T : V → V , we know
there is a complementary subspace Y so that V = X ⊕ Y .
It is not always the case that Y is invariant (in fact there are
cases where none of the complementary subspaces are
invariant), but when there is one, the matrix of T can be
simplified.



Theorem 3.19 Let T : V → V be a linear map on a finite
dimensional vector space. Suppose V = X ⊕ Y with both X
and Y invariant subspaces of T with dimensions p and q
respectively.
Then there is a basis B for V in which the matrix [T ]^B_B of T is of the form

[T ]^B_B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}

with A a p × p and B a q × q matrix.


Note that we are using the same basis of V in domain and
codomain here.
Proof: Suppose {xi } is any basis of X and {yk } any basis of Y ; then
B = {x1 , . . . , xp , y1 , . . . , yq }
is a basis of V .
Now we must have

T (x_i ) = \sum_j a_{ji} x_j and T (y_k ) = \sum_m b_{mk} y_m

for some scalars a_{ji} and b_{mk} by invariance.
It follows that the matrix [T ]^B_B is of the form required. □

We call matrices of the shape of [T ]^B_B in this theorem the direct sum of A and B and write it as A ⊕ B .


Example 3.13 On R3 , let n be any vector and X be the plane
through the origin normal to n, with point-normal form n · x = 0.
Define T : R3 → R3 to be orthogonal projection onto X , which
we know is linear.
Then it is clear geometrically that T (X) = X , and so X is an
invariant subspace.
(In fact it is a fixed subspace, which is a special case of an
invariant subspace.)
Any complementary subspace to X must be a line, but to be
invariant the line must be orthogonal to X , and so project to
zero.
Hence Y = span({n}) is the only complementary invariant
subspace.
To find the matrix of T , suppose a and b are two linearly independent vectors orthogonal to n, which will form a basis for X .



The action of T on the basis B = {a, b, n} of R3 is then

T (a) = a, T (b) = b, T (n) = 0 .

So the matrix of T with respect to B is

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} = I_2 ⊕ 0 . □
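A numerical sketch of this example (the normal vector n is an arbitrary choice; a and b are built with cross products, so they are automatically non-zero, independent and orthogonal to n):

import numpy as np

n = np.array([1.0, 2.0, 2.0])                 # arbitrary non-zero normal vector
P = np.eye(3) - np.outer(n, n) / n.dot(n)     # orthogonal projection onto the plane n . x = 0
a = np.cross(n, [1.0, 0.0, 0.0])              # two independent vectors in the plane
b = np.cross(n, a)
B = np.column_stack([a, b, n])                # the basis {a, b, n} of R^3
M = np.linalg.inv(B) @ P @ B                  # matrix of T with respect to B
print(np.round(M, 12))                        # diag(1, 1, 0), i.e. I_2 (+) 0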


3.4.1 Isomorphisms Again
Theorem 3.20 Let T : V → W be a linear map between
finite-dimensional spaces, and let A be the matrix of T (with
respect to any bases).
Then T is invertible if and only if A is invertible.
If this is the case, then the matrix of T −1 (with respect to the
same bases) is A−1 .
Proof: The first part follows directly from theorem 3.11, its
matrix version (in particular, that A ∈ Mp,p (F) has rank p iff A is
invertible) and lemma 3.18.
The second part is just corollary 3.17. □


Example 3.14 Let V = {p ∈ P3 (R) : p(0) = 0}, which from
example 2.4 is a vector space, and let D : V → P2 (R) be
differentiation: D(p) = p′ , which we know is linear.
The set B = {t, t2 , t3 } is a basis for V (as is easy to check) and
let us take the exotic basis

C = {1 + 2t, 3 + 4t, 1 + 3t2 }

for P2 (R). We calculate the images

D(t) = 1 , D(t2 ) = 2t , D(t3 ) = 3t2

and their respective coordinate vectors:

\begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} , \begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix} , \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix} .
This gives the matrix of D with respect to B and C as

A = \begin{pmatrix} -2 & 3 & 2 \\ 1 & -1 & -1 \\ 0 & 0 & 1 \end{pmatrix} .

The inverse of D is the map D−1 : P2 (R) → V given by

D−1 (p) = \int_0^t p(s) ds .

Its matrix \begin{pmatrix} 1 & 3 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} is easier to calculate than A since we have a standard basis (more or less) in the codomain.
I will leave you to check that this latter matrix is A−1 . □
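The check is immediate in NumPy:

import numpy as np

A = np.array([[-2.0,  3.0,  2.0],
              [ 1.0, -1.0, -1.0],
              [ 0.0,  0.0,  1.0]])       # matrix of D with respect to B and C
Ainv = np.array([[1.0, 3.0, 1.0],
                 [1.0, 2.0, 0.0],
                 [0.0, 0.0, 1.0]])       # claimed matrix of D^{-1}
print(np.allclose(A @ Ainv, np.eye(3)))  # True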
3.4.2 The Normal Form
Theorem 3.21 Let T : V → W be a linear map between vector
spaces over F where dim V = p, dim W = q and rank(T ) = r.
Then there are bases B of V and C of W with respect to which
the matrix of T takes the form

N_{q,p;r} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} ∈ Mq,p (F)

where Ir is the r × r identity matrix and 0 are suitably sized zero matrices.
Proof: Let {vr+1 , . . . , vp } be a basis of ker(T ) and extend to
give the basis B = {v1 , . . . , vr , vr+1 , . . . , vp } of V .
Then {T (v1 ), . . . , T (vr )} will be a basis of im(T ), as we saw in
the proof of the Rank-Nullity Theorem.
Extend to give the basis C of W . □
Corollary 3.22 Let A ∈ Mq,p (F) have rank r. Then there is a
q × q invertible matrix Q and a p × p invertible matrix P such
that
Q−1 AP = Nq,p;r

Proof: Apply the previous result to the (rank r) map T : Fp → Fq defined by T (x) = Ax.
Then P is the matrix whose columns are the basis of Fp and Q the matrix whose columns are the basis of Fq . □
Definition 3.8 The q × p rank r matrix Nq,p;r in these results is
called the normal form of the map or matrix.
Note that the bases in the theorem and the invertible matrices
in the corollary are not unique.
In fact there is a simpler way of finding the matrices P and Q
in the matrix version using row and column reduction, but we
will not look into that.
3.5 Similarity
For a linear map T : V → W , theorem 3.21 tells us that if we
are free to choose bases in V and W , the matrix of T can
always be taken to be the normal form, Ir ⊕ 0 where
r = rank(T ).
This implies that the only really important thing for such maps
is the rank.
The same is true if V = W — but why would we want to
choose different bases in the domain and codomain versions
of V ?
Surely it is more natural to choose one basis of V and find the
matrix with respect to it in both domain and codomain.
If we do that, then we find some much more interesting
mathematics about linear maps.



So consider a linear transformation T : V → V , where V is
finite-dimensional and let B1 and B2 be two bases for V .
If the matrix of T with respect to B1 is M1 and the matrix with
respect to B2 is M2 , then our standard commutative diagram
implies that
M2 = P −1 M1 P ,
where P is the matrix whose columns are the basis vectors in
B2 , written as coordinate vectors with respect to B1 .
This equation defines an important relation between matrices.
Definition 3.9 Matrices A and B in Mp,p (F) are similar if
there exists a matrix P ∈ GL(p, F) such that B = P −1 AP .
It is easy to show that similarity is an equivalence relation on
square matrices of a given size (EXERCISE).



Theorem 3.23 Matrices A1 and A2 are similar if and only if
they are the matrices of the same linear transformation with
respect to two choices of bases.
Proof: The “if” statement is proved by the remarks at the
beginning of this section.
Conversely, suppose A1 and A2 are similar p × p matrices over
F, with A2 = P −1 A1 P and define T : Fp → Fp by T (x) = A1 x.
Then A1 is the matrix of T with respect to the standard basis
S = {e1 , . . . , ep } for Fp .
Now consider B = {P e1 , . . . , P ep }.
By theorem 2.5(f), this is a linearly independent set, and it has
p elements, so it is a basis for Fp .
Our commutative diagram method shows immediately that the
matrix of T with respect to B is P −1 A1 P = A2 .
Therefore A1 and A2 are the matrices of T with respect to the
bases S and B respectively. □
If I hand you two square matrices, checking whether or not they are similar is obviously a very difficult problem.
A natural way forward is to look for properties shared by
similar matrices:
Definition 3.10 A property of matrices is called a similarity invariant if it is the same for all similar matrices.
One important example of a similarity invariant that you met
previously is the determinant: since det(AB) = det(A) det(B) it
follows that det(P −1 AP ) = det(A).
Similarity invariants can be used to prove matrices are not similar.
For example \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix} and \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} are not similar as they have different determinants.



But the determinant cannot tell us that \begin{pmatrix} 1 & 3 \\ 2 & 7 \end{pmatrix} and \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} are not similar (which they are not: the only matrix similar to the identity is the identity).
In order to use similarity invariants to prove matrices are
similar we need a complete set of similarity invariants: a
set of properties that are always the same for matrices that
are similar and always different for matrices that are not.
In fact, we will find such a set: it is one of the key goals of this
course.
But before we leave this topic, we find three more useful
similarity invariants. . .



Theorem 3.24 The rank, nullity and trace of matrices are all
similarity invariants.
Recall that the trace of a (square) matrix is the sum of its
diagonal elements.
Proof: Let A1 , A2 be similar matrices, say, A1 = P −1 A2 P .
Take a basis B1 = {b1 , . . . , bn } for ker(A1 ) and write
B2 = {P b1 , . . . , P bn }, which is l.i. by theorem 2.5(f).
Now for any w ∈ ker(A2 )

A2 w = 0 ⇒ P A1 P −1 w = 0 ⇒ A1 P −1 w = 0
⇒ P −1 w ∈ ker(A1 ) ⇒ P −1 w = α1 b1 + · · · + αn bn
⇒ w = α1 (P b1 ) + · · · + αn (P bn ) ∈ span(B2 ) .

Also A2 (P bi ) = P A1 bi = 0 , so each P bi ∈ ker(A2 ).
Therefore B2 spans ker(A2 ) and so is a basis for ker(A2 ), and nullity(A1 ) = nullity(A2 ).
The Rank-Nullity Theorem then implies that

rank(A2 ) = rank(A1 ) .

For the trace, note that tr(AB) = tr(BA) for square matrices A
and B of the same size (EXERCISE).
It follows that

tr(A1 ) = tr(P −1 A2 P ) = tr(P P −1 A2 ) = tr(A2 ) .

So trace is a similarity invariant. □


Note from this proof that the kernels are (in general) not similarity invariants: the spans of the sets B1 and B2 in the proof will usually be different, although they have the same dimension.
The same applies to images.
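All of these invariants are easy to illustrate numerically; a sketch (P is built unit upper triangular, so it is certainly invertible):

import numpy as np

rng = np.random.default_rng(3)
A = rng.integers(-3, 4, (4, 4)).astype(float)                # arbitrary example matrix
P = np.triu(rng.integers(-3, 4, (4, 4)).astype(float), 1) + np.eye(4)
B = np.linalg.inv(P) @ A @ P                                 # B is similar to A
print(np.isclose(np.trace(A), np.trace(B)))                  # True
print(np.isclose(np.linalg.det(A), np.linalg.det(B)))        # True
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))  # True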



Our set of known similarity invariants for matrix A is thus

{det(A), tr(A), rank(A), nullity(A)}

This is not a complete set yet; for example

A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} and I2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

have the same determinants (1), traces (2), ranks (2) and
nullities (0), but are not similar.
We know this since the only matrix similar to I2 is I2 .
One more invariant you may remember from earlier courses is the set of eigenvalues: note that A and I2 above have the same eigenvalues (1), so eigenvalues do not give us a complete set either!
We will return to our search for a complete set of similarity
invariants later.
3.6 Multilinear maps
Definition 3.11 Let V1 , V2 and W be vector spaces over field
F. A map T : V1 × V2 → W is bilinear if it is linear in each
argument, that is

T (λv1 + v1′ , v2 ) = λT (v1 , v2 ) + T (v1′ , v2 )
T (v1 , λv2 + v2′ ) = λT (v1 , v2 ) + T (v1 , v2′ )

for all suitable vectors and scalars.
If V2 = V1 we call T bilinear on V1 .
This definition can obviously be extended to trilinear maps (on
V1 × V2 × V3 ) and to general multilinear maps (on a product of
k vector spaces).



Example 3.15 The standard dot product on Rn is a basic
example of a bilinear map on Rn (to R).
You may recall the scalar triple product on R3 :

[u, v, w] = (u × v) · w .

This is a trilinear map on R3 (to R).
In fact, the determinant on Mp,p (F) is multilinear as a map from the product of p copies of Fp (the columns of the matrix) to F.
Definition 3.12 A multilinear map T on V is said to be
symmetric if its value on any ordered set of vectors is
unchanged when any two of the vectors are swapped.
If such a swap always simply changes the sign of the value, T
is called alternating.
The dot product is symmetric; the scalar triple product and the
determinant are alternating.
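Both kinds of behaviour are easy to test numerically for the scalar triple product; a NumPy sketch with random vectors:

import numpy as np

rng = np.random.default_rng(4)
u, v, w, x = rng.standard_normal((4, 3))
lam = 2.5
stp = lambda a, b, c: np.dot(np.cross(a, b), c)       # [a, b, c] = (a x b) . c
print(np.isclose(stp(lam * u + x, v, w),
                 lam * stp(u, v, w) + stp(x, v, w)))  # True: linear in the first slot
print(np.isclose(stp(u, v, w), -stp(v, u, w)))        # True: a swap changes the sign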