MATH 212 - Linear Algebra I: Last Revision: November 22, 2012
solution: x = 2, y = 1
How many tons of ore should be extracted from each mine, to exactly fill an order for a total of 8
oz. gold, 78 oz. silver and 144 oz. copper?
Let x, y, z be the number of tons of raw ore from mines A, B, C, respectively. Then the total
amount of gold produced will be x + 2y + z. So we require that x + 2y + z = 8 if the order for gold
is to be filled exactly. We can write a similar equation for the total quantities of silver and copper
produced. In summary:
x +2y +z = 8
10x +18y +12z = 78
20x +22y +40z = 144
For any given system of linear equations, the fundamental questions are:
• Does a solution exist (is the system consistent)?
• If a solution exists, is it unique?
• How can we find all the solutions?
In the first section of the course we will develop some computational tools and some theory that
we can use to answer these questions.
2 Gauss-Jordan Elimination
Lec. #2
Throughout the algebraic manipulations required to solve the previous example, many of the sym-
bols were redundant—they act only as placeholders. All the essential information can be represented
in the augmented matrix:
1 2 1 8
10 18 12 78
20 22 40 144
The variable names and equals sign are implicit. The essential operations required to solve the
system can be performed by operating on rows of this matrix. This cuts down on the book-keeping.
Permissible operations on the equations become the following permissible elementary row operations
on the augmented matrix:
• interchange two rows;
• multiply a row by a non-zero constant;
• add a multiple of one row to another row.
These operations allow us to modify the form of the equations without changing their solution.
By a systematic application of these operations (the row reduction algorithm) we arrive at the
following:
1 2 1 8
0 1 −1 1
0 0 1 1
This gives z = 1 immediately; x and y can be found by back substitution.
solution: x = 3, y = 2, z = 1
1. Go to the first non-zero column; interchange rows so the top entry (the pivot) is non-zero.
2. Apply row operations to obtain all zeroes below the pivot.
3. Consider the sub-matrix below and to the right of the pivot; repeat steps 1–2 on this subma-
trix. Continue in this way until there are no more pivots available.
4. Scale each row so the leading entry is 1.
5. Go to the right-most pivot; use row operations to obtain all zeroes above the pivot.
6. Move one pivot to the left, and repeat. Continue in this way for all pivots.
After step 3 the augmented matrix is in row echelon form. It has the following properties:
• On each row, the leading entry is to the right of the leading entry in the row above, and has
only zeroes below it.
• Any all-zero rows are at the bottom.
At this stage (if the solution is unique) we can read off one solution, and proceed with back-
substitution. Or we can continue with steps 4–6, after which the matrix is in reduced row echelon
form (RREF). It then has the following additional properties:
• Every leading entry is a 1, and is the only non-zero entry in its column.
From the reduced row echelon form, we can just read off the solution, if it is unique.
Note: there’s more than one way to get to the RREF, so you might think there’s more than one
possible final answer. It turns out that for any matrix, the RREF is unique: everyone will get the
same final answer, no matter what steps they take to get there (we’ll prove this later).
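For readers who like to experiment, here is a minimal computational sketch of the Gauss-Jordan algorithm in Python (the code, and the use of Python at all, are our own additions; the course itself does not assume any software):

    def rref(M, tol=1e-12):
        """Return the reduced row echelon form of M (a list of rows of numbers)."""
        A = [row[:] for row in M]                      # work on a copy
        rows, cols = len(A), len(A[0])
        pivot_row = 0
        for col in range(cols):
            # find a row at or below pivot_row with a non-zero entry in this column
            r = next((i for i in range(pivot_row, rows) if abs(A[i][col]) > tol), None)
            if r is None:
                continue                               # no pivot in this column
            A[pivot_row], A[r] = A[r], A[pivot_row]    # interchange rows
            piv = A[pivot_row][col]
            A[pivot_row] = [x / piv for x in A[pivot_row]]   # scale the pivot to 1
            for i in range(rows):                      # zero out the rest of the column
                if i != pivot_row and abs(A[i][col]) > tol:
                    f = A[i][col]
                    A[i] = [a - f * b for a, b in zip(A[i], A[pivot_row])]
            pivot_row += 1
        return A

    # The augmented matrix of the mining example:
    print(rref([[1, 2, 1, 8], [10, 18, 12, 78], [20, 22, 40, 144]]))
    # the last column gives x = 3, y = 2, z = 1, as above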
Example 2.1. Use row operations to put the following matrix in reduced row echelon form:
0 5 3 −1
1 2 1 0
−3 −1 2 1
solution:
REF:
1 2 1  0
0 5 3 −1
0 0 2  2
RREF:
1 0 0  3/5
0 1 0 −4/5
0 0 1   1
Example 2.2. Use Gauss-Jordan elimination to solve the following linear systems:
(a)  x1 + x2 + x3 = 3            (b)  x1 − 2x2 + 3x3 = 9
     2x1 + 3x2 + x3 = 5               −x1 + 3x2 = −4
     x1 − x2 − 2x3 = −5               2x1 − 5x2 + 5x3 = 17
There is no point of intersection, and the system has no solution. We say this system is inconsistent.
When we apply Gauss-Jordan to this system, we get:
1 −2 0
0 0 8
The last row asserts 0 = 8 (which is obviously false). We say the system is inconsistent. In other
words, the original system is equivalent to the false statement “0 = 8”. We can generalize this
observation as follows:
• If a row echelon form of the augmented matrix has a row of the form [0 · · · 0 k] where k ≠ 0,
then the system is inconsistent.
• Conversely, if there is no such row then the system is consistent: there exists at least one
solution.
Example 3.1. For the linear system with the following augmented matrix, how many solutions
are there? (interpret the result graphically.)
0 1 2
1 −1 1
1 1 3
3.2 General Solution
Notice that the final row expresses the equation “0 = 0”, which is redundant. In essence,
Gauss-Jordan elimination has eliminated one of the equations. Also, this is the simplest system of equations
that is equivalent to the original; it can't be reduced further.
It will be useful to distinguish between the basic variables (x1 and x2 ) which correspond to pivot
columns, and the other, free variable (x3 ). This system has an infinite number of solutions, which
can be conveniently expressed by solving for the basic variables in terms of the free variables,
yielding the general solution
x1 = 3x3 + 1
x2 = x3
x3 is free.
which describes every possible solution of the original system. This means x3 can take on any
value, while x1 and x2 are determined by x3 according to the formulas above. So, for example, one
solution is:
x3 = 0, x1 = 1, x2 = 0
and another is
x3 = 2, x1 = 7, x2 = 2.
We get a different solution for each choice of x3 ; thus this system has an infinite number of solutions.
Note that the variable x3 acts as a parameter in the general solution: it can take any value, but this
value then determines all of the other variables. It is sometimes convenient to introduce a dummy
variable to act as this parameter. Then, for example, if we let x3 = t we can write the general
solution in parametric form as
x1 = 3t + 1
x2 = t
x3 = t
where the parameter t can take any value.
Example 3.2. Find the general solution of the linear system
x1 − x2 − x3 + 2x4 = 1
2x1 − 2x2 − x3 + 3x4 = 3
−x1 + x2 − x3 = −3,
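(For checking work, a computer algebra system can produce the RREF directly. The following sympy sketch, our own addition, sets up the augmented matrix of Example 3.2; rref() returns the reduced matrix together with the pivot columns.)

    import sympy as sp

    aug = sp.Matrix([[ 1, -1, -1, 2,  1],
                     [ 2, -2, -1, 3,  3],
                     [-1,  1, -1, 0, -3]])
    R, pivots = aug.rref()
    print(R)        # reduced row echelon form
    print(pivots)   # indices of the pivot columns; the remaining variables are free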
3.3 General results
Lec. #4
Note that for any system of linear equations, there are only three possibilities for the number of
solutions: there must be either:
• exactly one solution (when the number of pivot columns equals the number of variables)
• no solutions (when a row echelon form has a row of the form [0 · · · 0 k] with k ≠ 0)
• an infinite number of solutions (when there is no such row and there are fewer pivot columns
than the number of variables, so there is at least one free variable)
So, e.g. it is impossible for a linear system to have exactly two solutions.
Example 3.3. Consider the linear system
x + ky = 2
−3x + 4y = 1.
For what value(s) of k is there (a) a unique solution (b) no solution (c) an infinite number of
solutions?
Example 3.4. Consider the linear system
2x − y = 3
−4x + 2y = k
4x − 2y = 6.
For what value(s) of k is there (a) a unique solution (b) no solution (c) an infinite number of
solutions?
A linear system with all zeroes on the right-hand side is said to be homogeneous. For example,
the system
3x + y + z = 0
x−y+z =0
x + 5y − 2z = 0
is homogeneous. Homogeneous systems have some special properties that are worth noting. For
example, a homogeneous system always has at least one solution, the “zero solution” x = y = z =
· · · = 0.
Theorem 3.1. For a homogeneous linear system, if the number of equations is less than the number
of variables then the system has infinitely many solutions.
Proof. Since the augmented matrix has fewer rows than variables, the RREF will have fewer pivots
(there is at most one pivot per row) than the number of variables. Thus there must be at least one
free variable. The system cannot be inconsistent (the zero solution will always be a solution) so,
since there is a free variable, there must be infinitely many solutions.
4 Applications of Linear Systems
A standard application is balancing a chemical reaction, e.g. the combustion of propane:
x1 C3H8 + x2 O2 −→ x3 CO2 + x4 H2O
Write equations expressing the balance of numbers of each atom on both sides of the reaction:
3x1 = x3
8x1 = 2x4
2x2 = 2x3 + x4
Bring all terms to the left-hand sides, write the corresponding augmented matrix, and RREF:
3 0 −1  0  0            1 0 0 −1/4  0
8 0  0 −2  0    −→      0 1 0 −5/4  0
0 2 −2 −1  0            0 0 1 −3/4  0
So x4 is a free variable, and the general solution can be written in parametric form as
x1 = (1/4)t
x2 = (5/4)t
x3 = (3/4)t
x4 = t.
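(As a check, the same parametric solution can be obtained from the null space of the coefficient matrix. The sympy code below is our own illustration and is not part of the notes.)

    import sympy as sp

    A = sp.Matrix([[3, 0, -1,  0],
                   [8, 0,  0, -2],
                   [0, 2, -2, -1]])
    print(A.nullspace())   # a single basis vector, a multiple of (1, 5, 3, 4)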
A similar approach can be used to balance the reaction
B2S3 + H2O −→ H3BO3 + H2S
Example 4.3. Solve the following network flow (i.e. find the flow in each branch of the network).
[Figure: a network with three branches carrying flows x1, x2, x3 and external flows 10 and 7.]
Using the principle that “flow in = flow out” at each node, we get the following equations:
x1 + x2 = 10
x2 + x3 = 7
x1 = 3 + x3
The augmented matrix and its RREF are
1 1  0 10                1 0 −1 3
0 1  1  7   −−RREF−→     0 1  1 7
1 0 −1  3                0 0  0 0
so the general solution is
x1 = 3 + t
x2 = 7 − t
x3 = t
If the directions of flow are required to be as shown, then the range of possible solutions becomes
restricted as follows:
3 + t ≥ 0
7 − t ≥ 0     =⇒     0 ≤ t ≤ 7     =⇒     3 ≤ x1 ≤ 10,   0 ≤ x2 ≤ 7,   0 ≤ x3 ≤ 7
t ≥ 0
Three companies (A, B and C) compete for market share (e.g. customers). Each week, each
company loses a fixed proportion of its market share to its competitors, as shown below.
[Figure: diagram of the weekly market-share transfers among companies A, B and C.]
After many weeks, the distribution of market share reaches an equilibrium. What share of the
market does each of the companies have?
At equilibrium, the weekly net change of each company’s market share is zero. This gives the
following system of equations.
−0.4x1 + 0.2x2 + 0.1x3 = 0
0.1x1 − 0.2x2 + 0.1x3 = 0
0.3x1 − 0.2x3 = 0
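(A quick computational check of this system, using sympy with exact rational arithmetic; the code is our own addition. The equilibrium direction is the null space of the coefficient matrix, scaled so the shares sum to 1.)

    import sympy as sp
    from sympy import Rational as R

    A = sp.Matrix([[R(-4, 10), R(2, 10),  R(1, 10)],
                   [R(1, 10),  R(-2, 10), R(1, 10)],
                   [R(3, 10),  0,         R(-2, 10)]])
    v = A.nullspace()[0]     # direction of the equilibrium solutions
    print(v / sum(v))        # shares summing to 1: (4/15, 1/3, 2/5)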
5 Matrix Algebra
Lec. #7
We have already seen that matrices are useful to keep track of the “book-keeping” involved in
solving linear systems. Matrices have many other uses as well; to this end we need to develop our
theory of matrices and the mathematical operations that can be performed with them.
It is convenient to use symbols to represent matrices, just as symbols are used to represent numbers
in ordinary algebra. For example, let
A =  5  3        B =  1  2        C =  2  4        and c = 3.
     1 −2             3  4             6  8
     4  0             5  6
Throughout this course we will follow the convention of using uppercase letters to represent matri-
ces. Lowercase letters will be used to represent scalars (i.e. numbers).
Definition 5.1. Matrix addition (A + B) and subtraction (A − B) are defined by element-wise
addition/subtraction of the individual components of the matrices.
Note that when two matrices are added or subtracted, the result is a matrix of the same dimensions
as the original two. Also, addition and subtraction are only defined for matrices that have the
same dimensions. So, with A and C as given above, both A + C and A − C are undefined.
Definition 5.2. Multiplication by a scalar (cA) is defined by element-wise multiplication.
Again, the result is a matrix of the same dimensions as the original.
As in regular algebra, we can combine addition and scalar multiplication to form compound ex-
pressions.
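(A brief numerical illustration using Python's numpy library; numpy is our own choice and is not part of the course.)

    import numpy as np

    A = np.array([[5, 3], [1, -2], [4, 0]])
    B = np.array([[1, 2], [3, 4], [5, 6]])
    c = 3

    print(A + B)    # element-wise addition; the result has the same 3x2 shape
    print(A - B)    # element-wise subtraction
    print(c * A)    # scalar multiplication
    # A + C, with C the 2x2 matrix above, raises an error: the shapes don't match.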
The following rules of matrix algebra are consequences of the way we have defined addition, sub-
traction and scalar multiplication.
Theorem 5.1. Let A, B and C be given matrices with the same dimensions; let c and d be given
scalars. Let 0 be the zero matrix. Then:
1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + 0 = A
4. A + (−A) = 0
5. c(A + B) = cA + cB
6. (c + d)A = cA + dA
7. c(dA) = (cd)A
8. 1A = A
We need to prove that these algebraic identities are indeed true in general. The proofs can be
done simply (but tediously) by considering arbitrary matrices whose components are represented
symbolically, and showing that both sides of each identity do indeed agree.
Consider, for example, identity 1 (A + B = B + A). From the definition of matrix addition, the left-hand side of the identity becomes:

A + B =  a11 + b11   a12 + b12   . . .   a1n + b1n
         a21 + b21   a22 + b22   . . .   a2n + b2n
            ...         ...                 ...
         am1 + bm1   am2 + bm2   . . .   amn + bmn

whereas

B + A =  b11 + a11   b12 + a12   . . .   b1n + a1n        a11 + b11   a12 + b12   . . .   a1n + b1n
         b21 + a21   b22 + a22   . . .   b2n + a2n   =    a21 + b21   a22 + b22   . . .   a2n + b2n
            ...         ...                 ...              ...         ...                 ...
         bm1 + am1   bm2 + am2   . . .   bmn + amn        am1 + bm1   am2 + bm2   . . .   amn + bmn
where in the last step we used the commutative law for ordinary addition of scalars. Comparing
final results from these two equations, we see that indeed A + B = B + A.
In essence, the algebraic properties listed above imply that matrices can be treated as algebraic
quantities that follow the same rules as in ordinary algebra, at least as far as addition, subtraction,
and scalar multiplication are concerned. That is, when doing algebra with matrices we can treat
them like ordinary variables — up to a point. So, for example, we can use algebra to solve equations
involving matrices: given
4A + 3B = 8C
we can solve for A to obtain A = 2C − (3/4)B.
5.3 Vectors
Matrices having just a single column play a special role, and are called vectors. A vector is sometimes
written as a column matrix, sometimes just as a comma-separated list. For example:
      1
x =   5   = (1, 5, 3).
      3
Throughout this course we will follow the convention of writing vectors using boldface symbols (or
with an underline, when writing by hand).
We identify the individual components of a vector using subscripts. So, for example, the vector x
above has components x1 = 1, x2 = 5, and x3 = 3. The order of the components matters:
(1, 2, 3) ≠ (1, 3, 2).
5.4 Vector form of the general solution of a linear system
Lec. #8
Go back to a previous example:
Example 5.4. Find, in vector form, all solutions of the linear system
3x1 + 5x2 − 4x3 = 0
−3x1 − 2x2 + 4x3 = 0.
RREF gives
1 0 −4/3 0
0 1 0 0
so the general solution is
x1 = (4/3) t
x2 = 0
x3 = t
where t ∈ IR is an arbitrary parameter. Introducing the vector of unknowns x = (x1 , x2 , x3 ) we can
write the general solution in vector form as
x = (x1 , x2 , x3 ) = ((4/3)t, 0, t) = t (4/3, 0, 1).
We see that every solution is a scalar multiple of the constant vector (4/3, 0, 1).
Example 5.5. Find, in vector form, the general solution of
x1 − x2 − x3 + 2x4 = 1
2x1 − 2x2 − x3 + 3x4 = 3
−x1 + x2 − x3 = −3.
Having defined addition, subtraction and scalar multiplication for matrices, we now turn to defining
matrix multiplication. It turns out that the most useful way to define the product of two matrices is
not by element-wise multiplication, as you might have expected. The matrix product has a rather
more complicated definition, whose utility will become apparent as we proceed.
Definition 5.3. Given two matrices A and B, the matrix product (AB) is defined by
AB = C
where Cij , the (i, j)'th component of C (i.e. the entry in the i'th row and j'th column), is given by
the sum of products of the entries of the i'th row of A with those of the j'th column of B. That is,
Cij = Ai1 B1j + Ai2 B2j + · · · + Ain Bnj   (the sum of Aik Bkj over k = 1, . . . , n).
Important: the product AB is only defined if the number of columns of A equals the number of
rows of B.
Example 5.6. Evaluate:

(a)   1 2 3      1  2         (b)   1 2 3      1  2         (c)   1  2      3
      4 5 6      0 −3               4 5 6      0 −3               0 −3      1
                −1  5
Example (c), the product of a matrix and a vector, is a common special case that’s worth extra
attention. Note that the product Ax always results in a vector.
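(A small numpy sketch of matrix products; the @ operator implements exactly the row-by-column rule of Definition 5.3. The particular matrices are just examples.)

    import numpy as np

    A = np.array([[1, 2, 3], [4, 5, 6]])        # 2x3
    B = np.array([[1, 2], [0, -3], [-1, 5]])    # 3x2

    print(A @ B)                                # a 2x2 matrix
    print(B @ A)                                # a 3x3 matrix: AB and BA differ, even in size
    print(np.array([[1, 2], [0, -3]]) @ np.array([3, 1]))   # matrix times vector gives a vector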
Matrix powers can be defined in terms of iterated multiplication. So, for example, if A is a square
matrix then A^3 = AAA. In general, A^n = AA · · · A (n factors).
Example 5.7. Calculate A^3 where
A =  1  2
     0  3
There is another useful way to think of matrix multiplication. Suppose the columns of A are the
vectors v1 , . . . , vn (we write A = [v1 v2 · · · vn ]). Then
Ax = x1 v1 + x2 v2 + · · · + xn vn .
So the matrix-vector product is the vector sum of the columns of A, each weighted by the corre-
sponding component of x (such a weighted sum is called a linear combination of the columns of
A.)
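(The column view can also be checked numerically; this numpy snippet is our own addition.)

    import numpy as np

    A = np.array([[1, 2], [0, -3], [-1, 5]])
    x = np.array([3, 1])

    print(A @ x)                          # the matrix-vector product
    print(3 * A[:, 0] + 1 * A[:, 1])      # the same vector, built as a linear combination of the columns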
The following algebraic properties are consequences of our definition of matrix multiplication (al-
though they are tedious to prove):
1. A(BC) = (AB)C
2. A(B + C) = AB + AC
3. (A + B)C = AC + BC
4. c(AB) = (cA)B = A(cB)
Thus, under matrix multiplication, matrices can be treated as ordinary algebraic quantities except
that matrix multiplication does not commute: left-multiplication is fundamentally different from
right-multiplication; they are not interchangeable. This is very important to remember. It is the
key difference between linear algebra and the ‘ordinary’ algebra, and accounts for much of the
richness in the theory of linear algebra.
Example 5.8. Calculate AB and BA where
A =  1  2        and    B =  3  4
     0 −3                    1  0
Example 5.9. Calculate AB and BA where
A =  1  2        and    B =  5
     0 −3                    2
Now consider the matrix
A =  1  2
     3  4
The components of this matrix look like the left-hand side of a linear system. Indeed, the linear
system
x1 + 2x2 = 5
3x1 + 4x2 = 8
can be written as the single matrix equation
Ax = b
where x = (x1 , x2 ) and b = (5, 8).
The matrix A is called the coefficient matrix; its components can be read directly from the system
of equations.
Any system of linear equations can be written as a single equivalent matrix equation Ax = b. This
turns out to be very useful.
Example 5.10. Write a system of linear equations that is equivalent to the matrix equation Ax = b
with
A =  1  3        and    b =  0
     2  5                    1
Example 5.11. Find matrices A and b such that the matrix equation Ax = b is equivalent to the
linear system
x1 − 2x2 = 0
2x1 − 4x2 = 8.
Identifying matrix equations with linear systems has many advantages, including:
• any linear system can be represented by a single equation of the form Ax = b, no matter
how many unknowns or equations there are
Since any linear system can be written in matrix form as Ax = b, there is much to be gained by
discovering how to solve this and other matrix equations.
To solve Ax = b, one is tempted to simply “divide” both sides by A. However, matrix division is
an undefined operation, so this won’t work. But consider the equation
ax = b
where a, x and b are simply scalars. To solve this equation we could divide both sides by a, but it
is also possible to solve it using only multiplication, by multiplying both sides by the number a−1 :
ax = b =⇒ x = (a−1 )b.
Our goal in the following sections is to do something similar for matrix equations, i.e. to solve
Ax = b using only multiplication. We will do this by finding a matrix A−1 so that the solution
can be written as x = A−1 b. To do this, we need to introduce the “identity matrix”.
Definition 5.4. The n × n identity matrix is the (unique) matrix I such that IA = AI = A for
every n × n matrix A. Every identity matrix I has the form
I =  1  0  . . .  0
     0  1  . . .  0
     ...          ...
     0  0  . . .  1
For the purposes of matrix multiplication, I plays a role similar to the number 1 in ordinary
multiplication. This simple idea proves to be very useful.
Note that multiplication by the identity does commute.
We are now in a position to present the main tool for solving matrix equations like Ax = b:
Definition 5.5. Given an n × n matrix A, the inverse of A (if it exists) is the (unique) n × n
matrix, written A−1 , such that
AA−1 = A−1 A = I.
If A−1 exists then A is said to be invertible or non-singular.
Now consider the equation Ax = b. By multiplying this equation on both sides by A−1 and
simplifying, using the rules of matrix algebra, we can solve for x as follows:
Ax = b =⇒ A−1 (Ax) = A−1 b
=⇒ (A−1 A)x = A−1 b
=⇒ Ix = A−1 b
=⇒ x = A−1 b.
This gives an algebraic method of solving a system of linear equations, without using the Gauss-
Jordan algorithm, and it gives an explicit formula for the solution. All we need is a way to find
A−1 . Often this involves a lot of work, but for special cases it’s easy.
x−y =4
x−z =6
6x − 2y − 3z = 2.
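(A numerical check of this system using numpy, our own addition: form A and b and compute x = A−1 b.)

    import numpy as np

    A = np.array([[1, -1,  0],
                  [1,  0, -1],
                  [6, -2, -3]], dtype=float)
    b = np.array([4, 6, 2], dtype=float)

    print(np.linalg.inv(A) @ b)    # x = A^{-1} b
    print(np.linalg.solve(A, b))   # same answer, computed without forming the inverse explicitly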
5.11 Non-invertible matrices
We have found an algebraic way of solving a system of linear equations, by transforming the system
into a matrix equation:
Ax = b =⇒ x = A−1 b.
Notice that this equation implies that there is a unique solution. Thus, if a linear system does not
have a unique solution then the coefficient matrix does not have an inverse. Of course, not all
matrices have inverses, since not all linear systems have unique solutions.
Definition 5.6. If a matrix A has an inverse then we say A is invertible or nonsingular. If A does
not have an inverse we say A is non-invertible or singular.
Example 5.17. If we try to find the inverse of
A =  1  2
     2  4
we get a division by zero. The matrix is singular: it does not have an inverse. Consider the linear
system corresponding to Ax = b:
x1 + 2x2 = b1
2x1 + 4x2 = b2
1 2 b1    −−RREF−→    1 2 b1
2 4 b2                0 0 b2 − 2b1
Depending on the values b1 , b2 this system is either inconsistent or has an infinite number of
solutions. Either way, there will not be a unique solution, so we cannot solve the corresponding
matrix equation in the form x = A−1 b. Therefore A cannot have an inverse.
We have seen that matrix inverses are useful for expressing the solution of linear systems, but this
will only be possible if we can find a matrix inverse when we need one.
Here is a general method for finding A−1 , if it exists:
1. Form the augmented matrix [ A | I ] where the identity I has the dimensions of A.
2. Compute the RREF of this augmented matrix.
3. If the left-hand block of the RREF is I, then the right-hand block is A−1 ; otherwise A is not invertible.
To summarize:
[ A | I ] −−RREF−→ [ I | A−1 ].
Later we’ll see why this process works; for now we’ll just take it for granted and use it whenever
we need to compute a matrix inverse.
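(Here is a short sympy sketch of the procedure, using a small matrix of our own choosing; sympy is not assumed by the course.)

    import sympy as sp

    A = sp.Matrix([[1, 2], [3, 4]])
    aug = A.row_join(sp.eye(2))    # form [ A | I ]
    R, _ = aug.rref()              # row reduce
    A_inv = R[:, 2:]               # the right-hand block is A^{-1}
    print(A_inv)
    print(A * A_inv)               # the identity, as a check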
Example 5.18. Find the inverse of
A =  1  4
     1  3
Example 5.19. Find the inverse of
1  2  3
2  5  3
1  0  8
if it exists.
Example 5.20. Show that
 1 −2 −1
−1  5  6
 5 −4  5
does not have an inverse.
The inverse can also be used to solve other matrix equations. For example, if A is invertible then
we can solve XA = B for the unknown matrix X:
XA = B =⇒ (XA)A−1 = BA−1
=⇒ X(AA−1 ) = BA−1
=⇒ XI = BA−1
=⇒ X = BA−1 .
Example 5.21. Solve the matrix equation ABXC = D for X, supposing A, B and C are invertible.
Matrix inverses have the following properties (assuming A and B are invertible n × n matrices and
c ≠ 0 is a scalar):
1. A−1 is unique.
2. (A−1 )−1 = A.
3. (AB)−1 = B −1 A−1 .
4. (cA)−1 = (1/c) A−1 .
Proof of 1: (by contradiction) Suppose B and C are two different inverses of A. This means that
BA = I, AB = I, CA = I, AC = I. Then:
B = IB = (CA)B = C(AB) = CI = C
Proof of 2: The matrix (A−1 )−1 is the (unique) matrix C such that A−1 C = I and CA−1 = I. But
we already know that A is such a matrix, so the inverse of A−1 is A.
Proof of 3: It suffices to demonstrate that both (B −1 A−1 )(AB) = I and (AB)(B −1 A−1 ) = I:
(B−1 A−1 )(AB) = B−1 (A−1 A)B = B−1 IB = B−1 B = I, and similarly (AB)(B−1 A−1 ) = A(BB−1 )A−1 = AA−1 = I.
The transpose AT of a matrix A is the matrix obtained by interchanging the rows and columns of
A. The transpose has the following algebraic properties:
1. (AT )T = A
2. (A + B)T = AT + B T
3. (kA)T = k(AT )
4. (AB)T = B T AT
Here we will see why it is that calculating the RREF of [ A | I ] gives [ I | A−1 ].
The key observation is that every elementary row operation can be represented by a certain matrix
multiplication:
Example 5.23. Show that performing the row operation R2 + 2R1 on the augmented matrix
−1 3 2 5
2 4 1 1
−4 9 0 2
is equivalent to left multiplication by the matrix
1 0 0
E = 2 1 0 .
0 0 1
In the previous example, notice that the matrix E is what we would get if we applied the same row
operation to the identity matrix. In fact, this is always the case. To see why, suppose a certain row
operation (or combination of row operations) corresponds to left-multiplication by some matrix E.
We have, of course, that
E = EI,
but now the right-hand side is the result of applying the given row operations to I, so this equation
states that we can find E by applying the corresponding row operations to I.
Example 5.24. Find the 3 × 3 matrix E such that left multiplication of a 3 × n matrix by E
corresponds to the row operations R1 ↔ R3 followed by R2 − 5R1 .
Now consider calculating the RREF of [ A | I ]. This means finding a sequence of row operations
that reduces the left side of this matrix to the identity. Equivalently, it means finding a series of
matrices E1 , E2 , . . . , Ek so that left multiplication by E1 , followed by E2 , etc., reduces A to I:
Ek · · · E2 E1 A = I.
Now this says that the matrix Ek · · · E2 E1 must be A−1 , since multiplication of A by Ek · · · E2 E1
gives the identity. When we calculate the RREF of [ A | I ], we apply the same row operations to
both sides, so we have
RREF
[ A | I ] −−−−→ [ Ek · · · E2 E1 A | Ek · · · E2 E1 I ] = [ I | Ek · · · E2 E1 ] = [ I | A−1 ].
Note that we have found an interesting characterization of A−1 : it is the matrix that performs the
series of elementary row operations required to reduce A to I. So, for example, when we solve
Ax = b by multiplying both sides by A−1 = Ek · · · E2 E1 , we are effectively performing the row
operations that result in the RREF of the augmented matrix [ A | b ]:
RREF
[ A | b ] −−−−→ [ Ek · · · E2 E1 A | Ek · · · E2 E1 b ] = [ I | A−1 b ].
In other words, A−1 is a record of all the row operations required to compute the RREF of the
augmented matrix; calculating A−1 b performs these row operations on b only, resulting in the final
column of the RREF, from which we can read off the solution.
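(The following numpy sketch, our own addition, checks this idea on the matrix of Example 5.23: applying R2 + 2R1 to the identity gives E, and left multiplication by E performs the same row operation.)

    import numpy as np

    A = np.array([[-1, 3, 2, 5], [2, 4, 1, 1], [-4, 9, 0, 2]], dtype=float)

    E = np.eye(3)
    E[1] += 2 * E[0]                # apply R2 + 2R1 to the identity
    B = A.copy()
    B[1] += 2 * B[0]                # apply R2 + 2R1 directly to A
    print(np.allclose(E @ A, B))    # True: E A equals the row-reduced A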
6 Applications
Lec. #13
Having built up a collection of the most essential ideas and techniques of linear algebra, we’re in a
position to explore some applications. The techniques discussed so far are used in a huge variety
of different fields; we’ll only sample some of the possibilities.
6.1 Balancing chemical equations: vector form
x1 C3 H8 + x2 O2 −→ x3 CO2 + x4 H2 O.
Rather than balance all species separately, we can balance all at once using vectors. For each
molecule, define a constant vector whose elements are the number of atoms of carbon, hydrogen
and oxygen, respectively. That is, represent C3H8 with the vector (3, 8, 0), O2 with the vector
(0, 0, 2), etc. The balanced chemical equation then becomes the vector equation:
x1 (3, 8, 0) + x2 (0, 0, 2) = x3 (1, 0, 2) + x4 (0, 2, 1).
Of course, component-by-component this represents three equations, each expressing the balance of
one species. But working at the level of vectors is more concise, there’s only one equation, there’s
a more intuitive connection to the chemical equation, and it’s obvious how to generalize to more
complicated reactions with more species.
Taking everything to the left-hand side we get the homogeneous matrix equation:
3  0 −1  0            0
8  0  0 −2    x   =   0 .
0  2 −2 −1            0
So one solution (taking x4 = 4) is x = (1, 5, 3, 4). The structure of the entire solution set is also
apparent: every solution is some multiple of this constant vector.
[Figure: migration diagram. Each year 5% of BC's population moves to Alberta (95% stays), and
10% of Alberta's population moves to BC (90% stays).]
1. How many people will be living in each province after 10 years? (Assume the foregoing
assumptions continue to hold throughout this time interval.) 50 years?
2. Eventually the population distribution reaches an equilibrium. How to predict this equilib-
rium distribution from the transition matrix A?
To solve 1, let x1(n) = the number of people in BC after n years and x2(n) = the number of people
in Alberta after n years. Let x(n) = (x1(n) , x2(n) ); so e.g. x(0) = (4 × 10^6 , 3 × 10^6 ). We can write
the following equations for the populations after one year:
x1(1) = 0.95 x1(0) + 0.1 x2(0)
x2(1) = 0.05 x1(0) + 0.9 x2(0)
In matrix form these equations read x(1) = Ax(0) , where
A =  0.95  0.1
     0.05  0.9
is the transition matrix. In general,
x(n) = A^n x(0) .
Note: the matrix A10 can be interpreted as a transition matrix, representing the net migration that
has occurred after 10 years.
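(A numpy sketch for question 1, with the starting populations taken from the text; the code itself is our own addition.)

    import numpy as np

    A = np.array([[0.95, 0.10],
                  [0.05, 0.90]])
    x0 = np.array([4e6, 3e6])

    print(np.linalg.matrix_power(A, 10) @ x0)   # populations after 10 years
    print(np.linalg.matrix_power(A, 50) @ x0)   # after 50 years (close to the equilibrium found below)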
To answer 2. notice that if x is the population vector at equilibrium then
Ax = x   =⇒   (A − I)x = 0   =⇒
( 0.95  0.1 )     ( 1  0 )   ( x1 )     ( 0 )
( 0.05  0.9 )  −  ( 0  1 )   ( x2 )  =  ( 0 ) .
We can’t solve this by using the matrix inverse of (A − I), since this would give x = (0, 0) (the
trivial solution). In fact (A − I) doesn’t have an inverse, because there are an infinite number of
solutions. Instead, we use Gauss-Jordan on the augmented matrix for this linear system:
−0.05   0.1  0    −−RREF−→    1 −2 0
 0.05  −0.1  0                0  0 0
So the general solution (in vector form) is
x = (x1 , x2 ) = (2x2 , x2 ) = x2 (2, 1)     where x2 is free.
So every equilibrium solution is one where BC has twice the population of Alberta. Of course the
particular value of x2 depends on the total population of the two regions:
4,000,000 + 3,000,000 = 2x2 + x2   =⇒   x2 = 7,000,000/3   =⇒   x ≈ (4,666,667, 2,333,333).
[Figure: a directed graph on five nodes, numbered 1 through 5.]
A directed graph consists of a set of vertices or nodes (numbered 1 through 5 above) and a set of
directed edges between nodes.
Some important types of questions that might be asked about such a graph are:
• How many different paths from node 1 to node 4 take exactly 6 steps?
• How many different paths are there from node 1 to node 4 that take fewer than 6 steps?
• What is the shortest path from node 1 to node 4?
Of course for a simple graph we can answer these questions easily; but for a complicated graph
these are hard problems. For general graphs, all of these questions can be answered using methods
of matrix algebra. The key is to associate with the graph an incidence matrix A with elements
aij = number of edges from the i'th node to the j'th node.
So for the graph above the incidence matrix is
0 1 0 0 1
0 0 1 1 0
A= 0 0 0 0 1
0 0 0 0 1
1 0 0 1 0
Given A, it is possible to recover a drawing of the graph. So A is as complete a description of the
graph as the picture.
Definition 6.1. A path or chain is a route (sequence of edges) from one node to another. A path
that consists of n edges is called an n-chain.
Theorem 6.1. The number of n-chains from node i to node j is equal to the element aij of An .
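(A numpy illustration of the theorem for the incidence matrix above; note that numpy indexes from 0, so node 1 corresponds to index 0. The code is our own addition.)

    import numpy as np

    A = np.array([[0, 1, 0, 0, 1],
                  [0, 0, 1, 1, 0],
                  [0, 0, 0, 0, 1],
                  [0, 0, 0, 0, 1],
                  [1, 0, 0, 1, 0]])

    print(np.linalg.matrix_power(A, 6)[0, 3])                               # 6-chains from node 1 to node 4
    print(sum(np.linalg.matrix_power(A, k)[0, 3] for k in range(1, 6)))     # chains of fewer than 6 steps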
7 Determinants
Lec. #16
Recall that for a 2 × 2 matrix
A =  a  b
     c  d
the inverse (when it exists) is given by
A−1 = 1/(ad − bc)  ·    d  −b
                       −c   a
The factor ad − bc in the denominator is called the determinant of the matrix, and plays an
important role. In particular, if ad − bc = 0 the inverse of the matrix is undefined. This provides
a quick check for whether a matrix has an inverse, and hence whether the corresponding linear
system has a unique solution. Determinants are important for many other reasons, so we'll spend
a couple of days studying them.
Definition 7.1. If A =  a  b    then det A = |A| = ad − bc.
                        c  d
Theorem 7.1. If A is a square matrix, then A−1 exists if and only if det A ≠ 0.
This really becomes useful for larger matrices, because it's a much faster alternative to trying to
compute A−1 by the Gauss-Jordan method. But how do we compute det A if A is a general n × n
matrix? This requires some more definitions:
Definition 7.2. Let A be an n × n matrix with elements aij . The matrix Mij formed by deleting
row i and column j from A is called the ij’th minor of A. The determinant of A, written det(A)
or |A|, is given by
det(A) = Σ (j = 1, . . . , n) of (−1)^(i+j) aij |Mij |     (cofactor expansion along row i, for any i = 1, . . . , n)     (1)
So, for example, if A is a 3 × 3 matrix then the cofactor expansion along row 1 gives
det(A) = a11 |M11 | − a12 |M12 | + a13 |M13 |.
The same value is obtained by cofactor expansion along any row or any column (with the signs
(−1)^(i+j) ). For example, consider
A =  1  5  0
     2  4 −1
     0 −2  0
Expanding along the first row:
det A = (1) | 4 −1 |  −  (5) | 2 −1 |  +  (0) | 2  4 |
            |−2  0 |         | 0  0 |         | 0 −2 |
      = (1)[(4)(0) − (−1)(−2)] − (5)[(2)(0) − (−1)(0)] + 0
      = −2
This often makes calculating a determinant easier, because we are free to make a convenient choice
of row or column along which to do the cofactor expansion — typically this is the row or column
with the most zeroes.
Example 7.2. To calculate the determinant in the previous example, we can use a cofactor ex-
pansion down the third column:
det A = (0) | 2  4 |  −  (−1) | 1  5 |  +  (0) | 1  5 |
            | 0 −2 |          | 0 −2 |         | 2  4 |
      = (1)[(1)(−2) − (5)(0)] = −2.
Example 7.3. Calculate the determinants of
(a)  A =  3  0  4            (b)  B =  3  5 −8  4
          2  3  2                      0 −2  3 −7
          0  5 −1                      0  0  1  5
                                       0  0  0  2
Aside: Students who have studied recursion (e.g. computer science) will recognize the determinant
calculation as a recursive algorithm: the determinant of an n × n matrix is expressed as a sum of
determinants of n smaller sub-matrices of dimension (n − 1) × (n − 1). These in turn are expressed
as determinants of yet smaller sub-matrices, of dimension (n − 2) × (n − 2), and so on until the
algorithm terminates at the 2 × 2 case, for which we have a simple formula.
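(For the curious, here is that recursive algorithm written out in Python; it is an illustration only, and far too slow for large matrices.)

    def det(M):
        """Determinant by cofactor expansion along the first row."""
        n = len(M)
        if n == 1:
            return M[0][0]
        total = 0
        for j in range(n):
            minor = [row[:j] + row[j+1:] for row in M[1:]]   # delete row 1 and column j
            total += (-1) ** j * M[0][j] * det(minor)
        return total

    print(det([[1, 5, 0], [2, 4, -1], [0, -2, 0]]))    # -2, agreeing with the worked example above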
The number of arithmetic operations needed to calculate det A grows very quickly as A gets bigger.
Counting just the multiplications (the most expensive operation), cofactor expansion of an n × n
determinant requires on the order of n! multiplications. For the 10 × 10 case this is 10! = 3,628,800;
for the 20 × 20 case, 20! ≈ 2 × 10^18 (more than the age of the universe in seconds!)
The determinant of a triangular matrix is the product of its diagonal entries.
In particular, a triangular matrix is invertible if and only if its diagonal entries are all non-zero.
Applying elementary row operations to a matrix changes the value of its determinant in predictable
ways:
Theorem 7.4. Let A be an n × n matrix. Then:
1. interchanging two rows of A changes the sign of the determinant;
2. multiplying a row of A by a constant c multiplies the determinant by c;
3. adding a multiple of one row of A to another row leaves the determinant unchanged.
7.4 Summing up
Lec. #18
The determinant helps to tie together the disparate parts of the theory discussed so far, as sum-
marized in the following theorem:
Theorem 7.6. Let A be an n × n matrix. Then all of the following statements are equivalent:
1. A is invertible.
2. The linear system Ax = 0 has only the trivial solution.
3. The linear system Ax = b has a unique solution.
4. A is row equivalent to the n × n identity matrix I.
5. The row echelon form of A has n pivots.
6. det A ≠ 0.
8 Vectors
Lec. #19
Recall: a vector is a column matrix, e.g.
      1
x =   0   = (1, 0, 2).
      2
Geometrically, 2-vectors can be identified with points in the plane; similarly, 3-vectors represent
points in 3-space. It is also useful to think of a vector as displacement, e.g. relative to the origin,
which specifies a magnitude and direction, but without any particular location; we represent this
graphically by drawing a vector as an arrow, often but not necessarily anchored at the origin.
Definition 8.1. The set of all 2-vectors is denoted IR2 . The set of all 3-vectors is denoted IR3 .
More generally, for any fixed integer n ≥ 0, IRn denotes the set of all n-vectors. A set IRn is called
a Euclidean space.
Geometrically, IR2 is the set of points in the plane; IR3 is the set of all points in 3-space. An
ordinary scalar is an element of IR1 , or simply IR, the real number line.
- scalar multiplication lengthens or shortens a vector while preserving its orientation (if the scalar
multiple is negative, there is also a reflection)
Vector subtraction can be viewed similarly. The difference x − y asks “what must I add to y to get
to x?”
Definition 8.3 (Span). The set of all possible linear combinations of vectors v1 , . . . , vn is denoted
span{v1 , . . . , vn }.
Essentially, the span of a given set of vectors is the set of points you can “get to” by forming linear
combinations.
Example 8.1. Let u = (1, 3) ∈ IR2 . Then span{u} is the set of all vectors of the form cu where c
is a scalar. Geometrically, span{u} is the set of vectors on a line with direction u passing through
the origin.
Example 8.2. Let u = (1, 2, 3), v = (2, 0, 0) ∈ IR3 . Then span{u, v} is the set of all vectors of the
form c1 u + c2 v where c1 , c2 are scalars. Geometrically, span{u, v} is the plane through the origin
containing both u and v.
Example 8.3. Consider v1 = (1, 2), v2 = (1, −1). We see that span{v1 , v2 } contains x = (5, 1),
since x = 2v1 + 3v2 (i.e. x is a linear combination of v1 , v2 ).
Example 8.4. Let v1 = (2, 3), v2 = (4, 6). Consider x = (4, 4). Is x ∈ span{v1 , v2 }?
Example 8.5. Let v1 = (1, 1), v2 = (1, −1). Prove that span{v1 , v2 } = IR2 .
(Since you can “get to” all of IR2 by forming linear combinations of v1 and v2 , we say that these
vectors span IR2 .)
Example 8.6. Let v1 = (1, 2, 0), v2 = (1, 0, 1), v3 = (3, 2, 2). Do v1 , v2 , v3 span IR3 ?
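(One way to check Example 8.6 computationally: three vectors span IR3 exactly when the matrix having them as columns has three pivots, i.e. rank 3. The numpy snippet below is our own addition.)

    import numpy as np

    V = np.column_stack(([1, 2, 0], [1, 0, 1], [3, 2, 2]))
    print(np.linalg.matrix_rank(V))   # 2, so these vectors do not span IR3 (indeed v3 = v1 + 2 v2)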
Example 8.7. Let u = (1, 3), v = (2, 6) ∈ IR2 . Then span{u, v} = span{u} = span{v} is a line
through the origin with direction u, since u and v point in the same direction. In other words, u
and v don’t span a plane because they don’t point in independent directions.
[need a picture here]
Example 8.8. Let u = (1, 0, 0), v = (0, 1, 0), w = (1, 2, 0) ∈ IR3 . Then span{u, v, w} is the plane
with equation z = 0. This is the same plane obtained by considering just span{u, v} because w
does not point in a third independent direction. Loosely speaking, including w in the set of vectors
does not increase the number of places we can get to by forming linear combinations.
[need a picture here]
In the previous example it’s easy to see that all three vectors lie in the same plane, but consider:
Example 8.9. Suppose u = (1, 2, 3), v = (0, 1, −1), w = (1, 4, 1) ∈ IR3 . Do all three vectors point
in essentially independent directions, or do they all lie in a single plane? To answer this question
we need to check whether any one of the vectors can be expressed as a linear combination of the
other two (in which case this vector does not point in a third direction independent of the other
two). In fact you can check that w = u + 2v, so indeed w is a linear combination of u and v, and
all three vectors lie in a single plane. In other words, the three vectors span a set in which there
are only two independent directions.
Definition (Linear Dependence). A set of vectors v1 , . . . , vn is said to be linearly dependent if
c1 v1 + c2 v2 + · · · + cn vn = 0
for some scalars c1 , . . . cn that are not all zero. Otherwise the vectors are linearly independent.
That is, the vectors are linearly independent if the equation above has only the trivial solution
c1 = c2 = · · · = cn = 0.
The motivation behind the definition is that if it were possible to find scalars c1 , . . . , cn that satisfy
the equation in the definition, then it would be possible to express one of the vectors as a linear
combination of the others; e.g. if c1 ≠ 0 then
v1 = −(c2 /c1 ) v2 − · · · − (cn /c1 ) vn
so that v1 is a linear combination of the other vectors; that is, it doesn’t point in an independent
direction.
Example 8.10. Let u = (1, 2, 3), v = (0, 1, −1), w = (1, 4, 0) ∈ IR3 . To test for linear indepen-
dence, we could painstakingly check directly whether any of these vectors can be expressed as a
linear combination of the other two (as in the previous example). It’s easier to apply the definition,
in which we seek numbers c1 , c2 , c3 that satisfy the equation
c1 (1, 2, 3) + c2 (0, 1, −1) + c3 (1, 4, 0) = (0, 0, 0).
Row reduction of the corresponding augmented matrix shows that there are no free variables,
so the only solution is the trivial solution c1 = c2 = c3 = 0. Therefore the vectors are linearly
independent. In fact they span all of IR3 .
If you have two linearly independent vectors in IR2 , then it should be impossible to choose a third
vector that points in a third independent direction. This idea is formalized in the following theorem.
Theorem 8.1. A set of m vectors in IRn is always linearly dependent if m > n.
Proof. (sketch) Following the same method as the previous example, we set up a system of equations
for the unknowns c1 , . . . , cm , and the corresponding augmented matrix. We get n equations and m
unknowns. If m > n there will be at least one free variable, hence an infinite number of nontrivial
solutions. The existence of a nontrivial solution means the vectors are linearly dependent.
Lec. #22
An immediate consequence is the following corollary:
Corollary 8.1. Any set of linearly independent vectors in IRn contains at most n vectors.
Example 8.11. By the theorem just given, we know that the set of vectors u = (1, 2), v =
(0, 1), w = (3, 4) ∈ IR2 is necessarily linearly dependent.
Linear independence gives us another way to think of existence of solutions for systems of linear
equations:
Theorem 8.2. Let A be an n × n matrix. The homogeneous system Ax = 0 has nontrivial
solutions if and only if the columns of the coefficient matrix A are a set of linearly dependent
vectors. Conversely, if the columns of A are linearly independent then the equation Ax = 0 has
only the trivial solution x = 0.
Proof. The proof is basically the same as for Theorem 8.1 with m = n.
Theorem 8.3. Any set of n linearly independent vectors in IRn spans IRn .
Definition (Basis). A basis for a vector space V is a linearly independent set of vectors that spans V .
A basis defines a coordinate system for V , in that any x ∈ V can be expressed as a (unique) linear
combination
x = c1 v1 + c2 v2 + · · · + cn vn .
The vector c = (c1 , . . . , cn ) gives the coordinates of x relative to the basis B = {v1 , . . . , vn }.
Definition 8.6 (Standard Basis for IRn ). In IRn the vectors
e1 = (1, 0, . . . , 0),   e2 = (0, 1, 0, . . . , 0),   . . . ,   en = (0, . . . , 0, 1)
form the standard basis.
Example 8.12. The vector x = (2, 5) ∈ IR2 can be expressed as
x = 2e1 + 5e2 .
The numbers 2, 5 are the familiar coordinates of x relative to the standard basis. An alternative
basis for IR2 is defined by the vectors v1 = (1, 1), v2 = (−1, 1). The same vector x = (2, 5) can be
expressed with coordinates (c1 , c2 ) relative to the basis B = {v1 , v2 }, according to:
(2, 5) = c1 (1, 1) + c2 (−1, 1).
Solving this linear system gives c1 = 7/2, c2 = 3/2. We can think of these as the coordinates of x
relative to a coordinate system with axes in the directions of v1 and v2 . We write (x)B = (7/2, 3/2)
to denote that the coordinates are being specified relative to the basis B.
Corollary 8.2. Every set of n linearly independent vectors in IRn is a basis for IRn .
Example 8.13. Consider the set of vectors x ∈ IR3 that lie on the plane
x + 2y + 3z = 0.
We’ve seen already that such a plane through the origin is a vector space (a subspace of IR3 ); call
it V . To find a basis for V , write the general solution of the equation in vector form:
x = (x, y, z) = (−2y − 3z, y, z) = y (−2, 1, 0) + z (−3, 0, 1)
Thus any point x in the plane can be expressed as a linear combination of v1 = (−2, 1, 0) and
v2 = (−3, 0, 1). So B = {v1 , v2 } provides a basis for V ; you can think of B as a coordinate system
in V , with axes in the v1 - and v2 -directions.
Lec. #23
The number of vectors required to span a vector space V provides some measure of the “size” of
V ; we call this number the dimension of V :
Definition 8.7 (Dimension). If a vector space V has a finite basis, then the dimension of V
(written dim(V )) is the number of vectors in the basis.
There’s some ambiguity in this definition, since at first it seems that different bases might give
different dimensions. The following theorem establishes that in fact there is just one dimension:
Theorem 8.4. If {u1 , . . . , um } and {v1 , . . . , vn } are both bases for a vector space V , then m = n.
That is, any two bases for V have the same number of vectors.
Example 8.14. In IRn , a basis (e.g. the standard basis {e1 , . . . , en }) consists of n vectors, so that
dim(IRn ) = n.
Example 8.15. In a previous example we found a basis B = {v1 , v2 } for the plane V with equation
x + 2y + 3z = 0. Therefore dim(V ) = 2.
Suppose we have a vector space V and a given vector x ∈ V , expressed relative to the standard basis. Given
some other basis B = {v1 , . . . , vn }, we would like to find the coordinates of x relative to B. That
is, we wish to find the coordinates (x)B = (c1 , . . . , cn ) such that
c1 v1 + · · · + cn vn = x.
We can write a simpler formula for the c1 , . . . , cn by writing this equation in matrix form:
[ v1 v2 · · · vn ] (c1 , . . . , cn ) = x.
Letting PB = v1 · · · vn and (x)B = (c1 , . . . , cn ), this becomes simply
x = PB (x)B .
The matrix PB is called the change of basis matrix or transition matrix from the standard basis to
the basis B. We can solve algebraically for xB :
(x)B = PB−1 x.
Example 8.17. Let x = (3, 1) and B = {v1 , v2 } where v1 = (1, 1), v2 = (−1, 1). The transition
matrix for the change of basis from the standard basis to the basis B is
PB = [ v1 v2 ] =  1 −1        so    PB−1 =   1/2  1/2
                  1  1                      −1/2  1/2
Thus
x = PB (x)B   =⇒   (x)B = PB−1 x =    1/2  1/2     3      =     2
                                     −1/2  1/2     1           −1
9 Abstract Vector Spaces
Lec. #24
9.1 Abstract vector spaces and their properties
So far we have been studying only vectors in IR2 , IR3 and (generally) IRn — the so-called real
Euclidean spaces. Most of our considerations were motivated by either systems of linear equations
or coordinate geometry. But it turns out that most of what we have done can be generalized, to the
case of general vector spaces. While this new setting is more abstract, we gain greatly in that the
ideas applicable to vectors in IRn (about which you have hopefully developed some intuitive ideas
by now) become powerful tools applicable to much broader classes of mathematical objects. The
ideas developed in this section are fundamental to many mathematical fields, including advanced
calculus, differential equations, and analysis in general.
9.1.1 Definitions
Definition 9.1 (Real vector space). A real vector space V is a set of mathematical objects, called
vectors, together with two operations called addition and scalar multiplication that satisfy the
following axioms (for all x, y, z ∈ V and all scalars α, β):
1. x + y ∈ V (closure under addition)
2. x + y = y + x
3. (x + y) + z = x + (y + z)
4. there is a zero vector 0 ∈ V such that x + 0 = x for every x ∈ V
5. every x ∈ V has an additive inverse −x ∈ V such that x + (−x) = 0
6. αx ∈ V (closure under scalar multiplication)
7. α(x + y) = αx + αy
8. (α + β)x = αx + βx
9. α(βx) = (αβ)x
10. 1x = x
Recall that a scalar is a real number (thus the term real in the definition).
The axioms above are fairly obviously true for the sets IR, IR2 , IR3 and IRn in general, so each of
these is a real vector space. However, there are much more exotic vector spaces that also satisfy
these axioms. The power of the concept is that any mathematical fact that we can prove about
vector spaces in general will be true for these exotic spaces, and not just for IRn . Any theorems we
prove in this section will apply to any vector space.
To get a sense for the generality of the concept of a vector space, it will help to consider some
examples. Often the meaning of a definition becomes more clear if we consider some examples that
do not satisfy the requirements of the definition.
Example 9.1. The unit interval V = [0, 1] ⊂ IR is not a vector space. This is because V is not
closed under addition: both x = 0.5 ∈ V and y = 0.8 ∈ V , but x + y = 1.3 ∉ V . In other words, it
is possible to get outside of V using addition of elements of V .
Example 9.2. The set of non-negative integers V = {0, 1, 2, . . .} is not a vector space. This is
because V does not contain additive inverses: 1 ∈ V but −1 ∉ V .
Example 9.3. The set of integers V = {. . . , −2, −1, 0, 1, 2, . . .} is not a real vector space. This is
because V is not closed under scalar multiplication: 2 ∈ V but 0.3(2) = 0.6 ∉ V . In other words,
it is possible to get outside of V using scalar multiplication.
Example 9.4. The set of points (x, y) on the line y = mx is a vector space, i.e. V = {(x, y) ∈ IR2 :
y = mx}.
Most of the axioms are obvious. We’ll prove some of the less obvious ones. Let x = (x1 , y1 ) and
y = (x2 , y2 ) ∈ V , so y1 = mx1 and y2 = mx2 . To show that x + y ∈ V :
x + y = (x1 , mx1 ) + (x2 , mx2 ) = (x1 + x2 , mx1 + mx2 ) = (x1 + x2 , m(x1 + x2 )) ∈ V.
To show that −x ∈ V :
−x = −(x1 , mx1 ) = (−x1 , −mx1 ) = (−x1 , m(−x1 )) ∈ V.
To show that αx ∈ V :
αx = α(x1 , mx1 ) = (αx1 , αmx1 ) = (αx1 , m(αx1 )) ∈ V.
Notice that the line V in the previous example is equivalent to IR. One sometimes speaks of such
a V as being an embedding of IR in IR2 . More commonly we say V is a subspace of IR2 . Similarly
a plane (a 2-dimensional vector space) in 3 dimensions is basically an embedding of IR2 in IR3 , or
a 2-dimensional subspace of IR3 :
Now some more exotic examples.
Example 9.5. Let Pn = {polynomials P (x): P has real coefficients and deg(P ) ≤ n}. Then Pn is
a vector space. Clearly the sum of two polynomials of degree ≤ n is also a polynomial of degree
≤ n, so Pn is closed under addition. The additive identity is given by 0 = 0x^n + 0x^(n−1) + · · · +
0x + 0 ∈ Pn . If p = an x^n + an−1 x^(n−1) + · · · + a1 x + a0 then the corresponding additive inverse is
−p = −an x^n − an−1 x^(n−1) − · · · − a1 x − a0 , so that p + (−p) = 0. All other properties are at least
as obvious.
Example 9.6. The space C[a, b] = {real-valued continuous functions on the interval [a, b]} is a
real vector space. It is closed under addition because the sum of two continuous functions is also
a continuous function. The addition operation is the usual addition of functions: (f + g)(x) =
f (x) + g(x). Similarly, scalar multiplication is defined by (αf )(x) = αf (x). The zero element 0
is the zero function f (x) = 0. The additive inverse is given by (−f )(x) = −f (x). With these
identifications, the rest of the axioms are evident.
9.1.2 Subspaces
Lec. #25
To check that a given set is a vector space is tedious, requiring that we check many axioms.
Fortunately the situation is simpler if we already have a vector space V and we wonder whether
some subset U ⊂ V is a vector space:
Theorem 9.1. Let V be a vector space. A nonempty subset U ⊂ V is a vector space if the following
closure axioms hold:
1. If x, y ∈ U then x + y ∈ U .
2. If x ∈ U and α is a scalar, then αx ∈ U .
Basically, this works because U inherits all of the other axioms (algebraic properties of addition
and multiplication) from V . The only axioms that might be broken by taking a subset of V are
the closure axioms, so we need to check that these hold on U .
An important consequence is that every subspace of a vector space V automatically contains the 0
element. This is because for any x ∈ U , closure under scalar multiplication implies that 0x = 0 ∈ U
(we’ll prove this in the next section).
Example 9.7. If U ⊂ IR3 is the set of points lying in some plane that passes through the origin,
then U is a vector space (a subspace of IR3 ). To prove this, write U = {(x, y, z) : Ax+By +Cz = 0}
(points in the plane satisfy the equation of the plane in standard form). To check closure under
addition, suppose x = (x1 , y1 , z1 ) ∈ U and y = (x2 , y2 , z2 ) ∈ U . To prove that x + y ∈ U , write
x + y = (x1 + x2 , y1 + y2 , z1 + z2 ).
Now show that the coordinates of this vector satisfy the equation of the plane:
A(x1 + x2 ) + B(y1 + y2 ) + C(z1 + z2 ) = (Ax1 + By1 + Cz1 ) + (Ax2 + By2 + Cz2 ) = 0 + 0 = 0,
since x ∈ U and y ∈ U . Hence x + y ∈ U .
Example 9.8. Let V = C[0, 1] and U = {f ∈ V : ∫0^1 f (x) dx = 0}. Then U is a subspace of V :
if f, g ∈ U then ∫0^1 (f + g)(x) dx = ∫0^1 f (x) dx + ∫0^1 g(x) dx = 0, so f + g ∈ U ; and for any scalar α,
∫0^1 (αf )(x) dx = α ∫0^1 f (x) dx = 0, so αf ∈ U .
Example 9.9. Let V = C[0, 1] and U = {f ∈ V : f (0) = 0}. Then U is a subspace of V .
Lec. #26
Example 9.10. Let V = C[0, 1] and U = {f ∈ V : f ′ (x) exists and is continuous}. Then U is a
subspace of V .
Given some set U that we wish to prove is a vector space, sometimes it's possible to recognize U as
the intersection of two sets that have already been proved to be vector spaces. Then the following
theorem guarantees that U also is a vector space.
Theorem 9.2. If U1 and U2 are both subspaces of a vector space V , then the intersection U1 ∩ U2
is also a subspace of V .
Example 9.11. We've already proved that V = C[0, 1] = {continuous functions on [0, 1]} is a
vector space. We've also proved that U1 = {f ∈ V : ∫0^1 f (x) dx = 0} and U2 = {f ∈ V :
f ′ (x) is continuous} are both subspaces of V . From the previous theorem it follows that the
subset U = {f ∈ V : f has a continuous first derivative and ∫0^1 f (x) dx = 0} is also a subspace of V ,
since we recognize that U = U1 ∩ U2 .
Example 9.12. Let u = (1, 2, 3), v = (2, 0, 0) ∈ IR3 . Then the plane spanned by u and v is a
subspace of IR3 .
Example 9.13. Consider V = C[0, 1], and let f (x) = 1, g(x) = x, and h(x) = x2 . Then
U = span{f, g, h} is the set of functions of the form A + Bx + Cx2 for some coefficients A, B, C
(i.e. U is the set of quadratic functions). Then U is a subspace of V . In fact U is identically the
space P2 from example 9.5.
Numerous logical consequences follow from the axioms defining a vector space. For example:
Theorem 9.4. Let V be a vector space. Then for any x ∈ V and any scalar α,
1. α0 = 0
2. 0x = 0
3. (−1)x = −x.
Proof. 1. Using the fact that 0 + 0 = 0 and the distributive axiom,
α0 = α(0 + 0) = α0 + α0.
Adding −(α0) to both sides we get 0 = α0 + (α0 + (−(α0))) = α0 + 0 = α0, as required.
2. The proof is basically the same as above. Start with 0+0 = 0, so that 0x = (0+0)x = 0x+0x.
Adding −0x to both sides we get 0 = 0x + (0x + (−0x)) = 0x + 0 = 0x.
It may seem that we’re being inordinately fussy in carrying out the steps of the proof, but there are
important reasons. Notice that every step is justified by one of the defining axioms; no algebraic
“shortcuts” are taken, because it isn’t apparent what shortcuts might be valid. Everything must
follow directly from the defining axioms.
Every vector space is equipped with an “addition” operation we label “+”, and a “zero element”
we label “0”. In doing so we’re re-using some familiar mathematical symbols, but they need not
correspond to addition and the number 0 as we have come to know them. So no algebraic operations
with them can be carried out without justification either by the defining axioms or by some theorem
that follows from them. This is typical of axiomatic reasoning that is common throughout advanced
mathematics.
The usual laws of algebra lead to the following properties of the scalar product:
1. x · y = y · x
2. x · (y + z) = x · y + x · z
3. (ax) · y = a(x · y)
10.2 Length of a vector
The length (or magnitude) of a vector v = (x, y) ∈ IR2 is |v| = √(x^2 + y^2 ); this follows from the
Pythagorean Theorem. More generally, if v = (x1 , x2 , . . . , xn ) is any vector in IRn then its magnitude is
|v| = √(x1^2 + x2^2 + · · · + xn^2 ).
From the definition it follows that for any vector v and scalar c,
|cv| = |c||v|.
Example 10.2. A wire is extended between the tops of two flag poles, whose tops are located at
points with coordinates (3, 4, 1) and (2, 2, 2) in IR3 . Find the length of the wire.
Theorem (Triangle Inequality). For any vectors u, v ∈ IRn ,
|u + v| ≤ |u| + |v|.
This theorem states the intuitive fact that a straight line is the shortest distance between two
points.
Proof. Use the algebraic properties of the dot product (and the Cauchy-Schwarz inequality, |u·v| ≤
|u||v|, which we haven’t discussed):
|u + v|2 = (u + v) · (u + v)
= u · u + 2u · v + v · v
= |u|2 + 2u · v + |v|2
≤ |u|2 + 2|u · v| + |v|2
≤ |u|2 + 2|u||v| + |v|2
= (|u| + |v|)2
Given any nonzero vector v, we can form a unit vector having the same direction by dividing v by
its length: u = v/|v|. Then u points in the same direction as v, but has unit magnitude. This
is useful because it makes it easy to form other vectors, of any length we want, that point in the
same direction as v; e.g. 5u is a vector of length 5 in the same direction as v.
Example 10.3. You walk from the origin along a straight line through the point v = (2, 3). If you
keep walking along this line until your distance from the origin is 10, what is your final position
vector?
Definition 10.3. The standard unit vectors are defined as i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1).
These point in the directions of the coordinate axes.
Example 10.4. Write the vector v = (4, −3, 7) as a combination of the standard unit vectors.
The angle θ between two nonzero vectors u and v is defined by
cos θ = (u · v) / (|u| |v|).
Proof. (Special case, in 2- and 3-D, that this agrees with the usual geometric angle.) This follows
from the cosine law, which in terms of vectors can be written:
|u − v|^2 = |u|^2 + |v|^2 − 2|u||v| cos θ.
Rewriting the left-hand side using algebraic properties of the dot product:
|u − v|^2 = (u − v) · (u − v)
         = u · u − 2u · v + v · v
         = |u|^2 − 2u · v + |v|^2 ,
and comparing the two expressions for |u − v|^2 gives u · v = |u||v| cos θ, which justifies the definition.
Angles between vectors really only make geometrical sense in IR2 and IR3 ; nevertheless, this defini-
tion extends the notion of “angle” to higher dimensions.
Example 10.5. Guy wires are extended from the origin to the tops of two flag poles whose tops
are located at points with coordinates (3, 4, 1) and (2, 2, 2) in IR3 . Find the angle between the wires
at the origin.
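(A numpy sketch for this example, using the angle formula cos θ = u · v/(|u||v|); the code is our own addition.)

    import numpy as np

    u, v = np.array([3, 4, 1]), np.array([2, 2, 2])
    cos_theta = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.degrees(np.arccos(cos_theta)))   # roughly 25 degrees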
10.5 Orthogonality
In 2- and 3-D, orthogonality of vectors is equivalent to their being perpendicular, i.e. the angle θ
between them is 90◦ . From the definition of angle between vectors:
(u · v)/(|u||v|) = cos 90◦ = 0   ⇐⇒   u · v = 0.
Definition. Vectors u and v are said to be orthogonal if u · v = 0.
Example 10.6. Let u = (1, 2, 3), v = (3, 2, 1), w = (−3, 0, 1). Which pairs of vectors (if any) are
orthogonal?
Again, this definition only makes geometrical sense in 2- and 3-D, but it applies in higher dimensions
as well. Basically, vectors are orthogonal if they point in independent directions.
Definition 10.6. Two vectors u, v are parallel if the angle θ between them is 0 or 180◦ ; that is, if
cos θ = (u · v)/(|u||v|) = ±1.
Lec. #28
Theorem 10.2 (Pythagoras). Vectors u, v are orthogonal if and only if |u + v|2 = |u|2 + |v|2 .
Proof. Using the algebraic properties of the dot product:
|u + v|^2 = (u + v) · (u + v)
         = u · u + 2u · v + v · v
         = |u|^2 + 2u · v + |v|^2 ,
which equals |u|^2 + |v|^2 if and only if u · v = 0, i.e. if and only if u and v are orthogonal.
Given vectors u and v in IRn , it will be useful to have a method for resolving u into two components,
one parallel to v and the other perpendicular to v:
u = u∥ + u⊥ .
The (signed) length of the parallel component is |u| cos θ = (u · v)/|v|. Now we can form the vector
u∥ since we know both its length (by the formula above) and its direction (v/|v|). The result u∥ is
called the orthogonal projection of u onto v:
Definition 10.7. Let u, v be nonzero vectors. The orthogonal projection of u onto v, denoted by
projv (u), is the vector
projv (u) = ((u · v)/|v|) (v/|v|) = ((u · v)/|v|^2 ) v = ((u · v)/(v · v)) v.
Example 10.7. Find the orthogonal projection of u = (3, 4) onto v = (5, 2).
Example 10.8. Given u = (3, −1, 4) and v = (4, 1, 1) find projv (u).
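(The projection formula translates directly into code; this numpy sketch, our own addition, checks it on Example 10.7.)

    import numpy as np

    def proj(u, v):
        """Orthogonal projection of u onto v."""
        return (u @ v) / (v @ v) * v

    u, v = np.array([3.0, 4.0]), np.array([5.0, 2.0])
    p = proj(u, v)
    print(p)              # (u.v)/(v.v) v = (23/29)(5, 2)
    print((u - p) @ v)    # essentially 0: the remainder u - p is orthogonal to v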
In two dimensions the standard form for the equation of a line is y = mx + b. It turns out this
doesn’t generalize well to describing lines in higher dimensions. For this purpose it is better to
write the equation of the line in parametric form.
A line through the point r ∈ IR2 parallel to v can be represented by the parametric equation
x = r + tv, t ∈ IR
Example 11.1. To find the parametric form of the line through the points (3, 2, 1) and (1, −1, 2),
we first find the direction vector v = (3, 2, 1) − (1, −1, 2) = (2, 3, −1), so the line is
x = (3, 2, 1) + t(2, 3, −1), t ∈ IR.
We can readily find other points on the line, e.g. taking t = 2 we get x = (3 + 2 · 2, 2 + 3 · 2, 1 − 2) =
(7, 8, −1).
11.1.2 Symmetric form
By solving for the parameter t the equation of a line can be written in symmetric form. Following
the previous example, we get
t = (x − 3)/2,        t = (y − 2)/3,        t = 1 − z,
so every point on the line must satisfy the equation
(x − 3)/2 = (y − 2)/3 = 1 − z.
From the first equality we have
3(x − 3) = 2(y − 2) =⇒ 3x − 2y = 5.
This gives the equation of the line as projected onto the xy-plane (i.e. the top-down view). We can get
the projection onto the xz- or yz-plane by considering other pairs of equalities.
Example 11.2. Consider two lines: one through the points (0, 1, 2) and (1, 2, 3), the other through
the points (−2, 0, 1) and (1, 1, 5). Do these lines intersect?
It helps to first express both lines in parametric form. The direction vectors are (1, 1, 1) and (3, 1, 4),
respectively, so in vector form the lines have parametric equations:
x = (0, 1, 2) + t1 (1, 1, 1)
x = (1, 1, 5) + t2 (3, 1, 4).
Note that we need two parameters t1 and t2 (one to parametrize each line). The lines intersect if
there they have a point in common; that is, if there are numbers t1 , t2 such that both equations
above describe the same point. Component-wise, this condition can be expressed as a system of
linear equations:
0 + t1 = 1 + 3t2 (x-coordinate)
1 + t1 = 1 + t2 (y-coordinate)
2 + t1 = 5 + 4t2 (z-coordinate).
The augmented matrix for this system is:
1 −3 1                 1 0 0
1 −1 0    −−RREF−→     0 1 0
1 −4 3                 0 0 1
The last row asserts 0 = 1, so the system is inconsistent: there are no values of t1 and t2 satisfying
all three equations, and hence the lines do not intersect.
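(The same conclusion can be checked with sympy, our own addition: asking for parameter values at which the two lines meet returns no solutions.)

    import sympy as sp

    t1, t2 = sp.symbols('t1 t2')
    eqs = [sp.Eq(0 + t1, 1 + 3*t2),
           sp.Eq(1 + t1, 1 + t2),
           sp.Eq(2 + t1, 5 + 4*t2)]
    print(sp.solve(eqs, [t1, t2]))   # [] : the system is inconsistent, so the lines do not intersect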
11.2 Planes
Lec. #31
11.2.1 Standard form
We have already seen how a linear equation in three variables describes a plane, e.g. the set of points
(x, y, z) ∈ IR3 that satisfy the equation
3x + 2y − 5z = 8.
all lie in a plane. In general, any plane in IR3 can be described by an equation in standard form:
ax + by + cz = d,
where a, b, c, d are constants. For the example above we can read off the intercepts with the
coordinate axes:
y = z = 0 =⇒ x = 8/3 (x-intercept)
x = z = 0 =⇒ y = 4 (y-intercept)
x = y = 0 =⇒ z = −8/5 (z-intercept)
Another way to describe a plane is to parametrize it much as we did for lines in the previous section.
To obtain a parametric form for the equation of a plane, consider its equation in standard form as
a system of linear equations (a somewhat trivial system consisting of just one equation), with one
basic variable (say, x) and two free variables (y and z).
For our example above (3x + 2y − 5z = 8 in standard form) the general solution is
x = 8/3 − (2/3)y + (5/3)z
where y, z are free; the free variables are parameters that parametrize the set of all points in the
plane.
In general the parametric form of the equation of any plane is
x = r + su + tv, s, t ∈ IR.
In our example r = (8/3, 0, 0), u = (−2/3, 1, 0) and v = (5/3, 0, 1). The plane consists exactly of
the set of points x of the form x = r + su + tv where s, t can be any real numbers. The vector r is
some point in the plane; vectors u, v specify two independent directions in the plane, along which
we can travel from r to arrive at any other point in the plane. It makes sense that there should be
two independent directions (and two corresponding parameters), since a plane is a 2-dimensional
surface.
If we fix s = 0 then the equation reduces to x = r + tv, which is the equation of a line through r
with direction v. This line lies in the plane. Fixing any other value of s we get a different (parallel)
line, through a different point. We get another collection of lines in the plane if we hold t fixed and
allow s to vary. Thus we can see that the plane consists of collections of parallel lines.
In standard form the equation of a plane doesn’t give us much insight into the plane’s orientation.
We might wonder, for example, what direction does the plane face?
A natural way to describe the orientation of a plane is by its normal vector, i.e. the direction of any
vector perpendicular to the plane. This also leads to a natural form for the equation of the plane.
Given any point r in the plane and the normal vector n, we have that for any point x in the plane,
the vector x − r is perpendicular to n.
[need a picture here]
Using the dot product to express orthogonality, we have:
(x − r) · n = 0.
This is the equation of the plane in normal form. The plane consists of all points x that satisfy
this equation.
Example 11.3. The normal form of the equation of the plane that passes through the point (2, 3, 1)
and is oriented with normal vector (−1, −1, 1) is
[x − (2, 3, 1)] · (−1, −1, 1) = 0.
Expanding this in terms of the components of x = (x, y, z) we have
[(x, y, z) − (2, 3, 1)] · (−1, −1, 1) = 0 =⇒ (x − 2, y − 3, z − 1) · (−1, −1, 1) = 0
=⇒ −(x − 2) − (y − 3) + (z − 1) = 0
=⇒ −x − y + z = −4,
which gives the equation of the plane in standard form.
Example 11.4. To find the normal vector for a plane expressed in standard form, e.g.
3x + 2y − 5z = 8,
we need only rearrange the equation:
3x + 2y − 5z − 8 = 0 =⇒ 3(x − 0) + 2(y − 4) − 5(z − 0) = 0
=⇒ [(x, y, z) − (0, 4, 0)] · (3, 2, −5) = 0
=⇒ (x − r) · n = 0
with r = (0, 4, 0) and n = (3, 2, −5). Notice that the components of n are exactly the coefficients
from the equation in standard form. This is true in general, so we can always easily find the normal
vector n directly from the equation in standard form.
12 Linear Transformations