LAChapter 1

Linear algebra book by Xiao-Qing Jin (a professor at the University of Macau), Chapter 1.


Chapter 1

Linear Systems and Matrices

“No beginner’s course in mathematics can do without linear algebra.”


— Lars Gårding

“Matrices act. They don’t just sit there.”


— Gilbert Strang

Solving linear systems (systems of linear equations) is the most important problem of linear
algebra, and possibly of applied mathematics as well. The information in a linear system
is usually arranged into a rectangular array called a "matrix". Matrices are particularly
important in developing computer programs for solving large linear systems, because
computers are well suited to managing numerical data in arrays. Moreover, matrices are not only
a tool for solving linear systems but also mathematical objects in their own right. In
fact, matrix theory has a variety of applications in science, engineering, and mathematics.
We therefore begin our study of linear systems and matrices in this first chapter.

1.1 Introduction to Linear Systems and Matrices

Let R denote the set of real numbers. We now introduce linear equations, linear systems,
and matrices.

1.1.1 Linear equations and linear systems

We consider
a1 x1 + a2 x2 + · · · + an xn = b,


where ai ∈ R (i = 1, 2, . . . , n) are coefficients, xi (i = 1, 2, . . . , n) are variables (unknowns),
n is a positive integer, and b ∈ R is a constant. An equation of this form is called a linear
equation, in which all variables occur to the first power. When b = 0, the linear equation
is called a homogeneous linear equation. A sequence of numbers s1 , s2 , . . . , sn is called
a solution of the equation if the substitution x1 = s1 , x2 = s2 , . . . , xn = sn satisfies

a1 s1 + a2 s2 + · · · + an sn = b.

The set of all solutions of the equation is called the solution set of the equation.

Throughout the book, we use examples to make our points clear.

Example. We consider the following linear equations:

(a) x + y = 1.

(b) x + y + z = 1.

It is easy to see that the solution set of (a) is a line in the xy-plane and the solution set of (b)
is a plane in xyz-space.

We next consider the following m linear equations in n variables:




$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\qquad\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m,
\end{cases} \tag{1.1}$$

where aij ∈ R (i = 1, 2, . . . , m; j = 1, 2, . . . , n) are coefficients, xj (j = 1, 2, . . . , n) are
variables, and bi ∈ R (i = 1, 2, . . . , m) are constants. A system of linear equations in this
form is called a linear system. A sequence of numbers s1 , s2 , . . . , sn is called a solution
of the system if x1 = s1 , x2 = s2 , . . . , xn = sn is a solution of each equation in the system.
A linear system is said to be consistent if it has at least one solution, and inconsistent
if it has no solution.

Example. Consider the following linear system


$$\begin{cases}
a_{11}x + a_{12}y = b_1 \\
a_{21}x + a_{22}y = b_2.
\end{cases}$$

The graphs of these equations are lines, call them l1 and l2 . There are three possible cases
for the lines l1 and l2 in the xy-plane. See Figure 1.1.

• When l1 and l2 are parallel, there is no solution of the system.



• When l1 and l2 intersect at only one point, there is exactly one solution of the system.

• When l1 and l2 coincide, there are infinitely many solutions of the system.

Figure 1.1: The lines l1 and l2 in the xy-plane — parallel (no solution), intersecting at
one point (one solution), and coincident (infinitely many solutions).

1.1.2 Matrices

The term matrix was first introduced by the British mathematician James Sylvester in the 19th
century. Another British mathematician, Arthur Cayley, developed the basic algebraic operations
on matrices in the 1850s. Since then, matrices have become an indispensable language of
mathematics and its applications.

Definition. A matrix is a rectangular array of numbers. The numbers in the array are
called the entries in the matrix.

Remark. The size of a matrix is described in terms of the number of rows and columns it
contains. Usually, a matrix with m rows and n columns is called an m × n matrix. If A
is an m × n matrix, then we denote the entry in row i and column j of A by the symbol
(A)ij = aij . Moreover, a matrix with real entries will be called a real matrix and the set of
all m × n real matrices will be denoted by the symbol R^{m×n}. For instance, a matrix A in
R^{m×n} can be written as

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

where aij ∈ R for any i and j. When compactness of notation is desired, the preceding
matrix can be written as A = [aij ].

In particular, if A ∈ R^{1×1}, then A = a11 ∈ R.


We now introduce some important matrices with special sizes. A row matrix is a
general 1 × n matrix a given by

a = [a1 , a2 , . . . , an ] ∈ R^{1×n}.
A column matrix is a general m × 1 matrix b given by
 
$$\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \in \mathbb{R}^{m\times 1}.$$

A square matrix is an n × n matrix A given by


 
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \in \mathbb{R}^{n\times n}. \tag{1.2}$$

The main diagonal of the square matrix A is the set of entries a11 , a22 , . . . , ann in (1.2).

For linear system (1.1), we can write it briefly in the following matrix form

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{bmatrix},$$

which is called the augmented matrix of (1.1).

Remark. When we construct an augmented matrix associated with a given linear system,
the unknowns must be written in the same order in each equation and the constants must
be on the right.

1.1.3 Elementary row operations

In order to solve a linear system efficiently, we replace the given system with its augmented
matrix and then solve the same system by operating on the rows of the augmented matrix.
There are three elementary row operations on matrices defined as follows:

(1) Interchange two rows.



(2) Multiply a row by a nonzero number.

(3) Add a multiple of one row to another row.

By using elementary row operations, we can always reduce the augmented matrix of a given
system to a simpler augmented matrix from which the solution of the corresponding system
is evident. In fact, elementary row operations do not change the solution set of the given
system. See the following example.

Example. Solve the following linear system




$$\begin{cases}
x + y + z = 3 \\
-x + y - z = 3 \\
2x + 2y - 3z = 1.
\end{cases}$$

Solution. In the left column below we solve a system by operating on the equations in the
system, and in the right column we use the corresponding operations on the rows of its
augmented matrix.
  
$$\begin{cases} x + y + z = 3 \\ -x + y - z = 3 \\ 2x + 2y - 3z = 1 \end{cases} \qquad \begin{bmatrix} 1 & 1 & 1 & 3 \\ -1 & 1 & -1 & 3 \\ 2 & 2 & -3 & 1 \end{bmatrix}.$$

Adding the first equation to the second and adding −2 times the first equation to the third
(correspondingly, adding the first row to the second and adding −2 times the first row to
the third) give

$$\begin{cases} x + y + z = 3 \\ 2y = 6 \\ -5z = -5 \end{cases} \qquad \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 2 & 0 & 6 \\ 0 & 0 & -5 & -5 \end{bmatrix}.$$

Multiplying the second equation (row) by 1/2 and multiplying the third equation (row) by
−1/5 give

$$\begin{cases} x + y + z = 3 \\ y = 3 \\ z = 1 \end{cases} \qquad \begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

Adding −1 times the second equation (row) to the first and adding −1 times the third
equation (row) to the first give

$$\begin{cases} x = -1 \\ y = 3 \\ z = 1 \end{cases} \qquad \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

We immediately have

x = −1, y = 3, z = 1.
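The three row operations are easy to mechanize. Below is a small sketch in plain Python (not from the text; `Fraction` keeps the arithmetic exact, and the helper names are our own) that replays the operations of this example on the augmented matrix:

```python
from fractions import Fraction

def row_add(M, dst, src, k):
    """Add k times row src to row dst (elementary operation 3)."""
    M[dst] = [a + k * b for a, b in zip(M[dst], M[src])]

def row_scale(M, i, k):
    """Multiply row i by the nonzero number k (elementary operation 2)."""
    M[i] = [k * a for a in M[i]]

# Augmented matrix of the example system.
M = [[Fraction(v) for v in row] for row in
     [[1, 1, 1, 3], [-1, 1, -1, 3], [2, 2, -3, 1]]]

row_add(M, 1, 0, 1)                # R2 <- R2 + R1
row_add(M, 2, 0, -2)               # R3 <- R3 - 2*R1
row_scale(M, 1, Fraction(1, 2))    # R2 <- (1/2)*R2
row_scale(M, 2, Fraction(-1, 5))   # R3 <- (-1/5)*R3
row_add(M, 0, 1, -1)               # R1 <- R1 - R2
row_add(M, 0, 2, -1)               # R1 <- R1 - R3

print(M)   # last column now holds the solution x = -1, y = 3, z = 1
```

Each call corresponds to one of the steps carried out by hand above, and the final matrix is the same simplified augmented matrix.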

1.2 Gauss-Jordan Elimination

In this section, we develop a method called Gauss-Jordan elimination [1, p. 15] for solving
linear systems. Gauss-Jordan elimination and its variants are among the most frequently
used algorithms in scientific computing.

1.2.1 Reduced row-echelon form

In the example of Subsection 1.1.3, we solved the given linear system by reducing the
augmented matrix to

$$\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 1 \end{bmatrix},$$

from which the solution of the system was evident. This is an example of a matrix that is
in reduced row-echelon form. We therefore give the following definition.

Definition. A matrix is said to be in reduced row-echelon form if it satisfies the following
conditions.

(i) The rows that consist entirely of zeros, called zero rows, are grouped together at the
bottom of the matrix.

(ii) If a row does not consist entirely of zeros, then the first nonzero number in the row is
a 1. We call this a leading 1.

(iii) For two successive rows that both contain leading 1’s, the leading 1 in the higher row
occurs farther to the left than the leading 1 in the lower row.

(iv) Each column that contains a leading 1 has zeros in all its other entries.

Example. The following example is in reduced row-echelon form:


 
$$\begin{bmatrix} 1 & 0 & 3 & 0 & 5 \\ 0 & 1 & 0 & 0 & 3 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Remark. A matrix having properties (i), (ii), (iii), but not necessarily (iv), is said to be in
row-echelon form.
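The four conditions can be checked mechanically. The following sketch (our own helper, not part of the text) tests whether a matrix, given as a list of rows, is in reduced row-echelon form:

```python
def is_rref(M):
    """Check conditions (i)-(iv) of the reduced row-echelon form definition."""
    leads = []                              # (row index, column) of each leading 1
    seen_zero_row = False
    for i, row in enumerate(M):
        nonzero = [j for j, v in enumerate(row) if v != 0]
        if not nonzero:
            seen_zero_row = True            # a zero row
            continue
        if seen_zero_row:                   # (i) zero rows grouped at the bottom
            return False
        j = nonzero[0]
        if row[j] != 1:                     # (ii) first nonzero entry is a leading 1
            return False
        if leads and j <= leads[-1][1]:     # (iii) leading 1's move to the right
            return False
        leads.append((i, j))
    return all(M[i][j] == 0                 # (iv) zeros elsewhere in leading columns
               for _, j in leads
               for i in range(len(M)) if (i, j) not in leads)

print(is_rref([[1, 0, 3, 0, 5],
               [0, 1, 0, 0, 3],
               [0, 0, 0, 1, 0],
               [0, 0, 0, 0, 0]]))    # the example matrix passes all four conditions
```

Dropping condition (iv) from the helper would give a test for row-echelon form instead.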

1.2.2 Gauss-Jordan elimination

Gauss-Jordan elimination is a standard technique for solving linear systems. It is a
step-by-step elimination procedure that reduces the augmented matrix of a given linear
system to reduced row-echelon form, from which the solution set of the system can be found
by inspection. We illustrate the idea with the following example.

Example. We solve the following system




$$\begin{cases}
-\,3x_2 + 7x_5 = 15 \\
2x_1 + 6x_2 + 6x_3 + 4x_4 + 2x_5 = 28 \\
2x_1 + 11x_2 + 6x_3 + 4x_4 - 9x_5 = 5.
\end{cases}$$

The augmented matrix of the system is given by


 
$$\begin{bmatrix} 0 & -3 & 0 & 0 & 7 & 15 \\ 2 & 6 & 6 & 4 & 2 & 28 \\ 2 & 11 & 6 & 4 & -9 & 5 \end{bmatrix}.$$

Now, by using the elementary row operations, we are going to reduce the matrix to reduced
row-echelon form.

Step 1. Interchange the top row with another row, if necessary, to bring a nonzero entry
to the top of the leftmost column that does not consist entirely of zeros:
 
$$\xrightarrow{\text{interchange 1st and 2nd rows}} \begin{bmatrix} 2 & 6 & 6 & 4 & 2 & 28 \\ 0 & -3 & 0 & 0 & 7 & 15 \\ 2 & 11 & 6 & 4 & -9 & 5 \end{bmatrix}.$$

Step 2. If the entry that is now at top of the column found in Step 1 is a ̸= 0, multiply
the first row by 1/a in order to introduce the leading 1:
 
$$\xrightarrow{(1/2)\times\text{1st row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & -3 & 0 & 0 & 7 & 15 \\ 2 & 11 & 6 & 4 & -9 & 5 \end{bmatrix}.$$

Step 3. Add suitable multiples of the top row to the rows below so that all entries below
the leading 1 become zeros:
 
$$\xrightarrow{\text{3rd row} + (-2)\times\text{1st row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & -3 & 0 & 0 & 7 & 15 \\ 0 & 5 & 0 & 0 & -11 & -23 \end{bmatrix}.$$

Step 4. Now cover the top row of the matrix and begin again with Step 1 applied to the
submatrix that remains. Continue in this way until the entire matrix is in row-echelon
form:

$$\xrightarrow{(-1/3)\times\text{2nd row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & 1 & 0 & 0 & -\frac{7}{3} & -5 \\ 0 & 5 & 0 & 0 & -11 & -23 \end{bmatrix}$$

$$\xrightarrow{\text{3rd row} + (-5)\times\text{2nd row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & 1 & 0 & 0 & -\frac{7}{3} & -5 \\ 0 & 0 & 0 & 0 & \frac{2}{3} & 2 \end{bmatrix}$$

$$\xrightarrow{(3/2)\times\text{3rd row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & 1 & 0 & 0 & -\frac{7}{3} & -5 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix}.$$

The entire matrix is now in row-echelon form. To find the reduced row-echelon
form we need the following additional step.

Step 5. Beginning with the last nonzero row and working upward, add suitable multiples
of each row to the rows above to introduce zeros above the leading 1’s:
 
$$\xrightarrow{\text{2nd row} + (7/3)\times\text{3rd row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 1 & 14 \\ 0 & 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix}$$

$$\xrightarrow{\text{1st row} + (-1)\times\text{3rd row}} \begin{bmatrix} 1 & 3 & 3 & 2 & 0 & 11 \\ 0 & 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix}$$

$$\xrightarrow{\text{1st row} + (-3)\times\text{2nd row}} \begin{bmatrix} 1 & 0 & 3 & 2 & 0 & 5 \\ 0 & 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix}.$$

The last matrix is in reduced row-echelon form.

The corresponding system is




$$\begin{cases} x_1 + 3x_3 + 2x_4 = 5 \\ x_2 = 2 \\ x_5 = 3. \end{cases}$$

Since x1 , x2 , and x5 correspond to leading 1's in the reduced row-echelon form of the
augmented matrix, we call them leading variables. The remaining variables x3 and x4 are
called free variables. Solving for the leading variables yields


$$\begin{cases} x_1 = -3x_3 - 2x_4 + 5 \\ x_2 = 2 \\ x_5 = 3. \end{cases}$$

Setting x3 = s and x4 = t, we therefore obtain the solution set of the system,




$$\begin{cases} x_1 = -3s - 2t + 5 \\ x_2 = 2 \\ x_5 = 3, \end{cases}$$

where the parameters s and t can take arbitrary values.
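Steps 1–5 can be collected into a single routine. The sketch below (plain Python with exact `Fraction` arithmetic; the function name is ours, not the text's) reduces the augmented matrix of this example to reduced row-echelon form:

```python
from fractions import Fraction

def gauss_jordan(M):
    """Reduce M to reduced row-echelon form by Steps 1-5."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0                                     # next pivot row
    for c in range(cols):
        # Step 1: find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Step 2: scale the pivot row to introduce a leading 1.
        M[r] = [v / M[r][c] for v in M[r]]
        # Steps 3 and 5: clear every other entry in the pivot column.
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[0, -3, 0, 0, 7, 15],
       [2, 6, 6, 4, 2, 28],
       [2, 11, 6, 4, -9, 5]]
print(gauss_jordan(aug))
```

Its output, [[1, 0, 3, 2, 0, 5], [0, 1, 0, 0, 0, 2], [0, 0, 0, 0, 1, 3]], matches the reduced row-echelon form obtained by hand above; clearing the pivot columns both below and above the leading 1's folds Step 5 into the forward pass.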



1.2.3 Homogeneous linear systems

A linear system is said to be homogeneous if the constant terms are all zero. Consider

$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0 \\
\qquad\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0.
\end{cases}$$

Obviously, x1 = x2 = · · · = xn = 0 is a solution of the system, which is called the trivial
solution. Any nonzero solution is called a nontrivial solution. For nontrivial solutions, we
have the following theorem.

Theorem 1.1. A homogeneous linear system has infinitely many solutions if there are more
variables than equations.

Proof. Let

$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m + \cdots + a_{1n}x_n = 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m + \cdots + a_{2n}x_n = 0 \\
\qquad\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mm}x_m + \cdots + a_{mn}x_n = 0,
\end{cases}$$

where m < n. By using elementary row operations, one can obtain the reduced row-echelon
form of the augmented matrix of the system. It follows from the reduced row-echelon form
that the corresponding system has the form

$$\begin{cases}
x_{k_1} + \sum(\,) = 0 \\
x_{k_2} + \sum(\,) = 0 \\
\qquad\vdots \\
x_{k_r} + \sum(\,) = 0,
\end{cases}$$

where k1 < k2 < · · · < kr are numbers in the set {1, 2, . . . , m} and $\sum(\,)$ denotes sums that
involve the n − r free variables. We remark that r ≤ m < n and usually k1 = 1. Finally, we
obtain

$$\begin{cases}
x_{k_1} = -\sum(\,) \\
x_{k_2} = -\sum(\,) \\
\qquad\vdots \\
x_{k_r} = -\sum(\,),
\end{cases}$$

which implies that the system has infinitely many solutions.

Example. We consider the following linear system with 6 variables and 4 equations,


$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + a_{15}x_5 + a_{16}x_6 = 0 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4 + a_{25}x_5 + a_{26}x_6 = 0 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4 + a_{35}x_5 + a_{36}x_6 = 0 \\
a_{41}x_1 + a_{42}x_2 + a_{43}x_3 + a_{44}x_4 + a_{45}x_5 + a_{46}x_6 = 0.
\end{cases}$$
The augmented matrix A of the system is
 
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16} & 0 \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} & a_{26} & 0 \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} & a_{36} & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} & a_{46} & 0 \end{bmatrix} \in \mathbb{R}^{4\times 7}.$$
By using elementary row operations, if the reduced row-echelon form of A is obtained as
 
$$\begin{bmatrix} 1 & b_{12} & 0 & 0 & 0 & b_{16} & 0 \\ 0 & 0 & 1 & b_{24} & 0 & b_{26} & 0 \\ 0 & 0 & 0 & 0 & 1 & b_{36} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$
then the corresponding system is


$$\begin{cases} x_1 + b_{12}x_2 + b_{16}x_6 = 0 \\ x_3 + b_{24}x_4 + b_{26}x_6 = 0 \\ x_5 + b_{36}x_6 = 0. \end{cases}$$
We therefore have

$$\begin{cases} x_1 = -b_{12}x_2 - b_{16}x_6 \\ x_3 = -b_{24}x_4 - b_{26}x_6 \\ x_5 = -b_{36}x_6. \end{cases}$$
From the proof of Theorem 1.1, it follows that r = 3, k1 = 1, k2 = 3, and k3 = 5. There are
three free variables, namely x2 , x4 , and x6 . Thus, the system has infinitely many solutions.

1.3 Matrix Operations

Matrices appear in many contexts other than as augmented matrices for linear systems.
In this section, we begin our study of matrix theory by giving some basic definitions of
the subject. We also introduce some operations on matrices and discuss their fundamental
properties.

1.3.1 Operations on matrices

Now we develop an arithmetic of matrices, which includes the sum, difference, and product
of matrices. We have the following definition of operations on matrices.

Definition.

(i) Equality of matrices: Two matrices A = [aij ] and B = [bij ] are said to be equal, denoted
by A = B, if they have the same size and aij = bij for all i, j.

(ii) Sum and difference: Let A = [aij ] and B = [bij ] have the same size. Then A + B is
a matrix with the entries given by (A + B)ij := aij + bij for all i, j, and A − B is a
matrix with the entries given by (A − B)ij := aij − bij for all i, j.

(iii) Scalar multiplication: Let A = [aij ] and c be any scalar. Then cA is a matrix with the
entries given by (cA)ij := caij for all i, j.
(iv) Linear combination of matrices: $\sum_{i=1}^{s} c_i A^{(i)}$, where $A^{(i)}$ (1 ≤ i ≤ s) are matrices of the
same size and ci (1 ≤ i ≤ s) are scalars.

(v) Matrix product: Let A = [aij ] be a general m × r matrix and B = [bij ] be a general
r × n matrix. Then the product of A and B is an m × n matrix denoted by
 
$$AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ir} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mr} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ b_{21} & \cdots & b_{2j} & \cdots & b_{2n} \\ \vdots & & \vdots & & \vdots \\ b_{r1} & \cdots & b_{rj} & \cdots & b_{rn} \end{bmatrix}$$

with entries (AB)ij defined by

$$(AB)_{ij} := a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ir}b_{rj} = \sum_{k=1}^{r} a_{ik}b_{kj}$$

for all 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Remark. In the definition of matrix product, the number of columns of the first factor A
must be the same as the number of rows of the second factor B in order to form the product
AB. If the condition is not satisfied, then the product is undefined. Even if A and B are
both n × n matrices, we usually have AB ̸= BA.

Example 1. Consider the matrices


 
$$A = \begin{bmatrix} 1 & 0 & 3 \\ 1 & -2 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & 1 & 0 & 4 \\ 0 & -3 & -1 & 2 \\ 1 & 2 & 1 & 0 \end{bmatrix}.$$

Since A is a 2 × 3 matrix and B is a 3 × 4 matrix, the product AB is a 2 × 4 matrix given by

$$AB = \begin{bmatrix} 7 & 7 & 3 & 4 \\ 4 & 7 & 2 & 0 \end{bmatrix}.$$

But the product BA is undefined.

Example 2. Let

$$A = \begin{bmatrix} 1 & 5 & 4 \\ 2 & -3 & 0 \\ 0 & 4 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & 1 & 2 \\ 0 & -1 & -2 \\ 1 & 6 & 2 \end{bmatrix}.$$

The product of matrices A and B is given by

$$AB = \begin{bmatrix} 1 & 5 & 4 \\ 2 & -3 & 0 \\ 0 & 4 & 1 \end{bmatrix}\begin{bmatrix} 3 & 1 & 2 \\ 0 & -1 & -2 \\ 1 & 6 & 2 \end{bmatrix} = \begin{bmatrix} 7 & 20 & 0 \\ 6 & 5 & 10 \\ 1 & 2 & -6 \end{bmatrix}$$

and the product of matrices B and A is given by

$$BA = \begin{bmatrix} 3 & 1 & 2 \\ 0 & -1 & -2 \\ 1 & 6 & 2 \end{bmatrix}\begin{bmatrix} 1 & 5 & 4 \\ 2 & -3 & 0 \\ 0 & 4 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 20 & 14 \\ -2 & -5 & -2 \\ 13 & -5 & 6 \end{bmatrix}.$$

But AB ≠ BA.
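The sum formula for (AB)ij translates directly into code. A minimal sketch (plain Python; `matmul` is our own helper, not from the text) checked against the matrices of Example 2:

```python
def matmul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj, where A is m x r and B is r x n."""
    r = len(B)
    assert len(A[0]) == r, "columns of A must equal rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(r))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 5, 4], [2, -3, 0], [0, 4, 1]]
B = [[3, 1, 2], [0, -1, -2], [1, 6, 2]]

print(matmul(A, B))                    # [[7, 20, 0], [6, 5, 10], [1, 2, -6]]
print(matmul(A, B) == matmul(B, A))    # False: AB != BA
```

The assertion on the sizes mirrors the remark above: when the column count of A differs from the row count of B, the product is undefined.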

1.3.2 Partition of matrices

A matrix can be partitioned into smaller matrices by inserting horizontal and vertical rules
between selected rows and columns. For instance, below are three possible partitions of a
general 4 × 5 matrix A. The first one is a partition of A into four submatrices A11 , A12 ,
A21 , and A22 ; the second one is a partition of A into its row matrices r1 , r2 , r3 , and r4 ; the
third one is a partition of A into its column matrices c1 , c2 , c3 , c4 , and c5 :
$$A = \left[\begin{array}{ccc|cc} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix};$$

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \end{bmatrix} = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \\ \mathbf{r}_4 \end{bmatrix};$$

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \end{bmatrix} = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \mathbf{c}_3 & \mathbf{c}_4 & \mathbf{c}_5 \end{bmatrix}.$$

1.3.3 Matrix product by columns and by rows

Sometimes it may be desirable to find a particular row or column of a matrix product AB


without computing the entire product. The following results are useful for that purpose. Let
A ∈ R^{m×r} and B ∈ R^{r×n}. Then

$$j\text{th column matrix of } AB = \begin{bmatrix} \sum_{k=1}^{r} a_{1k}b_{kj} \\ \vdots \\ \sum_{k=1}^{r} a_{mk}b_{kj} \end{bmatrix} = A\,[\,j\text{th column matrix of } B\,]$$

and

$$i\text{th row matrix of } AB = \Big[ \sum_{k=1}^{r} a_{ik}b_{k1}, \ \ldots, \ \sum_{k=1}^{r} a_{ik}b_{kn} \Big] = [\,i\text{th row matrix of } A\,]\,B.$$

Remark. Let a1 , a2 , . . . , am denote the row matrices of A and b1 , b2 , . . . , bn denote the
column matrices of B. It follows from the formulas above that

$$AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_n \end{bmatrix}, \tag{1.3}$$

which shows that AB can be computed column by column, and


   
$$AB = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_m \end{bmatrix} B = \begin{bmatrix} \mathbf{a}_1 B \\ \mathbf{a}_2 B \\ \vdots \\ \mathbf{a}_m B \end{bmatrix}, \tag{1.4}$$

which shows that AB can also be computed row by row.
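Formulas (1.3) and (1.4) say that any single column or row of AB can be formed without computing the whole product. A small sketch (plain Python; the helper names are ours) using the matrices of Example 1 in Subsection 1.3.1:

```python
def col(M, j):
    """Extract column j of M as a list."""
    return [row[j] for row in M]

def matvec(A, v):
    """Product of A with a column matrix v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vecmat(v, B):
    """Product of a row matrix v with B."""
    return [sum(v[k] * B[k][j] for k in range(len(v))) for j in range(len(B[0]))]

A = [[1, 0, 3], [1, -2, 0]]
B = [[4, 1, 0, 4], [0, -3, -1, 2], [1, 2, 1, 0]]

# (1.3): the jth column of AB is A times the jth column of B.
cols = [matvec(A, col(B, j)) for j in range(4)]
AB = [[cols[j][i] for j in range(4)] for i in range(2)]
print(AB)    # [[7, 7, 3, 4], [4, 7, 2, 0]], matching Example 1

# (1.4): the ith row of AB is the ith row of A times B.
rows = [vecmat(A[i], B) for i in range(2)]
print(rows == AB)    # True
```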

1.3.4 Matrix product of partitioned matrices

From the remark in Subsection 1.3.3, we know that the computation of a matrix product
can be carried out using special partitions of matrices. We now introduce the general case.
Let

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1s} \\ A_{21} & A_{22} & \cdots & A_{2s} \\ \vdots & \vdots & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rs} \end{bmatrix} \in \mathbb{R}^{m\times l}, \qquad B = \begin{bmatrix} B_{11} & B_{12} & \cdots & B_{1t} \\ B_{21} & B_{22} & \cdots & B_{2t} \\ \vdots & \vdots & & \vdots \\ B_{s1} & B_{s2} & \cdots & B_{st} \end{bmatrix} \in \mathbb{R}^{l\times n},$$

where the number of columns of submatrix Aik is equal to the number of rows of submatrix
Bkj for each 1 ≤ i ≤ r, 1 ≤ k ≤ s, and 1 ≤ j ≤ t. Then we construct
 
$$C = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1t} \\ C_{21} & C_{22} & \cdots & C_{2t} \\ \vdots & \vdots & & \vdots \\ C_{r1} & C_{r2} & \cdots & C_{rt} \end{bmatrix} \in \mathbb{R}^{m\times n},$$

where each submatrix of C is given by

$$C_{ij} = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{is}B_{sj} = \sum_{k=1}^{s} A_{ik}B_{kj}$$

for 1 ≤ i ≤ r and 1 ≤ j ≤ t. In fact, the partitioned matrix C is nothing but the
product of the matrices A and B, i.e., C = AB. See the following example.

Example. Let A ∈ R^{3×3} and B ∈ R^{3×2}. Below are partitions of A and B:

$$A = \left[\begin{array}{c|cc} a_{11} & a_{12} & a_{13} \\ \hline a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B = \left[\begin{array}{c|c} b_{11} & b_{12} \\ \hline b_{21} & b_{22} \\ b_{31} & b_{32} \end{array}\right] = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}.$$

Then

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}$$

$$= \begin{bmatrix} a_{11}b_{11} + \begin{bmatrix} a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} b_{21} \\ b_{31} \end{bmatrix} & a_{11}b_{12} + \begin{bmatrix} a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} b_{22} \\ b_{32} \end{bmatrix} \\[4pt] \begin{bmatrix} a_{21} \\ a_{31} \end{bmatrix} b_{11} + \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}\begin{bmatrix} b_{21} \\ b_{31} \end{bmatrix} & \begin{bmatrix} a_{21} \\ a_{31} \end{bmatrix} b_{12} + \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}\begin{bmatrix} b_{22} \\ b_{32} \end{bmatrix} \end{bmatrix}$$

$$= \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} \\ a_{31}b_{11} + a_{32}b_{21} + a_{33}b_{31} & a_{31}b_{12} + a_{32}b_{22} + a_{33}b_{32} \end{bmatrix} = AB.$$
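The block formula can be spot-checked numerically. The sketch below (our own helpers, with concrete numbers that are not from the text) partitions a 3 × 3 matrix A and a 3 × 2 matrix B as in the example and verifies that assembling the blocks Cij reproduces AB:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def block(M, rows, cols):
    """Submatrix of M with the given row and column indices."""
    return [[M[i][j] for j in cols] for i in rows]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 0]]
B = [[1, 0], [2, 1], [0, 3]]

# Partition as in the example: A11 is 1 x 1, A12 is 1 x 2, and so on.
A11, A12 = block(A, [0], [0]), block(A, [0], [1, 2])
A21, A22 = block(A, [1, 2], [0]), block(A, [1, 2], [1, 2])
B11, B12 = block(B, [0], [0]), block(B, [0], [1])
B21, B22 = block(B, [1, 2], [0]), block(B, [1, 2], [1])

# C_ij = sum over k of A_ik * B_kj, computed blockwise.
C11 = matadd(matmul(A11, B11), matmul(A12, B21))
C12 = matadd(matmul(A11, B12), matmul(A12, B22))
C21 = matadd(matmul(A21, B11), matmul(A22, B21))
C22 = matadd(matmul(A21, B12), matmul(A22, B22))

# Reassemble the blocks into one matrix and compare with the direct product.
C = [C11[0] + C12[0]] + [r + s for r, s in zip(C21, C22)]
print(C == matmul(A, B))    # True: the block formula reproduces AB
```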

1.3.5 Matrix form of a linear system

In fact, the matrix product has an important application in solving linear systems. Consider
linear system (1.1) of m linear equations in n unknowns. We can replace the m equations
in this system with the single matrix equation
   
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$
By using the product of matrices, it follows that

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$
Let

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$

Then the original system has been replaced by the single matrix equation

Ax = b.

Here A is called the coefficient matrix of the system. Thus, the augmented matrix for
the system is obtained by adjoining b to A as the last column, i.e., [ A ¦ b ].

Remark. Note that by using matrix operations on the above linear system, it can also be
written as follows:
       
$$x_1\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix},$$

i.e.,

$$x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_n\mathbf{c}_n = \mathbf{b}, \tag{1.5}$$
where cj is the jth column matrix of A for 1 ≤ j ≤ n.
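Form (1.5) says that Ax is a linear combination of the columns of A weighted by the entries of x. A quick numerical check (plain Python; the helpers and the sample numbers are ours):

```python
def matvec(A, x):
    """Compute Ax directly from the product definition."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def column_combination(A, x):
    """Compute x1*c1 + x2*c2 + ... + xn*cn, the column form (1.5)."""
    m, n = len(A), len(A[0])
    b = [0] * m
    for j in range(n):          # accumulate x_j times column j of A
        for i in range(m):
            b[i] += x[j] * A[i][j]
    return b

A = [[2, 1, 0], [1, -1, 3]]
x = [1, 2, -1]
print(matvec(A, x))               # [4, -4]
print(column_combination(A, x))   # [4, -4]: the two forms agree
```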

1.3.6 Transpose and trace of a matrix

Definition. The transpose of an m × n matrix A = [aij ], denoted by AT , is defined to be


the n × m matrix with entries given by

(AT )ij = aji .

The trace of an n × n matrix A = [aij ] is given by

$$\mathrm{tr}(A) := \sum_{i=1}^{n} a_{ii}.$$

For instance,

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}, \qquad \mathrm{tr}(A) = a_{11} + a_{22} + a_{33}.$$

Some important properties of the transpose are listed in the following theorem.

Theorem 1.2. Let the sizes of matrices A and B be such that the stated operations can be
performed. Then

(a) (AT )T = A.

(b) (A ± B)T = AT ± B T .

(c) (kA)T = kAT , where k is any scalar.

(d) (AB)T = B T AT .

Proof. Parts (a), (b), and (c) are self-evident. We therefore only prove (d). Let

A = [aij ] ∈ Rm×r , B = [bij ] ∈ Rr×n.


Then the products (AB)^T and B^T A^T can both be formed, and they have the same size
n × m. It only remains to show that the corresponding entries of (AB)^T and B^T A^T are the
same, i.e., for all i, j,

$$\big((AB)^T\big)_{ij} = (B^T A^T)_{ij}. \tag{1.6}$$
Applying the definition of transpose of a matrix to the left-hand side of (1.6) and then using
the definition of matrix product, we obtain
$$\big((AB)^T\big)_{ij} = (AB)_{ji} = \sum_{k=1}^{r} a_{jk}b_{ki}.$$

To evaluate the right-hand side of (1.6), let AT = [a′ij ] and B T = [b′ij ], then

a′ij = aji , b′ij = bji .

Furthermore, we have for all i and j,

$$(B^T A^T)_{ij} = \sum_{k=1}^{r} b'_{ik}a'_{kj} = \sum_{k=1}^{r} b_{ki}a_{jk} = \sum_{k=1}^{r} a_{jk}b_{ki} = \big((AB)^T\big)_{ij}.$$

Thus, (d) holds.

Some important properties of the trace are included in the following theorem.

Theorem 1.3. Let A = [aij ] and B = [bij ] be n × n matrices. Then

(a) tr(A) = tr(AT ).

(b) tr(AB) = tr(BA).

(c) tr(αA + βB) = αtr(A) + βtr(B), where α and β are any scalars.

(d) tr(AB − BA) = 0.



(e) tr(B) = 0 if B T = −B.

Proof. Part (a) is obvious. For (b), because addition is associative and commutative, we
have

$$\mathrm{tr}(AB) = \sum_{i=1}^{n} (AB)_{ii} = \sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}b_{ki} = \sum_{k=1}^{n}\sum_{i=1}^{n} b_{ki}a_{ik} = \sum_{k=1}^{n} (BA)_{kk} = \mathrm{tr}(BA).$$

For (c), we have

$$\mathrm{tr}(\alpha A + \beta B) = \sum_{i=1}^{n} (\alpha A + \beta B)_{ii} = \sum_{i=1}^{n} (\alpha a_{ii} + \beta b_{ii}) = \alpha\sum_{i=1}^{n} a_{ii} + \beta\sum_{i=1}^{n} b_{ii} = \alpha\,\mathrm{tr}(A) + \beta\,\mathrm{tr}(B).$$

For (d), it follows from (c) and (b) that

tr(AB − BA) = tr(AB) − tr(BA) = tr(AB) − tr(AB) = 0.

For (e), by using (a), the given condition B T = −B, and (c), we deduce

tr(B) = tr(B T ) = tr(−B) = −tr(B).

Thus, tr(B) = 0.
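Parts (b) and (d) of Theorem 1.3 are easy to spot-check numerically (plain Python; `matmul` and `trace` are our own helpers, applied to the matrices of Example 2 in Subsection 1.3.1):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    """tr(A) = sum of the main-diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 5, 4], [2, -3, 0], [0, 4, 1]]
B = [[3, 1, 2], [0, -1, -2], [1, 6, 2]]

# AB != BA in general, yet their traces agree (Theorem 1.3 (b)).
print(matmul(A, B) != matmul(B, A))                      # True
print(trace(matmul(A, B)) == trace(matmul(B, A)))        # True: both equal 6
```

This is the content of part (d) as well: tr(AB − BA) = 0, even though AB − BA itself is usually nonzero.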

1.4 Rules of Matrix Operations and Inverses

In this section, we study some basic properties of the arithmetic operations on matrices.

1.4.1 Basic properties of matrix operations

Theorem 1.4. Let A, B, and C be matrices and the sizes of matrices be assumed such that
the indicated operations can be performed. The following rules of matrix operations are valid.

(a) A + B = B + A. (Commutative law for addition)

(b) A + (B + C) = (A + B) + C. (Associative law for addition)

(c) (AB)C = A(BC). (Associative law for product)

(d) A(B ± C) = AB ± AC. (Left distributive law)

(e) (B ± C)A = BA ± CA. (Right distributive law)

(f) a(B ± C) = aB ± aC.

(g) (a ± b)C = aC ± bC.

(h) a(bC) = (ab)C.

(i) a(BC) = (aB)C = B(aC).

Here a and b are any scalars.

Proof. We only prove (c) of the associative law for matrix product. The other parts here are
left as an exercise. Assume that

A = [aij ] ∈ Rs×n, B = [bjk ] ∈ Rn×m, C = [ckl ] ∈ Rm×r .


We want to show
(AB)C = A(BC).
Let
V = AB = [vik ] ∈ Rs×m, W = BC = [wjl ] ∈ Rn×r .
Then

$$v_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}$$

for 1 ≤ i ≤ s and 1 ≤ k ≤ m, and


$$w_{jl} = \sum_{k=1}^{m} b_{jk}c_{kl}$$

for 1 ≤ j ≤ n and 1 ≤ l ≤ r. Since (AB)C = V C, the entry in row i and column l of matrix
V C is given as follows:
$$(VC)_{il} = \sum_{k=1}^{m} v_{ik}c_{kl} = \sum_{k=1}^{m}\Big(\sum_{j=1}^{n} a_{ij}b_{jk}\Big)c_{kl} = \sum_{k=1}^{m}\sum_{j=1}^{n} a_{ij}b_{jk}c_{kl} \tag{1.7}$$

for 1 ≤ i ≤ s and 1 ≤ l ≤ r. Since A(BC) = AW , the entry in row i and column l of matrix
AW is given as follows:
$$(AW)_{il} = \sum_{j=1}^{n} a_{ij}w_{jl} = \sum_{j=1}^{n} a_{ij}\Big(\sum_{k=1}^{m} b_{jk}c_{kl}\Big) = \sum_{j=1}^{n}\sum_{k=1}^{m} a_{ij}b_{jk}c_{kl} \tag{1.8}$$

for 1 ≤ i ≤ s and 1 ≤ l ≤ r. Because addition is associative and commutative, the results
in (1.7) and (1.8) are the same. Hence the proof is completed.

1.4.2 Identity matrix and zero matrix

We define the identity matrix and the zero matrix as follows:

$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n\times n}, \qquad 0 = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \in \mathbb{R}^{m\times n}.$$

Remark. Throughout the book, we use the symbol In to denote the n×n identity matrix. If
there is no confusion, we sometimes use I to denote the identity matrix with an appropriate
size. Besides, we also use ei to denote the ith column matrix of In , i.e.,
$$\mathbf{e}_i = [\,0 \ \cdots \ 0 \ 1 \ 0 \ \cdots \ 0\,]^T,$$

where the 1 appears in the ith position.

Theorem 1.5. Let the sizes of the matrices be such that the indicated operations can be
performed. The following rules of matrix operations are valid.

(a) AI = A, IA = A.

(b) A + 0 = 0 + A = A.

(c) A0 = 0, 0A = 0.

The proof of the theorem is trivial and we therefore omit it.

Example. Let

$$A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$

Then AB = 0 even though both A and B are nonzero matrices. Thus, if AB = 0 for
A ∈ R^{m×n} and B ∈ R^{n×r}, it does not follow that A = 0 or B = 0.

As the following theorem shows, the identity matrix is useful in studying reduced
row-echelon forms of square matrices. The proof of the theorem is left as an exercise.

Theorem 1.6. Let R be the reduced row-echelon form of a square matrix A. Then either R
has a row of zeros or R = I.

1.4.3 Inverse of a matrix

Definition. Let A and B be square matrices of the same size such that

AB = BA = I.

Then B is called an inverse of A, denoted by B = A−1 , and A is said to be invertible. If


no such B exists, then A is said to be not invertible.

Example. Consider the matrices

$$A = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix}.$$

One can verify that B is an inverse of A since

$$AB = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

and

$$BA = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I.$$

The next theorem shows that an invertible matrix has exactly one inverse.

Theorem 1.7. Let B and C be both inverses of the matrix A. Then B = C.



Proof. Since B and C are both inverses of A, we have

AB = BA = I, AC = CA = I.

From BA = I, multiplying both sides on the right by C yields

(BA)C = IC = C.

However, we obtain by Theorem 1.4 (c),

(BA)C = B(AC) = BI = B.

Thus, C = B.

For a 2 × 2 invertible matrix, the following theorem gives a formula for constructing the
inverse.

Theorem 1.8. The matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is invertible if ad − bc ≠ 0. The inverse is given by the formula

$$A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} \dfrac{d}{ad-bc} & -\dfrac{b}{ad-bc} \\[6pt] -\dfrac{c}{ad-bc} & \dfrac{a}{ad-bc} \end{bmatrix}.$$

Proof. Verify that AA^{-1} = I_2 and A^{-1}A = I_2.
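The formula of Theorem 1.8 in code (plain Python with `Fraction` entries for exactness; the function name is ours), checked on the matrix A of the earlier inverse example:

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix via the formula of Theorem 1.8."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 1], [2, 3]]
B = inverse_2x2(A)
print(B)    # [[3, -1], [-2, 1]], as in the example above

# Verify AB = I.
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(AB == [[1, 0], [0, 1]])    # True
```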

The following theorem is concerned with the invertibility of the product of invertible
matrices.

Theorem 1.9. Let A and B be n × n invertible matrices. Then AB is invertible and

(AB)−1 = B −1 A−1 .

In general,

$$\big(A_{(1)}A_{(2)}\cdots A_{(p)}\big)^{-1} = A_{(p)}^{-1}\cdots A_{(2)}^{-1}A_{(1)}^{-1}, \tag{1.9}$$

where A(i) (1 ≤ i ≤ p) are n × n invertible matrices and p is any positive integer.

Proof. Since A and B are invertible, we obtain by Theorem 1.4 (c),

(AB)(B −1 A−1 ) = A(BB −1 )A−1 = AA−1 = I

and
(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 B = I.
Thus,
(AB)−1 = B −1 A−1 .
We are now going to prove the general case of (1.9) by induction. When p = 2, we have
just proved that (1.9) is true. We assume that (1.9) is true for p = k − 1, i.e.,

$$\big(A_{(1)}A_{(2)}\cdots A_{(k-1)}\big)^{-1} = A_{(k-1)}^{-1}\cdots A_{(2)}^{-1}A_{(1)}^{-1}. \tag{1.10}$$

Considering the case p = k, it follows from the case p = 2 and (1.10) that

$$\big(A_{(1)}A_{(2)}\cdots A_{(k-1)}A_{(k)}\big)^{-1} = \big(\big(A_{(1)}A_{(2)}\cdots A_{(k-1)}\big)A_{(k)}\big)^{-1} = A_{(k)}^{-1}\big(A_{(1)}A_{(2)}\cdots A_{(k-1)}\big)^{-1} = A_{(k)}^{-1}A_{(k-1)}^{-1}\cdots A_{(2)}^{-1}A_{(1)}^{-1}.$$

Therefore, (1.9) is true for any positive integer p.

The following theorem gives a relationship between the inverse of an invertible matrix A
and the inverse of AT .

Theorem 1.10. If A is invertible, then AT is also invertible and

(AT )−1 = (A−1 )T .

Proof. We have by using Theorem 1.2 (d),

AT (A−1 )T = (A−1 A)T = I T = I

and
(A−1 )T AT = (AA−1 )T = I T = I.
Thus, (AT )−1 = (A−1 )T .

1.4.4 Powers of a matrix

Definition. If A is square, then we define the powers of A for any integer n ≥ 0,
\[
A^0 := I, \qquad A^n := \underbrace{AA\cdots A}_{n \text{ factors}}.
\]
Moreover, if A is invertible, then
\[
A^{-n} := (A^{-1})^n = \underbrace{A^{-1}A^{-1}\cdots A^{-1}}_{n \text{ factors}}.
\]

If r and s are integers, then it follows from the definition of powers that

A^r A^s = A^{r+s}, \qquad (A^r)^s = A^{rs}.

Theorem 1.11. If A is invertible, then

(a) A−1 is invertible and (A−1 )−1 = A.

(b) An is invertible and (An )−1 = (A−1 )n for any integer n ≥ 0.


(c) For any scalar k ≠ 0, the matrix kA is invertible and (kA)^{-1} = (1/k) A^{-1}.
The proof of the theorem is straightforward and we therefore omit it.

1.5 Elementary Matrices and a Method for Finding A^{-1}

We develop an algorithm for finding the inverse of an invertible matrix in this section.

1.5.1 Elementary matrices and their properties

Definition. An n × n elementary matrix can be obtained by performing a single elementary row operation on In. The following are three types of elementary matrices.

(i) Interchange rows i and j of In:
\[
E(i, j) = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 0 & \cdots & 1 & & \\
& & \vdots & \ddots & \vdots & & \\
& & 1 & \cdots & 0 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}
\]
(the off-diagonal 1's sit in rows i and j).

(ii) Multiply row i of In by c (c ≠ 0):
\[
E\big(i(c)\big) = \begin{bmatrix}
1 & & & & \\
& \ddots & & & \\
& & c & & \\
& & & \ddots & \\
& & & & 1
\end{bmatrix}
\]
(the entry c sits in row i).

(iii) Add k times row j to row i of In:
\[
E\big(i, j(k)\big) = \begin{bmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & \cdots & k & \\
& & & \ddots & \vdots & \\
& & & & 1 & \\
& & & & & \ddots \\
& & & & & & 1
\end{bmatrix}
\]
(the entry k sits in row i, column j).

Remark. A square matrix is called a permutation matrix if it can be written as a


product of elementary matrices of type (i).

When a matrix A is multiplied on the left by an elementary matrix E, the result is to


perform an elementary row operation on A. More precisely, we have the following theorem.

Theorem 1.12. Let A be an m × n matrix. If the elementary matrix E results from per-
forming a certain row operation on Im , then the product EA is the matrix that results when
this same row operation is performed on A.

Proof. We only prove the statement concerning the elementary matrix E(i, j). The statements concerning E(i(c)) and E(i, j(k)) can be proved easily by the same trick. Let r_1, r_2, . . . , r_m denote the row matrices of A and e_1, e_2, . . . , e_m denote the column matrices of I_m. It follows that
\[
A = \begin{bmatrix} r_1 \\ \vdots \\ r_i \\ \vdots \\ r_j \\ \vdots \\ r_m \end{bmatrix},
\qquad
E(i, j) = \begin{bmatrix} e_1^T \\ \vdots \\ e_j^T \\ \vdots \\ e_i^T \\ \vdots \\ e_m^T \end{bmatrix}.
\]
Since e_k^T A = r_k for 1 ≤ k ≤ m, we have by (1.4),
\[
E(i, j)A = \begin{bmatrix} e_1^T \\ \vdots \\ e_j^T \\ \vdots \\ e_i^T \\ \vdots \\ e_m^T \end{bmatrix} A
= \begin{bmatrix} e_1^T A \\ \vdots \\ e_j^T A \\ \vdots \\ e_i^T A \\ \vdots \\ e_m^T A \end{bmatrix}
= \begin{bmatrix} r_1 \\ \vdots \\ r_j \\ \vdots \\ r_i \\ \vdots \\ r_m \end{bmatrix}.
\]
Hence E(i, j)A is the matrix that results when rows i and j of A are interchanged.

The following theorem is concerned with the invertibility of elementary matrices. The
proof of the theorem is left as an exercise.

Theorem 1.13. For the three types of elementary matrices, we have
\[
E(i, j)^{-1} = E(i, j), \qquad E\big(i(c)\big)^{-1} = E\big(i(c^{-1})\big), \qquad E\big(i, j(k)\big)^{-1} = E\big(i, j(-k)\big).
\]

Remark. It follows from Theorem 1.13 that the inverse of any elementary matrix is still an
elementary matrix.
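As an illustration, the three elementary matrices can be built in pure Python and the left-multiplication rule of Theorem 1.12 checked directly (a sketch; the helper names and 0-based indices are my own, not from the text):

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    # Plain triple-loop matrix product; A is p x q, B is q x r.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def E_swap(n, i, j):       # E(i, j): interchange rows i and j of I_n
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def E_scale(n, i, c):      # E(i(c)): multiply row i of I_n by c
    E = identity(n)
    E[i][i] = c
    return E

def E_add(n, i, j, k):     # E(i, j(k)): add k times row j to row i of I_n
    E = identity(n)
    E[i][j] = k
    return E

A = [[1, 2], [3, 4], [5, 6]]                    # arbitrary 3 x 2 example
# Theorem 1.12: multiplying on the left performs the same row operation on A.
assert matmul(E_swap(3, 0, 2), A) == [[5, 6], [3, 4], [1, 2]]
assert matmul(E_scale(3, 1, 10), A) == [[1, 2], [30, 40], [5, 6]]
assert matmul(E_add(3, 0, 1, 2), A) == [[7, 10], [3, 4], [5, 6]]
# Theorem 1.13: each elementary matrix is inverted by undoing its row operation.
assert matmul(E_add(3, 0, 1, 2), E_add(3, 0, 1, -2)) == identity(3)
assert matmul(E_swap(3, 0, 2), E_swap(3, 0, 2)) == identity(3)
```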

1.5.2 Main theorem of invertibility

The next theorem establishes some equivalent statements of the invertibility of a matrix.
These results are extremely important and will be used many times later.

Theorem 1.14. Let A be an n × n matrix. Then the following statements are equivalent,
i.e., all true or all false.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is In .

(d) A is expressible as a product of elementary matrices.

Proof. It is sufficient to prove that (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a).

(a) ⇒ (b): Let x0 be any solution of Ax = 0, i.e., Ax0 = 0. Since A is invertible, multiplying
both sides of Ax0 = 0 by A−1 , we have A−1 Ax0 = A−1 0, which implies In x0 = 0. Thus,
x0 = 0. Therefore, Ax = 0 has only the trivial solution.

(b) ⇒ (c): Let R be the reduced row-echelon form of A. Then by Theorem 1.6, R = In
or R has a zero row. If R has a zero row, then it follows from Theorem 1.1 that Rx = 0
has infinitely many solutions, i.e., nontrivial solutions. Therefore, Ax = 0 has nontrivial
solutions, which contradicts (b). Thus, R = In .

(c) ⇒ (d): If the reduced row-echelon form of A is In , then there exist some elementary
matrices E(1) , E(2) , . . . , E(k) such that

E(k) · · · E(2) E(1) A = In . (1.11)

By Theorem 1.13, we know that every elementary matrix is invertible and the inverse of an
elementary matrix is still an elementary matrix. It follows from (1.11) that
\[
A = E_{(1)}^{-1} E_{(2)}^{-1} \cdots E_{(k)}^{-1}.
\]

Thus, (d) holds.

(d) ⇒ (a): It is obtained directly from Theorems 1.13 and 1.9.



1.5.3 A method for finding A−1

In the following, we establish a method for constructing the inverse of an n × n invertible


matrix A. Since A−1 exists, multiplying both sides of (1.11) on the right by A−1 yields

A−1 = E(k) · · · E(2) E(1) .


 
We next construct an n × 2n matrix [A ¦ I]. We have by using (1.3),
\[
E_{(k)} \cdots E_{(2)} E_{(1)} \,[\,A ¦ I\,] = [\,E_{(k)} \cdots E_{(2)} E_{(1)} A ¦ E_{(k)} \cdots E_{(2)} E_{(1)} I\,]
= [\,I ¦ E_{(k)} \cdots E_{(2)} E_{(1)}\,]
= [\,I ¦ A^{-1}\,].
\]

Thus, the sequence of elementary row operations that reduces A to I actually converts I to
A−1 simultaneously.

Example. Find the inverse of
\[
A = \begin{bmatrix} 1 & 0 & 2 \\ 3 & 1 & 6 \\ 2 & 7 & 3 \end{bmatrix}.
\]

Solution. The computations are as follows:
\[
[\,A ¦ I_3\,] = \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 3 & 1 & 6 & 0 & 1 & 0 \\ 2 & 7 & 3 & 0 & 0 & 1 \end{array}\right]
\]
2nd row + (−3) × 1st row, 3rd row + (−2) × 1st row:
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 7 & -1 & -2 & 0 & 1 \end{array}\right]
\]
3rd row + (−7) × 2nd row:
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 0 & -1 & 19 & -7 & 1 \end{array}\right]
\]
(−1) × 3rd row:
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 0 & 1 & -19 & 7 & -1 \end{array}\right]
\]
1st row + (−2) × 3rd row:
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 39 & -14 & 2 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 0 & 1 & -19 & 7 & -1 \end{array}\right] = [\,I_3 ¦ A^{-1}\,].
\]

Thus,
\[
A^{-1} = \begin{bmatrix} 39 & -14 & 2 \\ -3 & 1 & 0 \\ -19 & 7 & -1 \end{bmatrix}.
\]
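The same row reduction can be sketched in pure Python with exact fractions (an illustrative implementation of the [A ¦ I] method, not the book's code; it assumes the input matrix is invertible):

```python
from fractions import Fraction

def invert(A):
    """Gauss-Jordan elimination on the augmented matrix [A | I].
    Assumes A is invertible; swaps in a nonzero pivot when needed."""
    n = len(A)
    # Build [A | I] with exact Fraction entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]   # bring a nonzero pivot into place
        p = M[col][col]
        M[col] = [x / p for x in M[col]]      # scale the pivot row to make pivot 1
        for r in range(n):                    # clear the rest of the column
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]             # right half is A^{-1}

A = [[1, 0, 2], [3, 1, 6], [2, 7, 3]]         # the matrix from the example above
assert invert(A) == [[39, -14, 2], [-3, 1, 0], [-19, 7, -1]]
```

The final assertion reproduces the inverse computed by hand in the example.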

1.6 Further Results on Systems and Invertibility

We develop more results concerned with linear systems and invertibility of matrices in this
section.

1.6.1 A basic theorem

Theorem 1.15. Every linear system has either no solution, exactly one solution, or infinitely
many solutions.

Proof. If Ax = b is a system of linear equations, then exactly one of the following is true:

(a) the system has no solution;

(b) the system has exactly one solution;

(c) the system has more than one solution.

The proof will be completed if we can show that the system has infinitely many solutions in
case (c). Assume that Ax = b has more than one solution, and let x0 = x1 − x2 , where x1
and x2 are any two distinct solutions. Therefore, x0 is nonzero. Moreover,

Ax0 = A(x1 − x2 ) = Ax1 − Ax2 = b − b = 0.

Let k be any scalar. Then

A(x1 + kx0 ) = Ax1 + A(kx0 ) = Ax1 + k(Ax0 ) = b + k0 = b + 0 = b,

i.e., x1 + kx0 is a solution of Ax = b. Since x0 ̸= 0 and there are infinitely many choices for
k, we conclude that Ax = b has infinitely many solutions.
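The argument in case (c) can be checked on a tiny singular system (an illustrative sketch; the matrices and solutions below are my own sample choices):

```python
def matvec(A, x):
    # Matrix-vector product with plain Python lists.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1], [2, 2]]          # singular: the second equation is twice the first
b = [2, 4]
x1, x2 = [2, 0], [0, 2]       # two distinct solutions of Ax = b
assert matvec(A, x1) == b and matvec(A, x2) == b

x0 = [u - v for u, v in zip(x1, x2)]   # x0 = x1 - x2 is a nonzero solution of Ax = 0
assert matvec(A, x0) == [0, 0]

for k in range(-3, 4):                 # x1 + k*x0 solves Ax = b for every scalar k
    xk = [u + k * v for u, v in zip(x1, x0)]
    assert matvec(A, xk) == b
```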

1.6.2 Properties of invertible matrices

From the definition of the inverse of an invertible matrix A, it is necessary to find a square
matrix B such that
AB = I, BA = I.
The next theorem shows that if B satisfies either condition, then the other condition holds
automatically.

Theorem 1.16. Let A be a square matrix.

(a) If B is square and satisfies BA = I, then B = A−1 .

(b) If B is square and satisfies AB = I, then B = A−1 .

Proof. We only prove (a) and the proof of (b) is similar. We consider the system Ax = 0 and
show that this system only has the trivial solution. Let x0 be any solution of this system,
i.e., Ax0 = 0. Multiplying both sides by B yields

BAx0 = B0.

Since BA = I, we deduce
Ix0 = 0, i.e., x0 = 0.
Thus, the system Ax = 0 has only the trivial solution. It follows from Theorem 1.14 that
A−1 exists. Multiplying both sides of BA = I on the right by A−1 , we obtain

BA · A−1 = I · A−1 =⇒ B = A−1 .

Theorem 1.17. Let A and B be square matrices of the same size. If AB is invertible, then
A and B must also be invertible.

Proof. Since AB is invertible, there exists (AB)−1 such that

(AB)(AB)−1 = I =⇒ A[B(AB)−1 ] = I,

and
(AB)−1 (AB) = I =⇒ [(AB)−1 A]B = I.
By Theorem 1.16, both A and B are invertible.

Example. Let A and B be square matrices of the same size. Show that if I − AB is
invertible, then I − BA is also invertible.

Proof. By Theorem 1.16, we only need to find a matrix X such that (I − BA)X = I.
Actually,

I = I − BA + BA = I − BA + BIA

= I − BA + B(I − AB)(I − AB)−1 A

= I − BA + (B − BAB)(I − AB)−1 A

= I − BA + (I − BA)B(I − AB)−1 A

= (I − BA)[I + B(I − AB)−1 A].

Thus, I − BA is invertible and

(I − BA)−1 = I + B(I − AB)−1 A.
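A quick numeric check of this identity (an illustrative sketch with arbitrary sample matrices of my own; exact fractions keep the equality test free of rounding error):

```python
from fractions import Fraction

def matmul(A, B):
    # Plain 2x2 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    # 2x2 inverse via the ad - bc formula (Theorem 1.8).
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def add(M, N): return [[m + n for m, n in zip(r, s)] for r, s in zip(M, N)]
def sub(M, N): return [[m - n for m, n in zip(r, s)] for r, s in zip(M, N)]

I = [[1, 0], [0, 1]]
A = [[1, 2], [3, 4]]     # sample matrices; both I - AB and I - BA are invertible here
B = [[0, 1], [1, 1]]

lhs = inv2(sub(I, matmul(B, A)))                                  # (I - BA)^{-1}
rhs = add(I, matmul(B, matmul(inv2(sub(I, matmul(A, B))), A)))    # I + B(I - AB)^{-1}A
assert lhs == rhs
```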

The following theorem shows that we can solve a certain linear system by using the
inverse of its coefficient matrix.

Theorem 1.18. Let A be an n × n invertible matrix. Then for each n × 1 matrix b, the
system Ax = b has exactly one solution, namely, x = A−1 b.

Proof. Since A is invertible, there exists A−1 . For each n × 1 matrix b, let x0 be an arbitrary
solution of Ax = b, i.e., Ax0 = b. Multiplying both sides of Ax0 = b by A−1 , we have

A−1 Ax0 = A−1 b.

Thus, x0 = A−1 b is the only solution of Ax = b.
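A minimal sketch of Theorem 1.18 in pure Python, using the 2 × 2 inverse formula from Theorem 1.8 (the sample system is my own choice, not from the text):

```python
from fractions import Fraction

def inv2(M):
    # 2x2 inverse via the ad - bc formula (Theorem 1.8).
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1], [5, 3]]      # invertible: det = 2*3 - 1*5 = 1
b = [3, 8]
x = matvec(inv2(A), b)    # the unique solution x = A^{-1} b
assert matvec(A, x) == b  # it satisfies the system
assert x == [1, 1]
```

For large systems, solving by elimination is preferable in practice to forming A^{-1} explicitly; the point here is only to illustrate the uniqueness statement.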

Furthermore, we add one more equivalent statement into Theorem 1.14.

Theorem 1.19. Let A be an n × n matrix. Then the following are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is In .

(d) A is expressible as a product of elementary matrices.

(e) Ax = b has exactly one solution for every n × 1 matrix b.



Proof. Since we know that (a), (b), (c), and (d) are equivalent, it is sufficient to prove that
(a) ⇔ (e).

(a) ⇒ (e): This was proved in Theorem 1.18.

(e) ⇒ (a): If the system Ax = b has exactly one solution for every n × 1 matrix b, then in
particular, the systems

Ax = e1 , Ax = e2 , ..., Ax = en

are consistent, where ei denotes the ith column matrix of In for 1 ≤ i ≤ n. Let x1 , x2 , . . . , xn
be solutions of the respective systems, and let us form an n × n matrix C having these
solutions as columns:
 
C = x1 ¦ x2 ¦ · · · ¦ xn .
Then we have by (1.3),
   
AC = Ax1 ¦ Ax2 ¦ · · · ¦ Axn = e1 ¦ e2 ¦ · · · ¦ en = In .

It follows from Theorem 1.16 (b) that C = A−1 . Thus, A is invertible.
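The construction in (e) ⇒ (a) can be sketched numerically: solve Ax = e_i for each column of the identity and assemble the solutions into C (a sketch with a 2 × 2 sample matrix; the helper names are my own):

```python
from fractions import Fraction

def solve2(A, rhs):
    """Solve a 2x2 system by the Theorem 1.8 inverse formula (assumes det != 0)."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [(d * rhs[0] - b * rhs[1]) / det,
            (-c * rhs[0] + a * rhs[1]) / det]

A = [[2, 7], [1, 4]]                                  # det = 2*4 - 7*1 = 1
cols = [solve2(A, [1, 0]), solve2(A, [0, 1])]         # solutions of Ax = e1, Ax = e2
C = [[cols[j][i] for j in range(2)] for i in range(2)]  # place the solutions as columns
# AC = I, so C = A^{-1} by Theorem 1.16 (b).
assert C == [[4, -7], [-1, 2]]
```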

1.7 Some Special Matrices

Certain classes of matrices have special structures, which are useful in linear algebra and
also have many applications in practice.

1.7.1 Diagonal and triangular matrices

The following square matrix is called a diagonal matrix:
\[
D = \begin{bmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{bmatrix},
\]
which is usually denoted by
\[
D = \operatorname{diag}(d_{11}, d_{22}, \ldots, d_{nn}).
\]



If d_{ii} ≠ 0 for 1 ≤ i ≤ n, then
\[
D^{-1} = \begin{bmatrix} d_{11}^{-1} & 0 & \cdots & 0 \\ 0 & d_{22}^{-1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn}^{-1} \end{bmatrix} = \operatorname{diag}(d_{11}^{-1}, d_{22}^{-1}, \ldots, d_{nn}^{-1}).
\]

The following square matrices are called triangular matrices: a lower triangular matrix
\[
L = \begin{bmatrix}
a_{11} & 0 & 0 & \cdots & 0 \\
a_{21} & a_{22} & 0 & \cdots & 0 \\
a_{31} & a_{32} & a_{33} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn}
\end{bmatrix} \quad (a_{ij} = 0,\ i < j)
\]
and an upper triangular matrix
\[
U = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a_{22} & a_{23} & \cdots & a_{2n} \\
0 & 0 & a_{33} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}
\end{bmatrix} \quad (a_{ij} = 0,\ i > j).
\]

Theorem 1.20. We have

(a) The transpose of a lower triangular matrix is upper triangular, and the transpose of an
upper triangular matrix is lower triangular.

(b) The product of lower triangular matrices is lower triangular, and the product of upper
triangular matrices is upper triangular.

(c) A triangular matrix is invertible if and only if its diagonal entries are all nonzero.

(d) The inverse of an invertible lower triangular matrix is lower triangular, and the inverse
of an invertible upper triangular matrix is upper triangular.

Proof. Part (a) is obvious. We defer the proof of (c) until the next chapter (after Theorem
2.8). Here we prove (b) and (d) only.

For (b), we will prove the result for lower triangular matrices. The proof for upper triangular
matrices is similar. Let A = [aij ] and B = [bij ] be n × n lower triangular matrices, and let
C = AB = [cij ]. Obviously, aij = bij = 0 for i < j. We can prove that C is lower triangular

by showing that c_{ij} = 0 for i < j. If i < j, then the terms in the expression of c_{ij} can be grouped as follows:
\[
c_{ij} = \underbrace{a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{i,j-1}b_{j-1,j}}_{\substack{\text{terms in which the row number of } b \\ \text{is less than the column number of } b}}
+ \underbrace{a_{ij}b_{jj} + a_{i,j+1}b_{j+1,j} + \cdots + a_{in}b_{nj}}_{\substack{\text{terms in which the row number of } a \\ \text{is less than the column number of } a}}.
\]

In the first grouping all of the b factors are zero since bij = 0 for i < j, and in the second
grouping all of the a factors are zero since aij = 0 for i < j. Thus, cij = 0 for i < j. It
follows that C is a lower triangular matrix.

For (d), we only prove the result for lower triangular matrices again. Let A = [aij ] be an
n × n invertible lower triangular matrix, where aij = 0 for i < j. From (c), we know that
aii ̸= 0 for all i. Suppose that B = [bij ] is the inverse of A. Then

AB = I. (1.12)

We now compare the entries in both sides of (1.12) row by row. Beginning with the first
row, we have for j > 1,

0 = (I)1j = a11 b1j + a12 b2j + · · · + a1n bnj = a11 b1j + 0 · b2j + · · · + 0 · bnj = a11 b1j ,

which implies b_{1j} = 0. For the second row, we have for j > 2,
\[
0 = (I)_{2j} = \sum_{t=1}^{n} a_{2t} b_{tj} = a_{21} b_{1j} + a_{22} b_{2j} = a_{22} b_{2j} \;\Longrightarrow\; b_{2j} = 0.
\]

By induction, we suppose that for the top k − 1 rows, b_{ij} = 0 if i < j and i < k < n. For the kth row, we have for j > k,
\[
0 = (I)_{kj} = \sum_{t=1}^{n} a_{kt} b_{tj} = \sum_{t=1}^{k} a_{kt} b_{tj} = a_{kk} b_{kj} \;\Longrightarrow\; b_{kj} = 0.
\]
In particular, b_{n-1,n} = 0. Therefore, B is also a lower triangular matrix.
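Parts (b) and (d) of Theorem 1.20 can be checked numerically (an illustrative sketch; the sample matrices are arbitrary, and the inverse is computed column by column by forward substitution, one way among several):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_lower(M):
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(i + 1, n))

def forward_solve(L, b):
    """Solve Lx = b for lower triangular L with nonzero diagonal."""
    n = len(L)
    x = []
    for i in range(n):
        x.append((Fraction(b[i]) - sum(L[i][j] * x[j] for j in range(i))) / L[i][i])
    return x

A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]   # sample invertible lower triangular matrices
B = [[7, 0, 0], [8, 9, 0], [1, 2, 3]]
assert is_lower(matmul(A, B))           # (b): the product is lower triangular

# (d): each column of A^{-1} solves Ax = e_i; assemble the columns.
n = len(A)
cols = [forward_solve(A, [int(i == k) for i in range(n)]) for k in range(n)]
Ainv = [[cols[j][i] for j in range(n)] for i in range(n)]
assert is_lower(Ainv)                   # the inverse is lower triangular too
assert matmul(A, Ainv) == [[int(i == j) for j in range(n)] for i in range(n)]
```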

1.7.2 Symmetric matrix

A square matrix A is called symmetric if A = AT. A square matrix A is called skew-symmetric if A = −AT.

Example. Consider the matrices
\[
A = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 & -1 & 4 \\ 1 & 0 & 3 \\ -4 & -3 & 0 \end{bmatrix}.
\]
Then A is symmetric and B is skew-symmetric since
\[
A^T = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix} = A, \qquad
B^T = \begin{bmatrix} 0 & 1 & -4 \\ -1 & 0 & -3 \\ 4 & 3 & 0 \end{bmatrix} = -B.
\]

Theorem 1.21. Let A and B be symmetric matrices of the same size. Then
(a) AT is symmetric.
(b) A ± B are symmetric.
(c) kA is symmetric, where k is any scalar.
(d) AB is symmetric if and only if AB = BA.
(e) If A is invertible, then A−1 is symmetric.

Proof. Parts (a), (b), and (c) are obvious. Here we only prove (d) and (e).
For (d), since A and B are symmetric, it follows from Theorem 1.2 (d) that
AB = (AB)T ⇐⇒ AB = B T AT ⇐⇒ AB = BA.

For (e), if A is invertible and symmetric, then we have by Theorem 1.10,


(A−1 )T = (AT )−1 = A−1 .
Thus, A−1 is symmetric.

Theorem 1.22. Let A be an arbitrary matrix. Then AAT and AT A are symmetric. Further-
more, if A is square and invertible, then both AAT and AT A are symmetric and invertible.

Proof. It directly follows from Theorem 1.2 (d) and (a) that
(AAT )T = (AT )T AT = AAT and (AT A)T = AT (AT )T = AT A.
If A is square and invertible, it follows from Theorem 1.10 that AT is also invertible. By
Theorem 1.9, we find that AAT and AT A are invertible.

Remark. Every square matrix can be written as a sum of a symmetric matrix and a skew-
symmetric matrix. This is one of the most famous results in matrix theory. See Exercise
1.35.
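The decomposition in the remark is constructive: H = (A + A^T)/2 is symmetric and K = (A − A^T)/2 is skew-symmetric, and they sum to A. A sketch with an arbitrary sample matrix of my own:

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

A = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]        # arbitrary sample matrix
AT = transpose(A)
H = [[Fraction(a + t, 2) for a, t in zip(ra, rt)] for ra, rt in zip(A, AT)]
K = [[Fraction(a - t, 2) for a, t in zip(ra, rt)] for ra, rt in zip(A, AT)]

assert H == transpose(H)                                  # H is symmetric
assert K == [[-x for x in row] for row in transpose(K)]   # K is skew-symmetric
assert [[h + k for h, k in zip(rh, rk)] for rh, rk in zip(H, K)] == A
```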

Exercises

Elementary exercises

1.1. Determine which equations are linear in variables x, y, and z. If an equation is not
linear, explain why not.

(a) x − πy + ∛5 z = 0.      (b) x^2 + y^2 + z^2 = 1.

(c) x^{-1} + 7y + z = sin(π/9).      (d) 3 cos x − 4y + z = 3.

(e) (cos 3)x − 4y + z = 3.      (f) x = −7xy + 3z.

(g) xy + z + 1 = 0.      (h) x^{-2} + y + 8z = 5.

1.2. Determine whether each system has a unique solution, infinitely many solutions, or no
solution. Then try to solve each system to confirm your answer.

(a) x + y = 0,
    2x + y = 3.

(b) x + 5y = −1,
    −x + y = −5,
    2x + 4y = 4.

1.3. Find the augmented and coefficient matrices for each of the following linear systems.
 
(a) 2x − 3y + 5 = 0,
    4x + 2y − 2 = 0,
    3x + 5z + 3 = 0.

(b) 5x1 + x2 − x4 + 2x5 = 1,
    3x2 + 2x3 − x4 = 3,
    5x1 + 3x5 = 2.

1.4. Consider
    x + y + 3z = a,
    2x − y + 2z = b,
    x − 2y − z = c.
Show that if this system is consistent, then the constants a, b, and c must satisfy c = b − a.

1.5. For which values of a will the following system have no solution? Exactly one solution? Infinitely many solutions?
    x + 2y − 3z = 5,
    3x − y + 5z = 1,
    4x + y + (a^2 − 14)z = a + 2.

1.6. Determine whether each matrix is in reduced row-echelon form.

(a) \begin{bmatrix} 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
(b) \begin{bmatrix} 1 & 1 & 3 & 5 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
(c) \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}.

1.7. Solve the following systems by Gauss-Jordan elimination.

(a) 2x − y + 2z = 10,
    x − 2y + z = 8,
    3x − y + 2z = 11.

(b) 5x − 2y + 6z = 0,
    −2x + y + 3z = 1.

(c) x1 + 2x2 + 3x3 + 4x4 = −3,
    x1 + 2x2 − 5x4 = 1,
    2x1 + 4x2 − 3x3 − 19x4 = 6,
    3x1 + 6x2 − 3x3 − 24x4 = 7.

(d) 2x1 + 4x2 + 8x4 + 6x5 + 18x6 − 16 = 0,
    x1 + 2x2 − 2x3 + 3x5 = 0,
    5x3 + 10x4 + 15x6 − 5 = 0,
    −2x1 − 4x2 + 5x3 + 2x4 − 6x5 + 3x6 − 1 = 0.

1.8. Let
\[
A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.
\]

(a) Is M = \begin{bmatrix} 1 & 4 \\ 2 & 1 \end{bmatrix} a linear combination of A, B, and C?

(b) Is N = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} a linear combination of A, B, and C?

1.9. Let A ∈ R^{4×5}, B ∈ R^{4×5}, C ∈ R^{5×2}, D ∈ R^{4×2}, and E ∈ R^{5×4}. Determine which of the following matrix operations can be performed. If so, find the size of each resulting matrix.

(a) E(A + B).      (b) AE + BC + D.      (c) B(EA).



1.10. Find AB and BA, where
\[
A = \begin{bmatrix} 1 & 3 & 2 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}.
\]

1.11. Let A, B, and C be square matrices of the same size. Find an example to show that
AB = AC but B ̸= C.

1.12. Compute A = (6E)\left(\tfrac{1}{3}D\right), where
\[
D = \begin{bmatrix} 1 & 5 & 2 \\ 3 & 2 & 4 \\ -1 & 0 & 1 \end{bmatrix}, \qquad
E = \begin{bmatrix} -1 & 1 & 2 \\ 6 & 1 & 3 \\ 4 & 1 & 3 \end{bmatrix}.
\]

1.13. Let
\[
C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix}, \qquad
D = \begin{bmatrix} 1 & 5 & 2 \\ -1 & 0 & 1 \\ 3 & 2 & 4 \end{bmatrix}, \qquad
E = \begin{bmatrix} 6 & 1 & 3 \\ -1 & 1 & 2 \\ 4 & 1 & 3 \end{bmatrix}.
\]

Using as few computations as possible, compute

(a) The second row of DE.

(b) The third column of DE.

(c) The entry in row 2 and column 3 of C(DE).

1.14. In each part, compute the product of A and B by the method of product of partitioned matrices in Subsection 1.3.4. Then check your results by multiplying AB directly.

(a) A = \left[\begin{array}{cc|cc} -1 & 2 & 1 & 5 \\ 0 & -3 & 4 & 2 \\ \hline 1 & 5 & 6 & 1 \end{array}\right], \qquad B = \left[\begin{array}{cc|c} 2 & 1 & 4 \\ -3 & 5 & 2 \\ \hline 7 & -1 & 5 \\ 0 & 3 & -3 \end{array}\right].

(b) A = \left[\begin{array}{ccc|c} -1 & 2 & 1 & 5 \\ \hline 0 & -3 & 4 & 2 \\ 1 & 5 & 6 & 1 \end{array}\right], \qquad B = \left[\begin{array}{cc|c} 2 & 1 & 4 \\ -3 & 5 & 2 \\ 7 & -1 & 5 \\ \hline 0 & 3 & -3 \end{array}\right].

(c) A = \left[\begin{array}{cc} 2 & -5 \\ 1 & 3 \\ 0 & 5 \\ \hline 1 & 4 \end{array}\right], \qquad B = \begin{bmatrix} 2 & -1 & 3 & -4 \\ 0 & 1 & 5 & 7 \end{bmatrix}.

1.15. Let A and B be partitioned as follows:
\[
A = [\,a_1 ¦ a_2 ¦ \cdots ¦ a_n\,], \qquad
B = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\]
where a_i (1 ≤ i ≤ n) are column matrices of A and b_i (1 ≤ i ≤ n) are row matrices of B. Then AB can be expressed as
\[
AB = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. \tag{1.13}
\]

Based on (1.13), compute AB if
\[
A = \begin{bmatrix} 1 & 3 & 2 \\ 0 & -1 & 1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 4 & -1 \\ 1 & 2 \\ 3 & 0 \end{bmatrix}.
\]

1.16. Let A = [a_{ij}] ∈ R^{n×n}. Find two n × 1 matrices x and y such that x^T A y = a_{ij}, where 1 ≤ i ≤ n and 1 ≤ j ≤ n.

1.17. Let
\[
A = \begin{bmatrix} 3 & -2 & 7 \\ 6 & 5 & 4 \\ 0 & 4 & 9 \end{bmatrix}, \qquad
B = \begin{bmatrix} 6 & -2 & 4 \\ 0 & 1 & 3 \\ 7 & 7 & 5 \end{bmatrix}.
\]
Compute

(a) tr(3A − 5B^T).      (b) tr(A^2).      (c) tr(AB).

1.18. Show that tr(AAT ) = 0 if and only if A = 0.

1.19. What is M^T for the partitioned matrix
\[
M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}?
\]

1.20. Prove Theorem 1.4 except (c).

1.21. Prove Theorem 1.6.

1.22. Use Theorem 1.8 to compute the inverses of the following matrices.

(a) A = \begin{bmatrix} 3 & 1 \\ 5 & 2 \end{bmatrix}.      (b) B = \begin{bmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{bmatrix}.

1.23. Find \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}^{n}, where n is any positive integer.

1.24. Let A be a square matrix. Show that if A^2 = A and A is invertible, then A = I.

1.25. Let A be a square matrix. Show that if A^4 = 0, then
(I − A)^{-1} = I + A + A^2 + A^3.

1.26. Let A be a square matrix. Show that if A^2 − 3A + 4I = 0, then A + I is invertible. Find (A + I)^{-1}.

1.27. Let A, B ∈ R^{n×n}. Show that if A is invertible and AB = 0, then B = 0.


1.28. Show that P T = P −1 for any permutation matrix P .

1.29. Let A be a square matrix and partitioned as
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
\]
where A_{11} and A_{22} are square matrices. Find a permutation matrix P such that P^T A P = B, where
\[
B = \begin{bmatrix} A_{22} & A_{21} \\ A_{12} & A_{11} \end{bmatrix}.
\]

1.30. Prove Theorem 1.13.



1.31. Express each of the following matrices as a product of elementary matrices.

(a) \begin{bmatrix} 1 & 0 \\ -3 & 2 \end{bmatrix}.      (b) \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.

1.32. Find the inverses of the following matrices.

(a) \begin{bmatrix} 1 & 2 & 0 \\ -1 & 2 & 3 \\ 1 & 3 & 1 \end{bmatrix}.      (b) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 1 & 3 \\ 0 & 1 & 0 & 2 \\ 1 & 2 & 2 & 2 \end{bmatrix}.

1.33. Find a matrix A such that
\[
\begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} A = \begin{bmatrix} 4 & -6 \\ 3 & 1 \end{bmatrix}.
\]

1.34. Let A, B ∈ R^{n×n}. Show that tr(AB) = 0 if A is symmetric and B is skew-symmetric.


1.35. Let A ∈ R^{n×n}. Show that A can be written as A = H + K, where H is a symmetric matrix and K is a skew-symmetric matrix.

Challenge exercises

1.36. Determine the value of λ such that the following linear system has only the trivial solution.
    λx1 + x2 + x3 = 0,
    x1 + λx2 + x3 = 0,
    x1 + x2 + x3 = 0.

1.37. Let A, B ∈ R^{n×n}. If AB = 0, show that for any positive integer k,
tr[(A + B)^k] = tr(A^k) + tr(B^k).

1.38. Show that if A, B, C, D ∈ R^{n×n} such that ABCD = I, then
ABCD = DABC = CDAB = BCDA = I.

1.39. Find the inverses of the following matrices.

(a) \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 2 & \cdots & 2 \\ 1 & 2 & 3 & \cdots & 3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 2 & 3 & \cdots & n \end{bmatrix}.      (b) \begin{bmatrix} 0 & 1 & \cdots & 1 & 1 \\ 1 & 0 & \cdots & 1 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & 1 & \cdots & 0 & 1 \\ 1 & 1 & \cdots & 1 & 0 \end{bmatrix}.

1.40. Find the inverse of the 3 × 3 Vandermonde matrix
\[
\begin{bmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{bmatrix},
\]
where a, b, and c are distinct scalars.

1.41. Let A, B, C, X, Y, Z ∈ R^{n×n}. If A^{-1} and C^{-1} exist, find

(a) \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}^{-1}.      (b) \begin{bmatrix} I & X & Y \\ 0 & I & Z \\ 0 & 0 & I \end{bmatrix}^{-1}.

1.42. Let A, B ∈ R^{n×n}. Show that if A + B is invertible, then
A(A + B)^{-1} B = B(A + B)^{-1} A.

1.43. Let A ∈ R^{n×n}. Show that if AB = BA for all B ∈ R^{n×n}, then A = cI, where c is a scalar.

1.44. Let A be a skew-symmetric matrix. Show that

(a) I − A is invertible.

(b) (I − A)^{-1}(I + A) = (I + A)(I − A)^{-1}.

(c) M^T M = I, where M = (I − A)^{-1}(I + A).

(d) I + M is invertible.

1.45. Let A, B ∈ R^{n×n}. Show that if A^3 = 2I and B = A^3 − 2A + 3I, then B is invertible. Find B^{-1}.
