INVERTIBLE MATRICES
1. The inverse of a square matrix
Definition 1.1. We say that a square n × n matrix A is invertible or regular if
there exists an n × n matrix B such that
AB = In and BA = In .
The matrix B is then unique. It is called the inverse of A and is denoted by A−1 .
Proof. We need to prove the unicity of B for this definition to make sense. Suppose
we have two matrices B and C such that:
AB = In , BA = In and AC = In , CA = In .
Then, since by associativity we have C(AB) = (CA)B, we obtain
(CA)B = In B = B = C(AB) = CIn = C
thus B = C. □
Example 1.2. The 2 × 2 matrix
( )
3 0
A=
0 1
is invertible and ( )
1
−1 3 0
A = .
0 1
Indeed, we can easily check that AA−1 = I2 and A−1 A = I2 . However, the following
matrix: ( )
5 2
B=
0 0
is not invertible. Indeed, suppose there existed a matrix C such that BC = I2 with
( )
a b
C= .
c d
Then we would have
( ) ( )
5a + 2c 5b + 2d 1 0
BC = = .
0 0 0 1
But looking at the last entry of the second row, this would imply 0 = 1, which is
impossible.
Proposition 1.3. The inversion of matrices satisfies the following important prop-
erties.
• It is involutive: (A−1 )−1 = A
• It is anti-commutative: (AB)−1 = B −1 A−1
for all n × n invertible matrices A and B.
1
Proof. The first property is obvious by the definition of the inverse. The second
property follows from the two equations:
(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 In B = B −1 B = In
(AB)(B −1 A−1 ) = A(BB −1 )A−1 = AIn A−1 = AA−1 = In
□
Remark 1.4. If A is an n × n invertible matrix and λ is a nonzero scalar, it is also
clear that the inverse of the matrix λA is the matrix: λ1 A−1 .
2. Finding the inverse by direct computation
Given a square matrix A, we would like now to answer the following two impor-
tant questions:
(1) Is the matrix A invertible or not?
(2) If A is invertible, how can we compute the inverse A−1 ?
The first method is to do it by direct computation. We take a n × n matrix B
with unknown coefficients and we solve the matrix equation:
AB = In
If we come up with a contradiction, it means that B cannot be computed and that
A is thus not invertible. If we find a solution for B, we check that
BA = AB = In
and we deduce from this that B is the inverse of A. The matrix B has n2 coefficients,
so this method amounts to solving n systems of linear equations with n variables
each. It is a very basic, but not a very efficient method. We give a few examples
below when n = 3.
Example 2.1.
1 −2 2
A= 1 −1 3
2 −4 5
Let B be the matrix
a d g
B= b e h
c f i
and suppose that AB = I3 , that is
a − 2b + 2c d − 2e + 2f g − 2h + 2i 1 0 0
a − b + 3c d − e + 3f g − h + 3i = 0 1 0 .
2a − 4b + 5c 2d − 4e + 5f 2g − 4h + 5i 0 0 1
We obtain three systems of linear equations in three variables each:
a − 2b + 2c = 1 d − 2e + 2f = 0 g − 2h + 2i = 0
a − b + 3c = 0 d − e + 3f = 1 g − h + 3i = 0
2a − 4b + 5c = 0 2d − 4e + 5f = 0 2g − 4h + 5i = 0
2
which we solve by row reduction. Here, all three systems have a unique solution,
and we can compute all the coefficients a, b, c, d, e, f, g, h, i. We obtain:
7 2 −4
B = 1 1 −1 .
−2 0 1
Now we check by direct computation that:
AB = I3 and BA = I3 .
From all this we conclude that A is invertible and that:
7 2 −4
A−1 = 1 1 −1 .
−2 0 1
Example 2.2.
1 2 3
A = −1 0 −2
0 2 1
Let B be the matrix
a d g
B= b e h
c f i
and suppose that AB = I3 , that is
a + 2b + 3c d + 2e + 3f g + 2h + 3i 1 0 0
−a − 2c −d − 2f −g − 2i = 0 1 0 .
2b + c 2e + f 2h + i 0 0 1
We obtain three systems of linear equations in three variables each:
a + 2b + 3c = 1 d + 2e + 3f = 0 g + 2h + 3i = 0
−a − 2c = 0 −d − 2f = 1 −g − 2i = 0
2b + c = 0 2e + f = 0 2h + i = 0
which we solve by row reduction. But here, we see that the first system becomes:
a + 2b + 3c = 1 a + 2b + 3c = 1 a + 2b + 3c = 1
−a − 2c = 0 ⇐⇒ 2b + c = 1 ⇐⇒ 2b + c = 1
2b + c = 0 2b + c = 0 0 = −1
which is impossible. Therefore, the existence of a matrix B such that AB = I3
would imply a contradiction. Such a B cannot exist. The matrix A is not invertible.
3. The coefficient matrix of a system of linear equations
Let A be a square n × n matrix. In this section, we will investigate the set of
solutions of the system of linear equations:
−
→
A−→x = b
−
→
where b is an n-dimensional column vector with coordinates b1 , b2 , ..., bn . We will
see that the situation is very different when A is invertible and when A is not
invertible.
3
Theorem 3.1. Let A be an n × n matrix. The following three conditions are
equivalent:
(1) The matrix A is invertible.
−
→
(2) For each n-dimensional column vector b , the system of linear equations
−
→
A−
→
x = b
has a unique solution.
(3) The reduced row-echelon form of A is the unit matrix In .
Proof. • First we prove that 1 =⇒ 2. Suppose that A is invertible and fix
−
→
a column vector b . Then
−
→ −
→ −
→
A−→x = b =⇒ A−1 A− →x = A−1 b =⇒ − →
x = A−1 b
since A−1 A = I and I −
n
→
x =−
n
→
x . Furthermore,
−
→ −
→ −
→ −
→
x = A−1 b =⇒ A− →
x = AA−1 b =⇒ A− →
x = b
−
→ − →
since AA−1 = In and In b = b . Therefore we have proved that the system
−
→ −
→
of linear equations A−
→
x = b has a unique solution, which is −
→
x = A−1 b .
−
→
• Next we prove that 2 =⇒ 3. Take the zero vector for b . The homogeneous
−
→ −
→
system A− →
x = 0 has a unique solution, which is necessarily − →
x = 0 . Let
B be the augmented matrix of this system. The first n columns of B are
the columns of A, and the last column of B is the zero column. Now we
transform B by elementary row operations into its reduced row-echelon
form B ′ . Notice that the last column doesn’t change during this process: it
stays equal to the zero column. The system of linear equations associated
with B ′ has only one solution, so it is necessarily:
x1 = 0
x2 = 0
...
xn = 0
B ′ is necessarily equal to:
1 0 0 ... 0 0 0
0 1 0 ... 0 0 0
B′ = ... ... ... ... ... ... ... .
... ... ... ... ... ... ...
0 0 0 ... 0 1 0
The first n columns of B ′ are exactly the columns of In and the last column
of B ′ is the zero column. B ′ is row equivalent to B , so In is row equivalent
to A.
• Now, we prove that 3 =⇒ 2 =⇒ 1. Suppose that A is row equivalent
−
→
to the identity matrix In . Let b be a column vector and let B be the
−
→
augmented matrix of the system A− →
x = b . We perform on B the elemen-
tary row operations that reduce A to In and we obtain a matrix B ′ . It is
4
then clear that the first n columns of B ′ are the columns of In and the last
column of B ′ is of the type:
c1,1 b1 + c1,2 b2 + ... + c1,n bn
c2,1 b1 + c2,2 b2 + ... + c2,n bn
...
cn,1 b1 + cn,2 b2 + ... + cn,n bn
where the numbers ci,j are real coefficients, for 1 ≤ i ≤ n and 1 ≤ j ≤ n.
B ′ is clearly the reduced row-echelon form of B and the system of linear
−
→
equations A− →
x = b has a unique solution which is:
c1,1 b1 + c1,2 b2 + ... + c1,n bn
−
→ c2,1 b1 + c2,2 b2 + ... + c2,n bn
x = ...
.
cn,1 b1 + cn,2 b2 + ... + cn,n bn
Now let C be the matrix
c1,1 c1,2 ... ...
c1,n
c2,1 c2,2 c2,n
... ...
C= .
... ... ... ...
cn,1 cn,2 ... ...
cn,n
−
→
We have proved that for each column vector b , the system of linear equa-
−
→
tions A−→
x = b has a unique solution, which is
−
→ −
→
x =Cb.
−
→
So for all n-dimensional vectors b , we get:
−
→ − →
AC b = b .
from which we can conclude:
AC = In .
−
→
Furthermore, for any column vector − →
y , if we set b = A−
→
y , then the system
−
→ −
→
of linear equations A x = b has a unique solution which is −
−
→ →x =− →
y =Cb.
So for all vectors −
→
y in Rn , we get:
CA−
→
y =−
→
y
and therefore we can prove that:
CA = In .
The matrix A is invertible and A−1 = C.
□
We had stated that an n × n matrix A is invertible if there exists an n × n matrix
B such that both conditions:
AB = In and BA = In
are satisfied. However, in practice, the reader may have noticed that both conditions
are not necessary: once one of the two conditions is satisfied, the other one is
also automatically satisfied. We will prove this fact in the following proposition:
actually, we do not need to check both conditions, only one of the two conditions
is sufficient.
5
Proposition 3.2. Let A and B be two n × n matrices. If either AB = In or
BA = In then A is invertible and B = A−1 .
Proof. Suppose BA = In and let − →
x be a solution of the homogeneous system
−
→ − →
A x = 0. Then we have BA x = B 0 = 0 and BA−
−
→ −
→ →
x = In −
→
x =− →x which implies
−
→ −
→ −
→
that x = 0 . So the homogeneous system A x = 0 has a unique solution −
−
→ −
→ →
x = 0.
By the previous discussion, this implies that the reduced row-echelon form of A is
In so A is invertible. Furthermore,
BA = In =⇒ BAA−1 = In A−1 =⇒ B = A−1 .
If AB = In we reverse the roles of A and B: B is invertible and B −1 = A which is
equivalent to A−1 = B. □
Now, we would like to deal with the case where the matrix A is not invertible.
The following theorem is an analog of Theorem 3.1, rephrased for non-invertible
matrices.
Theorem 3.3. Let A be an n × n matrix. The following three conditions are
equivalent:
(1) The matrix A is not invertible.
−
→
(2) For each n-dimensional column vector b , the system of linear equations
−
→
A−→x = b
has either no solution or infinitely many solutions.
(3) In any row-echelon form of the matrix A, the bottom row is zero.
Proof. • First, we will prove that 1 =⇒ 3. Suppose that A is not invertible
and consider the homogeneous system
−
→
A−→
x = 0.
By the results of the previous section, this system has infinitely many solu-
tions. Let B be the augmented matrix of this system: the first n columns
of B are the columns of A and the last column of B is the zero column.
We transform this B, by elementary row operations, into its reduced row-
echelon form B ′ . Notice that the last column stays constantly equal to zero
during this process. The system of linear equations associated with B ′ has
infinitely many solutions. So there are k free variables, for some k ≥ 1, and
n − k leading variables. There is one leading variable on each row, from row
1 to row n − k. The rows below row n − k do not have a leading variable:
they are all zero rows. Since n − k ≤ n − 1, there is at least one zero row,
the bottom one.
• Now, we will prove that 3 =⇒ 2. Suppose that in all row-echelon forms
−
→
of A the bottom row is zero. Let b be a column vector and let B be the
augmented matrix of the system
−→
A−→
x = b.
We transform B by elementary row operations into its reduced row-echelon
form B ′ . Now the first n columns of B ′ form a row-echelon form of A, so
6
their bottom k rows are zero for some integer k ≥ 1. Then the bottom k
rows of B ′ are necessarily of the type:
(0, 0, ..., 0, αi )
−
→
for some constant αi depending on b . If αi ̸= 0 for some i then the system
is inconsistent: it has no solution. If αi = 0 for all i with n − k + 1 ≤ i ≤ n,
then the system is consistent and it has k free variables: it has infinitely
many solutions.
• Finally, 2 =⇒ 1 follows directly from the result of Theorem 3.1. So we
are done.
□
4. Finding the inverse by solving a system of linear equations
This gives us a new method to compute the inverse of a matrix A: solve the
system
−
→
A−
→
x = b
with the coefficients of
b1
b2
→
−
b =
...
...
bn
seen as parameters.
More precisely, two cases can occur:
−
→
• When A is invertible, the system A−
→
x = b has a unique solution which is
−
→ −
→
x = A−1 b .
By solving the system, at the end, we will get the entries of the matrix
A−1 = C in the formula:
c1,1 b1 + c1,2 b2 + ... + c1,n bn
−
→ c2,1 b1 + c2,2 b2 + ... + c2,n bn
x = ...
cn,1 b1 + cn,2 b2 + ... + cn,n bn
as in the proof of Theorem 3.1.
−
→
• When A is not invertible, when we solve the system A− →
x = b , at some
point the bottom row of the augmented matrix will become
(0, 0, ..., 0, αn )
where αn is a linear combination of the coefficients b1 , b2 , ..., bn . The system
has either no solution or infinitely many solutions, depending on the vector
−
→
b . By Theorem 3.3, we can stop here and conclude that A is not invertible.
7
Example 4.1 (Example 2.1).
1 −2 2
A= 1 −1 3
2 −4 5
−
→
We solve the system A−
→
x = b with the coefficients of
b
−
→ 1
b = b2
b3
seen as parameters. By Gaussian elimination:
1 −2 2 b1 1 −2 2 b1 1 −2 0 5b1 − 2b3
1 −1 3 b2 ∼ 0 1 1 b2 − b1 ∼ 0 1 0 b1 + b2 − b3
2 −4 5 b3 0 0 1 b3 − 2b1 0 0 1 b3 − 2b1
1 0 0 7b1 + 2b2 − 4b3
∼ 0 1 0 b1 + b2 − b3
0 0 1 −2b1 + b3
Therefore, we get a unique solution
7b1 + 2b2 − 4b3
−
→ −
→
x = A−1 b = b1 + b2 − b3
−2b1 + 0b2 + b3
and thus the coefficients of the matrix
7 2 −4
A−1 = 1 1 −1 .
−2 0 1
We see in this example that this method is not very simple and not very effective.
However, it is in some way essential, because it captures the real meaning of the
−
→
inverse of a matrix. When A is invertible, the equation A−→x = b is equivalent to
−
→ −
→ −
→
the equation −→x = A−1 b . That is, if −→
x is transformed into b by A, then b is
transformed into − →
x by A−1 and conversely.
Example 4.2 (Example 2.2).
1 2 3
A = −1 0 −2
0 2 1
We proceed similarly to the previous example, and:
1 2 3 b1 1 2 3 b1 1 2 3 b1
−1 0 −2 b2 ∼ 0 2 1 b1 + b2 ∼ 0 2 1 b1 + b2
0 2 1 b3 0 2 1 b3 0 0 0 b3 − b1 − b2
At this point, the last row has become:
(0, 0, 0, α3 )
with α3 = b3 − b1 − b2 . The system has either no solution or infinitely many
solutions: A is not invertible.
8
5. Finding the inverse by row reduction
In this section, we want to discuss a new way to compute the inverse of an n × n
matrix A. This is also probably the quickest and the most effective way to do
it. This method is heavily based on the method used in the previous section, but
−
→
instead of using the vectors b and their coordinates b1 , b2 , ..., bn , we perform the
elementary row operations directly on the matrices A and In .
Definition 5.1. An elementary matrix is a square n × n matrix obtained by
performing one elementary row operation on the identity matrix In .
There are three types of elementary matrices, corresponding to the three ele-
mentary row operations.
(1) Transpositions or exchanges are elementary matrices obtained by ex-
changing two rows of In .
(2) Dilatations are elementary matrices obtained by multiplying one row of
In by a constant k ̸= 0.
(3) Transvections are elementary matrices obtained by adding to one row of
In another row multiplied by a constant k.
Example 5.2. Here are a few examples of elementary matrices when n = 3.
0 1 0 1 0 0 1 0 0
1 0 0 0 3 0 0 1 0
0 0 1 0 0 1 0 5 1
The first one is a transposition: it is obtained from I3 by exchanging row 1 and row
2. The second one is a dilatation: row 2 of I3 is multiplied by 3. The third one is
a transvection: we added row 2 multiplied by 5 to row 3.
−
→
Proposition 5.3. Let b be an n-dimensional column vector. Performing one ele-
−
→ −
→ −
→
mentary row operation on b is equivalent to replacing b by the vector E b which
−
→
is obtained by multiplying b on the left by the corresponding n × n elementary
matrix E.
−
→
Proof. If E is a transposition that exchanges row i and row j, then row i of E b is
−
→ −
→
bj and row j of E b is bi . The other rows of b remain the same. If E is a dilatation
−
→
that multiplies row i by a constant k ̸= 0, then row i of E b is kbi . The other rows
−
→
of b remain unchanged. If E is a transvection that adds to row i another row j
−
→ −
→
multiplied by a constant k, then row i of E b is bi + kbj . The other rows of b
remain unchanged, which completes the proof. □
Because of this, we see that instead of solving the system of linear equations
−
→
A−
→
x = b
−
→
where the coefficients of b are considered as parameters, and perform the ele-
mentary row operations on the usual augmented matrix B, we can consider the
following ”super-augmented” matrix with n rows and 2n columns:
(A In )
9
obtained by writing the matrices A and In side by side, and perform the elementary
row operations directly on this matrix.
If A is invertible, after a finite number of operations, we will obtain the matrix
(In A−1 )
whose the last n columns are exactly the inverse A−1 . If A is not invertible, at
one point the bottom row of the first n columns will become the zero row. At that
point, we can stop the process and conclude that A is not invertible, without going
all the way to computing the reduced row-echelon form of A.
Example 5.4 (Examples 2.1 and 4.1 ).
1 −2 2
A = 1 −1 3
2 −4 5
We obtain:
1 −2 2 1 0 0 1 −2 2 1 0 0
1 −1 3 0 1 0 ∼ 0 1 1 −1 1 0
2 −4 5 0 0 1 0 0 1 −2 0 1
1 −2 0 5 0 −2 1 0 0 7 2 −4
∼ 0 1 0 1 1 −1 ∼ 0 1 0 1 1 −1
0 0 1 −2 0 1 0 0 1 −2 0 1
and so
7 2 −4
A−1 = 1 1 −1 .
−2 0 1
Note that in this case, we have performed 5 operations, that is we multiplied on
the left by the following elementary matrices:
1 0 0 1 0 0 1 0 0
E1 = −1 1 0 E2 = 0 1 0 E3 = 0 1 −1
0 0 1 −2 0 1 0 0 1
1 0 −2 1 2 0
E4 = 0 1 0 E5 = 0 1 0
0 0 1 0 0 1
which happen to be only transvections. The reader can check that I3 is equal to
the product of matrices:
I3 = E5 E4 E3 E2 E1 A
and that
A−1 = E5 E4 E3 E2 E1 .
Notice the similarities between examples 4.1 and 5.4. Actually, what we are doing
in both cases amounts to the same thing. Only, by performing the operations on
−
→
the matrix I3 instead of performing them on the column vector b , we are avoiding
writing all over again the coefficients b1 , b2 and b3 and this makes our computations
more simple and more elegant.
In the following example, we see what happens when A is not invertible.
10
Example 5.5 (Examples 2.2 and 4.2).
1 2 3
A = −1 0 −2
0 2 1
We obtain:
1 2 3 1 0 0 1 2 3 1 0 0 1 2 3 1 0 0
−1 0 −2 0 1 0 ∼ 0 2 1 1 1 0 ∼ 0 2 1 1 1 0
0 2 1 0 0 1 0 2 1 0 0 1 0 0 0 −1 −1 1
Here, the bottom row of the first three columns has become zero, so we can conclude
that A is not invertible.
This method for computing the inverse of a matrix A is very efficient when A is
a larger matrix. We give a last example with a 4 × 4 matrix.
Example 5.6.
1 −1 0 2
2 1 1 1
A=
−1
1 3 0
0 2 2 −1
We obtain:
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
2 1 1 1 0 1 0 0 3 1 −3 −2 1 0 0
∼ 0
−1 1 3 0 0 0 1 0 0 0 3 2 1 0 1 0
0 2 2 −1 0 0 0 1 0 2 2 −1 0 0 0 1
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
0 1 −1 −2 −2 1 0 −1 1 −1 −2 −2 1 0 −1
∼ ∼ 0
0 0 3 2 1 0 1 0 0 0 3 2 1 0 1 0
0 2 2 −1 0 0 0 1 0 0 4 3 4 −2 0 3
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
0 1 −1 −2 −2 1 0 −1 1 −1 −2 −2 0 −1
∼ ∼ 0 1
0 0 3 2 1 0 1 0 0 0 1 1 3 −2 −1 3
0 0 1 1 3 −2 −1 3 0 0 3 2 1 0 1 0
1 −1 0 2 1 0 0 0 1 −1 0 2 1 0 0 0
0 1 −1 −2 −2 1 0 −1 1 −1 −2 −2 0 −1
∼ ∼ 0 1
0 0 1 1 3 −2 −1 3 0 0 1 1 3 −2 −1 3
0 0 0 −1 −8 6 4 −9 0 0 0 1 8 −6 −4 9
1 −1 0 0 −15 12 8 −18 1 −1 0 0 −15 12 8 −18
0 1 −1 0 14 −11 −8 17 9 −7 −5 11
∼ ∼ 0 1 0 0
0 0 1 0 −5 4 3 −6 0 0 1 0 −5 4 3 −6
0 0 0 1 8 −6 −4 9 0 0 0 1 8 −6 −4 9
1 0 0 0 −6 5 3 −7
0 1 0 0 9 −7 −5 11
∼ 0 0 1 0 −5
4 3 −6
0 0 0 1 8 −6 −4 9
11
and it follows that:
−6 5 3 −7
9 −7 −5 11
A−1 = −5
.
4 3 −6
8 −6 −4 9
Notice that, because of the size of the matrix, this example would have been quite
defficult to solve using the previous methods.
12