Intro to Elementary Matrix Theory
Intro to Elementary Matrix Theory
Definition
A matrix A of order (or size) m  n is a rectangular array of mn elements
arranged in m rows and n columns. The element in the i th row and j th
column is called the (i, j ) -entry of the matrix and is denoted by aij .
              a11 a12  a1n 
             a    a22  a2 n 
A  [aij ]    21            
                             
                             
              am1 am 2  amn 
Here, consider matrices whose elements are, in general, complex numbers. Two
matrices A and B are equal when their corresponding elements are equal i.e.,
aij  bij for all i , j (which implies that A and B are of the same order).
Matrix operations:
Addition:          Matrices         are     added    (or   subtracted)   element-wise:
[aij ]  [bij ]  [aij  bij ] , which is of same order.
Matrix Multiplication
An m  n matrix A can be multiplied with an p  q matrix B if and only if
                                         n          
n  p . Their product is defined as AB    aik bkj  .
                                          k 1      
Example
2 3                2 1  3  4     2  (1)  3  6  14 16 
5 7        
                    5 1  7  4     5  (1)  7  6   33 37 
            1  1
         
         4 6                                                   
11 13          11  1  13  4 11  (1)  13  6  63 67 
Matrix multiplication is associative, A( BC )  ( AB)C ; and distributive,
A( B  C )  AB  AC , but not commutative (in general), AB  BA . Indeed, the
product BA may not even be defined, as in the case of the matrices in the above
example.
Example
                              1 0 9
       1 5 1 4              5 3 2 
                  
If A  0 3 1 7  , then A  
                           T            .
                               1 1 5 
       9 2 5 1                   
                               4 7 1
Hermitian transpose
               Symmetric if A  A
                                 T
   (i)
               Skew-symmetric if A   A
                                        T
   (ii)
   (iii)       Hermitian of A  A *
Trace of a matrix
The trace of a square matrix is the sum of all the elements on the main diagonal,
                                               n
and is denoted by trace( A) or tr( A)   aii .
                                              i 1
Example
         9 1 1
If A   3 2 1 , then tr( A)  a11  a22  a33  9  (2)  5  12 .
         4 7 5
Properties
1. ( A  B)T  AT  BT and ( A  B)*  A *  B * for all m  n matrices A and B .
2.  cA  cAT and  cA*  c * A * for all matrices A and scalars c .
           T
           7 17            8 37 
Then AB          and BA        .
           1  34           11 19 
Determinants
For every square matrix A , we can associate a unique real (or complex)
number, called the determinant of A , and denoted by | A | or det( A) , having
certain important properties.
Definition
1. For a 1 1 matrix A  [a11 ] , the determinant is defined to be | A | a11 .
2. For an n  n matrix A  [aij ] (where n  1 ), the determinant is defined as
          n
   | A |  (1) j 1a1 j A(1| j )
          j 1
Example
           1 2
1. Let A        .
            3  4 
               1 2
   Then | A |
               3 4
                  1 4  2  3
                  2 .
                                                     1 3 1
2. Let                                           A   1 1 2                                                .
                                                               
                                                      2 1 2 
                1         3       1
   Then | A | 1 1 2
               2 1 2
                      1       2             1    2             1 1
               1                 3                 1
                      1 2                  2    2             2        1
               11 (2)  2 1  3 (1)  (2)  2  2   1 11  1 2 
4 6 3
1 .
Inverse of a Matrix
For a square matrix A of order n , another matrix B of the same order is said to
be an inverse of A if AB  BA  I , the identity matrix of order n .
A square matrix that has no inverse is said to be singular, and one that has an
inverse is said to be non-singular or invertible.
Theorem
Every invertible matrix has a unique inverse.
Proof
Let A be an invertible matrix, and suppose B1 and B2 are inverses of A .
2 1 2 3 5 4
            1 3 1   4 7 5 1 0 0 
For, AA   1 1 2   2 4 3   0 1 0  .
         1
2 1 2 3 5 4 0 0 1
Properties
   
         1
1. A1         A for all invertible matrices A .
              
         1          T
3. AT          A1 for all invertible matrices A .
Definition
The cofactor Cij of A corresponding to the (i, j ) -entry is defined as
Cij  (1)i j M ij , where M ij is the minor.
The cofactor matrix C of A is the matrix with (i, j ) -entry as the
cofactor Cij , for every i , j .
The adjoint (or adjugate) of A , denoted by adj A is the transpose of the
cofactor matrix C .
The inverse of a non-singular matrix A is given by the formula
        1
A1        adj A
      | A|
Example
         1 0 2 
Let A   2 3 1  .
         1 2 1 
                             1 1 1 
The cofactor matrix is C   4 1 2  and the determinant is | A | 3 .
                             6 3 3 
                      1 4 6 
Thus, adj A  C T   1 1 3  .
                      1 2 3 
                  1 4 6 
                1 1 3  .
        adj A  1
 A1 
         | A| 3            
                  1 2 3 
Special Matrices
A matrix A  [aij ] be an n  n (real or complex) matrix is said to be
1. diagonal if aij  0 for all i  j .
2. tridiagonal if aij  0 for all i , j with | i  j | 1 , i.e., all the entries of A are
   zero except (possibly) for those on the main diagonal
   ( a11, a22 ann ), the superdiagonal ( a12 , a23 an1,n ), and the subdiagonal
   ( a21, a32 an,n1 ).
3. upper-triangular if aij  0 for i  j , i.e., all entries below the main
   diagonal are zero.
4. lower-triangular if aij  0 for i  j , i.e., all entries above the main diagonal
   are zero.
5. orthogonal if all its entries are real and AT  A1 , i.e., AAT  AT A  I .
6. unitary if it has complex entries and A*  A1 , i.e., AA*  A * A  I .
Exercises
1. Classify the following matrices as symmetric, skew-symmetric, Hermitian, or
   none of these:
       1 3
   a.         
       3 4 
      ex      eix 
  b.  ix             
     e       e  x 
      0 1        1
  c. 1 0        1
                     
      1 1       0 
        2      1 i      3
  d. 1  i       0       i 
                             
        3       i      1
        0 1 1
  e.  1 0 1 
        1 1 0 
       1 1 1
  f.  1 1 1
       1 1 1
         2 1 i 3 
  g. 1  i   i     i 
                       
         3   i 1
                                      x 2  9 x 2  2 x  1
2. Find the value of x such that A                         is
                                      4  2 x       0      
   a. symmetric
   b. skew-symmetric
3. Show that every skew-symmetric matrix A  [aij ] of order n has all main
  diagonal entries zero, i.e., a11  a22   ann  0 .
4. Show that every Hermitian matrix has only real numbers along the main
   diagonal.
5. Find the determinant of the following:
      1 1
   a.       
      3 3 
      0 1 1 
  b. 1 0 1 
      1 1 0 
       ex      eix 
   c.  ix             
      e       e  x 
        cos sin 
   d.               
         sin cos 
      e x e2 x 
   e.  x         
      e 2e 
               2x
        1 2 2 
   f.  1 3 0 
        0 2 1 
6. Determine which of the following are invertible and find the inverse:
       1 1
   a.       
       3 3 
         1 1
   b.          
         3 3 
       0 1 1 
   c. 1 0 1 
       1 1 0 
         0 1 1
   d.  1 0 1 
         1 1 0 
         cos sin 
   e.                
          sin cos 
      1 x         x2 
                     
   f. 0 1         x
      0 0         1 
      
       1      a b
   g. 0      1 c  , where a , b , c are any three complex numbers.
                    
       0     0 1 
        a     0 0
   h.  0     b 0
                    
         0   0 c 
7. Show that the inverse of every 3  3 upper-triangular matrices with non-zero
   diagonal entries is also upper triangular.
   Hint: If the i th row of A is multiplied by a non-zero constant p , then the i th
   column of A1 gets divided by p . Use this property to reduce A to a matrix
   of the same form as that in Exercise 6g.
8. Find the inverse of a general n  n diagonal matrix with non-zero diagonal
   entries.
9. Classify the following matrices as orthogonal, unitary, or neither:
      1 0 
   a.       
      0 1 
       0    0 1
   b. 0    1 0
                  
       1   0 0 
      1     1
   c. 
       1   0 
        1 1 
   d.          
        1 1
       1           1      
             2        2 
   e. 
          1          1 
            2         2 
      1            i     
             2        2  
   f.                    
      
          1        i 
              2         2
         cos sin 
   g.                   
          sin cos 
1. In each row, the first non-zero entry (called the leading entry) should be1.
2. The leading entry of every row is to the left of the leading entry, if any, of all
    rows below it.
Note that zero rows (i.e., rows with all entries 0) are allowed, but the second
condition implies that all zero rows must occur below all non-zero rows (i.e.,
rows with at least one non-zero entry).
Example
Among the matrices given below, A and C are in row-reduced echelon form,
while B and D are not (why?).
    1 0 2 1                     1 0 2 1
A  0 0 1 4                  B  0 0 0 0 
                                           
    0 0 0 0                    0 0 1 4 
    1 2 0 0 1 5                1 2 0 0 1 5
C  0 1 0 4 3 3             D  0 1 0 4 3 3
                                               
    0 0 0 1 1 2                0 1 0 5 4 1
The following three operations performed on a matrix are called elementary row
operations:
1. Interchange: Interchange any two rows. Ri  R j denotes interchanging of
   the i th and j th rows
2. Scaling: Multiply every entry of a row by the same non-zero scalar. Ri  kRi
   denotes multiplying the i th row by the scalar k .
3. Row Addition: Add a multiple of one row of the matrix to another row.
    Ri  Ri  kR j denotes adding k times the j th row, to the i th row.
We can use elementary row operations to transform a matrix into its row-
reduced echelon form.
Example
     4 8 0 0 4 20 
A   3 5 0 4 6 12 
                     
     2 6 0 9 5 14 
         1 2 0 0 1 5  R1  R1 / 4
      ~  3 5 0 4 6 12 
                         
         2 6 0 9 5 14 
        1 2 0 0 1 5 R2  R2  3R1
      ~ 0 1 0 4 3 3 R3  R3  2 R1
                       
        0 2 0 9 7 4 
        1 2 0 0 1 5 R3  R3  2 R2
      ~ 0 1 0 4 3 3
                      
        0 0 0 1 1 2 
Example
     4 8 0 0 4 20 
A   3 5 0 4 6 12 
                     
     2 6 0 9 5 14 
         4     0 0 0 0    0  C2  C2  2C1
      ~  3    1 0 4 3   3 C5  C5  C1
                              
          2   2 0 9 7   4  C6  C6  5C1
        4      0 0 0 0   0  C4  C4  4C2
      ~  3    1 0 0 0   0  C5  C5  3C2
                           
         2    2 0 1 1   2  C6  C6  3C2
         4 0 0 0 0 0  C5  C5  C4
      ~  3 1 0 0 0 0 C6  C6  2C4
                       
         2 2 0 1 0 0 
          4 0 0 0 0 0  C3  C4
      ~  3 1 0 0 0 0 
                       
          2 2 1 0 0 0
Elementary Matrices
All the elementary row and column operations can be defined in terms of
multiplication by special matrices called elementary matrices.
Definition
An elementary matrix E is a square matrix that generates an elementary
row operation on a matrix A under the multiplication EA .
Example:
         0 1 0            1 2 3
Let E  1 0 0  ,    A  4 5 6 .
                                    
         0 0 1          7 8 9 
     0 1 0  1 2 3  4         5 6
EA  1 0 0   4 5 6   1      2 3                                        .
                                  
     0 0 1  7 8 9  7    8 9 
Thus, left-multiplication by E   has the effect of interchanging the first and
second rows of A .
                           1 2 3  0       1 0  2 1 3
Now,                  AE   4 5 6  1      0 0  5 4 6                   .
                                                        
                           7 8 9  0    0 1  8 7 9 
Right-multiplication by E , therefore,      interchanging the first and second
columns of A .
              0 0 1  1 2 3 4  9 10 11 12 
Then, E1 A  0 1 0 5 6 7 8   5 6 7 8  .
              1 0 0  9 10 11 12  1 2 3 4 
      1 0 0
E2   5 1 0 
              
      0 0 1 
             1 0 0  1 2 3 4   1 2 3          4 
Then E2 A   5 1 0  5 6 7 8   0 4 8 12 .
                                    
             0 0 1  9 10 11 12 9 10 11 12 
However, E1 cannot be used for interchanging the first and third columns of A ,
as it is not possible to right-multiply A having order 3  4 by E1 having order
3  3 . Therefore, we need to obtain another elementary matrix, say E3 , from the
identity matrix of order 4 by interchanging the first and third columns (or
equivalently, rows).
           0   0 1 0
           0   1 0 0
Thus, E3           .
           1   0 0 0
                    
           0   0 0 1
                            0     0 1 0
            1 2 3 4                    3 2 1 4
                                   1 0 0 
Then, AE3  5 6 7 8                   7 6 5 8
                              0
                          1     0 0 0            
            9 10 11 12               
                                         11 10 9 12 
                            0     0 0 1
Definition
The row rank of a matrix is the number of non-zero rows in the echelon
form of that matrix.
                     1     2 4 2 
                     0     0 1 2 
Example: The rank of                is 3, since this is in the echelon form of
                     0     0 0 1
                                  
                     0     0 0 0
                       2 1 3                           1 0 1  R1  R2
                      1 0 1                              2 1 3 
Example           A                                   ~        
                       0 2 1                            0 2 1
                                                                
                      1 1 4                             1 1 4 
 1 0 1  R2  R2  2 R1
 0 1 1  R  R  R
~        4     4    1
 0 2 1
        
 0 1 3 
 1 0      1  R3  R3  2 R2
 0 1     1  R4  R4  R2
~           
 0 0      1
            
 0 0      4
 1   0   1
 0   1   1 R2   R2
~           
 0   0   1  R4  R4  4 R3
            
 0   0   0
As there are three non-zero rows in the row-reduced echelon form of A , the row
rank     of    A     is   3.   Now,       we    find   the     column     rank.
      2 1 3 
     1 0 1 
 A             
      0 2 1
                
     1 1 4 
              1   2    3  C1  C2
             0     1    1
            ~              
             2     0   1
                           
             1     1    4
              1   0   0  C2  C2  2C1
             0     1   1  C3  C3  3C1
            ~            
             2     4   5
                         
             1     3   7
              1   0   0  C3  C3  C2
             0     1   0
            ~            
             2     4   1
                         
             1     3   4
                      1 0 2 1 0 0  R1
Solution: [A | I] =  2 1 3 0 1 0  R 2
                      4 1 8 0 0 1  R 3
           1 0 2 1 0 0 
[A | I] = 0 1 1 2 1 0 
           0 1 0 4 0 1 
          1 0 2 1 0 0 
        0 1 1 2 1 0 
          0 0 1 6 1 1 
         1 0 0 11 2 2 
       0 1 0 4 0 1
         0 0 1 6 1 1
         1 0 0 11 2 2 
       0 1 0 4 0 1  .
         0 0 1 6 1 1
                     11 2 2 
Therefore A-1 =  4     0 1.
                               
                     6 1 1
                                                             4    1   2
Example: Find the inverse of the matrix A =                  0    1   0
                                                             
                                                             8   4   5
Solution:
Writing the given matrix side by side with the unit matrix of order 3, we have
                   4 1 2 1 0 0  R1
       [A | I] = 0 1 0 0 1 0  R 2
                  8 4 5 0 0 1  R 3
                                                      R1
Perform the row transformations: R1                     and R 3  R 3  2R1
                                                      4
              1        1       1                
          1                            0       0
              4        2       4                
                                                
[A | I] =                                       
          0   1        0       0       1       0
                                                
                                                
          0    2       1       2      0       1
                           1       1                 
          1       1                        0        0
                  4        2       4                 
                                                     
[A | I] =                                            
          0       1        0       0       1        0
                                                     
                                                     
          0       0        1    2         2       1
                                                       1
Perform the row transformations: R1  R1  R 2
                                                       4
                   1       1           1      
          1   0                             0
                   2       4           4      
                                              
[A | I] =                                     
          0   1       0       0        1     0
                                              
                                              
          0   0    1      2       2        1
                                                       1
Perform the row transformations: R1  R1  R 3
                                                       2
                          5       3         1
          1   0    0                        
                          4       4         2
                                             
[A | I] =                                     = [ I | B]
          0   1    0       0      1         0
                                             
                                             
          0   0   1       2      2        1
                                                                    5     3      
                                                                               1
                                                                    4     4     2
                                                                                 
   Hence the inverse of the given matrix is B = A1 =                            .
                                                                    0     1     0
                                                                                 
                                                                                 
                                                                    2   2     1
                                      1           0        1 
           1 1 0                     2                    2
   1. A   2 2 1      Ans. A1   1          0       1 
                                       2                     2
           1 1 0                 0           1        2 
                                                              
                                      1           0        1 
           1 1 0                     2                    2
   2. A  1 1 1       Ans. A1   1          0       1 
                                       2                     2
           1 1 0                 0           1        1 
                                                              
           1 3 1                  4 7 5
   3. A   1 1 2  Ans. A1   2 4 3 
            2 1 2              3 5 4 
             1 1 1               1 1 1 
             
   4. A  1 0 1      Ans. A   1 1 0 
                              1
             1 1 0             1 0 1
            5 8 1                  5 11 6 
   5. A   0 2 1  Ans A1   4 9        5 
                                                  
             4 3 1              8 17 10 
                         1    4   2   9
The rank of the matrix   0    1   0   0 is 2, since two rows are non-zero.
                         
                         0   0   0   0
Problem: Find the rank of the following matrix using elementary row
transformations.
    2    1   3 1
A =(4    0   2 6).
    0    1   2 2
     2      1     3 1
A  (0     2    4 4).
     0      1     2 2
R3 R3 + (1/2)R3,
       2       1     3 1
 A  (0      2    4 4).
       0       0     0 0
Rank(A) = 2
                           1 2 3 2 
Example: Find the rank of  2 3 5 1        using elementary row transformation.
                           1 3 4 5  3  4
      1 2 3 2  R1
A =  2 3 5 1  R 2      [Firstly we use the leading entry in the first row 1 to make the
     1 3 4 5  R 3
                                          1 2 3 2
Perform,     R 2  R 2  2 R1, then A  0 1 1 3
             R 3  R 3  R1              0 1 1 3
                                    1 2 3 2 
Perform, R 2  (1) R 2   then A  0 1 1 3 
                                    0 1 1 3 
                                      1 2 3 2 
Perform,    R3  R3  R 2   then A  0 1 1 3 
                                      0 0 0 0 
The above matrix is in the echelon form having two non-zero rows. Hence the rank of A is 2.
                                           1    2   3
Example: Determine the rank of the matrix (1    4   2).
                                           2    6   5
                                      1   2    3
                                     (0   2   1)
                                      0   2   1
                      1      2    3
Apply R3  R3 - R1  (0      2   1).
                      0      0    0
Exercises
       1     2 0 
   1. 1     1 1
                           Ans. Rank = 2
        2   1 1 
        1    1 0
   2.  2    2 1 
                           Ans. Rank = 3
        1   1 0 
        1 1 1 1 
   3. 1 1 1 2  Ans. Rank = 2
        3 1 3 4 
         1 3 1 1
         1 1 2 2 
   4.                Ans. Rank = 4
         2 1 2 1 
                    
         1 2 2 2 
       1 1 1 
   5. 1 0 1        Ans. Rank = 3
       1 1 0 
Linear Equations-Consistency:
Definitions: A system of m linear equations in n unknowns x1, x2, , xn is a set of
equations of the form
... ...
The aij are given numbers, which are called the coefficients of the system. The bi are
also given numbers.
If the bi are all zero, then (*) is called a homogeneous system. If at least one bi is not
zero, then (*) is called a non-homogeneous system.
A solution of (*) is a set of numbers x1, x2, .., xn which satisfy all the m equations. If
the system (*) is homogeneous, it has at least one trivial solution x1 = 0, x2 = 0 xn =
0.
From the definition of matrix multiplication, we can write the above system (*) as
AX = B
             x1                    b1 
            x                     b 
        X =  2     and B =         2 
                                   
              n 1                  m 1
            xn                     b m 
is called the augmented matrix. The matrix A B determines the system (*)
Note: The matrix equation AX = B need not always have a solution. It may have no
solution or a unique solution or an infinite number of solutions.
    solution.
iii) if rank A = rank  A B < number of unknowns, then system has an infinite
number of solutions.
        a1 x + b1y + c1z = d1
        a2x + b2y + c2z = d2                         (i)
                                                              a2 
We eliminate x from the second equation by subtracting            times the first equation
                                                              a1 
from the second equation. Similarly we eliminate x from the third equation by
               a3 
eliminating        times the first equation from the third equation. We thus, get the
               a1 
new system
a1 x + b 1 y + c 1 z = d1
Here the first equation is called the pivotal equation and a1 is called the first pivot
element.
                                                                  b '3 
We eliminate y from the third equation of (ii) by subtracting           times the second
                                                                       
                                                                  b '2 
equation from the third equation. We thus, get the new system
a1 x + b 1 y + c 1 z = d 1
c''3 z d ' 3
Here the second equation is the pivotal equation and b'2 is the new pivot element.
This also can be derived from the augmented matrix of the system (i),
             a1     b1   c1   d1 
[A|B] =     a       b2   c2   d 2                   (iv)
             2
             a 3   b3   c3   d1 
                   a1 b1          c1     d1 
That is., [A|B] ~  0 b '2        c '2   d'2         ..(v)
                    0 0          c'3    d ' 3 
a1 x + b 1 y + c 1 z = d1
c3 ''z d3 ''
By back substitution we get z, y and x constituting the exact solution of the system (i).
Note: The method will fail if one of the pivot elements a1, b '2 or c3 '' vanishes. In
such cases the method can be modified by rearranging the rows so that the pivots are
non-zero. If this is impossible, then the matrix is singular and the equations have no
solution.
                           5 3 7          4
Solution: We have           3 26 2        9 
                           
                           7 2 10        5
Apply R1  3R1, R2  5R2
  15   9 21         12 
  15 130 10         45
  
   7  2 10          5
Apply R2 R2 - R1
 15   9 21          12 
  0 121 11         33
 
  7  2 10           5
 35 21 49 28
  0 11 1 3
              
 35 10 50 25
 5 3 7       4
 0 11 1     3
 
 0 0 0      0 
Therefore the rank of the coefficient matrix and the augmented matrix both are
2. Hence the equations are consistent. The given system is equivalent to
5x + 3y + 7z = 4, 11y - z = 3.
                                       1 1 5      1
Solution: The augmented matrix is      2 4 0      12 
                                       
                                        5 1 1   10 
Apply R2  R2 - 2R1, R3  R3 -5R1
  1 1     5        1
  
 0 2 10          14 
  0 6 24        15
Apply R2 (1/2)R2
    1 1     5 1
   0   1 5 7 
    
    0 6 24 15
Apply R3 R3 + 6R2
    1 1    5 1
   0 1 5 7 
                 
    0 0 54 57 
Apply R3 (-1/54)R3
 1 1 5       1
 0 1 5       7 
 
 0 0 1 1.0556 
y- z + 2 = 0
       x + y - 2z = 0
Answer: Rank (A) = Rank (A:B) = 2 < the number of unknowns = 3. Therefore the
system is consistent, and have infinite number of solutions, which involve n  r = 3 
2 = 1 constant (s).
Gauss Jordon Method:
Example: Balance the following chemical equation with the help of Gauss
Jordan Method
6x + 0y + 0z - 2w = 0
0x + 2y - 2z - w = 0
              1       
 1    0   0         0
               3
                      
 0            7
       1   0         0
              6       
              2       
 0    0   1         0
              3       
                     
 We get x = (1/3)k; y = (7/6)k; z = (2/3)k; w = k. Take k = 6. Thus we get
the balanced chemical equation as follows:
2C2H6 + 7O2  4CO2 + 6H2O.
Solution:
Writing the given matrix side by side with the unit matrix of order 3, we have
                            4    1         2       1       0   0 R1
          [A | I ] =        0    1         0       0       1   0 R2
                            
                            8   4         5       0       0   1 R3
                                                                  
    1           1          1           1               0        0
                4          2           4                           R1
                                                                  
~   0           1          0           0               1        0 R2
                                                                  
                                                                  
    0          0          1         2                2       1 R3  R3  2 R2
                                                           1        
    1           0          1           1                         0 R1  R1  1 R2
                            2           4                   4                   4
                                                                    
~                                                                   
    0            1          0          0                   1      0 R2
                                                                    
    0          0           1        2                2         1 R3
                           5   3           1          1
     1    0      0                           R  R1  R3
                           4   4           2 1        2
                                             
~    0    1      0         0    1          0 R2
                                             
                                             
     0   0      1    2       2          1 R3
               = [ I | B]
Hence the inverse of the given matrix is
                        5       3           
                                         1
                        4       4         2
      B = A1 =                             
                        0       1         0
                                            
                                            
                        2    2         1
                3   1 2                 x                   3
where A =        2 3 1 ,         X   y     and   B   3
                        
                 1 2 1                 z                4 
Now AX = B  X = A1 B                                            .. (ii)
Writing the given matrix side by side with the unit matrix of order 3, we have
                  3 1 2 1 0 0  R1
      [A | I] =  2 3 1 0 1 0  R 2
                 1 2 1 0 0 1  R 3
Continue row transformations (please refer the earlier lesson for inverse
computation), we obtain
                  1          3          5
                
                   8          8          8
                                           
A1 =           
                   3          1          7
                  8          8          8
                                           
                  7          5          11 
                                      
                 8          8          8 
                          1            3          5
                          8           8          8
                 x                                   3
                  y    3           1          7     3
                    8                8          8     
                  z   7                             4 
                                     
                                        5
                                                  
                                                  11
                          8           8         8 
                          3   9                20 
                         8  8                 8 
                                                                 1
             =            9  3               28 
                                                                  2
                         8     8                8                 
                                                                  1
                          21   15                44 
                                  
                          8    8                8 
                 x               1
                  y             2
                                 
                  z            1
                         3 1             3 1
ITERATIVE METHODS:
We shall now describe the iterative or indirect methods, which start from an approximation to
the true solution and if convergent, derive a sequence of closer approximations  the cycle of
computations being repeated till the required accuracy is obtained. This means that in a direct
method the amount of computation is fixed, while in an iterative method the amount of
computation depends on the accuracy required. In the case of matrices with a large number of
zero elements, it will be advantageous to use iterative methods which preserve these
elements.
Jacobis Method:
Let the system be given by
         a11 x1  a12 x2  a13 x3  ...  a1n xn  b1 
                                                          
         a21 x1  a22 x2  a23 x3  ...  a2 n xn  b2 
         a31 x1  a32 x2  a33 x3  ...  a3n xn  b1 
                                                          
                                                 .        
                                                 .        
                                                          
         a n1x 2 +a n2 x 2 +a n3 x 3 +...+a nn x n = b n 
In which the diagonal elements aii do not vanish. If this is not the case, then the equations
should be rearranged so that this condition is satisfied. Now, we rewrite the system above as
                b1 a12      a             a
         x1           x2  13 x3  ...  1n xn
                a11 a11     a11           a11
                b2 a21      a             a
         x2           x1  23 x2  ...  2 n xn
                a22 a22     a22           a22
         .
         .
                bn an1      a              a
         xn           x1  n 2 x2  ...  n , n 1 xn 1
                ann ann     ann             ann
Let x1(0) , x2(0) ,....., xn(0) be the initial approximation to the solution.
If x1( k ) , x2( k ) ,....., xn( k ) denotes the approximation to the solution at kth iteration, then the solution
at (k +1)th iteration is given by
                        b1 a12 ( k ) a13 ( k )     a
         x1( k 1)            x2      x3  ...  1n xn ( k )
                        a11 a11      a11           a11
                        b2 a21 ( k ) a23 ( k )     a
         x2( k 1)            x1      x2  ...  2 n xn ( k )
                        a22 a22      a22           a22
         .
         .
                        bn an1 ( k ) an 2 ( k )    a
         xn ( k 1)           x1      x2  ...  n ,n 1 xn 1( k )
                        ann ann      ann            ann
This method is due to Jacobi and is called the method of simultaneous displacements.
Gauss Seidel Method:
In Jacobis method, to obtain the values at the (k+1)th iteration, only kthiteration values are
used. A simple modification to the above method sometimes yields faster convergence and
isdescribed below.
Here, along with the kth iteration values, all available (k + 1)thiteration values are used. It is
clear, therefore, that this method uses an improved component as soon as it is available and it
is called the method of successive displacements, or the Gauss  Seidel method.
The Jacobi and Gauss  Seidel methods converge, for any choice of the first approximation
xj(0) (j= 1, 2, , n), if every equation of the system satisfies the condition that sum of the
absolute values of the coefficients aij/aii is almost equal to, or in at least one equation less than
unity, i.e. provided that
            n
          
        j = 1, j1
                     a ij / a ii  1,  i =1, 2,..., n 
where the < sign should be valid in the case of at least one equation. A matrix A
satisfying the above condition is known as a diagonally dominant matrix. It can be shown
that the Gauss  Seidel method converges twice as fast as the Jacobi method. The working of
the methods is illustrated in the following examples.
                           1
           y =                  18  3x  z                           (2)
                           20
                           1
           z =                 25  2x  3y            
                           20
                         18
         y = y (1) =              =  0.9
                         20
                         25
         z = z(1) =               = 1.25,
                         20
where (x(1), y(1), z(1)) = (0.85,  0.9, 1.25) are the first approximated values.
Putting these values on the right hand sides of the equations (2) we obtain
         x = x (2) =      1    [17  (0.9) + 2  1.25]   =      1.0200
                         20
y = y(3) = 0.9954
z = z(3) = 1.0032
The values in the 5th and 6th iterations being practically the same, we can stop
the iterations. Hence the solution is
Now using equation (1), the second approximation is computed from the
equations as
Substituting equation (2) into equation (3), we get the second approximation as
                           x (2)  0.9976
                                          
                           y   1.2339 
                              (2)   
                                                          .........(4)
                           (2)           
                                             
                          z        1.7424 
                                 
Proceed in a similar way, we get the third, fourth and fifth approximations to the
required solution and they are tabulated as below.
Taking x 2  x (0)
               2  0 , x 3  x 3  0 in (i), we get x1 = 0.85
                               (0)                   (1)
           1
  2 
x (1)         [18  3  0.85 + 0 ] =  1.0275
           20
           1
x 3(1)       [ 25  2  0.85 + 3  (1.0275)] = 1.0109
           20
Second iteration:
                      1
           x1(2)        [17  3 (1.0275) + 2  1.0109] = 1.0025
                      20
                      1
             2 
           x (2)         [18  3  (1.0025) + 1.0109] =  0.9998
                      20
                      1
           x 3(2)       [25  2  1.0025 + 3  (0.9998)] = 0.9998
                      20
                                                        2   1.0000 , x 3  1.0000
                                                                         (3)
Similarly in third iteration, we get x1(3)  1.0000 , x (3)
The values in the 2nd and 3rd iterations being practically the same, we can stop
the iterations. Hence the solution is x1 = 1, x2 = 1, x3 = 1
Example: Find the solution of the following system of equations
                                               1   1   1
                                            x y z 
                                               4   4   2
                                             1       1   1
                                             xy w 
                                             4       4   2
                                             1       1   1
                                             xz w 
                                             4       4   2
                                             1   1       1
                                             y zw 
                                             4   4       2
using Gauss-Seidel method and perform the first five iterations.
Solution: The given system of equations can be rewritten as
                             x = 0.5 + 0.25y + 0.25z
                             y = 0.5 + 0.25x + 0.25w             ...     (1)
                             z = 0.25 + 0.25x + 0.25w
                             w = 0.25 + 0.25y + 0.25z
Taking the initial approximation as y = z = 0 on the right side of the equation
(1) we get x(1) = 0.5.
Taking z = 0 and w = 0 and the current value of x, we get
y(1) = 0.5 + (0.25)(0.5) + 0 = 0.625 from the second equation of (1).
Now taking w = 0 and the current value of x, we get
z(3) = 0.25 + (0.25) (0.5) + 0 = 0.375 from the third equation of (1).
Lastly, using the current values of y and z, the fourth equation of (1) gives
w(1) = 0.25 + (0.25)(0.625) + (0.25)(0.375) = 0.5.
The Gauss-Seidal iterations for the given set of equations can be written as
                          x(r+1) = 0.5 + 0.25y(r) + 0.25z(r)
                         y(r+1) = 0.5 + 0.25x(r+1) + 0.25w(r)           ...    (1)
                         z(r+1) = 0.25 + 0.25x(r+1) + 0.25w(r)
                         w(r+1) = 0.25 + 0.25y(r+1) + 0.25z(r+1)
Now,    by    Gauss-Seidel    procedure,     the   second       and     the    subsequent
approximations can be obtained and the sequence of the first five
approximations is tabulated below.
          Iteration number (n)               Variables
                                    x       y         z                  w
                     1             0.5    0.625    0.375                0.5
                     2             0.75  0.8125 0.5625                0.59375
                     3           0.84375 0.85938 0.60938              0.61719
                     4           0.86719 0.87110 0.62110              0.62305
                     5           0.87305 0.87402 0.62402              0.62451
Eigenvalues and Eigenvectors
Introduction
Ax x (1)
Clearly, x  0 is one solution of (1) for any  . But we are looking for vectors
x  0 which satisfy (1).
Ax x ( A I ) x 0
Where I is the nth order identity matrix. The determinant of this matrix ( A   I )
equated to zero,
Where the ki s are expressible in terms of the elements aij of the matrix A .
The vectors x  0 which satisfy (1) are called the Eigenvectors or latent vectors
of A . The determinant A   I , which is a polynomial in  , is called as the
characteristic polynomial of A .
Which will have a non-trivial solution if and only if the coefficient matrix is
singular, i.e, if A   I  0 .
                                                                               5 4 
Example: Find the eigenvalues and eigenvectors of A =     .
                                                      1 2 
       5  4
i.e,            0 or  2  7  6  0
        1  2
or ( 6)( 1) 0
6,1
                                                   1    4   x1 
Corresponding to   6 , we have                                       0 which gives only one
                                 1                      4  x2 
dependent equation  x1  4 x2  0 .
    x1 x2
        giving the eigenvector (4,1) .
    4 1
                                                   4 4  x 
Corresponding to   1 , we have      
                                           1
                                               0 which gives only one dependent
                                 1 1   x2 
equation x1  x2  0 .
    x1 x2
        giving the eigenvector (1, 1) .
    1 1
Example 3: Find all eigen values and the corresponding eigen vectors of the
matrix
                  8 6 2 
            A =  6 7 4 
                  2 4 3
                                                                        0 0 
The characteristic equation of A is A  I = 0 where I =                      
                                                                       0  0  .
                                                                       0 0  
           8        6        2
That is,    6     7         4        = 0.
             2        4       3
Expanding we get
 (8  ) [(7) (3) 16] + 6 [6 (3) + 8] + 2 [24 2 (7) ] = 0
3 + 18 2 45 = 0
( 3) ( 15) = 0
, then we have
                         8   6   2               x     0
            (A  I) X =  6 7   4 
                                                      y   0  .
                                                             
                          2   4 3               z  0
That is,    (8  ) x              6y    +     2z   = 0
                  6x + (7)y                 4z   = 0
                 2x                4y    + (3  )z = 0
Case I: Let  = 0 we have
                8x  6y + 2z = 0                                          (i)
             6x + 7y  4z = 0                                            (ii)
                2x  4y + 3z = 0                                        (iii)
              x        y          z
                             
           24  14   32  12   56  36
            x   y     z    x   y   z
                       or      
           10   20   20    1   2   2
Therefore (x, y, z) are proportional to (1, 2, 2) and we can write x = 1k, y = 2k,
z = 2k where k ( 0) is an arbitrary constant. However it is enough to keep the
values of (x, y, z) in the simplest form x = 1, y = 2, z = 2 (putting k = 1). These
values satisfy all the equations simultaneously.
                                                                                     1
                                                                                     
Thus the eigen vector X1 corresponding to the eigen value  = 0 is                   2   .
                                                                                     
                                                                                     2
                                                                                     
                          x   y    z
      That is,                  
                         16   8   16
                 x   y    z
      That is,         
                 2   1    2
            2
Thus X2 =  1 is the eigen vector corresponding to  = 3.
            2 
            
6x 8y 4z = 0 (viii)
2x 4y 12z = 0 (ix)
                 x y z
      That is,      
                 2 2 1
            2
Thus X3 =  2  is the eigen vector corresponding to  = 15.
            1
            
                                                      1         2      2
Therefore  = 0, 3, 15 are the eigen values of A and  2  ,           
                                                                  1 and  2  are
                                                       2        2     1
                                                                       
 3 1 -1
 2 2 -1
         
 2 2 0 
                      3   1   1
               2           2 1 = 0
                2           2 
3 52 + 8 4 = 0.
= 1, 2, 2
value 1 is . Then (A I) = 0
    3 1 1 1 0 0    1 
                        
   2 2 1  0 1 0    2  = 0
                    
    2 2 0  0 0 1    
                     3 
   2 1 1  1   0 
  2 1 1  2  =  0  .
                      
   2 2 1 3   0 
                  
21 + 2 - 3 = 0 ..(i)
21 + 2 - 3 = 0 ...(ii)
21 + 22 - 3 = 0 ..(iii)
21 + 22 - 3 = 0 .(iii)
21 + 2 - 3 = 0 ..(i)
------------------------
        2 = 0
-----------------------
21 - 3 = 0
    
 1 = 3.
  1   2
If 1 = 1, then we get
1 = 1, 2 = 0, 3 = 2.
= (1, 2, 3) = (1, 0, 2)
    1 
 =  0  is the characteristic vector corresponding to the characteristic value1.
     
     2 
    3 1 1  2 0 0    1 
                         
   2 2 1   0 2 0    2  = 0
                     
    2 2 0   0 0 2   
                      3 
  1 1 1  1   0 
  2 0 1  2  =  0 
                       
   2 2 2  3   0 
       1 + 2 - 3 = 0 ..(i)
      21 - 3 = 0 .(ii)
3 = 21
21 + 22 23 = 0 .(iii)
(or 1 + 2 - 3 = 0) 3 = 1 + 2
Now 21 = 1 + 2 1 = 2.
So 1 = 2 and 3 = 21
If 1 = 1, then 2 = 1 and 3 = 2.
         1 
Thus  =  1  is a characteristic vector corresponding to the characteristic value
          
          2 
2.
Example: Find all the eigen roots and the eigen vector corresponding to the
                                     6 2 2 
least eigen root of the matrix A =  2 3 1 .
                                     2 1 3
               6    2     2
This implies    2 3      1  0.
                 2    1 3  
    2 1 1      2 0 1
a)  1 1 2  b)  0 2 0 
              
    1 2 1   1 0 2 
Properties of Eigenvalues
1. Any square matrix A and its transpose A ' have the same eigenvalues.
2. The eigenvalues of a triangular matrix are just the diagonal elements of the
   matrix.
4. The sum of the eigenvalues of a matrix is the sum of the elements of the
   principal diagonal.
                                                            1
7. If  is an eigenvalue of an orthogonal matrix, then          is also its eigenvalue.
                                                           
8. If 1, 2 ,....n are the eigenvalues of a matrix A , then Am has the eigenvalues
   1m , 2m ,.....n m where m is a positive integer.
Cayley-Hamilton Theorem
i.e if the characteristic equation for the nth order square matrix A is
            1
 A1        (1)n An1  k1 An2  ....  kn 1I  .
            kn 
A3 + k1 A2 + k2 A + k3 I = O
                  A2 + k1 A + k2 I + k3 A1 = O, since A1A = I, IA = A
                        1
Therefore A1 =           [ A2 + k1 A + k2 I ].
                        k3
On expanding, we get 3 32 -8 + 2 = 0.
Applying Cayley-Hamilton theorem, A3 3A2 -8A + 2I = O.
Post multiplying with A-1 we have,
                                               1 2
A2-3A-8I +2A-1 = O. This implies A-1 =           (A  3A  8I) (2)
                                               2
              0 1 2 0 1 2          7 4 5
Now A = AA =  1 2 3    1 2 3  = 11 8 11 
         2                     
              3 1 1 3 1 1           4 6 10 
                                            
Therefore
       1 1 1
   1 
A =  8 6 2  .
 -1
   2
       5 3 1
Example: Verify Cayley  Hamilton theorem and compute the inverse of the
matrix
                    1 1 1
               A =  3 2 3 
                     1 1 2 
          1   1            1
 A  I   3   2           3    = 0
               1     1     2
           1 1 1 1 1 1          5 4 6
A = A. A =  3 2 3   3 2 3    12 10 15 
 2                            
            1 1 2   1 1 2   6 5 8 
                   1 1 1  5 4 6                  23 19 29    
 3
A = A.A =  2        3 2 3   12 10 15            57 47 72    
                                                             
                    1 1 2   6 5 8            29 24 37   
L.H.S. of A3 5 A2 + A + I becomes,
 23 19 27         5 4 6     1 1 1    1 0 0 
57 47 72   5    12 10 15  3 2 3   0 1 0 
                                             
 29 24 37       6 5 8   1 1 2  0 0 1 
         23  25  1  1 19  20  1  0 29  30  1  0 
     = 57  60  3  0 47  50  2  1 72  75  3  0 
         29  30  1  0 24  25  1  0 37  40  2  1 
        0 0 0 
     = 0 0 0  = O
        0 0 0 
Therefore A3  5A2 + A + I = 0
Hence Cayley  Hamilton theorem is verified.
To find the inverse of A
We have A3  5A2 + A + I = O. Post multiplying by A1 we have
       A2 5A + I + A1 = O
Therefore A1 = A2 + 5A  I
                 5 4 6                1 1 1    1 0 0                                 1 1 1
A    1
           =  12 10 15  5         3 2 3    0 1 0 
                                                         
                                                                           A     1
                                                                                       =  3 1 0 
                 6 5 8              1 1 2  0 0 1                               1 0 1
                                                                                   1     2
Example: Using Cayley-Hamilton theorem, find A8 , if A       .
                                                          2 1
                                                                              1    2
Solution: The characteristic equation of A is given by                                     0 or
                                                                               2   1  
2  5  0 .
A8  ( A2 )4  ( A2  5I  5I )4
    ( A2  5I )4  4( A2  5I )3 5I  6( A2  5I )2 (5I ) 2  4( A2  5I )(5I )3  (5I ) 4
By Cayley-Hamilton theorem, A2 5I 0 .
A8 (5I )4 625I .
                 2 2 1
Example: If A  1 3 1  , then find the eigenvalues of A1 and hence, find its
                 1 2 2 
determinant.
5,1,1 .
                                              1
If  is an eigenvalue of A , then                 is an eigenvalue of A1 .
                                              
                                                     1
Therefore, the eigenvalues of A1 are ,1,1 .
                                                     5
          1
 A1      .
          5
Problems:
1. Verify Cayley-Hamilton theorem for the following matrices and find its
              2 1 1          7 2 2       3 2 4
inverse: a)  1 2 1    b)  6 1 2  c)  4 3 2 
                                        
              1 1 2        6 2 1      2 4 3 
            2 1 2 
2. If A   1 2 1 , find A4 .
            1 1 2 
                                                   2 3 4
3. Find the eigenvalues of A  2 A  I where A   0 4 2 .
                                   2
0 0 3
We start with a column vector x which is as near the solution as possible and
evaluate Ax , which is written as  (1) x(1) after normalisation, that is, we obtain
 (1) from Ax which is largest in magnitude. Hence, the resulting vector x (1) will
have largest component unity. This gives the first approximation  (1) to the
largest eigenvalue and x (1) to the eigenvector. Similarly, we evaluate
 Ax(1)   (2) x(2) which gives the second approximation. We repeat this process till
 x(r)  x( r 1) becomes negligible. Then,  ( r ) will be the largest eigenvalue of A and
 x ( r ) , the corresponding eigenvector.
This iterative method to find the dominant eigenvalue of a matrix is known as
Rayleighs power method.
For the initial approximation, we can take the vector x  [1 0 0]' , if it is not
explicitly provided.
Note: To find the smallest eigenvalue of A , apply the Rayleighs power method
                                                             1
to A1 and obtain its largest eigenvalue. Since                  is an eigenvalue of A1 if  is
                                                             
an eigenvalue of A , the resultant largest eigenvalue of A1 is the smallest
eigenvalue of A .
Example: Determine the largest eigen value and the corresponding eigen vector
                    2 0 1
of the matrix A =  0 2 0  by power method taking the initial vector as [1, 1,
                    1 0 2 
1]t.
                                              1
                           (0)            t    
Solution: Given X = (1, 1, 1) =               1   is the initial eigen vector
                                              1
                                               
                2 0 1 1   2         1 
AX     (0)
             =  0 2 0 0  0   2  0  = (1) X(1)
                                  
                1 0 2  0  1  0.5
                                    2
( Since 2 is the largest value in  0  )
                                   1
                                    
                2 0 1  1   2.5           1 
AX     (1)
             =  0 2 0  0    0   2  5  0  = (2) X(2)
                                    
                1 0 2  0.5  2      0.8
                2 0 1  1   2.8            1 
AX     (2)
             =  0 2 0  0    0   2  8  0  = (3) X(3)
                                      
         2 0 1  1   2.98            1 
AX(4) =  0 2 0  0    0   2  98  0  = (5) X(5)
                                 
         1 0 2  0.98  2.96   0.99 
              2 0 1  1   2.99           1 
AX  (5)
           =  0 2 0  0    0   2.99  0  = (6) X(6)
                                      
              1 0 2  0.99   2.98 0.997 
              2 0 1  1   2.997            1 
AX  (6)
           =  0 2 0  0    0   2.997  0  = (7) X(7).
                                        
              1 0 2 0.997   2.994  0.999
Therefore we can conclude that the largest eigen value is approximately 3 and
Example: Determine the largest eigen value and the corresponding eigen vector
                    6 2 2
of the matrix A =  2 3 1 by power method taking the initial vector as
                    2 1 3
[1, 1, 1]t.
                                            1
                        (0)             t    
Solution: Given X = (1, 1, 1) =             1   is the initial eigen vector
                                            1
                                             
              6 2 2  1     6             1 
AX   (0)
           =  2 3 1 1  0   6
                                           0  = (1) X(1)
                                                     
              2 1 3 1  4         0.67 
                                                                   6
                                ( Since 6 is the largest value in  0  )
                                                                    4
                                                                    
            6 2 2  1          7.34         1.0
AX (1)
         =  2 3 1  0    2.67   7.34  0.36 = (2) X(2)
                                    
            6 2 2  1           7.82          1.0
AX (2)
         =  2 3 1  0.36   3.63  7.82  0.46 = (3) X(3)
                                     
            2 1 3  0.55   4.01        0.51
            6 2 2         1   7.94         1.0
AX (3)
         =  2 3 1 0.45  3.89  7.94 0.69 = (4) X(4)
                                    
            2 1 3  0.51  4.01       0.51
            6 2 2         1   7.98            1
AX (4)
         =  2 3 1  0.49   3.97   7.98  0.5 = (5) X(5)
                                    
            2 1 3  0.51  3.99          0.5
            6 2 2   1        8        1
AX (5)
         =  2 3 1  0.5   4  8  0.5 = (6) X(6).
                                
            2 1 3  0.5  4      0.5
Therefore the largest eigen value is = (6) = 8 and the corresponding eigen
Problems:
1. Find the largest eigenvalue and the corresponding eigenvector of the matrix
1 3 2 
 4 4 1 using the initial approximation 1 0 0 ' . Carry out 4 iterations.
                                            
 6 3 5