Physics 116A                                                               Winter 2010
The Matrix Representation of a Three-Dimensional Rotation—Revisited
         In a handout entitled The Matrix Representation of a Three-Dimensional Rotation,
     I provided a derivation of the explicit form of the most general 3 × 3 rotation
     matrix R(n̂, θ), which describes the counterclockwise rotation by an angle θ about a
     fixed axis n̂. For example, the matrix representation of the counterclockwise rotation
     by an angle θ about a fixed z-axis is given by [cf. eq. (7.18) on p. 129 of Boas]:
\[ R(\mathbf{k}, \theta) \equiv \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1} \]
        The general rotation matrix R(n̂, θ) satisfies the following relations:
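\[ R(\hat{n}, \theta + 2\pi k) = R(\hat{n}, \theta), \qquad k = 0, \pm 1, \pm 2, \ldots, \]
\[ [R(\hat{n}, \theta)]^{-1} = R(\hat{n}, -\theta) = R(-\hat{n}, \theta). \tag{2} \]
     Combining these two results, it follows that
\[ R(\hat{n}, 2\pi - \theta) = R(-\hat{n}, \theta), \tag{3} \]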
     which implies that any three-dimensional rotation can be described by a
     counterclockwise rotation by θ about an arbitrary axis n̂, where 0 ≤ θ ≤ π. However, if we
     substitute θ = π in eq. (3), we conclude that
\[ R(\hat{n}, \pi) = R(-\hat{n}, \pi), \tag{4} \]
     which means that for the special case of θ = π, R(n̂, π) and R(−n̂, π) represent
     the same rotation. Finally, if θ = 0, then R(n̂, 0) = I is the identity operator,
     independently of the direction of n̂.
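     The step from eq. (2) to eq. (3) can be spelled out as follows. This is only a sketch
     that uses the two relations in eq. (2), with the periodicity relation taken at k = 1:

```latex
\begin{align*}
R(\hat{n},\, 2\pi - \theta) &= R(\hat{n},\, -\theta)
    && \text{(periodicity with } k = 1\text{)} \\
&= R(-\hat{n},\, \theta)
    && \text{(second relation of eq.~(2))}.
\end{align*}
```

     Read from right to left with θ replaced by 2π − θ, this says that a rotation by an
     angle between π and 2π about n̂ equals a rotation by an angle between 0 and π about −n̂.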
        In these notes, I will provide a much simpler derivation of the explicit form for
     R(n̂, θ), based on the techniques of tensor algebra.
     1. A derivation of the Rodrigues formula
        The matrix elements of R(n̂, θ) will be denoted by R_ij. Since R(n̂, θ) describes a
     rotation by an angle θ about an axis n̂, the formula for R_ij that we seek will depend
     on θ and on the coordinates of n̂ = (n_1, n_2, n_3) with respect to a fixed Cartesian
     coordinate system. Note that since n̂ is a unit vector, it follows that:
\[ n_1^2 + n_2^2 + n_3^2 = 1. \tag{5} \]
    Using the techniques of tensor algebra, we can derive the formula for R_ij in the
following way. We can regard R_ij as the components of a second-rank tensor (see
Appendix A). Likewise, the n_i are components of a vector (equivalently, a first-rank
tensor). Two other important quantities for the analysis are the invariant tensors δ_ij
(the Kronecker delta) and ε_ijk (the Levi-Civita tensor). If we invoke the covariance
of Cartesian tensor equations, then one must be able to express R_ij in terms of a
second-rank tensor composed of n_i, δ_ij and ε_ijk, as there are no other tensors in the
problem that could provide a source of indices. Thus, the form of the formula for R_ij
must be:
\[ R_{ij} = a\,\delta_{ij} + b\,n_i n_j + c\,\epsilon_{ijk} n_k, \tag{6} \]
where there is an implicit sum over the index k in the third term of eq. (6).[1] The
numbers a, b and c are real scalar quantities. As such, a, b and c are functions of θ,
since the rotation angle is the only relevant scalar quantity in this problem.[2] It also
follows that n̂ is an axial vector, in which case eq. (6) is covariant with respect to
transformations between right-handed and left-handed orthonormal coordinate systems.[3]
    We now propose to deduce conditions that are satisfied by a, b and c. The first
condition is obtained by noting that
                                          R(n̂, θ)n̂ = n̂ .
This is clearly true, since R(n̂, θ), when acting on a vector, rotates the vector around
the axis n̂, whereas any vector parallel to the axis of rotation is invariant under the
action of R(n̂, θ). In terms of components,
\[ R_{ij} n_j = n_i. \tag{7} \]
To determine the consequence of this equation, we insert eq. (6) into eq. (7) and make
use of eq. (5). Noting that[4]
\[ \delta_{ij} n_j = n_i, \qquad n_j n_j = 1, \qquad \epsilon_{ijk} n_j n_k = 0, \tag{8} \]
[1] We follow the Einstein summation convention in these notes. That is, there is an implicit sum
over any pair of repeated indices in the present and all subsequent formulae.
[2] One can also construct a scalar by taking the dot product n̂·n̂, but this quantity is equal
to 1 [cf. eq. (5)], since n̂ is a unit vector. Thus, it does not add anything new.
[3] Under inversion of the coordinate system, θ → −θ and n̂ → −n̂. However, since 0 ≤ θ ≤ π (by
convention), we must then use eq. (2) to flip the signs of both θ and n̂ to represent the rotation
R(n̂, θ) in the new coordinate system. Hence, the signs of θ and n̂ effectively do not change under
the inversion of the coordinate system. That is, θ is a scalar and n̂ is an axial (or pseudo-) vector.
[4] In the third equation of eq. (8), there is an implicit sum over j and k. Since ε_ijk = −ε_ikj, when
the sum ε_ijk n_j n_k is carried out, we find that for every positive term, there is an identical negative
term to cancel it. The total sum is therefore equal to zero. This is an example of a very general
rule. Namely, one always finds that the product of two tensor quantities, one symmetric under the
interchange of a pair of summed indices and one antisymmetric under the interchange of a pair of
summed indices, is equal to zero when summed over the two indices. In the present case, n_j n_k is
symmetric under the interchange of j and k, whereas ε_ijk is antisymmetric under the interchange of
j and k. Hence their product, summed over j and k, is equal to zero.
it follows immediately that n_i (a + b) = n_i. Hence,
\[ a + b = 1. \tag{9} \]
Since the formula for R_ij given by eq. (6) must be completely general, it must hold for
any special case. In particular, consider the case where n̂ = k. In this case, eqs. (1)
and (6) yield:
\[ R(\mathbf{k}, \theta)_{11} = \cos\theta = a, \qquad R(\mathbf{k}, \theta)_{12} = -\sin\theta = c\,\epsilon_{123} n_3 = c. \tag{10} \]
Using eqs. (9) and (10) we conclude that
\[ a = \cos\theta, \qquad b = 1 - \cos\theta, \qquad c = -\sin\theta. \tag{11} \]
Inserting these results into eq. (6) yields the Rodrigues formula:
\[ R_{ij}(\hat{n}, \theta) = \cos\theta\,\delta_{ij} + (1 - \cos\theta)\,n_i n_j - \sin\theta\,\epsilon_{ijk} n_k. \tag{12} \]
    We can write the above quantity in 3 × 3 matrix form, although eq. (12) is more
compact and convenient. For completeness, here is the explicit form for the general
3 × 3 rotation matrix:
\[ R(\hat{n}, \theta) = \begin{pmatrix}
\cos\theta + n_1^2(1 - \cos\theta) & n_1 n_2(1 - \cos\theta) - n_3\sin\theta & n_1 n_3(1 - \cos\theta) + n_2\sin\theta \\
n_1 n_2(1 - \cos\theta) + n_3\sin\theta & \cos\theta + n_2^2(1 - \cos\theta) & n_2 n_3(1 - \cos\theta) - n_1\sin\theta \\
n_1 n_3(1 - \cos\theta) - n_2\sin\theta & n_2 n_3(1 - \cos\theta) + n_1\sin\theta & \cos\theta + n_3^2(1 - \cos\theta)
\end{pmatrix}, \tag{13} \]
where n_1² + n_2² + n_3² = 1. Thus, we have reproduced the explicit form for the general
3 × 3 rotation matrix, which was derived in the previous handout, The Matrix
Representation of a Three-Dimensional Rotation.
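    As a numerical illustration, eq. (12) can be evaluated component by component. The
following is a minimal sketch in Python using only the standard library; the function
names levi_civita and rotation_matrix are hypothetical helpers introduced here, the
indices are zero-based (0, 1, 2 in place of 1, 2, 3), and n is assumed to be a unit vector.

```python
import math

def levi_civita(i, j, k):
    """Levi-Civita symbol for zero-based indices i, j, k in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def rotation_matrix(n, theta):
    """R_ij = cos(theta) delta_ij + (1 - cos(theta)) n_i n_j - sin(theta) eps_ijk n_k."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # Contraction eps_ijk n_k, summed over k.
            eps_n = sum(levi_civita(i, j, k) * n[k] for k in range(3))
            R[i][j] = c * delta + (1.0 - c) * n[i] * n[j] - s * eps_n
    return R

# Rotation about the z-axis: should agree with eq. (1) up to roundoff.
R_z = rotation_matrix((0.0, 0.0, 1.0), 0.3)
```

For n = (0, 0, 1) the result can be compared entry by entry with eq. (1), and for a
general unit vector one can check that R n = n, as required by eq. (7).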
2. Determining the rotation axis and the rotation angle
   In Section 3 of the previous handout, The Matrix Representation of a Three-
Dimensional Rotation, I presented an algorithm for obtaining the direction of the
rotation axis n̂ and the rotation angle θ if we are given an arbitrary 3 × 3 rotation
matrix R(n̂, θ). With some tensor algebra manipulations involving the Levi-Civita
tensor, we can use eq. (12) to quickly obtain the desired results.
   First, we compute the trace of R(n̂, θ). In particular, using eq. (12) it follows
that:
                            Tr R(n̂, θ) = Rii = 1 + 2 cos θ                      (14)
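Written out term by term, the contraction behind eq. (14) reads as follows. This sketch
sets i = j in eq. (12) and also uses n_i n_i = 1 from eq. (5):

```latex
\begin{align*}
R_{ii} &= \cos\theta\,\delta_{ii} + (1 - \cos\theta)\,n_i n_i - \sin\theta\,\epsilon_{iik} n_k \\
       &= 3\cos\theta + (1 - \cos\theta) - 0 \\
       &= 1 + 2\cos\theta .
\end{align*}
```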
In deriving this result, we used the fact that δ_ii = Tr I = 3 (since the indices run
over i = 1, 2, 3 in three-dimensional space) and ε_iik = 0 (the latter is a consequence
of the fact that the Levi-Civita tensor is totally antisymmetric under the interchange
of any two indices). By convention, 0 ≤ θ ≤ π, which implies that sin θ ≥ 0. Thus,
\[ \cos\theta = \tfrac{1}{2}(R_{ii} - 1) \quad\text{and}\quad \sin\theta = (1 - \cos^2\theta)^{1/2} = \tfrac{1}{2}\sqrt{(3 - R_{ii})(1 + R_{ii})}, \tag{15} \]
where cos θ is determined from eq. (14). All that remains is to determine the axis of
rotation n̂.
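   Let us multiply eq. (12) by ε_ijm and sum over i and j. Noting that[5]
\[ \epsilon_{ijm}\delta_{ij} = \epsilon_{ijm} n_i n_j = 0, \qquad \epsilon_{ijk}\epsilon_{ijm} = 2\delta_{km}, \tag{16} \]
it follows that
\[ 2 n_m \sin\theta = -R_{ij}\epsilon_{ijm}. \tag{17} \]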
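For reference, the intermediate steps leading to eq. (17) can be sketched as follows,
using only eq. (12) and the identities in eq. (16):

```latex
\begin{align*}
R_{ij}\epsilon_{ijm}
  &= \cos\theta\,\delta_{ij}\epsilon_{ijm}
     + (1 - \cos\theta)\,n_i n_j \epsilon_{ijm}
     - \sin\theta\,\epsilon_{ijk}\epsilon_{ijm} n_k \\
  &= 0 + 0 - 2\sin\theta\,\delta_{km} n_k \\
  &= -2 n_m \sin\theta .
\end{align*}
```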
    If R is a symmetric matrix (i.e. R_ij = R_ji), then R_ij ε_ijm = 0 automatically (since
ε_ijk is antisymmetric under the interchange of the indices i and j). In this case
sin θ = 0 and we must seek other means to determine n̂. If sin θ ≠ 0, then one can
divide both sides of eq. (17) by sin θ. Using eq. (15), we obtain:[6]
\[ n_m = -\frac{R_{ij}\epsilon_{ijm}}{2\sin\theta} = \frac{-R_{ij}\epsilon_{ijm}}{\sqrt{(3 - R_{ii})(1 + R_{ii})}}, \qquad \sin\theta \neq 0. \tag{18} \]
More explicitly,
\[ \hat{n} = \frac{1}{\sqrt{(3 - R_{ii})(1 + R_{ii})}} \bigl( R_{32} - R_{23},\; R_{13} - R_{31},\; R_{21} - R_{12} \bigr), \qquad R_{ii} \neq -1,\ 3. \tag{19} \]
In Appendix B, we verify that n̂ as given by eq. (18) is a vector of unit length [as
required by eq. (5)]. The overall sign of n̂ is fixed by eq. (18) due to our convention
in which sin θ ≥ 0.
   If we multiply eq. (17) by n_m and sum over m, then
\[ \sin\theta = -\tfrac{1}{2}\,\epsilon_{ijm} R_{ij} n_m, \tag{20} \]
after using n_m n_m = 1. This provides an additional check on the determination of the
rotation angle.
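   The generic case sin θ ≠ 0 of this procedure can be sketched in a few lines of Python
(standard library only). The function name axis_angle and the tolerance tol are
assumptions made for this illustration; indices are zero-based, so R[2][1] stands for
R_32, and the input is assumed to be a proper rotation matrix.

```python
import math

def axis_angle(R, tol=1e-12):
    """Return (n, theta) with 0 < theta < pi, following eqs. (15) and (19)."""
    tr = R[0][0] + R[1][1] + R[2][2]
    cos_t = 0.5 * (tr - 1.0)                                # eq. (15)
    # sqrt((3 - Tr R)(1 + Tr R)) = 2 sin(theta); clamp tiny negative roundoff.
    root = math.sqrt(max(0.0, (3.0 - tr) * (1.0 + tr)))
    if root < tol:
        raise ValueError("sin(theta) = 0: eq. (19) does not apply")
    theta = math.atan2(0.5 * root, cos_t)
    n = ((R[2][1] - R[1][2]) / root,                        # eq. (19)
         (R[0][2] - R[2][0]) / root,
         (R[1][0] - R[0][1]) / root)
    return n, theta

# Example input: a counterclockwise rotation by 0.7 about the z-axis, eq. (1).
t = 0.7
R_z = [[math.cos(t), -math.sin(t), 0.0],
       [math.sin(t),  math.cos(t), 0.0],
       [0.0,          0.0,         1.0]]
n, theta = axis_angle(R_z)
```

Using atan2 with both sin θ and cos θ is a design choice of this sketch, not part of the
derivation above; it avoids the loss of precision of an inverse cosine near θ = 0 and θ = π.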
    As noted above, if R is a symmetric matrix (i.e. R_ij = R_ji), then sin θ = 0 and
n̂ cannot be determined from eq. (18). In this case, eq. (14) determines whether
cos θ = +1 or cos θ = −1. If cos θ = +1, then R_ij = δ_ij and the axis n̂ is undefined.
If cos θ = −1, then according to eq. (12), R_ij = 2 n_i n_j − δ_ij, which determines the
direction of n̂ up to an overall sign. That is,
\[ \hat{n} \text{ is undetermined if } \theta = 0, \]
\[ \hat{n} = \Bigl( \epsilon_1 \sqrt{\tfrac{1}{2}(1 + R_{11})},\; \epsilon_2 \sqrt{\tfrac{1}{2}(1 + R_{22})},\; \epsilon_3 \sqrt{\tfrac{1}{2}(1 + R_{33})} \Bigr) \quad \text{if } \theta = \pi, \tag{21} \]
[5] Regarding the first equation of eq. (16), see footnote 4. The second equation of eq. (16)
is given in eq. (5.8) on p. 511 of Boas.
[6] If sin θ = 0, both the numerator and denominator of eq. (18) vanish. We will show below
[cf. eq. (21)] that n̂ is undefined if R_ii = 3, corresponding to the case of R(n̂, 0) = I. When
R_ii = −1, corresponding to the case of R(n̂, π), n̂ can be determined directly from eq. (12).
where the individual signs ε_i = ±1 are determined up to an overall sign via[7]
\[ \epsilon_i \epsilon_j = \frac{R_{ij}}{\sqrt{(1 + R_{ii})(1 + R_{jj})}}, \qquad \text{for fixed } i \neq j, \quad R_{ii} \neq -1, \quad R_{jj} \neq -1. \tag{22} \]
The ambiguity in the overall sign of n̂ is not significant, since R(n̂, π) and
R(−n̂, π) represent the same rotation [cf. eq. (4)].
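    A minimal Python sketch of the θ = π case is given below. The function name
axis_for_pi_rotation is hypothetical, and one choice made here is an assumption of the
sketch rather than part of the text: the component of largest magnitude is taken to be
positive, which fixes the (immaterial) overall sign and avoids dividing by a vanishing
component in eq. (22). The remaining signs then follow from the sign of R_ij = 2 n_i n_j.

```python
import math

def axis_for_pi_rotation(R):
    """Axis of a rotation by pi: |n_i| from eq. (21), relative signs from eq. (22)."""
    # |n_i| = sqrt((1 + R_ii) / 2), no sum over i; clamp tiny negative roundoff.
    mags = [math.sqrt(max(0.0, 0.5 * (1.0 + R[i][i]))) for i in range(3)]
    # Pivot on the largest component and choose its sign to be positive.
    p = max(range(3), key=lambda i: mags[i])
    n = [0.0, 0.0, 0.0]
    n[p] = mags[p]
    for i in range(3):
        if i != p:
            # sign(n_i n_p) = sign(R_ip) for i != p, since R_ip = 2 n_i n_p.
            n[i] = math.copysign(mags[i], R[i][p])
    return tuple(n)

# Example input: R_ij = 2 n_i n_j - delta_ij for n = (2/3, -2/3, 1/3).
n0 = (2.0 / 3.0, -2.0 / 3.0, 1.0 / 3.0)
R_pi = [[2.0 * n0[i] * n0[j] - (1.0 if i == j else 0.0) for j in range(3)]
        for i in range(3)]
n = axis_for_pi_rotation(R_pi)
```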
    One slightly inconvenient feature of the above analysis is that the case of R_ii = −1
(corresponding to θ = π) requires a separate treatment in order to determine n̂.
Moreover, for values of θ very close to π, the numerator and denominator of eq. (19)
are very small, so that a very precise numerical evaluation of both the numerator and
denominator is required to accurately determine the direction of n̂. Thus, we briefly
mention another approach for determining n̂ that can be employed for all possible
values of R_ii (except for R_ii = 3 corresponding to the identity rotation, where n̂ is
not defined). This approach is based directly on the Rodrigues formula [eq. (12)].
Define the matrix
\[ S = R + R^{T} + (1 - R_{ii})\,I. \tag{23} \]
Then, eq. (12) yields S_jk = 2(1 − cos θ) n_j n_k = (3 − R_ii) n_j n_k. Hence,[8]
\[ n_j n_k = \frac{S_{jk}}{3 - R_{ii}}, \qquad R_{ii} \neq 3. \tag{24} \]
Note that for θ close to π (which corresponds to R_ii ≃ −1), neither the numerator
nor the denominator of eq. (24) is particularly small, and the direction of n̂ can be
determined numerically without significant roundoff error.
    To determine n̂ up to an overall sign, we simply set j = k (no sum) in eq. (24),
which fixes the value of n_j²; the relative signs of the components then follow from
the off-diagonal elements S_jk. If sin θ ≠ 0, the overall sign of n̂ is fixed by eq. (17).
If sin θ = 0 there are two cases. For θ = 0 (corresponding to the identity rotation),
the rotation axis n̂ is undefined. For θ = π, the ambiguity in the overall sign of n̂ is
immaterial, in light of eq. (4).
    Thus, we have achieved our goal. Eqs. (15), (19) and (21) [or equivalently,
eqs. (15), (17) and (24)] provide a simple algorithm for determining the rotation
axis n̂ and the rotation angle θ for any rotation matrix R(n̂, θ) ≠ I.
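    The alternative route through eqs. (23), (24) and (17) can be sketched in Python as
follows (standard library only). The function name axis_from_s and the tolerance tol are
assumptions of this illustration. Reading the relative signs off the column of S with
the largest diagonal entry is one possible way to use the off-diagonal part of eq. (24).

```python
import math

def axis_from_s(R, tol=1e-12):
    """Rotation axis from S = R + R^T + (1 - Tr R) I, for any R other than the identity."""
    tr = R[0][0] + R[1][1] + R[2][2]
    if 3.0 - tr < tol:
        raise ValueError("identity rotation: the axis is undefined")
    # Eq. (23), with zero-based indices.
    S = [[R[j][k] + R[k][j] + ((1.0 - tr) if j == k else 0.0) for k in range(3)]
         for j in range(3)]
    # Eq. (24) with j = k: n_j^2 = S_jj / (3 - Tr R). Pivot on the largest one.
    p = max(range(3), key=lambda j: S[j][j])
    n_p = math.sqrt(S[p][p] / (3.0 - tr))
    # Eq. (24) with k = p: n_j = S_jp / ((3 - Tr R) n_p), which fixes relative signs.
    n = [S[j][p] / ((3.0 - tr) * n_p) for j in range(3)]
    # Eq. (17): v = 2 sin(theta) n with sin(theta) >= 0, so n must not oppose v.
    v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    if sum(v[m] * n[m] for m in range(3)) < 0.0:
        n = [-x for x in n]
    return tuple(n)

# Example input: a rotation by 3.0 radians (close to pi) about the z-axis, eq. (1).
t = 3.0
R_z = [[math.cos(t), -math.sin(t), 0.0],
       [math.sin(t),  math.cos(t), 0.0],
       [0.0,          0.0,         1.0]]
n = axis_from_s(R_z)
```

When θ = π exactly, v vanishes and the sign returned is whatever the pivot step
produced, which is harmless in light of eq. (4).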
[7] If R_ii = −1 [no sum over i], then n_i = 0, in which case the corresponding ε_i is not well-defined.
[8] Eq. (23) yields Tr S ≡ S_ii = 3 − R_ii. One can then use eq. (24) to verify that n̂ is a unit vector.
Appendix A: Matrix elements of matrices correspond to the components
of second rank tensors
    In the class handout entitled Coordinates, matrix elements and changes of basis,
we examined how the matrix elements of linear operators change under a change
of basis. Consider the matrix elements of a linear operator with respect to two
different orthonormal bases, B = {ê_1, ê_2, ê_3} and B′ = {ê′_1, ê′_2, ê′_3}. Then, using
the Einstein summation convention,
\[ \hat{e}'_j = P_{ij}\,\hat{e}_i, \]
where P is an orthogonal matrix. Given any linear operator A with matrix elements
a_ij with respect to the basis B, the matrix elements a′_ij with respect to the basis B′
are given by
\[ a'_{k\ell} = (P^{-1})_{ki}\,a_{ij}\,P_{j\ell} = P_{ik}\,a_{ij}\,P_{j\ell}, \]
where we have used the fact that P⁻¹ = Pᵀ in the second step above. Finally,
identifying P = R⁻¹, where R is also an orthogonal matrix, it follows that
\[ a'_{k\ell} = R_{ki} R_{\ell j}\,a_{ij}, \]
which we recognize as the transformation law for the components of a second rank
Cartesian tensor.
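    The equivalence of the two forms of the transformation law can be checked
numerically. The sketch below is an illustration only: the function names and the sample
matrices are assumptions chosen for the example, with P taken to be a rotation about the
z-axis and R = P⁻¹ = Pᵀ.

```python
import math

def transform_with_P(a, P):
    """a'_kl = P_ik a_ij P_jl, i.e. the matrix product P^T a P."""
    return [[sum(P[i][k] * a[i][j] * P[j][l] for i in range(3) for j in range(3))
             for l in range(3)] for k in range(3)]

def transform_with_R(a, R):
    """a'_kl = R_ki R_lj a_ij, the second-rank Cartesian tensor law."""
    return [[sum(R[k][i] * R[l][j] * a[i][j] for i in range(3) for j in range(3))
             for l in range(3)] for k in range(3)]

t = 0.6
P = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t),  math.cos(t), 0.0],
     [0.0,          0.0,         1.0]]
R = [[P[j][i] for j in range(3)] for i in range(3)]   # R = P^{-1} = P^T
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

a_from_P = transform_with_P(a, P)
a_from_R = transform_with_R(a, R)
```

The two results agree entry by entry up to roundoff, and the trace of a is unchanged by
the change of basis, as expected for a scalar built from a second rank tensor.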
Appendix B: Verifying that n̂ as determined from eq. (18) is a unit vector
   We first need some preliminary results. Using the results from the handout entitled
The Characteristic Polynomial, the characteristic polynomial of an arbitrary 3×3 matrix
R is given by:
\[ p(\lambda) = -\bigl[ \lambda^3 - \lambda^2\,\mathrm{Tr}\,R + c_2\,\lambda - \det R \bigr], \]
where
\[ c_2 = \tfrac{1}{2}\bigl[ (\mathrm{Tr}\,R)^2 - \mathrm{Tr}(R^2) \bigr]. \]
For a special orthogonal matrix, det R = 1. Hence,
\[ p(\lambda) = -\Bigl[ \lambda^3 - \lambda^2\,\mathrm{Tr}\,R + \tfrac{1}{2}\lambda\bigl[ (\mathrm{Tr}\,R)^2 - \mathrm{Tr}(R^2) \bigr] - 1 \Bigr]. \]
We now employ the Cayley-Hamilton theorem, which states that a matrix satisfies
its own characteristic equation, i.e. p(R) = 0. Hence,
\[ R^3 - R^2\,\mathrm{Tr}\,R + \tfrac{1}{2}R\bigl[ (\mathrm{Tr}\,R)^2 - \mathrm{Tr}(R^2) \bigr] - I = 0. \]
Multiplying the above equation by R⁻¹, and using the fact that R⁻¹ = Rᵀ for an
orthogonal matrix,
\[ R^2 - R\,\mathrm{Tr}\,R + \tfrac{1}{2}I\bigl[ (\mathrm{Tr}\,R)^2 - \mathrm{Tr}(R^2) \bigr] - R^{T} = 0. \]
Finally, we take the trace of the above equation. Using Tr(Rᵀ) = Tr R, we can solve
for Tr(R²). Using Tr I = 3, the end result is given by:
\[ \mathrm{Tr}(R^2) = (\mathrm{Tr}\,R)^2 - 2\,\mathrm{Tr}\,R, \tag{25} \]
which is satisfied by all 3 × 3 special orthogonal matrices.
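   We now verify that n̂ as determined from eq. (18) is a unit vector. For convenience,
we repeat eq. (18) here:
\[ n_m = -\frac{1}{2}\,\frac{R_{ij}\epsilon_{ijm}}{\sin\theta} = \frac{-R_{ij}\epsilon_{ijm}}{\sqrt{(3 - R_{ii})(1 + R_{ii})}}, \qquad \sin\theta \neq 0, \]
where R_ii ≡ Tr R. We evaluate n̂·n̂ = n_m n_m as follows:
\[ n_m n_m = \frac{R_{ij}\epsilon_{ijm} R_{k\ell}\epsilon_{k\ell m}}{(3 - R_{ii})(1 + R_{ii})} = \frac{R_{ij} R_{k\ell}(\delta_{ik}\delta_{j\ell} - \delta_{i\ell}\delta_{jk})}{(3 - R_{ii})(1 + R_{ii})} = \frac{R_{ij} R_{ij} - R_{ij} R_{ji}}{(3 - R_{ii})(1 + R_{ii})}. \]
The numerator of the above expression is equal to:
\[ \begin{aligned}
R_{ij} R_{ij} - R_{ij} R_{ji} &= \mathrm{Tr}(R^{T} R) - \mathrm{Tr}(R^2) = \mathrm{Tr}\,I - \mathrm{Tr}(R^2) \\
&= 3 - \mathrm{Tr}(R^2) = 3 - (\mathrm{Tr}\,R)^2 + 2\,\mathrm{Tr}\,R = (3 - R_{ii})(1 + R_{ii}),
\end{aligned} \tag{26} \]
after using eq. (25) for Tr(R²). Hence, employing eq. (26) yields
\[ \hat{n}\cdot\hat{n} = n_m n_m = \frac{R_{ij} R_{ij} - R_{ij} R_{ji}}{(3 - R_{ii})(1 + R_{ii})} = 1, \]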
and the proof is complete.
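   Both eq. (25) and eq. (26) can be spot-checked numerically. In the sketch below, the
test matrix is built as a product of two elementary rotations so that the check does not
rely on eq. (12); the helper names and the angles 0.4 and 1.1 are arbitrary choices made
for this illustration.

```python
import math

def matmul(A, B):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def rot_z(t):
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t),  math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def rot_x(t):
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(t), -math.sin(t)],
            [0.0, math.sin(t),  math.cos(t)]]

R = matmul(rot_z(0.4), rot_x(1.1))      # a generic special orthogonal matrix

# Eq. (25): Tr(R^2) = (Tr R)^2 - 2 Tr R.
lhs = trace(matmul(R, R))
rhs = trace(R) ** 2 - 2.0 * trace(R)

# Eq. (26): R_ij R_ij - R_ij R_ji = (3 - Tr R)(1 + Tr R).
numerator = sum(R[i][j] * R[i][j] - R[i][j] * R[j][i]
                for i in range(3) for j in range(3))
denominator = (3.0 - trace(R)) * (1.0 + trace(R))
```

Here lhs and rhs agree up to roundoff, as do numerator and denominator, which is the
statement that n̂·n̂ = 1.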