Group Actions Explained
Group Actions Explained
KEITH CONRAD
1. Introduction
The groups Sn , An , and (for n ≥ 3) Dn behave, by their definitions, as permutations
on certain sets. The groups Sn and An both permute the set {1, 2, . . . , n} and Dn can be
considered as a group of permutations of a regular n-gon, or even just of its n vertices, since
rigid motions of the vertices determine where the rest of the n-gon goes. If we label the
vertices of the n-gon in a definite manner by the numbers from 1 to n then we can view Dn
as a subgroup of Sn . For instance, the labeling of the square below lets us regard the 90
degree counterclockwise rotation r in D4 as (1234) and the reflection s across the horizontal
line bisecting the square as (24). The rest of the elements of D4 , as permutations of the
vertices, are in the table below the square.
2
3 1
4
1 r r2 r3 s rs r2 s r3 s
(1) (1234) (13)(24) (1432) (24) (12)(34) (13) (14)(23)
If we label the vertices in a different way (e.g., swap the labels 1 and 2), we turn the elements
of D4 into a different subgroup of S4 .
More abstractly, if we are given a set X (not necessarily the set of vertices of a square),
then the set Sym(X) of all permutations of X is a group under composition, and the
subgroup Alt(X) of even permutations of X is a group under composition. If we list the
elements of X in a definite order, say as X = {x1 , . . . , xn }, then we can think about Sym(X)
as Sn and Alt(X) as An , but a listing in a different order leads to different identifications
of Sym(X) with Sn and Alt(X) with An .1
The “abstract” symmetric groups Sym(X) really do arise naturally:
Theorem 1.1 (Cayley). Every finite group G can be embedded in a symmetric group.
Proof. To each g ∈ G, define the left multiplication function `g : G → G, where `g (x) = gx
for x ∈ G. Each `g is a permutation of G as a set, with inverse `g−1 . So `g belongs to
Sym(G). Since `g1 ◦ `g2 = `g1 g2 (that is, g1 (g2 x) = (g1 g2 )x for all x ∈ G), associating to g
the mapping `g gives a homomorphism of groups, G → Sym(G). This homomorphism is
one-to-one since `g determines g (after all, `g (e) = g). Therefore the correspondence g 7→ `g
is an embedding of G as a subgroup of Sym(G).
1When X = ∅, consider Sym(X) and Alt(X) to be trivial groups. The number of permutations of a set
of size 0 is 0! = 1.
1
2 KEITH CONRAD
From this viewpoint, the set of g ∈ G that act trivially (g · x = x for all x ∈ X) is simply
the kernel of the homomorphism G → Sym(X) associated to the action. Therefore those g
that act trivially on X are said to lie in the kernel of the action.
We will not often use the interpretation of Theorem 1.6 before Section 6. Until then we
take the more concrete viewpoint of a group action as a kind of product g · x of g with x,
taking values in X subject to the properties e · x = x and g1 · (g2 · x) = (g1 g2 ) · x.
Here is an outline of later sections. Section 2 describes several concrete examples of
group actions and also some general actions available for all groups. Section 3 describes the
important orbit-stabilizer formula. The short Section 4 isolates an important fixed-point
congruence for actions of p-groups. Sections 5 and 6 give applications of group actions to
group theory. In Appendix A, group actions are used to derive three classical congruences
from number theory.
2. Examples
Example 2.1. We can make Rn act on itself by translations: for v ∈ Rn , let Tv : Rn → Rn
by Tv (w) = w + v. The axioms for a group action are: T0 (w) = w and Tv1 (Tv2 (w)) =
Tv1 +v2 (w). These are true from properties of vector addition. (This is a special case of
Example 1.4 using additive notation.)
Example 2.2. Let G be the group of Rubik’s cube: all sequences of motions on the cube
(keeping center colors in fixed locations). This group acts on two different sets: the 12 edge
cubelets and the 8 corner cubelets. Or we could let G act on the set of all 20 non-centerface
cubelets together.
Example 2.3. For n ≥ 3, each element of the group Dn acts as a rigid motion of a regular
n-gon in the plane, either a rotation or a reflection.
We can also view Dn as acting just on the n vertices of a regular n-gon. Knowing where
the vertices go determines the rest of the rigid motion, so the effect of Dn on the vertices
tells us all we need to know to determine the rigid motion on the n-gon. Restricting the
action of Dn from an n-gon to its vertices, and labeling the vertices as 1, 2, . . . , n in some
manner, makes Dn act on {1, 2, . . . , n}.
Example 2.4. The group GLn (R) acts on vectors in Rn in the usual way that a matrix
can be multiplied with a (column) vector: A · v = Av. In this action, the origin 0 is fixed
by every A while other vectors get moved around (as A varies). The axioms of a group
action are properties of matrix-vector multiplication: In v = v and A(Bv) = (AB)v.
Example 2.5. The group Sn acts on polynomials f (T1 , . . . , Tn ), by permuting the variables:
(2.1) (σ · f )(T1 , . . . , Tn ) = f (Tσ(1) , . . . , Tσ(n) ).
4 KEITH CONRAD
π(12) (π(23) (c1 , c2 , c3 )) = π(12) (c1 , c3 , c2 ), in the next step write (c1 , c3 , c2 ) = (d1 , d2 , d3 ).
Then
π(12) (π(23) (c1 , c2 , c3 )) = π(12) (c1 , c3 , c2 ) = π(12) (d1 , d2 , d3 ) = (d2 , d1 , d3 ) = (c3 , c1 , c2 ),
which does not agree with
π(12)(23) (c1 , c2 , c3 ) = π(123) (c1 , c2 , c3 ) = (c2 , c3 , c1 ).
In general, for σ and σ 0 in Sn , and v = (c1 , . . . , cn ) in Rn ,
πσ (πσ0 (v)) = πσ (cσ0 (1) , . . . , cσ0 (n) )
= πσ (d1 , . . . , dn ) where di = cσ0 (i)
= (dσ(1) , . . . , dσ(n) )
= (cσ0 (σ(1)) , . . . , cσ0 (σ(n)) )
= (c(σ0 σ)(1) , . . . , c(σ0 σ)(n) )
= πσ0 σ (v),
so πσ ◦ πσ0 is πσ0 σ , which is not πσσ0 on Rn if σ 0 σ 6= σσ 0 (e.g., n ≥ 3, σ = (12), σ 0 = (23)).
To make things turn out nicer, redefine the effect of Sn on Rn using the inverse: set
σ · v = (cσ−1 (1) , . . . , cσ−1 (n) ). Then σ · (σ 0 · v) = (σσ 0 ) · v and we have a group action
of Sn on Rn , which in fact is essentially the action of Sn from the previous example on
homogeneousPlinear polynomials (see (2.2)). Indeed, if e1 , . . . , en is the standard basis of
Rn and v = ni=1 ci ei then
n
X n
X n
X
σ· ci ei = (cσ−1 (1) , . . . , cσ−1 (n) ) = cσ−1 (i) ei = ci eσ(i) ,
i=1 i=1 i=1
which is how (2.2) looks with Ti in place of ei . In other words, the action of each σ ∈ Sn
on Rn is R-linear and permutes the basis vectors {ei } (not the coefficients!) in the same
way it permutes the indices: σ(ei ) = eσ(i) .
The lesson from these last two examples is that when Sn permutes variables in a poly-
nomial then it acts “directly”, but when Sn permutes coordinates in a vector then it has
to act using inverses. When Sn acts on variables or coordinates, it acts without inverses in
one case and with inverses in the other case, but it’s easy to forget which case is which. At
least remember that you need to be careful.
Example 2.7. Let G be a group acting on the set X, and S be a set. Write Map(X, S)
for the set of all functions f : X → S. It is natural to try defining an action of G on the set
Map(X, S) by the rule
(2.3) (πg f )(x) = f (gx),
where gx is the action of g ∈ G on x ∈ X. While πg f is a function X → S, sending each f
to πg f is usually not an action of G on Map(X, S) even though it is easy to confuse yourself
into thinking it is: for g and h in G, and x ∈ X,
πg ((πh f )(x)) = πg (f (hx)) = f (g(hx)) = f ((gh)x) = (πgh f )(x).
This holds for all x ∈ X, so πg (πh f ) = πgh f , right? No. The calculation is bogus since
the first expression πg ((πh f )(x)) is nonsense: (πh f )(x) = f (hx) belongs to S and no action
of G has been defined on S. Even if it had been, πg is applied to functions X → S, not
6 KEITH CONRAD
to elements of S. The mistake was confusing (πg f )(x) in (2.3) with πg (f (x)), which is
meaningless.
A correct calculation is
(πg (πh f ))(x) = (πh f )(gx) = f (h(gx)) = f ((hg)x) = (πhg f )(x).
Therefore πg (πh f ) = πhg f . This is the same problem as the false action in Example 2.6 if
gh and hg act on X in different ways. To get a group action of G on Map(X, S) when G
already acts on X (from the left), replace g with g −1 in (2.3): set
(g · f )(x) = f (g −1 x).
Now
(g · (h · f ))(x) = (h · f )(g −1 x) = f (h−1 (g −1 x)) = f ((gh)−1 x) = ((gh) · f )(x),
so g · (h · f ) = (gh) · f . This is a group action of G on Map(X, S).
If G is Sn , X is {1, . . . , n} with its natural Sn -action, and S = R, then Map(X, S) = Rn :
writing down a vector v = (c1 , . . . , cn ) amounts to listing the coordinates in order, and
a list of coordinates in order is a function f : {1, 2, . . . , n} → R where f (i) = ci . The
definition (g · f )(i) = f (g −1 i) amounts to saying g · (c1 , . . . , cn ) = (cg−1 (1) , . . . , cg−1 (n) ),
which is precisely the valid action of Sn on Rn at the end of Example 2.6.
There are three basic ways we will make an abstract group G act: left multiplication of
G on itself, conjugation of G on itself, and left multiplication of G on a coset space G/H.
All of these will now be described.
Example 2.8. To make G act on itself by left multiplication, we let X = G and g · x (for
g ∈ G and x ∈ G) be the usual product of g and x. This example was used already in
the proof of Cayley’s theorem and in Example 1.4, and the definition of a group action is
satisfied by the axioms for multiplication in G.
Note that right multiplication of G on itself, given by rg (x) = xg for g and x in G, is
not an action since the order of composition gets reversed: rg1 ◦ rg2 = rg2 g1 . But if we set
rg (x) = xg −1 then we do get an action. This could be called the action by right-inverse
multiplication (nonstandard terminology).
Example 2.9. To make G act on itself by conjugation, take X = G and let g · x = gxg −1 .
Here g ∈ G and x ∈ G. Since e · x = exe−1 = x and
g1 · (g2 · x) = g1 · (g2 xg2−1 )
= g1 (g2 xg2−1 )g1−1
= (g1 g2 )x(g1 g2 )−1
= (g1 g2 ) · x,
conjugation is a group action.
Example 2.10. For a subgroup H ⊂ G, consider the left coset space G/H = {aH : a ∈ G}.
(We do not care whether or not H C G, as we are just thinking about G/H as a set.) Let
G act on G/H by left multiplication. That is, for g ∈ G and a left coset aH (a ∈ G), set
g · aH = gaH = {gy : y ∈ aH}.
GROUP ACTIONS 7
While the left multiplication action of G on itself (Example 2.8) turns different group
elements into different permutations, the conjugation action of G on itself (Example 2.9)
can make different group elements act in the same way: if g1 = g2 z, where z is in the center
of G, then g1 and g2 have the same conjugation action on G. Group actions where different
elements of the group act differently have a special name:
Definition 2.15. A group action of G on X is called faithful (or effective) if different
elements of G act on X in different ways: when g1 6= g2 in G, there is an x ∈ X such that
g1 · x 6= g2 · x.
Note that when we say g1 and g2 act differently, we mean they act differently somewhere,
not everywhere. This is consistent with what it means to say two functions are not equal:
they take different values somewhere, not everywhere.
Example 2.16. The action of G on itself by left multiplication is faithful: different elements
send e to different places.
Example 2.17. The action of G on itself by conjugation is faithful if and only if G has a
trivial center, because g1 gg1−1 = g2 gg2−1 for all g ∈ G if and only if g2−1 g1 is in the center of
G. When D4 acts on itself by conjugation, the action is not faithful since r2 acts trivially
(it is in the center), so 1 and r2 act in the same way.
Example 2.18. When H is a subgroup of G and G acts on G/H by left multiplication
(Example 2.10), g1 and g2 in G act inT the same−1way on G/H precisely when g1 gH = g2 gH
−1
for all g ∈ G, which means g2 g1 ∈ g∈G gHg . So the left multiplication action of G on
G/H is faithful if and only if the subgroups gHg −1 (as g varies) have trivial intersection.
Example 2.19. The action of GL2 (R) on R2 is faithful, since we can recover the columns
of a matrix by acting it on 10 and 01 .
reverses the order of multiplication in G. We saw this idea at work in Examples 2.6, 2.7,
and 2.8. We will not use right actions (except in Example 3.24), so for us “group action”
means “left group action.”
Example 3.3. When the group GL2 (Z) acts in the usual way on Z2 , the orbit of 0 is {0}
1
with stabilizer GL2 (Z). But in contrast to Example 3.2, the orbit of 0 under GL2 (Z) is
not Z2 − {0}. Indeed, a matrix ( ac db ) in GL2 (Z) sends 10 to ac , which is a vector with
relatively prime coordinates since ad − bc = ±1. (For instance, GL2 (Z) can’t send 10 to
2
each vector m 2
0 .) Conversely, n in Z with relatively prime coordinates is in the GL2 (Z)-
1 −y
orbit of 0 : we can solve mx + ny = 1 for some integers x and y, so ( m n x ) is in GL2 (Z)
−y 1 m
m
(its determinant is 1) and ( n x ) 0 = n .
Check as an exercise that the orbits in Z2 under the action of GL2 (Z) are the vectors
whose coordinates have a fixed greatest common divisor. Each orbit contains one vector of
the form d0 for d ≥ 0, and the stabilizer of d0 for d > 0 is {( 10 xy ) : y = ±1} ⊂ GL2 (Z).
Example 3.4. Identifying Z/(2) with the subgroup {±In } of GLn (R) gives an action of
Z/(2) on Rn , where 0 acts as the identity and 1 acts by negation on Rn . We can restrict
this action of Z/(2) to the unit sphere of Rn , and then it is called the antipodal action since
its orbits are pairs of opposite points (which are called antipodal points) on the sphere.
10 KEITH CONRAD
Example 3.5. When the Rubik’s cube group acts on the non-centerface cubelets of Rubik’s
cube, there are two orbits: the corner cubelets and the edge cubelets.
Example 3.6. For n ≥ 2, consider Sn in its natural action on {1, 2, . . . , n}. What is the
stabilizer of an integer k ∈ {1, 2, . . . , n}? It is the set of permutations of {1, 2, . . . , n} fixing
k, which can be identified with the set of permutations of {1, 2, . . . , n} − {k}. This is just
Sn−1 in disguise (once we identify {1, 2, . . . , n} − {k} in a definite manner with the numbers
from 1 to n − 1). The stabilizer of each number in {1, 2, . . . , n} for the natural action of Sn
on {1, 2, . . . , n} is isomorphic to Sn−1 .
Example 3.7. For n ≥ 2, the even permutations of {1, 2, . . . , n} that fix a number k can be
identified with the even permutations of {1, 2, . . . , n} − {k}, so the stabilizer of each point
in the natural action of An is essentially An−1 up to relabelling.
Remark 3.8. When trying to think about a set as a geometric object, it is helpful to refer
to its elements as points, no matter what they might really be. For example, when we think
about G/H as a set on which G acts (by left multiplication), it is useful to think about
the cosets of H, which are the elements of G/H, as the points in G/H. At the same time,
though, a coset is a subset of G. There is a tension between these two interpretations: is a
left coset of H a point in G/H or a subset of G? It is both, and it is important to be able
to think about a coset in both ways.
All of our applications of group actions to group theory will flow from the relations
between orbits, stabilizers, and fixed points, which we now make explicit in our three basic
examples of group actions.
Example 3.9. When a group G acts on itself by left multiplication,
• there is one orbit (g = ge ∈ Orbe ),
• Staba = {g : ga = a} = {e} is trivial,
• there are no fixed points (if |G| > 1).
Example 3.10. When a group G acts on itself by conjugation,
• the orbit of a is Orba = {gag −1 : g ∈ G}, which is the conjugacy class of a,
• Staba = {g : gag −1 = a} = {g : ga = ag} is the centralizer of a, denoted Z(a),
• a is a fixed point when it commutes with all elements of G, and thus the fixed points
of conjugation form the center of G.
Example 3.11. When a group G acts on G/H (for a subgroup H) by left multiplication,
• there is one orbit (gH = g · H ∈ OrbH ),
• StabaH = {g : gaH = aH} = {g : a−1 ga ∈ H} = aHa−1 ,
• there are no fixed points (if H 6= G).
These examples illustrate several facts: an action need not have fixed points (Example 3.9
with nontrivial G), different orbits can have different lengths (Example 3.10 with G = S3 ),
and the points in a common orbit don’t have to share the same stabilizer (Example 3.11 if
H is not a normal subgroup).
Example 3.12. When G acts on its subgroups by conjugation, StabH = {g : gHg −1 = H}
is the normalizer N(H) and the fixed points are the normal subgroups of G.
When a group G acts on a set X, each subgroup H of G also acts on X. Let’s look at a
few examples.
GROUP ACTIONS 11
Corollary 3.19. Let a group G act on a set X, where X is finite. Let the different orbits
of X be represented by x1 , . . . , xt . Then
t
X t
X
(3.1) |X| = | Orbxi | = [G : Stabxi ].
i=1 i=1
Proof. The set X can be written as the union of its orbits, which are mutually disjoint. The
orbit-stabilizer formula tells us how large each orbit is.
Example 3.20. For all finite groups G, each conjugacy class in G has size dividing the size
of G, since a conjugacy class in G is an orbit in the conjugation action of G on itself, so
Corollary 3.18a applies. Moreover, for the conjugation action (3.1) is the class equation.
Example 3.21. Let S3 act on itself by conjugation. Its orbits are its conjugacy classes,
which are
{(1)}, {(12), (13), (23)}, {(123), (132)}.
The conjugacy class of (12) has size 3 and the stabilizer of (12) is its centralizer {(1), (12)},
which has index 3 in S3 .
Example 3.22. Which elements of D6 commute with the reflection s? This is asking for
{g ∈ D6 : gs = sg}. Three such elements are 1, s, and r3 (since rn/2 ∈ Z(Dn ) for even n).
Let’s interpret the condition gs = sg as gsg −1 = s: the task is now computing the
stabilizer of s when D6 acts on itself by conjugation. To compute the stabilizer, let’s
first compute the orbit: how many different values of gsg −1 are there as g runs over D6 ?
Elements of D6 are rk (rotations) and rk s (reflections, so equal to their inverses). From
Example 3.23. We examine now a geometric example. The figure F below is a hexagon
with an X drawn inside of it. Which elements of D6 preserve this figure when D6 acts in a
natural way on it?
For g ∈ D6 , g(F ) = F means g ∈ StabF . To compute StabF we first compute the orbit
of F : it’s easier to figure out all the ways F can change than to figure out all the ways F
can stay the same, and these are related by the orbit-stabilizer formula. By rotating and
reflecting it is clear that g(F ), as g runs over D6 , has only the 3 values below.
14 KEITH CONRAD
F F0 F 00
Let r be the 60-degree counterclockwise rotation preserving the hexagonal shape and let
s be the reflection across the horizontal line bisecting F . Since F has an orbit of size 3,
its stabilizer in D6 has index 3, so | StabF | = |D6 |/3 = 12/3 = 4. From the 180-degree
rotational symmetry of F , r3 ∈ StabF . Since s(F ) = F , s ∈ StabF . Since StabF is a
subgroup of D6 , StabF contains r3 s. Thus {1, r3 , s, r3 s} ⊂ StabF , and we are done since
we know | StabF | = 4.
While F 0 looks like F , it is not equal to F . What are StabF 0 and {g ∈ D6 : g(F ) = F 0 }?
We can compute both as soon as we know just one g sending F to F 0 . Since F 0 = r(F ) we
can use g = r. Then Theorem 3.16b says
StabF 0 = Stabr(F ) = r StabF r−1 = r{1, r3 , s, r3 s}r−1 = {1, r3 , r2 s, r5 s}
and Theorem 3.16c says
{g ∈ D6 : g(F ) = F 0 } = r StabF = r{1, r3 , s, r3 s} = {r, r4 , rs, r4 s}.
Example 3.24. The 2 × 2 matrices ( ac db ) ∈ GL2 (R) whose columns add up to 1 are a
group. This can be checked by a tedious calculation. But this can be seen more simply by
observing that the column sums are the entries in the vector-matrix product (1 1)( ac db ), so
the matrices of interest are those satisfying (1 1)( ac db ) = (1 1). This is the stabilizer of (1 1)
in the (right!) action of GL2 (R) on R2 – viewed as row vectors – by v · A = vA, so it is a
subgroup of GL2 (R) since the stabilizers of a point are always a subgroup. (Theorem 3.16
for right group actions should be formulated and checked by the reader.)
Moreover, because (0 1)( 01 −1
1 ) = (1 1), Stab(1 1) and Stab(0 1) are conjugate subgroups
in GL2 (R). Since Stab(0 1) = {( a0 1b ) ∈ GL2 (R)} = Aff(R), a model for our “column-sum-1
group” is its conjugate subgroup Aff(R). Explicitly,
−1
0 −1 0 −1
Stab(1 1) = Stab(0 1)( 0 −1 ) = Aff(R) .
1 1 1 1 1 1
Example 3.25. As a cute application of the orbit-stabilizer formula we explain why |HK| =
|H||K|/|H ∩K| for subgroups H and K of a finite group G. Here HK = {hk : h ∈ H, k ∈ K}
is the set of products, which usually is just a subset (not a subgroup) of G. To count the size
of HK, let the direct product group H × K act on the set HK like this: (h, k) · x = hxk −1 .
Check this gives a group action (the group is H × K and the set is HK). There is only one
orbit since e = ee ∈ HK and hk = (h, k −1 ) · e. Therefore the orbit-stabilizer formula tells
us
|H × K| |H||K|
|HK| = = .
| Stabe | |{(h, k) : (h, k) · e = e}|
The condition (h, k) · e = e means hk −1 = e, so Stabe = {(h, h) : h ∈ H ∩ K}. Therefore
| Stabe | = |H ∩ K| and |HK| = |H||K|/|H ∩ K|.
Example 3.26. We now discuss the original version of Lagrange’s theorem in group theory.
Here is what he proved: for each polynomial f (T1 , . . . , Tn ) in n variables, the number of
GROUP ACTIONS 15
different polynomials we can obtain from f (T1 , . . . , Tn ) through all permutations of its
variables is a factor of n!.
For instance, taking n = 3, consider the polynomial T1 . If we run through all six permu-
tations of the set {T1 , T2 , T3 }, and apply each to T1 , we get 3 different results: T1 , T2 , and
T3 . The polynomial T1 T22 + T2 T32 + T3 T12 has only 2 possibilities under each change of vari-
ables: itself and T2 T12 + T1 T32 + T3 T22 (check this for yourself). The polynomial T1 + T22 + T33
has 6 different possibilities. The number of different polynomials we find in each of these
examples is a factor of 3!.
To explain Lagrange’s general observation, we apply the orbit-stabilizer formula to the
group action in Example 2.5. That was the action of Sn on n-variable polynomials by
permutations of the variables. For an n-variable polynomial f (T1 , . . . , Tn ), the different
polynomials we obtain by permuting its variables are exactly the polynomials in its Sn -
orbit. Therefore, by the orbit-stabilizer formula, the number of polynomials we get from
f (T1 , . . . , Tn ) by permuting its variables is [Sn : Hf ], where Hf = Stabf = {σ ∈ Sn : σ · f =
f }. This index is a divisor of n!.
In a group action, the length of an orbit divides |G|, but the number of orbits usually
does not divide |G|. For example, S4 has 5 conjugacy classes, and 5 does not divide 24. But
there is an interesting relation between the number of orbits and the group action.
Theorem 3.27. Let a finite group G act on a finite set X with r orbits. Then r is the
average number of fixed points of the elements of the group:
1 X
r= | Fixg (X)|,
|G|
g∈G
Next we count over the x’s and have to add up the number of g’s with gx = x, i.e., with
g ∈ Stabx :
X
|{(g, x) ∈ G × X : gx = x}| = | Stabx |.
x∈X
Equating these two counts gives
X X
| Fixg (X)| = | Stabx |.
g∈G x∈X
Divide by |G|:
1 X X 1
| Fixg (X)| = .
|G| | Orbx |
g∈G x∈X
Let’s consider the contribution to the right side from points in a single orbit. If an orbit
has n points in it, then the sum over the points in that orbit is a sum of 1/n for n terms,
and that is equal to 1. Thus the part of the sum over points in an orbit is 1, which makes
the sum on the right side equal to the number of orbits, which is r.
Theorem 3.27 is often called Burnside’s lemma, but it is not due to him [4]. He included
it in his widely read book on group theory.
Example 3.28. We will use a special case of Theorem 3.27 to prove for all a ∈ Z and
m ∈ Z+ that
Xm
(3.2) a(k,m) ≡ 0 mod m.
k=1
When m = p is a prime number, the left side is (p − 1)a + ap = (ap − a) + pa, so (3.2)
becomes ap ≡ a mod p, which is Fermat’s little theorem. Thus (3.2) can be thought of as a
generalization of Fermat’s little theorem to all moduli that is essentially different from the
generalization called Euler’s theorem, which says aϕ(m) ≡ 1 mod m if (a, m) = 1: (3.2) is
true for all a ∈ Z.
Our setup leading to (3.2) starts with a finite group G and comes from [3]. For a positive
integer a, G acts on the set of functions Map(G, {1, 2, . . . , a}) by (g · f )(h) = f (g −1 h) for
g, h ∈ G. This is a special case of the group action at the end of Example 2.7, where G acts
on itself by left multiplication. We want to apply Theorem 3.27 to this action, so we need
to understand the fixed points (really, fixed functions) of each g ∈ G. We have g · f = f if
and only if f (g −1 h) = f (h) for all h ∈ G, which is the same as saying f is constant on every
left coset hgih in G. The number of left cosets of hgi in G is [G : hgi] = m/ord(g), where
m = |G| and ord(g) is the order of g, so the number of functions fixed by g is am/ord(g) , since
the value of the function on each coset can be chosen arbitrarily in {1, . . . , a}. Therefore
Theorem 3.27 implies (1/m) g∈G am/ord(g) is a positive integer, so
P
X
(3.3) am/ord(g) ≡ 0 mod m.
g∈G
Since (3.3) depends on a only through the value of a mod m, it holds for all a ∈ Z, not just
a > 0.
Taking G = Z/(m), each k ∈ G has additive order m/(k, m), so (3.3) becomes
Xm
a(k,m) ≡ 0 mod m.
k=1
Next we turn to the idea of two different actions of a group being essentially the same.
Definition 3.29. Two actions of a group G on sets X and Y are called equivalent if there
is a bijection f : X → Y such that f (gx) = g(f (x)) for all g ∈ G and x ∈ X.
Actions of G on two sets are equivalent when G permutes elements in the same way on
the two sets after matching up the sets appropriately. When f : X → Y is an equivalence of
group actions on X and Y , gx = x if and only if g(f (x)) = f (x), so the stabilizer subgroups
of x ∈ X and f (x) ∈ Y are the same.
GROUP ACTIONS 17
Example 3.30. Let R× act on a linear subspace Rv0 ⊂ Rn by scaling. This is equivalent
to the natural action of R× on R by scaling: let f : R → Rv0 by f (a) = av0 . Then f is a
bijection and f (ca) = (ca)v0 = c(av0 ) = cf (a) for all c in R× and a ∈ R.
Example 3.31. Let S3 act on its conjugacy class {(12), (13), (23)} by conjugation. This
action on a 3-element set, described in the first half of Table 1 below, looks like the usual
action of S3 on {1, 2, 3} in the second half of Table 1 if we identify (12) with 3, (13) with 2,
and (23) with 1 (in short, identity (ij) with k where k 6∈ {i, j}). Then the action of S3 on
{(12), (13), (23)} by conjugation is equivalent to the natural action of S3 on {1, 2, 3}.
Example 3.32. Let GL2 (R) act on the set B of ordered bases (e1 , e2 ) of R2 in the natural
way: if A ∈ GL2 (R) then A(e1 , e2 ) := (Ae1 , Ae2 ) is another ordered basis of R2 . This
action of GL2 (R) on B is equivalent to the action of GL2 (R) on itself by left multiplication.
The reason is that the two columns of a matrix in GL2 (R) are a basis of R2 (with the first
and second columns being an ordering of basis vectors: the first column is the first basis
vector and the second column is the second basis vector) and two square matrices multiply
a b a b
through multiplication on the columns: A( c d ) = (A c A d ). Letting f : B → GL2 (R) by
f ( ac , db ) = ( ac db ) gives a bijection and f (A(e1 , e2 )) = A · f (e1 , e2 ) for all A ∈ GL2 (R) and
(e1 , e2 ) ∈ B.
Example 3.33. Let H and K be subgroups of G. The group G acts by left multiplication on
G/H and G/K. If H and K are conjugate subgroups then these actions are equivalent: fix a
representation K = g0 Hg0−1 for some g0 ∈ G and let f : G/H → G/K by f (gH) = gg0−1 K.
This is well-defined (independent of the coset representatives for gH) since, for h ∈ H,
f (ghH) = ghg0−1 K = ghg0−1 g0 Hg0−1 = gHg0−1 = gg0−1 K.
The reader can check f (g(g 0 H)) = gf (g 0 H) for g ∈ G and g 0 H ∈ G/H, and f is a bi-
jection. (The mapping f might depend on g0 , but that is not a problem. There can be
multiple equivalences between two equivalent group actions, just as there can be multiple
isomorphisms between two isomorphic groups.)
If H and K are nonconjugate then the actions of G on G/H and G/K are not equivalent:
corresponding points in equivalent actions have the same stabilizer subgroup, but the sta-
bilizer subgroups of left cosets in G/H are conjugate to H and those in G/K are conjugate
to K, and none of the former and latter are equal.
The left multiplication action of G on a left coset space G/H has one orbit. It turns out
all actions with one orbit are essentially of this form:
Theorem 3.34. An action of G that has one orbit is equivalent to the left multiplication
action of G on some left coset space of G.
18 KEITH CONRAD
Proof. Suppose that G acts on the set X with one orbit. Fix x0 ∈ X and let H = Stabx0 .
We will show the action of G on X is equivalent to the left multiplication action of G on
G/H.
Every x ∈ X has the form gx0 for some g ∈ G, and all elements in a left coset gH have
the same effect on x0 : for all h ∈ H, (gh)(x0 ) = g(hx0 ) = g(x0 ). Let f : G/H → X by
f (gH) = gx0 . This is well-defined, as we just saw. Moreover, f (g · g 0 H) = gf (g 0 H) since
both sides equal gg 0 (x0 ). We will show f is a bijection.
Since X has one orbit, X = {gx0 : g ∈ G} = {f (gH) : g ∈ G}, so f is onto. If
f (g1 H) = f (g2 H) then g1 x0 = g2 x0 , so g2−1 g1 x0 = x0 . Since x0 has stabilizer H, g2−1 g1 ∈ H,
so g1 H = g2 H. Thus f is one-to-one.
A particular case of Theorem 3.34 says that an action of G is equivalent to the left
multiplication action of G on itself if and only if the action has one orbit and the stabilizer
subgroups are trivial.
Definition 3.35. The action of G on X is called free when every point has a trivial stabi-
lizer.
Example 3.36. The left multiplication action of a group on itself (Example 3.9) is free
with one orbit.
Example 3.37. The antipodal action of Z/(2) on a sphere (Example 3.4) is a free action.
There are uncountably many orbits.
Free actions show up quite often in topology, and Example 3.37 is a typical illustration
of that.
Example 3.38. For an integer n ≥ 2, let Xn be the set of roots of unity of order n in
C× , so3 |Xn | = ϕ(n). (For instance, X4 = {i, −i}.) The group (Z/(n))× acts on Xn by
a · ζ = ζ a . (This is well-defined since a ≡ b mod n ⇒ ζ a = ζ b .) Since every element of Xn is
a power of every other element of Xn using exponents relatively prime to n, this action of
(Z/(n))× has a single orbit. Since ζ a = ζ only if a ≡ 1 mod n (ζ has order n), all stabilizers
are trivial (a free action). Thus (Z/(n))× acting on Xn is equivalent to the multiplication
action of (Z/(n))× on itself, except there is no naturally distinguished element of Xn (when
ϕ(n) > 1, i.e., n > 2) while 1 is a distinguished element of (Z/(n))× .
It is worth comparing faithful and free actions. An action is faithful (Definition 2.15)
when g1 6= g2 in G ⇒ g1 x 6= g2 x for some x ∈ X (different elements of G act differently
at some point in X) while an action is free when g1 6= g2 in G ⇒ g1 x 6= g2 x for all x ∈ X
(different elements of G act differently at every point of X). So all free actions are faithful.
Since g1 x = g2 x if and only if g2−1 g1 x = x, we can describe faithful and free actions in terms
of fixed points: an action is faithful when each g 6= e has Fixg (X) 6= X while an action is
free when each g 6= e has Fixg (X) = ∅.
4. Actions of p-groups
The action of a group of prime power size has special features. When |G| = pk for a prime
p, we call G a p-group. For example, D4 is a 2-group. Because all subgroups of a p-group
have p-power index, the length of an orbit under an action by a p-group is divisible by p
3Having order n is more than just satisfying z n = 1: no smaller power can be 1, so X is not all nth
n
roots of unity when n > 1.
GROUP ACTIONS 19
unless the point is a fixed point, when its orbit has length 1. This leads to an important
congruence modulo p for actions of a p-group.
Theorem 4.1 (Fixed Point Congruence). Let G be a finite p-group acting on a finite set
X. Then
|X| ≡ |{fixed points}| mod p.
Proof. Let the different orbits in X be represented by x1 , . . . , xt , so Corollary 3.19 leads to
t
X
(4.1) |X| = | Orbxi |.
i=1
Since | Orbxi | = [G : Stabxi ] and |G| is a power of p, | Orbxi | ≡ 0 mod p unless Stabxi = G,
in which case Orbxi has length 1, i.e., xi is a fixed point. Thus when we reduce both sides
of (4.1) modulo p, all terms on the right side vanish except for a contribution of 1 for each
fixed point. That implies
|X| ≡ |{fixed points}| mod p.
Keep in mind that the congruence in Theorem 4.1 holds only for actions by groups with
prime-power size. When a group of size 9 acts we get a congruence mod 3, but when a
group of size 6 acts we do not get a congruence mod 2 or 3.
Corollary 4.2. Let G be a finite p-group acting on a finite set X. If |X| is not divisible
by p, then there is at least one fixed point in X. If |X| is divisible by p, then the number of
fixed points is a multiple of p (possibly 0).
Proof. When |X| is not divisible by p, neither is the number of fixed points (by the fixed
point congruence), so the number of fixed points can’t equal 0 (after all, p | 0) and thus is
≥ 1. On the other hand, when |X| is divisible by p, then the fixed point congruence shows
the number of fixed points is ≡ 0 mod p, so this number is a multiple of p.
Example 4.3. Let G be a p-subgroup of GLn (Z/(p)), where n ≥ 1. Then there is a nonzero
v ∈ (Z/(p))n such that gv = v for all g ∈ G. Indeed, because G is a group of matrices it
naturally acts on the set V = (Z/(p))n . (The identity matrix is the identity function and
g1 (g2 v) = (g1 g2 )v by the rules of matrix-vector multiplication.) Since the set V has size
pn ≡ 0 mod p, the number of fixed points is divisible by p. The number of fixed points is
at least 1, since the zero vector is a fixed point, so the number of fixed points is at least p.
A nonzero fixed point for a group of matrices can be interpreted as a simultaneous
eigenvector with eigenvalue 1. These are the only possible simultaneous eigenvectors for G
in (Z/(p))n since every element of G has p-power order and the only element of p-power
order in (Z/(p))× is 1 (so a simultaneous eigenvector for G in (Z/(p))n must have eigenvalue
1 for each element of the group).
Theorem 4.1 can be used to prove existence theorems involving finite groups (noncon-
structively) if we can interpret a problem in terms of fixed points. For example, an element
of a group is in the center precisely when it is a fixed point for the conjugation action of the
group on itself. Thus, if we want to show some class of groups has nontrivial centers then
we can try to show there are fixed points other than the identity element for the conjugation
action.
20 KEITH CONRAD
We noted above that |X| = |G|p−1 , so this set is big. The nice feature of this solution
set is that cyclic shifts of one solution give us more solutions: if (g1 , g2 , . . . , gp ) ∈ X then
so is (g2 , . . . , gp , g1 ). Indeed, g1 = (g2 · · · gp )−1 and elements commute with their inverses
so g2 · · · gp g1 = e. Successive shifting of coordinates in a solution can be interpreted as a
group action of Z/(p) on X: for j ∈ Z/(p), let j · (g1 , . . . , gp ) = (g1+j , . . . , gp+j ), where the
subscripts are interpreted modulo p. This shift is a group action. Since the group doing
the acting is the p-group Z/(p), the fixed point congruence (Theorem 4.1) tells us
What are the points of X fixed by Z/(p)? Cyclic shifts bring every coordinate eventually
into the first position, so a fixed point of X is one where all coordinates are equal. Calling
the common value g, we have (g, g, . . . , g) ∈ X precisely when g p = e. Therefore (5.2)
becomes
Up to this point we have not used the condition p | |G|. That is, (5.3) is valid for all finite
groups G and primes p. This will be useful in Appendix A.
Since p divides |G|, the left side of (5.3) vanishes modulo p, so the right side is a multiple
of p. Thus |{g ∈ G : g p = e}| ≡ 0 mod p. Since |{g ∈ G : g p = e}| > 0, there must be some
g 6= e with g p = e.
The divisor d need not be a prime. However, the proof is not as direct as the case of a
prime divisor, and we don’t look at this more closely.
Proof. Let n = |G| and p | n. We let the group Z/(p) × G act on the set Gp by
Thus
g1 = ggi+1
= g · gg2i+1 (since gk = ggi+k for all k)
2
= g · g2i+1
= ···
= g r gri+1 for all r.
If g = e then i 6≡ 0 mod p and gri+1 = g1 for all r. Since {ri + 1 mod p : r ≥ 1} = Z/(p),
every gk equals g1 , so (g1 , . . . , gp ) ∈ ∆, a contradiction. Therefore g 6= e. Taking r = p,
g1 = g p gpi+1 = g p g1 , so g p = e.
Corollary 6.4. Let G be a finite p-group. Every subgroup of G with index p is a normal
subgroup.
Proof. We give two proofs. First, let the subgroup be H, so H ⊂ N(H) ⊂ G. Since
[G : H] = p, one of these inclusions is an equality. By Theorem 6.2, N(H) 6= H, so
N(H) = G. That means H C G.
For a second proof, consider the left multiplication action of G on the left coset space
G/H. By Theorem 1.6, this action can be viewed as a group homomorphism ` : G →
Sym(G/H) ∼ = Sp . Let K be the kernel of `. We will show H = K. The quotient G/K
embeds into Sp , so [G : K] | p!. Since [G : K] is a power of p, [G : K] = 1 or p. At the
same time, each g ∈ K at least satisfies gH = H, so g ∈ H. In other words, K ⊂ H, so
[G : K] > 1. Thus [G : K] = p, so [H : K] = [G : K]/[G : H] = 1, i.e., H = K C G.
Corollary 6.5. Let G be a finite group and p be a prime with pn | |G|. Then there is a
chain of subgroups
{e} = H0 ⊂ H1 ⊂ · · · ⊂ Hn ⊂ G,
where |Hi | = pi .
Proof. We can take n ≥ 1. Since p | |G| there is a subgroup of size p by Cauchy’s theorem,
so we have H1 . Assuming for some i < n we have a chain of subgroups up to Hi , we will
find a subgroup Hi+1 with size pi+1 that contains Hi .
Since p | [G : Hi ], by Theorem 6.2 p | [N(Hi ) : Hi ]. Since Hi C N(Hi ), we can consider
the quotient group N(Hi )/Hi . It has size divisible by p, so by Cauchy’s theorem there
is a subgroup of size p. The inverse image of this subgroup under the reduction map
N(Hi ) → N(Hi )/Hi is a group Hi+1 of size p|Hi | = pi+1 .
Theorem 6.6 (C. Jordan). If a nontrivial finite group acts on a finite set of size greater
than 1 and the action has only one orbit then some g ∈ G has no fixed points.
Proof. By Theorem 3.27,
1 X 1 X
1= | Fixg (X)| = |X| + | Fixg (X)| .
|G| |G|
g∈G g6=e
Each subgroup gHg −1 has the same size, namely |H|. How many different conjugate
groups gHg −1 are there (as g varies)? For g1 , g2 ∈ G,
g1 Hg1−1 = g2 Hg2−1 ⇐⇒ g2−1 g1 Hg1−1 g2 = H
⇐⇒ g2−1 g1 H(g2−1 g1 )−1 = H
⇐⇒ g2−1 g1 ∈ N(H)
⇐⇒ g1 ∈ g2 N(H)
⇐⇒ g1 N(H) = g2 N(H).
Therefore the number of different subgroups gHg −1 as g varies is [G : N(H)]. These
subgroups all contain the identity,Sso they are not disjoint. Therefore, on account of the
overlap at the identity, the size of g∈G gHg −1 is strictly less than
|G| |H|
[G : N(H)]|H| = |H| = |G| ≤ |G|,
| N(H)| | N(H)|
so the union of all gHg −1 is not all of G.
For the second proof, we apply Theorem 6.6 to the action of G on X = G/H by left
multiplication. For a ‘point’ gH in G/H, itsS stabilizer is gHg −1 . By Theorem 6.6, some
a ∈ G has no fixed points, which means a 6∈ g∈G gHg −1 .
Remark 6.9. Theorem 6.8 is not always true for infinite groups. For instance, let G =
GL2 (C). Every matrix inS G has an eigenvector, so we can conjugate each matrix in G to the
form ( a0 db ). Thus G = g∈G gHg −1 , where H is the proper subgroup of upper triangular
matrices.
Remark 6.10. Here is a deep application of Theorem 6.8 to number theory. Suppose a
polynomial f (X) in Z[X] is irreducible and has a root modulo p for every p. Then f (X) is
linear. The proof of this requires Theorem 6.8 and complex analysis.
Corollary 6.11. If H is a proper subgroup of the finite group G, there is a conjugacy class
in G that is disjoint from H and its conjugate subgroups.
Proof. Pick an x 6∈ g∈G gHg −1 and use the conjugacy class of x.
S
Theorem 6.12. Let G be a finite group with |G| > 1, and p the smallest prime factor of
|G|. Every subgroup of G with index p is a normal subgroup.
Corollary 6.4 is a special case of Theorem 6.12. Group actions don’t appear in the
statement of Theorem 6.12, but they will play a role in its proof. According to [2, pp. 3-4],
Theorem 6.12 was conjectured and proved by Ernst Straus when he was a student.
Proof. Let H be a subgroup of G with index p, so G/H is a set with size p. We will prove
H C G in two ways using group actions.
Method 1. We will show H is the kernel of a homomorphism out of G, and thus is a
normal subgroup of G. The argument will be similar to the second proof of Corollary 6.4.
Let G act on G/H by left multiplication, which (by Theorem 1.6) gives a group homo-
morphism
(6.2) G → Sym(G/H) = ∼ Sp .
This homomorphism sends each g ∈ G to the permutation `g of G/H, where `g (aH) = gaH.
We will show this homomorphism has kernel H.
GROUP ACTIONS 25
Write the kernel of the homomorphism (6.2) as K, so K C G. The quotient G/K embeds
into Sp , so [G : K] | p!. Since [G : K] divides |G|, whose smallest prime factor is p,
[G : K] = 1 or p. Each g ∈ K at least satisfies gH = H, so g ∈ H. Thus K ⊂ H, so
[G : K] = [G : H][H : K] = p[H : K]. Thus [G : K] = p and [H : K] = 1, so H = K C G.
Method 2.5 Let H act on G/H by left multiplication, which (by Theorem 1.6) gives a
group homomorphism
(6.3) H → Sym(G/H) = ∼ Sp .
This action of H on a p-element set fixes the coset H, so each orbit has size at most p−1. By
the orbit-stabilizer formula, an orbit has length dividing |H|, which divides |G|. The only
factor of |G| that’s at most p−1 is 1 (why?), so all orbits of the H-action in (6.3) have length
1. That means (6.3) is a trivial action: hgH = gH for each h ∈ H and g ∈ G. Therefore
g −1 hgH = H, so g −1 hg ∈ H, which implies (as h varies) g −1 Hg ⊂ H, so g −1 Hg = H since
both sides have the same size. Since this last equation holds for all g ∈ G, H C G.
Some special cases of Theorem 6.12 are worth recording separately.
Corollary 6.13. Let G be a finite group.
a) If H is a subgroup with index 2, then H C G.
b) If G is a p-group and H is a subgroup with index p, then H C G.
c) If |G| = pq where p < q are different primes, then each subgroup of G with size q is a
normal subgroup.
Proof. Parts a and b are immediate consequences of Theorem 6.12. For part c, note that a
subgroup with size q is a subgroup with index p. This completes the proof.
Part a can be checked directly, without the reasoning of Theorem 6.12: if [G : H] = 2
and a 6∈ H, then the two left cosets of H are H and aH, while the two right cosets of H
are H and Ha. Therefore aH = G − H = Ha, so H C G. Part b was already seen in
Corollary 6.4. (In fact, our second proof of Corollary 6.4 used the same idea as the proof of
Theorem 6.12.) Part c could also be checked directly with the Sylow theorems, which show
a subgroup of order q in G is not just normal but in fact unique. In Theorem 6.12, these
disparate results are unified into a single statement.
All of our applications of group actions in this section have been to finite groups. Here
is an application to infinite groups.
Theorem 6.14. A finitely generated group has finitely many subgroups of index n for each
integer n ≥ 1.
Proof. Let G be a finitely generated group and H be a subgroup with finite index, say n.
The left multiplication action of G on G/H is a group homomorphism ` : G → Sym(G/H).
In this action, the stabilizer of the coset H is H (gH = H if and only if g ∈ H).
Pick an enumeration of the n cosets in G/H so that the coset H corresponds to the
number 1. This enumeration gives an isomorphism Sym(G/H) ∼ = Sn , so we can make G
act on the set {1, 2, . . . , n} and the stabilizer of 1 is H. Therefore we have constructed from
each subgroup H ⊂ G of index n an action of G on {1, 2, . . . , n} in which H is the stabilizer
of 1. Since H is recoverable from the action, the number of subgroups of G with index n is
bounded above by the number of homomorphisms G → Sn . Since G is finitely generated,
5I learned this from the answer by Bar Alon at https://math.stackexchange.com/questions/164244/
normal-subgroup-of-prime-index.
26 KEITH CONRAD
it has finitely many homomorphisms to the finite group Sn . Therefore G has finitely many
subgroups of index n.
I am not aware of a proof of this theorem that is fundamentally different from the one
presented here.
This is probably a good place to warn the reader about a false property of finitely
generated groups: a subgroup of a finitely generated group need not be finitely generated!
However, every subgroup of a finitely generated group with finite index is finitely generated:
if the original group has d generators, a subgroup with index n has at most (d − 1)n + 1
generators. This is due to Schreier.
for 0 ≤ i ≤ p − 1, so
{1, 2, . . . , n} = A0 ∪ A1 ∪ · · · ∪ Ap−1 ∪ {pn0 + 1, . . . , pn0 + a0 }.
For 1 ≤ t ≤ n0 , let σt be the p-cycle
σt = (t, n0 + t, 2n0 + t, . . . , (p − 1)n0 + t).
This cycle cyclically permutes the numbers in A0 , A1 , . . . , Ap−1 that are ≡ t mod n0 . The
σt ’s for different t are disjoint, so they commute. Set σ = σ1 σ2 · · · σn0 . Then σ has order p
as a permutation of {1, 2, . . . , n} (fixing all numbers above pn0 ).
n
Let X be the set of m-element subsets of {1, 2, . . . , n}, so |X| = m . Let the group hσi
act on X. Since σ has order p, Theorem 4.1 tells us
|X| ≡ |{fixed points}| mod p.
n0
The left side is m . We will show the right side is ab00 m
n
0 .
References
[1] B. Fein, W. M. Kantor, M. Schacher, Relative Brauer groups II, J. Reine Angew. Math. 328 (1981),
39–57.
[2] B. R. Gelbaum and J. M. H. Olmstead, Theorems and Counterexamples in Mathematics, Springer, New
York, 1990.
[3] I. M. Isaacs and M. R. Pournaki, “Generalizations of Fermat’s Little Theorem Using Group Theory,”
Amer. Math. Monthly 112 (2005), 734–740.
[4] Wikipedia, Burnside’s lemma, http://en.wikipedia.org/wiki/Burnside%27s_lemma.