
Preprint typeset in JHEP style - HYPER VERSION

Chapter 1: Abstract Group Theory

Gregory W. Moore

Abstract: Very Abstract.

September 7, 2021


Contents

1. Introduction 5
1.1 Equivalence Relations 5

2. Groups: Basic Definitions And Examples 7

3. Homomorphism and Isomorphism 20

4. Group Actions On Sets 26


4.1 Group Actions On Sets 26
4.2 Group Actions On Sets Induce Group Actions On Associated Function Spaces 29

5. The Symmetric Group. 30


5.1 Cayley’s Theorem 32
5.2 Cyclic Permutations And Cycle Decomposition 34
5.3 Transpositions 35
5.4 Diversion and Example: Card shuffling 41

6. Generators and relations 44


6.1 Example Of Generators And Relations: Fundamental Groups In Topology 51
6.1.1 The Fundamental Group Of A Topological Space 51
6.1.2 Surface Groups: Compact Two-Dimensional Surfaces 59
6.1.3 Braid Groups And Anyons 62
6.1.4 Fundamental Groups Of Three-Dimensional Manifolds 66
6.1.5 Fundamental Groups Of Four-Dimensional Manifolds 68

7. Cosets and conjugacy 68


7.1 Lagrange Theorem 68
7.2 Conjugacy 71
7.3 Normal Subgroups And Quotient Groups 76
7.3.1 A Very Interesting Quotient Group: Elliptic Curves 85
7.4 Conjugacy Classes In Sn 89
7.4.1 Conjugacy Classes In Sn And Harmonic Oscillators 91
7.4.2 Conjugacy Classes In Sn And Partitions 99

8. More About Group Actions And Orbits 101


8.1 Some Definitions And Terminology Associated With Group Actions 102
8.2 The Stabilizer-Orbit Theorem 106
8.3 Examples Of Orbits 108
8.3.1 Extended Example: The Case Of 1 + 1 Dimensions 115
8.3.2 Higher Dimensional Light Cones 118

8.3.3 Torsors And Principal Bundles 121
8.4 More About Induced Group Actions On Function Spaces 127

9. Centralizer Subgroups And Counting Conjugacy Classes 129


9.1 0 + 1-Dimensional Gauge Theory 130
9.2 Three Mathematical Applications Of The Counting Principle 131

10. Kernel, Image, And Exact Sequence 135


10.1 The Relation Of SU (2) And SO(3) 142

11. Some Representation Theory 146


11.1 Some Basic Definitions 147
11.2 Characters 151
11.3 Unitary Representations 152
11.4 Haar Measure, a.k.a. Invariant Integration 153
11.5 The Regular Representation 158
11.6 Reducible And Irreducible Representations 161
11.6.1 Definitions 161
11.6.2 Reducible vs. Completely reducible representations 166
11.7 Schur’s Lemmas 168
11.8 Pontryagin Duality 171
11.8.1 An Application Of Pontryagin Duality: Bloch’s Theorem And Band
Structure 174
11.9 Orthogonality Relations Of Matrix Elements And The Peter-Weyl Theorem 176
11.10 Orthogonality Relations For Characters And Character Tables 181
11.10.1 Finite Groups And The Character Table 182
11.10.2 Orthogonality Relations And Pontryagin Duality 188
11.10.3 The Poisson Summation Formula 189
11.11 The Finite Heisenberg Group And The Quantum Mechanics Of A Particle
On A Discrete Approximation To A Circle 191
11.12 Decomposition Of Tensor Products Of Representations And Fusion Coeffi-
cients 196
11.13 Induced Representations 199
11.13.1 The Geometrical Interpretation 200
11.13.2 Frobenius Reciprocity 201
11.14 Representations Of SU (2) 206
11.14.1 Homogeneous Polynomials 206
11.14.2 Characters Of The Representations Vj 211
11.14.3 Unitarization 213
11.14.4 Inhomogeneous Polynomials And Möbius Transformations On CP1 213
11.14.5 The Geometrical Interpretation Of P2j 217
11.15 The Clebsch-Gordon Decomposition For SU (2) 220
11.16 Lie Groups And Lie Algebras And Lie Algebra Representations 222

11.16.1 Some Useful Formulae For Working With Exponentials Of Operators 222
11.16.2 Lie Algebras 229
11.16.3 The Classical Matrix Groups Are Lie Groups 232
11.16.4 Representations Of Lie Algebras 240
11.16.5 Finite Dimensional Irreducible Representations Of sl(2, C), sl(2, R),
and su(2) 242
11.16.6 Casimirs 247
11.16.7 Lie Algebra Operators In The Induced Representations Of SU (2) 249

12. Group Theory And Elementary Number Theory 251


12.1 Reminder On gcd And The Euclidean Algorithm 251
12.2 Application: Expressing elements of SL(2, Z) as words in S and T 256
12.3 Products Of Cyclic Groups And The Chinese Remainder Theorem 258
12.3.1 The Chinese Remainder Theorem 263

13. The Group Of Automorphisms 265


13.1 The group of units in ZN 271
13.2 Group theory and cryptography 277
13.2.1 How To Break RSA: Period Finding 280
13.2.2 Period Finding With Quantum Mechanics 280

14. Semidirect Products 285

15. Group Extensions and Group Cohomology 300


15.1 Group Extensions 300
15.2 Projective Representations 308
15.2.1 How projective representations arise in quantum mechanics 310
15.3 How To Classify Central Extensions 316
15.4 Extended Example: Charged Particle On A Circle Surrounding A Solenoid 333
15.4.1 Hamiltonian Analysis 333
15.4.2 Remarks About The Quantum Statistical Mechanics Of The Particle
On The Ring 345
15.4.3 Gauging The Global SO(2) Symmetry, Chern-Simons Terms, And
Anomalies 351
15.5 Heisenberg Extensions 361
15.5.1 Heisenberg Groups: The Basic Motivating Example 361
15.5.2 Example: The Magnetic Translation Group For Two-Dimensional
Electrons 364
15.5.3 The Commutator Function And The Definition Of A General Heisen-
berg Group 365
15.5.4 Classification Of U (1) Central Extensions Using The Commutator
Function 367
15.5.5 Pontryagin Duality And The Stone-von Neumann-Mackey Theorem 368
15.5.6 Some More Examples Of Heisenberg Extensions 373

15.5.7 Lagrangian Subgroups And Induced Representations 381
15.5.8 Automorphisms Of Heisenberg Extensions 383
15.5.9 Coherent State Representations Of Heisenberg Groups: The Bargmann
Representation 395
15.5.10 Some Remarks On Chern-Simons Theory 395
15.6 Non-Central Extensions Of A General Group G By An Abelian Group A:
Twisted Cohomology 395
15.6.1 Crystallographic Groups 400
15.6.2 Time Reversal 403
15.6.3 T2 = (−1)2j and the Clebsch-Gordon Decomposition 409
15.7 General Extensions 410
15.8 Group cohomology in other degrees 414
15.8.1 Definition 415
15.8.2 Interpreting the meaning of H^{0+ω} 419
15.8.3 Interpreting the meaning of H^{1+ω} 419
15.8.4 Interpreting the meaning of H^{2+ω} 420
15.8.5 Interpreting the meaning of H^3 420
15.9 Some references 422

16. Overview of general classification theorems for finite groups 422


16.1 Brute force 423
16.2 Finite Abelian Groups 428
16.3 Finitely Generated Abelian Groups 431
16.4 The Classification Of Finite Simple Groups 433

17. Categories: Groups and Groupoids 440


17.1 Groupoids 450
17.2 The topology behind group cohomology 452

18. Lattice Gauge Theory 456


18.1 Some Simple Preliminary Computations 456
18.2 Gauge Group And Gauge Field 458
18.3 Defining A Partition Function 461
18.4 Hamiltonian Formulation 466
18.5 Topological Gauge Theory 466

19. Example: Symmetry Protected Phases Of Matter In 1 + 1 Dimensions 471

1. Introduction

Historically, group theory began in the early 19th century. In part it grew out of the
problem of finding explicit formulae for roots of polynomials.1 Later it was realized that
groups were crucial in transformation laws of tensors and in describing and constructing
geometries with symmetries. This became a major theme in mathematics near the end of
the 19th century. In part this was due to Felix Klein’s very influential Erlangen program.
In the 20th century group theory came to play a major role in physics. Einstein’s 1905
theory of special relativity is based on the symmetries of Maxwell’s equations. The general
theory of relativity is deeply involved with the groups of diffeomorphism symmetries of
manifolds. With the advent of quantum mechanics the representation theory of linear
groups, particularly SU (2) and SO(3) came to play an important role in atomic physics,
despite Niels Bohr’s complaints about “die Gruppenpest.” One basic reason for this is the
connection between group theory and symmetry, discussed in chapter ****. The theory of
symmetry in quantum mechanics is closely related to group representation theory.
Since the 1950’s group theory has played an extremely important role in particle theory.
Groups help organize the zoo of subatomic particles and, more deeply, are needed in the
very formulation of gauge theories. In order to formulate the Hamiltonian that governs
interactions of elementary particles one must have some understanding of the theory of Lie
algebras, Lie groups, and their representations.
In the late 20th and early 21st century group theory has been essential in many areas
of physics including atomic, nuclear, particle, and condensed matter physics. However,
the beautiful and deep relation between group theory and geometry is manifested perhaps
most magnificently in the areas of mathematical physics concerned with gauge theories
(especially supersymmetric gauge theories), quantum gravity, and string theory. It is with
that in the background that I decided to cover the topics in the following chapters.
Finally, the author would like to make two requests of the reader: First, much of what
follows is standard textbook material. But some is nonstandard. If you make use of any of
the nonstandard material in something you write, please give proper acknowledgement to
these notes. Second, if you find any mistakes in these notes please do not hesitate to send
me an email. (But please, first do check carefully it really is a mistake.) These notes are
a work in progress and are continually being updated. Thank you - Gregory Moore.

1.1 Equivalence Relations


A very elementary, but very basic idea that we will use repeatedly is that of an equivalence
relation. A good reference for this elementary material is I.N. Herstein, Topics in Algebra,
sec. 1.1.
Definition 1.1.1: Let X be any set. A binary relation ∼ on X is an equivalence relation
if ∀a, b, c ∈ X:
1. a ∼ a
2. a ∼ b ⇒ b ∼ a
3. a ∼ b and b ∼ c ⇒ a ∼ c

1
For a romantic description, see the chapter “Genius and Stupidity” in E.T. Bell’s Men of Mathematics.
For what is likely a more realistic account see chapter 6 of T. Rothman’s Science à la Mode.
Example 1.1.1 : The notion of equality satisfies these axioms of an equivalence
relation. So a ∼ b iff a = b is an equivalence relation. The main point, however, is that an
equivalence relation is a more flexible notion than equality, and yet captures many of the
important aspects of equality.
Example 1.1.2 : X = Z, a ∼ b if a − b is even.
Example 1.1.3 : More generally, let X = Z, and choose a positive integer N . We
can define an equivalence relation by saying that a ∼ b iff a − b is divisible by N .
Example 1.1.4 : At the other extreme from equality we could say that every element
of the set X is equivalent to every other element. This would be the coarsest possible
equivalence relation on the set.
Definition 1.1.2: Let ∼ be an equivalence relation on X. The equivalence class of
an element a is

[a] ≡ {x ∈ X : x ∼ a} (1.1)

In the above two examples we have


Example 1.1.1’ : If our equivalence relation is just equality then the equivalence
class of every element has only one element: [a] = {a}. The set of equivalence classes is in
bijective correspondence with the original set.
Example 1.1.2’ : [n] is the set of all integers with the same parity as n. For example,
[1] = {n : n is an odd integer}
[4] = {n : n is an even integer}.
Example 1.1.3’ : Consider the equivalence relation a ∼ b iff a − b is divisible by N .
Recall that if n is an integer then we can write n = r + N q in a unique way where the quo-
tient q is integral and the remainder or residue modulo N is the integer r ∈ {0, 1, . . . , N −1}.
Thus there is a bijective correspondence between the set of equivalence classes and the set
{0, 1, . . . , N − 1}. The equivalence class of an integer n will sometimes be written as n̄.
One way to write it is

n̄ := n + N Z := {. . . n − 2N, n − N, n, n + N, n + 2N, . . . } (1.2)

Example 1.1.4’ : If we take the coarsest possible equivalence relation then there is
only one equivalence class, namely the full set X itself.
Here is a simple, but basic, principle:

The distinct equivalence classes of an equivalence relation on X decompose X into


a union of mutually disjoint subsets. Conversely, given a disjoint decomposition
X = ⊔i Xi we can define an equivalence relation by saying a ∼ b if a, b ∈ Xi .

For example, the integers are the disjoint union of the even and odd integers, and the
corresponding equivalence relation is the one mentioned above: a ∼ b iff a − b is even.
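To see the partition principle in action, here is a small Python sketch (added for illustration, not part of the original notes; the modulus N = 4 and the window of integers are arbitrary choices) that sorts a finite window of integers into residue classes and checks that the classes are disjoint and cover the window.

# Partition a window of integers into equivalence classes: a ~ b iff N divides (a - b).
N = 4
window = range(-10, 11)

classes = {}                      # representative in {0,...,N-1} -> members of its class
for n in window:
    classes.setdefault(n % N, set()).add(n)

# The classes are mutually disjoint ...
reps = list(classes)
assert all(classes[r1].isdisjoint(classes[r2])
           for i, r1 in enumerate(reps) for r2 in reps[i + 1:])

# ... and their union is the whole window, so they partition it.
assert set().union(*classes.values()) == set(window)
print(classes)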

2. Groups: Basic Definitions And Examples

We begin with the abstract definition of a group.

Definition 2.1: A group is a quartet (G, m, I, e) where

1. G is a set.

2. m : G × G → G is a map, called the group multiplication map.

3. I : G → G is a map, called the inverse map

4. e ∈ G is a distinguished element of G called the identity element.

These data (G, m, I, e) are required to satisfy the following conditions:

1. m is associative: For all g1 , g2 , g3 ∈ G we have

m(m(g1 , g2 ), g3 ) = m(g1 , m(g2 , g3 )) (2.1)

2.
∀g ∈ G m(g, e) = m(e, g) = g (2.2)

3.
∀g ∈ G m(I(g), g) = m(g, I(g)) = e (2.3)

The above notation is unduly heavy, and we will not use it. Thus, we give the definition
again, but more informally:
For all a, b ∈ G there is a unique element of G, called the product and denoted a · b ∈ G.
In other words, we streamline notation by writing a · b := m(a, b).
The product is required to satisfy three axioms:

1. Associativity: (a · b) · c = a · (b · c)

2. Existence of an identity element: ∃e ∈ G such that:

∀a ∈ G a·e=e·a=a (2.4)

3. Existence of inverses: Again, we streamline notation by writing a−1 := I(a), so that


a · a−1 = a−1 · a = e

Remarks

1. We will often denote e by 1, or, when discussing more than one group at a time, we
denote the identity in a particular group G by 1G . The identity element is also often
called the unit element, although the term “unit” can have other meanings when
dealing with more general mathematical structures such as rings.

2. Also, we sometimes denote the product of a and b simply by ab.

3. We can drop some axioms and still have objects of mathematical interest. For ex-
ample, a monoid is a set M with a multiplication map m : M × M → M which is
associative. And that’s all. If there is an identity element e ∈ M which functions as
the identity for this multiplication then we speak of a unital monoid. The further as-
sumption of inverses turns the monoid into a group. The definition of a group seems
to be in the Goldilocks region of having just enough data and conditions to allow a
deep theory, but not having too many constraints to allow only a few examples. It is
just right to have a deep and rich mathematical theory.

4. We can also put further mathematical structures on the data (G, m, I, e). For exam-
ple, if G is a topological space and m and I are both continuous maps, then we have
a topological group. If G is furthermore a manifold and m and I are real analytic in
local coordinates, then we have a Lie group.

Exercise
a.) Show that e is unique. 2
b.) Given a, is a−1 unique?
c.) Show that axioms 2 and 3 above are slightly redundant: For example, just assuming
a · e = a and a · a−1 = e, show that e · a = a follows as a consequence.

Example 2.1: As a set, G = Z, R, or C. The group operation is ordinary addition:

m(a, b) := a + b (2.5)

The reader should check all the axioms.


Example 2.2: A simple generalization is to take n-tuples for a positive integer n: G =
Zn , Rn , Cn , with the operation being vector addition, so if ~x = (x1 , . . . , xn ) and ~y =
(y1 , . . . , yn ) then
m(~x, ~y ) := (x1 + y1 , . . . , xn + yn ) (2.6)

Example 2.3: G = R∗ := R−{0} or G = C∗ := C−{0}. Now if x, y ∈ G then m(x, y) := xy


is ordinary multiplication of complex numbers. Check the axioms.

Definition 2.2: Suppose (G, m, I, e) is a group and H ⊂ G is a subset so that m and I


preserve H, that is, the restriction of m takes H × H → H and the restriction of I maps
H → H. (It then follows that e ∈ H.) In this case we say that (H, m, I, e) is a subgroup of
(G, m, I, e).
2
Answer : Suppose that two elements e1 , e2 ∈ G behave as units. Consider the product e1 · e2 . Using
e1 as a unit we can say this is e2 . On the other hand, using e2 as a unit we can say this is e1 . Therefore
e1 = e2 .

Exercise Subgroups
a.) Z ⊂ R ⊂ C with operation +, define subgroups.
b.) Is Z − {0} a monoid (with m given by standard multiplication) ?
c.) Is Z − {0} ⊂ R∗ a subgroup?
d.) Let R∗>0 and R∗<0 denote the positive and negative real numbers, respectively.
Using ordinary multiplication of real numbers, which of these are subgroups of R∗ ?
e.) Consider the negative real numbers R<0 with the multiplication rule:

m(x, y) = −xy (2.7)

Show that this defines a group law on R<0 , but that (R<0 , m, ...) is not a subgroup of R∗ .
3

Definition 2.3: The order of a group G, denoted |G|, is the cardinality of G as a set.
Roughly speaking this is the same as the “number of elements in G.” A group G is called
a finite group if |G| < ∞, and is called an infinite group otherwise.

Already, with the simple concepts we have just introduced, we can ask nontrivial
questions. For example:
Does every infinite group necessarily have proper subgroups of infinite order?
This is of course true of the examples we have just discussed. It is actually not easy to
think of counterexamples, but in fact there are infinite groups all of whose proper subgroups
are finite. 4

Let us continue with an overview of examples of groups:


The groups in Examples 1,2,3 above are of infinite order. Here are examples of finite
groups:

Example 2.4: The group of N th roots of unity. Choose a natural number N . 5 We


let µN be the set of complex numbers z such that z N = 1. Thus we could write

µN = {1, ω, . . . , ω N −1 } (2.8)
where ω = exp[2πi/N ]. This is a finite group with N elements, as is easily checked.

3
Answer : The point is that the multiplication of (2.7) is not the restriction of the multiplication on R∗
to the negative reals.
4
One example is the Prüfer groups. These are subgroups of the group of roots of unity. They are
defined by choosing a prime number p and taking the subgroup of roots of unity of order pn for some
natural number n. Even wilder examples are the “Tarski Monster groups” (not to be confused with the
Monster group, which we will discuss later). These are infinite groups all of whose proper nontrivial
subgroups are isomorphic to the cyclic group of order p.
5
The natural numbers are the same as the positive integers.

Exercise
Does µ137 have any nontrivial subgroups? 6

Exercise
In Example 2.4, show that if N is even then the subset of even powers of ω forms
a proper subgroup of µN . What happens if N is odd?

Example 2.5: The residue classes modulo N , also called “The cyclic group of
order N.” Let N be a positive integer. Recall that we can put an equivalence relation
on Z defined by a ∼ b iff a − b is divisible by N , and we denoted the class of an integer
n by n̄. (One could identify the set of equivalence classes with the set {0, 1, . . . , N − 1}.)
We take G to be the set of equivalence classes of integers modulo N , using the notation
of equivalence relations in §1.1 above. We need to define m(r̄1 , r̄2 ). To do this we choose
representatives r1 , r2 from each equivalence class and take

m(r̄1 , r̄2 ) := (r1 + r2 ) (2.9)

(that is, the product of the two classes is the class of the integer r1 + r2 ).

The main thing to check here is that the equation is well-defined, since we chose represen-
tatives for each equivalence class. This group, which appears frequently in the following,
will be denoted as Z/N Z or ZN . For example, telling time in hours is arithmetic in Z12 ,
or in Z24 in railroad/military time. The reader should note that it “resembles” closely the
group µN . We will make that precise in the next section.
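The well-definedness check can be made very concrete. The following Python sketch (an added illustration; N = 12 and the particular classes are arbitrary choices) picks random representatives of two classes in ZN and confirms that the class of their sum never depends on that choice.

import random

N = 12  # arithmetic "on the clock"

def add_classes(r1, r2):
    """Group law on Z_N: the class of r1 + r2, represented in {0,...,N-1}."""
    return (r1 + r2) % N

r1bar, r2bar = 5, 9          # two classes, named by their small representatives
for _ in range(1000):
    # pick arbitrary representatives r1 in r1bar + N*Z and r2 in r2bar + N*Z
    r1 = r1bar + N * random.randint(-100, 100)
    r2 = r2bar + N * random.randint(-100, 100)
    assert add_classes(r1, r2) == (r1bar + r2bar) % N   # independent of the representatives
print("m(5bar, 9bar) =", (r1bar + r2bar) % N, "in Z_12")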
So far, all our examples had the property that for any two elements a, b

a·b=b·a (2.10)
Definition 2.5: When equation (2.10) holds for two elements a, b ∈ G we say that “a and
b commute.” If a and b commute for every pair (a, b) ∈ G × G then we say that G is an
Abelian group, or simply that “G is Abelian.”

Note: Note that our abbreviated notation a · b for the group multiplication m(a, b) would
actually be quite confusing when working with ZN . The reason is that it is also possible
to define a ring structure (see Chapter 2) where one multiplies r1 and r2 as integers and
then takes the residue. This is NOT the same as m(r1 , r2 ) !! For example, if we take
N = 5 then m(2, 3) = 0 in Z5 because 2 + 3 = 5 is congruent to 0 modulo 5. Of course,
6
Answer : We will give an elegant answer below.

multiplying as integers 2 × 3 = 6 and 6 is congruent to 1 mod 5. When considering Abelian
groups we often prefer to use the abbreviated notation

a + b := m(a, b) (2.11)

When we use this additive notation for Abelian groups we will write the identity element
as 0 so that a + 0 = 0 + a = a. (Writing “a + 1 = a” would look extremely weird.) Note
that we will not always use additive notation for Abelian groups! For example, for µN the
multiplicative notation is quite natural.
Since we defined a notion of “Abelian group” we are implicitly suggesting there are
examples of groups which are not Abelian. If one tries to use the group axioms to prove
that m(g1 , g2 ) = m(g2 , g1 ) one will fail. The only way we can know conclusively that
one will fail is to provide a counterexample. The next example gives a set of examples of
nonabelian groups:

Example 2.6: The General Linear Group


Let κ = R or κ = C. Define Mn (κ) to be the set of all n × n matrices whose matrix
elements lie in κ. Note that this is a unital monoid under matrix multiplication. But it is
not a group, because some matrices are not invertible. Therefore we define: ♣κ will be our
official symbol for a general field. This needs to be changed from k in many places below. ♣

GL(n, κ) := {A|A = n × n invertible matrix over κ} ⊂ Mn (κ) (2.12)

When κ = R or κ = C, GL(n, κ) is a group of infinite order. It is Abelian if n = 1 and


nonabelian if n > 1. There are some important generalizations of this example: 7 We
could let κ be any field. If κ is a finite field then GL(n, κ) is a finite group. More generally,
if R is a ring GL(n, R) is the subset of n × n matrices with entries in R with an inverse
in Mn (R). This set forms a group. For example, GL(2, Z) is the set of 2 × 2 matrices of
integers such that the inverse matrix is also a 2 × 2 matrix of integers. This set of matrices
forms an infinite nonabelian group under matrix multiplication.
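Here is one such counterexample, in a short numpy sketch added for illustration (the two particular matrices are arbitrary choices): two invertible 2 × 2 integer matrices that do not commute, showing that GL(2, Z), and hence GL(n, κ) for n > 1, is nonabelian.

import numpy as np

A = np.array([[1, 1],
              [0, 1]])          # det A = 1, so the inverse also has integer entries
B = np.array([[1, 0],
              [1, 1]])          # det B = 1, same remark

print(A @ B)                    # [[2, 1], [1, 1]]
print(B @ A)                    # [[1, 1], [1, 2]]
assert not np.array_equal(A @ B, B @ A)   # the group multiplication is not commutative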

Definition 2.5: The center Z(G) of a group G is the set of elements z ∈ G that commute
with all elements of G:

Z(G) := {z ∈ G|zg = gz ∀g ∈ G} (2.13)

Exercise Due Diligence: The Center :


a.) Show that for any group G, Z(G) is an Abelian subgroup of G.
7
See Chapter 2 for some discussion of the mathematical notions of fields and rings used in this paragraph.

b.) Show that the center of GL(n, κ) is the subgroup of matrices proportional to the
unit matrix with scalar factor in κ∗ . 8

Example 2.7: The Classical Matrix Groups


A matrix group is a subgroup of GL(n, κ). There are several interesting examples
which we will study in great detail later. Some examples include:
The special linear group:

SL(n, κ) ≡ {A ∈ GL(n, κ) : detA = 1} (2.14)


The orthogonal groups:

O(n, κ) := {A ∈ GL(n, κ) : AAtr = 1}


(2.15)
SO(n, κ) := {A ∈ O(n, κ) : detA = 1}

Another natural class are the unitary and special unitary groups:

U (n) := {A ∈ GL(n, C) : AA† = 1} (2.16)

SU (n) := {A ∈ U (n) : detA = 1} (2.17)


Finally, to complete the standard list of classical matrix groups we consider the stan-
dard symplectic form on R2n :
J = \begin{pmatrix} 0 & 1_{n\times n} \\ -1_{n\times n} & 0 \end{pmatrix} \in M_{2n}(R) \qquad (2.18)

Note that the matrix J satisfies the properties:

J = J ∗ = −J tr = −J −1 (2.19)

Definition A symplectic matrix is a matrix A such that

Atr JA = J (2.20)
8
Answer : It is obvious that matrices of the form z1n×n with z ∈ κ∗ are in the center. What is not
immediately obvious is that there are no other elements of the center. Here is a careful proof that this is
indeed the case: Consider the matrix units: eij . The matrix eij has a 1 in the ith row and j th column and
zeroes elsewhere. Note that for any matrix A we have eii Aejj = Aij eij with no sum on i, j here. On the
RHS Aij is a matrix element, not a matrix. Now let z be in the center. Check that for any pair ij the
matrix 1 + eij is invertible. Therefore, if z is in the center then z must commute with 1 + eij and hence z
must commute with eij for all i, j. Now, as we observed above, eii zejj = zij eij holds for any matrix, but
P
since z is also central eii zejj = eii ejj z = δij ejj z So z is diagonal. But for any diagonal matrix z = k zk ekk
we have (zA)ij = zi Aij and (Az)ij = Aij zj . As long as there are matrices with Aij 6= 0 and invertible we
can conclude that zi = zj .

We define the symplectic groups:

Sp(2n, κ) := {A ∈ GL(2n, κ)|Atr JA = J} (2.21)

Remarks:

1. As an exercise you should show from the definition above that the most general
element of SO(2, R) must be of the form
\begin{pmatrix} x & y \\ -y & x \end{pmatrix}, \qquad x^2 + y^2 = 1 \qquad (2.22)

where the matrix elements x, y are real. Thus we recognize that group elements in
SO(2, R) are in 1-1 correspondence with points on the unit circle in the plane. We
can even go further and parametrize x = cos φ and y = sin φ and φ is a coordinate
provided we identify φ ∼ φ + 2π so the general element of SO(2, R) is of the form:
R(\phi) := \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \qquad (2.23)

This is familiar from the implementation of rotations of the Euclidean plane in Carte-
sian coordinates. Note that the group multiplication law is

R(φ1 )R(φ2 ) = R(φ1 + φ2 ) (2.24)

so, in φ “coordinates” the group multiplication law is continuous, differentiable, even


(real) analytic. Similarly, taking an inverse is φ → −φ.

2. Let us consider the group U (1): This is simply the group of 1 × 1 unitary matrices.
They are not hard to diagonalize. The general matrix can be written as z(φ) = eiφ
with multiplication z(φ1 )z(φ2 ) = z(φ1 + φ2 ), where φ and φ + 2π yield identical group
elements. Again, as with µN and ZN the groups look like they are “the same”
although strictly speaking they are different sets and therefore have different m’s.
We will make this idea precise in the next section.

3. One of the most important groups in both mathematics and physics is SU (2). We
claim that the general element of SU (2) looks like
g = \begin{pmatrix} z & -w^* \\ w & z^* \end{pmatrix} \qquad (2.25)

for a pair of complex numbers (z, w) ∈ C2 such that

|z|2 + |w|2 = 1 (2.26)

One can prove this by studying the 4 equations for the matrix elements in the identity
gg † = 1. Another way to proceed makes use of some concepts from the linear algebra
chapter below and goes as follows: Since g is unitary it follows that the basis
g\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad g\begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (2.27)

should be orthonormal. Therefore, if we write

g = \begin{pmatrix} z & u \\ w & v \end{pmatrix} \qquad (2.28)

it must be that \begin{pmatrix} z \\ w \end{pmatrix} is orthogonal to \begin{pmatrix} u \\ v \end{pmatrix}, and hence u = −λw∗ and v = λz ∗ .
Moreover, since norms are preserved we know that |z|2 + |w|2 = 1. So the general
unitary matrix must be of the form
g = \begin{pmatrix} z & -\lambda w^* \\ w & \lambda z^* \end{pmatrix} \qquad (2.29)

for some phase λ. But now if we impose detg = 1 we discover that λ = 1. Note that,
writing out the real and imaginary parts of z, w we see that the equation (2.26) is
just the equation for a three-dimensional sphere: just as SO(2) and U (1) are, as
manifolds, the circle S^1 , so SU (2) also has the structure of a manifold, namely the
three-dimensional sphere S^3 . (A quick numerical check of this parametrization, and
of the multiplication law (2.24), is sketched after these remarks.)

4. Some of what we have just said can be generalized to all the classical matrix groups:
They can be identified with manifolds (although no other matrix groups are spheres
- not an obvious fact.) There are parametrizations of these manifolds, so the group
multiplication and inverse are smooth operations. Moreover, the groups “act” nat-
urally on various linear spaces. (See Section 4.1 for the notion of “group action.”)
This is part of the theory of Lie groups. Lie groups have vast applications in physics.
For example, G = SU (3) is the gauge group of a Yang-Mills theory that describes the
interactions of quarks and gluons, while G = SU (3) × SU (2) × U (1) is related to the
standard model that describes all known elementary particles and their interactions.
The general theory of Lie groups will be discussed in Chapter 8(?) below, although
we will meet many many examples before then.
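As promised above, here is a quick numerical sanity check (a sketch added for illustration; the angles and the pair (z, w) are arbitrary choices) of the SO(2) multiplication law (2.24) and of the claim that (2.25) with |z|^2 + |w|^2 = 1 is a special unitary matrix.

import numpy as np

def R(phi):
    """The rotation matrix (2.23)."""
    return np.array([[np.cos(phi),  np.sin(phi)],
                     [-np.sin(phi), np.cos(phi)]])

phi1, phi2 = 0.7, 2.3                        # arbitrary angles
assert np.allclose(R(phi1) @ R(phi2), R(phi1 + phi2))    # the SO(2) group law (2.24)

# Any (z, w) with |z|^2 + |w|^2 = 1 gives an element of SU(2) via (2.25).
v = np.array([0.3 - 0.5j, 0.7 + 0.2j])       # arbitrary nonzero pair ...
z, w = v / np.linalg.norm(v)                 # ... normalized so |z|^2 + |w|^2 = 1
g = np.array([[z, -np.conj(w)],
              [w,  np.conj(z)]])
assert np.allclose(g @ g.conj().T, np.eye(2))    # g is unitary
assert np.isclose(np.linalg.det(g), 1.0)         # det g = 1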

Example 2.8 Function spaces as groups.


Suppose G is a group. Suppose X is any set. Consider the set of all functions from X
to G:
F = {f : f is a function from X → G} (2.30)

If we want to stress the role of X and/or G we write F[X → G] for F. We claim that F is
also a group. The main step to show this is simply giving a definition of the group multi-
plication and the inversion operation. The product mF (f1 , f2 ) of two functions f1 , f2 ∈ F
must be another function in F. We define this function by giving a formula for the values
at all values of x ∈ X:
mF (f1 , f2 )(x) := mG (f1 (x), f2 (x)) (2.31)
It is the only sensible thing we could write given the data at hand. In less cumbersome
notation:
(f1 · f2 )(x) := f1 (x) · f2 (x) (2.32)
Similarly inverse of f is the function that maps x → f (x)−1 , where f (x)−1 ∈ G is the
group element in G inverse to f (x) ∈ G.
If both X and G have finite cardinality then F[X → G] is a finite group. If X or G has
an infinite set of points then this is an infinite order group. If X is a positive dimensional
manifold and G is a Lie group (notions defined below) this is an infinite-dimensional space.
In the special case of the space of maps from the circle into the group:

LG = F[S 1 → G] (2.33)

we have the famous “loop group” whose representation theory has many wonderful prop-
erties, closely related to the subjects of 2d conformal field theory and string theory. In
some cases if X is a manifold and G is a classical matrix group then, taking a subgroup
defined by suitable continuity and differentiability properties, we get the group of gauge
transformations of Yang-Mills theory. As a simple example, you are probably familiar with
the gauge transformation in Maxwell theory:

Aµ → Aµ + ∂µ ε (2.34)

where Aµ is the vector potential so that Fµν = ∂µ Aν − ∂ν Aµ is the field strength tensor.
Here ε : M1,3 → R is a function on 1+3 dimensional Minkowski space. (In a more careful
account one would put restrictions on the allowed functions - they should be differentiable
and satisfy suitable boundary conditions - etc.) The more canonical object is

f : x ↦ e^{iε(x)} (2.35)

and this is a function from spacetime M1,3 to U (1), so f ∈ F[M1,3 → U (1)]. This is a
better point of view because it generalizes in interesting ways to other spacetimes. Note
that the gauge transformation law can be written as:

(−i∂µ + A′µ ) = f −1 (−i∂µ + Aµ )f (2.36)

For many reasons this is a conceptually superior way to write it.
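A minimal Python sketch of this construction (added for illustration; the three-element set X, the choice G = Z3, and the encoding of functions as dictionaries are all arbitrary) implements the pointwise product (2.32) and spot-checks the group axioms for F[X → G].

import random

X = ["a", "b", "c"]          # any finite set
N = 3                        # take G = Z_N with addition mod N

def mult(f1, f2):
    """Pointwise product: (f1 . f2)(x) := f1(x) + f2(x) mod N."""
    return {x: (f1[x] + f2[x]) % N for x in X}

def inv(f):
    return {x: (-f[x]) % N for x in X}

def rand_f():
    return {x: random.randrange(N) for x in X}

identity = {x: 0 for x in X}
for _ in range(100):
    f, g, h = rand_f(), rand_f(), rand_f()
    assert mult(mult(f, g), h) == mult(f, mult(g, h))   # associativity
    assert mult(f, inv(f)) == identity                  # inverses
    assert mult(f, identity) == f                       # identity element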


Example 2.9: Permutation Groups.
Let X be any set. A permutation of X is a one-one invertible transformation φ : X →
X. The composition φ1 ◦φ2 of two permutations is a permutation. The identity permutation

leaves every element unchanged. The inverse of a permutation is a permutation. Thus,
composition defines a group operation on the permutations of any set. This group is
designated SX . It is an extremely important group and we will be studying it a lot. In the
case where X = M is a manifold we can also ask that our permutations φ : M → M be
continuous or even differentiable. If φ and φ−1 are differentiable then φ is a diffeomorphism.
The composition of diffeomorphisms is a diffeomorphism by the chain rule, so the set of
diffeomorphisms Diff(M ) is a subgroup of the set of all permutations of M . The group
Diff(M ) is the group of gauge symmetries in General Relativity. Except in the case where
M = S 1 is the circle, remarkably little is known about the diffeomorphism groups of
manifolds. One can ask simple questions about them whose answers are unknown.
Example 2.10: Power Sets As Groups.
Let X be any set and let P(X) be the power set of X. It is, by definition, the set of all
subsets of X. If Y1 , Y2 ∈ P(X) are two subsets of X then define the symmetric difference:

Y1 + Y2 := (Y1 − Y2 ) ∪ (Y2 − Y1 ) (2.37)

This defines an abelian group structure on P(X). The identity element 0 is the empty set
∅ and the inverse of Y is Y itself: That is, in this group

2Y := Y + Y = ∅ = 0 (2.38)
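Python's built-in set type makes this group easy to experiment with. The sketch below (an added illustration on an arbitrary three-element X) enumerates P(X), takes symmetric difference as the group law, and checks closure, that ∅ is the identity, and that every subset is its own inverse.

from itertools import chain, combinations

X = {1, 2, 3}
power_set = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(X), k) for k in range(len(X) + 1))]

def m(Y1, Y2):
    return frozenset(Y1 ^ Y2)          # symmetric difference (Y1 - Y2) U (Y2 - Y1)

empty = frozenset()
for Y1 in power_set:
    assert m(Y1, empty) == Y1          # the empty set is the identity
    assert m(Y1, Y1) == empty          # each element is its own inverse
    for Y2 in power_set:
        assert m(Y1, Y2) in power_set  # closure: still a subset of X
print(len(power_set), "elements; an abelian group of order 2^|X|")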

Definition 2.4 Let G1 , G2 be two groups. The direct product of G1 , G2 is the set G1 × G2
with product:

mG1 ×G2 ((g1 , g2 ), (g1′ , g2′ )) = (mG1 (g1 , g1′ ), mG2 (g2 , g2′ )) (2.39)

Exercise Due Diligence: Direct Product Of Groups


a.) Check the group axioms.
b.) Generalize this to arbitrary products: Given a map G from a set I to the set of all
groups define the product over I as a group.
c.) Show that the direct product of two finite groups is finite.
Remark: It turns out that there can be several interesting ways to put a group
structure m on the set G1 × G2 . We will explore this in great detail in sections **** below.

Exercise Classical Matrix Groups: Due diligence


Check that each of the above sets (2.14),(2.15),(2.16), (2.21), are indeed subgroups of
the general linear group.

Exercise Apparent Asymmetry In The Definitions
In (2.15) we used AAtr = 1 but we could have used Atr A = 1. Similarly, in (2.16)
we used AA† = 1 rather than A† A = 1. Finally, in (2.21) we could, instead, have defined
Sp(2n, κ) to be matrices in M2n (κ) such that AJAtr = J. In all three cases, writing things
the other way defines the same group: Why?
(Careful: Just taking the transpose or hermitian conjugate of these equations does not
help.) 9

Exercise O(2, R) vs. SO(2, R)


a.) Show from the definition above of O(2, R) that the most general element of this
group is either of the form (2.22) above, or of the form

\begin{pmatrix} x & y \\ y & -x \end{pmatrix}, \qquad x^2 + y^2 = 1 \qquad (2.40)

btw: Note that


\begin{pmatrix} x & y \\ y & -x \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x & y \\ -y & x \end{pmatrix} \qquad (2.41)

b.) Show that no matrix in O(2, R) is simultaneously of the form (2.22) and (2.40).
Conclude that, as a manifold, O(2, R) is a disjoint union of two circles.

Exercise Symplectic groups and canonical transformations


Let q i , pi i = 1, . . . n be coordinates and momenta for a classical mechanical system.
The Poisson bracket of two functions f (q 1 , . . . q n , p1 , . . . pn ), g(q 1 , . . . q n , p1 , . . . pn ) is
defined to be
\{f, g\} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial q^i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q^i} \right) \qquad (2.42)

a.) Show that


{q i , q j } = {pi , pj } = 0 {q i , pj } = δ i j (2.43)

9
Answer : Hint: Remember that in a group the inverse matrix is in the group. Consider replacing
g → g −1 in the definition.

Suppose we define new coordinates and momenta Qi , Pi to be linear combinations of
the old:

\begin{pmatrix} Q^1 \\ \vdots \\ Q^n \\ P_1 \\ \vdots \\ P_n \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1,2n} \\ \vdots & \ddots & \vdots \\ a_{2n,1} & \cdots & a_{2n,2n} \end{pmatrix} \cdot \begin{pmatrix} q^1 \\ \vdots \\ q^n \\ p_1 \\ \vdots \\ p_n \end{pmatrix} \qquad (2.44)
where A = (aij ) is a constant 2n × 2n matrix.
b.) Show that
{Qi , Qj } = {Pi , Pj } = 0 {Qi , Pj } = δji (2.45)
if and only if A is a symplectic matrix.
c.) Show that J ∈ Sp(2n, R). Note that it exchanges momenta and coordinates.
d.) What are the conditions on the n × n matrix B so that
\left\{ \begin{pmatrix} 1 & B \\ 0 & 1 \end{pmatrix} \right\} \qquad (2.46)

is a subgroup. 10
e.) What are the conditions on the n × n matrix C so that
\left\{ \begin{pmatrix} 1 & 0 \\ C & 1 \end{pmatrix} \right\} \qquad (2.47)

is a subgroup. 11
f.) Show that

\begin{pmatrix} 1 & 0 \\ C & 1 \end{pmatrix} = J \begin{pmatrix} 1 & B \\ 0 & 1 \end{pmatrix} J^{-1} \qquad (2.48)
for C = −B.

Exercise The Quaternion Group And The Pauli Group


When working with spin-1/2 particles it is very convenient to introduce the standard
Pauli matrices:

σ^1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (2.49)

σ^2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (2.50)
10
Answer : B must be a symmetric matrix
11
Answer : C must be a symmetric matrix

σ^3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (2.51)

a.) Show that they satisfy the identity, valid for all 1 ≤ i, j ≤ 3:

σ^i σ^j = δ^{ij} + i ε^{ijk} σ^k (2.52)

b.) Show that the set of matrices

Q = {±1, ±iσ 1 , ±iσ 2 , ±iσ 3 } (2.53)

forms a subgroup of order 8 of SU (2) ⊂ GL(2, C). It is known as the quaternion group.
c.) Show that the set of matrices

P = {±1, ±i, ±σ 1 , ±σ 2 , ±σ 3 , ±iσ 1 , ±iσ 2 , ±iσ 3 } (2.54)

forms a subgroup of U (2) ⊂ GL(2, C) of order 16. It is known as the Pauli group.

Remark: 12 The Pauli group is often used in quantum information theory. If we think
of the quantum Hilbert space of a spin 1/2 particle (isomorphic to C2 with standard inner
product) then there is a natural basis of up and down spins: v1 = | ↑i and v2 = | ↓i.
Thinking of these as quantum analogs |0i and |1i of classical information bits 0, 1 we see
that X = σ 1 acts as a “bit flip,” while Z = σ 3 acts as a “phase-flip.” Y = iσ 2 flips both
bits and phases. These are then quantum error operators. Note that if we have a chain of
N spin 1/2 particles then the N th direct product

PN = P
| × ·{z
· · × P} (2.55)
N times

acts naturally on this chain of particles. 13 This group is useful in quantum information
theory. For example if H ⊂ P N is a subgroup such that (−1, ...., −1) is not in H then
we can study the subspace of Hilbert space {ψ|gψ = ψ, ∀g ∈ H}. For astutely chosen
subgroups these are useful quantum code subspaces, known as stabilizer codes.
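The closure claims in parts (b) and (c) of the exercise above can be verified by brute force. The following numpy sketch (added for illustration) lists the eight matrices of Q and the sixteen of P and checks that each set is closed under matrix multiplication; for a finite subset of a group, closure under the product already implies it is a subgroup.

import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)

Q = [s * m for s in (1, -1) for m in (one, 1j * s1, 1j * s2, 1j * s3)]
P = [s * m for s in (1, -1, 1j, -1j) for m in (one, s1, s2, s3)]

def closed(group):
    def member(g):                       # is g one of the listed matrices?
        return any(np.allclose(g, h) for h in group)
    return all(member(a @ b) for a in group for b in group)

assert len(Q) == 8 and closed(Q)         # the quaternion group
assert len(P) == 16 and closed(P)        # the Pauli group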

Exercise Function Groups


Interpret the direct product Gn of a group with itself n times as a group of the form
F[X → G] for some X.

12
Many terms used here will be more fully explained in Chapter 2.
13
See 4.1 for the formal definition of a group action on a space.

3. Homomorphism and Isomorphism

Definition 3.1: Let (G, m, I, e) and (G0 , m0 , I0 , e0 ) be two groups,


1.) A homomorphism from (G, m, I, e) to (G0 , m0 , I0 , e0 ) is a mapping that preserves
the group law. That is, it is a map of sets ϕ : G → G0 such that, for all g1 , g2 ∈ G we
have:
ϕ(m(g1 , g2 )) = m0 (ϕ(g1 ), ϕ(g2 )) (3.1)

2.) If ϕ is 1-1 and onto it is called an isomorphism.


3.) One often uses the term automorphism of G when ϕ is an isomorphism and G = G0 ,
that is G and G0 are literally the same set with the same multiplication law.

Remarks

1. We will henceforward be more informal and simply say that ϕ : G → G0 is a homo-


morphism of groups if, for all g1 , g2 ∈ G:

\varphi(\underbrace{g_1 g_2}_{\text{product in } G}) = \overbrace{\varphi(g_1)\varphi(g_2)}^{\text{product in } G'} \qquad (3.2)

2. A common slogan is: “isomorphic groups are the same.”

Example 1: µN is isomorphic to ZN : Let N be a positive integer. Then we can define


a homomorphism
ϕ : ZN → µN (3.3)

as follows. We want to define ϕ(r̄). Recall that r̄ = r + N Z is an equivalence class. We


choose any representative r′ ∈ r + N Z. Then we set:

\varphi(\bar r) := \exp\left( 2\pi i \frac{r'}{N} \right) \qquad (3.4)

There is a crucial thing to check here: We need to check that the map is actually well-
defined. We know that any two representatives r1′ and r2′ for r̄ must have the property
that r1′ − r2′ = 0 mod N , that is, r1′ − r2′ = ℓN for some integer ℓ, and now by standard
properties of complex numbers we see that indeed exp(2πi r1′ /N ) = exp(2πi r2′ /N ).
Next we check that
ϕ(r̄1 + r̄2 ) = ϕ(r̄1 )ϕ(r̄2 ) (3.5)

If you unwind the definitions you should find this follows from a standard property of the
exponential map.
Equation (3.5) implies that (3.3) is a homomorphism. In fact one easily checks:
a.) If ϕ(r̄) = 1 then r̄ = 0̄. Thus ϕ is 1-1 (a.k.a. “injective”).

b.) Every element of µN is of the form ϕ(r̄) for some r̄. Thus, ϕ is onto (a.k.a.
“surjective”). Note that this is equivalent to saying that every element in µN is of the form
ω j where ω = e2πi/N .
Thus, ϕ is in fact an isomorphism. As we mentioned above, the two groups “seemed
to be the same.” We have now given precise meaning to that idea.
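A short numerical check of this isomorphism (an added illustration; N = 7 and the ranges of random representatives are arbitrary choices) confirms both that ϕ is well-defined and that it turns addition of classes into multiplication of roots of unity.

import cmath, random

N = 7

def phi(r):                                   # phi(rbar), computed from a representative r
    return cmath.exp(2j * cmath.pi * r / N)

for _ in range(200):
    r1, r2 = random.randrange(N), random.randrange(N)
    k1 = random.randint(-50, 50)
    # well-defined: shifting the representative by a multiple of N does not change the value
    assert cmath.isclose(phi(r1 + k1 * N), phi(r1), abs_tol=1e-9)
    # homomorphism: phi(r1bar + r2bar) = phi(r1bar) * phi(r2bar)
    assert cmath.isclose(phi((r1 + r2) % N), phi(r1) * phi(r2), abs_tol=1e-9)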

Example 2: A family of homomorphisms µN → µN : For each integer k we can define


the k th power map
pk : µN → µN (3.6)

by
pk (z) = z k (3.7)

where z is any N th root of unity. Note that z k is also an N th root of unity. Moreover
(z1 z2 )k = z1k z2k by elementary properties of complex numbers, so pk is a homomorphism.
Note that it is not always injective or surjective. For example, if k is a multiple of N it is
the stupid homomorphism. In fact pk+N = pk .

Example 3: A family of homomorphisms ZN → ZN :


For any integer k we can define the “k th multiplication map”

mk : ZN → ZN (3.8)

by the equation:
mk (r̄) := kr (3.9)

where on the right hand side kr is defined by choosing a representative r for the class r̄
and then using ordinary multiplication of integers k × r (e.g. 2 × 3 = 6) and then reducing
modulo N . Again, one needs to check the equation is well-defined. Note that mk+N = mk .

Example 4: Relating the homomorphisms in the previous three examples: Since


ZN and µN are isomorphic, one should expect that homomorphisms ZN → ZN and µN →
µN should be related. Moreover, one should have the intuition that pk and mk somehow
have the “same effect.” Indeed, note that

pk (ω j ) = (ω j )k = ω jk . (3.10)

is the essential identity. More formally, one easily checks that

ϕ ◦ mk = pk ◦ ϕ (3.11)

Or, since ϕ is invertible,


pk = ϕ ◦ mk ◦ ϕ−1 . (3.12)

In mathematics one often uses commutative diagrams to express identities such as
(3.11). In this case the diagram looks like
         mk
   ZN --------> ZN
    |            |
  ϕ |            | ϕ
    v    pk      v
   µN --------> µN              (3.13)

We say a diagram commutes if the following condition holds: The diagram describes a
graph with sets associated to vertices and maps associated with oriented edges. Consider
following the arrows around any two paths on the graph with the same beginning and
final points. We compose the maps associated with those arrows to get two maps from the
initial set to the final set. The diagram commutes iff any pair of maps obtained this way
are equal.
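The commutativity of (3.13) can likewise be checked directly in a few lines of Python (added as an illustration, with arbitrary N and k): the two paths around the square define the same map.

import cmath

N, k = 10, 3
phi = lambda r: cmath.exp(2j * cmath.pi * r / N)   # the isomorphism Z_N -> mu_N
m_k = lambda r: (k * r) % N                        # multiplication by k on Z_N
p_k = lambda z: z ** k                             # k-th power map on mu_N

for r in range(N):
    # going right-then-down equals going down-then-right: phi(m_k(r)) == p_k(phi(r))
    assert cmath.isclose(p_k(phi(r)), phi(m_k(r)), abs_tol=1e-9)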
Remark: We will discuss in detail later on that when k is an integer relatively prime
to N the map pk is an automorphism of µN and mk is an automorphism of ZN . For
example, in Z/3Z = {0̄, 1̄, 2̄}, if we take k = 2, or any integer congruent to 2 modulo 3, then
mk exchanges 1̄ and 2̄. (Check that such an exchange is indeed a homomorphism!) We will
discuss this kind of example in greater detail in Section §13 below.
One kind of homomomorphism is especially important:

Definition 3.2: A matrix representation of a group G is a homomorphism

T : G → GL(n, κ) (3.14)

for some positive integer n and field κ. (One can also have matrix representations in
GL(n, R) where R is a ring.)
More generally, if V is a vector space over a field κ let GL(V ) denote the group of all
invertible linear transformations from V → V . (The group multiplication is composition.)
If G is a group then a homomorphism T : G → GL(V ) defines a representation of G.
Sometimes V is referred to as the carrier space.
*******************************************
PUT EXAMPLE OF HOMOMORPHISM π : SU (2) → SO(3) HERE. IT GIVES A
3-dimensional MATRIX REP OF SU (2).
CURRENTLY IT IS IN SECTION 10.1
BUT SOME OF THAT SHOULD BE PUT HERE
*******************************************

Exercise Preservation Of Structure


Show that, for any group homomorphism µ we always have:

µ(1G ) = 1G0 (3.15)

µ(g −1 ) = µ(g)−1 (3.16)

Exercise The Stupid Homomorphism
Consider the map µ : G → G0 defined by µ(g) = 1G0 . Show that this is a homomor-
phism.

Exercise Some Simple Isomorphisms


a.) Show that the exponential map x → e^x defines an isomorphism between the
additive group (R, +) and the multiplicative group (R∗>0 , ×).
b. ) Show that SO(2) and U (1) are isomorphic groups.
c.) Show that z → z −1 is an automorphism of U (1) → U (1). What is the corresponding
automorphism of SO(2)?

Exercise A group “with one free generator”


Consider a group with a nontrivial element g0 such that every element in the group is
a power of g0 or g0^{−1} , and g0^n = g0^m iff n = m in the integers.
Show that this group is isomorphic to Z.
Remark: This is an example of what we will call below a group freely generated by
one element.

Exercise Sometimes diagrams don’t commute


Show that the diagram
         mk1
   ZN --------> ZN
    |            |
  ϕ |            | ϕ
    v    pk2     v
   µN --------> µN              (3.17)

commutes iff k1 = k2 mod N .

Exercise The Quaternion Group

Construct a homomorphism

µ : Q → Z2 × Z2 (3.18)

where Q is the Quaternion group (2.53).

Exercise Subgroups of ZN
a.) Show that the subgroups of ZN are isomorphic to the groups ZM for M |N .
b.) For N = 8, M = 4 write out H.

Exercise
Let S2 be any set with two elements.
a.) Show that there are exactly two possible group structures on S2 , and in each case
construct an isomorphism of S2 with µ2 ≅ Z2 .
b.) Consider the matrix group of two elements:
Ŝ2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right\} \qquad (3.19)

with multiplication being matrix multiplication. Construct an isomorphism with S2 . 14

Exercise Some Simple Representations Of µN


Let ω = e2πi/N .
a.) Show that for any integer k the k th power map

pk (ω j ) = ω jk (3.21)

defines a representation of µN by 1 × 1 matrices.


14
Answer : Write S2 = {e, σ} with e the identity and σ 2 = e. Define µ : S2 → Ŝ2
µ(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad µ(σ) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (3.20)

b.) Show that

µ : ω^j ↦ R(2πj/N ) := \begin{pmatrix} \cos(2\pi j/N) & \sin(2\pi j/N) \\ -\sin(2\pi j/N) & \cos(2\pi j/N) \end{pmatrix} \qquad (3.22)
defines a two-dimensional matrix representation of ZN .
c.) Let P be the N × N “shift matrix” all of whose matrix elements are zero except
for 1’s just below the diagonal and P1,N = 1. See equation (10.19) below. Show that

µ(ω j ) = P j (3.23)

is an N × N dimensional representation of µN .

Exercise Two Characterizations Of Abelian Groups


Let G be a group.
a.) Consider the map: µ : G → G given by squaring: µ(g) = g 2 . Show that µ is a
group homomorphism iff G is Abelian.
b.) Consider the map:
G×G→G (3.24)
defined by group multiplication: µ(g1 , g2 ) = m(g1 , g2 ) = g1 g2 . Show that µ is a group
homomorphism iff G is Abelian.

Exercise Isomorphisms And Preservation Of Structure


a.) Suppose ϕ : G1 → G2 is an isomorphism. Show that ϕ−1 is an isomorphism.
b.) Suppose that ϕ : G1 → G2 is an isomorphism, and ϕ′ : G′1 → G′2 is an iso-
morphism. Suppose also that ν1 : G1 → G′1 is a homomorphism (not necessarily an
isomorphism). Show that there is a unique homomorphism ν2 : G2 → G′2 so that we have
the commutative diagram:

         ν1
   G1 --------> G′1
    |            |
  ϕ |            | ϕ′
    v    ν2      v
   G2 --------> G′2             (3.25)

Exercise Fiber Products


Given groups G1 and G2 , and homomorphisms ψ1 : G1 → H and ψ2 : G2 → H one
can define a subset of G1 × G2 known as a fiber product:

G1 ×ψ1 ,ψ2 G2 := {(g1 , g2 )|ψ1 (g1 ) = ψ2 (g2 )} . (3.26)

Show that the fiber product is in fact a subgroup of G1 × G2 , where G1 × G2 has the direct
product group structure.

4. Group Actions On Sets

4.1 Group Actions On Sets


Recall that we said that if X is any set then a permutation of X is a 1-1 and onto mapping
X → X. The set SX of all permutations forms a group under composition.
We now define the notion of an action of a group on a set. This is a very important
notion, and we will return to it extensively when discussing examples. If the following
discussion seems too abstract the reader should consult section 8 for a number of concrete
examples beyond the ones we are about to give. There are three ways to think about a
group action on a set:

First Way: A transformation group on X is a subgroup of SX .

Second Way: We define a left G-action on a set X to be a map φ : G×X → X compatible


with the group multiplication law as follows:

φ(g1 , φ(g2 , x)) = φ(g1 g2 , x) (4.1)

We would also like x 7→ φ(1G , x) to be the identity map. Now, equation (4.1) implies that

φ(1G , φ(1G , x)) = φ(1G , x) (4.2)

which is compatible with, but does not quite imply that φ(1G , x) = x. Thus in defining a
group action we must also impose the condition:

φ(1G , x) = x ∀x ∈ X. (4.3)

Exercise
Give an example of a map φ : G × X → X that satisfies (4.1) but not (4.3). 15

Third Way: Yet another way to say this is the following: Define the map Φ : G → SX
that takes g 7→ φ(g, ·). That is, for each g ∈ G, Φ(g) is the function X → X taking
15
Answer : As the simplest example, choose any element x0 ∈ X and define φ(g, x) = x0 for all g, x. For
a slightly less trivial example consider G = S2 and let φ(e, x) = φ(σ, x) = f (x). Then if f ◦ f (x) = f (x)
the condition (4.1) will be satisfied, but there certainly exist functions with f ◦ f = f which are not the
identity map.

x 7→ φ(g, x). Clearly Φ(g1 ) ◦ Φ(g2 ) = Φ(g1 g2 ) because of (4.1). In order to make sure it
is a permutation we need to know that Φ(g) is invertible and therefore we need to impose
that Φ(1G ) is the identity transformation. This follows from (4.3). Then Φ(g) ∈ SX . So,
to say we have a group action of G on X is to say that Φ is a homomorphism of G into the
permutation group SX . We will discuss G-actions on sets and their properties extensively
in Chapter 3.

Definition: If X has a group action by a group G we say that X is a G-set.

Notation: Again, our notation is overly cumbersome because we want to stress the con-
cept. Usually one writes a left G-action as

g · x := φ(g, x) (4.4)

The key axioms become

g1 · (g2 · x) = (g1 g2 ) · x
(4.5)
1G · x = x

We can think of Φ(g) as the map that sends x → g · x.

Definition/Discussion: Orbits: If G acts on a set X then we can define an equivalence


relation on X by saying that two elements x1 , x2 ∈ X are equivalent, x1 ∼ x2 if there is
some g ∈ G with φ(g, x1 ) = x2 . The reader should check that this is indeed an equivalence
relation. The equivalence class [x] with this equivalence relation is known as the orbit of
G through a point x. So, concretely it is the set of points y ∈ X which can be reached by
the action of G:
OG (x) = {y : ∃g such that y = g · x} (4.6)

The notion of orbits is very important in geometry, gauge theory and many other subjects.
The set of orbits is denoted X/G. We will discuss many examples below.
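Here is a small Python sketch (added for illustration; the ten-point set, the single generating permutation, and the dictionary encoding are all arbitrary choices) that computes the orbits of a group of permutations acting on a finite set, exhibiting the decomposition of X into disjoint orbits.

def orbits(X, generators):
    """Orbits of the group generated by the given permutations (dicts X -> X)."""
    remaining, result = set(X), []
    while remaining:
        x = remaining.pop()
        orbit, frontier = {x}, [x]
        while frontier:
            y = frontier.pop()
            for g in generators:
                z = g[y]
                if z not in orbit:
                    orbit.add(z)
                    frontier.append(z)
        # for permutations of a finite set, closing under the generators alone suffices,
        # since the inverse of a permutation of finite order is a positive power of it
        remaining -= orbit
        result.append(orbit)
    return result

X = range(10)
f = {x: (x + 4) % 10 for x in X}              # one generator: rotation by 4 on Z/10
print(orbits(X, [f]))                         # two disjoint orbits, each of size 5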

Examples

1. Consider rotations around the origin of R2 . They act on the points of R2 as a G-


action. The distinct orbits are circles centered on the origin. The origin is also an
orbit by itself.

2. Consider rotations around the origin by multiples of 2π/3. Check that this group is
isomorphic to Z3 . Consider an equilateral triangle centered on the origin. Then the
rotations act on the triangle preserving it. Thus, Z3 acts as a group of symmetries
of the equilateral triangle. Intuitively, group theory is the theory of symmetry. We
have just illustrated how that idea can be formalized through the notion of group
action on a set. ♣FIGURE HERE!

3. Let G = GL(n, κ) and X = κn , the n-dimensional vector space over κ. Then the
usual linear action on vectors defines a group action of G on X. One can check that
there are only two orbits: the zero vector is one orbit, and the set of all nonzero vectors is the other.

4. If G = Z2 acts linearly on Rn+1 (i.e. V = Rn+1 is a representation of Z2 ) then we


can choose coordinates so that the nontrivial element σ ∈ G acts by

σ · (x1 , . . . , xn+1 ) = (x1 , . . . , xp , −xp+1 , · · · , −xp+q ) (4.7)

where p + q = n + 1. Note that this action preserves the equation of the sphere
\sum_i (x^i)^2 − 1 = 0 and hence descends to a Z2 -action on the sphere S^n . The case
p = 0, q = n + 1 is the antipodal map. The set of orbits is known as RP^n ≅ S^n /⟨−1⟩.
There are many other natural actions of Z2 on S^n .

5. Let the group be G = C∗ . This acts on X = Cn by scaling all the coordinates. Note
that scaling a nonzero vector by a nonzero scalar gives a nonzero vector so G = C∗
also restricts to act on X̃ = Cn − {0}. The set of orbits in these two cases are very
different. One can put a natural topology on the set of orbits. One finds that Cn /C∗
is not a Hausdorff space while (Cn − {0})/C∗ is a nice manifold. This important
manifold is often denoted CPn−1 . See Chapter 3 for more discussion. We will often
denote elements of CPn−1 by [X 1 : · · · : X n ]. This stands for the equivalence class of
a vector (X 1 , . . . , X n ) ∈ Cn − {0}.

6. Now consider a set of integers (q1 , . . . , qn ) ∈ Zn . Then for each such set of integers
there is a C∗ -action on CPn−1 defined by

µ · [X 1 : · · · : X n ] := [µq1 X 1 : · · · : µqn X n ] (4.8)

for µ ∈ C∗ . (Check it is well-defined!)

7. The group G = SL(2, R) acts on the complex upper half plane:

H = {τ |Imτ > 0} (4.9)

via
aτ + b
g · τ := (4.10)
cτ + d
where

g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad (4.11)
The reader should check that indeed:

g1 · (g2 · τ ) = (g1 g2 ) · τ (4.12)

8. Consider the action of Z on R where n : x 7→ x + n. The orbit of a real number r is


r + Z. Note that the value of the function p(x) := e2πix uniquely determines an orbit.
So we can identify the space of orbits X/G = R/Z with this action with the circle.

9. Actions of Z on any set X. Let us consider Z to be the free group with one generator
g0 . Then, given any invertible map f : X → X we can define a group action of Z on
X by

g_0^n \cdot x = \begin{cases} \underbrace{f \circ \cdots \circ f}_{n\ \text{times}}(x) & n > 0 \\ x & n = 0 \\ \underbrace{f^{-1} \circ \cdots \circ f^{-1}}_{|n|\ \text{times}}(x) & n < 0 \end{cases} \qquad (4.13)

Conversely, any Z-action must be of this form since we can define f (x) := g0 · x. Thus
the orbit of a point x ∈ X is the set of all images of successive actions of f and f −1 .

10. Let M1,d be (d + 1)-dimensional Minkowski space and consider an electromagnetic


gauge potential Aµ to be a map Aµ : M1,d → Rd+1 . (This is geometrically inaccurate
because Aµ is an example of what is called a connection, but it will suffice for our
present purposes.) Then the group G = M ap[M1,d → U (1)] is a group which acts on
the set A of all gauge potentials by Aµ → Aµ − i f^{−1} ∂µ f . The space of orbits A/G
parametrizes gauge-inequivalent field configurations.

********************************
SHOULD DEFINE A MAP, OR MORPHISM, OF G-SPACES HERE. INTRODUCE
THE GENERAL IDEA OF AN EQUIVARIANT MAP.
GIVE SOME EXAMPLES.
*********************************

4.2 Group Actions On Sets Induce Group Actions On Associated Function


Spaces
The following general abstract idea is of great importance in both mathematics and physics:
Suppose X and Y are any two sets and F[X → Y ] is the set of functions from X to Y . Now
suppose that there is a left G-action on X defined by φ : G×X → X. Then, automatically,
there is also a G action φ̃ on F[X → Y ]. To define it, suppose F ∈ F[X → Y ] and g ∈ G.
Then we need to define φ̃(g, F ) ∈ F[X → Y ]. We do this by setting φ̃(g, F ) to be that
specific function whose values are defined by:

φ̃(g, F )(x) := F (φ(g −1 , x)). (4.14)

Note the inverse of g on the RHS. It is there so that the group law works out:

φ̃(g1 , φ̃(g2 , F ))(x) = φ̃(g2 , F )(φ(g1−1 , x))
= F (φ(g2−1 , φ(g1−1 , x)))
= F (φ(g2−1 g1−1 , x)) (4.15)
= F (φ((g1 g2 )−1 , x))
= φ̃(g1 g2 , F )(x)

and hence φ̃(g1 , φ̃(g2 , F )) = φ̃(g1 g2 , F ) as required for a group action. It should also be
clear that φ̃(1G , F ) = F .
In less cumbersome notation we would simply write

(g · F )(x) := F (g −1 · x) (4.16)

In the above discussion we could impose various conditions, on the functions in F[X →
Y ]. For example, if X and Y are manifolds we could ask our maps to be continuous,
differentiable, etc. The above discussion would be unchanged.
As just one (important) example of this general idea: In field theory if we have fields
on a spacetime, and a group of symmetries acting on that spacetime then that group also
acts on the space of fields.
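The inverse in (4.16) is exactly what makes this a left action. The Python sketch below (an added illustration; the choice of G = ZN acting on itself by translation, and of integer-valued functions, is arbitrary) encodes the induced action and verifies g1 · (g2 · F ) = (g1 g2 ) · F on random data.

import random

N = 6
X = list(range(N))                    # the set X, acted on by G = Z_N via translation
act = lambda g, x: (x + g) % N        # phi(g, x)

def induced(g, F):
    """(g . F)(x) := F(g^{-1} . x); functions are stored as dicts x -> value."""
    return {x: F[act((-g) % N, x)] for x in X}

for _ in range(200):
    F = {x: random.randint(0, 9) for x in X}
    g1, g2 = random.randrange(N), random.randrange(N)
    assert induced(g1, induced(g2, F)) == induced((g1 + g2) % N, F)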

Exercise When Y Is A G-Set


Suppose there is a left G-action on a set Y and X is any set. Show that there is a
natural left G-action on F[X → Y ].

5. The Symmetric Group.

The symmetric group is an important example of a finite group. As we shall soon see, all
finite groups are isomorphic to subgroups of the symmetric group.
Recall from section 2 above that for any set X we can define a group SX of all permu-
tations of the set X. If n is a positive integer the symmetric group on n elements, denoted
Sn , is defined as the group of permutations of the set X = {1, 2, . . . , n}.
In group theory, as in politics, there are leftists and rightists and we can actually define
two group operations:

(φ1 ·L φ2 )(i) := φ2 (φ1 (i))


(5.1)
(φ1 ·R φ2 )(i) := φ1 (φ2 (i))
That is, with ·L we read the operations from left to right and first apply the left permu-
tation, and then the right permutation. Etc. Each convention has its own advantages and
both are frequently used.
In these notes we will adopt the ·R convention and henceforth simply write φ1 φ2 for the product.
We can write a permutation symbolically as

φ = ( 1   2   · · ·  n
      p1  p2  · · ·  pn )                                             (5.2)

meaning: φ(1) = p1 , φ(2) = p2 , . . . , φ(n) = pn . Note that we could equally well write the
same permutation as:

φ = ( a1    a2    · · ·  an
      pa1   pa2   · · ·  pan )                                        (5.3)

where a1 , . . . , an is any permutation of 1, . . . , n. With this understood, suppose we want
to compute φ1 ·L φ2 . We should first see what φ1 does to the ordered elements 1, . . . , n,
and then see what φ2 does to the ordered output from φ1 . So, if we write:
φ1 = ( 1   · · ·  n
       q1  · · ·  qn )
                                                                      (5.4)
φ2 = ( q1  · · ·  qn
       p1  · · ·  pn )

Then

φ1 ·L φ2 = ( 1   · · ·  n
             p1  · · ·  pn )                                          (5.5)

On the other hand, to compute φ1 ·R φ2 we should first see what φ2 does to 1, . . . , n and
then see what φ1 does to that output. We could represent this as:

φ2 = ( 1    · · ·  n
       q1′  · · ·  qn′ )
                                                                      (5.6)
φ1 = ( q1′  · · ·  qn′
       p1′  · · ·  pn′ )

and then

φ1 ·R φ2 = ( 1    · · ·  n
             p1′  · · ·  pn′ )                                        (5.7)
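The two conventions are easy to implement and compare. Here is a small Python sketch (the helper names are ours) in which a permutation of {1, . . . , n} is stored in one-line notation, i.e. as the bottom row of (5.2); the example reproduces the products of transpositions in S3 that appear in (5.21) below.

# Permutations of {1,...,n} stored in one-line notation: phi[i-1] = phi(i), as in (5.2).
def compose_R(phi1, phi2):
    """phi1 ·R phi2 : first apply phi2, then phi1."""
    return tuple(phi1[phi2[i] - 1] for i in range(len(phi1)))

def compose_L(phi1, phi2):
    """phi1 ·L phi2 : first apply phi1, then phi2."""
    return compose_R(phi2, phi1)

t12 = (2, 1, 3)     # the transposition (12) in S_3
t13 = (3, 2, 1)     # the transposition (13)
print(compose_R(t12, t13))   # (3, 1, 2), i.e. the 3-cycle (132)
print(compose_L(t12, t13))   # (2, 3, 1), i.e. the 3-cycle (123)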

Exercise
a.) Show that the order of the group is |Sn | = n!.
b.) Show that if n1 ≤ n2 then we can consider Sn1 as a subgroup of Sn2 .
c.) In how many ways can you consider S2 to be a subgroup of S3 ? 16
d.) In how many ways can you consider Sn1 to be a subgroup of Sn2 when n1 ≤ n2 ? 17

Exercise Show that the inverse of (5.2) is the permutation:

φ^{-1} = ( p1  p2  · · ·  pn
           1   2   · · ·  n )                                         (5.8)

Figure 1: A pictorial view of the composition of two permutations φ1 , φ2 in S8 . Thus 1 → 3, 2 → 7
etc. for the group product φ2 · φ1 .

It is often useful to visualize a permutation in terms of “time evolution” (going up) as
shown in Figure 1.

Exercise Left versus right


a.) Show that in the pictorial interpretation the inverse is obtained by running arrows
backwards in time.
b.) Show that the left- and right- group operation conventions are related by

φ1 ·L φ2 = (φ1^{-1} ·R φ2^{-1})^{-1}                                  (5.9)
c.) Interpret (5.9) as the simple statement that φ1 ·R φ2 puts φ2 in the past while
φ1 ·L φ2 puts φ1 in the past.

5.1 Cayley’s Theorem


As a nice illustration of some of the concepts we have introduced we now prove Cayley’s
theorem. This theorem states that any finite group is isomorphic to a subgroup of a
permutation group SN for some N .
To prove this we begin with an elementary, but important, observation known as the rearrangement lemma.

The rearrangement lemma: Consider a totally ordered group, that is, we can list the group
elements in order
G = {g1 , g2 , . . . , } (5.10)
16 Answer : There are three subgroups of S3 isomorphic to S2 . They are the subgroups that fix 1, 2, 3
respectively.
17 Answer : for any subset T ⊂ {1, . . . , n2 } of cardinality n2 − n1 we can consider the subset of permutations
that fix all elements of T . This subset of permutations will be a subgroup isomorphic to Sn1 . So there are
(n2 choose n1) distinct subgroups isomorphic to Sn1 .

Consider this as an ordered set, with all the gi distinct. The set can be finite or infinite.
Then, for any h ∈ G consider the ordered set:

h · G = {h · g1 , h · g2 , . . . , }. (5.11)

With a little thought you can (and should) convince yourself that (5.11) is a permutation
of (5.10): No two elements coincide (since gi ≠ gj for i ≠ j) and every element of G
must appear in the list h · G.

To put this differently, there is a left-G-action of G on itself: For h ∈ G, define the


map L(h) : G → G by the rule:

L(h) : g 7→ h · g ∀g ∈ G. (5.12)

This map is one-one and invertible so L(h) ∈ SG , the group of permutations of the set G.
(In fact, there is no need to assume G is totally ordered.) Now note that

L(h1 ) ◦ L(h2 ) = L(h1 · h2 ) (5.13)

so the map L defined by L : h 7→ L(h) is a homomorphism

L : G → SG (5.14)

This is an example of a group action on a set. In this case X = G and G is acting on


itself by left-multiplication and L is the quantity denoted by Φ above. Furthermore, if
L(h1 ) = L(h2 ) then h1 = h2 . Therefore L is an isomorphism of G with its image in SG .
The above remarks apply to any group. However, now consider any finite group G
with N = |G|; then SG is isomorphic to SN . Therefore, any finite group is isomorphic to
a subgroup of a symmetric group SN for some N . This is Cayley’s theorem. Note that
which subgroup of SN we obtain depends on how we choose to order G, that is, it depends
on the choice of isomorphism SG ∼ = SN .
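Here is a minimal Python sketch of the construction used in Cayley's theorem, for the cyclic group Z4 (any finite group given by its multiplication table would work the same way; the encoding of group elements as indices is our choice).

# Cayley's theorem in miniature: realize a finite group as permutations of itself.
# Here G = Z_4 with elements 0,1,2,3 and group law addition mod 4.
elements = [0, 1, 2, 3]
def mult(a, b):
    return (a + b) % 4

def L(h):
    """The permutation of G given by left multiplication by h."""
    return tuple(mult(h, g) for g in elements)

def compose(p, q):             # p ∘ q as permutations of {0,1,2,3}, q applied first
    return tuple(p[q[i]] for i in range(4))

# L is a homomorphism into the group of permutations of the set G, cf. (5.13):
for h1 in elements:
    for h2 in elements:
        assert compose(L(h1), L(h2)) == L(mult(h1, h2))
# L is injective: distinct group elements give distinct permutations.
assert len({L(h) for h in elements}) == len(elements)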

Exercise Concrete Example


By Cayley’s theorem the cyclic group Zn of order n is isomorphic to a subgroup of a
permutation group. Exhibit such an isomorphic subgroup. 18

Exercise Right Action ♣This is redundant with some material on group actions below. ♣

There are other ways G can act on itself. For example we can define

R(a) : g 7→ g · a (5.15)
18 Answer : Choose any cyclic permutation of length n (cyclic permutations are defined in Section 5.2
below). Then it generates a subgroup of SN of order n for any N ≥ n.

a.) Show that R(a) permutes the elements of G.
b.) Show that R(a1 ) ◦ R(a2 ) = R(a2 a1 ). Thus, a 7→ R(a) is not a homomorphism of
G into the group SG of permutations of G.
c.) Show that a 7→ R(a−1 ) is a homomorphism of G into SG .

5.2 Cyclic Permutations And Cycle Decomposition


A very important class of permutations are the cyclic permutations of length ℓ. Choose ℓ
distinct numbers, a1 , . . . , aℓ between 1 and n and permute:

a1 → a2 → · · · → aℓ → a1                                             (5.16)

holding all other n − ℓ elements fixed. Such a permutation is called a cycle of length ℓ. We
will denote such permutations as:

φ = (a1 a2 . . . aℓ ).                                                (5.17)

Bear in mind that with this notation the same permutation can be written in ℓ different
ways:

(a1 a2 . . . aℓ ) = (a2 a3 . . . aℓ a1 ) = (a3 . . . aℓ a1 a2 ) = · · · = (aℓ a1 a2 . . . aℓ−1 )        (5.18)

Let us write out the elements of the first few symmetric groups in this notation:

S2 = {1, (12)} (5.19)

S3 = {1, (12), (13), (23), (123), (132)} (5.20)


Remarks

1. S2 is abelian.

2. S3 is NOT ABELIAN 19

(12) · (13) = (132)


(5.21)
(13) · (12) = (123)

and therefore Sn is not abelian for any n > 2.

It is not true that all permutations are just cyclic permutations, as we first see by
considering S4 :

S4 = {1, (12), (13), (14),(23), (24), (34), (12)(34), (13)(24), (14)(23),


(123), (132), (124), (142),(134), (143), (234), (243) (5.22)
(1234),(1243), (1324), (1342), (1423), (1432)}
19
Note that (12) ·L (13) = (123). But we use the ·R convention.

Now a key observation is:
Any permutation σ ∈ Sn can be uniquely written as a product of disjoint cycles. This
is called the cycle decomposition of σ.
For example
σ = (12)(34)(10, 11)(56789)                                           (5.23)
is a cycle decomposition in S11 . There are 3 cycles of length 2 and 1 of length 5.
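This decomposition is easy to compute algorithmically: follow each point until its orbit under φ closes up. A minimal Python sketch (the function name is ours):

# Cycle decomposition of a permutation of {1,...,n} given in one-line notation.
def cycle_decomposition(phi):
    n = len(phi)
    seen, cycles = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow start -> phi(start) -> ... until it closes
            seen.add(x)
            cycle.append(x)
            x = phi[x - 1]
        if len(cycle) > 1:            # fixed points are usually suppressed
            cycles.append(tuple(cycle))
    return cycles

# The permutation (5.23) in S_11: (12)(34)(10,11)(56789).
sigma = (2, 1, 4, 3, 6, 7, 8, 9, 5, 11, 10)
print(cycle_decomposition(sigma))
# [(1, 2), (3, 4), (5, 6, 7, 8, 9), (10, 11)]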

Exercise Decomposition as a product of disjoint cyclic permutations


Prove the above claim: every permutation can be written as a product of cyclic permutations
on disjoint sets of integers. 20

Exercise
a.) Let φ be a cyclic permutation of order ℓ. Suppose we compose φ with itself N
times. Show that the result is the identity transformation iff ℓ divides N .
b.) Suppose φ has a cycle decomposition with cycles of length k1 , . . . , ks . What is the
smallest number N so that if we compose φ with itself φ ◦ · · · ◦ φ for N times that we get
the identity transformation?

5.3 Transpositions
A transposition is a permutation of the form: (ij). These satisfy some nice properties:
Suppose i, j, k are distinct. You can check as an exercise that transpositions obey the
following identities:

(ij) · (jk) · (ij) = (ik) = (jk) · (ij) · (jk)


(ij)2 = 1 (5.24)
(ij) · (kl) = (kl) · (ij) {i, j} ∩ {k, l} = ∅
The first identity is illustrated in Figure 2. Draw the other two.
We observed above that there is a cycle decomposition of permutations. Now note
that
Any cycle (a1 , · · · , ak ) can be written as a product of transpositions. To prove this
note that
(1, k)(1, k − 1) · · · (1, 4)(1, 3)(1, 2) = (1, 2, 3, 4, . . . , k) (5.25)
20
Answer : Use induction: Consider any element, say x ∈ {1, . . . , n} and let φ be a permutation. Consider
the elements x, φ(x), φ(φ(x)), . . . . This must be a finite set C, so we get a cyclic permutation of the elements
in C. Then φ must permute all the elements in {1, . . . , n} − C. But this has cardinality strictly smaller
than n. So, use the inductive hypothesis.

Figure 2: Pictorial illustration of equation (5.24), line one, for transpositions where i < j < k. Note
that the identity is suggested by “moving the time lines” holding the endpoints fixed. Reading time
from bottom to top corresponds to reading the composition from left to right in the ·R convention.

Now, consider a permutation that takes

1 → a1 , 2 → a2 , 3 → a3 , · · · , k → ak (5.26)

For our purposes, it won’t really matter what it does to the other integers greater than k.
Choose any such permutation and call it φ. Note that

φ ◦ (1 2 · · · k) ◦ φ−1 = (a1 a2 · · · ak ) (5.27)

so now multiply the above identity by φ on the left and φ−1 on the right to get:

φ(1, k)φ−1 φ(1, k − 1)φ−1 · · · φ(1, 4)φ−1 φ(1, 3)φ−1 φ(1, 2)φ−1 = (a1 , a2 , . . . , ak ) (5.28)

but φ(1, j)φ−1 = (a1 , aj ). So we get a decomposition of (a1 a2 · · · ak ) as a product of


transpositions.
In general, a group element of the form ghg −1 is called a conjugate of h. See Section
7.2 below.
Therefore, every element of Sn can be written as a product of transpositions, gener-
alizing (5.21). We say that the transpositions generate the permutation group. Taking
products of various transpositions – what we might call a “word” whose “letters” are the
transpositions – we can produce any element of the symmetric group. We will return to
this notion in §6 below.
Of course, a given permutation can be written as a product of transpositions in many
ways. This clearly follows because of the identities (5.24). A nontrivial fact is that the
transpositions together with the above relations generate precisely the symmetric group. 21
It therefore follows that all possible nontrivial identities made out of transpositions follow
from repeated use of these identities.
21
This follows once one has shown that the Coxeter presentation given below gives precisely the symmetric
group, and not some larger group (requiring the imposition of further relations) since the above relations
all follow from the Coxeter relations.

Although permutations can be written as products of transpositions in different ways,
the number of transpositions in a word modulo 2 is always the same, because the identities
(5.24) have the same number of transpositions, modulo two, on the LHS and RHS. Thus
we can define even, resp. odd, permutations to be products of even, resp. odd numbers of
transpositions.

Definition: The alternating group An ⊂ Sn is the subgroup of Sn of even permuta-


tions.

Exercise
a.) What is the order of An ? 22
b.) Write out A2 , A3 , and A4 . Show that A3 is isomorphic to Z3 . 23

c.) A3 is Abelian. Is A4 Abelian? 24

Exercise
When do two cyclic permutations commute? Illustrate the answer with pictures, as
above.

Exercise A Smaller Set Of Generators


Show that from the transpositions σi := (i, i + 1), 1 ≤ i ≤ n − 1 we can generate all
other transpositions in Sn . These are sometimes called the elementary generators.

Exercise An Even Smaller Set Of Generators


22 Answer : n!/2 for n > 1. To prove this note that the transformation φ → φ ◦ (12) is an invertible
transformation Sn → Sn that squares to the identity. On the other hand, it exchanges even and odd
permutations.
23
Answer : A2 = {1}. A3 = {1, (123), (132)}.

A4 = {1, (123), (132), (124), (142), (134), (143), (234), (243), (12)(34), (13)(24), (14)(23)}.

24
Answer : No. Just multiply a few elements to find a counterexample. For example (123)(134) = (234)
but (134)(123) = (124).

Show that, in fact, Sn can be generated by just two elements: (12) and (1 2 · · · n). 25

Exercise Center of Sn
What is the center of Sn ? 26

Exercise Decomposing the reverse shuffle


Consider the permutation which takes 1, 2, . . . , n to n, n − 1, . . . , 1.
a.) Write the cycle decomposition.
b.) Write a decomposition of this permutation in terms of the elementary generators
σi . 27

Example 3.2 The sign homomorphism.


This is a very important example of a homomorphism:

ε : Sn → Z2                                                           (5.29)

where we identify Z2 as the multiplicative group {±1} of square roots of 1. The rule is:
ε : σ → +1 if σ is a product of an even number of transpositions.
ε : σ → −1 if σ is a product of an odd number of transpositions.
Put differently, we could define ε((ij)) = −1 for any transposition. This is compatible
with the words defining the relations on transpositions. Since the transpositions generate
the group the homomorphism is well-defined and completely determined.
In physics one often encounters the sign homomorphism in the guise of the “epsilon
tensor” denoted:
ε_{i1 ···in}                                                          (5.30)
Its value is:

1. ε_{i1 ···in} = +1 if

( 1   2   · · ·  n
  i1  i2  · · ·  in )                                                 (5.31)

is an even permutation.
25
Answer : Conjugate (12) by the n-cycle to get (23). Then conjugate again to get (34) and so forth.
Now we have the set of generators of the previous exercise.
26
Answer : If n = 2 then Sn is Abelian and the center is all of S2 . If n > 2 then the center is the trivial
group. To prove this suppose z ∈ Z(Sn ). If z is not the trivial element then it moves some i to some j.
WLOG we can say it moves 1 to i 6= 1. Then z(i) 6= i. If z(i) = 1 then z is the transposition (1, i). If n > 2
there will be some other j 6= 1, i and z will not commute with (1, j). If z(i) = j with j 6= 1, i then φ = (1, i)
does not commute with z because zφ takes 1 → j and φz takes 1 → 1.
27
Hint: Use the pictorial interpretation mentioned above.

2. ε_{i1 ···in} = −1 if

( 1   2   · · ·  n
  i1  i2  · · ·  in )                                                 (5.32)

is an odd permutation.

3. ε_{i1 ···in} = 0 if two indices are repeated. (This goes a bit beyond what we said above
since in that case we are not discussing a permutation.)

So, e.g. among the 27 entries of ε_{ijk} , 1 ≤ i, j, k ≤ 3 we have

ε_{123} = 1
ε_{132} = −1
ε_{231} = +1                                                          (5.33)
ε_{221} = 0

and so forth.
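A small Python sketch of the sign homomorphism and the epsilon tensor (computing ε(σ) from the cycle decomposition, using the fact that an ℓ-cycle is a product of ℓ − 1 transpositions; the function names are ours):

# The sign homomorphism epsilon and the epsilon tensor.
def sign(phi):
    """Sign of a permutation of {1,...,n} given in one-line notation."""
    n, seen, s = len(phi), set(), 1
    for start in range(1, n + 1):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = phi[x - 1]
            length += 1
        s *= (-1) ** (length - 1)    # an l-cycle is a product of l-1 transpositions
    return s

def epsilon(*indices):
    """epsilon_{i1...in}: 0 if an index repeats, otherwise the sign of the permutation."""
    if len(set(indices)) < len(indices):
        return 0
    return sign(indices)

print(epsilon(1, 2, 3), epsilon(1, 3, 2), epsilon(2, 3, 1), epsilon(2, 2, 1))
# prints: 1 -1 1 0, matching (5.33)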

Exercise
Show that

ε_{i1 i2 ···in} ε_{j1 j2 ···jn} = Σ_{σ∈Sn} ε(σ) δ_{i1 jσ(1)} δ_{i2 jσ(2)} · · · δ_{in jσ(n)}        (5.34)

This formula is often useful when proving identities involving determinants. An im-
portant special case occurs for n = 3 where it is equivalent to the rule for the cross-product
of 3 vectors in R3 :

~A × (~B × ~C) = ~B(~A · ~C) − ~C(~A · ~B)                            (5.35)

The next two exercises assume some familiarity with concepts from linear algebra. See
Chapter 2 below if they are not familiar.

Exercise The Canonical Permutation Representation Of Sn


Consider the standard Euclidean vector space Rn (or Cn or κn ) with basis vectors
~e1 , . . . , ~en where ~ei has component 1 in the ith position and zero else. Note that the
symmetric group permutes these vectors in an obvious way:

T (φ) : ~ei → ~eφ(i) , (5.36)

and now extend by linearity so that
T (φ) : Σ_{i=1}^n xi ei 7→ Σ_{i=1}^n xi e_{φ(i)} = Σ_{i=1}^n x_{φ^{-1}(i)} ei        (5.37)

Thus to any permutation φ ∈ Sn we can associate a linear transformation T (φ) on κn .


a.) Show that
T (φ1 ) ◦ T (φ2 ) = T (φ1 ◦ φ2 ) = T (φ1 ·R φ2 ) (5.38)
This means we have a linear representation of the group Sn .
b.) The matrix A(φ) of T (φ) relative to the ordered basis {e1 , . . . , en } is
defined by:

T (φ)~ei = Σ_{j=1}^n A(φ)ji ~ej                                       (5.39)

c.) Show that


A(φ1 )A(φ2 ) = A(φ1 ◦ φ2 ) (5.40)
and in particular that A(φ−1 ) = A(φ)−1 . Thus, φ → A(φ) is a matrix representation of
Sn .
d.) Write out A(φ) for small values of n and some simple permutations φ.
e.) Write a general formula for the matrix elements of A(φ). 28
f.) The matrices A(φ) are called permutation matrices. In each row and column there
is only one nonzero matrix element, and that nonzero element is 1. If B is any other n × n
matrix show that

(A(φ)^{-1} B A(φ))_{i,j} = B_{φ(i),φ(j)}                              (5.41)
In general, if we have a representation of a group T : G → GL(V ) and a nontrivial
subspace W ⊂ V such that T (g) takes vectors in W to vectors in W for all g ∈ G, we say
that the representation is reducible.
SEE SECTION ***** BELOW FOR MORE
g.) Show that the natural permutation representation of Sn on Rn is reducible. 29
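For readers who want to experiment, here is a minimal numerical sketch of the permutation matrices, using the formula A(φ)_{i,j} = δ_{i,φ(j)} quoted in the footnote (the helper names are ours).

# Permutation matrices A(phi) with A(phi)_{ij} = delta_{i, phi(j)}.
import numpy as np

def A(phi):
    n = len(phi)
    M = np.zeros((n, n), dtype=int)
    for j in range(1, n + 1):
        M[phi[j - 1] - 1, j - 1] = 1      # column j has its single 1 in row phi(j)
    return M

def compose_R(phi1, phi2):                # phi1 ∘ phi2, with phi2 applied first
    return tuple(phi1[phi2[i] - 1] for i in range(len(phi1)))

phi1, phi2 = (2, 3, 1), (2, 1, 3)         # the 3-cycle (123) and the transposition (12)
assert (A(phi1) @ A(phi2) == A(compose_R(phi1, phi2))).all()
assert np.allclose(np.linalg.inv(A(phi1)), A((3, 1, 2)))   # A(phi)^{-1} = A(phi^{-1})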

Exercise Signed Permutation Matrices


Define signed permutation matrices to be matrices such that in each row and
column there is exactly one nonzero matrix element, and the nonzero matrix element can be
either +1 or −1. Such matrices are automatically invertible.
a.) Show that the set of n × n signed permutation matrices form a group. We will call
it W (Bn ) for reasons that will not be obvious for a while.
b.) Define a group homomorphism W (Bn ) → Sn .

28
Answer : A(φ)i,j = δi,φ(j) = δφ−1 (i),j .
29
Answer : Show that the linear subspace spanned by the “all ones vector” v0 = e1 + · · · + en is preserved
under the action of T (φ) for all φ ∈ Sn : T (φ)(λv0 ) = λv0 , for all λ ∈ R.

5.4 Diversion and Example: Card shuffling

One way we commonly encounter permutation groups is in shuffling a deck of cards.


A deck of cards is equivalent to an ordered set of 52 elements. Some aspects of card
shuffling and card tricks can be understood nicely in terms of group theory.
Mathematicians often use the perfect shuffle or the Faro shuffle. Suppose we have a
deck of 2n cards, so n = 26 is the usual case. There are actually two kinds of perfect
shuffles: the In-shuffle and the Out-shuffle.
In either case we begin by splitting the deck into two equal parts, and then we interleave
the two parts perfectly.
Let us call the top half of the deck the left half-deck and the bottom half of the deck
the right half-deck. Then, to define the Out-shuffle we put the top card of the left deck on
top, followed by the top card of the right deck underneath, and then proceed to interleave
them perfectly. The bottom and top cards stay the same.
If we number the cards 0, 1, . . . , 2n − 1 from top to bottom then the top (i.e. left) half-
deck consists of the cards numbered 0, 1, . . . , n − 1 while the bottom (i.e. right) half-deck
consists of the cards n, n + 1, . . . , 2n − 1. Then the Out-shuffle gives the cards in the new
order
0, n, 1, n + 1, 2, n + 2, . . . , n − 2, 2n − 2, n − 1, 2n − 1                        (5.42)

Another way to express this is that the Out-shuffle defines a permutation of {0, 1, . . . , 2n−
1}. If we let Cx , 0 ≤ x ≤ 2n − 1 denote the cards in the original order then the new ordered
set of cards Cx0 are related to the old ones by:

C′_{O(x)} = C_x                                                       (5.43)

where

O(x) = { 2x                   x ≤ n − 1
       { 2x − (2n − 1)        n ≤ x ≤ 2n − 1                          (5.44)

Note that this already leads to a card trick: Modulo (2n − 1) the operation is just
x → 2x, so if k is the smallest number with 2^k = 1 mod (2n − 1) then k Out-shuffles will
restore the deck perfectly.
For example: For a standard deck of 52 cards, 2^8 = 256 = 5 × 51 + 1 so 8 perfect Out-shuffles
restore the deck!
We can also see this by working out the cycle presentation of the Out-shuffle:

O = (0)(1, 2, 4, 8, 16, 32, 13, 26)(3, 6, 12, 24, 48, 45, 39, 27)
(5, 10, 20, 40, 29, 7, 14, 28)(9, 18, 36, 21, 42, 33, 15, 30) (5.45)
(11, 22, 44, 37, 23, 46, 41, 31)(17, 34)(19, 38, 25, 50, 49, 47, 43, 35)(51)

Clearly, the 8th power gives the identity permutation.
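These statements are easy to check by machine. A minimal Python sketch (helper names ours) builds the Out-shuffle permutation from (5.44) and computes its order:

# The Out-shuffle as a permutation of {0,...,2n-1}, cf. (5.44), and its order.
def out_shuffle(n):
    """One-line form of O on a deck of 2n cards."""
    return [2 * x if x <= n - 1 else 2 * x - (2 * n - 1) for x in range(2 * n)]

def order(perm):
    """Smallest k >= 1 such that perm composed with itself k times is the identity."""
    k, current = 1, perm
    identity = list(range(len(perm)))
    while current != identity:
        current = [perm[c] for c in current]
        k += 1
    return k

assert order(out_shuffle(26)) == 8    # eight perfect Out-shuffles restore a 52-card deck
print(order(out_shuffle(26)))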


Now, to define the In-shuffle we put the top card of the right half-deck on top, then
the top card of the left half-deck underneath, and then proceed to interleave them.

Now observe that if we have a deck with 2n cards D(2n) := {0, 1, . . . , 2n − 1} and we
embed it in a Deck with 2n + 2 cards

D(2n) → D(2n + 2) (5.46)

by the map x → x + 1 then the Out-shuffle on the deck D(2n + 2) permutes the cards
1, . . . , 2n amongst themselves and acts as an In-shuffle on these cards! ♣Explain this some more, e.g. by illustrating with a pack of 6 cards. ♣
Therefore, applying our formula for the Out-shuffle we find that the In-shuffle is given
by the formula
I(x) = { 2(x + 1) − 1                   x + 1 ≤ n
       { 2(x + 1) − (2n + 1) − 1        n ≤ x ≤ 2n − 1                (5.47)

One can check that this is given by the uniform formula

I(x) = (2x + 1) mod(2n + 1) (5.48)

for x ∈ D(2n).
For 2n = 52 this turns out to be one big cycle!

(0, 1, 3, 7, 15, 31, 10, 21, 43, 34,16, 33, 14, 29, 6, 13, 27, 2, 5,
11, 23, 47, 42, 32, 12, 25, 51, 50,48, 44, 36, 20, 41, 30, 8, 17, (5.49)
35, 18, 37, 22, 45, 38, 24,49, 46, 40, 28, 4, 9, 19, 26)

so it takes 52 consecutive perfect In-shuffles to restore the deck.


One can do further magic tricks with In- and Out-shuffles. As one example there is
a simple prescription for bringing the top card to any desired position, say, position ℓ by
doing In- and Out-shuffles.
To do this we write ℓ in its binary expansion:

ℓ = 2^k + a_{k−1} 2^{k−1} + · · · + a1 2^1 + a0                       (5.50)

where aj ∈ {0, 1}. Interpret the coefficients 1 as In-shuffles and the coefficients 0 as Out-
shuffles. Then, reading from left to right, perform the sequence of shuffles given by the
binary expression: 1ak−1 ak−2 · · · a1 a0 .
To see why this is true consider iterating the functions o(x) = 2x and i(x) = 2x + 1.
Notice that the sequence of operations given by the binary expansion of ℓ are

0 → 1
  → 2 · 1 + a_{k−1}
  → 2 · (2 · 1 + a_{k−1}) + a_{k−2} = 2^2 + 2 a_{k−1} + a_{k−2}
  → 2 · (2^2 + 2 a_{k−1} + a_{k−2}) + a_{k−3} = 2^3 + 2^2 a_{k−1} + 2 a_{k−2} + a_{k−3}        (5.51)
  ...
  → 2^k + a_{k−1} 2^{k−1} + · · · + a1 2^1 + a0 = ℓ
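A small Python sketch (helper names ours) that carries out this prescription on a 52-card deck and checks that the top card indeed lands at position ℓ for every ℓ:

# Bring the top card to position l by reading the binary digits of l as In/Out-shuffles.
def out_shuffle_pos(x, n):      # new position of the card at position x after an Out-shuffle
    return 2 * x if x <= n - 1 else 2 * x - (2 * n - 1)

def in_shuffle_pos(x, n):       # new position after an In-shuffle, cf. (5.48)
    return (2 * x + 1) % (2 * n + 1)

def bring_top_card_to(l, n=26):
    """Follow the binary digits of l from the left: 1 = In-shuffle, 0 = Out-shuffle."""
    pos = 0                                   # the top card starts at position 0
    for bit in bin(l)[2:]:                    # the leading digit of l is always 1
        pos = in_shuffle_pos(pos, n) if bit == '1' else out_shuffle_pos(pos, n)
    return pos

assert all(bring_top_card_to(l) == l for l in range(1, 52))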

For an even ordered set we can define a notion of permutations preserving central
symmetry. For x ∈ D2n let x̄ = 2n − 1 − x. Then we define the group W (Bn ) ⊂ S2n to be
the subgroup of permutations which permutes the pairs {x, x̄} amongst themselves.
Note that there is clearly a homomorphism
φ : W (Bn ) → Sn (5.52)
Moreover, both O and I are elements of W (Bn ). Therefore the shuffle group, the group
generated by these is a subgroup of W (Bn ). Using this one can say some nice things about
the structure of the group generated by the in-shuffle and the out-shuffle. It was completely
determined in a beautiful paper (the source of the above material):
“The mathematics of perfect shuffles,” P. Diaconis, R.L. Graham, W.M. Kantor, Adv.
Appl. Math. 4 pp. 175-193 (1983)
It turns out that shuffles of decks of 12 and 24 cards have some special properties. In
particular, special shuffles of a deck of 12 cards can be used to generate a very interesting
group known as the Mathieu group M12 . It was, historically, the first “sporadic” finite
simple group. See section §16.4 below.
To describe M12 we need to introduce a Mongean shuffle. Here we take the deck of
cards put the top card on the right. Then from the deck on the left alternatively put cards
on the top or the bottom. So the second card from of the deck on the left goes on top of
the first card, the third card from the deck on the left goes under the first card, and so on.
If we label our deck as cards 1, 2, . . . , 2n then the Mongean shuffle is:
m : {1, 2, . . . , 2n} → {2n, 2n − 2, . . . , 4, 2, 1, 3, 5, . . . , 2n − 3, 2n − 1} (5.53)
In formulae, acting on D(2n)
m(x) = Min[2x, 2n + 1 − 2x] (5.54)
In particular for 2n = 12 we have
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} → {12, 10, 8, 6, 4, 2, 1, 3, 5, 7, 9, 11} (5.55)
which has cycle decomposition (check!)
(3 8) · (1 12 11 9 5 4 6 2 10 7) (5.56)
Now consider the reverse shuffle r that simply orders the cards backwards. In general,
for a deck D(2n) with n ≡ 2 mod 4, Diaconis et al. show that r and m generate the entire
symmetric group. However, for a pack of 12 cards r and m generate the Mathieu group
M12 . It turns out to have order
|M12 | = 2^6 · 3^3 · 5 · 11 = 95040                                   (5.57)
Compare this with the order of S12 :
12! = 2^10 · 3^5 · 5^2 · 7 · 11 = 479001600                           (5.58)
So with the uniform probability distribution on S12 , the probability of finding a Mathieu
permutation is 1/5040 ∼ 2 × 10^{−4} .
We mention some final loosely related facts:

1. There are indications that the Mathieu groups have some intriguing relations to string
theory, conformal field theory, and K3 surfaces.

2. In the theory of L∞ algebras and associated topics, which are closely related to string
field theory one encounters the concept of the k-shuffle...
FILL IN.

Exercise Cycle structure for the Mongean shuffle


Write the cycle structure for the Mongean shuffle of a deck with 52 cards. How many
Mongean shuffles of such a deck will restore the original order?

6. Generators and relations

The presentation (5.24) of the symmetric group is an example of presenting a group by


generators and relations.

Definition 6.1 A subset S ⊂ G is a generating set for a group if every element g ∈ G can
be written as a “word” or product of elements of S. That is any element g ∈ G can be
written in the form
g = si1 · · · sir (6.1)

where, for each 1 ≤ k ≤ r we have sik ∈ S.


Finitely generated means that the generating set S is finite, that is, there is a finite
list of elements {s1 , . . . sn } so that all elements of the group can be obtained by taking
products – “words” – in the “letters” drawn from S. For example, the symmetric group
is finitely generated by the transpositions. Typical Lie groups such as SU (n) or SO(n, κ)
(over κ = R, C) are not finitely generated.
The relations are then equalities between different words such that any two equivalent
words in G can be obtained by successively applying the relations. 30
In general if we have a finitely generated group we write

G = ⟨g1 , . . . , gn | R1 , · · · , Rr ⟩                              (6.2)

where Ri are words in the letters of S which will be set to 1. ALL other relations,
that is, all other identities of the form W = 1 are supposed to be consequences of these
relations.
Remark: It is convenient to exclude the unit 1 from S. When we write our words it
is understood that we can raise a generator s to any integer power sn where, s0 = 1 and,
if n < 0, this means (s−1 )|n| . Alternatively, we can, for each generator s introduce another
30
See Jacobsen, Basic Algebra I, sec. 1.11 for a more precise definition.

generator t, which will play the role of s−1 and then impose another relation st = ts = 1G .
A generating set that contains s−1 for every generator s is said to be symmetric.

Example 2.1: If S consists of one element a then F (S) ≅ Z. The isomorphism is given
by mapping n ∈ Z to the word a^n .

Example 2.2: The most general group with one generator and one relation must be of
the form:
⟨a | a^N = 1⟩                                                         (6.3)

where N is an integer and by replacing a → a−1 we can assume it is a positive integer.


You will recognize this as the cyclic group, isomorphic to ZN and µN .

Example 2.3: Free groups. If we impose no relations on the generating set S then we
obtain what is known as the free group on S, denoted F (S). If S consists of one element
then we just get Z, as above. However, things are completely different if S consists of two
elements a, b. Then F (S) is very complicated. A typical element looks like one of

a^{n1} b^{m1} · · · a^{nk}
a^{n1} b^{m1} · · · b^{mk}
                                                                      (6.4)
b^{n1} a^{m1} · · · a^{nk}
b^{n1} a^{m1} · · · b^{mk}

where ni , mi are nonzero integers (positive or negative). Three nice general results on free
groups are:

1. Two free groups F (S1 ) and F (S2 ) are isomorphic iff S1 and S2 have the same cardinality.

2. Nielsen-Schreier theorem: Any subgroup of a free group is free.

3. Every group has a presentation in terms of generators and relations. For the group
G we can consider the free group F (G) with S = G as a set. There is then a
natural homomorphism ϕ : F (G) → G where we take a word in elements of G and
map concatenation of letters to group multiplication in G. As we will see in Section
*** below, the kernel of the homomorphism is a normal subgroup K(G) so that
G∼= F (G)/K(G) and K(G) are the relations in this presentation. This presentation
can be incredibly inefficient and useless. (Think, for example, of Lie groups.)

Combinatorial group theorists use the notion of a Cayley graph to illustrate groups
presented by generators and relations. Assuming that 1 ∉ S the Cayley graph is a graph
whose vertices correspond to all group elements in G and the oriented edges are drawn
between g1 and g2 if there is an s ∈ S with g2 = g1 s. We label the edge by s. (If S is
symmetric we can identify this edge with the edge from g2 to g1 labeled by s−1 .) For the
free group on two elements this generates the graph shown in Figure 3.

Figure 3: The Cayley graph for the free group on 2 generators a and b.

Example 2.4: Coxeter groups: Let mij be an n × n symmetric matrix whose entries are
positive integers or ∞, such that mii = 1, 1 ≤ i ≤ n, and mij ≥ 2 or mij = ∞ for i 6= j.
Then a Coxeter group is the group with generators and relations:

⟨s1 , . . . , sn | ∀i, j : (si sj )^{mij} = 1⟩                        (6.5)

where, if mij = ∞ we interpret this to mean there is no relation.


Note that since mii = 1 we have

s_i^2 = 1                                                             (6.6)

Quite generally, a group element that squares to 1 is called an involution. So all the
generators of a Coxeter group are involutions. It then follows that if mij = 2 then si and
sj commute. If mij = 3 then the relation can also be written:

si sj si = sj si sj (6.7)

A theorem of Coxeter’s from the 1930’s gives a classification of the finite Coxeter
groups. 31 Coxeter found it useful to describe these groups by a diagrammatic notation:
We draw a graph whose vertices correspond to the generators si . We draw an edge between
vertices i and j if mij ≥ 3. By convention the edges are labeled by mij and if mij = 3 then
the standard convention is to omit the label.
It turns out that the finite Coxeter groups can be classified. The corresponding Coxeter
diagrams are shown in Figure 4.
31
For a quick summary see the expository note by D. Allcock at
https://web.ma.utexas.edu/users/allcock/expos/reflec-classification.pdf.

Figure 4: Coxeter’s list of finite Coxeter groups. They are finite groups of reflections in some
Euclidean space.

The finite Coxeter groups turn out to be isomorphic to concrete groups of reflections
in some Euclidean space. That is, finite subgroups of O(N ) for some N . That is, there is
some vector space RN and collection of vectors vi ∈ RN with inner products
v_i · v_j = −2 cos(π / m_{i,j})                                       (6.8)
so that the group generated by reflections in the plane orthogonal to the vectors vi :
P_{v_i} : v 7→ v − (2 v · v_i / v_i · v_i) v_i                        (6.9)

is a finite group isomorphic to the Coxeter group with matrix mi,j . (Note that since
mi,i = 1 we have v_i^2 = 2 and P_{v_i}(v) = v − (v · v_i ) v_i .)
Note that, if Pv is the Euclidean reflection in the plane orthogonal to v then Pv1 ◦ Pv2
is just rotation in the plane spanned by v1 , v2 by an angle 2θ where the angle between
v1 and v2 is θ. To prove this, note that Pv1 ◦ Pv2 clearly leaves all vectors in the plane
orthogonal to v1 , v2 fixed. Now represent vectors in a 2-dimensional Euclidean plane by
complex numbers, but view C as a real vector space. WLOG take v = eiθ . Then Pv is the
transformation:
Pv : z 7→ −e2iθ z̄ (6.10)
Note that this is a linear transformation of real vector spaces: Pv (a1 z1 +a2 z2 ) = a1 Pv (z1 )+
a2 Pv (z2 ) if a1 , a2 ∈ R. To check this formula note that if z = eiθ then Pv (z) = −z and if
z = ieiθ is in the orthogonal hyperplane to v then Pv (z) = z.
Now if va = eiθa , a = 1, 2, it is an easy matter to compute:

Pv1 ◦ Pv2 : z 7→ −e2iθ2 z̄


7→ −e2iθ1 −e2iθ2 z̄ (6.11)
= e2i(θ1 −θ2 ) z
So: The product of reflections in the hyper-planes orthogonal to two vectors at an angle θ
is a rotation by an angle 2θ in the plane spanned by the two vectors.
We will meet some of these groups again later as Weyl groups of simple Lie groups.
We have, in fact, already met two of these groups! The case An turns out to be isomorphic

to the symmetric group Sn+1 . 32 In this case we have seen that the elementary generators
σi = (i, i + 1), 1 ≤ i ≤ n indeed satisfy the Coxeter relations:

σ_i^2 = 1
(σ_i σ_{i+1})^3 = 1        1 ≤ i ≤ n − 1                              (6.12)
(σ_i σ_j)^2 = 1            |i − j| > 1

Now consider the standard basis ei for Rn+1 , 1 ≤ i ≤ n + 1 and consider the vectors:

αi = ei − ei+1 (6.13)

which have the inner products: α_i^2 = 2 and α_i · α_{i±1} = −1 (so they are at angle 2π/3) and
all other inner products vanish. This is summarized in the matrix:

αi · αj = Cij = 2δi,j − δi,j+1 − δi,j−1 (6.14)

Then the map si → Pαi is an isomorphism of the Coxeter group An with a subgroup of
O(n + 1). Moreover, one computes that

P_{α_i}(e_j) = { e_j          j ≠ i, i + 1
               { e_{i+1}      j = i                                   (6.15)
               { e_i          j = i + 1
i

So, referring to equation (5.36) we see that this is just the permutation action of σi on
the standard basis of Rn+1 . This makes clear that the Coxeter group is isomorphic to the
symmetric group Sn+1 . ♣A presentation of the Monster in terms of generators and relations is known. (Atlas) Give it here? ♣
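Before turning to some general remarks, here is a small numerical sketch (in Python; the choice n = 4 and the helper names are ours) of the reflection representation just described: it checks that the reflections P_{α_i} satisfy the Coxeter relations (6.12) and permute the standard basis as in (6.15).

# Reflection representation of the Coxeter group A_n inside O(n+1).
import numpy as np

n = 4                                          # A_4, which should be isomorphic to S_5
E = np.eye(n + 1)
alpha = [E[i] - E[i + 1] for i in range(n)]    # simple roots alpha_i = e_i - e_{i+1}

def reflection(v):
    """P_v : x -> x - 2 (x·v)/(v·v) v."""
    return np.eye(len(v)) - 2 * np.outer(v, v) / (v @ v)

P = [reflection(a) for a in alpha]
idm = np.eye(n + 1)

# Coxeter relations: P_i^2 = 1, (P_i P_{i+1})^3 = 1, (P_i P_j)^2 = 1 for |i-j| > 1.
for i in range(n):
    assert np.allclose(P[i] @ P[i], idm)
    for j in range(n):
        m = 1 if i == j else (3 if abs(i - j) == 1 else 2)
        assert np.allclose(np.linalg.matrix_power(P[i] @ P[j], m), idm)

# P_{alpha_i} acts on the standard basis as the transposition (i, i+1), cf. (6.15).
assert np.allclose(P[0] @ E[0], E[1]) and np.allclose(P[0] @ E[1], E[0])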
Remarks

1. One very practical use of having a group presented in terms of generators and relations
is in the construction of homomorphisms. If one is constructing a homomorphism
φ : G1 → G2 , then it suffices to say what elements the generators map to. That is, if
gi are generators of G1 we can fully specify a homomorphism by choosing elements
gi0 ∈ G2 (not necessarily generators) and declaring

φ(gi ) = gi0 . (6.16)

However, we cannot choose the gi0 arbitrarily. Rather, the gi0 must satisfy the same
relations as the gi . This puts useful constraints on what homomorphisms you can
write down. For example, using this idea you can prove that there is no nontrivial
homomorphism φ : ZN → Z.
32
The notation here is standard but exceedingly unfortunate and confusing!!! Here An does NOT refer to
the alternating group! It refers to Cartan’s classification of simple Lie groups and the Coxeter group with
this label is in fact isomorphic to Sn+1 .

2. In general it is hard to say much about a group given a presentation in terms of
generators and relations. For example, it is not even obvious, in general, if the
group is the trivial group! This is part of the famous “word problem for groups.”
There are finitely presented groups where the problem of saying whether two words
represent the same element is undecidable! 33 However, for many important finitely
presented groups the word problem can be solved. Indeed, the word problem was first
formulated by Max Dehn in 1911 and solved by him for the surface groups discussed
below. ♣It would be more effective here to give an example of a set of generators and relations that is actually isomorphic to the trivial group - but not obviously so. ♣

3. Nevertheless, there are four Tietze transformations (adding/removing a relation,
adding/removing a generator) which can transform one presentation of a group to a
different presentation of an isomorphic group. It is a theorem [REF!] that any two
presentations can be related by a finite sequence of Tietze transformations. How is
this compatible with the previous remark? The point is that the number f (n) of
such transformations needed to transform a presentation of the trivial group with n
relations into the trivial presentation grows faster than any recursive function of n.

4. It turns out that the Coxeter groups Bn = Cn are isomorphic to the group
of centrally symmetric permutations W (Bn ) ⊂ S2n discussed in card-shuffling. The Coxeter
diagrams are very similar to the Dynkin diagrams that are used to label finite dimen-
sional simple Lie algebras over the complex numbers except that Hn and In do not
correspond to Lie algebras.

Exercise Homomorphisms involving ZN and Z


a.) Write a nontrivial homomorphism µ : Z → ZN .
b.) Show that there is no nontrivial homomorphism µ : ZN → Z. 34

c.) Find the most general homomorphism µ : Z → Z.


d.) Find the most general homomorphism µ : ZN → ZN .

Exercise One generator and many relations


Suppose a group has a single generator g but many relations. Describe this group. 35

33
The Wikipedia article on “Word problem for groups,” is useful.
34
Answer : Since ZN can be generated by one element, say 1̄, it suffices to say what the value of φ(1̄)
is. The trivial homomorphism takes the generator to zero: φ(1̄) = 0 ∈ Z and hence takes every element
to zero. On the other hand, if φ(1̄) = k is a nonzero integer, then N k = N φ(1̄) = φ(N 1̄) = φ(0̄) = 0, a
contradiction. So there is no nontrivial homomorphism.
35
Answer : Let the generator be g. The relations must be of the form g k1 = 1, . . . , g kn = 1 for some
integers k1 , . . . , kn , and WLOG we can assume they are positive. Then it is not hard to see, using the
Chinese remainder theorem (see below) that the group is isomorphic to ZN where N = gcd(k1 , ..., kn ).

Exercise Simple Roots Of SU (n + 1)
a.) Verify equation (6.14). The matrix Cij is known as a Cartan matrix of SU (n + 1).
b.) Show that the vectors αi are all orthogonal to the all-one vector: v = (1, . . . , 1)
and that they span the orthogonal complement of v.
c.) Show that the permutation representation of Sn+1 separately preserves v and the
orthogonal complement of v. Thus, Rn+1 gives what is known as a reducible representation
of Sn+1 .
d.) Compute the action of Pαi on αj . Give the matrix representation relative to the
ordered basis {α1 , . . . , αn }. 36

Exercise Show that

⟨a, b | a^3 = 1, b^2 = 1, abab = 1⟩                                   (6.17)

is a presentation of S3 .

Exercise
Consider the group with presentation:

⟨T, S | (ST )^3 = 1, S^2 = 1⟩                                         (6.18)

Is this group finite or infinite?


This group plays a very important role in string theory.

Exercise Bounds on the minimal number of generators of a finite group


Suppose we have a set of finite groups G1 , G2 , G3 , . . . with a minimal set of generators
S1 ⊂ S2 ⊂ · · · of cardinality |Sk | = k. Show that 2|Gk | ≤ |Gk+1 | and hence as k → ∞ the
order |Gk | must grow at least as fast as 2k .
36 Answer : The matrix is a diagonal matrix of 1's except on the 3 × 3 block for rows and columns i − 1, i, i + 1,
where it looks like

( 1   0   0 )
( 1  −1   1 )
( 0   0   1 )

Remark: Denote the smallest cardinality of a set of generators of G by d(G). If G is
a finite and transitive permutation subgroup of Sn (meaning it acts transitively on some
set X) then there is a constant C such that
d(G) ≤ C n / √(log n)                                                 (6.19)
and if G is a primitive permutation group, meaning that it acts on a set X such that it
does not preserve any nontrivial disjoint decomposition of X, then there is a constant C
so that if n ≥ 3:
d(G) ≤ C log n / √(log log n)                                         (6.20)
Moreover, these results are asymptotically the best possible. For a review of such results
see. 37

Exercise Generators And Relations For Products Of Groups


Suppose you are given groups G1 and G2 in terms of generators and relations. Write
a set of generators and relations for the product group G1 × G2 . 38

6.1 Example Of Generators And Relations: Fundamental Groups In Topology


Presentations in terms of generators and relations are very common when discussing the
fundamental group of a topological space X.
This subsection assumes some knowledge of topological spaces and the idea of a ho-
motopy. Without trying to be too precise we choose a basepoint x0 ∈ X and let π1 (X, x0 )
be the set of closed paths in X, beginning and ending at x0 where we identify two paths if
they can be continuously deformed into each other. We can define a group multiplication
by concatenation of paths. Inverses exist since we can run paths backwards. The following
subsubsection contains more precise definitions. ♣Need to fix pictures and change p0 to x0 . ♣
6.1.1 The Fundamental Group Of A Topological Space
Choose a point x0 ∈ X. The fundamental group π1 (X, x0 ) based at x0 is, as a set, the set
of homotopy classes of closed curves.
That is we consider continuous maps:

f : ([0, 1], {0, 1}) → (X, {x0 }) (6.21)

These define paths in X with beginning and ending point fixed at x0 . The path must be
traveled in time 1.
37 F. Menegazzo, “The Number of Generators of a Finite Group,” Irish Math. Soc. Bulletin 50 (2003), 117–128.
38 Answer : If G1 = ⟨gi | Ri ⟩ and G2 = ⟨ha | Sa ⟩ then G1 × G2 = ⟨gi , ha | Ri , Sa , gi ha gi^{-1} ha^{-1} = 1⟩.

Figure 5: Two loops f, g with basepoint at x0 .

Figure 6: The concatenation of the loops f ⋆ g. Note that the “later” loop is written on the right.
This is generally a more convenient convention when working with homotopy and monodromy. In
order for f ⋆ g to be a map from [0, 1] into X we should run each of the individual loops at “twice
the speed” so that at time t = 1/2 the loop returns to x0 . However, in homotoping f ⋆ g there is
no reason why the point at t = 1/2 has to stay at x0 .

We say that two such paths f0 , f1 are homotopic if there is a continuous map

F : [0, 1] × [0, 1] → X (6.22)

such that

1. F (0, t) = f0 (t) and F (1, t) = f1 (t)

2. F (s, 0) = F (s, 1) = x0

If we define fs (t) := F (s, t) and consider fs (t) as a path in t at fixed s then, as we vary
s we are describing a path of paths.
Now, homotopy of paths in X is an equivalence relation. 39 We denote by [f ] the
equivalence class of a path f and we denote the set of such equivalence class by π1 (X, x0 ).
We will see that this set has a natural and beautiful group structure.
39
See section 1.1 above for this notion.

Figure 7: The homotopy demonstrating that loop concatenation is an associative multiplication
on homotopy equivalence classes of closed loops. The blue line is s = 4t − 1 and the red line is
s = 4t − 2.

Figure 8: The homotopy demonstrating that the loop g(t) = f (1 − t) provides a representative for
the inverse of [f (t)].

We can define a group structure on π1 (X, x0 ) by concatenating curves as in Figure 6


and rescaling the time variable so that it runs from 0 to 1. In equations we have
(f1 ⋆ f2 )(t) := { f1 (2t)          0 ≤ t ≤ 1/2
                 { f2 (2t − 1)      1/2 ≤ t ≤ 1                       (6.23)

Remarks

1. Note that we are composing successive paths on the right. This is slightly nonstandard
but a nice convention when working with monodromy and path ordered exponentials
of gauge fields - one of the main physical applications.

2. Note well that (f1 ⋆ f2 ) ⋆ f3 is NOT the same path as f1 ⋆ (f2 ⋆ f3 ). This observation
ultimately leads to the notion of A∞ spaces.

3. For the moment we simply notice that if we mod out by homotopy then we have a
well-defined product on homotopy classes in π1 (X, x0 )

[f1 ] · [f2 ] := [f1 ⋆ f2 ]                                            (6.24)


and the virtue of passing to homotopy classes is that now the product (6.24) is in
fact associative. The proof is in Figure 7. Written out in excruciating detail the
homotopy is

F (s, t) = { f1 ( 4t/(s + 1) )                        0 ≤ t ≤ (s + 1)/4
           { f2 ( 4t − (s + 1) )                      (s + 1)/4 ≤ t ≤ (s + 2)/4        (6.25)
           { f3 ( (4/(2 − s)) (t − (s + 2)/4) )       (s + 2)/4 ≤ t ≤ 1
3 2−s 4 4

4. Since we have an associative product on π1 (X, x0 ) we are now ready to define a


group structure. The identity element is clearly given by the (homotopy class of the)
constant loop: f (t) = x0 . If a homotopy class is represented by a loop f (t) then
the inverse is represented by running the loop backwards: g(t) := f (1 − t). The two
are joined at t = 1/2, and since this is in the open interval (0, 1) the image can be
deformed away from x0 . See Figure 8. In equations, there is a homotopy of f ⋆ g
with the constant loop given by

F (s, t) = { f (2t)          t ≤ (1 − s)/2
           { f (1 − s)       (1 − s)/2 ≤ t ≤ (1 + s)/2                (6.26)
           { f (2 − 2t)      (1 + s)/2 ≤ t ≤ 1

Thus, with the group operation defined by concatenation in the sense of


(6.24) the set of homotopy classes π1 (X, x0 ) is a group. It is known as the
fundamental group based at x0 .

5. A connected space such that π1 (X, x0 ) is the trivial group is called simply connected.

A Basic Example: The Fundamental Group Of The Circle:


The first and most basic example of a nontrivial fundamental group is the fundamental
group of the circle. It should be intuitively clear that

π1 (S^1 , x0 ) ≅ Z                                                    (6.27)

which just measures the number of times the path winds around the circle. The sign of
the integer takes into account winding clockwise vs. counterclockwise.
Let us amplify a little on how one proves this basic fact. We will just give the main
idea. For a thorough and careful proof see A. Hatcher’s book on Algebraic Topology, or
Section 13.2 of

http://www.physics.rutgers.edu/∼gmoore/511Fall2014/Physics511-2014-Ch2-Topology.pdf
We have a standard map

p : R → S^1                                                           (6.28)

given by p(x) = e^{2πix} . Note that the inverse image of any phase e^{2πis} with s ∈ R is the
set of real numbers s + Z. So p is the map that identifies the orbits of the Z-action by
translation on R with S^1 . That is, S^1 ≅ R/Z. Now, suppose we have a map f̄ : [0, 1] → S^1 .
We claim that there is a map f : [0, 1] → R so that f¯ “factors through p,” meaning that
there is a map f : [0, 1] → R such that:

f¯ = p ◦ f (6.29)

The problem of finding such a map f is nicely expressed in terms of diagrams. One is
trying to complete the following diagram to make it a commutative diagram by finding a
suitable map f to use on the dashed line in:

              R
            ↗   ↘
          f       p
        ↗           ↘
  [0, 1]  ───f̄───→  S^1                                               (6.30)

In other words, given f¯ one is trying to find a map f : [0, 1] → R so that

e^{2πi f(x)} = f̄(x)                                                   (6.31)

So, f (x) is a logarithm, but a logarithm is not single-valued. Nevertheless, in sufficiently


small open sets of [0, 1] (small enough that the image of f¯(x) does not “wrap” around the
circle) one can choose an unambiguous logarithm. Let us fix (WLOG) f¯(0) = 1. Then f (0)
can be any integer, n0 . Once we choose that integer the branch of the logarithm is fixed in
some small open set [0, ). Now we continue choosing open sets along the interval so that
we can choose an unambiguous logarithm and we fix branches successively as we move from
one open set to the next along the positive direction. The net result is an unambiguous
map f : [0, 1] → R satisfying (6.31). Now if f¯(1) = 1, then f (1) = n1 , with n1 ∈ Z. The
integer n1 − n0 only depends on f̄, and not on the particular choice n0 , and is the winding
number of the map. Moreover, the winding number is continuous as a function of f̄ and
hence only depends on the homotopy class.
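The lifting argument becomes an algorithm once the loop is sampled finely enough that successive points never jump by half the circle: one simply adds up the locally unambiguous phase increments. A minimal Python sketch (under that sampling assumption; the example loop winds three times):

# Winding number of a closed loop in S^1, computed by lifting through p(x) = e^{2 pi i x}.
import cmath, math

def winding_number(samples):
    """samples: points on the unit circle tracing a closed loop, finely spaced."""
    total = 0.0
    for z0, z1 in zip(samples, samples[1:] + samples[:1]):
        total += cmath.phase(z1 / z0)    # the locally unambiguous branch of the log
    return round(total / (2 * math.pi))

loop = [cmath.exp(2j * math.pi * (3 * t / 1000)) for t in range(1000)]   # winds 3 times
print(winding_number(loop))   # 3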

1. It is not hard to prove that

π1 (X × Y, (x0 , y0 )) ≅ π1 (X, x0 ) × π1 (Y, y0 )                    (6.32)

So the fundamental group of the n-dimensional torus is isomorphic to the n-dimensional


lattice Zn .

2. If F : X → Z is a continuous map of topological spaces and takes x0 ∈ X to z0 ∈ Z


then we can define F∗ : π1 (X, x0 ) → π1 (Z, z0 ) simply by F∗ [f ] := [F ◦ f ]. This can be
shown to be a group homomorphism. In particular, if F is a homotopy equivalence,
then it is a group isomorphism.

Figure 9: Illustrating the Seifert-VanKampen theorem. The green curve has a homotopy class
in U +− that is one of the generators of π1 (U +− ). Now it must separately be a word Wi+ in the
generators of π1 (U + ) and Wi− in the generators of π1 (U − ) so in π1 (X) there must be a relation of
the form Wi+ = Wi− .

3. In algebraic topology books a major result which is proved is the Seifert-van Kam-
pen theorem. This is an excellent illustration of defining groups by generators and
relations. The theorem can be useful because it allows one to compute π1 (X, x0 ) by
breaking up X into simpler pieces. Specifically, suppose that X = U + ∪ U − is a
union of two open path-connected subsets and that U +− := U + ∩ U − is also path-
connected and contains x0 . See Figure 9. Now suppose we know presentations of the
fundamental groups of the pieces U + , U − , U +− in terms of generators and relations:

π1 (U^+ , x0 ) ≅ ⟨g_1^+ , . . . , g_{n^+}^+ | R_1^+ , . . . , R_{m^+}^+ ⟩
π1 (U^− , x0 ) ≅ ⟨g_1^− , . . . , g_{n^−}^− | R_1^− , . . . , R_{m^−}^− ⟩                    (6.33)
π1 (U^{+−} , x0 ) ≅ ⟨g_1^{+−} , . . . , g_{n^{+−}}^{+−} | R_1^{+−} , . . . , R_{m^{+−}}^{+−} ⟩

Then the recipe for computing π1 (X, x0 ) is this: Denote the injection ι+ : U +− → U +
and ι− : U +− → U − . Then the generators of π1 (U +− , x0 ) push forward to words in
gi+ or gi− , respectively:
ι^+_∗ (g_i^{+−}) := W_i^+        i = 1, . . . , n^{+−}
                                                                      (6.34)
ι^−_∗ (g_i^{+−}) := W_i^−        i = 1, . . . , n^{+−}

Finally, we have the presentation:

π1 (X, x0 ) ≅ ⟨g_1^+ , . . . , g_{n^+}^+ , g_1^− , . . . , g_{n^−}^− | R_α ⟩                 (6.35)

where the relations Rα include the old relations


R_1^+ , . . . , R_{m^+}^+ , R_1^− , . . . , R_{m^−}^−                 (6.36)

and a set of new relations:

W_1^+ (W_1^−)^{-1} , . . . , W_{n^{+−}}^+ (W_{n^{+−}}^−)^{-1}         (6.37)

It is obvious that these are relations on the generators. What is not obvious is that
these are the only ones. Note that in the final presentation the generators gi+− and
the relations Ri+− have dropped out of the description.

Figure 10: The house that Bing built. Taken from M. Freedman and T. Tam Nguyen-Pham,
“Non-Separating Immersions Of Spheres and Bing Houses,” which describes nice mathematical
properties of this house.

Exercise
Show that if X = S n with n > 1 then π1 (X, x0 ) is the trivial group.

Exercise
Does the fundamental group depend on a choice of basepoint x0 ?

Exercise
What is the fundamental group of Serin Physics Laboratory?

Exercise The house that Bing built
Show that the house in Figure 10 can be shrink-wrapped with a single balloon so that
the complement of the balloon in R3 is connected and simply connected.

Figure 11: Right: Cutting a torus along the A and B cycles the surface falls apart into a rectangle,
shown on the left. Conversely, gluing the sides of the rectangle together produces a torus with
distinguished closed curves A, B.

Figure 12: Illustrating the Seifert-VanKampen theorem for the torus: The “tubular neighborhood”
- the green region - of the cutting curves, shown in (b) is homotopy equivalent to a one-point union
of two circles. The latter space has a π1 which is a free group on two generators. The boundary
of the green region contracts into the remainder of the surface - which can be deformed to a disk.
Therefore the group commutator [a, b] = 1.

Figure 13: A collection of closed paths at x0 which generate the fundamental group of a two-
dimensional surface with two handles and three (green) holes.

6.1.2 Surface Groups: Compact Two-Dimensional Surfaces


The fundamental groups of two-dimensional surfaces, known as surface groups and braid
groups turn out to provide a very rich set of examples of groups defined by generators and
relations.
The simplest example of a nontrivial surface group is the torus. Let a, b denote the
homotopy classes of the cycles A, B shown in Figure 11. One can convince oneself that
these generate the fundamental group: Every closed curve based at x0 can be homotoped
to a word in a±1 and b±1 . Now, if we cut the torus along the cycles the surface falls apart
into a rectangle as shown in Figure 11. The edge of the rectangle represents the class
aba−1 b−1 . A slightly different way of thinking about this is described in Figure 12.

Definition: In general, in group theory an expression of the form g1 g2 g1−1 g2−1 is known as
a group commutator and is sometimes denoted [g1 , g2 ]. It should not be confused with the
commutator of matrices [A1 , A2 ] = A1 A2 − A2 A1 .

Returning to the fundamental group of the torus, the group commutator [a, b] can
be contracted inside the rectangle to a point. Therefore, the generators a, b satisfy the
relation:
aba−1 b−1 = 1 (6.38)

so this means
ab = ba (6.39)
In fact, this is the only relation and therefore:

π1 (T^2 , x0 ) ≅ Z ⊕ Z.                                               (6.40)

Figure 14: Illustrating the Seifert-VanKampen theorem for a genus two surface: The “tubular
neighborhood” - the green region - of the cutting curves, shown in (b) is homotopy equivalent to
a one-point union of 4 circles. The latter space has a π1 which is a free group on four generators
which we can call a1 , b1 , a2 , b2 . The boundary of the green region is a single circle homotopic to
[a1 , b1 ][a2 , b2 ]. But it contracts into the remainder of the surface - which can be deformed to a disk.
Therefore by the Seifert-van Kampen theorem the presentation of π1 of the genus two surface has
a single relation [a1 , b1 ][a2 , b2 ] = 1.

The above ideas generalize nicely. Let us consider the case of a genus 2, or 2-handled,
surface shown in Figure 14. The fundamental group can be presented as a group with four
generators a1 , b1 , a2 , b2 and one relation:

[a1 , b1 ][a2 , b2 ] = 1 (6.41)

Now let us consider a more complicated surface, perhaps with punctures as shown in
Figure 15. By cutting along the paths shown there the surface unfolds to a presentation
by gluing as in Figure 16:
From these kinds of constructions one can prove 40 that the fundamental group of an
orientable surface with g handles and p punctures will be
π1 (S, x0 ) = ⟨ a_i , b_i , c_s | ∏_{i=1}^{g} [a_i , b_i ] ∏_{s=1}^{p} c_s = 1 ⟩             (6.42)

40
See, for example, W. Massey, Introduction to Algebraic Topology, Springer GTM

Figure 15: A collection of closed paths at x0 which generate the fundamental group of a two-
dimensional surface with two handles and three (green) holes.

There is only one relation so this is very close to a free group! In fact, for p ≥ 1 we can
solve for one generator cs in terms of the rest so the group is just a free group on 2g + p − 1
generators. When there are no punctures the group is not a free group. Groups of the
form (6.42) are sometimes called surface groups.
As mentioned above, a flat connection amounts to a representation of this group - so
one is searching for matrices Ai , Bi , Cs such that
∏_i (A_i B_i A_i^{-1} B_i^{-1}) ∏_s C_s = 1                           (6.43)
i s

Exercise Fundamental group of the Klein bottle


A very interesting unorientable surface is the Klein bottle. Its fundamental group has
two natural presentations in terms of generators and relations. One is

⟨a, b | a^2 = b^2 ⟩                                                   (6.44)

and the other is


⟨g1 , g2 | g1 g2 g1 g2^{-1} = 1⟩                                      (6.45)
Show that these two presentations are equivalent.

Figure 16: When the directed edges are identified according to their labels the above surface
reproduces the genus two surface with three punctures. Since the disk is simply connected we
derive one relation on the curves shown here.

Exercise
Use the Seifert-van Kampen theorem to relate the fundamental group of a torus to
that of a torus with a disk cut out.

6.1.3 Braid Groups And Anyons


Let us modify Figure 2 and Figure 1 to include an under-crossing and overcrossing of the
strands. So now we are including more information - the topological configuration of the
strands in three dimensions. In an intuitive sense, which we will not make precise here we

Figure 17: Pictorial illustration of the generator σi of the braid group Bn .

Figure 18: Pictorial illustration of the Yang-Baxter relation.

obtain a group called the nth braid group. It is generated by the overcrossing σ̃i of strings
(i, i + 1), for 1 ≤ i ≤ n − 1 and may be pictured as in Figure 17. Note that σ̃i−1 is the
undercrossing.
Now one verifies the relations

σ̃i σ̃j = σ̃j σ̃i |i − j| ≥ 2 (6.46)

and
σ̃i σ̃i+1 σ̃i = σ̃i+1 σ̃i σ̃i+1 (6.47)
where the relation (6.47) is illustrated in Figure 18.
The braid group Bn may be defined as the group generated by σ̃i subject to the relations
(6.46)(6.47):

Bn := ⟨σ̃1 , . . . , σ̃_{n−1} | σ̃_i σ̃_j σ̃_i^{-1} σ̃_j^{-1} = 1, |i − j| ≥ 2; σ̃_i σ̃_{i+1} σ̃_i = σ̃_{i+1} σ̃_i σ̃_{i+1} ⟩        (6.48)

The braid group Bn may also be defined as the fundamental group of the space of
configurations of n unordered points on the disk D. We first consider the set:

{(x1 , . . . , xn ) | xi ∈ D, xi ≠ xj for i ≠ j}                      (6.49)

Then we observe that there is a group action of Sn on this set. Note that this set is not
simply connected: For example if we let x1 loop around x2 holding all other xi fixed it
should be intuitively clear that the loop cannot be deformed to the trivial loop. That is
even more clear if you view the looping process as taking place in time on particles in a
plane.
Now we consider the space of orbits under this group action:

Cn := {(x1 , . . . , xn ) | xi ∈ D, xi ≠ xj for i ≠ j}/Sn             (6.50)

There are new nontrivial loops here where, for example, xi and xj exchange places, all
other xk staying fixed. ♣Since we must quotient by Sn this needs to be moved to the section on group actions on spaces. ♣
Note that the “only” difference from the presentation of the symmetric group is that
we do not put any relation like (σ̃_i )^2 = 1. Indeed, Bn is of infinite order because σ̃_i^n keeps
getting more and more twisted as n → ∞.

Exercise Homomorphisms Between Braid And Symmetric Groups


a.) Define a homomorphism µ : Bn → Sn .
b.) Can you define a homomorphism s : Sn → Bn so that µ ◦ s is the identity
transformation?

Remarks

1. In the theory of integrable systems the relation (6.47) is closely related to the “Yang-
Baxter relation.” It plays a fundamental role in integrable models of 2D statistical
mechanics and field theory.

2. One interesting application of permutation groups to physics is in the quantum the-


ory of identical particles. It was a major step in the development of quantum theory
when Einstein and Bose realized that a system of n identical kinds of particles (pho-
tons, for example, or atomic nuclei of the same isotope) are in fact indistinguishable. 41
In mathematical terms, there is a group action of Sn on a set of n indistinguish-
able particles leaving the physical system “the same.” In quantum mechanics this
translates into the statement that the Hilbert space of a system of n indistinguishable
particles should be a representation of (a central extension of) Sn . There are many
different representations of Sn (we have already encountered three different ones).
Most of them are higher dimensional. Particles transforming in higher-dimensional
representations are said to satisfy “parastatistics.” (This idea goes back to B. Green
in the 1950’s and Messiah and Greenberg in the 1960’s.) However, remarkably, in
relativistically invariant theories in spacetimes of dimension larger than 3 particles
are either bosons or fermions. This is related to the classification of the projective rep-
resentations of SO(d, 1), where d is the number of spatial dimensions, for relativistic
systems and to representations of SO(d) for nonrelativistic systems. (We will discuss
projective representations in section **** below.) Now, when discussing projective
representations the fundamental group of SO(d, 1) and SO(d) becomes important.
In fact π1 (SO(d, 1)) ∼= π1 (SO(d)) for d ≥ 2. However, there is a fundamental differ-
ence between d ≤ 2 and d > 2. The essential point is that the fundamental group
π1 (SO(2)) ∼ = Z is infinite while π1 (SO(d)) ∼
= Z2 for d ≥ 3. A consequence of this,
and other principles of physics is that in 2 + 1 and 1 + 1 dimensions, particles with
“anyonic” statistics can exist. 42 Anyons are defined by the property that, if we just
consider the wavefunction of two identical such particles, Ψ(z1 , z2 ) where z1 , z2 are
points in the plane and then we adiabatically switch their positions using the kind of
braiding that defines σ̃ then

σ̃ · Ψ(z2 , z1 ) = eiθ Ψ(z1 , z2 ) (6.51)

Unlike bosons and fermions, where θ = 0, π mod 2π, respectively, for “anyons” the
phase can be anything - hence the name. There are even physical realizations of
this theoretical prediction in the fractional quantum Hall effect. Moreover, quantum
wavefunctions should transform in representations of the braid group. The law (6.51)
leads to the 1-dimensional representation σ̃ → eiθ but there can also be more interest-
ing “nonabelian representations.” That is, there can be interesting irreducible rep-
resentations of dimension greater than one, and if wavefunctions transform in such
representations there can be nonabelian statistics. The particles should be called
nonabelions. There are some theoretical models of fractional quantum Hall states in
which this takes place. 43 Nonabelions are of potentially great importance because of
their possible use in quantum computation, an observation first made by A. Kitaev.
41
See Chapters 24 and 25 in the book Einstein and the Quantum by A.D. Stone for a nice historical
account of the importance of this discovery in the development of quantum mechanics.
42
The possible existence of anyons was pointed out by Leinaas and Myrheim in 1977. Other early
references are the papers of Goldin, Melnikof and Sharp. The term “anyon” was invented in F. Wilczek,
”Quantum Mechanics of Fractional-Spin Particles,” Physical Review Letters 49 (14): 957-959. For comments
on the early history of these ideas see “The Ancestry of the ‘Anyon’ ” in Physics Today, August 1990, page
90.
43
Important work on the compatibility of nonabelions with spin-statistics theorems was done by Jurg
Fröhlich. Perhaps the first concrete proposal of a physical system with nonabelions is that of G.W. Moore
and N. Read, “Nonabelions in the fractional quantum hall effect,” Nuclear Physics B360, 1991. It was
inspired, in part, by the work of G. Moore and N. Seiberg, “Polynomial Equations For RCFT” and “Classical
and Quantum Conformal Field Theory,” Commun. Math. Phys. 123 (1989) 177, which gave the first
description of a modular tensor category.

Here are some sources for more material about anyons: ♣Need to keep updating and refining this reference list. ♣
1. There are some nice lecture notes by John Preskill, which discuss the potential rela-
tion to quantum computation and quantum information theory: http://www.theory.caltech.edu/˜preski

2. A. Stern, ”Anyons and the quantum Hall effect: A pedagogical review,” Annals of
Physics 323: 204; arXiv:0711.4697v1.

3. A. Lerda, Anyons: Quantum mechanics of particles with fractional statistics Lect.Notes


Phys. M14 (1992) 1-138

4. A. Khare, Fractional Statistics and Quantum Theory,

5. G. Dunne, Self-Dual Chern-Simons Theories.

6. David Tong, “Lectures on the Quantum Hall Effect,” e-Print: arXiv:1606.06687

7. S. Burton, “A Short Guide To Anyons and Modular Functors,” 1610.05384

8. J. Pachos, Introduction To Topological Quantum Computation

9. K. Beer, “From Categories To Anyons: A Travelogue,” 1811.06670

10. E.C. Rowell and Z. Wang, “Mathematics of Topological Quantum Computation,”


Bulletin American Math. Soc., 55, 183

11. Z. Wang, “Topological Quantum Computation,” http://web.math.ucsb.edu/∼zhenghwa/data/course/c

6.1.4 Fundamental Groups Of Three-Dimensional Manifolds


The fundamental groups of surfaces are easy to write down, and surfaces can be classified.
The situation is completely different in dimensions three and larger where the problem is
much harder. Of course, we could take products like S 1 × Σ, where Σ is a two-dimensional
manifold. But many more possibilities arise.
Let us start by considering a circle S 1 ⊂ S 3 . It is probably helpful to think of S 3 as
R^3 with the boundary at infinity identified to a point. Now the tubular neighborhood of
the circle is diffeomorphic to S 1 × D2 and has boundary S 1 × S 1 . Imagine cutting out this
tubular neighborhood - what remains? It is some three-manifold with a single boundary
which is also a torus S 1 × S 1 . If you draw a solid torus you will notice that one of the
homotopically nontrivial loops on the torus becomes homotopically trivial.
FIGURE HERE OF SOLID TORUS AND A-CYCLE.
The other cycle is nontrivial. Thus, it must be that S 3 can be obtained by gluing
together two solid tori (bagels). However, S 3 is simply connected, so the cycle which is
nontrivial in one bagel must become trivial in the other, and vice versa. This suggests that

there are topologically interesting diffeomorphisms of the torus that are used when gluing
together the two bagels: Note that if you glue with the identity transformation you get
S 2 × S 1 , not S 3 .
Indeed a vast number of new constructions of three-manifolds comes from ideas
of “surgery.” An important point is that there are typically many nontrivial diffeomor-
phisms of a surface. Let us see this with the case of a torus.
We consider the torus to be the space of orbits R^2 /(Z × Z) with coordinates (σ^1 , σ^2 ) with
σ^i ∼ σ^i + 1. Then consider the transformation

σ^1 → aσ^1 + bσ^2
σ^2 → cσ^1 + dσ^2        (6.52)

This will be consistent with the identifications σ^i ∼ σ^i + 1 iff a, b, c, d ∈ Z, and will be
invertible if

\begin{pmatrix} a & b \\ c & d \end{pmatrix} ∈ GL(2, Z)        (6.53)

Finally, it will preserve orientation if it is in SL(2, Z), that is, if ad − bc = 1. We claim


this diffeomorphism cannot be smoothly deformed to the identity if the matrix is not the
identity. To see this consider its action on the homotopically nontrivial loops on the torus.
So, we can generalize the above construction by considering an arbitrary knot K ⊂ S 3 .
Again, its tubular neighborhood is topologically just a solid torus, as is the complement -
but the two bagels must be glued together by a nontrivial diffeomorphism.
If we consider a higher genus surface Σ there is similarly an infinite number of nontrivial
self-diffeomorphisms.
The idea of a solid torus can be generalized to the notion of a handlebody: a connected
three-manifold, obtained by thickening a graph in R^3 , with a surface Σ as boundary.
FIGURE
Note that in a genus g handlebody, g of the generators of π1 (Σ) become topologically
trivial and the π1 of the handlebody is just the free group on g generators.
We can now generalize the above construction by taking two genus g handlebodies,
together with a diffeomorphism φ ∈ Diff(Σ) and glue the handlebodies together. The re-
sulting three-manifold is said to have a Heegaard decomposition. This means we can find
an embedded closed oriented surface Σ ⊂ Y so that Y is the gluing of two handlebodies
(bordisms of Σ to the empty set) using a diffeomorphism of Σ. Such a Heegaard decom-
position strongly constrains the fundamental group, thanks to the Seifert-van Kampen
theorem: A set of generators is given by a set of generators of the two handlebodies. The
fundamental group of the handlebody for Σ of genus g is just a free group on g generators.
(The “A-cycles” contract to a point, leaving the “B-cycles” with no relation.) After gluing
to the other handlebody there will be g relations expressing the contractibility of the new
“A-cycles” of the second handlebody.
In fact, any three-manifold has a Heegaard decomposition: By a theorem of Moise, ev-
ery 3-manifold can be triangulated. Take the 1-skeleton and thicken it to get a handlebody;
the closure of the complement is a thickening of the dual 1-skeleton, hence also a handlebody.
This shows that every three-manifold admits a Heegaard decomposition. The trouble is,

there are a lot of nontrivial diffeomorphisms of a surface to itself, and it can be hard to
recognize two equivalent 3-folds constructed from different Heegaard splittings.
Thus, for any three-manifold Y , the group π1 (Y, y0 ) admits a presentation where the number of
generators is equal to the number of relations. The existence of such a presentation is not
possible for a general finitely generated group. So, at least in three dimensions it is not
true that any finitely generated group is the fundamental group of some three-dimensional
manifold.

6.1.5 Fundamental Groups Of Four-Dimensional Manifolds


A basic fact is the following theorem of Markov:

Theorem Any finitely presented group G is the fundamental group of some four-manifold.

Proof : Suppose the group G has presentation:

G ≅ ⟨g1 , . . . , gn | R1 , . . . , Rm ⟩        (6.54)

We aim to produce a four-manifold M4 with fundamental group G. First, consider the free
group on one generator ⟨g⟩ ≅ Z. A good manifold that has this as a fundamental group is
X4 = S^1 × S^3 . Now let us consider the n-fold connected sum M̃4 := X4 # · · · #X4 . Then

π1 (M̃4 ) ≅ ⟨g1 , . . . , gn ⟩        (6.55)

is the free group with generators gi corresponding to the simple loops around the S 1 factor
in each summand (extended to some common basepoint). ♣Need to explain why it is a free group and not a free abelian group! ♣
Now, each relation Rα is a word in the gi and thus can be represented by some closed based loop `α ⊂ M̃4 . We can
take the `α to be nonintersecting, by simple codimension arguments. Now, take a tubular
neighborhood N (`α ) of `α . By our discussion above of the local picture of submanifolds it
is diffeomorphic to N (`α ) ∼ = S 1 × D3 , where D3 is the 3-dimensional ball. The boundary is
thus ∂N (`α ) ∼= S 1 × S 2 . This is also the boundary of D2 × S 2 . So, glue in a copy of D2 × S 2
along the boundary of N (`α ). This procedure is known as surgery. Now the loop S 1 in
S 1 × D3 (which was representing the word Rα ) becomes contractible! Thus it is a relation
on the generators gi in the new manifold. We can choose the tubular neighborhoods around
the different loops `α to be nonintersecting, and hence we can perform surgeries on each
of these loops without interference. If we do this for all the loops we produce our manifold
M4 . By the Seifert-van Kampen theorem it follows that the fundamental group of M4 is
exactly G. ♠
Since finitely presented groups cannot be classified it follows that four-manifolds cannot
be classified, even up to homotopy type.

7. Cosets and conjugacy

7.1 Lagrange Theorem


The reader should refresh her/his memory about equivalence relations - see section 1.1.

Definition 7.1.1: Let H ⊆ G be a subgroup. The set

gH ≡ {gh|h ∈ H} ⊂ G (7.1)

is called a left-coset of H.

Example 1: G = Z, H = 2Z. There are two cosets: H and H + 1.

Example 2: G = S3 , H = {1, (12)} ≅ S2 . Cosets:

1 · H = {1, (12)}
(12) · H = {(12), 1} = {1, (12)} = H
(13) · H = {(13), (123)}
(7.2)
(23) · H = {(23), (132)}
(123) · H = {(123), (13)} = {(13), (123)} = (13) · H
(132) · H = {(132), (23)} = {(23), (132)} = (23) · H

Claim: Two left cosets are either identical or disjoint. Moreover, every element g ∈ G
lies in some coset. That is, the cosets define an equivalence relation by saying g1 ∼ g2 if
there is an h ∈ H such that g1 = g2 h. Here’s a proof written out in excruciating detail. 44
First, g is in gH, so every element is in some coset. Second, suppose g ∈ g1 H ∩ g2 H.
Then g = g1 h1 and g = g2 h2 for some h1 , h2 ∈ H. This implies g1 = g2 (h2 h1^{-1} ) so g1 = g2 h
for an element h ∈ H. (Indeed h = h2 h1^{-1} , but the detailed form is not important.) By
the rearrangement lemma hH = H, and hence g1 H = g2 H.
The basic principle above leads to a fundamental theorem:
Theorem 7.1.1 (Lagrange) If H is a subgroup of a finite group G then the order of
H divides the order of G:
|G|/|H| ∈ Z+ (7.3)
Proof : If G is finite then G = ⊔_{i=1}^{m} gi H for some set of gi , giving the distinct cosets. Now
note that the order of any coset is the order of H:

|gi H| = |H| (7.4)

So |G|/|H| = m, where m is the number of distinct cosets. ♠
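
As a concrete check of this counting, here is a small Python sketch (an illustration, not part of the proof; permutations of {0, 1, 2} are represented as tuples of images) recomputing the left cosets of H = {1, (12)} in S3 from Example 2 above: they have equal size and partition S3 into |G|/|H| = 3 pieces.

```python
# Left cosets of H = {1, (12)} in S_3, and a check of Lagrange's theorem for this example.
from itertools import permutations

def compose(p, q):
    # (p*q)(x) = p(q(x))
    return tuple(p[q[x]] for x in range(len(p)))

G = list(permutations(range(3)))        # the 6 elements of S_3
H = [(0, 1, 2), (1, 0, 2)]              # {identity, the transposition (12)}

cosets = {frozenset(compose(g, h) for h in H) for g in G}

assert all(len(c) == len(H) for c in cosets)        # |g H| = |H| (rearrangement lemma)
assert set().union(*cosets) == set(G)               # the cosets cover G ...
assert sum(len(c) for c in cosets) == len(G)        # ... and are pairwise disjoint
print(len(cosets), "cosets, so [G : H] =", len(G) // len(H))    # 3 cosets
```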


This theorem is simple, but powerful: For example we have the following

Corollary: Any finite group of prime order p is isomorphic to µp ≅ Zp . Moreover, such
groups have exactly two subgroups: The trivial group and itself.

Proof : Choose a nonidentity element g ∈ G and consider the subgroup generated by g i.e,

{1, g, g 2 , g 3 , . . . } (7.5)
44
In general, the reader should provide these kinds steps for herself or himself and we will not spell out
proofs in such detail.

The order of this subgroup must divide |G| and is greater than 1, so if |G| = p is prime it must be the entire group.

Definition 7.1.2: If G is any group and H any subgroup then the set of left cosets of H
in G is denoted G/H. It is the set of orbits under the right H action on G. A set of the
form G/H is also referred to as a homogeneous space. The order of this set is the index of
H in G, and denoted [G : H].

Example 1: If G = S3 , H = {1, (12)} ≅ S2 , then G/H = {H, (13) · H, (23) · H}, and
[G : H] = 3.

Example 2: Let G = {1, ω, ω 2 , . . . , ω 2N −1 } = µ2N where ω is a primitive (2N )th root of


1. Let H = {1, ω 2 , ω 4 , . . . , ω 2N −2 } = µN . Then [G : H] = 2 and G/H = {H, ωH}.

Example 3: Let G = A4 and H = {1, (12)(34)} ≅ Z2 . Then [G : H] = 6 and

G/H = {H, (13)(24) · H, (123) · H, (132) · H, (124) · H, (142) · H} (7.6)

Remark Note well! If H ⊂ G is a subgroup and g1 H = g2 H it does not follow that


g1 = g2 . All you can conclude is that there is some h ∈ H with g1 = g2 h.

Exercise Is there a converse to Lagrange’s theorem?


Suppose n||G|, does there then exist a subgroup of G of order n? Not necessarily! Find
a counterexample. That is, find a group G and an n such that n divides |G|, but G has no
subgroup of order n. 45

Nevertheless, there is a very powerful theorem in group theory known as

Theorem 7.1.2: (Sylow’s (first) theorem). Suppose p is prime and pk divides |G| for a
nonnegative integer k. Then there is a subgroup H ⊂ G of order pk .

Herstein’s book, sec. 2.12, waxes poetic on the Sylow theorems and gives three proofs.
We’ll give a proof as an application of the class equation in section 9 below. Actually,
Sylow has a bit more to say. We will explain some more about this in the next section.
45
Answer: One possible example is A4 , which has order 12, but no subgroup of order 6. By examining the
table of groups below we can see that this is the example with the smallest value of |G|. Sylow’s theorem
(discussed below) states that if a prime power pk divides |G| then there is in fact a subgroup of order pk .
This fails for orders divisible by more than one prime. Indeed, the smallest number that is not a prime power
is 6 = 2 · 3. Thus, in regard to a hypothetical converse to Lagrange’s theorem, as soon as things can go
wrong, they do go wrong.

Definition: Thus far we have repeatedly spoken of the “order of a group G” and of various
subsets of G, meaning simply the cardinality of the various sets. In addition a common
terminology is to say that an element g ∈ G has order n if n is the smallest natural number
such that g n = 1.

Note carefully that if g has order n and k is a natural number then (g n )k = g nk = 1


and hence if g m = 1 for some natural number m it does not necessarily follow that g has
order m. However, as an application of Lagrange’s theorem we can say the following: If
G is a finite group then the order of g must divide |G|, and in particular g |G| = 1. The
proof is simple: Consider the subgroup generated by g, i.e. {1, g, g 2 , . . . }. The order of
this subgroup is the same as the order of g.

Exercise Subgroups of A4
Write down all the subgroups of A4 . Draw a diagram indicating how these are sub-
groups of each other.

Exercise Orders of group elements in infinite groups


a.) Give an example of an infinite group in which all elements, other than the identity,
have infinite order. (This should be quite easy for you.) 46
b.) Give an example of an infinite group where some group elements have finite order
and some have infinite order. 47
c.) Give an example of an infinite group where all elements have finite order. 48

Exercise
Suppose a finite group G has subgroups Hi , i = 1, . . . , s of order hi where the hi are
all mutually relatively prime integers. Show that ∏_i hi divides |G|.

7.2 Conjugacy
Now introduce a notion generalizing the idea of similarity of matrices:
46
One possible answer : Take Z or Zn or ....
47
One possible answer : Z × ZN . Another possible answer is G = U (1).
48
One possible answer : Regard U (1) as the group of complex numbers of modulus one. Let G be the
subgroup of complex numbers so that z N = 1 for some integer N . This is the group of all roots of unity of
any order. It is clearly an infinite group, and by its very definition every element has finite order. Using
the notation of the next section, this group is isomorphic to Q/Z.

Definition 7.2.1 :
a.) A group element h is conjugate to h0 if ∃g ∈ G h0 = ghg −1 .
b.) Conjugacy defines an equivalence relation and the conjugacy class of h is the
equivalence class under this relation:

C(h) := {ghg −1 : g ∈ G} (7.7)

c.) Let H ⊆ G, K ⊆ G be two subgroups. We say “H is conjugate to K” if ∃g ∈ G


such that

K = gHg −1 := {ghg −1 : h ∈ H} (7.8)

Example 7.2.1 : We showed above that all cyclic permutations of the same length in Sn are conjugate.

Example 7.2.2 : Consider G = U (N ). Then conjugacy is the same notion as similarity of


matrices. It is a simple consequence of an important theorem, the spectral theorem, that if
u ∈ U (N ) there is a g ∈ U (N ) with gug −1 = Diag{z1 , . . . , zN } where |zi | = 1. One proves
this by induction on N . See section 17 of Chapter 2.
This does not mean that the set of conjugacy classes can be identified with U (1)N .
Consider, for example, the permutation matrix A(φ) for φ ∈ SN . This is unitary and

A(φ)Diag{z1 , . . . , zN }A(φ)−1 = Diag{zφ(1) , . . . , zφ(N ) } (7.9)

However, once we have taken this into account we are done: The set of conjugacy classes in
U (N ) is the set of unordered N -tuples of phases. We can make this assertion plausible by
noting that all the traces Tr(uk ) are invariant under conjugation, so two conjugate diagonal
matrices must provide simultaneous solutions to
∑_{i=1}^{N} z_i^k = w_k        (7.10)

for k ∈ Z. This can only be the case if they are related by permutation.
The set of conjugacy classes is therefore a space of orbits U (1)N /SN .
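
A quick numerical illustration of (7.9) (a sketch assuming the numpy library; the random seed and matrix conventions are chosen only for this example): conjugating a diagonal unitary by a permutation matrix keeps it diagonal and permutes the phases, so only the unordered N -tuple of phases, or equivalently the traces Tr(u^k ), survives as a class invariant.

```python
# Conjugation of a diagonal unitary by a permutation matrix permutes the phases.
import numpy as np

N = 4
rng = np.random.default_rng(0)
phases = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
D = np.diag(phases)

phi = rng.permutation(N)
A = np.zeros((N, N))
A[phi, np.arange(N)] = 1.0            # permutation matrix: column i has a 1 in row phi(i)

conj = A @ D @ A.T                    # A D A^{-1};  A is orthogonal, so A^{-1} = A^T
assert np.allclose(conj, np.diag(np.diag(conj)))                               # still diagonal
assert np.allclose(sorted(np.angle(np.diag(conj))), sorted(np.angle(phases)))  # same multiset of phases
for k in range(1, 4):                 # class functions such as Tr(u^k) cannot tell them apart
    assert np.isclose(np.trace(np.linalg.matrix_power(D, k)),
                      np.trace(np.linalg.matrix_power(conj, k)))
```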
Now suppose we have two commuting unitary matrices. Again, basic linear algebra
(explained in Chapter two) shows that they can be simultaneously diagonalized. That is, if
u1 , u2 ∈ U (N ) and [u1 , u2 ] = 1 (group commutator) then there is a single g ∈ U (N ) with

gui g −1 = Di (7.11)

with Di diagonal. Now by induction if we have a maximal Abelian subgroup of U (N ) then


they can all be simultaneously diagonalized. So: Every maximal Abelian subgroup of U (N )
is conjugate to the subgroup of diagonal unitary matrices, and this group is isomorphic to
U (1)N .
In general, for any compact Lie group G a maximal torus is a maximal connected Abelian
subgroup of G, and it is an important theorem, generalizing the spectral theorem, that
they are all conjugate subgroups.

Example 7.2.3 : Now consider G = GL(n, C). We must stress that not all matrices are
diagonalizable, so that the full description of conjugacy classes is more complicated. For
any matrix A ∈ Mn (C) we can define its characteristic polynomial

pA (x) := det(x1 − A) (7.12)

Note that pA only depends on the conjugacy class of A:

pgAg−1 (x) = pA (x) (7.13)

If r is a root of this polynomial then the matrix r1 − A has zero determinant, so it


has a nontrivial kernel (see Chapter 2) and therefore there is an eigenvector v of A with
eigenvalue r:
Av = rv (7.14)

It is very important to note that the eigenvectors might not form a basis. Here is a simple
and basic example:

A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}        (7.15)

Then one easily checks pA (x) = x2 . So an eigenvector would satisfy Av = 0. But if there
were a basis of eigenvectors then A = 0, a contradiction. The general statement is that any
matrix is conjugate to its Jordan canonical form. See section 10.4 of Chapter 2. Briefly,
we define the k × k Jordan block with eigenvalue λ:

J_λ^{(k)} = λ 1 + N^{(k)}        (7.16)

N^{(k)} = e_{1,2} + e_{2,3} + · · · + e_{k−1,k}        (7.17)


Let A be any complex N × N matrix and p_A(x) = ∏_{i=1}^{s} (x − λ_i)^{k_i} where the λ_i are the
distinct roots. Then A is conjugate to a block form A_{λ_1} ⊕ · · · ⊕ A_{λ_s} , where each A_{λ_i} is a
block diagonal matrix of Jordan blocks:

A_{λ_i} = J_{λ_i}^{(n_{1,i})} ⊕ · · · ⊕ J_{λ_i}^{(n_{r_i ,i})} ,   with n_{1,i} + · · · + n_{r_i ,i} = k_i        (7.18)

Various permutation matrices can act preserving the Jordan decomposition.
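
For readers who want to experiment, here is a sketch using the sympy library (an assumption of this aside; it is not used elsewhere in the notes) which computes Jordan canonical forms. In particular it confirms that the matrix of (7.15) is a single 2 × 2 Jordan block with eigenvalue 0 and is therefore not diagonalizable.

```python
# Jordan forms with sympy: the matrix of (7.15) and a 3x3 example with two blocks.
import sympy as sp

A = sp.Matrix([[0, 1],
               [0, 0]])
x = sp.symbols('x')
print(sp.factor(A.charpoly(x).as_expr()))   # x**2
print(A.eigenvects())                       # a single eigenvalue 0 with a 1-dimensional eigenspace

P, J = A.jordan_form()                      # A = P J P^{-1}
print(J)                                    # Matrix([[0, 1], [0, 0]]): one Jordan block J_0^{(2)}
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(2, 2)

B = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 2]])
print(B.jordan_form()[1])                   # blocks J_2^{(2)} and J_2^{(1)} for the eigenvalue 2
```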

Remark/Definitions: We say that two homomorphisms ϕi : H → G are conjugate if


there is an element g ∈ G such that

ϕ2 (h) = gϕ1 (h)g −1 (7.19)

for all h ∈ H. Recall that a matrix representation of a group G is a homomorphism

ϕ : G → GL(n, κ) (7.20)

We say two matrix representations are equivalent representations if the two homomorphisms
are conjugate.
Definition: A class function on a group is a function f on G (it can be valued in any
set) such that f takes the same values on conjugate group elements:

f (gg0 g −1 ) = f (g0 ) (7.21)

for all g0 , g ∈ G. Note particularly that if ϕ is a matrix representation then

χϕ (g) := Trϕ(g) (7.22)

is an example of a class function. This function is called the character of the representation.
Note that two equivalent representations must have the same character.

Exercise Conjugacy Is An Equivalence Relation


a.) Show that conjugacy is an equivalence relation
b.) Prove that if H is a subgroup of G then gHg −1 is also a subgroup of G using the
multiplication structure on G.

Exercise Characters Of A Permutation Representation


Consider the n-dimensional representation of Sn given by T (σ) : ei 7→ eσ(i) where ei is
the standard basis of Rn . Show that the character of this representation is

χ(σ) = N (σ) = |{i : σ(i) = i}| (7.23)

As we will see later, this is the number of fixed points of σ.
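
Here is a brute-force check of (7.23) in Python for n = 4 (an illustration, not part of the exercise; the 0-based encoding of permutations is a choice made only for the code): the trace of T (σ) equals the number of fixed points of σ.

```python
# chi(sigma) = Tr T(sigma) = number of fixed points, checked for all of S_4.
from itertools import permutations

def perm_matrix(sigma):
    """Matrix of T(sigma): e_i -> e_{sigma(i)}, so column i has a 1 in row sigma(i)."""
    n = len(sigma)
    return [[1 if row == sigma[col] else 0 for col in range(n)] for row in range(n)]

n = 4
for sigma in permutations(range(n)):
    trace = sum(perm_matrix(sigma)[i][i] for i in range(n))
    fixed_points = sum(1 for i in range(n) if sigma[i] == i)
    assert trace == fixed_points
print("chi(sigma) equals the number of fixed points for all 24 elements of S_4")
```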

Exercise Rotations In SO(3) . Consider a subgroup of SO(3) defined by rotations


around some axis in R3 . Show that all such subgroups are conjugate subgroups.

Exercise Conjugacy Classes In SU (2)


a.) Using the spectral theorem show that the conjugacy class of a matrix u ∈ SU (2)
is uniquely determined by its trace. 49
49
Answer : By the spectral theorem u is conjugate to Diag{e^{iθ} , e^{−iθ} }. Of course θ ∼ θ + 2π. Moreover,
conjugation by iσ^1 swaps the two diagonal entries, so θ and −θ define conjugate elements. Thus Tr(u) = 2 cos θ
determines θ up to θ ∼ −θ, i.e. it determines the conjugacy class.
b.) Show that the set of conjugacy classes in SU (2) can be identified with S 1 /Z2 =
[0, π].
c.) Show that the most general continuous homomorphism ϕ : U (1) → SU (2) looks
like

ϕ : z → \begin{pmatrix} |α|^2 z^n + |β|^2 z^{-n} & αβ(z^{-n} − z^n) \\ α^* β^* (z^{-n} − z^n) & |α|^2 z^{-n} + |β|^2 z^n \end{pmatrix}        (7.24)

for some n ∈ Z, where (α, β) ∈ C2 satisfy |α|2 + |β|2 = 1 and z ∈ U (1). (Hint: Show that every continuous
homomorphism ϕ : U (1) → SU (2) can be conjugated into the diagonal subgroup.)
d.) Show that the conjugacy class of a matrix A ∈ M2 (C) is not uniquely determined
by the values of Tr(Ak ), but it is if A is diagonalizable. (k = 1, 2 will suffice.)

Exercise The Complex Conjugate Representation


a.) Consider the two 2-dimensional representations of SU (2) where ϕ1 is the identity
and ϕ2 (u) = u∗ . Show that these are equivalent representations of SU (2). 50
b.) Consider the two 2-dimensional representations of U (2) where ϕ1 is the identity
and ϕ2 (u) = u∗ . Show that these are inequivalent representations of U (2). 51
c.) Consider the N dimensional representation of SU (N ) given by ϕ1 (u) = u and
ϕ2 (u) = u∗ . Are these equivalent representations?

Exercise An Example Of Inequivalent Representations


(To do this exercise you need to understand about tensor products. See Chapter 2,
section 5.3.)
Consider the following four-dimensional representations of SU (2):
ϕ1 (u) = \begin{pmatrix} u & 0 \\ 0 & u \end{pmatrix}        (7.26)

ϕ2 (u) = u ⊗ u (7.27)

50
Answer : They are equivalent: Conjugation by iσ 2 is equivalent to complex conjugation in SU (2):

(iσ 2 )u(iσ 2 )−1 = u∗ . (7.25)

51
Answer : They are inequivalent. By the spectral theorem u ∈ U (2) can be conjugated to a diagonal
matrix Diag{z1 , z2 } with z1 , z2 ∈ U (1). The character of ϕ1 is χ1 (u) = z1 + z2 . The character of ϕ2 is
χ2 (u) = z1−1 + z2−1 . For general elements of U (2) these are different so the character functions are different.

Are these representations equivalent or inequivalent? 52

7.3 Normal Subgroups And Quotient Groups


Groups which are self-conjugate are very special:

Definition 7.2.2: A subgroup N ⊆ G is called a normal subgroup, or an invariant


subgroup if

gN g −1 = N ∀g ∈ G (7.28)
Sometimes this is denoted as N / G.

Warning! Equation (7.28) does not mean that gng −1 = n for all n ∈ N !

There is a beautiful theorem associated with normal subgroups. In general the set of
cosets of a subgroup H in G, denoted G/H, does not have any natural group structure. 53
However, if H is normal something special happens:

Theorem 7.2.1. If N ⊂ G is a normal subgroup then the set of left cosets G/N =
{gN |g ∈ G} has a natural group structure with group multiplication defined by:

(g1 N ) · (g2 N ) := (g1 · g2 )N (7.29)


Proof- left as an important exercise - see below.

Remarks:

1. All subgroups N of Abelian groups A are normal, and moreover the quotient group
A/N is Abelian.

2. Groups of the form G/N are known as quotient groups. A very common source of
error and confusion is to mix up quotient groups and subgroups. They are very
different!

3. As an illustration of the previous remark note that if T : G → GL(n, κ) is a matrix


representation of G and if H ⊂ G is a subgroup then we can also restrict T to H
to get an n-dimensional representation of H. However, if Q is a quotient of G it is
not true in general that a representation of G determines a representation of Q. We
can try to define T (gN ) = T (g), but this will only make sense if T (n) = 1 for every
n ∈ N.
52
Answer : They are inequivalent. One can show this by computing characters of the two representations.
If u is in the conjugacy class with T r(u) = 2 cos θ then χ1 (u) = 4 cos θ while χ2 (u) = 4(cos θ)2 .
53
Note that it might have many unnatural group structures. For example, if G/H is a finite set with
n elements then we could choose - arbitrarily! - some one-one correspondence between the elements of
G/H and the elements in any finite group with n elements and use this to define a group multiplication
law on the set G/H. We hope the reader can appreciate how incredibly tasteless such a procedure would
be. Technically, it is unnatural because it makes use of an arbitrary extra choice of one-one correspondence
between the elements of G/H and the elements of some group.

Example 7.2.1 Cyclic Groups For example nZ ⊂ Z is normal, and the quotient group
is Z/nZ. This is isomorphic to the cyclic group we have previously denoted as µn or Zn .
So r̄ is the equivalence class of an integer r ∈ Z:

r̄ = r + nZ (7.30)

r̄ + s̄ = (r + s) + nZ (7.31)

Example 7.2.2 Quotients of Zd . Consider G = Zd and let ei be a standard basis, with


1 in the ith row and zeroes elsewhere. Let Aij be a d × d matrix of integers and consider
the elements:
f_i := ∑_{j=1}^{d} A_{ij} e_j        (7.32)

Consider the subgroup H ⊂ G of all integral linear combinations of fi :

H := { ∑_{i=1}^{d} n_i f_i | n_i ∈ Z }        (7.33)

H is clearly a subgroup of G, so we can form the quotient group G/H. If detA ≠ 0 then
in fact G/H is a finite group. One way to see this easily is to consider G as a subgroup of
Qd , so that we can write
e_i = (A^{-1})_{ij} f_j        (7.34)

with A−1 ∈ GL(d, Q). Recall that A−1 = (detA)−1 Cof (A) where the cofactor matrix
Cof (A) is a matrix of minors, and therefore is a matrix of integers. Therefore (detA)ei ∈ H
and hence detA·[ei ] = 0 in the quotient group, so every element of G/H has a representative
of the form [∑_i x_i e_i ] with |x_i | < |detA|.
Actually, if we invoke a nontrivial theorem we can say much more: The matrix A can
be put into Smith normal form. This means that there are matrices S, T ∈ GL(d, Z), ♣Should give a sketch of a proof of SNF in a supplementary section. ♣
representing change of generators (i.e. change of basis of the Z-module) of H and G so
that

SAT = Diag{α1 , . . . , αd } (7.35)

with αi = di /di−1 where d0 = 1 and di for i > 0 is the g.c.d. of the i × i minors. (For a
proof see the Wikipedia article.) Then

G/H ≅ Z_{α_1} × · · · × Z_{α_d}        (7.36)

Note that it has order |G/H| = |detA|. Here is a good example of the difference between a
quotient group and a subgroup: No nontrivial finite group will be a subgroup of Zd .
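
The following Python sketch (an illustration under the stated assumptions: detA ≠ 0 and matrices small enough for a naive determinant; the helper names are invented for this example) computes the αi directly from the g.c.d.'s of the i × i minors, as described above, and confirms |G/H| = |detA| in a small case. The same routine, applied to the Gram matrix of an integral lattice, computes the discriminant group of the next example.

```python
# Elementary divisors alpha_i = d_i / d_{i-1}, with d_i the gcd of the i x i minors.
from itertools import combinations
from functools import reduce
from math import gcd

def det(M):
    """Integer determinant by Laplace expansion (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def elementary_divisors(A):
    """Assumes det(A) != 0.  Returns [alpha_1, ..., alpha_d]."""
    d = len(A)
    dk = [1]                                    # d_0 = 1
    for k in range(1, d + 1):
        minors = [abs(det([[A[r][c] for c in cols] for r in rows]))
                  for rows in combinations(range(d), k)
                  for cols in combinations(range(d), k)]
        dk.append(reduce(gcd, minors))
    return [dk[k] // dk[k - 1] for k in range(1, d + 1)]

A = [[2, 4],
     [6, 8]]
alphas = elementary_divisors(A)
print(alphas)                                   # [2, 4]:  Z^2 / H  is  Z_2 x Z_4
assert abs(det(A)) == alphas[0] * alphas[1]     # |G/H| = |det A| = 8

# Applied to the Gram matrix of the A_2 root lattice it gives [1, 3], i.e. the
# discriminant group Z_3 (see the next example).
print(elementary_divisors([[2, -1], [-1, 2]]))
```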

Example 7.2.3 Discriminant Group. Now consider an embedded lattice in Rd equipped


with Euclidean inner product. This is the integral span of a collection {vi } of vectors. For

simplicity we will assume it is full rank, that is, the {vi } form a basis for Rd over R. We
denote it by Λ, so
Λ := { ∑_{i=1}^{d} n_i v_i | n_i ∈ Z } ⊂ R^d        (7.37)

We define the dual lattice (closely related to the “reciprocal lattice” in solid state physics)
as the set of vectors w ∈ Rd such that w · v ∈ Z for all v ∈ Λ:

Λ∨ := {w ∈ Rd |∀v ∈ Λ v · w ∈ Z} (7.38)

Now assume that Λ is an integral lattice. This means that the matrix of inner products
Gij = vi ·vj is a d×d matrix of integers. (Note it is symmetric and of nonzero determinant.)
Then it follows that Λ ⊂ Λ∨ is a sublattice. The discriminant group of Λ is the finite group

D := Λ∨ /Λ (7.39)

Note that Λ∨ has a basis fi with vi = Gij fj so one can work out D as a product of cyclic
groups using the Smith normal form of Gij .

Example 7.2.4. Let us now consider some nonabelian examples.

A3 ≡ {1, (123), (132)} ⊂ S3 (7.40)

is normal. Note that


(12)A3 (12)−1 = A3 (7.41)

but conjugation by (12) induces a nontrivial permutation of the set A3 . The group S3 /A3
has order 2 and hence must be isomorphic to Z2 .

Example 7.2.5. Of course, in any group G the subgroup {1} and G itself are normal
subgroups. These are the trivial normal subgroups. It can happen that these are the only
normal subgroups of G:

Definition . A group with no nontrivial normal subgroups is called a simple group.

Remarks

1. Note that a simple group cannot have a nontrivial center.

2. The term is a bit of a misnomer: Some simple groups are pretty darn complicated.
What it means is that there is no means of simplifying it using something called
the Jordan-Holder decomposition - discussed below. Simple groups are extremely
important in the structure theory of finite groups. One example of simple groups are
the cyclic groups Z/pZ for p prime. Can you think of others?

3. Sylow’s theorems again. Recall that Sylow’s first theorem says that if pk divides |G|
then G has a subgroup of order pk . If we take the largest prime power dividing |G|,
that is, if |G| = pk m with m relatively prime to p then a subgroup of order pk is called
a p-Sylow subgroup. Sylow’s second theorem states that all the p-Sylow subgroups
are conjugate. The third Sylow theorem says something about how many p-Sylow
subgroups there are.

4. WARNING!: In the theory of Lie groups you will find the term “simple Lie group.”
A simple Lie group is NOT a simple group in the sense we defined above ! For example
SU (2) is a simple Lie group. But it has a nontrivial center namely the two diagonal
SU (2) matrices {±12×2 }.

Example 7.2.6. Recall that SL(n, κ) ⊂ GL(n, κ), SO(n, κ) ⊂ O(n, κ), and SU (n) ⊂
U (n) are all subgroups defined by the condition detA = 1 on a matrix. Note that, since
det(gAg −1 ) = detA for any invertible matrix g these are in fact normal subgroups. The
quotient groups are

GL(n, κ)/SL(n, κ) ≅ κ^*
O(n, R)/SO(n, R) ≅ Z2        (7.42)
U (n)/SU (n) ≅ U (1)

Line 3 follows since every element in U (n) can be written as zA with z ∈ U (1) and A ∈ SU (n); line 1
follows because det : GL(n, κ) → κ^* is surjective with kernel SL(n, κ), so the cosets of SL(n, κ) are labeled by
the determinant. For line 2 take P to be any reflection in a hyperplane orthogonal to some
vector v; then O(n, R) = SO(n, R) ⊔ P SO(n, R) because detP = −1. Recall that Pv1 Pv2 is
a rotation in the plane spanned by v1 , v2 , so it doesn’t matter which hyperplane we choose.

Example 7.2.7. Let G be a topological group. Let G0 be the (path-) connected component
of the identity element 1G ∈ G. We claim that G0 is a normal subgroup: If g0 ∈ G0 there
is a continuous path of group elements γ : [0, 1] → G with γ(0) = 1G and γ(1) = g0 .
Then if g ∈ G is any other group element gγ(t)g −1 is a continuous path connecting 1G to
gg0 g −1 . The quotient group G/G0 is the group of components, sometimes denoted π0 (G).
In general for a topological space X, the set of connected components is denoted by π0 (X),
but it carries no natural group structure.
For some examples:

1. G = R∗ under multiplication. The identity element is 1 and clearly R∗>0 is the


connected component of the identity. The quotient group is isomorphic to Z2 .

2. The determinant map defines a homomorphism det : GL(n, R) → R∗ . Clearly there


is no path of GL(n, R) matrices that connects elements with detA < 0 to detA > 0.
(Why?) In fact, the subgroup of GL(n, R) of matrices with positive determinant is

connected: The sign of the determinant is the only obstruction to deformation to the
identity. 54

3. Very similar considerations hold for O(n, R). One can show that SO(n, R) is the
connected component of the identity and π0 ∼ = Z2 . For example, consider O(2). In
an exercise above you showed that as a manifold it has two components, each of which
can be identified with a circle. The connected component of the identity is SO(2).
We have O(2) = SO(2) ⊔ SO(2)P where P is any O(2) matrix of determinant = −1.
So π0 (O(2)) ≅ Z2 . Similarly, π0 (O(n)) ≅ Z2 .
4. One can show that π0 (Diff(T^2 )) ≅ GL(2, Z). Indeed there is a subgroup of Diff(T^2 )
isomorphic to GL(2, Z) of diffeomorphisms

\begin{pmatrix} σ^1 \\ σ^2 \end{pmatrix} ↦ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} σ^1 \\ σ^2 \end{pmatrix}        (7.43)

which projects isomorphically to the quotient group.

5. If G is a finite group then π0 (G) ≅ G.

Example 7.2.8. The center of U (N ) consists of matrices proportional to the unit matrix.
See the exercise below. Elements in the center of SU (N ) must likewise be proportional to the unit matrix. However,
now if z1N ×N is to be in SU (N ) then z^N = 1 (why?) so Z(SU (N )) ≅ µN ≅ ZN . Since
this subgroup is normal we can take a quotient and get another group. It is known as

P SU (N ) := SU (N )/ZN (7.44)

One can show that P SU (N ) ≅ U (N )/Z(U (N )) ≅ U (N )/U (1). There are representations
of SU (N ) that are not representations of P SU (N ) so here is another example where
P SU (N ) cannot be considered as a subgroup of SU (N ) in any sense.

Exercise Due Diligence


a.) Check the details of the proof of Theorem 7.2.1 ! 55

54
The theory of fiber bundles shows that if H is a Lie subgroup of G so that G/H is a topological space
then if H is connected the set of components of G can be identified with the set of components of G/H.
We will see later from the stabilizer orbit theorem that SO(n + 1)/SO(n) = S n , so we can prove SO(n)
is connected by induction on n. By the Gram-Schmidt procedure SL(n, R) is a product of SO(n) and upper
triangular matrices with positive diagonal entries - and this space is connected. So SL(n, R) is connected. Then the
set of components of GL(n, R) is that of R∗ , and this is measured by the determinant.
55
Answer : The main thing to check is that the product law defined by (7.29) is actually well defined.
Namely, you must check that if g1 N = g10 N and g2 N = g20 N then g1 g2 N = g10 g20 N . To show this note that
g10 = g1 n1 and g20 = g2 n2 for some n1 , n2 ∈ N . Now note that g10 g20 = g1 n1 g2 n2 = g1 g2 (g2−1 n1 g2 )n2 . But,
since N is normal (g2−1 n1 g2 ) ∈ N and hence (g2−1 n1 g2 )n2 ∈ N and hence indeed g1 g2 N = g10 g20 N . Once
we see that (7.29) is well-defined the remaining checks are straightforward. Essentially all the basic axioms
are inherited from the group law for multiplying g1 and g2 . Associativity should be obvious. The identity
is 1G N = N and the inverse of gN is g −1 N . etc. ♠

b.) Consider the right cosets. Show that N \G is a group.

Exercise Even Permutations


Example 7.2.4 has a nice generalization. Recall that a permutation is called even if it
can be written as a product of an even number of transpositions.
a.) Show that the even permutations, An , form a normal subgroup of Sn .
b.) What is Sn /An ?

Exercise Subgroups Of Index Two


a.) Suppose that H ⊂ G is of index two: [G : H] = 2. Show that H is normal in G.
What is the group G/H in this case? 56
b.) Using (a) give another proof that An / Sn is a normal subgroup.
c.) As we will discuss later, the groups An for n ≥ 5 are simple groups. Accepting
this for the moment give an infinite set of counterexamples to the converse of Lagrange’s
theorem. 57

Exercise
Look at the 3 examples of homogeneous spaces G/H in section 7.1. Decide which of
the subgroups H is normal and what the group G/H would be.

Exercise
Show that if the center Z(G) is such that G/Z(G) is cyclic then G is Abelian. 58

56
Answer : Suppose G = H q g0 H. Then take any h ∈ H. The element g0 hg0−1 must be in H or g0 H.
But if it were in g0 H then there would be an h0 ∈ H such that g0 hg0−1 = g0 h0 but this would imply g0 is
in H, which is false. Therefore, for all h ∈ H, g0 hg0−1 ∈ H, and hence H is a normal subgroup. Therefore
G/H ∼ = Z2 .
57
Answer : Note that the order |An | is even and hence (1/2)|An | is a divisor of |An |. However, a subgroup
of order |An |/2 would have to be a normal subgroup, and hence does not exist, since An is simple. More
generally, a high-powered theorem, known as the Feit-Thompson theorem, states that a finite simple non-
abelian group has even order. Therefore if G is a finite simple nonabelian group there is no subgroup of
order (1/2)|G|, even though this is a divisor.
58
Answer : Every element of G would be of the form g0^n z with z ∈ Z(G). But then it is easy to check:
g0^n z g0^m z' = g0^m z' g0^n z so G is Abelian. So G = Z(G), and in fact the cyclic quotient must be trivial.

Exercise Sylow subgroups of A4
Write down the 2-Sylow and 3-Sylow subgroups of A4 .

Exercise Commutator Subgroups And Abelianization


If g1 , g2 are elements of a group G then the group commutator is the element [g1 , g2 ] :=
g1 g2 g1−1 g2−1 . If G is any group the commutator subgroup usually denoted [G, G] (sometimes
denoted G0 ) is the subgroup generated by words in all group commutators g1 g2 g1−1 g2−1 .
a.) Show that [G, G] is a normal subgroup of G. 59
b.) Show that G/[G, G] is abelian. This is called the abelianization of G. 60
c.) Consider the free group on 2 generators. What is the abelianization?
d.) Consider a surface group of the type given in (6.42). The abelianization of this
group is isomorphic to the first homology group H1 (S) where S is the punctured surface.
Compute this group. 61

Exercise A Less Than Perfect Group


a.) Recall that a simple group is a group with no nontrivial normal subgroups. A
perfect group is a group which is equal to its commutator subgroup. Show that a nonabelian
simple group must be perfect.
b.) Show that Sn is not a perfect group. What is the commutator subgroup? 62

Exercise Signed Permutations Again


Recall our discussion of a natural matrix representation of Sn and the group W (Bn )
of signed permutations from *** above.
59
Answer : Note that g0 [g1 , g2 ]g0−1 = [g10 , g20 ] where gi0 = g0 gi g0−1 .
60
Answer : Let G0 = [G, G] then g1 G0 g2 G0 = (g1 g2 )G0 = g2 g1 (g1−1 g2−1 g1 g2 )G0 = g2 g1 G0 .
61
One can define higher homology groups Hk (X) of a topological space X but these in general are not
Abelianiations of the higher homotopy groups πk (X), even though both groups are Abelian. Homology and
homotopy groups measure different aspects of the topology of a space.
62
The commutator subgroup is clearly a subgroup of An . In fact An is generated by products of two
transpositions and hence is generated by (abc). But note that (ab)(ac)(ab)(ac) = (abc). Therefore the
commutator subgroup of Sn is just An .

a.) Show that the subgroup of W (Bn ) of diagonal matrices is a normal subgroup
isomorphic to Zn2 .
b.) Show that every signed permutation matrix can be written in the form D · Π where
D is a diagonal matrix of ±1’s and Π is a permutation matrix.
c.) Conclude that the quotient of W (Bn ) by the normal subgroup of diagonal matrices
is isomorphic to Sn .
d.) Show that every signed permutation can also be written as Π0 · D0 . How is this
decomposition related to writing it as D · Π.
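
A small numerical sketch of parts b.) and d.) (assuming numpy; the example matrix is chosen arbitrarily): every signed permutation matrix factors as D · Π with D a diagonal matrix of signs and Π an ordinary permutation matrix, and in the opposite order the diagonal factor is the conjugate Π^{-1} D Π.

```python
# Factor a signed permutation matrix as D * Pi and as Pi * D'.
import numpy as np

def factor_DP(M):
    signs = M.sum(axis=1)          # the single nonzero entry (+1 or -1) in each row
    D = np.diag(signs)
    Pi = D @ M                     # D^{-1} = D, so Pi = D^{-1} M is a 0/1 permutation matrix
    return D, Pi

M = np.array([[ 0, -1,  0],
              [ 0,  0,  1],
              [-1,  0,  0]])       # an element of W(B_3)
D, Pi = factor_DP(M)
assert np.array_equal(M, D @ Pi)
assert set(Pi.flatten()) == {0, 1}              # Pi really is a permutation matrix

Dprime = Pi.T @ D @ Pi                          # conjugated (hence still diagonal) sign matrix
assert np.array_equal(M, Pi @ Dprime)           # M = Pi * D'
print(np.diag(D), np.diag(Dprime))
```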

Exercise Products Of Simple Groups


Let G1 and G2 be simple groups.
a.) What are the subgroups of the Cartesian product G1 × G2 ? 63
b.) Suppose Gi , i ∈ I is a set of simple groups. What are the subgroups of ∏_{i∈I} Gi ?

♣This exercise should go to the first section when we introduce the matrix groups. ♣

Exercise The Center Of U (N )


Show that the center of U (N ) consists of the subgroup of matrices proportional to the
unit matrix and is therefore isomorphic to U (1). 64

Exercise Subgroups Which Are Not So Normal


a.) Consider O(n, R) ⊂ GL(n, R). Is this a normal subgroup?
b.) Consider the subgroup of diagonal matrices in SU (N ). Is this a normal subgroup?

63
Answer : If the question is understood as asking for the normal subgroups (and G1 , G2 are nonabelian) the answer is: only {1}, G1 × {1G2 } , {1G1 } × G2 and G1 × G2 . There can be many other non-normal subgroups, and when G1 ≅ G2 there are also “diagonal” subgroups.
64
Answer : There are many proofs but one nice one is to use induction on N . First establish the result for
U (2) - here the matrix multiplication is easy and this can be done by hand. Now suppose that ζ ∈ U (N + 1)
is in the center. Decompose it as follows:

ζ = \begin{pmatrix} A & v \\ w & D \end{pmatrix}

where A ∈ M2 (C), v ∈ M2×(N−1) (C), w ∈ M(N−1)×2 (C), and D ∈ M(N−1)×(N−1) (C). Now insist that it commute
with

\begin{pmatrix} u & 0 \\ 0 & 1 \end{pmatrix}

with u ∈ U (2) to show that uAu−1 = A, uv = v and wu−1 = w for all u ∈ U (2). These equations imply A
is proportional to the 2 × 2 unit matrix and v = 0, w = 0. Moving the U (2) block along the diagonal (or arguing by induction) one concludes that ζ is a multiple of the unit matrix.

Exercise The Normalizer Subgroup
If H ⊂ G is a subgroup then we define the normalizer of H within G to be the largest
subgroup N of G such that H is a normal subgroup of N . Note that H is normal inside
itself so such subgroups exist. If N1 , N2 ⊂ G are subgroups and H is normal in both then
they generate a subgroup in which H is normal. In fact we have:

NG (H) := {g ∈ G|gHg −1 = H} (7.45)

a.) Show that (7.45) is a subgroup of G and H is a normal subgroup of NG (H).


b.) Show that NG (H) is the largest subgroup of G which contains H as a normal
subgroup.
Note that there is no claim that NG (H) is a normal subgroup of G. In general, it is
not.

Exercise An Important Class of Quotient Groups: The Weyl Groups Of Lie Groups
a.) Let D ⊂ SU (2) be the subgroup of diagonal matrices. Note that D ∼ = U (1).
Compute
NSU (2) (D) (7.46)
explicitly. 65
b.) Compute the quotient group NSU (2) (D)/D. 66
c.) Show that conjugation by elements of the normalizer act by a permutation of the
diagonal elements and the permutation only depends on the projection to the quotient.
d.) Show that there is no subgroup of NSU (2) (D) whose conjugation on D induces the
permutation action.

Remark: In general, in a simple Lie group G there is a unique maximal torus, T ⊂ G up


to conjugation. The Weyl group of G is by definition

W (G) := NG (T )/T (7.47)

For example, in SU (n) any maximal torus is conjugate to the subgroup D ⊂ SU (n) of
diagonal matrices. In this case, conjugation by NSU (n) (D) acts on D by permutation of
the diagonal elements and in fact

W (SU (n)) := N_{SU (n)} (D)/D ≅ Sn        (7.48)
65
Answer : The normalizer is the subgroup of SU (2) that is the union of matrices of the form
\begin{pmatrix} z & 0 \\ 0 & z^{-1} \end{pmatrix}
or of the form
\begin{pmatrix} 0 & −z^{-1} \\ z & 0 \end{pmatrix}
where z is a phase.
66
Answer The quotient is isomorphic to Z2 .

Note that the Weyl group is defined as a quotient of a subgroup of G. (Often this is
abbreviated to “subquotient of G.”) In general there is no subgroup of G whose conjugation
action induces the Weyl group action on T . (It is a common mistake to confuse W (G)
with a subgroup of G.)

Exercise Representations Of SU (N ) That Are Not Representations Of P SU (N )


Give an example of a representation of SU (N ) that is not a representation of the
quotient P SU (N ). 67

Exercise Homomorphic Images And Normal Subgroups


Suppose N / G, and suppose that ϕ : G → G̃ is a homomorphism to some other group
G̃. Then ϕ(N ) ⊂ G̃ is a subgroup. Is it a normal subgroup? 68 ♣Your answer here should have a more explicit counterexample. ♣


Figure 19: In a suitable range of real values of f, g the real points on the elliptic curve have the
above form. Then the elliptic curve group law is easily pictured as shown.

7.3.1 A Very Interesting Quotient Group: Elliptic Curves


Consider the Abelian group C of complex numbers with the usual addition as the group
operation. If τ is a complex number with nonzero imaginary part then Z + τ Z is the
67
The defining representation is not, because the center of SU (N ) acts nontrivially.
68
Answer : In general it is not a normal subgroup. However, if ϕ is surjective then it is easy to see that
it is a normal subgroup

subgroup of complex numbers of the form n1 + τ n2 where n1 and n2 are integers. Since C
is abelian we can form the Abelian group C/Z + τ Z. Note that Z + τ Z is a rank two lattice
in the plane so that this quotient space can be thought of as a torus. As an Abelian group
this group is isomorphic to U (1) × U (1). The explicit isomorphism is

(σ1 + τ σ2 ) + (Z + τ Z) 7→ (e2πiσ1 , e2πiσ2 ) (7.49)

A remarkable fact is that this torus (minus one point) can be thought of as the space
of solutions of the algebraic equation

y 2 = x3 + f x + g (7.50)

where (x, y) ∈ C2 and f, g ∈ C. 69


The mapping between [z] ∈ C/(Z + τ Z) and (x, y) and between f, g and the complex
number τ involves very interesting functions known as elliptic functions. The solution set
is known as an “elliptic curve.” It is not difficult to describe the mapping. One introduces
a holomorphic function of z known as the Weierstrass function:
℘(z|τ ) := 1/z^2 + ∑_{ω∈Λ−{0}} [ 1/(z − ω)^2 − 1/ω^2 ]        (7.51)

where ω = n1 + n2 τ ∈ Λ := Z + τ Z and z ∉ Λ. Note that for large values of ω the summand
behaves like 2z/ω^3 , so the series converges absolutely since ∫ dx dy/r^3 is convergent at r → ∞. By
general results in complex analysis the function is holomorphic for z ∈ C − Λ. Note that it
is also doubly-periodic:
℘(z + m + m0 τ |τ ) = ℘(z|τ ) (7.52)
for all m, m0 ∈ Z. So it descends to a function on the quotient to define a meromorphic
function on a complex manifold.
For [z] = 0 the function has a second order pole. Put differently, the Weierstrass
function has a double pole at every point z ∈ Λ. Indeed we can expand ℘(z|τ ) around
z = 0:

℘(z|τ ) = 1/z^2 + ∑_{k=1}^{∞} (2k + 1) G_{2k+2} z^{2k}
        = 1/z^2 + 3G4 z^2 + 5G6 z^4 + · · ·        (7.53)

where

G_{2k+2} = ∑_{(n1 ,n2 )∈Z^2 −{(0,0)}} 1/(n1 + n2 τ )^{2k+2}        (7.54)

69
We can restore the point at infinity using projective geometry. The equation ZY 2 = X 3 + f XZ 2 + gZ 3
makes sense for a point [X : Y : Z] ∈ CP2 . Indeed note that the equivalence relation says that [X : Y :
Z] = [λX : λY : λZ] and the equation is homogeneous and of degree three. The equation (7.50) is the
equation we get in the patch Z 6= 0 where we can fix the scaling degree of freedom by choosing λ so that
Z = 1. We then define x, y by [x : y : 1] = [X : Y : Z] = [X/Z : Y /Z : 1] which makes sense when Z 6= 0.
The point at infinity has Z = 0. Therefore, by the equation X = 0, and since we have a point in CP2 we
must have Y 6= 0, which can therefore be scaled to y = 1. So, the point at infinity is [0 : 1 : 0] ∈ CP2 .

are absolutely convergent and hence holomorphic functions of τ for k ≥ 1 when Imτ ≠ 0.
They are famous functions known as Eisenstein series and are basic examples of a
fascinating set of functions known as modular forms. In order to produce equation (7.50)
we will take x = ℘(z|τ ) and define

y := ∂_z ℘(z|τ )
  = −2/z^3 + 6G4 z + 20G6 z^3 + · · ·        (7.55)
Now a small amount of algebra shows that we have the series expansion

y 2 − 4x3 + 60G4 x = −140G6 + O(z 2 ) (7.56)

So the combination y 2 − 4x3 + 60G4 x is entire, i.e., holomorphic in the entire complex
plane, and doubly-periodic. But then by Liouville’s theorem it must be constant! Thus all
the higher terms in the series vanish! Thus we have:

y 2 = 4x3 − 60G4 x − 140G6 (7.57)

which is exactly the form (7.50) up to a simple rescaling.


The Abelian group law expressed in terms of (x, y) is rather nontrivial and closely
related to some deep topics in number theory. If one considers f, g to be real and studies
the real solutions then the group law can be visualized as in Figure 19. (We are following
the Wikipedia article here, which is quite clear.) One first defines the inverse −P of a
point P on the curve with coordinates P = (x, y) to be −P := (x, −y). Then, for generic
points we can define P + Q by saying that P + Q = −R where R is the third point of intersection
of the straight line through P, Q with the elliptic curve. In formulae we can write the line
between P, Q as
y = sx + d (7.58)
with
   
s = (yP − yQ )/(xP − xQ ) ,        d = yP − s xP = yQ − s xQ        (7.59)
Now the intersection of this line with the cubic equation has x coordinates given by

(sx + d)2 = x3 + f x + g (7.60)

and by simple rearrangement we can rewrite (7.60) as

x3 − s2 x2 + (f − 2sd)x + (g − d2 ) = 0 (7.61)

On the other hand, this equation must be of the form

(x − xP )(x − xQ )(x − xR ) = 0 (7.62)

Expanding out (7.62) and equating the coefficients of x2 with those of (7.61) we obtain

xR = s2 − xP − xQ (7.63)

so we have xR explicitly as a function of xP , xQ , yP , yQ . Now the point (xR , yR ) must lie
on the line y = sx + d so we can also say that

yR = yP + s(xR − xP ) (7.64)

expressing yR and hence the coordinates of P + Q = −R = (xR , −yR ) as rational functions of xP , xQ , yP , yQ .


It is not at all obvious that the above group law really satisfies the associativity constraint.
When points coincide or the line is tangent to the elliptic curve one must carefully de-
generate the above expressions. Indeed, requiring that P + (−P ) = 0 shows that 0 must
correspond to the point at infinity. When f, g are not in the range to give a figure like
Figure 19 the algebraic equations above still define a group law. Indeed, these equations
make sense over any field, thus allowing one to define an Abelian group law for elliptic
curves defined over any field.
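
Here is a short Python sketch of the chord construction (an illustration, not part of the notes), using exact rational arithmetic and treating only the generic case xP ≠ xQ ; tangent lines, coincident points and the point at infinity require the degenerations mentioned above.

```python
# Chord addition on y^2 = x^3 + f x + g over the rationals.
from fractions import Fraction as F

f, g = F(0), F(1)                        # the curve y^2 = x^3 + 1

def on_curve(P):
    x, y = P
    return y * y == x ** 3 + f * x + g

def add(P, Q):
    """P + Q for points with distinct x-coordinates."""
    (xP, yP), (xQ, yQ) = P, Q
    s = (yP - yQ) / (xP - xQ)            # slope of the chord, (7.59)
    xR = s * s - xP - xQ                 # (7.63)
    yR = yP + s * (xR - xP)              # y-value of the chord at xR, (7.64)
    return (xR, -yR)                     # P + Q = -R

P, Q = (F(0), F(1)), (F(2), F(3))        # two rational points on the curve
assert on_curve(P) and on_curve(Q)
S = add(P, Q)
print(S)                                 # (Fraction(-1, 1), Fraction(0, 1)), again on the curve
assert on_curve(S)
assert add(P, Q) == add(Q, P)            # the construction is symmetric in P and Q
```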

Exercise Modular Transformations


a.) Show that the transformation law
\begin{pmatrix} a & b \\ c & d \end{pmatrix} : (z, τ ) ↦ ( z/(cτ + d) , (aτ + b)/(cτ + d) )        (7.65)

defines a group action of SL(2, Z) on pairs C × H.


b.) Show that

℘( z/(cτ + d) | (aτ + b)/(cτ + d) ) = (cτ + d)^2 ℘(z|τ )        (7.66)
c.) Show that
G_{2k}( (aτ + b)/(cτ + d) ) = (cτ + d)^{2k} G_{2k}(τ )        k = 2, 3, 4, ...        (7.67)
d.) Using Fourier analysis prove that for z ∉ Z we have

∑_{n∈Z} 1/(z + n)^2 = π^2 / sin^2 (πz)        (7.68)

and conclude that for Imz > 0



∑_{n∈Z} 1/(z + n)^{2k} = ( (2πi)^{2k} / (2k − 1)! ) ∑_{ℓ=1}^{∞} ℓ^{2k−1} e^{2πiℓz}        (7.69)

From this derive that



G_{2k}(τ ) = 2ζ(2k) + 2 ( (2πi)^{2k} / (2k − 1)! ) ∑_{n=1}^{∞} σ_{2k−1}(n) q^n        (7.70)

where σ_m (n) = ∑_{d|n} d^m and q := e^{2πiτ} . Furthermore show that

G_{2k}(τ ) = 2ζ(2k) E_{2k}(τ )        (7.71)


E_{2k}(τ ) = 1 − (4k/B_{2k}) ∑_{n=1}^{∞} n^{2k−1} q^n /(1 − q^n )        (7.72)
where B2k is the Bernoulli number (as defined in Wikipedia).

7.4 Conjugacy Classes In Sn


Above we discussed the cycle decomposition of elements of Sn . Now let us study how the
cycles change under conjugation.
When showing that transpositions generate Sn we noted the following fact:

If (i1 i2 · · · ik ) is a cycle of length k then g(i1 i2 · · · ik )g −1 is a cycle of length k. It


is the cycle where we replace i1 , i2 , . . . by their images under g. That is, if g(ia ) = ja ,
a = 1, . . . , k, then g(i1 i2 · · · ik )g −1 = (j1 j2 · · · jk ).

It therefore follows that:

Any two cycles of length k are conjugate.

Example In S3 there are two cycles of length 3 and they are indeed conjugate:

(12)(123)(12)−1 = (213) = (132) (7.73)

Now recall that any element in Sn can be written as a product of disjoint cycles.

Therefore, the conjugacy classes in Sn are labeled by specifying a nonnegative integer,


denoted `j , where j = 1, . . . , n, and `j is the number of distinct cycles of length j in the
cycle decomposition of any typical element σ of C(σ).

Examples:

1. The following two permutations in S12 are conjugate:

(1, 2)(3, 4)(5, 6)(7, 8, 9)(10, 11, 12) (7.74)

(4, 10)(7, 8)(9, 11)(1, 12, 6)(2, 5, 3) (7.75)


This has `1 = 0, `2 = 3, `3 = 2, `j = 0 for j > 3.

2. In S4 there are 3 elements with cycle decomposition of type (ab)(cd):

(12)(34), (13)(24), (14)(23) (7.76)

Note that these can be conjugated into each other by suitable transpositions. So this
conjugacy class is determined by

`1 = 0 `2 = 2 `3 = 0 `4 = 0 (7.77)

In general we can denote a conjugacy class in Sn by:

(1)`1 (2)`2 · · · (n)`n (7.78)

Then, since we must account for all n letters being permuted we must have:

n = 1 · `1 + 2 · `2 + · · · + n · `n = ∑_{j=1}^{n} j · `j        (7.79)

Definition A decomposition of n into a sum of positive integers, disregarding the order of the terms, is called a partition
of n.

Therefore:

The conjugacy classes of Sn are in 1-1 correspondence with the partitions of n.

Definition The number of distinct partitions of n is called the partition function of n,


and denoted p(n). 70

Example For n = 4, 5 p(4) = 5 and p(5) = 7 and the conjugacy classes of S4 and S5 are:

Partition         Cycle decomposition    Typical g    |C(g)|                     Order of g
4 = 1+1+1+1       (1)^4                  1            1                          1
4 = 1+1+2         (1)^2 (2)              (ab)         (4 choose 2) = 6           2
4 = 1+3           (1)(3)                 (abc)        2 · 4 = 8                  3
4 = 2+2           (2)^2                  (ab)(cd)     (1/2)(4 choose 2) = 3      2
4 = 4             (4)                    (abcd)       6                          4

70
This is a term in number theory. It is not to be confused with the “partition function” of a field theory!

Cycle decomposition    |C(g)|                         Typical g    Order of g
(1)^5                  1                              1            1
(1)^3 (2)              (5 choose 2) = 10              (ab)         2
(1)^2 (3)              2 · (5 choose 3) = 20          (abc)        3
(1)(4)                 6 · (5 choose 4) = 30          (abcd)       4
(1)(2)^2               5 · (1/2)(4 choose 2) = 15     (ab)(cd)     2
(2)(3)                 2 · (5 choose 2) = 20          (ab)(cde)    6
(5)                    4! = 24                        (abcde)      5

♣Put the tables in uniform format and add one for S6 . ♣

Exercise Sign of the conjugacy class
Let ε : Sn → {±1} be the sign homomorphism. Show that ε(g) = (−1)^{n + ∑_j `j} if g is
in the conjugacy class (7.78).

Exercise Order of the conjugacy class


Given a conjugacy class of type (7.78) compute the order |C(g)|. 71
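
Here is a small Python sketch (an illustration, not part of the exercise) that enumerates the partitions of n, computes the class sizes from the formula quoted in the footnote, and checks that they add up to n! = |Sn |; for n = 5 it reproduces the table above.

```python
# Conjugacy classes of S_n <-> partitions of n, with class sizes n!/prod_j (j^{l_j} l_j!).
from math import factorial, prod

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest

def class_size(n, part):
    counts = {j: part.count(j) for j in set(part)}      # l_j = number of cycles of length j
    return factorial(n) // prod(j ** l * factorial(l) for j, l in counts.items())

n = 5
sizes = [class_size(n, p) for p in partitions(n)]
for p in partitions(n):
    print(p, class_size(n, p))
print("p(%d) = %d conjugacy classes" % (n, len(sizes)))  # p(5) = 7
assert sum(sizes) == factorial(n)                        # the classes partition S_n
```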

7.4.1 Conjugacy Classes In Sn And Harmonic Oscillators


There is a beautiful relation of conjugacy classes of the symmetric group with special
collections of harmonic oscillators. We’ll give a taste of how that happens here.
Let’s review briefly some facts about the quantum mechanical harmonic oscillator:
The classical harmonic oscillator is described by a phase space with coordinate q ∈ R (the
displacement of the oscillator) and momentum p ∈ R with Hamiltonian

H = (1/2)(p^2 + ω^2 q^2 )        (7.80)

(we have scaled away the mass). The classical Poisson bracket {p, q} = 1 is quantized by
postulating there are operators p̂, q̂ with

[p̂, q̂] = −i~ (7.81)


71
Answer : |C(g)| = n!/( ∏_{i=1}^{n} i^{`i} `i ! ).

and there is a Hilbert space representing this operator algebra. One forms the linear
combinations:
a := (1/√(2ℏω)) (ω q̂ + ip̂)
ā := (1/√(2ℏω)) (ω q̂ − ip̂)        (7.82)
so that
[a, ā] = 1 (7.83)

The operator algebra is a ∗-algebra so there is a C-antilinear map with q̂ ∗ = q̂ and p̂∗ = p̂.
So then ā = a∗ .
If we just consider the operator algebra without considering H then there are a number
of different ways to represent it. We could postulate there is a vector |0i with a|0i = 0.
Then the Hilbert space is spanned by ān |0i and ā = a† . But note that we could make a
linear transformation to

b = αa + βā
(7.84)
b̄ = γa + δā

and this will preserve the commutation relations: [b, b̄] = 1, if αδ − βγ = 1. If we preserve
the ∗ structure so that b̄ = b∗ then δ = α∗ and γ = β ∗ and we find transformations by
SU (1, 1).
If we represent p̂, q̂ on the Hilbert space L2 (R) with

(q̂ · ψ)(x) = x ψ(x),    (p̂ · ψ)(x) = −iħ ∂ψ(x)/∂x    (7.85)
then the groundstate with a|0⟩ = 0 is preferred because the corresponding vector in L^2(R)
satisfies:

(ip̂ + ω q̂)ψ = 0  ⇒  ψ(x) = C e^{−ωx^2/(2ħ)}    (7.86)

This choice of quantization is also preferred when we consider the oscillator Hamiltonian:
The quantum Hamiltonian is H = ω(a†a + 1/2). The states āⁿ|0⟩ are eigenstates of H
with eigenvalue ω(n + 1/2). Assuming |0⟩ has unit norm the normalized eigenstates (1/√n!) āⁿ|0⟩
form a complete ON basis for the Hilbert space.
In quantum statistical mechanics a very important quantity is the (physics) partition
function defined to be
Tr_{H_{single h.o.}} e^{−βH} = e^{−βω/2}/(1 − e^{−βω}) = 1/(2 sinh(βω/2))    (7.87)

Here β has the physical interpretation of 1/(kT ) where k is Boltzmann’s constant and T
is the temperature above absolute zero. Using the (physics) partition function one derives
thermodynamic quantities when the oscillator is connected to a heat bath.

Now, suppose we have a system which is described by an infinite collection of harmonic
oscillators:

[aj , ak ] = 0 [a†j , a†k ] = 0 [aj , a†k ] = δj,k j, k = 1, . . . (7.88)

Suppose they have frequencies which are all a multiple of a basic harmonic which we'll de-
note ω, so the frequencies associated with the oscillators a_1, a_2, a_3, . . . are ω, 2ω, 3ω, . . . .
The motivation for choosing all frequencies to be multiples of a basic frequency comes from
the theory of strings, as we will explain below.
If we write the standard sum of harmonic oscillator Hamiltonians we get, formally,

H^{formal} = ∑_{j=1}^{∞} jω ( a†_j a_j + 1/2 )    (7.89)

This is formal, because on the usual lowest weight module of the system defined by saying
the vacuum line satisfies:
aj |vaci = 0 ∀j (7.90)
the groundstate energy is infinite. This is typical of the divergences of quantum field theory.^{72}
An infinite number of degrees of freedom typically leads to divergences in physical quantities.
However, there is a very natural way to regularize and renormalize this divergence
by using the Riemann zeta function:
∑_{j=1}^{∞} jω/2 = (ω/2) ∑_{j=1}^{∞} 1/j^{−1}  →  (ω/2) ζ(−1) = −ω/24    (7.91)

This can be justified much more rigorously,^{73} and indeed it gives the correct Casimir
energy for a massless scalar field on an interval. Multiplying by two we get a similar result
for a scalar field on a circle. If we restore proper units so that the radius of the circle is L, so
that the length is 2πL, the ground state energy is: ♣We now arranged the string explanation below so this remark is out of place. ♣


E_ground = −1/(24L) = −ħc/(24L)    (7.92)

where in the second equation we restored ħ and c which had been set to 1. Of course,
unless you couple the system to gravity, the zero of energy is arbitrary. Here the zero of
energy is defined by saying the massless scalar field on the real line has zero groundstate
energy. Then the above formula for the Casimir energy is meaningful. What is meaningful
independent of the choice of zero of energy is the Casimir force −∂E_ground/∂L.
72
The quantum field theory in question is that of a massless scalar field in a spacetime of 1+1 dimensions.
73
One way to see that ζ(−1) = −1/12 is to use the functional equation for the Riemann zeta function. ζ(s)
is a convergent series for Re(s) > 1. Define ξ(s) = (1/2) π^{−s/2} s(s − 1) Γ(s/2) ζ(s). Then one proves the stunning
result: ξ(s) = ξ(1 − s) for the analytically continued ζ-function. Now, one can analytically continue the Γ
function using Γ(x + 1) = xΓ(x), and then evaluate both sides of the functional equation at s = 2 using
Γ(1/2) = √π and ζ(2) = π^2/6. The analytic continuation of the ζ-function can be derived from the integral
representation 2π^{−s/2} Γ(s/2) ζ(s) = ∫_0^∞ x^{s/2} (ϑ(ix) − 1) dx/x and from this representation one easily derives the
functional equation once one knows the modular transformation law of the theta function ϑ(τ). See section
*** below for a physical derivation of the modular transformation law of the theta function.

In any case, things work out very nicely if we take the Hamiltonian to be:

H = ∑_{j=1}^{∞} jω a†_j a_j − ω/24    (7.93)

The dimension of the space of states of energy nω above the groundstate is p(n). A natural
basis of this space is labeled by partitions of n:

(a†_1)^{ℓ_1} (a†_2)^{ℓ_2} · · · (a†_n)^{ℓ_n} |0⟩    (7.94)

and hence the vectors in this basis are in 1-1 correspondence with the conjugacy classes of
Sn . This turns out to be significant in the boson-fermion correspondence in 1+1 dimen-
sional quantum field theory.
The quantum statistical mechanical partition function of this collection of oscillators
has a truly remarkable property, as we now explain:
Let q be a complex number with |q| < 1. Notice that:
1/∏_{j=1}^{∞}(1 − q^j) = (1 + q + q^2 + · · · )(1 + q^2 + q^4 + · · · )(1 + q^3 + q^6 + · · · ) · · ·
                       = 1 + ∑_{n=1}^{∞} p(n) q^n    (7.95)

Indeed, note that this is almost exactly the same as the physical partition function of our
system of oscillators!
Therefore, taking into account the full system of harmonic oscillators we have:
Z^{osc}(β) = Tr_{H_{all the h.o.s}} e^{−βH} = 1/( q^{1/24} ∏_{n=1}^{∞}(1 − q^n) ),    (7.96)
where we trace over the Hilbert space of states of our collection of oscillators. Here we
identify q = e−βω . Expanding out (7.95) gives the first few values of p(n):

1 + q + 2q 2 + 3q 3 + 5q 4 + 7q 5 + 11q 6 + 15q 7 + 22q 8 + 30q 9 +


(7.97)
+ 42q 10 + 56q 11 + 77q 12 + 101q 13 + 135q 14 + · · ·

and one can easily generate the first few hundred values using Maple or Mathematica or Sage
or ...
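For concreteness, here is a minimal Python sketch (just one of many ways to do this) that expands the product in (7.95) as a truncated power series and reproduces the coefficients listed in (7.97):

# Expand 1/prod_{j=1}^{N}(1 - q^j) to order q^N; multiplying by the geometric series
# 1/(1 - q^j) = 1 + q^j + q^{2j} + ... amounts to the update p[n] += p[n-j].
N = 14
p = [1] + [0] * N                  # coefficients of 1 + p(1) q + p(2) q^2 + ...
for j in range(1, N + 1):
    for n in range(j, N + 1):
        p[n] += p[n - j]
print(p[1:])   # [1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135]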
It turns out the generating series has a remarkable “modular transformation property”
relating Z(β) to Z(1/β):
β^{−1/4} Z^{osc}(β) = β̃^{−1/4} Z^{osc}(β̃)    (7.98)

β β̃ = (2π/ω)^2    (7.99)
This is a kind of high-low temperature duality.
The property (7.98) and (7.99) is proven in textbooks on analytic number theory. In
this context one often uses the variable
τ = i βω/(2π)    (7.100)

so that q = e−βω = e2πiτ . Although the physical motivation starts with β > 0, one
can analytically continue so that τ is in the upper half complex plane. (See below for
the physical interpretation of this continuation.) Analytic number theorists define the
Dedekind eta function:

η(τ) = exp(2πiτ/24) ∏_{n=1}^{∞} (1 − q^n)    (7.101)

and prove the crucial identity:

η(−1/τ ) = (−iτ )1/2 η(τ ). (7.102)

This equation is equivalent to the identity in equations (7.98) and (7.99).
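Although (7.102) requires proof, it is easy to test numerically. The following Python sketch (using a truncated product for η, which converges rapidly when Im τ is of order one) verifies the identity essentially to machine precision at an arbitrarily chosen point:

import cmath

def eta(tau, terms=400):
    # truncated Dedekind eta: q^{1/24} prod_{n=1}^{terms} (1 - q^n), q = exp(2 pi i tau)
    q = cmath.exp(2j * cmath.pi * tau)
    value = cmath.exp(2j * cmath.pi * tau / 24)
    for n in range(1, terms + 1):
        value *= 1 - q ** n
    return value

tau = 0.3 + 0.8j                        # any point in the upper half-plane
lhs = eta(-1 / tau)
rhs = cmath.sqrt(-1j * tau) * eta(tau)  # principal square root; Re(-i tau) > 0 here
print(abs(lhs - rhs) / abs(rhs))        # of order 1e-15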

An Interpretation From Physics

It turns out that (7.98) and (7.99), or equivalently, (7.102) has a beautiful interpreta-
tion from physics. The string is a circle with coordinate σ ∼ σ + 2π and the displacement
of the string X(σ, t) is a real number depending on position and time. The action is:
S = (1/(4πℓ_s^2)) ∫ dt ∫_0^{2π} dσ [ (∂_t X)^2 − (∂_σ X)^2 ]    (7.103)

where ℓ_s has units of length, or inverse mass, and we have temporarily set ħ = c = 1 by
choice of units. So T = ℓ_s^{−2} is the tension of the string. Then the general solution of the
classical equation of motion (the wave equation) (∂_t^2 − ∂_σ^2)X = 0 is:
 
X(t, σ) = X_0 + ℓ_s^2 p t + i (ℓ_s/√2) ∑_{n≠0} [ (α_n/n) e^{in(t+σ)} + (α̃_n/n) e^{in(t−σ)} ]    (7.104)

with complex numbers αn = (α−n )∗ and α̃n = (α̃−n )∗ and X0 , p are real.
We can think of this as a 1 + 1 dimensional field theory. Then spacetime is a cylinder
1
S × R with Lorentzian metric. The αn are the amplitudes of waves moving at the speed
of light to the left, while the α̃n are the amplitudes of waves moving at the speed of light
to the right.
The Hamiltonian computed from the action is
H = (1/(4πℓ_s^2)) ∫_0^{2π} dσ [ (∂_t X)^2 + (∂_σ X)^2 ]    (7.105)

and evaluating this on the general solution of the equation of motion gives:
H = (1/2) ℓ_s p^2 + (1/2) ℓ_s^{−1} ∑_{n≠0} ( α_{−n} α_n + α̃_{−n} α̃_n )    (7.106)

When quantizing this system we find that [X0 , p] = i~ and

[αn , αm ] = nδn+m,0 [α̃n , α̃m ] = nδn+m,0 [αn , α̃m ] = 0 (7.107)

If we represent the Heisenberg algebras with vacua so that α_n|0⟩ = α̃_n|0⟩ = 0 for n > 0
then we get standard harmonic oscillators by defining a_n = α_n/√n and a†_n = α_{−n}/√n for
n > 0, and similarly for the right-moving modes α̃_n. Therefore, the Hamiltonian becomes:

H = (1/2) ℓ_s p̂^2 + ℓ_s^{−1} [ ∑_{n=1}^{∞} n ( a†_n a_n + ã†_n ã_n ) − 2/24 ]    (7.108)

We thus recognize ω = ℓ_s^{−1} as our basic frequency.
Let us now compute the quantum statmech partition function for this string:

Z(β) := Tr e^{−βH} = (2πβω)^{−1/2} (Z^{osc}(β))^2    (7.109)

where the prefactor comes from the zeromode degrees of freedom (X_0, p) of the scalar field
and is obtained from the Gaussian integral

∫_{−∞}^{+∞} (ℓ_s dp/(2π)) e^{−(β/2) ℓ_s p^2} = ℓ_s/√(2πβℓ_s) = 1/√(2πβω)    (7.110)
We have written this for β real.
Moreover, it is a standard and fundamental result that for real β the partition function
T re−βH can be written as a path integral with periodic Euclidean time of period β. The
chain of logic is that the partition function is of the form
Tr e^{−βH} = ∑_{ψ_n} ⟨ψ_n| e^{−βH} |ψ_n⟩    (7.111)

where ψn is a basis of the space of states. But now hψ1 |e−βH |ψ2 i is an analytic contin-
uation of the transition amplitude hψ1 |e−itH |ψ2 i to imaginary time. On the other hand
hψ1 |e−itH |ψ2 i can be written as a path integral with initial and final conditions ψ2 , ψ1 ,
respectively. If we set ψ1 = ψ2 and sum over a complete basis then the domain of the path
integral becomes that of all field configurations on the circle of Euclidean time. For more
on this, see the classic book by Feynman and Hibbs.
If we apply the above principle to the case of our 1 + 1 dimensional QFT, the quantum
field X is already a map from the circle to the real line so altogether we have a path integral
on a torus S^1 × S^1 with metric

ds^2 = (dσ)^2 + (β/ℓ_s)^2 (dσ^2)^2 = (2π)^2 [ (dσ^1)^2 + (β/(2πℓ_s))^2 (dσ^2)^2 ]    (7.112)

where we have chosen a dimensionless coordinate σ 2 ∼ σ 2 +1 in the Euclidean time direction


and rescaled σ = 2πσ 1 so that σ 1 ∼ σ 1 + 1.
Now, in statistical physics it is natural to consider the analytic continuation of Z(β)
to the subset of the complex β-plane with positive real part. In this case, to interpret the
analytic continuation as a trace we should split the Hamiltonian into contributions of left-
and right-moving oscillators. We can do this by introducing separate Hamiltonians for the
left- and right-moving degrees of freedom:
H_L = (1/2)(H + P),    H_R = (1/2)(H − P)    (7.113)

where P is the total momentum of the field ∼ ∫_0^{2π} ∂_t X ∂_σ X. Note that H = H_L + H_R.
Explicitly, one finds:^{74}

H_L = (1/4) ℓ_s p̂^2 + ℓ_s^{−1} [ ∑_{n=1}^{∞} n a†_n a_n − 1/24 ]
H_R = (1/4) ℓ_s p̂^2 + ℓ_s^{−1} [ ∑_{n=1}^{∞} n ã†_n ã_n − 1/24 ]    (7.114)

We set

q = e^{2πiτ} = e^{−βω}  ⇔  τ = i βω/(2π).    (7.115)
Then the proper analytic continuation to consider is

Z(β) = Tr q^{H_L} q̄^{H_R} = Tr e^{−Re(βω)H} e^{−i Im(βω)P}    (7.116)

With this understood the partition function now becomes

Z(β) = (Imτ )−1/2 |Z osc (β)|2 (7.117)

One of the virtues of (7.116) is that it still has a nice interpretation in terms of a path
integral on a torus: After we propagate in Euclidean time by Re(βω) we shift in the σ
coordinate by Im(βω) before gluing, because P is the generator of translations in the σ
direction. The net result is that we can identify Z(β) with the path integral on a torus with
metric:
ds2 = (2π)2 |dσ 1 + τ dσ 2 |2 = (2π)2 |dz|2 (7.118)
We can identify the torus with our friend: C/(Z + τ Z) with a flat metric ds2 = |dz|2 where
τ is a complex number in the upper half-plane. One can easily (and rigorously) compute
that path integral and show that it is

Z(β) = (Imτ )−1/2 |η(τ )|−2 (7.119)

where

η(τ) = q^{1/24} ∏_{n=1}^{∞} (1 − q^n),    (7.120)

confirming the general principle we just enunciated.


On the other hand, the path integral is invariant under “large” diffeomorphisms of
the torus, that is, on diffeomorphisms which are not deformable to the identity. Such
diffeomorphisms will act nontrivially on π1 (T 2 ). If we take z = x + τ y with x, y real and
x, y identified modulo one, then we can make the diffeomorphism that rotates by 90 degrees
in the x, y plane. Note that this exchanges A- and B-cycles: We are exchanging the spatial
circle with the (Euclidean) time circle. The transformation also takes the torus to a torus
with τ → −1/τ and the flat metric rescaled by a constant factor: If ds2 = |dz|2 with
74
The splitting of the zeromode between left- and right-movers is a very subtle point we have elided here.
To do this properly one needs to work out the quantization of a self-dual field.

z = x + iy then the pull-back is ds^2 = |τ|^2 |dx′ + τ′ dy′|^2 with x′ = y and y′ = −x and
τ′ = −1/τ.
What about the overall factor of |τ |2 in front of the metric? But the massless scalar
field above has a beautiful property known as conformal invariance. Let us return to the
action and restore the 1 + 1 dimensional Minkowskian metric:
S = −(1/(4πℓ_s^2)) ∫_{S^1×R} d^2σ √|det η| η^{αβ} ∂_α X ∂_β X    (7.121)

We choose η_{tt} = −1. The generalization to an arbitrary metric h_{αβ} dσ^α dσ^β on the “world-
sheet” S 1 × R is clear:
S = −(1/(4πℓ_s^2)) ∫_{S^1×R} d^2σ √|det h| h^{αβ} ∂_α X ∂_β X    (7.122)

This is the standard minimal coupling in general relativity. But note that the action is
invariant under conformal transformations

hαβ → Ω2 hαβ (7.123)

One has to be very careful about the quantum theory: In this case the partition function
is not quite invariant but rather scales by an overall functional of Ω. But for a flat metric
and constant Ω, the overall scaling factor is just one.
We finally conclude that (Imτ )−1/2 |η(τ )|−2 is invariant under τ → −1/τ and since
η(τ ) is holomorphic one can deduce the very important result (7.102) above.
As an interesting application of (7.102), when combined with the method of stationary
phase, one can derive the Hardy-Ramanujan formula giving an asymptotic formula for large
values of n:

p(n) ∼ (1/√2) (1/24)^{3/4} n^{−1} exp( 2π √(n/6) )    (7.124)
Note that this grows much more slowly than the order of the group, n!. So we conclude
that some conjugacy classes must be very large! (See discussion in the next section on the
class equation if this is not obvious.)
Analogs of equation (7.124) for a class of functions known as modular forms play
an important role in modern discussions of the entropy of supersymmetric (and extreme)
black hole solutions of supergravity.

Exercise Deriving the Hardy-Ramanujan formula


The function Z osc (β) has a nice analytic continuation into the right half complex plane
where Re(β) > 0. Note that q^{1/24} Z^{osc}(β) is periodic under imaginary shifts β → β + 2πi/ω.
Write

p(n) = (ω/2πi) ∫_{β_0}^{β_0 + 2πi/ω} dβ e^{nβω} q^{1/24} Z^{osc}(β)    (7.125)

and use the above transformation formula, together with the stationary phase method to
derive (7.124). 75
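The leading exponential growth is already visible numerically. Reusing the truncated-product recursion sketched earlier for p(n), the ratio log p(n)/(2π√(n/6)) creeps toward 1 from below (slowly, because of the power-law prefactor in (7.124)); a short Python sketch:

from math import pi, sqrt, log

N = 2000
p = [1] + [0] * N
for j in range(1, N + 1):           # same partition-counting recursion as before
    for n in range(j, N + 1):
        p[n] += p[n - j]

for n in (100, 500, 1000, 2000):
    print(n, log(p[n]) / (2 * pi * sqrt(n / 6)))   # approaches 1 from below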

7.4.2 Conjugacy Classes In Sn And Partitions


Another way of thinking about partitions of n uses the general idea of partitions: In general,
a partition is a sequence of nonnegative integers {λ1 , λ2 , λ3 , . . . } so that
a.) λi are nonincreasing: λi ≥ λi+1 .
b.) The λi eventually become zero.
Given a partition, we define |λ| = ∑_i λ_i so that a partition of n can be written:

n = λ1 + λ2 + · · · + λk (7.126)

as a sum of positive integers with λ1 ≥ λ2 ≥ · · · ≥ λk . The nonzero λi are called the parts
of the partition. The above is a partition of n with k parts.


Figure 20: Young diagrams corresponding to the 5 different partitions of 4.

In general, to a partition λ we can associate a Young diagram. This is a diagram with


λ1 boxes in the first row, λ2 boxes in the second row and so forth. The boxes are arranged
to make, roughly speaking, an upside-down L-shape. See Figure 20 for some examples. We
will talk much more about these when discussing representations of the symmetric group
and representations of SU (n).
75 Answer: Write p(n) = ∫_{−1/2+iβ}^{1/2+iβ} e^{−2πinτ} q^{1/24} η(τ)^{−1} dτ where τ = x + iβ and the contour is along a
horizontal line. One argues that, as β → 0^+ the dominant terms in the integral come from the region near
x ≅ 0. (This is a rather subtle step to do correctly.) Then, using the modular transformation law one writes
η(τ)^{−1} = (−iτ)^{1/2} η(−1/τ)^{−1} and for Im(−1/τ) → ∞ one approximates η(−1/τ)^{−1} ≅ exp[2πi/(24τ)]. Now
one applies the standard stationary phase technique. When this procedure is carried out more systematically
one is led to the famous Rademacher expansion for coefficients of certain modular functions.

One way to associate a conjugacy class in Sn with a partition is as follows. We let

mi (λ) := |{j|λj = i}| (7.127)

denote the multiplicity of i in λ. In terms of the Young diagram it is the number of rows
with i boxes. If there are mi rows with i boxes then that accounts for i × mi boxes.
Summing over i gives n, the total number of boxes in the diagram. So we can associate
the conjugacy class:
(1)m1 (2)m2 · · · (7.128)
to a partition/Young diagram.
It is worth noting that there is another associated partition and conjugacy class known
as the conjugate partition λ′. It corresponds to the partition obtained by flipping on the
main diagonal: We exchange rows and columns. Let λ′_i be the number of boxes in the i-th
column. By the inverted L-shape we see that this is another partition

λ′_1 ≥ λ′_2 ≥ · · ·    (7.129)

with |λ′| = n. Note that λ′′ = λ.


Note that

λ′_i = |{j | λ_j ≥ i}|    (7.130)

For example, λ′_1 is the number of boxes in the first column. This is clearly the number of
rows, and to have a nontrivial row we must have λ_j ≥ 1. Now, to get a box in the second
column we must have rows with λ_j ≥ 2, and so on.
Therefore,

m_i(λ) = λ′_i − λ′_{i+1}    (7.131)

So in terms of λ′ our conjugacy class above becomes

(1)^{λ′_1 − λ′_2} (2)^{λ′_2 − λ′_3} · · · (n)^{λ′_n}    (7.132)

Note that the identity

λ′_1 + λ′_2 + · · · + λ′_n = (λ′_1 − λ′_2) + 2(λ′_2 − λ′_3) + 3(λ′_3 − λ′_4) + · · · + (n−1)(λ′_{n−1} − λ′_n) + n λ′_n    (7.133)

assures us the total number of boxes is still n.
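A small Python sketch of these operations (for one illustrative partition, chosen arbitrarily) may help fix the conventions: it computes the conjugate partition λ′, checks λ′′ = λ, and checks that the differences λ′_i − λ′_{i+1} reproduce the multiplicities m_i(λ):

def conjugate(lam):
    """lam: nonincreasing list of positive integers; return the conjugate partition."""
    return [sum(1 for part in lam if part >= i) for i in range(1, max(lam) + 1)]

lam = [4, 3, 1, 1]                         # a partition of 9
lam_c = conjugate(lam)                     # [4, 2, 2, 1]
assert conjugate(lam_c) == lam             # (lambda')' = lambda

mults = [sum(1 for part in lam if part == i) for i in range(1, len(lam_c) + 1)]
diffs = [lam_c[i] - (lam_c[i + 1] if i + 1 < len(lam_c) else 0)
         for i in range(len(lam_c))]
print(mults, diffs)                        # both equal [2, 0, 1, 1]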


Now, when n is large we can ask what the “typical” partition is. That is, what are
the “typical” conjugacy classes in Sn when n is large? This is an imprecise, and rather
subtle question. To get some sense of an answer it is useful to consider the number pk (n)
of partitions of n into precisely k parts (as in (7.126)). The generating function is
∑_{n=1}^{∞} p_k(n) x^n = ∏_{j=1}^{k} x/(1 − x^j)    (7.134)

One natural guess, then, is that the “typical” partition has k ≅ √n with “most of the
parts” on the order of √n. This naive picture can be considerably improved using the

statistical theory of partitions. 76 Without going into a lot of complicated asymptotic
formulae, the main upshot is that, for large n, as a function of k, pk (n) indeed is sharply
peaked with a maximum around

k̄(n) := (√6/(2π)) √n log n    (7.135)

See Figure 21 for a numerical illustration. Moreover, and again speaking very roughly, the
number of terms in the partition λ_j with λ_j = (√6/(2π)) √n is of order √(6n)/π.
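The statement can also be explored directly. A short Python sketch (using the recursion p_k(n) = p_{k−1}(n−1) + p_k(n−k), which is equivalent to (7.134)) reproduces the behavior shown in Figure 21 below: the peak of p_k(400) sits at k = 45, close to the Erdős–Lehner value ≈ 46.7:

from math import pi, sqrt, log

n_max, k_max = 400, 120
# p[k][n] = number of partitions of n into exactly k parts
p = [[0] * (n_max + 1) for _ in range(k_max + 1)]
p[0][0] = 1
for k in range(1, k_max + 1):
    for n in range(1, n_max + 1):
        p[k][n] = p[k - 1][n - 1] + (p[k][n - k] if n >= k else 0)

row = [p[k][n_max] for k in range(k_max + 1)]
print(row.index(max(row)), (sqrt(6) / (2 * pi)) * sqrt(n_max) * log(n_max))   # 45 vs 46.7...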

Remarks:

1. Recall that in our discussion of a string, or equivalently of a massless scalar field on
the circle, there are p(n) states in the energy eigenspace with energy E = (n − 1/24)ω.
Thus we can interpret the above result as a kind of equipartition theorem: The most
likely state is the one where the energy is shared equally by the different oscillators.

2. A Young tableau is a Young diagram with n boxes where the boxes have been filled
in with integers drawn from {1, . . . , n} so that no integer is repeated. Note that the
symmetric group S_n acts on Young tableaux. For a given tableau T we can define
two subgroups of the symmetric group: R(T ) are the permutations that only move
numbers around within the rows and C(T ) are the permutations that only move
numbers around within the columns. Young tableaux and these subgroups are used
in constructing the irreducible representations of the symmetric group.

Exercise Generating Function For pk (n)


Prove equation (7.134). 77

8. More About Group Actions And Orbits


♣NOTE BENE! THE MATERIAL IN THIS SECTION IS IDENTICAL TO SECTION 2 OF CHAPTER 3 ♣

In Section 4.1 above we introduced the notion of a group action on a set. In this section
we develop this important idea a bit further.
76
It is a large subject. See P Erdös and J. Lehner, “The distribution of the number of summands in the
partition of a positive integer,” Duke Math. Journal 8(1941)335-345 or M. Szalay and P. Turán, “On some
problems of the statistical theory of partitions with application to characters of the symmetric group. I,”
Acta Math. Acad. Scient. Hungaricae, Vol. 29 (1977), pp. 361-379.
77
Answer : A partition of n into exactly k-parts means that λk ≥ 1. So now write

n − k = (λ1 − λ2 ) + 2(λ2 − λ3 ) + · · · + (k − 1)(λk−1 − λk ) + k(λk − 1)

This is a partition of n − k as a sum of integers drawn from {1, . . . , k}. Enumerating those is clearly given
by ∏_{j=1}^{k} (1 − x^j)^{−1}.

Figure 21: Showing the distribution of p_k(n) as a function of k for n = 400 and 1 ≤ k ≤ 120. Note
that the Erdös-Lehner mean value of k, k̄ = (√6/(2π)) √400 log(400) ≅ 46.7153, is a very good approximation
to where the distribution has its sharp peak. The actual maximum is at k = 45.

8.1 Some Definitions And Terminology Associated With Group Actions

Let X be any set (possibly infinite). Recall the definition we gave in Section 4.1.
A permutation of X is a 1-1 and onto mapping X → X. The set SX of all permutations

forms a group under composition. A transformation group on X is a subgroup of SX .
Equivalently, a G-action on a set X is a map φ : G × X → X compatible with the
group multiplication law as follows:
A left-action satisfies:
φ(g1 , φ(g2 , x)) = φ(g1 g2 , x) (8.1)
A right-action satisfies
φ(g1 , φ(g2 , x)) = φ(g2 g1 , x) (8.2)
In addition in both cases we require that

φ(1G , x) = x (8.3)
for all x ∈ X.

Remarks:

1. If φ is a left-action then it is natural to write g · x for φ(g, x). In that case we have

g1 · (g2 · x) = (g1 g2 ) · x. (8.4)

Similarly, if φ is a right-action then it is better to use the notation φ(g, x) = x · g so


that
(x · g2 ) · g1 = x · (g2 g1 ). (8.5)

2. If φ is a left-action then φ̃(g, x) := φ(g −1 , x) is a right-action, and vice versa. Thus


there is no essential difference between a left- and right-action. However, in compu-
tations with nonabelian groups it is extremely important to be consistent and careful
about which choice one makes. Confusing left- and right- actions is a common source
of error.

3. A given set X can admit more than one action by the same group G. If one is working
simultaneously with several different G actions on the same set then the notation g · x
is ambiguous and one should write, for example, φg (x) = φ(g, x) or speak of φg , etc.
A good example of a set X with several natural G actions is the case of X = G
itself. Then there are the actions of left-multiplication, right-multiplication, and
conjugation. The action of g on the group element g 0 is:
L(g, g 0 ) = gg 0
L̃(g, g 0 ) = g −1 g 0
R(g, g 0 ) = g 0 g
(8.6)
R̃(g, g 0 ) = g 0 g −1
C(g, g 0 ) = g −1 g 0 g
C̃(g, g 0 ) = gg 0 g −1
where on the RHS of these equations we use group multiplication. The reader should
work out which actions are left actions and which actions are right actions.
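The bookkeeping can be checked mechanically, as in the following minimal Python sketch using S_3 (realized as tuples, with one particular composition convention chosen for concreteness). It tests the left-action rule φ(g_1, φ(g_2, x)) = φ(g_1 g_2, x) for each of the six maps in (8.6); the ones that fail it satisfy the right-action rule instead:

from itertools import permutations

def mult(a, b):
    """(a*b)(i) = a(b(i)): composition of permutations of {0, 1, 2}."""
    return tuple(a[b[i]] for i in range(len(b)))

def inv(a):
    out = [0] * len(a)
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

G = list(permutations(range(3)))
actions = {
    "L":      lambda g, x: mult(g, x),
    "Ltilde": lambda g, x: mult(inv(g), x),
    "R":      lambda g, x: mult(x, g),
    "Rtilde": lambda g, x: mult(x, inv(g)),
    "C":      lambda g, x: mult(inv(g), mult(x, g)),
    "Ctilde": lambda g, x: mult(g, mult(x, inv(g))),
}
for name, phi in actions.items():
    is_left = all(phi(g1, phi(g2, x)) == phi(mult(g1, g2), x)
                  for g1 in G for g2 in G for x in G)
    print(name, "left-action" if is_left else "right-action")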

There is some important terminology one should master when working with G-actions.
First here are some terms used when describing a G-action on a set X:

Definitions:

1. A group action is effective or faithful if for any g ≠ 1 there is some x such that
g · x ≠ x. Equivalently, the only g ∈ G such that φ_g is the identity transformation
is g = 1_G. A group action is ineffective if there is some g ∈ G with g ≠ 1 so that
g · x = x for all x ∈ X. The set of g ∈ G that act ineffectively is a normal subgroup
of G.

2. A group action is transitive if for any pair x, y ∈ X there is some g with y = g · x.

3. A group action is free if for every g ≠ 1 and every x we have g · x ≠ x.

In summary:

1. Effective: ∀g ≠ 1, ∃x s.t. g · x ≠ x.

2. Ineffective: ∃g ≠ 1, s.t. ∀x g · x = x.

3. Transitive: ∀x, y ∈ X, ∃g s.t. y = g · x.

4. Free: ∀g ≠ 1, ∀x, g · x ≠ x

In addition there are some further important definitions:

1. Given a point x ∈ X the set of group elements:

StabG (x) := {g ∈ G : g · x = x} (8.7)

is called the isotropy group at x. It is also called the stabilizer group of x. It is often
denoted Gx . The reader should show that Gx ⊂ G is in fact a subgroup. Note that a
group action is free iff for every x ∈ X the stabilizer group Gx is the trivial subgroup
{1G }.

2. A point x ∈ X is a fixed point of the G-action if there exists some element g ∈ G


with g ≠ 1 such that g · x = x. So, a point x ∈ X is a fixed point of G iff Stab_G(x)
is not the trivial group. Some caution is needed here because if an author says “x is
a fixed point of G” the author might mean that StabG (x) = G. That would not be
implied by our terminology.

3. Given a group element g ∈ G the fixed point set of g is the set

FixX (g) := {x ∈ X : g · x = x} (8.8)

The fixed point set of g is often denoted by X^g. Note that if the group action is free
then for every g ≠ 1 the set Fix_X(g) is the empty set.

4. We repeat the definition from section **** above. The orbit of G through a point x
is the set of points y ∈ X which can be reached by the action of G:

OG (x) = {y : ∃g such that y = g · x} (8.9)

Remarks:

1. If we have a G-action on X then we can define an equivalence relation on X by defining


x ∼ y if there is a g ∈ G such that y = g · x. (Check this is an equivalence relation!)
The orbits of G are then exactly the equivalence classes under this equivalence
relation. Therefore, X is partitioned into a disjoint union of all the G-orbits.

2. The group action restricts to a transitive group action on any orbit.

3. If x, y are in the same orbit then the isotropy groups Gx and Gy are conjugate
subgroups in G. Therefore, to a given orbit, we can assign a definite conjugacy class
of subgroups.

Point 3 above motivates the

Definition If G acts on X a stratum is a set of G-orbits such that the conjugacy class of
the stabilizer groups is the same. The set of strata is sometimes denoted X ∥ G.

Exercise Group actions of G on G


Referring to equation (8.6), which actions are left-actions and which actions are right-
actions? 78

Exercise
Recall that a group action of G on X can be viewed as a homomorphism φ : G → SX .
Show that the action is effective iff the homomorphism is injective.

Exercise
Suppose X is a G-set.
a.) Show that the subset H of elements which act ineffectively, i.e. the set of h ∈ G
such that φ(h, x) = x for all x ∈ X is a normal subgroup of G.
78
Answer : L, R̃, C̃ are left-actions, while L̃, R, C are right-actions.

b.) Show that the group G/H acts effectively on X.

Exercise
Let G act on a set X.
a.) Show that the stabilizer group at x, denoted Gx above, is in fact, a subgroup of G.
b.) Show that the G action is free iff the stabilizer group at every x ∈ X is the trivial
subgroup {1G }.
c.) Suppose that y = g · x. Show that Gy and Gx are conjugate subgroups in G. 79

Exercise Derangements
A permutation in Sn which acts on {1, . . . , n} without fixed points is called a derange-
ment. Show that the number of derangements in Sn is given by
D_n = n! ∑_{k=0}^{n} (−1)^k / k!    (8.10)
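For small n this is easy to confirm by brute force; a minimal Python sketch:

from itertools import permutations
from math import factorial

def D(n):   # n! * sum_{k=0}^{n} (-1)^k / k!, in exact integer arithmetic
    return sum((-1) ** k * factorial(n) // factorial(k) for k in range(n + 1))

for n in range(1, 8):
    brute = sum(1 for p in permutations(range(n)) if all(p[i] != i for i in range(n)))
    assert brute == D(n)
print([D(n) for n in range(1, 8)])   # [0, 1, 2, 9, 44, 265, 1854]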

8.2 The Stabilizer-Orbit Theorem


There is a beautiful relation between orbits and isotropy groups:

Theorem [Stabilizer-Orbit Theorem]: Each left-coset of Gx in G is in 1-1 correspondence


with the points in the G-orbit of x:

ψ : OrbG (x) → G/Gx (8.11)

for a 1 − 1 map ψ.

Proof : Suppose y is in a G-orbit of x. Then ∃g such that y = g · x. Define ψ(y) ≡ g · Gx .


You need to check that ψ is actually well-defined.

y = g′ · x  →  ∃h ∈ G_x,  g′ = g · h  →  g′G_x = ghG_x = gG_x    (8.12)

Conversely, given a coset g · Gx we may define

ψ −1 (gGx ) ≡ g · x (8.13)

Again, we must check that this is well-defined. Since it inverts ψ, ψ is 1-1. ♠


79 Answer: If y = g_0 · x and g · x = x then (g_0 g g_0^{−1}) · y = y so G_y = g_0 G_x g_0^{−1}.

Corollary: If G acts transitively on a set X then the isotropy groups Gx for all the
points x ∈ X are conjugate subgroups of G, and for any x ∈ X, there is a 1 − 1
correspondence between X and the set of cosets G/Gx . If H is any one of these
isotropy groups we can therefore identify X with the set of left-cosets G/H.

Remark: Sets of the type G/H are called homogeneous spaces. This theorem is the
beginning of an important connection between the algebraic notions of subgroups and cosets
to the geometric notions of orbits and fixed points. Below we will show that if G, H are
topological groups then, in some cases, G/H are beautifully symmetric topological spaces,
and if G, H are Lie groups then, in some cases, G/H are beautifully symmetric manifolds.

Exercise Orbits Of Zp For Prime p


Let p be a prime and suppose the cyclic group Z_p acts on a space X. Show that any
orbit consists of a single point, or of p points.

Exercise The Lemma that is not Burnside’s


Suppose a finite group G acts on a finite set X as a transformation group. A common
notation for the set of points fixed by g is X g . Show that the number of distinct orbits is
the averaged number of fixed points:
|{orbits}| = (1/|G|) ∑_{g∈G} |X^g|    (8.14)

For the answer see. 80
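A standard illustration (chosen here for concreteness) is the group Z_4 of rotations acting on 2-colorings of the vertices of a square; a short Python sketch confirms that the averaged fixed-point count equals the number of orbits (the six distinct “necklaces”):

from itertools import product

def rotate(coloring, r):
    n = len(coloring)
    return tuple(coloring[(i - r) % n] for i in range(n))

X = list(product([0, 1], repeat=4))
group = range(4)                       # rotations by r = 0, 1, 2, 3 positions

# left-hand side of (8.14): direct count of orbits
orbits = {frozenset(rotate(x, r) for r in group) for x in X}

# right-hand side: average number of fixed points
avg_fixed = sum(sum(1 for x in X if rotate(x, r) == x) for r in group) / len(group)

print(len(orbits), avg_fixed)          # 6 and 6.0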

Exercise Jordan’s theorem


Suppose G is finite and acts transitively on a finite set X with more than one point.
Show that there is an element g ∈ G with no fixed points on X. 81

80 Answer: Write

∑_{g∈G} |X^g| = |{(x, g) | g · x = x}| = ∑_{x∈X} |G_x|    (8.15)

Now use the stabilizer-orbit theorem to write |G_x| = |G|/|O_G(x)|. Now in the sum

∑_{x∈X} 1/|O_G(x)|    (8.16)

the contribution of each distinct orbit is exactly 1.


81
Hint: Note that X = G/H for some H and apply the Burnside lemma.

Figure 22: Transitive action of SO(3, R) on the sphere.

Figure 23: Orbits of SO(2, R) on the two sphere.

Figure 24: Notice not all orbits have the same dimensionality. There are two qualitatively different
kinds of orbits of SO(2, R).

8.3 Examples Of Orbits


The concept of a G-action on a set is an extremely important concept, so let us consider a
number of examples:

Examples

1. Let G be any group and consider the group action defined by φ(g, x) = x for all
g ∈ G. This is as ineffective as a group action can be: For every x, the isotropy group
is all of G, and for all g ∈ G, Fix(g) = X. In particular, this situation will arise if
X consists of a single point. This example is not quite as stupid as it might at first
appear, once one takes the categorical viewpoint, for pt// G is a very rich category
indeed.

2. Let X = {1, · · · , n}, so S_X = S_n as before. The action is effective and transitive, but
not free. Indeed, the stabilizer of any j ∈ X consists of just the permutations that permute
everything else, and hence is isomorphic to S_{n−1}. Note that different j have different stabilizer
subgroups isomorphic to S_{n−1}, but they are all conjugate.

3. GL(n, R) acts on Rn by matrix multiplication. If we act with a matrix on a column


vector we get a left action. If we act on a row vector we get a right action. The
action is:
a.) Effective: If g ≠ 1 some vector x⃗ is moved.
b.) Not transitive: If x⃗ ≠ 0 it cannot be mapped to 0
c.) Not free: 0⃗ is a fixed point of the entire group.
d.) There are two orbits.
e.) The isotropy group of the vector e1 is (under the left-action) the subgroup of
matrices of the form

Stab_{GL(n,R)}(e_1) = { \begin{pmatrix} 1 & v \\ 0 & B \end{pmatrix} | v ∈ Mat_{1×(n−1)}(R), B ∈ GL(n − 1, R) }    (8.17)

The stabilizer group for all other nonzero vectors will be conjugate to this one. The
stabilizer group of the origin is the entire group GL(n, R).

4. If we restrict from GL(n, R) to SO(n, R) the picture changes completely. For sim-
plicity consider the case n = 2. The left-action is:

R(φ) : \begin{pmatrix} x^1 \\ x^2 \end{pmatrix} ↦ \begin{pmatrix} \cos φ & \sin φ \\ −\sin φ & \cos φ \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \end{pmatrix}    (8.18)

The group action is effective. It is not free, and it is not transitive. There are now
infinitely many orbits of SO(2), and they are all distinguished by the invariant value
of x2 +y 2 on the orbit. From the viewpoint of topology, there are two distinct “kinds”
of orbits acting on R2 . One has trivial isotropy group and one has isotropy group
SO(2). See Figure 24. These give two strata.

5. Orbits of O(2). We have seen that O(2) can be written as a disjoint union:

O(2) = SO(2) ⊔ P · SO(2)    (8.19)

where P is not canonical and can be taken to be reflection in any line through the
origin. The orbits of SO(2) and O(2) are the same. We will find a very different
picture when we consider the orbits of the Lorentz group.

6. Now consider a fixed SO(2, R) subgroup of SO(3, R), say, the subgroup defined by
rotations around the z-axis, and consider the action of this group on a sphere in R3
of fixed radius. The action is not transitive. The G-orbits are shown in Figure
23. It is also not free: The north and south poles are fixed points.

7. Now consider the action of SO(3, R) on a sphere of positive fixed radius in R3 .
(WLOG take it to be of radius one.) The action is then transitive on the sphere.
Now the isotropy subgroup StabSO(3) (n̂) ⊂ SO(3) of any unit vector n̂ ∈ S 2 is
isomorphic to SO(2):
Stab_{SO(3)}(n̂) ≅ SO(2)    (8.20)

But, for different choices of n̂ we get different subgroups of SO(3). For example, with
usual conventions, if n̂ = e3 is on the x3 -axis then the subgroup is the subgroup of
matrices of the form  
cos φ sin φ 0
R12 (φ) = − sin φ cos φ 0 (8.21)
 
0 0 1

but if n̂ is on the x^1-axis the subgroup is the subgroup of matrices of the form

R_{23}(φ) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos φ & \sin φ \\ 0 & −\sin φ & \cos φ \end{pmatrix}    (8.22)

and so on. For any n̂ ∈ S 2 let SO(2)n̂ ⊂ SO(3) denote the subgroup, isomorphic
to SO(2), which stabilizes n̂. According to the stabilizer-orbit theorem there is a
natural one-one correspondence

S^2 ≅ SO(3)/SO(2)_{n̂}    (8.23)

Therefore, fixing any n̂ ∈ S^2 there is a map

πn̂ : SO(3, R) → S 2 (8.24)

Put simply, π(R) rotates n̂ ∈ S 2 to R · n̂ ∈ S 2 :

πn̂ (R) := R · n̂ ∈ S 2 (8.25)

Therefore, the inverse image of any other vector k̂ ∈ S 2 :

πn̂−1 (k̂) := {R|Rn̂ = k̂} ⊂ SO(3) (8.26)

is the set of rotations which can be (noncanonically!) put in 1-1 correspondence with
elements of SO(2). That is because if k̂ = R_1 n̂ and k̂ = R_2 n̂ then R_1^{−1} R_2 n̂ = n̂ and
therefore R_2 = R_1 R_0 where R_0 ∈ Stab_{SO(3)}(n̂) ≅ SO(2).
So, for each point k̂ ∈ S 2 we can associate a copy of SO(2) inside SO(3), which is
topologically a circle, and clearly every element of SO(3) will be captured this way
as k̂ ranges over S 2 . One might think that this means that, as manifolds, SO(3) is
diffeomorphic to S 2 × SO(2) = S 2 × S 1 , but this turns out to be quite false. For
example the homotopy groups of SO(3) and S 2 × S 1 are completely different.

Nevertheless, we can try to parametrize the general rotation by using this idea: We
choose n̂ = e3 to be the basepoint. Then the standard polar angles of a point on the
sphere are defined by

R_{12}(φ) R_{23}(θ) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \sin θ \sin φ \\ \sin θ \cos φ \\ \cos θ \end{pmatrix}    (8.27)

But this does NOT mean every rotation matrix is of the form R12 (φ)R23 (θ) ! It only
gives us a parametrization of the cosets SO(3)/SO(2)12 . The general element can be
written as
R = R12 (φ)R23 (θ)R12 (ψ) (8.28)
with range φ ∼ φ + 2π, ψ ∼ ψ + 2π and 0 ≤ θ ≤ π. These are the famous Euler
angles. But they are not global coordinates. They can't be: (θ, φ) are not global
coordinates on S^2 (they go bad at the north and south poles) and S^2 × S^1 is not the
same manifold as SO(3).

8. GL(2, C) And SU (2) Act On CP1 . Recall that CP1 can be identified with equivalence
classes of points (z_1, z_2) ∈ C^2 − {0} with equivalence relation (z_1, z_2) ∼ (λz_1, λz_2).
We denote equivalence classes by [z1 : z2 ].
Note that CP1 can also be thought of as the space of states of a single Qbit: [z1 : z2 ]
always has a representative with |z1 |2 + |z2 |2 = 1 and the representative is unique up
to multiplication by a phase. We can use such a normalized representative to define
a Qbit state:

ψ = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}    (8.29)
♣THIS REMARK BELONGS EARLIER, WHEN WE FIRST INTRODUCED CP^n. ♣

There is a well-defined action of GL(2; C) on CP^1:

\begin{pmatrix} a & b \\ c & d \end{pmatrix} : [z_1 : z_2] ↦ [az_1 + bz_2 : cz_1 + dz_2]    (8.30)

(The reader should carefully check that this is a well-defined group action.) Since the
GL(2, C) action on C2 − {0} is transitive, the action on CP1 is transitive. Therefore
choosing a point p ∈ CP1 we have an identification of CP1 as a homogeneous space:

GL(2, C)/B ∼
= CP1 (8.31)

For example, if we take B to be the stabilizer of [1 : 0] we compute

B = { \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} | a, d ∈ C^∗, b ∈ C }    (8.32)

Note that by the equivalence [z1 : z2 ] = [λz1 : λz2 ] we can always find a representative
so that (z1 , z2 ) is a unit vector in the Hilbert space C2 . Therefore, the restriction of

the GL(2, C) action on CP1 to SU (2) is still transitive. Now the stabilizer of [1 : 0]
is the subgroup of diagonal SU (2) matrices and is isomorphic to U (1). Therefore,
there is also an identification

CP^1 ≅ SU(2)/U(1)    (8.33)

and hence there is a natural map

π : SU (2) → CP1 (8.34)

defined by this identification. By stereographic projection we can identify CP1 with


the Riemann sphere. and, hence since S 3 ∼ = SU (2) we have a natural continuous ♣Explain more ♣

map:
π : S3 → S2 (8.35)
whose fibers are copies of S 1 . This is a famous map in mathematics and physics
known as the Hopf map and has many beautiful properties. It appears in the physics
of magnetic monopoles and in several other related contexts. It is very closely related
to the map π : SO(3) → S 2 defined above.
Another way of thinking about CP1 is that it is the space of lines through the origin
in C2 . This leads us to our next example:

9. Grassmannians. A very nice application of the Stabilizer-Orbit theorem is to the


Grassmannian of a vector space. Consider a finite dimensional vector space V , say
of dimension n and let 0 < k < n be an integer and define Grk (V ) to be the set
of all k-dimensional linear subspaces of V . It is not hard to see that GL(V ) acts
transitively on this space: If W ⊂ V is a k-dimensional subspace and T ∈ GL(V )
then T (W ) = {T (w)|w ∈ W } is a k-dimensional subspace. It is an easy fact of linear
algebra that for any two k-dimensional subspaces W1 , W2 ⊂ V there is a T with
T (W1 ) = W2 , i.e. the action is transitive. To compute the stabilizer of some vector
space W0 choose an ordered basis v1 , . . . , vk for W0 and any complementary basis for
V so that
v1 , . . . , vk , u1 , . . . , un−k (8.36)
is an ordered basis for V . Now, what is the subgroup of T so that T (W0 ) = W0 ? By
definition:

T(v_i) = A^j{}_i v_j + C^α{}_i u_α
T(u_α) = B^j{}_α v_j + D^β{}_α u_β    (8.37)

The condition T (W0 ) = W0 is then the condition that C = 0 for the matrix of T
relative to such a basis. So the stabilizer group is isomorphic to the subgroup P of
GL(n, κ) of matrices of the form:
!
AB
∈ GL(n, κ) (8.38)
0 D

In fact, the Grassmannian is a manifold and the representation in terms of homoge-
neous coordinates helps us to find local coordinates. Suppose that g ∈ GL(n, κ) is
“not too far” from the identity matrix then we will try to find a representative g0
in the coset gP so that if g_0 ≠ g_0′ then they must be in different cosets. The idea is
that we view right-multiplication as a “gauge freedom” and try to “fix the gauge” by
choosing g0 to be of the form:
g_0 = \begin{pmatrix} 1_{k×k} & 0 \\ γ_{(n−k)×k} & 1_{(n−k)×(n−k)} \end{pmatrix}    (8.39)

for some matrix γ_{(n−k)×k} ∈ Mat_{(n−k)×k}(κ). Note that if we right multiply by an
element of P and ask that

\begin{pmatrix} 1 & 0 \\ γ & 1 \end{pmatrix} \begin{pmatrix} A & B \\ 0 & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ γ′ & 1 \end{pmatrix}    (8.40)

Then B = 0 and A = 1 and D = 1 and hence γ′ = γ.


“Not too far” means more precisely: that if we write g in block diagonal form

g = \begin{pmatrix} α & β \\ γ & δ \end{pmatrix}    (8.41)

then α and δ are invertible and α − βδ −1 γ is also invertible. These are the conditions
so that we can solve the equation

g \begin{pmatrix} A & B \\ 0 & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ γ̃ & 1 \end{pmatrix}    (8.42)

for some γ̃ ∈ Mat_{(n−k)×k}(κ). Indeed, one finds that we must set: A = α^{−1}, D =
δ^{−1}(1 − γB) and B = −(α − βδ^{−1}γ)^{−1} βδ^{−1}.
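These formulas are easy to test numerically. In the simplest nontrivial case k = 1, n = 2 all blocks are numbers, and the following Python sketch (with an arbitrarily chosen g that is “not too far” from the identity in the stated sense) confirms that g · (A B; 0 D) indeed takes the gauge-fixed form (8.39):

# scalar blocks: alpha, delta and alpha - beta*delta^{-1}*gamma must all be invertible
alpha, beta, gamma, delta = 2.0, 1.0, 3.0, 5.0
A = 1 / alpha
B = -beta / (delta * (alpha - beta * gamma / delta))
D = (1 - gamma * B) / delta
g = [[alpha, beta], [gamma, delta]]
h = [[A, B], [0.0, D]]
prod = [[sum(g[i][m] * h[m][j] for m in range(2)) for j in range(2)] for i in range(2)]
print(prod)    # [[1.0, 0.0], [gtilde, 1.0]] with gtilde = gamma/alpha = 1.5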

Exercise Z2 Actions On The Sphere


Consider the action of Z2 on the sphere defined by (4.7):

σ · (x1 , . . . , xn+1 ) = (x1 , . . . , xp , −xp+1 , · · · , −xp+q ) (8.43)

a.) For which values of p, q is the action effective?


b.) For which values of p, q is the action transitive?
c.) Compute the fixed point set of the nontrivial element σ ∈ Z2 .
d.) For which values of p, q is the action free?

Exercise C∗ Actions On CPn−1
Consider the action of G = C∗ on CPn−1 defined by (4.8).
a.) For which values of (q1 , . . . , qn ) is the action effective?
b.) For which values of (q1 , . . . , qn ) is the action transitive?
c.) What are the fixed points of the C∗ action?
d.) What are the stabilizers at the fixed points of the C∗ action?

Exercise SL(2, R) Action On The Upper Half-Plane


a.) Show that (4.10) above defines a left-action of SL(2, R) on the complex upper
half-plane. 82
b.) Is the action effective?
c.) Is the action transitive?
d.) Which group elements have fixed points?
e.) What is the isotropy group of τ = i ? 83
Conclude that
H ≅ SL(2, R)/SO(2)    (8.44)

Exercise
Using GL(2, C)/B ≅ CP^1 show that GL(2, C) has a natural action on the Riemann
sphere given by

z ↦ (az + b)/(cz + d)    (8.45)

Exercise
Since there is a left-action of G × G on X = G there is a left-action of the diagonal
subgroup ∆ ⊂ G × G where ∆ = {(g, g)|g ∈ G} is a subgroup isomorphic to G.
a.) Show that this action is given by a 7→ I(a), where I(a) is the conjugation by a.
b.) Show that the orbits of ∆ are the conjugacy classes of G.
c.) What is the stabilizer subgroup of an element g0 ∈ G?

82 Hint: Show that Im(g · τ) = Im τ / |cτ + d|^2.
83 The isotropy group is the subgroup SO(2, R) ⊂ SL(2, R). To see this set (ai + b)/(ci + d) = i and conclude that
a = d and b = −c. Then since ad − bc = 1 we have a^2 + b^2 = 1 but this implies that the group element is
in SO(2, R).

Exercise Spheres As Homogeneous Spaces
a.) Show that there is a transitive action of SO(n + 1) on S n , considered as a sphere
of fixed radius in Rn+1 .
b.) Show that S^n ≅ SO(n + 1)/SO(n).
c.) Give an inductive proof that SO(n) is a connected manifold for n ≥ 2.

Figure 25: The distinct kinds of orbits of SO(1, 1, R) are shown in different colors. If we enlarge
the group to include transformations that reverse the orientation of time and/or space then orbits
of the larger group will be made out of these orbits by reflection in the space or time axis.

8.3.1 Extended Example: The Case Of 1 + 1 Dimensions


Consider 1+1-dimensional Minkowski space with coordinates x = (x0 , x1 ) and metric given
by

η := \begin{pmatrix} −1 & 0 \\ 0 & 1 \end{pmatrix}    (8.46)

i.e. the quadratic form is (x, x) = −(x0 )2 + (x1 )2 . The two-dimensional Lorentz group is
defined by
O(1, 1) = {A|Atr ηA = η} (8.47)

This group acts on M1,1 preserving the Minkowski metric.


The connected component of the identity is the group of Lorentz boosts of rapidity θ:

x0 → cosh θ x0 + sinh θ x1 (8.48)

x1 → sinh θ x0 + cosh θ x1 (8.49)

that is:

SO_0(1, 1; R) ≡ { B(θ) = \begin{pmatrix} \cosh θ & \sinh θ \\ \sinh θ & \cosh θ \end{pmatrix} | −∞ < θ < ∞ }    (8.50)

In the notation the S indicates we look at the determinant one subgroup and the subscript
0 means we look at the connected component of 1. This is a group since

B(θ1 )B(θ2 ) = B(θ1 + θ2 ) (8.51)

so SO_0(1, 1) ≅ R as groups. Indeed, note that

B(θ) = exp[ θ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} ]    (8.52)
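A quick numerical check of (8.51), and of the invariance of the quadratic form, in Python:

from math import cosh, sinh

def B(theta):
    return [[cosh(theta), sinh(theta)], [sinh(theta), cosh(theta)]]

def mat_mult(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

t1, t2 = 0.7, -1.3
lhs, rhs = mat_mult(B(t1), B(t2)), B(t1 + t2)
print(max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2)))   # ~1e-15

# invariance of -(x0)^2 + (x1)^2 under a boost
x0, x1 = 2.0, 0.5
y0 = cosh(t1) * x0 + sinh(t1) * x1
y1 = sinh(t1) * x0 + cosh(t1) * x1
print(-(x0 ** 2) + x1 ** 2, -(y0 ** 2) + y1 ** 2)   # equal up to rounding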

It is often useful to define light cone coordinates: 84

x± := x0 ± x1 (8.53)

and the group action in these coordinates is simply:

x± → e±θ x± (8.54)

so it is obvious that x+ x− = −(x, x) is invariant.


It follows that the orbits of the Lorentz group are, in general, hyperbolas. They are
separated by different values of the Lorentz invariant x+ x− = λ, but this is not a complete
invariant, since the sign (or vanishing) of x+ and of x− is also Lorentz invariant. For a real
number r define

sign(r) := +1 if r > 0,  0 if r = 0,  −1 if r < 0    (8.55)

Then (λ, sign(x+ ), sign(x− )) is a complete invariant of the orbits. That is, given this triple
of data there is a unique orbit with these properties.
It is now easy to see what the different types of orbits are. They are shown in
Figure 25: They are: ♣Actually, the lightrays and hyperbolas have trivial stabilizer and hence are in the same strata. This is a problem with using strata. ♣

1. hyperbolas in the forward/backward lightcone and the left/right of the lightcone

2. 4 disjoint lightrays.

3. the origin: x+ = x− = 0.
84 Some authors will define these with a 1/2 or 1/√2. One should exercise care with this choice of
convention.

It is now interesting to consider the orbits of the full Lorentz group O(1, 1) and its
relation to the massless wave equations. But there are clearly elements of O(1, 1) not
continuously connected to the identity such as:
P = \begin{pmatrix} 1 & 0 \\ 0 & −1 \end{pmatrix}    T = \begin{pmatrix} −1 & 0 \\ 0 & 1 \end{pmatrix}    (8.56)

In general we can write (noncanonically),

O(1, 1) = SO_0(1, 1) ⊔ P · SO_0(1, 1) ⊔ T · SO_0(1, 1) ⊔ PT · SO_0(1, 1)    (8.57)

The P and T operations map various orbits of SO_0(1, 1) into each other: P is a
reflection in the time axis, i.e., a reflection of the spatial coordinate, while T is a reflection
in the space axis, i.e. a reflection of the time coordinate. Thus the orbits of the groups
SO(1, 1), SO_0(1, 1) ⊔ PT · SO_0(1, 1), and O(1, 1) all differ slightly from each other. ♣Should give more details here, or form an exercise. ♣

As an example of a physical manifestation of orbits let us consider the energy-momentum
dispersion relation of a particle of mass m with energy-momentum (E, p) ∈ R^{1,1}.

1. Massive particles: m2 > 0 have (E, p) along an orbit in the upper quadrant:

O+ (m) = {(m cosh θ, m sinh θ)|θ ∈ R} (8.58)

2. Massless particles move at the speed of light. In 1+1 dimensions there is an interesting
refinement of the massless orbits: Left-moving particles with positive energy have
support on^{85} p_+ = (1/2)(E + p) = 0 and p_− = (1/2)(E − p) ≠ 0. Right-moving particles
with positive energy have support on p_− = 0 and p_+ ≠ 0. In d + 1 dimensions
with d > 1 the orbits of SO_0(1, d) consisting of the forward and backward lightcones
(minus the origin) are connected.

3. Tachyons have E^2 − p^2 = m^2 < 0 and have their support on the left or right quadrant.
If we try to expand a solution to the wave-equation with e^{i(k_0 x^0 + k_1 x^1)} then k_0^2 =
k_1^2 + m^2 and so if the spatial momentum k_1 is sufficiently small then k_0 is pure
imaginary and the wave grows exponentially, signaling an instability. This tells us
our theory is out of control and some important new physical input is needed.

4. A massless “particle” of zero energy and momentum.

Exercise Components Of The 1 + 1-Dimensional Lorentz Group


a.) Prove equation (8.57). 86
85
Note the factors of two, so that x^0 p_0 + x^1 p_1 = x^+ p_+ + x^− p_−. This is an example of the tricky factors of
two one encounters when working with light-cone coordinates.
86 Answer: The general matrix

\begin{pmatrix} a & b \\ c & d \end{pmatrix}    (8.59)

b.) Show that the group of components of the Lorentz group is O(1, 1)/SO_0(1, 1) ≅ Z_2 × Z_2

Figure 26: Illustrating orbits of the connected component of the identity in O(1, 3). In (a) the top
and bottom hyperboloids are separate orbits, and if we include time-reversing transformations the
orbits are unions of the two hyperboloids. In (b) there are three orbits shown with x0 > 0 x0 < 0
(the future and past, or forward and backward light cones), and the orbit consisting of the single
point. In (c), once x2 has been specified, there is just one orbit, for d > 2.

8.3.2 Higher Dimensional Light Cones


We define d-dimensional Minkowski space M1,d−1 with d > 2 to be the vector space Rd

is in O(1, 1) iff

a^2 − c^2 = 1,   d^2 − b^2 = 1,   ab = cd    (8.60)

The most general solution of the first two equations is

a = κ_1 cosh θ,   c = sinh θ,   d = κ_2 cosh θ′,   b = sinh θ′    (8.61)

where κ_i ∈ {±1} and θ, θ′ ∈ R. Now impose the third equation. The solutions split into two cases: If
κ_1/κ_2 = 1 then θ = θ′. This gives two components. If κ_1/κ_2 = −1 then θ = −θ′, giving the other two
components.

with quadratic form
η = Diag{−1, +1d−1 } (8.62)
and
O(1, d − 1) = {A|Atr ηA = η} (8.63)
The nature of the orbits is slightly different because the zero-dimensional sphere S^0
is disconnected while the higher dimensional spheres are connected.

1. For λ2 > 0 we can define

O+ (λ) = {x|(x0 )2 − (~x)2 = λ2 & sign(x0 ) = sign(λ)} (8.64)

By the stabilizer-orbit theorem we can identify this with

SO0 (1, d − 1)/SO(d − 1) (8.65)

by considering the isotropy group at (x0 = λ, ~x = 0). See Figure 26(a).

2. For µ^2 > 0 we can define

O_−(µ^2) = {x | (x^0)^2 − (x⃗)^2 = −µ^2}    (8.66)

By the stabilizer-orbit theorem we can identify this with

SO0 (1, d − 1)/SO0 (1, d − 2) (8.67)

by considering the isotropy group at x = (x0 = 0, x1 = 0, . . . , xd−2 = 0, xd−1 = µ).


The sign of µ does not distinguish different orbits for d > 2 because the sphere S d−2
is connected. See Figure 26(c).

3.
O± = {x|x2 = 0 & sign(x0 ) = ±1} (8.68)
Vectors in this orbit are of the form (x0 , |x0 |n̂) where n̂ ∈ S d−2 ⊂ Rd−1 and the sign
of x^0 is invariant under the action of the identity component of O(1, d − 1). (Show this!).
Note that, for d = 2 the sphere S 0 has two disconnected components, leading to
left- and right-movers. But for d > 2 there is only one component. We can think of
n̂ ∈ S d−2 as parametrizing the directions of light-rays. That is, the point where the
light ray hits the celestial sphere. In one spatial dimension, a light ray either moves
left or right, and this is a Lorentz-invariant concept. In d − 1 > 1 spatial dimensions,
we can rotate any direction of light ray into any other. See Figure 26(b). One can
show that these orbits too are homogeneous spaces: 87

O_± ≅ SO_0(1, d − 1)/I    (8.69)
87
The isotropy group of a light ray is I ∼ = ISO(d − 2), where ISO(d − 2) is the Euclidean group
on Rd−2 . The easiest way to show this is to use the Lie algebra of so(1, d − 1) and work with light-cone
coordinates. Choosing a direction of the light ray along the xd−1 axis and introducing light-cone coordinates
x± := x0 ± xd−1 , and transverse coordinates xi , i = 1, . . . , d − 2 if the lightray satisfies x− = 0 then we
have unbroken generators M +i and M ij .

4. The final orbit is of course {x = 0}.

Remarks

1. Discrete symmetries in nature. The higher dimensional Lorentz groups also have four
connected components. The Lorentz group in d spacetime dimensions is O(d − 1, 1).
In QFT it is clear that if a theory is invariant under infinitesimal Lorentz symmetries,
then it is invariant under the connected component of the identity SO0 (d − 1, 1).
However, it turns out that such theories can nevertheless not be invariant under the
disconnected components. For a long time it was assumed that parity is a symmetry
of all the laws of physics. This means that, if you watch a video of a physical process,
then you cannot tell whether you are looking at that process in a mirror, or not.
However, in 1956 T.D. Lee and C.N. Yang carefully reviewed the evidence for parity
conservation in nature and pointed out that there was no careful experimental test for
the weak interactions (nuclear beta decay, etc.). They proposed some experimental
tests and C.S. Wu and collaborators discovered experimentally in 1957 that parity is
indeed violated.
So, what about time reversal invariance? If you run a movie backwards, is the
resulting process physically possible (however unlikely)? There is a famous theorem
in QFT stating that the product of parity and time reversal, the P T -component of
O(d − 1, 1) will be a symmetry if the connected component is a symmetry. This is
usually called the CPT theorem. In 1964 V. Fitch and J. Cronin discovered that
certain very rare processes in nature actually do violate time-reversal invariance.
That is a good thing, because, as noted by Sakharov, if there were no violation of time-reversal
invariance in the laws of physics it would be impossible to understand why there is
matter/anti-matter asymmetry in the context of the big bang theory.

2. An important generalization of the Lorentz groups is the following:

O(p, q; R) := {A ∈ GL(p + q; R)|Atr dp,q A = dp,q } (8.70)

where

d_{p,q} := \begin{pmatrix} 1_{p×p} & 0 \\ 0 & −1_{q×q} \end{pmatrix}    (8.71)
forms a group, generalizing the Lorentz and rotation-reflection groups and, when
both p > 0 and q > 0 it has four connected components. 88
**************************
SHOW CONTRACTION TO O(p) × O(q) USING THE MATRICES
 √ √ 
† π sinh( π † π)
!
0 π cosh( ππ √
exp[ † ] =  sinh(√ππ† ) π† π
√ 
π 0 π † √ cosh( π † π
ππ †
88
The proof proceeds by showing that the group contracts to O(p) × O(q) and then recall that O(p) has
two connected components.

SINCE
Atr A = 1 + C tr C
(8.72)
Dtr D = 1 + B tr B

A, D are positive have polar decompositions A = O1 P1 and D = O2 P2 etc.


Detailed proof in Survey Of Matrix Groups 2009, Chapter 5.
**************************


Figure 27: The torsors Z + x, plotted in the y direction are the points above a point x on the
Z-axis. Note that the union of the red lines has a symmetry under the action (x, y) → (x + n, y)
for n ∈ Z. If we quotient by this action then the projection to the x-axis becomes the projection
R → S 1 given by the exponential map. This gives us a principal bundle for the group Z.

8.3.3 Torsors And Principal Bundles

Definition A torsor or principal homogeneous space for a group G is a G-set X on which


the action is transitive and free.

One can set up a 1-1 correspondence between a torsor and elements of G, but in general
there is no natural correspondence: A torsor has no distinguished element we can call the
identity.

Example 1: Let x ∈ R be a real number. Consider the subset X = Z + x ⊂ R. This set


is a torsor for Z. But there is no natural zero in X. Indeed, let x vary continuously, then
any purported natural zero would vary continuously to any other number.

Example 2: Imagine that the surface of the earth is flat and of infinite extent. Is this
a copy of R2 ? Yes and no. We can identify it with R2 , but not in any natural way: R2

is a vector space with a distinguished vector ~0. Where should we put the origin? Rome?
Beijing? Moscow? London? New York? Piscataway? Wuhan? If the UN tried to assign
an origin there would be endless disputes. However, there would never be any dispute
about the vector in R2 needed to translate from New York to London. So, the difference
of London minus New York is well-defined, but the sum is not. If London and New York
represented vectors in a vector space then one could both add and subtract these vectors.
The infinite flat earth is an example of two-dimensional affine Euclidean space E2 .
More formally: An affine space Ed modeled on Rd is a space of points with an action of
Rd that translates the points so that nonzero vectors always move points and one can get
from one point to any other by the action of a vector. But there is no natural choice of
origin. In equations:

1. If v ∈ Rd and p ∈ Ed then there is a point p + v ∈ Ed so that (p + v) + v 0 = p + (v + v 0 ).

2. If p + v = p then v = 0.

3. If p, p0 ∈ Ed there is a (unique) vector v ∈ Rd so that p0 = p + v. We can therefore


say p0 − p = v.

If we do choose an origin (this choice is arbitrary) then we can identify E^d ≅ R^d.
Indeed, given the above statements, given p ∈ E^d, every p′ ∈ E^d is of the form p′ = p + v
for a unique v ∈ R^d. So we map Ψ_p : E^d → R^d by taking Ψ_p : p′ ↦ v.
In this language we can say that Ed is a principal homogeneous space for the Abelian
group Rd .
There is a distance between points which we take to be the Euclidean norm of the
vector:

dist(p, p′) := ‖v‖    (8.73)

Now we can study the group of isometries of E^d. This is the group of transformations
T : E^d → E^d that preserve these distances:

dist(T(p), T(p′)) = dist(p, p′)    (8.74)

We denote it as Euc(d) and refer to it as the Euclidean group.


In a similar way, we can consider d-dimensional Minkowski space M1,d−1 to be an affine
space modeled on Rd , but now with quadratic form on p0 − p = v given by ♣Earlier we defined
it as a vector space,
so need to alter that
v · v = v µ ηµν v ν (8.75) to avoid confusion.

so the “distance squared” between two points is now dist(p, p0 )2 = v · v. Considered as an


affine space we define the Poincaré group as the group of transformations T : M1,d−1 →
M1,d−1 preserving the squared distance.

Example 3: Let V be a finite-dimensional vector space over a field κ. The set X = B(V ) of
all ordered bases for V is a GL(n, κ) torsor: g ·{v1 , . . . , vn } := {ṽ1 , . . . , ṽn } where ṽi = gji vj .
Any two such bases are related by some g, but, if we are just given an abstract vector space,

there is no natural basis. In the case κ = R the torsor B(V ) has two connected components.
A choice of connected component is known as an orientation of V .

Example 4: It is quite interesting to consider continuous families of torsors. For example,


let us return to Example 1 and vary x so that we have a family of torsors of the form Z + x.
Let
P̃ = ∪x∈R (Z + x) (8.76)
Note that Z + x + 1 = Z + x so that we can quotient by the group action of Z on R to get

P = P̃ /(x, y) ∼ (x + 1, y) (8.77)

See Figure 27.

Example 5: Let us return to our example πn̂ : SO(3) → S 2 , or more generally,

π : G → G/H (8.78)

Note that the fibers of a coset gH are:

π −1 (gH) = gH (8.79)

The subset gH ⊂ G can be put into 1-1 correspondence with H, but not in any natural
way. While it is identified with H as a set, it is not identified as a group because there is
no natural element in gH to identify with the unit in H.
The above examples are special cases of an extremely important idea in mathematics
- that of a principal fiber bundle. In the language of this section, a principal G-bundle is a
continuous family of G-torsors. Here is the formal definition:

Definition: Let G be a topological group. A principal G-bundle is a pair of topological


spaces P and X together with a surjective continuous map π : P → X so that there is a
continuous free right G action (not necessarily transitive!) on P with

π(p · g) = π(p) (8.80)

Technically, we need it to have a local triviality property: For all x ∈ X there is a
neighborhood U_x ⊂ X with a smooth map φ_x : π^{-1}(U_x) → U_x × G such that

   π^{-1}(U_x) --φ_x--> U_x × G
          \               /
        π  \             /  (projection onto U_x)
            v           v
                 U_x                                                        (8.81)

commutes. Moreover, we require that the right G-action on π −1 (Ux ) is equivalent to the
natural right G-action on Ux × G, where g0 acts by (y, g) 7→ (y, gg0 ), i.e. if p ∈ π −1 (Ux )
and φx (p) = (y, g) then
φx (p · g0 ) = (y, g · g0 ) (8.82)

P is called the total space of the principal bundle and X is called the base space. The
preimage π −1 (x) is called the fiber above x.

A consequence of the local triviality property is that if we cover the base space X with
charts {Uα } with φα : π −1 (Uα ) → Uα × G then

φ_{αβ} := φ_α ∘ φ_β^{-1} : (x, g) ↦ (x, g_{αβ}(x) g),     x ∈ U_{αβ}        (8.83)

Note that on triple overlaps Uαβγ

gαβ (x)gβγ (x)gγα (x) = 1 x ∈ Uαβγ (8.84)

a condition called the cocycle condition.

Examples:

1. Let X be any topological space and G any topological group. Then P = X × G with
π : (x, g) 7→ x is a principal bundle. A bundle of this form is known as the trivial
bundle.

2. π : G → G/H is a principal H bundle over the homogeneous space X = G/H. (Note


the free right H-action that commutes with π is g 7→ g · h so that π(g) = gH = π(gh).

3. In particular, π : SO(3) → S 2 is a principal SO(2) bundle over S 2

4. π : SU (2) → CP1 is a principal U (1) bundle over CP1 ∼


= S2.

5. G-bundles over the circle. Let G be a discrete group. Then R × G is the trivial
principal G- bundle over R. We can make a more interesting bundle by considering
the left Z-action defined by choosing an element g0 ∈ G and defining

φg0 ((x, g), n) = (x + n, g0n g) (8.85)

so equivalence classes satisfy: [(x, g)] = [(x + n, g0n g)]. The quotient P = (R × G)/Z
by this action is a principal G-bundle over S 1 . Note that there is still a well-defined
free right G-action. Call this bundle π : Pg0 → S 1 . We could also describe it as
([0, 1] × G) where we glue (0, g) to (1, g0 g). We will return to it below.

6. A nice special case of the bundle Pg0 described above is obtained by considering the
map of the unit disk in the complex plane π : z 7→ z n . Restricting this map to the
boundary of the disk we get a map from S 1 → S 1 , but notice that the inverse image
of any point is a torsor for multiplication by elements of the group µn . So π describes
a principal Z_n bundle over the circle.

A map of G-torsors, or, more properly, a morphism of G-torsors, is a map that preserves
the mathematical structure of being a G-torsor. So, if X1 and X2 are two G-torsors then a
morphism of G-torsors is a map
ψ : X1 → X2 (8.86)

such that
ψ(y · g) = ψ(y) · g (8.87)

for all y ∈ X1 . Such maps are also said to be equivariant.


A bundle map between two principal G-bundles, or, more properly a morphism of G-
bundles is a continuous map that preserves fibers and restricts on fibers to a be a morphism
of G-torsors. In diagrams, if ψ is a bundle map between two principal G-bundles P1 and
P2 then ψ : P1 → P2 is a map so that it preserves fibers, meaning that:

   P_1 ----ψ----> P_2
      \           /
       \ π_1     / π_2
        v        v
            X                                                               (8.88)

commutes. That is, π_2(ψ(p_1)) = π_1(p_1) for all p_1 ∈ P_1. Moreover, the map must be G
equivariant, or a morphism of torsors on the fibers, so that 89

ψ(p1 · g) = ψ(p1 ) · g (8.89)

If there are bundle maps ψ1 : P1 → P2 and ψ2 : P2 → P1 whose composition is the


identity then the bundles are said to be isomorphic bundles.
Let us return to our example of π : P_{g_0} → S^1. For h ∈ G let ψ_h : R × G → R × G be
the map ψ_h : (x, g) ↦ (x, hg). Note that the following diagram commutes:

   R × G ----ψ_h----> R × G
     |                  |
     | φ_{g_0,n}        | φ_{h g_0 h^{-1}, n}
     v                  v
   R × G ----ψ_h----> R × G                                                 (8.90)

This implies that ψh descends to a well-defined bundle map Pg0 → Phg0 h−1 . Clearly ψh−1
defines the inverse bundle map.
Therefore, the isomorphism classes of principal G bundles over the circle are labeled
by conjugacy classes of elements of G.
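The commutativity of (8.90) is easy to check by hand or by machine. The following Python sketch is an illustration added for concreteness, not part of the formal development; it takes G = S_3 realized as permutation tuples and checks ψ_h ∘ φ_{g_0,n} = φ_{h g_0 h^{-1},n} ∘ ψ_h on random inputs, which is all the classification statement uses.

import random
from itertools import permutations

m = 3
G = list(permutations(range(m)))                     # the finite group, here S_3

def compose(g, h):                                   # (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in range(m))

def inverse(g):
    r = [0] * m
    for i, gi in enumerate(g):
        r[gi] = i
    return tuple(r)

def power(g, n):                                     # g^n for n in Z
    if n < 0:
        g, n = inverse(g), -n
    acc = tuple(range(m))
    for _ in range(n):
        acc = compose(g, acc)
    return acc

def phi(g0, n, pt):                                  # deck transformation of (8.85): (x, g) -> (x + n, g0^n g)
    x, g = pt
    return (x + n, compose(power(g0, n), g))

def psi(h, pt):                                      # psi_h(x, g) = (x, h g)
    x, g = pt
    return (x, compose(h, g))

for _ in range(500):
    g0, h, g = random.choice(G), random.choice(G), random.choice(G)
    n, x = random.randint(-4, 4), random.randint(-10, 10)
    lhs = psi(h, phi(g0, n, (x, g)))
    rhs = phi(compose(compose(h, g0), inverse(h)), n, psi(h, (x, g)))
    assert lhs == rhs                                # the square (8.90) commutes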
Finally, let π : P → X be a principal G bundle and let Y be any G-space with a left
G-action. Then we can define a G action on P × Y :

φg (p, y) 7→ (p · g −1 , g · y) (8.91)

Notice this is a left G-action. The quotient space, usually denoted P ×G Y has a well-defined
continuous map
π̃ : P ×G Y → X (8.92)

defined by π̃([p, y]) = π(p). The fibers of π̃ can be identified with the space Y . (See the
exercise below.) This is an example of a more general fiber bundle. The map π̃ : P ×G Y →
89
The reader familiar with the general theory of fiber bundles will note that (8.88) alone serves as the
definition of a bundle map for a fiber bundle. But a principal bundle has more structure, and for a morphism
of principal bundles we require the additional condition (8.89).

X defines what is called an associated bundle to the principal bundle π : P → X by the
G-set Y .

Sections Of Bundles

If π : E → X is a principal bundle, or an associated bundle to a principal bundle


(or any fiber bundle, if you know what that means) then we define a section of π to be a
continuous map
s:X→E (8.93)
which is a right-inverse to π. That is

π ◦ s(x) = x ∀x ∈ X (8.94)

Note that s(x) is always an element of E which is in the fiber of π : E → X above x.


Because of local triviality, local sections always exist. That is, near any x ∈ X there
will be sections in the bundle π −1 (U) → U for some neighborhood U of x. It is less obvious
what happens globally. In fact, we have

Theorem A principal G-bundle π : P → X is isomorphic to the trivial bundle π : X ×G →


X iff there is a globally defined continuous section.

Proof : Suppose there is a globally defined section s : X → P . Then let

ψ : X × G → P                                                               (8.95)

be defined by ψ(x, g) := s(x) · g. One checks this is a bundle morphism. Conversely, suppose
there is a bundle morphism ψ : P → X × G. Then for each x ∈ X there is a unique
s(x) ∈ P so that ψ(s(x)) = (x, 1_G), and x ↦ s(x) is the desired continuous section. ♠

The situation is rather different for associated bundles. For example if Y = V is a


linear space then the space of sections is an infinite dimensional vector space. ♣Explain more and write this out with transition functions.♣

Exercise Bundle Maps For Trivial Bundles


Show that the most general bundle map from the trivial bundle π : X × G → X to
itself is of the form:
ψ : (x, g) 7→ (x, h(x)g) (8.96)
for some continuous map h : X → G.

Exercise Morphisms Of Principal G-Bundles Are Isomorphisms


a.) Show that any morphism of G-torsors is an isomorphism of G-torsors.

b.) Extend this to show that any morphism of principal G-bundles is an isomorphism
of principal G-bundles.

Exercise Fibers Of An Associated Bundle


Show that the fibers of the associated bundle π̃ : P ×G Y → X described above are in
1-1 correspondence with the space Y . 90

8.4 More About Induced Group Actions On Function Spaces


Let us return to the considerations of section 4.2.
Let X be a G-set and let Y be any set. There are natural left- and right- actions on
the function space Map(X, Y ). Given Ψ ∈ Map(X, Y ) and g ∈ G we need to produce a
new function φ(g, Ψ) ∈ Map(X, Y ). The rules are as follows:

1. If we have a left-action of G on X then

φ(g, Ψ)(x) := Ψ(g · x)          is a right action on Map(X, Y )             (8.97)

2. If we have a left-action of G on X then

φ(g, Ψ)(x) := Ψ(g^{-1} · x)     is a left action on Map(X, Y )              (8.98)

3. If we have a right-action of G on X then

φ(g, Ψ)(x) := Ψ(x · g)          is a left action on Map(X, Y )              (8.99)

4. If we have a right-action of G on X then

φ(g, Ψ)(x) := Ψ(x · g^{-1})     is a right action on Map(X, Y )             (8.100)
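The signs in these rules are easy to get wrong, so here is a small Python sketch (an illustrative check, not part of the original notes; it takes G = S_3 acting on X = {0, 1, 2} and an arbitrary Y) confirming that rule 1 really gives a right action and rule 2 a left action on Map(X, Y).

from itertools import permutations

X = (0, 1, 2)
G = list(permutations(X))                 # S_3; g[i] = g(i) defines a left action on X

def inv(g):
    h = [0, 0, 0]
    for i, gi in enumerate(g):
        h[gi] = i
    return tuple(h)

def compose(g, h):                        # (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in X)

Psi = {0: 'a', 1: 'b', 2: 'b'}            # an arbitrary element of Map(X, Y)

def rule1(g, Psi):                        # (8.97): Psi(g . x)
    return {x: Psi[g[x]] for x in X}

def rule2(g, Psi):                        # (8.98): Psi(g^{-1} . x)
    return {x: Psi[inv(g)[x]] for x in X}

for g in G:
    for h in G:
        gh = compose(g, h)
        assert rule1(h, rule1(g, Psi)) == rule1(gh, Psi)   # right-action law: act by g, then h  =  act by gh
        assert rule2(g, rule2(h, Psi)) == rule2(gh, Psi)   # left-action law:  act by h, then g  =  act by gh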

Example: Consider a spacetime S. With suitable analytic restrictions the space of scalar
fields on S is Map(S, κ), where κ = R or C for real or complex scalar fields. If a group
G acts on the spacetime, there is automatically an induced action on the space of scalar
fields. To be more specific, suppose X = M^{1,d−1} is d-dimensional Minkowski spacetime, G
is the Poincaré group, and Y = R. Given one scalar field Ψ and a Poincaré transformation
g −1 · x = Λx + v we have (g · Ψ)(x) = Ψ(Λx + v).
90
Answer : Let us define a map f : π̃ −1 (x) → Y . To do this we must choose an element p0 ∈ π −1 (x).
Then if [p, y] ∈ π̃ −1 (x) it follows that π(p) = x and hence p = p0 · g0 for some g0 . Then we define
f [p, y] = g0−1 · y. The reader needs to check that this map is well-defined. Since we can choose any y ∈ Y
the map is clearly surjective. Finally note that [p1 , y1 ] = [p1 , y2 ] implies that y1 = y2 since the G-action on
P is free. Therefore the map is also injective. Note that the map does depend on a choice of p0 for each x,
so there is no canonical identification of the fibers with Y .

Similarly, suppose that X is any set, but now Y is a G-set. Then again there is a
G-action on Map(X, Y ):

(g · Ψ)(x) := g · Ψ(x) or Ψ(x) · g (8.101)

according to whether the G action on Y is a left- or a right-action, respectively. These are


left- or right-actions, respectively.

We can now combine these two observations and get the general statement: We assume
that both X is a G1 -set and Y is a G2 -set. We can assume, without loss of generality, that
we have left-actions on both X and Y . Then there is a natural G1 ×G2 -action on Map(X, Y )
defined by:
φ((g1 , g2 ), Ψ)(x) := g2 · (Ψ(g1−1 · x)) (8.102)
Note that if one writes instead g_2 · (Ψ(g_1 · x)) on the RHS then we do not have a well-defined
G_1 × G_2-action (unless G_1 is Abelian). In most applications X and Y both
have a G action for a single group and we write

φ(g, Ψ)(x) := g · (Ψ(g −1 · x)) (8.103)

This is a special case of the general action (8.102), with G1 = G2 = G and specialized to
the diagonal ∆ ⊂ G × G.

Example: Again let X = M^{1,d−1} be Minkowski spacetime. Take G_1 = G_2 = G to be the
Poincaré group and restrict to the diagonal subgroup ∆ ⊂ G × G. Now
let Y = V be a finite-dimensional representation of the Poincaré group. Let us denote the
action of g ∈ G on V by ρ(g). Then a field Ψ ∈ Map(X, Y ) has an action of the Poincaré
group defined by
g · Ψ(x) := ρ(g)Ψ(g −1 x) (8.104)
This is the standard way that fields with nonzero “spin” transform under the Poincaré
group in field theory. As a very concrete related example, consider the transformation of
electron wavefunctions in nonrelativistic quantum mechanics. The electron wavefunction
is governed by a two-component function on R3 :
Ψ(\vec x) = \begin{pmatrix} ψ_+(\vec x) \\ ψ_-(\vec x) \end{pmatrix}        (8.105)

Then, suppose G = SU(2). Recall there is a surjective homomorphism π : G → SO(3)
defined by π(u) = R where

u (\vec x · \vec σ) u^{-1} = (R\vec x) · \vec σ                              (8.106)

♣IT HASN’T BEEN DEFINED YET. IT IS IN SECTION 10.1. SOME OF THAT MATERIAL SHOULD BE MOVED EARLIER SO WE CAN RECALL IT HERE.♣

Then the double cover of the rotation group acts to define the transformed electron
wavefunction u · Ψ by

(u · Ψ)(\vec x) := u \begin{pmatrix} ψ_+(R^{-1}\vec x) \\ ψ_-(R^{-1}\vec x) \end{pmatrix}        (8.107)
In particular, u = −1 acts trivially on ~x but nontrivially on the wavefunction.

9. Centralizer Subgroups And Counting Conjugacy Classes

Definition 9.1: Let g ∈ G. The centralizer subgroup of g (also known as the normalizer
subgroup), denoted Z(g), is defined to be:

Z(g) := {h ∈ G|hg = gh} = {h ∈ G|hgh−1 = g} (9.1)

Exercise Due Diligence


a.) Check that Z(g) ⊂ G is a subgroup.
b.) Show that g n ∈ Z(g) for any integer n.
c.) If g1 = g0 g2 g0−1 show that Z(g1 ) = g0 Z(g2 )g0−1 .
d.) Show that
Z(G) = ∩g∈G Z(g) (9.2)

Exercise Is Z(g) always an Abelian group?


a.) Show that Z(1) = G. Answer the above question.
b.) Show that the centralizer of the transposition (12) in Sn for n ≤ 3 is isomorphic to S2 .
c.) Show that the centralizer of the transposition (12) in Sn for n ≥ 4 is isomorphic to
S2 × Sn−2 .

Recall that C(g) denotes the conjugacy class of g. Using the Stabilizer-Orbit theorem
we can establish a 1-1 correspondence between C(g) and the cosets of G/Z(g). As in the
proof of that theorem we have a map ψ : G/Z(g) → C(g) by

ψ : gi Z(g) → gi ggi−1 ∈ C(g) (9.3)

It is 1-1 and onto.


Since conjugacy is an equivalence relation G decomposes as a disjoint union of the
orbits, which in this case are the conjugacy classes. When G is a finite group this decom-
position leads to some useful theorems based on simple counting ideas. When |G| is finite
we can usefully write:

|G| = \sum_{\text{conj. classes}} |C(g)|                                     (9.4)

The sum is over distinct conjugacy classes. What is g in this formula? For each class we
may choose any representative element from that class.
Now, if G is finite, then by the above 1-1 correspondence we may write:
|C(g)| = \frac{|G|}{|Z(g)|}                                                  (9.5)

which allows us to write the above decomposition of |G| in a useful form sometimes called
the class equation:
|G| = \sum_{\text{conj. classes}} \frac{|G|}{|Z(g)|}                         (9.6)

Again, we sum over a complete set of distinct non-conjugate elements g. Which g we choose
from each conjugacy class does not matter since if g1 = hg2 h−1 then Z(g1 ) = hZ(g2 )h−1
are conjugate groups, and hence have the same order. So, for each distinct conjugacy class
we just choose any element we like.
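To make this concrete, here is a short Python computation (an illustrative sketch, not part of the original notes) that verifies (9.4) and the class equation (9.6) for G = S_4, and also checks the identity \sum_{cc} 1/|Z(g)| = 1 that reappears in the gauge theory discussion below.

from itertools import permutations

n = 4
G = list(permutations(range(n)))

def compose(g, h):
    return tuple(g[h[x]] for x in range(n))

def inverse(g):
    r = [0] * n
    for i, gi in enumerate(g):
        r[gi] = i
    return tuple(r)

def conj_class(g):
    return {compose(compose(h, g), inverse(h)) for h in G}

def centralizer(g):
    return [h for h in G if compose(h, g) == compose(g, h)]

reps, seen = [], set()                    # one representative per conjugacy class
for g in G:
    if g not in seen:
        reps.append(g)
        seen |= conj_class(g)

order = len(G)                            # |S_4| = 24
assert order == sum(len(conj_class(g)) for g in reps)                      # eq. (9.4)
assert order == sum(order // len(centralizer(g)) for g in reps)            # class equation (9.6)
assert abs(sum(1.0 / len(centralizer(g)) for g in reps) - 1.0) < 1e-12     # cf. the gauge theory sum below
print(sorted(len(conj_class(g)) for g in reps))                            # class sizes: [1, 3, 6, 6, 8]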

9.1 0 + 1-Dimensional Gauge Theory


Recall the definition above of a morphism of bundles, and an isomorphism of bundles. An
automorphism of a principal G-bundle π : P → X is an isomorphism of the principal bundle
with itself. Since the composition of automorphisms is an automorphism, and automor-
phisms are invertible, and the identity is an automorphism the set of automorphisms form
a group under composition, called the group of automorphisms of the bundle π : P → X.
Recall the bundles Pg0 → S 1 determined by the group element g0 . As we showed above,
the invertible bundle maps are of the form ψh : Pg0 → Phg0 h−1 . Therefore, the isomorphism
classes of G-bundles over the circle are in 1-1 correspondence with the conjugacy classes in
G, and the group of automorphisms of Pg0 is precisely the centralizer group Z(g0 ).
A gauge theory is a physical theory where physical quantities are defined by summing
over principal G-bundles with connection. We haven’t defined the term “connection,” but
for principal G-bundles with discrete group there is a unique connection so we can discuss
those here. 91 In the case of 0 + 1 dimensional gauge theory a basic quantity of interest
is the “partition function” on a one-dimensional manifold. The only closed connected
one-dimensional manifold is the circle. So we have
Z(S^1) = \sum F(P_g)                                                         (9.7)

where the sum is over isomorphism classes of principal G-bundles over the circle, and F
is a function on the set of such bundles, or equivalently, a function on all the principal
G-bundles that only depends on the isomorphism class. We call this a gauge invariant
Boltzman factor. The standard physical Boltzman factors involve curvature and holonomy.
In this setting there is no curvature, so F should be proportional to a character of g in
some representation.
In gauge theory we must also divide by the “volume” of the group of automorphisms
of the bundle so
Z(S^1) = \sum_{\text{cc}} \frac{χ_ρ(g)}{|Z(g)|}                              (9.8)
for some character χρ . We will see later that the sum is zero unless ρ contains some copies
of the trivial representation, so we might as well take χρ (g) = 1.
91
See section 18 below for a discussion of gauge theory that uses minimal prerequisites and is sufficient
to understand this remark.

Now notice that we can use the stabilizer-orbit theorem to rewrite this as:
Z(S^1) = \sum_{\text{cc}} \frac{1}{|Z(g)|} = \frac{1}{|G|} \sum_{P_g} 1      (9.9)

In the second equality we have summed over all the G-bundles Pg weighted by 1 and divided
by the full “volume” |G| of the gauge group. So
Z(S^1) = \frac{1}{|G|} \sum_{g∈G} 1 = 1                                      (9.10)

The Hilbert space is one-dimensional with zero Hamiltonian so indeed this is Tre−βH = 1.

9.2 Three Mathematical Applications Of The Counting Principle


In this section let p be a prime number.
Application 1:

Theorem: If |G| = pn then the center is nontrivial, i.e., Z(G) 6= {1}.

Proof : Observe that an element g is central if and only if C(g) = {g} has order 1. Now
let us use the class equation. We can usefully split up the sum over conjugacy classes as a
sum over the center and the rest:
|G| = |Z(G)| + \sum_i |C_i|                                                  (9.11)

where the sum over i is a sum over the distinct conjugacy classes with more than one
element. As we noted above, by the stabilizer-orbit theorem

|C_i| = \frac{|G|}{|Z(g_i)|}                                                 (9.12)
where gi is any element of the conjugacy class Ci . But, for these conjugacy classes |Z(gi )| <
|G| and by Lagrange’s theorem, and the assumption that p is prime, |Z(gi )| = pn−ni for
some ni < n. Therefore, the second term on the RHS of (9.11) is divisible by p and hence
p||Z(G)|. ♠

Application 2: Cauchy’s theorem:


In a similar style, we can prove the very useful:

Theorem: If p divides |G| then there is an element g ∈ G, g 6= 1 with order p.


♣Proof 1 is not really an application of the class equation, rather it is an application of stabilizer-orbit.♣

Proof 1 : This is a nice application of the stabilizer-orbit theorem. Consider the set

X = { (g_1, . . . , g_p) | g_1 · · · g_p = 1 } ⊂ G^p                        (9.13)

Note that the cyclic group Zp acts on this set with the standard generator acting by

ω · (g1 , . . . , gp ) = (gp , g1 , g2 , . . . , gp−1 ) (9.14)

A fixed point of the Zp -action corresponds to an element of the form (g, . . . , g) such that
g p = 1. If g 6= 1 then this corresponds to an element of order p. Now, by the stabilizer-orbit
theorem, the orbits of any Zp action (on any set) have cardinality either 1 or p. Let N1 be
the number of orbits of length one and let Np be the number of orbits of length p. Note
that the order of X is just |G|p−1 since one can always solve for gp in terms of g1 , . . . , gp−1 .
Then, by the counting principle we have:

|G|p−1 = N1 + pNp (9.15)

It follows that p divides N1 . Also N1 > 0 because (1, ..., 1) is a fixed point of the Zp action.
Therefore N1 = kp > 1 and hence there are other fixed points, i.e. there are group elements
of order p. In fact, there must be at least (p − 1) of them. ♠
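The counting behind Proof 1 can be checked directly on a small example; the following Python sketch (illustrative only, not part of the original notes) takes G = S_3 and p = 3, builds the set X of (9.13), and confirms |X| = |G|^{p-1} = N_1 + p N_p with N_1 divisible by p.

from itertools import product, permutations

n, p = 3, 3
G = list(permutations(range(n)))                     # S_3, of order 6

def compose(g, h):
    return tuple(g[h[x]] for x in range(n))

e = tuple(range(n))
X = [t for t in product(G, repeat=p)
     if compose(compose(t[0], t[1]), t[2]) == e]     # the set (9.13): g1 g2 g3 = 1

assert len(X) == len(G) ** (p - 1)                   # |X| = |G|^{p-1} = 36

def rotate(t):                                       # the Z_p generator of (9.14)
    return (t[-1],) + t[:-1]

N1 = sum(1 for t in X if rotate(t) == t)             # orbits of length 1, i.e. solutions of g^p = 1
Np = (len(X) - N1) // p
assert len(X) == N1 + p * Np                         # eq. (9.15)
assert N1 % p == 0 and N1 > 1                        # so elements of order p exist
print(N1)                                            # 3: the identity and the two 3-cycles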

Proof 2 : We can also prove Cauchy’s theorem using induction on the order of G, dividing
the proof into two cases: First we consider the case where G is Abelian and then the case
where it is nonabelian.

Case 1: G is Abelian:
If |G| = p then G is cyclic and the statement is obvious: Any generator has order p.
More generally, note that if G is a cyclic group Z/N Z with N > p and p divides N then
N/p ∈ Z/N Z has order p. This establishes the result for cyclic groups.
Now suppose our Abelian group has order |G| > p. Choose an element g0 6= 1 and
suppose that g0 does not have order p. Let H = hg0 i. If H = G then G would be cyclic
but then as we just saw, it would have an element of order p. So now assume H is a proper
subgroup of G. If p divides |H| then H (and hence G) has an element of order p by the
inductive hypothesis. If p does not divide |H| then we consider the group G/H. But this
has order strictly less than |G| and p divides the order of G/H. So there is an element aH
of order p meaning ap = g0x for some x. If g0x = 1 we are done. If not then there is some
smallest positive integer y so that g0xy = 1 but then ay has order p. We have now proved
Cauchy’s theorem for abelian groups.

Case 2: G is non-Abelian: By the class equation we can write


|G| = |Z(G)| + {\sum_i}' \frac{|G|}{|Z(g_i)|}                                (9.16)

where the primed sum runs over the conjugacy classes with more than one element.

If p divides the order of the center Z(G) then we can apply our previous result, Cauchy’s
theorem for Abelian groups. If p does not divide |Z(G)| then there must be some g_i so that
p does not divide |G|/|Z(g_i)|, but this means p divides |Z(g_i)|. Since |Z(g_i)| < |G|, the
inductive hypothesis shows that Z(g_i), and hence G, has an element of order p. This
completes the proof. ♠

Application 3: Sylow’s theorem:


Finally, as a third application we give a simple proof of Sylow’s first theorem: If p is
prime and pk divides |G| then G has a subgroup of order pk .
♣Proof 1 is not really an application of the class equation, rather it is an application of stabilizer-orbit.♣

Proof 1 : The first proof is again an application of the stabilizer-orbit theorem. 92 Suppose
|G| = pk+r u with gcd(u, p) = 1 and r ≥ 0 and k > 0. We will show that G has a subgroup
of order pk . Consider the power set P(G), namely the set of all subsets of G, and consider
the subset of P(G) of all subsets (not subgroups!) of G of cardinality pk . Call this set of
subsets P(G, pk ). The cardinality of P(G, pk ) is clearly:
|P(G, p^k)| = \binom{p^{k+r}u}{p^k} = p^r u \prod_{j=1}^{p^k - 1} \frac{p^{k+r}u/j - 1}{p^k/j - 1}          (9.17)


In the product we have a ratio of rational numbers of the form p^{k+r}u/j − 1 (the denominator
is a special case of this form). Any nonzero rational number r can be expressed as a product of
prime powers r = \prod_{p̃ \text{ prime}} p̃^{v_{p̃}(r)}, where v_{p̃}(r) ∈ Z is known as the valuation of r at p̃
and the product runs over all primes p̃. Now, given a specific prime p, note that if a, b are
relatively prime to p then

\frac{p^k a}{b} - 1 = \frac{p^k a - b}{b}                                    (9.18)

and hence such rational numbers have valuation 0 at the prime p. It follows that
p^r divides |P(G, p^k)| and that it is the maximal power of p which does so.
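This divisibility claim is easy to spot-check numerically. The short Python sketch below (illustrative only, not part of the original notes) computes the p-adic valuation of the binomial coefficient in (9.17) for a few random values of p, k, r, u and confirms that it equals r.

from math import comb
from random import choice, randint

def vp(m, p):                                        # p-adic valuation of a positive integer m
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v

for _ in range(25):
    p = choice([2, 3, 5, 7])
    k, r = randint(1, 3), randint(0, 3)
    u = randint(1, 20)
    while u % p == 0:                                # require gcd(u, p) = 1
        u = randint(1, 20)
    N = p ** (k + r) * u                             # |G| = p^{k+r} u
    assert vp(comb(N, p ** k), p) == r               # p^r is the exact power of p dividing |P(G, p^k)|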
Now note that G acts on P(G, pk ) via:

φg : S 7→ g · S := {gh|h ∈ S} (9.19)

where we are denoting an element of P(G, pk ) by S. Consider the stabilizer subgroup GS


of any S ∈ P(G, pk ). Note that if h ∈ S then every element g · h ∈ S for g ∈ GS . (Why?
Because g · S = S if g ∈ GS .) But this means that GS · h is a subset of S. Since the left-G
action is free
|G_S| = |G_S · h| ≤ |S| = p^k                                                (9.20)
We now aim to show that some stabilizer group GS has order exactly pk . This will be
our subgroup predicted by Sylow’s theorem. Suppose, on the contrary that no stabilizer
group has order pk . Then every stabilizer group satisfies |GS | < pk , and therefore it is
divisible at most by pk−1 . Now, by the stabilizer-orbit theorem

|G| = |GS | · |O(S)| (9.21)

where O(S) is the G-orbit through S. Now p^{k+r} is the maximal power of p dividing |G|, and
if |G_S| is divisible by at most p^{k-1} then |O(S)| is divisible by p^{r+1}. But now

|P(G, p^k)| = \sum_{\text{distinct orbits}} |O(S_i)|                         (9.22)

If all the orbits on the RHS were divisible by p^{r+1} then |P(G, p^k)| would be divisible by
p^{r+1}. But this is not true: we showed above that p^r is the maximal power of p dividing
|P(G, p^k)|. Therefore some orbit O(S) is not divisible by p^{r+1}. For that orbit, (9.21) shows
that p^k divides |G_S|, and combined with (9.20) this gives |G_S| = p^k. Since G_S is a
stabilizer group it is a subgroup of G. ♠
92
We are following the nice article on Wikipedia here.

Proof 2 : The more conventional proof is similar to that of Cauchy’s theorem. We work by
induction on |G|, and divide the proof into two cases:

Case 1: p divides the order of Z(G).: By Cauchy’s theorem Z(G) has an element of order
p and hence a subgroup N ⊂ Z(G) of order p. N is clearly a normal subgroup of G
(being a subgroup of the center of G) so G/N is a group. It is clearly of order pk−1 m.
So, by the inductive hypothesis there is a subgroup H̄ ⊂ G/N of order pk−1 . Now let
H = {g ∈ G|gN ∈ H̄}. It is not hard to show that H is a a subgroup of G containing N
and in fact H/N = H̄. Therefore |H| = pk , so H is a p-Sylow subgroup of G.

Case 2: p does not divide the order of Z(G).: In this case, by the class equation p must
not divide |C(g)| = |G|/|Z(g)| for some nontrivial conjugacy class C(g). But that means
that for such an element g we must have that pk divides |Z(g)| < |G|. So Z(g) has a
p-Sylow subgroup which can serve as a p-Sylow subgroup of G. ♠ ♣Again, there is a nice proof using the orbit-stabilizer theorem. See Wikipedia article. Give this in the Exercise section on Orbit-Stabilizer below?♣

If p^k divides |G| with k > 1 does it follow that there is an element of order p^k? 93

Exercise Groups Whose Order Is A Square Of A Prime Number


If |G| = p2 where p is a prime then show that
1. G is abelian
2. G ∼= Zp × Zp or Zp2 .

Exercise
Write out the class equation for the groups S4 and S5 .

Exercise
Find the centralizer Z(g) ⊂ Sn of g = (12 . . . n) in Sn .

Exercise
93
Answer : NO! Z_p^k is a counterexample: It has order p^k and every element has order p.

Prove that if |G| = 15 then G = Z/15Z.

Exercise Groups whose order is a product of two primes


Suppose that G has order pq where p and q are distinct primes. We assume WLOG
that p < q. We now also assume that p does not divide q − 1.
a.) Show that G is isomorphic to Zpq .
Warning!! This is hard. 94
b.) Why is it important to say that p does not divide q − 1? 95
c.) Show that this result implies that if a nonabelian group has odd order then the
order must be ≥ 21. (And in fact, there does exist a nonabelian group of order 21.)

10. Kernel, Image, And Exact Sequence

Given an arbitrary homomorphism

µ : G → G0 (10.1)

there is automatically a “God-given” subgroup of both G and G0 :


94
Answer. By Cauchy’s theorem we know there is an element a of order p and an element b of order q.
We can easily reduce to the case the center of G is trivial. In general the subgroup Z(G) must have order
pq, p, q, or 1. If Z(G) has order pq then a and b commute and G ∼ = Zp × Zq ∼ = Zpq . If |Z(G)| has order p
or q then G/Z(G) must be cyclic of order q or p, respectively. Hence by an easy exercise above G is cyclic.
This leaves us with the hard case where Z(G) = {1} is the trivial subgroup. Let us consider the conjugacy
classes of the powers of a, C(a), C(a2 ),. . . . Since Z(a) has order at least p and its order must divide pq and
it can’t be the whole group (since Z(G) = {1}) it must be that Z(a) = {1, a, . . . , ap−1 } and hence C(a) has
order q. Indeed, for any element g ∈ G that is not the identity it must be that Z(g) has order p or q and
C(g) has order q or p. Now note that Z(a) ⊃ Z(a2 ) ⊃ · · · . So, as long as ax is not one, it must be that
Z(ax ) = Z(a) and C(ax ) has order q. Now we claim that the different conjugacy classes C(a), C(a2 ),. . . ,
C(ap−1 ) are all distinct. The statement that these are distinct can be reduced to the statement that it is
not possible to have bab−1 = ax for any x, so now we verify this latter statement. If it were the case that
bab^{-1} = a^x then, since the general element of the conjugacy class is b^j a b^{-j}, the conjugacy class would have
to be {a, a^x, a^{x^2}, . . . , a^{x^{q-1}}}, which is contained in ⟨a⟩ = {1, a, a^2, · · · , a^{p-1}}, a set of only p elements. Since
q > p it must be that b^{j_1} a b^{-j_1} = b^{j_2} a b^{-j_2} where 1 ≤ j_1, j_2 ≤ (q − 1) and j_1 ≠ j_2. So we have to have
bj ab−j = a for some 1 ≤ j ≤ (q − 1). But then bj 6= 1. But then such an element bj would be in Z(a). This
is impossible. So we can never have bab−1 = ax and hence C(a), C(a2 ),. . . , C(ap−1 ) are all distinct. Now
the class equation says that
pq = 1 + (p − 1)q + X
where X accounts for all the other conjugacy classes. As we have remarked these must have order p or q
and hence X = rp + sq for nonnegative integers r, s. But now

q − 1 = rp + sq

But this is impossible: If s ≥ 1 the RHS is too large. So s = 0 but then p would have to divide q − 1.
95
Answer : Consider p = 2 and q = 3 and note that S3 is not isomorphic to Z6 .

Definition 10.1:
a.) The kernel of µ is

K = kerµ := {g ∈ G|µ(g) = 1G0 } (10.2)

b.) The image of µ is


imµ := µ(G) ⊂ G0 (10.3)

Exercise Due Diligence


a.) Check that ker(µ) ⊂ G is indeed a subgroup. 96
b.) Check that µ(G) ⊆ G′ is indeed a subgroup.

In mathematics one often encounters the notation of an exact sequence: Suppose we


have three groups and two homomorphisms f1 , f2
f1 f2
G1 →G2 →G3 (10.4)

We say the sequence is exact at G2 if imf1 = kerf2 .

This generalizes to sequences of several groups and homomorphisms


fi−1 fi fi+1
· · · Gi−1 −→ Gi −→ Gi+1 −→ · · · (10.5)

The sequence can be as long as you like. It is said to be exact at Gi if im(fi−1 ) = ker(fi ).
A short exact sequence is a sequence of the form

f1 f2
1 −→ G1 −→ G2 −→ G3 −→ 1 (10.6)
which is exact at G1 , G2 , and G3 . Here 1 refers to the trivial group with one element.
There is then a unique homomorphism 1 → G1 and G3 → 1 so we don’t need to specify it.
Thus, the meaning of saying that (10.6) is a short exact sequence is that

1. Exactness at G1 : The kernel of f1 is the image of the inclusion {1} ,→ G1 , and hence
is the trivial group. Therefore f_1 is an injection of G_1 into G_2.

2. Exactness at G2 : imf1 = kerf2 .

3. Exactness at G3 : G3 → 1 is the homomorphism which takes every element of G3


to the identity element in the trivial group. The kernel of this homomorphism is
therefore all of G3 . Exactness at G3 means that this kernel is the image of the
homomorphism f2 , and hence f2 is a surjective homomorphism.
96
Answer : If k1 , k2 ∈ K then µ(k1 k2 ) = µ(k1 )µ(k2 ) = 1G0 . So K is closed under multiplication. The
group properties of K now follow.

In particular, note that if µ : G → G0 is any group homomorphism then we automati-
cally have a short exact sequence:
µ
1 → K → G → im(µ) → 1 (10.7)

where K is the kernel of µ.


When we have a short exact sequence of groups there is an important relation between
them, as we now explain.
Theorem 10.1: Let K ⊆ G be the kernel of a homomorphism (10.1). Then K is a normal
subgroup of G.
Proof: µ(gkg −1 ) = µ(g)µ(k)µ(g −1 ) = µ(g)1G0 µ(g)−1 = 1G0 ⇒ K is normal. ♠

Exercise Is the image of a homomorphism a normal subgroup?


If µ : G → G0 is a group homomorphism is µ(G) a normal subgroup of G0 ?
Answer the question with a proof or a counterexample. 97

It follows by Theorem 7.2.1, that G/K has a group structure. Note that µ(G) is also
naturally a group.

These two groups are closely related because

µ(g) = µ(g 0 ) ↔ gK = g 0 K (10.8)


Thus we have

Theorem 10.2:
µ(G) ∼
= G/K (10.9)
Proof : We associate the coset gK to the element µ(g) in G0 .

ψ : gK 7→ µ(g) (10.10)
Claim: ψ is an isomorphism. You have to show three things:

1. ψ is a well defined map:

gK = g 0 K ⇒ ∃k ∈ K, g 0 = gk ⇒ µ(g 0 ) = µ(gk) = µ(g)µ(k) = µ(g) (10.11)

2. ψ is in fact a homomorphism of groups

ψ(g1 K · g2 K) = ψ(g1 K) · ψ(g2 K) (10.12)

where on the LHS we have the product in the group G/K and on the RHS we have
the product in G0 . We leave this as an exercise for the reader.
97
Answer : Definitely not! Any subgroup H ⊂ G is the image of the inclusion homomorphism. In general,
subgroups are not normal subgroups.

3. ψ is a bijection, i.e. ψ is onto and injective (hence invertible). The surjectivity should be clear. To prove
injectivity note that:

µ(g 0 ) = µ(g) ⇒ ∃k ∈ K, g 0 = gk ⇒ g 0 K = gK ♠ (10.13)

Remarks:

1. If we have a short exact sequence

1→N →G→Q→1 (10.14)

then it automatically follows that N is isomorphic to a normal subgroup of G (it is


the kernel of a homomorphism G → Q) and moreover Q is isomorphic to G/N . For
this reason we call Q the quotient group. A frequently used terminology is that “G
is an extension of Q by N .” Some authors 98 will use the terminology that “G is an
extension of N by Q.” So it is best simply to speak of a group extension with kernel
N and quotient Q.

2. VERY IMPORTANT: In quantum mechanics physical states are actually repre-


sented by “rays” in Hilbert space, or better, by one-dimensional subspaces of Hilbert
space, or, even better, by orthogonal projection operators of rank one. (This is
for “pure states.” More generally, “states” are described mathematically by den-
sity matrices.) When comparing symmetries of quantum systems with their classical
counterparts, group extensions play an important role so we will discuss them rather
thoroughly in §15 below. For the moment we quote three important examples:

Example 1: Consider the group of fourth roots of unity, Res(4) and the homomorphism
π : Res(4) → Res(2) given by π(g) = g 2 . The kernel is {±1} = Res(2) and so we have:

1 → Z2 → Z4 → Z2 → 1 (10.15)

As an exercise the reader should also describe this extension thinking of Z4 additively as
Z/4Z, and generalize it to
1 → Zp → Zp2 → Zp → 1 (10.16)
where p is prime.

Example 2: Consider the homomorphism

rN : Z → Z/N Z (10.17)

given by reduction modulo N . (Or, if you prefer to think multiplicatively, rN (n) = ω n


where ω is a primitive N th root of 1.) The kernel is K = N Z ⊂ Z. As a group this kernel
is isomorphic to Z and so we have
0 → Z \xrightarrow{ι_N} Z \xrightarrow{r_N} Z/NZ → 0                        (10.18)
98
notably, S. MacLane, one of the inventors of group cohomology,

where ιN (x) = N x.

Example 3: Finite Heisenberg Groups: Let P, Q be N × N “clock” and “shift” matrices.


To define these introduce an N th root of unity, say ω = exp[2πi/N ]. Then

Pi,j = δi=j+1modN (10.19)

Qi,j = δi,j ω j (10.20)


Note that P N = QN = 1 and no smaller power is equal to 1. Further note that 99

QP = ωP Q (10.21)

For N = 4 the matrices look like

P = \begin{pmatrix} 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \end{pmatrix}
\qquad
Q = \begin{pmatrix} ω&0&0&0 \\ 0&ω^2&0&0 \\ 0&0&ω^3&0 \\ 0&0&0&1 \end{pmatrix}              (10.22)

with ω = e2πi/4 . The group of matrices generated by P, Q and ω1N ×N is a finite subgroup
of GL(N, C) isomorphic to a finite Heisenberg group, denoted HeisN . It is an extension
π
1 → ZN → HeisN →ZN × ZN → 1 (10.23)

and has many pretty applications to physics and we will return to this group several times
below. See, for example section 11.11 below for a physical interpretation.
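As a quick consistency check of the clock-and-shift relations, here is a small numerical sketch (an illustration added to these notes; it assumes only numpy) that builds P and Q for N = 4 and verifies P^N = Q^N = 1 and QP = ωPQ.

import numpy as np

N = 4
omega = np.exp(2j * np.pi / N)

# clock and shift matrices of (10.19), (10.20): P_{i,j} = delta_{i, j+1 mod N}, Q = diag(omega^j)
P = np.zeros((N, N), dtype=complex)
for j in range(N):
    P[(j + 1) % N, j] = 1.0
Q = np.diag([omega ** (j + 1) for j in range(N)])

Id = np.eye(N)
assert np.allclose(np.linalg.matrix_power(P, N), Id)      # P^N = 1
assert np.allclose(np.linalg.matrix_power(Q, N), Id)      # Q^N = 1
assert np.allclose(Q @ P, omega * (P @ Q))                # QP = omega PQ, eq. (10.21)
print("Heisenberg relations verified for N =", N)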

Exercise
Give a formula for π in the exact sequence (10.23).

Exercise An
Use Theorem 7.1 to show that An is a normal subgroup of Sn .

Exercise Induced maps on quotient groups


We will use the following result in §12.3: Suppose µ : G1 → G2 is a homomorphism
and H2 ⊂ G2 is a subgroup.
99
The fastest way to check that - and thereby to check that you have your conventions under control - is
to compute QP Q^{-1} because (QP Q^{-1})_{ij} = Q_{ii} P_{ij} (Q_{jj})^{-1} = ωP_{ij}.

a.) Show that µ−1 (H2 ) ⊂ G1 is a subgroup.
b.) If H1 ⊂ µ−1 (H2 ) is a subgroup show that there is an induced map µ̄ : G1 /H1 →
G2 /H2 .
c.) Show that if H1 and H2 are normal subgroups then µ̄ is a homomorphism.
d.) In this case there is an exact sequence

1 → µ−1 (H2 )/H1 → G1 /H1 → G2 /H2 (10.24)

Exercise
Let A, B be abelian groups and A1 ⊂ A and B1 ⊂ B subgroups, and suppose φ : A → B
is a homomorphism such that φ takes A1 into B1 .
a.) Show that φ induces a homomorphism

φ̄ : A/A1 → B/B1 (10.25)

b.) Show that if φ : A1 → B1 is surjective then

ker{φ : A → B}
ker{φ̄ : A/A1 → B/B1 } ∼
= (10.26)
ker{φ : A1 → B1 }

Exercise
Let n be a natural number and let

ψ : Z/nZ → (Z/nZ)d (10.27)

be given by the diagonal map ψ(ω) = (ω, · · · , ω).


Find a set of generators and relations for the quotient (Z/nZ)^d / ψ(Z/nZ).

Exercise
Let G = Z × Z4 . Let K be the subgroup generated by (2, ω 2 ) where we are writing Z4
as the multiplicative group of 4th roots of 1. Note (2, ω 2 ) is of infinite order so that K ∼
= Z.
Show that G/K ∼ = 8Z .

Exercise The Finite Heisenberg Groups
a.) Using the matrices of (10.19) and (10.20) show that the word

P n1 Qm1 P n2 Qm2 · · · P nk Qmk (10.28)

where ni , mi ∈ Z can be written as ξP x Qy where x, y ∈ Z and ξ is an N th root of unity.


Express x, y, ξ in terms of ni , mi .
b.) Show that P N = QN = 1.
c.) Find a presentation of HeisN in terms of generators and relations.
d.) What is the order of HeisN ?

Exercise
Let Bn be a braid group. Compute the kernel of the natural homomorphism φ : Bn →
Sn and show that there is an exact sequence

1 → Zn−1 → Bn → Sn → 1 (10.29)

Exercise Centrally symmetric shuffles


Let us consider again the permutation group of the set {0, 1, . . . , 2n − 1}. Recall we let
W(B_n) denote the subgroup of S_{2n} of centrally symmetric permutations, i.e. those which
permute the pairs {x, x̄} with x + x̄ = 2n − 1 amongst themselves.
Show that there is an exact sequence

1 → Zn2 → W (Bn ) → Sn → 1 (10.30)

and therefore |W (Bn )| = 2n n!.

Exercise Weyl Group Of SU (N )


Every element of SU(N) can be conjugated into the set T of diagonal matrices.
a.) Show that the normalizer N (T ) of T within SU (N ) is larger than T by considering
permutation matrices i(eij + eji ).
b.) Show that these act by permuting the ii and jj diagonal elements .
c.) Show that there is a homomorphism N (T ) → SN .
d.) In fact, there is an exact sequence

1 → T → N (T ) → SN → 1 (10.31)

10.1 The Relation Of SU (2) And SO(3)
There is a standard homomorphism

π : SU (2) → SO(3) . (10.32)

To define it we note that for any u ∈ SU (2) there is a unique R ∈ SO(3) such that, for all
~x ∈ R3 we have:
u~x · ~σ u−1 = (R~x) · ~σ (10.33)

where R ∈ SO(3).
To prove (10.33) we begin by noting that, since u−1 = u† and ~x is real, the 2×2 matrix
u~x · ~σ u−1 is hermitian, and traceless, and hence has to be of the form ~y · ~σ , where ~y ∈ R3 .
Moreover, ~y depends linearly on ~x. So the transformation ~x 7→ ~y defined by u~x ·~σ u−1 = ~y ·~σ
is a linear transformation of R3 . In fact it is a norm-preserving transformation. One way
to prove this is to note that (see exercise below)
(\vec x · \vec σ)^2 = \vec x^2 \, \mathbf{1}_{2×2}                           (10.34)

or, alternatively (see exercise below)

det~x · ~σ = −~x2 (10.35)

From either formula we conclude that (see exercise below) ~x2 = ~y 2 . We therefore conclude
that ~y = R~x with R ∈ O(3).
We define π(u) = R by using this equation. To be totally explicit

uσ i u−1 = Rji σ j . (10.36)

It should be clear from the definition that π(u1 u2 ) = π(u1 )π(u2 ), that is that π is a
homomorphism of groups. Now to show that actually R ∈ SO(3) ⊂ O(3) note that

2i = tr(σ^1 σ^2 σ^3)
   = tr(uσ^1 u^{-1}\, uσ^2 u^{-1}\, uσ^3 u^{-1})
   = R_{j_1 1} R_{j_2 2} R_{j_3 3}\, tr(σ^{j_1} σ^{j_2} σ^{j_3})
   = 2i\, ε_{j_1 j_2 j_3} R_{j_1 1} R_{j_2 2} R_{j_3 3}
   = 2i\, \det R                                                             (10.37)

and hence detR = 1. Alternatively, if you know about Lie groups, you can use the fact
that π is continuous, and SU (2) is a connected manifold.
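The homomorphism π can also be checked numerically. The Python sketch below (an illustration added to these notes, using numpy; it extracts R via R_{ji} = (1/2) tr(σ^j u σ^i u^†), which follows from (10.36) and tr(σ^j σ^k) = 2δ^{jk}) verifies π(u_1 u_2) = π(u_1)π(u_2), det R = 1, and that u = −1 maps to the identity rotation.

import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]          # Pauli matrices sigma^1, sigma^2, sigma^3

def random_su2():
    # u = cos(chi) + i sin(chi) n.sigma, as in (10.49) below
    chi = np.random.uniform(0, np.pi)
    n = np.random.randn(3); n /= np.linalg.norm(n)
    return np.cos(chi) * np.eye(2) + 1j * np.sin(chi) * sum(n[i] * s[i] for i in range(3))

def pi(u):
    # R_{ji} defined by u sigma^i u^{-1} = R_{ji} sigma^j, so R_{ji} = (1/2) tr(sigma^j u sigma^i u^+)
    R = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            R[j, i] = 0.5 * np.real(np.trace(s[j] @ u @ s[i] @ u.conj().T))
    return R

u1, u2 = random_su2(), random_su2()
assert np.allclose(pi(u1) @ pi(u2), pi(u1 @ u2))          # pi is a homomorphism
assert np.allclose(np.linalg.det(pi(u1)), 1.0)            # the image lies in SO(3)
assert np.allclose(pi(-np.eye(2)), np.eye(3))             # -1 is in the kernel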
We will now prove that:

1. ker(π) = {±12×2 } = Z(SU (2)).

2. Every proper rotation R comes from some u ∈ SU (2):

Thus we have the extremely important extension:
ι π
1 → Z2 → SU (2) → SO(3) → 1 (10.38)

Thus, SU (2) is a two-fold cover of SO(3) and in fact

SO(3, R) ∼
= SU (2)/Z2 (10.39)

where the Z2 we quotient by is the center {±12×2 }. This is arguably the most important
exact sequence in physics.
To prove the above two claims we will need to get to know SU (2) a bit better.
First we claim that, as a manifold, SU (2) can be identified with the unit three-
dimensional sphere. One way to see this is to consider the unit sphere in R4 as the space
of unit vectors in a two-dimensional complex Hilbert space (the space of states of “one
Qbit”):
S3 ∼
= {~z|~z†~z = 1} ⊂ C2 (10.40)

This is easily seen by writing

\vec z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}                            (10.41)

and decomposing z1 , z2 into their real and imaginary parts. Next, we note that SU (2) has
a transitive action on the unit sphere:

φu : ~z 7→ u~z (10.42)

The action is transitive because, given any unit vector we can find another orthogonal unit
vector. But any two ON bases are related by some unitary transformation. By changing
the phase of the second vector we can arrange that they are related by a special unitary
transformation.
Therefore, we should invoke the stabilizer-orbit theorem and compute the stabilizer of,
say

\vec z_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.                             (10.43)

The elements fixing \vec z_0 are the upper triangular SU(2) matrices with ones on the diagonal,
and unitarity then forces the off-diagonal entry to vanish: the stabilizer is trivial. So
SU (2) ∼
= S3 (10.44)

In particular, a general SU (2) element must have the form


u = \begin{pmatrix} z_1 & * \\ z_2 & * \end{pmatrix}                         (10.45)

with |z1 |2 + |z2 |2 = 1. Now imposing the condition

u−1 = u† (10.46)

we solve for the other two matrix elements and conclude that every SU (2) element is of
the form

u = \begin{pmatrix} α & β \\ -\bar β & \bar α \end{pmatrix}                  (10.47)
where
|α|2 + |β|2 = 1 (10.48)
This makes the identification of the group as a manifold quite clear.
There are many ways to parametrize S 3 . One is to introduce a polar angle and stratify
S 3 by two-dimensional spheres. Viewed this way, we can write the general SU (2) element
as
u = cos χ + i sin χ~n · ~σ (10.49)
where 0 ≤ χ ≤ π and \vec n ∈ S^2. From this form it is easy to check that u commutes with all the
σ^i only if sin χ = 0, so cos χ = ±1. From this we conclude

ker(π) = {±12×2 } = Z(SU (2)) (10.50)

Now let us study the restriction of the homomorphism π to some special U (1) subgroups
of SU (2). First consider the subgroup of diagonal matrices D of the form:
\begin{pmatrix} ξ & 0 \\ 0 & ξ^{-1} \end{pmatrix}                            (10.51)

with |ξ| = 1. These act by


\begin{pmatrix} ξ & 0 \\ 0 & ξ^{-1} \end{pmatrix}
\begin{pmatrix} x^3 & x^1 - ix^2 \\ x^1 + ix^2 & -x^3 \end{pmatrix}
\begin{pmatrix} ξ^{-1} & 0 \\ 0 & ξ \end{pmatrix}
=
\begin{pmatrix} x^3 & ξ^2(x^1 - ix^2) \\ ξ^{-2}(x^1 + ix^2) & -x^3 \end{pmatrix}             (10.52)

If we write
ξ = e−iφ/2 (10.53)
for some angle φ then π maps the diagonal matrix to R ∈ SO(3) that is a rotation around
the x3 axis. It is a counterclockwise rotation by φ in the x1 − x2 plane with the orientation
dx1 ∧ dx2 .
Another obvious subgroup of SU (2) is SO(2), the real unitary matrices. We parametrize
the group by

R(θ/2) = \begin{pmatrix} \cos(θ/2) & -\sin(θ/2) \\ \sin(θ/2) & \cos(θ/2) \end{pmatrix} = e^{-i\frac{θ}{2}σ^2}            (10.54)

Of course, the eigenvalues are e^{±iθ/2} and indeed

S R(θ/2) S^{-1} = e^{-i\frac{θ}{2}σ^3}                                       (10.55)

where

S = \frac{1}{\sqrt{2}}(1 - iσ^1) ∈ SU(2)                                     (10.56)

(all you have to check is Sσ 2 S −1 = σ 3 . ) We find that
 
π(R(θ/2)) = \begin{pmatrix} \cos θ & 0 & \sin θ \\ 0 & 1 & 0 \\ -\sin θ & 0 & \cos θ \end{pmatrix}                       (10.57)

is rotation by θ around the x2 axis. By the Euler angle parametrization we therefore learn
that π is onto. In fact, we can parametrize all SU (2) elements by
u = e^{φT^3} e^{θT^2} e^{ψT^3}                                               (10.58)

where
T^i = -\frac{i}{2} σ^i \qquad 1 ≤ i ≤ 3                                      (10.59)
The range of Euler angles that covers SO(3) once is 0 ≤ θ ≤ π with φ and ψ identified
modulo 2π. Because SU (2) is a double cover we should extend the range of φ or ψ by a
factor of 2 if we want to cover the group SU (2) once. For example, taking:

0≤θ≤π
φ ∼ φ + 2π (10.60)
ψ ∼ ψ + 4π

Then for generic SU (2) elements we will have a unique representation

u = e^{φT^3} e^{θT^2} e^{ψT^3} = \exp[-\tfrac{i}{2}φσ^3]\,\exp[-\tfrac{i}{2}θσ^2]\,\exp[-\tfrac{i}{2}ψσ^3]
  = \begin{pmatrix} α & β \\ -\bar β & \bar α \end{pmatrix}                  (10.61)

with

α = e^{-\frac{i}{2}(φ+ψ)} \cos(θ/2) \qquad β = -e^{-\frac{i}{2}(φ-ψ)} \sin(θ/2)              (10.62)
The Euler angle coordinates on SU (2) break down at θ = 0, π. At θ = 0 the product only
depends on (φ + ψ) even though we have a three-dimensional manifold. Similarly at θ = π
the product only depends on (φ − ψ).
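The Euler-angle formula (10.61)-(10.62) and the 4π periodicity in ψ can be verified numerically. The sketch below is an illustration added to these notes (it assumes numpy and uses the identity exp(-i x σ/2) = cos(x/2) - i sin(x/2) σ, valid for any single Pauli matrix σ, to build the exponentials in the conventions of (10.59)).

import numpy as np

s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def exp_pauli(x, sigma):
    # exp(-i x sigma / 2) = cos(x/2) 1 - i sin(x/2) sigma, since sigma^2 = 1
    return np.cos(x / 2) * np.eye(2) - 1j * np.sin(x / 2) * sigma

phi, theta, psi = 0.7, 1.9, -2.3                    # arbitrary test angles
u = exp_pauli(phi, s3) @ exp_pauli(theta, s2) @ exp_pauli(psi, s3)

alpha = np.exp(-1j * (phi + psi) / 2) * np.cos(theta / 2)
beta = -np.exp(-1j * (phi - psi) / 2) * np.sin(theta / 2)
expected = np.array([[alpha, beta], [-np.conj(beta), np.conj(alpha)]])
assert np.allclose(u, expected)                     # matches (10.61)-(10.62)
assert np.allclose(exp_pauli(psi + 2 * np.pi, s3), -exp_pauli(psi, s3))   # psi -> psi + 2 pi gives u -> -u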
Remark: As we will discuss later, a good parametrization near the identity would be

u = exp[θk T k ] (10.63)

where we are exponentiating the general element of the Lie algebra su(2)

Exercise Simple Identities For ~x · ~σ


a.) Prove (10.34)
b.) Prove (10.35)
c.) Show that both these formulae imply that ~x2 = ~y 2 .

Exercise Polar Angle Decomposition Of SU (2)
a.) Prove that every element of SU (2) can be written in the form of (10.49).
b.) Express α, β in terms of χ and n̂.
c.) The coordinates χ, n̂ cover a product of an interval and a two-dimensional sphere.
Prove that [0, π]×S 2 is not topologically the same as SU (2). Where does the χ, n̂ coordinate
system go bad?

Exercise A Basis For The Lie Algebra su(2)


a.) Show that every traceless anti-Hermitian 2 × 2 matrix is a real linear combination
of the three matrices T i defined in (10.59).
b.) Show that the matrix commutators satisfy

[T^i, T^j] = ε^{ijk} T^k                                                     (10.64)

with the convention ε^{123} = +1.

Exercise
Show that in the Euler angle parametrization the shift

ψ → ψ + 2π (10.65)

takes u → −u.

11. Some Representation Theory

One of the main motivations from physics for studying representation theory stems from
Wigner’s theorem discussed below. The basic upshot of Wigner’s theorem is that, in
quantum mechanics, if G is a group of symmetries of a physical system then the Hilbert
space of the theory will be a representation space of G and in fact will define a unitary
representation: 100 For every symmetry operation g ∈ G there is a unitary operator U (g)
acting on the Hilbert space H so that

U (g1 )U (g2 ) = U (g1 g2 ) (11.1)

The use of representations can be very powerful and has far-reaching consequences. A
few examples:

100
We will need to amend this in two ways to be completely accurate: First, classical symmetries in
general are represented projectively on a quantum Hilbert space. Second, one must allow symmetries to be
represented by both unitary and anti-unitary operators in general.

Figure 28: Roots of unity on the unit circle in the complex plane. Here ω = e2πi/8 is a primitive
eighth root of 1.

1. The use of representation theory greatly aids in the diagonalization of physical ob-
servables, such as Hamiltonians.

2. Quantum states can be classified according to their symmetry types. This has im-
portant applications to selection rules governing what kind of transition amplitudes
can be nonzero.

3. Conservation laws are associated with symmetry operations.

4. The very formulation of Lagrangians and actions makes heavy use of representation
theory. For example, in relativistically invariant field theory the fields form a rep-
resentation of the Poincaré group (induced from the action on spacetime) and one
wishes to make a Lorentz invariant density when forming a Lagrangian.

11.1 Some Basic Definitions

Let V be a vector space over a field κ and recall that GL(V ) denotes the group of all
invertible linear transformations V → V . It is also denoted as Aut(V ) since it is the group
of linear automorphisms of V with itself.

Definition 11.1.1 A representation of G is a group homomorphism from G to a group of


the form GL(V ) = Aut(V ) where V is a vector space over a field κ:

T : g 7→ T (g)
(11.2)
G → GL(V )

V is sometimes called the representation space or the carrier space. We will abbreviate the
action of T (g) on a vector v ∈ V from T (g)(v) to T (g)v for readability.
Put differently, in terms of group actions, a representation of G is a G action on a
vector space that respects the linear structure:

g · (α1 v1 + α2 v2 ) = α1 g · v1 + α2 g · v2 (11.3)

where v1 , v2 ∈ V are any vectors and α1 , α2 ∈ κ are any scalars.

If the vector space has an ordered basis then we get a matrix representation. For
example, if V is finite-dimensional then we can choose an ordered basis {v1 , . . . , vn } and
the corresponding matrix representation g 7→ M (g) ∈ GL(n, κ) is defined by:
T(g) v_i = \sum_j M(g)_{ji} v_j                                              (11.4)

One easily checks that T (g1 ) ◦ T (g2 ) = T (g1 g2 ) implies M (g1 )M (g2 ) = M (g1 g2 ).
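As a concrete illustration (a small Python sketch added to these notes, not part of the formal development; it uses the permutation action of S_3 on the standard basis of κ^3, so T(g)v_i = v_{g(i)} and M(g)_{ji} = δ_{j, g(i)}), one can check the homomorphism property of the matrices directly.

import numpy as np
from itertools import permutations

n = 3
G = list(permutations(range(n)))

def M(g):
    # permutation matrices: T(g) e_i = e_{g(i)}, so M(g)_{g(i), i} = 1
    m = np.zeros((n, n))
    for i in range(n):
        m[g[i], i] = 1.0
    return m

def compose(g, h):                        # (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in range(n))

for g in G:
    for h in G:
        assert np.allclose(M(g) @ M(h), M(compose(g, h)))    # M(g1) M(g2) = M(g1 g2)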

1. As a simple example, take V = κ, the standard one-dimensional vector space over


κ and let T (g) = 1V for all g ∈ G. This representation is known as the trivial
representation.

2. We will often abbreviate “representation” to “rep.” Moreover we sometimes refer to
“the rep (V, T )” or simply by V , when we wish to stress the representation space. Or
we might refer to “the rep T ,” when the rest of the data is understood. ♣Should it be
(V, T ) or (T, V )? Depends on which is logically prior. Some would say V is defined by the
codomain of T . Others would say you first choose a V and then define a T . Need to be
consistent... ♣

3. The “dimension of the representation” is by definition the dimension dimV of the
vector space V . This number can be finite or infinite. Sometimes representations
are simply denoted by the dimension of the carrier space. This can be dangerous.
For example, there are p(n) inequivalent n-dimensional representations of SU (2),
although, as we will see, there is a unique irreducible n-dimensional representation
of SU (2) - up to isomorphism. See below.

4. The notion of representation can be generalized by replacing GL(V ) by GL(R) where


R is a ring. Then one often speaks of a module for G.

Definition 11.1.2. Let (V1 , T1 ) and (V2 , T2 ) be two representations of a group G. An


intertwiner between these representations is a linear transformation A : V1 → V2 such
that, for all g ∈ G the diagram

   V_1 ----A----> V_2
    |              |
    | T_1(g)       | T_2(g)
    v              v
   V_1 ----A----> V_2                                                        (11.5)

commutes. Equivalently,
T2 (g)A = AT1 (g) (11.6)

for all g ∈ G. Put differently: A is a morphism of G actions, and put yet another way: A
is an equivariant linear map of G spaces. We denote the vector space of all intertwiners by
HomG (V1 , V2 ).
Note that if an intertwiner A is invertible then A^{-1} is also an intertwiner. So we have:

Definition 11.1.3. Two representations T1 and T2 are equivalent T1 ∼


= T2 if there is an
intertwiner A : V1 → V2 which is an isomorphism. That is,

T2 (g) = AT1 (g)A−1 (11.7)

for all g ∈ G.

Examples:

1. The general linear group GL(n, κ) with κ = R, C always has a family of one-
dimensional real representations, labeled by µ ∈ C given by

T (g) := |det g|µ (11.8)

This is a representation because:

T (g1 g2 ) = |detg1 g2 |µ = |detg1 |µ |detg2 |µ = T (g1 )T (g2 ) (11.9)

Note that for different µ these are inequivalent representations.

2. We saw that the connected component of the Lorentz group in 1 + 1 dimensions is


isomorphic to R, SO0 (1, 1) ∼
= R with boosts B(θ1 )B(θ2 ) = B(θ1 + θ2 ) with θi ∈ IR.
The “spin-s” representation is

ρs (B(θ)) = esθ (11.10)

For different values of s these one-dimensional representations are inequivalent.

Familiar notions of linear algebra generalize to representations:

1. The direct sum ⊕ of representations. The direct sum of (T1 , V1 ) and (T2 , V2 ) is the
rep (T1 ⊕ T2 , V1 ⊕ V2 ) where the representation space is V1 ⊕ V2 and the operators
are:

((T1 ⊕ T2 )(g)) v1 ⊕ v2 := (T1 (g))(v1 ) ⊕ (T2 (g))(v2 ) (11.11)

If {v1 , . . . , vn } is an ordered basis for V1 and {w1 , . . . , wm } is an ordered basis for


V2 then {v1 , . . . , vn , w1 , . . . , wm } is an ordered basis for V1 ⊕ V2 and relative to this
ordered basis the matrix representation is block diagonal:
!
MT1 (g) 0
MT1 ⊕T2 (g) = (11.12)
0 MT2 (g)

2. Similarly, for the tensor product, the carrier space is V1 ⊗ V2 . See Chapter 2 for a
proper definition of the tensor product. For finite-dimensional vector spaces we can
say that if {v1 , . . . , vn } is a basis for V1 and {w1 , . . . , wm } is a basis for V2 then the
set of vectors of the form vi ⊗ wa form a basis for V1 ⊗ V2 and we impose the rules
(α1 v1 + α2 v2 ) ⊗ w = α1 v1 ⊗ w + α2 v2 ⊗ w (11.13)
v ⊗ (α1 w1 + α2 w2 ) = α1 v ⊗ w1 + α2 v ⊗ w2 (11.14)

The tensor product can be defined without reference to a basis (See Chapter 2) and
so can the tensor product of representations. We set:

((T1 ⊗ T2 )(g)) v ⊗ w := (T1 (g)v) ⊗ (T2 (g)w) (11.15)
for all v∈ V1 and w ∈ V2 and then extend by κ-linearity.
If {v1 , . . . , vn } is an ordered basis for V1 and {w1 , . . . , wm } is an ordered basis for V2
then the matrix elements of (T1 ⊗ T2 )(g) will be of the form
(M1 ⊗ M2 )(g)ia,jb = (M1 (g))ij (M2 (g))ab (11.16)
Note that while the set {vi ⊗ wa } forms a basis for V1 ⊗ V2 it does not in any natural
way define an ordered basis: One needs to make a further choice of how to order this
basis and there are several choices. Once one has done this, there will be an ordering
on pairs (i, a) and (11.16) will define the matrix relative to this ordered basis.

3. Given a representation of V we get a dual representation on the dual space V ∨ :=


Hom(V, κ) of linear maps V → κ. We do this by demanding that the natural pairing
between V and V ∨ is preserved by the group transformation:
hT ∨ (g)`, T (g)vi = h`, vi, (11.17)
where ` ∈ V ∨ , v ∈ V . If V is finite dimensional and {v1 , . . . , vn } is an ordered basis
then there is a natural ordered basis {v1∨ , . . . , vn∨ } for V ∨ and relative to these bases
we have
(MT ∨ (g)) = MT (g)tr,−1 (11.18)

4. As a corollary of the previous remark note that the vector space of linear transforma-
tions from V to W , denoted Hom(V, W ), is canonically isomorphic to V ∨ ⊗ W and
hence naturally becomes a representation of G. In concrete terms, if φ ∈ Hom(V, W )
is a linear transformation φ : V → W then the G-action on φ is determined by our
general remarks about induced actions on function spaces:
(T̃ (g) · φ)(v) := TW (g) · φ TV (g −1 )v

(11.19)

5. If V is a complex vector space then the complex conjugate representation sends


g → T̄ (g) ∈ GL(V̄ ). A real representation is one where (T̄ , V̄ ) is equivalent to
(T, V ). If {vi } is an ordered basis for V then there is a canonical ordered basis {v̄i }
for V̄ and the matrices are related by
MT̄ (g) = (MT (g))∗ (11.20)

Exercise New Matrix Representations From Old Ones
Given a matrix representation of a group g → M (g) show that
a.) g → (M (g))tr,−1 is also a representation.
b.) Check the claim (11.19) above.
c.) If M is a matrix representation in GL(n, C) then g 7→ M (g)∗ is also a representa-
tion.
d.) If T is a real representation, then there exists an S ∈ GL(n, C) such that for all
g ∈ G:
M ∗ (g) = SM (g)S −1 (11.21)
Warning: The matrix elements M (g)ij of a real representation might not be real
numbers:
e.) Show that the defining two-dimensional representation of SU (2) acting on C2 is a
real representation, but there is no basis in which the matrix elements are all real.
As we will explain later, real representations can be further distinguished as totally
real and quaternionic (a.k.a. pseudoreal).

Exercise Representation Matrices On Hom(V, W )


Let V = Cn and W = Cm equipped with their standard ordered bases ei , i = 1, . . . , n
and ea , a = 1, . . . , m, respectively. Identify Hom(V, W ) ∼
= Matm×n (C) and consider the
basis ea,i for Matm×n (C). Show that if V and W are representation spaces of G then the
representation on Hom(V, W ) satisfies:

T̃ (g)eai = Mba (g)(M (g)tr,−1 )ki ebk (11.22)

♣Some duplication
in this section with
previous material ♣
11.2 Characters
For any finite-dimensional representation T : G → Aut(V ) of any group G we can define
the character of the representation, denoted χT . It is a function on the group:

χT : G → κ (11.23)

and it is defined by
χT (g) := TrV (T (g)) (11.24)
Some useful general remarks about characters:

1. The character is independent of any choice of basis for V .

2. Equivalent representations define precisely the same character function.

3. χT (h−1 gh) = χT (g) for all g, h ∈ G. In other words, χT (g) only depends on g via its
conjugacy class. In general, a function F : G → C that only depends on conjugacy
class, that is, that satisfies F (h−1 gh) = F (g) for all g, h ∈ G is known as a class
function. Such functions “descend” to functions on the set of conjugacy classes of G.

4. χT1 ⊕T2 = χT1 + χT2

5. χT1 ⊗T2 = χT1 χT2
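These properties are easy to confirm numerically. The following Python sketch (illustrative only, not part of the original notes; it uses the S_3 permutation matrices as in the sketch above and numpy's kron for the tensor product) checks that χ is a class function and that it is additive under ⊕ and multiplicative under ⊗.

import numpy as np
from itertools import permutations

n = 3
G = list(permutations(range(n)))

def M(g):                                            # permutation representation of S_3
    m = np.zeros((n, n))
    for i in range(n):
        m[g[i], i] = 1.0
    return m

def compose(g, h):
    return tuple(g[h[x]] for x in range(n))

def inverse(g):
    r = [0] * n
    for i, gi in enumerate(g):
        r[gi] = i
    return tuple(r)

chi = {g: np.trace(M(g)) for g in G}

for g in G:
    for h in G:
        assert np.isclose(chi[compose(compose(inverse(h), g), h)], chi[g])   # chi is a class function
    A = M(g)
    B = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), A]])             # block form (11.12) of a direct sum
    assert np.isclose(np.trace(B), 2 * chi[g])                               # chi additive under direct sum
    assert np.isclose(np.trace(np.kron(A, A)), chi[g] ** 2)                  # chi multiplicative under tensor product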

Exercise Complex Conjugate Representation


Show that the fundamental representation of SU (2) is equivalent to its complex con-
jugate representation.
Show that the fundamental representation of SU (3) is not equivalent to its complex
conjugate representation

11.3 Unitary Representations


In physics, unitary representations play a distinguished role. The basic reason for this is
that symmetries should preserve probability amplitudes.

Definition 11.3.1. Let V be an inner product space. 101 A unitary representation is a


representation (V, T ) such that ∀g ∈ G, T (g) is a unitary operator on V , i.e.,

hT (g)v, T (g)vi = hv, vi ∀g ∈ G, v ∈ V (11.25)

1. The canonical representation of Sn on Rn or Cn is unitary.

2. The fundamental representations of SO(n) and U (n) are unitary.

Definition 11.3.2. If a rep (V, T ) is equivalent to a unitary rep then such a rep is said to
be unitarizable.

Example. A simple example of non-unitarizable reps are the detµ reps of GL(n, κ) with
κ = R, C and the “spin-s” representations ρs of the Lorentz group SO0 (1, 1) described in
section *****

Exercise
a.) Show that if T (g) is a rep on an inner product space then T (g −1 )† is a rep also.
101
See Chapter 2, section 12

b.) Suppose T : G → GL(V ) is a unitary rep on an inner product space V . Let {vi }
be an ordered orthonormal basis for V . Show that the corresponding matrix rep M (g)ij is
a unitary matrix rep. That is:
M : G → U (dimV ) (11.26)
is a homomorphism.
c.) Show that for a unitary matrix rep the transpose-inverse and complex conjugate
representations are equal.

Exercise Characters Of Unitarizable Representations


Show that if V (ρ) is unitarizable (in particular, if it is unitary), then

χρ (g −1 ) = χρ (g)∗ . (11.27)

11.4 Haar Measure, a.k.a. Invariant Integration


When proving facts about representations a very important tool is the notion of invariant
integration. In many situations we would like to consider a group to be a measure space
and give the average value of functions on the group.
If the group is finite then the obvious way to do that is:

f \mapsto \frac{1}{|G|} \sum_{g∈G} f(g) := \langle f \rangle                 (11.28)

We can write, very suggestively:

\frac{1}{|G|} \sum_{g} f(g) := \int_G f(g)\, dg                              (11.29)

The map f ↦ \int_G f(g)\, dg on complex-valued functions is clearly linear in the function
f . So, it defines an element of the dual space of the space of functions, i.e. a map from the
functions to C. There are lots of other elements of the dual space, such as evaluation on a
particular group element g0 ,
evg0 : f 7→ f (g0 ) (11.30)
As a measure this is i.e. the Dirac measure supported at g0 (a.k.a. the evaluation map at
g0 ). Of course we could take various linear combinations of functionals of the form evg0 to
get others. What is special about the measure (11.29) is that it satisfies the left invariance
property:

\int_G f(hg)\, dg = \int_G f(g)\, dg \qquad (11.31)
for all h ∈ G.

The left-action on G induces an action L∗h on the functions on G. Let L∗h (f ) denote
the function on G defined by:
(L∗h f )(g) := f (hg) (11.32)
Then hL∗h (f )i = hf i.
Note that, in this case of a finite group, the measure is also right-invariant:
\int_G f(gh)\, dg = \int_G f(g)\, dg \qquad (11.33)

For a finite group left-invariant and right-invariant measures are unique up to overall
scale. Indeed, the most general measure will be of the form
\sum_{g\in G} \rho(g)\, f(g) \qquad (11.34)

for some weight function ρ(g). Left invariance implies that


\sum_{g\in G}\rho(g)\, f(g) = \sum_{g\in G}\rho(g)\, f(hg) = \sum_{g\in G}\rho(h^{-1}g)\, f(g) \qquad (11.35)

Now apply the statement of left-invariance to the “Dirac function” at g0, a.k.a. the characteristic function at g0: 102

\delta_{g_0}(g) := \begin{cases} 1 & g = g_0 \\ 0 & \text{else} \end{cases} \qquad (11.36)

Then ρ(g0 ) = ρ(h−1 g0 ) for every g0 and every h. Therefore ρ(g) = c is just some constant
function on the group. It is now easy to check that the measure is also right-invariant.
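Here is a minimal numerical check of this in Python (assuming numpy; the group S3 and the random test function are just illustrative choices): the uniform average is invariant under both f(g) -> f(hg) and f(g) -> f(gh).

import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))

def compose(p, q):            # (p q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

rng = np.random.default_rng(0)
f = {g: rng.normal() for g in perms}          # an arbitrary function f : G -> R
avg = sum(f[g] for g in perms) / len(perms)   # <f> = (1/|G|) sum_g f(g)

for h in perms:
    left  = sum(f[compose(h, g)] for g in perms) / len(perms)   # average of f(hg)
    right = sum(f[compose(g, h)] for g in perms) / len(perms)   # average of f(gh)
    assert np.isclose(left, avg) and np.isclose(right, avg)
print("uniform average is left- and right-invariant:", avg)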
The idea of a left- or right-invariant measure extends to continuous groups. For exam-
ple for G = R the general measure is given by
\int_{-\infty}^{+\infty} f(x)\, \rho(x)\, dx \qquad (11.37)

for some measure ρ(x)dx. By a similar argument to the above we learn that ρ(x + y)dx =
ρ(x)dx for all x, y, and hence ρ(x) is a constant.

Remark: Haar’s Theorem: The existence of left and right-invariant measures on topo-
logical groups is very general. The topological group G must be “locally compact” and
“Hausdorff” - two mild topological conditions on G. 103
102
Do not confuse this function with the Dirac measure evg0 .
103
A topological space X is locally compact if, for every x ∈ X there is a compact neighborhood of x.
That is, there is a compact subspace K ⊂ X so that x ∈ U ⊂ K for some open neighborhood U of x. A
topological space X is Hausdorff if, for all pairs of distinct points x1 , x2 ∈ X there exist neighborhoods
x1 ∈ U1 and x2 ∈ U2 such that U1 ∩ U2 = ∅. In other words, open sets separate points.

Then one can define a set B of measurable subsets of G and a measure µ : B → R
is “left-invariant” if µ(gS) = µ(S) for all measurable subsets S ∈ B. There is a similar
definition of “right-invariant.” If one imposes a few more technical conditions then Haar’s
theorem states that such measures are unique up to multiplication by a positive constant.
Given a Haar measure one can define integrals of some class of functions, known as measurable functions. For example, for G = R clearly we can only integrate functions f such that

\int_{-\infty}^{+\infty} f(x)\, dx \qquad (11.38)

exists.
In general, the left- and right-invariant measures on a topological group need not
coincide, even up to scale. However, in the case of compact groups the left- and right-invariant
measures are unique and coincide up to scale. For compact Lie groups the
essential observation is that left-invariance shows the volume form must be proportional
to \langle g^{-1}dg \wedge \cdots \wedge g^{-1}dg, v\rangle where v is a volume form on \mathfrak{g}^{\vee}. 104

Examples:

1. G = R: The most general Haar measure is of the form:


\int_{G=\mathbb{R}} f(g)\, dg := c \int_{-\infty}^{+\infty} f(x)\, dx \qquad (11.39)

where c is a constant.

2. G = Z: The most general Haar measure is of the form:


\int_{G=\mathbb{Z}} f(g)\, dg := c \sum_{n\in\mathbb{Z}} f(n) \qquad (11.40)

where c is a constant.

3. Now let G = R∗>0 be the multiplicative group of positive real numbers. The most
general Haar measure is of the form
\int_{G=\mathbb{R}^*_{>0}} f(g)\, dg := c \int_0^{\infty} f(x)\, \frac{dx}{x} \qquad (11.41)

where c is a constant.
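As a quick numerical sanity check of this last statement (a Python sketch assuming numpy; the test function is an arbitrary illustrative choice), one can verify that the integral of f(ax) against dx/x is independent of the rescaling a:

import numpy as np

f = lambda x: np.exp(-(np.log(x))**2)     # any test function decaying in log(x)

t = np.linspace(-20.0, 20.0, 400001)      # substitute t = log(x), so dx/x = dt
dt = t[1] - t[0]

for a in [0.1, 1.0, 7.3]:                 # the rescaling x -> a x
    val = np.sum(f(a * np.exp(t))) * dt   # Riemann sum for the integral of f(a x) dx/x
    print(a, val)                         # the three values agree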
104
See Chapter **** for a discussion of the Maurer-Cartan form g −1 dg and the notion of left- and right-
invariant differential forms on Lie groups. Another, more sophisticated, but also more elegant proof was
explained to me by Dan Freed: Recall that for a finite dimensional real vector space V of dimension n
there is a GL(n, R) torsor B(V ) of the bases. Consider the character |det| on GL(n, R). There is a one-
dimensional line of equivariant functions B(V ) → R transforming according to this character. There is a
natural orientation on this line given by the positive functions and a nonzero function corresponds to a
measure on V . It is unique up to scalar multiplication by a positive constant. Now let V be the vector
space of left-invariant vector fields on G. A measure on V corresponds to a left-invariant measure on the
group. Now right translation acts by a positive scalar, and this gives a homomorphism G → R>0 . But G is
compact so the only possible homomorphism is the trivial one. Therefore the measure is also right-invariant.

4. Similarly, consider G = GL(n, R). We can take the matrix elements g_{ij} to be coordinates on the open domain \{g\,|\,{\rm det}\,g \neq 0\} \subset M_n(\mathbb{R}) \cong \mathbb{R}^{n^2}. The usual Euclidean measure \prod_{ij} dg_{ij} changes under g \to g_0 g by

\prod_{ij} dg_{ij} \to |{\rm det}\,g_0|^n \prod_{ij} dg_{ij} \qquad (11.42)

and so the most general Haar measure is of the form


\int_{G=GL(n,\mathbb{R})} f(g)\, dg := c \int_{{\rm det}\,g\neq 0} f(g)\, |{\rm det}\,g|^{-n} \prod_{ij} dg_{ij} \qquad (11.43)

where c is a constant.

5. G = U (1): Up to scale we have the Haar measure:


\int_{G=U(1)} f(g)\, dg := \frac{1}{2\pi i}\oint_{|z|=1} f(z)\, \frac{dz}{z} = \int_0^{2\pi} g(\theta)\, \frac{d\theta}{2\pi} \qquad (11.44)

where g(θ) = f (eiθ ) and the range of the last integral is over any interval of length
2π. Here the scale of the measure has been chosen so that the “volume” of the group
is 1.

6. For the important case of G = SU(2) we can write it as follows. First, every element of SU(2) can be written as:

g = \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \qquad (11.45)

for 2 complex numbers \alpha, \beta with

|\alpha|^2 + |\beta|^2 = 1. \qquad (11.46)

(Note that the definition of \beta here is backwards from the one used when we describe representations using homogeneous polynomials below.)

In this way we identify the group as a manifold as S 3 . That manifold has no globally
well-defined coordinate chart. The best we can do is define coordinates that cover
“most” of the group but will have singularities are some places. (It is always impor-
tant to be careful about those singularities when using explicit coordinates!) We can
always write:

α = ζ1 cos θ/2
(11.47)
β = ζ2 sin θ/2

where ζ1 , ζ2 are phases, and the magnitude is parametrized in a 1-1 fashion by taking
0 ≤ θ ≤ π. Next it is standard to parametrize the phases by:
1
α = ei 2 (ψ+φ) cos θ/2
1
(11.48)
β = iei 2 (ψ−φ) sin θ/2

The virtue of this definition is that we can then write:

g = \begin{pmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2}\end{pmatrix}
    \begin{pmatrix} \cos\theta/2 & i\sin\theta/2 \\ i\sin\theta/2 & \cos\theta/2 \end{pmatrix}
    \begin{pmatrix} e^{i\psi/2} & 0 \\ 0 & e^{-i\psi/2}\end{pmatrix}
  = e^{i\frac{1}{2}\phi\sigma^3}\, e^{i\frac{1}{2}\theta\sigma^1}\, e^{i\frac{1}{2}\psi\sigma^3} \qquad (11.49)
and under the standard homomorphism π : SU (2) → SO(3) the angles θ, φ, ψ become
the Euler angles.
We need to be a little careful about φ and ψ since they are not defined at θ = 0, π
and we also need to be careful about their ranges. The above expression is invariant
under the transformations:
(φ, ψ) → (φ + 4π, ψ)
(φ, ψ) → (φ, ψ + 4π) (11.50)
(φ, ψ) → (φ + 2π, ψ + 2π)

If we think of this as generating a group of transformations on R2 we can choose a


fundamental domain in various ways, and then taking (φ, ψ) to be in that fundamental
domain we will cover the group elements exactly once (away from θ = 0, π). One
standard fundamental domain is:
0 ≤ φ < 2π
(11.51)
0 ≤ ψ < 4π
This is good because if we then act on the unit vector e3 , (φ, θ) become the standard
angular coordinates on S 2 .
The normalized Haar measure for SU (2) in these coordinates is 105

[dg] = \frac{1}{16\pi^2}\, d\psi \wedge d\phi \wedge \sin\theta\, d\theta \qquad (11.52)
For much more about this, see Chapter 5 below.
We give formulae for the Haar measures on the classical compact matrix groups in
Chapter 6 below.

A consequence of the existence of invariant integration is that every finite-dimensional


representation is equivalent to a unitary representation. This follows because you can use
a suitable averaging procedure over the group to make the matrices unitary.

Proposition If (T, V ) is a rep of a compact group G and V is an inner product space then
(T, V ) is unitarizable.

Proof We make essential use of the Haar measure. If T is not already unitary with
respect to the inner product h·, ·i1 then we can define a new inner product by:
\langle v, w\rangle_2 := \int_G \langle T(g)v, T(g)w\rangle_1\, dg \qquad (11.53)

105
We have chosen an orientation so that, up to a positive constant, this is {\rm Tr}_2\big((g^{-1}dg)^3\big).

For a compact group hT (g)v, T (g)wi1 will be a nice continuous function, hence bounded,
and therefore integrable. If v 6= 0 then T (g)v 6= 0 hence hT (g)v, T (g)vi1 > 0 and therefore
0 < hv, vi2 < ∞. Here we used the fact that for a compact group the volume is finite. So
h·, ·i2 will be a good inner product. Then using the properties of the Haar measure it is
easily checked that
hT (g)v, T (g)wi2 = hv, wi2 (11.54)
and hence T (g) is unitary w.r.t. the inner product h·, ·i2 ♠
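As a concrete instance of this averaging argument, here is a short Python sketch (assuming numpy; the example chosen is the 2-dimensional representation of S3 in the non-orthonormal basis u1 = e1 - e2, u2 = e2 - e3 used below in (11.93)). It builds the averaged inner product and a change of basis in which the representation matrices become orthogonal.

import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))

def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1.0
    return M

U = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])   # columns u1, u2 inside R^3
Uinv = np.linalg.pinv(U)                               # left inverse on span(U)
M = {p: Uinv @ perm_matrix(p) @ U for p in perms}      # 2x2 matrices in the basis u1, u2

# averaged inner product  <v,w>_2 = v^T B w  with  B = (1/|G|) sum_g M(g)^T M(g)
B = sum(M[p].T @ M[p] for p in perms) / len(perms)
assert all(np.allclose(M[p].T @ B @ M[p], B) for p in perms)   # B is G-invariant

# change basis by S with S^T S = B; then S M(g) S^{-1} is orthogonal for every g
S = np.linalg.cholesky(B).T
for p in perms:
    N = S @ M[p] @ np.linalg.inv(S)
    assert np.allclose(N.T @ N, np.eye(2))
print("the averaged inner product makes the representation unitary")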

Remark: We saw that the representations detµ of GL(n, R), GL(n, C) and ρs of SO0 (1, 1)
are not unitarizable. What fails in the above argument is the infinite volume of these
noncompact groups.

Exercise Due Diligence


Show that
hT (g)v, T (g)wi2 = hv, wi2 (11.55)

Exercise Computation For SU (2)


Using the parametrization by Euler angles show that

\int_{SU(2)} g_{\alpha\beta}\, dg = 0 \qquad (11.56)

\int_{SU(2)} g_{\alpha\beta}\, g_{\gamma\delta}\, dg = \frac{1}{2}\,\epsilon_{\alpha\gamma}\,\epsilon_{\beta\delta} \qquad (11.57)
We will later interpret these equations as special cases of the orthogonality relations of
matrix elements in irreducible representations. (See section on the Peter-Weyl theorem
below.)
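A quick Monte Carlo check of these integrals can be done in Python (assuming numpy; the sample size is an arbitrary choice). Normalizing a Gaussian 4-vector gives (α, β) uniformly distributed on S^3, i.e. Haar-distributed group elements in the parametrization (11.45); this sidesteps the Euler-angle computation asked for in the exercise but confirms the answer numerically.

import numpy as np

rng = np.random.default_rng(1)
N = 200000
z = rng.normal(size=(N, 4))
z /= np.linalg.norm(z, axis=1, keepdims=True)
alpha = z[:, 0] + 1j * z[:, 1]
beta  = z[:, 2] + 1j * z[:, 3]

# g = [[alpha, -conj(beta)], [beta, conj(alpha)]], stacked over the N samples
g = np.stack([np.stack([alpha, -np.conj(beta)], axis=-1),
              np.stack([beta,  np.conj(alpha)], axis=-1)], axis=-2)

print(np.mean(g, axis=0))                   # ~ 0, as in (11.56)
# <g_{11} g_{22}> should be 1/2 * eps_{12} eps_{12} = +1/2,
# <g_{12} g_{21}> should be 1/2 * eps_{12} eps_{21} = -1/2
print(np.mean(g[:, 0, 0] * g[:, 1, 1]), np.mean(g[:, 0, 1] * g[:, 1, 0]))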

11.5 The Regular Representation


Let G be a group. Then there is a left action of G × G on G: (g1 , g2 ) 7→ L(g1 )R(g2−1 ):

(g1 , g2 ) · g0 = g1 g0 g2−1 (11.58)

and hence an induced action on Map(G, Y ) for any Y . Now let Y = C. Then Map(G, C)
is a representation of G × G because the induced left-action:

((g1 , g2 ) · Ψ) (h) := Ψ(g1−1 hg2 ) (11.59)

converts the vector space of functions Ψ : G → C into a representation space for G × G.
If we equip G with a Haar measure then we can speak of L2 (G), namely the Hilbert
space based on the complex-valued functions such that
\langle f, f\rangle := \int_G |f(g)|^2\, dg < \infty \qquad (11.60)

Note that the G × G action preserves the L2 -property thanks to left- and right- invariance,
and in fact the G × G action is unitary.

Definition The representation L2 (G) is known as the regular representation of G.


Note that L2 (G) is a representation of G × G although by restriction to the natural
subgroups G × {1} and {1 } × G it becomes a representation of G. So sometimes people
speak of “the regular representation of G.” More precisely, if we restrict to operations of
the form:
(L(h) · Ψ)(g) := Ψ(h−1 g) (11.61)

We have the left regular representation, while

(R(h) · Ψ)(g) := Ψ(gh) (11.62)

defines the right regular representation. Both actions are left-actions on the function space, so the terminology is slightly confusing.
Suppose that (T, V ) is a representation of G. As we explained above the vector space
of linear transformations End(V ) := Hom(V, V ) of V to itself is then also a representation.
In fact, it is a representation of G × G because if S ∈ End(V ) then we can define a linear
left-action of G × G on End(V ) by:

(g1 , g2 ) · S := T (g1 ) ◦ S ◦ T (g2 )−1 (11.63)

Now, we have two representations of G × G. They are related as follows: If V is


finite-dimensional we have a map

ι : End(V ) → L2 (G) (11.64)

The map ι takes a linear transformation S : V → V to the complex-valued function


ΨS : G → C defined by
ΨS (g) := TrV (ST (g −1 )) (11.65)

That is
ι(S) := ΨS (11.66)

We claim that ι is a G × G-equivariant map:

(h1 , h2 ) · ΨS = Ψ(h1 ,h2 )·S (11.67)

You are asked to prove this in an exercise below. Put differently, denoting by TEnd(V ) the
representation of G×G on End(V ) and TReg.Rep. the representation of G×G on Map(G, C)
we get a commutative diagram:

\begin{array}{ccc}
{\rm End}(V) & \xrightarrow{\ \iota\ } & {\rm Map}(G,\mathbb{C}) \\
\big\downarrow {\scriptstyle T_{{\rm End}(V)}} & & \big\downarrow {\scriptstyle T_{\rm Reg.Rep.}} \\
{\rm End}(V) & \xrightarrow{\ \iota\ } & {\rm Map}(G,\mathbb{C})
\end{array} \qquad (11.68)

If we choose an ordered basis {vi } for V then the operators T (g) are represented by
matrices:

T(g)\cdot v_i = \sum_j M(g)_{ji}\, v_j \qquad (11.69)
If we take S = eij to be the matrix unit in this basis then ΨS is the function on G given
by the matrix element M (g −1 )ji = M tr,−1 (g)ij . So the ΨS ’s are linear combinations of
matrix elements of the representation matrices of G. (Replacing V by its dual V ∨ we will
get the representation matrices M (g)ij .) The advantage of (11.65) is that it is completely
canonical and basis-independent.
See section 11.9 below for more about the regular representation.

Exercise Due Diligence


Prove equation (11.67). 106

Exercise
a.) Let δ0 , δ1 , δ2 be a basis of functions in the regular representation of Z3 which are
1 on 1, ω, ω 2 , respectively, and zero elsewhere. Show that ω is represented as
 
L(\omega) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \qquad (11.71)
b.) Show that
L(h) · δg = δh·g
(11.72)
R(h) · δg = δg·h−1
106
Answer: This is a straightforward computation:

\begin{aligned}
\left((h_1,h_2)\cdot \Psi_S\right)(g) &= \Psi_S(h_1^{-1} g h_2) \\
&= {\rm Tr}_V\!\left(S\, T(h_2^{-1} g^{-1} h_1)\right) \\
&= {\rm Tr}_V\!\left(T(h_1)\, S\, T(h_2^{-1})\, T(g^{-1})\right) \\
&= {\rm Tr}_V\!\left(\left((h_1,h_2)\cdot S\right)\, T(g^{-1})\right) \\
&= \Psi_{(h_1,h_2)\cdot S}(g)
\end{aligned} \qquad (11.70)
and conclude that for the left, or right, regular representation of a finite group the representation matrices in the δ-function basis are permutation matrices.
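Here is a small Python sketch of this exercise for Z/3 (assuming numpy; note that whether one obtains the matrix displayed in (11.71) or its transpose depends on the row/column convention used to write the matrix of an operator, so the code below only insists that the result is a cyclic permutation matrix and a homomorphism).

import numpy as np

n = 3
def L(k):
    # the operator delta_j -> delta_{j+k mod n}, written as a matrix in the basis delta_0, delta_1, delta_2
    M = np.zeros((n, n))
    for j in range(n):
        M[(k + j) % n, j] = 1.0
    return M

print(L(1))     # a cyclic permutation matrix, cf. (11.71) up to transpose/convention
for a in range(n):
    for b in range(n):
        assert np.allclose(L(a) @ L(b), L((a + b) % n))   # L is a homomorphism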

11.6 Reducible And Irreducible Representations


11.6.1 Definitions
In general, representations of a group on a vector space of “large dimension” are harder to
understand and work with than representations on a vector space of “small dimension.”
So, presenting a representation as a direct sum of smaller ones is often a very useful
simplification. In terms of matrices it corresponds to block diagonalization. We now
investigate how that can be done systematically.

Figure 29: T (g) preserves the subspace W .

Definition. Let W ⊂ V be a linear subspace of the carrier space V of a group represen-


tation T : G → GL(V ). Then W is invariant under T , a.k.a. an invariant subspace if
∀g ∈ G, w ∈ W
T (g)w ∈ W (11.73)

This may be pictured as in 29.

Examples

1. Both {~0} and V are always invariant subspaces.

2. Consider the three-dimensional representation of SO(2) as rotations around the z-


axis. Then the vector subspace of the xy plane at z = 0 is an invariant subspace.
Note that the parallel planes at z = z0 6= 0 are invariant under the group action but
are not linear subspaces.

3. Consider the canonical representation of Sn on κn . The line through the all ones
vector is an invariant subspace.

4. We saw above that spaces of functions of the form (11.65) for a fixed V define invariant
subspaces of the regular representation with action (11.59). Let us write this out in
more detail.
Let M : G → GL(n, κ) be any matrix n-dimensional representation of G. For a
fixed i, j consider the matrix element Mij as a κ-valued function on G: So Mij is the

function on G whose value at g ∈ G is just M (g)ij ∈ κ. Now consider the linear span
of functions where we fix i:

Ri := Span{Mij }j=1,...,n (11.74)

We claim this is an invariant subspace of L2 (G) in the right regular representation:


To see this, just check:

(R(g) · Mij )(h) = Mij (hg)


n
X (11.75)
= Mis (h)Msj (g)
s=1

which is equivalent to the equation on functions:


R(g)\cdot \underbrace{M_{ij}}_{\text{function on } G} \;=\; \sum_{s=1}^{n} \underbrace{M_{sj}(g)}_{\text{matrix element for } R}\; \underbrace{M_{is}}_{\text{function on } G} \qquad (11.76)

Similarly, suppose we fix j and consider:

Lj := Span{Mij }i=1,...,n (11.77)

is an invariant subspace under the left-regular representation of G. Putting these


together we see that the space

LR := Span{Mij }i,j=1,...,n (11.78)

is an invariant subspace under the action of G × G under the action (11.59) on L2 (G).

Remarks:

1. If (V, T ) is a rep and W ⊂ V is an invariant subspace we can define a smaller group


representation (T, W ) called restriction of T to W . We also say the (T, W ) is a
subrepresentation of (V, T ). Strictly speaking we should write T |W but we will not
write that out.

2. If T is unitary on V then it is unitary on W .

Definition. A representation T is called reducible if there is an invariant subspace W ⊂ V ,


under T , which is nontrivial, i.e., such that W 6= 0, V , If V is not reducible we say V is
irreducible. That is, in an irreducible rep, the only invariant subspaces are {~0} and V . We
often shorten the term “irreducible representation” to “irrep.”

Remarks:

1. Given any nonzero vector v ∈ V , the linear span of {T (g)v}g∈G is an invariant sub-
space. In an irrep this will span all of V . Such a vector is called a cyclic vector.
Caution!! The existence of a cyclic vector does not imply the representation is irre-
ducible. Consider the vector e1 in the permutation representation of Sn on Rn .

2. Suppose (T, W ) is a subrepresentation of (V, T ). Choose an ordered basis

{w1 , . . . , wk } (11.79)

for W . Then it can be completed to an ordered basis

{w1 , . . . , wk , uk+1 , . . . , un } (11.80)

for V . Let us write wi , i = 1, . . . , k for the basis vectors for W and ua , a = k+1, . . . , n
for our choice of a set of complementary ordered basis vectors for V . The matrix
representation associated with such a choice of basis is defined by:

T (g)(wi ) = (M11 (g))ji wj + (M12 (g))ai ua


(11.81)
T (g)(ua ) = (M21 (g))ja wj + (M22 (g))ba ub

where

M_{11}(g) \in {\rm Mat}_{k\times k}, \qquad M_{21}(g) \in {\rm Mat}_{(n-k)\times k}, \qquad M_{22}(g) \in {\rm Mat}_{(n-k)\times(n-k)} \qquad (11.82)

Because W is an invariant subspace we have M12 = 0 so the matrix representation


looks like:

M(g) = \begin{pmatrix} M_{11}(g) & 0 \\ M_{21}(g) & M_{22}(g) \end{pmatrix} \qquad (11.83)
Since M (g1 )M (g2 ) = M (g1 g2 ) it follows that M11 gives a matrix representation on
W.

3. If W ⊂ V is an invariant subspace then the quotient vector space (see Chapter two)
V /W is a representation of G in a natural way:

T (g)(v + W ) := T (g)(v) + W (11.84)

The reader can check this is well-defined. The vectors ua above define a basis for
V /W of the form ua + W , and relative to this basis the representation will look like
M22 .

Definition. A representation T is called completely reducible if it is isomorphic to a direct


sum of representations:
W1 ⊕ · · · ⊕ W n (11.85)
where the Wi are irreducible reps. Thus, there is a basis in which the matrices look like:

M(g) = \begin{pmatrix}
M_{11}(g) & 0 & 0 & \cdots \\
0 & M_{22}(g) & 0 & \cdots \\
0 & 0 & M_{33}(g) & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{pmatrix} \qquad (11.86)

Examples

1. Let G = U (1). Then, for any n ∈ Z we define the one-dimensional representation ρn


by
ρn (z) = z n (11.87)

for z ∈ U (1). This is clearly an irreducible representation. We will argue below that
these are the only irreducible representations.

2. Finite-dimensional representations of Abelian groups are completely reducible. Choos-


ing an ordered ON basis the matrices M (g), g ∈ G are commuting unitary matrices
and, by the spectral theorem, can be simultaneously diagonalized. For example, a fd
unitary representation of U (1) will be a family of commuting matrices and we can
choose a basis so that
M (z) = Diag{z n1 , . . . , z nd } (11.88)

so if V \cong \mathbb{C}^d is the carrier space we would write

V \cong \rho_{n_1} \oplus \cdots \oplus \rho_{n_d} \qquad (11.89)

3. G = Z2
1 \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (12) \to \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (11.90)

is a 2-dimensional reducible rep on R2 because W = {(x, x)} ⊂ R2 is a nontrivial


invariant subspace. Indeed, σ 1 is diagonalizable, so this rep is equivalent to
1 \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (12) \to \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (11.91)

which is a direct sum of two 1 × 1 reps.

4. Now consider the nonabelian group S3 . There is a natural 3-dimensional permutation


representation of S3 we defined above: T (σ)ei = eσ(i) where ei is the standard basis

for R3 . As we have noted, the all ones vector u0 := e1 + e2 + e3 spans an invariant
subspace L ⊂ R3 . We can choose basis vectors for the orthogonal complement

u1 := e1 − e2
(11.92)
u2 := e2 − e3

and the subspace spanned by u1 , u2 is also an invariant subspace. In fact, one easily
computes that relative to this basis:
M((12)) = \begin{pmatrix} -1 & 1 \\ 0 & 1 \end{pmatrix} \quad
M((23)) = \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix} \quad
M((13)) = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}

M((123)) = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \quad
M((132)) = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix} \qquad (11.93)

Note that this representation is clearly irreducible: Since it is a representation of


a finite group it is completely reducible (see below). The reduction would have to
give diagonal matrices. But diagonal matrices commute. The above matrices do not
commute since T (12)T (23) = T (123) is not the same as T (23)T (12) = T (132).
Note that the basis u1 , u2 is not an ON basis and the above matrices are not unitary
matrices, even though this is a unitary representation. One could use the averaging
procedure above to make a unitary representation, although this would be tedious.
A better way to proceed is to note that the reflections and rotations in the plane that
preserve an equilateral triangle form the group S3 . (We will discuss this in much
more detail later.) If we label the vertices 123 in counter-clockwise order with vertex
3 on the y-axis, as in figure 35 then:
M((12)) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (11.94)

M((13)) = \begin{pmatrix} -\tfrac{1}{2} & -\tfrac{\sqrt{3}}{2} \\ -\tfrac{\sqrt{3}}{2} & \tfrac{1}{2} \end{pmatrix} \qquad (11.95)

M((23)) = \begin{pmatrix} -\tfrac{1}{2} & \tfrac{\sqrt{3}}{2} \\ \tfrac{\sqrt{3}}{2} & \tfrac{1}{2} \end{pmatrix} \qquad (11.96)

M((123)) = R\!\left(\tfrac{2\pi}{3}\right) = \begin{pmatrix} -\tfrac{1}{2} & -\tfrac{\sqrt{3}}{2} \\ \tfrac{\sqrt{3}}{2} & -\tfrac{1}{2} \end{pmatrix} \qquad (11.97)

M((132)) = R\!\left(-\tfrac{2\pi}{3}\right) = \begin{pmatrix} -\tfrac{1}{2} & \tfrac{\sqrt{3}}{2} \\ -\tfrac{\sqrt{3}}{2} & -\tfrac{1}{2} \end{pmatrix} \qquad (11.98)

defines a unitary representation.

5. Consider the representation of Sn on Rn . Then the one-dimensional subspace L =
{(x, · · · , x)} is a subrepresentation. Moreover we can take
L^{\perp} = \{(x_1, \cdots, x_n)\ |\ \textstyle\sum_i x_i = 0\}. \qquad (11.99)

Then L⊥ is an (n − 1)-dimensional representation of Sn and the representation is


equivalent to a direct sum L ⊕ L⊥ . So, if we choose a basis of the all ones vector and
an orthogonal basis for L⊥ the matrices will be block diagonal. It is not obvious, but
it does follow from the general representation theory of the symmetric group that the
representation L⊥ of dimension (n − 1) is irreducible.

11.6.2 Reducible vs. Completely reducible representations


Irreps are the “atoms” out of which all reps are made. Thus we are naturally led to study
the irreducible reps of G. In real life it can and does actually happen that a group G
has representations which are reducible but not completely reducible. Reducible, but not
completely reducible reps are sometimes called indecomposable.

Example 1 An example of a reducible, but not completely reducible (i.e. indecomposable), rep of GL(n, R) is:

A \to \begin{pmatrix} 1 & \log|{\rm det}A| \\ 0 & 1 \end{pmatrix} \qquad (11.100)

Example 2 Similarly, we can write an indecomposable representation of the connected


component of the identity of the 1 + 1 dimensional Lorentz group. If B(η) ∈ SO0 (1, 1) is
the boost of rapidity η, with η ∈ R, then:
T(B(\eta)) = \begin{pmatrix} 1 & \eta \\ 0 & 1 \end{pmatrix} \qquad (11.101)

Example 3 Let G = GL(n, κ) and H = Matn (κ). We can define a group which, as a set
is H × G, but it has a “twisted” group multiplication law:

m((h1 , g1 ), (h2 , g2 )) := (h1 + g1 h2 g1tr , g1 g2 ) (11.102)

One can check that this really does define a group structure. It is a special case of the
semi-direct product structure that we will study in more detail in Section **** . Now check
that the following is a matrix representation:
T(h, g) := \begin{pmatrix} g & h\, g^{tr,-1} \\ 0 & g^{tr,-1} \end{pmatrix} \qquad (11.103)

Because of the upper-triangular nature of the matrices we cannot simultaneously block-


diagonalize all the matrices T (h, g) but there is a nontrivial invariant subspace. This kind

Figure 30: The orthogonal complement of an invariant subspace is an invariant subspace.

of construction gives the finite-dimensional indecomposable representations of the Poincaré


group, the Euclidean group, and its crystallographic subgroups.

It is useful to have criteria for when this complication cannot occur:

Proposition 11.6.2.1 Suppose that the representation (V, T ) is a unitary representation


on an inner product space V and W ⊂ V is an invariant subspace then W ⊥ is an invariant
subspace:
Proof : Recall that

y ∈ W ⊥ ⇔ ∀x ∈ W, hy, xi = 0 (11.104)

Let g ∈ G, y ∈ W ⊥ . Compute

\langle T(g)y, x\rangle = \langle y, T(g)^{\dagger} x\rangle = \langle y, T(g^{-1})x\rangle \qquad (11.105)

But, T (g −1 )x ∈ W , since W ⊂ V is an invariant subspace. Therefore: ∀x ∈ W ,


\langle T(g)y, x\rangle = 0. Therefore T(g)y \in W^{\perp}. Therefore W^{\perp} is an invariant subspace. ♠
Therefore:

1. Finite dimensional unitary representations are always completely reducible. This fol-
lows from the above proposition together with induction on the dimension.

2. Finite dimensional representations of compact groups are always completely reducible.

3. In particular, for a finite group the regular representation L2 (G) is completely re-
ducible as a representation of G or of G × G.

4. The complete reduction L2 (G) is given by the Peter-Weyl theorem described below.
The beautiful aspect of this theorem is that it holds not just for finite groups but for
all compact groups.

Isotypical Components: For each isomorphism class of irreducible representation of
G choose a representative (T (µ) , V (µ) ) where µ runs over the set of distinct irreducible
representations. If V is a completely decomposable representation then we can write

V \cong \oplus_{\mu} \oplus_{i=1}^{a_\mu} V^{(\mu)} \qquad (11.106)

where V (µ) is the carrier space of an irreducible representation of G, we are summing over all
irreps in (11.106), and aµ is the number of times that irrep appears in the decomposition.
(We understand that if a particular irrep does not appear at all then we take aµ = 0.)
When aµ 6= 0 we can write

\oplus_{i=1}^{a_\mu} V^{(\mu)} \cong \kappa^{a_\mu} \otimes_{\kappa} V^{(\mu)} \qquad (11.107)
as vector spaces, where κ is the ground field. We can moreover interpret this as an isomorphism of representations, where T(g) acts as the identity operator on \kappa^{a_\mu}. With this understood, the summand \kappa^{a_\mu} \otimes_\kappa V^{(\mu)} is called the isotypical component of V belonging to µ. If we abbreviate \kappa^{a_\mu} \otimes V^{(\mu)} to a_\mu V^{(\mu)} then the decomposition into
isotypical components can be written as:

V = ⊕aµ V (µ) (11.108)

11.7 Schur’s Lemmas


A very important remark about equivariant maps between irreducible representations is
known as Schur’s lemma. It is almost a tautology - but it is a very powerful tautology.

Schur’s Lemma: Let V1 , V2 be vector spaces over any field κ such that they are
carrier spaces of irreducible representations of any group G. If A : V1 → V2 is an
intertwiner between these two irreps then A is either zero or an isomorphism of
representations.

Proof : Note that the kernel and image of A are invariant subspaces of V1 and V2 , respec-
tively. Recall these are defined by:

kerA := {v1 ∈ V1 |A(v1 ) = 0} (11.109)

imA := {v2 ∈ V2 |∃v1 ∈ V1 v2 = A(v1 )} (11.110)


The reader should check that these are linear subspaces because A is linear and they are
invariant subspaces because A is an intertwiner.
Now, since V1 is an irrep we conclude that kerA is either the 0 vector space or all of
V1 . Similarly, since V2 is an irrep imA is either 0 or the entire space V2 . Now, if kerA = V1
then A = 0. If A 6= 0 then kerA 6= V1 so therefore kerA = 0, therefore A is injective.
Moreover if A 6= 0 then there is a nonzero vector in imA, and therefore imA = V2 , so A is
surjective. Therefore, A is an isomorphism. ♠

Note that if V1 = V2 = V is an irrep then the set of intertwiners V → V is not only a
linear space but is also an algebra (see Chapter 2). Namely, A1 ◦ A2 is also an intertwiner.
In the case of the algebra of self-intertwiners of a representation one can say more. But
now the choice of field κ becomes important.

Theorem: Suppose (V, T ) is an irreducible representation of a group G and V is a


complex vector space. Suppose A : V → V is an intertwiner, i.e., A commutes with
T (g) for all g ∈ G. Then A is proportional to the identity transformation: There
exists a scalar λ ∈ C such that for all v ∈ V , A(v) = λv.

Proof : Since we are working over the complex field A has a nonzero eigenvector Av = λv.
That follows because the characteristic polynomial pA (x) = det(x1 − A) is a polynomial in
the complex field and has a root in the complex numbers. The eigenspace C = {w : Aw =
λw} is therefore not the zero vector space. But it is also an invariant subspace. Therefore,
it must be the entire carrier space. ♠.

Remarks:

1. Degeneracy Spaces As Spaces Of Intertwiners. Let us return to the isotypical decom-


position of a completely reducible representation, (11.106). Let HomG (V1 , V2 ) denote
the vector space of G-equivariant maps between two G-spaces V1 , V2 .

\begin{aligned}
{\rm Hom}_G(V^{(\mu)}, V) &\cong \oplus_{\nu}\, {\rm Hom}_G(V^{(\mu)}, \kappa^{a_\nu} \otimes V^{(\nu)}) \\
&= \oplus_{\nu}\, \kappa^{a_\nu} \otimes {\rm Hom}_G(V^{(\mu)}, V^{(\nu)}) \\
&= \kappa^{a_\mu} \otimes {\rm Hom}_G(V^{(\mu)}, V^{(\mu)})
\end{aligned} \qquad (11.111)

In the second line we used the fact that G acts trivially on κaν . If we work over κ = C
then we just showed HomG (V (µ) , V (µ) ) ∼
= C and hence we have a better interpretation
of the degeneracy space: It is the linear space of G-invariant maps from V (µ) → V .
Indeed, note that there is a canonical equivariant map:

HomG (V (µ) , V ) ⊗ V (µ) → V (11.112)

given by A ⊗ u 7→ A(u). So we have a canonical G-equivariant map

⊕µ HomG (V (µ) , V ) ⊗ V (µ) → V (11.113)

and complete reducibility is the statement that this is an isomorphism.

2. Block diagonalization of Hamiltonians. Suppose H is a physical Hilbert space which


is a representation of a group G so that H is completely reducible into isotypical
components:
H∼ = ⊕µ H(µ) (11.114)

That is, H^{(\mu)} \cong D_\mu \otimes V^{(\mu)}, where V^{(\mu)} is an irreducible representation and D_\mu is the
degeneracy space. The sum in (11.114) is over the irreps of G. Suppose now that H
is a Hermitian operator, such as a Hamiltonian, that commutes with the G-action.
That is
H:H→H (11.115)
is an intertwiner. This is typically what happens when we have a symmetry of some
dynamics. By Schur’s lemma we therefore have under this isomorphism

H∼
= ⊕µ H (µ) ⊗ 1V (µ) (11.116)

In terms of a basis compatible with the isotypical decomposition it means that H^{(\mu)}
acts only on the degeneracy space, and hence the Hamiltonian has been partially (or
sometimes completely) diagonalized. This also leads to selection rules. If ψ1 , ψ2 are
two states in different isotypical components and O is an operator that commutes
with the G-action then the transition amplitude

hψ1 , Oψ2 i = 0 (11.117)

To make this more concrete, consider D ⊗ V with G acting as 1 ⊗ T(g). Every operator of the form A ⊗ 1, with A an arbitrary linear transformation of D, commutes with the G-action, and if V is irreducible (over C) Schur's lemma implies these are all of the self-intertwiners:

{\rm Hom}_G(D \otimes V, D \otimes V) \cong {\rm Hom}(D, D) \qquad (11.118)

In terms of block diagonal matrices: choose a basis \{d_a \otimes v_i\} of D ⊗ V, grouping together the vectors with a fixed value of a. Then 1 ⊗ T(g) is block diagonal with {\rm dim}\,D identical blocks M^{(\mu)}(g), while an intertwiner such as the Hamiltonian has the block form H_{ab} \otimes 1, i.e. each block is a multiple of the identity matrix of size {\rm dim}\,V. Diagonalizing H therefore reduces to diagonalizing the much smaller matrix H_{ab} acting on the degeneracy space.
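Here is a small Python sketch of this phenomenon (assuming numpy; the coefficients a and b are arbitrary illustrative values): an operator commuting with the S3 permutation representation on R^3 becomes block diagonal, with a scalar on each irreducible block, once we pass to a basis adapted to the isotypical decomposition.

import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1.0
    return M

# H = a*1 + b*J (J = all-ones matrix) commutes with every permutation matrix
a, b = 2.0, 0.7
H = a * np.eye(3) + b * np.ones((3, 3))
assert all(np.allclose(H @ perm_matrix(p), perm_matrix(p) @ H) for p in perms)

# adapted orthonormal basis: the all-ones direction (trivial irrep) together with
# an orthonormal basis of its orthogonal complement (the 2-dimensional irrep)
v0 = np.ones(3) / np.sqrt(3)
v1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
v2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)
P = np.column_stack([v0, v1, v2])

print(np.round(P.T @ H @ P, 10))
# -> diag(a + 3b, a, a): H acts as the scalar a + 3b on the trivial component and,
#    by Schur's lemma, as the scalar a on the 2-dimensional irreducible component.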

3. Schur’s lemma over other fields can lead to more complicated possibilities. All we
can say in general is that HomG (V (µ) , V (µ) ) is a division algebra over κ, meaning
that it is an algebra in which all nonzero elements are invertible. (This follows
immediately from our proof of Schur’s lemma above.) When we take κ = R we get a
division algebra over R and there are three possibilities: R, C, H. As an example of
the case R just consider the real linear transformations between the irreducible real
representations of Z2 . The algebra of intertwiners is clearly just R. If we consider
the defining representation C2 ∼= R4 as a representation of SU (2) over R, so that
SU (2) elements are represented as 4 × 4 real matrices, then there are more R-linear
transformations on R4 that can commute with the representation matrices T (g). (If
we tried to express them as transformations of C2 they would typically be a linear
combination of C-linear and C-antilinear transformations.) This is an example where
the algebra of intertwiners is H. For more on this see the section in chapter two on
quaternions.

11.8 Pontryagin Duality
In this section we introduce the beautiful idea of the Pontryagin dual of an Abelian group.
107 We will use it in section 11.11, and again we will use it to give a very general construction
of interesting Heisenberg extensions of Abelian groups.

Definition: Let S be an Abelian group. The Pontryagin dual group Ŝ is defined to be the group of homomorphisms Hom(S, U(1)). Note that if χ1, χ2 ∈ Hom(S, U(1)) then the pointwise product

(\chi_1 \cdot \chi_2)(s) := \chi_1(s)\, \chi_2(s) \qquad (11.119)

is again a homomorphism S → U(1), thus making Ŝ into an Abelian group.

Remarks:

1. The Pontryagin dual group Sb can also be thought of as the group of all complex
one-dimensional unitary representations of S. It follows from Schur’s lemma that
all irreducible finite dimensional complex representations of an Abelian group are
one-dimensional.
Note that the adjective complex is essential here. After all the defining representation
of the Abelian group SO(2) is R2 and is irreducible as a representation over R.

2. Elements of the group Sb are also called characters.

3. It is best to discuss Pontryagin duality in the context of topological groups. In this


case we should only consider the continuous characters χ : S → U (1). For the duality
theorem below we should consider locally compact Abelian groups. Examples of locally
compact Abelian groups are Rn , tori, lattices, and finite Abelian groups with compact
topology. An infinite-dimensional Hilbert space is a topological Abelian group under
addition, but it is not locally compact.

Note that, for a fixed s ∈ S, we can define an element ŝ ∈ Hom(Ŝ, U(1)) by

\hat{s} : \chi \mapsto \chi(s) \qquad (11.120)

Note that ŝ is just the evaluation map ev_s discussed previously. The map s ↦ ŝ is a homomorphism S → \hat{\hat{S}}. The main theorem is:

Theorem [Pontryagin-van Kampen duality]. If S is a locally compact Abelian group then the canonical homomorphism S → \hat{\hat{S}} is in fact an isomorphism:

\hat{\hat{S}} \cong S \qquad (11.121)

For a proof see, for example, the book on representation theory by A.A. Kirillov.
107
The transliteration from the Cyrillic to the Latin alphabets takes various forms. Another common one
is Pontrjagin.

Example 1: Consider S = Z/nZ, thought of additively. To determine χ ∈ Hom(S, U(1)) it suffices to determine χ(1̄), since χ(ℓ̄) = χ(1̄)^ℓ for any ℓ ∈ Z. Put χ(1̄) = ω ∈ U(1). But now we need to impose the relation χ(n̄) = χ(0̄) = 1. This implies ω^n = 1, so ω is an nth root of unity. So the most general element of \widehat{\mathbb{Z}/n\mathbb{Z}} is

\chi(\bar\ell) = \chi_\omega(\bar\ell) := \omega^{\ell} \qquad (11.122)

where ω is an nth root of unity. Moreover \chi_{\omega_1}\chi_{\omega_2} = \chi_{\omega_1\omega_2}, so \widehat{\mathbb{Z}/n\mathbb{Z}} is identified in this way with the multiplicative group \mu_n of nth roots of unity. In this way we see that, as abstract groups,

\widehat{\mathbb{Z}/n\mathbb{Z}} = \mu_n \cong \mathbb{Z}/n\mathbb{Z} \qquad (11.123)

So a finite cyclic group is self-dual.

Example 2: Consider R, additively. Then if χ ∈ R̂ we have χ(x + y) = χ(x)χ(y), so χ(x) = e^{ax} for some constant a. For χ to be valued in U(1) we must have a = ik with k ∈ R and hence

\chi(x) = \chi_k(x) := e^{ikx} \qquad (11.124)

moreover,

\chi_k\, \chi_{\ell} = \chi_{k+\ell} \qquad (11.125)

and hence R̂ ≅ R. In an entirely similar way \widehat{\mathbb{R}^n} \cong \mathbb{R}^n.

Example 3: Consider S = Z. To determine χ ∈ Hom(S, U (1)) it suffices to determine


χ(1). Choose any phase ξ ∈ U (1) and set χ(1) = ξ. Then it must be that, for all n ∈ Z:

χ(n) = χξ (n) := ξ n (11.126)

Moreover \chi_{\xi_1}\chi_{\xi_2} = \chi_{\xi_1\xi_2}. Thus,

\hat{\mathbb{Z}} \cong U(1) \qquad (11.127)

Example 4: Consider S = U (1). To determine χ ∈ Hom(S, U (1)) it might help to think


of U (1) ∼
= R/Z. We know from the Pontryagin dual of R that χ should be of the form

χ(x + Z) = exp[ik(x + Z)] (11.128)

for some real number k. However, for this to be well-defined we must have k = 2πn with
n ∈ Z. Therefore χ must be of the form

χn (x + Z) = exp[2πinx] (11.129)

Or, if we think of S = U (1) multiplicatively as complex numbers of modulus one, then we


can say that every character on U (1) is of the form:

χn (ξ) := ξ n (11.130)

for some n ∈ Z. Therefore,

\widehat{U(1)} \cong \mathbb{Z} \qquad (11.131)

Comparing (11.127) and (11.131) we verify the general result (11.121).

Example 5: Tori. Consider the group G = Zd . It will be useful to consider a free G action
on affine Euclidean space Ed . This defines a subset Γ ⊂ Ed known as a lattice (sometimes
called an embedded lattice). The quotient space Ed /Γ has a natural basepoint, namely the
coset of Γ and, as a group it is isomorphic to U (1)d . By the same arguments as above its
Pontryagin dual will be isomorphic to Zd .
There is a nice way to think about the Pontryagin duality between lattices and tori.
Suppose Γ is a lattice in R^d. Using the Euclidean norm we can define another lattice, the dual lattice (more invariantly, one should do this using the dual vector space):

\Gamma^{\vee} = \{g \in \mathbb{R}^d\ |\ g\cdot\gamma \in \mathbb{Z}\ \ \forall \gamma \in \Gamma\} \qquad (11.132)

Note we can identify Γ∨ ∼ = Hom(Γ, Z) since any g ∈ Γ∨ defines a homomorphism φg whose


values are: \varphi_g(\gamma) = g\cdot\gamma. Also, as an abstract Abelian group of course \Gamma^{\vee} \cong \mathbb{Z}^d, but the above definition identifies it as a specific subgroup of \mathbb{R}^d.

The unitary irreps of Γ are represented by points in the torus T ∨ := Rd /Γ∨ . Note that
T ∨ is a torus as a manifold and is isomorphic to the group U (1)d , as an Abelian group.
For any k̄ ∈ T^{\vee} we can define the character

\chi_{\bar k}(\gamma) = \exp[2\pi i\, k\cdot\gamma] \qquad (11.133)

where k is any representative of k̄, that is k̄ = k + Γ∨ . Note that the above formula is
well-defined because if we choose any two lifts k1 and k2 of k̄ then k1 = k2 + g with g ∈ Γ∨ ,
and then g · γ ∈ Z for all γ ∈ Γ. So

\hat{\Gamma} \cong \mathbb{R}^d/\Gamma^{\vee} \cong U(1)^d \qquad (11.134)

Conversely, the Pontryagin dual of the torus Rd /Γ∨ can naturally be identified with Γ by
the same formula:
χγ (k̄) = exp[2πik · γ] (11.135)

Exercise Pontryagin Dual Of The Prüfer Groups


Recall that the Prüfer groups P r(p) are defined for each prime p as the union over all
n of roots of unity of order pn .
What is the Pontryagin dual of P r(p)? (Give it the discrete topology.) 108

108
Answer: Use the isomorphism U(1) ≅ R/Z. One needs to say what is the image of p^{-n}. So \chi(1/p^n) = \exp[2\pi i\, a_n/p^n] for some integer a_n, because \chi(p^{-n}) must itself be a (p^n)th root of unity. Note we can regard a_n \in \mathbb{Z}/p^n\mathbb{Z}. Now note that

\exp[2\pi i\, a_{n-1}/p^{n-1}] = \chi(1/p^{n-1}) = \chi(1/p^n)^p = \exp[2\pi i\, a_n/p^{n-1}] \qquad (11.136)

So the Pontryagin dual is the subgroup of \prod_n \mathbb{Z}/p^n\mathbb{Z} consisting of sequences a_n such that a_n projects to a_{n-1}. This is known as the group of p-adic integers.

Figure 31: Example of a bandstructure. (For silicon.) On the horizontal axis the structure is
plotted as a function of k along lines inside the Brillouin torus. The letters refer to points where
the (cubic) crystallographic group has fixed points. Γ denotes the identity element k̄ = 0 where the
full cubic symmetry group is restored.

11.8.1 An Application Of Pontryagin Duality: Bloch's Theorem And Band Structure
The Pontryagin duality between an embedded lattice Γ ⊂ Rd and the torus T ∨ = Rd /Γ∨
has a very significant application in condensed matter physics known as Bloch’s theorem.
In the one-electron approximation to the Schrödinger problem of electrons in a crystal one considers the Schrödinger Hamiltonian on L^2(R^d) of the form:

H = -\frac{\hbar^2}{2m}\nabla^2 + U(x) \qquad (11.137)
where the potential U : Rd → R is assumed to be invariant under some crystallographic
group (see below). In particular, it is invariant under translation by a lattice Γ ⊂ Rd .
For example, if we take into account the Coulomb interaction between the electron and a collection of ions of charge Z_i e at positions x_i ∈ C, where C is a crystal, then

U(x) = \sum_i \frac{-Z_i e^2}{|x - x_i|} \qquad (11.138)

But for the statement we are going to make all we need is that U (x + γ) = U (x) for γ ∈ Γ.
Now the group Γ acts on the Hilbert space through unitary operators commuting with H.
Explicitly:
\rho(\gamma) = \exp[i\gamma\cdot\hat{p}/\hbar] \qquad (11.139)

where \hat{p} = -i\hbar\vec\nabla, as is standard in quantum mechanics. Note that

\rho(\gamma_1)\rho(\gamma_2) = \rho(\gamma_1 + \gamma_2) \qquad (11.140)

so the Hilbert space will be a representation of Γ. The Hamiltonian cannot make transitions
between different irreps in an isotypical decomposition.
We have classified the one-dimensional representations above so if ψ ∈ H were to be
in a one-dimensional representation then it would have to be quasi-periodic:

ψ(x + γ) = χk̄ (γ)ψ(x) (11.141)

where k̄ ∈ Rd /Γ∨ . In this context the Pontryagin dual torus T ∨ is known as the Brillouin
torus.
Note that if ψ obeys (11.141) then we can always write it (noncanonically!) as

ψ(x) = e2πik·x uk (x) (11.142)

where we have chosen a specific representative k ∈ Rd of the element k̄ in the Brillouin


torus and uk (x) is periodic in x, i.e. invariant under shifts of x → x + γ for γ ∈ Γ. In
condensed matter physics the vector k ∈ Rd is known as a reciprocal vector. We stress that
this decomposition is noncanonical. If k1 and k2 both represent the same k̄ then k1 = k2 +g
for g ∈ Γ∨ and uk2 (x) = e2πig·x uk1 (x). Note that both functions uk2 and uk1 are periodic
under shifts by Γ. Therefore, they can be considered as functions on the real space torus
T := Rd /Γ.
Note that a quasiperiodic function (11.141) cannot also be L2 . For each point k̄ in the
Brillouin torus define the Hilbert space
\mathcal{H}_{\bar k} := \Big\{\psi(x)\ \Big|\ \psi(x+\gamma) = \chi_{\bar k}(\gamma)\psi(x), \quad \int_{\mathbb{R}^d/\Gamma} |\psi(x)|^2\, dx < \infty\Big\} \qquad (11.143)

Then there is a measure on the torus T ∨ so that we have an isomorphism


\mathcal{H} \cong \int_{\mathbb{R}^d/\Gamma^{\vee}} d\bar k\; \mathcal{H}_{\bar k} \qquad (11.144)

Now, the Hamiltonian acts within Hk̄ . It is useful to write the eigenvalue problem as

Hk uk (x) = Ek uk (x) (11.145)

where
Hk = e−2πik·x He2πik·x (11.146)
Note that we had to make a choice of k that projects to k̄ to write the Hamiltonian Hk .
However, if k 0 = k + g where g ∈ Γ∨ then Hk0 is unitarily equivalent to Hk . Indeed
U = e2πix̂·g is a nice unitary operator on the wavefunctions on the torus Rd /Γ.
Hk is an Hermitian elliptic operator acting on the functions on a compact manifold.
Explicitly, it works out to

~2 2 ~2 ~
Hk = − ∇ − 4π k · (i∇) + (U + 4π 2 k 2 ) (11.147)
2m 2m 2m
and it acts on L2 functions on the torus T = Rd /Γ. This operator should be viewed as a
perturbation of a Laplace operator on functions on a compact manifold. The latter has a

discrete spectrum of eigenvalues. The spectrum of h = −∇2 is {4π 2 g 2 }g∈Γ∨ . The theory
of elliptic Hermitian operators on compact manifolds shows that the lower order terms in
(11.147) do not change this property and hence the operator (11.147) has a discrete set of
eigenvalues {En (k)}. Note that while Hk depends on k, the spectrum itself only depends
on k̄. This is not obvious from (11.147) but it follows from the unitary equivalence between
Hk1 and Hk2 where k1 and k2 are two representatives of k̄.
The eigenvalues vary continuously as functions of k̄ ∈ Rd /Γ∨ to give what is called a
band structure. See Figure 31.
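Here is a minimal Python sketch of this construction in one dimension (assuming numpy; the potential U(x) = 2V cos(2πx), its strength V, the plane-wave cutoff, and units with ℏ²/2m = 1 are all arbitrary illustrative choices). It diagonalizes the truncated matrix of H_k in the basis u_k(x) = Σ_G c_G e^{2πiGx}, G ∈ Γ∨ = Z, and produces bands E_n(k̄) over the Brillouin torus R/Z.

import numpy as np

V = 0.3            # strength of the periodic potential (illustrative value)
Gmax = 10          # plane-wave cutoff: G = -Gmax..Gmax
Gs = np.arange(-Gmax, Gmax + 1)

def bands(kbar, nbands=4):
    # kinetic term is diagonal: (2 pi)^2 (G + kbar)^2 ; U couples G and G +/- 1 with strength V
    Hk = np.diag((2 * np.pi * (Gs + kbar))**2)
    Hk = Hk + V * (np.eye(len(Gs), k=1) + np.eye(len(Gs), k=-1))
    return np.linalg.eigvalsh(Hk)[:nbands]

ks = np.linspace(-0.5, 0.5, 101)            # one period of the Brillouin torus
spectrum = np.array([bands(k) for k in ks])
print(spectrum[0], spectrum[50])            # band energies at the zone edge and at kbar = 0
# plotting spectrum against ks reproduces a (one-dimensional analogue of a) band structure as in Figure 31.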

11.9 Orthogonality Relations Of Matrix Elements And The Peter-Weyl Theorem
The very beautiful Peter-Weyl theorem states that, if G is a compact group, then there is
an isomorphism of G × G representations:

L^2(G) \cong \oplus_{\mu} {\rm End}(V^{(\mu)}) \qquad (11.148)

where we sum over the isomorphism class of each irreducible representation exactly once,
and for each irrep we choose a representative (T^{(\mu)}, V^{(\mu)}).
The key to proving the Peter-Weyl theorem is the orthogonality relations for matrix elements of irreps:

Theorem: Let G be a compact group, and define an Hermitian inner product on L2 (G)
by

\langle\Psi_1, \Psi_2\rangle := \int_G \Psi_1^*(g)\, \Psi_2(g)\, dg \qquad (11.149)

where, WLOG, we normalize the Haar measure so the volume of G is one. Let \{V^{(\mu)}\} be a set of representatives of the distinct isomorphism classes of irreducible unitary representations of G. For each representation V^{(\mu)} choose an ON basis w_i^{(\mu)}, i = 1, \dots, n_\mu, with

n_\mu := {\rm dim}_{\mathbb{C}} V^{(\mu)}. \qquad (11.150)
Then the matrix elements Mijµ (g) defined by

T^{(\mu)}(g)\, w_i^{(\mu)} = \sum_{j=1}^{n_\mu} M^{\mu}_{ji}(g)\, w_j^{(\mu)} \qquad (11.151)

form a complete orthogonal set of functions on L2 (G) so that


(M^{\mu_1}_{i_1,j_1}, M^{\mu_2}_{i_2,j_2}) = \frac{1}{n_{\mu_1}}\, \delta^{\mu_1,\mu_2}\, \delta_{i_1,i_2}\, \delta_{j_1,j_2} \qquad (11.152)

Proof : The proof is based on linear algebra and Schur’s lemma. For any linear transfor-
mation A : V (µ) → V (ν) we can average using the Haar measure
\tilde{A} := \int_G T^{(\nu)}(g)\, A\, T^{(\mu)}(g^{-1})\, dg \qquad (11.153)

And then a small computation shows that à is an intertwiner:
\begin{aligned}
T^{(\nu)}(h)\tilde{A} &= \int_G T^{(\nu)}(hg)\, A\, T^{(\mu)}(g^{-1})\, dg \\
&= \int_G T^{(\nu)}(g)\, A\, T^{(\mu)}(g^{-1}h)\, dg \\
&= \tilde{A}\, T^{(\mu)}(h)
\end{aligned} \qquad (11.154)

Therefore, by Schur’s lemma, à = δµ,ν  where  is a multiple of the identity transfor-


mation. It is useful at this point to choose an ordered basis and examine our conclusion:
à = δµ,ν Â, with  proportional to the identity, in that basis:
For any matrix A \in {\rm Mat}_{n_\nu \times n_\mu}(\mathbb{C}) we have

\int_G [dg]\, M^{(\nu)}_{ij}(g)\, A_{ja}\, M^{(\mu)}_{ab}(g^{-1}) = \delta_{\mu,\nu}\, c_A\, \delta_{i,b} \qquad (11.155)
G

We can determine the constant cA by setting µ = ν and b = i and summing on i to get

TrV (µ) (A) = nµ cA (11.156)

where we normalized the volume of the group to 1.


Note that for the matrix unit A = ejk we have Trejk = δjk and hence
\int_G [dg]\, M^{(\nu)}_{ij}(g)\, M^{(\mu)}_{k\ell}(g^{-1}) = \frac{1}{n_\mu}\, \delta_{\mu,\nu}\, \delta_{jk}\, \delta_{i,\ell} \qquad (11.157)

Equation (11.157) holds for the matrix elements relative to any ordered bases for the V^{(\mu)}. If
we specialize to ON bases then the matrices M (µ) (g) are unitary. It is now useful to define
functions \phi^{\mu}_{ij} \in L^2(G) by

\phi^{(\mu)}_{ij} : g \mapsto \sqrt{n_\mu}\, M^{(\mu)}_{ij}(g) \qquad (11.158)
so that we have

\int_G [dg]\, (\phi^{(\mu)}_{ij}(g))^*\, \phi^{(\nu)}_{k\ell}(g) = \delta_{\mu,\nu}\, \delta_{ik}\, \delta_{j\ell}. \qquad (11.159)

Now recall the definition of the map ι : End(V ) → L2 (G) in (11.66). This is easily
generalized: If we have any collection of finite-dimensional representations {Vλ } of G then
we have
ι : ⊕λ End(Vλ ) ,→ L2 (G) (11.160)
where we just add the functions: \iota(\oplus_i S_i) := \sum_i \Psi_{S_i}. Thanks to the equivariance, the image
of (11.160) is a G × G-invariant subspace, i.e. a subrepresentation of L2 (G). It follows
from the orthogonality of matrix elements that if we choose the collection {Vλ } to be the
set of distinct irreps of G, namely {V (µ) } then ι is injective. We claim it is also surjective:
The orthogonal complement would have to be a representation of G, and hence would
be isomorphic to a direct sum of representations V (µ) . But then the matrix elements are
already accounted for by the image of ι. Thus the functions \phi^{(\mu)}_{ij} are in fact an orthonormal basis for L^2(G). ♠

When working with finite groups then we have the following beautiful:

Corollary: If G is a finite group then


|G| = \sum_{\mu} n_\mu^2 \qquad (11.161)

Proof: On the one hand, L^2(G) clearly has a basis of delta-functions \delta_g, so the dimension is |G|; on the other hand, {\rm End}(V^{(\mu)}) has dimension n_\mu^2. ♠

This fact is extremely useful: If you are trying to list the dimensions of the irreps of a
finite group it is always a very useful check to verify (11.161).

Example 1: Let G = Z_2 = \{1, \sigma\} with \sigma^2 = 1. Then the general complex-valued function
on G is specified by two complex numbers (ψ+ , ψ− ) ∈ C2 :

Ψ(1) = ψ+ Ψ(σ) = ψ− (11.162)

This identifies Map(G, C) \cong \mathbb{C}^2 as a vector space. We found the irreps of any cyclic group
above. For Z2 there are just two irreducible representations V± ∼= C with ρ± (σ) = ±1.
The matrix elements give two functions on the group M ± :

M + (1) = 1 M + (σ) = 1 (11.163)

M − (1) = 1 M − (σ) = −1 (11.164)

(Here and in the next examples when working with 1×1 matrices we drop the µν subscript!)
The reader can check they are orthonormal, and they are complete because any function
Ψ can be expressed as:
\Psi = \frac{\psi_+ + \psi_-}{2}\, M^+ + \frac{\psi_+ - \psi_-}{2}\, M^- \qquad (11.165)
Note that if V is any representation of Z_2 with the nontrivial element represented by T(\sigma) then we can form orthogonal projection operators P_\pm = \frac{1}{2}(1 \pm T(\sigma)) onto the isotypical components. Note that we can write these projectors as:

P_\pm = \int_G (M^{\pm}(g))^*\, T(g)\, dg \qquad (11.166)

Example 2: We can generalize the previous example slightly by taking G = Z/nZ =


hω|ω n = 1i. Let us identify this group with the group of nth roots of unity and choose
a generator ω = exp[2πi/n]. Since G is abelian all the representation matrices can be
simultaneously diagonalized so all the irreps are one-dimensional. They are:
V = C and ρm (ω) = ω m where m is an integer. Note that m ∼ m + n so the set of
irreps is again labeled by Z/nZ and in fact, under tensor product the set of irreps itself
forms a group isomorphic to Z/nZ.
The matrix elements in the irrep (\rho_m, V) are

M^{(m)}(\omega^j) = \omega^{mj} = e^{2\pi i\, \frac{mj}{n}} \qquad (11.167)

Now we can check that indeed


\frac{1}{|G|}\sum_{g\in G} (M^{(m_1)}(g))^*\, M^{(m_2)}(g) = \delta_{m_1 - m_2 = 0\ {\rm mod}\ n} \qquad (11.168)

The decomposition of a function Ψ on the group G is known as the discrete Fourier trans-
form: If Ψ : Zn → C is any function we can write it as
\Psi = \sum_m \hat\Psi_m\, M^{(m)} \qquad (11.169)

\hat\Psi_m = \int_{\mathbb{Z}_n} (M^{(m)}(g))^*\, \Psi(g)\, dg \qquad (11.170)
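Here is a short Python sketch of this discrete Fourier transform (assuming numpy; n = 8 and the random test function are arbitrary illustrative choices). It checks the orthogonality relation (11.168) and the inversion formula (11.169)-(11.170).

import numpy as np

n = 8
idx = np.arange(n)
# chars[m, j] = M^{(m)}(omega^j) = exp(2 pi i m j / n)
chars = np.exp(2j * np.pi * np.outer(idx, idx) / n)

# orthogonality (11.168): (1/n) sum_j conj(M^{(m1)}) M^{(m2)} = delta_{m1, m2}
assert np.allclose(np.conj(chars) @ chars.T / n, np.eye(n))

# decomposition (11.169)-(11.170) of an arbitrary function Psi on Z/n
rng = np.random.default_rng(0)
Psi = rng.normal(size=n) + 1j * rng.normal(size=n)
Psi_hat = np.conj(chars) @ Psi / n            # eq. (11.170), with integral = (1/n) sum
assert np.allclose(chars.T @ Psi_hat, Psi)    # eq. (11.169)
print(np.round(Psi_hat, 3))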

Example 3: The theorem applies to all compact Lie groups. For example, when G =
U(1) = \{z\,|\,|z| = 1\} then the normalized invariant measure on the group is just \frac{1}{2\pi i}\frac{dz}{z} = \frac{d\theta}{2\pi} where z = e^{i\theta}:

\langle\Psi_1, \Psi_2\rangle = \int_0^{2\pi} (\Psi_1(\theta))^*\, \Psi_2(\theta)\, \frac{d\theta}{2\pi} \qquad (11.171)
Now, again since G is abelian the irreducible representations are 1-dimensional and the
unitary representations are (ρn , Vn ) where n ∈ Z, Vn ∼
= C and

ρn (z) := z n (11.172)

Now, the orthonormality of the matrix elements is the standard orthonormality of einθ and
the Peter-Weyl theorem specializes to Fourier analysis: An L2 -function Ψ(θ) on the circle
can be expanded in terms of the matrix elements of the irreps:
\Psi = \sum_{{\rm irreps}\ \rho_n} \hat\Psi_n\, M^{(n)} \qquad (11.173)

This is just the standard Fourier decomposition of a periodic function.

We stress that the Peter-Weyl theorem applies to all compact groups, not just Abelian ones. For this reason it is the foundation of the subject of “nonabelian Fourier analysis.” Here is a simple nonabelian example:

Example 4: So far all our examples have been Abelian groups. Let us consider G = S3 . It
has order 6 so that L2 (G) is a six-dimensional vector space. So far, we have discussed three
different irreps of dimensions 1, 1, 2. Are there any others? Note that 6 = 1^2 + 1^2 + 2^2. So
we conclude that there are no other irreps. The matrix elements are:
1. The trivial representation: M + (g) = 1 for all g ∈ S3

2. The sign representation:

M − (1) = 1 M − (123) = M − (132) = 1


(11.174)
M − (12) = M − (13) = M − (23) = −1

3. The two-dimensional representation. See equation (11.94) et. seq. So we have


M^2_{11}(1) = M^2_{11}((12)) = 1
M^2_{11}((13)) = M^2_{11}((23)) = -1/2 \qquad (11.175)
M^2_{11}((123)) = M^2_{11}((132)) = -1/2

and so on for the other 3 matrix elements in the 2-dimensional representation. The reader should check that the function M^2_{11} is orthogonal to the functions M^+ and M^-, and that \sqrt{2}\, M^2_{11} has unit norm.
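Here is a Python sketch of that check for all six functions at once (assuming numpy; the 2-dimensional irrep is realized on the orthogonal complement of the all-ones vector in R^3 with one convenient orthonormal basis, which is equivalent to, but not literally equal to, the basis of (11.94)-(11.98)).

import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1.0
    return M

def sign(p):
    s = 1
    for i in range(3):
        for k in range(i + 1, 3):
            if p[i] > p[k]:
                s = -s
    return float(s)

Q = np.column_stack([np.array([1.0, -1.0, 0.0]) / np.sqrt(2),
                     np.array([1.0, 1.0, -2.0]) / np.sqrt(6)])
D = {p: Q.T @ perm_matrix(p) @ Q for p in perms}    # orthogonal 2x2 irrep matrices

# the six functions: trivial, sign, and sqrt(2) * D_{ij}
funcs = [np.array([1.0 for p in perms]),
         np.array([sign(p) for p in perms])]
for i in range(2):
    for k in range(2):
        funcs.append(np.sqrt(2) * np.array([D[p][i, k] for p in perms]))

gram = np.array([[np.vdot(f1, f2) / len(perms) for f2 in funcs] for f1 in funcs])
print(np.round(gram, 10))     # the 6x6 identity matrix: an orthonormal basis of L^2(S_3)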

Remark: This leaves the separate question of actually constructing the representations
of the finite group G. Until recently there has been no general algorithm for doing this.
Recently there has been a claim that it can be done. See
Vahid Dabbaghian-Abdoly, ”An Algorithm for Constructing Representations of Finite
Groups.” Journal of Symbolic Computation Volume 39, Issue 6, June 2005, Pages 671-688

Exercise Due Diligence


Check the orthogonality relations for the other matrix elements M^2_{ij} on S_3.

Exercise How Many One-Dimensional Representations Does An Arbitrary Finite Group
Have?
Show that the number of distinct one-dimensional representations of a finite group G
is the same as the index of the commutator subgroup [G, G] in G. 109

Exercise Projection Operators


Let (T, V) be any representation of a compact group G. Consider the linear operators on V:

P^{(\mu)}_{ij} := \sqrt{n_\mu} \int_G (\phi^{(\mu)}_{ij}(g))^*\, T(g)\, dg \ \in {\rm End}(V) \qquad (11.176)

a.) Show that

P^{(\mu)}_{ij}\, P^{(\nu)}_{kl} = \delta_{\mu\nu}\, \delta_{j,k}\, P^{(\nu)}_{i\ell} \qquad (11.177)
109
Hint: A one-dimensional representation is trivial on [G, G] and hence descends to a representation of
the abelianization of G, namely the quotient group G/[G, G].

b.) Show that, for any vector ψ ∈ V we have

T (h)Pijµ ψ µ µ
X
= Mki (h)Pkj ψ (11.178)
k=1

Conclude that, if the Pijµ ψ 6= 0 then, for fixed µ, j the span

Span{Pijµ ψ|i = 1, . . . , nµ } (11.179)

is a subspace of V transforming in the representation (V µ , T µ ).


c.) In particular show that the projector onto the isotypical component corresponding
to the trivial representation is:

P = \int_G T(g)\, dg \qquad (11.180)
We will discuss this more later, giving examples of the decomposition of a reducible
representation into its irreps.
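As a small numerical preview of such decompositions, here is a Python sketch (assuming numpy; the example is again the permutation representation of S3 on R^3). It builds the projector (11.180) onto the trivial isotypical component and, using the character of the 2-dimensional irrep, the projector onto the 2-dimensional isotypical component.

import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1.0
    return M

def chi_2(p):
    # character of the 2-dim irrep: 2 on the identity, 0 on transpositions, -1 on 3-cycles
    fixed = sum(1 for i in range(3) if p[i] == i)
    return {3: 2.0, 1: 0.0, 0: -1.0}[fixed]

# projector onto the trivial isotypical component, eq. (11.180), with integral = (1/|G|) sum
P_triv = sum(perm_matrix(p) for p in perms) / len(perms)
assert np.allclose(P_triv, np.ones((3, 3)) / 3)          # projects onto the all-ones line

# projector onto the 2-dimensional isotypical component: (n_mu/|G|) sum_g chi_mu(g)* T(g)
P_2 = 2.0 * sum(chi_2(p) * perm_matrix(p) for p in perms) / len(perms)
assert np.allclose(P_triv + P_2, np.eye(3))              # the sign irrep does not appear
assert np.allclose(P_2 @ P_2, P_2)
print(np.round(P_2, 3))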

11.10 Orthogonality Relations For Characters And Character Tables


An important simplification of the orthogonality relations is the corresponding relations
for the character functions of the irreps: If we take traces by setting i = j and k = ` and
summing in equation (11.157) we get the orthogonality relations of characters:
\int_G (\chi^{(\mu)}(g))^*\, \chi^{(\nu)}(g)\, dg = \delta_{\mu,\nu} \qquad (11.181)

This beautiful fact is extremely powerful. For example, as we have seen, any represen-
tation (T, V ) of G is completely reducible so

V \cong \oplus_\mu\, a_\mu V^{(\mu)} \qquad (11.182)

with aµ ∈ Z+ . But then we can determine these degeneracies by


a_\mu = \int_G (\chi^{(\mu)}(g))^*\, \chi_V(g)\, dg = \langle \chi^{(\mu)}, \chi_V\rangle \qquad (11.183)

Therefore, we conclude the extremely important fact:

A representation of a compact group is completely determined, up to equivalence, by


its character function!

Moreover, by using the characters we can easily determine the decomposition into
isotypical components.
Recall that a class function on G is a function f : G → C whose value only depends on
the conjugacy class: f (g) = f (hgh−1 ) for all g, h ∈ G. The class functions form a subspace
L2 (G)class ⊂ L2 (G). The characters of the irreps provide an ON basis:

Theorem. {χµ } is an ON basis for the vector space of class functions L2 (G)class .

Proof: This is a corollary of the orthogonality relations for matrix elements. Any function f ∈ L^2(G) can be expanded

f(g) = \sum_{\mu,i,j} \hat f^{\mu}_{ij}\, M^{\mu}_{ij}(g) \qquad (11.184)

Now, if f is a class function then when we compute the integral:


\tilde f(g) := \int_G f(hgh^{-1})\, dh \qquad (11.185)

we of course get f˜ = f , provided we normalized the group to have volume 1.


On the other hand,
\begin{aligned}
\tilde f(g) &= \int_G f(hgh^{-1})\, dh \\
&= \sum_{\mu,i,j} \hat f^{\mu}_{ij} \int_G M^{\mu}_{ij}(hgh^{-1})\, dh \\
&= \sum_{\mu,i,j} \hat f^{\mu}_{ij} \sum_{k,l} M^{\mu}_{kl}(g) \int_G M^{\mu}_{ik}(h)\, M^{\mu}_{lj}(h^{-1})\, dh \\
&= \sum_{\mu,i} \frac{\hat f^{\mu}_{ii}}{n_\mu}\, \chi^{\mu}(g)
\end{aligned} \qquad (11.186)

We already knew that the χµ were ON and now we see that they span the subpsace of
class functions. ♠

11.10.1 Finite Groups And The Character Table


In the case of a finite group there is another obvious basis of the space of class functions:
Denote the distinct conjugacy classes by Ci , i = 1, . . . r. For each Ci we can define a class
function δCi in L2 (G) to be the characteristic function of Ci . In equations:
\delta_{C_i}(g) := \begin{cases} 1 & g \in C_i \\ 0 & \text{else} \end{cases} \qquad (11.187)
Since any class function takes the same value for all group elements g in a fixed conjugacy
class it follows that the functions δCi form a basis for the space of class functions. Any two
bases for a given vector space have the same cardinality so we conclude:

Theorem The number of conjugacy classes of G is the same as the number of irreducible
representations of G.
Since the number of representations and conjugacy classes are the same we can define
an r × r array known as a character table:

m 1 C1 m 2 C2 ··· ··· mr Cr
χ1 χ1 (C1 ) ··· ··· ··· χ1 (Cr )
χ2 χ2 (C1 ) ··· ··· ··· χ2 (Cr )
(11.188)
··· ··· ··· ··· ··· ···
··· ··· ··· ··· ··· ···
χr χr (C1 ) ··· ··· ··· χr (Cr )

– 182 –
Here mi denotes the order of Ci .
Generally speaking, most of what you typically want to know about a group is con-
tained in its character table.
For a finite group we can rewrite the orthogonality relations of the characters, equation
(11.181), more explicitly as:

\frac{1}{|G|}\sum_{C_i \in \mathcal{C}} m_i\, \chi_\mu(C_i)\, \chi_\nu(C_i)^* = \delta_{\mu\nu} \qquad (11.189)

where mi = |Ci | is the order of the conjugacy class Ci .


We claim that there is a “dual” orthogonality relation where we sum over irreps rather
than conjugacy classes
\sum_{\mu} \chi_\mu(C_i)^*\, \chi_\mu(C_j) = \frac{|G|}{m_i}\, \delta_{ij} \qquad (11.190)

The proof of (11.190) is very elegant: Note that equation (11.189) can be interpreted as
the statement that the r × r matrix
S_{\mu i} := \sqrt{\frac{m_i}{|G|}}\, \chi_\mu(C_i) \qquad \mu = 1, \dots, r, \quad i = 1, \dots, r \qquad (11.191)

satisfies
\sum_{i=1}^{r} S_{\mu i}\, S_{\nu i}^{*} = \delta_{\mu\nu} \qquad (11.192)

Therefore, Sµi is a unitary matrix. The left-inverse is the same as the right-inverse, and
hence we obtain (11.190).
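Here is a quick Python check of both orthogonality relations for the character table of S3 written out below (assuming numpy; the table itself is the one derived in the text).

import numpy as np

m = np.array([1, 3, 2])                         # class sizes: [1], [(12)], [(123)]
chi = np.array([[1,  1,  1],                    # trivial representation
                [1, -1,  1],                    # sign representation
                [2,  0, -1]])                   # 2-dimensional irrep
G = m.sum()                                     # |G| = 6

# rows, eq. (11.189): (1/|G|) sum_i m_i chi_mu(C_i) chi_nu(C_i)* = delta_{mu nu}
assert np.allclose((chi * m) @ chi.T / G, np.eye(3))
# columns, eq. (11.190): sum_mu chi_mu(C_i)* chi_mu(C_j) = (|G|/m_i) delta_{ij}
assert np.allclose(chi.T @ chi, np.diag(G / m))
print("the character table of S_3 passes both orthogonality relations")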

Example 1: For the simplest nontrivial symmetric group, S_2, there are two conjugacy classes C_1 = [1] and C_2 = [(12)]. They both have cardinality m_i = 1. There are two irreps, the trivial representation 1_+ and the sign representation 1_-. We thus have the character table

[1] [(12)]
1+ 1 1
1− 1 −1

It is now straightforward to write out the character table for S3 .

[1] 3[(12)] 2[(123)]
1+ 1 1 1
1− 1 −1 1
2 2 0 −1

Now let us see how we can use the orthogonality relations on characters to find the decomposition of a reducible representation.

Example 1 Consider the 3 × 3 rep generated by the natural action of the permutation group S_3 on R^3. We'll compute the characters by choosing one representative from each conjugacy class:

1 \to \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
(12) \to \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
(132) \to \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \qquad (11.193)
From these representatives the character of V = R3 is easily calculated:

\chi_V(1) = 3 \qquad \chi_V([(12)]) = 1 \qquad \chi_V([(132)]) = 0 \qquad (11.194)

Using the orthogonality relations we compute

a_{1_+} = (\chi_{1_+}, \chi) = \tfrac{1}{6}\cdot 3 + \tfrac{3}{6}\cdot 1 + \tfrac{2}{6}\cdot 0 = 1 \qquad (11.195)

a_{1_-} = (\chi_{1_-}, \chi) = \tfrac{1}{6}\cdot 3 + \tfrac{3}{6}\cdot(-1)\cdot 1 + \tfrac{2}{6}\cdot 0 = 0 \qquad (11.196)

a_{2} = (\chi_{2}, \chi) = \tfrac{1}{6}\cdot 3\cdot 2 + \tfrac{3}{6}\cdot 0\cdot 1 + \tfrac{2}{6}\cdot(-1)\cdot 0 = 1 \qquad (11.197)
Therefore:
χV = χ1+ + χ2 (11.198)
showing the decomposition of R3 into irreps, and confirming what we showed above.

Example 2: Let V be a finite-dimensional vector space (over any field) of dimension d, and consider the natural permutation action of S_2 on V ⊗ V. Let us write out the isotypical decomposition under the S_2 action. The character is easily computed:

χV ⊗2 (1) = d2
(11.199)
χV ⊗2 ((12)) = d

To check the second line we choose a basis {vi } for V so that {vi ⊗ vj } is a basis for V ⊗ V .
In this basis we have (12) · vi ⊗ vj = vj ⊗ vi and hence only the basis elements vi ⊗ vi
contribute to the trace. Therefore, we compute the degeneracies of the trivial and sign
representation from:
1
a1+ = hχ+ , χV ⊗2 i = (χV ⊗2 (1) + χV ⊗2 ((12)))
2
1
= d(d + 1)
2 (11.200)
1
a1− = hχ− , χV ⊗2 i = (χV ⊗2 (1) − χV ⊗2 ((12)))
2
1
= d(d − 1)
2
so we have isotypical decomposition:
1 1
V ⊗2 = d(d + 1)1+ ⊕ d(d − 1)1− (11.201)
2 2
General elements of V ⊗2 are of the form Tij vi ⊗vj and the components are often referred to
as 2-index covariant tensors. The above decomposition is a decomposition into symmetry
types of tensors: We can choose a basis of symmetric tensors:
1
(ei ⊗ ej + ej ⊗ ei ) 1≤i≤j≤d (11.202)
2
and anti-symmetric tensors
1
(ei ⊗ ej − ej ⊗ ei ) 1≤i<j≤d (11.203)
2

Example 3: The previous example is the beginning of a very beautiful story called Schur-
Weyl duality. Let us consider the next case: Consider S3 acting by permuting the various
factors in the tensor space V ⊗ V ⊗ V for any vector space V . Now, if dimV = d then we
have

χ([1]) = d3
χ([(ab)]) = d2 (11.204)
χ[(abc)] = d

as is easily computed by considering the action on the basis {vi ⊗ vj ⊗ vk }.


So we can compute
1 3 2 1
a1+ = (χ1+ , χ) = d3 + d2 + d = d(d + 1)(d + 2) (11.205)
6 6 6 6
1 3 2 1
a1− = (χ1− , χ) = d3 + (−1) · d2 + d = d(d − 1)(d − 2) (11.206)
6 6 6 6
1 3 3 2 1
a2 = (χ2 , χ) = 2d + 0 · d2 + (−1) · d = d(d2 − 1) (11.207)
6 6 6 3

– 185 –
Thus, as a representation of S3 , we have
d(d + 1)(d + 2) d(d − 1)(d − 2) d(d + 1)(d − 1)
V ⊗3 ∼
= 1+ ⊕ 1− ⊕ 2 (11.208)
6 6 3
Note that the first two dimensions are those of S 3 V and Λ3 V , respectively, and that the
dimensions add up correctly. It is not supposed to be obvious, but it turns out that the
last summand corresponds to tensor of mixed symmetry type that satisfy

Tijk + Tjki + Tkij = 0 (11.209)

together with
Tijk = −Tkji (11.210)
Why this is so, and the generalization to the Sn action on V ⊗n is best discussed in the
context of Young diagrams, representation theory of Sn and Schur-Weyl duality. See section
*****

Exercise
Show that right-multiplication of the character table by a diagonal matrix produces a
unitary matrix.

Exercise
Using the orthogonality relations on matrix elements, derive the more general relation
on characters:

1 X δµν (ν)
χµ (g)χν (g −1 h) = χ (h) (11.211)
|G| nµ
g∈G

We will interpret this more conceptually later.

Exercise Another proof of (11.190)


Consider the operator L(g1 ) ⊗ R(g2 ) acting on the regular representation. We will
compute
TrRG [L(g1 ) ⊗ R(g2 )] (11.212)
in two bases.
First consider the basis φµij of matrix elements. We have:

L(g1 ) ⊗ R(g2 ) · φµij =


X µ
Tj 0 j (g2 )Tiµ0 i (g1−1 )φµi0 j 0 (11.213)
i0 ,j 0

– 186 –
so the trace in this basis is just:
X
TrRG [L(g1 ) ⊗ R(g2 )] = χµ (g1 )∗ χµ (g2 ) (11.214)
µ

On the other hand, we can use the delta-function basis: δg . Note that

L(g1 ) ⊗ R(g2 ) · δg = δg1 gg−1 (11.215)


2

So in the delta function basis we get a contribution of +1 to the trace iff g = g1 gg2−1 , that
is iff g2 = g −1 g1 g, that is iff g1 and g2 are conjugate, otherwise we get zero.
Now when g1 and g2 are conjugate we might as well take them to be equal. Then the
functions δg contributing to the trace are precisely those with g ∈ Z(g1 ). But then

|G| |G|
|Z(g1 )| = = (11.216)
|C(g1 )| m1

completing the proof of (11.190).

Exercise Average Number Of Fixed Points - Again


Recall that in a previous exercise you showed that if a finite group G acts on a finite
set X then
1 X g
|X | = |{orbits}| (11.217)
|G| g

Prove this again by viewing L2 (X) as a G-representation and using the orthgonality rela-
tions on characters. 110

Exercise
Write out the unitary matrix Sµi for G = S3 . 111

110
Answer : Note that one basis for L2 (X) is given by δx . Here g · δx = δg·x so |X g | is just the character of
g in this representation. Therefore the average number of fixed points is the degeneracy of the trivial repre-
sentation in the isotypical decomposition. On the other hand, if f : X → C is invariant under the g action
then f takes the same value on any two x1 , x2 in the same G-orbit. Therefore, the subspace corresponding
to the trivial representation in the isotypical decomposition has a basis given by the characteristic function
for each orbit.
111
Answer : The unitary matrix Sµi for S3 is:
 1 1 1

√ √ √
6 2 3
Sµi =  √16 − √12 √13  (11.218)
 
2 1

6
0 − √3

– 187 –
Exercise
a.) Suppose we tried to define a representation of S3 by taking (12) → 1 and (23) → −1.
What goes wrong? ♣this is really
about generators
b.) Show that for any n there are only two one-dimensional representations of Sn and relations and
should be moved
up. ♣

Exercise Universal Properties Of Character Tables


Show that if we partially order the conjugacy classes and the representations so that
the class of the identity and the trivial representation come first then the first row of the
character table is all 1’s and the first column of the character table gives the dimensions
of the irreducible representations.

11.10.2 Orthogonality Relations And Pontryagin Duality


It is interesting to think about the orthogonality of characters in the context of Pontryagin
duality. Let S be a locally compact Abelian group and Sb its Pontryagin dual.
Compact groups have discrete sets of irreducible representations, but the representa-
tions of noncompact groups typically come in continuous families. Thus, for example, the
Pontryagin dual of Z is the continuous group U (1). The orthogonality relations for locally
compact Abelian groups extend to this case, but the δµ,ν delta function on the representa-
tions becomes a delta function δŜ (χ1 − χ2 ) on the characters, where the Dirac measure is
relative to the Haar measure on Ŝ. Thus we have:
Z
χ1 (s)∗ χ2 (s)ds = δSb(χ1 − χ2 ) (11.219)
S

Now, by Pontryagin duality, we have Sb ∼


= S so we also have the relation
b

hψ1 , ψ2 iL2 (S) = hψ̂1 , ψ̂2 iL2 (S)


b (11.220)

Therefore, given a function ψ ∈ L2 (S), we can define its Fourier transform, quite generally,
as Z
ψ̂(χ) := χ(s)∗ ψ(s)ds (11.221)
S

and the orthogonality relations above show that


Z
χ(s1 )∗ χ(s2 )dχ = δS (s1 − s2 ) (11.222)
S
b

– 188 –
This is a result known as either the Plancherel theorem or the Parseval theorem. 112
An important special case of the above is the case S = Γ ⊂ Rn , an embedded lattice.
Then we have: X
χk̄1 (γ)∗ χk̄2 (γ) = δΓ̂ (k̄1 − k̄2 ) (11.223)
γ∈Γ

Here δΓ̂ (k̄) is the delta measure on the dual group

Γ̂ = Rn /Γ∨ (11.224)

We can lift k̄ to k ∈ Rn , and then


X
δΓ̂ (k̄1 − k̄2 ) = δRn (k1 − k2 − γ ∨ ) (11.225)
γ ∨ ∈Rn

On the other hand, using the explicit formula:

χk̄ (γ) = e2πik·γ (11.226)

we get the relation X X


e2πi(k2 −k1 )·γ = δRn (k2 − k1 − γ ∨ ) (11.227)
γ∈Γ γ ∨ ∈Γ∨

This is one version of the Poisson summation formula. Since the PSF is an important
result we will unpack this a bit in the next section.

11.10.3 The Poisson Summation Formula


Let us begin with a standard derivation of the very useful Poisson summation formula:
Let f : R → C that decays fast enough that
X
F (x) := f (x + n) (11.228)
n∈Z

exists. Then F (x) is clearly periodic of period one and therefore defines a function F :
R/Z → C. Since R/Z ∼ = U (1) we can decompose in terms of irreducible representations:
X
F (x) = F̂ (ρm )e2πimx (11.229)
ρm
Z 1
F̂ (ρm ) = e−2πimt F (t)dt (11.230)
0
Now note that
Z 1 Z 1 X
−2πimt
e F (t)dt = e−2πimt f (m + t)dt
0 0 m
(11.231)
Z +∞
−2πimt
= e f (t)dt
−∞

112
The original statements by Plancherel and Parseval concerned special cases and the two terms are not
consistently used in the literature.

– 189 –
Putting x = 0 we learn that for suitably rapidly decaying functions:
X X
f (m) = fb(w) (11.232)
m∈Z w∈Z

where fb is the Fourier transform:


Z
fb(w) = e−2πitw f (t) (11.233)
R

This is valid for functions such that

1. f decays rapidly enough so that the sum on the LHS converges.

2. The Fourier transform fb exists.

3. The Fourier transform fb decays rapidly enough so that the sum on the RHS converges.

Note that this can also be understood as a special case of the orthogonality relations
for characters on Z:
0
X X
e2πint e−2πint = δR/Z (t − t0 ) = δR (t − t0 − k) (11.234)
nZ k∈Z

where δZ means the delta function on the Pontryagin dual group Z.


The generalization of this statement is our result (11.227) above.
Put differently, we have:
X Z X Z
~ ~t ~ ~t
X
f (~n) = e−2πim· f (t)dt = e+2πim· f (t)dt (11.235)
Rd Rd
n∈Zd
~ m∈Z
~ d m∈Z
~ d

That is:

X X
f (~v ) = fˆ(~l) . (11.236)
~v ∈Γ ~l∈Γ∨

Remarks

1. Since people have different conventions for the factors of 2π in Fourier transforms
it is hard to remember the factors of 2π in the PSF. The equation (11.234) has no
factors of 2π. One easy way to see this is to integrate both sides from t = −1/2 to
t = +1/2.

2. One application of this is the x-ray crystallography: The LHS is the sum of scattered
waves. The RHS constitutes the bright peaks measured on a photographic plate.

– 190 –
3. Another application is to analytic number theory. If τ is in the upper half com-
plex plane, and θ, φ, z are complex numbers define the Riemann theta function with
characteristics θ, φ
θ X 2
ϑ[ ](z|τ ) := eiπτ (n+θ) +2πi(n+θ)(z+φ) (11.237)
φ
n∈Z

(usually, θ, φ are taken to be real numbers, but z is complex). This converges to an


entire function of z and is also holomorphic for Imτ > 0.
Using the Poisson summation formula one can show that it obeys the modular trans-
formation law:
θ −z −1 2 −φ
ϑ[ ]( | ) = (−iτ )1/2 e2πiθφ eiπz /τ ϑ[ ](z|τ ) (11.238)
φ τ τ θ

Exercise
a.) Show that
r
X
−πan2 +2πibn 1 X − π(m−b)2
e = e a (11.239)
a
n∈Z m∈Z

b.) Check equation (11.238).

11.11 The Finite Heisenberg Group And The Quantum Mechanics Of A Par-
ticle On A Discrete Approximation To A Circle
*******************************
SOME OF THE MATERIAL IN THIS SUBSECTION IS NOW REDUNDANT WITH
MATERIAL ABOVE AND NEEDS TO BE REMOVED
*******************************
It is very illuminating to interpret the group HeisN in terms of the quantum mechanics
of a particle on a discrete approximation to a circle. ♣This sub-section
assumes some
Recall that if G acts on a set X then it acts on the functions F[X → Y ] for any Y . knowledge of linear
algebra and
Moreover, G always acts on itself by left-translation. Let us apply this general idea to quantum mechanics
explained in chapter
G = ZN , thought of as the N th roots of unity and Y = C, the complex numbers. So we are 2. ♣

studying complex-valued functions on the group G. We can picture the group as a discrete
set of points on the unit circle so we can think, physically, of F[X → Y ] as the space of
wavefunctions of a particle moving on a discrete approximation to a circle.
Now, as a vector space it is clear that F[ZN → C] is isomorphic to CN . To specify
a function is to specify the N different complex values Ψ(ω k ) where ω is a primitive N th
root of one, say, ω = exp[2πi/N ] for definiteness, and k = 0, . . . , N − 1. (We will not try to
normalize our wavefunctions Ψ, but we could. It would make no difference to the present
considerations.)

– 191 –
Another way to see we have an isomorphism is to choose a natural basis, the delta-
function basis:
δj (ω k ) = δj̄,k̄ (11.240)
where j̄, k̄ ∈ Z/N Z, viewed additively. So our isomorphism is δj 7→ ~ej . Put differently,
every wavefunction can be uniquely expressed as
N
X −1
Ψ= zj δj (11.241)
j=0

where zj ∈ C. Indeed zj = Ψ(ω j ).


In fact, we can make H = F[ZN → C] into a Hilbert space in a natural way by
declaring that the inner product is:
1 X ∗
hΨ1 , Ψ2 i := Ψ1 (g)Ψ2 (g) (11.242)
|G|
g∈G

Note that with this inner product the basis δj , j = 0, . . . , N − 1, to be an orthonormal


basis of H. Note that
Note that the sum on the RHS of (11.242) defines a measure on the group: If F : G → C
we define its integral Z
1 X
F dµ := F (g) (11.243)
G |G|
g∈G

and we have normalized the measure so that the group has “volume 1.” This is an example
of an important idea called a Haar measure that we will discuss more later.
Now, recall the general definition from section 4.1. G acts naturally on X (which
happens to be G itself) by left-multiplication. The induced action of G on the complex-
valued functions on G in this case is such that the generator ω of ZN acts on the space of
functions via:
φ̃(ω, Ψ)(ω k ) := Ψ(φ(ω −1 , ω k ))
(11.244)
= Ψ(ω k−1 )

So the generator ω of the group ZN acts linearly on the functions F[ZN → C]. We call
this linear operator P . We can therefore rewrite (11.244) as

(P · Ψ)(ω k ) := Ψ(ω k−1 ) (11.245)

Note that with respect to the inner product (11.242) P is clearly a unitary operator.
The operator P can be viewed as translation operator around the discrete circle by
one step in the clockwise direction. Recall that in the quantum mechanics of a particle on
the line translation by a distance a is

(T (a) · Ψ)(x) = Ψ(x − a) (11.246)

This equation makes sense also for a particle on the circle, that is, with x, a considered
periodically. So our P is T (a) for translation by 2π/N times around the circle clockwise.

– 192 –
Remark: In the quantum mechanics of a particle on a line or circle we could also
write
d
(T (a) · Ψ)(x) = Ψ(x − a) = (exp[iap̂])Ψ(x) = (exp[−a ] · Ψ)(x) (11.247)
dx
so the momentum operator p̂ generates translations. In the finite Heisenberg group there is
no analog of the infinitesimal translations generated by p̂, but only of a finite set of discrete
translations.

Now let Q be the position operator:

(Q · Ψ)(ω k ) := ω k Ψ(ω k ) (11.248)

Q is likewise a unitary operator.


Now note that
(P ◦ Q · Ψ)(ω k ) = (Q · Ψ)(ω k−1 )
(11.249)
= ω k−1 Ψ(ω k−1 )
while
(Q ◦ P · Ψ)(ω k ) = ω k (P · Ψ)(ω k )
(11.250)
= ω k Ψ(ω k−1 )
and therefore we conclude that we have the operator equation:

Q ◦ P = ωP ◦ Q (11.251)

Given a linear transformation and ordered bases for domain and range we can associate
a matrix. (See Chapter two for details if you do not know this.) Now, let us choose the
ordered basis δ1 , . . . , δN . Then we easily compute

P · δj = δj+1 (11.252)

and therefore the matrix for P relative to the basis {δj }, is the matrix with matrix elements

Pi,j = δi,j+1 (11.253)

so, for N = 3 it is  
001
P = 1 0 0 (11.254)
 
010
Similarly, in the basis δj we have
Qi,j = ω j δi,j (11.255)
and since j = 0, 1, . . . , N − 1 we have for N = 3:
 
1 0 0
Q = 0 ω 0  (11.256)
 
0 0 ω2

– 193 –
Thus, we have recovered the N × N clock and shift matrices we discussed above. The
group of unitary operators generated by discrete position and translation operators is the
finite Heisenberg group.
It is interesting to study the operators P and Q in a different basis. We can introduce
a “plane wave basis” of functions Ψj ∈ H, with j = 0, . . . , N − 1 defined by

Ψj (ω k ) = ω jk (11.257)

It is now easy to compute the action of P and Q on this basis:

P · Ψj = ω −j Ψj
(11.258)
Q · Ψj = Ψj+1

The roles of P and Q have been exchanged! P (as is P −1 ) is now representated by “clock
matrix” and Q is represented by a “shift matrix”. Indeed, what we have done is perform
a transformation from a position representation to a momentum representation in the
language of quantum mechanics.
The basis Ψj represents a very general and beautiful fact: The Ψj are actually group
homomorphisms Ψj : ZN → U (1) ⊂ GL(1, C) because

Ψj (ω k1 ω k2 ) = Ψj (ω k1 ) · Ψj (ω k2 ) (11.259)

as is easily checked. The one-dimensional vector spaces spanned by the individual Ψj


decompose the Hilbert space L2 (ZN ), which is an N -dimensional representation of ZN into
a direct sum of 1-dimensional (irreducible) representations. 113 ♣The next several
paragraphs are now
In general, for compact Lie groups there is a left action of G on L2 (G) and, as a redundant with
material in section
representation of G 11.8 et. seq. You

L2 (G) ∼
might want to leave
= ⊕j (dimVj )Vj (11.260) this out and bring
back the example in
those later sections.
where Vj runs over the distinct irreducible representations. This is part of an important ♣
general theorem about compact Lie groups known as the Peter-Weyl theorem.
Moreover, the exchange of clock and shift matrices by passing from the δj basis to the
basis of characters is again an instance of a general fact:
Let Gb be the set of homomorphisms χ : G → U (1). For an Abelian group G the set G b
is also a group, using the product law:

(χ1 · χ2 )(g) := χ1 (g)χ2 (g) (11.261)

The reader should be able to check that this defines a group law on G.
b The group G
b is
known as the Pontryagin dual group
For reasonable 114 Abelian groups we have
bb ∼
G =G (11.262)
113
See Chapter four for a detailed discussion of the idea of reducible and irreducible representations. In
brief, a representation ρ : G → GL(V ) is reducible if there is a nonzero linear subspace W ⊂ V which is
preserved by all the operators ρ(g). Note that one-dimensional representations are trivially irreducible.
114
e.g. locally compact Abelian groups. See the book by Kirillov on representation theory.

– 194 –
If χ is a homomorphism we can define the Fourier transform
1 X
Ψ(χ)
b := p χ(g)Ψ(g) (11.263)
|G| g∈G

giving an isometry of L2 (G)


b with L2 (G). Being an isometry means that

hΨ b 2 i 2 b = hΨ1 , Ψ2 iL2 (G)


b 1, Ψ
L (G)
(11.264)

and this in turn implies that


1 X ∗
χ (g1 )χ(g2 ) = δg1 ,g2
|G|
b
χ∈G
b
(11.265)
1 X ∗
χ1 (g)χ2 (g) = δχ1 ,χ2
|G|
g∈G

These equations are special cases of the famous orthogonality relations for the matrix
elements of irreducible representations of compact groups.
Now, for G = ZN we have G b∼ = ZN . The passage from the δj basis to the Ψj basis,
which diagonalizes P is just the finite Fourier transform.
In more concrete terms: The trace of all the powers of P less than N is also obviously
zero and P N = 1 and no smaller power of P is the identity. So P must be unitarily
equivalent to Q. Now we can easily check that

SP S −1 = Q (11.266)

where S is the finite Fourier transform matrix


1 jk
Sj,k = √ e2πi N (11.267)
N
One easy way to check this is to multiply the matrices SP and QS in the δj basis. The
reader should check that S is in fact a unitary matrix and that the matrix elements only
depend on the projections j̄, k̄ ∈ Z/N Z.
In any case, Sj,k takes us from a position basis δj to a “momentum basis” where P is
diagonal, in beautiful analogy to how the Fourier transform converts a position basis to a
momentum basis for a particle on the line.
We will return to these ideas and discuss Pontryagin duality and Fourier transforms
in Chapter 4 below.

Exercise The Pontryagin Dual Of ZN Is Isomorphic To ZN


Let χ : ZN → U (1) be a homomorphism. Let g be a generator of ZN . Show that χ(g)
must be an N th root of unity, and choosing any N th root of unity defines the homomor-
phism. Conclude that Zc ∼
N = ZN .

– 195 –
Exercise Orthogonality For ZN
Write out the equations (11.265) for the case G = ZN

11.12 Decomposition Of Tensor Products Of Representations And Fusion Co-


efficients
A frequently asked question is the following: Suppose we know the reduction of two rep-
resentations of G, say, V1 , V2 into irreps. What is the decomposition of the tensor product
V1 ⊗ V2 into irreps? For compact groups this is nicely answered with characters.
Suppose we have two representations (T1 , V1 ) and (T2 , V2 ) and we know the isotypical
decompositions:
V1 = ⊕aµ V µ V2 = ⊕bν V ν (11.268)
then
V1 ⊗ V2 = ⊕µ,ν aµ bν V µ ⊗ V ν (11.269)
so the isotypical decomposition of V1 ⊗ V2 follows immediately if we can find the isotypical
decomposition of the tensor product of two irreps.
On general grounds we know that ♣κ = C here? ♣

Vµ⊗Vν ∼
= ⊕λ HomG (V λ , V µ ⊗ V ν ) ⊗ V λ (11.270)

The dimensions
λ
Nµν := dimκ HomG (V λ , V µ ⊗ V ν ) (11.271)
are known as the fusion coefficients and they give the isotypical decomposition:
µνN λ times
z }| {
V (µ) ⊗ V (ν) = ⊕λ V (λ)
⊕ ··· ⊕ V (λ)
(11.272)
λ
= ⊕λ Nµν V (λ)
Recall that
χV1 ⊗V2 (g) = χV1 (g)χV2 (g) (11.273)
Therefore, taking the trace of (11.272) we get a formula for the fusion coefficients:
X
λ
χµ (g)χν (g) = Nµν χλ (g) (11.274)
λ

Now, taking the inner product we get:

λ
Nµν = hχλ , χµ χν i (11.275)
There is a very beautiful explicit formula for the fusion coefficients in the case of finite
groups. In this case we can write:
1 X
Nµνλ
= χµ (g)χν (g)χλ (g −1 ) (11.276)
|G|
g∈G

– 196 –
This can be written in a different way. For simplicity choose unitary irreps, and recall
that the orthogonality relations on characters is equivalent to the statement that
r
mi
Sµi := χµ (Ci ) µ = 1, . . . , r i = 1, . . . , r (11.277)
|G|

is a unitary q
matrix. Let 1 denote the trivial representation or the trivial conjugacy class.
mi
Then S1i = |G| . Thus we can write
X Sµi Sνi S ∗
λ λi
Nµν = (11.278)
S1i
i

This is a prototype of a celebrated result in conformal field theory known as the


“Verlinde formula.”
Equations (11.275)(11.276)(11.278) give a very handy way to get the numbers Nµν λ .
λ
Note that, by their very definition the coefficients Nµν are nonnegative integers, although
this is hardly obvious from, say, (11.278).

Example 1: Consider the irreps ρm of Z/N Z. We then have the fusion rules:

ρm ⊗ ρn ∼
= ρm+n (11.279)

Recall that ρm only depends on the integer mmodN . The S-matrix above is the finite
Fourier transform matrix: s
1 2πiµi/N
Sµi = e (11.280)
|G|
from which one easily verifies (11.278).

Remarks

1. Denote the intertwiners from V λ to V µ ⊗ V ν by:


λ
Vµν := HomG (V λ , V µ ⊗ V ν ) (11.281)

Note that we can decompose the triple product V µ ⊗ V ν ⊗ V λ in two natural ways
leading to the isomorphism
ρ κ ∼ κ ρ
⊕ρ Vµν ⊗ Vρλ = ⊕ρ Vµρ ⊗ Vνλ (11.282)

If we choose bases for the spaces of intertwiners then the components of the isomor-
phism relative to these bases are called fusion matrices and they satisfy some nice
identities. ♣Explain pentagon
and hexagon? ♣

2. The considerations of this section lead rather naturally to the beautiful subject of
Frobenius algebras and 2d topological quantum field theory. See Chapter 4 section
**** for more about this.

– 197 –
3. The various identities satisfied by the fusion matrix and the S-matrix have remarkable
analogues in the subject of 2d rational conformal field theory. For more about this
see:
a.) G. Moore and N. Seiberg, “Classical and Quantum Conformal Field Theory,”
Commun. Math. Phys. 123(1989)177
b.) G. Moore and N. Seiberg, “Lectures on Rational Conformal Field Theory,” in
Strings ’89,Proceedings of the Trieste Spring School on Superstrings, 3-14 April 1989,
M. Green, et. al. Eds. World Scientific, 1990. Available on G. Moore’s home page.
c.) Philippe Di Francesco, Pierre Mathieu, David Senechal, Conformal Field Theory,
Springer
d.) Jurgen Fuchs, Affine Lie Algebras And Quantum Groups ♣Need to add other
references ♣

Exercise Fusion Algebra For S3


Let V + , V − , V 2 denote the irreps of S3 of dimensions 1, 1, 2. Show that

V+⊗Vµ ∼
=Vµ
V−⊗V− ∼
=V+
(11.283)
V−⊗V2 ∼
=V2
∼V+⊕V−⊕V2
V2⊗V2 =

Exercise Identities For Fusion Coefficients


a.) Show that Nµνλ = Nλ
νµ
λ
b.) Show that Nµ1 = δµλ
λ as a matrix. It gives the matrix for tensor product with
c.) For fixed µ regard Nµν
V µ . Show that this matrix is diagonalized by Sµi :
X
λ Sµj Sνj
Nµν Sλj = (11.284)
S1j
λ

 
X
∗ λ Sµj
Sνi Nµν Sλj = δij (11.285)
S1j
ν,λ

d.) Show that


Sµ1
= dimV µ (11.286)
S11

– 198 –
11.13 Induced Representations
Let G be a group and H a subgroup. Suppose that ρ : H → Aut(V ) is a representation
of the subgroup H. Using this data we are going to produce, canonically, a representation
of G known as an induced representation. Note that, in general, there is no way to extend
ρ : H → Aut(V ) to a homomorphism ρ : G → Aut(V ). As a counterexample, we will see
later that the only one-dimensional representation of SU (2) is the trivial representation.
However, there are nontrivial representations of the subgroup of diagonal matrices. So, in
general, a representation ρ : H → Aut(V ) of a subgroup H ⊂ G is not just the restriction
of a representation of G on V .
Then, as we have seen, Map(G, V ) is canonically a G × H-space. The left-action of
G × H is defined by declaring that for (g, h) ∈ G × H and Ψ ∈ Map(G, V ) the new function
φ((g, h), Ψ) ∈ Map(G, V ) is the function G → V defined by:
φ((g, h), Ψ)(g0 ) := ρ(h) · Ψ(g −1 g0 h) (11.287)
for all g0 ∈ G. Or, in slightly lighter notation:
(g, h) · Ψ(g0 ) := ρ(h)Ψ(g −1 g0 h) (11.288)
Now, we can consider the subspace of functions fixed by the action of 1 × H. That is,
we consider functions which satisfy

Ψ(gh−1 ) = ρ(h)Ψ(g) (11.289)


for every g ∈ G and h ∈ H. Put differently: There are two natural left-actions by H on
Map(G, V ) and we consider the subspace where they are equal. Such functions are said
to be the H-equivariant. See the exercise below for the justification of this terminology.
Note that the space of H-equivariant functions G → V is a linear subspace of Map(G, V ).
We will denote it by IndG
H (V ):
−1
IndG
H (V ) := {Ψ : G → V | Ψ(gh ) = ρ(h)Ψ(g) ∀g ∈ G, h ∈ H} (11.290)
Note that since we are taking the fixed points of the subgroup {1G } × H of the G × H
action on Map(G, V ) there is still a G action on the fixed point set. More explicitly, if Ψ
is an H-equivariant function satisfying (11.289) then g · Ψ with values
(g · Ψ)(g0 ) := Ψ(g −1 g0 ) (11.291)
is also an H-equivariant function. (Check this!) Thus IndG H (V ) is a representation space
of G.
The subspace IndG H (V ) ⊂ Map(G, V ) of H-equivariant functions, i.e. functions satis-
fying (11.289) is called the induced representation of G, induced by the representation V of
the subgroup H. This is an important construction with a beautiful underlying geometrical
interpretation. In a sense we will explain below all the representations of compact groups
follow from this construction. One can also use it to construct representations of many
important noncompact and infinite-dimensional groups. For this reason it appears in many
places in physics.
Two examples of important applications in physics are:

– 199 –
1. The irreducible unitary representations of space groups in condensed matter physics.

2. The irreducible unitary representations of the Poincaré group in QFT.

Example: Let us take V = C with the trivial representation of H, i.e. ρ(h) = 1. Then
the induced representation is the vector space of functions on G which are invariant under
right-multiplication by H. This is precisely the vector space of C-valued functions on the
homogeneous space G/H. Recall that for G = SU (2) and H = U (1) we have seen that
G/H = CP1 ∼ = S 2 so we can get a nice basis of orthogonal functions on S 2 from the
functions on SU (2). These are called spherical harmonics and we will discuss them more
below.

Exercise Functoriality Of Induction


a.) Show that
IndG ∼ G G
H (V1 ⊕ V2 ) = IndH (V1 ) ⊕ IndH (V2 ) (11.292)
b.) If A ∈ HomH (V1 , V2 ) is an intertwiner between H-reps then there is an induced
map F (A) : IndG G
H (V1 ) → IndH (V2 ) which is an intertwiner of G-representations.
115

Exercise H-Equivariance
Show that a function Ψ → V satisfying (11.289) fits in a commutative diagram:

G
Ψ /V (11.293)
R(h) ρ(h−1 )
 
G
Ψ /V

where R(h) : g 7→ gh is the right action of H on G. Thus a function satisfying (11.289) is


a morphism of H-spaces, justifying the term “equivariant.”

11.13.1 The Geometrical Interpretation


Note that there is a right H-action on the set G × V :

φh : (g, v) 7→ (gh, ρ(h−1 )v) (11.294)

we can therefore form the quotient space of orbits. In this case it is usually denoted
G ×H V , but it is just the set of equivalence classes under the above right H-action. There
is a natural map
π : G ×H V → G/H (11.295)
115
Answer (F (A)(Ψ))(g) = A(Ψ(g)).

– 200 –
given by π : [(g, v)] 7→ gH. Referring back to the discussion of equation (8.92) we see that
this is the associated bundle to the principal H bundle π : G → G/H.
When G, H are Lie groups and ρ is a continous representation the map π is continuous.
Moreover, the fiber above any coset gH is the vector space V . We therefore have an
example of a vector bundle over G/H with fiber V . The sections of the vector bundle are, ♣WE SHOULD
EXPLAIN THE
by definition, continuous maps NOTION OF
SECTIONS IN THE
DISCUSSION OF
s : G/H → G ×H V (11.296) BUNDLES ABOVE.

that are a right-inverse to π, that is π ◦ s = IdG/H . To construct such a section we have


to identify, for each coset gH an equivalence class in G × V which projects back down to
gH. If we represent gH by g then it must be an equivalence class of the form

s(gH) = [(g, v(g))] (11.297)

for some vector v(g) associated to g. But now gH = g̃H when g̃ = gh for h ∈ H. So it
must be that

[(g, v(g))] = s(gH) = s(g̃H) = [(g̃, v(g̃)] = [(gh, v(g̃)] = [(g, ρ(h)v(g̃))] (11.298)

and hence

v(g) = ρ(h)v(g̃) = ρ(h)v(gh) ⇒ v(gh) = ρ(h−1 )v(g) (11.299)

This must hold for all g ∈ G, and hence g 7→ v(g) is an equivariant function: Thus,
the space of sections of the homogeneous vector bundle π : G ×H V → G/H is canoni-
cally identified with the space of H-equivariant functions G → V satisfying (11.289).

11.13.2 Frobenius Reciprocity


The theory of induced representations is already interesting and nontrivial for G, H fi-
nite groups. In this case G → G/H is a finite cover (by H) of a discrete set of points.
Nevertheless, the general geometrical ideas apply.
Let Rep(G) denote the category of finite-dimensional representations of G. Mor-
phisms between W1 , W2 ∈ Rep(G) are linear transformations commuting with G, i.e.
G-intertwiners, and the vector space of all morphisms is denoted HomG (W1 , W2 ). The
induced representation construction defines a functor

Ind : Rep(H) → Rep(G). (11.300)

(We denoted this by IndGH before but H, G will be fixed in what follows so we simplify the
notation.) On the other hand, there is an obvious functor going the other way, since any
G-rep W is a foriori an H-rep, by restriction. Let us denote this “restriction functor”

R : Rep(G) → Rep(H) (11.301)

How are these two maps related? The answer is that they are “adjoints” of each other!
This is the statement of Frobenius reciprocity:

– 201 –
HomG (W, Ind(V )) = HomH (R(W ), V ) (11.302)
We can restate the result in another way which is illuminating because it helps to
answer the question: How is IndG
H (V ) decomposed in terms of irreducible representations
of G? Let Wα denote the distinct irreps of G. Then Schur’s lemma tells us that

IndG ∼ G
H (V ) = ⊕α Wα ⊗ HomG (Wα , IndH (V )) (11.303)
But now Frobenius reciprocity (11.302) allows us to rewrite this as

IndG ∼
H (V ) = ⊕α Wα ⊗ HomH (R(Wα ), V ) (11.304)

where the sum runs over the unitary irreps Wα of G, with multiplicity one.
The statement (11.304) can be a very useful simplification of (11.303) if H is “much
smaller” than G. For example, G could be nonabelian, while H is abelian. But the
representation theory for abelian groups is much easier! Similarly, G could be noncompact,
while H is compact. etc.

Proof of Frobenius reciprocity:

In order to prove (11.304) we note that it is equivalent (see the exercise below) to the
statement that the character of IndG
H (V ) is given by

X
χ(g) = χ̂(x−1 gx) (11.305)
x∈G/H

where x runs over a set of representatives and χ̂ is the character χV for H when the
argument is in H and zero otherwise.
On the other hand, (11.305) can be understood in a very geometrical way. Think of
the homogeneous vector bundle G ×H V as a collection of points gj H, j = 1, . . . , n with
a copy of V sitting over each point. Now, choose a representative gj ∈ G for each coset.
Having chosen representatives gj for the distinct cosets, we may write:

g · gj = gg·j h(g, j) (11.306)


where j 7→ g · j is just a permutation of the integers 1, . . . , n, or more invariantly, a
permutation of the points in G/H.
Now let us define a basis for the induced representation by introducing a basis va for
the H-rep V and the equivariant functions determined by:

ψi,a (gj ) := va δi,j (11.307)

Geometrically, this is a section whose support is located at the point gi H. The equivariant
function is then given by
ψi,a (gj h) := ρ(h−1 )va δi,j (11.308)

– 202 –
Now let us compute the action of g ∈ G in this basis:

(g · ψi,a )(gj ) = ψi,a (g −1 gj )


= ψi,a (gg−1 ·j h(g −1 , j)) (11.309)
= δi,g−1 ·j ρ(h(g −1 , j)−1 ) · va

Fortunately, we are only interested in the trace of this G-action. The first key point is
that only the fixed points of the g-action on G/H contribute. Note that the RHS above
is supported at j = g · i, but if we are taking the trace we must have i = j. But in
this case ggi = gi h(g, i) and hence g −1 gi = gi h(g, i)−1 so for fixed points we can simplify
h(g −1 , i) = h(g, i)−1 , and hence when we take the trace the contribution of a fixed point
ggi H = gi H is the trace in the H-rep of h(g, i) = gi−1 ggi , as was to be shown ♠

(23)

g H
1

(123) (123)
(13) (12)

g H (23) g H
3 2

(123)
(12) (13)

Figure 32: The left action of G = S3 on G/H. In fact, this picture should be considered as a
picture of a category, in this case, a groupoid.

Remark: The Ind map does not extend to a ring homomorphism of representation
rings.

Exercise ♣Physics 619, ch.


5, 2002 for more ♣
Let G be the symmetric group on {1, 2, 3} and let H = {1, (12)}. Choose a represen-
tation of H with V ∼ = C and ρ(σ) = +1 or ρ(σ) = −1.
a.) Show that in either case, the induced representation IndG
H (V ) is a three-dimensional
vector space.
b.) Choose a basis for IndG H (V ) and compute the representation matrices of the ele-
ments of S3 explicitly.

– 203 –
Example A simple example from finite group theory nicely illustrates the general idea.
Let G = S3 be the permutation group. Let H = {1, (12)} ∼ = Z2 be a Z2 subgroup. G/H
consists of 3 points. The left action of G on this space is illustrated in (??).
There are two irreducible representations of H, the trivial and the sign representation.
These are both 1-dimensional. Call them V (), with  = ±. Accordingly, we are looking
at a line bundle over G/H and the vector space of sections of G ×H V () is 3-dimensional.
A natural basis for the space of sections is given by the functions which are “δ-functions
supported at each of the three points”:

si (gj H) = δij
g1 H = (13)H = {(13), (123)}
(11.310)
g2 H = (23)H = {(23), (132)}
g3 H = (12)H = {1, (12)}
These sections correspond to equivariant functions on the total space. The space of all
functions F : G → R is a six-dimensional vector space. The equivariance condition:

F (12) = F (1)
F (123) = F (13) (11.311)
F (132) = F (23)
cuts this six-dimensional space down to a three-dimensional space.
We can choose a basis of equivariant functions by choosing 3 representatives g1 , g2 , g3
for the cosets in G/H and setting F i (gj ) = δij . Using such a basis the representation of
the group is easily expressed as a permutation representation.
In our example of G = S3 it is prudent to choose g1 = (13), g2 = (23), g3 = (12) so
that

(12)g1 = g2 (12)
(12)g2 = g1 (12)
(12)g3 = g3 (12)
(13)g1 = g3 (12)
(13)g2 = g2 (12) (11.312)
(13)g3 = g1 (12)
(23)g1 = g1 (12)
(23)g2 = g3 (12)
(23)g3 = g2 (12)
From this one easily gets the induced representation
 
0  0
ρind (12) =   0 0 (11.313)
 
00 

– 204 –
 
00 
ρind (132) =   0 0 (11.314)
 
0  0
and so forth.
Now let us look at Frobenius reciprocity. The irreducible representations of G are
W () defined by (ij) → , and W2 defined by the symmetries of the equilateral triangle,
embedded into O(2):

!
−1 0
ρ2 (12) =
0 1
√ ! (11.315)
−√12 23
ρ2 (123) =
− 23 − 12
As H = Z2 representations, we have W () ∼
= V () and

W2 ∼
= V (+1) ⊕ V (−1) (11.316)

Therefore
HomH (W (), V (0 )) = δ,0 R
(11.317)
HomH (W2 , V ()) = R
By Frobenius reciprocity we have

IndSS32 (V ()) = W () ⊕ W2 (11.318)


Let us check this by computing characters from the geometric perspective.
The induced representation consists of functions on G/H valued in V (). The action
of g acts as a permutation representation on the support of the functions.
Therefore, we can compute the character of g in the induced representation by looking
at its action on fixed points:
X
χρind (g) = χV () (h(g, i)) (11.319)
F ix(g)

To find out how to decompose the representation IndSS32 (V ()) in terms of G = S3 irreps
it suffices to compute the character for g = (12) and g = (123). Now, g = (12) has exactly
one fixed point, namely g3 H and h(g, 3) = (12) for this element. Therefore,

χρind (12) = χV (12) =  (11.320)

On the other hand, g = (123) clearly has no fixed points, and therefore the character is
zero. It follows immediately that we have the decomposition (11.318).
********************
MORE MATERIAL IN GTLect4-IntroRepTheory-2020
*******************
************************************************

– 205 –
11.14 Representations Of SU (2)
11.14.1 Homogeneous Polynomials
We now use the idea of induced representations to construct representations of SU (2). In
fact, we will construct all the irreducible representations. ♣Using ρ rather
than T for
We take G = SU (2) and we take the subgroup H ⊂ SU (2) to be a “maximal torus” representations
here. Uniformize?
namely the subgroup of diagonal SU (2) matrices. We will denote it by T . As a group ♣

T ∼
= U (1). For V we choose an irreducible representation ρk of U (1) where k ∈ Z and the ♣This conflicts with
our notation T for
carrier space is V ∼= C. Thus we choose the one-dimensional representation of T defined the homomorphism
in a rep. So we use
by: !! ρ here. ♣

z 0
ρk := z k (11.321)
0 z −1

where z ∈ U (1) and k ∈ Z. Then the induced representation


SU (2)
IndU (1) (ρk ) (11.322)

is, by definition the space of functions F : SU (2) → V = C satisfying the equivariance


condition ! ! !
u −v̄ eiθ u −v̄
F( ) = e−ikθ F ( ) (11.323)
v ū e−iθ v ū

We can abbreviate !
u −v̄
F( ) ⇒ F (u, v) (11.324)
v ū

so we just have
F (ueiθ , veiθ ) = e−ikθ F (u, v) (11.325)

Of course g ∈ SU (2) implies |u|2 + |v|2 = 1. So we can also view this as the space of
equivariant functions S 3 → C for the right-action of U (1).
Actually, we don’t want all functions - that space is too big. One useful subspace to
restrict to is L2 (S 3 ).
The induced representation is infinite dimensional and very far from being irreducible.
For example, the functions uk |u|` for ` any positive integer ≥ −k are all smooth functions
on SU (2) and are independent vectors in L2 (SU (2)).
It turns out we can cut down the ρk -equivariant functions L2 (SU (2)) down to a finite
dimensional irrep of SU (2) by imposing one more condition. It is a condition of holomorphy
as we now explain.
One way to view this restriction is first to view S 3 ⊂ C2 and consider instead the space
of all functions of two variables !
u
∈ C2 (11.326)
v

which are equivariant in the sense of equation (11.325). This larger space of functions
is also a representation of SU (2) and also we can restrict functions to give equivariant

– 206 –
functions on SU (2). But now we can look at the subspace of equivariant functions which
are also holomorphic functions on C2 . This will be a finite-dimensional space of functions.
Holomorphy requires that −k ≥ 0. Such homogeneous holomorphic functions must
be polynomials in u, v. It is convenient to set −k = 2j ∈ Z+ . The restriction to the
holomorphic equivariant functions gives the finite dimensional space H2j of homogeneous
polynomials in u, v of degree 2j. In physics this is called the spin-j representation for
reasons explained below. In general we will denote the isomorphism class of the spin-j
representation by Vj .
There is a second, very useful, way to think about the restriction to holomorphic
functions. Recall that:
SU (2)/U (1) ∼
= SL(2, C)/B (11.327)

where B is the subgroup of upper triangular matrices.


!
z w
B={ |z ∈ C∗ , w ∈ C} (11.328)
0 z −1

Both quotients are CP1 . But the latter description is useful because we can introduce
the idea of holomorphy. Note that the representation ρk of U (1) extends uniquely to a
holomorphic representation of B:
!
z w
ρk ( ) = zk (11.329)
0 z −1

Now consider
SL(2,C)
IndB (ρk ) (11.330)

Once again we can consider the equivariant functions:

F (gb) = ρk (b−1 )F (g) b∈B (11.331)

Now g ∈ SL(2, C) looks like


!
us
g= ut − vs = 1 (11.332)
v t

and any three matrix elements will serve as holomorphic coordinates as long as we can
solve for the fourth. However we have matrices
!
1x
∈B (11.333)
0 1

and equivariance under these matrices implies that


!! !!
us u s + ux
F =F (11.334)
v t v t + vx

– 207 –
and hence B-equivariant functions on SL(2, C) are again functions of just the vector
!
u
∈ C2 − {0} . (11.335)
v

such that ! !
u −k u
F( z) = z F ( ) (11.336)
v v

for z ∈ C∗ . Such functions restrict to the functions on the 3-sphere |u|2 + |v|2 = 1 and a
function equivariant for B by ρk restricts to a function on SU (2), equivariant for T by ρk
since
B ∩ SU (2) = T (11.337)
SU (2)
Conversely we can use the equivariance to extend the functions in IndU (1) (ρk ) to functions
SL(2,C)
in IndB (ρk ) so they are really the same representation:

SL(2,C) SU (2)
IndB (ρk ) ∼
= IndU (1) (ρk ) (11.338)

SL(2,C)
The great advantage of the description IndB (ρk ) is that in this description it is
manifest that there is a subspace of holomorphic equivariant functions: Such functions
extend uniquely to holomorphic functions on all of C2 (by a theorem known as Hartog’s
theorem) so we are simply considering holomorphic functions on C2 . In this way we have
SL(2,C)
found a nice finite dimensional subspace of IndB (ρk ) of equivariant holomorphic func-
tions.
Since the SU (2) action is holomorphic, the restriction to the finite-dimensional sub-
space of holomorphic functions in fact forms a representation of SU (2). It is the space of
homogeneous polynomials of degree 2j = −k in u, v. (The space is empty if k > 0.)
A basis for H2j is

f˜j,m (u, v) := uj+m v j−m (11.339)

for m = −j, −j + 1, −j + 2, · · · , j − 1, j. Note that m increases in steps of +1 and hence


j ± m is always an integer even though j, m might be half-integer. Thus Vj ∼ = H2j is a
complex vector space of dimension 2j + 1. Now, for g ∈ SU (2),
!
α −β̄
g= |α|2 + |β|2 = 1 (11.340)
β ᾱ

we compute the matrix elements for this representation of SU (2) relative to this basis via:

(g · f˜j,m )(u, v) := f˜j,m (ᾱu + β̄v, −βu + αv)


= (ᾱu + β̄v)j+m (−βu + αv)j−m
X j (11.341)
:= D̃m0 m (g)f˜j,m0
m0

– 208 –
Note that the basis f˜j,m diagonalizes the action of the diagonal SU (2) matrices with
β = 0:
g · f˜j,m = ᾱj+m αj−m f˜j,m = α−2m f˜j,m (11.342)
That is,
j
D̃m L ,mR
(g) = δmL ,mR α−2mR (11.343)
when β = 0.
j
More generally, we can derive an explicit formula for the matrix elements D̃m 0 m (g)

as functions on SU (2) by expanding out the two factors in (11.341) using the binomial
theorem and collecting terms:

j
X j + mj − m
D̃m0 m (g) = ᾱs αj−m−t β̄ j+m−s (−β)t (11.344)
0
s t
s+t=j+m

a

Here the sum is over integers s ≥ 0 and t ≥ 0 and we recall that b = 0 when b > a so
this is a finite sum. We recover the functions f˜m , up to scale from
 
j j−m 2j
D̃−m,−j (g) = (−1) αj+m β j−m (11.345)
j−m
We claim that the representations H2j are irreducible, and moreover give a full set
of irreducible representations of SU (2). We will prove this using characters in the next
section.

Remarks

1. Representations Of SO(3) That Lift To Representations Of SU (2)


Since SO(3) ∼ = SU (2)/Z where Z ∼ = Z2 is the central subgroup consisting of SU (2)
matrices proportional to the unit matrix: Z = {±12×2 }, we can easily determine the
irreducible representations of SO(3). On the spin j representation the central element
acts as ρj (−1) = (−1)2j 1Vj . Therefore: the irreducible representations of SO(3) are
given by Vj for j ∈ Z, that is, the representations where dimC Vj is odd. The other
irreducible representations of SU (2) with dimC Vj even are not representations of
SO(3), although they are projective representations - see section **** below.
SU (2)
2. Returning to the induced representation IndU (1) (ρk ) of smooth (not holomorphic)
equivariant functions on SU (2) we see that the functions D̃ satisfy the equivariance
condition !
eiθ
j
D̃mL ,mR (g ) = (eiθ )−2mR D̃m
j
L ,mR
(g) (11.346)
e−iθ
and hence we get the equivariance condition for 2mR = k. Such values of mR can
only appear for j ≥ |k|/2 with j = |k|/2mod1. So the induced representation is
j
spanned by the functions D̃m L ,k/2
with j ≥ |k|/2 and j = |k|/2mod1 and m ∈
{−j, −j + 1, . . . , j − 1, j}. Thus as a representation of SU (2) we have:
SU (2)
IndU (1) (ρk ) ∼
= V|k|/2 ⊕ V|k|/2+1 ⊕ · · · (11.347)

– 209 –
When k is nonpositive we can set k = −2j0 and the span of the set of functions
j0
Dm L ,−j0
where m ranges over −j0 , −j0 + 1, . . . , j0 − 1, j0 transforms under the left
regular representation in a representation isomorphic to Vj . These are the func-
tions with a holomorphic interpretation. When k is positive there is no holomorphic
interpretation.
j
3. It is instructive to evaluate the functions D̃m 0 m (g) where g is parametrized by Euler

angles. We find:
j
D̃m L ,mR
(g) = e−iφmL P̃m
j
L ,mR
(θ)e−iψmR (11.348)
j
where P̃m L ,mR (θ) is a polynomial in cos(θ/2) and sin(θ/2):

  
X j + mR j − mR
j
P̃m L ,mR
(θ) 2j+mL +mR
= (−i) (−1)s
(cos θ/2)j−mR +s−t ((sin θ/2)j+mR +t−s
s t
s+t=j+mL
(11.349)
j
It is closely related to an associated Legendre polynomial and the functions D̃m 0 m are

closely related to Wigner functions. We will explain more about this below.

4. Looking ahead to the relation to Lie algebras, diagonal SU (2) matrices can be written
as g = exp[−iσ 3 φ] and then
g · f˜m = ei2mφ f˜m (11.350)

so the basis f˜m is proportional to the physicist’s basis |j, mi. Note that
!
eiφ
 
i 3
= exp[−2φ − σ ] (11.351)
e−iφ 2

so this basis diagonalizes the generator T 3 = − 2i σ 3 , that is ρ(T 3 ) · f˜j,m = −mf˜j,m .

Exercise Explicit Matrices For Small j


a.) Show that for j = 1/2, relative to the ordered basis {f˜+1/2 , f˜−1/2 } we have
!
ᾱ β̄
D̃ j=1/2
(g) = = g∗ (11.352)
−β α

b.) Show that for j = 1, relative to the ordered basis {f˜1 , f˜0 , f˜−1 } we have
 
ᾱ2 −ᾱβ β2
D̃j=1 (g) = 2ᾱβ̄ |α|2 − |β|2 −2αβ  (11.353)
 
β̄ 2 αβ̄ α2

– 210 –
11.14.2 Chararacters Of The Representations Vj
Every g ∈ SU (2) is diagonalizable so we can say
!
z
g ∼ d(z) := (11.354)
z −1

where |z| = 1 and both z and z −1 define the same conjugacy class 116
For the spin j representation denote the character by χj . In the basis described above
D̃j (d(z)) is diagonal and given by

D̃j (d(z)) = Diag{z −2j , z −2j+2 , . . . , z 2j−2 , z 2j } (11.355)

Therefore
z 2j+1 − z −2j−1
χj (g) = z −2j + z −2j+2 + · · · + z 2j−2 + z 2j = (11.356)
z − z −1
Now we write the orthogonality relations for the characters. We have written the Haar
measure above in Euler angles, but this is not the most convenient form for integrating
class functions. Given the Euler angles (φ, θ, ψ) of u ∈ SU (2) it is not so evident how to
find the value of the angle ξ such that

g = cos ξ + i sin ξn̂ · ~σ = exp[iξn̂ · ~σ ] (11.357)

The conjugacy class with angle ξ is a two-dimensional sphere in S 3 with radius sin ξ. Using
this we see that the properly normalized measure on SU (2) is such that on a class function
F we have:
1 2π
Z Z I
1 dz
F (u)[du] = 2
f (θ) sin θdθ = − g(z)(z − z −1 )2 (11.358)
SU (2) π 0 4πi z

where !
eiθ
f (θ) = F ( ) = g(z) (11.359)
e−iθ

with z = eiθ .
Now, the space of class functions L2 (SU (2))class can be identified with the (completion
of) the space of Laurent polynomials in z which is even under the involution z → 1/z.
This space is clearly spanned by z 2j + z −2j , and that basis is related to the χj by an upper
triangular matrix. Moreover, one easily confirms the general relation

hχj , χj 0 i = δj,j 0 (11.360)

From this we learn that we have a complete set of representations and moreover these are
all irreducible representations.

Remarks
116
d(z) = vd(z −1 )v −1 where, for example, we can take v = iσ 1 .

– 211 –
1. If we write n = 2j and z = eiθ then
sin((n + 1)θ)
χj (u) = = Un+1 (cos θ) (11.361)
sin θ
where Un+1 (x) are known as the Chebyshev polynomials of the second kind.

2. Weyl Density Formula. For more details about the following see Chapter 5, Survey
of Matrix Groups. The formula for integrating class functions with the Haar measure
has a nice analog for all the classical groups. For example, if F : SU (n) → C is a
class function then F (u) only depends on the eigenvalues of u. Write the unordered
set of eigenvalues as {eiθ1 , . . . , eiθn }. Then define

F (u) = f (θ1 , . . . , θn ) (11.362)

and then, we have

1 2π
Z Z Y Y dθi
F (u)du = f (θ1 , . . . , θn ) |eiθi − eiθj |2 (11.363)
SU (n) n! 0 2π
1≤i<j≤n i

and there are similar formulae for SO(n) and U Sp(2n). In SO(n) we can conjugate
any matrix to the form
Diag{R(θ1 ), . . . , R(θr )} (11.364)
when n = 2r is even and

Diag{R(θ1 ), . . . , R(θr ), 1} (11.365)

when n = 2r + 1 is odd. We then have


2 Z π
2(n−1)
Z Y Y
f [dg] = n (cos θj − cos θk )2 f (θ1 , . . . , θn ) dθi (11.366)
SO(2n) π n! −π
1≤j<k≤n

2 π n
2n
Z Z Y Y Y
2
f [dg] = n (cos θj −cos θk ) sin2 (θj /2)f (θ1 , . . . , θn ) dθi
SO(2n+1) π n! −π 1≤j<k≤n j=1
(11.367)
Where we have normalized the Haar measure to have volume one. Finally, for the
unitary symplectic group U Sp(2r), any matrix can be conjugated to

Diag{eiθ1 , . . . , eiθr , e−iθ1 , . . . , e−iθr } (11.368)

in terms of which the formula for integrating class functions is:


2 π n
2n
Z Z Y Y Y
f [dg] = n (cos θj − cos θk )2 sin2 (θj )f (θ1 , . . . , θn ) dθi
U Sp(2n) π n! −π 1≤j<k≤n j=1
(11.369)
These kinds of formulae appear very frequently in the subject of random matrix
theory. ♣Need to give a
reference here. ♣

– 212 –
3. Weyl-Kac Character Formula The character formula for SU (2) can be written as
w(λ+ρ)
P
−ρ w∈W (w)e
χj = e Q −α )
(11.370)
α>0 (1 − e

where eλ , eα and eρ are functions on the maximal torus such that

eλ (d(z)) = z 2j (11.371)

eα (d(z)) = z 2 eρ = eα/2 (11.372)

and the sum on w is a sum over the Weyl group, with (w) ∈ {±1} a homomorphism
to µ2 .

11.14.3 Unitarization
As we have seen, the finite-dimensional representations of SU (2) are unitarizable, and
unitarity is very important in physics and mathematics.
One might first be tempted to define a unitary structure on H2j by declaring the inner
product of two homogeneous polynomials to be the integral of ψ1∗ ψ2 . But this will not
converge. Instead we take
Z
1 2 2
hψ1 , ψ2 iH2j := ψ1 (u, v)∗ ψ2 (u, v)e−|u| −|v| d2 ud2 v (11.373)
π(2j + 1)! C2

This will give finite overlaps on H2j . Note that since du ∧ dv and |u|2 + |v|2 are SU (2)
invariant the inner product is SU (2)-invariant, i.e. it is unitary:

hg · ψ1 , g · ψ2 iH2j = hψ1 , ψ2 iH2j (11.374)

The reason for the funny normalization constant will be clearer below.

Exercise ON Basis
Show that an ON basis is given by
s
1 (2j + 1)!
fj,m = √ uj+m v j−m (11.375)
π (j + m)!(j − m)!

11.14.4 Inhomogeneous Polynomials And Mobius Transformations On CP1


We saw in section **** above that there is a close relation between SU (2)/U (1) and
SL(2, C)/B and CP1 . Indeed, SL(2, C) acts naturally on CP1 :
!
αs
· [z1 : z2 ] → [αz1 + sz2 : βz1 + tz2 ] (11.376)
β t

– 213 –
On the other hand we can identify CP1 with S 2 and then we can stereographically
project CP1 to C ∪ {∞}. See the exercise below for the precise formulae. The projection
from CP1 to the Riemann sphere is just:
(
z := z1 /z2 z2 6= 0
[z1 : z2 ] 7→ (11.377)
∞ z2 = 0

It follows that SL(2, C) acts on the Riemann sphere by Mobius transformations:


!
αs αz + s
·z → (11.378)
β t βz + t

and restricted to SU (2) this becomes:


!
α −β ∗ αz − β ∗
·z → (11.379)
β α∗ βz + α∗

Since the SU (2) action on S 2 factors through the usual SO(3) rotations it is not
surprising that the round volume form is invariant under SU (2). We can verify this with
the volume form
1 idz ∧ dz̄
ω := (11.380)
2π (1 + |z|2 )2
directly since if z̃ = g · z then one easily checks:
dz
dz̃ = (11.381)
(βz + α∗ )2
1 |βz + α∗ |2
= (11.382)
(1 + |z̃|2 )2 (1 + |z|2 )2
and therefore the measure ω is invariant. Indeed, as you show in an exercise below, it is
just the usual round measure on S 2 evaluated in stereographic coordinates.
Now let us return to the space H2j of homogeneous polynomials in z1 , z2 . This vector
space is canonically isomorphic to the vector space of polynomials in z = z1 /z2 of degree
≤ 2j since we can identify a homogenous polynomial F with a polynomial p by

p(z) := z2−2j F (z1 , z2 ) (11.383)

The map F → p defines a map H2j → P2j and we can make this an SU (2)-equivariant
map if we declare the action of SU (2) on polynomials p to be 117
 ∗
α z + β∗

2j
(g · p)(z) := (−βz + α) p (11.384)
−βz + α
The model for the spin j representation Vj as the space of polynomials of degree ≤ 2j
can be turned into a Hilbert space by defining

h(ψ1 , ψ2 ) := (1 + |z|2 )−2j ψ1∗ (z)ψ2 (z) (11.385)


117
One needs to be careful to remember that the left g-action on functions takes the argument z of the
function to g −1 z. Note it is g −1 and not g.

– 214 –
One checks that this is invariant under the SU (2) action so we integrate:
Z
hψ1 , ψ2 iP2j = h(ψ1 , ψ2 )ω (11.386)
C
and then
hg · ψ1 , g · ψ2 i = hψ1 , ψ2 i (11.387)
so g acts as a unitary operator on P2j in this inner product.
Using the integrals
Z ∞
xn1 Γ(n1 + 1)Γ(n2 − n1 − 1)
n
dx = − 1 < n1 & 1 < n2 − n1 (11.388)
0 (1 + x) 2 Γ(n2 )
we check that an ON basis for P2j is given by:
ψ` := N` z ` ` = 0, . . . , 2j (11.389)
s  
2j
N` := (2j + 1) (11.390)
`
In physics this is usually written as:
s
(2j + 1)!
ψj,m = z j+m = Nj,m z j+m (11.391)
(j + m)!(j − m)!
Now the Wigner functions are defined by the matrix elements relative to this ON basis:
X
j
g · ψj,mR = Dm L ,mR
(g)ψj,mL (11.392)
mL

In physics notation ψj,m would usually be written as the ket vector |j, mi, the spin j
representation might be written as operators T j (g) and we would have:
j 0 j
Dm 0 m (g) := hj, m |T (g)|j, mi (11.393)

Remarks:
1. There is an isometry of Hilbert spaces between H2j with inner product h·, ·iH2j defined
in equation (11.373) above and h·, ·iP2j . Indeed, in the integral (11.373) we can change
coordinates to z = u/v and v. Then du ∧ dv = vdz ∧ dv and one can do the v integral
explicitly to show the isometry.
j j
2. To relate the Wigner function Dm 0 m to our functions D̃m0 m defined above we compute

j
Dm 0 ,m (g) := hψj,m0 , g · ψj,m i

d(r2 )
Z
0 dφ
= Nj,m0 Nj,m (z j+m )∗ (−βz + α)j−m (α∗ z + β ∗ )j+m
C 2π (1 + r2 )2j+2
Z ∞
d(r2 )
  
X j + m j − m s j−m−t j+m−s t 0
= Nj,m Nj,m
0 ᾱ α β̄ (−β) r2j+2m
s t 0 (1 + r2 )2j+2
s+t=j+m0
s
(j + m0 )!(j − m0 )! j
= D̃ 0 (g)
(j + m)!(j − m)! m ,m
(11.394)

– 215 –
3. As we noted before, if we induce from the trivial representation then we get functions
on G/H which in this case is G/H = S 2 . The trivial representation of U (1) is mR = 0.
In this case j = ` must be an integer. The functions
`
Dm L ,0
(g) = e−imL φ Pm
`
L
(θ) = Y`,m (θ, φ) (11.395)
are known as spherical harmonics. From what we have said they form a complete
orthonormal set of functions on S 2 and are widely used in electromagnetism and
quantum mechanics.

4. If we specialize further to mL = 0 we get the famous Legendre polynomials


`  2
−`
X `
P` (cos θ) = 2 (−1)`−s (1 + cos θ)s (1 − cos θ)`−s (11.396)
s
s=0
whereas the P`,m (θ) are known as associated Legendre functions. Now some addition
theorems become transparent from group theory. For example, if we consider g1 =
1 1
eiθ1 σ /2 and g2 = eiθ2 σ /2 then
X
`
D00 (g1 g2−1 ) = `
D0m `
(g1 )Dm0 (g2−1 ) (11.397)
m
becomes one of the identities for Legendre polynomials known as the “addition the-
orem.” See, for example, J.D. Jackson, Classical Electrodynamics, section 3.6. 118

5. The Wigner functions have many applications in physics. As we will see, from group
theory, they satisfy some nice differential equations and hence appear in the wave-
functions of atoms. They also appear in the study of the quantum Hall effect on S 2
and in the study of wavefunctions of electrons in the field of a magnetic monopole of
magnetic charge 2j.

Exercise Relations Between Coordinates On S 2 And Stereographic Projection


a.) Let n̂ = (x̂1 , x̂2 , x̂3 ) = (sin θ cos φ, sin θ, sin φ, cos θ) parametrize the unit sphere S 2 .
Verify the following relations: Now we have the relation between stereographic and angular
coordinates where x̂3 = 1 corresponds to z = ∞ and x̂3 = −1 corresponds to z = 0:
z 1 1
2
= (x̂1 + ix̂2 ) = eiφ sin θ
1 + |z| 2 2
|z| 2 1 1 + cos θ
2
= (1 + x̂3 ) =
1 + |z| 2 2
1 1 1 − cos θ
2
= (1 − x̂3 ) = (11.398)
1 + |z| 2 2
|z|2 − 1
= x̂3
|z|2 + 1
x1 + ix2 θ
z= = eiφ cot( )
1 − x3 2
1
118 /2 iφ1 σ 3 /2
We have specialized Jackson’s equation 3.62. For the more general statement let g1 = eiθ1 σ e
and similarly for g2 .

– 216 –
b.) Also show that the standard round metric of the unit sphere can be written as

1 |dz|2
ds2 = (11.399)
π (1 + |z|2 )2

and the volume form is:


1 idz ∧ dz̄
ω :=
2π (1 + |z|2 )2
1 dφ
= d(|z|2 ) (11.400)
(1 + |z|2 )2 2π
1
= sin θdφ ∧ dθ

is the spherically symmetric unit volume measure on the sphere.
c.) Show that
$$
\frac{1}{2}\left(1 + \hat x\cdot\vec\sigma\right) = \frac{1}{1+|z|^2}\begin{pmatrix} |z|^2 & \bar z \\ z & 1 \end{pmatrix} \qquad (11.401)
$$
is a $2\times 2$ projection matrix that depends smoothly on $\hat x \in S^2$. It plays many important
roles and is known as the Bott projector.
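
A quick numerical spot-check of this exercise can be helpful; the following sketch (added here for illustration, not part of the original notes) verifies the relations (11.398) and the projector property of (11.401) at random points of the sphere.

```python
# Numerical spot-check of (11.398) and of P^2 = P for the Bott projector (11.401).
import numpy as np

rng = np.random.default_rng(0)
sx = np.array([[0, 1], [1, 0]], complex)
sy = np.array([[0, -1j], [1j, 0]], complex)
sz = np.array([[1, 0], [0, -1]], complex)

for _ in range(5):
    theta = rng.uniform(0.1, np.pi - 0.1)
    phi = rng.uniform(0.0, 2*np.pi)
    n = np.array([np.sin(theta)*np.cos(phi), np.sin(theta)*np.sin(phi), np.cos(theta)])
    z = np.exp(1j*phi) / np.tan(theta/2)          # z = e^{i phi} cot(theta/2)
    r2 = abs(z)**2

    assert np.isclose(z/(1 + r2), 0.5*(n[0] + 1j*n[1]))
    assert np.isclose(r2/(1 + r2), 0.5*(1 + n[2]))
    assert np.isclose(1/(1 + r2), 0.5*(1 - n[2]))
    assert np.isclose((r2 - 1)/(r2 + 1), n[2])
    assert np.isclose(z, (n[0] + 1j*n[1])/(1 - n[2]))

    P = 0.5*(np.eye(2) + n[0]*sx + n[1]*sy + n[2]*sz)
    Q = np.array([[r2, np.conj(z)], [z, 1.0]], complex) / (1 + r2)
    assert np.allclose(P, Q) and np.allclose(P @ P, P)
print("Stereographic relations and Bott projector verified numerically.")
```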

Exercise Spherical Harmonics And Polynomials


Show that the (rescaled) spherical harmonics can be written as
$$
\tilde D^{\ell}_{m_L,0} = |\beta|^{2\ell} \sum_{s+t=\ell+m_L} \binom{\ell}{s}\binom{\ell}{t}\, (-1)^{t}\, \bar z^{s}\, z^{\ell-t} \qquad (11.402)
$$
where $z = \alpha/\beta$.

11.14.5 The Geometrical Interpretation Of P2j


We discussed in section 11.13.1 above that equivariant functions can be viewed as sections
of an associated bundle. In this section we unpack that idea a bit for the case of SU (2).
In the case of SU (2)/U (1) we have the line bundle

Lk = (SU (2) × Vk )/U (1) (11.403)

where Vk is the irreducible representation of U (1).


Abstractly, a section is a map s : SU (2)/U (1) → Lk so that π ◦ s = Id. This can be
made much more explicit using local trivializations.
To motivate the description of sections in terms of local trivializations consider the
transition we made above from homogenous functions of the (z1 , z2 ) to polynomials in
z = z1 /z2 .

If we think of CP1 as the space of equivalence classes [z1 : z2 ] then on the “patch” UN
where z2 6= 0 we can define a map

φ N : UN → C φN ([z1 : z2 ]) := zN := z1 /z2 (11.404)

Note that the point [1 : 0] corresponds to a “point at infinity” in this mapping. In


terms of the identification S 2 ∼
= CP1 this corresponds to stereographic projection from
the north pole:
x̂1 + ix̂2
zN = (11.405)
1 − x̂3
In this language we identified a homogeneous polynomial of z1 , z2 of degree 2j with a
polynomial pN (zN ) by
pN (zN ) = z2−2j F (z1 , z2 ) (11.406)
We could clearly have given an analogous discussion based on stereographic projection
from the south pole:
x̂1 − ix̂2
zS = (11.407)
1 + x̂3
Note that
$$
z_S\, z_N = 1 \qquad (11.408)
$$
In terms of $\mathbb{CP}^1$, we can introduce a "patch" $U_S$ where $z_1 \neq 0$ and define a map
$$
\phi_S : U_S \to \mathbb{C} \qquad \phi_S([z_1 : z_2]) := z_S := z_2/z_1 \qquad (11.409)
$$

Note that the point [0 : 1] corresponds to a “point at infinity” in this mapping. Again it is
manifest that zN zS = 1 on the patch overlap US ∩ UN where both functions are defined.
Now, the same homogeneous function F could be used to define a different polynomial

pS (zS ) = z1−2j F (z1 , z2 ) (11.410)

Comparing the equations we see that on patch overlaps we can say:


$$
p_S(z_S) = z_N^{-2j}\, p_N(z_N) \qquad (11.411)
$$
or equally well
$$
p_N(z_N) = z_S^{-2j}\, p_S(z_S) \qquad (11.412)
$$
Equations (11.411) and (11.412) are known as gluing conditions and $z_S^{-2j}$ and $z_N^{-2j}$ are
known as transition functions. We can glue together the total space of the line bundle $L_k$
by taking two trivial line bundles
$$
U_S \times \mathbb{C} \qquad (11.413)
$$
$$
U_N \times \mathbb{C} \qquad (11.414)
$$
and identifying $(z_S, v_S)$ with $(1/z_N,\, z_N^{-2j} v_N)$ on the patch overlaps.
Note that the function $h$ is well defined in this description:
$$
h(p_S, p_S) = h(p_N, p_N) \qquad (11.415)
$$

because, on patch overlaps we have:

(1 + |zS |2 )−2j |pS (zS )|2 = (1 + |zN |2 )−2j |pN (zN )|2 (11.416)

as the reader should carefully check. Thus this is a globally defined notion of the length-
square of a section. It is called a Hermitian metric on the line bundle.

Remarks:

1. Recalling that 2j = −k the transition rules for all sections of the line bundle Lk can
be written
ψN (zN , z̄N ) = zSk ψS (zS , z̄S ) (11.417)
where there is now no condition of holomorphy. The space of all sections can be
SU (2)
identified with the space of all functions in IndU (1) (ρk ). This formulation makes
sense for any integer k. When k > 0 there are no holomorphic sections, but there are
plenty of C ∞ sections.

2. The Borel-Weil-Bott Theorem: Here is a rough statement: If $G$ is a compact
group and $T$ a maximal torus then one can give $G/T$ a complex structure so that
$G/T \cong G_{\mathbb{C}}/B$, where $G_{\mathbb{C}}$ is the complexification of $G$ and $B$ is a Borel subgroup. Irreps of $T$ induce representations of $B$, and the space of holomorphic sections of the corresponding line
bundle over $G_{\mathbb{C}}/B$, i.e. the holomorphic induced representation
$\mathrm{Ind}^{G_{\mathbb{C}}}_{B}(V)$, is an irreducible representation of $G$, and all such arise in this way. A similar description of $G_{\mathbb{C}}/B$ applies
for the other finite dimensional simple Lie groups. ♣Improve this statement. ♣

3. We will see later that SL(2, C) is isomorphic to the Lorentz group in 3+1 dimensions.
The Lorentz group acts on the set of light rays through the origin of Minkowski
space, and we can identify a light ray with its point on the celestial sphere. Under
this identification, the action of the Lorentz group on the set of light rays is just the
Mobius action on the sphere.

4. Representations Of Lorentz, Poincaré, and Affine Euclidean Groups. Also, the con-
struction of the irreducible unitary representations of Lorentz groups, and affine
Euclidean and Poincaré groups proceeds using this method. (That observation goes
back to Wigner and Bargmann.) Briefly, for a representation of the Poincaré group
one induces from a representation of the translation group (or the translation group
semidirect product with a compact rotation group). For a representation of the
Lorentz group one considers the homogeneous spaces from orbits of SO(1, d) in mo-
mentum space. For example, the mass shell p2 = m2 with p0 > 0 can be identified
with SO(1, d)/SO(d). Then one induces from a representation of SO(d) to produce
a unitary representation of SO(1, d). One source that approaches the subject from
the present viewpoint is M. Carmeli, Group Theory And General Relativity.

5. Loop Groups: See the book of Pressley and Segal

6. Diffeomorphism Groups: Nice paper of A. Alekseev and S. Shatashvili

7. Geometric Quantization. EXPLAIN A LITTLE

11.15 The Clebsch-Gordon Decomposition For SU (2)
In physics the spin j representation Vj shows up almost universally. Among other applica-
tions it appears in the quantum mechanical theory of spin. (See the relation to Lie algebras
below for an explanation of this.)
The combination of two systems of spins naturally leads to the question of how to give
an isotypical decomposition of Vj1 ⊗ Vj2 . This is known as the Clebsch-Gordon decompo-
sition. The general formula is:

V (j1 ) ⊗ V (j2 ) ∼
= V (|j1 − j2 |) ⊕ V (|j1 − j2 | + 1) ⊕ · · · · · · ⊕ V (j1 + j2 ) (11.418)

Note that every representation on the RHS has the same parity of (−1)2j .
Let us give a proof of (15.610) using characters. Because a representation is uniquely
determined by its character we can consider the character of V (j1 ) ⊗ V (j2 ). If we can write
this as a linear combination of characters χj with nonnegative integer coefficients we can
uniquely determine the decomposition into irreps.
The easiest thing to do is prove that

V 1 ⊗ Vj ∼
= Vj+1/2 ⊕ Vj−1/2 (11.419)
2

If j = 0 we interpret V−1/2 as the zero vector space. Since a representation is uniquely


determined by its character we can prove this by simply computing

$$
\begin{aligned}
\chi_{1/2}(z)\,\chi_j(z) &= (z+z^{-1})\,\frac{z^{2j+1}-z^{-2j-1}}{z-z^{-1}} \\
&= \frac{z^{2j+2}-z^{-2j-2}}{z-z^{-1}} + \frac{z^{2j}-z^{-2j}}{z-z^{-1}} \\
&= \chi_{j+1/2} + \chi_{j-1/2}
\end{aligned}
\qquad (11.420)
$$
The general result (11.418) now follows by induction.


Alternatively, we can write:

$$
\begin{aligned}
\chi_{j_1}(z)\,\chi_{j_2}(z) &= \frac{z^{2j_1+1}-z^{-2j_1-1}}{z-z^{-1}} \cdot \frac{z^{2j_2+1}-z^{-2j_2-1}}{z-z^{-1}} \\
&= \frac{1}{z-z^{-1}}\left( \frac{z^{2j_1+2j_2+2}-z^{2(j_1-j_2)}}{z-z^{-1}} + \frac{z^{-2j_1-2j_2-2}-z^{-2(j_1-j_2)}}{z-z^{-1}} \right)
\end{aligned}
\qquad (11.421)
$$
Now, WLOG assume that $j_1 \ge j_2$. Then we use the identity:
$$
\frac{z^{a+2}-z^{b}}{z-z^{-1}} = z^{b+1}\,\frac{z^{a-b+2}-1}{z^2-1} = z^{b+1} + z^{b+3} + \cdots + z^{a+1} \qquad (11.422)
$$
for each of the two terms in the sum above, then realize that the two terms are related by
$z \to 1/z$ and we directly obtain:
$$
\chi_{j_1}\chi_{j_2} = \chi_{j_1+j_2} + \chi_{j_1+j_2-1} + \cdots + \chi_{|j_1-j_2|} \qquad (11.423)
$$
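
The character identity (11.423) is also easy to verify symbolically. The following sympy sketch (an illustration added here, not part of the original argument) writes each character as a finite Laurent sum and checks the Clebsch-Gordon decomposition for a few low spins.

```python
# Symbolic check of chi_{j1} chi_{j2} = chi_{j1+j2} + ... + chi_{|j1-j2|}  (11.423)
from fractions import Fraction
import sympy as sp

z = sp.symbols('z')

def chi(j):
    """SU(2) character as the finite Laurent sum z^{2j} + z^{2j-2} + ... + z^{-2j}."""
    n = int(2*j)
    return sum(z**(n - 2*k) for k in range(n + 1))

for twoj1 in range(0, 5):
    for twoj2 in range(0, 5):
        j1, j2 = Fraction(twoj1, 2), Fraction(twoj2, 2)
        lhs = sp.expand(chi(j1)*chi(j2))
        js = [Fraction(k, 2) for k in range(int(2*abs(j1 - j2)), int(2*(j1 + j2)) + 1, 2)]
        rhs = sp.expand(sum(chi(j) for j in js))
        assert sp.simplify(lhs - rhs) == 0
print("Clebsch-Gordon character identity verified for 2j1, 2j2 = 0,...,4.")
```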

It is also instructive to give a proof using the orthogonality of characters. One can
check directly by contour integration that
$$
-\frac{1}{4\pi i}\oint \frac{dz}{z}\, \chi_{j_1}(z)\,\chi_{j_2}(z)\,\chi_j(z)\,(z-z^{-1})^2 =
\begin{cases}
+1 & |j_1-j_2| \le j \le j_1+j_2 \ \text{and}\ 2j = 2j_1+2j_2 \ \mathrm{mod}\ 2 \\
0 & \text{else}
\end{cases}
\qquad (11.424)
$$
♣Explain this better ♣

Definition Choose an ON basis $\psi_{j,m}$ of $V_j$. Let $P_j$ be the orthogonal projector onto
the subspace of $V_{j_1} \otimes V_{j_2}$ transforming in the representation $V_j$. Then
$$
\langle \psi_{j,m},\, P_j(\psi_{j_1,m_1} \otimes \psi_{j_2,m_2})\rangle \qquad (11.425)
$$
is known as a Clebsch-Gordon coefficient. In physics they are denoted as: *****************

Example: Let us consider in detail the important case of

V1/2 ⊗ V1/2 ∼
= V0 ⊕ V1 (11.426)

The orthogonal projectors are obtained from


$$
P^j_{m_L,m_R} = \int_{SU(2)} \left(D^j_{m_L,m_R}(g)\right)^* T(g)\, dg \qquad (11.427)
$$
SU (2)

In particular, as noted above in equation (11.180) the projector to the isotypical component
of the trivial representation is always:
Z
P = T (g)dg (11.428)
G

In our example,
$$
T(g)\, |\beta\rangle \otimes |\delta\rangle = g_{\alpha\beta}\, g_{\gamma\delta}\, |\alpha\rangle \otimes |\gamma\rangle \qquad (11.429)
$$
Tradition demands that an ON basis for the fundamental representation be denoted $\{|+\rangle, |-\rangle\}$
with
$$
d(z)\cdot|+\rangle = z\,|+\rangle \qquad d(z)\cdot|-\rangle = z^{-1}\,|-\rangle \qquad (11.430)
$$
We denote the basis vectors |αi with α, .... ∈ {+, −}. If we look at the projector to the
trivial representation then we need to evaluate
Z
gαβ gγδ dg (11.431)
SU (2)

Now recall that $g_{\alpha\beta} = \epsilon_{\alpha\alpha'}\epsilon_{\beta\beta'}(g^*)^{\alpha'\beta'}$ so we can use the orthogonality relations to say
$$
\int_{SU(2)} g_{\alpha\beta}\, g_{\gamma\delta}\, dg = \frac{1}{2}\,\epsilon_{\alpha\gamma}\,\epsilon_{\beta\delta} \qquad (11.432)
$$

confirming the direct computation from the exercise (11.57).

It follows that an ON basis for the spin 0 (“singlet”) isotypical component of V1/2 ⊗V1/2
is
1
√ (|+i ⊗ |−i − |−i ⊗ |+i) (11.433)
2
This state is also called a Bell pair in quantum information theory. It is the simplest
example of a nontrivially entangled state and was used in the famous EPR paper - the first
paper on quantum information theory.
The orthogonal complement must be the spin one states, and a basis that diagonalizes
the action of the diagonal unitary matrices is

|1, 1i = |+i ⊗ |+i


1
|1, 0i = √ (|+i ⊗ |−i + |−i ⊗ |+i) (11.434)
2
|1, −1i = |−i ⊗ |−i

******************************
In physics a standard example of a general selection rule is the Wigner-Eckart theorem:
Suppose $O_{j,m}$ is a collection of operators transforming in the spin $j$ representation. Then
$$
\langle \psi_{j,m}|\, O_{j_1,m_1}\, |\psi_{j_2,m_2}\rangle = (\text{Clebsch-Gordon coefficient}) \times (\text{"group-independent" reduced matrix element}) \qquad (11.435)
$$
EXPLAIN.
**********************

Exercise Recovering Dimension


As a nice check of the expression on the far RHS in (11.356) show that the limit u → 1
reproduces the dimension of Vj .

11.16 Lie Groups And Lie Algebras And Lie Algebra Representations
11.16.1 Some Useful Formulae For Working With Exponentials Of Operators
In this section we will discuss some formulae that are very useful for working with exponen-
tials of matrices (and linear operators). In particular we will derive the Baker-Campbell-
Hausdorff formula.
Let us recall that if A is a matrix or an operator then eA is the matrix, or operator,
defined by the exponential series. The following three identities are easily shown by direct
use of the exponential series:

1.
$$
e^{\alpha A}\, e^{\beta A} = e^{(\alpha+\beta)A} \qquad (11.436)
$$

2.
$$
\frac{d}{dt} e^{tA} = A\, e^{tA} = e^{tA}\, A \qquad (11.437)
$$

3.
$$
e^{A}\, e^{B}\, e^{-A} = e^{\,e^{A} B e^{-A}} \qquad (11.438)
$$

Now we prove some identities that are not directly obvious from the exponential series:

Definition: For A ∈ Mn (κ) we denote by Ad(A) the linear transformation Mn (κ) →


Mn (κ) defined by
Ad(A) : B 7→ [A, B] (11.439)

We also denote:
$$
(\mathrm{Ad}(A))^m B = \overbrace{[A,[A,\cdots[A}^{m\ \mathrm{times}},B]\cdots] \qquad (11.440)
$$

where there are m commutators on the RHS.


First we prove
eA Be−A = eAd(A) B (11.441)

in other words:

$$
\begin{aligned}
e^{A} B e^{-A} = e^{\mathrm{Ad}(A)} B &= B + \mathrm{Ad}(A)B + \frac{1}{2!}(\mathrm{Ad}(A))^2 B + \cdots \\
&= B + [A,B] + \frac{1}{2!}[A,[A,B]] + \cdots
\end{aligned}
\qquad (11.442)
$$

To prove this define B(t) := etA Be−tA . So B(0) = B and B(1) = eA Be−A is the quantity
we want. Now it is easy to derive the differential equation:

$$
\frac{d}{dt} B(t) = \mathrm{Ad}(A)\, B(t) \qquad (11.443)
$$
so
$$
B(t) = e^{t\,\mathrm{Ad}(A)}\, B(0) \qquad (11.444)
$$
Now set t = 1.
Combining with (11.438) we now have the somewhat less trivial identity:
$$
e^{A}\, e^{B}\, e^{-A} = e^{\,e^{\mathrm{Ad}(A)} B} \qquad (11.445)
$$
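
These operator identities are easy to test numerically. The following short numpy/scipy sketch (illustrative only) checks (11.442) for random matrices by truncating the nested-commutator series.

```python
# Numerical check of e^A B e^{-A} = sum_m (Ad A)^m B / m!   (11.442)
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = 0.3 * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

lhs = expm(A) @ B @ expm(-A)

rhs, term = np.zeros_like(B), B.copy()
for m in range(1, 30):
    rhs += term                           # add (Ad A)^{m-1} B / (m-1)!
    term = (A @ term - term @ A) / m      # next term (Ad A)^m B / m!
rhs += term

print(np.allclose(lhs, rhs))              # True to numerical accuracy
```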

All these identities follow from a much more nontrivial formula, known as the Baker-
Campbell-Hausdorff formula that expresses the operator C defined by

eA eB = eC (11.446)

as a power series in A, B. We have

C = A + B + s(A, B) (11.447)

where s(A, B) is an infinite series and every term involves nested commutators.

We will give a complete statement and proof of the BCH formula below. In order to
do that we first state the extremely useful

Lemma : Let

$$
f(z) = \frac{e^z-1}{z} = 1 + \frac{z}{2!} + \frac{z^2}{3!} + \cdots \qquad (11.448)
$$
Then
$$
\left(\frac{d}{dt} e^{A(t)}\right) e^{-A(t)} = -e^{A(t)}\, \frac{d}{dt} e^{-A(t)} = f(\mathrm{Ad}(A(t)))\cdot \dot A(t) \qquad (11.449)
$$
where $A(t)$ is any differentiable matrix function of $t$.
Note that this is nontrivial because $\dot A(t)$ does not commute with $A(t)$ in general!
Indeed, from the exponential series you can easily show that
$$
\begin{aligned}
\frac{d}{dt} e^{A(t)} - \dot A(t)\, e^{A(t)} &= \frac{1}{2}[A(t), \dot A(t)] + \cdots \\
\frac{d}{dt} e^{A(t)} - e^{A(t)}\, \dot A(t) &= -\frac{1}{2}[A(t), \dot A(t)] + \cdots
\end{aligned}
\qquad (11.450)
$$

Proof: Introduce a matrix function of two variables and take derivatives with respect to $s$:
$$
\begin{aligned}
B(s,t) &:= e^{sA(t)}\, \frac{d}{dt} e^{-sA(t)} \\
\frac{\partial B}{\partial s} &= A(t)\, e^{sA(t)}\, \frac{d}{dt} e^{-sA(t)} - e^{sA(t)}\, \frac{d}{dt}\!\left(e^{-sA(t)} A(t)\right) \\
&= \mathrm{Ad}(A(t))\, B(s,t) - \dot A(t) \\
\frac{\partial^j B}{\partial s^j} &= (\mathrm{Ad}(A(t)))^j\, B(s,t) - (\mathrm{Ad}\,A(t))^{j-1}\, \dot A(t)
\end{aligned}
\qquad (11.451)
$$
$B(0,t) = 0$, therefore again by Taylor:
$$
\frac{\partial^j}{\partial s^j} B(s,t)\Big|_{s=0} = -(\mathrm{Ad}(A(t)))^{j-1}\, \dot A(t) \qquad j \ge 1 \qquad (11.452)
$$
So
$$
e^{sA(t)}\, \frac{d}{dt}\!\left(e^{-sA(t)}\right) = -\sum_{j=1}^{\infty} \frac{s^j\, (\mathrm{Ad}(A(t)))^{j-1}}{j!}\, \dot A(t) \qquad (11.453)
$$

Now set s=1. ♠


Note: you can rewrite this lemma as the statement:
$$
\frac{d}{dt} e^{A(t)} = \int_0^1 e^{sA(t)}\, \dot A(t)\, e^{(1-s)A(t)}\, ds \qquad (11.454)
$$

because:

$$
\begin{aligned}
\frac{d}{dt} e^{A(t)} &= \int_0^1 e^{sA(t)}\, \dot A(t)\, e^{(1-s)A(t)}\, ds \\
&= \int_0^1 e^{s\,\mathrm{Ad}(A(t))}\, ds\; \dot A(t)\, e^{A(t)} \\
&= \left[\frac{e^{\mathrm{Ad}(A(t))}-1}{\mathrm{Ad}(A(t))}\right] \dot A(t)\; e^{A(t)}
\end{aligned}
\qquad (11.455)
$$

Remark: Equation (11.454) is an intuitively appealing formula. For a finite product we have:
$$
\begin{aligned}
\frac{d}{dt}\Big( M_1(t)M_2(t)M_3(t)\cdots M_n(t)\Big) &= \Big(\tfrac{d}{dt}M_1(t)\Big)M_2(t)M_3(t)\cdots \\
&\quad + M_1(t)\Big(\tfrac{d}{dt}M_2(t)\Big)M_3(t)\cdots \\
&\quad + M_1(t)M_2(t)\Big(\tfrac{d}{dt}M_3(t)\Big)\cdots \\
&\quad + \cdots + M_1(t)M_2(t)\cdots\Big(\tfrac{d}{dt}M_n(t)\Big)
\end{aligned}
\qquad (11.456)
$$
Now write
$$
e^{A(t)} = \prod_{i=1}^{N}\left[ e^{A(t)\Delta s}\right] \qquad (11.457)
$$
where $\Delta s = 1/N$. By equation (11.450), we can replace $\frac{d}{dt}e^{\Delta s A(t)}$ by $e^{\Delta s A(t)}\,\Delta s\, \dot A(t)$ up to
order $(\Delta s)^2$. Then write the general term in the sum (11.456) as
$$
\left(\prod_{i<s} e^{\Delta s A(t)}\right) e^{\Delta s A(t)}\, \dot A(t) \left(\prod_{s<i} e^{\Delta s A(t)}\right) \Delta s \qquad (11.458)
$$

up to terms of order (∆s)2 . Next we sum over these terms and take N → ∞ to get (11.454).

Now we are finally ready to state the main theorem:

Theorem: (Baker-Campbell-Hausdorff formula)


Let:
$$
g(w) = \frac{\log w}{w-1} = \sum_{j=0}^{\infty} \frac{(1-w)^j}{j+1} = 1 + \frac{1-w}{2} + \frac{(1-w)^2}{3} + \cdots \qquad (11.459)
$$
be a power series in $w$ about 1. Then when $A, B$ are $n \times n$ matrices with $\|A\|$, $\|B\|$
sufficiently small, the matrix $C$ given by the expansion:
$$
C = B + \int_0^1 g\!\left(e^{t\,\mathrm{Ad}A}\, e^{\mathrm{Ad}B}\right)(A)\, dt \qquad (11.460)
$$

satisfies C = log(eA eB ).

Proof :
Introduce the matrix-valued function C(t) via:

$$
e^{C(t)} = e^{tA}\, e^{B} \qquad (11.461)
$$
and note that $C(0) = B$, and $C(1)$ is the matrix we want. We derive a differential equation
for $C(t)$. By our lemma we have:
$$
e^{C(t)}\, \frac{d}{dt} e^{-C(t)} = -f(\mathrm{Ad}\,C(t))\, \dot C(t) \qquad (11.462)
$$
with
$$
f(z) = \frac{e^z - 1}{z} \qquad (11.463)
$$
On the other hand, plugging in the definition (11.461) we compute directly the simple
result
$$
e^{C(t)}\, \frac{d}{dt} e^{-C(t)} = e^{tA}\, \frac{d}{dt} e^{-tA} = -A \qquad (11.464)
$$
Therefore we get a differential equation:
$$
f(\mathrm{Ad}\,C(t))\, \dot C(t) = A \qquad (11.465)
$$
Now, $f$ is a power series in $z$ with $f(0) = 1$, so it is invertible as a power series and it immediately follows that
$$
\dot C(t) = f(\mathrm{Ad}(C(t)))^{-1} A \qquad (11.466)
$$
Let us make this more explicit: Using the power series $g(w)$ above with $w = e^z$ note
that
$$
f(z)\, g(e^z) = \frac{e^z-1}{z}\cdot\frac{z}{e^z-1} = 1 \qquad (11.467)
$$
regarded as an identity of power series in $z$. Now we can substitute for $z$ any operator $O$,
and use
$$
g(e^{O}) = f(O)^{-1}, \qquad (11.468)
$$
and therefore we can solve for $\dot C$:
$$
\dot C(t) = f(\mathrm{Ad}(C(t)))^{-1}\cdot A = g\!\left(\exp(\mathrm{Ad}(C(t)))\right)\cdot A \qquad (11.469)
$$
where we applied (11.468) with $O = \mathrm{Ad}(C(t))$. This hardly seems useful, since we still
don't know $C(t)$, but now since we have power series we can say
$$
e^{O} = e^{\mathrm{Ad}(C(t))} = e^{\mathrm{Ad}(tA)}\, e^{\mathrm{Ad}(B)} \qquad (11.470)
$$

To prove (11.470) note that for all H we have:

$$
\begin{aligned}
e^{\mathrm{Ad}\,C(t)} H &= e^{C(t)}\, H\, e^{-C(t)} \\
&= e^{tA} e^{B}\, H\, e^{-B} e^{-tA} \\
&= e^{\mathrm{Ad}(tA)}\, e^{\mathrm{Ad}(B)}\, H \\
\Rightarrow\quad e^{\mathrm{Ad}(C(t))} &= e^{\mathrm{Ad}(tA)}\, e^{\mathrm{Ad}(B)}
\end{aligned}
\qquad (11.471)
$$
Therefore:
$$
\dot C(t) = g\!\left(e^{\mathrm{Ad}(tA)}\, e^{\mathrm{Ad}(B)}\right)\cdot A \qquad (11.472)
$$
Now we integrate equation (11.472)
$$
C(t) = C(0) + \int_0^t g\!\left(e^{\mathrm{Ad}(sA)}\, e^{\mathrm{Ad}(B)}\right) A\, ds \qquad (11.473)
$$
but $C(0) = B$, so
$$
C = C(1) = \log(e^A e^B) = B + \int_0^1 g\!\left(e^{\mathrm{Ad}(sA)}\, e^{\mathrm{Ad}(B)}\right) A\, ds \qquad (11.474)
$$
which is what we wanted to show. ♠.

Remarks:

1. To evaluate g(eO ) for an operator O we expand eO around 0 so eO = 1 + O + · · ·


and then we expand around 1 to get an expansion of g(eO ) around O = 0. A similar
remark applies to g(eO1 eO2 ).

2. Explicitly the first few terms are: ^{119}
$$
C = A + B + \frac{1}{2}[A,B] + \frac{1}{12}[A,[A,B]] + \frac{1}{12}[B,[B,A]] - \frac{1}{24}[A,[B,[A,B]]] + \cdots \qquad (11.475)
$$
where the next terms are of order $\epsilon^5$ if we scale $A, B$ by $\epsilon$.^{120}

3. For suitable operators A, B on Hilbert space the BCH formula continues to hold. But
the series has a finite radius of convergence: See the exercises below.
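
The low-order terms of (11.475) are easy to test numerically. The sketch below (illustrative, not part of the original notes) compares the exact matrix logarithm of $e^A e^B$ with the truncated BCH series for small random matrices; the residual is of fifth order in the overall scale.

```python
# Numerical check of the BCH expansion (11.475) through fourth order.
import numpy as np
from scipy.linalg import expm, logm

def comm(X, Y):
    return X @ Y - Y @ X

rng = np.random.default_rng(2)
A = 1e-2 * rng.normal(size=(3, 3))
B = 1e-2 * rng.normal(size=(3, 3))

C_exact = logm(expm(A) @ expm(B))
C_bch = (A + B + 0.5*comm(A, B)
         + comm(A, comm(A, B))/12 + comm(B, comm(B, A))/12
         - comm(A, comm(B, comm(A, B)))/24)

print(np.max(np.abs(C_exact - C_bch)))   # of order (1e-2)^5, i.e. roughly 1e-10
```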

Exercise
Work out the BCH series to order 5 in A, B.

119
It is useful to note that [A, [B, [A, B]]] = −[B, [A, [B, A]]] = B 2 A2 − A2 B 2
120
One can find an algorithm for generating the higher order terms in Varadarajan’s book on group theory.

Exercise
Show that we can also write:

$$
C = \log(e^A e^B) = A + \int_0^1 g\!\left(e^{-\mathrm{Ad}(sB)}\, e^{-\mathrm{Ad}(A)}\right) B\, ds \qquad (11.476)
$$

Exercise All Orders In B, First Order in A


Write $A = \epsilon$ and consider it to be small. Show that the formula for $C$ given by BCH
to all orders in $B$ and first order in $\epsilon$ is
$$
\begin{aligned}
C &= B + \frac{\mathrm{Ad}B}{e^{\mathrm{Ad}B}-1}(\epsilon) \\
&= B + \epsilon - \frac{1}{2}[B,\epsilon] + \frac{1}{12}[B,[B,\epsilon]] - \frac{1}{720}[B,[B,[B,[B,\epsilon]]]] + \cdots
\end{aligned}
\qquad (11.477)
$$

Note:

$$
\frac{x}{e^x-1} = 1 - \frac{1}{2}x + \frac{1}{12}x^2 - \frac{x^4}{720} + \frac{x^6}{30240} - \frac{x^8}{1209600} + \frac{x^{10}}{47900160}
- \frac{691}{1307674368000}x^{12} + \cdots = \sum_{n=0}^{\infty} \frac{B_n x^n}{n!} \qquad (11.478)
$$
is an important expansion in classical function theory - the numbers $B_n$ are known as the
Bernoulli numbers.
There are many applications of this formula. One in particle physics is to spontaneous
symmetry breaking where the formula above gives the chiral transformation law of the pion
field. Here $B = \pi(x)$ is the pion field and $\epsilon$ is the chiral transformation parameter.

Exercise Eigenvalues of Ad(A) For Diagonalizable A


a.) Show that if A ∈ Mn (κ) is diagonalizable A ∼ Diag{λ1 , . . . , λn } then the eigenval-
ues of Ad(A) acting on Mn (κ) are λi − λj .
b.) Using (a) and the previous exercise conclude that the BCH formula has finite
radius of convergence.

11.16.2 Lie Algebras
We recall two basic definitions - the definition of an algebra and of a Lie algebra. See
Chapter 2, Linear Algebra User’s Manual for more discussion and examples. The relation
between Lie groups and Lie algebras is covered in much more detail in Chapter 8. But here
is a preview.

Definition An algebra over a field κ is a vector space A over κ with a notion of multipli-
cation of two vectors
A×A→A (11.479)
denoted:
a1 , a2 ∈ A → a1 a2 ∈ A (11.480)
which has a ring structure compatible with the scalar multiplication by the field. Con-
cretely, this means we have axioms:
i.) (a1 + a2 ) a3 = a1 a3 + a2 a3
ii.) a1 (a2 + a3 ) = a1 a2 + a1 a3
iii.) α(a1 a2 ) = (αa1 ) a2 = a1 (αa2 ), ∀α ∈ κ.

The product might, or might not, be associative. When it is associative, it is called


an associative algebra.

Example: A basic example of an algebra is the vector space of n × n matrices over a field
κ. The vector addition is simply addition of matrices. The algebra product is matrix
multiplication.

Definition A Lie algebra over a field κ is an algebra A over κ where the multiplication of
vectors a1 , a2 ∈ A, satisfies in addition the two conditions:

1. ∀a1 , a2 ∈ A:
a2 a1 = −a1 a2 (11.481)

2. ∀a1 , a2 , a3 ∈ A:

((a1 a2 ) a3 ) + ((a3 a1 ) a2 ) + ((a2 a3 ) a1 ) = 0 (11.482)

This is known as the Jacobi relation.

Now, tradition demands that the product on a Lie algebra be denoted not as a1 a2
but rather as [a1 , a2 ] where it is usually referred to as the bracket. So then the two defining
conditions (11.481) and (11.482) are written as:

1. ∀a1 , a2 ∈ A:
[a2 , a1 ] = −[a1 , a2 ] (11.483)

2. ∀a1 , a2 , a3 ∈ A:
[[a1 , a2 ], a3 ] + [[a3 , a1 ], a2 ] + [[a2 , a3 ], a1 ] = 0 (11.484)

Remark: Note that if we consider the Lie algebra product as the algebra product on a
vector space, then the algebra is non-associative. If we have a Lie algebra product and use
it to define an algebra product: a1 a2 := [a1 , a2 ] then

(a1 a2 ) a3 − a1 (a2 a3 ) = [[a1 , a2 ], a3 ] − [a1 , [a2 , a3 ]]


(11.485)
= −[[a3 , a1 ], a2 ]]

by the Jacobi identity.

Example 1: Whenever we have an associative algebra A, we can automatically turn it


into a Lie algebra by defining the Lie product as a commutator:

[a1 , a2 ] := a1 a2 − a2 a1 (11.486)

However, it should be stressed that not all Lie algebras arise in this way. A good example
would be first order smooth differential operators on, say, Rn , (or more generally, smooth
vector fields on a manifold). If

$$
V = v^\mu(x)\, \frac{\partial}{\partial x^\mu} \qquad (11.487)
$$
then the product of the differential operators - which makes sense as a second order operator
- is not a vector field, but we can define
$$
[V_1, V_2] = \left( v_1^\mu\, \frac{\partial}{\partial x^\mu} v_2^\nu - v_2^\mu\, \frac{\partial}{\partial x^\mu} v_1^\nu \right) \frac{\partial}{\partial x^\nu} \qquad (11.488)
$$

Example 2: An important example arises by applying the general remark above to the al-
gebra Mn (κ). In this case the Lie algebra is often denoted gl(n, κ). Now Mn (κ) and gl(n, κ)
are precisely the same as sets, and are precisely the same as vector spaces. The notation
simply puts a different stress on the algebraic structure which the author is considering.

Example 3: If g ⊂ gl(n, κ) is a vector subspace of matrices that is closed under matrix


commutation, then g is a Lie subalgebra of gl(n, κ). For example, there is a vector subspace
of gl(n, κ) consisting of traceless matrices. The property of being traceless is preserved by
commutator. Therefore this subspace is a sub-Lie algebra. It is denoted as sl(n, κ). In
the classification of semi-simple Lie algebras over C these Lie algebras have special names:
An := sl(n + 1, κ).

Example 4: Consider so(n, κ) ⊂ Mn (κ) defined to be the vector subspace of n × n


antisymmetric matrices. The matrix product of anti-symmetric matrices is not anti-
symmetric, but the matrix commutator is. In the classification of semi-simple Lie algebras

over C these Lie algebras have special names, and for good reasons the even and odd cases
are considered as separate families. They are denoted Bn = so(2n + 1) and Dn = so(2n).

Example 5: u(n) ⊂ Mn (C) is defined to be the vector subspace of n × n antihermitian


matrices. The matrix product of anti-hermitian matrices is not anti-hermitian, but the
matrix commutator is. Note carefully that u(n) is a real vector space and is not a complex
vector space! After all, if A† = −A then (iA)† = iA is not anti-hermitian. We can further
define a sub-Lie algebra su(n) ⊂ u(n) of anti-hermitian matrices that are, in a addition,
traceless.

Example 6: Finally $sp(2n,\kappa) \subset M_{2n}(\kappa)$ is the Lie subalgebra of matrices $a$ such that $(Ja)^{tr} = Ja$, where
$$
J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad (11.489)
$$

Example 7: The intersection of two Lie algebras is a Lie algebra and a particularly
important example is usp(2n) = su(2n) ∩ sp(2n, C) ⊂ M2n (C). ♣say why ♣
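
A quick numerical illustration of the closure property underlying Examples 4-6 (the matrix product leaves these subspaces, but the commutator does not) is sketched below; it is an added check, not part of the original text.

```python
# The commutator preserves the defining conditions of so(n), u(n), sp(2n,R).
import numpy as np

rng = np.random.default_rng(3)
comm = lambda X, Y: X @ Y - Y @ X

# so(4): antisymmetric matrices
X = rng.normal(size=(4, 4)); X = X - X.T
Y = rng.normal(size=(4, 4)); Y = Y - Y.T
assert not np.allclose(X @ Y, -(X @ Y).T)        # the product is NOT antisymmetric
assert np.allclose(comm(X, Y), -comm(X, Y).T)    # the commutator IS antisymmetric

# u(3): anti-Hermitian matrices
Z = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3)); Z = Z - Z.conj().T
W = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3)); W = W - W.conj().T
assert np.allclose(comm(Z, W), -comm(Z, W).conj().T)

# sp(4,R): matrices a with Ja symmetric, J the standard symplectic form
J = np.block([[np.zeros((2, 2)), np.eye(2)], [-np.eye(2), np.zeros((2, 2))]])
def random_sp():
    S = rng.normal(size=(4, 4)); S = S + S.T     # symmetric
    return np.linalg.solve(J, S)                 # a = J^{-1} S, so Ja = S is symmetric
a, b = random_sp(), random_sp()
assert np.allclose(J @ comm(a, b), (J @ comm(a, b)).T)
print("so(4), u(3), sp(4,R) are closed under the matrix commutator.")
```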

Let g ⊂ gl(n, κ) be a sub-Lie algebra with κ = Q, R, C. The BCH formula gives one
way to understand a relation between Lie algebras and Lie groups: Provided the BCH
series converges it follows that if A, B ∈ g then

eA eB = eC (11.490)

with C ∈ g. Thus, up to convergence issues, the invertible operators eA with A ∈ g will


close to form a group. The series will converge for A, B in an open neighborhood of the
origin. So, taking the closure (under group multiplication) of the set of matrices exp[A]
with A ∈ g will give the Lie group.
One can show that for compact connected Lie groups the exponential map

exp : g → G (11.491)

is indeed surjective, although it will not be injective. A good example is the case U (1).
Here the Lie algebra is g = iR (and the commutator is zero). Then all of 2πiZ is in the
kernel of the exponential map.
Conversely, from a matrix Lie group one can recover the Lie algebra by considering the
general one-parameter subgroups g(t) with g(0) = 1 and computing g −1 (t) dt d
g(t) at t = 0.
We will elaborate on this idea by first closing a gap in our discussion at the very beginning
of the course and prove that the classical Lie groups are manifolds as follows:

Exercise Structure Constants
If $\mathfrak{g}$ is a Lie algebra over $\kappa$ then we can choose a basis $T^i$ for $\mathfrak{g}$ and necessarily we have
an expansion
$$
[T^i, T^j] = \sum_k f^{ij}_k\, T^k \qquad f^{ij}_k \in \kappa \qquad (11.492)
$$
from which one can construct all commutators. The constants $f^{ij}_k$ are known as structure
constants.
a.) Show that
$$
f^{ij}_k = -f^{ji}_k \qquad (11.493)
$$
$$
f^{ij}_\ell\, f^{\ell k}_m + f^{jk}_\ell\, f^{\ell i}_m + f^{ki}_\ell\, f^{\ell j}_m = 0 \qquad (11.494)
$$
b.) Conversely, show that given tensors $f^{ij}_k \in \kappa$ satisfying equations (11.493) and
(11.494) one can define a Lie algebra over $\kappa$.

Remark: Sometimes Lie algebras are presented by giving a list of structure constants. If
someone tries to sell you a Lie algebra by giving you a list of structure constants don’t buy
it until you have checked equations (11.493) and (11.494).
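
As an illustration of this check (added here; the basis $T^a = -\tfrac{i}{2}\sigma^a$ used below is introduced in (11.520) later in this section), the following sketch verifies (11.493) and (11.494) for the su(2) structure constants $f^{ab}_c = \epsilon_{abc}$ and confirms that they really arise from commutators.

```python
# Verify antisymmetry (11.493) and the Jacobi identity (11.494) for epsilon_{abc}.
import numpy as np
import itertools

eps = np.zeros((3, 3, 3))
for a, b, c in itertools.permutations(range(3)):
    eps[a, b, c] = np.linalg.det(np.eye(3)[[a, b, c]])     # Levi-Civita symbol

assert np.allclose(eps, -np.transpose(eps, (1, 0, 2)))      # (11.493)

jac = (np.einsum('ijl,lkm->ijkm', eps, eps)
       + np.einsum('jkl,lim->ijkm', eps, eps)
       + np.einsum('kil,ljm->ijkm', eps, eps))
assert np.allclose(jac, 0)                                   # (11.494)

# ... and they come from commutators of T^a = -i sigma^a / 2
sig = [np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]], complex),
       np.array([[1, 0], [0, -1]], complex)]
T = [-0.5j*s for s in sig]
for a in range(3):
    for b in range(3):
        assert np.allclose(T[a] @ T[b] - T[b] @ T[a],
                           sum(eps[a, b, c]*T[c] for c in range(3)))
print("su(2) structure constants satisfy (11.493) and (11.494).")
```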

11.16.3 The Classical Matrix Groups Are Lie Groups


This section assumes some knowledge of differential geometry. Some readers might wish
to skip it.
First we recall the pre-image theorem. A map f : M1 → M2 between two manifolds
M1 and M2 is said to be a submersion if at every p ∈ M1 the map of tangent spaces:

df : Tp M1 → Tf (p) M2 (11.495)

is surjective. In this case f (p) is said to be a regular value. Using results from the calculus of
many variables one can show that if f is a submersion at p then there are local coordinates
so that in a neighborhood of p it has the form:

f : (x1 , . . . , xn1 ) 7→ (x1 , . . . , xn2 ) (11.496)

That is, locally, in suitable coordinate systems f is literally the map f : Rn1 → Rn2 keeping
only the first n2 coordinates. Then, every point in the target Rn2 is a regular value of f
and the inverse image of a regular value is

f −1 (c1 , . . . , cn2 ) = {x ∈ Rn1 |x1 = c1 , . . . , xn2 = cn2 } ∼


= Rn1 −n2 (11.497)

is a submanifold of dimension n1 − n2 .
Therefore we have

Theorem: [Preimage Theorem] If f : M1 → M2 and q ∈ M2 is a regular value in the image


of f then the preimage f −1 (q) is a submanifold of M1 of dimension dimM1 − dimM2 . That

is, the preimage f −1 (q) is a submanifold of M1 of codimension dimM2 . The tangent space
to f −1 (q) at any point p is ker(dfp ).

Now we can apply this idea to describe subsets defined by equations. If f : M → R`


and dimM ≥ ` then for ~c ∈ R` the sets

M~c := f −1 (~c) = ∩`i=1 {p ∈ M |f i (p) = ci } (11.498)

are called level sets. If ~c is a regular value then the level set is a submanifold of M of
codimension `. Note that for each i, f i : M → R so df i : Tp M → Tci R ∼ = R, and hence df i
is a linear functional on Tp M . The regularity condition is the condition that these linear
functionals are all linearly independent. We say that the functions f i are independent.
We can apply these ideas to give easy proofs that the classical matrix groups are in
fact Lie groups. First we prove they are all manifolds:

Example 1: $GL(n,\mathbb{R}) \subset \mathbb{R}^{n^2}$ and $GL(n,\mathbb{C}) \subset \mathbb{C}^{n^2} \cong \mathbb{R}^{2n^2}$ are both manifolds. The
coordinates are the matrix elements. The inverse image $\det^{-1}(0)$ is a closed subset of
$M_n(\kappa)$, with $\kappa = \mathbb{R}, \mathbb{C}$, and hence any invertible matrix has an open neighborhood of
invertible matrices. The matrix elements thus serve as global coordinates. Unless we insist
that our coordinate patches are diffeomorphic to $\mathbb{R}^{n^2}$ there is no need to use more than
one coordinate patch. The tangent space at any point $A \in GL(n,\kappa)$ is isomorphic to the
vector space $M_n(\kappa)$ of $n \times n$ matrices over $\kappa$.

Example 2: Now, for SL(n, κ) consider f : Mn (κ) → κ defined by f (A) := detA − 1. We


claim that 0 ∈ κ is a regular value of f . Indeed, if A is invertible then for any

$$
M \in T_A M_n(\kappa) \cong M_n(\kappa) \qquad (11.499)
$$
we have
$$
df_A(M) = \det A\; \mathrm{Tr}(A^{-1}M) \qquad (11.500)
$$
This is usually written as the (very useful) identity ^{121}
$$
\delta \log\det A = \mathrm{Tr}\!\left(A^{-1}\delta A\right) \qquad (11.501)
$$
for $A$ invertible. When $A$ is invertible the kernel of $df_A$ is the linear subspace of $n \times n$
matrices $M$ such that $A^{-1}M$ is traceless, which is linearly equivalent to the linear subspace
of traceless matrices, and therefore has dimension (over $\kappa$) equal to $n^2 - 1$. Therefore the
rank of $df_A$ is 1, and $f = \det$ is a submersion. So the inverse image is a manifold.
Example 3: For $O(n;\kappa)$ we define $f : M_n(\kappa) \to S_n(\kappa)$ where $S_n(\kappa) \cong \kappa^{\frac{1}{2}n(n+1)}$ is the
vector space over $\kappa$ of $n \times n$ symmetric matrices. We take $f$ to be
$$
f(A) = A A^{tr} - 1 \qquad (11.502)
$$


121
For a proof see the Linear Algebra User’s Manual, ch. 3.

Then O(n) = f −1 (0). We aim to show it is a manifold. Note that dfA is a linear operator
Mn (κ) → Sn (κ). It is just
dfA (M ) = M Atr + AM tr (11.503)
Therefore ker(dfA ) is the linear subspace of Mn (κ) of matrices such that M Atr is anti-
symmetric. When A is invertible this subspace is isomorphic to the linear subspace of
anti-symmetric matrices and hence has dimension 21 n(n − 1). It follows that 0 is a regular
value of f and O(n, κ) is a manifold.

Example 4: For $Sp(2n;\kappa)$ we define $f : M_{2n}(\kappa) \to A_{2n}(\kappa)$ where $A_{2n}(\kappa)$ is the set of
$(2n)\times(2n)$ matrices $m$ over $\kappa$ such that $Jm$ is antisymmetric. This is isomorphic to the
vector space over $\kappa$ of dimension $\frac{1}{2}(2n)(2n-1) = n(2n-1)$. Now we take $f$ to be
$$
f(A) = A J A^{tr} J^{tr} - 1 \qquad (11.504)
$$
so that $Sp(2n;\kappa) = f^{-1}(0)$. Again we claim that 0 is a regular value of $f$. Now $df_A$ is the
linear operator $M_{2n}(\kappa) \to A_{2n}(\kappa)$. It is just
$$
df_A(M) = M J A^{tr} J^{tr} + A J M^{tr} J^{tr} \qquad (11.505)
$$
Therefore $\ker(df_A)$ is the linear subspace of $M_{2n}(\kappa)$ of matrices such that $M J A^{tr}$ is symmetric. When $A$ is invertible this subspace is isomorphic to the linear subspace of symmetric
matrices and hence has dimension $\frac{1}{2}(2n)(2n+1)$, which is complementary to the dimension
of the image $\frac{1}{2}(2n)(2n-1)$, and hence $df_A$ is surjective. It follows that 0 is a regular value
of $f$ and $Sp(2n;\kappa)$ is a manifold.

Example 5: Finally, for U (n) consider f : Mn (C) → Hn where Hn is the real vector space
of $n \times n$ Hermitian matrices in $M_n(\mathbb{C})$. This has real dimension $n + 2\cdot\frac{1}{2}n(n-1) = n^2$.
We now take f (A) = AA† − 1. Then

dfA (M ) = M A† + AM † (11.506)

When A is invertible the kernel is the subspace of Mn (C) of matrices such that M A† is
anti-hermitian. This is again a real vector space of real dimension $n + 2\cdot\frac{1}{2}n(n-1) = n^2$.
Since Mn (C) is a real vector space of real dimension 2n2 it follows that dfA is surjective
and hence 0 is a regular value of f . Therefore U (n) is a manifold.

Example 6: It is useful to combine the previous two examples and define the Lie group:

U Sp(2n) := U (2n) ∩ Sp(2n, C) (11.507)

Now we have a map: f : M2n (C) → A2n (C) ⊕ H2n defined by taking the direct sum. Again,
one must check that the real linear map dfA at an invertible matrix in the preimage of 0
has a kernel of the correct dimension. 122
^{122} We have not covered quaternions yet, but a superior viewpoint is to view $USp(2n)$ as the group of
$n \times n$ unitary matrices over the quaternions. In this viewpoint we should define $f : M_n(\mathbb{H}) \to H_n(\mathbb{H})$,
where $H_n(\mathbb{H})$ is the space of $n \times n$ quaternionic Hermitian matrices. The above arguments work in the
same way: $\dim_{\mathbb{R}} M_n(\mathbb{H}) = 4n^2$, while $\dim_{\mathbb{R}} H_n(\mathbb{H}) = n + 4\cdot\frac{1}{2}n(n-1) = 2n^2 - n$. As before, the kernel of
$df_A$, for $A$ invertible, is the space of $n \times n$ quaternionic anti-Hermitian matrices. This has real dimension
$3n + 4\cdot\frac{1}{2}n(n-1) = 2n^2 + n$. (The $3n$ is there because one can have an arbitrary imaginary quaternion on
the diagonal.) Therefore 0 is a regular value, and $USp(2n)$ is a manifold. Many authors denote this group
simply as $Sp(n)$.
All the examples above are submanifolds of GL(n, κ). The group operation of mul-
tiplication is polynomial in the matrix elements and hence certainly a C ∞ function. The
group operation of inversion is a rational function of the matrix elements and is also C ∞ on
GL(n, κ). It follows from the exercise of Section §?? that the group operations of multipli-
cation and inversion are C ∞ maps in all the cases of the matrix subgroups. This concludes
the argument that the above examples are all Lie groups.
In general, the Lie algebra of a Lie group G is defined, as a vector space, to be the
tangent space at the identity:
Lie(G) := T1 G (11.508)

As we will demonstrate, it is in fact a Lie algebra.


Our proof above that the classical matrix groups are manifolds also leads nicely to an
immediate computation of the Lie algebras of these groups. ♣For consistency, use small gothic font for Lie algebras♣

1. $GL(n,\kappa)$:
$$
T_1 GL(n,\kappa) \cong M_n(\kappa) = gl(n,\kappa) \qquad (11.509)
$$

2. $SL(n,\kappa)$:
$$
T_1 SL(n,\kappa) \cong \{M \in M_n(\kappa)\,|\, \mathrm{Tr}(M) = 0\} = sl(n,\kappa) \qquad (11.510)
$$

3. $O(n,\kappa)$:
$$
so(n;\kappa) := T_1 O(n,\kappa) \cong \{M \in M_n(\kappa)\,|\, M^{tr} = -M\} = o(n,\kappa) = so(n,\kappa) \qquad (11.511)
$$

4. $Sp(2n,\kappa)$:
$$
T_1 Sp(2n,\kappa) \cong \{M \in M_{2n}(\kappa)\,|\, (MJ)^{tr} = +MJ\} = sp(2n,\kappa) \qquad (11.512)
$$

5. $U(n)$:
$$
T_1 U(n) \cong \{M \in M_n(\mathbb{C})\,|\, M^\dagger = -M\} = u(n) \qquad (11.513)
$$

6. $SU(n)$:
$$
T_1 SU(n) \cong \{M \in M_n(\mathbb{C})\,|\, M^\dagger = -M \ \&\ \mathrm{Tr}(M) = 0\} = su(n) \qquad (11.514)
$$

7. $USp(2n)$:
$$
T_1 USp(2n) \cong \{M \in M_{2n}(\mathbb{C})\,|\, M^\dagger = -M \ \&\ (MJ)^{tr} = MJ\}
\cong \{M \in M_n(\mathbb{H})\,|\, M^\dagger = -M\} := usp(2n) \qquad (11.515)
$$

We now recognize that the above vector spaces of matrices are in fact the Lie algebras
we introduced earlier. We can get back the groups by exponentiation. It is a good exercise
to check, from the defining relations of T1 G above that the exponentiated matrix indeed
satisfies the defining relations of the group. Thus, for example, one should check that if M
is anti-Hermitian, i.e. if M † = −M then exp(M ) is unitary.
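
The suggested check is easy to carry out numerically; the sketch below (an added illustration) exponentiates random elements of a few of the Lie algebras above and confirms the group-defining relations.

```python
# exp maps elements of the matrix Lie algebras into the corresponding groups.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

# su(3): traceless anti-Hermitian  ->  SU(3)
M = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
M = M - M.conj().T
M = M - (np.trace(M)/3) * np.eye(3)
U = expm(M)
assert np.allclose(U.conj().T @ U, np.eye(3)) and np.isclose(np.linalg.det(U), 1.0)

# so(4): antisymmetric  ->  SO(4)
A = rng.normal(size=(4, 4)); A = A - A.T
R = expm(A)
assert np.allclose(R.T @ R, np.eye(4)) and np.isclose(np.linalg.det(R), 1.0)

# sl(2,R): traceless  ->  det = 1
X = rng.normal(size=(2, 2)); X = X - (np.trace(X)/2) * np.eye(2)
assert np.isclose(np.linalg.det(expm(X)), 1.0)
print("exp maps the Lie algebras into the corresponding matrix groups.")
```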
We can argue more generally that they must be Lie algebras as follows: Given a matrix
V ∈ T1 G ⊂ MN (κ) we can form the family of group elements

$$
g_V(t) = \exp[tV] := \sum_{n=0}^{\infty} \frac{(tV)^n}{n!} \qquad (11.516)
$$
Note that for $t \in \mathbb{R}$ these elements form a subgroup of $G$:
$$
g_V(t_1)\, g_V(t_2) = g_V(t_1 + t_2) \qquad (11.517)
$$

In general, in differential geometry, there is an exponential map from the tangent space
Tp M of a manifold M to a neighborhood of p in M . It is not as explicit as an exponential
series of a matrix, but involves using the vector to define a differential equation. An
important property of the tangent spaces T1 G for the various groups above is that:
If V1 , V2 ∈ T1 G then the matrix commutator [V1 , V2 ] is also in T1 G.
This can be verified by directly checking each case. For example, in the case of so(n, κ),
if V1 , V2 are antisymmetric matrices over κ then neither V1 V2 , nor V2 V1 is antisymmetric,
but [V1 , V2 ] is. The reader should check the other cases in this way. Nevertheless, this
fact also follows from more general principles, and that is important because as we will see
not every Lie group is a classical matrix group. In fact, it is not true that every finite-
dimensional Lie group is a subgroup of GL(N, R) for some N . 123 Given V1 , V2 we can
consider the path through g = 1 at t = 0 given by the group commutator:
$$
\lambda(t) = [g_{V_1}(\sqrt{t}),\, g_{V_2}(\sqrt{t})] \qquad (11.518)
$$
Now, using the BCH formula one can show that for $t_1, t_2$ small we have
$$
g_{V_1}(t_1)\, g_{V_2}(t_2) = \exp\!\left[t_1 V_1 + t_2 V_2 + \frac{1}{2} t_1 t_2 [V_1, V_2] + O(t_1^a t_2^b)\right] \qquad (11.519)
$$
where the higher order terms have a + b > 2, and therefore the tangent vector to the path
through λ(t) is the matrix commutator.
Therefore, just based on group theory and manifold theory, one can deduce that the
tangent space at the identity g = T1 G is indeed a Lie algebra.
One of the main theorems about the relation of Lie groups and Lie algebras is the
following:

Theorem:
123
A counterexample is the metaplectic group, a group which arises as a central extension of the symplectic
group when one tries to implement symplectic transformations on a the quantum mechanics of a system of
free particles.

a.) Every finite dimensional Lie algebra g over κ = R arises from a unique (up to
isomorphism) connected and simply connected Lie group G.
b.) Under this correspondence, Lie group homomorphisms f : G1 → G2 are in 1 − 1
correspondence with Lie algebra homomorphisms µ : T1 G1 → T1 G2 .
For more about this see Chapter 8. The best statement makes use of the language of
categories. What is described here is an equivalence of categories. See Section **** below.

Examples:

1. $su(2)$ has a standard basis
$$
T^a := -\frac{i}{2}\sigma^a \qquad (11.520)
$$
with structure constants
$$
[T^a, T^b] = \epsilon^{abc}\, T^c \qquad (11.521)
$$
Note that any traceless anti-Hermitian matrix can be diagonalized: For any $A \in su(2)$
there is a $u \in SU(2)$ with
$$
u^{-1} A u = i\lambda \sigma^3 \qquad (11.522)
$$
for some $\lambda \in \mathbb{R}$. Note that it follows that the exponentiation of any one-parameter
subgroup is a subgroup of $SU(2)$ isomorphic to $U(1)$. In particular, it is compact.
Finally, as we have seen, every $SU(2)$ group element can be written as
$$
g = \cos\chi + i \sin\chi\, \hat n \cdot \vec\sigma = \exp[i \chi\, \hat n\cdot\vec\sigma] \qquad (11.523)
$$
where $\hat n \in S^2 \subset \mathbb{R}^3$. So the exponential map is surjective.

2. $sl(2,\mathbb{R})$ has a standard basis:
$$
e = \begin{pmatrix} 0 & 1 \\ 0 & 0\end{pmatrix} \qquad h = \begin{pmatrix} -1 & 0 \\ 0 & 1\end{pmatrix} \qquad f = \begin{pmatrix} 0 & 0 \\ -1 & 0\end{pmatrix} \qquad (11.524)
$$
You should check the structure constants, taking careful note of signs:
$$
[h,e] = -2e \qquad [e,f] = h \qquad [h,f] = +2f \qquad (11.525)
$$
Note that $sl(2,\mathbb{R})$ is qualitatively different: The elements $e, f \in sl(2,\mathbb{R})$ have non-trivial Jordan form and cannot be diagonalized (even within $GL(2,\mathbb{C})$). So this Lie
algebra is inequivalent to $su(2)$. It is not hard to show that if $A \in sl(2,\mathbb{R})$ then it is
conjugate (via $SL(2,\mathbb{R})$ conjugation) to one of three distinct forms
$$
S A S^{-1} = x\, e = \begin{pmatrix} 0 & x \\ 0 & 0\end{pmatrix} \qquad x \in \mathbb{R} \qquad (11.526)
$$
$$
S A S^{-1} = x\, h = \begin{pmatrix} -x & 0 \\ 0 & x\end{pmatrix} \qquad x \in \mathbb{R} \qquad (11.527)
$$
$$
S A S^{-1} = x\,(e+f) = \begin{pmatrix} 0 & x \\ -x & 0\end{pmatrix} \qquad x \in \mathbb{R} \qquad (11.528)
$$

and hence there are three distinct maximal Abelian subgroups (up to conjugation):
$$
\exp[xe] = \begin{pmatrix} 1 & x \\ 0 & 1\end{pmatrix} \qquad x\in\mathbb{R} \qquad (11.529)
$$
$$
\exp[xh] = \begin{pmatrix} e^{-x} & 0 \\ 0 & e^{x}\end{pmatrix} \qquad x\in\mathbb{R} \qquad (11.530)
$$
$$
\exp[x(e+f)] = \begin{pmatrix} \cos x & \sin x \\ -\sin x & \cos x\end{pmatrix} \qquad x\in\mathbb{R} \qquad (11.531)
$$

Note that the last subgroup is compact and is just a copy of SO(2, R) and x ∼ x + 2π
parametrize the same group element.
It is easy to show these subgroups are not conjugate by considering the trace - see the
exercise below. It is now easy to see that the exponential map cannot be surjective
onto SL(2, R). Consider the group elements
$$
\begin{pmatrix} -1 & x \\ 0 & -1 \end{pmatrix} \qquad x \neq 0 \qquad (11.532)
$$

If this were of the form exp[A] then it would be in the one-parameter group exp[tA].
But the trace is −2 and the only way one of the above one-parameter groups can
have trace = −2 is to take (11.531) with x = π(2n + 1) with n ∈ Z. Nevertheless,
via the Gram-Schmidt procedure (see chapter 2) one can prove the so-called KAN
decomposition: Every g ∈ SL(2, R) can be uniquely written in the form
$$
g = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta\end{pmatrix} \cdot \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1}\end{pmatrix} \cdot \begin{pmatrix} 1 & x \\ 0 & 1\end{pmatrix} \qquad (11.533)
$$

with λ > 0 and x ∈ R. So the products of the one-parameter groups generate all
group elements.
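
The KAN decomposition (11.533) can be computed in practice with a QR (Gram-Schmidt) factorization. The following sketch is an illustration with a hypothetical helper `kan`; it is not from the original notes, but it reproduces the factorization for a random element of SL(2,R).

```python
# KAN (Iwasawa) decomposition of g in SL(2,R) via QR.
import numpy as np

def kan(g):
    """Return (theta, lam, x) with g = R(theta) diag(lam, 1/lam) [[1, x], [0, 1]], lam > 0."""
    q, r = np.linalg.qr(g)
    s = np.diag(np.sign(np.diag(r)))   # fix signs: r gets positive diagonal, q stays a rotation
    q, r = q @ s, s @ r
    theta = np.arctan2(-q[1, 0], q[0, 0])
    lam = r[0, 0]
    x = r[0, 1] / r[0, 0]
    return theta, lam, x

rng = np.random.default_rng(5)
g = rng.normal(size=(2, 2))
g = g / np.sqrt(abs(np.linalg.det(g)))      # normalize so det g = +-1
if np.linalg.det(g) < 0:
    g[:, 0] *= -1                           # flip a column to get det g = +1

theta, lam, x = kan(g)
K = np.array([[np.cos(theta), np.sin(theta)], [-np.sin(theta), np.cos(theta)]])
A = np.diag([lam, 1/lam])
N = np.array([[1, x], [0, 1]])
print(np.allclose(K @ A @ N, g))            # True
```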

Remark: The group manifold SL(2, R) as anti-deSitter space: Every 2 × 2 real matrix
can plainly be written as:
$$
g = \begin{pmatrix} T_1 + X_1 & X_2 + T_2 \\ X_2 - T_2 & T_1 - X_1 \end{pmatrix} \qquad (11.534)
$$
for real variables $T_1, T_2, X_1, X_2$. Restricting to matrices with $\det g = 1$ gives the hyperboloid
in $\mathbb{R}^{2,2}$:
$$
-T_1^2 - T_2^2 + X_1^2 + X_2^2 = -1 \qquad (11.535)
$$

Thus one picture of SL(2, R) can be given as a hyperboloid in R2,2 . The induced metric
has signature $(-1, +2)$. Indeed, a global set of coordinates is:
$$
\begin{aligned}
T_1 &= \cosh\rho\, \cos t \\
T_2 &= \cosh\rho\, \sin t \\
X_1 &= \sinh\rho\, \cos\phi \\
X_2 &= \sinh\rho\, \sin\phi
\end{aligned}
\qquad (11.536)
$$
These coordinates smoothly cover the manifold once for $0 \le \rho < \infty$, $t \sim t + 2\pi$, $\phi \sim \phi + 2\pi$.
Substituting into
$$
ds^2 = -(dT_1)^2 - (dT_2)^2 + (dX_1)^2 + (dX_2)^2 \qquad (11.537)
$$
the induced metric becomes
$$
ds^2 = -\cosh^2\!\rho\, dt^2 + \sinh^2\!\rho\, d\phi^2 + d\rho^2. \qquad (11.538)
$$

This is one form of the anti-deSitter metric of constant curvature −1. Note that the time
coordinate t is periodic. What is usually meant by anti-deSitter space is the universal cover
of SL(2, R).

Exercise Due Diligence


Show that if matrices a1 , a2 satisfy (Ja)tr = Ja then [a1 , a2 ] has the same property.

Exercise Group Commutators And Lie Algebra Commutators


a.) Use the BCH theorem to show that if
$$
g_1 = e^{t_1 A_1}, \qquad g_2 = e^{t_2 A_2} \qquad (11.539)
$$
the group commutator $g_1 g_2 g_1^{-1} g_2^{-1}$ corresponds to the Lie algebra commutator:
$$
g_1 \cdot g_2 \cdot g_1^{-1} \cdot g_2^{-1} = 1 + t_1 t_2 [A_1, A_2] + O(t_1^2, t_2^2) \qquad (11.540)
$$
b.) We say a Lie algebra is "Abelian" if $[A_1, A_2] = 0$ for all $A_1, A_2 \in \mathfrak{g}$. Show that
such a Lie algebra exponentiates to form an Abelian group.
c.) If $A_i = \frac{d}{dt}\big|_0\, g_i(t)$ then $[A_1, A_2]$ is the Lie algebra element associated to the curve
$$
g_{12}(t) = g_1(\sqrt{t}) \cdot g_2(\sqrt{t}) \cdot g_1^{-1}(\sqrt{t}) \cdot g_2^{-1}(\sqrt{t}) \qquad (11.541)
$$
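
Parts a.) and c.) of this exercise are easy to see numerically; the sketch below (added for illustration) shows that $(g_{12}(t) - 1)/t$ converges to the matrix commutator as $t \to 0$.

```python
# Group commutator vs. Lie algebra commutator, cf. (11.540)-(11.541).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
A1 = rng.normal(size=(3, 3))
A2 = rng.normal(size=(3, 3))
bracket = A1 @ A2 - A2 @ A1

for t in [1e-2, 1e-3, 1e-4]:
    s = np.sqrt(t)
    g12 = expm(s*A1) @ expm(s*A2) @ expm(-s*A1) @ expm(-s*A2)
    err = np.max(np.abs((g12 - np.eye(3))/t - bracket))
    print(t, err)       # the error shrinks as t -> 0
```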

Exercise Inequivalent One-Parameter Subgroups Of SL(2, R)
a.) Show that the three maximal Abelian subgroups (11.529), (11.530), (11.531) are
non-conjugate. 124
b.) Consider the one-parameter subgroup exp[xf ]. Is it conjugate to one of the above
three?

11.16.4 Representations Of Lie Algebras


A representation of a Lie algebra is a linear map

ρ : g → End(V ) (11.542)

for some vector space V (again sometimes called the carrier space) such that

[ρ(x), ρ(y)] = ρ([x, y]) (11.543)

for all x, y ∈ g.
Note that any Lie algebra g has a canonical representation with V = g and

ρ̇(x) : y 7→ Ad(x)(y) = [x, y] (11.544)

This follows because, as one easily checks, the equation [ρ̇(x), ρ̇(y)] = ρ̇([x, y]) is equivalent
to the Jacobi identity. This representation is known as the adjoint representation.
Given a representation ρ̇ of a Lie algebra g we get a representation of a corresponding
(connected) Lie group G so that g = T1 G. Recall a representation of a Lie group is a group
homomorphism
ρ : G → GL(V ) (11.545)

We do this by setting 125

ρ(ex ) := eρ̇(x) (11.546)

For g = gl(n, κ) the group representation associated to the adjoint representation has
carrier space g and the group action is:

ρ(g)(x) = gxg −1 (11.547)

Conversely, given a representation ρ of a Lie group we can define a representation of


the Lie algebra by
d
ρ̇(X) = |0 ρ(etX ) (11.548)
dt

Remarks:
124
Answer : Consider the traces of these matrices. They are = 2, ≥ 2, and ≤ 2, respectively.
125
We are assuming there is a corresponding Lie group. Some Lie algebras can, in fact, not be exponenti-
ated to form Lie groups.

– 240 –
1. Adjoint Representation Of su(2) And Rotations. We have already seen this in
a slightly different form when we defined the natural double-cover homomorphism
π : SU (2) → SO(3). We can identify su(2) ∼ = R3 as a vector space by identifying
3
~x ∈ R with
M~x := i~x · ~σ (11.549)
The adjoint representation is ρadj (u) · M = uM u−1 , so ρadj acts linearly on ~x ∈ R3 .
Note that M~x2 = −~x2 12×2 , so this linear action preserves the norm and therefore
ρadj (u) ∈ O(3). Since

tr (M~x1 M~x2 M~x3 ) = 2i~x1 · (~x2 × ~x3 ) (11.550)

is also preserved ρadj (u) ∈ SO(3).

2. Adjoint Representation Of sl(2, R) And Lorentz Transformations Every element of


sl(2, R) can be written as
!
x y−t
Mxµ = = (y − t)e + xh + (y + t)f (11.551)
y + t −x

Note g ∈ SL(2, R) acts by ρadj (g) · M = gM g −1 . But now

Mx2µ = (−t2 + x2 + y 2 )12×2 (11.552)

So, in a manner similar to the previous example SL(2, R) double covers the connected
component of the 2 + 1 dimensional Lorentz group SO0 (1, 2).

3. sl(2, C) And Lorentz Transformations For completeness we note that 2×2 Hermitian
matrices can be identified with 3 + 1 dimensional Minkowski space M1,3 via

Mxµ = x0 12×2 + ~x · ~σ (11.553)

Note g ∈ SL(2, C) acts linearly via ρ(g) · M = gM g † since this preserves Hermiticity.
However
detMxµ = (x0 )2 − ~x2 (11.554)
and the determinant is preserved by this actions so g 7→ ρ(g) describes SL(2, C) as a
double cover of the connected component SO0 (1, 3) of the 3 + 1 dimensional Lorentz
group.

4. su(2)⊕su(2) And Rotations In R4 Finally, a very similar construction gives the double
cover π : SU (2) × SU (2) → SO(4). But this is best discussed in the context of the
quaternions. See the section on quaternions in Chapter 2.

Exercise Another Proof That T1 GL(n, κ) = gl(n, κ)

Show that for G = GL(n, κ) the corresponding Lie algebra is gl(n, κ) by considering
the 1-parameter subgroups exp[teij ] where eij are the matrix units.

Exercise
Interpret equation (11.441) as a special case of (11.546) for the adjoint representation.
Use this to derive (11.470) from the group homomorphism property of ρ.

Exercise Tensor Products Of Representations


We have noted that if ρ1 : G → Aut(V1 ) and ρ2 : G → Aut(V2 ) are two representations
of any group G then there is a tensor product representation

ρ12 (g) := ρ1 (g) ⊗ ρ2 (g) (11.555)

Show that if $G$ is a Lie group then the corresponding Lie algebra representation on $V_1 \otimes V_2$
represents $X \in \mathfrak{g} = T_1 G$ by
$$
\dot\rho_{12}(X) = \dot\rho_1(X) \otimes 1_{V_2} + 1_{V_1} \otimes \dot\rho_2(X) \qquad (11.556)
$$

11.16.5 Finite Dimensional Irreducible Representations Of sl(2, C),sl(2, R), and


su(2)
Relation of sl(2, R), su(2) and sl(2, C):
If we regard sl(2, C) as a Lie algebra over κ = R there are two inequivalent real Lie
subalgebras: sl(2, R) and su(2), as we have described in detail above. However, if we allow
ourselves to multiply by complex numbers, that is, if we consider sl(2, R) ⊗ C and su(2) ⊗ C
then we obtain a single Lie algebra sl(2, C). Indeed, in the complexification, we have

e = iT 1 − T 2
h = −2iT 3 (11.557)
1 2
f = −iT − T

Therefore, there is no distinction between the finite-dimensional representations of


su(2), sl(2, R), and sl(2, C) on complex vector spaces. It is actually easiest to construct
the finite dimensional representation of sl(2, R) or sl(2, C) on a complex vector space. But
by the above formulae these will immediately give the finite dimensional representations
of su(2).

Suppose we have a finite-dimensional complex vector space V and linear operators
ρ(e), ρ(f ), ρ(h) on V satisfying the commutation relations:

[ρ(h), ρ(f )] = 2ρ(f )


[ρ(h), ρ(e)] = −2ρ(e) (11.558)
[ρ(e), ρ(f )] = ρ(h)

As shown in Chapter two, any linear operator on a complex vector space has at least one
eigenvector. (It might have only one eigenvector.)
Suppose we choose an eigenvector v of ρ(h) and suppose the eigenvalue is λ. Then, we
claim that, so long as ρ(e)n v 6= 0 the vector ρ(e)n v has eigenvalue λ − 2n. To prove this
apply the general identity
n−1
X
[A, B n ] = B i [A, B]B n−1−i (11.559)
i=0

to conclude:
[ρ(h), ρ(e)n ] = −2nρ(e)n (11.560)
and the result follows. Now, it is general fact of linear algebra that if we have nonzero
vectors v1 , . . . , vn with distinct eigenvalues λ1 , . . . , λn for some operator then the v1 , . . . , vn
are linearly independent. (Prove this as an exercise.) Therefore, if ρ(e)n v 6= 0 then the
vectors v, ρ(e)v, . . . , ρ(e)n v are linearly independent. Therefore, since we have a finite-
dimensional representation there must be a nonnegative integer n so that ρ(e)n v 6= 0 but
ρ(e)n+1 v = 0. Let us denote v0 := ρ(e)n v. So ρ(e)v0 = 0 with ρ(h)v0 = λ0 v0 and v0 6= 0.
Now, using (11.559) again we get:

[ρ(h), ρ(f )k ] = 2kρ(f )k (11.561)

and therefore
ρ(h)(ρ(f )k v0 ) = (λ0 + 2k)ρ(f )k v0 (11.562)
By a similar argument to that above, we know that if ρ(f )n v0 is nonzero then the vectors
v0 , ρ(f )v0 , . . . , ρ(f )n v0 are linearly independent, and therefore there must exist an integer
N so that ρ(f )N v0 6= 0 but ρ(f )N +1 v0 = 0. Therefore

[ρ(e), ρ(f )N +1 ]v0 = 0 (11.563)

Now, using
$$
[\rho(e), \rho(f)^{N+1}] = \sum_{i=0}^{N} \rho(f)^i\, [\rho(e),\rho(f)]\, \rho(f)^{N-i} = \sum_{i=0}^{N} \rho(f)^i\, \rho(h)\, \rho(f)^{N-i} \qquad (11.564)
$$
and applying this identity to $v_0$ we get:
$$
0 = \left( \sum_{i=0}^{N} (\lambda_0 + 2(N-i)) \right) \rho(f)^N v_0 \qquad (11.565)
$$

But ρ(f )N v0 6= 0, by the definition of N so
$$
\sum_{i=0}^{N} (\lambda_0 + 2(N-i)) = 0 \qquad (11.566)
$$

which implies λ0 = −N . Thus, we have produced a set of vectors

v0 , ρ(f )v0 , · · · , ρ(f )N v0 = v0 , v1 , . . . , vN (11.567)

spanning an (N + 1)-dimensional representation of sl(2, R) inside any finite-dimensional


representation. Indeed, if we define:

ṽs := ρ(f )s v0 (11.568)

understanding that vN +1 = vN +2 = · · · = 0 then

ρ(f )ṽs = ṽs+1


ρ(h)ṽs = (2s − N )ṽs (11.569)
ρ(e)ṽs = −s(N + 1 − s)ṽs−1

So the span, which we will denote WN ⊂ V is a subrepresentation. Note that in this


ordered basis for WN the representation matrices are: (See chapter 2 for the proper way
to associate a matrix to a linear transformation using an ordered basis.)
$$
\rho(f) = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix}
\qquad (11.570)
$$
$$
\rho(h) = \begin{pmatrix}
-N & 0 & 0 & \cdots & 0 & 0 \\
0 & -N+2 & 0 & \cdots & 0 & 0 \\
0 & 0 & -N+4 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 & 0 \\
0 & 0 & 0 & \cdots & N-2 & 0 \\
0 & 0 & 0 & \cdots & 0 & N
\end{pmatrix}
\qquad (11.571)
$$
$$
\rho(e) = \begin{pmatrix}
0 & -N & 0 & \cdots & 0 & 0 \\
0 & 0 & -2(N-1) & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & -N \\
0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
\qquad (11.572)
$$
Note that the Jordan form for ρ(f ) shows that the representation WN is irreducible.
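
These matrices are simple to build for any $N$, and one can confirm directly that they satisfy the $sl(2)$ relations (11.558). The following sketch (an added illustration) does exactly that.

```python
# Build the (N+1)x(N+1) matrices of (11.570)-(11.572) and check the sl(2) relations.
import numpy as np

def rep_matrices(N):
    """Return (e, h, f) in the ordered basis (v_0, ..., v_N) of W_N."""
    f = np.zeros((N+1, N+1))
    e = np.zeros((N+1, N+1))
    for s in range(N):
        f[s+1, s] = 1.0                 # rho(f) v_s = v_{s+1}
        e[s, s+1] = -(s+1)*(N - s)      # rho(e) v_s = -s(N+1-s) v_{s-1}
    h = np.diag([2*s - N for s in range(N+1)])
    return e, h, f

for N in range(1, 7):
    e, h, f = rep_matrices(N)
    assert np.allclose(h @ f - f @ h, 2*f)     # [h, f] = 2 f
    assert np.allclose(h @ e - e @ h, -2*e)    # [h, e] = -2 e
    assert np.allclose(e @ f - f @ e, h)       # [e, f] = h
print("The (N+1)-dimensional matrices satisfy the sl(2) relations for N = 1,...,6.")
```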

As mentioned above, we automatically get a representation of su(2), and hence of the
group SU (2). Therefore V is completely reducible. Therefore, we have recovered - this
time at the Lie algebra level - that there is one irreducible representation of SU (2) of
dimension n for each positive integer n. In fact, identifying N = 2j the representation WN
gives another model for the spin j representation Vj .

Remarks

1. In physics the angular momentum operators in quantum mechanics generate the


Lie algebra su(2). See the exercise below. In physics observable quantities (like
angular momentum) are represented by Hermitian operators. The observable angular
momentum operators are related to the anti-Hermitian generators of su(2) by a factor

of −1. Physicists usually define
1
J a := iT a = σ a (11.573)
2
and they satisfy the commutation relations

[J a , J b ] = iabc J c (11.574)

which, unfortunately, obscures the fact that we are working with a real Lie algebra.
Group elements are of the form exp[iθ~ · J] ~ where θ~ is real. When working with
representations physicists generally define

J + = J 1 + iJ 2 = e
(11.575)
J − = J 1 − iJ 2 = −f

so that

[J 3 , J ± ] = ±J ±
(11.576)
[J + , J − ] = 2J 3

More generally, in a physical system with SU (2) symmetry the J a are the conserved
“Noether charges” for that symmetry. Recall that we called the irreducible represen-
tations of SU (2) of dimension n = 2j + 1 the “spin j representations.” This is the
origin of that terminology.

2. Vj is a unitary representation of su(2) and SU (2) but is definitely not unitary as a


representation of sl(2, R). (There are unitary irreps of sl(2, R), but they are infinite-
dimensional.) The standard physics notation for an ON basis diagonalizing ρ(J 3 ) is
|j, mi with
$$
\begin{aligned}
\rho(J^+)|j,m\rangle &= \sqrt{(j-m)(j+m+1)}\;|j,m+1\rangle \\
\rho(J^3)|j,m\rangle &= m\,|j,m\rangle \\
\rho(J^-)|j,m\rangle &= \sqrt{(j+m)(j-m+1)}\;|j,m-1\rangle
\end{aligned}
\qquad (11.577)
$$

with m = −j, −j + 1, . . . , j − 1, j. We can relate this to our basis by identifying
N = 2j and
ρ(f )s v0 = Cj,s |j, j − si (11.578)
for a suitable normalization factor Cj,s .

Exercise Justifying The Relation To Angular Momentum


In the classical mechanics of a particle moving in R3 the angular momentum of a
particle around the origin is the function on phase space given by

$$
L^a = \epsilon^{abc}\, x^b p^c \qquad a,b,c \in \{1,2,3\} \qquad (11.579)
$$
$L^a$ is a (pseudo)-vector. In quantum mechanics we adopt the same expression and there is
no issue of operator ordering because the epsilon symbol prevents $b = c$:
$$
\hat L^a = \epsilon^{abc}\, \hat x^b \hat p^c \qquad a,b,c \in \{1,2,3\} \qquad (11.580)
$$
Using $[\hat p^a, \hat x^b] = -i\hbar\,\delta^{ab}$ show that
$$
[\hat L^a, \hat L^b] = i\epsilon^{abc}\,\hbar\, \hat L^c \qquad (11.581)
$$
so that $\hat T^a = -\frac{i}{\hbar}\hat L^a$ are anti-Hermitian operators generating a copy of $su(2)$.


Remark: In spin j representations if j is order 1 the physical angular momentum is
of order ~ and hence intrinsically quantum mechanical. On the other hand, for fixed L̂,
the semiclassical ~ → 0 limit is the large spin limit.

Exercise Another basis


Show that if we use the basis
1
vs = ρ(f )s v0 (11.582)
s!
then

ρ(f )vs = (s + 1)vs+1


ρ(h)vs = (2s − N )vs (11.583)
ρ(e)vs = (s − N − 1)vs−1

Exercise Unitarizing Vj

Unitarizing Vj means equipping it with an inner product so that the ρ(J a ) are Hermi-
tian operators. Equivalently, ρ(h) is Hermitian and

$$
\rho(J^+)^\dagger = \rho(J^-) \quad\leftrightarrow\quad \rho(e)^\dagger = -\rho(f) \qquad (11.584)
$$
Show that by rescaling $u_s = c_s \tilde v_s$ and declaring the $u_s$ to be ON we get a unitary structure
provided that
$$
\left(\frac{c_{s-1}}{c_s}\right)^2 = s(2j+1-s) \qquad (11.585)
$$
so that ♣Fix this equation: sign issue ♣
$$
\begin{aligned}
\rho(e)\,u_s &= -\sqrt{s(2j+1-s)}\; u_{s-1} \\
\rho(f)\,u_s &= \sqrt{s(2j+1-s)}\; u_{s+1}
\end{aligned}
\qquad (11.586)
$$

11.16.6 Casimirs
We have stressed that if two elements a, b in a Lie subalgebra of a matrix Lie algebra are
multiplied as matrices ab then in general the result is not in the Lie algebra. Nevertheless,
if we have a representation ρ(a), ρ(b) ∈ End(V ) nothing stops us form multiplying the
operators ρ(a)ρ(b). Certain relations among the algebra of the operators ρ(a) for a ∈ g are
universal and independent of the representation. They can be expressed in terms of tensor
algebra using the universal enveloping algebra. See Chapter two for a description. Here we
just look at one important aspect of such universal relations.
In any representation of su(2) the operator
$$
C_2(\rho) := \sum_{a=1}^{3} \rho(T^a)^2 \qquad (11.587)
$$

commutes with all the operators ρ(T a ). This operator is known as a quadratic Casimir.
It is a theorem in the theory of universal enveloping algebras that any representation, any
operator that commutes with all operators made by multiplying and adding the ρ(T a ) is
a polynomial in the operator C2 (ρ). This fact generalizes to all simple Lie algebras: The
center of the universal enveloping algebra is a polynomial in the Casimirs and there are
r independent Casimirs where r is the rank. For SU (N ) there are N − 1 independent
Casimirs.
Returning to SU (2), we can express C2 (ρ) in terms of the representations of the basis
for sl(2, R) using
$$
C_2(\rho) = \frac{1}{4}\left( 2\big(\rho(e)\rho(f) + \rho(f)\rho(e)\big) - \rho(h)^2 \right) \qquad (11.588)
$$
In any irreducible representation the Casimir operator must be a multiple of the iden-
tity operator. We can easily compute the value by acting on any convenient vector. For
example,
$$
C_2(\rho)\, v_0 = \frac{1}{4}\left( 2\big(\rho(e)\rho(f) + \rho(f)\rho(e)\big) - \rho(h)^2 \right) v_0 = -\frac{N(N+2)}{4}\, v_0 \qquad (11.589)
$$

which the physicists will prefer to write as
$$
\rho_j(\vec J\,)^2 = j(j+1)\, 1_{(2j+1)\times(2j+1)} \qquad (11.590)
$$

where ρj is the representation on Vj and we have merely substituted N = 2j above.
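
A direct numerical confirmation of (11.590) is sketched below (added here as an illustration): the spin-$j$ matrices are built from the physics conventions (11.577) and the sum of their squares is checked to be $j(j+1)$ times the identity.

```python
# Check J^2 = j(j+1) 1 for the spin-j matrices built from (11.577).
import numpy as np

def spin_matrices(j):
    dim = int(2*j) + 1
    m = np.array([j - k for k in range(dim)])          # m = j, j-1, ..., -j
    Jz = np.diag(m)
    Jp = np.zeros((dim, dim))
    for k in range(1, dim):                            # J+ raises m by one unit
        mm = m[k]
        Jp[k-1, k] = np.sqrt((j - mm)*(j + mm + 1))
    Jm = Jp.T
    Jx, Jy = (Jp + Jm)/2, (Jp - Jm)/(2*1j)
    return Jx, Jy, Jz

for twoj in range(1, 6):
    j = twoj/2
    Jx, Jy, Jz = spin_matrices(j)
    C = Jx @ Jx + Jy @ Jy + Jz @ Jz
    assert np.allclose(C, j*(j+1)*np.eye(int(2*j)+1))
print("J^2 = j(j+1) 1 for j = 1/2, 1, ..., 5/2.")
```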

Examples:

1. Consider a quantum system of two spin 1/2 particles with Hamiltonian:

H = J1a ⊗ J2a (11.591)

We can easily use representation theory to find the spectrum of this Hamiltonian.
We note that
1 a 
H= (J1 ⊗ 1 + 1 ⊗ J2a )2 − J~12 ⊗ 1 − 1 ⊗ J~22 (11.592)
2
But the Hilbert space is V1/2 ⊗ V1/2 ∼
= V0 ⊕ V1 . On the one-dimensional subspace
∼ V
= 0 we use (11.590) to compute
1 3 3 3
H|V0 = (0 − − ) = − (11.593)
2 4 4 4

But on the three-dimensional subspace = V1 we have
1 3 3 1
H|V1 = (2 − − ) = + (11.594)
2 4 4 4

Exercise Three Qbits On A Ring


Consider a quantum system of three spin 1/2 particles on a ring with $\vec J_i \cdot \vec J_{i+1}$ interaction
between neighboring spins. Compute the spectrum of the Hamiltonian
$$
H = \sum_i J_i^a \otimes J_{i+1}^a \qquad (11.595)
$$
where $i$ is understood as an index modulo 3. ^{126}

^{126} Answer: We can write:
$$
H = \frac{1}{2}\left( (J_1^a\otimes 1\otimes 1 + 1\otimes J_2^a\otimes 1 + 1\otimes 1\otimes J_3^a)^2 - \vec J_1^{\,2}\otimes 1\otimes 1 - 1\otimes \vec J_2^{\,2}\otimes 1 - 1\otimes 1\otimes \vec J_3^{\,2} \right) \qquad (11.596)
$$
But we have the isotypical decomposition:
$$
V_{1/2}\otimes V_{1/2}\otimes V_{1/2} \cong 2V_{1/2} \oplus V_{3/2} \qquad (11.597)
$$
On the 4-dimensional space $2V_{1/2}$ we have
$$
H\big|_{2V_{1/2}} = -\frac{3}{4} \qquad (11.598)
$$
and on the 4-dimensional space $V_{3/2}$ we have
$$
H\big|_{V_{3/2}} = +\frac{3}{4} \qquad (11.599)
$$
11.16.7 Lie Algebra Operators In The Induced Representations Of SU (2)
Using the above models for the irreducible representations of SU (2) and the relation be-
tween elements of the Lie algebra and infinitesimal group elements we get representations
of the Lie algebra in terms of differential operators.
In terms of homogeneous polynomials ψ(u, v) of two variables in the space H2j recall
that
$$
(\rho(g)\cdot\psi)\!\begin{pmatrix} u \\ v\end{pmatrix} := \psi\!\left( g^{-1}\begin{pmatrix} u \\ v\end{pmatrix}\right) \qquad (11.600)
$$
Now suppose $g$ is infinitesimally close to the identity, $g = 1 + \epsilon X$, so that
$$
(\rho(1+\epsilon X)\cdot\psi)\!\begin{pmatrix} u \\ v\end{pmatrix} := \psi\!\left( (1-\epsilon X)\begin{pmatrix} u \\ v\end{pmatrix}\right) \qquad (11.601)
$$
up to order $\epsilon^2$. Since $\rho(X)$ is linear in $X$ we can form complex linear combinations, and
therefore the same formula will apply to any $X \in sl(2,\mathbb{C})$. In particular, we deduce:
$$
\begin{aligned}
\rho(e)\cdot\psi &= -v \frac{\partial}{\partial u}\psi \\
\rho(h)\cdot\psi &= \left( u\frac{\partial}{\partial u} - v\frac{\partial}{\partial v}\right)\psi \\
\rho(f)\cdot\psi &= u \frac{\partial}{\partial v}\psi
\end{aligned}
\qquad (11.602)
$$
The reader should check the operators really satisfy ρ([X, Y ]) = [ρ(X), ρ(Y )]. Note that
the Casimir is:
the Casimir is:
$$
C_2(\rho) = -\frac{1}{4}\left( u\frac{\partial}{\partial u} + v\frac{\partial}{\partial v}\right)^2 - \frac{1}{2}\left( u\frac{\partial}{\partial u} + v\frac{\partial}{\partial v}\right) \qquad (11.603)
$$
so acting on $H_{2j}$ we immediately get that $C_2(\rho_j)$ acts as the scalar $-j(j+1)$, i.e. $\rho_j(\vec J\,)^2$ acts as
multiplication by $j(j+1)$ as in (11.590).
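
The suggested check of the commutation relations, and of the Casimir value just stated, can be done symbolically; the following sympy sketch (an added illustration) verifies both on homogeneous polynomials of degree $2j = 3$.

```python
# Check the differential-operator realization (11.602) and the Casimir (11.603) on H_3.
import sympy as sp

u, v = sp.symbols('u v')

def rho_e(psi): return -v*sp.diff(psi, u)
def rho_h(psi): return u*sp.diff(psi, u) - v*sp.diff(psi, v)
def rho_f(psi): return u*sp.diff(psi, v)

E = lambda p: u*sp.diff(p, u) + v*sp.diff(p, v)    # Euler operator

j = sp.Rational(3, 2)                              # 2j = 3
for k in range(4):
    psi = u**k * v**(3 - k)
    assert sp.simplify(rho_e(rho_f(psi)) - rho_f(rho_e(psi)) - rho_h(psi)) == 0    # [e,f]=h
    assert sp.simplify(rho_h(rho_e(psi)) - rho_e(rho_h(psi)) + 2*rho_e(psi)) == 0  # [h,e]=-2e
    assert sp.simplify(rho_h(rho_f(psi)) - rho_f(rho_h(psi)) - 2*rho_f(psi)) == 0  # [h,f]=+2f
    C2 = -sp.Rational(1, 4)*E(E(psi)) - sp.Rational(1, 2)*E(psi)                   # (11.603)
    assert sp.simplify(C2 + j*(j + 1)*psi) == 0    # C2 acts as -j(j+1) on H_{2j}
print("Differential-operator realization verified on H_3 (spin 3/2).")
```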
It is also interesting to consider the action in the inhomogeneous representation. As
above, we extend the action to all of SL(2, C) so that if
$$
g = \begin{pmatrix} a & b \\ c & d\end{pmatrix} \in SL(2,\mathbb{C}) \qquad (11.604)
$$
then
$$
\rho(g)\cdot p(z) = (-cz+a)^{2j}\; p\!\left(\frac{dz-b}{-cz+a}\right) \qquad (11.605)
$$
and if $g = 1 + \epsilon X + O(\epsilon^2)$ then $\rho(g) = 1 + \epsilon\rho(X) + O(\epsilon^2)$, and in this way we derive:
$$
\begin{aligned}
\rho(e)\cdot p &= -\frac{\partial}{\partial z}\, p \\
\rho(h)\cdot p &= \left(2z\frac{\partial}{\partial z} - 2j\right) p \\
\rho(f)\cdot p &= -\left(z^2\frac{\partial}{\partial z} - 2jz\right) p
\end{aligned}
\qquad (11.606)
$$
For example:
$$
\begin{aligned}
(\rho(1+\epsilon f)\cdot p)(z) &:= (\epsilon z+1)^{2j}\; p\!\left(\frac{z}{\epsilon z+1}\right) \\
&= (1 + 2j\epsilon z)\, p(z - \epsilon z^2) + O(\epsilon^2) \\
&= p(z) - \epsilon z^2 \frac{\partial}{\partial z}p(z) + 2j\epsilon z\, p(z) + O(\epsilon^2)
\end{aligned}
\qquad (11.607)
$$
Similarly, acting on the Wigner functions themselves we learn that, after complexifi-
cation $\rho(e)$ translates the group element in the argument of $D^j_{m_L,m_R}$ by
$$
g = \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix} \;\to\; g - \begin{pmatrix} \beta & \bar\alpha \\ 0 & 0 \end{pmatrix} \qquad (11.608)
$$
and similarly $\rho(f)$ translates it by
$$
g = \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix} \;\to\; g + \begin{pmatrix} 0 & 0 \\ \alpha & -\bar\beta \end{pmatrix} \qquad (11.609)
$$
So that, acting on the Wigner functions considered as polynomials in α, ᾱ, β, β̄ if we treat


α, ᾱ, β, β̄ as independent then

$$
\begin{aligned}
\rho(e) &= \bar\alpha\frac{\partial}{\partial\bar\beta} - \beta\frac{\partial}{\partial\alpha} \\
\rho(h) &= \bar\alpha\frac{\partial}{\partial\bar\alpha} - \alpha\frac{\partial}{\partial\alpha} - \bar\beta\frac{\partial}{\partial\bar\beta} + \beta\frac{\partial}{\partial\beta} \\
\rho(f) &= \alpha\frac{\partial}{\partial\beta} - \bar\beta\frac{\partial}{\partial\bar\alpha}
\end{aligned}
\qquad (11.610)
$$

One can check explicitly that


$$
\begin{aligned}
\rho(e)\,\tilde D^j_{m_L,m_R} &= \mathrm{const.}\;\tilde D^j_{m_L+1,m_R} \\
\rho(h)\,\tilde D^j_{m_L,m_R} &= 2 m_L\, \tilde D^j_{m_L,m_R} \\
\rho(f)\,\tilde D^j_{m_L,m_R} &= \mathrm{const.}\;\tilde D^j_{m_L-1,m_R}
\end{aligned}
\qquad (11.611)
$$

We can specialize this to get the standard identities on spherical harmonics which are
frequently used in mathematical physics: ^{127} ♣NEED TO GIVE MORE DETAILS ON THIS LAST REMARK ♣
$$
\begin{aligned}
L_+ &= e^{i\phi}\left( \frac{\partial}{\partial\theta} + i\cot\theta\, \frac{\partial}{\partial\phi}\right) \\
L_- &= e^{-i\phi}\left( -\frac{\partial}{\partial\theta} + i\cot\theta\, \frac{\partial}{\partial\phi}\right) \\
L_3 &= -i\frac{\partial}{\partial\phi}
\end{aligned}
\qquad (11.612)
$$
127
See, for example, J.D. Jackson, Classical Electrodynamics, page 743.

– 250 –
p
L+ Y`,m = (` − m)(` + m + 1)Y`,m+1
p
L− Y`,m = (` + m)(` − m + 1)Y`,m−1
(11.613)
L3 Y`,m = mY`,m
~ 2 Y`,m = `(` + 1)Y`,m
L

~ 2 , the Casimir, as a differential shows up in the expression for the Laplacian


In particular L
expressed in terms of spherical coordinates. The equation L ~ 2 Y`,m = `(` + 1)Y`,m becomes
the standard differential equation satisfied by associated Legendre functions and Legendre
polynomials, whereas the first two lines of (11.613) become standard identities relating
associated Legendre functions.

Remark: We have seen above that group theory gives a nice perspective on many identi-
ties satisfied by the family of special functions associated with Legendre polynomials and
spherical harmonics. This viewpoint extends very nicely to many other special functions
in mathematical physics. Two references that explain this in some detail are:
1. J.D. Talman, Special Functions: A Group Theoretic Approach Based on Lectures
by Eugene P. Wigner
2. N. Ja. Vilenkin, Special Functions and the Theory of Group Representations ♣The following
section on Kernel,
Image, Exact
Sequence should be
moved to be just
before the
12. Group Theory And Elementary Number Theory representation
theory section, but
the quantum
mechanics on the
circle should be
In this chapter we review some very elementary number theory that has a strong connection moved either to the
Heisenberg section
to group theory. The facts here can be very useful in thinking about many physics problems. or to the Pontryagin
duality section. ♣
Two general references are
Hardy and Wright, An Introduction To The Theory Of Numbers
Ireland and Rosen, A Classical Introduction to Modern Number Theory

12.1 Reminder On gcd And The Euclidean Algorithm

Let us recall some basic facts from grade school arithmetic:


First, if A > B are two positive integers then we can write

A = qB + r 0≤r<B (12.1)

for unique nonnegative integers q and r known as the quotient and the residue, respectively.
Next, let (A, B) = (±A, ±B) = (±B, ±A) denote the greatest common divisor of A, B.
Then we can find it using the Euclidean algorithm by looking at successive quotients. If

– 251 –
A = qB with r = 0 we are done! Then (A, B) = B. If r > 0 then we proceed as follows:

A = q1 B + r1 0 < r1 < B
B = q2 r1 + r2 0 < r2 < r1
r1 = q3 r2 + r3 0 < r3 < r2
r2 = q4 r3 + r4 0 < r4 < r3 (12.2)
.. ..
. .
rj−2 = qj rj−1 + rj 0 < rj < rj−1
rj−1 = qj+1 rj

Note that B > r1 > r2 > · · · ≥ 0 is a strictly decreasing sequence of nonnegative integers
and hence must terminate at r∗ = 0 after a finite number of steps.

Examples
A = 96 and B = 17:

96 = 5 · 17 + 11
17 = 1 · 11 + 6
11 = 1 · 6 + 5 (12.3)
6=1·5+1
5=5·1

A = 96 and B = 27:

96 = 3 · 27 + 15
27 = 1 · 15 + 12
(12.4)
15 = 1 · 12 + 3
12 = 4 · 3

Note well: In (12.1) the remainder might be zero but in the first j lines of the Euclidean
algorithm the remainder is positive, unless B divides A, in which case rather trivially
(A, B) = B. The last positive remainder rj is the gcd (A, B). Indeed if m1 , m2 are integers
then the gcd satisfies:

(m1 , m2 ) = (m2 , m1 ) = (m2 , m1 − xm2 ) (12.5)

for any integer x. Applying this to the Euclidean algorithm above we get:

(A, B) = (B, r1 ) = (r1 , r2 ) = · · · = (rj−1 , rj ) = (rj , 0) = rj . (12.6)

A corollary of this algorithm is that if g = (A, B) is the greatest common divisor then
there exist integers (x, y) so that
Ax + By = g (12.7)

– 252 –
In particular, two integers m1 , m2 are relatively prime, that is, have no common integral
divisors other than ±1, if and only if there exist integers x, y such that

m1 x + m2 y = 1. (12.8)

Of course x, y are not unique. Equation (12.8) is sometimes known as “Bezout’s theorem.”
We can prove these statements from the Euclidean algorithm as follows. ♣Putting this
discussion here
For an integer n define ! makes part of the
section on SL(2, Z)
1n and continued
T (n) := = Tn (12.9) fractions a little
0 1 redundant. ♣

where !
11
T := T (1) = . (12.10)
01
Now let us write the first line of the Euclidean algorithm as a matrix identity as
! !
A r1
T (−q1 ) = (12.11)
B B

and better, we write this as: ! !


1 A B
σ T (−q1 ) = (12.12)
B r1
Then the second line of the Euclidean algorithm becomes:
! !
A r1
σ 1 T (−q2 )σ 1 T (−q1 ) = (12.13)
B r2

Thus we have ! !
A rj−1
σ 1 T (−qj ) · · · σ 1 T (−q2 )σ 1 T (−q1 ) = (12.14)
B rj
and in the final step:
! !
A rj
σ 1 T (−qj+1 )σ 1 T (−qj ) · · · σ 1 T (−q2 )σ 1 T (−q1 ) = (12.15)
B 0

Multiplying out the matrices on the LHS gives an expression:


! ! !
xy A rj
= (12.16)
uv B 0

(Note that x, y, u, v are polynomials in the qi . See comments on continued fractions below.)
Remarks:

1. The Euclidean algorithm is fast: A theorem of Lamé asserts that the Euclidean
algorithm is very efficient. It should be completely obvious to you that the number
of steps cannot exceed B. (Recall that A > B.) However, Lamé asserts that in fact
the number of steps never exceeds 5log10 B. This is important for RSA (see below).

– 253 –
2. Relation to continued fractions: Note that from equation (12.15) we can also write
! !
A rj
= T (q1 )σ 1 T (q2 )σ 1 · · · T (qj )σ 1 T (qj+1 )σ 1 (12.17)
B 0

Let us write: !
q 1
M (q) := T (q)σ 1 = (12.18)
10

We now define two sequences of polynomials in n variables that we call Nn (q1 , . . . , qn )


and Dn (q1 , . . . , qn ) for all n ≥ 1. It is convenient to define N0 = 1 and D0 = 1 and
then we can write:
!
Nn (q1 , . . . , qn ) Nn−1 (q1 , . . . , qn−1 )
M (q1 ) · · · M (qn ) := (12.19)
Dn (q1 , . . . , qn ) Dn−1 (q1 , . . . , qn−1 )

(The reader should check that this is a consistent definition for all n.) One easily
generates:

N1 (q1 ) = q1
N2 (q1 , q2 ) = 1 + q1 q2 (12.20)
N3 (q1 , q2 , q3 ) = q1 + q3 + q1 q2 q3

D1 (q1 ) = 1
D2 (q1 , q2 ) = q2 (12.21)
D3 (q1 , q2 , q3 ) = 1 + q2 q3
These polynomials are closely related to continued fractions, defined as:

1
[q1 , q2 , q3 , · · · , qj ] := q1 + 1 (12.22)
q2 + q3 +···+ q1
j

Indeed, now that


!
Nn (q2 , . . . , qn+1 ) Nn−1 (q2 , . . . , qn )
M (q1 ) · (M (q2 ) · · · M (qn+1 )) = M (q1 ) (12.23)
Dn (q2 , . . . , qn+1 ) Dn−1 (q2 , . . . , qn )

from which one deduces the recursion relations:

Nn+1 (q1 , . . . , qn+1 ) = q1 Nn (q2 , . . . , qn+1 ) + Dn (q2 , . . . , qn+1 )


(12.24)
Dn+1 (q1 , . . . , qn+1 ) = Nn (q2 , . . . , qn+1 )

On the other hand, writing

Pn (q1 , . . . , qn )
[q1 , q2 , q3 , · · · , qn ] := (12.25)
Qn (q1 , . . . , qn )

– 254 –
we see that
1
[q1 , q2 , q3 , · · · , qn+1 ] = q1 +
[q2 , . . . , qn+1 ]
(12.26)
q1 Pn (q2 , . . . , qn+1 ) + Qn (q2 , . . . , qn+1 )
=
Pn (q2 , . . . , qn+1 )

So, Pn , Qn satisfy the same recursion relations as Nn , Dn , respectively, and since the
initial values are also the same we conclude that Pn = Nn and Qn = Dn .

Exercise
Check the Lamé bound for the two examples above.

Exercise
Given one solution for (12.7), find all the others.

Exercise Continued fractions and the Euclidean algorithm


a.) Show that the quotients qi in the Euclidean algorithm define a continued fraction
expansion for A/B:

A 1
= q1 + 1 := [q1 , q2 , q3 , · · · , qj ] (12.27)
B q2 + q +···+ 1
3 qj

The fractions [q1 ], [q1 , q2 ], [q1 , q2 , q3 ], . . . are known as the convergents of the continued
fraction.
b.) Show that 128

Nn+1 (q1 , . . . , qn+1 ) = qn+1 Nn (q1 , . . . , qn ) + Nn−1 (q1 , . . . , qn−1 )


(12.28)
Dn+1 (q1 , . . . , qn+1 ) = qn+1 Dn (q1 , . . . , qn ) + Dn−1 (q1 , . . . , qn−1 )

c.) Show that


Nn Dn−1 − Dn Nn−1 = (−1)n (12.29)

128
Answer : Write M (q1 ) · · · M (qn+1 ) = (M (q1 ) · · · M (qn )) · M (qn+1 ).

– 255 –
12.2 Application: Expressing elements of SL(2, Z) as words in S and T
The group SL(2, Z) is generated by
! !
0 −1 11
S := & T := (12.30)
1 0 01

Here is an algorithm for decomposing an arbitrary element


!
A B
h= ∈ SL(2, Z) (12.31)
C D

as a word in S and T .
First, note the following simple

Lemma Suppose h ∈ SL(2, Z) as in (12.31). Suppose moreover that g ∈ SL(2, Z) satisfies:


! !
A 1
g· = (12.32)
C 0

Then
gh = T n (12.33)
for some integer n ∈ Z.
The proof is almost immediate by combining the criterion that gh ∈ SL(2, Z) has
determinant one and yet must have the first column (1, 0).
Now, suppose h is such that A > C > 0. Then (A, C) = 1 and hence we have the
Euclidean algorithm to define integers q` , ` = 1, . . . N + 1, where N ≥ 1, such that

A = q1 C + r1 0 < r1 < C
C = q2 r1 + r2 0 < r2 < r1
r1 = q3 r2 + r3 0 < r3 < r2
.. .. (12.34)
. .
rN −2 = qN rN −1 + rN 0 < rN < rN −1
rN −1 = qN +1 rN

with rN = (A, C) = 1. (Note you can interpret r0 = C, as is necessary if N = 1.) Now, ♣N = 0 here? ♣

write the first line in the Euclidean algorithm in matrix form as:
! ! !
1 −q1 A r1
= (12.35)
0 1 C C

We would like to have the equation in a form that we can iterate the algorithm, so we need
the larger integer on top. Therefore, rewrite the identity as:
! ! !
1 1 −q1 A C
σ = (12.36)
0 1 C r1

– 256 –
We can now iterate the procedure. So the Euclidean algorithm implies the matrix identity:
! !
A 1
g̃ = (12.37)
C 0

g̃ = (σ 1 T −qN +1 ) · · · (σ 1 T −q1 ) (12.38)


Now, to apply the Lemma we need g to be in SL(2, Z), but

detg̃ = (−1)N +1 (12.39)

We can easily modify the equation to obtained a desired element g. We divide the argument
into two cases:

1. Suppose first that N + 1 = 2s is even. Then we group the factors of g̃ in pairs and
write
(σ 1 T −q2` )(σ 1 T −q2`−1 ) = (σ 1 σ 3 )(σ 3 T −q2` σ 3 )(σ 3 σ 1 )T −q2`−1
(12.40)
= −ST q2` ST −q2`−1

where we used that σ 1 σ 3 = −iσ 2 = S. Therefore, we can write


s
Y
g̃ = g = (−1)s (ST q2` ST −q2`−1 ) (12.41)
`=1

2. Now suppose that N + 1 = 2s + 1 is odd. Then we rewrite the identity (12.37) as:
! !
1 A 1
σ g̃ = (12.42)
C 0

so now we simply take


s
Y
−q2s+1
1
g = σ g̃ = (−1) s+1
(ST ) (ST q2` ST −q2`−1 ) (12.43)
`=1

Thus we can summarize both cases by saying that


N +1
N +1 Y `q
g = (−1)b 2
c
(ST (−1) `
) (12.44)
`=1

Then we can finally write !


A B
h= = g −1 T n (12.45)
C D
as a word in S and T for a suitable integer n. (Note that S 2 = −1.) ♣It would be good
to give an algorithm
Now we need to show how to bring the general element h ∈ SL(2, Z) to the form with for determining n.

A > C > 0 so we can apply the above formula. Note that
! ! !
1 0 A B A B
= (12.46)
m1 C D C + mA D + mB

– 257 –
while !
1 0
= ST m S −1 (12.47)
−m 1
Thus, if A > 0 we can use this operation to shift C so that 0 ≤ C < A. In case A < 0 we
can multiply by S 2 = −1 to reduce to the case A > 0. Finally, if A = 0 then
!
0 ±1
h= (12.48)
∓1 n

and we write !
0 −1
ST n = (12.49)
1 n
♣Need to
summarize the
result in a useful
12.3 Products Of Cyclic Groups And The Chinese Remainder Theorem way ♣

Recall the elementary definition we met in the last exercise of section 2.


Definition Let H, G be two groups. The direct product of H and G, denoted H × G,
is the set H × G with product:

(h1 , g1 ) · (h2 , h2 ) = (h1 · h2 , g1 · g2 ) (12.50)

We will consider the direct product of cyclic groups. According to our general notation
we would write this as Zm1 × Zm2 . However, since Zm is also a ring the notation Zm1 ⊕ Zm2
also often used, and we will use it below, especially when we write our Abelian groups
additively.
Let us begin with the question: Is it true that

?
Zm1 × Zm2 ∼
=Zm1 m2 . (12.51)
In general (12.51) is false!

Exercise
a.) Show that Z4 is not isomorphic to Z2 ⊕ Z2 . (There is a one-line proof.) 129

b.) Is p is prime is Zp ⊕ Zp isomorphic to Zp2 ?


c.) Is Z3 ⊕ Z5 isomorphic to Z15 ?

Write g = gcd(m1 , m2 ) and ` = lcm(m1 , m2 ). Then there are two natural exact
sequences:
1 → Zg → Zm1 × Zm2 → Z` → 1 (12.52)
0 → Z/`Z → Z/m1 Z ⊕ Z/m2 Z → Z/gZ → 0 (12.53)
129
Answer : Every element in Z2 ⊕ Z2 is of order two. But some elements of Z4 have order four. The but
the order of a group element is preserved under isomorphism.

– 258 –
In fact, we will show below that
Zm1 × Zm2 ∼
= Zg × Z` (12.54)
Remarks:
1. The sequence (12.52) is easier to write down multiplicatively, while (12.53) is easier
to write down additively. See the discussion below. (Of course, both are true in
either formulation!)

2. If g = 1 since Z1 = Z/Z is the trivial group we can indeed conclude that Zm1 ×Zm2 ∼
=
Zm1 m2 but otherwise this is false. We will return to this point.
Now, let us prove (12.52) and (12.53).
Recall that
m1 m2 = g` (12.55)
a fact that will be useful momentarily. (If you do not know this we will prove it below.)
It will also be useful to write m1 = µ1 g and m2 = µ2 g where µ1 , µ2 are relatively prime.
Thus there are integers ν1 , ν2 with
µ1 ν1 + µ2 ν2 = 1 (12.56)
and hence
m1 ν1 + m2 ν2 = g. (12.57)
To prove (12.52) think of Zm as the multiplicative group of mth roots of 1, so they are
all subgroups of U (1). Now define a group homomorphism:
π : Zm1 × Zm2 → Z` (12.58)
by:
π : (ξ1 , ξ2 ) → ξ1 ξ2 (12.59)
That is, we merely multiply the two entries. (This makes it clear that it is a group homo-
morphism since the group law is multiplication of complex numbers and that multiplication
is commutative.) Here ξ1 is an mth th
1 root of unity and ξ2 is an m2 root of unity. The only
thing you need to check is that indeed then ξ1 ξ2 is an `th root of unity, so π indeed maps
into Z` .
2πi 2πi
Now we prove that π is surjective: Let ω1 = e m1 and ω2 = e m2 . These are generators
of Zm1 and Zm2 . Choose integers ν1 , ν2 so that ν1 m1 + ν2 m2 = g then π maps
π : (ω1ν2 , ω2ν1 ) 7→ ω1ν2 ω2ν1
  
ν2 ν1
= exp 2πi +
m1 m2
  
m2 ν2 + m1 ν1
= exp 2πi
m1 m2 (12.60)
 
g
= exp 2πi
m1 m2
 
1
= exp 2πi
`

– 259 –
But exp 2πi 1` is a generator of the multiplicative group of `th roots of unity, isomorphic
 

to Z` , and hence the homomorphism π is onto. Thus, we have checked exactness of the
sequence at Z` .
On the other hand the injection map

ι : Zg → Zm1 × Zm2 (12.61)

is defined by identifying Zg with the multiplicative group of g th roots of unity and just
sending:
ι(ξ) = (ξ, ξ −1 ) (12.62)

Note that a g th root of unity ξ has the property that ξ ±1 is also both an mth
1 and an m2
th

root of unity. So this makes sense. It is now easy to check that indeed the kernel of π
is the image of ι. Since π takes the product of the two entries it is immediate from the
definition (12.62) that im(ι) ⊂ ker(π). On the other hand, if π(ξ1 , ξ2 ) = ξ1 ξ2 = 1 then
clearly ξ2 = ξ1−1 , so this must be in the image of ι. Now we have checked exactness at the
middle of the sequence. Exactness at Zg is trivial. This concludes the proof of (12.52) ♠
It is worth noting that we can write “additive” version of the maps ι and π as:

ι(x) = µ1 x ⊕ (−µ2 x)
(12.63)
π(x1 ⊕ x2 ) = µ2 x1 + µ1 x2

You should check that written this way it is well defined, and the sequence is exact.

Exercise
a.) Show that there is an exact sequence
ι π
0 → Z/`Z→Z/m1 Z ⊕ Z/m2 Z→Z/gZ → 0 (12.64)

where
π : x1 ⊕ x2 7→ (x1 − x2 )modg . (12.65)

ι : x 7→ (xmodm1 ⊕ xmodm2 ) (12.66)

b.) Show that if we think of these groups as groups of roots of unity then we have
π(ξ1 , ξ2 ) = ξ1µ1 ξ2−µ2 and ι(ω) = (ω µ2 , ω µ1 ) with m1 = µ1 g and m2 = µ2 g.

Now we prove that in general

Zm1 × Zm2 ∼
= Zg × Z` (12.67)

First, it follows from either of the two exact sequences we proved above that if
(m1 , m2 ) = 1 then indeed
Zm1 m2 ∼
= Zm1 × Zm2 (12.68)

– 260 –
Next, recall that any integer can be decomposed into its prime factors:
Y
m= pvp (m) (12.69)
p

where vp (m) ∈ Z+ , known as the valuation of m at p is zero for all but finitely many
primes. (So we have an infinite product of 1’s on the RHS of the above equation.)
Now in terms of the prime factorizations of m1 , m2 we can write:
Y
g = gcd(m1 , m2 ) = pmin[vp (m1 ),vp (m2 )]
p
Y (12.70)
` = lcm(m1 , m2 ) = pmax[vp (m1 ),vp (m2 )]
p

Now, from the above we know that Zm1 × Zm2 ∼


= Zm1 m2 if m1 and m2 are relatively prime.
Therefore we can write
Z/mZ ∼
Y
= (Z/pvp (m) Z) (12.71)
p

Applying this to each of the two factors in Zm1 × Zm2 and using G1 × G2 ∼
= G2 × G1 to
arrange the factors so the minimum power is on the left and maximum on the right and
regrouping gives (12.67). In equations:

Zm1 × Zm2 ∼
Y Y
= Zpνp (m1 ) × Zpνp (m2 )
p p

Y Y
= Zpmin[νp (m1 ),νp (m2 )] × Zpmax[νp (m1 ),νp (m2 )] (12.72)
p p

= Zg × Z`

A second proof gives some additional insight by providing an interesting visual picture
of what is going on, as well as relating this fact to lattices. It is related to the first by
“taking a logarithm” and involves exact sequences of infinite groups which induce sequences
on finite quotients.
Consider the sublattice of Z ⊕ Z given by
!
m1 α
Λ = m1 Z ⊕ m2 Z = { |α, β ∈ Z} (12.73)
m2 β

Then Λ ⊂ Z ⊕ Z is a sublattice and it should be pretty clear that

Z2 /Λ = Zm1 ⊕ Zm2 (12.74)

Now, write m1 = µ1 g, m2 = µ2 g as above. Choose integers ν1 , ν2 so that µ1 ν1 + µ2 ν2 = 1


and consider the matrix !
µ2 µ1
∈ SL(2, Z) (12.75)
−ν1 ν2
This is an invertible matrix over the integers, so we can change coordinates on the lattice
from x = m1 α, y = m2 β to

– 261 –
! ! !
x0 µ2 µ1 x
= (12.76)
y0 −ν1 ν2 y
that is ! ! !
x ν2 −µ1 x0
= (12.77)
y ν1 µ2 y0
which we prefer to write as:
! ! !
x ν2 −µ1
= x0 + y0 (12.78)
y ν1 µ2

We interpret this as saying that x0 , y 0 are the coordinates of the vector (x, y) ∈ Z2 relative
to the new basis vectors for Z2 .
! !
ν2 −µ1
v1 = v2 = (12.79)
ν1 µ2

The good property of this basis is that the smallest multiple of v1 that sits in Λ is `v1
(prove this) 130 Similarly, the smallest multiple of v2 in Λ is gv2 . Thus, we have a way of
writing Z2 as Zv1 ⊕ Zv2 such that the projection of Λ to the v1 axis is the group `Z while
the kernel is the subgroup of Zv2 that maps into Λ, and that is just ∼ = gZ.
2
Put differently, there is a homomorphism ψ : Z → Z that takes
!
x
ψ: 7→ x0 . (12.80)
y

This is the projection on the v1 axis. This defines a surjective homomorphism onto Z.
(Explain why.) On the other hand, using (12.76) and µ1 µ2 g = ` we see that the image of
Λ under ψ is `Z. Therefore, using the exercise result (10.24) ψ descends to a map

ψ̄ : Z2 /Λ → Z/`Z (12.81)

Now note from (12.78) that !


−µ1
modΛ (12.82)
µ2

is in the kernel of ψ̄, and moreover it generates a cyclic subgroup of order g in Z2 /Λ. By
counting, this cyclic subgroup must be the entire kernel of ψ̄. Therefore we have an exact
sequence
0 → Zg → Z2 /Λ → Z` → 0 (12.83)

**************************
AND THERE IS A MAP TO y 0 AND TOGETHER THESE GIVE ISOMORPHISM
TO Z/`Z ⊕ Z/gZ.
**************************

– 262 –
This concludes our second proof. ♠ ♣Should really add
a figure to illustrate
this. Unfortunately
the first really
nontrivial case is
m1 = 2 · 3 and
m2 = 2 · 5 so
` = 30. ♣

Exercise
Using the Kronecker theorem show that if a finite Abelian group G is not a cyclic
group then there is a nontrivial divisor n of |G| so that g n = 1 for all g ∈ G.

12.3.1 The Chinese Remainder Theorem


In fact, there is an important generalization of this statement known as the Chinese re-
mainder theorem:

Theorem Suppose m1 , . . . , mr are pairwise relatively prime positive integers, (i.e.


(mi , mj ) = 1 for all i 6= j) then

(Z/m1 Z) ⊕ (Z/m2 Z) · · · ⊕ (Z/mr Z) ∼


= Z/M Z (12.84)

where M = m1 m2 · · · mr .

Proof :
The fastest proof makes use of the previous result and induction on r.
A second proof offers some additional insight into solving simultaneous congruences:
We construct a homomorphism

ψ : Z → (Z/m1 Z) ⊕ (Z/m2 Z) · · · ⊕ (Z/mr Z) (12.85)

by
ψ(x) = (xmodm1 , xmodm2 , . . . , xmodmr ) (12.86)

We first claim that ψ(x) is onto. That is, for any values a1 , . . . , ar we can solve the
simultaneous congruences:

x = a1 modm1
x = a2 modm2
.. .. (12.87)
. .
x = ar modmr

for some common value x ∈ Z.


130
Answer : We have xv1 ∈ Λ iff xν2 = 0mod(gµ1 ) and xν1 = 0mod(gµ2 ). Multiply these equations by µ2
and µ1 , respectively, and add them. Find that x = 0mod`.

– 263 –
Q
To prove this note that m̂i := M/mi = j6=i mj is relatively prime to mi (by the
hypothesis of the theorem). Therefore there are integers xi , yi such that

xi mi + yi m̂i = 1 (12.88)

Let gi = yi m̂i . Note that

gi = δi,j modmj ∀1 ≤ i, j ≤ r (12.89)

Therefore if we set
r
X
x= ai gi (12.90)
i=1

then x is a desired solution to (12.87) and hence is a preimage under ψ.


On the other hand, the kernel of ψ is clearly M Z. Therefore:

0 → M Z → Z → (Z/m1 Z) ⊕ (Z/m2 Z) · · · ⊕ (Z/mr Z) → 0 (12.91)

and hence the desired isomorphism follows. ♠

Remarks

1. Equation (12.67) is used implicitly all the time in physics, whenever we have two
degrees of freedom with different but commensurable frequencies. Indeed, it is used
all the time in everyday life. As a simple example, suppose you do X every other day.
You will then do X on Mondays every other week, i.e., every 14 days, because 2 and 7
are relatively prime. More generally, consider a system with a discrete configuration
space Z/pZ thought of as the multiplicative group of pth roots of 1. Suppose the time
evolution for ∆t = 1 is ωpr → ωpr+1 where ωp is a primitive pth root of 1. The basic
period is T = p. Now, if we have two oscillators of periods p, q, the configuration
space is Zp × Zq . The basic period of this system is - obviously - the least common
multiple of p and q. That is the essential content of (??).

2. Our second proof shows that in fact equation (12.84) is a statement of an isomorphism
of rings.

3. One might wonder how the theorem got this strange name. (Why don’t we refer
to the “Swiss-German theory of relativity?”) The theorem is attributed (see, e.g.
Wikipedia) to Sun-tzu Suan-ching in the 3rd century A.D. (He should not be con-
fused with Sun Tzu who lived in the earlier Spring and Autumn period and wrote The
Art of War.) For an interesting historical commentary see 131 which documents the
historical development in India and China up to the definitive treatments by Euler,
Lagrange, and Gauss who were probably unaware of previous developments hundreds
of years earlier. The original motivation was apparently related to construction of
calendars, and this is certainly mentioned by Gauss in his renowned book Disqui-
sitiones Arithmeticae. The Chinese calendar is based on both the lunar and solar
cycles. Roughly speaking, one starts the new year based on both the winter solstice

– 264 –
and the new moon. Thus, to find periods of time in this calendar one needs to ♣Some students
with Chinese
solve simultaneous congruences. I suspect the name “Chinese Remainder Theorem” background say this
is wrong. Check it
is an invention of 19th century mathematicians. Hardy & Wright (1938) do not call out. ♣

it that, but do recognize Sun Tzu.

Exercise Counting your troops


Suppose that you are a general and you need to know how many troops you have from
a cohort of several hundred. Time is too short to take attendance.
So, you have your troops line up in rows of 5. You observe that there are 3 left over.
Then you have your troops line up in rows of 11. Now there are 2 left over. Finally, you
have your troops line up in rows of 13, and there is only one left over.
How many troops are there? 132

Exercise
a.) Show that the Chinese Remainder theorem is false if the mi are not pairwise
relatively prime.
b.) Show that the obstruction to finding a solution x to x = ai modmi is given by the
reductions (ai − aj )mod(mi , mj ) over all pairs i 6= j. That is, a solution exists iff all of
these vanish.

13. The Group Of Automorphisms

Recall that an automorphism of a group G is an isomorphism µ : G → G, i.e. an isomor-


phism of G onto itself.
One easily checks that the composition of two automorphisms µ1 , µ2 is an automor-
phism. The identity map is an automorphism, and every automorphism is invertible. In
this way, the set of automorphisms, Aut(G), is itself a group with group law given by
composition.
Given a group G there are God-given automorphisms given by conjugation. That is,
if a ∈ G then

I(a) : g → aga−1 (13.1)


131
Kang Sheng Shen, “Historical development of the Chinese remainder theorem,” Arch. Hist. Exact Sci.
38 (1988), no. 4, 285305.
132
Apply the Chinese remainder theorem with m1 = 5, m2 = 11, m3 = 13. Then M = 715, m̂1 = 143,
m̂2 = 65 and m̂3 = 55. Using the Euclidean algorithm you find convenient lifts to the integers g1 = 286,
g2 = −65 and g3 = −220. Then the number of troops is 3 × 286 − 2 × 65 − 1 × 220 = 508mod715. Therefore
there are 508 soldiers.

– 265 –
defines an automorphism of G. Indeed I(a) ◦ I(b) = I(ab) and hence I : G → Aut(G)
is a homomorphism. The subgroup Inn(G) of such automorphisms is called the group of
inner automorphisms. Note that if a ∈ Z(G) then I(a) is trivial, and conversely. Thus we
have:

Inn(G) ∼
= G/Z(G). (13.2)

Moreover, Inn(G) is a normal subgroup of Aut(G), since for any automorphism φ ∈


Aut(G):
φ ◦ I(a) ◦ φ−1 = I(φ(a)). (13.3)

Therefore we have another group

Out(G) := Aut(G)/Inn(G) (13.4)

known as the group of “outer automorphisms.” Thus

1 → Inn(G) → Aut(G) → Out(G) → 1 (13.5)

Note we can also write and exact sequence of length four:

1 → Z(G) → G → Aut(G) → Out(G) → 1 (13.6)

Remarks

1. In practice one often reads or hears the statement that an element ϕ ∈ Aut(G)
is an “outer automorphism.” What this means is that it projects to a nontrivial
element of Out(G). However, strictly speaking this is an abuse of terminology and
an outer automorphism is in the quotient group (13.4). These notes might sometimes
perpetrate this abuse of terminology.

2. Note that for any abelian group G all nontrivial automorphisms are outer automor-
phisms.

Example 13.1: Consider Aut(Z3 ). This group is Abelian so all automorphisms are outer.
Thinking of it multiplicatively, the only nontrivial choice is ω → ω −1 . If we think of
A3 ∼
= Aut(Z3 ) then we are taking

(123) → (132) (13.7)

So: Aut(Z3 ) ∼
= Z2 .
Example 13.2: Consider Aut(Z4 ). Think of Z4 as the group of fourth roots of unity,
generated by ω = exp[iπ/2] = i. A generator must go to a generator, so there is only
one possible nontrivial automorphism: φ : ω → ω 3 . Note that ω → ω 2 is a nontrivial
homomorphism of Z4 → Z4 , but it is not an automorphism. Thus Aut(Z4 ) ∼
= Z2 .

– 266 –
Example 13.3: Consider Aut(Z5 ). Think of Z5 as the group of fifth roots of unity,
generated by ω = exp[2πi/5]. Now there are several automorphisms: φ2 defined by its
action on the generator ω → ω 2 . Similarly, we can define φ3 , by ω → ω 3 and φ4 , by
ω → ω 4 . Letting φ1 denote the identity we have

φ22 = φ4 φ32 = φ3 φ42 = φ24 = φ1 = 1 (13.8)

So Aut(Z5 ) ∼
= Z4 . The explicit isomorphism is

φ2 → 1̄
φ4 → 2̄ (13.9)
φ3 → 3̄

Example 13.4: Consider Aut(ZN ), and let us think of ZN multiplicatively as the group
of N th roots of 1. An automorphism φ of ZN must send ω 7→ ω r for some r. On the
other hand, ω r must also be a generator of ZN . Automorphisms must take generators to
generators. Hence r is relatively prime to N . This is true iff there is an s with

rs = 1modN (13.10)

Thus, Aut(ZN ) is the group of transformations ω → ω r where r admits a solution to


rs = 1modN . We will examine this interesting group in a little more detail in §13.1
below.
Example 13.5: Automorphisms Of The Symmetric Group Sn : There are no outer auto-
morphisms of Sn so
Aut(Sn ) ∼
= Inn(Sn ) ∼
= Sn , n 6= 2, 6 (13.11)
Note the exception: n = 2, 6. Note the striking contrast from an abelian group, all of
whose automorphisms are outer.
This is not difficult to prove: Note that an automorphism φ of Sn must take conju-
gacy classes to conjugacy classes. Therefore we focus on how it acts on transpositions.
These are involutions, and involutions must map to involutions so the conjugacy class of
transpositions must map to a conjugacy class of the form (1)k (2)` with k + 2` = n. We
will show below that, just based on the order of the conjugacy class, φ must map trans-
positions to transpositions. We claim that any automorphism that maps transpositions to
transpositions must be inner. Let us say that

φ((ab)) = (xy) φ((ac)) = (zw) (13.12)

where a, b, c are all distinct. We claim that x, y, z, w must comprise precisely three distinct
letters. We surely can’t have (xy) = (zw) because φ is 1-1, and we also can’t have (xy)
and (zw) commuting because the group commutator of (ab) and (ac) is (abc). Therefore
we can write
φ((ab)) = (xy) φ((ac)) = (xz) (13.13)
Therefore, we have defined a permutation a → x and φ is the inner automorphism associ-
ated with this permutation.

– 267 –
Now let us consider the size of the conjugacy classes. This was computed in exercise
*** above. The size of the conjugacy class of transpositions is of course
 
n n!
= (13.14)
2 (n − 2)!2!
The size of a conjugacy class of the form (1)k (2)` with k + 2` = n is
n!
(13.15)
(n − 2`)!`!2`
Setting these equal results in the identity
(n − 2)!
= `!2`−1 n ≥ 2` (13.16)
(n − 2`)!
For a fixed ` the LHS is a polynomial in n which is growing for n ≥ 2` and therefore
bounded below by (2` − 2)!. Therefore we consider whether there can be a solution with
n = 2`:
(2` − 2)! = `!2`−1 (13.17)
For ` = 3, corresponding to n = 6, there is a solution, but for ` > 3 we have (2` − 2)! >
`!2`−1 . The peculiar exception n = 6 is related to the symmetries of the icosahedron. For
more information see
1.http://en.wikipedia.org/wiki/Automorphisms of the symmetric and alternating groups
2. http://www.jstor.org/pss/2321657
3. I.E. Segal, “The automorphisms of the symmetric group,” Bulletin of the American
Mathematical Society 46(1940) 565.
Example 13.6: Automorphisms Of Alternating Groups. For the group An ⊂ Sn there
is an automorphism which is not obviously inner: Conjugation by any odd permutation.
Recall that Out(G) = Aut(G)/Inn(G) is a quotient group so conjugation by any odd
permutation represents the same element in Out(G). If we consider A3 ⊂ S3 then
(12)(123)(12)−1 = (132) (13.18)
is indeed a nontrivial automorphism of A3 and since A3 is abelian this automorphism must
be an outer automorphism. In general conjugation by an odd permutation defines an outer
automorphism of An . For example suppose conjugation by (12) were inner. Then there
would be an even permutation a so that conjugation by a · (12) centralizes every h ∈ An .
But a · (12) together with An generates all of Sn and then a · (12) would have to be in
the center of Sn , a contradiction. Thus, the outer automorphism group of An contains a
nontrivial involution. Again for n = 6 there is an exceptional outer automorphism.
The above example nicely illustrates a general idea: If N / G is a normal subgroup of
G and g ∈/ H then conjugation by g defines an automorphism H → H which is, in general,
not an inner automorphism.
Example 13.7: Consider G = GL(n, C). Then A → A∗ is an outer automorphism: That
is, there is no invertible complex matrix S ∈ GL(n, C) such that, for every invertible matrix
A ∈ GL(n, C) we have
A∗ = SAS −1 (13.19)

– 268 –
Exercise Outer Automorphisms Of Some Matrix Groups
a.) Prove (13.19). 133
b.) Consider maps of GL(n, C) given by A → Atr , A → A−1 and A → Atr,−1 . Which
of these are automorphisms? Which of these are outer automorphisms?
c.) Consider G = SU (2). Is A → A∗ an outer automorphism? 134
d.) Consider the automorphism of G = SO(2)

R(φ) → R(−φ) (13.20)

Is this inner or outer?

Exercise Automorphisms of Z
Show that Aut(Z) ∼
= Z2 . 135

Exercise
Although Z2 does not have any automorphisms the product group Z2 ⊕ Z2 certainly
does.
a.) Show that an automorphism of Z2 ⊕ Z2 must be of the form

φ(x1 , x2 ) = (a1 x1 + a2 x2 , a3 x3 + a4 x4 ) (13.21)

where we are writing the group additively, and


!
a1 a2
∈ GL(2, Z2 ) (13.22)
a3 a4

b.) Show that GL(2, Z2 ) ∼


= S3 . 136

133
Hint: Consider the invertible matrix A = i1n×n .
134
Answer : No! Note that (iσ k )∗ = −i(σ k )∗ = (iσ 2 )(iσ k )(iσ 2 )−1 . But iσ 2 ∈ SU (2) and every SU (2)
matrix is a real linear combination of 1 and iσ k . This has an important implication for the representation
theory of SU (2): Every irreducible representation is either real or “pseudoreal” (quaternionic).
135
Answer : The most general homomorphism Z → Z is the map n 7→ an for some integer a. But for an
automorphism a must be mutliplicatively invertible in the integers. Therefore a is +1 or −1.
136
Hint: Consider what the group does ! to the! three nontrivial
! elements (1, 0), (0, 1), and (1, 1).! The
0 1 1 1 1 0 1 1
three transpositions correspond to , , and the two elements of order 3 are and
1 0 0 1 1 1 1 0
!
0 1
.
1 1

– 269 –
c.) Now describe Aut(Z4 × Z4 ). 137

Exercise Automorphisms of ZN p
(Warning: This is hard and uses some other ideas from algebra.)
Let p be prime. Describe the automorphisms of ZN p , and show that the group has
order 138
N −1
|Aut(ZN N N N 2 N
p )| = (p − 1)(p − p)(p − p ) · · · (p − p ) (13.23)

Exercise Automorphisms Of The Quaternion Group ♣This exercise


should go later,
Show that the group of automorphisms of the quaternion group Q = {±1, ±i, ±j, ±k} perhaps in the
section on
is isomorphic to S4 . 139 extensions. Perhaps
in the chapter on
(This assumes you know what the quaternions are. See below for various descriptions symmetries of
regular objects. ♣
of the quaternion group Q.)

Exercise Isomorphisms between two different groups


Let G1 , G2 be two groups which are isomorphic, but not presented as the same set
with the same multiplication table. Let Isom(G1 , G2 ) be the set of all isomorphisms from
G1 → G 2 .
Show that
a.) Any two isomorphisms Ψ, Ψ0 ∈ Isom(G1 , G2 ) are related by Ψ0 = Ψ ◦ φ where
φ ∈ Aut(G1 ).
137
Answer : This group has a homomorphism onto Aut(Z2 × Z2 ) with a kernel isomorphic to Z42 .
138
Answer : The group is the group of N × N invertible matrices over the ring Zp . These are in one-one
correspondence with the possible bases for the vector space ZN p . How many ordered bases are there? Note
that any nonzero vector can serve as the first basis vector, and there are pN − 1 nonzero vectors. Choose
one and call it e1 . Now, e2 can be any vector not in the linear span of e1 . But the linear span of e1 is a
one-dimensional subspace of p elements. These are all excluded so e2 must be chosen from a set of pN − p
vectors. Make a choice of e2 . Then e3 must be chosen from a vector not in the span of e1 , e2 . The span of
e1 , e2 consists of p2 vectors so there are pN − p2 choices for e3 , and so on.
139
Answer : First, the group of inner automorphisms is Q/Z(Q) ∼ = Z2 × Z2 . The three nontrivial elements
are given by conjugation by i, j, k. Now, any automorphism must permute the three normal subgroups
generated by i, j, k, and automorphisms leading to nontrivial permutations of normal subgroups must be
outer. So the outer automorphism group must be a subgroup of S3 . Now, in fact, one can construct such
outer automorphisms. In fact, it suffices to say what the image of i and j are since these generate the
whole group. Thus, the automorphism group is an extension of S3 by Z2 × Z2 and one can then map this
isomorphically to S4 .

– 270 –
b.) Any two isomorphisms Ψ, Ψ0 ∈ Isom(G1 , G2 ) are related by Ψ0 = φ ◦ Ψ where
φ ∈ Aut(G2 ).
The set Isom(G1 , G2 ) with G1 , G2 not equal but isomorphic is a good example of what
is called a torsor. A torsor for a group G is a set X with a free transitive action.

13.1 The group of units in ZN


We have seen that Z/N Z is a group inherited from the additive law on Z. For an integer
n ∈ Z denote its image in Z/N Z by n̄. With this notation the group law on Z/N Z is

n̄1 + n̄2 = n1 + n2 , (13.24)

and 0̄ is the unit element.


However, note that since

(n1 + N `1 )(n2 + N `2 ) = n1 n2 + N `00 (13.25)

we do have a well-defined operation on Z/N Z inherited from multiplcation in Z:

n̄1 · n̄2 := n1 · n2 . (13.26)

In general, even if we omit 0̄, Z/N Z is not a group with respect to the multiplication
law (find a counterexample). Nevertheless, Z/N Z with +, × is an interesting object which
is an example of something called a ring. See the next chapter for a general definition of a
ring.
Let us define the group of units in the ring Z/N Z:

(Z/N Z)∗ := {m̄ : 1 ≤ m ≤ N − 1, gcd(m, N ) = 1} (13.27)

where (m, N ) is the greatest common divisor of m and N . We will also denote this group
as Z∗N .
Then, (Z/N Z)∗ is a group with the law (13.26) ! Clearly the multiplication is closed
and 1̄ is the unit. The existence of multiplicative inverses follows from (12.8).
Moreover, as we have seen above, we can identify

Aut(Z/N Z) ∼
= (Z/N Z)∗ (13.28)

The isomorphism is that a ∈ (Z/N Z)∗ is mapped to the transformation

ψa : n modN → an modN (13.29)

if we think of Z/N Z additively or


ψa : ω → ω a (13.30)

if we think of it multiplicatively. Note that ψa1 ◦ ψa2 = ψa1 a2 .

– 271 –
The order of the group (Z/N Z)∗ is denoted φ(N ) and is called the Euler φ-function or
Euler’s totient function. 140 One can check that

φ(2) = 1
φ(3) = 2 (13.31)
φ(4) = 2

What can we say about the structure of Z∗N ? Now, in general it is not true that
Aut(G1 × G2 ) and Aut(G1 ) × Aut(G2 ) are isomorphic. Counterexamples abound. For
example Aut(Z) ∼ = Z2 but Aut(Z ⊕ Z) ∼ = GL(2, Z). Nevertheless, it actually is true that

Aut(Zn × Zm ) = Aut(Zn ) × Aut(Zm ) when n and m are relatively prime. To prove this,
let v1 be a generator of Zn and v2 a generator of Zm and let us write our Abelian group
additively. The general endomorphism of Zn ⊕ Zm is of the form

v1 → αv1 + βv2
(13.32)
v2 → γv1 + δv2

with α, β, γ, δ ∈ Z. Now impose the conditions nv1 = 0 and mv2 = 0 and the fact that n̄ is
multiplicatively invertible in Zm and m̄ is multiplicatively invertible in Zn to learn that in
fact an endomorphism must have β = 0mod m and γ = 0mod n. Therefore βv2 = 0 and
γv1 = 0. Therefore, an automorphism of Zn ⊕ Zm is determined by v1 → αv1 with ᾱ ∈ Z∗n
and v2 → δv2 with δ̄ ∈ Z∗m and hence Aut(Zn ⊕ Zm ) ∼ = Aut(Zn ) × Aut(Zm ) when n and m
are relatively prime. (The corresponding statement is absolutely false when they are not
relatively prime.) So we have:
Z∗nm ∼
= Z∗n × Z∗m (13.33)
In particular, φ is a multiplicative function: φ(nm) = φ(n)φ(m) if (n, m) = 1. Therefore,
if N = pe11 · · · perr is the decomposition of N into distinct prime powers then

(Z/N Z)∗ ∼
= (Z/pe11 Z)∗ × · · · (Z/perr Z)∗ (13.34)

Moreover, (Z/pe Z)∗ is of order φ(pe ) = pe − pe−1 , as is easily shown 141 and hence
1
(pei i − piei −1 ) = N
Y Y
φ(N ) = (1 − ) (13.35)
p
i p|N

Remark: For later reference in our discussion of cryptography note one consequence
of this: If we choose, randomly - i.e. with uniform probability density - a number between
1 and N the probability that it will be relatively prime to N is
φ(N ) Y 1
= (1 − ) (13.36)
N p
p|N

140
Do not confuse φ(N ) with the φa above!
141
Proof: The numbers between 1 and pe which have gcd larger than one must be of the form px where
1 ≤ x ≤ pe−1 . So the rest are relatively prime.

– 272 –
This means that, if N is huge and a product of just a few primes, then a randomly chosen
number will almost certainly be relatively prime to N .

In elementary number theory textbooks it is shown that if p is an odd prime then


(Z/pe Z)∗ is a cyclic group. ♣Finish proof for
e > 1. ♣
To prove this let us begin with (Z/pZ)∗ . (This proof uses some ideas from the algebra
of fields.) Suppose this group were not cyclic. Then there would be some n which is a
nontrivial divisor of the order, φ(p) = p − 1 such that xn = 1 for all x ∈ (Z/pZ)∗ . That
would imply that in the field Fp the equation xn − 1 would have p − 1 distinct roots. On
the other hand, the equation xn − 1 can have at most n roots, and that is a contradiction.
We conclude that, in fact,(Z/pZ)∗ must be cyclic.
Of course, all primes are odd, and two is the oddest prime of all. If p = 2 the result is
a little different and we have:
(Z/4Z)∗ ∼= {±1} (13.37)

is cyclic but

(Z/2e Z)∗ = {(−1)a 5b |a = 0, 1, 0 ≤ b < 2e−2 } ∼


= (Z/2Z) × (Z/2e−2 Z) (13.38)

when e ≥ 3.
In fact, more generally, it turns out that (Z/nZ)∗ is cyclic iff n ∈ {1, 2, 4, pk , 2pk } where
p runs over odd primes and k > 0.
Note that if we take a product of two distinct odd prime powers then

(Z/(pk11 pk22 Z)∗ ∼


= (Z/pk11 Z)∗ × (Z/pk22 Z)∗ (13.39)

But φ(pk11 ) and φ(pk22 ) are both even, being divisible by p1 − 1 and p2 − 1, respectively, and
hence are not relatively prime, and hence (Z/(pk11 pk22 Z)∗ is not cyclic.

Examples

1. (Z/7Z)∗ = {1, 2, 3, 4, 5, 6}mod7 ∼


= Z6 . Note that 3 and 5 are generators:

31 = 3, 32 = 2, 33 = 6, 34 = 4, 35 = 5, 36 = 1 mod7 (13.40)

51 = 5, 52 = 4, 53 = 6, 54 = 2, 55 = 3, 56 = 1 mod7 (13.41)

However, 2 = 32 mod7 is not a generator, even though it is prime. Rather, it generates


an index 2 subgroup ∼ = Z3 , as does 4, while 6 generates an index 3 subgroup ∼
= Z2 . Do
not confuse this isomorphic copy of Z6 with the additive presentation Z6 ∼ = Z/6Z =
{0, 1, 2, 3, 4, 5} with the additive law. Then 1 and 5 are generators, but not 2, 3, 4.

2. (Z/9Z)∗ = {1, 2, 4, 5, 7, 8}mod9 ∼


= Z6 . It is a cyclic group generated by 2 and 25 =
5mod9, but it is not generated by 22 = 4, 23 = 8 or 24 = 7mod9, because 2, 3, 4 are
not relatively prime to 6.

– 273 –
800

600

400

200

100 200 300 400 500

Figure 33: A plot of the residues of 2x δx modulo N = 2 · 3 · 5 · 7 · 11 = 2310, for 1 ≤ x ≤ 500. Here
δx = 0 if gcd(x, N ) > 1 so that we only see the values in (Z/N Z)∗ . Notice the apparently random
way in which the value jumps as we increase x.

3. (Z/8Z)∗ = {1, 3, 5, 7} ∼
= Z2 × Z2 . Note that 32 = 52 = 72 = 1mod8 and 3 · 5 = 7mod8,
so we can take 3 and 5 to be the generators of the two Z2 subgroups.

– 274 –
4. (Z/15Z)∗ ∼
= Z2 × Z4 is not cyclic.

Remarks

1. When (Z/nZ)∗ is cyclic a generator is called a primitive root modulo n, and is not
to be confused with a primitive nth root of one. It is trivial to find examples of the
latter and highly nontrivial to find examples of the former.

2. The values of f (x) = ax modN for (a, N ) = 1 appear to jump about randomly as a
function of x, as shown in Figure 33. Therefore, finding the period of this function,
that is, the smallest positive integer r so that f (x + r) = f (x) is not easy. This is
significant because of the next remark.

3. Factoring Integers. Suppose N is a positive integer and a is a positive integer so that


(a, N ) = 1, and the order, denoted r, of ā ∈ (Z/N Z)∗ is even and finally suppose
that b := ar/2 6= ±1modN . Note that b2 = 1modN so b̄ is a nontrivial squareroot of
1̄ ∈ (Z/N Z)∗ . Then we claim that d± := gcd(b ± 1, N ) are in fact nontrivial factors
of N . To see this we need to rule out d± = 1 and d± = N , the trivial factors of N .
If we had d± = N then N would divide b ± 1 but that would imply b = ∓1modN ,
contrary to assumption. Now, suppose d± = 1, then by Bezout’s theorem there would
be integers α± , β± so that
(b ± 1)α± + N β± = 1 (13.42)
But then multiply the equation by b ∓ 1 to get

(b2 − 1)α± + N β± (b ∓ 1) = b ∓ 1 (13.43)

But now, N divides the LHS so b ∓ 1 = 0modN which implies b = ±1modN , again
contrary to assumption. Thus, d± are nontrivial divisors of N .
To give a concrete example, take N = 3 · 5 · 7 = 105, so φ(N ) = 48. Then the
period of f (x) = 2x is r = 12, and b = 212/2 = 64. Well gcd(64 + 1, 105) = 5 and
gcd(64 − 1, 105) = 21 are both divisors of 105. In fact 105 = 5 · 21.

4. Artin’s Conjecture: Finding a generator is not always easy, and it is related to some
deep conjectures in number theory. For example, the Artin conjecture on primitive
roots states that for any positive integer a which is not a perfect square there are an
infinite number of primes so that ā is a generator of the cyclic group (Z/pZ)∗ . In
fact, if a is not a power of another integer, and the square-free part of a is not 1mod4
then Artin predicts the density of primes for which a is a generator to be
 
Y 1
1− = 0.37.... (13.44)
p(p − 1)
Artin primes

According to the Wikipedia page, there is not a single number a for which the con-
jecture is known to be true. For example, the primes p < 500 for which a = 2 is a

– 275 –
generator of (Z/pZ)∗ is

{3, 5, 11, 13, 19, 29, 37, 53, 59, 61, 67, 83,101, 107, 131, 139, 149, 163, 173, 179, 181,
197, 211, 227, 269, 293, 317, 347, 349, 373, 379,389, 419, 421, 443, 461, 467, 491}
(13.45)

5. A good reference for this material is Ireland and Rosen, A Classical Introduction to
Modern Number Theory Springer GTM

Exercise Euler’s theorem and Fermat’s little theorem


a.) Let G be a finite group of order n. Show that if g ∈ G then g n = e where e is the
identity element.
b.) Prove Euler’s theorem: For all integers a relatively prime to N , g.c.d(a, N ) = 1,

aφ(N ) = 1modN (13.46)

Note that a special case of this is Fermat’s little theorem: If a is an integer and p is
prime then
ap = amodp (13.47)
Remark: This theorem has important practical applications in prime testing. If we
want to test whether an odd integer n is prime we can compute 2n modn. If the result
is 6= 2modn then we can be sure that n is not prime. Now 2n modn can be computed
much more quickly with a computer than the traditional test of seeing whether the primes

up to n divide n. If 2n modn is indeed = 2modn then we can suspect that n is prime.
Unfortunately, there are composite numbers which will masquerade as primes in this test.
They are called “base 2 pseudoprimes.” In fact, there are numbers n, known as Carmichael
numbers which satisfy an = amodn for all integers a. The good news is that they are rare.
The bad news is that there are infinitely many of them. According to Wikipedia the first
few Carmichael numbers are

561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, . . . , (13.48)

The first Carmichael number is 561 = 3 · 11 · 17 and Erdös proved that the number C(X)
of Carmichael numbers smaller than X is bounded by
 
κlogXlogloglogX
C(X) < Xexp − (13.49)
loglogX

where κ is a positive real number.

– 276 –
Exercise Periodic Functions
a.) Consider the function
f (x) = 2x mod N (13.50)
for an odd integer N . Show that this function is periodic f (x + r) = f (x) for a minimal
period r which divides φ(N ).
b.) Compute the period for N = 15, 21, 105. 142
c.) More generally, if (a, N ) = 1 show that f (x) = ax mod N is a periodic function.

Exercise How Many Primitive Roots Of n Are There?


Show that n has either zero or φ(φ(n)) different primitive roots.

13.2 Group theory and cryptography


Any invertible map f : Z/N Z → Z/N Z can be used to define a code. For example, if
N = 26 we may identify the elements in Z/26Z with the letters in the Latin alphabet:

a ↔ 0̄, b ↔ 1̄, c ↔ 2̄, . . . (13.51)

Exercise Caesar Shift


a.) Show that f (m) = (m − 3)mod26 defines a code. In fact, the above remark, and
this example in particular, is attributed to Julius Caesar. Using this decode the message:

ZOLP P QEBORY F ZLK! (13.52)

b.) Is f (m) = (3m)mod26 a valid code? By adding symbols or changing the alphabet
we can change the value of N above. Is f (m) = (3m)mod27 a valid code?

The RSA public key encryption system is a beautiful application of Euler’s theorem
and works as follows. The basic idea is that with numbers with thousands of digits it is
relatively easy to compute powers an modm and greatest common divisors, but it is very
difficult to factorize such numbers into their prime parts. For example, for a 1000 digit
number the brute force method of factorization requires that we sample up to

101000 = 10500 (13.53)
142
Answer : r = 4, 6, 12 divides φ(N ) = 8, 12, 48.

– 277 –
divisors. Bear in mind that our universe is about π × 107 × 13.79 × 109 =∼ 4 × 1017 seconds
old. 143 There are of course more efficient algorithms, but all the publicly known ones are
still far too slow.
Now, Alice wishes to receive and decode secret messages sent by any member of the
public. She chooses two large primes (thousands of digits long) pA , qA and computes nA :=
pA qA . These primes are to be kept secret. How does she find her secret thousand-digit
primes? She chooses a random thousand digit number and applies the Fermat primality
test. By the prime number theorem she need only make a few thousand attempts, and she
will find a prime. 144
Next, Alice computes φ(nA ) = (pA − 1)(qA − 1), and then she chooses a random
thousand-digit number dA such that gcd(dA , φ(nA )) = 1 and computes an inverse dA eA =
1modφ(nA ). All these steps are relatively fast and easy, because Euclid’s algorithm is very
fast. Thus there is some integer f so that

dA eA − f φ(nA ) = 1 (13.54)

That is, she solves the congruence x = 1modφ(nA ) and x = 0moddA , for the smallest
positive x and then computes eA = x/dA .
Finally, she publishes for the world to see the encoding key: {nA , eA }, but she keeps
the numbers pA , qA , φ(nA ), dA secret. This means that if anybody, say Bob, wants to send
Alice a secret message then he can do the following:
Bob converts his plaintext message into a number less than nA by writing a ↔ 01,
b ↔ 02, . . . , z ↔ 26. (Thus, when reading a message with an odd number of digits we
should add a 0 in front. If the message is long then it should be broken into pieces of
length smaller than nA .) Let Bob’s plaintext message thus converted be denoted m. It is
a positive integer smaller than nA .
Now to compute the ciphertext Bob looks up Alice’s numbers {nA , eA } on the public
site and uses these to compute the ciphertext:

c := meA modnA (13.55)

Bob sends the ciphertext c to Alice over the internet. Anyone can read it.
Then Alice can decode the message by computing

cdA modnA = meA dA modnA


= m1+f φ(nA ) modnA (13.56)
= mmodnA

Thus, to decode the message Alice just needs one piece of private information, namely the
integer dA .
143
There are π × 107 seconds in a year, to 0.3% accuracy.
144
The prime number theorem says that if π(x) is the number of primes between 1 and x then as x → ∞
we have π(x) ∼ logxx
. Equivalently, the nth prime is asymptotically like pn ∼ nlogn. This means that
the density of primes for large x is ∼ 1/logx, so if x ∼ 10n then the density is 1/n so if we work with
thousand-digit primes then after about one thousand random choices we will find a prime.

– 278 –
Now Eve, who has a reputation for making trouble, cannot decode the message without
knowing dA . Just knowing nA and eA but not the prime factorization nA = pA qA there is
no obvious way to find dA . The reason is that even though the number nA is public it is
hard to compute φ(nA ) without knowing the prime factorization of nA . Of course, if Eve
finds out about the prime factorization of nA then she can compute φ(nA ) immediately
and then quickly (using the Euclidean algorithm) invert eA to get dA . Thus, the security
of the method hinges on the inability of Eve to factor nA into primes.
In summary,

1. The intended receiver of the message, namely Alice in our discussion, knows

(pA , qA , nA = pA qA , φ(nA ) = (pA − 1)(qA − 1), eA , dA ). (13.57)

2. Alice publishes (nA , eA ). Anybody can look these up.

3. The sender of the message, namely Bob in our discussion, takes a secret message mB
and computes the ciphertext c = meBA modnA .

4. Alice can decode Bob’s message by computing mB = cdA modnA using her secret
knowledge of dA .

5. The attacker, namely Eve in our discussion, knows (nA , eA , c) but will have to work
to find dA or some other way of decoding the ciphertext.

Remarks

1. Note that the decoding will fail if m and nA have a common factor. However,
nA = pA qA and pA , qA are primes with thousands of digits. The probability that
Bob’s message is one of these is around 1 in 101000 .

Exercise Your turn to play Eve


Alice has published the key

(n = 661643, e = 325993) (13.58)

Bob sends her the ciphertext in four batches:

c1 = 541907 c2 = 153890 c3 = 59747 c4 = 640956 (13.59)

What is Bob’s message? 145

145
Factor the integer n = 541 ∗ 1223. Then you know p, q and hence φ(n) = 659880. Now take e and
compute d by using the Chinese Remainder theorem to compute x = 1modφ and x = 0mode. This
gives x = 735766201 = de and hence d = 2257. Now you can compute the message from the ciphertext
m = cd modn.

– 279 –
13.2.1 How To Break RSA: Period Finding
The attacker, Eve, can read the ciphertext cmodnA . That means the attacker can try to
compute the period of the function

f (x) := cx modnA (13.60)

Suppose (as is extremely likely when nA is a product of two large primes) that c is
relatively prime to nA . Then the cyclic group hci ∈ (Z/nA Z)∗ generated by c must coincide
with the cyclic group generated by the message mB and in particular they both have the
same period r, which divides φ(nA ). Suppose Eve figures out the period r. Since the
published value eA is relatively prime to φ(nA ) it will be relatively prime to r and therefore
there exists a new decoding method: Compute dE such that

eA dE = 1 mod r (13.61)

Then
cdE = meA dE modnA = m1+`r
B modnA = mB modnA (13.62)
decodes the message.
Thus, if the attacker can find the period of f (x) the message can be decoded.
Another way in which finding the period leads to rapid decoding is through explicit
factoring:
We saw in our discussion of Z∗N that, if one has an element ā ∈ Z∗N with even period
r and b̄ = ār/2 6= ±1 then d± = gcd(b ± 1, N ) are nontrivial factors of N . Suppose there
were a quick method to find the period r. Then we could quickly factor N as follows:
1. Choose a random integer a and using Euclid check that (a, N ) = 1. If N is a product
of two large primes you will only need to make a few choices of a before succeeding.
2. Compute the period r of the function f (x) = ax modN .
3. If r is odd go back and choose another a until you get one with r even.
4. Then check that b = ar/2 6= −1modN . Again this can be done quickly, thanks to
Euclid. If you get b = −1modN go back and choose another a, until you find one that
works. The point is that, with high probability, if you pick a at random you will succeed.
So you might have a try a few times, but not many.
So, the only real bottleneck in factoring N is computing the order r of ā in Z∗N . Equiv-
alently, this is computing the period of the function f (x) = ax mod N where (a, N ) = 1.
This is where the “quantum Fourier transform” and “phase estimation” come in. Quan-
tum computers give a way to compute this period in polynomial time in N , as opposed to
classical computers which take exponential time in N . We will come back to this.

13.2.2 Period Finding With Quantum Mechanics


♣This section is out
of place. Goes later
Here we sketch how quantum computation allows one to find the period of the function in the course ♣

f (x) = ax modN where (a, N ) = 1. This is just a sketch. A nice and clear and elementary
account (which we used heavily) can be found in D. Mermin’s book Quantum Computer
Science and more details and a more leisurely discussion can be found there.

– 280 –
Quantum computation is based on the action of certain unitary operators on a system
of n Qbits, that is, on a Hilbert space

Hn = (C2 )⊗n (13.63)

equipped with the standard inner product. For each factor C2 one chooses a basis {|0i, |1i},
which one should think of as, for example spin up/down eigenstates of an electron or photon
helicity polarization states. Then there is a natural basis for Hn :

|~xi := |xn−1 i ⊗ · · · ⊗ |x1 i ⊗ |x0 i (13.64)

Here, for each i, xi ∈ {0, 1}. One can identify the vector ~x ∈ Fn2 , the n-dimensional vector
space over the field F2 . In our discussion we will only use its Abelian group structure, so
one can also think of it as Z2 ⊕ · · · ⊕ Z2 with n summands. The basis of states (13.64) is
known as the computational basis or, the Classical basis. Now to each computational basis
vector we can assign an integer by its binary expansion:

N (~x) := 2n−1 xn−1 + · · · + 22 x2 + 21 x1 + x0 (13.65)

Now, let N := 2n . We can also define the Hilbert space L2 (ZN ) of functions on the
group ZN with the natural Haar measure. Of course Hn is isomorphic to L2 (ZN ) and the
isomorphism we choose to use is the one which identifies the computational basis vector
|~xi with the delta function supported at N (~x)mod N . We will denote the latter states as
|N (~x)iiN , where the subscript indicates which Abelian group ZN we are working with.
In quantum computation one works with a Hilbert space decomposed as

H = Hinput ⊗ Houtput (13.66)

The two factors have dimension N_in = 2^{n_in} and N_out = 2^{n_out}, respectively. The quantum gates are unitary operators and, moreover, under identification of \mathcal{H} as a tensor product of Qbits there should be a notion of “locality” in the sense that they only act nontrivially on “a few” adjacent factors. The locality reflects the spatial locality in some realization in the lab in terms of, say, spin systems. Moreover, we should only have to apply “a few” quantum gates in a useful circuit. With an arbitrary number of gates we can construct any unitary out of products of local ones to arbitrary accuracy. The above notions can be made precise, but that is beyond the scope of this section. ♣Still, it is essential to explain more about the notion of “local quantum gate” and quantum circuit and illustrate a few examples of simple gates. ♣
Now suppose we have a function

f : (Z_N)^* → (Z_N)^*     (13.67)

(such as f(x) = a^x mod N for (a, N) = 1). We would like to convert this to a map

f̌ : F_2^{n_in} → F_2^{n_out}     (13.68)

We now choose a fundamental domain, which is a subset of {1, 2, . . . , N − 1}, for (Z_N)^* with N < N_out and N < N_in (in fact we will eventually assume N ≪ N_out and N ≪ N_in), so that we can view elements of (Z/NZ)^* as elements of the set {1, 2, . . . , N − 1} which is, in turn, a subset of Z/N_in Z and Z/N_out Z. We use the function N(\vec{x}) above to define f̌ such that

f(N(\vec{x})) = N(f̌(\vec{x})) mod 2^{n_out}     (13.69)

This does not uniquely specify f̌ but the ambiguity will not affect the discussion. To read this equation, suppose you want to compute f̌(\vec{x}) for some \vec{x} ∈ F_2^{n_in}. Then you compute N(\vec{x}), which is a nonnegative integer between 0 and N_in − 1. Then you reduce it modulo N. If it is relatively prime to N you can compute f(N(\vec{x})) and consider the result as a number between 1 and 2^{n_out} − 1. The above equation then pins down f̌(\vec{x}). ♣And what if N(\vec{x}) is not relatively prime to N? Of course, we are thinking this is rare, but it can happen. What is the best way to extend the function? ♣
Using f̌ we can define a unitary operator U_f by its action on the computational basis:

U_f : |\vec{x}⟩ ⊗ |\vec{y}⟩ → |\vec{x}⟩ ⊗ |\vec{y} + f̌(\vec{x})⟩     (13.70)

where on the right-hand side addition is in the Abelian group (Z_2)^{n_out}. We will say that the function f is nice if U_f can be implemented with a “reasonable” number of local unitary gates. (Of course, one could make this notion much more precise.)
A good example of a local unitary operator on a Qbit is the Hadamard gate that acts by

H : |0⟩ → (1/√2)(|0⟩ + |1⟩),     H : |1⟩ → (1/√2)(|0⟩ − |1⟩)     (13.71)

This can be summarized by the formula

H|y⟩ = (1/√2) ∑_{x∈{0,1}} (−1)^{xy} |x⟩     (13.72)

So,

H^{⊗n} |\vec{y}⟩ = (1/√2)^n ∑_{\vec{x}∈F_2^n} (−1)^{\vec{x}·\vec{y}} |\vec{x}⟩     (13.73)

and in particular

H^{⊗n} |\vec{0}⟩ = (1/√2)^n ∑_{\vec{x}∈F_2^n} |\vec{x}⟩     (13.74)
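As a quick numerical illustration of (13.73)–(13.74) (a sketch of ours, not part of the text), one can build H^{⊗n} as a Kronecker product and check that it maps |\vec{0}⟩ to the uniform superposition:

```python
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)   # Hadamard gate on one Qbit

def hadamard_n(n):
    # H^{\otimes n} on the 2^n-dimensional computational basis
    U = np.array([[1.0]])
    for _ in range(n):
        U = np.kron(U, H)
    return U

n = 3
psi0 = np.zeros(2 ** n)
psi0[0] = 1.0                                           # the state |0...0>
print(hadamard_n(n) @ psi0)                             # all amplitudes equal 1/sqrt(2^n), cf. (13.74)
```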

Therefore (recall that double brackets |j⟩⟩ refer to the position basis in L^2(Z_N)) we have

U_f (H^{⊗n_in} ⊗ 1) : |0⟩_in ⊗ |0⟩_out → (1/√2)^{n_in} ∑_{j∈Z_{N_in}} |j⟩⟩_{N_in} ⊗ |f(j)⟩⟩_{N_out}     (13.75)

Now we wish to apply this to the function f(x) = a^x mod N, where (a, N) = 1 and N is the number we would like to factorize. So, in particular, we don't want N = 2^n for some n. Nobody will be impressed if you can factor a power of two! Rather, we identify the group Z_N^*, as a set, with a subset of the integers {1, . . . , N − 1}, so that it can be considered to be a subset of the natural fundamental domain {0, . . . , N_in − 1} for Z_{N_in}, and similarly for Z_{N_out}. Then to compute f we compute a^x, take the residue modulo N to get an integer in the fundamental domain, and then consider that number modulo N_out. Hence, we should choose N_out = 2^{n_out} to be some integer larger than N. A key claim, explained in textbooks on quantum information theory, is that such a function f is nice. That is, it makes sense to compute it with a quantum circuit.
So, we conclude that a suitable quantum circuit can implement:

U_f (H^{⊗n_in} ⊗ 1) : |0⟩_{N_in} ⊗ |0⟩_{N_out} → (1/√2)^{n_in} ∑_{k∈Z_{N_in}} |k⟩⟩_{N_in} ⊗ |a^k⟩⟩_{N_out}
    = (1/√2)^{n_in} ∑'_{f_0∈Z_N^*} ( |j_0⟩⟩_{N_in} + |j_0 + r⟩⟩_{N_in} + |j_0 + 2r⟩⟩_{N_in} + · · · ) ⊗ |f_0⟩⟩_{N_out}     (13.76)

In the second line we are considering Z_N^* as a subset of Z_{N_out} as explained above, and the prime on the sum means that we are just summing over the values that are in the image of f(x) = a^x mod N. This will be all the values in Z_N^* if a is a generator of Z_N^* but in general might be smaller. Also, j_0 is some solution of f_0 = a^{j_0} mod N, and r is the period of f(x), that is, the smallest positive integer so that f(x + r) = f(x) for all x. We can choose j_0 so that 0 ≤ j_0 < r and write the RHS of (13.76) as
Ψ = (1/√2)^{n_in} ∑_{0≤j_0<r} ( ∑_{s=0}^{O−1} |j_0 + sr⟩⟩_{N_in} ) ⊗ |a^{j_0}⟩⟩_{N_out}     (13.77)

Here O is the smallest integer such that j_0 + O r ≥ N_in. So

O = ⌈(N_in − j_0)/r⌉     (13.78)
In the applications we have in mind Nin and r are typically very large numbers so that
this is a (very weak) function of j0 . At the end of this section we will use the observation
that in this case, rO/Nin is, to very good accuracy, just equal to 1.
Now we measure the output system and get some result, say, f_0 = a^{k_0} mod N. Applying the usual Born rule we get the state for the input system

P_{f_0}(Ψ) = (1/√O) ∑_{s=0}^{O−1} |k_0 + sr⟩⟩_{N_in}     (13.79)

It is some kind of plane wave state in L2 (ZN ), so measuring position will give no useful
information on r. Of course, we should therefore go to the Fourier dual basis to learn about
the period. In terms of the position basis of L^2(Z_N) we can apply V_{FT} to get:

V_{FT} P_{f_0} Ψ = (1/√(N_in O)) ∑_{p∈Z_{N_in}} ∑_{s=0}^{O−1} e^{2πi(k_0+sr)p/N_in} |p⟩⟩_{N_in}     (13.80)

Note that this Fourier transform is quite nontrivial and nontransparent in the computational basis because of the nontrivial isomorphism between \mathcal{H}_n and L^2(Z_N) with N = 2^n. Nevertheless, and this is nontrivial and part of the magic of Shor's algorithm, the Fourier transform operator V_{FT} can be implemented nicely with quantum gates in the computational basis. Again, the textbooks on quantum information theory give an explicit construction of V_{FT} as a quantum circuit in the computational basis. It is exactly at this point that the exponential speed-up of the period finding takes place:

1. Classical Fourier Transform: N^2 = 2^{2n} operations. We learn every Fourier coefficient.

2. Fast Fourier Transform: nN = n 2^n operations. We learn every Fourier coefficient.

3. Quantum Fourier Transform: n^2 quantum gates. We only learn about correlations of the output state.

Now to find the period we make a measurement of the amplitudes for the various Fourier components |p⟩⟩_{N_in}. (Here we are using the isomorphism between Z_{N_in} and its unitary dual.) The probability to measure p is

Prob(p) = (1/(N_in O)) | ∑_{s=0}^{O−1} ( e^{2πi p r/N_in} )^s |^2 = (1/(N_in O)) sin^2( π p O r/N_in ) / sin^2( π p r/N_in )     (13.81)

The basic idea is that the probability, as a function of p, will be peaked near values of p
from which we can deduce the crucial number r, but we need to be a bit careful at this
point.
Let us ask what is the probability that we will measure a value p of the form

p_j = j (N_in/r) + δ_j,     |δ_j| ≤ 1/2     (13.82)

If pj is of this form with any value j = 1, . . . , r−1 then we can extract r. Thus, substituting
such a value for pj into the formula for the probability we have
 
(1/(N_in O)) sin^2( π δ_j O r/N_in ) / sin^2( π δ_j r/N_in )     (13.83)

Now recall that Or/N_in is equal to 1 to excellent accuracy. Suppose we also choose a number of Qbits so that

N_in ≫ N > r     (13.84)

Then the argument of the sine in the denominator is extremely small and we can replace sin(x) by x. So we get:

Prob(p_j) ≅ (1/(N_in O)) sin^2(π δ_j) / ( π δ_j r/N_in )^2
          = (N_in/(rO)) · (1/r) ( sin(π δ_j)/(π δ_j) )^2     (13.85)
          ≅ (1/r) ( sin(π δ_j)/(π δ_j) )^2

Figure 34: A plot of the function sin^2(πx)/(πx)^2 as a function of x. This function, very familiar from the theory of diffraction, is symmetric in x → −x and monotonically decreasing in the interval 0 ≤ x ≤ 1/2. It therefore takes its minimal value in this interval at x = 1/2, where it is about 4/π^2 ≅ 0.405.
Now for 0 ≤ |δ_j| ≤ 1/2 we have ( sin(π δ_j)/(π δ_j) )^2 ≥ 4/π^2, so

∑_{j=0}^{r−1} Prob(p_j) ≥ r · (1/r)(4/π^2) = 4/π^2 ≅ 0.4     (13.86)

Now, using various tricks one can raise this 40% value to near 100%. For these tricks consult Mermin's book. Other useful references on quantum information theory and quantum computing where one can look these things up include:

1. Nielsen and Chuang, Quantum Computation and Quantum Information

2. A. Y. Kitaev, A. H. Shen, and M. N. Vyalyi, Classical and Quantum Computation, Grad. Studies in Math. 47, AMS

3. J. Preskill, lecture notes at http://www.theory.caltech.edu/~preskill/ph219/ph219-2019-20

14. Semidirect Products

We have seen a few examples of direct products of groups above. We now study a more
subtle notion, the semidirect product. The semidirect product is a twisted version of the
direct product of groups H and G which can be defined once we are given one new piece
of extra data. The new piece of data we need is a homomorphism

α : G → Aut(H). (14.1)

For an element g ∈ G we will denote the corresponding automorphism by αg . The value
of αg on an element h ∈ H is denoted αg (h). Thus αg (h1 h2 ) = αg (h1 )αg (h2 ) because αg
is a homomorphism of H to itself while we also have αg1 g2 (h) = αg1 (αg2 (h)) because α is
a homomorphism of G into the group of automorphisms Aut(H). We also have that α1 is
the identity automorphism. (Prove this!)
Using the extra data given by α we can form a more subtle kind of product called the semidirect product H ⋊ G, or H ⋊_α G when we wish to stress the role of α. In the math literature on group theory the notation H : G is also used. This group is the Cartesian product H × G as a set but has the “twisted” multiplication law:

(h_1, g_1) · (h_2, g_2) := (h_1 α_{g_1}(h_2), g_1 g_2)     (14.2)


A good intuition to have is that “as g1 moves from left to right across the h2 they
interact via the action of g1 on h2 .”
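To see the twisted multiplication law (14.2) in action, here is a small illustrative Python sketch (hypothetical helper names, not from the text), specialized to H = Z, G = Z_2 with the sign-flip automorphism; this is the infinite dihedral group of Example 14.1 below.

```python
def semidirect_mul(x, y, mul_H, mul_G, alpha):
    # Twisted product (h1, g1) . (h2, g2) = (h1 * alpha_{g1}(h2), g1 g2), cf. (14.2)
    (h1, g1), (h2, g2) = x, y
    return (mul_H(h1, alpha(g1)(h2)), mul_G(g1, g2))

# H = Z (additively), G = Z2 = {0, 1} with 1 playing the role of sigma,
# and alpha_sigma(x) = -x:
mul_H = lambda a, b: a + b
mul_G = lambda s, t: (s + t) % 2
alpha = lambda s: (lambda x: -x if s else x)

print(semidirect_mul((3, 1), (5, 0), mul_H, mul_G, alpha))   # (-2, 1): (x1, sigma)(x2, e) = (x1 - x2, sigma)
```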

Exercise Due diligence


a.) Show that (14.2) defines an associative group law.
b.) Show that (1_H, 1_G) defines the unit and

(h, g)^{−1} = (α_{g^{−1}}(h^{−1}), g^{−1})     (14.3)

c.) Compute the group commutator [(h_1, g_1), (h_2, g_2)] for a semidirect product. 146
d.) Let End(H) be the set of all homomorphisms H → H. Note that this set is closed under the operation of composition, and this operation is associative, but End(H) is not a group because some homomorphisms will not be invertible. Nevertheless, it is a monoid. Show that if α : G → End(H) is a homomorphism of monoids then (14.2) still defines a monoid. When is it a group?

Example 14.1: Infinite dihedral group. Let G = {e, σ} ≅ Z_2 with generator σ, and let H = Z, written additively. Then define a nontrivial α : G → Aut(H) by letting α_σ act on x ∈ H as α_σ(x) = −x. Then Z ⋊ Z_2 is a group with elements (x, e) and (x, σ), for x ∈ Z. Note the multiplication laws:

(x_1, e)(x_2, e) = (x_1 + x_2, e)
(x_1, e)(x_2, σ) = (x_1 + x_2, σ)
(x_2, σ)(x_1, e) = (x_2 − x_1, σ)
(x_1, σ)(x_2, σ) = (x_1 − x_2, e)     (14.4)
and hence the resulting group is nonabelian with this twisted multiplication law. Since Aut(Z) ≅ Z_2 this is the only nontrivial semidirect product we can form. This group is known as the infinite dihedral group, sometimes denoted D_∞. It has a presentation:

Z ⋊ Z_2 ≅ ⟨r, s | s^2 = 1, srs = r^{−1}⟩     (14.5)
 
146 Answer: [(h_1, g_1), (h_2, g_2)] = ( h_1 α_{g_1}(h_2) α_{g_1 g_2 g_1^{−1}}(h_1^{−1}) α_{g_1 g_2 g_1^{−1} g_2^{−1}}(h_2^{−1}), g_1 g_2 g_1^{−1} g_2^{−1} ).

(e.g. take s = (0, σ) and r = (1, e))
Remark: Taking x = s and y = rs we see that D_∞ also has a presentation as a Coxeter group:

Z ⋊ Z_2 ≅ ⟨x, y | x^2 = 1, y^2 = 1⟩     (14.6)

Indeed it is the Weyl group for the affine Lie group LSU(2).
Example 14.2: Finite dihedral group. We can use the same formulae as in Example 14.1, retaining G = {e, σ} ≅ Z_2 but now taking H = Z/NZ. We still have

α_σ : n̄ → −n̄     (14.7)

where we are writing Z/NZ additively. The semidirect product of Z/NZ with Z_2 using this automorphism gives one definition of an important group, the finite dihedral group D_N. Observe that, when we write Z_N = µ_N multiplicatively, the automorphism is α_σ(ω^j) = ω^{−j}. In this way we can obtain a presentation of D_N of the form:

Z_N ⋊_α Z_2 ≅ ⟨r, s | s^2 = 1, srs = r^{−1}, r^N = 1⟩     (14.8)

Note that here we have switched to a multiplicative model for the group Z_N. The group has order given by the cardinality of the Cartesian product Z_N × Z_2, so

|D_N| = 2N.     (14.9)

Note that, using the relations, every word in r, s can be converted to the form r^x or r^x s with 0 ≤ x ≤ N − 1, thus accounting for all 2N elements.
Figure 35: A regular 3-gon and 4-gon in the plane, centered at the origin (vertices labeled 1, 2, 3 and 1, 2, 3, 4). The subgroup of O(2) that preserves these is D_3 and D_4, respectively.

Important Remark: The Dihedral Groups And Symmetries Of Regular Polygons.


The group DN has a natural action on the vector space R2 . The generator r acts by a

rotation around the origin: φr = R(2π/N ). This generates a group isomorphic to ZN and
in this context it is usually denoted CN . If P is any reflection through a line through the
origin then φs = P will satisfy all the relations. The resulting group of transformations of
the plane generated by φr and φs is isomorphic to DN . Moreover, if we consider the regular
N -gon centered at the origin of the plane R2 then the subgroup of O(2) that maps it to
itself is isomorphic to DN , although to preserve the polygon we must choose P carefully
so that it is a reflection through an axis of symmetry. For example, if we consider the regular triangle illustrated in Figure 35 then reflection in the y-axis is a symmetry, as is rotation by integer multiples of 2π/3. So we have a two-dimensional matrix representation of D_3:
s → P = \begin{pmatrix} −1 & 0 \\ 0 & 1 \end{pmatrix},     r → R(2π/3) = \begin{pmatrix} \cos(2π/3) & \sin(2π/3) \\ −\sin(2π/3) & \cos(2π/3) \end{pmatrix} = (1/2) \begin{pmatrix} −1 & √3 \\ −√3 & −1 \end{pmatrix}     (14.10)

Note that, if we label the vertices of the triangle 1, 2, 3 as shown in the figure then the
various symmetries are in 1-1 correspondence with permutations of {1, 2, 3}. So in fact, we
have an isomorphism
D_3 ≅ S_3     (14.11)
with P → (12) and R(2π/3) → (123). Similarly, D4 is the group of symmetries of the
square. We can take s → P as before and now r → R(2π/4). Again we can label
the vertices of the square 1, 2, 3, 4. Again transformations are uniquely determined by
permutations of {1, 2, 3, 4}. However, the group of permutations we get this way is only a
subgroup of the permutation group S4 .
Clearly there is something quite different about the groups D_N when N is even and odd. This is nicely seen in the set of conjugacy classes. As you show in the exercise below the conjugacy classes in D_N are of the form

C(r^x) = {r^x, r^{−x}}     (14.12)

and

C(r^x s) = {r^{x+2y} s | y ∈ Z}     (14.13)

Now, thinking in terms of symmetry actions on the plane, the r^x correspond to rotations by 2πx/N, whereas the r^x s correspond to reflections. Now note that for N odd, since (2, N) = 1, the conjugacy class C(r^x s) will contain all the elements of the form r^z s, in other words all the reflections. However, if N is even then there are two distinct conjugacy classes: C(r^x s) for x even and odd are distinct. This is nicely in accord with symmetries of the N-gon: For N odd it is clear that all the symmetry axes can be mapped to each other by using the symmetries of the N-gon. Whereas for N even there are two distinct kinds of reflection axes: Those that go through vertices and those that go through edges.
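As a quick check of (14.12)–(14.13) and of the even/odd distinction just described, here is a small brute-force Python sketch (illustrative only): it encodes D_N as words r^x s^k and collects conjugacy classes.

```python
def dihedral_conjugacy_classes(N):
    # Elements of D_N are written (x, k) <-> r^x s^k with 0 <= x < N, k in {0, 1}.
    def mul(a, b):
        (x1, k1), (x2, k2) = a, b
        # r^x1 s^k1 r^x2 s^k2 = r^(x1 + (-1)^k1 x2) s^(k1 + k2), using s r s = r^(-1)
        return ((x1 + (x2 if k1 == 0 else -x2)) % N, (k1 + k2) % 2)
    def inv(a):
        x, k = a
        return ((-x) % N, 0) if k == 0 else a        # reflections square to 1
    G = [(x, k) for x in range(N) for k in (0, 1)]
    classes, seen = [], set()
    for g in G:
        if g not in seen:
            cls = {mul(mul(h, g), inv(h)) for h in G}
            classes.append(sorted(cls))
            seen |= cls
    return classes

# N = 5 (odd): one class of reflections; N = 6 (even): two classes of reflections.
print(len(dihedral_conjugacy_classes(5)), len(dihedral_conjugacy_classes(6)))   # 4 6
```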

Exercise D_4 and permutations

Show that under the map to S_4 described above the dihedral group D_4 maps to the subgroup containing just the identity, the cyclic permutations (the rotations)

(1234), (1234)^2 = (13)(24), (1234)^3 = (1432)     (14.14)

and four reflections:

(12)(34), (23)(14), (13), (24)     (14.15)

Exercise
Show that DN is a quotient of the infinite dihedral group. 147

Exercise Conjugacy Classes In DN


a.) Using the presentation of D_N in terms of generators r, s and relations s^2 = 1, srs = r^{−1}, and r^N = 1, show that we have conjugacy classes given by (14.12) and (14.13).
b.) List the distinct conjugacy classes in D_N for N even and N odd. ♣Should provide answer in a footnote here. ♣

Example 14.3: In equations (2.22) and (2.40) we found the most general form of a matrix in O(2). It is a disjoint union of two circles, one circle consisting of the elements with determinant det(A) = +1 and the other of the elements with det(A) = −1. As a group we have an isomorphism

O(2) ≅ SO(2) ⋊ Z_2     (14.17)

In fact, there are many such isomorphisms.


We can write an explicit isomorphism as follows: Let Z2 = {1, σ}. Then

α : Z2 → Aut(SO(2)) (14.18)

is given by
α(σ) : R(φ) → R(−φ) (14.19)
Now choose any nonzero vector v ∈ R2 and define an isomorphism

Ψ_v : SO(2) ⋊ Z_2 → O(2)     (14.20)

by

Ψv : (1, σ) → Pv
(14.21)
Ψv : (R(φ), 1) → R(φ)
147 Answer: Note that N = {(x, e) | x = 0 mod N} ⊂ Z ⋊ Z_2 is a normal subgroup and

(Z/NZ) ⋊ Z_2 ≅ (Z ⋊ Z_2)/N.     (14.16)

Here P_v is the reflection in the line orthogonal to v. We need to check that this is a well-defined homomorphism by checking that the images we have specified are indeed compatible with the relations in the semidirect product. This amounts to checking that P_v^2 = 1, which is obvious, and

P_v R(φ) P_v^{−1} = R(−φ)     (14.22)

Thanks to (14.22) the relations are indeed satisfied and now it is an easy matter to check
that Ψv is injective and surjective, so it is an isomorphism.
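Relation (14.22) is easy to check numerically; here is a minimal numpy sketch (illustrative only, with the reflection written as 1 − 2 v v^T/|v|^2):

```python
import numpy as np

def R(phi):
    # Rotation matrix in SO(2), with the convention of (14.10)
    return np.array([[np.cos(phi), np.sin(phi)],
                     [-np.sin(phi), np.cos(phi)]])

def P(v):
    # Reflection in the line orthogonal to the nonzero vector v; det = -1
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.eye(2) - 2.0 * np.outer(v, v)

v, phi = [1.0, 2.0], 0.7
check = P(v) @ R(phi) @ np.linalg.inv(P(v))
print(np.allclose(check, R(-phi)))       # True: P_v R(phi) P_v^{-1} = R(-phi), cf. (14.22)
```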
Note that there are many different such isomorphisms, Ψ_v, depending on the choice of v. If v' is another nonzero vector in the plane then recall from (6.11) that P_v P_{v'} = R(2θ) where θ is the angle between v and v'. Then

Ψ_{v'} : (1, σ) ↦ P_{v'} = (P_{v'} P_v) P_v = R(2θ) P_v = R(θ) P_v R(−θ)     (14.23)

So

Ψ_{v'} = I(R(θ)) ∘ Ψ_v     (14.24)

and changing v → v' changes Ψ_v by composition with an inner automorphism of O(2).


More generally, it is true that:

1. When d is odd
   O(d) ≅ SO(d) × Z_2     (14.25)

2. When d is even
   O(d) ≅ SO(d) ⋊ Z_2     (14.26)

In the case when d is odd the element −1_{d×d} has determinant −1, so it is in the nontrivial component of O(d), and yet it is also central: conjugating elements R ∈ SO(d) acts trivially. The semidirect product is isomorphic to a direct product in this case. On the other hand, when d is even −1_{d×d} has determinant +1 and is an element of SO(d). However, it is still true that if P_v is the reflection in the hyperplane orthogonal to a nonzero vector v then

α_v(R) := P_v R P_v     (14.27)

is a nontrivial automorphism of SO(d) and we have a family of isomorphisms:

Ψ_v : SO(d) ⋊_{α_v} Z_2 → O(d)     (14.28)

The same discussion as above shows that the dependence on v is through composition with
an inner automorphism of SO(d).
Example 14.4: Affine Euclidean Space. Recall the discussion of Affine Euclidean space
in section ******** above. We defined there the Euclidean group.

Some natural examples of isometries are the following: Given any vector v ∈ R^d we have the translation operator

T_v : p → p + v     (14.29)

Moreover, if R ∈ O(d) then, if we choose a point p, we can define an operation:

R_p : p + v → p + Rv     (14.30)

It turns out (this is a nontrivial theorem) that all isometries are obtained by composing such transformations. A simple way to express the general transformation, then, is to choose a point p as the “origin” of the affine space, thus giving an identification E^d ≅ R^d. Then, to a pair R ∈ O(d) and v ∈ R^d we can associate the isometry: 148

{v|R} : x ↦ v + Rx,     ∀x ∈ R^d     (14.31)

In this notation the group multiplication law is

{v1 |R1 }{v2 |R2 } = {v1 + R1 v2 |R1 R2 } (14.32)

which makes clear that there is a nontrivial automorphism used to construct the semidirect product of the group of translations (isomorphic to R^d) with the rotation-inversion group O(d).
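The Seitz multiplication law (14.32) is conveniently checked against the defining action (14.31); here is a small illustrative numpy sketch (hypothetical helper names, not from the text):

```python
import numpy as np

def seitz_mul(a, b):
    # {v1|R1}{v2|R2} = {v1 + R1 v2 | R1 R2}, cf. (14.32)
    (v1, R1), (v2, R2) = a, b
    return (v1 + R1 @ v2, R1 @ R2)

def seitz_apply(a, x):
    # {v|R}: x -> v + R x, cf. (14.31)
    v, R = a
    return v + R @ x

R90 = np.array([[0.0, -1.0], [1.0, 0.0]])              # rotation by pi/2
g1 = (np.array([1.0, 0.0]), R90)
g2 = (np.array([0.0, 2.0]), np.eye(2))
x = np.array([3.0, 4.0])
# Composing the operators agrees with acting twice:
print(np.allclose(seitz_apply(seitz_mul(g1, g2), x),
                  seitz_apply(g1, seitz_apply(g2, x))))   # True
```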
Put differently: O(d) acts as an automorphism group of Rd :

αR : v 7→ Rv (14.33)

so we can form the abstract group R^d ⋊_α O(d). Then, once we choose an origin p ∈ E^d, we can write an isomorphism

Ψ_p : R^d ⋊_α O(d) → Euc(d)     (14.34)
To be concrete:
Ψp (v, R) : p + x 7→ p + (v + Rx) (14.35)
As in our example of O(d) we now have a family of isomorphisms. If Ψp0 is another
such based on a different origin p0 = p + v0 then the two isomorphisms are related by a
translation. See the exercise below.

Exercise Internal Definition Of Semidirect Products


Suppose there is a homomorphism α : G → Aut(H) so that we can form the semidirect
product H oα G.
a.) Show that elements of the form (1, g), g ∈ G form a subgroup Q ⊂ H o G
isomorphic to G, while elements of the form (h, 1), h ∈ H constitute another subgroup,
call it N , which is isomorphic to H.
148
Our notation is logically superior to the standard notation in the condensed matter physics literature
where it is known as the Seitz notation. In the cond-matt literature we have {R|v} : x 7→ Rx + v.

b.) Show that N = {(h, 1) | h ∈ H} is a normal subgroup of H ⋊ G, while Q = {(1, g) | g ∈ G} in general is not a normal subgroup. 149 This explains the funny product symbol ⋊ that looks like a fish: it is a combination of the direct product symbol × with the normal subgroup symbol ◁.
c.) Show that we have a short exact sequence:

1 → N → H ⋊_α G → Q → 1     (14.37)

d.) Show that H ⋊ G = NQ = QN and show that Q ∩ N = {1}. Here NQ means the set of elements nq where n ∈ N and q ∈ Q. 150
e.) Conversely, show that if G̃ = NQ where N is a normal subgroup of G̃ and Q is a subgroup of G̃ (that is, every element of G̃ can be written in the form g = nq with n ∈ N and q ∈ Q, and N ∩ Q = {1}), then G̃ is a semidirect product of N and Q. Show how to recover the action of Q as a group of automorphisms of N by defining α_q(n) := qnq^{−1}. Note that α_q in general is NOT an inner automorphism of N.

Exercise When is a semidirect product actually a direct product?


Show that if G = N Q is a semidirect product and Q is also a normal subgroup of G,
then G is the direct product of N and Q. 151

It is useful to think about the Euclidean group in terms of the “internal” character-
ization of semi-direct products explained in the exercise above. Here we have a normal
subgroup N := {{v|1}|v ∈ Rd } and a subgroup Q given by the set of elements of the form
{0|R}. To check that N is normal a short computation using the group law reveals

{v|R}{w|1} = {Rw|1}{v|R} (14.38)

and hence:
{v|R}{w|1}{v|R}−1 = {Rw|1} (14.39)
Note that, again, thanks to the group law, π : {v|R} → R is a surjective homomorphism
Euc(d) → O(d). Thus there is an exact sequence:

0 → Rd → Euc(d) → O(d) → 1 (14.40)


149 Answer to (b): Compute (h_1, g_1)(h, 1)(h_1, g_1)^{−1} = (h_1 α_{g_1}(h) h_1^{−1}, 1) and

(h_1, g_1)(1, g)(h_1, g_1)^{−1} = (h_1 α_{g_1 g g_1^{−1}}(h_1^{−1}), g_1 g g_1^{−1}).     (14.36)

150
The notation is slightly dangerous here: We are considering the group Q both as a subgroup of G and,
in equation (14.37), as a quotient. In general, as we will see below in the chapter on exact sequences, there
is no way to view a quotient of a group G as a subgroup of G. Failure to appreciate this point has led to
many, many, many errors in the physics literature.
151 Answer: Note that n_1 q_1 n_2 q_2 = n_1 n_2 (n_2^{−1} q_1 n_2 q_1^{−1}) q_1 q_2. However, if both N and Q are normal subgroups then (n_2^{−1} q_1 n_2 q_1^{−1}) ∈ N ∩ Q = {1}. Therefore n_1 q_1 n_2 q_2 = n_1 n_2 q_1 q_2, which is the direct product structure.

Almost identical considerations show that the Poincaré group is isomorphic to the
semidirect product of the translation and Lorentz groups:

Poincare(M^{1,d−1}) ≅ M^{1,d−1} ⋊ O(1, d − 1)     (14.41)

where, once again, the choice of isomorphism depends on a choice of origin.


Example 14.5: Wreath Products. If X and Y are sets then let F[X → Y ] be the set of
functions from X to Y . Recall that

1. If Y = G1 is a group then F[X → G1 ] is itself a group.

2. If a group G_2 acts on X and Y is any set then G_2 acts on the function space F[X → Y] in a natural way.

We can combine these two ideas as follows: Suppose that G2 acts on a set X and
Y = G1 is itself a group. Then let

α : G2 → Aut(F[X → G1 ]) (14.42)

be the canonical G2 action on the function space: so if φ : G2 × X → X is the action on


X (part of our given data) then the induced action on the function space is

αg : F 7→ αg (F ) ∈ F[X → G1 ] (14.43)

where we define
αg (F )(x) = F (φ(g −1 , x)) ∀g ∈ G2 , x ∈ X (14.44)
Then we can form the semidirect product

F[X → G_1] ⋊ G_2     (14.45)

This is a generalized wreath product. The traditional wreath product is a special case where G_2 = S_n for some n and S_n acts on X = {1, . . . , n} by permutations in the standard way. Note that the group F[X → G_1] ≅ G_1^n. The traditional wreath product G_1 wr S_n, also denoted G_1 ≀ S_n, is then F[X → G_1] ⋊ S_n. To be quite explicit, the group elements in G_1 ≀ S_n are

(h_1, . . . , h_n; φ)     (14.46)

with h_i ∈ G_1 and φ ∈ S_n, and the product is

(h_1, . . . , h_n; φ)(h'_1, . . . , h'_n; φ') = (h_1 h'_{φ^{−1}(1)}, h_2 h'_{φ^{−1}(2)}, . . . , h_n h'_{φ^{−1}(n)}; φ ∘ φ')     (14.47)
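Here is a small illustrative Python sketch (not from the text) of the wreath-product multiplication (14.47); permutations are stored as tuples phi with phi[i] = φ(i) on the points 0, ..., n−1.

```python
def wreath_mul(a, b, mul_G1):
    # (h_1,...,h_n; phi)(h'_1,...,h'_n; phi') = (h_1 h'_{phi^{-1}(1)}, ..., h_n h'_{phi^{-1}(n)}; phi o phi')
    (h, phi), (hp, phip) = a, b
    n = len(h)
    phi_inv = [0] * n
    for i in range(n):
        phi_inv[phi[i]] = i
    new_h = tuple(mul_G1(h[i], hp[phi_inv[i]]) for i in range(n))
    new_phi = tuple(phi[phip[i]] for i in range(n))          # (phi o phi')(i) = phi(phi'(i))
    return (new_h, new_phi)

# Example: G1 = Z3 (written additively), n = 3
mul_Z3 = lambda a, b: (a + b) % 3
a = ((1, 2, 0), (1, 2, 0))        # phi  = cycle 0 -> 1 -> 2 -> 0
b = ((2, 2, 1), (0, 2, 1))        # phi' = transposition of 1 and 2
print(wreath_mul(a, b, mul_Z3))   # ((2, 1, 2), (1, 0, 2))
```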

Example 14.6: Kaluza-Klein theory. The basic idea of Kaluza-Klein theory is that we
study physics on a product manifold X × Y and partially rigidify the situation by putting
some structure on Y. We then regard Y as “small” and study the physics as “effectively”
taking place on X .
The idea is intuitively understood by imagining a 2 + 1 dimensional world where space
is a cylinder of radius R. If we imagine beings in this flatland of a fixed lengthscale,

and shrink R → 0 then the beings will end up perceiving themselves as living in a 1 + 1 dimensional world. ♣SUITABLE FIGURE NEEDED HERE ♣
Historically, Kaluza-Klein theory arose from attempts to unify the field theories of

general relativity with Maxwell’s theory of electromagnetism. The basic idea is that pure
general relativity on X × Y appears, when Y is “small” to be a theory of several fields,
including electro-magnetism, in X . As originally conceived the idea is very beautiful, but
now regarded as too naive and simplistic. Nevertheless, the idea that there might be extra
dimensions of spacetime in a compact manifold survives to this day and models that make
use of it come astonishingly close to describing the standard model of particle physics and
gravity, in the context of “string compactification.”
The canonical example of Kaluza-Klein theory is the case where X = M1,d−1 is d-
dimensional Minkowski space and Y = S 1 is the circle. We rigidify the situation by
putting a metric on the circle S^1 so that the metric on space-time is

ds^2 = η_{µν} dx^µ dx^ν + R^2 (dθ)^2     (14.48)

where R is the radius of the circle, θ ∼ θ + 2π and 0 ≤ µ ≤ d − 1. Our signature is mostly plus. Now we consider a massless scalar field in (d + 1) dimensions on this spacetime. A massless scalar field would satisfy the wave equation:

[ η^{µν} (∂/∂x^µ)(∂/∂x^ν) + (1/R^2)(∂/∂θ)^2 ] φ = 0     (14.49)

In Quantum Field Theory one makes a huge leap: The quantization of the field leads to
quantum states which are interpreted as the states of a system of particles. An essential
step in this feat of magic is that one makes a Fourier-decomposition of the field. The
Fourier modes of the field are interpreted as creation/annihilation operators of particle
states. For the massless scalar field the Fourier modes are

e^{i p_M x^M} = e^{i p_µ x^µ} e^{i p_θ θ}     (14.50)

corresponding to single-particle creation and annihilation operators of definite energy-momentum. But since θ ∼ θ + 2π, single-valuedness of the field implies that p_θ = n ∈ Z is an integer. But now the wave equation implies that we have a dispersion relation:

E^2 − \vec{p}^2 = n^2/R^2     (14.51)

where p^µ = (E, \vec{p}). From the viewpoint of a d-dimensional field theory, Fourier modes with n ≠ 0 describe massive particles with m^2 = n^2/R^2.
Now consider the case that R is very small compared to the scale of any observer.
Then that observer will perceive only a d-dimensional spacetime. If R is very small the single massless particle in d + 1 dimensions is perceived as an infinite set of different massive particles with mass |n|/R in d dimensions. As R → 0 the masses of the particles ∼ |n|/R go to infinity. So, except for the n = 0 modes, the particles are very massive and therefore
will not be created by low energy processes, and are hence in general unobservable. For

example, if R is on the order of the Planck scale then the nonzero Fourier modes are fields
that represent particles of Planck-scale mass.
In a similar spirit, one finds that the Einstein-Hilbert action in (d + 1) dimensions
describing gravity in (d + 1) dimensions is equivalent, upon keeping only the n = 0 Fourier
modes, to the action of d-dimensional general relativity together with the Maxwell action
and the action for a scalar field. In a little more detail, suppose that Y = S^1 and we use coordinates X^M, M = 0, . . . , d on X × S^1 and coordinates x^µ with µ = 0, . . . , d − 1 on X, so that X^M = (x^µ, θ) where θ is an angular coordinate around S^1. Then we consider the metric:

ds^2 = g_{MN} dX^M dX^N = g_{µν}(x) dx^µ dx^ν + Ω^2(x)(dθ + A_µ(x) dx^µ)^2     (14.52)

where the metric g_{µν}, the “warp factor” Ω^2 and the “gauge connection” A_µ are only functions of x^µ (that is, we make the restriction to zero Fourier modes).
Note that this means the metric tensor looks like

g_{MN}(x^µ, θ) = \begin{pmatrix} g_{µν}(x) + Ω^2(x) A_µ(x) A_ν(x) & Ω^2(x) A_µ(x) \\ Ω^2(x) A_ν(x) & Ω^2(x) \end{pmatrix}     (14.53)

The general symmetric (d + 1) × (d + 1) matrix has

(1/2)(d + 1)(d + 2) = (1/2) d(d + 1) + d + 1     (14.54)
independent components, so we have not lost any generality in the form of the matrix, but
writing it this way makes the connection to physical quantities (and to connections on a
principal U (1) bundle) clearer. By writing the fields on the RHS as functions of xµ and
not (xµ , θ) we have made a severe restriction - limiting attention to the massless modes in
d dimensions, as explained above.
Under these conditions one computes that the Riemann scalar for the (d + 1)-dimensional metric is:

R[g_{MN}] = R[g_{µν}] − (Ω^2/4) F_{µν} F^{µν} − 2(∇ log Ω)^2 − 2∇^2 log Ω     (14.55)
so the Einstein-Hilbert action for GR in (d + 1) dimensions reduces to that of Einstein-
Hilbert-Maxwell-Scalar in d dimensions. This is a truly remarkable equation: Pure gravity
in (d + 1) dimensions leads to both gravity and electricity and magnetism in d dimensions!
Remarks:

1. The KK ansatz also leads to a scalar field logΩ2 (x), known as the “dilaton” because
it can dilate, in a space-time dependent way, the size of the “internal dimensions” Y.
Note that in electricity and magnetism the coupling constant enters via

S_{Maxwell} = (1/(4e^2)) ∫ √g F_{µν} F^{µν}     (14.56)

so the presence of the dilaton can lead to space-time variation of the fine structure constant. In physically viable models one must explain why the dilaton does not fluctuate wildly. The discovery of the naturally occurring nuclear reactor in Oklo, Gabon, has led to the bound

|α̇/α| < few × 10^{−17} yr^{−1}     (14.57)
α

2. By considering internal spaces Y equipped with a metric with isometry group H, similar considerations lead to gauge theory with gauge group H in X.

It is interesting to understand how gauge symmetries in theories on X arise in this


point of view. Suppose D ∼ = Dif f (X ) is a subgroup of diffeomorphisms of X × Y of the
form
ψf : (x, y) → (f (x), y) f ∈ Dif f (X ). (14.58)
We also consider a subgroup G of Dif f (X × Y) where G is isomorphic to a subgroup of
M ap(X , Dif f (Y)). For the moment just take G = M ap(X , Dif f (Y)), so an element g ∈ G
is a family of diffeomorphisms of Y parametrized by X : For each x we have a diffeomorphism
of Y: gx : y → g(y; x). Then we take G to be the subgroup of diffeomorphisms of Dif f (X ×
Y) of the form
ψg : (x, y) → (x, g(y; x)) g∈G (14.59)
Note that within Dif f (X × Y) we can write the subgroup

GD (14.60)

and D acts as a group of automorphisms of G via

ψ_f ψ_g ψ_f^{−1} : (x, y) → (f^{−1}(x), y) → (f^{−1}(x), g(y; f^{−1}(x))) → (x, g(y; f^{−1}(x)))     (14.61)

so if g ∈ G and f ∈ D then ψ_f ψ_g ψ_f^{−1} = ψ_{g'} with g' ∈ G, and hence GD is a semidirect product. In fact, it is an example of the generalized wreath product of the previous example.
This is a model for the group of gauge transformations in Kaluza-Klein theory. So
X is the “large”, possibly noncompact, spacetime where we have general relativity, while
Y is the “small,” possibly compact space giving rise to gauge symmetry. D is the diffeo-
morphism group of the large spacetime and is the gauge symmetry of general relativity on
X . Typically, Y is endowed with a fixed metric ds2Y and the diffeomorphism symmetry of
Y is (spontaneously) broken down to the group of isometries of Y, Isom(Y, ds2Y ). So in
the above construction we take G to be the unbroken subgroup M ap(X , Isom(Y, ds2Y )) ⊂
M ap(X , Dif f (Y)). This subgroup M ap(X , Isom(Y, ds2Y )) is interpreted as a group of
gauge transformations of a gauge theory on X coupled to general relativity on X .
As a simple example of the remarks of the previous paragraph let us suppose that Y = S^1 with coordinate θ and round metric (dθ)^2. The isometries of the circle are just constant translations, θ → θ + ε. So if

g ∈ Map(X, Isom(S^1))     (14.62)

then g(x) will take θ → θ + ε(x), so

ψ_g : (x^µ, θ) → (x^µ, θ + ε(x))     (14.63)

so the metric in (14.52) transforms as

ψ_g^*(ds^2) = g_{µν}(x) dx^µ dx^ν + Ω^2(x)(dθ + dx^µ ∂_µ ε + A_µ(x) dx^µ)^2     (14.64)

meaning that the fields g_{µν} and Ω are invariant, but

A_µ → A_µ + ∂_µ ε     (14.65)

and thus, these special diffeomorphisms appear as gauge transformations of the Maxwell
field!

Exercise
Show that S_3 ≅ Z_3 ⋊ Z_2 where the generator of Z_2 acts as the nontrivial outer automorphism of Z_3.

Figure 36: A baseball.

Exercise Symmetries Of The Baseball


Ignoring any writing, but taking into account the seams, find the symmetry group of
a baseball. (See figure 36.) 152

152
Answer D4 .

Exercise Symmetries Of The Cube
a.) Consider a perfect cube. By considering the action of proper rotations on the four
diagonal axes through vertices show that the symmetry group is isomorphic to S4 .
b.) Centering the cube on the origin with vertices (±1, ±1, ±1), show that the symmetry group is (Z_2 × Z_2) ⋊ S_3. Deduce that

S_4 ≅ (Z_2 × Z_2) ⋊ S_3     (14.66)

Exercise Centralizers in the symmetric group


a.) Suppose that g ∈ S_n has a conjugacy class given by ∏_{i=1}^n (i)^{ℓ_i}. Show that the centralizer Z(g) is isomorphic to

Z(g) ≅ ∏_{i=1}^n ( Z_i^{ℓ_i} ⋊ S_{ℓ_i} )     (14.67)

where ∏_i is a direct product.
b.) Use this to compute the order of a conjugacy class in the symmetric group.

Exercise The Lorentz Group As A Semidirect Product


Let η = Diag{−1, 1_d}. The Lorentz group in d + 1 dimensions, denoted O(1, d), is the matrix group

O(1, d) = {A | A η A^{tr} = η}     (14.68)

a.) Considering the case d = 1 find the general solution and show that there are four connected components. 153
b.) Show that, group-theoretically, we have

O(1, 1) = SO_0(1, 1) ⋊ (Z_2 × Z_2)     (14.70)

where SO_0(1, 1) is the connected component of the identity.

Remark: In fact, more generally O(1, d) has four connected components for d ≥ 1 and

O(1, d) = SO_0(1, d) ⋊ (Z_2 × Z_2)     (14.71)
153 Answer: Writing out the four matrix elements of the defining equation one easily arrives at the general solution:

A = \begin{pmatrix} ξ_1 \cosh θ & ξ_2 \sinh θ \\ ξ_4 \sinh θ & ξ_3 \cosh θ \end{pmatrix}     (14.69)

where θ ∈ R, ξ_i ∈ {±1} and ξ_4 = ξ_1 ξ_2 ξ_3. By changing the sign of θ we can set ξ_2 = 1. The signs of ξ_1, ξ_3 are then meaningful, giving us four components.

You can easily see that there are at least four components since det A ∈ {±1} and moreover A_{00}^2 = 1 + ∑_{i=1}^d A_{0i}^2, so that we can independently have A_{00} ≥ 1 or A_{00} ≤ −1.

Exercise Holomorph
Given a finite group G a canonical semidirect product group is G ⋊ Aut(G), known as
the holomorph of G.
a.) Show that this is the normalizer of the copy of G in the symmetric group S|G| given
by Cayley’s theorem.
b.) Show that the affine Euclidean group Euc(d) is the holomorph of the Abelian
group Rd .

Exercise Equivalence of semidirect products


A nontrivial homomorphism α : G → Aut(H) can lead to a semidirect product which
is in fact isomorphic to a direct product. Show this as follows: Suppose φ : G → H is a
homomorphism. Define α : G → Aut(H) by αg = I(φ(g)). Construct an isomorphism 154

Ψ : H ⋊_α G → H × G     (14.72)

Exercise Manipulating the Seitz notation


a.) Show that:

{v|R}^{−1} = {−R^{−1}v | R^{−1}}
{0|R}{v|1} = {Rv|R}
{v|1}{0|R} = {v|R}
{w|1}{v|R} = {w + v|R}
{v_1|R_1}{v_2|R_2}{v_1|R_1}^{−1} = {R_1 v_2 + (1 − R_1 R_2 R_1^{−1}) v_1 | R_1 R_2 R_1^{−1}}
[{v_1|R_1}, {v_2|R_2}] = {(1 − R_1 R_2 R_1^{−1}) v_1 − R_1 R_2 R_1^{−1} R_2^{−1} (1 − R_2 R_1 R_2^{−1}) v_2 | R_1 R_2 R_1^{−1} R_2^{−1}}     (14.73)

b.) Show that the subgroup of pure translations, that is, the subgroup of elements of
the form {v|1} with v ∈ Rd is a normal subgroup of Euc(d).
154
Answer : Ψ(h, g) = (hφ(g), g).

c.) Can you construct a homomorphism O(d) → Euc(d)?
d.) Consider the group G = L o Z2 where L is a lattice and the nontrivial element in
Z2 acts on L by v → −v. Compute the conjugacy classes in G. 155

Exercise Dependence on basepoint for isomorphism of Euclidean group


Show that if p' = p + v_0 then

Ψ_{p'}(v', R) = Ψ_p(v, R) ∈ Euc(d)     (14.74)

for

v' = v + (1 − R) v_0     (14.75)

15. Group Extensions and Group Cohomology

15.1 Group Extensions


♣Add: Pushforward
extensions ♣
Recall that an extension of Q by a group N is an exact sequence of the form:

1 → N --ι--> G --π--> Q → 1     (15.1)

There is a notion of homomorphism of two group extensions

1 → N --ι_1--> G_1 --π_1--> Q → 1     (15.2)

1 → N --ι_2--> G_2 --π_2--> Q → 1     (15.3)

This means that there is a group homomorphism ϕ : G_1 → G_2 so that the following diagram commutes:

1 → N --ι_1--> G_1 --π_1--> Q → 1
    |Id        |ϕ          |Id
    ↓          ↓           ↓
1 → N --ι_2--> G_2 --π_2--> Q → 1     (15.4)

To say that a “diagram commutes” means that if one follows the maps around two
paths with the same beginning and ending points then the compositions of the maps are
the same. Thus (15.4) is completely equivalent to the pair of equations:

π1 = π2 ◦ ϕ
(15.5)
ι2 = ϕ ◦ ι1
155
Answer : C({v|1}) = {{±v|1}} has two elements while C({v| − 1}) = {{±v + 2v1 | − 1}|v1 ∈ L} has
infinitely many elements.

However, drawing a diagram makes the relations between maps, domains and codomains
much more transparent. Sometimes a picture is worth a thousand equations. This is why
mathematicians like commutative diagrams.
When there is a homomorphism of group extensions based on ψ : G2 → G1 such that
ϕ ◦ ψ and ψ ◦ ϕ are the identity then the group extensions are said to be isomorphic.
It can certainly happen that there is more than one nonisomorphic extension of Q by
N . Classifying all extensions of Q by N is a difficult problem. We will discuss it more in
section 15.7 below.

Figure 37: Illustration of a group extension 1 → N → G → Q → 1 as an N -bundle over Q.

We would encourage the reader to think geometrically about this problem, even in
the case when Q and N are finite groups, as in Figure 37. In particular we will use the
important notion of a section, that is, a right-inverse to π: It is a map s : Q → G such that
π(s(q)) = q for all q ∈ Q. Such sections always exist.156 Note that in general s(π(g)) 6= g.
This is obvious from Figure 37. The set of pre-images, π −1 (q), is called the fiber of π over
q. The map π projects the entire fiber over q to the single element q. A choice of section
s is a choice, for each and every q ∈ Q, of just one single point in the fiber above q.
In order to justify the picture of Figure 37 let us prove that, as a set, G is just the
product N × Q. Note that for any g ∈ G and any section s:

g(s(π(g)))−1 (15.6)

maps to 1 under π (check this). Therefore, since the sequence is exact

g(s(π(g)))−1 = ι(n) (15.7)

for some n ∈ N . That is, every g ∈ G can be written as

g = ι(n)s(q) (15.8)

for some n ∈ N and some q ∈ Q. In fact, this decomposition is unique: Suppose that:

ι(n1 )s(q1 ) = ι(n2 )s(q2 ) (15.9)


156
By the axiom of choice. For continuous groups such as Lie groups there might or might not be continuous
sections.

Then we rewrite this as

ι(n_2^{−1} n_1) = s(q_2) s(q_1)^{−1}     (15.10)

Now, applying π we learn that 1 = q_2 π(s(q_1)^{−1}) = q_2 (π(s(q_1)))^{−1} = q_2 q_1^{−1}, so q_1 = q_2.
But that implies n1 = n2 . Therefore, as a set, G can be identified with N × Q.

Remark: As a nice corollary of the decomposition (15.8) note that if ϕ defines a


morphism of group extensions then ϕ is in fact an isomorphism of G1 to G2 . It is a
homomorphism by definition. Now note that if s1 : Q → G1 is a section of π1 then
s2 := ϕ ◦ s1 : Q → G2 is a section of π2 so

ϕ(g) = ϕ(ι1 (n)s1 (q))


= ϕ(ι1 (n))ϕ(s1 (q)) (15.11)
= ι2 (n)s2 (q)

and since the decomposition is unique (given a choice of section) the map ϕ is 1 − 1.

Now, given an extension and a choice of section s we define a map

ω : Q → Aut(N ) (15.12)

denoted by
q 7→ ωq (15.13)
where the definition of ωq is given by

ι(ωq (n)) = s(q)ι(n)s(q)−1 (15.14)

Because ι(N ) is normal the RHS is again in ι(N ). Because ι is injective ωq (n) is well-defined.
Moreover, for each q the reader should check that indeed ωq (n1 n2 ) = ωq (n1 )ωq (n2 ), and ωq
is one-one on N . Therefore we really have a map of sets (15.12). Note carefully that we
are not saying that q 7→ ωq is a group homomorphism. In general, it is not.

Remark: Clearly the ι is a bit of a nuisance and leads to clutter and it can be safely
dropped if we consider N simply to be a subgroup of G, for then ι is simply the inclusion
map. The confident reader is encouraged to do this. The formulae will be a little cleaner.
However, we will be pedantic and retain the ι in most of our formulae.

Let us stress that the map ω : Q → Aut(N) in general is not a homomorphism and in
general depends on the choice of section s. We will discuss the dependence on the choice of
section s below when we have some more machinery and context. For now let us see how
close ω comes to being a group homomorphism:

ι (ωq1 ◦ ωq2 (n)) = s(q1 )ι(ωq2 (n))s(q1 )−1


(15.15)
= s(q1 )s(q2 )ι(n)(s(q1 )s(q2 ))−1

We want to compare this to ι (ωq1 q2 (n)). In general they will be different unless s(q1 q2 ) =
s(q1 )s(q2 ), that is, unless s : Q → G is a homomorphism. In general the section is not a
homomorphism, but clearly something nice happens when it is:

Definition: We say an extension splits if there exists a section s : Q → G which is also


a group homomorphism. A choice of a section which is a group homomorphism is called a
(choice of) splitting.

Theorem: An extension is isomorphic to a semidirect product iff it is a split extension.

Proof :
First suppose it splits. Choose a splitting s. Then from (15.15) we know that

ωq1 ◦ ωq2 = ωq1 q2 (15.16)

and hence q 7→ ωq defines a homomorphism ω : Q → Aut(N ). Therefore, we can aim to


prove that there is an isomorphism of G with N oω Q.
In general if s is just a section the image s(Q) ⊂ G is not a subgroup. But if the
sequence splits, then it is a subgroup. The equation (15.8) implies that G = ι(N )s(Q)
where s(Q) is a subgroup, and by the internal characterization of semidirect products that
means we have a semidirect product.
To give a more concrete proof, let us write the group law in the parametrization (15.8). Write

ι(n) s(q) ι(n') s(q') = ι(n) ( s(q) ι(n') s(q)^{−1} ) s(qq')     (15.17)

Note that

s(q) ι(n') s(q)^{−1} = ι(ω_q(n'))     (15.18)

so

ι(n_1) s(q_1) ι(n_2) s(q_2) = ι(n_1 ω_{q_1}(n_2)) s(q_1 q_2)     (15.19)

But this just means that

Ψ(n, q) = ι(n) s(q)     (15.20)

is in fact an isomorphism Ψ : N ⋊_ω Q → G. Indeed equation (15.19) just says that:

Ψ(n_1, q_1) Ψ(n_2, q_2) = Ψ((n_1, q_1) ·_ω (n_2, q_2))     (15.21)

where ·_ω stresses that we are multiplying with the semidirect product rule.
Thus, we have shown that a split extension is isomorphic to a semidirect product G ≅ N ⋊ Q. The converse is straightforward. ♠
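To make the split/non-split distinction concrete, here is a small illustrative Python sketch (an example of ours, not taken from the text): the quaternion group Q_8 = {±1, ±i, ±j, ±k} is a central extension of Z_2 × Z_2 by Z_2, with π forgetting the sign, and a brute-force search confirms that no section is a homomorphism. By the theorem above, Q_8 is therefore not a semidirect product of Z_2 and Z_2 × Z_2.

```python
from itertools import product

units = ['1', 'i', 'j', 'k']
# Quaternion multiplication on the units, returning (sign, unit):
table = {('i', 'j'): (1, 'k'), ('j', 'k'): (1, 'i'), ('k', 'i'): (1, 'j'),
         ('j', 'i'): (-1, 'k'), ('k', 'j'): (-1, 'i'), ('i', 'k'): (-1, 'j'),
         ('i', 'i'): (-1, '1'), ('j', 'j'): (-1, '1'), ('k', 'k'): (-1, '1')}

def q8_mul(a, b):
    (s1, u1), (s2, u2) = a, b
    if u1 == '1': return (s1 * s2, u2)
    if u2 == '1': return (s1 * s2, u1)
    s, u = table[(u1, u2)]
    return (s1 * s2 * s, u)

pi = lambda g: g[1]                          # quotient map Q8 -> Z2 x Z2, labelled by {1, i, j, k}
quotient_mul = lambda q1, q2: pi(q8_mul((1, q1), (1, q2)))

# Search for a splitting: a section s with s(q1) s(q2) = s(q1 q2) for all q1, q2.
splittings = []
for signs in product((1, -1), repeat=4):
    s = {u: (sg, u) for u, sg in zip(units, signs)}
    if all(q8_mul(s[a], s[b]) == s[quotient_mul(a, b)] for a in units for b in units):
        splittings.append(s)
print(splittings)                            # []  -- the extension does not split
```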
In §15.7 below we will continue the general line of reasoning begun here. However, in order to appreciate the formulae better it is a good idea first to step back and consider a simple but important special case of extensions, namely, the central extensions. These are extensions such that ι(N) is a subgroup of the center of G. ♣Do the general case first and then specialize? ♣
Here is the official definition (note the change of notation from the general situation above):

Let A be an abelian group and G any group.

Definition A central extension of G by A, 157 is a group G̃ together with an extension

1 → A --ι--> G̃ --π--> G → 1     (15.22)

such that ι(A) ⊂ Z(G̃).
We stress again that what we called G in the previous discussion is here called G̃, and
what we called Q in the previous discussion is here called G.
Example And Remark: Sections of group extensions vs. continuous sections of principal fiber bundles. Let us return to the very important exact sequence (10.38):

1 → Z_2 --ι--> SU(2) --π--> SO(3) → 1     (15.23)

The Z2 is embedded as the subgroup {±1} ⊂ SU (2), so this is a central extension. We


said above that there is always a section, but when we said that we did not impose any
properties of continuity in the case where G and Q are continuous groups. In this example
while there is a section of π there is, in fact, no continuous section. Such a continuous section would satisfy π ∘ s = Id and hence would imply that π_* s_* = Id on the first homotopy group of SO(3). But that is impossible, since it would have to factor through π_1(SU(2)) = 1.
We are using a few facts here:

1. Every SU(2) matrix can be written as

\begin{pmatrix} α & β \\ −\bar{β} & \bar{α} \end{pmatrix}     (15.24)

where α, β are complex numbers with |α|^2 + |β|^2 = 1. Writing this equation in terms of the real and imaginary parts of α, β we recognize the equation of the unit three-dimensional sphere. Now recall that all spheres of dimension ≥ 2 are simply connected. Therefore SU(2) is simply connected: π_1(SU(2)) = 1.

2. But SO(3) is not simply connected! In fact, using a coffee cup you can informally demonstrate that π_1(SO(3)) ≅ Z_2. [Demonstrate].

3. If there were a continuous section then s_* : π_1(SO(3)) → π_1(SU(2)) would be a well-defined group homomorphism, and π ∘ s = Id would imply that on the fundamental groups Id_* = π_* s_*, with the composition factoring as

π_1(SO(3)) → π_1(SU(2)) = 1 → π_1(SO(3))     (15.25)

But Id_* takes the nontrivial element of Z_2 to the nontrivial element, not to the trivial element. This is impossible if you factor through the trivial group.

In algebraic topology one introduces another kind of topological invariant known as


homology. The homology groups of a manifold are Abelian groups that encode many
important properties of the manifold. The homology group H_1(M; Z_2) tells us what are the possible 2-fold covers of the manifold M. It turns out that H_1(SO(3); Z_2) ≅ Z_2 (this is closely related to π_1(SO(3)) ≅ Z_2), so there are two double covers of SO(3). One is O(3) = Z_2 × SO(3) and the other is SU(2), the nontrivial double cover.
157 Some authors say an extension of A by G.
The extension (15.23) generalizes to

1 → Z_2 --ι--> Spin(d) --π--> SO(d) → 1     (15.26)

as well as the two Pin groups which extend O(d):

1 → Z_2 --ι--> Pin^±(d) --π--> O(d) → 1     (15.27)

We discuss these in Section *** below. Again, in these cases there is no continuous section.
Thus, these examples are nontrivial as fiber bundles. Moreover, even if we allow ourselves
to choose a discontinuous section, we cannot do so and make it a group homomorphism.
In other words these examples are also nontrivial as group extensions.

Exercise
If s : Q → G is any section of π show that for all q ∈ Q,

s(q^{−1}) = s(q)^{−1} n = n' s(q)^{−1}     (15.28)

for some n, n' ∈ N.

Exercise The pullback construction


There is one general construction with extensions which is useful when discussing symmetries in quantum mechanics. This is the notion of a pullback extension. Suppose we are given both an extension

1 → H' --ι--> H --π--> H'' → 1     (15.29)

and a homomorphism

ρ : G → H''     (15.30)

Then one can define another extension of G by H', known as the pullback extension. We are trying to fill in the diagram:

                         G              (15.31)
                         |ρ
                         ↓
1 → H' --ι--> H --π--> H'' → 1

with an extension of G by H' on the first row.

We do this by defining a subgroup of the Cartesian product G̃ ⊂ H × G:

G̃ := {(h, g) | π(h) = ρ(g)} ⊂ H × G     (15.32)

We have an extension of the form

1 → H' --ι--> G̃ --π̃--> G → 1     (15.33)

where π̃(h, g) := g. Show that this extension fits in the commutative diagram

1 → H' --ι--> G̃ --π̃--> G  → 1
    |Id       |ρ̃        |ρ
    ↓         ↓         ↓
1 → H' --ι--> H --π--> H'' → 1     (15.34)

(N.B. This is not a morphism of extensions.)

Exercise The pushforward extension


Under some circumstances one can complete the diagram

1 → H' --ι--> H --π--> H'' → 1     (15.35)
    |ρ
    ↓
    H̃

to get an extension of H'' by H̃. This is not as universal as the pullback. But one can construct it if ρ : H' → H̃ is surjective and ι(ker(ρ)) ◁ H. Give the construction. 158

Exercise Choice of splitting and the Euclidean group Euc(d)


As we noted, the Euclidean group Euc(d) is isomorphic to the semidirect product R^d ⋊ O(d), but to exhibit that we needed to choose an origin about which to define rotation-inversions. See equation (14.35) above.
a.) Show that a change of origin corresponds to a change of splitting.
b.) Using the Seitz notation show that another choice of origin leads to the splitting R ↦ {(1 − R)v|R}, and verify that this is also a splitting. ♣This is almost redundant with another exercise above. ♣
158 Answer: Define the group G̃ := H/ι(ker(ρ)). Then note that we can define ι̃ : H̃ → G̃ via ι̃(ρ(h)) = ι(h) + ι(ker(ρ)). Note that if ρ(h) = ρ(h') then the RHS is the same, so this does give a well-defined map ι̃ on the image of ρ, but if ρ is surjective that is enough. Now define π̃(ẽ) = π(e) where ẽ = e + ι(ker ρ). Then we have a commutative diagram:

1 → H' --ι--> H --π--> H'' → 1
    |ρ        |ϕ        |Id
    ↓         ↓         ↓
1 → H̃ --ι̃--> G̃ --π̃--> H'' → 1     (15.36)

where ϕ(h) = h + ι(ker(ρ)).

Exercise Another form of splitting
Show that an equivalent definition of a split exact sequence for a central extension is
that there is a homomorphism t : G̃ → A which is a left-inverse to ι, t(ι(a)) = a.
(Hint: Define s(π(g̃)) = ι(t(g̃^{−1})) g̃.)

Exercise The exact sequence for a product of two cyclic groups


Revisit the exact sequence discussed in equation (12.52). Show that this sequence
splits. 159

Exercise Is A Restriction Of A Split Extension Split?


Suppose

1 → N --ι--> G --π--> Q → 1     (15.37)

is a split extension and N_1 ⊂ N and Q_1 ⊂ Q, so that there is an extension

1 → N_1 --ι--> G_1 --π--> Q_1 → 1     (15.38)

given by restriction to G_1 ⊂ G. Does it follow that this extension is a split extension?

Exercise A Split Central Extension Is A Direct Product


Suppose
1 → N --ι--> G --π--> Q → 1     (15.39)

is a split central extension. Show that G ≅ N × Q and the extension is isomorphic to
the trivial extension with ι inclusion into the first factor and π projection onto the second
factor. 160
159 Answer: Let ω_ℓ generate Z_ℓ; then, using the notation above, let s : ω_ℓ ↦ (ω_ℓ^{µ_2 ν_2}, ω_ℓ^{µ_1 ν_1}).
160
Answer : Choose a splitting s. Then use the parametrization g = ι(n)s(q). The group multiplication
can be written
g1 g2 = ι(n1 )s(q1 )ι(n2 )s(q2 )
= ι(n1 )ι(n2 )s(q1 )s(q2 ) (15.40)
= ι(n1 n2 )s(q1 q2 )
In going from the first to the second line we used that ι(n2 ) is in the center. In going from the second to
the third line we used that ι and s are group homomorphisms. (Recall s is a group homomorphism because
it is a splitting.)

15.2 Projective Representations
We have already encountered the notion of a matrix representation of a group G. This is
a homomorphism from G into GL(d, κ) for some field κ. In many contexts in mathematics
and physics (especially in quantum physics) one encounters a generalization of the notion
of matrix representation known as a projective representation. The theory of projective
representations is closely related to the theory of central extensions.
Recall that a matrix representation of a group G is a group homomorphism

ρ : G → GL(d, κ) (15.41)

A projective representation is a map

ρ : G → GL(d, κ) (15.42)

which is “almost a homomorphism” in the sense that

ρ(g_1) ρ(g_2) = f(g_1, g_2) ρ(g_1 g_2)     (15.43)

for some function f : G × G → κ∗ . Of course f (g1 , g2 ) is “just a c-number” so you might


think it is an unimportant nuisance. You might try to get rid of it by redefining

ρ̃(g) = b(g)ρ(g) (15.44)

where b(g) ∈ κ^* is a c-number. Then if there exists a function b : G → κ^* such that

f(g_1, g_2) \stackrel{?}{=} b(g_1 g_2) / ( b(g_1) b(g_2) )     (15.45)

then ρ̃ would be an honest representation.


The trouble is, in many important contexts, there is no function b so that (15.45)
holds. So we need to deal with it.
A simple example is the “spin representation of the rotation group SO(3),” where one attempts to define a map

ρ : SO(3) → SU(2) ⊂ GL(2, C)     (15.46)

that is meant to describe the effect of a rotation on, say, a spin 1/2 particle. In fact,
there is no such thing as the “spin representation of the rotation group SO(3).” There is
a spin projective representation of SO(3). 161
We saw above that there is a very natural group homomorphism from SU (2) to SO(3),
but there is no group homomorphism back from SO(3) to SU (2): There is no splitting.
The so-called “spin representation of SO(3)” is usually presented by attempting to con-
struct a splitting ρ : SO(3) → SU (2) using Euler angles. Indeed, under the standard
161
However, there is a perfectly well-defined spin representation of the Lie algebra so(3).

homomorphism π : SU(2) → SO(3) one recognizes that exp[iθσ^i] maps to a rotation by angle 2θ around the i-th axis. For example,

u = exp[iθσ^3] = \begin{pmatrix} e^{iθ} & 0 \\ 0 & e^{−iθ} \end{pmatrix} = cos θ + i sin θ σ^3     (15.47)

acts by

u (\vec{x}·\vec{σ}) u^{−1} = u \begin{pmatrix} z & x − iy \\ x + iy & −z \end{pmatrix} u^{−1} = \begin{pmatrix} z & e^{2iθ}(x − iy) \\ e^{−2iθ}(x + iy) & −z \end{pmatrix}     (15.48)

One can represent any rotation in SO(3) by a rotation around the z-axis, then around the x-axis, then around the z-axis. Call this R(φ, θ, ψ). So one attempts to define ρ by assigning

ρ : R(φ, θ, ψ) → e^{i(φ/2)σ^3} e^{i(θ/2)σ^1} e^{i(ψ/2)σ^3}.     (15.49)

Clearly, we are going to have problems making this mapping well-defined. For example, R(φ, 0, 0) would map to e^{i(φ/2)σ^3}, but this is not well-defined for all φ because R(2π, 0, 0) = 1 while e^{iπσ^3} = −1. The problem is that the Euler angle coordinates on SO(3) are sometimes singular. So, we need to restrict the domain of φ, θ, ψ so that (15.49) is well-defined for every R ∈ SO(3). However, when we make such a restriction we will spoil the group homomorphism property, but only up to a phase. So, in this way, we get a two-dimensional projective representation of SO(3).
As an exercise you can try the following: Every SU (2) matrix can be written as
u = cos(χ) + i sin(χ)n̂ · ~σ and this maps under π to a rotation by 2χ around the n̂ axis. But
again, you cannot smoothly identify every SO(3) rotation by describing it as a rotation by
2χ around an axis.

Exercise Projective representations of Z2 × Z2 and the Pauli group


Consider the group Z2 × Z2 multiplicatively:

Z2 × Z2 = {1, g1 , g2 , g1 g2 } (15.50)

with relations g_1^2 = g_2^2 = 1 and g_1 g_2 = g_2 g_1. Consider the map

ρ : Z2 × Z2 → GL(2, C) (15.51)

defined by

ρ(1) = 1
ρ(g1 ) = σ 1
(15.52)
ρ(g2 ) = σ 2
ρ(g1 g2 ) = σ 3

a.) Show that this defines a projective representation of Z2 × Z2 .
b.) Try to remove the phase to get a true representation.
c.) Show that ρ defines a section of an exact sequence with G given by the Pauli group.
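For part a.) a quick numerical check is easy (an illustrative numpy sketch of ours, not part of the exercise): compute ρ(g_1)ρ(g_2)ρ(g_1 g_2)^{-1} for all pairs and read off the phases f(g_1, g_2) of (15.43).

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
rho = {'1': np.eye(2, dtype=complex), 'g1': s1, 'g2': s2, 'g1g2': s3}   # the map (15.52)

def grp_mul(a, b):
    # Multiplication in Z2 x Z2 = {1, g1, g2, g1g2}
    if a == '1': return b
    if b == '1': return a
    if a == b: return '1'
    return ({'g1', 'g2', 'g1g2'} - {a, b}).pop()

for a in rho:
    for b in rho:
        prod, target = rho[a] @ rho[b], rho[grp_mul(a, b)]
        f = np.trace(prod @ np.linalg.inv(target)) / 2       # the scalar f(a, b) of (15.43)
        assert np.allclose(prod, f * target)
        print(f"f({a},{b}) =", np.round(f, 10))              # nontrivial phases +-i appear, so rho is only projective
```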

15.2.1 How projective representations arise in quantum mechanics


The following material, while very important, assumes knowledge of some of the linear
algebra from Chapter 2 and some familiarity with quantum mechanics. For further details
see Chapter **** below. The reader should also consult Section 2 of 162
Let us review, very briefly, the most essential points of quantum mechanics: The Dirac-von Neumann axioms of quantum mechanics posit that to a physical system we associate a complex Hilbert space H such that

1. Physical states are identified with trace-class positive operators ρ of trace one. They are usually called density matrices. We denote the space of physical states by S. 163

2. Physical observables are identified with self-adjoint operators. We denote the set of
(bounded) self-adjoint operators by O.

3. The Born rule states that when measuring the observable O in a state ρ the probability that the measured value lies in E ⊂ R, where E is a Borel-measurable subset of R, is

P_{ρ,O}(E) = Tr P_O(E) ρ.     (15.53)

Here P_O is the projection-valued measure associated to the self-adjoint operator O by the spectral theorem. For example, if O has a complete discrete spectrum {λ_i} of eigenvalues so that

O = ∑_i λ_i P(λ_i)     (15.54)

where P(λ_i) is the projection operator onto the eigenspace with eigenvalue λ_i, then

P_O(E) = ∑_{λ_i ∈ E} P(λ_i)     (15.55)

When the spectrum of O is more complicated, e.g. if there is a continuous spectrum,


one can still define PO (E), but the story is more involved. See Chapter 2, the Linear
Algebra User’s Manual.
162 http://www.physics.rutgers.edu/~gmoore/695Fall2013/CHAPTER1-QUANTUMSYMMETRY-OCT5.pdf
163 As explained in the Linear Algebra User's Manual, positive operators can be defined as operators A such that (ψ, Aψ) ≥ 0 for every ψ ∈ H. Such operators are always self-adjoint. Indeed, any operator A such that (ψ, Aψ) ∈ R for all ψ ∈ H is self-adjoint. To see this note that (ψ, Aψ)^* = (ψ, Aψ) and hence (ψ, Aψ) = (ψ, A^†ψ). Now apply this equation to ψ_1 + zψ_2 for z = 1 and z = i and combine the resulting equations to deduce that (ψ_1, Aψ_2) = (ψ_1, A^†ψ_2) for all pairs ψ_1, ψ_2 ∈ H. Choose an ON basis for H to deduce that A = A^†.

4. There are further axioms regarding time-development, and so on, but the above is
all we need for the present discussion.

Given this setup, the natural notion of a general “symmetry” in quantum mechanics is the following:

Definition An automorphism of a quantum system is a pair of bijective maps s_1 : S → S and s_2 : O → O, where s_2 is real-linear on O, such that (s_1, s_2) preserves probability measures:

P_{s_1(ρ), s_2(O)} = P_{ρ,O}     (15.56)

This set of mappings forms a group which we will call the group of quantum automorphisms. ♣Need to state some appropriate continuity properties. ♣
While this is conceptually straightforward, it is an unwieldy notion of symmetry. We will now simplify it considerably, ending up with the crucial theorem known as Wigner's theorem.
We begin by noting that the space of density matrices is a convex set. The convexity
means that if ρ1 , ρ2 are density matrices then for all 0 ≤ t ≤ 1

tρ1 + (1 − t)ρ2 (15.57)

is a density matrix. Given a convex set one defines an extremal point to be a point in the
set which cannot be written in the above form with 0 < t < 1. By definition, the pure
states are the extremal points of S. The pure states are just the dimension one projection
operators.
Pure states are often referred to in the physics literature as “rays in Hilbert space” for
the following reason:
If ψ ∈ H is a nonzero vector then it determines a line

`ψ := {zψ|z ∈ C} := ψC (15.58)

Note that the line does not depend on the normalization or phase of ψ, that is, `ψ = `zψ
for any nonzero complex number z. Put differently, the space of such lines is projective
Hilbert space
PH := (H − {0})/C∗ (15.59)

Equivalently, this can be identified with the space of rank one projection operators. Indeed,
given any line ` ⊂ H we can write, in Dirac’s bra-ket notation: 164

Pℓ = |ψ⟩⟨ψ| / ⟨ψ|ψ⟩    (15.60)
164
We generally denote inner products in Hilbert space by (x1 , x2 ) ∈ C where x1 , x2 ∈ H. Our convention
is that it is complex-linear in the second argument. However, we sometimes write equations in Dirac’s
bra-ket notation because it is very popular. In this case, identify x with |xi. Using the Hermitian structure
there is a unique anti-linear isomorphism of H with H∗ which we denote x 7→ hx|. Sometimes we denote
vectors by Greek letters ψ, χ, . . . , and scalars by Latin letters z, w, . . . . But sometimes we denote vectors
by Latin letters, x, w, . . . and scalars by Greek letters, α, β, . . . .

where ψ is any nonzero vector in the line `.
It is possible to argue (see the above reference) that such a symmetry maps pure states
to pure states, and is completely determined by this map. So we can view the group of
quantum automorphisms as the group of transformations of one-dimensional projection
operators (or rays) that preserves overlaps

Tr(P1 P2 ) = |⟨ψ1 |ψ2 ⟩|² / ( ‖ψ1 ‖² ‖ψ2 ‖² )    (15.61)

The group of such transformations is denoted Aut(QM ). Any symmetry of a quantum


mechanical system will define a subgroup of this group.

Example: Let us consider the case of a single Qbit, namely H = C2 . First we write the most general density matrix. Any 2 × 2 Hermitian matrix is of the form a + ~b · ~σ
where ~σ is the vector of “Pauli matrices”:
σ 1 = ( 0 1 ; 1 0 )
σ 2 = ( 0 −i ; i 0 )    (15.62)
σ 3 = ( 1 0 ; 0 −1 )

where a ∈ R and ~b ∈ R3 . Now a density matrix ρ must have trace one, and therefore a = 1/2. Then the eigenvalues are 1/2 ± |~b|, so positivity means it must have the form

ρ = (1/2)(1 + ~x · ~σ )    (15.63)
where ~x ∈ R3 with ~x2 ≤ 1.
The extremal states, corresponding to the rank one projection operators are therefore
of the form
P~n = (1/2)(1 + ~n · ~σ )    (15.64)
where ~n is a unit vector. This gives the explicit identification of the pure states with
elements of S 2 . Moreover, we can easily compute:
Tr P~n1 P~n2 = (1/2)(1 + ~n1 · ~n2 )    (15.65)
and ~n1 · ~n2 = cos(θ1 − θ2 ) where |θ1 − θ2 | (with θ’s chosen so this is between 0 and π) is
the geodesic distance between the two points on the unit sphere.
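As a quick numerical sanity check of (15.64)-(15.65), here is a minimal sketch (assuming only numpy; the random Bloch vectors are arbitrary choices made here):

import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def projector(n):
    """P_n = (1/2)(1 + n . sigma) for a unit vector n, as in (15.64)."""
    return 0.5 * (np.eye(2, dtype=complex) + sum(n[i] * sig[i] for i in range(3)))

rng = np.random.default_rng(0)
for _ in range(5):
    n1, n2 = (v / np.linalg.norm(v) for v in rng.normal(size=(2, 3)))
    overlap = np.trace(projector(n1) @ projector(n2)).real
    assert np.isclose(overlap, 0.5 * (1 + n1 @ n2))   # this is (15.65)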
There is another viewpoint which is useful. Nonzero vectors in C2 can be normalized
to be in the unit sphere S 3 . Then the association of projector to state given by
|ψ⟩ → |ψ⟩⟨ψ| = (1/2)(1 + ~n · ~σ )    (15.66)
defines a map π : S 3 → S 2 known as the Hopf fibration.
The unit sphere is a principal homogeneous space for SU (2) and we may coordinatize
SU (2) by the Euler angles:
u = e^{−i(φ/2)σ 3} e^{−i(θ/2)σ 2} e^{−i(ψ/2)σ 3}    (15.67)
with range 0 ≤ θ ≤ π and identifications:

(φ, ψ) ∼ (φ + 4π, ψ) ∼ (φ, ψ + 4π) ∼ (φ + 2π, ψ + 2π) (15.68)

We can make an identification with the unit sphere in C2 by viewing it as a homogeneous


space:
space:
ψ = ( e^{−i(ψ+φ)/2} cos(θ/2) ; e^{−i(ψ−φ)/2} sin(θ/2) ) = u · ( 1 ; 0 )    (15.69)
The projector onto the line through this vector is

Pℓψ = |ψ⟩⟨ψ| = (1/2)(1 + ~n · ~σ )    (15.70)
with ~n = (sin θ cos φ, sin θ sin φ, cos θ) as usual. Alternatively, we could map π : S 3 → S 2 by π(ψ) = [ψ1 : ψ2 ] ∈ CP 1 ∼= S 2 , and this corresponds to a point in S 2 via the usual stereographic projection. ♣from which pole? ♣

In any case, for the case H = C2 we see that Autqtm (PH) is just the group of isometries
of S 2 with its round metric. This group is well known to be the orthogonal group O(3).
In general for H = CN +1 the space of pure states is CPN . This space has a natural
homogeneous metric known as the Fubini-Study metric. When it is suitably normalized
the overlap function o is nicely related to the Fubini-Study distance d by
o(ℓ1 , ℓ2 ) = cos² ( d(ℓ1 , ℓ2 )/2 )    (15.71)
Now, it is hard to work with the space of one-dimensional projection operators since it
is not a linear space: The sum of two one-dimensional projection operators is typically not
even proportional to a projection operator. It would be much nicer to work with operators
acting on Hilbert space. A fundamental theorem of quantum mechanics known as Wigner’s
theorem states that Aut(QM ) is indeed a quotient of a certain group of operators on a
Hilbert space. This group, denoted Aut(H), is the group of the norm-preserving unitary
and anti-unitary operators on Hilbert space. We now explain a bit more about Aut(H).
A unitary operator on H is a C-linear operator u : H → H that preserves norms:

‖ uψ ‖ = ‖ ψ ‖    (15.72)

An anti-unitary operator on H is a C-anti-linear operator 165 a : H → H that preserves


norms:
‖ aψ ‖ = ‖ ψ ‖    (15.73)
165
See Chapter 2, the Linear Algebra User’s Manual, for more on this. Briefly this means a(ψ1 + ψ2 ) =
a(ψ1 ) + a(ψ2 ) for any two vectors but a(zψ) = z ∗ a(ψ) for any scalar z ∈ C. Note that we must then define
the adjoint by ha† ψ1 , ψ2 i := hψ1 , aψ2 i∗ in a convention where the sesquilinear form is C-antilinear in the
first argument and linear in the second.

The composition of unitary operators is clearly unitary. The composition of unitary and
antiunitary is antiunitary, and the composition of antiunitaries is unitary so we have an
exact sequence
φ
1 → U (H) → Aut(H)−→Z2 → 1 (15.74)

where
φ(g) = +1 if g is unitary, and φ(g) = −1 if g is anti-unitary.    (15.75)

There is a homomorphism π : Aut(H) → Aut(QM ) defined by

π(u) : P 7→ uP u† (15.76)

for u ∈ U (H) and similarly


π(a) : P 7→ aP a† (15.77)

for anti-unitary operators a. See the footnote above for the definition of the adjoint of an
anti-unitary operator. In both cases u−1 = u† and a−1 = a† and these operations preserve
the overlap function.
The fiber of the map π can be thought of as possible c-number phases which can
multiply the operator on Hilbert space representing a symmetry operation:
π
1 → U (1) → Aut(H) −→Aut(QM ) → 1 (15.78)

Here the U (1) is the group of phases acting on quantum states: ψ → zψ for z ∈ U (1).
The upshot is that, given a classical symmetry group G of a physical system, for each
g ∈ G we can associate a unitary, or antiunitary, operator U (g) acting on a Hilbert space.
Quantum mechanics only guarantees that

U (g1 )U (g2 ) = c(g1 , g2 )U (g1 g2 ) (15.79)

for some phase factor c(g1 , g2 ), which, in general, cannot be removed by a redefinition of
U (g) by a phase Ũ (g) = b(g)U (g). So we have a projective representation of the classical
symmetry G.
Put differently: In a given physical system we will not consider the full group Aut(QM ) as a group of symmetries, because only part of it will be compatible with the dynamics. For example, the time-dynamics of a nonrelativistic quantum system is governed by the Schrödinger equation:

iℏ ∂ψ/∂t = Hψ    (15.80)
where H is a self-adjoint operator, the Hamiltonian. For time-independent Hamiltonians
the unitary time evolution is governed by

U (t) = exp[−i t H/ℏ]    (15.81)
Only a subgroup of Aut(QM ) will commute with the resultant flows on the space of states.
If we think of G as embedded in Aut(QM ) then we have

                                     G
                                     ↓ ρ                        (15.82)
1 → U (1) →ι Aut(H) →π Aut(QM ) → 1

The group which acts on the Hilbert space will be a central extension of G given by the
pullback construction.
One thing we can note is that finite dimensional matrices are always associative! 166
So for all g1 , g2 , g3 ∈ G

(U (g1 )U (g2 ))U (g3 ) = U (g1 )(U (g2 )U (g3 )) (15.83)

and hence
c(g1 , g2 )c(g1 g2 , g3 ) = c(g1 , g2 g3 )c(g2 , g3 ) (15.84)
We note that projective representations are quite pervasive in modern physics:

1. Projective representations appear naturally in quantization of bosons and fermions.


The Heisenberg group is an extension of a translation group (on phase space). In
addition, the symplectic group of linear canonical transformations gets quantum me-
chanically modified by a central extension to define something called the metaplectic
group.

2. Projective representations are important in the theory of anomalies in quantum field


theory.

3. Projective representations are very important in conformal field theory. The Virasoro
group, and the Kac-Moody groups are both nontrivial central extensions of simpler
objects.

Now, as we will explain near the end of section 15.3 below projective representations
are very closely connected to central extensions. So in the next section we turn to a deeper
investigation into the structure of central extensions.

Remark: The fact that the symmetry operators U (g) should commute with the Hamil-
tonian has an important implication. Suppose that H has a complete set of eigenvectors
Ψλ where λ ∈ Spec(H) is a discrete set of eigenvalues. Then we can restrict U (g) to the
different eigenspaces Hλ of H, for if

Hψλ = Eλ ψλ (15.85)

and U (g) commutes with H then U (g)ψλ also is an eigenvector of eigenvalue Eλ . This means that the eigenspaces Hλ are each projective representations of G. This can be extremely useful in diagonalizing H and simplifying computations. ♣Need to work in some examples ♣
166
And the same holds for linear operators on infinite-dimensional Hilbert spaces provided the domains
are such that the composition of the three operators is well-defined.

Exercise Kernel Of π In The Wigner Sequence
Show that the kernel of π in the exact sequence (15.78) is precisely the group of phases
times the identity operator. 167

15.3 How To Classify Central Extensions


There is an interesting way to classify central extensions of G by A.
As before let s : G → G̃ be a “section” of π. That is, a map such that

π(s(g)) = g ∀g ∈ G (15.86)
As we have stressed, in general s is not a homomorphism. In the case when the
sequence splits, that is, when there exists a section which is a homomorphism, then we can
say G̃ is isomorphic to a direct product G̃ ∼= A × G via

ι(a)s(g) → (a, g) (15.87)


When the sequence splits the semidirect product of the previous section is a direct
product because A is central, so ωg (a) = a.
Now, let us allow that (15.22) does not necessarily split. Let us choose any section
s and measure by how much s differs from being a homomorphism by considering the
combination:
s(g1 )s(g2 ) (s(g1 g2 ))−1 . (15.88)
Now the quantity (15.88) is in the kernel of π and hence in the image of ι. Since ι is
injective we can define a function fs : G × G → A by the equation

ι(fs (g1 , g2 )) := s(g1 )s(g2 ) (s(g1 g2 ))−1 . (15.89)

That is, we can write:


s(g1 )s(g2 ) = ι(fs (g1 , g2 ))s(g1 g2 ) (15.90)
The function fs satisfies the important cocycle identity

f (g2 , g3 )f (g1 , g2 g3 ) = f (g1 , g2 )f (g1 g2 , g3 ) (15.91)

Exercise Derivation Of The Cocycle Identity


Verify (15.91) by using (15.89) to compute s(g1 g2 g3 ) in two different ways.
167
Answer : Suppose uP = P u for every rank one projection operator. Consider the projection operator
for the line through ψ1 + zψ2 for any two nonzero vectors ψ1 , ψ2 ∈ H. Applying the condition for the cases

z = 1 and z = −1 deduce that u commutes with |ψ1 ihψ2 |. Thus, choosing an ON basis ei it commutes
with |ei ⟩⟨ej | and therefore must be proportional to the identity matrix. On the other hand, it also must preserve norms, so it is multiplication by a phase.
(Note that simply substituting (15.89) into (15.91) is not obviously going to work
because G̃ need not be abelian.)

Exercise Simple consequences of the cocycle identity


a.) By putting g1 = 1 and then g3 = 1 show that any cocycle f must satisfy:

f (g, 1) = f (1, g) = f (1, 1) ∀g ∈ G (15.92)


b.) Show that 168

f (g, g −1 ) = f (g −1 , g). (15.93)

Now we introduce some fancy terminology:

Definition: In general
1. A 2-cochain on G with values in A is a function

f :G×G→A (15.94)

We denote the set of all such 2-cochains by C 2 (G, A).


2. A 2-cocycle is a 2-cochain f : G × G → A satisfying (15.91). We denote the set of
all such 2-cocycles by Z 2 (G, A).

Remarks:

1. The fancy terminology is introduced for a good reason because there is a topological
space and a cohomology theory underlying this discussion. See Section §15.8 and
Section §17.2 for further discussion.

2. Note that C 2 (G, A) is naturally an abelian group because A is an abelian group.


(Recall example 2.7 of Section §2.) Z 2 (G, A) inherits an abelian group structure
from C 2 (G, A).

So, in this language, given a central extension of G by A and a section s we naturally


obtain a two-cocycle fs ∈ Z 2 (G, A) via (15.89).
Now, if we choose a different section ŝ then 169

ŝ(g) = ι(t(g))s(g) (15.95)

168 Answer : Consider the triple g · g −1 · g and apply part (a).
169 Since we are working with central extensions we could put the ι(t(g)) on either side of the s(g). However, when we discuss non-central extensions later the order will matter.
for some function t : G → A. It is easy to check that

fŝ (g1 , g2 ) = fs (g1 , g2 )t(g1 )t(g2 )t(g1 g2 )−1 (15.96)

where we have used that ι(A) is central in G̃.

Definition: In general two 2-cochains f and fˆ are said to differ by a coboundary if there
exists a function t : G → A such that

fˆ(g1 , g2 ) = f (g1 , g2 )t(g1 )t(g2 )t(g1 g2 )−1 (15.97)

for all g1 , g2 ∈ G.
One can readily check, using the condition that A is Abelian, that if f is a cocycle
then any other fˆ differing by a coboundary is also a cocycle. Moreover, being related by a coboundary defines an equivalence relation on the set of cocycles, f ∼ fˆ. Thus, we may define:

Definition: The group cohomology H 2 (G, A) is the set of equivalence classes of 2-cocycles
modulo equivalence by coboundaries.

Now, the beautiful theorem states that group cohomology classifies central extensions:
170

Theorem: Isomorphism classes of central extensions of G by an abelian group A are in


1-1 correspondence with the second cohomology set H 2 (G, A).

Proof : Let Ext(G, A) denote the set of all central extensions of G by A, and let E(G, A) denote the set of all isomorphism classes of extensions of G by A.
We first construct a map:

ΨE→H : Ext(G, A) → H 2 (G, A)    (15.98)

To do this, we choose a section, then from (15.89)(15.91)(15.96) we learn that we get a


cocycle whose cohomology class does not depend on the section. So
 
ΨE→H 1 → A → G̃ → G → 1 = [fs ] (15.99)

is well-defined, because the RHS does not depend on the choice of section s.
Now we claim that this map descends to a map ΨE→H : E(G, A) → H 2 (G, A). Indeed,
if we have an isomorphism of central extensions:

1 → A →ι G̃ →π G → 1
     ↓Id      ↓ψ      ↓Id                          (15.100)
1 → A →ι′ G̃′ →π′ G → 1

where ψ : G̃ → G̃′ is an isomorphism such that the inverse also leads to a commutative diagram, then ψ can be used to map sections of π : G̃ → G to sections of π′ : G̃′ → G by s 7→ s′ where s′(g) = ψ(s(g)). ♣where do we use this condition in the proof? ♣ Then

s′(g1 )s′(g2 ) = ψ(s(g1 ))ψ(s(g2 ))
             = ψ(s(g1 )s(g2 ))
             = ψ(ι(fs (g1 , g2 ))s(g1 g2 ))    (15.101)
             = ψ(ι(fs (g1 , g2 )))ψ(s(g1 g2 ))
             = ι′(fs (g1 , g2 ))s′(g1 g2 )

and hence we assign precisely the same 2-cocycle fs (g1 , g2 ) to the section s′. Hence the
map ΨE→H only depends on the isomorphism class of the extension. This defines the map
ΨE→H .
Conversely, we can define a map ΨH→E : Z 2 (G, A) → Ext(G, A) as follows: Given a
cocycle f ∈ Z 2 (G, A) we may define G̃ = A × G as a set and we use f to define the
multiplication law:
(a1 , g1 )(a2 , g2 ) := (a1 a2 f (g1 , g2 ), g1 g2 ) (15.102)
You should check that this does define a valid group multiplication: The associativity
follows from the cocycle relation. Note that if we use the trivial cocycle: f (g1 , g2 ) = 1 for
all g1 , g2 ∈ G then we just get the direct product of groups.
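The construction is easy to experiment with. Here is a brute-force sketch (with the cyclic groups written additively as Z/n, a convention chosen here for convenience) that builds A ×f G with the twisted product (15.102) and checks associativity, which, as noted, is exactly the cocycle identity:

from itertools import product

def twisted_group(nA, nG, f):
    """Elements (a, g) with a in Z/nA, g in Z/nG and product
    (a1, g1)(a2, g2) = (a1 + a2 + f(g1, g2), g1 + g2)."""
    elems = list(product(range(nA), range(nG)))
    def mul(x, y):
        return ((x[0] + y[0] + f(x[1], y[1])) % nA, (x[1] + y[1]) % nG)
    return elems, mul

def is_associative(elems, mul):
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x in elems for y in elems for z in elems)

# Trivial cocycle: the direct product Z/2 x Z/2.
elems, mul = twisted_group(2, 2, lambda g1, g2: 0)
assert is_associative(elems, mul)

# The nontrivial cocycle of Example 1 below, f(sigma, sigma) = sigma, i.e. f(1, 1) = 1:
elems, mul = twisted_group(2, 2, lambda g1, g2: 1 if (g1 == 1 and g2 == 1) else 0)
assert is_associative(elems, mul)   # associativity <=> the cocycle identity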
Now suppose that we use two 2-cocycles f and f 0 which are related by a coboundary
as in (15.97) above. Then we claim that the map ψ : G̃ → G̃0 defined by

ψ : (a, g) → (at(g)−1 , g) (15.103)

is an isomorphism of central extensions as in (15.100). This means that the map ΨH→E :
Z 2 (G, A) → E(G, A) actually descends to a well-defined map

ΨH→E : H 2 (G, A) → E(G, A) (15.104)

We leave it to the reader to check that ΨH→E and ΨE→H are inverse maps. ♠

Remarks:

1. Central extensions and projective representations. A very important consequence of


the construction (15.102) is that, if we are given a projective representation of G then we can associate a centrally extended group G̃:

1 → U (1) → G̃ → G → 1    (15.105)

and a true representation ρ̃ of G̃:

ρ̃(z, g) := zρ(g)    (15.106)


170
In fact, both the set of isomorphism classes of extensions and H 2 (G, A) are Abelian groups and a
stronger statement is that the 1-1 correspondence described here is an isomorphism of Abelian groups.

The evil failure of ρ(g) to be a true representation of G now becomes a virtuous fact that allows ρ̃ to be a true representation of G̃. This is the typical situation in quantum mechanics, where G is a group of classical symmetries and G̃ is the group that is implemented quantum-mechanically. A good example is the spin-1/2 system where G = SO(3) is the classical group of rotations but for a quantum rotor the proper symmetry group is G̃ = SU (2). There are many, many other examples.

2. Group Structure on E(G, A). The set H 2 (G, A) carries a natural structure of an
Abelian group. Indeed, as we remarked above C 2 (G, A), being a set of maps with
target space a group, A, is naturally a group. Then, because A is Abelian, we can
define a group structure on Z 2 (G, A) by the rule:

(f1 · f2 )(g, g 0 ) = f1 (g, g 0 ) · f2 (g, g 0 ) (15.107)

where we are writing the product in A multiplicatively. ♣It might be clearer to write A additively... ♣ Again using the fact that A is abelian this descends to a well-defined multiplication on cohomology classes: [f1 ] · [f2 ] := [f1 · f2 ]. Therefore H 2 (G, A) itself is an abelian group. The identity
element corresponds to the cohomology class of the trivializable cocycles, which in
turn corresponds to the split extension A × G.
It is natural to ask whether one can give a more canonical description of the abelian
group structure on the set of equivalence classes of central extensions of G by A.
Indeed we can: We pull back the Cartesian product to the diagonal of G × G and
then push forward by the multiplication map µ : A × A → A. That is, suppose we
have two central extensions:
E1 : 1 → A →ι1 G̃1 →π1 G → 1    (15.108)
E2 : 1 → A →ι2 G̃2 →π2 G → 1    (15.109)
The Cartesian product E1 × E2 is the extension of G × G by A × A using the group
G̃1 × G̃2 with the Cartesian product of the group homomorphisms. We want an
extension of G by A, corresponding, under the 1-1 correspondence of the above
theorem to the natural group structure on H 2 (G, A). To construct it, let

∆:G→G×G (15.110)

be the diagonal homomorphism: ∆ : g 7→ (g, g). Then we claim that the product
extension E1 · E2 can be identified as

E1 · E2 = µ∗ ∆∗ (E1 × E2 ) (15.111)

where ∆∗ (E1 × E2 ) is the pull-back extension under ∆ (see equation (15.34)), an


extension of G by A × A, and µ∗ is the pushforward extension. In concrete terms the
pullback extension under ∆∗ is:
1 → A × A →(ι1 ,ι2 ) Ĝ12 →π12 G → 1    (15.112)

where
Ĝ12 := {(g̃1 , g̃2 ) | π1 (g̃1 ) = π2 (g̃2 )} ⊂ G̃1 × G̃2    (15.113)

We can define π12 (g̃1 , g̃2 ) := π1 (g̃1 ) = π2 (g̃2 ). Now consider the “anti-diagonal”

Aanti := ker(µ) = {(a, a−1 )} ⊂ A × A (15.114)

and its image:


N := {(ι1 (a), ι2 (a−1 )) | a ∈ A} ⊂ Ĝ12    (15.115)
Because we are working with central extensions this will be a normal subgroup. Then
we let
G̃12 := Ĝ12 /N    (15.116)
Since N is in the kernel of π12 and since it is central the homomorphism π12 descends
to a surjective homomorphism which we will also call π12 : G̃12 → G. Now we have
an exact sequence
1 → A →ι12 G̃12 →π12 G → 1    (15.117)
where ι12 (a) := [(ι1 (a), ι2 (1))] = [(ι1 (1), ι2 (a))]. Given sections s1 , s2 of π1 , π2 re-
spectively we can define a section s12 (g) := [(s1 (g), s2 (g))] and one can check that
the resulting cocycle is indeed in the cohomology class of fs1 · fs2 . The extension
(15.117) represents the product of extensions (15.108) and (15.109). The point of
this construction is that it is canonical: We did not make any choices of sections to
define the product extension.

3. Trivial vs. Trivializable. Above we defined the trivial cocycle to be the one with
f (g1 , g2 ) = 1A for all g1 , g2 . We define a cocycle to be trivializable if it is cohomologous
to the trivial cocycle. Note that a trivializable cocycle f could be trivialized in
multiple ways. Suppose both b and b̃ trivialize f . Then you should show that b̃ and
b “differ” by a group homomorphism φ : G → A in the sense that

b̃(g) = φ(g)b(g) (15.118)

There are situations where a cohomological obstruction vanishes and the choice of trivialization has physical significance. ♣Should give some examples: H 3 is obstruction to orbifolding CFT and choice of trivialization is H 2 - hence discrete torsion. There are bundle examples. Find example where class in H 2 is zero but trivialization has physical meaning. ♣

4. An analogy to gauge theory: Changing a cocycle by a coboundary is strongly analogous to making a gauge transformation in a gauge theory. In Maxwell’s theory we can make a change of gauge of the vector potential Aµ by

A′µ (x) = Aµ (x) − i g −1 (x)∂µ g(x)    (15.119)

where g(x) = e^{iε(x)} is a function on spacetime valued in U (1). In the case of electro-

magnetism we would say that Aµ is trivializable if there is a gauge transformation


g(x) that simplifies it to 0. (For valid gauge transformations g(x) must be a single-
valued function on spacetime.) If we are presented with Aµ (x) and we want to know
if it is trivializable then we should check whether gauge invariant quantities vanish.

One such quantity is the field strength tensor Fµν := ∂µ Aν − ∂ν Aµ , but this is not a complete gauge invariant. The isomorphism class of a field is completely specified by the holonomies exp(i ∮γ A) around all the closed cycles γ in spacetime. Even when Aµ (x) is not trivializable, it is very often useful to use gauge transformations to try to simplify Aµ . In the next remark we do the same for cocycles.

5. Simplifying Cocycles Using Coboundaries. Using a coboundary one can usefully sim-
plify cocycles. Since this topic will be unfamiliar to some readers we explain this in
excruciating detail. Those who are familiar with cohomology can safely skip the rest
of this remark. To begin, note that a coboundary modification takes a cochain f to
f (1) satisfying:

f (1) (1, 1) = f (1, 1) t(1)t(1)/t(1 · 1) = f (1, 1) t(1)    (15.120)
so by choosing any function t such that t(1) = f (1, 1)−1 we get a new cochain
satisfying f (1) (1, 1) = 1. Choose any such function. (The simplest thing to do is set
t(g) = 1 for all other g 6= 1. We will make this choice, but it is really not necessary.)
Now recall that if f is a cocycle then a modification of f by any coboundary produces
a new cochain f (1) that is also a cocycle. So now, if f is a cocycle and we have set
f (1) (1, 1) = 1 then, by (15.92) we have f (1) (g, 1) = f (1) (1, g) = 1 for all g. Now, we
can continue to make modifications by coboundaries to simplify further our cocycle
f (1) . In order not to undo what we have done we require that the new coboundaries
we use, say, t̃ satisfy t̃(1) = 1. We may say that we are “partially choosing a gauge”
by choosing representatives so that f (1) (1, 1) = 1 and then the further coboundaries
t̃ must “preserve that gauge.” Now suppose that g 6= 1. Then (using our particular
choice of t above):

f (1) (g, g −1 ) = f (g, g −1 ) · 1/t(1) = f (g, g −1 )f (1, 1)    (15.121)

is not particularly special. (Remember that we are making the somewhat arbitrary
choice that t(g) = 1 for g 6= 1.) So we have not simplified these quantities. However,
we still have plenty of gauge freedom left and we can try to simplify the values as
follows: Suppose, first, that g 6= g −1 , equivalently, suppose g 2 6= 1 so g is not an
involution. Then we can make another “gauge transformation” by a coboundary
function t̃ to produce:

f (2) (g, g −1 ) = f (1) (g, g −1 ) t̃(g)t̃(g −1 )/t̃(g · g −1 ) = f (1) (g, g −1 ) t̃(g)t̃(g −1 )    (15.122)

where in the second equality we used the “gauge-preserving” property that t̃(1) = 1.
Now, in any way you like, divide the non-involution elements of G into two disjoint
sets S1 q S2 so that no two group elements in S1 are related by g → g −1 . Then, if
g ∈ S2 we have g −1 ∈ S1 and vice versa. Then we can choose a function t̃ so that for
every g ∈ S2 we have
t̃(g) = (t̃(g −1 ))−1 (f (1) (g, g −1 ))−1 (15.123)

Consequently:
f (2) (g, g −1 ) = 1 ∀g ∈ S2 (15.124)
It doesn’t really matter what we choose for t̃ on S1 . For definiteness we choose it to
be = 1. But if we had made another choice the above procedure would still lead to
equation (15.124). Now recall from (15.93) that any cocycle f satisfies f (g, g −1 ) =
f (g −1 , g) for all g. Since f (2) is a cocycle (if we started with a cocycle f ) then we
conclude that for all the non-involutions:

f (2) (g, g −1 ) = f (2) (g −1 , g) = 1 ∀g ∈ S1 q S2 (15.125)

Note that there is still a lot of “gauge freedom”: We have not yet constrained t̃(g) for
g ∈ S1 , nor have we constrained t̃(g) for the involutions, that is, the group elements
g with g 2 = 1. What can we say about f (2) (g, g) for g an involution? We have

f (2) (g, g) = f (1) (g, g) t̃(g)²/t̃(g²) = f (1) (g, g)(t̃(g))²    (15.126)

Now, it might, or might not be the case that f (1) (g, g) is a perfect square in the
group. If it is not a perfect square then we are out of luck: We cannot make any
further gauge transformations to set f (2) (g, g) = 1. Now one can indeed check that
the property of f (g, g) being a perfect square, or not, for an involution g is a truly
“gauge invariant” condition. Therefore we have proven: If f (g, g) is not a perfect
square for some nontrivial involution g then we know that f is not “gauge equivalent”
- that is, is not cohomologous to - the trivial cocycle. That is, [f ] is a nontrivial
cohomology class. Such cocycles will define nontrivial central extensions.

Example 1 . Extensions of Z2 by Z2 . WLOG we can take f (1, 1) = f (1, σ) = f (σ, 1) = 1.


Then we have two choices: f (σ, σ) = 1 or f (σ, σ) = σ. Each of these choices satisfies the
cocycle identity and they are not related by a coboundary. Indeed σ is an involution and
also σ is not a perfect square, so by our discussion above a cocycle with f (σ, σ) = σ cannot
be gauged to the trivial cocycle. In other words H 2 (Z2 , Z2 ) = Z2 . For the choice f = 1 we
obtain G̃ = Z2 × Z2 . For the nontrivial choice f (σ, σ) = σ we obtain G̃ ∼ = Z4 . Let us see
this in detail. We’ll let σ1 ∈ A ∼= Z2 and σ2 ∈ G ∼= Z2 be the nontrivial elements, so we should write f (σ2 , σ2 ) = σ1 . Note that (σ1 , 1) has order 2, but then

(1, σ2 ) · (1, σ2 ) = (f (σ2 , σ2 ), 1) = (σ1 , 1) (15.127)


shows that (1, σ2 ) has order 4. Moreover (σ1 , σ2 ) = (σ1 , 1)(1, σ2 ) = (1, σ2 )(σ1 , 1).
Thus,

Ψ : (σ1 , 1) → ω 2 = −1
(15.128)
Ψ : (1, σ2 ) → ω

where ω is a primitive 4th root of 1 defines an isomorphism with the group of fourth roots
of unity. In conclusion, the nontrivial central extension of Z2 by Z2 is:

1 → Z2 → Z4 → Z2 → 1 (15.129)

Recall that Z4 is not isomorphic to Z2 × Z2 . The square of this extension is the trivial
extension.
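For this small example the counting can also be verified by brute force. The following sketch (group and coefficients written additively; purely illustrative) enumerates all 2-cochains on Z2 with values in Z2, keeps the cocycles, and counts cohomology classes:

from itertools import product

G = [0, 1]   # Z2, written additively
A = [0, 1]   # Z2, written additively

def is_cocycle(f):
    return all((f[(g2, g3)] + f[(g1, (g2 + g3) % 2)]
                - f[(g1, g2)] - f[((g1 + g2) % 2, g3)]) % 2 == 0
               for g1 in G for g2 in G for g3 in G)

cochains = [dict(zip(list(product(G, G)), vals)) for vals in product(A, repeat=4)]
cocycles = [f for f in cochains if is_cocycle(f)]

def coboundary(t):
    return {(g1, g2): (t[g1] + t[g2] - t[(g1 + g2) % 2]) % 2 for g1 in G for g2 in G}

coboundaries = [coboundary(dict(zip(G, vals))) for vals in product(A, repeat=2)]

classes = {frozenset(tuple(sorted({k: (f[k] + b[k]) % 2 for k in f}.items()))
                     for b in coboundaries)
           for f in cocycles}
print(len(cocycles), "cocycles,", len(classes), "classes")   # 4 cocycles, 2 classes
# Building the twisted product (15.102) from the nontrivial class one finds an
# element of order 4, so the corresponding extension is indeed Z4.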

Example 2. Extensions of Zp by Zp . The generalization of the previous example to the


extension of Zp by Zp for an odd prime p is extremely instructive. So, let us study in detail
the extensions
1 → Zp → G → Zp → 1 (15.130)

In this example we will write our cyclic groups multiplicatively. Now, using methods of
topology one can show that 171
H 2 (Zp , Zp ) ∼
= Zp . (15.131)

The result (15.131) should puzzle you. After all we know that G must be a group of
order p2 , and we know from the class equation and Sylow’s theorems that there are exactly
two groups of order p2 , up to isomorphism! How is that compatible with the p distinct
extensions predicted by equation (15.131) !? The answer is that there can be nonisomorphic
extensions (15.22) involving the same group G̃. Let us see how this works in the present
example by examining in detail the possible extensions:
1 → Zp →ι Zp2 →π Zp → 1    (15.132)

We write the first, second and third groups in this sequence as

Zp = ⟨σ1 | σ1^p = 1⟩
Zp2 = ⟨α | α^{p²} = 1⟩    (15.133)
Zp = ⟨σ2 | σ2^p = 1⟩

respectively.
For the injection ι we have
ι(σ1 ) = αx (15.134)

for some x. For this to be a well-defined homomorphism we must have

ι(1) = ι(σ1p ) = αpx = 1 (15.135)

and therefore px = 0 mod p2 and therefore x = 0 mod p. But since ι must be an injection
it must be of the form
ιk (σ1 ) := αkp (15.136)
171
You can also show it by examining the cocycle equation directly. We will write down the nontrivial
cocycles presently.

where k is relatively prime to p. We can take

1≤k ≤p−1 (15.137)

or (preferably) we can regard k ∈ Z∗p .


Similarly, for π we must have π(α) = σ2y for some y. Now, since π has to be surjective,
σ2y must be a generator and hence π must be of the form

πr (α) = σ2r 1≤r ≤p−1 (15.138)

where again we should really regard r as an element of Z∗p .


Note that the kernel of πr is the set of elements α^ℓ with σ2^{ℓr} = 1. This implies ℓr = 0 mod p and therefore ℓ = 0 mod p, so

ker(πr ) = {1, α^p , α^{2p} , . . . , α^{(p−1)p} }    (15.139)

Since k ∈ Z∗p we have


ker(πr ) = im(ιk ) (15.140)
so our sequence is exact for any choice of r, k ∈ Z∗p . We have now described all the extensions
of Zp by Zp . Let us find a representative cocycle fk,r for each of these extensions.
To find the cocycle we choose a section of πr . It is instructive to try to make it a
homomorphism. Therefore we must take s(1) = 1. What about s(σ2 )? It must be of the
form s(σ2 ) = αx for some x, and since πr (s(σ2 )) = σ2 we must have

σ2xr = σ2 (15.141)

so that
xr = 1mod p (15.142)
Recall that r ∈ Z∗p and let r∗ be the integer 1 ≤ r∗ ≤ p − 1 such that

rr∗ = 1mod p (15.143)

Then we have that x = r∗ + ℓp for any ℓ. That is, s(σ2 ) could be any of
α^{r∗} , α^{r∗+p} , α^{r∗+2p} , . . . , α^{r∗+(p−1)p}    (15.144)

Here we will make the simplest choice s(σ2 ) = α^{r∗}. The reader can check that the discussion
is not essentially changed if we make one of the other choices. (After all, this will just change
our cocycle by a coboundary!)

Now that we have chosen s(σ2 ) = α^{r∗}, if s were a homomorphism then we would be forced to take:

s(1) = 1
s(σ2 ) = α^{r∗}
s(σ2^2 ) = α^{2r∗}    (15.145)
...
s(σ2^{p−1} ) = α^{(p−1)r∗}
But now we are stuck! The property that s is a homomorphism requires two contradictory
things. On the one hand, we must have s(1) = 1 for any homomorphism. On the other

hand, from the above equations we also must have s(σ2^p ) = α^{pr∗}. But because 1 ≤ r∗ ≤ p − 1 we know that α^{pr∗} ≠ 1. So the conditions for s being a homomorphism are impossible to meet. Therefore, with this choice of section we find a nontrivial cocycle as follows:

s(σ2^x )s(σ2^y )s(σ2^{x+y} )^{−1} = { 1 if x + y ≤ p − 1 ;  α^{r∗p} if p ≤ x + y }    (15.146)

Here we computed:
α^{r∗x} α^{r∗y} α^{−r∗(x+y−p)} = α^{r∗p}    (15.147)
where you might note that if p ≤ x + y ≤ 2p − 2 then 0 ≤ x + y − p ≤ p − 2. Therefore,
our cocycle is fk,r where
fk,r (σ2^x , σ2^y ) := { 1 if x + y ≤ p − 1 ;  σ1^{k∗r∗} if p ≤ x + y }    (15.148)

since
ιk (σ1^{k∗r∗} ) = α^{k∗r∗kp} = α^{r∗p}    (15.149)
and here we have introduced an integer 1 ≤ k ∗ ≤ p − 1 so that

kk ∗ = 1 mod p (15.150)

Although it is not obvious from the above formula for fk,r , we know that fk,r will satisfy
the cocycle equation because we constructed it from a section of a group extension.
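Both claims are easy to confirm by machine for a small odd prime. In the following sketch the cyclic groups are written additively and c stands for the class label (kr)∗ mod p (the choices p = 5 and c = 2 are arbitrary):

from itertools import product

p, c = 5, 2          # p an odd prime, c = (kr)* mod p with c != 0

def f(x, y):         # the cocycle (15.148), with representatives x, y in {0, ..., p-1}
    return c if x + y >= p else 0

# the additive cocycle identity (15.171)
assert all((f(y, z) + f(x, (y + z) % p) - f(x, y) - f((x + y) % p, z)) % p == 0
           for x, y, z in product(range(p), repeat=3))

def mul(u, v):       # the twisted product (15.102)
    return ((u[0] + v[0] + f(u[1], v[1])) % p, (u[1] + v[1]) % p)

g, x, order = (0, 1), (0, 1), 1
while g != (0, 0):
    g, order = mul(g, x), order + 1
print("order of (0,1):", order)   # p**2, so the extension group is cyclic of order p^2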
Now, we know the cocycle is nontrivial because Zp × Zp is not isomorphic to Zp2 . But
let us try to trivialize our cocycle by a coboundary. So we modify our section to

s̃(σ2x ) = ι(t(σ2x ))s(σ2x ) (15.151)

We can always write our function t in the form


t(σ2^x ) = σ1^{τ (x)}    (15.152)

for some function τ (x) valued in Z/pZ. We are trying to find a function τ (x) so that the
new cocycle fs̃ is identically 1. We certainly need s̃(1) = 1 and hence τ (0̄) = 0̄. But now,
because f (σ2^x , σ2^y ) = 1 already holds for x + y ≤ p − 1, we don’t want to undo that, so we learn that
τ (x) + τ (y) − τ (x + y) = 0 mod p    (15.153)
for x + y ≤ p − 1. This means we must take

τ (x) = xτ (1) 1≤x≤p−1 (15.154)

So, our coboundary is completely fixed up to a choice of τ (1). But now let us compute for x + y ≥ p:

s̃(σ2^x )s̃(σ2^y )s̃(σ2^{x+y} )^{−1} = α^{r∗p} ι(σ1^{τ (x)+τ (y)−τ (x+y)} ) = α^{r∗p}    (15.155)
So, we cannot gauge the cocycle to one, confirming what we already knew: The cocycle is
nontrivial.
Now let us see when the different extensions defined by k, r ∈ Z∗p are actually equivalent.
To see this let us try to construct ϕ so that

1 → ⟨σ1 ⟩ →ιk1 ⟨α⟩ →πr1 ⟨σ2 ⟩ → 1
          ↓Id       ↓ϕ        ↓Id                      (15.156)
1 → ⟨σ1 ⟩ →ιk2 ⟨α⟩ →πr2 ⟨σ2 ⟩ → 1

Now ϕ, being a homomorphism, must be of the form

ϕ(α) = αy (15.157)

for some y. We know this must be an isomorphism so y must be relatively prime to p.


Moreover commutativity of the diagram implies

πr2 (ϕ(α)) = πr1 (α) ⇒ r2 y = r1 modp (15.158)

ϕ(ιk1 (σ1 )) = ιk2 (σ1 ) ⇒ k1 py = k2 p modp2 ⇒ k1 y = k2 mod p (15.159)

Putting these equations together, and remembering that y is multiplicatively invertible


modulo p we find that there exists a morphism of extensions iff

k1 r1 = k2 r2 modp (15.160)

Note that the cocycles fk,r constructed in (15.148) indeed only depend on kr mod p. Equivalently, we can label their cohomology class by (kr)∗ = k∗ r∗ mod p.
The conclusion is that kr ∈ Z∗p is the invariant quantity. Extensions with the same
group G̃ = Zp2 in the middle, but with different kr ∈ Z∗p , define inequivalent extensions of
Zp by Zp .
Now let us examine the group structure on the group cohomology. Just multiplying
the cocycles we get:
(fk1 ,r1 · fk2 ,r2 )(σ2^x , σ2^y ) = { 1 if x + y ≤ p − 1 ;  σ1^{(k1 r1 )∗+(k2 r2 )∗} if p ≤ x + y }    (15.161)

Thus if we map
[fk,r ] 7→ (kr)∗ mod p    (15.162)

we have a homomorphism of H 2 (G, A) to the additive group Z/pZ, with the trivializable
cocycle representing the direct product and mapping to 0̄ ∈ Z/pZ.
In conclusion, we describe the group of isomorphism classes of central extensions of Zp
by Zp as follows: The identity element is the trivial extension

1 → Zp → Zp × Zp → Zp → 1 (15.163)

and then there is an orbit of (p − 1) nontrivial extensions of the form

1 → Zp → Zp2 → Zp → 1 (15.164)

acted on by Aut(Zp ) = Z∗p .

Example 3:Prime Powers. Once we start to look at prime powers things start to get more
complicated. We will content ourselves with extensions of Z4 by Z2 . Here it can be shown
that
H 2 (Z4 , Z2 ) ∼
= Z2 (15.165)

so there should be two inequivalent extensions. One is the direct product and the other is

1 → Z2 → Z8 → Z4 → 1    (15.166)

♣Do this more systematically: show that there are precisely two extensions of Z4 by Z2 . ♣

We will think of these as multiplicative groups of roots of unity, with generators σ = −1


for Z2 , α = exp[2πi/8] for Z8 , and ω = exp[2πi/4] for Z4 .
The inclusion map is ι : σ 7→ α4 , while the projection map is π : α 7→ α2 = ω.
Let us try to find a section. Since we want a normalized cocycle we must choose
s(1) = 1. Now, π(s(ω)) = ω implies s(ω)2 = ω, and this equation has two solutions:
s(ω) = α or s(ω) = α5 . Let us choose s(ω) = α. (The following analysis for α5 is similar.)
If we try to make s into a homomorphism then we are forced to choose

s(ω) = α
s(ω 2 ) = α2 (15.167)
s(ω 3 ) = α3

but now we have no choice - we must set s(ω 4 ) = s(1) = 1. On the other hand, if s were
to have been a homomorphism we would have wanted to set s(ω 4 ) = s(ω)4 = α4 , but, as
we just said, we cannot do this. With the above choice of section we get the symmetric
cocycle whose nontrivial entries are

f (ω, ω 3 ) = f (ω 2 , ω 2 ) = f (ω 2 , ω 3 ) = f (ω 3 , ω 3 ) = α4 = σ. (15.168)
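As a check, here is a small sketch (with Z2 and Z4 written additively; the translation of the symmetric cocycle (15.168) into the function f (x, y) = 1 precisely when x + y ≥ 4 is the bookkeeping used here) confirming the cocycle identity and the resulting group:

from itertools import product

def f(x, y):                    # additive version of the cocycle (15.168)
    return 1 if x + y >= 4 else 0

# cocycle identity, with values in Z2 and arguments in Z4
assert all((f(y, z) + f(x, (y + z) % 4) - f(x, y) - f((x + y) % 4, z)) % 2 == 0
           for x, y, z in product(range(4), repeat=3))

def mul(u, v):                  # twisted product (15.102)
    return ((u[0] + v[0] + f(u[1], v[1])) % 2, (u[1] + v[1]) % 4)

g, x, order = (0, 1), (0, 1), 1
while g != (0, 0):
    g, order = mul(g, x), order + 1
print("order of (0,1):", order)     # 8, so the extension group is Z8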
♣Probably should just describe all extensions with Q = Zn ♣

Example 4. Products Of Cyclic Groups. Another natural generalization is to consider products of cyclic groups. For simplicity we will only consider the case

G = Zp ⊕ · · · ⊕ Zp (15.169)

where there are k summands, but p is prime. We will think of our group additively
and moreover we will think of Zp as a ring in this example. If we write elements as
~x = (x1 , . . . , xk ) with xi ∈ Zp and our cocycle f (~x, ~y ) is also valued in Zp , so that we are
considering central extensions:

0 → Zp → G̃ → Zp^{⊕k} → 0    (15.170)
then the cocycle condition becomes:

f (~x, ~y ) + f (~x + ~y , ~z) = f (~x, ~y + ~z) + f (~y , ~z) (15.171)

An obvious way to satisfy this condition is to use a bilinear form:

f (~x, ~y ) = Aij xi yj (15.172)

where the matrix elements Aij ∈ Zn . We can modify by a coboundary:

f (~x, ~y ) → f (~x, ~y ) + q(~x + ~y ) − q(~x) − q(~y ) (15.173)

Notice a linear term cancels out. If we want to restrict attention to expressions which are
quadratic then we can modify

Aij → Aij − (qij + qji ) (15.174)

where qij is any matrix with values in Zp .


Now we must distinguish the case p = 2 from p an odd prime. If p = 2 we can use the
coboundary to make the off-diagonal part asymmetric, and WLOG we can agree that for
each i < j either Aij = Aji = 0 or Aij = 0 and Aji = 1. Note that the diagonal matrix
elements are gauge invariant since qii + qii = 2qii = 0. Therefore we can produce in this way (1/2)k(k + 1) independent cocycles.
If p is an odd prime we can require that the matrix is “anti-symmetric” in the sense that Aij + Aji = 0 for all i, j, because 2 is invertible. In this way we only produce (1/2)k(k − 1) independent cocycles.
On the other hand, using methods of topology (See section **** below for hints) one
can prove that
H 2 (Zp^{⊕k} , Zp ) ∼= Zp^{k(k+1)/2}    (15.175)
for any prime p. What are the k “missing” cocycles for p an odd prime? They are exactly
the extensions we discussed in detail in Example 2 above!
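It is straightforward to verify the bilinear claim numerically. The following tiny sketch (the values p = 3, k = 2 and the random matrix A are arbitrary choices made here) checks that any bilinear form (15.172) satisfies the additive cocycle condition (15.171):

import numpy as np
from itertools import product

p, k = 3, 2
rng = np.random.default_rng(1)
A = rng.integers(0, p, size=(k, k))          # an arbitrary k x k matrix over Z/p

def f(x, y):
    return int(x @ A @ y) % p                # the bilinear cochain (15.172)

vecs = [np.array(v) for v in product(range(p), repeat=k)]
assert all((f(x, y) + f((x + y) % p, z) - f(x, (y + z) % p) - f(y, z)) % p == 0
           for x in vecs for y in vecs for z in vecs)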

Example 5.. As a special case of the above, consider extensions of Z2 ⊕Z2 by Z2 . This will
be a group of order 8. As we will see, there are five groups of order 8 up to isomorphism:

Z8 , Z2 × Z4 , Z2^3 , Q, D4    (15.176)

where Q and D4 are the quaternion and dihedral groups, respectively. Now Z8 cannot sit
in an extension of Z2 × Z2 . (Why not? 172 ) This leaves 4 isomorphism classes of groups
which do fit in extensions of Z2 × Z2 by Z2 and it happens they are all central extensions.
They are:
1 → Z2 → Z2 × Z2 × Z2 → Z2 × Z2 → 1
1 → Z2 → Z2 × Z4 → Z2 × Z2 → 1
(15.177)
1 → Z2 → Q → Z2 × Z2 → 1
1 → Z2 → D4 → Z2 × Z2 → 1
172
Answer : Because Z2 × Z2 would have to be a quotient of Z8 . But we can easily list the subgroups of
Z8 and no quotient is of this form.

where Q is the quaternion group and D4 the dihedral group. We have already met Q and
D4 above. One can define a homomorphism π : Q → Z2 ⊕ Z2 by

π(±1) = 0 = (0, 0)
π(±iσ 1 ) = v1 = (1, 0)
(15.178)
π(±iσ 2 ) = v2 = (0, 1)
π(±iσ 3 ) = v1 + v2 = (1, 1)

where we are thinking of Z2 ∼


= Z/2Z additively to make contact with the previous example.
We can choose a section:

s(v1 ) = iσ 1
s(v2 ) = iσ 2 (15.179)
s(v1 + v2 ) = iσ 3

and, computing the cocycle, we find that it is given by the bilinear form (see the previous exercise):
AQ = ( 1 1 ; 0 1 )    (15.180)
Similarly, we can define a homomorphism π : D4 → Z2 × Z2 by

π(±1) = 0 = (0, 0)
π(±R(π/2)) = v1 = (1, 0)
(15.181)
π(±P ) = v2 = (0, 1)
π(±P R(π/2)) = v1 + v2 = (1, 1)

We can choose a section:

s(v1 ) = R(π/2)
s(v2 ) = P (15.182)
s(v1 + v2 ) = P R(π/2)

and, computing the cocycle, we find that it is given by the bilinear form (see the previous exercise):
AD4 = ( 1 1 ; 0 0 )    (15.183)
Now, on the other hand, using methods of topology one can prove that

H 2 (Z2 × Z2 , Z2 ) = Z2 ⊕ Z2 ⊕ Z2 (15.184)

We can understand this group in terms of the bilinear forms mentioned in the previous example. Under a coboundary, Aij → Aij − (qij + qji ): the diagonal entries A11 and A22 are invariant, but we can modify the off-diagonal part of Aij by a symmetric matrix. Thus, there are 8 possible values for A11 , A22 , A12 ∈ Z2 .
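One can see the four isomorphism classes emerge directly. The sketch below (a brute-force computation; the normal form A21 = 0 and the identification of groups by their multiset of element orders are choices made here) builds, for each of the 8 values of (A11 , A22 , A12 ), the order-8 group defined by the twisted product (15.102) and prints its element orders:

from itertools import product

def make_group(A):
    def f(x, y):
        return sum(A[i][j] * x[i] * y[j] for i in range(2) for j in range(2)) % 2
    elems = [(a, g) for a in range(2) for g in product(range(2), repeat=2)]
    def mul(u, v):
        return ((u[0] + v[0] + f(u[1], v[1])) % 2,
                tuple((u[1][i] + v[1][i]) % 2 for i in range(2)))
    return elems, mul

def element_orders(elems, mul):
    e, orders = (0, (0, 0)), []
    for x in elems:
        g, n = x, 1
        while g != e:
            g, n = mul(g, x), n + 1
        orders.append(n)
    return tuple(sorted(orders))

for a11, a22, a12 in product(range(2), repeat=3):
    print((a11, a22, a12), element_orders(*make_group([[a11, a12], [0, a22]])))
# all orders <= 2            -> Z2 x Z2 x Z2
# four elements of order 4   -> Z2 x Z4
# two elements of order 4    -> D4   (e.g. A11 = A12 = 1, A22 = 0, matching (15.183))
# six elements of order 4    -> Q    (e.g. A11 = A22 = A12 = 1, matching (15.180))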

In a way analogous to our discussion of extensions of Zp by Zp , while there are only four
different isomorphism classes of groups, there can be different extensions. An extension
with group cocycle Aij xi1 xj2 defines a group of elements (z, ~x). If we only care about the
isomorphism class of the group we are free to consider an isomorphism

(z, ~x) 7→ (z, S~x) (15.185)

where S ∈ GL(2, Z2 ). This maps A → SAS tr . In general that will produce an isomorphic
group, but a different extension. ♣Explain in detail how this takes us from 8 extensions to four isomorphism classes of groups using the explicit transforms by elements of GL(2, Z2 ) ∼= S3 . ♣
In all our examples up to now the group G̃ has been Abelian, but in this example we have produced two nonisomorphic nonabelian groups Q and D4 of order 8. ♣Should add an exercise showing that f (g, g) for g an involution determines the entire cocycle in this case. There are three nontrivial involutions, again giving 8 possible nonisomorphic cocycles. ♣ ♣Should also consider central extensions of Z2 by Z2 ⊕ Z2 . ♣

Example 6. Nonabelian groups can also have central extensions. Indeed, we already saw this for G = SO(3). Here is an example with G a nonabelian finite group. We take G to be the symmetric group Sn . It turns out that it has one nontrivial central extension by Z2 :

H 2 (Sn ; Z2 ) ∼= Z2    (15.186)

To define it we let σi = (i, i + 1), 1 ≤ i ≤ n − 1 be the transpositions generating Sn . Then Ŝn is generated by σ̂i and a central element z satisfying the relations:

σ̂i2 = z
(15.187)
σ̂i σ̂i+1 σ̂i = σ̂i+1 σ̂i σ̂i+1
σ̂i σ̂j = zσ̂j σ̂i j >i+1

When restricted to the alternating group An we get an extension of An that can be


elegantly described using spin groups.

Remarks:

1. One generally associates cohomology with the subject of topology. There is indeed
a beautiful topological interpretation of group cohomology in terms of “classifying
spaces.”

2. In the case where G is itself abelian we can use more powerful methods of homological
algebra to classify central extensions.

3. The special case H 2 (G, U (1)) (or sometimes H 2 (G, C∗ ), they are the same) is known
as the Schur multiplier. It plays an important role in the study of projective repre-
sentations of G. We will return to this important point.

4. We mentioned that a general extension (15.1) can be viewed as a principal N bundle


over Q. Let us stress that trivialization of π : G → Q as a principal bundle is
completely different from trivialization of the extension (by choosing a splitting).

These are different mathematical structures! For example, for finite groups the bundle
is of course trivial because any global section is also continuous. However, as we have
just seen the extensions might be nontrivial. It is true, quite generally, that if a central
extension is trivial as a group extension then G̃ = A × G and hence π : G̃ → G is
trivializable as an A-bundle. ♣In general a
central extension by
U (1) is equivalent
to a line bundle
over the group and
you should explain
that here. ♣

Exercise
Suppose that the central extension (15.22) is equivalent to the trivial extension with
G̃ = A × G, the direct product. Show that the possible splittings are in one-one correspon-
dence with the set of group homomorphisms φ : G → A.

Exercise
Construct cocycles corresponding to each of the central extensions in (15.177) and
show how the automorphisms of Z2 × Z2 account for the fact that there are only four entries in (15.177) while (15.184) has order 8.

Exercise D4 vs. Q
a.) Show that D4 and Q both fit in exact sequences

1 → Z4 → D4 → Z2 → 1 (15.188)

1 → Z4 → Q → Z2 → 1 (15.189)
b.) Are these central extensions?
c.) Are D4 and Q isomorphic? 173

Exercise
Choose the natural section s : σi → σ̂i in (15.187) and find the corresponding cocycle
fs .

173
Answer : No. D4 has 5 nontrivial involutions: The reflections in the four symmetry axes of the square
and the rotation by π, while Q has only one nontrivial involution, namely −1.

Exercise Due Diligence
Show that the associative law for the twisted product (15.102) is equivalent to the
cocycle condition on the 2-cochain f .

Exercise Involution Criterion For A Nontrivial Cocycle


Let g be a nontrivial involution. Show that the condition that f (g, g) is, or is not, a
perfect square is independent of which cocycle we use within a cohomology class.

Exercise Group Commutator Criterion For A Nontrivial Cocycle


a.) Show that if a central extension is defined by a cocycle f then the group commutator
is:

[(a1 , g1 ), (a2 , g2 )] = ( f (g1 , g2 ) f (g1 g2 , g1^{−1} g2^{−1} ) / [ f (g2 , g1 ) f (g2 g1 , g1^{−1} g2^{−1} ) ] , g1 g2 g1^{−1} g2^{−1} )    (15.190)

b.) Suppose G is abelian. Show that G̃ is abelian iff f (g1 , g2 ) is symmetric.


c.) In general the condition that f is symmetric: f (g1 , g2 ) = f (g2 , g1 ) would not be
preserved by a coboundary transformation. Show that it does make sense in this setting.
d.) Suppose G̃ is a central extension of a not-necessarily-Abelian group G by an Abelian
group A. Show that if (g1 , g2 ) is a commuting pair of elements in G and if f (g1 , g2 )/f (g2 , g1 )
is not the identity then the extension is nontrivial. 174

15.4 Extended Example: Charged Particle On A Circle Surrounding A Solenoid


In the following extended example we will illustrate how classical symmetries can be cen-
trally extended in the context of a very interesting quantum system. Along the way we
will take the opportunity to introduce many ideas about quantum field theory in a very
simple context.

15.4.1 Hamiltonian Analysis


Consider a particle of mass m confined to a ring of radius r in the xy plane. The position
of the particle is described by an angle φ, so we identify φ ∼ φ + 2π, and the action is
S = ∫ (1/2) m r² φ̇² = ∫ (1/2) I φ̇²    (15.191)
174
Answer : This follows because, as we saw above, a split central extension is a direct product. But the
group commutator of (1, g1 ) and (1, g2 ) must then be the identity. On the other hand, f (g1 , g2 )/f (g2 , g1 )
is gauge invariant, so if it is nontrivial then the group commutator cannot be the identity.

Figure 38: Spectrum of a particle on a circle as a function of B = eB/2π. The upper left shows
the low-lying spectrum for B = 0. It is symmetric under m → −m. The upper right shows the
spectrum for B = 0.2. There is no symmetry in the spectrum. The lower figure shows the spectrum
for B = 1/2. There is again a symmetry, but under m → 2B − m = 1 − m. In general there will be
no symmetry unless 2B ∈ Z. If 2B ∈ Z the spectrum is symmetric under m → 2B − m.

with I = mr2 the moment of inertia.


Let us also suppose that our particle has electric charge e and that the ring is threaded
by a solenoid with magnetic field B, so the particle moves in a zero B field, but there is a
nonzero gauge potential 175
A = (B/2π) dφ    (15.192)
The action is therefore:
S = ∫ (1/2) I φ̇² dt + e ∮ A
  = ∫ (1/2) I φ̇² dt + (eB/2π) ∫ φ̇ dt    (15.193)

The second term is an example of a “topological term” or a “θ-term.” Classically, the


second term does not affect physical predictions, since it is a total derivative. However
as we will soon see, quantum mechanically, it will have an important effect on physical
predictions.
175
For readers not familiar with differential form notation this means, in cylindrical coordinates that
Az = 0, Ar = 0 and Aφ = B/2π.

We are going to analyze the symmetries of this system and compare their realization
in the classical and quantum theories.

Classical Symmetries:

We begin by analyzing the classical symmetries. Because the θ-term does not affect
the classical dynamics the classical system has O(2) symmetry. We can rotate: R(α) :
eiφ → eiα eiφ , or, if you prefer, translate φ → φ + α (always bearing in mind that α and φ
are only defined modulo addition of an integral multiple of 2π). If we think of the circle
in the x − y plane centered on the origin, with the solenoid along the z-axis then we could
also take as usual:

R(α) = ( cos α   sin α ; − sin α   cos α ).    (15.194)
Also we can make a “parity” or “charge conjugation” transformation P : φ → −φ.
The second term in the Lagrangian is not invariant but this “doesn’t matter” because it is
a total derivative. Put differently: φ → −φ is a symmetry of the equations of motion, and
hence it is a classical symmetry.
Note that these group elements in O(2) satisfy

R(α)R(β) = R(α + β)
P2 = 1 (15.195)
P R(α)P = R(−α)

and indeed, as we have seen, O(2) is a semidirect product:

O(2) = SO(2) o Z2 (15.196)

with ω : hP i ∼
= Z2 → Aut(SO(2)) ∼= Z2 acting by taking the nontrivial element of Z2 to
the outer automorphism that sends R(α) → R(−α).

Diagonalizing The Hamiltonian

Now let us consider the quantum mechanics with the “θ-term” added to the La-
grangian. Our goal is to see how that term affects the quantum theory.
We will first analyze the quantum mechanics in the Hamiltonian approach. See the
remark below for some remarks on the path integral approach. The conjugate momentum
is
L = I φ̇ + eB/(2π)    (15.197)
We denote it by L because it can be thought of as angular momentum.
Note that the coupling to the flat gauge field has altered the usual relation of angular
momentum and velocity. Now we obtain the Hamiltonian from the Legendre transform:
∫ L φ̇ dt − S = ∫ (1/(2I)) (L − eB/(2π))² dt    (15.198)

Upon quantization L → −iℏ ∂/∂φ, so the Hamiltonian is

HB := (ℏ²/2I) ( −i ∂/∂φ − B )²    (15.199)

where B := eB/(2πℏ).
The eigenfunctions of the Hamiltonian HB are just

Ψm (φ) = (1/√(2π)) e^{imφ} ,  m ∈ Z    (15.200)

They give energy eigenstates with energy

Em = (ℏ²/2I) (m − B)²    (15.201)
There is just one energy eigenstate for each m ∈ Z.
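A quick numerical illustration of (15.201) and of the degeneracy pattern shown in Figure 38 (a sketch in units where ℏ²/2I = 1; the values of B and the truncation are arbitrary choices):

def spectrum(B, mmax=4):
    """Low-lying energies E_m = (m - B)^2 in units with hbar^2/(2I) = 1."""
    return sorted(((m - B) ** 2, m) for m in range(-mmax, mmax + 1))

for B in (0.0, 0.2, 0.5):
    print("B =", B)
    for E, m in spectrum(B)[:6]:
        print("   m = %+d   E = %.3f" % (m, E))
# For 2B in Z the levels come in degenerate pairs E_m = E_{2B-m} (the level m = B is
# unpaired when B itself is an integer); for generic B there is no degeneracy.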
Before moving on with the analysis of the symmetries in this quantum mechanical
problem let us take the opportunity to make a long list of:

Remarks:

1. The action (15.193) makes good sense for φ valued in the real line or for φ ∼ φ + 2π,
valued in the circle. Making this choice is important in the choice of what theory
we are describing. Where - in the above analysis - did we make the choice that the
target space is a circle? 176

2. Taking φ ∼ φ + 2π, even though the θ-term is a total derivative it has a nontrivial
effect on the quantum physics as we can see since B has shifted the spectrum of the
quantum Hamiltonian in a physically observable fashion: This is how we see that
topological terms matter.

3. Note that when 2B is even the energy eigenspaces are two-fold degenerate, except
for the ground state at m = B. On the other hand, when 2B is odd all the energy
eigenspaces are two-fold degenerate, including the ground state. If 2B is not an
integer all the energy eigenspaces are one-dimensional. See Figure 38.

4. The total spectrum is periodic in B, and shifting B → B+1 is equivalent to m → m+1.


To be more precise, we can define a unitary operator on the Hilbert space by its action
on a basis:
U Ψm = Ψm+1 (15.202)

and
U HB U −1 = HB+1 (15.203)
176
Answer : If we took the case where φ is valued in R and not the circle then there would be no quantization
on m and the spectrum of the Hamiltonian would be continuous. In this case the Chern-Simons term would
not affect the physics in the quantum mechanical version as well.

5. The quantum mechanics problem (15.193) and the spectrum (15.201) arise in the
discussion of the “Coulomb blockade” in physics of quantum dots. See Yoshimasa
Murayama, Mesoscopic Systems, Section 10.10.

6. Viewing the system as a field theory. We have introduced this system as describing
the quantum mechanics of a particle. However, it is important to note that it can
also be viewed as a special case of a quantum field theory. In general, in a field theory
177 we have a spacetime M and the fields φ are functions on M valued in some target

space X . (So the term “target space” means nothing more or less than the codomain
of the fields.) An important example is that of a nonrelativistic particle of mass m
moving on a Riemannian manifold X with metric ds2 = gµν (x)dxµ ⊗ dxν . The action
would be Z
m
S = dt gµν (x(t))ẋµ ẋν dt (15.204)
2
If, in addition, the particle has charge e and there is an electromagnetic potential
Aµ (x)dxµ on X then the action is
S = ∫ (m/2) gµν (x(t)) ẋ^µ ẋ^ν dt + ∫ e Aµ (x(t)) ẋ^µ dt    (15.205)
Here M is the manifold of time. It could be M = R if we describe the entire history of
the particle, or M = [tin , tf in ] if we describe only the motion in a finite time interval.
As we will soon see, it can also be interesting to let M = S 1 . The “field” is a suitably
differentiable map
x:M →X (15.206)
describing the position of the particle as a function of time. This is an example of a
“0 + 1 dimensional field theory.” A generalization would be a theory of maps from a
(d + 1)-dimensional spacetime with metric hab dσ^a dσ^b and action

S = ∫ d^{d+1}σ √(|det h|) h^{ab}(σ) (m/2) gµν (x) ∂a x^µ ∂b x^ν    (15.207)
and the “field” would be a suitably differentiable map:

x:M →X (15.208)

Equations (15.205) and (15.207) are examples of what is known as a “nonlinear sigma
model.” 178 In our case our fields are maps

eiφ : M → S 1 (15.212)
177
As traditionally conceived. The topic of topological field theory generalizes the next few lines consid-
erably.
178
For the mathematically sophisticated reader we note that, for general nonlinear sigma models

dx : Tσ M → Tx(σ) X    (15.209)

is a linear map between two inner-product spaces. We can use the inner products to define (dx)† and then the kinetic term is

∫M Tr((dx)(dx)† ) vol(h)    (15.210)

We have been referring to φ → −φ as “parity” because that is the appropriate term
in the context of the quantum mechanics of a particle constrained to a circle in the
plane. The parity operation is just reflection around some line in the plane. However,
if we take the point of view that we are discussing a 0 + 1 dimensional “field theory”
then it would be better to refer to the operation as “charge conjugation” because it
complex conjugates the U (1)-valued field eiφ .
In addition there are (in the field theory interpretation) “worldvolume symmetries”
of time translation invariance and time reversal. These form the group R o Z2 . We
will put those aside. (Note that time reversal is not a symmetry of the second term
in the Lagrangian but is a symmetry of the space of solutions of the equations of
motion.)

7. Relations to higher dimensional field theories and string theory. The θ-term we have
added has a very interesting analog in 1 + 1 dimensional field theory, where it is
known as a coupling to the B-field. It can also be obtained from a Kaluza-Klein
reduction of 1 + 1 dimensional Maxwell theory:
S = (1/e²) ∫ dx^0 dx^1 F01² + (θ/2π) ∫ F01 dx^0 ∧ dx^1
  = (1/e²) ∫ F ∗F + (θ/2π) ∫ F    (15.213)
In 1 + 1 dimensional theory we can choose A0 = 0 gauge and gauge away the x1
dependence so that on S 1 × R the only gauge invariant quantity is
e^{iφ(t)} = e^{i ∮_{S¹} A} = e^{i ∮_{S¹} A1 dx^1}    (15.214)
With this in mind we can say
θ = 2πB (15.215)

Remark: More generally, in 1 + 1 dimensional Yang-Mills theory on S 1 × R we


can always go to A0 = 0 gauge and then the only gauge invariant observable is the
conjugacy class of the holonomy around the circle.

The theta term also has a close analog in 3 + 1-dimensional gauge theory. In the case
of 3 + 1 dimensional Maxwell theory we can write
S = ∫ d^4x (1/4e²) Fµν F^{µν} + (θ/8π) ∫ ε^{µνλρ} Fµν Fλρ d^4x
  = (1/2e²) ∫ F ∗F + (θ/4π) ∫ F ∧ F    (15.216)
More to the point dx is a section of (T M )∨ ⊗ f ∗ (T X) and we can use the metric on this bundle to write ∫M ‖ dx ‖². In the case of the charged particle moving on the Riemannian manifold X, there is also the data of a principal U (1) bundle with connection d + A and the topological term is based on the holonomy of the pulled back connection:

S = ∫ (m/2) ‖ dx ‖² dt + e ∮ x∗ (A)    (15.211)
There are similar topological terms for the d > 0 sigma models.

In fact, in the effective theory of electromagnetism in the presence of an insulator a
very similar action arises with a θ term. If a parity- and/or time-reversal symmetry
is present then θ is zero or π, corresponding to our case 2B ∈ Z. The difference
between a normal and a topological insulator is then, literally, the difference between
2B being even (normal) and odd (topological), respectively. Finally, in the 3+1-
dimensional Yang-Mills theories that describe the standard model of electro-weak
and strong interactions one can add an analogous θ-term. Topological terms matter,
and in this case the topological term for the strong gauge field leads to the prediction
of an intrinsic electric dipole moment of the neutron. However, it is known to excellent
accuracy that the neutron electric dipole moment is very small and

|θ| < 10−9 (15.217)

One of the great unsolved mysteries about nature is why the (effective) theta angle
for the strong interactions in the standard model is so small. 179

Now let us get back to the symmetries of the particle on the ring. We have seen that the
classical “internal” symmetry group - the “internal” symmetry group of the equations of
motion - is O(2). Now let us analyze how the symmetries are implemented in the quantum
theory:
In quantum mechanics the SO(2) shift symmetry φ → φ+α is realized by a translation
operator ρ(R(α)) = R(α) and acting on Ψm we have

(R(α) · Ψm ) = eimα Ψm (15.218)

Can we also represent ρ(P ) = P on the Hilbert space? Classically, parity symmetry
P just takes φ → −φ. If we make this substitution in the Hamiltonian HB we see that the
naive parity operation takes
PHB P −1 = H−B (15.219)
For general values of B the operator HB is not unitarily equivalent to H−B . However,
thanks to (15.203) it is clear that when 2B ∈ Z they are unitarily equivalent and the naive
operation of taking φ → −φ, which takes m → −m on eigenvectors of HB , should be
accompanied by U 2B . Therefore P should map the eigenspace associated with m to that
associated with 2B − m. As a sanity check note that indeed Em = E2B−m . Therefore
we should define a parity operation:

P · Ψm = ξm Ψ2B−m (15.220)

where ξm is a phase which we can take to be 1. Note that the operator P so defined
commutes with the Hamiltonian: Indeed, it takes eigenvectors to eigenvectors with the
same eigenvalue. ♣You should allow the possibility of a phase in the definition of P and show in detail it doesn’t matter. ♣
If 2B is not an integer the parity symmetry is broken and the quantum symmetry
group is just SO(2).
179
For much more about this see M. Dine’s TASI lectures https://arxiv.org/pdf/hep-ph/0011376.pdf.

Now consider the case when 2B ∈ Z and let us study the relations obeyed by the
operators R(α) and P and compare them with the classical relations (15.195). We still
have R(α)R(β) = R(α + β) and P 2 = 1 but now the third line of (15.195) is modified to:

PR(α)P = ei2Bα R(−α) (15.221)

Exercise Due Diligence


a.) Check that (15.221) is well-defined, even though α is only defined up to a shift by
an integral multiple of 2π.
b.) Check the operator relation (15.221)!
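
The relation (15.221) and the statement [P, H_B] = 0 are easy to check numerically on a truncated basis. The following is a minimal sketch (our own illustration, not part of the text; the parameter names N, B, I_mom are ours), assuming 2B = 1 and a truncation of the momentum basis that is symmetric under m → 2B − m:

import numpy as np

N, B, I_mom = 20, 0.5, 1.0
ms = np.arange(-N, N + 2)                     # basis labels m, preserved by m -> 2B - m
dim = len(ms)

H = np.diag((ms - B) ** 2 / (2 * I_mom))      # H_B Psi_m = (m - B)^2/(2I) Psi_m

def R(alpha):
    # shift operator rho(R(alpha)): Psi_m -> e^{i m alpha} Psi_m
    return np.diag(np.exp(1j * ms * alpha))

# parity as in (15.220) with xi_m = 1: P Psi_m = Psi_{2B - m}
P = np.zeros((dim, dim), dtype=complex)
for j, m in enumerate(ms):
    P[np.where(ms == int(round(2 * B - m)))[0][0], j] = 1.0

alpha = 0.731                                  # a generic angle
print(np.allclose(P @ R(alpha) @ P, np.exp(2j * B * alpha) * R(-alpha)))   # (15.221): True
print(np.allclose(P @ H, H @ P))                                            # [P, H_B] = 0: True
print(np.allclose(P @ P, np.eye(dim)))                                      # P^2 = 1: True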

We now consider the group of operators generated by the operators P, R(α), and z1H
where z ∈ U (1). (Do not forget that we identify α ∼ α + 2π. This will be quite important
in what follows.) Denote this group of operators by GB . Naively we might have expected
this group of operators on Hilbert space to be isomorphic to U (1) × O(2) where O(2) is our
classical symmetry group and U (1) is just the group of phases acting on wavefunctions by
scalar multiplication. However, equation (15.221) is not satisfied by a direct product. So,
how is GB related to U (1) and O(2)? General principles tell us it will be an extension

1 → U(1) → GB → O(2) → 1   (15.222)

But what extension?


Now, when B is an integer we can indeed define an isomorphism of GB with U (1)×O(2)
by setting
R̃(α) := e−iBα R(α) (15.223)
We now recover the standard relations of O(2), so the classical O(2) symmetry is not
modified quantum mechanically. However, when B is a half-integer, R̃ is not well-defined
since we must identify α ∼ α + 2π. In this case the group GB is really different.
To understand what happens when B is half-integral we introduce a new group called
Spin(2). As an abstract group it is isomorphic to SO(2), and U (1), and R/Z. The groups
are all isomorphic. What makes Spin(2) nontrivial is its relation to SO(2). The group
elements in Spin(2) can be parametrized by α̂ with α̂ ∼ α̂ + 2π. Let us call the elements
of the spin group R̂(α̂). You can think of it in terms of Pauli matrices as

R̂(α̂) = exp[α̂σ 1 σ 2 ] = cos(α̂) + i sin(α̂)σ 3 (15.224)

But it is called the spin group because it comes with a nontrivial double cover:

π : Spin(2) → SO(2) (15.225)

the double covering is given by restricting our standard projection π : SU (2) → SO(3) to
the subgroup of SU (2) in (15.224). In this way we get a double cover of the rotation group
around the z axis:
π : R̂(α̂) 7→ R(2α̂) (15.226)

See equation (15.48) above.
Now, take Z₂ to act on Spin(2) by the nontrivial outer automorphism. So, denoting
the nontrivial element of Z₂ by P̂ we use the homomorphism α : Z₂ → Aut(Spin(2)) defined
by
α(P̂) : R̂(α̂) → (R̂(α̂))⁻¹ = R̂(−α̂)   (15.227)
Then, one definition of the group Pin+ (2) is that it is the semidirect product:

Pin⁺(2) ≅ Spin(2) ⋊_α Z₂   (15.228)

(We will give a slightly different definition below.) There is a generalization of Spin and
Pin groups to higher dimensions. They double cover SO(d) and O(d), respectively. See
the remark below for a brief description and Chapters *** and **** for full details.
Now, when 2B is an odd integer the group GB is generated by

z 1_H
ρ(R̂(α̂)) := e^{−i(2B)α̂} R(2α̂) ,   0 ≤ α̂ < 2π   (15.229)
ρ(P̂) := P

where we take P̂ to be the nontrivial element in Z2 in the semidirect product that defines
Pin+ (2), so that R̂(α̂) and P̂ generate Pin+ (2). One checks that ρ is a homomorphism and
the image under ρ is an isomorphic copy of Pin+ (2) inside GB , and we have:

1 → Z₂ → Pin⁺(2) → O(2) → 1
    ↓        ↓ρ        ↓Id              (15.230)
1 → U(1) →  GB   →  O(2) → 1

where Z2 ∼
= {±1} ⊂ U (1). When 2B is odd ρ has no kernel. (When 2B is even there is a
kernel.)
In conclusion:

1. The classical theory has an O(2) symmetry.

2. In the quantum theory when 2B is not an integer the symmetry is broken to SO(2).

3. In the quantum theory, when 2B is an even integer the theory still has O(2) symmetry.
The sequence
1 → U(1) → GB → O(2) → 1   (15.231)
splits and GB ∼
= U (1) × O(2).

4. In the quantum theory, when 2B is an odd integer,

1 → U(1) → GB → O(2) → 1   (15.232)

does not split. When we try to realize the classical O(2) symmetry on the Hilbert
space we are forced to implement the pin double cover Pin+ (2), a central extension
of O(2) by Z2 . It is related to GB as in (15.230).

We conclude with some remarks:

1. We stress that the particle we put on the ring did NOT have any intrinsic spin!!
Having said that, if we define an angular momentum L so that H = L²/(2I) then indeed
when B is half-integral the angular momentum has half-integral eigenvalues, as one
expects for a spin representation. So, what we are finding is that the half flux quantum
is inducing a half-integral spin of the system so that the classical O(2) symmetry of
the classical system is implemented as a Pin+ (2) symmetry in the quantum theory.
This is an intriguing phenomenon appearing in quantum symmetries with nontrivial
gauge fields and topological terms: The statistics and spins of particles can be shifted
from their classical values, often in ways that involve curious fractions.

2. Spin And Pin Groups. Enquiring minds will wonder about the definition of Pin± (d).
These groups are defined using Clifford algebras. For much more detail and moti-
vation see the two chapters on Clifford algebras and Spin groups. In brief, consider
the Clifford algebra generated by {γi , γj } = 2Qij where Qij is an invertible d × d
symmetric matrix. For a vector v i define γ(v) := v i γi . Assume that Qij = δij . Then
Pin+ (d) is the group of expressions of the form
±γ(v1 ) · · · γ(vr ) (15.233)

for some r. The group Pin⁻(d) is similarly defined with Qij = −δij . The group
Spin(d) is the subgroup of such expressions where r is even. The projection π :
Pin± (d) → O(d) is defined by the equation:
γ(π(g) · w) = (−1)r gγ(w)g −1 (15.234)
The key idea here is that
−γ(v)γ(w)γ(v)−1 = γ(Rv (w)) (15.235)
where Rv (w) is the reflection of w through the plane orthogonal to v, as the reader
can easily check in an exercise below. Then use the fact that all elements of O(d) are
products of reflections. The restriction to Spin(d) defines π : Spin(d) → SO(d). This
is a generalization of our standard double-covering π : SU (2) → SO(3). Although
Spin(3) ∼ = SU (2) for d > 3 Spin(d) is not isomorphic to a unitary or orthogonal
group. The difference between Pin+ (d) and Pin− (d) is whether the lift of a reflection
will square to +1 or −1 respectively. As a group Pin⁻(2) is isomorphic to (Spin(2) ⋊
Z₄)/Z₂.
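
The key identity (15.235) can be checked concretely. Here is a small numerical sketch (our own illustration): in d = 3 with Qij = δij we may represent the Clifford generators by the Pauli matrices, γ_i = σ_i.

import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def gamma(v):
    # gamma(v) = v^i gamma_i
    return sum(vi * si for vi, si in zip(v, sigma))

rng = np.random.default_rng(0)
v = rng.normal(size=3); v /= np.linalg.norm(v)    # a unit vector
w = rng.normal(size=3)                            # an arbitrary vector

lhs = -gamma(v) @ gamma(w) @ np.linalg.inv(gamma(v))   # -gamma(v) gamma(w) gamma(v)^{-1}
Rv_w = w - 2 * np.dot(v, w) * v                        # reflection through the plane orthogonal to v
print(np.allclose(lhs, gamma(Rv_w)))                   # True: this is (15.235)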

3. It is instructive to study the representation of Pin+ (2) on the two-dimensional space


of ground states, Hgrnd , when B = 1/2. In this case we can choose the ordered basis
{Ψ0 , Ψ1 } for Hgrnd , and, relative to this basis we have a matrix representation:

ρ(R̂(α̂))|_{Hgrnd} = ( e^{−iα̂}   0
                       0        e^{iα̂} )
                                              (15.236)
ρ(P̂)|_{Hgrnd}   = ( 0  1
                     1  0 )

We stress the appearance of α̂ in the representation matrix. This transformation cor-
responds to a translation of φ by α = 2α̂. Had we tried to express the representation
in terms of α we would encounter the phase e±iα/2 which is not well-defined because
α is only defined modulo α ∼ α + 2π.

4. As pointed out in a recent paper 180 this extension of the symmetry group at half-
integral θ is an excellent baby model for how one can learn about nontrivial dynamics
of quantum systems (in particular, QCD) by thinking carefully about group exten-
sions. For example, if we were to add a potential U (φ) to the problem we just dis-
cussed we could no longer solve exactly for the eigenstates. Also, a generic potential
would be of the form
U_generic(φ) = Σ_{n∈Z} c_n cos(nφ) + Σ_{n∈Z} s_n sin(nφ)   (15.237)

Potentials with generic coefficients will explicitly break all of the O(2) symmetry.
Suppose however, that we can restrict attention to a special class of potentials with
only cosine Fourier modes whose frequencies are 0 mod 2:

U_special(φ) = Σ_n u_n cos(2nφ)   (15.238)

Then, even though we cannot solve the spectrum of the Hamiltonian exactly we
can make an interesting statement about it. For such potentials the classical O(2)
symmetry is explicitly broken to Z2 × Z2 generated by P : φ → −φ and r : φ → φ + π.
We have shown that when 2B is odd and the potential is zero the O(2) symmetry
is centrally extended and realized as the double-cover Pin+ (2) on the Hilbert space.
The double cover of the subgroup ⟨P, r⟩ ⊂ O(2) acting on Hilbert space is
described by the pullback diagram:
1 → Z₂ →^ι  D₄    →^{π₁} Z₂ × Z₂ → 1
    ↓Id     ↓ι           ↓ι               (15.239)
1 → Z₂ →  Pin⁺(2) →^{π₂}  O(2)   → 1

To check this note that P lifts to the operators ±P and r = R(π) lifts to the operators

±r̂ = eiBπ R(π) (15.240)

One checks that hP, r̂i generates a group isomorphic to D4 . Indeed, these operators
satisfy the defining relations of D4 : P 2 = 1, r̂4 = 1, and P r̂P = r̂−1 . Note that,
in addition r̂2 = −1 on the entire Hilbert space. The representation on the Qbit
groundstate in (15.236) (with α̂ = ±π/2) is a two-dimensional irrep of D4 . In fact
all the doubly degenerate energy eigenspaces are two-dimensional irreps of D4 .
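These relations can be verified directly for the two-by-two matrices of (15.236). A minimal numerical sketch (our own check, not part of the text), taking α̂ = π/2:

import numpy as np

alpha_hat = np.pi / 2
r_hat = np.diag([np.exp(-1j * alpha_hat), np.exp(1j * alpha_hat)])   # rho(R-hat(pi/2)) on the Qbit
P = np.array([[0, 1], [1, 0]], dtype=complex)                        # rho(P-hat) on the Qbit
I2 = np.eye(2)

print(np.allclose(P @ P, I2))                               # P^2 = 1
print(np.allclose(np.linalg.matrix_power(r_hat, 4), I2))    # r-hat^4 = 1
print(np.allclose(P @ r_hat @ P, np.linalg.inv(r_hat)))     # P r-hat P = r-hat^{-1}
print(np.allclose(r_hat @ r_hat, -I2))                      # r-hat^2 = -1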
180
D. Gaiotto, A. Kapustin, Z. Komargodski and N. Seiberg, “Theta, Time Reversal, and Temperature,”
https://arxiv.org/pdf/1703.00501.pdf

It is reasonable to assume that when we turn on a weak potential of the form (15.238)
the classically preserved Z2 × Z2 subgroup again lifts to a D4 action on the Hilbert
space, even though we can no longer construct the operators in the D4 group explic-
itly. The cocycle takes values in a discrete set: so if it depends continuously on the parameters un
near un = 0 (this is an assumption) then it cannot jump away from its nontrivial value, and the
classical Z2 × Z2 symmetry must still be realized by D4 on the Hilbert space.
Now, we saw that, in the absence of the potential, there is a Qbit giving a two-
dimensional representation of Pin+ (2). This representation restricts to an irreducible
two-dimensional representation of D4 . (See equation (15.236) with α̂ = ±π/2.) Now,
D4 has four one-dimensional irreducible representations 1±,± and, we will show later,
exactly one two-dimensional irreducible representation. In particular, the set of repre-
sentations of D4 is discrete. Again, it is reasonable to suppose that the representation
is a continuous function of un . Again, this is an assumption. But granting this, turn-
ing on a weak potential cannot change the decomposition of the energy eigenspaces
into irreducible representations. This leads to a striking prediction: The two-fold
groundstate degeneracy is not broken by potentials of the form (15.238) when 2B is
odd! This is remarkable when one compares to the standard discussion of the double-
well potential of one-dimensional quantum mechanics. In that standard case one has
a two-fold classical degeneracy broken by tunneling (instanton) effects so that there
is a unique ground state. For potentials of the form (15.238) there are (generically)
four stationary points of the potential, at φ = 0, ±π/2, π. Generically, two will be
maxima and two will be minima. So, classically, and perturbatively in quantum me-
chanics, for a generic potential of the form (15.238) there will be a two-fold degenerate
groundstate. However, unlike the textbook discussion of the double-well potential,
the degeneracy will not be lifted by nonperturbative tunneling effects.

Exercise
Show that the ground state energy is

E_ground = (ℏ²/2I) min_{m∈Z} (m − B)²   (15.241)
and, using the floor function, give a formula for Eground directly in terms of B (without
requiring minimization).

Exercise Pin Action


a.) Show that (15.234) defines a homomorphism to O(d). 181

181
Answer : The key equation to check is (15.235). To check this consider the two cases that w is parallel
to v and that w is perpendicular to v. Then note that every element in O(d) is a product of reflections.

b.) Show that the general definition of Pin+ (d) specializes to the definition of Pin+ (2)
as a semidirect product. 182

Exercise One-Dimensional Representations Of D4


Show that there are four distinct one-dimensional representations of D4 . 183

Exercise A Cocycle Puzzle


Note that had we defined P with an extra factor of i we would have concluded that it
is order 4, not order 2. Now, we know that the sequence 1 → Z2 → Z4 → Z2 → 1 is not
split and has a nontrivial cocycle. Why, then, can we define a parity operation of order
two? 184

15.4.2 Remarks About The Quantum Statistical Mechanics Of The Particle On The Ring
In quantum statistical mechanics a central object of study is the partition function:

Z := TrH e−βH (15.243)

Here β = 1/(kT ) where k is Boltzmann’s constant and T is the absolute temperature.


For simplicity we will henceforth set ~ = k = 1 (as can always be done by a suitable
choice of units).
Since we have diagonalized the Hamiltonian exactly we can immediately say that
Z = Σ_{m∈Z} e^{−(β/2I)(m−B)²}   (15.244)

This is in fact a very interesting function of β/(2I) and B. Some immediate facts we can note
are
182
Answer : First reproduce Spin(2) using the general definition. We can represent the d = 2 Clifford
algebra by σ 1 , σ 2 and then the product of two vectors of the form γ(v) with v 2 = 1 is a matrix of the form
(x1 σ 1 + x2 σ 2 )(y 1 σ 1 + y 2 σ 2 ) where v1 = (x1 , x2 ) and v2 = (y 1 , y 2 ) are unit vectors in R2 . Multiplying this
out we get
cos θ + sin θσ 1 σ 2 (15.242)
where θ is the angle between v1 and v2 . Now let P̂ be represented by σ 1 (or γ(w) for any unit vector w).
183
Answer : D4 has generators x, y with x2 = 1 and y 4 = 1 and xyx = y −1 . In a one-dimensional
representation x, y will be represented by complex numbers. So we solve the above equations with x, y ∈ C.
Clearly x ∈ {±1} and then y 2 = 1 so y ∈ {±1} and there is no correlation between the choice of sign for x
and the choice of sign for y.
184
Hint:It is important to think about which group the cocycle and coboundaries take values in.

1. The expression is manifestly periodic under integer shifts of B, illustrating the general
claim above that the theory is invariant under integral shifts of the “theta angle” B.

2. Moreover, at low temperature, β → ∞ there is a single dominant term from the sum,
unless B is a half-integer, in which case there are two equally dominant terms - this reflects
the double degeneracy of the ground state when 2B is odd: The ground state is a
Qbit. A standard technique in field theory is to study the IR behavior of a partition
function to learn about the ground states of the system.
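
Both facts are easy to see numerically. A small sketch (our own illustration, with ℏ = k = 1 and our own parameter names):

import numpy as np

def Z(beta, B, I_mom=1.0, M=200):
    m = np.arange(-M, M + 1)
    return np.sum(np.exp(-beta / (2 * I_mom) * (m - B) ** 2))

print(np.isclose(Z(2.0, 0.3), Z(2.0, 1.3)))          # periodicity under B -> B + 1: True
# multiply by e^{beta * E_ground} to count ground states as beta -> infinity:
print(Z(50.0, 0.0) * np.exp(50.0 * 0.0))              # ~ 1 : unique ground state
print(Z(50.0, 0.5) * np.exp(50.0 * 0.125))            # ~ 2 : the Qbit ground state at 2B odd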

We are going to see that this system in fact has a very interesting high/low temperature
duality and use this to understand better the θ-dependence of the previous example in terms
of path integrals.
To relate Z to a path integral we observe that we can write:
Z = ∫₀^{2π} dφ ⟨φ| e^{−βH} |φ⟩   (15.245)

Now, we can interpret hφ|e−βH |φi as a specialization of the matrix elements of the Euclidean
time propagator
⟨φ₂| e^{−t_E H_B} |φ₁⟩ = (1/2π) Σ_{m∈Z} e^{−(t_E/2I)(m−B)² + im(φ₁−φ₂)}   (15.246)

Indeed, the usual propagator in quantum mechanics is

U (t) = e−itH/~ (15.247)

Under good conditions this family of operators for t ∈ R has an “analytic continuation”
to part of the complex plane. What this means is that there is a well-defined family of
operators
e−izH/~ (15.248)
where z takes values in a region R ⊂ C. The region R should, at least contain the real axis
of time on its boundary (or closure). To see that there might be restrictions on R suppose
that R = C. Then we can consider the restriction to the imaginary axis, setting

z = −itE tE ∈ R (15.249)

Here tE is called Euclidean time because if we were to make a substitution t → −itE in


the Lorentz metric then we would get a metric of definite signature. (If we take signa-
ture (−, +d ) we would get the Euclidean metric.) If the Hamiltonian is bounded below
exp[−tE H] should make sense for tE positive, but if the spectrum of H grows rapidly the
operator will be unbounded and certainly not traceclass for tE negative. So, for Hamil-
tonians, such as the one we are considering the region R can be taken to be the negative
half-plane. Defining such an analytic family of operators and restricting to the negative
imaginary axis (or some other part of the imaginary axis) is called Wick rotation.
Now, it is well-known that the propagator ⟨φ₂| e^{−itH/ℏ} |φ₁⟩ can be represented by a
Feynman path integral. After Wick rotation we still have a path integral representation.

Feynman’s argument proceeds just as well with e−tE H/~ . In fact, formally, it is better since
the integral has better (formal) convergence properties for Euclidean actions whose real
part is bounded below.
Now, by setting φ1 = φ2 = φ and integrating over φ we are making Euclidean time
periodic, with period β and computing the path integral on a compact spacetime, namely,
the circle. The path integral for φ1 = φ2 is done with boundary conditions on the fields so
that φ(0) = φ(β). This is precisely the kind of boundary condition that says that φ(t) is
defined on a circle. More details on path integrals are available in many textbooks. See,
for examples:
1. Feynman and Hibbs, Quantum Mechanics and Path Integrals
2. Feynman, Statistical Mechanics
3. C. Itzykson and J.B. Zuber, Quantum Field Theory,
4. J. Zinn-Justin, Quantum Field Theory and Critical Phenomena,
In this subsubsection we will henceforth drop the subscript E on tE and just use t for
the real Euclidean time coordinate.
In the Wick rotation to Euclidean space the “θ-angle” e∮A remains real so the matrix
elements of the Euclidean time propagator have the path integral representation:

⟨φ₂| e^{−βH} |φ₁⟩ = Z(φ₂, φ₁|β) := ∫ [dφ(t)]^{φ(β)=φ₂}_{φ(0)=φ₁} e^{−(1/ℏ)∫₀^β (1/2) I φ̇² dt − i∫ Bφ̇ dt}   (15.250)

(One must be careful with the sign of the imaginary term, and it matters.)
Viewed as a field theory, this is a free field theory and the path integral can be done
exactly by semiclassical techniques:
The equation of motion is simply φ̈ = 0. Again, the θ-term has not changed it.
Thus, the classical solutions to the equations of motion with boundary conditions
φ(0) = φ1 , φ(β) = φ2 are:
 
φ_cl(t) = φ₁ + ((φ₂ − φ₁ + 2πw)/β) t ,   w ∈ Z   (15.251)

or more to the point:

e^{iφ_cl(t)} = e^{i[(1 − t/β)φ₁ + (t/β)φ₂] + 2πitw/β}   (15.252)
(15.252)

These are solutions of the Euclidean equations of motion, and are known as “instantons”
for historical reasons. Notice that because of the compact nature of the spacetime on which
we define our 0 + 1 dimensional field theory there are infinitely many solutions labeled by
w ∈ Z. There are two circles in the game: The spacetime of this 0 + 1-dimensional field
theory is the Euclidean time circle. Then the target space of the field theory is also a circle.
The quantum field e^{iφ(t)} is a map M → X. M, which is spacetime, is S¹ and X, which is the
target space, is also a circle. As we saw, π₁(S¹) ≅ Z. There can be topologically inequivalent
field configurations. That is the space of maps M ap(M → X ) has different connected
components. The different topological sectors are uniquely labelled by the winding number
of the map (15.252). In the path integral we sum over all field configurations so we should
sum over all these instanton configurations.

We now write
φ = φcl + φq (15.253)
where φcl is an instanton solution, as in (15.251), and φq is the quantum fluctuation with
φq (0) = φq (β) = 0, and, moreover, φq (t) is in the topologically trivial component of
M ap(S 1 , X ). The action is a sum S(φcl ) + S(φq ), precisely because φcl solves the equation
of motion. Indeed the integral factorizes and we just get:
Z(φ₂, φ₁|β) = Z_q Σ_{w∈Z} e^{−(2π²I/β)(w + (φ₂−φ₁)/2π)² + 2πiB(w + (φ₂−φ₁)/2π)}   (15.254)

The summation runs over classical solutions, weighted by the value of the classical action
on that solution.
Zq is the path integral over φq :
Z_q = ∫ [dφ_q(t)]^{φ_q(β)=0}_{φ_q(0)=0} e^{−∫₀^β (1/2) I φ̇_q² dt}   (15.255)

We are integrating over the space of “all” maps φq : [0, 1] → U (1) with φq (0) = φq (1) = 0
that are homotopically trivial. We can do it by noticing that this is a Gaussian integral.
Now in finite dimensions we have the integral
n
dxi
Z Y
1 i j i 1 1 −1 ij
√ e− 2 x Aij x +bi x = √ e 2 bi (A ) bj (15.256)
i=1
2π detA

where Re(A) > 0 is a symmetric matrix. When A can be diagonalized by a real orthogonal
transformation we can replace
n
Y
detA = λi (15.257)
i=1
where the product runs over the eigenvalues of A. Thus, we need to generalize this expres-
sion to the determinant of an infinite-dimensional “matrix”
Z 1
I d2
Z
−1/2
[dφq ]exp[− φq (− 2
)φq ] = (2π)Det0 (O) (15.258)
0 2β dτ
Here the prime on the determinant means that we have omitted the zero-mode and the
I d2
analog of A is the operator O = − 2~β dτ 2
.
One way to make sense of DetO for an operator O on Hilbert space is known as
“ζ-function regularization.” (It will only work for a suitable class of operators.) Note that
d
|s=0 λ−s = −logλ (15.259)
ds
So if we define X
ζO (s) := λ−s (15.260)
λ
where we take the sum over the spectrum of O (and we assume O is diagonalizable with
discrete spectrum) then, formally:
Y
0
λ = exp[−ζO (0)] (15.261)
λ

For good operators O the spectrum goes to infinity sufficiently fast that ζO (s) exists as an
analytic function of s in a half plane Re(s) > N for some N . Moreover, ζO (s) also admits
an analytic continuation in s to an open region around s = 0. In this case, we can define
the determinant by the RHS of (15.261).
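
For a finite spectrum the right hand side of (15.261) just reproduces the ordinary product of eigenvalues; the point of the definition is that it continues to make sense when the spectrum is infinite. A quick toy sketch (our own illustration, using mpmath; the spectrum λ_n = c n² is our own choice):

from mpmath import mp, diff, exp, fprod

mp.dps = 30
c, N = mp.mpf('0.7'), 6
eigs = [c * n**2 for n in range(1, N + 1)]

def zeta_O(s):
    # zeta_O(s) = sum over the spectrum of lambda^{-s}
    return sum(lam**(-s) for lam in eigs)

print(fprod(eigs))            # ordinary product of eigenvalues
print(exp(-diff(zeta_O, 0)))  # exp[-zeta_O'(0)]: the same number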
For O = −(I/2ℏβ) d²/dτ² we have ζ_O(s) = (Iπ²/(2ℏβ))^{−s} ζ(2s), where ζ(s) is the standard Riemann
ζ-function, and since
ζ(s) = −1/2 + s log(1/√(2π)) + O(s²)   (15.262)
we have

Det′(O) := exp[−ζ′_O(0)] = β/I   (15.263)
(There are some factors of 2 and π that need to be fixed in this equation.)
We can understand this result nicely as follows. Let us study the β → 0 behavior of
the path integral. Then for |φ₁ − φ₂| < π,

Z → Z_q e^{−(I/2β)(φ₂−φ₁)² + iB(φ₂−φ₁)} (1 + O(e^{−κ/β}))   (15.264)

where κ > 0. In plain English: the instantons are only important at large β. This is
intuitively very satisfying: At very small times β it must cost a lot of action for φ(t) to make
a nonzero number of circuits around the circle because the velocity must then be large, and
large velocity means large action. So for physical quantities based on such small fluctuations
the topologically nontrivial field configurations must contribute subleading effects. Let us
therefore compare Z(φ2 , φ1 |β) as β → 0 with the standard quantum mechanical propagator.
For small φ we can remove the phase from the B-field via ψ(φ) → e−iBφ ψ(φ), so Zq should
not depend on B. (Note this transformation is not globally defined in φ for generic B so
we cannot use it to remove B from the problem when we treat the full quantity Z exactly.)
After this transformation we expect to recover the standard propagator of a particle of
mass M = I on the line. Rotated to Euclidean space this would be:

√(M/(2πℏβ)) e^{−M(φ₂−φ₁)²/(2ℏβ)}   (15.265)
so

Z_q = √(I/(2πℏβ))   (15.266)
The net result is that

Z(φ₂, φ₁|β) = √(I/(2πβ)) Σ_{w∈Z} e^{−(2π²I/β)(w + (φ₂−φ₁)/2π)² − 2πiB(w + (φ₂−φ₁)/2π)}   (15.267)

Now compare (15.246) (with tE = β) with (15.267). These expressions look very dif-
ferent! One involves a sum of exponentials with β in the numerator and the other with β in
the denominator. One is well-suited to discussing the asymptotic behavior for β → ∞ (low
temperature) and the other for β → 0 (high temperature), respectively. Nevertheless, we
have computed the same physical quantity, just using two different methods. So they must

be the same. But the mathematical identity that says they are the same appears somewhat
miraculous. We now explain how to verify the two expressions are indeed identical using a
direct mathematical argument.
The essential fact is the Poisson summation formula discussed in section
In our case, the expression computed directly from the diagonalization of the Hamil-
tonian is
Z(φ₂, φ₁|β) = (1/2π) Σ_{m∈Z} e^{−(β/2I)(m−B)² + im(φ₁−φ₂)}
            = (1/2π) e^{iB(φ₁−φ₂)} ϑ[θ; φ](0|τ)   (15.268)

where we have written it in terms of the Riemann theta function (11.237) with:

τ = iβ/(2πI)
θ = −B   (15.269)
φ = (φ₁ − φ₂)/(2π)

On the other hand, the expression that emerges naturally from the semiclassical evaluation
of the Euclidean path integral is

Z(φ₂, φ₁|β) = √(I/(2πβ)) Σ_{w∈Z} e^{−(2π²I/β)(w + (φ₂−φ₁)/2π)² − 2πiB(w + (φ₂−φ₁)/2π)}
            = √(I/(2πβ)) ϑ[θ′; φ′](0|τ′)   (15.270)

where we have written it in terms of the Riemann theta function (??):


τ′ = 2πiI/β
θ′ = −(φ₁ − φ₂)/(2π)   (15.271)
φ′ = −B

Note that the modular transformation law of the Riemann theta function, equation (??) is
equivalent to the relation between the expressions naturally arising from the Hamiltonian
and Lagrangian approaches to evaluation of the matrix elements of the Euclidean time
propagator!
Note in particular that for the partition function proper we have, as β → 0:

Z(S¹) = (2πI/(βℏ))^{1/2} Σ_{n∈Z} e^{−2π²n²I/(βℏ) + 2πinB}
      ∼_{β→0} (2πI/(βℏ))^{1/2} (1 + 2 e^{−2π²I/(βℏ)} cos(2πB) + · · ·)   (15.272)

The overall factor of β −1/2 gives the expected divergence. The first correction term to the
factor in parentheses is an instanton effect.
Note that in the Hamiltonian version the only thing that is manifest about the high-
temperature, β → 0, limit is that Z diverges. Note that for β → 0 all the terms in the sum
contribute about equally and the sum diverges. The modular transformation reveals an
interesting duality: Once we factor out this multiplicative divergence we discover another
theta function.
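
The high/low temperature duality is easy to check numerically. The following sketch (our own illustration, with ℏ = 1 and our own parameter names) compares the Hamiltonian sum (15.244) with the instanton sum (15.272) for the partition function:

import numpy as np

def Z_hamiltonian(beta, B, I_mom=1.0, M=400):
    m = np.arange(-M, M + 1)
    return np.sum(np.exp(-beta / (2 * I_mom) * (m - B) ** 2))

def Z_instanton(beta, B, I_mom=1.0, W=400):
    n = np.arange(-W, W + 1)
    pref = np.sqrt(2 * np.pi * I_mom / beta)
    return pref * np.sum(np.exp(-2 * np.pi**2 * n**2 * I_mom / beta
                                + 2j * np.pi * n * B)).real

for beta in [0.05, 1.0, 20.0]:
    print(beta, Z_hamiltonian(beta, 0.3), Z_instanton(beta, 0.3))
    # the two columns agree; the first sum converges fast at large beta,
    # the second at small beta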

Exercise A Parity Puzzle


Suppose 2B is an odd integer.
a.) Show that as β → ∞ we have
⟨φ₂| e^{−βH/ℏ} |φ₁⟩ ∼ 2 e^{(i/2)(φ₁−φ₂)} cos((φ₁ − φ₂)/2) e^{−βE_ground} + · · ·   (15.273)

b.) Note that this expression is not invariant under φ → −φ. But in Pin+ (2) there is
an element P which corresponds to φ → −φ. How is this compatible with our argument
that Pin+ (2) is a valid symmetry of the quantum theory? 185

15.4.3 Gauging The Global SO(2) Symmetry, Chern-Simons Terms, And Anomalies
When a theory has a symmetry one can implement a procedure called “gauging the sym-
metry.” This is a two-step process:

1. Make the symmetry local and couple to a gauge field.

2. Integrate over “all possible” gauge fields consistent with the symmetry.

It is not necessary to proceed to step (2) after completing step (1). In this case, we say
that we are coupling to nondynamical external gauge fields. It makes perfectly good sense
to introduce nondynamical, external gauge fields for a symmetry. We do this all the time
in quantum mechanics courses where we couple our quantum system to an electromagnetic
field, but do not try to quantize the electromagnetic field.
For the more mathematically sophisticated reader the two-step process can be sum-
marized, somewhat more concisely and precisely, as saying that we:

1. Identify the symmetry group with the structure group of a principal bundle and we
change the bordism category in the domain of the field theory functor to include
G-bundles with connection (where G is the symmetry we are gauging).
185
Answer : Show that
P · |φi = e−iφ | − φi (15.274)
You can prove this by expanding |φ⟩ = Σ_{m∈Z} ⟨Ψm|φ⟩ Ψm . Now, using this expression check that the
propagator indeed transforms correctly.

2. Sum over isomorphism classes of principal bundles and integrate over the isomorphism
classes of connections on those bundles.

In the present simple example of the charged particle on a ring surrounding a solenoid
we can “gauge” the global SO(2) symmetry φ → φ + α that is present for all values of
B. It is then interesting to see how coupling to the external gauge field tells us about
the subtleties of combining SO(2) symmetry with charge conjugation symmetry that we
studied above. (The following discussion was inspired by Appendix D of. 186 )
So in our simple example we implement Step 1 above as follows: We seek to make the
shift symmetry local, that is, we attempt to make

φ(t) → φ(t) + α(t) (15.275)

into a symmetry where α(t) is not a constant but an “arbitrary” function of time. When
α(t) is time dependent the action ∼ ∫ φ̇² dt is not invariant under such transformations.

To compensate for this we introduce an extra function of time into the problem, call it
A(e) (t). Here the superscript e - for “external” - reminds us that this is an “external” or
“background” field: We will not do a path integral over these functions (unless we proceed
to Step 2 above). By contrast, we will do a path integral over the “dynamical” field φ(t)
or Φ(t) = eiφ(t) .
The gauged action is
S = ∫ (1/2) I (φ̇ + A^{(e)})² dt + ∮ B (φ̇ + A^{(e)}) dt
  = ∫ (1/2) I (Φ(t)⁻¹(−i d/dt + A^{(e)})Φ(t))² dt + ∮ B Φ(t)⁻¹(−i d/dt + A^{(e)})Φ(t) dt   (15.276)
This action is a functional of both the nondynamical field A(e) (t) and the dynamical field
φ(t). Note that the action is invariant under the gauge transformation:
φ(t) → φ(t) + α(t)
(15.277)
A(e) (t) → A(e) (t) − ∂t α(t)
where, for the moment, we ignore boundary terms.
It turns out that it is better to regard the function A(e) (t) as a component of a 1-form:

A(e) := A(e) (t)dt (15.278)

and, better still, A(e) is the local one-form associated to a connection on a (locally trivial-
ized) principal SO(2) bundle over the time manifold M . We stress that the gauge field A(e)
is NOT the gauge field of electromagnetism. (That field has already produced our theta
term.) Rather, it is a new field in our system: It is a gauge field for the shift symmetry of
the field φ(t). 187
186
D. Gaiotto, A. Kapustin, Z. Komargodski and N. Seiberg, “Theta, Time Reversal, and Temperature,”
https://arxiv.org/pdf/1703.00501.pdf
187
The physical interpretation of the gauging process in terms of our original charged particle on a ring is
not completely clear to the author. But the mathematical structure makes sense and is a good toy model
for other field theoretic systems where the physical interpretation of the gauging is clear.

Note that we could also write the gauge transformation in the form:

eiφ(t) → eiφ(t) eiα(t)


(15.279)
d − iA(e) → eiα(t) (d + iA(e) )e−iα(t)

This is better because it captures the geometric content more faithfully. Consequently, it makes
more sense when working on topologically nontrivial spacetimes, such as the Euclidean
time circle. One can make choices of gauge group so that it becomes important that eiα(t)
can be single-valued even when α(t) is not.
What about charge conjugation symmetry?
First consider the classical theory. In the absence of the external gauge field we noted
that there is an O(2) symmetry of the equations of motion, even though under φ(t) → −φ(t)
the theta term in the action flips sign. In the presence of the external gauge field the
equations of motion are modified, however, as we will see below, we can gauge A(e) (t) to
be a constant, and in this case they are not modified. So we still have an O(2) symmetry.
Now consider the quantum theory. One can show that, appropriately defined, the
quantum Hamiltonian is still HB . Under charge conjugation we must flip B and then we
change the Hamiltonian (unless B = 0). But, as noted above, if 2B ∈ Z in that case HB
is unitarily equivalent to H−B and we can implement a unitary operator P corresponding
to the charge conjugation operation. In the path integral the value of the action matters.
The action (15.276) is invariant under the charge conjugation transformation if we take

φ(t) → −φ(t)
A(e) → −A(e) (15.280)
B → −B

and consequently if we change B → −B we must also take A(e) → −A(e) as noted above.
We will return to the quantum implementation of charge conjugation symmetry.
Now let us re-examine the periodicity of the physics as a function of B. In the absence
of the external gauge field A(e) we found that physical quantities are periodic functions of
B with period one. However, in the presence of a nonzero A^{(e)}, the term ∫ B A^{(e)}(t) dt spoils
the periodicity in B, because the value of the action matters in the quantum theory.
We can restore a kind of periodicity in B by adding a Chern-Simons term to the action.
We will comment in detail on the Chern-Simons term below. In Euclidean space the new
action is:
e^{−S} = e^{−∫ (1/2) I (φ̇ + A_t^{(e)})² dt − i∮ B (φ̇ + A_t^{(e)}) dt} e^{ik∮ A_t^{(e)} dt}   (15.281)

and the last factor is the Chern-Simons term. By introducing the Chern-Simons term we
have introduced yet another parameter, the level k, into our theory. Classically the action
with (B, k) is equivalent to the action with (B + r, k + r) where r ∈ R is any real number.
Now that we have restored some kind of periodicity we can ask about quantum im-
plementation of charge conjugation symmetry. We must take B → −B, but quantum
mechanically the theory with B is only equivalent to that with −B when 2B ∈ Z. So, in

the quantum theory we can only hope to have charge conjugation invariance if there is an
integer N so that
(B + N, k + N ) = (−B, −k) (15.282)
In other words k = B = N/2 ∈ Z/2.
The introduction of the Chern-Simons term raises a new issue: When one sees gauge
potentials in an action that do not enter through fieldstrengths or covariant derivatives it
is important to ask about gauge invariance. In order to discuss the gauge invariance of
the Chern-Simons term properly we need first to discuss more carefully the space of gauge
fields and the group of gauge transformations.

The Space Of Gauge Fields And The Group Of Gauge Transformations

In our simple setting the space A of gauge fields can be identified with the space of
single-valued, continuous real-valued functions A(e) (t) on M .
We now must choose a gauge group G. This will be a Lie group - typically finite-
dimensional, although not necessarily connected. In our case, there are two natural choices:
We could take G = R or we could take G = U (1). Then the group of gauge transformations
is a group of maps
G = M ap[M → G] (15.283)
If M and G have positive dimension the group of gauge transformations will be an infinite-
dimensional Lie group.
Of particular interest in gauge theory is the quotient space A/G, the space of gauge
orbits, or, equivalently, the space of gauge-inequivalent field configurations.
Let us examine a few examples of A/G:

1. Let us first consider what happens when G = R and M is an interval or the real
line. Then the space A of gauge fields can be identified with the space of real-valued
continuous functions on M . The group G is the space of real-valued C 1 functions on
M , t 7→ α(t) ∈ R. The group G acts on A via

A(e) (t) → A(e) (t) − ∂t α(t) (15.284)

If M = [t1 , t2 ] with free boundary conditions on A and G then we can always solve

∂t α(t) = A(e) (t) (15.285)

for some α(t) and hence we can always gauge A(e) (t) to zero. So A/G is just a point.
188

For the discussion below it is useful to note here that the expression
exp[i ∫_{t₁}^{t₂} A_t^{(e)}(t′) dt′]   (15.286)
188
Actually, this is too naive: A/G is more properly thought of as a stack. See the section below on
groupoids.

is in general not gauge invariant. Rather:
exp[i ∫_{t₁}^{t₂} A_t^{(e)}(t′) dt′] → e^{iα(t₁)} exp[i ∫_{t₁}^{t₂} A_t^{(e)}(t′) dt′] e^{−iα(t₂)}   (15.287)

2. Now, continuing to take G = R let us consider what happens if M = R and we


impose boundary conditions that α(t) → 0 at t → ±∞. In this case
∫_{−∞}^{+∞} A^{(e)}(t) dt   (15.288)

is gauge invariant. There are no “local” invariants. From the previous discussion we
see that we can gauge A(e) (t) to zero in any compact region. In this case

A/G ≅ R   (15.289)

and the integral (15.288) fully determines the gauge equivalence class.

3. Now let us consider the case G = U (1), and let us also take M to be the Euclidean
time circle so

G = Map(S¹_{s.t.} → U(1))   (15.290)
Now, just viewing the gauge transformation as a set of continuous maps S 1 → S 1
there is a winding number. If this winding number is nonzero there is an obstruction
to finding a single-valued function α(t) so that g(t) = eiα(t) .
There is a normal subgroup G0 of small gauge transformations for which g(t) admits
a well-defined logarithm. That is, the gauge transformations g(t) ∈ G0 are of the
form g(t) = e^{iα(t)} where α(t) is single-valued. Then

1 → G₀ → G →^π Z → 1   (15.291)

where the map π can be viewed as the winding number. Gauge transformations in
G0 are known as small gauge transformations. Those which have nonzero winding
numbers are known as large gauge transformations. It is worth noting that the above
sequence splits: For w ∈ Z we can take s(w) = gw to be the gauge transformation:

gw (t) = exp[2πiwt/β] (15.292)

and gw (t)gw0 (t) = gw+w0 (t). Note that for these transformations if we tried to define
α(t) it would be α(t) = 2πwt/β and would not be single-valued when M is the circle.
Now, referring to (15.295) it is clear that if α(t) is single valued then

exp[i ∮_{S¹} A_t^{(e)}(t′) dt′]   (15.293)

is gauge invariant. However when we have a large gauge transformation gw (t) we can
cut the circle say at t = 0 and t = β and then, with α(t) = 2πwt/β with w ∈ Z the

holonomy is still gauge invariant. Note that the large gauge transformations gw (t)
take
A_t^{(e)}(t′) → A_t^{(e)}(t′) + 2πw/β   (15.294)

but preserves the holonomy (15.293). Put differently, (15.287) is generalized to


exp[i ∫_{t₁}^{t₂} A_t^{(e)}(t′) dt′] → g(t₂)⁻¹ exp[i ∫_{t₁}^{t₂} A_t^{(e)}(t′) dt′] g(t₁)   (15.295)

so that on the circle the holonomy (15.293) is gauge invariant. Equation (15.295)
generalizes nicely to the case of nonabelian groups on arbitrary spacetimes.
We can ask if there are other independent gauge invariant functions of A(e) (t) be-
sides the holonomy. Since A^{(e)}(t) is periodic we can decompose A^{(e)}(t) in a Fourier
expansion. Write Ã_t^{(e)}(t) for the sum of the nonzero frequency modes. Then we can
solve the differential equation

∂_t α(t) = Ã_t^{(e)}(t)   (15.296)

with a single-valued α(t) to choose a gauge so that

A(e) (t) = µ/β (15.297)

is constant. Put differently, A/G0 can be identified with the space of real numbers,
given by the constant µ. We will denote Ared = A/G0 . Then the “large” gauge
transformations gw (t) shift µ → µ + 2πw, with w ∈ Z. Therefore we have

A/G ∼
= Ared /Z ∼
= {[µ] = [µ + 2πw] w ∈ Z} ∼
= U (1) (15.298)

and the holonomy eiµ around the circle is a complete gauge invariant.

4. Finally, let us take the gauge group to be G = O(2) = SO(2) ⋊ Z₂. Let G_{SO(2)} be the
group of gauge transformations when the gauge group is SO(2) and G_{O(2)} be the group
of gauge transformations when the gauge group is O(2). Then G_{O(2)} = G_{SO(2)} ⋊ Z₂
and we have the exact sequence

1 → G₀ → G_{O(2)} → Z ⋊ Z₂ ≅ D∞ → 1   (15.299)

Again the sequence splits and the infinite dihedral group D∞ ≅ Z ⋊ Z₂ preserves the
space A^{red} of constant gauge fields with generators acting as

σ : µ → −µ
s : µ → µ + 2π   (15.300)
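
Examples 3 and 4 above are easy to see in a discretized setting. The following sketch (our own illustration, not from the text) samples a gauge field on the Euclidean time circle, realizes a gauge transformation by its derivative α̇, and checks how the holonomy behaves:

import numpy as np

beta, n = 2.0, 2000
t = np.linspace(0.0, beta, n, endpoint=False)
dt = beta / n
A = 0.7 / beta + 0.3 * np.cos(2 * np.pi * t / beta)       # some gauge field A^(e)(t)

holonomy = lambda A: np.sum(A) * dt                        # mu = integral of A around the circle
mu = holonomy(A)

# small gauge transformation: alpha(t) = 0.2 sin(2 pi t / beta) is single-valued on the circle
small_dot = 0.2 * (2 * np.pi / beta) * np.cos(2 * np.pi * t / beta)
# large gauge transformation g_3(t) = exp[2 pi i * 3 t / beta]: alpha is not single-valued
# but its derivative 2 pi * 3 / beta is
large_dot = (2 * np.pi * 3 / beta) * np.ones_like(t)

print(np.isclose(holonomy(A - small_dot), mu))                      # True: mu unchanged
print((holonomy(A - large_dot) - mu) / (2 * np.pi))                  # an integer: mu shifts by 2 pi w
print(np.isclose(np.exp(1j * holonomy(A - large_dot)),
                 np.exp(1j * mu)))                                   # True: e^{i mu} is fully invariant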

Comments On The Chern-Simons Term

Now we are ready to discuss the gauge invariance of the Chern-Simons term. The
Chern-Simons term on M = S 1 is invariant under G0 for any value of k. However, under
the large gauge transformations gw(t) with w ≠ 0:

exp[ik ∮_{S¹} A_t^{(e)}(t′) dt′] → e^{2πiwk} exp[ik ∮_{S¹} A_t^{(e)}(t′) dt′]   (15.301)
and therefore, if we are going to allow our theory to make sense on a circle with the gauge
group SO(2) ∼ = U (1) then k should be quantized to be an integer. Note there would be no
such quantization of k if the gauge group is taken to be R.
The above observation is related to two extremely important conceptual points that
are essential to all discussions of the use of Chern-Simons terms in quantum physics:

It is not necessary for the action to be invariant. All that is necessary for a well-defined
path integral is that the exponentiated action must be invariant.
Gauge invariance of the “Chern-Simons term” under large gauge transformations implies
that the level k, one of the couplings of the theory, is quantized: k ∈ Z.

In our case the action in equation (15.281) is not gauge invariant! Under large gauge
transformations with winding number w we have S → S + 2πikw with w ∈ Z. However,
having a well-defined measure in the path integral only requires e−S to be well-defined,
and this will be the case if, and only if, k ∈ Z.
Note that

exp[ik ∮_{S¹} A_t^{(e)}(t′) dt′] = e^{ikµ}   (15.302)
and, since k ∈ Z is quantized this is properly periodic under µ → µ + 2π and the expo-
nentiated Chern-Simons term descends to a well-defined function on A/G. For the group
O(2) we would have to consider cos(µ). (Of course, cos(nµ) is a Tchebyshev polynomial
of the basic invariant cos(µ).)

Anomalies

We can now discuss, very generally, the notion of anomalies. In quantum systems we
typically have both “dynamical variables” such as dynamical fields, degrees of freedom,
etc. as well as “external” or “background” or “control” variables. We will denote generic
“background fields” by φbck and generic “dynamical fields” by φdyn . Any parameter of the
theory should be considered a “field.” The space of all fields is then fibered:
F^{dyn} → F
          ↓              (15.303)
       F^{bck}

In the very simple situation we are discussing here the fibration is just a Cartesian product.
In the computation of physical quantities we will typically integrate over F dyn thus
producing a function (or, more generally, a section of a bundle) on F bck . We then study
the physical quantity as a function on F bck .
In our example F dyn can be taken to be the set of functions Φ(t) : M → U (1) and F bck
can be taken to be the set of functions A(e) (t), or better, the connections on a principal
G-bundle over M where in our present examples G = R, SO(2) or O(2). 189
Now suppose that there is a group G acting on F so that physical quantities are formally
invariant. For example, if we have an invariant action S[φdyn ; φbck ] and a formally invariant
measure, then the path integral will be formally invariant. Then, physical quantities such
as the partition function:

Z[φ^{bck}] = ∫_{F^{dyn}} e^{−S[φ^{dyn}; φ^{bck}]} vol(φ^{dyn})   (15.304)

will, formally define a G-invariant function on F bck . However, it can happen that when
one defines the path integral carefully the partition function fails to be G-invariant. In
that case we say that there is a potential anomaly. Sometimes potential anomalies can be
removed by physically unimportant redefinitions. When this cannot be done we say there
is an anomaly.
If we tried to consider the Chern-Simons term for k ∉ Z we would say it has an
anomaly. For any value of k it descends to a function on A/G₀. However, only when k ∈ Z
does it descend to a well-defined function on A/G.
It can happen that there can be different subgroups H1 ⊂ G and H2 ⊂ G such that
there are different definitions of the path integral so that it is invariant either under H1 or
under H2 but there is no definition so that it is invariant simultaneously under both H1
and H2 . In this case we say there is a mixed anomaly.

The Partition Function As Function On A And Its Behavior Under The Action Of G

Let us now illustrate some of the above ideas about anomalies by examining the par-
tition function in our example of the gauged particle on a ring.
There will not be any interesting anomalies under G0 . As we have explained we can
always use G0 to gauge A(e) to be a constant 1-form, and we will henceforth take our gauge
field to be constant. Then the equation of motion is the same as before, and performing
the path integral just as in the previous section we find
Z(µ) = e^{ikµ} Z_q Σ_{w∈Z} e^{−(2π²I/β)(w + µ/2π)² − 2πiB(w + µ/2π)}
     = e^{ikµ} Z_q ϑ[µ/2π; −B](0|τ)   (15.305)
189
One can also promote B to be a field for fun and profit. This has been discussed in many places. See
e-Print: 1905.09315 for a recent discussion.

with τ = i 2πI
β . All we need to do here is replace the value of the classical action for solutions
with φ̇ = 2πw/β by making the substitution w → w + µ/2π.
As in the case without the external gauge field there is a Hamiltonian interpretation.
Performing the Poisson summation (or using the modular transformation law of the theta
function) we get:
1 X − β (m−B)2 −i(m−B)µ
Z(µ) = ei(k−B)µ e 2I (15.306)

m∈Z

It can be shown that the Euclidean path integral with action (15.193) is in fact equal to

Z(µ) = eikµ Tre−βHB eiµQ (15.307)

where Q is the operator measuring the charge of the SO(2) symmetry we gauged. In our
case QΨm = mΨm . 190 As noted above we still have

H_B = (1/2I) (−i ∂/∂φ − B)²   (15.308)

acting on L2 (X ) = L2 (S 1 ). One easily checks the equality of (15.306) and (15.307).


From either point of view Z(µ) is a periodic function of µ and there is no anomaly
under the group Z of large gauge transformations, so long as k ∈ Z.

The Gauge Group O(2) and Mixed Anomalies

What happens if we try to extend the gauge group to gauge the full O(2)? Then, as
we have seen, the quotient group G/G0 is the infinite dihedral group generated by σ and s
defined in equation (15.300) above.
If 2B is even then we can take k = B ∈ Z. The partition function is invariant under
µ → −µ and has the expected periodicity µ ∼ µ + 2π. In other words Z(µ) is invariant
under the group of large gauge transformations isomorphic to D∞ and generated by σ and
s, so it descends to a function on A/G and there is no anomaly.
Things are much more subtle when 2B is odd. As we saw, we can only expect charge
conjugation symmetry when

k = B ∈ Z + 1/2   (15.309)
But this clashes with the constraint k ∈ Z. So we see an example of a mixed anomaly.
It is interesting to see how the mixed anomaly is manifested in the partition function.
The main point can be seen most easily by considering the leading term in the β → ∞
expansion which is (taking B = 1/2 for simplicity):

Z → e^{−βE_ground} e^{i(k−1/2)µ} (e^{iµ/2} + e^{−iµ/2}) + · · ·   (15.310)

190
To give a first-principles proof of why this should be so we gauge away A(e) in the path integral.
The result is an identification of the fields at t = 0 with the fields at t = β accompanied by a gauge
transformation by the holonomy eiµ which takes eiφ → ei(φ+µ) . So Ψm → eimµ Ψm = eiµQ Ψm .

If k = 0 then

Z → e^{−βE_ground} (1 + e^{−iµ}) + · · ·   (15.311)

the expression is properly periodic in µ, but not invariant under the analog of charge
conjugation: µ → −µ. This is not surprising since k 6= B.
As we will discuss below, by changing the physical system (yet again!) there is a
way to make sense of the half-integral level Chern-Simons term. If we just go ahead and
mindlessly substitute k = B = 1/2 in the above formula for Z(µ) we get:

Z → e^{−βE_ground} (e^{iµ/2} + e^{−iµ/2}) + · · ·   (15.312)

The action is now invariant under the generator σ : µ → −µ of D∞ but it is no longer
invariant under the generator s : µ → µ + 2π.
Provided we view the different choices of Chern-Simons terms as different definitions
of the theory, we can define the theory to be invariant under the group generated by s, but
with that definition σ is anomalous, or we can define the theory to be invariant under the
group generated by σ, but then with that definition s is anomalous. So in this sense there
is a mixed anomaly of Z and Z2 in the D∞ subgroup of global gauge transformations.

Making sense of Chern-Simons terms with half-integer level

There is a way to make sense of the half-integer quantized Chern-Simons term by


viewing the 0 + 1 dimensional theory as the boundary of a well-defined 1 + 1 dimensional
theory. By Stokes’ theorem we have:

exp[ik ∮_{S¹} A_t^{(e)} dt] = exp[ik ∫_Σ F^{(e)}]   (15.313)

where F (e) = dA(e) . The RHS makes sense even if k is not an integer, but now the
expression depends on details of the gauge field in the “bulk” of the 1 + 1 dimensional
spacetime Σ.
A very analogous phenomenon is observed in real condensed matter systems where the
boundary theory of a 3+1 dimensional topological insulator is described by a Chern-Simons
theory with half-integral level. (That is, half the level allowed by naive gauge invariance.)
Such half-integral level Chern-Simons terms come up in many interesting physical
systems. For example, half-integral (spin) Chern-Simons theory is needed to describe the
topological features of the fractional quantum Hall effect. In supersymmetric field theories
and string theories many of the supergravity effective actions and brane effective actions
involve half-integrally quantized Chern-Simons terms.

Exercise Puzzle
Warning: This exercise requires some knowledge of topology.

Resolve the following paradox:
We first argued that, if k ∈
/ Z then the LHS of (15.313) is not invariant under large
gauge transformations. Then we proceeded to define the LHS by the expression on the
RHS which is manifestly gauge invariant.
How can these two statements be compatible? 191

15.5 Heisenberg Extensions


Consider again a central extension

1 → A → G̃ → G → 1   (15.314)

In many of the examples above we had G Abelian and G̃ was also Abelian. However, as
our examples with Q and D4 have shown, in general G̃ need not be Abelian. (See equation
(15.177).) In this section we focus on an important class of examples where G is Abelian
and Ge is non-Abelian. They are known as Heisenberg groups and Heisenberg extensions. In
fact in the literature closely related but slightly different things are meant by “Heisenberg
extensions” and “Heisenberg groups.” These kinds of extensions show up all the time in
physics, in many different ways. They are very basic in quantum field theory and other
areas of physics, so we are going to dwell upon them a bit.

15.5.1 Heisenberg Groups: The Basic Motivating Example


Those who have taken quantum mechanics will be familiar with the relation between po-
sition and momentum operators for the quantum mechanics of a particle on the real line:

[q̂, p̂] = i~ (15.315)

One realization of these operator relations is in terms of normalizable wavefunctions ψ(q)


where we write:

(q̂ · ψ)(q) = q ψ(q)
(p̂ · ψ)(q) = −iℏ dψ/dq   (15.316)

Now, let us consider the operators

U (α) := exp[iαp̂]
(15.317)
V (β) := exp[iβ q̂]

These are unitary when α, β are real. When α is real U (α) implements translation in
position space by ~α. When β is real V (β) implements translation in momentum space by
−~β.
191
Answer : The gauge transformation eiα(t) must extend to a continuous map Σ → U (1). If Σ is a smooth
manifold whose only boundary is S 1 , as we have tacitly assumed in writing equation (15.313), then such
maps always restrict to small gauge transformations on the bounding S 1 .

The group of operators {U (α)|α ∈ R} is isomorphic to R because U (α1 )U (α2 ) =
U (α1 + α2 ). A similar statement holds for the group of operators V (β). But when we take
products of both U (α) and V (β) operators we do not get the group R ⊕ R of translations
in position and momentum, separately. Rather, one can show in a number of ways that:

U (α)V (β) = ei~αβ V (β)U (α) (15.318)

This is an extremely important equation. We can understand it in many different ways.


We will explain three ways to derive it. First, it immediately follows from the BCH formula
since [q̂, p̂] is central.
A second way to derive (15.318) is to evaluate both operators on a wavefunction in the
position representation. So, on the one hand:

((U (α)V (β)) · ψ) (q) = (V (β) · ψ) (q + ~α)


(15.319)
= eiβ(q+~α) ψ(q + ~α)

On the other hand


((V (β)U (α)) · ψ) (q) = eiβq (U (α) · ψ) (q + ~α)
(15.320)
= eiβq ψ(q + ~α)

Comparing (15.319) with (15.320) we arrive at (15.318). The reader should compare this
with our discussion of quantum mechanics with a finite number of degrees of freedom,
especially the derivation of (11.251).
Here is a third derivation of (15.318): Using (11.441) it follows that

U (α)q̂U (α)−1 = eiαAd(p̂) q̂ = q̂ + ~α (15.321)

V(β) p̂ V(β)⁻¹ = e^{iβAd(q̂)} p̂ = p̂ − ℏβ   (15.322)


Now using (11.445) we obtain (15.318).
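
The second derivation can also be carried out numerically: realize U(α) and V(β) by their action in the position representation and compare (15.319) with (15.320) pointwise. A minimal sketch (our own illustration; the test wavefunction and parameter values are ours):

import numpy as np

hbar, alpha, beta = 1.0, 0.37, 1.21
psi = lambda q: np.exp(-q**2) * (1 + 0.5 * q)          # any test wavefunction

U = lambda f: (lambda q: f(q + hbar * alpha))           # U(alpha): translation by hbar*alpha
V = lambda f: (lambda q: np.exp(1j * beta * q) * f(q))  # V(beta): multiplication by e^{i beta q}

q = np.linspace(-3, 3, 7)
lhs = U(V(psi))(q)                                      # (U(alpha) V(beta) psi)(q)
rhs = np.exp(1j * hbar * alpha * beta) * V(U(psi))(q)   # e^{i hbar alpha beta} (V(beta) U(alpha) psi)(q)
print(np.allclose(lhs, rhs))                            # True: this is (15.318)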
Returning to the group generated by the operators U (α) and V (α) for α ∈ R, which
we’ll denote Heis(R × R), it fits in a central extension:

1 → U (1) → Heis(R × R) → R × R → 1 (15.323)

With one (nice) choice of cocycle we can write the group law as:

(z₁, (α₁, β₁)) · (z₂, (α₂, β₂)) = (z₁ z₂ e^{(i/2)ℏ(α₁β₂ − α₂β₁)}, (α₁ + α₂, β₁ + β₂))   (15.324)

Notice that the cocycle is expressed in terms of the anti-symmetric form

ω(v₁, v₂) := α₁β₂ − α₂β₁ = (α₁  β₁) ( 0  1 ; −1  0 ) (α₂ ; β₂) = v₁^{tr} J v₂   (15.325)

where

v = (α ; β)   (15.326)

and the matrix

J = ( 0  1_n ; −1_n  0 )   (15.327)
was used at the very beginning of the course (see equation (2.21)) to define the symplectic
group Sp(2n, κ) as the group of matrices such that Atr JA = J.
ω is called a symplectic form. Note that
ω(A~v1 , A~v2 ) = ω(~v1 , ~v2 ) (15.328)
for A ∈ Sp(2, R). We say that ω is invariant under symplectic transformations.
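
The group law (15.324) can be exercised directly. A small sketch (our own illustration): we implement the multiplication with the cocycle e^{(i/2)ℏω(v₁,v₂)} and check associativity and that the group commutator lands in the center, with phase e^{iℏω(v₁,v₂)}.

import numpy as np

hbar = 1.0
omega = lambda v1, v2: v1[0] * v2[1] - v1[1] * v2[0]

def mult(g1, g2):
    (z1, v1), (z2, v2) = g1, g2
    return (z1 * z2 * np.exp(0.5j * hbar * omega(v1, v2)),
            (v1[0] + v2[0], v1[1] + v2[1]))

def inverse(g):
    z, v = g
    return (1.0 / z, (-v[0], -v[1]))          # the cocycle vanishes on (v, -v)

rng = np.random.default_rng(1)
g = [(np.exp(1j * rng.uniform(0, 2*np.pi)), tuple(rng.normal(size=2))) for _ in range(3)]

a = mult(mult(g[0], g[1]), g[2])
b = mult(g[0], mult(g[1], g[2]))
print(np.isclose(a[0], b[0]), np.allclose(a[1], b[1]))                  # associativity holds

comm = mult(mult(g[0], g[1]), mult(inverse(g[0]), inverse(g[1])))       # g1 g2 g1^{-1} g2^{-1}
print(np.isclose(comm[0], np.exp(1j * hbar * omega(g[0][1], g[1][1]))),
      np.allclose(comm[1], (0, 0)))                                     # central phase e^{i hbar omega}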

Exercise
Referring to equations (15.318) and (15.316) et. seq.
a.) Show that the choice of section
s(α, β) = U (α)V (β) (15.329)
leads to the cocycle
f((α₁, β₁), (α₂, β₂)) = e^{iℏ(α₁β₁ + α₂β₂ + α₁β₂)}   (15.330)
b.) Show that the choice of section
s(α, β) = exp[i(αp̂ + β q̂)] (15.331)
leads to the cocycle

f((α₁, β₁), (α₂, β₂)) = e^{(i/2)ℏ(α₁β₂ − α₂β₁)}   (15.332)
c.) Find an explicit coboundary that relates the cocycle (15.330) to (15.332). ♣NEED TO
PROVIDE
ANSWER HERE. ♣

Exercise Generalization To Heis(Rn ⊕ Rn )


Consider the group generated by exponentiating linear combinations of q̂ i and p̂i ,
i = 1, . . . , n with the nonzero commutators being:
[q̂ i , p̂j ] = i~δ i j (15.333)
Then for

v⃗ = (αᵢ ; βⱼ)   (15.334)

we can choose section:

s(v⃗) = exp[ i(αᵢ p̂ᵢ + βᵢ q̂ⁱ) ]   (15.335)

Show that the resulting group law is

(z₁, v⃗₁) · (z₂, v⃗₂) = (z₁ z₂ e^{i(ℏ/2) ω(v⃗₁, v⃗₂)}, v⃗₁ + v⃗₂)   (15.336)
with ω defined by equation (11.470).

15.5.2 Example: The Magnetic Translation Group For Two-Dimensional Electrons
In the presence of an electromagnetic field the group of translations acting on charged
particles definitely becomes centrally extended. This shows up naturally when discussing a
charged nonrelativistic particle confined to two spatial dimensions and moving in a constant
magnetic field B. In one convenient gauge the Hamiltonian is

H = (1/2m) [(p₁ + eBx₂/2)² + (p₂ − eBx₁/2)²] = (1/2m) (p̃₁² + p̃₂²)   (15.337)

where the gauge invariant momenta p̃ᵢ := pᵢ − eAᵢ are

p̃₁ = p₁ + (eB/2) x₂
p̃₂ = p₂ − (eB/2) x₁   (15.338)
Ordinary translations are generated by p₁, p₂ and do not commute with the Hamiltonian:
We have lost translation invariance. Nevertheless we can define the magnetic translation
operators:

π₁ := p₁ − (eB/2) x₂ ,   π₂ := p₂ + (eB/2) x₁   (15.339)
Compare this carefully with the definitions of p̃i . Note the relative signs! These operators
satisfy [πi , p̃j ] = 0. In particular they are translation-like operators that commute with the
Hamiltonian: [πi , H] = 0. Hence the name. While they are called “translation operators”
note that they do not commute:

[π1 , π2 ] = −i~eB (15.340)
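
These commutators are quick to verify symbolically by acting on a test function f(x₁, x₂) with pᵢ = −iℏ∂ᵢ. A sketch of such a check (our own illustration, using sympy):

import sympy as sp

x1, x2, e, B, hbar = sp.symbols('x1 x2 e B hbar', real=True)
f = sp.Function('f')(x1, x2)

p = lambda xi, g: -sp.I * hbar * sp.diff(g, xi)
pt1 = lambda g: p(x1, g) + e * B * x2 / 2 * g       # p-tilde_1
pt2 = lambda g: p(x2, g) - e * B * x1 / 2 * g       # p-tilde_2
pi1 = lambda g: p(x1, g) - e * B * x2 / 2 * g       # magnetic translation generator pi_1
pi2 = lambda g: p(x2, g) + e * B * x1 / 2 * g       # pi_2

comm = lambda A, C: sp.simplify(A(C(f)) - C(A(f)))
print(comm(pi1, pt1), comm(pi1, pt2), comm(pi2, pt1), comm(pi2, pt2))   # all zero: [pi_i, p-tilde_j] = 0
print(sp.simplify(comm(pi1, pi2) / f))                                   # -I*B*e*hbar, i.e. (15.340)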

The “magnetic translation group” is generated by the operators

U (a1 ) = exp[ia1 π1 /~]


(15.341)
V (a2 ) = exp[ia2 π2 /~]

The operators U (a1 ), V (a2 ) satisfy the relations:

U (a1 )V (a2 ) = exp[ieBa1 a2 /~]V (a2 )U (a1 ) (15.342)

If we are interested in quantized values of a1 , a2 (as, for example, if the charged particle is
moving in a lattice, or is confined to a torus) then we obtain the basic relations (15.522).
Note that
exp[ieBa1 a2 /~] = exp[2πiΦ/Φ0 ] (15.343)
where Φ = Ba1 a2 is the flux through an area element a1 a2 and Φ0 = h/e is known as the
magnetic flux quantum. 192
192
One should be careful about a factor of two here since in superconductivity the condensing field has
charge 2e and hence the official definition of the term “flux quantum” used, for example, by NIST is
Φ0 = h/2e, half the value we use.

Remark: The group generated by the operators U = U (a1 ) and V = V (a2 ) is a
Heisenberg group, but it is also interesting to consider the algebra of operators generated by
U, V . This algebra admits a C∗-algebra structure and is sometimes referred to as the algebra
of functions on the noncommutative torus or the irrational rotation algebra. Abstractly it
is the C∗-algebra generated by unitary operators U, V satisfying U V = e2πiθ V U for some θ.
The properties of the algebra are very different for θ rational and irrational. The algebra
Aθ figures prominently in applications of noncommutative geometry to the quantum Hall effect and to toroidal compactifications of string theory.

15.5.3 The Commutator Function And The Definition Of A General Heisenberg Group
Let us now step back and think more generally about central extensions of G by A where
G is also abelian. From the exercise (15.190) we know that for G abelian the commutator
is
        [(a1 , g1 ), (a2 , g2 )] = ( f (g1 , g2 )/f (g2 , g1 ) , 1 )          (15.344)
(We are writing 1/f (g2 , g1 ) for f (g2 , g1 )−1 and since A is abelian the order doesn’t matter,
so we write a fraction as above.)
The function κ : G × G → A defined by
        κ(g1 , g2 ) = f (g1 , g2 )/f (g2 , g1 )          (15.345)
is known as the commutator function.
Note that:

1. The commutator function is gauge invariant, in the sense that it does not change
under the change of 2-cocycle f by a coboundary. (Check that! This uses the property
that G is abelian). It is therefore a more intrinsic quantity associated with the central
extension.

2. κ(g, 1) = κ(1, g) = 1. (This follows from exercise (15.92) above.)

3. The extension G̃ is abelian iff κ(g1 , g2 ) = 1, that is, iff there exists a symmetric
cocycle f . 193

4. κ is skew :
κ(g1 , g2 ) = κ(g2 , g1 )−1 (15.346)

5. κ is alternating:
κ(g, g) = 1 (15.347)

6. κ is bimultiplicative:
κ(g1 g2 , g3 ) = κ(g1 , g3 )κ(g2 , g3 ) (15.348)
κ(g1 , g2 g3 ) = κ(g1 , g2 )κ(g1 , g3 ) (15.349)
193
Note that in our example of Q, D4 as extensions the cocycle we computed was not symmetric.

All of these properties except, perhaps, the last, are obvious. To prove the bimulti-
plicative properties (it suffices to prove just one) we rewrite (15.348) as

f (g1 g2 , g3 )f (g3 , g2 )f (g3 , g1 ) = f (g2 , g3 )f (g1 , g3 )f (g3 , g1 g2 ) (15.350)

Now multiply the equation by f (g1 , g2 ) and use the fact that A is abelian to write

(f (g1 , g2 )f (g1 g2 , g3 ))f (g3 , g2 )f (g3 , g1 ) = f (g2 , g3 )f (g1 , g3 )(f (g1 , g2 )f (g3 , g1 g2 )) (15.351)

We apply the cocycle identity on both the LHS and the RHS (and also use the fact that
G is abelian) to get

f (g2 , g3 )f (g1 , g2 g3 )f (g3 , g2 )f (g3 , g1 ) = f (g2 , g3 )f (g1 , g3 )f (g3 , g1 )f (g3 g1 , g2 ) (15.352)

Now canceling some factors and using that A is abelian we have

f (g1 , g2 g3 )f (g3 , g2 ) = f (g1 , g3 )f (g3 g1 , g2 ) (15.353)

Now use the fact that G is abelian to write this as

f (g1 , g3 g2 )f (g3 , g2 ) = f (g1 , g3 )f (g1 g3 , g2 ) (15.354)

which is the cocycle identity. This proves the bimultiplicative property (15.348). ♠
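
Properties 1–6 can also be spot-checked numerically for the Heis(R ⊕ R) cocycle of (15.332). The following sketch (not from the notes; ℏ = 1) verifies bimultiplicativity, the alternating and skew properties, and the invariance of κ under shifting f by a coboundary built from an arbitrary function b:

# Numerical spot check (not from the notes), hbar = 1: for the Heis(R + R) cocycle
# (15.332), the commutator function kappa = f(v1,v2)/f(v2,v1) is bimultiplicative,
# alternating, skew, and unchanged when f is shifted by a coboundary.
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])
omega = lambda u, v: u @ J @ v
f = lambda u, v: np.exp(0.5j * omega(u, v))
kappa = lambda u, v: f(u, v) / f(v, u)

rng = np.random.default_rng(1)
u, v, w = rng.normal(size=(3, 2))
assert np.isclose(kappa(u + v, w), kappa(u, w) * kappa(v, w))   # bimultiplicative
assert np.isclose(kappa(u, u), 1.0)                             # alternating
assert np.isclose(kappa(u, v), 1.0 / kappa(v, u))               # skew

b = lambda x: np.exp(1j * (x[0] ** 3 - 2.0 * x[1]))             # arbitrary U(1)-valued function
f2 = lambda x, y: f(x, y) * b(x) * b(y) / b(x + y)              # f shifted by a coboundary
assert np.isclose(kappa(u, v), f2(u, v) / f2(v, u))             # kappa is gauge invariant
print("commutator-function properties verified")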
We now define the Heisenberg extensions. The function κ is said to be nondegenerate
if for all g1 ≠ 1 there is a g2 with κ(g1 , g2 ) ≠ 1. When this is the case the center of G̃ is
precisely A:
        Z(G̃) ∼= A .          (15.355)

This follows immediately from equation (15.344). If κ is degenerate the center will be larger. In the extreme case that κ(g1 , g2 ) = 1 for all g1 , g2 we get the direct product G̃ = A × G and

        Z(G̃) = G̃ .          (15.356)
In general, we will have an intermediate situation and A will be a proper subgroup of Z(G̃).
One definition which is used in the literature is

Definition: A Heisenberg extension is a central extension of an abelian group G by an abelian group A where the commutator function κ is nondegenerate.

Exercise Alternating implies skew


Show that a map κ : G × G → A which satisfies the bimultiplicative identity (15.348)
and the alternating identity (15.347) is also skew, that is, satisfies (15.346).

Exercise Commutator function for Heis(R ⊕ R)
a.) Show that both of the cocycles (15.330) and (15.332) defining groups isomorphic
to Heis(R ⊕ R) have the same commutator function

κ((α1 , β1 ), (α2 , β2 )) = e^{iℏ(α1 β2 −α2 β1 )} = e^{iℏω(v1 ,v2 )}          (15.357)

b.) Similarly show that for Heis(Rn ⊕ Rn ) we have

κ(v1 , v2 ) = exp (i~ω(v1 , v2 )) (15.358)

where
ω(v1 , v2 ) = v1tr Jv2 (15.359)

15.5.4 Classification Of U (1) Central Extensions Using The Commutator Function
For a large class of abelian groups G, there is a nice theorem regarding arbitrary central
extensions by U (1). We consider

1. Finitely generated Abelian groups. As we will prove in sections 16.2 and 16.3 below,
these can be (noncanonically) written as products of cyclic groups Zn , for various n,
and a lattice Zd for some d (possibly d = 0).

2. Vector spaces. 194

3. Tori. These are isomorphic to V /Zd where V is a d-dimensional real vector space.

4. Direct products of the above three.

Remark: This class of groups can be characterized as the set of Abelian groups A which
are topological groups so there is an exact sequence:

0 → π1 (A) → Lie(A) → A → π0 (A) → 0 (15.360)

where Lie(A) is a vector space that projects to A by an exponential map.


For this class of groups we have the following theorem:

Theorem Let G be a topological Abelian group of the above class. The isomorphism
classes of central extensions of G by U (1) are in one-one correspondence with continuous
bimultiplicative maps

κ : G × G → U (1) (15.361)
which are alternating (and hence skew).
194
Topological, separable.

I do not know who originally proved this theorem, but one proof can be found in 195

Remarks

1. In other words, given the commutator function κ one can always find a corresponding
cocycle f . This theorem is useful because κ is invariant under change of f by a
coboundary, and moreover the bimultiplicative property is simpler to check than the
cocycle identity. (In fact, one can show that it is always possible to find a cocycle f
which is bimultiplicative. This property automatically ensures the cocycle relation.)

2. It is important to realize that κ only characterizes G̃ up to noncanonical isomorphism:


to give a definite group one must choose a definite cocycle. ♣Explain this
comment more. ♣

3. In this theorem we can replace U (1) by any subgroup of U (1), such as Zn realized as
the group of nth roots of unity.

15.5.5 Pontryagin Duality And The Stone-von Neumann-Mackey Theorem


The theorem of section 15.5.4 is nicely illustrated by the Heisenberg extension of the
Abelian group G = S × Sb by A = U (1). Here S is any locally compact Abelian group and
Sb is the Pontryagin dual.
Now, there is a very natural alternating, skew, bilinear map on the product S × Sb
defined by
        κ((s1 , χ1 ), (s2 , χ2 )) := χ2 (s1 )/χ1 (s2 )          (15.362)
and according to the above theorem this defines a general Heisenberg extension

        1 → U (1) → Heis(S × Ŝ) → S × Ŝ → 1          (15.363)

at least up to isomorphism.

Remarks

1. Note that one natural cocycle giving the commutator function (15.362) is

        f ((s1 , χ1 ), (s2 , χ2 )) := 1/χ1 (s2 )          (15.364)

2. There is a very natural representation of (15.363). We need a Haar measure on S


so that we can define V = L2 (S), a Hilbert space of complex-valued functions on S.
195
D. Freed, G. Moore, G. Segal, “The uncertainty of fluxes,” Commun.Math.Phys. 271 (2007) 247-274,
arXiv:hep-th/0605198, Proposition A.1.

For our key examples we have

        ⟨ψ1 , ψ2 ⟩ = ∫_R ψ1∗ (x)ψ2 (x) dx                                  S = R
                   = Σ_{n∈Z} ψ1∗ (n)ψ2 (n)                                  S = Z
                   = (1/2π) ∫_0^{2π} ψ1∗ (e^{iθ})ψ2 (e^{iθ}) dθ             S = U (1)
                   = (1/n) Σ_{k=0}^{n−1} ψ1∗ (k̄)ψ2 (k̄)                      S = Z/nZ          (15.365)

We represent s0 ∈ S as a translation operator:

(Ts0 · Ψ)(s) := Ψ(s + s0 ) (15.366)

and we represent χ0 ∈ Sb as a multiplication operator

(Mχ0 · Ψ)(s) := χ0 (s)Ψ(s) (15.367)

Then one checks that this does not define a representation of the direct product S × Sb
but rather we have the operator equation:

Ts0 Mχ0 = χ0 (s0 )Mχ0 Ts0 (15.368)

If O is the group of operators generated by Ts , Mχ and z ∈ U (1) acting on V then


the map
(z; (s, χ)) → zTs Mχ (15.369)
is a homomorphism, and in fact an isomorphism of Heis(S × Ŝ) with O, where we use
the cocycle (15.364).
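
For the finite example S = Z/nZ the operators (15.366)–(15.367) are just the n × n “shift” and “clock” matrices, and the relation (15.368) can be checked directly. A minimal sketch (not from the notes):

# Finite-dimensional illustration (not from the notes): for S = Z/nZ the translation
# and multiplication operators (15.366)-(15.367) are the n x n shift and clock
# matrices, and the Heisenberg relation (15.368), T_s M_chi = chi(s) M_chi T_s,
# can be checked directly.
import numpy as np

n = 5
w = np.exp(2j * np.pi / n)

def T(s):            # (T_s psi)(x) = psi(x + s)
    return np.array([[1.0 if (j - i) % n == s % n else 0.0 for j in range(n)]
                     for i in range(n)])

def M(t):            # (M_chi psi)(x) = chi_t(x) psi(x), with chi_t(x) = w^(t*x)
    return np.diag([w ** (t * x) for x in range(n)])

s, t = 2, 3
assert np.allclose(T(s) @ M(t), w ** (t * s) * M(t) @ T(s))
print("T_s M_chi = chi(s) M_chi T_s verified for S = Z/5Z")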

3. This construction is extremely important in quantum mechanics and also in the


description of free quantum field theories. In these cases we take S to be a vector
space. For example, for the quantum mechanics of a particle in Rn we have V = Rn
as an Abelian group. For the quantum field theory of a free real scalar field in d + 1
dimensions we would take V to be a suitable space of real-valued functions on a d-
dimensional spatial slice. 196 Then we introduce the dual vector space V ∨ ∼= Ŝ, and
the canonical pairing V × V ∨ → R gives the character:

χk (v) = eik·v (15.370)

Then, Ts in our general discussion is the operator U (s) of the basic motivating exam-
ple, and Mχk is the operator V (k) of the basic motivating example and the general
identity (15.368) becomes our starting point:

U (s)V (k) = eik·s V (k)U (s) (15.371)


196
Technical point: This will not be locally compact. So, if one wishes to be rigorous, further considerations
are required.

4. Stone-von Neumann-Mackey Theorem. Up to isomorphism (equivalence) there is a
unique irreducible unitary representation of Heis(S × S) b such that A = U (1) acts by
scalar multiplication. That is, if ξ ∈ U (1) then we require ρ(ξ) = ξ1V . In addition we
need some further technical hypotheses. 197 This is called the Stone-von Neumann
theorem or sometimes the Stone-von Neumann-Mackey theorem. For a relatively
short proof see. 198 The main idea is to consider the maximal Abelian subgroups
of Heis(S × Ŝ). One such subgroup is isomorphic to U (1) × S, another is U (1) × Ŝ.
Let us consider U (1) × Ŝ. Over the subgroup {1} × Ŝ we can split the sequence and
consider the elements:
Mχ := ρ(1, (0, χ)) (15.372)
where 1 ∈ U (1), 0 ∈ S, and χ ∈ Ŝ. These operators commute for different choices of χ
and are simultaneously diagonalizable so we have a complete basis ψα of simultaneous
eigenvectors for the representation with eigenvalues: 199

Mχ ψα = λα (χ)ψα (15.373)

Clearly χ 7→ λα (χ) must be a character on Ŝ, so the eigenvalues can be identified
with characters on Ŝ and therefore, by Pontryagin duality, can be identified with
elements sα ∈ S. So there is some set {sα } whose elements are drawn from S with
corresponding eigenbasis ψα with

Mχ ψα = χ(sα )ψα (15.374)

Now, for any s ∈ S define the operator

Ts := ρ(1, (s, 1)) (15.375)

Choose some particular α0 and consider the vectors Ts ψα0 for s ∈ S. Using the
Heisenberg relations and the fact that the central U (1) group just acts by scalars we
know that:
Mχ (Ts ψα0 ) = χ(s − sα0 ) (Ts ψα0 ) (15.376)
Therefore the span of {Ts ψα0 }s∈S , which is the span of {Ts+sα0 ψα0 }s∈S is a copy of
the representation constructed above. So if (V, ρ) is irreducible, it must be equiva-
lent to our representation constructed above. This demonstrates uniqueness of the
irreducible representation.

197
We need the representation to be continuous in the norm topology and we need S to have a translation-invariant measure so that L2 (A) makes sense. Then we replace V above by the Hilbert space L2 (S).
198
A. Prasad, “An easy proof of the Stone-von Neumann-Mackey Theorem,” arXiv:0912.0574.
199
It is exactly at this point that we are using unitarity, and a careful discussion requires more functional analysis.

5. Simple Proof Of Irreducibility for S = Rn . ♣This remark assumes at least a little bit of knowledge from the linear algebra chapter 2 and the idea of an irreducible representation, from chapter 4. ♣ ♣The α, β here are reversed from the convention in the previous section. ♣ For v = (α, β) ∈ R2n we introduced the section

        s(v) := exp[i(αq̂ + β p̂)]          (15.377)

Now, consider the standard representation of Heis(Rn × R̂n ) on H = L2 (Rn ). For
any two vectors ψ1 , ψ2 ∈ H define the Wigner function:

W (ψ1 , ψ2 )(v) := hs(v)ψ1 , ψ2 i (15.378)

This is a function on the phase space v ∈ R2n . Now we compute:

        (s(v)ψ1 )(q) = e^{iαβ/2} e^{iαq} ψ1 (q + β)          (15.379)

Shifting the integration variable by β/2 in the inner product we have

        W (ψ1 , ψ2 )(v) = ∫_{Rn} e^{−iαq} ψ1∗ (q + β/2) ψ2 (q − β/2) dq          (15.380)

Now, an elementary computation shows that, on L2 (R2n ) with the standard measure, we have

        ‖ W (ψ1 , ψ2 ) ‖² = ‖ ψ1 ‖² ‖ ψ2 ‖²          (15.381)

Here are some details: ♣Need to clean up some 2π’s here... ♣

        ‖ W (ψ1 , ψ2 ) ‖² = ∫_{R2n} |W (ψ1 , ψ2 )|² d^{2n}v
                          = ∫ dα dβ dq1 dq2 e^{iα(q1 −q2 )} ψ1 (q1 + β/2)ψ2∗ (q1 − β/2)ψ1∗ (q2 + β/2)ψ2 (q2 − β/2)
                          = ∫ dq dβ ψ1∗ (q + β/2)ψ1 (q + β/2)ψ2∗ (q − β/2)ψ2 (q − β/2)
                          = ‖ ψ1 ‖² ‖ ψ2 ‖²          (15.382)

Now with the key result (15.381) we can show that H is irreducible. 200 Suppose that
H0 ⊂ H is preserved preserved by the Heisenberg group. Suppose there is a vector
ψ⊥ ∈
/ H0 . WLOG we can take ψ⊥ to be perpendicular to H0 (hence the notation).
But then,
W (ψ, ψ⊥ )(v) = hs(v)ψ, ψ⊥ i = 0 (15.383)
for all v ∈ R2n and all ψ ∈ H0 , because s(v)ψ ∈ H0 . But then by (15.381) we know
that k ψ k2 k ψ⊥ k2 = 0. Therefore either ψ = 0 or ψ⊥ = 0. If H0 is not the zero
vector space then we can always choose ψ 6= 0 and hence ψ⊥ = 0. But if H0 were
proper then there would be a nonzero choice for ψ⊥ . Therefore, there is no nonzero
and proper subspace H0 ⊂ H preserved by the group action of Heis. Therefore H is
irreducible.
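
The key identity (15.381) can be spot-checked numerically for n = 1. The sketch below (not from the notes) uses two Gaussian wave packets on a grid; with the plain measure dα dβ the ratio comes out to 2π, consistent with the marginal note that the 2π's in "the standard measure" need care:

# Numerical spot check (not from the notes) of (15.381) for n = 1, using two
# Gaussian wave packets on a grid. With the plain measure d(alpha) d(beta) the
# ratio ||W||^2 / (||psi1||^2 ||psi2||^2) comes out to 2*pi; absorbing that factor
# into the measure gives (15.381).
import numpy as np

q = np.linspace(-12, 12, 601); dq = q[1] - q[0]
psi1 = np.exp(-0.5 * (q - 1.0) ** 2) + 0j
psi2 = np.exp(-0.5 * (q + 0.5) ** 2 + 0.7j * q)

def shift(psi, a):   # psi(q + a), sampled by interpolation on the grid
    return np.interp(q + a, q, psi.real) + 1j * np.interp(q + a, q, psi.imag)

def W(alpha, beta):  # W(psi1,psi2)(v) = int e^{-i alpha q} psi1*(q+beta/2) psi2(q-beta/2) dq
    return np.sum(np.exp(-1j * alpha * q) * np.conj(shift(psi1, beta / 2)) * shift(psi2, -beta / 2)) * dq

alphas = np.linspace(-8, 8, 161); betas = np.linspace(-12, 12, 161)
da, db = alphas[1] - alphas[0], betas[1] - betas[0]
normW2 = sum(abs(W(a, b)) ** 2 for a in alphas for b in betas) * da * db
n1, n2 = np.sum(abs(psi1) ** 2) * dq, np.sum(abs(psi2) ** 2) * dq
print(normW2 / (n1 * n2), 2 * np.pi)   # the two numbers agree closely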

6. Stone von-Neumann And Fourier. Now let us combine the Stone-von Neumann theorem with Pontryagin duality. Because the double dual satisfies (Ŝ)^∧ ∼= S we can write

        Heis(S × Ŝ) ∼= Heis(Ŝ × (Ŝ)^∧)          (15.384)
200
See chapter 4 for a thorough discussion of reducible vs. irreducible representations. Briefly - if a
representation H has a nonzero and proper subspace H0 preserved by the group action then it is said to be
reducible. A representation which is not reducible is said to be irreducible.

so, we could equally well give a unitary representation of the group by taking V̂ := Fun(Ŝ → C) with inner product

        ⟨ψ̂1 , ψ̂2 ⟩ := ∫_{Ŝ} dχ ψ̂1∗ (χ)ψ̂2 (χ)          (15.385)

Now we represent translation and multiplication operators by

(T̂χ0 ψ̂)(χ) := ψ̂(χ0 χ) (15.386)

(M̂s0 ψ̂)(χ) := χ(s0 )ψ̂(χ) (15.387)


and check the commutator function. So, by SvN there must be a unitary isomorphism

S : V → Vb (15.388)

mapping

        S Ts0 S −1 = M̂s0
        S Mχ0 S −1 = T̂χ0          (15.389)

Indeed there is: it is the Fourier transform:

        ψ 7→ ψ̂(χ) := ∫_S χ(s)∗ ψ(s) ds          (15.390)

Moreover, S is an isometry by the Plancherel/Parseval theorem noted above.
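
A finite-dimensional illustration of (15.389): for S = Z/nZ the Fourier transform is the discrete Fourier transform matrix, and it conjugates the shift (translation) operator into the clock (multiplication) operator. A short sketch (not from the notes):

# Finite-dimensional illustration (not from the notes) of (15.389) for S = Z/nZ:
# the discrete Fourier transform conjugates the shift operator into the clock operator.
import numpy as np

n = 6
w = np.exp(2j * np.pi / n)
F = np.array([[w ** (-j * k) for k in range(n)] for j in range(n)]) / np.sqrt(n)

T = np.roll(np.eye(n), 1, axis=1)               # (T psi)(x) = psi(x + 1)
M = np.diag([w ** x for x in range(n)])         # (M psi)(x) = w^x psi(x)

conj = F @ T @ F.conj().T                       # F is unitary, so F^{-1} = F^dagger
assert np.allclose(conj, np.diag(np.diag(conj)))             # result is a multiplication operator
assert np.allclose(conj, M) or np.allclose(conj, M.conj())   # equals M up to sign conventions
print("the DFT maps translations to multiplications, as in (15.389)")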

7. Of course, there are many different Lagrangian subgroups of G. For example, if G


is a symplectic vector space there will be many different Lagrangian subspaces, all
related by the linear action of the symplectic group. Each choice of Lagrangian sub-
space gives a representation of the Heisenberg group, but they all must be isomorphic,
by the Stone-von Neumann-Mackey theorem. This leads to interesting isomorphisms
between seemingly different representations and interesting (projective) actions of
symplectic groups. These kinds of considerations are central to some simple exam-
ples of “duality symmetries” in quantum field theory. We will next investigate how
the groups of symplectic automorphisms lift to automorphism groups of Heisenberg
groups.

Exercise Bimultiplicative cocycle


a.) Check that (15.364) satisfies the cocycle relation.
b.) Show that, in fact, (15.364) is bimultiplicative.

Exercise Irreducibility Of The Stone-von Neumann Representation Of Heis(S × Ŝ)

Generalize the proof of irreducibility for S = Rn to other locally compact Abelian groups. 201

15.5.6 Some More Examples Of Heisenberg Extensions


For this section the reader might wish to consult section [**** 2.1 *** ?] of chapter two
for the definition of a ring. The reader won’t lose much by taking R = Z or R = Z/N Z
with abelian group structure + and extra multiplication structure n̄1 n̄2 = n1 n2 .

Example 1: Suppose R is a commutative ring with identity. Then we can consider the
group of 3 × 3 matrices over R of the form
 
        M (a, b, c) := ( 1  a  c ; 0  1  b ; 0  0  1 )          (15.391)

The multiplication law is easily worked out to be

M (a, b, c)M (a0 , b0 , c0 ) = M (a + a0 , b + b0 , c + c0 + ab0 ) (15.392)

Therefore, if we define Heis(R × R) to be the group of matrices M (a, b, c) and take

π : M (a, b, c) → (a, b) (15.393)

and
ι : c 7→ M (0, 0, c) (15.394)
we have an extension
0 → R → Heis(R × R) → R ⊕ R → 0 (15.395)
with cocycle f ((a, b), (a0 , b0 )) = ab0 . Note that we are writing our Abelian group R addi-
tively so the cocycle identity becomes

f (v1 , v2 ) + f (v1 + v2 , v3 ) = f (v1 , v2 + v3 ) + f (v2 , v3 ) (15.396)

where v = (a, b) ∈ R ⊕ R. In this additive notation the commutator function is

κ((a, b), (a0 , b0 )) = ab0 − a0 b (15.397)

In the literature one will sometimes find the above class of groups defined as the
“Heisenberg groups.” It is a special case of what we have defined as general Heisenberg
groups.
201
Answer For the general case define the Wigner function as the function W (ψ1 , ψ2 ) : S × Sb → C
by W (ψ1 , ψ2 )(s, χ) := hTs Mχ ψ1 , ψ2 i. Show that (15.381) continues to hold. You will need to use the
orthogonality relation for characters.

As a special case of the above construction let us take R = Z/nZ. We will now show
that we recover the group Heis(Zn × Zn ) discussed in section 11.11 in the context of a
particle on a discrete approximation to a circle.
First consider Zn written additively. So if a ∈ Z, then ā ∈ Z/nZ is just the coset ā = a + nZ. Then we define

        U = ( 1̄  1̄  0 ; 0  1̄  0 ; 0  0  1̄ )     V = ( 1̄  0  0 ; 0  1̄  1̄ ; 0  0  1̄ )     q = ( 1̄  0  1̄ ; 0  1̄  0 ; 0  0  1̄ )          (15.398)

We easily check that for a ∈ Z,

        U^a = ( 1̄  ā  0 ; 0  1̄  0 ; 0  0  1̄ )     V^a = ( 1̄  0  0 ; 0  1̄  ā ; 0  0  1̄ )     q^a = ( 1̄  0  ā ; 0  1̄  0 ; 0  0  1̄ )          (15.399)
so
        U^n = V^n = q^n = 1          (15.400)
Moreover:
        U V = qV U          qU = U q          qV = V q          (15.401)
Thus we obtain the presentation:

        Heis(Zn × Zn ) = ⟨ U, V, q | U^n = V^n = q^n = 1, U V = qV U, U q = qU, V q = qV ⟩          (15.402)
we saw before.
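
The matrix model (15.391) makes all of this easy to verify by machine. The following sketch (not from the notes) realizes Heis(Z/7Z × Z/7Z) by 3 × 3 matrices mod 7 and checks the multiplication law (15.392) together with the relations (15.400)–(15.401):

# Sketch (not from the notes): realize Heis(Z/7Z x Z/7Z) by the matrices M(a,b,c)
# of (15.391) with entries mod 7, and check (15.392) and the relations (15.400)-(15.401).
import numpy as np

n = 7
M = lambda a, b, c: np.array([[1, a, c], [0, 1, b], [0, 0, 1]]) % n
mul = lambda X, Y: (X @ Y) % n

a, b, c, ap, bp, cp = 2, 5, 1, 3, 4, 6
assert np.array_equal(mul(M(a, b, c), M(ap, bp, cp)),
                      M(a + ap, b + bp, c + cp + a * bp))        # (15.392)

U, V, q, one = M(1, 0, 0), M(0, 1, 0), M(0, 0, 1), M(0, 0, 0)

def power(X, k):
    Y = one.copy()
    for _ in range(k):
        Y = mul(Y, X)
    return Y

assert all(np.array_equal(power(X, n), one) for X in (U, V, q))  # U^n = V^n = q^n = 1
assert np.array_equal(mul(U, V), mul(q, mul(V, U)))              # U V = q V U
assert np.array_equal(mul(U, q), mul(q, U)) and np.array_equal(mul(V, q), mul(q, V))
print("Heis(Z_7 x Z_7) relations verified")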
A simple and useful generalization of the previous construction is to take any bilinear map
c : R ×R → Z where Z is Abelian. Thus c(a1 +a2 , b) = c(a1 , b)+c(a2 , b) and c(a, b1 +b2 ) =
c(a, b1 ) + c(a, b2 ). Then we can define a central extension

0 → Z → G̃ → R ⊕ R → 0 (15.403)

by the law

(z1 , (a, b)) · (z2 , (a0 , b0 )) = (z1 + z2 + c(a, b0 ), (a + a0 , b + b0 )) (15.404)

The corresponding group cocycle is f ((a, b), (a0 , b0 )) = c(a, b0 ). The cocycle relation is
satisfied simply by virtue of c being bilinear. It will be a Heisenberg extension if κ :
(R × R) × (R × R) → Z given by κ((a, b), (a0 , b0 )) = c(a, b0 ) − c(a0 , b) is nondegenerate. In
particular, if we take Z = R and c(a, b0 ) = ab0 using the ring multiplication then we recover
(15.391).

Example 2: Clifford algebra representations and Extra-special groups. Suppose we have a


set of matrices γi , 1 ≤ i ≤ n such that

{γi , γj } = 2δij (15.405)

Such a set of matrices can indeed be constructed, by taking suitable tensor products of Pauli
matrices. They are called “gamma matrices.” They form a matrix representation of what
is called a Clifford algebra and we will study them in more detail and more abstractly in
chapter ****. For the moment the reader should be content with the explicit representation:
202

γ1 = σ 1
(15.406)
γ2 = σ 2

for n = 2,

γ1 = σ 1
γ2 = σ 2 (15.407)
γ3 = σ 3

for n = 3,

γ1 = σ 1 ⊗ σ 1
γ2 = σ 1 ⊗ σ 2
(15.408)
γ3 = σ 1 ⊗ σ 3
γ4 = σ 2 ⊗ 1

for n = 4,

γ1 = σ 1 ⊗ σ 1
γ2 = σ 1 ⊗ σ 2
γ3 = σ 1 ⊗ σ 3 (15.409)
γ4 = σ 2 ⊗ 1
γ5 = σ 3 ⊗ 1

for n = 5,

γ1 = σ 1 ⊗ σ 1 ⊗ σ 1
γ2 = σ 1 ⊗ σ 1 ⊗ σ 2
γ3 = σ 1 ⊗ σ 1 ⊗ σ 3
(15.410)
γ4 = σ 1 ⊗ σ 2 ⊗ 1
γ5 = σ 1 ⊗ σ 3 ⊗ 1
γ6 = σ 2 ⊗ 1 ⊗ 1

for n = 6, and so on. So for the Clifford algebra with n generators we have constructed a
representation by 2[n/2] × 2[n/2] matrices.
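
The tensor-product construction above is easy to implement and test. A short sketch (not from the notes) building the matrices recursively and verifying the Clifford relations (15.405):

# Sketch (not from the notes): build gamma matrices following the tensor-product
# pattern of (15.406)-(15.410) and verify the Clifford relations (15.405) for n = 1,...,7.
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def gammas(n):
    """n gamma matrices of size 2^[n/2], following the pattern displayed in the text."""
    if n == 1:
        return [s1]
    if n == 2:
        return [s1, s2]
    prev = gammas(n - 1)
    d = prev[0].shape[0]
    if n % 2 == 1:                    # odd n: same size, append sigma^3 (x) 1
        return prev + [np.kron(s3, np.eye(d // 2))]
    return [np.kron(s1, g) for g in prev] + [np.kron(s2, np.eye(d))]   # even n

for n in range(1, 8):
    g = gammas(n)
    dim = g[0].shape[0]
    for i in range(n):
        for j in range(n):
            assert np.allclose(g[i] @ g[j] + g[j] @ g[i], 2.0 * (i == j) * np.eye(dim))
print("Clifford relations verified for n = 1,...,7")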
Of course, the above choice of matrices is far from a unique choice of matrices satisfying
the Clifford relations (15.405). If the γi ∈ Matd (C) then for any S ∈ GL(d, C) we can
202
See Chapter 2, section 5.3 for a detailed discussion of tensor product ⊗.

change γi → Sγi S −1 . These give equivalent representations of the Clifford algebra. We
could also modify γi → εi γi where εi ∈ {±1} and still get a representation, although it
might not be an equivalent one.
For example, note that for n = 3

γ1 γ2 γ3 = i12×2 (15.411)

and for n = 5
γ1 · · · γ5 = −14×4 (15.412)

We cannot change the sign on the RHS by conjugating with S. So in this case we conclude
that there are at least two inequivalent representations of the Clifford algebra.
The general story, proved in detail in Chapters 11-12 is that

1. If n is even there is a unique irreducible representation of dimension d = 2n/2 .

2. If n is odd there are precisely two irreducible representations of dimension d = 2^{(n−1)/2}


and they are distinguished by the sign of the “Clifford volume element” ω = γ1 · · · γn .

In Chapter 11 these statements, and more, are generalized to real Clifford algebras for a quadratic form of any signature.
For w ∈ Z2^n (where we will think of Z2 = Z/2Z as a ring in this example) we define

        γ(w) := γ1^{w1} · · · γn^{wn}          (15.413)

Then
        γ(w)γ(w′ ) = ε(w, w′ )γ(w + w′ ) = κ(w, w′ )γ(w′ )γ(w)          (15.414)

where
        ε(w, w′ ) = (−1)^{Σ_{i>j} wi w′_j}          (15.415)

defines a cocycle with commutator function

        κ(w, w′ ) = (−1)^{Σ_{i≠j} wi w′_j}          (15.416)

When is the commutator function κ nondegenerate? We need to consider two cases:

Case 1a: Σ_i wi = 1 mod 2 and there is an i0 so that w_{i0} = 0. Then γ(w) anticommutes with γ_{i0} .

Case 1b: Σ_i wi = 1 mod 2 and wi = 1 for all i. In this case n must be odd. Then in fact γ1 · · · γn commutes with all the γi and the commutator function is degenerate.

Case 2: Σ_i wi = 0 mod 2 and some w_{i0} ≠ 0. Then γ_{i0} anticommutes with γ(w).

We conclude that for the even degree Clifford algebras κ is nondegenerate and the
group generated by taking products of the matrices ±γ(w) defined by an irreducible rep-
resentation in fact defines a Heisenberg extension:

        1 → Z2 → E_{2m} → Z2^{2m} → 1          (15.417)

In finite group theory this group is an example of what is known as an “extra-special group” and is denoted E_{2m} = 2^{1+2m}_+ . 203
In the case when n is odd then in fact the Clifford volume form γ1 · · · γn commutes
with all the γi and the cocycle is degenerate.

Example 3: Heisenberg Construction Of Nontrivial U (1) Bundle Over Symplectic Tori.


Consider a torus T := Rn /Γ where Γ is an integral lattice. Suppose we have an integral-
valued symplectic form on Γ. This is a bilinear, anti-symmetric, nondegenerate map:

Ω:Γ×Γ→Z (15.418)

It is shown in the Linear Algebra User’s Manual that we can choose an ordered basis
{γ1 , . . . , γn } for Γ so that the matrix Ω(γi , γj ) is of the form
        ( 0  d1 ; −d1  0 ) ⊕ ( 0  d2 ; −d2  0 ) ⊕ · · ·          (15.419)

where the di are integers. We will assume that Ω is nondegenerate, so n = 2m is even and all the integers di are nonzero.
If Γ is full rank we can extend Ω to a bilinear antisymmetric form on 204

        V = Γ ⊗Z R ∼= R^{2m}

to get an antisymmetric bilinear form:

Ω:V ×V →R (15.420)

If all the di are nonzero then this is a symplectic form. Using an invertible matrix S we
can bring S tr ΩS to the standard form J. We can define a commutator function on Rn :

κ(v1 , v2 ) = e2πiΩ(v1 ,v2 ) (15.421)

and this defines the isomorphism class of a Heisenberg extension of V ∼= R^{2m} . To be
concrete our extension is

1 → U (1) → Heis(V, Ω) → V → 0 (15.422)


203
For any prime p an extra-special group is a group G that fits in a central extension 1 → Zp → G → Zp^n → 1 where the center is minimal, that is, is isomorphic to Zp . For example, for p = 2 we have seen that both D4 and Q are extra-special groups. Up to isomorphism there are two such groups, sometimes denoted p^{1+n}_± .
204
The symbol ⊗Z is explained in the Linear Algebra User’s Manual. It means we are taking a tensor
product of Z-modules, i.e. Abelian groups.

where we make the explicit choice of cocycle f (v1 , v2 ) = eiπΩ(v1 ,v2 ) . So, Heis(V, Ω) is the
group of pairs (z, v) with z ∈ U (1) and v ∈ V with multiplication:

(z1 , v1 ) · (z2 , v2 ) := (z1 z2 eiπΩ(v1 ,v2 ) , v1 + v2 ) (15.423)

We stress that this gives a Heisenberg group and in particular the sequence does not split.
Now consider the pullback of the sequence under the inclusion ι : Γ → V . We claim that the pulled-back sequence splits. Let us try to choose a section

        s(γ) = (ε_γ , γ)          (15.424)

then to split the sequence we will need

        (ε_{γ1 +γ2} , γ1 + γ2 ) = s(γ1 + γ2 )
                                = s(γ1 )s(γ2 )
                                = (ε_{γ1} ε_{γ2} e^{iπΩ(γ1 ,γ2 )} , γ1 + γ2 )          (15.425)

In other words, to split the sequence over Γ we need to find a function ε : Γ → U (1) so that

        ε_{γ1} ε_{γ2} = e^{−iπΩ(γ1 ,γ2 )} ε_{γ1 +γ2}          (15.426)

It is indeed possible to find such functions. See the exercise.


Since the sequence splits over Γ we consider the Abelian subgroup s(Γ) ⊂ Ṽ . Then
define the quotient space:
        P (T, Ω, ε) := Heis(V, Ω)/s(Γ)          (15.427)

Explicitly P (T, Ω, ε) is the quotient (U (1) × V )/Γ with the equivalence relation

        (z, v) ∼ (z, v) · (ε_γ , γ) = (z ε_γ e^{iπΩ(v,γ)} , v + γ)          (15.428)

for all γ ∈ Γ.
Note that there is a continuous map

        π : P (T, Ω, ε) → T          (15.429)

whose fiber is U (1). This space, together with its projection map, is an example of a principal U (1) bundle over the torus T : Each fiber is a principal homogeneous space for the group U (1), under the natural action of U (1) on P (T, Ω, ε). (Since the U (1) is central we can consider it either as a left- or right- action.) Our construction of the bundle depended on a choice of splitting ε_γ , but a change of splitting defines isomorphic bundles.

Remarks

1. The above construction comes up, either explicitly or implicitly in discussions of


the quantum Hall effect, Chern-Simons theory, and the quantization of p-form gauge
theories.

2. Note that while s(Γ) is a subgroup of Heis(V, Ω) it is not a normal subgroup, so that,
while, P (T, Ω, ) is a bundle, it is not a group.

3. If we view Heis(V, Ω) as a principal U (1) bundle over V then we can construct a


very natural connection on this bundle. Parallel transport of the point (z, v0 ) ∈ P (T, Ω, ε) over a straight-line path ℘_{v0 ,w} := {v0 + tw | 0 ≤ t ≤ 1} in V is defined by left-multiplication by (1, w):

U (℘v0 ,w ) : (z, v0 ) → (1, w) · (z, v0 ) = (zeiπΩ(w,v0 ) , w + v0 ) (15.430)

Now, if one considers parallel transport around a small square loop starting at v0 by
composing paths

℘_{v0 ,w1 ,w2} := ℘_{v0 ,w1} ∗ ℘_{v0 +w1 ,w2} ∗ ℘_{v0 +w1 +w2 ,−w1} ∗ ℘_{v0 +w2 ,−w2}          (15.431)

one obtains the holonomy

U (℘v0 ,w1 ,w2 ) : (z, v0 ) → (ze2πiΩ(w1 ,w2 ) , v0 ) (15.432)

showing that the curvature of this connection is Ω regarded as a 2-form on V . The connection and 2-form descend to the principal U (1) bundle P (T, Ω, ε). The cohomology class [Ω], which is the first Chern class of P (T, Ω, ε), is characterized by the integers d⃗ = (d1 , d2 , . . . ). A nonzero value of d⃗ obstructs topological triviality.

Exercise
We illustrated how Q and D4 are the only two non-Abelian groups that sit in an
extension of Z2 × Z2 by Z2 . Which one is the Heisenberg extension?

Exercise Finite Heisenberg group in multiplicative notation


It is interesting to look at the Heisenberg extension

1 → Zn → Heis(Zn × Zn ) → Zn × Zn → 1 (15.433)

where we think of Zn as the multiplicative group of nth roots of unity. Let ω = exp[2πi/n].
We distinguish the three Zn factors by writing generators as ω1 , ω2 , ω3 .
a.) Show that one natural choice of cocycle is:
        f ((ω1^s , ω2^t ), (ω1^{s′} , ω2^{t′} )) := ω3^{st′}          (15.434)

b.) Compute the commutator function

        κ((ω1^s , ω2^t ), (ω1^{s′} , ω2^{t′} )) := ω3^{st′ − ts′}          (15.435)

c.) Connect to our general theory of extensions by defining U := (1, (ω1 , 1)), V :=
(1, (1, ω2 )) and computing

U V = (f ((ω1 , 1), (1, ω2 )), (ω1 , ω2 ))


= (ω3 , (ω1 , ω2 ))
(15.436)
V U = (f ((1, ω2 ), (ω1 , 1)), (ω1 , ω2 ))
= (1, (ω1 , ω2 ))

or in other words, since the center is generated by q = (ω3 , (1, 1)) we can write:

U V = qV U (15.437)

Exercise Degenerate Heisenberg extensions


Suppose n = km is composite and suppose we use the function ck (a, b0 ) = kab0 in
defining an extension of Zn × Zn .
a.) Show that the commutator function is now degenerate.
b.) Show that the center of the central extension is larger than Zn . Compute it. 205
While these are not - strictly speaking - Heisenberg extensions people will often refer to
them as Heisenberg extensions. We might call them “degenerate Heisenberg extensions.”

Exercise Constructing The Splitting (15.426)


Give an explicit construction of a function ε : Γ → U (1) satisfying (15.426). 206

Exercise Different Splittings Give Isomorphic Bundles


a.) Describe the relation between two splittings of the pullback of the sequence (15.422)
to Γ.
b.) Let ε1 , ε2 denote two splittings of the pullback of the sequence (15.422) to Γ. Show that the bundles P (T, Ω, ε1 ) and P (T, Ω, ε2 ) are isomorphic.
205
Answer : The center is generated by q, U m , V m and is Zn × Zk × Zk .
206
Answer : Choose an ordered basis γ1 , . . . , γn for Γ and define

        ε_γ := e^{−iπ Σ_{i<j} ni nj Ωij}          (15.438)

where γ = Σ_i ni γi and Ωij = Ω(γi , γj ). One can check this satisfies the desired identity. Note that it is crucial that Ωij and ni are integral.
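
The splitting of footnote 206 can be verified numerically. A quick sketch (not from the notes), using a randomly chosen integral antisymmetric matrix Ωij:

# Quick numerical check (not from the notes) of the splitting identity (15.426),
# using the explicit epsilon of footnote 206 and a random integral antisymmetric Omega.
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.integers(-3, 4, size=(n, n))
Omega = A - A.T                                   # integral, antisymmetric Omega_ij

def eps(m):                                       # epsilon_gamma for gamma = sum_i m_i gamma_i
    phase = sum(m[i] * m[j] * Omega[i, j] for i in range(n) for j in range(i + 1, n))
    return np.exp(-1j * np.pi * phase)

for _ in range(20):
    m1, m2 = rng.integers(-5, 6, size=(2, n))
    lhs = eps(m1) * eps(m2)
    rhs = np.exp(-1j * np.pi * (m1 @ Omega @ m2)) * eps(m1 + m2)
    assert np.isclose(lhs, rhs)
print("epsilon splits the pulled-back sequence over the lattice")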

Exercise Two Dimensions
a.) Suppose T = C/(Z + τ Z) with Imτ > 0. So Γ = Z + τ Z. Choose

Ω(1, τ ) = −Ω(τ, 1) = k ∈ Z. (15.439)

b.) Show that


        Ω(z1 , z2 ) = k Im(z̄1 z2 )/Imτ          (15.440)

15.5.7 Lagrangian Subgroups And Induced Representations


Let us compare the general Heisenberg extension

1 → Z → G̃ → G → 0 (15.441)

with (15.363) and (15.403) with Z any subgroup of U (1). The difference from the general
case is that in these examples G is explicitly presented as a product of subgroups G = L×L0
where L and L0 are maximal Lagrangian subgroups. A subgroup L ⊂ G is said to be a
Lagrangian subgroup if κ(g1 , g2 ) = 1 for all pairs (g1 , g2 ) ∈ L and similarly for L0 .
When discussing Heisenberg groups of the form Heis(S × Ŝ) the group S × Ŝ has two canonical Lagrangian subgroups, namely S and Ŝ. With the choice of κ we made above these are maximal Lagrangian subgroups.
The case of G = S × Ŝ should be contrasted with other examples where G is R^{2n} or Z_2^{2n}. These groups certainly can be presented as products of maximal Lagrangian subgroups, but there is no canonical decomposition. Consider, for example, the Heisenberg extension Heis(Z_2^{2n}) constructed using the gamma-matrices. Note that Z2 = F2 is a field, and we can consider F_2^{2n} to be a vector space over κ = F2 . We could take L to be any half-dimensional Lagrangian subspace.
In the general situation, with no canonical choice of L one would often like to construct
an explicit unitary representation of the Heisenberg group. One way to do this is the
following:
Choose a maximal Lagrangian subgroup L ⊂ G. The inverse image L̃ ⊂ G̃ is a maximal
commutative subgroup of G̃.
We now choose a character of L̃ such that ρ(z, x) = zρ(x). ♣This is bad notation, because ρ(z, x) is later used for the full SvN representation. The letter here should be changed and it propagates through the discussion. ♣ Note that such a character must satisfy

        ρ(x)ρ(x′ ) = f (x, x′ )ρ(x + x′ )     ∀x, x′ ∈ L          (15.442)

Note that f need not be trivial on L, but it does define an Abelian extension L̃ which must therefore be isomorphic to a product L × U (1), albeit noncanonically. Different choices of ρ are different choices of splitting. Indeed note that (15.442) says that, when restricted to L × L the cocycle is trivialized by ρ.

The carrier space of our representation will be the space F of functions ψ : G̃ → C such that

        ψ((z, x)(z′ , x′ )) = ρ(z′ , x′ )−1 ψ((z, x))     ∀(z′ , x′ ) ∈ L̃          (15.443)

Setting (z′ , x′ ) = (z −1 , 0) we note that equation (15.443) implies ψ(z, x) = z −1 ψ(1, x), so defining Ψ(x) := ψ(1, x) we can simplify the description of F by identifying it with the space of functions Ψ : G → C such that:

        Ψ(x + x′ ) = ( f (x, x′ )/ρ(x′ ) ) Ψ(x)     ∀x′ ∈ L.          (15.444)
If our group is continuous or noncompact we should state an L2 condition. We take ρ
to be a unitary character so that |Ψ(x)|2 descends to a function on G/L and we demand:
        ∫_{G/L} |Ψ(x)|² dx < +∞.          (15.445)

The group action is simply the left-action of G̃ on the functions ψ(z, x). When written in terms of Ψ(x) the representation of (z, x) ∈ G̃ is:

        (ρ(z, x) · Ψ)(y) = (ρ(z, x) · ψ)(1, y)
                         = ψ( (z, x)−1 · (1, y) )
                         = ψ( (z −1 f (x, −x)−1 f (−x, y), y − x) )
                         = z f (x, y − x) Ψ(y − x)          (15.446)
where in the last line we assumed that f is a normalized cocycle so that f (0, y) = 1.
Let us see how we recover the standard Stone-von Neumann representation of Heis(S × Ŝ) from this viewpoint. Let us choose L = Ŝ. Then the equivariance condition (15.444) becomes

        Ψ(s, χχ′ ) = (1/ρ(χ′ )) Ψ(s, χ)          (15.447)

Now set χ′ = 1/χ and conclude that

        Ψ(s, χ) = (1/ρ(χ)) Ψ(s, 1)          (15.448)

So the dependence on χ ∈ Ŝ is completely fixed by equivariance. Defining ψ̃(s) := Ψ(s, 1) we obtain a vector ψ̃ ∈ L2 (S), and thus if we take L = Ŝ then our space of equivariant functions is naturally identified with L2 (S).
We can now work out the representations of

        Ts = ρ(1, (s, 1))
        Mχ = ρ(1, (0, χ))          (15.449)

on L2 (S). Working through the above definitions it should not be surprising that one recovers:

        (Ts ψ̃)(s′ ) = ψ̃(s′ − s)
        (Mχ ψ̃)(s′ ) = ( ρ(χ)/χ(s′ ) ) ψ̃(s′ )          (15.450)

Example. Consider F_2^{2m} with κ(w, w′ ) = (−1)^{Σ_{i≠j} wi w′_j}. We can give a Lagrangian decomposition

        F_2^{2m} ∼= L ⊕ N          (15.451)
in many different ways. For special values of m there are special Lagrangian subspaces
provided by classical error correcting codes. A Heisenberg representation can be given by
taking V to be the space of functions ψ : L → C. This has complex dimension 2m . N can
be identified with the group of characters on L since we can set

χn (`) = κ(n, `) (15.452)

The usual translation and multiplication operators T` and Mn generate an algebra iso-
morphic to M atd (C). V is also a representation of the Clifford algebra (and hence the
extra-special group). So the Clifford representation matrices γi can be expressed in terms
of these, and vice versa.

Remarks

1. The representation is, geometrically, just the space of L2 -sections of the associated
line bundle G̃ ×L̃ C defined by ρ. The representation is independent of the choice of
ρ, and any two choices are related by an automorphism of L given by the restriction
of an inner automorphism of G̃.

2. This is an example of an induced representation, Ind^{G̃}_{L̃}(C), which we will study more systematically in chapter 4.

Exercise
Construct explicit Lagrangian subspaces of F2m 2 for small values of m and write out
the matrices of the Heisenberg representation. 207

15.5.8 Automorphisms Of Heisenberg Extensions


In several of our examples above, such as Heis(V, Ω) and the extraspecial group we have
noted that to define a representation one must make a choice of a Lagrangian subgroup,
where κ restricts to 1. On the other hand, we also noted that in some situations there is
no natural choice of such a Lagrangian group. There are many such Lagrangian subgroups
and they are related by “symplectic automorphisms” of G.
We say that an automorphism α ∈ Aut(G) is symplectic if it preserves the commu-
tator function:
α∗ κ(g1 , g2 ) := κ(α(g1 ), α(g2 ))
(15.453)
= κ(g1 , g2 )
207
Answer. Start with m = 1. Any vector is isotropic so let L be spanned by ` = (1, 0) and N spanned by
n = (0, 1). Choose a basis for V by taking the delta function supported on basis vector ei . Then T` = σ 1
and Mn = σ3 relative to this basis.*********** CONTINUE ***********.

In the first line we defined the general notion of pullback κ → α∗ κ and the second line
is the invariance condition. In physics, such symplectic transformations are relevant to
canonical transformations. We can ask whether such automorphisms of “phase space”
actually lift to automorphisms of the full Heisenberg group, and then whether and how
this lifted group acts on the representations of the Heisenberg group. This would be the
“quantum mechanical implementation of symplectic transformations.” In this section we
will investigate those questions from the group-theoretical viewpoint.
Quite generally (there is no need for G to be Abelian in this paragraph), if π : G̃ → G is a homomorphism and α ∈ Aut(G) is an automorphism of G we say that α lifts to an automorphism of G̃ if there is an automorphism α̃ ∈ Aut(G̃) that completes the diagram:

        G̃ --π--> G
        |α̃        |α
        v         v          (15.454)
        G̃ --π--> G

Or, in equations, for every α ∈ Aut(G) we seek a corresponding α̃ ∈ Aut(G̃) such that

        π(α̃(g̃)) = α(π(g̃))          (15.455)

for all g̃ ∈ G̃.

Some General Theory 208


Consider a group extension

        1 → A → G̃ → G → 1          (15.456)

where A and G are Abelian and we write the group operations on both A and G additively, so G̃ is the group of pairs (a, g) with group multiplication

        (a1 , g1 )(a2 , g2 ) = (a1 + a2 + f (g1 , g2 ), g1 + g2 )          (15.457)

and f (g1 , g2 ) is a cocycle satisfying the additive version of the cocycle identity:

        f (g1 + g2 , g3 ) + f (g1 , g2 ) = f (g1 , g2 + g3 ) + f (g2 , g3 )          (15.458)

Now suppose α ∈ Aut(G). If α preserves the commutator function then we can hope
to lift it to an automorphism Tα of G̃. As explained above, “lifting” means that

π(Tα (a, g)) = α(π(a, g)) = α(g) . (15.459)

Therefore, Tα must be of the form:

Tα (a, g) = (ξα (a, g), α(g)) (15.460)


208
What follows is an elaboration of the ideas from Appendix A of arXiv:1707.08888. I also got some
useful help from Graeme Segal.

where ξα is some function ξα : A × G → A. We can write constraints on this function
from the requirement that Tα must be an automorphism with the group law defined by the
cocycle f . In particular Tα must be a group homomorphism:

Tα ((a1 , g1 ), (a2 , g2 )) = Tα (a1 , g1 ) · Tα (a2 , g2 ) (15.461)

which is true iff

ξα (a1 + a2 + f (g1 , g2 ), g1 + g2 ) = ξα (a1 , g1 ) + ξα (a2 , g2 ) + f (α(g1 ), α(g2 )) (15.462)

Now, specialize this equation by putting g1 = 0 and assuming (WLOG) that we have a
normalized cocycle, so that f (g, 0) = f (0, g) = 0. Then equation (15.462) simplifies to

ξα (a1 + a2 , g) = ξα (a1 , 0) + ξα (a2 , g) (15.463)

Putting a2 = 0 in (15.463) we now learn that

ξα (a, g) = ξα (a, 0) + ξα (0, g) (15.464)

On the other hand, putting g = 0 in (15.463) we now learn that a 7→ ξα (a, 0) is just
an automorphism of A. Composing lifts of α with such automorphisms is an inherent
ambiguity in lifting α. Thus, it is useful to make the simplifying assumption that ξα (a, 0) =
a, since we can always arrange this by composition with an automorphism of A. Therefore
we can write equation (15.464) in the general form

Tα (a, g) = (a + τα (g), α(g)) (15.465)

If we restrict to automorphisms of the type (15.465) then one easily checks that Tα is
indeed a group homomorphism iff

(α∗ f − f )(g1 , g2 ) = τα (g1 + g2 ) − τα (g1 ) − τα (g2 ) (15.466)

where
α∗ f (g1 , g2 ) := f (α(g1 ), α(g2 )) (15.467)
is known as the “pulled-back cocycle.” The conceptual meaning of this equation is the
following: α ∈ Aut(G) is a “symmetry of G.” We are asking how badly the cocycle f
breaks that symmetry. We say that “f is invariant under pullback” if α∗ f = f and in
that case we can take τα = 0 and we can easily lift the group Aut(G) to a group of
automorphisms of G̃. The more general condition (15.466) says that the amount by which
f is not symmetric under α, that is α∗ f − f , must be a trivializable cocycle. Put this way,
it is clear that the condition is unchanged under shifting f by a coboundary. Indeed, if we
change f by a coboundary so

f˜(g1 , g2 ) = f (g1 , g2 ) + b(g1 + g2 ) − b(g1 ) − b(g2 ) (15.468)

then we can solve (15.466) by taking

τ̃α (g) = τα (g) + (α∗ b − b)(g) (15.469)

so the existence of a solution to (15.466) is gauge invariant. Put more simply, the co-
homology class [f ] ∈ H 2 (G, A) must be invariant under the action of α∗ on H 2 (G, A).
♣The above paragraph should be said more succinctly. ♣
Thus, in general we cannot lift all automorphisms of G, only those for which (15.466)
holds. The set of such automorphisms forms a subgroup of Aut(G) that we will denote as
Aut0 (G). Note that, in our additive notation we have

κ(g1 , g2 ) = f (g1 , g2 ) − f (g2 , g1 ) (15.470)

which, as mentioned above, is a generalization of a symplectic form. Any automorphism


satisfying (15.466) automatically satisfies α∗ (κ) = κ and is therefore a “symplectic au-
tomorphism.” However, Aut0 (G) is in general only a subgroup of the full group of symplectic automorphisms of G. We will give an example below in which the group of symplectic automorphisms is SL(2, Zn ).
We now restrict attention to α ∈ Aut0 (G). If, in addition

τα1 ◦α2 (g) = τα1 (α2 (g)) + τα2 (g) (15.471)

that is,
τα1 ◦α2 = α2∗ τα1 + τα2 (15.472)
then in fact the Tα , with Tα1 ◦ Tα2 = Tα1 ◦α2 , generate a subgroup of Aut(G̃) isomorphic to Aut(G). ♣Relate this to equation (15.542)), the condition for a twisted homomorphism τ : Aut(G) → A. ♣
In general, even if we can find a solution to (15.466) the criterion (15.472) will not hold. Nevertheless, the automorphisms Tα will generate a subgroup of Aut(G̃). To see what subgroup it is we introduce, for ℓ ∈ Hom(G, A), the automorphism

P` (a, g) = (a + `(g), g) (15.473)

Then we note that


Tα1 ◦ Tα2 (a, g) = Tα1 ◦α2 ◦ P`α1 ,α2 (15.474)
where
`α1 ,α2 (g) := τα1 (α2 (g)) + τα2 (g) − τα1 ◦α2 (g) (15.475)
A little computation shows that `α1 ,α2 ∈ Hom(G, A) is indeed a homomorphism from G to
A. A little more computation reveals

P`1 ◦ P`2 = P`1 +`2


(15.476)
Tα ◦ P` = Pα∗ (`) ◦ Tα

Therefore, we can write any word in P ’s and T ’s in the form P_{ℓ′} ◦ T_{α′} for some (ℓ′ , α′ ). Altogether, equations (15.474), (15.475), and (15.476) mean that the Tα generate a subgroup of Aut(G̃). Including all transformations Pℓ defines a subgroup Âut(G) of Aut(G̃) which fits in an exact sequence:

        1 → Hom(G, A) → Âut(G) → Aut0 (G) → 1          (15.477)

Finally, restoring Hom(G, A) → Hom(G, A) ⋊ Aut(A) gives the group of (possible) lifts of automorphisms of G to automorphisms of G̃.

Example: Heis(R ⊕ R)
Let us consider the basic example from quantum mechanics based on a phase space
R ⊕ R with symplectic form defined by J. In this case there is a very nice way of thinking
of the matrix group Sp(2, R). We note that if

        A = ( a  b ; c  d )          (15.478)

then

        A^tr J A = ( a  c ; b  d ) ( 0  1 ; −1  0 ) ( a  b ; c  d ) = (ad − bc) J          (15.479)

Therefore, A ∈ Sp(2, R) iff ad − bc = 1. But this is precisely the condition that defines SL(2, R). Therefore

        SL(2, R) = Sp(2, R)          (15.480)
are identical as matrix groups. The same argument applies if we replace R by any ring R.
This kind of isomorphism is definitely not true if we consider higher rank groups SL(n, R)
and Sp(2n, R).
Now, as we have seen, the section s(α, β) = exp[i(αp̂ + β q̂)] leads to the choice of cocycle, written additively,

        f ((α1 , β1 ), (α2 , β2 )) = (ℏ/2)(α1 β2 − α2 β1 )          (15.481)
This is symplectic invariant, so we can take τα (g) = 0 in the equation (15.466) and we
conclude that the symplectic group acts as a group of automorphisms on the Heisenberg
group Heis(R2n ).
It is interesting however, that the symplectic group only acts projectively on the Stone-
von-Neumann representation of the Heisenberg group. We now explain this point.
As we will discuss in detail in the chapter on Lie groups, SL(2, R) and Sp(2, R) are
examples of Lie groups. It is useful to look at group elements infinitesimally close to the
identity matrix. These can be written as

A = 1 + εm + O(ε²)          (15.482)

We learn from the defining equation that

Tr(m) = 0 (15.483)

is required to satisfy the defining conditions to order ε. (Exercise: Prove this using both
the definition of Sp(2, R) and of SL(2, R).)
The infinitesimal group elements are thus characterized by the vector space sp(2, R) =
sl(2, R), which is the vector space of 2 × 2 real traceless matrices. These form a Lie algebra
with the standard matrix commutator. Generic (but not all) group elements are obtained

by exponentiating such matrices. In particular, in a neighborhood of the identity all group
elements are obtained by exponentiating elements of the Lie algebra.
Recall the standard basis of sl(2, R):
        e = ( 0  1 ; 0  0 )     h = ( −1  0 ; 0  1 )     f = ( 0  0 ; −1  0 )          (15.484)

(Check the signs carefully!) Compute:

[h, e] = −2e [e, f ] = h [h, f ] = +2f (15.485)

From this one can in principle multiply exponentiated matrices using the BCH formula.
We now consider the quantum implementation of these operators on L2 (R) with ρ(e) =
ê etc. with:
        ê := (i/2ℏ) p̂²      ĥ := (i/2ℏ)(q̂ p̂ + p̂q̂)      f̂ := (i/2ℏ) q̂²          (15.486)
Now, using the useful identities: 209

[AB, CD] = A[B, C]D + [A, C]BD + CA[B, D] + C[A, D]B


(15.488)
= AC[B, D] + A[B, C]D + C[A, D]B + [A, C]DB

we can check that this is indeed a representation:

[ĥ, ê] = −2ê [ê, fˆ] = ĥ [ĥ, fˆ] = +2fˆ (15.489)

However, we have to be careful about exponentiating these operators.
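
The commutation relations (15.489) can be confirmed symbolically. A small sketch (not from the notes), representing p̂ = −iℏ d/dq with sympy:

# Symbolic spot check (not from the notes) that the operators (15.486) obey the
# sl(2,R) relations (15.489), with p_hat = -i*hbar*d/dq acting on functions of q.
import sympy as sp

q, hbar = sp.symbols('q hbar', positive=True)
psi = sp.Function('psi')(q)

p = lambda g: -sp.I * hbar * sp.diff(g, q)
e_op = lambda g: sp.I / (2 * hbar) * p(p(g))
h_op = lambda g: sp.I / (2 * hbar) * (q * p(g) + p(q * g))
f_op = lambda g: sp.I / (2 * hbar) * q ** 2 * g

comm = lambda A, B, g: sp.simplify(A(B(g)) - B(A(g)))
assert sp.simplify(comm(h_op, e_op, psi) + 2 * e_op(psi)) == 0   # [h, e] = -2e
assert sp.simplify(comm(e_op, f_op, psi) - h_op(psi)) == 0       # [e, f] = h
assert sp.simplify(comm(h_op, f_op, psi) - 2 * f_op(psi)) == 0   # [h, f] = +2f
print("relations (15.489) verified")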


Let us consider the one-parameter subgroup of SL(2, R):
        exp[θ(e + f )] = cos θ · 1 + sin θ · ( 0  1 ; −1  0 )          (15.490)

This is in fact a maximal compact subgroup of SL(2, R). (See chapter **** on 2x2 matrix
groups.) Note carefully that it has period θ ∼ θ + 2π.
The quantum implementation of e + f is just the standard harmonic oscillator Hamil-
tonian!
        ê + f̂ = (i/2)(p̂² + q̂²) = i(āa + 1/2)          (15.491)

where

        a = (1/√2)(q + ip)
        ā = (1/√2)(q − ip)          (15.492)
209
These identities are very useful, but a bit hard to remember. If you are on a desert island you can
easily reconstruct them from the special cases:

[AB, C] = A[B, C] + [A, C]B


(15.487)
[A, CD] = C[A, D] + [A, C]D
both of which are easy to remember.

Now, in the Stone-von-Neumann representation (i/2)(p̂² + q̂²) has the spectrum i(n + 1/2), n = 0, 1, 2, . . . . Therefore, the one-parameter subgroup exp[θ(ê + f̂ )] has period θ ∼ θ + 4π.
We see that the group generated by ê, fˆ, ĥ is at least a double cover of Sp(2, R). In fact, it
turns out to be exactly a double cover, and it is known as the metaplectic group.
One very interesting aspect of the metaplectic group is that this is a Lie group with no
finite-dimensional faithful representation. We now explain that fact, 210 and a few other
important things in the following remarks:

Remarks

1. Application to the metaplectic group. The Lie algebra of the metaplectic group is
sl(2, R). Any finite dimensional representation of the metaplectic group would give
a finite-dimensional representation of the Lie algebra sl(2, R) and we discussed in
detail what these are above. Note that e + f corresponds to iJ 1 . So: In any finite-
dimensional complex representation of sl(2, R), the operator ρ(e) + ρ(f ) is diagonaliz-
able and all the eigenvalues are of the form i`, where ` is an integer. So exp[θρ(e + f )]
has period θ ∼ θ + 2π. Therefore no finite dimensional representation of M pl can
be faithful. In particular, M pl is an example of a Lie group which is not a matrix
group: It cannot be embedded as a subgroup of GL(N, C) for any N .

2. It is very interesting to consider the action of the one-parameter family exp[θ(ê + fˆ)]
in the standard “position space” representation L2 (R) with p̂ = −i d/dq. Let us compute
the integral kernel:
        ⟨x|exp[θ(ê + f̂ )]|y⟩          (15.493)

since
        (e^{θ(ê+f̂ )} ψ)(x) = ∫_{−∞}^{+∞} ⟨x|exp[θ(ê + f̂ )]|y⟩ ψ(y) dy          (15.494)

Clearly, to evaluate (15.493) we should insert a complete set of eigenstates of the


harmonic oscillator Hamiltonian. So we introduce the Hermite functions: 211
        ψn (x) := (2^n n! √π)^{−1/2} e^{−x²/2} Hn (x)     n ∈ Z+          (15.495)

where Hn (x) is the Hermite polynomial

        Hn (x) = e^{x²} ( − d/dx )^n e^{−x²}          (15.496)

The Hermite functions ψn (x) satisfy

        ( − d²/dx² + x² ) ψn = (2n + 1)ψn          (15.497)
210
Our demonstration of this surprising fact follows the discussion in G. Segal in Lectures On Lie Groups
and Lie Algebras.
211
What follows here are completely standard facts. We used Wikipedia.

and we have normalized them so that

        ∫_{−∞}^{+∞} ψn (x)ψm (x) dx = δn,m          (15.498)

Now we have

        Σ_{n=0}^{∞} u^n ψn (x)ψn (y) = (1/√(π(1 − u²))) exp[ − ((1−u)/(1+u)) ((x+y)/2)² − ((1+u)/(1−u)) ((x−y)/2)² ]          (15.499)

To prove this write

        Hn (x) = e^{x²} ( − d/dx )^n e^{−x²} = e^{x²} ( − d/dx )^n (1/√(4π)) ∫_{−∞}^{+∞} e^{−s²/4 + isx} ds          (15.500)

Apply this to both ψn (x) and ψn (y), apply the derivatives, exchange integration and sum, and get

        Σ_{n=0}^{∞} u^n ψn (x)ψn (y) = ( e^{(x²+y²)/2} / 4π^{3/2} ) ∫ ds dt e^{−(s²+t²)/4 − stu/2 + isx + ity}
                                     = (1/√(π(1 − u²))) exp[ − ((1−u)/(1+u)) ((x+y)/2)² − ((1+u)/(1−u)) ((x−y)/2)² ]          (15.501)

Now we have

        ⟨x|exp[θ(ê + f̂ )]|y⟩ = e^{iθ/2} Σ_{n=0}^{∞} e^{inθ} ψn (x)ψn (y)          (15.502)

so we apply the above identity with u = eiθ . We should be careful about convergence:
The Gaussian integral in (15.501) has quadratic form
!
1 1 u
A= (15.503)
4 u 1

which has eigenvalues 41 (1 ± u). The zero-modes at u = ±1 indicate a divergent


Gaussian. In fact we have
X∞
ψn (x)ψn (y) = δ(x − y)
n=0
∞ (15.504)
X
n
(−1) ψn (x)ψn (y) = δ(x + y)
n=0

The second line follows easily from the first since the parity of ψn (x) as a function of
x is the parity of n.
The quadratic form A has a positive definite real part for |u| ≤ 1 except for u = ±1.
The values for |u| > 1 have to be defined by analytic continuation and there is a
branch point at u = ±1. Note that at θ = π/2 we have
        ⟨x|exp[ (π/2)(ê + f̂ ) ]|y⟩ = ( e^{iπ/4}/√(2π) ) e^{ixy}          (15.505)

and we recognize the kernel for the Fourier transform:
        e^{(π/2)(ê+f̂ )} ψ = e^{iπ/4} F(ψ)          (15.506)

where F : L2 (R) → L2 (R) is the Fourier transform. This is the quantum implemen-
tation of the canonical transformation exchanging position and momenta. Note that
θ = π is the square of the Fourier transform (up to a scalar multiplication by i) but
this is not a scalar operator on the space of functions. Indeed at θ = π we have
 
        ( e^{π(ê+f̂ )} ψ )(x) = iψ(−x)          (15.507)

The Fourier transform is of order four not order two. Also note that at θ = 2π the
operator is just multiplication by −1.
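
These order-of-F statements have a simple finite-dimensional analogue. In the sketch below (not from the notes) the discrete Fourier transform stands in for F: its square is the parity operator and its fourth power is the identity, while the phases e^{iθ(n+1/2)} show that θ = 2π acts as −1 on the oscillator representation:

# Finite-dimensional aside (not from the notes): the discrete Fourier transform is
# a stand-in for F in (15.506). Its square is the parity operator and its fourth
# power is the identity; and since exp[theta*(e+f)] acts on oscillator states with
# phases exp[i*theta*(n+1/2)], theta = 2*pi gives multiplication by -1.
import numpy as np

N = 8
F = np.array([[np.exp(-2j * np.pi * j * k / N) for k in range(N)] for j in range(N)]) / np.sqrt(N)
P = np.zeros((N, N))
for x in range(N):
    P[x, (-x) % N] = 1.0                    # parity: psi(x) -> psi(-x)

assert np.allclose(F @ F, P)                                  # F^2 = parity
assert np.allclose(np.linalg.matrix_power(F, 4), np.eye(N))   # F^4 = 1

phases = np.exp(1j * 2 * np.pi * (np.arange(5) + 0.5))
assert np.allclose(phases, -1.0)            # theta = 2*pi acts as -1: the lift is a double cover
print("F has order four; theta = 2*pi acts as -1")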

3. We note that there are beautiful general formulae for expectation values of operators
defined by exponentiating general quadratic forms in the p̂i and q̂ i , or, equivalently
in a’s and a† ’s. This is useful when working with coherent states and squeezed states.
But it is best presented in the Bargmann or geometric quantization formalism. ♣say more? ♣

Exercise
a.) Check that, for 2 × 2 matrices the condition Tr(m) = 0 is identical to the condition
(mJ)tr = mJ.
b.) Show that for any n × n matrix, the infinitesimal version of the condition detA = 1 is that A = 1 + εm + O(ε²) with Tr(m) = 0.
c.) Show that for any n × n matrix, the infinitesimal version of the condition Atr JA = J is that A = 1 + εm + O(ε²) with (mJ)tr = mJ.
d.) Show that the conditions Tr(m) = 0 and (mJ)tr = mJ are inequivalent for n > 2.

♣Following exercise belongs in the Linear Algebra chapter. ♣

Exercise Linear independence of eigenvectors with distinct eigenvalues


Suppose a set of nonzero vectors v1 , . . . , vn are eigenvectors of some operator A with
distinct eigenvalues. Show that they are linearly independent. 212

212
Answer : Suppose Σ_s cs vs = 0 for some coefficients cs . Applying powers of A we determine that Σ_s cs λ_s^k vs = 0. If all the cs are nonzero then we learn that the matrix V C must have determinant zero, where Vij = λ_j^i and C is the diagonal matrix with entries c1 , . . . , cn . If only some of the cs are nonzero then we have a minor of the matrix V times the diagonal matrix of the nonzero cs . In any case, none of the relevant minors of Vij have zero determinant, provided the λi are distinct. Therefore, V C (or the appropriate minor) has nonzero determinant and hence no kernel. So {vs } is a linearly independent set.

Exercise su(2) vs. sl(2, R)
A basis for the real Lie algebra of 2 × 2 traceless anti-Hermitian matrices is
        T^i = −(i/2) σ^i     i = 1, 2, 3          (15.508)

with Lie algebra

        [T^i , T^j ] = ε^{ijk} T^k          (15.509)
Can one make real linear combinations of T i to produce the generators e, h, f of sl(2, R)
above?

Exercise Representation matrices


a.) Show that if v0 is the vector used above with ρ(e)v0 = 0 and ρ(h)v0 = −N v0 then
213

ρ(e)ρ(f )` v0 = −`(N + 1 − `)ρ(f )`−1 v0 (15.510)


b.) Suppose that we choose a highest weight vector w so that ρ(f )w = 0 and ρ(h)w =
+N w. Write the representation matrices in the ordered basis

w, ρ(e)w, . . . , ρ(e)N w (15.511)

c.) Put a unitary structure on the vector space V so that ρ(T j ) are anti-Hermitian
matrices and relate the above bases to the standard basis |j, mi, m = −N/2, −N/2 +
1, . . . , N/2 − 1, N/2 appearing in quantum-mechanics textbooks. That is, find a rescaling
of the vectors ρ(f )k v0 so that in the new basis, after defining ρ(T i ) using (11.557) one
obtains anti-Hermitian matrices for ρ(T i ).

Example: SL(2, Z) action on Heis(Zn × Zn ).

Consider again the example of the quantum mechanics of a particle on a discrete


approximation to a ring.
1. Because the position is periodic, the momentum is quantized.
2. Because the position is quantized, the momentum is periodic.
So, the momentum is both periodic and discrete, just like the position. Recall the
position operator Q and the momentum operator P both had a spectrum which is given
by nth roots of unity.
213
Hint: Use [ρ(e), ρ(f )^ℓ ] = Σ_{i=0}^{ℓ−1} ρ(f )^i [ρ(e), ρ(f )] ρ(f )^{ℓ−1−i} .

So there is a symmetry between momentum and position. This is part of a kind of
symplectic symmetry in this discrete system related to SL(2, Zn ). All such matrices arise
from reduction modulo n of matrices in SL(2, Z). Recall from Section [**** 8.3 ****] that
SL(2, Z) is generated by S and T with relations

(ST )3 = S 2 = −1 (15.512)

Therefore, S and T (reduced modulo n) will generate SL(2, Zn ), although there will be
further relations, such as T^n = 1. The SL(2, Zn ) symmetry plays an important role in
string theory and Chern-Simons theory and illustrates nicely some ideas of duality.
We now take the cocycle for Heis(Zn × Zn ) to be

f ((a1 , b1 ), (a2 , b2 )) = a1 b2 ∈ Zn (15.513)

so the commutator function is

κ((a1 , b1 ), (a2 , b2 )) = a1 b2 − a2 b1 = v1tr Jv2 (15.514)

which we can recognize as a symplectic form on Zn ⊕ Zn .


Now an important subtlety arises. In general we cannot find an equivalent cocycle so
that
        f̃ ((a1 , b1 ), (a2 , b2 )) = (1/2)(a1 b2 − a2 b1 )          (15.515)

because, in general, we are not allowed to divide by 2 in Zn . After all, x = (1/2)(a1 b2 − a2 b1 )
should be the solution to 2x = (a1 b2 − a2 b1 ), but, if n is even, then if x is a solution x + n/2
is a different solution, so the expression is ambiguous. If n is odd then 2 is invertible and
we can divide by 2. Therefore, it is not obvious if Sp(2, Zn ) will lift to automorphisms of
the finite Heisenberg group.
Consider the transformation:

S : (a, b) → (b, −a) (15.516)

This is a symplectic transformation: S ∗ κ = κ, and it satisfies S 2 = −1. Now we compute

(S ∗ f − f )((a1 , b1 ), (a2 , b2 )) = −(a1 b2 + b1 a2 ) (15.517)

Although the cocycle is not invariant under S nevertheless the difference S ∗ f −f can indeed
be trivialized by
τS (a, b) = −ab (15.518)
Now consider the transformation

T : (a, b) → (a + b, b) (15.519)

We compute
(T ∗ f − f )((a1 , b1 ), (a2 , b2 )) = b1 b2 (15.520)
This can be trivialized by
        τT (a, b) = (1/2) b²          (15.521)

PROVIDED we are able to divide by 2!! This is possible if n is odd, but not when n is
even. Indeed, when n = 2 the cocycle (T ∗ f − f ) is not even a trivializable cocycle! (Why
not? Apply the triviality test described in [ **** Remark 5, section 11.3 ****] above.)
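
The statements about τS and τT can be checked by brute force over Zn. The sketch below (not from the notes) verifies that τS (a, b) = −ab always trivializes S∗f − f, that τT (a, b) = b²/2 works for odd n (where 2 is invertible), and that for n = 2 no function τ : Z2 × Z2 → Z2 trivializes T∗f − f:

# Brute-force check (not from the notes) of the claims around (15.517)-(15.521).
from itertools import product

def trivializes(n, diff, tau):
    # does tau : Z_n x Z_n -> Z_n satisfy diff(g1,g2) = tau(g1+g2) - tau(g1) - tau(g2) mod n?
    G = list(product(range(n), repeat=2))
    for g1 in G:
        for g2 in G:
            g12 = ((g1[0] + g2[0]) % n, (g1[1] + g2[1]) % n)
            if (diff(g1, g2) - tau(g12) + tau(g1) + tau(g2)) % n != 0:
                return False
    return True

f = lambda g1, g2: g1[0] * g2[1]                      # the cocycle (15.513)
S = lambda g: (g[1], -g[0])
T = lambda g: (g[0] + g[1], g[1])
dS = lambda g1, g2: f(S(g1), S(g2)) - f(g1, g2)
dT = lambda g1, g2: f(T(g1), T(g2)) - f(g1, g2)

assert trivializes(12, dS, lambda g: -g[0] * g[1])                       # tau_S, any n
assert trivializes(7, dT, lambda g: (g[1] ** 2 * pow(2, -1, 7)) % 7)     # tau_T, n odd
G2 = list(product(range(2), repeat=2))
assert not any(trivializes(2, dT, dict(zip(G2, vals)).get)               # n = 2: no tau works
               for vals in product(range(2), repeat=4))
print("tau_S works for all n, tau_T works for odd n, and nothing works for n = 2")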
One way to determine the lifted group, and how the group lifts when n is even is the
following. Suppose we have any two operators U, V that satisfy

U V = e2πiθ V U (15.522)

for some θ (which, at this point, need not even be rational). If


        A = ( a  b ; c  d ) ∈ SL(2, Z)          (15.523)

then ♣Need to add phases here to get a proper group action. These don’t compose properly yet. ♣

        Ũ := U^a V^c      Ṽ := U^b V^d          (15.524)

also satisfy

        Ũ Ṽ = e^{2πiθ} Ṽ Ũ          (15.525)
Now suppose, in addition, that θ = 1/n and U^n = V^n = 1. Then we can compute

        Ũ^n = e^{−iπθn(n−1)ac} = e^{−iπ(n−1)ac}
        Ṽ^n = e^{−iπθn(n−1)bd} = e^{−iπ(n−1)bd}          (15.526)

Now, when n is odd, the conditions (15.526) place no restriction on A. In that case,
the group SL(2, Z) acts on Heis(Zn ⊕ Zn ) as a group of automorphisms, but the normal
subgroup
Γ(n) := {A ∈ SL(2, Z)|A = 1modn} (15.527)
acts trivially so that
SL(2, Z)/Γ(n) ∼
= SL(2, Zn ) (15.528)
indeed acts as a group of automorphisms.
However, when n is even, we must consider the subgroup of SL(2, Z) with the extra
conditions ac = bd = 0mod2. Only this subgroup acts as a group of automorphisms. Again
a finite quotient group acts effectively.

Lifting Symplectomorphisms In Geometric Quantization


************************************************
************************************************
TO BE WRITTEN OUT IN DETAIL: A good example of the general problem of
lifting automorphisms is to consider a line bundle with connection (L, ∇) over a symplectic
manifold such that the curvature of the connection is the symplectic form Ω. Then, in
geometric quantization we can attempt to lift the symplectomorphisms which preserve
the line bundle with connection. We will see all the above phenomena, and only the
“Hamiltonian automorphisms” will lift. For some discussion see section 6 of:

https://arxiv.org/pdf/hep-th/0605200.pdf
IN PARTICULAR EXPLAIN THE RELATION OF Aut0 (G) to the kernel of a homo-
morphism Aut(G) → Hom(G, H 1 (G, A)).
****************************************
****************************************

15.5.9 Coherent State Representations Of Heisenberg Groups: The Bargmann Representation
EXPLAIN BARGMANN REPRESENTATION.
NICE FORMULAE FOR COHERENT STATES AND VEV’S OF EXPONENTI-
ATED QUADRATIC EXPRESSIONS IN OSCILLATORS.

15.5.10 Some Remarks On Chern-Simons Theory


In Chern-Simons theory (and similar topological field theories) it is quite typical for the
Wilson line operators to generate finite Heisenberg groups. For example for U (1) Chern-
Simons of level k on a torus we have an action
$$
S = \frac{k}{4\pi}\int_{\mathbb{R}} dt \int_{T^2} A_1\, \partial_t A_2 + \cdots \tag{15.529}
$$

so upon quantization $A_2 \sim \frac{4\pi}{k}\frac{\delta}{\delta A_1}$. The consequence is that Wilson lines along the a- and
b-cycles generate a finite Heisenberg group with q = e2πi/k . The Hilbert space of states is
a finite-dimensional irreducible representation of this group.
************************
EXPLAIN MORE. THETA FUNCTIONS AND METAPLECTIC REPRESENTA-
TION.
***************

15.6 Non-Central Extensions Of A General Group G By An Abelian Group A: Twisted Cohomology
Let us now generalize central extensions to extensions of the form:
$$
1 \to A \xrightarrow{\ \iota\ } \tilde G \xrightarrow{\ \pi\ } G \to 1 \tag{15.530}
$$

Here G can be any group, not necessarily Abelian. We continue to assume that N = A is
Abelian, but now we no longer assume ι(A) is central in G̃. So we allow for the possibility
of non-central extensions by an Abelian group.
Much of our original story goes through, but now the map

ω : G → Aut(A) (15.531)

of our general discussion (defined in equations (15.12) and (15.14)) is canonically defined
and is actually a group homomorphism. As we stressed below (15.14), in general it is not
a group homomorphism. There are two ways to understand that:

1. G̃ acts on A by conjugation on the isomorphic image of A in G̃ which, because the
sequence is exact, is still a normal subgroup. In equations, we can define

ι(ω̃g̃ (a)) := g̃ι(a)g̃ −1 (15.532)

But now ω̃g̃ only depends on the equivalence class [g̃] ∈ G̃/ι(A) because

(g̃ι(a0 )) ι(a) (g̃ι(a0 ))−1 = g̃ι(a)g̃ −1 (15.533)

so ω̃g̃ι(a0 ) = ω̃g̃ and since G̃/ι(A) ≅ G we can use this to define ωg . Moreover, from
this definition it is clear that g 7→ ωg is a group homomorphism.

2. Or you can just choose a section and define ωg exactly as in (15.14). To stress the
dependence on s we write

ι(ωg,s (a)) = s(g)ι(a)s(g)−1 (15.534)

However, now if we change section so that 214 ŝ(g) = ι(t(g))s(g) is another section
then we compute

$$
\begin{aligned}
\iota(\omega_{g,\hat s}(a)) &:= \{\iota(t(g))s(g)\}\cdot \iota(a)\cdot \{\iota(t(g))s(g)\}^{-1}\\
&= \iota(t(g))\cdot \iota(\omega_{g,s}(a))\cdot \iota(t(g))^{-1}\\
&= \iota\!\left(t(g)\cdot \omega_{g,s}(a)\cdot t(g)^{-1}\right)\\
&= \iota(\omega_{g,s}(a))
\end{aligned} \tag{15.535}
$$
and since ι is injective ωg,s is independent of section and we can just denote it as ωg .
Note carefully that only in the very last line did we use the assumption that A is
Abelian. We will come back to this when we discuss general extensions in section
15.7.
Moreover, given a choice of section we can define fs (g1 , g2 ) just as we did in equation
(15.90). This definition works for all group extensions:

s(g1 )s(g2 ) = ι(fs (g1 , g2 ))s(g1 g2 ) (15.536)

We can now compute, just as in (15.15):


$$
\begin{aligned}
\iota(\omega_{g_1}\circ\omega_{g_2}(a)) &= s(g_1)\,\iota(\omega_{g_2}(a))\,s(g_1)^{-1}\\
&= s(g_1)s(g_2)\,\iota(a)\,(s(g_1)s(g_2))^{-1}\\
&= \iota(f_s(g_1,g_2))\cdot \iota(\omega_{g_1g_2}(a))\cdot \iota(f_s(g_1,g_2))^{-1}\\
&= \iota\!\left(f_s(g_1,g_2)\,\omega_{g_1g_2}(a)\,f_s(g_1,g_2)^{-1}\right)\\
&= \iota(\omega_{g_1g_2}(a))
\end{aligned} \tag{15.537}
$$
and again notice that only in the very last line did we use the hypothesis that A is
Abelian. Again, since ι is injective, we conclude that ωg1 ◦ ωg2 = ωg1 g2 so that the
map ω is a group homomorphism.
214
Note that here the order of the two factors on the RHS matters, since ι(A) is not necessarily central in G̃.

Now, computing s(g1 )s(g2 )s(g3 ) in two ways, just as before, we derive the twisted
cocycle relation:
ωg1 (fs (g2 , g3 ))fs (g1 , g2 g3 ) = fs (g1 , g2 )fs (g1 g2 , g3 ) (15.538)

Conversely, given a homomorphism ω : G → Aut(A) and a twisted cocycle for ω we


can define a group law on the set A × G:

(a1 , g1 ) · (a2 , g2 ) = (a1 ωg1 (a2 )f (g1 , g2 ), g1 g2 ) (15.539)

The reader should check that this really does define a valid group law on the set A × G.
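For readers who want to see such a check carried out, here is a minimal sketch (my own illustration; the particular data are an assumption chosen for familiarity). Take A = Z_4 written additively, G = Z_2, ω the inversion action of the nontrivial element, and the twisted cocycle with f(1,1) = 2 and f = 0 otherwise. The multiplication (15.539) then produces a group of order 8 with a unique element of order 2, i.e. the quaternion group.

```python
# Sketch: build the group (15.539) from data (omega, f) and check the group axioms.
from itertools import product

nA = 4                                      # A = Z_4 (additive), G = Z_2 = {0, 1}

def omega(g, a):                            # omega_g(a): the nontrivial g acts by inversion
    return a % nA if g == 0 else (-a) % nA

def f(g1, g2):                              # twisted cocycle: f(1,1) = 2, else 0
    return 2 if (g1, g2) == (1, 1) else 0

def mult(x, y):                             # the twisted product (15.539)
    (a1, g1), (a2, g2) = x, y
    return ((a1 + omega(g1, a2) + f(g1, g2)) % nA, (g1 + g2) % 2)

G = [(a, g) for a in range(nA) for g in range(2)]

# twisted cocycle identity (15.538), then associativity of (15.539)
for g1, g2, g3 in product(range(2), repeat=3):
    lhs = (omega(g1, f(g2, g3)) + f(g1, (g2 + g3) % 2)) % nA
    rhs = (f(g1, g2) + f((g1 + g2) % 2, g3)) % nA
    assert lhs == rhs
for x, y, z in product(G, repeat=3):
    assert mult(mult(x, y), z) == mult(x, mult(y, z))

def order(x):
    y, k = x, 1
    while y != (0, 0):
        y, k = mult(y, x), k + 1
    return k

# exactly one element of order 2, so this is the quaternion group rather than D_4
assert sorted(order(x) for x in G).count(2) == 1
print("valid group of order 8 with a unique order-2 element (the quaternion group)")
```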

Remark: Note that (15.539) simultaneously generalizes the twisted product of a


semidirect product (14.2) and the twisted product of a central extension (15.102).

Now suppose that we change section from s to ŝ(g) := ι(t(g))s(g) using some arbitrary
function t : G → A. Then one can compute that the new cocycle is related to the old one
by
fŝ (g1 , g2 ) = t(g1 )ωg1 (t(g2 ))fs (g1 , g2 )t(g1 g2 )−1 (15.540)

Note that since A is Abelian the order of the factors on the RHS do not matter, but in
the analogous formula for general extensions, equation (15.620) below, the order definitely
does matter.
We say two different twisted cocycles are related by a twisted coboundary if they are
related as in (15.540) for some function t : G → A. One can check that if f is a twisted
cocycle and we define f 0 as in (15.540) then f 0 is also a twisted cocycle. We again have
an equivalence relation and we define the twisted cohomology H 2+ω (G, A) to be the
set of equivalence classes. It is again an Abelian group, as in the untwisted case,
as one shows by a similar argument.
The analog of the main theorem of section 15.3 above is:
Theorem: Let ω : G → Aut(A) be a fixed group homomorphism. Denote the set of
isomorphism classes of extensions of the form

1 → A → G̃ → G → 1 (15.541)

which induce ω by Extω (G, A). Then the set Extω (G, A) is in 1-1 correspondence with the
twisted cohomology group H 2+ω (G, A).
The proof is very similar to the untwisted case and we will skip it. Now the trivial ele-
ment of the Abelian group H 2+ω (G, A) corresponds to the semidirect product determined
by ω.
Now we can observe an interesting phenomenon which happens often in cohomology
theory: Suppose that a twisted cocyle f is trivializable so that [f ] = 0. Then our group
extension is equivalent to a semidirect product. Nevertheless, the sequence (15.530) can
be split in many different ways: There are many distinct trivializations and the different

trivializations have meaning. Equivalently, there are many different coboundary transfor-
mations that preserve the trivial cocycle. A glance at (15.540) reveals that this will happen
when
t(g1 g2 ) = t(g1 )ωg1 (t(g2 )) (15.542)
This is known as a twisted homomorphism. Of course, in the case that ω : G → Aut(A)
takes every g ∈ G to the identity automorphism of A (that is, to the identity element
of Aut(A)), the condition specializes to the definition of a homomorphism.
The following terminology will be useful for the later discussion of group cohomology:
A 1-cochain t ∈ C 1 (G, A) is simply a map t : G → A.
A twisted homomorphism is also known as a twisted one-cocycle. That is, a 1-cocycle
t ∈ Z 1+ω (G, A) with twisting ω is a 1-cochain that satisfies (15.542).
To define group cohomology H 1+ω (G, A) we need an appropriate notion of equivalence
of one-cocycles. This is motivated by noting that if s : G → G̃ is a section that is also a
homomorphism (that is, a splitting) then for any a ∈ A we can produce a new splitting

s̃(g) = ι(a)s(g)ι(a)−1 (15.543)

This corresponds to the change of section s̃(g) = ι(t(g))s(g) where the function t(g) is:

t(g) = ta (g) := aωg (a)−1 . (15.544)

To check this you write

$$
\begin{aligned}
\tilde s(g) &= \iota(a)\,s(g)\,\iota(a)^{-1}\\
&= \iota(a)\cdot s(g)\iota(a)^{-1}s(g)^{-1}\cdot s(g)\\
&= \iota(a)\cdot \iota(\omega_g(a^{-1}))\cdot s(g)\\
&= \iota(a\,\omega_g(a^{-1}))\cdot s(g)
\end{aligned} \tag{15.545}
$$




One easily checks that if t is a one-cocycle, then t · ta is also a one-cocycle. So, in


defining the cohomology group H 1+ω (G, A) we use the equivalence relation t ∼ t0 if there
exists an a so that t = t0 ta .

Theorem: When the sequence (15.530) splits, that is, when the cohomology class of the
twisted cocycle is trivial [f ] = 0, then the inequivalent splittings are in one-one corre-
spondence with the inequivalent trivializations of a trivializable cocycle, and these are in
one-one correspondence with the cohomology group H 1+ω (G, A).

Example 1: Consider the sequence associated with the Euclidean group


$$
0 \to \mathbb{R}^d \xrightarrow{\ \iota\ } \mathrm{Euc}(d) \xrightarrow{\ \pi\ } O(d) \to 1 \tag{15.546}
$$

Recall that if v ∈ Rd then ι(v) = Tv is the translation operator on affine space Ad . We have
Tv (p) = p + v. As we saw in (14.74) and (14.75) and the discussion preceding that exercise,
for any p ∈ Ad we have a section R 7→ sp (R) ∈ Euc(d) where sp (R) is the transformation
that takes
sp (R) : p + v 7→ p + Rv (15.547)

In other words, we define rotation-reflections by choosing p as the origin. Then from

TωR (v0 ) = sp (R)Tv0 sp (R)−1 (15.548)

we compute that
ωR (v0 ) = Rv0 (15.549)
thus ωR ∈ Aut(Rd ), and indeed R 7→ ωR is a group homomorphism. If we have two
different sections sp0 and sp then

sp0 (R) = Tt(R) sp (R) (15.550)

where
t(R) = (1 − R)(p0 − p) = (1 − R)w (15.551)
where we have put p0 = p + w, w ∈ Rd .
One easily checks that for fixed w ∈ Rd

R 7→ t(R) := (1 − R)w (15.552)

is indeed a twisted homomorphism O(d) → Rd . (It is not a homomorphism.) However,


one also checks that it is of the form tw (R) = w − ωR (w), so it is a trivial one-cocycle: All
the splittings are equivalent in the sense defined above.
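A quick numerical confirmation is easy to set up. The sketch below (my own illustration) draws random elements of O(d), verifies the twisted 1-cocycle identity t(R1 R2) = t(R1) + ω_{R1}(t(R2)) for t(R) = (1 − R)w, and exhibits the trivialization t(R) = w − ω_R(w).

```python
# Sketch: t_w(R) = (1 - R) w is a twisted 1-cocycle for omega_R(v) = R v, and it is trivial.
import numpy as np

rng = np.random.default_rng(0)
d = 3
w = rng.normal(size=d)

def random_orthogonal(d):
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # a random element of O(d)
    return q

def t(R):
    return (np.eye(d) - R) @ w

for _ in range(100):
    R1, R2 = random_orthogonal(d), random_orthogonal(d)
    # twisted homomorphism property (15.542): t(R1 R2) = t(R1) + R1 t(R2)
    assert np.allclose(t(R1 @ R2), t(R1) + R1 @ t(R2))
    # triviality: t(R) = w - omega_R(w), the coboundary of the 0-cochain w
    assert np.allclose(t(R1), w - R1 @ w)
print("t_w is a twisted 1-cocycle and is a coboundary")
```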

Example 2: Now restrict the sequence (15.546) to

0 → Zd → G → {1, σ} → 1 (15.553)

where {1, σ} ⊂ O(d) is a subgroup isomorphic to Z2 with σ = −1d×d . Then ω : Z2 →


Aut(Zd ) is inherited from ωR in (15.546) and ωσ (~n) = −~n. Now the sequence splits and
the most general possible splitting is s~n0 where s~n0 (1) = {0|1} and

s~n0 (σ) = {~n0 |σ} (15.554)

for some ~n0 ∈ Zd . Indeed one checks s~n0 (σ)2 = 1. Now for ~a ∈ Rd we have

t~a (σ) = ~a − ωσ (~a) = 2~a ∈ 2Zd (15.555)

So not all splittings are equivalent! The equivalent ones have ~n0 − ~n00 ∈ 2Zd . Therefore

$$
H^{1+\omega}(\mathbb{Z}_2, \mathbb{Z}^d) \cong \mathbb{Z}^d/2\mathbb{Z}^d \cong (\mathbb{Z}_2)^d \tag{15.556}
$$

Remarks

1. Different trivializations of something trivializable can have physical meaning. In the


discussion on crystallographic groups below the different trivializations are related to
a choice of origin for rotation-reflection symmetries of the crystal.

2. An analogy to bundle theory might help some readers: Let G be a compact Lie group.
Then the isomorphism classes of principal G-bundles over S 3 are in 1-1 correspon-
dence with π2 (G) and a theorem states that π2 (G) = 0 for all compact Lie groups.
Therefore, every principal G-bundle over S 3 is trivializable. Distinct trivializations
differ by maps t : S 3 → G and the set of inequivalent trivializations is classified by
π3 (G), which is, in general nontrivial. This can have physical meaning. For example,
in Yang-Mills theory in 3 + 1 dimensions on S 3 × R the principal G-bundle on space
S 3 is trivializable. But if there is an instanton between two time slices then the
trivialization jumps by an element of π3 (G).

Exercise Due diligence


Derive equation (15.538) and show that if we change f by a coboundary using (15.540)
then indeed we produce another twisted cocycle.

Exercise
Suppose that a twisted cocycle f (g1 , g2 ) can be trivialized by two different functions
t1 , t2 : G → A. Show that t12 (g) := t1 (g)/t2 (g) is a trivialization that preserves the trivial
cocycle. That is, show that t12 is a twisted 1-cocycle.

15.6.1 Crystallographic Groups


A crystal is a subset of affine space C ⊂ Ad that is invariant under translations by a
lattice L ⊂ Rd (actually, that’s an embedded lattice). As an example, see Figure 39.
Then restricting the exact sequence of the Euclidean group (equation (15.546) above ) to
the subgroup G(C) ⊂ Euc(d) of those transformations that preserve C we have an exact
sequence
1 → L(C) → G(C) → P (C) → 1 (15.557)
where P (C) ≅ G(C)/L(C) is a subgroup of O(d) known as the point group of the crystal.

Remark: In solid state physics when the sequence (15.557) does not split the crystallo-
graphic group G(C) is said to be nonsymmorphic.

Example 1: Take C = Z q (Z + δ) ⊂ R where 0 < δ < 1. Then of course L(C) = Z acts


by translations, preserving the crystal. But note that it is also true that

{δ|σ} : n 7→ δ − n = −n + δ
(15.558)
: n + δ 7→ δ − (n + δ) = −n

Figure 39: A portion of a crystal in the two-dimensional plane.

where σ ∈ O(1) is the reflection around 0, σ : x → −x in R. The transformation {δ|σ}


maps Z to Z + δ and Z + δ to Z so that the whole crystal is preserved. Since O(1) = Z2 ,
this is all we can do. We thus find that G(C) fits in a sequence

$$
0 \to L(C) \cong \mathbb{Z} \to G(C) \to O(1) \cong \mathbb{Z}_2 \to 1 \tag{15.559}
$$

But we can split this sequence by choosing a section s(σ) = {δ| − 1}. Note that

{δ|σ} · {δ|σ} = {0|1} (15.560)

so s : O(1) → G(C) is a homomorphism. Another way of thinking about this is that s(σ) is
just reflection, not around the origin, but around the point δ/2. So, by a shift of origin for
defining our rotation-inversion group O(1) we just have reflections and integer translations.
In any case we can recognize G(C) as the infinite dihedral group.

Example 2: More generally, consider a lattice L ⊂ Rd and a generic vector ~δ ∈ Rd .


Consider the crystal
C = L q (L + ~δ) (15.561)
If L and ~δ are generic then the point group is just Z2 generated by −1 ∈ O(d). Denoting
the action of −1 on Rd by σ we can lift this to the involution

{~δ|σ} ∈ G(C) (15.562)

which exchanges L with (L+~δ). This group is symmorphic (because {~δ|σ} is an involution).
In fact, this operation is just inversion about the new origin ~δ/2:
$$
\{\vec\delta|\sigma\} : \tfrac{1}{2}\vec\delta + \vec y \ \mapsto\ \tfrac{1}{2}\vec\delta - \vec y \tag{15.563}
$$
♣NEED TO HAVE
A FIGURE HERE.
THIS WOULD
HELP. ♣

Example 3: For another very similar example consider

C = L q (L + ~δ) ⊂ R2 (15.564)

where
L = a1 Z ⊕ a2 Z ⊂ R2 (15.565)
As we have just discussed, for generic a1 , a2 and ~δ the symmetry group will be isomorphic
to the semidirect product Z2 o Z2 .
However, now let 0 < δ < 1/2 and specialize ~δ to ~δ = (δ a1 , a2 /2). Then the crystal has
more symmetry and in particular the point group is enhanced from Z2 to Z2 × Z2 :

1 → Z2 → G(C) → Z2 × Z2 → 1 (15.566)

To see this let σ1 , σ2 be generators of Z2 × Z2 acting by reflection around the x2 and x1


axes, respectively. Then the operations:

σ̂1 : (x1 , x2 ) 7→ ~δ + (−x1 , x2 ) (15.567)

σ̂2 : (x1 , x2 ) 7→ (x1 , −x2 ) (15.568)


are symmetries of the crystal G(C). In Seitz notation (or rather, its improvement - see
equations (14.31) and (14.32) above) we have:
$$
\hat\sigma_1 = \left\{\vec\delta\,\Big|\,\begin{pmatrix} -1 & 0\\ 0 & 1\end{pmatrix}\right\} \tag{15.569}
$$
$$
\hat\sigma_2 = \left\{0\,\Big|\,\begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}\right\} \tag{15.570}
$$
Now, we can define a section s(σ1 ) = σ̂1 and s(σ2 ) = σ̂2 . Note that the square of the
lift
σ̂12 = {(0, a2 )|1} (15.571)
is a nontrivial translation. Thus σi → σ̂i is not a splitting. Moreover, σ̂1 does not have
finite order. Therefore, it cannot be in a discrete group of rotations about any point!
Just because we chose a section that wasn’t a splitting doesn’t mean that a splitting
doesn’t actually exist. Here is how we can prove that in fact no splitting exists: The most
general section is of the form

s(σ1 ) = {~δ + ~v |σ1 } (15.572)

where ~v = (n1 a1 , n2 a2 ) ∈ L where n1 , n2 ∈ Z. Now consider the square:

s(σ1 )2 = {(0, a2 (1 + 2n2 ))|1}. (15.573)

Since n2 ∈ Z there is no lifting that makes this an involution. Therefore, there is no section.
Therefore the sequence (15.566) does not split.
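The argument can also be run by machine. The sketch below (my own illustration, with a1 = a2 = 1 and a generic δ) represents elements of G(C) as affine maps {v|M} : x ↦ M x + v and checks that every candidate lift of σ1 squares to a nonzero translation, in agreement with (15.573).

```python
# Sketch: every lift of sigma_1 to G(C) squares to a nontrivial translation (Example 3).
import numpy as np

a1, a2 = 1.0, 1.0
delta = np.array([0.3 * a1, 0.5 * a2])           # the shift (delta*a1, a2/2), delta generic
M1 = np.diag([-1.0, 1.0])                        # point-group part of sigma_1-hat

def compose(g, h):
    """Composition of affine maps g = (M, v): x -> M x + v."""
    (Mg, vg), (Mh, vh) = g, h
    return (Mg @ Mh, Mg @ vh + vg)

for n1 in range(-3, 4):
    for n2 in range(-3, 4):
        v = delta + np.array([n1 * a1, n2 * a2])  # the general section (15.572)
        s = (M1, v)
        M_sq, v_sq = compose(s, s)
        assert np.allclose(M_sq, np.eye(2))       # the square is a pure translation ...
        assert not np.allclose(v_sq, 0.0)         # ... and it is never the identity
        assert np.isclose(v_sq[1], a2 * (1 + 2 * n2))   # matches (15.573)
print("no lift of sigma_1 is an involution, so the sequence does not split")
```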

Example 4: It is interesting to see what happens to the previous example when a1 = a2 =


a and we take δ = a(1/2, 1/2). Then, clearly

C = L q (L + ~δ) ⊂ R2 (15.574)

has a point group symmetry D4 . So this becomes a symmorphic crystal. In fact, this is
just a square lattice in disguise! We can take basis vectors δ and R(π/2)δ.
***************************
NEED TO RELATE THE ABOVE FACTS MORE DIRECTLY TO THE PREVIOUS
DISCUSSION OF GROUP COHOMOLOGY. SHOULD DO MORE ON CASE WHERE
THE SEQUENCE SPLITS BUT THERE ARE INEQUIVALENT SPLITTINGS: PROB-
ABLY A good example is Zincblend structure with tetrahedral symmetry. For example
GaAs has this structure. There are two tetrahedra around the Ga and As but they are
rotated.
*****************************

Exercise
Why does the argument of example 3 fail in the special case of example 4? 215

Exercise Honeycomb
Consider a honeycomb crystal in the plane. Discuss the crystal group, the point group,
and decide if it is symmorphic or not.

♣Need to provide
answer in a footnote

15.6.2 Time Reversal
A good example of a physical situation in which it is useful to know about how twisted
cocycles define non-central extensions is when there are anti-unitary symmetries in a quan-
tum mechanical system. A typical example where this happens is when there is a time-
orientation-reversing symmetry. In this case there is a homomorphism

τ : G → {±1} ∼
= Z2 (15.575)
215
Answer : The wrong step is in equation (15.572). When δ takes the special form (a/2, a/2) this is not the
most general lifting. One now has translation symmetry by multiples of ~δ, so there is an obvious lifting of
σ1 .

telling us whether the symmetry g ∈ G preserves or reverses the orientation of time.
In quantum mechanics it is often (but not always! - see below) the case that time-
reversal is implemented as an anti-unitary operator (see Chapter 2 below for a precise
definition of this term) and therefore when looking at the way the symmetry is implemented
quantum mechanically we should consider the nontrivial automorphism of U (1) defined by
complex conjugation.
Recall that
$$
\mathrm{Aut}(U(1)) \cong \mathrm{Out}(U(1)) \cong \mathbb{Z}_2 \tag{15.576}
$$

and the nontrivial element of Aut(U (1)) is the automorphism z → z ∗ = z −1 .


So, when working with a symmetry group G that includes time-orientation-reversing
symmetries we will need to consider the group homomorphism

ω : G → Aut(U (1)) (15.577)

where:
$$
\omega(g)(z) = \begin{cases} z & \tau(g) = +1\\ z^{-1} & \tau(g) = -1\end{cases} \tag{15.578}
$$

Example 1. The simplest example is where we have a symmetry group G = Z2 interpreted


as time reversal. It will be convenient to denote M2 = {1, T̄ }, with T̄ 2 = 1. Of course,
M2 ≅ Z2 . In quantum mechanics the representation of T̄ will be an operator ρ(T̄ ) := T̃
on the Hilbert space and we will get a possibly twisted central extension of M2 . Let
ω : M2 → Aut(U (1)). There are two possibilities: ω(T̄ ) = 1 (so the operation is unitary)
and ω(T̄ ) is the complex conjugation automorphism (so the operation is anti-unitary).
Assuming the anti-unitary case is the relevant one, so that ω is the nontrivial homomorphism
M2 → Aut(U (1)) (both are isomorphic to Z2 so ω is just the identity homomorphism of
Z2 ) we need the group cohomology:

H 2+ω (Z2 , U (1)) = Z2 (15.579)

To prove this we look at the twisted cocycle identity. Exactly the same arguments as in
Remark 5 of section 15.3 show that we can choose a gauge with f (g, 1) = f (1, g) = 1 for
all g. This leaves only f (T̄ , T̄ ) to be determined. Now, the twisted cocycle relation for the
case g1 = g2 = g3 = T̄ says that

f (T̄ , T̄ )∗ = ωT̄ (f (T̄ , T̄ )) = f (T̄ , T̄ ) (15.580)

and since f (T̄ , T̄ ) ∈ U (1) this means f (T̄ , T̄ ) ∈ {±1}. We need to check that we can’t
gauge f (T̄ , T̄ ) to one using a twisted coboundary relation. That relation says that we can
gauge f to
f˜(T̄ , T̄ ) = t(T̄ )ωT̄ (t(T̄ ))f (T̄ , T̄ )/t(1) (15.581)

Now t(1) = 1 since we want to preserve the gauge f (g, 1) = f (1, g) = 1 and t(T̄ )ωT̄ (t(T̄ )) =
|t(T̄ )|2 = 1 so f (T̄ , T̄ ) is gauge invariant.

(Note that T̄ is an involution so our old criterion from Remark 5 of section 15.3 would
ask us to find a square root of f (T̄ , T̄ ) = −1. Indeed such a square root exists, it is ±i,
but our old criterion no longer applies because we are in the twisted case.)
So, choosing ω to be the nontrivial homomorphism M2 → Aut(U (1)) there are two
extensions:
$$
1 \to U(1) \to M_2^{\pm} \xrightarrow{\ \tilde\pi\ } M_2 \to 1 \tag{15.582}
$$

Let us write these out more explicitly:


Choose a lift T̃ of T̄ . Then π(T̃ 2 ) = 1, so T̃ 2 = z ∈ U (1). But, then
T̃ z = T̃ T̃ 2 = T̃ 2 T̃ = z T̃ (15.583)
On the other hand, since we take ω(T̄ ) to be the nontrivial automorphism of U (1) then
T̃ z = z −1 T̃ (15.584)
Therefore z 2 = 1, so z = ±1, and therefore T̃ 2 = ±1. Thus the two groups are
M2± = {z T̃ |z T̃ = T̃ z −1 & T̃ 2 = ±1} (15.585)
These possibilities are really distinct: If T̃ 0 is another lift of T̄ then T̃ 0 = µT̃ for some
µ ∈ U (1) and so
(T̃ 0 )2 = (µT̃ )2 = µµ̄T̃ 2 = T̃ 2 (15.586)
So the sign of the square of the lift of the time-reversing symmetry is an invariant.
The extension corresponding to the identity element of H 2+ω (Z2 , U (1)) is the semidirect
product. This is just O(2), using SO(2) ≅ U (1):
$$
O(2) = SO(2) \rtimes \mathbb{Z}_2 \tag{15.587}
$$
But the nontrivial extension is a new group for us. It double-covers O(2) and is known as
Pin− (2). Indeed we can define homomorphisms
π ± : M2± → O(2) (15.588)
where π ± (T̃ ) = P ∈ O(2) and π ± (z = eiα ) = R(2α). Note that −1 = eiπ 7→ R(2π) = +1.
In Pin+ (2) the double cover of a reflection, T̃ = (π + )−1 (P ) , squares to one. In Pin− (2)
the double cover of a reflection, T̃ = (π − )−1 (P ) squares to −1.

Remark: In QM textbooks it is shown that if we write Schrödinger equation for an electron


in a potential with spin-orbit coupling then there is a time-reversal symmetry:
$$
(\tilde T\cdot\Psi)(\vec x, t) = i\sigma^2\,(\Psi(\vec x,-t))^* \tag{15.589}
$$
where here Ψ is a 2-component spinor function of (~x, t). 216 Note that this implies:
$$
\begin{aligned}
(\tilde T^2\cdot\Psi)(\vec x,t) &= i\sigma^2\left((\tilde T\cdot\Psi)(\vec x,-t)\right)^*\\
&= i\sigma^2\left(i\sigma^2\,(\Psi(\vec x,t))^*\right)^*\\
&= i\sigma^2\cdot i\sigma^2\cdot \Psi(\vec x,t)\\
&= -\Psi(\vec x,t)
\end{aligned} \tag{15.590}
$$
216
This is most elegantly derived from the time-reversal transformation on the Dirac equation.

So, in this example, T̃ 2 = −1. More generally, in analogous settings for spin j particles
T̃ 2 = (−1)2j . See section 15.6.3 below for an explanation. The fact that T̃ 2 = (−1)2j in
the spin j representation has a very important consequence known as Kramers' theorem:
In these situations the energy eigenspaces must have even degeneracy. For if Ψ is an energy
eigenstate HΨ = EΨ and we have a time-reversal invariant system then T̃ · Ψ is also an
energy eigenstate. We can prove that it is linearly independent of Ψ as follows: Suppose
to the contrary that
T̃ · Ψ = zΨ (15.591)
for some complex number z. Then act with T̃ again and use the fact that it is anti-unitary
and squares to −1:
−Ψ = z ∗ T̃ · Ψ (15.592)
but this implies that z = −1/z ∗ which implies |z|2 = −1, which is impossible. Therefore,
(15.591) is impossible. Therefore Ψ and T̃ · Ψ are independent energy eigenstates. A slight
generalization of the argument shows that the dimension of the energy eigenspace must
be even. A more conceptual way of understanding this is that the energy eigenspace must
be a quaternionic vector space because we have an anti-linear operator on it that squares
to −1. See the discussion of real, complex, and quaternionic vector spaces in Chapter 2
below.

Example 2: In general a system can have time-orientation reversing symmetries but the
simple transformation t → −t is not a symmetry. Rather, it must be accompanied by
other transformations so that the symmetry group is not of the simple form G = G0 × Z2
where G0 is a group of time-orientation-preserving symmetries. (Such a structure is often
assumed in the literature.) As a simple example consider a crystal

$$
C = \left(\mathbb{Z}^2 + (\delta_1,\delta_2)\right) \amalg \left(\mathbb{Z}^2 + (-\delta_2,\delta_1)\right) \amalg \left(\mathbb{Z}^2 + (-\delta_1,-\delta_2)\right) \amalg \left(\mathbb{Z}^2 + (\delta_2,-\delta_1)\right) \tag{15.593}
$$

where ~δ is generic so, as we saw above, we have a symmorphic crystal with P (C) ≅ D4 .
The action of D4 is generated by the rotation around the origin {0|R(π/2)}, which we will denote
by R, and reflection, say, in the y-axis, which we will denote by P . So R4 = 1, P 2 = 1, and
P RP = R−1 . We have
$$
G(C) = \mathbb{Z}^2 \rtimes D_4 \tag{15.594}
$$
But now suppose there is a dipole moment, or spin S. We model this with a set of two
elements S = {S, −S} for dipole moment up and down and now our crystal with spin is a
subset of R2 × S. This subset is of the form

$$
\hat C = \hat C_+ \amalg \hat C_- \tag{15.595}
$$

with
$$
\hat C_+ = \left(\mathbb{Z}^2 + (\delta_1,\delta_2)\right)\times\{S\}\ \amalg\ \left(\mathbb{Z}^2 + (-\delta_1,-\delta_2)\right)\times\{S\} \tag{15.596}
$$
but a spin −S on points of the complementary sub-crystal
$$
\hat C_- = \left(\mathbb{Z}^2 + (-\delta_2,\delta_1)\right)\times\{-S\}\ \amalg\ \left(\mathbb{Z}^2 + (\delta_2,-\delta_1)\right)\times\{-S\} \tag{15.597}
$$

Figure 40: In this figure the blue crosses represent an atom with a local magnetic moment pointing
up while the red crosses represent an atom with a local magnetic moment pointing down. The
magnetic point group is isomorphic to D4 but the homomorphism τ to Z2 has a kernel Z2 × Z2
(generated by π rotation around a lattice point together with a reflection in a diagonal). Since D4
is nonabelian the sequence $1 \to \hat P_0 \to \hat P \xrightarrow{\ \tau\ } \mathbb{Z}_2 \to 1$ plainly does not split.

Now let Z2 = {1, σ} act on R2 × S by acting trivially on the first factor and σ : S → −S
on the second factor. Now reversal of time orientation exchanges S with −S. So the
symmetries of the crystal with dipole form a subgroup $\widehat{G(C)} \subset \mathrm{Euc}(2)\times\mathbb{Z}_2$ known as the
magnetic crystallographic group. The subgroup of translations by the lattice is still a normal
subgroup and the quotient by the lattice of translations is the magnetic point group. In
the present example:
$$
0 \to \mathbb{Z}^2 \to \widehat{G(C)} \to \widehat{P(C)} \to 1 \tag{15.598}
$$
The elements in $\widehat{P(C)}$ are

{(1, 1), (R, σ), (R2 , 1), (R3 , σ), (P, σ), (P R, 1), (P R2 , σ), (P R3 , 1)} (15.599)

This magnetic point group is isomorphic to D4 but the time reversal homomorphism takes
τ (R, σ) = −1 and τ (P, σ) = −1 so that we have
$$
1 \to \mathbb{Z}_2\times\mathbb{Z}_2 \to \widehat{P(C)} \xrightarrow{\ \tau\ } \mathbb{Z}_2 \to 1 \tag{15.600}
$$

The induced automorphism on Z2 × Z2 is trivial, so clearly this sequence does not split,
since $\widehat{P(C)} \cong D_4$ is nonabelian.

Remarks:

1. With the possible exception of exotic situations in which quantum gravity is impor-
tant, physics takes place in space and time. Except in unusual situations associated
with nontrivial gravitational fields we can assume our spacetime is time-orientable.
Then, any physical symmetry group G must be equipped with a homomorphism

τ : G → Z2 (15.601)

telling us whether the symmetry operations preserve or reverse the orientation of


time. That is τ (g) = +1 are symmetries which preserve the orientation of time while
τ (g) = −1 are symmetries which reverse it.
Now, suppose that G is a symmetry of a quantum system. Then Wigner’s theorem
gives G another grading φ : G → Z2 , telling us whether the operator ρ(g) implement-
ing the symmetry transformation g on the Hilbert space is unitary or anti-unitary.
Thus, on very general grounds, a symmetry of a quantum system should be bigraded
by a pair of homomorphisms (φ, τ ), or what is the same, a homomorphism to Z2 ×Z2 .
It is natural to ask whether φ and τ are related. A natural way to try to relate them
is to study the dynamical evolution.
In quantum mechanics, time evolution is described by unitary evolution of states.
That is, there should be a family of unitary operators U (t1 , t2 ), strongly continuous
in both variables and satisfying composition laws U (t1 , t3 ) = U (t1 , t2 )U (t2 , t3 ) so that
the density matrix % evolves according to:

%(t1 ) = U (t1 , t2 )%(t2 )U (t2 , t1 ) (15.602)

Let us - for simplicity - make the assumption that our physical system has time-
translation invariance so that U (t1 , t2 ) = U (t1 − t2 ) is a strongly continuous group of
unitary transformations. 217
By Stone's theorem, U (t) has a self-adjoint generator H, the Hamiltonian, so that we may write
$$
U(t) = \exp\left(-\frac{it}{\hbar}H\right) \tag{15.603}
$$
♣There is an obvious generalization of this statement for U (t1 , t2 ). Is it proved rigorously somewhere? ♣

Now, suppose we have a group 218 of operators on the Hilbert space: ρ : G →


AutR (H). We say this group action is a symmetry of the dynamics if for all g ∈ G:

ρ(g)U (t)ρ(g)−1 = U (τ (g)t) (15.604)

where τ : G → Z2 is the indicator of time-orientation-reversal.


217
In the more general case we would need an analog of Stone's theorem to assert that there is a family
of self-adjoint operators with $U(t_1,t_2) = \mathrm{Pexp}\left[-\frac{i}{\hbar}\int_{t_2}^{t_1} H(t')\,dt'\right]$. Then, the argument we give below would
lead to ρ(g)H(t)ρ(g)−1 = φ(g)τ (g)H(t) for all t.
218
As explained in the 2012 article of Freed and Moore, this group might be an extension of the original
group of quantum symmetries ρ̄ : Ḡ → Autqtm (PH).

Now, substituting (15.603) and paying proper attention to φ we learn that the con-
dition for a symmetry of the dynamics (15.604) is equivalent to

φ(g)ρ(g)Hρ(g)−1 = τ (g)H (15.605)

in other words,
ρ(g)Hρ(g)−1 = φ(g)τ (g)H (15.606)

Thus, the answer to our question is that φ and τ are unrelated in general. We should
therefore define a third homomorphism χ : G → Z2

χ(g) := φ(g)τ (g) ∈ {±1} (15.607)

Note that
φ·τ ·χ=1 (15.608)

2. It is very unusual for physical systems to have nontrivial homomorphisms χ. That


is, it is very unusual to have physical systems with time-orientation-reversing sym-
metries which are C-linear or time-orientation-preserving symmetries which act C-
anti-linearly. But it is not impossible. To see why it is unusual note that:

ρ(g)Hρ(g)−1 = χ(g)H (15.609)

implies that if any group element has χ(g) = −1 then the spectrum of H must be
symmetric around zero. In particular, if the spectrum is bounded below but not
above this condition must fail. In many problems, e.g. in the standard Schrödinger
problem with potentials which are bounded below, or in relativistic QFT with H
bounded below we must have χ(g) = 1 for all g and hence φ(g) = τ (g), which is
what one reads in virtually every physics textbook: “A symmetry is anti-unitary iff
it reverses the orientation of time.” Not true, in general.

3. However, there are physical examples where χ(g) can be non-trivial, that is, there
can be symmetries which are both anti-unitary and time-orientation preserving. An
example are the so-called “particle-hole” symmetries in free fermion systems.

15.6.3 T 2 = (−1)2j and the Clebsch-Gordan Decomposition


Above we checked that T 2 = −1 on spin 1/2 particles whose wavefunction obeys the usual
Schrödinger equation with spin-orbit coupling. The generalization to spin j particles is
T 2 = (−1)2j .
One simple way to see this is to note that the spin j representation is obtained by
decomposing the tensor product of (2j) copies of the spin 1/2 representation. In general
we have the very important Clebsch-Gordan decomposition:

$$
V(j_1)\otimes V(j_2) \cong V(|j_1-j_2|)\oplus V(|j_1-j_2|+1)\oplus\cdots\oplus V(j_1+j_2) \tag{15.610}
$$

Note that every representation on the RHS has the same parity of (−1)2j . Also note
the triangular structure of the Clebsch-Gordan decomposition of V (1/2)⊗n allowing for an
inductive proof. Finally T̃ 2 on V (1/2)⊗n is just (−1)n , so it is (−1)n on the highest summand
V (n/2).
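As a sanity check on (15.610) one can at least verify the dimension count and the parity statement by machine; the following minimal sketch (my own illustration, using dim V(j) = 2j + 1) does this for a range of spins.

```python
# Sketch: dimension count and parity in the Clebsch-Gordan decomposition (15.610).
from fractions import Fraction as F

def dim(j):
    return int(2 * j + 1)

def rhs_spins(j1, j2):
    j = abs(j1 - j2)
    while j <= j1 + j2:
        yield j
        j += 1

spins = [F(k, 2) for k in range(0, 7)]     # j = 0, 1/2, 1, ..., 3
for j1 in spins:
    for j2 in spins:
        rhs = list(rhs_spins(j1, j2))
        # dimensions match: (2 j1 + 1)(2 j2 + 1) = sum of (2 j + 1)
        assert dim(j1) * dim(j2) == sum(dim(j) for j in rhs)
        # every summand has the same parity (-1)^{2j} = (-1)^{2(j1 + j2)}
        assert all((-1) ** int(2 * j) == (-1) ** int(2 * (j1 + j2)) for j in rhs)
print("dimension and parity checks of (15.610) pass")
```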
Let us sketch a proof of (15.610). We need some general facts about representation
theory; see Chapter 4 for full explanations.

15.7 General Extensions


Let us briefly return to the general extension (15.1). Thus, we are now not assuming that N
or Q is abelian. We might ask what happens if we try to continue following the reasoning
of section (15.1) in this general case, but now keeping in mind the nice classification of
central extensions using group cohomology.
What we showed is that for any group extension a choice of a section s : Q → G
automatically gives us two maps:

1. ωs : Q → Aut(N )

2. fs : Q × Q → N

These two maps are defined by

ι(ωs,q (n)) := s(q)ι(n)s(q)−1 (15.611)


and
s(q1 )s(q2 ) := ι(fs (q1 , q2 ))s(q1 · q2 ) (15.612)
respectively.
Now (15.611) defines an element of Aut(N ) for fixed s and q, but the map q 7→ ωs,q
need not be a homomorphism, as we have repeatedly stressed. Rather, using (15.611) and
(15.612) we can derive a twisted version of the homomorphism rule:

ωs,q1 ◦ ωs,q2 = I(fs (q1 , q2 )) ◦ ωs,q1 q2 (15.613)

Recall that for a ∈ N , I(a) ⊂ Aut(N ) denotes the inner automorphism given by conjugation
by a. The proof of (15.613) follows exactly the same steps as (15.537), except for the very
last line.
Moreover, using (15.612) to relate s(q1 )s(q2 )s(q3 ) to s(q1 q2 q3 ) in two ways gives a
twisted cocycle relation:

ωs,q1 (fs (q2 , q3 ))fs (q1 , q2 q3 ) = fs (q1 , q2 )fs (q1 q2 , q3 ) (15.614)

Note this is the same as (15.538), but unlike that equation the order of the terms is now very
important since we no longer assume that N is abelian.
To summarize: Given a general extension (15.1) there exist maps (ωs , fs ), associated
with any section s and defined by (15.611) and (15.612). The maps (ωs , fs ) automatically
satisfy the identities (15.613) and (15.614).
We now consider, more generally, functions satisfying identities (15.613) and (15.614).
That is, we assume we are given two maps (not necessarily derived from some section):

1. A map f : Q × Q → N

2. A map ω : Q → Aut(N )

And we suppose the data (ω, f ) satisfy the two conditions

ωq1 ◦ ωq2 = I(f (q1 , q2 )) ◦ ωq1 q2 (15.615)

ωq1 (f (q2 , q3 ))f (q1 , q2 q3 ) = f (q1 , q2 )f (q1 q2 , q3 ) (15.616)


then we can construct an extension (15.1) with the multiplication law:

(n1 , q1 ) ·f,ω (n2 , q2 ) := (n1 ωq1 (n2 )f (q1 , q2 ), q1 q2 ) (15.617)

This is very similar to (15.539) but we stress that since N might be nonabelian, the order
of the factors in the first entry on the RHS matters!
With a few lines of algebra, using the identities (15.615) and (15.616) one can check
the associativity law and the other group axioms. We have already seen this simultaneous
generalization of the semidirect product (14.2) and the twisted product of a central exten-
sion (15.102) in our discussion of the case where N = A is abelian. (See equation (15.539)
above.) The new thing we have now learned is that this is the most general way of putting
a group structure on a product N × Q so that the result fits in an extension of Q by N .
Now, suppose again that we are given a group extension. As we showed, a choice of
section s gives us a pair of functions (ωs , fs ) satisfying (15.615) and (15.616). Any other
section s̃ is related to s by a function t : Q → N . Indeed that function t is defined by:

s̃(q) = ι(t(q))s(q) (15.618)

and one easily computes that we now have

ωs̃,q = I(t(q)) ◦ ωs,q (15.619)

fs̃ (q1 , q2 ) = t(q1 )ωs,q1 (t(q2 ))fs (q1 , q2 )t(q1 q2 )−1 (15.620)
The proof of (15.619) follows exactly the same steps as (15.535). To prove (15.620) we
patiently combine the definition (15.618) with the definition (15.612). ♣We also skipped this proof for N = A abelian. Probably should show the steps. ♣
These formulae for how (ωs , fs ) change as we change the section now motivate the following:

Suppose we are given a pair (ω, f ) satisfying (15.615) and (15.616) and an arbitrary
function t : Q → N . We can now define a new pair (ω 0 , f 0 ) by the equations:

ωq0 = I(t(q)) ◦ ωq (15.621)

f 0 (q1 , q2 ) = t(q1 )ωq1 (t(q2 ))f (q1 , q2 )t(q1 q2 )−1 (15.622)


Now, with some algebra (DO IT!) one can check that indeed (ω 0 , f 0 ) really do satisfy
(15.615) and (15.616) as well. Equations (15.621) and (15.622) generalize the coboundary
relation (15.97) of central extension theory. Note that the equations relating ω and f back
to ω 0 and f 0 are of the same form with t(q) → t(q)−1 .

The relations (15.621) and (15.622) define an equivalence relation on the set of pairs
(ω, f ) satisfying (15.615) and (15.616). Moreover, if (ω, f ) and (ω 0 , f 0 ) are related by
(15.621) and (15.622) then we can define a group structure on the set N × Q in two ways
using the equation (15.617) for each pair. Nevertheless, there is a morphism between these
two extensions in the sense of (15.4) above where we define

ϕ(n, q) := (nt(q)−1 , q) (15.623)

So, to check this you need to check

ϕ((n1 , q1 ) ·f,ω (n2 , q2 )) = ϕ(n1 , q1 ) ·f 0 ,ω0 ϕ(n2 , q2 ) (15.624)

Then note that ϕ−1 (n, q) = (nt(q), q) is an inverse morphism of extensions, and hence we
have an isomorphism of extensions.
Now we would like to state all this a little more conceptually. The first point to
note is that a map q 7→ ωq ∈ Aut(N ) that satisfies (15.615) in fact canonically defines a
homomorphism ω̄ : Q → Out(N ) of Q into the group of outer automorphisms of N . This
homomorphism is defined more conceptually as the unique map that makes the following diagram commute:
$$
\begin{array}{ccccccccc}
1 & \to & N & \xrightarrow{\ \iota\ } & G & \xrightarrow{\ \pi\ } & Q & \to & 1\\
  &     & \downarrow I & & \downarrow \psi & & \downarrow \bar\omega & & \\
1 & \to & \mathrm{Inn}(N) & \to & \mathrm{Aut}(N) & \to & \mathrm{Out}(N) & \to & 1
\end{array} \tag{15.625}
$$

Here I : N → Inn(N ) is the map that takes n to the inner automorphism I(n) : n0 7→
nn0 n−1 and ψ is the map from G → Aut(N ) defined by

ι(ψ(g)(n)) = gι(n)g −1 (15.626)

Now, we can ask the converse question: Given an arbitrary homomorphism ω̄ : Q →


Out(N ) is there an extension of Q by N that induces it as in (15.625)?
The most obvious thing to try when trying to answer this question is to use ω̄ : Q →
Out(N ) and the pullback construction (15.34) of the canonical exact sequence given by
the lower line of (15.625). But this will only give an extension of Q by Inn(N ). Note that
Inn(N ) ∼= N/Z(N ), and so the center of N might cause some trouble. That is in fact what
happens: The answer to the above question is, in general, “NO,” and the obstruction has
to do with the third cohomology group H 3+ω̄ (Q, Z(N )) where Z(N ) is the center of N .
See section 15.8.5 below.
But for now, let us suppose we have a choice of ω̄ such that extensions inducing it do
exist. What can we say about the set Extω̄ (Q, N ) of equivalence classes of such extensions?
To answer this we choose a lifting of the homomorphism, that is, a map $q \mapsto \omega_q \in \mathrm{Aut}(N)$. Now, if we have two extensions both inducing ω̄ and we choose two liftings $\omega^{(1)}_q$ and $\omega^{(2)}_q$ then they will be related by
$$
\omega^{(1)}_q = I(t(q)) \circ \omega^{(2)}_q \tag{15.627}
$$

for some function t : Q → N . Note, please, that while this equation is formally very similar
to (15.619) it is conceptually different. Nothing has been said about the relation of the two
extensions, other than that they induce the same ω̄.
Now we try to relate the corresponding functions $f^{(1)}(q_1,q_2)$ and $f^{(2)}(q_1,q_2)$. To do that we compute
$$
\begin{aligned}
\omega^{(1)}_{q_1}\circ\omega^{(1)}_{q_2}(n) &= t(q_1)\,\omega^{(2)}_{q_1}\!\left(t(q_2)\,\omega^{(2)}_{q_2}(n)\,t(q_2)^{-1}\right)t(q_1)^{-1}\\
&= t(q_1)\,\omega^{(2)}_{q_1}(t(q_2))\left(\omega^{(2)}_{q_1}\circ\omega^{(2)}_{q_2}(n)\right)\omega^{(2)}_{q_1}(t(q_2)^{-1})\,t(q_1)^{-1}\\
&= t(q_1)\,\omega^{(2)}_{q_1}(t(q_2))\,f^{(2)}(q_1,q_2)\,\omega^{(2)}_{q_1q_2}(n)\,f^{(2)}(q_1,q_2)^{-1}\,\omega^{(2)}_{q_1}(t(q_2)^{-1})\,t(q_1)^{-1}\\
&= \left(t(q_1)\,\omega^{(2)}_{q_1}(t(q_2))\,f^{(2)}(q_1,q_2)\,t(q_1q_2)^{-1}\right)\omega^{(1)}_{q_1q_2}(n)\left(t(q_1)\,\omega^{(2)}_{q_1}(t(q_2))\,f^{(2)}(q_1,q_2)\,t(q_1q_2)^{-1}\right)^{-1}\\
&= \hat f^{(2)}(q_1,q_2)\,\omega^{(1)}_{q_1q_2}(n)\,\hat f^{(2)}(q_1,q_2)^{-1}
\end{aligned} \tag{15.628}
$$
where we define
$$
\hat f^{(2)}(q_1,q_2) := t(q_1)\,\omega^{(2)}_{q_1}(t(q_2))\,f^{(2)}(q_1,q_2)\,t(q_1q_2)^{-1} \tag{15.629}
$$
On the other hand, we know that
$$
\omega^{(1)}_{q_1}\circ\omega^{(1)}_{q_2}(n) = f^{(1)}(q_1,q_2)\,\omega^{(1)}_{q_1q_2}(n)\,f^{(1)}(q_1,q_2)^{-1} \tag{15.630}
$$
Can we conclude that $\hat f^{(2)}(q_1,q_2) = f^{(1)}(q_1,q_2)$? Certainly not! Provided $\omega^{(1)}_{q_1q_2}(n)$ is
sufficiently generic all we can conclude is that
$$
\hat f^{(2)}(q_1,q_2) = f^{(1)}(q_1,q_2)\,\zeta(q_1,q_2) \tag{15.631}
$$

for some function ζ : Q × Q → Z(N ). These two functions are not necessarily related by a
coboundary and the extensions are not necessarily equivalent!
What is true is that if fˆ(2) and f (1) satisfy the twisted cocycle relation then ζ(q1 , q2 ) in
(15.631) also satisfies the twisted cocycle relation. (This requires a lot of patient algebra....)
It follows that
ζ ∈ Z 2+ω̄ (Q, Z(N )) (15.632)
Moreover, going the other way, given one extension and corresponding (ω (1) , f (1) ), and
a ζ ∈ Z 2+ω̄ (Q, Z(N )) we can change f as in (15.631). If [z] ∈ H 2+ω̄ (Q, Z(N )) is nontrivial
we will in general get a new, nonequivalent extension.
All this is summarized by the theorem:

Theorem: Let Extω̄ (Q, N ) be the set of inequivalent extensions of Q by N inducing ω̄.
Then either this set is empty or it is a torsor 219 for H 2+ω̄ (Q, Z(N )).
************************
219
A torsor X for a group G is a set X with a G-action on it so that given any pair x, x0 ∈ X there is a
unique g ∈ G that maps x to x0 . In this chapter we have discussed an important example of a torsor quite
extensively: Affine space Ad is a torsor for Rd with the natural action of Rd on Ad by translation.

NEED SOME EXAMPLES HERE. AND NEED SOME MORE INTERESTING EX-
ERCISES.
*************************

Exercise Checking the group laws


Show that (15.617) really defines a group structure.
a.) Check the associativity relation.
b.) What is the identity element? 220
c.) Check that every element has an inverse.

Exercise
a.) Check that (15.623) really does define a homomorphism of the group laws (15.617)
defined by (ω, f ) and (ω 0 , f 0 ) if (ω 0 , f 0 ) is related to (ω, f ) by (15.621) and (15.622).
b.) Check that the diagram (15.4) really does commute if we use (15.623).

15.8 Group cohomology in other degrees

Motivations:
a.) The word “cohomology” suggests some underlying chain complexes, so we will
show that there is such a formulation.
b.) There has been some discussion of higher degree group cohomology in physics in

1. The theory of anomalies (Faddeev-Shatashvili; Segal; Carey et. al.; Mathai et. al.;
... )

2. Classification of rational conformal field theories (Moore-Seiberg; Dijkgraaf-Vafa-


Verlinde-Verlinde; Dijkgraaf-Witten; Kapustin-Saulina)

3. Chern-Simons theory and topological field theory (Dijkgraaf-Witten,...)

4. Condensed matter/topological phases of matter (Kitaev; Wen et. al.; Kapustin et.
al.; Freed-Hopkins;....)

5. Three-dimensional supersymmetric gauge theory.

Here we will be brief and just give the basic definitions:


220
Answer : (f (1, 1)−1 , 1Q ).

15.8.1 Definition

Suppose we are given any group G and an Abelian group A (written additively in this
sub-section) and a homomorphism

ω : G → Aut(A) (15.633)

Definition: An n-cochain is a function φ : G×n → A. The space of n-cochains is


denoted C n (G, A). It is also useful to speak of 0-cochains. We interpret a 0-cochain φ0 to
be some element φ0 = a ∈ A.
Note that C n (G, A), for n ≥ 0, is an abelian group using the abelian group structure
of A on the values of φ, that is: (φ1 + φ2 )(~g ) := φ1 (~g ) + φ2 (~g ).
Define a group homomorphism $d : C^n(G,A) \to C^{n+1}(G,A)$ by
$$
\begin{aligned}
(d\phi)(g_1,\dots,g_{n+1}) := {}& \omega_{g_1}(\phi(g_2,\dots,g_{n+1}))\\
& - \phi(g_1g_2, g_3,\dots,g_{n+1}) + \phi(g_1, g_2g_3,\dots,g_{n+1}) \pm\cdots + (-1)^n \phi(g_1,\dots,g_{n-1}, g_ng_{n+1})\\
& + (-1)^{n+1}\phi(g_1,\dots,g_n)
\end{aligned} \tag{15.634}
$$

Then we have, for n = 0:


(dφ0 )(g) = ωg (a) − a (15.635)

For n = 1, n = 2 and n = 3 the formula written out looks like:

(dφ1 )(g1 , g2 ) = ωg1 (φ1 (g2 )) − φ1 (g1 g2 ) + φ1 (g1 ) (15.636)

(dφ2 )(g1 , g2 , g3 ) = ωg1 (φ2 (g2 , g3 )) − φ2 (g1 g2 , g3 ) + φ2 (g1 , g2 g3 ) − φ2 (g1 , g2 ) (15.637)

(dφ3 )(g1 , g2 , g3 , g4 ) = ωg1 (φ3 (g2 , g3 , g4 ))−φ3 (g1 g2 , g3 , g4 )+φ3 (g1 , g2 g3 , g4 )−φ3 (g1 , g2 , g3 g4 )+φ3 (g1 , g2 , g3 )
(15.638)
Next, one can check that for any φ, we have the absolutely essential equation:

d(dφ) = 0 (15.639)

We will give a simple proof of (15.639) below but let us just look at how it works for
the lowest degrees: If φ0 = a ∈ A is a 0-cochain then

(d2 φ0 )(g1 , g2 ) = ωg1 (dφ0 (g2 )) − dφ0 (g1 · g2 ) + dφ0 (g1 )


= ωg1 (ωg2 (a) − a) − (ωg1 g2 (a) − a) + (ωg1 (a) − a)
(15.640)
= ωg1 (ωg2 (a)) − ωg1 g2 (a)
=0

if φ1 is any 1-cochain then we compute:

(d2 φ1 )(g1 , g2 , g3 ) = ωg1 (dφ1 (g2 , g3 )) − (dφ1 )(g1 g2 , g3 ) + (dφ1 )(g1 , g2 g3 ) − (dφ1 )(g1 , g2 )
= ωg1 (ωg2 (φ1 (g3 )) − φ1 (g2 g3 ) + φ1 (g2 ))
− (ωg1 g2 (φ1 (g3 )) − φ1 (g1 g2 g3 ) + φ1 (g1 g2 ))
+ (ωg1 (φ1 (g2 g3 ) − φ1 (g1 g2 g3 ) + φ1 (g1 ))
− (ωg1 (φ1 (g2 )) − φ1 (g1 g2 ) + φ1 (g1 ))
=0
(15.641)

where you can check that all terms cancel in pairs, once you use ωg1 ◦ ωg2 = ωg1 g2 .
The set of (ω-twisted) n-cocycles is defined to be the subgroup Z n+ω (G, A) ⊂ C n (G, A)
of cochains that satisfy dφn = 0.
Thanks to (15.639) we can define a subgroup B n+ω (G, A) ⊂ Z n+ω (G, A), called the
subgroup of coboundaries:

B n+ω (G, A) := {φn |∃φn−1 s.t. dφn−1 = φn } (15.642)

then, since d2 = 0 we have B n+ω (G, A) ⊂ Z n+ω (G, A).


Then the group cohomology is defined to be the quotient

H n+ω (G, A) = Z n+ω (G, A)/B n+ω (G, A) (15.643)

Example: Let us take G = Z2 = {1, σ} and A = Z. Recall that

$$
\mathrm{Aut}(A) = \mathrm{Aut}(\mathbb{Z}) = \{\mathrm{Id}_{\mathbb{Z}}, \mathcal{P}\} \cong \mathbb{Z}_2 \tag{15.644}
$$
where P is the automorphism that takes P : n → −n. Now Hom(G, Aut(Z)) ≅ Z2 . Of


course, ω1 = IdZ always and now we have two possibilities for ωσ . Either ωσ = IdZ in which
case we denote ω = T (“T ” for trivial) or ωσ = P which we will denote I. Let us compute
H 1+ω (Z2 , Z) for these two possibilities. First look at the subgroup of coboundaries. If
φ0 = n0 ∈ Z is some integer then

$$
(d\phi_0)(1) = 0, \qquad (d\phi_0)(\sigma) = \omega_\sigma(n_0) - n_0 = \begin{cases} 0 & \omega = T\\ -2n_0 & \omega = I \end{cases} \tag{15.645}
$$

Now consider the differential of a one-cochain:

$$
\begin{aligned}
(d\phi_1)(1,1) &= \omega_1(\phi_1(1)) - \phi_1(1) + \phi_1(1) = \phi_1(1)\\
(d\phi_1)(1,\sigma) &= \omega_1(\phi_1(\sigma)) - \phi_1(\sigma) + \phi_1(1) = \phi_1(1)\\
(d\phi_1)(\sigma,1) &= \omega_\sigma(\phi_1(1)) - \phi_1(\sigma) + \phi_1(\sigma) = \omega_\sigma(\phi_1(1))\\
(d\phi_1)(\sigma,\sigma) &= \omega_\sigma(\phi_1(\sigma)) - \phi_1(1) + \phi_1(\sigma)
\end{aligned} \tag{15.646}
$$
Now the cocycle condition implies φ1 (1) = 0, making the first three lines of (15.646) vanish.
Using this the fourth line becomes:
$$
(d\phi_1)(\sigma,\sigma) = \omega_\sigma(\phi_1(\sigma)) + \phi_1(\sigma) = \begin{cases} 2\phi_1(\sigma) & \omega = T\\ 0 & \omega = I \end{cases} \tag{15.647}
$$

Now, when is φ1 a cocycle? When ω = T is trivial then we must take φ1 (σ) = 0 and
hence φ1 = 0; moreover, there are no coboundaries. We find H 1+T (Z2 , Z) = 0 in this case,
reproducing the simple fact that there are no nontrivial group homomorphisms from Z2 to
Z.
On the other hand, when ω = I we can take φ1 (σ) = a to be any integer a ∈ Z.
The group of twisted cocycles is isomorphic to Z. However, now there are nontrivial
coboundaries, as we see from (15.645). We can shift a by any even integer a → a − 2n0 .
So
$$
H^{1+I}(\mathbb{Z}_2, \mathbb{Z}) \cong \mathbb{Z}_2 \tag{15.648}
$$
In addition to the interpretation in terms of splittings, this has a nice interpretation in
topology in terms of the unorientability of even-dimensional real projective spaces.
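The twisted differential (15.634) is straightforward to implement for a small group, and brute force then reproduces the computation just done. The following minimal sketch (my own illustration) treats G = Z_2 acting on A = Z by ±1, checks d² = 0 on sample cochains, and recovers the cocycle/coboundary structure behind (15.645)-(15.648).

```python
# Sketch: the twisted differential on group cochains for G = Z_2 = {0,1}, A = Z.
import itertools, random

G = [0, 1]

def omega(sign, g, a):          # sign = +1: trivial action "T"; sign = -1: inversion "I"
    return a if g == 0 else sign * a

def d(phi, n, sign):
    """Differential (15.634) of an n-cochain phi: G^n -> Z (phi is a dict on n-tuples)."""
    def dphi(gs):               # gs is a tuple of n+1 group elements
        total = omega(sign, gs[0], phi[gs[1:]])
        for i in range(1, n + 1):
            merged = gs[:i - 1] + ((gs[i - 1] + gs[i]) % 2,) + gs[i + 1:]
            total += (-1) ** i * phi[merged]
        total += (-1) ** (n + 1) * phi[gs[:-1]]
        return total
    return {gs: dphi(gs) for gs in itertools.product(G, repeat=n + 1)}

for sign in (+1, -1):
    # d(d(phi)) = 0 on random 0- and 1-cochains
    phi0 = {(): random.randint(-5, 5)}
    phi1 = {(g,): random.randint(-5, 5) for g in G}
    assert all(v == 0 for v in d(d(phi0, 0, sign), 1, sign).values())
    assert all(v == 0 for v in d(d(phi1, 1, sign), 2, sign).values())

# with the nontrivial action, a 1-cochain is a cocycle iff phi(1) = 0 (any phi(sigma)),
# and coboundaries shift phi(sigma) by even integers, giving H^{1+I}(Z_2, Z) = Z_2
for a in range(-4, 5):
    assert all(v == 0 for v in d({(0,): 0, (1,): a}, 1, -1).values())
for n0 in range(-4, 5):
    assert d({(): n0}, 0, -1)[(1,)] == -2 * n0        # matches (15.645)
print("d^2 = 0 and the cocycle/coboundary structure of (15.645)-(15.648) is reproduced")
```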
Remarks:

1. Previously we were denoting the cohomology groups by H n+ω (G, A). In the equations
above the ω is still present, (see the first term in the definition of dφ) but we leave
the ω implicit in the notation. Nevertheless, we are talking about the same groups
as before, but now generalizing to arbitrary degree n.

2. Remembering that we are now writing our abelian group A additively, we see that
the equation (dφ2 ) = 0 is just the twisted 2-cocycle conditions, and φ02 = φ2 + dφ1
are two different twisted cocycles related by a coboundary. See equations (15.538)
and (15.540) above. Roughly speaking, you should “take the logarithm” of these
equations.

3. Homological Algebra: What we are discussing here is a special case of a topic known
as homological algebra. Quite generally, a chain complex is a sequence of Abelian
groups {Cn }n∈Z equipped with group homomorphisms

∂n : Cn → Cn−1 (15.649)

such that ∂n ◦ ∂n+1 = 0 for all n ∈ Z. A cochain complex is similarly a sequence


of Abelian groups {C n }n∈Z with group homomorphisms dn : C n → C n+1 so that
dn+1 ◦ dn = 0 for all n ∈ Z. Note that these are NOT exact sequences. Indeed
the failure to be an exact sequence is measured by the homology groups of the chain
complex
Hn (C∗ , ∂∗ ) := ker(∂n )/im(∂n+1 ) (15.650)
and the cohomology groups of the cochain complex:

H n (C ∗ , d∗ ) := ker(dn )/im(dn−1 ) (15.651)

4. Homogeneous cocycles: A nice way to prove that d2 = 0 is the following. We define
homogeneous n-cochains to be maps ϕ : Gn+1 → A which satisfy

ϕ(hg0 , hg1 , . . . , hgn ) = ωh (ϕ(g0 , g1 , . . . , gn )) (15.652)

Let $\mathcal{C}^n(G,A)$ denote the abelian group of such homogeneous group cochains. (Warning!
Elements of $\mathcal{C}^n(G,A)$ have $(n+1)$ arguments!) Define
$$
\delta : \mathcal{C}^n(G,A) \to \mathcal{C}^{n+1}(G,A) \tag{15.653}
$$
by
$$
\delta\varphi(g_0,\dots,g_{n+1}) := \sum_{i=0}^{n+1} (-1)^i\,\varphi(g_0,\dots,\widehat{g_i},\dots,g_{n+1}) \tag{15.654}
$$

where gbi means the argument is omitted. Clearly, if ϕ is homogeneous then δϕ is


also homogeneous. It is then very straightforward to prove that δ 2 = 0. Indeed, if
$\varphi \in \mathcal{C}^{n-1}(G,A)$ we compute:
$$
\begin{aligned}
\delta^2\varphi(g_0,\dots,g_{n+1}) &= \sum_{i=0}^{n+1}(-1)^i\left\{\sum_{j=0}^{i-1}(-1)^j\,\varphi(g_0,\dots,\widehat{g_j},\dots,\widehat{g_i},\dots,g_{n+1}) - \sum_{j=i+1}^{n+1}(-1)^j\,\varphi(g_0,\dots,\widehat{g_i},\dots,\widehat{g_j},\dots,g_{n+1})\right\}\\
&= \sum_{0\le j<i\le n+1}(-1)^{i+j}\,\varphi(g_0,\dots,\widehat{g_j},\dots,\widehat{g_i},\dots,g_{n+1}) - \sum_{0\le i<j\le n+1}(-1)^{i+j}\,\varphi(g_0,\dots,\widehat{g_i},\dots,\widehat{g_j},\dots,g_{n+1})\\
&= 0
\end{aligned} \tag{15.655}
$$

Now, we can define an isomorphism $\psi : \mathcal{C}^n(G,A) \to C^n(G,A)$ by defining
$$
\phi_n(g_1,\dots,g_n) := \varphi_n(1, g_1, g_1g_2,\dots, g_1\cdots g_n) \tag{15.656}
$$
That is, when φn and ϕn are related this way we say φn = ψ(ϕn ). Now one can check
that the simple formula (15.654) becomes the more complicated formula (15.634).
Put more formally: there is a unique d so that dψ = ψδ, or even more formally, there
is a unique group homomorphism d such that we have a commutative diagram:
$$
\begin{array}{ccc}
\mathcal{C}^n(G,A) & \xrightarrow{\ \delta\ } & \mathcal{C}^{n+1}(G,A)\\
\downarrow \psi & & \downarrow \psi\\
C^n(G,A) & \xrightarrow{\ d\ } & C^{n+1}(G,A)
\end{array} \tag{15.657}
$$

For example, if
φ1 (g) = ψ(ϕ1 )(g) = ϕ1 (1, g) (15.658)

then we can check that

(dφ1 )(g1 , g2 ) = d(ψ(ϕ1 ))(g1 , g2 )


= ψ(δϕ1 )(g1 , g2 )
= δϕ1 (1, g1 , g1 g2 )
(15.659)
= ϕ1 (g1 , g1 g2 ) − ϕ1 (1, g1 g2 ) + ϕ1 (1, g1 )
= ωg1 (ϕ1 (1, g2 )) − ϕ1 (1, g1 g2 ) + ϕ1 (1, g1 )
= ωg1 (φ1 (g2 )) − φ1 (g1 g2 ) + φ1 (g1 )

in accord with the previous definition!

5. Where do all these crazy formulae come from? The answer is in topology. We will
indicate it briefly in our discussion of categories and groupoids below.

6. The reader will probably find these formulae a bit opaque. It is therefore good to
stop and think about what the cohomology is measuring, at least in low degrees.

Exercise
Derive the formula for the differential on an inhomogeneous cochain dφ2 starting with
the definition on the analogous homogeneous cochain ϕ3

Exercise
If (Cn , ∂n ) is a chain complex show that one can define a cochain complex with groups:

C n := Hom(Cn , Z) (15.660)

15.8.2 Interpreting the meaning of H 0+ω


A zero-cocycle is an element a ∈ A so that for all g

ωg (a) = a (15.661)

There are no coboundaries to worry about, so H 0 (G, A) is just the set of fixed points of
the G action on A.

15.8.3 Interpreting the meaning of H 1+ω


We have interpreted H 1+ω (G, A) above as the set of nontrivial splittings of the semidirect
product defined by ω:
$$
0 \to A \to A \rtimes G \to G \to 1 \tag{15.662}
$$

15.8.4 Interpreting the meaning of H 2+ω
Again, we have interpreted H 2+ω (G, A) as Extω (G, A), the set of equivalence classes of
extensions
0 → A → G̃ → G → 1 (15.663)
inducing a fixed ω : G → Aut(A). The trivial element of the cohomology group corre-
sponds to the semi-direct product and the set of inequivalent trivializations is the group
H 1+ω (G, A) of splittings of the semi-direct product.
More generally, Extω̄ (Q, N ) is a torsor for H 2+ω̄ (Q, Z(N )).

15.8.5 Interpreting the meaning of H 3


To see one interpretation of H 3 in terms of extension theory let us return to the analysis
of general extensions in §15.7.
Recall that, as we have discussed using (15.625), a general extension (15.1) has a
canonically associated homomorphism

ω̄ : Q → Out(N ) (15.664)

where Out(N ) is the group of outer automorphisms of N .


The natural question arises: Given a homomorphism ω̄ as in (15.664) is there a cor-
responding extension of Q by N inducing ω̄ as in equation (15.625) ?
To answer this question we could proceed by choosing for each q ∈ Q an automorphism
ξq ∈ Aut(N ) such that [ξq ] = ω̄q in Out(N ). To do this, choose a section s of π : Aut(N ) →
Out(N ) and let ξq := s(ω̄q ). If we cannot split the sequence

1 → Inn(N ) → Aut(N ) → Out(N ) → 1 (15.665)

then q 7→ ξq will not be a group homomorphism. But we do know that for all q1 , q2 ∈ Q

$$
\xi_{q_1}\circ\xi_{q_2}\circ\xi_{q_1q_2}^{-1} \in \mathrm{Inn}(N) \tag{15.666}
$$
Therefore, for every q1 , q2 we may choose an element f (q1 , q2 ) ∈ N so that
$$
\xi_{q_1}\circ\xi_{q_2}\circ\xi_{q_1q_2}^{-1} = I(f(q_1,q_2)) \tag{15.667}
$$
i.e.
$$
\xi_{q_1}\circ\xi_{q_2} = I(f(q_1,q_2))\circ\xi_{q_1q_2} \tag{15.668}
$$
Of course, the choice of f (q1 , q2 ) is ambiguous by an element of Z(N )!
Equation (15.668) is of course just (15.615) written in slightly different notation.
Therefore, as we saw in §15.7, if f (q1 , q2 ) were to satisfy the “twisted cocycle con-
dition” (15.616) then we could use (15.617) to define an extension inducing ω̄.
Therefore, let us check if some choice of f (q1 , q2 ) actually does satisfy the twisted
cocycle condition (15.616). Looking at the RHS of (15.616) we compute:

$$
\begin{aligned}
I(f(q_1,q_2)f(q_1q_2,q_3)) &= I(f(q_1,q_2))\,I(f(q_1q_2,q_3))\\
&= \left(\xi_{q_1}\circ\xi_{q_2}\circ\xi_{q_1q_2}^{-1}\right)\circ\left(\xi_{q_1q_2}\circ\xi_{q_3}\circ\xi_{q_1q_2q_3}^{-1}\right)\\
&= \xi_{q_1}\circ\xi_{q_2}\circ\xi_{q_3}\circ\xi_{q_1q_2q_3}^{-1}
\end{aligned} \tag{15.669}
$$
On the other hand, looking at the LHS of (15.616) we compute:
$$
\begin{aligned}
I(\xi_{q_1}(f(q_2,q_3))\,f(q_1,q_2q_3)) &= I(\xi_{q_1}(f(q_2,q_3)))\,I(f(q_1,q_2q_3))\\
&= \xi_{q_1}\circ I(f(q_2,q_3))\circ\xi_{q_1}^{-1}\circ I(f(q_1,q_2q_3))\\
&= \xi_{q_1}\circ\left(\xi_{q_2}\circ\xi_{q_3}\circ\xi_{q_2q_3}^{-1}\right)\circ\xi_{q_1}^{-1}\circ\left(\xi_{q_1}\circ\xi_{q_2q_3}\circ\xi_{q_1q_2q_3}^{-1}\right)\\
&= \xi_{q_1}\circ\xi_{q_2}\circ\xi_{q_3}\circ\xi_{q_1q_2q_3}^{-1}
\end{aligned} \tag{15.670}
$$

Therefore, comparing (15.669) and (15.670) we conclude that

I(ξq1 (f (q2 , q3 ))f (q1 , q2 q3 )) = I(f (q1 , q2 )f (q1 q2 , q3 )) (15.671)

We cannot conclude that f satisfies the twisted cocycle equation from this identity because
inner transformations are trivial for elements in the center Z(N ). Rather, what we can
conclude is that for every q1 , q2 , q3 there is an element z(q1 , q2 , q3 ) ∈ Z(N ) such that

f (q1 , q2 )f (q1 q2 , q3 ) = z(q1 , q2 , q3 )ξq1 (f (q2 , q3 ))f (q1 , q2 q3 ) (15.672)

Now, one can check (with a lot of algebra) that

1. z is a cocycle in Z 3+ω̄ (Q, Z(N )). (We are using Aut(Z(N )) ≅ Out(Z(N )).)

2. Changes in choices of ξq and f (q1 , q2 ) lead to changes in z by a coboundary.

and therefore we conclude:


Theorem 15.8.5.1 : Given ω̄ : Q → Out(N ) there exists an extension of Q by N iff
the cohomology class [z] ∈ H 3 (Q, Z(N )) vanishes.
Moreover, as we have seen, if [z] = 0 then the trivializations of z are in 1-1 correspon-
dence with elements H 2 (Q, Z(N )) and are hence in 1-1 correspondence with isomorphism
classes of extensions of Q by N . This is the analogue, one step up in degree, of our
interpretation of H 1 (G, A).

Examples: As an example 221 where a degree three cohomology class obstructs the exis-
tence of an extension inducing a homomorphism ω̄ : Q → Out(N ) we can take N to be the
generalized quaternion group of order 16. It is generated by x and y satisfying:

$$
x^4 = y^2, \qquad x^8 = 1, \qquad y x y^{-1} = x^{-1} \tag{15.673}
$$

Using these relations every word in x±1 and y ±1 can be reduced to either xm , or yxm , with
m = 0, . . . , 7, and these words are all different. One can show the outer automorphism
group Out(N ) ∼ = Z2 × Z2 with generators α, β acting by

α(x) = x3 , α(y) = y β(x) = x, β(y) = yx (15.674)

Then there is no group extension with group G fitting in

1 → N → G → Z2 → 1 (15.675)
221
I learned these nice examples from Clay Cordova. They will appear in a forthcoming paper with
Po-Shen Hsin and Francesco Benini.

inducing the homomorphism ω̄ : Z2 → Out(N ) defined by ω̄(σ) = α ◦ β where σ is the
nontrivial element of Z2 . One way to prove this is to look up the list of groups of order 32
and search for those with maximal normal subgroup given by N . 222 There are five such.
Then one computes ω̄ for each such extension and finds that it is never of the above type.
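The ingredients of this example can at least be verified by machine. The sketch below (my own illustration) realizes the order-16 group as pairs (m, e) standing for x^m y^e, checks the relations (15.673), and confirms that α and β of (15.674) extend to automorphisms; the statements about Out(N) and the non-existence of the extension are taken from the text and its references, not re-derived here.

```python
# Sketch: the generalized quaternion group of order 16 as words x^m y^e, m in Z_8, e in Z_2,
# with y x y^{-1} = x^{-1} and y^2 = x^4; alpha, beta of (15.674) are automorphisms.
from itertools import product

def mult(p, q):
    (m1, e1), (m2, e2) = p, q
    m = (m1 + (m2 if e1 == 0 else -m2)) % 8          # move x^{m2} past y^{e1}
    if e1 == 1 and e2 == 1:
        m = (m + 4) % 8                               # y^2 = x^4
    return (m, (e1 + e2) % 2)

G = [(m, e) for m in range(8) for e in range(2)]
x, y, one = (1, 0), (0, 1), (0, 0)

def power(g, k):
    r = one
    for _ in range(k):
        r = mult(r, g)
    return r

assert power(x, 4) == mult(y, y) and power(x, 8) == one          # relations (15.673)
assert mult(mult(y, x), power(y, 3)) == power(x, 7)              # y x y^{-1} = x^{-1}

def extend(img_x, img_y):
    """Extend a choice of images of x and y to a map on all of G."""
    return {(m, e): mult(power(img_x, m), power(img_y, e)) for m, e in G}

alpha = extend(power(x, 3), y)        # alpha: x -> x^3, y -> y
beta  = extend(x, mult(y, x))         # beta:  x -> x,   y -> y x

for phi in (alpha, beta):
    assert len(set(phi.values())) == 16                          # bijective
    assert all(phi[mult(p, q)] == mult(phi[p], phi[q]) for p, q in product(G, G))
print("relations hold and alpha, beta are automorphisms of the order-16 group")
```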
A similar example can be constructed by taking N to be a dihedral group of order 16. ♣Really should explain four-term sequences and crossed modules here.... ♣
Remark: There is an interpretation of H 3 (Q, Z(N )) as a classification of four-term exact sequences, and there are generalizations of this to higher degree. See

1. K. Brown, Group Cohomology.


2. C. A. Weibel, An introduction to homological algebra, chapter 6

15.9 Some references

Some online sources with links to further material are


1. http://en.wikipedia.org/wiki/Group-extension
2. http://ncatlab.org/nlab/show/group+extension
3. http://terrytao.wordpress.com/2010/01/23/some-notes-on-group-extensions/
4. Section 15.8.5, known as the Artin-Schreier theory, is based on a nice little note by
P.J. Morandi,
http://sierra.nmsu.edu/morandi/notes/GroupExtensions.pdf
5. Jungmann, Notes on Group Theory
6. S. MacLane, “Topology And Logic As A Source Of Algebra,” Bull. Amer. Math.
Soc. 82 (1976), 1-4.
Textbooks:
1. K. Brown, Group Cohomology
2. Karpilovsky, The Schur Multiplier
3. C. A. Weibel, An introduction to homological algebra, chapter 6

16. Overview of general classification theorems for finite groups

In general if a mathematical object proves to be useful then there is always an associated


important problem, namely the classification of these objects.
For example, with groups we can divide them into classes: finite and infinite, abelian
and nonabelian, producing a four-fold classification:

Finite abelian Finite nonabelian


Infinite abelian Infinite nonabelian

222
See, for example, B. Shuster, “Morava K-theory of groups of order 32,” Algebr. Geom. Topol. 11
(2011) 503-521.

But this is too rough: it does not give us a good feeling for what the examples really
are.
Once we have a “good” criterion we often can make a nontrivial statement about the
general structure of objects in a given class. Ideally, we should be able to construct all
the examples algorithmically, and be able to distinguish the ones which are not isomor-
phic. Of course, finding such a “good” criterion is an art. For example, classification of
infinite nonabelian groups is completely out of the question. But in Chapter *** we will
see that an important class of infinite nonabelian groups, the simply connected compact
simple Lie groups, have a very beautiful classification: There are four infinite sequences
of classical matrix groups, SU(n), Spin(2n+1), USp(2n), and Spin(2n), and then five
exceptional cases with names G2, F4, E6, E7, E8. 223
One might well ask: Can we classify finite groups? In this section we survey a little of
what is known about this problem.

16.1 Brute force


If we just start listing groups of low order we soon start to appreciate what a jungle is out
there.
But let us try, if only as an exercise in applying what we have learned so far. First,
let us note that for groups of order p where p is prime we automatically have the unique
possibility of the cyclic group Z/pZ. Similarly, for groups of order p2 there are precisely
two possibilities: Z/p2 Z and Z/pZ × Z/pZ. This gets us through many of the low order
cases.
Given this remark the first nontrivial order to work with is |G| = 6. By Cauchy’s
theorem there are elements of order 2 and 3. Call them b, with b^2 = 1, and a, with a^3 = 1.
Then (bab)^3 = 1, so either

1. bab = a, which implies ab = ba, which implies G = Z2 × Z3 = Z6, or

2. bab = a^{-1}, which implies G = D3.
This is the first place we meet a nonabelian group. It is the dihedral group, the first
of the series we saw before

D_n = ⟨a, b | a^n = b^2 = 1, bab = a^{-1}⟩    (16.1)

and has order 2n. There is a special isomorphism D3 ≅ S3 with the symmetric group on three letters.
The next nontrivial case is |G| = 8. Here we can invoke Sylow’s theorem: If p^k divides |G|
then G has a subgroup of order p^k. Let us apply this to 4 dividing |G|. Such a subgroup
has index two and hence must be a normal subgroup, and hence fits in a sequence

1 → N → G → Z2 → 1 (16.2)

Now, N is of order 4 so we know that N ≅ Z2 × Z2 or N ≅ Z4. If we have

1 → Z4 → G → Z2 → 1 (16.3)
223
Spin(n) double covers the classical matrix group SO(n).

then we have α : Z2 → Aut(Z4) ≅ Z2 and there are exactly two such homomorphisms.
Moreover, for a fixed α there are two possibilities for the square σ̃^2 ∈ Z4 where σ̃ is a lift
of the generator of Z2. Altogether this gives four possibilities: ♣Need to explain more here. ♣

1 → Z4 → Z2 × Z4 → Z2 → 1 (16.4)

1 → Z4 → Z8 → Z2 → 1 (16.5)

1 → Z4 → D4 → Z2 → 1 (16.6)

1 → Z4 → D̃2 → Z2 → 1    (16.7)

Here we meet the first of the series of dicyclic or binary dihedral groups defined by

D̃_n := ⟨a, b | a^{2n} = 1, a^n = b^2, b^{-1} a b = a^{-1}⟩    (16.8)

It has order 4n. There is a special isomorphism of D̃2 with the quaternion group.
The other possibility for N is Z2 × Z2 and here one new group is found, namely
Z2 × Z2 × Z2 .
Thus there are 5 inequivalent groups of order 8.
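The special isomorphism of D̃2 with the quaternion group can be made concrete. The following sketch uses the standard 2 × 2 complex matrices for the quaternion units i and j (an illustrative choice, not something fixed by the text) to verify the n = 2 relations of (16.8) and to confirm that they generate a group of exactly 8 elements.

```python
# A concrete matrix realization of the order-8 dicyclic (quaternion) group.
import numpy as np

a = np.array([[1j, 0], [0, -1j]])                 # "i"
b = np.array([[0, 1], [-1, 0]], dtype=complex)    # "j"
I2 = np.eye(2, dtype=complex)

assert np.allclose(np.linalg.matrix_power(a, 4), I2)              # a^4 = 1
assert np.allclose(np.linalg.matrix_power(a, 2), b @ b)           # a^2 = b^2
assert np.allclose(np.linalg.inv(b) @ a @ b, np.linalg.inv(a))    # b^{-1} a b = a^{-1}

def key(m):
    """Hashable fingerprint of a matrix (entries rounded to kill float noise)."""
    return tuple(np.round(m, 8).flatten())

# enumerate the group generated by a and b
group = {key(I2): I2}
frontier = [I2]
while frontier:
    new = []
    for g in frontier:
        for h in (a, b):
            p = g @ h
            if key(p) not in group:
                group[key(p)] = p
                new.append(p)
    frontier = new

print(len(group))   # 8, as expected for the quaternion group
```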
The next few cases are trivial until we get to |G| = 12. By Cauchy’s theorem there are
subgroups isomorphic to Z2 , so we can view G as an extension of D3 or Z6 by Z2 . There
is also a subgroup isomorphic to Z3 so we can view it as an extension of an order 4 group
by an order 3 group. We skip the analysis and just present the 5 distinct order 12 groups.
In this way we find the groups forming the pattern at lower order: ♣Check this reasoning is correct. You need to know the subgroups are normal to say there is an extension. ♣

Z12,   Z2 × Z6,   D6,   D̃3    (16.9)

And we find one “new” group: A4 ⊂ S4.


We can easily continue the table until we get to order |G| = 16. At order 16 there are
14 inequivalent groups! So we will stop here. 224

224
See, however, M. Wild, “Groups of order 16 made easy,” American Mathematical Monthly, Jan 2005

Order | Presentation | Name
1  | ⟨a | a = 1⟩ | Trivial group
2  | ⟨a | a^2 = 1⟩ | Cyclic Z/2Z
3  | ⟨a | a^3 = 1⟩ | Cyclic Z/3Z
4  | ⟨a | a^4 = 1⟩ | Cyclic Z/4Z
4  | ⟨a, b | a^2 = b^2 = (ab)^2 = 1⟩ | Dihedral D2 ≅ Z/2Z × Z/2Z, Klein
5  | ⟨a | a^5 = 1⟩ | Cyclic Z/5Z
6  | ⟨a, b | a^3 = 1, b^2 = 1, bab = a⟩ | Cyclic Z/6Z ≅ Z/2Z × Z/3Z
6  | ⟨a, b | a^3 = 1, b^2 = 1, bab = a^{-1}⟩ | Dihedral D3 ≅ S3
7  | ⟨a | a^7 = 1⟩ | Cyclic Z/7Z
8  | ⟨a | a^8 = 1⟩ | Cyclic Z/8Z
8  | ⟨a, b | a^2 = 1, b^4 = 1, aba = b⟩ | Z/2Z × Z/4Z
8  | ⟨a, b, c | a^2 = b^2 = c^2 = 1, [a, b] = [a, c] = [b, c] = 1⟩ | Z/2Z × Z/2Z × Z/2Z
8  | ⟨a, b | a^4 = 1, b^2 = 1, bab = a^{-1}⟩ | Dihedral D4
8  | ⟨a, b | a^4 = 1, a^2 = b^2, b^{-1}ab = a^{-1}⟩ | Dicyclic D̃2 ≅ Q, quaternion
9  | ⟨a | a^9 = 1⟩ | Cyclic Z/9Z
9  | ⟨a, b | a^3 = b^3 = 1, [a, b] = 1⟩ | Z/3Z × Z/3Z
10 | ⟨a | a^{10} = 1⟩ | Cyclic Z/10Z ≅ Z/2Z × Z/5Z
10 | ⟨a, b | a^5 = b^2 = 1, bab = a^{-1}⟩ | Dihedral D5
11 | ⟨a | a^{11} = 1⟩ | Cyclic Z/11Z
12 | ⟨a | a^{12} = 1⟩ | Cyclic Z/12Z ≅ Z/4Z × Z/3Z
12 | ⟨a, b | a^2 = 1, b^6 = 1, [a, b] = 1⟩ | Z/2Z × Z/6Z
12 | ⟨a, b | a^6 = 1, b^2 = 1, bab = a^{-1}⟩ | Dihedral D6
12 | ⟨a, b | a^6 = 1, a^3 = b^2, b^{-1}ab = a^{-1}⟩ | Dicyclic D̃3
12 | ⟨a, b | a^3 = 1, b^2 = 1, (ab)^3 = 1⟩ | Alternating A4
13 | ⟨a | a^{13} = 1⟩ | Cyclic Z/13Z
14 | ⟨a | a^{14} = 1⟩ | Cyclic Z/14Z ≅ Z/2Z × Z/7Z
14 | ⟨a, b | a^7 = 1, b^2 = 1, bab = a^{-1}⟩ | Dihedral D7
15 | ⟨a | a^{15} = 1⟩ | Cyclic Z/15Z ≅ Z/3Z × Z/5Z

Remarks:

1. Explicit tabulation of the isomorphism classes of groups was initiated by Otto
Hölder, who completed a table for |G| ≤ 200 about 100 years ago. Since then there
has been much effort in extending those results. For surveys see

1. J.A. Gallian, “The search for finite simple groups,” Mathematics Magazine, vol.
49 (1976) p. 149. (This paper is a bit dated.)

Figure 41: A plot of the number of nonisomorphic groups of order n. This plot was taken from
the book by D. Joyner, Adventures in Group Theory.

2. H.U. Besche, B. Eick, E.A. O’Brien, “A millennium project: Constructing Groups of Small Order,”

2. There are also nice tables of groups of low order, in Joyner, Adventures in Group
Theory, pp. 168-172, and Karpilovsky, The Schur Multiplier which go beyond the
above table.

3. There are also online resources:


1. http://www.gap-system.org/ for GAP
2. http://hobbes.la.asu.edu/groups/groups.html for groups of low order.
3. http://www.bluetulip.org/programs/finitegroups.html
4. http://en.wikipedia.org/wiki/List-of-small-groups

4. The number of isomorphism types of groups jumps wildly. Apparently, there are
49,487,365,422 isomorphism types of groups of order 2^{10} = 1024. (Besche et al.
loc. cit.) The remarkable plot of Figure 41 from Joyner’s book shows a plot of the
number of isomorphism classes vs. order up to order 100. Figure 42 shows a log plot
of the number of groups up to order 2000.

5. There is, however, a formula giving the asymptotics of the number f(n, p) of isomorphism
classes of groups of order p^n for n → ∞ for a fixed prime p. (Of course, there
are p(n) Abelian groups, where p(n) is the number of partitions of n. Here we are

Figure 42: A logarithmic plot of the number of nonisomorphic groups of order n out to n ≤ 2000.
This plot was taken from online encyclopedia of integer sequences, OEIS.

talking about the number of all groups.) This is due to G. Higman 225 and C. Sims
226 and the result states that:

f(n, p) ∼ p^{(2/27) n^3}    (16.10)

Note that the asymptotics we derived for p(n) before had a growth like e^{const · n^{1/2}}, so,
unsurprisingly, most of the groups are nonabelian.

Exercise Relating the binary dihedral and dihedral groups

225
G. Higman, “Enumerating p-Groups,” Proc. London Math. Soc. (3) 10 (1960)
226
C. Sims, “Enumerating p-Groups,” Proc. London Math. Soc. (3) 15 (1965) 151-166

Show that D̃_n is a double cover of D_n which fits into the commutative diagram of exact sequences:

            Z2            Z2
            |             |
            v             v
  1 ---> Z_{2n} ---> D̃_n ---> Z2 ---> 1        (16.11)
            |             |
            v             v
  1 --->  Z_n   --->  D_n ---> Z2 ---> 1

16.2 Finite Abelian Groups


The upper left box of our rough classification can be dealt with thoroughly, and the result
is extremely beautiful.
In this subsection we will write our abelian groups additively.
Recall that we have shown that if p and q are positive integers then

0 → Z/gcd(p, q)Z → Z/pZ ⊕ Z/qZ → Z/lcm(p, q)Z → 0 (16.12)

and in particular, if p, q are relatively prime then

Z/pZ ⊕ Z/qZ ≅ Z/pqZ.    (16.13)

It thus follows that if n has prime decomposition

n = ∏_i p_i^{e_i}    (16.14)

then

Z/nZ ≅ ⊕_i Z/p_i^{e_i} Z    (16.15)

This decomposition has a beautiful generalization to an arbitrary finite abelian group:

Kronecker Structure Theorem. Any finite abelian group is a direct product of cyclic
groups of order a prime power. That is, we firstly have the decomposition:

G = G_2 ⊕ G_3 ⊕ G_5 ⊕ G_7 ⊕ · · · = ⊕_{p prime} G_p    (16.16)

where G_p has order p^n for some n ≥ 0 (n can depend on p, and for all but finitely many p,
G_p = {0}). And, secondly, each nonzero factor G_p can be written:

G_p = ⊕_i Z/(p^{n_i} Z)    (16.17)

for some finite collection of positive integers n_i (depending on p).

Proof : The proof proceeds in two parts. The first, easy, part shows that we can split G
into a direct sum of “p-groups” (defined below). The second, harder, part shows that an
arbitrary abelian p-group is a direct sum of cyclic groups.
For part 1 of the proof let us consider an arbitrary finite abelian group G. We will
write the group multiplication additively. Suppose n is an integer so that ng = 0 for all
g ∈ G. To fix ideas let us take n = |G|. Suppose n = m1 m2 where m1 , m2 are relatively
prime integers. Then there are integers s1 , s2 so that

s1 m1 + s2 m2 = 1 (16.18)

Therefore any element g can be written as

g = s1 (m1 g) + s2 (m2 g) (16.19)


Now m1 G and m2 G are subgroups and we claim that m1 G ∩ m2 G = {0}. If a ∈
m1 G ∩ m2 G then m1 a = 0 and m2 a = 0 and hence (16.19) implies a = 0. Thus,

G = m1 G ⊕ m2 G (16.20)

Moreover, we claim that m1 G = {g ∈ G|m2 g = 0}. It is clear that every element in m1 G


is killed by m2 . Suppose on the other hand that m2 g = 0. Again applying (16.19) we see
that g = s1 m1 g = m1 (s1 g) ∈ m1 G.
Thus, we can decompose
G = ⊕p primeGp (16.21)
where Gp is the subgroup of G of elements whose order is a power of p.
If p is a prime number then a p-group is a group all of whose elements have order a
power of p. Now for part 2 of the proof we show that any abelian p-group is a direct sum
of the form (16.17). The proof of this statement proceeds by induction and is based on a
systematic application of Cauchy’s theorem: If p divides |G| then there is an element of G
of order precisely p. (Recall we proved this theorem in Section 9.)
Now, note that any p-group G has an order which is a power pn for some n. If not,
then |G| = pn m where m is relatively prime to p. But then - by Cauchy’s theorem - there
would have to be an element of G whose order is a prime divisor of m.
Next we claim that if an abelian p-group has a unique subgroup H of order p then G
itself is cyclic.
To prove this we again proceed by induction on |G|. Consider the subgroup defined
by:
H = {g|pg = 0} (16.22)
From Cauchy’s theorem we see that H cannot be the trivial group, and hence this must
be the unique subgroup of order p. On the other hand, H is manifestly the kernel of the
homomorphism φ : G → G given by φ(g) = pg. Again by Cauchy, φ(G) has a subgroup
of order p, but this must also be a subgroup of G, which contains φ(G), and hence φ(G)
has a unique subgroup of order p. By the induction hypothesis, φ(G) is cyclic. But now
φ(G) ∼= G/H, so let g0 + H be a generator of the cyclic group G/H. Next we claim

that H ⊂ ⟨g0⟩. Since G is a p-group the subgroup ⟨g0⟩ is a p-group and hence contains a
subgroup of order p (by Cauchy), but (by hypothesis) there is a unique such subgroup in
G, and any subgroup of ⟨g0⟩ is a subgroup of G, so H ⊂ ⟨g0⟩. But now take any element
g ∈ G. On the one hand it must project to an element [g] ∈ G/H. This must be of the
form [g] = kg0 + H, since g0 + H generates G/H. That means g = kg0 + h, h ∈ H, but
since H ⊂ ⟨g0⟩ we must have h = ℓg0 for some integer ℓ. Therefore G = ⟨g0⟩ is cyclic.
The final step proceeds by showing that if G is a finite abelian p-group and M is a
cyclic subgroup of maximal order then G = M ⊕ N for some subgroup N . Once we have
established this the desired result follows by induction.
So, now suppose that G has a cyclic subgroup of maximal order M. If G is cyclic
then N = {0}. If G is not cyclic then we just proved that there must be at least two
distinct subgroups of order p. One of them is in M . Choose another one, say K. Note that
K must not be in M , because M is cyclic and has a unique subgroup of order p. Therefore
K ∩ M = {0}. Therefore (M + K)/K ∼ = M . Therefore (M + K)/K is a cyclic subgroup
of G/K. Any element g + K has an order which divides |g|, and |g| ≤ |M | since M is a
maximal cyclic subgroup. Therefore the cyclic subgroup (M + K)/K is a maximal order
cyclic subgroup of G/K. Now the inductive hypothesis implies G/K = (M + K)/K ⊕ H/K
for some subgroup K ⊂ H ⊂ G. But this means (M +K)∩H = K and hence M ∩H = {0}
and hence G = M ⊕ H. ♠
For other proofs see
1. S. Lang, Algebra, ch. 1, sec. 10.
2. I.N. Herstein, Ch. 2, sec. 14.
3. J. Stillwell, Classical Topology and Combinatorial Group Theory.
4. Our proof is based on G. Navarro, “On the fundamental theorem of finite abelian
groups,” Amer. Math. Monthly, Feb. 2003, vol. 110, p. 153.

One class of examples where we have a finite Abelian group, but its Kronecker decomposition
is far from obvious, is the following: Consider the Abelian group Z^d. Choose
a set of d vectors v_i ∈ Z^d, linearly independent as vectors in R^d.

L := { Σ_{i=1}^{d} n_i v_i | n_i ∈ Z }    (16.23)

is a subgroup. Then

A = Z^d / L    (16.24)

is a finite Abelian group. For example, if v_i = k e_i, where e_i is the standard unit vector
in the i-th direction, then obviously A ≅ (Z/kZ)^d. But for a general set of vectors the
decomposition is not obvious.
So, here is an algorithm for giving the Kronecker decomposition of a finite Abelian
group:

1. Compute the orders of the various elements.

2. You need only consider the elements whose order is a prime power. (By the Bezout manipulation all the others will be sums of these.)

3. Focusing on one prime at a time, take the element g1 whose order is maximal. Then G_p = ⟨g1⟩ ⊕ N. ♣Have to say how to get N. ♣

4. Repeat for N.

(For sublattices of Z^d as in the example above there is also a more mechanical route, via integer row and column operations; see the sketch below.)
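Here is a minimal, unoptimized sketch of that mechanical route (not the algorithm above): diagonalize the matrix whose rows generate L by integer row and column operations (the Smith normal form); the quotient Z^d/L is then the direct sum of Z/d_i Z over the diagonal entries d_i, and factoring each d_i into prime powers gives the Kronecker decomposition (16.17). The function names are ad hoc choices.

```python
def diagonalize(M):
    """Diagonalize an integer matrix by row and column operations over Z."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            pos = next(((i, j) for i in range(t, m) for j in range(t, n) if A[i][j]), None)
            if pos is None:
                return A                          # remaining block is zero
            i, j = pos
            A[t], A[i] = A[i], A[t]               # move a nonzero entry to (t, t)
            for row in A:
                row[t], row[j] = row[j], row[t]
            if A[t][t] < 0:
                A[t] = [-x for x in A[t]]
            swapped = False
            for i in range(t + 1, m):             # clear column t below the pivot
                if A[i][t]:
                    q = A[i][t] // A[t][t]
                    A[i] = [a - q * b for a, b in zip(A[i], A[t])]
                    if A[i][t]:                   # nonzero remainder: smaller pivot
                        A[t], A[i] = A[i], A[t]
                        swapped = True
            for j in range(t + 1, n):             # clear row t to the right of the pivot
                if A[t][j]:
                    q = A[t][j] // A[t][t]
                    for row in A:
                        row[j] -= q * row[t]
                    if A[t][j]:
                        for row in A:
                            row[t], row[j] = row[j], row[t]
                        swapped = True
            if not swapped:
                break
    return A

def kronecker_decomposition(generators, d):
    """Prime-power cyclic factors (p, k) of Z^d / L, with L spanned by the given rows."""
    D = diagonalize([list(v) for v in generators])
    diag = [D[i][i] for i in range(min(len(D), len(D[0])))]
    factors = []
    for a in diag:
        a, p = abs(a), 2
        while a > 1:                              # naive trial division is fine here
            k = 0
            while a % p == 0:
                a //= p
                k += 1
            if k:
                factors.append((p, k))            # a summand Z / p^k Z
            p += 1
    free_rank = d - sum(1 for a in diag if a)     # leftover free Z factors
    return sorted(factors), free_rank

print(kronecker_decomposition([(3, 0), (0, 3)], 2))   # ([(3, 1), (3, 1)], 0): (Z/3Z)^2
print(kronecker_decomposition([(2, 0), (1, 4)], 2))   # ([(2, 3)], 0): Z/8Z
```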

Exercise
Show that an alternative form of the structure theorem is the statement that any finite
abelian group is isomorphic to

Z_{n_1} ⊕ Z_{n_2} ⊕ · · · ⊕ Z_{n_k}    (16.25)


where
n1 |n2 & n2 |n3 & ··· & nk−1 |nk (16.26)
Write the ni in terms of the prime powers in (16.17).

Exercise p-groups
a.) Show that Z4 is not isomorphic to Z2 ⊕ Z2 .
b.) Show more generally that if p is prime Zpn and Zpn−m ⊕ Zpm are not isomorphic if
0 < m < n.
c.) How many nonisomorphic abelian groups have order pn ?

Exercise
Suppose e1, e2 ∈ Z^2 are two linearly independent vectors (over Q). Let Λ = ⟨e1, e2⟩ ⊂
Z^2 be the sublattice generated by these vectors. Then Z^2/Λ is a finite abelian group.
Compute its Kronecker decomposition in terms of the coordinates of e1, e2.

16.3 Finitely Generated Abelian Groups


It is hopeless to classify all infinite abelian groups, but a “good” criterion that leads to an
interesting classification is that of finitely generated abelian groups.
Any abelian group has a canonically defined subgroup known as the torsion subgroup,
denoted Tors(G). This is the subgroup of elements of finite order:

Tors(G) := {g ∈ G | ∃ n ∈ Z, n > 0, ng = 0}    (16.27)

where we are writing the group G additively, so ng = g + · · · + g.
One can show that any finitely generated abelian group fits in an exact sequence

0 → Tors(G) → G → Z^r → 0    (16.28)

where Tors(G) is a finite abelian group.


For a proof, see, e.g., S. Lang, Algebra .
Moreover (16.28) is a split extension, that is, it is isomorphic to

Zr ⊕ Tors(G) (16.29)

The integer r, called the rank of the group, and the finite abelian group Tors(G) are
invariants of the finitely generated abelian group. Since we have a general picture of the
finite abelian groups we have now got a general picture of the finitely generated abelian
groups.

Remarks:

1. The groups C, R, Q under addition are abelian but not finitely generated. This is
obvious for C and R since these are uncountable sets. To see that Q is not finitely
generated, consider any finite set of fractions {p1/q1, . . . , ps/qs}. This set will only
generate rational numbers which, when written in lowest terms, have denominator at
most q1 q2 · · · qs.

2. Note that a torsion abelian group need not be finite in general. For example Q/Z is
entirely torsion, but is not finite.

3. A rich source of finitely generated abelian groups are the integral cohomology groups
H n (X; Z) of smooth compact manifolds.

4. We must stress that the presentation (16.29) of a finitely generated abelian group
is not canonical! There are many distinct splittings of (16.28). They are in 1-1
correspondence with the group homomorphisms Hom(Zr , Tors(G)). For a simple
example consider Zd /Λ where Λ is a general sublattice of rank less than d.

5. In a nonabelian group the product of two finite-order elements can very well have
infinite order. Examples include free products of cyclic groups and simple rotations
by 2π/n around different axes in SO(3). So, there is no straightforward generalization
of Tors(G) to the case of nonabelian groups.

Exercise

Consider the finitely generated Abelian group 227

L = {(x_1, x_2, x_3, x_4) ∈ Z^4 | Σ_i x_i = 0 mod 2}    (16.30)

and consider the subgroup S generated by

v1 = (1, 1, 1, 1)
(16.31)
v2 = (1, 1, −1, −1)

a.) What is the torsion group of L/S ?


b.) Find a splitting of the sequence (16.28) and compare with the one found by other
students in the course. Are they the same?

Exercise
Given a set of finite generators of an Abelian group A try to find an algorithm for a
splitting of the sequence (16.28).

16.4 The Classification Of Finite Simple Groups


Kronecker’s structure theorem is a very satisfying, beautiful and elegant answer to a clas-
sification question. The generalization to nonabelian groups is very hard. It turns out that
a “good” criterion is that a finite group be a simple group. This idea arose from the Galois
demonstration of (non)solvability of polynomial equations by radicals.
A key concept in abstract group theory is provided by the notion of a composition
series. This is a sequence of subgroups

1 = G_{s+1} ◁ G_s ◁ · · · ◁ G_2 ◁ G_1 = G    (16.32)

which have the property that G_{i+1} is a maximal normal subgroup of G_i. (Note: G_{i+1} need
not be normal in G. Moreover, there might be more than one maximal normal subgroup
in G_i.) As a simple example we shall see that we have ♣should give an example of this.... ♣

1 = G_4 ◁ G_3 = Z_2 × Z_2 ◁ G_2 = A_4 ◁ G_1 = S_4    (16.33)

but

1 = G_3 ◁ G_2 = A_n ◁ G_1 = S_n,    n ≥ 5    (16.34)

Not every group admits a composition series. For example G = Z does not admit
a composition series. (Explain why!) However, it can be shown that every finite group
admits a composition series. ♣Give a reference.

227
It is the root lattice of so(8).

It follows that in a composition series the quotient groups Gi /Gi+1 are simple groups:
By definition, a simple group is one whose only normal subgroups are 1 and itself. From
what we have learned above, that means that a simple group has no nontrivial homomorphic
images. It also implies that the center is trivial or the whole group.
Let us prove that the G_i/G_{i+1} are simple: In general, if N ◁ G is a normal subgroup
then there is a 1-1 correspondence:

Subgroups H between N and G: N ⊂ H ⊂ G  ⇔  Subgroups of G/N

Moreover, under this correspondence:

Normal subgroups of G/N  ⇔  Normal subgroups N ⊂ H ◁ G. ♣Make this an exercise in an earlier section. ♣

If H/G_{i+1} ⊂ G_i/G_{i+1} were normal and ≠ 1 then G_{i+1} ⊂ H ⊂ G_i would be normal and
properly contain G_{i+1}, contradicting maximality of G_{i+1}. ♠


A composition series is a nonabelian generalization of the Kronecker decomposition.
It is not unique (see exercise below) but the following theorem, known as the Jordan-Hölder
theorem, states that there are some invariant aspects of the decomposition:
Theorem: Suppose there are two different composition series for G:

1 = G_{s+1} ◁ G_s ◁ · · · ◁ G_2 ◁ G_1 = G    (16.35)

1 = G'_{s'+1} ◁ G'_{s'} ◁ · · · ◁ G'_2 ◁ G'_1 = G    (16.36)

Then s = s' and there is a permutation i → i' so that G_i/G_{i+1} ≅ G'_{i'}/G'_{i'+1}. That is:
The length and the unordered set of quotients are both invariants of the group and do not
depend on the particular composition series.
For a proof see Jacobson, Section 4.6.
The classification of all finite groups is reduced to solving the extension problem in
general, and then classifying finite simple groups. The idea is that if we know Gi /Gi+1 = Si
is a finite simple group then we construct Gi from Gi+1 and the extension:

1 → Gi+1 → Gi → Si → 1 (16.37)

We have discussed the extension problem thoroughly above. One of the great achievements
of 20th century mathematics is the complete classification of finite simple groups, so let us
look at the finite simple groups:
First consider the abelian ones. Since every subgroup of an abelian group is normal, these
cannot have nontrivial proper subgroups and hence must be of the form Z/pZ where p is prime.
So, now we search for the nonabelian finite simple groups. A natural source of non-
abelian groups are the symmetric groups Sn . Of course, these are not simple because
An ⊂ Sn are normal subgroups. Could the An be simple? The first nonabelian example
is A4 and it is not a simple group! Indeed, consider the cycle structure (2)^2. There are
three nontrivial elements: (12)(34), (13)(24), and (14)(23), they are all involutions, and

((12)(34)) · ((13)(24)) = ((13)(24)) · ((12)(34)) = (14)(23) (16.38)

and therefore together with the identity they form a subgroup K ⊂ A4 isomorphic to
Z2 × Z2 . Since cycle-structure is preserved under conjugation, this is obviously a normal
subgroup of A4! After this unpromising beginning you might be surprised to learn:

Theorem An is a simple group for n ≥ 5.

Sketch of the proof :


We first observe that An is generated by cycles of length three: (abc). The reason
is that (abc) = (ab)(bc), so any word in an even number of distinct transpositions can
be rearranged into a word made from a product of cycles of length three. Therefore, the
strategy is to show that any normal subgroup K ⊂ An which is larger than 1 must contain
at least one three-cycle (abc). WLOG let us say it is (123). Now we claim that the entire
conjugacy class of three-cycles must be in K. We consider a permutation φ which takes

φ =  ( 1 2 3 4 5 · · · )
     ( i j k l m · · · )        (16.39)

Then φ(123)φ−1 = (ijk). If φ ∈ An we are done, since K is normal in An so then (ijk) ∈ K.


If φ is an odd permutation then φ̃ = φ(45) is even and φ̃(123)φ̃−1 = (ijk).
Thus, we need only show that some 3-cycle is in K. For n = 5 this can be done rather
explicitly. See the exercise below. Once we have established that A5 is simple we can
proceed by induction as follows.
We first establish a lemma: If n ≥ 5 then for any σ ∈ An, σ ≠ 1, there is a conjugate
element (in An) σ′ with σ′ ≠ σ such that there is an i ∈ {1, . . . , n} so that σ(i) = σ′(i).
To prove the lemma choose any σ ≠ 1 and for σ choose a cycle of maximal length, say
r, so that σ = (12 . . . r)π with π fixing {1, . . . , r}. If r ≥ 3 then consider the conjugate:

σ 0 = (345)σ(345)−1 = (345)(123 · · · )π(354) (16.40)

We see that σ(1) = σ 0 (1) = 2, while σ(2) = 3 and σ 0 (2) = 4. We leave the case r = 2 to
the reader.
Now we proceed by induction: Suppose Aj is simple for 5 ≤ j ≤ n. Consider An+1 and
let N ◁ An+1 be a nontrivial normal subgroup. Then choose σ ∈ N, σ ≠ 1, and using the
lemma consider σ′ ∈ An+1 with σ′ ≠ σ and σ′(i) = σ(i) for some i. Let Hi ⊂ An+1 be the
subgroup of permutations fixing i. It is isomorphic to An. Now, σ′ ∈ N since it is a conjugate
of σ ∈ N and N is assumed to be normal. Therefore σ^{-1}σ′ ∈ N, and σ^{-1}σ′ ≠ 1. Therefore
N ∩ Hi ≠ 1. But N ∩ Hi must be normal in Hi. Since Hi ≅ An it follows that N ∩ Hi = Hi.
But Hi contains 3-cycles. Therefore N contains 3-cycles and hence N = An+1. ♠
Remark: For several other proofs of the same theorem and other interesting related
facts see
http://www.math.uconn.edu/kconrad/blurbs/grouptheory/Ansimple.pdf.
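The contrast between A4 and A5 can also be confirmed by brute force (a check, not a replacement for the proof above), using the fact that a normal subgroup is a union of conjugacy classes containing the identity, closed under multiplication, whose order divides |G|. The sketch below enumerates such unions and finds the Klein subgroup of A4 but nothing nontrivial in A5.

```python
from itertools import permutations, combinations

def parity(p):
    """Parity of a permutation given as a tuple (0 = even, 1 = odd)."""
    seen, par = set(), 0
    for i in range(len(p)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        par ^= (length - 1) & 1
    return par

def compose(p, q):
    """(p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(p)))

def normal_subgroup_orders(n):
    G = [p for p in permutations(range(n)) if parity(p) == 0]      # A_n
    # conjugacy classes of A_n (conjugating only by even permutations)
    classes, assigned = [], set()
    for g in G:
        if g in assigned:
            continue
        cl = {compose(compose(h, g), tuple(h.index(i) for i in range(n))) for h in G}
        classes.append(cl)
        assigned |= cl
    identity_class = next(cl for cl in classes if tuple(range(n)) in cl)
    others = [cl for cl in classes if cl is not identity_class]
    found = []
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            N = set(identity_class)
            for cl in combo:
                N |= cl
            if len(G) % len(N):
                continue                                           # Lagrange rules it out
            if all(compose(a, b) in N for a in N for b in N):      # closed under products
                found.append(len(N))
    return sorted(found)

print(normal_subgroup_orders(4))   # [1, 4, 12]: the Klein four-group appears
print(normal_subgroup_orders(5))   # [1, 60]:    A_5 is simple
```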

Digressive Remark: A group is called solvable if the Gi /Gi+1 are abelian (and hence
Z/pZ for some prime p). The term has its origin in Galois theory, which in turn was the
original genesis of group theory. Briefly, in Galois theory one considers a polynomial P (x)
with coefficients drawn from a field F . (e.g. consider F = Q or R). Then the roots of the
polynomial θi can be adjoined to F to produce a bigger field K = F [θi ]. The Galois group
of the polynomial Gal(P ) is the group of automorphisms of K fixing F . Galois theory

sets up a beautiful 1-1 correspondence between subgroups H ⊂ Gal(P ) and subfields
F ⊂ KH ⊂ K. The intuitive notion of solving a polynomial by radicals corresponds to
finding a series of subfields F ⊂ F1 ⊂ F2 ⊂ · · · ⊂ K so that Fi+1 is obtained from Fi
by adjoining the solutions of an equation y d = z. Under the Galois correspondence this
translates into a composition series where Gal(P ) is a solvable group - hence the name. If
we take F = C[a0 , . . . , an−1 ] for an nth order polynomial

P (x) = xn + an−1 xn−1 + · · · + a1 x + a0 (16.41)

then the roots θi are such that aj are the j th elementary symmetric polynomials in the θi
(See Chapter 2 below). The Galois group is then Sn . For n ≥ 5 the only nontrivial normal
subgroup of Sn is An , and this group is simple, hence certainly not solvable. That is why
there is no solution of an nth order polynomial equation in radicals for n ≥ 5.

Returning to our main theme, we ask: What other finite simple groups are there? The
full list is known. The list is absolutely fascinating: 228

1. Z/pZ for p prime.

2. The subgroup An ⊂ Sn for n ≥ 5.

3. “Simple Lie groups over finite fields.”

4. 26 “sporadic oddballs”

We won’t explain example 3 in great detail, but it consists of a few more infinite
sequences of groups, like 1,2 above. To get a flavor of what is involved note the following:
The additive group Z/pZ where p is prime has more structure: One can multiply elements,
and if an element is nonzero then it has a multiplicative inverse, in other words, it is a finite
field. One can therefore consider the group of invertible matrices over this field GL(n, p),
and its subgroup SL(n, p) of matrices of unit determinant. Since Z/pZ has a finite number
of elements it is a finite group. This group is not simple, because it has a nontrivial center,
in general. For example, if n is even then the group {±1} is a normal subgroup isomorphic
to Z2. If we divide by the center then we get a group P SL(n, p) which, it turns out, is
indeed a simple group. This construction can be generalized in a few directions. First,
there is a natural generalization of Z/pZ to finite fields Fq of order a prime power q = pk .
Then we can similarly define P SL(n, q) and it turns out these are simple groups except
for some low values of n, q. Just as the Lie groups SL(n, C) have counterparts O(n), Sp(n)
etc. one can generalize this construction to groups of type B, C, D, E. This construction
can be used to obtain the third class of finite simple groups. ♣Double check. Does this figure leave out a subgroup relation? ♣
228
See the Atlas of Finite Simple Groups, by Conway and Norton

Figure 43: A table of the sporadic groups including subgroup relations. Source: Wikipedia.

It turns out that there are exactly 26 oddballs, known as the “sporadic groups.” Some
relationships between them are illustrated in Figure 43. The sporadic groups first showed
up in the 19th century via the Mathieu groups

M11 , M12 , M22 , M23 , M24 . (16.42)

Mn is a subgroup of the symmetric group Sn . M11 , which has order |M11 | = 7920 was
discovered in 1861. We met M12 when discussing card-shuffling. The last group, M24, with
order ∼ 10^9, was discovered in 1873. All these groups may be understood as automorphisms
of certain combinatorial objects called “Steiner systems.”
It was a great surprise when Janko constructed a new sporadic group J1 of order
175, 560 in 1965, roughly 100 years after the discovery of the Mathieu groups. The list of
sporadic groups is now thought to be complete. The largest sporadic group is called the
Monster group and its order is:

|Monster| = 2^46 · 3^20 · 5^9 · 7^6 · 11^2 · 13^3 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71
          = 808017424794512875886459904961710757005754368000000000    (16.43)
          ≈ 8.08 × 10^53

but it has only 194 conjugacy classes! (Thus, by the class equation, it is “very” nonabelian.
The center is trivial and Z(g) tends to be a small order group.)
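As a trivial consistency check (pure arithmetic, nothing group-theoretic), the prime factorization in (16.43) does reproduce the quoted decimal value:

```python
order = (2**46 * 3**20 * 5**9 * 7**6 * 11**2 * 13**3
         * 17 * 19 * 23 * 29 * 31 * 41 * 47 * 59 * 71)
print(order)
print(order == 808017424794512875886459904961710757005754368000000000)   # True
```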
The history and status of the classification of finite simple groups is somewhat curious:
229

1. The problem was first proposed by Hölder in 1892. Intense work on the classification
begins during the 20th century.

2. Feit and Thompson show (1963) that any finite group of odd order is solvable. In
particular, it cannot be a simple group.

3. Janko discovers (1965) the first new sporadic group in almost a century.

4. Progress is then rapid and in 1972 Daniel Gorenstein (of Rutgers University) an-
nounces a detailed outline of a program to classify finite simple groups.

5. The largest sporadic group, the Monster, was first shown to exist in 1980 by Fischer
and Griess. It was explicitly constructed (as opposed to just being shown to exist)
by Griess in 1982.

6. The proof is completed in 2004. It uses papers from hundreds of mathematicians


between 1955 and 2004, and largely follows Gorenstein’s program. The proof entails
tens of thousands of pages. Errors and gaps have been found, but so far they are just
“local.”

Compared to the simple and elegant proof of the classification of simple Lie algebras
(to be covered in Chapter **** below) the proof is obviously terribly unwieldy.
It is conceivable that physics might actually shed some light on this problem. The
simple groups are probably best understood as automorphism groups of some mathemat-
ical, perhaps even geometrical object. For example, the first nonabelian simple group,
A5 is the group of symmetries of the icosahedron, as we will discuss in detail below. A
construction of the monster along these lines was indeed provided by Frenkel, Lepowsky,
Meurman, (at Rutgers) using vertex operator algebras, which are important in the descrip-
tion of perturbative string theory. More recently the mystery has deepened with interesting
experimental discoveries linking the largest Mathieu group M24 to nonlinear sigma models
with K3 target spaces. For more discussion about the possible role of physics in this subject
see:

1. Articles by Griess and Frenkel et. al. in Vertex Operators in Mathematics and
Physics, J. Lepowsky, S. Mandelstam, and I.M. Singer, eds.

2. J. Harvey, “Twisting the Heterotic String,” in Unified String Theories, Green and
Gross eds.
229
Our source here is the Wikipedia article on the classification of finite simple groups. See also: Solomon,
Ronald, “A brief history of the classification of the Finite simple groups,” American Mathematical Society.
Bulletin. New Series, 38 (3): 315-352 (2001).

3. L.J. Dixon, P.H. Ginsparg, and J.A. Harvey, “Beauty And The Beast: Superconfor-
mal Symmetry In A Monster Module,” Commun.Math.Phys. 119 (1988) 221-241

4. M.C.N. Cheng, J.F.R. Duncan, and J.A. Harvey, “Umbral Moonshine,” e-Print:
arXiv:1204.2779 [math.RT]

Exercise Completing the proof that A5 is simple


Show that any nontrivial normal subgroup of A5 must contain a 3-cycle as follows:
a.) If N / A5 is a normal subgroup containing no 3-cycles then the elements must have
cycle type (ab)(cd) or (abcde).
b.) Compute the group commutators (a, b, c, d, e are all distinct):

[(abe), (ab)(cd)] = (aeb) (16.44)

[(abc), (abcde)] = (abd) (16.45)

c.) Use these facts to conclude that N must contain a 3-cycle.


Legend has it that Galois discovered this theorem on the night before his fatal duel.

Exercise Conjugacy classes in An


Note that conjugacy classes in An are different from conjugacy classes in Sn . For
example, (123) and (132) are not conjugate in A3 .
Describe the conjugacy classes in An .

Exercise Jordan-Hölder decomposition


Work out JH decompositions for the order 8 quaternion group D̃2 and observe that
there are several maximal normal subgroups.

Exercise The simplest of the Chevalley groups


a.) Verify that SL(2, Z/pZ) is a group.
b.) Show that the order of SL(2, Z/pZ) is p(p2 − 1). 230

230
Break up the cases into d = 0 and d ≠ 0. When d ≠ 0 you can solve ad − bc = 1 for a. When d = 0
you can have arbitrary a but you must have bc = −1.

c.) Note that the scalar multiples of the 2 × 2 identity matrix form a normal subgroup
of SL(2, Z/pZ). Show that the number of such matrices is the number of solutions of
x2 = 1modp. Dividing by this normal subgroup produces the group P SL(2, Z/pZ). Jordan
proved that these are simple groups for p 6= 2, 3.
It turns out that P SL(2, Z5 ) ∼= A5 . (Check that the orders match.) Therefore the
next simple group in the series is P SL(2, Z7 ). It has many magical properties.
d.) Show that P SL(2, Z7 ) has order 168.

17. Categories: Groups and Groupoids

A rather abstract notion, which nevertheless has found recent application in string theory
and conformal field theory is the language of categories. Many physicists object to the high
level of abstraction entailed in the category language. Some mathematicians even refer to
the subject as “abstract nonsense.” (Others take it very seriously.) However, it seems to be
of increasing utility in the further formal development of string theory and supersymmetric
gauge theory. It is also essential for reading any of the literature on topological field theory.
We briefly illustrate some of that language here. Our main point here is to introduce a
different viewpoint on what groups are that leads to a significant generalization: groupoids.
Moreover, this point of view also provides some very interesting insight into the meaning of
group cohomology. Related constructions have been popular in condensed matter physics
and topological field theory.

Definition A category C consists of


a.) A set Ob(C) of “objects”
b.) A collection M or(C) of sets hom(X, Y ), defined for any two objects X, Y ∈ Ob(C).
The elements of hom(X, Y ) are called the “morphisms from X to Y .” They are often
denoted as arrows:
X --φ--> Y    (17.1)
c.) A composition law:

hom(X, Y ) × hom(Y, Z) → hom(X, Z) (17.2)

(ψ1 , ψ2 ) 7→ ψ2 ◦ ψ1 (17.3)
Such that
1. A morphism φ uniquely determines its source X and target Y . That is, hom(X, Y )
are disjoint for distinct ordered pairs (X, Y ).
2. ∀X ∈ Ob(C) there is a distinguished morphism, denoted 1X ∈ hom(X, X) or
IdX ∈ hom(X, X), which satisfies:

1X ◦ φ = φ ψ ◦ 1X = ψ (17.4)

for all morphisms φ ∈ hom(Y, X) and ψ ∈ hom(X, Y ) for all Y ∈ Ob(C). 231

231
As an exercise, show that these conditions uniquely determine the morphism 1X .

3. Composition of morphisms is associative:

(ψ1 ◦ ψ2 ) ◦ ψ3 = ψ1 ◦ (ψ2 ◦ ψ3 ) (17.5)


An alternative definition one sometimes finds is that a category is defined by two sets
C0 (the objects) and C1 (the morphisms) with two maps p0 : C1 → C0 and p1 : C1 → C0 .
The map p0 (f ) = x1 ∈ C0 is the range map and p1 (f ) = x0 ∈ C0 is the domain map. In
this alternative definition a category is then defined by a composition law on the set of
composable morphisms

C2 = {(f, g) ∈ C1 × C1 |p0 (f ) = p1 (g)} (17.6)

which is sometimes denoted C1p1 ×p0 C1 and called the fiber product. The composition law
takes C2 → C1 and may be pictured as the composition of arrows. If f : x0 → x1 and
g : x1 → x2 then the composed arrow will be denoted g ◦ f : x0 → x2 . The composition
law satisfies the axioms

1. For all x ∈ X0 there is an identity morphism in X1 , denoted 1x , or Idx , such that


1x f = f and g1x = g for all suitably composable morphisms f, g.

2. The composition law is associative. If f, g, h are 3-composable morphisms then


(hg)f = h(gf ).

Remarks:

1. When defining composition of arrows one needs to make an important notational


decision. If f : x0 → x1 and g : x1 → x2 then the composed arrow is an arrow
x0 → x2 . We will write g ◦ f when we want to think of f, g as functions and f g when
we think of them as arrows. ♣Is this dual
notation really a
good idea?? ♣
2. It is possible to endow the data X0 , X1 and p0 , p1 with additional structures, such as
topologies, and demand that p0 , p1 have continuity or other properties.

3. A morphism φ ∈ hom(X, Y ) is said to be invertible if there is a morphism ψ ∈


hom(Y, X) such that ψ ◦ φ = 1X and φ ◦ ψ = 1Y . If X and Y are objects with an
invertible morphism between them then they are called isomorphic objects. One key
reason to use the language of categories is that objects can have nontrivial automor-
phisms. That is, hom(X, X) can have invertible elements other than just 1X in it.
When this is true then it is tricky to speak of “equality” of objects, and the language
of categories becomes very helpful. As a concrete example you might ponder the
following question: “Are all real vector spaces of dimension n the same?”

Here are some simple examples of categories:

1. SET: The category of sets and maps of sets. 232

232
We take an appropriate collection of sets and maps to avoid the annoying paradoxes of set theory.

2. TOP: The category of topological spaces and continuous maps.

3. TOPH: The category of topological spaces and homotopy classes of continuous


maps.

4. MANIFOLD: The category of manifolds and suitable maps. We could take topo-
logical manifolds and continuous maps of manifolds. Or we could take smooth man-
ifolds and smooth maps as morphisms. The two choices lead to two (very different!)
categories.

5. BORD(n): The bordism category of n-dimensional manifolds. Roughly speaking,


the objects are n-dimensional manifolds without boundary and the morphisms are
bordisms. A bordism Y from an n-manifold M1 to an n-manifold M2 is an (n + 1)-
dimensional manifold with a decomposition of its boundary ∂Y = (∂Y )in q (∂Y )out
together with diffeomorphisms θ1 : (∂Y )in → M1 and θ2 : (∂Y )out → M2 .

6. GROUP: the category of groups and homomorphisms of groups. Note that here
if we took our morphisms to be isomorphisms instead of homomorphisms then we
would get a very different category. All the pairs of objects in the category with
nontrivial morphism spaces between them would be pairs of isomorphic groups.

7. AB: The (sub) category of abelian groups.

8. Fix a group G and let G-SET be the category of G-sets, that is, sets X with a
G-action. For simplicity let us just write the G-action Φ(g, x) as g · x for x a point in
a G-set X. We take the morphisms f : X1 → X2 to satisfy f(g · x1) = g · f(x1).

9. VECTκ : The category of finite-dimensional vector spaces over a field κ with mor-
phisms the linear transformations.

One use of categories is that they provide a language for describing precisely notions
of “similar structures” in different mathematical contexts. When discussed in this way it
is important to introduce the notion of “functors” and “natural transformations” to speak
of interesting relationships between categories.
In order to state a relation between categories one needs a “map of categories.” This
is what is known as a functor:

Definition A functor between two categories C1 and C2 consists of a pair of maps Fobj :
Obj(C1 ) → Obj(C2 ) and Fmor : M or(C1 ) → M or(C2 ) so that if

f
x / y ∈ hom(x, y) (17.7)

then
Fmor (f )
Fobj (x) / Fobj (y) ∈ hom(Fobj (x), Fobj (y)) (17.8)

and moreover we require that Fmor should be compatible with composition of morphisms:
There are two ways this can happen. If f1 , f2 are composable morphisms then we say F is
a covariant functor if
Fmor (f1 ◦ f2 ) = Fmor (f1 ) ◦ Fmor (f2 ) (17.9)
and we say that F is a contravariant functor if

Fmor (f1 ◦ f2 ) = Fmor (f2 ) ◦ Fmor (f1 ) (17.10)

In both cases we also require 233

Fmor (IdX ) = IdF (X) (17.11)

We usually drop the subscript on F since it is clear what is meant from context.

Exercise
Using the alternative definition of a category in terms of data p0,1 : X1 → X0 define
the notion of a functor writing out the relevant commutative diagrams.

Exercise Opposite Category


If C is a category then the opposite category C opp is defined by just reversing all arrows.
More formally: The set of objects is the same and

homC opp (X, Y ) := homC (Y, X) (17.12)

so for every morphism f ∈ homC (Y, X) we associate f opp ∈ homC opp (X, Y ) such that

f1 ◦C opp f2 = (f2 ◦C f1 )opp (17.13)

a.) Show that if F : C → D is a contravariant functor then one can define in a natural
way a covariant functor F : C opp → D.
b.) Show that if F : C → D is a covariant functor then we can naturally define another
covariant functor F opp : C opp → Dopp

Example 1: Every category has a canonical functor to itself, called the identity functor
IdC .
Example 2: There is an obvious functor, the forgetful functor from GROUP to SET.
This idea extends to many other situations where we “forget” some mathematical structure
and map to a category of more primitive objects.
233
Although we do have Fmor (IdX ) ◦ Fmor (f ) = Fmor (f ) for all f ∈ hom(Y, X) and Fmor (f ) ◦ Fmor (IdX ) =
Fmor (f ) for all f ∈ hom(X, Y ) this is not the same as the statement that Fmor (IdX ) ◦ φ = φ for all
φ ∈ hom(F (Y ), F (X)), so we need to impose this extra axiom.

Example 3: Since AB is a subcategory of GROUP there is an obvious functor F :
AB → GROUP.

Example 4: In an exercise below you are asked to show that the abelianization of a group
defines a functor G : GROUP → AB.

Example 5: Fix a group G. Then in the notes above we have on several occasions used
the functor
FG : SET → GROUP (17.14)

by observing that if X is a set, then FG (X) = M aps[X → G] is a group. Check this is a


contravariant functor: If f : X1 → X2 is a map of sets then

FG(X1) <--FG(f)-- FG(X2)    (17.15)

The map FG (f ) is usually denoted f ∗ and is known as the pull-back. To be quite explicit:
If Ψ is a map of X2 → G then f ∗ (Ψ) := Ψ ◦ f is a map X1 → G.
This functor is used in the construction of certain nonlinear sigma models which are
quantum field theories where the target space is a group G. The viewpoint that we are
studying the representation theory of an infinite-dimensional group of maps to G has been
extremely successful in a particular case of the Wess-Zumino-Witten model, a certain two
dimensional quantum field theory that enjoys conformal invariance (and more).

Example 6: Now let us return to the category G-SET. Now fix any set Y . Then in the
notes above we have on several occasions used the functor

FG,Y : G-SET → G-SET (17.16)

by observing that if X is a G-set, then FY (X) = M aps[X → Y ] is also a G-set. To check


this is a contravariant functor we write:

[g · (f ∗ Ψ)](x1 ) = (f ∗ Ψ)(g −1 · x1 )
= Ψ(f (g −1 · x1 ))
= Ψ(g −1 · (f (x1 ))) (17.17)
= (g · Ψ)(f (x1 ))
= (f ∗ (g · Ψ))(x1 )

and hence Ψ → g · Ψ is a morphism of G-sets.


This functor is ubiquitous in quantum field theory: If a spacetime enjoys some sym-
metry (for example rotational or Poincaré symmetry) then the same group will act on the
space of fields defined on that spacetime.

Example 7: Fix a nonnegative integer n and a group G. Then the group cohomology we
discussed above (take the trivial twisting ωg = IdA for all g) defines a covariant functor

H n (G, •) : AB → AB (17.18)

To check this is really a functor we need to observe the following: If ϕ : A1 → A2 is a
homomorphism of Abelian groups then there is an induced homomorphism, usually denoted
ϕ∗ : H n (G, A1 ) → H n (G, A2 ) (17.19)
You have to check that Id∗ = Id and
(ϕ1 ◦ ϕ2 )∗ = (ϕ1 )∗ ◦ (ϕ2 )∗ (17.20)
Strictly speaking we should denote ϕ∗ by H n (G, ϕ), but this is too fastidious for the present
author.
Example 8: Fix a nonnegative integer n and any group A. Then the group cohomology
we discussed above (take the trivial twisting ωg = IdA for all g) defines a contravariant
functor
H n (•, A) : GROUP → AB (17.21)
To check this is really a functor we need to observe the following: If ϕ : G1 → G2 is a
homomorphism of groups then there is an induced homomorphism, usually denoted
ϕ∗ : H n (G2 , A) → H n (G1 , A) (17.22)
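For very small groups the cohomology groups entering these functors can be computed by brute force, directly from the cocycle and coboundary formulas (with trivial action). A minimal sketch for G = A = Z/2Z, which recovers |H^2(Z/2Z, Z/2Z)| = 2 — matching the two extensions Z4 and Z2 × Z2 met earlier — is the following (the tiny example is an illustrative choice):

```python
from itertools import product

n = 2
G = list(range(n))                                   # G = A = Z/2Z, written additively
pairs = [(g1, g2) for g1 in G for g2 in G]

def is_cocycle(f):
    # trivial action: f(g2,g3) - f(g1+g2,g3) + f(g1,g2+g3) - f(g1,g2) = 0 (mod n)
    return all((f[g2, g3] - f[(g1 + g2) % n, g3]
                + f[g1, (g2 + g3) % n] - f[g1, g2]) % n == 0
               for g1 in G for g2 in G for g3 in G)

cocycles = [f for vals in product(range(n), repeat=len(pairs))
            for f in [dict(zip(pairs, vals))] if is_cocycle(f)]

# coboundaries (d beta)(g1,g2) = beta(g2) - beta(g1+g2) + beta(g1)
coboundaries = {tuple(((g1, g2), (b[g2] - b[(g1 + g2) % n] + b[g1]) % n)
                      for (g1, g2) in pairs)
                for vals in product(range(n), repeat=n)
                for b in [dict(zip(G, vals))]}

print(len(cocycles), len(coboundaries))        # 4 cocycles, 2 coboundaries
# since B^2 is a subgroup of Z^2, |H^2| is the ratio of the counts:
print(len(cocycles) // len(coboundaries))      # 2
```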

Example 9: Topological Field Theory. The very definition of topological field theory is
that it is a functor from a bordism category of manifolds to the category of vector spaces
and linear transformations. For much more about this one can consult a number of papers.
Two online resources are
http://www.physics.rutgers.edu/∼gmoore/695Fall2015/TopologicalFieldTheory.pdf
https://www.ma.utexas.edu/users/dafr/bordism.pdf

Note that in example 2 there is no obvious functor going the reverse direction. When
there are functors both ways between two categories we might ask whether they might be,
in some sense, “the same.” But saying precisely what is meant by “the same” requires
some care.

Definition If C1 and C2 are categories and F1 : C1 → C2 and F2 : C1 → C2 are two functors


then a natural transformation τ : F1 → F2 is a rule which, for every X ∈ Obj(C1 ) assigns
an arrow τX : F1 (X) → F2 (X) so that, for all X, Y ∈ Obj(C1 ) and all f ∈ hom(X, Y ),
τY ◦ F1 (f ) = F2 (f ) ◦ τX (17.23)
Or, in terms of diagrams:

    F1(X) --F1(f)--> F1(Y)
      |                |
     τX               τY
      v                v
    F2(X) --F2(f)--> F2(Y)            (17.24)

Example 1: The evaluation map. Here is another tautological construction which never-
theless can be useful. Let S be any set and define a functor

FS : SET → SET (17.25)

by saying that on objects we have

FS (X) := M ap[S → X] × S (17.26)

and if ϕ : X1 → X2 is a map of sets then

FS (ϕ) : M ap[S → X1 ] × S → M ap[S → X2 ] × S (17.27)

is defined by FS (ϕ) : (f, s) 7→ (ϕ ◦ f, s). Then we claim there is a natural transformation


to the identity functor. For every set X we have

τX : FS (X) = M ap[S → X] × S → Id(X) = X (17.28)

It is defined by τX (f, s) := f (s). This is known as the “evaluation map.” Then we need to
check
    FS(X) --τX--> X
      |           |
    FS(ϕ)         ϕ
      v           v
    FS(Y) --τY--> Y                   (17.29)

commutes. If you work it out, it is just a tautology.


Example 2: The determinant. 234 Let COMMRING be the category of commutative
rings with morphisms the ring morphisms. (So, ϕ : R1 → R2 is a homomorphism of Abelian
groups and moreover ϕ(r · s) = ϕ(r) · ϕ(s).) Let us consider two functors

COMMRING → GROUP (17.30)

The first functor F1 maps a ring R to the multiplicative group U (R) of multiplicatively
invertible elements. This is often called the group of units in R. If ϕ is a morphism of
rings and r ∈ U (R1 ) then ϕ(r) ∈ U (R2 ) and the map ϕ∗ : U (R1 ) → U (R2 ) defined by

ϕ∗ : r 7→ ϕ(r) (17.31)

is a group homomorphism. So F1 is a functor. The second functor F2 maps a ring R


to the matrix group GL(n, R) of n × n matrices such that there exists an inverse matrix
with values in R. Again, if ϕ : R1 → R2 is a morphism then applying ϕ to each matrix
element defines a group homomorphism ϕ∗ : GL(n, R1 ) → GL(n, R2 ). Now consider the
determinant of a matrix g ∈ GL(n, R). The usual formula
det(g) := Σ_{σ∈Sn} ε(σ) g_{1,σ(1)} g_{2,σ(2)} · · · g_{n,σ(n)}    (17.32)

234
This example uses some terms from linear algebra which can be found in the “User’s Manual,” Chapter
2 below.

makes perfect sense for g ∈ GL(n, R). Moreover,

det(g1 g2 ) = det(g1 )det(g2 ) (17.33)

Now we claim that the determinant defines a natural transformation τ : F2 → F1. For each
object R ∈ Ob(COMMRING) we assign the morphism

τR : GL(n, R) → U (R) (17.34)

defined by τR (g) := det(g). Thanks to (17.33) this is indeed a morphism in the cate-
gory GROUP, that is, it is a group homomorphism. Moreover, it satisfies the required
commutative diagram because if ϕ : R1 → R2 is a morphism of rings then

ϕ∗ (det(g)) = det(ϕ∗ (g)). (17.35)
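The naturality statement (17.35) is easy to test numerically. Here is a small sketch that implements the permutation-sum formula (17.32) directly and checks (17.35) for the ring morphism given by reduction mod 5; the particular matrix and modulus are arbitrary illustrative choices.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation given as a tuple p with p[i] = sigma(i)."""
    s, seen = 1, set()
    for i in range(len(p)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        if length % 2 == 0:
            s = -s
    return s

def det(g):
    """Formula (17.32): sum over sigma of eps(sigma) g[0][sigma(0)] ... g[n-1][sigma(n-1)]."""
    n = len(g)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= g[i][p[i]]
        total += term
    return total

phi = lambda t: t % 5                                # the ring morphism Z -> Z/5Z
g = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]                # an element of GL(3, Z): det(g) = 1

lhs = phi(det(g))                                    # phi(det g)
rhs = det([[phi(t) for t in row] for row in g]) % 5  # det(phi(g)), computed in Z/5Z
print(lhs, rhs, lhs == rhs)                          # 1 1 True
```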

Example 3: Natural transformations in cohomology theory. Cohomology groups provide


natural examples of functors, as we have stressed above. There are a number of interesting
natural transformations between these different cohomology-group functors. ♣Can we explain an elementary example with group cohomology as developed so far??? ♣

Definition Two categories are said to be equivalent if there are functors F : C1 → C2
and G : C2 → C1 together with isomorphisms (via natural transformations) FG ≅ Id_{C2}
and GF ≅ Id_{C1}. (Note that FG and Id_{C2} are both objects in the category of functors
FUNCT(C2, C2) so it makes sense to say that they are isomorphic.) ♣Should explain example showing category of finite-dimensional vector spaces over a field is equivalent to the category of nonnegative integers. ♣

Many important theorems in mathematics can be given an elegant and concise formulation
by saying that two seemingly different categories are in fact equivalent. Here is a
(very selective) list: 235
Example 1: Consider the category with one object for each nonnegative integer n and the
morphism space GL(n, κ) of invertible n × n matrices over the field κ. This category is
equivalent to VECTκ. That is one way of saying that the only invariant of a finite-dimensional
vector space is its dimension.

Example 2: The basic relation between Lie groups and Lie algebras is the statement that the
functor which takes a Lie group G to its tangent space at the identity, T1 G, is an equivalence
of the category of connected and simply-connected Lie groups with the category of finite-
dimensional Lie algebras. One of the nontrivial theorems in the theory is the existence of
a functor from the category of finite-dimensional Lie algebras to the category of connected
and simply-connected Lie groups. Intuitively, it is given by exponentiating the elements of
the Lie algebra.

Example 3: Covering space theory is about an equivalence of categories. On the one


hand we have the category of coverings of a pointed space (X, x0 ) and on the other hand
235
I thank G. Segal for a nice discussion that helped prepare this list.

the category of topological spaces with an action of the group π1 (X, x0 ). Closely related
to this, Galois theory can be viewed as an equivalence of categories.

Example 4: The category of unital commutative C ∗ -algebras is equivalent to the category


of compact Hausdorff topological spaces. This is known as Gelfand’s theorem.

Example 5: Similar to the previous example, an important point in algebraic geometry


is that there is an equivalence of categories of commutative algebras over a field κ (with
no nilpotent elements) and the category of affine algebraic varieties.

Example 6: Pontryagin duality is a nontrivial self-equivalence of the category of locally


compact abelian groups (and continuous homomorphisms) with itself.

Example 7: A generalization of Pontryagin duality is Tannaka-Krein duality between the


category of compact groups and a certain category of linear tensor categories. (The idea
is that, given an abstract tensor category satisfying certain conditions one can construct a
group, and if that tensor category is the category of representations of a compact group,
one recovers that group.)

Example 8: The Riemann-Hilbert correspondence can be viewed as an equivalence of


categories of flat connections (a.k.a. linear differential equations, a.k.a. D-modules) with
their monodromy representations. ♣This needs a lot
more explanation.

In physics, the statement of “dualities” between different physical theories can some-
times be formulated precisely as an equivalence of categories. One important example of
this is mirror symmetry, which asserts an equivalence of (A∞-) categories of the derived
category of holomorphic bundles on X and the Fukaya category of Lagrangians on X∨.
But more generally, nontrivial duality symmetries in string theory and field theory have a
strong flavor of an equivalence of categories.

Exercise Playing with natural transformations


a.) Given two categories C1 , C2 show that the natural transformations allow one to
define a category FUNCT(C1 , C2 ) whose objects are functors from C1 to C2 and whose
morphisms are natural transformations. For this reason natural transformations are often
called “morphisms of functors.”
b.) Write out the meaning of a natural transformation of the identity functor IdC to
itself. Show that End(IdC ), the set of all natural transformations of the identity functor
to itself is a monoid.

Exercise Freyd’s theorem
A “practical” way to tell if two categories are equivalent is the following:
By definition, a fully faithful functor is a functor F : C1 → C2 where Fmor is a bijection
on all the hom-sets. That is, for all X, Y ∈ Obj(C1 ) the map

Fmor : hom(X, Y ) → hom(Fobj (X), Fobj (Y )) (17.36)

is a bijection.
Show that C1 is equivalent to C2 iff there is a fully faithful functor F : C1 → C2 so that
any object α ∈ Obj(C2 ) is isomorphic to an object of the form F (X) for some X ∈ Obj(C1 ).

Exercise
As we noted above, there is a functor AB → GROUP just given by inclusion.
a.) Show that the abelianization map G → G/[G, G] defines a functor GROUP →
AB.
b.) Show that the existence of nontrivial perfect groups, such as A5 , implies that this
functor cannot be an equivalence of categories.

In addition to the very abstract view of categories we have just sketched, very concrete
objects, like groups, manifolds, and orbifolds can profitably be viewed as categories.
One may always picture a category with the objects constituting points and the mor-
phisms directed arrows between the points as shown in Figure 44.

Figure 44: Pictorial illustration of a category. The objects are the black dots. The arrows are
shown, and one must give a rule for composing each arrow and identifying with one of the other
arrows. For example, given the arrows denoted f and g it follows that there must be an arrow
of the type denoted f ◦ g. Note that every object x has at least one arrow, the identity arrow in
Hom(x, x).

As an extreme example of this let us consider a category with only one object, but
we allow the possibility that there are several morphisms. For such a category let us look
carefully at the structure on morphisms f ∈ M or(C). We know that there is a binary
operation, with an identity 1 which is associative.
But this is just the definition of a monoid!
If we have in addition inverses then we get a group. Hence:
Definition A group is a category with one object, all of whose morphisms are invertible.

To see that this is equivalent to our previous notion of a group we associate to each
morphism a group element. Composition of morphisms is the group operation. The in-
vertibility of morphisms is the existence of inverses.
We will briefly describe an important and far-reaching generalization of a group af-
forded by this viewpoint. Then we will show that this viewpoint leads to a nice geometrical
construction making the formulae of group cohomology a little bit more intuitive.
*************************************
CONSTRUCT EXERCISE HERE EXAMINING HOW CONCEPTS OF FUNCTORS
AND NATURAL TRANSFORMATIONS TRANSLATE INTO GROUP THEORY LAN-
GUAGE WHEN SPECIALIZED TO THE CATEGORIES CORRESPONDING TO GROUPS
*************************************

17.1 Groupoids

Definition A groupoid is a category all of whose morphisms are invertible.

Note that for any object x in a groupoid, hom(x, x) is a group. It is called the auto-
morphism group of the object x.

Example 1. Any equivalence relation on a set X defines a groupoid. The objects are the
elements of X. The set Hom(a, b) has one element if a ∼ b and is empty otherwise. The
composition law on morphisms then means that a ∼ b with b ∼ c implies a ∼ c. Clearly,
every morphism is invertible.

Example 2. Consider time evolution in quantum mechanics with a time-dependent Hamil-


tonian. There is no sense to time evolution U (t). Rather one must speak of unitary evolu-
tion U (t1 , t2 ) such that U (t1 , t2 )U (t2 , t3 ) = U (t1 , t3 ). Given a solution of the Schrodinger
equation Ψ(t) we may consider the state vectors Ψ(t) as objects and U (t1 , t2 ) as morphisms.
In this way a solution of the Schrodinger equation defines a groupoid. ♣Clarify this
remark. ♣

Example 3. Let X be a topological space. The fundamental groupoid π≤1 (X) is the
category whose objects are points x ∈ X, and whose morphisms are homotopy classes of
paths f : x → x0 . These compose in a natural way. Note that the automorphism group of
a point x ∈ X, namely, hom(x, x) is the fundamental group of X based at x, π1 (X, x).

Example 4. Gauge theory: Objects = connections on a principal bundle. Morphisms


= gauge transformations. This is the right point of view for thinking about some more

exotic (abelian) gauge theories of higher degree forms which arise in supergravity and string
theories.

Example 5. In the theory of string theory orbifolds and orientifolds spacetime must be
considered to be a groupoid. Suppose we have a right action of G on a set X, so we have
a map
Φ:X ×G→X (17.37)
such that
Φ(Φ(x, g1 ), g2 ) = Φ(x, g1 g2 ) (17.38)
Φ(x, 1G ) = x (17.39)
for all x ∈ X and g1 , g2 ∈ G. We can just write Φ(x, g) := x · g for short. We can then
form the category X//G with

Ob(X//G) = X
(17.40)
M or(X//G) = X × G

We should think of a morphism as an arrow, labeled by g, connecting the point x to the


point x · g. The target and source maps are: ♣FIGURE NEEDED HERE! ♣

p0 ((x, g)) := x · g p1 ((x, g)) := x (17.41)

The composition of morphisms is defined by

(xg1 , g2 ) ◦ (x, g1 ) := (x, g1 g2 ) (17.42)

or, in the other notation (better suited to a right-action):

(x, g1 )(xg1 , g2 ) := (x, g1 g2 ) (17.43)

Note that (x, 1G ) ∈ hom(x, x) is the identity morphism, and the composition of morphisms
makes sense because we have a group action. Also note that pt//G where G has the trivial
action on a point realizes the group G as a category, as sketched above.
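The following is a minimal computational sketch (not from the text) of the action groupoid X//G for a finite right action; the concrete choice of X, the use of S4 as G, and all helper names (`mult`, `act`, `compose`) are illustrative assumptions. It just checks that the composition rule (17.42)-(17.43) is well defined and associative.

```python
from itertools import permutations
import random

# A minimal sketch of the action groupoid X//G for a right action of a finite group.
# Group elements are permutations of X stored as tuples; x . g := g[x].
X = range(4)
G = list(permutations(range(4)))                  # the symmetric group S_4

def mult(g1, g2):
    """Group law chosen so that x.(g1*g2) = (x.g1).g2, i.e. 'apply g1, then g2'."""
    return tuple(g2[g1[x]] for x in range(4))

def act(x, g):
    return g[x]

objects   = list(X)
morphisms = [(x, g) for x in X for g in G]        # arrow (x, g) : x --> x.g

def source(m):  return m[0]
def target(m):  return act(m[0], m[1])

def compose(m1, m2):
    """(x, g1) followed by (x.g1, g2)  -->  (x, g1*g2); defined only when composable."""
    (x, g1), (_, g2) = m1, m2
    assert target(m1) == source(m2), "morphisms not composable"
    return (x, mult(g1, g2))

# Spot-check associativity of composition on random composable triples
for _ in range(100):
    x = random.choice(objects)
    g1, g2, g3 = (random.choice(G) for _ in range(3))
    m1, m2, m3 = (x, g1), (act(x, g1), g2), (act(act(x, g1), g2), g3)
    assert compose(compose(m1, m2), m3) == compose(m1, compose(m2, m3))
print("X//G composition is associative on sampled triples")
```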

Example 6. In the theory of string theory orbifolds and orientifolds spacetime must be
considered to be a groupoid. (This is closely related to the previous example.)

Exercise
For a group G let us define a groupoid denoted G//G (for reasons explained later)
whose objects are group elements Obj(G//G) = G and whose morphisms are arrows defined
by
$g_1 \xrightarrow{\;h\;} g_2$ \qquad (17.44)
iff g2 = h−1 g1 h. This is the groupoid of principal G-bundles on the circle.
Draw the groupoid corresponding to S3 .

Exercise The Quotient Groupoid
a.) Show that whenever G acts on a set X one can canonically define a groupoid: The
objects are the points x ∈ X. The morphisms are pairs (g, x), to be thought of as arrows
$x \xrightarrow{\;g\;} g \cdot x$. Thus, X0 = X and X1 = G × X.
b.) What is the automorphism group of an object x ∈ X?
This groupoid is commonly denoted as X//G.

Figure 45: Elementary 0, 1, 2 simplices in the simplicial space |C| of a category

Figure 46: An elementary 3-simplex in the simplicial space |C| of a category

17.2 The topology behind group cohomology


Now, let us show that this point of view on the definition of a group can lead to a very
nontrivial and beautiful structure associated with a group.

An interesting construction that applies to any category is its associated simplicial space |C|.
This is a space made by gluing together simplices;236 its simplices are:

• 0-simplices = objects

• 1-simplices = ∆1 (f ) associated to each morphism f : x0 → x1 ∈ X1 .

• 2-simplices: ∆(f1 , f2 ) associated to composable morphisms

(f1 , f2 ) ∈ X2 := {(f1 , f2 ) ∈ X1 × X1 |p0 (f1 ) = p1 (f2 )} (17.45)

• 3-simplices: ∆(f1 , f2 , f3 ) associated to 3 composable morphisms, i.e. elements of:

X3 = {(f1 , f2 , f3 ) ∈ X1 × X1 × X1 |p0 (fi ) = p1 (fi+1 ), i = 1, 2} (17.46)

• and so on. There are infinitely many simplices of arbitrarily high dimension because
we can keep composing morphisms as long as we like.

See Figures 45 and 46. The figures make clear how these simplices are glued together:
∂∆1 (f ) = x1 − x0 (17.47)
∂∆2 (f1 , f2 ) = ∆1 (f1 ) − ∆1 (f1 f2 ) + ∆1 (f2 ) (17.48)
and for Figure 46 view this as looking down on a tetrahedron. Give the 2-simplices of
Figure 45 the counterclockwise orientation and the boundary of the 3-simplex the induced
orientation from the outwards normal. Then we have

∂∆(f1 , f2 , f3 ) = ∆(f2 , f3 ) − ∆(f1 f2 , f3 ) + ∆(f1 , f2 f3 ) − ∆(f1 , f2 ) (17.49)

Note that on the three upper faces of Figure 46 the induced orientation is the ccw orientation for ∆(f1 , f2 f3 ) and ∆(f2 , f3 ), but the cw orientation for ∆(f1 f2 , f3 ). On the
bottom face the inward orientation is ccw, and hence with the outward orientation it contributes −∆(f1 , f2 ).
Clearly, we can keep composing morphisms so the space |C| has simplices of arbitrarily
high dimension, that is, it is an infinite-dimensional space.
Let us look more closely at this space for the case of a group, regarded as a category with
one object. Then in the above pictures we identify all the vertices with a single vertex.
For each group element g we have a one-simplex ∆1 (g) beginning and ending at this
vertex.
For each ordered pair (g1 , g2 ) we have an oriented 2-simplex ∆(g1 , g2 ), etc. We simply
replace fi → gi in the above formulae, with gi now interpreted as elements of G:

∂∆(g) = 0 (17.50)

∂∆(g1 , g2 ) = ∆1 (g1 ) + ∆1 (g2 ) − ∆1 (g1 g2 ) (17.51)


236 Technically, it is a simplicial space.

∂∆(g1 , g2 , g3 ) = ∆(g2 , g3 ) − ∆(g1 g2 , g3 ) + ∆(g1 , g2 g3 ) − ∆(g1 , g2 ) (17.52)
See Figure 46.
Let us construct this topological space a bit more formally:
We begin by defining $n+1$ maps $d^i : G^n \to G^{n-1}$, $0 \le i \le n$, for $n \ge 1$, given by

$$
\begin{aligned}
d^0(g_1, \dots, g_n) &= (g_2, \dots, g_n) \\
d^1(g_1, \dots, g_n) &= (g_1 g_2, g_3, \dots, g_n) \\
d^2(g_1, \dots, g_n) &= (g_1, g_2 g_3, g_4, \dots, g_n) \\
&\;\;\vdots \\
d^{n-1}(g_1, \dots, g_n) &= (g_1, \dots, g_{n-1} g_n) \\
d^n(g_1, \dots, g_n) &= (g_1, \dots, g_{n-1})
\end{aligned}
\qquad (17.53)
$$

On the other hand, we can view an n-simplex $\Delta_n$ as

$$\Delta_n := \Bigl\{ (t_0, t_1, \dots, t_n) \,\Big|\, t_i \ge 0 \ \& \ \sum_{i=0}^{n} t_i = 1 \Bigr\} \qquad (17.54)$$

Now, there are also $n+1$ face maps which embed the $(n-1)$-simplex $\Delta_{n-1}$ into one of the $n+1$ faces of the $n$-simplex $\Delta_n$:

$$
\begin{aligned}
d_0(t_0, \dots, t_{n-1}) &= (0, t_0, \dots, t_{n-1}) \\
d_1(t_0, \dots, t_{n-1}) &= (t_0, 0, t_1, \dots, t_{n-1}) \\
&\;\;\vdots \\
d_n(t_0, \dots, t_{n-1}) &= (t_0, \dots, t_{n-1}, 0)
\end{aligned}
\qquad (17.55)
$$

$d_i$ embeds the $(n-1)$-simplex into the face $t_i = 0$, which is opposite the $i$th vertex $t_i = 1$ of $\Delta_n$.
Now we identify237

$$\Bigl( \coprod_{n=0}^{\infty} \Delta_n \times G^n \Bigr) \Big/ \sim$$

via
$$(d_i(\vec t\,), \vec g\,) \sim (\vec t, d^i(\vec g\,)). \qquad (17.56)$$
The space we have constructed this way has a homotopy type denoted BG. This homotopy type is known as the classifying space of the group G. It can be characterized as the homotopy type of the quotient EG/G, where EG is a contractible topological space admitting a free G-action.
Note that for all g ∈ G, ∂∆1 (g) = 0, so for each group element there is a closed loop.
On the other hand
∆1 (1G ) = ∂(∆2 (1G , 1G )) (17.57)
237 This means we take the set of equivalence classes and impose the weakest topology on it such that the projection map is continuous.

so ∆1 (1G ) is a contractible loop. But all other loops are noncontractible. (Show this!)
Therefore:
$$\pi_1(BG, *) \cong G \qquad (17.58)$$
Moreover, if G is a finite group one can show that all the higher homotopy groups of BG
vanish. So BG is what is known as an Eilenberg-MacLane space K(G, 1).
Even for the simplest nontrivial group G = Z/2Z the construction is quite nontrivial
and BG has the homotopy type of RP ∞ .
Now, an n-cochain in $C^n(G, \mathbb{Z})$ (here we take A = Z for simplicity) is simply an assignment of an integer to each n-simplex in BG. Then the coboundary and boundary maps are related by
$$\langle d\phi_n, \Delta \rangle = \langle \phi_n, \partial \Delta \rangle \qquad (17.59)$$
and from the above formulae we recover, rather beautifully, the formula for the coboundary in group cohomology.
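As a small concrete illustration (an assumption-laden sketch, not part of the text), one can implement the face maps (17.53) for a small finite group, build the coboundary from the alternating sum of faces as in (17.59), and check numerically that it squares to zero. The group Z/4, the trivial coefficients Z, and the function names below are all choices made for the example.

```python
import random
from itertools import product

# Face maps d^i of (17.53) for G = Z/4 (written additively) and the resulting
# group-cohomology coboundary with trivial coefficients Z; we verify delta^2 = 0.
n_mod = 4
G = list(range(n_mod))

def gmul(a, b):
    return (a + b) % n_mod

def face(i, g):
    """The i-th face map d^i : G^n -> G^(n-1), acting on a tuple g of length n."""
    n = len(g)
    if i == 0:
        return g[1:]
    if i == n:
        return g[:-1]
    return g[:i-1] + (gmul(g[i-1], g[i]),) + g[i+1:]

def delta(phi, n):
    """Coboundary of an n-cochain phi : G^n -> Z, evaluated on (n+1)-tuples."""
    def dphi(g):
        return sum((-1) ** i * phi(face(i, g)) for i in range(n + 2))
    return dphi

# A random 1-cochain, its coboundary, and the check that delta(delta(phi)) = 0
table = {(g,): random.randint(-5, 5) for g in G}
phi1 = lambda g: table[g]
phi2 = delta(phi1, 1)
phi3 = delta(phi2, 2)
assert all(phi3(g) == 0 for g in product(G, repeat=3))
print("delta^2 = 0 verified for a random 1-cochain on Z/4")
```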
Remarks:

1. When we defined group cohomology we also used homogeneous cochains. This is


based on defining G as a groupoid from its left action and considering the mapping
of groupoids G//G → pt//G. ♣Explain more
here? ♣

2. A Lie group is a manifold and hence has its own cohomology groups as a manifold, $H^n(G; \mathbb{Z})$. There is a relation between these: There is a group homomorphism
$$H^{n+1}_{\text{group cohomology}}(G; \mathbb{Z}) \to H^{n}_{\text{topological space cohomology}}(G; \mathbb{Z}) \qquad (17.60)$$

3. One can show that $H^n(BG; \mathbb{Z})$ is a finite abelian group for all n > 0 if G is a finite group.
[GIVE REFERENCE].

4. The above construction of BG is already somewhat nontrivial even for the trivial
group G = {1G }. Indeed, following it through for the 2-cell, we need to identify
the three vertices of a triangle to one vertex, and the three edges to a single edge,
embedded as a closed circle. If you do this by first identifying two edges and then try
to identify the third edge you will see why it is called the “dunce’s cap.” It is true,
but hard to visualize, that this is a contractible space. Things only get worse as we
go to higher dimensions. A better construction, due to Milnor, is to construct what
is known as a “simplicial set,” and then collapse all degenerate simplices to a point.
This gives a nicer realization of BG, but one which is homotopy equivalent to the
one we described above. For the trivial category with one object and one morphism
one just gets a topological space consisting of a single point. 238

5. The “space” BG is really only defined up to homotopy equivalence. For some G


there are very nice realizations as infinite-dimensional homogeneous spaces. This is
useful for defining things like “universal connections.” For example, one model for
238
I thank G. Segal for helpful remarks on this issue.

BZ is the humble circle R/Z = S¹. This generalizes to lattices: BZ^d = T^d, the
d-dimensional torus. On the other hand B(Z/2Z) must be infinite-dimensional but it can
be realized as RP∞ , the quotient of the unit sphere in a real infinite-dimensional
separable Hilbert space by the antipodal map. Similarly, BU (1) is CP∞ , realized as
the quotient of the unit sphere in a complex infinite-dimensional separable Hilbert
space by scaling vectors by a phase: ψ → eiθ ψ.

18. Lattice Gauge Theory

As an application of some of the general concepts of group theory we discuss briefly lattice
gauge theory.
Lattice gauge theory can be defined on any graph: There is a set of unoriented edges
Ē. Each edge can be given either orientation and we denote the set of oriented edges by E.
The set of vertices is denoted V, and there are source and target maps that tell us the vertex at the
beginning and end of each oriented edge:

s:E →V t:E →V (18.1)

We will view the union of edges Ē (i.e. forgetting the orientation) as a topological space
and denote it as Γ.
The original idea of Ken Wilson was that we could formulate Yang-Mills theory on
a “lattice approximation to Euclidean spacetime” which we visualize as a cubic lattice in
Rd for some d. Then, the heuristic idea is, that as the bond lengths are taken to zero we
get a good approximation to a field theory in the continuum. Making this idea precise is
highly nontrivial! For example, just one of the many issues that arise is that important
symmetries such as Euclidean or Poincaré symmetries of the continuum models we wish to
understand are broken, in this formulation, to crystallographic symmetries.

18.1 Some Simple Preliminary Computations


A rather trivial part of the idea is to notice the following: Suppose we have a field theory
on Rd of fields
φ : Rd → T (18.2)
where T is some “target space.” Then if we consider the embedded hypercubic lattice:

Λa := {a(n1 , . . . , nd ) ∈ Rd |ni ∈ Z} (18.3)

and we restrict φ to Λa then at neighboring vertices the value of φ will converge as a → 0:

$$\lim_{a \to 0} \phi(\vec x_0 + a \hat e_\mu) = \phi(\vec x_0) \qquad (18.4)$$

where $\hat e_\mu$, µ = 1, . . . , d, is a unit vector in the µth direction. Moreover, if $\phi : \mathbb{R}^d \to T$ is differentiable and T is a linear space then

$$\lim_{a \to 0} a^{-1}\bigl(\phi(\vec x_0 + a \hat e_\mu) - \phi(\vec x_0)\bigr) = \partial_\mu \phi(\vec x_0) \qquad (18.5)$$

and so on.
In lattice field theory we attempt to go the other way: We assume that we have fields
defined on a sequence of lattices Λa ⊂ Rd and try to take an a → 0 limit to define a
continuum field theory.
Here is a simple paradigm to keep in mind: 239 Consider the one-dimensional lattice
Z, but it is embedded in the real line so that bond-length is a, so Λa = {an|n ∈ Z} ⊂ R.
Our degrees of freedom will be a real number φ` (n) at each lattice site n ∈ Z, and it will
evolve in time to give a motion φ` (n, t) according to the action:
$$S = \int_{\mathbb{R}} dt \sum_{n \in \mathbb{Z}} \Bigl[ \frac{m}{2} \dot\phi_\ell(n,t)^2 - \frac{k}{2} \bigl( \phi_\ell(n,t) - \phi_\ell(n+1,t) \bigr)^2 \Bigr] \qquad (18.6)$$

We can think of this as a system of particles of mass m fixed at the vertices of Λa with
neighboring particles connected by a spring with spring constant k. For the action to
have proper units, φ` (n, t) should have dimensions of length, suggesting it measures the
displacement of the particle in some orthogonal direction to the real line. The equations
of motion are of course:
$$m \frac{d^2}{dt^2} \phi_\ell(n,t) = k \bigl( \phi_\ell(n+1,t) - 2\phi_\ell(n,t) + \phi_\ell(n-1,t) \bigr) \qquad (18.7)$$
Now we wish to take the a → 0 limit. We assume that there is some differentiable function
φcont (x, t) such that
φcont (x, t)|x=an = φ` (n, t) (18.8)
so by Taylor expansion
$$\phi_\ell(n+1,t) - 2\phi_\ell(n,t) + \phi_\ell(n-1,t) = a^2 \frac{d^2}{dx^2} \phi_{cont} \Big|_{x=an} + O(a^3) \qquad (18.9)$$
Now suppose we scale the parameters of the Lagrangian so that
$$m = aT, \qquad k = \frac{v^2 T}{a} \qquad (18.10)$$
then, if the limits really exist, the continuum function φcont (x, t) must satisfy the wave
equation:
$$\frac{d^2}{dt^2} \phi_{cont} - v^2 \frac{d^2}{dx^2} \phi_{cont} = 0 \qquad (18.11)$$
whose general solution is
$$\Phi_{left}(x + vt) + \Phi_{right}(x - vt) \qquad (18.12)$$
The general solution is described by arbitrary wavepackets traveling to the left and right
along the real line. (We took v > 0 here.) We can also see this at the level of the Lagrangian
since if φ` (n, t) is well-approximated by a continuum function φcont (x, t) then
$$S \to T \int_{\mathbb{R}} dt \int_{\mathbb{R}} dx \left[ \frac{1}{2} \Bigl( \frac{d}{dt} \phi_{cont} \Bigr)^2 - \frac{v^2}{2} \Bigl( \frac{d}{dx} \phi_{cont} \Bigr)^2 \right] + O(a) \qquad (18.13)$$

Remarks
239 Here we will just latticize the spatial dimension of a 1+1 dimensional field theory. In the rest of the section we latticize spacetime with Euclidean signature.

1. In the lattice theory there will certainly be sequences of field configurations φ_lattice(n, t)
that have no good continuum limit. The idea is that these are unimportant to the
physics because they have huge actions, so their contributions to the path integral
are negligible in the continuum limit.

2. Keeping in mind the interpretation of φcont (x, t) as a height in a direction orthogonal


to the real axis, we see that we are describing a string of tension T .
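The following is a rough numerical illustration of the paradigm above (the lattice spacing, pulse shape, and all parameter values are invented for the example): integrating the lattice equations of motion (18.7) with the scaling (18.10) produces a pulse that travels at speed close to v, as the continuum wave equation (18.11) predicts.

```python
import numpy as np

# Integrate the lattice equations of motion (18.7) with m = a*T, k = v^2*T/a and watch
# an initial pulse propagate; with zero initial velocity it splits into two halves moving
# left and right at speed ~ v.
a, v, T = 0.05, 1.0, 1.0
m, k = a * T, v**2 * T / a
N = 2000
x = a * np.arange(N)
phi = np.exp(-((x - 20.0) ** 2) / 0.5)          # initial pulse centered at x = 20
phi_old = phi.copy()                             # zero initial velocity
dt = 0.2 * a / v                                 # stable step for the explicit scheme

def accel(f):
    return (k / m) * (np.roll(f, -1) - 2 * f + np.roll(f, 1))

t_final = 15.0
for _ in range(int(t_final / dt)):
    phi_new = 2 * phi - phi_old + dt**2 * accel(phi)
    phi_old, phi = phi, phi_new

right_half = x > 25.0
peak = x[right_half][np.argmax(phi[right_half])]
print(f"right-moving peak near x = {peak:.2f}, expected roughly {20.0 + v * t_final:.2f}")
```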

18.2 Gauge Group And Gauge Field


In lattice gauge theory we choose a group G - known as the gauge group. For the moment
it can be any group. The dynamical degree of freedom is a gauge field, or more precisely,
the dynamical object is the gauge equivalence class or isomorphism class of the gauge field.
This will be defined below.
In mathematics, a gauge field is called a connection.
To give the definition of a connection let P be the set of all connected open paths in
Γ. For example, we can think of it as the set of continuous maps γ : [0, 1] → Γ. Since we
are working on a graph you can also think of a path γ as a sequence of edges e1 , e2 , . . . , ek
such that
t(ei ) = s(ei+1 ) 1≤i≤k−1 (18.14)
(We also allow the trivial path γ_v(t) = v at a fixed vertex v, which traverses no edges.)
However, the former definition is superior because it generalizes to connections on other
topological spaces.
Now, by definition, a connection is just a map

U : P → G, (18.15)

which satisfies the composition law: If we concatenate two paths γ1 and γ2 to make a path
γ1 ◦ γ2 , so that the concatenated path begins at γ1 (0) and ends at γ2 (1) and such that
γ1 (1) = γ2 (0), that is, the end of γ1 is the beginning of γ2 , then we must have:

U(γ1 ◦ γ2 ) = U(γ1 )U(γ2 ) (18.16)

If our path is the trivial path then


U(γv ) = 1G (18.17)
and if γ −1 (t) = γ(1 − t) is the path run backwards then

U(γ −1 ) = (U(γ))−1 (18.18)

♣Are these really independent conditions? ♣
Note that if the path γ is made by concatenating edges e1 , e2 , . . . , ek then

U(γ) = U(e1 )U(e2 ) · · · U(ek ) (18.19)

so, really, in lattice gauge theory it suffices to know the U(e) for the edges. If e−1 is the
edge e with the opposite orientation then

U(e−1 ) = U(e)−1 (18.20)

We will denote the space of all connections by A(Γ).
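Here is a minimal sketch (a toy example, not from the text) of a lattice connection in the sense of (18.15)-(18.20): a group element is assigned to each oriented edge of a made-up graph, with U(e⁻¹) = U(e)⁻¹, and parallel transport along a path is the ordered product of edge variables. The graph, the choice of S3, and the helper names are all assumptions for illustration.

```python
from itertools import permutations

# A connection on a small graph: edge variables in S_3 (permutation tuples), extended to
# both orientations by inversion; transport along a path multiplies them in order.
G = list(permutations(range(3)))

def mult(g, h):                                  # "apply g, then h"
    return tuple(h[g[i]] for i in range(3))

def inv(g):
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

identity = (0, 1, 2)

# A square 0-1-2-3 with one diagonal; arbitrary edge variables on one orientation.
U = {(0, 1): G[1], (1, 2): G[2], (2, 3): G[3], (3, 0): G[4], (0, 2): G[5]}
U.update({(b, a): inv(g) for (a, b), g in list(U.items())})

def transport(path):
    """Parallel transport U(gamma) along a path given as a list of vertices."""
    g = identity
    for a, b in zip(path, path[1:]):
        g = mult(g, U[(a, b)])
    return g

gamma1, gamma2 = [0, 1, 2], [2, 3, 0]
# Composition law (18.16): transport along a concatenation is the product of transports.
assert transport(gamma1 + gamma2[1:]) == mult(transport(gamma1), transport(gamma2))
# Reversal law (18.18): the reversed path gives the inverse group element.
assert transport(list(reversed(gamma1))) == inv(transport(gamma1))
print("composition and reversal laws hold for this connection")
```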

Remark: Background heuristics: For those who know something about gauge fields in
field theory we should think of U(e) as the parallel transport (in some trivialization of
our principal bundle) along the edge e. From these parallel transports along edges we can
recover the components of the gauge field. To explain more let us assume for simplicity
that G = U (N ) is a unitary group, or some matrix subgroup of U (N ).
Recall some elementary ideas from the theory of Lie groups: If α is any anti-Hermitian
matrix then exp[α] is a unitary matrix. Moreover, if α is “small” then exp[α] is close to the
identity. Conversely, if U is “close” to the identity then it can be uniquely written in the
form U = exp[α] for a “small” anti-Hermitian matrix α. Put more formally: The tangent
space to U (N ) at the identity is the (real!) vector space of N × N anti-Hermitian matrices.
(This vector space is a real Lie algebra, because the commutator of anti-Hermitian matrices
is an anti-Hermitian matrix.) Moreover, the exponential map gives a good coordinate chart
in some neighborhood of the identity of the topological group U (N ).
The poor man’s way of understanding the relation between Lie algebras and Lie groups
is to use the very useful Baker-Campbell-Hausdorff formula: If A, B are n × n matrices then
the formula gives an expression for an n × n matrix C so that

eA eB = eC (18.21)

The formula is a (very explicit) infinite set of terms all expressed in terms of multiple
commutators. The first few terms are:
$$C = A + B + \tfrac{1}{2}[A,B] + \tfrac{1}{12}[A,[A,B]] + \tfrac{1}{12}[B,[B,A]] - \tfrac{1}{24}[A,[B,[A,B]]] + \cdots \qquad (18.22)$$
The series is convergent as long as A, B are small enough (technically, such that the char-
acteristic values of Ad(A) and Ad(B) are less than 2π in magnitude). See Chapter 8 for a
full explanation. Note in particular that if we expand in small parameters $\epsilon_1, \epsilon_2$ then
$$e^{\epsilon_1 A}\, e^{\epsilon_2 B}\, e^{-\epsilon_1 A}\, e^{-\epsilon_2 B} = e^{\epsilon_1 \epsilon_2 [A,B] + \cdots} \qquad (18.23)$$
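A quick numerical check of the group-commutator expansion (18.23) (with ε1 = ε2 = ε), using randomly chosen anti-Hermitian matrices, is sketched below; the matrix size and the tolerance scaling are assumptions made for the example.

```python
import numpy as np
from scipy.linalg import expm

# For small eps, exp(eps A) exp(eps B) exp(-eps A) exp(-eps B) should approach
# exp(eps^2 [A, B]) with an error of order eps^3.
rng = np.random.default_rng(1)

def random_antihermitian(n=3):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m - m.conj().T) / 2

A, B = random_antihermitian(), random_antihermitian()
for eps in (1e-1, 1e-2, 1e-3):
    lhs = expm(eps * A) @ expm(eps * B) @ expm(-eps * A) @ expm(-eps * B)
    rhs = expm(eps**2 * (A @ B - B @ A))
    err = np.linalg.norm(lhs - rhs)
    print(f"eps = {eps:.0e}:  |lhs - rhs| = {err:.2e}  (should scale roughly like eps^3)")
```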

Now, returning to lattice gauge theory: In the usual picture of “approximating” Eu-
clidean Rd by Zd with bond-length a we can write a fundamental edge eµ (~n) as the straight
line in Rd from ~n to ~n+aêµ . If a is small and we have some suitable continuity then U(eµ (~n))
will be near the identity and we can write:

$$\mathcal{U}(e_\mu(\vec n)) = \exp[\, a A^{lattice}_\mu(\vec n) \,] \qquad (18.24)$$

for some anti-Hermitian matrix $A^{lattice}_\mu(\vec n)$. In lattice gauge theory, the connections with a good continuum limit are those such that there is a locally defined 1-form valued in N × N anti-Hermitian matrices $A^{cont}_\mu(\vec x)\, dx^\mu$ so that $A^{cont}_\mu(a\vec n) = A^{lattice}_\mu(\vec n)$.

Now, the gauge field U has redundant information in it. The reason it is useful to
include this redundant information is that many aspects of locality become much clearer

when working with A(Γ) as we will see when trying to write actions below. The redundant
information is reflected in a gauge transformation which is simply a map

f :V→G (18.25)

The idea is that if γ is a path then the gauge fields U and U′ related by the rule

$$\mathcal{U}'(\gamma) = f(s(\gamma))\, \mathcal{U}(\gamma)\, f(t(\gamma))^{-1} \qquad (18.26)$$

are deemed to be gauge equivalent, i.e. isomorphic. We denote the set of gauge transfor-
mations by G(Γ). Note that, being a function space whose target is a group, this set is a
group in a natural way. It is called the group of gauge transformations. 240 The group of
gauge transformations G(Γ) acts on A(Γ). The moduli space of gauge inequivalent fields is
the set of equivalence classes: A(Γ)/G(Γ). Mathematicians would call these isomorphism
classes of connections.
It might seem like there is no content here. Can’t we always choose f(s(γ)) to set U′(γ)
to 1? Yes, in general, except when s(γ) = t(γ), that is, when γ is a closed loop based at a
vertex, say v0. For such closed loops we are stuck: all we can do by gauge transformations
is conjugate:
$$\mathcal{U}'(\gamma) = g\, \mathcal{U}(\gamma)\, g^{-1} \qquad (18.27)$$
where g is the gauge transformation at v0 . Moreover, if we start the closed loop at another
vertex on the loop then the parallel transport is again in the same conjugacy class. Thus
there is gauge invariant information associated to a loop γ: The conjugacy class of the
U(γ). That is: The holonomy function:

HolU : LΓ → Conj(G) (18.28)

that maps the loops in Γ to the conjugacy class:

HolU : γ 7→ C(U(γ)) (18.29)

is gauge invariant: If U′ ∼ U are gauge equivalent then
$$\mathrm{Hol}_{\mathcal{U}'} = \mathrm{Hol}_{\mathcal{U}} \qquad (18.30)$$
In fact, one can show that Hol_U is a complete invariant, meaning that we have the
converse: If Hol_{U′} = Hol_U then U′ is gauge equivalent to U. Put informally:

The gauge invariant information in a gauge field, or connection, is encoded in the set of conjugacy classes associated to the closed loops in Γ.
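A rough numerical check of this statement (an invented example, not from the text): gauge-transform the edge variables around a square loop by random unitaries at the vertices, per (18.26), and confirm that the trace of the loop holonomy, a class function, does not change. The loop, the vertex labels, and the use of 2×2 unitaries are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n=2):
    q, r = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q * (np.diag(r) / np.abs(np.diag(r)))      # fix phases so q is unitary

# Edge variables around the loop 0 -> 1 -> 2 -> 3 -> 0
U = {(0, 1): random_unitary(), (1, 2): random_unitary(),
     (2, 3): random_unitary(), (3, 0): random_unitary()}

def holonomy(U):
    return U[(0, 1)] @ U[(1, 2)] @ U[(2, 3)] @ U[(3, 0)]

# Gauge transformation f : V -> G acting by U'(e) = f(s(e)) U(e) f(t(e))^{-1}
f = {v: random_unitary() for v in range(4)}
Uprime = {(a, b): f[a] @ U[(a, b)] @ f[b].conj().T for (a, b) in U}

tr_before = np.trace(holonomy(U))
tr_after  = np.trace(holonomy(Uprime))
assert np.allclose(tr_before, tr_after)
print("Wilson loop trace is gauge invariant:", np.round(tr_before, 6))
```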

240 AND IS NOT TO BE CONFUSED WITH THE gauge group G!!!!

Exercise

Show that if γ is a closed loop beginning and ending at v0 and if v1 is another vertex
on the path γ, then if γ′ describes the “same” loop but starting at v1, U(γ) and U(γ′)
are in the same conjugacy class in G.

Exercise
Consider a graph Γ which forms a star: There is one central vertex, and r “legs,” each
consisting of N_i edges radiating outward, where i = 1, . . . , r.
a.) Show explicitly that any gauge field can be gauged to U = 1.
b.) What is the unbroken subgroup of the group of gauge transformations? (That is,
what is the automorphism group of the gauge field U = 1? )

Exercise
Consider a d-dimensional hypercubic lattice with periodic boundary conditions, so that
we are “approximating a torus” which is a product of “circles” of length N a.
What is the maximal number of edges so that we can set U(e) = 1?

18.3 Defining A Partition Function


Next, to do physics, we need to define a gauge invariant action. At the most general level
this is simply a function F : A(Γ)/G(Γ) → C so that we can define a “partition function”:
$$Z = \sum_{[\mathcal{U}] \in \mathcal{A}(\Gamma)/\mathcal{G}(\Gamma)} F([\mathcal{U}]) \qquad (18.31)$$

If Γ is finite and G is finite this sum is just a finite sum. If Γ is finite and G is a finite-
dimensional Lie group then A(Γ)/G(Γ) is a finite-dimensional topological space and the
“sum” needs to be interpreted as some kind of integral. Since a connection on Γ is com-
pletely determined by its values on the elementary edges (for a single orientation) we can,
noncanonically, identify the space of all connections as

$$\mathcal{A}(\Gamma) \cong G^{|\bar{\mathcal{E}}|}. \qquad (18.32)$$

Similarly
$$\mathcal{G}(\Gamma) \cong G^{|\mathcal{V}|} \qquad (18.33)$$

Now we need a way of integrating over the group. If G is a finite group and F : G → C
is a function then
$$\int_G F \, d\mu := \frac{1}{|G|} \sum_{g \in G} F(g) \qquad (18.34)$$

This basic idea can be generalized to Lie groups. A Lie group is a manifold and we define
a measure on it dµ. (If G is a simple Lie group then there is a canonical choice of measure
up to an overall scale.) As a simple example, consider U(1) = {e^{iθ}}; then the integration is
$$\int_0^{2\pi} F(e^{i\theta}) \, \frac{d\theta}{2\pi} \qquad (18.35)$$
In all cases, the crucial property of the group integration is that, for all h we have
$$\int_G F(gh)\, d\mu(g) = \int_G F(hg)\, d\mu(g) = \int_G F(g)\, d\mu(g) \qquad (18.36)$$

This property defines what is called a left-right-invariant measure. It is also known as the
Haar measure.
In general the Haar measure is only defined up to an overall scale. In the above
examples we chose the normalization so that the volume of the group is 1.
Now, choosing a left-right-invariant measure we can define:
$$Z = \frac{1}{\mathrm{vol}\,(\mathcal{G}(\Gamma))} \int_{\mathcal{A}(\Gamma)} \hat F(\mathcal{U}) \, d\mu_{\mathcal{A}(\Gamma)} \qquad (18.37)$$

where F̂ is a lifting of F to a G(Γ)-invariant function on A(Γ) and dµ_{A(Γ)} is the Haar
measure on $G^{|\bar{\mathcal{E}}|}$ induced by a choice of Haar measure on G. It is gauge invariant because
the Haar measure is left- and right-invariant.
If we want to impose locality then it is natural to have F̂ (U) depend only on the local
gauge invariant data. This motivates us to consider “small” loops and consider a class
function.
In general, a class function on a group G is a function F : G → C such that F (hgh−1 ) =
F (g) for all h ∈ G. We should clearly take F̂ to be some kind of class function. A natural
source of class functions are traces in representations, for if ρ : G → GL(N, C) is a matrix
representation then χ(g) := Trρ(g) is a class function by cyclicity of the trace. (This class
function is called the character of the representation.)
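As a small illustration of the class-function property (a toy example, not from the text), one can take the 3-dimensional permutation representation of S3, compute its character as the trace of permutation matrices, and verify numerically that it is constant on conjugacy classes.

```python
import numpy as np
from itertools import permutations

# The character chi(g) = Tr rho(g) of the permutation representation of S_3 is a class
# function: chi(h g h^{-1}) = chi(g) for all g, h, by cyclicity of the trace.
S3 = list(permutations(range(3)))

def rho(g):
    """Permutation matrix with rho(g) e_j = e_{g(j)}."""
    m = np.zeros((3, 3))
    for j in range(3):
        m[g[j], j] = 1
    return m

def chi(g):
    return np.trace(rho(g))

def mult(g, h):
    """(g h)(j) = g(h(j)), so that rho(g h) = rho(g) rho(h)."""
    return tuple(g[h[j]] for j in range(3))

def inv(g):
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

for g in S3:
    for h in S3:
        assert np.allclose(rho(mult(g, h)), rho(g) @ rho(h))       # homomorphism
        assert chi(mult(mult(h, g), inv(h))) == chi(g)              # class function
print("character values of the permutation representation of S_3:",
      sorted({chi(g) for g in S3}))
```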
The smallest closed loops we can make are the “plaquettes.” For Λ_a ⊂ R^d these would
be labeled by a pair of directions µ, ν with µ ≠ ν and would be the closed loop

$$a\vec n \to a\vec n + a\hat e_\mu \to a\vec n + a\hat e_\mu + a\hat e_\nu \to a\vec n + a\hat e_\nu \to a\vec n \qquad (18.38)$$

Let us denote this plaquette as $p_{\mu\nu}(\vec n)$. ♣FIGURE NEEDED HERE! ♣
Given a class function F : G → C we can form a partition function by taking
$$\hat F(\mathcal{U}) := e^{-S(\mathcal{U})} := e^{-\sum_p S(p)} \qquad (18.39)$$

where we have summed over all plaquettes in the exponential to make this look more like a
discrete approximation to a field theory path integral, and the action S(p) of a plaquette p
is some class function applied to U(p). If G is a continuous group then we need to interpret
the sum over A(Γ)/G(Γ) as some kind of integral, as discussed above.
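For a finite gauge group the sums above can be evaluated by brute force. The following sketch (an invented toy example, not from the text) does this for G = Z/2 on a single square plaquette with a Wilson-type action, and compares with the hand-gauge-fixed answer.

```python
import numpy as np
from itertools import product

# Partition function (18.37)-(18.39) for G = Z/2 = {+1, -1} on one square plaquette,
# with the one-dimensional representation rho(g) = g and S(p) = K (1 - U(p)).
K = 0.7
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n_vertices = 4

Z = 0.0
for config in product((+1, -1), repeat=len(edges)):
    U = dict(zip(edges, config))
    plaquette = U[(0, 1)] * U[(1, 2)] * U[(2, 3)] * U[(3, 0)]
    Z += np.exp(-K * (1 - plaquette))
Z /= 2 ** n_vertices                      # divide by vol(G(Gamma)) = |G|^{|V|}

# Gauge fixing by hand: three tree edges can be set to +1, leaving only the plaquette
# holonomy +1 or -1, each occurring |G|^3 = 8 times out of 16 configurations.
Z_exact = (1 + np.exp(-2 * K)) / 2
assert np.isclose(Z, Z_exact)
print(f"Z = {Z:.6f}  (matches the gauge-fixed count {Z_exact:.6f})")
```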

Figure 47: A small plaquette in a tangent plane with coordinates (x, y), centered on the point (x, y). The holonomy around the plaquette, to leading order in an expansion in small values of the bond-length a, is governed by the curvature tensor evaluated on that area element.

Remark: More background heuristics: For those who know something about gauge fields
in field theory we should think of the parallel transport U(p) around a plaquette p as
defining the components of the curvature on a small area element dxµ ∧ dxν at some point
~x0 = a~n (in some framing). Indeed, using the idea that

$$\mathcal{U}(e_\mu(\vec n)) \sim \exp[\, a A^{cont}_\mu |_{\vec x = a\vec n} \,] \qquad (18.40)$$

we can try to take a “limit” where a → 0. The plaquette pµν (~n) is two-dimensional so,
temporarily choosing coordinates so that µ = 1 and ν = 2 we can write the plaquette gauge
group element as
$$e^{a A_1(x,\, y - \frac{1}{2}a)}\; e^{a A_2(x + \frac{1}{2}a,\, y)}\; e^{-a A_1(x,\, y + \frac{1}{2}a)}\; e^{-a A_2(x - \frac{1}{2}a,\, y)} \qquad (18.41)$$

See Figure 47. Now, using the BCH formula 241 we define the fieldstrength of the gauge
field or, equivalently, the curvature of the connection by

U(pµν (~n)) = exp[a2 Fµν + O(a4 )] (18.44)

Here in the continuum we would have the relation:

Fµν (~x) = ∂µ Aν (~x) − ∂ν Aµ (~x) + [Aµ (~x), Aν (~x)] (18.45)

241 Warning: If you are not careful the algebra can be extremely cumbersome here! Taylor expansion of (18.41) to order a² gives:
$$e^{a A_1 - \frac{a^2}{2} \partial_2 A_1}\; e^{a A_2 + \frac{a^2}{2} \partial_1 A_2}\; e^{-a A_1 - \frac{a^2}{2} \partial_2 A_1}\; e^{-a A_2 + \frac{a^2}{2} \partial_1 A_2} \qquad (18.42)$$
We only need to keep the first commutator term in the BCH formula if we are working to order a², so we get
$$e^{a^2 (\partial_1 A_2 - \partial_2 A_1 + [A_1, A_2]) + O(a^3)} \qquad (18.43)$$

A standard action used in lattice gauge theory in the literature is constructed as follows:
First, choose a finite-dimensional unitary representation of G, that is, a group homo-
morphism
ρ : G → U (r) (18.46)
Next, define the action for a plaquette to be

S(p) = K(r − Re[Trρ(U(p))]) (18.47)

for some constant K. Note that the trivial gauge field has action S(p) = 0. Moreover,
every unitary matrix can be diagonalized, by the spectral theorem, with eigenvalues eiθi (p) ,
i = 1, . . . , r and then
$$S(p) = K \sum_{i=1}^{r} \bigl( 1 - \cos\theta_i(p) \bigr) = 2K \sum_{i=1}^{r} \sin^2\bigl( \theta_i(p)/2 \bigr) \qquad (18.48)$$

is clearly positive definite for K > 0. This is good for unitarity (or its Euclidean counterpart
- “reflection positivity.”)

Remarks:

1. Correlation Functions: The typical physical quantities we might want to compute are
expectation values of products of gauge invariant operators. In view of our discussion
of gauge equivalence classes of gauge fields above one very natural way to make such
gauge invariant operators is via Wilson loop operators. For these one chooses a matrix
representation R : G → GL(N, C) of G (totally unrelated to the choice we made in
defining the action) and a particular loop γ and defines:

W (R, γ)(U) := TrCN R(U(γ)) (18.49)

So, W (R, γ) should be regarded as a gauge invariant function

W (R, γ) : A(Γ) → C (18.50)

and therefore we can consider the expectation values:

$$\Bigl\langle \prod_i W(R_i, \gamma_i) \Bigr\rangle := \frac{\int_{\mathcal{A}(\Gamma)} \prod_i W(R_i, \gamma_i)\, e^{-S(\mathcal{U})}\, d\mu_{\mathcal{A}(\Gamma)}}{\int_{\mathcal{A}(\Gamma)} e^{-S(\mathcal{U})}\, d\mu_{\mathcal{A}(\Gamma)}} \qquad (18.51)$$

2. Yet more background heuristics: For those who know something about gauge fields
in field theory we can begin to recognize something like the Yang-Mills action if we
use (18.44) and write ♣There is a bit of a cheat here since you did not work out the plaquette to order a⁴. ♣

$$\sum_{p_{\mu\nu}(\vec n)} S(p) = K \sum_{p_{\mu\nu}(\vec n)} \bigl( r - \mathrm{Re}[\mathrm{Tr}\,\rho(\mathcal{U}(p_{\mu\nu}(\vec n)))] \bigr) \to -\frac{1}{2} K a^4 \sum_{\vec n \in \mathbb{Z}^d} \sum_{\mu \neq \nu} \mathrm{Tr}\,\rho(F_{\mu\nu}(a\vec n))^2 \qquad (18.52)$$

The heuristic limit (18.52) is to be compared with the Yang-Mills action
$$S_{YM} = -\frac{1}{2 g_0^2} \int_X d^d x \sqrt{\det g}\; g^{\mu\lambda} g^{\nu\rho}\, \mathrm{Tr}\, F_{\mu\nu} F_{\lambda\rho} \qquad (18.53)$$
where here we wrote it in Euclidean signature on a Riemannian manifold M . The
trace is in some suitable representation and the normalization of the trace can be
absorbed in a rescaling of the coupling constant g0 . If we use the representation
ρ : G → U (r) then
$$\frac{1}{g_0^2} = K a^{4-d} \quad \Rightarrow \quad K = \frac{a^{d-4}}{g_0^2} \qquad (18.54)$$
The constant K must be dimensionless so that d = 4 dimensions is selected as special.
For d = 4 the Yang-Mills coupling g02 is dimensionless. It has dimensions of length
to a positive power for d > 4 and length to a negative power for d < 4. To take the
continuum limit we should hold g02 fixed and scale K as above as a → 0.

3. Very important subtlety in the case d = 4 Actually, if one attempts to take the
limit more carefully, the situation becomes more complicated in d = 4, because in
quantum mechanics there are important effects known as vacuum fluctuations. What
is expected to happen (based on continuum field theory) is that, if we replace K by
g −2 (a) and allow a-dependence then we can get a good limit of, say, correlation
functions of Wilson loop vev’s if we scale g 2 (a) so that
$$\frac{8\pi^2}{g^2(a_1)} = \frac{8\pi^2}{g^2(a_2)} + \beta \log\frac{a_1}{a_2} + O(g^2(a_2)) \qquad (18.55)$$
where there are higher order terms in the RHS in an expansion in g 2 (a2 ). Here β
is a constant, depending on the gauge group G and other fields in the theory. For
G = SU (n) we have the renowned result of D. Gross and F. Wilczek, and of D.
Politzer that
$$\beta = -\frac{11}{3}\, n \qquad (18.56)$$
As long as β < 0 we see that g 2 (a2 ) → 0 as a2 → 0. This is known as asymptotic
freedom. It has the good property that as we attempt to take a2 → 0 the higher
order terms on the RHS are at least formally going to zero.

4. One can therefore ask, to what extent is this continuum limit rigorously defined and
how rigorously has (18.55) been established from the lattice gauge theory approach.
My impression is that it is still open. Two textbooks on this subject are:
1. C. Itzykson and J.-M. Drouffe, Statistical Field Theory, Cambridge
2. M. Creutz, Quarks, gluons, and lattices, Cambridge

5. Phases and confinement. Many crucial physical properties can be deduced from
Wilson loop vev’s. In Yang-Mills theory a crucial question is whether, for large
planar loops γ, ⟨W(R, γ)⟩ decays like exp[−T·Area(γ)] or like exp[−µ·Perimeter(γ)]. If it
decays like the area one can argue that quarks will be confined. See S. Coleman,
Aspects Of Symmetry, for a crystal clear explanation.

6. Including quarks and QCD. The beta function is further modified if there are “mat-
ter fields” coupling to the gauge fields. If we introduce nf Dirac fermions in the
fundamental representation of SU(n) then (18.56) is modified to:
$$\beta = -\Bigl( \frac{11}{3}\, n - \frac{2}{3}\, n_f \Bigr) \qquad (18.57)$$
The theory of the strong nuclear force between quarks and gluons is based on n = 3
and nf = 6. Actually, there is a strong hierarchy of quark masses so for low energy
questions nf = 2 (for “up” and “down” quarks) is more relevant.

7. There are very special situations in which β = 0 and in fact all the higher terms
on the RHS of the “renormalization group equation” (18.55) vanish. These lead to
scale-invariant theories, and in good cases, to conformal field theories. In the modern
viewpoint on field theory, these conformal field theories are the basic building blocks
of all quantum field theories.

18.4 Hamiltonian Formulation


EXPLAIN HILBERT SPACE FOR 1+1 CASE IS L2 (G).

Figure 48: A triangulated surface. Figure from Wikipedia.

18.5 Topological Gauge Theory


A very popular subject in discussions of topological phases of matter is a set of models
known as “topological gauge theories.” In general, topological field theories are special

classes of field theories that are independent of distances in spacetime. They focus on the
topological aspects of physics. A formal mathematical definition is that it is a functor from
some bordism category to, say, the category Vectκ .
If G is a finite group and we are working on a smooth manifold then there can be no
curvature tensor, so all gauge fields are “flat.” They can still be nontrivial since U(γ) can
still be nontrivial for homotopically nontrivial loops. The simplest example would be 0 + 1
dimensional Yang-Mills theory on a circle. If the action is literally zero then the partition
function is just
$$Z = \frac{1}{|G|} \sum_{g \in G} 1 \qquad (18.58)$$

Recalling our discussion of the class equation we recognize that the partition function can be written as:
$$Z = \sum_{\text{c.c.}} \frac{1}{|Z(g)|} \qquad (18.59)$$
where we sum over conjugacy classes in the group and weight each class by one over the
order of the centralizer of some (any) representative of that class. This second form of the
sum can be interpreted as a sum over the isomorphism classes of principal G-bundles over
the circle, weighted by one over the automorphism group of the bundle.
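A tiny brute-force check of the agreement between (18.58) and (18.59) (an illustrative sketch, with the specific groups chosen by us) is given below: for symmetric groups realized as permutation tuples, the sum of 1/|Z(g)| over conjugacy classes indeed equals 1.

```python
from itertools import permutations
from fractions import Fraction

# Verify that the class-equation form (18.59) of Z equals (1/|G|)*|G| = 1 for S_3 and S_4.
def mult(g, h):
    return tuple(g[h[i]] for i in range(len(g)))

def inv(g):
    out = [0] * len(g)
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

for n in (3, 4):
    G = list(permutations(range(n)))
    seen, classes = set(), []
    for g in G:                                   # partition G into conjugacy classes
        if g in seen:
            continue
        cls = {mult(mult(h, g), inv(h)) for h in G}
        classes.append(cls)
        seen |= cls
    # |Z(g)| = |G| / |conjugacy class of g| by the orbit-stabilizer theorem
    Z_classes = sum(Fraction(1, len(G) // len(cls)) for cls in classes)
    assert Z_classes == 1
    print(f"S_{n}: {len(classes)} conjugacy classes, sum of 1/|Z(g)| =", Z_classes)
```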
For those who know something about gauge theory note that this illustrates a very
general principle: In the partition function of a gauge theory we sum over all the iso-
morphism classes of bundles with connection: We weight the bundle with connection by a
gauge invariant functional divided by the order of the automorphism group of the bundle
with connection.
It is also worth remarking that, quite generally in field theory, the partition function
on a manifold of the form X × S 1 can be interpreted as a trace in a Hilbert space. With
proper boundary conditions for the “fields” around S 1 we simply have

$$Z(X \times S^1) = \mathrm{Tr}_{\mathcal{H}(X)}\, e^{-\beta H} \qquad (18.60)$$

where β is the length of the circle. In a topological theory the Hamiltonian H = 0, so we


just get the dimension of the Hilbert space associated to the spatial slice X. In the case of
Yang-Mills in 0 + 1 dimensions we see that the Hilbert space associated to a point is just
H = C.
In lattice models of topological gauge theories in higher dimensions we insert the
gauge-invariant function
$$\prod_p \delta(\mathcal{U}(p)) \qquad (18.61)$$

where δ(g) is the Dirac delta function relative to the measure dµ(g) we chose on G, and is
concentrated at g = 1G . Here we take the product over all plaquettes that are meant to be
“filled in” in the continuum limit. That means that the parallel transport around “small”
loops defined by plaquettes will be trivial. This does not mean that the gauge field is trivial!
For example if we consider a triangulation of a compact surface or higher dimensional
manifold with nontrivial fundamental group then there can be nontrivial holonomy around

homotopically nontrivial loops. In general, a connection, or gauge field, such that U(γ) = 1
for homotopically trivial loops (this is equivalent to the vanishing of the curvature 2-form Fµν) is known as a flat connection or flat gauge field.
we sum over (isomorphism classes of) flat connections.
Note that (18.61) is just part of the definition of a topological gauge theory. We want
to do this so that physical quantities only depend on topological aspects of the theory.
In standard Yang-Mills theory ⟨W(R, γ)⟩ will depend on lots of details of γ. Indeed,
one definition of the curvature is how W(R, γ) responds to small deformations of γ. In
topological gauge theories we want
$$\Bigl\langle \prod_i W(R_i, \gamma_i) \Bigr\rangle \qquad (18.62)$$

to be independent of (nonintersecting!) γi under homotopy. Therefore, our measure should


be concentrated on flat gauge fields, at least in some heuristic sense. In lattice topological
gauge theory we do this by hand.

Remark: In general, flat gauge fields for a group G on a manifold M are classified,
up to gauge equivalence by the conjugacy classes of homomorphisms Hom(π1 (M, x0 ), G).
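For the two-torus, where π1 = Z², this classification reduces to counting commuting pairs in G modulo simultaneous conjugation. The enumeration below (a made-up example using G = S3) illustrates the Remark directly.

```python
from itertools import permutations, product

# Flat G-bundles on T^2: commuting pairs (a, b) in G, i.e. homomorphisms Z^2 -> G,
# modulo simultaneous conjugation.  Counted here by brute force for G = S_3.
def mult(g, h):
    return tuple(g[h[i]] for i in range(len(g)))

def inv(g):
    out = [0] * len(g)
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

G = list(permutations(range(3)))
commuting_pairs = [(a, b) for a, b in product(G, G) if mult(a, b) == mult(b, a)]

orbits, seen = 0, set()
for pair in commuting_pairs:
    if pair in seen:
        continue
    orbits += 1
    a, b = pair
    seen |= {(mult(mult(h, a), inv(h)), mult(mult(h, b), inv(h))) for h in G}
print(f"{len(commuting_pairs)} commuting pairs, {orbits} flat S_3-bundles on T^2")
```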

For a flat gauge field, the standard Wilson action we discussed above will simply vanish.
We can get a wider class of models by using group cocycles. This was pointed out in the
paper
R. Dijkgraaf and E. Witten, “Topological Gauge Theory And Group Cohomology,”
Commun.Math.Phys. 129 (1990) 39.
and topological gauge theories that make use of group cocycles for the action are now
known as Dijkgraaf-Witten models.
For simplicity we now take our group G to be a finite group. Let us start with a
two-dimensional model. We can view Γ as a triangulation of an oriented surface M as in
Figure 48. We want a local action, so let us restrict to a flat gauge field on a triangle as in
Figure 45. We want to assign the local “Boltzmann weight.” It will be a function:

W : G × G → C∗ (18.63)

(If we wish to match to some popular physical theories we might take it to be U (1)-valued.
The distinction will not matter for anything we discuss here.) Now referring to Figure 45
we assign the weight
W (g1 , g2 ) (18.64)
to this triangle. But now we have to decide if we are to use this, or W (g2 , (g1 g2 )−1 ) or
W ((g1 g2 )−1 , g1 ). In general these complex numbers will not be equal to each other. So we
number the vertices 1, . . . , |V| and then, for any triangle T, we start with the two vertices
carrying the smallest numbers; call the resulting weight W(T). This will define an orientation that might or
might not agree with that on the surface M. Let ε(T) = +1 if it agrees and ε(T) = −1 if

it does not. Then the Boltzmann weight for a flat gauge field configuration U on the entire
surface is defined to be
$$W(\mathcal{U}) := \prod_T W(T)^{\epsilon(T)} \qquad (18.65)$$

Now, if this weight is to be at all physically meaningful we definitely want the depen-
dence on all sorts of choices to drop out.

Figure 49: A local change of triangulation of type I.

Now, one thing we definitely want to have is independence of the choice of triangulation.
A theorem of combinatorial topology states that any two triangulations can be related by a
sequence of local changes of type I and type II illustrated in Figure 49 and 50, respectively.
We see that the invariance of the action under type I requires:

W (g1 , g2 )W (g1 g2 , g3 ) = W (g1 , g2 g3 )W (g2 , g3 ) (18.66)

and this is the condition that W should be a 2-cocycle. Similarly, the change of type II
doesn’t matter provided

W (g1 , g2 ) = W (g1 , g2 g3 )W (g2 , g3 )W (g1 g2 , g3 )−1 (18.67)

which is again guaranteed by the cocycle equation! This strongly suggests we can get a
good theory by using a 2-cocycle, and that is indeed the case. But we need to check some
things first:

Figure 50: A local change of triangulation of type II.

1. The dependence on the labeling of the vertices drops out using an argument based
on topology we haven’t covered. This can be found in the Dijkgraaf-Witten paper.
Similarly, if W is changed by a coboundary then we modify
$$W(g_1, g_2) \to W(g_1, g_2)\, \frac{t(g_1)\, t(g_2)}{t(g_1 g_2)} \qquad (18.68)$$
that is, we modify the weight by a factor based on a product around the edges. When
multiplying the contributions of the individual triangles to get the total weight (18.65)
the edge factors will cancel out from the two triangles sharing a common edge.

2. The action is not obviously gauge invariant, since it is certainly not true in general
that W (g1 , g2 ) is equal to

W (h(v1 )−1 g1 h(v2 ), h(v2 )−1 g2 h(v3 )) (18.69)

for all group elements h(v1 ), h(v2 ), h(v3 ) ∈ G. The argument that, nevertheless, the
total action (18.65) is invariant is given (for the d = 3 case) in the Dijkgraaf-Witten
paper around their equation (6.29). ♣Cop out. Give a better argument. Explain that Chern-Simons actions change by boundary terms and it is too much to hope for exact local gauge invariance. ♣

3. The idea above generalizes to define a topological gauge theory on oriented manifolds
in d dimensions for any d, where one uses a d-cocycle on G with values in C∗ (or
U(1)). These topological gauge theories are known as “Dijkgraaf-Witten theories.”
The Boltzmann weight W represents a topological term in the action that exists and
is nontrivial even for flat gauge fields.

4. The invariance under the change of type II in Figure 50, which can be generalized
to all dimensions, is particularly interesting. It means that the action is an “exact
renormalization group invariant” in the sense reminiscent of block spin renormalization. 242 This fits in harmoniously with the alleged metric-independence of the
topological gauge theory.

5. The case d = 3 is of special interest, and was the main focus of the original Dijkgraaf-Witten paper. In this case we have constructed a “lattice Chern-Simons invariant,” and the theory with a cocycle $[W] \in H^3(BG, U(1)) = H^3_{\text{group cohomology}}(G, U(1))$ is a Chern-Simons theory for gauge group G. In the case of G finite one can show that $H^3(BG, U(1)) \cong H^4(BG; \mathbb{Z})$. In general the level of a Chern-Simons theory is valued in $H^4(BG; \mathbb{Z})$ for all compact Lie groups G.
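As a tiny numerical check of the type-I invariance condition (18.66), the sketch below (with a cocycle chosen by us, not taken from the text) verifies that a specific U(1)-valued weight on Z/2 × Z/2 satisfies the 2-cocycle identity.

```python
from itertools import product

# W((a1,a2),(b1,b2)) = (-1)^(a2*b1) on G = Z/2 x Z/2; this is in fact a representative
# of a nontrivial class in H^2(G, U(1)), and we check the 2-cocycle condition (18.66).
G = list(product((0, 1), repeat=2))

def gmul(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def W(a, b):
    return (-1) ** (a[1] * b[0])

for g1, g2, g3 in product(G, repeat=3):
    assert W(g1, g2) * W(gmul(g1, g2), g3) == W(g1, gmul(g2, g3)) * W(g2, g3)
print("W satisfies the 2-cocycle condition (18.66) on Z/2 x Z/2")
```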

ALSO DISCUSS HAMILTONIAN VIEWPOINT! Check out D. Harlow and H. Ooguri, Appendix F of https://arxiv.org/pdf/1810.05338.pdf

19. Example: Symmetry Protected Phases Of Matter In 1 + 1 Dimensions

242 The idea of block spin renormalization, invented by Leo Kadanoff, is that we impose some small lattice spacing a as a UV cutoff and try to describe an effective theory at ever larger distances. So, we block spins together in some way, define an effective spin, and then an effective action
$$e^{-S_{eff}} := \sum_{\text{fixed effective spins}} e^{-S(\text{spins})} \qquad (18.70)$$
The hope is that at long distances, with ever larger blocks, the “relevant” parts of S_eff converge to a useful infrared field theory description.

