
Ellis Horwood Series MATHEMATICS AND ITS APPLICATIONS

APPLIED
ABSTRACT ALGEBRA
Ellis Horwood Series
MATHEMATICS AND ITS APPLICATIONS
Series Editor: Professor G. M. BELL,
Chelsea College, University of London
Operational Research Editor:
Professor B. W. CONOLLY,
Chelsea College, University of London

APPLIED ABSTRACT ALGEBRA


KI HANG KIM and FRED W. ROUSH, Professor
and Assistant Professor of Mathematics, Mathematics
Research Group, Alabama State University,
Montgomery, Alabama, USA

This book contains all the necessary subject matter
of abstract algebra, featuring the most recent
topics and illustrative applications: it approaches
the subject from a broad, non-numerical viewpoint.
The material has been worked as a simple and
concrete introduction, leading to a good cross-
section of applications which include voting
theory, automata theory, mathematical linguistics,
sociology, symmetries in physics and geometry,
kinship systems, geometrical constructions, and
design of experiments for statistical analysis.
The authors include newer topics connected with
relational structures, to complement the traditional
emphasis on operational structures. Delving into
their 30 years of combined teaching and research
experience, the authors organise the material into
major areas of abstract algebra, such as groups or
rings, which are divided into sections which can be
covered in one, two or three lectures. They add
exercises to test the student's comprehension,
graded in three levels to accommodate varying
mathematical backgrounds. Advanced exercises
include important results beyond those given in
the text. Among the major topics covered here,
which are neglected in other books at this level,
are binary relations, lattices, semigroups, Boolean
matrices, directed graphs, network theory, group
representation (both finite and continuous),
all important tools for the applied mathematician.

Readership: Lecturers and third and second year undergraduate
students of abstract algebra; students in computer
science, electrical engineering, civil and mechanical
engineering, the social sciences, physics, economics,
linguistics, anthropology, and experimental design.
APPLIED ABSTRACT ALGEBRA
ELLIS HORWOOD SERIES IN
MATHEMATICS AND ITS APPLICATIONS
Series Editor: Professor G. M. BELL, Chelsea College, University of London
(and within the same series)
Statistics and Operational Research
Editor: B. W. CONOLLY, Chelsea College, University of London
Baldock, G. R. & Bridgeman, T. Mathematical Theory of Wave Motion
de Barra, G. Measure Theory and Integration
Beaumont, G. P. Introductory Applied Probability
Burghes, D. N. & Borrie, M. Modelling with Differential Equations
Burghes, D. N. & Downs, A. M. Modern Introduction to Classical Mechanics and Control
Burghes, D. N. & Graham, A. Introduction to Control Theory, including Optimal Control
Burghes, D. N., Huntley, I. & McDonald, J. Applying Mathematics
Burghes, D. N. & Wood, A. D. Mathematical Models in the Social, Management
and Life Sciences
Butkovskiy, A. G. Green’s Functions and Transfer Functions Handbook
Butkovskiy, A. G. Structure of Distributed Systems
Chorlton, F. Textbook of Dynamics, 2nd Edition
Chorlton, F. Vector and Tensor Methods
Conolly, B. Techniques in Operational Research
Vol. 1: Queueing Systems
Vol. 2: Models, Search, Randomization
Dunning-Davies, J. Mathematical Methods for Mathematicians, Physical Scientists
and Engineers
Eason, G., Coles, C. W., Gettinby, G. Mathematics and Statistics for the Bio-sciences
Exton, H. Handbook of Hypergeometric Integrals
Exton, H. Multiple Hypergeometric Functions and Applications
Faux, I. D. & Pratt, M. J. Computational Geometry for Design and Manufacture
Goodbody, A. M. Cartesian Tensors
Goult, R. J. Applied Linear Algebra
Graham, A. Kronecker Products and Matrix Calculus: with Applications
Graham, A. Matrix Theory and Applications for Engineers and Mathematicians
Griffel, D. H. Applied Functional Analysis
Hoskins, R. F. Generalised Functions
Hunter, S. C. Mechanics of Continuous Media, 2nd (Revised) Edition
Huntley, I. & Johnson, R. M. Linear and Nonlinear Differential Equations
Jaswon, M. A. & Rose, M. A. Crystal Symmetry: The Theory of Colour Crystallography
Jones, A. J. Game Theory
Kemp, K. W. Computational Statistics
Kosinski, W. Field Singularities and Wave Analysis in Continuum Mechanics
Marichev, O. I. Integral Transforms of Higher Transcendental Functions
Meek, B. L. & Fairthorne, S. Using Computers
Muller-Pfeiffer, E. Spectral Theory of Ordinary Differential Operators
Nonweiler, T. R. F. Computational Mathematics: An Introduction to Numerical Analysis
Oliviera-Pinto, F. Simulation Concepts in Mathematical Modelling
Oliviera-Pinto, F. & Conolly, B. W. Applicable Mathematics of Non-physical Phenomena
Rosser, W. G. V. An Introduction to Statistical Physics
Scorer, R. S. Environmental Aerodynamics
Smith, D. K. Network Optimisation Practice: A Computational Guide
Stoodley, K. D. C., Lewis, T. & Stainton, C. L. S. Applied Statistical Techniques
Sweet, M. V. Algebra, Geometry and Trigonometry for Science Students
Temperley, H. N. V. & Trevena, D. H. Liquids and Their Properties
Temperley, H. N. V. Graph Theory and Applications
Twizell, E. H. Computational Methods for Partial Differential Equations in Biomedicine
Wheeler, R. F. Rethinking Mathematical Concepts
Whitehead, J. R. The Design and Analysis of Sequential Clinical Trials
APPLIED ABSTRACT
ALGEBRA

KI HANG KIM
Professor of Mathematics
and
FRED W. ROUSH
Assistant Professor of Mathematics
both of Mathematics Research Group, Alabama State University
Montgomery, Alabama, USA

ELLIS HORWOOD LIMITED


Publishers • Chichester

Halsted Press: a division of


JOHN WILEY & SONS
New York • Chichester • Brisbane • Ontario
First published in 1983 by
ELLIS HORWOOD LIMITED
Market Cross House, Cooper Street, Chichester, West Sussex, PO19 1EB, England
The publisher’s colophon is reproduced from James Gillison’s drawing of the ancient Market
Cross, Chichester

Distributors
Australia, New Zealand, South-east Asia:
Jacaranda-Wiley Ltd., Jacaranda Press
JOHN WILEY & SONS INC.,
GPO Box 859, Brisbane, Queensland 4001, Australia
Canada:
JOHN WILEY & SONS CANADA LIMITED
22 Worcester Road, Rexdale, Ontario, Canada
Europe, Africa:
JOHN WILEY & SONS LIMITED
Baffins Lane, Chichester, West Sussex, England
North and South America and the rest of the world:
Halsted Press: a division of
JOHN WILEY & SONS
605 Third Avenue, New York, NY 10016, USA

© 1983 Ki Hang Kim and F. W. Roush/Ellis Horwood Ltd.

British Library Cataloguing in Publication Data


Kim, Ki Hang
Applied abstract algebra.
1. Algebra
I. Title II. Roush, Fred W.
512 QA155

Library of Congress Card No. 83-217

ISBN 0-85312-563-5 (Ellis Horwood Limited, Publishers - Library Edn.)


ISBN 0-85312-612-7 (Ellis Horwood Limited, Publishers - Student Edn.)
ISBN 0-470-27441-7 (Halsted Press)

Typeset in Press Roman by Ellis Horwood Ltd.


Printed in Great Britain by Butler & Tanner, Frome, Somerset.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without the permission of Ellis Horwood Limited, Market Cross
House, Cooper Street, Chichester, West Sussex, England.
Table of Contents

Preface 7

Chapter 1. Sets and Binary Relations 9
1.1 Sets 9
1.2 Binary Relations 13
1.3 Functions 16
1.4 Order Relations 20
1.5 Boolean Matrices and Graphs 26
1.6 Arrow's Impossibility Theorem 32

Chapter 2. Semigroups and Groups 39
2.1 Semigroups 40
2.2 Generators and Relations 44
2.3 Green's Relations 50
2.4 Blockmodels 55
2.5 Finite State Machines 60
2.6 Recognition of Machine Languages by Finite State Machines 63
2.7 Groups 67
2.8 Subgroups 72
2.9 Homomorphisms 76
2.10 Permutation Groups 79
2.11 Systems of Distinct Representatives and Flows on Networks 83
2.12 Orbits and Conjugations 92
2.13 Symmetries 98
2.14 Polya Enumeration 107
2.15 Kinship Systems 111
2.16 Lattices of Subgroups 114

Chapter 3. Vector Spaces 119
3.1 Vector Spaces 120
3.2 Basis and Dimension 126
3.3 Matrices 131
3.4 Linear Transformations 138
3.5 Determinants and Characteristic Polynomials 143
3.6 Eigenvalues, Eigenvectors, Similarity 149
3.7 Symmetric and Unitary Matrices 157

Chapter 4. Rings 163
4.1 The Integers and Divisibility 164
4.2 Euclidean Domains and Factorization 168
4.3 Ideals and Congruences 173
4.4 Structure of Zₙ 176
4.5 Simple and Semisimple Rings 179

Chapter 5. Group Representations 184
5.1 The Group Ring and Representations 185
5.2 Modules and Representations 190
5.3 Irreducible Representations 194
5.4 Group Characters 197
5.5 Tensor Products 201

Chapter 6. Field Theory 210
6.1 Finite Dimensional Extensions 211
6.2 Applications of Extensions of the Rationals 215
6.3 Finite Fields 221
6.4 Coding Theory 227
6.5 Cyclic Codes 233
6.6 Latin Squares 240
6.7 Projective Planes and Block Designs 246

Open Problems 253
List of Special Symbols 254
References 259
Index 261
Preface

This book is intended for a two-semester course in abstract algebra. In addition
to the standard concepts of sets, groups, rings, fields, and so on we also give
applications. We hope that for many students this may increase the interest of
the course and help explain why abstraction and generality are useful.
Distinctive features of the text include the following:

(a) It emphasizes relational structures as well as operational structures. A
unified approach is given by the parallel treatment of basic ideas valid for
all types of structures.
(b) The book is comprehensive in treating important topics not usually
emphasized, and the different results are interwoven in a way that reveals
the interconnections between the various concepts.
(c) It is written in a down-to-earth style, with examples giving simple
illustrations of most concepts.
(d) It is organized into a few chapters dealing with major areas of abstract
algebra, such as groups or rings, these being divided into many sections
which can be covered in 1-3 lectures.
(e) It contains exercises graded in three levels in order to accommodate students
with varying mathematical backgrounds.
(f) The book contains applications to many areas which illustrate the usefulness
of abstract algebra.
(g) It contains some very important open problems in algebra for which students
can understand the problem after this course. This will give some indication
of what research is like in mathematics.

To give a variety of applications we have put in several sections on semigroups
and their applications and group representations. However, the student
or reader need not go through all this material. The selection should be left up to
the interest of the individual or instructor. The reader is assumed to have had
calculus and some exposure to linear algebra or matrix theory.
8 Preface

We have put in exercises immediately after each section, divided into three
levels. It is strongly recommended that each student go through all the Level 1
exercises. Level 2 exercises require some experience in proving theorems, and
develop this skill. Level 3 exercises are in many cases very challenging, and
include important theorems not in the text itself. They are for the student
who is deeply interested in mathematics and wants to go beyond the material
presented in the text itself.
The authors are happy to acknowledge an unknown official referee for
numerous very constructive criticisms and suggestions. Also, both authors are
grateful to Mrs Kay Roush for her diligent and accurate proofreading.

Ki Hang Kim, Mundok, Korea
Fred William Roush, Montgomery, Alabama
CHAPTER 1

Sets and binary relations

In this chapter we cover the most basic types of algebraic structures: binary
relations on sets. We first review material on the theory of sets itself.
In contrast to binary operations like addition and multiplication, which from
two things yield a third (a sum or product), a binary relation only compares two
quantities, as x < y or x = y.
There are three kinds of binary relations we consider at greater length. A
function y = f(x) expresses the fact that x determines y, as a cause determines
its effect. Functions are of importance in many branches of mathematics. By
means of functions different structures can be compared.
Equivalence relations like the geometric idea of similarity express the idea
that two elements are the same in some respect.
Order relations like 'greater than' or 'X is a subset of Y' establish a comparison
of lesser to greater. The key property of order relations is transitivity: if
x ≤ y and y ≤ z then x ≤ z. There are many varieties of order relations, such
as semiorders, preorders (quasiorders), and weak orders.
Finally we prove a well-known theorem in social choice theory using those
results. The preferences of any individual of a group over a list of possible
actions of the group can be represented by an order relation. A group choice
method can be taken as a certain type of function. The theorem shows certain
kinds of group choice methods are possible only if there are no more than two
alternatives.

1.1 SETS
In applications of quantitative mathematics such as arithmetic, algebra, geometry,
calculus, differential and integral equations, and linear algebra, objects are
represented by numbers or n-tuples of numbers, which are measurements of that
object. Relationships among objects, such as joint motion under gravitational or
electromagnetic forces, are represented by relationships among numbers such as
functions or equations, or by relationships among functions.

In applications of nonquantitative mathematics objects are frequently
represented by sets or elements of sets. Mathematically little can be said about a
nonnumerical object except its relationship to other objects. Mathematics can
deal with nonnumerical relationships of a formal nature. All an object in a set
has is a name.
Sets are collections of objects. The simplest way to specify a set is to list its
elements as {orange, apple, peach} or {A, E, I, O, U}. A second way is to give a
condition describing precisely those objects which are members of the set. An
example is {x : x is a positive odd integer less than 20}.
In a sense the theory of sets (with logic) is the foundational part of
mathematics. That is, numbers can be described in terms of sets. For instance,
for the number 7 one can build a set with exactly 7 elements, and use it to
compare other sets with, to see if they have 7 elements. This takes care of
positive integers.
From positive integers one can define negative integers, fractions, and
real numbers successively. Then geometry can be done from numbers using
coordinates. So all of mathematics can be derived from set theory.
The basic operations with sets are union, intersection, and complement.

DEFINITION 1.1.1. Let F be a family (set) of sets. Then the union of the
members of F, denoted

∪_{A∈F} A   or   ∪_F A,

is the set of objects which are members of at least one set of F.

The intersection of the members of F, denoted

∩_{A∈F} A   or   ∩_F A,

is the set of all objects which are members of every set of F.

For two sets A, B the union and intersection are written A ∪ B and A ∩ B.
The set A is contained in the set B if all members of A are also members of
B. This is denoted A ⊆ B (or B ⊇ A), and it can also be said that A is a subset
of B. To prove a statement that one set (A) is contained in another (B), the
straightforward approach is to let x ∈ A and then prove x ∈ B.
If A and B have exactly the same members, we say A = B. To prove a
statement that A = B one can first prove A ⊆ B and then B ⊆ A. Or one can
show A = C and C = B for some other set C. If A ⊆ B but A ≠ B then A is
called a proper subset of B. Occasionally it is possible to show directly that
x ∈ A if and only if x ∈ B.
Suppose we are considering subsets of a fixed set. We call this set the
universal set or universe of discourse. We denote this set by U. If we are
considering plane geometric figures, U could be the set of all points of the plane.

DEFINITION 1.1.2. The complement A̅ of A ⊆ U is the set of all elements of U
not in A.

The relative complement of any two sets is also important.

DEFINITION 1.1.3. The relative complement A\B (or A − B, A ~ B) is the
set of all elements in A but not in B.

The following laws hold for the operations of set theory.

(1) Commutativity
A ∪ B = B ∪ A,  A ∩ B = B ∩ A
(2) Associativity
A ∪ (B ∪ C) = (A ∪ B) ∪ C,  A ∩ (B ∩ C) = (A ∩ B) ∩ C
(3) Distributive Laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(4) de Morgan's Laws
(A ∪ B)̅ = A̅ ∩ B̅,  (A ∩ B)̅ = A̅ ∪ B̅
(5) Idempotency
A ∪ A = A,  A ∩ A = A
(6) Double Negative
(A̅)̅ = A

These laws generalize from operations with just two sets to unions and
intersections of arbitrary families of sets.
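The book predates programming examples, but the six laws above can be checked mechanically. Here is an illustrative sketch (not from the text) using Python's built-in set operators | (union), & (intersection), and - (relative complement), on the sets used in the Level 1 exercises below; the helper name comp is our own.

```python
A = {1, 3, 5, 7}
B = {3, 4, 6, 7}
C = {4, 5, 6, 7}
U = set(range(1, 8))          # universal set: positive integers less than 8

def comp(X):
    """Complement of X relative to the universal set U."""
    return U - X

assert A | B == B | A and A & B == B & A              # (1) commutativity
assert A | (B | C) == (A | B) | C                     # (2) associativity
assert A & (B | C) == (A & B) | (A & C)               # (3) distributive law
assert comp(A | B) == comp(A) & comp(B)               # (4) de Morgan
assert A | A == A and A & A == A                      # (5) idempotency
assert comp(comp(A)) == A                             # (6) double negative
print("all six laws hold on this example")
```

By Level 3 Exercise 7 below, a law verified on these particular sets in fact holds for all sets.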

The null set (empty set) ∅ is the set { } containing no elements.

DEFINITION 1.1.4. Two sets A, B are disjoint if and only if their intersection
is the null set.

The following laws apply to the null set ∅ and the universal set U:

A ∪ ∅ = A,  A ∩ ∅ = ∅
A ∪ U = U,  A ∩ U = A

The cardinality (order) |A| of a set A is the number of elements of A, if this
number is finite. Cardinality has also been defined for infinite sets.
There is no set of everything, although a different concept, a class of all
sets, can be defined. If there were a set of all sets there would also be a set
S = {all sets which are not members of themselves}. But if S ∈ S then S ∉ S,
and if S ∉ S then S ∈ S. This is a contradiction, known as Russell's Paradox:
both S ∈ S and S ∉ S are impossible. Therefore there is no set of all sets.

Instead sets are built from specific other sets. For example, for a set A we
may define the set of all objects of A having property P. This is a subset of A
for any property P. Sets can also be constructed by unions, intersections, taking
the set of all subsets of another set, and two processes considered later: Cartesian
product and the axiom of choice.
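The construction "the set of all objects of A having property P" has a direct computational analogue: a set comprehension. The following sketch (our illustration, not the book's) forms the set-builder example from earlier in this section, {x : x is a positive odd integer less than 20}, as a subset of a given set A.

```python
A = set(range(1, 21))               # a fixed set to carve subsets out of
P = lambda x: x % 2 == 1            # property P: "x is odd"

# {x in A : P(x)} is always a subset of A, whatever property P is
subset = {x for x in A if P(x)}
assert subset <= A                  # <= tests set containment in Python
print(sorted(subset))               # the positive odd integers less than 20
```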

EXERCISES
Level 1
For sets U = {positive integers less than 8}, A = {1, 3, 5, 7}, B = {3, 4, 6, 7},
C = {4, 5, 6, 7}, compute the following:

1. A ∪ B.
2. A ∩ B.
3. A ∪ (B ∩ C).
4. A ∩ (B ∪ C).
5. (A ∩ B)\(A ∩ C).
6. Give additional examples of sets.

Level 2
1. Verify the associative law for union for a specific example (such as that
above).
2. Verify the distributive law A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for a specific
set.
3. Which laws of sets are valid for arithmetic if one replaces ∪ by +, ∩ by ×,
∅ by 0, and U by 1?
4. Give examples to show the other laws of sets fail for arithmetic.
5. Prove that if A ⊆ B and B ⊆ C then A ⊆ C.
6. Prove A̅ ∪ B̅ = (A ∩ B)̅.

Level 3
1. Prove the distributive law A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
2. Deduce the other distributive law from this using de Morgan's law.
3. How many subsets does a set of n elements have?
4. When is |A ∪ B| = |A| + |B|?
5. Give a formula for |A ∪ B| if |A ∩ B| ≠ 0.
6. Explain why the method of Venn diagrams provides adequate proofs of
equations involving at most 3 sets. (There are 8 classes of elements x in any
such 3 sets, according as x ∈ A or not, x ∈ B or not, x ∈ C or not.)
7. Show we can test any law for 3 sets by considering the example used in
Level 1 above.

1.2 BINARY RELATIONS

The Cartesian product of n sets is the set of ordered n-tuples from them,
meaning things like (1, 2, 3, 4, 7). It is used in a number of mathematical
constructions. The Cartesian product of the real numbers with itself is
the set of pairs (x, y) of coordinates in geometry, and is the basis for the name
Cartesian geometry (after René Descartes).

DEFINITION 1.2.1. The Cartesian product A₁ × A₂ × ... × Aₙ of n sets is the
set of all ordered n-tuples (x₁, x₂, ..., xₙ) for which xᵢ ∈ Aᵢ, i = 1 to n.

EXAMPLE 1.2.1. {a, b, c} × {a, b} = {(a, a), (a, b), (b, a), (b, b), (c, a), (c, b)}.

The Cartesian product of finite sets A₁, A₂, ..., Aₙ has |A₁||A₂| ... |Aₙ|
elements. It has other properties related to products, such as the distributive
laws (A ∪ B) × C = (A × C) ∪ (B × C) and C × (A ∪ B) = (C × A) ∪ (C × B).
Cartesian products occur in defining binary operations like addition and
multiplication, and in studying binary relations like = and <. A message or
sequence of n symbols each chosen from a set A, like the alphabet, is equivalent
to an element of the set A × A × ... × A = Aⁿ.
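As an illustrative sketch (ours, not the book's), the Cartesian product, the counting rule |A₁||A₂| ... |Aₙ|, and the message interpretation of Aⁿ can all be seen with Python's itertools.product:

```python
from itertools import product

A = {'a', 'b', 'c'}
B = {'a', 'b'}

# A × B as a set of ordered pairs, as in Example 1.2.1
prod = set(product(A, B))
assert len(prod) == len(A) * len(B)      # |A × B| = |A||B|
assert ('c', 'a') in prod

# Aⁿ: messages of length n over an alphabet, here {0,1}³
messages = list(product('01', repeat=3))
assert len(messages) == 2 ** 3           # |Aⁿ| = |A|ⁿ
```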

DEFINITION 1.2.2. A binary relation from a set A to a set B is a subset of
the Cartesian product A × B.

EXAMPLE 1.2.2. The subset S = {(x, y) : x² < y², x, y ∈ R} is a binary
relation. Here R denotes the set of all real numbers. It is considered a
representation of the relationship x² < y².

For every precisely defined relationship between numbers or objects there
exists a binary relation, which is the set of ordered pairs (x, y) such that x has
the relationship in question to y. For any binary relation we conversely have the
relationship that x and y are such that (x, y) belongs to the binary relation. All
the logical properties of the relationship are true of the binary relation also.
For instance the transitive property of inequality (if x < y and y < z then x < z)
is the property: if (x, y) ∈ R and (y, z) ∈ R then (x, z) ∈ R.

Two types of binary relations are equivalence relations and functions. An
equivalence relation is a relation like x = y or, in geometry, x ≅ y (congruence)
or x ~ y (similarity), which says that x, y are the same in a certain respect.

DEFINITION 1.2.3. A binary relation R on a set X is reflexive if and only
if (x, x) ∈ R for all x ∈ X. That it is symmetric means (x, y) ∈ R if and only if
(y, x) ∈ R. That it is transitive means if (x, y) ∈ R and (y, z) ∈ R then
(x, z) ∈ R.

DEFINITION 1.2.4. A reflexive, symmetric, transitive binary relation is called
an equivalence relation.

EXAMPLE 1.2.3. The relation x = y has these three properties. For all x,
x = x. If x = y then y = x. If x = y and y = z then x = z. It is an equivalence
relation.

Equivalence relations can be expressed as follows. The set X is divided into
classes. Then two members are equivalent if and only if they belong to the
same class. For the relation of equality the classes are single elements, so that
two equivalent elements must be identical.
A division of a set into disjoint nonempty subsets is called a partition.

DEFINITION 1.2.5. A family F of nonempty subsets of a set X is a partition
if and only if

(1) ∪_{A∈F} A = X
(2) for A, B ∈ F, if A ≠ B then A ∩ B = ∅

EXAMPLE 1.2.4. The set {1, 2, 3, 4, 5} can be partitioned into the subsets
{1, 3}, {2, 5}, {4}.

THEOREM 1.2.1. For any equivalence relation R on a set X, the subsets
S = {y ∈ X : (x, y) ∈ R} for x ∈ X form a partition of X. Every partition
of X arises in this way from a unique equivalence relation.

Proof. The sets S are nonempty since (x, x) ∈ R and thus x ∈ S. This also
implies ∪S contains every x, so ∪S = X. Let S, T be the equivalence classes of
x, y. Suppose S ∩ T ≠ ∅. Let z ∈ S ∩ T. Then (x, z) ∈ R and (y, z) ∈ R. For
any w ∈ S we have (x, w) ∈ R. Since (x, z) ∈ R, (z, x) ∈ R by symmetry. Since
(y, z) ∈ R and (z, x) ∈ R, (y, x) ∈ R by transitivity. Since (y, x) ∈ R and
(x, w) ∈ R, (y, w) ∈ R. So if w ∈ S then w ∈ T. Thus S ⊆ T. By
symmetry T ⊆ S. So S = T. This proves any two classes S, T which are not
disjoint are equal. Therefore the family of classes {S} is a partition.
To a partition F associate the binary relation R = {(x, y) : for some A ∈ F,
x ∈ A and y ∈ A}. This is an equivalence relation. For any x ∈ X, x ∈ A for
some A since the union of the sets A is X. Then x ∈ A and x ∈ A. So (x, x) ∈ R.
If x ∈ A and y ∈ A then y ∈ A and x ∈ A. If (x, y) ∈ R and (y, z) ∈ R then
x ∈ A, y ∈ A, y ∈ B, z ∈ B for some A, B ∈ F. Then y ∈ A ∩ B so A ∩ B ≠ ∅.
So A = B. So x ∈ A and z ∈ B = A. So (x, z) ∈ R.
It is straightforward to show that R does give rise to the partition F. Let
x ∈ A. Then it can be shown S = A. Uniqueness of R can also be shown. □
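Theorem 1.2.1 can be illustrated computationally. The following sketch (ours, not the book's) builds the equivalence relation "same parity" on a small set, forms the classes S = {y : (x, y) ∈ R}, and checks that they are disjoint and cover X.

```python
X = {1, 2, 3, 4, 5}
# an equivalence relation on X, as a set of ordered pairs: "same parity"
R = {(x, y) for x in X for y in X if x % 2 == y % 2}

def eq_class(x):
    """The class S = {y in X : (x, y) in R} of Theorem 1.2.1."""
    return frozenset(y for y in X if (x, y) in R)

classes = {eq_class(x) for x in X}

# the classes cover X ...
assert set().union(*classes) == X
# ... and any two distinct classes are disjoint
for S in classes:
    for T in classes:
        assert S == T or not (S & T)
print(classes)    # the partition {{1, 3, 5}, {2, 4}}
```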

EXERCISES
Level 1
1. Write out all partitions of {1, 2}. There are two: one with a single 2-element
subset and one with two 1-element subsets.
2. Write out all partitions of {1, 2, 3}. There are five: one with a 3-element set,
three with a 2-element set and a 1-element set, and one with three 1-element
sets.
3. Let R be the binary relation on real numbers that x, y have the same sign
or are both zero. Is this an equivalence relation? Give examples of the three
properties.
4. What is the equivalence class of 1 in this relation? It will be {x : x, 1 have the
same sign}. What is this?
5. What are the equivalence classes of −1 and of 0?
6. What partition of the real numbers corresponds to this relation?
7. List 16 binary relations (not all equivalence relations) on {1, 2}. Simply list
the possible sets of ordered pairs, such as {(1, 1), (1, 2), (2, 2)}.

Level 2
1. Show that if R is an equivalence relation on S and T ⊆ S then R gives an
equivalence relation on T.
2. What are the equivalence classes of T?
3. Show that on a Cartesian product A × B the relation (a₁, b₁) R (a₂, b₂) if
and only if a₁ = a₂ is an equivalence relation. What are the equivalence
classes?
4. Show the relation x²(1 − x²) = y²(1 − y²) is an equivalence relation.
Generalize this.
5. What are the equivalence classes of the relation in the above exercise?
6. Show the universal relation U = {(x, y) : x ∈ R, y ∈ R} is an equivalence
relation.
Level 3
1. Let S(n, k) denote the number of partitions of {1, 2, ..., n}, or any
n-element set, having exactly k distinct members. This is the same as the
number of equivalence relations on an n-element set having k classes. What
are S(n, n) and S(n, 1)? The S(n, k) are called Stirling's numbers of the
second kind.
2. Find a formula for S(n, n − 1). Describe all partitions with n − 1 classes.
3. Find a formula for S(n, 2). Describe all partitions with 2 classes.
4. Tell why S(n + 1, k) = kS(n, k) + S(n, k − 1). (Add an element to any
existing equivalence class or make it a new class.)
5. Compute S(n, k) for n ≤ 5 using this formula laid out in a table:

   n = 2:  1  1
   n = 3:  1  3  1

   Each time, to get an entry in a lower row in column k, add the entry to the
   upper left of it and k times the entry just above it.
6. How many symmetric, reflexive binary relations are there on an n-element
set? The number of transitive relations is an unsolved problem.
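The recurrence in Level 3, Exercise 4 can be turned directly into a program; the sketch below (ours, not the book's) tabulates the Stirling numbers of the second kind row by row, and can be used to check a hand-computed table such as the one in Exercise 5.

```python
def stirling2(n, k):
    """S(n, k) from the recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1),
    with S(0, 0) = 1 and S(n, k) = 0 when k = 0 < n or k > n."""
    if k == 0:
        return 1 if n == 0 else 0
    if k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

for n in range(1, 6):
    print(n, [stirling2(n, k) for k in range(1, n + 1)])
# the row for n = 3 is [1, 3, 1], matching the table in Exercise 5
```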

1.3 FUNCTIONS
A function is a relationship in which a quantity uniquely determines a second
quantity. For instance x uniquely determines x² + x + 1, but it does not
determine y if the relation is x < y.

DEFINITION 1.3.1. A partial function from S to T is a binary relation R
such that if (x, z) ∈ R and (x, y) ∈ R then y = z. The domain of a partial
function R is {x : for some y ∈ T, (x, y) ∈ R}. A function is a partial function
with domain S. The set T is called the range.

EXAMPLE 1.3.1. On the real numbers 1/x is a partial function with domain
{x : x ≠ 0}.

A partial function differs from a function only in that it may not be defined
for all x. The following are situations in which some quantity can be regarded as
a function.

EXAMPLE 1.3.2. If x causes y then y is in some sense a function of x.

EXAMPLE 1.3.3. The output of a machine is a function of its input and its
internal state.

EXAMPLE 1.3.4. The result of a measurement on a system is a function of the
state of the system.

EXAMPLE 1.3.5. Any fixed property of a person could be considered a function
of his name, since the name uniquely determines the person (ignoring
people with identical names).

EXAMPLE 1.3.6. The outcome of an election is determined by the votes,
essentially the preferences, of individual voters.

EXAMPLE 1.3.7. A binary operation on a set S, such as addition, is a function
from S × S to S where f(a, b) = a + b.

EXAMPLE 1.3.8. A homomorphism from one multiplicative system to another
is a function f such that f(xy) = f(x)f(y). An example is xⁿ from the real
numbers to itself.

EXAMPLE 1.3.9. A map of a region can be considered as a function for which
any point of the region determines a corresponding point on the map.

Because of this last example, functions in general are called maps or
mappings.

They may also be thought of as assignments of an element of T to each
element of S.

EXAMPLE 1.3.10. The function f(1) = 1, f(2) = 3, f(3) = 2 can be
represented as a diagram with an arrow from each element to its image:
1 → 1, 2 → 3, 3 → 2.

For a function f and element x ∈ S, f(x) denotes the unique element such
that (x, f(x)) ∈ f.
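The assignment view of a function has a direct computational analogue: a lookup table. In the sketch below (ours, not the book's), Example 1.3.10 becomes a Python dict, f(x) becomes a lookup, and the set-of-ordered-pairs view of the same function is recovered from it.

```python
# Example 1.3.10 as a table of assignments
f = {1: 1, 2: 3, 3: 2}
assert f[2] == 3                       # f(2) = 3

# the same function viewed as a binary relation, i.e. a set of ordered pairs
pairs = {(x, f[x]) for x in f}
assert (3, 2) in pairs

# the single-valuedness condition of Definition 1.3.1: one pair per x
assert len(pairs) == len(f)
```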
There is a method of obtaining a product of functions on general sets called
composition. The second function is applied to the result of the first function.
In algebraic terms, the first function is substituted into the second.

DEFINITION 1.3.2. For a binary relation R from A to B and a binary relation
S from B to C, the composition R ∘ S is {(a, c) : a ∈ A, c ∈ C and for some
b ∈ B, both (a, b) ∈ R and (b, c) ∈ S}.

EXAMPLE 1.3.11. For a transitive binary relation R, R ∘ R is contained in R.
If R is transitive and reflexive then R ∘ R = R.

EXAMPLE 1.3.12. If R is x² + 1 = y and S is x < y then R ∘ S is x² + 1 < y.
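Since relations here are just sets of pairs, Definition 1.3.2 can be computed directly on finite sets. The sketch below (our illustration, not the book's; the real-number relations of Example 1.3.12 are restricted to a small integer range so the sets are finite) implements composition and checks that R ∘ S consists of the pairs with x² + 1 < z.

```python
def compose(R, S):
    """R ∘ S per Definition 1.3.2: pairs (a, c) linked through some b."""
    return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

xs = range(-3, 4)
R = {(x, x * x + 1) for x in xs}                        # y = x² + 1
S = {(y, z) for y in range(-10, 11)
            for z in range(-10, 11) if y < z}            # y < z

RS = compose(R, S)
assert all(x * x + 1 < z for (x, z) in RS)               # x² + 1 < z
assert (0, 5) in RS       # 0² + 1 = 1 and 1 < 5, via the middle element b = 1
```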

For functions this definition takes the following form.

DEFINITION 1.3.3. The function (f ∘ g)(x) is g(f(x)). The domains and
ranges are as above.

EXAMPLE 1.3.13. If f(x) is x² + x + 2 and g(x) is 3x + 4 then g(f(x)) is
3(x² + x + 2) + 4.

PROPOSITION 1.3.1. Composition of binary relations is associative.

Proof. Let R, S, T be binary relations from A to B, B to C, C to D. Let (a, d)
belong to R ∘ (S ∘ T). Then there exist (a, b) ∈ R, (b, d) ∈ S ∘ T by definition
of composition. Thus there exist (b, c) ∈ S, (c, d) ∈ T. Therefore (a, c) ∈ R ∘ S.
Since (c, d) ∈ T, (a, d) ∈ (R ∘ S) ∘ T. This proves R ∘ (S ∘ T) ⊆ (R ∘ S) ∘ T.
A similar argument proves the reverse inclusion. □

DEFINITION 1.3.4. The identity function i from a set to itself is the function
defined by i(x) = x for every x.

PROPOSITION 1.3.2. For any function f : S → T and the appropriate identity
functions, i_S ∘ f = f ∘ i_T = f.

Proof. (i_S ∘ f)(x) = f(i_S(x)) = f(x) and (f ∘ i_T)(x) = i_T(f(x)) = f(x). □

DEFINITION 1.3.5. That a function f: S → T is one-to-one (or 1-to-1, 1-1)
means if x ≠ w then f(x) ≠ f(w).
In terms of relations this is: if (x, y) ∈ f and (w, y) ∈ f then x = w.

EXAMPLE 1.3.14. If f(2) = f(3) then f would not be 1-1.

DEFINITION 1.3.6. That a function f is onto means if y ∈ T then there exists
x ∈ S such that f(x) = y.

EXAMPLE 1.3.15. There is a function from the positive integers into itself
which is 1-1 but not onto, given by f(x) = x + 1.

A function which is both 1-1 and onto is called a 1-1 correspondence, or


isomorphism of sets.

EXAMPLE 1.3.16. The function f(x) = −x gives a 1-1 correspondence
between positive integers and negative integers.

For finite sets any two of these conditions implies the third: f is 1-1, f is
onto, |S| = |T|.

PROPOSITION 1.3.3. A function f: S → T is a 1-1 correspondence if and
only if there exists a function g: T → S such that f o g and g o f are both
identity functions.

Proof. Suppose g exists. If f(x) = f(w) then g(f(x)) = g(f(w)). But
(f o g)(x) = g(f(x)) = i(x) = x. So x = w. This proves f is 1-1. For any y ∈ T, f(g(y)) =
(g o f)(y) = i(y) = y. So f is onto.
Conversely suppose f is 1-1 and onto. By onto, for each y ∈ T there exists
x ∈ S with f(x) = y. Choose such an x and define g(y) = x. Then f(g(y)) =
f(x) = y. So g o f is an identity function. Thus f(g(f(x))) = f(x). Since f is
1-1, g(f(x)) = x. □
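The construction of g in the proof can be carried out mechanically when f is given as a finite table. A sketch under that assumption (`inverse` is our name, and a dict stands in for the function):

```python
# Sketch of Proposition 1.3.3: given a 1-1, onto function f (as a dict),
# build g with f o g and g o f both identity functions.

def inverse(f):
    """Invert a dict representing a 1-1 onto function."""
    g = {y: x for x, y in f.items()}
    assert len(g) == len(f), "f is not 1-1"
    return g

f = {1: 1, 2: 3, 3: 2}                # the function of Example 1.3.10
g = inverse(f)
print(all(g[f[x]] == x for x in f))   # g o f is the identity: True
print(all(f[g[y]] == y for y in g))   # f o g is the identity: True
```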

The function g is called the inverse of f. The inverse function of f is
denoted f⁻¹.
We conclude this section with the ideas of n-ary operation and relation.

DEFINITION 1.3.7. An n-ary relation on a set S is a subset R of
S × S × ... × S (n factors). An n-ary operation is a function f: S × S × ... × S
to S. An algebraic structure is a finite collection of n-ary relations and
operations.

EXAMPLE 1.3.17. For n = 2 we have binary relations and binary operations,
such as addition, subtraction, multiplication and division. The relation 'x is
between y and z on a straight line' is a 3-ary (ternary) relation. The operation
of finding the average of n numbers is an n-ary operation.

Abstract algebra is the study of algebraic structures in this sense.

DEFINITION 1.3.8. An isomorphism between two algebraic structures on sets
S, T is a 1-1, onto function h: S → T such that (1) for all corresponding n-ary
operations f_i, g_i we have h(f_i(s₁, s₂, ..., s_n)) = g_i(h(s₁), h(s₂), ..., h(s_n)), (2)
for all corresponding n-ary relations R₁, R₂ we have (s₁, s₂, ..., s_n) ∈ R₁ if
and only if (h(s₁), h(s₂), ..., h(s_n)) ∈ R₂.

EXAMPLE 1.3.18. Complex conjugation x + iy → x − iy is an isomorphism of
the complex numbers to itself. The real numbers under the binary relation ≤ and
the operations +, × have no isomorphisms to themselves except the identity.

EXERCISES
Level 1
1. Is this binary relation a function on {1, 2, 3}: {(1, 1), (3, 3)}?
2. Is this binary relation a function on {1, 2, 3}: {(1, 1), (1, 2), (2, 1), (3, 3)}?
3. Write the composition of f with itself where f(1) = 2, f(2) = 3, f(3) = 1.
4. Give an example of a function from {1, 2, 3} to {1, 2, 3, 4} which is 1-1 but
not onto.
5. Can a function from {1, 2} to {1, 2, 3} be onto?
6. Can a function from {1, 2, 3, 4} to {1, 2, 3} be 1-1?

Level 2
1. Write out the functions from the set {1, 2, 3} to the set {1, 2}.
2. Show by an example that f o g ≠ g o f in general.
3. What is the composition of xⁿ and xᵐ?
4. What is the inverse function to xⁿ?
5. Prove a composition of 1-1 functions is 1-1.
6. Prove a composition of onto functions is onto.

Level 3
1. How many functions exist from a set of n elements to a set of n elements?
2. What is the inverse function to y = x² + x + 1?
3. What is the inverse function to y = x + 1/x?
4. Find a 1-1 correspondence from positive integers to all integers.
5. Show this is a 1-1 correspondence from pairs of positive integers to positive
integers:

   f(n, m) = (n + m)(n + m − 1)/2 + m

6. Prove the set of rational numbers under +, × has no isomorphisms to itself
except the identity. Show first f(0) = 0, f(1) = 1, f(n) = n by induction.

1.4 ORDER RELATIONS


A third important class of binary relations, in addition to equivalence relations
and functions, is the class of partial orders (and related structures). One example
of a strict partial order is x < y on any subset of the real numbers. Another is
proper set inclusion: S ⊊ T. These share the common property of transitivity. If x < y
and y < z then x < z. If X ⊊ Y and Y ⊊ Z then X ⊊ Z. Transitivity is the
fundamental property of an order relation.
In addition, z < z is false for all z, and the same is true of ⊊. This is called
irreflexivity. And if a < b then it is false that b < a (antisymmetry). The same holds for sets.
The trichotomy property, that x < y, y < x or y = x, does not hold for sets.

EXAMPLE 1.4.1. {1} ⊄ {2}, {2} ⊄ {1}, {1} ≠ {2}.

Let x R y denote (x, y) ∈ R.

DEFINITION 1.4.1. A binary relation R is irreflexive (reflexive) if x R x is
false (true) for all x.

DEFINITION 1.4.2. That a binary relation R is antisymmetric means if x ≠ y
and x R y then not y R x.

DEFINITION 1.4.3. An irreflexive transitive binary relation is a strict partial


order. A reflexive antisymmetric transitive binary relation is a partial order.

EXAMPLE 1.4.2. The relation < is a strict partial order, but ≤ is not a strict
partial order.

PROPOSITION 1.4.1. Any strict partial order is antisymmetric. If P is a strict
partial order then P ∪ i is a partial order. Conversely if P is a partial order then P \ i is
a strict partial order. Here i denotes an identity relation {(x, x): x ∈ X}.

Proof. Let P be a strict partial order. Let a P b, b P a. Then a P a by transi-
tivity. This contradicts irreflexivity. So P is antisymmetric, and hence so is P ∪ i. Let (a, b) ∈ P ∪ i,
(b, c) ∈ P ∪ i. If both are in P, so is (a, c). If a = b or b = c then (a, c) = (b, c)
or (a, b), which belong to P ∪ i. Thus P ∪ i is transitive. Since i ⊆ P ∪ i, P ∪ i is
reflexive. So P ∪ i is a partial order.
If P is a partial order, i ⊆ P by reflexivity. The relation P \ i is irreflexive
since i is removed. Let (a, b) ∈ P \ i, (b, c) ∈ P \ i. Then (a, c) ∈ P. Suppose
(a, c) ∈ i. Then (a, b) ∈ P, (b, a) ∈ P, a ≠ b. This contradicts antisymmetry. So P \ i is
a strict partial order. □
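The two conversions in Proposition 1.4.1 can be illustrated by computing P ∪ i and P \ i for a small example; `to_partial` and `to_strict` are invented names:

```python
# P ∪ i turns a strict partial order into a partial order,
# and P \ i recovers the strict order again.

def to_partial(P, X):
    """Add the identity relation i = {(x, x)} to a strict partial order."""
    return P | {(x, x) for x in X}

def to_strict(P):
    """Remove the identity pairs from a partial order."""
    return {(a, b) for (a, b) in P if a != b}

X = {1, 2, 3}
strict = {(1, 2), (1, 3), (2, 3)}        # the strict order < on X
partial = to_partial(strict, X)          # the order <= on X
print(partial == {(1, 2), (1, 3), (2, 3), (1, 1), (2, 2), (3, 3)})  # True
print(to_strict(partial) == strict)      # round trip recovers P: True
```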

So a partial order has a corresponding strict partial order related to it in
exactly the way < is related to ≤.

DEFINITION 1.4.4. A preorder is a reflexive, transitive binary relation.


Preorders are also known as quasiorders.

EXAMPLE 1.4.3. The relation that one triangle has area at least as large as
another is a preorder.
Every partial order is a preorder, but some preorders are not partial orders.
However, every preorder can be reduced to a partial order on certain equivalence
classes.

DEFINITION 1.4.5. For a preorder Q, the indifference relation Q_D is {(x, y):
(x, y) ∈ Q and (y, x) ∈ Q}. The strict order Q_S is {(x, y): (x, y) ∈ Q but
(y, x) ∉ Q}.

EXAMPLE 1.4.4. In the preceding example Q_D would be the relation that two
triangles have equal areas, Q_S that the area of one exceeds the area of the other.

PROPOSITION 1.4.2. The indifference relation Q_D of a preorder Q is an
equivalence relation. There exists a unique partial order P on the equivalence
classes of Q_D such that ([x], [y]) ∈ P if and only if (x, y) ∈ Q.

Proof. Since Q is reflexive, (x, x) ∈ Q_D. So Q_D is reflexive. By definition,
Q_D is symmetric. If (a, b) ∈ Q_D and (b, c) ∈ Q_D then all these pairs belong to Q. So
(a, c) ∈ Q. Since (b, a) ∈ Q and (c, b) ∈ Q, by transitivity (c, a) ∈ Q. Thus
(a, c) ∈ Q_D. This proves Q_D is an equivalence relation.
Define P by ([x], [y]) ∈ P if and only if (a, b) ∈ Q for some representatives
a ∈ [x], b ∈ [y]. Then if a₁ Q_D a and b₁ Q_D b we have (a₁, a) ∈ Q, (a, b) ∈ Q and by
transitivity (a₁, b) ∈ Q. By transitivity from (a₁, b) ∈ Q, (b, b₁) ∈ Q we have
(a₁, b₁) ∈ Q. So ([x], [y]) ∈ P if and only if (x′, y′) ∈ Q for any x′ ∈ [x], y′ ∈ [y].
It remains to be shown that P is a partial order. Transitivity and reflexivity
of P follow from those of Q. Antisymmetry of P follows from the fact that if
[b] P [a] and [a] P [b] then a Q_D b, so [a] = [b]. □
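The passage from a preorder to a partial order on equivalence classes in Proposition 1.4.2 can be computed directly for finite sets. A sketch with an invented name (`quotient`), using frozensets for the classes:

```python
# Collapse the indifference classes of a preorder Q on X into
# elements of a partial order, as in Proposition 1.4.2.

def quotient(Q, X):
    cls = {x: frozenset(y for y in X if (x, y) in Q and (y, x) in Q)
           for x in X}
    P = {(cls[x], cls[y]) for (x, y) in Q}
    return set(cls.values()), P

# Preorder: compare integers by absolute value; -1 and 1 are indifferent.
X = {-1, 1, 2}
Q = {(x, y) for x in X for y in X if abs(x) <= abs(y)}
classes, P = quotient(Q, X)
print(len(classes))                                  # 2 classes
print((frozenset({-1, 1}), frozenset({2})) in P)     # True
```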

Therefore preorders can be regarded as partial orders on sets of equivalence


classes.

EXAMPLE 1.4.5. The preceding preorder on triangles is derived from a partial
order ≥ on areas (or classes of triangles with the same area).

Partial orders on a finite set must have maximal elements, that is elements
such that no other is greater. However, these elements may not be greater than
every other.

DEFINITION 1.4.6. Let P be a partial order on a set S. That an element
x ∈ S is maximal means if (x, y) ∈ P then x = y. That it is minimal means if
(y, x) ∈ P then x = y.

EXAMPLE 1.4.6. In the set {1, 2, 3} with the order ≤, 3 is maximal since 3 ≤ x fails
for x ≠ 3, and 1 is minimal.

The names maximal and minimal may be reversed according to the situation.
Maximal means x < y is false.

DEFINITION 1.4.7. In a partial order P two elements x, y are comparable
(incomparable) if (x, y) ∈ P or (y, x) ∈ P (neither (x, y) ∈ P nor (y, x) ∈ P).

EXAMPLE 1.4.7. Among subsets of {1,2} the sets {1} and {2} are not
comparable. They form the only incomparable pair.

THEOREM 1.4.3. Let P be a partial order on a finite set S. Then P has at least
one maximal element and at least one minimal element. In fact for u ∈ S there
is at least one maximal element x such that (u, x) ∈ P.

Proof. For |S| = 1, the element is both maximal and minimal. Suppose the
theorem holds for |S| = k. Let |S| = k + 1. Let u, y ∈ S, u ≠ y. Let Q be
P with y removed. By the induction assumption Q has a maximal element x with
(u, x) ∈ Q. If (x, y) ∉ P then x is maximal in P also. If (x, y) ∈ P then y is
maximal in P, else if (y, z) ∈ P, z ≠ y, then z ∈ S \ {y}, (x, z) ∈ Q, contradicting
maximality of x in Q. Moreover by transitivity (u, y) ∈ P. The proof for a
minimal element is similar. □
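For a finite poset the maximal elements can be found by checking Definition 1.4.6 directly, which illustrates Theorem 1.4.3; `maximal` is our name:

```python
# Find the maximal elements of a finite poset by brute force.

def maximal(P, X):
    """Elements x with no y != x satisfying (x, y) in P."""
    return {x for x in X if all(y == x for y in X if (x, y) in P)}

# The poset of Example 1.4.8.
X = {1, 2, 3}
P = {(1, 1), (2, 2), (3, 3), (1, 2), (1, 3)}
print(maximal(P, X) == {2, 3})   # two incomparable maximal elements: True
```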

This shows that for a finite partially ordered set (poset) any element is
≤ some maximal element.

EXAMPLE 1.4.8. {(1,1), (2, 2), (3, 3), (1,2), (1,3)}. The set has two maximal
elements 2,3. Neither is greater than the other: they are incomparable.

EXAMPLE 1.4.9. The set of positive integers is not finite, and has no maximal
element.

To obtain a result analogous to this theorem for infinite sets, and to further
obtain results on partial orders, the idea of a linear order is needed. A linear
order is a partial order satisfying trichotomy: x < y, y < x or y = x.

DEFINITION 1.4.8. A binary relation R on a set S is complete if for all
x, y ∈ S either (x, y) ∈ R or (y, x) ∈ R.

EXAMPLE 1.4.10. A partial order which has incomparable elements is not
complete. The relation ≤ on real numbers is complete.

DEFINITION 1.4.9. A linear (total) order is a complete partial order.

EXAMPLE 1.4.11. The relation ≤ on any subset of the real numbers is a linear
order. The relation ⊆ is not in general.

Linear orders on finite sets S correspond to arrangements of S as
x₁, x₂, ..., x_n, where x_i ≤ x_j for i ≤ j. That is, they have the same structure
as ≤ on the set {1, 2, ..., n}. To state this precisely we need the idea of
isomorphism stated earlier.
Two binary relations R₁ on a set S and R₂ on a set T are isomorphic if
and only if there exists a function h: S → T which is 1-1 and onto, such that
(x₁, x₂) ∈ R₁ if and only if (h(x₁), h(x₂)) ∈ R₂.

THEOREM 1.4.4. Any linear order on a set S of n elements is isomorphic to
the linear order ≤ on {1, 2, ..., n}.

Proof. For n = 1, this is immediate. Suppose it holds for n − 1. Let y be a
maximal element of S. Let h be an isomorphism of S \ {y} to {1, 2, ..., n − 1}.
Set h(y) = n. Then h is 1-1 and onto.
Let u < v in S. Then u ≠ y, since y is maximal. If v ≠ y then h(u) < h(v)
by construction. Suppose v = y. Then h(u) < n = h(y).
Conversely let h(u) < h(v). If h(u), h(v) ≠ n then u < v by construction.
Since h(u) < h(v), h(u) ≠ n, so u ≠ y. Let h(v) = n, i.e. v = y, u ≠ v. Then either u < y or
y < u. Since y is maximal the latter is false. So u < v. □

EXAMPLE 1.4.12. This is not true for infinite linearly ordered sets. The set
of negative integers under < is not isomorphic to the set of positive integers
under <.

DEFINITION 1.4.10. A chain in a poset S is a subset which is linearly ordered


by the partial order.

EXAMPLE 1.4.13. Among subsets of a set U, a chain is a family of subsets
S₁, S₂, ..., S_k with S₁ ⊆ S₂ ⊆ ... ⊆ S_k.
Finally the generalization of the result about maximal elements can be
stated (but not proved).

Hausdorff’s Maximal Principle. Every poset has a maximal chain.


This cannot be proved without the use of a special axiom of set theory, the
axiom of choice.

AXIOM OF CHOICE. For any indexed family F_α of nonempty sets there exists
an indexed family x_α such that x_α ∈ F_α for each α.

The Hausdorff maximal principle and its equivalents are quite important in
abstract algebra, for infinite sets. An example is as follows.

THEOREM 1.4.5. Every partial order on a set S is contained in a linear order
on S.

Proof. Let P be a partial order on S. Consider the family of all partial orders
on S containing P. It is itself a poset under inclusion. Take a maximal chain
C in that set. The union U of its members will also be a partial order on S.
For example take x, y, z ∈ S. Let (x, y) ∈ U, (y, z) ∈ U. Then (x, y) ∈ P₁,
(y, z) ∈ P₂ for some P₁, P₂ in C. By definition of chain, P₁ ⊆ P₂ or P₂ ⊆ P₁.
Assume the latter. Then (x, y), (y, z) ∈ P₁ so (x, z) ∈ P₁. So (x, z) ∈ U. This
proves U is transitive. Proofs of reflexivity and antisymmetry are similar.
Since C is maximal, U is not contained in any other partial order on S, else that
would give a larger chain. This will be used to get a contradiction if U is not
a linear order. Suppose it is not. Then for some x ≠ y, (x, y) ∉ U, (y, x) ∉ U. Then
let T = U together with {(w, z): (w, x) ∈ U and (y, z) ∈ U}. Then T properly
contains U since (x, y) ∈ T. It can be shown that T is a partial order, by
checking various cases. For example let (w, z) ∈ T \ U, (z, s) ∈ U. Then
(w, x) ∈ U, (y, z) ∈ U. By transitivity (w, x) ∈ U, (y, s) ∈ U. So (w, s) ∈ T. Let
(w₁, z₁) ∈ T \ U, (z₁, z₂) ∈ T \ U. Then (y, z₁) ∈ U, (z₁, x) ∈ U so (y, x) ∈ U.
This is false. The other cases of transitivity and antisymmetry are proved in a
similar way. This gives a contradiction. □
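For finite sets Theorem 1.4.5 is constructive: repeatedly removing a minimal element (a topological sort) produces a linear order containing P. A sketch with names of our choosing:

```python
# Extend a finite partial order P on X to a linear order containing it.

def linear_extension(P, X):
    order, remaining = [], set(X)
    while remaining:
        # pick a minimal element of what remains (Theorem 1.4.3 guarantees one)
        m = next(x for x in remaining
                 if all((y, x) not in P or y == x for y in remaining))
        order.append(m)
        remaining.remove(m)
    # the linear order: earlier-listed elements precede later ones
    return {(order[i], order[j]) for i in range(len(order))
            for j in range(i, len(order))}

X = {1, 2, 3}
P = {(x, x) for x in X} | {(1, 2), (1, 3)}   # 2 and 3 incomparable
L = linear_extension(P, X)
print(P <= L)                                # P is contained in L: True
print(all((x, y) in L or (y, x) in L for x in X for y in X))  # L complete: True
```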

The following consequence of the Axiom of Choice is called Zorn’s Lemma.


Suppose P is a poset and every chain has an upper bound. Then P has a maximal
element.

EXERCISES
Level 1
1. Prove that the identity relation on any set is a partial order. Is any
equivalence relation a preorder (quasiorder)?
2. What are the maximal and minimal elements of this strict partial order:
{(1, 3), (1,4), (2, 3), (2, 4)}? There are two of each.
3. What are the maximal and minimal elements of this strict partial order:
{(1, 3), (2, 3)}?
4. Finite posets are represented by diagrams called Hasse diagrams. To draw
one, first find the minimal elements and label them as separate points at the
bottom level. Then for all points z such that (m, z) ∈ P for minimal m but
there is no intermediate y, with (m, y) ∈ P and (y, z) ∈ P, draw z on a
level above m and a line from z to m. Then draw the elements w such that
(z, w) ∈ P but there is no intermediate y. For example: in {(1, 2), (1, 3)}, 1
is minimal. Both 2 and 3 are above it. So we have

2   3
 \ /
  1

Do this for Exercise 2.


5. Draw the Hasse diagram for Exercise 3.
6. Find 3 partial orders on {1,2} (include the identity).
7. If P₁ is a partial order on S₁, P₂ is a partial order on S₂ and S₁ ∩ S₂ = ∅,
show P₁ ∪ P₂ is a partial order on S₁ ∪ S₂.
8. An example of Exercise 7 is P₁ = {(1, 2)}, P₂ = {(3, 4)}. Draw the Hasse
diagram of P₁ ∪ P₂.

Level 2
1. Draw the Hasse diagram of {(1, 2), (2, 3), (1, 3)}.
2. Draw all possible Hasse diagrams for 3 elements (there are 5). Two are as
follows:

                o
                |
   o  o  o      o   o
   Identity   {(2, 3)}

3. What is the diagram of any identity relation? any linear order?


4. Label the subsets of {1, 2, 3} as a, b, ..., h. Write out the strict partial
order for inclusion of proper subsets. Draw the Hasse diagram. (It should
look like a cube drawn on the plane. There is one minimal element ∅ and
one maximal element {1, 2, 3}.)
5. Prove any partial order P on a set S is isomorphic to a subset of the
poset of subsets of S. Send x ∈ S to h(x) = {y: (y, x) ∈ P}. Show that
(x, y) ∈ P if and only if h(x) ⊆ h(y). This implies h is 1-1.
6. Write out all preorders on {1, 2, 3} such that 1 P 2.
7. If P₁ is a partial order on S₁ and P₂ is a partial order on S₂ show that there
is a corresponding partial order P₁ × P₂ on S₁ × S₂.

Level 3
1. Draw all possible Hasse diagrams (19) for 4 elements.
2. Prove that every partial order P on a set S is an intersection of linear
orders on S. It suffices to show that if (y, x) ∉ P then there exists a linear
order L such that (y, x) ∉ L but P ⊆ L. (P will then be the intersection of
all these L.) The proof is similar to the proof of Theorem 1.4.5, with C
being the family of all partial orders containing P but not (y, x). The same
T will do.
3. Represent the strict partial order {(1, 2), (1,3)} as an intersection of
linear orders.
4. A semiorder is a binary relation R such that (1) R is irreflexive, (2) if
(x, y) ∈ R and (z, w) ∈ R then (x, w) ∈ R or (z, y) ∈ R, (3) if (x, y) ∈ R
and (y, z) ∈ R then for all w either (x, w) ∈ R or (w, z) ∈ R. Prove a
semiorder is transitive, and is therefore a strict partial order.
5. Every 3-element partial order is a semiorder. Find two 4-element partial
orders which respectively violate (2), (3) of the above definition and are not
semiorders.
6. Suppose S is a finite set, f a real-valued function on S, and ε > 0. Then show
{(x, y): f(y) > f(x) + ε} is a semiorder. This leads to the interpretation
that 'x is distinguishably greater than y' is a semiorder. The converse of this
result is also true. See Scott and Suppes (1958), Luce (1956), Rabinovitch
(1977), Scott (1964), Suppes and Zinnes (1963).
7. Prove for a semiorder R that the sets L_x = {y: (x, y) ∈ R} and the sets
U_x = {y: (y, x) ∈ R} are each linearly ordered under inclusion and that
L_x ⊊ L_y, U_y ⊊ U_x cannot both happen.

1.5 BOOLEAN MATRICES AND GRAPHS


The simplest Boolean algebra is B = {0, 1}. Its operations are given by

0 + 0 = 0·1 = 1·0 = 0·0 = 0
1·1 = 1 + 0 = 0 + 1 = 1 + 1 = 1

(except for 1 + 1, the same as for ordinary addition and multiplication). However,
there is this difference: B obeys the rules of set intersection and union, rather than
those of arithmetic. For example,

x + y = y + x,  xy = yx,  x + (y + z) = (x + y) + z,  (xy)z = x(yz)

x(y + z) = xy + xz,  x + yz = (x + y)(x + z)  (Dual Distributive)

x + 0 = x + x = x,  x·1 = x·x = x,  x·0 = 0,  x + 1 = 1

There is also the operation of complement, xᶜ, such that 1ᶜ = 0, 0ᶜ = 1.

(xᶜ)ᶜ = x,  (x + y)ᶜ = xᶜyᶜ,  (xy)ᶜ = xᶜ + yᶜ

A partial order is defined by x ≤ x, 0 ≤ 1. If x ≤ y then x + z ≤ y + z,
xz ≤ yz, and yᶜ ≤ xᶜ.
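Since B has only two elements, all of these laws can be verified exhaustively. A minimal sketch (function names are ours):

```python
# The two-element Boolean algebra B = {0, 1}: + is 'or', multiplication
# is 'and', complement swaps 0 and 1. Check two of the laws by brute force.

def b_add(x, y): return x | y      # note 1 + 1 = 1
def b_mul(x, y): return x & y
def b_not(x):    return 1 - x

B = (0, 1)
# dual distributive law: x + yz = (x + y)(x + z)
print(all(b_add(x, b_mul(y, z)) == b_mul(b_add(x, y), b_add(x, z))
          for x in B for y in B for z in B))          # True
# De Morgan: (x + y)^c = x^c y^c
print(all(b_not(b_add(x, y)) == b_mul(b_not(x), b_not(y))
          for x in B for y in B))                     # True
```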

Any system with operations denoted by addition, multiplication, and


complementation, satisfying the laws just given, is called a Boolean algebra.
Boolean algebras have a number of applications.

EXAMPLE 1.5.1. Logic can be studied using B. In this case, '1' is 'true', '0' is
'false', '+' is 'or', '×' is 'and', 'ᶜ' is 'not'. So (p or q) and not r would be translated

(p + q)rᶜ

Statement x implies y if and only if x ≤ y.

EXAMPLE 1.5.2. In mathematics, in dealing with nonnegative numbers, 0 can


represent 0 and 1 can represent any positive number. Then 1 + 1 = 1 refers to
the fact that the sum of any two positive numbers is positive.

EXAMPLE 1.5.3. Sets can be represented by n-tuples (0, 1, 1, ..., 0) of Boolean
numbers. A one in place i means that x_i ∈ S and a zero in place i means x_i ∉ S.
Then sum is union, product is intersection, complement is complement.

Boolean algebras are also used in the study of computer and other switching
circuits.

DEFINITION 1.5.1. An n × m Boolean matrix M is an n × m array of
numbers m_ij from B = {0, 1}. The number m_ij lies in the ith row (horizontal
line) and jth column (vertical line) of M. It is called the (i, j)-entry.

Matrices over a field will be treated in section 3.3.

EXAMPLE 1.5.4. This is a Boolean matrix M, where m₁₂ = 1, m₂₁ = 0.

[0 1 0]
[0 1 0]

The following operations are simply taken entry by entry for Boolean
matrices A = (a_ij), B = (b_ij).

Sum: A + B = (a_ij + b_ij)

Logical Product: A ⊙ B = (a_ij b_ij)

Complement: Aᶜ = (a_ijᶜ)

Inequality is also defined entrywise: A ≤ B if and only if for all i, j,
a_ij ≤ b_ij. This is a partial order. The strict partial order A < B means a_ij < b_ij
for at least one i, j and a_ij ≤ b_ij for all i, j.

EXAMPLE 1.5.5.

(a) [1 0] + [1 1] = [1 1]
    [1 0]   [0 0]   [1 0]

(b) [1 0] ⊙ [1 1] = [1 0]
    [1 0]   [0 0]   [0 0]

(c) [1 0]ᶜ = [0 1]
    [0 0]    [1 1]

(d) [1 0] ≤ [1 1]
    [0 1]   [0 1]

Inequality means the places where 1 occurs in one matrix are a subset of the places
where 1 occurs in the other. The operations +, ⊙, ᶜ and the relation ≤ obey all the
laws stated earlier for B.
There are two other operations on Boolean matrices: transpose and matrix
multiplication.

Transpose: Aᵀ = (a_ji)

Matrix Product: AB = (Σ_j a_ij b_jk)

EXAMPLE 1.5.6.

(a) [0 1]ᵀ = [0 0]
    [0 0]    [1 0]

(b) [1 1] [1 1] = [1·1 + 1·0  1·1 + 1·1] = [1 1]
    [1 1] [0 1]   [1·1 + 1·0  1·1 + 1·1]   [1 1]

(c) [0 0 1] [0 1 0]   [1 1 1]
    [1 0 0] [0 0 1] = [0 1 0]
    [0 1 1] [1 1 1]   [1 1 1]
The transpose changes the (i, j)-entry to location (j, i). Rows of A become
columns of Aᵀ, and vice versa.
The matrix product is the same as the product of ordinary matrices except
that operations are Boolean, so that 1 + 1 = 1. The (i, j)-entry may be calculated
by multiplying the entries of row i of A by those of column j of B, and adding
the products. Products are defined only if the number of columns of A equals the
number of rows of B. An n × m matrix times an m × s matrix gives an n × s
matrix.
There is also a simpler way to multiply Boolean matrices. To form row i of
the product, consider the ones in row i of A. If these are in locations i₁, i₂, ..., i_k,
add rows i₁, i₂, ..., i_k of B. This gives row i of the product.

The ith row (column) of A is denoted A_i* (A_*i).

EXAMPLE 1.5.7. To form the product of Example 1.5.6(c): A₁*: a₁₃ = 1, so
take B₃* = (1 1 1). A₂*: a₂₁ = 1, so take B₁* = (0 1 0). A₃*: a₃₂ = a₃₃ = 1, so take
B₂* + B₃* = (0 0 1) + (1 1 1) = (1 1 1).
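Both descriptions of the Boolean matrix product, the entrywise formula and the row-union shortcut, are easy to compare in code. A sketch with invented names, using the matrices of Example 1.5.6(c):

```python
# Boolean matrix multiplication two ways: the entrywise formula with
# 1 + 1 = 1, and the row-union shortcut of Example 1.5.7.
# Matrices are lists of 0/1 rows.

def bool_mult(A, B):
    return [[int(any(A[i][k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))] for i in range(len(A))]

def bool_mult_rows(A, B):
    # row i of AB is the entrywise 'or' of the rows B_k with a_ik = 1
    out = []
    for row in A:
        acc = [0] * len(B[0])
        for k, a in enumerate(row):
            if a:
                acc = [x | y for x, y in zip(acc, B[k])]
        out.append(acc)
    return out

A = [[0, 0, 1], [1, 0, 0], [0, 1, 1]]
B = [[0, 1, 0], [0, 0, 1], [1, 1, 1]]    # the matrices of Example 1.5.6(c)
print(bool_mult(A, B))                   # [[1, 1, 1], [0, 1, 0], [1, 1, 1]]
print(bool_mult(A, B) == bool_mult_rows(A, B))   # True
```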

Transpose obeys these laws:

(A + B)ᵀ = Aᵀ + Bᵀ

(A ⊙ B)ᵀ = Aᵀ ⊙ Bᵀ

(Aᶜ)ᵀ = (Aᵀ)ᶜ

(AB)ᵀ = BᵀAᵀ

A ≤ B  ⇒  Aᵀ ≤ Bᵀ

Boolean matrix products obey these laws (as ordinary nonnegative matrices
do):

(AB)C = A(BC)

(A + B)C = AC + BC,  C(A + B) = CA + CB

A ≤ B  ⇒  AC ≤ BC,  CA ≤ CB

In addition to its relation to multiplication of nonnegative matrices, Boolean


matrix multiplication is used because it corresponds to composition of binary
relations.

THEOREM 1.5.1. Let S, T be sets of n, m elements. Label their elements
x₁, x₂, ..., x_n, y₁, y₂, ..., y_m. Associate a Boolean matrix A to each binary
relation R by the rule a_ij = 1 if and only if (x_i, y_j) ∈ R. This gives a 1-1 cor-
respondence between relations on S, T and n × m Boolean matrices. Under this
correspondence unions become sums, intersections logical products, complements
complements, compositions matrix products, inclusions become ≤, and the 'converse'
sending (x, y) to (y, x) becomes transpose.

Proof. The proof is straightforward but lengthy. The matrix entries determine
precisely which pairs are in R. And for any matrix A we can define a relation by
{(x_i, y_j): a_ij = 1} which goes to it. This proves the mapping of relations to
matrices is 1-1 and onto.
We verify the statement about matrix products. Take a third set
Z = {z₁, z₂, ..., z_q} and a second relation P with matrix B. Let C = AB. Then
c_ij = 1 if and only if Σ_k a_ik b_kj = 1, i.e., for some k, a_ik b_kj = 1, if and only if
for some k, a_ik = 1 and b_kj = 1, if and only if for some k, (x_i, y_k) ∈ R and
(y_k, z_j) ∈ P, if and only if (x_i, z_j) ∈ R o P.
Proofs for other operations are similar. □
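The correspondence between composition and matrix product in Theorem 1.5.1 can be spot-checked on a small example; the helper names are ours, and elements are indexed 0 to n−1 so pairs double as matrix indices:

```python
# Check on one example that composing relations and multiplying their
# Boolean matrices agree, as Theorem 1.5.1 asserts.

def to_matrix(R, n):
    return [[int((i, j) in R) for j in range(n)] for i in range(n)]

def compose(R, S):
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def bool_mult(A, B):
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

n = 3
R = {(0, 1), (1, 2)}
P = {(1, 0), (2, 2)}
lhs = to_matrix(compose(R, P), n)                    # matrix of R o P
rhs = bool_mult(to_matrix(R, n), to_matrix(P, n))    # product of matrices
print(lhs == rhs)   # True
```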

For a comprehensive treatment of Boolean matrix theory, see Kim (1982).


Here we will view graphs as representations of binary relations. A directed
graph (not like the graph of a function) consists of points called vertices, and line
segments with arrows from certain vertices to certain others. The line segments
are called directed edges. The location of points or the shape of edges do not
affect the structure of a directed graph.
A directed graph represents a binary relation R on a set S if its vertices are
labelled to correspond to the elements of S and a directed edge is drawn from
x to y if and only if (x, y) E R.

EXAMPLE 1.5.8. The relation {(1, 1), (1, 2), (2, 2)} has a graph with vertices
1, 2, a loop at each vertex, and a directed edge from 1 to 2.

EXAMPLE 1.5.9. The relation {(1, 5), (2, 3), (3, 4), (2, 4)} has a graph on
vertices 1, 2, 3, 4, 5 with directed edges 1 → 5, 2 → 3, 3 → 4, 2 → 4.

First the correct number of points should be drawn. Then they can be
labelled. Then for each pair in the binary relation, a directed segment is drawn
between the vertices having the same labels. Two-way arrows (or edges without
arrows) may be used if (x, y) and (y, x) are in R.
Graphs are useful in studying questions related to powers of a binary relation
under composition R o R o ... o R. In graph theory questions of connectedness,
existence of various types of paths, and families of subsets such that no point
joins another of the same subset are studied.
The graph of an n × n Boolean matrix can be drawn. Vertices are drawn
and labelled 1, 2, ..., n. An arrow is drawn from i to j for each entry a_ij which
is 1.

EXERCISES
Level 1
1. Compute (1 + 1ᶜ)·(0 + 0ᶜ) in Boolean arithmetic.
2. Prove the dual distributive law x + yz = (x + y)(x + z) in Boolean algebra
by considering three cases: (1) x = 1, (2) y = z = 1, (3) x = 0 and y or z is 0.
3. When is xy equal to 1? How is this like the statement 'x and y'?
4. Add

[1 1 0]   [0 0 1]
[0 1 1] + [0 1 0]
[1 1 0]   [1 0 0]

5. Find the logical product

[1 1 0]   [0 0 1]
[0 1 1] ⊙ [0 1 0]
[1 1 0]   [1 0 0]

6. Draw the graph of the Boolean matrix

[0 1 0]
[1 0 1]
[0 1 0]

Level 2
1. Prove x + y = x if and only if x ≥ y in a Boolean algebra.
2. Prove addition of Boolean matrices is associative (assuming addition is
associative in B).
3. Prove the distributive law for Boolean matrix products (assuming the laws
for B).
4. Multiply

[1 0 1 0]
[1 0 1 0]
[0 1 0 1]
[0 1 0 1]

times itself.

5. Find all distinct powers of

[0 1 0 0]
[0 0 1 0]
[0 0 0 1]
[1 1 0 0]

6. Let J be the Boolean matrix all of whose entries are 1. When is MJ = J?
7. Let I be the n × n identity matrix (δ_ij) where δ_ij = 0 if i ≠ j and δ_ii = 1
for all i. Prove AI = IA = A for every n × n Boolean matrix A.
8. Prove a relation is transitive if and only if its Boolean matrix satisfies
A² ≤ A. Show that A² = A if it is reflexive also.
9. For a preorder Q with Boolean matrix A, show A ⊙ Aᵀ is the Boolean
matrix of the indifference relation Q_D.

Level 3
1. Suppose B is replaced by any system in which +, × are defined. Can Boolean
matrix multiplication be defined?
2. A Boolean row vector v is a 1 × n matrix. Prove that A = B if and only if
vA = vB for all row vectors v. (Choose v with one 1 entry.)
3. Prove (A_i*)B is the ith row of AB.
4. Prove that if I ≤ A, v ≤ vA, and that if v = vA then v = vAⁱ for all i > 0.
Here I is an identity matrix.
5. Use (2), (3), (4) to prove that for A ≥ I, Aⁿ⁻¹ equals all powers Aᵏ, k ≥ n − 1.
6. A row basis is the set of vectors formed by rows of A which are not sums
of other, lesser rows of A. The row rank ρ_r of a Boolean matrix is the
number of vectors in a row basis. Prove that ρ_r(AB) ≤ ρ_r(A). Must
ρ_r(AB) ≤ ρ_r(B)? Try 4 × 4 Boolean matrices with 3 nonzero rows.
7. Prove that in a graph every point is accessible from every other by a directed
sequence of edges if and only if (I + A)ⁿ has all ones in it.
8. Prove (I + A)ⁿ is the Boolean matrix of a preorder. This relation is called
reachability. What is its graph-theoretic interpretation?

1.6 ARROW’S IMPOSSIBILITY THEOREM


Majority rule is a traditional method of group decision-making in the West.
In fact it can be proved to be the only rule which satisfies certain conditions
like giving every person and every alternative equal weight, and favouring an
alternative more as more people prefer it over another choice.
For three or more alternatives, a problem known as the Condorcet paradox
can occur. Suppose there are three voters 1, 2, 3 and three alternatives a, b, c.
Let the voter preference orderings be abc (meaning a is preferred to b and b
to c), bca, cab for voters 1,2,3. Then two voters prefer a to b (voters 1 and 3).

Also two voters prefer b to c and two voters prefer c to a. Thus for any chosen
alternative, a majority will prefer some other alternative.
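The cycle can be verified mechanically from the three orderings; `majority_prefers` is an invented name:

```python
# The Condorcet paradox computed directly: with preference orderings
# abc, bca, cab, pairwise majority voting yields a cycle.

orderings = ["abc", "bca", "cab"]    # voters 1, 2, 3

def majority_prefers(x, y):
    """True if a majority of voters rank x before y."""
    votes = sum(o.index(x) < o.index(y) for o in orderings)
    return votes > len(orderings) / 2

print(majority_prefers("a", "b"))   # True: voters 1 and 3
print(majority_prefers("b", "c"))   # True: voters 1 and 2
print(majority_prefers("c", "a"))   # True: voters 2 and 3, closing the cycle
```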
Can a more complicated voting procedure avoid this paradox? Kenneth
Arrow in 1950 proved there is no voting procedure satisfying five basic
conditions. (However, voting procedures can exist satisfying all but one, called
independence of irrelevant alternatives.)
Let X be a set of alternatives from which the group N of voters, represented as
{1, 2, ..., n}, will choose. The preferences of individual i will be expressed as a
binary relation R_i where x R_i y is interpreted as the voter liking x at least as well
as y, for x, y ∈ X. If the voter is rational, the relation should be transitive: if
he likes y at least as well as x, and z at least as well as y, he will like z at least
as well as x. Moreover it should be complete: either he likes x at least as well as
y or he likes y better than x.

DEFINITION 1.6.1. A weak order is a complete, transitive binary relation. Thus
it is a total preorder.

EXAMPLE 1.6.1. A linear order is a weak order.

Here we will assume that no voter is exactly indifferent between any two
alternatives, so that R_i is a linear order.
For the linear order R_i of preferences of voter i, let P_i denote the
corresponding strict partial order.
The group preference is expressed as a social welfare function
F(R₁, R₂, ..., R_n). Here F is the function which assigns to the individual
preference relations R₁, R₂, ..., R_n the group preference relation.
The group preference relation gives the set of pairs (x, y) of alternatives such
that the group either chooses x over y or is indifferent between them. The binary
relations R₁, R₂, ..., R_n lie in a set of binary relations called the domain.

DEFINITION 1.6.2. A social welfare function is a function F from a set
D ⊆ Wⁿ = W × W × ... × W to B_X. Here D is the domain, W is the set of weak
orders on X, Wⁿ is its n-fold Cartesian product with itself, and B_X is the set of
binary relations on X.
Frequently we write F for F(R₁, R₂, ..., R_n).

EXAMPLE 1.6.2. Majority rule is a social welfare function, where x F y if and
only if |{i: x R_i y}| > n/2. This is a rule by which the individual preferences R_i
determine a group preference F. Any such definite rule can be expressed as a
social welfare function.
Both the value of the function F and the individual preference relations R_i
will be represented by Boolean matrices, as was done in the last section. So a
fixed labelling of the elements of X as x₁, x₂, ..., x_m will be used. Then the
Boolean matrix R(i) of R_i will be 1 in the (j, k)-entry if x_j R_i x_k, else it will be
zero.

The first two assumptions on the social welfare function are universal
domain and weak ordering.

ASSUMPTION 1. UNIVERSAL DOMAIN. The domain D consists of all
n-tuples of linear orders on X.

Thus every individual preference which is a linear order can occur (weak
orders are frequently used instead).

ASSUMPTION 2. WEAK ORDERING. The relation F(R₁, R₂, ..., R_n)
expressing group preferences is a weak order.

PROPOSITION 1.6.1. A Boolean matrix A is the matrix of a weak order if and
only if for all i, j, k, (1) a_ij + a_ji = 1, (2) a_ij a_jk ≤ a_ik.

Proof. It can be checked that the first condition is equivalent to completeness
(x_i R x_j or x_j R x_i) and the second to transitivity (if x_i R x_j and x_j R x_k then
x_i R x_k). □

The third condition is Pareto optimality. This means if every individual


strictly prefers x to y, the group strictly prefers x to y.

DEFINITION 1.6.3. That a social welfare function F is Pareto optimal means
that if x Pi y for all i (where Pi denotes the strict preference of individual i),
then x F(R1, R2, ..., Rn) y but not y F(R1, R2, ..., Rn) x.

EXAMPLE 1.6.3. Majority rule satisfies this condition: if everyone prefers x
to y then a majority does.

PROPOSITION 1.6.2. A social welfare function F on linear orders is Pareto
optimal if and only if R(1) ⊙ R(2) ⊙ ... ⊙ R(n) ≤ F(R1, R2, ..., Rn) ≤
R(1) + R(2) + ... + R(n) in Boolean matrix terms, where ⊙ and + denote the
elementwise Boolean product and sum.

Proof. Pareto optimality means that if the (i, j)-entries of all the R(k) are 1 and
the (j, i)-entries are zero (each of which implies the other for linear orders), then
fij = 1, fji = 0. So if all the R(k) have a 1 in an entry, F is 1 there. This is
F ≥ R(1) ⊙ R(2) ⊙ ... ⊙ R(n). And if all the R(k) have a 0 in an entry, F is 0
there. This is F ≤ R(1) + R(2) + ... + R(n). □
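The sandwich of F between the elementwise meet and join of the individual matrices can also be checked mechanically. The sketch below (editorial, not from the text) verifies the bounds for a dictatorial rule F = R1, which is trivially Pareto optimal:

```python
# Sketch: Proposition 1.6.2's bounds, meet(Rs) <= F <= join(Rs) entrywise.
def meet(Rs):
    m = len(Rs[0])
    return [[min(R[j][k] for R in Rs) for k in range(m)] for j in range(m)]

def join(Rs):
    m = len(Rs[0])
    return [[max(R[j][k] for R in Rs) for k in range(m)] for j in range(m)]

def pareto_bounds_hold(F, Rs):
    lo, hi = meet(Rs), join(Rs)
    return all(lo[j][k] <= F[j][k] <= hi[j][k]
               for j in range(len(F)) for k in range(len(F)))

# Dictatorship by voter 1 (F = R1) satisfies the bounds on any profile:
xyz = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # linear order x > y > z
yzx = [[1, 0, 0], [1, 1, 1], [1, 0, 1]]   # linear order y > z > x
assert pareto_bounds_hold(xyz, [xyz, yzx])
```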

The fourth condition is nondictatoriality. This means there is no fixed


individual k such that the group preferences are always identical with his.

DEFINITION 1.6.4. A social welfare function F is dictatorial if for some
k, F(R1, R2, ..., Rn) = Rk.

EXAMPLE 1.6.4. F = R1 is a dictatorial social welfare function.

The fifth condition is independence of irrelevant alternatives. This means


the group preference between alternatives x, y should not be affected by how
the group feels about some third alternative z. While this might ideally be true
there is a question as to whether it is necessary for practical voting methods, for
which preferences about z may reveal something about intensities of preferences
for x, y. If the alternative set is very large, or not precisely defined, though, it
may also be a practical necessity.

DEFINITION 1.6.5. That a social welfare function F is independent of
irrelevant alternatives means that if for some x, y and preferences Ri, Qi,
x Ri y if and only if x Qi y and y Ri x if and only if y Qi x, then
x F(R1, R2, ..., Rn) y if and only if x F(Q1, Q2, ..., Qn) y.

EXAMPLE 1.6.5. Majority voting and any dictatorial social welfare function
are independent of irrelevant alternatives.
This condition means that if preferences of each individual i over x, y are
the same in profiles R, Q (a profile is an n-tuple of preferences) then the group
preference is the same in R, Q regarding x, y. The remaining assumptions are
these.

ASSUMPTION 3. PARETO OPTIMALITY. F is Pareto optimal.

ASSUMPTION 4. NONDICTATORIALITY. F is nondictatorial.

ASSUMPTION 5. INDEPENDENCE OF IRRELEVANT ALTERNATIVES. F is
independent of irrelevant alternatives.

PROPOSITION 1.6.3. For a social welfare function F which is independent of
irrelevant alternatives, fjk is a function of the entries rjk(i), i = 1, ..., n.

Proof. Being a function means that if rjk(i) = qjk(i) for all i then
fjk(R1, R2, ..., Rn) = fjk(Q1, Q2, ..., Qn). But for linear orders rjk = rkj^c (the
Boolean complement), so rkj(i) = qkj(i) also. Now the assertion follows directly
from the definition. □

THEOREM 1.6.4. (Arrow's Impossibility Theorem). If X contains at least
three alternatives, there exists no social welfare function satisfying assumptions
(1) to (5).

Proof. Write fij(a1, a2, ..., an) to express fij as a function of the (i, j)-entries
of R1, R2, ..., Rn by Proposition 1.6.3. By completeness, for i ≠ j,

fij(a1, a2, ..., an) + fji(a1^c, a2^c, ..., an^c) = 1.



By the Pareto condition

fij(1, 1, ..., 1) = 1

fij(0, 0, ..., 0) = 0.

Let a1, a2, ..., an, b1, b2, ..., bn, c1, c2, ..., cn be such that ai bi ≤ ci ≤ ai + bi
for all i. We will construct a profile Ri such that for any three fixed alternatives
x, y, z: x Ri y if and only if ai = 1, y Ri z if and only if bi = 1, x Ri z if and
only if ci = 1. To do this use this table:

ai  bi  ci   Preference
0   0   0    zyx
1   0   0    zxy
0   1   0    yzx
1   0   1    xzy
0   1   1    yxz
1   1   1    xyz

This proves the claim. Thus by transitivity and universal domain, if
ai bi ≤ ci ≤ ai + bi then fij(a) fjk(b) ≤ fik(c), where a = (a1, a2, ..., an), and
so on. Take a = c, b = (1, 1, ..., 1). Then fij(a) ≤ fik(a). By symmetry
fij(a) = fik(a). By a dual argument fji(a) = fki(a). Since this is true for all
distinct i, j, k we have that all fij for i ≠ j are equal (note fii(a) = 1 identically
by completeness). Let f denote this common function.
For ai bi ≤ ci ≤ ai + bi, f(a) f(b) ≤ f(c). Suppose f(a) = 1 and ci ≥ ai for
all i. Let each bi = 1. We have f(c) ≥ 1 so f(c) = 1. So f is monotone. Choose a
vector w with the fewest number of ones such that f(w) = 1. Suppose f(v) = 1
for some v. Then let ci = vi wi. Then f(c) ≥ 1 · 1 = 1. But unless v ≥ w, c will
have fewer ones than w. This is impossible. So only vectors greater than or equal
to w can have f(v) = 1.
By Pareto optimality, w ≠ 0. Suppose w has two or more ones. Let
0 < v < w. Then f(v) = 0 since v ≱ w, and f(v^c) = 0 since v^c ≱ w. This
contradicts completeness. So w has exactly one 1, say in place t.
Then f(v) = 1 if and only if v ≥ w, if and only if vt = 1. So x F y if and
only if x Rt y. So voter t is a dictator. □
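The six rows of the table in the proof can be verified mechanically. The sketch below (editorial, not from the text) checks both the Boolean constraint ai bi ≤ ci ≤ ai + bi and that each listed linear order realizes the prescribed pairwise preferences:

```python
# Sketch: check the table in the proof of Theorem 1.6.4. Each row maps
# (a_i, b_i, c_i) to a linear order (leftmost = most preferred) realizing
# a_i = [x over y], b_i = [y over z], c_i = [x over z].
table = {
    (0, 0, 0): "zyx", (1, 0, 0): "zxy", (0, 1, 0): "yzx",
    (1, 0, 1): "xzy", (0, 1, 1): "yxz", (1, 1, 1): "xyz",
}

def prefers(order, p, q):
    # 1 if p comes before q in the linear order
    return 1 if order.index(p) < order.index(q) else 0

for (a, b, c), order in table.items():
    assert a * b <= c <= min(a + b, 1)   # the Boolean sandwich constraint
    assert prefers(order, "x", "y") == a
    assert prefers(order, "y", "z") == b
    assert prefers(order, "x", "z") == c
```

The two missing triples (0, 0, 1) and (1, 1, 0) violate the constraint, which is why only six rows appear.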

EXERCISES
Level 1
1. For only two alternatives, name a social welfare function satisfying
assumptions 1-5.
2. What assumption does majority rule fail to satisfy in general?
3. Which assumptions does a dictatorial social welfare function satisfy?
4. Suppose we count point totals as follows: an alternative receives m − 1
points for being first choice of anyone, m − 2 for being second, etc., where

m = |X|. Is this transitive? complete? Pareto optimal? nondictatorial? Can
it satisfy universal domain? Here x F y if and only if x receives a greater
point total than y.
5. Show the function in Exercise 4 is not independent of irrelevant alternatives.
Take profiles P1 and P2 with

P1     P2

xyz    xyz
yzx    yxz
zxy    xyz

6. Which assumption is least necessary for real applications of majority rule?


How does it really work despite violating the five assumptions?
Level 2
1. List all weak orders on the set {1, 2, 3}. A weak order is a ranking of certain
alternatives as first, others as second, others as third, where ties are allowed.
2. Prove that majority rule is transitive on any domain of the form L^n where
L is a set of linear orders such that for no three alternatives x, y, z do we
have the three orderings xyz, yzx, zxy. (This is called a cyclic triple.) Suppose
not. Suppose a majority prefers x to y and a majority prefers y to z but a
majority does not prefer x to z. Prove at least one voter prefers x to y and
y to z, at least one prefers z to x and x to y, at least one prefers y to z
and z to x. This is a contradiction.
3. A set of preferences is called single peaked if the alternatives can be ordered
x1, x2, ..., xn so that no one ranks xj below both xi and xk if i < j < k. Prove
that no cyclic triple can occur for single peaked preferences. (Thus majority
rule is transitive.) Suppose xyz, yzx, zxy are single peaked preferences. Let
x be the middle numbered one of x, y, z. Can yzx occur?
4. Graphically, being single peaked means that as one goes from left to right
along x1, x2, ..., xn the utility (value) of xi to a person rises to a peak and
then falls. Explain why this may be true in an election where x1 is the most
conservative candidate, xn the most liberal, and the others are arranged in
order of liberalness or conservativeness.
5. Give an example of a social welfare function satisfying all the assumptions
except Pareto optimality. (It could be a constant function.)

Level 3
1. Prove Arrow's Impossibility Theorem for the domain D of all weak orders
(nondictatorial must be weakened: it is required just that if voter k strictly
prefers any x to y, so does the group; if he is indifferent, the group choice
may differ).
2. Explain why the number of weak orders on a set of m elements is
Σk S(m, k) k!. Use the classification of preorders. The partial order involved
must be a weak order.

3. Suppose the assumption of completeness is dropped from Arrow’s


Impossibility Theorem. What can be concluded?
4. A nonempty family F of subsets of a finite set is called a filter if (1) A ∈ F
and B ⊇ A imply B ∈ F, (2) A, B ∈ F imply A ∩ B ∈ F, (3) ∅ ∉ F. Classify
filters on a finite set.
5. Find a different type of filter on an infinite set. A filter is called an ultrafilter
if for all A either A ∈ F or A^c ∈ F. Show by Zorn's Lemma that any
filter is a subfamily of an ultrafilter.
CHAPTER 2

Semigroups and groups

In this chapter we take up systems having a single binary operation, such as
multiplication. Most systems of interest satisfy the associative property
(ab)c = a(bc) or a related property. In them the commutative property ba = ab
is not always true (for instance matrix multiplication). A system with the
associative property is called a semigroup. Semigroups can be studied in several
ways. One is to find generators and relations. A set of elements is a set of
generators if all elements of the system are products of a sequence from the set.
For instance −1, 1 generate the integers as a semigroup under addition, since
every integer is a sum of copies of −1, 1. Relations tell when two products of
generators are the same, for instance 1 + 1 − 1 = 1.
Another method is to study subsemigroups, ideals, and quotient semigroups.
A subsemigroup is a subset which forms a semigroup by itself under the product.
An ideal is a subsemigroup such that every product of an element in it and an
element outside it is in it. For instance an even integer times any integer is even.
A quotient semigroup is a semigroup made up of equivalence classes of some
equivalence relation under the same product.
Green’s relations are defined from the ideals and enable any semigroup to
be expressed as the union of equivalence classes. They to some degree express
the relation between semigroups and groups.
Any set of binary relations generates a semigroup. Binary relations such
as ‘x associates with y’ have been studied in particular groups of people by
sociologists.
For any machine, such as a computer, there is a semigroup associated with
the transitions from one internal state to another determined by inputs. This is
a semigroup of functions under composition.
An abstract concept of finite-state machine has been defined involving a
set of inputs, a set of outputs and a set of internal states. Such machines can
accomplish various theoretical tasks such as adding numbers or recognizing a
programming language. Programming languages themselves can be represented in
terms of free semigroups.

A group is a semigroup having an identity element e such that ex = xe = x
for all x and an inverse x^{-1} of each element x such that x x^{-1} = x^{-1} x = e.
Groups may be studied in terms of subgroups and quotient groups as semigroups
can be. A homomorphism is a function f from one group to another such that
f(xy) = f(x) f(y). Finite (or finitely generated) commutative groups can be
viewed as Cartesian products of cyclic groups (one generator groups), which are
quotient groups of the integers.
A one-to-one onto function from a set to itself is called a permutation.
The permutations on any set form a group, and every group may be represented
as a permutation group. The symmetries of a geometric set or mathematical
model form a group, and use of symmetry often simplifies the analysis of a
mathematical situation.
A system of distinct representatives consists of a selection of an object
from each of n sets U1, U2, ..., Un such that every selection is different from
the rest. If |U1 ∪ U2 ∪ ... ∪ Un| = n this must in effect be a permutation. The
Hall-Koenig theorem gives a necessary and sufficient condition for a system of
distinct representatives to exist. It is convenient to prove it by the important
Ford-Fulkerson algorithm for flows on networks.
A group of permutations on a set is said to act on the set. The subsets of
elements mapped into one another by the group are called orbits. The size
of each orbit equals the size of the group divided by the number of group
elements leaving a given member of the orbit fixed (the isotropy group). This is
the basis for many enumerational results where symmetry is involved. Polya's
theory describes the number of classes of functions f from a set X acted on by a
group G to another set S, where two functions differing by an action of G are
equivalent. For instance the number of types of colorings of a cube by three
colors can be computed using his theorem. Or the number of ways to arrange
atoms in a given molecular diagram may be calculated.
In kinship systems of primitive tribes, there are sometimes a small number of
clans such that the clan of children is determined from those of the parents in
such a way as to prevent incest. These may be studied by means of groups.
A lattice is a partially ordered set in which any two elements have a least
upper bound (as a union of sets) and a greatest lower bound (as an intersection).
Subgroups of a group form a lattice.

2.1 SEMIGROUPS
A semigroup is about the simplest and most general mathematical structure of
wide significance.

DEFINITION 2.1.1. A semigroup consists of a set S together with a binary
operation o mapping S × S to S and satisfying the associative law

x o (y o z) = (x o y) o z.

Thus a semigroup is a set provided with an associative binary operation.


(It must be closed under the operation, that is, a o b must always be within the
set.)

EXAMPLE 2.1.1. Any of the following is a semigroup under either
multiplication or addition: the positive integers, the integers, the positive
rational numbers, the rational numbers, the positive real numbers, the real
numbers, the set of n × n matrices over a field.
A semigroup will sometimes be written (S, o) to indicate the operation.
Why should an operation be associative? For one thing, an associative
operation on two elements defines unambiguously an operation on n elements,
such as x1 + x2 + ... + xn. For another, associativity ties in with the idea of
performing an operation on something, like twisting a face of Rubik's cube or
transforming raw materials into outputs. These are examples of transformations.

DEFINITION 2.1.2. A transformation on a set S is a function f: S → S. The
composition f o g of two transformations is defined by (f o g)(x) = g(f(x)).

Many branches of mathematics use the reverse convention and call
(f o g)(x) what we call (g o f)(x), so the student should be careful. However,
our convention is the one that follows from the definition of composition of
binary relations in Definition 1.3.2 and is standard for semigroup theory.
The composition of two transformations means: apply the first
transformation, then the second to its result. Associativity was proved in
Proposition 1.3.1. Therefore any set of transformations closed under
composition is a semigroup.
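The left-to-right convention can be seen concretely in the following Python sketch (editorial, not from the text; the particular f and g happen to be those of Exercise 2 of Level 1 below), which also shows that composition is generally not commutative:

```python
# Sketch of Definition 2.1.2's convention: (f o g)(x) = g(f(x)),
# i.e. apply f first, then g. Transformations on {1, 2, 3} as dicts.
def compose(f, g):
    return {x: g[f[x]] for x in f}

f = {1: 2, 2: 3, 3: 3}
g = {1: 3, 2: 1, 3: 3}
fg = compose(f, g)   # apply f, then g
gf = compose(g, f)   # apply g, then f
# fg maps 1 -> g(f(1)) = g(2) = 1, while gf maps 1 -> f(g(1)) = f(3) = 3,
# so f o g != g o f here.
```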

EXAMPLE 2.1.2. The following are semigroups, on any set S: all binary
relations on S, all transformations on S, all onto transformations, all 1-to-1 onto
transformations, all transformations sending a given subset T into itself, all
transformations having an image of at most k elements.
More generally, all transformations on a set preserving some structure, such
as a partial order or a binary operation, form a semigroup.

Another class of operations giving rise to semigroups is least upper bound
and greatest lower bound of two elements, provided these are unique.

EXAMPLE 2.1.3. For any subset of the real numbers, the operations sup{x, y}
(the larger of x, y) and inf{x, y} (the smaller of x, y) are associative.

DEFINITION 2.1.3. A lattice is a partially ordered set S in which for any two
elements a, b there exist unique elements a ∧ b, a ∨ b such that (1) a ∧ b ≤ a,
a ∧ b ≤ b, (2) a ∨ b ≥ a, a ∨ b ≥ b, (3) if c ≥ a, c ≥ b then c ≥ a ∨ b, (4) if
c ≤ a, c ≤ b then c ≤ a ∧ b. The elements a ∨ b, a ∧ b are called the least
upper bound and greatest lower bound of a, b, also the join and meet of a, b.

Not all posets are lattices, but many of the most important ones are.

EXAMPLE 2.1.4. The following are lattices: the real numbers, any linearly
ordered set, the set of all subsets of a set, the set of all partitions of a set, the set
of all lines and planes through the origin in E^3 together with {0} and E^3 itself.
Here E^3 is 3-dimensional space, and the partial order is inclusion in the last
three examples.

In this book E^n will denote n-dimensional Euclidean space.


EXAMPLE 2.1.5. These partial orders are not lattices:

{(1, 1), (2, 2)}, {(1, 1), (2, 2), (3, 3), (3,1), (3, 2)}.

Every lattice is a special type of semigroup under ∨ (or ∧) with several
additional properties.

Commutativity: a ∨ b = b ∨ a

Idempotence: a ∨ a = a

The last property is not true for semigroups obtained by addition or multiplica¬
tion of integers or real numbers, and indicates something of the variety of
semigroups that exist.
Finite lattices can in fact be described as finite idempotent commutative
semigroups with identity.
A different type of semigroup is made up of sequences of symbols called
words. The product of two words is obtained by writing one word directly after
the other. For instance (xyz)(xxzy) = xyzxxzy. Associativity follows from
(w1 w2) w3 = w1 w2 w3 = w1 (w2 w3).

DEFINITION 2.1.4. The free semigroup generated by a nonempty set X is the
set of all finite sequences from X (called words). The product of two sequences
x1 x2 ... xr and y1 y2 ... ys is the sequence x1 x2 ... xr y1 y2 ... ys.

The importance of free semigroups comes from the fact that every
semigroup is the image of a free semigroup under an onto function f such that
f(xy) = f(x) f(y).
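A convenient concrete model, shown in the Python sketch below (editorial, not from the text), is strings over the alphabet under concatenation: the product is juxtaposition, exactly as in Definition 2.1.4:

```python
# Sketch: Python strings under concatenation model the free semigroup
# on an alphabet; the product of two words is juxtaposition.
def product(w1, w2):
    return w1 + w2

# The example from the text: (xyz)(xxzy) = xyzxxzy
assert product("xyz", "xxzy") == "xyzxxzy"

# Associativity: (w1 w2) w3 == w1 (w2 w3)
w1, w2, w3 = "xy", "z", "yx"
assert product(product(w1, w2), w3) == product(w1, product(w2, w3))

# But not commutativity:
assert product("xy", "yx") != product("yx", "xy")
```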

EXAMPLE 2.1.6. The free semigroup generated by x, y has as its elements
x, y, xy, yx, xx, yy, xyx, xxx, yyx, xyy, xxy, yyy, yxy, yxx and so on.
A free semigroup on one generator is commutative, but not free semigroups
on more than one generator. All free semigroups on at least one generator
are infinite.

DEFINITION 2.1.5. A subsemigroup of a semigroup S is a subset T ⊆ S such
that if x, y ∈ T then xy ∈ T.

EXAMPLE 2.1.7. The semigroup of positive integers is a subsemigroup of the


semigroup of all integers.

Subsemigroups of a semigroup are subsets closed under the operation.


Knowledge of the subsemigroups of a semigroup gives information about its
structure.

DEFINITION 2.1.6. Two semigroups S, T are isomorphic if and only if there
exists a 1-to-1 onto function f: S → T such that f(x * y) = f(x) o f(y), where
* is the operation in S and o is the operation in T.

EXAMPLE 2.1.8. The semigroup of real numbers under addition is isomorphic
to the semigroup of positive real numbers under multiplication. The
isomorphism is given by f(x) = e^x, and the property f(x * y) = f(x) o f(y)
corresponds to the basic property of exponential functions e^(x+y) = e^x e^y.
Here * is +, o is ×.

Two isomorphic semigroups are identical in mathematical structure. It can


be considered that they differ only in the names of the elements: if x is renamed
f(x) then the products always agree. Thus if a semigroup S is proved isomorphic
to a semigroup T it suffices to study T in order to determine the properties of S.
For instance S is commutative or idempotent or finite or has an identity if and
only if T has these respective properties. There is a 1-to-1 correspondence
between subsemigroups of S and those of T.
Any given semigroup is isomorphic to a semigroup of transformations
on a large enough set. Thus any abstract property true of all semigroups of
transformations is true of every semigroup.

THEOREM 2.1.1. Any semigroup S is isomorphic to a subsemigroup of a


semigroup of transformations.

Proof. Let 1 be an element not in S. Make S ∪ {1} into a semigroup M
in which S has its same product and 1 is an identity: 1x = x1 = x. The
associative law (xy)z = x(yz) can be verified by studying each case as to
whether x, y, z equal 1 or are in S.
To each element x ∈ M we associate a transformation fx on M such that
fx(a) = ax. We have fxy(a) = axy = (ax)y = fy(fx(a)) = (fx o fy)(a). Therefore
fxy = fx o fy. Moreover this correspondence is 1-to-1, since if fx = fy then
fx(1) = x = fy(1) = y. This proves the theorem. □
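The construction in the proof (adjoin an identity, then represent each element by the right translation a ↦ ax) can be tried on a small concrete semigroup. The Python sketch below is editorial, not from the text; it uses {1, 2, 3} under min as the ambient semigroup:

```python
# Sketch of the proof of Theorem 2.1.1: adjoin an identity "e" to a small
# semigroup ({1, 2, 3} under min) and represent x by f_x(a) = a * x.
S = [1, 2, 3]
def op(a, b):
    return min(a, b)

M = ["e"] + S                 # M = S with an adjoined identity "e"
def ext(a, b):                # the extended product on M
    if a == "e": return b
    if b == "e": return a
    return op(a, b)

def f(x):                     # the transformation a -> a * x on M
    return {a: ext(a, x) for a in M}

# f_{xy} equals "f_x followed by f_y", and x is recovered as f_x(e):
for x in S:
    for y in S:
        composed = {a: f(y)[f(x)[a]] for a in M}
        assert f(op(x, y)) == composed
    assert f(x)["e"] == x
```

Adjoining the identity is what makes the representation 1-to-1: without it, distinct elements of an arbitrary semigroup could induce the same translation.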

EXERCISES
Level 1
1. Give 3 additional examples of semigroups (what about complex numbers?).
2. Let f, g on X = {1, 2, 3} be given by f(1) = 2, f(2) = 3, f(3) = 3, g(1) = 3,
g(2) = 1, g(3) = 3. Calculate f o g and g o f.
3. If h is given by h(1) = 2, h(2) = 3, h(3) = 1 verify the associative law for
(f o g) o h. Here f and g are the same as in the above exercise.
4. What is 2 ∨ 3? 3 ∧ 2?
5. Prove in any lattice that a ∧ a = a, a ∨ a = a, a ∧ b = b ∧ a, a ∨ b = b ∨ a.

6. What is the product (xyzz)(zyz)? Give examples to illustrate the fact that
free semigroups are associative but not commutative.

Level 2
1. Consider all transformations on the real numbers of the form ax + b. Prove
this is a semigroup (i.e. prove closure) and give a formula for the composition
of ax + b and cx + d.
2. What are ∧, ∨ in the lattice of subsets of a set?
3. Prove that if (S, o) is a semigroup then another semigroup can be defined as
(S, *) where x * y = y o x. This is called the opposite semigroup.
4. Write down all transformations on the set {1, 2, 3}.
5. Compute the composition of all transformations on {1, 2, 3} having at most
two elements in their image.
6. Prove that transformations having image size ≤ k form a subsemigroup, and
even more, an ideal: if f does, so do f o g and g o f for all g.
7. Write out all words of length exactly 4 in the free semigroup on 2 letters
x, y.
Level 3
1. How can the semigroup of n × n matrices over R be regarded as a semigroup
of transformations?
2. Let S be a finite commutative, idempotent semigroup with an identity e
such that ex = xe = x. Define a partial order such that xy = x ∨ y. Prove
it is a partial order.
3. Define ∧ in the situation of the last example. Prove this gives a lattice.
4. How many transformations are there on a set of n elements?
5. Describe free semigroups on one generator. Give an isomorphism from such
a free semigroup to the positive integers.
6. When will two elements of a free semigroup commute?

2.2 GENERATORS AND RELATIONS


A semigroup can be described by listing all its elements and the product of any
two elements.

EXAMPLE 2.2.1. The following is a semigroup:

    a  b  c  d
a   a  b  a  b
b   a  b  a  b
c   c  d  c  d
d   c  d  c  d

It is a special case of a rectangular band, that is, a Cartesian product with
multiplication (a, b)(c, d) = (a, d).
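The rectangular band structure can be checked mechanically. In the Python sketch below (editorial, not from the text), the four elements are the pairs of {1, 2} × {1, 2}; relabelling (1,1), (1,2), (2,1), (2,2) as a, b, c, d recovers a table of the kind in Example 2.2.1:

```python
# Sketch: the rectangular band on {1, 2} x {1, 2} with (a, b)(c, d) = (a, d).
from itertools import product

def mult(p, q):
    return (p[0], q[1])

band = list(product([1, 2], repeat=2))

# Verify associativity and the identities x^2 = x, x w y = x z y:
for x in band:
    assert mult(x, x) == x                       # idempotence
    for y in band:
        for z in band:
            assert mult(mult(x, y), z) == mult(x, mult(y, z))
            assert mult(mult(x, y), z) == (x[0], z[1])   # middle factor is irrelevant
```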

In many cases a briefer description is possible. Instead of giving all elements,


it is sufficient to give a set G ⊆ S of elements such that every other element is a
product of a sequence of elements of G. In this case G is called a set of
generators for S.

DEFINITION 2.2.1. Let S be a semigroup and G ⊆ S. Then G is a set of
generators for S if S is the set of all products x1 x2 ... xs such that xi ∈ G.

For the positive integers under addition {1} is a generating set since every
positive integer has the form 1 + 1 + ... + 1. A free semigroup on 2 generators
illustrates the fact that a finite set of generators can generate an infinite
semigroup.
In terms of transformations a generating set is a set from which all
transformations of the semigroup can be produced. A generating set for the
transformations on Rubik’s cube has 6 elements, a quarter-turn clockwise
for each face.

EXAMPLE 2.2.2. Any rearrangement of n objects can be obtained by
successively interchanging pairs of adjacent objects. This implies that the
permutations (k k+1), which interchange k, k+1 and keep all other elements
fixed, generate the set of all permutations on {1, 2, ..., n}.

To describe the multiplication in a semigroup, not only the generators are


needed, but also which products of generators equal one another.

DEFINITION 2.2.2. A relation in a semigroup S is an equation x1 x2 ... xr =
y1 y2 ... ys for some xi, yi ∈ S.

EXAMPLE 2.2.3. In the additive semigroup of positive integers we have the


relations 2 + 3 = 4 + 1, 6 + 9 = 5 + 10.

A semigroup is essentially known if we know a set of generators and all


relations among those generators. It is not necessary even to know all relations,
just enough to imply all relations. Every element is a product of generators and
the relations tell which products are distinct. The multiplication is described by
(x1 x2 ... xr)(y1 y2 ... ys) = x1 x2 ... xr y1 y2 ... ys.

EXAMPLE 2.2.4. The semigroup of positive integers 2^n 3^m, n, m ≥ 0, under
multiplication has generators 1, 2, 3. These are a set of defining relations:

1·1 = 1, 1·2 = 2, 2·1 = 2, 1·3 = 3,

3·1 = 3, 2·3 = 3·2.

Congruences on a semigroup are equivalence relations such that the
equivalence class of a product depends only on the equivalence classes of its
factors.
This means that the equivalence classes themselves form a semigroup, called the
quotient semigroup. If two elements are congruent and both are multiplied by
the same element, the products are congruent.

DEFINITION 2.2.3. A congruence on a semigroup S is an equivalence relation
E on S such that if x E y and z ∈ S then xz E yz and zx E zy.

Let x E y and z E w. Then by definition of congruence xz E yz E yw. This
proves that the equivalence class of a product depends only on those of its
factors. Writing [x] for the class of x, associativity follows from
([x][y])[z] = [(xy)z] = [x(yz)] = [x]([y][z]), so we have a semigroup.

DEFINITION 2.2.4. The quotient semigroup defined by a congruence E is the
set of equivalence classes [x] with product [x][y] = [xy].

EXAMPLE 2.2.5. A congruence on the positive integers under addition is
defined by x E y if and only if x = y or x, y ≥ 3. The quotient semigroup has
three classes [1], [2], [3] and multiplication table

      [1]  [2]  [3]
[1]   [2]  [3]  [3]
[2]   [3]  [3]  [3]
[3]   [3]  [3]  [3]
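The table of Example 2.2.5 can be reproduced by computing with canonical representatives, as in the following Python sketch (editorial, not from the text):

```python
# Sketch: the quotient of (positive integers, +) by the congruence
# x E y iff x = y or x, y >= 3. Class representatives are 1, 2, 3.
def cls(x):
    # canonical representative of the class of x
    return x if x < 3 else 3

def quotient_add(a, b):
    # product of classes: [a] + [b] = [a + b]
    return cls(a + b)

# Reproduces the 3 x 3 table of Example 2.2.5:
table = [[quotient_add(a, b) for b in (1, 2, 3)] for a in (1, 2, 3)]
```

That the product is well defined is exactly the congruence property: replacing a or b by any other member of its class leaves cls(a + b) unchanged.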

The theory of generators and relations is connected with the theory of


congruences on a free semigroup on the generating set. Each relation corresponds
to two words in the free semigroup being equivalent.
We first state that any set of relations on a generating set gives a congruence
on a free semigroup.

PROPOSITION 2.2.1. Let F be the free semigroup generated by a set C. Let
wi, vi for i = 1 to k be words, i.e. elements of F. Let R be the relation such that
(w, w') ∈ R if and only if w' can be obtained from w by a finite number of
changes of the form a wi b → a vi b or a vi b → a wi b, where a, b ∈ F (as a
special case we allow a or b to be a sequence of no elements). Then R is a
congruence.

Proof. Straightforward.

Call the congruence of this proposition E(wi = vi).



DEFINITION 2.2.5. The semigroup with generating set C and defining relations
wi = vi is the quotient semigroup associated with the congruence E(wi = vi) on
the free semigroup generated by C.

EXAMPLE 2.2.6. The semigroup with generating set x and defining relation
x = x^4 has 3 distinct elements x, x^2, x^3. (We have x^4 = x, x^5 = x^2,
x^6 = x^3, and so on.) Products are given by the table below.

      x    x^2  x^3
x     x^2  x^3  x
x^2   x^3  x    x^2
x^3   x    x^2  x^3

Namely x^5 = x^4 x = x x = x^2, x^6 = x^4 x^2 = x x^2 = x^3, and so on.
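Since x^4 = x, exponents greater than 3 reduce modulo 3 back into {1, 2, 3}. The Python sketch below (editorial, not from the text) implements this reduction and regenerates the table:

```python
# Sketch: the one-generator semigroup with defining relation x^4 = x.
# Represent x^k by its exponent, reduced into {1, 2, 3}.
def reduce_exp(k):
    while k > 3:
        k -= 3        # x^4 = x means exponents repeat with period 3 above 1
    return k

def mult(i, j):
    # x^i * x^j = x^(i+j), reduced by the relation
    return reduce_exp(i + j)

# Exponent form of the table in Example 2.2.6:
table = [[mult(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]
```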

EXAMPLE 2.2.7. The semigroup with generators x, y and defining relation
xy = yx has as its distinct elements all products x^n y^m. Multiplication is given
by x^n y^m x^r y^s = x^(n+r) y^(m+s). Namely, by induction one can prove that
any power of x must commute with any power of y.

A homomorphism of semigroups is more general than an isomorphism, in that
it need not be 1-to-1 or onto. Although a homomorphism may not be 1-to-1,
frequently a homomorphic image of a semigroup is simpler than the original and
thus gives insight into its structure.

DEFINITION 2.2.6. A homomorphism of semigroups f: S → T is a function f
such that f(xy) = f(x) f(y).

EXAMPLE 2.2.8. There exists a homomorphism f of semigroups from
the set of all positive integers under multiplication to the set of positive
integers under addition such that f(2^n · (odd number)) = n. For instance
f(36) = f(4 · 9) = f(2^2 · 9) = 2.

EXAMPLE 2.2.9. The determinant is a homomorphism of semigroups from


n × n matrices under matrix multiplication to real (or complex) numbers under
multiplication. This is true because

det (AB) = det (A) det (B).

In some applications it is convenient to represent semigroups by generators


and defining relations, or equivalently to find a set of generators and a set of

relations which imply every relation. For any set of relations in a semigroup S,
the next proposition shows there is a homomorphism of the semigroup defined
by those relations, onto S. This will be an isomorphism if the set of relations is
complete.

PROPOSITION 2.2.2. Let C be a set of elements contained in a semigroup S
and wi = vi certain relations which hold in S, where wi, vi are products of
elements of C. Let G be the semigroup with generating set C and defining
relations wi = vi. Then there exists a unique homomorphism f: G → S such
that f(c) = c for all c ∈ C.

Proof. Straightforward.

To check that a complete set of relations has been found, it may be necessary
to determine the semigroup determined by a set of relations vi = wi in certain
generators. First deduce some consequences of these relations. Then use these to
show that every word is equal to a word from a list of words S. (The set S
should be such that no two words are equal under the relations. If not, delete
one of the equal words.)
Compute products in S by taking the products of pairs of words and
reducing them to elements of S by means of the relations. If products are
associative and the relations vi = wi are satisfied, then the semigroup determined
by the given relations has been found.

EXAMPLE 2.2.10. Let a semigroup have generators x, y and defining relation
xy = yx^2. Then by induction, x y^s = y^s x^(2^s). By another induction,
x^t y^s = y^s x^(2^s t). This means any word can be rearranged so that the y
powers (if any) precede the x powers (if any). Thus any word can be written in
the form y^s x^t for s, t ≥ 0.
Products are given by y^s x^t y^u x^v = y^s y^u x^(2^u t) x^v =
y^(s+u) x^(2^u t + v). It can be verified that these are associative. So the
semigroup determined by this relation has been found.
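The normal-form calculation of Example 2.2.10 can be carried out on pairs of exponents, as in the Python sketch below (editorial, not from the text), where (s, t) stands for the word y^s x^t:

```python
# Sketch: normal forms in the semigroup with defining relation x y = y x^2.
# A word is stored as (s, t), meaning y^s x^t; the derived rule
# x^t y^u = y^u x^(2^u * t) gives the product formula of Example 2.2.10.
def mult(w1, w2):
    s, t = w1
    u, v = w2
    return (s + u, (2 ** u) * t + v)

# (y x)(y x) = y^2 x^3, since y x y x = y (x y) x = y (y x^2) x = y^2 x^3
assert mult((1, 1), (1, 1)) == (2, 3)

# Associativity spot-check on three words
a, b, c = (1, 2), (0, 1), (2, 3)
assert mult(mult(a, b), c) == mult(a, mult(b, c))
```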

For many elementary semigroups this procedure can readily be carried out.
However, it can be very difficult. In fact, it has been proved that the problem
of deciding whether or not two words are equal, in a semigroup specified by
generators and relations, can be undecidable.
There is a different method for studying semigroups, without generators and
relations, taken up in the next section.

EXERCISES
Level 1
1. A more general rectangular band can be defined on any nonempty Cartesian
product set S × T by the multiplication (i, j)(n, m) = (i, m). Prove this is
associative.

2. Show that if S or T contains at least 2 elements, a rectangular band is not
commutative.
3. Show these relations hold identically in rectangular bands: x^2 = x,
xwy = xzy.
4. What are a set of generators for the positive integers under multiplication?
5. Show that in the semigroup of constant transformations on a set S we
have xy = y for all x, y (with composition as in Definition 2.1.2).
6. Consider a semigroup with one generator x and one relation x^3 = x. Prove
that x^(2n+1) = x and x^(2n+2) = x^2 for all n. Prove x^2 acts as an identity
element e. Assuming x ≠ e, write out the multiplication table on the
elements e, x.

7. If e is an identity element in T and S is any semigroup show the constant


mapping f(x) = e is a homomorphism. Need e be an identity?

Level 2
1. Consider the transformations fi, i = 1, 2, ..., n − 1, on {1, 2, ..., n} such
that fi(i) = i + 1, fi(i + 1) = i and fi(x) = x for other x. Prove fi o fi = e
where e(x) = x for all x.
2. Prove fi o fj = fj o fi whenever j ≠ i ± 1.
3. Prove (fi o fi+1)^2 = fi+1 o fi for i = 1, 2, ..., n − 2.
4. Describe all homomorphisms from the positive integers under addition to
itself.
5. Write out the set of transformations from {1, 2} into itself. Let x be the
transposition interchanging 1, 2 and y the constant transformation y(i) = 1
for i = 1, 2. Prove the other two can be expressed as products of x, y.
6. Give, as far as you can, a complete set of relations on x, y in the above
exercise.
7. Describe the semigroup with generators x, y and defining relation xy = yx.

Level 3
1. What semigroup is defined by these relations on x, y: xy = yx, xxy = x,
yxy = y?
2. Consider the semigroup defined by these relations: xx = yyy, xxx = x,
yyyy = y, xyx = yy. How many elements does it have? Which element is xx = yyy?
3. Classify homomorphisms from a free semigroup into itself.

4. Show that for any set C, any semigroup S, and any function f : C → S there
exists a unique homomorphism g from the free semigroup generated by C
into S such that f(c) = g(c) for c ∈ C. This is the 'universal property' of
free semigroups.
5. Show that if a semigroup has the property of the above exercise it is
isomorphic to a free semigroup.
6. Consider the semigroup of all transformations f(x) = ax + b where a = ±1
and b is an integer. Find a set of generators for this semigroup, and some
relations among them.
7. Give some homomorphisms from the semigroup of n × n matrices under
addition to the real numbers.

2.3 GREEN’S RELATIONS


The algebraic theory of semigroups is based on constructing semigroups out of
simpler components. For instance one type of component might be a subset S
which is a group (that is, has an identity e such that ex = xe = x for all x ∈ S,
and for all x ∈ S there is an element y ∈ S with xy = yx = e). Groups have their
own structure, which we take up later, but from the viewpoint of semigroup
theory they are irreducible and cannot be broken down further.

EXAMPLE 2.3.1. The semigroup of transformations contains the subset of
1-to-1 onto transformations, which is a group. It also has other subsets which
are groups.

EXAMPLE 2.3.2. The multiplicative semigroup of rational numbers consists of
0 together with the multiplicative group of nonzero rational numbers.

Subsemigroups, subsets closed under multiplication, are significant in the
structure of semigroups. Even more significant are ideals, subsets closed under
multiplication even if one factor is outside the subset.

DEFINITION 2.3.1. A left (right, two-sided) ideal in a semigroup S is a
nonempty set K ⊆ S such that if x ∈ K and y, z ∈ S, then yx ∈ K
(respectively xy ∈ K; xy, yx, yxz ∈ K).

EXAMPLE 2.3.3. The semigroup of n × n matrices over a field has the ideal
consisting of all singular matrices, since if X has determinant 0, so does YXZ.

EXAMPLE 2.3.4. The set of positive integers under multiplication has an ideal
Im consisting of all multiples of m for a positive integer m. This is true because
if x is a multiple of m so is yxz.

For noncommutative semigroups there can be left-ideals which are not
right-ideals and vice versa. There are ideals associated with all elements x;
however, these ideals may not be distinct and may be the entire semigroup,
which is always a two-sided ideal.

DEFINITION 2.3.2. For subsets A, B of a semigroup, AB means
{ab : a ∈ A, b ∈ B}, the set of products formed by an element of A times an
element of B.

DEFINITION 2.3.3. The left (right) ideal generated by a subset A in a
semigroup S is SA ∪ A (AS ∪ A). The two-sided ideal generated by x is
{x} ∪ xS ∪ Sx ∪ SxS.

DEFINITION 2.3.4. A principal ideal is one generated by a single element.

EXAMPLE 2.3.5. The two-sided ideal generated by x in the additive semigroup
of positive real numbers is all real numbers y ≥ x.

EXAMPLE 2.3.6. In the multiplicative semigroup of nonnegative real numbers, 0
generates the ideal {0}.

For any semigroup S an element 1 can be added. If products in S are
extended by 1a = a1 = a for a ∈ S, and 1·1 = 1, this gives a new semigroup with
identity, denoted S^1. Definitions can sometimes be stated more conveniently in
terms of S^1 than S.

DEFINITION 2.3.5. Two elements of a semigroup are L-equivalent
(R-equivalent, J-equivalent) if and only if they generate the same left (right,
two-sided) ideal.

It is readily verified that the relations defined by this are equivalence
relations in any semigroup.

EXAMPLE 2.3.7. In the semigroup of constant transformations any two
elements are L-equivalent, since xy = x means Sy equals S. But no two elements
x, y are R-equivalent, since yS = {y} and xS = {x}.

Both L-equivalence and R-equivalence imply J-equivalence but not
conversely. Two more relations are built out of L, R.

DEFINITION 2.3.6. Two elements x, y are H-equivalent if and only if they
are both R- and L-equivalent. They are D-equivalent if and only if there exists z
such that x R z and z L y.

Thus H is the intersection relation R ∩ L and D is the composition R ∘ L.
In a commutative semigroup all 5 relations coincide. In a group any two elements
are equivalent under each relation.

In a free semigroup no two distinct elements are J-equivalent.



EXAMPLE 2.3.8. Two matrices over a field are J (D)-equivalent if and only
if they have the same rank. They are R (L)-equivalent if and only if their row
(column) spaces are identical. They are H-equivalent if and only if their rows
span the same space and their columns span the same space.

The five relations L, R, J, H, D are known as Green's relations. These
relations have a meaning in terms of solving equations like xa = b or ax = b
in the semigroup.

THEOREM 2.3.1. The Green's relations are equivalence relations. The relations
L, R, J can alternatively be defined as follows, in a semigroup S:

(1) a L b ⟺ xa = b and yb = a for some x, y ∈ S^1
(2) a R b ⟺ ax = b and by = a for some x, y ∈ S^1
(3) a J b ⟺ zax = b and wby = a for some x, y, z, w ∈ S^1

Proof. First (1) will be proved. Suppose a L b. Then S^1 a = S^1 b. So a ∈ S^1 b
and b ∈ S^1 a. So a = yb, b = xa for some x, y ∈ S^1.
Suppose a = yb, b = xa. Then a ∈ S^1 b and b ∈ S^1 a. Thus
S^1 a ⊆ S^1(S^1 b) = (S^1 S^1)b ⊆ S^1 b. And S^1 b ⊆ S^1 a. So a L b. This proves (1).
The proofs of (2) and (3) are similar. □
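These ideal-based characterizations can be checked by brute force on a small semigroup. The following Python sketch (illustrative, not from the text) computes principal ideals in the full transformation semigroup on 3 points; the product f*g means "apply f, then g", matching the text's convention of letting transformations act on the right:

```python
from itertools import product

n = 3
T = list(product(range(n), repeat=n))   # all 27 transformations of {0, 1, 2}

def mult(a, b):
    # a*b means "apply a, then b" (transformations acting on the right)
    return tuple(b[a[i]] for i in range(n))

def left_ideal(a):       # S^1 a
    return frozenset([a] + [mult(x, a) for x in T])

def right_ideal(a):      # a S^1
    return frozenset([a] + [mult(a, x) for x in T])

def two_sided_ideal(a):  # S^1 a S^1, together with the smaller ideals
    elems = [a] + [mult(x, a) for x in T] + [mult(a, x) for x in T]
    elems += [mult(mult(x, a), y) for x in T for y in T]
    return frozenset(elems)

# Two transformations with the same image {0, 1} but different partitions:
a, b = (0, 1, 1), (0, 0, 1)
```

With these definitions, a and b generate the same left ideal (they are L-equivalent, having the same image), different right ideals (different partitions), and the same two-sided ideal (the same rank), in line with the table of classes given below for T_n.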

COROLLARY 2.3.2. The relation D is an equivalence relation.

Proof. Suppose a D b. Then a R c and c L b for some c. So for some
x, y, z, w: ax = c, cy = a, zc = b, wb = c by (1), (2) of Theorem 2.3.1. To
prove symmetry it will suffice to find d such that b R d and d L a. Let d = za.
Then wd = wza = wzcy = wby = cy = a. So d L a. And dx = zax = zc = b.
Also by = zcy = za = d. So d R b. This proves symmetry.
Note that D is the composition relation R ∘ L = L ∘ R. But now D ∘ D =
R ∘ L ∘ R ∘ L = R ∘ R ∘ L ∘ L ⊆ R ∘ L = D. This proves transitivity of D.
It is reflexive since R, L are. □

Equivalence classes of these relations are called R-, L-, J-, H-, and D-classes
respectively. The relation D can also be viewed as the smallest equivalence
relation containing R, L (the join of R and L).
The following table gives some of these classes for several semigroups.

         R-classes       L-classes     J-classes = D-classes
M_n(F)   null space      image         rank
T_n      partition       image         rank
B_n      column space    row space     isomorphism type of row space



Here the semigroups M_n(F), T_n, and B_n are, respectively, n × n matrices
over the field F, transformations on the set {1, 2, ..., n}, and n × n Boolean
matrices. The table gives a characteristic which determines the class of an
element in each semigroup. Matrices are regarded as acting on row vectors.
The partition of a function f is the set of equivalence classes of the relation
{(x, y) : f(x) = f(y)}.

THEOREM 2.3.3. For any finite semigroup the relations J and D coincide.

Proof. If a J b then a = xby, b = zaw for some x, y, z, w ∈ S^1. We have
the inclusion aS^1 ⊇ awS^1 and the epimorphism awS^1 → zawS^1 = bS^1. So
|aS^1| ≥ |awS^1| ≥ |bS^1|. Likewise |bS^1| ≥ |aS^1|. So |aS^1| = |awS^1| = |bS^1|.
So aS^1 = awS^1, since aS^1 ⊇ awS^1. So a R aw. Likewise S^1 aw ⊇ S^1 zaw = S^1 b
and |S^1 b| ≥ |S^1 xb| ≥ |S^1 xby| ≥ |S^1 xbyw| = |S^1 aw|. So S^1 aw = S^1 b. So
aw L b. Therefore a D b. □

Within a D-class the elements can be arranged in R- and L-classes like bottles
in a crate (this is called the egg box picture). Let R_1, R_2, ..., R_n be the R-classes
contained in D and L_1, L_2, ..., L_m the L-classes contained in D. Then the
H-classes are

R_1 ∩ L_1   R_1 ∩ L_2   ...   R_1 ∩ L_m
R_2 ∩ L_1   R_2 ∩ L_2   ...   R_2 ∩ L_m
...
R_n ∩ L_1   R_n ∩ L_2   ...   R_n ∩ L_m

These H-classes all have the same size, and there is an explicit isomorphism
between any two. For details on this and succeeding results, see Clifford and
Preston (1964).
Regularity is a concept related to existence of certain weak inverses of an
element.
Regularity is a concept related to existence of certain weak inverses of an
element.

DEFINITION 2.3.7. An element x of a semigroup S is regular if and only if
there exists y ∈ S such that xyx = x. (Here x is a little like an inverse of y.)

EXAMPLE 2.3.9. Any nonsingular matrix is regular, since we may take
y = x^(−1). More generally, any element of a group is regular.



EXAMPLE 2.3.10. No element of the positive integers under addition, or of
any free semigroup, is regular.

This is related to the fact that no inverse of an element in these semigroups
(in a larger group) belongs to the semigroup.

A group inverse satisfies xy = yx = e, where e is an identity element. This
implies xyx = xe = x, yxy = ye = y. If x, y satisfy the equations xyx = x,
yxy = y they are called generalized inverses. The next theorem shows regularity
implies existence of a generalized inverse. Generalized inverses are used in matrix
theory for singular or non-square matrices, and in statistics and other areas.
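As an illustration (a Python sketch, not from the text), a generalized inverse of any transformation on a finite set can be built directly: for each point a in the image choose a preimage c_a, and send everything outside the image to some fixed c_a. The example transformation f is hypothetical.

```python
def generalized_inverse(f):
    """Return g with f*g*f = f and g*f*g = g, for a transformation f
    given as a tuple over {0, ..., n-1}."""
    n = len(f)
    choice = {}
    for c in range(n):
        choice.setdefault(f[c], c)          # pick one preimage c_a for each a = f(c_a)
    default = next(iter(choice.values()))   # fixed c_a used off the image
    return tuple(choice.get(a, default) for a in range(n))

f = (1, 1, 2)                 # an example transformation on {0, 1, 2}
g = generalized_inverse(f)    # here g = (0, 0, 2)
```

Composing in either order recovers the defining equations: f(g(f(i))) = f(i) and g(f(g(i))) = g(i) for every i, so the full transformation semigroup is regular.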

THEOREM 2.3.4. Let S be a semigroup.

(1) a ∈ S is regular if and only if a R e and a L f for certain idempotents e, f.
(2) If one element of a D-class is regular so is the entire D-class.
(3) A D-class D of a finite semigroup is regular if and only if DD ∩ D ≠ ∅.
(4) If a is regular, there exists x ∈ S such that a = axa, x = xax.

Proofs of all but (3) can be found directly in Clifford and Preston (1964),
and (3) follows from their Lemma 2.17 by reasoning similar to the proof of
Theorem 2.3.3.

Idempotents, H-classes, and subsets of semigroups which form groups are
related. In a regular D-class some H-classes (those with idempotents in them)
will be groups, and all other H-classes will be in 1-1 correspondence with groups.

EXERCISES

Level 1

1. Prove that in the additive group of integers any two elements are
R-equivalent.
2. Prove that R = L = J in a commutative semigroup.
3. Let S be a semigroup with identity e with a partial order > such that if
x> y then ax > ay. Show the elements a > e form a subsemigroup T.
4. Let T be as in the above exercise. Prove that in T if x J y then x = y.
5. Prove that if x R y then x D y and x J y.
6. Show the subsets of a semigroup form a semigroup under the product AB.
7. Prove that every element of a group is regular.
8. What are the 5 relations in a rectangular band S × T with product
(r, s)(t, u) = (r, u)?

Level 2
1. Let the rank of a transformation on {1, 2, ..., n} be the number of elements
in its image set. Prove rank(fg) ≤ rank(f) and rank(fg) ≤ rank(g).
2. Let A, B be subsets of {1, 2, ..., n} having the same number of elements.
Define a transformation f on {1, 2, ..., n} such that f(A) = B.
3. Prove that two L-equivalent transformations f, g on {1, 2, ..., n} must have
the same image set.
4. Prove two R-equivalent transformations on {1, 2, ..., n} have the same
partition.
5. Prove two J-equivalent transformations have the same rank.
6. Show that the non-square matrix [1 1] has a generalized inverse.
7. For a transformation f on {1, 2, ..., n}, construct g as follows. For each
a ∈ Image(f) choose c_a such that f(c_a) = a. Let g(a) = c_a for
a ∈ Image(f). For x ∉ Image(f) let g(x) be any fixed element c_a. Prove g is
a generalized inverse of f. Therefore the semigroup of transformations is
regular.
8. What is the relation R in a lattice?

Level 3
1. Prove that if a semigroup has no left ideals except S and no right ideals
except S then it is a group.
2. Prove that the relation x ≤ y if and only if x = ayb for some a, b in S^1
gives a quasi-order on S and gives a partial order on J-classes. Show this
partial order is isomorphic to the set of two-sided principal ideals under
inclusion.
3. Construct an infinite semigroup for which J ≠ D (use generators and
relations).
4. A rectangular band of groups uses nonempty sets S, T, a group G, and
elements g_st ∈ G for s ∈ S, t ∈ T (a sandwich matrix). The product on
S × G × T is given by (r, x, t)(s, y, u) = (r, x g_st y, u). Prove this is
associative. What are the 5 Green's relations?
5. Prove that two transformations with the same image set are L-equivalent.
6. Prove two transformations with the same partition are R-equivalent.
7. Prove two transformations of the same rank are D-equivalent.

2.4 BLOCKMODELS
Sociologists have studied groups of people in terms of various relationships that
exist on the group: friendship, respect, influence, frequent contact, dislike. Each
gives rise to a binary relation, as {(x, y) : x is a friend of y}. Each binary relation
may then be represented by a Boolean matrix, whose (i, j)-entry is 1 if the
relation exists between persons i and j, and whose (i, j)-entry is 0 if the relation
does not exist. This gives a set of Boolean matrices for analysis.
One method of analysis is to try to find significant subsets of the group.
An early method of R. D. Luce and A. D. Perry was to try to find cliques.
subsets of at least 3 members, and as large as possible, in which every person has
the relation to every other. There are several hindrances: cliques are difficult
to find by computer, and are usually approximate and not exact. R. D. Luce
considered cliques in the square (or higher power) of a binary relation as a means
of dealing with the second. However, some power of most binary relations is the
Boolean matrix all of whose entries are 1.
Methods of finding subsets of the group which are approximate
(in some sense) cliques are known as cluster analysis. There are many methods:
clusters can be built up from single individuals, or the group may be repeatedly
split in two so as to maximize some clustering index. Generalized components of
a graph may be studied. Powers of a Boolean matrix may be taken.
In the method of blockmodels of H. C. White, S. A. Boorman, and R. L.
Breiger, the individuals are to be grouped in blocks. Every individual in one
block is to have the relation to every individual in another block, or else no
individual in the former block is to have the relation to any individual in the
second block. In practice, a few exceptions must be allowed.
This means we are searching for a congruence on the individuals with
respect to the given relations. To group the individuals, a method, CONCOR, of
R. L. Breiger, S. A. Boorman, and P. Arabie works well, although other clustering
methods can be used. CONCOR repeats a step of passing from a matrix
M to the matrix of correlations among rows of M, until this sequence
of matrices converges.
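The CONCOR step can be sketched in a few lines of Python (an illustrative reconstruction, not the published algorithm; the example matrix is hypothetical):

```python
def corr(u, v):
    # Pearson correlation of two rows (0.0 if either row is constant)
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    du = [x - mu for x in u]
    dv = [x - mv for x in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = (sum(a * a for a in du) * sum(b * b for b in dv)) ** 0.5
    return num / den if den else 0.0

def concor(M, steps=20):
    """Repeatedly replace M by the matrix of correlations among its rows.
    Entries tend to +1 or -1; the sign pattern splits the rows into two blocks."""
    for _ in range(steps):
        M = [[corr(M[i], M[j]) for j in range(len(M))] for i in range(len(M))]
    return M

# Two obvious blocks {0, 1} and {2, 3}: rows within a block correlate at +1,
# rows across blocks at -1.
M = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
```

Running `concor(M)` on this toy matrix converges immediately to a ±1 sign pattern; in practice the split is applied recursively to produce more than two blocks.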
To display a block pattern it is convenient to relabel the individuals so that
those in the same block are adjacent. The relationships of the different blocks
can be represented by a smaller Boolean matrix whose rows and columns
correspond to blocks of individuals, that is, row i represents block i. To obtain it,
consider each submatrix A_ij of the original Boolean matrix A, partitioned by
blocks. Replace A_ij by a single 0 entry if A_ij is a zero matrix and replace it by a
single 1 entry if A_ij is not a zero matrix. This smaller Boolean matrix is called
the image.
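Computing the image is mechanical; here is a Python sketch (the matrix and partition are illustrative):

```python
def image_matrix(A, blocks):
    """Image of Boolean matrix A under a partition of its indices: the
    (i, j) entry is 1 iff the submatrix A[blocks[i]][blocks[j]] is nonzero."""
    return [[int(any(A[r][c] for r in bi for c in bj)) for bj in blocks]
            for bi in blocks]

A = [[1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 1]]
blocks = [[0, 1], [2, 3]]
# image_matrix(A, blocks) -> [[1, 0], [0, 1]]
```

Note that a block counts as 1 even when only some of its entries are 1, which is exactly the "few exceptions must be allowed" rule above.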

EXAMPLE 2.4.1. Griffith, Maier, and Miller (1976) studied a set of biomedical
researchers. One of the relationships between researchers was that a pair of
researchers had mutual contact with one another. Griffith, Maier, and Miller
considered 107 men. White, Boorman, and Breiger investigated a random sample
of 28 of these men.
When the group of men are suitably ordered, the matrix looks like this:

Mutual Contact

[28 × 28 incidence matrix; an X marks mutual contact between a pair of
researchers. The rows (and columns) are ordered into four blocks:
9, 26, 23, 4, 1 | 12, 7, 6, 2, 24, 19 | 14, 28, 11, 10, 18, 22, 15 |
16, 20, 17, 5, 8, 13, 21, 27, 25, 3.]

When the zero submatrices are replaced by 0 and the nonzero submatrices
by 1, we obtain the following image:

1 1 1 0
1 1 1 0
1 1 0 0
0 0 0 0

A hypothesis about the image matrix and a partitioning of the group which
will produce this image is called a blockmodel. G. H. Heil developed a computer
algorithm which, for any given blockmodel, finds all partitions of the set of
people which yield the blockmodel as image.

Once a blockmodel is found it can sometimes be studied by partitioning
it and producing images which are still simpler matrices.
Usually not one relation, but a number of binary relations are considered
on the same group. All the resulting matrices should be partitioned the same way,
so as to produce one image matrix for each binary relation.
The interrelationships of the resulting set of image matrices may be fairly
complex. One technique, proposed by S. A. Boorman and H. C. White, is to study
the subsemigroup of Boolean matrices generated by the image matrices. That is,
we consider all matrix products of these matrices using Boolean operations on
{0, 1}.
Relations such as idempotence A^2 = A, commutativity AB = BA, and
absorption AB = A or AB = B are of interest.
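As a sketch of this technique (Python, with two hypothetical image matrices), one can generate the semigroup by closure under Boolean products and then test such relations directly:

```python
def bool_mult(A, B):
    # Boolean matrix product over {0, 1}
    n = len(A)
    return tuple(tuple(int(any(A[i][k] and B[k][j] for k in range(n)))
                       for j in range(n)) for i in range(n))

def generated_semigroup(*gens):
    """All Boolean products of the generators."""
    elems, frontier = set(gens), set(gens)
    while frontier:
        new = {bool_mult(a, b) for a in elems for b in frontier}
        new |= {bool_mult(b, a) for a in elems for b in frontier}
        frontier = new - elems
        elems |= frontier
    return elems

A = ((1, 0), (1, 1))   # two hypothetical 2 x 2 image matrices
B = ((1, 1), (0, 1))
S = generated_semigroup(A, B)
# A and B are idempotent, and AB = BA is the all-ones matrix,
# so the generated semigroup has just three elements.
```

Small semigroups like this one, and which relations (idempotence, commutativity, absorption) hold in them, are exactly what the Boorman-White technique examines.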

EXERCISES
Level 1
1. Describe the blocks and give the image for these Boolean matrices.

1 1 0 0
1 1 0 0
(a)
0 0 1 1

0 0 1 1

1 1 0

(b) 1 1 0
1 1 1

1 1 0 1

1 1 0 1
(c)
1 1 1 0

0 0 0 1

2. Do the same for these after rearranging the individuals.

(a)
1 0 1
0 0 0
1 0 1

(b)
1 0 1 0
0 1 0 1
1 0 1 0
0 1 0 1

Level 2
1. Explain the names of these blockmodels.

Deference:
1 0
1 1

Hierarchy:
1 0
1 0

Center-periphery:
1 1
1 0

Multiple caucus:
1 0
0 1

2. Give examples of relationships you would expect to lead to basically


symmetric Boolean matrices. What about transitivity? Might it partly be
true?
3. For which of these matrices is A^2 = A?
4. Which pairs of these Boolean matrices commute?

Level 3
1. Suppose we consider Boolean matrices with a fixed partitioning, and
square diagonal blocks. Let im(A) denote the image matrix. Prove
im(A + B) = im(A) + im(B). The matrix addition is Boolean.
2. Prove that if all nonzero blocks in A, B have at least one nonzero row,
im(AB) = im(A) im(B). Here the matrix multiplication is Boolean.
3. Find all 2 X 2 Boolean matrices which equal their own correlation matrices.
4. Suppose a Boolean matrix A is partitioned into 4 submatrices A_11, A_12, A_21,
A_22 and each A_ij is a scalar multiple of the Boolean matrix having all
entries 1. Suppose A equals its own correlation matrix. What possibilities
are there for A?
5. Give an example of a binary relation on 5 elements such that a union of
cliques contains a different clique.
6. Suggest an idea for a clustering method.

7. Write out the semigroup generated by these Boolean matrices:

0 1    1 0    1 1
1 0    0 0    1 0

8. For two random Boolean matrices A, B and n large, the semigroup
generated by A, B is likely to have three elements A, B, AB. Explain why.

2.5 FINITE STATE MACHINES


For an abstract model of a machine such as a calculator we consider basically
three things: its inputs, its outputs, and its internal state. Thus there will be a set
X of inputs, a set Z of outputs, and a set S of internal states. Changes in
internal states must also be taken into account. The new state depends on the
previous state and the current input. Thus there is a function v such that v(s, x)
is the next state if the current state is s and x is the input. Finally the output
must be described. The output is a function of current state and (sometimes)
input. It is described by a function δ such that δ(s, x) is the output from state
s and input x.

DEFINITION 2.5.1. A finite state machine is a 5-tuple (S, X, Z, v, δ) where
S, X, Z are finite sets, v is a function S × X to S, and δ is a function S × X
to Z. Such a machine is called a Mealy machine after its inventor. A Moore
machine is the same except that the output depends only on the internal state.
Thus δ can be taken as a function δ : S → Z.

EXAMPLE 2.5.1. We can write out a machine to add two numbers in
binary notation. At time i we put in the digits a_i, b_i of the two numbers. Thus
X = {0, 1} × {0, 1}. The output is the ith digit of the sum. So
Z = {0, 1}. The internal state is the carry from previous digits, either 0 or 1.
So S = {0, 1}. The output is the rightmost digit of a_i + b_i + c where
c is the carry. So δ is given by

a_i  b_i  c  |  δ
 0    0   0  |  0
 0    0   1  |  1
 0    1   0  |  1
 0    1   1  |  0
 1    0   0  |  1
 1    0   1  |  0
 1    1   0  |  0
 1    1   1  |  1

And v, giving the next internal state, is the carry from a_i + b_i + c. So v is 1
if and only if a_i + b_i + c ≥ 2. Else v is 0.
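The adder can be simulated directly; here is a Python sketch of this Mealy machine (digits fed least-significant first, function name illustrative):

```python
def serial_adder(a_digits, b_digits):
    """Run the binary adder of Example 2.5.1: the internal state is the
    carry; the output at each step is the next digit of the sum."""
    carry, out = 0, []
    for a, b in zip(a_digits, b_digits):
        s = a + b + carry
        out.append(s % 2)    # output: delta(state, input)
        carry = s // 2       # next state: nu(state, input)
    out.append(carry)        # flush the final carry
    return out

# 3 + 6 = 9, with digits least-significant first:
# serial_adder([1, 1, 0], [0, 1, 1]) -> [1, 0, 0, 1]
```

Each loop iteration is one tick of the machine: the pair (a, b) is the input, `carry` is the state, and the appended digit is the output.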

From this abstract machine an electronic switching circuit could be designed
to do the addition. Machines can be designed to perform other logical or
arithmetical tasks or in general do what machines or robots might do.
Finite state machines can also be used as models of general systems or of
animal behaviour.
Semigroup theory, in particular the theory of transformations, plays a role
in the theory of finite state machines.

DEFINITION 2.5.2. The semigroup of a machine is the semigroup generated
by the transformations f_x, where x ranges over all inputs and f_x(s) = v(s, x),
together with the identity transformation on S.

EXAMPLE 2.5.2. The machine of Example 2.5.1 has f_00 = the constant
transformation 0, f_01 = f_10 which is the identity, and f_11 = the constant
transformation 1. Thus the semigroup has 3 elements.

DEFINITION 2.5.3. A semigroup T having an element e such that
ex = xe = x for all x ∈ T is called a monoid. Such an element e in any
semigroup is called an identity element.

Thus the semigroup of a machine is a monoid, since the identity
transformation is included.

EXAMPLE 2.5.3. The set of nonnegative integers under addition is a monoid
but the semigroup of positive integers is not. It does not contain 0.

Machines can also be represented by graphs. A vertex is labelled for each
internal state, and for each input x and state s an arrow labelled x is drawn
from s to v(s, x).

EXAMPLE 2.5.4. The graph of the machine of Example 2.5.1 has vertices 0
and 1, with inputs (0, 0) leading to state 0, inputs (1, 1) leading to state 1,
and inputs (0, 1) and (1, 0) leaving the state unchanged.
This amounts to graphing each transformation f_x.



These are some questions considered in the theory of automata (of which
finite machines are the simplest kind):

1. What is the machine with the fewest states which can
accomplish a given task?
2. How can machines efficiently compute products in semigroups?
3. How can machines be decomposed into simpler machines?
4. What machine languages can machines recognize?

The theory of automata more generally includes not only finite state
machines but various types of machines with infinite memory: push-down
automata and Turing machines. The latter can perform any calculation which can
be done completely systematically. To show that a task cannot be performed by
a Turing machine is considered equivalent to showing that no algorithm exists
for it. This has been shown for a number of purely mathematical problems,
for instance solving Diophantine equations (equations where the solutions are
required to be whole numbers) in n variables.
We conclude this section by a result showing that no finite state machine
can multiply two numbers of arbitrary length. This is contrary to the case for
addition, where Example 2.5.1 constructs such a machine.

THEOREM 2.5.1. Let M be a machine whose inputs are the sequences of
digits a_i, b_i of two numbers a, b in binary notation and whose outputs are the
successive digits of the product ab. Then if M can accept numbers of arbitrary
length, M cannot have a finite number of states.

Proof. Suppose M is a machine which can perform this task, having n states.
Suppose we multiply the two numbers a = 2^(n+1), b = 2^(n+1). Let the initial state
be s_0. The first n + 1 inputs are (0, 0). The next input is (1, 1). Let s_1 be the
state after this input. The next n + 1 inputs are (0, 0). The state after i of these
is s_1 f_(0,0)^i, i = 0 to n. But some two of the states s_1 f_(0,0)^i must be equal,
since they lie in a set of n elements. Say s_1 f_(0,0)^i = s_1 f_(0,0)^j where i < j.

Then s_1 f_(0,0)^(n+1−(j−i)) = s_1 f_(0,0)^(n+1). Since the inputs and states are equal,
the outputs at these times must be equal. But the output from the latter is the
unique 1 digit of the answer, and the output from the former is 0. This proves
the theorem. □

EXERCISES
In these exercises S is the set of states, X the set of inputs, Z the set of outputs,
v(s, x) is the next state from state s and input x, and δ(s, x) is the output from
state s and input x. For brevity v and δ are written for v(s, x) and δ(s, x).

Level 1
1. Consider a machine that will count to 100 (and then go back to 0). The
output and the internal state are the number counted. How many internal
states are needed?
2. The inputs will be either 0 or 1 as something is to be counted or not. Write
S, X, Z.
3. The next state will be the same as the last if no input is received; else it
will be one greater (if the last is 100 the next is 0). Write the function v(s, x)
describing the next state.
4. Write the function δ giving the output.
5. Suppose the output were only to signify whether 100 is reached or not and
after 100 is reached the state remains there. Write a machine for this.
6. Write a machine to test whether all digits of a binary number are zero.

Level 2
1. Consider a machine that will add three numbers in base 10, digit by digit.
What possible internal states (carries) must be allowed for? Write S, X, Z.
2. Write v, 5 for a machine that adds two numbers in base 10.
3. Write a machine that adds three numbers in base 2.
4. Write a machine that multiplies any number in base 10 by 3.
5. Write a machine that divides any number in base 10 by 2. The digits must be
put in left-to-right in contrast with previous examples.
6. Write a machine given two binary numbers a, b digit by digit to test whether
a = b.
7. Write a machine given two numbers a, b in base 10 to decide whether a = b,
a < b or a > b.

Level 3
1. For any monoid S, any set X, and any mapping h : X → S construct a
machine with semigroup S such that h is the mapping h(x) = f_x.
2. Write a machine that divides any number in base 10 by 11.
3. Can a finite state machine take the square of a number of arbitrary length
input digit by digit right-to-left?
4. What tasks could be accomplished by a machine which has no memory, i.e.
only one internal state? which remembers only the last input?
5. Show that some finite state machine can handle any task involving only a
bounded number of inputs.

2.6 RECOGNITION OF MACHINE LANGUAGES BY FINITE STATE MACHINES
In recent years a very abstract notion of languages (mostly used for computer
languages) has been developed. Here we consider only the purely formal aspects
(syntax) of these languages and not their meaning (semantics).

We start with a set X of basic units (words) and consider sequences of
words.

DEFINITION 2.6.1. If X is a set, X* is the set of finite sequences from
X together with the empty sequence e.

EXAMPLE 2.6.1. If X is {a, b}, X* is {e, a, b, aa, ab, ba, bb, aaa,...}.

DEFINITION 2.6.2. A phrase-structure grammar is a quadruple (N, T, P, σ)
such that N ≠ ∅, N ∩ T = ∅, P ⊆ ((N ∪ T)*\T*) × (N ∪ T)*, σ ∈ N, and
N, T, P are finite sets. The set T is the set of words of the language.
The set N is a set of valid grammatical forms, which involve abstract concepts
like 'variable', 'expression'. The set P is the set of ways we can substitute into
a valid grammatical form to obtain another grammatical form. The element σ is
called the starting symbol. The sets N, T, P are abbreviations for nonterminals,
terminals, productions. That is, for a nonterminal either a terminal (word) or
nonterminal (more detailed grammatical form) can be substituted as allowed
by P.

EXAMPLE 2.6.2. Suppose we want to obtain the statement x = y + z + w.
Let T be {x, =, y, z, w, +}. Let N = {σ, Expression, Variable}. We start with σ and
obtain successively

1. σ
2. Variable = Expression
3. Variable = Expression + Expression
4. Variable = Expression + Expression + Expression
5. Variable = Expression + Expression + w
6. Variable = Expression + z + w
7. Variable = y + z + w
8. x = y + z + w

The set of productions involved is (σ, Variable = Expression), (Expression,
Expression + Expression), (Variable, x), (Expression, y), (Expression, z),
(Expression, w). Other sets N, P could also be used.
Steps (1)-(8) are called a derivation.

DEFINITION 2.6.3. If x, y ∈ (N ∪ T)* then y is directly derived from x if
x = azb and y = awb for some (z, w) ∈ P and a, b ∈ (N ∪ T)*. An indirect
derivation is a sequence of direct derivations.
That is, a direct derivation is a substitution of w in place of a nonterminal
z where (z, w) is a valid production.

EXAMPLE 2.6.3. Going from 'Variable = Expression' to 'Variable =
Expression + Expression' is a direct derivation.

DEFINITION 2.6.4. L(G) is the set of elements of T* which can be derived
from σ.

EXAMPLE 2.6.4. For the case above L(G) includes x = y, x = z, x = w,
x = y + w, x = y + z, x = y + y, x = y + z + w + y, and other equations.

DEFINITION 2.6.5. A grammar is context-free if and only if for all (a, b) ∈ P
we have a ∈ N and b ≠ e.

EXAMPLE 2.6.5. If x and y (but not the string xy) were members of N and
(xy, z) ∈ P, then the grammar would not be context-free.

Being context-free means that what is substituted depends only on a single
grammatical element, not surrounding elements.

DEFINITION 2.6.6. A grammar is regular if and only if for all (a, b) ∈ P we
have a ∈ N, b = tn where t ∈ T and n ∈ N or n = e.

Being regular means that the productions are filled in with terminals one per
step, going left to right, and that at each stage there is a string of terminals
followed by a single nonterminal.

EXAMPLE 2.6.6. Let N = {σ, w}, P = {(σ, w), (w, xw), (w, yw), (w, x)} and
T = {x, y}. Then L(G) is all sequences in x, y ending in x. This is a regular grammar.

DEFINITION 2.6.7. A machine accepts L(G) if and only if the machine can
tell whether or not any given sequence of inputs is in L(G).

EXAMPLE 2.6.7. The following machine accepts the last language. The internal
states are {0, 1}. Outputs are {Yes, No} (meaning all symbols so far do or do not
constitute a valid expression).

v(s, x)              δ(s, x)
s   x       v        s   x       δ
0   x       0        1   x       Yes
0   y       0        other       No
0   other   0
1   x       1
1   y       1
1   other   0

We assume the initial state is 1.
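This acceptor is easy to run; here is a Python sketch of the machine above (symbols fed one at a time, final output checked; the function name is illustrative):

```python
def accepts(seq):
    """Run the machine of Example 2.6.7 on a sequence of symbols.
    State 1 is live; state 0 is a trap entered on an invalid symbol.
    The last output is 'Yes' exactly when the sequence is in the language."""
    state, output = 1, "No"
    for symbol in seq:
        if state == 1 and symbol in ("x", "y"):
            output = "Yes" if symbol == "x" else "No"   # delta(1, symbol)
        else:
            state, output = 0, "No"                     # trap on other symbols
    return output == "Yes"
```

For example, `accepts("xyx")` is true and `accepts("xy")` is false, matching the language of sequences in x, y ending in x.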



THEOREM 2.6.1. A language is acceptable by some finite state machine if
and only if it is regular.

Proof. Let L(G) be accepted by a finite state machine. Outputs will be ‘Yes,
No’ as in Example 2.6.7, meaning ‘all symbols so far are a valid expression’ or
‘all symbols so far are not a valid expression’. The set T will be as in L(G) but
we will construct a new grammar based on the machine.

Let the set of nonterminals be σ together with the set of internal states of
the machine. Let the set of productions be the set of pairs (n1, xn2) such that
if the machine is in state n1 and x is input then state n2 results, together with
the set of pairs (n1, x) such that in state n1, if x is the next input, the machine
declares the sentence to be valid.

Let x1x2...xk be an expression the machine recognizes as valid by a
succession of internal states n1, n2, ..., nk before these inputs. Then we have
a derivation x1x2...xr nr → x1x2...xr+1 nr+1 in the language.

Conversely, if x1x2...xn is derivable in the language and the sequence at
time r is x1x2...xr nr, then nr will be the state of the machine at time r for
these inputs, and the machine will recognize the sentence when the last input
xk is put in. This proves that if L(G) is acceptable it arises from a regular
grammar.

Conversely, let L(G) arise from a regular grammar. Let outputs for a
machine be ‘Yes, No’ as above, and inputs be symbols of T. Let internal states
be in 1-1 correspondence with all sets of nonterminals. Let the initial state be {σ}.
If the state at time r is a set U of nonterminals, let the state at time r + 1 be the
set of all nonterminals z such that (u, xz) ∈ P for some u ∈ U. Let the output
be ‘Yes’ if and only if for some u ∈ U, (u, x) ∈ P. Then if we have any derivation
x1x2...xr nr → x1...xr xr+1 nr+1, then nr will belong to U at time r, and when
xk is put in the machine will print out ‘Yes’. Conversely, if the machine prints
‘Yes’ then there must have been a sequence of elements of the sets U making up
a valid derivation. □
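The converse direction of the proof (the subset construction on nonterminals) can be sketched as follows. This is an illustration under our own naming, using a strictly regular grammar for "sequences in x, y ending in x" with productions (S, xS), (S, yS), (S, x), where S plays the role of σ.

```python
# A sketch of the machine built in the proof of Theorem 2.6.1: states are
# sets of nonterminals, the initial state is {sigma}, and the output after
# input c is Yes exactly when (u, c) is a production for some u in the
# current state.  Productions (n1, c n2) are stored as ('n1', 'cn2').

P = [('S', 'xS'), ('S', 'yS'), ('S', 'x')]

def step(U, c):
    """Next state: all nonterminals z with (u, cz) in P for some u in U."""
    return {w[1] for (u, w) in P if u in U and len(w) == 2 and w[0] == c}

def accepts(word):
    U, yes = {'S'}, False
    for c in word:
        yes = any(u in U and w == c for (u, w) in P)  # output after this input
        U = step(U, c)
    return yes

print(accepts('yx'))   # True
print(accepts('xy'))   # False
```

Tracking sets of nonterminals rather than single nonterminals is what makes the construction work even when several derivations are possible at once.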

EXERCISES
In these exercises N is the set of nonterminals, T the set of terminals, P the set
of productions, σ the starting symbol, L(G) the language generated by grammar
G, δ(s, x) the output from state s and input x, and ν(s, x) the next state from
state s and input x.

Level 1
1. In the grammar T = {+, 1}, N = {σ}, P = {(σ, 1), (σ, σ + σ)}, derive the
expression 1 + 1 + 1.
2. What are the elements of L(G)?

3. Can 1 + 1 + 1 be derived in two ways?


4. Is this grammar context-free? Is it regular?
5. In the grammar T = {x, (, )}, N = {σ}, P = {(σ, x), (σ, σσ), (σ, (σ))}, derive
(x) (x).
6. What is L(G)?

Level 2
1. What is L(G) for the grammar with P = {(σ, 1), (σ, 1 + σ)}, where N, T are as in
Exercise 1 of Level 1?
2. Is the above grammar regular?
3. Do you think a regular grammar exists for L(G) as in Exercise 5 of Level 1?
4. Find a regular grammar which generates L(G) = {all sequences in x of
length am + b} where a, b are fixed. Here T = {x}, and it is only necessary
to find P, N.
5. Give a grammar to obtain a union L(G1) ∪ L(G2).
6. Let S be any finite subset of T* for fixed T. Does there exist a regular
grammar yielding S as L(G)?

Level 3
1. Two machines (S, X, Z, ν, δ), (T, X, Z, μ, ω) have the same behavior if
there exists a binary relation R from S to T such that R and R^T are onto,
and if (s, t) ∈ R and the machines are initially in states s, t, then for any
sequence of inputs the machines give the same sequence of outputs. Show
this is an equivalence relation.
2. Two states s1, s2 of a machine (S, X, Z, ν, δ) are equivalent if and only
if for any sequence x1x2...xn of inputs, δ((s1)fx1 fx2 ... fxn−1, xn) =
δ((s2)fx1 fx2 ... fxn−1, xn). Show this is an equivalence relation.
3. For any machine M show there exists a machine M̄ having the same
behavior whose states are equivalence classes of states of M.
4. Show that if M and N have the same behavior there exists a homomorphism
of machines from M to N which is 1-1 on states of M. Here a homomorphism
of machines from (S, X, Z, ν, δ) to (T, X, Z, μ, ω) is a function
f: S → T such that for all s ∈ S, x ∈ X, δ(s, x) = ω(f(s), x) and f(ν(s, x)) =
μ(f(s), x).
5. Tell why the preceding exercises show that for any machine M, M̄ is a
machine having the same behavior with the least number of states.

2.7 GROUPS
A group is a semigroup in which multiplication by any element is a 1-to-1 onto
function.

DEFINITION 2.7.1. A group consists of a set G and an operation ∘ : G × G → G


such that

(1) (x ∘ y) ∘ z = x ∘ (y ∘ z)
(2) there exists e ∈ G such that for all x ∈ G, x ∘ e = e ∘ x = x, where e
is called an identity element
(3) for all x ∈ G there exists x⁻¹ ∈ G such that x ∘ x⁻¹ = x⁻¹ ∘ x = e,
where x⁻¹ is called the inverse of x.

The presence of inverses x⁻¹ makes group theory very different from
semigroup theory. Every multiplication is reversible and cancellation holds.

EXAMPLE 2.7.1. Under addition, the following are groups: the integers Z, the
rational numbers Q, the rational numbers whose denominators are powers of a
fixed prime p, the real numbers R, the complex numbers C, n × m matrices for
any n, m.

The positive integers, rationals, or reals do not form a group under addition,
lacking the identity 0 and the negative elements which are inverses.

EXAMPLE 2.7.2. Under multiplication these are groups: rationals r/s such
that r, s have all their prime divisors in a fixed set S of primes, nonzero rational
numbers, nonzero real numbers, nonzero complex numbers, complex numbers
of absolute value 1, nonsingular n × n matrices over C.

DEFINITION 2.7.2. A 1-to-1 transformation from a set to itself is called a
permutation.

EXAMPLE 2.7.3. The permutations on any set S form a group. Associativity
follows from their being a subsemigroup of the semigroup of transformations.
The identity is given by e(x) = x for all x ∈ S. The inverse is defined by
f⁻¹(y) = x whenever f(x) = y. A composition of 1-to-1 onto transformations
is again a permutation.

For groups there is a result similar to the result about semigroups being
isomorphic to semigroups of transformations.

PROPOSITION 2.7.1. Any group G is isomorphic to a set of permutations on
the set G.

Proof. Essentially the same as the proof of Theorem 2.1.1. □

DEFINITION 2.7.3. The group Sn of permutations on the set {1, 2, ..., n} is
called the symmetric group of degree n (or symmetric group on n letters).

The degree n of Sn should be distinguished from the order of a group G,
which is |G|. A symmetric group of degree n has order n!.

EXAMPLE 2.7.4. The symmetric group on {1, 2} has two elements e, x where
x(1) = 2, x(2) = 1, and multiplication e² = e, ex = xe = x, x² = e.

The symmetric groups will be studied in more detail later.

DEFINITION 2.7.4. A set C contained in a group G is a set of generators for G
if and only if every element of G can be expressed as a product x1 ∘ x2 ∘ ... ∘ xn
where xi ∈ C or xi⁻¹ ∈ C.

An equivalent statement is that no group which is a proper subset of G
contains C.

EXAMPLE 2.7.5. The element 1 generates the integers as a group (but in the
semigroup sense it would generate only the positive integers).

DEFINITION 2.7.5. A subgroup of a group G is a subset which forms a group
under the same operation.

This is equivalent to saying that a subgroup is a subset closed under
products and inverses.

EXAMPLE 2.7.6. The additive integers are a subgroup of the additive rationals
which are a subgroup of the additive real numbers which are a subgroup of the
additive complex numbers.

The propositions on generators and relations for semigroups also go through
for groups except that we must allow inverses. A free group is made up of words
in the given elements and their inverses. When a product is taken, any adjacent
pair formed by an element and its inverse must be cancelled.

EXAMPLE 2.7.7. The free group on generators x, y has these words of length
at most 2: e, x, y, x⁻¹, y⁻¹, x², y², x⁻², y⁻², xy, xy⁻¹, x⁻¹y, x⁻¹y⁻¹, yx,
yx⁻¹, y⁻¹x, y⁻¹x⁻¹. The product of x²yx⁻¹ and xy⁻² is x²yx⁻¹xy⁻² = x²yy⁻² = x²y⁻¹,
after cancelling x⁻¹x and yy⁻¹.
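Cancellation of adjacent inverse pairs can be carried out with a simple stack. This sketch uses our own encoding: a generator is a string like 'x' and its inverse is 'x-'.

```python
# A sketch of cancellation in a free group, as in Example 2.7.7.  A word
# is a list of symbols; a product is concatenation followed by cancelling
# every adjacent pair formed by an element and its inverse.

def inv(s):
    """Inverse of a single symbol: 'x' <-> 'x-'."""
    return s[:-1] if s.endswith('-') else s + '-'

def reduce_word(word):
    out = []
    for s in word:
        if out and out[-1] == inv(s):
            out.pop()          # cancel an adjacent pair such as x-, x
        else:
            out.append(s)
    return out

# (x^2 y x^-1)(x y^-2) = x^2 y^-1 after cancelling x^-1 x and y y^-1:
w = reduce_word(['x', 'x', 'y', 'x-'] + ['x', 'y-', 'y-'])
print(w)   # ['x', 'x', 'y-']
```

A single left-to-right pass with a stack suffices, since popping can expose at most one new cancellable pair at the top.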

The simplest nontrivial groups are those having one generator. For instance,
they are always commutative.

DEFINITION 2.7.6. A group is cyclic if it has a single generator.

EXAMPLE 2.7.8. The additive group Z of integers is cyclic, with generator 1.

EXAMPLE 2.7.9. For any positive integer m there is a group Zm whose
elements are 0, 1, 2, ..., m − 1. Let the operation be x * y = the remainder when
x + y is divided by m. That is, it is the unique integer 0 ≤ z < m such that m divides
x + y − z. From this it can be shown that the operation is associative. The inverse of
a is m − a if a ≠ 0, and 0 is its own inverse.

The addition table for Z3 is

        0  1  2
    0   0  1  2
    1   1  2  0
    2   2  0  1
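The operation of Example 2.7.9 takes only a few lines of code. The function names here are ours, not the book's.

```python
# A minimal sketch of the cyclic group Z_m of Example 2.7.9, with
# addition modulo m, checked against the table for Z_3 above.

def add(x, y, m):
    return (x + y) % m            # remainder when x + y is divided by m

def inverse(x, m):
    return (m - x) % m            # m - a for a != 0, and 0 is its own inverse

table = [[add(x, y, 3) for y in range(3)] for x in range(3)]
print(table)          # [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(inverse(2, 3))  # 1
```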

EXAMPLE 2.7.10. Any 1-element group has this structure: {e} where e * e = e.
We will call this Z1.

THEOREM 2.7.2. Any cyclic group G is isomorphic to Z or to some Zm.

Proof. Let x be a generator. Then all elements by definition can be expressed as
positive or negative powers x^k of x, and the law of exponents x^r x^s = x^(r+s)
holds. Suppose there is no positive integer m such that x^m = e. Then x^r ≠ x^s
for r > s, for otherwise x^r x^(−s) = x^s x^(−s), so x^(r−s) = e. So there is a 1-to-1
homomorphism from G to the integers defined by f(x^n) = n.

If there exists a positive integer m such that x^m = e, let m be the smallest
such number. By induction we can readily prove that all powers of x equal
one of x⁰, x¹, ..., x^(m−1); for instance x^m = x⁰, x^(m+1) = x x^m = x x⁰ = x¹, and so
on. The powers x⁰, x¹, ..., x^(m−1) are distinct, since if x^r = x^s with m > r > s ≥ 0
then x^r x^(−s) = x^s x^(−s) = e, so x^(r−s) = e. This contradicts the fact that m was the
smallest positive integer such that x^m = e. Thus we have a 1-to-1 onto map from
G to Zm defined by f(x^i) = i for i = 0 to m − 1. It can readily be checked that
this is a homomorphism, for instance by induction. This proves the theorem. □

Another way of constructing groups is by taking products.

DEFINITION 2.7.7. The Cartesian product A1 × A2 × ... × An of groups
A1, A2, ..., An is the group whose set is the Cartesian product set, with operation
(g1, g2, ..., gn)(h1, h2, ..., hn) = (g1 ∘ h1, g2 ∘ h2, ..., gn ∘ hn).

EXAMPLE 2.7.11. The n-fold Cartesian product of the real numbers is
isomorphic to the set of n-dimensional vectors (x1, x2, ..., xn) with operation
(x1, x2, ..., xn) + (y1, y2, ..., yn) = (x1 + y1, x2 + y2, ..., xn + yn).

One use of Cartesian products is in studying the structure of commutative
groups. It can be proved, though we will not prove it, that any finite commutative
group is isomorphic to a product of cyclic groups Zm where the m’s are powers
of primes.
Commutative groups are frequently called Abelian.
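A small computation illustrates the structure remark above: the product Z2 × Z3 is itself cyclic, generated by (1, 1), so it is isomorphic to Z6 (compare Exercise 2 of Level 2 below). The helper name `powers` is ours.

```python
# Check that (1, 1) generates all of Z_2 x Z_3, so the product is cyclic.

def powers(g, mods):
    """Successive multiples of g in Z_{m1} x Z_{m2}, starting at the identity."""
    x, seen = (0, 0), []
    while True:
        seen.append(x)
        x = tuple((a + b) % m for a, b, m in zip(x, g, mods))
        if x == (0, 0):
            return seen

print(len(powers((1, 1), (2, 3))))   # 6: (1, 1) generates the whole group
```

By contrast (1, 0) generates only a subgroup of order 2, so not every element is a generator.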

EXERCISES
Level 1
1. Prove that any group of order 2 is isomorphic to Z2.
2. Prove that these cancellation properties hold in groups: if xy = xz then
y = z, and if yx = zx then y = z (multiply by x⁻¹ on the left or right).
3. Write out the addition table of Z4.
4. Write out the addition table of Z2 X Z2.
5. Prove Z4 is not isomorphic to Z2 X Z2.
6. Prove the symmetric group on {1, 2, 3} is not commutative. Let f
interchange 1, 2 and g interchange 2, 3. Prove (f ∘ g)(1) ≠ (g ∘ f)(1).
7. Prove the identity element of a group is unique.
8. Prove inverses are unique.

Level 2
1. Write out the multiplication tables of Z2, Z3, Z2 X Z3.
2. Show Z2 X Z3 is cyclic. Show Z2 X Z3 is isomorphic to Z6.
3. Can you suggest a generalization of the above exercise about products of
certain cyclic groups being cyclic?
4. Find all subgroups of Z6 (they are cyclic).
5. Prove the inverse of xy is y⁻¹x⁻¹. Describe the inverse of any word x1x2...xn
in a free group.
6. Let f(v1, ..., vn) be any function from n-tuples of vectors to the real
numbers. Prove the set of invertible matrices A such that f(v1, v2, ..., vn) =
f(Av1, Av2, ..., Avn) is a group. For instance the set of all matrices A such
that |v|² = |Av|² is a group called the orthogonal group. Its members
preserve distances |x − y|² and are made up of rotations and reflections in
n-dimensional space.

Level 3
1. Consider the group of transformations of a square into itself including all
rotations by multiples of 90° and all reflections through a line inclined at a
multiple of 45° through the origin. Write out its multiplication table. This
is the dihedral group of order 8.
2. Show there exists an onto homomorphism f: Zm → Zn if and only if n
divides m. Show the set of x such that f(x) = e is a subgroup.
3. Show that if n divides m, Zm has a subgroup of order n.
4. Prove that any finite group is isomorphic to a group of matrices. Use
Proposition 2.4.1 and prove that the symmetric group on {1, 2, ..., n} is
isomorphic to a set of (0, 1)-matrices with exactly one 1 in each row and
column.
5. Suppose there exist homomorphisms f1: G → G1 and f2: G → G2.
Construct a homomorphism f: G → G1 × G2. Use this to prove Znm is
isomorphic to Zn × Zm if n, m are relatively prime. (Since they have the
same order it suffices to prove a homomorphism is 1-to-1. Let f(x) = f(y).
Then f(xy⁻¹) = e. Prove xy⁻¹ = e.)
6. Prove that for any group G the 1-to-1 onto homomorphisms G → G form a group.
Such homomorphisms are called automorphisms.
7. Let G, H be groups, and f a homomorphism from G into the automorphism
group of H. Thus fg is a homomorphism of H and fg ∘ fh = fhg. Define a
product on G × H by (g1, h1) ∘ (g2, h2) = (g1g2, fg2(h1)h2). Prove this
product is associative. It gives a group called the semidirect product. Express
the group of functions f(x) = ax + b, a ≠ 0, a, b real numbers, as a
semidirect product.

2.8 SUBGROUPS
It has already been mentioned that a subgroup of a group is a nonempty subset
closed under multiplication and inversion.

DEFINITION 2.8.1. For a subset C of a group G, the subgroup generated by
C is the set of all products x1x2...xn where xi ∈ C or xi⁻¹ ∈ C for i = 1 to n.

It can readily be shown that this set is closed under products and inverses,
and is therefore a subgroup. It is contained in every other subgroup containing
C (since x1x2...xn must be) and is thus the smallest subgroup containing C.

This means that for every subset C we obtain a subgroup generated by C,
although many of these will be identical. For an element x we get a cyclic
subgroup generated by x, consisting of all powers x^i.

EXAMPLE 2.8.1. A set of positive rational numbers {x1, x2, ..., xk} generates
the multiplicative subgroup of rational numbers of the form x1^n1 x2^n2 ... xk^nk,
where the ni are integers.

For any subgroup H we can divide a group into equivalence classes called
cosets. The cosets are sets which are ‘parallel’ to the subgroup, in the sense that
the line y = x + 1 is parallel to y = x. All the lines y = x + c partition the plane.

As before, AB means the set of all products {ab : a ∈ A, b ∈ B}. We have
(AB)C = A(BC) = {abc : a ∈ A, b ∈ B, c ∈ C}.

DEFINITION 2.8.2. Let H be a subgroup of G. A left (right) coset of H is a
set of the form xH (Hx) for some x in G. A normal subgroup H is one such
that xH = Hx for all x in G.

EXAMPLE 2.8.2. For a positive integer m, let H be the subgroup of Z of
all integers divisible by m. Then the coset 1 + H consists of all integers having
the form 1 + km for some integer k. The coset 0 + H equals H. Since Z is
commutative, H is normal.
For brevity, let a\b denote a divides b.
Sec. 2.8] Subgroups 73

THEOREM 2.8.1. For any subgroup H of a group G, G is the disjoint union of
the left cosets of H. Each left coset of H has the same cardinality as H. If G
is finite then |H| divides |G| and |G|/|H| is the number of cosets. If H is a normal
subgroup of G, then the left cosets of H form a group under the operation
(xH)(yH) = xyH.

Proof. The left cosets of H are the equivalence classes of the equivalence
relation {(x, y) : x = yh for some h ∈ H}. The first assertion follows from this.
The mapping H → xH which sends h to xh is 1-to-1 and onto. This implies the
second statement. The first two statements imply |G| is |H| times the number
of cosets. If H is normal then xHyH = x(Hy)H = x(yH)H = xyHH. But HH = H since
H is a subgroup. So the operation is well defined. The proof of associativity is
(xHyH)zH = (xyH)zH = (xy)zH = x(yz)H = xH(yzH) = xH(yHzH). And
x⁻¹H = (xH)⁻¹ and eH = H is an identity element. This proves the last
statement. □

EXAMPLE 2.8.3. The symmetric group of degree 3 has multiplication table:

        e  a  b  x  y  z
    e   e  a  b  x  y  z
    a   a  b  e  y  z  x
    b   b  e  a  z  x  y
    x   x  z  y  e  b  a
    y   y  x  z  a  e  b
    z   z  y  x  b  a  e

The set H = {e, a, b} is a subgroup. There are two different left cosets of H,
eH = {e, a, b} and xH = {xe, xa, xb} = {x, y, z}. All other cosets equal one
of these two. These two cosets form a partition of G. The subgroup H is normal.
As can be seen from the distribution of the symbols e, a, b, x, y, z in the table,
multiplication of cosets is described by

         eH  xH
    eH   eH  xH
    xH   xH  eH
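The coset computation of Example 2.8.3 can be checked by brute force with explicit permutations on {0, 1, 2} (a and b being the 3-cycles, x, y, z the transpositions). The names `compose`, `G`, `H` are ours; composition is on the right, (i)pq = ((i)p)q, as in the text.

```python
# Left cosets of H = {e, a, b} in the symmetric group of degree 3.
from itertools import permutations

def compose(p, q):
    """(i)pq = q[p[i]]: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

G = list(permutations(range(3)))                 # all 6 permutations
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]            # e and the two 3-cycles

# Each left coset gH collected as a sorted tuple so duplicates collapse.
cosets = {tuple(sorted(compose(g, h) for h in H)) for g in G}
print(len(cosets))   # 2 cosets, each of size |H| = 3
```

This matches the theorem: |G|/|H| = 6/3 = 2 cosets, and they partition G.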

DEFINITION 2.8.3. If H is normal, the group of left cosets of H is called the
quotient group G/H.

EXAMPLE 2.8.4. G/G is the one element group.

For a normal subgroup N, the order of G is given by |N| · |G/N| since there
are |G/N| cosets. Many other properties of G can be studied in terms of the
simpler groups N, G/N if a proper nontrivial normal subgroup exists. In fact all
groups G such that G/N = H can be classified by means of extension theory.
One such group is the direct product N × G/N.

Groups having no proper nontrivial normal subgroups can be studied in terms of other types
of subgroups. If p^m is the highest power of a prime p such that p^m divides |G|,
with m ≥ 1, a subgroup of G of order p^m is called a Sylow p-subgroup. For any
p dividing |G| there exists a Sylow p-subgroup of G, and any two Sylow p-subgroups
for the same p are isomorphic; in fact for such subgroups H1, H2
there exists x ∈ G with xH1x⁻¹ = H2.
Groups of order p^m for prime p have a special structure. For instance, there
always exists a nontrivial center.

DEFINITION 2.8.4. The center of a group G is {z ∈ G : xz = zx for all
x ∈ G}.

EXAMPLE 2.8.5. The center of a commutative group is the entire group.

EXAMPLE 2.8.6. The center of a symmetric group of degree > 2 is the identity
element only.

EXAMPLE 2.8.7. The center of the group of n × n nonsingular matrices
consists of all scalar multiples aI of the identity matrix.

Thus the center is the set of elements commuting with every element of G.

DEFINITION 2.8.5. A group is simple if it is not {e} and has no normal
subgroups except {e} and the entire group.

EXAMPLE 2.8.8. The quotient of the group of n × n matrices of determinant
1 over the real or complex numbers, by its center, is simple.

It is reported as of 1982 that all finite simple groups have been classified
(the proof being some five thousand pages of research done by many group
theorists). These include finite analogues of the groups in the last example, other
matrix groups, subgroups of the symmetric group called alternating groups
defined later, and certain other groups.

EXERCISES
Level 1
1. Prove the center of a group is closed under products and inverses (so is a
semigroup).
2. Prove the center is a normal subgroup.
3. Find three subgroups of order 2 of the symmetric group of degree 3 from the
multiplication table given in Example 2.8.3 (these will be {e, g} with g² = e).
4. Show none of the subgroups in the last exercise is normal.
5. Find all cosets of the subgroup {e, x} in the symmetric group of degree 3.
(How many will there be, by Theorem 2.8.1?)
6. Find two elements of the symmetric group of degree 3 which generate the
entire group. (Many pairs will do.)
7. What are some Sylow 2 and Sylow 3 subgroups of the symmetric group of degree 3?

Level 2
1. Prove that a group G of prime order p has no subgroup H except G, {e},
from the fact that |H| divides |G|.
2. Prove that if x ≠ e in a group G of order p then the cyclic subgroup
generated by x is all of G.
3. Using the last two exercises and a theorem in the last section, prove a group
of prime order p is isomorphic to Zp.
4. Prove that in a direct product G × H the sets {(g, e) : g ∈ G} and {(e, h) :
h ∈ H} are normal subgroups.
5. Prove that if H is a subgroup of G and |G|/|H| = 2 then H is normal. For
x ∉ H show xH = G − H and Hx = G − H. For x ∈ H, xH = Hx = H.
6. Prove that if H, K are subgroups so is H ∩ K.
7. What are the quotient groups for the normal subgroups in Exercise 4?

Level 3
1. What are all subgroups of Zp × Zp (note that except for G, {e} they have
order p and by Exercise 3 above are cyclic)?
2. Prove for subgroups H, K the set HK is a subgroup if and only if HK = KH.
3. Prove that if H is a subgroup and N is a normal subgroup, HN = NH.
4. The quaternion group of order 8 has elements ±1, ±i, ±j, ±k with multiplication
i² = j² = k² = −1, ij = k, ji = −k, jk = i, kj = −i, ki = j, ik = −j, signs treated as usual
in algebra, and 1 the identity. Write out the multiplication table and find the
center and all subgroups. They must have orders 1, 2, 4, 8. Orders 1, 8 are
{e}, G. The order 2 subgroups are cyclic since 2 is prime. The order 4
subgroups are normal by Exercise 5 above.
5. Do the same for the dihedral group of order 8, which is the group of rotations
and reflections of a square. It can also be represented by generators
x (90° rotation) and y (reflection) with x⁴ = e, y² = e, yx = x³y. The
elements are e, x, x², x³, y, xy, x²y, x³y, and products follow directly from
these relations. Show that there exists a noncyclic subgroup of order 4,
so that the dihedral and quaternion groups are not isomorphic. What is the
quotient of this group by its center?
6. For any prime p there exists a nonabelian group of order p³ with generators
x, y, z and defining relations x^p = 1, y^p = 1, z^p = 1, xy = yxz, zx = xz,
zy = yz. All elements have the form x^n y^m z^r and products are given by
x^n y^m z^r x^a y^b z^c = x^(n+a) y^(m+b) z^(r+c−am). Find the center. Show every
subgroup containing the center is normal. Find some normal subgroups of
order p². What is the quotient of this group by its center?

2.9 HOMOMORPHISMS
A homomorphism of groups is a function preserving products.

DEFINITION 2.9.1. A function f: G → H is a homomorphism of groups if
and only if f(xy) = f(x)f(y) for all x, y in G.

As was earlier remarked, homomorphisms that are not 1-to-1 can simplify
the structure of a group. One-to-one homomorphisms can represent a group in
terms of some other structure: a group can be regarded as a group of matrices
or permutations.

EXAMPLE 2.9.1. If H is a subgroup of G, the mapping f(x) = x from H to G
is a homomorphism.

EXAMPLE 2.9.2. The logarithm gives a group homomorphism from the
positive real numbers under multiplication to the additive real numbers. This is
expressed by log(xy) = log x + log y.

EXAMPLE 2.9.3. For any constant c the mapping f(x) = cx is a homomorphism
from the additive real numbers to itself, since c(x + y) = cx + cy.

EXAMPLE 2.9.4. The determinant is a homomorphism from n × n nonsingular
matrices to the nonzero real numbers, since det(XY) = det(X) det(Y).

EXAMPLE 2.9.5. For any commutative group and integer n the mapping
x → x^n is a homomorphism of the group into itself, since (xy)^n = x^n y^n holds.

One-to-one onto homomorphisms are called isomorphisms. An isomorphism
between two groups means that it suffices to identify the structure of only
one, since the other will have exactly the same structure. The logarithm is such
an isomorphism, and shows the group-theoretic properties of the positive real
numbers under multiplication are just like the properties of the real numbers
under addition.

One-to-one homomorphisms are called monomorphisms, injections, or
embeddings. Onto homomorphisms are called epimorphisms, surjections, or
quotient maps.

PROPOSITION 2.9.1. If f is a homomorphism of groups, f(e) = e and
f(x⁻¹) = (f(x))⁻¹.

Proof. We have f(e)f(e) = f(ee) = f(e) = ef(e). Multiplying by (f(e))⁻¹ on the right,
we have f(e) = e. We have f(x)f(x⁻¹) = f(xx⁻¹) = f(e) = e = f(x)(f(x))⁻¹.
Apply (f(x))⁻¹ on the left. This completes the proof. □

Two of the most important aspects of homomorphisms are the kernel and
image.

DEFINITION 2.9.2. The kernel of a homomorphism f: G → H is {x ∈ G :
f(x) = e}.

DEFINITION 2.9.3. The image of a homomorphism f: G → H is {f(x) : x ∈ G}.

EXAMPLE 2.9.6. If f is an isomorphism, its image is H and its kernel is {e},
since it is 1-to-1. If f sends all of G to e, its kernel is G and its image is {e}.

The image describes the result of applying the homomorphism. The kernel
tells how far the map is from being 1-to-1, since f(x) = f(y) if and only if
f(x)(f(y))⁻¹ = f(x)f(y⁻¹) = f(xy⁻¹) = e, if and only if xy⁻¹ ∈ kernel.

EXAMPLE 2.9.7. For any normal subgroup N of a group G there is the
mapping x → xN from G to G/N. This is an epimorphism with kernel N.

PROPOSITION 2.9.2. The image is always a subgroup and the kernel is always
a normal subgroup.

Proof. Since f(xy) = f(x)f(y) and f(x⁻¹) = (f(x))⁻¹, the image is closed under
products and inverses. If f(x) = f(y) = e then f(xy) = f(x)f(y) = e, and for any
z ∈ G, f(zxz⁻¹) = f(z)f(x)f(z⁻¹) = f(z)e(f(z))⁻¹ = e. □

The relationship between kernel and image can generally be expressed in a
similar way to the epimorphism onto a quotient group.

THEOREM 2.9.3. Let f: G → H be a homomorphism with kernel N. Then
there is an isomorphism g from G/N to the image of f, described by
g(xN) = f(x).

Proof. We show g does not depend on the choice of x, is a homomorphism, is
onto, and is 1-to-1.

If xN = yN then x = yn for some n ∈ N. Then f(x) = f(yn) = f(y)f(n) =
f(y)e = f(y) since n is in the kernel. This shows g(xN) is the same for any choice
of x.

We have g(xNyN) = g(xyN) = f(xy) = f(x)f(y) = g(xN)g(yN).
Therefore g is a homomorphism.

For f(x) in the image, f(x) = g(xN), so g is onto.

If g(xN) = g(yN) then f(x) = f(y), so f(xy⁻¹) = f(x)(f(y))⁻¹ = e, so
xy⁻¹ ∈ N, x ∈ Ny = yN. Thus xN = yN. □

EXAMPLE 2.9.8. The function |x| is a homomorphism from the nonzero real
numbers under multiplication to the positive real numbers under multiplication.
The kernel is {±1}. This gives an isomorphism from (R\{0})/{x : |x| = 1} onto R⁺,
where R⁺ denotes the positive real numbers.
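Theorem 2.9.3 can be illustrated numerically. The homomorphism chosen here, f(x) = 3x mod 12 from Z12 to itself, is our own example, not the book's.

```python
# A numeric illustration of Theorem 2.9.3 for f(x) = 3x mod 12 on Z_12:
# the image has |G| / |kernel| elements, as the isomorphism G/N -> image
# requires.

G = range(12)
f = lambda x: (3 * x) % 12

kernel = [x for x in G if f(x) == 0]
image = sorted({f(x) for x in G})
print(kernel)   # [0, 4, 8]
print(image)    # [0, 3, 6, 9]
# |G| = |kernel| * |image|: 12 = 3 * 4
```

Each coset x + kernel is sent to the single value f(x), exactly as g(xN) = f(x) in the theorem.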

EXERCISES
Level 1
1. What is the image of the determinant map on nonsingular matrices? Give
examples of elements of its kernel.
2. Consider the homomorphism f: G × H → H such that f(g, h) = h. What are
its kernel and image?
3. The map in the above exercise gives an isomorphism between (G × H)/(G × e)
and what group, by the theorem? How does this partly justify the
name ‘quotient group’? (The order of the quotient group is another
reason, as well as the division into cosets.)
5. Find a monomorphism from Zm into the multiplicative group of complex
numbers C (consider mth roots of unity).

Level 2
1. Prove any group homomorphism is the composition of an epimorphism and
a monomorphism, using the theorem.
2. Show any group homomorphism f: G → H is the composition of a monomorphism
and an epimorphism. Take as monomorphism f1: G → G × H
where f1(g) = (g, f(g)).
3. For any complex number x, define a homomorphism f from
the real numbers under addition into the complex numbers under addition,
such that f(1) = x.
4. Let f from the nonzero rationals under multiplication to the additive integers
be defined as the power of 2 occurring in the rational number. Give the
image and kernel of f.
5. Prove the intersection of two normal subgroups is normal.
6. Let f1: G → H1 and f2: G → H2 be homomorphisms with kernels N1, N2.
Let f(x) = (f1(x), f2(x)) ∈ H1 × H2. Show f is a homomorphism. What is
its kernel?

Level 3

kernel is precisely K.) This is a basic theorem of group theory.


2. Show that the transformations f(x) = ax + b, a ≠ 0, have a normal subgroup
of transformations of the form f(x) = x + b. Show the quotient
group is isomorphic to the multiplicative group of nonzero real numbers.
3. There is a homomorphism from the additive real numbers to the multiplicative
complex numbers, described by f(x) = cos x + i sin x (that is, e^(ix)).
Find its kernel and image.
4. Prove that if H ⊆ G is a subgroup and N ⊆ G is a normal subgroup, H ∩ N
is normal in H. Prove HN/N is isomorphic to H/(H ∩ N). (Map H into HN/N
by f(x) = xN. Show the image is all of HN/N and the kernel is H ∩ N.)
This is a basic theorem of group theory.
5. Show a quotient of a cyclic group is cyclic.
6. Show the set Nm = {mx : x an integer} is a normal subgroup of Z. The
quotient group must be cyclic since Z is. What is the quotient group
isomorphic to?

2.10 PERMUTATION GROUPS

A permutation f on {1, 2, ..., n} is a 1-to-1, onto transformation from the set
{1, 2, ..., n} to itself. Permutations (and transformations) are frequently written
in the form

    (  1    2   ...   n  )
    ( f(1) f(2) ... f(n) )

That is, the top row is always 1, 2, ..., n. Underneath each number x is the value
f(x).

EXAMPLE 2.10.1. Let f be the permutation f(1) = 2, f(2) = 3, f(3) = 1. Then
f is

    (1 2 3)
    (2 3 1)

Permutations here will be multiplied as if composing them on the right.
That is, (x)fg = ((x)f)g. Apply f to x, then apply g to the result (x)f. This
can be done quickly using the double-row notation.

EXAMPLE 2.10.2. The product

    (1 2 3 4) (1 2 3 4)
    (2 3 4 1) (4 3 2 1)

is 1 → 2 → 3, 2 → 3 → 2, 3 → 4 → 1, 4 → 1 → 4. This is

    (1 2 3 4)
    (3 2 1 4)

To find such a product, for each x look underneath x in the left-hand
permutation. Find that same number on top of the right-hand permutation.
What is under it goes below x in the result.
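The double-row multiplication just described is one line of code. This sketch stores permutations 0-indexed, so the bottom rows of Example 2.10.2 appear shifted down by one; the name `compose` is ours.

```python
# Right composition as in the text: (x)fg = ((x)f)g.

def compose(f, g):
    """Apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))

f = (1, 2, 3, 0)   # the book's bottom row 2 3 4 1, written 0-indexed
g = (3, 2, 1, 0)   # the book's bottom row 4 3 2 1, written 0-indexed
print(compose(f, g))   # (2, 1, 0, 3), i.e. bottom row 3 2 1 4
```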
A second common way of writing permutations is as products of cycles.
A cycle is a sequence like 1 → 4, 4 → 2, 2 → 3, 3 → 1 which runs through a set of
numbers once and ends where it began, the function being applied each time
to the result of the last stage. A cycle is a set of numbers x, f(x), f²(x), ...,
f^k(x) = x. Numbers not in the cycle are left unchanged by it. The above cycle
would actually be written as (1423).

DEFINITION 2.10.1. For distinct xi, (x1 x2 ... xk) is the permutation f such
that f(x1) = x2, f(x2) = x3, in general f(xi) = x(i+1), and f(xk) = x1. For
x ∉ {x1, x2, ..., xk}, f(x) = x. Such a permutation is called a cycle or k-cycle.

EXAMPLE 2.10.3. The cycle (123) is

    (1 2 3)
    (2 3 1)

This notation is not unique: (123) = (231) = (312).

Two cycles are disjoint if and only if the sets {x1, x2, ..., xk} are disjoint.
The notation (x) represents the identity permutation and is sometimes called
a 1-cycle.
Cycle notation is more compact and tells more about the permutation than
the double-row notation. But multiplication of cycles is less simple.

THEOREM 2.10.1. Every permutation is a product of disjoint cycles. Disjoint
cycles commute.

Proof. Let p be a permutation on {1, 2, ..., n}. Consider the relation {(x, y) :
x = (y)p^i for some integer i}. This is an equivalence relation, since x = (x)p⁰;
if x = (y)p^i then (x)p^(−i) = y; and if x = (y)p^i and y = (z)p^j then
x = (z)p^(i+j). To each equivalence class we will associate a cycle of p.
Namely, for an integer x, let n be the least positive integer such that (x)p^n = x.
Then for 0 ≤ s < t < n, if (x)p^s = (x)p^t then (x) = (x)p^s p^(−s) = (x)p^t p^(−s) =
(x)p^(t−s). This contradicts the assumption on n. So x, (x)p, (x)p², ..., (x)p^(n−1)
are distinct. We have a cycle (x (x)p (x)p² ... (x)p^(n−1)). The product of all such
cycles is p. This proves the first assertion.

Let f, g be two disjoint cycles. Then for all x either (x)f = x or (x)g = x.
Moreover if x ≠ (x)f then (x)f ≠ (x)f². Suppose x ≠ (x)f. Then (x)g = x and
((x)f)g = (x)f. So (x)gf = (x)f and (x)fg = (x)f. Likewise if x ≠ (x)g,
(x)gf = (x)fg. Suppose x = (x)f and x = (x)g. Then (x)gf = x = (x)fg. This
proves the last statement. □

EXAMPLE 2.10.4. If p is

    (1 2 3 4 5 6 7 8 9)
    (4 5 7 9 2 3 6 1 8)

we have 1 → 4 → 9 → 8 → 1, 2 → 5 → 2, 3 → 7 → 6 → 3. Thus the permutation
equals (1498) (25) (376).
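The orbit-tracing argument of Theorem 2.10.1 translates directly into code; the function name `cycles` is ours, and it is checked against Example 2.10.4.

```python
# Disjoint-cycle decomposition: trace x, (x)p, (x)p^2, ... for each x
# not yet seen, as in the proof of Theorem 2.10.1.

def cycles(p):
    """p maps i to p[i]; returns the cycles with more than one element."""
    seen, out = set(), []
    for x in p:
        if x in seen:
            continue
        cyc, y = [], x
        while y not in seen:
            seen.add(y)
            cyc.append(y)
            y = p[y]
        if len(cyc) > 1:       # omit 1-cycles (fixed points)
            out.append(tuple(cyc))
    return out

# The permutation of Example 2.10.4:
p = {1: 4, 2: 5, 3: 7, 4: 9, 5: 2, 6: 3, 7: 6, 8: 1, 9: 8}
print(cycles(p))   # [(1, 4, 9, 8), (2, 5), (3, 7, 6)]
```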

Permutations in cycle form can be multiplied by tracing the image of
1, 2, .... Afterwards the cycles of the product must usually be found.

EXAMPLE 2.10.5. Let p = (123)(34). In this product 1 → 2 → 2, 2 → 3 → 4,
3 → 1 → 1, 4 → 4 → 3. So the product is

    (1 2 3 4)
    (2 4 1 3)

This is (1243).

DEFINITION 2.10.2. The order of an element x ∈ G is the least positive
integer n such that x^n = e.

EXAMPLE 2.10.6. The order of the permutation (123) is 3.
The order of an element is the same as the order of the cyclic subgroup it
generates, since that subgroup is {e, x, x², ..., x^(n−1)}. Thus the order of any
element of a group divides the order of the group, since the order of a subgroup
divides the order of the group.

PROPOSITION 2.10.2. The order of a k-cycle is k. The order of a product of
disjoint cycles is the least common multiple of their orders.

Proof. If f is a k-cycle, it can be seen that (x)f^k = x and that if x is in the cycle,
k is the lowest power with this property. So the order of f is k.

Let f be a product z1 z2 ... zr of disjoint cycles. Then since the cycles
commute, f^m = z1^m z2^m ... zr^m. This will be the identity if and only if z1^m = e,
z2^m = e, ..., zr^m = e, that is, if m is divisible by the order of each cycle. The last
statement of the proposition follows from this. □

EXAMPLE 2.10.7. The order of (1 4 6 7) (2 3 9 8 5 10) is the least common multiple
of 4 and 6, that is, 12.
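Proposition 2.10.2 reduces computing the order of a permutation to a least-common-multiple calculation over its cycle lengths. A sketch, assuming the cycle lengths are already known:

```python
# Order of a permutation = lcm of its disjoint cycle lengths
# (Proposition 2.10.2).
from math import gcd
from functools import reduce

def order_from_cycles(lengths):
    return reduce(lambda a, b: a * b // gcd(a, b), lengths, 1)

print(order_from_cycles([4, 6]))   # 12, as in Example 2.10.7
```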

PROPOSITION 2.10.3. Let p be a permutation and z = (x1 x2 ... xk) a k-cycle.
Then pzp⁻¹ is the k-cycle (y1 y2 ... yk) such that yi = p⁻¹(xi) for each i.

Proof. Under pzp⁻¹ the element p⁻¹(xi) is sent to xi, then to x(i+1), then to
p⁻¹(x(i+1)). So yi → y(i+1). Also yk → y1. And elements v not of the form p⁻¹(xi)
have v → p(v) → p(v) → v. □

For products of cycles pzp~l can be computed in the same way since
pxp~lpyp~x = pxyp~l. Here p'1 was defined after Proposition 1.3.3.

EXAMPLE 2.10.8. (123) (12) (123)^-1 = (31).
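Proposition 2.10.3 says conjugation just relabels the entries of a cycle through p^-1. A small Python sketch of ours, checked against Example 2.10.8:

```python
# Conjugation of a cycle (Proposition 2.10.3): p z p^-1 replaces each
# entry x_i of the cycle z by p^-1(x_i).  Permutations are dicts.
def conjugate_cycle(p, z):
    p_inv = {v: k for k, v in p.items()}   # invert the permutation
    return tuple(p_inv[x] for x in z)

p = {1: 2, 2: 3, 3: 1}          # the 3-cycle (123)
z = (1, 2)                      # the transposition (12)
print(conjugate_cycle(p, z))    # (3, 1), the cycle (31) of Example 2.10.8
```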

EXERCISES
Level 1
1. Multiply

    1 2 3 4     1 2 3 4
    2 3 4 1     3 1 4 2

2. Find all powers of

    1 2 3 4 5
    2 3 1 5 4

3. Write
    1 2 3 4 5
    2 3 1 5 4

in cycle form.
4. Write the multiplication table of the set of permutations e, (12) (34), (13) (24),
(14) (23). Show it forms a commutative subgroup. It is called the Klein
4-group.
5. What is (12) (23)? (12) (23) (34)?

Level 2
1. Find a permutation of degree 7 but of order 10. (It will have a 5-cycle and
a 2-cycle.) What is the largest order for degree 8?
2. What is the product (xy) (yz)? Here x, y, z are distinct.
3. What is the general product (x_1 x_2) (x_2 x_3) ... (x_(k-1) x_k)? Here all x_i are
distinct. Tell why this implies that 2-cycles, called transpositions, generate
the symmetric group.
4. Can a power of a cycle have more than one cycle? When? Consider (1234)
and (12345).
5. Tell how to obtain the inverse of a permutation written in cycle form.

Level 3
1. Find (12 ... n) (23) (12 ... n)^-1. Note (12 ... n)^-1 = (n ... 21). Note
   for x ≠ 1, 2, n: x → x+1 → x+1 → x, 1 → 2 → 3 → 2, 2 → 3 → 2 → 1,
   n → 1 → 1 → n.
2. Find (12 ... n) (i+1 i+2) (12 ... n)^-1 for i = 1, 2, ..., n − 2.
3. Generalize the formula (54) (43) (32) (12) (23) (34) (45) = (15). Thus
   show all transpositions are products of the (i i+1).
4. Combine the answers to 2, 3 of Level 3 and 3 of Level 2 to find two
permutations x, y which generate the entire symmetric group.
5. Show that all transformations are generated by the two permutations
   in the above exercise together with the transformation f(x) = x for x ≠ 1,
   f(1) = 2. First obtain all transformations with image size n − 1 as pfq
   for permutations p, q. Then show any transformation with image size
   k, k < n − 1, can be factored as a product of transformations with image
   size k + 1.

2.11 SYSTEMS OF DISTINCT REPRESENTATIVES AND FLOWS ON NETWORKS
Systems of distinct representatives are solutions to a problem such as this:
Suppose a workshop has n rooms. Each room i can be used for a specific set of
tasks Si which must be done. How can we assign a task to each room, with the
understanding that all tasks are distinct?

EXAMPLE 2.11.1. Suppose tasks 1, 2 can be performed in room 1, tasks 2, 3 in


room 2, and task 2 only in room 3. Then there is only one possible assignment:
task 1 to room 1, task 3 to room 2, task 2 to room 3.

Many combinatorial problems reduce to finding a system of distinct


representatives.

DEFINITION 2.11.1. Let U_1, U_2, ..., U_n be subsets of a set U. A system of
distinct representatives (SDR) is a sequence u_1, u_2, ..., u_n such that the u_i are
distinct and u_i ∈ U_i for all i.

Systems of distinct representatives can conveniently be studied by means of


a type of matrix defined below.

DEFINITION 2.11.2. A permutation matrix P is a Boolean matrix (or
(0,1)-matrix) which is the matrix of a permutation π on {1, 2, ..., n}, i.e. a
1-to-1 onto function. Thus p_ij = 1 if and only if (i)π = j.

EXAMPLE 2.11.2. The matrix of the permutation

    1 2 3 4
    2 3 1 4

is

    0 1 0 0
    0 0 1 0
    1 0 0 0
    0 0 0 1

PROPOSITION 2.11.1. A Boolean matrix (or (0,1)-matrix) is a permutation
matrix if and only if it has exactly one 1 in each row and column.

Proof. A permutation matrix has a 1 in place (i)π of row i and in place (j)π^-1 of
column j. Conversely if A has exactly one 1 in each row and column, define π by
a_i(i)π = 1. Then π must be 1-to-1, else some column would have two ones. So it
is a permutation. □

DEFINITION 2.11.3. A Boolean matrix (or (0,1)-matrix) A is a Hall matrix if
and only if there exists a permutation matrix P such that P ≤ A.

EXAMPLE 2.11.3. There is no permutation matrix P ≤ A for

        0 1 1
    A = 1 0 0
        1 0 0

Therefore A is not a Hall matrix.

Hall matrices will here be considered usually as Boolean matrices, since they
form a semigroup under Boolean matrix multiplication. Treatment as a Boolean
matrix makes no difference except that the operations are Boolean, but for
a (0,1)-matrix they are the usual arithmetic operations in the real number field.
For a system of distinct representatives we form a matrix A. Let |U| = m
and label the elements of U as u_1, u_2, ..., u_m. Form an n × m matrix A by
a_ij = 0 if u_j ∉ U_i and a_ij = 1 if u_j ∈ U_i. Thus the ith row of A has ones in
precisely those places corresponding to the members of U_i.

EXAMPLE 2.11.4. Let U_1 = {1, 2, 4}, U_2 = {3, 4}, U_3 = {1, 3}, U_4 = {1, 4}.
Then

    A = 1 1 0 1
        0 0 1 1
        1 0 1 0
        1 0 0 1

There are two SDRs. One is u_1 = 2, u_2 = 3, u_3 = 1, u_4 = 4.
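For small systems the matrix and the SDRs can be found by brute force. A sketch of ours for Example 2.11.4 (exhaustive search, fine only for small n):

```python
# Build the 0-1 matrix of Example 2.11.4 and enumerate all SDRs by
# trying every assignment of distinct representatives.
from itertools import permutations

sets = [{1, 2, 4}, {3, 4}, {1, 3}, {1, 4}]
n = len(sets)

# a_ij = 1 exactly when element j+1 belongs to U_(i+1)
A = [[1 if j + 1 in sets[i] else 0 for j in range(4)] for i in range(n)]

sdrs = [p for p in permutations([1, 2, 3, 4], n)
        if all(p[i] in sets[i] for i in range(n))]
print(sdrs)   # [(2, 3, 1, 4), (2, 4, 3, 1)]
```

The output confirms the text: exactly two SDRs, one of them u_1 = 2, u_2 = 3, u_3 = 1, u_4 = 4.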

In a square Boolean matrix A, an SDR corresponds to a set of 1 entries of
A such that there is precisely one 1 entry in each row and column, that is, a
permutation matrix P ≤ A.

EXAMPLE 2.11.5. The SDR in Example 2.11.4 corresponds to the permutation
matrix of Example 2.11.2.

If m < n the set U can have no SDR. If m > n we take a square m × m
matrix by simply adding sets U_(n+1), U_(n+2), ..., U_m all equal to U. An SDR
exists in the new system if and only if one exists in the old system.
By the above remarks, a square Boolean matrix is a Hall matrix if and
only if the corresponding system of sets has an SDR. There exists a famous, but
somewhat difficult to prove, theorem characterizing when a matrix is a Hall
matrix.
We will prove this using an algorithm of Ford and Fulkerson in network
theory. The algorithm will not be proved in detail here but a proof is sketched.

DEFINITION 2.11.4. An r × s rectangle of zeros (ones) in a Boolean matrix
(or (0,1)-matrix) A is a set R × S ⊆ {1, 2, ..., n} × {1, 2, ..., n} such that
a_ij = 0 (a_ij = 1) for i ∈ R, j ∈ S and |R| = r and |S| = s.

EXAMPLE 2.11.6. In the Boolean matrix

    0 1 0
    1 1 1
    0 1 0

{1, 3} × {1, 3} gives a 2 × 2 rectangle of zeros.



DEFINITION 2.11.5. A matrix A is full if it has no r × s rectangle of zeros
where r, s ≠ 0 and r + s = n + 1.
EXAMPLE 2.11.7. If A is

    1 0 1 1
    0 1 1 1
    1 0 1 1
    1 1 1 0

then A is a full matrix.


A network is a directed graph having a vertex x_0 (source) such that every
edge at x_0 is directed outwards and a vertex z (sink) such that every edge
at z is directed into z. Each edge e is assigned an integer c(e) called its capacity.
A flow consists of an assignment of integers f(e) to edges such that (1) at each
vertex except x_0, z the sum of all incoming values f(e) equals the sum of all
outgoing values f(e), (2) f(e) ≤ c(e). The value of the flow is the sum of f(e) over
all incoming edges at z. This equals the sum over outgoing edges at x_0. One can
think of a flow as oil flowing through a collection of pipes from x_0 to z or
goods being shipped along a network of roads from x_0 to z.

EXAMPLE 2.11.8. For this network

    [figure of a network with source x_0, vertices a, b and sink z omitted]

there exists the following flow

    [figure of the flow omitted]

The Ford-Fulkerson algorithm starts with a flow of zeros. It then increases
the flow if possible, one unit at a time, by the following method. We form sets
S_k, where S_0 = {x_0}. We put a point x in S_k if and only if it is not in
S_0, S_1, ..., S_(k-1) and either (1) there exists an edge e from a point u ∈ S_(k-1)
to x which is not flowing at full capacity c(e) or (2) there exists an edge from x
to a point u ∈ S_(k-1) having positive flow. Continue as long as possible. If z
belongs to some S_k then we can increase the flow by one unit. Trace a path
x_0, x_1, x_2, ..., z, where x_i ∈ S_i. This can be done going backwards. For each
edge of type (1) in the path increase the flow by one. For each edge of type (2)
reduce the flow by one. This in effect adds 1 unit of flow along the path from
x_0 to z, so it increases the flow by 1.
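The labeling step just described is a breadth-first search with forward and backward edges. The following is a minimal Python sketch of ours (dict-based storage, our own variable names, and a small sample network that is not the one in the book's figures):

```python
# One round of the Ford-Fulkerson labeling step: build the sets S_k by
# breadth-first search, then augment by one unit along the traced path.
from collections import deque

def augment(cap, flow, source, sink):
    parent = {source: None}          # how each labeled vertex was reached
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for (a, b) in cap:
            # type (1): forward edge not at full capacity
            if a == u and b not in parent and flow[(a, b)] < cap[(a, b)]:
                parent[b] = (a, +1); queue.append(b)
            # type (2): backward edge carrying positive flow
            if b == u and a not in parent and flow[(a, b)] > 0:
                parent[a] = (b, -1); queue.append(a)
    if sink not in parent:
        return False                 # the flow is already maximal
    v = sink                         # trace the path backwards to x_0
    while parent[v] is not None:
        u, direction = parent[v]
        if direction == +1:
            flow[(u, v)] += 1        # increase along a type (1) edge
        else:
            flow[(v, u)] -= 1        # reduce along a type (2) edge
        v = u
    return True

cap = {('x0', 'a'): 2, ('x0', 'b'): 2, ('a', 'b'): 1,
       ('a', 'z'): 1, ('b', 'z'): 3}
flow = {e: 0 for e in cap}
while augment(cap, flow, 'x0', 'z'):
    pass
print(sum(flow[e] for e in cap if e[1] == 'z'))   # maximum flow value: 4
```

Each successful call adds exactly one unit, so with integer capacities the loop terminates at a maximum flow.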

EXAMPLE 2.11.9. For the preceding network start with a flow of zero. Now
form sets S_1 = {a, b}, S_2 = {z}. Thus we can increase the flow by 1. One path
from x_0 to z is x_0 → b → z. So increase the flow along this path.

    [figure omitted]

Now repeat. This time S_1 = {a}, S_2 = {b, z}. So increase the flow by 1 along the
path x_0 to a to z.

    [figure omitted]

Repeat once more: S_1 = {a}, S_2 = {b}, S_3 = {z}. Increase the flow by 1 along this
path.

    [figure omitted]

Finally again increase the flow by 1 along the same path. This gives the flow
of the previous example.

    [figure omitted]

This is maximal since all edges from {x_0, a} to {b, z} are at full capacity.

EXAMPLE 2.11.10. Consider this network

    [figure omitted]

Suppose the flow

    [figure omitted]

existed and it is desired to increase this flow by 1. Then form S_1 = {a}, S_2 = {b},
S_3 = {z}. This is an example where an edge of type (2) is used. Increase the flow
by 1 unit from x_0 to a, reduce it by 1 from a to b, increase it by 1 from b to z.

This is maximal since all edges out of x_0 are at full capacity.

THEOREM 2.11.2. (Ford and Fulkerson.) If v is the maximum value of a flow
from x_0 to z then there exists a set U of vertices such that x_0 ∈ U, z ∉ U and the
total capacity of all edges from U to its complement U' is v.

Proof. The value of the flow cannot exceed the total capacity of edges from U
to U' since, intuitively speaking, all material flowing from x_0 to z must go through
one of these edges. In fact it can be shown that the value of the flow equals the
net amount flowing from U to U'.
Suppose the flow is maximal. Then in the algorithm the sets S_k do not
include z. Let U = ∪ S_k. Then x_0 ∈ U, z ∉ U. Moreover there exists no edge
from U to U' not at full capacity and no positive flow from U' to U, or there
would be an additional set S_k. So the total capacity from U to U' equals the
amount actually flowing, which is v. □

THEOREM 2.11.3. (Hall and Koenig.) A (0,1)-matrix (or Boolean matrix)
is a Hall matrix if and only if it is full.

Proof. No permutation matrix can have an r × s rectangle of zeros where
r + s > n: if p is the permutation and R × S is the rectangle of zeros, then p
must send R into the complement of S, so r ≤ n − s. Thus if A is not full,
there is no permutation matrix P ≤ A, and A is not a Hall matrix.
Conversely, draw a graph with vertices x_0, y_1, y_2, ..., y_n,
w_1, w_2, ..., w_n, z having directed edges from x_0 to y_i for each i, from w_i to z
for each i, and an edge from y_i to w_j if and only if a_ij = 1. Here all edges are
assigned capacity 1. If P is a permutation matrix, P ≤ A, and p is the permutation
whose matrix is P, a flow on the graph is defined by g(x_0, y_i) = 1,
g(y_i, w_p(i)) = 1, g(w_i, z) = 1 and g zero on all other edges. Conversely a flow
of value n arises in this way from a unique SDR. It follows from the
Ford-Fulkerson algorithm that if no such flow exists there is some set of
vertices T with x_0 ∈ T, z ∉ T such that at most n − 1 edges lead from T to its
complement T'.
Let R = {i : y_i ∈ T} and S = {j : w_j is not connected to any y_i ∈ T}, with
|R| = r and |S| = s. For i ∈ R and j ∈ S we have a_ij = 0, so R × S is a rectangle
of zeros. There are at least n − r edges from x_0 to the vertices y_i not in T. For
each w_j not in S there is at least one edge to it from some y_i ∈ T. If w_j ∉ T this
counts as one of the edges from T to T'. Else we have the edge from w_j to z,
which is from T to T'. This gives at least (n − r) + (n − s) edges from T to T'.
So (n − r) + (n − s) ≤ n − 1, that is, r + s ≥ n + 1. So if A is not a Hall matrix,
R, S give a rectangle of zeros with r + s > n, and A is not full. □

EXAMPLE 2.11.11. This matrix has a 3 × 2 rectangle of zeros. So it is not a
Hall matrix.

    1 1 1 1
    1 1 0 0
    1 1 0 0
    1 1 0 0
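Theorem 2.11.3 can be checked directly on small matrices. A brute-force sketch of ours (exponential search over permutations and rectangles, so only for small n), applied to the matrix of Example 2.11.11:

```python
# Numerical check of Theorem 2.11.3 (Hall-Koenig): a square 0-1 matrix
# is a Hall matrix exactly when it is full.
from itertools import permutations, combinations

def is_hall(A):
    n = len(A)
    # look for a permutation matrix P <= A
    return any(all(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def is_full(A):
    n = len(A)
    for r in range(1, n + 1):
        s = n + 1 - r
        for R in combinations(range(n), r):
            for S in combinations(range(n), s):
                if all(A[i][j] == 0 for i in R for j in S):
                    return False     # found an r x s rectangle of zeros
    return True

A = [[1, 1, 1, 1],
     [1, 1, 0, 0],
     [1, 1, 0, 0],
     [1, 1, 0, 0]]
print(is_hall(A), is_full(A))   # False False
```

Rows 2, 3, 4 must take distinct columns from {1, 2} alone, which is impossible; the same 3 × 2 rectangle of zeros witnesses that A is not full.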

EXERCISES
Level 1
1. Find an SDR for U_1 = {1, 3}, U_2 = {2, 3}, U_3 = {2, 4}, U_4 = {1, 4}.
2. Find a permutation matrix P < A.

    0 1 0 1
    0 0 1 0
    0 1 1 1
    1 0 1 0

3. Find a permutation matrix P < A.

    0 1 1 1
    1 0 0 1
    1 0 1 0
    1 1 0 0

4. Find a rectangle of zeros with r + s > n.

    1 1 1 1 1
    0 1 0 0 0
    0 0 1 0 0
    1 1 1 1 0
    0 0 1 0 0
    1 1 1 0 1
5. Find a maximal flow in this network.

    [figure of the network omitted]

Level 2
1. Find all SDRs for this system {{1,2}, {2,3}, {3,4}, {4,5}, {1,5}}.
Generalize to any n.
2. Suppose a (0, 1)-matrix has at least k rows with at most k ones for each
k < n. Show the system has at most one SDR.
3. Find all permutation matrices P less than or equal to this matrix.

    0 1 1 1 1
    1 0 0 0 1
    1 0 0 1 0
    1 0 1 0 0
    1 1 0 0 0

4. The permanent of a matrix M is the sum over all permutations π of
   m_1(1)π m_2(2)π ... m_n(n)π. Thus it is the same expression as the determinant
   except that there are no negative signs. Show that for a (0,1)-matrix A the
   permanent is the number of SDRs in the corresponding system.
5. A matrix is said to be doubly stochastic if it is nonnegative and all rows and
   columns have sum 1. Give examples of doubly stochastic n × n matrices
   with permanents precisely 1 and precisely n!/n^n.

These are known to be the upper and lower bounds (the latter, the
van der Waerden conjecture, was proved only recently).

Level 3
1. Show a square matrix having exactly k ones in each row and column
   must be a Hall matrix (show that an R × S rectangle of zeros with
   |R| + |S| = n + 1 would make this row and column sum condition false).
2. A Boolean matrix is k-decomposable if and only if it has a rectangle of
   zeros with r + s = n − k + 1. Else it is called k-indecomposable. Show a
   Boolean matrix A is k-indecomposable if and only if for every Boolean
   vector v ≠ 0 either vA has no zeros or vA has at least k fewer zeros than v.
3. Using Exercise 2 show the Boolean product of a k-indecomposable and a
   p-indecomposable matrix is at least (p + k)-indecomposable. Thus for
   k > 0 the set of k-indecomposable matrices forms a semigroup.
4. Prove a Boolean matrix A is a Hall matrix if and only if MA has no more
   zeros than M, for any Boolean matrix M. Exercise 2 may be used row by
   row.
5. Vertices x, y of a directed graph are said to lie in the same k-connected
   component if there exist flows of value k from x to y and from y to x.
   Here any ingoing edges at the source and any outgoing edges at the sink are
   disregarded. All edges have capacity one. Using the theorems of this section
   prove this is an equivalence relation.

2.12 ORBITS AND CONJUGATIONS

An automorphism, as mentioned earlier, is an isomorphism from a group to itself.
The identity mapping is always one automorphism.
Every group of order greater than two has at least one other automorphism.
If the group is abelian and not all elements are of order two, then x → −x is an
automorphism which is not the identity. If all elements have order two and the
group is commutative, then any permutation of a minimum generating set
extends to an automorphism, and a nonidentity permutation exists since the
order of the group exceeds two. Nonabelian groups have what are called inner
automorphisms (and frequently others).

DEFINITION 2.12.1. Let G be a group and g ∈ G. Let C_g be the function
from G to G such that C_g(x) = g x g^-1. Then C_g is called a conjugation, or
inner automorphism.

EXAMPLE 2.12.1. In an abelian group C_g(x) = g x g^-1 = g g^-1 x = x.

EXAMPLE 2.12.2. A conjugation by a nonsingular matrix replaces a matrix


by a matrix similar to that matrix.

DEFINITION 2.12.2. Two elements x, y of a group G are conjugate if and
only if there exists g ∈ G such that y = g x g^-1.

EXAMPLE 2.12.3. In Example 2.8.3, x, y, z are all conjugate, since y = b x b^-1
and z = a x a^-1.

THEOREM 2.12.1.
(1) C_g(xy) = C_g(x) C_g(y).
(2) C_g C_h(x) = C_gh(x).
(3) C_g is an isomorphism G → G.
(4) Conjugacy is an equivalence relation.

Proof. C_g(xy) = g x y g^-1. C_g(x) C_g(y) = g x g^-1 g y g^-1 = g x e y g^-1 = g x y g^-1.
This proves (1).
C_g(C_h(x)) = C_g(h x h^-1) = g h x h^-1 g^-1 = (gh) x (gh)^-1. This proves (2).
Let g, h be inverses of each other. Then C_g and C_h are inverse functions
of each other. So each is 1-to-1 and onto. This proves (3).
For reflexivity, x = e x e^-1. If x = g y g^-1 then y = (g^-1) x (g^-1)^-1. If
x = C_g(y) and y = C_h(z), then x = C_gh(z). This proves conjugacy is an
equivalence relation. □

THEOREM 2.12.2. Two permutations are in the same conjugacy class if and
only if for k = 1 to n they have the same number of k-cycles when expressed
as products of disjoint cycles.

Proof. By Proposition 2.10.3, two conjugate permutations do have the same
number of k-cycles for each k. Conversely let f and g be products of disjoint
cycles such that the same number of k-cycles occur in both, for each k. Let p
send the elements of each cycle of f, in order, to the elements of a cycle of g
having the same length. Let p be an arbitrary bijection on any remaining
elements of {1, 2, ..., n}. Then by Proposition 2.10.3, p g p^-1 = f. □

This means that the number of conjugacy classes of the symmetric group of
degree n equals the number of partitions of the integer n, that is the number
of distinct sets of positive integers whose sum is n.
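Theorem 2.12.2 and this counting can be verified computationally. A small sketch of ours, using tuples p with (i)p = p[i]:

```python
# Two permutations are conjugate iff they have the same cycle type
# (Theorem 2.12.2), so the number of conjugacy classes of S_n equals
# the number of partitions of n.
from itertools import permutations

def cycle_type(p):                   # p is a tuple: i -> p[i]
    n, seen, lengths = len(p), set(), []
    for i in range(n):
        if i not in seen:
            j, length = i, 0
            while j not in seen:
                seen.add(j); j = p[j]; length += 1
            lengths.append(length)
    return tuple(sorted(lengths))

n = 4
types = {cycle_type(p) for p in permutations(range(n))}
print(len(types))   # 5, the number of partitions of 4
```

The five partitions of 4 are 4, 3+1, 2+2, 2+1+1, 1+1+1+1, matching the five cycle types found.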

The degree of a permutation on {1, 2, ..., n} is n. This should be
distinguished from the order.

EXAMPLE 2.12.4. The symmetric group of degree 3 has three conjugacy
classes. Representatives are e = (1) (2) (3), (12) (3), (123). The three partitions
of 3 corresponding to these are

    3 = 1 + 1 + 1

    3 = 2 + 1

    3 = 3

A group of permutations on a set is transitive if it takes every member of
the set onto every other. For instance, the group of rigid motions in plane
geometry (generated by translations, rotations, and reflections) is transitive on
points since any point can be translated to any other; on lines since any line can
be translated and rotated to any other; on right angles since any right angle can
be translated and rotated to coincide with any other. It is not transitive on all
triangles since noncongruent triangles cannot be made to coincide by rigid
motions.
A group is imprimitive if it moves the elements in sets such that each set is
always sent onto another set. For example the additive group of integers acting
on itself is imprimitive, since the odd integers go to precisely the odd integers
or precisely the even integers, on addition of any number. Let Z^O, Z^E denote
the sets of odd integers and of even integers, respectively. For an integer x and
a set S let x + S denote {x + y : y ∈ S}. Then for any x either x + Z^O = Z^O or
x + Z^O = Z^E.
Another way to say this is that a group is imprimitive if it preserves an
equivalence relation other than the identity or universal relation. These two are
always preserved. It will then send any equivalence class to some equivalence class.

DEFINITION 2.12.3. Let G be a finite group of permutations of a set X. Then
G is said to be transitive if for all x, y in X there exists g ∈ G such that
(x)g = y, and G is said to be primitive if there does not exist an equivalence
relation R such that (x, y) ∈ R if and only if ((x)g, (y)g) ∈ R, other than the
identity and the universal relation.

EXAMPLE 2.12.5. The cyclic group generated by (12) (34) is not transitive,
since (1)g = 3 never holds.

EXAMPLE 2.12.6. The cyclic group generated by (1234) is transitive, but not
primitive, since it preserves the partition {{1, 3}, {2, 4}}. That is, the set {1, 3} is
either mapped to itself or to {2, 4}, never to some set that overlaps both.

EXAMPLE 2.12.7. The group of permutations e, (123), (132) on {1, 2, 3} is
transitive and primitive.

If a group is not transitive there will be more than one subset, called an
orbit, restricted to which it is transitive.

DEFINITION 2.12.4. Let G be a group of permutations on the set X. Then x
and y are said to belong to the same orbit if and only if (x)g = y for some g.
The isotropy subgroup of x is {g : (x)g = x}.

The idea is that under the group action any element stays within its own
orbit, eventually reaching all points of the orbit, but does not go outside that
orbit.

EXAMPLE 2.12.8. The group e, (12) (34) has two orbits {1, 2} and {3, 4}.
The isotropy subgroup of 1 is {e}. Under the action 1 → 2 → 1 and 3 → 4 → 3. The
number 1 will never be sent to 3.

The orbits of a cyclic group are the sets for the cycles of its generator.

THEOREM 2.12.3. The relation of belonging to the same orbit is an equivalence
relation. A group is transitive if and only if there is exactly one orbit. If a group
is primitive, it must be transitive. If a group is transitive but not primitive, all
equivalence classes of an equivalence relation preserved by G have the same
cardinality. If H is the isotropy subgroup of x, and O_x is the orbit of x, then
|G| = |H| |O_x|.

Proof. If (x)g_1 = y and (y)g_2 = z then (x)g_1 g_2 = z. The proofs of reflexivity
and symmetry are similar. By definition of transitive, G is transitive if and only
if all elements of X belong to the same orbit. If there are several orbits, the
relationship of belonging to the same orbit is preserved by the group. Thus G is
not primitive.
Suppose G is transitive but not primitive, and (x, y) ∈ R if and only if
((x)g, (y)g) ∈ R for an equivalence relation R, for all g ∈ G. Then g sends the
equivalence class of x onto the equivalence class of (x)g. So for any two
equivalence classes C_1, C_2 we have (C_1)g = C_2 for some g. Thus C_1, C_2 have
the same cardinality.
Let H, O_x be as in the theorem. By Theorem 2.8.1, |G| = |H| [G : H],
where [G : H] denotes the number of cosets of H (here we will use right cosets,
but the proof is the same). By definition O_x = {(x)g : g ∈ G}. We have
(x)g_1 = (x)g_2 if and only if (x)g_1 g_2^-1 = x if and only if g_1 g_2^-1 ∈ H if and
only if g_1 g_2^-1 = h ∈ H if and only if g_1 = h g_2 for some h ∈ H if and only if
g_1 and g_2 belong to the same right coset of H. Thus the number of right cosets
equals the size of O_x, since we have a 1-to-1 correspondence from cosets Hg to
orbit elements (x)g. So |G| = |H| |O_x|. This proves the theorem. □
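The counting formula |G| = |H| |O_x| is easy to check by machine on a concrete group. A sketch of ours for the group of Example 2.12.9 below, with permutations written 0-indexed as tuples p where (i)p = p[i] and the eight elements entered by hand:

```python
# Verify the orbit-stabilizer count |G| = |H| * |O_x| (Theorem 2.12.3)
# for the 8-element group of Example 2.12.9, written 0-indexed:
# e, (12)(34), (13)(24), (14)(23), (1324), (1423), (12), (34).
G = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0),
     (2, 3, 1, 0), (3, 2, 0, 1), (1, 0, 2, 3), (0, 1, 3, 2)]

x = 0
orbit = {p[x] for p in G}                 # images of x under the group
isotropy = [p for p in G if p[x] == x]    # elements fixing x
print(len(G), len(isotropy) * len(orbit))   # 8 8
```

The orbit of the point has 4 elements and the isotropy subgroup 2, so 2 × 4 = 8 = |G|, as the theorem predicts.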

EXAMPLE 2.12.9. The set of permutations e, (12) (34), (13) (24), (14) (23),
(1324), (1423), (12), (34) consists of all permutations which preserve the
equivalence relation with partition {{1, 2}, {3, 4}}. It follows that it is a group.
It is transitive since 1 → 1, 1 → 2, 1 → 3, 1 → 4 under different elements of the
group, but it is not primitive. The isotropy group of 1 is {e, (34)}. So
|G| = 8 = |H| |O_x| = 2 × 4.
The formula |G| = |H| |O_x| is used to prove many combinatorial results,
where some set related to a group must be enumerated. An example in group
theory itself is the size of conjugacy classes.

PROPOSITION 2.12.4. The order of any conjugacy class in a finite group
divides the order of the group.

Proof. We may regard G as a set of permutations of G itself by conjugation.
The orbits will be the conjugacy classes. And by Theorem 2.12.3, |O_x| divides
|G|. □

EXERCISES
Level 1
1. Give representatives of all conjugacy classes in S_4.
2. For g = (12) describe C_g on any permutation written in cycle form. Try
   (1345) and (243) as examples. Use the last result of Section 2.7.
3. In S_4 list the number of elements in each conjugacy class.
4. What group is generated by x = (12) and y = (45) in the symmetric group
   of degree 5? What are its orbits? Find all distinct products x^i y^j.
5. What group is generated by x = (123) and y = (45) in the symmetric group
   of degree 5? What are its orbits? Find all products x^i y^j. Is this group cyclic?

Level 2
1. Show a subgroup is normal if and only if it is a union of conjugacy classes.
2. Prove a group of order p^n for p prime has at least p conjugacy classes of
   one element. Note that the sizes of all conjugacy classes C_j are p^m for
   m ≥ 0 and that p^n = Σ |C_j|. Thus

       p^n = Σ_{|C_j| = 1} |C_j| + Σ_{|C_j| = p^r, r > 0} |C_j|

   Thus

       Σ_{|C_j| = 1} |C_j|

   is a multiple of p, being

       p^n − Σ_{|C_j| = p^r, r > 0} |C_j|

3. Prove an element is in a conjugacy class by itself if and only if it is in the
   center. Therefore any p-group has a nontrivial center.
4. Consider the group G of permutations of degree 5 generated by
   G_1 = {e, (12), (13), (23), (123), (132)} and G_2 = {e, (45)}. Show each
   element of G_1 commutes with each element of G_2 and note G_1 ∩ G_2 = {e}.
   Describe an isomorphism of S_3 × S_2 into G. What are the orbits of G?
5. Generalize the above exercise to represent S_n × S_m as a subgroup of S_(n+m).
6. Prove that if G is generated by subgroups G_1, G_2 such that all elements of
   G_1 commute with all elements of G_2 there is a homomorphism of G_1 × G_2
   into G. If G_1 ∩ G_2 = {e}, prove it is 1-to-1.
7. Prove a transitive permutation group of prime degree is primitive.
8. Give an automorphism of Z_2 × Z_2 which is not the identity. Give similar
   automorphisms of G × G for any G, and of G × G × ... × G, n factors.
9. Find all automorphisms of Z_5. (Any automorphism is determined by f(1)
   where 1 is a generator. Try f(1) = 1, f(1) = 2, and so on.)
10. Consider G acting on itself by x → xg. Show this is transitive, but if G has
    a nontrivial proper subgroup H, it is not primitive (consider the right cosets
    Hx).

Level 3
1. Show a group is simple (has no proper nontrivial normal subgroup) if and
   only if any non-identity conjugacy class generates the entire group.
2. Every permutation either preserves the sign of the product

       D = Π_{i<j} (x_i − x_j)

   or reverses it, if the permutation is applied to the subscripts giving

       Π_{i<j} (x_p(i) − x_p(j))

   For instance for degree 3 we have

       D = (x_1 − x_2)(x_2 − x_3)(x_1 − x_3)

   Under (12) this goes to (x_2 − x_1)(x_1 − x_3)(x_2 − x_3) = −D. Prove any
   transposition reverses the sign of D.
3. Prove there is a homomorphism h from the symmetric group onto {±1} by
   sending x to ±1 according as x preserves or reverses the sign of D. Here D is
   the same as in the above exercise.
4. The kernel of h is a normal subgroup of the symmetric group called the
   alternating group A_n. Prove every product of an even number of
   transpositions is in A_n but no product of an odd number of transpositions is.
5. Prove that A_5 is simple by showing each of its non-identity conjugacy
   classes generates it. It is the smallest noncommutative simple group.

6. Let W be the subgroup of S_(nm) such that W preserves the partition
   {1, 2, ..., m}, {m + 1, m + 2, ..., 2m}, ..., {nm − m + 1, ..., nm}. Show
   W is transitive but not primitive. Show W has a normal subgroup W_0
   isomorphic to S_m × S_m × ... × S_m which fixes each set of the partition.
   Show W/W_0 is isomorphic to S_n. What is the order of W? It is called the
   wreath product of S_m and S_n.
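The sign map of Exercises 2-4 can also be computed by counting inversions, since each inversion of the subscripts flips one factor of D. A sketch of ours (our own helper names, 0-indexed tuples):

```python
# The sign of a permutation, computed by counting inversions: sign is
# +1 for an even permutation, -1 for an odd one, and the map is a
# homomorphism: sign(fg) = sign(f) * sign(g).
from itertools import permutations

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

def compose(p, q):                   # apply p first, then q
    return tuple(q[p[i]] for i in range(len(p)))

# homomorphism check over all of S_3
assert all(sign(compose(p, q)) == sign(p) * sign(q)
           for p in permutations(range(3))
           for q in permutations(range(3)))
print(sign((1, 0, 2)))   # -1: a single transposition is odd
```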

2.13 SYMMETRIES
A symmetry of some object is a 1-to-l onto mapping from the object to
itself which preserves some of the structure of the object. One of the most
characteristic mathematical methods is to look for the symmetries in a problem.

EXAMPLE 2.13.1. The operation x → x^c establishes a symmetry from the
algebra of sets to itself which interchanges the operations ∩ and ∪. Thus any
theorem which is true for ∩ is also true for ∪. For instance since A ∩ (B ∪ C) =
(A ∩ B) ∪ (A ∩ C) we have also A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

EXAMPLE 2.13.2. The equation x^4 + x^2 + 1 = 0 is unchanged if we replace x
by −x. Thus the negative of any root is also a root.

EXAMPLE 2.13.3. For any group G there is a 1-to-1 onto mapping G to G
which sends x to x^-1. Since (xy)^-1 = y^-1 x^-1 this mapping reverses the order
of products, so that right cosets become left cosets and left cosets become right
cosets. Thus for any property of left cosets there is a similar property of right
cosets.

EXAMPLE 2.13.4. Conjugation x + iy → x − iy is a symmetry of the complex
number system.

EXAMPLE 2.13.5. Suppose we want to show that the edges of the figure
below (called a Petersen graph) cannot be 3-colored so that edges of all 3 colors
meet at each vertex.

    [figure of the Petersen graph omitted]

We can reason as follows. Note that the entire figure has
a pentagonal symmetry. Any 3-coloration of a pentagon uses all 3 colors since
a pentagon cannot be 2-colored. There are symmetries of such a graph coloring
problem which permute the colors in any way. That is, if we have one coloring
we may obtain another by recoloring all 1 edges as 2, all 2 edges as 3, and all
3 edges as 1, for instance. In a pentagon not all 3 colors can appear on 2 edges.
So assume color 3 occurs only on 1 edge. No color may appear on 3 edges of a
pentagon because some two of the three will meet. So each of colors 1, 2 appears
on 2 edges. If we remove the edge colored 3 the other colors must alternate.
So we may say that any edge coloring of a pentagon, up to symmetry, is

    [figure of the colored pentagon omitted]

So we may suppose that the outer edges of the Petersen graph are colored in
this way. The edges between the outer and inner edges are determined by this.
We have

    [figure of the partially colored Petersen graph omitted]

Evidently now no inside edges can be colored 3. But the inner 5 edges form a
pentagon. So one of them must be colored 3. This proves the impossibility of
3-coloring a Petersen graph. This problem is related to the 4-color problem, see
Gardner (1976).

The 4-color theorem asserts that any map in the plane can be colored with
4 colors in such a way that no two regions of the same color have an extended
common boundary larger than a point. This can be reduced to the case of maps
such that no more than three regions meet at a point, called trivalent maps.
A trivalent map can be 4-colored if and only if its edges can be 3-colored. The
Petersen graph (which, however, is not planar) is the smallest trivalent graph
which cannot be 3-colored and is ‘bridgeless’. Tutte has conjectured that every
bridgeless trivalent graph which cannot be 3-colored contains a topological copy
of the Petersen graph.

EXAMPLE 2.13.6. The set of n × n matrices has the symmetry X → X^T, the
transpose of X. This means that for any theorem about the columns of a matrix
there is a corresponding theorem about the rows of a matrix.

EXAMPLE 2.13.7. For any partial order there is another partial order
obtained by replacing < by >. This means for any theorem about > there is a
corresponding theorem about <.

Symmetries such as in Examples 2.13.1, 2.13.3, 2.13.4, 2.13.6, 2.13.7 are


called dualities.
Symmetry was used by Felix Klein to classify different kinds of geometry.
Euclidean geometry is based on what is called the group of rigid motions, which
are like drawing a figure on a piece of paper and moving it around on top of
a plane. Any two congruent figures can be made to coincide by a rigid motion
applied to one of them. Euclidean geometry is the study of properties like
distance and angle which are invariant under rigid motions, that is, they do not
change.
Projective geometry studies properties unchanged under a large group of
transformations, which include the transformations which occur when a shadow
falls on a plane inclined at an angle or a person sees something tilted at an angle
to his field of vision. Here angles and distances can change according to the
position of the observer, and other invariant quantities, such as cross-ratios, are
studied. A circle may be changed into an ellipse or even a hyperbola. Projective
geometry gives new results in ordinary geometry, such as Pascal’s theorem that
the extended sides of a hexagon inscribed in a conic section meet by pairs in
three collinear points.

EXAMPLE 2.13.8. Symmetry groups are used in the classification of elementary


particles. Sometimes a group of particles can be considered as separate states of
a single particle. These states will be solutions of an equation having a certain
symmetry, derived from a symmetry of a force law. If there is also a weaker
force which does not have the symmetry, solutions will come in sets that are
almost but not quite symmetric.

The following analogy may convey a little of what is involved. Suppose we
have an algebraic polynomial equation of degree 4 and a symmetry x → −x
(actually there are partial differential equations and infinite, continuous
symmetry groups). Then the equation has a form such as

    x^4 − ax^2 + b = 0.

Roots occur in symmetrical pairs ±r. Now suppose a small nonsymmetric term
is added:

    x^4 − ax^2 + 0.001ax + b = 0.

Then the roots will still probably come in pairs which are almost but not quite
equal in absolute value. This is analogous to particles having nearly but not
quite the same mass, like the proton and neutron.

For the remainder of this section we will be concerned with geometrical
symmetries. Let E^n denote n-dimensional Euclidean space. That is, points of
E^n are n-tuples (x_1, x_2, ..., x_n) of real numbers, distance is given by the
formula

    sqrt((x_1 − y_1)^2 + (x_2 − y_2)^2 + ... + (x_n − y_n)^2)

and the angle between two vectors v, w is defined by

    cos θ = (v · w) / (|v| |w|)

where the angle θ is chosen to lie between 0 and π.

DEFINITION 2.13.1. An isometry of En is a one-to-one, onto function
f: En → En such that for all points x, y the distance d(x, y) equals the distance
d(f(x), f(y)). Two subsets S1, S2 of En are congruent if and only if f(S1) = S2
for some isometry f.

EXAMPLE 2.13.9. For any (a1, a2, ..., an), the mapping (x1, x2, ..., xn) →
(x1 + a1, x2 + a2, ..., xn + an) is an isometry called a translation. A translation
moves the plane a distance √(a1^2 + a2^2 + ... + an^2) in the direction of the vector
(a1, a2, ..., an). It does not alter the directions of lines.

EXAMPLE 2.13.10. The transformation x → Ax for a matrix A will be an
isometry if and only if A^T = A^(−1). Such matrices are called orthogonal.
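The orthogonality condition can be checked numerically. The sketch below (illustrative; the helper names are hypothetical) verifies A A^T = I for a 2 × 2 rotation matrix and confirms that the corresponding map preserves the distance between two sample points.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

t = 0.7                                   # an arbitrary rotation angle
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

# A is orthogonal: A * A^T should be the identity, i.e. A^T = A^(-1).
P = mat_mul(A, transpose(A))
assert all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))

def apply(A, v):
    return [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]

def dist(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

# x -> Ax preserves the distance between two sample points.
x, y = [1.0, 2.0], [-3.0, 0.5]
assert abs(dist(x, y) - dist(apply(A, x), apply(A, y))) < 1e-12
print("orthogonal check passed")
```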

DEFINITION 2.13.2. In E3 a reflection in the plane H is the function f such
that f(x) = y where H is the perpendicular bisector of the segment from x to
y, and f(x) = x if x ∈ H. Reflections in a line in E2 are defined similarly.
A reflection through a point p is defined by f(x) = y where p is the midpoint
of the segment from x to y.
All reflections are isometries.
102 Semigroups and Groups [Ch. 2

EXAMPLE 2.13.11. The reflection in the x, y plane sends (x, y, z) to (x, y, −z).

A rotation in E3 will be considered as any isometry which leaves a certain
line (the axis) pointwise fixed, and does not fix any other point.

THEOREM 2.13.1. Every isometry f can be written as tg where t is a
translation and g is an isometry which fixes the point (0, 0, ..., 0).
Every isometry which fixes (0, 0, ..., 0) is multiplication of every vector
by a unique orthogonal matrix. Isometries form a group in which translations
are a normal subgroup.

Proof. Let t be the translation which takes (0, 0, ..., 0) to f^(−1)(0, 0, ..., 0).
Then tf = g fixes (0, 0, ..., 0).

An isometry which fixes (0, 0, ..., 0) will preserve distance by definition.
It will preserve straight lines since a straight line is the shortest distance between
two points. It will preserve the lengths of the sides of all triangles, so will also
preserve angles. It will thus preserve parallelism of lines. So it will send a sum of
two vectors to a sum of vectors. Thus it will satisfy f(v + w) = f(v) + f(w).
This implies f(kw) = kf(w) for all rational numbers k. By a continuity argument,
f(kw) = kf(w) for all real numbers k. Thus f is a linear transformation
and can be represented by a matrix. This matrix will be orthogonal.

It is straightforward to show that if f, g are isometries so are fg and f^(−1).
So the isometries form a group. The composition of the translations
(x1, x2, ..., xn) → (x1 + a1, x2 + a2, ..., xn + an) and (x1, x2, ..., xn) →
(x1 + b1, x2 + b2, ..., xn + bn) is (x1, x2, ..., xn) → (x1 + a1 + b1, x2 + a2 + b2,
..., xn + an + bn). The inverse of the former is (x1, x2, ..., xn) → (x1 − a1,
x2 − a2, ..., xn − an). Let t1 be the translation x → x + a1 in vector notation
and f = t2 g an isometry, where g is represented by an orthogonal matrix A and
(x)t2 = x + a2. Then f t1 f^(−1) = t2 g t1 g^(−1) t2^(−1). Now (x)g t1 g^(−1) =
(xA + a1)A^(−1) = x + a1 A^(−1), so g t1 g^(−1) is a translation, and conjugating a
translation by the translation t2 leaves it unchanged. Thus f t1 f^(−1) is a
translation, and translations form a normal subgroup. This proves the
theorem. □

Later we will take up matrices in more detail, but for the present we will
state some facts without proof.

DEFINITION 2.13.3. The eigenvalues of an n × n matrix A are the roots of
the equation det (zI − A) = 0, a polynomial of degree n.

EXAMPLE 2.13.12. The eigenvalues of a 2 × 2 matrix A are the roots of

    det | z − a11     −a12    |  = 0
        |  −a21     z − a22   |

or (z − a11)(z − a22) − a12 a21 = 0.



DEFINITION 2.13.4. An eigenvector corresponding to an eigenvalue z is a
vector v such that v(zI − A) = 0, or zv = v(zI) = vA.

EXAMPLE 2.13.13. For a diagonal matrix the vector v having a 1 in place i
and 0s elsewhere is an eigenvector. In fact vA = aii v.

Orthogonal 3 × 3 real matrices have the following forms. All eigenvalues of
an orthogonal matrix have absolute value 1, and satisfy a cubic equation with
real coefficients. Thus either all three are real (and must be ±1) or two are
complex conjugates and one is real.

Eigenvalues      Operation
1, 1, 1          Identity
−1, 1, 1         Reflection in plane of last two eigenvectors
−1, −1, 1        180° rotation about axis of last eigenvector
−1, −1, −1       −I, reflection through origin
1, z, z̄          Rotation about axis of first eigenvector
−1, z, z̄         Combined rotation about axis of first eigenvector and reflection
                 in the plane of last two eigenvectors

In general every orthogonal matrix of determinant 1 will be a rotation.
Rotations of a figure can be physically realized by moving the object about,
whereas reflections cannot.

Every 2 × 2 orthogonal matrix has one of these two forms, where
x^2 + y^2 = 1:

    | x  −y |        | x   y |
    | y   x |   ,    | y  −x |

The former, of determinant 1, will be a rotation, and the latter, of determinant
−1, will be a reflection.

DEFINITION 2.13.5. If S is a subset of En, the group of symmetries of S is
the group of all functions f: S → S such that there exists an isometry g with
f(s) = g(s) for all s ∈ S.

EXAMPLE 2.13.14. The group of all symmetries of the line segment [−1, 1] is
the group consisting of the identity function and the function f(x) = −x.

THEOREM 2.13.2. The group of symmetries of a regular polygon of n sides is
a group of order 2n having a normal subgroup isomorphic to Zn. Its elements
can be represented by e, z, z^2, ..., z^(n−1), y, yz, ..., yz^(n−1), where products
can be computed by the rules z^n = e, y^2 = e, z^i y = yz^(−i).

Proof. Assume the polygon has its center at the origin, that the distance from
its center to a vertex is 1, and that a vertex v1 is located at the point (1, 0).
All isometries of En preserve convex combinations since both multiplication by
orthogonal matrices and translations do. The center is the vector (1/n)(v1 + v2 + ... + vn)
if v1, v2, ..., vn are the vertices, so the center is preserved. Thus all symmetries are
rotations and reflections about the origin. Any rotation is determined by the position
of v1. Thus there are exactly n rotations, which send v1 to v1, v2, ..., vn. Let z
be the rotation sending v1 to v2. Then the rotations are e, z, z^2, ..., z^(n−1) and z^n = e.
The rotations are the symmetries in the kernel of the homomorphism det (A)
to the group {±1}. Thus they form a normal subgroup K. Let y be the reflection
(x, y) → (x, −y). Then y is a symmetry of the polygon. Any reflection satisfies
the rule y^2 = e. And yzy sends v1 to vn. Its determinant is (−1)(1)(−1) = 1,
so it is a rotation. So yzy = z^(−1). By the properties of conjugation yz^i y =
yz^i y^(−1) = z^(−i). So yz^i = z^(−i)y. Finally we must observe that no other isometries
exist. The subgroup K has two cosets by Theorem 2.5.1 since the image under
det (A) is the subgroup {±1}. Each coset has n elements, so the group has order
precisely 2n. This completes the proof. □

The group described by this theorem is an important group known as the
dihedral group. Its elements can be represented in matrix form as follows:

    z^i = |  cos(2πi/n)   sin(2πi/n) |          y = | 1   0 |
          | −sin(2πi/n)   cos(2πi/n) |  ,           | 0  −1 |
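The relations in Theorem 2.13.2 can also be verified concretely by modelling the symmetries as permutations of the vertices. A sketch for n = 5 (any n would do; the helper names are hypothetical):

```python
n = 5

# Vertices of the regular n-gon numbered 0..n-1 in cyclic order.
# z = rotation by 2*pi/n, y = the reflection fixing vertex 0.
z = tuple((i + 1) % n for i in range(n))   # vertex i -> i+1
y = tuple((-i) % n for i in range(n))      # vertex i -> -i
e = tuple(range(n))

def compose(p, q):
    """Apply p first, then q (matching the book's right-action convention)."""
    return tuple(q[p[i]] for i in range(n))

def power(p, k):
    r = e
    for _ in range(k):
        r = compose(r, p)
    return r

# The defining relations: z^n = e, y^2 = e, z y = y z^(-1).
assert power(z, n) == e
assert compose(y, y) == e
z_inv = power(z, n - 1)
assert compose(z, y) == compose(y, z_inv)

# The 2n elements e, z, ..., z^(n-1), y, yz, ..., yz^(n-1) are distinct.
elements = ({power(z, k) for k in range(n)}
            | {compose(y, power(z, k)) for k in range(n)})
assert len(elements) == 2 * n
print("dihedral group of order", 2 * n)
```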

EXERCISES
Level 1
1. What is a symmetry of the function sin x?
2. The equation x^4 + x^3 + x^2 + x + 1 = 0 can be written as x^2 + x + 1 +
   x^(−1) + x^(−2) = 0. Show that replacing x by x^(−1) is a symmetry of this.
3. Use this symmetry to reduce x^2 + x + 1 + x^(−1) + x^(−2) to a quadratic in
   z = x + x^(−1). Note that z^2 = x^2 + x^(−2) + 2. Write the function as a
   quadratic expression z^2 + az + b. Find its roots in z.
4. A nonsquare rectangle has a symmetry group consisting of e, one rotation
and two reflections. Describe the rotation and reflections.
5. Draw a 3-coloring of the edges of a cube. How many different 3-colorings
exist? How many coincide under a symmetry of the cube or a permutation
of the colors?
6. Write out the multiplication table of the dihedral group of order 10, using
the description in Theorem 2.13.2.

Level 2
1. Color the faces of a cube with as few colors as possible so that no two faces
of the same color have an edge in common.
2. Every finite partial order has a maximal element. What is the dual to this
   theorem?
3. Write out the Kronecker product of this matrix with itself (which will be a
   4 × 4 matrix): (b_(2i+k−2, 2j+m−2)) = (a_ij a_km), where i, j, k, m range over 1, 2.
   To find b_rs put in all values for i, j, k, m successively. If i = k = j = m = 1,
   b_(2+1−2, 2+1−2) = a11 a11, so b11 = a11^2 = 1^2 = 1.

    A = | 1  2 |
        | 3  5 |

Show the Kronecker product of its square is the square of its Kronecker
product.
4. Prove the Kronecker product ⊗ is multiplicative in that (A ⊗ B)(C ⊗ D) =
   (AB ⊗ CD), where (A ⊗ B)_(ni+k−n, nj+m−n) = a_ij b_km, and B is n × n.
5. Prove that any two translations commute. Show the group of translations in
En is isomorphic to R X R X ... X R where R is the additive real numbers.
6. Show that the group of translations and rotations in the plane is isomorphic
to the group of transformations az + b on complex numbers, where a, b
are complex numbers and |a| = 1. What operation corresponds to reflection
in the x-axis?
7. The only rotations that can occur in symmetries of crystals are multiples of
   60° and 90°. This can be shown as follows. The atoms in a plane section
   form a regular structure of points called a lattice. If one atom is fixed as
   (0, 0) all atoms will be a set L of points (x, y) such that if (x, y) ∈ L and
   (w, z) ∈ L then (x + w, y + z) ∈ L and (−x, −y) ∈ L. That is, it will be a
   subgroup of R × R. In order for atoms to be a finite distance apart in
   2 dimensions the lattice must have 2 generators, z, w. If A is the matrix
   of the rotation, Az = n1 z + n2 w, Aw = n3 z + n4 w, with n1, n2, n3, n4 ∈ Z.
   Thus the trace (sum of diagonal entries in any basis) is n1 + n4, an integer.
   The trace is independent of basis. For a standard basis a rotation will be
   represented as

        |  cos θ   sin θ |
        | −sin θ   cos θ |

   with trace 2 cos θ. Thus 2 cos θ = n1 + n4, an integer. Show this implies θ
   is a multiple of 60° or 90°. Give examples of lattices (or regular tilings)
   with 60° and 90° symmetry.

Level 3
1. The linear system

       y′ = ay + bz
       z′ = by + az

   is invariant under interchanging y and z. Show that it splits into separate
   equations on u = y + z and v = y − z.
2. Let A be a matrix and G a group of matrix symmetries of A, that is,
   XA = AX for X ∈ G. Show that if y is any linear combination of the
   matrices X ∈ G, then Ay = yA. What generally happens is that a collection
   y(i) of primitive idempotents can be found satisfying Σ y(i) = I,
   y(i)y(j) = 0 for i ≠ j, y(i)^2 = y(i). Show that any vector x is the sum
   of the x(i) = y(i)x. This represents the group V of all vectors x as the
   direct product set (x(1), x(2), ..., x(n)) of V(i) = y(i)V. Show that
   Ax = 0 if and only if Ax(i) = 0 for each i. This is the basic reason for the
   physics example.
3. Show that the group of isometries in En can be represented as a group of
   (n + 1)-dimensional matrices with block form

       | B  0 |
       | A  1 |

   where B is an orthogonal matrix acting on row vectors and A = (a1, a2, ...,
   an) represents the translation.
4. Prove that |Ax|^2 = |x|^2 for all vectors x if and only if A^T = A^(−1). Since
   x·y = ½(|x + y|^2 − |x|^2 − |y|^2), |Ax|^2 = |x|^2 for all x is equivalent to
   Ax·Ay = x·y for all x, y, or, since A must be nonsingular, Ax·z = x·A^(−1)z
   where z = Ay. Prove Ax·z = x·A^T z and that Ax·z = Bx·z for all x, z implies
   A = B.
5. Show the orthogonal matrices are closed under multiplication. Show a
   matrix is orthogonal if and only if its rows A_i* satisfy |A_i*|^2 = 1 and
   A_i* · A_j* = 0 for i ≠ j. Here A_i* denotes the ith row of A. Note that
   A^T = A^(−1) is equivalent to AA^T = I.
6. Compute the orders of the symmetry groups of a tetrahedron, an
   octahedron, and an icosahedron. These groups are transitive, so the number
   of vertices can be multiplied by the order of the group of symmetries which
   fix a given vertex. There will be the same number of reflections as rotations.
7. Show the group of symmetries of a tetrahedron is isomorphic to the
   symmetric group on its vertices.
8. Show the group of rotations of a cube is isomorphic to the symmetric
   group on its diagonals.
Sec. 2.14] Polya Enumeration 107

9. Show the group of rotations of an icosahedron is isomorphic to the
   alternating group.

2.14 POLYA ENUMERATION


Polya enumeration is concerned with counting the number of patterns of a
certain type where two symmetrical patterns are considered the same.

EXAMPLE 2.14.1. Consider all labellings of the vertices of a square with {0, 1}.
There exist 2^4 = 16 such labellings since each vertex can be labelled 2 ways no
matter how previous vertices have been labelled. But if we consider symmetry
there are precisely 6 distinct labellings: all four vertices 0; all four vertices 1;
a single 1; a single 0; two 1s on adjacent vertices; and two 1s on opposite
vertices.

G. Polya’s formula follows from a lemma of W. Burnside. We state this as in
W. Gilbert (1976).

LEMMA 2.14.1. (Burnside.) Let G be a group acting on a finite set X. For
g ∈ G let f_g be the number of elements fixed by g. Then the number of orbits
of G is |G|^(−1) Σ f_g.

Proof. Let S = {(x, g) : (x)g = x}. For each element g ∈ G we have f_g members
of S. So |S| = Σ f_g. On the other hand for a fixed x the number of elements g
such that (x)g = x equals the order of the isotropy group of x. This has size
|G| |O_x|^(−1) by Theorem 2.12.3, where O_x is the orbit of x. So any orbit
contributes in all

    |O_x| · |G| |O_x|^(−1) = |G|

elements. If N is the number of orbits, |S| = N|G| = Σ f_g. □
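The lemma can be checked by brute force on Example 2.14.1, labellings of the vertices of a square with {0, 1} under its 8 symmetries. The sketch below (an illustration, not from the text; all helper names are hypothetical) compares the Burnside count with a direct orbit count.

```python
from itertools import product

n = 4
e = tuple(range(n))
z = (1, 2, 3, 0)            # 90-degree rotation of the square's vertices
y = (0, 3, 2, 1)            # a diagonal reflection

def compose(p, q):
    return tuple(q[p[i]] for i in range(n))

# Generate the full symmetry group of the square by closure.
group = {e}
while True:
    new = {compose(p, q) for p in group for q in (z, y)} - group
    if not new:
        break
    group |= new
assert len(group) == 8

def num_cycles(p):
    seen, count = set(), 0
    for i in range(n):
        if i not in seen:
            count += 1
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count

# Burnside: g fixes 2^(number of cycles) of the {0,1}-labellings.
burnside = sum(2 ** num_cycles(g) for g in group) // len(group)

# Direct count: one canonical representative per orbit.
labellings = list(product((0, 1), repeat=n))
orbits = {min(tuple(lab[g[i]] for i in range(n)) for g in group)
          for lab in labellings}

assert burnside == len(orbits) == 6
print(burnside)  # 6
```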


In the following we consider a set X acted on by a group G. Each element
of X is labelled by a member of a set L of labels. (This is equivalent to a function
f: X → L.) Two labellings, or functions, are equivalent if they differ by an
operation of G. The problem is to find the number of nonequivalent labellings.

EXAMPLE 2.14.2. Suppose G is the symmetric group. Then two labellings are
equivalent if each label occurs the same number of times.

PROPOSITION 2.14.2. The number of equivalence classes is

    (1/|G|) Σ |L|^c(g)

where L is the set of labels and c(g) is the number of cycles of g.

Proof. We need to show f_g = |L|^c(g). For fixed g choose cycle representatives
x1, x2, ..., x_c(g). To say that a labelling is invariant means precisely that all
members of the same cycle receive the same label. These invariant labellings
are in 1-1 correspondence with labellings of x1, x2, ..., x_c(g). Each x_i can be
labelled |L| ways regardless of how previous ones are labelled. Thus there are
|L|^c(g) ways to label x1, x2, ..., x_c(g). □

Labellings can be classified more completely by Polya’s general result.
Associate a variable α_i to each element of L. Let P_G(x1, x2, ..., x_k) denote
the polynomial

    (1/|G|) Σ_{g ∈ G} x1^c1(g) x2^c2(g) ... x_k^ck(g)

where c_i(g) is the number of cycles of g of length i, and k is the maximum
length of any cycle.

EXAMPLE 2.14.3. For G a cyclic group of prime order p contained in the
symmetric group of degree p there will exist the identity transformation with
p 1-cycles and p − 1 nonidentity elements each with one p-cycle. So

    P_G(x1, x2, ..., x_p) = (1/p)(x1^p + (p − 1)x_p).

THEOREM 2.14.3. (Redfield and Polya.) The number of labellings in which α_i
occurs precisely n_i times is the coefficient of α1^n1 α2^n2 ... α_m^nm in

    P_G(α1 + α2 + ... + α_m, α1^2 + α2^2 + ... + α_m^2, ..., α1^k + α2^k + ... + α_m^k).

Proof. We apply Lemma 2.14.1 to the set S of labellings in which α_i occurs
exactly n_i times. By the lemma the number of equivalence classes equals

    (1/|G|) Σ_g |{s ∈ S : (s)g = s}|.

Suppose g has c_i(g) cycles of length i. Then it suffices to show |{s ∈ S : (s)g = s}|
equals the coefficient of α1^n1 α2^n2 ... α_m^nm in

    (α1 + α2 + ... + α_m)^c1(g) ... (α1^k + α2^k + ... + α_m^k)^ck(g).

Choose representatives x_ij of all i-cycles, and let S_i = {x_ij}. Then the labellings
of X invariant under g are the same as the labellings of these representatives. The
condition on the number of labels is equivalent to saying that the number of
occurrences of a label α_j in S1 plus twice the number of its occurrences in S2,
and so on, equals n_j. Note that c_i(g) = |S_i|. Thus the number of choices
mentioned equals the number of ways to choose a term from each of the factors

    (α1 + α2 + ... + α_m), ..., (α1 + α2 + ... + α_m)            [c1(g) times]
    ...
    (α1^k + α2^k + ... + α_m^k), ..., (α1^k + α2^k + ... + α_m^k)    [ck(g) times]

so that the total exponent of α_j is n_j. But this equals the coefficient of
α1^n1 α2^n2 ... α_m^nm in the product. □

EXAMPLE 2.14.4. Consider labellings of a square with 0, +1, −1. The
symmetry group of a square consists of the identity (4 1-cycles), two 90°
rotations (1 4-cycle), a 180° rotation (2 2-cycles), two reflections on a vertical
or horizontal axis (2 2-cycles), and two reflections on a diagonal axis (1 2-cycle,
2 1-cycles).

So the polynomial P_G is

    (1/8)(x1^4 + 2x4 + 3x2^2 + 2x2 x1^2).

The various labellings are then given by

    (1/8)((α1 + α0 + α−1)^4 + 2(α1^4 + α0^4 + α−1^4) + 3(α1^2 + α0^2 + α−1^2)^2
          + 2(α1^2 + α0^2 + α−1^2)(α1 + α0 + α−1)^2).

For instance the coefficient of α1^2 α0 α−1 is

    (1/8)(12 + 2(2)) = 2.

The two labellings are: two 1s on adjacent vertices with 0 and −1 on the
remaining two vertices, and two 1s on opposite vertices with 0 and −1 on the
other diagonal.
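The count of 2 in Example 2.14.4 can be double-checked by brute force. The sketch below (illustrative, not from the text; helper names are hypothetical) enumerates the labellings of the square's vertices using the label 1 twice and 0 and −1 once each, and counts orbits under the symmetry group.

```python
from itertools import product

n = 4
e = tuple(range(n))
z = (1, 2, 3, 0)            # 90-degree rotation
y = (0, 3, 2, 1)            # a diagonal reflection

def compose(p, q):
    return tuple(q[p[i]] for i in range(n))

# Closure of the generators gives the 8-element symmetry group.
group = {e}
while True:
    new = {compose(p, q) for p in group for q in (z, y)} - group
    if not new:
        break
    group |= new

# Labellings from {1, 0, -1} in which 1 occurs twice, 0 and -1 once each.
target = [lab for lab in product((1, 0, -1), repeat=n)
          if lab.count(1) == 2 and lab.count(0) == 1 and lab.count(-1) == 1]

# One canonical representative per orbit.
orbits = {min(tuple(lab[g[i]] for i in range(n)) for g in group)
          for lab in target}

assert len(orbits) == 2     # matches the coefficient (1/8)(12 + 2(2)) = 2
print(len(orbits))
```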

EXERCISES
Level 1
1. Work out the labellings of a square by {0, 1} using Polya’s method. Imitate
   the last example but use only α0, α1.
2. Write out all distinct labellings of a pentagon with 0, 1.
3. Work out this problem by Polya’s method where G, the dihedral group, has
   the identity with 5 1-cycles, 4 rotations with 1 5-cycle, and 5 reflections
   with 2 2-cycles and 1 1-cycle.
4. Write the labellings of a hexagon with 0,1 so that there are 3 zeros and 3
ones.
5. Compute the number of ways to label the vertices of a square with 4 labels
if all 4 labels appear. Check by writing out the diagrams.

Level 2
1. In switching circuits designers are concerned with functions on n Boolean
   variables. Thus we have a set of 2^n elements (all values of x1, x2, ..., xn)
   acted on by the permutations of x1, x2, ..., xn. These are labelled with 0, 1.
   Use Polya’s theory to compute the number of nonequivalent functions of
   this kind for n = 2. Example: the following are distinct.

          (1)             (2)             (3)
      x1 x2 | f       x1 x2 | f       x1 x2 | f
      0  0  | 0       0  0  | 1       0  0  | 1
      0  1  | 0       0  1  | 0       0  1  | 1
      1  0  | 0       1  0  | 0       1  0  | 0
      1  1  | 0       1  1  | 0       1  1  | 0

   The group has two elements, the identity and (12). The latter acts with a
   2-cycle and 2 1-cycles.
2. Consider switching circuits with a larger group generated by both permutation
   and complementation of input variables. This group has n! 2^n
   elements for n variables. For n = 2 we have

      Number of      Cycles of order
      elements       1    2    3    4
      1              4    0    0    0
      2              0    0    0    1
      3              0    2    0    0
      2              2    1    0    0

   (In fact it is the symmetry group of a square.)
3. Write out the functions for Exercise 2.
4. Do Exercise 2 for three variables. The group will be the symmetry group of
a cube, of order 48.
5. Find a general formula for the number of ‘necklaces’ of black and white
   beads. This is the same problem as labelling a regular n-gon with {0, 1}. The
   group is the dihedral group of order 2n and has the following elements for
   n odd (consider only this case here), where d is any divisor of n: φ(d)
   rotations having n/d d-cycles, and n reflections having 1 1-cycle and (n − 1)/2
   2-cycles. Here φ(n) is Euler’s function n Π(1 − 1/p), where p ranges over
   prime divisors of n. Check the formula for n = 5, 7.

Level 3
1. Solve the switching circuits problem for n = 4 with complementation (see
   Exercise 2 of Level 2).
2. Solve the necklace problem for n even (see Exercise 5 of Level 2).
3. How many distinct molecules can have this configuration

       x - y - u
           |
           v

   where each letter represents one of three given atoms?
4. Compute the number of distinct cyclic strings of 0, 1 of length n. Two are
   equivalent if they can be made to coincide by repeatedly moving the last
   element to the front. This is like the necklace problem except there are no
   reflections.
5. Compute the number of ways to color the vertices of a cube with 4 colors
   such that each color occurs twice.

2.15 KINSHIP SYSTEMS


Certain tribal societies are divided into subsets which will here be called clans,
though they are not hereditary, and there are strict rules about which clans a
person of a given clan may marry. Probably such rules help prevent incest and
help unite the tribe. These systems may have a mathematical structure. A number
of workers have attempted to state this structure precisely. Here we will follow
the treatment of H. C. White (1963), whose axioms can be stated as follows.

KS1. The entire population is partitioned into n nonempty clans.


KS2. There is a permanent rule fixing the single clan among whose women
the men of a given clan must find wives.
KS3. Men from two different clans cannot marry women of the same clan.
KS4. All children of a couple are assigned to a single clan uniquely determined
by the clans of their mother and father.
KS5. Children whose fathers are in different clans must be in different clans.
KS6. A man can never marry a woman of his own clan.

KS7. Every person has some relative by marriage and descent in each other
clan.
KS8. Whether two people who are related by marriage and descent are in the
same clan depends only on their relationship, not on which clan either
belongs to.

Let the clans be numbered 1, 2, ..., n. Let w be the function such that
w(i) is the clan of the wife of a man in clan i. By (KS1), (KS2) there is such a
function. By (KS3) it is a permutation. Let d be the function such that d(i) is
the clan of the children of a man in clan i and a woman, necessarily in clan
w(i). By (KS4), d exists, and by (KS5) it is a permutation. (KS7) states that
the group G of permutations generated by w, d is transitive. (KS8) states that
if (i)g = i for some i and some element g of the group G, then (j)g = j for all j.
(KS6) now implies that w is not the identity. Moreover any permutations w, d
satisfying these conditions give a possible kinship system.
We will now relate this to a theorem in group theory.

DEFINITION 2.15.1. Let G be a group of n elements, numbered 1, 2, ..., n.
Let r be the function from G to the symmetric group of degree n such
that r(g) is the function f_g, where (x)f_g = xg. Then r is called the regular
representation of G.

The regular representation is a particular way of associating a permutation
with each element of a group.

EXAMPLE 2.15.1. Let the elements be labelled as in Example 2.8.3. Then
r(x) is the permutation

    ( e  a  b  x  y  z )
    ( x  y  z  e  a  b )

from the x column of the table.

Thus the regular representation represents any group as a group of
permutations. The permutation is that given by right multiplication.

THEOREM 2.15.1. (Cayley.) The regular representation is an isomorphism from
G into the symmetric group of degree n. Thus any group is isomorphic to a
subgroup of a symmetric group.

Proof. (x)f_gh = xgh = (xg)h = ((x)f_g)f_h. Thus r(gh) = r(g)r(h). So r is a
homomorphism. For any x, if r(g) = r(h) then xg = xh so g = h. Therefore r
is one-to-one. So G is isomorphic to the image of r. This proves the theorem. □
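Cayley's theorem is easy to check mechanically for a small group. The sketch below (illustrative; the choice of Z6, written additively with elements 0..5, is arbitrary, and the helper names are hypothetical) builds the regular representation and verifies that it is an injective homomorphism.

```python
n = 6                                   # elements of Z6: 0, 1, ..., 5

def r(g):
    """Regular representation: the permutation x -> x + g (mod n)."""
    return tuple((x + g) % n for x in range(n))

def compose(p, q):
    """Apply p first, then q, matching the right-action convention (x)f_g = xg."""
    return tuple(q[p[x]] for x in range(n))

# r is a homomorphism: r(g + h) = r(g) r(h).
for g in range(n):
    for h in range(n):
        assert r((g + h) % n) == compose(r(g), r(h))

# r is one-to-one, so Z6 is isomorphic to a subgroup of the symmetric group.
images = {r(g) for g in range(n)}
assert len(images) == n
print("Cayley check passed for Z", n)
```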

THEOREM 2.15.2. A group G of permutations satisfies the conditions (1) G is
transitive, (2) if (i)g = i for some i then (j)g = j for all j, if and only if G
is the regular representation of itself.

Proof. We first show the regular representation satisfies these conditions. For
any x, y let g = x^(−1)y. Then (x)f_g = xg = x x^(−1) y = y. Thus the representation
is transitive. Suppose (x)f_g = x. Then xg = x so g = e.

Now suppose G satisfies (1), (2). We number the elements of G as follows:
for any element g, number it (1)g. By condition (2) these numbers are distinct
and by condition (1) they include all of 1, 2, ..., n. Now r(h) is the permutation
which takes (1)g to (1)gh. But that is precisely the effect of the permutation h.
So r(h) = h. So G is the regular representation of itself, if its elements are
properly numbered. This proves the theorem. □

COROLLARY 2.15.3. There is a 1-to-1 correspondence between kinship structures
of n clans and triples (G, w, d) where G is a group of order n generated by
two elements, w, d are generators of G, and w ≠ e.

In practice as many as 16 clans may effectively occur.
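The group-theoretic conditions on (G, w, d) are easy to test mechanically. The sketch below takes one common reading of the four-clan Kariera system of Example 2.15.2 (the specific permutations `w` and `d` are an assumption of this sketch, not stated in this form in the text) and checks transitivity, the fixed-point condition of (KS8), and w ≠ e.

```python
n = 4                  # clans numbered 0..3
e = tuple(range(n))

# Hypothetical reading of the Kariera system:
# w = clan of one's wife, d = clan of one's children.
w = (1, 0, 3, 2)       # assumption: spouses come from the paired clan
d = (2, 3, 0, 1)       # assumption: children belong to the opposite row

def compose(p, q):
    return tuple(q[p[i]] for i in range(n))

# Closure of {w, d} under composition gives the group G.
group = {e}
while True:
    new = {compose(p, q) for p in group for q in (w, d)} - group
    if not new:
        break
    group |= new

# (KS7): G is transitive -- some element sends clan 0 to each clan.
assert {g[0] for g in group} == set(range(n))

# (KS8): any element of G with a fixed point is the identity.
assert all(g == e for g in group if any(g[i] == i for i in range(n)))

# (KS6): w is not the identity.
assert w != e
print("kinship conditions hold; group order", len(group))
```

Here the generated group is the Klein four-group, consistent with a four-clan system in which w and d are commuting involutions.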

EXAMPLE 2.15.2. The Kariera tribe of Australia has been associated with this
marriage system, although some regard it as an oversimplification.

    Banaka ---- Burung
       |           |
    Palyeri -- Karimera

Horizontal lines denote the tribe of a husband or wife, vertical ones those of a
parent or child.

EXAMPLE 2.15.3. The Arunta tribe of Australia has been associated with this
marriage system, where horizontal and vertical lines denote spouse and lines at a
45° angle, parent or child.

    Panunga ----- Purula
    Appungerta -- Kumara
    Umbitchana -- Bultara
    Ungalla ----- Uknaria

(The 45° parent-child lines of the original diagram are not reproduced.)

EXERCISES
Level 1
1. Graph the kinship structure corresponding to (Z4, 1, 3). Vertices are clans
   and edges are labelled d or w from one clan to another.
2. Graph all kinship structures with group Z2.

3. Do the same for Z3.


4. Write out the complete regular representation of the symmetric group of
degree 3.
5. Write out the regular representation of the cyclic group Z5.

Level 2
1. What is the condition that the clans of children be distinct from those of
their mother?
2. What is the condition that mothers, fathers, and children belong to three
different clans?
3. Find the smallest example for Exercise 1.
4. Find the smallest example for Exercise 2.
5. Work out Example 2.15.2. as a regular representation.
6. Do the same for Example 2.15.3.

Level 3
1. What is the condition that first-cousin marriages be impossible? There are
   four cases: the mothers are sisters, the fathers are brothers, or the father of
   one is the brother of the mother of the other (two cases). Show the first two
   cannot happen.
2. Find the smallest example for which the first exercise of this level holds.
3. Prove a group of order p^2 is commutative. By an earlier exercise, the center
   must be at least Zp.
4. Show for any subgroup H of a group G, G acts as permutations of the
   cosets of H by (Hx)g = H(xg).
5. Show every transitive permutation representation arises as in the above
   exercise where H is the isotropy group of an arbitrarily chosen point.

2.16 LATTICES OF SUBGROUPS

In this section we briefly describe some results about lattices of normal
subgroups of a group. This theory also applies to modules, vector spaces, and
ideals in rings.

Recall a lattice is a partially ordered set in which every pair of elements has
a least upper bound (join, denoted ∨) and a greatest lower bound (meet,
denoted ∧). Any linearly ordered set, the subsets of any set, and any Cartesian
product of lattices form lattices. For intersection and union of sets (and for
any linearly ordered set), the distributive laws hold. Any lattice in which they
hold is called distributive.
Sec. 2.16] Lattices of Subgroups 115

DEFINITION 2.16.1. A lattice is modular if and only if the distributive laws:

    a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)
    a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c)

hold whenever some pair of elements of a, b, c is comparable, i.e. some element
is less than or equal to another. It is distributive if and only if these laws hold
for all a, b, c.

We will represent lattices by Hasse diagrams of partially ordered sets, that
is, a line is drawn from x to y if x < y and no z satisfies x < z < y. The higher
of x, y in the diagram is the greater.

EXAMPLE 2.16.1. The first lattice is distributive. The second is modular, but not
distributive. The third is not modular. The matrices of the three partial orders are

    1 0 0 0 0        1 0 0 0 0        1 0 0 0 0
    1 1 0 0 0        1 1 0 0 0        1 1 0 0 0
    1 1 1 0 0   ,    1 0 1 0 0   ,    1 0 1 0 0
    1 1 0 1 0        1 0 0 1 0        1 0 1 1 0
    1 1 1 1 1        1 1 1 1 1        1 1 1 1 1

The latter two lattices are known as a diamond and a pentagon.

A lattice will in fact be modular if and only if a ∧ (b ∨ c) = (a ∧ b) ∨ c
whenever a ≥ c. This law follows from the distributive property a ∧ (b ∨ c) =
(a ∧ b) ∨ (a ∧ c) and is equivalent to it in the only case in which it can fail
while two of the elements are comparable, namely c ≤ a. Similarly for the other
distributive law.

DEFINITION 2.16.2. A sublattice of a lattice L is a subset M ⊆ L such that M
is closed under the operations ∧ and ∨.

EXAMPLE 2.16.2. Most subspaces of Vn are not sublattices, because they are
not closed under products of vectors. However, the subspace {(0, 0), (0, 1), (1, 1)}
is a sublattice.

Every sublattice of a modular lattice is modular and every sublattice of a
distributive lattice is distributive.

THEOREM 2.16.1. Any lattice which is not modular contains a sublattice
isomorphic to a pentagon.

Proof. Suppose a ≥ c but a ∧ (b ∨ c) ≠ (a ∧ b) ∨ c. We have a ≥ a ∧ b, a ≥ c
so a ≥ (a ∧ b) ∨ c, and b ∨ c ≥ a ∧ b, b ∨ c ≥ c so b ∨ c ≥ (a ∧ b) ∨ c. Thus
a ∧ (b ∨ c) > (a ∧ b) ∨ c. The pentagon will be as follows

          b ∨ c
         /      \
        |    a ∧ (b ∨ c)
        b        |
        |    (a ∧ b) ∨ c
         \      /
          a ∧ b

It is immediate that b ∨ c ≥ b, b ∨ c ≥ a ∧ (b ∨ c), b ≥ a ∧ b,
(a ∧ b) ∨ c ≥ a ∧ b. We will show that if any of these were equalities, then
a ∧ (b ∨ c) ≤ (a ∧ b) ∨ c. If b ∨ c = b then a ∧ (b ∨ c) = a ∧ b, so this holds.
If b ∨ c = a ∧ (b ∨ c) then a ≥ b ∨ c so a ∧ b = b. Thus a ∧ (b ∨ c) =
b ∨ c = (a ∧ b) ∨ c. If b = a ∧ b then a ≥ b and so a ≥ b ∨ c again. If
(a ∧ b) ∨ c = a ∧ b then c ≤ a ∧ b. So a ∧ (b ∨ c) = a ∧ b = (a ∧ b) ∨ c.
This shows all the inequalities of the pentagon are strict.

In order that the diagram be a pentagon as a partially ordered set, we must
not have b ≥ (a ∧ b) ∨ c or b ≤ a ∧ (b ∨ c). In the first case b ≥ c, in the
second a ≥ b, and we have a contradiction as before. So we have a subset which
is a pentagon as a partially ordered set.

Finally it remains to show that this set is a sublattice, that is, it is closed under
∧, ∨. We need only consider pairs of elements which are not related by ≤. For
instance b ∨ ((a ∧ b) ∨ c) ≥ b ∨ c. But from the diagram b ∨ ((a ∧ b) ∨ c) ≤
b ∨ c is immediate. Likewise b ∧ ((a ∧ b) ∨ c) ≤ b ∧ a. The reverse inequality
follows from the diagram. The other relations follow from these and the diagram.
This completes the proof. □

A number of other characterizations of modular lattices are also known
(Donellan, 1968).

THEOREM 2.16.2. Let the normal subgroups of a group G be partially
ordered by inclusion. This partial order is a lattice, where A ∨ B = AB and
A ∧ B = A ∩ B. Moreover the lattice is modular.

Proof. Let A, B be normal subgroups. Suppose a ∈ A, b ∈ B. Then bab^(−1) ∈ A.
So bab^(−1) = a1, ba = a1 b. Thus BA ⊆ AB. Likewise AB ⊆ BA. Also a^(−1) ∈ A,
b^(−1) ∈ B so b^(−1)a^(−1) ∈ BA = AB. So ABAB = AABB = AB and (AB)^(−1) ⊆ AB.
This proves AB is a subgroup of G.

Let a ∈ A, b ∈ B, g ∈ G. Then gabg^(−1) = gag^(−1)gbg^(−1) ∈ AB. So AB is a
normal subgroup. Suppose a subgroup N ⊇ A, N ⊇ B. Then N = NN ⊇ AB.
Also AB ⊇ A, AB ⊇ B. Thus AB is a least upper bound of A, B.

The set A ∩ B is closed under products, inverses, and conjugations, and so
is also a normal subgroup. If K is a subgroup, K ⊆ A, K ⊆ B then K ⊆ A ∩ B.
Thus A ∩ B is a greatest lower bound of A, B.

Thus the normal subgroups form a lattice. Suppose A, B, C are normal
subgroups and A ⊇ C. To prove modularity we must show A ∩ (BC) ⊆ (A ∩ B)C.
Let a ∈ A ∩ (BC), a = bc where b ∈ B, c ∈ C. Then b = ac^(−1) ∈ AC = A. And
b ∈ B. So b ∈ A ∩ B and c ∈ C. Thus a = bc ∈ (A ∩ B)C. This proves the
theorem. □
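A finite check of the theorem (an illustration, not from the text): in Z2 × Z2 every subgroup is normal, so the whole subgroup lattice must be modular; at the same time its three subgroups of order 2 form a diamond, so it is not distributive. The helper names below are hypothetical.

```python
from itertools import product

elements = list(product((0, 1), repeat=2))      # the group Z2 x Z2

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

# Enumerate all subgroups: subsets containing the identity, closed under +.
subgroups = []
for bits in product((0, 1), repeat=4):
    s = frozenset(g for g, keep in zip(elements, bits) if keep)
    if (0, 0) in s and all(add(u, v) in s for u in s for v in s):
        subgroups.append(s)
assert len(subgroups) == 5          # e, three of order 2, the whole group

def join(a, b):                     # A v B = AB (here written additively)
    return frozenset(add(u, v) for u in a for v in b)

def meet(a, b):                     # A ^ B = intersection
    return a & b

# Modular law: a ^ (b v c) = (a ^ b) v c whenever c <= a.
for a in subgroups:
    for b in subgroups:
        for c in subgroups:
            if c <= a:
                assert meet(a, join(b, c)) == join(meet(a, b), c)

# Not distributive: the three order-2 subgroups form a diamond.
A, B, C = [s for s in subgroups if len(s) == 2]
assert meet(A, join(B, C)) != join(meet(A, B), meet(A, C))
print("modular but not distributive")
```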

There are many unsolved problems in the theory of modular lattices.
In contrast, finite distributive lattices are fairly well understood. Every finite
lattice has a unique basis under the operation ∨, i.e. a minimum generating set
(with zero removed). And for a distributive lattice the structure of this basis as
a partially ordered set completely determines the lattice.

EXERCISES
Level 1
1. Why is a linearly ordered set a lattice?
2. Prove a linearly ordered set is distributive (consider six cases for the
   distributive laws: a ≤ b ≤ c, a ≤ c ≤ b, b ≤ c ≤ a, c ≤ b ≤ a, b ≤ a ≤ c,
   c ≤ a ≤ b).
3. Draw the lattice of subsets of {0, 1}.
4. Do the same for subsets of {0, 1, 2}.
5. Draw the lattice of subgroups of Z3 × Z3. Write out the elements. All
   subgroups except e, Z3 × Z3 are cyclic of order 3.
6. Draw the lattice of subgroups of Z8. (They are all cyclic.)

7. Draw the lattice of subgroups of Z6. Use the multiplication table. (All
subgroups are cyclic.)
8. Show a finite lattice has a smallest element 0, and that 0 ∨ x = x for all x.

Level 2
1. Find an example of an infinite lattice with no smallest element.
2. A basis is a minimum generating set under ∨ of the nonzero elements of
a lattice, i.e. a set of generators such that no proper subset generates the
lattice. Prove any finite lattice has a basis. Show the set of integers has no
basis.
3. Show ∨ is associative in any lattice.
4. The positive integers form a lattice under the relation n divides m, although
this cannot be proved without some number theory. What is 6 ∨ 10? What
is 10 ∧ 6? Do you think this lattice is distributive?
5. Show that the family of all subgroups of a group (not necessarily normal)
is a lattice under inclusion. What are the operations?
6. Show that if L is any finite partially ordered set in which ∧ always exists,
and L has an element 1 larger than any other, L is a lattice. Define ∨ by
a ∨ b = ∧{z : z ≥ a, z ≥ b}.
7. Prove the dual of the above exercise, that any subset of a finite lattice
containing 0 and closed under ∨ is a lattice under the partial order. Give an
example for L = subsets of {1, 2, 3} where it is not a sublattice.

Level 3
1. Show that if a lattice is modular but not distributive, it contains a diamond.
2. Show that the basis for any finite lattice is unique and consists of all x such
that if x = y ∨ z then x = y or x = z.
3. Let B be the basis for a finite distributive lattice L, regarded as a partially
ordered set. Show that B determines the structure of L. Specifically, show
that L is isomorphic to the lattice of subsets S of B such that if x ∈ S, y ≤ x
then y ∈ S, and that a specific isomorphism is given by sending S to the
join of all x ∈ S.
4. Prove any distributive lattice with basis B is isomorphic to a sublattice of the
lattice of subsets of a set. Show in particular L is isomorphic to a sublattice
of the subsets of itself by the 1-to-1 map sending x to {y ∈ B : y ≤ x}. No
similar result is known for modular lattices.
5. Prove that if a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c) for all a, b, c in a lattice, then
a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c).
6. For a modular lattice, prove (x ∨ y) ∧ (y ∨ z) ∧ (z ∨ x) =
(x ∧ (y ∨ z)) ∨ (y ∧ (z ∨ x)). (This identity can be found in Birkhoff
(1967).)
CHAPTER 3

Vector spaces

In this chapter we review a number of the most important facts about matrices
and linear algebra, some of which are needed in later chapters. A field is a
system such as the rational, real, or complex numbers, in which addition,
multiplication, subtraction, and division exist and satisfy the usual laws of algebra
(exclusive of order properties). A vector space over F is a set V in which addition
v + w is defined and scalar multiplication cv is defined for c in F, v in V,
having various properties such as v + w = w + v. The standard example is the
space of n-tuples Fⁿ of elements of F, where (x₁, x₂, ..., xₙ) + (y₁, y₂, ..., yₙ) =
(x₁ + y₁, x₂ + y₂, ..., xₙ + yₙ) and c(x₁, x₂, ..., xₙ) = (cx₁, cx₂, ..., cxₙ).
Vector spaces are used to study systems of linear equations, geometry of
n-dimensional space, forces and velocities in physics, supply and demand for
many goods in economics, and have other uses.
Every vector space has a set {v₁, v₂, ..., vₖ} of vectors such that all vectors
w can uniquely be written as linear combinations c₁v₁ + c₂v₂ + ... + cₖvₖ.
Such a set {v₁, v₂, ..., vₖ} is called a basis. The number of vectors v₁, v₂, ..., vₖ
is called the dimension of the vector space.
A matrix is a rectangular array of numbers. Matrices can be added,
subtracted, and multiplied. The rule AB = BA does not always hold, and
inverses A⁻¹ may not exist. For given bases, there is a 1-1 correspondence
between matrices of a certain size and linear transformations f from one vector
space to another, i.e. functions f such that f(cv) = cf(v) and f(v + w) =
f(v) + f(w). A set of simultaneous linear equations may be written as a single
matrix equation xA = b.
The determinant of a matrix is a somewhat complicated function having
many properties, such as det(AB) = det(A) det(B) and det(A) ≠ 0 if and only
if A has an inverse. The characteristic polynomial of an n × n matrix A is the
polynomial det(tI − A) of degree n. It is a similarity invariant of a matrix,
that is, it is the same for A as for XAX⁻¹. This means it gives properties of a
linear transformation independent of the basis. Its roots are the eigenvalues of the
matrix A, that is, numbers λ such that for a nonzero vector v, vA = λv.
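For a 2 × 2 matrix the characteristic polynomial can be written out directly: det(tI − A) = t² − (trace A)t + det A. A small illustrative sketch (ours, not from the text; the matrix is made up for the example):

```python
# Characteristic polynomial of a 2 x 2 matrix A = [[a, b], [c, d]]:
# det(tI - A) = t^2 - (a + d) t + (ad - bc).
def char_poly(A):
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    return lambda t: t * t - trace * t + det

p = char_poly([[2, 1], [0, 3]])  # triangular, so the eigenvalues are 2 and 3
print(p(2), p(3))  # 0 0 -- the eigenvalues are exactly the roots
print(p(1))        # 2  -- 1 is not an eigenvalue
```

The same polynomial is obtained for any matrix similar to this one, since trace and determinant are similarity invariants.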

For normal matrices, those such that AB = BA where B is the transpose of
the complex conjugate of A, more is true. Such a matrix can be written
A = UDU⁻¹ where D is a diagonal matrix consisting of the eigenvalues of A,
and U is a unitary matrix, that is, the transposed complex conjugate of U is
U⁻¹. Unitary matrices represent rotations of the coordinate system. For a
symmetric matrix all eigenvalues are real numbers.

3.1 VECTOR SPACES


Vectors are encountered in physics, as line segments whose direction is the
direction of a force (or position, velocity, acceleration) and whose length gives
the intensity of the force. Such vectors can be added to give the sum of two
forces by adding their x, y, and z components (parallelogram method). There
also exist methods of multiplying vectors called inner and cross products.

EXAMPLE 3.1.1. A vector directed upwards at a 45° angle with an intensity of
2√2 can be represented by (2, 2). Here the line segment from (0, 0) to (2, 2)
makes a 45° angle with the x-axis and has length 2√2.

A vector can be regarded as simply an n-tuple of real numbers. Thus the set
of vectors is in 1-1 correspondence with the set of points in n-dimensional
space, represented by coordinates. Unlike points, however, vectors can be added
and multiplied by numbers.

EXAMPLE 3.1.2. (1,2, 7) + (-4, 2,1) = (-3,4,8).

EXAMPLE 3.1.3. 2(1,2, 3) = (2, 4, 6).
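The componentwise operations of Examples 3.1.2 and 3.1.3 translate directly into code. A minimal sketch (ours, not from the text):

```python
def vec_add(v, w):
    # Componentwise addition, as in Example 3.1.2.
    return tuple(x + y for x, y in zip(v, w))

def scalar_mul(c, v):
    # Scalar multiplication, as in Example 3.1.3.
    return tuple(c * x for x in v)

print(vec_add((1, 2, 7), (-4, 2, 1)))  # (-3, 4, 8)
print(scalar_mul(2, (1, 2, 3)))        # (2, 4, 6)
```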

This viewpoint is useful in various problems involving linear equations, such


as the solution of simultaneous linear equations. The solutions for all variables
x₁, x₂, ..., xₙ can be regarded as a single vector instead of many distinct
quantities. In economics quantities of many different goods are often regarded as a
single vector. For instance (2, 3, 10) might denote 2 bushels of wheat, 3 pounds
of steel, and 10 grams of gold. Vectors can indicate different aspects of a
sensation, such as intensity and pitch of sound.
A more general viewpoint is taken in abstract algebra. Here a vector is
anything, not necessarily consisting of numbers, on which operations of addition and
multiplication by elements of another set, as in Example 3.1.2, can be defined.
A field is a set like the real numbers obeying the usual laws of algebra, in
which we can divide by any nonzero element.

DEFINITION 3.1.1. A field is a set F on which binary operations addition and
multiplication F × F → F are defined, with these properties:

(a + b) + c = a + (b + c), (ab)c = a(bc)

a + b = b + a, ab = ba

a(b + c) = ab + ac

There exist 0, 1 such that for all x, x + 0 = x, x · 1 = x.

For any x there exists −x, and if x ≠ 0 there exists 1/x, such that

−x + x = 0, x · (1/x) = 1

EXAMPLE 3.1.4. The rational, real, and complex numbers are fields.

EXAMPLE 3.1.5. The set of numbers of the form a + b√2, a, b ∈ Q, is a field.
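Closure of this set under multiplication follows from (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, and inverses from 1/(a + b√2) = (a − b√2)/(a² − 2b²), whose denominator is nonzero since √2 is irrational. A sketch (ours, not from the text), representing a + b√2 as the pair (a, b) of rationals:

```python
from fractions import Fraction as Q

# Represent a + b*sqrt(2), with a and b rational, as the pair (a, b).
def mul(x, y):
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def inv(x):
    # 1/(a + b*sqrt(2)) = (a - b*sqrt(2)) / (a^2 - 2b^2); the denominator
    # is nonzero for (a, b) != (0, 0) because sqrt(2) is irrational.
    a, b = x
    n = a * a - 2 * b * b
    return (a / n, -b / n)

x = (Q(1), Q(2))       # the element 1 + 2*sqrt(2)
print(mul(x, inv(x)))  # (Fraction(1, 1), Fraction(0, 1)), i.e. the element 1
```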

The field will be the set of ‘numbers’ by which vectors can be multiplied, as in
3(1, −1) = (3, −3). Elements of it are called scalars.

DEFINITION 3.1.2. A vector space over a field F is a set V with an operation
of scalar multiplication F × V → V and an operation of addition V × V → V
such that the following are true for all s, t ∈ F, v, w, z ∈ V and for a fixed
0 ∈ V:

s(tv) = (st)v, 1v = v

(s + t)v = sv + tv, s(v + w) = sv + sw

(v + w) + z = v + (w + z), v + w = w + v

0 + v = v

and, for the zero and identity elements 0, 1 of the field we have

0 · v = 0, 1 · v = v

It follows that (−1)v + v = (−1)v + 1 · v = (−1 + 1)v = 0 · v = 0, so
that a vector space is an abelian group under addition.

EXAMPLE 3.1.6. The field F is itself a vector space over F.

EXAMPLE 3.1.7. The set of all n-tuples (v₁, v₂, ..., vₙ) over F is a vector space
with operations (x₁, x₂, ..., xₙ) + (y₁, y₂, ..., yₙ) = (x₁ + y₁, x₂ + y₂, ..., xₙ + yₙ)
and c(v₁, v₂, ..., vₙ) = (cv₁, cv₂, ..., cvₙ), c ∈ F.

EXAMPLE 3.1.8. The set Mₙ(F) of n × n matrices over F is a vector space.

EXAMPLE 3.1.9. The set of functions from any set X to F is a vector space
under the usual addition of functions and multiplication of functions by
constants.

DEFINITION 3.1.3. A subspace of a vector space V is a subset of V closed


under addition and under multiplication by field elements.

EXAMPLE 3.1.10. The set of polynomial functions from R to itself is a


subspace of the vector space of all functions from R to itself.

The concept of linear dependence may be recalled from the theory of


simultaneous linear equations. A set of linear equations is dependent if one is
a linear combination of the others. In the same way a set of vectors is linearly
dependent if one is a linear combination of the others.

DEFINITION 3.1.4. Let I be a set, called an index set. Let f be a function
from I to a family of sets. For i ∈ I, let the values of f be denoted Aᵢ. Then
the sets Aᵢ form what is called an indexed family of sets. The product of the
indexed family of sets Aᵢ is the set of all functions g from I to the union A of
all the sets Aᵢ, such that g(i) ∈ Aᵢ for all i.

EXAMPLE 3.1.11. Let I = {1, 2, 3}. Then our indexed family consists of three
sets A₁, A₂, A₃. The product is the set of all functions g such that g(1) ∈ A₁,
g(2) ∈ A₂, g(3) ∈ A₃. This can be written {(g₁, g₂, g₃) : g₁ ∈ A₁, g₂ ∈ A₂,
g₃ ∈ A₃}. So we see this really is the same as the previously defined Cartesian
product.

DEFINITION 3.1.5. A set S in a vector space V is linearly dependent if
there exist elements f₁, f₂, ..., fₙ ∈ F, not all zero, and distinct elements v₁, v₂,
..., vₙ ∈ S for some positive integer n such that f₁v₁ + f₂v₂ + ... + fₙvₙ = 0.
Otherwise it is linearly independent. An indexed sequence of vectors {vᵢ}, i ∈ I,
is linearly independent if all the vectors vᵢ are distinct and the set {vᵢ : i ∈ I} is
independent. Here 0 denotes the zero vector.

EXAMPLE 3.1.12. The set of vectors {(1, 0, 0), (1, 0, −1), (−1, 0, 2)} is
linearly dependent, since

(−1)(1, 0, 0) + 2(1, 0, −1) + (1)(−1, 0, 2) = (0, 0, 0)

EXAMPLE 3.1.13. Any set containing the vector 0 is dependent.

EXAMPLE 3.1.14. The set {v, 2v} is dependent.

EXAMPLE 3.1.15. The set {v, w, v + w} is dependent.

EXAMPLE 3.1.16. For any n the set of n-tuples {(1, 0, ..., 0), (0, 1, ..., 0),
..., (0, 0, ..., 1)}, each with (n − 1) zeros and one 1, is linearly independent.
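Dependence over the rationals can be tested mechanically: place the vectors in the rows of a matrix and row-reduce; the set is dependent exactly when the rank is less than the number of vectors. A sketch (ours, not from the text), using exact rational arithmetic:

```python
from fractions import Fraction

def rank(rows):
    # Gaussian elimination over the rationals; returns the rank.
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def independent(vectors):
    return rank(vectors) == len(vectors)

print(independent([(1, 0, 0), (1, 0, -1), (-1, 0, 2)]))  # False (Example 3.1.12)
print(independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))    # True  (Example 3.1.16)
```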

DEFINITION 3.1.6. The span ⟨S⟩ of a set S of vectors is the set of all linear
combinations f₁v₁ + f₂v₂ + ... + fₙvₙ where fᵢ ∈ F, vᵢ ∈ S, and n is any
positive integer.

EXAMPLE 3.1.17. The span of {(1, 0, 0), (0, 1, 0)} is the set of all vectors of
the form (f₁, f₂, 0) for fᵢ ∈ F.

If a set of vectors is linearly dependent, some vector vᵢ will be a linear
combination of the others: take any vᵢ such that fᵢ ≠ 0 in the equation
f₁v₁ + f₂v₂ + ... + fₙvₙ = 0 and solve for vᵢ in terms of the rest. If a vector
is a linear combination of others, removing it will not affect the span, since we
can substitute this expression for it in any other linear combination.

PROPOSITION 3.1.1. The span of S is a subspace and is contained in every
subspace containing S.

Proof. A sum of linear combinations is again a linear combination, and a
multiple of a linear combination by an element of F is again a linear combination.
Therefore ⟨S⟩ has the closure properties needed to be a subspace. If S ⊆ W for a
subspace W then by closure W must contain all linear combinations of elements
of S, i.e. it contains ⟨S⟩. □

Vector spaces are frequently studied in terms of subsets called bases. Two
vector spaces with bases of the same size are isomorphic. A basis gives a system
of coordinates (which need not be at right angles to one another). Thus the usual
x, y, z coordinates correspond to the basis {(1, 0, 0), (0,1,0), (0, 0,1)}.

DEFINITION 3.1.7. A basis for V is an independent subset S ⊆ V that spans V.

PROPOSITION 3.1.2. The set {v₁, v₂, ..., vₙ} is a basis for V if and only if
every vector v ∈ V has a unique expression v = f₁v₁ + f₂v₂ + ... + fₙvₙ where
fᵢ ∈ F.

Proof. Let {v₁, v₂, ..., vₙ} be a basis. Since it is a spanning set, for any v,
v = f₁v₁ + f₂v₂ + ... + fₙvₙ for some fᵢ ∈ F. If the fᵢ are not unique, take the
difference of two expressions for v. This gives a relation g₁v₁ + g₂v₂ + ... +
gₙvₙ = 0 where not all gᵢ = 0, a contradiction to independence.
Suppose conversely every vector v ∈ V has a unique expression
v = f₁v₁ + f₂v₂ + ... + fₙvₙ. Then v₁, v₂, ..., vₙ span V. Let v = 0. Uniqueness
shows that all fᵢ = 0. This proves independence. □

EXAMPLE 3.1.18. The set {(1, 0), (a, 1)} is, for any fixed a ∈ F, a basis for the
space of all ordered pairs (f₁, f₂).

THEOREM 3.1.3. Any independent set maximal under inclusion is a basis, and
any spanning set minimal under inclusion is a basis. Every independent set is
contained in a basis and every spanning set contains a basis.

Proof. Suppose S spans V but is not a basis. Then some vector v ∈ S is a linear
combination of the other members of S. Thus if v is deleted, we still have
a spanning set. So S is not minimal.
Suppose S is independent but not a basis. Then it does not span V. So some
vector v of V is not a linear combination of elements of S. Then S ∪ {v} is
independent: if f₁v₁ + f₂v₂ + ... + fₙvₙ + fₙ₊₁v = 0 with vᵢ ∈ S, then
fₙ₊₁ = 0 since v ∉ ⟨S⟩, and fᵢ = 0 for i ≤ n since S is independent. This
proves S is not maximal.
Let S be an independent set. Consider the family F of independent sets
which contain S. The family F is partially ordered by inclusion. Any chain C in
F is bounded by the union of the members of C, which will be independent:
if f₁v₁ + f₂v₂ + ... + fₙvₙ = 0 take members C₁, C₂, ..., Cₙ of C containing
v₁, v₂, ..., vₙ respectively. Let Cₖ be the largest under inclusion. Then all
vᵢ ∈ Cₖ and fᵢ = 0 by independence of Cₖ. Thus C is bounded. So by Zorn’s
Lemma, F has a maximal element. This will be a basis by what was proved
above.
Let S be a spanning set. The Well-Ordering Principle, which can be proved
from the Axiom of Choice, states that there exists a linear ordering ≤ of S in
which every nonempty subset of S has a least element.
Let T = {s ∈ S : s is not a linear combination of elements t < s in the
ordering}. Suppose T is dependent. Let f₁t₁ + f₂t₂ + ... + fₙtₙ = 0. Assume
all fᵢ ≠ 0, else delete that term. Let the last of t₁, t₂, ..., tₙ be tₖ. Then tₖ is a
linear combination of prior elements. This contradicts the definition of T.
Suppose T does not span V. Then since ⟨S⟩ = V it cannot span S. Let t be
the first member of S not in ⟨T⟩. Then t is not a linear combination of prior
elements, since prior elements are in ⟨T⟩ and then t would be too. So by the
definition of T, t ∈ T. This is a contradiction. So T is a basis. □

COROLLARY 3.1.4. Every vector space has a basis.

This theorem assumes the Axiom of Choice, and cannot be proved without
it for infinite-dimensional spaces. For finite-dimensional spaces, its main use, the
Axiom of Choice is not needed.

EXAMPLE 3.1.19. The independent set {(1,0, 1),(2,1,0)} is included in the


basis {(1,0,1), (2,1,0), (0,0,1)}.

EXAMPLE 3.1.20. The set S = {(0, 0, 0), (1, 0, 1), (2, 0, 2), (0, 1, 0), (1, 1, 1),
(0, 0, 1), (1, 0, 2)} spans the set of all ordered triples of R. To obtain a basis
delete all elements of S which are linear combinations of prior elements. This
leaves {(1, 0, 1), (0, 1, 0), (0, 0, 1)}, which is a basis.
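The deletion procedure of Example 3.1.20 can be mechanized: walk through the spanning set and keep a vector only if it raises the rank of the vectors kept so far. A sketch (ours, not from the text), with an elementary rank routine over the rationals:

```python
from fractions import Fraction

def rank(rows):
    # Row-reduce over the rationals and count the pivot rows.
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def extract_basis(spanning):
    # Keep each vector that is not a linear combination of those kept before it.
    basis = []
    for v in spanning:
        if rank(basis + [list(v)]) > len(basis):
            basis.append(list(v))
    return basis

S = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (0, 1, 0), (1, 1, 1), (0, 0, 1), (1, 0, 2)]
print(extract_basis(S))  # [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
```

The output agrees with the basis found by hand in Example 3.1.20.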

We will conclude this section with a brief history of vector analysis. The
parallelogram method of addition for vectors was discovered by the Greeks for
addition of velocities. It was used to represent addition of forces by physicists
in the sixteenth and seventeenth centuries. C. Wessel, a Norwegian surveyor, in
1799 discovered the geometric representation of complex numbers (which are a
vector space over R) and extended some results to 3 dimensions. This result was
independently discovered by C. F. Gauss, J. R. Argand, C. Mourey, and J. Warren.
W. R. Hamilton tried for years to find a system that extends the complex
numbers in the same way the complex numbers extend the real numbers. He
found no 3-dimensional system, but a 4-dimensional noncommutative one
called the quaternions, with basis 1, i, j, k where i² = j² = k² = −1 and ij = k,
jk = i, ki = j, ji = −k, kj = −i, ik = −j. This system can be used to represent
rotations in 3-dimensional space. He also introduced the operators del and del²
of mathematical physics.
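The relations i² = j² = k² = −1, ij = k, jk = i, ki = j determine the full product of any two quaternions a + bi + cj + dk. A sketch (ours, not from the text), representing a quaternion as the 4-tuple (a, b, c, d):

```python
def qmul(p, q):
    # Hamilton product, obtained by expanding the product of
    # p = a1 + b1 i + c1 j + d1 k and q = a2 + b2 i + c2 j + d2 k
    # using i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j.
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j))  # (0, 0, 0, 1)  = k
print(qmul(j, i))  # (0, 0, 0, -1) = -k, so multiplication is noncommutative
print(qmul(i, i))  # (-1, 0, 0, 0) = -1
```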
Operations of dot and cross product were introduced by A. M. Moebius,
H. G. Grassmann, A. Barre and others. Grassmann in particular considered very
general types of n-dimensional systems with vector space operations and various
types of products, some not even associative.
P. G. Tait further developed the theory of quaternions. The modern form
of the theory of vectors is due to two mathematical physicists, J. W. Gibbs and
O. Heaviside. Much of the above material is from J. M. Crowe (1967).
The nineteenth century was the time of the beginnings of a number of other
concepts of abstract algebra. A. Cauchy studied cycles of permutations and
transpositions. E. Galois studied groups as a method of proving polynomials
of the fifth degree were not solvable, and discovered the importance of normal
subgroups and commutativity. A. Cayley proved his previously cited theorem.
Fields of algebraic numbers were implicit in the work of E. Galois on
solving polynomials. Finite fields Zp were discovered by C. F. Gauss in his work
on congruences. Gauss also discovered what in effect were deep results about
algebraic number fields in his work on quadratic forms, quadratic reciprocity,
and other types of number theory. E. Kummer studied algebraic number
fields in order to solve many cases of Fermat’s still unsolved conjecture that
xⁿ + yⁿ = zⁿ, n ≥ 3, has no solutions in whole numbers. These authors did not,
however, use the abstract, axiomatic approach based on set theory which is
used today. This came about as an attempt to find rigorous foundations for all
of mathematics, especially calculus, in the late nineteenth century. G. Cantor is
generally considered the inventor of set theory.

EXERCISES
Level 1
1. Add (1,2, 3), (4, —5, 6), (—5, —10,1).
2. Multiply (1,0, -2) by -3.

3. If x = (1, 2, 3), y = (1, 0, 1), z = (0, 0, 1) write 2x + 3y + 4z.
4. Find a, b such that (1, 2) = a(1, 0) + b(1, 1).
5. Is this set dependent? {(1, 1), (2, 3), (0, 1)}. Find an independent subset.

Level 2
1. Prove that for any a, b, c: {(1, 0, 0), (a, 1, 0), (b, c, 1)} gives a basis for all
triples of real numbers.
2. Give an example of a subset of {(a, b) : a, b ∈ R} which is closed under
addition but is not a vector space over R.
3. Tell why if a field F₁ is contained in F₂ then F₂ is a vector space over F₁.
What is a basis for the complex numbers as a vector space over R?
4. Find a subset of this set which is a basis, by the method of Example 3.1.20:
{(0, 1, 0), (1, 2, 0), (2, 4, 0), (1, 3, 6), (1, 1, 1), (2, 2, 2)}.
5. Extend this independent set to a basis: {(1, 1, 1), (1, 2, 0)}. To do this, add
a spanning set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and apply the method of the
last exercise.

Level 3
1. Does any countable set have a well-ordering?
2. Use the existence of a basis for the real numbers over the rationals to
construct a function f from the real numbers to itself such that f(x + y) =
f(x) + f(y) but f is not continuous.
3. Add the quaternions 1 + 2i + 3j + 4k and 2 + i + j − k.
4. Multiply them. The distributive law holds for quaternions.
5. Prove that the associative law holds for the quaternions. It suffices to check
the basis elements ±1, ±i, ±j, ±k. If one of x, y, z is the identity, the relation
(xy)z = x(yz) always holds. Factor out all minus signs. By symmetry in
i, j, k assume x = i. Then check 9 cases for y, z = i, j, k.

3.2 BASIS AND DIMENSION


In geometry a point is said to have dimension 0, a line dimension 1, a plane
dimension 2, and all of three-dimensional space dimension 3. These can be
interpreted as the number of continuous real coordinates needed to give the
position of a point, or the number of simultaneously perpendicular line segments
which can exist. The measures length, area, volume are related to this.
Dimension generalizes to vector spaces. Instead of coordinates we simply
consider the number of vectors in a basis. However, we must show this number
is the same for all bases.
First we state a frequently used result about simultaneous linear equations,
valid also in the case of an infinite number of equations and variables.
As previously mentioned two sets S, T are said to have the same cardinality
if there exists a 1-1 onto mapping from S to T.

DEFINITION 3.2.1. The cardinality of S is less than or equal to that of T,
written |S| ≤ |T|, if and only if there exists a 1-1 mapping from S into T.

EXAMPLE 3.2.1. For finite sets this means the number of elements in S is
less than or equal to that in T.

EXAMPLE 3.2.2. Since Z ⊆ R, |Z| ≤ |R|.

Assuming the Axiom of Choice, this definition is equivalent to saying there
exists an onto mapping from T to S. It is readily verified that the relation
|S| ≤ |T| is reflexive and transitive (compose two 1-1 maps S → T and T → U).
The result that if |S| ≤ |T| and |T| ≤ |S| then |T| = |S| is less obvious and is
called the Schröder–Bernstein Theorem. In addition we use the fact that for S
infinite, |S| = |S × Z|.
Linear equations are called homogeneous if they have no constant terms
except zero. For instance 2x + 3y = 0 is homogeneous but 2x + 3y = 1 is not.
Coefficients are assumed to be in F.

THEOREM 3.2.1. Consider a set E of homogeneous linear equations in distinct
variables {xᵢ}, i ∈ I. If |I| > |E| then there exists a solution of all equations in
which the set of nonzero variables is finite and nonempty.

Proof. Each equation can contain only a finite number of variables. Thus if E
is infinite the number of variables which actually appear in the equations will
be no more than |Z × E| = |E| < |I|. So some variables will not appear in any
of the equations. Let the variables which appear in the equations be zero,
and assign arbitrary nonzero values to finitely many of the others.
Thus we may assume E is finite. We may now by a standard procedure
reduce E to a simpler system E₁. Take the first equation and choose a variable
which occurs in it with nonzero coefficient. Eliminate this variable from the
other equations by adding or subtracting a multiple of the first equation. At
the ith stage, if the ith equation is zero, delete it. Otherwise choose a variable
which occurs in it with nonzero coefficient, and eliminate that variable from all
the other equations by adding or subtracting a multiple of the ith equation. At
the final stage each chosen variable will occur in only one equation. Since there
are more variables than equations, some variables must not have been chosen.
Assign the variables not chosen arbitrary nonzero values. Then there will exist
unique values of the chosen variables satisfying all equations (just solve the
equations for the chosen variables). □

EXAMPLE 3.2.3. The system

    x + 2y + z = 0
    3x − 2y + 4z = 0

has the nonzero solution x = −10, y = 1, z = 8. Any multiple of this gives
another solution.
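The solution of Example 3.2.3 is easy to confirm, and because the equations are homogeneous every scalar multiple of a solution is again a solution. A quick check (ours, not from the text):

```python
def residuals(x, y, z):
    # Left-hand sides of the two homogeneous equations of Example 3.2.3.
    return (x + 2*y + z, 3*x - 2*y + 4*z)

print(residuals(-10, 1, 8))   # (0, 0)
print(residuals(-30, 3, 24))  # (0, 0): three times the solution also works
```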

THEOREM 3.2.2. Any two bases of a vector space have the same cardinality.

Proof. Let {vᵢ}, i ∈ I, and {wⱼ}, j ∈ J, be two bases for V. Suppose |I| < |J|.
Since {vᵢ} is a spanning set, there exist aᵢⱼ ∈ F such that

    wⱼ = Σᵢ aᵢⱼ vᵢ   (sum over i ∈ I)

for each j ∈ J. By Theorem 3.2.1 there exist {xⱼ}, j ∈ J, not all zero but only
finitely many nonzero, solving the equations

    Σⱼ aᵢⱼ xⱼ = 0   (sum over j ∈ J)

for all i ∈ I. Therefore

    Σⱼ xⱼ wⱼ = Σⱼ Σᵢ xⱼ aᵢⱼ vᵢ = Σᵢ vᵢ (Σⱼ aᵢⱼ xⱼ) = Σᵢ vᵢ · 0 = 0.

Thus the wⱼ are linearly dependent. This is a contradiction. By symmetry,
|I| > |J| is also impossible. So |I| = |J|. □

The Cartesian product of a collection of vector spaces is itself a vector space


in a natural way.

DEFINITION 3.2.2. The (external) direct sum of vector spaces V₁, V₂, ..., Vₙ
over F is the set of all n-tuples (v₁, v₂, ..., vₙ) such that vᵢ ∈ Vᵢ for each i.
Addition and scalar multiplication are defined by (v₁, v₂, ..., vₙ) + (w₁, w₂, ..., wₙ) =
(v₁ + w₁, v₂ + w₂, ..., vₙ + wₙ) and c(v₁, v₂, ..., vₙ) = (cv₁, cv₂, ..., cvₙ) where
c ∈ F.

EXAMPLE 3.2.4. The direct sum F ⊕ F ⊕ ... ⊕ F is the usual space of n-tuples.
It is denoted Fⁿ. This vector space is the same as the Cartesian product Fⁿ
mentioned in previous sections.

This definition extends to infinite direct sums and products. The direct
product is as in Definition 3.1.4. A direct sum, or restricted direct product in
somewhat confusing language, is the subset of the direct product consisting of
functions which have only a finite number of nonzero values.

DEFINITION 3.2.3. The dimension of a vector space is the cardinality of a


basis.

EXAMPLE 3.2.5. The dimension of C over the field R is 2, since {1, i} is a
basis.

A basis for a direct sum space can be obtained as the union of bases for each
vector space Vᵢ. Therefore the dimension of a direct sum of vector spaces Vᵢ is
the sum of the dimensions dim Vᵢ.

EXAMPLE 3.2.6. The dimension of Fⁿ is n.

The expression ‘internal direct sum’ means that a vector space is isomorphic
to a direct sum of certain of its own subspaces, rather than ‘outside’ vector
spaces.

DEFINITION 3.2.4. A vector space V is the internal direct sum of subspaces
W₁, W₂, ..., Wₙ if and only if every vector v ∈ V can be uniquely expressed as
v = w₁ + w₂ + ... + wₙ, with wᵢ ∈ Wᵢ. It is denoted W₁ ⊕ W₂ ⊕ ... ⊕ Wₙ.

EXAMPLE 3.2.7. The vector space F² is the direct sum of the subspaces W₁ =
{(x, x) : x ∈ F} and W₂ = {(x, 0) : x ∈ F}. In fact (x, y) = (y, y) + (x − y, 0)
for all x, y ∈ F.
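The decomposition of Example 3.2.7 can be checked directly; uniqueness of the two summands for each (x, y) is exactly the direct sum condition. A small sketch (ours, not from the text):

```python
def decompose(x, y):
    # Split (x, y) into its W1 = {(t, t)} and W2 = {(t, 0)} components:
    # (x, y) = (y, y) + (x - y, 0).
    w1 = (y, y)
    w2 = (x - y, 0)
    return w1, w2

w1, w2 = decompose(5, 2)
print(w1, w2)                                    # (2, 2) (3, 0)
print((w1[0] + w2[0], w1[1] + w2[1]) == (5, 2))  # True: the parts sum back
```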

PROPOSITION 3.2.3. An internal direct sum is isomorphic to the external
direct sum of the same subspaces.

Proof. Let V be the internal direct sum of W₁, W₂, ..., Wₙ. Let f : V →
W₁ ⊕ W₂ ⊕ ... ⊕ Wₙ be defined by f(w₁ + w₂ + ... + wₙ) = (w₁, w₂, ..., wₙ). Let
g : W₁ ⊕ W₂ ⊕ ... ⊕ Wₙ → V be defined by g(w₁, w₂, ..., wₙ) = w₁ + w₂ + ... + wₙ.
Then f and g are homomorphisms, and gf(x) = x and fg(y) = y. Thus g is the
inverse of f. □

DEFINITION 3.2.5. If W₁, W₂, ..., Wₙ are subspaces of V, the sum W₁ + W₂ + ...
+ Wₙ denotes {w₁ + w₂ + ... + wₙ : wᵢ ∈ Wᵢ for each i}.

EXAMPLE 3.2.8. If W₁ = {(x, y, 0) : x, y ∈ F} and W₂ = {(0, y, z) : y, z ∈ F}
then W₁ + W₂ is the set of all triples {(x, y, z) : x, y, z ∈ F}.

PROPOSITION 3.2.4. Let W₁, W₂, ..., Wₙ be subspaces of V. Then
W₁ + W₂ + ... + Wₙ = ⟨W₁ ∪ W₂ ∪ ... ∪ Wₙ⟩.

Proof. Every element of W₁ + W₂ + ... + Wₙ is a linear combination of elements
of the Wᵢ. Conversely every linear combination of elements wᵢ ∈ Wᵢ is a sum of
elements tᵢwᵢ ∈ Wᵢ, tᵢ ∈ F. □

DEFINITION 3.2.6. The subspace W₁ of V is a complement of the subspace W₂
if and only if W₁ ∩ W₂ = {0} and W₁ + W₂ = V.

EXAMPLE 3.2.9. The subspace {(x, 0, 0) : x ∈ F} is a complement of
{(0, y, z) : y, z ∈ F}.

PROPOSITION 3.2.5. Let W₁, W₂ be subspaces of V. Then the subspace W₁
is a complement of the subspace W₂ if and only if V = W₁ ⊕ W₂.

Proof. If V = W₁ ⊕ W₂ and W₁ ∩ W₂ ≠ {0}, then a nonzero x ∈ W₁ ∩ W₂ can
be expressed in two ways: x + 0 and 0 + x, contradicting uniqueness. Conversely
if u₁ + u₂ = w₁ + w₂ where uᵢ, wᵢ ∈ Wᵢ, then u₁ − w₁ = w₂ − u₂ ∈ W₁ ∩ W₂.
So when W₁ ∩ W₂ = {0}, u₁ = w₁ and u₂ = w₂, and the expression is unique. □

THEOREM 3.2.6. Any subspace of a vector space has a complement.

Proof. Let W ⊆ V. Let B be a basis for W. Then B is an independent set in V.
Thus by Theorem 3.1.3 there exists a basis C of V such that B ⊆ C. Let
D = C \ B. Then ⟨B⟩ + ⟨D⟩ = V. Let v ∈ ⟨B⟩ ∩ ⟨D⟩. Then v = Σ aᵢuᵢ, uᵢ ∈ B,
and v = Σ bᵢwᵢ, wᵢ ∈ D. So 0 = Σ aᵢuᵢ + (− Σ bᵢwᵢ). Since B ∪ D = C is
linearly independent, all aᵢ = 0 and all bᵢ = 0. So v = 0. Thus ⟨D⟩ is a
complement of ⟨B⟩ = W. □

For a vector space V and subspace W the relation {(x, y) : x − y ∈ W} is
a congruence, that is, an equivalence relation such that if x ~ y then ax ~ ay
and x + z ~ y + z. Therefore the set of equivalence classes forms a vector space
under the same operations. This vector space of equivalence classes is called the
quotient vector space, denoted V/W. For any complement U of W, the composite
mapping U → V → V/W is an isomorphism. It is onto since u ~ u + w if w ∈ W,
and 1-1 since its kernel is W ∩ U = {0}. Since V = W ⊕ U, dim V = dim W +
dim U. Therefore dim V/W = dim U = dim V − dim W.

EXERCISES
Level 1
1. Find a nonzero solution of these equations:

2. Find an example of 2 nonhomogeneous linear equations in 3 variables with
no solution.
3. What is the dimension of the space of vectors {(x, y, −x − y)}? Show that
{(1, 0, −1), (0, 1, −1)} is a basis.
4. If F is a finite field, how many elements does a vector space of dimension n
have?
5. Find a complement to the space spanned by {(1, 1)} in R². (Extend {(1, 1)}
to a basis.)

Level 2
1. Prove |Z × Z| ≤ |Z|, because |Z| = |Z⁺| and the map (x, y) → 2ˣ3ʸ on Z⁺
is 1-1. Here Z⁺ denotes the set of positive integers.
2. Prove |R × R| ≤ |R| as follows: write a real number x as an infinite decimal
(if a999... occurs replace it by (a + 1)000...). For two real numbers x, y,
form the real number obtained by alternating digits of x, y. Thus if
x = 0.101 and y = 0.345 send (x, y) to 0.130415. Show |R × Z| ≤ |R|.
3. What is the dimension of {(x, y, z) : x + 2y + 3z = 0}? Find a basis.
4. What is the dimension of {(x, y, z) : x + y = 0, y + z = 0}? Find a basis.
5. Find a complement to the space spanned by (1, 1, −2) in {(x, y, z) :
x + y + z = 0}.

Level 3
1. Prove the Schröder–Bernstein Theorem in the case where one set is finite.
2. Using Exercise 1, prove it when both sets have cardinality at most |Z|, hence
have cardinality exactly |Z| or are finite.
3. Find (possibly in a book) a general proof of the Schröder–Bernstein
Theorem. One method is as follows: Suppose we have 1-1 maps f : X → Y
and g : Y → X. To an element x ∈ X associate the set Cₓ of all elements
of X which can be obtained from x by iterating f, g, f⁻¹, g⁻¹ where
defined. For y ∈ Y define a similar set D_y. Show that f(Cₓ) ⊆ D_f(x) and
g(D_y) ⊆ C_g(y). This gives a 1-1 correspondence between the family {Cₓ}
and the family {D_y}. To construct a 1-1 correspondence from X to Y it
suffices, by the Axiom of Choice, to construct a 1-1 correspondence from
each Cₓ to D_f(x). This reduces the problem from X, Y to the case
Cₓ, D_f(x). Show these sets are at most countable. Next use Exercise 2.
4. For vector spaces U, V, W, let i₁, i₂ be the inclusions U → U ⊕ V, V → U ⊕ V
and π₁, π₂ the maps π₁(x, y) = x, π₂(x, y) = y. For any vector space
homomorphisms f₁ : W → U, f₂ : W → V (i.e. maps with f(ax) = af(x) and
f(x + y) = f(x) + f(y)), show there exists a unique homomorphism
f : W → U ⊕ V such that π₁ ∘ f = f₁ and π₂ ∘ f = f₂. Show for any
homomorphisms g₁ : U → W and g₂ : V → W there exists a unique
homomorphism g : U ⊕ V → W such that g ∘ i₁ = g₁ and g ∘ i₂ = g₂.
5. Prove that either property in Exercise 4 uniquely characterizes U ⊕ V: any
space having the property must be isomorphic to U ⊕ V.

3.3 MATRICES
It is assumed that the reader is familiar with basic operations on matrices from
earlier study, but we review the basic definitions here.
An n × m matrix over a set S is an indexed collection aᵢⱼ of elements of S
for i = 1, 2, ..., n and j = 1, 2, ..., m. That is, it is a function from the set
{1, 2, ..., n} × {1, 2, ..., m} to S. Matrices are usually represented as rectangles
of elements of S, with the value aᵢⱼ in row i and column j. However, they exist
as functions independent of any particular diagram. The set S will usually be
assumed to be a field, or at least to have operations of addition and multiplication
in it, as well as additive and multiplicative identities 0, 1.

DEFINITION 3.3.1. The n × n identity matrix is the matrix I = (δᵢⱼ) such that
δᵢⱼ = 1 if i = j and δᵢⱼ = 0 if i ≠ j.

EXAMPLE 3.3.1.

    I =  1 0 0
         0 1 0
         0 0 1

DEFINITION 3.3.2. The m × n zero matrix 0 is the matrix all of whose entries
are 0.

EXAMPLE 3.3.2.

[0 0 0 0]
[0 0 0 0]
[0 0 0 0]

DEFINITION 3.3.3. Addition and multiplication of matrices are defined by

(A + B)_ij = a_ij + b_ij and (AB)_ij = Σ_k a_ik b_kj.

This means in the case of addition that entries in the same locations in A, B
are added to give the entry of A + B in that location.

EXAMPLE 3.3.3.

[ 1 2 0]   [0 1  1]   [1 3 1]
[-1 3 1] + [2 3 -1] = [1 6 0]

Multiplication is more complicated. For each i, j, to obtain the (i, j)-entry
of a matrix AB, multiply row i of A by column j of B. The product of a row
and a column is obtained by adding the products of the kth element of the row
times the kth element of the column, for all k. This is lengthy. If done correctly
by the standard method, matrix multiplication requires n³ multiplications (n per
entry of AB) and n³ − n² additions (n − 1 per entry of AB) for two n × n
matrices.

EXAMPLE 3.3.4.

[ 1 2 0] [0 1  1]   [4  7 -1]
[-1 3 1] [2 3 -1] = [6  8 -3]
[ 4 2 2] [0 0  1]   [4 10  4]
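The entry-by-entry rule of Definition 3.3.3 translates directly into a few lines of Python. The sketch below (the helper name mat_mul is ours, not the text's) multiplies matrices like those of Example 3.3.4; for two n × n matrices it performs exactly the n³ multiplications counted above, n per entry.

```python
def mat_mul(A, B):
    """(AB)_ij = sum over k of a_ik * b_kj; A is n x m, B is m x s."""
    n, m, s = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(s)]
            for i in range(n)]

A = [[1, 2, 0], [-1, 3, 1], [4, 2, 2]]
B = [[0, 1, 1], [2, 3, -1], [0, 0, 1]]
product = mat_mul(A, B)
print(product)
```

The dimension assertion mirrors the rule below that an n × m matrix can multiply an r × s matrix only when m = r.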

Addition is defined only if both matrices have the same number of rows and
both have the same number of columns. Multiplication of an n × m matrix times
an r × s matrix is possible only if m = r. The answer is then n × s. Matrices
obey the following laws, assuming addition and multiplication in S have the
corresponding properties (e.g. if S is a field).

A + B = B + A, A + (B + C) = (A + B) + C, (AB)C = A(BC)
A(B + C) = AB + AC, (B + C)A = BA + CA, 0 + A = A

IA = AI = A

DEFINITION 3.3.4. For c ∈ F, cA is the matrix (c a_ij).

EXAMPLE 3.3.5.

2 [1 2]   [2  4]
  [0 5] = [0 10]

If S has an element −1 then (−1)A will be an additive inverse −A of
A: (−1)A + A = 0.
Existence of multiplicative inverses is not always true even over F.
Existence of multiplicative inverses is not always true even over F.

EXAMPLE 3.3.6. Let

A = [1 0]
    [1 0]

Then A has no inverse since for any B, both rows of AB are equal since both
rows of A are equal.

The main diagonal of a matrix A consists of the entries a_ii.

DEFINITION 3.3.5. A matrix is a diagonal matrix if all entries not on the main
diagonal are zero.

EXAMPLE 3.3.7.

[2 0]
[0 3]

Any two diagonal matrices commute. However, the commutative law is not
true for all pairs of matrices.
The (i, j)-entry of the mth power A^m of a matrix is denoted a_ij^(m).
The ith row of A is written A_i* and the jth column A_*j. (By replacing the
asterisk with numbers we obtain the entries of the row or column.) A 1 × n
matrix is called a row vector. An n × 1 matrix is called a column vector.

DEFINITION 3.3.6. The transpose of a matrix A = (a_ij) is A^T = (a_ji).

That is, the (i, j)-entry of A becomes the (j, i)-entry of A^T. Rows
become columns and columns become rows. We have (A + B)^T = A^T + B^T,
(AB)^T = B^T A^T, (A^T)^T = A.

EXAMPLE 3.3.8.

[1 2]^T   [1 0]
[0 1]   = [2 1]

DEFINITION 3.3.7. A matrix A is called symmetric if A = A^T.

Matrices can be multiplied by row vectors or column vectors of appropriate
dimension:

(vA)_i = Σ_k v_k a_ki

(Av)_i = Σ_k a_ik v_k

This is matrix multiplication for 1 × n or m × 1 matrices, so

A(v + w) = Av + Aw, A(Bv) = (AB)v

A(av) = a(Av), Iv = v, 0v = 0

and dual rules hold for vA.

Block and triangular forms of matrices are frequently very useful. If we
partition the rows of a matrix into sets of adjacent rows and the columns of a
matrix into sets of adjacent columns, the entire matrix is partitioned into blocks,
each of which is a submatrix. An (i, j)-block is denoted A_ij. Thus

[A_11 A_12 ... A_1n]
[A_21 A_22 ... A_2n]
[ ...              ]
[A_n1 A_n2 ... A_nn]

EXAMPLE 3.3.9.

[1 2 | 4 5]
[0 0 | 1 0]
[----+----]
[1 1 | 3 2]
[1 1 | 1 0]

This matrix has been partitioned into 4 blocks

A_11 = [1 2]    A_12 = [4 5]
       [0 0]           [1 0]

A_21 = [1 1]    A_22 = [3 2]
       [1 1]           [1 0]

PROPOSITION 3.3.1. If two matrices A, B are partitioned in the same way,
their sum is the matrix of blocks A_ij + B_ij. If they are n × n matrices, and the
columns of A are partitioned in the same way as the rows of B, their product is
the matrix of blocks Σ_k A_ik B_kj.

Proof. The statement for sums is immediate. For the product, group the terms
a_iℓ b_ℓj of an entry of AB according to the block containing ℓ: the terms with ℓ
in the kth block contribute the corresponding entry of A_sk B_kt, where the
(i, j)-entry belongs to the (s, t)-block. Thus the (s, t)-block of AB is Σ_k A_sk B_kt,
where k runs over the blocks. □

EXAMPLE 3.3.10. Let A, B be

[1 1 1]   [0 1 0]
[0 1 1] , [1 0 0]
[0 0 1]   [1 0 1]

Then A + B, AB are

[1 2 1]   [2 1 1]
[1 1 1] , [2 0 1]
[1 0 2]   [1 0 1]

Note that, partitioning A and B after the second row and second column, the
blocks of AB arise as in Proposition 3.3.1:

A_11 B_11 + A_12 B_21 = [1 1][0 1] + [1][1 0] = [2 1]
                        [0 1][1 0]   [1]        [2 0]

A_11 B_12 + A_12 B_22 = [1 1][0] + [1][1] = [1]
                        [0 1][0]   [1]      [1]

A_21 B_11 + A_22 B_21 = [0 0][0 1] + [1][1 0] = [1 0]
                             [1 0]

A_21 B_12 + A_22 B_22 = [0 0][0] + [1][1] = [1]
                             [0]
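The block computation can also be checked mechanically. The following minimal Python sketch (the helper names mat_mul, mat_add, and block are ours) partitions the two 3 × 3 matrices above after the second row and column and verifies that the block formula of Proposition 3.3.1 reproduces the top-left block of the full product:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def block(M, rows, cols):
    """Extract the submatrix with the given row and column index lists."""
    return [[M[i][j] for j in cols] for i in rows]

A = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
B = [[0, 1, 0], [1, 0, 0], [1, 0, 1]]

# Partition rows and columns as {0, 1} and {2}; by Proposition 3.3.1 the
# (0, 0)-block of AB is A_00 B_00 + A_01 B_10.
top, bot = [0, 1], [2]
block_00 = mat_add(mat_mul(block(A, top, top), block(B, top, top)),
                   mat_mul(block(A, top, bot), block(B, bot, top)))

AB = mat_mul(A, B)
assert block(AB, top, top) == block_00
print(block_00)
```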

DEFINITION 3.3.8. A matrix A is in upper (lower) triangular form if a_ij = 0
for i > j (i < j).

EXAMPLE 3.3.11. This matrix is in upper triangular form.

[1 1 3]
[0 1 2]
[0 0 4]

PROPOSITION 3.3.2. If two matrices are both in upper (lower) triangular
form, so is their product. The diagonal entries of the product are products of
the diagonal entries of the two matrices.

Proof. Assume A, B are in upper triangular form. Suppose i > j. Then in each
term a_ik b_kj either i > k or k > j, so every term is zero and Σ_k a_ik b_kj = 0.
Suppose i = j. If i ≤ k and k ≤ j then i = k = j. So a_ii b_ii is the only possibly
nonzero term in Σ_k a_ik b_kj. A similar proof holds for the lower triangular
case. □

We can also combine these two concepts by considering matrices whose


blocks are in triangular form. Such a matrix is said to be in block-triangular
form.
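Proposition 3.3.2 is easy to test numerically. The sketch below (helper names are ours) multiplies two random upper triangular matrices and checks both claims: the product is upper triangular, and its diagonal entries are the products of the corresponding diagonal entries.

```python
import random

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_upper(n):
    """Random n x n matrix with zeros below the main diagonal."""
    return [[random.randint(-5, 5) if i <= j else 0 for j in range(n)]
            for i in range(n)]

random.seed(0)
A, B = random_upper(4), random_upper(4)
C = mat_mul(A, B)

# C is upper triangular, with diagonal entries a_ii * b_ii:
assert all(C[i][j] == 0 for i in range(4) for j in range(4) if i > j)
assert all(C[i][i] == A[i][i] * B[i][i] for i in range(4))
```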

EXERCISES
Level 1
1. Find
1 4
0 0

2. Find

3. Find
4 0
+ (-3)
0 0
4. Find
"l 4 0" "l -1 f
0 5 3 2 1 1
1_

2 2-5
1

5. Give an example to show AA^T may not equal A^TA.


6. Find

   [1 1 1] [10 11]
           [-5 -9]
           [ 0 13]

7. Assume A, B are n × n matrices. Verify that B⁻¹A⁻¹ is an inverse of AB by
   multiplying, where A⁻¹ is an inverse of A and B⁻¹ is an inverse of B.

Level 2
1. Give an example to verify the distributive law for matrices.
2. Give an example to verify the associative law for matrices.
3. Multiply these matrices quickly using block forms and properties of the
identity matrix:
1
l

to

1 0 10 9
0 1 0 1 8 7 4 5

1 0 1 0 3 6 9 12
0 1 0 1 11 9 7 5

4. Show that any two matrices of this form commute:

   [a 0]
   [b a]

   Generalize this to n × n matrices.



5. Prove that if vA = 0 for some nonzero vector v, A cannot have an inverse
   A⁻¹ such that AA⁻¹ = I.
6. Find three different 3 × 3 matrices A such that A² = I.
7. Find a 2 × 2 matrix A such that A² = 0.

Level 3
1. Multiply two matrices having block form

   [A B]
   [0 D]

2. Give a formula for an inverse of such a matrix in terms of A⁻¹, D⁻¹ by
   solving

   [A B] [X Y]   [I 0]
   [0 D] [0 Z] = [0 I]
3. Do the same for upper triangular block form.


4. Factor

   [A B]
   [C D]

   by upper and lower triangular block matrices, if A is invertible. Find X, Y
   such that

   [A B]   [A 0] [I    Y   ]
   [C D] = [X I] [0 D - XY]

   In principle this and the preceding exercises give an inverse for any 2 × 2
   matrix in block form.
5. Prove A + A^T, AA^T, and A^T A are symmetric using the laws regarding
   transposes given in the text.
6. Prove for symmetric matrices A, B that AB is symmetric if and only if
   A, B commute.
7. Prove that a matrix A commutes with any polynomial c₀I + c₁A + ... +
   c_n A^n where c_i ∈ F.

3.4 LINEAR TRANSFORMATIONS


A linear transformation is a homomorphism of vector spaces, that is, a linear
function from one vector space to another. Every linear transformation can be
represented by a matrix for a given basis. Conversely every matrix gives a linear
transformation.

DEFINITION 3.4.1. Let V and W be vector spaces over F. A function g from
V to W is a linear transformation if and only if g(ax + by) = ag(x) + bg(y)
for all a, b ∈ F, x, y ∈ V.

EXAMPLE 3.4.1. Let V be the vector space of all polynomials with coefficients
in a field F containing the rationals. The following are linear transformations on V:

g(x) → 2g(x),   g(x) → ∫₀ˣ g(t) dt,

g(x) → x g(x),

g(x) → g′(x),   g(x) → g(x) − g(0)

EXAMPLE 3.4.2. Any n × m matrix over F gives a linear transformation on a
vector space of row vectors defined by v → vA and on a vector space of column
vectors by v → Av.

In fact every linear transformation between finite dimensional vector spaces
can be regarded as multiplication by some matrix. If we allow infinite matrices,
this is also true for infinite dimensional vector spaces.

PROPOSITION 3.4.1. Let g be a linear transformation from a vector space V
over F to a vector space W over F. Let {v₁, v₂, ..., v_n} be a basis for V and let
{w₁, w₂, ..., w_m} be a basis for W. Let a_ij ∈ F be such that g(v_i) = Σ_j a_ij w_j.
Then g(f₁ v₁ + f₂ v₂ + ... + f_n v_n) = k₁ w₁ + k₂ w₂ + ... + k_m w_m where
[f₁ f₂ ... f_n] A = [k₁ k₂ ... k_m].

Proof. Since

g(Σ_i f_i v_i) = Σ_i f_i g(v_i) = Σ_i f_i Σ_j a_ij w_j = Σ_j w_j Σ_i f_i a_ij

so

k_j = Σ_i f_i a_ij.

The same formula arises from matrix multiplication. □

This result gives a 1-1 correspondence between matrices and linear


transformations for any basis.
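The correspondence can be made concrete in a few lines. In the sketch below (the names are ours) the linear map g(x, y) = (x + y, y) is encoded, row by row, as the matrix of its values on the standard basis vectors, and applying the matrix to a coordinate row vector agrees with applying g directly, as Proposition 3.4.1 asserts.

```python
def apply_matrix(v, A):
    # Row-vector convention of the text: the coordinates of g(v) are v*A.
    return [sum(v[i] * A[i][j] for i in range(len(v)))
            for j in range(len(A[0]))]

def g(x, y):
    """The linear map g(x, y) = (x + y, y)."""
    return (x + y, y)

# Row i of A holds the coordinates of g applied to the ith basis vector.
A = [list(g(*e)) for e in ([1, 0], [0, 1])]
print(A)                                   # the matrix of g

v = [3, 5]
assert tuple(apply_matrix(v, A)) == g(*v)  # [f1 f2] A = [k1 k2]
```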

DEFINITION 3.4.2. The image space of a linear transformation g: V → W is
{w ∈ W : w = g(v) for some v ∈ V}. The kernel, or null space, of g is
{v ∈ V : g(v) = 0}. The rank of g is the dimension of the image space.

EXAMPLE 3.4.3. Let g be the linear transformation of multiplication by

[1 1]
[2 2]

on row vectors. Then the image space of g is the set of vectors having the form
(x, x). The kernel of g is the set of vectors having the form (−2x, x). The rank
of g is 1.
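Both claims of this example can be spot-checked directly; the short sketch below (the helper name vA is ours) applies the matrix to row vectors.

```python
A = [[1, 1], [2, 2]]

def vA(v):
    """Row vector times the 2 x 2 matrix A."""
    return [sum(v[i] * A[i][j] for i in range(2)) for j in range(2)]

# Every image vector has equal components, so the image is {(x, x)}:
img = vA([3, 5])
assert img[0] == img[1]

# Vectors of the form (-2x, x) lie in the kernel:
assert vA([-2 * 7, 7]) == [0, 0]
```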

PROPOSITION 3.4.2. Let g be a linear transformation from a finite dimensional
vector space V into a finite dimensional vector space W. Then

dim (null space) + dim (image space) = dim V

Proof. By Theorem 3.2.6, the null space N has a complement U in V. Since
N ⊕ U = V, dim N + dim U = dim V. We will show g is an isomorphism from U
onto the image space M. Let z ∈ M. Then z = g(y) for some y ∈ V. And y = u + n,
u ∈ U, n ∈ N. So z = g(u + n) = g(u) + g(n) = g(u) + 0 = g(u). So g is
onto.
Suppose g(u₁) = g(u₂). Then g(u₁) − g(u₂) = 0. So g(u₁ − u₂) = 0. So
u₁ − u₂ ∈ N. So u₁ − u₂ ∈ N ∩ U = {0}. So u₁ = u₂. This proves g is
one-to-one. So U ≅ M. So dim U = dim M. □

DEFINITION 3.4.3. The row (column) space of a matrix is the vector space
spanned by its rows (columns). The row (column) rank is the dimension of
the row (column) space. The row (column) rank of A will be denoted by
ρ_r(A) (ρ_c(A)).

EXAMPLE 3.4.4. The row and column spaces of

[1 1]
[2 2]

are {(x, x) : x ∈ R} and {(x, 2x) : x ∈ R}. The row rank and the column rank
are both 1.

Note that the row (column) rank can more simply be expressed as the
maximum number of independent rows (columns).

PROPOSITION 3.4.3. Let v be regarded as a row vector. Then vA = Σ_i v_i A_i*.
If v is regarded as a column vector, Av = Σ_j A_*j v_j. The row (column) space of
A is equal to the image space of A on row (column) vectors.

Proof. The j component of vA is Σ_i v_i a_ij, which equals the j component of
Σ_i v_i A_i*. Thus if e_i denotes a vector with a 1 in place i and zeros elsewhere,
e_i A = A_i*. So the row space is contained in the image space. Yet if w belongs
to the image space then w = vA = Σ_i v_i A_i*, which is contained in the row space.
Therefore the image space is contained in the row space.
The proofs for column vectors are similar. □

THEOREM 3.4.4. The row and column rank of a matrix are equal.

Proof. Let A be an n × m matrix of column rank r. Then there are r columns
of A which form a basis for the column space. Rearranging the columns will not
affect the row rank or the column rank, so we will assume that A_*1, A_*2, ..., A_*r
form a column basis. Then if we restrict all row vectors to their first r com-
ponents, we assert that this gives an isomorphism on the row space. Let p_ij
be such that A_*j = Σ_i A_*i p_ij where i = 1, 2, ..., r and j = r + 1, r + 2, ..., m.
Such elements exist since A_*1, A_*2, ..., A_*r is a basis.
Then for any row vector A_i* we have

a_ij = Σ_{k=1}^{r} a_ik p_kj

by taking the i component of the equation above. This implies that for any
linear combination v of the A_i*,

v_j = Σ_{k=1}^{r} v_k p_kj

where j = r + 1, r + 2, ..., m. Thus if two vectors in the row space have identical
components v₁, v₂, ..., v_r, the components v_{r+1}, v_{r+2}, ..., v_m are also identical.
This proves the assertion.
So the row space is isomorphic to a subspace of an r-dimensional space. So
it has dimension at most r. So ρ_r(A) ≤ ρ_c(A). By symmetry, ρ_c(A) ≤ ρ_r(A),
where ρ_r(A) (ρ_c(A)) denotes the row (column) rank of A. This proves the
theorem. □

EXAMPLE 3.4.5. Let

A = [1 0 1]
    [0 1 1]
    [2 3 5]

Then A_*1, A_*2 form a basis. There is a 1-1 map on the row space taking [1 0 1]
to [1 0], [0 1 1] to [0 1], [2 3 5] to [2 3]. Thus the row space is isomorphic to
a subspace of R² and is two-dimensional.
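Theorem 3.4.4 can be checked on this example with a small rank routine. The sketch below (the function rank is ours) computes row rank by Gaussian elimination over the rationals, then applies it to A and to its transpose; by the theorem the two answers agree.

```python
from fractions import Fraction

def rank(M):
    """Row rank by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 0, 1], [0, 1, 1], [2, 3, 5]]
At = [list(col) for col in zip(*A)]
assert rank(A) == rank(At) == 2     # row rank equals column rank
```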

EXERCISES
Level 1
1. Find matrices to represent these linear transformations, according to
   Proposition 3.4.1: (a) f(x, y) = (x + y, y), (b) f(x, y, z) = (y, z, x),
   (c) f(x, y, z) = x + y + z.
2. If the rows of a matrix are independent, what is its rank? What is the rank of

   [1 1 1]
   [0 1 1] ?
   [0 0 1]

4. Show the zero matrix is the matrix of the linear transformation f(x) = 0.
5. Show if A has rank k, then AB and BA have rank ≤ k for any B.

Level 2
1. Find matrices to represent these linear transformations on the space of
   polynomials of degree < 5 with basis 1, x, x², x³, x⁴: (a) f(x) − f(0),
   (b) f′(x), (c) ∫₀¹ f(x) dx, (d) (x f(x))′.
2. What are the ranks of the matrices in Exercise 1?
3. Prove linear transformations from V to W form a vector space. If V has
   dimension n and W has dimension m, what is the dimension of the space of
   linear transformations from V to W?
4. Show the identity matrix is the matrix of the identity linear transformation.

Level 3
1. For any unit vector v ∈ Rⁿ show the transformations x → (x · v)v and
   x → x − (x · v)v are linear, where x · v is the inner product Σ x_i v_i. What
   are the kernels?
2. Identify the ranks and image spaces in the above exercise, and describe
   these mappings geometrically. Prove that f ∘ f = f for these linear
   transformations.
3. Prove composition of linear transformations obeys the distributive law on
   each side: f(g + h) = fg + fh, (g + h)f = gf + hf.
4. Find a general form for rank 1 matrices.
5. Prove that if a multiple of one row is added to another row, the rank of a
   matrix is unchanged. Represent such a row operation as multiplication by
   a matrix I + cE(i, j), where c times row i is added to row j and E(i, j) has
   a 1 entry in location i, j and all other entries are 0.
6. Describe the infinite matrices of f → xf and f → f′ on the space of all
   polynomials of any degree, with basis 1, x, x², ..., xⁿ, .... Show these
   matrices obey the identity AB − BA = I, used in matrix mechanics.
7. Prove any two vector spaces of the same dimension are isomorphic.

8. Prove any n × m rank k matrix can be represented as AIB where A is
   n × k, I is a k × k identity matrix, and B is k × m.

3.5 DETERMINANTS AND CHARACTERISTIC POLYNOMIALS


The determinant of an n × n matrix gives a formula for solving linear equations.
However, it is more useful for theoretical purposes, and for finding the eigenvalues
of a matrix. In particular it gives a criterion for a matrix to be of rank n,
and to have an inverse.

DEFINITION 3.5.1. Let σ be a permutation on {1, 2, ..., n}, i.e. a 1-1 function
from this set to itself. Then the number of inversions of σ is |{(i, j) : i < j and
σ(i) > σ(j)}|. The permutation σ is said to be even or odd according to this
number being even or odd. We write sign (σ) = +1 or −1 as σ is even or odd.

We write permutations in the following form:

(1 2 3)
(1 3 2)

This means σ(1) = 1, σ(2) = 3, σ(3) = 2. In general σ(i) is the number
underneath i.

EXAMPLE 3.5.1. The permutation

(1 2 3)
(2 3 1)

inverts the pairs (1, 3) and (2, 3). So it is even. The permutation

(1 2 3)
(1 3 2)

inverts the pair (2, 3). So it is odd.

PROPOSITION 3.5.1. sign (σ₁σ₂) = sign (σ₁) sign (σ₂). If σ interchanges
exactly two elements then sign (σ) = −1.

Proof. A pair (i, j) is inverted by σ₁σ₂ if and only if it lies in one of the sets
S = {(i, j) : σ₂ inverts (i, j)} or T = {(i, j) : σ₁ inverts (σ₂(i), σ₂(j))}, but not
both. The cardinality of this set equals |S ∪ T| − |S ∩ T| = |S| + |T| − 2|S ∩ T|,
which is an even number plus |S| + |T|. Thus sign (σ₁σ₂) = (−1)^(|S| + |T|), sign
(σ₂) = (−1)^|S|, sign (σ₁) = (−1)^|T|. This proves the first assertion.
For the second assertion, let σ interchange i, j. Then if i < k < j both (i, k)
and (k, j) are inverted. One other pair is inverted: (i, j) itself. So σ is odd. □
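The inversion-count definition of sign, and the multiplicativity just proved, are easy to verify exhaustively for small n. A sketch in Python (function names ours; permutations are written as tuples of 0-based images, so sigma[i] plays the role of σ(i+1)):

```python
from itertools import permutations

def sign(sigma):
    """(-1) to the number of inversions of sigma."""
    n = len(sigma)
    inv = sum(1 for i in range(n) for j in range(i + 1, n)
              if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def compose(s1, s2):
    # (s1 s2)(i) = s1(s2(i))
    return tuple(s1[s2[i]] for i in range(len(s1)))

# sign is multiplicative over all pairs of permutations of 4 elements:
assert all(sign(compose(a, b)) == sign(a) * sign(b)
           for a in permutations(range(4)) for b in permutations(range(4)))

# A transposition is odd:
assert sign((0, 2, 1)) == -1
```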

DEFINITION 3.5.2. The determinant of an n × n matrix A is

det (A) = Σ sign (σ) a_{1σ(1)} a_{2σ(2)} ... a_{nσ(n)}

where the summation ranges over all permutations σ of {1, 2, ..., n}.

EXAMPLE 3.5.2.

    [a b c]
det [d e f] = aei + bfg + cdh − ceg − afh − bdi
    [g h i]
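Definition 3.5.2 can be implemented verbatim, summing over all n! permutations. The sketch below (the function name det is ours) does this and checks it against the explicit 3 × 3 formula above; the permutation-sum method is only practical for small n, since the number of terms grows as n!.

```python
from itertools import permutations

def det(A):
    """Determinant by the permutation expansion of Definition 3.5.2."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        # sign of s = (-1) to the number of inversions
        inv = sum(1 for i in range(n) for j in range(i + 1, n)
                  if s[i] > s[j])
        prod = 1
        for i in range(n):
            prod *= A[i][s[i]]
        total += (-1) ** inv * prod
    return total

a, b, c, d, e, f, g, h, i = range(2, 11)
M = [[a, b, c], [d, e, f], [g, h, i]]
assert det(M) == a*e*i + b*f*g + c*d*h - c*e*g - a*f*h - b*d*i
```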

A number of results about determinants are important. These will be stated


in an order in which it is convenient to prove them, but without proofs. Proofs
can be found in most books on the subject.

D1. A determinant changes sign if two rows or two columns of a matrix are
interchanged.
D2. A determinant is linear as a function of the ith column or the jth row.
D3. A determinant of a diagonal matrix is the product of the main diagonal
entries.
D4. If any row or any column of a matrix is zero, then the determinant is zero.
D5. If the matrix B is obtained from A by replacing A_j* by A_j* − kA_i* where
i ≠ j, then det (B) = det (A). Likewise for columns.
D6. det (A) = det (A^T).
D7. If B is obtained from A by rearranging the rows by a permutation σ then
det (B) = sign (σ) det (A).
D8. det (AB) = det (A) det (B).
D9. det (A) ≠ 0 if and only if A has an inverse A⁻¹ such that AA⁻¹ = A⁻¹A = I.
D10. Let A[i|j] be the (n − 1) × (n − 1) matrix obtained from A by removing
the ith row and the jth column. Let C[i|j] be the (i, j)th cofactor of
A, which is (−1)^(i+j) det (A[i|j]). Then

det (A) = Σ_{j=1}^{n} a_rj C[r|j] = Σ_{i=1}^{n} a_is C[i|s]

for any r, s.
D11. Suppose A has the form

[A_11   0  ...   0 ]
[A_21 A_22 ...   0 ]
[ ...              ]
[A_r1 A_r2 ... A_rr]

where the zeros and A_ij represent submatrices (blocks) and not single
entries, and the A_ii are square matrices. Then

det (A) = det (A_11) det (A_22) ... det (A_rr)

D12. For any matrix A, let cof (A) be the matrix whose (i, j)-entry is C[j|i].
Then A (cof (A)) = (cof (A)) A = (det (A)) I. Thus if A has an inverse,

A⁻¹ = cof (A) / det (A)

The matrix cof (A) is called the adjoint of A.

EXAMPLE 3.5.3. Let A be

[1 1 1]
[1 2 3]
[1 4 9]

Then cof (A) is

[ 6 -5  1]
[-6  8 -2]
[ 2 -3  1]

and A (cof (A)) is

[2 0 0]
[0 2 0]
[0 0 2]
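Property D12 and this example can be verified with a short program. The sketch below (the names det and cof are ours) computes determinants by cofactor expansion along the first row (Property D10) and builds the adjoint entry by entry.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (D10)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] *
               det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))

def cof(A):
    """The adjoint of A: entry (i, j) is the cofactor C[j|i] (D12)."""
    n = len(A)

    def minor(r, c):
        return [row[:c] + row[c + 1:]
                for k, row in enumerate(A) if k != r]

    return [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)]
            for i in range(n)]

A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]
adj = cof(A)
print(adj, det(A))
```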

The characteristic polynomial of a matrix gives properties of the linear
transformation which do not depend on the particular basis chosen.

DEFINITION 3.5.3. The characteristic polynomial of the matrix A is
det (tI − A), where t ∈ C.

EXAMPLE 3.5.4. The characteristic polynomial of the matrix A in
Example 3.5.3 is t³ − 12t² + 15t − 2, i.e. the determinant of

[t-1  -1   -1 ]
[-1   t-2  -3 ]
[-1   -4   t-9]

The characteristic polynomial of an n × n matrix has degree n. Its constant
term is det (−A) = (−1)ⁿ det (A) (set t = 0).

DEFINITION 3.5.4. The trace of an n × n matrix A is Tr (A) =
a₁₁ + a₂₂ + ... + a_nn.

EXAMPLE 3.5.5. Let A be as in Example 3.5.3. Then Tr (A) = 12.

The coefficient of tⁿ⁻¹ in the characteristic polynomial is −Tr (A). The
trace has additive properties somewhat similar to the multiplicative properties
of the determinant. For instance Tr (A + B) = Tr (A) + Tr (B), det (AB) =
det (A) det (B).

PROPOSITION 3.5.2. The coefficient of tⁿ⁻¹ in the characteristic polynomial
is −Tr (A). The constant term is (−1)ⁿ det (A).

Proof. By inspection of (tI − A), the coefficient of tⁿ⁻¹ is the sum of the −a_ii.
Set t = 0. Then det (tI − A) becomes det (−A) = (−1)ⁿ det (A). □

THEOREM 3.5.3. (Cayley-Hamilton.) Let p(t) be the characteristic
polynomial of A. Then p(A) = 0.

Proof. Let B(t) = (tI − A) and let B̂(t) = B₀ + B₁t + B₂t² + ... + B_{n−1}t^{n−1}
be the adjoint of B(t) (to expand B̂(t), let B_i be the matrix of coefficients of tⁱ
in the entries of B̂(t)). Then B(t)B̂(t) = det (B(t)) I = p(t) I. So

(tI − A)(B₀ + B₁t + B₂t² + ... + B_{n−1}t^{n−1})

= −AB₀ + t(B₀ − AB₁) + ... + tⁿ B_{n−1}

= p(t) I

This equation must be an identity in t. So if

p(t) = tⁿ + c_{n−1}t^{n−1} + ... + c₀,

then comparing coefficients, c₀ I = −AB₀, c_i I = B_{i−1} − AB_i for
1 ≤ i ≤ n − 1, and B_{n−1} = I.
We expand

(AI − A)(B₀ + AB₁ + ... + A^{n−1} B_{n−1})

= −AB₀ + A(B₀ − AB₁) + ... + Aⁿ B_{n−1}

= c₀ I + c₁ A + c₂ A² + ... + Aⁿ

= p(A)

But AI − A = A − A = 0. So p(A) = 0. □
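The theorem can be checked numerically on the matrix of Example 3.5.3, whose characteristic polynomial is t³ − 12t² + 15t − 2 (Example 3.5.4). A minimal sketch (helper names ours):

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 1, 1], [1, 2, 3], [1, 4, 9]]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)
I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# p(A) = A^3 - 12 A^2 + 15 A - 2 I should be the zero matrix.
pA = [[A3[i][j] - 12 * A2[i][j] + 15 * A[i][j] - 2 * I[i][j]
       for j in range(3)] for i in range(3)]
assert pA == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```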

EXERCISES
Level 1
1. Compute the determinants of

   [1 2]   [1 1 1]   [1 1 0]
   [2 3] , [1 2 3] , [0 1 1]
           [1 4 9]   [1 0 1]

2. Compute the characteristic polynomial of

   [0 1 0]
   [0 0 1]
   [0 0 0]

3. Prove Property D3 of determinants.


4. Is this permutation (123) (45) even or odd?
5. Prove Property D4 of determinants.

Level 2
1. Compute the determinants of

   [a b c]   [ 0  a  b]
   [c a b] , [-a  0  c]
   [b c a]   [-b -c  0]

2. Prove Property D2 of determinants. Show also that if all entries are multiplied
   by a constant c the determinant is multiplied by cⁿ where n is the
   order of the matrix.
3. Use Exercise 2 and Property D6 to show that the determinant of an n × n
   matrix A is zero if A^T = −A and n is odd.
4. Prove that if two rows of a determinant are equal its value is zero.
5. Prove Property D5 of determinants.
6. Use Property D11 to compute the determinant of

   [ 1  2 0 0]
   [ 2  1 0 0]
   [10 11 3 5]
   [14 15 5 3]

7. Find the inverse of

   [1 1  1]
   [1 2  4]
   [1 4 16]

Level 3
1. Prove Property D7 of determinants.
2. Prove Property D8 of determinants. Show det (AB) is a function which
   is (i) linear in each row of A, (ii) zero if two rows of A are equal. Thus
   it is unchanged under row operations as in Level 2, Exercise 5. Prove that under
   permutation of the rows of A we have a property like D7. Now by such
   operations we can reduce A to a matrix having at most one nonzero entry
   per column. Prove directly det (AB) = det (A) det (B) in this case.
3. Find a general formula for the inverse of a 3 X 3 matrix.
4. Find recursion formulas for the characteristic polynomial of an n × n
   (0, 1)-matrix A such that a_ij = 1 if i ≥ j − 1, for example

   [1 1 0 0]
   [1 1 1 0]
   [1 1 1 1]
   [1 1 1 1]

   Let f_n, g_n be the determinants of the n × n matrices corresponding to

   [x 1 0 0]   [1 1 0 0]
   [1 x 1 0]   [1 x 1 0]
   [1 1 x 1] , [1 1 x 1]
   [1 1 1 x]   [1 1 1 x]

   Prove by Property D10 that f_n(x) = x f_{n−1}(x) − g_{n−1}(x) and g_n(x) =
   f_{n−1}(x) − g_{n−1}(x). Note (−1)ⁿ f_n(1 − t) is the desired characteristic
   polynomial.
5. Prove using the formulas in Exercise 4 that if h_n = (−1)ⁿ f_n(1 − t) it
   satisfies h_n = t h_{n−1} − t h_{n−2}. Also h₁ = t − 1, h₂ = t² − 2t (we assume
   h₀ = 1).
6. Prove that h_n is the coefficient of uⁿ in

   (1 − u) / (1 − ut + u²t)

   by showing this coefficient satisfies the same recursion formula and has the
   same value for n = 0, 1.
7. Prove Property D9 for determinants by using row operations as in Exercise 2
   to simplify A.
8. Show every polynomial in an n × n matrix A can be reduced to a linear
   combination of I, A, A², ..., A^{n−1}. First express Aⁿ this way using the Cayley-
   Hamilton Theorem. Then express A^{n+1} in terms of Aⁿ, A^{n−1}, ..., A using
   that expression and substitute the previous expression for Aⁿ.

3.6 EIGENVALUES, EIGENVECTORS, SIMILARITY

If we change the basis in a vector space, the matrix A representing any linear
transformation will not be unchanged but will be replaced by a matrix XAX⁻¹.
Thus the intrinsic properties of the transformation will be reflected by those
properties of A not affected by such a change. A matrix XAX⁻¹ is
said to be similar to A. Thus we look for similarity invariants. The most
important is the characteristic polynomial and its roots, called eigenvalues.

DEFINITION 3.6.1. Two n × n matrices A, B are similar if and only if there
exists an n × n matrix X, having an inverse, such that B = XAX⁻¹.

EXAMPLE 3.6.1. The matrices A, B

[0 1]   [1  0]
[1 0] , [0 -1]

are similar. Let X be the matrix

[ 1 1]
[-1 1]

then X is invertible and XA = BX.
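Since X is invertible, the relation XA = BX is equivalent to B = XAX⁻¹, and it can be verified without computing X⁻¹ at all. A minimal sketch (helper name ours):

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]
X = [[1, 1], [-1, 1]]      # invertible, since det X = 2

# XA = BX is equivalent to B = X A X^{-1}, so A and B are similar.
assert mat_mul(X, A) == mat_mul(B, X)
```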

PROPOSITION 3.6.1. For a fixed matrix X, the mapping A → XAX⁻¹ is an
isomorphism of rings.

Proof. We have X(AB)X⁻¹ = (XAX⁻¹)(XBX⁻¹), X(A + B)X⁻¹ = XAX⁻¹ +
XBX⁻¹, and an inverse mapping is given by A → X⁻¹AX. □

PROPOSITION 3.6.2. The relation of similarity is an equivalence relation.

Proof. (Reflexivity) A = IAI⁻¹. (Symmetry) If A = XBX⁻¹ then B = X⁻¹AX.
(Transitivity) If A = XBX⁻¹ and B = YCY⁻¹ then
A = (XY)C(XY)⁻¹. □

PROPOSITION 3.6.3. Let f be a linear transformation from V to V. Let
{v₁, v₂, ..., v_n} and {w₁, w₂, ..., w_n} be bases for the vector space V. Let A
be the matrix of f in terms of the basis {v₁, v₂, ..., v_n} and let B be the matrix
of f in terms of the basis {w₁, w₂, ..., w_n}. Let matrices X and Y be defined by
expanding v_j = Σ_i w_i x_ij and w_j = Σ_i v_i y_ij. Then Y = X⁻¹ and XA = BX. Thus
A, B are similar.

Proof. The matrix A is defined by f(v_j) = Σ_i v_i a_ij and B is defined by
f(w_j) = Σ_i w_i b_ij. Thus

f(v_j) = Σ_i v_i a_ij = Σ_i Σ_k w_k x_ki a_ij = Σ_k w_k Σ_i x_ki a_ij.

On the other hand

f(v_j) = f(Σ_i w_i x_ij) = Σ_i x_ij f(w_i) = Σ_i x_ij Σ_k w_k b_ki = Σ_k w_k Σ_i b_ki x_ij.

Therefore

Σ_k w_k Σ_i x_ki a_ij = Σ_k w_k Σ_i b_ki x_ij.

Since the set {w₁, w₂, ..., w_n} is independent,

Σ_i x_ki a_ij = Σ_i b_ki x_ij.

So XA = BX. By the same sort of argument, XY = I. □

PROPOSITION 3.6.4. The characteristic polynomial is invariant under similarity.

Proof. det (tI − XAX⁻¹) = det (X(tI − A)X⁻¹) = det (X) det (tI − A)
(det (X))⁻¹ = det (tI − A). □

COROLLARY 3.6.5. All the coefficients of the characteristic polynomial, and
hence the trace and determinant, are also similarity invariants.

DEFINITION 3.6.2. A nonzero row (column) vector v is said to be a row
(column) eigenvector of the matrix A if and only if vA = kv (Av = kv) for
some k ∈ C. The complex number k is called an eigenvalue.

EXAMPLE 3.6.2. The vectors (1, 1) and (1, −1) are both row and column
eigenvectors of the matrix

[0 1]
[1 0]

The corresponding eigenvalues are 1 and −1.
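The eigenvector condition vA = kv is a direct computation; a short sketch (the helper name vA is ours) checks both row eigenvectors of this matrix:

```python
A = [[0, 1], [1, 0]]

def vA(v):
    """Row vector times A."""
    return [sum(v[i] * A[i][j] for i in range(2)) for j in range(2)]

# (1, 1) has eigenvalue 1, and (1, -1) has eigenvalue -1:
assert vA([1, 1]) == [1, 1]       # 1 * (1, 1)
assert vA([1, -1]) == [-1, 1]     # -1 * (1, -1)
```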



PROPOSITION 3.6.6. An element k ∈ F is an eigenvalue of A if and only if k
is a root of the characteristic polynomial of A.

Proof. The element k is a (row) eigenvalue if and only if vA = kv for some
v ≠ 0, if and only if 0 = v(kI − A) for some v ≠ 0, if and only if kI − A has a
nonzero kernel, if and only if det (kI − A) = 0, if and only if k is a root of the
characteristic polynomial. □

DEFINITION 3.6.3. The multiplicity of an eigenvalue k is its multiplicity as
a root of the characteristic polynomial p(t), i.e. the highest power of (t − k)
which divides p(t).

EXAMPLE 3.6.3. The matrix

[1 0]
[0 1]

has the eigenvalue 1 with multiplicity 2, since the characteristic polynomial is
t² − 2t + 1 = (t − 1)².

PROPOSITION 3.6.7. The trace is the sum of the eigenvalues, each counted
with its multiplicity. The determinant is the product of the eigenvalues, each
counted with its multiplicity.

Proof. Let the characteristic polynomial be xⁿ + c₁xⁿ⁻¹ + ... + c_n. Then by
Proposition 3.5.2, the trace is −c₁, and the determinant is (−1)ⁿ c_n. But for any
polynomial of degree n, the coefficient of xⁿ⁻¹ is −(sum of roots) and the
constant term is (−1)ⁿ (product of roots) if the coefficient of xⁿ is 1. This
proves the proposition. □

From here on, in this section, we assume that the field F is such that the
characteristic polynomial p(t) factors into linear factors t − k_i over F. For
instance this will be true for any matrix of real or complex numbers if we use
F = C. We will also assume vectors are row vectors unless the contrary is stated.
Let the eigenvalues of A be k₁, k₂, ..., k_r and their multiplicities be
n₁, n₂, ..., n_r. Then the characteristic polynomial is p(t) = (t − k₁)^{n₁} (t − k₂)^{n₂}
... (t − k_r)^{n_r}.

DEFINITION 3.6.4. The kernel of (k_i I − A)^{n_i} is called the characteristic
subspace belonging to k_i.

EXAMPLE 3.6.4. The matrix

[1 0  0]
[1 1  0]
[0 0 -1]

has two eigenvalues: 1, with multiplicity 2, and −1, with multiplicity 1. The
characteristic subspace belonging to 1 is {(x, y, 0) : x, y ∈ R}. The characteristic
subspace belonging to −1 is {(0, 0, z) : z ∈ R}.

Let V_i denote the characteristic subspace belonging to k_i.
We will use the following result about polynomials: if f₁(t), f₂(t), ..., f_r(t)
are polynomials over a field F having greatest common divisor 1, then there exist
polynomials a₁(t), a₂(t), ..., a_r(t) such that 1 = a₁(t)f₁(t) + a₂(t)f₂(t) + ...
+ a_r(t)f_r(t). We also assume unique factorization of such polynomials.

THEOREM 3.6.8. The vector space V is the (internal) direct sum of
V₁, V₂, ..., V_r. For i = 1 to r, (V_i)A ⊂ V_i. The dimension of V_i is n_i. There
exists a matrix X such that XAX⁻¹ has the form

[B(1)  0   ...  0  ]
[ 0   B(2) ...  0  ]
[ ...              ]
[ 0    0   ... B(r)]

where the zeros are zero submatrices and the B(i) are square submatrices
(blocks). The characteristic polynomial of B(i) is (t − k_i)^{n_i}.

Proof. Let f_i(t) = p(t) (t − k_i)^{−n_i}. Then f_i(t) is a polynomial, and the different
polynomials f_i have greatest common divisor 1. Thus 1 = a₁(t)f₁(t) + a₂(t)f₂(t) +
... + a_r(t)f_r(t) for some polynomials a_i. Such an identity will mean that the co-
efficients of each power of t are equal on both sides. This implies that the identity
will remain true if t is replaced by A. Thus I = a₁(A)f₁(A) + a₂(A)f₂(A) + ...
+ a_r(A)f_r(A). Multiply both sides by the row vector v. So v = va₁(A)f₁(A)
+ va₂(A)f₂(A) + ... + va_r(A)f_r(A). Let v(i) = va_i(A)f_i(A). Then
v(i)(A − k_i I)^{n_i} = va_i(A)f_i(A) (A − k_i I)^{n_i} = va_i(A)p(A) = 0 by the Cayley-
Hamilton theorem.
So v(i) ∈ V_i. This proves that every vector is a sum of vectors lying in the
subspaces V_i. So V = V₁ + V₂ + ... + V_r.
We must also show that for any choice of u(i) ∈ V_i, that if u(1) + u(2) + ...
+ u(r) = 0 then u(i) = 0. We have u(i) = u(i)a₁(A)f₁(A) + u(i)a₂(A)f₂(A)
+ ... + u(i)a_r(A)f_r(A) = u(i)f₁(A)a₁(A) + u(i)f₂(A)a₂(A) + ... +
u(i)f_r(A)a_r(A), since any two polynomials in A commute. But for j ≠ i, f_j(A) has
the factor (A − k_i I)^{n_i}, so since u(i) ∈ ker (A − k_i I)^{n_i}, u(i)f_j(A) = 0. So
u(i) = u(i)f_i(A)a_i(A). Apply f_i(A)a_i(A) to the equation

0 = u(1) + u(2) + ... + u(r),

0 = 0 + 0 + ... + u(i) + 0 + ... + 0.

So each u(i) is 0. This proves that V is the direct sum of V₁, V₂, ..., V_r. If
v(A − k_i I)^{n_i} = 0 then (vA)(A − k_i I)^{n_i} = v(A − k_i I)^{n_i} A = 0A = 0. This proves
(V_i)A ⊂ V_i.
Let d_i = dim (V_i). Choose vectors w₁, w₂, ..., w_n such that w₁, w₂, ..., w_{d₁} is
a basis for V₁, w_{d₁+1}, w_{d₁+2}, ..., w_{d₁+d₂} is a basis for V₂, and so on. Then
w₁, w₂, ..., w_n is a basis for V. Let B be the matrix of the linear transformation
given by A, with w₁, w₂, ..., w_n as basis. Then B is similar to A, by
Proposition 3.6.3. And since (V_i)A ⊂ V_i, w_i A is a linear combination of those
w_j which lie in the same subspace V_i. This means that B has the form

[B(1)  0   ...  0  ]
[ 0   B(2) ...  0  ]
[ ...              ]
[ 0    0   ... B(r)]

where the B(i) are certain submatrices corresponding to the subspaces V_i. For
instance B(1) is the submatrix of b_ij such that i, j = 1, 2, ..., d₁. Then the
characteristic polynomial of B by Property D11 of determinants is the product
of the characteristic polynomials of the B(i). Thus p(t) = (t − k₁)^{n₁} (t − k₂)^{n₂} ...
(t − k_r)^{n_r} = g₁(t) g₂(t) ... g_r(t) where g_i is the characteristic polynomial of
B(i). But on V_i, (A − k_i I)^{n_i} = 0, by definition of V_i. Also g_i(A) = 0 on V_i.
Suppose g_i(t) has a root f other than k_i. Then f is an eigenvalue of B(i), by
Proposition 3.6.6. So for some nonzero v ∈ V_i, vA = fv. Then v(A − k_i I)^{n_i} =
v(f − k_i)^{n_i} ≠ 0. This is a contradiction. So g_i(t) = (t − k_i)^{d_i}. So (t − k₁)^{n₁}
(t − k₂)^{n₂} ... (t − k_r)^{n_r} = (t − k₁)^{d₁} (t − k₂)^{d₂} ... (t − k_r)^{d_r} where k₁, k₂, ..., k_r
are distinct. So n_i = d_i. This proves the theorem. □

The next result, which we do not prove, gives the general characterization of
similarity by showing that every matrix can be transformed into a unique form,
the Jordan Canonical Form. In this form, in addition to diagonal entries, the only
nonzero entries possible are 1 entries immediately below the main diagonal.

EXAMPLE 3.6.5. These matrices are in Jordan Canonical Form.

[2 0 0]   [2 0 0]   [4 0 0]
[1 2 0] , [0 2 0] , [0 1 0]
[0 1 2]   [0 1 2]   [0 1 1]

However, the eigenvalues, which are the numbers appearing on the main diagonal,
are sufficient for many purposes.
A weaker result, a lower triangular form, is established in the exercises. This
suffices to prove the last result of this section.

THEOREM 3.6.9. (Jordan Decomposition Theorem.) Every matrix over a field
containing its eigenvalues is similar to a direct sum of matrices A such that
a_ij = 0 unless i = j or i = j + 1, a_ii = k, an eigenvalue of A, and a_{i+1,i} = 1.
The summands are unique except for order.

For a proof consult a more advanced book on linear algebra or matrix


theory.

The following result is useful in finding the eigenvalues of many matrices.

PROPOSITION 3.6.10. Let p be a polynomial. Then the eigenvalues of p(A)
are the elements p(ki), each counted with multiplicity ni.

Proof. By the preceding theorem, we may assume that A is a matrix whose
main diagonal entries are the ki, having no nonzero entries above the main diagonal.
The diagonal entries of A^s will then be ki^s. Therefore, adding the different powers of
A in p(A), we find that p(A) has no nonzero entries above the main diagonal
and the main diagonal entries are p(ki). By Property D11 of determinants, the
characteristic polynomial of p(A) will be the product of (t - p(ki))^ni. □
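Proposition 3.6.10 is easy to test numerically; a sketch with numpy (the example matrix and the polynomial p(t) = t^2 + 1 are our choices):

```python
import numpy as np

# The eigenvalues of p(A) should be p(k) for each eigenvalue k of A.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # triangular, so eigenvalues 2 and 3
pA = A @ A + np.eye(2)              # p(A) = A^2 + I

print(sorted(np.linalg.eigvals(A).real))   # approximately [2, 3]
print(sorted(np.linalg.eigvals(pA).real))  # approximately [5, 10] = [p(2), p(3)]
```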

EXERCISES
Level 1
1. Find the eigenvalues of these matrices.

[2 4]   [ 0  1  1]   [3 4 0]
[1 3] , [-1  0  1] , [4 2 0]
        [-1 -1  0]   [0 0 1]

2. Any 3X3 circulant

a b c
c a b
b c a

is aI + bP + cP^2, where

0 1 0
P = 0 0 1
1 0 0

Find the eigenvalues of a 3 X 3 circulant.


3. Any matrix

a b b b
b a b b
b b a b
b b b a

is (a —b)I + bJ where J is the matrix all of whose entries are 1. Find a


formula for the eigenvalues of such a matrix.
4. What are the eigenvalues of a diagonal matrix?

Level 2
1. Show by Theorem 3.6.8 that any matrix with distinct eigenvalues is similar
to a diagonal matrix.
2. If a matrix has distinct eigenvalues t1, t2, ..., tk, what diagonal matrix will it
be similar to?
3. Prove by induction that any matrix over C is similar to a lower triangular
matrix. Let v1 be an eigenvector and W a complementary subspace with
basis v2, v3, ..., vn. Express the linear transformation in terms of v1, v2, ...,
vn. Show there is a block triangular form. If the matrix has already been
made similar to a lower triangular matrix on W (by choosing a suitable basis)
this completes the proof. This suffices to establish the last proposition
above.
4. A matrix A is called nilpotent if A^k = 0 for some k. Show all eigenvalues of
a nilpotent matrix are zero.
5. Conclude from Exercises 3, 4 that a nilpotent matrix is similar to a matrix
with zeros on or above the main diagonal. Is any such matrix nilpotent?
6. If A^2 = A, what are the only numbers which can be eigenvalues of A?
7. Show that a matrix A has an inverse if and only if 0 is not an eigenvalue.
8. Find the eigenvalues of the n × n matrix J all of whose entries are 1. A
complete set of eigenvectors is given by (1, 1, ..., 1), (1, -1, 0, ..., 0),
(1, 0, -1, ..., 0), ..., (1, 0, 0, ..., -1). Generalize Exercise 7, Level 1 to
n × n matrices.
9. Generalize Exercises 5,6, Level 1 to n X n matrices.

Level 3
1. Conclude from the Jordan Decomposition Theorem that every matrix can
be expressed as a sum A = D + N where DN = ND and N is nilpotent and
D is similar to a diagonal matrix.
2. Show that the characteristic polynomial of a matrix can be calculated from
the n numbers Tr(A^k), k = 1 to n. Let the eigenvalues be x_i. Then
Tr(A^k) = Σ x_i^k. Let this be denoted s_k. Let c_k be the coefficients in the
characteristic polynomial. They are, except for sign, the elementary symmetric functions:

-c1 = x1 + x2 + ... + xn

c2 = Σ_{i<j} x_i x_j

-c3 = Σ_{i<j<m} x_i x_j x_m

(-1)^n c_n = x1 x2 ... xn.

Prove Newton's formulas m c_m + c_{m-1} s_1 + c_{m-2} s_2 + ... + c_0 s_m = 0, where
c_0 = 1. Using these, we can find in turn c1 = -s1 (set m = 1), 2c2 = -c1 s1 - s2
(set m = 2), and so on. Thus we can find the characteristic polynomial of
A. Its roots will be the eigenvalues.
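The recursion in Newton's formulas can be carried out mechanically; a Python sketch using numpy (the function name and the test matrix are ours):

```python
import numpy as np

def char_poly_from_traces(A):
    """Coefficients c_1, ..., c_n of the characteristic polynomial
    t^n + c_1 t^(n-1) + ... + c_n, recovered from s_k = Tr(A^k)
    via Newton's formulas m*c_m + c_(m-1)*s_1 + ... + c_0*s_m = 0."""
    n = A.shape[0]
    s = []
    P = np.eye(n)
    for k in range(1, n + 1):
        P = P @ A
        s.append(np.trace(P))       # s_k = Tr(A^k)
    c = [1.0]                       # c_0 = 1
    for m in range(1, n + 1):
        total = sum(c[m - k] * s[k - 1] for k in range(1, m + 1))
        c.append(-total / m)
    return c[1:]

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(char_poly_from_traces(A))     # [-5.0, 5.0], i.e. t^2 - 5t + 5
```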
3. Show that the eigenvalues of the matrix

[1 1 0 0]
[1 1 1 0]
[1 1 1 1]
[1 1 1 1]

considered in the exercises of the last section are in the n × n case

4 cos^2 (2kπ/(2n + 4)).

Use the results of those exercises in that the characteristic polynomial is the
coefficient of u^n in

(1 - u)/(1 - tu + tu^2)

4. Construct a matrix having arbitrary characteristic polynomial of the form

[0 1 0]
[0 0 1]
[a b c]

Such a matrix is called a companion matrix.
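As a numerical illustration, the eigenvalues of a companion matrix built for a chosen polynomial can be computed with numpy; the polynomial (t - 1)(t - 2)(t - 3) = t^3 - 6t^2 + 11t - 6 below is our choice:

```python
import numpy as np

# Companion matrix in the form [0 1 0; 0 0 1; a b c]; its characteristic
# polynomial is t^3 - c t^2 - b t - a, here t^3 - 6t^2 + 11t - 6.
C = np.array([[0.0,   1.0, 0.0],
              [0.0,   0.0, 1.0],
              [6.0, -11.0, 6.0]])

print(np.round(sorted(np.linalg.eigvals(C).real), 6))  # approximately [1, 2, 3]
```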


5. Prove that any nilpotent matrix is similar to a (0, 1)-matrix all of whose
1 entries are in positions (i, j) with i = j + 1. Construct a basis beginning with the
0-eigenvectors as a set S0. Each basis element must be sent to another basis element
or zero.

3.7 SYMMETRIC AND UNITARY MATRICES


A matrix M is symmetric if M = M^T and it is real orthogonal if M^-1 = M^T.
Hermitian and unitary matrices are generalizations of these concepts to complex
numbers, and are defined by the equations M = (M^T)* or M^-1 = (M^T)* where * is
complex conjugation. Every Hermitian or unitary matrix can be represented as
UDU^-1 where U is unitary and D is diagonal.
Both unitary and Hermitian matrices are special cases of a slightly more
general class called normal matrices.

EXAMPLE 3.7.1. This matrix is unitary for any θ:

[cos θ  -sin θ]
[sin θ   cos θ]

The inverse and transpose is

[ cos θ  sin θ]
[-sin θ  cos θ]

EXAMPLE 3.7.2. This matrix is Hermitian:

[  2      2 + 3i]
[2 - 3i     5   ]

Entries above the main diagonal must be complex conjugates of those below it.

DEFINITION 3.7.1. A matrix M is normal if and only if M(M^T)* = (M^T)*M.

EXAMPLE 3.7.3. Since MM = MM and M^-1 M = M M^-1 = I, all Hermitian and
unitary matrices are normal.

To deal with transposes effectively, we need the idea of an inner product of


two complex row vectors.

DEFINITION 3.7.2. The inner product of two vectors x, y is the 1 × 1 matrix
x · y = x(y^T)*. This can be written x · y = x1 y1* + x2 y2* + ... + xn yn*.

EXAMPLE 3.7.4. (1, 2 + i) · (1, 2 + i) = 1^2 + (2 + i)(2 - i) = 1 + 4 + 1 = 6.

PROPOSITION 3.7.1. Let x, y, z be any row vectors and a ∈ F. The inner
product has these properties:
(1) x · x is real and nonnegative.
(2) x · x = 0 only if x = 0.
(3) x · y = (y · x)*.
(4) x · (y + z) = x · y + x · z.
(5) (y + z) · x = y · x + z · x.
(6) ax · y = a(x · y).
(7) x · ay = a*(x · y).

Proof. Properties 1, 2 follow from

x · x = x1 x1* + x2 x2* + ... + xn xn* = Σ |x_i|^2

Property 3 follows from

y · x = y(x^T)* = (x(y^T)*)* = (x · y)*

Properties 4, 5, 6 follow from the distributive law for matrix multiplication.
Property 7 follows from Properties 6 and 3. □

In general a function b(x, y) with these properties is called a bilinear
Hermitian form.

PROPOSITION 3.7.2. Let A, B be any matrices, and x, y be any vectors. (1) If
xAy^T = xBy^T for all x, y then A = B. (2) x · (yA) = (x(A^T)*) · y.

Proof. Let x, y have a 1 in places i, j respectively and zeros elsewhere.
Then xAy^T = a_ij and xBy^T = b_ij, so a_ij = b_ij. This proves (1). For (2),
x · (yA) = x((yA)^T)* = x(A^T)*(y^T)* = (x(A^T)*) · y. □
These results are useful in proving results concerning normal matrices.
Two vectors are called orthogonal if x • y = 0. In the case of real vectors,
this means they are perpendicular to one another.

DEFINITION 3.7.3. A set S of vectors is orthonormal if and only if
(1) v · v = 1 for all v ∈ S, (2) for v, w distinct in S, v · w = 0.
The first condition is called normality (length is 1), the second
orthogonality (different vectors are perpendicular).

EXAMPLE 3.7.5. The vectors (1, 0,..., 0), (0,1,..., 0),..., (0, 0,..., 1) are
orthonormal.

Orthonormality means that the vectors can in effect be chosen as the basis
of a coordinate system geometrically equivalent to standard coordinates. Orthonormal
vectors must be linearly independent: if a1x1 + a2x2 + ... + anxn = 0 and
a1 ≠ 0, then a1 x1 · x1 + a2 x2 · x1 + ... + an xn · x1 = a1 x1 · x1 = 0. But this is
false.

LEMMA 3.7.3. Let v be an eigenvector of a normal matrix N with eigenvalue k.
Then v is an eigenvector of (N^T)* with eigenvalue k*.

Proof. Let (N^T)* = M. It suffices to show

(vM - k*v) · (vM - k*v) = 0

This is

vM · vM - k(vM · v) - k*(v · vM) + k*k(v · v)

Now vM · vM = (vM(M^T)*) · v = (vMN) · v = (vNM) · v = k(vM · v), since
(M^T)* = N, MN = NM by normality, and vN = kv. Also
v · vM = (v(M^T)*) · v = (vN) · v = k(v · v), and vM · v = (v · vM)* = k*(v · v).
Substituting these, the expression is

kk*(v · v) - kk*(v · v) - k*k(v · v) + k*k(v · v) = 0 □

THEOREM 3.7.4. Let W be any subspace of C^n. Then W has an orthonormal
basis, and this basis extends to an orthonormal basis for all of C^n. Here
C^n = C ⊕ C ⊕ ... ⊕ C.

Proof. We construct an orthonormal basis of row vectors by the Gram-Schmidt
Process. Let v1, v2, ..., vk be any basis for W. Take

w1 = v1/√(v1 · v1)

We have w1 · w1 = 1. Given w1, w2, ..., wi construct w_{i+1} as follows. Let

u_{i+1} = v_{i+1} - Σ_{j=1}^{i} (v_{i+1} · wj)wj

Then u_{i+1} · wj = 0 for j = 1 to i and u_{i+1} ≠ 0. Let

w_{i+1} = u_{i+1}/√(u_{i+1} · u_{i+1})

Then w_{i+1} · w_{i+1} = 1 and w_{i+1} · wj = 0 for j = 1 to i. Moreover w_{i+1} is linearly
independent from w1, w2, ..., wi. Therefore w1, w2, ..., wk form an orthonormal
basis of W.
We can extend this to an orthonormal basis for all of C^n by extending it to a
basis for all of C^n and using the same process. □
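The Gram-Schmidt Process in the proof can be sketched directly; a Python version using numpy (note that `np.vdot` conjugates its first argument, which matches the inner product defined above up to the order of the arguments):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent complex row vectors, following
    Theorem 3.7.4: subtract projections onto earlier w_j, then normalize."""
    ws = []
    for v in vectors:
        u = v.astype(complex)
        for w in ws:
            u = u - np.vdot(w, u) * w   # remove the component along w
        ws.append(u / np.sqrt(np.vdot(u, u).real))
    return ws

basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0])]
w1, w2 = gram_schmidt(basis)
print(np.vdot(w1, w2))  # approximately 0: the vectors are orthogonal
```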

LEMMA 3.7.5. Let M be a normal matrix. Write the linear transformation
v → vM in terms of a new orthonormal basis w1, w2, ..., wn to obtain a matrix X.
Then X is normal.

Proof. We have UMU^-1 = X, where U is the matrix whose ith row is wi.
And by orthonormality U(U^T)* = I. Therefore U is unitary. So M(M^T)* =
(U^-1 X U)((U^-1 X U)^T)* = (U^-1 X U)((U^T)*(X^T)*((U^-1)^T)*) = (U^-1 X U)(U^-1 (X^T)* U)
= U^-1 X(X^T)* U, and likewise (M^T)* M = U^-1 (X^T)* X U. Since M is normal these are
equal, so X(X^T)* = (X^T)* X. □

Theorem 3.7.6. Let N be a normal matrix. There exists an orthonormal basis


consisting of eigenvectors of N.

Proof. There exists an eigenvector since the characteristic polynomial has at
least one root. Multiply this eigenvector by a positive real number r to obtain
an eigenvector v with eigenvalue k such that v · v = 1. Let W be the
space of vectors orthogonal to v; that is, W = {w : v · w = 0}. Then for w ∈ W,
v · wN = (v(N^T)*) · w = k*v · w = 0 by Lemma 3.7.3. Thus
WN ⊆ W.
Now repeat the process restricted to W. Find an eigenvector, multiply it by a
real number and take its perpendicular space in W. Repetition of this process
constructs the required basis. (By Theorem 3.7.4 and Lemma 3.7.5 we may still
take a normal matrix when the linear transformation is restricted to W.) □

COROLLARY 3.7.7. Let X be normal. Let U be a matrix whose rows are an


orthonormal basis of eigenvectors of X and D a diagonal matrix such that
the diagonal entries are the corresponding eigenvalues. Then U is unitary and
X = U~l DU.

Proof. From orthonormality we have U(U^T)* = I. Thus U^-1 = (U^T)*. And
UX = DU by definition of eigenvector. □
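Corollary 3.7.7 can be checked numerically; a sketch using numpy's `eigh` routine for Hermitian matrices, applied to the Hermitian matrix of Example 3.7.2 (numerical round-off is tolerated via `allclose`):

```python
import numpy as np

X = np.array([[2.0, 2 + 3j],
              [2 - 3j, 5.0]])

eigenvalues, vectors = np.linalg.eigh(X)
U = vectors.conj().T            # rows are orthonormal eigenvectors
D = np.diag(eigenvalues)

# U is unitary and X = U^{-1} D U, equivalently U X = D U.
print(np.allclose(U @ U.conj().T, np.eye(2)))  # True
print(np.allclose(U @ X, D @ U))               # True
```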

COROLLARY 3.7.8. The eigenvalues of a Hermitian matrix are real and those
of a unitary matrix have absolute value 1.

Proof. The matrix D = UXU^-1 must be, respectively, Hermitian or unitary. Thus
in the respective cases D = (D^T)* = D* and D^-1 = (D^T)* = D*. □

EXERCISES
Level 1
1. Write these normal matrices as U^-1 DU. To do this, first find the characteristic
polynomial. Then find its roots, the eigenvalues. For each eigenvalue
k solve the linear equation vA = vk to find an eigenvector v ≠ 0 (for
instance take v1 = 1). Then write U, D as in Corollary 3.7.7. The matrix U
need not be unitary.

[1 1]   [2 1]   [0 2]   [ 1 1]
[1 1] , [1 2] , [2 3] , [-1 1]

[1 1 1]
[1 1 1]
[1 1 1]

(Hint. Use as basis {(1, 1, 1), (0, 1, -1), (-2, 1, 1)}.)

2. Show by calculation that the eigenvectors of this matrix for different
eigenvalues are orthogonal.

[ 0  1  1]
[-1  0  1]
[-1 -1  0]

Level 2
1. Prove directly that if M is symmetric, then two eigenvectors v1, v2 belonging
to different eigenvalues k1, k2 are orthogonal. Show v1 M v2^T = k1 v1 · v2 =
k2 v1 · v2. Conclude that v1 · v2 = 0 since k1 ≠ k2.
2. A quadratic form is an expression of the form

Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

such as x1^2 + 2x1x2 - x2^2. Prove that the quadratic form can be represented
as xAx^T where A = (a_ij), x = (x1, x2, ..., xn). Here we assume the a_ij are
real numbers.
3. Show that in Exercise 2 we can always take the matrix A to be symmetric.
If a_ij ≠ a_ji replace both by

(a_ij + a_ji)/2

4. Prove that if we make a substitution of variables y1, y2, ..., yn for
x1, x2, ..., xn according to the rule x_j = Σ y_i b_ij, then the quadratic form
xAx^T goes to (yB)A(yB)^T = y(BAB^T)y^T.
5. Prove that if A is a real symmetric matrix there exists a real unitary matrix
U and a real diagonal matrix D such that A = U^-1 DU = U^T DU (U^T = U^-1
by unitarity). By Corollary 3.7.8 each eigenvalue k is real. Thus we can
always find real eigenvectors by solving the linear equations vA = kv in
which all coefficients are real. (These have a solution v not zero since
A - kI is singular, hence one equation may be deleted.) Now the argument
of this section goes through where all numbers are real.
6. Show that any real quadratic form can be expressed in terms of variables
y1, y2, ..., yn as Σ c_i y_i^2 for c_i ∈ R. Here x_j = Σ y_i b_ij and the matrix
B = (b_ij) is orthogonal.
7. Show we can write the quadratic form xy in this form by making the
substitution

x = (u + v)/√2,   y = (u - v)/√2
8. Write the quadratic form xy + yz + xz as Σ c_i w_i^2 where c_i ∈ R.
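Exercises 6-8 can be checked numerically: the coefficients c_i are the eigenvalues of the symmetric matrix of the form. A numpy sketch for xy + yz + xz, whose matrix entries follow the symmetrization rule of Exercise 3:

```python
import numpy as np

# Symmetric matrix of the form xy + yz + xz: off-diagonal entries 1/2.
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

c, B = np.linalg.eigh(A)       # eigenvalues c_i, orthonormal eigenvectors
print(np.round(c, 6))          # eigenvalues, ascending: -1/2, -1/2, 1
```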

Level 3
1. Let M be a normal matrix. Show that M(M^T)* = (M^T)* M is Hermitian, and its
eigenvalues are the squares of the absolute values of the eigenvalues of M.
2. Show a real symmetric matrix M has a real symmetric square root X such
that M = XX provided that all eigenvalues of M are nonnegative. Use
M = UDU^-1 and find √D.
3. Show any normal matrix A can be written as BC where BC = CB, B is
Hermitian and C is unitary. Let A = UDU^-1 and factor D = XY where
x_ii = |d_ii| and y_ii = d_ii/|d_ii| (taking y_ii = 1 when d_ii = 0).

Let B = UXU^-1, C = UYU^-1. This is called the polar decomposition of a
normal matrix.
4. The signature of a quadratic form is the number of positive eigenvalues of
its matrix minus the number of negative eigenvalues. Show two quadratic
forms of the same signature, where the matrices A are both nonsingular,
can be converted one into the other by a substitution of the form BAB^T,
b_ij ∈ R and det(B) ≠ 0. Use Level 2, Exercise 6 to obtain the form Σ c_i y_i^2
where the c_i are the eigenvalues.
5. Show that the signature of a real quadratic form is invariant under
substitution A → BAB^T for B real and nonsingular.
6. A quadratic form xAx^T is called positive definite if xAx^T > 0 for all
x ≠ 0. Show this is equivalent to all eigenvalues of A being positive.
CHAPTER 4

Rings

A ring is a general system in which there are two operations, addition and
multiplication. A ring is a commutative group under addition, a semigroup
under multiplication, and satisfies distributive laws. We describe the integers as
a special ring in this chapter. The set of n X n matrices over a field and the set
of polynomials in a variable over a field are also rings.
There are many kinds of rings. An integral domain is a ring having a unit 1,
satisfying the commutative law ab = ba and a cancellation property that if
ac = bc, c ≠ 0 then a = b. A Euclidean domain is an integral domain in which
we can divide a nonzero element y into an element x to obtain a quotient and
a remainder such that the remainder is of smaller degree than y. The integers are
an example of both. In any Euclidean domain prime numbers and divisibility
have most of the usual properties. Every element can be uniquely factored into
primes.
An ideal in a ring R is an additive subgroup H such that if x ∈ R, y ∈ H
then xy, yx ∈ H. For any ideal there exists a congruence defined by x ~ y if
and only if x - y ∈ H. The equivalence classes form the quotient ring R/H.
In the case of the integers these concepts take the form that x ≡ y (mod m)
if x - y is a multiple of m. For instance 5 ≡ 1 (mod 2) since 2 divides 5 - 1 = 4.
Congruences can be added, subtracted, and multiplied. The quotient ring is a
finite ring Zm of m elements. For m prime, it is a field.
Two numbers x, y are called relatively prime if they have no common
divisor (c.d.) except 1.
An element c in Zm has an inverse c^-1 if and only if c is relatively prime
to m. The elements with inverses form a group. Its order is φ(m), the number of
positive integers from 1 to m - 1 relatively prime to m. From this follows the
Euler-Fermat Theorem x^φ(m) ≡ 1 (mod m) if x is relatively prime to m. For m
prime the group of nonzero elements of Zm is cyclic. This gives a criterion for
equations x^k ≡ c (mod m) to be solvable.
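These facts about Zm can be illustrated directly; a small Python sketch (Python 3.8+ computes the modular inverse via the built-in three-argument `pow`):

```python
from math import gcd

m = 10
# Units of Z_10: the residues relatively prime to 10.
units = [c for c in range(1, m) if gcd(c, m) == 1]
print(units)             # [1, 3, 7, 9]; phi(10) = 4

# Each unit has an inverse mod m.
print(pow(3, -1, m))     # 7, since 3 * 7 = 21 = 1 (mod 10)

# Euler-Fermat: x^phi(m) = 1 (mod m) for x relatively prime to m.
phi = len(units)
print(all(pow(x, phi, m) == 1 for x in units))  # True
```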
In Section 5 we present an advanced topic, simple and semisimple rings,
needed for the study of group representations. A ring is simple if it has no
nonzero two-sided ideals. Under a finite dimensionality assumption every simple
ring is isomorphic to a complete ring of n × n matrices. A semisimple ring is one
which is a direct sum of simple rings. Proofs of results about semisimplicity are
omitted. These results provide the most important classification theory in ring
theory. If two-sided ideals I do exist then a study of the ideals I and the quotient
rings R/I gives information about R.

4.1 THE INTEGERS AND DIVISIBILITY


In this section we state a typical set of axioms for Z and then prove elementary
facts about divisibility. We view Z here as one case of a general class of systems.

DEFINITION 4.1.1. A ring is a set S on which two binary operations S × S → S
are defined, denoted as addition and multiplication, having an element 0 such that
for all a, b, c ∈ S:

(1) (a + b) + c = a + (b + c), (2) (ab)c = a(bc)

(3) a(b + c) = ab + ac, (4) (b + c)a = ba + ca

(5) a + b = b + a, (6) a + 0 = a

(7) for all a there exists -a such that a + (-a) = 0

For brevity, let R denote an arbitrary ring.

In other words a ring is an abelian group under addition and a semigroup


under multiplication. The two operations are linked only by the right and left
distributive laws (3) and (4).

EXAMPLE 4.1.1. Z forms a ring.

EXAMPLE 4.1.2. Any field is a ring.

EXAMPLE 4.1.3. Mn(F) forms a ring.

DEFINITION 4.1.2. A ring R is a ring with unit if there exists an element 1 ≠ 0
such that 1x = x1 = x for all x ∈ R. It is commutative if ab = ba for all a, b ∈ R.

EXAMPLE 4.1.4. A field is a commutative ring with unit.

DEFINITION 4.1.3. A ring R is ordered if there exists a strict partial order
< on R such that if x < y then (1) for all z ∈ R, x + z < y + z, (2) for all
z > 0 in R, xz < yz and zx < zy.

EXAMPLE 4.1.5. Z, Q, R are ordered rings.

A ring is linearly ordered if the order < is also a linear order, that is, for all
x, y ∈ R either x = y, x < y, or x > y. The symbol > is defined by x > y if
and only if y < x.
The axioms for Z can now be stated compactly.

AXIOM 1. The integers are a linearly ordered, commutative ring with unit.
AXIOM 2. Let S ⊆ Z be such that (i) 1 ∈ S and (ii) if x ∈ S then x + 1 ∈ S.
Then S contains all positive integers Z+ (all x such that 0 < x).

Axiom 2 is called the inductive axiom. All the usual forms of mathematical
induction follow from it.

DEFINITION 4.1.4. In R, a - b is a + (-b) for all a, b ∈ R.

DEFINITION 4.1.5. The absolute value |x| = x if x is positive or zero and
|x| = -x if x is negative.

Next we state a number of simple properties of Z which follow from the
axioms, without proof. (The proofs are in the exercises.) Let a, b, c ∈ Z.

Z1. 1 > 0.
Z2. (-a)(b) = a(-b) = -(ab).
Z3. (-1)(-1) = 1.
Z4. a(b - c) = ab - ac.
Z5. If a, b are positive so is ab.
Z6. If one of a, b is positive and the other is negative, ab < 0.
Z7. If a, b are negative, ab is positive.
Z8. a0 = 0.
Z9. If a > 0 then -a < 0.
Z10. If a < 0 then -a > 0.
Z11. If a + b = a + c then b = c.
Z12. If ab = 0 then a = 0 or b = 0.
Z13. |a| ≥ 0.
Z14. If a ≠ 0 then |a| > 0.
Z15. |ab| = |a||b|.
Z16. |a + b| ≤ |a| + |b|.
Z17. The smallest positive integer is 1.
Z18. If a ≠ 0 then aa > 0.
Z19. If ab = 1 then a = 1 or a = -1, and a = b.
Z20. |a| = |-a|.

DEFINITION 4.1.6. An integral domain is a commutative ring with unit in
which if ab = 0 then a = 0 or b = 0. For brevity, D will denote an arbitrary
integral domain.

EXAMPLE 4.1.6. The integers, any field, and any subring of a field (subset
forming a ring under the same operations) are integral domains.

For the rest of this section, F[x] (Q[x]) will denote the ring of polynomials
over F (Q).

PROPOSITION 4.1.1. Let a, b, c ∈ D. If ac = bc, c ≠ 0, then a = b.

Proof. Since ac - bc = (a - b)c = 0, either c = 0 or a - b = 0. Since c ≠ 0, a - b = 0. □

DEFINITION 4.1.7. If a, b ∈ D, a divides b if and only if there exists c ∈ D
such that b = ac. This is written as a|b.

EXAMPLE 4.1.7. In the ring of polynomials with integer coefficients,
x^2 - 1 | x^4 - 1.

DEFINITION 4.1.8. If a ∈ D and a has an inverse in D, then a is called a unit.

EXAMPLE 4.1.8. The units of the integers are precisely ±1.

EXAMPLE 4.1.9. The units of F[x] are the nonzero elements of F.

DEFINITION 4.1.9. If a1, a2, ..., ak ∈ D, then d ∈ D is called a greatest
common divisor (g.c.d.) of a1, a2, ..., ak if and only if (1) d|ai for i = 1 to
k, (2) if x|ai for i = 1 to k then x|d. The g.c.d. of a and b is denoted by
(a, b).

EXAMPLE 4.1.10. In Q[x], x is a g.c.d. of x2 — x and x3.

PROPOSITION 4.1.2. The relation a|b is a quasiorder. We have a|b and b|a
if and only if a = ub where u is a unit. If b|ai for i = 1 to k then for any
c1, c2, ..., ck ∈ D, b | c1a1 + c2a2 + ... + ckak.

Proof. Since a = a·1, a|a. If b = ra and c = sb then c = (sr)a. Therefore if
a|b and b|c, a|c.
Let a|b and b|a. If a or b is zero both are zero, and 0 = 1·0. Suppose a, b
are nonzero. Let a = ub, b = va. Then a = u(va) = (uv)a. So by Proposition 4.1.1,
uv = 1. So u is a unit.
Let ai = uib. Then c1a1 + c2a2 + ... + ckak = (c1u1 + c2u2 + ... + ckuk)b. □
The g.c.d. also has a kind of associative property.

PROPOSITION 4.1.3. Let g be a g.c.d. of a finite set S. Let e be a g.c.d. of a
finite set T and let f be a g.c.d. of X where S = T ∪ X. Then g is a g.c.d.
of e, f.

Proof. Since g is a common divisor (c.d.) of S, and T, X ⊆ S, it is a c.d. of
T and of X. Therefore g|e and g|f. If d is a c.d. of e, f then d|e and d|f, so d is a
c.d. of T and of X, hence of T ∪ X = S. So d|g. □

PROPOSITION 4.1.4. If d, g are g. c.d. of a set S then d = ug where u is a unit.

Proof. We have d\g and g\d. □

EXERCISES
Level 1
1. Prove Property Z11. Add -a to both sides.
2. Prove Property Z8. Note that a + 0a = 1a + 0a = (1 + 0)a = 1a = a. Now
add -a to both sides.
3. Prove Property Z2. Note that ab + (-a)b = (a + (-a))b = 0b = 0 by
Exercise 2. Now add -(ab) to both sides. The relation a(-b) = -ab
follows by the commutative law.
4. Prove Property Z6 from Exercise 2 and Property Z2.
5. Prove Property Z9 from Definition 4.1.3, adding (—a) to both sides.
6. Prove Property Z10 from Definition 4.1.3, adding (—a) to both sides.

Level 2
1. Prove Properties Z13, Z14 by considering each of three cases, a is positive,
negative, or zero.
2. Prove Property Z20 in that way.
3. Prove Property Z3 using 0 = 0·0 = (1 + (-1))(1 + (-1)) = 1(1 + (-1)) +
(-1)(1 + (-1)) = 1 + (-1) + (-1)(1) + (-1)(-1) = 0 + (-1) + (-1)(-1)
= -1 + (-1)(-1). Add 1 to both sides.
4. Prove Property Z7 using Properties Z2, Z3, Z10.
5. Prove Property Z18 using Properties Z5, Z7 and two cases: a positive or
negative.
6. Prove Property Z1. Suppose 1 is negative. Then 1(1) = 1 < 0. This contradicts
Property Z18.
7. Prove Property Z4.

Level 3
1. Prove Property Z6 using Properties Z9, Z10, Z2, Z3.
2. Prove Property Z15 by taking five cases: a = 0 or b = 0, a> 0 and b > 0,
a > 0 and b < 0, a < 0 and b > 0, a < 0 and b < 0.

3. Prove Property Z12 using any from Properties Z1-Z11 and possibly
Properties Z13-Z15.
4. Prove Property Z16 by taking five cases as in Exercise 2.
5. Prove Property Z17 from the induction axiom.
6. Prove Property Z19.
7. Show any ring R is a subring of a ring R1 with unit. Let R1 = Z ⊕ R
and define (a, b) + (c, d) = (a + c, b + d). Define (a, b)(c, d) =
(ac, ad + cb + bd). Show R is a subring of R1, and R1 is a ring. What is
the unit?
8. Which of the listed properties of Z hold for all integral domains?

4.2 EUCLIDEAN DOMAINS AND FACTORIZATION


In this section we consider a special type of integral domain called a Euclidean
domain and show that the integers are a special case of this. We observe that in a
Euclidean domain any two elements have a g.c.d., and that every element can
be factored uniquely into primes.

DEFINITION 4.2.1. An element p ∈ D is prime if and only if whenever p = ab
for a, b ∈ D, either a or b is a unit, and p itself is not a unit. We say that the
integers a and b are relatively prime (or that a is prime to b) if (a, b) = 1.

EXAMPLE 4.2.1. The numbers 2, 3, 5,7,11,13,17 are prime.

EXAMPLE 4.2.2. The polynomial x + a is prime for any a, where x is an


indeterminate.

A Euclidean domain is an integral domain in which we can divide to obtain


a quotient and remainder, where the remainder is in a certain sense less than the
divisor.
In dealing with the ring of polynomials in a variable (indeterminate) x over
a coefficient field F, it is more precise to say that x is transcendental over F,
that is, it does not satisfy any nonzero polynomial equation with coefficients
in F. Such a polynomial ring can be constructed by taking a subset of the
Cartesian product of a countable number of copies of F, and defining operations
appropriately.

DEFINITION 4.2.2. The integral domain E is a Euclidean domain if and only if
for every c ∈ E, c ≠ 0, there exists a nonnegative integer v(c) such that for all
a, b ∈ E, a ≠ 0, b ≠ 0: (i) v(ab) ≥ v(a), (ii) there exist q, r ∈ E such that
a = qb + r and either r = 0 or v(r) < v(b).

We remark that from (i) it follows that v(a) ≥ v(1) for all a.

EXAMPLE 4.2.3. Let E be the integers and let v(a) = |a|.

EXAMPLE 4.2.4. Let E be F[x] and let v(a) be the degree of the polynomial a.

EXAMPLE 4.2.5. Let E be {a + b√-1 : a, b ∈ Z} and let v(x) = |x|^2 = a^2 + b^2.

THEOREM 4.2.1. If a, b ∈ E, a ≠ 0, b ≠ 0, then there exists a g.c.d. d of a, b
such that d = ra + sb for some r, s ∈ E.

Proof. Let K = {ax + by : x, y ∈ E}. Let g be a nonzero element of K such
that v(g) is as small as possible. Write g = ta + sb. We must first show g|a and
g|b. Suppose g|a is false. Then a = qg + r where v(r) < v(g), and r ≠ 0. Also
r ∈ K since r = a - qg = a - taq - sbq = a(1 - tq) + b(-sq). This contradicts
the assumption that v(g) was a minimum. So g|a. Likewise g|b. Suppose x|a
and x|b. Then x|ta + sb. This proves g is a g.c.d. of a, b. □

PROPOSITION 4.2.2. If a1, a2, ..., ak are nonzero elements of E, then
there exists a greatest common divisor g of a1, a2, ..., ak of the form
g = a1x1 + a2x2 + ... + akxk where x1, x2, ..., xk ∈ E.

Proof. First find a g.c.d. g1 of a1, a2. Then let g2 be a g.c.d. of g1, a3. Find
g3, g4, ..., g_{k-1} in similar fashion, always using Theorem 4.2.1. Then let
g = g_{k-1}. We will have g|ai for i = 1 to k and g = a1x1 + a2x2 + ... + akxk.
This implies g is a g.c.d. of a1, a2, ..., ak. □

EXAMPLE 4.2.6. In Q[x], 1 is a g.c.d. of x^3 and x^2 - 1. We have

1 = x(x^3) + (-x^2 - 1)(x^2 - 1).

Since we can multiply both sides of g = a1x1 + a2x2 + ... + akxk by any
unit, any g.c.d. can be expressed in this form.

PROPOSITION 4.2.3. Let a, b be nonzero elements of E. Then v(ab) = v(b)


if and only if a is a unit.

Proof. Let a be a unit. Then v(ab) ≥ v(b). And v(b) = v(a^-1 ab) ≥ v(ab). So
v(ab) = v(b). Conversely, suppose v(ab) = v(b). Then b = qab + r for some q, r ∈ E
where r = 0 or v(r) < v(ab) = v(b).
Suppose r ≠ 0. Then r = b(1 - qa). So v(r) ≥ v(b). This is false. So r = 0. So
b = qab. So qa = 1. So a is invertible. □

PROPOSITION 4.2.4. Let a, b, c ∈ E. If c|ab and (c, a) = 1, where (c, a) = 1
means that c and a are relatively prime, then c|b.

Proof. We have ax + cy = 1 for some x, y ∈ E. So abx + cby = b. Since c
divides the left-hand side, it divides the right-hand side. □

COROLLARY 4.2.5. Let a, b ∈ E. If p|ab and p is prime then p|a or p|b.


PROPOSITION 4.2.6. Let a be a nonzero element of E which is not a unit.
Then there exist primes p1, p2, ..., pr for some integer r > 0 such that
a = p1p2 ... pr.

Proof. Let a be an element of minimum v(a) for which this assertion fails.
Suppose v(a) = v(1). Then since v(a·1) = v(1), a is a unit by Proposition 4.2.3.
This is contrary to hypothesis. So v(a) > v(1). Since a is not prime, a = bc
where neither b nor c is a unit. If v(a) = v(b) then c would be a unit, by
Proposition 4.2.3. So v(b) < v(a). Likewise v(c) < v(a). But the proposition is
true for all elements of smaller v. So b and c are products of primes.
So a = bc is a product of primes. This completes the proof. □

In most cases it is fairly clear that the last result holds. It is less obvious,
however, that a factorization into primes is unique except for rearrangement of
the primes and multiplication of each prime by a unit.
We first note that it follows by induction from Corollary 4.2.5 that if p
is prime and p|a1a2 ... ak then p|ai for some i. That is, p|a1 or p|a2a3 ... ak.
If p|a2a3 ... ak, then p|a2 or p|a3a4 ... ak. And so on.

THEOREM 4.2.7. Let a be a nonzero element of E which is not a unit. Let
a = p1p2 ... pm = q1q2 ... qn be factorizations of a into primes. Then m = n,
and we can renumber q1, q2, ..., qn in such a way that ui pi = qi where ui is a
unit, for i = 1 to n.

Proof. Let k be the minimum of n, m. If k = 1 then a is prime. So m = n = 1
and p1 = q1. Now suppose the theorem is true for k = 1, 2, ..., r. Let
a = p1p2 ... p_{r+1} = q1q2 ... qn. Then p1|q1q2 ... qn. So p1|qi for some i. So
p1u1 = qi for some u1. Since qi is prime, u1 must be a unit. Renumber q1, q2,
..., qn so that qi is q1. Then p1u1 = q1. So also p2p3 ... p_{r+1} = (u1q2)q3 ... qn.
Since the theorem is true for k < r + 1, then r = n - 1 and we can renumber
q2, q3, ..., qn and find v2, u3, ..., un such that v2p2 = u1q2, u3p3 = q3, ...,
unpn = qn. Let u2 = u1^-1 v2. Then the theorem has been verified. □

This theorem also holds for certain non-Euclidean rings. It is known that
polynomials in several variables do not form a Euclidean ring, yet unique
factorization still holds.
In any Euclidean domain, there is an effective method: (i) to find the g.c.d.
g of two nonzero elements a, b and (ii) to express g as ax + by. Label a, b so
that v(a) ≥ v(b). Let x1 = a, x2 = b. Obtain x_{i+1} by setting x_{i-1} = q_i x_i + x_{i+1}.
Then v(x_{i+1}) < v(x_i) unless x_{i+1} = 0. So eventually some x_i is zero. The last
nonzero element is taken as the g.c.d. g. This procedure is called the Euclidean
Algorithm.

PROPOSITION 4.2.8. In any E the last nonzero element g is the g.c.d. of a, b.

Proof. Let g = x_i. Then x_{i-1} = q_i x_i. We prove by a backwards induction that
for each j, g|x_j, g|x_{j-1}, and g is a linear combination of x_j, x_{j-1}. For j = i this
is immediate. Assume g|x_j, g|x_{j-1}, and g = r x_j + s x_{j-1}. We have

x_{j-2} = q_{j-1} x_{j-1} + x_j

Therefore g divides x_{j-2} and x_{j-1}. And g = r x_j + s x_{j-1} = r(x_{j-2} - q_{j-1} x_{j-1}) +
s x_{j-1}. This completes the induction. Thus g|a, g|b and g = ax + by for some
x, y. Thus g is the g.c.d. □

EXAMPLE 4.2.7. Find (31, 47). Find x and y such that (31, 47) = 31x + 47y.
Divide and take remainders.

47 = 1·31 + 16
31 = 1·16 + 15
16 = 1·15 + 1
15 = 15·1 + 0

The last nonzero remainder, 1, is the g.c.d. To find the linear combination, solve
the equations for the remainders

47 - 1·31 = 16
31 - 1·16 = 15
16 - 1·15 = 1

Now start at the last equation and substitute.

1 = 16 - 1·15 = 16 - (31 - 16) = 2·16 - 31
  = 2(47 - 31) - 31 = 2·47 - 3·31

Simplify but do not alter the numbers in the series x_i.
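The division-and-substitution steps above can be organized into a single loop that tracks the coefficients as it goes; a Python sketch (the function name is ours):

```python
def extended_gcd(a, b):
    """Euclidean Algorithm, also tracking x, y with a*x + b*y = g.
    Maintains the invariant a = a0*x0 + b0*y0 and b = a0*x1 + b0*y1
    for the original inputs a0, b0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r                      # the remainder sequence x_i
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

g, x, y = extended_gcd(31, 47)
print(g, x, y)           # 1 -3 2: indeed (-3)*31 + 2*47 = 1
```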

EXAMPLE 4.2.8. Find the g.c.d. of x^3 - 1 and 2x^2 - 3x + 1.

Divide polynomials:

x^3 - 1 = (2x^2 - 3x + 1)((1/2)x + 3/4) + (7/4)(x - 1)

The remainder is (7/4)(x - 1). We may use (x - 1) as a divisor in the next stage,
since 7/4 is a unit when we are dealing with polynomials:

2x^2 - 3x + 1 = (x - 1)(2x - 1) + 0

The g.c.d. is (x - 1).
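The polynomial version of the Euclidean Algorithm can be sketched in Python over Q using exact rational arithmetic; the coefficient-list representation and the function names are ours:

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Divide polynomial a by b; coefficients listed highest degree first."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    q = []
    while len(a) >= len(b):
        coeff = a[0] / b[0]
        q.append(coeff)
        a = [a[i] - coeff * b[i] for i in range(len(b))] + a[len(b):]
        a = a[1:]                      # leading coefficient is now zero
    return q, a                        # quotient, remainder

def poly_gcd(a, b):
    """Euclidean Algorithm in Q[x]; the result is made monic."""
    while any(c != 0 for c in b):
        _, r = poly_divmod(a, b)
        while r and r[0] == 0:
            r = r[1:]                  # strip leading zeros
        a, b = b, r
    return [c / a[0] for c in a]

# g.c.d. of x^3 - 1 and 2x^2 - 3x + 1, as in Example 4.2.8:
print(poly_gcd([1, 0, 0, -1], [2, -3, 1]) == [1, -1])  # True: the g.c.d. is x - 1
```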

EXERCISES
Level 1
1. Factor 192 as a product of primes.
2. Find (36, 54).
3. Find the g.c.d. of 501, 111 by the Euclidean Algorithm.
4. Find integers x, y such that 4x + 7y = 1.
5. Find the g.c.d. of 700,133.
6. Find the g.c.d. of x3 — 4x + 1 and its derivative.

Level 2
1. Find numbers x, y such that 17x + 11y = 1.
2. Find numbers x, y, z such that 6x + 15y + 10z = 1. (First find numbers
such that 6r + 15s = 3. Then find numbers such that 3x + 10z = 1.)
3. Find polynomials f(x),g(x) such that /(x)(x2 + l)+g-(x)(x2 + x + 1) = 1.
4. If ar + bs = 1 (so a, b are relatively prime) show that a(r + mb) +
b(s — ma) = 1.
5. Show that Exercise 4 yields all solutions of ax + by = 1.

Level 3
1. Prove from the properties given in Section 4.1 that Z is a Euclidean domain,
i.e. prove that for any x and any y ≠ 0 there exist q, r with x = qy + r and
|r| < |y|. Take r to be a number x - qy of minimum absolute value.
2. Prove that the set of Gaussian integers {a + bi : a, b ∈ Z} is a Euclidean domain.
3. Factor 1 + 3i into primes in this ring. (Try factoring a^2 + b^2 = |z|^2.)
4. Show unique factorization into primes fails in the subring of Q[x] generated
by 1, (x - 1)^2, (x - 1)^3.
5. Although unique factorization into primes holds in the ring of polynomials
in 2 variables over C, show that this is not a Euclidean domain. To do this
show that a g.c.d. of x, y is 1, yet there do not exist polynomials f, g such that
f·x + g·y = 1.
Sec. 4.3] Ideals and Congruences 173

4.3 IDEALS AND CONGRUENCES


An ideal is a subset I of a ring R which is a subring, and is such that if a ∈ I,
b ∈ R then ab, ba ∈ I. Ideals are central in the study of the structure of rings.
For any ideal, the relation x ~ y whenever x = y + m, m ∈ I, is called a congruence. A congruence
is an equivalence relation for which the equivalence classes themselves
form a ring. This ring is frequently simpler, but gives properties of the original
ring. It is called a quotient ring. All nontrivial quotient rings of Z are finite.
Additively they are the Zm mentioned in connection with groups.

DEFINITION 4.3.1. In R, a nonempty set I is a left (right, two-sided) ideal if
for all a, b ∈ I, c ∈ R we have -a ∈ I, a + b ∈ I and ca ∈ I (respectively ac ∈ I; ca, ac ∈ I).

EXAMPLE 4.3.1. Let m ∈ Z+. Then the set of multiples km of m is an ideal.

EXAMPLE 4.3.2. For any ring R and finite set of elements a_1, a_2, ..., a_n ∈ R
the set Ra_1 + Ra_2 + ... + Ra_n = {r_1 a_1 + r_2 a_2 + ... + r_n a_n : r_i ∈ R} is a left
ideal called the left ideal generated by a_1, a_2, ..., a_n.

A two-sided ideal is frequently just called an ideal. For commutative rings
there is no difference among left, right, and two-sided ideals.

PROPOSITION 4.3.1. For any two ideals I, J, both either right, left, or
two-sided, the following are ideals of the same type:

(1) I + J = {x + y : x ∈ I, y ∈ J}
(2) IJ = {x_1 y_1 + x_2 y_2 + ... + x_n y_n : n ∈ Z+, x_i ∈ I, y_i ∈ J}
(3) I ∩ J

For any set S the intersection of all ideals containing S is an ideal (the ideal
generated by S).

Proof. We verify (1), (2), (3) for left ideals. First the additive property: (1)
(a_1 + b_1) + (a_2 + b_2) = (a_1 + a_2) + (b_1 + b_2); (2) (x_1 y_1 + x_2 y_2 + ... + x_n y_n) +
(r_1 s_1 + r_2 s_2 + ... + r_k s_k) is again a sum of this type; (3) a + b ∈ I and
a + b ∈ J so a + b ∈ I ∩ J. For the multiplicative property: (1) c(x + y) =
cx + cy ∈ I + J; (2) c(x_1 y_1 + x_2 y_2 + ... + x_n y_n) = (c x_1) y_1 + (c x_2) y_2 + ... +
(c x_n) y_n; (3) ca ∈ I and ca ∈ J so ca ∈ I ∩ J.
The remainder of the proof, including the verification that -a is in the
ideal, is similar. □

In every ring R there are two ideals, {0} and R. Since a + {—a) = 0, 0 is in
every ideal.

DEFINITION 4.3.2. A congruence on a ring R is an equivalence relation x ~ y
such that if x ~ y then for all z, x + z ~ y + z, xz ~ yz, and zx ~ zy.

All congruences of rings arise in the way given in the next result.
If x ~ y and z ~ w then x + z ~ x + w ~ y + w and xz ~ xw ~ yw.

PROPOSITION 4.3.2. The relation x ~ y if and only if x - y ∈ I is a
congruence.

Proof. If a ~ b, b ~ c, then a - b ∈ I, b - c ∈ I. Since I + I ⊂ I, (a - b) +
(b - c) = a - c ∈ I. So a ~ c. Also b - a = -(a - b) ∈ I. And a - a = 0 ∈ I.
So it is an equivalence relation. The congruence properties hold also. □


The equivalence classes under any congruence form a ring called a quotient
ring, and denoted in the present case by R/I.
The function f(x) = x̄ which assigns to each element its equivalence
class is a homomorphism of rings, that is, f(x + y) = f(x) + f(y) and
f(xy) = f(x)f(y).
For the integers these ideas take the following form.

DEFINITION 4.3.3. For a, b ∈ Z we have a ≡ b (mod m) if and only if
m | a - b. Here m | a - b means m divides a - b. That is, a - b belongs to the
ideal {km : k ∈ Z}.

EXAMPLE 4.3.3. 7 ≡ 1 (mod 3) since 3 | 7 - 1.

The relation of congruence is an equivalence relation, and it follows
from the preceding theory that if x ≡ y (mod m), z ≡ w (mod m) we have
x + z ≡ y + w (mod m), xz ≡ yw (mod m).
The quotient ring is denoted Zm.

PROPOSITION 4.3.3. The classes 0, 1, 2, ..., m - 1 are distinct and include all
congruence classes modulo m.

Proof. For x ∈ Z, x = qm + r where 0 ≤ r ≤ m - 1. Then x - r = qm so
x ≡ r (mod m). So any x lies in one of these classes. Suppose 0 ≤ r < s ≤ m - 1.
Then m > s - r > 0. Therefore m cannot divide s - r. So s, r must lie in
distinct classes. □

COROLLARY 4.3.4. Zm has exactly m elements.

EXAMPLE 4.3.4. The addition and multiplication tables of Zm can be found
by taking a + b or ab and then finding the remainder when this is divided by m.
For example, modulo 5, 2 + 3 = 5 ≡ 0 since 5 divided by 5 leaves remainder 0,
and 2 · 3 = 6 ≡ 1 since 6 divided by 5 leaves remainder 1.
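These tables are easy to generate by machine; a small Python sketch (the function names are ours):

```python
def add_table(m):
    # entry [a][b] is the class of a + b in Zm
    return [[(a + b) % m for b in range(m)] for a in range(m)]

def mul_table(m):
    # entry [a][b] is the class of a * b in Zm
    return [[(a * b) % m for b in range(m)] for a in range(m)]

for row in add_table(5):
    print(row)
```

Row 2, column 3 of the addition table for m = 5 is 0, and of the multiplication table is 1, as computed above.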

An ideal is called proper if it is a proper subset of the ring. It is called
trivial if it equals {0}.

The proof of the following result follows the same pattern as for groups
(it is not given in full).

THEOREM 4.3.5. Let f: R_1 → R_2 be a homomorphism from a ring R_1 onto a
ring R_2 with kernel K = {x : f(x) = 0}. Then K is an ideal and f gives an
isomorphism R_1/K → R_2.

Proof. If x - y ∈ K then f(x) - f(y) = 0, so f is constant on each equivalence
class and gives a well-defined map on equivalence classes from R_1/K to R_2.
This map is a ring homomorphism since f(xy) = f(x)f(y) and f(x + y) = f(x) + f(y).
It is onto since f is. It is also 1-1. □

EXERCISES
Level 1
1. Work out the mod 5 addition table.
2. Work out the mod 5 multiplication table.
3. Work out the addition and multiplication mod 7.
4. Work out the tables for Z13.
5. Find x such that 4x ≡ 2 (mod 7).

Level 2
1. For any congruence x ~ y on a ring show the set {x : x ~ 0} is an ideal.
2. Show the kernel {x : /(x) = 0} of any ring homomorphism is an ideal.
3. Show that a field F has no ideals except {0} and F.
4. Prove the addition table of Zm has exactly one copy of each element in
each row and column. Excluding the row and column of zero, when will
this hold for multiplication?
5. Define a ring structure on a Cartesian product R_1 × R_2 of two rings. Show
that if both have units there must exist ideals of R_1 × R_2 other than
{0}, R_1 × R_2.
6. Prove that the last statement of Proposition 4.3.1 agrees with the previous
idea of left ideal generated by a set.

Level 3
1. Show any congruence on R is determined by a unique ideal.
2. Prove any ideal in a Euclidean domain E is of the form Ex for some element
x ∈ E. Such ideals are called principal.
3. Prove that the family of ideals of R is a lattice under inclusion.
4. Prove that any integral domain R is a subring of a field F. Define F by taking
ordered pairs (a, b), b ≠ 0, from R interpreted as fractions a/b. On these ordered
pairs take a congruence (a, b) ~ (c, d) if ad = bc and use the usual
definitions for sums and products of fractions.
5. Let I ≠ {0}, I ≠ R, and let I be maximal under inclusion among proper ideals of
R. Prove R/I is a field.
6. Prove that if R/I is a field, then I cannot be contained in another proper
ideal J.

4.4 STRUCTURE OF Zn


The rings Zn have a number of special properties. They are fields for n prime.
Algebraic equations can be considered in them, such as ax — b or x2 = c.
Various identities hold, of which the most famous is Fermat’s Theorem.

PROPOSITION 4.4.1. In Zm, an element c has a multiplicative inverse if and
only if (c, m) = 1.

Proof. If cx = 1 in Zm then cx - 1 = km, so cx - km = 1. So c, m cannot have a
common divisor greater than 1. Conversely, suppose c, m are relatively prime. Then cx - km = 1 for
some x, k. Therefore cx ≡ 1 (mod m). □
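The proof is constructive: the extended Euclidean algorithm produces the inverse x when it exists. A Python sketch (the name `inverse_mod` is ours):

```python
def inverse_mod(c, m):
    """Return x with c*x congruent to 1 (mod m), or None if (c, m) != 1."""
    # Extended Euclidean algorithm, tracking x with r == c*x (mod m).
    r0, x0 = c % m, 1
    r1, x1 = m, 0
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
    if r0 != 1:
        return None          # (c, m) != 1: no inverse exists
    return x0 % m

print(inverse_mod(5, 13))    # 8, since 5 * 8 = 40 = 3*13 + 1
```

For example, 5 has inverse 8 modulo 13, while 6 has no inverse modulo 9 since (6, 9) = 3.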

COROLLARY 4.4.2. Zm is a field if and only if m is prime.

In any ring the set of elements having inverses is a group, since we can take
(x^-1)^-1 = x and (xy)^-1 = y^-1 x^-1. Thus the set of invertible elements of Zm
forms a group.

DEFINITION 4.4.1. The Euler function φ(m) is the number of integers
x, 0 < x ≤ m, which are relatively prime to m.

EXAMPLE 4.4.1. If m is p^n, a power of a prime, the numbers counted by φ(m) are all of
1, 2, ..., p^n except the p^(n-1) multiples of p. Therefore φ(p^n) = p^n - p^(n-1).

PROPOSITION 4.4.3. (Euler-Fermat.) If x and m are relatively prime then

x^φ(m) ≡ 1 (mod m).

Proof. The order of any element of a group divides the order of the group. □
Sec. 4.4] Structure of Zn 177

COROLLARY 4.4.4. If p is prime then x^p ≡ x (mod p) for all x.
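Both statements can be checked numerically with Python's built-in modular exponentiation; φ is computed here by brute-force counting (the helper name `phi` is ours):

```python
from math import gcd

def phi(m):
    # Euler's function by direct count
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

# Fermat: x^p is congruent to x (mod p) for p prime
for p in (2, 3, 5, 7, 11):
    assert all(pow(x, p, p) == x % p for x in range(20))

# Euler-Fermat: x^phi(m) is congruent to 1 (mod m) when (x, m) = 1
for m in (4, 9, 10, 12, 15):
    for x in range(1, m):
        if gcd(x, m) == 1:
            assert pow(x, phi(m), m) == 1
print("checked")
```

The three-argument form pow(x, e, m) reduces modulo m at every step, so even large exponents are cheap.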

There exist finite fields other than Zp as will be shown later.

THEOREM 4.4.5. Every finite subgroup of the multiplicative group of a field F is
cyclic.

Proof. Let G be a multiplicative subgroup of a field, of order n. If G has an
element a of order k then k divides n. Let a have maximal order k. Suppose
b has order t and t does not divide k. Then some prime p occurs to a higher
power in t than in k. Let t = p^s u, k = p^w v, s > w, where p does not divide u and p
does not divide v. Then b^u has order p^s and a^(p^w) has order v. And b^u a^(p^w) has
order p^s v > k = p^w v. (Its order divides p^s v but not p^s v divided by q for any
prime q.) This contradicts maximality of the order of a.
Therefore the order of b divides k. If k = n then a generates a cyclic
subgroup of order n which must coincide with G. Suppose k < n. Since every order divides k, there are
n > k roots r_1, r_2, ..., r_n of the equation x^k = 1. But this means (x - r_1),
(x - r_2), ..., (x - r_n) divide x^k - 1. Therefore by unique factorization in F[x],
(x - r_1)(x - r_2) ... (x - r_n) divides x^k - 1. But a polynomial of degree n
cannot divide a polynomial of degree k < n. □

COROLLARY 4.4.6. For any prime p there exists an element a such that
a, a^2, ..., a^(p-1) ranges over all nonzero elements of Zp.
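Such a generator (a "primitive root") can be found by direct search for small p; a brute-force Python sketch (the name is ours), intended for odd primes:

```python
def primitive_root(p):
    """Smallest a whose powers exhaust the nonzero classes mod an odd prime p."""
    for a in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * a % p
            seen.add(x)
        if len(seen) == p - 1:
            return a
    return None

print(primitive_root(7))   # 3: its powers are 3, 2, 6, 4, 5, 1
```

For p = 7 the answer is 3, whose successive powers 3, 2, 6, 4, 5, 1 run through every nonzero class modulo 7.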

DEFINITION 4.4.2. A number x is said to be a quadratic residue modulo m if
and only if y^2 ≡ x (mod m) for some y ∈ Zm.

EXAMPLE 4.4.2. The number 4 is a quadratic residue to any modulus since
4 = 2^2.

PROPOSITION 4.4.7. For p prime and odd, a nonzero x is a quadratic residue of p if and
only if

x^((p-1)/2) ≡ 1 (mod p).

Proof. If x = b^2 then

x^((p-1)/2) = b^(p-1) ≡ 1 (mod p)

Let x = a^k for a generator a of the multiplicative group. If

x^((p-1)/2) ≡ 1 (mod p)

then (p - 1) divides k(p - 1)/2, so k is even. So

x = (a^(k/2))^2. □
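Euler's criterion is easy to test by machine against a direct listing of the squares modulo p (the helper names are ours):

```python
def quadratic_residues(p):
    # the nonzero squares modulo p
    return sorted({y * y % p for y in range(1, p)})

def euler_criterion(x, p):
    # does x^((p-1)/2) equal 1 mod p?
    return pow(x, (p - 1) // 2, p) == 1

p = 11
print(quadratic_residues(p))            # [1, 3, 4, 5, 9]
for x in range(1, p):
    assert euler_criterion(x, p) == (x in quadratic_residues(p))
```

For p = 11 the residues are 1, 3, 4, 5, 9, and the criterion agrees with membership in this list for every nonzero x.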

COROLLARY 4.4.8. For p an odd prime, -1 is a quadratic residue of p if and only
if p ≡ 1 (mod 4).

DEFINITION 4.4.3. An element ω ∈ F is a primitive nth root of unity if
ω^n = 1 but ω^i ≠ 1 for i = 1, 2, ..., n - 1.

EXAMPLE 4.4.3. The numbers ±i are primitive 4th roots of unity.

EXERCISES
Level 1
1. A test for primeness of a number p is to check whether x^(p-1) ≡ 1 (mod p).
(It is not always sufficient but for large p gives strong evidence.) Try this
for x = 2, p = 1 to 10.
2. When does ax ≡ b (mod m) have a solution x?
3. Find a multiplicative inverse of 5, mod 13.
4. If x, y are quadratic residues modulo m, prove xy is also a quadratic residue
modulo m.
5. How many quadratic residues exist modulo 3? 5? Try to generalize to any p.
6. Do you think a product of two quadratic nonresidues must be a quadratic
residue, modulo a prime? Compute a number of examples.
7. Prove x^2 ≡ 1 (mod 8) for any odd x by checking all cases.

Level 2
1. In any field, show by Theorem 4.4.5 that ±1 are the only solutions of
x^2 = 1.
2. Show from Exercise 1 that for any x relatively prime to p, x^((p-1)/2) ≡ ±1 (mod p).
3. Prove Wilson's Theorem that (p - 1)! ≡ -1 (mod p) for any prime p.
All numbers 1, 2, ..., p - 1 are solutions of x^(p-1) - 1 ≡ 0. Hence (x - 1)(x - 2)
... (x - p + 1) divides x^(p-1) - 1. Hence x^(p-1) - 1 = k(x - 1)(x - 2) ...
(x - p + 1). By considering the x^(p-1) term, k = 1.
4. Show that if an odd prime p divides x^2 + y^2 and p does not divide y then
p ≡ 1 (mod 4), using Corollary 4.4.8.
5. When does the equation x^2 + ax + b ≡ 0 (mod p) have a root x, for odd
primes p?
6. Prove x^2 + y^2 + z^2 cannot be congruent to 7 (mod 8), by considering cases.
7. Let the Legendre symbol (x|y) be ±1 according as x is or is not a
quadratic residue of y. Prove

(x|y) ≡ x^((y-1)/2) (mod y)

for y prime.
Sec. 4.5] Simple and Semisimple Rings 179

8. For primes x, y, there is a relation between (x|y) and (y|x). Experiment
to find this (but don't try to prove it). It is Gauss's Law of Quadratic
Reciprocity.

Level 3
1. Show that if f(x) and g(x) are polynomials of degree < p in a variable x
over Zp and f(k) = g(k) for k = 0, 1, 2, ..., p - 1 then f(x) = g(x). Since
k^p = k, this is false for degree p.
2. Prove for m, r relatively prime that Z_mr ≅ Z_m ⊕ Z_r. Map Z_mr → Z_m and Z_mr → Z_r
by the maps sending the class of k to the class of k. These are well-defined
homomorphisms because the ideal mrZ is contained in both mZ and rZ.
Show the kernel of the combined map into Z_m ⊕ Z_r is zero (this proves isomorphism).
3. Extend Exercise 2 to k factors.
4. Prove the 'Chinese Remainder Theorem': if x ≡ c_i (mod m_i) are a set
of congruences, they have a simultaneous solution if and only if for all
i, j, c_i ≡ c_j (mod d) where d = g.c.d.(m_i, m_j).
5. Prove

φ(p_1^(n_1) p_2^(n_2) ... p_k^(n_k)) = p_1^(n_1) p_2^(n_2) ... p_k^(n_k) (1 - 1/p_1)(1 - 1/p_2) ... (1 - 1/p_k)

6. Discuss the multiplicative structure of Z_(p^n) where p is a prime number and
n ∈ Z+.
7. Find in a book and write out in your own words a proof of the law of
quadratic reciprocity.
8. Prove any finite integral domain is a field.
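The compatibility criterion of Exercise 4 can be tested mechanically. Below is a sketch of a pairwise solver (the helper name `crt` is ours; it relies on Python 3.8+ `pow(a, -1, n)` for modular inverses):

```python
from math import gcd

def crt(congruences):
    """Solve x congruent to c (mod n) simultaneously; return (x, modulus) or None."""
    x, m = 0, 1
    for c, n in congruences:
        g = gcd(m, n)
        if (c - x) % g != 0:
            return None                  # incompatible pair of congruences
        l = m // g * n                   # lcm(m, n)
        # step from x to a simultaneous solution modulo the lcm
        t = ((c - x) // g * pow(m // g, -1, n // g)) % (n // g)
        x = (x + m * t) % l
        m = l
    return x, m
```

For example, crt([(2, 3), (3, 5), (2, 7)]) returns (23, 105), while crt([(1, 2), (0, 4)]) returns None because the classes disagree modulo g.c.d.(2, 4) = 2.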

4.5 SIMPLE AND SEMISIMPLE RINGS


A simple ring is one with no ideals other than itself and zero. We deal with
simple rings which are algebras over the complex numbers of finite dimension.
Such algebras, it will be shown, are always precisely the rings of n × n matrices
over C, for some n.

DEFINITION 4.5.1. A ring R with unit is an F-algebra for a field F if R is a vector
space over F and for all a, b ∈ R, α ∈ F we have (αa)b = α(ab) = a(αb).

EXAMPLE 4.5.1. If F_2 is a field and F_1 is a field containing F_2 then F_1 is an
F_2-algebra. Thus the complex numbers are an algebra over R.

EXAMPLE 4.5.2. The ring Mn(F) of n × n matrices is an F-algebra, where α ∈ F is sent to αI.

An F-algebra is necessarily an F-vector space. It is called finite dimensional
if it is finite dimensional as a vector space over F.

DEFINITION 4.5.2. A ring R with unit is a division ring, sometimes denoted R_D, if and
only if every nonzero element has a multiplicative inverse.

EXAMPLE 4.5.3. A commutative division ring is precisely a field.

EXAMPLE 4.5.4. The quaternions, an F-algebra with basis {1, i, j, k} where
i^2 = j^2 = k^2 = -1, ij = k, ji = -k, jk = i, kj = -i, ki = j, ik = -j, are a
division ring for any field F ⊂ R.
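The quaternion relations and the existence of inverses can be verified directly in code. A sketch over the reals, writing a + bi + cj + dk as a 4-tuple (all names are ours):

```python
def qmul(p, q):
    # Hamilton product of a + bi + cj + dk with e + fi + gj + hk
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qinv(p):
    a, b, c, d = p
    n = a*a + b*b + c*c + d*d     # the norm; nonzero unless p = 0
    return (a/n, -b/n, -c/n, -d/n)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, i) == qmul(j, j) == qmul(k, k) == (-1, 0, 0, 0)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)   # ij = k, ji = -k
```

The inverse is the conjugate divided by the norm, which exists for every nonzero quaternion; this is exactly why the quaternions form a division ring.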

Over general fields there exist many complicated division algebras. However,
we will show next that over C the only finite dimensional division algebra R_D is C itself.

THEOREM 4.5.1. Let R_D be a finite dimensional division ring which is a
C-algebra. Then R_D = C.

Proof. Suppose x ∈ R_D \ C. The powers 1, x, x^2, ..., x^n, ... must be linearly
dependent, since R_D is finite dimensional. In particular let x^k be the least power
of x which is a C-linear combination of 1, x, x^2, ..., x^(k-1). Then x^k = c_0 + c_1 x +
c_2 x^2 + ... + c_(k-1) x^(k-1). So x^k - c_(k-1) x^(k-1) - ... - c_0 = f(x) = 0, and f is the
polynomial of least degree satisfied by x. But over C, f factors into linear
factors (x - r_i), where the r_i are the complex roots of f. Thus x - r_i ≠ 0 for
all i, since x ∉ C, but the product of the x - r_i is zero. This contradicts the fact that R_D
is a division ring, since if the x - r_i have inverses, their product has an inverse and so
is nonzero. □

DEFINITION 4.5.3. An element x of a ring R with unit is an idempotent if x^2 = x.
It is a central idempotent if also xy = yx for all y ∈ R.

EXAMPLE 4.5.5. In any ring with unit, 1 is a central idempotent.

EXAMPLE 4.5.6. Any diagonal (0, 1)-matrix is an idempotent.

DEFINITION 4.5.4. R is regular if it is regular as a multiplicative semigroup; that is, for all
x ∈ R there exists y ∈ R such that xyx = x.

EXAMPLE 4.5.7. Any division ring R_D is regular.

EXAMPLE 4.5.8. Mn(F) is regular.

A proof of the following result, which is basically the first Wedderburn
theorem, is too technical for this book.

THEOREM 4.5.2. For a finite dimensional F-algebra A the following are
equivalent:

(1) A is a direct sum of simple F-algebras.
(2) A, as a vector space, is a direct sum of minimal left ideals (or minimal right ideals).
(3) Every left and right ideal of A is generated by an idempotent.
(4) A has no nonzero nilpotent two-sided ideals.
(5) A has no nonzero nilpotent left or right ideals.
(6) The multiplicative semigroup of A is a regular semigroup.

An algebra satisfying any of these conditions is called semisimple.

EXAMPLE 4.5.9. Any division algebra is regular, as is the ring Mn(F).

EXAMPLE 4.5.10. The ring of lower triangular matrices over F has a nilpotent
ideal (the matrices with zeros on the main diagonal) and so it is not regular.

The proof of the following theorem follows M. Hall (1959).

THEOREM 4.5.3. (Wedderburn.) A simple finite dimensional algebra over F is
isomorphic to the ring Mn(A_D) of n × n matrices over a division algebra A_D
over F, for some n and A_D.

Proof. We first find an expression of R as a direct sum of minimal right ideals.
Let I_1 be a minimal right ideal. Let e_11 be an idempotent such that e_11 R = I_1.
Suppose we have obtained idempotents e_11, e_22, ..., e_kk such that (1) e_ii R is a
minimal right ideal I_i in R, (2) e_ii e_jj = 0 for i ≠ j. It follows from (1), (2) that
I_j ∩ (Σ_{i≠j} I_i) = 0, since if x is in the intersection then e_jj x = x since x ∈ I_j, but
e_jj x = 0 by (2). Suppose R ≠ Σ I_i. Let x ∈ R \ Σ I_i. Then

(1 - Σ_{i=1}^k e_ii) x ≠ 0

So

(1 - Σ_{i=1}^k e_ii) R ≠ 0.

Let I_{k+1} be a minimal right ideal in this right ideal. Write I_{k+1} = w I_{k+1} where w is
an idempotent in I_{k+1}. Thus e_ii w = 0 for i = 1 to k. Let

x = w (1 - Σ_{i=1}^k e_ii)

Then x^2 = w · 1 · x = x and e_ii x = 0 and x e_ii = 0. So let e_{k+1,k+1} = x. This
continues an induction. So we can write R as the direct sum of the e_ii R where the e_ii are
idempotents and e_ii e_jj = 0 for i ≠ j. From the direct sum expression we have
that Σ e_ii y = y for all y in R. Thus Σ e_ii is a left identity. So it will be the
two-sided identity 1 of R (take y = 1).

The rings e_ii R e_ii will be the required division rings. We show that e_ii R e_ii is a
division ring. It is a subring with identity e_ii. Suppose it is not a division ring.
If y ≠ 0 in it is not invertible then y e_ii R e_ii ≠ e_ii R e_ii. Therefore y e_ii R ≠ e_ii R.
This contradicts the fact that e_ii R was a minimal right ideal. Therefore e_ii R e_ii is
a division ring R_{D_i}.

Next we observe that e_jj R e_ii ≠ 0. Since R is simple, R = R e_ii R. Thus
e_jj = Σ a_k e_ii b_k. So Σ a_k e_ii b_k ≠ 0. This implies e_jj a_k e_ii ≠ 0 for some k.

Choose for i = 2 to n a nonzero element e_11 b e_ii and write it as e_1i. We
have e_1i e_ii = e_1i. Also e_1i R ⊂ e_11 R so e_1i R = e_11 R by minimality. There
exists therefore y_i such that e_1i y_i = e_11. Set e_i1 = e_ii y_i e_11. Then e_1i e_i1 =
e_1i e_ii y_i e_11 = e_1i y_i e_11 = e_11 e_11 = e_11. Therefore (e_i1 e_1i)^2 = e_i1 (e_1i e_i1) e_1i =
e_i1 e_1i, so e_i1 e_1i is a nonzero idempotent in R_{D_i}. So e_i1 e_1i = e_ii.

Now set e_ij = e_i1 e_1j, which is consistent with the previous definitions.
We have e_ij e_jk = e_i1 e_1j e_j1 e_1k = e_i1 e_11 e_1k = e_i1 e_1k = e_ik. And for j ≠ r, e_ij e_rs =
e_ij e_jj e_rr e_rs = 0. Therefore the e_ij have precisely the properties required to be the
(0,1)-matrices whose only 1 entry is in location i, j.

All the division rings R_{D_i} are isomorphic to R_{D_1} by the mapping y → e_1i y e_i1.
And in fact e_ii R e_jj is isomorphic as a vector space to R_{D_1} under the mapping
x → e_1i x e_j1, which has an inverse x → e_i1 x e_1j.

It readily follows that we have a homomorphism from the ring of matrices
D = (d_ij) over R_{D_1} into R sending (d_ij) to Σ e_i1 d_ij e_1j.

This has kernel zero since if we multiply by e_1i and e_j1 on left and right only
the term in d_ij will remain. It is onto since x = 1 · x · 1 = (Σ e_ii)(x)(Σ e_jj) and
e_ii R e_jj ≅ R_{D_1}. □

This result is also valid for rings which are not F-algebras but are such that
there does not exist an infinite descending family I_1 ⊃ I_2 ⊃ I_3 ⊃ ... of distinct
left ideals (Artinian rings).
The converse is left as an exercise.

EXERCISES
Level 1
1. Show that the quaternions are a division ring.
2. Consider the ring of rational functions q_1(x)/q_2(x) over C, where x is a
variable. Show this is a field.
3. Show it is an infinite dimensional division ring over C.
4. If z is in the center of R, prove zR is a two-sided ideal.

5. If I is a maximal proper ideal of R, prove R/I is an integral domain.


6. If e is an idempotent in R, prove R is the direct sum of eR and (1 - e)R.

Level 2
1. In the ring Mn(F):
(a) Show all left ideals are principal (have the form Ra).
(b) Classify left ideals in terms of the image space of a matrix on row
vectors.
(c) Classify right ideals.
(d) Show no two-sided ideals exist.

Level 3
1. Carry through Level 2 for matrices over R_D.
2. Prove that a finite dimensional integral domain over F is a division ring.
3. If u is an element of a finite dimensional F-algebra A, prove that there exists
a polynomial p in u such that p(u)u^k = u^k for some k. Assume uR = u^2 R.
By induction show u^k R = uR. Show p(u)u = u, p(u) is idempotent, and
generates uR.
CHAPTER 5

Group representations

A group representation is a homomorphism from an abstract group G to a group
of n × n matrices. Two representations are equivalent if there is a similarity
transformation X → AXA^(-1) changing one to the other. Suppose the field F
contains the rationals, and the group G is finite (or compact). Then it can be
shown there are only a finite number of equivalence classes of representations
of any fixed degree n, and these can be completely classified by the characters,
i.e. the traces of the matrices assigned to each element g ∈ G.
To obtain this result, we first study the group ring F(G) of a group G. The
group ring consists of all 'formal sums' f_1 g_1 + f_2 g_2 + ... + f_n g_n of elements of
G times coefficients in F. Such sums are added termwise: (e + 3g) + (2e + 5g) =
3e + 8g. They are multiplied termwise using the products in G (commutativity
does not necessarily hold). Every group representation of the group G extends
to a ring representation of the ring F(G), and conversely. It follows from the
Wedderburn theorems (Theorems 4.5.2 and 4.5.3) of Chapter 4 that if F = C then
F(G) has a simple structure as a ring: it is a direct sum of rings M_k(F) consisting
of all k × k matrices over F, for varying k.
Over the real or complex numbers, we show that any representation is
equivalent to a representation by geometrical symmetries, that is, orthogonal
or unitary matrices.
A representation of F(G) can be formalized as a module. A module M over
any ring R is a generalization of the idea of a vector space to rings R which are not
fields. That is, M is an abelian group provided with a multiplication R × M → M
satisfying associative, distributive, and identity laws. The theory of modules is
very important in more advanced abstract algebra. Two modules can be 'added'
by the operation of direct sum. There is also a multiplication operation. The set
of homomorphisms from one module to another is an abelian group. The kernel
and image of a homomorphism are also modules.
Finite dimensional modules over F(G) have the property that every module
is a direct sum of irreducible modules, i.e. modules with no nonzero proper
submodule. Every irreducible module is isomorphic to a submodule of the group
[Sec. 5.1] The Group Ring and Representations 185

ring F(G) regarded as a module over itself. Therefore to obtain all irreducible
representations it suffices to find all irreducible submodules of the group ring.
Then we introduce characters, the traces of the matrices of a representation.
The characters of distinct irreducible complex representations turn out to be
orthonormal vectors. This is a consequence of the fact that there exists no
nonzero homomorphism from one irreducible module to another unless the two
are isomorphic.
It follows that every equivalence class of complex representations is
completely determined by its character. Characters can be added and multiplied,
and have a number of other properties.
Group representations are important in the study of systems having
symmetry, for instance in physics. Frequently such systems can be decomposed
in terms of the distinct irreducible representations of the system.
In the last section we discuss tensor products M ⊗ N of modules. These
are constructed by generators and relations expressing bilinearity. They explain
the properties of Kronecker products of matrices, and enable new representations
of a group to be found. Under some unproved assumptions, we find the
character ring of the n-dimensional unitary group.

5.1 THE GROUP RING AND REPRESENTATIONS


For any group G and ring R there exists a ring R(G) closely connected with the
structure of the group.

DEFINITION 5.1.1. For a group G and ring R the group ring R(G) is the set of
all functions f: G → R which are zero on all but a finite number of elements
of G. Such functions are added by the usual functional addition,

(f + h)(x) = f(x) + h(x)

Products are defined by

(f ∘ h)(x) = Σ_{yz=x} f(y)h(z)

Elements of the group ring are written as

r_1 g_1 + r_2 g_2 + ... + r_n g_n.

Two such sums are added and multiplied like algebraic expressions in which the
g_i are treated as variables and the r_i are treated as coefficients. The coefficient
r_i of g_i is interpreted as f(g_i) where f is the function described in the definition.
However, unless x, y commute in G, xy must be distinguished from yx in
multiplication.
The description in terms of formal sums is equivalent to the description by
functions: to the sum Σ r_i g_i corresponds the function f such that f(g_i) = r_i and
186 Group Representations [Ch.5

f(x) = 0 for other elements of G, and conversely. Let the sums Σ r_i g_i and Σ s_j g_j
correspond to functions f, g. Then (Σ r_i g_i)(Σ s_j g_j) is the sum Σ t_k g_k where

t_k = Σ_{g_i g_j = g_k} r_i s_j.

This corresponds to a function h such that

h(g_k) = Σ_{g_i g_j = g_k} f(g_i) g(g_j)

This is the product of Definition 5.1.1.

EXAMPLE 5.1.1. Let G be the symmetric group of degree 3, with six elements
e, y, y^2, x, xy, xy^2 where y^3 = e, x^2 = e, xy = y^2 x. In R(G),

(2e + 3y + 4xy) + (-e + x + 7xy) = e + 3y + x + 11xy

(2e + y + xy)(3e - 4x) = (6e + 3y + 3xy) - (8x + 4yx + 4xyx)

= 6e + 3y + 3xy - 8x - 4xy^2 - 4y^2
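Arithmetic in this group ring can be automated by storing an element as a dictionary from group elements to coefficients. In the sketch below (our encoding) each element of S3 is written x^a y^b with a taken mod 2 and b mod 3, and the relation y x = x y^2 gives the multiplication rule:

```python
# S3 elements encoded as (a, b) meaning x^a y^b, with x^2 = y^3 = e and
# y x = x y^2; an element of Z(S3) is a dict {(a, b): coefficient}.
def mult(u, v):
    w = {}
    for (a, b), r in u.items():
        for (c, d), s in v.items():
            # x^a y^b x^c y^d = x^(a+c) y^(b*2^c + d)
            g = ((a + c) % 2, (b * 2**c + d) % 3)
            w[g] = w.get(g, 0) + r * s
    return {g: r for g, r in w.items() if r != 0}

u = {(0, 0): 2, (0, 1): 1, (1, 1): 1}    # 2e + y + xy
v = {(0, 0): 3, (1, 0): -4}              # 3e - 4x
print(mult(u, v))   # coefficients of 6e + 3y + 3xy - 8x - 4xy^2 - 4y^2
```

Multiplying 2e + y + xy by 3e - 4x this way reproduces the product computed in the example.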

THEOREM 5.1.1. The group ring of a group is a ring.

Proof. We prove one of the distributive laws and the associative law of
multiplication.

(f ∘ (r + s))(x) = Σ_{yz=x} f(y)(r + s)(z) = Σ_{yz=x} f(y)(r(z) + s(z))

= Σ_{yz=x} f(y)r(z) + Σ_{yz=x} f(y)s(z)

Therefore

f ∘ (r + s) = (f ∘ r) + (f ∘ s)

For associativity,

(f ∘ (r ∘ s))(x) = Σ_{yz=x} f(y)(r ∘ s)(z) = Σ_{yz=x} f(y) Σ_{uv=z} r(u)s(v)

= Σ_{yuv=x} f(y)r(u)s(v)

((f ∘ r) ∘ s)(x) = Σ_{wv=x} (f ∘ r)(w)s(v) = Σ_{wv=x} Σ_{yu=w} f(y)r(u)s(v)

= Σ_{yuv=x} f(y)r(u)s(v)

The proofs of the other properties are similar. □


The group ring here will be mainly used to study group representations.

DEFINITION 5.1.2. A representation of a group G in a ring R with unit is a
homomorphism h from G into the group GL(n, R) of n × n invertible matrices
over R, for some n ∈ Z+. Two representations r, s are equivalent if and only if
there exists an invertible matrix A such that r(g) = A s(g) A^(-1) for all g ∈ G. The
number n is the dimension of the representation.

It follows that h(e) must be the identity matrix.


A representation is thus an assignment of a matrix Mg to every group
element in such a way that Mgh = MgMh. For a field F this means that a group
G acts on the space Fn as a group of linear mappings.

EXAMPLE 5.1.2. The trivial representation is the representation h(g) = I for
every g ∈ G, where I is an identity matrix.

EXAMPLE 5.1.3. The cyclic group Zm acts on 2-dimensional space as a group
of rotations by multiples of the angle 2π/m. If x is a generator, we have a
representation sending x^j to the matrix

[ cos(2πj/m)   -sin(2πj/m) ]
[ sin(2πj/m)    cos(2πj/m) ]

EXAMPLE 5.1.4. Any regular solid in 3-dimensional space has a finite group
of symmetries G. Then G has a 3-dimensional representation as the matrices of
the rotations and reflections involved. This gives, for example, a 3-dimensional
representation of the alternating group of degree 5 by rotations of an
icosahedron.

If R ⊂ S, a representation over R is a representation over S also. We next
prove that every representation of a finite group over the real or complex
numbers is equivalent to a unitary representation, that is, a representation by rotations and
reflections. That is, every real representation is equivalent to a homomorphism
into a group of symmetries of n-dimensional space.

THEOREM 5.1.2. Let r be a representation of a finite group G over R or C. Then r is equivalent
to a unitary representation.

Proof. Let f(x, y) denote the function on C^n given by

f(x, y) = Σ_{g∈G} r(g)x · r(g)y

Then

f(r(h)x, r(h)y) = Σ_g r(g)r(h)x · r(g)r(h)y = Σ_{gh} r(gh)x · r(gh)y

= Σ_g r(g)x · r(g)y = f(x, y)

Here, since G is a group, gh ranges over all elements of G as g does. This shows
f is invariant.
We show that f(x, y) is what is called a nonsingular Hermitian
form. We have

f(x, y) = f(y, x)*

f(x + y, z) = Σ r(g)(x + y) · r(g)z = Σ r(g)x · r(g)z + Σ r(g)y · r(g)z

= f(x, z) + f(y, z)

f(ax, z) = Σ r(g)ax · r(g)z = a f(x, z)

These imply f(z, x + y) = f(z, x) + f(z, y) and f(x, az) = a* f(x, z).
In addition

f(x, x) = Σ_{g∈G} r(g)x · r(g)x > 0

for all x ≠ 0.

By the Gram-Schmidt process we can construct a new basis v_1, v_2, ..., v_n
such that f(v_i, v_i) = 1 and f(v_i, v_j) = 0 if i ≠ j. To do this let u_1, u_2, ..., u_n be
any basis. Let

v_1 = u_1 / √f(u_1, u_1)

so f(v_1, v_1) = 1. For i = 2 to n, let

w_i = u_i - Σ_{j<i} f(u_i, v_j) v_j

Then f(w_i, v_j) = 0 for j < i. Let

v_i = w_i / √f(w_i, w_i)

Then the v_i are the required basis.
Now we claim that with respect to this new basis, the matrix X of r(h) will
be unitary. Let r(h)v_i = Σ_j x_ij v_j. Then by invariance f(v_i, v_k) = δ_ik = f(r(h)v_i,
r(h)v_k), where δ_ik = 1 if i = k and δ_ik = 0 if i ≠ k. This gives

δ_ik = f(Σ_j x_ij v_j, Σ_m x_km v_m) = Σ_{j,m} x_ij x_km* f(v_j, v_m) = Σ_{j,m} x_ij x_km* δ_jm

= Σ_m x_im x_km*

Therefore X(X^T)* = I. Therefore X is unitary. □
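The averaging argument can be mirrored numerically for a real representation, where "unitary" becomes "orthogonal". The sketch below (all names ours) applies it to the cyclic group of order 3 generated by Y = [[0, 1], [-1, -1]], which is not orthogonal; the Cholesky factorization of the invariant form plays the role of the Gram-Schmidt step, and conjugating by it produces an orthogonal matrix:

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

Y = [[0.0, 1.0], [-1.0, -1.0]]          # order 3, but not orthogonal
I2 = [[1.0, 0.0], [0.0, 1.0]]
group = [I2, Y, mat_mul(Y, Y)]          # the cyclic group Y generates

# invariant form S = sum over the group of M^T M (here S = [[4, 2], [2, 4]])
S = [[sum(mat_mul(transpose(M), M)[i][j] for M in group) for j in range(2)]
     for i in range(2)]

# Cholesky factor S = L L^T; the change of basis is A = L^T (upper triangular)
l11 = math.sqrt(S[0][0])
l21 = S[1][0] / l11
l22 = math.sqrt(S[1][1] - l21 * l21)
A = [[l11, l21], [0.0, l22]]
A_inv = [[1 / l11, -l21 / (l11 * l22)], [0.0, 1 / l22]]

N = mat_mul(mat_mul(A, Y), A_inv)
check = mat_mul(transpose(N), N)        # N^T N should be the identity
print(N)
```

The conjugated matrix N turns out to be a rotation of the plane, and N^T N is the identity up to rounding error, exactly as the theorem predicts.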



This result, by a similar proof, holds for representations of compact
topological groups.
One relationship between representations and the group ring is that there
is a 1-1 correspondence between group homomorphisms h from G to n × n
invertible matrices over R and ring homomorphisms from R(G) to the ring of
n × n matrices over R sending e to I, provided R is a commutative ring with
unit. Here e denotes the identity element of G and I denotes the identity
matrix. This mapping is defined by sending Σ r_g g to Σ r_g h(g).

EXERCISES
Level 1
1. Compute in the group ring of the symmetric group of degree 3 over Z:
(a) (2e + 6x + 5y + 7y^2) + (2e + 10x + xy + xy^2)
(b) (e + y + y^2)(y)
(c) (e + y + y^2)(x)
(d) (e + y + y^2)(2x + 3y)
2. Define a representation of Z_2 × Z_2 into 2 × 2 diagonal matrices over Z.
Generalize this.
3. Define a representation of the symmetric group into n × n permutation
matrices over Z.
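For Exercise 3, the standard assignment sends a permutation p to the 0-1 matrix with a 1 in row i, column p(i). A sketch (our encoding of permutations as tuples, composed left to right):

```python
def perm_matrix(p):
    # row i has its single 1 in column p[i]
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]

def compose(p, q):
    # p then q: i -> q[p[i]]
    return tuple(q[p[i]] for i in range(len(p)))

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

p, q = (1, 2, 0), (1, 0, 2)
assert mat_mul(perm_matrix(p), perm_matrix(q)) == perm_matrix(compose(p, q))
```

The assertion checks the homomorphism property M(p)M(q) = M(p then q) for one pair; with this composition convention it holds for all pairs.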

Level 2
1. Prove that in a group ring F(G) there always exist nonzero elements which
have no inverses, if |G| > 1.
2. Show Q(Z_2) is isomorphic to Q ⊕ Q.
3. What are the nonzero ideals of Q(Z_2)?
4. Show that

m → [ 1  0 ]
    [ m  1 ]

is a representation of Z not equivalent to a unitary representation.


5. Find a real orthogonal representation of the symmetric group of degree 3
equivalent to this one:

x → [ 0  1 ]     y → [  0   1 ]
    [ 1  0 ]         [ -1  -1 ]

Level 3
1. Consider the rational group ring Q(Z_m). Define homomorphisms
h: Q(Z_m) → Q by h(x) = 1 for x ∈ Z_m, and Q(Z_m) → F where F is the
subfield of C generated by

ω = e^(2πi/m)

If m is prime the number ω satisfies ω^(m-1) + ω^(m-2) + ... + ω + 1 = 0 but
no equation of lesser degree over Q. Use this to show the map into Q ⊕ F is
an isomorphism.
2. Prove that an n × n matrix A over Q is a linear combination of permutation
matrices if and only if the vector (1, 1, ..., 1) is both a row and a column
eigenvector of A.
3. Show Q(Z_4) is isomorphic to Q ⊕ Q ⊕ F where F is the set of numbers
{a + bi : a, b ∈ Q}.
4. Show that a group of order n has a nontrivial representation of degree n.
5. Show that any group Zm has a nontrivial 1-dimensional complex representation h. Show the real 2-dimensional representation of the example is equivalent over C to the direct sum representation:

x →  h(x)   0
      0    h*(x)

5.2 MODULES AND REPRESENTATIONS


We next show that a representation over F is equivalent to a finite dimensional module over F(G). A module can be described as a ring acting linearly on an abelian group. Modules play an important role in group representation theory, algebraic topology, and other areas of modern mathematics.

DEFINITION 5.2.1. A left (right) module over a ring R is an abelian group M together with a function R × M → M (M × R → M) such that (1) (r + s)m = rm + sm (respectively m(r + s) = mr + ms), (2) r(m + n) = rm + rn (respectively (m + n)r = mr + nr), (3) (rs)m = r(sm) (respectively m(sr) = (ms)r), (4) (0)m = 0 (respectively m(0) = 0).

EXAMPLE 5.2.1. Any ring is a left module over itself.

EXAMPLE 5.2.2. The n-fold direct sum R ⊕ R ⊕ ... ⊕ R is a left module by s(r1, r2, ..., rn) = (sr1, sr2, ..., srn). A module isomorphic to such a direct sum is called a free module.

DEFINITION 5.2.2. A bimodule over a ring is a module which is simultaneously a left module and a right module such that (rm)s = r(ms).

EXAMPLE 5.2.3. Over a commutative ring every left or right module is a bimodule by the definition rm = mr.

EXAMPLE 5.2.4. Any free module R ⊕ R ⊕ ... ⊕ R is a bimodule.

For noncommutative rings it is necessary to distinguish right and left modules.

DEFINITION 5.2.3. A module is unitary if and only if 1·m = m (m·1 = m) for all m ∈ M, where 1 is the identity of R.

For the remainder of this chapter, all modules will be unitary.

EXAMPLE 5.2.5. Any vector space is a unitary module over the field in
question.

DEFINITION 5.2.4. A homomorphism of left (right) R-modules from M to N is a function f from M to N such that f(x + y) = f(x) + f(y) and f(rx) = rf(x) (respectively f(xr) = f(x)r) for all x, y ∈ M, r ∈ R. A 1-1 onto homomorphism of modules is an isomorphism.

EXAMPLE 5.2.6. Any two vector spaces of the same dimension are isomorphic
F-modules.

Every unitary F(G)-module is a vector space over F. We will call it finite dimensional if it is so as a vector space over F, and its dimension is the vector space dimension.

THEOREM 5.2.1. There exists a 1-1 correspondence between equivalence classes of representations of a group G over F and isomorphism classes of unitary, finite dimensional left F(G)-modules.

Proof. To every representation h of G over F we make the vector space Fⁿ into a G-module by setting (Σ c_g g)v = Σ c_g h(g)v. Conversely for any finitely generated F(G)-module, choose a basis m1, m2, ..., mn for M. Multiplication by g defines a linear transformation f_g on M since g(m + n) = gm + gn and g(an) = (ga)n = ag(n). These linear transformations satisfy f_{gh}(m) = (gh)m = g(hm) = f_g f_h(m). So f_{gh} = f_g f_h. Let h(g) be the matrix of f_g. Then h(g)h(g⁻¹) = h(e) = I so h(g) is invertible. And h(r)h(s) = h(rs). This gives a representation. Different choices of a basis change each h(g) to a matrix Xh(g)X⁻¹ so the representations are equivalent. Isomorphic modules likewise give equivalent representations and equivalent representations give isomorphic modules.
If we choose the standard basis for Fⁿ then the matrix representation h(g) goes to the module with action h(g)v, which again yields h(g) as matrix. Conversely if we go from a module M with basis v1, v2, ..., vn to a matrix h(g) then h(g)v is the original gv. So Σ c_g h(g)v = Σ c_g gv. This establishes that the two correspondences are inverse to one another. Therefore each is 1-1. □

Right modules could have been used in Theorem 5.2.1 since to every left module over F(G) there corresponds a right module with multiplication given by m(Σ a_i g_i) = Σ a_i g_i⁻¹m.

So it suffices to study F(G)-modules to determine equivalence classes of representations. We will show next that every finite dimensional F(G)-module is a sum of irreducible F(G)-modules.

DEFINITION 5.2.5. The direct sum of the left (right) R-modules M1, M2, ..., Mn is the Cartesian product M1 ⊕ M2 ⊕ ... ⊕ Mn with operations defined by (x1, x2, ..., xn) + (y1, y2, ..., yn) = (x1 + y1, x2 + y2, ..., xn + yn) and c(x1, x2, ..., xn) = (cx1, cx2, ..., cxn) (respectively (x1, x2, ..., xn)c = (x1c, x2c, ..., xnc)).

EXAMPLE 5.2.7. For vector spaces this is the same as the previous definition
of direct sum.

DEFINITION 5.2.6. An additive subgroup N of a left (right) R-module M is a submodule if and only if for all r ∈ R, n ∈ N we have rn ∈ N (respectively nr ∈ N).

EXAMPLE 5.2.8. An ideal is a submodule of a ring over itself.

EXAMPLE 5.2.9. For vector spaces, a submodule is a subspace.

EXAMPLE 5.2.10. The subsets {0}, M are submodules of any module M.

DEFINITION 5.2.7. A nonzero module M is irreducible if and only if it has no submodule except M, {0}.

EXAMPLE 5.2.11. A vector space V is irreducible as an F-module if and only if it is 1-dimensional.

DEFINITION 5.2.8. For every submodule N ⊂ M, there exists a congruence relation on M defined by r ~ s if and only if r − s ∈ N. The set of equivalence classes forms a module with operations [x] + [y] = [x + y], r[x] = [rx] (or [x]r = [xr]). This is called the quotient module M/N.

THEOREM 5.2.2. Let G be a finite group and let F be C or R. For every submodule N of a finite dimensional F(G)-module M there exists a complementary submodule T such that M = N ⊕ T. Moreover M/N ≅ T.

Proof. By Theorem 5.1.2, we may assume G acts on M by unitary matrices. Let T be the space of vectors orthogonal to all vectors in N. Then T is a submodule since if s ∈ T, n ∈ N, Σ c_g g ∈ F(G) we have

(Σ c_g gs) · n = Σ c_g (s · g⁻¹n) = Σ c_g (0) = 0 .

Here the g⁻¹ comes from the unitary property, and g⁻¹n ∈ N so s · g⁻¹n = 0. Moreover T ∩ N = {0} and T + N = M as vector spaces. This implies the mapping (s, n) → s + n is an isomorphism from T ⊕ N onto M.
The mapping s → [s] from T to M/N has kernel 0 and is onto, therefore it is an isomorphism. □

COROLLARY 5.2.3. Every finite dimensional F(G)-module is a direct sum of irreducible modules.

Proof. We repeatedly express a module M as a direct sum until we arrive at modules having no nonzero proper submodules. If the dimension is n this process must end after at most (n − 1) steps. □

DEFINITION 5.2.9. The characteristic of a field F (char(F)) is the least positive integer n such that n1 = 1 + 1 + ... + 1 = 0, if such an n exists. Else it is zero.

EXAMPLE 5.2.12. char (Q) = 0, char (Zp) = p.

Theorem 5.2.2 is actually true for any field F such that (char (F), |G|) = 1.
A proof is given in the exercises (see Level 3, Exercises 2-5).

EXERCISES
Level 1
1. Explain why every commutative group is a Z-module.
2. If f is a homomorphism from a ring R to a ring S explain how S can be regarded as an R-module.
3. Prove that Z is not a unitary Q-module.
4. Prove that the set of n-dimensional vectors over F is a bimodule over the ring Mn(F).
5. For a ∈ R, prove that aM is a submodule of M provided a commutes with all elements of R.

Level 2
1. Suppose a group G acts by permutations on a set X. Find a Z-module associated with this action. (It can be taken as the set of functions X → Z.)
2. Prove that if M is a module over R, every invertible element of the centre of R, {x : xy = yx for all y ∈ R}, gives a module isomorphism M → M.
3. Let Mⁿ be the direct sum of n copies of an R-module M. Prove Mⁿ is a module over the ring of n × n matrices over R.
4. Show that the subgroup of Z X Z generated by {(1,1), (2, 3)} is itself a free
Z-module.

5. Do the same for the subgroup generated by {(1,4), (2, 2), (4,1)}.
6. Prove that every abelian group of order m is a Zm-module.

Level 3
1. An integral domain V is called a principal ideal domain if and only if every
ideal has the form V(a) for a G V. Prove every finitely generated module
over a principal ideal domain is isomorphic to a direct sum of modules of
the form V/V(a).
2. If Zr ⊄ F for r | |G|, every F(G)-submodule N of an F(G)-module M is a direct summand. Let V ⊂ M be a vector space complement of N in M, and let h : M → V be a projection mapping onto V with kernel N. Let

h1(x) = (1/|G|) Σ_{g ∈ G} g h(g⁻¹(x))

Prove h1 is a module homomorphism from M to M. The property Zr ⊄ F guarantees that |G| has an inverse in F.
3. Let h and h1 be the same as in the above exercise. Prove N ⊂ ker(h1).
4. Prove that the composite V → M → M → M/N ≅ V, where the middle map is h1, is the identity.
5. Prove the image of h1 is a complement of N.

5.3 IRREDUCIBLE REPRESENTATIONS


We have established that every finite dimensional representation of a finite group over R or C is a direct sum of irreducible representations. In this section we show the number of irreducible representations is finite, and every irreducible representation is a submodule of the group ring itself. We further study the decomposition of the group ring as a direct sum, the number of occurrences of each irreducible module, and homomorphisms among irreducible modules over C.

THEOREM 5.3.1. Every irreducible module M over a ring R has the form R/K
where K is a maximal proper left ideal.

Proof. Let x ∈ M, x ≠ 0. Then Rx is a submodule of M, so Rx = M. We have a module homomorphism R → Rx = M sending 1 to x. If K is the kernel of this mapping, R/K ≅ M and K is a left ideal. If K ⊂ J for another proper left ideal J then J/K would be a nonzero proper submodule of R/K ≅ M. Therefore K is maximal. □

In matrix terms, an irreducible representation is one which is not equivalent to a representation by block lower triangular matrices

*  0
*  *

DEFINITION 5.3.1. The regular matrix representation of a finite group G = {g1, g2, ..., gn} is the representation by permutation matrices P_ij(g) where P_ij(g) = 1 if and only if g_i g = g_j.

These are the matrices of the regular representation by permutations.

EXAMPLE 5.3.1. For the group Z2 the matrices are

e →  1  0      x →  0  1
     0  1           1  0

It can be verified that the corresponding module is precisely the group ring. An isomorphism is given by sending the basis element g_i for the group ring to the (0,1)-vector whose only 1 is in place i. The regular representation has dimension |G|.
For the remainder of this chapter we will be concerned with representations over the field of complex numbers, although the next two theorems are valid for any field F such that (char(F), |G|) = 1.
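The regular matrix representation can be sketched for Z3 (written additively); the function name regular is our choice:

```python
import numpy as np

# Regular representation of Z3 = {0, 1, 2} under addition mod 3:
# P(g) is the permutation matrix with P[i][j] = 1 iff g_i + g = g_j.
n = 3
elements = list(range(n))

def regular(g):
    P = np.zeros((n, n), dtype=int)
    for i, gi in enumerate(elements):
        P[i][(gi + g) % n] = 1
    return P

# Homomorphism property P(g)P(h) = P(g + h); the dimension is |G| = 3.
for g in elements:
    for h in elements:
        assert (regular(g) @ regular(h) == regular((g + h) % n)).all()
```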

LEMMA 5.3.2. (Schur's Lemma.) Let M, N be irreducible modules over C(G). If M and N are not isomorphic then every homomorphism from M to N is zero. If M, N are isomorphic then every nonzero homomorphism is an isomorphism.

Proof. Consider f : M → N. Suppose f is nonzero. Then the image of f is a nonzero submodule of N so it must be N. The kernel of f is a proper submodule of M so it must be 0. So f is an isomorphism. □

THEOREM 5.3.3. Let the regular representation of C(G) be decomposed as a direct sum of irreducible modules M_i. Every irreducible module M is isomorphic to some M_i.

Proof. Let y be any nonzero element of M. We have a module homomorphism from C(G) = ⊕ M_i to M sending 1 ∈ C(G) to y and Σ c_g g to Σ c_g gy. For some i this mapping must be nonzero on M_i. By Lemma 5.3.2, M_i is isomorphic to M. □

COROLLARY 5.3.4. A group of order n has at most n inequivalent irreducible representations.

So it suffices to study the regular representation to obtain a complete set of irreducible representations. But the structure of the regular representation is in effect known from the last section of Chapter 4. The ring C(G) is a direct sum of matrix rings Mn(C) of n × n matrices over C and these matrix rings correspond to distinct idempotents, hence we have the following theorem.

THEOREM 5.3.5. An irreducible representation of dimension n occurs precisely n times in the regular representation.

Proof. Without going into detail, each matrix algebra Mn(C) of n × n matrices over C yields precisely n copies of a single irreducible representation, corresponding to the n columns of each matrix. Distinct matrix algebras give distinct representations. The representation of Mn(C) on the set of column vectors can be verified to be irreducible. □

PROPOSITION 5.3.6. For any irreducible representation M, the vector space of homomorphisms M to M is isomorphic to C.

Proof. We show the module V of column vectors over Mn(C) has this property. Here Mn(C) denotes the n × n matrices over C. Let f : V → V be a homomorphism, so that f(Av) = Af(v) for every v. Let A be the (0,1)-matrix with exactly one 1, in the (i, i)-entry. Let e_i be the (0,1)-vector having exactly one 1, in place i. Then Af(v) = f(A(v)) = f(v_i e_i) = v_i f(e_i). Therefore the i-component of f(v), which is that of Af(v), is v_i d_ii for some constant d_ii. So f is given by a diagonal matrix D with entries d_ii. Let A have a_ij = a_ji = 1 and all other entries zero. Then AD = DA implies d_ii = d_jj, by looking at the (i, j)-entry of each. □

EXERCISES
Level 1
1. If the group is commutative show the group ring is also.
2. For a commutative group, if C(G) is isomorphic to a direct sum of matrix rings Mn(C), explain why no n can exceed 1. Therefore all irreducible representations are 1-dimensional.
3. Show that n distinct irreducible representations of Zn are obtained by sending a generator 1 to e^(2πik/n), k = 0, 1, ..., n − 1. Therefore these are precisely the irreducible representations of Zn.
4. For any irreducible representation r of a group H into Mn(C) and for any homomorphism f : G → H which is onto, prove the representation r ∘ f : G → Mn(C) is irreducible.
5. Use Exercise 4 to obtain p² irreducible representations of Zp × Zp.

Level 2
1. How many times does the regular representation contain the trivial
representation?
2. Identify the trivial representation as a submodule of Zn.
3. For the symmetric group give a linear representation of the form Sn → Z2 → Mn(C).

4. Prove the symmetric group of degree 3 has a 2-dimensional representation given by symmetries of an equilateral triangle.
5. Show any 1-dimensional representation is commutative. Hence a 2-dimensional representation by non-commuting matrices must be irreducible.
6. Using the above 5 exercises give the complete set of irreducible representations of the symmetric group of degree 3.

Level 3
1. Generalize Level 2 to obtain all irreducible representations of the dihedral group of order 2n, n odd. (They are all 1- or 2-dimensional.) Note that if the group is written in terms of generators and relations x, y : x² = yⁿ = e, xy = y⁻¹x, then there is a quotient map onto Z2 sending y to e, where e is an identity, and there are automorphisms y → yᵏ for any k relatively prime to n. These with symmetries of a regular n-gon give the irreducible representations.
2. Give the irreducible representations of the dihedral group for n even. (There are additional 1-dimensional representations sending y² → e but not y.)

5.4 GROUP CHARACTERS


A character χ of a representation assigns a complex number to each element of the group. A character identifies a representation completely. Characters have many properties: χ(M ⊕ N) = χ(M) + χ(N), χ(M ⊗ N) = χ(M)χ(N), where ⊗ is an operation called tensor product. Thus if one adds negative values they form a ring. The characters of distinct irreducible representations are orthogonal.

DEFINITION 5.4.1. Let r : G → Mn(C) be a representation where Mn(C) denotes the set of all n × n matrices over C. Then the character associated with r is χ(g) = Tr(r(g)).

EXAMPLE 5.4.1. Consider this representation of Z2:

e →  1  0      x →  0  1
     0  1           1  0

The character is χ(e) = 2, χ(x) = 0.
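The character of this example is a one-line trace computation (variable names are ours):

```python
import numpy as np

# Characters of the Z2 representation above, computed as traces.
rep = {'e': np.eye(2), 'x': np.array([[0., 1.], [1., 0.]])}
chi = {g: np.trace(m) for g, m in rep.items()}
assert chi['e'] == 2 and chi['x'] == 0
```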

PROPOSITION 5.4.1. The character of a direct sum of two representations is the sum of the characters of the separate representations.

Proof. For a direct sum of representations r1, r2 the matrices look like this:

r1(g)    0
  0    r2(g)

The trace of such a matrix, the sum of the main diagonal entries, is Tr(r1(g)) + Tr(r2(g)). □

We have briefly mentioned before the Kronecker product, sending an m × m matrix A and an n × n matrix B to an nm × nm matrix A ⊗ B such that (A ⊗ B)_{(i−1)n+j, (u−1)n+v} = a_iu b_jv. This operation is multiplicative: (A ⊗ B)(C ⊗ D) = AC ⊗ BD. It follows that the Kronecker product of two representations gives a new representation.

PROPOSITION 5.4.2. The character of a Kronecker product of two representations is the product of the characters of the representations.

Proof. We need to show that Tr(A ⊗ B) = Tr(A) Tr(B). Write A, B in subtriangular form with eigenvalues on the main diagonal, by taking a similarity transformation. Then A ⊗ B will also be in subtriangular form, and its eigenvalues are the products a_ii b_jj. So Tr(A ⊗ B) = Σ_{i,j} a_ii b_jj = (Σ_i a_ii)(Σ_j b_jj) = Tr(A) Tr(B). □
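Both the trace identity and the multiplicative property can be checked numerically with numpy's Kronecker product np.kron:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((4, 4))
C = rng.standard_normal((3, 3))
D = rng.standard_normal((4, 4))

# Tr(A kron B) = Tr(A) Tr(B)
assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
# (A kron B)(C kron D) = AC kron BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
```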

In the following proof we use the fact that if r, s are representations, and Y is a matrix such that Yr(g) = s(g)Y, then v → Yv gives a module homomorphism f between the corresponding modules. The reason is that f(r(g)v) = Yr(g)v = s(g)Yv = s(g)f(v).

THEOREM 5.4.3. Let r, s be irreducible representations giving matrices (r_ij(g)), (s_ij(g)). Then Σ_g r_ij(g) s_km(g⁻¹) = 0 unless r is equivalent to s. If r = s then this sum is 0 unless i = m, j = k, and in that case it is |G|/n where n is the dimension of the representation.

Proof. For any matrix B the rectangular matrix

Y = Σ_g r(g)Bs(g⁻¹)

satisfies

r(h)Y = Σ_g r(hg)Bs(g⁻¹) = Σ_g r(hg)Bs((hg)⁻¹)s(h) = Ys(h)

Thus it gives a module homomorphism from one representation to the other.
For r, s distinct it must be zero by Lemma 5.3.2. Let r = s. Then all such homomorphisms Y must be a multiple of the identity aI, by Proposition 5.3.6. Let B be a (0,1)-matrix with a 1 precisely in its (j, k)-entry. Then the sum in question is the (i, m)-entry of Y. This proves the first statement. Also if i ≠ m the sum is 0 since Y = aI, and for i = m it is independent of m. But Σ r_ij(g) r_km(g⁻¹) = Σ r_ij(g⁻¹) r_km(g) if we let the sum run over g⁻¹ instead of g. Therefore the sum for ij, km is the same as for km, ij. Therefore if j ≠ k it is zero, and all the nonzero sums are equal to some number a. They can be evaluated by taking B = I, in which case Y = |G|I. So naI = |G|I, since taking B = I means we add n equal sums for j = 1, 2, ..., n. □

We observe that for a unitary matrix M, the eigenvalues of M⁻¹ are conjugates of those of M. Hence χ(g⁻¹) = χ(g)*.

DEFINITION 5.4.2. The inner product of two characters (χ1, χ2) is

(1/|G|) Σ_g χ1(g)χ2(g⁻¹) = (1/|G|) Σ_g χ1(g)χ2(g)*

COROLLARY 5.4.4. If χ1, χ2 are irreducible then (χ1, χ2) = 0 if χ1, χ2 are not equivalent, and (χ1, χ2) = 1 if χ1, χ2 are equivalent.

Proof. This follows from Theorem 5.4.3 by summing Σ r_ii(g)s_kk(g⁻¹) over all i, k. There are n equal nonzero terms, those with i = k. For a fixed g this gives χ1(g)χ2(g⁻¹). □

COROLLARY 5.4.5. For any representation χ, the value (χ, χ) is 1 if and only if χ is irreducible.

Proof. Write the character χ as Σ n_i χ_i corresponding to the irreducible representations making it up. Then

(χ, χ) = Σ n_i n_j (χ_i, χ_j) = Σ n_i² (χ_i, χ_i) = Σ n_i²

by Theorem 5.4.3. This is 1 if and only if there is only one nonzero n_i and it is 1. □
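These orthogonality relations can be checked on the character table of the symmetric group of degree 3 (conjugacy class sizes 1, 3, 2; the character values below are the standard ones, derived in the exercises of this chapter):

```python
from fractions import Fraction

# Rows: trivial, sign, and the 2-dimensional representation; columns are the
# classes of the identity, the transpositions, and the 3-cycles.
sizes = [1, 3, 2]
chars = {'trivial': [1, 1, 1], 'sign': [1, -1, 1], 'standard': [2, 0, -1]}

def inner(c1, c2, order=6):
    # (1/|G|) sum chi1(g) chi2(g^{-1}); all values here are real, and
    # chi(g^{-1}) = chi(g) since g and g^{-1} are conjugate in S3.
    return Fraction(sum(h * a * b for h, a, b in zip(sizes, c1, c2)), order)

for n1, c1 in chars.items():
    for n2, c2 in chars.items():
        assert inner(c1, c2) == (1 if n1 == n2 else 0)
```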

COROLLARY 5.4.6. Two representations with the same character are equivalent.

Proof. Write χ as Σ n_i χ_i corresponding to the irreducible representations making up the representation. Then (χ, χ_i) gives the number of times n_i that χ_i occurs. Thus the number of copies of each irreducible summand can be determined from χ. □

Most of these results go through for compact continuous groups, if the sums
are replaced by integrals.
There are many other relationships involving characters, some sketched in
the exercises.

EXERCISES
Level 1
1. Prove that the number of times the trivial representation occurs in a representation with character χ is (1/|G|) Σ_g χ(g).
2. What is χ(e)? Here e stands for an identity.
3. Verify the relations (χ_i, χ_j) = 0 if i ≠ j for irreducible representations of Zp. Use the fact that for any nontrivial pth root x of unity, x^(p−1) + x^(p−2) + ... + 1 = 0.
4. Prove that the trace of any element of order p is a sum of pth roots of unity, i.e. numbers satisfying xᵖ = 1.
5. Show χ(g) = χ(hgh⁻¹) for any character χ.

Level 2
1. Write the characters of the dihedral group of order 2n, n odd.
2. Verify the orthogonality relations for this group.
3. For an irreducible representation of degree 2 of the symmetric group of degree 3, what is χ ⊗ χ? What irreducible representations make up the Kronecker product of this representation with itself?
4. Let C_k be Σ g over the kth conjugacy class π_k. Prove that C_i C_j is a sum of elements C_k with integer coefficients, C_i C_j = Σ n_ijk C_k for some integers n_ijk.
5. In an irreducible representation C_k, being in the center, must go to a certain multiple of the identity aI. Let x_k ∈ π_k, let the dimension be n, and let |π_k| = h_k. Then Tr(C_k) = h_k χ(x_k) = na. What is then the image of C_k?
6. Let

y_k = h_k χ(x_k) / n

We have from previous results y_i y_j = Σ n_ijk y_k. So each y_i satisfies a monic polynomial with integer coefficients, the characteristic polynomial of (a_jk) where a_jk = n_ijk. Such numbers are called algebraic integers.
7. Show a basis for the center of C(G) is given by the sums C_k, since elements in the same class must have the same coefficient. The center, as a ring, is a direct sum of copies of the complex numbers, one for each irreducible representation. Therefore the number of distinct irreducible representations is what?

Level 3
1. Try to prove the class orthogonality relations

(1/|G|) Σ_i χ_i(g)χ_i(h⁻¹) = 1/h_g for g conjugate to h, and 0 otherwise,

where the sum is taken over all irreducible representations and h_g is the number of elements conjugate to g.

2. Assuming sums and products of algebraic integers are algebraic integers, and the results of Exercise 7 of Level 2 and the orthogonality relations of the above exercise, prove the degree of an irreducible representation divides the order of the group.

5.5 TENSOR PRODUCTS


A tensor product is a kind of multiplicative operation on vector spaces or
modules. Its definition uses bilinear forms. It gives a more natural explanation
of the somewhat arbitrary Kronecker product of representations.

DEFINITION 5.5.1. A function b from M × N to G is bilinear, for a right R-module M, a left R-module N, and an abelian group G, if and only if for all x, y ∈ M, z, w ∈ N, r ∈ R:

(1) b(xr, z) = b(x, rz)

(2) b(x + y, z) = b(x, z) + b(y, z)

(3) b(x, z + w) = b(x, z) + b(x, w)

EXAMPLE 5.5.1. The multiplication b(x, y) = xy is bilinear in any ring R.

EXAMPLE 5.5.2. The inner product x · y is a bilinear mapping of vector spaces.

The tensor product is defined to be, in category-theoretical language, a universal object for bilinear mappings.

DEFINITION 5.5.2. An abelian group H provided with a bilinear map f : M × N → H is called a tensor product of a left R-module N and a right R-module M if and only if for any abelian group G and bilinear map b : M × N → G there exists a unique homomorphism h : H → G such that h ∘ f = b.

EXAMPLE 5.5.3. The tensor product of R with an R-module M is M.

PROPOSITION 5.5.1. Any two tensor products are isomorphic by a natural isomorphism.

Proof. Let H1, H2 be tensor products of M, N. There exist bilinear mappings b_i : M × N → H_i. These, by the definition, give mappings g1 : H1 → H2 and g2 : H2 → H1 with g1 ∘ b1 = b2 and g2 ∘ b2 = b1. Therefore g2g1 is the identity on the image of b1 and g1g2 is the identity on the image of b2.
We will show the b_i are onto. Suppose b1 is not onto but has image E properly contained in H1. Then by the argument just given there exists a mapping H1 → E such that E → H1 → E is the identity. Then for b = b1 there exist two distinct homomorphisms:

h1 = identity

h2 = the map H1 → E → H1

such that h_i ∘ b1 = b1. This contradicts uniqueness. Thus the b_i are onto. Thus g2g1 and g1g2 are the identity. So each is an isomorphism. □

The existence of a tensor product follows by a construction in terms of generators and relations. This construction will not be proved in detail.
Take a free abelian group (sum of copies of Z), with one generator denoted m ⊗ n for each pair (m, n) ∈ M × N. Introduce relations of three types:

(1) xr ⊗ z = x ⊗ rz

(2) (x + y) ⊗ z = x ⊗ z + y ⊗ z

(3) x ⊗ (z + w) = x ⊗ z + x ⊗ w

The group with these generators and relations can be verified to be the tensor
product by the same sort of proof as was presented for semigroups.
Tensor products have many properties.

T1. R ⊗_R M ≅ M.
T2. (M ⊕ N) ⊗ G ≅ (M ⊗ G) ⊕ (N ⊗ G).
T3. If M is a bimodule, M ⊗_R N is a left module.
T4. (M ⊗_R N) ⊗_R G ≅ M ⊗_R (N ⊗_R G), if these are defined.

There are dual properties where left, right factors are interchanged.

In group representation theory, the tensor product is taken usually not over
F(G) but over F only, that is, it is a tensor product of vector spaces.

THEOREM 5.5.2. Let v1, v2, ..., vn be a basis for a vector space V and w1, w2, ..., wn be a basis for a vector space W. Then the v_i ⊗ w_j form a basis for V ⊗ W, where i, j = 1, 2, ..., n.

Proof. The set of elements x ⊗ y spans V ⊗ W by construction, for x ∈ V, y ∈ W. Write x = Σ f_i v_i, y = Σ g_j w_j. Then

x ⊗ y = Σ f_i v_i ⊗ Σ g_j w_j = Σ f_i (v_i ⊗ Σ g_j w_j) = Σ f_i g_j (v_i ⊗ w_j)

by repeated applications of the defining relations of the construction. Therefore the v_i ⊗ w_j form a spanning set.
Take a basis denoted u_ij for a space U of dimension (dim V)(dim W). Then there is an onto bilinear map from V × W to U defined by b(Σ f_i v_i, Σ g_j w_j) = Σ f_i g_j u_ij. The elements v_i ⊗ w_j go to linearly independent elements. Therefore they are linearly independent in V ⊗ W. □
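In coordinates this basis can be modeled by Kronecker products of standard basis vectors, a sketch with numpy (n, m and the variable names are our choices):

```python
import numpy as np

# v_i kron w_j for standard basis vectors of R^2 and R^3 gives 6 vectors in R^6.
n, m = 2, 3
basis = [np.kron(np.eye(n)[i], np.eye(m)[j]) for i in range(n) for j in range(m)]
M = np.stack(basis)
# They are linearly independent, hence a basis of the (n*m)-dimensional space.
assert np.linalg.matrix_rank(M) == n * m
```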

If a group G acts on V, W then G acts on V ⊗ W by g(v ⊗ w) = gv ⊗ gw. It is only necessary to verify that this preserves the defining relations:

g(fv ⊗ w) = fgv ⊗ gw = gv ⊗ fgw = gv ⊗ g(fw) = g(v ⊗ fw)

g((x + y) ⊗ z) = (gx + gy) ⊗ gz = gx ⊗ gz + gy ⊗ gz = g(x ⊗ z) + g(y ⊗ z) = g(x ⊗ z + y ⊗ z)

g(z ⊗ (x + y)) = gz ⊗ (gx + gy) = gz ⊗ gx + gz ⊗ gy = g(z ⊗ x + z ⊗ y)
This can be shown to give precisely the Kronecker product representation.


A major use of tensor products in group representation theory is the construction of new representations. In addition to the tensor product itself of two representations, we describe two other important methods.
First, we take the induced representation. Let H be a subgroup of G with [G : H] = k and let M be an F(H)-module of dimension n. Here [G : H] is the index of H in G. Then

F(G) ⊗_{F(H)} M

is a representation of G of dimension kn called the induced representation. For


groups of prime power order and some other types of groups, every irreducible
representation is induced from a 1-dimensional representation of a subgroup.
Richard Brauer proved that for any finite group, representations induced from
1-dimensional representations give a set of additive generators for the character

ring.
Another method is especially important for representations of compact continuous groups. Let V be a complex representation of any group G. Then Vⁿ = V ⊗ V ⊗ ... ⊗ V is a G-module. However, it is also a module over the symmetric group acting by permutation of the factors: π(x1 ⊗ x2 ⊗ ... ⊗ xn) = x_(1)π ⊗ x_(2)π ⊗ ... ⊗ x_(n)π. Moreover these two actions commute: gπ(x1 ⊗ x2 ⊗ ... ⊗ xn) = gx_(1)π ⊗ gx_(2)π ⊗ ... ⊗ gx_(n)π = π(gx1 ⊗ gx2 ⊗ ... ⊗ gxn) = πg(x1 ⊗ x2 ⊗ ... ⊗ xn). The G-module Vⁿ decomposes as a direct sum according to the different irreducible representations of the symmetric group Sn.
One way to see this is to recall that corresponding to each irreducible representation there will be a central idempotent e_i in C(Sn), and any C(Sn)-module M is the direct sum of the e_iM. Since e_i is a sum of permutations π it will commute with any element g ∈ G. Therefore g(e_iM) = e_i gM ⊂ e_iM so e_iM is a submodule over G.

EXAMPLE 5.5.4. For n = 2, let S2 = {e, t} where t² = e. Then the two central idempotents are e1 = ½(e + t) and e2 = ½(e − t). Any S2-module M splits as a direct sum of e1M and e2M since e1e2 = 0 and e1 + e2 = 1.
Then M ⊗ M is acted upon by S2 by e(x ⊗ y) = x ⊗ y and t(x ⊗ y) = y ⊗ x. The submodule e1(M ⊗ M) is spanned by the elements x ⊗ y + y ⊗ x. On it S2 acts trivially, that is, every operation is the identity. It is called the symmetric 2nd power of M.
The submodule e2(M ⊗ M) is spanned by the elements x ⊗ y − y ⊗ x. On it S2 acts in such a way that interchanging the factors reverses the sign of any element. It is called the exterior 2nd power of M.
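This example can be sketched in coordinates: on V ⊗ V the transposition t acts by a permutation matrix T, and e1 = (I + T)/2, e2 = (I − T)/2 are the projections onto the symmetric and exterior squares, whose dimensions n(n + 1)/2 and n(n − 1)/2 appear as traces:

```python
import numpy as np

n = 3
T = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        # t sends the basis vector v_i (x) v_j (index i*n + j) to v_j (x) v_i
        T[j * n + i, i * n + j] = 1
e1 = (np.eye(n * n) + T) / 2   # projects onto the symmetric 2nd power
e2 = (np.eye(n * n) - T) / 2   # projects onto the exterior 2nd power

assert np.allclose(e1 @ e1, e1) and np.allclose(e2 @ e2, e2)   # idempotents
assert np.allclose(e1 @ e2, 0) and np.allclose(e1 + e2, np.eye(n * n))
assert round(np.trace(e1)) == n * (n + 1) // 2   # dim of symmetric square
assert round(np.trace(e2)) == n * (n - 1) // 2   # dim of exterior square
```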

In general we can get many representations of any group from the tensor powers Mⁿ of any 1-1 representation. The exterior powers are especially important.

DEFINITION 5.5.3. Let M be a C(G)-module. The kth exterior power λᵏ(M) is given by e_k Mᵏ where e_k is the central idempotent

(1/k!) Σ_π σ(π)π

where σ(π) is the sign of π. The kth symmetric power sᵏ(M) is given by f_k Mᵏ where f_k is the idempotent

(1/k!) Σ_π π
EXAMPLE 5.5.5. For k = 1, s¹(M) is M. For k = 2, λ²(M) and s²(M) are in the preceding example.

DEFINITION 5.5.4. x1 ∧ x2 ∧ ... ∧ xk = e_k(x1 ⊗ x2 ⊗ ... ⊗ xk), where e_k is the central idempotent.

PROPOSITION 5.5.3. Let v1, v2, ..., vn be a basis for M. Then a basis for λᵏ(M) is given by the v_i1 ∧ v_i2 ∧ ... ∧ v_ik such that i1 < i2 < ... < ik.

Proof. We have from the definition π(x1 ∧ x2 ∧ ... ∧ xk) = σ(π)(x1 ∧ x2 ∧ ... ∧ xk). The v_i1 ∧ v_i2 ∧ ... ∧ v_ik for arbitrary i1, i2, ..., ik form a spanning set for λᵏ(M) since λᵏ(M) = e_k Mᵏ and the v_i1 ⊗ v_i2 ⊗ ... ⊗ v_ik form a basis for the tensor product. By applying a permutation π we may assume i1 ≤ i2 ≤ ... ≤ ik, and π will at most alter the sign of the result. If v_ir = v_is for r ≠ s then let π ∈ Sk interchange r, s. Then π(v_i1 ∧ v_i2 ∧ ... ∧ v_ik) = v_i1 ∧ v_i2 ∧ ... ∧ v_ik since i_r = i_s, but π(v_i1 ∧ v_i2 ∧ ... ∧ v_ik) = σ(π)(v_i1 ∧ v_i2 ∧ ... ∧ v_ik) = −(v_i1 ∧ v_i2 ∧ ... ∧ v_ik). So v_i1 ∧ v_i2 ∧ ... ∧ v_ik = −(v_i1 ∧ v_i2 ∧ ... ∧ v_ik), so it is zero. Therefore for i1 < i2 < ... < ik we have a spanning set.
The given terms are independent in Mᵏ since different ones involve distinct basis elements v_i1 ⊗ v_i2 ⊗ ... ⊗ v_ik. □

EXAMPLE 5.5.6. A basis for λ²(M) is given by the v_i ∧ v_j where i < j. We have v_j ∧ v_i = −v_i ∧ v_j.

PROPOSITION 5.5.4. Let a matrix have eigenvalues e1, e2, ..., en. Then the linear transformation it induces on the kth exterior power has eigenvalues all products e_i1 e_i2 ... e_ik, i1 < i2 < ... < ik.

Proof. Choose a basis v1, v2, ..., vn in which M is lower triangular with main diagonal elements e1, e2, ..., en. Then Mv_i = e_i v_i plus a linear combination of the later basis elements v_j. On the exterior power it will send v_i1 ∧ v_i2 ∧ ... ∧ v_ik to e_i1 e_i2 ... e_ik (v_i1 ∧ v_i2 ∧ ... ∧ v_ik) plus terms involving later basis elements. So it will be lower triangular with eigenvalues e_i1 e_i2 ... e_ik. □
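For k = 2 the induced map on the exterior square is given concretely by the matrix of 2 × 2 minors (the second compound matrix), and its eigenvalues can be checked numerically against the proposition:

```python
import numpy as np
from itertools import combinations

# Triangular test matrix with eigenvalues 2, 3, 5.
A = np.array([[2., 1., 0.], [0., 3., 1.], [0., 0., 5.]])
pairs = list(combinations(range(3), 2))
# Entry (r, c) of the compound matrix is the minor on row pair r, column pair c.
comp = np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in pairs] for r in pairs])
eig = sorted(np.linalg.eigvals(comp).real)
# Eigenvalues of the exterior square are the pairwise products 6, 10, 15.
assert np.allclose(eig, [6.0, 10.0, 15.0])
```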

We next apply these results to determine the character ring of Un, the n × n unitary matrices, under some assumptions. Let Dn¹ be the diagonal matrices such that |d_ii| = 1 for i = 1 to n. Then Dn¹ is commutative. Every element of Un is similar to a matrix of Dn¹, hence the characters take the same value. Therefore any character is determined by its values on Dn¹. The group Dn¹ is (ξ1)ⁿ where ξ1 is the group of complex numbers of absolute value 1.
We have shown in an earlier exercise that every irreducible representation of a finite commutative group is 1-dimensional. This result extends to compact commutative groups (in effect we can simultaneously diagonalize the matrices). Every 1-dimensional representation is equivalent to a unitary representation, which is a homomorphism into ξ1. Such representations are products of homomorphisms ξ1 → ξ1, which are of the form x → xᵐ. Let x_i denote the homomorphism which takes (a1, a2, ..., an) to a_i. Then all 1-dimensional representations of Dn¹ are of the form x1^m1 x2^m2 ... xn^mn for m_i ∈ Z.

Therefore any character of Un can be described by a finite sum

Σ x1^m1 x2^m2 ... xn^mn p(m1, m2, ..., mn)

summed over m1, ..., mn ∈ Z, where p(m1, m2, ..., mn) is a positive integer.



EXAMPLE 5.5.7. The standard representation η : Un → Un restricted to Dn¹ is a direct sum of the n representations x1, x2, ..., xn. So its character is x1 + x2 + ... + xn.

EXAMPLE 5.5.8. The determinant d gives a representation Un -*■ U]. Its value
on Dxn is the product of the main diagonal entries. So it has character XiX2 ... xn.
Its complex conjugate has character (xx x2 ... xn )_1.

EXAMPLE 5.5.9. The fcth exterior power Xfc(M) has character. 2 . xXlx/2
... Xjk by the Proposition 5.5.4. ll lk

The character of any finite dimensional representation r of U_n must be a
function of x_1, x_2, ..., x_n which is symmetric under interchanging the x_i. The
reason is that if P is the matrix of a permutation interchanging i, j then PAP^{−1}
for A ∈ D_n^1 has the effect of interchanging the ith and jth main diagonal entries.
Therefore it interchanges x_i, x_j. This will not change the characters since P ∈ U_n
and Tr(r(P)r(A)r(P)^{−1}) = Tr(r(A)).

DEFINITION 5.5.5. The kth elementary symmetric function σ_k of variables
x_1, x_2, ..., x_n is Σ_{i_1 < i_2 < ... < i_k} x_{i_1} x_{i_2} ... x_{i_k}.

EXAMPLE 5.5.10. For n = 4 these are σ_1 = x_1 + x_2 + x_3 + x_4, σ_2 = x_1x_2 +
x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4, σ_3 = x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4 +
x_2x_3x_4, σ_4 = x_1x_2x_3x_4.
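The σ_k can be evaluated directly from Definition 5.5.5 by summing products over all k-element subsets of the variables. A small sketch (the function name is ours):

```python
from itertools import combinations
from math import prod

def elementary_symmetric(xs, k):
    # sigma_k: the sum of products over all k-element subsets of xs
    return sum(prod(c) for c in combinations(xs, k))

xs = [1, 2, 3, 4]
print([elementary_symmetric(xs, k) for k in range(1, 5)])
# sigma_1 .. sigma_4 evaluated at (1, 2, 3, 4): [10, 35, 50, 24]
```

For n = 4 this reproduces the four expressions of Example 5.5.10 evaluated at a sample point.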

THEOREM 5.5.5. Let R be a commutative ring with unit. Then any symmetric
polynomial in x_1, x_2, ..., x_n with coefficients in R is a polynomial in σ_1, σ_2, ...,
σ_n with coefficients in R.

Proof. Let p(x_1, x_2, ..., x_n) be any symmetric polynomial. For any product
T of powers of the x_i let Σ[T] denote the sum of all terms symmetric to T.
It suffices to show all the Σ[T] can be expressed as integer polynomials in
σ_1, σ_2, ..., σ_n, since every symmetric polynomial p(x) is a sum of expressions Σ[T] with
coefficients in R. If n = 1 or the degree of p(x) is 1 the theorem is true.
Assume that this result is true for m < n and for m = n and expressions
Σ[T] of degree less than the degree of p(x). If T involves every variable, Σ[T]
is divisible by σ_n = x_1x_2 ... x_n. By induction we can express

Σ[T] / σ_n

as a polynomial in the σ_i. So Σ[T] is such a polynomial.

Suppose T does not involve every variable, but omits the variable x_n. Let Σ_s[T]
denote the sum of all terms involving x_1, x_2, ..., x_{n−1} symmetric to T. Let ν_i be
Sec.5.5] Tensor Products 207

the elementary symmetric functions in xu x2,..., xn_v We have an expression


by induction Ss[r] = f(vu vn_f). Let g = 2 [r] - f(oh o2,..., a„_0.
If we substitute zero for xn we obtain 2s[F] — f(vu v2,..., y„_x) = 0. By
symmetry the same is true if we substitute zero for any xt. This means all terms
of g involve every x{. Therefore g is divisible by on and we can express it in
terms of ox, o2,..., an as before. □

EXAMPLE 5.5.11. To express p(x) = x_1²x_2 + x_1²x_3 + x_2²x_1 + x_2²x_3 + x_3²x_1 +
x_3²x_2 in σ_1, σ_2, σ_3, we first consider x_1²x_2 + x_2²x_1. This is (x_1 + x_2)x_1x_2 = ν_1ν_2.
Then take p(x) − σ_1σ_2 = −3x_1x_2x_3 = −3σ_3. Therefore p(x) = σ_1σ_2 − 3σ_3.
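A quick numerical check confirms that the p here equals σ_1σ_2 − 3σ_3 (note the minus sign: σ_1σ_2 expands to p plus 3x_1x_2x_3). Evaluating both sides at a grid of integer points:

```python
def check(x1, x2, x3):
    # left side: the symmetric polynomial of Example 5.5.11
    p = (x1**2*x2 + x1**2*x3 + x2**2*x1
         + x2**2*x3 + x3**2*x1 + x3**2*x2)
    # right side: sigma_1 * sigma_2 - 3 * sigma_3
    s1 = x1 + x2 + x3
    s2 = x1*x2 + x1*x3 + x2*x3
    s3 = x1*x2*x3
    return p == s1*s2 - 3*s3

print(all(check(a, b, c) for a in range(-3, 4)
          for b in range(-3, 4) for c in range(-3, 4)))  # True
```

Agreement on a grid this large is convincing for a fixed-degree polynomial identity, though of course Theorem 5.5.5 is what proves it.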

DEFINITION 5.5.6. The character ring of a group G is the ring of complex
functions G → C generated by the characters.

EXAMPLE 5.5.12. For G = Z_2, the character ring is the ring of all sums
n(1, 1) + m(1, −1), since (1, 1) and (1, −1) are the irreducible representations.

THEOREM 5.5.6. The character ring of U_n consists of all functions
σ_n^{−k} f(σ_1, σ_2, ..., σ_n) where k is an integer and f is an integer valued polynomial.

Proof. We have a function symmetric in x_1, x_2, ..., x_n. Write it as
(x_1x_2 ... x_n)^{−k} f(x_1, x_2, ..., x_n) where f involves only positive powers of each
x_i, by taking k large. Then f is a symmetric polynomial in x_1, x_2, ..., x_n. By
the last theorem, it has the given form.
Since Λ^i(η) has character σ_i, we can obtain any polynomial f(σ_1, σ_2, ..., σ_n)
as a sum or difference of characters of actual representations. And σ_n^{−1} is the
character of the conjugate of the determinant representation. Therefore we can
obtain all the functions described. □

It is important to note that the character ring contains negatives and
differences of representations, so not all its elements are actually characters of
a representation. A study of irreducible representations of U_n requires a study
of irreducible representations of S_n and is more complicated.

EXERCISES
In the following exercises U^1 is the multiplicative group of complex numbers e^{iθ}
of absolute value 1 and D_n^1 is the group of n × n complex diagonal matrices
whose main diagonal entries have absolute value 1. Other notation is the same as
in the preceding text.

Level 1
1. Express x_1²x_2 + x_1²x_3 + x_2²x_1 + x_2²x_3 + x_3²x_1 + x_3²x_2 as a polynomial in
σ_1, σ_2, σ_3. First study x_1²x_2 + x_2²x_1.

2. Tell why every element of the character ring of a compact group is a sum
Σ n_i χ_i where n_i ∈ Z and the χ_i are characters of irreducible representations.
Show every member is a difference of the characters of two actual
representations.
3. The irreducible representations of U^1 have the form e^{inθ}, n ∈ Z, where θ
ranges from 0 to 2π. The orthogonality relations for the characters have the
form

∫_0^{2π} e^{inθ} e^{−imθ} dθ = 0

for m ≠ n. Verify these by integration.
4. Let M be the direct sum of m copies of R and let N be the direct sum of n
copies of R. What is M ⊗ N?
5. Prove Z_m ⊗ Z_n = {0} if m and n are relatively prime. The tensor product
is taken over Z.
6. Prove Z_n ⊗ Z_n = Z_n.
7. What is Z_n ⊗ Q?
8. Prove the regular representation of a group is induced from the trivial
representation of the subgroup {e}.
9. Show the group ring C(G × H) is isomorphic to C(G) ⊗_C C(H).

Level 2
1. What is the dimension of Λ^k(η)?
2. What is the character of s²(η)?
3. Corresponding to the three irreducible representations of S_3 we have
M ⊗ M ⊗ M = Λ³(M) + s³(M) + M_3 for some M_3. By subtraction find the
dimension of M_3 if M has dimension n. For the unitary group, find its
character.
4. The tensor product can be defined more precisely as follows. Let G be the
set of all functions f: M × N → Z which are zero on all but a finite set of
ordered pairs. Let H(g) be the subgroup generated by all functions of the
form g(xr, z) − g(x, rz), g(x + y, z) − g(x, z) − g(y, z), g(x, z + w) −
g(x, z) − g(x, w). Then the tensor product is the quotient G/H(g). Let h
be the mapping M × N → G/H(g) such that h(m, n) is the class of the function
which is 1 on (m, n) and 0 on all other pairs. Prove h is bilinear.
5. Let G′ be an arbitrary abelian group and b any bilinear mapping
M × N → G′. Define r: G → G′ by r(f) = Σ b(m, n) f(m, n). Prove r is 0
on H(g) and rh = b, where M, N, G, H(g), h are the same as in Exercise 4.
6. Prove r is unique in the above exercise.
7. Show that the irreducible 2-dimensional representation of S_3 is induced
from a representation of Z_3.
8. Find a conjugation by a permutation matrix interchanging the first two
factors of D_n^1.

Level 3
1. Find a relation between Λ^n and the determinant of an n × n matrix.
2. Prove a complex representation cannot be decomposed as a direct sum if
there does not exist any matrix other than a scalar multiple of the identity
which commutes with every matrix of the representation.
3. Prove the representation Λ^k(η) of the unitary group is irreducible. (How could
its character be a sum of positive symmetric polynomials in the x_i?)
4. Try to find a relation between characters of U^1 and Fourier series (consider
Exercise 3, Level 1).
5. Prove from the definition that (M ⊕ N) ⊗ T = (M ⊗ T) ⊕ (N ⊗ T) where
M, N, and T are modules.
6. Describe an induced representation as a matrix.
7. Prove that tensor products of modules correspond to Kronecker products of
representations.
8. Find the character ring of the unitary matrices of determinant 1.
9. Prove the ring M_n(C) is a semigroup ring (of matrices having at most one
1 entry) but not a group ring, for n prime. Use the fact that a group of order
n² is commutative if n is prime.
CHAPTER 6

Field theory

A large part of field theory is concerned with systems of numbers intermediate
between Q and C but closed under addition, multiplication and division.
Examples are Q(√5), the set of all numbers a + b√5 where a, b ∈ Q, and Q(∛3),
the set of all numbers a + b∛3 + c∛9. These examples are finite dimensional
over Q. Fields of this nature are involved in the solution of algebraic equations. They
can be defined in terms of quotients of a polynomial ring F[x] by the ideal
p(x)F[x] where p(x) is an irreducible polynomial such that p(y) is zero for a
generator y of the field.
Ruler and compass constructions involve fields obtained by repeatedly
adjoining square roots to the rationals. This gives a sequence of degree 2 extensions.
Therefore a number such as ∛2, which generates a degree 3 extension, cannot be
obtained by ruler and compass constructions. As a consequence various problems
such as trisecting an angle and duplicating a cube by ruler and compass are
impossible. Galois theory associates a group to every field extension which is
generated by a complete set of roots of a polynomial. It is a group of permutations
of the x_i which give field automorphisms, like i → −i. A polynomial is
solvable by radicals if and only if its group is solvable. It can be proved in this
way that there exist polynomials of the 5th degree not solvable by extraction of
square, cube, fourth, and fifth roots.
There exists a unique finite field of any prime power order pn. It can
be obtained by an extension generated by a root of any irreducible degree n
polynomial over Zp.
A t-error correcting code is a set C of sequences of symbols from S, or words,
such that if at most t errors are made in one sequence and at most t errors are
made in another sequence the results will still be different. Therefore if at most t
errors are made the receiver can uniquely determine the correct original word,
despite the errors. These are used in computers and have other communications
applications. A code is called perfect if it requires the minimal number of digits
added to the original message allowed by a theoretical inequality. All perfect
codes have been determined. The most important infinite family are called the
Hamming codes.
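As a concrete illustration of a 1-error-correcting perfect code, here is a sketch of the binary [7, 4] Hamming code. The bit layout below (parity bits at positions 1, 2, 4) is one common convention, not necessarily the one the book adopts later; its virtue is that the syndrome of a received word is literally the binary position of the flipped bit.

```python
def hamming_encode(d3, d5, d6, d7):
    # data bits go in positions 3, 5, 6, 7 of the 7-bit word;
    # parity bits at positions 1, 2, 4 make each parity group even
    c = [0] * 8                           # index 1..7 used
    c[3], c[5], c[6], c[7] = d3, d5, d6, d7
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    return c[1:]

def hamming_correct(word):
    # XOR of the positions holding a 1 gives the error position (0 = none)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            syndrome ^= pos
    fixed = word[:]
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return fixed

code = hamming_encode(1, 0, 1, 1)
garbled = code[:]
garbled[4] ^= 1                           # flip one bit (position 5)
print(hamming_correct(garbled) == code)   # True
```

Three check digits protect four data digits, and 2³ = 7 + 1 syndromes exactly label "no error" plus the seven possible single-bit errors, which is the sense in which the code is perfect.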
[Sec. 6.1] Finite Dimensional Extensions 211

Other non-perfect but frequently effective error-correcting codes have as


the set of code words an ideal in the ring of polynomials subject to a relation
xn = 1. They are the cyclic codes. These include BCH, Reed-Solomon, and Fire
codes. The latter are designed to correct a burst of errors occurring in a short
sequence within the message block.
A Latin square is an n X n table in which each entry occurs exactly once in
every row and column. These can be used to arrange statistical testing of several
controlled factors for convenient analysis. For four or more factors orthogonal
Latin squares are used. In these every possible pair of entries can be obtained as
a pair of entries in the same location in two Latin squares. We show how to
construct orthogonal Latin squares for all orders not of the form 4n + 2, from
finite rings.
A projective plane is a system of objects called lines and points, in
which every two points determine a unique line, and every two lines have a
unique point in common. These can be constructed from finite fields for all
prime power orders, but for other orders it is not known whether they exist.
A projective plane is one kind of balanced incomplete block design. Such designs
are also used in statistics.

6.1 FINITE DIMENSIONAL EXTENSIONS


A large part of field theory was motivated by the problem of solving polynomial
equations x^n + c_1x^{n−1} + ... + c_n = 0 and studying the numbers, such as
2 + 3√5, which occur as solutions. For a given polynomial, the solutions
generate a field containing the rationals. Unlike the real or complex numbers,
the dimension of this field over the rationals will be finite. Such a field is called
a finite dimensional extension of the rationals.
Finite fields are necessarily finite extensions of a field Z_p.

DEFINITION 6.1.1. A field E is an extension of F if F ⊂ E. The degree [E : F]
of the extension is the dimension of E as a vector space over F. F is called a
subfield of E.

EXAMPLE 6.1.1. [C : R] = 2.

If [E : F] is finite then every element x of E is algebraic over F, that is, satisfies
a polynomial with coefficients in F. The reason is, the powers 1, x, x², ..., x^n
lie in an n-dimensional space, if [E : F] = n, and are therefore linearly dependent.
The polynomial of lowest degree satisfied by x (with first coefficient 1) is unique
(else take the difference of two) and cannot be factored over F, since nonzero
elements have inverses. It is called the minimum polynomial of x.

EXAMPLE 6.1.2. The minimum polynomial of √2 over the rationals is
x² − 2.
212 Field Theory [Ch.6

If E ⊃ F and S is a subset of E, the set of members of E obtained by adding,
subtracting, multiplying, and dividing finitely many times elements of S ∪ F is
itself a field, denoted F(S). It is called the field generated by S over F.

EXAMPLE 6.1.3. The number √3 generates the field {a + b√3 : a, b ∈ Q}
over Q.

DEFINITION 6.1.2. A polynomial is monic if its first coefficient is 1. It is
irreducible if it cannot be factored as a product of two polynomials of lower
degree.

THEOREM 6.1.1. If the minimum polynomial of x over F has degree n then
1, x, x², ..., x^{n−1} form a basis for F(x) over F. The value of a polynomial
p(t) for t = x is zero if and only if p(t) is divisible by the minimum polynomial
of x.

Proof. If 1, x, x², ..., x^{n−1} were linearly dependent, x would satisfy a polynomial
of lower degree. Let R be the set of all expressions a_1 + a_2x + a_3x² + ... +
a_nx^{n−1}, a_i ∈ F. Then R is closed under addition and multiplication by elements
of F. Let the minimum polynomial of x be written as x^n = −c_1x^{n−1} − ... −
c_{n−1}x − c_n. This expresses x^n as an element of R. Suppose x^k ∈ R. Then x^k is
a linear combination of 1, x, x², ..., x^{n−1}. So x^{k+1} is a linear combination of
x, x², ..., x^n; but x^n is a linear combination of 1, x, x², ..., x^{n−1}. Therefore by
induction R contains all x^k and is therefore closed under products.
Let p(x) = 0 and let f(x) be the minimum polynomial of x. Then p(t) =
q(t)f(t) + r(t) where deg r(t) < deg f(t). Setting t = x gives r(x) = 0, so r(t) = 0
by minimality of f. So f(t) divides p(t).
The ring R has no zero divisors since it is contained in a field. So cancellation
holds. For any z ∈ R, z ≠ 0, the mapping y → zy is 1-1 and linear. Since
it has kernel zero its image must have dimension n. So zy = 1 for some y. So z
has a multiplicative inverse. So R is a field. It is contained in F(x). Since it is
closed under addition, subtraction, multiplication, and division, and contains
x, it must equal F(x). □

There is a similar result which can be used to construct extension fields as
quotients of polynomial rings.

THEOREM 6.1.2. Let g(x) be an irreducible monic polynomial of degree n over F, and let
F[x] denote the ring of polynomials in x with coefficients in F. Then

F[x] / g(x)F[x]

is a field E with basis 1, x, x², ..., x^{n−1} over F, and the minimum polynomial
of x in E is g(x). If y is any element of any extension field satisfying g(y) = 0
then there exists an isomorphism E → F(y) sending x to y.

Proof. Let

E = F[x] / g(x)F[x]

Then E is a ring containing F. Any polynomial h(x) is equivalent in E to
a polynomial of degree < n = deg(g(x)), since h(x) = q(x)g(x) + r(x),
so h(x) − r(x) ∈ g(x)F[x]. Let h(x) be a nonzero polynomial of degree < n.
Since g(x) is irreducible (prime) there exist r(x), s(x) such that r(x)h(x) +
g(x)s(x) = 1. So r(x)h(x) is equivalent to 1 in E. Therefore h(x) has a
multiplicative inverse. Therefore E is a field.
Since any polynomial is equivalent to a polynomial of degree less than n,
the powers 1, x, x², ..., x^{n−1} are a spanning set. No nonzero polynomial of degree less
than n is divisible by g(x). Therefore they are independent.
Since g(x) = 0 in E but no polynomial of lower degree is zero in E unless
it is zero in F[x], g(x) is the minimum polynomial of x.
Let y be an element of an extension E_1 ⊃ F such that g(y) = 0. There is a map
from F[x] into F(y) sending any polynomial p(x) to p(y). This map is a ring homomorphism.
By the previous theorem it is onto, since 1, y, y², ..., y^{n−1} form
a basis. It sends g(x) to g(y) = 0. So g(x)F[x] is in its kernel. This gives a
mapping from

E = F[x] / g(x)F[x]

onto F(y). Since both spaces have dimension n, it is an isomorphism. □

This result gives a method of constructing many finite dimensional
extensions of F. Take any irreducible polynomial p(x). Then

F[x] / p(x)F[x]

is a finite dimensional extension.
Moreover every finite dimensional extension E_1 can be obtained by repeating
this procedure. Let y ∈ E_1 but y ∉ F. Then the subfield generated by y is
isomorphic to a field

F[x] / p(x)F[x]

Then an element of E_1 \ F(y) can be taken, and so on.
To carry this out, some criteria for irreducibility of a polynomial are needed.
The following are some well-known facts:

Ir-1. If a polynomial f(x) with integer coefficients factors over Q, it factors
into two polynomials with integer coefficients. The first and last
coefficients of the factors divide those of f(x).

Ir-2. (Eisenstein Irreducibility Criterion.) Suppose that f(x + k) for some
integer k has the property that f is monic of degree n, some prime p
divides the coefficients of x^{n−1}, x^{n−2}, ..., x, 1 and p² does not divide the
constant term. Then f is irreducible.
Ir-3. A polynomial of degree 2 or 3 is irreducible if it has no roots in F.

EXAMPLE 6.1.4. The polynomial x³ + 2x² + 2x + 2 is irreducible by the
second criterion, since 2 divides the coefficients of x², x, 1 but 2² does not
divide the constant. So we have an extension field. A basis is given by 1, x, x².
Products can be computed from the table:

        1     x                  x²
  1     1     x                  x²
  x     x     x²                 −2x² − 2x − 2
  x²    x²    −2x² − 2x − 2      2x² + 2x + 4

Inverses can be computed from the Euclidean algorithm, finding r(x), s(x)
such that r(x)p(x) + s(x)f(x) = 1.
For instance, if we want to find the inverse of x² + x + 1, divide it into
x³ + 2x² + 2x + 2. The quotient is x + 1 and the remainder is 1. Therefore
(x³ + 2x² + 2x + 2) − (x² + x + 1)(x + 1) = 1. So −(x + 1) is (x² + x + 1)^{−1} in
this field.
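The computation in Example 6.1.4 can be mechanized. Below is a minimal sketch (our own helper functions, not from the book) of arithmetic in F[x]/p(x)F[x] over Q: a polynomial is a list of coefficients, lowest degree first, and inverses come from the extended Euclidean algorithm exactly as described.

```python
from fractions import Fraction as F

ZERO = [F(0)]

def trim(p):
    # drop trailing zero coefficients, keeping at least a constant term
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def padd(a, b):
    n = max(len(a), len(b))
    a = a + [F(0)] * (n - len(a)); b = b + [F(0)] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def psub(a, b):
    return padd(a, [-x for x in b])

def pmul(a, b):
    r = [F(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] += x * y
    return trim(r)

def pdivmod(a, b):
    q, r = ZERO, trim(a[:])
    while r != ZERO and len(r) >= len(b):
        shift, c = len(r) - len(b), r[-1] / b[-1]
        t = [F(0)] * shift + [c]            # the monomial c * x^shift
        q, r = padd(q, t), trim(psub(r, pmul(t, b)))
    return q, r

def inverse_mod(h, p):
    # extended Euclid: find s with s*h = gcd(h, p) modulo p; when p is
    # irreducible the gcd is a nonzero constant, which we normalize to 1
    r0, r1, s0, s1 = p, h, ZERO, [F(1)]
    while r1 != ZERO:
        q, rem = pdivmod(r0, r1)
        r0, r1 = r1, rem
        s0, s1 = s1, psub(s0, pmul(q, s1))
    return trim([x / r0[0] for x in s0])

p = [F(2), F(2), F(2), F(1)]                # x^3 + 2x^2 + 2x + 2
h = [F(1), F(1), F(1)]                      # x^2 + x + 1
print(inverse_mod(h, p))                    # coefficients of -(x + 1)
```

Multiplying the result back by x² + x + 1 and reducing modulo p(x) returns 1, matching the hand computation.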

EXERCISES
Level 1
1. Find the minimum polynomial of √2 + √3. (Compute its first four powers
and show they are linearly dependent.)
2. Prove the third irreducibility criterion mentioned.
3. Show the polynomial x² + 1 is irreducible over R. Compute the multiplication
table for 1, x.
4. Show the extension in Exercise 3 is isomorphic to the complex numbers,
using Theorem 6.1.2.
5. Show that two different irreducible quadratic polynomials can give the same
extension field.

Level 2
1. Suppose x, y lie in extension fields of respective degrees m, n over F. Show
both lie in an extension field of degree at most mn over F. (Adjoin first x,
then y.)
2. Show that x³ + 3 is irreducible. (Any roots would have to be integers
dividing 3.)
Sec. 6.2] Applications of Extensions of the Rationals 215

3. Write out the multiplication table of 1, x, x² in

E = Q[x] / (x³ + 3)Q[x]

4. Multiply (1 + x + x²)(2 + 3x − x²) in this extension field.
5. Find an inverse of (1 + x) in E.
6. Verify that Exercise 5 gives an inverse of 1 + ∛−3.

Level 3
1. There is a finite procedure for determining if a polynomial p(x) of
degree 2 or 3 is irreducible over Z. (Try linear factors over Z whose first and
last coefficients divide those of p(x).) Is there such a procedure for higher
degrees, where some factors may not be linear? Can the coefficients of the
factors be bounded (consider the roots of p(x))?
2. Prove that C is the only finite extension field of R other than R itself. Use the fact that every
polynomial of odd degree has a real root.
3. Prove that there exist infinitely many non-isomorphic extensions of Q of
degree 2.
4. Show that x³ − 3 has a root in Q(∛3) but does not factor into linear
factors over this field.
5. Find an irreducible polynomial of degree 3 over Q which does factor into
linear factors if a single root is adjoined to Q. (Look up cubic equations;
the discriminant must be a perfect square.)

6.2 APPLICATIONS OF EXTENSIONS OF THE RATIONALS


Field theory can be used to study solvability of polynomial equations under
various restrictions. This in turn gives conditions for solving problems depending
on polynomial equations.
An example is ruler and compass constructions. Without being formal, a
unit length is given. Then it is asked whether, using an ideal (unmarked) ruler
and an ideal compass, a given geometrical figure can be produced. For example
regular triangles, squares, and pentagons can be drawn by these constructions.
Lines may be bisected and perpendiculars drawn at any point.
Choose coordinates such that the given segment has endpoints (0, 0) and
(1, 0).

THEOREM 6.2.1. A point can be constructed by ruler and compass if and only
if its coordinates can be obtained from the rational numbers by repeatedly
extracting square roots.

Proof. There exist constructions for adding, subtracting, multiplying, dividing,
and taking square roots. Thus coordinates of this form can be separately
generated as lengths. Then by laying out these lengths on the coordinate axes
we may obtain the required point.
Conversely the only way we can obtain new points is by intersecting two
lines (ruler), a line and a circle (ruler and compass), or two circles (compass).
The line must be between two points already constructed, so its coefficients
lie in the field they generate. The circle must have a radius consisting of a length
already constructed and a center already constructed, so its coefficients lie in
the field corresponding to points and lengths already generated.
The intersection of two lines is found by solving simultaneous linear
equations. Therefore it lies in the same field, since such a solution can be obtained
by adding, subtracting, multiplying, and dividing.
An intersection of a line and a circle

ax + by = c

(x − h)² + (y − k)² = r²

can be found by solving the linear equation for one variable and substituting in
the equation for the circle. This involves only a quadratic equation, whose solutions lie
in an extension field obtained by adding a square root.
The intersection of two circles

(x − a)² + (y − b)² = c²

(x − h)² + (y − k)² = r²

may be found in the same way after first subtracting one equation from the
other to obtain a linear equation. □
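The line–circle step can be made concrete: eliminating one variable leaves a quadratic, so the intersection coordinates involve the given coefficients only through +, −, ×, ÷ and a single square root. A sketch for a non-vertical line written as y = mx + t (our own parametrization, chosen to keep the algebra short):

```python
from math import sqrt

def line_circle(m, t, h, k, r):
    # intersect y = m*x + t with (x - h)^2 + (y - k)^2 = r^2;
    # substitution gives the quadratic a*x^2 + b*x + c = 0
    a = 1 + m * m
    b = 2 * (m * (t - k) - h)
    c = h * h + (t - k) ** 2 - r * r
    d = sqrt(b * b - 4 * a * c)       # the single new square root
    xs = [(-b + d) / (2 * a), (-b - d) / (2 * a)]
    return [(x, m * x + t) for x in xs]

# unit circle and the line y = x: meets at (±1/√2, ±1/√2)
print(line_circle(1, 0, 0, 0, 1))
```

Everything except the call to `sqrt` is rational arithmetic in the coefficients, which is exactly why each such step stays inside a degree 2 extension.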

COROLLARY 6.2.2. The coordinates of a point constructible by ruler
and compass lie in an extension E of the rationals Q such that there exists
Q = E_1 ⊂ E_2 ⊂ ... ⊂ E_n = E with [E_{i+1} : E_i] = 2 for each i.

Proof. An extension field obtained by adding a square root corresponds to
an irreducible quadratic. So the extension has degree 2, if it is nontrivial, by
Theorem 6.1.1. □

PROPOSITION 6.2.3. Suppose E ⊃ F ⊃ K and [E : K] is finite. Then [E : K] =
[E : F][F : K].

Proof. All the extensions involved must be finite. Let x_i be a basis for F over K
and y_j a basis for E over F. Let z ∈ E; then z = Σ a_j y_j = ΣΣ b_{ji} x_i y_j for some
a_j ∈ F, b_{ji} ∈ K. Therefore the x_i y_j span E over K. Suppose they are dependent.
Let Σ b_{ji} x_i y_j = 0. Let a_j = Σ_i b_{ji} x_i. If some b_{ji} ≠ 0 then some a_j ≠ 0. Therefore
Σ a_j y_j ≠ 0. This is false. So the x_i y_j form a basis for E over K. □

COROLLARY 6.2.4. The coordinates of a point constructible by ruler and
compass lie in an extension of the rationals of degree 2^n.

COROLLARY 6.2.5. If x is a number which satisfies an irreducible polynomial
over the rationals of degree not a power of two, then x cannot be a length or
coordinate constructible by ruler and compass.

Proof. If x were constructible, Q ⊂ Q(x) ⊂ E where [E : Q] = 2^n. So [Q(x) : Q] divides 2^n.
Therefore [Q(x) : Q] = 2^m for some m. So the minimum polynomial of x has
degree 2^m. □

This criterion suffices for most of the impossibility results for ruler and
compass construction. Trisection of an angle and duplication of a cube (construction
of a cube with twice the volume of a given cube) require solving an
irreducible cubic. Constructing a regular p-gon for p a prime requires finding
a solution to an irreducible equation of degree p − 1, so p must have the form
2^n + 1, as 2, 3, 5, 17, 257, .... Since π does not satisfy any polynomial over the
rationals it cannot be constructed, so that a circle cannot be squared.
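For cubics like y³ − 2 the impossibility argument only needs criteria Ir-1 and Ir-3: a cubic with integer coefficients is irreducible over Q exactly when it has no rational root, and any rational root must have numerator dividing the constant term and denominator dividing the leading coefficient. A sketch of that check (helper names are ours):

```python
def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def has_rational_root(coeffs):
    # coeffs = [c0, c1, ..., cn], lowest degree first, integers, c0 and cn nonzero
    c0, cn = coeffs[0], coeffs[-1]
    n = len(coeffs) - 1
    for p in divisors(c0):
        for q in divisors(cn):
            for num in (p, -p):
                # test x = num/q without leaving the integers:
                # q^n * f(num/q) = sum of c_i * num^i * q^(n-i)
                if sum(c * num**i * q**(n - i)
                       for i, c in enumerate(coeffs)) == 0:
                    return True
    return False

print(has_rational_root([-2, 0, 0, 1]))   # y^3 - 2: False, hence irreducible
print(has_rational_root([-1, -3, 0, 1]))  # y^3 - 3y - 1: also False
```

The second polynomial is the one arising from trisecting 60° in the exercises below; both failures of the rational root test establish degree 3 extensions, which Corollary 6.2.5 rules out for ruler and compass.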
We give a brief survey of a topic which would require one or more chapters
to deal with adequately, Galois theory. There is a close relationship between
solving a polynomial and studying how quantities are affected by permutations
of the roots. For instance, the coefficients

−c_1 = x_1 + x_2 + ... + x_n

c_2 = x_1x_2 + x_1x_3 + ... + x_{n−1}x_n

(−1)^n c_n = x_1x_2 ... x_n

of x^n + c_1x^{n−1} + ... + c_n = (x − x_1)(x − x_2) ... (x − x_n) are unchanged by
permutations of the x_i. By Theorem 5.5.5, any function of the x_i unchanged
under permutations of the roots can be expressed as a rational function of
c_1, c_2, ..., c_n. Therefore it can be computed. The discriminant function

Δ = Π_{i<j} (x_i − x_j) = (x_1 − x_2)(x_1 − x_3) ... (x_{n−1} − x_n)

at most changes sign under a permutation of the roots. Therefore Δ² is symmetric,
and Δ can be computed from Δ² by extracting a square root. Equations of
degrees 2, 3, 4 can be solved by first computing Δ and then a series of quantities
less and less invariant under permutations, culminating in x_1, x_2, ..., x_n, the
roots, invariant under no permutations. This is the reason that the roots of a
quadratic appear as

(−c_1 ± Δ) / 2

The ± corresponds to permuting the two roots.
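For a cubic x³ + px + q (so c_1 = 0) the symmetric quantity Δ² works out to the classical expression −4p³ − 27q², a standard fact not derived in the text. It can be spot-checked directly from a triple of roots with x_1 + x_2 + x_3 = 0:

```python
def check_discriminant(x1, x2, x3):
    # requires x1 + x2 + x3 = 0, i.e. the cubic is x^3 + p*x + q
    assert x1 + x2 + x3 == 0
    p = x1*x2 + x1*x3 + x2*x3            # coefficient of x
    q = -x1*x2*x3                        # constant term
    delta_sq = ((x1 - x2) * (x2 - x3) * (x1 - x3)) ** 2
    return delta_sq == -4*p**3 - 27*q**2

print(check_discriminant(1, 2, -3))  # True
```

Since Δ² is expressed purely in p and q, it is computable from the coefficients alone, as Theorem 5.5.5 promises; Exercise 1 of Level 3 below asks for exactly this expression.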

This method corresponds with finding a series of fields Q = F_1 ⊂ F_2 ⊂ ...
⊂ F_n = Q(x_1, x_2, ..., x_n), where each F_j is obtained from F_{j−1} by extracting an
nth root.

DEFINITION 6.2.1. The Galois group of an irreducible polynomial p(x) over
F is the group of all automorphisms σ of the field F(x_1, x_2, ..., x_n), where the
x_i are the roots of p(x), such that σ(a) = a for a ∈ F.

EXAMPLE 6.2.1. The Galois group of an irreducible quadratic is Z_2. The
automorphism interchanges

x_1 = (−c_1 + Δ)/2

with

x_2 = (−c_1 − Δ)/2

It can be verified that Δ is irrational and, like complex conjugation, this process
preserves sums and products.

EXAMPLE 6.2.2. The Galois group over Q of x³ − 2 is the symmetric group
of all permutations of ∛2, ω∛2, ω²∛2, where ω is a non-real number such
that ω³ = 1.

The Galois group of any polynomial is contained in the symmetric group
of permutations of the x_i, since if x satisfies p(x) = 0 its image under a field
automorphism must also satisfy p(x) = 0.
Recall that a group G is solvable if and only if there exist subgroups
G = N_0 ⊃ N_1 ⊃ N_2 ⊃ ... ⊃ N_m = {e} where N_i is normal in N_{i−1} and G/N_1,
N_1/N_2, ..., N_{m−1}/N_m, N_m are all commutative groups. Without loss of generality
they may be taken as cyclic groups of prime order, if G is finite.

EXAMPLE 6.2.3. Any commutative group is solvable.

EXAMPLE 6.2.4. The symmetric groups of degree 2, 3, 4 are solvable.

It is proved in Galois theory that extracting a sequence of pth roots corresponds
to a sequence of field extensions Q = F_0 ⊂ F_1 ⊂ F_2 ⊂ ... ⊂ F_n where
the Galois group of F_n over F_{k+1} is a normal subgroup of that of F_n over F_k
and the quotient group is Z_p. Moreover the converse is also true. Therefore a
polynomial is solvable by radicals if and only if its Galois group is solvable.
There exist polynomials of degree 5 whose Galois group is the symmetric group S_5.
Since S_5 is not solvable, these polynomials cannot be solved by a series of
processes involving extraction of nth roots.

EXERCISES
Level 1
1. Tell why this diagram, constructible by ruler and compass, multiplies lengths
x, y, where AC = x, DE = y, BC = 1, AE = xy.

2. Relabel it to give a method of division.

3. Tell how to add or subtract two lengths by ruler and compass.
4. Duplication of a cube means: given an edge x (set x = 1) of a cube, construct
the edge y of a cube having twice the volume. Show this leads to y³ − 2 = 0.
Prove this equation has no integer root. Therefore it is an irreducible cubic.
Therefore duplication of the cube by ruler and compass is not possible.
5. The addition formulas for sine and cosine give cos 3θ = 4cos³θ − 3cosθ. Let
3θ = 60°, that is, we want to trisect an angle of 60°. Then cos 60° = ½.
If x = cos θ, we have 4x³ − 3x = ½ or 8x³ − 6x − 1 = 0. Let y = 2x.
Then y³ − 3y − 1 = 0. Show this equation is irreducible. Therefore angles
cannot in general be trisected and a regular 9-gon cannot be constructed.

Level 2
1. Construction of a regular n-gon can be simplified by introduction of
complex numbers. Let the centre be (0, 0) and the radius 1. Then if a point
(x, y) can be constructed the number x + iy can be obtained by repeated
extraction of square roots. The converse is also true. Let one vertex be
(1, 0). The next is (cos 2π/n, sin 2π/n). Let z = cos 2π/n + i sin 2π/n. Then it
can be shown that z^n = 1. Factor z^n − 1 to prove z satisfies the equation
z^{n−1} + z^{n−2} + ... + z + 1 = 0.

2. Prove if regular m- and n-gons can be constructed, and m and n are relatively
prime, so can a regular mn-gon. We can construct angles of 2π/m, 2π/n. Show
there exist integers r, s such that 2πr/m + 2πs/n = 2π/mn.
3. Prove by repeated bisection of an angle we can construct a regular 2^n-gon.
4. Prove a regular 5-gon can be constructed, by solving z⁴ + z³ + z² + z + 1 = 0.
Write u = z + z^{−1}. Express 0 = z² + z + 1 + z^{−1} + z^{−2} as a sum u² + u + c = 0.
Thus u satisfies a quadratic. Show that z satisfies a quadratic involving u.

5. For n prime show z^{n−1} + z^{n−2} + ... + z + 1 is irreducible using the
Eisenstein Irreducibility Criterion. Let z = u + 1. Then this polynomial is

((u + 1)^n − 1) / u

Show all coefficients after u^{n−1} are divisible by n but the constant term is
precisely n. Therefore a regular p-gon is not constructible by ruler and
compass for p not of the form 2^k + 1.
6. Prove the Eisenstein Irreducibility Criterion. If f(x) = g(x)h(x), prove that
all coefficients of g, h except the first are divisible by p by reducing f, g, h
modulo p. Then show the constant term of f is divisible by p², a
contradiction.

Level 3
1. Derive the solution of the general cubic equation, as follows. We use
solvability of S_3 by the series {e} ⊂ Z_3 ⊂ S_3. Let the roots be x_1, x_2, x_3.
Begin with Δ = (x_1 − x_2)(x_2 − x_3)(x_1 − x_3). Since Δ² is a symmetric
function we can express it in terms of the coefficients c_1, c_2, c_3 of
x³ − c_1x² + c_2x − c_3 by the methods of Theorem 5.5.5. Assume that
c_1 = 0. This yields Δ².
2. Let ω be a nontrivial root of ω³ = 1, namely

ω = (−1 − i√3) / 2

Show y = x_1 + ωx_2 + ω²x_3 is sent to a multiple of itself by a power of ω
under the cyclic permutation x_1 → x_2 → x_3 → x_1. Hence y³ is
invariant under Z_3. So it belongs to the same field as Δ. Compute y³ and
write it as a function of c_1, c_2, c_3, Δ.
3. By symmetry write an expression for z = x_1 + ω²x_2 + ωx_3.
4. Since x_1 + x_2 + x_3, x_1 + ωx_2 + ω²x_3, x_1 + ω²x_2 + ωx_3 are known and

[ 1   1    1  ]
[ 1   ω    ω² ]
[ 1   ω²   ω  ]

is nonsingular, explain how x_1, x_2, x_3 may be computed by solving
simultaneous linear equations.
5. If z^{p²} = 1 but z^p ≠ 1, show z satisfies g(z) = (z^p)^{p−1} + (z^p)^{p−2} + ... +
z^p + 1 = 0. Show that this equation is irreducible. Suppose not. Use the
fact that if z is a root, so is zω for any root ω of ω^p = 1, and irreducibility
of z^{p−1} + z^{p−2} + ... + z + 1. Given the minimum polynomial f(z) of z,
add f(z) + f(ωz) + ... + f(ω^{p−1}z) to get a polynomial involving only
Sec. 6.3] Finite Fields 221

powers of zp. Then /(x)|,g-(x) since g(z) = 0. Yet this is not possible since
yp~l + yp~2 + ... + y + 1 is irreducible. So for p > 2, a regular polygon of
degree p2 cannot be constructed with ruler and compass.

6.3 FINITE FIELDS


In this section we will study fields having a finite number of elements. All such
fields are known and they are used in many combinatorial constructions, for
instance in coding theory and the design of Latin squares. The fields Z_p are
examples of finite fields, and all other finite fields turn out to be extension fields
of some field Z_p.

THEOREM 6.3.1. Every finite field F has pn elements for some prime p, and
Zp is a sub field of F.

Proof. Let F be a finite field. Then the additive group is finite, so for some
integer n, a sum of n copies of 1 is zero. We write n·1 = 0. Choose n to be the
least such positive integer. Suppose n = ab where a, b ∈ Z+, 1 < a, b < n. Then
(a·1)(b·1) = n·1 = 0, but a·1 ≠ 0 and b·1 ≠ 0. This contradicts the properties of
a field. So n is a prime p, unless n = 1. But if n = 1 then 1 = 0, which is false.
So n = p > 1.
The set S of all sums of copies of 1 will be closed under addition and
multiplication. The additive and multiplicative groups of F are finite. This means
that some repeated sum of an element will be its additive inverse. The same is true
for products. So any subset closed under addition and multiplication also contains
all additive inverses and multiplicative inverses of nonzero elements. So it is a
subfield.
The mapping a → a·1 is a ring homomorphism from Z onto S. Its kernel is
precisely the set of multiples of p. So by Theorem 4.3.5, the quotient ring Zp
is isomorphic to S.
Now F is a vector space over S. So F has a basis x1, x2, ..., xk. Then F
is in 1-1 correspondence with the set of all sums a1x1 + a2x2 + ... + akxk
where a1, a2, ..., ak ∈ {0, 1, 2, ..., p − 1}. The number of such sums is p^k.
So |F| = p^k. □

We note that the integer p = char(F).

THEOREM 6.3.2. In any field no polynomial equation of degree n can have
more than n distinct roots.

Proof. If r is a root of f(x) then f(x) = (x − r)g(x) + a where the remainder a
is a constant. Setting x = r we find a = 0. So f(x) = (x − r)g(x). Thus we can
factor off an x − r for each root r. A polynomial of degree n cannot have more
than n factors of the form (x − r). □
222 Field Theory [Ch.6

We showed in Theorem 4.4.5 that any finite subgroup of the multiplicative
group of a field is cyclic. Therefore if F is a field of order p^n its nonzero
elements form a cyclic group isomorphic to Z_(p^n − 1).
Finite fields are usually constructed as quotient rings

    F[x] / p(x)F[x]

where F = Zp and p(x) is an irreducible polynomial over Zp. The elements
1, x, x², ..., x^(n−1) form a basis if p(x) has degree n. This determines the
addition, for example in an extension of Z5 of degree 2:

    (2 + 3x) + (4 + x) = (2 + 4) + (3 + 1)x = 6 + 4x = 1 + 4x (mod 5)

Multiplication is determined by multiplying two polynomials and then using
p(x) as a relation to reduce the powers of x. For instance,

    (2 + 3x)(4 + x) = 8 + 2x + 12x + 3x² = 3 + 4x + 3x² =
    3 + 4x + 3(2) = 9 + 4x = 4 + 4x

provided that p(x) = x² − 2, so that x² = 2.
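As a quick sanity check of this arithmetic (the Python sketch below and all names in it are ours, not the book's), an element a + bx of Z5[x]/(x² − 2) can be stored as a coefficient pair:

```python
# Arithmetic in Z5[x]/(x^2 - 2), where x^2 reduces to the constant 2.
# An element a + b*x is represented as the pair (a, b).

P = 5          # characteristic
R = 2          # x^2 is replaced by this constant, from p(x) = x^2 - 2

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, then x^2 -> R
    a, b = u
    c, d = v
    return ((a * c + b * d * R) % P, (a * d + b * c) % P)

print(add((2, 3), (4, 1)))  # (1, 4), i.e. 1 + 4x
print(mul((2, 3), (4, 1)))  # (4, 4), i.e. 4 + 4x
```

The pair representation makes the reduction step explicit: the x² coefficient bd is folded back into the constant term as bd·2.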

EXAMPLE 6.3.1. Over Z2 the polynomial x² + x + 1 has no roots, hence no
factors of degree 1, and is therefore irreducible.
In the quotient ring there are four elements 0, 1, x, 1 + x. Addition and
multiplication are described below.

    +      0      1      x      1+x
    0      0      1      x      1+x
    1      1      0      1+x    x
    x      x      1+x    0      1
    1+x    1+x    x      1      0

    ×      0      1      x      1+x
    0      0      0      0      0
    1      0      1      x      1+x
    x      0      x      1+x    1
    1+x    0      1+x    1      x

For instance x² = −1 − x = 1 + x (mod 2), and x(1 + x) = x + x² =
x + 1 + x = 1.
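The tables can be checked mechanically. In this sketch (a Python illustration of ours, not the book's) an element a + bx is a pair (a, b) and x² is reduced to 1 + x:

```python
# Arithmetic in GF(4) = Z2[x]/(x^2 + x + 1): elements (a, b) mean a + b*x,
# and x^2 is reduced to 1 + x.

def add4(u, v):
    return (u[0] ^ v[0], u[1] ^ v[1])  # addition mod 2 is XOR

def mul4(u, v):
    a, b = u
    c, d = v
    lo = a * c           # constant term
    mid = a * d + b * c  # coefficient of x
    hi = b * d           # coefficient of x^2, replaced by 1 + x
    return ((lo + hi) % 2, (mid + hi) % 2)

x = (0, 1)
one = (1, 0)
print(mul4(x, x))             # x^2 = 1 + x  ->  (1, 1)
print(mul4(x, add4(one, x)))  # x(1 + x) = 1 ->  (1, 0)
```

Looping `mul4` over all four elements reproduces the multiplication table above.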

Another way to deal with the multiplicative structure of a finite field is
to construct a logarithm table. That is, find an element g which is a generator
of the cyclic group of the field, and write out all powers of g. Then two
polynomials can rapidly be multiplied simply by adding the exponents of g. We
illustrate this process for a field of order 8.

EXAMPLE 6.3.2. We find a logarithm table for the field of order 8 generated
by z such that z³ + z + 1 = 0. The powers of z are:

    z² = z²
    z³ = 1 + z
    z⁴ = z(1 + z) = z + z²
    z⁵ = z(z + z²) = z² + z³ = 1 + z + z²
    z⁶ = z(1 + z + z²) = z + z² + z³ = 1 + z + z + z² = 1 + z²
    z⁷ = 1

For instance (z + z²)(1 + z + z²) = z⁴z⁵ = z⁹ = z² since z⁷ = 1.
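The same table can be generated by repeatedly multiplying by z. In the sketch below (our own representation, not the book's) an element c0 + c1·z + c2·z² is a bit-triple:

```python
# Logarithm table for GF(8) with z^3 = 1 + z (from z^3 + z + 1 = 0).
# An element c0 + c1*z + c2*z^2 is the triple (c0, c1, c2).

def mul_z(v):
    # multiply by z: shift coefficients up, reduce z^3 -> 1 + z
    c0, c1, c2 = v
    return (c2, (c0 + c2) % 2, c1)

powers = [(0, 1, 0)]          # z^1 = z
for _ in range(6):
    powers.append(mul_z(powers[-1]))

log = {v: i + 1 for i, v in enumerate(powers)}  # exponent of each power

# multiply z^4 = z + z^2 and z^5 = 1 + z + z^2 by adding exponents mod 7
a, b = (0, 1, 1), (1, 1, 1)
k = (log[a] + log[b]) % 7 or 7
print(powers[k - 1])  # z^9 = z^2  ->  (0, 0, 1)
```

Multiplication by table lookup costs one addition of exponents instead of a polynomial product and reduction.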

The major theorem about finite fields is that there exists one and only one
field of order p^n for every prime p and positive integer n. To show existence it
suffices to show there exists at least one irreducible polynomial of degree n; then
the quotient ring has degree n over Zp.

LEMMA 6.3.3. Let R be a commutative ring with 1, in which p·1 = 0 for a
prime p. Then for any r > 0 the mapping x → x^(p^r) is a ring homomorphism,
which is the identity on n·1 for any n.

Proof.

    (x + y)^p = x^p + y^p + Σ (from r = 1 to p − 1) C(p, r) x^r y^(p−r)

where C(p, r) is the binomial coefficient. But for r = 1 to p − 1, p divides p! but
not r!(p − r)!, since p does not divide any of the factors of the latter expression.
Thus p | C(p, r). So in R, (x + y)^p = x^p + y^p.
Also, by commutativity (xy)^p = x^p y^p. And by Corollary 4.4.4, (n·1)^p = n·1.
Therefore x → x^p is a ring homomorphism which is the identity on {n·1}. But
x → x^(p^r) is the r-fold composition of x → x^p with itself. So it is also a ring
homomorphism which is the identity on {n·1}. □

EXAMPLE 6.3.3. The mapping x → x^p is an automorphism of any field of
order p^n.
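The lemma can be verified by brute force in a small case. The sketch below (our own, representing GF(9) as Z3[x]/(x² + 1)) checks that v → v³ preserves sums and products of all 81 pairs of elements:

```python
# Brute-force check that the Frobenius map v -> v^3 is a ring homomorphism
# of GF(9) = Z3[x]/(x^2 + 1), elements written as pairs (a, b) = a + b*x.

from itertools import product

P = 3

def add9(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul9(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) with x^2 = -1
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def frob(v):
    return mul9(v, mul9(v, v))  # v^3

els = list(product(range(P), repeat=2))
ok = all(frob(add9(u, v)) == add9(frob(u), frob(v)) and
         frob(mul9(u, v)) == mul9(frob(u), frob(v))
         for u in els for v in els)
print(ok)  # True
```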

In the ring of polynomials over any field we can define a derivative by
D(c0 x^n + c1 x^(n−1) + ... + cn) = n c0 x^(n−1) + (n − 1) c1 x^(n−2) + ... + c(n−1).
Then the derivative is linear: D(af + bg) = aDf + bDg. The formula
D(fg) = fDg + gDf can be checked directly for any powers x^n, x^m of x. By
linearity one can then show it holds for any f, g.

LEMMA 6.3.4. The polynomial x^(p^n) − x over a field in which p·1 = 0 has no
factors which are squares of a polynomial of positive degree.

Proof. D(fg²) = fD(g²) + g²Df = fgDg + fgDg + g²Df. So if g² divides a
polynomial then g divides its derivative. But the derivative of x^(p^n) − x is
0 − 1 = −1, which has no factor of positive degree. □

THEOREM 6.3.5. The polynomial x^(p^n) − x modulo p has no irreducible
factors of degree larger than n, but has some irreducible factor of degree n.

Proof. Suppose this polynomial has an irreducible factor f(x) of degree k larger
than n. Then form the quotient ring R_Q associated with f(x). This quotient ring
has as a basis 1, x, x², ..., x^(k−1) and is a field of order p^k where k > n. Since
f(x) | x^(p^n) − x, in this field x^(p^n) = x. But this implies that for any
k, (x^k)^(p^n) = x^k. By Lemma 6.3.3,

    (a1 x^(k−1) + a2 x^(k−2) + ... + ak)^(p^n) = a1 x^(k−1) + a2 x^(k−2) + ... + ak.

So every element of R_Q satisfies the equation y^(p^n) = y. This contradicts the
fact that in a field a polynomial of degree p^n cannot have p^k > p^n roots. So no
irreducible factors of degree greater than n can occur.
Suppose that every irreducible factor of x^(p^n) − x divides the product
x(x^p − x)(x^(p²) − x) ... (x^(p^(n−1)) − x). Then by Lemma 6.3.4 and unique
factorization of polynomials,

    x^(p^n) − x | x(x^p − x)(x^(p²) − x) ... (x^(p^(n−1)) − x)

But since

    p^n > 1 + p + p² + ... + p^(n−1) = (p^n − 1)/(p − 1)

that is impossible. So let f(x) be an irreducible factor of x^(p^n) − x which does
not divide x(x^p − x)(x^(p²) − x) ... (x^(p^(n−1)) − x). The degree of f(x) cannot
be greater than n by the first part of the proof. It will suffice to show it has
degree exactly n.
Suppose f(x) has degree k < n. The quotient

    Zp[x] / f(x)Zp[x]

has order p^k, and in it x satisfies x^(p^k) = x since the multiplicative group has
order p^k − 1. Yet also x satisfies f(x) = 0, which is its minimum polynomial
in the quotient field. Therefore f(x) | x^(p^k) − x, a factor of the product above.
But this is contrary to the assumption on f(x). □
Sec. 6.3] Finite Fields 225

EXAMPLE 6.3.4. For p = 3, n = 2 we factor x^(3²−1) − 1 = x⁸ − 1. It is
divisible by x^(3−1) − 1 = x² − 1. The quotient is x⁶ + x⁴ + x² + 1. This factors
as (x² + 1)(x⁴ + 1). The factor (x⁴ + 1) equals (x² − x − 1)(x² + x − 1). Any
of the factors x² + 1, x² − x − 1, x² + x − 1 is an irreducible polynomial
giving a field of order 9. They in fact give the same field.
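A monic quadratic over Zp is irreducible exactly when it has no root in Zp, so the three factors above can be found by a short search (this Python sketch is ours):

```python
# Find all monic irreducible quadratics x^2 + b*x + c over Z3 by checking
# that no element of Z3 is a root.

P = 3

def is_irreducible_quadratic(b, c):
    # f(x) = x^2 + b*x + c has no root in Zp
    return all((x * x + b * x + c) % P != 0 for x in range(P))

irr = [(b, c) for b in range(P) for c in range(P)
       if is_irreducible_quadratic(b, c)]
print(irr)  # [(0, 1), (1, 2), (2, 2)]
```

These pairs are x² + 1, x² + x + 2 = x² + x − 1, and x² + 2x + 2 = x² − x − 1 (mod 3), matching the three factors in the example.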

THEOREM 6.3.6. For any prime p and positive integer n, there exists a unique
field of order p^n. A field of order p^m is isomorphic to a subfield of a field of
order p^n if and only if m | n.

Proof. Existence of a field of order p^n follows from Theorem 6.3.5 together
with Theorem 6.1.2.
Let F1 and F2 be two fields of order p^n. Then in each field all elements
satisfy x^(p^n) = x, and this polynomial factors into linear factors.
Let f(x) be an irreducible factor (over Zp) of this polynomial having
degree n. Then some linear factors (x − r), r ∈ F1, and (x − s), s ∈ F2, divide
f(x) by unique factorization. So f(r) = 0 in F1 and f(s) = 0 in F2.
We next observe that the powers 1, r, r², ..., r^(n−1) are linearly independent
over Zp. Suppose not. Then

    a1 r^(n−1) + a2 r^(n−2) + ... + an = 0

where not all ai are 0. Call this polynomial g(x). Then the degree of g(x) is
less than n, so g(x) is not divisible by f(x). Since f(x) is irreducible, the g.c.d. of
f(x), g(x) is 1. Therefore f(x)u(x) + g(x)v(x) = 1 for some polynomials u, v.
Put x = r. Then 0 = 1, a contradiction.
This proves 1, r, r², ..., r^(n−1) are linearly independent. Their span has p^n
elements. So it is all of F1. So they are a basis. Likewise 1, s, s², ..., s^(n−1) are a
basis for F2.
Now define a function h : F1 → F2 by

    h(an + a(n−1) r + ... + a1 r^(n−1)) = an + a(n−1) s + ... + a1 s^(n−1).

Then h is an isomorphism of vector spaces, since 1, r, r², ..., r^(n−1) and
1, s, s², ..., s^(n−1) are bases.
Since f(r) = 0, each power r^n, r^(n+1), ... can be expressed as a linear
combination of 1, r, r², ..., r^(n−1) using the coefficients of f. And s^n, s^(n+1), ...
will be the same linear combinations of 1, s, s², ..., s^(n−1) since f(s) = 0.
Therefore h(r^i) = s^i for all i ≥ 0. Therefore for any polynomial p over Zp,
h(p(r)) = p(s). Thus h(p1(r)p2(r)) = p1(s)p2(s) = h(p1(r))h(p2(r)). Therefore
h is an isomorphism of rings from F1 to F2.
Suppose F1 of order p^m is a subfield of E, a field of order p^n. Then E is a
vector space over F1. Let a basis be x1, x2, ..., xk. Then every element of E is a
unique linear combination a1x1 + a2x2 + ... + akxk, ai ∈ F1. So E has (p^m)^k
elements, and m | n.
Conversely suppose mk = n for some k. Then x^(p^m) − x | x^(p^n) − x. Let E
be a field of order p^n. Let f(x) be an irreducible factor of x^(p^m) − x of degree

m. Then for some r ∈ E, x − r | f(x). So f(r) = 0. As above, we can show
1, r, r², ..., r^(m−1) generate a subfield of E having p^m elements. □

Finite fields are also called Galois fields. A finite field of order p^n is denoted
GF(p^n).
Many methods we have used here can also be used in the study of infinite
fields.

EXERCISES
Level 1
1. Write out the addition table of a field of order 9. The 9 elements have the
form a + bx, a, b ∈ {0, 1, 2}, and are added as polynomials over Z3.
2. Work out the multiplication table of this field. Use x² + 1 as a generating
polynomial, so that x² = −1 in this field. (We can write the elements as
a + bi, i = √−1, instead of a + bx.)
3. Find a multiplicative generator g of this field and compute its powers
g, g², ..., g⁸.

4. Use the table in Exercise 3 to find the fourth power of (1 + x).
5. Find the inverse of (1 − x) in the field of order 9.
6. In the field of order 8 use the logarithm table given in Example 6.3.2 to
multiply (1 + z) and (1 + z²).
7. In the field of order 4 describe the automorphism x → x². Thus x, 1 + x are
symmetrical.
8. Find an irreducible polynomial of degree 2 over the field of order 9 having
the form x² − a, that is, a is not a square of an element.

Level 2
1. Find an irreducible monic polynomial of degree 3 over Z3. It must satisfy
f(0) ≠ 0, f(1) ≠ 0, f(−1) ≠ 0 modulo 3.
2. Show that if an irreducible polynomial of degree n, p > n > 1, exists over
Zp then one exists having the coefficient of x^(n−1) equal to zero.
3. In a field of order 27 construct the multiplication table of 1, x, x².
4. Show that for any odd prime p there exists an irreducible quadratic over
Zp of the form x² − a. Use results on quadratic residues from Proposition
4.4.7 and Corollary 4.4.8.
5. Show that for any odd prime p of the form 6n + 1 there exists an
irreducible cubic of the form x³ − a over Zp.
6. Find an irreducible polynomial of degree 2 over GF(4). Using it find a
logarithm table for the field of order 16.

Level 3
1. Show the automorphism x → x^p generates a cyclic group Zn of
automorphisms of a field of order p^n.

2. Show every automorphism of a finite field has the form x → x^(p^k).
3. If f(x) is a monic polynomial with coefficients in Z which factors over Z,
it factors over Zp for every prime p. Do you think the converse is true?
(This is a deep question. For quadratics it holds by quadratic reciprocity.)
4. For k = 2, 3 it has been remarked that x^k + x + 1 is irreducible over Z2.
What about k = 4, 5? Try irreducible quadratic factors x² + ax + 1.
5. Show that every irreducible polynomial of degree n is a divisor of x^(p^n) − x
which does not divide x^(p^k) − x for k < n. Hence factoring x^(p^n) − x is not
in general a good method of finding irreducible polynomials.

6.4 CODING THEORY

Coding theory is concerned with methods of symbolizing data such that most
errors can be detected. This is done because correctly coded messages have a
certain form. For instance if the first digit of a number could not exceed 7, and
a 9 is received there must have been an error. Error correcting codes cannot
correct all possible errors since for any correct message any other correct message
of the same length could have been sent instead. They correct errors occurring
in only a small proportion of the digits in each group of numbers. Such codes
are widely used in computer circuitry and in digital radio communications.
The simplest error-detecting code of any interest is a parity check. Suppose
a coded message is in blocks of 7 digits 0,1. Then an extra digit is added to
each block. This digit is chosen to make the total number of ones in the block
even.

EXAMPLE 6.4.1. To the block 1110101 is added a digit 1, giving 11101011


having an even number of 1 digits.
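A minimal sketch of this parity check (the function names are ours):

```python
# Parity-check code: append one bit so that the total number of ones is even;
# a received block with odd parity must contain an error.

def encode_parity(bits):
    return bits + [sum(bits) % 2]

def has_error(block):
    return sum(block) % 2 == 1

word = [1, 1, 1, 0, 1, 0, 1]
coded = encode_parity(word)
print(coded)                 # [1, 1, 1, 0, 1, 0, 1, 1]
print(has_error(coded))      # False

corrupted = coded[:]
corrupted[2] ^= 1            # flip one digit
print(has_error(corrupted))  # True
```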

This code detects any single error in a block. If a single error is made, the
number of 1 digits is changed by 1 and therefore becomes odd. If a receiver sees
a block having an odd number of 1 digits, he knows an error has occurred.
However, if exactly 2 errors occur, the code cannot detect any error.
This is an error-detecting code. There also exist error-correcting codes
which can tell what the correct block should have been, provided that not too
many errors occurred. A trivial example is the code with blocks of length 3,
each all zeros or all ones: 0 0 0 or 1 1 1. If a single error is made, the receiver
can tell whether the original block was 1 1 1 or 0 0 0 by looking at whether
there is a majority of '0's or a majority of '1's.
To be precise we assume the message is encoded in blocks of m digits from
a set S containing q possible digits. Each block is encoded as a block of n
digits from S where n>m.

EXAMPLE 6.4.2. For the parity check m = 7, q = 2, n = 8. The set S of
possible digits is {0, 1}. A block of length 7 from {0, 1} is encoded as a block
of length 8 by adding a parity check.

A code involves a coding function from length m sequences from the set S
of symbols to length n sequences, and then a decoding function to recover the
original. However, for error detecting and correction the essential thing is the set
of encoded words, not the coding function: any 1-1 function may be chosen.
The coding and decoding functions should, however, be rapidly computable.
We assume q is a power of a prime and regard S as GF (q). Also we regard
a sequence of length m as a vector of length m over GF (q).

DEFINITION 6.4.1. Let V(n, q) denote the set of n-dimensional vectors over
GF(q). A code is a subset of V(n, q). A coding function is a 1-1 function
c : V(m, q) → V(n, q) whose image lies in the code. A decoding function is a
function d such that d(c(v)) = v for all v ∈ V(m, q).

EXAMPLE 6.4.3. For the case of a parity check, we have a coding function
fe : V(m, 2) → V(m + 1, 2) such that

    fe(v1, v2, ..., vm) = (v1, v2, ..., vm, v1 + v2 + ... + vm)

DEFINITION 6.4.2. For two vectors u = (u1, u2, ..., un), v = (v1, v2, ..., vn)
the Hamming distance H(u, v) is the number of places where the vectors differ,
i.e. |{i : ui ≠ vi}|.

EXAMPLE 6.4.4. The Hamming distance H((0 1 1 0 1), (1 1 1 0 0)) is 2, since
these vectors differ in two locations.

Note that H(u, v) = H(u — v, 0). The function H(u, 0) is sometimes called
the weight of u, for any vector u.
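In code, the Hamming distance is a one-line count (a Python sketch of ours):

```python
# Hamming distance: the number of coordinates in which two equal-length
# vectors differ.

def hamming(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

print(hamming((0, 1, 1, 0, 1), (1, 1, 1, 0, 0)))  # 2
```

The weight of u is then `hamming(u, (0,) * len(u))`.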

THEOREM 6.4.1. A code can detect up to k errors if and only if the Hamming
distance between any two encoded words is at least k + 1. It can correct up to k
errors if and only if the Hamming distance between any two encoded words is at
least 2k + 1.

Proof. We will prove the latter statement first. Suppose the distance between
any two encoded words is at least 2k + 1. Suppose that an encoded word v
is sent and that after at most k errors in transmission it becomes w. Then we
claim that v is the unique encoded word such that H(v, w) ≤ k. Suppose for
some encoded word u ≠ v, H(u, w) ≤ k. Then H(u, v) ≤ 2k. But this is contrary
to assumption. So the receiver can find the unique encoded word v such that
H(v, w) ≤ k. This must be the correct original word.

Conversely suppose the Hamming distance between two coded words a, b
is H(a, b) ≤ 2k. Form a word z intermediate between the two, which agrees
with both where they agree, and otherwise agrees with a in k of the places
where they differ and with b in the other H(a, b) − k. Then H(z, a) =
H(a, b) − k ≤ k and H(z, b) = k, so z could have arisen from either a or b.
If at most k errors occur in a word a we obtain b with H(a, b) ≤ k. So b is
not a code word under the assumption of the first statement, and the error can
be detected. If two code words have Hamming distance ≤ k then one could be
erroneously received as the other. So the code cannot always detect k errors. □

The problem is to find codes which are efficient. If n is twice m, then the
code takes up 100% more time than the original transmission. So n should be
not too much larger than m.

DEFINITION 6.4.3. The rate of a code is n/m.

EXAMPLE 6.4.5. The example 0 0 0, 1 1 1 used to encode 0, 1 has rate
3/1 = 3. It is not very efficient.

One major problem in coding theory is minimizing the rate (another is
finding coding and decoding algorithms which run quickly on a computer). The
rate is studied by various inequalities, many quite difficult to prove.
The following inequality is called the Hamming bound. It gives an upper
bound on m for given n, q, t where t is the number of errors corrected.

PROPOSITION 6.4.2. If a code can correct up to t errors, then

    q^(n−m) ≥ C(n, 0) + (q − 1)C(n, 1) + (q − 1)²C(n, 2) + ... + (q − 1)^t C(n, t)

where C(n, k) denotes the binomial coefficient.

Proof. The number of vectors at Hamming distance k from a given vector v is
C(n, k)(q − 1)^k, since we can choose k locations out of n for the errors and then
choose the new value in each location in (q − 1) ways. Thus the right-hand
side is the number of vectors at Hamming distance less than or equal to t from
a given vector.
For a code which can correct up to t errors, consider the set of vectors
within Hamming distance t of an encoded vector. The encoded vector can be
chosen in q^m ways and then the vectors with errors in

    C(n, 0) + (q − 1)C(n, 1) + ... + (q − 1)^t C(n, t)

ways. All these choices will be distinct by the proof of Theorem 6.4.1.
So we have this many distinct vectors:

    q^m (C(n, 0) + (q − 1)C(n, 1) + ... + (q − 1)^t C(n, t))

But there are only q^n vectors in all in V(n, q), so this quantity does not exceed
q^n. This implies the inequality. □
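The bound is easy to check numerically. In the sketch below (ours), equality holds for two classical perfect codes: the binary Hamming code (n = 7, m = 4, t = 1) and the binary Golay code (n = 23, m = 12, t = 3):

```python
# Hamming bound: q^(n-m) >= sum_{i=0}^{t} C(n, i) (q-1)^i.
# Equality holds exactly for perfect codes.

from math import comb

def sphere_size(n, q, t):
    # number of vectors within Hamming distance t of a fixed vector
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

print(2 ** (7 - 4), sphere_size(7, 2, 1))      # 8 8
print(2 ** (23 - 12), sphere_size(23, 2, 3))   # 2048 2048
```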

Codes in which this inequality is an equality are said to be perfect codes,
and are of special interest.
There exists a family of perfect 1-error-correcting codes, where n is of the
form

    (q^k − 1)/(q − 1)

called Hamming codes. We consider these for q = 2.

DEFINITION 6.4.4. Let A be the (2^k − 1) × k matrix such that the i-th row
of A is the number i written in binary notation.

EXAMPLE 6.4.6. For k = 3, A is the matrix

    0 0 1
    0 1 0
    0 1 1
    1 0 0
    1 0 1
    1 1 0
    1 1 1

To code a (0, 1)-vector v of length 2^k − k − 1, form a vector w of length
2^k − 1 such that v is the sequence of entries in w in locations other than the
1, 2, 4, ..., 2^(k−1) components of w.
These k components are now chosen so that wA = 0.

EXAMPLE 6.4.7. Suppose we wish to encode the vector v = (1, 1, 0, 1). Write
w = (w1, w2, 1, w4, 1, 0, 1), where the components of w are the unknown
w_(2^j) together with the entries of v.

    wA = [w4   w2   w1 + 1]

So let w4 = 0, w2 = 0, w1 = 1. The encoded vector w is (1, 0, 1, 0, 1, 0, 1).
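The encoding procedure can be sketched as follows (Python, our own names; for k = 3 the data bits go to positions 3, 5, 6, 7 and the check bits to positions 1, 2, 4):

```python
# Binary Hamming (7,4) encoder: place data bits in positions 3, 5, 6, 7,
# then choose each check bit w[2^j] so that, for every bit j, the XOR of
# w[i] over all positions i whose binary expansion has bit j set is 0
# (this is the condition wA = 0 from the text).

def hamming_encode(v):
    w = [0] * 8                      # 1-indexed; w[0] unused
    for pos, bit in zip([3, 5, 6, 7], v):
        w[pos] = bit
    for j in (0, 1, 2):              # check positions 1, 2, 4
        parity = 0
        for i in range(1, 8):
            if i != (1 << j) and (i >> j) & 1:
                parity ^= w[i]
        w[1 << j] = parity
    return tuple(w[1:])

print(hamming_encode((1, 1, 0, 1)))  # (1, 0, 1, 0, 1, 0, 1)
```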

PROPOSITION 6.4.3. For any vector u there exists a unique vector w such that
wA = 0 and wi = ui for i not of the form 2^j.

Proof. The equation wA_j = 0, for the j-th column A_j of A, is

    Σ (over h ∈ S) wh = 0

where S is the set of numbers 1 to 2^k − 1 whose j-th digit in binary notation
is 1. The only power of 2 in S is 2^(k−j), so this equation has the form

    w_(2^(k−j)) + Σ (over d ∈ S, d not a power of 2) wd = 0

where necessarily wd = ud. The unique solution is

    w_(2^(k−j)) = Σ (over those d) ud

together with wi = ui for i not of the form 2^j. □

This defines the encoding function fe.

THEOREM 6.4.4. The Hamming code can correct any 1-digit error.

Proof. Suppose that the correctly coded vector is v, and that an error in one
digit has been made, resulting in a vector w. Then v − w is a vector u having
exactly one 1. The location of this 1 is the location of the error. Therefore

    wA = (v + u)A = vA + uA = 0 + uA = uA

But if u has a 1 in place j, then uA is the j-th row of A. But this row is the
number j written in binary notation. So by reading wA in binary notation we
have j, the location of the error. By changing this digit we have corrected the
error. □

To decode a message in Hamming code, simply delete all digits w_(2^j).
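Error correction then amounts to computing the syndrome wA; a sketch (ours), where the syndrome is the XOR of the binary expansions of the positions holding a 1:

```python
# Single-error correction for the binary Hamming (7,4) code: the syndrome
# is the XOR of the (1-indexed) positions of the ones; if nonzero it is the
# position of the error, written in binary.

def syndrome(w):
    s = 0
    for i, bit in enumerate(w, start=1):
        if bit:
            s ^= i
    return s

def correct(w):
    s = syndrome(w)
    if s:
        w = list(w)
        w[s - 1] ^= 1   # flip the erroneous digit
    return tuple(w)

coded = (1, 0, 1, 0, 1, 0, 1)     # from the encoding example
garbled = (1, 0, 1, 0, 0, 0, 1)   # error introduced in position 5
print(syndrome(garbled))           # 5
print(correct(garbled) == coded)   # True
```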


It has been proved that the only perfect codes are the Hamming codes, the
trivial codes where m = 1, n = 2t + 1, which repeat a digit 2t + 1 times, a code
with q = 2, n = 23, m = 12, t = 3, and a code with q = 3, n = 11, m = 6, t = 2,
both due to Golay. The latter are associated with special permutation groups
called Mathieu groups.
C. Shannon proved that essentially random codes, for large enough block
sizes at a fixed probability p of error per digit, achieve arbitrarily low probability
e of error per digit and rates arbitrarily close to the maximum

    1 + p log₂ p + (1 − p) log₂ (1 − p)

Another bound in coding theory is that of Joshi.

THEOREM 6.4.5. A code which can detect t errors has m ≤ n − t.

Proof. Let C be a code any pair of whose members have Hamming distance at
least t + 1. Let V be the set of vectors having zeros in places t + 1, t + 2, ..., n.
Then all words c + v, c ∈ C, v ∈ V are distinct: if c1 + v1 = c2 + v2 then
c1 − c2 = v2 − v1 ∈ V, so c1 and c2 have Hamming distance ≤ t. This is a
contradiction. So these q^m · q^t words are distinct in V(n, q). So q^(m+t) ≤ q^n. □

This proof used the fact that V is a set of q^t words of weight at most t, closed
under addition.

EXERCISES
As in the preceding section, here m is the length of a message block before
encoding, q the number of possible symbols, n the length of a message block
after encoding and t the number of errors correctible.

Level 1
1. Define a modulo 3 code by adding a digit such that the sum of all digits is
divisible by 3. Encode 1 0 1 2 1 1.
2. Show the preceding code detects a single error.
3. Encode 1 0 0 0 using the Hamming code.
4. Decode 1 0 1 1 0 1 1 in the Hamming code.
5. If a message 1111111 in the Hamming code is received, how many errors
occurred (assuming at most 1 did)? Where?
6. What are the rates of the Hamming codes?
7. Prove the Gilbert-Varshamov lower bound: there exists a t-error-correcting
code over GF(q) with N code words and

    N · Σ (from i = 0 to 2t) C(n, i)(q − 1)^i ≥ q^n

Assume that N code words have been found having Hamming distance at
least 2t + 1 from each other. Show that if the inequality is false then words
at Hamming distance less than or equal to 2t from existing words do not
exhaust all words. So a new one can be added, increasing N.

Level 2
1. Construct a nonperfect code with n = (2t + 1)m, m > 1, by repeating each
block 2t + 1 times. Prove it corrects t errors.
2. Characterize the set of correctly coded words in the Hamming code.
3. Define the Hamming codes for q > 2.
4. For q > 2 prove results analogous to those given in the text for q = 2.
5. Give examples of encoding in the Hamming codes for q > 2.
6. Give examples of decoding in the Hamming codes for q > 2.
7. For a perfect code to exist,

    Σ (from i = 0 to t) C(n, i)(q − 1)^i

must be a power of q. For t = 2 give examples of n, q which are ruled out.

8. Fix t, q. Suppose a perfect code existed for n. Show that as n → ∞ the rate
would approach 1.

Level 3
1. Prove a code is perfect if and only if every sequence of length n can be
decoded by the procedure of Theorem 6.4.1.
2. What is the average Hamming distance between two words of length n?
This is the same as the average number of nonzero entries of a given word.
3. Consider a code satisfying the Gilbert-Varshamov lower bound. If t/n → 0,
does the rate approach 1?
4. Look up and write out in your own words a proof of Shannon's theorem.

6.5 CYCLIC CODES


A code is linear if it is a subspace of V(n, q). It is cyclic if in addition any
cyclic rearrangement of a code word is a code word. Cyclic codes turn out to
be ideals in V(n, q) if it is regarded as the ring

    F[x] / (x^n − 1)F[x]

Such ideals can be dealt with by specifying a generating polynomial, which can
be chosen to divide x^n − 1.
Three classes of such codes are the BCH codes, constructed by R. C. Bose
and D. K. Ray-Chaudhuri, and independently by A. Hocquenghem, in 1959, the
Reed-Solomon codes of I. S. Reed and G. Solomon, and the Fire codes of
P. Fire.
PROPOSITION 6.5.1. A code is cyclic if and only if it is an ideal in

    F[x] / (x^n − 1)F[x]

Proof. A linear subspace in this ring is an ideal if and only if it is preserved
under multiplication by x. But multiplication by x takes a0 + a1x + ... +
a(n−1)x^(n−1) to a(n−1) + a0x + a1x² + ... + a(n−2)x^(n−1), which is a cyclic
rotation. □

PROPOSITION 6.5.2. Any ideal in

    F[x] / (x^n − 1)F[x]

contains a unique monic polynomial of lowest degree. This polynomial generates
the ideal. It divides x^n − 1.

Proof. An ideal I in

    F[x] / (x^n − 1)F[x]

gives an ideal I₁ in F[x], namely all polynomials whose image lies in I. If I₁ had
two monic polynomials of lowest degree, their difference would give (after division
by its leading coefficient) a monic polynomial in I₁ of lower degree, which is a
contradiction. Let g(x) be the monic polynomial of lowest degree. The g.c.d. of
g(x) and x^n − 1 lies in I₁, so this g.c.d. cannot have degree less than that of
g(x). So it must be g(x). So g(x) | x^n − 1. If g(x) did not divide some member
of I₁, the remainder would give a similar contradiction. □
The problem is to find an ideal which has a large number of elements
and can correct or detect many errors. The number of errors detectable is
determined by the minimum Hamming distance, which equals the minimum
weight of any nonzero polynomial in the ideal. The number of members of the
ideal is q^(n−deg(g)), since the quotient ring has basis 1, x, ..., x^(deg(g)−1)
and so has q^(deg(g)) elements.
The following lemma is the means of guaranteeing that polynomials of small
weight are not in the ideal.

LEMMA 6.5.3. The determinant of

    [ z1   z1²   ...   z1^n ]
    [ z2   z2²   ...   z2^n ]
    [ ...                   ]
    [ zn   zn²   ...   zn^n ]

equals (z1 z2 ... zn) ∏ (over i > j) (zi − zj).

Proof. For n = 2 this result is easily checked. Assume it is true for all numbers
less than n. Subtract zn times column n − 1 from column n, zn times column
n − 2 from column n − 1, and so on. This will not affect the determinant. We
have

    [ z1   z1(z1 − zn)   ...   z1^(n−1)(z1 − zn) ]
    [ z2   z2(z2 − zn)   ...   z2^(n−1)(z2 − zn) ]
    [ ...                                        ]
    [ zn   0             ...   0                 ]

Expand by minors on the last row, and factor out the quantities in parentheses.
We have (−1)^(n+1) zn (z1 − zn)(z2 − zn) ... (z(n−1) − zn) times the determinant
of

    [ z1       z1²       ...   z1^(n−1)     ]
    [ ...                                   ]
    [ z(n−1)   z(n−1)²   ...   z(n−1)^(n−1) ]

By inductive hypothesis, the latter is

    (z1 z2 ... z(n−1)) ∏ (over j < i < n) (zi − zj)

Since (−1)^(n+1)(z1 − zn) ... (z(n−1) − zn) = (zn − z1) ... (zn − z(n−1)), the
determinant of the n × n matrix equals the given formula. □

EXAMPLE 6.5.1. The determinant of

    [ a   a²   a³ ]
    [ b   b²   b³ ]
    [ c   c²   c³ ]

is abc(b − a)(c − a)(c − b).

A matrix having this form is called a Vandermonde matrix.
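A quick numeric check of the n = 3 case (a Python sketch of ours):

```python
# Check Lemma 6.5.3 for n = 3: the determinant of the matrix with rows
# (z, z^2, z^3) equals z1*z2*z3 * (z2 - z1)(z3 - z1)(z3 - z2).

def det3(m):
    # cofactor expansion along the first row
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

a, b, c = 2, 3, 5
m = [[z, z * z, z ** 3] for z in (a, b, c)]
formula = a * b * c * (b - a) * (c - a) * (c - b)
print(det3(m), formula)  # 180 180
```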


The codes given by the following theorem, for g(x) of lowest degree, are
called BCH codes.

THEOREM 6.5.4. Let d, n, q ∈ Z+, d < n, and (n, q) = 1. Let F be GF(q).
Let E be a field such that F ⊆ E and there exists γ ∈ E with γ^n = 1, but no
lower power of γ is 1. Suppose g(x) is a monic polynomial over F such that
γ, γ², ..., γ^(d−1) are roots of g(x), and g(x) | x^n − 1. Then the minimum
Hamming distance between elements of g(y)R(y) is at least d. Here R(y) denotes
the quotient ring where y is the image of x, and the set g(y)R(y) is the ideal I
in R(y) generated by g(y).

Proof. Let p(x) ∈ I. Then g(x) | p(x) and p(γ) = p(γ²) = ... = p(γ^(d−1)) = 0.
We need to show that if p(x) ≠ 0 then p(x) has at least d nonzero coefficients.
Let

    p(x) = c0 + c1x + c2x² + ... + c(n−1)x^(n−1)

and let v = (c0, c1, ..., c(n−1)). Then the equations p(γ) = p(γ²) = ... =
p(γ^(d−1)) = 0 are equivalent to the matrix equation vA = 0 where A is the
matrix

    [ 1         1           ...   1              ]
    [ γ         γ²          ...   γ^(d−1)        ]
    [ ...                                        ]
    [ γ^(n−1)   γ^(2(n−1))  ...   γ^((d−1)(n−1)) ]

We will show that any (d − 1) rows of this matrix are linearly independent.

Such rows form a matrix

    [ z1       z1²       ...   z1^(d−1)     ]
    [ z2       z2²       ...   z2^(d−1)     ]
    [ ...                                   ]
    [ z(d−1)   z(d−1)²   ...   z(d−1)^(d−1) ]

where the zi are distinct members of {1, γ, γ², ..., γ^(n−1)}.
By Lemma 6.5.3, the determinant of this matrix is nonzero. Therefore every
(d − 1) rows of the matrix are linearly independent. Therefore if vA = 0 and
v ≠ 0, then v has at least d nonzero entries. This means that if v1, v2 are distinct
encoded vectors then (v1 − v2)A = 0, so v1 − v2 has at least d nonzero entries,
so H(v1, v2) ≥ d. □

We compute the numbers associated with this code as follows. The numbers
n, q are as usual. The number of errors correctible is ⌊(d − 1)/2⌋ by Theorem
6.4.1. The number m is such that there are q^m elements in the ideal; it will be
n − deg(g(x)).
The remaining question is, what choice of g(x) has minimal degree? Let
s(γ, F) denote the monic polynomial over F of least positive degree of which γ
is a root. Then s(γ, F), s(γ², F), ..., s(γ^(d−1), F) must divide g(x). So the best
choice of g(x) is their least common multiple (l.c.m.).

DEFINITION 6.5.1. Let γ be an element of an extension field of F = GF(q)
such that γ^n = 1 and n is the least such power. Then the BCH code associated
with γ is the ideal in

    F[x] / (x^n − 1)F[x]

generated by

    g(x) = l.c.m. (s(γ, F), s(γ², F), ..., s(γ^(d−1), F))

where s(γ^i, F) is the minimum polynomial of γ^i over F.

EXAMPLE 6.5.2. Let q = 2, n = 3, d = 3. Then x ∈ GF(4)\GF(2) satisfies
x³ = 1. Its minimum polynomial is x² + x + 1. This is s(x, F), and s(x², F) is
the same. So

    g(x) = l.c.m. (x² + x + 1, x² + x + 1)
         = x² + x + 1

The ideal has two code words 0 and x² + x + 1. This gives the code 000, 111,
which can correct one error.

If we take n = q − 1 we obtain the Reed-Solomon codes. For these we will
have γ ∈ F, any generator of the multiplicative group, and s(γ^i, F) = x − γ^i.
Therefore

    g(x) = ∏ (from i = 1 to d − 1) (x − γ^i)

Reed-Solomon codes have been used in photodigital memory systems.
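As an illustration (our own choice of parameters, not the book's), over GF(7) we have n = q − 1 = 6 and may take γ = 3, which is a primitive root mod 7; with d = 3 the generator is (x − 3)(x − 3²) = (x − 3)(x − 2):

```python
# Reed-Solomon generator polynomial over GF(7): q = 7, n = 6, gamma = 3,
# d = 3, so g(x) = (x - 3)(x - 3^2). Polynomials are coefficient lists,
# lowest degree first, with arithmetic mod 7.

Q = 7

def polymul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % Q
    return out

def evalp(f, x):
    return sum(c * pow(x, k, Q) for k, c in enumerate(f)) % Q

gamma, d = 3, 3
g = [1]
for i in range(1, d):
    root = pow(gamma, i, Q)
    g = polymul(g, [(-root) % Q, 1])   # multiply by (x - gamma^i)

print(g)                   # [6, 2, 1], i.e. g(x) = x^2 + 2x + 6
print(evalp(g, 3), evalp(g, 2))  # 0 0 : gamma and gamma^2 are roots
```

Every codeword g(x)m(x) then vanishes at γ and γ², which is exactly the hypothesis of Theorem 6.5.4 with d = 3, so the minimum distance is at least 3.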


A simple encoding function for BCH codes takes p(x) to p(x)g(x). This
could be decoded by dividing the answer by g(x) if no errors occurred.
However, the standard coding function takes f(x) to x^(n−m)f(x) − r(x) where
r(x) is the remainder when x^(n−m)f(x) is divided by g(x).
If errors occur, decoding is much more complicated because the error is
not a linear function of the received word. However, addition of any word in
the ideal does not change the error. E. Berlekamp in 1968 discovered a rapid but
somewhat complicated algorithm for correcting errors with BCH codes. See
L. Dornhoff and F. E. Hohn (1978).
In many cases, such as equipment failure, errors may not occur at random
but as a sequence of adjacent digits. Such an error is called a burst-error. For
given n, m, q the size of a burst-error correctible exceeds the number of random
errors correctible. Reed-Solomon codes are good for combinations of random
and burst-errors.
A class of codes specially designed for burst-errors are the Fire codes. Let β
be the maximum length of a burst which will be corrected.

DEFINITION 6.5.2. Let c ≥ β and let p(x) be an irreducible polynomial
of degree c over GF(q). Let s be the smallest positive integer such that
p(x) | x^s − 1. Assume (s, 2β − 1) = 1. Then the Fire code associated with p(x)
is the cyclic code generated by (x^(2β−1) − 1)p(x).

For a Fire code, n = s(2β − 1) and m = n − 2β − c + 1. It can be verified
that p(x) and x^(2β−1) − 1 are relatively prime and each divides x^n − 1, so
p(x)(x^(2β−1) − 1) | x^n − 1 and we have a suitable generator.

EXAMPLE 6.5.3. Let q = 2, c = 3, β = 3. Take as irreducible polynomial
x³ + x + 1. The exponent s is 7, since in the field GF(8) generated by a root
x, x⁷ = 1 and no lower power is 1.
The Fire code is generated by (x⁵ − 1)(x³ + x + 1), and n is 35, m is 27.
Bursts of length 3 can be corrected.
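The claim s = 7 can be checked by trial division over Z2; the sketch below (ours) stores polynomials as bit lists with lowest degree first:

```python
# Find the least s with p(x) | x^s - 1 over Z2, for p(x) = 1 + x + x^3,
# by computing remainders of x^s + 1 (note -1 = 1 mod 2) via long division.

def polymod2(dividend, divisor):
    # remainder of bit-list polynomials over Z2 (lowest degree first)
    r = dividend[:]
    while len(r) >= len(divisor) and any(r):
        while r and r[-1] == 0:      # drop the zero leading coefficients
            r.pop()
        if len(r) < len(divisor):
            break
        shift = len(r) - len(divisor)
        for i, c in enumerate(divisor):
            r[shift + i] ^= c        # subtract a shifted copy of divisor
    return r

p = [1, 1, 0, 1]  # 1 + x + x^3

def order(p):
    s = 1
    while True:
        f = [1] + [0] * (s - 1) + [1]  # x^s + 1
        if not any(polymod2(f, p)):
            return s
        s += 1

print(order(p))  # 7
```

The loop confirms that x^s + 1 leaves a nonzero remainder for s = 1, ..., 6 and remainder zero at s = 7, so the root of p(x) has multiplicative order 7 in GF(8).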

THEOREM 6.5.5. The Fire code corrects bursts of length ≤ β.

Proof. Suppose a code word c(x) has a burst-error b(x) during transmission so
that r(x) = c(x) + b(x) is received. Write b(x) = x7fc(x) where k(x) = 2 ^/x3
of degree at most P — 1 and k0 =£ 0, or b(x) is identically zero.
238 Field Theory [Ch.6

Then the remainders of r(x) on division by either x^(2β-1) - 1 or p(x) depend
only on b(x), since c(x) is a multiple of the generator (x^(2β-1) - 1)p(x). First divide
by x^(2β-1) - 1. This reduces all powers of x in b(x) modulo 2β - 1. So we obtain x^h k(x),
where h is the remainder when j is divided by 2β - 1 and all powers of k are
taken the same way. This is a cyclic rotation of k(x). Now h can be recovered:
either it is zero, or x^h k_0 is the first nonzero coefficient preceded cyclically
by β - 1 zero coefficients. Then we recover k(x) by multiplying by x^(-h).
So the error could only be one of the form x^(h+(2β-1)t) k(x) where
t = 0, 1, 2, ..., s - 1. It will suffice to verify that any two of these leave different
remainders on division by p(x).
Suppose not. Then p(x) divides a difference

x^(h+(2β-1)t) k(x) - x^(h+(2β-1)u) k(x),   t > u

Then

p(x) | x^h x^((2β-1)u) (x^((2β-1)(t-u)) - 1) k(x)

However, p(x) is relatively prime to x, since it is irreducible, and to k(x), since

deg(k(x)) ≤ β - 1 < deg(p(x))

So

p(x) | x^((2β-1)(t-u)) - 1

But also p(x) | x^s - 1. And s does not divide (2β - 1)(t - u), since (s, 2β - 1) = 1
and 0 < t - u < s. This is not possible: if (2β - 1)(t - u) = sw + a, where 0 < a < s
is the remainder on dividing by s, then

x^a ≡ 1 (mod p(x))

but s is the least positive integer with this property. □

EXERCISES
In the following exercises, as in the text, q is the number of possible letters or
symbols, m the length of a message block before encoding, n the length of words
in the code, d the number of errors detectible, t the number of errors correctible.
For a Fire code β is the length of a burst-error correctible, and c is the degree of a
polynomial p(x) such that p(x)(x^(2β-1) - 1) generates the code.

Level 1
1. For d = n, what is the only linear code which has Hamming distance ≥ d
between any two members?
2. For d = n — 1 what is the only code with at least two members?
3. For q = p where p is a prime number, let γ generate the extension field
GF(p^k) and be a primitive root. Give an example.
4. For q = 2, n = 5, there exists γ ∈ GF(16) satisfying γ^5 = 1. Its minimum
polynomial is x^4 + x^3 + x^2 + x + 1. What code results?


5. For q = 2, n = 7 there exists γ ∈ GF(8) satisfying γ^7 = 1 with minimum
polynomial x^3 + x + 1. Its square satisfies the same minimum polynomial,
since (x^3 + x + 1)^2 = x^6 + x^2 + 1. What is the generator? There exist 8
code words in this code. Write several.
6. Encode 0 0 1 in the code of the above exercise.
7. Decode 0101110 in this code (no errors occur).
8. Decode 1 0 1 0 1 1 1 (which has errors) by finding a code word within
Hamming distance 1.
9. For any linear code (a subspace of V(n, q)) explain why the minimum
Hamming distance between two members equals the least weight of a
member.

Level 2
1. Suppose γ generates the extension field GF(q^n) and is a primitive nth root
of unity. For d = 1, what is m for BCH codes?
2. Do the BCH codes include any with the same parameters as the Hamming
codes?
3. Why must any two of the s(γ^i, F) be equal or be relatively prime?
4. Compute a generating polynomial for a Fire code with β = 2, q = 2, n = 25,
c = 4. What is m and the rate?
5. Compute a generating polynomial for a Fire code with β = c = 3, q = 2,
n = 35. Use the same p(x).
6. Compute a generating polynomial for a Reed-Solomon code correcting one
error for n = 7. Here γ is any member of GF(8)\GF(2).
7. Encode 111 0 0 0 in the preceding code.

Level 3
1. Prove, using the idea of Theorem 6.5.4, that for a BCH code and a word
r(x) = c(x) + e(x), where c(x) is a code word and e(x) an error having
at most t nonzero coefficients, the 2t quantities S_i = r(γ^i), i = 1, 2, ..., 2t
depend only on e(x) and uniquely determine e(x). This is one basic idea in
decoding BCH codes. It reduces the problem of finding the error to solving
the 2t equations

S_i = Σ_{j=1}^{t} c_j γ^(i n_j)

for c_j, n_j.


2. Prove the g.c.d. over any field of x^i - 1 and x^j - 1 is x^k - 1, where k is
the g.c.d. of i, j.
3. Explain how to find prime polynomials of low degree over a field GF(q)
by a sieve method: to find all irreducible polynomials of degrees ≤ n, divide
each polynomial of degree p by all irreducible polynomials of degree < p. If
p(x) is irreducible, so is p(ax + b), a ≠ 0.

4. Find all irreducible monic polynomials of degrees 3, 4 over GF(3) and of
degrees 4, 5 over GF(2).
5. Construct codes using the previous polynomials.
6. Find a code correcting 2-errors, having rate less than 2.

6.6 LATIN SQUARES


Orthogonal Latin squares are used in experiments in which several factors are
tested simultaneously, and to counteract the influence of extraneous factors.
Suppose we wish to examine under which conditions a plant will grow best,
in a certain region of the country. We want to find the best nutrient, pest-control
method, amount of water needed, and amount of light needed. We will test 5
variations on each one. If we are to try all possible combinations of factors,
we would need 54 = 625 different experiments. However, with only 25 we
can arrange the experiments so that every pair of factors occur in all possible
combinations with each other. (In fact we will later show we could even do this
with 6 different factors.)
Consider the matrix

(1,1) (2,2) (3,3) (4,4) (5,5)
(2,3) (3,4) (4,5) (5,1) (1,2)
(3,5) (4,1) (5,2) (1,3) (2,4)
(4,2) (5,3) (1,4) (2,5) (3,1)
(5,4) (1,5) (2,1) (3,2) (4,3)

For each pair i, j make an experiment with nutrient i, pest-control method j,
and let the (i, j)-entry of this matrix determine the amounts of water and light
received by the plants.
The condition that every pair of factors occurs exactly once means that in
each row and in each column every number from 1 to 5 occurs exactly once as a
first number and exactly once as a second number, and that every ordered pair
(i, j) occurs somewhere in the matrix. These conditions are satisfied in the case
above.

DEFINITION 6.6.1. An n × n Latin square is a matrix whose entries are numbers
from 1 to n such that every number occurs at least once in every row and at least
once in every column.

EXAMPLE 6.6.1. Let A be

1 2 3
2 3 1
3 1 2
This is a Latin square.



DEFINITION 6.6.2. Two Latin squares A, B are orthogonal if and only if the
set of ordered pairs (a_ij, b_ij) includes every ordered pair of integers from 1 to n.

EXAMPLE 6.6.2. Let A be the matrix of first entries in the 5X5 matrix above
and B be the matrix of second entries.

1 2 3 4 5        1 2 3 4 5
2 3 4 5 1        3 4 5 1 2
3 4 5 1 2        5 1 2 3 4
4 5 1 2 3        2 3 4 5 1
5 1 2 3 4        4 5 1 2 3

Then A, B are orthogonal Latin squares.

THEOREM 6.6.1. Let n = p_1^(n_1) p_2^(n_2) ... p_t^(n_t). Let k = inf {p_i^(n_i) - 1}. Then there
exist k mutually orthogonal n × n Latin squares.

Proof. We form a ring R which is the direct product of the fields GF(p_i^(n_i)). Then R is
a commutative ring of order n having a unit (1, 1, ..., 1). For each i choose k distinct
nonzero elements x_ij from GF(p_i^(n_i)), j = 1, 2, ..., k. Then the elements
y_j = (x_1j, x_2j, ..., x_tj) have the property that y_j is invertible and for i ≠ j,
y_j - y_i is invertible. Write y_(k+1), y_(k+2), ..., y_n for the remaining elements of R.
Form k n × n Latin squares M(r) by m(r)_ij = y_r y_i + y_j. This is a 1-1
function in i and in j separately, since y_r has an inverse. Therefore each row and each
column ranges over all elements y_1, y_2, ..., y_n.
Consider M(r) and M(s). If they are not orthogonal then for some i, j, u, v,
where (i, j) ≠ (u, v), the two matrices have (i, j)-entries equal to their (u, v)-entries.
Therefore

y_r y_i + y_j = y_r y_u + y_v,   y_s y_i + y_j = y_s y_u + y_v

So

(y_r - y_s) y_i = (y_r - y_s) y_u

So y_i = y_u, since (y_r - y_s)^(-1) exists. So y_v = y_j also. This contradicts
(i, j) ≠ (u, v). □

COROLLARY 6.6.2. If n is a prime power, then there exist n - 1 mutually
orthogonal n × n Latin squares.

EXAMPLE 6.6.3. For n = 5, we take square 1 having (i, j)-entry i + j modulo
5 and square 2 having (i, j)-entry 2i + j modulo 5. This gives

0 1 2 3 4        0 1 2 3 4
1 2 3 4 0        2 3 4 0 1
2 3 4 0 1        4 0 1 2 3
3 4 0 1 2        1 2 3 4 0
4 0 1 2 3        3 4 0 1 2
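For a prime n the ring R in the proof of Theorem 6.6.1 is just Z_n, and the squares take the form m(r)_ij = ri + j mod n. This small sketch (Python, with hypothetical helper names) builds the four squares for n = 5 and checks mutual orthogonality:

```python
# n - 1 mutually orthogonal Latin squares for a prime n, via
# m(r)[i][j] = (r*i + j) mod n, for r = 1, ..., n - 1.

def latin_square(n, r):
    return [[(r * i + j) % n for j in range(n)] for i in range(n)]

def orthogonal(a, b):
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n          # all n^2 ordered pairs must occur

squares = [latin_square(5, r) for r in range(1, 5)]
```

Orthogonality follows because (r - s)i = u - v has a unique solution i modulo a prime; the check above confirms it for every pair of the four squares.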

For prime powers, this result is best possible. For any number n there
exists a Latin square, the addition table of Z_n. For n = 1, 2, 6, G. Tarry in
1900 proved there does not exist any pair of orthogonal Latin squares. In 1782
Euler had already proved there exist a pair of orthogonal Latin squares for
all n ≢ 2 (mod 4). The problem of whether orthogonal Latin squares exist for
n ≡ 2 (mod 4) was unsolved until the 1958 work of R. C. Bose, S. S. Shrikhande,
and E. T. Parker.
Many constructions for Latin squares depend on an equivalent concept,
orthogonal array.

DEFINITION 6.6.3. A k × m matrix A whose entries are taken from a set S
is an orthogonal array provided that each pair of rows is orthogonal. Two rows
A_i*, A_j* are orthogonal if and only if the ordered pairs (a_is, a_js) are distinct for
s = 1 to m.

Here we are concerned with the case |S| = n, m = n².

EXAMPLE 6.6.4. Any set of mutually orthogonal Latin squares gives an


orthogonal array if we write the entries of each matrix as a single row in the
array.

EXAMPLE 6.6.5. For the Latin squares


0 1 2        0 1 2
1 2 0        2 0 1
2 0 1        1 2 0

we obtain the array

0 1 2 1 2 0 2 0 1
0 1 2 2 0 1 1 2 0

by writing out each matrix row by row. The columns include all 9 possible
ordered pairs each exactly once.
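The flattening step of this example can be sketched directly (Python; the helper names are illustrative assumptions):

```python
# Write each Latin square row by row as one row of an orthogonal array,
# then check row orthogonality: all pairs of corresponding entries distinct.

def to_array(squares):
    n = len(squares[0])
    return [[sq[i][j] for i in range(n) for j in range(n)] for sq in squares]

def rows_orthogonal(r1, r2):
    return len(set(zip(r1, r2))) == len(r1)

A = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
B = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
arr = to_array([A, B])
```

Running this reproduces the two rows of the array above, and rows_orthogonal confirms that the 9 column pairs are all distinct.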

For two rows to be orthogonal means, as here, that the pairs of corresponding
entries of the rows run through all elements of the Cartesian product set.

THEOREM 6.6.3. There exist m mutually orthogonal n × n Latin squares if and only
if there exists an orthogonal array of size (m + 2) × n² whose entries are from
{1, 2, ..., n}.

Proof. These two matrices are orthogonal

        1 1 . . 1            1 2 . . n
        2 2 . . 2            1 2 . . n
A =     . .   . .   ,   B =  . .   . .
        n n . . n            1 2 . . n

and a matrix is a Latin square if and only if it is orthogonal to both A and B.


Therefore from m mutually orthogonal Latin squares M(i) we obtain an
(m + 2) × n² orthogonal array by writing A, B, M(1), M(2), ..., M(m) each
as a single row.
Conversely suppose we have an (m + 2) × n² orthogonal array. Then each
row must include every integer 1, 2, ..., n exactly n times. Rearrange the
columns so that row 1 is

[1 1 1 ... 1 2 2 ... 2 ... n n ... n]

Now row 2 is orthogonal to row 1. So underneath [i i ... i] the entries of row 2
are [1 2 ... n] in some order. Rearrange these so they are [1 2 ... n]. Then
rows 1, 2 are

1 1 ... 1 2 2 ... 2 ... n n ... n
1 2 ... n 1 2 ... n ... 1 2 ... n

That is, they are A, B. So the other rows, regarded as matrices, are matrices
orthogonal to A, B and to each other. So they are mutually orthogonal Latin
squares. □

EXAMPLE 6.6.6. If we add A, B to the previous orthogonal array we have

0 0 0 1 1 1 2 2 2
0 1 2 0 1 2 0 1 2
0 1 2 1 2 0 2 0 1
0 1 2 2 0 1 1 2 0

DEFINITION 6.6.4. Two Latin squares are isotopic if one can be obtained
from the other by changing the labels of elements, permuting the rows, and
permuting the columns.

EXAMPLE 6.6.7. The squares

a b c        c b a
b c a        a c b
c a b        b a c

are isotopic. Change labels a, b, c to c, b, a. Then interchange the last two rows.
Isotopy is an equivalence relation on Latin squares. The number of isotopy
classes of n × n Latin squares for small values of n is:

n Isotopy classes

1-3 1
4-5 2
6 22
7 563
8 1,676,257

EXERCISES
Level 1
1. Construct two orthogonal 3 × 3 Latin squares.
2. Construct four orthogonal 5 × 5 Latin squares.
3. Show any 3 × 3 Latin square can be written as

a b c
b c a
cab

by possibly interchanging two columns.


4. Write out the three mutually orthogonal Latin squares corresponding to
GF(4) = {0, 1, x, y = 1 + x}, where addition is modulo 2 and multiplication
satisfies x² = y, x³ = 1.
5. Give two 4 × 4 Latin squares which are not isotopic.
6. Prove that if n ≢ 2 (mod 4) and n ≢ 3, 6 (mod 9), there exist at least three
mutually orthogonal n × n Latin squares.
7. Given orthogonal arrays A, B of sizes k × n² and k × m² over sets of n, m
elements, construct a k × n²m² orthogonal array whose entries are ordered
pairs. Each row should be the Cartesian product of the corresponding rows
of A, B listed in a fixed order. Give an example for k = 2, n = 2, m = 2.

Level 2
1. Show that r rows of a Latin square can be extended to r + 1 rows if
and only if the system formed by the complements of the columns has an
SDR.
2. Prove there exist exactly two isotopy classes of 4 X 4 Latin squares.
3. Give two 5 × 5 Latin squares which are not isotopic.
4. Let G be any set with a binary operation denoted product, such that
for all a, b there exist x, y with ax = b and by = a. Show that the
multiplication table of G is a Latin square.
5. Count the number of 3 × 3 Latin squares with entries a, b, c. Count 4 × 4
Latin squares with entries labelled a, b, c, d.
6. Prove there is a correspondence between sets of t mutually orthogonal
q × q Latin squares and codes which can detect t-errors with word length
n = t + 2 where m = 2. Let M(r) be the Latin squares. Let the coding of
i, j be the sequence i, j, m(1)_ij, m(2)_ij, ..., m(t)_ij.

Level 3
1. Show there exist at most m - 1 pairwise orthogonal m × m Latin squares, for any
m.
2. For two groups G₁, G₂, when are the Latin squares arising from G₁, G₂
isotopic?
3. Show that for n a prime power ≡ 3 (mod 4) there exist two orthogonal m × m
Latin squares where m = (3n - 1)/2 (E. T. Parker). Let g generate the multi-
plicative group of GF(n). The condition on n guarantees that -1 is not a
square in GF(n). Let y_1, y_2, ..., y_((n-1)/2) be symbols not in GF(n). We
construct a 4 × m² array as follows. Take all columns of the pattern

y_i                    g^(2i)(g + 1) + x      g^(2i) + x             x
x                      y_i                    g^(2i)(g + 1) + x      g^(2i) + x
g^(2i) + x             x                      y_i                    g^(2i)(g + 1) + x
g^(2i)(g + 1) + x      g^(2i) + x             x                      y_i

where x varies through GF(n) and i independently varies through
1, 2, ..., (n - 1)/2. Add n columns

x
x
x
x

as x runs through GF(n). Add ((n - 1)/2)² columns whose entries correspond
to any pair of orthogonal (n - 1)/2 × (n - 1)/2 Latin squares whose entries are
y_1, y_2, ..., y_((n-1)/2). Note that (n - 1)/2 is odd. Prove this is an orthogonal array.

4. Construct a 10 × 10 pair of orthogonal Latin squares using the method of
the previous exercise.

6.7 PROJECTIVE PLANES AND BLOCK DESIGNS


In projective geometry, it is convenient to assume that in addition to the
ordinary points of the plane there are points at infinity, one for each class of
parallel lines. All the points at infinity form a set called the line at infinity.
With these additions, in the projective plane, any two lines intersect in a unique
point, and any two points lie on a unique line.
However, there also exist other structures with the same property such as a
projective plane having 7 lines and 7 points, 3 points on each line, and 3 lines
through each point.

DEFINITION 6.7.1. A projective plane consists of sets P, L whose elements
are respectively called points and lines, and a binary relation r from P to L called
'point p lies on line m', such that

PP1. Two distinct points lie on one and only one line.
PP2. Two distinct lines have one and only one point in common.
PP3. There exist four distinct points no three of which lie on any line.

EXAMPLE 6.7.1. Suppose we take L = {great circles on a sphere}, P = {pairs of


diametrically opposite points on a sphere}. Then P, L form a projective plane.

The standard example of a projective plane has P being the set of points of
the plane together with 1 point at infinity lying on each line through the origin,
and L the lines of the plane, together with the line at infinity, whose points are
the points at infinity. A line m in the plane contains one and only one point at
infinity, the point for the line through the origin parallel to m.

THEOREM 6.7.1. For any field or division ring F there exists a projective plane
whose points are the 1-dimensional subspaces of a 3-dimensional vector space V
over F and whose lines are the 2-dimensional subspaces of V.

Proof. The intersection of any two distinct 2-spaces must be nonzero, since
otherwise their direct sum, of dimension 4, would be contained in a three-dimensional
space. So it is a space of dimension 1, since its dimension is less than 2. Any two
distinct 1-spaces generate a 2-dimensional subspace. The 1-spaces spanned by
{(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)} satisfy the last condition. □

This theorem gives a projective plane having (p^(3k) - 1)/(p^k - 1) points and
(p^(3k) - 1)/(p^k - 1) lines, for every prime power p^k. Such planes can be characterized
by the fact that the theorem of Desargues holds in them.
This theorem states that if the following sets of points are collinear:
(a) O, A₁, A₂, (b) O, B₁, B₂, (c) O, C₁, C₂, then the three intersections (d) A₁B₁
intersected with A₂B₂, (e) A₁C₁ intersected with A₂C₂, (f) B₁C₁ intersected
with B₂C₂, are collinear.
Many other projective planes exist in which the theorem of Desargues is not
true. See Hall (1959), Hughes and Piper (1973), and Pickert (1955).

THEOREM 6.7.2. Suppose that some line in a projective plane contains n + 1
points. Then every line contains n + 1 points, every point is on n + 1 lines, and
there are n² + n + 1 points and n² + n + 1 lines.

Proof. Let P, Q, R, S be 4 points no three of which are collinear. Then no three
of the lines PQ, QR, RS, PS can pass through a single point. By symmetry,
it suffices to prove this for PQ, QR, RS. If a point X were on all three lines, then
either X = Q, so that Q, R, S are collinear, or PQ and QR would share the two
distinct points X and Q and hence coincide, so that P, Q, R are collinear; either
way we have a contradiction. This establishes a perfect duality between the points
and lines in results about projective planes.
Consider two lines m₁, m₂. Not all points belong to m₁ or m₂: otherwise P, Q, R, S
would, with, say, P, Q on m₁ and R, S on m₂. But then the intersection of PQ, RS
could not lie on either without contradiction. Let O not be on m₁ or m₂.
Then to X on m₁ we associate the point OX ∩ m₂ on m₂. This gives a one-to-one
correspondence between the points on m₁ and the points on m₂. So every line
has n + 1 points.
There is also a one-to-one correspondence from points on m₁ to lines
through O, given by X → OX. So there are n + 1 lines through O. By duality to
the result about points, there are n + 1 lines through every point.
Now the entire plane consists of O and the points other than O on
the n + 1 lines through O. This gives n² + n + 1 points. By duality there are
n² + n + 1 lines. □

The number n is called the order of the projective plane.


Projective planes in general correspond to ternary systems generalizing
fields. A ternary operation on a set S is a mapping S × S × S → S. We indicate
how this is done. First, coordinates in a set of n elements are found. Choose any
four points X, Y, O, U, no three on a line. Here O is to be the origin, OX the
x-axis, OY the y-axis, XY the line at infinity and OU the line of points (a, a).

First, set O = (0, 0), U = (1, 1) and assign coordinates (a, a) to the other
points of OU in any 1-1 fashion, where a = 2, 3, ..., n - 1. If P ∉ OU ∪ XY,
let P = (a, b), where (b, b) is the intersection of XP and OU and (a, a) is
the intersection of YP and OU. To a point at infinity on the line joining (0, 0)
and (1, m) assign the coordinate (m) (which can be considered a slope of a class
of parallel lines). Assign Y the coordinate (∞).
Then a ternary operation on 0, 1, 2, ..., n - 1 is defined by y = x · m ∘ b
if and only if the line L through (m) and (0, b) contains the point (x, y). Since
(x, y) is the intersection of L and the line joining (∞) and (x, x), a unique y exists.
This ternary operation has the following properties:

T1. 0 · m ∘ c = a · 0 ∘ c = c
T2. 1 · m ∘ 0 = m · 1 ∘ 0 = m
T3. The function a · m ∘ x is 1-1 in x for all a, m
T4. For any m1 ≠ m2 and any b1, b2 there exists a unique x such that x · m1 ∘ b1 =
x · m2 ∘ b2
T5. If a1 ≠ a2, c1, c2 are given then there exists a unique pair m, b such that
a1 · m ∘ b = c1 and a2 · m ∘ b = c2.

Conversely it can be proved that any ternary system with these five
properties determines a projective plane.

EXAMPLE 6.7.2. For projective planes associated with subspaces of a
3-dimensional vector space, the ternary ring is isomorphic to the field F with
ternary operation x · m ∘ b = xm + b.
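Properties T1-T5 are easy to spot-check for a field ternary ring of this form; the sketch below (Python, with hypothetical helper names) verifies T4 by brute force for the field Z₅:

```python
# Field ternary ring t(x, m, b) = x*m + b over Z_5, and a brute-force check
# of T4: for m1 != m2 and any b1, b2 there is a unique x with
# t(x, m1, b1) == t(x, m2, b2).

def t(x, m, b, n=5):
    return (x * m + b) % n

def check_T4(n=5):
    for m1 in range(n):
        for m2 in range(n):
            if m1 == m2:
                continue
            for b1 in range(n):
                for b2 in range(n):
                    sols = [x for x in range(n)
                            if t(x, m1, b1, n) == t(x, m2, b2, n)]
                    if len(sols) != 1:
                        return False
    return True
```

Uniqueness here is just the solvability of x(m1 - m2) = b2 - b1 in a field; geometrically, two lines with different slopes meet in one point.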

Many other systems, such as various nonassociative division rings, near-fields,
and alternative division rings, result in various other types of projective planes.
Every known projective plane has order a prime power, and the most
famous unsolved problem in projective plane theory is, do there exist projective
planes of order not a prime power? Existence of a projective plane of order n
holds if and only if there exist n - 1 pairwise orthogonal n × n Latin squares.
Projective planes may also be described in terms of the (0, 1)-matrix A
which is the matrix of the binary relation p ∈ L for points p and lines L.

PROPOSITION 6.7.3. An s × s (0, 1)-matrix A defines a projective plane if and
only if AᵀA = AAᵀ = nI + J, where J is the s × s matrix all of whose entries
are 1, for s > 3.

Proof. (AᵀA)_ij = Σ_k a_ki a_kj is the number of points k lying on both of the lines
i, j. Thus it should be 1 if i ≠ j, n + 1 if i = j. And (AAᵀ)_ij = Σ_k a_ik a_jk is the
number of lines passing through points i, j. So it should be 1 if i ≠ j, else n + 1.
Conversely the matrix conditions imply (PP1) and (PP2) and that
every line has the same number of points and every point is on the same number
of lines. By examining the special cases satisfying (PP1), (PP2) but not (PP3) we
find that none satisfies the hypotheses of the proposition. □
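The matrix condition is easy to verify for the plane of order n = 2 (the 7-point plane), whose incidence matrix can be taken as a circulant with first row 1 1 0 1 0 0 0. The sketch below (Python, hypothetical helper names) checks AᵀA = AAᵀ = 2I + J:

```python
# Incidence matrix of the 7-point projective plane as a 7 x 7 circulant,
# checked against Proposition 6.7.3 with n = 2: both products should have
# n + 1 = 3 on the diagonal and 1 elsewhere.

def circulant(first_row):
    s = len(first_row)
    return [[first_row[(j - i) % s] for j in range(s)] for i in range(s)]

def matmul(a, b):
    s = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(s)) for j in range(s)]
            for i in range(s)]

def transpose(a):
    return [list(col) for col in zip(*a)]

A = circulant([1, 1, 0, 1, 0, 0, 0])
nI_plus_J = [[3 if i == j else 1 for j in range(7)] for i in range(7)]
```

The first row has its ones at positions 0, 1, 3, whose pairwise differences mod 7 cover every nonzero residue exactly once; this is exactly what makes the off-diagonal entries of both products equal to 1.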

A balanced incomplete block design has uses similar to those of Latin
squares in statistics. It is used in the design of experiments in which it is not
convenient to test every value of factor a against every value of factor b, for
varying values of factor c. That is, a Latin square gives a pattern of experiments.

           Factor a
           1  2  3
Factor b   2  3  1
           3  1  2

where the value of factor c is in the interior. A more general balanced incomplete
block design would be

           1  2
Factor b   2  3
           3  1

Here perhaps it is feasible to make only two experiments for each value of
factor b. Only two factors, b, c are involved.

DEFINITION 6.7.2. A balanced incomplete block design (BIBD) of type
(b, v, r, k, λ) consists of a family B_i, i = 1 to b, of subsets of a set V of v
elements such that (1) |B_i| = k < v for all i, (2) |{i : x ∈ B_i}| = r for all x ∈ V,
(3) |{i : x ∈ B_i and y ∈ B_i}| = λ for all x ≠ y in V.

EXAMPLE 6.7.3. If the B_i each consist of a single distinct element we have a
BIBD (n, n, 1, 1, 0). For instance

{1}
{2}
{3}

EXAMPLE 6.7.4. A projective plane of order n is a
BIBD (n² + n + 1, n² + n + 1, n + 1, n + 1, 1).

The block B_i is generally considered as the ith set of experiments and V
is considered as the set of treatments, or varieties, being tested.

To a balanced incomplete block design is associated an incidence matrix A
of size v × b, where a_ij = 1 if variety i occurs in block j, otherwise a_ij = 0. This
matrix has the property

AAᵀ = λJ + (r - λ)I

and its column sums are each k. Conversely any v × b (0, 1)-matrix with these
properties defines a BIBD (b, v, r, k, λ).

PROPOSITION 6.7.4. For a BIBD, bk = vr, λ(v - 1) = r(k - 1).

Proof. The first equation is a count of the total number of elements of
the design, first by blocks and then by varieties. Let v₁ be a given member of V.
Consider the number of elements of the design lying in a block containing v₁ but
not themselves equal to v₁. Counting for each of the v - 1 other treatments, this
number is λ(v - 1). If we count it by the blocks, then there are r blocks
containing v₁, each having k - 1 other treatments. □
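These two identities give a quick consistency filter for proposed parameter sets; the one-line helper below (Python, a hypothetical name) checks them, for instance for the projective-plane parameters of Example 6.7.4.

```python
# Check bk = vr and lambda*(v - 1) = r*(k - 1) for BIBD parameters
# (b, v, r, k, lambda). For a projective plane of order n these are
# b = v = n^2 + n + 1, r = k = n + 1, lambda = 1.

def is_bibd_consistent(b, v, r, k, lam):
    return b * k == v * r and lam * (v - 1) == r * (k - 1)
```

For the plane of order 2 this reads 7·3 = 7·3 and 1·6 = 3·2, both true; a mismatched set such as (7, 7, 3, 4, 1) fails.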

Balanced incomplete block designs can be constructed from doubly transitive
groups.

DEFINITION 6.7.3. A group G acting on a set X is doubly transitive if for all
x ≠ w, y ≠ z in X there exists g ∈ G such that gw = y, gx = z.

EXAMPLE 6.7.5. S_n is doubly transitive.

PROPOSITION 6.7.5. Let G be any doubly transitive group acting on a set V.
Let S be a subset of V. Then the family B_i of distinct sets g(S), g ∈ G, is a BIBD.

Proof. (1) holds since |g(S)| = |S|. (2) and (3) are unaffected if we take
m copies of each block for any m. Take therefore as blocks all sets g(S) in
which each distinct block occurs a number of times equal to |{g : g(S) = S}|.
Then (2) follows from transitivity, since i ∈ g(S) if and only if h(i) ∈ h(g(S)).
And (3) follows from double transitivity, since i, j ∈ g(S) if and only if
h(i), h(j) ∈ h(g(S)). □
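For a small concrete case, take G = S₄ acting on {0, 1, 2, 3} and S = {0, 1}: the distinct images g(S) are the six 2-subsets, giving a BIBD with r = 3 and λ = 1. A sketch (Python, with hypothetical helper names):

```python
# Blocks g(S) for the doubly transitive group S_4 on {0, 1, 2, 3} with
# S = {0, 1}, and a check of the replication number r and pair count lambda.
from itertools import permutations, combinations

def blocks_from_group(points, S):
    """Distinct images g(S) over all permutations g of the point set."""
    return {frozenset(g[x] for x in S) for g in permutations(points)}

def bibd_counts(blocks, points):
    """Sets of observed r values (per point) and lambda values (per pair)."""
    r = {sum(x in B for B in blocks) for x in points}
    lam = {sum(set(p) <= B for B in blocks) for p in combinations(points, 2)}
    return r, lam

blocks = blocks_from_group(range(4), {0, 1})
```

Both counts come out constant, which is exactly conditions (2) and (3) of the definition.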

EXAMPLE 6.7.6. The group of invertible matrices acts doubly transitively on
the set of 1-spaces of any vector space. Therefore if the blocks are taken as all
k-spaces of the vector space, regarded as sets of the 1-spaces lying in them, we have
a BIBD.

There are a number of generalizations of BIBD. One type is used in the
construction of two 22 × 22 orthogonal Latin squares.

EXERCISES
In the following exercises, for balanced incomplete block designs, v is the
number of elements in the union of all blocks (rows), b is the number of blocks,
k is the number of elements per block, r is the number of blocks a given element
belongs to, and X is the number of blocks a given pair of elements belongs to.

Level 1
1. Find a 7 × 7 circulant (0, 1)-matrix having 3 ones in each row and column,
which is a projective plane. Assume the main diagonal entries are all 1 as
well as the (1, 2)-entry.
2. Do the same for a 13 × 13 matrix with 4 ones per row. This should be the
projective plane arising from GF(3).
3. Construct a BIBD (n, n, n - 1, n - 1, n - 2) for any n.
4. Construct a BIBD (C(n, k), n, C(n - 1, k - 1), k, C(n - 2, k - 2)), for any n, from all
k-subsets of a set of n elements.
5. Explain why k < v in BIBD.

Level 2
1. Show that a Veblen-Wedderburn system defines a projective plane. This is
a set having two binary operations denoted addition and multiplication such
that (1) addition is an abelian group, (2) multiplication has a unit 1 and
multiplication by any nonzero element is 1-1 on either side, (3) (a + b)m =
am + bm, (4) if r ≠ s, xr = xs + t has a unique solution x.
2. From the ternary system corresponding to a projective plane explain how
to obtain n - 1 orthogonal n × n Latin squares. Show how to do the
reverse also.
3. What are the parameters of the BIBD of all k-spaces contained in an
n-space over GF(q)?
4. Prove that in a BIBD r > λ.
5. If a system satisfies (PP1) and (PP2) for a projective plane and two distinct
lines contain respective points a, b and c, d not equal to their intersection,
show (PP3) holds.
6. Consider a system formed by a line L with n > 2 points and a single point O
not on L, together with all lines OX for X ∈ L. Show (PP1), (PP2) but not
(PP3) holds.

Level 3
1. Prove the projective planes arising from lines over GF(q) can be represented
by a circulant (0, 1)-matrix. Represent the 3-space as GF(q³) and use the
fact that its multiplicative group is cyclic.
2. Write a matrix for a projective plane of order 4.
3. Express the general condition on the first row for a circulant matrix to
represent a projective plane.

4. Prove that b ≥ v for a BIBD. Compute det (AAᵀ) and note that it would be zero if
b < v, since AAᵀ would have rank ≤ b < v.
5. For all values of b, v, k which are at most 5 such that positive integers λ, r
can be defined by bk = vr, λ(v - 1) = r(k - 1) and the inequalities
k < v, r > λ, b ≥ v hold, try to construct a BIBD.
Open problems

An open problem is a question for which no one has yet been able to provide an
answer and prove it. Most of the open problems listed here are famous, and
have been worked on for many years by many mathematicians. Partial results
have been obtained. This course provides you with enough tools to understand
the problems, so we encourage some of you to solve these problems.

1. Find the structure and number of D-classes of B_n.

2. Find a method of enumerating groups of order p^n, if p is prime and n ∈ Z⁺.
3. Find a simple proof (< 100 pages) that every nonabelian simple group has even order.
4. Two matrices A, B over Z⁺ are said to be shift equivalent if and only if
there exist matrices R, S over Z⁺ such that for some n ∈ Z⁺, RA = BR,
AS = SB, RS = Bⁿ, and SR = Aⁿ. Characterize shift equivalence in the
case of eigenvalues of multiplicity > 1.
5. What lattices are lattices of ideals in a regular ring? Of normal subgroups in
a group?
6. Classify representations of finite groups over finite fields of order not
relatively prime to the order of the group.
7. Is every finite group the Galois group of an extension of Q?
8. Find codes which approximate Shannon’s bound and can be rapidly
decoded.
9. Is there a projective plane of order other than a prime power? How many
mutually orthogonal n X n Latin squares can exist?
10. Devise improved computational methods for finding any of the following:
(i) eigenvectors, (ii) factors of Boolean matrices, (iii) factors of Z, or
polynomials, and (iv) solution of simultaneous polynomial equations.
11. Provide a useful and computable concept of complexity of a scientific
problem.
List of special symbols

CHAPTER 1

xEA, 10
F: family of sets, 10
∪_{A∈F} A, ∪_F A, 10
∩_{A∈F} A, ∩_F A, 10
A ⊆ B, B ⊇ A, 10
Ā: complement of a set A, 11
A\B (A — B): relative complement of a set B, 11
0: empty set, 11
U: universal set, 11
\A\: cardinality of a set A, 11
(a, b): ordered pair, 13
A₁ × A₂ × ... × A_n: Cartesian product, 13
R: the set of real numbers, 13
x ≡ y: congruence, 13
x ~ y: similarity, 13
S(n, k): Stirling’s number of the second kind, 15
R ∘ S: composition of relations R and S, 17
f ∘ g: composition of functions f and g, 17
ι: identity function, 17
1-1: one-to-one, 18
f~l\ inverse function of/, 18
x R y: (x, y) ∈ R, 20
Q^d: indifference relation, 21
Q^s: strict order, 21
(x̄, ȳ): ordered pair of equivalence classes, 21
8: two element {0,1} Boolean algebra, 26

x^c: complement of x (Boolean algebra), 26
Ā: complement of a Boolean matrix, 27
A ○ B: logical product, 27
Aᵀ: transpose of a matrix A (Boolean), 28, 134
A_i*: ith row of a matrix A (Boolean), 29, 134
A_*i: ith column of a matrix A (Boolean), 29, 134
I: identity matrix (Boolean), 32,132
J: matrix all of whose entries are 1 (Boolean), 32, 248
ρ_r(A): row rank of A (Boolean), 32, 140
ρ_c(A): column rank of A (Boolean), 32, 140
F: social welfare function, 33
P: domain of social welfare function, 33
W^n: n-tuples of weak orders, 33
x⁻¹: inverse of x, 40, 68

CHAPTER 2
∨: join (lattice), 41, 114
∧: meet (lattice), 41, 114
E^n: n-dimensional Euclidean space, 42, 101
x̄: equivalence class of x, 46
AB: {ab : a ∈ A, b ∈ B}, 51
S¹: semigroup with identity, 51
L: Green's relation, 51
R: Green's relation, 51
J: Green's relation, 51
D: Green's relation, 51
H: Green's relation, 51
M_n(F): the set of n × n matrices over F, 53
T_n: the set of transformations on an n-element set, 53
B_n: the set of n × n Boolean matrices, 53
F: field, 53, 119
rank(f): rank of f, 55
im(A): image of A, 59
(S, X, Z, ν, δ): finite state machine, 60
ν(s, x): next state from state s and input x, 62
δ(s, x): output from state s and input x, 62
(N, T, P, σ): phrase-structure grammar, 64
L(G): language generated by grammar G, 65
e: identity (group), 68
Z: the set of integers, 68
Q: the set of rational numbers, 68

C: the set of complex numbers, 68


S_n: symmetric group on n letters, 68
Z_m: residue classes modulo m, 69
finite field, 125
finite ring, 163
quotient ring, 174
xH: left coset of H, 72
Hx: right coset of H, 72
ABC: {abc : a ∈ A, b ∈ B, c ∈ C}, 72
a|b: a divides b, 72
G/H: quotient group, 74
R⁺: set of positive real numbers, 78
(G/N)/(H/N): quotient of two quotient groups, 79
SDR: system of distinct representatives, 84
c_g: conjugation, 93
Z^E: set of even integers, 94
Z^O: set of odd integers, 94
O_x: orbit of x, 95
[G : H]: the number of cosets of H, 95
A_n: alternating group on n letters, 97
A ⊗ B: Kronecker product, 105
φ(n): Euler's function, 111

CHAPTER 3
V: vector space, 119
Fn: F © F © ... © F, 119
I: index set, 122
(S): span of S, 123
U ⊕ V: internal direct sum, 129
dim W: dimension of W, 129
aij(m): (i, j)-entry of the mth power Am, 134
Aij: (i, j)-block of A, 134
A-1: inverse of A, 137
cof(A): cofactor of A, 145
adj(A): adjoint of A, 145
Tr (A) : trace of A, 146
x • y: inner product, 157
Cn: C ⊕ C ⊕ ... ⊕ C, 159

CHAPTER 4
R: ring, 163,164
c.d.: common divisor, 163
J: ideal, 164
R/H: quotient ring, 164
Z+: the set of positive integers, 165
V: integral domain, 165
F[x]: ring of polynomials over F, 166
Q[x]: ring of polynomials over Q, 166
g.c.d.: greatest common divisor, 166
(a, b) = 1: a and b are relatively prime, 168
E: Euclidean domain, 168
v(a): degree of polynomial, 168
x ~ y: similarity (ring), 173
ω: primitive nth root of unity, 178
Rp: division ring, 180

CHAPTER 5
F(G): group ring over F, 184
M: module, 184
R(G): group ring over R, 185
char (F): characteristic of a field, 193
K: maximal proper left ideal, 194
C(G): group ring over C, 195
Mn(C): the set of n × n matrices over C, 195
χ: group character, 197
A ® B: tensor product, 198
[G: H]: index of H in the group G, 95, 203
Λk(M): kth exterior power, 204
Sk(M): kth symmetric power, 204
Un: the set of n × n unitary matrices, 205
σk: kth elementary symmetric function, 206
rI: standard representation, 206

CHAPTER 6
BCH: Bose-Chaudhuri-Hocquenghem code, 211
[E : F]: degree of the extension, 211
[C : R]: degree of the complex numbers over the reals, 211
F(5): field generated by S over F, 212
E(x): field generated by x over E, 212

F(x): field generated by x over F, 212


Δ: discriminant function, 217
GF (n): finite field of order n, 226
V(n, q): the set of n-dimensional vectors over GF(q), 228
H(u, v): Hamming distance, 228
fE: encoding function, 228, 231
l.c.m.: least common multiple, 236
s(x, F): the minimum polynomial of x over F, 236
β: maximum length of a burst, 237
BIBD: balanced incomplete block design, 249
References

G. Birkhoff, Lattice Theory, American Mathematical Society, Providence, R.I., 1967.
A. H. Clifford and G. B. Preston, The Algebraic Theory of Semigroups, American
Mathematical Society, Providence, R. I., 1964.
M. J. Crowe, A History of Vector Analysis, University of Notre Dame Press, Notre
Dame, 1967.
T. Donellan, Lattice Theory, Pergamon Press, New York, 1968.
L. L. Dornhoff and F. E. Hohn, Applied Modern Algebra, Macmillan, New York,
1978.
M. Gardner, Mathematical games, Scientific American, 234/4 (1976), 126-130
and 234/9 (1976), 210-211.
W. Gilbert, Modern Algebra with Applications, Wiley, New York, 1976.
B. C. Griffith, V. L. Maier and A. J. Miller, Describing communication networks
through the use of matrix-based measures, preprint, 1976.
M. Hall, Jr., The Theory of Groups, Macmillan, New York, 1959.
D. R. Hughes and F. C. Piper, Projective Planes, Springer-Verlag, New York,
1973.
K. H. Kim, Boolean Matrix Theory and Applications, Marcel Dekker, New York,
1982.
R. D. Luce, Semiorders and a theory of utility discrimination, Econometrica, 24
(1956), 178-191.
G. Pickert, Projektive Ebenen, Springer, Berlin, 1955.
I. Rabinovitch, The Scott-Suppes theorem on semiorders, Journal of Mathematical
Psychology, 15 (1977), 209-212.
D. Scott, Measurement structure and linear inequalities, Journal of Mathematical
Psychology, 1 (1964), 233-247.
D. Scott and P. Suppes, Foundational aspects of theories of measurement, Journal
of Symbolic Logic, 23 (1958), 113-128.
P. Suppes and J. L. Zinnes, Basic measurement theory, in R. D. Luce, R. R. Bush
and E. Galanter, editors, Handbook of Mathematical Psychology I, Wiley,
New York, 1963.

H. C. White, An Anatomy of Kinship, Prentice-Hall, Englewood Cliffs, N.J., 1963.
H. C. White, S. A. Boorman and R. L. Breiger, Social structure from multiple
networks, I. Blockmodels of roles and positions, American Journal of
Sociology, 81 (1976), 730-790.
Index

A
Absolute value, 165
Abelian group, 70
Algebra, 179
Algebraic integer, 200
Alternating group, 97
Antisymmetric, 20
Arrow's Impossibility Theorem, 35
Artinian ring, 182
Associativity, 11, 17
Automorphism group, 72
Axiom of Choice, 24

B
Balanced incomplete block design (BIBD), 249
Basis, 123, 128
BCH code, 236
Behavior of machine, 67
Bilinear Hermitian form, 158
Bilinear function, 201
Binary relation, 13
Binary operation, 16
Blockmodel, 57
Block form, 134
Block triangular form, 136
Boolean algebra, 26
Boolean matrix, 27
Boolean vector, 32
Burst code, 237

C
Capacity, 86
Cardinality, 11
Cartesian product, 13
Cayley-Hamilton Theorem, 146
Center, 74
Central idempotent, 180
Chain, 23
Character, 197
Characteristic
  of field, 193, 221
  polynomial, 145
  subspace, 151
Circulant, 154, 155
Class, 11
Clique, 56
Cluster, 56
Code, 228
Cofactor, 145
Column rank, 140
Column space, 140
Column vector, 134
Commutative ring, 164
Commutativity, 11
Comparable, 22
Complement
  of a set, 11
  of a matrix, 27
  of a subspace, 129
Complete relation, 22
Composition
  of functions, 41
  of relations, 17
Condorcet Paradox, 32
Congruence
  of modules, 192
  of numbers, 174
  of rings, 173
  of semigroups, 46
  of vector spaces, 130
Conjugacy
  class, 93
  of elements, 93
Conjugation, 93
Containment, 10
Context-free grammar, 65
Coordinates of a projective plane, 246
Coset, 72
Cycle (permutation), 80
Cyclic code, 233
Cyclic group, 69

D
Defining relation, 47
Degree
  of extension, 211
  of permutation, 68
De Morgan's laws, 11
D-equivalent, 51
Derivation, 64
Desargues's Theorem, 247
Determinant, 47, 144
Diagonal matrix, 133
Dictatorial, 34
Dihedral group, 104
Dimension of vector space, 119, 128
Direct product, 70, 128
Direct sum, 128
Direct sum of modules, 192
Directed edge, 30
Directed graph, 30
Disjoint sets, 11
Disjoint cycles, 80
Distributive lattice, 115
Distributivity, 11
Divisibility, 166
Division ring, 180
Domain, 16
Double negative, 11
Doubly stochastic matrix, 92
Doubly transitive, 250
Duality, 100

E
Eigenvalue, 102, 150
Eigenvector, 103, 150
Element, 10
Elementary symmetric function, 206
Empty set, 11
Entry, 27
Epimorphism, 77
Equality of sets, 10
Equivalence
  class, 14
  of representations, 187
  relation, 13
Error-correcting code, 210, 227
Error-detecting code, 227
Euler-Fermat Theorem, 176
Euclidean domain, 168
Euler's function, 111, 176
Even permutation, 143
Extension field, 211
Exterior kth power, 204
External direct sum, 128

F
F-algebra, 179
Family, 10
Field, 120
Filter, 38
Finite dimensional algebra, 179
Finite dimensional extension, 211
Finite state machine, 60
Fire code, 237
Flow, 86
Ford-Fulkerson algorithm, 87
Free group, 69
Free module, 190
Free semigroup, 42
Full matrix, 86
Function, 16

G
Galois field, 226
Galois group, 218
Galois theory, 210, 217
Generators
  of a field, 212
  of a group, 69
  of an ideal, 173
  of a semigroup, 45
Gilbert-Varshamov bound, 232
Golay code, 231
Gram-Schmidt process, 188
Greatest common divisor, 166
Greatest lower bound, 41
Green's relations, 52
Group
  abstract, 67
  of symmetries, 103
  representation, 187
  ring, 185

H
Hall matrix, 84
Hall-Koenig Theorem, 90
Hamming bound, 229
Hamming code, 230
Hamming distance, 228
Hasse diagram, 24
H-equivalent, 51
Hermitian matrix, 157
Homogeneous linear equations, 127
Homomorphism
  general, 16
  of groups, 76
  of modules, 191
  of semigroups, 47
  of vector spaces, 138

I
Ideal
  generated by x, 173
  of a ring, 173
  of a semigroup, 51
Idempotency, 11
Idempotent, 180
Identity function, 17
Identity matrix, 132
Image
  blockmodel, 56
  of group, 77
  space, 139
Inclusion, 10
Incomparable, 22
Independence of irrelevant alternatives, 35
Index set, 122
Indexed family, 122
Indifference relation, 21
Indirect derivation, 64
Induced representation, 203
Induction axiom, 165
Inequality
  of Boolean matrices, 27
  of sets, 127
Infimum, 41
Injection, 77
Inner automorphism, 93
Inner product, 157
Integral domain, 165
Internal direct sum, 129
Intersection, 10
Inverse
  function, 18
  in group, 68
  in semigroup, 54
  of a matrix, 137, 145
Irreducibility criterion, 213
Irreducibility of a module, 192
Irreducibility of a polynomial, 212
Irreflexive, 20
Isometry, 101
Isomorphism
  general, 18
  of groups, 76
  of semigroups, 43
Isotopy of Latin squares, 244
Isotropy subgroup, 95

J
J-equivalent, 51
Join, 41
Jordan Canonical Form, 153
Joshi bound, 231

K
k-connected component, 92
k-cycle, 80
k-decomposable, 92
Kernel
  group, 77
  matrix, 139
Klein 4-group, 82
Kronecker product, 105, 198

L
Latin square, 240
Lattice, 41, 114, 115
Least upper bound, 41
Left ideal, 51, 173
Legendre symbol, 178
L-equivalent, 51
Linear code, 233
Linear combination, 123
Linear dependence, 122
Linear independence, 122
Linear order, 23
Linear transformation, 139
Logical product, 27
Lower triangular form, 136

M
Main diagonal, 133
Mapping, 16
Mathieu group, 231
Matrix, 131
Maximal element, 22
Mealy machine, 60
Meet, 41
Member, 10
Minimal element, 22
Minimum polynomial, 211
Module, 190
Monic polynomial, 212
Monoid, 61
Monomorphism, 77
Moore machine, 60
Multiplicity, 151

N
N-ary operation, 18
N-ary relation, 18
Network, 86
Newton's formulas, 156
Nilpotent matrix, 155
Nondictatoriality, 34
Nonsingular bilinear form, 188
Nonterminal, 64
Normal matrix, 157
Normal subgroup, 72
Normality, 158
Null set, 11
Null space, 139
O
One-to-one function, 18
One-to-one correspondence, 18
Onto function, 18
Opposite semigroup, 44
Orbit, 95
Order
  of an element, 81
  of a group, 68
  of a projective plane, 247
  of a set, 11
Ordered ring, 164
Orthogonal array, 242
Orthogonal group, 71
Orthogonal Latin squares, 241
Orthogonal matrix, 101
Orthogonal vectors, 158
Orthogonality relations, 201

P
Pareto condition, 35
Parity check, 227
Partial function, 16
Partial order, 20
Partition, 14
Perfect code, 230
Permanent, 92
Permutation
  matrix, 84
  of a set, 40, 79
Petersen graph, 99
Phrase-structure grammar, 64
Polar decomposition, 162
Poset, 22
Preorder, 21
Prime, 168
Primitive group, 94
Primitive root of unity, 178, 189
Principal ideal, 51, 176
Principal ideal domain, 194
Product
  of Boolean matrices, 28
  of matrices, 132
Production, 64
Projective plane, 246
Proper ideal, 175
Proper subset, 10

Q
Quadratic form, 161
Quadratic residue, 177
Quadratic reciprocity, 179
Quasiorder, 21
Quaternion
  group, 75
  ring, 125
Quotient
  group, 74
  map, 77
  ring, 173
  semigroup, 46
Quotient vector space, 130

R
Rank of a Boolean matrix, 32
Rank of a matrix, 140
Rank of a transformation, 55
Rate, 229
Reachability matrix, 32
Rectangle of zero, 85
Rectangular band, 45
Reed-Solomon code, 237
Reflection, 101
Reflexivity, 13
Regular element, 53, 180
Regular grammar, 65
Regular representation, 112
Regular matrix representation, 195
Relation
  binary, 13
  in a semigroup, 45
Relative complement, 11
Relatively prime, 163, 168
R-equivalent, 51
Right ideal, 51, 173
Ring
  general, 164
  with unit, 164
Row
  basis, 32
  rank, 140
  rank (Boolean), 32
  space, 140
  vector, 32
Ruler and compass construction, 215
Russell's Paradox, 11

S
Sandwich matrix, 55
Scalar, 121
Schröder-Bernstein Theorem, 127
SDR, 84
Semidirect product, 72
Semigroup
  general, 39, 40
  of a machine, 61
Semisimple ring, 181, 196
Set, 9
Shannon's Theorem, 231
Sign of permutation, 143
Similar matrices, 149
Simple group, 74
Simple ring, 179
Social welfare function, 33
Span, 123
Starting symbol, 64
Strict partial order, 20
Subfield, 211
Subgroup, 69
Sublattice, 115
Submodule, 192
Subsemigroup, 42
Subset, 10
Subspace, 122
Surjection, 77
Supremum, 41
Sylow subgroup, 74
Symmetric group, 68
Symmetric kth power, 204
Symmetric matrix, 134
Symmetry, 98
System of distinct representatives, 83, 84

T
Tensor product, 201
Terminal, 64
Ternary ring, 248
Total order, 23
Trace, 146
Transformation, 41
Transitive group, 94
Transitivity, 9, 13
Translation, 101
Transpose, 28
Transposition, 83
Trichotomy, 22
Trivial ideal, 175
Two-sided ideal, 51, 173

U
Ultrafilter, 38
Union, 10
Unit, 166
Unitary matrix, 120, 157
Universal domain, 34
Universal matrix, J, 32
Universal object, 201
Universal property of direct sums, 131
Universal property of free semigroups, 50
Universal property of tensor products, 201
Upper triangular form, 136

V
Value of flow, 86
Van der Waerden conjecture, 92
Vector space, 121
Vertex, 30

W
Weak order, 33
Wedderburn Theorem, 180, 181
Weight, 228
Wilson's Theorem, 178
Word, 42
Wreath product, 98

Z
Zero matrix, 132
Zorn's Lemma, 24
Ki Hang Kim is Professor of Mathematics at the
Alabama State University, Alabama, USA. He is a
graduate of the University of Southern Mississippi
with a B.S. and M.S. in Mathematics, to which he
added an M.Phil. and Ph.D. in Mathematics from
the George Washington University. His Ph.D.
dissertation was written under Gian-Carlo Rota
of Massachusetts Institute of Technology.
His first post was as an Instructor at the University
of Hartford, Connecticut, from 1961 to 1966. He
was Lecturer at the George Washington University
for two years, and Associate Professor of St.
Mary's College of Maryland for a further two
years. He became Associate Professor at Pembroke
State University of North Carolina in 1970, and
Research Professor in Alabama State University in
1974. He was also Visiting Professor at the Instituto
de Física e Matemática, Lisbon, at the University
of Stuttgart and at the Korean Academy of
Sciences in Pyongyang: he is at present also
Director of the Kiro Research Group, Montgomery.
His research interests are discrete mathematics and
mathematical social sciences.

Fred W. Roush is Assistant Professor of Mathematics
at Alabama State University. He graduated
from the University of North Carolina with an
A.B. in Mathematics in 1966, and a Ph.D. in
Mathematics from Princeton University in 1972.
He joined the University of Georgia in 1970 as an
Assistant Professor, moving to Alabama in 1974.
He is a Research Associate with the Kiro Research
Group: his research interests are also discrete
mathematics and mathematical social sciences.
Professor Kim is Managing Editor of the International
Journal of Mathematical Social Sciences,
of which publication Professor Roush is a member
of the Editorial Board. Both authors have written
previous books and numerous papers in their
field of mathematical studies.
MATHEMATICAL METHODS FOR MATHEMATICIANS, PHYSICAL SCIENTISTS AND
ENGINEERS
J. DUNNING-DAVIES, Senior Lecturer in Applied Mathematics, University of Hull
Maintains a strong practical flavour, concentrating on techniques rather than mathematical
background, an approach favoured by students who find the abstract theory easier to grasp
when they are in possession of sound practical information and have matured as
mathematicians. Features the use of index notation, which simplifies many manipulations
in the sections on vectors and tensors. Introduces functions of a complex variable
emphasising contour integration.

MATHEMATICS AND STATISTICS FOR THE BIOSCIENCES


C. EASON, G. W. COLES and G. GETTINBY, Department of Mathematics, University of
Strathclyde
"a useful text to students of biological sciences" - G. J. McLachlan, University of Queensland, Australia, in
Biometrics.
"appropriate for mathematically-inclined advanced undergraduate and graduate students in the life sciences"
— Choice (USA).

APPLIED LINEAR ALGEBRA


R. J. GOULT, Department of Mathematics, Cranfield Institute of Technology, Cranfield, Beds.
"unpretentious and highly readable ... attractive non-encyclopedic package ... fascinating mix of topics,
unusual in books at this level" - American Mathematical Monthly.
"a concise textbook ... readable and detailed style ... applications numerous and broad in scope" —
Choice (USA).

MATHEMATICAL MODELS IN THE SOCIAL, MANAGEMENT AND LIFE SCIENCES


D. N. BURGHES, University of Exeter, and A. D. WOOD, Department of Mathematics,
Cranfield Institute of Technology, Cranfield, Beds.
"very well written and has a lot to offer" - R. J. Reed, Lecturer in Statistics, University of Warwick, in The
Times Higher Education Supplement.
"well laid-out and easy to read and dip into. It will appeal primarily to the non-mathematical students it is
intended for, and should prove a useful and popular addition to libraries" — Allan Grady, in The
Mathematical Gazette.

COMPUTATIONAL MATHEMATICS
T. R. F. NONWEILER, Department of Mathematics, Victoria University of Wellington, New
Zealand
An introduction to the mathematical techniques of numerical analysis for those with a
particular interest in performing calculations, whether by using pocket calculators,
microcomputers, or by large digital computers. Covers the familiar ground of introductory
courses in this subject and embodies the attitude that numerical analysis is an experimental
branch of mathematics. Thus the text is free of algebra and algorithms, and describes the wider
context of the elementary mathematical methods.

COMPUTATIONAL GEOMETRY FOR DESIGN AND MANUFACTURE


I. D. FAUX and M. J. PRATT, Cranfield Institute of Technology, Cranfield, Beds.
"the authors get directly to the point... they deserve high marks" - J. Alan Adams, US Naval Academy in
Computer-Aided Design.
"They have succeeded ... an excellent book ... applied mathematics at its best" — F. V. Pohle in Applied
Mechanics Review (USA).

published by
ELLIS HORWOOD LIMITED
Publishers Chichester

9780853125631
Ellis Horwood Library Edition ISBN 0-85312-
Ellis Horwood Student Edition ISBN 0-85312-
Halsted Press Paperback Edition ISBN 0-470-