Contents
1 Topics in Number Theory
1.1 Subgroups of the Integers
1.2 Greatest Common Divisors
1.3 The Euclidean Algorithm
1.4 Prime Numbers
1.5 The Fundamental Theorem of Arithmetic
1.6 The Infinitude of Primes
1.7 Congruences
1.8 The Chinese Remainder Theorem
1.9 The Euler Totient Function
1.10 The Theorems of Fermat, Wilson and Euler
1.11 Solutions of Polynomial Congruences
1.12 Primitive Roots
1.13 Quadratic Residues
1.14 Quadratic Reciprocity
1.15 The Jacobi Symbol
1 Topics in Number Theory
1.1 Subgroups of the Integers
A subset S of the set Z of integers is a subgroup of Z if 0 ∈ S, −x ∈ S and
x + y ∈ S for all x ∈ S and y ∈ S.
It is easy to see that a non-empty subset S of Z is a subgroup of Z if and
only if x − y ∈ S for all x ∈ S and y ∈ S.
Let m be an integer, and let mZ = {mn : n ∈ Z}. Then mZ (the set of
integer multiples of m) is a subgroup of Z.
Theorem 1.2 Let a1 , a2 , . . . , ar be integers, not all zero. Then there exist
integers u1 , u2 , . . . , ur such that
(a1 , a2 , . . . , ar ) = u1 a1 + u2 a2 + · · · + ur ar .
Proof Let S be the set of all integers that are of the form
n1 a1 + n2 a2 + · · · + nr ar
common divisor of a1 , a2 , . . . , ar , (since ai ∈ S for i = 1, 2, . . . , r). Moreover
any common divisor of a1 , a2 , . . . , ar is a divisor of each element of S and is
therefore a divisor of m. It follows that m is the greatest common divisor
of a1 , a2 , . . . , ar . But m ∈ S, and therefore there exist integers u1 , u2 , . . . , ur
such that
(a1 , a2 , . . . , ar ) = u1 a1 + u2 a2 + · · · + ur ar ,
as required.
1 = u1 a1 + u2 a2 + · · · + ur ar .
Any divisor of r_n is a divisor of r_{n−1}, because r_{n−1} = q_n r_n. Moreover if
2 ≤ i ≤ n then any common divisor of r_i and r_{i−1} is a divisor of r_{i−2}, because
r_{i−2} = q_{i−1} r_{i−1} + r_i. It follows that every divisor of r_n is a divisor of all the
integers r_0, r_1, . . . , r_n. In particular, any divisor of r_n is a common divisor of
a and b, and r_n is itself a common divisor of a and b.
If 2 ≤ i ≤ n then any common divisor of r_{i−2} and r_{i−1} is a divisor of r_i,
because r_i = r_{i−2} − q_{i−1} r_{i−1}. It follows that every common divisor of a and b
is a divisor of all the integers r_0, r_1, . . . , r_n. In particular any common divisor
of a and b is a divisor of r_n. It follows that r_n is the greatest common divisor
of a and b.
There exist integers u_i and v_i such that r_i = u_i a + v_i b for i = 1, 2, . . . , n.
Indeed u_i = u_{i−2} − q_{i−1} u_{i−1} and v_i = v_{i−2} − q_{i−1} v_{i−1} for each integer i between 2
and n, where u_0 = 1, v_0 = 0, u_1 = 0 and v_1 = 1. In particular r_n = u_n a + v_n b.
The algorithm described above for calculating the greatest common di-
visor (a, b) of two positive integers a and b is referred to as the Euclidean
algorithm. It also enables one to calculate integers u and v such that (a, b) =
ua + vb.
Example We calculate the greatest common divisor of 425 and 119. Now
425 = 3 × 119 + 68
119 = 68 + 51
68 = 51 + 17
51 = 3 × 17.
It follows that 17 is the greatest common divisor of 425 and 119. Moreover
17 = 68 − 51 = 68 − (119 − 68)
= 2 × 68 − 119 = 2 × (425 − 3 × 119) − 119
= 2 × 425 − 7 × 119.
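The calculation in this example can be mechanised. The following Python sketch carries the coefficients u and v along with the remainders, using the recurrences for the remainders and their coefficients described above. The function name extended_gcd is ours and is not part of the notes; the sketch assumes that a and b are positive integers.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and g = u*a + v*b.

    Illustrative helper (hypothetical name); assumes a and b are positive.
    """
    # r_prev, r_curr hold two successive remainders; u_*, v_* hold their
    # coefficients, starting from r_0 = a = 1*a + 0*b and r_1 = b = 0*a + 1*b.
    r_prev, r_curr = a, b
    u_prev, u_curr = 1, 0
    v_prev, v_curr = 0, 1
    while r_curr != 0:
        q = r_prev // r_curr
        r_prev, r_curr = r_curr, r_prev - q * r_curr
        u_prev, u_curr = u_curr, u_prev - q * u_curr
        v_prev, v_curr = v_curr, v_prev - q * v_curr
    return r_prev, u_prev, v_prev


g, u, v = extended_gcd(425, 119)
print(g, u, v)  # 17 2 -7, matching 17 = 2 × 425 − 7 × 119
```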
Theorem 1.4 Let p be a prime number, and let x and y be integers. If p
divides xy then either p divides x or else p divides y.
Proof Suppose that p divides xy but p does not divide x. Then p and x
are coprime, and hence there exist integers u and v such that 1 = up + vx
(Corollary 1.3). Then y = upy + vxy. It then follows that p divides y, as
required.
Proof Let n be an integer greater than one. Suppose that every integer m
satisfying 1 < m < n is a prime number or factors as a product of prime
numbers. If n is not a prime number then n = ab for some integers a and
b satisfying 1 < a < n and 1 < b < n. Then a and b are prime numbers or
products of prime numbers. It follows that n is a prime number or a product
of prime numbers. The required result therefore follows by induction on
n.
Proof Let n be a composite number greater than one. Suppose that every
composite number greater than one and less than n factors uniquely as a
product of prime numbers. We show that n then factors uniquely as a product
of prime numbers. Suppose therefore that
n = p1 p2 · · · pr = q1 q2 · · · qs ,
1.7 Congruences
Let m be a positive integer. Integers x and y are said to be congruent
modulo m if x − y is divisible by m. If x and y are congruent modulo m then
we denote this by writing x ≡ y (mod m).
The congruence class of an integer x modulo m is the set of all integers
that are congruent to x modulo m.
Let x, y and z be integers. Then x ≡ x (mod m). Also x ≡ y (mod m)
if and only if y ≡ x (mod m). If x ≡ y (mod m) and y ≡ z (mod m) then
x ≡ z (mod m). Thus congruence modulo m is an equivalence relation on
the set of integers.
(x + y) − (x′ + y′) = (x − x′) + (y − y′),
xy − x′y′ = (x − x′)y + x′(y − y′).
Proof There exist integers a and b such that 1 = am + bx, since m and x
are coprime (Corollary 1.3). Then y = amy + bxy, and m divides xy, and
therefore m divides y, as required.
Lemma 1.11 Let m be a positive integer, and let a, x and y be integers with
ax ≡ ay (mod m). Suppose that m and a are coprime. Then x ≡ y (mod m).
Lemma 1.13 Let m be a positive integer, and let a and b be integers, where
a is coprime to m. Then there exist integers x that satisfy the congruence
ax ≡ b (mod m). Moreover if x and x′ are integers such that ax ≡ b (mod m)
and ax′ ≡ b (mod m) then x ≡ x′ (mod m).
Proof There exists an integer c such that ac ≡ 1 (mod m), since a is coprime
to m (Lemma 1.12). Then ax ≡ b (mod m) if and only if x ≡ cb (mod m).
The result follows.
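The proof is constructive: once an inverse c of a modulo m is known, the solution is x ≡ cb (mod m). A minimal Python sketch follows. The function name solve_linear_congruence is ours, and the three-argument form pow(a, -1, m) assumes Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a, b, m):
    """Return the unique x with 0 <= x < m and a*x ≡ b (mod m).

    Illustrative helper (hypothetical name); requires gcd(a, m) == 1.
    """
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    c = pow(a, -1, m)   # an integer c with a*c ≡ 1 (mod m); Python 3.8+
    return (c * b) % m


x = solve_linear_congruence(7, 3, 20)
print(x, (7 * x) % 20)  # 9 3
```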
Let m be a positive integer. For each integer x, let [x] denote the
congruence class of x modulo m. If x, x′, y and y′ are integers and if x ≡ x′
(mod m) and y ≡ y′ (mod m) then xy ≡ x′y′ (mod m). It follows that there
is a well-defined operation of multiplication defined on congruence classes of
integers modulo m, where [x][y] = [xy] for all integers x and y. This opera-
tion is commutative and associative, and [x][1] = [x] for all integers x. If x is
an integer coprime to m, then it follows from Lemma 1.12 that there exists
an integer y coprime to m such that xy ≡ 1 (mod m). Then [x][y] = [1].
Therefore the set Z∗m of congruence classes modulo m of integers coprime to
m is an Abelian group (with multiplication of congruence classes defined as
above).
Proof For each integer k between 1 and r let Pk be the product of the
integers mi with 1 ≤ i ≤ k. Then P1 = m1 and Pk = Pk−1 mk for k =
2, 3, . . . , r. Let x be a positive integer that is divisible by mi for i = 1, 2, . . . , r.
We must show that Pr divides x. Suppose that Pk−1 divides x for some
integer k between 2 and r. Let y = x/Pk−1 . Then mk and Pk−1 are coprime
(Lemma 1.14) and mk divides Pk−1 y. It follows from Lemma 1.10 that mk
divides y. But then Pk divides x, since Pk = Pk−1 mk and x = Pk−1 y. On
successively applying this result with k = 2, 3, . . . , r we conclude that Pr
divides x, as required.
Proof Let x be an integer satisfying 0 ≤ x < m1 that is coprime to m1 ,
and let y be an integer satisfying 0 ≤ y < m2 that is coprime to m2 . It
follows from the Chinese Remainder Theorem (Theorem 1.16) that there
exists exactly one integer z satisfying 0 ≤ z < m1 m2 such that z ≡ x
(mod m1 ) and z ≡ y (mod m2 ). Moreover z must then be coprime to m1
and to m2 , and must therefore be coprime to m1 m2 . Thus every integer z
satisfying 0 ≤ z < m1 m2 that is coprime to m1 m2 is uniquely determined by
its congruence classes modulo m1 and m2 , and the congruence classes of z
modulo m1 and m2 contain integers coprime to m1 and m2 respectively. Thus
the number ϕ(m1 m2 ) of integers z satisfying 0 ≤ z < m1 m2 that are coprime
to m1 m2 is equal to ϕ(m1 )ϕ(m2 ), since ϕ(m1 ) is the number of integers x
satisfying 0 ≤ x < m1 that are coprime to m1 and ϕ(m2 ) is the number of
integers y satisfying 0 ≤ y < m2 that are coprime to m2 .
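The correspondence used in this proof can be checked by brute force for a particular pair of coprime moduli. In the sketch below the helper names units and phi, and the choice m1 = 9, m2 = 10, are ours; the check is an illustration, not a proof.

```python
from math import gcd


def units(m):
    """Integers x with 0 <= x < m that are coprime to m."""
    return [x for x in range(m) if gcd(x, m) == 1]


def phi(m):
    """Euler totient function, computed directly from the definition."""
    return len(units(m))


m1, m2 = 9, 10  # coprime moduli chosen for illustration
# Each z coprime to m1*m2 is sent to its pair of residues (z mod m1, z mod m2).
pairs = {(z % m1, z % m2) for z in units(m1 * m2)}
expected = {(x, y) for x in units(m1) for y in units(m2)}
print(pairs == expected)                    # True
print(phi(m1), phi(m2), phi(m1 * m2))       # 6 4 24
```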
Corollary 1.18 $\phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$ for all positive integers n, where
$\prod_{p \mid n} \left(1 - \frac{1}{p}\right)$ denotes the product of 1 − 1/p taken over all prime numbers p
that divide n.
Proof Let n = p_1^{k_1} p_2^{k_2} · · · p_m^{k_m}, where p_1, p_2, . . . , p_m are distinct prime numbers and
k_1, k_2, . . . , k_m are positive integers. Then ϕ(n) = ϕ(p_1^{k_1})ϕ(p_2^{k_2}) · · · ϕ(p_m^{k_m}), and
ϕ(p_i^{k_i}) = p_i^{k_i}(1 − (1/p_i)) for i = 1, 2, . . . , m. Thus $\phi(n) = n \prod_{i=1}^{m} \left(1 - \frac{1}{p_i}\right)$, as
required.
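The product formula gives a practical way of computing ϕ(n) from the prime divisors of n. The following Python sketch compares it with a direct count; the function names are ours, and trial division is used only because it is simple, not because it is efficient.

```python
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi_formula(n):
    """phi(n) as n times the product of (1 - 1/p) over primes p dividing n."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)  # exact: p still divides result here
    return result


def phi_count(n):
    """phi(n) counted directly: integers x in [0, n) coprime to n."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)


print(phi_formula(360), phi_count(360))  # 96 96
```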
Let f be any function defined on the set of positive integers, and let n be
a positive integer. We denote the sum of the values of f (d) over all divisors d
of n by $\sum_{d \mid n} f(d)$.
Lemma 1.19 Let n be a positive integer. Then $\sum_{d \mid n} \phi(d) = n$.
such that x = ay. Then (x, n) is a multiple of a. Moreover a multiple ae
of a divides both x and n if and only if e divides both y and d. Therefore
(x, n) = a(y, d). It follows that the integers x satisfying 0 ≤ x < n for
which (x, n) = a are those of the form ay, where y is an integer, 0 ≤ y < d
and (y, d) = 1. It follows that there are exactly ϕ(d) integers x satisfying
0 ≤ x < n for which (x, n) = n/d, and thus n_d = ϕ(d) and $n = \sum_{d \mid n} \phi(d)$, as
required.
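The counting argument can be observed directly for a small value of n: the integers x with 0 ≤ x < n are sorted according to the value of (x, n), and the class with (x, n) = n/d has ϕ(d) members. In the Python sketch below the names phi and gcd_class_sizes, and the choice n = 12, are ours.

```python
from math import gcd


def phi(d):
    """Euler totient function, computed directly from the definition."""
    return sum(1 for k in range(d) if gcd(k, d) == 1)


def gcd_class_sizes(n):
    """Map each divisor d of n to the number of x in [0, n) with gcd(x, n) = n/d."""
    return {
        d: sum(1 for x in range(n) if gcd(x, n) == n // d)
        for d in range(1, n + 1)
        if n % d == 0
    }


n = 12
sizes = gcd_class_sizes(n)
print(sizes)                 # {1: 1, 2: 1, 3: 2, 4: 2, 6: 2, 12: 4}
print(sum(sizes.values()))   # 12
```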
It then follows from Lemma 1.21 that (x + 1)^p ≡ x^p + 1 (mod p). Thus
if f (x) = x^p − x then f (x + 1) ≡ f (x) (mod p) for all integers x, since
f (x + 1) − f (x) = (x + 1)^p − x^p − 1. But f (0) ≡ 0 (mod p). It follows
by induction on |x| that f (x) ≡ 0 (mod p) for all integers x. Thus x^p ≡ x
(mod p) for all integers x. Moreover if x is coprime to p then it follows from
Lemma 1.11 that x^{p−1} ≡ 1 (mod p), as required.
Second Proof of Theorem 1.20 Let x be an integer. If x is divisible by
p then x ≡ 0 (mod p) and x^p ≡ 0 (mod p).
Suppose that x is coprime to p. If j is an integer satisfying 1 ≤ j ≤ p − 1
then j is coprime to p and hence xj is coprime to p. It follows that there
exists a unique integer u_j such that 1 ≤ u_j ≤ p − 1 and xj ≡ u_j (mod p).
If j and k are integers between 1 and p − 1 and if j ≠ k then u_j ≠ u_k. It
follows that each integer between 1 and p − 1 occurs exactly once in the list
u_1, u_2, . . . , u_{p−1}, and therefore u_1 u_2 · · · u_{p−1} = (p − 1)!. Thus if we multiply
together the left hand sides and right hand sides of the congruences xj ≡ u_j
(mod p) for j = 1, 2, . . . , p − 1 we obtain the congruence x^{p−1}(p − 1)! ≡ (p − 1)!
(mod p). But then x^{p−1} ≡ 1 (mod p) by Lemma 1.11, since (p − 1)! is coprime
to p. But then x^p ≡ x (mod p), as required.
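The key step of this proof, that multiplication by x permutes the integers between 1 and p − 1 modulo p, is easy to observe numerically. In the sketch below the function name fermat_permutation and the values p = 11, x = 3 are chosen by us for illustration.

```python
from math import factorial


def fermat_permutation(x, p):
    """The residues of x*1, x*2, ..., x*(p-1) modulo p, in that order."""
    return [(x * j) % p for j in range(1, p)]


p, x = 11, 3
u = fermat_permutation(x, p)
print(u)                               # [3, 6, 9, 1, 4, 7, 10, 2, 5, 8]
print(sorted(u) == list(range(1, p)))  # True: each of 1, ..., p-1 occurs once
# Multiplying the congruences together, then cancelling (p-1)!:
print((pow(x, p - 1) * factorial(p - 1) - factorial(p - 1)) % p)  # 0
print(pow(x, p - 1, p))                # 1
```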
First Proof of Theorem 1.23 The result is trivially true when m = 1.
Suppose that m > 1. Let I be the set of all positive integers less than m that
are coprime to m. Then ϕ(m) is by definition the number of integers in I. If
y is an integer coprime to m then so is xy. It follows that, to each integer j in
I there exists a unique integer u_j in I such that xj ≡ u_j (mod m). Moreover
if j ∈ I and k ∈ I and j ≠ k then u_j ≢ u_k (mod m). Therefore I = {u_j : j ∈ I}. Thus
if we multiply the left hand sides and right hand sides of the congruences
xj ≡ u_j (mod m) for all j ∈ I we obtain the congruence x^{ϕ(m)} z ≡ z (mod m),
where z is the product of all the integers in I. But z is coprime to m, since
a product of integers coprime to m is itself coprime to m. It follows from
Lemma 1.11 that x^{ϕ(m)} ≡ 1 (mod m), as required.
Second Proof of Theorem 1.23 Let m be a positive integer. Then the
congruence classes modulo m of integers coprime to m constitute a group of order
ϕ(m), where the group operation is multiplication of congruence classes. Now
it follows from Lagrange’s Theorem that the order of any element of a finite
group divides the order of the group. If we apply this result to the group of
congruence classes modulo m of integers coprime to m we find that x^{ϕ(m)} ≡ 1
(mod m), as required.
Proof The result is clearly true when f is a constant polynomial. We can
prove the result for non-constant polynomials by induction on the degree of
the polynomial.
First we observe that, given any integer a, there exists a polynomial g with
integer coefficients such that f (x) = f (a) + (x − a)g(x). Indeed f (y + a) is a
polynomial in y with integer coefficients, and therefore f (y+a) = f (a)+yh(y)
for some polynomial h with integer coefficients. Thus if g(x) = h(x − a) then
g is a polynomial with integer coefficients and f (x) = f (a) + (x − a)g(x).
Suppose that f (a) ≡ 0 (mod p) and f (b) ≡ 0 (mod p). Let f (x) =
f (a) + (x − a)g(x), where g is a polynomial with integer coefficients. The
coefficients of f are not all divisible by p, but f (a) is divisible by p, and
therefore the coefficients of g cannot all be divisible by p.
Now f (a) and f (b) are both divisible by the prime number p, and therefore
(b−a)g(b) is divisible by p. But a prime number divides a product of integers
if and only if it divides one of the factors. Therefore either b − a is divisible
by p or else g(b) is divisible by p. Thus either b ≡ a (mod p) or else g(b) ≡ 0
(mod p). The required result now follows easily by induction on the degree
of the polynomial f .
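The role played by the primality of the modulus in this argument can be seen by counting roots by brute force. In the Python sketch below the function name roots_mod and the example polynomial x^2 − 1 are ours: modulo the prime 7 it has two roots, whereas modulo the composite number 8 it has four.

```python
def roots_mod(coeffs, m):
    """Residues x in [0, m) with f(x) ≡ 0 (mod m).

    coeffs[i] is the coefficient of x**i (illustrative convention).
    """
    def f(x):
        return sum(c * pow(x, i, m) for i, c in enumerate(coeffs)) % m

    return [x for x in range(m) if f(x) == 0]


print(roots_mod([-1, 0, 1], 7))  # [1, 6]: at most 2 roots, since 7 is prime
print(roots_mod([-1, 0, 1], 8))  # [1, 3, 5, 7]: the bound fails for m = 8
```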
Proof There are only finitely many congruence classes modulo m. Therefore
there exist positive integers j and k with j < k such that x^j ≡ x^k (mod m).
Let n = k − j. Then x^j x^n ≡ x^j (mod m). But x^j is coprime to m. It follows
from Lemma 1.11 that x^n ≡ 1 (mod m).
Remark The above lemma also follows directly from Euler’s Theorem (The-
orem 1.23).
x^k ≡ x^j x^{k−j} ≡ x^j (mod m). Conversely suppose that x^j ≡ x^k (mod m) and
j < k. Then x^j x^{k−j} ≡ x^j (mod m). But x^j is coprime to m. It follows from
Lemma 1.11 that x^{k−j} ≡ 1 (mod m). Thus if k − j = qd + r, where q and r
are integers and 0 ≤ r < d, then x^r ≡ 1 (mod m). But then r = 0, since d is
the smallest positive integer for which x^d ≡ 1 (mod m). Therefore k − j is
divisible by d, and thus j ≡ k (mod d).
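The smallest positive integer d with x^d ≡ 1 (mod m) can be found by repeated multiplication, and the conclusion of the argument above can then be checked for small exponents. The function name multiplicative_order and the example x = 2, m = 7 in the sketch below are ours.

```python
from math import gcd


def multiplicative_order(x, m):
    """Smallest positive d with x**d ≡ 1 (mod m); requires m >= 2, gcd(x, m) == 1."""
    if m < 2 or gcd(x, m) != 1:
        raise ValueError("need m >= 2 and x coprime to m")
    d, power = 1, x % m
    while power != 1:
        power = (power * x) % m
        d += 1
    return d


d = multiplicative_order(2, 7)
print(d)  # 3, since 2, 4, 1 are the successive powers of 2 modulo 7
# x^j ≡ x^k (mod m) exactly when j ≡ k (mod d), for small exponents:
print(all((pow(2, j, 7) == pow(2, k, 7)) == ((j - k) % d == 0)
          for j in range(12) for k in range(12)))  # True
```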
Lemma 1.27 Let p be a prime number, and let x and y be integers coprime
to p. Suppose that the congruence classes of x and y modulo p have the same
order. Then there exists a non-negative integer k, coprime to the order of
the congruence classes of x and y, such that y ≡ x^k (mod p).
Proof Let d be the order of the congruence class of x modulo p. The
solutions of the congruence x^d ≡ 1 (mod p) include x^j with 0 ≤ j < d. But the
congruence x^d ≡ 1 (mod p) has at most d solutions modulo p, since p is prime
(Theorem 1.24), and the congruence classes of 1, x, x^2, . . . , x^{d−1} modulo p are
distinct (Lemma 1.26). It follows that any solution of the congruence x^d ≡ 1
(mod p) is congruent to x^k for some positive integer k. Thus if y is an integer
coprime to p whose congruence class is of order d then y ≡ x^k (mod p) for
some positive integer k. Moreover k is coprime to d, for if e is a common
divisor of k and d then y^{d/e} ≡ x^{d(k/e)} ≡ 1 (mod p), and hence e = 1.
Theorem 1.28 Let p be a prime number. Then there exists a primitive root
modulo p.
hence d divides n (Lemma 1.10). Let y be an integer coprime to p whose
congruence class is also of order d. It follows from Lemma 1.27 that there
exists a non-negative integer k coprime to d such that y ≡ x^k (mod p). It
then follows from Lemma 1.26 that there exists a unique integer k coprime to
d such that 0 ≤ k < d and y ≡ x^k (mod p). Thus if there exists at least one
integer x coprime to p whose congruence class modulo p is of order d then
the congruence classes modulo p of integers coprime to p that are of order d
are the congruence classes of x^k for those integers k satisfying 0 ≤ k < d
that are coprime to d. Thus if ψ(d) > 0 then ψ(d) = ϕ(d), where ϕ(d) is the
number of integers k satisfying 0 ≤ k < d that are coprime to d.
Now 0 ≤ ψ(d) ≤ ϕ(d) for each divisor d of p − 1. But $\sum_{d \mid p-1} \psi(d) = p - 1$ and
$\sum_{d \mid p-1} \phi(d) = p - 1$ (Lemma 1.19). Therefore ψ(d) = ϕ(d) for each divisor d of
p − 1. In particular ψ(p − 1) = ϕ(p − 1) ≥ 1. Thus there exists an integer g
whose congruence class modulo p is of order p − 1. The congruence classes
of 1, g, g^2, . . . , g^{p−2} modulo p are then distinct. But there are exactly p − 1
congruence classes modulo p of integers coprime to p. It follows that any
integer that is coprime to p must be congruent to g^j for some non-negative
integer j. Thus g is a primitive root modulo p.
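The equality ψ(d) = ϕ(d) established in this proof can be verified for a small prime by tabulating the orders of all the residues. In the sketch below the helper names order_mod and phi, and the choice p = 13, are ours; the search is exhaustive and is meant only as an illustration.

```python
from collections import Counter
from math import gcd


def order_mod(x, p):
    """Order of the congruence class of x modulo the prime p (x not divisible by p)."""
    d, power = 1, x % p
    while power != 1:
        power = (power * x) % p
        d += 1
    return d


def phi(d):
    """Euler totient function, computed directly from the definition."""
    return sum(1 for k in range(d) if gcd(k, d) == 1)


p = 13
# psi[d] is the number of classes of order d among 1, 2, ..., p-1.
psi = Counter(order_mod(x, p) for x in range(1, p))
print(sorted(psi.items()))  # [(1, 1), (2, 1), (3, 2), (4, 2), (6, 2), (12, 4)]
primitive_roots = [g for g in range(1, p) if order_mod(g, p) == p - 1]
print(primitive_roots)      # [2, 6, 7, 11]
```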
Proof Let x be an integer. Then ax^2 + bx + c ≡ 0 (mod p) if and only if
4a^2x^2 + 4abx + 4ac ≡ 0 (mod p), since 4a is coprime to p (Lemma 1.11). But
4a^2x^2 + 4abx + 4ac = (2ax + b)^2 − (b^2 − 4ac). It follows that ax^2 + bx + c ≡ 0
(mod p) if and only if (2ax + b)^2 ≡ b^2 − 4ac (mod p). Thus if there exist
integers x satisfying the congruence ax^2 + bx + c ≡ 0 (mod p) then either
b^2 − 4ac is a quadratic residue of p or else b^2 − 4ac ≡ 0 (mod p). Conversely
suppose that either b^2 − 4ac is a quadratic residue of p or b^2 − 4ac ≡ 0
(mod p). Then there exists an integer y such that y^2 ≡ b^2 − 4ac (mod p).
Also there exists an integer d such that 2ad ≡ 1 (mod p), since 2a is coprime
to p (Lemma 1.12). If x ≡ d(y − b) (mod p) then 2ax + b ≡ y (mod p), and
hence (2ax + b)^2 ≡ b^2 − 4ac (mod p). But then ax^2 + bx + c ≡ 0 (mod p), as
required.
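The proof describes a procedure: find the square roots y of b^2 − 4ac modulo p, and set x ≡ d(y − b), where 2ad ≡ 1 (mod p). A Python sketch of this procedure follows. The function name is ours, the square roots are found by exhaustive search purely for simplicity, and pow(2 * a, -1, p) assumes Python 3.8 or later.

```python
def solve_quadratic_congruence(a, b, c, p):
    """Solutions x in [0, p) of a*x^2 + b*x + c ≡ 0 (mod p).

    Illustrative helper; assumes p is an odd prime not dividing a.
    """
    disc = (b * b - 4 * a * c) % p
    d = pow(2 * a, -1, p)  # 2*a*d ≡ 1 (mod p); Python 3.8+
    # Every square root y of the discriminant gives the solution x ≡ d*(y - b).
    roots = {(d * (y - b)) % p for y in range(p) if (y * y - disc) % p == 0}
    return sorted(roots)


print(solve_quadratic_congruence(1, 1, 1, 7))  # [2, 4]
print(solve_quadratic_congruence(1, 0, 1, 7))  # []: -4 is not a square modulo 7
```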
Lemma 1.31 Let p be an odd prime number, and let x and y be integers.
Suppose that x^2 ≡ y^2 (mod p). Then either x ≡ y (mod p) or else x ≡ −y
(mod p).
Lemma 1.32 Let p be an odd prime number, and let m = (p − 1)/2. Then
there are exactly m congruence classes of integers coprime to p that are
quadratic residues of p. Also there are exactly m congruence classes of inte-
gers coprime to p that are quadratic non-residues of p.
Theorem 1.33 Let p be an odd prime number, let R be the set of all integers
coprime to p that are quadratic residues of p, and let N be the set of all
integers coprime to p that are quadratic non-residues of p. If x ∈ R and
y ∈ R then xy ∈ R. If x ∈ R and y ∈ N then xy ∈ N . If x ∈ N and y ∈ N
then xy ∈ R.
Lemma 1.35 (Euler) Let p be an odd prime number, and let x be an integer
coprime to p. Then x is a quadratic residue of p if and only if x^{(p−1)/2} ≡ 1
(mod p). Also x is a quadratic non-residue of p if and only if x^{(p−1)/2} ≡ −1
(mod p).
Remark Let p be an odd prime number. It follows from Theorem 1.28 that
there exists a primitive root g modulo p. Moreover the congruence class of
g modulo p is of order p − 1. It follows that g^j ≡ g^k (mod p), where j and k
are positive integers, if and only if j − k is divisible by p − 1. But p − 1 is
even. Thus if g^j ≡ g^k (mod p) then j − k is even. It follows easily from this that an
integer x is a quadratic residue of p if and only if x ≡ g^k (mod p) for some
even integer k. The results of Theorem 1.33 and Lemma 1.35 follow easily
from this fact.
Let p be an odd prime number, and let m = (p − 1)/2. Then each integer
not divisible by p is congruent to exactly one of the integers ±1, ±2, . . . , ±m.
The following lemma was proved by Gauss.
Lemma 1.38 Let p be an odd prime number, let m = (p − 1)/2, and let x
be an integer that is not divisible by p. Then $\left(\frac{x}{p}\right) = (-1)^r$, where r is the
number of pairs (j, u) of integers satisfying 1 ≤ j ≤ m and 1 ≤ u ≤ m for
which xj ≡ −u (mod p).
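The number r in this lemma is easily computed: for each j between 1 and m there is a suitable u exactly when the least non-negative residue of xj modulo p exceeds m. The Python sketch below compares the resulting sign with the value given by Euler's criterion (Lemma 1.35). The function names are ours.

```python
def legendre_euler(x, p):
    """Legendre symbol of x modulo the odd prime p, via x^((p-1)/2) mod p."""
    t = pow(x, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def legendre_gauss(x, p):
    """Legendre symbol via Gauss's lemma; assumes p does not divide x."""
    m = (p - 1) // 2
    # x*j ≡ -u (mod p) with 1 <= u <= m means the residue of x*j lies in m+1..p-1.
    r = sum(1 for j in range(1, m + 1) if (x * j) % p > m)
    return (-1) ** r


print([legendre_gauss(x, 11) for x in range(1, 11)])
# [1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
```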
modulo p to any integer between 1 and m. But the integers x with this
property are those for which m/2 < x ≤ m. Thus r = m/2 if m is even, and
r = (m + 1)/2 if m is odd.
If p ≡ 1 (mod 8) then m is divisible by 4 and hence r is even. If p ≡ 3
(mod 8) then m ≡ 1 (mod 4) and hence r is odd. If p ≡ 5 (mod 8) then
m ≡ 2 (mod 4) and hence r is odd. If p ≡ 7 (mod 8) then m ≡ 3 (mod 4)
and hence r is even. Therefore $\left(\frac{2}{p}\right) = 1$ when p ≡ 1 (mod 8) and when p ≡ 7
(mod 8), and $\left(\frac{2}{p}\right) = -1$ when p ≡ 3 (mod 8) and when p ≡ 5 (mod 8). Thus
$\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$ for all odd prime numbers p, as required.
Proof Let S be the set of all ordered pairs (x, y) of integers x and y satisfying
1 ≤ x ≤ m and 1 ≤ y ≤ n, where p = 2m + 1 and q = 2n + 1. We must
prove that $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{mn}$.
First we show that $\left(\frac{p}{q}\right) = (-1)^a$, where a is the number of pairs (x, y)
of integers in S satisfying −n ≤ py − qx ≤ −1. If (x, y) is a pair of integers
in S satisfying −n ≤ py − qx ≤ −1, and if z = qx − py, then 1 ≤ y ≤ n,
1 ≤ z ≤ n and py ≡ −z (mod q). On the other hand, if (y, z) is a pair of
integers such that 1 ≤ y ≤ n, 1 ≤ z ≤ n and py ≡ −z (mod q) then there is
a unique positive integer x such that z = qx − py. Moreover qx = py + z ≤
(p + 1)n = 2n(m + 1) and q > 2n, and therefore x < m + 1. It follows that
the pair (x, y) of integers is in S, and −n ≤ py − qx ≤ −1. We deduce that
the number a of pairs (x, y) of integers in S satisfying −n ≤ py − qx ≤ −1 is
equal to the number of pairs (y, z) of integers satisfying 1 ≤ y ≤ n, 1 ≤ z ≤ n
and py ≡ −z (mod q). It now follows from Lemma 1.38 that $\left(\frac{p}{q}\right) = (-1)^a$.
Similarly $\left(\frac{q}{p}\right) = (-1)^b$, where b is the number of pairs (x, y) in S satisfying
1 ≤ py − qx ≤ m.
If x and y are integers satisfying py − qx = 0 then x is divisible by p and
y is divisible by q. It follows from this that py − qx ≠ 0 for all pairs (x, y) in
S. The total number of pairs (x, y) in S is mn. Therefore mn = a + b + c + d,
where c is the number of pairs (x, y) in S satisfying py − qx < −n and d is
the number of pairs (x, y) in S satisfying py − qx > m.
Let (x, y) be a pair of integers in S, and let x′ = m + 1 − x and
y′ = n + 1 − y. Then the pair (x′, y′) also belongs to S, and py′ − qx′ =
m − n − (py − qx). It follows that py − qx > m if and only if py′ − qx′ < −n.
Thus there is a one-to-one correspondence between pairs (x, y) in S satisfying
py − qx > m and pairs (x′, y′) in S satisfying py′ − qx′ < −n, where (x′, y′) =
(m + 1 − x, n + 1 − y) and (x, y) = (m + 1 − x′, n + 1 − y′). Therefore c = d,
and thus mn = a + b + 2c. But then $(-1)^{mn} = (-1)^a (-1)^b = \left(\frac{p}{q}\right)\left(\frac{q}{p}\right)$, as
required.
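The four counts a, b, c and d that appear in this proof can be computed directly for given distinct odd primes p and q. The Python sketch below does so and compares (−1)^a and (−1)^b with Legendre symbols computed by Euler's criterion (Lemma 1.35). The function names are ours, and the check is a numerical illustration only.

```python
def legendre(x, p):
    """Legendre symbol of x modulo the odd prime p (p not dividing x)."""
    t = pow(x, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def reciprocity_counts(p, q):
    """Counts (a, b, c, d) of pairs (x, y) in S, classified by the value of p*y - q*x."""
    m, n = (p - 1) // 2, (q - 1) // 2
    a = b = c = d = 0
    for x in range(1, m + 1):
        for y in range(1, n + 1):
            t = p * y - q * x  # never zero for distinct primes p and q
            if -n <= t <= -1:
                a += 1
            elif 1 <= t <= m:
                b += 1
            elif t < -n:
                c += 1
            elif t > m:
                d += 1
    return a, b, c, d


print(reciprocity_counts(3, 5))  # (1, 1, 0, 0)
print(legendre(3, 5), legendre(5, 3))  # -1 -1
```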
(i.e., $\left(\frac{x}{s}\right)$ is the product of the Legendre symbols $\left(\frac{x}{p_i}\right)$ for i = 1, 2, . . . , m.)
We define $\left(\frac{x}{1}\right) = 1$.
Note that the Jacobi symbol can have the values 0, +1 and −1.
Lemma 1.42 Let s be an odd positive integer, and let x be an integer. Then
$\left(\frac{x}{s}\right) \neq 0$ if and only if x is coprime to s.
Proof Let s = p_1 p_2 · · · p_m, where p_1, p_2, . . . , p_m are odd prime numbers.
Suppose that x is coprime to s. Then x is coprime to each prime factor of s, and
hence $\left(\frac{x}{p_i}\right) = \pm 1$ for i = 1, 2, . . . , m. It follows that $\left(\frac{x}{s}\right) = \pm 1$ and thus
$\left(\frac{x}{s}\right) \neq 0$.
Next suppose that x is not coprime to s. Let p be a prime factor of the
greatest common divisor of x and s. Then p = p_i, and hence $\left(\frac{x}{p_i}\right) = 0$ for
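Lemma 1.44 Let x and y be integers, and let s and t be odd positive integers.
Then $\left(\frac{xy}{s}\right) = \left(\frac{x}{s}\right)\left(\frac{y}{s}\right)$ and $\left(\frac{x}{st}\right) = \left(\frac{x}{s}\right)\left(\frac{x}{t}\right)$.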
Proof $\left(\frac{xy}{p}\right) = \left(\frac{x}{p}\right)\left(\frac{y}{p}\right)$ for all prime numbers p (Corollary 1.34). The
required result therefore follows from the definition of the Jacobi symbol.
Lemma 1.45 $\left(\frac{x^2}{s}\right) = 1$ and $\left(\frac{x}{s^2}\right) = 1$ for all odd positive integers s
and all integers x that are coprime to s.
Proof This follows directly from Lemma 1.44 and Lemma 1.42.
Theorem 1.46 $\left(\frac{-1}{s}\right) = (-1)^{(s-1)/2}$ for all odd positive integers s.
Proof Let $f(s) = (-1)^{(s-1)/2} \left(\frac{-1}{s}\right)$ for each odd positive integer s. We
must prove that f (s) = 1 for all odd positive integers s. If s and t are odd
positive integers then
(st − 1) − (s − 1) − (t − 1) = st − s − t + 1 = (s − 1)(t − 1)
The results proved above can be used to calculate Jacobi symbols, as in
the following example.
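A calculation of this kind can also be carried out mechanically. The Python sketch below is the standard iterative method for evaluating Jacobi symbols, not a transcription of any example from these notes. It assumes two standard facts about Jacobi symbols that are not stated in the portion of the text reproduced here: the rule that (2/s) = −1 exactly when s ≡ 3 or 5 (mod 8), and the reciprocity law for odd coprime positive integers. The function name jacobi is ours.

```python
def jacobi(x, s):
    """Jacobi symbol (x/s) for an odd positive integer s.

    Standard algorithm; relies on the rule for (2/s) and on the
    reciprocity law for Jacobi symbols (assumed, see the lead-in).
    """
    if s <= 0 or s % 2 == 0:
        raise ValueError("s must be an odd positive integer")
    x %= s
    result = 1
    while x != 0:
        # Remove factors of 2 from x, using the value of (2/s).
        while x % 2 == 0:
            x //= 2
            if s % 8 in (3, 5):
                result = -result
        # Swap x and s; the sign changes only if both are ≡ 3 (mod 4).
        x, s = s, x
        if x % 4 == 3 and s % 4 == 3:
            result = -result
        x %= s
    # If the final s exceeds 1, the original x and s had a common factor.
    return result if s == 1 else 0


print(jacobi(1001, 9907))  # -1
```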
Course 311: Michaelmas Term 1999
Part II: Topics in Group Theory
D. R. Wilkins
Copyright © David R. Wilkins 1997
Contents
2 Topics in Group Theory
2.1 Groups
2.2 Examples of Groups
2.3 Cayley Tables
2.4 Elementary Properties of Groups
2.5 The General Associative Law
2.6 Subgroups
2.7 Cyclic Groups
2.8 Cosets and Lagrange’s Theorem
2.9 Normal Subgroups and Quotient Groups
2.10 Homomorphisms
2.11 The Isomorphism Theorems
2.12 Direct products of groups
2.13 Cayley’s Theorem
2.14 Group Actions, Orbits and Stabilizers
2.15 Conjugacy
2.16 Permutations and the Symmetric Groups
2.17 The Alternating Groups
2.18 Normal Subgroups of the Symmetric Groups
2.19 Finitely Generated Abelian Groups
2.20 The Class Equation of a Finite Group
2.21 Cauchy’s Theorem
2.22 The Structure of p-Groups
2.23 The Sylow Theorems
2.24 Solvable Groups
2 Topics in Group Theory
2.1 Groups
A binary operation ∗ on a set G associates to elements x and y of G a third
element x ∗ y of G. For example, addition and multiplication are binary
operations on the set of all integers.
One usually adopts multiplicative notation for groups, where the product
x ∗ y of two elements x and y of a group G is denoted by xy. The inverse of
an element x of G is then denoted by x−1 . The identity element is usually
denoted by e (or by eG when it is necessary to specify explicitly the group
to which it belongs). Sometimes the identity element is denoted by 1. Thus,
when multiplicative notation is adopted, the group axioms are written as
follows:-
The group G is said to be Abelian (or commutative) if xy = yx for all elements
x and y of G.
It is sometimes convenient or customary to use additive notation for cer-
tain groups. Here the group operation is denoted by +, the identity element
of the group is denoted by 0, the inverse of an element x of the group is
denoted by −x. By convention, additive notation is only used for Abelian
groups. When expressed in additive notation the axioms for an Abelian group
are as follows:
For each positive integer n the set of all nonsingular n × n matrices is a
group, where the group operation is matrix multiplication. These groups are
not Abelian when n ≥ 2.
The set of all transformations of the plane that are of the form
(x, y) ↦ (ax + by, cx + dy)
with ad − bc ≠ 0 is a group with respect to the operation of composition of
transformations. This group includes all rotations about the origin, and all
reflections in lines passing through the origin. It is not Abelian.
Consider a regular n-sided polygon centered at the origin. The symme-
tries of this polygon (i.e., length- and angle-preserving transformations of
the plane that map this polygon onto itself) are rotations about the origin
through an integer multiple of 2π/n radians, and reflections in the n axes of
symmetry of the polygon. The symmetries of the polygon constitute a group
of order 2n. This group is referred to as the dihedral group of order 2n.
The symmetries of a rectangle that is not a square constitute a group of
order 4. This group consists of the identity transformation, reflection in the
axis of symmetry joining the midpoints of the two shorter sides, reflection
in the axis of symmetry joining the midpoints of the two longer sides, and rotation through
an angle of π radians (180°). If I denotes the identity transformation, A
and B denote the reflections in the two axes of symmetry, and C denotes
the rotation through π radians then A^2 = B^2 = C^2 = I, AB = BA = C,
AC = CA = B and BC = CB = A. This group is Abelian: it is often
referred to as the Klein 4-group (or, in German, Kleinsche Viergruppe).
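These relations can be checked by representing each symmetry by its effect on coordinates. The Python sketch below assumes, purely for illustration, that the rectangle is centred at the origin with its sides parallel to the coordinate axes, so that each symmetry multiplies the x- and y-coordinates by fixed signs; the names and the assignment of A and B to particular axes are ours.

```python
# Each symmetry is recorded as the pair of signs (sx, sy) in (x, y) -> (sx*x, sy*y).
I = (1, 1)     # identity
A = (1, -1)    # reflection in the x-axis (assumed labelling)
B = (-1, 1)    # reflection in the y-axis (assumed labelling)
C = (-1, -1)   # rotation through pi radians


def compose(s, t):
    """Composition of two of these symmetries: the signs multiply."""
    return (s[0] * t[0], s[1] * t[1])


print(compose(A, A) == I, compose(B, B) == I, compose(C, C) == I)  # True True True
print(compose(A, B) == C, compose(B, A) == C)                      # True True
print(compose(A, C) == B, compose(B, C) == A)                      # True True
```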
The symmetries of a regular tetrahedron in 3-dimensional space constitute
a group. Any permutation of the vertices of the tetrahedron can be effected
by an appropriate symmetry of the tetrahedron. Moreover each symmetry is
completely determined by the permutation of the vertices which it induces.
Therefore the group of symmetries of a regular tetrahedron is of order 24,
since there are 24 permutations of a set with four elements. It turns out that
this group is non-Abelian.
consist of the identity transformation I, an anticlockwise rotation R about
the centre through an angle of 2π/3 radians (i.e., 120°), a clockwise rotation S
about the centre through an angle of 2π/3 radians, and reflections U, V and
W in the lines joining the vertices A, B and C respectively to the midpoints
of the opposite edges. Calculating the compositions of these symmetries, we
obtain the following Cayley table:
    |  I  R  S  U  V  W
  --+------------------
  I |  I  R  S  U  V  W
  R |  R  S  I  W  U  V
  S |  S  I  R  V  W  U
  U |  U  V  W  I  R  S
  V |  V  W  U  S  I  R
  W |  W  U  V  R  S  I
Note that each element of the group occurs exactly once in each row and
in each column in the main body of the table (excluding the labels at the left
of each row and at the head of each column). This is a general property of
Cayley tables of groups which can be proved easily from the group axioms.
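Both this property and the associativity of the operation can be checked mechanically for a given table. The Python sketch below encodes the Cayley table of the six symmetries I, R, S, U, V and W of the equilateral triangle row by row; the variable and function names are ours.

```python
ELEMENTS = "IRSUVW"
# Row for x lists the entries of the table in the columns I, R, S, U, V, W.
ROWS = ["IRSUVW", "RSIWUV", "SIRVWU", "UVWIRS", "VWUSIR", "WUVRSI"]
TABLE = {
    (x, y): ROWS[i][j]
    for i, x in enumerate(ELEMENTS)
    for j, y in enumerate(ELEMENTS)
}


def mul(x, y):
    """Entry of the Cayley table in the row of x and the column of y."""
    return TABLE[(x, y)]


# Each element occurs exactly once in each row and in each column.
rows_ok = all(sorted(row) == sorted(ELEMENTS) for row in ROWS)
cols_ok = all(sorted(row[j] for row in ROWS) == sorted(ELEMENTS) for j in range(6))
associative = all(
    mul(mul(x, y), z) == mul(x, mul(y, z))
    for x in ELEMENTS for y in ELEMENTS for z in ELEMENTS
)
print(rows_ok, cols_ok, associative)  # True True True
print(mul("R", "U"), mul("U", "R"))   # W V, so the group is not Abelian
```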
Proof We know from the axioms that the group G contains at least one
element x^{−1} which satisfies xx^{−1} = e and x^{−1}x = e. If z is any element of
G which satisfies xz = e then z = ez = (x^{−1}x)z = x^{−1}(xz) = x^{−1}e = x^{−1}.
Similarly if w is any element of G which satisfies wx = e then w = x^{−1}. In
particular we conclude that the inverse x^{−1} of x is uniquely determined, as
required.
Lemma 2.3 Let x and y be elements of a group G. Then (xy)^{−1} = y^{−1}x^{−1}.
Proof It follows from the group axioms that
(xy)(y^{−1}x^{−1}) = x(y(y^{−1}x^{−1})) = x((yy^{−1})x^{−1}) = x(ex^{−1}) = xx^{−1} = e.
Similarly (y^{−1}x^{−1})(xy) = e, and thus y^{−1}x^{−1} is the inverse of xy, as
required.
Note in particular that (x^{−1})^{−1} = x for all elements x of a group G, since
x has the properties that characterize the inverse of the inverse x^{−1} of x.
Given an element x of a group G, we define x^n for each positive integer n
by the requirement that x^1 = x and x^n = x^{n−1}x for all n > 1. We also define
x^0 = e, where e is the identity element of the group, and we define x^{−n} to be
the inverse of x^n for all positive integers n.
Theorem 2.4 Let x be an element of a group G. Then x^{m+n} = x^m x^n and
x^{mn} = (x^m)^n for all integers m and n.
Proof The identity x^{m+n} = x^m x^n clearly holds when m = 0 and when n = 0.
The identity x^{m+n} = x^m x^n can be proved for all positive integers m and n by
induction on n. The identity when m and n are both negative then follows
from the identity x^{−m−n} = x^{−n}x^{−m} on taking inverses. The result when m
and n have opposite signs can easily be deduced from that where m and n
both have the same sign.
The identity x^{mn} = (x^m)^n follows immediately from the definitions when
n = 0, 1 or −1. The result when n is positive can be proved by induction on
n. The result when n is negative can then be obtained on taking inverses.
If additive notation is employed for an Abelian group then the notation
‘x^n’ is replaced by ‘nx’ for all integers n and elements x of the group. The
analogue of Theorem 2.4 then states that (m + n)x = mx + nx and (mn)x =
m(nx) for all integers m and n.
(Thus if p_j = x_1 x_2 · · · x_j for j = 1, 2, . . . , n then p_j = p_{j−1} x_j for each
j > 1.)
Now an arbitrary product of n elements of G is determined by an expres-
sion involving n elements of G together with equal numbers of left and right
parentheses that determine the order in which the product is evaluated. The
General Associative Law ensures that the value of such a product is deter-
mined only by the order in which the elements of the group occur within that
product. Thus a product of n elements of G has the value x1 x2 · · · xn , where
x1 , x2 , . . . , xn are the elements to be multiplied, listed in the order in which
they occur in the expression defining the product.
For example, given four elements x1 , x2 , x3 and x4 of G, the products
((x1 x2 )x3 )x4 , (x1 x2 )(x3 x4 ), (x1 (x2 x3 ))x4 , x1 ((x2 x3 )x4 ), x1 (x2 (x3 x4 ))
all have the same value. (Note that x1 x2 x3 x4 is by definition the value of the
first of these expressions.)
Now the first step in evaluating the product will involve multiplying some
element xr with the succeeding element xr+1 . The subsequent steps will then
evaluate a product of n − 1 elements, namely the elements xi for 1 ≤ i < r,
the element xr xr+1 , and the elements xi for r + 1 < i ≤ n. The validity of the
General Associative Law for products of fewer than n elements then ensures
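that the value p of the product is given by
\[
p = \begin{cases}
(x_1 x_2) x_3 \cdots x_n & \text{if } r = 1; \\
x_1 (x_2 x_3) x_4 \cdots x_n & \text{if } r = 2; \\
x_1 x_2 (x_3 x_4) x_5 \cdots x_n & \text{if } r = 3 \text{ (and } n > 4\text{)}; \\
\qquad \vdots \\
x_1 x_2 \cdots x_{n-2} (x_{n-1} x_n) & \text{if } r = n - 1.
\end{cases}
\]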
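Also the General Associative Law for products of fewer than n elements
ensures that if r < n − 1 then p = x_1 x_2 · · · x_n. It therefore remains only to
verify that
x_1 x_2 · · · x_{n-2} (x_{n-1} x_n) = x_1 x_2 · · · x_n.
The case when n = 3 is the Associative Law for products of three elements.
For n > 3 let y be the product x_1 x_2 · · · x_{n-2} of the elements x_1, x_2, . . . , x_{n-2}
(with y = x_1 x_2 in the case when n = 4). Then
x_1 x_2 · · · x_{n-2} (x_{n-1} x_n) = y(x_{n-1} x_n) = (y x_{n-1}) x_n = (x_1 x_2 · · · x_{n-1}) x_n = x_1 x_2 · · · x_n.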
We have thus shown that if the General Associative Law holds for all products
involving fewer than n elements of the group G, then it holds for all products
involving n elements of G. The validity of the General Associative Law
therefore follows by induction on the number of elements occurring in the
product in question.
Note that the only group axiom used in verifying the General Associative
Law is the Associative Law for products of three elements. It follows from
this that the General Associative Law holds for any binary operation on a
set that satisfies the Associative Law for products of three elements. (A set
with a binary operation satisfying the Associative Law is referred to as a
semigroup—the General Associative Law holds in all semigroups.)
2.6 Subgroups
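Definition Let G be a group, and let H be a subset of G. We say that H
is a subgroup of G if the following conditions are satisfied:
• the identity element of G is an element of H;
• the product of any two elements of H is itself an element of H;
• the inverse of any element of H is itself an element of H.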
Lemma 2.5 Let x be an element of a group G. Then the set of all elements
of G that are of the form x^n for some integer n is a subgroup of G.
Proof Let H = {x^n : n ∈ Z}. Then the identity element belongs to H, since
it is equal to x^0. The product of two elements of H is itself an element of
H, since x^m x^n = x^{m+n} for all integers m and n (see Theorem 2.4). Also the
inverse of an element of H is itself an element of H since (x^n)^{-1} = x^{-n} for
all integers n. Thus H is a subgroup of G, as required.
Example The group of all rotations of the plane about the origin through an
integer multiple of 2π/n radians is a cyclic group of order n for all positive
integers n. This group is generated by an anticlockwise rotation through an
angle of 2π/n radians.
Lemma 2.7 Let G be a finite cyclic group with generator x, and let j and
k be integers. Then x^j = x^k if and only if j − k is divisible by the order of
the group.
Proof First we show that x^m = e for some strictly positive integer m, where
e is the identity element of G. Now x^j = x^k for some integers j and k with
j < k, since G is finite. Let m = k − j. Then m > 0 and x^m = x^k (x^j)^{-1} = e.
Let n be the smallest strictly positive integer for which x^n = e. Now any
integer i can be expressed in the form i = qn + r, where q and r are integers
and 0 ≤ r < n. (Thus q is the greatest integer for which qn ≤ i.) Then
x^i = (x^n)^q x^r = x^r (since x^n = e). Now the choice of n ensures that x^r ≠ e
if 0 < r < n. It follows that an integer i satisfies x^i = e if and only if n
divides i.
Let j and k be integers. Now x^j = x^k if and only if x^{j-k} = e, since
x^{j-k} = x^j (x^k)^{-1}. It follows that x^j = x^k if and only if j − k is divisible by n.
Moreover n is the order of the group G, since each element of G is equal to
one of the elements x^i with 0 ≤ i < n and these elements are distinct.
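The statement of Lemma 2.7 can be illustrated on a small cyclic group. The sketch below assumes the group of non-zero residues modulo 7 with generator 3 (an illustrative choice, not taken from the text); the function name `order` is hypothetical. It finds the smallest positive n with x^n = e, as in the proof, and then compares x^j = x^k with divisibility of j − k by n.

```python
# Illustrative assumptions: the cyclic group of non-zero residues mod P = 7,
# generated by X = 3.
P = 7
X = 3

def order(x, p):
    """Smallest strictly positive n with x^n = 1 modulo p."""
    n, y = 1, x % p
    while y != 1:
        y = y * x % p
        n += 1
    return n

n = order(X, P)

# x^j = x^k exactly when n divides j - k, for a range of sample exponents.
consistent = all((pow(X, j, P) == pow(X, k, P)) == ((j - k) % n == 0)
                 for j in range(20) for k in range(20))

# The powers x^0, ..., x^(n-1) are distinct and exhaust the group.
powers = [pow(X, i, P) for i in range(n)]
print(n, consistent, sorted(powers))
```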
2.8 Cosets and Lagrange’s Theorem
Definition Let H be a subgroup of a group G. A left coset of H in G is a
subset of G that is of the form xH, where x ∈ G and
xH = {y ∈ G : y = xh for some h ∈ H}.
Similarly a right coset of H in G is a subset of G that is of the form Hx,
where x ∈ G and
Hx = {y ∈ G : y = hx for some h ∈ H}.
Note that a subgroup H of a group G is itself a left coset of H in G.
Lemma 2.8 Let H be a subgroup of a group G. Then the left cosets of H
in G have the following properties:
(i) x ∈ xH for all x ∈ G;
(ii) if x and y are elements of G, and if y = xa for some a ∈ H, then
xH = yH;
(iii) if x and y are elements of G, and if xH ∩ yH is non-empty then xH =
yH.
Proof Let x ∈ G. Then x = xe, where e is the identity element of G. But
e ∈ H. It follows that x ∈ xH. This proves (i).
Let x and y be elements of G, where y = xa for some a ∈ H. Then
yh = x(ah) and xh = y(a^{-1} h) for all h ∈ H. Moreover ah ∈ H and a^{-1} h ∈ H
for all h ∈ H, since H is a subgroup of G. It follows that yH ⊂ xH and
xH ⊂ yH, and hence xH = yH. This proves (ii).
Finally suppose that xH ∩ yH is non-empty for some elements x and y
of G. Let z be an element of xH ∩ yH. Then z = xa for some a ∈ H, and
z = yb for some b ∈ H. It follows from (ii) that zH = xH and zH = yH.
Therefore xH = yH. This proves (iii).
Lemma 2.9 Let H be a finite subgroup of a group G. Then each left coset
of H in G has the same number of elements as H.
Proof Let H = {h_1, h_2, . . . , h_m}, where h_1, h_2, . . . , h_m are distinct, and let x
be an element of G. Then the left coset xH consists of the elements x h_j for
j = 1, 2, . . . , m. Suppose that j and k are integers between 1 and m for which
x h_j = x h_k. Then h_j = x^{-1}(x h_j) = x^{-1}(x h_k) = h_k, and thus j = k, since
h_1, h_2, . . . , h_m are distinct. It follows that the elements x h_1, x h_2, . . . , x h_m are
distinct. We conclude that the subgroup H and the left coset xH both have
m elements, as required.
Theorem 2.10 (Lagrange’s Theorem) Let G be a finite group, and let H be
a subgroup of G. Then the order of H divides the order of G.
The proof of Lagrange’s Theorem shows that the index [G: H] of a sub-
group H of a finite group G is given by [G: H] = |G|/|H|.
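The coset properties of Lemmas 2.8 and 2.9 and the counting behind Lagrange’s Theorem can be checked by direct enumeration. The following is a minimal sketch, assuming the group of permutations of {0, 1, 2} and the two-element subgroup generated by the transposition swapping 0 and 1 (illustrative choices); permutations are stored as tuples p with p[i] the image of i, and the helper `compose` is hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

# Illustrative assumptions: G is the symmetric group on {0, 1, 2},
# H is the identity together with the transposition swapping 0 and 1.
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]

# Each left coset xH is stored as a frozenset, so equal cosets coincide.
left_cosets = {frozenset(compose(x, h) for h in H) for x in G}

sizes = {len(c) for c in left_cosets}      # every coset has |H| elements
index = len(left_cosets)                   # the index [G : H]
covered = set().union(*left_cosets)        # the cosets cover G

print(sizes, index, len(G) == index * len(H))
```

Because distinct cosets are disjoint and all have |H| elements, the count |G| = [G : H] |H| appears here as 6 = 3 × 2.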
Proof Let H be the set of all elements of G that are of the form x^n for some
integer n. Then H is a subgroup of G (see Lemma 2.5), and the order of
H is the order of x. But the order of H divides the order of G by Lagrange’s
Theorem (Theorem 2.10). The result follows.
and we can use analogous notation to denote the product of four or more
subsets of G.
If A, B and C are subsets of a group G, and if A ⊂ B then clearly
AC ⊂ BC and CA ⊂ CB.
Note that if H is a subgroup of the group G and if x is an element of G
then xH is the left coset of H in G that contains the element x. Similarly
Hx is the right coset of H in G that contains the element x.
If H is a subgroup of G then HH = H. Indeed HH ⊂ H, since the
product of two elements of a subgroup H is itself an element of H. Also
H ⊂ HH since h = eh for any element h of H, where e, the identity element
of G, belongs to H.
Definition A subgroup N of a group G is said to be a normal subgroup of
G if x n x^{-1} ∈ N for all n ∈ N and x ∈ G.
The notation ‘N ⊲ G’ signifies ‘N is a normal subgroup of G’.
Definition A group G is said to be simple if the only normal subgroups of
G are the whole of G and the trivial subgroup {e} whose only element is the
identity element e of G.
Lemma 2.13 Every subgroup of an Abelian group is a normal subgroup.
Proof Let N be a subgroup of an Abelian group G. Then
x n x^{-1} = (xn)x^{-1} = (nx)x^{-1} = n(x x^{-1}) = ne = n
for all n ∈ N and x ∈ G, where e is the identity element of G. The result
follows.
Example Let S3 be the group of permutations of the set {1, 2, 3}, and let
H be the subgroup of S3 consisting of the identity permutation and the
transposition (1 2). Then H is not normal in S3, since (2 3)^{-1}(1 2)(2 3) =
(2 3)(1 2)(2 3) = (1 3) and (1 3) does not belong to the subgroup H.
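The computation in this example can be reproduced mechanically. The sketch below is only an illustration under stated assumptions: the set {1, 2, 3} is relabelled as {0, 1, 2}, permutations are stored as tuples p with p[i] the image of i, and the helper names `compose`, `inverse` and `is_normal` are hypothetical. It tests the defining condition x n x^{-1} ∈ N for every pair of elements.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_normal(N, G):
    """Check that x n x^{-1} lies in N for all n in N and x in G."""
    members = set(N)
    return all(compose(compose(x, n), inverse(x)) in members
               for x in G for n in N)

G = list(permutations(range(3)))           # S3 on the relabelled set {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]                 # identity and the transposition (1 2)
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]     # identity and the two 3-cycles

t12 = (1, 0, 2)                            # the transposition (1 2), relabelled
t23 = (0, 2, 1)                            # the transposition (2 3), relabelled
conjugate = compose(compose(inverse(t23), t12), t23)

print(conjugate, is_normal(H, G), is_normal(A3, G))
```

The conjugate printed is the relabelled transposition (1 3), which lies outside H; by contrast the subgroup of 3-cycles passes the same test.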
Proposition 2.14 A subgroup N of a group G is a normal subgroup of G if
and only if x N x^{-1} = N for all elements x of G.
Proof Suppose that N is a normal subgroup of G. Let x be an element
of G. Then x N x^{-1} ⊂ N. (This follows directly from the definition of a
normal subgroup.) On replacing x by x^{-1} we see also that x^{-1} N x ⊂ N, and
thus N = x(x^{-1} N x)x^{-1} ⊂ x N x^{-1}. Thus each of the sets N and x N x^{-1} is
contained in the other, and therefore x N x^{-1} = N.
Conversely if N is a subgroup of G with the property that x N x^{-1} = N
for all x ∈ G, then it follows immediately from the definition of a normal
subgroup that N is a normal subgroup of G.
Corollary 2.15 A subgroup N of a group G is a normal subgroup of G if
and only if xN = N x for all elements x of G.
Proof Let x, y and z be any elements of G. Then the product of the cosets
xN and yN is the coset (xy)N . The subgroup N is itself a coset of N in G,
since N = eN . Moreover
Example Consider the dihedral group D8 of order 8, which we represent as
the group of symmetries of a square in the plane with corners at the points
whose Cartesian co-ordinates are (1, 1), (−1, 1), (−1, −1) and (1, −1). Then
D8 = {I, R, R^2, R^3, T_1, T_2, T_3, T_4},
        N   A   B   C
    N   N   A   B   C
    A   A   N   C   B
    B   B   C   N   A
    C   C   B   A   N
k of H such that x = uh and y = vk. Then xy = uhvk = uv(v^{-1}hvk). Now
v^{-1}hv ∈ H since h ∈ H and H is normal in G. It follows that v^{-1}hvk ∈
H, since the product of any two elements of a subgroup belongs to that
subgroup. We deduce that if x ∼_H u and y ∼_H v then xy ∼_H uv. Also
x^{-1} = (uh)^{-1} = h^{-1}u^{-1} = u^{-1}(u h^{-1} u^{-1}), where u h^{-1} u^{-1} ∈ H. It follows that
if x ∼_H u then x^{-1} ∼_H u^{-1}.
Now, for any x ∈ G, let C_x denote the coset of H to which the element x
belongs. Now C_x is the equivalence class of x with respect to the equivalence
relation ∼_H. It follows from this that elements x and u satisfy C_x = C_u if
and only if x ∼_H u. We conclude that if H is normal in G, and if C_x = C_u
and C_y = C_v then C_{xy} = C_{uv} and C_{x^{-1}} = C_{u^{-1}}. One can deduce from this
that there is a well-defined group multiplication operation on cosets of H in
G, where C_x C_y is defined to be C_{xy}. The results just proved show that this
definition of C_x C_y does not depend on the choice of x and y representing their
respective cosets. The identity element is the subgroup H itself, which can
be viewed as the coset containing the identity element, and the inverse of the
coset C_x is the coset C_{x^{-1}}. One can readily verify that all the group axioms
are satisfied and thus the set of cosets of H in G does indeed constitute a
group, the quotient group G/H.
2.10 Homomorphisms
Definition A homomorphism θ: G → K from a group G to a group K is a
function with the property that θ(g1 ∗ g2 ) = θ(g1 ) ∗ θ(g2 ) for all g1 , g2 ∈ G,
where ∗ denotes the group operation on G and on K.
Example Let q be an integer. The function from the group Z of integers to
itself that sends each integer n to qn is a homomorphism.
Example Let x be an element of a group G. The function that sends each
integer n to the element x^n is a homomorphism from the group Z of integers
to G, since x^{m+n} = x^m x^n for all integers m and n (Theorem 2.4).
Lemma 2.18 Let θ: G → K be a homomorphism. Then θ(e_G) = e_K, where
e_G and e_K denote the identity elements of the groups G and K. Also θ(x^{-1}) =
θ(x)^{-1} for all elements x of G.
Proof Let z = θ(e_G). Then z^2 = θ(e_G)θ(e_G) = θ(e_G e_G) = θ(e_G) = z. The
result that θ(e_G) = e_K now follows from the fact that an element z of K
satisfies z^2 = z if and only if z is the identity element of K.
Let x be an element of G. The element θ(x^{-1}) satisfies θ(x)θ(x^{-1}) =
θ(x x^{-1}) = θ(e_G) = e_K, and similarly θ(x^{-1})θ(x) = e_K. The uniqueness of
the inverse of θ(x) now ensures that θ(x^{-1}) = θ(x)^{-1}.
An isomorphism θ: G → K between groups G and K is a homomor-
phism that is also a bijection mapping G onto K. Two groups G and K are
isomorphic if there exists an isomorphism mapping G onto K.
Example Let D6 be the group of symmetries of an equilateral triangle in
the plane with vertices A, B and C, and let S3 be the group of permutations
of the set {A, B, C}. The function which sends a symmetry of the triangle
to the corresponding permutation of its vertices is an isomorphism between
the dihedral group D6 of order 6 and the symmetric group S3 .
Example Let R be the group of real numbers with the operation of addition,
and let R+ be the group of strictly positive real numbers with the operation
of multiplication. The function exp: R → R+ that sends each real number x
to the positive real number e^x is an isomorphism: it is both a homomorphism
of groups and a bijection. The inverse of this isomorphism is the function
log: R+ → R that sends each strictly positive real number to its natural
logarithm.
Here is some further terminology regarding homomorphisms:
• A monomorphism is an injective homomorphism.
• An epimorphism is a surjective homomorphism.
• An endomorphism is a homomorphism mapping a group into itself.
• An automorphism is an isomorphism mapping a group onto itself.
Definition The kernel ker θ of the homomorphism θ: G → K is the set of
all elements of G that are mapped by θ onto the identity element of K.
Example Let the group operation on the set {+1, −1} be multiplication,
and let θ: Z → {+1, −1} be the homomorphism that sends each integer n
to (−1)^n. Then the kernel of the homomorphism θ is the subgroup of Z
consisting of all even numbers.
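This example is small enough to check on a finite sample of integers. The sketch below is an illustration only; the function name `theta` and the sample range are assumptions made here. It verifies the homomorphism property θ(m + n) = θ(m)θ(n) on the sample and lists the sampled integers sent to the identity element +1.

```python
def theta(n):
    """The map sending the integer n to (-1)^n, written without negative powers."""
    return 1 if n % 2 == 0 else -1

sample = range(-10, 11)   # an arbitrary finite window of integers

# Homomorphism property from (Z, +) to ({+1, -1}, ×) on the sample.
is_hom = all(theta(m + n) == theta(m) * theta(n)
             for m in sample for n in sample)

# Sampled elements of the kernel: those sent to the identity element +1.
kernel_sample = [n for n in sample if theta(n) == 1]

print(is_hom, kernel_sample)
```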
Lemma 2.19 Let G and K be groups, and let θ: G → K be a homomorphism
from G to K. Then the kernel ker θ of θ is a normal subgroup of G.
Proof Let x and y be elements of ker θ. Then θ(x) = e_K and θ(y) = e_K,
where e_K denotes the identity element of K. But then θ(xy) = θ(x)θ(y) =
e_K e_K = e_K, and thus xy belongs to ker θ. Also θ(x^{-1}) = θ(x)^{-1} = e_K^{-1} = e_K,
and thus x^{-1} belongs to ker θ. We conclude that ker θ is a subgroup of G.
Moreover ker θ is a normal subgroup of G, for if g ∈ G and x ∈ ker θ then
θ(g x g^{-1}) = θ(g)θ(x)θ(g)^{-1} = θ(g)θ(g^{-1}) = e_K.
If N is a normal subgroup of some group G then N is the kernel of the
quotient homomorphism θ: G → G/N that sends g ∈ G to the coset gN . It
follows therefore that a subset of a group G is a normal subgroup of G if and
only if it is the kernel of some homomorphism.
Proof The set HN clearly contains the identity element of G. Let x and y
be elements of HN. We must show that xy and x^{-1} belong to HN. Now
x = hu and y = kv for some elements h and k of H and for some elements u
and v of N. Then xy = (hk)(k^{-1}ukv). But k^{-1}uk ∈ N, since N is normal.
It follows that k^{-1}ukv ∈ N, since N is a subgroup and k^{-1}ukv is the product
of the elements k^{-1}uk and v of N. Also hk ∈ H. It follows that xy ∈ HN.
We must also show that x^{-1} ∈ HN. Now x^{-1} = u^{-1}h^{-1} = h^{-1}(h u^{-1} h^{-1}).
Also h^{-1} ∈ H, since H is a subgroup of G, and h u^{-1} h^{-1} ∈ N, since N
is a normal subgroup of G. It follows that x^{-1} ∈ HN, and thus HN is a
subgroup of G, as required.
Theorem 2.23 (First Isomorphism Theorem) Let G be a group, let H be a
subgroup of G, and let N be a normal subgroup of G. Then
\[ \frac{HN}{N} \cong \frac{H}{N \cap H}. \]
Proof Every element of HN/N is a coset of N that is of the form hN for
some h ∈ H. Thus if ϕ(h) = hN for all h ∈ H then ϕ: H → HN/N is
a surjective homomorphism, and ker ϕ = N ∩ H. But ϕ(H) ≅ H/ ker ϕ
(Corollary 2.21). Therefore HN/N ≅ H/(N ∩ H) as required.
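A consequence of Theorem 2.23 that is easy to test by enumeration is that HN/N and H/(N ∩ H) have the same number of elements. The sketch below checks only this equality of orders, not the isomorphism itself. It assumes, purely for illustration, that G is the group of permutations of {0, 1, 2, 3}, that N is the Klein four-group of double transpositions, and that H is the subgroup fixing the point 3; the helper `compose` is hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

# Illustrative assumptions for G, N and H.
G = list(permutations(range(4)))
IDENTITY = (0, 1, 2, 3)
N = {IDENTITY, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}   # normal in G
H = {p for p in G if p[3] == 3}                             # a copy of S3

HN = {compose(h, n) for h in H for n in N}
H_cap_N = H & N

# Cosets of N in HN, and cosets of N ∩ H in H, stored as frozensets.
cosets_HN_mod_N = {frozenset(compose(x, n) for n in N) for x in HN}
cosets_H_mod_cap = {frozenset(compose(h, k) for k in H_cap_N) for h in H}

print(len(HN), len(cosets_HN_mod_N), len(cosets_H_mod_cap))
```

In this instance N ∩ H is trivial and HN is the whole group, so both quotients have six elements.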
Theorem 2.24 (Second Isomorphism Theorem) Let M and N be normal
subgroups of a group G, where M ⊂ N. Then
\[ \frac{G}{N} \cong \frac{G/M}{N/M}. \]
Proof There is a well-defined homomorphism θ: G/M → G/N that sends
gM to gN for all g ∈ G. Moreover the homomorphism θ is surjective, and
ker θ = N/M. But θ(G/M) ≅ (G/M)/ ker θ (Corollary 2.21). Therefore
G/N is isomorphic to (G/M) / (N/M), as required.
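Let us first consider C2 × C3. Let x and y be generators of C2 and C3
respectively, and let e and e′ denote the identity elements of C2 and C3. Thus
C2 = {e, x} and C3 = {e′, y, y^2}, where x^2 = e and y^3 = e′. The elements of
C2 × C3 are
(e, e′), (e, y), (e, y^2), (x, e′), (x, y), (x, y^2).
Let z = (x, y). Then z^2 = (e, y^2), z^3 = (x, e′), z^4 = (e, y), z^5 = (x, y^2) and
z^6 = (e, e′).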
Thus 6 is the smallest positive integer n for which z^n is equal to the identity
element (e, e′) of the group. We deduce that the group C2 × C3 (which is a
group of order 6) must be a cyclic group generated by the element z.
Next consider C2 × C2 . This has four elements I, A, B and C, where
I = (e, e), A = (e, x), B = (x, e) and C = (x, x). If we calculate the Cayley
table for the group, we discover that it is that of the Klein 4-group.
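The contrast between C2 × C3 and C2 × C2 can be seen by listing the orders of all elements. The following sketch assumes that an element of C_m × C_n is recorded by its pair of exponents (i, j), with i taken modulo m and j modulo n, so that multiplication in the product group becomes componentwise addition of exponents; this encoding and the name `element_orders` are choices made here for illustration.

```python
from itertools import product

def element_orders(m, n):
    """Orders of all elements of C_m × C_n, keyed by exponent pairs (i, j)."""
    orders = {}
    for i, j in product(range(m), range(n)):
        k, a, b = 1, i, j
        # Keep multiplying by the element until the identity (0, 0) is reached.
        while (a, b) != (0, 0):
            a, b = (a + i) % m, (b + j) % n
            k += 1
        orders[(i, j)] = k
    return orders

orders_23 = element_orders(2, 3)
orders_22 = element_orders(2, 2)

print(orders_23[(1, 1)])            # order of the pair of generators in C2 × C3
print(max(orders_22.values()))      # largest element order in C2 × C2
```

An element of order 6 exists in C2 × C3, which is why that group is cyclic, whereas every non-identity element of C2 × C2 has order 2, as in the Klein 4-group.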
and
σ_x(σ_{x^{-1}}(g)) = x(x^{-1}g) = (x x^{-1})g = g
for all g ∈ G. It follows that, for any x ∈ G, the function σ_x: G → G is a
bijection whose inverse is σ_{x^{-1}}. It follows that σ_x is a permutation of G for all
x ∈ G, and thus the function sending an element x of G to the permutation
σ_x is a function from G to the group of permutations of G. This function
is a homomorphism. Indeed σ_{xy} = σ_x ◦ σ_y since σ_{xy}(g) = (xy)g = x(yg) =
σ_x(σ_y(g)) for all g ∈ G. The homomorphism sending x ∈ G to σ_x is
injective, for if σ_x is the identity permutation then xg = g for all g ∈ G, and
hence x is the identity element of G. It follows that G is isomorphic to the
image of the homomorphism. This image is a subgroup {σ_x : x ∈ G} of the
group of permutations of G. The result follows.
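The construction in this proof can be carried out explicitly for a small group. The sketch below assumes, for illustration, that G is the group of permutations of {0, 1, 2}, listed in a fixed order; each σ_x is then recorded as a permutation of the six positions in that list. The names `compose`, `position` and `sigma` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

# Illustrative assumption: G is the symmetric group on {0, 1, 2}.
G = list(permutations(range(3)))
position = {g: i for i, g in enumerate(G)}

def sigma(x):
    """Left multiplication by x, as a permutation of the positions 0..5 of G."""
    return tuple(position[compose(x, g)] for g in G)

# Each sigma(x) is a bijection of the positions.
all_bijections = all(sorted(sigma(x)) == list(range(len(G))) for x in G)

# sigma(xy) = sigma(x) ∘ sigma(y) for all x and y.
is_hom = all(sigma(compose(x, y)) == compose(sigma(x), sigma(y))
             for x in G for y in G)

# Distinct elements give distinct permutations.
is_injective = len({sigma(x) for x in G}) == len(G)

print(all_bijections, is_hom, is_injective)
```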
2.14 Group Actions, Orbits and Stabilizers
Definition A left action of a group G on a set X associates to each g ∈ G
and x ∈ X an element g.x of X in such a way that g.(h.x) = (gh).x and
1.x = x for all g, h ∈ G and x ∈ X, where 1 denotes the identity element of
G.
Given a left action of a group G on a set X, the orbit of an element x of
X is the subset {g.x : g ∈ G} of X, and the stabilizer of x is the subgroup
{g ∈ G : g.x = x} of G.
Lemma 2.26 Let G be a finite group which acts on a set X on the left.
Then the orbit of an element x of X contains [G: H] elements, where [G: H]
is the index of the stabilizer H of x in G.
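Lemma 2.26 can be illustrated by counting. The sketch below assumes, as an example not taken from the text, that the group of permutations of {0, 1, 2, 3} acts on the two-element subsets of that set by g.{a, b} = {g(a), g(b)}; the function name `act` is hypothetical. It computes the orbit and the stabilizer of the subset {0, 1} and compares their sizes with the order of the group.

```python
from itertools import permutations

# Illustrative assumption: G is the symmetric group on {0, 1, 2, 3},
# acting on two-element subsets.
G = list(permutations(range(4)))

def act(g, s):
    """Image of the subset s under the permutation g."""
    return frozenset(g[i] for i in s)

x = frozenset({0, 1})
orbit = {act(g, x) for g in G}
stabilizer = [g for g in G if act(g, x) == x]

# The index of the stabilizer is |G| / |stabilizer| for a finite group.
index = len(G) // len(stabilizer)

print(len(orbit), len(stabilizer), index)
```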
2.15 Conjugacy
Definition Two elements h and k of a group G are said to be conjugate if
k = g h g^{-1} for some g ∈ G.
One can readily verify that the relation of conjugacy is reflexive, sym-
metric and transitive and is thus an equivalence relation on a group G. The
equivalence classes determined by this relation are referred to as the conju-
gacy classes of G. A group G is the disjoint union of its conjugacy classes.
Moreover the conjugacy class of the identity element of G contains no other
element of G.
A group G is Abelian if and only if all its conjugacy classes contain exactly
one element of the group G.
Lemma 2.27 Let G be a finite group, and let h ∈ G. Then the number of
elements in the conjugacy class of h is equal to the index [G: C(h)] of the
centralizer C(h) of h in G.
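Lemma 2.27 can be checked by enumeration in a small group. The sketch below assumes that G is the group of permutations of {0, 1, 2, 3} (an illustrative choice) and takes the centralizer of h to be the set of elements of G that commute with h, which is the standard meaning of the term; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

# Illustrative assumption: G is the symmetric group on {0, 1, 2, 3}.
G = list(permutations(range(4)))

def conjugacy_class(h):
    """All elements g h g^{-1} with g in G."""
    return frozenset(compose(compose(g, h), inverse(g)) for g in G)

def centralizer(h):
    """All elements of G that commute with h (assumed definition of C(h))."""
    return [g for g in G if compose(g, h) == compose(h, g)]

# Class size times centralizer order equals |G| for every h.
lemma_holds = all(len(conjugacy_class(h)) * len(centralizer(h)) == len(G)
                  for h in G)

class_sizes = sorted(len(c) for c in {conjugacy_class(h) for h in G})
print(lemma_holds, class_sizes)
```

The class sizes add up to the order of the group, in line with the remark that a group is the disjoint union of its conjugacy classes.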
Let H be a subgroup of a group G. One can easily verify that g H g^{-1} is
also a subgroup of G for all g ∈ G, where g H g^{-1} = {g h g^{-1} : h ∈ H}.
Example There are two permutations of a set {a, b} with two elements.
These are the identity permutation
\[ \begin{pmatrix} a & b \\ a & b \end{pmatrix} \]
and the transposition
\[ \begin{pmatrix} a & b \\ b & a \end{pmatrix} \]
that interchanges the elements a and b.
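Example There are six permutations of a set {a, b, c} with three elements.
These are
\[
\begin{pmatrix} a & b & c \\ a & b & c \end{pmatrix}, \quad
\begin{pmatrix} a & b & c \\ a & c & b \end{pmatrix}, \quad
\begin{pmatrix} a & b & c \\ b & a & c \end{pmatrix},
\]
\[
\begin{pmatrix} a & b & c \\ b & c & a \end{pmatrix}, \quad
\begin{pmatrix} a & b & c \\ c & a & b \end{pmatrix}, \quad
\begin{pmatrix} a & b & c \\ c & b & a \end{pmatrix}.
\]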
from the fact that the inverse of a bijection is itself a bijection.) Composition
of permutations is associative: (p ◦ q) ◦ r = p ◦ (q ◦ r) for all permutations p, q
and r of S. (This can be verified by noting that ((p◦q)◦r)(x) = p(q(r(x))) =
(p ◦ (q ◦ r))(x) for all elements x of S.) It follows from this that the set of all
permutations of a set S is a group, where the group operation is composition
of permutations.
Definition For each natural number n, the symmetric group Σn is the group
of permutations of the set {1, 2, . . . , n}.
Let S be a set with k elements and let p be a permutation of S. Choose
an element a1 of S, and let elements a2 , a3 , a4 , . . . of S be defined by the
requirement that p(ai ) = ai+1 for all positive integers i. Let n be the largest
positive integer for which the elements a1 , a2 , . . . , an of S are distinct. We
claim that p(an ) = a1 .
Now the choice of n ensures that the elements a1 , a2 , . . . , an , an+1 are not
distinct. Therefore an+1 = aj for some positive integer j between 1 and n.
If j were greater than one then we would have aj = p(aj−1 ) and aj = p(an ),
which is impossible since if p is a permutation of S then exactly one element
of S must be sent to aj by p. Therefore j = 1, and thus p(an ) = a1 . Let
σ1 = (a1 a2 · · · an ).
Let T be the set S \ {a1 , a2 , . . . , an } consisting of all elements of S other
than a1 , a2 , . . . , an . Now a1 = p(an ), and ai = p(ai−1 ) for i = 2, 3, . . . , n.
Thus if x ∈ T then p(x) ≠ a_i for i = 1, 2, . . . , n (since the function p: S → S
is injective), and therefore p(x) ∈ T. We can therefore define a function
q: T → T, where q(x) = p(x) for all x ∈ T. This function has a well-defined
inverse q^{-1}: T → T where q^{-1}(x) = p^{-1}(x) for all x ∈ T. It follows
that q: T → T is a permutation of T. The induction hypothesis ensures
that this permutation is the identity permutation of T, or is a cycle, or can
be expressed as a composition of two or more disjoint cycles. These cycles
extend to permutations of S that fix the elements a1 , a2 , . . . , an , and these
permutations of S are also cycles. It follows that either p = σ1 (and q is the
identity permutation of T ), or else p = σ1 σ2 . . . σm , where σ2 , σ3 , . . . , σm are
disjoint cycles of S that fix a1 , a2 , . . . , an and correspond to cycles of T . Thus
if the result holds for permutations of sets with fewer than k elements, then
it holds for permutations of sets with k elements. It follows by induction on
k that the result holds for permutations of finite sets.
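The inductive argument above translates directly into a procedure: pick an unused element, follow it under p until it returns to its starting point, record the cycle, and repeat on the remaining elements. The following is a minimal sketch under the assumption that a permutation of a finite set is stored as a Python dict sending each element to its image; the function name `disjoint_cycles` and the sample permutation are illustrative only.

```python
def disjoint_cycles(p):
    """Decompose a permutation, given as a dict element -> image, into
    disjoint cycles of length at least 2 (fixed points are omitted)."""
    seen = set()
    cycles = []
    for start in p:
        if start in seen:
            continue
        # Follow start, p(start), p(p(start)), ... until the cycle closes.
        cycle = [start]
        seen.add(start)
        nxt = p[start]
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = p[nxt]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

# A sample permutation of {1, ..., 6}: 1 -> 3 -> 4 -> 1, 2 <-> 5, 6 fixed.
p = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2, 6: 6}
print(disjoint_cycles(p))   # [(1, 3, 4), (2, 5)]
```

The identity permutation yields an empty list, matching the separate treatment of the identity in the statement of the result.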
Lemma 2.29 Every permutation of a finite set with more than one element
can be expressed as a finite composition of transpositions.
It follows from Proposition 2.28 that a permutation of S that is not the iden-
tity permutation can be expressed as a finite composition of transpositions.
Moreover the identity permutation of S can be expressed as the composition
of any transposition with itself, provided that S has more than one element.
The result follows.
Theorem 2.30 A permutation of a finite set cannot be expressed in one way
as a composition of an odd number of transpositions and in another way as
a composition of an even number of transpositions.
Proof We can identify the finite set with the set {1, 2, . . . , n}, where n is the
number of elements in the finite set. Let F: Z^n → Z be the function sending
each n-tuple (m_1, m_2, . . . , m_n) of integers to the product
\[ F(m_1, m_2, \ldots, m_n) = \prod_{1 \le j < k \le n} (m_k - m_j) \]
of the quantities m_k − m_j for all pairs (j, k) of integers satisfying 1 ≤ j < k ≤
n. Note that F(m_1, m_2, . . . , m_n) ≠ 0 whenever the integers m_1, m_2, . . . , m_n
are distinct. If we transpose two of the integers m_1, m_2, . . . , m_n then this
changes the sign of the function F, since the number of factors of the product
that change sign is odd. (Indeed if we transpose m_s and
m_t, where 1 ≤ s < t ≤ n, then the factor m_t − m_s changes sign, the factor
m_t − m_i becomes −(m_i − m_s) and the factor m_i − m_s becomes −(m_t − m_i)
for each integer i for which s < i < t.) But any permutation σ of the
set {1, 2, . . . , n} is a composition of transpositions. It follows that to each
permutation σ of {1, 2, . . . , n} there corresponds a number ε_σ, where ε_σ = +1
or −1, such that F(m_σ(1), m_σ(2), . . . , m_σ(n)) = ε_σ F(m_1, m_2, . . . , m_n) for all
integers m_1, m_2, . . . , m_n. Moreover ε_{στ} = ε_σ ε_τ for all permutations σ and τ
of the set {1, 2, . . . , n}. Also ε_τ = −1 if the permutation τ is a transposition.
It follows that if σ is expressible as a composition of r transpositions then
ε_σ = (−1)^r. If σ is also expressible as a composition of s transpositions then
ε_σ = (−1)^s, and hence (−1)^r = (−1)^s. But then r − s must be divisible by
2. The result follows.
A permutation of a finite set is said to be even if it is expressible as the
composition of an even number of transpositions. A permutation of a finite
set is said to be odd if it is expressible as the composition of an odd number
of transpositions.
Any permutation of a finite set is expressible as a composition of trans-
positions (Lemma 2.29) and must therefore be either even or odd. However
Theorem 2.30 ensures that a permutation of a finite set cannot be both even
and odd.
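The number ε_σ appearing in the proof of Theorem 2.30 can be computed directly from the product F. The sketch below is an illustration under stated assumptions: the set is relabelled as {0, 1, . . . , n − 1}, a permutation σ is stored as the tuple (σ(0), . . . , σ(n − 1)), and F is evaluated at m_i = i, so that ε_σ is the ratio F(σ(0), . . . , σ(n − 1)) / F(0, . . . , n − 1). The names `F`, `sign` and `compose` are chosen here; `math.prod` requires Python 3.8 or later.

```python
from itertools import permutations
from math import prod

def F(ms):
    """Product of (m_k - m_j) over all pairs of positions j < k."""
    n = len(ms)
    return prod(ms[k] - ms[j] for j in range(n) for k in range(j + 1, n))

def sign(sigma):
    """The number ε_σ, obtained as F(σ(0), ..., σ(n-1)) / F(0, ..., n-1)."""
    base = tuple(range(len(sigma)))
    return F(sigma) // F(base)      # the quotient is exactly +1 or -1

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))

# ε is multiplicative, and a transposition has ε = -1.
multiplicative = all(sign(compose(s, t)) == sign(s) * sign(t)
                     for s in S4 for t in S4)
transposition_sign = sign((1, 0, 2, 3))
even_count = sum(1 for s in S4 if sign(s) == 1)

print(multiplicative, transposition_sign, even_count)
```

Permutations with ε_σ = +1 are the even ones, so exactly half of the 24 permutations counted here are even.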
Lemma 2.31 An n-cycle is even if n is odd, and is odd if n is even.
Proof An n-cycle (a1 , a2 , . . . , an ) is expressible as a composition of n − 1
transpositions, since
(a1 a2 · · · an ) = (a1 a2 )(a2 a3 ) · · · (an−1 an ).
Thus an n-cycle is even if n − 1 is even, and is odd if n − 1 is odd.
Note that, for each integer n satisfying n > 1, the alternating group An
is a normal subgroup of Σn of index 2.
Lemma 2.33 All cycles of order k in the alternating group An are conjugate
to one another, provided that k ≤ n − 2.
be composed with the transposition that interchanges n − 1 and n to obtain
an even permutation ρ with the required property. Then (m_1 m_2 · · · m_k) =
ρ(1 2 · · · k)ρ^{-1}. Thus if k ≤ n − 2 then all cycles of order k in An are conjugate
to (1 2 · · · k) and are therefore conjugate to one another, as required.
We recall that a group G is simple if and only if the only normal subgroups
of G are G itself and the trivial subgroup whose only element is the identity
element of G. The alternating group A4 is not simple. We shall prove that
An is simple when n ≥ 5.
Proof Let X = {1, 2, . . . , n}. The proof divides into two cases, depending
on whether or not the normal subgroup N contains a permutation ρ of X
with the property that ρ^2 is not the identity permutation.
Suppose that the normal subgroup N contains a permutation ρ of X with
the property that ρ^2 is not the identity permutation. Then there exists a ∈ X
such that ρ(ρ(a)) ≠ a. Let b = ρ(a) and c = ρ(b). Then the elements a, b
and c are distinct. Choose elements d and e of X such that a, b, c, d and e
are distinct. (This is possible since the set X has n elements, where n ≥ 5.)
Let ρ′ = (c d e)ρ(c d e)^{-1}. Then ρ′ ∈ N (since ρ ∈ N and N is a normal
subgroup), ρ′(a) = b and ρ′(b) = d. Now ρ′ ≠ ρ, since ρ′(b) ≠ ρ(b). Thus if
σ = ρ^{-1}ρ′ then σ ∈ N, σ(a) = a, and σ is not the identity permutation.
It remains to prove the result in the case where ρ^2 is the identity permutation
for all ρ ∈ N. In this case choose ρ ∈ N, where ρ is not the identity
permutation, let a be an element of X for which ρ(a) ≠ a, and let b = ρ(a).
The permutation ρ is even (since it belongs to the alternating group An), and
therefore ρ cannot be the transposition (a b). It follows that there exists an
element c, distinct from a and b, such that ρ(c) ≠ c. Let d = ρ(c). Then the
elements a, b, c and d of X are distinct. Choose an element e of X which is
distinct from a, b, c and d. (This is possible since the set X has n elements,
where n ≥ 5.) Let ρ′ = (c d e)ρ(c d e)^{-1}. Then ρ′(a) = b and ρ′(d) = e. Now
ρ′ ≠ ρ, since ρ′(d) ≠ ρ(d). Thus if σ = ρ^{-1}ρ′ then σ ∈ N, σ(a) = a, and σ is
not the identity permutation.
must contain the permutations (1 2)(3 4), (1 3)(2 4) and (1 4)(2 3) since the
two non-trivial normal subgroups of A4 each contain these permutations. But
then the normal subgroup N of A5 contains also the permutation (1 2)(4 5),
since (1 2)(4 5) = (3 4 5)(1 2)(3 4)(3 4 5)^{-1}. It follows that N contains the
cycle (3 4 5), since (3 4 5) = (1 2)(3 4)(1 2)(4 5). It follows from Lemma 2.35
that N = A5 . Thus the group A5 is simple.
We now prove that An is simple for n > 5 by induction on n. Thus
suppose that n > 5 and the group An−1 is simple. Let N be a non-trivial
normal subgroup of An , and let H = {ρ ∈ An : ρ(n) = n}. It follows from
Lemma 2.34 that there exists σ ∈ N , where σ is not the identity permutation,
and a ∈ {1, 2, . . . , n} such that σ(a) = a. Choose ρ ∈ An such that ρ(a) = n,
and let σ′ = ρσρ^{-1}. Then σ′ ∈ N and σ′(n) = n, and therefore σ′ ∈ H ∩ N.
But σ′ is not the identity permutation. Thus H ∩ N is a non-trivial normal
subgroup of H. But the subgroup H of An is simple, since it is isomorphic to
An−1 . It follows that N ∩ H = H, and thus H ⊂ N . But then N contains a
3-cycle, and therefore N = An (Lemma 2.35). Thus the group An is simple.
We conclude by induction on n that the group An is simple whenever n ≥ 5,
as required.
Example We now show that the only normal subgroups of the symmetric
group Σ4 are the trivial subgroup, the Klein Viergruppe V4 , the alternating
group A4 and Σ4 itself.
The trivial group and the groups V4 and A4 are normal subgroups of Σ4 .
Moreover they are the only normal subgroups of Σ4 contained in A4 , since
they are the only normal subgroups of A4 .
Let N be a normal subgroup of Σ4 that is not contained in A4 . Then
N ∩ A4 is a normal subgroup of A4 . One can readily verify that Σ4 contains
no normal subgroup of order 2. It follows that V4 ⊂ N, since the only normal
subgroups of A4 other than the trivial subgroup are the groups V4 and A4.
Now the only odd permutations in Σ4 are transpositions and cycles of order 4.
Moreover if N contains a cycle of order 4 then N contains a transposition,
since V4 ⊂ N and
(m1 m2 )(m3 m4 )(m1 m2 m3 m4 ) = (m2 m4 )
for all cycles (m1 m2 m3 m4 ) of order 4. It follows that if N is a normal
subgroup of Σ4 that is not contained in A4 then N must contain at least one
transposition. But then N contains all transpositions, and therefore N = Σ4 .
This shows that the only normal subgroups of Σ4 are the trivial group, the
Klein Viergruppe V4 , the alternating group A4 and Σ4 itself.
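The conclusion of this example can be cross-checked by brute force. The sketch below assumes Σ4 is realised as the permutations of {0, 1, 2, 3} and computes, for every element h, the smallest normal subgroup containing h (the subgroup generated by all conjugates of h). This is a consistency check rather than a substitute for the argument above; all helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(4)))
IDENTITY = (0, 1, 2, 3)

def generated_subgroup(gens):
    """Close a set of elements under products; in a finite group the
    result is the subgroup generated by those elements."""
    elems = {IDENTITY} | set(gens)
    frontier = list(elems)
    while frontier:
        a = frontier.pop()
        for b in list(elems):
            for c in (compose(a, b), compose(b, a)):
                if c not in elems:
                    elems.add(c)
                    frontier.append(c)
    return frozenset(elems)

def normal_closure(h):
    """Smallest normal subgroup of G containing h."""
    conjugates = {compose(compose(g, h), inverse(g)) for g in G}
    return generated_subgroup(conjugates)

closures = {normal_closure(h) for h in G}
closure_orders = sorted(len(N) for N in closures)
print(closure_orders)
```

Only four distinct subgroups arise, of orders 1, 4, 12 and 24, which is consistent with the list consisting of the trivial subgroup, V4, A4 and Σ4.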
Proof We prove the result by induction on n. The result is clearly true
when n = 1, since every non-trivial subgroup of Z is of the form kZ for some
positive integer k. Suppose therefore that n > 1 and that the result holds
for all subgroups of Z^{n-1}. We must show that the result then holds for all
subgroups H of Z^n.
Let k_1 be the smallest strictly positive integer for which there exists some
integral basis u_1, u_2, . . . , u_n of Z^n and some element of H of the form m_1 u_1 +
m_2 u_2 + · · · + m_n u_n where m_1, m_2, . . . , m_n are integers and m_i = k_1 for some
integer i satisfying 1 ≤ i ≤ n. Let u_1, u_2, . . . , u_n be such a basis, with i = 1,
and let h_0 be an element of H for which h_0 = m_1 u_1 + m_2 u_2 + · · · + m_n u_n,
where m_1, m_2, . . . , m_n are integers and m_1 = k_1.
We show that each coefficient m_i is divisible by k_1. Now, for each i,
there exist integers q_i and r_i such that m_i = q_i k_1 + r_i and 0 ≤ r_i < k_1. Let
\[ b_1 = u_1 + \sum_{i=2}^{n} q_i u_i. \]
Then b_1, u_2, . . . , u_n is an integral basis of Z^n and
\[ h_0 = k_1 b_1 + \sum_{i=2}^{n} r_i u_i. \]
The choice of k_1 now ensures that the coefficients r_i cannot be strictly positive
(as they are less than k_1), and therefore r_i = 0 and m_i = q_i k_1 for i =
2, 3, . . . , n. Moreover h_0 = k_1 b_1.
Now let ϕ: Z^{n-1} → Z^n be the injective homomorphism sending each element
(m_2, m_3, . . . , m_n) of Z^{n-1} to \sum_{i=2}^{n} m_i u_i, and let H̃ = ϕ^{-1}(H). Then,
given any element h of H, there exist an integer m and an element h̃ of Z^{n-1}
such that h = m b_1 + ϕ(h̃). Moreover m and h̃ are uniquely determined by
h, since b_1, u_2, . . . , u_n is an integral basis of Z^n. Let m = q k_1 + r, where q
and r are integers and 0 ≤ r < k_1. Then h − q h_0 = r b_1 + ϕ(h̃), where ϕ(h̃)
is expressible as a linear combination of u_2, . . . , u_n with integer coefficients.
The choice of k_1 now ensures that r cannot be strictly positive, and therefore
r = 0. Then ϕ(h̃) ∈ H, and hence h̃ ∈ H̃. We conclude from this that, given
any element h of H, there exist an integer q and an element h̃ of H̃ such
that h = q k_1 b_1 + ϕ(h̃). Moreover q and h̃ are uniquely determined by h.
Now the induction hypothesis ensures the existence of an integral basis
b̃_2, b̃_3, . . . , b̃_n of Z^{n-1} for which there exist positive integers k_2, k_3, . . . , k_s such
that k_2 b̃_2, k_3 b̃_3, . . . , k_s b̃_s is an integral basis of H̃. Let b_i = ϕ(b̃_i) for each
integer i between 2 and n. One can then readily verify that b_1, b_2, . . . , b_n is
an integral basis of Z^n and k_1 b_1, k_2 b_2, . . . , k_s b_s is an integral basis of H, as
required.
An Abelian group G is generated by elements g1 , g2 , . . . , gn if and only if
every element of G is expressible in the form g1^m1 g2^m2 · · · gn^mn for some integers
m1 , m2 , . . . , mn .
Lemma 2.38 A non-trivial Abelian group G is finitely generated if and only
if there exists a positive integer n and some surjective homomorphism θ: Zn →
G.
G ≅ Ck1 × Ck2 × · · · × Cks × Zn−s ,
m1 b1 + m2 b2 + · · · + mn bn
of Zn to (a1^m1 , a2^m2 , . . . , as^ms , ms+1 , . . . , mn ), where ai is a generator of the
Corollary 2.40 Let G be a non-trivial finite Abelian group. Then there exist
positive integers k1 , k2 , . . . , kn such that G ≅ Ck1 × Ck2 × · · · × Ckn , where Cki
is a cyclic group of order ki for i = 1, 2, . . . , n.
With some more work it is possible to show that the positive integers
k1 , k2 , . . . , ks in Theorem 2.39 may be chosen such that k1 > 1 and ki−1
divides ki for i = 2, 3, . . . , s, and that the Abelian group is then determined
up to isomorphism by the integer n and the sequence of positive integers
k1 , k2 , . . . , ks .
|G| = |Z(G)| + n1 + n2 + · · · + nr .
2.21 Cauchy’s Theorem
Theorem 2.42 (Cauchy) Let G be a finite group, and let p be a prime
number that divides the order of G. Then G contains an element of order p.
Lemma 2.43 Let p be a prime number, and let G be a p-group. Then there
exists a normal subgroup of G of order p that is contained in the centre of G.
Proof Let |G| = p^k . Then p^k divides the order of G but does not divide the
order of any proper subgroup of G. It follows from Proposition 2.41 that p
divides the order of the centre of G. It then follows from Cauchy’s Theorem
(Theorem 2.42) that the centre of G contains some element of order p. This
element generates a cyclic subgroup of order p, and this subgroup is normal
since its elements commute with every element of G.
Proof We prove the result by induction on the order of G. Thus suppose
that the result holds for all p-groups whose order is less than that of G. Let
Z be the centre of G. Then ZH is a well-defined subgroup of G, since Z is
a normal subgroup of G.
Suppose that ZH ≠ H. Then H is a normal subgroup of ZH. The
quotient group ZH/H is a p-group, and contains a subgroup K1 of order p
(Lemma 2.43). Let K = {g ∈ ZH : gH ∈ K1 }. Then H ◁ K and K/H ≅ K1 ,
and therefore K is the required subgroup of G.
Finally suppose that ZH = H. Then Z ⊂ H. Let H1 = {hZ : h ∈ H}.
Then H1 is a subgroup of G/Z. But G/Z is a p-group, and |G/Z| < |G|,
since |Z| ≥ p (Lemma 2.43). The induction hypothesis ensures the existence
of a subgroup K1 of G/Z such that H1 ◁ K1 and K1 /H1 is cyclic of order p.
Let K = {g ∈ G : gZ ∈ K1 }. Then H ◁ K and K/H ≅ K1 /H1 . Thus K is
the required subgroup of G.
Theorem 2.46 (First Sylow Theorem) Let G be a finite group, and let p be a
prime number dividing the order of G. Then G contains a Sylow p-subgroup.
order p^{k−1} , since |G/N | = |G|/p. Let K = {g ∈ G : gN ∈ L}. Then
|K| = p|L| = p^k , and thus K is the required Sylow p-subgroup of G.
Theorem 2.47 (Second Sylow Theorem) Let G be a finite group, and let
p be a prime number dividing the order of G. Then all Sylow p-subgroups
of G are conjugate, and any p-subgroup of G is contained in some Sylow p-
subgroup of G. Moreover the number of Sylow p-subgroups in G divides the
order of G and is congruent to 1 modulo p.
Proof Let K be a Sylow p-subgroup of G, and let X be the set of left cosets
of K in G. Let H be a p-subgroup of G. Then H acts on X on the left,
where h(gK) = hgK for all h ∈ H and g ∈ G. Moreover h(gK) = gK if and
only if g^{-1}hg ∈ K. Thus an element gK of X is fixed by H if and only if
g^{-1}Hg ⊂ K.
Let |G| = p^k m, where k and m are positive integers and m is coprime to
p. Then |K| = p^k . Now the number of left cosets of K in G is |G|/|K|. Thus
the set X has m elements. Now the number of elements in any orbit for the
action of H on X divides the order of H, since it is the index in H of the
stabilizer of some element of that orbit (Lemma 2.26). But then the number
of elements in each orbit must be some power of p, since H is a p-group.
Thus if an element of X is not fixed by H then the number of elements in its
orbit is divisible by p. But X is a disjoint union of orbits under the action
of H on X. Thus if m′ denotes the number of elements of X that are fixed
by H then m − m′ is divisible by p.
Now m is not divisible by p. It follows that m′ ≠ 0, and m′ is not divisible
by p. Thus there exists at least one element g of G such that g^{-1}Hg ⊂ K. But
then H is contained in the Sylow p-subgroup gKg^{-1} . Thus every p-subgroup
is contained in a Sylow p-subgroup of G, and this Sylow p-subgroup is a
conjugate of the given Sylow p-subgroup K. In particular any two Sylow
p-subgroups are conjugate.
It only remains to show that the number of Sylow p-subgroups in G
divides the order of G and is congruent to 1 modulo p. Now choosing the
p-subgroup H of G to be the Sylow p-subgroup K itself enables us to deduce
that g^{-1}Kg = K for an element g of G if and only if gK is a fixed point for the
action of K on X. But the number of elements g of G for which gK is a
fixed point is m′|K|, where m′ is the number of fixed points in X. It follows
that the number of elements g of G for which g^{-1}Kg = K is p^k m′ . But every
Sylow p-subgroup of G is of the form g^{-1}Kg for some g ∈ G. It follows that
the number n of Sylow p-subgroups in G is given by n = |G|/(p^k m′ ) = m/m′ .
In particular n divides |G|. Now we have already shown that m − m′ is
divisible by p. It follows that m′ is coprime to p, since m is coprime to p.
Also m − m′ is divisible by m′ , since (m − m′ )/m′ = n − 1. Putting these
results together, we see that m − m′ is divisible by m′p, and therefore n − 1
is divisible by p. Thus n divides |G| and is congruent to 1 modulo p, as
required.
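The counting statement of the Second Sylow Theorem can be checked by brute force in a small case. The following Python sketch is an illustration only and is not part of the original notes; the function names are hypothetical. It counts the subgroups of order 3 in the symmetric group S4. Since |S4| = 24 = 2^3 · 3, these are exactly the Sylow 3-subgroups, and each of them is cyclic because its order is prime.

```python
from itertools import permutations

def compose(a, b):
    """Return the permutation obtained by applying b first and then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def generated_subgroup(g):
    """Return the cyclic subgroup generated by the permutation g."""
    identity = tuple(range(len(g)))
    elements = {identity}
    x = g
    while x != identity:
        elements.add(x)
        x = compose(g, x)
    return frozenset(elements)

def count_subgroups_of_prime_order(n, p):
    """Return (|S_n|, number of subgroups of prime order p in S_n).

    Every group of prime order is cyclic, so it suffices to look at the
    subgroups generated by single elements.
    """
    group = list(permutations(range(n)))
    cyclic = {generated_subgroup(g) for g in group}
    return len(group), sum(1 for s in cyclic if len(s) == p)

order, count = count_subgroups_of_prime_order(4, 3)
print(order, count)  # prints: 24 4
```

Here the number 4 of Sylow 3-subgroups divides 24 and is congruent to 1 modulo 3, in agreement with the theorem.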
Proof Suppose that G is solvable. Let G0 , G1 , . . . , Gm be a finite sequence
of subgroups of G, where G0 = {1}, Gm = G, and Gi−1 ◁ Gi and Gi /Gi−1 is
Abelian for i = 1, 2, . . . , m.
We first show that the subgroup H is solvable. Let Hi = H ∩ Gi for
i = 0, 1, . . . , m. Then H0 = {1} and Hm = H. If u ∈ Hi and v ∈ Hi−1 then
uvu^{-1} ∈ H, since H is a subgroup of G. Also uvu^{-1} ∈ Gi−1 , since u ∈ Gi ,
v ∈ Gi−1 and Gi−1 is normal in Gi . Therefore uvu^{-1} ∈ Hi−1 . Thus Hi−1 is a
normal subgroup of Hi for i = 1, 2, . . . , m. Moreover
Hi /Hi−1 = (Gi ∩ H)/(Gi−1 ∩ (Gi ∩ H)) ≅ Gi−1 (Gi ∩ H)/Gi−1
by the First Isomorphism Theorem (Theorem 2.23), and thus Hi /Hi−1 is
isomorphic to a subgroup of the Abelian group Gi /Gi−1 . It follows that
Hi /Hi−1 must itself be an Abelian group. We conclude therefore that the
subgroup H of G is solvable.
Now let N be a normal subgroup of G, and let Ki = Gi N/N for all i.
Then K0 is the trivial subgroup of G/N and Km = G/N . It follows from
Lemma 2.48 that Ki−1 ◁ Ki and Ki /Ki−1 is isomorphic to the quotient of
Gi /Gi−1 by some normal subgroup. But a quotient of any Abelian group
must itself be Abelian. Thus each quotient group Ki /Ki−1 is Abelian, and
thus G/N is solvable.
Finally suppose that G is a group, N is a normal subgroup of G and
both N and G/N are solvable. We must prove that G is solvable. Now the
solvability of N ensures the existence of a finite sequence G0 , G1 , . . . , Gm of
subgroups of N , where G0 = {1}, Gm = N , and Gi−1 ◁ Gi and Gi /Gi−1 is
Abelian for i = 1, 2, . . . , m. Also the solvability of G/N ensures the existence
of a finite sequence K0 , K1 , . . . , Kn of subgroups of G/N , where K0 = N/N ,
Kn = G/N , and Ki−1 ◁ Ki and Ki /Ki−1 is Abelian for i = 1, 2, . . . , n.
Let Gm+i be the preimage of Ki under the quotient homomorphism
ν: G → G/N , for i = 1, 2, . . . , n. The Second Isomorphism Theorem
(Theorem 2.24) ensures that Gm+i /Gm+i−1 ≅ Ki /Ki−1 for all i > 0. Therefore
G0 , G1 , . . . , Gm+n is a finite sequence of subgroups of G, where G0 = {1},
Gm+n = G, and Gi−1 ◁ Gi and Gi /Gi−1 is Abelian for i = 1, 2, . . . , m + n. Thus
the group G is solvable, as required.
Course 311: Hilary Term 2000
Part III: Introduction to Galois Theory
D. R. Wilkins
Contents
3 Introduction to Galois Theory
3.1 Rings and Fields
3.2 Ideals
3.3 Quotient Rings and Homomorphisms
3.4 The Characteristic of a Ring
3.5 Polynomial Rings
3.6 Gauss’s Lemma
3.7 Eisenstein’s Irreducibility Criterion
3.8 Field Extensions and the Tower Law
3.9 Algebraic Field Extensions
3.10 Ruler and Compass Constructions
3.11 Splitting Fields
3.12 Normal Extensions
3.13 Separability
3.14 Finite Fields
3.15 The Primitive Element Theorem
3.16 The Galois Group of a Field Extension
3.17 The Galois correspondence
3.18 Quadratic Polynomials
3.19 Cubic Polynomials
3.20 Quartic Polynomials
3.21 The Galois group of the polynomial x^4 − 2
3.22 The Galois group of a polynomial
3.23 Solvable polynomials and their Galois groups
3.24 A quintic polynomial that is not solvable by radicals
3 Introduction to Galois Theory
3.1 Rings and Fields
Definition A ring consists of a set R on which are defined operations of
addition and multiplication satisfying the following axioms:
• x+y = y+x for all elements x and y of R (i.e., addition is commutative);
• (x + y) + z = x + (y + z) for all elements x, y and z of R (i.e., addition
is associative);
• there exists an element 0 of R (known as the zero element) with the
property that x + 0 = x for all elements x of R;
• given any element x of R, there exists an element −x of R with the
property that x + (−x) = 0;
• x(yz) = (xy)z for all elements x, y and z of R (i.e., multiplication is
associative);
• x(y + z) = xy + xz and (x + y)z = xz + yz for all elements x, y and z
of R (the Distributive Law ).
Lemma 3.2 Let R be a ring. Then (−x)y = −(xy) and x(−y) = −(xy) for
all elements x and y of R.
A ring R is said to be unital if it possesses a (necessarily unique) non-zero
multiplicative identity element 1 satisfying 1x = x = x1 for all x ∈ R.
Proof A field is a unital commutative ring. Let x and y be non-zero elements
of a field K. Then there exist elements x^{-1} and y^{-1} of K such that xx^{-1} = 1
and yy^{-1} = 1. Then xyy^{-1}x^{-1} = 1. It follows that xy ≠ 0, since 0(y^{-1}x^{-1}) =
0 and 1 ≠ 0.
3.2 Ideals
Definition Let R be a ring. A subset I of R is said to be an ideal of R if
0 ∈ I, a + b ∈ I, −a ∈ I, ra ∈ I and ar ∈ I for all a, b ∈ I and r ∈ R. An
ideal I of R is said to be a proper ideal of R if I ≠ R.
Lemma 3.4 A unital commutative ring R is a field if and only if the only
ideals of R are {0} and R.
We denote by (f1 , f2 , . . . , fk ) the ideal of R generated by any finite subset
{f1 , f2 , . . . , fk } of R. We say that an ideal I of the ring R is finitely generated
if there exists a finite subset of I which generates the ideal I.
Proof Let I be the subset of R consisting of all these finite sums. If J is any
ideal of R which contains the set X then J must contain each of these finite
sums, and thus I ⊂ J. Let a and b be elements of I. It follows immediately
from the definition of I that 0 ∈ I, a + b ∈ I, −a ∈ I, and ra ∈ I for all
r ∈ R. Also ar = ra, since R is commutative, and thus ar ∈ I. Thus I
is an ideal of R. Moreover X ⊂ I, since the ring R is unital and x = 1x
for all x ∈ X. Thus I is the smallest ideal of R containing the set X, as
required.
Lemma 3.6 Every ideal of the ring Z of integers is generated by some non-
negative integer n.
Proof The zero ideal is of the required form with n = 0. Let I be some
non-zero ideal of Z. Then I contains at least one strictly positive integer
(since −m ∈ I for all m ∈ I). Let n be the smallest strictly positive integer
belonging to I. If j ∈ I then we can write j = qn + r for some integers q
and r with 0 ≤ r < n. Now r ∈ I, since r = j − qn, j ∈ I and qn ∈ I.
But 0 ≤ r < n, and n is by definition the smallest strictly positive integer
belonging to I. We conclude therefore that r = 0, and thus j = qn. This
shows that I = nZ, as required.
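As an illustration of Lemma 3.6 (a sketch that is not part of the original notes), the ideal of Z generated by two integers a and b consists of all combinations ua + vb, and the proof above identifies its generator as the smallest strictly positive element. The Python code below searches for that element by brute force and compares it with the greatest common divisor of a and b (compare Theorem 1.2). The function name and the search bound are assumptions made only to keep the search finite.

```python
from math import gcd

def smallest_positive_element(a, b, bound=20):
    """Smallest strictly positive value of u*a + v*b with |u|, |v| <= bound."""
    values = (u * a + v * b
              for u in range(-bound, bound + 1)
              for v in range(-bound, bound + 1))
    return min(x for x in values if x > 0)

n = smallest_positive_element(12, 18)
print(n, gcd(12, 18))  # prints: 6 6
```

Thus the ideal generated by 12 and 18 is 6Z in this example.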
x, x′ , y and y′ are elements of R satisfying I + x = I + x′ and I + y = I + y′
then
(x + y) − (x′ + y′ ) = (x − x′ ) + (y − y′ ),
xy − x′y′ = xy − xy′ + xy′ − x′y′ = x(y − y′ ) + (x − x′ )y′ .
The verification of the following result is a straightforward exercise.
Proof Let p = char R. Clearly p ≠ 1. Suppose that p > 1 and p = jk, where
j and k are positive integers. Then (j.1)(k.1) = (jk).1 = p.1 = 0. But R is
an integral domain. Therefore either j.1 = 0, or k.1 = 0. But if j.1 = 0 then
p divides j and therefore j = p. Similarly if k.1 = 0 then k = p. It follows
that p is a prime number, as required.
a0 + a1 x + a2 x^2 + a3 x^3 + · · · ,
where the coefficients a0 , a1 , a2 , a3 , . . . of the polynomial are elements of the
ring R and only finitely many of these coefficients are non-zero. If ak = 0 then
the term ak x^k may be omitted when writing down the expression defining
the polynomial. Every polynomial can therefore be represented by
an expression of the form
a0 + a1 x + a2 x^2 + · · · + am x^m
a0 + a1 x + a2 x^2 + · · · + am x^m
f (x) = a0 + a1 x + a2 x^2 + a3 x^3 + · · ·
and
g(x) = b0 + b1 x + b2 x^2 + b3 x^3 + · · ·
then
f (x) + g(x) = (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x^2 + (a3 + b3 )x^3 + · · ·
and
f (x)g(x) = u0 + u1 x + u2 x^2 + u3 x^3 + · · ·
where, for each integer i, the coefficient ui of x^i in f (x)g(x) is the sum
of the products aj bk for all pairs (j, k) of non-negative integers satisfying
j + k = i. (Thus u0 = a0 b0 , u1 = a0 b1 + a1 b0 , u2 = a0 b2 + a1 b1 + a2 b0
etc.). Straightforward calculations show that the set R[x] of polynomials
with coefficients in a ring R is itself a ring with these operations of addition
and multiplication. The zero element of this ring is the polynomial whose
coefficients are all equal to zero.
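The rule for the coefficients of a product can be sketched in a few lines of Python. This is an illustration only, not part of the original notes: a polynomial a0 + a1 x + · · · + am x^m is represented, as an assumption of this sketch, by the list [a0, a1, ..., am], and the function name poly_mul is hypothetical.

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists [a0, a1, ...].

    The coefficient of x^i in the product is the sum of a[j] * b[k]
    over all pairs (j, k) with j + k = i.
    """
    if not a or not b:
        return []  # the empty list stands for the zero polynomial
    u = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for k, bk in enumerate(b):
            u[j + k] += aj * bk
    return u

# (1 + 2x)(3 + x^2) = 3 + 6x + x^2 + 2x^3
print(poly_mul([1, 2], [3, 0, 1]))  # prints: [3, 6, 1, 2]
```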
We now consider various properties of polynomials whose coefficients be-
long to a field K (such as the field of rational numbers, real numbers or
complex numbers).
Lemma 3.10 Let K be a field, and let f ∈ K[x] be a non-zero polynomial
with coefficients in K. Then, given any polynomial h ∈ K[x], there exist
unique polynomials q and r in K[x] such that h = f q + r and either r = 0
or else deg r < deg f .
Proof If deg h < deg f then we may take q = 0 and r = h. In general we
prove the existence of q and r by induction on the degree deg h of h. Thus
suppose that deg h ≥ deg f and that any polynomial of degree less than deg h
can be expressed in the required form. Now there is some element c of K
for which the polynomials h(x) and cf (x) have the same leading coefficient.
Let h1 (x) = h(x) − cx^m f (x), where m = deg h − deg f . Then either h1 = 0
or deg h1 < deg h. The inductive hypothesis then ensures the existence
of polynomials q1 and r such that h1 = f q1 + r and either r = 0 or else
deg r < deg f . But then h = f q + r, where q(x) = cx^m + q1 (x). We now
verify the uniqueness of q and r. Suppose that f q + r = f q̄ + r̄, where
q̄, r̄ ∈ K[x] and either r̄ = 0 or deg r̄ < deg f . Then (q − q̄)f = r̄ − r. But
deg((q − q̄)f ) ≥ deg f whenever q ≠ q̄, and deg(r̄ − r) < deg f whenever
r ≠ r̄. Therefore the equality (q − q̄)f = r̄ − r cannot hold unless q = q̄ and
r = r̄. This proves the uniqueness of q and r.
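The existence part of this proof is effectively an algorithm: repeatedly subtract cx^m f (x) to cancel the leading term. The Python sketch below follows that idea over the field Q of rational numbers. It is an illustration only and not part of the original notes; the representation of polynomials as coefficient lists [a0, a1, ...] and the names degree and poly_divmod are assumptions of the sketch.

```python
from fractions import Fraction

def degree(p):
    """Degree of a coefficient list; -1 stands for the zero polynomial."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return -1

def poly_divmod(h, f):
    """Return (q, r) with h = f*q + r and either r = 0 or deg r < deg f."""
    df = degree(f)
    if df < 0:
        raise ZeroDivisionError("f must be a non-zero polynomial")
    r = [Fraction(c) for c in h]
    q = [Fraction(0)] * max(degree(h) - df + 1, 1)
    while degree(r) >= df:
        dr = degree(r)
        m = dr - df
        c = r[dr] / Fraction(f[df])  # c*x^m*f has the same leading term as r
        q[m] += c
        for i in range(df + 1):
            r[i + m] -= c * f[i]
    return q, r[:degree(r) + 1]

# x^2 + 1 = (x - 1)(x + 1) + 2
q, r = poly_divmod([1, 0, 1], [-1, 1])
print(q == [1, 1], r == [2])  # prints: True True
```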
Any polynomial f with coefficients in a field K generates an ideal (f )
of the polynomial ring K[x] consisting of all polynomials in K[x] that are
divisible by f .
Lemma 3.11 Let K be a field, and let I be an ideal of the polynomial ring
K[x]. Then there exists f ∈ K[x] such that I = (f ), where (f ) denotes the
ideal of K[x] generated by f .
Proof If I = {0} then we can take f = 0. Otherwise choose f ∈ I such
that f ≠ 0 and the degree of f does not exceed the degree of any non-zero
polynomial in I. Then, for each h ∈ I, there exist polynomials q and r in K[x]
such that h = f q + r and either r = 0 or else deg r < deg f (Lemma 3.10).
But r ∈ I, since r = h − f q and h and f both belong to I. The choice of f
then ensures that r = 0 and h = qf . Thus I = (f ).
Definition Polynomials f1 , f2 , . . . , fk with coefficients in some field K are
said to be coprime if there is no non-constant polynomial that divides all of
them.
Theorem 3.12 Let f1 , f2 , . . . , fk be coprime polynomials with coefficients in
some field K. Then there exist polynomials g1 , g2 , . . . , gk with coefficients in
K such that
f1 (x)g1 (x) + f2 (x)g2 (x) + · · · + fk (x)gk (x) = 1.
Proof Let I be the ideal in K[x] generated by f1 , f2 , . . . , fk . It follows from
Lemma 3.11 that the ideal I is generated by some polynomial d. Then d
divides all of f1 , f2 , . . . , fk and is therefore a constant polynomial, since these
polynomials are coprime. It follows that I = K[x]. The existence of the
required polynomials g1 , g2 , . . . , gk then follows using Lemma 3.5.
Proof Suppose that f does not divide g. We must show that f divides
h. Now the only polynomials that divide f are constant polynomials and
multiples of f . No multiple of f divides g. Therefore the only polynomials
that divide both f and g are constant polynomials. Thus f and g are coprime.
It follows from Theorem 3.12 that there exist polynomials u and v with
coefficients in K such that 1 = ug + vf . Then h = ugh + vf h. But f divides
ugh + vf h, since f divides gh. It follows that f divides h, as required.
Proposition 3.14 Let K be a field, and let (f ) be the ideal of K[x] generated
by an irreducible polynomial f with coefficients in K. Then K[x]/(f ) is a
field.
Proof Let I = (f ). Then the quotient ring K[x]/I is commutative and has
a multiplicative identity element I + 1. Let g ∈ K[x]. Suppose that I + g ≠ I.
Now the only factors of f are constant polynomials and constant multiples
of f , since f is irreducible. But no constant multiple of f can divide g, since
g ∉ I. It follows that the only common factors of f and g are constant
polynomials. Thus f and g are coprime. It follows from Theorem 3.12
that there exist polynomials h, k ∈ K[x] such that f h + gk = 1. But then
(I + k)(I + g) = I + 1 in K[x]/I, since f h ∈ I. Thus I + k is the multiplicative
inverse of I + g in K[x]/I. We deduce that every non-zero element of K[x]/I
is invertible, and thus K[x]/I is a field, as required.
Definition A polynomial with integer coefficients is said to be primitive if
there is no prime number that divides all the coefficients of the polynomial.
3.7 Eisenstein’s Irreducibility Criterion
Proposition 3.17 (Eisenstein’s Irreducibility Criterion) Let
f (x) = a0 + a1 x + a2 x^2 + · · · + an x^n
be a polynomial of degree n with integer coefficients, and let p be a prime
number. Suppose that
• p does not divide an ,
• p divides a0 , a1 , . . . , an−1 ,
• p^2 does not divide a0 .
Then the polynomial f is irreducible over the field Q of rational numbers.
Proof Suppose that f (x) = g(x)h(x), where g and h are polynomials with
integer coefficients. Let g(x) = b0 + b1 x + b2 x^2 + · · · + br x^r and h(x) =
c0 + c1 x + c2 x^2 + · · · + cs x^s . Then a0 = b0 c0 . Now a0 is divisible by p but is not
divisible by p^2 . Therefore exactly one of the coefficients b0 and c0 is divisible
by p. Suppose that p divides b0 but does not divide c0 . Now p does not divide
all the coefficients of g(x), since it does not divide all the coefficients of f (x).
Let j be the smallest value of i for which p does not divide bi . Then p divides
aj − bj c0 , since aj − bj c0 = Σ_{i=0}^{j−1} bi cj−i and bi is divisible by p when i < j. But
bj c0 is not divisible by p, since p is prime and neither bj nor c0 is divisible by
p. Therefore aj is not divisible by p, and hence j = n and deg g ≥ n = deg f .
Thus deg g = deg f and deg h = 0. Thus the polynomial f does not factor
as a product of polynomials of lower degree with integer coefficients, and
therefore f is irreducible over Q (Proposition 3.16).
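The three hypotheses of the criterion are easy to test mechanically. The Python sketch below is an illustration only and not part of the original notes; the function name eisenstein_applies and the coefficient-list representation [a0, a1, ..., an] are assumptions. Note that a result of False only means that the criterion does not apply for that prime p; it does not show that the polynomial is reducible.

```python
def eisenstein_applies(coeffs, p):
    """Check Eisenstein's hypotheses for f = a0 + a1 x + ... + an x^n.

    coeffs is the list [a0, a1, ..., an] of integer coefficients with an != 0,
    and p is assumed to be a prime number.
    """
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0                              # p does not divide an
            and all(a % p == 0 for a in coeffs[:-1])  # p divides a0, ..., a(n-1)
            and a0 % (p * p) != 0)                   # p^2 does not divide a0

print(eisenstein_applies([-2, 0, 0, 0, 1], 2))  # x^4 - 2 with p = 2; prints: True
print(eisenstein_applies([-4, 0, 1], 2))        # x^2 - 4 with p = 2; prints: False
```

The first example shows that x^4 − 2 is irreducible over Q, taking p = 2.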
If L: K is a field extension then we can regard L as a vector space over
the field K. If L is a finite-dimensional vector space over K then we say that
the extension L: K is finite. The degree [L: K] of a finite field extension L: K
is defined to be the dimension of L considered as a vector space over K.
{xi yj : 1 ≤ i ≤ m and 1 ≤ j ≤ n}
is a basis of M , considered as a vector space over K. We conclude that the
extension M : K is finite, and
[M : K] = mn = [M : L][L: K],
as required.
Proof Let L: K be a finite field extension, and let n = [L: K]. Let α ∈ L.
Then either the elements 1, α, α^2 , . . . , α^n are not all distinct, or else these
elements are linearly dependent over the field K (since a linearly independent
subset of L can have at most n elements.) Therefore there exist
c0 , c1 , c2 , . . . , cn ∈ K, not all zero, such that
c0 + c1 α + c2 α^2 + · · · + cn α^n = 0.
Lemma 3.20 Let K be a field and let α be an element of some extension
field L of K. Suppose that α is algebraic over K. Then there exists a unique
irreducible monic polynomial m ∈ K[x], with coefficients in K, characterized
by the following property: f ∈ K[x] satisfies f (α) = 0 if and only if m divides
f in K[x].
Proof Suppose that the field extension K(α): K is finite. It then follows
from Lemma 3.19 that α is algebraic over K.
Conversely suppose that α is algebraic over K. Let R = {f (α) : f ∈
K[x]}. Now f (α) = 0 if and only if the minimum polynomial m of α over
K divides f . It follows that f (α) = 0 if and only if f ∈ (m), where (m) is
the ideal of K[x] generated by m. The ring homomorphism from K[x] to R
that sends f ∈ K[x] to f (α) therefore induces an isomorphism between the
quotient ring K[x]/(m) and the ring R. But K[x]/(m) is a field, since m is
irreducible (Proposition 3.14). Therefore R is a subfield of K(α) containing
K ∪ {α}, and hence R = K(α).
Let z ∈ K(α). Then z = g(α) for some g ∈ K[x]. But then there exist
polynomials l and f belonging to K[x] such that g = lm + f and either f = 0
or deg f < deg m (Lemma 3.10). But then z = f (α) since m(α) = 0.
Suppose that z = h(α) for some polynomial h ∈ K[x], where either h = 0
or deg h < deg m. Then m divides h−f , since α is a zero of h−f . But if h−f
were non-zero then its degree would be less than that of m, and thus h − f
would not be divisible by m. We therefore conclude that h = f . Thus any
element z of K(α) can be expressed in the form z = f (α) for some uniquely
determined polynomial f ∈ K[x] satisfying either f = 0 or deg f < deg m.
Thus if n = deg m then 1, α, α^2 , . . . , α^{n−1} is a basis of K(α) over K. It follows
that the extension K(α): K is finite and [K(α): K] = deg m, as required.
• the construction of the edge of a cube having twice the volume of some
given cube;
Definition Let P0 and P1 be the points of the Euclidean plane given by
P0 = (0, 0) and P1 = (1, 0). We say that a point P of the plane is constructible
using straightedge and compasses alone if P = Pn for some finite sequence
P0 , P1 , . . . , Pn of points of the plane, where P0 = (0, 0), P1 = (1, 0) and, for
each j > 1, the point Pj is one of the following:—
• the point at which a straight line joining two points belonging to the
set {P0 , P1 , . . . , Pj−1 } intersects a circle which is centred on a point of
this set and passes through another point of the set;
Constructible points of the plane are those that can be constructed from
the given points P0 and P1 using straightedge (i.e., unmarked ruler) and
compasses alone.
One can apply this criterion to show that there is no geometrical con-
struction that enables one to trisect an arbitrary angle using straightedge
and compasses alone. The same method can be used to show the impos-
sibility of ‘duplicating a cube’ or ‘squaring a circle’ using straightedge and
compasses alone.
Example We show that there is no geometrical construction for the trisection
of an angle of π/3 radians (i.e., 60°) using straightedge and compasses
alone. Let a = cos(π/9) and b = sin(π/9). Now the point (cos(π/3), sin(π/3)) (i.e., the
point (1/2, (1/2)√3)) is constructible. Thus if an angle of π/3 radians could be trisected
using straightedge and compasses alone, then the point (a, b) would
be constructible. Now
cos 3θ = cos θ cos 2θ − sin θ sin 2θ = cos θ(cos^2 θ − sin^2 θ) − 2 sin^2 θ cos θ
= 4 cos^3 θ − 3 cos θ
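Substituting θ = π/9 and cos(π/3) = 1/2 into this identity gives 8a^3 − 6a − 1 = 0. The Python sketch below, which is an illustration only and not part of the original notes, checks the identity numerically at θ = π/9 and then applies the rational root test to the cubic 8x^3 − 6x − 1: any rational root would have to be ±1/d with d dividing 8. The variable and function names are assumptions of the sketch, and the floating-point comparisons are sanity checks rather than proofs.

```python
import math
from fractions import Fraction

theta = math.pi / 9
a = math.cos(theta)

# Triple-angle identity at theta = pi/9 (floating-point check only).
identity_gap = abs(math.cos(3 * theta) - (4 * a ** 3 - 3 * a))

def cubic(x):
    """The polynomial 8x^3 - 6x - 1 satisfied by a = cos(pi/9)."""
    return 8 * x ** 3 - 6 * x - 1

# Candidates allowed by the rational root test, tested with exact arithmetic.
candidates = [Fraction(sign, d) for sign in (1, -1) for d in (1, 2, 4, 8)]
rational_roots = [c for c in candidates if cubic(c) == 0]

print(identity_gap < 1e-12, abs(cubic(a)) < 1e-12, rational_roots)
# prints: True True []
```

Since a cubic polynomial with no rational root has no linear factor over Q, the empty list is consistent with 8x^3 − 6x − 1 being irreducible over Q.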
Lemma 3.24 If the endpoints of any line segment in the plane are con-
structible, then so is the midpoint.
Proof Let P and Q be constructible points in the plane. Let S and T be the
points where the circle centred on P and passing through Q intersects the
circle centred on Q and passing through P . Then S and T are constructible
points in the plane, and the point R at which the line ST intersects the
line P Q is the midpoint of the line segment P Q. Thus this midpoint is a
constructible point.
Lemma 3.25 If any three vertices of a parallelogram in the plane are con-
structible, then so is the fourth vertex.
Theorem 3.26 Let K denote the set of all real numbers x for which the
point (x, 0) is constructible using straightedge and compasses alone. Then K
is a subfield of the field of real numbers, and a point (x, y) of the plane is
constructible using straightedge and compasses alone if and only if x ∈ K and
y ∈ K. Moreover if x ∈ K and x > 0 then √x ∈ K.
since it is the fourth vertex of a parallelogram which has three vertices at the
constructible points (x, 0), (0, y) and (0, 1) (Lemma 3.25). But the line which
passes through the two constructible points (0, y) and (x, y − 1) intersects
the x-axis at the point (xy, 0). Therefore the point (xy, 0) is constructible,
and thus xy ∈ K.
Now suppose that x ∈ K, y ∈ K and y ≠ 0. The point (x, 1 − y) is
constructible, since it is the fourth vertex of a parallelogram with vertices
at the constructible points (x, 0), (0, y) and (0, 1). The line joining
the constructible points (0, 1) and (x, 1 − y) intersects the x-axis at the point
(xy^{-1} , 0). Thus xy^{-1} ∈ K.
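The two intersection points used above, (xy, 0) and (xy^{-1} , 0), can be confirmed with exact rational arithmetic. The Python sketch below is an illustration only and not part of the original notes; the function name x_intercept and the sample values x = 5, y = 3 are assumptions.

```python
from fractions import Fraction

def x_intercept(p, q):
    """x-coordinate at which the line through the points p and q meets the x-axis.

    Assumes that p and q have different y-coordinates.
    """
    (x1, y1), (x2, y2) = p, q
    t = Fraction(y1) / (Fraction(y1) - Fraction(y2))
    return x1 + t * (x2 - x1)

x, y = Fraction(5), Fraction(3)
product = x_intercept((0, y), (x, y - 1))    # line through (0, y) and (x, y - 1)
quotient = x_intercept((0, 1), (x, 1 - y))   # line through (0, 1) and (x, 1 - y)
print(product, quotient)  # prints: 15 5/3
```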
The above results show that K is a subfield of the field of real numbers.
Moreover if x ∈ K and y ∈ K then the point (x, y) is constructible, since it is
the fourth vertex of a rectangle with vertices at the constructible points (0, 0),
(x, 0) and (0, y). Conversely, suppose that the point (x, y) is constructible.
We claim that the point (x, 0) is constructible and thus x ∈ K. This result is
obviously true if y = 0. If y ≠ 0 then the circles centred on the points (0, 0)
and (1, 0) and passing through (x, y) intersect in the two points (x, y) and
(x, −y). The point (x, 0) is thus the point at which the line passing through
the constructible points (x, y) and (x, −y) intersects the x-axis, and is thus
itself constructible. The point (0, y) is then the fourth vertex of a rectangle
with vertices at the constructible points (0, 0), (x, 0) and (x, y), and thus is
itself constructible. The circle centred on the origin and passing through (0, y)
intersects the x-axis at (y, 0). Thus (y, 0) is constructible, and thus y ∈ K.
We have thus shown that a point (x, y) is constructible using straightedge
and compasses alone if and only if x ∈ K and y ∈ K.
Suppose that x ∈ K and that x > 0. Then (1/2)(1 − x) ∈ K. Thus if
C = (0, (1/2)(1 − x)) then C is a constructible point. Let (u, 0) be the point at
which the circle centred on C and passing through the constructible point
(0, 1) intersects the x-axis. (The circle does intersect the x-axis since it passes
through (0, 1) and (0, −x), and x > 0.) The radius of this circle is (1/2)(1 + x),
and therefore (1/4)(1 − x)^2 + u^2 = (1/4)(1 + x)^2 (Pythagoras’ Theorem). But then
u^2 = x. But (u, 0) is a constructible point. Thus if x ∈ K and x > 0 then
√x ∈ K, as required.
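The square-root construction can be checked numerically. The Python sketch below is an illustration only and not part of the original notes; the function name is hypothetical. It computes the point (u, 0) from the centre and radius of the circle described above and compares u with the square root of x.

```python
import math

def sqrt_by_construction(x):
    """Positive u such that the circle centred on C = (0, (1 - x)/2)
    and passing through (0, 1) meets the x-axis at (u, 0)."""
    centre_y = (1 - x) / 2
    radius = (1 + x) / 2  # distance from C to (0, 1)
    # Pythagoras' Theorem: centre_y^2 + u^2 = radius^2
    return math.sqrt(radius ** 2 - centre_y ** 2)

for x in (2.0, 9.0, 0.25):
    print(x, sqrt_by_construction(x), math.sqrt(x))
```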
reduces to that of determining which regular polygons with an odd number
of sides are constructible. Moreover it is not difficult to reduce down to the
case where n is a power of some odd prime number.
Gauss discovered that a regular 17-sided polygon was constructible in
1796, when he was 19 years old. Techniques of Galois Theory show that the
regular n-sided polygon is constructible using straightedge and compass if
and only if n = 2^s p1 p2 · · · pt , where p1 , p2 , . . . , pt are distinct Fermat primes:
a Fermat prime is a prime number that is of the form 2^k + 1 for some integer k.
If k = uv, where u and v are positive integers and v is odd with v > 1, then 2^k + 1 =
w^v + 1 = (w + 1)(w^{v−1} − w^{v−2} + · · · − w + 1), where w = 2^u , and hence
2^k + 1 is not prime. Thus any Fermat prime is of the form Fm = 2^{2^m} + 1 for some
non-negative integer m. Fermat observed in 1640 that Fm is prime when
m ≤ 4. These Fermat primes have the values F0 = 3, F1 = 5, F2 = 17,
F3 = 257 and F4 = 65537. Fermat conjectured that all the numbers Fm were
prime. However it has been shown that Fm is not prime for any integer m
between 5 and 16. Moreover F16 = 2^65536 + 1 ≈ 10^20000 . Note that the five
Fermat primes 3, 5, 17, 257 and 65537 provide only 31 constructible regular
polygons with an odd number of sides (one for each of the 2^5 − 1 non-empty
products of distinct primes from this list).
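The numerical claims about Fermat numbers can be reproduced with a short Python sketch, given here as an illustration only and not as part of the original notes. The function names are assumptions, the primality test is naive trial division (adequate only for numbers of this size), and math.prod requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod

def fermat_number(m):
    """Return F_m = 2^(2^m) + 1."""
    return 2 ** (2 ** m) + 1

def is_prime(n):
    """Naive trial division; fine for F_0, ..., F_4."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

fermat = [fermat_number(m) for m in range(6)]
print(fermat[:5])                          # prints: [3, 5, 17, 257, 65537]
print(all(is_prime(f) for f in fermat[:5]))  # prints: True
print(fermat[5] % 641)                     # prints: 0, so F_5 is divisible by 641

# Odd n > 1 that are products of distinct known Fermat primes.
odd_sides = sorted(prod(c) for r in range(1, 6)
                   for c in combinations(fermat[:5], r))
print(len(odd_sides), odd_sides[:4])       # prints: 31 [3, 5, 15, 17]
```

The factorization used in the text can be seen in a small case as well: with k = 6, u = 2 and v = 3, the number 2^6 + 1 = 65 is divisible by 2^2 + 1 = 5.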
It is not difficult to see that the geometric problem of constructing a
regular n-sided polygon using straightedge and compasses is equivalent to
the algebraic problem of finding a formula to express the nth roots of unity
in the complex plane in terms of integers or rational numbers by means of
algebraic formulae which involve finite addition, subtraction, multiplication,
division and the successive extraction of square roots. Thus the problem is
closely related to that of expressing the roots of a given polynomial in terms
of its coefficients by means of algebraic formulae which involve only finite
addition, subtraction, multiplication, division and the successive extraction
of pth roots for appropriate prime numbers p.
Definition Let L: K be a field extension, and let f ∈ K[x] be a polynomial
with coefficients in K. The field L is said to be a splitting field for f over K
if the following conditions are satisfied:—
• the polynomial f does not split over any proper subfield of L that
contains the field K.
We shall prove below that splitting fields always exist and that any two
splitting field extensions for a given polynomial over a field K are isomorphic.
Given any homomorphism σ: K → M of fields, we define
Proof Let g be an irreducible factor of f , and let L = K[x]/(g), where (g)
is the ideal of K[x] generated by g. For each a ∈ K let i(a) = a + (g). Then
i: K → L is a monomorphism. We embed K in L on identifying a ∈ K with
i(a).
Now L is a field, since g is irreducible (Proposition 3.14). Let α = x+(g).
Then g(α) is the image of the polynomial g under the quotient homomor-
phism from K[x] to L, and therefore g(α) = 0. But g is a factor of the
polynomial f . Therefore f (α) = 0, as required.
Corollary 3.29 Let K be a field and let f ∈ K[x]. Then there exists a
splitting field for f over K.
Proof We use induction on the degree deg f of f . The result is trivially true
when deg f = 1 (since f then splits over K itself). Suppose that the result
holds for all fields and for all polynomials of degree less than deg f . Now it
follows from Theorem 3.28 that there exists a field extension K1 : K of K and
an element α of K1 satisfying f (α) = 0. Moreover f (x) = (x − α)g(x) for
some polynomial g with coefficients in K(α). Now deg g < deg f . It follows
from the induction hypothesis that there exists a splitting field L for g over
K(α). Then f splits over L.
Suppose that f splits over some field M , where K ⊂ M ⊂ L. Then
α ∈ M and hence K(α) ⊂ M . But M must also contain the roots of g,
since these are roots of f . It follows from the definition of splitting fields
that M = L. Thus L is the required splitting field for the polynomial f over
K.
Any two splitting fields for a given polynomial with coefficients in a field K
are K-isomorphic. This result is a special case of the following theorem.
Let g and h be polynomials with coefficients in K1 . Now g(α) = h(α)
if and only if m divides g − h. Similarly σ∗ (g)(β) = σ∗ (h)(β) if and only if
σ∗ (m) divides σ∗ (g) − σ∗ (h). Therefore σ∗ (g)(β) = σ∗ (h)(β) if and only if
g(α) = h(α), and thus there is a well-defined isomorphism ϕ: K1 (α) → K2 (β)
which sends g(α) to σ∗ (g)(β) for any polynomial g with coefficients in K1 .
Now L1 and L2 are splitting fields for the polynomials f and σ∗ (f ) over the
fields K1 (α) and K2 (β) respectively, and [L1 : K1 (α)] < [L1 : K1 ]. The induc-
tion hypothesis therefore ensures the existence of an isomorphism τ : L1 → L2
extending ϕ: K1 (α) → K2 (β). Then τ : L1 → L2 is the required extension of
σ: K1 → K2 .
Note that a field extension L: K is normal if and only if, given any ele-
ment α of L, the minimum polynomial of α over K splits over L.
Proof Suppose that L: K is both finite and normal. Then there exist alge-
braic elements α1 , α2 , . . . , αn of L such that L = K(α1 , α2 , . . . , αn ) (Corol-
lary 3.22). Let f (x) = m1 (x)m2 (x) · · · mn (x), where mj ∈ K[x] is the mini-
mum polynomial of αj over K for j = 1, 2, . . . , n. Then mj splits over L since
mj is irreducible and L: K is normal. Thus f splits over L. It follows that
L is a splitting field for f over K, since L is obtained from K by adjoining
roots of f .
Conversely suppose that L is a splitting field over K for some polynomial
f ∈ K[x]. Then L is obtained from K by adjoining the roots of f , and
therefore the extension L: K is finite (Corollary 3.22).
Let g ∈ K[x] be irreducible, and let M be a splitting field for the polyno-
mial f g over L. Then L ⊂ M and the polynomials f and g both split over
M . Let β and γ be roots of g in M . Now the polynomial f splits over the
fields L(β) and L(γ). Moreover if f splits over any subfield of M containing
K(β) then that subfield must contain L (since L is a splitting field for f over
K) and thus must contain L(β). We deduce that L(β) is a splitting field for
f over K(β). Similarly L(γ) is a splitting field for f over K(γ).
Now there is a well-defined K-isomorphism σ: K(β) → K(γ) which sends
h(β) to h(γ) for all polynomials h with coefficients in K, since two such poly-
nomials h1 and h2 take the same value at a root of the irreducible polyno-
mial g if and only if their difference h1 −h2 is divisible by g. This isomorphism
σ: K(β) → K(γ) extends to a K-isomorphism τ : L(β) → L(γ) between L(β)
and L(γ), since L(β) and L(γ) are splitting fields for f over the fields K(β) and
K(γ) respectively (Theorem 3.30). Thus the extensions L(β): K and L(γ): K
are isomorphic, and [L(β): K] = [L(γ): K]. But [L(β): K] = [L(β): L][L: K]
and [L(γ): K] = [L(γ): L][L: K] by the Tower Law (Theorem 3.18). It follows
that [L(β): L] = [L(γ): L]. In particular β ∈ L if and only if γ ∈ L. This
shows that any irreducible polynomial with a root in L must split over
L, and thus L: K is normal, as required.
3.13 Separability
Let K be a field. We recall that nk is defined inductively for all integers n
and for all elements k of K so that 0k = 0 and (n + 1)k = nk + k for all
n ∈ Z and k ∈ K. Thus 1k = k, 2k = k + k, 3k = k + k + k etc., and
(−n)k = −(nk) for all n ∈ Z.
of f is defined by the formula (Df )(x) = ∑_{j=1}^{n} j c_j x^{j−1}.
and hence (Df )(α) = 0. It follows that the minimum polynomial of α over
K is a non-constant polynomial with coefficients in K which divides both f
and Df .
Conversely let f ∈ K[x] be a polynomial with the property that f and
Df are both divisible by some non-constant polynomial g ∈ K[x]. Let L be
a splitting field for f over K. Then g splits over L (since g is a factor of f ).
Let α ∈ L be a root of g. Then f (α) = 0, and hence f (x) = (x − α)e(x)
for some polynomial e ∈ L[x]. On differentiating, we find that (Df )(x) =
e(x) + (x − α)De(x). But (Df )(α) = 0, since g(α) = 0 and g divides Df
in K[x]. It follows that e(α) = (Df )(α) = 0, and thus e(x) = (x − α)h(x)
for some polynomial h ∈ L[x]. But then f (x) = (x − α)^2 h(x), and thus the
polynomial f has a repeated root in the splitting field L, as required.
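The criterion just proved (f has a repeated root in a splitting field if and only if f and Df have a non-constant common factor) can be tested mechanically with the Euclidean algorithm. The following is a minimal sketch for K = Q, assuming polynomials are stored as coefficient lists with the constant term first; the helper names are illustrative and are not taken from these notes.

```python
from fractions import Fraction

def trim(poly):
    """Drop trailing zero coefficients (coefficients are listed lowest degree first)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def derivative(poly):
    """Formal derivative: the sum of j*c_j*x^(j-1) for j >= 1."""
    return trim([j * c for j, c in enumerate(poly)][1:])

def poly_mod(a, b):
    """Remainder of a on division by the non-zero polynomial b."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a)
    return a

def poly_gcd(a, b):
    """Monic highest common factor of a and b (a non-zero), by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]

def has_repeated_root(coeffs):
    """True if f and Df share a non-constant factor over Q."""
    f = [Fraction(c) for c in coeffs]
    return len(poly_gcd(f, derivative(f))) > 1
```

For instance, x^3 − 3x + 2 = (x − 1)^2 (x + 2) is passed as [2, -3, 0, 1], and the common factor found is x − 1, whereas x^2 − 2 shares no non-constant factor with its derivative.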
Corollary 3.34 Let K be a field. An irreducible polynomial f is inseparable
if and only if Df = 0.
that p divides the binomial coefficient C(p, j) for all j satisfying 0 < j < p. But px = 0 for all x ∈ K,
since char K = p. Therefore (x + y)^p = x^p + y^p for all x, y ∈ K. The identity
(xy)^p = x^p y^p is immediate from the commutativity of K.
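The two facts used in this argument are easy to check numerically. The sketch below is only an illustration, restricted to the prime field Z/pZ (where the identity is less interesting than in a general field of characteristic p); the function names are hypothetical.

```python
from math import comb

def binomials_divisible(p):
    """True if p divides C(p, j) for every j with 0 < j < p."""
    return all(comb(p, j) % p == 0 for j in range(1, p))

def frobenius_is_additive(p):
    """Check (x + y)^p = x^p + y^p for all residues x, y modulo p."""
    return all(
        pow(x + y, p, p) == (pow(x, p, p) + pow(y, p, p)) % p
        for x in range(p)
        for y in range(p)
    )
```

The divisibility check fails for composite values such as 4, since C(4, 2) = 6 is not divisible by 4, which is why the argument needs p to be prime.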
Corollary 3.38 There exists a finite field GF(p^n) of order p^n for each prime
number p and positive integer n. Two finite fields are isomorphic if and only
if they have the same number of elements.
The field GF(p^n) is referred to as the Galois field of order p^n.
The non-zero elements of a field constitute a group under multiplication.
We shall prove that all finite subgroups of the group of non-zero elements of
a field are cyclic. It follows immediately from this that the group of non-zero
elements of a finite field is cyclic.
For each positive integer n, we denote by ϕ(n) the number of integers x
satisfying 0 ≤ x < n that are coprime to n. We show that the sum ∑_{d|n} ϕ(d)
of ϕ(d) taken over all divisors of a positive integer n is equal to n.
Lemma 3.39 Let n be a positive integer. Then ∑_{d|n} ϕ(d) = n.
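The identity in Lemma 3.39 can be checked directly from the definition of ϕ given above. This is a brute-force sketch with illustrative function names, intended only for small n.

```python
from math import gcd

def phi(n):
    """Number of integers x with 0 <= x < n that are coprime to n."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def totient_sum(n):
    """Sum of phi(d) over all positive divisors d of n."""
    return sum(phi(d) for d in divisors(n))
```

Note that ϕ(1) = 1 under this definition, since x = 0 is coprime to 1.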
The set of all non-zero elements of a field is a group with respect to the
operation of multiplication.
Proof Let n be the order of the group G. It follows from Lagrange’s Theorem
that the order of every element of G divides n. For each divisor d of n, let ψ(d)
denote the number of elements of G that are of order d. Clearly ∑_{d|n} ψ(d) = n.
Let g be an element of G of order d, where d is a divisor of n. The elements
1, g, g^2, . . . , g^{d−1} are distinct elements of G and are roots of the polynomial
x^d − 1. But a polynomial of degree d with coefficients in a field has at most
d roots in that field. Therefore every element x of G satisfying x^d = 1 is g^k
for some uniquely determined integer k satisfying 0 ≤ k < d. If k is coprime
to d then g^k has order d, for if (g^k)^m = 1 then d divides km and hence d
divides m. Conversely if g^k has order d then d and k are coprime, for if e is
a common divisor of k and d then (g^k)^{d/e} = g^{d(k/e)} = 1, and hence e = 1.
Thus if there exists at least one element g of G that is of order d then the
elements of G that are of order d are the elements g^k for those integers k
satisfying 0 ≤ k < d that are coprime to d. It follows that if ψ(d) > 0 then
ψ(d) = ϕ(d), where ϕ(d) is the number of integers k satisfying 0 ≤ k < d
that are coprime to d.
Now 0 ≤ ψ(d) ≤ ϕ(d) for each divisor d of n. But ∑_{d|n} ψ(d) = n and
∑_{d|n} ϕ(d) = n. It follows that ψ(d) = ϕ(d) for each divisor d of n. In
particular ψ(n) = ϕ(n) ≥ 1. Thus there exists an element of G whose order
is the order n of G. This element generates G, and thus G is cyclic, as
required.
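The counting argument can be observed in the simplest case, the group of non-zero residues modulo a prime p. The sketch below tabulates ψ(d), the number of elements of each order d, and lists the generators; it assumes p is prime and uses illustrative names throughout.

```python
from math import gcd

def element_order(g, p):
    """Multiplicative order of g modulo the prime p (g not divisible by p)."""
    k, power = 1, g % p
    while power != 1:
        power = (power * g) % p
        k += 1
    return k

def order_counts(p):
    """psi(d): number of non-zero residues modulo p whose order is d."""
    counts = {}
    for g in range(1, p):
        d = element_order(g, p)
        counts[d] = counts.get(d, 0) + 1
    return counts

def phi(n):
    """Number of integers x with 0 <= x < n that are coprime to n."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)

def generators(p):
    """Residues whose order is p - 1, i.e. generators of the cyclic group."""
    return [g for g in range(1, p) if element_order(g, p) == p - 1]
```

For p = 13 the table of counts agrees with ϕ(d) for each divisor d of 12, as the proof predicts.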
common root of g and h. It follows that x − γ is a highest common factor of
g and h in the polynomial ring K(θ)[x], and therefore γ ∈ K(θ). But then
β ∈ K(θ), since β = θ − cγ and c ∈ K. It follows that L = K(θ).
It now follows by induction on m that if L = K(α1 , α2 , . . . , αm ), where K
is infinite, α1 , α2 , . . . , αm are algebraic over K, and L: K is separable, then
the extension L: K is simple. Thus all finite separable field extensions are
simple, as required.
Proof It follows from the Primitive Element Theorem (Theorem 3.42) that
there exists some element α of L such that L = K(α). Let λ be an element
of L. Then λ = g(α) for some polynomial g with coefficients in K. But then
σ(λ) = g(σ(α)) for all σ ∈ Γ(L: K), since the coefficients of g are fixed by
σ. It follows that each automorphism σ in Γ(L: K) is uniquely determined
once σ(α) is known.
If f is the minimum polynomial of α over K then f (σ(α)) = σ(f (α)) = 0
for all σ ∈ Γ(L: K) since the coefficients of f are in K and are therefore fixed
by σ. Thus σ(α) is a root of f . It follows that the order |Γ(L: K)| of the
Galois group is bounded above by the number of roots of f that belong to
L, and is thus bounded above by the degree deg f of f . But deg f = [L: K]
(Theorem 3.21). Thus |Γ(L: K)| ≤ [L: K], as required.
(x − α1 )(x − α2 ) · · · (x − αk ),
where α1 , α2 , . . . , αk are distinct and are the elements of the orbit of α under
the action of G on L.
Proof Let f (x) = (x − α1 )(x − α2 ) · · · (x − αk ). Then the polynomial f is
invariant under the action of G, since each automorphism in the group G
permutes the elements α1 , α2 , . . . , αk and therefore permutes the factors of
f amongst themselves. It follows that the coefficients of the polynomial f
belong to the fixed field K of G. Thus α is algebraic over K, as it is a root
of the polynomial f .
Now, given any root αi of f , there exists some σ ∈ G such that αi =
σ(α). Thus if g ∈ K[x] is a polynomial with coefficients in K which satisfies
g(α) = 0 then g(αi ) = σ(g(α)) = 0, since the coefficients of g are fixed by σ.
But then f divides g. Thus f is the minimum polynomial of α over K, as
required.
Proof It follows from Proposition 3.44 that, for each α ∈ L, the minimum
polynomial of α over K splits over L and has no multiple roots. Thus the
extension L: K is both normal and separable.
Let M be any field satisfying K ⊂ M ⊂ L for which the extension M : K
is finite. The extension M : K is separable, since L: K is separable. It follows
from the Primitive Element Theorem (Theorem 3.42) that the extension
M : K is simple. Thus M = K(α) for some α ∈ L. But then [M : K] is equal
to the degree of the minimum polynomial of α over K (Theorem 3.21). It
follows from Proposition 3.44 that [M : K] is equal to the number of elements
in the orbit of α under the action of G on L. Therefore [M : K] divides |G|
for any intermediate field M for which the extension M : K is finite.
Now let the intermediate field M be chosen so as to maximize [M : K].
If λ ∈ L then λ is algebraic over K, and therefore [M (λ): M ] is finite. It
follows from the Tower Law (Theorem 3.18) that [M (λ): K] is finite, and
[M (λ): K] = [M (λ): M ][M : K]. But M has been chosen so as to maximize
[M : K]. Therefore [M (λ): K] = [M : K], and [M (λ): M ] = 1. Thus λ ∈ M .
We conclude that M = L. Thus L: K is finite and [L: K] divides |G|.
The field extension L: K is a Galois extension, since it has been shown to
be finite, normal and separable. Now G ⊂ Γ(L: K) and |Γ(L: K)| ≤ [L: K]
(Lemma 3.43). Therefore |Γ(L: K)| ≤ [L: K] ≤ |G| ≤ |Γ(L: K)|, and thus
G = Γ(L: K) and |G| = [L: K], as required.
Theorem 3.46 Let Γ(L: K) be the Galois group of a finite field extension
L: K. Then |Γ(L: K)| divides [L: K]. Moreover |Γ(L: K)| = [L: K] if and only
if L: K is a Galois extension, in which case K is the fixed field of Γ(L: K).
Proof Let M be the fixed field of Γ(L: K). It follows from Theorem 3.45
that L: M is a Galois extension and |Γ(L: K)| = [L: M ]. Now [L: K] =
[L: M ][M : K] by the Tower Law (Theorem 3.18). Thus |Γ(L: K)| divides
[L: K]. If |Γ(L: K)| = [L: K] then M = K. But then L: K is a Galois
extension and K is the fixed field of Γ(L: K).
Conversely suppose that L: K is a Galois extension. We must show that
|Γ(L: K)| = [L: K]. Now the extension L: K is both finite and separable. It
follows from the Primitive Element Theorem (Theorem 3.42) that there exists
some element θ of L such that L = K(θ). Let f be the minimum polynomial
of θ over K. Then f splits over L, since f is irreducible and the extension
L: K is normal. Let θ1 , θ2 , . . . , θn be the roots of f in L, where θ1 = θ and
n = deg f . If σ is a K-automorphism of L then f (σ(θ)) = σ(f (θ)) = 0, since
the coefficients of the polynomial f belong to K and are therefore fixed by
σ. Thus σ(θ) = θj for some j. We claim that, for each root θj of f , there is
exactly one K-automorphism σj of L satisfying σj (θ) = θj .
Let g(x) and h(x) be polynomials with coefficients in K. Suppose that
g(θ) = h(θ). Then g − h is divisible by the minimum polynomial f of θ.
It follows that g(θj ) = h(θj ) for any root θj of f . Now every element of
L is of the form g(θ) for some g ∈ K[x], since L = K(θ). We deduce
therefore that there is a well-defined function σj : L → L with the property
that σj (g(θ)) = g(θj ) for all g ∈ K[x]. The definition of this function ensures
that it is the unique automorphism of the field L that fixes each element of
K and sends θ to θj .
Now the roots of the polynomial f in L are distinct, since f is irreducible
and L: K is separable. Moreover the order of the Galois group Γ(L: K) is
equal to the number of roots of f , since each root determines a unique element
of the Galois group. Therefore |Γ(L: K)| = deg f . But deg f = [L: K] since
L = K(θ) and f is the minimum polynomial of θ over K (Theorem 3.21).
Thus |Γ(L: K)| = [L: K], as required.
is irreducible over K and L: K is a normal extension. Also the roots of fK in
L are distinct, since L: K is a separable extension. But fM divides fK , since
fK (α) = 0 and the coefficients of fK belong to M . It follows that fM also
splits over L, and its roots are distinct. We deduce that the finite extension
L: M is both normal and separable, and is therefore a Galois extension.
The finite extension M : K is clearly separable, since L: K is separable.
Thus if M : K is a normal extension then it is a Galois extension.
3.18 Quadratic Polynomials
We consider the problem of expressing the roots of a polynomial of low degree
in terms of its coefficients. The well-known procedure for locating the
roots of a quadratic polynomial with real or complex coefficients generalizes
to quadratic polynomials with coefficients in a field K whose characteristic
does not equal 2. Given a quadratic polynomial ax^2 + bx + c with coefficients
a, b and c belonging to some such field K, where a ≠ 0, let us adjoin to K an element δ
satisfying δ^2 = b^2 − 4ac. Then the polynomial splits over K(δ), and its roots are
(−b ± δ)/(2a). We shall describe below analogous procedures for expressing
the roots of cubic and quartic polynomials in terms of their coefficients.
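As an illustration of the formula (−b ± δ)/(2a) in a field other than R or C, the following sketch works in K = Z/pZ for an odd prime p. It searches for δ by brute force and returns None when the discriminant has no square root in K itself, in which case the roots lie only in the extension K(δ). The function name and the choice of field are assumptions made for the example.

```python
def quadratic_roots_mod_p(a, b, c, p):
    """Roots of a*x^2 + b*x + c in Z/pZ, for an odd prime p not dividing a."""
    disc = (b * b - 4 * a * c) % p
    # Brute-force search for delta with delta^2 = b^2 - 4ac in Z/pZ.
    deltas = [d for d in range(p) if (d * d) % p == disc]
    if not deltas:
        return None  # the discriminant has no square root in Z/pZ itself
    delta = deltas[0]
    inv_2a = pow(2 * a, -1, p)  # inverse of 2a; exists because p is odd and a != 0
    return sorted({((-b + delta) * inv_2a) % p, ((-b - delta) * inv_2a) % p})
```

The requirement that the characteristic is not 2 appears here as the need to invert 2a.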
f (u + v) = u^3 + v^3 + (3uv − p)(u + v) − q.
where the two cube roots must be chosen so as to ensure that their product
is equal to (1/3)p. It follows that the cubic polynomial x^3 − px − q splits over the
field K(ε, ξ, ω), where ε^2 = (1/4)q^2 − (1/27)p^3 and ξ^3 = (1/2)q + ε, and where ω satisfies
ω^3 = 1 and ω ≠ 1. The roots of the polynomial in this extension field are α,
β and γ, where
α = ξ + p/(3ξ), β = ωξ + ω^2 p/(3ξ), γ = ω^2 ξ + ω p/(3ξ).
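The formulae can be tried out numerically over the complex numbers. The sketch below takes ε to be a square root of q^2/4 − p^3/27 and ξ a cube root of q/2 + ε, then forms u + p/(3u) for u = ξ, ωξ, ω^2 ξ, so that the two summands always have product p/3. It is a floating-point illustration with an ad hoc tolerance, not a robust solver, and the function name is hypothetical.

```python
import cmath

def cubic_roots(p, q):
    """Roots of x^3 - p*x - q over C, each written as u + p/(3u) with u^3 = q/2 + eps."""
    eps = cmath.sqrt(q * q / 4 - p ** 3 / 27)
    xi_cubed = q / 2 + eps
    if abs(xi_cubed) < 1e-12:
        xi_cubed = q / 2 - eps  # use the other square root so that xi is non-zero
    if abs(xi_cubed) < 1e-12:
        return [0j, 0j, 0j]  # only happens when p = q = 0
    xi = complex(xi_cubed) ** (1 / 3)
    omega = cmath.exp(2j * cmath.pi / 3)  # a primitive cube root of unity
    roots = []
    for k in range(3):
        u = omega ** k * xi
        roots.append(u + p / (3 * u))
    return roots
```

For x^3 − 7x − 6 = (x − 3)(x + 1)(x + 2) the quantity ε is purely imaginary, yet the three values returned are real up to rounding error.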
Now let us consider the possibilities for the Galois group Γ(L: K), where
L is a splitting field for f over K. Now L = K(α, β, γ), where α, β and γ
are the roots of f . Also a K-automorphism of L must permute the roots
of f amongst themselves, and it is determined by its action on these roots.
Therefore Γ(L: K) is isomorphic to a subgroup of the symmetric group Σ3
(i.e., the group of permutations of a set of 3 objects), and thus the possibilities
for the order of Γ(L: K) are 1, 2, 3 and 6. It follows from Corollary 3.31 that
f is irreducible over K if and only if the roots of f are distinct and the
Galois group acts transitively on the roots of f . By considering all possible
subgroups of Σ3 it is not difficult to see that f is irreducible over K if and
only if |Γ(L: K)| = 3 or 6. If f splits over K then |Γ(L: K)| = 1. If f factors
in K[x] as the product of a linear factor and an irreducible quadratic factor
then |Γ(L: K)| = 2.
Let δ = (α − β)(α − γ)(β − γ). Then δ^2 is invariant under any permutation
of α, β and γ, and therefore δ^2 is fixed by all automorphisms in the Galois
group Γ(L: K). Therefore δ^2 ∈ K. The element δ^2 of K is referred to as
the discriminant of the polynomial f . A straightforward calculation shows
that if f (x) = x^3 − px − q then δ^2 = 4p^3 − 27q^2 . Now δ changes sign under
any permutation of the roots α, β and γ that transposes two of the roots
whilst leaving the third root fixed. But δ ∈ K if and only if δ is fixed by all
elements of the Galois group Γ(L: K), in which case the Galois group must
induce only cyclic permutations of the roots α, β and γ. Therefore Γ(L: K)
is isomorphic to the cyclic group of order 3 if and only if f is irreducible
and the discriminant 4p^3 − 27q^2 of f has a square root in the field K. If f
is irreducible but the discriminant does not have a square root in K then
Γ(L: K) is isomorphic to the symmetric group Σ3 , and |Γ(L: K)| = 6.
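The case analysis above can be turned into a small decision procedure when K = Q and p and q are integers. The sketch assumes two standard facts not restated in this section: a cubic over Q is reducible exactly when it has a rational root, and a rational root of a monic integer polynomial is an integer dividing the constant term. The function names are illustrative.

```python
from math import isqrt

def is_square(n):
    """True if the integer n is the square of an integer."""
    return n >= 0 and isqrt(n) ** 2 == n

def rational_root(p, q):
    """An integer root of x^3 - p*x - q (p, q integers), or None if there is none."""
    if q == 0:
        return 0
    for r in range(-abs(q), abs(q) + 1):
        if r != 0 and q % r == 0 and r ** 3 - p * r - q == 0:
            return r
    return None

def galois_group_order(p, q):
    """Order of the Galois group over Q of x^3 - p*x - q, following the text's cases."""
    r = rational_root(p, q)
    if r is None:
        # Irreducible: order 3 if the discriminant is a square in Q, else 6.
        return 3 if is_square(4 * p ** 3 - 27 * q ** 2) else 6
    # Reducible: f = (x - r)(x^2 + r*x + r^2 - p); the quadratic factor
    # splits over Q exactly when its discriminant 4p - 3r^2 is a square.
    return 1 if is_square(4 * p - 3 * r * r) else 2
```

For example, x^3 − 3x − 1 has discriminant 81 and so has cyclic Galois group of order 3, whereas x^3 − 2 has discriminant −108 and Galois group Σ3.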
µ = (α + γ)(β + δ) = −(α + γ)^2 ,
ν = (α + δ)(β + γ) = −(α + δ)^2 .
It follows that g(x) = x^3 + 2px^2 + (p^2 + 4r)x + q^2 . We can use the formulae
for the roots of a cubic polynomial to express the roots λ, µ and ν of g in
terms of the coefficients of f , and thus determine the roots α, β, γ and δ of
f in terms of the coefficients of f .
Q(ξ). Another application of Theorem 3.21 now shows that [L: Q(ξ)] =
[Q(ξ, i): Q(ξ)] = 2. It follows from the Tower Law (Theorem 3.18) that
[L: Q] = [L: Q(ξ)][Q(ξ): Q] = 8. Moreover the extension L: Q is a Galois
extension, and therefore its Galois group Γ(L: Q) is a group of order 8 (The-
orem 3.46).
Another application of the Tower Law now shows that [L: Q(i)] = 4,
since [L: Q] = [L: Q(i)][Q(i): Q] and [Q(i): Q] = 2. Therefore the minimum
polynomial of ξ over Q(i) is a polynomial of degree 4 (Theorem 3.21). But ξ is
a root of x^4 − 2. Therefore x^4 − 2 is irreducible over Q(i), and is the minimum
polynomial of ξ over Q(i). Corollary 3.31 then ensures the existence of an
automorphism σ of L that sends ξ ∈ L to iξ and fixes each element of Q(i).
Similarly there exists an automorphism τ of L that sends i to −i and fixes
each element of Q(ξ). (The automorphism τ is in fact the restriction to L
of the automorphism of C that sends each complex number to its complex
conjugate.)
Now the automorphisms σ, σ^2, σ^3 and σ^4 fix i and therefore send ξ to
iξ, −ξ, −iξ and ξ respectively. Therefore σ^4 = ι, where ι is the identity
automorphism of L. Similarly τ^2 = ι. Straightforward calculations show
that τσ = σ^3 τ, and (στ)^2 = (σ^2 τ)^2 = (σ^3 τ)^2 = ι. It follows easily from this
that Γ(L: Q) = {ι, σ, σ^2, σ^3, τ, στ, σ^2 τ, σ^3 τ}, and Γ(L: Q) is isomorphic to the
dihedral group of order 8 (i.e., the group of symmetries of a square in the
plane).
The Galois correspondence is a bijective correspondence between the subgroups
of Γ(L: Q) and subfields of L that contain Q. The subfield of L corresponding
to a given subgroup of Γ(L: Q) is the set of all elements of L that
are fixed by all the automorphisms in the subgroup. One can verify that
the correspondence between subgroups of Γ(L: Q) and their fixed fields is as
follows:—
Subgroup of Γ(L: Q)          Fixed field
Γ(L: Q)                      Q
{ι, σ, σ^2, σ^3}             Q(i)
{ι, σ^2, τ, σ^2 τ}           Q(√2)
{ι, σ^2, στ, σ^3 τ}          Q(i√2)
{ι, σ^2}                     Q(√2, i)
{ι, τ}                       Q(ξ)
{ι, σ^2 τ}                   Q(iξ)
{ι, στ}                      Q((1 − i)/ξ)
{ι, σ^3 τ}                   Q((1 + i)/ξ)
{ι}                          Q(ξ, i)
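The relations among σ and τ stated above can be verified by a short computation. In the sketch below an automorphism is encoded by a pair (a, b), meaning that it sends ξ to i^a ξ and i to i^b, with b equal to 1 or 3; this encoding and the helper names are assumptions made for the illustration and do not appear in the notes.

```python
def compose(s, t):
    """Composite s∘t (apply t first); (a, b) means xi -> i^a * xi and i -> i^b."""
    a1, b1 = s
    a2, b2 = t
    # s(t(xi)) = s(i^a2 * xi) = i^(b1*a2) * i^a1 * xi, and s(t(i)) = i^(b1*b2).
    return ((a1 + b1 * a2) % 4, (b1 * b2) % 4)

IDENTITY = (0, 1)
SIGMA = (1, 1)  # xi -> i*xi, i -> i
TAU = (0, 3)    # xi -> xi,   i -> -i

def power(s, n):
    """The n-fold composite of s with itself."""
    result = IDENTITY
    for _ in range(n):
        result = compose(result, s)
    return result

def generated_group(gens):
    """Set of all automorphisms obtained by composing the given generators."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group
```

Running generated_group([SIGMA, TAU]) yields eight elements, in agreement with |Γ(L: Q)| = [L: Q] = 8.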
3.22 The Galois group of a polynomial
Definition Let f be a polynomial with coefficients in some field K. The
Galois group ΓK (f ) of f over K is defined to be the Galois group Γ(L: K) of
the extension L: K, where L is some splitting field for the polynomial f over
K.
We recall that all splitting fields for a given polynomial over a field K
are K-isomorphic (see Theorem 3.30), and thus the Galois groups of these
splitting field extensions are isomorphic. The Galois group of the given poly-
nomial over K is therefore well-defined (up to isomorphism of groups) and
does not depend on the choice of splitting field.
Let f be a polynomial with coefficients in some field K and let the roots
of f in some splitting field L be α1 , α2 , . . . , αn . An element σ of Γ(L: K) is
a K-automorphism of L, and therefore σ permutes the roots of f . Moreover
two automorphisms σ and τ in the Galois group Γ(L: K) are equal if and only
if σ(αj ) = τ (αj ) for j = 1, 2, . . . , n, since L = K(α1 , α2 , . . . , αn ). Thus the
Galois group of a polynomial can be represented as a subgroup of the group
of permutations of its roots. We deduce immediately the following result.
It follows from the definition above that a polynomial f with coefficients in
a field K is solvable by radicals if and only if there exist fields K0 , K1 , . . . , Km
such that K0 = K, the polynomial f splits over Km , and, for each integer i
between 1 and m, the field Ki is obtained on adjoining to K_{i−1} an element αi
with the property that αi^{pi} ∈ K_{i−1} for some positive integer pi . Moreover we
can assume, without loss of generality, that p1 , p2 , . . . , pm are prime numbers,
since an nth root α of an element of a given field can be adjoined to that field
by successively adjoining powers α^{n1}, α^{n2}, . . . , α^{nk} of α chosen such that n/n1
is prime, n_{i−1}/n_i is prime for i = 2, 3, . . . , k, and nk = 1.
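The reduction to prime exponents amounts to peeling one prime factor off n at a time. A minimal sketch, assuming n ≥ 2 and always removing the smallest prime factor (any ordering of the prime factors would serve equally well); the function names are hypothetical.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing the integer n >= 2."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def exponent_chain(n):
    """Exponents n1, ..., nk with n/n1 prime, each n_{i-1}/n_i prime, and nk = 1."""
    chain = []
    while n > 1:
        n //= smallest_prime_factor(n)
        chain.append(n)
    return chain
```

For n = 12 the chain is 6, 3, 1: one adjoins α^6 (a square root of α^12), then α^3 (a square root of α^6), then α itself (a cube root of α^3).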
We shall prove that a polynomial with coefficients in a field K of charac-
teristic zero is solvable by radicals if and only if its Galois group ΓK (f ) over
K is a solvable group.
Let L be a field, and let p be a prime number that is not equal to the
characteristic of L. Suppose that the polynomial x^p − 1 splits over L. Then
the polynomial x^p − 1 has distinct roots, since its formal derivative px^{p−1} is
non-zero at each root of x^p − 1. An element ω of L is said to be a primitive
pth root of unity if ω^p = 1 and ω ≠ 1. The primitive pth roots of unity are
the roots of the polynomial x^{p−1} + x^{p−2} + · · · + 1, since x^p − 1 = (x − 1)(x^{p−1} +
x^{p−2} + · · · + 1). Also the group of pth roots of unity in L is a cyclic group
of order p which is generated by any primitive pth root of unity.
Lemma 3.51 Let K be a field, and let p be a prime number that is not
equal to the characteristic of K. If ω is a primitive pth root of unity in
some extension field of K then the Galois group of the extension K(ω): K is
Abelian.
where α^p = c and ω is some primitive pth root of unity. Now K(ω): K
is a normal extension, since K(ω) is a splitting field for the polynomial
x^p − 1 over K (Theorem 3.32). On applying the Galois correspondence
(Theorem 3.48), we see that Γ(M : K(ω)) is a normal subgroup of Γ(M : K),
and Γ(M : K)/Γ(M : K(ω)) is isomorphic to Γ(K(ω): K). But Γ(K(ω): K) is
Abelian (Lemma 3.51). It therefore suffices to show that Γ(M : K(ω)) is also
Abelian.
Now the field M is obtained from K(ω) by adjoining an element α satisfying
α^p = c. Therefore each automorphism σ in Γ(M : K(ω)) is uniquely
determined by the value of σ(α). Moreover σ(α) is also a root of x^p − c, and
therefore σ(α) = αω^j for some integer j. Thus if σ and τ are automorphisms
of M belonging to Γ(M : K(ω)), and if σ(α) = αω^j and τ (α) = αω^k , then
σ(τ (α)) = τ (σ(α)) = αω^{j+k} , since σ(ω) = τ (ω) = ω. Therefore σ ◦ τ = τ ◦ σ.
We deduce that Γ(M : K(ω)) is Abelian, and thus Γ(M : K) is solvable, as
required.
Theorem 3.54 Let f be a polynomial with coefficients in a field K of char-
acteristic zero. Suppose that f is solvable by radicals. Then the Galois group
ΓK (f ) of f is a solvable group.
Proof The Galois group Γ(L: K) is a cyclic group of order p, since its order is
equal to the degree p of the extension L: K. Let σ be a generator of Γ(L: K),
let β be an element of L \ K, and let
αj = β0 + ω^j β1 + ω^{2j} β2 + · · · + ω^{(p−1)j} β_{p−1}
α0 + α1 + α2 + · · · + αp−1 = pβ,
Proof Let ω be a primitive pth root of unity. Then ΓK(ω) (f ) is isomorphic
to a subgroup of ΓK (f ) (Lemma 3.49) and is therefore solvable (Proposi-
tion 2.49). Moreover f is solvable by radicals over K if and only if f is
solvable by radicals over K(ω), since K(ω) is obtained from K by adjoining
an element ω whose pth power belongs to K. We may therefore assume,
without loss of generality, that K contains a primitive pth root of unity for
each prime p that divides |ΓK (f )|.
The result is trivial when |ΓK (f )| = 1, since the polynomial f splits over
K. We prove the result by induction on the order |ΓK (f )| of the Galois
group. Thus suppose that the result holds when the order of the Galois group
is less than |ΓK (f )|. Let L be a splitting field for f over K. Then L: K is
a Galois extension and Γ(L: K) ≅ ΓK (f ). Now the solvable group Γ(L: K)
contains a normal subgroup H for which the corresponding quotient group
Γ(L: K)/H is a cyclic group of order p for some prime number p dividing
|Γ(L: K)|. Let M be the fixed field of H. Then Γ(L: M ) = H and Γ(M : K) ≅
Γ(L: K)/H (Theorem 3.48), and therefore [M : K] = |Γ(L: K)/H| = p. It
follows from Lemma 3.55 that M = K(α) for some element α ∈ M satisfying
α^p ∈ K. Moreover ΓM (f ) ≅ H, and H is solvable, since any subgroup of
a solvable group is solvable (Proposition 2.49). The induction hypothesis
ensures that f is solvable by radicals when considered as a polynomial with
coefficients in M , and therefore the roots of f lie in some extension field of
M obtained by successively adjoining radicals. But M is obtained from K by
adjoining the radical α. Therefore f is solvable by radicals, when considered
as a polynomial with coefficients in K, as required.
of f , and therefore |G| is divisible by p. It follows from a theorem of Cauchy
(Theorem 2.42) that G has an element of order p. Moreover an element of
G is determined by its action on the roots of f . Thus an element of G is of
order p if and only if it cyclically permutes the roots of f .
The irreducibility of f ensures that f has distinct roots (Corollary 3.35).
Let α1 and α2 be the two roots of f that are not real. Then α1 and α2 are
complex conjugates of one another, since f has real coefficients. We have
already seen that G contains an element of order p which cyclically permutes
the roots of f . On taking an appropriate power of this element, we obtain
an element σ of G that cyclically permutes the roots of f and sends α1 to
α2 . We label the real roots α3 , α4 , . . . , αp of f so that αj = σ(αj−1 ) for
j = 2, 3, 4, . . . , p. Then σ(αp ) = α1 . Now complex conjugation restricts to a
Q-automorphism τ of L that interchanges α1 and α2 but fixes αj for j > 2.
But if 2 ≤ j ≤ p then σ^{j−2} τ σ^{2−j} transposes the roots αj−1 and αj and fixes
the remaining roots. But transpositions of this form generate the whole of
the group of permutations of the roots. Therefore every permutation of the
roots of f is realised by some element of the Galois group G of f , and thus
G ≅ Σp , as required.
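The group-theoretic step at the end of this proof, that a p-cycle together with a transposition of two adjacent roots generates every permutation of the p roots, can be confirmed by brute force for small primes. In this sketch the roots are labelled 0, 1, . . . , p − 1 and a permutation is a tuple listing the images of those labels; the names are illustrative only.

```python
def compose(s, t):
    """Composite s∘t (apply t first) of permutations given as tuples on 0..n-1."""
    return tuple(s[t[x]] for x in range(len(s)))

def generated_group(gens):
    """Set of all permutations obtained by composing the given generators."""
    n = len(gens[0])
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_and_transposition(p):
    """A p-cycle 0 -> 1 -> ... -> p-1 -> 0 and the transposition of 0 and 1."""
    cycle = tuple((x + 1) % p for x in range(p))
    swap = (1, 0) + tuple(range(2, p))
    return [cycle, swap]
```

For p = 5 the generated group has 5! = 120 elements. Primality matters in the proof above, where an arbitrary element of order p and an arbitrary transposition are given: on four points, the 4-cycle (1, 2, 3, 0) and the transposition (2, 1, 0, 3) generate only a group of order 8.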
The above example demonstrates that there cannot exist any general
formula for obtaining the roots of a quintic polynomial from its coefficients in
a finite number of steps involving only addition, subtraction, multiplication,
division and the extraction of nth roots. For if such a general formula were
to exist then every quintic polynomial with rational coefficients would be
solvable by radicals.