
Elliptic Curves. HT 2009/10.

Section 1. The Group Law on an Elliptic Curve


Definition 1.1. An elliptic curve over a field $K$ is (up to birational equivalence) a nonsingular projective cubic curve, defined over $K$, with a $K$-rational point.
Definition 1.2. Let $\mathcal{C} : F(X, Y, Z) = 0$ be an elliptic curve $/K$ [the notation $/K$ means "defined over $K$"; that is, all of the coefficients of $\mathcal{C}$ are in the field $K$]. So, $\mathcal{C}$ is a nonsingular projective cubic curve, with a $K$-rational point, which we shall denote $o$. For any two points $a, b$ on $\mathcal{C}$, let $\ell_{a,b}$ denote the line which meets $\mathcal{C}$ at $a, b$ [if $a, b$ are distinct then $\ell_{a,b}$ is the unique line through $a, b$; if $a = b$ then $\ell_{a,b}$ is the line tangent to $\mathcal{C}$ at $a = b$].
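[Figure: the curve $\mathcal{C}$ with the line $\ell_{a,b}$ through $a$, $b$, $d$ and the line $\ell_{o,d}$ through $o$, $d$, $c$.]
Let $\ell_{a,b}$ denote the line which meets $\mathcal{C}$ at $a, b$.
Then $\ell_{a,b}$ and $\mathcal{C}$ have 3 points of intersection (Bézout).
Let $d$ be the third point of intersection between $\mathcal{C}$ and $\ell_{a,b}$.
Now, let $\ell_{o,d}$ denote the line which meets $\mathcal{C}$ at $o$ and $d$.
Let $c$ be the third point of intersection between $\mathcal{C}$ and $\ell_{o,d}$.
Define $a + b = c$.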
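[Figure: the curve $\mathcal{C}$ with the tangent line $\ell_{o,o}$ at $o$, meeting $\mathcal{C}$ again at $k$, and the line $\ell_{a,k}$ through $a$, $k$, $\bar{a}$.]
Let $\ell_{o,o}$ be the line tangent to $\mathcal{C}$ at $o$.
Let $k$ be the third point of intersection between $\mathcal{C}$ and $\ell_{o,o}$.
Now, let $\ell_{a,k}$ be the line which meets $\mathcal{C}$ at $a$ and $k$.
Let $\bar{a}$ be the third point of intersection between $\mathcal{C}$ and $\ell_{a,k}$.
Define $-a$ to be $\bar{a}$.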
We shall soon show that $a + b$ is a commutative group law on the points on $\mathcal{C}$, with identity $o$ and the inverse of $a$ given by $\bar{a}$. First we need the following technical lemma.
Lemma 1.3. Let $P_1, \ldots, P_8$ be such that no 4 points lie on a line and no 7 points lie on a conic. Then there exists a unique point $P_9$ which is a 9th point of intersection of any two cubics passing through $P_1, \ldots, P_8$.
Optional Proof See 0.137.
Theorem 1.4. Let $\mathcal{C}$ be an elliptic curve $/K$, with $K$-rational point $o$. Then $a + b$, as in Definition 1.2, gives a commutative group law on the points on $\mathcal{C}$, with identity $o$. The inverse of $a$ is given by the point $\bar{a}$ constructed in Definition 1.2. Further, the $K$-rational points $\mathcal{C}(K)$ form a subgroup, called the Mordell-Weil group.
Proof It is easy to show commutativity, the fact that $o$ is the identity, and the fact that $\bar{a}$ is the inverse of $a$. The only difficult problem is associativity. In order to prove associativity, consider the following diagram.
[Figure: the six lines $r, s, t, \ell, m, n$ and the points $a, b, c, d, e, f, u, v, w, o$ at which they meet $\mathcal{C}$.]
Here, $r, s, t, \ell, m, n$ are lines. On each line, the labelled points are the points of intersection between $\mathcal{C}$ and that line. From the construction of Definition 1.2:
$a + b = e$,
and so:
$(a + b) + c$ = 3rd point of intersection on $\ell_{o,f}$.
Similarly:
$b + c = v$,
$a + (b + c)$ = 3rd point of intersection on $\ell_{o,w}$.
To show $(a + b) + c = a + (b + c)$, it is sufficient to show that $f = w$. Let $F_1 = \ell m n$ and $F_2 = r s t$, both of which are cubic curves.
$\mathcal{C}$ and $F_1$ have 8 common points: $a, b, c, d, e, u, v, o$.
$\mathcal{C}$ and $F_2$ also have these 8 common points: $a, b, c, d, e, u, v, o$.
From Lemma 1.3, the 9th point of intersection of $\mathcal{C}$ and $F_1$ must be the same as the 9th point of intersection of $\mathcal{C}$ and $F_2$; that is, $f = w$, as required.
Hence, $+$ is a commutative group law.
It remains to show that $\mathcal{C}(K)$ is a subgroup. We are given that $o \in \mathcal{C}(K)$. Let $a, b \in \mathcal{C}(K)$. It is sufficient to show that $a + b \in \mathcal{C}(K)$ and that $-a \in \mathcal{C}(K)$.
Let $a = (x_1, y_1)$ and $b = (x_2, y_2)$, where $x_1, y_1, x_2, y_2 \in K$. Then the line through $a, b$ is (in affine form)
$\ell_{a,b} : y = \lambda x + m$, where $\lambda = \dfrac{y_1 - y_2}{x_1 - x_2} \in K$ and $m = \dfrac{x_1 y_2 - x_2 y_1}{x_1 - x_2} \in K$.
Substitute $y = \lambda x + m$ into the cubic equation for $\mathcal{C}$ to get $\phi(x) = x^3 + c_2 x^2 + c_1 x + c_0 = 0$, defined over $K$. Let $\phi(x) = (x - x_1)(x - x_2)(x - x_3)$ be the factorisation of $\phi(x)$. Then $x_1, x_2, x_3$ are the 3 roots of $\phi$ and so $x_1 + x_2 + x_3 = -c_2$, giving: $x_3 = -c_2 - x_1 - x_2 \in K$ and $y_3 = \lambda x_3 + m \in K$. The line $\ell_{a,b}$ then meets $\mathcal{C}$ at $a$, $b$, $d = (x_3, y_3) \in \mathcal{C}(K)$. The same argument shows that the line $\ell_{o,d}$ through $o, d$ has 3rd point of intersection $c$ which is also in $\mathcal{C}(K)$. But $c = a + b$ and so we have shown that $a + b \in \mathcal{C}(K)$. A similar argument shows that if $a \in \mathcal{C}(K)$ then $-a \in \mathcal{C}(K)$. Hence $\mathcal{C}(K)$ is a subgroup, as required.
Aside: It is apparent that, in the above proof, we have dealt with the typical case, where none of our points are repeated (for the proof of associativity), and none are at infinity (for the proof that $\mathcal{C}(K)$ is a subgroup, since the points were written in affine form). It is straightforward to check these special cases; we shall not bother to do so here.
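Comment 1.5. When two nonsingular cubics $\mathcal{C}_1, \mathcal{C}_2$ are birationally equivalent over $K$ (under $\phi : \mathcal{C}_1 \to \mathcal{C}_2$), it can be shown that $a, b, c$ on $\mathcal{C}_1$ are collinear iff $\phi(a), \phi(b), \phi(c)$ on $\mathcal{C}_2$ are collinear, and $\phi$ is an isomorphism between $\mathcal{C}_1(K)$ and $\mathcal{C}_2(K)$.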
Comment 1.6. By an elliptic curve, we shall always mean a projective curve, but often write the equation in affine form. Note that, whichever way it is written, we are always referring to the projective curve. For example, if we say "let $\mathcal{C} : y^2 = x^3 + 3$ be an elliptic curve", it should be understood that this is a shorthand notation for the corresponding projective curve $ZY^2 = X^3 + 3Z^3$.
Theorem 1.7. Let $K$ be a field satisfying $\mathrm{char}(K) \neq 2, 3$ [recall this means that $1 + 1 \neq 0$ and $1 + 1 + 1 \neq 0$]. Then any elliptic curve over $K$ is birationally equivalent over $K$ to a curve of the form $y^2 = x^3 + Ax + B$.
When $K = \mathbb{Q}$, we can birationally transform any $y^2$ = cubic in $x$ to a curve of the form $y^2 = x^3 + Ax + B$, with $A, B \in \mathbb{Z}$, using only maps of the form $(x, y) \mapsto (ax + b, cy)$.
Comment 1.8. Let $K$ be a field satisfying $\mathrm{char}(K) \neq 2, 3$, and let $g(x)$ be a quartic polynomial over $K$ with nonzero discriminant. It can be shown that any curve $\mathcal{D} : y^2 = g(x)$, with a $K$-rational point, is an elliptic curve, and is birationally equivalent over $K$ to a curve of the form $y^2 = x^3 + Ax + B$ [see p.35 of Cassels].
Comment 1.9. We shall typically take our elliptic curves to have the form
$\mathcal{E} : y^2 = x^3 + Ax + B$, where $A, B \in K$,
which should be regarded as shorthand for the projective curve $ZY^2 = X^3 + AXZ^2 + BZ^3$.
Sometimes it will be convenient to include the $x^2$ term. Since $\mathcal{E}$ is nonsingular, we must have $\Delta = 4A^3 + 27B^2 \neq 0$, as was shown in Example 0.110. The notation $\Delta = 4A^3 + 27B^2$ is standard.
It is conventional to choose $o = (0, 1, 0)$, the point at infinity, as the identity [we shall always take $o = (0, 1, 0)$ unless otherwise stated]. Note that the line $Z = 0$ meets $\mathcal{E}$ at $o$ three times (such a point is called an inflexion). Given a point $a = (X, Y, Z)$, if we take the line through $a$ and $o = (0, 1, 0)$ then the third point of intersection is $(X, -Y, Z)$, which must then be $-a$. In affine form: $-(x, y) = (x, -y)$.
This gives an easy rule for finding the inverse of a point, under the group law, namely:
the inverse of $a$ is its reflection in the $x$-axis.
So, for an elliptic curve $\mathcal{E}$ written in the form $y^2$ = cubic in $x$, the points are $o$ (the point at infinity) and the affine points $(x, y)$, and the group law has a simpler description:
Let $d = (x_3, y_3)$ be the 3rd point of intersection of $\mathcal{E}$ and $\ell_{a,b}$.
Then $a + b = (x_3, -y_3)$, the reflection of $d$ in the $x$-axis.
We illustrate the group law with the following computation (see also 0.143).
Example 1.10. Let $\mathcal{E} : y^2 = x^3 + 1$. Let us compute $a + b$, where $a = (x_1, y_1) = (-1, 0)$ and $b = (x_2, y_2) = (0, 1)$.
The line through $a, b$ is $\ell_{a,b} : y = x + 1$. Substituting this into $\mathcal{E}$, we see that the $x$-coordinate of any point of intersection satisfies $(x + 1)^2 = x^3 + 1$, and so:
$x^3 - x^2 - 2x = 0$. $(*)$
We are looking for $(x_3, y_3)$, the 3rd point of intersection of $\mathcal{E}$ and $\ell_{a,b}$. We first find $x_3$; note that $x_1, x_2, x_3$ must be the roots of $(*)$.
Method A (for finding $x_3$). Since the roots of $(*)$ are $x_1, x_2, x_3$, it follows that $x^3 - x^2 - 2x = (x - x_1)(x - x_2)(x - x_3)$; equating coefficients of $x^2$ gives that:
$x_1 + x_2 + x_3 = -(\text{coefficient of } x^2 \text{ in } (*)) = -(-1) = 1$,
so that $(-1) + 0 + x_3 = 1$, giving $x_3 = 2$.
Method B (for finding $x_3$). Factorise $(*)$ to give $x(x + 1)(x - 2)$, whose roots are $0, -1, 2$. Two of these are the already known $x_1 = -1$, $x_2 = 0$, and so $x_3$ must be the remaining root: $x_3 = 2$.
Having found $x_3$ (by either method), we use the equation of $\ell_{a,b}$ to compute $y_3 = x_3 + 1 = 3$.
In summary: $\mathcal{E}$ and $\ell_{a,b}$ intersect at $(-1, 0)$, $(0, 1)$, $(2, 3)$, and so $(-1, 0) + (0, 1) + (2, 3) = o$.
Finally, this gives: $(-1, 0) + (0, 1) = -(2, 3) = (2, -3)$, using the rule that negation is given by reflection in the $x$-axis.
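The first-principles computation above can be sketched in Python. This is only an illustration: the function names `on_curve` and `add_points` are hypothetical, the sketch works over $\mathbb{Q}$ with exact fractions, and it handles only the case $x_1 \neq x_2$ on a curve $y^2 = x^3 + Ax + B$. It uses Method A, reading $x_3$ off from the sum of the roots.

```python
from fractions import Fraction

def on_curve(P, A, B):
    """Check that the affine point P lies on y^2 = x^3 + A*x + B."""
    x, y = P
    return y * y == x**3 + A * x + B

def add_points(P, Q, A, B):
    """Add two affine points with distinct x-coordinates (illustrative sketch)."""
    (x1, y1), (x2, y2) = P, Q
    if not (on_curve(P, A, B) and on_curve(Q, A, B)):
        raise ValueError("both points must lie on the curve")
    if x1 == x2:
        raise ValueError("this sketch only handles x1 != x2")
    lam = Fraction(y1 - y2, x1 - x2)   # slope of the line through P and Q
    m = y1 - lam * x1                  # the line is y = lam*x + m
    # Substituting the line into the curve gives x^3 - lam^2 x^2 + ... = 0,
    # so the three roots satisfy x1 + x2 + x3 = lam^2 (Method A).
    x3 = lam**2 - x1 - x2
    y3 = lam * x3 + m                  # third intersection point d = (x3, y3)
    return (x3, -y3)                   # reflect d in the x-axis

print(add_points((-1, 0), (0, 1), A=0, B=1))
```

For the curve and points of Example 1.10 this returns the point $(2, -3)$, in agreement with the hand computation.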
One can also obtain an explicit general formula for the group law.
Lemma 1.11. Let $\mathcal{E} : y^2 = x^3 + Ax + B$, where $A, B \in K$, with (as usual) $o$ = the point at infinity. Let $(x_3, y_3) = (x_1, y_1) + (x_2, y_2)$.
Case 1. When $x_1 \neq x_2$ then:
$x_3 = \dfrac{x_1 x_2^2 + x_1^2 x_2 + A(x_1 + x_2) + 2B - 2y_1 y_2}{(x_1 - x_2)^2}$, $\quad y_3 = -\lambda x_3 - m$,
where: $\lambda = \dfrac{y_1 - y_2}{x_1 - x_2}$, $\quad m = \dfrac{x_1 y_2 - x_2 y_1}{x_1 - x_2}$.
Case 2. When $(x_1, y_1) = (x_2, y_2)$ then $(x_3, y_3) = (x_1, y_1) + (x_1, y_1)$ [which can be written as $2(x_1, y_1)$], and:
$x_3 = \dfrac{x_1^4 - 2Ax_1^2 - 8Bx_1 + A^2}{4y_1^2} = \dfrac{x_1^4 - 2Ax_1^2 - 8Bx_1 + A^2}{4(x_1^3 + Ax_1 + B)}$, $\quad y_3 = -\lambda x_3 - m$,
where: $\lambda = \dfrac{3x_1^2 + A}{2y_1}$, $\quad m = \dfrac{-x_1^3 + Ax_1 + 2B}{2y_1}$.
Optional Proof See 0.144.
The above formulas give an alternative method for computing the group law, although in practice it often turns out to be easier to compute the group law from first principles, as in Example 1.10.
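As a sanity check, the two cases of Lemma 1.11 can be transcribed directly into Python. The function name `add_formula` is hypothetical, and the sketch assumes integer or rational inputs; it returns `None` to stand for $o$ in the remaining case (same $x$-coordinate but not a point that can be doubled by Case 2).

```python
from fractions import Fraction

def add_formula(P, Q, A, B):
    """Group law on y^2 = x^3 + A*x + B via the formulas of Lemma 1.11 (sketch)."""
    (x1, y1), (x2, y2) = P, Q
    if x1 != x2:
        # Case 1: distinct x-coordinates.
        num = x1 * x2**2 + x1**2 * x2 + A * (x1 + x2) + 2 * B - 2 * y1 * y2
        x3 = Fraction(num, (x1 - x2) ** 2)
        lam = Fraction(y1 - y2, x1 - x2)
        m = Fraction(x1 * y2 - x2 * y1, x1 - x2)
    elif y1 == y2 and y1 != 0:
        # Case 2: doubling a point with nonzero y-coordinate.
        x3 = Fraction(x1**4 - 2 * A * x1**2 - 8 * B * x1 + A**2, 4 * y1**2)
        lam = Fraction(3 * x1**2 + A, 2 * y1)
        m = Fraction(-x1**3 + A * x1 + 2 * B, 2 * y1)
    else:
        return None  # the two points are negatives of each other, so the sum is o
    return (x3, -lam * x3 - m)

print(add_formula((-1, 0), (0, 1), A=0, B=1))   # Case 1 on y^2 = x^3 + 1
print(add_formula((0, 1), (0, 1), A=0, B=1))    # Case 2 on y^2 = x^3 + 1
```

The first call reproduces $(-1, 0) + (0, 1) = (2, -3)$ from Example 1.10.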
Comment 1.12. When $\Delta = 4A^3 + 27B^2 \neq 0$, all 3 roots of $x^3 + Ax + B$ are distinct, guaranteeing that $y^2 = x^3 + Ax + B$ has no singularities and is an elliptic curve.
When $\Delta = 0$, this is no longer an elliptic curve and at least two roots of the cubic coincide: $y^2 = (x - \alpha)^2 (x - \beta)$. It is still the case that the set of nonsingular points on $\mathcal{E}$, denoted $\mathcal{E}_{ns}$, forms a group [see pp.39-41 of Cassels]. When $\alpha \neq \beta$ the singularity at $(\alpha, 0)$ is a node. When $\alpha = \beta$ the singularity is a cusp. In either case, the curve can be written:
$\left(\dfrac{y}{x - \alpha}\right)^2 = x - \beta$, and so is birationally equivalent to the conic $w^2 = x - \beta$.
Definition 1.13. Let $\mathcal{E}$ be an elliptic curve and let $P$ be a point on $\mathcal{E}$. For any positive integer $m$, let $mP$ denote $P + \ldots + P$ [$m$ times]. We say that $P$ is an $m$-torsion point if $mP = o$. The $m$-torsion group of $\mathcal{E}$, denoted $\mathcal{E}[m]$, is the set of all $m$-torsion points. We also say that $P$ has order $m$ (or that $P$ is a point of order $m$) if $m$ is the smallest positive integer for which $mP = o$. When such $m$ exists, $P$ is a torsion point ($P$ has finite order). If no such $m$ exists, then $P$ is a non-torsion point ($P$ has infinite order). The group of all $K$-rational torsion points on $\mathcal{E}$ is denoted $\mathcal{E}_{tors}(K)$ [or sometimes $\mathcal{E}(K)_{tors}$].
Examples 1.14.
(a) Let $\mathcal{E} : y^2 = x^3 - x$, and let $P = (1, 0)$ so that $-P = -(1, 0) = (1, -0) = P$, so that $2P = P + P = P - P = o$. But $1 \cdot P = P \neq o$, and so 2 is the smallest $m > 0$ such that $mP = o$. $P$ has order 2 and $P \in \mathcal{E}_{tors}(\mathbb{Q})$.
(b) Let $\mathcal{E} : y^2 = x^3 + 1$, and let $P = (0, 1)$. First compute $P + P$. Using $2yy' = 3x^2$ at $(0, 1)$ gives $2 \cdot 1 \cdot y' = 3 \cdot 0^2$ and so the tangent line $\ell_{P,P}$ to $\mathcal{E}$ at $P$ has slope 0 and equation of the form $y = 0x + m$. But the line goes through $(0, 1)$ and so $m = 1$ and the tangent line is $y = 1$. Substituting $y = 1$ into $y^2 = x^3 + 1$ gives $x^3 = 0$, with roots $0, 0, 0$. So, $\mathcal{E}$ meets $\ell_{P,P}$ at $(0, 1)$ with multiplicity 3, and $(0, 1) + (0, 1) + (0, 1) = o$. Hence: $(0, 1) + (0, 1) = -(0, 1) = (0, -1)$.
In summary:
$1 \cdot (0, 1) = (0, 1)$, $\quad 2 \cdot (0, 1) = (0, -1)$, $\quad 3 \cdot (0, 1) = o$.
$(0, 1)$ has order 3 and $(0, 1) \in \mathcal{E}_{tors}(\mathbb{Q})$.
When $K = \mathbb{F}_p$, a finite field with $p$ elements, there are of course only finitely many members of $\mathcal{E}(\mathbb{F}_p)$.
Aside: Each of the $p$ possible $x$-coordinates $0, \ldots, p - 1$ has about a 50% chance of making $x^3 + Ax + B$ a square modulo $p$. When $x^3 + Ax + B$ is not a square, there are no corresponding $y$-coordinates. When $x^3 + Ax + B$ is a square, there are at most two corresponding $y$-coordinates. So, one might expect on average about $p$ affine points, that is, about $p + 1$ points, including the point at infinity.
The following result gives a bound within which the number of points must lie.
Theorem 1.15. (Hasse). Let $\mathcal{E}$ be an elliptic curve over $\mathbb{F}_p$. Let $N_p = \#\mathcal{E}(\mathbb{F}_p)$ where, as usual, $\mathcal{E}(\mathbb{F}_p)$ should be taken to include $o$ [so that $N_p$ is the number of affine points $(x, y)$ on $\mathcal{E}$ with $x, y \in \mathbb{F}_p$, plus 1, to include the point at infinity $o$]. Then:
$|N_p - (p + 1)| \leq 2\sqrt{p}$, that is, $N_p \in [(p + 1) - 2\sqrt{p},\ (p + 1) + 2\sqrt{p}]$.
Similarly, any curve $y^2 = Q(x)$, where $Q(x) = f_4 x^4 + \ldots + f_0$ has nonzero discriminant, has at least $p - 1 - 2\sqrt{p}$ affine points.
Proof See p.118 of Cassels or p.131 of Silverman.
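Example 1.16. Let $\mathcal{E} : y^2 = x^3 + 4x + 1$, defined over $\mathbb{F}_{13}$. Then:
$\#\mathcal{E}(\mathbb{F}_{13}) \geq 13 + 1 - 2\sqrt{13} > 13 + 1 - 2 \cdot 4 = 6$, so that $\#\mathcal{E}(\mathbb{F}_{13}) \geq 7$.
$\#\mathcal{E}(\mathbb{F}_{13}) \leq 13 + 1 + 2\sqrt{13} < 13 + 1 + 2 \cdot 4 = 22$, so that $\#\mathcal{E}(\mathbb{F}_{13}) \leq 21$.
Note that at most 4 of the points on $\mathcal{E}(\mathbb{F}_{13})$ can be $o$ and points of the form $(x, 0)$, so there must exist at least 3 affine points $(x, y) \in \mathcal{E}(\mathbb{F}_{13})$ with $y \neq 0$.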
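For small $p$ the Hasse bound can be checked by brute force. The following is a naive sketch (the name `count_points` is hypothetical) that counts the solutions of $y^2 = x^3 + Ax + B$ over $\mathbb{F}_p$ by tabulating how many $y$ square to each residue, then adds 1 for the point at infinity. It takes about $p$ steps, so it is only sensible for small primes.

```python
def count_points(A, B, p):
    """Return #E(F_p) for y^2 = x^3 + A*x + B, including the point at infinity."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x**3 + A * x + B) % p] for x in range(p))
    return affine + 1

p = 13
n = count_points(4, 1, p)            # the curve of Example 1.16
# Hasse: |N_p - (p + 1)| <= 2*sqrt(p), i.e. (N_p - (p + 1))^2 <= 4p
assert (n - (p + 1)) ** 2 <= 4 * p
print(n)
```

The count for the curve of Example 1.16 necessarily lands between 7 and 21, as derived above.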
Section 2. The p-adic Numbers $\mathbb{Q}_p$
For $\mathbb{Q}$, let $|\ |_\infty$ denote the standard absolute value [e.g. $|-5|_\infty = |5|_\infty = 5$]. Consider the sequence: $x_1 = 1.4$, $x_2 = 1.41$, $x_3 = 1.414, \ldots$, where $x_n$ is the largest decimal to $n$ decimal places satisfying $x_n^2 < 2$. Then $|x_m - x_n|_\infty \to 0$ as $m, n \to \infty$, so that the sequence is Cauchy in $\mathbb{Q}, |\ |_\infty$. The sequence $x_n$ cannot be convergent, since if $x_n \to \alpha$ then clearly $\alpha^2 = 2$ and no such $\alpha$ exists in $\mathbb{Q}$. We say that $\mathbb{Q}, |\ |_\infty$ is incomplete (since not every Cauchy sequence is convergent) and the real numbers $\mathbb{R}$ give the completion of $\mathbb{Q}, |\ |_\infty$. The absolute value $|\ |_\infty$ is a special case of the following.
Definition 2.1. Let $K$ be a field. A valuation on $K$ is a function $|\ | : K \to \mathbb{R}$ satisfying:
(1) $|x| \geq 0$ for all $x \in K$, with equality if and only if $x = 0$.
(2) $|xy| = |x| \cdot |y|$ for all $x, y \in K$.
(3) $|x + y| \leq |x| + |y|$ for all $x, y \in K$ [the triangle inequality].
If a valuation also satisfies the stronger property:
(3') $|x + y| \leq \max(|x|, |y|)$, for all $x, y \in K$,
then we say that it is a non-Archimedean valuation; otherwise it is an Archimedean valuation.
For example, $\mathbb{Q}, |\ |_\infty$ (or $\mathbb{R}, |\ |_\infty$) is a valuation. It is Archimedean since, for example, $|1 + 1|_\infty \not\leq \max(|1|_\infty, |1|_\infty)$. We shall now introduce another valuation on $\mathbb{Q}$, which gives a different notion of size and distance.
Definition 2.2. Fix a prime $p$. Let $x = \frac{m}{n} \in \mathbb{Q}$, $x \neq 0$. Write $\frac{m}{n} = p^r \frac{a}{b}$, where $p \nmid a$, $p \nmid b$. Then the $p$-adic valuation (or $p$-adic absolute value or $p$-adic size) is defined to be:
$|x|_p = \left|\frac{m}{n}\right|_p = p^{-r}$ [so, $x$ is smaller the higher the power of $p$ dividing $x$].
We also define $|0|_p = 0$. For any $x, y \in \mathbb{Q}$, the $p$-adic distance between $x$ and $y$ is defined to be: $d_p(x, y) = |x - y|_p$. (Note that $d_p$ is a metric.)
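Example 2.3. In $\mathbb{Q}, |\ |_3$, we have: $\left|\frac{4}{3}\right|_3 = \left|3^{-1} \cdot \frac{4}{1}\right|_3 = 3^{-(-1)} = 3$, $\ |9|_3 = \left|3^2 \cdot \frac{1}{1}\right|_3 = 3^{-2} = \frac{1}{9}$, and $|7|_3 = \left|3^0 \cdot \frac{7}{1}\right|_3 = 3^0 = 1$.
Also, $d_3(-5, 3) = |-5 - 3|_3 = |-8|_3 = 1$, $\ d_3(-5, 19) = |-5 - 19|_3 = |-24|_3 = 3^{-1}$ and $d_3\left(\frac{1}{2}, \frac{1}{5}\right) = \left|\frac{3}{10}\right|_3 = 3^{-1}$.
For integers $m, n$: $m \not\equiv n \pmod{3} \iff d_3(m, n) = 1$, $\ m \equiv n \pmod{3} \iff d_3(m, n) \leq \frac{1}{3}$, $\ m \equiv n \pmod{3^2} \iff d_3(m, n) \leq \frac{1}{3^2}$, and so on. The integers $m, n$ are 3-adically closer when they are congruent modulo a higher power of 3.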
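Definition 2.2 translates directly into a few lines of Python. The helper names `ord_p` and `abs_p` below are hypothetical; the sketch uses exact rational arithmetic and returns $|x|_p$ as a `Fraction`.

```python
from fractions import Fraction

def ord_p(x, p):
    """The exponent r with x = p^r * a/b, p dividing neither a nor b (x nonzero)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is not defined")
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator raise r
        num //= p
        r += 1
    while den % p == 0:      # powers of p in the denominator lower r
        den //= p
        r -= 1
    return r

def abs_p(x, p):
    """The p-adic absolute value |x|_p = p^(-r), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-ord_p(x, p))

print(abs_p(Fraction(4, 3), 3), abs_p(9, 3), abs_p(7, 3))
```

The three printed values correspond to $|4/3|_3$, $|9|_3$ and $|7|_3$ from Example 2.3.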
Lemma 2.4. The function $|\ |_p$ of Definition 2.2 is a non-Archimedean valuation on $\mathbb{Q}$.
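Proof (1), (2), (3') are trivially true when $x$ or $y = 0$. Let $x, y \in \mathbb{Q}$, $x, y \neq 0$, and write $x = p^r \frac{a}{b}$, $y = p^s \frac{c}{d}$, where $p \nmid a, b, c, d$.
(1) $|x|_p = p^{-r} > 0$.
(2) $|xy|_p = \left|p^r \frac{a}{b}\, p^s \frac{c}{d}\right|_p = \left|p^{r+s} \frac{ac}{bd}\right|_p = p^{-(r+s)}$ [since $p \nmid ac, bd$] $= p^{-r} p^{-s} = |x|_p |y|_p$.
(3') Wlog $r \leq s$, and we may assume $x + y \neq 0$ (otherwise the inequality is trivial), giving:
$|x + y|_p = \left|p^r \frac{a}{b} + p^s \frac{c}{d}\right|_p = \left|p^r \left(\frac{a}{b} + p^{s-r} \frac{c}{d}\right)\right|_p = \left|p^r\, \frac{ad + p^{s-r} bc}{bd}\right|_p = \left|p^r\, \frac{p^k \gamma}{bd}\right|_p$
for some $k \geq 0$ and $\gamma \in \mathbb{Z}$ with $p \nmid \gamma$ [since $ad + p^{s-r} bc \in \mathbb{Z}$]
$= p^{-(r+k)} \leq p^{-r} = |x|_p = \max(|x|_p, |y|_p)$.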
Comment 2.5. By induction, $|a_1 + \ldots + a_n|_p \leq \max(|a_1|_p, \ldots, |a_n|_p)$. It is also easy to show that $|x|_p \neq |y|_p \implies |x + y|_p = \max(|x|_p, |y|_p)$. Furthermore, if $|a_k|_p > |a_i|_p$ for all $i$, $1 \leq i \leq n$, $i \neq k$, then $|a_1 + \ldots + a_n|_p = \max(|a_1|_p, \ldots, |a_n|_p)$.
Definition 2.6. Let $K, |\ |$ be a field with valuation. For $a_n, \alpha \in K$, we say that the sequence $a_n$ converges to $\alpha$ [denoted $a_n \to \alpha$] in $K, |\ |$ when $|a_n - \alpha| \to 0$ in $\mathbb{R}, |\ |_\infty$ as $n \to \infty$. That is: for any $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $|a_n - \alpha| < \epsilon$ for all $n > N$. Given a sequence $a_n \in K$, if there exists $\alpha \in K$ such that $a_n \to \alpha$ in $K, |\ |$ then we say that $a_n$ converges in $K, |\ |$, or that it is convergent in $K, |\ |$. It is Cauchy if $|a_m - a_n| \to 0$ in $\mathbb{R}, |\ |_\infty$ as $m, n \to \infty$. That is: for any $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $|a_m - a_n| < \epsilon$ for all $m, n > N$.
We say that $K, |\ |$ is complete if every Cauchy sequence is convergent.
Examples 2.7.
(a) Let $a_n = 6^n$. Then $|a_n - 0|_3 = |6^n|_3 = 3^{-n} \to 0$ as $n \to \infty$. So $a_n \to 0$ in $\mathbb{Q}, |\ |_3$.
(b) Let $a_1 = 1$, $a_2 = 11$, $a_3 = 111, \ldots$ so that $9a_n = 999 \ldots 9$ [$n$ times] and $9a_n + 1 = 10^n$. Then $|9a_n - (-1)|_5 = |10^n|_5 = 5^{-n} \to 0$, giving $9a_n \to -1$ in $\mathbb{Q}, |\ |_5$. It follows that $a_n \to -\frac{1}{9}$ in $\mathbb{Q}, |\ |_5$.
(c) Let $x_0 = a_0 = 3$. Then $a_0^2 = 9 \equiv 2 \pmod 7$, and $|x_0^2 - 2|_7 = |a_0^2 - 2|_7 = |7|_7 = 7^{-1} < 1$.
We want to find $a_1 \in \{0, \ldots, 6\}$ such that $(a_0 + a_1 \cdot 7)^2 \equiv 2 \pmod{7^2}$.
This is satisfied $\iff a_0^2 + 2a_0 a_1 \cdot 7 + a_1^2 \cdot 7^2 \equiv 2 \pmod{7^2}$
$\iff 6a_1 \cdot 7 \equiv 2 - 9 = -7 \pmod{7^2} \iff 6a_1 \equiv -1 \pmod 7 \iff a_1 \equiv 1 \pmod 7$,
so we can take $a_1 = 1$. Let $x_1 = a_0 + a_1 \cdot 7 = 3 + 1 \cdot 7 = 10$. Then $x_1^2 = 100 \equiv 2 \pmod{7^2}$ and $|x_1^2 - 2|_7 = 7^{-2}$.
Aside: note how the solvability of the last congruence is affected by $|2a_0|_7 = |f'(a_0)|_7$, where $f(x) = x^2 - 2$.
When we similarly solve for $a_2 \in \{0, \ldots, 6\}$ such that $(a_0 + a_1 \cdot 7 + a_2 \cdot 7^2)^2 \equiv 2 \pmod{7^3}$ we find that $a_2 = 2$, giving $x_2 = a_0 + a_1 \cdot 7 + a_2 \cdot 7^2 = 3 + 7 + 98 = 108$. Check: $x_2^2 \equiv 2 \pmod{7^3}$ and $|x_2^2 - 2|_7 \leq 7^{-3}$.
We can inductively find $x_n = a_0 + a_1 \cdot 7 + \ldots + a_n \cdot 7^n$ such that $x_n^2 \equiv 2 \pmod{7^{n+1}}$, that is, $|x_n^2 - 2|_7 \leq 7^{-(n+1)}$. Hence $x_n^2 \to 2$ in $\mathbb{Q}, |\ |_7$.
Intuitively, $(3 + 1 \cdot 7 + 2 \cdot 7^2 + \ldots)^2 = 2$ in $|\ |_7$. The sequence $x_n$ is easily seen to be Cauchy in $\mathbb{Q}, |\ |_7$. The sequence is not convergent since if $x_n \to \alpha$ in $\mathbb{Q}, |\ |_7$ then $\alpha^2 = 2$, which is impossible for $\alpha \in \mathbb{Q}$.
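The digit-by-digit construction in (c) can be sketched as a short search. The function name `sqrt_digits` is hypothetical, and the sketch assumes a starting digit $a_0$ with $a_0^2 \equiv c \pmod p$ and $p \nmid 2a_0$, so that exactly one digit works at each stage; it simply tries all $p$ candidates rather than solving the linear congruence.

```python
def sqrt_digits(c, a0, p, n_digits):
    """Digits a_0, a_1, ... with (a_0 + a_1 p + ... + a_n p^n)^2 = c mod p^(n+1)."""
    if (a0 * a0 - c) % p != 0 or (2 * a0) % p == 0:
        raise ValueError("need a0^2 = c mod p and 2*a0 nonzero mod p")
    digits, x = [a0], a0
    for n in range(1, n_digits):
        modulus = p ** (n + 1)
        for d in range(p):                 # try each candidate digit a_n
            candidate = x + d * p**n
            if (candidate * candidate - c) % modulus == 0:
                digits.append(d)
                x = candidate
                break
    return digits, x

digits, x = sqrt_digits(2, 3, 7, 3)
print(digits, x)      # the digits a_0, a_1, a_2 and the partial sum x_2
```

With $c = 2$, $a_0 = 3$, $p = 7$ this recovers the digits $3, 1, 2$ and the partial sum $x_2 = 108$ found above.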
(d) Again, let $a_0 = 3$, but now define $a_{n+1} = a_n - \dfrac{f(a_n)}{f'(a_n)}$, for $n \geq 0$, where $f(x) = x^2 - 2$ [the Newton-Raphson formula]. Then:
$a_0 = 3$, $\quad a_1 = 3 - \dfrac{3^2 - 2}{2 \cdot 3} = \dfrac{11}{6}$, $\quad a_2 = \dfrac{11}{6} - \dfrac{(\frac{11}{6})^2 - 2}{2 \cdot \frac{11}{6}} = \dfrac{193}{132}$, and so on.
Check that: $|a_0^2 - 2|_7 = |3^2 - 2|_7 \leq 7^{-1}$, $\ |a_1^2 - 2|_7 = \left|(\frac{11}{6})^2 - 2\right|_7 = \left|\frac{49}{36}\right|_7 \leq 7^{-2}$, and that $a_n$ satisfies the same properties as $x_n$ of Example (c), namely: $|a_n^2 - 2|_7 \leq 7^{-(n+1)}$ so that $a_n^2 \to 2$ in $\mathbb{Q}, |\ |_7$, again forcing $a_n$ to be Cauchy but not convergent.
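The Newton-Raphson iterates in (d) are rational numbers, so they can be computed exactly. The sketch below (with hypothetical names `newton_sqrt2` and `ord_p`) lists the first few iterates and the power of 7 dividing $a_n^2 - 2$, which is one way to check the claimed bound $|a_n^2 - 2|_7 \leq 7^{-(n+1)}$ for small $n$.

```python
from fractions import Fraction

def ord_p(x, p):
    """Exponent of p in the nonzero rational x."""
    x = Fraction(x)
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r

def newton_sqrt2(a0, steps):
    """Iterates a_0, a_1, ..., a_steps of a -> a - f(a)/f'(a) for f(x) = x^2 - 2."""
    a = Fraction(a0)
    iterates = [a]
    for _ in range(steps):
        a = a - (a * a - 2) / (2 * a)
        iterates.append(a)
    return iterates

for n, a in enumerate(newton_sqrt2(3, 3)):
    # ord_7(a_n^2 - 2) >= n + 1 is the same as |a_n^2 - 2|_7 <= 7^-(n+1)
    print(n, a, ord_p(a * a - 2, 7))
```

The exponents grow quickly from one step to the next, which is the usual behaviour of Newton's method.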
The last two examples show that $\mathbb{Q}$ is incomplete with respect to the valuation $|\ |_7$, and indeed $\mathbb{Q}$ is incomplete with respect to any $|\ |_p$. We now define an extension of $\mathbb{Q}$ which performs the same role with respect to $|\ |_p$ that $\mathbb{R}$ performs with respect to $|\ |_\infty$.
Definition 2.8. The set of $p$-adic numbers $\mathbb{Q}_p$ is the completion of $\mathbb{Q}$ with respect to the valuation $|\ |_p$, and is the smallest field containing $\mathbb{Q}$ which is complete with respect to $|\ |_p$.
For any $\alpha, \beta \in \mathbb{Q}_p$, we say that $\alpha \equiv \beta \pmod{p^n} \iff |\alpha - \beta|_p \leq p^{-n}$ [$\alpha$ is congruent to $\beta$ modulo $p^n$]. A member $x$ of $\mathbb{Q}_p$ (a $p$-adic number) can be written in the following form (the $p$-adic expansion of $x$):
$x = \sum_{n=N}^{\infty} a_n p^n$, where $N \in \mathbb{Z}$, $a_N \neq 0$ and each $a_n \in \{0, \ldots, p - 1\}$,
in which case $|x|_p = p^{-N}$, and the $a_n$ are the digits of $x$. We normally use the shorthand notation $a_N \ldots a_0, a_1 a_2 \ldots$ to represent the above sum. Note that $x \in \mathbb{Q}$ exactly when the digits are eventually periodic.
Examples 2.9.
(a) $w = 4 \cdot 5^{-2} + 1 \cdot 5^{-1} + 4 \cdot 5^0 + 1 \cdot 5^1 + 4 \cdot 5^2 + \ldots \in \mathbb{Q}_5$ and $|w|_5 = 5^2$. This can be denoted $414, \overline{14}$.
(b) $\alpha = 3 \cdot 7^0 + 1 \cdot 7^1 + 2 \cdot 7^2 + \ldots \in \mathbb{Q}_7$ from Example 2.7(c) satisfies $\alpha^2 = 2$.
On the other hand, there is no $\beta \in \mathbb{Q}_7$ such that $\beta^2 = 3$, since any such $\beta$ would satisfy $|\beta|_7^2 = |\beta^2|_7 = |3|_7 = 1$ and so would have 7-adic expansion $\beta = b_0 + b_1 \cdot 7 + b_2 \cdot 7^2 + \ldots$ and would satisfy $(b_0 + b_1 \cdot 7 + b_2 \cdot 7^2 + \ldots)^2 = 3$. This would give $b_0^2 \equiv 3 \pmod 7$, which is impossible, since 3 is not a quadratic residue mod 7 [none of $0^2, 1^2, 2^2, 3^2, 4^2, 5^2, 6^2$ are $\equiv 3 \pmod 7$].
(c) In $\mathbb{Q}_5$: $27 = 2 + 5^2 = 2 \cdot 5^0 + 0 \cdot 5^1 + 1 \cdot 5^2 = 2, 01$ [the 5-adic expansion of 27].
(d) Let us find the 5-adic expansion of $-1/4$. We have $|-1/4|_5 = 1$ so that the 5-adic expansion of $-1/4$ must be of the form $\alpha = a_0 + a_1 \cdot 5 + a_2 \cdot 5^2 + \ldots$, each $a_i \in \{0, 1, 2, 3, 4\}$ and $a_0 \neq 0$. This satisfies $-1 = 4(a_0 + a_1 \cdot 5 + a_2 \cdot 5^2 + \ldots)$ which gives $-1 \equiv 4a_0 \pmod 5$ and so $a_0 = 1$. Then $-1 = 4(1 + a_1 \cdot 5 + a_2 \cdot 5^2 + \ldots)$ gives $-5 \equiv 4a_1 \cdot 5 \pmod{5^2}$, giving $-1 \equiv 4a_1 \pmod 5$, and so $a_1 = 1$. Similarly, we find that $a_2 = 1$, $a_3 = 1, \ldots$ and we suspect that $-1/4 = 1, \overline{1}$.
Let $\alpha = 1, \overline{1}$. Then $\alpha - 1 = 0, \overline{1} = 5\alpha$, so that $-4\alpha = 1$, giving $\alpha = -1/4$, proving that we have the correct 5-adic expansion.
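The procedure in (d) (solve for $a_0$ modulo $p$, subtract it, divide by $p$, repeat) can be sketched as follows. The name `padic_digits` is hypothetical; the sketch assumes a rational input whose denominator is prime to $p$, so that the expansion starts at $p^0$, and it relies on the three-argument `pow` with exponent `-1` for modular inverses (available from Python 3.8).

```python
from fractions import Fraction

def padic_digits(x, p, n_digits):
    """First n_digits digits a_0, a_1, ... of a rational x with |x|_p <= 1."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("this sketch needs a denominator prime to p")
    digits = []
    for _ in range(n_digits):
        # a_0 is the residue of x modulo p
        d = x.numerator * pow(x.denominator, -1, p) % p
        digits.append(d)
        x = (x - d) / p        # shift: the remaining digits belong to (x - a_0)/p
    return digits

print(padic_digits(Fraction(-1, 4), 5, 6))   # digits of -1/4 in Q_5
print(padic_digits(27, 5, 4))                # digits of 27 in Q_5
```

For $-1/4$ every digit produced is 1, matching the expansion $1, \overline{1}$ derived above.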
Comment 2.10. The field $\mathbb{Q}$ is often referred to as a global field and its completions with respect to valuations, namely $\mathbb{R}$ and $\mathbb{Q}_p$, for any prime $p$, are its local fields (or localisations). An equation defined over $\mathbb{Q}$ which has points in $\mathbb{R}$ and every $\mathbb{Q}_p$, but not in $\mathbb{Q}$, is said to violate the Hasse Principle.
Definition 2.11. Let $K$ be a field with a non-Archimedean valuation $|\ |$. We say that $x \in K$ is an integer (with respect to the valuation) when $|x| \leq 1$, and $R = \{x \in K : |x| \leq 1\}$ is the ring of integers (or valuation ring) of $K$. The set $\mathcal{M} = \{x \in K : |x| < 1\}$ is the maximal ideal, and $k = R/\mathcal{M}$ is the residue field [also called the field of digits]. The valuation group is the set $G_K = \{|x| : x \in K^*\}$ under multiplication. We say that the valuation is discrete if there exists $\delta > 0$ such that $1 - \delta < |x| < 1 + \delta \implies |x| = 1$. When the valuation is discrete, there exists an element $p \in \mathcal{M}$ such that $\mathcal{M} = pR$; we say that such an element is a prime element for the valuation.
The ring of integers for $\mathbb{Q}_p$ is often denoted $\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x|_p \leq 1\}$. The valuation group is $G_{\mathbb{Q}_p} = \{p^r : r \in \mathbb{Z}\} = \{\ldots, p^{-2}, p^{-1}, p^0, p^1, p^2, \ldots\}$, so that $\mathbb{Q}_p$ is discrete, and we can take $p$ as a prime element (or indeed any element with valuation $p^{-1}$). The maximal ideal is $\mathcal{M} = p\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x|_p \leq p^{-1}\}$ and the residue field $\mathbb{Z}_p / p\mathbb{Z}_p$ is isomorphic to $\mathbb{F}_p$, the finite field with $p$ elements.
The following result shows how, in some respects, analysis is simpler for non-Archimedean valuations.
Theorem 2.12. Let $K$ be a field, complete with respect to a non-Archimedean valuation $|\ |$, and let $x_n$ be a sequence in $K$. Then: $x_n \to 0$ in $K \iff \sum x_n$ is convergent in $K$.
Proof Let $S_N = \sum_{n=1}^{N} x_n$.
$\Rightarrow$: Assume that $x_n \to 0$ in $K$. Then, for $M < N$:
$|S_N - S_M| = |x_{M+1} + \ldots + x_N| \leq \max(|x_{M+1}|, \ldots, |x_N|) \to 0$ as $M, N \to \infty$.
So $S_N$ is Cauchy and hence convergent (since $K$ is complete), giving that $\sum x_n$ is convergent.
$\Leftarrow$: Assume that $\sum x_n$ is convergent, that is, $S_N \to \ell$ for some $\ell \in K$. Then:
$|x_n - 0| = |x_n| = |S_n - S_{n-1}| = |S_n - \ell + \ell - S_{n-1}| \leq |S_n - \ell| + |S_{n-1} - \ell| \to 0$ as $n \to \infty$,
so that $x_n \to 0$ in $K, |\ |$.
For example, $\sum n!$ converges in any $\mathbb{Q}_p$, since $|n!|_p \to 0$ [it is unknown whether $\sum n! \in \mathbb{Q}$].
The above result applies to $\mathbb{Q}_p$ (since it is non-Archimedean), but not to $\mathbb{R}$ (where, for example, $x_n = \frac{1}{n}$ is a standard counterexample).
Comment 2.13. It is easy to see that the rules for finite sums in Comment 2.5 also apply to infinite series, namely, when $\sum a_n$ converges, $|\sum a_n| \leq \max |a_n|$. Furthermore, if there exists $a_k$ such that $|a_k| > |a_i|$ for all $i \neq k$, then $|\sum a_n| = |a_k|$; in particular, it is then impossible for $\sum a_n = 0$.
Aside: Recall Example 2.7(d), where $x_0 = 3$ and $x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$, with $f(x) = x^2 - 2$, defined a sequence which is Cauchy (but not convergent) in $\mathbb{Q}, |\ |_7$, and which is convergent in $\mathbb{Q}_7$ to a root of $f(x)$. The following describes when an initial approximation $a_0$ leads to a root of $f(x)$.
Theorem 2.14. (Hensel's Lemma). Let $K$ be a field, complete with respect to a non-Archimedean valuation $|\ |$, with valuation ring $R = \{x \in K : |x| \leq 1\}$.
Let $f(x) \in R[x]$ and let $a_0 \in R$ satisfy: $|f(a_0)| < |f'(a_0)|^2$. $(*)$
Then there exists a unique $a \in R$ such that $f(a) = 0$ and $|a - a_0| \leq |f(a_0)| / |f'(a_0)|$.
Proof Define $f_j(x)$ by: $f(x + y) = f_0(x) + f_1(x) y + f_2(x) y^2 + \ldots$,
so that $f_0(x) = f(x)$, $f_1(x) = f'(x)$. Define $b_0 = -f(a_0)/f'(a_0)$. By $(*)$, $|b_0| < |f'(a_0)| \leq 1$.
Define $a_1 = a_0 + b_0 = a_0 - f(a_0)/f'(a_0)$. Then:
$|f'(a_1) - f'(a_0)| = |f'(a_0 + b_0) - f'(a_0)| = |(\text{poly in } a_0) b_0 + (\text{poly in } a_0) b_0^2 + \ldots| \leq |b_0| < |f'(a_0)|$ (by $(*)$),
so that $|f'(a_1)| = |f'(a_0)|$.
Also, $|f(a_1)| = |f(a_0 + b_0)| = |f_0(a_0) + f_1(a_0) b_0 + f_2(a_0) b_0^2 + \ldots|$
$= |f_2(a_0) b_0^2 + \ldots|$ [since $f_0(a_0) + f_1(a_0) b_0 = 0$]
$\leq \max_{j \geq 2} |f_j(a_0)| |b_0|^j \leq |b_0|^2 = \dfrac{|f(a_0)|^2}{|f'(a_0)|^2} = \lambda |f(a_0)| < |f(a_0)|$, where $\lambda = \dfrac{|f(a_0)|}{|f'(a_0)|^2} < 1$.
Summarising: $|f'(a_1)| = |f'(a_0)|$ and $|f(a_1)| \leq \lambda |f(a_0)| < |f(a_0)|$, where $\lambda = \dfrac{|f(a_0)|}{|f'(a_0)|^2} < 1$.
For all $n$, given $a_n \in R$, define $b_n = -f(a_n)/f'(a_n)$ and $a_{n+1} = a_n + b_n = a_n - f(a_n)/f'(a_n)$.
Assume, as induction hypothesis, that (with $\lambda = |f(a_0)| / |f'(a_0)|^2 < 1$):
$|f'(a_n)| = \ldots = |f'(a_1)| = |f'(a_0)|$ and $|f(a_n)| \leq \lambda |f(a_{n-1})| \leq \ldots \leq \lambda^n |f(a_0)|$. (1)
Then, as above: $|f'(a_{n+1})| = \ldots = |f'(a_1)| = |f'(a_0)|$.
Then $|f(a_{n+1})| \leq |b_n|^2$ [justified as for the case $n = 0$ above]
$= \dfrac{|f(a_n)|^2}{|f'(a_n)|^2} = \dfrac{|f(a_n)|^2}{|f'(a_0)|^2}$ [by (1), the induction hypothesis]
$\leq \dfrac{|f(a_0)|}{|f'(a_0)|^2}\, |f(a_n)|$ [since $|f(a_n)| \leq |f(a_0)|$ by (1), the induction hypothesis]
$= \lambda |f(a_n)| \leq \lambda^{n+1} |f(a_0)|$ [by (1), the induction hypothesis].
By induction, $\forall n$, $|f'(a_n)| = |f'(a_0)|$ and $|f(a_n)| \leq \lambda^n |f(a_0)|$ which $\to 0$ as $n \to \infty$. (2)
Now, $|b_n| = |f(a_n)| / |f'(a_n)| = |f(a_n)| / |f'(a_0)| \to 0$, so by Theorem 2.12,
$a_{n+1} = a_0 + b_0 + b_1 + \ldots + b_n$ converges to $a$, say.
By continuity of polynomials, $f(a) = \lim f(a_n) = 0$ [by (2)]. Furthermore:
$|a - a_0| = |\sum b_n| \leq \max |b_n| = \max \dfrac{|f(a_n)|}{|f'(a_n)|} = \max \dfrac{|f(a_n)|}{|f'(a_0)|} = \dfrac{|f(a_0)|}{|f'(a_0)|}$ [by (2)], as required.
For uniqueness, imagine $\tilde{a} \neq a$ also satisfied $f(\tilde{a}) = 0$ and $|\tilde{a} - a_0| \leq |f(a_0)| / |f'(a_0)|$. Let $\tilde{b} = \tilde{a} - a \neq 0$.
Then $0 = f(\tilde{a}) - f(a) = f(a + \tilde{b}) - f(a) = \tilde{b} f_1(a) + \tilde{b}^2 f_2(a) + \ldots$ (3)
But $|\tilde{b}| = |\tilde{a} - a_0 + a_0 - a| \leq \max(|\tilde{a} - a_0|, |a - a_0|) \leq |f(a_0)| / |f'(a_0)|$
$< |f'(a_0)|$ [by $(*)$] $= |f_1(a_0)| = |f_1(a)|$ [by (1) and continuity of $|f'(x)|$].
This gives $|\tilde{b}^j f_j(a)| \leq |\tilde{b}^j| \leq |\tilde{b}^2| < |\tilde{b} f_1(a)|$ (since $|\tilde{b}| \neq 0$ and $|\tilde{b}| < |f_1(a)|$) for $j \geq 2$, so that the leading term of the sum in (3) has valuation strictly greater than the valuations of the other terms, which is inconsistent with the sum being 0. Hence $a$ is unique.
Example 2.15. Let $f(x) = x^3 - 7$ and $a_0 = 3$. Then $|f(a_0)|_5 = |3^3 - 7|_5 = |20|_5 = 5^{-1}$ and $|f'(a_0)|_5 = |3 \cdot 3^2|_5 = 1$. So $|f(a_0)|_5 < |f'(a_0)|_5^2$ and by Hensel's Lemma there exists $a \in \mathbb{Z}_5$ such that $f(a) = 0$, that is: $a^3 = 7$.
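The iteration in the proof of Hensel's Lemma can be run modulo a power of $p$ to approximate the root. The sketch below is a simplified illustration, not the general statement: the name `hensel_lift` is hypothetical, the polynomial is assumed to have integer coefficients, and only the special case $|f'(a_0)|_p = 1$ (as in Example 2.15) is handled, so that $f'(a_n)$ is invertible modulo $p^k$.

```python
def hensel_lift(f, fprime, a0, p, k):
    """Lift a root a0 of f mod p to a root mod p^k, assuming p does not divide f'(a0)."""
    if f(a0) % p != 0 or fprime(a0) % p == 0:
        raise ValueError("need f(a0) = 0 mod p and f'(a0) a unit mod p")
    modulus = p ** k
    a = a0 % modulus
    for _ in range(k):
        # a_{n+1} = a_n - f(a_n)/f'(a_n), with the division done modulo p^k;
        # each step gains at least one more p-adic digit, so k steps suffice
        a = (a - f(a) * pow(fprime(a), -1, modulus)) % modulus
    return a

a = hensel_lift(lambda x: x**3 - 7, lambda x: 3 * x * x, 3, 5, 8)
print(a, (a**3 - 7) % 5**8)
```

The returned integer agrees with the cube root of 7 in $\mathbb{Z}_5$ modulo $5^8$, and reduces to the starting value 3 modulo 5.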
Corollary 2.16. Let $\alpha \in \mathbb{Q}_p$ with $|\alpha|_p = 1$. When $p \neq 2$, $\alpha$ is a square in $\mathbb{Q}_p$ iff it is a square modulo $p$. When $p = 2$, $\alpha$ is a square in $\mathbb{Q}_p$ iff $\alpha \equiv 1 \pmod 8$.
Example 2.17. $23 \in (\mathbb{Q}_7^*)^2$ since $|23|_7 = 1$ and $23 \equiv 2 \equiv 3^2 \pmod 7$. However, $24 \notin (\mathbb{Q}_7^*)^2$ since $|24|_7 = 1$ and $24 \equiv 3 \pmod 7$, which is not a quadratic residue mod 7.
The corollary does not apply to decide the status of 14, but in fact we can see that $14 \notin (\mathbb{Q}_7^*)^2$, since if $14 = \alpha^2$ for some $\alpha \in \mathbb{Q}_7$ then $|\alpha|_7^2 = |\alpha^2|_7 = |14|_7 = 7^{-1}$, contradicting the fact that $|\alpha|_7 = 7^{-r}$ for some $r \in \mathbb{Z}$.
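Corollary 2.16, together with the valuation argument used for 14, gives a complete test for whether a nonzero integer is a square in $\mathbb{Q}_p$. The sketch below (hypothetical name `is_square_in_Qp`) assumes an integer input; it strips the power of $p$, rejects odd exponents, and applies the corollary to the remaining unit, using Euler's criterion to test for a square modulo an odd prime. Removing an even power of $p$ does not affect squareness, which is the one step added here beyond the text.

```python
def is_square_in_Qp(n, p):
    """Decide whether the nonzero integer n is a square in Q_p (sketch)."""
    if n == 0:
        raise ValueError("n must be nonzero")
    v = 0
    while n % p == 0:        # write n = p^v * u with p not dividing u
        n //= p
        v += 1
    if v % 2 == 1:           # |n|_p would be an odd power of p, as for 14 in Q_7
        return False
    if p == 2:
        return n % 8 == 1    # Corollary 2.16, case p = 2
    # Euler's criterion: u is a square mod p iff u^((p-1)/2) = 1 mod p
    return pow(n % p, (p - 1) // 2, p) == 1

print([is_square_in_Qp(n, 7) for n in (23, 24, 14)])
```

The three values tested are those of Example 2.17.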
Section 3. The Reduction Map on an Elliptic Curve
Throughout this section, $K$ denotes a complete non-Archimedean field, with valuation ring $R = \{x : |x| \leq 1\}$, maximal ideal $\mathcal{M} = \{x : |x| < 1\}$ and residue field $k = R/\mathcal{M}$.
Definition 3.1. The natural mod $\mathcal{M}$ map $R \to k = R/\mathcal{M} : r \mapsto r + \mathcal{M}$ is a surjection and is denoted $a \mapsto \tilde{a}$ (or sometimes $\bar{a}$). For example in $\mathbb{Z}_5$, if $a = 3 + 2 \cdot 5^1 + \ldots$ then $\tilde{a} = 3$; also $\widetilde{17/3} = \tilde{2}/\tilde{3} = 2 \cdot 2 = 4$.
Let $a = (a_0, \ldots, a_n) \in \mathbb{P}^n(K)$. We define the reduction map to $\mathbb{P}^n(k)$ as follows.
Step 1. There exists $i_0$ such that $|a_{i_0}| \geq |a_i|$ for $i = 0, \ldots, n$. We replace each $a_i$ by $a_i / a_{i_0}$ (which leaves $a$ unchanged) so that now the largest valuation is 1 (normalised form).
Step 2. Define $\tilde{a} = (\tilde{a}_0, \ldots, \tilde{a}_n)$ [easy to check that this is well defined].
In affine space, if $a = (a_1, \ldots, a_n)$ then $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_n)$, provided that all $|a_i| \leq 1$.
When $K = \mathbb{Q}_p$, this is just the mod $p$ map, where the coordinates are reduced modulo $p$.
Example 3.2. In $\mathbb{P}^2(\mathbb{Q}_5)$, let $a = (1/5, 2/15, 2)$. Dividing through by $a_0 = 1/5$ gives $a = (1, 2/3, 10)$ so that $\tilde{a} = (\tilde{1}, \widetilde{2/3}, \widetilde{10}) = (1, 4, 0) \in \mathbb{P}^2(\mathbb{F}_5)$. For $b = (2/3, 25)$ in affine space $\mathbb{A}^2(\mathbb{Q}_5)$ [an affine point with no denominators of 5], then $\tilde{b} = (4, 0) \in \mathbb{A}^2(\mathbb{F}_5)$.
For the point $P = (1/4, 7/8) \in \mathcal{E}(\mathbb{Q}) \subset \mathcal{E}(\mathbb{Q}_2)$ on the elliptic curve $\mathcal{E} : y^2 = x^3 - x + 1$, we should first write $P$ in projective form: $(1/4, 7/8, 1) = (2/7, 1, 8/7)$ [after dividing through by $7/8$], which reduces modulo 2 to $(0, 1, 0)$, the point at infinity on $\tilde{\mathcal{E}}(\mathbb{F}_2)$. Clearly any $(x, y) \in \mathcal{E}(\mathbb{Q}_p)$ will reduce mod $p$ to the point at infinity iff $|x|_p > 1$ and $|y|_p > 1$.
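The two steps of the reduction map can be sketched for points of $\mathbb{P}^n(\mathbb{Q})$ reduced modulo a prime $p$. The names `ord_p`, `reduce_mod_p` and `reduce_projective` are hypothetical; the sketch normalises by the first coordinate of largest $p$-adic absolute value (any such coordinate gives the same projective point) and then reduces each coordinate, using the modular inverse of its denominator.

```python
from fractions import Fraction

def ord_p(x, p):
    """Exponent of p in the nonzero rational x."""
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r

def reduce_mod_p(x, p):
    """Residue in F_p of a rational x whose denominator is prime to p."""
    return x.numerator * pow(x.denominator, -1, p) % p

def reduce_projective(point, p):
    """Reduction map P^n(Q) -> P^n(F_p), following Steps 1 and 2."""
    coords = [Fraction(c) for c in point]
    nonzero = [i for i, c in enumerate(coords) if c != 0]
    # Step 1: the largest |a_i|_p corresponds to the smallest exponent ord_p(a_i)
    i0 = min(nonzero, key=lambda i: ord_p(coords[i], p))
    scaled = [c / coords[i0] for c in coords]
    # Step 2: reduce every normalised coordinate modulo p
    return tuple(reduce_mod_p(c, p) for c in scaled)

print(reduce_projective((Fraction(1, 5), Fraction(2, 15), 2), 5))
print(reduce_projective((Fraction(1, 4), Fraction(7, 8), 1), 2))
```

The two calls correspond to the points $a$ and $P$ of Example 3.2.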
Definition 3.3. Let $\mathcal{C} : F(X, Y, Z) = 0$ be a projective curve, defined over $K$. Let $f_i$ be the set of all coefficients of $\mathcal{C}$. The curve is unchanged if we multiply all the $f_i$ by a nonzero constant, so after dividing through by $f_{i_0}$ such that $|f_{i_0}| \geq |f_i|$ for all $i$, we can say that $\max(|f_i|) = 1$ [normalised form]. The reduction of $\mathcal{C}$ mod $\mathcal{M}$ is $\tilde{\mathcal{C}} : \tilde{F}(X, Y, Z) = 0$, defined over $k = R/\mathcal{M}$, where every coefficient has been reduced mod $\mathcal{M}$. When $K = \mathbb{Q}_p$, this is again just a matter of reducing the coefficients mod $p$.
Clearly, $a$ lies on $\mathcal{C} \implies \tilde{a}$ lies on $\tilde{\mathcal{C}}$, when we say that $a$ reduces to $\tilde{a}$.
Definition 3.4. Let $b \in \tilde{\mathcal{C}}(k)$. If there exists $a \in \mathcal{C}(K)$ such that $\tilde{a} = b$, we say that $b$ lifts to $\mathcal{C}$ [or that $b$ lifts to a point on $\mathcal{C}$].
Example 3.5. Let $\mathcal{E} : ZY^2 = X^3 + pZ^3$, defined over $\mathbb{Q}_p$, and $\tilde{\mathcal{E}} : ZY^2 = X^3$, defined over $\mathbb{F}_p$.
Consider $(0, 0, 1) \in \tilde{\mathcal{E}}(\mathbb{F}_p)$. Does it lift to a point in $\mathcal{E}(\mathbb{Q}_p)$? Imagine $(X, Y, Z) \in \mathcal{E}(\mathbb{Q}_p)$ reduces mod $p$ to $(0, 0, 1) \in \tilde{\mathcal{E}}(\mathbb{F}_p)$. Then $p \mid X$, $p \mid Y$, $p \nmid Z$, that is, $|X|_p < 1$, $|Y|_p < 1$, $|Z|_p = 1$.
But all $p$-adic values are of the form $\ldots, p^{-2}, p^{-1}, p^0, p^1, \ldots$ so that $|X|_p \leq p^{-1}$, $|Y|_p \leq p^{-1}$, and $|X^3|_p \leq p^{-3}$. Furthermore, $|pZ^3|_p = |p|_p |Z|_p^3 = p^{-1}$.
Since $|X^3|_p \neq |pZ^3|_p$ we must have $|X^3 + pZ^3|_p = \max(|X^3|_p, |pZ^3|_p) = p^{-1}$. But then $|Y^2|_p = |ZY^2|_p = |X^3 + pZ^3|_p = p^{-1}$, a contradiction, since $|Y^2|_p = |Y|_p^2 \leq p^{-2}$. We conclude that $(0, 0, 1) \in \tilde{\mathcal{E}}(\mathbb{F}_p)$ does not lift to a point in $\mathcal{E}(\mathbb{Q}_p)$. In fact: need not do proof; just refer to Problem Sheet 5.
If we had represented the above curves with the affine shorthand $\mathcal{E} : y^2 = x^3 + p$ and $\tilde{\mathcal{E}} : y^2 = x^3$, then the above would be expressed by saying that $(0, 0) \in \tilde{\mathcal{E}}(\mathbb{F}_p)$ does not lift.
On the other hand, the following result shows that we can guarantee lifting a nonsingular point on $\tilde{\mathcal{E}}$.
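Theorem 3.6. Let $\mathcal{C}$ be defined over $K$, written so that the coefficients lie in $R$. Let $\tilde{\mathcal{C}}$, defined over $k$, be the reduction of $\mathcal{C}$ modulo $\mathcal{M}$. Let $b \in \tilde{\mathcal{C}}(k)$ be a nonsingular point. Then $b$ lifts to $\mathcal{C}$; that is, there exists $a \in \mathcal{C}(K)$ such that $\tilde{a} = b$.
Proof Write $\mathcal{C} : F(X_0, X_1, X_2) = 0$ (normalised), so that $\tilde{\mathcal{C}} : \tilde{F}(X_0, X_1, X_2) = 0$. Let $b = (b_0, b_1, b_2) \in \tilde{\mathcal{C}}(k)$ be a nonsingular point. Then at least one of the $\frac{\partial \tilde{F}}{\partial X_i}(b) \neq 0$; wlog say that $\frac{\partial \tilde{F}}{\partial X_0}(b) \neq 0$. Let $\alpha_0, \alpha_1, \alpha_2 \in R$ be such that each $\tilde{\alpha}_i = b_i$ under the natural surjection from $R$ to $k = R/\mathcal{M}$. Then $\alpha = (\alpha_0, \alpha_1, \alpha_2)$ satisfies $\tilde{\alpha} = b$; however, we have no guarantee that $\alpha$ lies on $\mathcal{C}$. We shall construct an adjustment of $\alpha$ which lies on $\mathcal{C}$, and which has the same reduction as $\alpha$. Let $f(t) = F(t, \alpha_1, \alpha_2)$. Then $\widetilde{f(\alpha_0)} = \tilde{F}(b) = 0$ so that $|f(\alpha_0)| < 1$.
Furthermore, $\widetilde{f'(\alpha_0)} = \widetilde{\frac{\partial F}{\partial X_0}(\alpha)} = \frac{\partial \tilde{F}}{\partial X_0}(b) \neq 0$, so that $|f'(\alpha_0)| = 1$. By Hensel's Lemma, there exists $a_0 \in R$ such that $f(a_0) = 0$ and $|a_0 - \alpha_0| < 1$, so that $a = (a_0, \alpha_1, \alpha_2)$ is a point on $\mathcal{C}$ and $\tilde{a} = \tilde{\alpha} = b$, as required.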
We wish to see under what circumstances the reduction map is a homomorphism on an
elliptic curve.
Theorem 3.7. Let $\mathcal{C} : F(X_0, X_1, X_2) = 0$ be a cubic curve defined over $K$, written so that the coefficients of $F$ have maximum valuation 1. Suppose the line $\mathcal{L} : L(X_0, X_1, X_2) = 0$ meets $\mathcal{C}$ at $a, b, c$. Then either:
(1) $\tilde{\mathcal{L}} \subset \tilde{\mathcal{C}}$, that is, $\tilde{F}(X_0, X_1, X_2) = \tilde{L} \cdot M$, for some $M$,
or:
(2) $\tilde{\mathcal{L}}$ meets $\tilde{\mathcal{C}}$ precisely at $\tilde{a}, \tilde{b}, \tilde{c}$.
Proof Let $L : \lambda_0 X_0 + \lambda_1 X_1 + \lambda_2 X_2$, written so that $\max(|\lambda_0|, |\lambda_1|, |\lambda_2|) = 1$, wlog $|\lambda_0| = 1$; after dividing through by $\lambda_0$ (and relabelling $\lambda_1/\lambda_0$, $\lambda_2/\lambda_0$ as $\lambda_1, \lambda_2$), we can take $\mathcal{L} : X_0 = -\lambda_1 X_1 - \lambda_2 X_2$, where $\lambda_1, \lambda_2 \in R$. Write $a = (a_0, a_1, a_2)$, $b = (b_0, b_1, b_2)$, $c = (c_0, c_1, c_2)$ with $\max |a_i| = \max |b_i| = \max |c_i| = 1$. Note that, since $a, b, c$ lie on $\mathcal{L}$, we must then have $\max(|a_1|, |a_2|) = \max(|b_1|, |b_2|) = \max(|c_1|, |c_2|) = 1$.
Now, substitute $L$ into $F$ to get: $G(X_1, X_2) = F(-\lambda_1 X_1 - \lambda_2 X_2, X_1, X_2) \in R[X_1, X_2]$.
Since the points $a, b, c$ lie on both $\mathcal{L}$ and $\mathcal{C}$, the roots of the projective polynomial $G$ are $(a_1, a_2), (b_1, b_2), (c_1, c_2) \in \mathbb{P}^1(K)$, so that:
$G(X_1, X_2) = F(-\lambda_1 X_1 - \lambda_2 X_2, X_1, X_2) = \mu (a_2 X_1 - a_1 X_2)(b_2 X_1 - b_1 X_2)(c_2 X_1 - c_1 X_2)$,
for some $\mu \in R$. Now consider $\tilde{F}(-\tilde{\lambda}_1 X_1 - \tilde{\lambda}_2 X_2, X_1, X_2)$. If this is 0 then $\tilde{L}$ is a factor of $\tilde{F}$, giving case (1). Otherwise, this is a nonzero projective polynomial, defined over $k$, equal to $\tilde{\mu} (\tilde{a}_2 X_1 - \tilde{a}_1 X_2)(\tilde{b}_2 X_1 - \tilde{b}_1 X_2)(\tilde{c}_2 X_1 - \tilde{c}_1 X_2)$, with $(\tilde{a}_1, \tilde{a}_2), (\tilde{b}_1, \tilde{b}_2), (\tilde{c}_1, \tilde{c}_2) \in \mathbb{P}^1(k)$ as roots, so that $\tilde{a}, \tilde{b}, \tilde{c}$ lie on $\tilde{\mathcal{L}}$ and $\tilde{\mathcal{C}}$. Since $\tilde{L}$ and $\tilde{F}$ have no common factor, these must be precisely the points of intersection of $\tilde{\mathcal{L}}$ and $\tilde{\mathcal{C}}$.
When we have an elliptic curve written, not as a general cubic, but birationally transformed to the form $\mathcal{E} : y^2 = x^3 + Ax + B$ ($A, B \in R$) [which, as usual, is shorthand for the projective curve $ZY^2 = X^3 + AXZ^2 + BZ^3$], the reduction $\tilde{\mathcal{E}}$ will still be of the form $y^2 = x^3 + \ldots$. This cannot contain a line, since any $(y + rx + \ldots)(y - x^2/r + \ldots)$ would have an $x^2 y$ term and so would not give $y^2 -$ cubic in $x$. For such a curve, only option (2) can apply in the previous theorem. Even though $\mathcal{E}$ is an elliptic curve (and therefore nonsingular), the reduction $\tilde{\mathcal{E}}$ might be singular [for example, when $p \mid \Delta \in \mathbb{Z}$ so that $\tilde{\Delta} = 0$ in $\mathbb{F}_p$], but even in that case we still have the group $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$ of nonsingular points [see Comment 1.12]. Since the group law is constructed by finding intersections between the curve and lines, and since only option (2) applies, the construction of the group law respects the reduction map, giving the following result.
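Corollary 3.8. Let $\mathcal{E} : y^2 = x^3 + Ax + B$ be an elliptic curve, with $A, B \in R$, with reduction $\tilde{\mathcal{E}}$. Let $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$ denote the group of nonsingular points in $\tilde{\mathcal{E}}(k)$, and let $\mathcal{E}_0(K)$ denote the set of points in $\mathcal{E}(K)$ which reduce to members of $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$, that is, define: $\mathcal{E}_0(K) = \{P \in \mathcal{E}(K) : \tilde{P} \in \tilde{\mathcal{E}}_{\mathrm{ns}}(k)\}$. Then the reduction map $P \mapsto \tilde{P}$ is a homomorphism from $\mathcal{E}_0(K)$ to $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$.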
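Definition 3.9. Let $\mathcal{E}_0(K)$ and $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$ be as in Corollary 3.8. The kernel of reduction, denoted $\mathcal{E}_1(K)$, is the kernel of the reduction map from $\mathcal{E}_0(K)$ to $\tilde{\mathcal{E}}_{\mathrm{ns}}(k)$. That is:
$$\mathcal{E}_1(K) = \{P \in \mathcal{E}(K) : \tilde{P} = \tilde{\mathbf{o}}\},$$
where, as usual, $\mathbf{o}$ is the identity element, usually taken to be the point at infinity, in which case
$$\mathcal{E}_1(K) = \{P = (x, y) \in \mathcal{E}(K) : |x| > 1, |y| > 1\},$$
since these are the points that map to the point at infinity under the reduction map.
We can summarise what we know so far by the following exact sequence:
$$0 \longrightarrow \mathcal{E}_1(K) \xrightarrow{\ i\ } \mathcal{E}_0(K) \xrightarrow{\ P \mapsto \tilde{P}\ } \tilde{\mathcal{E}}_{\mathrm{ns}}(k) \longrightarrow 0,$$
where $i$ is the inclusion map.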
We now wish to look more closely at how we can describe the group law inside $\mathcal{E}_1(K)$, the kernel of reduction, for an elliptic curve:
$$\mathcal{E} : y^2 = x^3 + Ax + B, \text{ where } A, B \in R.$$
We adopt the usual convention that the identity is $\mathbf{o}$, the point at infinity so that, as already observed, $\mathcal{E}_1(K) = \{(x, y) \in \mathcal{E}(K) : |x| > 1, |y| > 1\}$. The members of $\mathcal{E}_1(K)$ are in a neighbourhood of $\mathbf{o}$, and it is natural to try to describe the group law as a power series. This will be more transparent if we write our equation in a form where the points in the neighbourhood have coordinates with small, rather than large, valuation. We therefore perform the following birational transformation:
$$z = x/y, \quad w = 1/y, \quad \text{with inverse } x = z/w, \quad y = 1/w.$$
This transforms $\mathcal{E}$ to:
$$\frac{1}{w^2} = \frac{z^3}{w^3} + A\frac{z}{w} + B,$$
giving the equation
$$\mathcal{E}' : w = f(z, w) = z^3 + Aw^2 z + Bw^3.$$
Note that the point at infinity $\mathbf{o}$ on $\mathcal{E}$ maps to the point $(0, 0)$ on $\mathcal{E}'$, which we take as our group identity on $\mathcal{E}'$. The condition $|x| > 1, |y| > 1$ corresponds to $|z| < 1, |w| < 1$, so that the kernel of reduction for $\mathcal{E}'$ is:
$$\mathcal{E}'_1(K) = \{(z, w) \in \mathcal{E}'(K) : |z| < 1, |w| < 1\}.$$


We now recursively substitute $w = f(z, w)$ into itself. For the first step:
$$w = f(z, w) = f(z, f(z, w)) = z^3 + A(z^3 + Aw^2 z + Bw^3)^2 z + B(z^3 + Aw^2 z + Bw^3)^3 = z^3 + Az^7 + \ldots$$
Inductively define $f_n(z, w)$ by: $f_1(z, w) = f(z, w)$ and $f_{n+1}(z, w) = f_n(z, f(z, w))$. Define
$$w(z) = \lim_{n \to \infty} f_n(z, 0) \in \mathbb{Z}[A, B][[z]].$$
The following is then easy to show.
Lemma 3.10. The power series $w(z) = z^3(1 + \ldots) \in \mathbb{Z}[A, B][[z]]$ defined above is the unique power series satisfying $w(z) = f\big(z, w(z)\big)$.
This means that $\big(z, w(z)\big)$ satisfies $\mathcal{E}'$. Since we are working in a non-Archimedean field $K$, we can appeal to the fact (see Theorem 2.12) that a series converges iff its terms converge to 0. When we are in the kernel of reduction $|z| < 1, |w| < 1$, this applies to the above series $w(z)$ [since $A, B \in R$ and so $|A|, |B| \leq 1$]. Any $(z, w)$ in the kernel of reduction must satisfy $w = w(z)$, and so is uniquely determined by $z$, which is called a local parameter.
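The recursive substitution can be carried out mechanically on truncated power series. The sketch below assumes integer values for $A$ and $B$ (rather than treating them as indeterminates) and represents a series modulo $z^n$ as a list of coefficients; the helper names `mul` and `w_series` are chosen here for illustration. Each pass applies $w \mapsto z^3 + Azw^2 + Bw^3$, starting from $w = 0$, which matches $f_n(z, 0)$.

```python
def mul(p, q, n):
    """Product of two power series given as coefficient lists, truncated modulo z**n."""
    r = [0] * n
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j >= n:
                break
            r[i + j] += a * b
    return r

def w_series(A, B, n):
    """Coefficients of w(z) modulo z**n (n >= 4), by iterating w -> z^3 + A z w^2 + B w^3."""
    z = [0, 1] + [0] * (n - 2)
    w = [0] * n
    for _ in range(n):            # every pass fixes at least one further coefficient
        w2 = mul(w, w, n)
        w3 = mul(w2, w, n)
        zw2 = mul(z, w2, n)
        w = [A * zw2[k] + B * w3[k] for k in range(n)]
        w[3] += 1                 # the z^3 term
    return w

coeffs = w_series(2, 3, 14)       # A = 2, B = 3, working modulo z^14
```

With symbolic $A, B$ the same computation gives $w(z) = z^3 + Az^7 + Bz^9 + 2A^2 z^{11} + 5AB z^{13} + \ldots$, which is what the numerical coefficients reflect.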
Comment 3.11. We can recover $x, y$ on $\mathcal{E}$ as formal Laurent series:
$$x(z) = \frac{z}{w(z)} = \frac{z}{z^3(1 + \ldots)} = \frac{1}{z^2} + \ldots$$
$$y(z) = \frac{1}{w(z)} = \frac{1}{z^3(1 + \ldots)} = \frac{1}{z^3} + \ldots$$
which gives a formal solution to $\mathcal{E}$.
Let us now perform the addition $(z_1, w_1) + (z_2, w_2)$. As usual, we first write the line $w = \lambda z + \nu$ through the points, given by $\lambda = (w_1 - w_2)/(z_1 - z_2)$ and $\nu = (z_1 w_2 - z_2 w_1)/(z_1 - z_2)$. As long as we are in the kernel of reduction, $w_1 = w(z_1)$ and $w_2 = w(z_2)$, and so:
$$\lambda = \lambda(z_1, z_2) = \frac{w(z_1) - w(z_2)}{z_1 - z_2} = \frac{z_1^3(1 + \ldots) - z_2^3(1 + \ldots)}{z_1 - z_2} \in \mathbb{Z}[A, B][[z_1, z_2]],$$
with all terms being of degree $\geq 2$, and:
$$\nu = \nu(z_1, z_2) = \frac{z_1 w(z_2) - z_2 w(z_1)}{z_1 - z_2} \in \mathbb{Z}[A, B][[z_1, z_2]].$$
Substituting $w = \lambda z + \nu$ into $\mathcal{E}'$ gives $\lambda z + \nu = z^3 + A(\lambda z + \nu)^2 z + B(\lambda z + \nu)^3$, and so:
$$(1 + A\lambda^2 + B\lambda^3) z^3 + (2A\lambda\nu + 3B\lambda^2\nu) z^2 + \ldots = 0.$$
Let $(z_3, w(z_3))$ be the third point of intersection of $\mathcal{E}'$ and the line $w = \lambda z + \nu$, so that $z_1, z_2, z_3$ are the roots of the above cubic, giving that $z_1 + z_2 + z_3 = -(\text{coeff of } z^2)/(\text{coeff of } z^3)$, so:
$$z_3 = -z_1 - z_2 - \frac{2A\lambda\nu + 3B\lambda^2\nu}{1 + A\lambda^2 + B\lambda^3} \in \mathbb{Z}[A, B][[z_1, z_2]],$$
since the denominator is of the form $1 + \theta(z_1, z_2)$, where $\theta(z_1, z_2)$ has no constant term [and so is an invertible power series, with $1/(1 + \theta(z_1, z_2)) = 1 - \theta(z_1, z_2) + \theta(z_1, z_2)^2 - \ldots$].
The sum $(z_1, w_1) + (z_2, w_2) + (z_3, w_3) =$ the identity, and so $(z_1, w_1) + (z_2, w_2) = -(z_3, w_3)$. Negation $(x, y) \mapsto (x, -y)$ induces $(z, w) \mapsto (-z, -w)$ [since $z = x/y$, $w = 1/y$], so that the $z$-coordinate of $(z_1, w_1) + (z_2, w_2)$ is given by $F_E(z_1, z_2) = -z_3$, where:
$$F_E(z_1, z_2) = z_1 + z_2 + (\text{terms of degree} \geq 2) \in \mathbb{Z}[A, B][[z_1, z_2]].$$
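The formula for $z_3$ is an algebraic identity, so it can be checked with exact rational arithmetic on any two points with distinct $z$-coordinates, even ones outside the kernel of reduction. The sketch below assumes the example curve $y^2 = x^3 + 3x + 1$ and the rational points $(0, 1)$ and $(9/4, -35/8)$, all chosen here for illustration, and compares the $z$-coordinate obtained from the $(z, w)$ formula with the one obtained from the ordinary chord construction in $(x, y)$ coordinates.

```python
from fractions import Fraction as Fr

A, B = Fr(3), Fr(1)          # example curve y^2 = x^3 + 3x + 1 (an assumption for illustration)

def to_zw(point):
    """Change coordinates (x, y) -> (z, w) = (x/y, 1/y)."""
    x, y = point
    return x / y, 1 / y

def chord_sum(P, Q):
    """P + Q by the usual chord formula; needs distinct x-coordinates."""
    (x1, y1), (x2, y2) = P, Q
    slope = (y2 - y1) / (x2 - x1)
    x3 = slope**2 - x1 - x2
    return x3, slope * (x1 - x3) - y1

def z_of_sum(P, Q):
    """z-coordinate of P + Q computed purely in (z, w) coordinates; needs z1 != z2."""
    (z1, w1), (z2, w2) = to_zw(P), to_zw(Q)
    lam = (w1 - w2) / (z1 - z2)
    nu = (z1 * w2 - z2 * w1) / (z1 - z2)
    z3 = -z1 - z2 - (2 * A * lam * nu + 3 * B * lam**2 * nu) / (1 + A * lam**2 + B * lam**3)
    return -z3               # negation sends (z, w) to (-z, -w)

P = (Fr(0), Fr(1))
Q = (Fr(9, 4), Fr(-35, 8))
```

Both routes give the same $z$-coordinate for $P + Q$, which is the content of the derivation above; the power series $F_E$ arises when $w_1, w_2$ are replaced by $w(z_1), w(z_2)$.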
We summarise this as follows.
Lemma 3.12. Any point $(x, y)$ on $\mathcal{E}$ [$\leftrightarrow (z, w)$ on $\mathcal{E}'$] in the kernel of reduction [namely: $|x| > 1, |y| > 1 \iff |z| < 1, |w| < 1$] is uniquely determined by $z$, with $w = w(z) \in \mathbb{Z}[A, B][[z]]$. The group law is completely described by the above $F_E(z_1, z_2) \in \mathbb{Z}[A, B][[z_1, z_2]]$, which converges to the $z$-coordinate of the sum of $(z_1, w(z_1))$ and $(z_2, w(z_2))$.
We have already observed that $F_E(z_1, z_2) = z_1 + z_2 +$ terms of higher degree. The associativity and commutativity properties of the group law on $\mathcal{E}$ also induce the properties:
$$F_E(X, F_E(Y, Z)) = F_E(F_E(X, Y), Z), \qquad F_E(X, Y) = F_E(Y, X).$$
Of course, the power series $F_E(z_1, z_2) \in \mathbb{Z}[A, B][[z_1, z_2]]$ can be derived for any $\mathcal{E}$ defined over any ring, regardless of convergence considerations. In the next section, we shall consider power series $F(X, Y)$ which satisfy the above properties, and then apply the results to the special case of $F_E(X, Y)$.
Section 4. Formal Groups
Let $R$ be any ring (by ring I shall always mean a commutative ring with 1).
Definition 4.1. A (one-parameter, commutative) formal group defined over $R$ is a power series $F(X, Y) \in R[[X, Y]]$ satisfying:
(1) $F(X, Y) = X + Y +$ terms of degree $\geq 2$.
(2) $F(X, F(Y, Z)) = F(F(X, Y), Z)$.
(3) $F(X, Y) = F(Y, X)$.
Example 4.2. The following are all formal groups.
The formal group $F_E(X, Y)$ of an elliptic curve defined over $R$, as described in Section 3.
The formal additive group $F(X, Y) = \hat{\mathbb{G}}_a(X, Y) = X + Y$.
The formal multiplicative group $F(X, Y) = \hat{\mathbb{G}}_m(X, Y) = X + Y + XY$.
Note: the last of these is just $XY$, but translated one unit to the left: $(1 + X)(1 + Y) - 1$, so that the identity is changed from 1 to 0.
Aside: A formal group does not necessarily induce an actual nontrivial commutative group, since there is no guarantee that the power series will converge for any nonzero $X, Y$; indeed, our arbitrary ring $R$ may not even come together with any structure (such as a valuation or metric) that provides a definition of convergence. It is merely a power series satisfying properties analogous to associativity and commutativity. The definition appears to be missing properties analogous to the existence of an identity element and inverses. In fact, the following result shows these can be deduced from the given axioms.
Lemma 4.3. Let $F(X, Y)$ be a formal group over a ring $R$, and let $R_T$ denote $R[[T]]$.
(1) There is a unique power series $i(T) \in TR_T$ such that $F\big(T, i(T)\big) = 0$.
(2) $F(X, 0) = X$ and $F(0, Y) = Y$.
Proof (1) Let $Z_1 = -T \in TR_T$; then the terms of $F(T, Z_1)$ all have degree $\geq 2$. Suppose we have $Z_n \in TR_T$ such that $F(T, Z_n) = a_{n+1} T^{n+1} + \ldots$ has terms all of degree $\geq n + 1$. Define $Z_{n+1} = Z_n - a_{n+1} T^{n+1}$; then:
$$F(T, Z_{n+1}) = F(T, Z_n - a_{n+1} T^{n+1}) = T + (Z_n - a_{n+1} T^{n+1}) + \ldots$$
$$= F(T, Z_n) - a_{n+1} T^{n+1} + (\text{terms of degree} \geq n + 2)$$
$$= a_{n+1} T^{n+1} - a_{n+1} T^{n+1} + (\text{terms of degree} \geq n + 2),$$
which has terms all of degree $\geq n + 2$. This inductively defines a power series $i(T)$, whose first $n$ terms agree with $Z_n$ for all $n$, such that $F\big(T, i(T)\big) = 0$. Furthermore, each choice of term of $Z_n$ was forced, so that $i(T)$ is unique.
(2) By a similar argument to (1), there exists a unique $j(T) \in TR_T$ such that $F(j(T), i(T)) = 0$. By (1) we can take $j(T) = T$. By associativity $F(F(0, T), i(T)) = F(0, F(T, i(T))) = F(0, 0) = 0$, so that we can also take $j(T) = F(0, T)$. Since $j(T)$ is unique, it follows that $F(0, T) = T$. Similarly for $F(T, 0) = T$.
Definition 4.4. Let $F, G$ define formal groups over $R$. A power series $f(T) \in TR_T$ is a homomorphism from $F$ to $G$ if it satisfies $f\big(F(X, Y)\big) = G\big(f(X), f(Y)\big)$. When there also exists an inverse $g(T) \in TR_T$ [that is: $f(g(T)) = g(f(T)) = T$] then $f(T)$ is an isomorphism.
Example 4.5. If $\mathrm{char}(R) = 0$ and $\frac{1}{n} \in R$ for all $n$, then $f(T) = T - T^2/2 + T^3/3 - \ldots$ is a homomorphism from $\hat{\mathbb{G}}_m$ to $\hat{\mathbb{G}}_a$.
Definition 4.6. Let $F$ define a formal group over $R$. Define the multiplication by $m$ map $[m](T) \in R_T$, for $m \in \mathbb{Z}$, inductively by: $[0](T) = 0$, $[m + 1](T) = F([m](T), T)$ and $[m - 1](T) = F([m](T), i(T))$. This is clearly a homomorphism from $F$ to $F$, and is of the form: $[m](T) = mT +$ terms of degree $\geq 2$.
Lemma 4.7. Let $a \in R^*$ [that is: $a \in R$ and $a^{-1} \in R$], and let $f(T) \in TR_T$ be of the form $f(T) = aT + \ldots$ Then there exists a unique $g(T) \in TR_T$ such that $f(g(T)) = T$. Furthermore, $g$ satisfies $g(f(T)) = T$.
Proof We shall construct $g(T) = b_1 T + b_2 T^2 + \ldots$, the limit of $g_1(T) = b_1 T$, $g_2(T) = b_1 T + b_2 T^2$, $\ldots$, first defining $g_1(T) = a^{-1} T$, so that the terms of $f(g_1(T)) - T$ all have degree $\geq 2$. Suppose we have $g_n(T)$ of degree $n$ such that $f(g_n(T)) - T = bT^{n+1} + \ldots$ and define $g_{n+1}(T) = g_n(T) - a^{-1} b T^{n+1}$. Then
$$f(g_{n+1}(T)) - T = f(g_n(T)) - a a^{-1} b T^{n+1} + (\text{terms of degree} \geq n + 2) - T,$$
whose terms are all of degree $\geq n + 2$. The resulting $g(T)$ then satisfies $f(g(T)) = T$ and is unique, since each choice of coefficient was forced.
There similarly exists $h(T) \in R_T$ such that $g(h(T)) = T$, and so $f(g(h(T))) = f(T)$, giving $h(T) = f(T)$. Substituting this into $g(h(T)) = T$ gives $g(f(T)) = T$, as required.
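The construction in the proof is an algorithm: correct one coefficient at a time so that $f(g(T)) - T$ vanishes to one order higher. A minimal sketch follows, assuming coefficients in $\mathbb{Q}$ (via Python's `Fraction`) and series truncated modulo $T^n$; the function names are chosen here for illustration. As a check, the series $T - T^2/2 + T^3/3 - \ldots$ of Example 4.5 is inverted.

```python
from fractions import Fraction

def mul(p, q, n):
    """Product of two coefficient lists, truncated modulo T**n."""
    r = [Fraction(0)] * n
    for i, a in enumerate(p[:n]):
        if a:
            for j, b in enumerate(q[: n - i]):
                r[i + j] += a * b
    return r

def compose(f, g, n):
    """f(g(T)) modulo T**n by Horner's rule; g must have zero constant term."""
    result = [Fraction(0)] * n
    for c in reversed(f[:n]):
        result = mul(result, g, n)
        result[0] += c
    return result

def inverse(f, n):
    """The g with f(g(T)) = T modulo T**n, built term by term as in the proof of Lemma 4.7."""
    a = Fraction(f[1])
    g = [Fraction(0)] * n
    g[1] = 1 / a
    for k in range(2, n):
        b = compose(f, g, n)[k]      # f(g_{k-1}(T)) - T = b T^k + ...
        g[k] -= b / a
    return g

n = 6
log_series = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, n)]
exp_series = inverse(log_series, n)
```

The inverse found this way begins $T + T^2/2 + T^3/6 + \ldots$, that is, $e^T - 1$, as one would expect for the inverse of $\log(1 + T)$.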
Aside: When $R$ is an integral domain, this type of argument can also be interpreted as an application of an adapted version of Hensel's Lemma, applied to the ring $R_T$, with valuation $|f(T)| = \varepsilon^n$, where $\varepsilon$ is a fixed real number satisfying $0 < \varepsilon < 1$ and $n$ is the degree of the smallest nonzero degree term [for example, $|2T^3 + 5T^4 + \ldots| = \varepsilon^3$]. Here $T$ takes on a similar role for $R_T$ to that performed by $p$ for $\mathbb{Z}_p$.
Lemma 4.8. The homomorphism $[m] : F \to F$ of Definition 4.6 is an isomorphism whenever $m \in R^*$.
Proof Since $[m](T) = mT +$ terms of degree $\geq 2$, we have from the previous lemma that the homomorphism $[m]$ has an inverse, and so is an isomorphism.
Aside: You might have wondered in school about the connection between the two properties of log, that it is the integral of $1/x$, and that $\log(ab) = \log(a) + \log(b)$ [a homomorphism from multiplication to addition]. One way of seeing the connection is to define $\log(T) = \int v(T)$ [with $\log(1) = 0$], where $v(T) = \frac{1}{T} dT$, and note that [regarding $T$ as a variable and $S$ as a constant] $v(TS) = \frac{1}{TS} d(TS) = v(T)$, that is, $v$ remains invariant under replacing $T$ by $TS$. Therefore $\log(TS) = \log(T) + f(S)$, where $f(S)$ is a constant; setting $T = 1$ gives $f(S) = \log(S)$. If we were to adjust the multiplicative group, translating by 1, so that the identity is 0: $F(X, Y) = (1 + X)(1 + Y) - 1 = X + Y + XY$, then $\omega(T) = \frac{1}{1+T} dT = (1 - T + T^2 - \ldots) dT$ would have the property that $\omega \circ F(T, S) = \omega(T)$ [and $\int \omega(T)$ would give a homomorphism from $\hat{\mathbb{G}}_m$ to $\hat{\mathbb{G}}_a$]. It is natural to ask whether $\omega$ is unique (up to constants), and how we would construct $\omega$ for a general choice of $F(X, Y)$.
Definition 4.9. We can represent a differential form on $R_T$ as an expression of the form $\sum_{i=1}^{m} P_i(T)\, dQ_i(T)$, where each $P_i(T), Q_i(T) \in R_T$, and these satisfy the natural rules:
$$d\big(P(T)\big) = P'(T)\, dT, \text{ where } P'(T) = \sum_{n=1}^{\infty} a_n n T^{n-1}, \text{ for any } P(T) = \sum_{n=0}^{\infty} a_n T^n,$$
$$d\big(P(T) + Q(T)\big) = dP(T) + dQ(T), \qquad d\big(P(T) Q(T)\big) = P(T)\, dQ(T) + Q(T)\, dP(T).$$
[Formally, the space of (formal) differential forms on $R_T$ is the $R_T$-module spanned by the symbols $\{df : f \in R_T\}$ modulo the submodule spanned by $\{f'\, dT - df : f \in R_T\}$.]
An invariant differential on a formal group $F$, defined over $R$, is a differential form:
$$\omega(T) = P(T)\, dT \in R_T\, dT, \text{ satisfying } \omega \circ F(T, S) = \omega(T).$$
Note that $\omega \circ F(T, S)$ is the same as $P(F(T, S))\, d(F(T, S)) = P(F(T, S)) F_X(T, S)\, dT$, where $F_X(X, Y)$ denotes the partial derivative of $F(X, Y)$ with respect to $X$. So, the above condition on $\omega$ is equivalent to:
$$\omega(T) = P(T)\, dT \in R_T\, dT, \text{ satisfying } P\big(F(T, S)\big) F_X(T, S) = P(T).$$
An invariant differential $\omega(T) = P(T)\, dT$ is said to be normalised if $P(0) = 1$.
Example 4.10. On $\hat{\mathbb{G}}_a$, the formal group defined by $F(X, Y) = X + Y$, we can take $\omega(T) = dT$ as a normalised invariant differential. On $\hat{\mathbb{G}}_m$, the multiplicative formal group defined by $F(X, Y) = X + Y + XY$, we can take $\omega(T) = (1 + T)^{-1} dT = (1 - T + T^2 - \ldots)\, dT$.
Theorem 4.11. Let $F$ be a formal group over $R$. There exists a unique normalised invariant differential given by $\omega(T) = F_X(0, T)^{-1} dT \in R_T\, dT$. Every invariant differential is of the form $a\omega$ for some $a \in R$.
Proof Let $P(T) = F_X(0, T)^{-1}$. Note that $F_X(0, T) = 1 + \ldots$ is invertible, so that $P(T)$ is indeed a member of $R_T$. Furthermore, $P(0) = 1$, so that it is normalised.
We need to show that $\omega$ is an invariant differential. Recall from Definition 4.9 that this is equivalent to: $P\big(F(T, S)\big) F_X(T, S) = P(T)$ so, in our case, it is sufficient to show:
$$F_X\big(0, F(T, S)\big)^{-1} F_X(T, S) = F_X(0, T)^{-1},$$
which is true iff:
$$F_X\big(0, F(T, S)\big) = F_X(T, S) F_X(0, T).$$
But this last statement is immediate from differentiating $F\big(U, F(T, S)\big) = F\big(F(U, T), S\big)$ [associativity] with respect to $U$ to get: $F_X\big(U, F(T, S)\big) = F_X\big(F(U, T), S\big) F_X(U, T)$ and setting $U = 0$ [using $F(0, T) = T$]. Hence $\omega$ is an invariant differential.
Suppose that $\omega'(T) = Q(T)\, dT \in R_T\, dT$ is also an invariant differential, so that $Q(T)$ satisfies $Q\big(F(T, S)\big) F_X(T, S) = Q(T)$. Substituting $T = 0$ gives $Q(S) F_X(0, S) = Q(0)$, so that $Q(S) = Q(0) F_X(0, S)^{-1}$. It follows that $\omega' = a\omega$, where $a = Q(0)$.
Corollary 4.12. Let $f$ be a homomorphism over $R$ from the formal group $F$ to the formal group $G$. Let $\omega_F, \omega_G$ be the normalised invariant differentials on $F, G$, respectively. Then $\omega_G \circ f = f'(0)\, \omega_F$.
Proof First, note that $\omega_G \circ f\big(F(T, S)\big) = \omega_G\big(G(f(T), f(S))\big) = \omega_G \circ f(T)$, so that $\omega_G \circ f$ is an invariant differential on $F$. From the previous result, it follows that $\omega_G \circ f = a\, \omega_F$, for some $a \in R$. Since $\omega_F, \omega_G$ are normalised, $(1 + \ldots)\, df(T) = a(1 + \ldots)\, dT$, and so $(1 + \ldots) f'(T)\, dT = a(1 + \ldots)\, dT$; equating constant terms gives $a = f'(0)$, as required.
Corollary 4.13. Let $F$ be a formal group over $R$ and let, as usual, $[m](T) \in R_T$ denote the multiplication by $m$ map on $F$, as in Definition 4.6. Let $p$ be prime. Then there exist $f, g \in R_T$ [$f(T) = T + \ldots$], such that $[p](T) = p f(T) + g(T^p)$.
Proof Let $\omega$ be the normalised invariant differential on $F$. Since $[p](T) = pT + \ldots$, it satisfies $[p]'(0) = p$. Applying the previous result to $[p]$, a homomorphism from $F$ to itself, gives: $\omega \circ [p] = [p]'(0)\, \omega = p\omega$, and so
$$p\,\omega(T) = \omega \circ [p](T) = (1 + \ldots)\, d([p](T)) = (1 + \ldots)[p]'(T)\, dT.$$
Hence $[p]'(T) \in pR_T$. Each term $a_n T^n$ in $[p](T)$ must then satisfy $p \mid n a_n$ in $R$, and so $p \mid n$ in $\mathbb{Z}$ or $p \mid a_n$ in $R$, as required.
Definition 4.14. Let $\omega(T) = P(T)\, dT = (1 + c_1 T + c_2 T^2 + \ldots)\, dT$ be the normalised invariant differential for the formal group $F$ over $R$. For the special case when our ring $R$ is a field of characteristic 0, we can define the formal logarithm by: $\log_F(T) = \int \omega(T) = \int P(T)\, dT = T + \frac{c_1}{2} T^2 + \frac{c_2}{3} T^3 + \ldots$ and the formal exponential function $\exp_F(T)$ as the unique member of $R_T$ satisfying $\log_F(\exp_F(T)) = \exp_F(\log_F(T)) = T$, which exists by Lemma 4.7.
Theorem 4.15. Let $R$ be a field of characteristic 0; then $\log_F$ [as in the previous definition] is an isomorphism from $F$ to $\hat{\mathbb{G}}_a$, the additive group $X + Y$.
Proof Differentiating $\log_F\big(F(T, S)\big) - \log_F(T)$ with respect to $T$ gives: $P\big(F(T, S)\big) F_X(T, S) - P(T)$ [and this $= 0$, since $\omega(T) = P(T)\, dT$ is an invariant differential], and so $\log_F\big(F(T, S)\big) - \log_F(T)$ is a power series purely in $S$, which we denote $f(S)$; that is: $\log_F\big(F(T, S)\big) = \log_F(T) + f(S)$. Putting $T = 0$ forces $f(S) = \log_F(S)$. Hence $\log_F$ is a homomorphism; the inverse is $\exp_F$, and so $\log_F$ is an isomorphism.
Comment 4.16. Note that our proof of the existence of the invariant differential required no appeal to the commutativity axiom $F(X, Y) = F(Y, X)$. If our formal group $F$ is defined over any integral domain $R$ of characteristic 0 (such as $\mathbb{Z}$ or any $\mathbb{Z}_p$), we can define $\log_F, \exp_F$ over $K$, the field of fractions of $R$, and see that $F(X, Y) = \exp_F\big(\log_F(X) + \log_F(Y)\big)$, which forces $F$ to be commutative. So, at least when $F$ is defined over an integral domain of characteristic 0, we have the somewhat surprising fact that the commutativity axiom is redundant; it can be deduced from: $F(X, Y) = X + Y +$ terms of degree $\geq 2$ and associativity. It is possible to construct non-commutative formal groups, but only when defined over unusual rings.
Definition 4.17. Let $K$ be a field, complete with respect to a discrete non-Archimedean valuation, $R = \{x \in K : |x| \leq 1\}$ be the valuation ring, $\mathcal{M} = \{x \in K : |x| < 1\}$ be the maximal ideal, and assume that $k = R/\mathcal{M}$ [the residue field] is of characteristic $p$ [for example, $K = \mathbb{Q}_p$, $R = \mathbb{Z}_p$, $\mathcal{M} = p\mathbb{Z}_p$, $k = \mathbb{F}_p$]. Let $F$ be a formal group defined over $R$. The group on $\mathcal{M}$ associated to $F(X, Y)$, denoted $F(\mathcal{M})$, is the set $\mathcal{M}$ together with the group operation: $x \oplus y = F(x, y)$ [which converges for any $x, y \in \mathcal{M}$]. The identity element is 0, and the inverse of $x$ is given by $i(x)$ of Lemma 4.3. Similarly, for any $n \geq 1$, define $F(\mathcal{M}^n)$ to be the set $\mathcal{M}^n$ with the same group operation.
Lemma 4.18. Let $F, K, R, \mathcal{M}, k$ [with $\mathrm{char}(k) = p$] be as in Definition 4.17.
(a) The identity map: $F(\mathcal{M}^n)/F(\mathcal{M}^{n+1}) \to (\mathcal{M}^n/\mathcal{M}^{n+1}, +)$ is an isomorphism.
(b) Every torsion element of $F(\mathcal{M})$ has order a power of $p$.
Proof
(a) For any $x, y \in \mathcal{M}^n$, $x \oplus y = x + y + \ldots \equiv x + y \pmod{\mathcal{M}^{2n}}$, and so is $\equiv x + y \pmod{\mathcal{M}^{n+1}}$.
(b) It is sufficient to show there does not exist a point of finite order $m$ for any $m > 1$ with $p \nmid m$ [since any $w$ of order $mp^n$ gives $[p^n]w$ of order $m$]. But, since $\mathrm{char}(k) = p$, and $p \nmid m$, we have $|m| = 1$ and so $m \in R^*$. By Lemma 4.8, $[m]$ is an isomorphism from $F(\mathcal{M})$ to $F(\mathcal{M})$, which must then have trivial kernel: $[m]z = 0 \implies z = 0$, as required.
Theorem 4.19. Let $F, K, R, \mathcal{M}, k$ [with $\mathrm{char}(k) = p$] be as in Definition 4.17. Suppose that $z \in F(\mathcal{M})$ has exact order $p^n$, for some $n \geq 1$, so that $[p^n](z) = 0$, but $[p^{n-1}](z) \neq 0$. Then:
$$|z| \geq |p|^{\frac{1}{p^n - p^{n-1}}}.$$
Proof If $\mathrm{char}(R) \neq 0$ then $|p| = 0$ and the bound is trivial, so assume that $\mathrm{char}(R) = 0$. We have from Corollary 4.13 that $[p](T) = pf(T) + g(T^p)$ for some $f(T) = T + \ldots \in R_T$ and $g(T) \in R_T$. We shall proceed by induction on $n$.
Suppose $z \neq 0$, $z \in \mathcal{M}$ and $[p](z) = 0$. Then $0 = pf(z) + g(z^p) = p(z + \ldots) + g(z^p)$. We cannot have $|pz| > |z^p|$, since then the term $pz$ would have valuation strictly greater than the valuations of all other terms. Hence $|pz| \leq |z^p| = |z|^p$, and so $|p| \leq |z|^{p-1}$, giving $|z| \geq |p|^{\frac{1}{p^1 - p^0}}$, proving the result for $n = 1$.
Now, assume the result is true for $n$, and let $z \in F(\mathcal{M})$ have order $p^{n+1}$. Then $[p](z)$ has order $p^n$, and by the induction hypothesis, $|[p](z)| \geq |p|^{\frac{1}{p^n - p^{n-1}}}$. Hence:
$$|p|^{\frac{1}{p^n - p^{n-1}}} \leq |[p](z)| = |pf(z) + g(z^p)| \leq \max\big(|pz|, |z^p|\big).$$
But $|z| < 1$, $|p| < 1$, so that $|p|^{\frac{1}{p^n - p^{n-1}}} \geq |p| > |pz|$, giving $|p|^{\frac{1}{p^n - p^{n-1}}} \leq |z^p|$, and so $|z| \geq |p|^{\frac{1}{p^{n+1} - p^n}}$, as required.
This has immediate consequences for elliptic curves.
Corollary 4.20. Let $\mathcal{E} : y^2 = x^3 + Ax + B$ be an elliptic curve, where $A, B \in \mathbb{Z}_p$. The kernel $\mathcal{E}_1(\mathbb{Q}_p)$ of the reduction map $\tilde{\ } : \mathcal{E}_0(\mathbb{Q}_p) \to \tilde{\mathcal{E}}_{\mathrm{ns}}(\mathbb{F}_p)$ has no torsion (apart from $\mathbf{o}$). Any $(x, y) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$ satisfies $|x|_p \leq 1$, $|y|_p \leq 1$. When $\tilde{\mathcal{E}}$ is non-singular, $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$ is isomorphic to a subgroup of $\tilde{\mathcal{E}}(\mathbb{F}_p)$.
Proof Let $\mathbf{o} \neq (x, y) \in \mathcal{E}(\mathbb{Q}_p)$ be in the kernel of reduction, that is, $|x|_p, |y|_p > 1$. Then, from the equation for $\mathcal{E}$, $|y|_p = |x|_p^{3/2}$ and $|z|_p = |x/y|_p = |x|_p^{-1/2} < 1$, $|w|_p = |1/y|_p < 1$. If $(x, y)$ were torsion, then $z$ would be a torsion point in $F_E(\mathcal{M}) = F_E(p\mathbb{Z}_p)$. By Lemma 4.18(b) it must be of order $p^n$, and so by Theorem 4.19 must satisfy $1 > |z|_p \geq |p|_p^{\frac{1}{p^n - p^{n-1}}}$. Note that, since $|p|_p = p^{-1}$, any $p^n$ apart from $2^1$ [so that $p^n - p^{n-1} > 1$] would force $1 > |z|_p > p^{-1}$, contradicting the fact that $|z|_p$ is $p^r$ for some integer $r$. The only remaining possibility is that $(x, y)$ is of order 2; but then $y = 0$ and $x$ is a root of $x^3 + Ax + B$; this is incompatible with $|x|_p > 1$ [which makes $x^3$ have strictly larger valuation than $Ax$ and $B$]. We conclude that $(x, y)$ cannot be torsion, and that there is no torsion (apart from $\mathbf{o}$) in the kernel of reduction.
When $\tilde{\mathcal{E}}$ is non-singular, $\mathcal{E}_0(\mathbb{Q}_p) = \mathcal{E}(\mathbb{Q}_p)$ and $\tilde{\mathcal{E}}_{\mathrm{ns}}(\mathbb{F}_p) = \tilde{\mathcal{E}}(\mathbb{F}_p)$, and the kernel of the reduction map $\tilde{\ } : \mathcal{E}(\mathbb{Q}_p) \to \tilde{\mathcal{E}}(\mathbb{F}_p)$ contains no nontrivial torsion, and so the map is injective when restricted to $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$; hence $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$ is isomorphic to a subgroup of $\tilde{\mathcal{E}}(\mathbb{F}_p)$.
Section 5. Global Torsion
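Aside: We now turn to elliptic curves defined over $\mathbb{Q}$, initially concentrating on the group $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ of points of finite order. Any elliptic curve $\mathcal{E} : y^2 = x^3 + Ax + B$, defined over $\mathbb{Q}$, can be transformed with a map of the form $(x, y) \mapsto (k^2 x, k^3 y)$ so that $A, B \in \mathbb{Z}$. The following result is a consequence over $\mathbb{Q}$ of the $p$-adic results of the last section.
Lemma 5.1. Let $\mathcal{E} : y^2 = x^3 + Ax + B$, where $A, B \in \mathbb{Z}$, be an elliptic curve [so that $\Delta = 4A^3 + 27B^2 \neq 0$]. Let $p$ be a prime satisfying: $p \neq 2$ and $p \nmid \Delta$ (such a prime is said to be of good reduction, since $\tilde{\mathcal{E}}$, that is $\mathcal{E}$ mod $p$, is still an elliptic curve over $\mathbb{F}_p$). Then $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ is isomorphic to a subgroup of $\tilde{\mathcal{E}}(\mathbb{F}_p)$, and so $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid \#\tilde{\mathcal{E}}(\mathbb{F}_p)$.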
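Proof Since $\mathbb{Q} \subset \mathbb{Q}_p$, for any $p$, $\mathcal{E}(\mathbb{Q}) \leq \mathcal{E}(\mathbb{Q}_p)$ and $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \leq \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$. Since $p \nmid \Delta$ we have $\tilde{\Delta} \neq 0$ in $\mathbb{F}_p$; since $\mathrm{char}(\mathbb{F}_p) \neq 2$, this is enough to guarantee that $\tilde{\mathcal{E}}$ is non-singular, and so $\tilde{\mathcal{E}}_{\mathrm{ns}}(\mathbb{F}_p) = \tilde{\mathcal{E}}(\mathbb{F}_p)$. By the last result of the previous section (Corollary 4.20), $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$ is isomorphic to a subgroup of $\tilde{\mathcal{E}}(\mathbb{F}_p)$, as must also be $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ [since $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \leq \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$]. Lagrange's Theorem then tells us that $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid \#\tilde{\mathcal{E}}(\mathbb{F}_p)$.
Note that, in particular, the above result tells us that $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ is always finite. In practice, we can use reductions modulo several primes to try to determine $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$.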
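Example 5.2. Let $\mathcal{E} : y^2 = x^3 + 3$, defined over $\mathbb{Q}$. Then $\Delta = 4A^3 + 27B^2 = 4 \cdot 0^3 + 27 \cdot 3^2 = 3^5$. We can choose any prime $p \neq 2$, $p \nmid \Delta$, that is, $p \neq 2, 3$.
$p = 5$. $\tilde{\mathcal{E}} : y^2 = x^3 + 3$, defined over $\mathbb{F}_5$. Then $\tilde{\mathcal{E}}(\mathbb{F}_5)$ consists of: $\mathbf{o}, (1, \pm 2), (2, \pm 1), (3, 0)$, giving 6 points. So $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid \#\tilde{\mathcal{E}}(\mathbb{F}_5)$, that is: $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid 6$.
$p = 7$. $\tilde{\mathcal{E}} : y^2 = x^3 + 3$, defined over $\mathbb{F}_7$. Then $\tilde{\mathcal{E}}(\mathbb{F}_7)$ consists of: $\mathbf{o}, (1, \pm 2), (2, \pm 2), (3, \pm 3), (4, \pm 2), (5, \pm 3), (6, \pm 3)$, giving 13 points. So $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid 13$.
The only possibility is: $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) = 1$, and so $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) = \{\mathbf{o}\}$. Note that $(1, 2) \in \mathcal{E}(\mathbb{Q})$, but we know that $(1, 2)$ is not of finite order, so that $(1, 2), 2(1, 2), 3(1, 2), \ldots$ are all distinct, and we can conclude that $\mathcal{E}(\mathbb{Q})$ is infinite.
Note that, if we are given (for example) $\mathcal{D} : y^2 = x^3 + \frac{3}{5^6}$, we can apply $(x, y) \mapsto (5^2 x, 5^3 y)$ [with inverse $(x, y) \mapsto (\frac{x}{5^2}, \frac{y}{5^3})$] to transform $\mathcal{D}$ to $\mathcal{E}$ and so deduce that $\mathcal{D}_{\mathrm{tors}}(\mathbb{Q}) = \{\mathbf{o}\}$ also.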
Aside: Another consequence of the p-adic results of the last section is the integrality of the
coordinates of any torsion point.
Lemma 5.3. Let $(x_1, y_1) \neq \mathbf{o}$ be a $\mathbb{Q}$-rational torsion point on $\mathcal{E} : y^2 = x^3 + Ax + B$, where $A, B \in \mathbb{Z}$. Then $x_1, y_1 \in \mathbb{Z}$.
Proof For any prime $p$, we have $A, B \in \mathbb{Z} \subset \mathbb{Z}_p$. Furthermore, $(x_1, y_1) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \leq \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}_p)$. By the last result of the previous section (Corollary 4.20) we know that $|x_1|_p \leq 1$, $|y_1|_p \leq 1$. In summary: $x_1, y_1 \in \mathbb{Q}$ and $x_1, y_1 \in \mathbb{Z}_p$ for all primes $p$.
Imagine that $x_1 \notin \mathbb{Z}$, that is, $x_1 = \frac{m}{n}$, where $m, n \in \mathbb{Z}$, $\gcd(m, n) = 1$, $n \neq \pm 1$. Then some prime $p$ must divide $n$ (and not divide $m$), giving $|x_1|_p = |\frac{m}{n}|_p = p^r$ (for some $r > 0$), which is $> 1$. This contradicts $x_1 \in \mathbb{Z}_p$, and so we conclude that $x_1 \in \mathbb{Z}$. Similarly $y_1 \in \mathbb{Z}$.
For example, this tells us immediately that the point $(\frac{1}{4}, \frac{7}{8})$ is of infinite order on the elliptic curve $\mathcal{E} : y^2 = x^3 - x + 1$.
Aside: Reduction to finite fields usually works well enough in practice, but there is the potential problem that it might leave us with $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ undetermined. For example, suppose that, after trying several primes, we repeatedly find that $3 \mid \#\tilde{\mathcal{E}}(\mathbb{F}_p)$, but a search has not found a point of order 3. In that case, the group $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ would be unresolved. It would be nice to have a finite search area within which the members of $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ must lie. This is provided by the following result.
Theorem 5.4. (Nagell-Lutz). Let $\mathbf{o} \neq (x_1, y_1) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$, where $\mathcal{E} : y^2 = x^3 + Ax + B$, and $A, B \in \mathbb{Z}$. Then $x_1, y_1 \in \mathbb{Z}$ and either $y_1 = 0$ or $y_1^2 \mid \Delta$, where $\Delta = 4A^3 + 27B^2$.
Proof From the last lemma, $x_1, y_1 \in \mathbb{Z}$. If $y_1 = 0$ then the result is satisfied; otherwise, $(x_1, y_1)$ is not 2-torsion and we can consider $(x_2, y_2) = 2(x_1, y_1)$, with $(x_2, y_2) \neq \mathbf{o}$, and so $x_2, y_2 \in \mathbb{Q}$. But $(x_2, y_2)$ is also a torsion point, so $x_2, y_2 \in \mathbb{Z}$. The line tangent to $\mathcal{E}$ at $(x_1, y_1)$ has slope $\lambda = (3x_1^2 + A)/(2y_1)$; as usual, substituting $y = \lambda x + \nu$ into $\mathcal{E}$ gives $(\lambda x + \nu)^2 = x^3 + Ax + B$ and so $x^3 - \lambda^2 x^2 + \ldots = 0$, giving $x_1 + x_1 + x_2 = -(\text{coeff of } x^2)/(\text{coeff of } x^3) = \lambda^2$, that is:
$$x_2 = \left(\frac{3x_1^2 + A}{2y_1}\right)^2 - 2x_1 \in \mathbb{Z}.$$
Now, we know $x_1, x_2 \in \mathbb{Z}$ and so $\left(\frac{3x_1^2 + A}{2y_1}\right)^2 \in \mathbb{Z}$. It follows that $4y_1^2 \mid (3x_1^2 + A)^2$ and so $y_1^2 \mid (3x_1^2 + A)^2$. Also, $y_1^2 = x_1^3 + Ax_1 + B$ and so trivially $y_1^2 \mid (x_1^3 + Ax_1 + B)$. Applying Euclid's Algorithm to $(3x^2 + A)^2$ and $x^3 + Ax + B$ gives the identity
$$\phi_1(x)\psi_1(x) + \phi_2(x)\psi_2(x) = 4A^3 + 27B^2,$$
where $\phi_1(x) = 3x^2 + 4A$, $\psi_1(x) = (3x^2 + A)^2$, $\phi_2(x) = -27(x^3 + Ax - B)$, $\psi_2(x) = x^3 + Ax + B$.
Since $y_1^2 \mid \psi_1(x_1)$ and $y_1^2 \mid \psi_2(x_1)$ we must have $y_1^2 \mid (\phi_1(x_1)\psi_1(x_1) + \phi_2(x_1)\psi_2(x_1)) = \Delta$, as required.
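The polynomial identity used at the end of the proof can be sanity-checked numerically. The sketch below evaluates $(3x^2 + 4A)(3x^2 + A)^2 - 27(x^3 + Ax - B)(x^3 + Ax + B) - (4A^3 + 27B^2)$ on a grid of integers; the function name and the variable names for the four polynomial factors are chosen here for illustration. Since the expression has degree at most 6 in $x$, vanishing at more than six values of $x$ for fixed $A, B$ already proves the identity for those $A, B$.

```python
def identity_gap(x, A, B):
    """Left side minus right side of the identity; zero means the identity holds."""
    phi1 = 3 * x**2 + 4 * A
    psi1 = (3 * x**2 + A) ** 2
    phi2 = -27 * (x**3 + A * x - B)
    psi2 = x**3 + A * x + B
    delta = 4 * A**3 + 27 * B**2
    return phi1 * psi1 + phi2 * psi2 - delta

grid = range(-5, 6)
all_zero = all(identity_gap(x, A, B) == 0 for x in grid for A in grid for B in grid)
```

Expanding by hand gives the same conclusion: both products equal $27x^6 + 54Ax^4 + 27A^2x^2$ up to the constants $4A^3$ and $-27B^2$ respectively, so their difference is $4A^3 + 27B^2$.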
Example 5.5. Let $\mathcal{E} : y^2 = x^3 + 3x + 1$. Then $\Delta = 4 \cdot 3^3 + 27 \cdot 1^2 = 135 = 5 \cdot 3^3$. If $(x, y) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$, $(x, y) \neq \mathbf{o}$, then $x, y \in \mathbb{Z}$ and either $y = 0$ or $y^2 \mid 5 \cdot 3^3$, giving only $y = 0, \pm 1, \pm 3$ as possibilities.
Case $y = \pm 1$. From $\mathcal{E}$, $(\pm 1)^2 = x^3 + 3x + 1$ and so $x(x^2 + 3) = 0$. The only solution in $\mathbb{Z}$ is $x = 0$, giving $(0, \pm 1)$ as the only possibilities.
Case $y = \pm 3$. In this case, $x \in \mathbb{Z}$ satisfies $(\pm 3)^2 = x^3 + 3x + 1$ and so $x^3 + 3x - 8 = 0$. Let $f(x) = x^3 + 3x - 8$. Any integer root $x$ of $f(x)$ must satisfy $x \mid (\text{constant term}) = (-8)$, giving $x = \pm 1, \pm 2, \pm 4, \pm 8$ as the only possibilities. When we substitute these, we find that $f(1), f(-1), \ldots, f(-8)$ are all nonzero, so there are no points on $\mathcal{E}$ with $x \in \mathbb{Z}$ and $y = \pm 3$.
Case $y = 0$. In this case, $x \in \mathbb{Z}$ satisfies $0 = x^3 + 3x + 1$, and we only need to check $x = \pm 1$, neither of which is a root of $x^3 + 3x + 1$. So, there are no points on $\mathcal{E}$ with $x \in \mathbb{Z}$ and $y = 0$.
In summary, $\mathbf{o}, (0, 1), (0, -1)$ are the only possible torsion points. Is $(0, 1) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$? If it were then so would be $2(0, 1)$. But $2(0, 1) = (0, 1) + (0, 1) = (\frac{9}{4}, -\frac{35}{8})$; the coordinates are not in $\mathbb{Z}$ and so this is not a torsion point. Hence $(0, 1)$ must have infinite order. The same must be true for $(0, -1)$, since it is the inverse of $(0, 1)$. Conclusion: $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) = \{\mathbf{o}\}$.
The previous method of reductions modulo finite fields is usually quicker in practice, but the Nagell-Lutz method is an effective procedure.
Comment 5.6. It was merely to ease the algebra in previous sections that we used only the form $y^2 = x^3 + Ax + B$, and all of the previous arguments apply equally well to any elliptic curve $\mathcal{E} : y^2 = x^3 + ax^2 + bx + c$, where $a, b, c \in \mathbb{Z}$, with $\Delta$ now taken to be the discriminant of $x^3 + ax^2 + bx + c$, which has the formula:
$$\Delta = 4a^3 c + 27c^2 + 4b^3 - a^2 b^2 - 18abc.$$
So, it remains true that, for any prime $p \nmid 2\Delta$, $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ is isomorphic to a subgroup of $\tilde{\mathcal{E}}(\mathbb{F}_p)$, that $\#\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \mid \#\tilde{\mathcal{E}}(\mathbb{F}_p)$, and that any $(x, y) \in \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ [$(x, y) \neq \mathbf{o}$] satisfies $x, y \in \mathbb{Z}$, with $y = 0$ or $y^2 \mid \Delta$.
Section 6. A 2-isogeny on an Elliptic Curve
[In the following, we shall use upper case letters $X, Y, \ldots$ for variables, and lower case letters $x, y, \ldots$ for a point $(x, y)$.]
Suppose that $\mathcal{E}$ is an elliptic curve over $\mathbb{Q}$, together with a $\mathbb{Q}$-rational point of order 2: $(x_0, 0)$. After a birational transformation $(x, y) \mapsto (x - x_0, y)$ [inverse $(x, y) \mapsto (x + x_0, y)$] we can assume that $(0, 0) \in \mathcal{E}(\mathbb{Q})$, so that $Y^2 =$ cubic in $X$, with no constant term. As usual, after mappings of the form $(x, y) \mapsto (k^2 x, k^3 y)$, we can assume that the coefficients are in $\mathbb{Z}$. So, our elliptic curve can be taken to have the form
$$\mathcal{C} : Y^2 = X(X^2 + aX + b), \quad a, b \in \mathbb{Z}, \quad b(a^2 - 4b) \neq 0,$$
the last condition ensuring that the curve is non-singular. The point $(0, 0)$ is of order 2 on $\mathcal{C}$.
Let $P = (x, y)$ be a point on $\mathcal{C}$, and let $P_1 = (x, y) + (0, 0) = (x_1, y_1)$. Define $T_{(0,0)}$ by:
$$T_{(0,0)} : \mathcal{C} \to \mathcal{C} : (x, y) \mapsto (x, y) + (0, 0) = (x_1, y_1).$$
That is, $P \mapsto P + (0, 0)$. What are $x_1, y_1$ in terms of $x, y$?
When $(x, y) = (0, 0)$, then $T_{(0,0)} : (0, 0) \mapsto \mathbf{o}$, since $(0, 0)$ is of order 2. When $x \neq 0$, we first find the line through $(0, 0)$ and $(x, y)$, which is: $Y = \frac{y}{x} X$. Substituting this into $\mathcal{C}$ gives:
$$\left(\frac{y}{x}\right)^2 X^2 = X(X^2 + aX + b)$$
$$y^2 X^2 = x^2 X^3 + a x^2 X^2 + b x^2 X$$
$$x(x^2 + ax + b) X^2 = x^2 X^3 + a x^2 X^2 + b x^2 X \quad [\text{since } (x, y) \text{ is on } \mathcal{C}]$$
$$0 = x X^3 - (x^2 + b) X^2 + b x X, \quad [\text{since } x \neq 0]$$
and so $X(X - x)(xX - b) = 0$. The roots of this cubic are: $X = 0$, $X = x$, $X = b/x$. The line $Y = \frac{y}{x} X$ and $\mathcal{C}$ intersect at:
$$(0, 0), \quad (x, y) \quad \text{and} \quad \left(\frac{b}{x}, \frac{by}{x^2}\right) \quad \left[\text{since } X = \frac{b}{x} \text{ gives } Y = \frac{y}{x} \cdot \frac{b}{x} = \frac{by}{x^2}\right]$$
and so $(x, y) + (0, 0) = \left(\frac{b}{x}, -\frac{by}{x^2}\right) = (x_1, y_1)$, where $x_1 = \frac{b}{x}$, $y_1 = -\frac{by}{x^2}$.
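We want to construct a 2-to-1 map $\phi$ from $\mathcal{C}$ to another curve $\mathcal{D}$ such that $\phi\big(P + (0, 0)\big) = \phi(P)$ for any $P$. We want expressions in $x, y$, call them $\lambda(x, y), \mu(x, y)$, such that $P = (x, y)$ and $P + (0, 0) = (x_1, y_1)$ map to the same $(\lambda, \mu)$. Natural attempts are: $x + x_1 = x + \frac{b}{x}$ and $y + y_1 = y - \frac{by}{x^2}$. It turns out to be more convenient to choose $x + x_1 + a$ instead of $x + x_1$.
Define: $\lambda = x + x_1 + a = x + \frac{b}{x} + a = \frac{x(x^2 + ax + b)}{x^2} = \frac{y^2}{x^2} = \left(\frac{y}{x}\right)^2$.
Define: $\mu = y + y_1 = y - \frac{by}{x^2}$.
Both $\lambda, \mu$ are invariant under $T_{(0,0)}$. We have a map from $\mathcal{C}$, given by $(x, y) \mapsto (\lambda, \mu) = \left(\left(\frac{y}{x}\right)^2, y - \frac{by}{x^2}\right)$, which we shall call $\phi$. We want to find the new curve $\mathcal{D}$ which this map is to, that is, we want the equation satisfied by $\lambda$ and $\mu$. Try:
$$\mu^2 = \left(y - \frac{by}{x^2}\right)^2 = \left(\frac{y}{x}\left(x - \frac{b}{x}\right)\right)^2 = \left(\frac{y}{x}\right)^2 \left(x - \frac{b}{x}\right)^2 = \lambda\left(x^2 - 2b + \frac{b^2}{x^2}\right)$$
$$= \lambda\left(x^2 + 2b + \frac{b^2}{x^2} - 4b\right) = \lambda\left(\left(x + \frac{b}{x}\right)^2 - 4b\right) = \lambda\left((\lambda - a)^2 - 4b\right) = \lambda(\lambda^2 - 2a\lambda + a^2 - 4b).$$
So $(\lambda, \mu)$ is a point on the curve $\mathcal{D} : V^2 = U(U^2 + a_1 U + b_1)$, where $a_1 = -2a$ and $b_1 = a^2 - 4b$.
Our map $\phi$ is a rational map (but not a birational transformation, since it is 2-to-1). It is easy to check that it is a homomorphism, with kernel $\{\mathbf{o}, (0, 0)\}$; such a map is a 2-isogeny on $\mathcal{C}$.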
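We can apply the same process to $\mathcal{D}$, taking $(u, v) \mapsto \left(\left(\frac{v}{u}\right)^2, v - \frac{b_1 v}{u^2}\right)$ from $\mathcal{D}$ to the curve $Y^2 = X(X^2 - 2a_1 X + a_1^2 - 4b_1)$, which is the same as $Y^2 = X(X^2 + 4aX + 16b)$ [since $-2(-2a) = 4a$ and $a_1^2 - 4b_1 = (-2a)^2 - 4(a^2 - 4b) = 16b$], that is:
$$\frac{Y^2}{64} = \frac{X}{4}\left(\frac{X^2}{16} + \frac{4aX}{16} + \frac{16b}{16}\right) = \frac{X}{4}\left(\frac{X^2}{16} + \frac{aX}{4} + b\right),$$
and so $\left(\frac{Y}{8}\right)^2 = \frac{X}{4}\left(\left(\frac{X}{4}\right)^2 + a\left(\frac{X}{4}\right) + b\right)$. So, the map $\hat{\phi} : (u, v) \mapsto \left(\frac{1}{4}\left(\frac{v}{u}\right)^2, \frac{1}{8}\left(v - \frac{b_1 v}{u^2}\right)\right)$ is a map from $\mathcal{D}$ back to $\mathcal{C}$ (the dual isogeny). The properties are the same as for $\phi$, namely: $\hat{\phi}$ is a homomorphism with kernel $\{\mathbf{o}, (0, 0)\}$.
Note also that, if we let $\alpha_1 = \frac{-a + \sqrt{a^2 - 4b}}{2}$, $\alpha_2 = \frac{-a - \sqrt{a^2 - 4b}}{2}$ denote the roots of $X^2 + aX + b$, then $\phi\big((\alpha_1, 0)\big) = \phi\big((\alpha_2, 0)\big) = (0, 0)$, and so the kernel of $\hat{\phi} \circ \phi$ consists precisely of the 2-torsion of $\mathcal{C}$, namely: $\mathbf{o}, (0, 0), (\alpha_1, 0), (\alpha_2, 0)$. Indeed, it is easy to show that $\hat{\phi} \circ \phi$ is the multiplication by 2 map on $\mathcal{C}$. We summarise as follows.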
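Lemma 6.1. Let $\mathcal{C} : Y^2 = X(X^2 + aX + b)$, where $a, b \in \mathbb{Z}$, $b \neq 0$, $a^2 - 4b \neq 0$, and let $\mathcal{D} : V^2 = U(U^2 + a_1U + b_1)$, where $a_1 = -2a$ and $b_1 = a^2 - 4b$.
Define $\phi : \mathcal{C} \to \mathcal{D}$ by $\phi(x,y) = \Bigl(\bigl(\frac{y}{x}\bigr)^2,\ y - \frac{by}{x^2}\Bigr)$.
Define $\hat{\phi} : \mathcal{D} \to \mathcal{C}$ by $\hat{\phi}(u,v) = \Bigl(\frac{1}{4}\bigl(\frac{v}{u}\bigr)^2,\ \frac{1}{8}\bigl(v - \frac{b_1v}{u^2}\bigr)\Bigr)$.
Then the 2-isogenies $\phi, \hat{\phi}$ are 2-to-1 homomorphisms, each with kernel $\{o, (0,0)\}$. Since $\phi, \hat{\phi}$ are defined over $\mathbb{Q}$, we also have $\phi : \mathcal{C}(\mathbb{Q}) \to \mathcal{D}(\mathbb{Q})$ and $\hat{\phi} : \mathcal{D}(\mathbb{Q}) \to \mathcal{C}(\mathbb{Q})$. The compositions $\hat{\phi} \circ \phi$ and $\phi \circ \hat{\phi}$ are the multiplication by 2 maps $[2]$ on $\mathcal{C}$ and $\mathcal{D}$, respectively.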
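The statements of Lemma 6.1 can be checked numerically on a single point. The following is only a sketch using exact rational arithmetic: the curve ($a = -1$, $b = 6$) and the point $(2,4)$ are borrowed from Example 6.12, and the helper names `phi`, `phi_hat`, `double` and `on_curve` are chosen here for illustration.

```python
from fractions import Fraction

# Illustrative curve C: Y^2 = X(X^2 + aX + b) with a = -1, b = 6.
a, b = Fraction(-1), Fraction(6)
a1, b1 = -2 * a, a * a - 4 * b          # coefficients of D

def on_curve(pt, A, B):
    """True if pt lies on Y^2 = X(X^2 + A*X + B)."""
    x, y = pt
    return y * y == x * (x * x + A * x + B)

def phi(pt):
    """The 2-isogeny C -> D (requires x != 0)."""
    x, y = pt
    return ((y / x) ** 2, y - b * y / x ** 2)

def phi_hat(pt):
    """The dual isogeny D -> C (requires u != 0)."""
    u, v = pt
    return ((v / u) ** 2 / 4, (v - b1 * v / u ** 2) / 8)

def double(pt):
    """Tangent-line doubling on C (requires y != 0)."""
    x, y = pt
    m = (3 * x * x + 2 * a * x + b) / (2 * y)   # slope of the tangent
    x3 = m * m - a - 2 * x
    return (x3, -(y + m * (x3 - x)))

P = (Fraction(2), Fraction(4))
Q = phi(P)                 # a point on D
R = phi_hat(Q)             # should agree with 2P on C
```

Here `Q` comes out as $(4, -2)$, which lies on $\mathcal{D}$, and `R` agrees with `double(P)`, in line with $\hat{\phi} \circ \phi = [2]$ for this one point; it is a spot check, not a proof.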
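We shall concentrate for the moment on $\phi : \mathcal{C} \to \mathcal{D}$. Note that we can formally invert $(u,v) = \phi(x,y) = \bigl(\bigl(\frac{y}{x}\bigr)^2,\ y - \frac{by}{x^2}\bigr)$, as follows. Since $u = \bigl(\frac{y}{x}\bigr)^2$, we have $\frac{y}{x} = \pm u^{1/2}$. For the moment, say $\frac{y}{x} = u^{1/2}$. We also have
$$u^{-1/2}v = \frac{x}{y}\Bigl(y - \frac{by}{x^2}\Bigr) = x - \frac{b}{x},$$
$$u = \Bigl(\frac{y}{x}\Bigr)^2 = \frac{y^2}{x^2} = \frac{x(x^2 + ax + b)}{x^2} = x + a + \frac{b}{x},$$
and so: $u^{-1/2}v + u = 2x + a$. Solving for $x, y$ (and repeating with $\frac{y}{x} = -u^{1/2}$) then gives the following preimages.
Lemma 6.2. Let $\mathcal{C}, \mathcal{D}, \phi$ be as in Lemma 6.1, and let $(u,v)$ be a point on $\mathcal{D}$ with $u \neq 0$. Let
$$x_1 = \bigl(u + u^{-1/2}v - a\bigr)/2,\qquad y_1 = u^{1/2}x_1 = u^{1/2}\bigl(u + u^{-1/2}v - a\bigr)/2,$$
$$x_2 = \bigl(u - u^{-1/2}v - a\bigr)/2,\qquad y_2 = -u^{1/2}x_2 = -u^{1/2}\bigl(u - u^{-1/2}v - a\bigr)/2.$$
Then $\phi(x_1,y_1) = \phi(x_2,y_2) = (u,v)$.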
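We shall shortly make use of these to define helpful maps on $\mathcal{C}(\mathbb{Q})$ and $\mathcal{D}(\mathbb{Q})$. First, we recall the notation $\mathbb{Q}^*$ and $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ [see also Example 0.30(b)]. As usual, let $\mathbb{Q}^*$ denote the group of nonzero members of $\mathbb{Q}$ under multiplication, so that $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ is $\mathbb{Q}^*$ modulo squares. For example, $\frac{12}{49} = 3$ in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ since $\frac{12}{49} = 3 \cdot \frac{4}{49} = 3\bigl(\frac{2}{7}\bigr)^2 = 3$ in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$. Note that any member of $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ can be written uniquely as a square free integer (that is, as an integer not divisible by any square except 1).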
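The square free representative of a class in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ is easy to compute. The sketch below (with function names chosen here for illustration) uses the observation that $\frac{p}{q} = \frac{pq}{q^2}$, so the class of $\frac{p}{q}$ is the square free part of the integer $pq$; it uses naive trial division, which is adequate only for small inputs.

```python
from fractions import Fraction

def squarefree_part(n: int) -> int:
    """Return the square free integer s with n = s * (a perfect square), n != 0."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    s, p = 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:      # strip all factors p from n
            n //= p
            e += 1
        if e % 2 == 1:         # odd exponent: p survives modulo squares
            s *= p
        p += 1
    return sign * s * n        # what is left of n is 1 or a single prime

def class_mod_squares(q: Fraction) -> int:
    """Square free representative of a nonzero rational in Q*/(Q*)^2."""
    return squarefree_part(q.numerator * q.denominator)
```

For instance, `class_mod_squares(Fraction(12, 49))` gives 3, matching the worked example above.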
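Aside: Our main aim here is to show the Weak Mordell-Weil Theorem, that $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q})$ is finite, which we shall achieve by showing that $\mathcal{D}(\mathbb{Q})/\phi(\mathcal{C}(\mathbb{Q}))$ and $\mathcal{C}(\mathbb{Q})/\hat{\phi}(\mathcal{D}(\mathbb{Q}))$ are finite, and then using the fact that $\hat{\phi} \circ \phi = [2]$.
From now on, we denote $\mathcal{C}(\mathbb{Q})$ by $\mathcal{G}$ and $\mathcal{D}(\mathbb{Q})$ by $\mathcal{H}$ [both groups under addition $+$ given by the group law on elliptic curves, with identity $o$].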
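Lemma 6.3. Let $(u,v) \in \mathcal{H}$. Then:
$$(u,v) \in \phi(\mathcal{G}) \iff u \in (\mathbb{Q}^*)^2 \ \text{ or } \ \bigl[u = 0 \text{ and } a^2 - 4b \in (\mathbb{Q}^*)^2\bigr].$$
Proof
Case 1: $u \neq 0$. From the expressions in Lemma 6.2 for $(x_1,y_1)$, $(x_2,y_2)$ such that $\phi(x_1,y_1) = \phi(x_2,y_2) = (u,v)$, which are in terms of $u, v, u^{1/2}$, we see that:
$$(u,v) \in \phi(\mathcal{G}) \iff u^{1/2} \in \mathbb{Q} \iff u \in (\mathbb{Q}^*)^2.$$
Case 2: $u = 0$. The expressions in Lemma 6.2 do not apply here, since they include $u^{-1/2}$. But we know that $\phi(\gamma_1, 0) = \phi(\gamma_2, 0) = (0,0)$, where $\gamma_1 = \frac{-a + \sqrt{a^2 - 4b}}{2}$, $\gamma_2 = \frac{-a - \sqrt{a^2 - 4b}}{2}$ denote the roots of $X^2 + aX + b$. Hence:
$$(0,0) \in \phi(\mathcal{G}) \iff \gamma_1 \text{ or } \gamma_2 \in \mathbb{Q} \iff a^2 - 4b \in (\mathbb{Q}^*)^2, \text{ as required.}$$
This suggests the following map on $\mathcal{H}$.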
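Definition 6.4. Define the map $q : \mathcal{H} \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ by:
$$q(u,v) = \begin{cases} u & \text{when } u \neq 0, \\ b_1 = a^2 - 4b & \text{when } u = 0. \end{cases}$$
Also define $q(o) = 1$.
Note that we can equivalently define $q(u,v)$ to be $d$ such that the preimages of $(u,v)$ under $\phi$ are defined over $\mathbb{Q}(\sqrt{d})$.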
Lemma 6.5. The map $q : \mathcal{H} \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ of Definition 6.4 is a homomorphism with kernel $\phi(\mathcal{G})$ (so that the induced map $q : \mathcal{H}/\phi(\mathcal{G}) \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ is an injective homomorphism).
Proof We only show that $q(P + Q) = q(P)q(Q)$ in the typical case when none of $P, Q, P + Q$ are $(0,0)$ or $o$. Let $(u_1,v_1), (u_2,v_2), (u_3,v_3)$ be 3 points on $\mathcal{H} = \mathcal{D}(\mathbb{Q})$ which sum to $o$ [so that $(u_1,v_1) + (u_2,v_2) = (u_3,-v_3)$]. Then these are the 3 points of intersection between $\mathcal{D}$ and some line defined over $\mathbb{Q}$: $V = \ell U + m$, say. Substituting $V = \ell U + m$ into $\mathcal{D}$ gives $U(U^2 + a_1U + b_1) - (\ell U + m)^2$, whose 3 roots must be $u_1, u_2, u_3$. That is:
$$U(U^2 + a_1U + b_1) - (\ell U + m)^2 = (U - u_1)(U - u_2)(U - u_3).$$
Equating constant terms gives $-m^2 = -u_1u_2u_3$, so that $u_1u_2u_3 = m^2 = 1$ in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$, and so $u_1u_2 = 1/u_3 = u_3$ in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$. Therefore, by the definition of $q$ we have:
$$q\bigl((u_1,v_1)\bigr)\,q\bigl((u_2,v_2)\bigr) = q\bigl((u_3,-v_3)\bigr) = q\bigl((u_1,v_1) + (u_2,v_2)\bigr),$$
so that $q$ is a homomorphism.
The fact that $\ker q = \phi(\mathcal{G})$ is an immediate consequence of Lemma 6.3.
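Lemma 6.6. The map $q : \mathcal{H} \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ of Definition 6.4 has finite image. Indeed, if $r \in \mathbb{Q}^*/(\mathbb{Q}^*)^2$ is written as a square free integer, then $r \in \operatorname{im} q \implies r \mid b_1$. Under $q$, $\mathcal{H}/\phi(\mathcal{G})$ is isomorphic to the subgroup of $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ consisting of all square free integers $r \mid b_1$ such that
$$W_r : r\ell^4 + a_1\ell^2m^2 + (b_1/r)m^4 = n^2, \text{ for some } \ell, m, n \in \mathbb{Z}, \text{ not all } 0, \text{ with } \gcd(\ell, m) = 1.$$
When this is satisfied, there is a point $(u,v) \in \mathcal{H}$ such that $q(u,v) = r$, satisfying $u = r\bigl(\frac{\ell}{m}\bigr)^2$.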
Proof Let $r \in \mathbb{Q}^*/(\mathbb{Q}^*)^2$, $r \in \operatorname{im} q$, $r \in \mathbb{Z}$, $r$ square free. We want to prove that $r \mid b_1$. Suppose $r = q(u,v)$, where $(u,v) \in \mathcal{D}(\mathbb{Q})$, which must exist since $r \in \operatorname{im} q$. Then: $r = q(u,v) = u = u^2 + a_1u + b_1$ in $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ [since $u(u^2 + a_1u + b_1) = v^2$]. So, $r$, $u$, $u^2 + a_1u + b_1$ are all the same modulo squares, which means we can write:
$$u^2 + a_1u + b_1 = rs^2,\qquad u = rt^2,\qquad \text{for some } s, t \in \mathbb{Q}.$$
Hence: $(rt^2)^2 + a_1(rt^2) + b_1 = rs^2$. Let $t = \ell/m$, where $\ell, m \in \mathbb{Z}$ and $\gcd(\ell, m) = 1$. Then: $r^2\ell^4/m^4 + a_1r\ell^2/m^2 + b_1 = rs^2$, and so: $r^2\ell^4 + a_1r\ell^2m^2 + b_1m^4 = r(m^2s)^2$. Now, $a_1, b_1, r, \ell, m \in \mathbb{Z}$, so the LHS of this last equation is in $\mathbb{Z}$, and so the RHS is also in $\mathbb{Z}$; that is: $r(m^2s)^2 \in \mathbb{Z}$. Since $r$ is square free, we must therefore have $m^2s \in \mathbb{Z}$. Define: $n = m^2s \in \mathbb{Z}$. Then our equation becomes:
$$r^2\ell^4 + a_1r\ell^2m^2 + b_1m^4 = rn^2, \text{ for some } \ell, m, n \in \mathbb{Z},\ \gcd(\ell, m) = 1, \qquad (*)$$
(from which we have $W_r$ in the statement of the lemma, after dividing both sides by $r$). We want to show that $r \mid b_1$, and we know that $r$ is square free. It is sufficient to show, for any prime $p$, that $p \mid r \implies p \mid b_1$.
Imagine $p \mid r$ and $p \nmid b_1$, for some prime $p$. Then $p \mid r^2\ell^4,\ a_1r\ell^2m^2,\ rn^2$ and so by $(*)$, $p \mid b_1m^4$, which in turn gives: $p \mid m$ [since $p \nmid b_1$]. Hence, since now $p \mid r$ and $p \mid m$, we have: $p^2 \mid r^2\ell^4,\ a_1r\ell^2m^2,\ b_1m^4$, and so by $(*)$, $p^2 \mid rn^2$, which in turn gives: $p \mid n$ [since $r$ is square free]. Hence, since now $p \mid r, m, n$, we have: $p^3 \mid a_1r\ell^2m^2,\ b_1m^4,\ rn^2$, and so by $(*)$, $p^3 \mid r^2\ell^4$, which in turn gives: $p \mid \ell$ [since $r$ is square free]. This is a contradiction, since $p \mid \ell$ and $p \mid m$ but $\gcd(\ell, m) = 1$.
The above assumption that $p \mid r$ and $p \nmid b_1$ led to a contradiction, and so it is impossible for any prime $p$ to satisfy $p \mid r$ and $p \nmid b_1$. This is the same as saying that $p \mid r \implies p \mid b_1$ for any prime $p$. Since $r$ is square free, we conclude that $r \mid b_1$, as required.
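Comment 6.7. If we similarly define $\hat{q} : \mathcal{G} \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ by:
$$\hat{q}(x,y) = \begin{cases} x & \text{when } x \neq 0, \\ b = a_1^2 - 4b_1 & \text{when } x = 0 \end{cases}$$
[note that $a_1^2 - 4b_1 = 16b$, which is the same as $b$ modulo squares], and $\hat{q}(o) = 1$, then, by the same argument, $\hat{q}$ has finite image. If $r \in \mathbb{Q}^*/(\mathbb{Q}^*)^2$ is written as a square free integer, then $r \in \operatorname{im} \hat{q} \implies r \mid b$. Under $\hat{q}$, $\mathcal{G}/\hat{\phi}(\mathcal{H})$ is isomorphic to the subgroup of $\mathbb{Q}^*/(\mathbb{Q}^*)^2$ consisting of all square free integers $r \mid b$ such that
$$\widehat{W}_r : r\ell^4 + a\ell^2m^2 + (b/r)m^4 = n^2, \text{ for some } \ell, m, n \in \mathbb{Z}, \text{ not all } 0, \text{ with } \gcd(\ell, m) = 1.$$
When $\widehat{W}_r$ is satisfied, there is a point $(x,y) \in \mathcal{G}$ such that $\hat{q}(x,y) = r$, satisfying $x = r\bigl(\frac{\ell}{m}\bigr)^2$.
Since $\mathcal{H}/\phi(\mathcal{G})$ and $\mathcal{G}/\hat{\phi}(\mathcal{H})$ have been shown to be isomorphic to finite groups, we can immediately deduce one of our main goals.
Theorem 6.8. Both $\mathcal{G}/\hat{\phi}(\mathcal{H})$ and $\mathcal{H}/\phi(\mathcal{G})$ are finite.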
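Corollary 6.9. (The Weak Mordell-Weil Theorem, for an elliptic curve $\mathcal{C}$ which has a rational point of order 2). $\mathcal{G}/2\mathcal{G} = \mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q})$ is finite.
Proof We know from Theorem 6.8 that $\mathcal{G}/\hat{\phi}(\mathcal{H})$ and $\mathcal{H}/\phi(\mathcal{G})$ are finite, so let $\mathcal{G}/\hat{\phi}(\mathcal{H}) = \{g_1, \ldots, g_k\}$ and $\mathcal{H}/\phi(\mathcal{G}) = \{h_1, \ldots, h_\ell\}$. Let $g \in \mathcal{G}$. We can write $g$ as:
$$\begin{aligned}
g &= g_i + \hat{\phi}(h), && \text{for some } g_i \in \{g_1, \ldots, g_k\},\ h \in \mathcal{H} \\
&= g_i + \hat{\phi}\bigl(h_j + \phi(g')\bigr), && \text{for some } h_j \in \{h_1, \ldots, h_\ell\},\ g' \in \mathcal{G} \\
&= g_i + \hat{\phi}(h_j) + \hat{\phi}(\phi(g')) && [\text{since } \hat{\phi} \text{ is a homomorphism}] \\
&= g_i + \hat{\phi}(h_j) + 2g' && [\text{since } \hat{\phi} \circ \phi = [2]] \\
&= g_i + \hat{\phi}(h_j) \ \text{ in } \mathcal{G}/2\mathcal{G}.
\end{aligned}$$
Hence $\mathcal{G}/2\mathcal{G}$ is a subset of $\{g_i + \hat{\phi}(h_j) : 1 \leq i \leq k,\ 1 \leq j \leq \ell\}$, which is finite, and so $\mathcal{G}/2\mathcal{G}$ is finite.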
The above proves the Weak Mordell-Weil Theorem, that $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q})$ is finite, for the case when $\mathcal{C} : Y^2 = X(X^2 + aX + b)$ has a $\mathbb{Q}$-rational point of order 2. In fact, the same result can be proved for any elliptic curve $\mathcal{E} : Y^2 = F(X)$, regardless of whether it has a $\mathbb{Q}$-rational point of order 2 (see Chapter VIII of Silverman), giving:
Theorem 6.10. (The Weak Mordell-Weil Theorem). Let $\mathcal{E}$ be any elliptic curve over $\mathbb{Q}$. Then $\mathcal{E}(\mathbb{Q})/2\mathcal{E}(\mathbb{Q})$ is finite.
The proof of the more general version is in a similar spirit, but requires some algebraic number theory, working in the number field $\mathbb{Q}(\theta)$, where $\theta$ is a root of $F(X)$.
Comment 6.11. A Boolean group is defined to be a group such that $g * g$ is the identity, for any element $g$. A finite Boolean group, generated by the independent elements $g_1, \ldots, g_n$, has $2^n$ elements. Given any Abelian group $G$, the quotient group $G/2G$ is always Boolean. When $G/2G$ is finite, $\#G/2G$ is always a power of 2 and $G/2G$ is isomorphic to $C_2 \times \ldots \times C_2$.
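Suppose we are given an elliptic curve of the form $\mathcal{C} : Y^2 = X(X^2 + aX + b)$, and we derive the associated objects already described, namely $\mathcal{D} : V^2 = U(U^2 + a_1U + b_1)$, where $a_1 = -2a$, $b_1 = a^2 - 4b$, with $\mathcal{G} = \mathcal{C}(\mathbb{Q})$, $\mathcal{H} = \mathcal{D}(\mathbb{Q})$, $\phi : \mathcal{G} \to \mathcal{H}$, $\hat{\phi} : \mathcal{H} \to \mathcal{G}$, $q : \mathcal{H}/\phi(\mathcal{G}) \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$, $\hat{q} : \mathcal{G}/\hat{\phi}(\mathcal{H}) \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$. Then the above results and their proofs give a method for trying to compute $\mathcal{G}/2\mathcal{G}$.
Step 1. Try to find $\mathcal{H}/\phi(\mathcal{G})$ by finding all square free integers $r \mid b_1$ satisfying $W_r$.
Step 2. Try to find $\mathcal{G}/\hat{\phi}(\mathcal{H})$ by finding all square free integers $r \mid b$ satisfying $\widehat{W}_r$.
Step 3. Combine $\mathcal{G}/\hat{\phi}(\mathcal{H})$ and $\hat{\phi}\bigl(\mathcal{H}/\phi(\mathcal{G})\bigr)$ to generate $\mathcal{G}/2\mathcal{G}$.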
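Example 6.12. Let $\mathcal{C} : Y^2 = X(X^2 - X + 6)$. Then $\mathcal{G}/2\mathcal{G} = \mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q}) \cong C_2 \times C_2$.
Proof Here, $a = -1$, $b = 6$ and so $a_1 = -2a = 2$, $b_1 = a^2 - 4b = -23$, giving $\mathcal{D} : V^2 = U(U^2 + 2U - 23)$. The isogeny $\phi : \mathcal{C} \to \mathcal{D}$ is given by
$$\phi(x,y) = \Bigl(\Bigl(\frac{y}{x}\Bigr)^2,\ y - \frac{by}{x^2}\Bigr) = \Bigl(\Bigl(\frac{y}{x}\Bigr)^2,\ y - \frac{6y}{x^2}\Bigr).$$
The isogeny $\hat{\phi} : \mathcal{D} \to \mathcal{C}$ is given by
$$\hat{\phi}(u,v) = \Bigl(\frac{1}{4}\Bigl(\frac{v}{u}\Bigr)^2,\ \frac{1}{8}\Bigl(v - \frac{b_1v}{u^2}\Bigr)\Bigr) = \Bigl(\frac{1}{4}\Bigl(\frac{v}{u}\Bigr)^2,\ \frac{1}{8}\Bigl(v + \frac{23v}{u^2}\Bigr)\Bigr).$$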
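Step 1. Find $\mathcal{H}/\phi(\mathcal{G})$. We need to consider $r \mid b_1 = -23$, $r \in \mathbb{Z}$, $r$ square free, that is, $r = \pm 1, \pm 23$, and $q(o) = 1$, $q(0,0) = b_1 = -23$, so that: $\{1, -23\} \subseteq \operatorname{im} q \subseteq \{\pm 1, \pm 23\}$. Note that $-1 \in \operatorname{im} q \iff 23 \in \operatorname{im} q$, and so it is only necessary to check one member of the coset $\{-1, 23\}$.
Choose $r = -1$. Then equation $W_r$, $r\ell^4 + a_1\ell^2m^2 + (b_1/r)m^4 = n^2$, becomes:
$$W_{-1} : -\ell^4 + 2\ell^2m^2 + 23m^4 = n^2, \text{ for some } \ell, m, n \in \mathbb{Z}, \text{ not all } 0, \text{ with } \gcd(\ell, m) = 1.$$
On completing the square, we obtain:
$$-(\ell^2 - m^2)^2 + 24m^4 = n^2. \qquad (1)$$
This gives $-(\ell^2 - m^2)^2 \equiv n^2 \pmod 3$.
Imagine $3 \nmid (\ell^2 - m^2)$; then $\ell^2 - m^2$ would have an inverse mod 3, and so $-1 \equiv \bigl(n(\ell^2 - m^2)^{-1}\bigr)^2 \pmod 3$, contradicting the fact that $-1$ is not a quadratic residue mod 3.
Hence, by reductio, $3 \mid (\ell^2 - m^2)$ and so $3 \mid n$ [since $3 \mid n^2$], giving that $3^2 \mid (\ell^2 - m^2)^2$ and $3^2 \mid n^2$, so that, from (1), $3^2 \mid 24m^4$, and so $3 \mid m^4$ [since $3^1 \,\|\, 24$], giving $3 \mid m$. But combining $3 \mid m$ with $3 \mid \ell^2 - m^2$ gives $3 \mid \ell^2$, so that $3 \mid \ell$. We have shown that $3 \mid \ell$ and $3 \mid m$, contradicting $\gcd(\ell, m) = 1$. Hence there are no solutions to $W_{-1}$, giving that $-1 \notin \operatorname{im} q$ [indeed, we have shown that there are no solutions $(\ell, m, n) \neq (0,0,0)$ in $\mathbb{Q}_3$].
This gives $\operatorname{im} q = \{1, -23\}$ and $\mathcal{H}/\phi(\mathcal{G}) = \{o, (0,0)\} = \langle (0,0) \rangle \cong C_2$.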
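Step 2. Find $\mathcal{G}/\hat{\phi}(\mathcal{H})$. Here $\hat{q}$ denotes the map on $\mathcal{G}$ of Comment 6.7. We need to consider $r \mid b = 6$, $r \in \mathbb{Z}$, $r$ square free, that is, $r = \pm 1, \pm 2, \pm 3, \pm 6$. Also, $\hat{q}(o) = 1$, $\hat{q}(2,4) = 2$, $\hat{q}(3,6) = 3$, $\hat{q}(0,0) = b = 6$, so that $\{1, 2, 3, 6\} \subseteq \operatorname{im} \hat{q} \subseteq \{\pm 1, \pm 2, \pm 3, \pm 6\}$. Note that $-1 \in \operatorname{im} \hat{q} \iff -2 \in \operatorname{im} \hat{q} \iff -3 \in \operatorname{im} \hat{q} \iff -6 \in \operatorname{im} \hat{q}$, and so it is only necessary to check one member of the coset $\{-1, -2, -3, -6\}$.
Choose $r = -1$. Then $\widehat{W}_r$, $r\ell^4 + a\ell^2m^2 + (b/r)m^4 = n^2$, becomes:
$$\widehat{W}_{-1} : -\ell^4 - \ell^2m^2 - 6m^4 = n^2, \text{ for some } \ell, m, n \in \mathbb{Z}, \text{ not all } 0, \text{ with } \gcd(\ell, m) = 1.$$
For any $\ell, m, n \in \mathbb{Z}$, $-\ell^4,\ -\ell^2m^2,\ -6m^4 \leq 0$, so $-\ell^4 - \ell^2m^2 - 6m^4 \leq 0$, and
$$\text{LHS} = -\ell^4 - \ell^2m^2 - 6m^4 = 0 \iff \ell^4 = \ell^2m^2 = 6m^4 = 0 \iff \ell = m = 0.$$
Also, RHS $= n^2 \geq 0$ and $n^2 = 0 \iff n = 0$. Both sides are equal $\iff$ both sides are $0 \iff \ell = m = n = 0$, but we require $\ell, m, n$ to be not all 0. Hence there are no solutions to $\widehat{W}_{-1}$, giving that $-1 \notin \operatorname{im} \hat{q}$ [indeed, we have shown that there are no solutions $(\ell, m, n) \neq (0,0,0)$ in $\mathbb{R}$].
We conclude that $\operatorname{im} \hat{q} = \{1, 2, 3, 6\}$ and $\mathcal{G}/\hat{\phi}(\mathcal{H}) = \{o, (0,0), (2,4), (3,6)\} = \langle (0,0), (2,4) \rangle$.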
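Step 3. Find $\mathcal{G}/2\mathcal{G}$. This is generated by $\mathcal{G}/\hat{\phi}(\mathcal{H}) = \{o, (0,0), (2,4), (3,6)\} = \langle (0,0), (2,4) \rangle$, together with $\hat{\phi}\bigl(\mathcal{H}/\phi(\mathcal{G})\bigr) = \{\hat{\phi}(o), \hat{\phi}(0,0)\} = \{o\}$, which gives nothing new that wasn't already in $\mathcal{G}/\hat{\phi}(\mathcal{H})$. Therefore, $\mathcal{G}/2\mathcal{G} = \{o, (0,0), (2,4), (3,6)\} = \langle (0,0), (2,4) \rangle \cong C_2 \times C_2$, as required. Note that $(0,0), (2,4)$ are independent in $\mathcal{G}/\hat{\phi}(\mathcal{H})$ and so are independent in $\mathcal{G}/2\mathcal{G}$ [since $2\mathcal{G} = \hat{\phi}(\phi(\mathcal{G})) \subseteq \hat{\phi}(\mathcal{H})$].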
Comment 6.13. The equations
$$W_r : r\ell^4 + a_1\ell^2m^2 + (b_1/r)m^4 = n^2,$$
$$\widehat{W}_r : r\ell^4 + a\ell^2m^2 + (b/r)m^4 = n^2,$$
[which can also be expressed as: $rX^4 + a_1X^2 + b_1/r = Y^2$ and $rX^4 + aX^2 + b/r = Y^2$, for $X, Y \in \mathbb{Q}$] are called homogeneous spaces. Finding $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q})$, as in the last example, comes down to deciding, for each $r \mid b_1$, whether $W_r$ has a solution $\ell, m, n \in \mathbb{Z}$, not all 0, with $\gcd(\ell, m) = 1$, and for each $r \mid b$, whether $\widehat{W}_r$ has such a solution.
In the last example, it turned out that each $W_r$, $\widehat{W}_r$ either had a solution $\ell, m, n$, or we were able to show such a solution was impossible with a modulo-power-of-$p$ argument (a $p$-adic argument) or that it was impossible in $\mathbb{R}$. That is, each $W_r$, $\widehat{W}_r$ either had a point or it was impossible in $\mathbb{R}$ or some $\mathbb{Q}_p$.
This doesn't always happen. It is possible in some examples for $W_r$ or $\widehat{W}_r$ to have solutions in $\mathbb{R}$ and every $\mathbb{Q}_p$, but not in $\mathbb{Q}$ [that is, for there to be a violation of the Hasse Principle]. For example, consider $\mathcal{C} : Y^2 = X^3 + 17X$. Here, $a = 0$, $b = 17$, so that $a_1 = 0$, $b_1 = -68$, giving $\mathcal{D} : Y^2 = X^3 - 68X$. When computing $\mathcal{H}/\phi(\mathcal{G})$, we consider $r \mid b_1 = -68$ and so $r = \pm 1, \pm 2, \pm 17, \pm 34$. For the case $r = 2$, the homogeneous space $r\ell^4 + a_1\ell^2m^2 + (b_1/r)m^4 = n^2$ becomes $2\ell^4 - 34m^4 = n^2$. Note that the equation forces $n$ to be even; setting $n = 2k$ and dividing both sides by 2 gives the slightly simpler form: $\ell^4 - 17m^4 = 2k^2$. As shown on Problem Sheet 4, this has no solutions $k, \ell, m \in \mathbb{Z}$ (not all 0, $\gcd(\ell, m) = 1$), and so $2 \notin \operatorname{im} q$, even though there exist solutions in $\mathbb{R}$ and every $\mathbb{Q}_p$ [and so proving $2 \notin \operatorname{im} q$ requires an argument different to those in the last example]. Instances of such $W_r$ (or $\widehat{W}_r$) correspond to members of a structure known as the Shafarevich-Tate group.
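A brute-force search is consistent with the claim that $\ell^4 - 17m^4 = 2k^2$ has no nontrivial integer solutions. The sketch below (function name and bound chosen here) only tests small coprime pairs $(\ell, m)$; an empty result is evidence, not a proof, and the actual proof is the one referred to on Problem Sheet 4.

```python
from math import gcd, isqrt

def search_hasse_example(bound=200):
    """Look for coprime 0 <= l, m <= bound with l^4 - 17*m^4 = 2*k^2."""
    hits = []
    for l in range(bound + 1):
        for m in range(bound + 1):
            if gcd(l, m) != 1:           # also excludes l = m = 0
                continue
            val = l**4 - 17 * m**4
            if val >= 0 and val % 2 == 0:
                k = isqrt(val // 2)
                if 2 * k * k == val:
                    hits.append((l, m, k))
    return hits
```

Negative values of $\ell, m$ give nothing new because only even powers appear, so restricting to non-negative values loses no solutions.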
Comment 6.14. There is another approach to the Weak Mordell-Weil Theorem, using Galois cohomology. Recall that the slick definition of $q : \mathcal{D}(\mathbb{Q})/\phi(\mathcal{C}(\mathbb{Q})) \to \mathbb{Q}^*/(\mathbb{Q}^*)^2$ is that $q(Q) = d$, where $\mathbb{Q}(\sqrt{d})$ is the field over which $P, P'$ are defined, where $\phi(P) = \phi(P') = Q$. Since $\ker \phi = \{o, (0,0)\}$, we must have $P' = P + (0,0)$. Furthermore, if $\{\sigma_1 : a + b\sqrt{d} \mapsto a + b\sqrt{d},\ \sigma_2 : a + b\sqrt{d} \mapsto a - b\sqrt{d}\}$ is the Galois group of the extension $\mathbb{Q}(\sqrt{d}) : \mathbb{Q}$, then $P' = \sigma_2(P)$. So, we have a 1-1 correspondence between $\{k_1 = o,\ k_2 = (0,0)\}$ and $\{\sigma_1, \sigma_2\}$, given by $k_1 \leftrightarrow \sigma_1$ and $k_2 \leftrightarrow \sigma_2$, with the property that, for any member of $\{P, P'\}$, the effect of adding $k_i$ is the same as applying $\sigma_i$. We then have a map which takes a member of $\mathcal{D}(\mathbb{Q})/\phi(\mathcal{C}(\mathbb{Q}))$ to a 1-1 correspondence between $\{o, (0,0)\}$ and the Galois group of a quadratic number field.
As we have seen, there are two main elements required to prove the Weak Mordell-Weil Theorem: showing that $q$ is a homomorphism and that $\operatorname{im} q$ is finite. For showing that $q$ is a homomorphism, suppose that $q(Q_1) = d_1$ and $q(Q_2) = d_2$. Then, by definition, there are $P_1, P_1'$ [such that $\phi(P_1) = \phi(P_1') = Q_1$] defined over $\mathbb{Q}(\sqrt{d_1})$, and $P_2, P_2'$ [such that $\phi(P_2) = \phi(P_2') = Q_2$] defined over $\mathbb{Q}(\sqrt{d_2})$. Now, since $\phi$ is a homomorphism, $\phi(P_1 + P_2) = Q_1 + Q_2$ and $P_1 + P_2$ is defined over $\mathbb{Q}(\sqrt{d_1}, \sqrt{d_2})$. But $\sqrt{d_1} \mapsto -\sqrt{d_1}$, $\sqrt{d_2} \mapsto -\sqrt{d_2}$ has the same effect as adding $(0,0)$ to each of $P_1, P_2$ and so leaves $P_1 + P_2$ unchanged, so that $P_1 + P_2$ is defined over $\mathbb{Q}(\sqrt{d_1d_2})$; similarly for the other preimage of $Q_1 + Q_2$ under $\phi$. Hence $q(Q_1 + Q_2) = d_1d_2 = q(Q_1)q(Q_2)$, giving that $q$ is a homomorphism [without needing to work explicitly with the group law].
For the finiteness of $\operatorname{im} q$, let $q(Q) = d$, a square free integer, and imagine that a prime $p$ of good reduction is a factor of $d$. By the definition of $q$, there are $P, P'$, defined over $\mathbb{Q}(\sqrt{d})$, such that $\phi(P) = \phi(P') = Q$. But, on reduction modulo $\sqrt{p}$, the conjugation $\sqrt{d} \mapsto -\sqrt{d}$ has no effect, contradicting the fact that $P' = P + (0,0)$ is distinct from $P$. Hence $d$ has only primes dividing the discriminant as factors, and so has only finitely many possibilities.
This approach is cleaner, and does not require getting our hands dirty with explicit group law manipulations. On the other hand, it is often worth giving a more from-first-principles proof (as given previously), as it provides us with an explicit method for trying to compute $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q})$.
Section 7. The Mordell-Weil Theorem
When $\mathcal{E}$ is an elliptic curve over $\mathbb{Q}$, we've seen that $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ and $\mathcal{E}(\mathbb{Q})/2\mathcal{E}(\mathbb{Q})$ are finite. But $\mathcal{E}(\mathbb{Q})$ may sometimes be infinite [if $P \in \mathcal{E}(\mathbb{Q})$ and $P \notin \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ then $P$ is of infinite order and so $\mathcal{E}(\mathbb{Q})$ is infinite]. We shall show that $\mathcal{E}(\mathbb{Q})$ [whether finite or infinite] is always finitely generated. That is, we aim to show that, for any elliptic curve $\mathcal{E}$, there exists a finite number of elements $P_1, \ldots, P_k \in \mathcal{E}(\mathbb{Q})$ such that every $P \in \mathcal{E}(\mathbb{Q})$ can be written as:
$$P = m_1P_1 + \ldots + m_kP_k, \qquad m_1, \ldots, m_k \in \mathbb{Z}.$$
This will be achieved via height functions; we first describe the general properties of a height function on a general Abelian group.
Definition 7.1. Let $A$ be an Abelian group with group operation $+$.
We say that $h : A \to \mathbb{R}$ is a height function if it satisfies:
(1) For any $Q \in A$, there exists $C_1 = C_1(Q)$ such that $h(P + Q) \leq 2h(P) + C_1$ for all $P \in A$.
(2) There exists $C_2$, independent of $P$, such that $h(2P) \geq 4h(P) - C_2$ for all $P \in A$.
(3) For any $C_3$, the set $\{P \in A : h(P) \leq C_3\}$ is finite.
Theorem 7.2. Let $A$ be an Abelian group which has a height function $h$, and suppose that $A/2A$ is finite. Then $A$ is finitely generated.
Proof We are given that $A/2A$ is finite, so let $A/2A = S = \{Q_1, \ldots, Q_r\} \subset A$. Let $P$ be any element of $A$. Then $P = Q_{i_1}$ in $A/2A$ for some $Q_{i_1} \in S$ and so we can write: $P = 2P_1 + Q_{i_1}$, for some $P_1 \in A$. Inductively, continue to write: $P_1 = 2P_2 + Q_{i_2}$, $P_2 = 2P_3 + Q_{i_3}, \ldots$, where each $P_j \in A$ and each $Q_{i_j} \in S$ (and we set $P_0 = P$). Now:
$$h(P_j) \leq \frac{1}{4}\bigl(h(2P_j) + C_2\bigr) \ [\text{by (2)}] = \frac{1}{4}\bigl(h(P_{j-1} - Q_{i_j}) + C_2\bigr) \leq \frac{1}{4}\bigl(2h(P_{j-1}) + C_1' + C_2\bigr) \ [\text{by (1)}],$$
where $C_1' = \max\{C_1(-Q) : Q \in S\}$. So, if $h(P_{j-1}) > (C_1' + C_2)/2$ then:
$$h(P_j) < \frac{1}{4}\bigl(2h(P_{j-1}) + 2h(P_{j-1})\bigr) = h(P_{j-1}).$$
Imagine that $h(P) > (C_1' + C_2)/2$ and $h(P_j) > (C_1' + C_2)/2$ for all $j$. Then the sequence $h(P), h(P_1), h(P_2), \ldots$ would be strictly decreasing, giving infinitely many distinct members of $A$ with height $\leq h(P)$, which would contradict (3). This contradiction shows that there must exist an $n$ such that $h(P_n) \leq (C_1' + C_2)/2$. So, we can write: $P = 2P_1 + Q_{i_1} = 2(2P_2 + Q_{i_2}) + Q_{i_1} = \ldots$, and after $n$ steps $P$ will be written as a linear combination of $P_n$ and members of $S$. Let $T = \{Q \in A : h(Q) \leq (C_1' + C_2)/2\}$. We have shown (since $P_n \in T$) that any $P \in A$ is a linear combination of members of $S \cup T$. Furthermore, $T$ is finite, by (3). In conclusion: $A$ is generated by the finite set $S \cup T$, and so is finitely generated.
A height function on $\mathcal{E}(\mathbb{Q})$ can be obtained as follows.
Lemma 7.3. Let $\mathcal{E}$ be an elliptic curve, defined over $\mathbb{Q}$. Define $h_x : \mathcal{E}(\mathbb{Q}) \to \mathbb{R}$ by:
$$h_x\bigl((x,y)\bigr) = \log \max\bigl(|a|, |b|\bigr), \text{ where } x = \frac{a}{b},\ a, b \in \mathbb{Z},\ \gcd(a, b) = 1,$$
and define $h_x(o) = 0$. Then $h_x$ is a height function on $\mathcal{E}(\mathbb{Q})$. Indeed, there exists a constant $C$, independent of $P, Q$, such that $|h_x(P + Q) + h_x(P - Q) - 2h_x(P) - 2h_x(Q)| \leq C$, for all $P, Q \in \mathcal{E}(\mathbb{Q})$, from which properties (1), (2) can be deduced [property (3) is trivially true].
For the proof (optional) see, for example, p.201 of Silverman.
Aside: The proof uses the explicit group law; for example, $x' = a'/b'$, the $x$-coordinate of $2P = 2(x,y)$, is given by (quartic in $x$)/(cubic in $x$), and so $\max(|a'|, |b'|)$ is approximately $\max(|a|, |b|)^4$, giving that $\log \max(|a'|, |b'|)$ is approximately $4 \log \max(|a|, |b|)$, that is, $h_x(2P)$ is approximately $4h_x(P)$. It is only necessary to control the amount of cancellation occurring, when writing the $x$-coordinate of $2P$ in lowest terms.
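The growth of the naive height under doubling can be observed directly. The sketch below computes $h_x$ for $P$, $2P$, $4P$ on the illustrative curve $Y^2 = X(X^2 - X + 6)$ with $P = (2,4)$ (borrowed from Example 6.12); the names `h_x`, `double` and `xs` are chosen here. For such small points the ratio $h_x(2P)/h_x(P)$ is only roughly 4, since the bounded error term is not negligible compared with the heights themselves.

```python
from fractions import Fraction
from math import log

# Illustrative curve Y^2 = X(X^2 + aX + b) with a = -1, b = 6.
a, b = Fraction(-1), Fraction(6)

def h_x(x: Fraction) -> float:
    """Naive height of an x-coordinate written in lowest terms."""
    return log(max(abs(x.numerator), abs(x.denominator)))

def double(pt):
    """Tangent-line doubling (requires y != 0)."""
    x, y = pt
    m = (3 * x * x + 2 * a * x + b) / (2 * y)
    x3 = m * m - a - 2 * x
    return (x3, -(y + m * (x3 - x)))

P = (Fraction(2), Fraction(4))
xs = []                      # x-coordinates of P, 2P, 4P
for _ in range(3):
    xs.append(P[0])
    P = double(P)
heights = [h_x(x) for x in xs]
```

Here `xs` is `[2, 1/16, 2356225/97344]`, so the heights are $\log 2$, $\log 16$ and $\log 2356225$; `Fraction` keeps each $x$-coordinate in lowest terms automatically, which is exactly the cancellation the Aside refers to.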
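Theorem 7.4. (The Mordell-Weil Theorem). Let $\mathcal{E}$ be any elliptic curve over $\mathbb{Q}$. Then $\mathcal{E}(\mathbb{Q})$ is finitely generated.
Proof This follows immediately from Theorem 6.10, Theorem 7.2 and Lemma 7.3.
Comment 7.5. This means that we know what $\mathcal{E}(\mathbb{Q})$ looks like:
$$\mathcal{E}(\mathbb{Q}) \cong \mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \times \mathbb{Z}^r, \text{ for some } r \geq 0,\ r \in \mathbb{Z}.$$
The number $r$ is called the rank of $\mathcal{E}(\mathbb{Q})$ (or just the rank of $\mathcal{E}$). Clearly:
$$\mathcal{E}(\mathbb{Q}) \text{ has finitely many points} \iff \operatorname{rank}\bigl(\mathcal{E}(\mathbb{Q})\bigr) = 0.$$
To solve $\mathcal{E}(\mathbb{Q})$, we want to know: $\mathcal{E}_{\mathrm{tors}}(\mathbb{Q})$ and $r$ (the rank). Note that:
$$\mathcal{E}(\mathbb{Q})/2\mathcal{E}(\mathbb{Q}) \cong \mathcal{E}_{\mathrm{tors}}(\mathbb{Q})/2\mathcal{E}_{\mathrm{tors}}(\mathbb{Q}) \times \bigl(\mathbb{Z}/2\mathbb{Z}\bigr)^r,$$
so that:
$$\mathcal{E}(\mathbb{Q})/2\mathcal{E}(\mathbb{Q}) \cong \mathcal{E}(\mathbb{Q})[2] \times C_2^r,$$
where $\mathcal{E}(\mathbb{Q})[2]$ denotes the 2-torsion subgroup of $\mathcal{E}(\mathbb{Q})$ (see Comment 0.40).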
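Example 7.6. Let $\mathcal{C} : Y^2 = X(X^2 - X + 6)$. In Example 6.12, we found that $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q}) \cong C_2 \times C_2$. Also,
$$\mathcal{C}(\mathbb{C})[2] = \{o\} \cup \{\text{points of order } 2\} = \Bigl\{o,\ (0,0),\ \Bigl(\frac{1 + \sqrt{-23}}{2}, 0\Bigr),\ \Bigl(\frac{1 - \sqrt{-23}}{2}, 0\Bigr)\Bigr\},$$
so that $\mathcal{C}(\mathbb{Q})[2] = \{o, (0,0)\} \cong C_2$. Since $\mathcal{C}(\mathbb{Q})/2\mathcal{C}(\mathbb{Q}) \cong \mathcal{C}(\mathbb{Q})[2] \times C_2^r$, we deduce that $C_2 \times C_2 \cong C_2 \times C_2^r$ and so the rank $r = 1$ [$\mathcal{C}(\mathbb{Q})$ is infinite, but is generated by $\mathcal{C}_{\mathrm{tors}}(\mathbb{Q})$ and one element of infinite order].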
Section 8. Cryptography
Public keys allow messages to be encoded (not decoded). Suppose A wants to send the integer $X$ to B safely; we assume that everything transmitted can be intercepted.
Step 1. B (in private) takes 2 large prime numbers $p, q$ (usually about 250 digits) and multiplies them together to give $N = pq$, chooses an exponent $d$ coprime to $(p-1)(q-1)$, and publicises $N, d$ to the world.
Step 2. A (in private) computes $Y \equiv X^d \pmod N$ and sends the message $Y$ to B.
Step 3. B privately computes $\phi(N) = \phi(p)\phi(q) = (p-1)(q-1)$ and also computes (by Euclid's Algorithm) $e$ such that $de \equiv 1 \pmod{\phi(N)}$. Note that:
$$Y^e \equiv (X^d)^e \equiv X^{de} = X^{1 + k\phi(N)} \ [\text{for some } k \in \mathbb{Z}] \equiv X\bigl(X^{\phi(N)}\bigr)^k \equiv X \pmod N,$$
since $X^{\phi(N)} \equiv 1 \pmod N$ by Euler's Theorem, provided that $X, N$ are coprime. Assuming $X < N$, this decodes the message.
Note that computing $X^d \pmod N$ [and $Y^e \pmod N$] is fast even when $d$ is large, by writing $d$ in base 2 as $d = 2^{k_1} + \ldots + 2^{k_m}$ ($k_1 < \ldots < k_m$). One then obtains $X^{2^0} \equiv X$, $X^{2^1} \equiv (X^{2^0})^2$, $X^{2^2} \equiv (X^{2^1})^2, \ldots, X^{2^{k_m}}$, by $k_m$ squaring operations, after which:
$$X^d \equiv X^{2^{k_1}} X^{2^{k_2}} \cdots X^{2^{k_m}} \pmod N,$$
which takes roughly $\log d$ operations.
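The repeated-squaring idea just described can be sketched in a few lines. The function name `power_mod` is chosen here for illustration; Python's built-in three-argument `pow` does the same job and is used below only as a cross-check.

```python
def power_mod(X: int, d: int, N: int) -> int:
    """Compute X^d mod N by scanning the binary digits of d."""
    result = 1
    square = X % N                  # holds X^(2^i) mod N at step i
    while d > 0:
        if d & 1:                   # the bit 2^i occurs in d
            result = (result * square) % N
        square = (square * square) % N
        d >>= 1
    return result
```

For example, `power_mod(2, 52, 10123)` gives 425, using one squaring per binary digit of the exponent rather than 51 multiplications.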
Anyone wishing to crack the code must be able to compute $\phi(N)$, which requires finding $p, q$ from $N = pq$. A naive (and very slow) approach is trial division: checking for each $c = 2, \ldots, [\sqrt{N}]$ whether $c \mid N$.
Much better is Pollard's $p - 1$ method. One chooses base $a$ and exponent $k$ = product of powers of small primes. Compute $a^k \pmod N$ [as usual, after first writing $k$ in binary], and then $\gcd(a^k - 1, N)$ using Euclid's Algorithm. If there exists prime $p \mid N$ such that $p - 1 \mid k$ [$k = (p-1)s$, say] then:
$$a^k \equiv \bigl(a^{p-1}\bigr)^s \equiv 1^s \equiv 1 \pmod p \quad [\text{by Fermat}],$$
provided that $p \nmid a$. This gives $p \mid (a^k - 1)$ and so $p \mid \gcd(a^k - 1, N)$. Unless we have bad luck, $\gcd(a^k - 1, N) \neq N$, and so $\gcd(a^k - 1, N)$ will be a proper factor of $N$ [$\neq 1$, $\neq N$].
Example 8.1. A four-letter word $L_1L_2L_3L_4$ has been divided into two pairs: $L_1L_2$ and $L_3L_4$. Each of these pairs has been converted into an integer (of at most 4 digits) via the standard map: A $\mapsto$ 01, B $\mapsto$ 02, ..., Z $\mapsto$ 26. These integers have been encoded by taking each to the power of $d = 6587$, modulo $N = 10123$. The encoded message reads:
4268, 5744.
We shall factorise $N$ by applying Pollard's $p - 1$ method, using base 2 and exponent 52, and then use the factorisation of $N$ to decode the message.
Write 52 as a sum of powers of 2: $52 = 4 + 16 + 32$. First compute (modulo $N = 10123$): $2^1 \equiv 2$, $2^2 \equiv (2^1)^2 \equiv 4$, $2^4 \equiv (2^2)^2 \equiv 16$, $2^8 \equiv (2^4)^2 \equiv 256$, $2^{16} \equiv (2^8)^2 \equiv 4798$, $2^{32} \equiv (2^{16})^2 \equiv 4798^2 \equiv 1102$ (where each of these was obtained by squaring the previous one, and reducing modulo $N$). Since $52 = 4 + 16 + 32$, we have: $2^{52} \equiv 2^4 \cdot 2^{16} \cdot 2^{32} \equiv 16 \cdot 4798 \cdot 1102 \equiv 5907 \cdot 1102 \equiv 425$ modulo $N$, so that $2^{52} - 1 \equiv 424$ modulo $N$.
Now, compute $\gcd(424, N)$ by Euclid's Algorithm:
$$10123 = 23 \cdot 424 + 371; \quad 424 = 1 \cdot 371 + 53; \quad 371 = 7 \cdot 53 + 0.$$
So, 53 is a factor of $N$. Compute $10123/53 = 191$, giving the factorisation $N = 10123 = 53 \cdot 191$.
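The factorisation step of this example can be reproduced with a very short routine. This is a minimal sketch of a single round of Pollard's $p - 1$ method; the function name and the convention of returning `None` on failure are choices made here.

```python
from math import gcd

def pollard_p_minus_1(N: int, a: int, k: int):
    """One round of Pollard's p-1 method with base a and exponent k.
    Returns a proper factor of N, or None if the gcd is 1 or N."""
    g = gcd(pow(a, k, N) - 1, N)      # pow(a, k, N) uses repeated squaring
    return g if 1 < g < N else None

factor = pollard_p_minus_1(10123, 2, 52)
```

With base 2 and exponent 52 the round succeeds because $53 - 1 = 52$ divides the exponent; with an exponent such as 3 the gcd is 1 and the function returns `None`, so a different $k$ (or base) would be tried.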
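Since $N = 53 \cdot 191$, we have $\phi(N) = 52 \cdot 190 = 9880$. Computing the gcd of $\phi(N) = 9880$ and $d = 6587$ by row operations, we see:
$$\left(\begin{array}{cc|c} 1 & 0 & 9880 \\ 0 & 1 & 6587 \end{array}\right) \xrightarrow{R_1 - R_2} \left(\begin{array}{cc|c} 1 & -1 & 3293 \\ 0 & 1 & 6587 \end{array}\right) \xrightarrow{R_2 - 2R_1} \left(\begin{array}{cc|c} 1 & -1 & 3293 \\ -2 & 3 & 1 \end{array}\right) \xrightarrow{R_1 - 3293R_2} \left(\begin{array}{cc|c} * & * & 0 \\ -2 & 3 & 1 \end{array}\right),$$
where the entries $*$ need not be computed. This gives us, all in the same computation, that $\gcd(9880, 6587) = 1$, and the bottom row of the last matrix gives $\gcd(9880, 6587)$ as a linear combination of 9880, 6587, namely: $1 = -2 \cdot 9880 + 3 \cdot 6587$. Hence $3 \cdot 6587 \equiv 1 \pmod{9880}$, that is, 3 is the inverse of 6587 modulo $\phi(N) = 9880$.
The decoding operation is therefore $Y \mapsto Y^3 \bmod N$. Computing: $4268^3 = 4268^2 \cdot 4268 \equiv 4547 \cdot 4268 \equiv 805$ (modulo $N = 10123$). Also: $5744^3 = 5744^2 \cdot 5744 \equiv 2679 \cdot 5744 \equiv 1216$ (modulo $N = 10123$). The decoded message is therefore: 0805, 1216; that is: HELP.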
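The decoding in this example can be retraced as follows. This is only a sketch: `extended_gcd` and `decode` are names chosen here, and the extended Euclidean algorithm is written iteratively rather than with the row-operation layout used above.

```python
def extended_gcd(x: int, y: int):
    """Return (g, s, t) with g = gcd(x, y) = s*x + t*y."""
    s0, t0, s1, t1 = 1, 0, 0, 1
    while y:
        q, x, y = x // y, y, x % y
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return x, s0, t0

def decode(blocks, d, p, q):
    """Decode RSA blocks given the public exponent d and the factors p, q of N."""
    N, phi_N = p * q, (p - 1) * (q - 1)
    g, _, t = extended_gcd(phi_N, d)
    assert g == 1, "d must be coprime to phi(N)"
    e = t % phi_N                           # inverse of d modulo phi(N)
    numbers = [pow(Y, e, N) for Y in blocks]
    text = ""
    for n in numbers:
        digits = f"{n:04d}"                 # two letters per block: A=01, ..., Z=26
        for i in (0, 2):
            text += chr(ord("A") + int(digits[i:i + 2]) - 1)
    return e, numbers, text

e, numbers, text = decode([4268, 5744], 6587, 53, 191)
```

Running this gives the decoding exponent 3, the blocks 805 and 1216, and the word HELP, in agreement with the hand computation.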
The exponent $k$ is typically chosen to be a product of powers of the first $r$ primes, for some $r$. Pollard's $p - 1$ Method is fast when there exists at least one prime $p \mid N$ such that $p - 1 = \#\mathbb{F}_p^*$ is only divisible by small primes, so that $\operatorname{order}(a) \mid \#\mathbb{F}_p^* \mid k$.
When Pollard's $p - 1$ method is slow for some $N$, we can replace powers of an integer base $a$ with multiples $kP$ of a point $P$ on an elliptic curve $\mathcal{E}$.
We hope that there exists a prime $p \mid N$ such that $\#\widetilde{\mathcal{E}}(\mathbb{F}_p) \mid k$, which would guarantee that $kP = o$ (the point at infinity) mod $p$; that is to say, a denominator divisible by $p$, in which case taking the gcd of the denominator and $N$ will reveal the factor $p$. This will be fast if there exists $p \mid N$ such that $\#\widetilde{\mathcal{E}}(\mathbb{F}_p)$ is only divisible by small primes. Each new choice of elliptic curve gives a new chance of this happening.
The Elliptic Curve Method (ECM) for attempting to factor an integer $N$ is as follows. Choose an elliptic curve $\mathcal{E}$ mod $N$, some point $P$ on $\mathcal{E}$, and some choice of $k$ (normally a product of powers of small primes). Attempt to compute $kP \pmod N$ and hope that, in performing one of the additions $kP = k_1P + k_2P$, a denominator will have gcd with $N$ that is a nontrivial factor of $N$ ($\neq 1$ and $\neq N$).
Example 8.2. Let $N = 10123$, as in Example 8.1. We shall factorise $N$ by applying the Elliptic Curve Method, using the curve $\mathcal{E} : Y^2 = X^3 + 5X - 5$ and $4P$, where $P = (1,1)$.
The line tangent to $\mathcal{E}$ at $P = (1,1)$ has slope $y'$ given by $2yy' = 3x^2 + 5$, with $x = 1$, $y = 1$; that is, the slope is $8/2 = 4$. This tangent line also goes through $(1,1)$ and so has equation: $Y = 4X - 3$. The $x$-coordinate of $2P$ is therefore $4^2 - (1 + 1) = 14$, and the $y$-coordinate is: $-(4 \cdot 14 - 3) = -53 \equiv 10070$, so that $Q = 2P = (14, 10070)$ (modulo $N = 10123$). We now wish to double the point $Q = 2P$, and so again the first step is to find the line tangent to $\mathcal{E}$ at $Q$. This has slope $y'$ given by $2 \cdot 10070 \cdot y' = 3 \cdot 14^2 + 5$, and so we need to compute $(3 \cdot 14^2 + 5)/(2 \cdot 10070)$ (modulo $N = 10123$), for which the first step is to find the inverse of $2 \cdot 10070 \equiv 10017$ (modulo $N = 10123$). Using Euclid's Algorithm:
$$10123 = 1 \cdot 10017 + 106; \quad 10017 = 94 \cdot 106 + 53; \quad 106 = 2 \cdot 53 + 0.$$
So, we cannot find the inverse of 10017 (modulo $N = 10123$), and this step has given us our factor 53 of $N$. As in the previous example, compute $10123/53 = 191$, giving the factorisation $N = 10123 = 53 \cdot 191$.
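The computation in this example can be sketched in code. The version below handles only repeated doubling on a curve $Y^2 = X^3 + AX + B$ modulo $N$ (enough for $k = 4$), and all names, including the `FactorFound` exception, are choices made here; a general ECM implementation would also need point addition and a systematic choice of $k$. It relies on `pow(d, -1, N)` for modular inverses, which assumes Python 3.8 or later.

```python
from math import gcd

class FactorFound(Exception):
    """Raised when a denominator is not invertible modulo N."""
    def __init__(self, factor):
        super().__init__(factor)
        self.factor = factor

def inverse_mod(d: int, N: int) -> int:
    g = gcd(d % N, N)
    if g != 1:
        raise FactorFound(g)          # the "failure" that ECM hopes for
    return pow(d, -1, N)

def ec_double(P, A: int, N: int):
    """Double P on Y^2 = X^3 + A*X + B modulo N (B is not needed)."""
    x, y = P
    m = (3 * x * x + A) * inverse_mod(2 * y, N) % N   # tangent slope
    x3 = (m * m - 2 * x) % N
    return (x3, (-(y + m * (x3 - x))) % N)

def ecm_by_doubling(P, A: int, N: int, doublings: int):
    """Try to compute 2^doublings * P; return a proper factor of N if one appears."""
    try:
        for _ in range(doublings):
            P = ec_double(P, A, N)
    except FactorFound as hit:
        return hit.factor if hit.factor < N else None
    return None

factor = ecm_by_doubling((1, 1), 5, 10123, 2)
```

The first doubling gives $2P = (14, 10070)$ and the second fails to invert $2 \cdot 10070$, so `factor` is 53, as in the hand computation.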
References
[1] J.W.S. Cassels. Lectures on Elliptic Curves. LMSST 24. Cambridge University Press, Cambridge, 1991.
[2] J.H. Silverman. The Arithmetic of Elliptic Curves. GTM 106. Springer-Verlag, 1986.
