Lecture Notes in Advanced Real Analysis

Eric T. Sawyer
McMaster University, Hamilton, Ontario
E-mail address: sawyer@mcmaster.ca
Abstract. Beginning with Lebesgue integration on the real line, and continuing
with Euclidean spaces, the Banach-Tarski paradox, and the Riesz representation
theorem on locally compact Hausdorff spaces, these lecture notes examine
theories of integration with applications to analysis and differential
equations.
Contents
Preface
Part 1. Topology of Euclidean spaces
Chapter 1. Compact sets
1. Properties of compact sets
2. The Cantor set
3. Exercises
Chapter 2. Continuous functions
1. Limits
2. Continuity
3. Uniform continuity
4. Connectedness
5. Exercises
Part 2. Integration and differentiation
Chapter 3. Riemann and Riemann-Stieltjes integration
1. Properties of the Riemann-Stieltjes integral
2. The Henstock-Kurzweil integral
3. Exercises
Chapter 4. Lebesgue measure theory
1. Lebesgue measure on the real line
2. Measurable functions and integration
3. Exercises
Chapter 5. Paradoxical decompositions and finitely additive measures
1. Finitely additive invariant measures
2. Paradoxical decompositions and the Banach-Tarski paradox
3. Exercises
Chapter 6. Abstract integration and the Riesz representation theorem
1. Abstract integration
2. The Riesz representation theorem
3. Regularity of Borel measures
4. Lebesgue measure on Euclidean spaces
5. Littlewood's three principles
6. Exercises
Chapter 7. Lebesgue, Banach and Hilbert spaces
1. $L^p$ spaces
2. Banach spaces
3. Hilbert spaces
4. Duality
5. Essentially bounded functions
6. Exercises
Chapter 8. Complex measures and the Radon-Nikodym theorem
1. The total variation of a complex measure
2. The Radon-Nikodym theorem
3. Exercises
Chapter 9. Differentiation of integrals
1. Covering lemmas, maximal functions and differentiation
2. The maximal theorem
3. Exercises
Chapter 10. Product integration and Fubini's theorem
1. Product $\sigma$-algebras
2. Product measures
3. Fubini's theorem
4. Exercises
Appendix. Bibliography
Preface
These notes grew out of lectures given twice a week in a first year graduate
course in advanced real analysis at McMaster University from September to
December 2010. Part 1 consists of a brief review of compactness and continuity.
The topics in Part 2 include Lebesgue integration on Euclidean spaces, the
Banach-Tarski paradox, the Riesz representation theorem on locally compact
Hausdorff spaces, Lebesgue spaces $L^p(\mu)$, Banach and Hilbert spaces, complex
measures and the Radon-Nikodym theorem, and Fubini's theorem. Applications to
differential equations will be forthcoming in Part 8. Sources include books by
Rudin [3], [4] and [5], and books by Stein and Shakarchi [6] and [7]. Special
topics are covered in Bartle and Sherbert [1] and Wagon [8]. Finally, thanks to
Chai Molina for many useful suggestions regarding these notes as used in 2011.
Part 1
Topology of Euclidean spaces
We begin Part 1 by reviewing some of the theory of compact sets and continuity
of functions in Euclidean spaces $\mathbb{R}^n$. We assume the reader is already
familiar with the notions of sequence, open, closed, countable and uncountable,
and is comfortable with elementary properties of limits, continuity and
differentiability of functions.
CHAPTER 1
Compact sets
Let $\mathbb{R}$ be the set of real numbers equipped with the usual field and
order operations, and the least upper bound property. Denote by $\mathbb{R}^n$
the $n$-dimensional Euclidean space
$$\mathbb{R}^n = \{(x_1, x_2, \dots, x_n) : x_k \in \mathbb{R}\} = \mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R} \quad (n \text{ times})$$
equipped with the usual vector addition and scalar and inner products
$$x + y = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n),$$
$$\lambda x = (\lambda x_1, \lambda x_2, \dots, \lambda x_n),$$
$$x \cdot y = x_1 y_1 + x_2 y_2 + \dots + x_n y_n,$$
if $x = (x_1, x_2, \dots, x_n)$, $y = (y_1, y_2, \dots, y_n)$ and $\lambda \in \mathbb{R}$.
We begin with the single most important property that a subset of Euclidean
space can have, namely compactness. In a sense, compact subsets share the most
important topological properties enjoyed by finite sets. It turns out that the
most basic of these properties is rather abstract looking at first sight, but
arises so often in applications and subsequent theory that we will use it as the
definition of compactness. But first we introduce some needed terminology.
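Let $K$ be a subset of $\mathbb{R}^n$. A collection $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$
of subsets $G_\alpha$ of $\mathbb{R}^n$ is said to be an open cover of $K$ if
$$\text{each } G_\alpha \text{ is open and } K \subset \bigcup_{\alpha \in A} G_\alpha.$$
A finite subcover (relative to the open cover $\mathcal{G}$ of $K$) is a finite
collection $\{G_{\alpha_k}\}_{k=1}^{n}$ of the open sets $G_\alpha$ that still covers $K$:
$$K \subset \bigcup_{k=1}^{n} G_{\alpha_k}.$$
For example, the collection $\mathcal{G} = \{(\frac{1}{n}, 1 + \frac{1}{n})\}_{n=1}^{\infty}$
of open intervals in $\mathbb{R}$ forms an open cover of the interval
$K = (\frac{1}{8}, 2)$, and $\{(\frac{1}{n}, 1 + \frac{1}{n})\}_{n=1}^{8}$ is a finite
subcover. Draw a picture! However, $\mathcal{G}$ is also an open cover of the
interval $L = (0, 2)$ for which there is no finite subcover since
$\frac{1}{m} \notin (\frac{1}{n}, 1 + \frac{1}{n})$ for all $1 \le n \le m$.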
Definition 1. A subset $K$ of $\mathbb{R}^n$ is compact if every open cover of
$K$ has a finite subcover.
Example 1. Clearly every finite set is compact. On the other hand, the interval
$(0, 2)$ is not compact since $\mathcal{G} = \{(\frac{1}{n}, 1 + \frac{1}{n})\}_{n=1}^{\infty}$
is an open cover of $(0, 2)$ that does not have a finite subcover.
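The two covers by the intervals $(\frac{1}{n}, 1 + \frac{1}{n})$ discussed above can be spot-checked with exact rational arithmetic. The following is only an illustrative sketch: the helper names `interval` and `covered_by_first` and the choice of sample points are assumptions made here, not part of the notes, and a finite sample of points does not replace the proof.

```python
from fractions import Fraction


def interval(n):
    """Return the endpoints of the open interval (1/n, 1 + 1/n)."""
    return Fraction(1, n), 1 + Fraction(1, n)


def covered_by_first(x, count):
    """True if x lies in (1/n, 1 + 1/n) for some 1 <= n <= count."""
    return any(lo < x < hi for lo, hi in map(interval, range(1, count + 1)))


# Sample points of K = (1/8, 2): each one lies in one of the first 8 intervals.
samples_K = [Fraction(1, 8) + Fraction(j, 1000) for j in range(1, 1875)]
all_covered = all(covered_by_first(x, 8) for x in samples_K)

# For L = (0, 2): the point 1/m is missed by the first m intervals, for every m
# tried, so no finite initial segment of the cover can cover L.
escapes = all(not covered_by_first(Fraction(1, m), m) for m in range(1, 200))
```

Exact fractions are used so that the strict inequalities at the endpoints are decided without rounding error.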
The above example makes it clear that all we need is one bad cover as witness
to the failure of a set to be compact. On the other hand, in order to show that
an infinite set is compact, we must often work much harder, namely we must show
that given any open cover, there is always a finite subcover. It will obviously
be of great advantage if we can find simpler criteria for a set to be compact,
and this will be carried out below. For now we will content ourselves with
giving one simple example of an infinite compact subset of the real numbers
(even of the rational numbers).
Example 2. The set $K = \{0\} \cup \{\frac{1}{k}\}_{k=1}^{\infty}$ is compact in
$\mathbb{R}$. Indeed, suppose that $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ is an open
cover of $K$. Then at least one of the open sets in $\mathcal{G}$ contains $0$, say
$G_{\alpha_0}$. Since $G_{\alpha_0}$ is open, there is $r > 0$ such that
$$B(0, r) \subset G_{\alpha_0}.$$
Now comes the crux of the argument: there are only finitely many points
$\frac{1}{k}$ that lie outside $B(0, r)$, i.e. $\frac{1}{k} \notin B(0, r)$ if and only
if $k \le \lfloor \frac{1}{r} \rfloor \equiv n$. Now choose $G_{\alpha_k}$ to contain
$\frac{1}{k}$ for each $k$ between $1$ and $n$ inclusive (with possible repetitions).
Then the finite collection of open sets $G_{\alpha_0}, G_{\alpha_1}, G_{\alpha_2}, \dots, G_{\alpha_n}$
(after removing repetitions) constitutes a finite subcover relative to the open
cover $\mathcal{G}$ of $K$. Thus we have shown that every open cover of $K$ has a
finite subcover.
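The counting step in Example 2 can be made concrete. In the sketch below, the function name `points_outside_ball` and the radius $r = 3/100$ are hypothetical choices made for illustration; the code only lists the indices $k$ for which $\frac{1}{k}$ lies outside $B(0, r)$, which are the points that need their own sets $G_{\alpha_k}$.

```python
from fractions import Fraction
from math import floor


def points_outside_ball(r):
    """Indices k >= 1 with 1/k not in B(0, r), i.e. 1/k >= r, i.e. k <= 1/r."""
    n = floor(1 / r)
    return list(range(1, n + 1))


r = Fraction(3, 100)              # illustrative radius with B(0, r) inside G_{alpha_0}
outside = points_outside_ball(r)  # finitely many indices: k = 1, ..., floor(1/r)
```

Every remaining point $\frac{1}{k}$ with $k > \lfloor \frac{1}{r} \rfloor$, together with $0$, is already covered by the single set $G_{\alpha_0}$.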
It is instructive to observe that $K = \{0\} \cup L$ where $L = \{\frac{1}{k}\}_{k=1}^{\infty}$
is not compact (since the pairwise disjoint balls
$B(\frac{1}{k}, \frac{1}{4k^2}) = (\frac{1}{k} - \frac{1}{4k^2}, \frac{1}{k} + \frac{1}{4k^2})$
cover $L$ one point at a time). Thus the addition of the single limit point $0$
to the set $L$ resulted in making the union compact. The argument given as proof
in the above example serves to illustrate the sense in which the set $K$ is
topologically almost a finite set.
As a final example to illustrate the concept of compactness, we show that any
unbounded set in $\mathbb{R}^n$ fails to be compact. We say that a subset $K$ of
$\mathbb{R}^n$ is bounded if there is some ball $B(x, r)$ in $\mathbb{R}^n$ that
contains $K$. So now suppose that $K$ is unbounded. Fix a point $x \in \mathbb{R}^n$
and consider the open cover $\{B(x, n)\}_{n=1}^{\infty}$ of $K$ (this is actually an
open cover of $\mathbb{R}^n$). Now if there were a finite subcover, say
$\{B(x, n_k)\}_{k=1}^{N}$ where $n_1 < n_2 < \dots < n_N$, then because the balls are
increasing,
$$K \subset \bigcup_{k=1}^{N} B(x, n_k) = B(x, n_N),$$
which contradicts the assumption that $K$ is unbounded. We record this fact in
the following lemma.
Lemma 1. A compact subset of $\mathbb{R}^n$ is bounded.
Remark 1. We can now preview one of the major themes in our development of
analysis. The Least Upper Bound Property of the real numbers will lead directly
to the following beautiful characterization of compactness in the metric space
$\mathbb{R}$ of real numbers, the Heine-Borel theorem: a subset $K$ of $\mathbb{R}$
is compact if and only if $K$ is closed and bounded.
1. Properties of compact sets
We now prove a number of properties that hold for compact sets in Euclidean
space $\mathbb{R}^n$.
Lemma 2. If $K$ is a compact subset of $\mathbb{R}^n$, then $K$ is a closed subset
of $\mathbb{R}^n$.
Proof: We show that $K^c$ is open. So fix a point $x \in K^c$. For each point
$y \in K$, consider the ball $B(y, r_y)$ with
$$(1.1) \qquad r_y \equiv \frac{1}{2} d(x, y).$$
Since $\{B(y, r_y)\}_{y \in K}$ is an open cover of the compact set $K$, there is a
finite subcover $\{B(y_k, r_{y_k})\}_{k=1}^{n}$ with of course $y_k \in K$ for
$1 \le k \le n$. Now by the triangle inequality and (1.1) it follows that
$$(1.2) \qquad B(x, r_{y_k}) \cap B(y_k, r_{y_k}) = \emptyset, \qquad 1 \le k \le n.$$
Indeed, if the intersection on the left side of (1.2) contained a point $z$ then
we would have the contradiction
$$d(x, y_k) \le d(x, z) + d(z, y_k) < r_{y_k} + r_{y_k} = d(x, y_k).$$
Now we simply take $r = \min\{r_{y_k}\}_{k=1}^{n} > 0$ and note that
$B(x, r) \subset B(x, r_{y_k})$ so that
$$B(x, r) \cap K \subset B(x, r) \cap \Big( \bigcup_{k=1}^{n} B(y_k, r_{y_k}) \Big) = \bigcup_{k=1}^{n} \{ B(x, r) \cap B(y_k, r_{y_k}) \}$$
$$\subset \bigcup_{k=1}^{n} \{ B(x, r_{y_k}) \cap B(y_k, r_{y_k}) \} = \bigcup_{k=1}^{n} \emptyset = \emptyset,$$
by (1.2). This shows that $B(x, r) \subset K^c$ and completes the proof that $K^c$
is open. Draw a picture of this proof!
Lemma 3. If $F \subset K \subset \mathbb{R}^n$ where $F$ is closed in $\mathbb{R}^n$
and $K$ is compact, then $F$ is compact.
Proof: Let $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ be an open cover of $F$. We must
construct a finite subcover $\mathcal{S}$ of $F$. Now $\mathcal{G}^* = \{F^c\} \cup \mathcal{G}$
is an open cover of $K$. By compactness of $K$ there is a finite subcover
$\mathcal{S}^*$ of $\mathcal{G}^*$ that consists of sets from $\mathcal{G}$ and possibly
the set $F^c$. However, if we drop the set $F^c$ from the subcover $\mathcal{S}^*$,
the resulting finite collection of sets $\mathcal{S}$ from $\mathcal{G}$ is still a
cover of $F$ (although not necessarily of $K$), and provides the required finite
subcover of $F$.
Corollary 1. If $F$ is closed and $K$ is compact, then $F \cap K$ is compact.
Proof: We have that $K$ is closed by Lemma 2, and so $F \cap K$ is closed. Now
$F \cap K \subset K$ and so Lemma 3 shows that $F \cap K$ is compact.
Remark 2. With respect to unions, compact sets behave like finite sets, namely
the union of finitely many compact sets is compact. Indeed, suppose $K$ and $L$
are compact subsets of a metric space, and let $\{G_\alpha\}_{\alpha \in A}$ be an open
cover of $K \cup L$. Then there is a finite subcover $\{G_\alpha\}_{\alpha \in I}$ of $K$
and also a (usually different) finite subcover $\{G_\alpha\}_{\alpha \in J}$ of $L$ (here
$I$ and $J$ are finite subsets of $A$). But then the union of these covers
$\{G_\alpha\}_{\alpha \in I \cup J} = \{G_\alpha\}_{\alpha \in I} \cup \{G_\alpha\}_{\alpha \in J}$ is a
finite subcover of $K \cup L$, which shows that $K \cup L$ is compact.
Now we come to one of the most useful consequences of compactness in
applications. A family of sets $\{E_\alpha\}_{\alpha \in A}$ is said to have the finite
intersection property if
$$\bigcap_{\alpha \in F} E_\alpha \ne \emptyset$$
for every finite subset $F$ of the index set $A$. For example the family of open
intervals $\{(0, \frac{1}{n})\}_{n=1}^{\infty}$ has the finite intersection property
despite the fact that the sets have no element in common:
$\bigcap_{n=1}^{\infty} (0, \frac{1}{n}) = \emptyset$. The useful consequence of
compactness referred to above is that this cannot happen for compact subsets!
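The behaviour of the family $\{(0, \frac{1}{n})\}_{n=1}^{\infty}$ can be sketched in a few lines of exact arithmetic. The helper names `finite_intersection` and `first_interval_missing` are hypothetical, introduced only for this illustration: the first returns the (nonempty) intersection of finitely many of the intervals, the second finds, for a given rational $x > 0$, an interval of the family that fails to contain $x$.

```python
from fractions import Fraction


def finite_intersection(indices):
    """Intersection of the intervals (0, 1/n), n in indices, as endpoints (0, right)."""
    # The intervals are nested, so the intersection is the one with the largest n.
    return Fraction(0), Fraction(1, max(indices))


def first_interval_missing(x):
    """Smallest n >= 1 with x not in (0, 1/n), for a rational x > 0."""
    n = 1
    while x < Fraction(1, n):
        n += 1
    return n
```

Since every $x > 0$ is eventually excluded, no point lies in all of the intervals, even though every finite subfamily has a nonempty intersection.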
Theorem 1. Suppose that $\{K_\alpha\}_{\alpha \in A}$ is a family of compact sets with the
finite intersection property. Then
$$\bigcap_{\alpha \in A} K_\alpha \ne \emptyset.$$
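Proof: Fix a member $K_{\alpha_0}$ of the family $\{K_\alpha\}_{\alpha \in A}$. Assume, in
order to derive a contradiction, that no point of $K_{\alpha_0}$ belongs to every
$K_\alpha$. Then the open sets $\{K_\alpha^c\}_{\alpha \in A \setminus \{\alpha_0\}}$ form an open
cover of $K_{\alpha_0}$. By compactness, there is a finite subcover
$\{K_\alpha^c\}_{\alpha \in F \setminus \{\alpha_0\}}$ with $F$ finite, so that
$$K_{\alpha_0} \subset \bigcup_{\alpha \in F \setminus \{\alpha_0\}} K_\alpha^c,$$
i.e.
$$K_{\alpha_0} \cap \bigcap_{\alpha \in F \setminus \{\alpha_0\}} K_\alpha = \emptyset,$$
which contradicts our assumption that the finite intersection property holds.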
Corollary 2. If $\{K_n\}_{n=1}^{\infty}$ is a nonincreasing sequence of nonempty
compact sets, i.e. $K_{n+1} \subset K_n$ for all $n \ge 1$, then
$\bigcap_{n=1}^{\infty} K_n \ne \emptyset$.
Theorem 2. If $E$ is an infinite subset of a compact set $K$, then $E$ has a limit
point in $K$.
Proof: Suppose, in order to derive a contradiction, that no point of $K$ is a
limit point of $E$. Then for each $z \in K$, there is a ball $B(z, r_z)$ that
contains at most one point of $E$ (namely $z$ if $z$ is in $E$). Thus it is not
possible for a finite number of these balls $B(z, r_z)$ to cover the infinite set
$E$. Thus $\{B(z, r_z)\}_{z \in K}$ is an open cover of $K$ that has no finite
subcover (since a finite subcover cannot cover even the subset $E$ of $K$). This
contradicts the assumption that $K$ is compact.
The Least Upper Bound Property of the real numbers plays a crucial role in the
proof that closed bounded intervals are compact.
Theorem 3. The closed interval $[a, b]$ is compact (with the usual metric) for
all $a < b$.
We give two proofs of this basic theorem. The second proof will be generalized
to prove that closed bounded rectangles in $\mathbb{R}^n$ are compact.
Proof #1: Assume for convenience that the interval is the closed unit interval
$[0, 1]$, and suppose that $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $[0, 1]$. Now
$1 \in G_\alpha$ for some $\alpha \in A$, and thus there is $r > 0$ such that
$(1 - r, 1 + r) \subset G_\alpha$. With $a = 1 + \frac{r}{2} > 1$ it follows that
$\{G_\alpha\}_{\alpha \in A}$ is an open cover of $[0, a]$. Now define
$$E = \{x \in [0, a] : \text{the interval } [0, x] \text{ has a finite subcover}\}.$$
We have that $E$ is nonempty ($0 \in E$) and bounded above (by $a$). Thus
$\lambda \equiv \sup E$ exists. We claim that $\lambda > 1$. Suppose for the moment that
this has been proved. Then $1$ cannot be an upper bound of $E$ and so there is
some $\beta \in E$ satisfying
$$1 < \beta \le \lambda.$$
Thus by the definition of the set $E$ it follows that $[0, \beta]$ has a finite
subcover, and hence so does $[0, 1]$, which completes the proof of the theorem.
Now suppose, in order to derive a contradiction, that $\lambda \le 1$. Then there is
some open set $G_\gamma$ with $\gamma \in A$ and also some $s > 0$ such that
$$(\lambda - s, \lambda + s) \subset G_\gamma.$$
Now by the definition of least upper bound, there is some $x \in E$ satisfying
$\lambda - s < x \le \lambda$, and by taking $s$ less than $a - 1$ we can also arrange to
have
$$\lambda + s \le 1 + s < a.$$
Thus there is a finite subcover $\{G_{\alpha_k}\}_{k=1}^{n}$ of $[0, x]$, and if we include
the set $G_\gamma$ with this subcover we get a finite subcover of
$[0, \lambda + \frac{s}{2}]$. This shows that $\lambda + \frac{s}{2} \in E$, which contradicts
our assumption that $\lambda$ is an upper bound of $E$, and completes the proof of the
theorem.
Proof #2: Suppose, in order to derive a contradiction, that there is an open
cover $\{G_\alpha\}_{\alpha \in A}$ of $[a, b]$ that has no finite subcover. Then at least
one of the two intervals $[a, \frac{a+b}{2}]$ and $[\frac{a+b}{2}, b]$ fails to have
a finite subcover. Label it $[a_1, b_1]$ so that
$$a \le a_1 < b_1 \le b, \qquad b_1 - a_1 = \frac{1}{2} \delta,$$
where $\delta = b - a$. Next we note that at least one of the two intervals
$[a_1, \frac{a_1 + b_1}{2}]$ and $[\frac{a_1 + b_1}{2}, b_1]$ fails to have a finite
subcover. Label it $[a_2, b_2]$ so that
$$a \le a_1 \le a_2 < b_2 \le b_1 \le b, \qquad b_2 - a_2 = \frac{1}{4} \delta.$$
Continuing in this way we obtain for each $n \ge 2$ an interval $[a_n, b_n]$ such
that
$$(1.3) \qquad a \le a_1 \le \dots \le a_{n-1} \le a_n < b_n \le b_{n-1} \le \dots \le b_1 \le b, \qquad b_n - a_n = \frac{1}{2^n} \delta.$$
Now let $E = \{a_n : n \ge 1\}$ and set $x \equiv \sup E$. From (1.3) we obtain that
each $b_n$ is an upper bound for $E$, hence $x \le b_n$ and we have
$$a \le a_n \le x \le b_n \le b, \qquad \text{for all } n \ge 1,$$
i.e. $x \in [a_n, b_n]$ for all $n \ge 1$. Now $x \in [a, b]$ and so there is
$\beta \in A$ and $r > 0$ such that
$$(x - r, x + r) \subset G_\beta.$$
By the Archimedean property of $\mathbb{R}$ we can choose $n \in \mathbb{N}$ so large
that $\frac{\delta}{r} < n < 2^n$ (it is easy to prove $n < 2^n$ for all
$n \in \mathbb{N}$ by induction), so that $b_n - a_n = \frac{\delta}{2^n} < r$, and hence
$$[a_n, b_n] \subset (x - r, x + r) \subset G_\beta.$$
But this contradicts our construction that $[a_n, b_n]$ has no finite subcover,
and completes the proof of the theorem.
Corollary 3. A subset $K$ of the real numbers $\mathbb{R}$ is compact if and only
if $K$ is closed and bounded.
Proof: Suppose that $K$ is compact. Then $K$ is bounded by Lemma 1 and is closed
by Lemma 2. Conversely if $K$ is bounded, then $K \subset [-a, a]$ for some
$a > 0$. Now $[-a, a]$ is compact by Theorem 3, and if $K$ is closed, then
Lemma 3 shows that $K$ is compact.
Proof #2 of Theorem 3 is easily adapted to prove that closed rectangles
$$R = \prod_{k=1}^{n} [a_k, b_k] = [a_1, b_1] \times \dots \times [a_n, b_n]$$
in $\mathbb{R}^n$ are compact.
Theorem 4. The closed rectangle $R = \prod_{k=1}^{n} [a_k, b_k]$ is compact (with
the usual metric) for all $a_k < b_k$, $1 \le k \le n$.
Proof: Here is a brief sketch of the proof. Suppose, in order to derive a
contradiction, that there is an open cover $\{G_\alpha\}_{\alpha \in A}$ of $R$ that has no
finite subcover. It is convenient to write $R$ as a product of closed intervals
with superscripts instead of subscripts: $R = \prod_{k=1}^{n} [a^k, b^k]$. Now
divide $R$ into $2^n$ congruent closed rectangles. At least one of them fails to
have a finite subcover. Label it $R_1 = \prod_{k=1}^{n} [a_1^k, b_1^k]$, and repeat
the process to obtain a sequence of decreasing rectangles
$R_m = \prod_{k=1}^{n} [a_m^k, b_m^k]$ with
$$a^k \le a_1^k \le \dots \le a_{m-1}^k \le a_m^k < b_m^k \le b_{m-1}^k \le \dots \le b_1^k \le b^k, \qquad b_m^k - a_m^k = \frac{1}{2^m} \delta^k,$$
where $\delta^k = b^k - a^k$, $1 \le k \le n$. Then if we set
$x^k = \sup\{a_m^k : m \ge 1\}$ we obtain that
$x = (x^1, \dots, x^n) \in R_m \subset R$ for all $m$. Thus there is $\beta \in A$,
$r > 0$ and $m \ge 1$ such that
$$R_m \subset B(x, r) \subset G_\beta,$$
contradicting our construction that $R_m$ has no finite subcover.
Theorem 5. Let $K$ be a subset of Euclidean space $\mathbb{R}^n$. Then the following
three conditions are equivalent:
(1) $K$ is closed and bounded;
(2) $K$ is compact;
(3) every infinite subset of $K$ has a limit point in $K$.
Proof: We prove that (1) implies (2) implies (3) implies (1). If $K$ is closed
and bounded, then it is contained in a closed rectangle $R$, and is thus compact
by Theorem 4 and Lemma 3. If $K$ is compact, then every infinite subset of $K$
has a limit point in $K$ by Theorem 2. Finally suppose that every infinite
subset of $K$ has a limit point in $K$. Suppose first, in order to derive a
contradiction, that $K$ is not bounded. Then there is a sequence
$\{x_k\}_{k=1}^{\infty}$ of points in $K$ with $|x_k| \ge k$ for all $k$. Clearly the
set of points in $\{x_k\}_{k=1}^{\infty}$ is an infinite subset $E$ of $K$ but has no
limit point in $\mathbb{R}^n$, hence not in $K$ either. Suppose next, in order to
derive a contradiction, that $K$ is not closed. Then there is a limit point $x$
of $K$ that is not in $K$. Thus each deleted ball $B'(x, \frac{1}{k})$ contains
some point $x_k$ from $K$. Again it is clear that the set of points in the
sequence $\{x_k\}_{k=1}^{\infty}$ is an infinite subset of $K$ but contains no limit
point in $K$ since its only limit point is $x$ and this is not in $K$.
Corollary 4. Every bounded infinite subset of $\mathbb{R}^n$ has a limit point in
$\mathbb{R}^n$.
2. The Cantor set
We now construct the Cantor middle thirds set (1883). This famous fractal set
arises as a counterexample to many conjectures in analysis. We start with the
closed unit interval $I = I^0 = [0, 1]$. Now remove the open middle third
$(\frac{1}{3}, \frac{2}{3})$ of length $\frac{1}{3}$ and denote the two remaining
closed intervals of length $\frac{1}{3}$ by $I_1^1 = [0, \frac{1}{3}]$ and
$I_2^1 = [\frac{2}{3}, 1]$. Then remove the open middle third
$(\frac{1}{9}, \frac{2}{9})$ of length $\frac{1}{3^2}$ from $I_1^1 = [0, \frac{1}{3}]$
and denote the two remaining closed intervals of length $\frac{1}{3^2}$ by
$I_1^2$ and $I_2^2$. Do the same for $I_2^1$ and denote the two remaining closed
intervals by $I_3^2$ and $I_4^2$.
Continuing in this way, we obtain at the $k$-th generation a collection
$\{I_j^k\}_{j=1}^{2^k}$ of $2^k$ pairwise disjoint closed intervals of length
$\frac{1}{3^k}$. Let $K_k = \bigcup_{j=1}^{2^k} I_j^k$ and set
$$E = \bigcap_{k=1}^{\infty} K_k = \bigcap_{k=1}^{\infty} \Big( \bigcup_{j=1}^{2^k} I_j^k \Big).$$
Now each set $K_k$ is closed, and hence so is the intersection $E$. Then $E$ is
compact by Corollary 3. It also follows from Corollary 2 that $E$ is nonempty.
Next we observe that by its very construction, $E$ is a fractal satisfying the
replication identity
$$3E = E \cup (E + 2) = E_1 \cup E_2.$$
Thus the fractal dimension $\alpha$ of the Cantor set $E$ is $\frac{\ln 2}{\ln 3}$.
Moreover, $E$ has the property of being perfect.
Definition 2. A subset $E$ of $\mathbb{R}^n$ is perfect if $E$ is closed and every
point in $E$ is a limit point of $E$.
To see that the Cantor set is perfect, pick $x \in E$. For each $k \ge 1$ the
point $x$ lies in exactly one of the closed intervals $I_j^k$ for some $j$ between
$1$ and $2^k$. Since the length of $I_j^k$ is positive, it is possible to choose a
point $x_k \in E \cap I_j^k \setminus \{x\}$. Now the set of points in the sequence
$\{x_k\}_{k=1}^{\infty}$ is an infinite subset of $E$ and clearly has $x$ as a limit
point. This completes the proof that the Cantor set $E$ is perfect.
By summing the lengths of the removed open middle thirds, we obtain
$$\text{length}([0, 1] \setminus E) = \frac{1}{3} + \frac{2}{3^2} + \frac{2^2}{3^3} + \dots = 1,$$
and it follows that $E$ is nonempty, compact and has length $1 - 1 = 0$. Another
way to exhibit the same phenomenon is to note that for each $k \ge 1$ the Cantor
set $E$ is a subset of the closed set $K_k$ which is a union of $2^k$ intervals
each having length $\frac{1}{3^k}$. Thus the length of $K_k$ is
$2^k \frac{1}{3^k} = (\frac{2}{3})^k$, and the length of $E$ is at most
$$\inf \Big\{ \Big( \frac{2}{3} \Big)^k : k \ge 1 \Big\} = 0.$$
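The two length computations for the Cantor construction can be checked generation by generation. The sketch below uses exact fractions; the function names `removed_length` and `remaining_length` are hypothetical names chosen for this illustration. It relies only on the facts stated above: step $j$ removes $2^{j-1}$ open intervals of length $\frac{1}{3^j}$, and $K_k$ consists of $2^k$ intervals of length $\frac{1}{3^k}$.

```python
from fractions import Fraction


def removed_length(generations):
    """Total length of the open middle thirds removed in the first `generations` steps."""
    # Step j removes 2**(j - 1) intervals, each of length 1 / 3**j.
    return sum(Fraction(2 ** (j - 1), 3 ** j) for j in range(1, generations + 1))


def remaining_length(generations):
    """Length of K_k: 2**k intervals, each of length 1 / 3**k."""
    return Fraction(2, 3) ** generations


# At every generation the removed and remaining lengths add up to 1.
checks = [removed_length(k) + remaining_length(k) for k in (1, 2, 5, 20)]
```

The partial sums of the removed lengths equal $1 - (\frac{2}{3})^k$, which increases to $1$ as the remaining length $(\frac{2}{3})^k$ decreases to $0$.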
In contrast to this phenomenon that the length of $E$ is quite small, the
cardinality of $E$ is quite large, namely $E$ is uncountable, as is every
nonempty perfect subset of a metric space with the Heine-Borel property: every
closed and bounded subset is compact.
Theorem 6. Every nonempty perfect subset of $\mathbb{R}^n$ is uncountable.
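Proof: Suppose that $P$ is a nonempty perfect subset of $\mathbb{R}^n$. Since $P$
has a limit point it must be infinite. Now assume, in order to derive a
contradiction, that $P$ is countable, say $P = \{x_n\}_{n=1}^{\infty}$. Start with any
point $y_1 \in P$ that is not $x_1$ and the ball $B_1 \equiv B(y_1, r_1)$ where
$r_1 = \frac{d(x_1, y_1)}{2}$. We have
$$B_1 \cap P \ne \emptyset \quad \text{and} \quad x_1 \notin \overline{B_1}.$$
Then there is a point $y_2 \in B_1' \cap P$ that is not $x_2$ and so we can choose
a ball $B_2$ such that
$$B_2 \cap P \ne \emptyset \quad \text{and} \quad x_2 \notin \overline{B_2} \quad \text{and} \quad \overline{B_2} \subset B_1.$$
Indeed, we can take $B_2 = B(y_2, r_2)$ where
$r_2 = \frac{\min\{d(x_2, y_2),\ r_1 - d(y_1, y_2)\}}{2}$. Continuing in this way we
obtain balls $B_k$ satisfying
$$B_k \cap P \ne \emptyset \quad \text{and} \quad x_k \notin \overline{B_k} \quad \text{and} \quad \overline{B_k} \subset B_{k-1}, \qquad k \ge 2.$$
Now each closed set $\overline{B_k} \cap P$ is nonempty and compact, and so by
Corollary 2 we have
$$\bigcap_{k=1}^{\infty} \big( \overline{B_k} \cap P \big) \ne \emptyset, \quad \text{say } x \in \Big( \bigcap_{k=1}^{\infty} \overline{B_k} \Big) \cap P.$$
However, by construction we have $x_n \notin \overline{B_n}$ for all $n$ and since
the sets $\overline{B_n}$ are decreasing, we see that
$x_n \notin \bigcap_{k=1}^{\infty} \overline{B_k}$ for all $n$; hence $x \ne x_n$ for all
$n$. This contradicts $P = \{x_n\}_{n=1}^{\infty}$ and completes the proof of the
theorem.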
3. Exercises
Exercise 1. Let $K$ be the set of all $x$ in $[0, 1]$ whose decimal expansion
consists of only 4's and 7's, e.g. $x = .74474777474447\dots$. Is $K$ compact?
perfect? countable? Prove your answers.
Exercise 2. Suppose that $\{F_k\}_{k=1}^{\infty}$ is a sequence of closed subsets of
$\mathbb{R}^n$. If $\mathbb{R}^n = \bigcup_{k=1}^{\infty} F_k$, prove that at least one of
the sets $F_k$ has nonempty interior. Hint: Mimic the proof of the theorem above
on perfect sets.
CHAPTER 2
Continuous functions
We initially examine the connection between continuity and sequences, and
after that between continuity and open sets. Central to all of this is the concept of
limit of a function.
1. Limits
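Definition 3. Suppose that $f : X \to \mathbb{R}^n$ is a function from a subset $X$
of $\mathbb{R}^m$ into $\mathbb{R}^n$. Let $p \in \mathbb{R}^m$ be a limit point of $X$
and suppose that $q \in \mathbb{R}^n$. Then
$$\lim_{x \to p} f(x) = q$$
if for every $\varepsilon > 0$ there is $\delta > 0$ such that
$$(1.1) \qquad d_{\mathbb{R}^n}(f(x), q) < \varepsilon \ \text{ whenever } x \in X \setminus \{p\} \text{ and } d_{\mathbb{R}^m}(x, p) < \delta.$$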
Note that the concept of a limit of $f$ at a point $p$ is only defined when $p$ is
a limit point of the set $X$ on which $f$ is defined. Do not confuse this notion
with the definition of limit of a sequence $s = \{s_n\}_{n=1}^{\infty}$ in
$\mathbb{R}^n$. In this latter definition, $s$ is a function from the natural
numbers $\mathbb{N}$ into $\mathbb{R}^n$, but the limit point $p$ is replaced by the
symbol $\infty$. Here is a characterization of limit of a function in terms of
limits of sequences.
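Theorem 7. Suppose that $f : X \to \mathbb{R}^n$ is a function from a subset $X$ of
$\mathbb{R}^m$ into $\mathbb{R}^n$. Let $p \in \mathbb{R}^m$ be a limit point of $X$ and
suppose that $q \in \mathbb{R}^n$. Then $\lim_{x \to p} f(x) = q$ if and only if
$$\lim_{k \to \infty} f(s_k) = q$$
for all sequences $\{s_k\}_{k=1}^{\infty}$ in $X \setminus \{p\}$ such that
$$\lim_{k \to \infty} s_k = p.$$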
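Proof: Suppose first that $\lim_{x \to p} f(x) = q$. Now assume that
$\{s_k\}_{k=1}^{\infty}$ is a sequence in $X \setminus \{p\}$ such that
$\lim_{k \to \infty} s_k = p$. Then given $\varepsilon > 0$ there is $\delta > 0$ such that
(1.1) holds. Furthermore we can find $N$ so large that
$d_{\mathbb{R}^m}(s_k, p) < \delta$ whenever $k \ge N$. Combining inequalities with the
fact that $s_k \in X \setminus \{p\}$ gives
$$d_{\mathbb{R}^n}(f(s_k), q) < \varepsilon \ \text{ whenever } k \ge N,$$
which proves $\lim_{k \to \infty} f(s_k) = q$.
Suppose next that $\lim_{x \to p} f(x) = q$ fails. The negation of Definition 3 is
that there exists an $\varepsilon > 0$ such that for every $\delta > 0$ we have
$$(1.2) \qquad d_{\mathbb{R}^n}(f(x), q) \ge \varepsilon \ \text{ for some } x \in X \setminus \{p\} \text{ with } d_{\mathbb{R}^m}(x, p) < \delta.$$
So fix such an $\varepsilon > 0$ and for each $\delta = \frac{1}{k} > 0$ choose a point
$s_k \in X \setminus \{p\}$ with $d_{\mathbb{R}^m}(s_k, p) < \frac{1}{k}$ and
$d_{\mathbb{R}^n}(f(s_k), q) \ge \varepsilon$. Then $\{s_k\}_{k=1}^{\infty}$ is a sequence in
$X \setminus \{p\}$ converging to $p$ such that the sequence
$\{f(s_k)\}_{k=1}^{\infty}$ does not converge to $q$; indeed,
$d_{\mathbb{R}^n}(f(s_k), q) \ge \varepsilon > 0$ for all $k \ge 1$.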
2. Continuity
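Definition 4. Let $X$ be a subset of $\mathbb{R}^m$ and suppose that
$f : X \to \mathbb{R}^n$ is a function from $X$ to $\mathbb{R}^n$. Let $p \in X$. Then
$f$ is continuous at $p$ if for every $\varepsilon > 0$ there is $\delta > 0$ such that
$$(2.1) \qquad d_{\mathbb{R}^n}(f(x), f(p)) < \varepsilon \ \text{ whenever } x \in X \text{ and } d_{\mathbb{R}^m}(x, p) < \delta.$$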
Note that (2.1) says
$$(2.2) \qquad f(B(p, \delta) \cap X) \subset B(f(p), \varepsilon).$$
There are only two possibilities for the point $p \in X$; either $p$ is a limit
point of $X$ or $p$ is isolated in $X$ (a point $x$ in $X$ is isolated in $X$ if
there is a deleted ball $B'(x, r)$ that has empty intersection with $X$). In the
case that $p$ is a limit point of $X$, then $f$ is continuous at $p$ if and only if
$\lim_{x \to p} f(x)$ exists and the limit is $f(p)$, i.e.
$$(2.3) \qquad \lim_{x \to p} f(x) = f(p).$$
On the other hand, if $p$ is an isolated point of $X$, then $f$ is automatically
continuous at $p$ since (2.1) holds for all $\varepsilon > 0$ with $\delta = r$ where
$B'(p, r) \cap X = \emptyset$. From these remarks together with Theorem 7, we
immediately obtain the following characterization of continuity in terms of
sequences.
Theorem 8. Suppose that $f : X \to \mathbb{R}^n$ is a function from a subset $X$ of
$\mathbb{R}^m$ into $\mathbb{R}^n$. Let $p \in X$. Then $f$ is continuous at $p$ if and
only if
$$\lim_{k \to \infty} f(s_k) = f(p)$$
for all sequences $\{s_k\}_{k=1}^{\infty}$ in $X \setminus \{p\}$ such that
$$\lim_{k \to \infty} s_k = p.$$
Remark 3. The theorem remains true if we permit the sequences
$\{s_k\}_{k=1}^{\infty}$ to lie in $X$ rather than in $X \setminus \{p\}$.
There is an alternate characterization of continuity of $f : X \to Y$ in terms of
relative open sets, which is particularly useful in connection with compact sets
and continuity of inverse functions. Recall that a subset $Q$ of $X$ is said to be
relatively open in $X$ if there is an open set $G$ in $\mathbb{R}^m$ such that
$Q = G \cap X$. We will often drop the adverb relatively.
Theorem 9. Suppose that $f : X \to Y$ is a function from a subset $X$ of
$\mathbb{R}^m$ into a subset $Y$ of $\mathbb{R}^n$. Then $f$ is continuous on $X$ if and
only if
$$(2.4) \qquad f^{-1}(G) \text{ is open in } X \text{ for every } G \text{ that is open in } Y.$$
Corollary 5. Suppose that $f : X \to Y$ is a continuous function from a compact
subset $X$ of $\mathbb{R}^m$ into a subset $Y$ of $\mathbb{R}^n$. Then $f(X)$ is a compact
subset of $\mathbb{R}^n$.
Corollary 6. Suppose that $f : X \to Y$ is a continuous function from a compact
subset $X$ of $\mathbb{R}^m$ to a subset $Y$ of $\mathbb{R}^n$. If $f$ is both one-to-one
and onto, then the inverse function $f^{-1} : Y \to X$ defined by
$$f^{-1}(y) = x \ \text{ where } x \text{ is the unique point in } X \text{ satisfying } f(x) = y,$$
is a continuous map.
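Proof (of Corollary 5): If $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $f(X)$, then
$\{f^{-1}(G_\alpha)\}_{\alpha \in A}$ is an open cover of $X$, hence has a finite subcover
$\{f^{-1}(G_{\alpha_k})\}_{k=1}^{N}$. But then $\{G_{\alpha_k}\}_{k=1}^{N}$ is a finite subcover
of $f(X)$ since
$$f(X) \subset f\Big( \bigcup_{k=1}^{N} f^{-1}(G_{\alpha_k}) \Big) \subset \bigcup_{k=1}^{N} f\big( f^{-1}(G_{\alpha_k}) \big) \subset \bigcup_{k=1}^{N} G_{\alpha_k}.$$
Note that it is not in general true that $f^{-1}(f(G)) \subset G$.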
Proof (of Corollary 6): Let $G$ be an open subset of $X$. We must show that
$(f^{-1})^{-1}(G)$ is open in $Y$. Note that since $f$ is one-to-one and onto, we
have $(f^{-1})^{-1}(G) = f(G)$. Now $G^c = X \setminus G$ is closed in $X$, hence
compact, and so Corollary 5 shows that $f(G^c)$ is compact, hence closed in $Y$,
so $f(G^c)^c$ is open in $Y$. But again using that $f$ is one-to-one and onto
shows that $f(G) = f(G^c)^c$, and so we are done.
Remark 4. Compactness is essential in this corollary, as shown by the map
$$f : [0, 2\pi) \to \mathbb{T} \equiv \{z \in \mathbb{C} : |z| = 1\} \ \text{ defined by } f(\theta) = e^{i\theta} = (\cos \theta, \sin \theta),$$
which takes $[0, 2\pi)$ one-to-one and continuously onto $\mathbb{T}$, yet whose
inverse map fails to be continuous at $z = 1$. Indeed, for points $z$ on the
circle just below $1$, $f^{-1}(z)$ is close to $2\pi$, while $f^{-1}(1) = 0$.
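The jump of the inverse map at $z = 1$ can be seen numerically. This is a floating-point sketch only; the names `f` and `f_inverse`, the use of `math.atan2` to recover the angle, and the offset `1e-3` are assumptions made for the illustration.

```python
import math


def f(theta):
    """The map theta -> (cos theta, sin theta) from [0, 2*pi) onto the unit circle."""
    return math.cos(theta), math.sin(theta)


def f_inverse(point):
    """The angle in [0, 2*pi) of a point on the unit circle."""
    x, y = point
    return math.atan2(y, x) % (2 * math.pi)


just_below = f(2 * math.pi - 1e-3)   # a point on the circle just below 1
at_one = f(0.0)                      # the point 1 itself

# The two points are close on the circle, but their preimages are far apart.
gap_on_circle = math.hypot(just_below[0] - at_one[0], just_below[1] - at_one[1])
gap_in_angles = f_inverse(just_below) - f_inverse(at_one)
```

Shrinking the offset makes `gap_on_circle` as small as desired while `gap_in_angles` stays near $2\pi$, which is the failure of continuity described in the remark.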
Proof (of Theorem 9): Suppose first that $f$ is continuous on $X$. We must show
that (2.4) holds. So let $G$ be an open subset of $Y$. We must now show that for
every $p \in f^{-1}(G)$ there is $r > 0$ (depending on $p$) such that
$B(p, r) \subset f^{-1}(G)$. Fix $p \in f^{-1}(G)$. Since $G$ is open and
$f(p) \in G$ we can pick $\varepsilon > 0$ such that $B(f(p), \varepsilon) \subset G$. But
then by the continuity of $f$ there is $\delta > 0$ such that (2.2) holds, i.e.
$f(B(p, \delta)) \subset B(f(p), \varepsilon) \subset G$. It follows that
$$B(p, \delta) \subset f^{-1}(f(B(p, \delta))) \subset f^{-1}(G).$$
Conversely suppose that (2.4) holds. We must show that $f$ is continuous at every
$p \in X$. So fix $p \in X$. We must now show that for every $\varepsilon > 0$ there is
$\delta > 0$ such that (2.2) holds, i.e. $f(B(p, \delta)) \subset B(f(p), \varepsilon)$. Fix
$\varepsilon > 0$. Since $B(f(p), \varepsilon)$ is open, we have that
$f^{-1}(B(f(p), \varepsilon))$ is open by (2.4). Since $p \in f^{-1}(B(f(p), \varepsilon))$
there is thus $\delta > 0$ such that $B(p, \delta) \subset f^{-1}(B(f(p), \varepsilon))$. It
follows that
$$f(B(p, \delta)) \subset f\big( f^{-1}(B(f(p), \varepsilon)) \big) \subset B(f(p), \varepsilon).$$
We now show that continuity is stable under composition of maps.
Theorem 10. Suppose that $X, Y, Z$ are subsets of
$\mathbb{R}^m, \mathbb{R}^n, \mathbb{R}^p$ respectively. If $f : X \to Y$ and
$g : Y \to Z$ are both continuous maps, then so is the composition
$h = g \circ f : X \to Z$ defined by
$$h(x) = g(f(x)), \qquad x \in X.$$
Proof: If $G$ is open in $Z$, then
$$h^{-1}(G) = f^{-1}\big( g^{-1}(G) \big)$$
is open since $g$ continuous implies $g^{-1}(G)$ is open by Theorem 9, and then
$f$ continuous implies $f^{-1}(g^{-1}(G))$ is open by Theorem 9. Thus $h$ is
continuous by Theorem 9.
Continuity at a point is also easily handled using Definition 4. We leave the
proof of the following theorem to the reader.
Theorem 11. Suppose that $X, Y, Z$ are subsets of
$\mathbb{R}^m, \mathbb{R}^n, \mathbb{R}^p$ respectively. If $p \in X$ and
$f : X \to Y$ is continuous at $p$ and $g : f(X) \to Z$ is continuous at $f(p)$,
then the composition $h = g \circ f : X \to Z$ is continuous at $p$.
2.1. Real and complex-valued continuous functions. Here is an elementary
consequence of the familiar limit theorems for sums, differences, products and
quotients of complex-valued functions.
Proposition 1. If $f$ and $g$ are continuous complex-valued functions on a subset
$X$ of $\mathbb{R}^m$, then so are the functions $f + g$ and $fg$. If in addition $g$
never vanishes, then $\frac{f}{g}$ is also continuous on $X$.
Here is an extremely useful consequence of Corollary 5 when the target space $Y$
is the real numbers.
Theorem 12. Suppose that $X$ is a compact subset of $\mathbb{R}^m$ and
$f : X \to \mathbb{R}$ is continuous. Then there exist points $p, q \in X$ satisfying
$$f(p) = \sup f(X) \quad \text{and} \quad f(q) = \inf f(X).$$
Remark 5. Compactness of $X$ is essential here as evidenced by the following
example. If $X$ is the open interval $(0, 1)$ and $f : (0, 1) \to (0, 1)$ is the
identity map defined by $f(x) = x$, then $f$ is continuous and
$$\sup f((0, 1)) = \sup (0, 1) = 1,$$
$$\inf f((0, 1)) = \inf (0, 1) = 0.$$
However, there are no points $p, q \in (0, 1)$ satisfying either $f(p) = 1$ or
$f(q) = 0$.
Proof (of Theorem 12): Corollary 5 shows that $f(X)$ is compact. Lemmas 1 and 2
now show that $f(X)$ is a closed and bounded subset of $\mathbb{R}$. Thus
$\sup f(X)$ exists and $\sup f(X) \in f(X)$, i.e. there is $p \in X$ such that
$\sup f(X) = f(p)$. Similarly there is $q \in X$ satisfying $\inf f(X) = f(q)$.
Now consider a complex-valued function $f : X \to \mathbb{C}$ on a subset $X$ of
$\mathbb{R}^m$, and let $u : X \to \mathbb{R}$ and $v : X \to \mathbb{R}$ be the real and
imaginary parts of $f$ defined by
$$u(x) = \operatorname{Re} f(x) = \frac{f(x) + \overline{f(x)}}{2},$$
$$v(x) = \operatorname{Im} f(x) = \frac{f(x) - \overline{f(x)}}{2i},$$
for $x \in X$. It is easy to see that $f$ is continuous at a point $p \in X$ if and
only if each of $u$ and $v$ is continuous at $p$. Indeed, the inequalities
$$\max\{|a|, |b|\} \le \sqrt{|a|^2 + |b|^2} \le |a| + |b|$$
show that if (2.1) holds for $f$, i.e.
$$d_{\mathbb{C}}(f(x), f(p)) < \varepsilon \ \text{ whenever } d_X(x, p) < \delta,$$
then it also holds with $f$ replaced by $u$ or by $v$:
$$d_{\mathbb{R}}(u(x), u(p)) = |u(x) - u(p)| \le \sqrt{|u(x) - u(p)|^2 + |v(x) - v(p)|^2} = d_{\mathbb{C}}(f(x), f(p)) < \varepsilon$$
whenever $d_X(x, p) < \delta$.
Similarly, if (2.1) holds for both $u$ and $v$ then it holds for $f$ but with
$\varepsilon$ replaced by $2\varepsilon$:
$$d_{\mathbb{C}}(f(x), f(p)) = \sqrt{|u(x) - u(p)|^2 + |v(x) - v(p)|^2} \le |u(x) - u(p)| + |v(x) - v(p)|$$
$$= d_{\mathbb{R}}(u(x), u(p)) + d_{\mathbb{R}}(v(x), v(p)) < 2\varepsilon$$
whenever $d_X(x, p) < \delta$.
The same considerations apply equally well to Euclidean space $\mathbb{R}^n$
(recall that $\mathbb{C} = \mathbb{R}^2$ as metric spaces) and we have the following
theorem. Recall that the dot product of two vectors $\mathbf{z} = (z_1, \dots, z_n)$
and $\mathbf{w} = (w_1, \dots, w_n)$ in $\mathbb{R}^n$ is given by
$\mathbf{z} \cdot \mathbf{w} = \sum_{k=1}^{n} z_k w_k$.
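Theorem 13. Let $X$ be a subset of $\mathbb{R}^m$ and suppose
$\mathbf{f} : X \to \mathbb{R}^n$. Let $f_k(x)$ be the component functions defined by
$\mathbf{f}(x) = (f_1(x), \dots, f_n(x))$ for $1 \le k \le n$.
(1) The vector-valued function $\mathbf{f} : X \to \mathbb{R}^n$ is continuous at a
point $p \in X$ if and only if each component function $f_k : X \to \mathbb{R}$ is
continuous at $p$.
(2) If both $\mathbf{f} : X \to \mathbb{R}^n$ and $\mathbf{g} : X \to \mathbb{R}^n$ are
continuous at $p$ then so are $\mathbf{f} + \mathbf{g} : X \to \mathbb{R}^n$ and
$\mathbf{f} \cdot \mathbf{g} : X \to \mathbb{R}$.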
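Here are some simple facts associated with the component functions on Euclidean
space.
- For each $1 \le j \le n$, the component function
  $\mathbf{w} = (w_1, \dots, w_n) \to w_j$ is continuous from $\mathbb{R}^n$ to
  $\mathbb{R}$.
- The length function $\mathbf{w} = (w_1, \dots, w_n) \to |\mathbf{w}|$ is
  continuous from $\mathbb{R}^n$ to $[0, \infty)$; in fact we have the so-called
  reverse triangle inequality:
  $$\big| |\mathbf{z}| - |\mathbf{w}| \big| \le |\mathbf{z} - \mathbf{w}|, \qquad \mathbf{z}, \mathbf{w} \in \mathbb{R}^n.$$
- Every monomial function
  $\mathbf{w} = (w_1, \dots, w_n) \to w_1^{k_1} w_2^{k_2} \cdots w_n^{k_n}$ is continuous
  from $\mathbb{R}^n$ to $\mathbb{R}$.
- Every polynomial
  $P(\mathbf{w}) = \sum_{k_1 + \dots + k_n \le N} a_{k_1, \dots, k_n} w_1^{k_1} w_2^{k_2} \cdots w_n^{k_n}$
  is continuous from $\mathbb{R}^n$ to $\mathbb{R}$.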
3. Uniform continuity
A function $f : X \to Y$ that is continuous from a subset $X$ of $\mathbb{R}^n$ to another subset $Y$ of $\mathbb{R}^n$ satisfies Definition 4 at each point $y$ in $X$, namely for every $y \in X$ and $\varepsilon > 0$ there is $\delta_y > 0$ (note the dependence on $y$) such that (2.1) holds with $E = X$:
\[ (3.1) \qquad d_Y(f(x), f(y)) < \varepsilon \quad \text{whenever } d_X(x, y) < \delta_y. \]
In general we cannot choose $\delta > 0$ to be independent of $y$. For example, the function $f(x) = \frac{1}{x}$ is continuous on the open interval $(0, 1)$, but if we want
\[ \varepsilon > d_Y(f(x), f(y)) = \left| \frac{1}{x} - \frac{1}{y} \right| \quad \text{whenever } |y - x| < \delta, \]
we cannot take $y = \delta$ since then $x$ could be arbitrarily close to $0$, and so $\frac{1}{x}$ could be arbitrarily large. In this example, $X = (0, 1)$ is not compact and this turns out to be the reason we cannot choose $\delta > 0$ to be independent of $y$. The surprising property that continuous functions $f$ on a compact metric space $X$ have is that we can indeed choose $\delta > 0$ to be independent of $y$ in (3.1). We first give a name to this surprising property; we call it uniform continuity on $X$.
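As a small numerical sketch of this dependence on the point (the helper name `largest_delta` is ours and does not appear in the notes), one can compute, for $f(x) = 1/x$ on $(0,1)$, the largest $\delta$ that works at a given $y$. Solving $1/x - 1/y < \varepsilon$ on the binding side $x < y$ gives $\delta(y) = \varepsilon y^2 / (1 + \varepsilon y)$, which shrinks to $0$ as $y \to 0$:

```python
from fractions import Fraction

def largest_delta(y, eps):
    """Largest delta with |1/x - 1/y| < eps for all x in (0, 1) with |x - y| < delta."""
    # The binding constraint is on the side x < y:
    #   1/x - 1/y < eps  <=>  x > y / (1 + eps*y),
    # so delta = y - y / (1 + eps*y) = eps*y^2 / (1 + eps*y).
    return eps * y * y / (1 + eps * y)

eps = Fraction(1)
for y in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
    print(y, largest_delta(y, eps))
```

No single positive $\delta$ lies below all of these values, which is exactly the failure of uniform continuity on the non-compact set $(0,1)$.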
Definition 5. Suppose that $f : X \to Y$ maps a subset $X$ of $\mathbb{R}^n$ into a subset $Y$ of $\mathbb{R}^n$. We say that $f$ is uniformly continuous on $X$ if for every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ d_Y(f(x), f(y)) < \varepsilon \quad \text{whenever } d_X(x, y) < \delta. \]
The next theorem plays a crucial role in the theory of integration.
Theorem 14. Suppose that $f : X \to Y$ is a continuous map from a compact subset $X$ of $\mathbb{R}^n$ into a subset $Y$ of $\mathbb{R}^n$. Then $f$ is uniformly continuous on $X$.
Proof: Suppose $\varepsilon > 0$. Since $f$ is continuous on $X$, (2.2) shows that for each point $y \in X$, there is $\delta_y > 0$ such that
\[ (3.2) \qquad f(B(y, \delta_y)) \subset B\left(f(y), \frac{\varepsilon}{2}\right). \]
Since $X$ is compact, the open cover $\left\{ B\left(y, \frac{\delta_y}{2}\right) \right\}_{y \in X}$ has a finite subcover $\left\{ B\left(y_k, \frac{\delta_{y_k}}{2}\right) \right\}_{k=1}^{N}$.
Now define
\[ \delta = \min\left\{ \frac{\delta_{y_k}}{2} \right\}_{k=1}^{N}. \]
Since the minimum is taken over finitely many positive numbers (thanks to the finite subcover, which in turn owes its existence to the compactness of $X$), we have $\delta > 0$.
Now suppose that $x, y \in X$ satisfy $d_X(x, y) < \delta$. We will show that
\[ d_Y(f(x), f(y)) < \varepsilon. \]
Choose $k$ so that $y \in B\left(y_k, \frac{\delta_{y_k}}{2}\right)$. Then we have using the triangle inequality in $X$ that
\[ d_X(x, y_k) \le d_X(x, y) + d_X(y, y_k) < \delta + \frac{\delta_{y_k}}{2} \le \frac{\delta_{y_k}}{2} + \frac{\delta_{y_k}}{2} = \delta_{y_k}, \]
so that both $y$ and $x$ lie in the ball $B(y_k, \delta_{y_k})$. It follows from (3.2) that both $f(y)$ and $f(x)$ lie in
\[ f(B(y_k, \delta_{y_k})) \subset B\left(f(y_k), \frac{\varepsilon}{2}\right). \]
Finally an application of the triangle inequality in $Y$ shows that
\[ d_Y(f(x), f(y)) \le d_Y(f(x), f(y_k)) + d_Y(f(y_k), f(y)) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
4. Connectedness
Definition 6. A subset $X$ of $\mathbb{R}^n$ is said to be connected if it is not possible to write $X = E \mathbin{\dot{\cup}} F$ where $E$ and $F$ are disjoint nonempty relatively open subsets of $X$. A set that is not connected is said to be disconnected.
Equivalently, $X$ is disconnected if it has a nonempty proper relatively clopen subset (a relatively clopen subset of $X$ is one that is simultaneously relatively open and relatively closed in $X$).
Lemma 4. A subset $X$ of $\mathbb{R}^n$ is disconnected if and only if there are nonempty subsets $E$ and $F$ of $\mathbb{R}^n$ with $X = E \mathbin{\dot{\cup}} F$ and
\[ (4.1) \qquad \overline{E} \cap F = \emptyset \quad \text{and} \quad E \cap \overline{F} = \emptyset, \]
where the closures refer to the Euclidean space $\mathbb{R}^n$.
Proof: It is not hard to see that $E$ is a relatively open subset of $X$ if and only if $E \cap \overline{F} = \emptyset$ where $F = X \setminus E$. Similarly, $F$ is relatively open in $X$ if and only if $\overline{E} \cap F = \emptyset$. Finally, $E$ is relatively clopen in $X$ if and only if both $E$ and $F = X \setminus E$ are relatively open in $X$.
The connected subsets of the real line are especially simple - they are precisely the intervals
\[ [a, b], \quad (a, b), \quad [a, b), \quad (a, b] \]
lying in $\mathbb{R}$ with $-\infty \le a \le b \le \infty$ (we do not consider any case where $a$ or $b$ is $\pm\infty$ and lies next to either $[$ or $]$).
Theorem 15. The connected subsets of the real numbers R are precisely the
intervals.
Proof: Consider first a nonempty connected subset $Y$ of $\mathbb{R}$. If $a, b \in Y$, and $a < c < b$, then we must also have $c \in Y$ since otherwise $Y \cap (-\infty, c)$ is clopen in $Y$. Thus the set $Y$ has the intermediate value property ($a, b \in Y$ and $a < c < b$ implies $c \in Y$), and it is now easy to see using the Least Upper Bound Property of $\mathbb{R}$, that $Y$ is an interval. Conversely, if $Y$ is a disconnected subset of $\mathbb{R}$, then $Y$ has a nonempty proper clopen subset $E$. We can then find two points $a, b \in Y$ with $a \in E$ and $b \in F = Y \setminus E$ and (without loss of generality) $a < b$. Set
\[ c = \sup(E \cap [a, b]). \]
Then we have $c \in \overline{E}$, and so $c \notin F$ by (4.1). If also $c \notin E$, then $Y$ fails the intermediate value property and so cannot be an interval. On the other hand, if $c \in E$ then $c \notin \overline{F}$ (the closure of $F$), and so there is $d \in (c, b) \setminus F$. But then $d \notin E$ since $d > c$ and so $d$ lies in $(a, b) \setminus Y$, which again shows that $Y$ fails the intermediate value property and so cannot be an interval.
Connected sets behave the same way as compact sets under pushforward by a
continuous map.
Theorem 16. Suppose $f : X \to Y$ is a continuous map from a subset $X$ of $\mathbb{R}^n$ to a subset $Y$ of $\mathbb{R}^n$. If $X$ is connected, then $f(X)$ is connected.
Proof: We may assume that $Y = f(X)$. If $Y$ is disconnected, there are disjoint nonempty open subsets $E$ and $F$ of $Y$ with $Y = E \mathbin{\dot{\cup}} F$. But then $X = f^{-1}(E) \mathbin{\dot{\cup}} f^{-1}(F)$ where both $f^{-1}(E)$ and $f^{-1}(F)$ are open in $X$ by Theorem 9. This shows that $X$ is disconnected as well, and completes the proof of the (contrapositive of the) theorem.
Corollary 7. If $f : \mathbb{R} \to \mathbb{R}$ is continuous, then $f$ takes intervals to intervals, and in particular, $f$ takes closed bounded intervals to closed bounded intervals.
Note that this corollary yields two familiar theorems from first year calculus, the Intermediate Value Theorem (real continuous functions on an interval attain their intermediate values) and the Extreme Value Theorem (real continuous functions on a closed bounded interval attain their extreme values).
Proof : Apply Theorems 16, 5 and 5.
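The Intermediate Value Theorem has a familiar computational companion, bisection, sketched below. This is only an illustration and not the method of proof used in these notes (which goes through connectedness); the function name `bisect_root` and the tolerance are our own choices.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) <= 0 <= f(b)."""
    if not (f(a) <= 0 <= f(b)):
        raise ValueError("need f(a) <= 0 <= f(b)")
    while b - a > tol:
        c = (a + b) / 2
        # Keep the half-interval on which the sign change persists.
        if f(c) <= 0:
            a = c
        else:
            b = c
    return (a + b) / 2

root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
print(root)
```

Since $f([a,b])$ is an interval containing $f(a) \le 0 \le f(b)$, a zero exists, and the loop merely traps it in nested intervals of shrinking length.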
Finally we have the following simple description of open subsets of the real numbers.
Proposition 2. Every open subset $G$ of the real numbers $\mathbb{R}$ can be uniquely written as an at most countable pairwise disjoint union of open intervals $\{I_n\}_{n \ge 1}$:
\[ G = \dot{\bigcup}_{n \ge 1} I_n. \]
Proof: For $x \in G$ let
\[ I_x = \bigcup \{\text{all open intervals containing } x \text{ that are contained in } G\}. \]
It is easy to see that $I_x$ is an open interval and that if $x, y \in G$ then
\[ \text{either } I_x = I_y \text{ or } I_x \cap I_y = \emptyset. \]
This shows that $G$ is a union $\bigcup_{\alpha \in A} I_\alpha$ of pairwise disjoint open intervals. To see that this union is at most countable, simply pick a rational number $r_\alpha$ in each $I_\alpha$. The uniqueness is left as an exercise for the reader.
5. Exercises
Exercise 3. Suppose $f : [0, 1] \to [0, 1]$ is continuous. Prove that $f$ has a fixed point $z$, i.e. $f(z) = z$.
Exercise 4. Suppose $K$ is a compact subset of real numbers. Show that $f : K \to K$ is continuous if and only if the graph of $f$,
\[ \operatorname{graph}(f) = \{(x, y) \in \mathbb{R}^2 : x \in K \text{ and } y = f(x)\}, \]
is a compact subset of the plane.
Part 2
Integration and differentiation
In the second part of these notes we begin with the problem of describing the inverse operation to that of differentiation, commonly called integration. There are four widely recognized theories of integration:
• Riemann integration - the workhorse of integration theory that provides us with the most basic form of the fundamental theorem of calculus;
• Riemann-Stieltjes integration - that extends the idea of integrating the infinitesimal $dx$ to that of the more general infinitesimal $d\alpha(x)$ for an increasing function $\alpha$.
• Lebesgue integration - that overcomes a shortcoming of the Riemann theory by permitting a robust theory of limits of functions, all at the expense of a complicated theory of measure of a set.
• Henstock-Kurtzweil integration - that includes the Riemann and Lebesgue theories and has the advantages that it is quite similar in spirit to the intuitive Riemann theory, and avoids much of the complication of measurability of sets in the Lebesgue theory. However, it has the drawback of limited scope for generalization.
In Chapter 3 we follow Rudin [3] and use uniform continuity to develop the standard theory of the Riemann and Riemann-Stieltjes integrals. A short detour is taken to introduce the more powerful Henstock-Kurtzweil integral, and we use compactness to prove its uniqueness and extension properties.
Chapter 4 draws on Stein and Shakarchi [6] to provide a rapid and transparent introduction to the theory of the Lebesgue integral on the real line.
Chapter 5 proves the Banach-Tarski paradox by exploiting the existence of a free nonabelian group of rank 2 in the rotation group $SO_3$ in three dimensions. There is no better advertisement for restricting matters to measurable sets.
Chapter 6 uses Urysohn's Lemma to establish the Riesz representation theorem on locally compact Hausdorff spaces, and constructs Lebesgue measure on Euclidean spaces. Regularity of measures is treated in some detail, and the Tietze extension theorem is used to prove Lusin's theorem.
Chapter 7 introduces the Lebesgue spaces $L^p(\mu)$ and develops their elementary theory including duality theory. The Baire category theorem is used to prove the classical consequences in the more general setting of Banach spaces, namely the uniform boundedness principle, the open mapping theorem and the closed graph theorem, together with some applications. The special case $p = 2$ is further developed in the context of Hilbert spaces.
Chapter 8 introduces complex measures and proves the Radon-Nikodym theorem using Hilbert space theory.
Chapter 9 discusses differentiation of integrals using shifted dyadic grids.
Chapter 10 introduces integration on product spaces and proves Fubini's theorem.
CHAPTER 3
Riemann and Riemann-Stieltjes integration
Let $f : [0, 1] \to \mathbb{R}$ be a bounded function on the closed unit interval $[0, 1]$. In Riemann's theory of integration, we partition the domain $[0, 1]$ of the function into finitely many nonoverlapping subintervals
\[ [0, 1] = \bigcup_{n=1}^{N} [x_{n-1}, x_n], \]
and denote the partition by $\mathcal{P} = \{0 = x_0 < x_1 < \dots < x_N = 1\}$ and the length of the subinterval $[x_{n-1}, x_n]$ by $\Delta x_n = x_n - x_{n-1} > 0$. Then we define upper and lower Riemann sums associated with the partition $\mathcal{P}$ by
\[ U(f; \mathcal{P}) = \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f \right) \Delta x_n, \qquad L(f; \mathcal{P}) = \sum_{n=1}^{N} \left( \inf_{[x_{n-1}, x_n]} f \right) \Delta x_n. \]
Note that the suprema and infima are finite since $f$ is bounded by assumption. Next we define the upper and lower Riemann integrals of $f$ on $[0, 1]$ by
\[ \mathcal{U}(f) = \inf_{\mathcal{P}} U(f; \mathcal{P}), \qquad \mathcal{L}(f) = \sup_{\mathcal{P}} L(f; \mathcal{P}). \]
Thus the upper Riemann integral $\mathcal{U}(f)$ is the "smallest" of all the upper sums, and the lower Riemann integral is the "largest" of all the lower sums.
We can show that any upper sum is always larger than any lower sum by considering the refinement of two partitions $\mathcal{P}_1$ and $\mathcal{P}_2$: $\mathcal{P}_1 \cup \mathcal{P}_2$ denotes the partition whose points consist of the union of the points in $\mathcal{P}_1$ and $\mathcal{P}_2$, ordered to be strictly increasing.
Lemma 5. Suppose $f : [0, 1] \to \mathbb{R}$ is bounded. If $\mathcal{P}_1$ and $\mathcal{P}_2$ are any two partitions of $[0, 1]$, then
\[ (0.1) \qquad U(f; \mathcal{P}_1) \ge U(f; \mathcal{P}_1 \cup \mathcal{P}_2) \ge L(f; \mathcal{P}_1 \cup \mathcal{P}_2) \ge L(f; \mathcal{P}_2). \]
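Proof: Let
\[ \mathcal{P}_1 = \{0 = x_0 < x_1 < \dots < x_M = 1\}, \]
\[ \mathcal{P}_2 = \{0 = y_0 < y_1 < \dots < y_N = 1\}, \]
\[ \mathcal{P}_1 \cup \mathcal{P}_2 = \{0 = z_0 < z_1 < \dots < z_P = 1\}. \]
Fix a subinterval $[x_{n-1}, x_n]$ of the partition $\mathcal{P}_1$. Suppose that $[x_{n-1}, x_n]$ contains exactly the following increasing sequence of points in the partition $\mathcal{P}_1 \cup \mathcal{P}_2$:
\[ z_{k_n} < z_{k_n + 1} < \dots < z_{k_n + m_n}, \]
i.e. $z_{k_n} = x_{n-1}$ and $z_{k_n + m_n} = x_n$. Then we have
\[ \left( \sup_{[x_{n-1}, x_n]} f \right) \Delta x_n = \left( \sup_{[x_{n-1}, x_n]} f \right) \left( \sum_{j=1}^{m_n} \Delta z_{k_n + j} \right) \ge \sum_{j=1}^{m_n} \left( \sup_{[z_{k_n + j - 1}, z_{k_n + j}]} f \right) \Delta z_{k_n + j}, \]
since $\sup_{[z_{k_n + j - 1}, z_{k_n + j}]} f \le \sup_{[x_{n-1}, x_n]} f$ when $[z_{k_n + j - 1}, z_{k_n + j}] \subset [x_{n-1}, x_n]$. If we now sum over $1 \le n \le M$ we get
\[ U(f; \mathcal{P}_1) = \sum_{n=1}^{M} \left( \sup_{[x_{n-1}, x_n]} f \right) \Delta x_n \ge \sum_{n=1}^{M} \sum_{j=1}^{m_n} \left( \sup_{[z_{k_n + j - 1}, z_{k_n + j}]} f \right) \Delta z_{k_n + j} = \sum_{k=1}^{P} \left( \sup_{[z_{k-1}, z_k]} f \right) \Delta z_k = U(f; \mathcal{P}_1 \cup \mathcal{P}_2). \]
Similarly we can prove that
\[ L(f; \mathcal{P}_2) \le L(f; \mathcal{P}_1 \cup \mathcal{P}_2). \]
Since we trivially have $L(f; \mathcal{P}_1 \cup \mathcal{P}_2) \le U(f; \mathcal{P}_1 \cup \mathcal{P}_2)$, the proof of the lemma is complete.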
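The chain of inequalities (0.1) can be checked on a concrete example. The sketch below is only an illustration under a simplifying assumption: the function is nondecreasing, so that the supremum and infimum on each subinterval are attained at the right and left endpoints. The names `upper_lower` and `square` are ours.

```python
from fractions import Fraction as F

def upper_lower(f, points):
    """Upper and lower Riemann sums of a nondecreasing f over a partition.

    For nondecreasing f the sup on [x_{n-1}, x_n] is f(x_n) and the inf is f(x_{n-1}).
    """
    pairs = list(zip(points, points[1:]))
    upper = sum(f(right) * (right - left) for left, right in pairs)
    lower = sum(f(left) * (right - left) for left, right in pairs)
    return upper, lower

def square(x):
    return x * x

p1 = [F(0), F(1, 2), F(1)]
p2 = [F(0), F(1, 3), F(1)]
refinement = sorted(set(p1) | set(p2))  # the common refinement P1 union P2

u1, _ = upper_lower(square, p1)
_, l2 = upper_lower(square, p2)
u12, l12 = upper_lower(square, refinement)
print(u1, u12, l12, l2)
```

Exact rational arithmetic is used so that the comparison $U(f;\mathcal{P}_1) \ge U(f;\mathcal{P}_1 \cup \mathcal{P}_2) \ge L(f;\mathcal{P}_1 \cup \mathcal{P}_2) \ge L(f;\mathcal{P}_2)$ is not blurred by rounding.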
Now in (0.1) take the infimum over $\mathcal{P}_1$ and the supremum over $\mathcal{P}_2$ to obtain that
\[ \mathcal{U}(f) \ge \mathcal{L}(f), \]
which says that the upper Riemann integral of $f$ is always equal to or greater than the lower Riemann integral of $f$. Finally we say that $f$ is Riemann integrable on $[0, 1]$, written $f \in \mathcal{R}[0, 1]$, if $\mathcal{U}(f) = \mathcal{L}(f)$, and we denote the common value by
\[ \int_0^1 f \quad \text{or} \quad \int_0^1 f(x)\, dx. \]
We can of course repeat this line of definition and reasoning for any bounded closed interval $[a, b]$ in place of the closed unit interval $[0, 1]$. We summarize matters in the following definition.
Definition 7. Let $f : [a, b] \to \mathbb{R}$ be a bounded function. For any partition $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$ of $[a, b]$ we define upper and lower Riemann sums by
\[ U(f; \mathcal{P}) = \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f \right) \Delta x_n, \qquad L(f; \mathcal{P}) = \sum_{n=1}^{N} \left( \inf_{[x_{n-1}, x_n]} f \right) \Delta x_n. \]
Set
\[ \mathcal{U}(f) = \inf_{\mathcal{P}} U(f; \mathcal{P}), \qquad \mathcal{L}(f) = \sup_{\mathcal{P}} L(f; \mathcal{P}), \]
where the infimum and supremum are taken over all partitions $\mathcal{P}$ of $[a, b]$. We say that $f$ is Riemann integrable on $[a, b]$, written $f \in \mathcal{R}[a, b]$, if $\mathcal{U}(f) = \mathcal{L}(f)$, and we denote the common value by
\[ \int_a^b f \quad \text{or} \quad \int_a^b f(x)\, dx. \]
A more substantial generalization of the line of definition and reasoning above can be obtained on a closed interval $[a, b]$ by considering in place of the positive quantities $\Delta x_n = x_n - x_{n-1}$ associated with a partition
\[ \mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\} \]
of $[a, b]$, the more general nonnegative quantities
\[ \Delta \alpha_n = \alpha(x_n) - \alpha(x_{n-1}), \qquad 1 \le n \le N, \]
where $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. This leads to the notion of the Riemann-Stieltjes integral associated with a nondecreasing function $\alpha : [a, b] \to \mathbb{R}$.
Definition 8. Let $f : [a, b] \to \mathbb{R}$ be a bounded function and suppose $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. For any partition $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$ of $[a, b]$ we define upper and lower Riemann sums by
\[ U(f; \mathcal{P}, \alpha) = \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f \right) \Delta \alpha_n, \qquad L(f; \mathcal{P}, \alpha) = \sum_{n=1}^{N} \left( \inf_{[x_{n-1}, x_n]} f \right) \Delta \alpha_n. \]
Set
\[ \mathcal{U}(f, \alpha) = \inf_{\mathcal{P}} U(f; \mathcal{P}, \alpha), \qquad \mathcal{L}(f, \alpha) = \sup_{\mathcal{P}} L(f; \mathcal{P}, \alpha), \]
where the infimum and supremum are taken over all partitions $\mathcal{P}$ of $[a, b]$. We say that $f$ is Riemann-Stieltjes integrable on $[a, b]$, written $f \in \mathcal{R}_\alpha[a, b]$, if $\mathcal{U}(f, \alpha) = \mathcal{L}(f, \alpha)$, and we denote the common value by
\[ \int_a^b f\, d\alpha \quad \text{or} \quad \int_a^b f(x)\, d\alpha(x). \]
The lemma on partitions above generalizes immediately to the setting of the Riemann-Stieltjes integral.
Lemma 6. Suppose $f : [a, b] \to \mathbb{R}$ is bounded and $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. If $\mathcal{P}_1$ and $\mathcal{P}_2$ are any two partitions of $[a, b]$, then
\[ (0.2) \qquad U(f; \mathcal{P}_1, \alpha) \ge U(f; \mathcal{P}_1 \cup \mathcal{P}_2, \alpha) \ge L(f; \mathcal{P}_1 \cup \mathcal{P}_2, \alpha) \ge L(f; \mathcal{P}_2, \alpha). \]
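0.1. Existence of the Riemann-Stieltjes integral. The difficult question now arises as to exactly which bounded functions $f$ are Riemann-Stieltjes integrable with respect to a given nondecreasing $\alpha$ on $[a, b]$. We will content ourselves with showing two results. Suppose $f$ is bounded on $[a, b]$ and $\alpha$ is nondecreasing on $[a, b]$. Then
• $f \in \mathcal{R}_\alpha[a, b]$ if in addition $f$ is continuous on $[a, b]$;
• $f \in \mathcal{R}_\alpha[a, b]$ if in addition $f$ is monotonic on $[a, b]$ and $\alpha$ is continuous on $[a, b]$.
Both proofs will use the Cauchy criterion for existence of the integral $\int_a^b f\, d\alpha$ when $f : [a, b] \to \mathbb{R}$ is bounded and $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing:
\[ (0.3) \qquad \text{For every } \varepsilon > 0 \text{ there is a partition } \mathcal{P} \text{ of } [a, b] \text{ such that } U(f; \mathcal{P}, \alpha) - L(f; \mathcal{P}, \alpha) < \varepsilon. \]
Clearly, if (0.3) holds, then from (0.2) we obtain that for each $\varepsilon > 0$ there is a partition $\mathcal{P}_\varepsilon$ satisfying
\[ 0 \le \mathcal{U}(f, \alpha) - \mathcal{L}(f, \alpha) = \inf_{\mathcal{P}} U(f; \mathcal{P}, \alpha) - \sup_{\mathcal{P}} L(f; \mathcal{P}, \alpha) \le U(f; \mathcal{P}_\varepsilon, \alpha) - L(f; \mathcal{P}_\varepsilon, \alpha) < \varepsilon. \]
It follows that $\mathcal{U}(f, \alpha) = \mathcal{L}(f, \alpha)$ and so $\int_a^b f\, d\alpha$ exists. Conversely, given $\varepsilon > 0$ there are partitions $\mathcal{P}_1$ and $\mathcal{P}_2$ satisfying
\[ \mathcal{U}(f, \alpha) = \inf_{\mathcal{P}} U(f; \mathcal{P}, \alpha) > U(f; \mathcal{P}_1, \alpha) - \frac{\varepsilon}{2}, \]
\[ \mathcal{L}(f, \alpha) = \sup_{\mathcal{P}} L(f; \mathcal{P}, \alpha) < L(f; \mathcal{P}_2, \alpha) + \frac{\varepsilon}{2}. \]
Inequality (0.2) now shows that
\[ U(f; \mathcal{P}_1 \cup \mathcal{P}_2, \alpha) - L(f; \mathcal{P}_1 \cup \mathcal{P}_2, \alpha) \le U(f; \mathcal{P}_1, \alpha) - L(f; \mathcal{P}_2, \alpha) < \left( \mathcal{U}(f, \alpha) + \frac{\varepsilon}{2} \right) - \left( \mathcal{L}(f, \alpha) - \frac{\varepsilon}{2} \right) = \varepsilon \]
since $\mathcal{U}(f, \alpha) = \mathcal{L}(f, \alpha)$ if $\int_a^b f\, d\alpha$ exists. Thus we can take $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$ in (0.3).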
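The existence of $\int_a^b f\, d\alpha$ when $f$ is continuous will use Theorem 14 on uniform continuity in a crucial way.
Theorem 17. Suppose that $f : [a, b] \to \mathbb{R}$ is continuous and $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. Then $f \in \mathcal{R}_\alpha[a, b]$.
Proof: We will show that the Cauchy criterion (0.3) holds. Fix $\varepsilon > 0$. By Theorem 14 $f$ is uniformly continuous on the compact set $[a, b]$, so there is $\delta > 0$ such that
\[ |f(x) - f(x')| \le \frac{\varepsilon}{\alpha(b) - \alpha(a)} \quad \text{whenever } |x - x'| \le \delta. \]
Let $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$ be any partition of $[a, b]$ for which
\[ \max_{1 \le n \le N} \Delta x_n < \delta. \]
Then we have
\[ \sup_{[x_{n-1}, x_n]} f - \inf_{[x_{n-1}, x_n]} f \le \sup_{x, x' \in [x_{n-1}, x_n]} |f(x) - f(x')| \le \frac{\varepsilon}{\alpha(b) - \alpha(a)}, \]
since $|x - x'| \le \Delta x_n < \delta$ when $x, x' \in [x_{n-1}, x_n]$ by our choice of $\mathcal{P}$. Now we compute that
\[ U(f; \mathcal{P}, \alpha) - L(f; \mathcal{P}, \alpha) = \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f - \inf_{[x_{n-1}, x_n]} f \right) \Delta \alpha_n \le \sum_{n=1}^{N} \left( \frac{\varepsilon}{\alpha(b) - \alpha(a)} \right) \Delta \alpha_n = \varepsilon, \]
which is (0.3) as required.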
Remark 6. Observe that it makes no logical difference if we replace strict inequality $<$ with $\le$ in $\varepsilon$-$\delta$ type definitions. We have used this observation twice in the above proof, and will continue to use it without further comment in the sequel.
The proof of the next existence result uses the intermediate value theorem for continuous functions.
Theorem 18. Suppose that $f : [a, b] \to \mathbb{R}$ is monotone and $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing and continuous. Then $f \in \mathcal{R}_\alpha[a, b]$.
Proof: We will show that the Cauchy criterion (0.3) holds. Fix $\varepsilon > 0$ and suppose without loss of generality that $f$ is nondecreasing on $[a, b]$. Let $N \ge 2$ be a positive integer. Since $\alpha$ is continuous we can use the intermediate value theorem to find points $x_n \in (a, b)$ such that $x_0 = a$, $x_N = b$ and
\[ \alpha(x_n) = \alpha(a) + \frac{n}{N} (\alpha(b) - \alpha(a)), \qquad 1 \le n \le N - 1. \]
Since $\alpha$ is nondecreasing we have $x_{n-1} < x_n$ for all $1 \le n \le N$, and it follows that
\[ \mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\} \]
is a partition of $[a, b]$ satisfying
\[ \Delta \alpha_n = \alpha(x_n) - \alpha(x_{n-1}) = \frac{\alpha(b) - \alpha(a)}{N} < \frac{\varepsilon}{f(b) - f(a)}, \]
provided we take $N$ large enough. With such a partition $\mathcal{P}$ we compute
\[ U(f; \mathcal{P}, \alpha) - L(f; \mathcal{P}, \alpha) = \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f - \inf_{[x_{n-1}, x_n]} f \right) \Delta \alpha_n \le \frac{\varepsilon}{f(b) - f(a)} \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n]} f - \inf_{[x_{n-1}, x_n]} f \right) = \frac{\varepsilon}{f(b) - f(a)} \sum_{n=1}^{N} (f(x_n) - f(x_{n-1})) = \varepsilon. \]
This proves (0.3) as required.
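The partition used in this proof can be tried out numerically. In the sketch below the choices $f(x) = x^3$, $\alpha(x) = x^2$ on $[0,1]$ and the helper name `rs_gap` are our own illustrative assumptions; the points $x_n = \sqrt{n/N}$ are those with equal $\alpha$-increments $1/N$, so the telescoping sum in the proof gives $U - L = (f(1) - f(0))/N$.

```python
import math

def rs_gap(f, alpha, points):
    """U(f;P,alpha) - L(f;P,alpha) for nondecreasing f and alpha.

    For nondecreasing f, sup - inf on [x0, x1] equals f(x1) - f(x0).
    """
    return sum((f(x1) - f(x0)) * (alpha(x1) - alpha(x0))
               for x0, x1 in zip(points, points[1:]))

def f(x):
    return x ** 3

def alpha(x):
    return x ** 2

N = 1000
# alpha(x_n) = n/N on [0, 1] means x_n = sqrt(n/N).
points = [math.sqrt(n / N) for n in range(N + 1)]
gap = rs_gap(f, alpha, points)
print(gap)
```

Increasing `N` drives the gap below any prescribed $\varepsilon$, which is the Cauchy criterion (0.3) in action.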
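0.2. A stronger form of the definition of the Riemann integral. For the Riemann integral there is another formulation of the definition of $\int_a^b f$ that appears at first sight to be much stronger (and which doesn't work for general nondecreasing $\alpha$ in the Riemann-Stieltjes integral). For any partition $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$, set $\|\mathcal{P}\| = \max_{1 \le n \le N} \Delta x_n$, called the norm of $\mathcal{P}$. Now if $\int_a^b f$ exists, then for every $\varepsilon > 0$ there is by the Cauchy criterion (0.3) a partition $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$ such that
\[ U(f; \mathcal{P}) - L(f; \mathcal{P}) < \frac{\varepsilon}{2}. \]
Now define $\delta$ to be the smaller of the two positive numbers
\[ \min_{1 \le n \le N} \Delta x_n \quad \text{and} \quad \frac{\varepsilon}{2N \operatorname{diam} f([a, b])}. \]
Claim 1. If $\mathcal{Q} = \{a = y_0 < y_1 < \dots < y_M = b\}$ is any partition with
\[ \|\mathcal{Q}\| = \max_{1 \le m \le M} \Delta y_m < \delta, \]
then
\[ U(f; \mathcal{Q}) - L(f; \mathcal{Q}) < \varepsilon. \]
Indeed, since $\Delta y_m < \delta \le \Delta x_n$ for all $m$ and $n$ by choice of $\delta$, each point $x_n$ lies in a distinct one of the subintervals $[y_{m-1}, y_m]$ of $\mathcal{Q}$, call it $J_n = [y_{m_n - 1}, y_{m_n}]$. The other subintervals $[y_{m-1}, y_m]$ of $\mathcal{Q}$ with $m$ not equal to any of the $m_n$, each lie in one of the separating intervals $K_n = [y_{m_n}, y_{m_{n+1} - 1}]$ that are formed by the spaces between the intervals $J_n$. These intervals $K_n$ are the union of one or more consecutive subintervals of $\mathcal{Q}$. We have for each $n$ that
\[ \sum_{m : [y_{m-1}, y_m] \subset K_n} \left( \sup_{[y_{m-1}, y_m]} f - \inf_{[y_{m-1}, y_m]} f \right) \Delta y_m \le \left( \sup_{K_n} f - \inf_{K_n} f \right) \sum_{m : [y_{m-1}, y_m] \subset K_n} \Delta y_m \]
\[ \le \left( \sup_{[x_n, x_{n+1}]} f - \inf_{[x_n, x_{n+1}]} f \right) (y_{m_{n+1} - 1} - y_{m_n}) \le \left( \sup_{[x_n, x_{n+1}]} f - \inf_{[x_n, x_{n+1}]} f \right) (x_{n+1} - x_n). \]
Summing this in $n$ yields
\[ (0.4) \qquad \sum_{n} \sum_{m : [y_{m-1}, y_m] \subset K_n} \left( \sup_{[y_{m-1}, y_m]} f - \inf_{[y_{m-1}, y_m]} f \right) \Delta y_m \le \sum_{n} \left( \sup_{[x_n, x_{n+1}]} f - \inf_{[x_n, x_{n+1}]} f \right) (x_{n+1} - x_n) = U(f; \mathcal{P}) - L(f; \mathcal{P}). \]
Now we compute
\[ U(f; \mathcal{Q}) - L(f; \mathcal{Q}) = \sum_{m=1}^{M} \left( \sup_{[y_{m-1}, y_m]} f - \inf_{[y_{m-1}, y_m]} f \right) \Delta y_m = \sum_{n} \left( \sup_{J_n} f - \inf_{J_n} f \right) (y_{m_n} - y_{m_n - 1}) + \sum_{n} \sum_{m : [y_{m-1}, y_m] \subset K_n} \left( \sup_{[y_{m-1}, y_m]} f - \inf_{[y_{m-1}, y_m]} f \right) \Delta y_m, \]
which by (0.4) and choice of $\delta$ is dominated by
\[ \operatorname{diam} f([a, b]) \sum_{n} (y_{m_n} - y_{m_n - 1}) + U(f; \mathcal{P}) - L(f; \mathcal{P}) < \operatorname{diam} f([a, b])\, N \delta + \frac{\varepsilon}{2} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
and this proves the claim.
Conversely, if
\[ (0.5) \qquad \text{For every } \varepsilon > 0 \text{ there is } \delta > 0 \text{ such that } U(f; \mathcal{Q}) - L(f; \mathcal{Q}) < \varepsilon \text{ whenever } \|\mathcal{Q}\| < \delta, \]
then the Cauchy criterion (0.3) holds with $\mathcal{P}$ equal to any such $\mathcal{Q}$. Thus (0.5) provides another equivalent definition of the Riemann integral $\int_a^b f$ that is more like the $\varepsilon$-$\delta$ definition of continuity at a point (compare Definition 4).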
1. Properties of the Riemann-Stieltjes integral
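The Riemann-Stieltjes integral $\int_a^b f\, d\alpha$ is a function of the closed interval $[a, b]$, the bounded function $f$ on $[a, b]$, and the nondecreasing function $\alpha$ on $[a, b]$. With respect to each of these three variables, the integral has natural properties related to monotonicity, sums and scalar multiplication. In fact we have the following lemmas dealing with each variable separately, beginning with $f$, then $\alpha$ and ending with $[a, b]$.
Lemma 7. Fix $[a, b] \subset \mathbb{R}$ and $\alpha : [a, b] \to \mathbb{R}$ nondecreasing. The set $\mathcal{R}_\alpha[a, b]$ is a real vector space and the integral $\int_a^b f\, d\alpha$ is a linear function of $f \in \mathcal{R}_\alpha[a, b]$: if $f_j \in \mathcal{R}_\alpha[a, b]$ and $\lambda_j \in \mathbb{R}$, then
\[ f = \lambda_1 f_1 + \lambda_2 f_2 \in \mathcal{R}_\alpha[a, b] \quad \text{and} \quad \int_a^b f\, d\alpha = \lambda_1 \int_a^b f_1\, d\alpha + \lambda_2 \int_a^b f_2\, d\alpha. \]
Furthermore, $\mathcal{R}_\alpha[a, b]$ is partially ordered by declaring $f \le g$ if $f(x) \le g(x)$ for $x \in [a, b]$, and the integral $\int_a^b f\, d\alpha$ is a nondecreasing function of $f$ with respect to this order: if $f, g \in \mathcal{R}_\alpha[a, b]$ and $f \le g$, then $\int_a^b f\, d\alpha \le \int_a^b g\, d\alpha$.
Lemma 8. Fix $[a, b] \subset \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$ bounded. Then
\[ \mathcal{C}_f[a, b] \equiv \{\alpha : [a, b] \to \mathbb{R} : \alpha \text{ is nondecreasing and } f \in \mathcal{R}_\alpha[a, b]\} \]
is a cone and the integral $\int_a^b f\, d\alpha$ is a positive linear function of $\alpha$: if $\alpha_j \in \mathcal{C}_f[a, b]$ and $c_j \in [0, \infty)$, then
\[ \alpha = c_1 \alpha_1 + c_2 \alpha_2 \in \mathcal{C}_f[a, b] \quad \text{and} \quad \int_a^b f\, d\alpha = c_1 \int_a^b f\, d\alpha_1 + c_2 \int_a^b f\, d\alpha_2. \]
Lemma 9. Fix $[a, b] \subset \mathbb{R}$ and $\alpha : [a, b] \to \mathbb{R}$ nondecreasing and $f \in \mathcal{R}_\alpha[a, b]$. If $a < c < b$, then $\alpha : [a, c] \to \mathbb{R}$ and $\alpha : [c, b] \to \mathbb{R}$ are each nondecreasing and
\[ f \in \mathcal{R}_\alpha[a, c] \quad \text{and} \quad f \in \mathcal{R}_\alpha[c, b] \quad \text{and} \quad \int_a^b f\, d\alpha = \int_a^c f\, d\alpha + \int_c^b f\, d\alpha. \]
These three lemmas are easy to prove, and are left to the reader. Properties regarding multiplication of functions in $\mathcal{R}_\alpha[a, b]$ and composition of functions are more delicate.
Theorem 19. Suppose that $f : [a, b] \to [m, M]$ and $f \in \mathcal{R}_\alpha[a, b]$. If $\varphi : [m, M] \to \mathbb{R}$ is continuous, then $\varphi \circ f \in \mathcal{R}_\alpha[a, b]$.
Corollary 8. If $f, g \in \mathcal{R}_\alpha[a, b]$, then $fg \in \mathcal{R}_\alpha[a, b]$, $|f| \in \mathcal{R}_\alpha[a, b]$ and
\[ \left| \int_a^b f\, d\alpha \right| \le \int_a^b |f|\, d\alpha. \]
Proof: Since $\varphi(x) = x^2$ is continuous, Lemma 7 and Theorem 19 yield
\[ fg = \frac{1}{2} \left\{ (f + g)^2 - f^2 - g^2 \right\} \in \mathcal{R}_\alpha[a, b]. \]
Since $\varphi(x) = |x|$ is continuous, Theorem 19 yields $|f| \in \mathcal{R}_\alpha[a, b]$. Now choose $c = \pm 1$ so that $c \int_a^b f\, d\alpha \ge 0$. Then the lemmas imply
\[ \left| \int_a^b f\, d\alpha \right| = c \int_a^b f\, d\alpha = \int_a^b (cf)\, d\alpha \le \int_a^b |f|\, d\alpha. \]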
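Proof (of Theorem 19): Let $h = \varphi \circ f$. We will show that $h \in \mathcal{R}_\alpha[a, b]$ by verifying the Cauchy criterion for integrals (0.3). Fix $\varepsilon > 0$. Since $\varphi$ is continuous on the compact interval $[m, M]$, it is uniformly continuous on $[m, M]$ by Theorem 14. Thus we can choose $\delta > 0$ such that
\[ |\varphi(s) - \varphi(t)| < \varepsilon \quad \text{whenever } |s - t| < \delta. \]
Since $f \in \mathcal{R}_\alpha[a, b]$, there is by the Cauchy criterion a partition
\[ \mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\} \]
such that
\[ (1.1) \qquad U(f; \mathcal{P}, \alpha) - L(f; \mathcal{P}, \alpha) < \delta \varepsilon. \]
Let
\[ M_n = \sup_{[x_{n-1}, x_n]} f \quad \text{and} \quad m_n = \inf_{[x_{n-1}, x_n]} f, \]
\[ M_n^* = \sup_{[x_{n-1}, x_n]} h \quad \text{and} \quad m_n^* = \inf_{[x_{n-1}, x_n]} h, \]
and set
\[ A = \{n : M_n - m_n < \delta\} \quad \text{and} \quad B = \{n : M_n - m_n \ge \delta\}. \]
The point of the index set $A$ is that for each $n \in A$ we have
\[ M_n^* - m_n^* = \sup_{x, y \in [x_{n-1}, x_n]} |\varphi(f(x)) - \varphi(f(y))| \le \sup_{|s - t| \le M_n - m_n} |\varphi(s) - \varphi(t)| \le \sup_{|s - t| < \delta} |\varphi(s) - \varphi(t)| \le \varepsilon, \qquad n \in A. \]
As for $n$ in the index set $B$, we have $\delta \le M_n - m_n$ and the inequality (1.1) then gives
\[ \delta \sum_{n \in B} \Delta \alpha_n \le \sum_{n \in B} (M_n - m_n) \Delta \alpha_n < \delta \varepsilon. \]
Dividing by $\delta > 0$ we obtain
\[ \sum_{n \in B} \Delta \alpha_n < \varepsilon. \]
Now we use the trivial bound
\[ M_n^* - m_n^* \le \operatorname{diam} \varphi([m, M]) \]
to compute that
\[ U(h; \mathcal{P}, \alpha) - L(h; \mathcal{P}, \alpha) = \left\{ \sum_{n \in A} + \sum_{n \in B} \right\} (M_n^* - m_n^*) \Delta \alpha_n \le \sum_{n \in A} \varepsilon\, \Delta \alpha_n + \sum_{n \in B} \operatorname{diam} \varphi([m, M])\, \Delta \alpha_n \]
\[ \le \varepsilon (\alpha(b) - \alpha(a)) + \varepsilon \operatorname{diam} \varphi([m, M]) = \varepsilon \left[ \alpha(b) - \alpha(a) + \operatorname{diam} \varphi([m, M]) \right], \]
which verifies (0.3) for the existence of $\int_a^b h\, d\alpha$ as required.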
2. The Henstock-Kurtzweil integral
We can reformulate the $\varepsilon$-$\delta$ definition of the Riemann integral $\int_a^b f$ in (0.5) using a more general notion of partition, that of a tagged partition. If $\mathcal{P} = \{a = x_0 < x_1 < \dots < x_N = b\}$ is a partition of $[a, b]$ and we choose points $t_n \in [x_{n-1}, x_n]$ in each subinterval of $\mathcal{P}$, then
\[ \mathcal{P}^* = \{a = x_0 \le t_1 \le x_1 \le \dots \le x_{N-1} \le t_N \le x_N = b\}, \quad \text{where } x_0 < x_1 < \dots < x_N, \]
is called a tagged partition $\mathcal{P}^*$ with underlying partition $\mathcal{P}$. Thus a tagged partition consists of two finite intertwined sequences $\{x_n\}_{n=0}^{N}$ and $\{t_n\}_{n=1}^{N}$, where the sequence $\{x_n\}_{n=0}^{N}$ is strictly increasing and the sequence $\{t_n\}_{n=1}^{N}$ need not be. For every tagged partition $\mathcal{P}^*$ of $[a, b]$, define the corresponding Riemann sum $S(f; \mathcal{P}^*)$ by
\[ S(f; \mathcal{P}^*) = \sum_{n=1}^{N} f(t_n) \Delta x_n. \]
Note that $\inf_{[x_{n-1}, x_n]} f \le f(t_n) \le \sup_{[x_{n-1}, x_n]} f$ implies that
\[ L(f; \mathcal{P}) \le S(f; \mathcal{P}^*) \le U(f; \mathcal{P}) \]
for all tagged partitions $\mathcal{P}^*$ with underlying partition $\mathcal{P}$.
Now observe that if $f \in \mathcal{R}[a, b]$, $\varepsilon > 0$ and the partition $\mathcal{P}$ satisfies
\[ U(f; \mathcal{P}) - L(f; \mathcal{P}) < \varepsilon, \]
then every tagged partition $\mathcal{P}^*$ with underlying partition $\mathcal{P}$ satisfies
\[ (2.1) \qquad \left| S(f; \mathcal{P}^*) - \int_a^b f \right| \le U(f; \mathcal{P}) - L(f; \mathcal{P}) < \varepsilon. \]
Conversely if for each $\varepsilon > 0$ there is a partition $\mathcal{P}$ such that every tagged partition $\mathcal{P}^*$ with underlying partition $\mathcal{P}$ satisfies (2.1), then (0.3) holds and so $f \in \mathcal{R}[a, b]$.
However, we can also formulate this approach using the $\varepsilon$-$\delta$ form (0.5) of the definition of $\int_a^b f$. The result is that $f \in \mathcal{R}[a, b]$ if and only if
\[ (2.2) \qquad \text{There is } L \in \mathbb{R} \text{ such that for every } \varepsilon > 0 \text{ there is } \delta > 0 \text{ such that } |S(f; \mathcal{P}^*) - L| < \varepsilon \text{ whenever } \|\mathcal{P}^*\| < \delta. \]
Of course if such a number $L$ exists we write $L = \int_a^b f$ and call it the Riemann integral of $f$ on $[a, b]$. Here we define $\|\mathcal{P}^*\|$ to be $\|\mathcal{P}\|$ where $\mathcal{P}$ is the underlying partition of $\mathcal{P}^*$. The reader can easily verify that $f \in \mathcal{R}[a, b]$ if and only if the above condition (2.2) holds.
Now comes the clever insight of Henstock and Kurtzweil. We view the positive constant $\delta$ in (2.2) as a function on the interval $[a, b]$, and replace it with an arbitrary (not necessarily constant) positive function $\delta : [a, b] \to (0, \infty)$. We refer to such an arbitrary positive function $\delta : [a, b] \to (0, \infty)$ as a gauge on $[a, b]$. Then for any gauge $\delta$ on $[a, b]$, we say that a tagged partition $\mathcal{P}^*$ on $[a, b]$ is $\delta$-fine provided
\[ (2.3) \qquad [x_{n-1}, x_n] \subset (t_n - \delta(t_n), t_n + \delta(t_n)), \qquad 1 \le n \le N. \]
Thus $\mathcal{P}^*$ is $\delta$-fine if each tag $t_n \in [x_{n-1}, x_n]$ has its associated gauge value $\delta(t_n)$ sufficiently large that the open interval centered at $t_n$ with radius $\delta(t_n)$ contains the $n$-th subinterval $[x_{n-1}, x_n]$ of the partition $\mathcal{P}$. Now we can give the definition of the Henstock and Kurtzweil integral.
Definition 9. A function $f : [a, b] \to \mathbb{R}$ is Henstock-Kurtzweil integrable on $[a, b]$, written $f \in \mathcal{HK}[a, b]$, if there is $L \in \mathbb{R}$ such that for every $\varepsilon > 0$ there is a gauge $\delta_\varepsilon : [a, b] \to (0, \infty)$ on $[a, b]$ such that
\[ |S(f; \mathcal{P}^*) - L| < \varepsilon \quad \text{whenever } \mathcal{P}^* \text{ is } \delta_\varepsilon\text{-fine}. \]
It is clear that if $f \in \mathcal{R}[a, b]$ is Riemann integrable, then $f$ satisfies Definition 9 with $L = \int_a^b f$ - simply take $\delta_\varepsilon$ to be the constant gauge $\delta$ in (2.2). However, for this new definition to have any value it is necessary that such an $L$ is uniquely determined by Definition 9. This is indeed the case and relies crucially on the fact that $[a, b]$ is compact. Here are the details.
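Suppose that Definition 9 holds with both $L$ and $L'$. Let $\varepsilon > 0$. Then there are gauges $\delta_\varepsilon$ and $\delta'_\varepsilon$ on $[a, b]$ such that
\[ |S(f; \mathcal{P}^*) - L| < \varepsilon \quad \text{whenever } \mathcal{P}^* \text{ is } \delta_\varepsilon\text{-fine}, \]
\[ |S(f; \mathcal{P}^*) - L'| < \varepsilon \quad \text{whenever } \mathcal{P}^* \text{ is } \delta'_\varepsilon\text{-fine}. \]
Now define
\[ \eta_\varepsilon(x) = \min\{\delta_\varepsilon(x), \delta'_\varepsilon(x)\}, \qquad a \le x \le b. \]
Then $\eta_\varepsilon$ is a gauge on $[a, b]$. Here is the critical point: we would like to produce a tagged partition $\mathcal{P}^*_\varepsilon$ that is $\eta_\varepsilon$-fine! Indeed, if such a tagged partition $\mathcal{P}^*_\varepsilon$ exists, then $\mathcal{P}^*_\varepsilon$ would also be $\delta_\varepsilon$-fine and $\delta'_\varepsilon$-fine (since $\eta_\varepsilon \le \delta_\varepsilon$ and $\eta_\varepsilon \le \delta'_\varepsilon$) and hence
\[ |L - L'| \le |S(f; \mathcal{P}^*_\varepsilon) - L| + |S(f; \mathcal{P}^*_\varepsilon) - L'| < 2\varepsilon \]
for all $\varepsilon > 0$, which forces $L = L'$.
However, if $\eta$ is any gauge on $[a, b]$, let
\[ B(x, \eta(x)) = (x - \eta(x), x + \eta(x)) \quad \text{and} \quad B\left(x, \frac{\eta(x)}{2}\right) = \left(x - \frac{\eta(x)}{2}, x + \frac{\eta(x)}{2}\right). \]
Then $\left\{ B\left(x, \frac{\eta(x)}{2}\right) \right\}_{x \in [a, b]}$ is an open cover of the compact set $[a, b]$, hence there is a finite subcover $\left\{ B\left(x_n, \frac{\eta(x_n)}{2}\right) \right\}_{n=0}^{N}$. We may assume that every interval $B\left(x_n, \frac{\eta(x_n)}{2}\right)$ is needed to cover $[a, b]$ by discarding any in turn which are included in the union of the others. We may also assume that $a \le x_0 < x_1 < \dots < x_N \le b$. It follows that $B\left(x_{n-1}, \frac{\eta(x_{n-1})}{2}\right) \cap B\left(x_n, \frac{\eta(x_n)}{2}\right) \ne \emptyset$, so the triangle inequality yields
\[ |x_n - x_{n-1}| < \frac{\eta(x_{n-1}) + \eta(x_n)}{2}, \qquad 1 \le n \le N. \]
If $\eta(x_n) \ge \eta(x_{n-1})$ then
\[ [x_{n-1}, x_n] \subset B(x_n, \eta(x_n)), \]
and so we define
\[ t_n = x_n. \]
Otherwise, we have $\eta(x_{n-1}) > \eta(x_n)$ and then
\[ [x_{n-1}, x_n] \subset B(x_{n-1}, \eta(x_{n-1})), \]
and so we define
\[ t_n = x_{n-1}. \]
The tagged partition
\[ \mathcal{P}^* = \{a = x_0 \le t_1 \le x_1 \le \dots \le x_{N-1} \le t_N \le x_N = b\} \]
is then $\eta$-fine.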
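To make the notion of a fine tagged partition concrete, here is a small sketch that checks condition (2.3) and searches for a fine tagged partition. Note that the search uses repeated bisection (in the spirit of Cousin's lemma) rather than the finite-subcover construction given above, and that the names `is_fine`, `fine_partition` and the particular gauge are our own illustrative assumptions.

```python
from fractions import Fraction as F

def is_fine(points, tags, gauge):
    """Check (2.3): each [x_{n-1}, x_n] lies in (t_n - gauge(t_n), t_n + gauge(t_n))."""
    return all(
        left <= t <= right and t - gauge(t) < left and right < t + gauge(t)
        for left, right, t in zip(points, points[1:], tags)
    )

def fine_partition(a, b, gauge, depth=60):
    """Search for a gauge-fine tagged partition of [a, b] by repeated bisection.

    Only the endpoints and the midpoint are tried as tags, so this is a
    heuristic sketch: it raises RuntimeError on gauges needing other tags.
    """
    for t in (a, (a + b) / 2, b):
        if t - gauge(t) < a and b < t + gauge(t):
            return [a, b], [t]
    if depth == 0:
        raise RuntimeError("no fine partition found with these candidate tags")
    mid = (a + b) / 2
    left_pts, left_tags = fine_partition(a, mid, gauge, depth - 1)
    right_pts, right_tags = fine_partition(mid, b, gauge, depth - 1)
    return left_pts + right_pts[1:], left_tags + right_tags

def gauge(t):
    # Hypothetical gauge that shrinks near 0 but is positive everywhere.
    return F(1, 10) if t == 0 else t / 2

points, tags = fine_partition(F(0), F(1), gauge)
print(points, tags)
```

With this gauge no constant $\delta$ could replace $\delta(t)$ near $0$ except by forcing $0$ itself to be a tag, which is what the search finds on the leftmost subinterval.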
With the uniqueness of the Henstock-Kurtzweil integral in hand, and the fact that it extends the definition of the Riemann integral, we can without fear of confusion denote the Henstock-Kurtzweil integral by $\int_a^b f$ when $f \in \mathcal{HK}[a, b]$. It is now possible to develop the standard properties of these integrals as in Theorem 19 and the lemmas above for Riemann integrals. The proofs are typically very similar to those commonly used for Riemann integration. One exception is the Fundamental Theorem of Calculus for the Henstock-Kurtzweil integral, which requires a more complicated proof. In fact, it turns out that the theory of the Henstock-Kurtzweil integral is sufficiently rich to include the theory of the Lebesgue integral, which we consider in detail in a later chapter. For further development of the theory of the Henstock-Kurtzweil integral we refer the reader to Bartle and Sherbert [1] and the references given there.
3. Exercises
Exercise 5. Define a sequence $\{f_n\}_{n=1}^{\infty}$ of bounded continuous functions on $[0, 1)$ by
\[ f_n(x) = \chi_{(0,1)}(x) \min\left\{ x^{-\frac{1}{4}}, 2^{\frac{n}{4}} \right\}. \]
(1) Prove that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $L^2_{\mathcal{R}}([0, 1))$, the vector space of Riemann integrable functions on $[0, 1)$ with metric
\[ d(f, g) = \left( \int_0^1 |f(x) - g(x)|^2\, dx \right)^{\frac{1}{2}}, \]
after we have identified functions $f$ and $g$ with $d(f, g) = 0$.
(2) Prove that there is no (bounded) Riemann integrable function $f$ such that $f_n \to f$ in $L^2_{\mathcal{R}}([0, 1))$.
CHAPTER 4
Lebesgue measure theory
Recall that $f$ is Riemann integrable on $[0, 1)$, written $f \in \mathcal{R}[0, 1)$, if $\mathcal{U}(f) = \mathcal{L}(f)$, and we denote the common value by $\int_0^1 f$ or $\int_0^1 f(x)\, dx$. Here $\mathcal{U}(f)$ and $\mathcal{L}(f)$ are the upper and lower Riemann integrals of $f$ on $[0, 1)$ respectively given by
\[ \mathcal{U}(f) = \inf_{\mathcal{P}} U(f; \mathcal{P}) = \inf_{\mathcal{P}} \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n)} f \right) \Delta x_n, \]
\[ \mathcal{L}(f) = \sup_{\mathcal{P}} L(f; \mathcal{P}) = \sup_{\mathcal{P}} \sum_{n=1}^{N} \left( \inf_{[x_{n-1}, x_n)} f \right) \Delta x_n, \]
where $\mathcal{P} = \{0 = x_0 < x_1 < \dots < x_N = 1\}$ is any partition of $[0, 1)$ and $\Delta x_n = x_n - x_{n-1} > 0$. This definition differs from that in Definition 7 only in that the subinterval $[x_{n-1}, x_n)$ used here is different from the subinterval $[x_{n-1}, x_n]$ used there. It is an easy exercise to show these two definitions coincide for bounded $f$ on $[0, 1)$, regardless of the value $f$ may take at $1$. For convenience we work with $[0, 1)$ in place of $[0, 1]$ for now.
This definition is simple and easy to work with and applies in particular to bounded continuous functions $f$ on $[0,1)$ since it is not too hard to prove that $f \in \mathcal{R}[0,1)$ for such $f$, see e.g. Theorem 17 with $\alpha(x) = x$. However, if we consider the vector space $L^2_{\mathcal{R}}([0,1))$ of Riemann integrable functions $f \in \mathcal{R}[0,1)$ endowed with the metric
$$d(f,g) = \Big( \int_0^1 |f(x) - g(x)|^2 \, dx \Big)^{1/2},$$
it turns out that while $L^2_{\mathcal{R}}([0,1))$ can indeed be proved a metric space (actually we must consider equivalence classes of functions where we identify functions $f$ and $g$ if $\int_0^1 |f(x) - g(x)|^2 \, dx = 0$), it fails to be complete. This is a serious shortfall of Riemann's theory of integration, and is our main motivation for considering the more complicated theory of Lebesgue below. We note that the immediate reason for the lack of completeness of $L^2_{\mathcal{R}}([0,1))$ is the inability of Riemann's theory to handle general unbounded functions. For example, the sequence $\{f_n\}_{n=1}^{\infty}$ defined on $[0,1)$ by
$$f_n(x) = \chi_{[0,1)}(x) \min\big\{ x^{-1/4}, 2^{n/4} \big\}$$
is a Cauchy sequence in $L^2_{\mathcal{R}}([0,1))$ that clearly has no bounded function as limit in $L^2_{\mathcal{R}}([0,1))$. Indeed,
$$d(f_n, f_{n+1})^2 = \int_0^{2^{-n}} |f_{n+1}(x) - f_n(x)|^2 \, dx \le \int_0^{2^{-n}} \big( 2^{n/4} \big)^2 \, dx = 2^{-n/2},$$
and so for $m > n$ we have
$$d(f_n, f_m) \le \sum_{k=n}^{m-1} d(f_k, f_{k+1}) \le \sum_{k=n}^{m-1} 2^{-k/4} \to 0 \quad \text{as } n \to \infty .$$
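The estimate $d(f_n, f_{n+1})^2 \le 2^{-n/2}$ can be checked numerically. The following is only a sketch: the function name `dist_sq` and the closed-form antiderivative are our own working, not part of the notes. It uses the fact that $f_{n+1} - f_n$ equals the constant $2^{(n+1)/4} - 2^{n/4}$ on $[0, 2^{-(n+1)})$, equals $x^{-1/4} - 2^{n/4}$ on $[2^{-(n+1)}, 2^{-n})$, and vanishes elsewhere.

```python
import math


def dist_sq(n: int) -> float:
    """d(f_n, f_{n+1})^2 for f_n(x) = min(x**-0.25, 2**(n/4)) on [0, 1)."""
    a = 2.0 ** (-(n + 1))        # below a both functions are constant
    b = 2.0 ** (-n)              # above b the two functions agree
    c = 2.0 ** (n / 4)           # cap of f_n
    c1 = 2.0 ** ((n + 1) / 4)    # cap of f_{n+1}

    # On [0, a): the difference is the constant c1 - c.
    left = a * (c1 - c) ** 2

    # On [a, b): the difference is x**-0.25 - c, whose square has this antiderivative.
    def antiderivative(x: float) -> float:
        return 2 * math.sqrt(x) - (8 / 3) * c * x ** 0.75 + c * c * x

    right = antiderivative(b) - antiderivative(a)
    return left + right


for n in (1, 5, 10, 20):
    print(n, dist_sq(n), 2.0 ** (-n / 2))
```

Each printed pair compares the computed squared distance with the bound $2^{-n/2}$ used in the text.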
However, even locally there are problems, and here is a sketch of one such problem. Indeed, once we have Lebesgue's theory in hand, we can construct a famous example of a Lebesgue measurable subset $E$ of $[0,1)$ with the (somewhat surprising) property that
$$0 < |E \cap (a,b)| < b - a, \qquad 0 \le a < b \le 1,$$
where $|F|$ denotes the Lebesgue measure of a measurable set $F$ (see Problem 3 below). It follows that the characteristic function $\chi_E$ is bounded and Lebesgue measurable, but that there is no Riemann integrable function $f$ such that $f = \chi_E$ almost everywhere, since such an $f$ would satisfy $\mathcal{U}(f) = 1$ and $\mathcal{L}(f) = 0$. Nevertheless, by Lusin's Theorem (see Theorem 35 below or page 34 in [6] or page 55 in [4]) there is a sequence of compactly supported continuous functions (hence Riemann integrable) converging to $\chi_E$ almost everywhere and that are uniformly bounded. By the Dominated Convergence Theorem 31 below, this sequence is Cauchy in $L^2_{\mathcal{R}}([0,1))$.
On the other hand, in Lebesgue's theory of integration, we partition the range $[0,M)$ of the bounded function $f$ into a homogeneous partition,
$$[0,M) = \dot{\bigcup_{n=1}^{N}} \Big[ (n-1)\frac{M}{N}, \, n\frac{M}{N} \Big) = \dot{\bigcup_{n=1}^{N}} I_n ,$$
and we consider the associated upper and lower Lebesgue sums of $f$ on $[0,1)$ defined by
$$U^*(f;\mathcal{P}) = \sum_{n=1}^{N} \Big( n\frac{M}{N} \Big) \big| f^{-1}(I_n) \big| ,$$
$$L^*(f;\mathcal{P}) = \sum_{n=1}^{N} \Big( (n-1)\frac{M}{N} \Big) \big| f^{-1}(I_n) \big| ,$$
where of course
$$f^{-1}(I_n) = \Big\{ x \in [0,1) : f(x) \in I_n = \Big[ (n-1)\frac{M}{N}, \, n\frac{M}{N} \Big) \Big\},$$
and $|E|$ denotes the "measure" or "length" of the subset $E$ of $[0,1)$.
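As a concrete sketch of these sums, take the illustrative choice $f(x) = x^2$ on $[0,1)$ with $M = 1$ (this example is ours, not from the notes). Here the preimage of $I_n$ is the interval $[\sqrt{(n-1)/N}, \sqrt{n/N})$, so its length is known without any measure theory, and the gap between the upper and lower sums comes out as $M/N$.

```python
import math


def lebesgue_sums(N: int) -> tuple[float, float]:
    """Upper and lower Lebesgue sums of f(x) = x**2 on [0, 1), range [0, 1) cut into N pieces."""
    upper = 0.0
    lower = 0.0
    for n in range(1, N + 1):
        lo, hi = (n - 1) / N, n / N
        # f^{-1}([lo, hi)) = [sqrt(lo), sqrt(hi)) because f is increasing on [0, 1).
        length = math.sqrt(hi) - math.sqrt(lo)
        upper += hi * length
        lower += lo * length
    return upper, lower


for N in (10, 100, 1000):
    U, L = lebesgue_sums(N)
    print(N, U, L, U - L)
```

The two sums bracket $\int_0^1 x^2\,dx = 1/3$, and their difference shrinks like $1/N$ because the lengths of the preimages add up to $1$.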
Here there will be no problem obtaining that $U^*(f;\mathcal{P}) - L^*(f;\mathcal{P})$ is small provided we can make reasonable sense of $\big| f^{-1}(I_n) \big|$. But this is precisely the difficulty with Lebesgue's approach - we need to define a notion of "measure" or "length" for subsets $E$ of $[0,1)$. That this is not going to be as easy as we might hope is evidenced by the following negative result. Let $\mathcal{P}([0,1))$ denote the power set of $[0,1)$, i.e. the set of all subsets of $[0,1)$. For $x \in [0,1)$ and $E \in \mathcal{P}([0,1))$ we define the translation $E \oplus x$ of $E$ by $x$ to be the set in $\mathcal{P}([0,1))$ defined by
$$E \oplus x = E + x \pmod 1 = \{ z \in [0,1) : \text{there is } y \in E \text{ with } y + x - z \in \mathbb{Z} \}.$$
Theorem 20. There is no map $\mu : \mathcal{P}([0,1)) \to [0,\infty)$ satisfying the following three properties:
(1) $\mu([0,1)) = 1$,
(2) $\mu\big( \dot{\bigcup}_{n=1}^{\infty} E_n \big) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{P}([0,1))$,
(3) $\mu(E \oplus x) = \mu(E)$ for all $E \in \mathcal{P}([0,1))$.
Remark 7. All three of these properties are desirable for any notion of measure or length of subsets of $[0,1)$. The theorem suggests then that we should not demand that every subset of $[0,1)$ be "measurable". This will then restrict the functions $f$ that we can integrate to those for which $f^{-1}([a,b))$ is "measurable" for all $-\infty < a < b < \infty$.
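Proof: Let $\{r_n\}_{n=1}^{\infty} = \mathbb{Q} \cap [0,1)$ be an enumeration of the rational numbers in $[0,1)$. Define an equivalence relation on $[0,1)$ by declaring that $x \sim y$ if $x - y \in \mathbb{Q}$. Let $\mathcal{A}$ be the set of equivalence classes. Use the axiom of choice to pick a representative $a = \langle A \rangle$ from each equivalence class $A$ in $\mathcal{A}$. Finally, let $E = \{ \langle A \rangle : A \in \mathcal{A} \}$ be the set consisting of these representatives $a$, one from each equivalence class $A$ in $\mathcal{A}$.
Then we have
$$[0,1) = \dot{\bigcup_{n=1}^{\infty}} E \oplus r_n .$$
Indeed, if $x \in [0,1)$, then $x \in A$ for some $A \in \mathcal{A}$, and thus $x \sim a = \langle A \rangle$, i.e. $x - a \in \mathbb{Q}$. If $x \ge a$ then $x - a \in \mathbb{Q} \cap [0,1)$ and $x = a + r_n$ where $a \in E$ and $r_n \in \{r_n\}_{n=1}^{\infty}$. If $x < a$ then $x - a + 1 = r_n \in \mathbb{Q} \cap [0,1)$, i.e. $a + r_n - x = 1 \in \mathbb{Z}$ where $a \in E$, and it follows by definition that $x \in E \oplus r_n$. Finally, if $a \oplus r_m = b \oplus r_n$ with $a, b \in E$, then $a - b$ differs from $r_n - r_m$ by an integer, so $a - b \in \mathbb{Q}$, which implies that $a \sim b$. Since $E$ contains exactly one representative of each class, $a = b$ and then $r_m = r_n$. Thus the union is pairwise disjoint.
Now by properties (1), (2) and (3) in succession we have
$$1 = \mu([0,1)) = \mu\Big( \dot{\bigcup_{n=1}^{\infty}} E \oplus r_n \Big) = \sum_{n=1}^{\infty} \mu(E \oplus r_n) = \sum_{n=1}^{\infty} \mu(E),$$
which is impossible since the infinite series $\sum_{n=1}^{\infty} \mu(E)$ is either $\infty$ if $\mu(E) > 0$ or $0$ if $\mu(E) = 0$.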
1. Lebesgue measure on the real line
In order to define a "measure" satisfying the three properties in Theorem 20, we must restrict the domain of definition of the set functional $\mu$ to a "suitable" proper subset of the power set $\mathcal{P}([0,1))$. A good notion of "suitable" is captured by the following definition where we expand our quest for measure to the entire real line.
Definition 10. A collection $\mathcal{A} \subset \mathcal{P}(\mathbb{R})$ of subsets of real numbers $\mathbb{R}$ is called a $\sigma$-algebra if the following properties are satisfied:
(1) $\emptyset \in \mathcal{A}$,
(2) $A^c \in \mathcal{A}$ whenever $A \in \mathcal{A}$,
(3) $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$ whenever $A_n \in \mathcal{A}$ for all $n$.
Here is the theorem asserting the existence of "Lebesgue measure" on the real line.
Theorem 21. There is a $\sigma$-algebra $\mathcal{L} \subset \mathcal{P}(\mathbb{R})$ and a function $\mu : \mathcal{L} \to [0,\infty]$ such that
(1) $[a,b) \in \mathcal{L}$ and $\mu([a,b)) = b - a$ for all $-\infty < a < b < \infty$,
(2) $\mu\big( \dot{\bigcup}_{n=1}^{\infty} E_n \big) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{L}$,
(3) $E + x \in \mathcal{L}$ and $\mu(E + x) = \mu(E)$ for all $E \in \mathcal{L}$,
(4) $F \in \mathcal{L}$ and $\mu(F) = 0$ whenever $F \subset E$ and $E \in \mathcal{L}$ with $\mu(E) = 0$.
The sets in the $\sigma$-algebra $\mathcal{L}$ are called Lebesgue measurable sets. A pair $(\mathcal{L}, \mu)$ satisfying only property (2) is called a measure space. Property (1) says that the measure $\mu$ is an extension of the usual length function on intervals. Property (3) says that the measure is translation invariant, while property (4) says that the measure is complete.
From property (2) and the fact that $\mu$ is nonnegative, and finite on intervals, we easily obtain the following elementary consequences (where membership in $\mathcal{L}$ is implied by context):
(1.1)
$\emptyset \in \mathcal{L}$ and $\mu(\emptyset) = 0$,
$E \in \mathcal{L}$ for every open set $E$ in $\mathbb{R}$,
$\mu(I) = b - a$ for any interval $I$ with endpoints $a$ and $b$,
$\mu(E) = \sup_n \mu(E_n) = \lim_{n \to \infty} \mu(E_n)$ if $E_n \nearrow E$,
$\mu(E) = \inf_n \mu(E_n) = \lim_{n \to \infty} \mu(E_n)$ if $E_n \searrow E$ and $\mu(E_1) < \infty$.
For example, the fourth line follows from writing
$$E = E_1 \,\dot{\cup}\, \Big( \dot{\bigcup_{n=1}^{\infty}} E_{n+1} \cap (E_n)^c \Big)$$
and then using property (2) of $\mu$.
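For the fourth line of (1.1), the step from this decomposition to the limit can be spelled out as a worked equation. This is our own filling-in of the routine details, using only property (2) and $\mu(\emptyset) = 0$ (which give finite additivity as a special case):

```latex
\mu(E) = \mu(E_1) + \sum_{n=1}^{\infty} \mu\big( E_{n+1} \cap (E_n)^c \big)
       = \lim_{N \to \infty} \Big[ \mu(E_1) + \sum_{n=1}^{N-1} \mu\big( E_{n+1} \cap (E_n)^c \big) \Big]
       = \lim_{N \to \infty} \mu(E_N),
\qquad \text{since } E_N = E_1 \,\dot{\cup}\, \dot{\bigcup_{n=1}^{N-1}} \big( E_{n+1} \cap (E_n)^c \big).
```

The limit equals the supremum because $\mu(E_N)$ is nondecreasing in $N$.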
To prove Theorem 21 we follow the treatment in [6] with simplifications due to the fact that Theorem 15 implies the connected open subsets of the real numbers $\mathbb{R}$ are just the open intervals $(a,b)$. Define for any $E \in \mathcal{P}(\mathbb{R})$, the outer Lebesgue measure $\mu^*(E)$ of $E$ by,
(1.2)
$$\mu^*(E) = \inf\Big\{ \sum_{n=1}^{\infty} (b_n - a_n) : E \subset \dot{\bigcup_{n=1}^{\infty}} (a_n, b_n) \text{ and } -\infty \le a_n < b_n \le \infty \Big\}.$$
It is immediate that $\mu^*$ is monotone,
$$\mu^*(E) \le \mu^*(F) \quad \text{if } E \subset F.$$
A little less obvious is countable subadditivity of $\mu^*$. The reason lies in the use of pairwise disjoint covers of $E$ by open intervals in the definition of $\mu^*(E)$ in (1.2). If we had instead used arbitrary open covers by open intervals in the definition, then countable subadditivity of $\mu^*$ would have been trivial.
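Lemma 10. $\mu^*$ is countably subadditive:
$$\mu^*\Big( \bigcup_{n=1}^{\infty} E_n \Big) \le \sum_{n=1}^{\infty} \mu^*(E_n), \qquad \{E_n\}_{n=1}^{\infty} \subset \mathcal{P}(\mathbb{R}).$$
Proof: Let $E = \bigcup_{n=1}^{\infty} E_n$. Given $0 < \varepsilon < 1$, we have $E_n \subset \dot{\bigcup}_{k=1}^{\infty} (a_{k,n}, b_{k,n})$ with
$$\sum_{k=1}^{\infty} (b_{k,n} - a_{k,n}) < \mu^*(E_n) + \frac{\varepsilon}{2^n}, \qquad n \ge 1.$$
Now let
$$\bigcup_{n=1}^{\infty} \Big( \dot{\bigcup_{k=1}^{\infty}} (a_{k,n}, b_{k,n}) \Big) = \dot{\bigcup_{m=1}^{M^*}} (c_m, d_m),$$
where $M^* \in \mathbb{N} \cup \{\infty\}$. Then define disjoint sets of indices
$$\mathcal{I}_m = \{ (k,n) : (a_{k,n}, b_{k,n}) \subset (c_m, d_m) \}.$$
In the case $c_m, d_m \in \mathbb{R}$, we can choose by compactness a finite subset $\mathcal{F}_m$ of $\mathcal{I}_m$ such that
(1.3)
$$\Big[ c_m + \frac{\varepsilon}{2}\delta_m, \; d_m - \frac{\varepsilon}{2}\delta_m \Big] \subset \bigcup_{(k,n) \in \mathcal{F}_m} (a_{k,n}, b_{k,n}),$$
where $\delta_m = d_m - c_m$. We may assume that each such interval $(a_{k,n}, b_{k,n})$ has nonempty intersection with the compact interval on the left side of (1.3). Fix $m$ and arrange the left endpoints $\{a_{k,n}\}_{(k,n) \in \mathcal{F}_m}$ in strictly increasing order $\{a_i\}_{i=1}^{I}$ and denote the corresponding right endpoints by $b_i$ (if there is more than one interval $(a_i, b_i)$ with the same left endpoint $a_i$, discard all but one of the largest of them). From (1.3) it now follows that $a_{i+1} \in (a_i, b_i)$ for $i < I$ since otherwise $b_i$ would be in the left side of (1.3), but not in the right side, a contradiction. Thus $a_{i+1} - a_i \le b_i - a_i$ for $1 \le i < I$ and we have the inequality
$$(1 - \varepsilon)\delta_m = \Big( d_m - \frac{\varepsilon}{2}\delta_m \Big) - \Big( c_m + \frac{\varepsilon}{2}\delta_m \Big) \le b_I - a_1 = (b_I - a_I) + \sum_{i=1}^{I-1} (a_{i+1} - a_i)$$
$$\le \sum_{i=1}^{I} (b_i - a_i) \le \sum_{(k,n) \in \mathcal{F}_m} (b_{k,n} - a_{k,n}) \le \sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n}).$$
We also observe that a similar argument shows that $\sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n}) = \infty$ if $\delta_m = \infty$. Then we have
$$\mu^*(E) \le \sum_{m=1}^{M^*} \delta_m \le \frac{1}{1 - \varepsilon} \sum_{m=1}^{M^*} \sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n})$$
$$\le \frac{1}{1 - \varepsilon} \sum_{k,n} (b_{k,n} - a_{k,n}) = \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} (b_{k,n} - a_{k,n})$$
$$< \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \Big( \mu^*(E_n) + \frac{\varepsilon}{2^n} \Big) = \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \mu^*(E_n) + \frac{\varepsilon}{1 - \varepsilon}.$$
Let $\varepsilon \to 0$ to obtain the countable subadditivity of $\mu^*$.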
Definition 11. Now define the subset $\mathcal{L}$ of $\mathcal{P}(\mathbb{R})$ to consist of all subsets $A$ of the real line such that for every $\varepsilon > 0$, there is an open set $G \supset A$ satisfying
(1.4) $\mu^*(G \setminus A) < \varepsilon$.
Remark 8. Condition (1.4) says that $A$ can be well approximated from the outside by open sets. The most difficult task we will face below in using this definition of $\mathcal{L}$ is to prove that such sets $A$ can also be well approximated from the inside by closed sets.
Set
$$\mu(A) = \mu^*(A), \qquad A \in \mathcal{L}.$$
Trivially, every open set and every interval is in $\mathcal{L}$. We will use the following two claims in the proof of Theorem 21.
Claim 2. If $G$ is open and $G = \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$ (where $N^* \in \mathbb{N} \cup \{\infty\}$) is the decomposition of $G$ into its connected components $(a_n, b_n)$ (Proposition 2 of Chapter 2), then
$$\mu(G) = \mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n).$$
We first prove Claim 2 when $N^* < \infty$. If $G \subset \dot{\bigcup}_{m=1}^{\infty} (c_m, d_m)$, then for each $1 \le n \le N^*$, $(a_n, b_n) \subset (c_m, d_m)$ for some $m$ since $(a_n, b_n)$ is connected. If
$$\mathcal{I}_m = \{ n : (a_n, b_n) \subset (c_m, d_m) \},$$
it follows upon arranging the $a_n$ in increasing order that
$$\sum_{n \in \mathcal{I}_m} (b_n - a_n) \le d_m - c_m ,$$
since the intervals $(a_n, b_n)$ are pairwise disjoint. We now conclude that
$$\mu^*(G) = \inf\Big\{ \sum_{m=1}^{\infty} (d_m - c_m) : G \subset \dot{\bigcup_{m=1}^{\infty}} (c_m, d_m) \Big\} \ge \sum_{m=1}^{\infty} \sum_{n \in \mathcal{I}_m} (b_n - a_n) = \sum_{n=1}^{N^*} (b_n - a_n),$$
and hence that $\mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n)$ by definition since $G \subset \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$.
Finally, if $N^* = \infty$, then from what we just proved and monotonicity, we have
$$\mu^*(G) \ge \mu^*\Big( \dot{\bigcup_{n=1}^{N}} (a_n, b_n) \Big) = \sum_{n=1}^{N} (b_n - a_n)$$
for each $1 \le N < \infty$. Taking the supremum over $N$ gives $\mu^*(G) \ge \sum_{n=1}^{\infty} (b_n - a_n)$, and then equality follows by definition since $G \subset \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n)$.
Claim 3. If $A$ and $B$ are disjoint compact subsets of $\mathbb{R}$, then
$$\mu^*(A) + \mu^*(B) = \mu^*(A \cup B).$$
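First note that
$$\delta = \mathrm{dist}(A, B) \equiv \inf\{ |x - y| : x \in A, \, y \in B \} > 0,$$
since the function $f(x,y) \equiv |x - y|$ is positive and continuous on the closed and bounded (hence compact) subset $A \times B$ of the plane - Theorem 12 shows that $f$ achieves its infimum $\mathrm{dist}(A,B)$, which is thus positive. So we can find open sets $U$ and $V$ such that
$$A \subset U \quad \text{and} \quad B \subset V \quad \text{and} \quad U \cap V = \emptyset.$$
For example, $U = \bigcup_{x \in A} B\big(x, \frac{\delta}{2}\big)$ and $V = \bigcup_{x \in B} B\big(x, \frac{\delta}{2}\big)$ work. Now suppose that
$$A \cup B \subset G \text{ open}.$$
Then we have
$$A \subset U \cap G = \dot{\bigcup_{k=1}^{K}} (e_k, f_k) \quad \text{and} \quad B \subset V \cap G = \dot{\bigcup_{k=1}^{L}} (g_k, h_k),$$
and then from Claim 2 and monotonicity of $\mu^*$ we obtain, using that $G$ contains the disjoint union of $U \cap G$ and $V \cap G$,
$$\mu^*(A) + \mu^*(B) \le \sum_{k=1}^{K} (f_k - e_k) + \sum_{k=1}^{L} (h_k - g_k) = \mu^*\Big( \Big( \dot{\bigcup_{k=1}^{K}} (e_k, f_k) \Big) \,\dot{\cup}\, \Big( \dot{\bigcup_{k=1}^{L}} (g_k, h_k) \Big) \Big) \le \mu^*(G).$$
Taking the infimum over such $G$ gives $\mu^*(A) + \mu^*(B) \le \mu^*(A \cup B)$, and subadditivity of $\mu^*$ now proves equality.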
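Proof (of Theorem 21): We now prove that $\mathcal{L}$ is a $\sigma$-algebra and that $\mathcal{L}$ and $\mu$ satisfy the four properties in the statement of Theorem 21. First we establish that $\mathcal{L}$ is a $\sigma$-algebra in four steps.
Step 1: $A \in \mathcal{L}$ if $\mu^*(A) = 0$.
Given $\varepsilon > 0$, there is an open $G \supset A$ with $\mu^*(G) < \varepsilon$. But then $\mu^*(G \setminus A) \le \mu^*(G) < \varepsilon$ by monotonicity.
Step 2: $\bigcup_{n=1}^{\infty} A_n \in \mathcal{L}$ whenever $A_n \in \mathcal{L}$ for all $n$.
Given $\varepsilon > 0$, there is an open $G_n \supset A_n$ with $\mu^*(G_n \setminus A_n) < \frac{\varepsilon}{2^n}$. Then $A \equiv \bigcup_{n=1}^{\infty} A_n$ is contained in the open set $G \equiv \bigcup_{n=1}^{\infty} G_n$, and since $G \setminus A$ is contained in $\bigcup_{n=1}^{\infty} (G_n \setminus A_n)$, monotonicity and subadditivity of $\mu^*$ yield
$$\mu^*(G \setminus A) \le \mu^*\Big( \bigcup_{n=1}^{\infty} (G_n \setminus A_n) \Big) \le \sum_{n=1}^{\infty} \mu^*(G_n \setminus A_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon.$$
Step 3: $A \in \mathcal{L}$ if $A$ is closed.
Suppose first that $A$ is compact, and let $\varepsilon > 0$. Then using Claim 2 there is $G = \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$ containing $A$ with
$$\mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n) \le \mu^*(A) + \varepsilon < \infty.$$
Now $G \setminus A$ is open and so $G \setminus A = \dot{\bigcup}_{m=1}^{M^*} (c_m, d_m)$ by Proposition 2. We want to show that $\mu^*(G \setminus A) \le \varepsilon$. Fix a finite $M \le M^*$ and
$$0 < \eta < \frac{1}{2} \min_{1 \le m \le M} (d_m - c_m).$$
Then the compact set
$$K_\eta = \bigcup_{m=1}^{M} [c_m + \eta, \, d_m - \eta]$$
is disjoint from $A$, so by Claim 3 and induction we have
$$\mu^*(A \cup K_\eta) = \mu^*(A) + \mu^*(K_\eta) = \mu^*(A) + \sum_{m=1}^{M} \mu^*([c_m + \eta, \, d_m - \eta]).$$
We conclude from monotonicity and $A \cup K_\eta \subset G$ that
$$\mu^*(A) + \sum_{m=1}^{M} (d_m - c_m - 2\eta) = \mu^*(A \cup K_\eta) \le \mu^*(G) \le \mu^*(A) + \varepsilon.$$
Since $\mu^*(A) < \infty$ for $A$ compact, we thus have
$$\sum_{m=1}^{M} (d_m - c_m) \le \varepsilon + 2M\eta$$
for all $0 < \eta < \frac{1}{2} \min_{1 \le m \le M} (d_m - c_m)$. Hence $\sum_{m=1}^{M} (d_m - c_m) \le \varepsilon$ and taking the supremum in $M \le M^*$ we obtain from Claim 2 that
$$\mu^*(G \setminus A) = \sum_{m=1}^{M^*} (d_m - c_m) \le \varepsilon.$$
Finally, if $A$ is closed, it is a countable union of compact sets $A = \bigcup_{n=1}^{\infty} ([-n, n] \cap A)$, and hence $A \in \mathcal{L}$ by Step 2.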
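Step 4: $A^c \in \mathcal{L}$ if $A \in \mathcal{L}$.
For each $n \ge 1$ there is, by the definition of $\mathcal{L}$ in (1.4), an open set $G_n \supset A$ such that $\mu^*(G_n \setminus A) < \frac{1}{n}$. Then $F_n \equiv G_n^c$ is closed and hence $F_n \in \mathcal{L}$ by Step 3. Thus, by Step 2,
$$S \equiv \bigcup_{n=1}^{\infty} F_n \in \mathcal{L}, \qquad S \subset A^c,$$
and $A^c \setminus S \subset G_n \setminus A$ for all $n$ implies that
$$\mu^*(A^c \setminus S) \le \mu^*(G_n \setminus A) < \frac{1}{n}, \qquad n \ge 1.$$
Thus $\mu^*(A^c \setminus S) = 0$ and by Step 1 we have $A^c \setminus S \in \mathcal{L}$. Finally, Step 2 shows that
$$A^c = S \cup (A^c \setminus S) \in \mathcal{L}.$$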
Thus far we have shown that $\mathcal{L}$ is a $\sigma$-algebra, and we now turn to proving that $\mathcal{L}$ and $\mu$ satisfy the four properties in Theorem 21. Property (1) is an easy exercise. Property (2) is the main event. Let $\{E_n\}_{n=1}^{\infty}$ be a pairwise disjoint sequence of sets in $\mathcal{L}$, and let $E = \dot{\bigcup}_{n=1}^{\infty} E_n$.
We will consider first the case where each of the sets $E_n$ is bounded. Let $\varepsilon > 0$ be given. Then $E_n^c \in \mathcal{L}$ and so there are open sets $G_n \supset E_n^c$ such that
$$\mu^*(G_n \setminus E_n^c) < \frac{\varepsilon}{2^n}, \qquad n \ge 1.$$
Equivalently, with $F_n = G_n^c$, we have $F_n$ closed, contained in $E_n$, and
$$\mu^*(E_n \setminus F_n) < \frac{\varepsilon}{2^n}, \qquad n \ge 1.$$
Thus the sets $F_n$ in the sequence $\{F_n\}_{n=1}^{\infty}$ are compact (being closed and contained in the bounded sets $E_n$) and pairwise disjoint. Claim 3 and induction show that
$$\sum_{n=1}^{N} \mu^*(F_n) = \mu^*\Big( \dot{\bigcup_{n=1}^{N}} F_n \Big) \le \mu^*(E), \qquad N \ge 1,$$
and taking the supremum over $N$ yields
$$\sum_{n=1}^{\infty} \mu^*(F_n) \le \mu^*(E).$$
Thus we have
$$\sum_{n=1}^{\infty} \mu^*(E_n) \le \sum_{n=1}^{\infty} \big\{ \mu^*(E_n \setminus F_n) + \mu^*(F_n) \big\} \le \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} + \sum_{n=1}^{\infty} \mu^*(F_n) \le \varepsilon + \mu^*(E).$$
Since $\varepsilon > 0$ is arbitrary we conclude that $\sum_{n=1}^{\infty} \mu^*(E_n) \le \mu^*(E)$, and subadditivity of $\mu^*$ then proves equality.
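In general, define $E_{n,k} = E_n \cap [k, k+1)$ for $k \in \mathbb{Z}$ so that
$$E = \dot{\bigcup_{n=1}^{\infty}} E_n = \dot{\bigcup_{n \ge 1, \, k \in \mathbb{Z}}} E_{n,k} .$$
Then from what we just proved applied first to $E$ and then to $E_n$ we have
$$\mu^*(E) = \sum_{n \ge 1, \, k \in \mathbb{Z}} \mu^*(E_{n,k}) = \sum_{n=1}^{\infty} \Big( \sum_{k \in \mathbb{Z}} \mu^*(E_{n,k}) \Big) = \sum_{n=1}^{\infty} \mu^*(E_n).$$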
Finally, property (3) follows from the observation that $E \subset \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n)$ if and only if $E + x \subset \dot{\bigcup}_{n=1}^{\infty} (a_n + x, b_n + x)$. It is then obvious that $\mu^*(E + x) = \mu^*(E)$ and that $E + x \in \mathcal{L}$ if $E \in \mathcal{L}$. Property (4) is immediate from Step 1 above. This completes the proof of Theorem 21.
Remark 9. The above proof also establishes the regularity of Lebesgue measure: for every $E \in \mathcal{L}$ and $\varepsilon > 0$, there is a closed set $F$ and an open set $G$ satisfying
$$F \subset E \subset G, \qquad \mu(G \setminus F) < \varepsilon.$$
This follows from the definition of $\mathcal{L}$ together with the fact that $\mathcal{L}$ is closed under complementation.
2. Measurable functions and integration
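Let $[-\infty, \infty] = \mathbb{R} \cup \{-\infty, \infty\}$ be the extended real numbers with order and (some) algebra operations defined by
$-\infty < x < \infty$, $x \in \mathbb{R}$,
$x + \infty = \infty$, $x \in \mathbb{R}$,
$x - \infty = -\infty$, $x \in \mathbb{R}$,
$x \cdot \infty = \infty$, $x > 0$,
$x \cdot \infty = -\infty$, $x < 0$,
$0 \cdot \infty = 0$.
The final assertion $0 \cdot \infty = 0$ is dictated by $\sum_{n=1}^{\infty} a_n = 0$ if all the $a_n = 0$. It turns out that these definitions give rise to a consistent theory of measure and integration of functions with values in the extended real number system.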
Let $f : \mathbb{R} \to [-\infty, \infty]$. We say that $f$ is (Lebesgue) measurable if
$$f^{-1}([-\infty, x)) \in \mathcal{L}, \qquad x \in \mathbb{R}.$$
The simplest examples of measurable functions are the characteristic functions $\chi_E$ of measurable sets $E$. Indeed,
$$(\chi_E)^{-1}([-\infty, x)) = \begin{cases} \emptyset & \text{if } x \le 0 \\ E^c & \text{if } 0 < x \le 1 \\ \mathbb{R} & \text{if } x > 1 \end{cases}.$$
It is then easy to see that finite linear combinations $s = \sum_{n=1}^{N} a_n \chi_{E_n}$ of such characteristic functions $\chi_{E_n}$, called simple functions, are also measurable. Here $a_n \in \mathbb{R}$ and $E_n$ is a measurable subset of $\mathbb{R}$ with finite measure. Note that these functions are those arising as upper and lower Lebesgue sums. However, since the difference of upper and lower Lebesgue sums is automatically controlled, we proceed to develop integration by an approximation method instead. It turns out that if we define the integral of a simple function $s = \sum_{n=1}^{N} a_n \chi_{E_n}$ by
$$\int_{\mathbb{R}} s = \sum_{n=1}^{N} a_n \mu(E_n),$$
the value is independent of the representation of $s$ as a simple function. Armed with this fact we can then extend the definition of integral $\int_{\mathbb{R}} f$ to functions $f$ that are nonnegative on $\mathbb{R}$, and then to functions $f$ such that $\int_{\mathbb{R}} |f| < \infty$.
At each stage one establishes the relevant properties of the integral along with the most useful theorems. For the most part these extensions are rather routine, the cleverness inherent in the theory being in the overarching organization of the concepts rather than in the details of the demonstrations. As a result, we will merely state the main results in logical order and sketch proofs when not simply routine. We will however give fairly detailed proofs of the three famous convergence theorems, the Monotone Convergence Theorem, Fatou's Lemma, and the Dominated Convergence Theorem. The reader is referred to the excellent exposition in [6] for the complete story including many additional fascinating insights.
2.1. Properties of measurable functions. From now on we denote the Lebesgue measure of a measurable subset $E$ of $\mathbb{R}$ by $|E|$ rather than by $\mu(E)$ as in the previous sections. We say that two measurable functions $f, g : \mathbb{R} \to [-\infty, \infty]$ are equal almost everywhere (often abbreviated a.e.) if
$$|\{ x \in \mathbb{R} : f(x) \ne g(x) \}| = 0.$$
We say that $f$ is finite-valued if $f : \mathbb{R} \to \mathbb{R}$. We now collect a number of elementary properties of measurable functions.
Lemma 11. Suppose that $f, f_n, g : \mathbb{R} \to [-\infty, \infty]$ for $n \in \mathbb{N}$.
(1) If $f$ is finite-valued, then $f$ is measurable if and only if $f^{-1}(G) \in \mathcal{L}$ for all open sets $G \subset \mathbb{R}$ if and only if $f^{-1}(F) \in \mathcal{L}$ for all closed sets $F \subset \mathbb{R}$.
(2) If $f$ is finite-valued and continuous, then $f$ is measurable.
(3) If $f$ is finite-valued and measurable and $\Phi : \mathbb{R} \to \mathbb{R}$ is continuous, then $\Phi \circ f$ is measurable.
(4) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions, then the following functions are all measurable:
$$\sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n \to \infty} f_n(x), \quad \liminf_{n \to \infty} f_n(x).$$
(5) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions and $f(x) = \lim_{n \to \infty} f_n(x)$, then $f$ is measurable.
(6) If $f$ is measurable, so is $f^n$ for $n \in \mathbb{N}$.
(7) If $f$ and $g$ are finite-valued and measurable, then so are $f + g$ and $fg$.
(8) If $f$ is measurable and $f = g$ almost everywhere, then $g$ is measurable.
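Comments: For property (1), first show that $f$ is measurable if and only if $f^{-1}((a,b)) \in \mathcal{L}$ for all $-\infty < a < b < \infty$. For property (3) use $(\Phi \circ f)^{-1}(G) = f^{-1}\big( \Phi^{-1}(G) \big)$ and note that $\Phi^{-1}(G)$ is open if $G$ is open. For property (7), use
$$\{ f + g > a \} = \bigcup_{r \in \mathbb{Q}} \big[ \{ f > a - r \} \cap \{ g > r \} \big], \qquad a \in \mathbb{R},$$
$$fg = \frac{1}{4} \big[ (f + g)^2 - (f - g)^2 \big].$$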
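Example 3. It is not always true that $f \circ \Phi$ is measurable when $\Phi : \mathbb{R} \to \mathbb{R}$ is continuous and $f$ is measurable. To see this recall the construction of the Cantor set $E = \bigcap_{k=0}^{\infty} E_k$, where $E_k = \bigcup_{j=1}^{2^k} I_j^k$. Denote the open middle third of the closed interval $I_j^k$ by $G_j^k$. Define the Cantor function $F : [0,1] \to [0,1]$ by
$F(x) = \frac{1}{2^1}$ for $x \in G_1^0 = \big( \frac{1}{3}, \frac{2}{3} \big)$;
$F(x) = \frac{1}{2^2}$ for $x \in G_1^1 = \big( \frac{1}{9}, \frac{2}{9} \big)$, $F(x) = \frac{3}{2^2}$ for $x \in G_2^1 = \big( \frac{7}{9}, \frac{8}{9} \big)$;
$F(x) = \frac{1}{2^3}$ for $x \in G_1^2$, $F(x) = \frac{3}{2^3}$ for $x \in G_2^2$, $F(x) = \frac{5}{2^3}$ for $x \in G_3^2$, $F(x) = \frac{7}{2^3}$ for $x \in G_4^2$;
$F(x) = \frac{2j - 1}{2^k}$ for $x \in G_j^{k-1}$, $1 \le j \le 2^{k-1}$, $k \ge 1$,
and then extend $F$ to the Cantor set $E = [0,1] \setminus \big( \bigcup_{k,j} G_j^k \big)$ by continuity. (Exercise: Prove there exists a unique continuous extension.) Now define
$$G(x) = \frac{F(x) + x}{2}, \qquad 0 \le x \le 1.$$
Then $G : [0,1] \to [0,1]$ is one-to-one (strictly increasing) and onto, hence the inverse function $\Phi = G^{-1} : [0,1] \to [0,1]$ is continuous by Corollary 6. Now $|G([0,1] \setminus E)| = \frac{1}{2} |[0,1] \setminus E| = \frac{1}{2}$ by construction, and so $|G(E)| = 1 - \frac{1}{2} = \frac{1}{2}$. Let $V$ denote the set of representatives constructed in the proof of Theorem 20 (called $E$ there), so that $[0,1) = \dot{\bigcup}_{n \ge 1} V \oplus r_n$. We have
$$G(E) = \dot{\bigcup_{n \ge 1}} \, G(E) \cap (V \oplus r_n),$$
and if $A_n = G(E) \cap (V \oplus r_n) \in \mathcal{L}$, then
$$\sum_{m=1}^{\infty} |A_n \oplus r_m| = \Big| \dot{\bigcup_{m \ge 1}} (A_n \oplus r_m) \Big| \le 1$$
implies that $|A_n| = 0$, since all the terms $|A_n \oplus r_m|$ equal $|A_n|$ by translation invariance. Since $|G(E)| > 0$, it follows that $A_n \notin \mathcal{L}$ for some $n \ge 1$. Denote such a set $A_n$ by $A$. Then $f = \chi_{\Phi(A)}$ is measurable since $\Phi(A) \subset E$ is a null set. On the other hand, $f \circ \Phi = \chi_A$ is not measurable, despite the continuity of $\Phi$.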
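The Cantor function $F$ and the map $G(x) = (F(x) + x)/2$ of Example 3 can be evaluated at rational points by reading off ternary digits. The sketch below is illustrative only: the function names and the truncation parameter `depth` are our own choices, and for points of the Cantor set with infinitely many nonzero ternary digits the value returned is an approximation from below.

```python
from fractions import Fraction


def cantor_function(x: Fraction, depth: int = 60) -> Fraction:
    """Evaluate the Cantor function at a rational x in [0, 1].

    Ternary digits 0 and 2 contribute binary digits 0 and 1. The first
    ternary digit 1 means x lies in the closure of a removed middle third,
    where the function is constant.
    """
    if x == 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)            # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value


def G(x: Fraction) -> Fraction:
    """The strictly increasing map G(x) = (F(x) + x) / 2 from Example 3."""
    return (cantor_function(x) + x) / 2


print(cantor_function(Fraction(1, 6)), cantor_function(Fraction(5, 6)))
print(G(Fraction(2, 3)) - G(Fraction(1, 3)))
```

The last line shows that $G$ maps the removed interval $(1/3, 2/3)$ onto an interval of half its length, which is the mechanism behind $|G([0,1] \setminus E)| = \frac{1}{2}$.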
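Recall that a measurable simple function $\varphi$ (i.e. the range of $\varphi$ is finite and $\varphi$ vanishes off a set of finite measure) has the form
$$\varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k}, \qquad \alpha_k \in \mathbb{R}, \; E_k \in \mathcal{L}.$$
Next we collect two approximation properties of simple functions.
Proposition 3. Let $f : \mathbb{R} \to [-\infty, \infty]$ be measurable.
(1) If $f$ is nonnegative there is an increasing sequence of nonnegative simple functions $\{\varphi_k\}_{k=1}^{\infty}$ that converges pointwise and monotonically to $f$:
$$\varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \qquad \text{for all } x \in \mathbb{R}.$$
(2) There is a sequence of simple functions $\{\varphi_k\}_{k=1}^{\infty}$ satisfying
$$|\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \qquad \text{for all } x \in \mathbb{R}.$$
Comments: To prove (1) let $f_M = \min\{f, M\} \chi_{[-M,M]}$, and for $0 \le n < NM$ define
$$E_{n,N,M} = \Big\{ x \in \mathbb{R} : \frac{n}{N} < f_M(x) \le \frac{n+1}{N} \Big\}.$$
Then $\varphi_k(x) = \sum_{n=1}^{2^k k} \frac{n}{2^k} \chi_{E_{n,2^k,k}}(x)$ works. Property (2) follows from applying (1) to the positive and negative parts of $f$:
$$f^+(x) = \max\{ f(x), 0 \} \quad \text{and} \quad f^-(x) = \max\{ -f(x), 0 \}.$$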
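The dyadic construction in the comments to Proposition 3 can be sketched pointwise as follows. The function name `phi` and the test function are hypothetical choices for illustration. The code returns the value $n/2^k$ attached to the unique set $E_{n,2^k,k}$ containing $x$, or $0$ if $x$ lies in none of them.

```python
import math


def phi(f, k: int, x: float) -> float:
    """Value at x of the k-th simple function built from f_k = min{f, k} on [-k, k]."""
    if abs(x) > k:
        return 0.0                    # f_k vanishes off [-k, k]
    fk = min(f(x), k)                 # truncation at height k
    if fk <= 0:
        return 0.0
    # The unique n with n / 2^k < f_k(x) <= (n + 1) / 2^k.
    n = math.ceil(fk * 2 ** k) - 1
    return n / 2 ** k


def square(x: float) -> float:
    return x * x


for k in (1, 2, 4, 8):
    print(k, phi(square, k, 0.7))
```

For a fixed $x$ the printed values increase with $k$ and approach $f(x)$ from below, with error at most $2^{-k}$ once $k \ge \max\{|x|, f(x)\}$.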


2.2. Properties of integration and convergence theorems. If $\varphi$ is a measurable simple function (i.e. its range is a finite set and it vanishes off a set of finite measure), then $\varphi$ has a unique canonical representation
$$\varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k},$$
where the real (or even complex) constants $\alpha_k$ are distinct and nonzero, and the measurable sets $E_k$ are pairwise disjoint. We define the Lebesgue integral of $\varphi$ by
$$\int \varphi(x) \, dx = \sum_{k=1}^{N} \alpha_k |E_k| .$$
If $E$ is a measurable subset of $\mathbb{R}$ and $\varphi$ is a measurable simple function, then so is $\chi_E \varphi$, and we define
$$\int_E \varphi(x) \, dx = \int (\chi_E \varphi)(x) \, dx.$$
Lemma 12. Suppose that $\varphi$ and $\psi$ are measurable simple functions and that $E, F \in \mathcal{L}$.
(1) If $\varphi = \sum_{k=1}^{M} \beta_k \chi_{F_k}$ (not necessarily the canonical representation), then
$$\int \varphi(x) \, dx = \sum_{k=1}^{M} \beta_k |F_k| .$$
(2) $\int (a\varphi + b\psi) = a \int \varphi + b \int \psi$ for $a, b \in \mathbb{C}$,
(3) $\int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi$ if $E \cap F = \emptyset$,
(4) $\int \varphi \le \int \psi$ if $\varphi \le \psi$,
(5) $\big| \int \varphi \big| \le \int |\varphi|$.
Properties (2) - (5) are usually referred to as linearity, additivity, monotonicity and the triangle inequality respectively. The proofs of (1) - (5) are routine.
Now we turn to defining the integral of a nonnegative measurable function $f : \mathbb{R} \to [0, \infty]$. For such $f$ we define
$$\int f(x) \, dx = \sup\Big\{ \int \varphi(x) \, dx : 0 \le \varphi \le f \text{ and } \varphi \text{ is simple} \Big\}.$$
It is essential here that $f$ be permitted to take on the value $\infty$, and that the supremum may be $\infty$ as well. We say that $f$ is (Lebesgue) integrable if $\int f(x) \, dx < \infty$. For $E$ measurable define
$$\int_E f(x) \, dx = \int (\chi_E f)(x) \, dx.$$
Here is an analogue of Lemma 12 whose proof is again routine.
Lemma 13. Suppose that $f, g : \mathbb{R} \to [0, \infty]$ are nonnegative measurable functions and that $E, F \in \mathcal{L}$.
(1) $\int (af + bg) = a \int f + b \int g$ for $a, b \in (0, \infty)$,
(2) $\int_{E \cup F} f = \int_E f + \int_F f$ if $E \cap F = \emptyset$,
(3) $\int f \le \int g$ if $0 \le f \le g$,
(4) If $\int f < \infty$, then $f(x) < \infty$ for a.e. $x$,
(5) If $\int f = 0$, then $f(x) = 0$ for a.e. $x$.
Note that convergence of integrals does not always follow from pointwise convergence of the integrands. For example,
$$\lim_{n \to \infty} \int \chi_{[n,n+1]}(x) \, dx = 1 \ne 0 = \int \lim_{n \to \infty} \chi_{[n,n+1]}(x) \, dx,$$
and
$$\lim_{n \to \infty} \int n \chi_{(0,\frac{1}{n})}(x) \, dx = 1 \ne 0 = \int \lim_{n \to \infty} n \chi_{(0,\frac{1}{n})}(x) \, dx.$$
In each of these examples, the mass of the integrands "disappears" in the limit; at "infinity" in the first example and at the origin in the second example. Here are our first two classical convergence theorems giving conditions under which convergence does hold. The first generalizes the property in line 4 of (1.1):
$$\mu(E) = \sup_n \mu(E_n) = \lim_{n \to \infty} \mu(E_n) \quad \text{if } E_n \nearrow E.$$
Theorem 22. (Monotone Convergence Theorem) Suppose that $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions, i.e. $f_n(x) \le f_{n+1}(x)$, and let
$$f(x) = \sup_n f_n(x) = \lim_{n \to \infty} f_n(x).$$
Then $f$ is nonnegative and measurable and
$$\int f(x) \, dx = \sup_n \int f_n(x) \, dx = \lim_{n \to \infty} \int f_n(x) \, dx.$$
Proof: Since $\int f_n \le \int f_{n+1}$ we have $\lim_{n \to \infty} \int f_n = L \in [0, \infty]$. Now $f$ is measurable and $f_n \le f$ implies $\int f_n \le \int f$ so that
$$L = \sup_n \int f_n \le \int f.$$
To prove the opposite inequality, momentarily fix a simple function $\varphi$ such that $0 \le \varphi \le f$. Choose $0 < c < 1$ and define
$$E_n = \{ x \in \mathbb{R} : f_n(x) \ge c\varphi(x) \}, \qquad n \ge 1.$$
Then $E_n$ is an increasing sequence of measurable sets with $\bigcup_{n=1}^{\infty} E_n = \mathbb{R}$. We have
$$\int f_n \ge \int_{E_n} f_n \ge c \int_{E_n} \varphi, \qquad n \ge 1.$$
Now let $\varphi = \sum_{k=1}^{N} \alpha_k \chi_{F_k}$ be the canonical representation of $\varphi$. Then
$$\int_{E_n} \varphi = \sum_{k=1}^{N} \alpha_k |E_n \cap F_k| ,$$
and since $\lim_{n \to \infty} |E_n \cap F_k| = |F_k|$ by the fourth line in (1.1), we obtain that
$$\int_{E_n} \varphi = \sum_{k=1}^{N} \alpha_k |E_n \cap F_k| \to \sum_{k=1}^{N} \alpha_k |F_k| = \int \varphi$$
as $n \to \infty$. Altogether then we have
$$L = \lim_{n \to \infty} \int f_n \ge c \int \varphi$$
for all $c < 1$, which implies $L \ge \int \varphi$ for all simple $\varphi$ with $0 \le \varphi \le f$, which implies $L \ge \int f$ as required.
Note that as a corollary we have $\int f = \lim_{k \to \infty} \int \varphi_k$ where the simple functions $\varphi_k$ are as in (1) of Proposition 3. We also have this.
Corollary 9. Suppose that $a_k(x) \ge 0$ is measurable for $k \ge 1$. Then
$$\int \sum_{k=1}^{\infty} a_k(x) \, dx = \sum_{k=1}^{\infty} \int a_k(x) \, dx.$$
To prove the corollary apply the Monotone Convergence Theorem to the sequence of partial sums $f_n(x) = \sum_{k=1}^{n} a_k(x)$.
Lemma 14. (Fatou's Lemma) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of nonnegative measurable functions, then
$$\int \liminf_{n \to \infty} f_n(x) \, dx \le \liminf_{n \to \infty} \int f_n(x) \, dx.$$
Proof: Let $g_n(x) = \inf_{k \ge n} f_k(x)$ so that $g_n \le f_n$ and $\int g_n \le \int f_n$. Then $\{g_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions that converges pointwise to $\liminf_{n \to \infty} f_n(x)$. So the Monotone Convergence Theorem yields
$$\int \liminf_{n \to \infty} f_n(x) \, dx = \lim_{n \to \infty} \int g_n(x) \, dx \le \liminf_{n \to \infty} \int f_n(x) \, dx.$$
Finally, we can give an unambiguous meaning to the integral $\int f(x) \, dx$ in the case when $f$ is integrable, by which we mean that $f$ is measurable and $\int |f(x)| \, dx < \infty$. To do this we note that the positive and negative parts of $f$,
$$f^+(x) = \max\{ f(x), 0 \} \quad \text{and} \quad f^-(x) = \max\{ -f(x), 0 \},$$
are both nonnegative measurable functions with finite integral. We define
$$\int f(x) \, dx = \int f^+(x) \, dx - \int f^-(x) \, dx.$$
With this definition we have the usual elementary properties of linearity, additivity, monotonicity and the triangle inequality.
Lemma 15. Suppose that $f, g$ are integrable and that $E, F \in \mathcal{L}$.
(1) $\int (af + bg) = a \int f + b \int g$ for $a, b \in \mathbb{R}$,
(2) $\int_{E \cup F} f = \int_E f + \int_F f$ if $E \cap F = \emptyset$,
(3) $\int f \le \int g$ if $f \le g$,
(4) $\big| \int f \big| \le \int |f|$.
Our final convergence theorem is one of the most useful in analysis.
Theorem 23. (Dominated Convergence Theorem) Let $g$ be a nonnegative integrable function. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions satisfying
$$\lim_{n \to \infty} f_n(x) = f(x), \qquad \text{a.e. } x,$$
and
$$|f_n(x)| \le g(x), \qquad \text{a.e. } x.$$
Then
$$\lim_{n \to \infty} \int |f(x) - f_n(x)| \, dx = 0,$$
and hence
$$\int f(x) \, dx = \lim_{n \to \infty} \int f_n(x) \, dx.$$
Proof: Since $|f| \le g$ and $f$ is measurable, $f$ is integrable. Since $|f - f_n| \le 2g$, Fatou's Lemma can be applied to the sequence of functions $2g - |f - f_n|$ to obtain
$$\int 2g \le \liminf_{n \to \infty} \int (2g - |f - f_n|)$$
$$= \int 2g + \liminf_{n \to \infty} \Big( - \int |f - f_n| \Big)$$
$$= \int 2g - \limsup_{n \to \infty} \int |f - f_n| .$$
Since $\int 2g < \infty$, we can subtract it from both sides to obtain
$$\limsup_{n \to \infty} \int |f - f_n| \le 0,$$
which implies $\lim_{n \to \infty} \int |f - f_n| = 0$. Then $\int f = \lim_{n \to \infty} \int f_n$ follows from the triangle inequality $\big| \int (f - f_n) \big| \le \int |f - f_n|$.
Note that as a corollary we have $\int f = \lim_{k \to \infty} \int \varphi_k$ where the simple functions $\varphi_k$ are as in (2) of Proposition 3.
Finally, if $f(x) = u(x) + i v(x)$ is complex-valued where $u(x)$ and $v(x)$ are real-valued measurable functions such that
$$\int |f(x)| \, dx = \int \sqrt{u(x)^2 + v(x)^2} \, dx < \infty,$$
then we define
$$\int f(x) \, dx = \int u(x) \, dx + i \int v(x) \, dx.$$
The usual properties of linearity, additivity, monotonicity and the triangle inequality all hold for this definition as well.
2.3. Three famous measure problems. The following three problems are listed in order of increasing difficulty.
Problem 1. Suppose that $E_1, \dots, E_n$ are $n$ Lebesgue measurable subsets of $[0,1]$ such that each point $x$ in $[0,1]$ lies in some $k$ of these subsets. Prove that there is at least one set $E_j$ with $|E_j| \ge \frac{k}{n}$.
Problem 2. Suppose that $E$ is a Lebesgue measurable set of positive measure. Prove that
$$E - E = \{ x - y : x, y \in E \}$$
contains a nontrivial open interval.
Problem 3. Construct a Lebesgue measurable subset $E$ of the real line such that
$$0 < \frac{|E \cap I|}{|I|} < 1$$
for all nontrivial open intervals $I$.
To solve Problem 1, note that the hypothesis implies $k \le \sum_{j=1}^{n} \chi_{E_j}(x)$ for $x \in [0,1]$. Now integrate to obtain
$$k = \int_0^1 k \, dx \le \int_0^1 \Big( \sum_{j=1}^{n} \chi_{E_j}(x) \Big) dx = \sum_{j=1}^{n} \int_0^1 \chi_{E_j}(x) \, dx = \sum_{j=1}^{n} |E_j| ,$$
which implies that $|E_j| \ge \frac{k}{n}$ for some $j$. The solution is much less elegant without recourse to integration.
To solve Problem 2, choose $K$ compact contained in $E$ such that $|K| > 0$. Then choose $G$ open containing $K$ such that $|G \setminus K| < |K|$. Let $\delta = \mathrm{dist}(K, G^c) > 0$. It follows that $(-\delta, \delta) \subset K - K \subset E - E$. Indeed, if $x \in (-\delta, \delta)$ then $K + x \subset G$ and $K \cap (K + x) \ne \emptyset$ since otherwise we have a contradiction:
$$2|K| = |K| + |K + x| \le |G| \le |G \setminus K| + |K| < 2|K| .$$
Thus there are $k_1$ and $k_2$ in $K$ such that $k_1 = k_2 + x$ and so
$$x = k_1 - k_2 \in K - K.$$
Problem 3 is most easily solved using generalized Cantor sets $E_\alpha$. Let $0 < \alpha \le 1$ and set $I_1^0 = [0,1]$. Remove the open interval of length $\frac{1}{3}\alpha$ centered in $I_1^0$ and denote the two remaining closed intervals by $I_1^1$ and $I_2^1$. Then remove the open interval of length $\frac{1}{3^2}\alpha$ centered in $I_1^1$ and denote the two remaining closed intervals by $I_1^2$ and $I_2^2$. Do the same for $I_2^1$ and denote the two remaining closed intervals by $I_3^2$ and $I_4^2$. Continuing in this way, we obtain at the $k^{th}$ generation, a collection $\{ I_j^k \}_{j=1}^{2^k}$ of $2^k$ pairwise disjoint closed intervals of equal length. Let
$$E_\alpha = \bigcap_{k=1}^{\infty} \Big( \bigcup_{j=1}^{2^k} I_j^k \Big).$$
Then by summing the lengths of the removed open intervals, we obtain
$$|[0,1] \setminus E_\alpha| = \frac{1}{3}\alpha + \frac{2}{3^2}\alpha + \frac{2^2}{3^3}\alpha + \dots = \alpha,$$
and it follows that $E_\alpha$ is compact and has Lebesgue measure $1 - \alpha$. It is not hard to show that $E_\alpha$ is also nowhere dense. The case $\alpha = 1$ is particularly striking: $E_1$ is a compact, perfect and uncountable subset of $[0,1]$ having Lebesgue measure $0$. This is the classical Cantor set introduced at the end of Chapter 1.
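The geometric series behind $|[0,1] \setminus E_\alpha| = \alpha$ can be checked with exact rational arithmetic. In this sketch (the function name `removed_length` is ours), generation $k$ removes $2^{k-1}$ open intervals of length $\alpha / 3^k$ each, so the total removed after $K$ generations is $\alpha \big( 1 - (2/3)^K \big)$.

```python
from fractions import Fraction


def removed_length(alpha: Fraction, generations: int) -> Fraction:
    """Total length removed from [0, 1] after the given number of generations."""
    return sum(
        (Fraction(2 ** (k - 1), 3 ** k) * alpha for k in range(1, generations + 1)),
        Fraction(0),
    )


alpha = Fraction(1, 2)
for K in (1, 5, 20):
    print(K, removed_length(alpha, K), float(alpha - removed_length(alpha, K)))
```

The second printed column increases to $\alpha$ and the third, the length still to be removed, decays like $(2/3)^K$.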
In order to construct the set $E$ in Problem 3, it suffices by taking unions of translates by integers, to construct a subset $E$ of $[0,1]$ satisfying
(2.1) $0 < \frac{|E \cap I|}{|I|} < 1$, for all intervals $I \subset [0,1]$ of positive length.
Fix $0 < \alpha_1 < 1$ and start by taking $E^1 = E_{\alpha_1}$. It is not hard to see that $\frac{|E^1 \cap I|}{|I|} < 1$ for all $I$, but the left hand inequality in (2.1) fails for $E = E^1$ whenever $I$ is a subset of one of the component intervals in the open complement $[0,1] \setminus E^1$. To remedy this fix $0 < \alpha_2 < 1$ and for each component interval $J$ of $[0,1] \setminus E^1$, translate and dilate $E_{\alpha_2}$ to fit snugly in the closure $\overline{J}$ of the component, and let $E^2$ be the union of $E^1$ and all these translates and dilates of $E_{\alpha_2}$. Then again, $\frac{|E^2 \cap I|}{|I|} < 1$ for all $I$ but the left hand inequality in (2.1) fails for $E = E^2$ whenever $I$ is a subset of one of the component intervals in the open complement $[0,1] \setminus E^2$. Continue this process indefinitely with a sequence of numbers $\{\alpha_n\}_{n=1}^{\infty} \subset (0,1)$. We claim that $E = \bigcup_{n=1}^{\infty} E^n$ satisfies (2.1) if and only if
(2.2) $\sum_{n=1}^{\infty} (1 - \alpha_n) < \infty$.
To see this, first note that no matter what sequence of numbers $\alpha_n$ less than one is used, we obtain that $0 < \frac{|E \cap I|}{|I|}$ for all intervals $I$ of positive length. Indeed, each set $E^n$ is easily seen to be compact and nowhere dense, and each component interval in the complement $[0,1] \setminus E^n$ has length at most
$$\frac{\alpha_1}{3} \frac{\alpha_2}{3} \cdots \frac{\alpha_n}{3} \le 3^{-n}.$$
Thus given an interval $I$ of positive length, there is $n$ large enough such that $I$ will contain one of the component intervals $J$ of $[0,1] \setminus E^n$, and hence will contain the translated and dilated copy $\mathcal{C}\big( E_{\alpha_{n+1}} \big)$ of $E_{\alpha_{n+1}}$ that is fitted into $J$ by construction. Since the dilation factor is the length $|J|$ of $J$, we have
$$|E \cap I| \ge \big| \mathcal{C}\big( E_{\alpha_{n+1}} \big) \big| = |J| \, \big| E_{\alpha_{n+1}} \big| = |J| (1 - \alpha_{n+1}) > 0,$$
since $\alpha_{n+1} < 1$.
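It remains to show that $|E \cap I| < |I|$ for all intervals $I$ of positive length in $[0,1]$, and it is here that we must use (2.2). Indeed, fix $I$ and let $J$ be a component interval of $[0,1] \setminus E_n$ (with $n$ large) that is contained in $I$. Let $\mathcal{C}(E_{\alpha_{n+1}})$ be the translated and dilated copy of $E_{\alpha_{n+1}}$ that is fitted into $J$ by construction. We compute that
$$|E \cap J| = |\mathcal{C}(E_{\alpha_{n+1}})| + (1 - \alpha_{n+2}) \, |J \setminus \mathcal{C}(E_{\alpha_{n+1}})| + \dots$$
$$= (1 - \alpha_{n+1}) |J| + (1 - \alpha_{n+2}) \alpha_{n+1} |J| + (1 - \alpha_{n+3}) \alpha_{n+2} \alpha_{n+1} |J| + \dots = \sum_{k=1}^{\infty} \beta_k^n |J|,$$
where
$$\beta_k^n = (1 - \alpha_{n+k}) \, \alpha_{n+k-1} \cdots \alpha_{n+1}, \quad k \ge 1.$$
Then we have
$$|E \cap J| = \left( \sum_{k=1}^{\infty} \beta_k^n \right) |J| < |J|,$$
and hence also $\frac{|E \cap I|}{|I|} < 1$, if we choose $\{\alpha_n\}_{n=1}^{\infty}$ so that $\sum_{k=1}^{\infty} \beta_k^n < 1$ for all $n$. Now by induction we have
$$\sum_{k=1}^{\infty} \beta_k^n = \lim_{N \to \infty} \sum_{k=1}^{N} (1 - \alpha_{n+k}) \, \alpha_{n+k-1} \cdots \alpha_{n+1} = \lim_{N \to \infty} \left( 1 - \prod_{k=1}^{N} \alpha_{n+k} \right) = 1 - \prod_{k=1}^{\infty} \alpha_{n+k},$$
and by the first line in (2.3) below, this is strictly less than 1 if and only if $\sum_{k=1}^{\infty} (1 - \alpha_{n+k}) < \infty$ for all $n$. Thus the set $E$ constructed above satisfies (2.1) if and only if (2.2) holds.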
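The telescoping identity $\sum_{k \le K} \beta_k^n = 1 - \prod_{k \le K} \alpha_{n+k}$ used above can be checked numerically. The following is only a sketch: the function name `beta_sum` and the two sample sequences are illustrative assumptions, not part of the notes.

```python
def beta_sum(alpha, n, K):
    """Return (sum_{k=1}^K beta_k^n, 1 - prod_{k=1}^K alpha(n+k)).

    beta_k^n = (1 - alpha(n+k)) * alpha(n+k-1) * ... * alpha(n+1).
    """
    total, running = 0.0, 1.0      # running = alpha(n+1) ... alpha(n+k-1)
    for k in range(1, K + 1):
        a = alpha(n + k)
        total += (1.0 - a) * running
        running *= a
    return total, 1.0 - running

# Hypothetical choice with sum(1 - alpha_m) < infinity: alpha_m = 1 - 2^{-m}.
summable = beta_sum(lambda m: 1.0 - 2.0 ** -m, 0, 50)
# Hypothetical choice with sum(1 - alpha_m) = infinity: alpha_m = m / (m + 1).
divergent = beta_sum(lambda m: m / (m + 1.0), 0, 1000)
print(summable, divergent)
```

In the first case the partial sums stay bounded away from 1, while in the second they approach 1, in line with condition (2.2).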
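2.3.1. Infinite products. If $0 \le u_n < 1$ and $0 \le v_n < \infty$ then
$$\prod_{n=1}^{\infty} (1 - u_n) > 0 \text{ if and only if } \sum_{n=1}^{\infty} u_n < \infty, \tag{2.3}$$
$$\prod_{n=1}^{\infty} (1 + v_n) < \infty \text{ if and only if } \sum_{n=1}^{\infty} v_n < \infty.$$
To see (2.3) we may assume $0 \le u_n, v_n \le \frac{1}{2}$, so that $e^{-u_n} \ge 1 - u_n \ge e^{-2u_n}$ and $e^{\frac{1}{2} v_n} \le 1 + v_n \le e^{v_n}$. For example, when $0 \le x \le \frac{1}{2}$, the alternating series estimate yields
$$e^{-2x} \le 1 - 2x + \frac{(2x)^2}{2!} \le 1 - x,$$
while the geometric series estimate yields
$$e^{\frac{1}{2} x} \le 1 + \left( \frac{1}{2} x \right) \left( 1 + x + x^2 + \dots \right) \le 1 + x.$$
Thus we have
$$\exp\left( -\sum_{n=1}^{\infty} u_n \right) \ge \prod_{n=1}^{\infty} (1 - u_n) \ge \exp\left( -2 \sum_{n=1}^{\infty} u_n \right), \tag{2.4}$$
$$\exp\left( \frac{1}{2} \sum_{n=1}^{\infty} v_n \right) \le \prod_{n=1}^{\infty} (1 + v_n) \le \exp\left( \sum_{n=1}^{\infty} v_n \right).$$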
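The two-sided bound in the first line of (2.4) can be illustrated on finite partial products. This is a minimal sketch; the helper name `product_bounds` and the two sample sequences are assumptions chosen for illustration.

```python
import math

def product_bounds(u):
    """Return (exp(-2*sum u), prod(1 - u_n), exp(-sum u)) for 0 <= u_n <= 1/2."""
    assert all(0.0 <= x <= 0.5 for x in u)
    s = sum(u)
    prod = math.prod(1.0 - x for x in u)
    return math.exp(-2.0 * s), prod, math.exp(-s)

# Summable case: u_n = 2^{-(n+1)}; the partial products stay away from 0.
summable = [2.0 ** -(n + 1) for n in range(1, 40)]
# Divergent case: u_n = 1/(n+1); the partial products tend to 0.
divergent = [1.0 / (n + 1) for n in range(1, 2000)]

for name, u in (("summable", summable), ("divergent", divergent)):
    lo, p, hi = product_bounds(u)
    print(name, lo, p, hi)
```

In the divergent case the product telescopes to $1/2000$, which sits between the two exponential bounds, both of which tend to $0$ as more terms are taken.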
3. Exercises
Exercise 6. Use the regularity of Lebesgue measure to show that $E \in \mathcal{L}$ if and only if there is an increasing sequence $\{K_n\}_{n=1}^{\infty}$ of compact sets in $\mathbb{R}$ and a null set $N$ (i.e. $\mu^*(N) = 0$) such that
$$E = \left( \bigcup_{n=1}^{\infty} K_n \right) \cup N.$$
Show also that if another pair $(\mathcal{L}', \mu')$ satisfies (1) - (4), then $K \in \mathcal{L}'$ and $\mu'(K) = \mu(K)$ for all compact subsets $K$ of $\mathbb{R}$. Deduce from this that $(\mathcal{L}', \mu')$ is an extension of $(\mathcal{L}, \mu)$, i.e. $\mathcal{L}' \supset \mathcal{L}$ and $\mu'(E) = \mu(E)$ for all $E \in \mathcal{L}$.
Exercise 7. Let $E$ be the Cantor set. Define a function $F : [0,1] \setminus E \to [0,1]$ as in Example 3. Prove that $F$ has a unique continuous extension $G : [0,1] \to [0,1]$ to the entire interval $[0,1]$.
Exercise 8. (Exercise 9 Chapter 2 of Rudin) Construct a sequence $\{f_n\}_{n=1}^{\infty}$ of continuous functions $f_n : [0,1] \to [0,1]$ such that both
(1) $\{f_n(x)\}_{n=1}^{\infty}$ fails to have a limit for every $x \in [0,1]$, and
(2) $\lim_{n \to \infty} \int_0^1 f_n = 0$.
Exercise 9. (Exercise 10 Chapter 2 of Rudin) Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of continuous functions $f_n : [0,1] \to [0,1]$ such that $f(x) = \lim_{n \to \infty} f_n(x)$ exists for every $x \in [0,1]$.
(1) Prove that
$$\lim_{n \to \infty} \int_0^1 f_n = \int_0^1 f.$$
(2) If $f$ is identically zero, try to prove this using only Riemann integration.
Exercise 10. (Exercise 20 page 59 in Rudin) Construct continuous functions $f_n : [0,1] \to [0,\infty)$ such that $\lim_{n \to \infty} f_n(x) = 0$ for all $x \in [0,1]$ and $\lim_{n \to \infty} \int_0^1 f_n = 0$, but for which $\sup_{n \ge 1} f_n$ is not integrable.
CHAPTER 5
Paradoxical decompositions and finitely additive measures
Definition 12. Let $G$ be a group acting on a set $X$. A subset $E$ of $X$ is finitely $G$-paradoxical if there are subsets $A_i, B_j$ of $X$ and group elements $g_i, h_j$ such that
$$E \supset \left( \bigsqcup_{i=1}^{n} A_i \right) \sqcup \left( \bigsqcup_{j=1}^{m} B_j \right), \tag{0.1}$$
$$E = \bigcup_{i=1}^{n} g_i A_i = \bigcup_{j=1}^{m} h_j B_j.$$
The notation $\sqcup$ asserts that the indicated union is pairwise disjoint. Note that one can easily arrange to have each collection of sets $\{g_i A_i\}_{i=1}^{n}$ and $\{h_j B_j\}_{j=1}^{m}$ in the second line of (0.1) pairwise disjoint simply by paring the sets $A_i$ and $B_j$. One can also achieve equality in the first line of (0.1), but this is harder, and is not proved until Corollary 12 below. We say that $E$ is countably $G$-paradoxical if $n, m$ in (0.1) are permitted to be $\omega$, the first infinite ordinal. By $G$-paradoxical we mean finitely $G$-paradoxical. Finally, we say that $G$ is paradoxical if $G$ acts on itself by left multiplication and $G$ is $G$-paradoxical. The next result uses the axiom of choice.
Theorem 24. Let $G$ be the circle group $\mathbb{T}$ and let it act on itself $X = \mathbb{T}$ by group multiplication:
$$e^{it} \in G \text{ sends the point } e^{ix} \in X \text{ to the point } e^{i(t+x)} \in X.$$
Then $X$ is countably $G$-paradoxical.
Proof: Let $M$ be a choice set for the equivalence classes of the relation on $\mathbb{T}$ given by declaring two points equivalent if one is obtained from the other by rotation through a rational multiple of $2\pi$ radians. Let $\{\rho_i\}_{i=1}^{\infty}$ enumerate the rotations through a rational multiple of $2\pi$ radians, and set $M_i = \rho_i M$. Then the countable paradoxical decomposition is provided by
$$X = \left( \bigsqcup_{i \text{ odd}} M_i \right) \sqcup \left( \bigsqcup_{i \text{ even}} M_i \right),$$
$$X = \bigsqcup_{i \text{ odd}} g_i M_i = \bigsqcup_{i \text{ even}} h_i M_i,$$
where $g_i = \rho_{\frac{i+1}{2}} \rho_i^{-1}$ for $i$ odd, and $h_i = \rho_{\frac{i}{2}} \rho_i^{-1}$ for $i$ even.
Corollary 10. There is a non-Lebesgue measurable subset of $\mathbb{T}$.
Proof: If $A_i, B_j, g_i, h_j$ witness a countable paradoxical decomposition (0.1) of $E = \mathbb{T}$ with $n, m \le \omega$, and if we assume every subset of $\mathbb{T}$ is Lebesgue measurable, then
$$2\pi = |G| \ge \sum_{i=1}^{n} |A_i| + \sum_{j=1}^{m} |B_j| = \sum_{i=1}^{n} |g_i A_i| + \sum_{j=1}^{m} |h_j B_j| \ge \left| \bigcup_{i=1}^{n} g_i A_i \right| + \left| \bigcup_{j=1}^{m} h_j B_j \right| = 4\pi,$$
a contradiction.
Denote by $G_n$ the group of isometries of Euclidean space $\mathbb{R}^n$.
Remark 10. There exists a $G_2$-paradoxical subset $E$ of the plane $\mathbb{R}^2 = \mathbb{C}$ that does not require the axiom of choice for its construction, namely the Sierpinski-Mazurkiewicz Paradox: let $e^{i\theta}$ be a transcendental complex number and define
$$E = \left\{ x = \sum_{n=0}^{\infty} x_n e^{in\theta} \in \mathbb{C} : x_n \in \mathbb{Z}_+ \text{ and } x_n = 0 \text{ for all but finitely many } n \right\},$$
$$E_1 = \{ x \in E : x_0 = 0 \},$$
$$E_2 = \{ x \in E : x_0 > 0 \}.$$
Then $E = E_1 \sqcup E_2 = e^{-i\theta} E_1 = E_2 - 1$.
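Because $e^{i\theta}$ is transcendental, each element of the set $E$ in Remark 10 has a unique coefficient sequence $(x_0, x_1, \dots)$, so the paradox can be explored purely combinatorially. The sketch below works on a finite truncation (bounded degree and bounded coefficients); the tuple encoding and all function names are assumptions made for illustration only.

```python
from itertools import product

def normalize(coeffs):
    """Strip trailing zeros so each element of E has one canonical tuple."""
    c = list(coeffs)
    while c and c[-1] == 0:
        c.pop()
    return tuple(c)

def elements(max_len, max_coeff):
    """All coefficient tuples (x_0, ..., x_{max_len-1}) with 0 <= x_n <= max_coeff."""
    return {normalize(c) for c in product(range(max_coeff + 1), repeat=max_len)}

def rotate_back(x):
    """Multiply x in E_1 by e^{-i theta}: drop the zero constant coefficient."""
    assert not x or x[0] == 0
    return normalize(x[1:])

def translate_back(x):
    """Subtract 1 from x in E_2: lower the positive constant coefficient."""
    assert x and x[0] > 0
    return normalize((x[0] - 1,) + x[1:])

E = elements(4, 2)
E1 = {x for x in E if not x or x[0] == 0}   # x_0 = 0
E2 = E - E1                                 # x_0 > 0
rotated = {rotate_back(x) for x in E1}
translated = {translate_back(x) for x in E2}
print(len(E), len(E1), len(E2), len(rotated), len(translated))
```

Both maps are injective and land back inside the truncated copy of $E$; in the untruncated set they are onto as well, which is the content of $E = e^{-i\theta} E_1 = E_2 - 1$.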
1. Finitely additive invariant measures
Let $G$ be a group acting on a set $X$. If there exists a finitely (countably) additive $G$-invariant probability measure $\mu$ on the power set $\mathcal{P}(X)$, then there are no finitely (countably) $G$-paradoxical subsets $E$ of $X$ having positive $\mu$-measure. In particular $G$ itself is not finitely (countably) $G$-paradoxical. This is proved as in the proof of Corollary 10 above. Thus paradoxical constructions can be viewed as nonexistence theorems for invariant measures, and by the contrapositive, the construction of invariant measures precludes paradoxical decompositions. In fact a theorem of Tarski shows that if $E \subset X$, where $X$ is a set on which a group $G$ acts, then there is a finitely additive $G$-invariant positive measure $\mu$ on $\mathcal{P}(X)$ with $\mu(E) = 1$ if and only if $E$ is not $G$-paradoxical.
We now state two theorems in this regard. The first states that paradoxical decompositions never occur for abelian groups (such as the group of translations on Euclidean space $\mathbb{R}^n$), and the second shows that paradoxical decompositions do exist for the rotation groups on Euclidean space $\mathbb{R}^n$ when $n \ge 3$ (resulting in the Banach-Tarski paradox).
Theorem 25. Suppose $G$ is an abelian group and let $\mathcal{A}$ be the power set of $G$. There is $\mu : \mathcal{A} \to [0,1]$ satisfying
(1) $\mu(E_1 \sqcup E_2) = \mu(E_1) + \mu(E_2)$, $E_i \in \mathcal{A}$,
(2) $\mu(E + a) = \mu(E)$, $E \in \mathcal{A}$, $a \in G$,
(3) $\mu(G) = 1$.
Definition 13. Let $G$ act on a set $X$. Subsets $A$ and $B$ of $X$ are said to be $G$-equidecomposable, written $A \sim_G B$ or simply $A \sim B$ when $G$ is understood, if
$$A = \bigsqcup_{i=1}^{n} A_i \text{ and } B = \bigsqcup_{i=1}^{n} B_i$$
where $A_i = g_i B_i$ for some $g_i \in G$, $1 \le i \le n$.
We will see later that $E$ is $G$-paradoxical if and only if $E = A \sqcup B$ where $A \sim_G E \sim_G B$.
Remark 11. If $X$ is Euclidean space $\mathbb{R}^n$, then $G_3$-equidecomposability preserves the following properties: boundedness, Lebesgue measure zero, first category (a countable union of nowhere dense sets), and second category (not first category).
Theorem 26. (Banach-Tarski paradox) The sphere $\mathbb{S}^2$ is $SO_3$-paradoxical and the ball $\mathbb{B}^3$ is $G_3$-paradoxical. Moreover, if $A$ and $B$ are any two bounded subsets of $\mathbb{R}^3$, each having nonempty interior, then $A$ and $B$ are $G_3$-equidecomposable.
We prove only the second theorem on the Banach-Tarski paradox.
2. Paradoxical decompositions and the Banach-Tarski paradox
We obtain the strong form of the Banach-Tarski paradox in four steps.
(1) First, we prove that the free nonabelian group $F_2$ of rank 2 is paradoxical.
(2) Second, we show that the special orthogonal group $SO_3$ in three dimensions contains a copy of $F_2$.
(3) Third, we lift the paradoxical decomposition from $SO_3$ to the sphere $\mathbb{S}^2$ on which it acts almost without nontrivial fixed points.
(4) Fourth, we extend the paradox to bounded sets with nonempty interior with the help of the proof of the Schröder-Bernstein theorem.
First step: We prove that $F_2$ is paradoxical. Let $F_2$ consist of all finite reduced words in $\sigma, \sigma^{-1}, \tau, \tau^{-1}$ with concatenation followed by cancellation as the group operation, and the empty word as identity 1. For $\rho \in \{\sigma, \sigma^{-1}, \tau, \tau^{-1}\}$, let $W(\rho)$ consist of all reduced words that begin with $\rho$ (a word is reduced if no pair of adjacent symbols is $\sigma\sigma^{-1}$, $\sigma^{-1}\sigma$, $\tau\tau^{-1}$, or $\tau^{-1}\tau$). The following decompositions witness the paradoxical nature of $F_2$:
$$F_2 = \{1\} \sqcup W(\sigma) \sqcup W(\sigma^{-1}) \sqcup W(\tau) \sqcup W(\tau^{-1}),$$
$$F_2 = W(\sigma) \sqcup \sigma W(\sigma^{-1}),$$
$$F_2 = W(\tau) \sqcup \tau W(\tau^{-1}).$$
Note that we do not use the identity in these reconstructions of $F_2$. We can however witness the paradox with four disjoint pieces whose union is $F_2$ using an absorption process as follows. First we include 1 with the set $W(\sigma)$ and call the new set $A_1$. But then $F_2 = A_1 \sqcup \sigma W(\sigma^{-1})$ fails since 1 is also in $\sigma W(\sigma^{-1})$. So 1 must be removed from $\sigma W(\sigma^{-1})$, and we achieve this by moving $\sigma^{-1}$ from $W(\sigma^{-1})$ to $A_1$ and denoting the new set $W(\sigma^{-1}) \setminus \{\sigma^{-1}\}$ by $A_2$. But then $\sigma^{-1}$ is in both $A_1$ and $\sigma A_2$. So we move $\sigma^{-2}$ from $A_2$ to $A_1$. This process must be continued indefinitely, so let $S = \{\sigma^{-n}\}_{n=1}^{\infty}$ and define
$$A_1 = W(\sigma) \sqcup \{1\} \sqcup S,$$
$$A_2 = W(\sigma^{-1}) \setminus S,$$
$$A_3 = W(\tau),$$
$$A_4 = W(\tau^{-1}).$$
Then $F_2 = \bigsqcup_{i=1}^{4} A_i$ and $F_2 = A_1 \sqcup \sigma A_2$ and $F_2 = A_3 \sqcup \tau A_4$ since
$$\sigma A_2 = \sigma W(\sigma^{-1}) \setminus \sigma S = \left( \{1\} \sqcup W(\sigma^{-1}) \sqcup W(\tau) \sqcup W(\tau^{-1}) \right) \setminus \left( \{1\} \sqcup S \right) = \left( W(\sigma^{-1}) \setminus S \right) \sqcup W(\tau) \sqcup W(\tau^{-1})$$
has complement $A_1$.
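The decompositions of $F_2$ in the first step can be checked on all reduced words up to a fixed length. This is only a sketch on a finite truncation: the letter encoding (`s`, `S`, `t`, `T` for $\sigma, \sigma^{-1}, \tau, \tau^{-1}$) and the helper names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical encoding: s = sigma, S = sigma^{-1}, t = tau, T = tau^{-1}.
INV = {"s": "S", "S": "s", "t": "T", "T": "t"}

def is_reduced(w):
    return all(INV[a] != b for a, b in zip(w, w[1:]))

def reduced_words(max_len):
    return {"".join(w) for n in range(max_len + 1)
            for w in product("sStT", repeat=n) if is_reduced(w)}

def left_mult(letter, w):
    """Reduced form of letter * w, for a reduced word w."""
    return w[1:] if w and w[0] == INV[letter] else letter + w

N = 6
words = reduced_words(N)        # all reduced words of length <= N
ball = reduced_words(N - 1)     # the part of F_2 that can be fully reconstructed
W = {c: {w for w in words if w.startswith(c)} for c in "sStT"}

# F_2 = W(sigma) (disjoint union) sigma W(sigma^{-1}), on words of length <= N - 1.
sigma_part = {left_mult("s", w) for w in W["S"]}
two_piece_ok = (W["s"] & ball) | sigma_part == ball and not (W["s"] & sigma_part)

# Absorption: S = {sigma^{-n}}, A_1 = W(sigma) + {1} + S, A_2 = W(sigma^{-1}) minus S.
S_powers = {"S" * n for n in range(1, N + 1)}
A1 = W["s"] | {""} | S_powers
A2 = W["S"] - S_powers
sigma_A2 = {left_mult("s", w) for w in A2}
four_piece_ok = (A1 & ball) | sigma_A2 == ball and not (A1 & sigma_A2)
print(two_piece_ok, four_piece_ok)
```

The truncation to length $N - 1$ is needed only because left multiplication by $\sigma$ shortens words in $W(\sigma^{-1})$ by one letter.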
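Second step: To embed a copy of $F_2$ in $SO_3$ we define the $3 \times 3$ matrices:
$$\varphi^{\pm} = \begin{pmatrix} \frac{1}{3} & \mp\frac{2\sqrt{2}}{3} & 0 \\ \pm\frac{2\sqrt{2}}{3} & \frac{1}{3} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 1 & \mp 2\sqrt{2} & 0 \\ \pm 2\sqrt{2} & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix},$$
$$\rho^{\pm} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{3} & \mp\frac{2\sqrt{2}}{3} \\ 0 & \pm\frac{2\sqrt{2}}{3} & \frac{1}{3} \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 1 & \mp 2\sqrt{2} \\ 0 & \pm 2\sqrt{2} & 1 \end{pmatrix}.$$
It suffices to show that no nonempty reduced word $w$ in $\varphi^{\pm}, \rho^{\pm}$ equals the identity in $SO_3$. We may assume that $w$ is a nonempty reduced word ending in $\varphi^{\pm}$, since the case when $w$ ends in $\rho^{\pm}$ is similar.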
Claim 4. Every nonempty reduced word $w$ in $\varphi^{\pm}, \rho^{\pm}$ that ends in $\varphi^{\pm}$ satisfies $w(1,0,0) = 3^{-k} (a, b\sqrt{2}, c)$ for some $a, b, c \in \mathbb{Z}$ with $3 \nmid b$, and where $k$ is the length of $w$. In particular $b \ne 0$ and $w$ is not the identity.
We prove the claim by induction on the length $k$ of $w$. The case $k = 1$ is evident upon examining the first columns of the two matrices $\varphi^{\pm}$. If $w$ of length $k \ge 2$ equals $\varphi^{\pm} w'$ or $\rho^{\pm} w'$, where
$$w'(1,0,0) = 3^{1-k} (a', b'\sqrt{2}, c'), \quad a', b', c' \in \mathbb{Z}, \quad 3 \nmid b',$$
then
$$\varphi^{\pm} w'(1,0,0) = 3^{-k} \left( a' \mp 4b', (b' \pm 2a')\sqrt{2}, 3c' \right), \tag{2.1}$$
$$\rho^{\pm} w'(1,0,0) = 3^{-k} \left( 3a', (b' \mp 2c')\sqrt{2}, c' \pm 4b' \right).$$
We now see that $w(1,0,0)$ has the form $3^{-k} (a, b\sqrt{2}, c)$ for some $a, b, c \in \mathbb{Z}$, and it remains only to prove $3 \nmid b$ given that $3 \nmid b'$. There are four cases: $w = \varphi^{\pm}\rho^{\pm}v$, $\rho^{\pm}\varphi^{\pm}v$, $\varphi^{\pm}\varphi^{\pm}v$ and $\rho^{\pm}\rho^{\pm}v$ where $v$ is possibly empty, and where in the last two cases the two signs agree, both $+$ or both $-$ (since $w$ is reduced). We may suppose that $v(1,0,0) = 3^{2-k} (a'', b''\sqrt{2}, c'')$ where $a'', b'', c'' \in \mathbb{Z}$ (we do not assume $3 \nmid b''$ in order to include the case $v$ is empty). In the first case, we have $a' = 3a''$ by the second line in (2.1) applied to $v$ instead of $w'$. Now $3 \nmid b'$ and so we obtain $3 \nmid b' \pm 2a' = b$ as required. The second case is similar. For the third case we have
$$a' = a'' \mp 4b'',$$
$$b' = \pm 2a'' + b'',$$
by the first line in (2.1) applied to $v$ instead of $w'$. Then
$$b = b' \pm 2a' = b' \pm 2(a'' \mp 4b'') = b' + b'' \pm 2a'' - 9b'' = 2b' - 9b'',$$
and again $3 \nmid b$ follows from $3 \nmid b'$. The fourth case is similar and this completes the proof of the claim.
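The integer recurrences (2.1) make Claim 4 easy to test by brute force for short words. In the sketch below a triple $(a, b, c)$ stands for the vector $3^{-k}(a, b\sqrt{2}, c)$; the letter encoding (`p`, `P` for $\varphi^{+}, \varphi^{-}$ and `r`, `R` for $\rho^{+}, \rho^{-}$) and the function names are assumptions made for illustration. A finite check of this kind is of course no substitute for the induction.

```python
from itertools import product

INV = {"p": "P", "P": "p", "r": "R", "R": "r"}

def step(letter, v):
    """Apply one generator to (a, b, c), which encodes 3^{-k} (a, b*sqrt(2), c)."""
    a, b, c = v
    if letter == "p":                      # phi^{+}
        return (a - 4 * b, b + 2 * a, 3 * c)
    if letter == "P":                      # phi^{-}
        return (a + 4 * b, b - 2 * a, 3 * c)
    if letter == "r":                      # rho^{+}
        return (3 * a, b - 2 * c, c + 4 * b)
    if letter == "R":                      # rho^{-}
        return (3 * a, b + 2 * c, c - 4 * b)
    raise ValueError(letter)

def is_reduced(w):
    return all(INV[x] != y for x, y in zip(w, w[1:]))

def act(w):
    """Integer triple for w(1, 0, 0); the rightmost letter of w acts first."""
    v = (1, 0, 0)
    for letter in reversed(w):
        v = step(letter, v)
    return v

def claim_holds(max_len):
    """Check 3 does not divide b for all reduced words ending in phi^{+-}."""
    for k in range(1, max_len + 1):
        for w in product("pPrR", repeat=k):
            if is_reduced(w) and w[-1] in "pP" and act(w)[1] % 3 == 0:
                return False
    return True

print(act(("p",)), act(("r", "p")), claim_holds(7))
```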
Third step: To lift a paradoxical decomposition from a group to a set on which it acts is easy using the axiom of choice provided the action is with trivial fixed points. We say that a group $G$ acts on a set $X$ with trivial fixed points if $gx \ne x$ for all $x \in X$ and all $g \in G \setminus \{e\}$, where $e$ denotes the identity element of $G$; that is, only the identity fixes any point.
Proposition 4. If $G$ is a paradoxical group and acts on a set $X$ with trivial fixed points, then $X$ is $G$-paradoxical.
Proof: Let $A_i, B_j, g_i, h_j$ witness the paradoxical nature of $G$ as in (0.1). Let $M$ be a choice set for the $G$-orbits in $X$. Then $\{gM\}_{g \in G}$ is a partition of $X$ because there are no nontrivial fixed points. Then
$$A_i^* = \bigsqcup_{g \in A_i} gM \text{ and } B_j^* = \bigsqcup_{h \in B_j} hM$$
easily yield a paradoxical decomposition of $X$:
$$X \supset \left( \bigsqcup_{i=1}^{n} A_i^* \right) \sqcup \left( \bigsqcup_{j=1}^{m} B_j^* \right),$$
$$X = \bigsqcup_{i=1}^{n} g_i A_i^* = \bigsqcup_{j=1}^{m} h_j B_j^*.$$
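Corollary 11. (Hausdorff's paradox) There is a countable set $D \subset \mathbb{S}^2$ such that $\mathbb{S}^2 \setminus D$ is $SO_3$-paradoxical.
Proof: Let $F$ be a free nonabelian group of rank 2 in $SO_3$. Then $F$ is countable and since each $\varphi \in F \setminus \{1\}$ fixes exactly 2 points,
$$D = \left\{ x \in \mathbb{S}^2 : \varphi x = x \text{ for some } \varphi \in F \setminus \{1\} \right\}$$
is countable. Then $F$ acts on $\mathbb{S}^2 \setminus D$ with trivial fixed points. Indeed, the set $D$ of points fixed by some nonidentity element of $F$ is invariant for $F$ since if $x \in D$ and $\varphi x = x$, then $(\psi \varphi \psi^{-1}) \psi x = \psi x$ for all $\varphi, \psi \in F$; and thus $\mathbb{S}^2 \setminus D$ is invariant for $F$ as well. So Proposition 4 implies that $\mathbb{S}^2 \setminus D$ is $F$-paradoxical, hence also $SO_3$-paradoxical.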
Hausdorff's paradox is already sufficient to disprove the existence of finitely additive rotation invariant positive measures of total mass 1 on the power set of $\mathbb{S}^2$, and hence also disproves the existence of finitely additive isometry invariant positive measures on the power set of $\mathbb{R}^3$ that normalize the unit cube (this was Hausdorff's motivation). Exercise: prove this! We can eliminate the countable set $D$ in Hausdorff's paradox by an absorption process once we have the following lemma.
Lemma 16. Let $G$ act on a set $X$ and let $E, E' \in \mathcal{P}(X)$. If $E \sim_G E'$, then $E$ is $G$-paradoxical if and only if $E'$ is $G$-paradoxical.
First we note that the relation $\sim_G$ is transitive. Suppose that $E \sim_G A$ and $E \sim_G B$. Then $E = \bigsqcup_{i=1}^{n} A_i = \bigsqcup_{j=1}^{m} B_j$ where $A = \bigsqcup_{i=1}^{n} g_i A_i$ and $B = \bigsqcup_{j=1}^{m} h_j B_j$ for some group elements $g_i, h_j$. Then $A = \bigsqcup_{i,j=1}^{n,m} g_i (A_i \cap B_j)$ and $B = \bigsqcup_{i,j=1}^{n,m} h_j (A_i \cap B_j)$ shows that $A \sim_G B$. From this we easily obtain the lemma. Indeed, $E$ is $G$-paradoxical if and only if there are disjoint subsets $E_1, E_2$ of $E$ such that both $E_1 \sim_G E$ and $E_2 \sim_G E$. From $E \sim_G E'$, we have $E = \bigsqcup_{i=1}^{n} A_i$ and $E' = \bigsqcup_{i=1}^{n} g_i A_i$. Thus if we define $E'_1 = \bigsqcup_{i=1}^{n} g_i (A_i \cap E_1)$ and $E'_2 = \bigsqcup_{i=1}^{n} g_i (A_i \cap E_2)$, we have that $E'_1, E'_2$ are disjoint subsets of $E'$ such that $E'_1 \sim_G \bigsqcup_{i=1}^{n} (A_i \cap E_1) = E_1 \sim_G E \sim_G E'$ and similarly $E'_2 \sim_G E'$. This shows that $E'$ is $G$-paradoxical.
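Theorem 27. (Banach-Tarski paradox) $\mathbb{S}^2$ is $SO_3$-paradoxical and $\mathbb{B}^3$ is $G_3$-paradoxical.
Proof: Let $D = \{d_i\}_{i=1}^{\infty}$ be as in Hausdorff's paradox. Pick a line $\ell$ through the origin that misses $D$ and fix a plane $P$ containing $\ell$. Let
$$A = \left\{ \frac{1}{n} (\theta_i - \theta_j) : n, i, j \in \mathbb{N} \right\}, \quad \theta_i = \angle(d_i, \ell),$$
where $\angle(d_i, \ell)$ denotes the angle mod $\pi$ through which the plane $P$ must be rotated (in a fixed sense) about $\ell$ so as to contain $d_i$. Pick $\theta \notin A$ (mod $\pi$). Then if $\rho$ is rotation about $\ell$ through angle $\theta$, we have
$$\rho^n D \cap \rho^m D = \emptyset, \quad n \ne m \text{ in } \mathbb{Z}.$$
Indeed, if $\ell$ is the $z$-axis and $\rho^n D \cap \rho^m D \ne \emptyset$ for some $n \ne m$, then using polar coordinates in the $xy$-plane we have $e^{in\theta} r_j e^{i\theta_j} = e^{im\theta} r_k e^{i\theta_k}$, which implies $\theta = \frac{\theta_k - \theta_j}{n - m}$. So with $\widetilde{D} = \bigsqcup_{n=0}^{\infty} \rho^n D = D \sqcup \rho \widetilde{D}$ we have
$$\mathbb{S}^2 = \left( \mathbb{S}^2 \setminus \widetilde{D} \right) \sqcup \widetilde{D} \sim_{SO_3} \left( \mathbb{S}^2 \setminus \widetilde{D} \right) \sqcup \rho \widetilde{D} = \mathbb{S}^2 \setminus D,$$
and the lemma shows that $\mathbb{S}^2$ is $SO_3$-paradoxical.
Finally, the equality
$$\mathbb{B}^3 \setminus \{0\} = \bigcup_{x \in \mathbb{S}^2} \{ \lambda x : 0 < \lambda < 1 \}$$
shows that $\mathbb{B}^3 \setminus \{0\}$ is $SO_3$-paradoxical, and an absorption argument as above then shows that $\mathbb{B}^3$ is $G_3$-paradoxical. Indeed, use a rotation $\rho$ about a line $\ell$ passing through $\left( 0, 0, \frac{1}{3} \right)$ but not passing through the origin, so that $\rho^n 0 \ne \rho^m 0$ for $n \ne m$, and set $\widetilde{D} = \bigsqcup_{n=0}^{\infty} \rho^n \{0\} = \{0\} \sqcup \rho \widetilde{D}$. Then since $\rho \in G_3$,
$$\mathbb{B}^3 = \left( \mathbb{B}^3 \setminus \widetilde{D} \right) \sqcup \widetilde{D} \sim_{G_3} \left( \mathbb{B}^3 \setminus \widetilde{D} \right) \sqcup \rho \widetilde{D} = \mathbb{B}^3 \setminus \{0\}.$$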
Remark 12. The arguments above show that $\mathbb{S}^2$ can be duplicated using 8 pieces, and that $\mathbb{B}^3$ can be duplicated using 16 pieces. More refined arguments show that 4 pieces suffice for $\mathbb{S}^2$, and that 5 pieces suffice for $\mathbb{B}^3$. These latter results are optimal.
Fourth step: The next result shows that if we declare $A \preceq_G B$ when $A$ is $G$-equidecomposable to a subset of $B$, then the relation $\preceq_G$ is a partial ordering of the $\sim_G$ equivalence classes in $\mathcal{P}(X)$.
Theorem 28. (Banach-Schröder-Bernstein) Suppose that a group $G$ acts on a set $X$. If $A, B \in \mathcal{P}(X)$ satisfy both $A \preceq_G B$ and $B \preceq_G A$, then $A \sim_G B$.
Proof: We have the following two properties of the relation $\sim_G$:
(1) If $A \sim_G B$, then there is a bijection $g : A \to B$ such that
$$C \sim_G g(C) \text{ whenever } C \subset A. \tag{2.2}$$
(2) If $A_1 \cap A_2 = \emptyset = B_1 \cap B_2$ and $A_i \sim_G B_i$ for $i = 1, 2$ then $A_1 \cup A_2 \sim_G B_1 \cup B_2$.
By hypothesis, $A \sim_G B_1$ and $A_1 \sim_G B$ for some $B_1 \subset B$ and $A_1 \subset A$. By the first property, there are bijections $f : A \to B_1$ and $g : A_1 \to B$ satisfying $C \sim_G f(C)$ and $D \sim_G g(D)$ whenever $C \subset A$ and $D \subset A_1$. Let $C_0 = A \setminus A_1$ and inductively $C_{n+1} = g^{-1} f(C_n)$ for $n \ge 0$. With $C = \bigsqcup_{n=0}^{\infty} C_n$ we have
$$g(A \setminus C) = B \setminus f(C)$$
and then $A \setminus C \sim_G B \setminus f(C)$ by (2.2). But we also have $C \sim_G f(C)$ by (2.2) and the second property now yields
$$A = (A \setminus C) \sqcup C \sim_G (B \setminus f(C)) \sqcup f(C) = B.$$
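The set $C = \bigsqcup_n C_n$ in the proof above can be computed explicitly in a toy case. The sketch assumes, purely for illustration, that $A = B = \{0, 1, 2, \dots\}$, that $f(n) = n + 1$ maps $A$ onto $B_1 = \{1, 2, \dots\}$, and that $g(n) = n - 1$ maps $A_1 = \{1, 2, \dots\}$ onto $B$; both maps are translations of the integers, and none of these names come from the notes.

```python
def f(n):
    """Bijection A -> B_1 = {1, 2, 3, ...} (translation by +1)."""
    return n + 1

def g(n):
    """Bijection A_1 = {1, 2, 3, ...} -> B (translation by -1)."""
    assert n >= 1
    return n - 1

def in_A1(n):
    return n >= 1

def f_inverse(m):
    """Partial inverse of f; None when m is outside the image B_1."""
    return m - 1 if m >= 1 else None

def in_C(n):
    """Decide whether n lies in C, where C_0 = A minus A_1, C_{k+1} = g^{-1}(f(C_k))."""
    while True:
        if not in_A1(n):
            return True              # n lies in C_0
        m = f_inverse(g(n))          # n in C_{k+1} iff g(n) = f(m) with m in C_k
        if m is None:
            return False
        n = m

def h(n):
    """The combined bijection A -> B: f on C and g on the complement of C."""
    return f(n) if in_C(n) else g(n)

print([n for n in range(10) if in_C(n)], [h(n) for n in range(10)])
```

Here $C$ turns out to be the even numbers, so the resulting equidecomposition uses two pieces: the evens are translated by $+1$ and the odds by $-1$.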
Corollary 12. A subset $E$ of $X$ is $G$-paradoxical if and only if there are disjoint sets $A, B \subset E$ with $A \sqcup B = E$ and $A \sim_G E \sim_G B$.
Theorem 29. (strong form of the Banach-Tarski paradox) If $A$ and $B$ are any two bounded subsets of $\mathbb{R}^3$, each with nonempty interior, then $A$ and $B$ are $G_3$-equidecomposable.
Proof: It suffices to show that $A \preceq_{G_3} B$, since interchanging $A$ and $B$ yields $B \preceq_{G_3} A$, and the Banach-Schröder-Bernstein theorem then shows that $A \sim_{G_3} B$. So choose solid balls $K$ and $L$ such that $A \subset K$ and $L \subset B$, and let $n$ be large enough that $K$ can be covered by $n$ copies of $L$. Use the Banach-Tarski paradox to create a union $S$ of $n$ pairwise disjoint copies of $L$, and then cover $K$ by a union of translates of these copies so that $K \preceq_{G_3} S$. It follows that
$$A \subset K \preceq_{G_3} S \sim_{G_3} L \subset B,$$
and so $A \preceq_{G_3} B$.
3. Exercises
Exercise 11. Use Hausdorff's paradox to prove that there is no finitely additive isometry invariant positive measure $\lambda$ on the power set $\mathcal{P}(\mathbb{R}^3)$ of three dimensional Euclidean space such that $\lambda(Q) = 1$ where $Q = [0,1]^3$ is the unit cube.
CHAPTER 6
Abstract integration and the Riesz representation
theorem
The properties of Lebesgue measure, as given in Theorem 21, are easily extended to a quite general setting of measure spaces, where a theory of integration can then be established that includes the analogues of the monotone convergence theorem, Fatou's lemma and the dominated convergence theorem. It turns out to be fruitful to abandon the completeness property (4) in Theorem 21 for the abstract setting, and to include it as a separate feature. The resulting abstract theory of integration is one of the most powerful tools in analysis and we will give several applications of it in the sequel. Fortunately, this abstract theory follows very closely the theory of Lebesgue integration that was developed in the previous chapter, which permits us to proceed relatively quickly here.
1. Abstract integration
Let $X$ be a set and suppose that $\mathcal{A} \subset \mathcal{P}(X)$ is a $\sigma$-algebra of subsets of $X$, i.e. $\mathcal{A}$ contains the empty set, and is closed under complementation and countable unions:
(1) $\emptyset \in \mathcal{A}$,
(2) $A^c \in \mathcal{A}$ whenever $A \in \mathcal{A}$,
(3) $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$ whenever $A_n \in \mathcal{A}$ for all $n$.
The pair $(X, \mathcal{A})$ is called a measurable space and $\mathcal{A}$ is called a $\sigma$-algebra on $X$, although one usually abuses notation by referring to just $X$ as the measurable space, despite the fact that without $\mathcal{A}$, the set $X$ has no structure. There are lots of $\sigma$-algebras on a set $X$. In fact, given any fixed collection $\mathcal{F} \subset \mathcal{P}(X)$ of subsets of $X$, there is a smallest $\sigma$-algebra on $X$ containing $\mathcal{F}$.
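Lemma 17. Given $\mathcal{F} \subset \mathcal{P}(X)$, there is a unique $\sigma$-algebra $\mathcal{A}_{\mathcal{F}}$ on $X$ such that
(1) $\mathcal{F} \subset \mathcal{A}_{\mathcal{F}}$,
(2) if $\mathcal{A}$ is any $\sigma$-algebra on $X$ with $\mathcal{F} \subset \mathcal{A}$, then $\mathcal{A}_{\mathcal{F}} \subset \mathcal{A}$.
Proof. The power set $\mathcal{P}(X)$ is a $\sigma$-algebra on $X$ that contains $\mathcal{F}$. Thus the family of $\sigma$-algebras on $X$ containing $\mathcal{F}$ is nonempty, and the intersection
$$\mathcal{A}_{\mathcal{F}} = \bigcap \{ \mathcal{A} : \mathcal{A} \text{ is a } \sigma\text{-algebra on } X \text{ with } \mathcal{F} \subset \mathcal{A} \}$$
is well defined. It is easily verified that $\mathcal{A}_{\mathcal{F}}$ is a $\sigma$-algebra on $X$ that contains $\mathcal{F}$, and it is then clear that $\mathcal{A}_{\mathcal{F}}$ is the smallest such. This completes the proof of the lemma.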
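For a finite set $X$ the smallest $\sigma$-algebra containing a given collection can be computed directly, since countable unions reduce to finite ones. The sketch below closes the collection under complements and pairwise unions until nothing new appears; the function name `generated_sigma_algebra` and the sample data are assumptions made for illustration, and this "closure from below" is a different route from the intersection used in the proof.

```python
from itertools import combinations

def generated_sigma_algebra(X, F):
    """Smallest sigma-algebra on the finite set X containing every set in F."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(S) for S in F}
    while True:
        # Add all complements and all pairwise unions of the sets found so far.
        new = {X - S for S in sets} | {S | T for S, T in combinations(sets, 2)}
        if new <= sets:
            return sets
        sets |= new

A_F = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
print(sorted(sorted(S) for S in A_F))
```

In this example the atoms are $\{1\}$, $\{2\}$ and $\{3, 4\}$, so the generated $\sigma$-algebra has $2^3 = 8$ members.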
A map $\mu : \mathcal{A} \to [0, \infty]$ is called a positive measure on $\mathcal{A}$ if it is countably additive, and nondegenerate in the sense that not every set has infinite measure:
(1) $\bigcup_{n=1}^{\infty} E_n \in \mathcal{A}$ and $\mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{A}$,
(2) there exists $A \in \mathcal{A}$ with $\mu(A) < \infty$.
The triple $(X, \mathcal{A}, \mu)$ is called a measure space. Again, one usually abuses notation and refers to such a set functional $\mu$ as a positive measure on $X$, and often as just a measure on $X$. Note that $\mu(\emptyset) = 0$ is a consequence of properties (1) and (2) since
$$\mu(A) = \mu\left( A \sqcup \emptyset \sqcup \emptyset \sqcup \dots \right) = \mu(A) + \mu(\emptyset) + \mu(\emptyset) + \dots$$
and $\mu(A)$ can be cancelled from both sides since $\mu(A) < \infty$. We say that $\mu$ is a complete measure on $X$ or $\mathcal{A}$ if all subsets of sets of $\mu$-measure zero lie in $\mathcal{A}$ and have zero measure, i.e.
(1) $F \in \mathcal{A}$ and $\mu(F) = 0$ whenever $F \subset E$ and $E \in \mathcal{A}$ with $\mu(E) = 0$.
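Example 4. We give four examples of measures.
(1) Lebesgue measure on the real line $\mathbb{R}$ is an example of a complete measure.
(2) A simpler example is counting measure $\nu : \mathcal{P}(X) \to [0, \infty]$ defined on the power set $\mathcal{P}(X)$ of a set $X$ by
$$\nu(E) = \begin{cases} \#E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite} \end{cases}.$$
(3) Simpler still is the Dirac unit mass measure $\delta_x : \mathcal{P}(X) \to \{0, 1\}$ at a point $x$ in a set $X$ defined by
$$\delta_x(E) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases}.$$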
(4) A very interesting example, and one which often arises as a counterexample to reasonable conjectures in abstract measure theory, uses the well-ordered set $X$ that has $\omega_1$ as a last element, and with the property that every predecessor of $\omega_1$ has at most countably many predecessors. Recall that an ordered set $(X, \prec)$ is well-ordered if $\prec$ is a linear order on $X$ such that every nonempty subset of $X$ has a least element. The axiom of choice is equivalent to the assertion that every set can be well-ordered. To construct $X$, let $Y$ be any uncountable well-ordered set and let $\omega_1$ be the least element having uncountably many predecessors; $\omega_1$ is uniquely determined up to order isomorphism and is called the first uncountable ordinal.
Now for $\alpha \in X$, let $P_\alpha$ and $S_\alpha$ be the predecessor and successor sets of $\alpha$ given by
$$P_\alpha = \{ \beta \in X : \beta \prec \alpha \},$$
$$S_\alpha = \{ \beta \in X : \alpha \prec \beta \}.$$
Define a topology $\tau$ on $X$ by declaring that $G \subset X$ belongs to $\tau$ if $G$ is either a predecessor set $P_\alpha$, a successor set $S_\alpha$, an open segment $P_\alpha \cap S_\beta = (\beta, \alpha)$, or an arbitrary union of predecessors, successors and segments. Then the topological space $(X, \tau)$ is Hausdorff (meaning that every pair of distinct points in $X$ can be separated by disjoint open sets in $X$) and compact.
To see that $X$ is compact, observe that every collection of closed subsets $\{F_i\}_{i \in I}$ with the finite intersection property has nonempty intersection, $\bigcap_{i \in I} F_i \ne \emptyset$, because every nonempty subset of $X$ has a least element. Indeed, if it were the case that $\bigcap_{i \in I} F_i = \emptyset$, then there is an infinite sequence $\{F_{i_n}\}_{n=1}^{\infty}$ of these closed sets, such that the least upper bounds $\alpha_n$ of the sets $\bigcap_{k=1}^{n} F_{i_k}$ form an infinite strictly decreasing sequence $\{\alpha_n\}_{n=1}^{\infty}$ in $X$, contradicting the existence of a least element in $\{\alpha_n\}_{n=1}^{\infty}$. To see this, choose $F_{i_1}$ arbitrarily. Then $\alpha_1 = \operatorname{lub}(F_{i_1})$ exists and lies in $F_{i_1}$. In fact, the set of upper bounds of any set $F$ is nonempty ($\omega_1$ is an upper bound), and so has a least element $\alpha$ because $X$ is well-ordered. Every open set containing $\alpha$ must contain a segment $(\beta, \gamma)$ with $\beta < \alpha < \gamma$ (or a predecessor or successor set containing $\alpha$; we leave these cases to the reader), and since $\beta$ cannot be an upper bound for $F$, there is an element of $F$ in the segment $(\beta, \alpha]$. If $F$ is closed it thus follows that $\alpha \in F$. Next, we note that there is $F_{i_2}$ such that $\alpha_1 \notin F_{i_1} \cap F_{i_2}$ (otherwise $\bigcap_{i \in I} F_i \ne \emptyset$), and since $F_{i_1} \cap F_{i_2}$ is closed and nonempty, we have
$$\alpha_2 = \operatorname{lub}(F_{i_1} \cap F_{i_2}) < \alpha_1.$$
We can continue in this manner to construct a sequence of sets $\{F_{i_n}\}_{n=1}^{\infty}$ such that the points $\alpha_n = \operatorname{lub}\left( \bigcap_{k=1}^{n} F_{i_k} \right)$ are strictly decreasing.
Now define
$$\mathcal{A} = \left\{ E \in \mathcal{P}(X) : \text{either } E \cup \{\omega_1\} \text{ or } E^c \cup \{\omega_1\} \text{ contains an uncountable compact set} \right\},$$
and define a set functional $\lambda : \mathcal{A} \to \{0, 1\}$ by
$$\lambda(E) = \begin{cases} 1 & \text{if } E \cup \{\omega_1\} \text{ contains an uncountable compact set} \\ 0 & \text{if } E^c \cup \{\omega_1\} \text{ contains an uncountable compact set} \end{cases},$$
for $E \in \mathcal{A}$.
1.1. Measurable functions. It is convenient to initially define the notion of a measurable function for $f : X \to Y$ where $X$ is a measurable space and $Y$ is a general topological space. Recall that $\tau \subset \mathcal{P}(Y)$ is a topology on $Y$ if it contains the empty set, the whole set $Y$, and is closed under finite intersections and arbitrary unions:
(1) $\emptyset, Y \in \tau$,
(2) $G_1 \cap G_2 \cap \dots \cap G_n \in \tau$ whenever $G_i \in \tau$ for $1 \le i \le n < \infty$,
(3) $\bigcup_{\alpha \in A} G_\alpha \in \tau$ whenever $G_\alpha \in \tau$ for all $\alpha \in A$ (here $A$ is an arbitrary index set).
The pair $(Y, \tau)$ is called a topological space, and the sets in $\tau$ are called the open sets in $Y$. As usual, we often abuse notation and refer to just the set $Y$ as the topological space, with the underlying topology being understood.
Definition 14. Let $(X, \mathcal{A})$ be a measurable space and let $(Y, \tau)$ be a topological space. A function
$$f : X \to Y$$
is said to be measurable (more precisely $\mathcal{A}$-measurable) if
$$f^{-1}(G) \in \mathcal{A} \text{ for all } G \in \tau.$$
Note the similarity to the definition of a continuous function $f : X \to Y$ in the case that $(X, \sigma)$ is a topological space: $f$ is continuous if $f^{-1}(G) \in \sigma$ for all $G \in \tau$. We have already seen in Example 3 that the composition of a continuous function followed by a Lebesgue measurable function need not be measurable. On the other hand the composition of a measurable function followed by a continuous function is always measurable, even in this abstract setting.
Proposition 5. Suppose that $(X, \mathcal{A})$ is a measurable space and that $(Y, \tau)$ and $(Z, \eta)$ are topological spaces. If $f : X \to Y$ is measurable and $g : Y \to Z$ is continuous, then the composition $g \circ f : X \to Z$ is measurable.
Proof. If $H \in \eta$ is open in $Z$, then $G = g^{-1}(H) \in \tau$ is open in $Y$ and so the measurability of $f$ gives
$$(g \circ f)^{-1}(H) = f^{-1}\left( g^{-1}(H) \right) = f^{-1}(G) \in \mathcal{A}$$
for all $H \in \eta$. This verifies the definition that $g \circ f : X \to Z$ is measurable.
We now consider the possibility that $X$ is simultaneously a measurable space and a topological space, i.e. there is a $\sigma$-algebra $\mathcal{A}$ on $X$ as well as a topology $\tau$ on $X$. If $\tau \subset \mathcal{A}$, then every continuous function $f : X \to Y$ is also measurable.
Lemma 18. Suppose that $\mathcal{A}$ is a $\sigma$-algebra on $X$ and $\tau$ is a topology on $X$ with $\tau \subset \mathcal{A}$. If $Y$ is any topological space, then every continuous function $f : X \to Y$ is also measurable.
Proof. If $G$ is open in $Y$, then $f^{-1}(G) \in \tau \subset \mathcal{A}$.
If $(X, \tau)$ is a topological space, then Lemma 17 shows that there is a smallest $\sigma$-algebra $\mathcal{B}_\tau$ on $X$ that contains the topology $\tau$. This important $\sigma$-algebra $\mathcal{B}_\tau$ is called the Borel $\sigma$-algebra on the topological space $(X, \tau)$, and the sets $E$ in $\mathcal{B}_\tau$ are called Borel sets. A function $f : X \to Y$ that is measurable with respect to the Borel $\sigma$-algebra on $X$ is said to be a Borel function on $X$. The previous lemma shows that continuous functions are always Borel measurable, but there is an important property that Borel functions have that is not shared by measurable functions in general.
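Proposition 6. Suppose that $(X, \mathcal{A})$ is a measurable space and that $(Y, \tau)$ and $(Z, \eta)$ are topological spaces. If $f : X \to Y$ is measurable and $g : Y \to Z$ is Borel measurable, then the composition $g \circ f : X \to Z$ is measurable.
Proof. Consider the collection of subsets of $Y$ defined by
$$\mathcal{C} = \left\{ E \in \mathcal{P}(Y) : f^{-1}(E) \in \mathcal{A} \right\}.$$
It is a simple exercise to verify that $\mathcal{C}$ is a $\sigma$-algebra on $Y$ (no properties other than $\mathcal{A}$ is a $\sigma$-algebra and $f$ is a function are needed for this). Indeed, the following three properties hold since $\mathcal{A}$ is a $\sigma$-algebra:
$$f^{-1}(\emptyset) = \emptyset \in \mathcal{A},$$
$$f^{-1}(E^c) = \left( f^{-1}(E) \right)^c \in \mathcal{A}, \text{ if } E \in \mathcal{C},$$
$$f^{-1}\left( \bigcup_{k=1}^{\infty} E_k \right) = \bigcup_{k=1}^{\infty} f^{-1}(E_k) \in \mathcal{A}, \text{ if } E_k \in \mathcal{C},$$
and they show by definition of $\mathcal{C}$ that
$$\emptyset \in \mathcal{C},$$
$$E^c \in \mathcal{C} \text{ when } E \in \mathcal{C},$$
$$\bigcup_{k=1}^{\infty} E_k \in \mathcal{C} \text{ when } E_k \in \mathcal{C}.$$
Moreover, the measurability of $f$ shows that $\mathcal{C}$ contains $\tau$, the open sets in $Y$. Thus by Lemma 17, $\mathcal{C}$ contains the Borel $\sigma$-algebra $\mathcal{B}_\tau$ on $Y$.
Now if $H \in \eta$ is open in $Z$, the Borel measurability of $g$ shows that
$$g^{-1}(H) \in \mathcal{B}_\tau \subset \mathcal{C},$$
which gives
$$(g \circ f)^{-1}(H) = f^{-1}\left( g^{-1}(H) \right) \in \mathcal{A}$$
for every $H \in \eta$, by the definition of $\mathcal{C}$ applied to $g^{-1}(H) \in \mathcal{C}$.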
Remark 13. For future reference we isolate one of the facts proved above: if $\mathcal{A}$ is a $\sigma$-algebra on a set $X$, and if $f : X \to Y$ is any function whatsoever, then the set
$$\mathcal{C} = \left\{ E \in \mathcal{P}(Y) : f^{-1}(E) \in \mathcal{A} \right\}$$
is a $\sigma$-algebra on $Y$. Thus $\sigma$-algebras can be pushed forward by arbitrary functions.
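The pushed-forward collection of Remark 13 can be computed by brute force when $X$ and $Y$ are finite. The sketch below is only illustrative: the function names, the sample $\sigma$-algebra and the sample map are all assumptions.

```python
from itertools import chain, combinations

def powerset(Y):
    """All subsets of the finite set Y, as frozensets."""
    Y = list(Y)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(Y, r) for r in range(len(Y) + 1))]

def preimage(f, X, E):
    return frozenset(x for x in X if f(x) in E)

def pushforward(X, A, f, Y):
    """All E subset of Y whose preimage under f belongs to the sigma-algebra A."""
    return {E for E in powerset(Y) if preimage(f, X, E) in A}

X = frozenset({1, 2, 3, 4})
A = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), X}
Y = frozenset({0, 1, 2})

C_coarse = pushforward(X, A, lambda x: x % 2, Y)            # parity splits the atoms of A
C_fine = pushforward(X, A, lambda x: 0 if x <= 2 else 1, Y)  # constant on each atom of A
print(len(C_coarse), len(C_fine))
```

Note that the point $2 \in Y$ is not in the range of either map, so $\{2\}$ always has empty preimage and therefore always belongs to the pushed-forward $\sigma$-algebra.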
1.1.1. Product spaces. Given two topological spaces $(Y_1, \tau_1)$ and $(Y_2, \tau_2)$, we define the product topology $\tau_1 \times \tau_2$ on the product space $Y_1 \times Y_2$ to consist of arbitrary unions of open rectangles $G_1 \times G_2$ where $G_i \in \tau_i$ for $i = 1, 2$. It is easy to see that $\tau_1 \times \tau_2$ is a topology: it is closed under finite intersections since the intersection of two open rectangles is again an open rectangle. Let $(X, \sigma)$ be another topological space. It is an easy exercise to show that if
$$f : X \to Y_1 \times Y_2, \quad f(x) = (f_1(x), f_2(x)) \in Y_1 \times Y_2 \text{ for } x \in X,$$
then $f$ is continuous if and only if $f_i : X \to Y_i$ is continuous for both $i = 1$ and $i = 2$. The same sort of phenomenon holds for measurability if the spaces $Y_1$ and $Y_2$ each have a countable base. Recall that a topological space $(Y, \tau)$ has a countable base $\{G_n\}_{n=1}^{\infty}$ if each $G_n$ is open and if for every point $x$ contained in an open set $G$ there is $n \in \mathbb{N}$ such that $x \in G_n \subset G$. For example, Euclidean space $\mathbb{R}^n$ has a countable base, namely the collection of all open balls with rational radii having centers with rational coordinates.
Lemma 19. Suppose that $(X, \mathcal{A})$ is a measurable space, and that $(Y_1, \tau_1)$ and $(Y_2, \tau_2)$ are topological spaces with countable bases. Then
$$f = (f_1, f_2) : X \to Y_1 \times Y_2$$
is measurable if and only if $f_i : X \to Y_i$ is measurable for both $i = 1$ and $i = 2$.
Proof. Suppose first that $f$ is measurable. Since the projection map $\pi_i : Y_1 \times Y_2 \to Y_i$ is continuous, Proposition 5 shows that $f_i = \pi_i \circ f$ is measurable for $i = 1, 2$.
Now suppose that both $f_1$ and $f_2$ are measurable. Then if $R = G_1 \times G_2$ is an open rectangle,
$$f^{-1}(R) = f^{-1}(G_1 \times G_2) = f_1^{-1}(G_1) \cap f_2^{-1}(G_2) \in \mathcal{A}. \tag{1.1}$$
If $J = \bigcup_{k=1}^{\infty} R_k$ is a countable union of open rectangles $R_k$, we have
$$f^{-1}(J) = \bigcup_{k=1}^{\infty} f^{-1}(R_k) \in \mathcal{A}.$$
Finally, it is easy to see that every open set $J$ in $Y_1 \times Y_2$ is a countable union of open rectangles because of our assumption that $Y_i$ has a countable base for $i = 1$ and $i = 2$. Indeed, if $\mathcal{B}_i$ is a countable base for $Y_i$, then
$$\mathcal{B}_1 \times \mathcal{B}_2 = \{ G_1 \times G_2 : G_i \in \mathcal{B}_i \text{ for } i = 1, 2 \}$$
is a countable base for $Y_1 \times Y_2$. Then if $J$ is open,
$$J = \bigcup \{ G : G \in \mathcal{B}_1 \times \mathcal{B}_2 \text{ and } G \subset J \},$$
and the latter union is clearly at most countable. This completes the proof that $f$ is measurable.
Corollary 13. Let $(X, \mathcal{A})$ be a measurable space and $n \ge 2$. Then
(1) $f : X \to \mathbb{R}^n$ is measurable if and only if each component function $f_i : X \to \mathbb{R}$ in $f(x) = (f_1(x), \dots, f_n(x))$ is measurable, $1 \le i \le n$, and
(2) if $f, g : X \to \mathbb{R}^n$ are both measurable, then so are $f + g : X \to \mathbb{R}^n$ and $f \cdot g : X \to \mathbb{R}$.
Proof. Assertion (1) follows by induction from Lemma 19. Now define $F : X \to \mathbb{R}^n \times \mathbb{R}^n$ by $F(x) = (f(x), g(x))$ for $x \in X$. Then the measurability of $f$ and $g$ implies that of $F$ by Lemma 19. If $\varphi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is defined by $\varphi(u, v) = u + v$, then the continuity of $\varphi$ and Proposition 5 imply the measurability of $(\varphi \circ F)(x) = f(x) + g(x) = (f + g)(x)$, $x \in X$. Similarly, if $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is defined by $\psi(u, v) = u \cdot v$, then the continuity of $\psi$ and Proposition 5 imply the measurability of $(\psi \circ F)(x) = f(x) \cdot g(x) = (f \cdot g)(x)$, $x \in X$.
The following lemma is proved exactly as in the case of Lebesgue measure on the real line treated above.
Lemma 20. Let $(X, \mathcal{A})$ be a measurable space. Suppose that $f, f_n, g : X \to [-\infty, \infty]$ for $n \in \mathbb{N}$.
(1) If $f$ is finite-valued, then $f$ is measurable if and only if $f^{-1}(G) \in \mathcal{A}$ for all open sets $G \subset \mathbb{R}$ if and only if $f^{-1}(F) \in \mathcal{A}$ for all closed sets $F \subset \mathbb{R}$.
(2) If $f$ is finite-valued and continuous, then $f$ is measurable.
(3) If $f$ is finite-valued and measurable and $\Phi : \mathbb{R} \to \mathbb{R}$ is continuous, then $\Phi \circ f$ is measurable.
(4) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions, then the following functions are all measurable:
$$\sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n \to \infty} f_n(x), \quad \liminf_{n \to \infty} f_n(x).$$
(5) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions and $f(x) = \lim_{n \to \infty} f_n(x)$, then $f$ is measurable.
(6) If $f$ is measurable, so is $f^n$ for $n \in \mathbb{N}$.
(7) If $f$ and $g$ are finite-valued and measurable, then so are $f + g$ and $fg$.
1.2. Simple, nonnegative and integrable functions. We now proceed almost exactly as we did in the case of Lebesgue measure on the real line $\mathbb{R}$. We will be brief and omit all proofs here as they are virtually verbatim the same as the proofs we gave for Lebesgue measure.
Let $(X, \mathcal{A}, \mu)$ be a measure space. A function $\varphi : X \to \mathbb{R}$ is a simple function if it is measurable and its range is finite. Such functions have the form
$$\varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k}, \quad \alpha_k \in \mathbb{R}, \quad E_k \in \mathcal{A}.$$
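Proposition 7. Let $f : X \to [-\infty, \infty]$ be measurable.
(1) If $f$ is nonnegative there is an increasing sequence of nonnegative simple functions $\{\varphi_k\}_{k=1}^{\infty}$ that converges pointwise and monotonically to $f$:
\[ \varphi_k(x) \leq \varphi_{k+1}(x) \ \text{ and } \ \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x \in X. \]
(2) There is a sequence of simple functions $\{\varphi_k\}_{k=1}^{\infty}$ satisfying
\[ |\varphi_k(x)| \leq |\varphi_{k+1}(x)| \ \text{ and } \ \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x \in X. \]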
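If $\varphi$ is a simple function, then $\varphi$ has a unique canonical representation
\[ \varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k}, \]
where the real constants $\alpha_k$ are distinct and nonzero, and the measurable sets $E_k$ are pairwise disjoint. We define the integral of $\varphi$ by
\[ \int \varphi \, d\mu = \sum_{k=1}^{N} \alpha_k |E_k|_{\mu}, \]
where we are using the notation $|E|_{\mu} = \mu(E)$ for $E \in \mathcal{A}$. If $E \in \mathcal{A}$ and $\varphi$ is a simple function, then so is $\chi_E \varphi$, and we define
\[ \int_E \varphi \, d\mu = \int (\chi_E \varphi) \, d\mu. \]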
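Lemma 21. Suppose that $\varphi$ and $\psi$ are simple functions and that $E, F \in \mathcal{A}$.
(1) If $\varphi = \sum_{k=1}^{M} \beta_k \chi_{B_k}$ (not necessarily the canonical representation), then
\[ \int \varphi \, d\mu = \sum_{k=1}^{M} \beta_k |B_k|_{\mu}. \]
(2) $\int (a\varphi + b\psi) \, d\mu = a \int \varphi \, d\mu + b \int \psi \, d\mu$ for $a, b \in \mathbb{C}$,
(3) $\int_{E \cup F} \varphi \, d\mu = \int_E \varphi \, d\mu + \int_F \varphi \, d\mu$ if $E \cap F = \emptyset$,
(4) $\int \varphi \, d\mu \leq \int \psi \, d\mu$ if $\varphi \leq \psi$,
(5) $\left| \int \varphi \, d\mu \right| \leq \int |\varphi| \, d\mu$.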
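The independence of the integral from the chosen representation, as in part (1) of Lemma 21, can be checked by hand on a finite measure space. The following is only an illustrative sketch: the three-point space, the point masses in `mu`, and the helper name `integrate_simple` are assumptions made for this example and do not come from the notes.

```python
from fractions import Fraction


def integrate_simple(terms, mu):
    """Integrate a simple function sum_k c_k * chi_{E_k} against a measure.

    terms: list of (coefficient, set of points) pairs (hypothetical encoding).
    mu:    dict mapping each point of a finite space to its mass.
    """
    def measure(E):
        # mu(E) is the sum of the point masses in E
        return sum(mu[x] for x in E)

    return sum(c * measure(E) for c, E in terms)


# A three-point measure space with total mass 1 (assumed for illustration).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

# Canonical representation: phi = 2 on {a}, 5 on {b, c}.
canonical = [(Fraction(2), {"a"}), (Fraction(5), {"b", "c"})]

# Non-canonical representation of the same phi: 2 on everything plus 3 on {b, c}.
overlapping = [(Fraction(2), {"a", "b", "c"}), (Fraction(3), {"b", "c"})]

value_canonical = integrate_simple(canonical, mu)
value_overlapping = integrate_simple(overlapping, mu)
```

Both representations describe the same function, so the two sums agree, which is exactly what the lemma asserts in this toy setting.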
For $f : X \to [0, \infty]$ measurable we define
\[ \int f \, d\mu = \sup \left\{ \int \varphi \, d\mu : 0 \leq \varphi \leq f \text{ and } \varphi \text{ is simple} \right\}. \]
We say that $f$ is integrable if $\int f \, d\mu < \infty$. For $E$ measurable define
\[ \int_E f \, d\mu = \int (\chi_E f) \, d\mu. \]
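Lemma 22. Suppose that $f, g : X \to [0, \infty]$ are nonnegative measurable functions and that $E, F \in \mathcal{A}$.
(1) $\int (af + bg) \, d\mu = a \int f \, d\mu + b \int g \, d\mu$ for $a, b \in (0, \infty)$,
(2) $\int_{E \cup F} f \, d\mu = \int_E f \, d\mu + \int_F f \, d\mu$ if $E \cap F = \emptyset$,
(3) $\int f \, d\mu \leq \int g \, d\mu$ if $0 \leq f \leq g$,
(4) If $\int f \, d\mu < \infty$, then $f(x) < \infty$ for a.e. $x$,
(5) If $\int f \, d\mu = 0$, then $f(x) = 0$ for a.e. $x$.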
Theorem 30. (Monotone Convergence Theorem) Suppose that $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions, i.e. $f_n(x) \leq f_{n+1}(x)$, and let
\[ f(x) = \sup_n f_n(x) = \lim_{n \to \infty} f_n(x). \]
Then $f$ is nonnegative and measurable and
\[ \int f \, d\mu = \sup_n \int f_n \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu. \]
Corollary 14. Suppose that $a_k(x) \geq 0$ is measurable for $k \geq 1$. Then
\[ \int \sum_{k=1}^{\infty} a_k \, d\mu = \sum_{k=1}^{\infty} \int a_k \, d\mu. \]
Lemma 23. (Fatou's Lemma) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of nonnegative measurable functions, then
\[ \int \liminf_{n \to \infty} f_n \, d\mu \leq \liminf_{n \to \infty} \int f_n \, d\mu. \]
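The inequality in Fatou's Lemma can be strict. A standard worked example, assuming Lebesgue measure $m$ on $[0, 1]$ (the choice of space and of the functions $f_n$ is made here for illustration and is not taken from the notes):

```latex
% Escaping mass: f_n concentrates near 0 while keeping integral 1.
f_n = n \, \chi_{(0, 1/n)}, \qquad n \geq 1.
% For each fixed x in [0,1] we have f_n(x) = 0 as soon as n \geq 1/x
% (and f_n(0) = 0 for all n), so
\liminf_{n \to \infty} f_n(x) = 0 \quad \text{for all } x \in [0, 1],
% whereas every f_n has integral n \cdot (1/n) = 1. Hence
\int \liminf_{n \to \infty} f_n \, dm = 0 \; < \; 1 = \liminf_{n \to \infty} \int f_n \, dm .
```

The sequence is not increasing, so this does not conflict with the Monotone Convergence Theorem.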
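If $f : X \to [-\infty, \infty]$ is measurable, define
\[ \int f \, d\mu = \int f^{+} \, d\mu - \int f^{-} \, d\mu, \]
provided not both $\int f^{+} \, d\mu$ and $\int f^{-} \, d\mu$ are infinite. We say that such an $f$ is integrable if
\[ \int |f| \, d\mu = \int \left( f^{+} + f^{-} \right) d\mu = \int f^{+} \, d\mu + \int f^{-} \, d\mu < \infty. \]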
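Lemma 24. Suppose that $f, g$ are integrable and that $E, F \in \mathcal{A}$.
(1) $\int (af + bg) \, d\mu = a \int f \, d\mu + b \int g \, d\mu$ for $a, b \in \mathbb{R}$,
(2) $\int_{E \cup F} f \, d\mu = \int_E f \, d\mu + \int_F f \, d\mu$ if $E \cap F = \emptyset$,
(3) $\int f \, d\mu \leq \int g \, d\mu$ if $f \leq g$,
(4) $\left| \int f \, d\mu \right| \leq \int |f| \, d\mu$.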
We say that a property $P(x)$ holds $\mu$-a.e. $x \in X$ if the set of $x$ for which $P(x)$ fails has $\mu$-measure zero.
Theorem 31. (Dominated Convergence Theorem) Let $g$ be a nonnegative integrable function. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions satisfying
\[ \lim_{n \to \infty} f_n(x) = f(x), \quad \mu\text{-a.e. } x \in X, \]
and
\[ |f_n(x)| \leq g(x), \quad \mu\text{-a.e. } x \in X. \]
Then
\[ \lim_{n \to \infty} \int |f - f_n| \, d\mu = 0, \]
and hence
\[ \int f \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu. \]
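As a quick sanity check of the Dominated Convergence Theorem, here is a worked instance, assuming Lebesgue measure $m$ on $[0, 1]$; the functions $f_n(x) = x^n$ and the dominating function $g \equiv 1$ are choices made for this illustration only.

```latex
% Setup: f_n(x) = x^n on [0,1], dominated by the integrable function g = 1.
|f_n(x)| = x^n \leq 1 = g(x), \qquad \int_{[0,1]} g \, dm = 1 < \infty .
% Pointwise limit: x^n -> 0 for 0 <= x < 1, and the single point x = 1 is m-null, so
\lim_{n \to \infty} f_n(x) = 0 \quad \text{for } m\text{-a.e. } x \in [0, 1] .
% Conclusion of the theorem, confirmed by direct computation:
\int_{[0,1]} |0 - f_n| \, dm = \int_0^1 x^n \, dx = \frac{1}{n+1} \longrightarrow 0 .
```

By contrast, the sequence $n \chi_{(0, 1/n)}$ admits no integrable dominating function on $[0, 1]$, and its integrals do not converge to the integral of its pointwise limit.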
Finally, if $f(x) = u(x) + i v(x)$ is complex-valued, where $u(x)$ and $v(x)$ are real-valued measurable functions such that
\[ \int |f| \, d\mu = \int \sqrt{u^2 + v^2} \, d\mu < \infty, \]
then we define
\[ \int f \, d\mu = \int u \, d\mu + i \int v \, d\mu. \]
The usual properties of linearity, additivity, monotonicity and the triangle inequality all hold for this definition as well.
2. The Riesz representation theorem
Suppose we have a measure space $(X, \mathcal{A}, \mu)$ that is also a topological space $(X, \tau)$ with topology $\tau \subset \mathcal{A}$. Then every continuous function $f : X \to \mathbb{C}$ is measurable. If in addition the measure $\mu$ is locally finite, i.e.
\[ \mu(K) < \infty \quad \text{for all compact sets } K \subset X, \]
and if the space $X$ is compact, or more generally just if $f$ has compact support, then $f$ is integrable and the integral $\int f \, d\mu$ is a complex number. Now the set $C_c(X)$ of continuous complex-valued functions on $X$ with compact support is clearly a complex vector space under pointwise addition and scalar multiplication of functions. The map
\[ \Lambda_{\mu} : C_c(X) \to \mathbb{C}, \quad \text{given by } \Lambda_{\mu} f = \int f \, d\mu, \]
is a linear functional on the vector space $C_c(X)$. Moreover it has a special property due to the positivity of the measure $\mu$, namely that $\Lambda_{\mu}$ is a positive linear functional:
\[ \Lambda_{\mu} f \geq 0 \ \text{ whenever } f \in C_c(X) \text{ satisfies } f(x) \geq 0 \text{ for all } x \in X. \]
Remarkable fact: For many topological spaces $(X, \tau)$, every positive linear functional $\Lambda$ on $C_c(X)$ is equal to $\Lambda_{\mu}$ for some positive locally finite Borel measure $\mu$ on $X$.
The condition we will impose on the space $X$ in order to force this remarkable fact is that $X$ be locally compact and Hausdorff. A topological space $(X, \tau)$ is locally compact if $X$ has a base of compact sets, i.e. for every $x \in G \subset X$ with $G$ open, there is $H$ open with $\overline{H}$ compact and
\[ x \in H \subset \overline{H} \subset G \subset X. \]
A topological space $(X, \tau)$ is Hausdorff if for every pair of distinct points $x, y \in X$ there are open sets $G$ and $H$ such that
\[ x \in G, \ y \in H \ \text{ and } \ G \cap H = \emptyset. \]
The key fact that we use about such spaces, and which connects measures to continuous functions, is Urysohn's Lemma.
2.1. Urysohn's Lemma.
Lemma 25. (Urysohn) Suppose that $X$ is a locally compact Hausdorff space and that $K \subset V \subset X$ where $K$ is compact and $V$ is open. Then there is a continuous function with compact support $f \in C_c(X)$ such that
\[ (2.1) \qquad \chi_K(x) \leq f(x) \leq \chi_V(x), \quad x \in X. \]
In particular $f = 1$ on $K$ and $f = 0$ outside $V$.
The conclusion of Urysohn's Lemma can be viewed as a strong form of the Hausdorff property. It says that if $K$ is a compact set and $F$ is a closed set disjoint from $K$, then $K$ and $F$ can be separated by a continuous function that is $0$ on $F$ and $1$ on $K$ (apply the lemma with $V = F^c$). In particular, if singletons are closed in $X$, then given distinct points $x$ and $y$ in $X$, we can take $K = \{x\}$, $F = \{y\}$, $G = \{ f > \tfrac{1}{2} \}$ and $H = \{ f < \tfrac{1}{2} \}$ to obtain the Hausdorff property $x \in G$, $y \in H$ and $G \cap H = \emptyset$.
Proof of Urysohn's Lemma: We give the proof in three steps.
Step 1: We first show that we can squeeze an open set $U$ with compact closure $\overline{U}$ between $K$ and $V$ as follows:
\[ (2.2) \qquad K \subset U \subset \overline{U} \subset V. \]
Here is how to construct such a set $U$. For each $y \in K$ we use the fact that $X$ is locally compact to choose an open set $O_y$ containing $y$ and such that $\overline{O_y}$ is compact. Since $K$ is compact there is a finite collection $\{O_{y_n}\}_{n=1}^{N}$ of these open sets that cover $K$. Then $O = \bigcup_{n=1}^{N} O_{y_n}$ is open, contains $K$, and $\overline{O} = \bigcup_{n=1}^{N} \overline{O_{y_n}}$ is compact. In the special case that $V = X$ we can take $U = O$. Otherwise, $F = X \setminus V$ is closed and nonempty.
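Now we use the Hausdorff property of $X$ to obtain that for every $x \in K$ and $y \in F$, there are open sets $G_{(x,y)}$ and $H_{(x,y)}$ with
\[ x \in G_{(x,y)}, \ y \in H_{(x,y)} \ \text{ and } \ G_{(x,y)} \cap H_{(x,y)} = \emptyset. \]
Momentarily fix $y \in F$. Since $K$ is compact there is a finite subcollection $\{ G_{(x_n, y)} \}_{n=1}^{N}$ that covers $K$. Then the open sets
\[ G_y = \bigcup_{n=1}^{N} G_{(x_n, y)} \ \text{ and } \ H_y = \bigcap_{n=1}^{N} H_{(x_n, y)} \]
separate $K$ and $y$ in the sense that $G_y$ and $H_y$ are disjoint open sets that contain $K$ and $y$ respectively. Thus $y \notin \overline{G_y}$ and we see that the collection of sets $\{ F \cap \overline{O} \cap \overline{G_y} \}_{y \in F}$ satisfies
\[ \bigcap_{y \in F} \left( F \cap \overline{O} \cap \overline{G_y} \right) = \emptyset. \]
Since the sets $F \cap \overline{O} \cap \overline{G_y}$ are compact ($\overline{O}$ is compact and $F$ and $\overline{G_y}$ are closed), the finite intersection property shows that there is a finite subcollection $\{ F \cap \overline{O} \cap \overline{G_{y_k}} \}_{k=1}^{M}$ satisfying
\[ \bigcap_{k=1}^{M} \left( F \cap \overline{O} \cap \overline{G_{y_k}} \right) = \emptyset. \]
Then the set
\[ U = \bigcap_{k=1}^{M} \left( O \cap G_{y_k} \right) \]
is open, contains $K$, and has compact closure $\overline{U} \subset \bigcap_{k=1}^{M} \left( \overline{O} \cap \overline{G_{y_k}} \right)$, and of course
\[ \overline{U} \subset X \setminus F = V. \]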
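Step 2: We now iterate the squeezing process as follows. First rewrite (2.2) with $U_0$ in place of $U$:
\[ K \subset U_0 \subset \overline{U_0} \subset V. \]
Then apply (2.2) to the pair of sets $K \subset U_0$, where $K$ is compact and $U_0$ is open, to obtain an open set $U_1$ with compact closure satisfying
\[ K \subset U_1 \subset \overline{U_1} \subset U_0 \subset \overline{U_0} \subset V. \]
Next, apply (2.2) to the pair of sets $\overline{U_1} \subset U_0$, where $\overline{U_1}$ is compact and $U_0$ is open, to obtain an open set $U_{1/2}$ with compact closure satisfying
\[ K \subset U_1 \subset \overline{U_1} \subset U_{1/2} \subset \overline{U_{1/2}} \subset U_0 \subset \overline{U_0} \subset V. \]
We continue with
\[ K \subset U_1 \subset \overline{U_1} \subset U_{3/4} \subset \overline{U_{3/4}} \subset U_{1/2} \subset \overline{U_{1/2}} \subset U_{1/4} \subset \overline{U_{1/4}} \subset U_0 \subset \overline{U_0} \subset V, \]
and then
\[ K \subset U_1 \subset \overline{U_1} \subset U_{7/8} \subset \overline{U_{7/8}} \subset U_{3/4} \subset \overline{U_{3/4}} \subset U_{5/8} \subset \overline{U_{5/8}} \subset U_{1/2} \subset \overline{U_{1/2}} \subset U_{3/8} \subset \overline{U_{3/8}} \subset U_{1/4} \subset \overline{U_{1/4}} \subset U_{1/8} \subset \overline{U_{1/8}} \subset U_0 \subset \overline{U_0} \subset V. \]
This process can be continued indefinitely and produces a collection of open sets $\{U_r\}_{r \in T}$, where $T = \left\{ \frac{k}{2^{\ell}} : k, \ell \in \mathbb{N} \text{ with } \ell \geq 1 \text{ and } 0 \leq k \leq 2^{\ell} \right\}$, that satisfies the property
\[ (2.3) \qquad K \subset U_r \subset \overline{U_r} \subset U_s \subset \overline{U_s} \subset V \]
whenever $s, r \in T$ with $s < r$.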
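The bookkeeping of Step 2 can be made concrete by listing, level by level, which dyadic index is inserted between which two neighbours already in the chain. This is only a sketch of the indexing, not of the topology; the function name `dyadic_insertions` and the tuple layout are assumptions made for this illustration.

```python
from fractions import Fraction


def dyadic_insertions(depth):
    """Simulate the index bookkeeping of the squeezing process.

    Returns (chain, steps): `chain` is the sorted list of dyadic indices in
    [0, 1] after `depth` rounds, and each entry (t, s, r) of `steps` records
    that an open set U_t is squeezed between closure(U_r) and U_s, s < t < r.
    """
    chain = [Fraction(0), Fraction(1)]
    steps = []
    for _ in range(depth):
        new_chain = [chain[0]]
        for s, r in zip(chain, chain[1:]):
            t = (s + r) / 2  # midpoint index, a new dyadic rational
            steps.append((t, s, r))
            new_chain += [t, r]
        chain = new_chain
    return chain, steps


chain, steps = dyadic_insertions(3)
```

After three rounds the chain holds the nine indices $0, 1/8, \dots, 1$, matching the longest display above; larger indices correspond to smaller sets.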
Step 3: We can now define our candidate for the function $f : X \to [0, 1]$ in (2.1). Given $x \in X$ we define
\[ f(x) = \sup_{0 \leq r \leq 1} r \, \chi_{U_r}(x), \]
where the index $r$ ranges over $T$. Then we have
\[ \{ x \in X : f(x) > \lambda \} = \bigcup_{r > \lambda} U_r \ \text{ is open for all } 0 \leq \lambda \leq 1. \]
We similarly have that the function $g$ defined by
\[ g(x) = \inf_{0 \leq s \leq 1} s^{\chi_{(\overline{U_s})^c}(x)}, \]
(so the term indexed by $s \in T$ equals $s$ if $x \notin \overline{U_s}$ and $1$ if $x \in \overline{U_s}$) satisfies
\[ \{ x \in X : g(x) < \lambda \} = \bigcup_{s < \lambda} \left( \overline{U_s} \right)^c \ \text{ is open for all } 0 \leq \lambda \leq 1. \]
If we can show that $f = g$, it will then follow that $f$ is continuous since
\[ f^{-1}((a, b)) = \{ f > a \} \cap \{ g < b \} \]
will be open for all open intervals $(a, b)$ with $0 \leq a < b \leq 1$, and this is enough to establish the continuity of $f : X \to [0, 1]$. Now it suffices to show that $f(x) = g(x)$ for all $x \in U_0 \setminus U_1$ since both $f$ and $g$ vanish outside $U_0$, and both are $1$ inside $U_1$. But if $f(x) > g(x)$ and $x \in U_0 \setminus U_1$, then there is $r > s$ such that $x \in U_r$ and $x \in \left( \overline{U_s} \right)^c$, which implies $U_r \not\subset \overline{U_s}$, contradicting (2.3) which says $U_r \subset \overline{U_s}$. On the other hand, if $f(x) < g(x)$ and $x \in U_0 \setminus U_1$, then there are $t, v \in T$ such that $f(x) < t < v < g(x)$ with $x \notin U_t$ and $x \notin \left( \overline{U_v} \right)^c$. Thus $x \in (U_t)^c \cap \overline{U_v}$, which by (2.3) implies $v \leq t$, contradicting our assumption that $t < v$. This completes the proof that $f = g$, and hence the proof of Urysohn's Lemma.
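On the real line the function promised by Urysohn's Lemma can be written down directly, without the dyadic construction. The sketch below assumes $X = \mathbb{R}$, $K = [a, b]$ and $V = (a - \delta, b + \delta)$; the piecewise-linear formula and the name `urysohn_interval` are illustrative choices, not the construction used in the proof above.

```python
def urysohn_interval(a, b, delta):
    """Return a continuous f on the real line with chi_K <= f <= chi_V,
    where K = [a, b] and V = (a - delta, b + delta) (assumed setting).
    """
    if not (a <= b and delta > 0):
        raise ValueError("need a <= b and delta > 0")

    def f(x):
        # Distance from x to the interval [a, b]; zero when x lies inside it.
        dist = max(a - x, 0.0, x - b)
        # Equal to 1 on K, decreasing linearly to 0 at distance delta from K.
        return max(0.0, 1.0 - dist / delta)

    return f


f = urysohn_interval(0.0, 1.0, 0.5)
```

Here `f` equals 1 on $[0, 1]$, vanishes outside $(-0.5, 1.5)$, and is supported in the compact set $[-0.5, 1.5]$.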
Urysohn's Lemma can be thought of as providing a continuous unit function on the compact set $K$ that is subordinate to the open set $V$ covering $K$. A simple algebraic trick permits us to obtain a far more flexible variant, namely a continuous partition of unity on the compact set $K$ that is subordinate to a finite open cover $\{V_n\}_{n=1}^{N}$ of $K$.
Corollary 15. Suppose that $\{V_n\}_{n=1}^{N}$ is a finite collection of open subsets of a locally compact Hausdorff space $X$. If $K$ is a compact subset of $X$ that is covered by $\{V_n\}_{n=1}^{N}$, then there exist continuous compactly supported functions $\{f_n\}_{n=1}^{N} \subset C_c(X)$ satisfying
\[ \chi_K \leq \sum_{n=1}^{N} f_n \leq \chi_{\bigcup_{n=1}^{N} V_n}, \]
\[ 0 \leq f_n \leq \chi_{V_n}, \quad 1 \leq n \leq N. \]
In particular, $\sum_{n=1}^{N} f_n = 1$ on $K$ and $f_n = 0$ outside $V_n$.
Proof: For each $x \in K$ there is $n = n(x)$ such that $x \in V_n$. Since $X$ is locally compact, there is an open set $W_{x, n(x)}$ with compact closure satisfying $x \in W_{x, n(x)} \subset \overline{W_{x, n(x)}} \subset V_{n(x)}$. Then $\{ W_{x, n(x)} \}_{x \in K}$ is an open cover of $K$, and since $K$ is compact, there is a finite subcover $\{G_k\}_{k=1}^{M}$. Now for $1 \leq n \leq N$ let $F_n$ be the union of all $\overline{G_k}$ that are contained in $V_n$. Then $F_n$ is a finite union of compact sets, so is compact itself. Also, we have $K \subset \bigcup_{n=1}^{N} F_n$. Indeed, every $y \in K$ lies in some $G_k$, and $G_k$ equals $W_{x, n(x)}$ for some $x \in K$, and so $y \in F_{n(x)}$ since
\[ y \in G_k = W_{x, n(x)} \subset \overline{W_{x, n(x)}} \subset V_{n(x)}. \]
Now apply Urysohn's Lemma to the pair $F_n \subset V_n$ to obtain $g_n \in C_c(X)$ such that
\[ \chi_{F_n} \leq g_n \leq \chi_{V_n}, \quad 1 \leq n \leq N. \]
Now we use an algebraic trick motivated by the solution to a well known mathematical teaser of P. Halmos.
Mathematical Teaser: A barrel of pickles that is 99% water by
weight is opened at sunrise and left out in the sun all day.
At sunset it is 98% water and weighs 500 lbs. How much did
the barrel weigh at sunrise?
Solution: Consider the complement of the water. Since the percentage of
nonwater in the barrel doubles during the day (it goes from 1% nonwater
to 2% nonwater), the weight of the barrel and contents must have been
cut in half by sunset (the weight of nonwater - the barrel and pickle pulp
- remains constant). Thus the barrel started the day at 1000 lbs.
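The arithmetic of the teaser can be replayed in a few lines by tracking the non-water weight, which is the quantity that stays constant. The function name `sunrise_weight` and its parameters are hypothetical, introduced only for this sketch.

```python
from fractions import Fraction


def sunrise_weight(sunset_weight, water_at_sunrise, water_at_sunset):
    """Recover the sunrise weight from the sunset weight and water fractions.

    The non-water weight (barrel and pickle pulp) does not change during
    the day, so it can be computed at sunset and divided by the non-water
    fraction at sunrise.
    """
    nonwater = (1 - water_at_sunset) * sunset_weight
    return nonwater / (1 - water_at_sunrise)


weight = sunrise_weight(Fraction(500), Fraction(99, 100), Fraction(98, 100))
```

With the numbers from the teaser the non-water part weighs 10 lbs, which is 1% of the sunrise weight.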
To apply this principle of complementation to our partition of unity problem, we observe that each continuous function $1 - g_n$ vanishes on $F_n$, hence the product $\prod_{n=1}^{N} (1 - g_n)$ is continuous and vanishes on $\bigcup_{n=1}^{N} F_n$. Thus $f = 1 - \prod_{n=1}^{N} (1 - g_n)$ is continuous, equals $1$ on $K$, and vanishes outside $\bigcup_{n=1}^{N} V_n$. It remains only to write $f = \sum_{n=1}^{N} f_n$ where each $f_n$ is continuous and satisfies $0 \leq f_n \leq \chi_{V_n}$. But this can be achieved by writing
\[ \prod_{n=1}^{N} (1 - g_n) = -g_N \prod_{n=1}^{N-1} (1 - g_n) + \prod_{n=1}^{N-1} (1 - g_n) \]
\[ = -g_N \prod_{n=1}^{N-1} (1 - g_n) - g_{N-1} \prod_{n=1}^{N-2} (1 - g_n) + \prod_{n=1}^{N-2} (1 - g_n) \]
\[ = -f_N - f_{N-1} - \dots - f_1 + 1, \]
where
\[ f_1 = g_1, \]
\[ f_2 = (1 - g_1) g_2, \]
\[ f_3 = (1 - g_1)(1 - g_2) g_3, \]
\[ \vdots \]
\[ f_N = (1 - g_1)(1 - g_2) \cdots (1 - g_{N-1}) g_N. \]
Of course we could have simply begun by defining $f_n$ as above, and then using induction on $n$ to show that
\[ 1 - \sum_{k=1}^{n} f_k = \prod_{k=1}^{n} (1 - g_k), \quad 1 \leq n \leq N. \]
However, this would have denied us the fun of finding the formulas in the first place. In any event, the case $n = N$ yields
\[ \sum_{k=1}^{N} f_k(x) = 1 - \prod_{k=1}^{N} (1 - g_k(x)) = 1, \quad x \in K, \]
since every $x \in K$ lies in $F_k$ for some $1 \leq k \leq N$, and hence $1 - g_k(x) = 1 - 1 = 0$. Finally $\chi_{F_n} \leq g_n \leq \chi_{V_n}$ and $0 \leq \prod_{k=1}^{n-1} (1 - g_k) \leq 1$ imply
\[ 0 \leq \left( \prod_{k=1}^{n-1} (1 - g_k) \right) \chi_{F_n} \leq \left( \prod_{k=1}^{n-1} (1 - g_k) \right) g_n = f_n \leq \chi_{V_n} \]
for all $1 \leq n \leq N$.
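The telescoping identity behind the partition of unity is purely algebraic, so it can be checked pointwise on sample values of the $g_n$. The sketch below treats the values $g_1(x), \dots, g_N(x)$ at a single point $x$ as a list of numbers in $[0, 1]$; the function name `partition_from_bumps` and the sample values are assumptions made for this illustration.

```python
from fractions import Fraction
from math import prod


def partition_from_bumps(g):
    """Given values g_1, ..., g_N in [0, 1] at one point, return the values
    f_n = (1 - g_1) ... (1 - g_{n-1}) g_n at that point.
    """
    f = []
    running = 1  # product of (1 - g_k) over k < n
    for g_n in g:
        f.append(running * g_n)
        running *= 1 - g_n
    return f


# Sample values at a point where no g_n equals 1 (a point outside every F_n).
g_outside = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 3)]
f_outside = partition_from_bumps(g_outside)

# Sample values at a point of K: some g_n equals 1, so the f_n sum to 1.
g_on_K = [Fraction(3, 10), Fraction(1), Fraction(1, 2), Fraction(0)]
f_on_K = partition_from_bumps(g_on_K)
```

In both cases $1 - \sum_k f_k$ agrees with $\prod_k (1 - g_k)$, and on $K$ the product vanishes because one factor is zero.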
2.2. Representing continuous linear functionals. In preparation for stating the Riesz representation theorem, we introduce some regularity terminology that links measure and topology.
Definition 15. Suppose that $\mu$ is a Borel measure on a topological space $X$. We say $\mu$ is outer regular if
\[ (2.4) \qquad \mu(E) = \inf \{ \mu(V) : E \subset V \text{ open} \} \]
for all Borel sets $E$. We say $\mu$ is inner regular if
\[ (2.5) \qquad \mu(E) = \sup \{ \mu(K) : K \text{ compact} \subset E \} \]
for all Borel sets $E$. We say $\mu$ has limited inner regularity if (2.5) holds for all open sets $E$, and for all Borel sets $E$ with $\mu(E) < \infty$.
Finally we say $\mu$ is regular if $\mu$ is both outer and inner regular; and we say $\mu$ has limited regularity if $\mu$ is outer regular and has limited inner regularity.
Remark 14. The terminology surrounding regularity and Borel measures is not standardized. For example, many authors, including Rudin, say that a measure $\mu$ is a Borel measure if it is defined on a $\sigma$-algebra $\mathcal{A}$ that contains the Borel $\sigma$-algebra $\mathcal{B}$, as opposed to identifying the measure $\mu$ with its measure space $(X, \mathcal{A}, \mu)$ and declaring it to be Borel if $\mathcal{A} = \mathcal{B}$. Rudin goes on to define $\mu$ to be regular if both (2.4) and (2.5) hold for all Borel sets $E \in \mathcal{B}$. Other authors insist that a regular measure satisfy the stronger requirement that (2.4) and (2.5) hold for all $E \in \mathcal{A}$. Of course, if every set $E \in \mathcal{A}$ has the form $B \cup N$ where $B \in \mathcal{B}$ is Borel and $N \in \mathcal{A}$ is null ($\mu(N) = 0$), then the two notions of regular coincide.
We introduced the notion of limited regularity in order to clarify the uniqueness assertion in the Riesz representation theorem, whose statement follows.
Theorem 32 (Riesz Representation Theorem). Suppose that $X$ is a locally compact Hausdorff space, and that $\Lambda : C_c(X) \to \mathbb{C}$ is a positive linear functional on $C_c(X)$. Then there is a unique positive Borel measure $\mu$ on $X$ with limited regularity such that
\[ (2.6) \qquad \Lambda f = \int_X f \, d\mu, \quad f \in C_c(X). \]
Moreover, there is a $\sigma$-algebra $\mathcal{A}$ on $X$ that contains the Borel sets in $X$, and an extension of $\mu$ to a measure on $\mathcal{A}$, which we continue to denote by $\mu$, and which satisfies the following properties:
Local finiteness: $\mu(K) < \infty$ for all compact $K \subset X$,
Outer $\mathcal{A}$-regularity: $\mu(E) = \inf \{ \mu(V) : E \subset V \text{ open} \}$ for all $E \in \mathcal{A}$,
Limited inner $\mathcal{A}$-regularity: $\mu(E) = \sup \{ \mu(K) : K \text{ compact} \subset E \}$ for $E$ open, and for $E \in \mathcal{A}$ with $\mu(E) < \infty$,
Completeness: $A \in \mathcal{A}$ if $A \subset E \in \mathcal{A}$ and $\mu(E) = 0$.
We will see later that inner regularity may fail for a measure $\mu$ arising in the Riesz representation theorem. On the other hand we will also see later that in nice topological spaces $X$, in particular those locally compact Hausdorff spaces in which every open set is a countable union of compact sets, every locally finite Borel measure $\mu$ is regular.
Remark 15. If $\Lambda$ is a positive linear functional on $C_c(X)$, where $X$ is locally compact and Hausdorff, and if $\mu$ is a positive Borel measure satisfying (2.6), then $\mu$ must be locally finite. Indeed, if $K$ is compact, then by Urysohn's Lemma there is $f \in C_c(X)$ with $\chi_K \leq f \leq \chi_X = 1$, and so
\[ \mu(K) = \int_X \chi_K \, d\mu \leq \int_X f \, d\mu = \Lambda f < \infty. \]
Proof (of Theorem 32): We begin with the uniqueness of a positive Borel measure $\mu$ on $X$ with limited regularity that satisfies the representation formula (2.6). Suppose that $\mu_1$ and $\mu_2$ are two such Borel measures. First we observe that because each of $\mu_1$ and $\mu_2$ has limited regularity, and because the collection of sets on which $\mu_1$ and $\mu_2$ coincide is a $\sigma$-algebra, it suffices to prove that $\mu_1(K) = \mu_2(K)$ for all compact sets $K$ in $X$.
Fix $K$ compact and $\varepsilon > 0$. By outer regularity of $\mu_2$ there is an open set $V \supset K$ satisfying
\[ \mu_2(V) \leq \mu_2(K) + \varepsilon. \]
By Urysohn's Lemma there is $f \in C_c(X)$ such that
\[ \chi_K \leq f \leq \chi_V. \]
Altogether we thus have
\[ \mu_1(K) = \int_X \chi_K \, d\mu_1 \leq \int_X f \, d\mu_1 = \Lambda f = \int_X f \, d\mu_2 \leq \int_X \chi_V \, d\mu_2 = \mu_2(V) \leq \mu_2(K) + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we conclude that $\mu_1(K) \leq \mu_2(K)$, and hence also $\mu_2(K) \leq \mu_1(K)$ by symmetry.
In order to establish the existence of a positive Borel measure $\mu$ that satisfies the representation formula (2.6), we must work much harder. However, it turns out to be no harder to obtain the measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ with the additional properties listed in the statement of the theorem. So we now turn to proving the existence of such $\mathcal{A}$ and $\mu$ in eleven steps. Parts of the arguments below are reminiscent of some of those used in the construction of Lebesgue measure above.
We define the support of a complex-valued function $f$ to be the closure of the set of $x$ where $f(x) \neq 0$, and we denote it by $\operatorname{supp} f$; thus
\[ \operatorname{supp} f = \overline{ \{ x \in X : f(x) \neq 0 \} }. \]
Step 1: For every subset $E \in \mathcal{P}(X)$ we define
\[ \Lambda^{*}(E) = \inf_{E \subset V \text{ open}} \left\{ \sup_{0 \leq f \leq \chi_V} \Lambda f \right\}, \]
where the infimum is taken over all open sets $V$ that contain $E$, and the supremum in braces is taken over all nonnegative $f \in C_c(X)$ such that $f$ is subordinate to $V$. We first observe that for $G$ open we have the simpler formula,
\[ \Lambda^{*}(G) = \sup_{0 \leq f \leq \chi_G} \Lambda f, \]
and hence also
\[ (2.7) \qquad \Lambda^{*}(E) = \inf_{E \subset V \text{ open}} \Lambda^{*}(V), \quad E \in \mathcal{P}(X). \]
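Step 2: We claim that $\Lambda^{*} : \mathcal{P}(X) \to [0, \infty]$ is an outer measure, i.e. that $\Lambda^{*}$ is monotone and countably subadditive.
Clearly $\Lambda^{*}$ is monotone since if $E \subset F$, then
\[ \Lambda^{*}(E) = \inf_{E \subset V \text{ open}} \Lambda^{*}(V) \leq \inf_{F \subset V \text{ open}} \Lambda^{*}(V) = \Lambda^{*}(F) \]
follows since every open set $V$ containing $F$ also contains $E$. To see that $\Lambda^{*}$ is countably subadditive, i.e.
\[ (2.8) \qquad \Lambda^{*} \left( \bigcup_{n=1}^{\infty} E_n \right) \leq \sum_{n=1}^{\infty} \Lambda^{*}(E_n), \quad \text{for all } \{E_n\}_{n=1}^{\infty} \subset \mathcal{P}(X), \]
we first show that
\[ (2.9) \qquad \Lambda^{*}(U \cup V) \leq \Lambda^{*}(U) + \Lambda^{*}(V), \]
for all open sets $U$ and $V$. Let $f \in C_c(X)$ satisfy $0 \leq f \leq \chi_{U \cup V}$. Apply the partition of unity Corollary 15 with $K = \operatorname{supp} f$ to obtain $g, h \in C_c(X)$ with
\[ \chi_K \leq g + h \leq \chi_{U \cup V}, \]
\[ 0 \leq g \leq \chi_U \ \text{ and } \ 0 \leq h \leq \chi_V. \]
Then we have
\[ \Lambda f = \Lambda [f (g + h)] = \Lambda(fg) + \Lambda(fh) \leq \Lambda^{*}(U) + \Lambda^{*}(V) \]
since $0 \leq fg \leq \chi_U$ and $0 \leq fh \leq \chi_V$. Since this holds for all $0 \leq f \leq \chi_{U \cup V}$, we can take the supremum over such $f$ to get (2.9). Induction then yields the more general statement,
\[ (2.10) \qquad \Lambda^{*} \left( \bigcup_{n=1}^{N} V_n \right) \leq \sum_{n=1}^{N} \Lambda^{*}(V_n), \quad \text{for all } V_n \text{ open}. \]
We may suppose that $\Lambda^{*}(E_n) < \infty$ in (2.8), and then given $\varepsilon > 0$, we can find open sets $V_n$ containing $E_n$ such that $\Lambda^{*}(V_n) \leq \Lambda^{*}(E_n) + \frac{\varepsilon}{2^n}$ for each $n \geq 1$. Set $V = \bigcup_{n=1}^{\infty} V_n$ and choose $f \in C_c(X)$ with $0 \leq f \leq \chi_V$. Since $\operatorname{supp} f$ is compact there is $N < \infty$ such that $0 \leq f \leq \chi_{\bigcup_{n=1}^{N} V_n}$. Altogether we have
\[ \Lambda f \leq \Lambda^{*} \left( \bigcup_{n=1}^{N} V_n \right) \leq \sum_{n=1}^{N} \Lambda^{*}(V_n) \leq \sum_{n=1}^{N} \left( \Lambda^{*}(E_n) + \frac{\varepsilon}{2^n} \right) < \varepsilon + \sum_{n=1}^{N} \Lambda^{*}(E_n), \]
and taking the supremum over such $f$, we obtain
\[ \Lambda^{*}(V) \leq \varepsilon + \sum_{n=1}^{\infty} \Lambda^{*}(E_n). \]
Since $\Lambda^{*}$ is monotone and $\varepsilon > 0$ is arbitrary, we thus have
\[ \Lambda^{*} \left( \bigcup_{n=1}^{\infty} E_n \right) \leq \Lambda^{*} \left( \bigcup_{n=1}^{\infty} V_n \right) = \Lambda^{*}(V) \leq \sum_{n=1}^{\infty} \Lambda^{*}(E_n). \]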
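Step 3: We now define $\mathcal{A}$ and $\mu$. Let
\[ \mathcal{A}_{\mathrm{finite}} = \left\{ E \in \mathcal{P}(X) : \Lambda^{*}(E) < \infty \text{ and } \Lambda^{*}(E) = \sup_{\text{compact } K \subset E} \Lambda^{*}(K) \right\}, \]
and
\[ \mathcal{A} = \{ E \in \mathcal{P}(X) : E \cap K \in \mathcal{A}_{\mathrm{finite}} \text{ for every compact set } K \}. \]
Then we define $\mu : \mathcal{A} \to [0, \infty]$ by
\[ \mu(E) = \Lambda^{*}(E), \quad E \in \mathcal{A}. \]
We will eventually see that $\mathcal{A}_{\mathrm{finite}}$ consists exactly of those sets $E \in \mathcal{A}$ such that $\mu(E) < \infty$. In the steps below we establish that $\mathcal{A}$ is a $\sigma$-algebra on $X$ containing the Borel sets, and that $\mu$ is a positive measure on $\mathcal{A}$.
It will be convenient to use the shorthand notation $K \prec f$ (read "$K$ is subordinate to $f$") to mean $K$ is compact, $f \in C_c(X)$ and $\chi_K \leq f \leq 1$; and to use $f \prec V$ (read "$f$ is subordinate to $V$") to mean $V$ is open, $f \in C_c(X)$ and $0 \leq f \leq \chi_V$.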
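Step 4: If $K$ is compact, then $K \in \mathcal{A}$ and
\[ (2.11) \qquad \mu(K) = \inf_{K \prec f} \Lambda f. \]
That $K \in \mathcal{A}$ is trivial, and to see (2.11) suppose that $K \prec f$ and $0 < \alpha < 1$. Then with $V_{\alpha} = \{ f > \alpha \}$ we have
\[ \mu(K) \leq \Lambda^{*}(V_{\alpha}) = \sup_{g \prec V_{\alpha}} \Lambda g = \frac{1}{\alpha} \sup_{g \prec V_{\alpha}} \Lambda(\alpha g) \leq \frac{1}{\alpha} \Lambda(f), \]
since $\alpha g \leq f$ whenever $g \prec V_{\alpha}$. Letting $\alpha \to 1$ we obtain $\mu(K) \leq \Lambda(f)$.
If now $\varepsilon > 0$ there exists an open set $V$ containing $K$ such that $\Lambda^{*}(V) < \mu(K) + \varepsilon$. Urysohn's Lemma yields $f$ so that $K \prec f \prec V$, and so altogether we have
\[ \mu(K) \leq \Lambda(f) \leq \Lambda^{*}(V) < \mu(K) + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we obtain (2.11).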
Step 5: If $G$ is open, then
\[ (2.12) \qquad \Lambda^{*}(G) = \sup_{\text{compact } K \subset G} \mu(K). \]
In particular, if $G$ is open and $\Lambda^{*}(G) < \infty$, then $G \in \mathcal{A}_{\mathrm{finite}}$. To see (2.12), let $\alpha < \Lambda^{*}(G)$ so that there is $f \prec G$ with $\alpha < \Lambda f \leq \Lambda^{*}(G)$. Now $K = \operatorname{supp} f$ is compact and if $W$ is an open set that contains $K$, then $f \prec W$ and hence
\[ \Lambda f \leq \Lambda^{*}(W). \]
Since this holds for all such $W$ we obtain
\[ \Lambda f \leq \inf_{K \subset W \text{ open}} \Lambda^{*}(W) = \mu(K). \]
Altogether we have
\[ \alpha < \Lambda f \leq \mu(K) \leq \Lambda^{*}(G), \]
and since $\alpha$ was any number less than $\Lambda^{*}(G)$ and $K$ is compact, the proof of (2.12) is complete.
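Step 6: Suppose that $\{E_i\}_{i=1}^{\infty}$ is a sequence of pairwise disjoint sets in $\mathcal{A}_{\mathrm{finite}}$. Then
\[ (2.13) \qquad \Lambda^{*} \left( \bigcup_{i=1}^{\infty} E_i \right) = \sum_{i=1}^{\infty} \Lambda^{*}(E_i). \]
If in addition $\Lambda^{*} \left( \bigcup_{i=1}^{\infty} E_i \right) < \infty$, then $\bigcup_{i=1}^{\infty} E_i \in \mathcal{A}_{\mathrm{finite}}$. We begin by proving (2.13) for a finite union of compact sets,
\[ (2.14) \qquad \mu(K \mathbin{\dot{\cup}} L) = \mu(K) + \mu(L), \quad K, L \text{ compact and disjoint}. \]
Given $\varepsilon > 0$ there is by Urysohn's Lemma a function $f \in C_c(X)$ with $0 \leq f \leq 1$ separating $K$ and $L$ in the sense that $f = 1$ on $K$ and $f = 0$ on $L$. From (2.11) in Step 4 we obtain $g$ such that $K \cup L \prec g$ and
\[ \Lambda g < \mu(K \cup L) + \varepsilon. \]
From (2.11) applied to $K \prec fg$ and $L \prec (1 - f) g$ the linearity of $\Lambda$ gives
\[ \mu(K) + \mu(L) \leq \Lambda(fg) + \Lambda[(1 - f) g] = \Lambda g < \mu(K \cup L) + \varepsilon. \]
Now we use that $\varepsilon > 0$ is arbitrary, together with the subadditivity of $\Lambda^{*}$ in (2.8) of Step 2, to obtain (2.14).
Now we turn to proving (2.13) in full generality, writing $E = \bigcup_{i=1}^{\infty} E_i$. Recall that $E_i \in \mathcal{A}_{\mathrm{finite}}$. Thus given $\varepsilon > 0$ there are compact sets $H_i \subset E_i$ satisfying
\[ \Lambda^{*}(E_i) < \mu(H_i) + \frac{\varepsilon}{2^i}, \quad 1 \leq i < \infty. \]
Now set $K_n = \bigcup_{i=1}^{n} H_i$, a pairwise disjoint union, and use (2.14) repeatedly to obtain
\[ \sum_{i=1}^{n} \Lambda^{*}(E_i) < \sum_{i=1}^{n} \mu(H_i) + \varepsilon = \mu(K_n) + \varepsilon \leq \Lambda^{*}(E) + \varepsilon. \]
Letting first $\varepsilon \to 0$ and then $n \to \infty$ we obtain $\sum_{i=1}^{\infty} \Lambda^{*}(E_i) \leq \Lambda^{*}(E)$, which when combined with the countable subadditivity in (2.8), yields (2.13).
Finally, the inequality
\[ \mu(K_n) > \sum_{i=1}^{n} \Lambda^{*}(E_i) - \varepsilon = \Lambda^{*} \left( \bigcup_{i=1}^{\infty} E_i \right) - \sum_{i=n+1}^{\infty} \Lambda^{*}(E_i) - \varepsilon, \]
together with the compactness of $K_n$, shows that
\[ \bigcup_{i=1}^{\infty} E_i \in \mathcal{A}_{\mathrm{finite}}. \]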
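Step 7: Suppose that $E \in \mathcal{A}_{\mathrm{finite}}$ and $\varepsilon > 0$. Then there is a compact set $K$ and an open set $V$ such that
\[ K \subset E \subset V \ \text{ and } \ \Lambda^{*}(V \setminus K) < \varepsilon. \]
Indeed, the definition of $\Lambda^{*}$ in (2.7) shows that there is an open set $V$ such that $E \subset V$ and $\Lambda^{*}(V) < \Lambda^{*}(E) + \frac{\varepsilon}{2}$; while the definition of $\mathcal{A}_{\mathrm{finite}}$ shows that there is a compact set $K$ such that $K \subset E$ and $\mu(K) > \Lambda^{*}(E) - \frac{\varepsilon}{2}$. Now $V \setminus K$ is open and $\Lambda^{*}(V \setminus K) \leq \Lambda^{*}(V) < \infty$, so that (2.12) in Step 5 implies $V \setminus K \in \mathcal{A}_{\mathrm{finite}}$. Then (2.13) applied to $V = K \mathbin{\dot{\cup}} (V \setminus K)$ gives
\[ \Lambda^{*}(V \setminus K) = \Lambda^{*}(V) - \mu(K) < \Lambda^{*}(E) + \frac{\varepsilon}{2} - \left( \Lambda^{*}(E) - \frac{\varepsilon}{2} \right) = \varepsilon. \]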
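Step 8: If $A, B \in \mathcal{A}_{\mathrm{finite}}$, then
\[ A \setminus B, \ A \cap B, \ A \cup B \in \mathcal{A}_{\mathrm{finite}}. \]
Given $\varepsilon > 0$, the previous step shows that there are compact sets $K$ and $L$ and open sets $U$ and $V$ such that
\[ K \subset A \subset U \ \text{ and } \ L \subset B \subset V, \]
\[ \Lambda^{*}(U \setminus K), \ \Lambda^{*}(V \setminus L) < \varepsilon. \]
Then monotonicity and subadditivity (2.8) give
\[ \Lambda^{*}(A \setminus B) \leq \Lambda^{*}(U \setminus L) \]
\[ \leq \Lambda^{*}(U \setminus K) + \Lambda^{*}(K \setminus V) + \Lambda^{*}(V \setminus L) \]
\[ < \varepsilon + \Lambda^{*}(K \setminus V) + \varepsilon. \]
Now $J = K \setminus V$ is a compact subset of $A \setminus B$, so we conclude that
\[ \Lambda^{*}(A \setminus B) = \sup_{\text{compact } J \subset A \setminus B} \Lambda^{*}(J), \]
which implies that $A \setminus B \in \mathcal{A}_{\mathrm{finite}}$ by the definition in Step 3.
Now
\[ A \setminus (A \setminus B) = A \cap (A \cap B^c)^c = A \cap (A^c \cup B) = A \cap B \]
shows that $A \cap B \in \mathcal{A}_{\mathrm{finite}}$. Finally, (2.13) applied to $A \cup B = (A \setminus B) \mathbin{\dot{\cup}} B$ yields $A \cup B \in \mathcal{A}_{\mathrm{finite}}$.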
Remark 16. In the special case that $X$ is compact, we have at this point in the proof established that $\mathcal{A} = \mathcal{A}_{\mathrm{finite}}$ is a $\sigma$-algebra on $X$ containing the Borel sets, and that $\Lambda^{*}$ is a measure when restricted to $\mathcal{A}$. Indeed, Step 5 shows $\mathcal{A}_{\mathrm{finite}}$ contains all open sets, Step 8 shows that $\mathcal{A}_{\mathrm{finite}}$ is closed under complementation, and Step 6 then shows that $\mathcal{A}_{\mathrm{finite}}$ is closed under countable unions, after expressing a countable union as a countable union of pairwise disjoint sets in $\mathcal{A}_{\mathrm{finite}}$. It now follows that $\mathcal{A} = \mathcal{A}_{\mathrm{finite}}$. The countable additivity of $\mu = \Lambda^{*}$ on $\mathcal{A}$ follows from Step 6.
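Step 9: $\mathcal{A}$ is a $\sigma$-algebra on $X$ containing the Borel sets. First we show that $\mathcal{A}$ is closed under complementation. If $A \in \mathcal{A}$ and $K$ is compact, then both $K$ and $A \cap K$ are in $\mathcal{A}_{\mathrm{finite}}$ and so by Step 8 we have
\[ A^c \cap K = K \setminus (A \cap K) \in \mathcal{A}_{\mathrm{finite}}, \]
and this shows that $A^c \in \mathcal{A}$.
Now suppose that $A = \bigcup_{i=1}^{\infty} A_i$ where each $A_i \in \mathcal{A}$, and let $K$ be compact. We now write $A \cap K$ as a pairwise disjoint union by setting
\[ B_1 = A_1 \cap K, \]
\[ B_2 = (A_2 \cap K) \setminus B_1, \]
\[ \vdots \]
\[ B_n = (A_n \cap K) \setminus \left( \bigcup_{i=1}^{n-1} B_i \right), \]
\[ \vdots \]
By Step 8 and induction on $n$ we have $B_n \in \mathcal{A}_{\mathrm{finite}}$ for all $n \geq 1$. Then Step 6 yields
\[ A \cap K = \dot{\bigcup_{n \geq 1}} B_n \in \mathcal{A}_{\mathrm{finite}}. \]
Since this holds for all compact $K$ we have $A \in \mathcal{A}$.
Finally, if $F$ is a closed set, then $F \cap K$ is compact, hence $F \cap K \in \mathcal{A}_{\mathrm{finite}}$. This proves that $F \in \mathcal{A}$ and it follows that $\mathcal{A}$ contains all the Borel sets.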
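Step 10: $\mathcal{A}_{\mathrm{finite}} = \mathcal{A} \cap \{ E \in \mathcal{P}(X) : \Lambda^{*}(E) < \infty \}$ and $\mu$ is a measure on $\mathcal{A}$. If $E \in \mathcal{A}_{\mathrm{finite}}$ then $E \cap K \in \mathcal{A}_{\mathrm{finite}}$ for all compact $K$ by Steps 4 and 8. This shows that
\[ \mathcal{A}_{\mathrm{finite}} \subset \mathcal{A} \cap \{ E : \Lambda^{*}(E) < \infty \}. \]
We can now write $\mu(E) = \Lambda^{*}(E)$ for $E \in \mathcal{A}_{\mathrm{finite}}$, and in particular by Step 5, for $E$ open and $\Lambda^{*}(E) < \infty$.
Conversely, suppose that $E \in \mathcal{A}$ and $\mu(E) < \infty$. Given $\varepsilon > 0$ there is an open set $V \supset E$ with $\mu(V) = \Lambda^{*}(V) < \infty$, hence $V \in \mathcal{A}_{\mathrm{finite}}$. Now by Steps 5 and 7 there is a compact set $K \subset V$ with $\mu(V \setminus K) < \varepsilon$. Since $E \cap K \in \mathcal{A}_{\mathrm{finite}}$ by definition of $\mathcal{A}$, there is by definition of $\mathcal{A}_{\mathrm{finite}}$ a compact set $H \subset E \cap K$ with
\[ \mu(E \cap K) < \mu(H) + \varepsilon. \]
By subadditivity we thus have
\[ \mu(E) \leq \mu(E \cap K) + \mu(V \setminus K) < \mu(H) + 2\varepsilon, \]
which implies that $E \in \mathcal{A}_{\mathrm{finite}}$.
Finally, Step 6 shows that $\mu$ is countably additive on $\mathcal{A}$, i.e.
\[ \mu \left( \dot{\bigcup_{1 \leq i < \infty}} E_i \right) = \sum_{i=1}^{\infty} \mu(E_i), \]
since if one of the sets $E_i$ has infinite measure, there is nothing to prove, and otherwise $E_i \in \mathcal{A}_{\mathrm{finite}}$ for all $i$.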
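Step 11: For every $f \in C_c(X)$ we have
\[ \Lambda f = \int_X f \, d\mu. \]
Since $\Lambda(-f) = -\Lambda(f)$, it suffices to prove the inequality
\[ (2.15) \qquad \Lambda f \leq \int_X f \, d\mu, \quad \text{for all real } f \in C_c(X). \]
So let $f \in C_c(X)$ be real with support $K = \operatorname{supp} f$, and let the interval $[a, b]$ contain the compact range of $f$. Given $\varepsilon > 0$ choose points $\{y_i\}_{i=0}^{n} \subset \mathbb{R}$ such that
\[ y_0 < a < y_1 < y_2 < \dots < y_n = b, \]
\[ \Delta y_i = y_i - y_{i-1} < \varepsilon, \quad 1 \leq i \leq n. \]
Define sets $E_i$ by
\[ E_i = f^{-1}((y_{i-1}, y_i]) \cap K, \quad 1 \leq i \leq n. \]
Now $f$ is continuous, hence Borel measurable, and thus the sets $\{E_i\}_{i=1}^{n}$ are pairwise disjoint Borel sets with union $K$. By Step 9, the definitions of $\mathcal{A}_{\mathrm{finite}}$ and $\mathcal{A}$, and the continuity of $f$, there are open sets $V_i \supset E_i$ with
\[ \mu(V_i) < \mu(E_i) + \frac{\varepsilon}{n}, \quad 1 \leq i \leq n, \]
\[ f(x) < y_i + \varepsilon, \quad \text{for } x \in V_i, \ 1 \leq i \leq n. \]
The partition of unity Corollary 15 yields functions $h_i \prec V_i$ satisfying
\[ \sum_{i=1}^{n} h_i(x) = 1, \quad x \in K. \]
Thus we have $f = \sum_{i=1}^{n} h_i f$ and (2.11) in Step 4 shows that
\[ \mu(K) \leq \Lambda \left( \sum_{i=1}^{n} h_i \right) = \sum_{i=1}^{n} \Lambda h_i. \]
We also have
\[ \Lambda h_i \leq \mu(V_i) < \mu(E_i) + \frac{\varepsilon}{n}, \quad 1 \leq i \leq n. \]
Finally, we use that
\[ h_i f \leq (y_i + \varepsilon) h_i, \]
\[ y_i - \varepsilon < f(x) \quad \text{for } x \in E_i, \]
to obtain
\[ \Lambda f = \sum_{i=1}^{n} \Lambda(h_i f) \leq \sum_{i=1}^{n} (y_i + \varepsilon) \Lambda h_i \]
\[ = \left\{ \sum_{i=1}^{n} (|a| + y_i + \varepsilon) \Lambda h_i \right\} - \left\{ |a| \sum_{i=1}^{n} \Lambda h_i \right\} \]
\[ \leq \left\{ \sum_{i=1}^{n} (|a| + y_i + \varepsilon) \left[ \mu(E_i) + \frac{\varepsilon}{n} \right] \right\} - [ |a| \mu(K) ] \]
\[ = \left\{ \sum_{i=1}^{n} (y_i - \varepsilon) \mu(E_i) \right\} + [ 2 \varepsilon \mu(K) ] + \left\{ \frac{\varepsilon}{n} \sum_{i=1}^{n} (|a| + y_i + \varepsilon) \right\} \]
\[ \leq \int_X f \, d\mu + \varepsilon \, [ 2 \mu(K) + 2|a| + |b| + \varepsilon ]. \]
Since $\varepsilon > 0$ is arbitrary, we obtain (2.15), and this completes the proof of the Riesz representation theorem 32.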
3. Regularity of Borel measures
Recall the fourth example in Example 4, where the set $X$ was a well-ordered set with last element $\omega_1$, the first uncountable ordinal. A positive measure $\lambda : \mathcal{A} \to \{0, 1\}$ was defined on $X$ there by
\[ \lambda(E) = \begin{cases} 1 & \text{if } E \cup \{\omega_1\} \text{ contains an uncountable compact set} \\ 0 & \text{if } E^c \cup \{\omega_1\} \text{ contains an uncountable compact set} \end{cases}, \]
for $E \in \mathcal{A}$, and where $\mathcal{A}$ was the $\sigma$-algebra given by
\[ \mathcal{A} = \left\{ E \in \mathcal{P}(X) : \text{either } E \cup \{\omega_1\} \text{ or } E^c \cup \{\omega_1\} \text{ contains an uncountable compact set} \right\}. \]
Then $V = P_{\omega_1} = X \setminus \{\omega_1\}$ is an uncountable open set with $\lambda(V) = 1$. On the other hand, if $K$ is a compact subset of $V$, then $K$ is closed and hence $\alpha = \operatorname{lub}(K) \in K$ and it follows that $K \subset P_{\alpha + 1}$. Thus $K^c \supset S_{\alpha} = [\alpha + 1, \omega_1]$ where $[\alpha + 1, \omega_1]$ is an uncountable closed, hence compact, subset of $X$. It follows from the definition of $\lambda$ that $\lambda(K) = 0$. In particular, the measure $\lambda$ does not have limited regularity:
\[ \lambda(V) = 1 \neq 0 = \sup_{\text{compact } K \subset V} \lambda(K). \]
We thus see that the measure $\lambda$ cannot arise as one of the measures $\mu$ in the conclusion of the Riesz representation theorem 32. However, $X$ is a compact Hausdorff space, so $C_c(X) = C(X)$, and $\Lambda_{\lambda} : C(X) \to \mathbb{C}$ is a positive linear functional on $C(X)$, where
\[ \Lambda_{\lambda} f = \int_X f \, d\lambda. \]
By the Riesz representation theorem 32, there is a positive Borel measure $\mu$ on $X$ with limited regularity such that $\Lambda_{\lambda} = \Lambda_{\mu}$. Thus we see that $\lambda \neq \mu$ and the question arises as to what the measure $\mu$ with limited regularity looks like. We claim that
\[ \mu = \delta_{\omega_1}, \]
where $\delta_{\omega_1}$ is the Dirac unit mass at the point $\omega_1$ in $X$ (see the third example in Example 4). To see this we must show that
\[ \int_X f \, d\lambda = f(\omega_1) = \int_X f \, d\delta_{\omega_1}, \quad f \in C(X). \]
The second equality here is trivial so we turn to proving the first equality. Given $\varepsilon > 0$, let $G = f^{-1}((f(\omega_1) - \varepsilon, f(\omega_1) + \varepsilon))$ be the set of $\alpha \in X$ such that
\[ |f(\alpha) - f(\omega_1)| < \varepsilon. \]
Then $G$ is open and so contains a successor set $S_{\beta}$ for some $\beta < \omega_1$. Since $S_{\beta} = [\beta + 1, \omega_1]$ is compact and uncountable, we have $\lambda(G) = 1$ and $\lambda(G^c) = 0$. Thus
\[ \int_X f \, d\lambda = \int_G f \, d\lambda + \int_{G^c} f \, d\lambda = \int_G f \, d\lambda \]
where
\[ f(\omega_1) - \varepsilon = \int_G (f(\omega_1) - \varepsilon) \, d\lambda \leq \int_G f \, d\lambda \leq \int_G (f(\omega_1) + \varepsilon) \, d\lambda = f(\omega_1) + \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we obtain $\int_X f \, d\lambda = f(\omega_1)$. Note that $\delta_{\omega_1}(V) = 0$.
It turns out that the main topological obstacle to regularity in this example is the existence of an open set that is not a countable union of compact sets (since every compact set $K$ in the open uncountable set $V = P_{\omega_1}$ is at most countable). Indeed, our main theorem in this section is that if every open subset of $X$ is a countable union of compact sets, then every locally finite Borel measure on $X$ is regular! In order to prove this we will first give a mild topological condition on $X$ that forces the measures arising in the Riesz representation theorem 32 to be regular. Note that when $X$ is compact, regularity follows immediately from limited regularity since $\mu(X) < \infty$. The mild topological condition we impose is that $X$ be $\sigma$-compact.
Notation 1. Let $X$ be a topological space. We say that $X$ is $\sigma$-compact if $X = \bigcup_{n=1}^{\infty} K_n$ is a countable union of compact sets $K_n$. More generally, we say that a subset $E$ is $\sigma$-compact if $E$ is a countable union of compact sets. We say that a set $A$ is an $F_{\sigma}$-set if $A$ is a countable union of closed sets. We say that a set $B$ is a $G_{\delta}$-set if $B$ is a countable intersection of open sets.
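Theorem 33. Suppose that $X$ is a locally compact, $\sigma$-compact Hausdorff space. If $\mathcal{A}$ and $\mu$ are as in the conclusion of Theorem 32, then we have the following properties:
(1) Suppose $E \in \mathcal{A}$. Given $\varepsilon > 0$ there exist sets $F$ closed and $G$ open such that
\[ (3.1) \qquad F \subset E \subset G \ \text{ and } \ \mu(G \setminus F) < \varepsilon. \]
(2) $\mu$ is a regular measure, i.e. (2.4) and (2.5) hold for all Borel sets $E$, in fact for all $E \in \mathcal{A}$.
(3) If $E \in \mathcal{A}$, there is an $F_{\sigma}$-set $A$ and a $G_{\delta}$-set $B$ such that
\[ A \subset E \subset B \ \text{ and } \ \mu(B \setminus A) = 0. \]
In particular, every $E \in \mathcal{A}$ is the union of an $F_{\sigma}$-set and a null set.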
Proof : Let A = '
o
n=1
1
n
where 1
n
is compact for all : _ 1.
(1) Suppose 1 / and - 0. We rst claim that there is an open set
G 1 with j(G 1) <
:
2
. Indeed, j(1
n
1) _ j(1
n
) < and so the outer
/-regularity conclusion in Theorem 32 gives us an open set G
n
1
n
1 with
j(G
n
(1
n
1)) = j(G
n
) j(1
n
1) <
-
2
n+1
.
Thus with G = '
o
n=1
G
n
we have
j(G 1) _
o

n=1
j(G
n
(1
n
1)) <
-
2
.
Applying the same reasoning to 1
c
yields an open set l with j(l 1
c
) <
:
2
. Then
1 = l
c
is closed and the sets 1 and G satisfy (3.1) since
j(G 1) = j(G 1) j(1 1)
= j(G 1) j(l 1
c
) <
-
2

-
2
= -.
(2) To see that j satises (2.5) for all 1 /, we note that every closed set 1
is o-compact; 1 = '
o
n=1
(1
n
1). Thus (1) implies (2.5).
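(3) Finally, for each $n \ge 1$ choose $F_n \subset E \subset G_n$ where $F_n$ is closed and $G_n$ is open and $\mu(G_n \setminus F_n) < \frac{1}{n}$. Then $A = \bigcup_{n=1}^{\infty} F_n$ is an $F_\sigma$-set and $B = \bigcap_{n=1}^{\infty} G_n$ is a $G_\delta$-set with $A \subset E \subset B$ and
\[ \mu(B \setminus A) \le \mu(G_n \setminus F_n) < \frac{1}{n}, \qquad n \ge 1. \]
Thus $\mu(B \setminus A) = 0$.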
Now we can prove that on nice topological spaces, every reasonable Borel measure is regular.
Theorem 34. Let $X$ be a locally compact Hausdorff space satisfying
every open set is $\sigma$-compact.
Suppose that $\lambda$ is a positive Borel measure on $X$ that is locally finite, i.e.
$\lambda(K) < \infty$ for all compact sets $K$.
Then $\lambda$ is regular.
Proof: The map $\Lambda_\lambda : C_c(X) \to \mathbb{C}$, given by $\Lambda_\lambda f = \int_X f \, d\lambda$ for $f \in C_c(X)$, is a positive linear functional on $C_c(X)$. Thus Theorems 32 and 33 yield a positive regular Borel measure $\mu$ satisfying (1) of Theorem 33 such that
\[ \Lambda_\lambda f = \int_X f \, d\mu, \qquad f \in C_c(X). \]
It remains to show that $\lambda = \mu$ under the hypotheses of our theorem. We will use Urysohn's Lemma and the Monotone Convergence Theorem for this.
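Let $V$ be open in $X$. By hypothesis, $V$ is $\sigma$-compact, so $V = \bigcup_{n=1}^{\infty} K_n$ where each $K_n$ is compact. Urysohn's Lemma yields a function $f_n \in C_c(X)$ such that
\[ \chi_{K_n} \le f_n \le \chi_V, \qquad n \ge 1. \]
Let $g_n = \max_{1 \le m \le n} f_m$. Then $g_n \in C_c(X)$ and $g_n \nearrow \chi_V$ monotonically as $n \to \infty$. Thus the Monotone Convergence Theorem can be applied twice to obtain
\[ \lambda(V) = \int_X \chi_V \, d\lambda = \lim_{n\to\infty} \int_X g_n \, d\lambda = \lim_{n\to\infty} \Lambda_\lambda g_n = \lim_{n\to\infty} \int_X g_n \, d\mu = \int_X \chi_V \, d\mu = \mu(V). \tag{3.2} \]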
Now fix a Borel set $E$. Let $\varepsilon > 0$. Since $\mu$ satisfies (1) of Theorem 33, there is a closed set $F$ and an open set $G$ such that $F \subset E \subset G$ and $\mu(G \setminus F) < \varepsilon$. In particular,
\[ \mu(G) = \mu(E) + \mu(G \setminus E) \le \mu(E) + \varepsilon. \tag{3.3} \]
Now $V = G \setminus F$ is open and so (3.2) gives the same sort of inequality as (3.3), but with $\lambda$ in place of $\mu$:
\[ \lambda(G) = \lambda(F) + \lambda(G \setminus F) = \lambda(F) + \mu(G \setminus F) \le \lambda(E) + \varepsilon. \tag{3.4} \]
Remark 17. Outer regularity of $\mu$ is all that is needed to obtain $\mu(G) \le \mu(E) + \varepsilon$ in (3.3). However, we have no such regularity information regarding $\lambda$, and in order to obtain (3.4), it is necessary to know that $\lambda$ coincides with $\mu$ on an open set $G \setminus F$ of small $\mu$-measure where $F$ and $G$ sandwich $E$. This is why we need assertion (1) of Theorem 33 for $\mu$, which is stronger than regularity of $\mu$.
Using (3.2) for the open set $G$, it follows that both
\[ \lambda(E) \le \lambda(G) = \mu(G) \le \mu(E) + \varepsilon, \]
\[ \mu(E) \le \mu(G) = \lambda(G) \le \lambda(E) + \varepsilon, \]
and since $\varepsilon > 0$ is arbitrary, we conclude that $\lambda(E) = \mu(E)$.
4. Lebesgue measure on Euclidean spaces
We can use the Riesz representation theorem 32 to construct Lebesgue measure on the real line $\mathbb{R}$, and more generally on Euclidean space $\mathbb{R}^n$. The idea is to define a positive linear functional $\Lambda$ on $C_c(\mathbb{R}^n)$ using the Riemann integral $\int_{\mathbb{R}^n} f(x)\,dx$:
\[ \Lambda f = \int_{\mathbb{R}^n} f(x)\,dx, \qquad f \in C_c(\mathbb{R}^n). \]
It turns out that for this purpose we don't need the full theory of Riemann integration, but just enough to define the integral of a function $f \in C_c(\mathbb{R}^n)$. The following is sufficient.
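4.1. Limited Riemann integration. Let $\mathcal{D}_k = \{[j2^{-k}, (j+1)2^{-k})\}_{j \in \mathbb{Z}}$ be the collection of right open, left closed intervals of length $2^{-k}$ and left endpoint in $2^{-k}\mathbb{Z}$. In $\mathbb{R}^n$ we consider the corresponding cubes
\[ \mathcal{D}^n_k = \{Q^k_\alpha\}_{\alpha \in \mathbb{Z}^n} = \Big\{ \prod_{i=1}^{n} [\alpha_i 2^{-k}, (\alpha_i + 1) 2^{-k}) \Big\}_{\alpha = (\alpha_1,\dots,\alpha_n) \in \mathbb{Z}^n}, \]
obtained by forming products $Q^k_{(\alpha_1,\dots,\alpha_n)} = Q^k_{\alpha_1} \times Q^k_{\alpha_2} \times \dots \times Q^k_{\alpha_n}$ of intervals $Q^k_{\alpha_i}$ in $\mathcal{D}_k$. A cube $Q \in \mathcal{D}^n_k$ has volume $|Q| = (2^{-k})^n = 2^{-kn}$ and so for $f \in C_c(\mathbb{R}^n)$, we define upper and lower sums at level $k$ by
\[ U(f;k) = \sum_{Q \in \mathcal{D}^n_k} 2^{-kn} \sup_{x \in Q} f(x), \]
\[ L(f;k) = \sum_{Q \in \mathcal{D}^n_k} 2^{-kn} \inf_{x \in Q} f(x). \]
Clearly we have for $\ell > k$,
\[ U(f;k) \ge U(f;\ell) \ge L(f;\ell) \ge L(f;k). \tag{4.1} \]
Now $K = \operatorname{supp} f$ is compact, hence contained in a large cube $P$ that is a union of unit sized cubes in $\mathcal{D}^n_0$. Moreover, $f$ is uniformly continuous on $P$, hence on $\mathbb{R}^n$, and it follows that
\[ U(f;k) - L(f;k) = \sum_{Q \in \mathcal{D}^n_k :\, Q \subset P} 2^{-kn} \Big\{ \sup_{x \in Q} f(x) - \inf_{x \in Q} f(x) \Big\} \le |P| \sup_{Q \in \mathcal{D}^n_k} \Big\{ \sup_{x \in Q} f(x) - \inf_{x \in Q} f(x) \Big\}, \tag{4.2} \]
which tends to $0$ as $k \to \infty$. Thus from (4.1) and (4.2) the limits of upper and lower sums exist and coincide. We define the Riemann integral of $f \in C_c(\mathbb{R}^n)$ to be
\[ \int_{\mathbb{R}^n} f(x)\,dx = \lim_{k\to\infty} U(f;k) = \lim_{k\to\infty} L(f;k). \]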
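The dyadic upper and lower sums can be computed directly in a simple case. The following is a minimal sketch in dimension one, assuming the tent function $f(x) = \max(0, 1 - |x|)$ as test function; the names `tent` and `dyadic_sums` are illustrative and not part of the notes. For this choice the integral is $1$ and the sums squeeze it from both sides.

```python
def tent(x):
    """Continuous, compactly supported test function with integral 1."""
    return max(0.0, 1.0 - abs(x))


def dyadic_sums(k):
    """Return (U, L): upper and lower sums of tent at dyadic level k.

    Only the intervals [j 2^-k, (j+1) 2^-k) inside [-1, 1) are visited,
    since tent vanishes on every other dyadic interval.
    """
    h = 2.0 ** (-k)
    upper = 0.0
    lower = 0.0
    for j in range(-(2 ** k), 2 ** k):
        a, b = j * h, (j + 1) * h
        # tent is monotone on each side of 0, so its sup over [a, b) is its
        # value at the point of [a, b] closest to 0, and its inf is the
        # smaller of the two endpoint values (by continuity).
        sup_f = tent(min(max(0.0, a), b))
        inf_f = min(tent(a), tent(b))
        upper += h * sup_f
        lower += h * inf_f
    return upper, lower


if __name__ == "__main__":
    for k in range(6):
        u, l = dyadic_sums(k)
        print(k, u, l, u - l)
```

In this example the gap $U(f;k) - L(f;k)$ equals $2 \cdot 2^{-k}$, which illustrates the role of uniform continuity in (4.2).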
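It follows easily that this integral has the elementary properties
\[ \int_{\mathbb{R}^n} (\alpha f + \beta g)(x)\,dx = \alpha \int_{\mathbb{R}^n} f(x)\,dx + \beta \int_{\mathbb{R}^n} g(x)\,dx, \]
for $f, g \in C_c(\mathbb{R}^n)$ and $\alpha, \beta \in \mathbb{C}$,
\[ \int_{\mathbb{R}^n} f(x)\,dx \le \int_{\mathbb{R}^n} g(x)\,dx, \qquad \text{for } f \le g. \]
Thus the map $\Lambda : C_c(\mathbb{R}^n) \to \mathbb{C}$ given by $\Lambda f = \int_{\mathbb{R}^n} f(x)\,dx$ is a positive linear functional, and Theorems 32 and 33 apply to show that there is a $\sigma$-algebra $\mathcal{L}_n$ containing the Borel sets, and a positive measure $\lambda_n$ on $\mathcal{L}_n$, called Lebesgue measure, that satisfies the following properties, referred to collectively as (4.3):
(1) $\Lambda f = \int_{\mathbb{R}^n} f \, d\lambda_n$ for all $f \in C_c(\mathbb{R}^n)$;
(2) $\lambda_n(K) < \infty$ for all compact $K \subset \mathbb{R}^n$;
(3) Given $E \in \mathcal{L}_n$ and $\varepsilon > 0$, there is $F$ closed and $G$ open such that $F \subset E \subset G$ and $\lambda_n(G \setminus F) < \varepsilon$;
(4) $A \in \mathcal{L}_n$ if $A \subset E \in \mathcal{L}_n$ and $\lambda_n(E) = 0$.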
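It is now an easy matter to establish the additional properties expected of Lebesgue measure $\lambda_n$ on $\mathbb{R}^n$:
(5) $\lambda_n([a_1, b_1] \times [a_2, b_2] \times \dots \times [a_n, b_n]) = \prod_{i=1}^{n} (b_i - a_i)$, (4.4)
(6) $E + x \in \mathcal{L}_n$ and $\lambda_n(E + x) = \lambda_n(E)$ if $E \in \mathcal{L}_n$ and $x \in \mathbb{R}^n$.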
We have already produced in Theorem 20 an example of a subset $E$ of the interval $[0, 1)$ that is not Lebesgue measurable, i.e. $E \in \mathcal{P}(\mathbb{R}) \setminus \mathcal{L}_1$. We can lift this example to higher dimensions simply by considering the set $E_n = E \times \mathbb{R}^{n-1}$ in $\mathbb{R}^n$ since it is easy to see that $E_n \in \mathcal{P}(\mathbb{R}^n) \setminus \mathcal{L}_n$.
Let $\mathcal{B}_n$ denote the Borel $\sigma$-algebra on $\mathbb{R}^n$. Then $\mathcal{B}_n \subset \mathcal{L}_n$, and the question arises as to whether or not $\mathcal{L}_n \setminus \mathcal{B}_n$ is nonempty. In fact, we have already produced in Example 3 a subset $E$ of the unit interval $[0, 1]$ that is not measurable, and with the additional property that there is a homeomorphism $G : [0, 1] \to [0, 1]$ with inverse $F = G^{-1}$ such that $F(E)$ is contained in the Cantor set. Since Lebesgue measure is complete, $F(E) \in \mathcal{L}_1$. But $F(E)$ cannot be a Borel set since a homeomorphism takes Borel sets to Borel sets! Indeed, if we extend the bijection $G : [0, 1] \to [0, 1]$ to a bijection $G : \mathcal{P}([0, 1]) \to \mathcal{P}([0, 1])$ in the natural way, $E \mapsto G(E)$, then the pushforward of a $\sigma$-algebra is again a $\sigma$-algebra by Remark 13. Since a homeomorphism takes open sets to open sets, it follows that $\mathcal{B}_1$, the smallest $\sigma$-algebra containing the open sets, is taken under the map $G$ to the smallest $\sigma$-algebra containing the open sets, $\mathcal{B}_1$. Thus we have shown that $\mathcal{L}_1 \setminus \mathcal{B}_1 \ne \emptyset$.
However, it turns out that the set of Lebesgue measurable sets has much larger
cardinality than the set of Borel measurable sets, and we now turn to establishing
this.
4.2. Cardinality of Borel sets. Recall that $\mathcal{B}_n$ is the Borel $\sigma$-algebra on $\mathbb{R}^n$. Here we show that the cardinality $|\mathcal{B}_n|$ of $\mathcal{B}_n$ is at most $2^{\aleph_0} = |\mathbb{R}| = |\mathcal{P}(\mathbb{N})|$, the cardinality of both the real numbers $\mathbb{R}$ and the power set of the natural numbers $\mathbb{N}$. On the other hand, the cardinality of the Lebesgue $\sigma$-algebra $\mathcal{L}_n$ is at least the cardinality of the power set $\mathcal{P}(E)$ of the Cantor set $E$ (since $\lambda_n(E) = 0$ and $\lambda_n$ is complete). But $E$ has cardinality $2^{\aleph_0}$, and so
\[ |\mathcal{L}_n| \ge 2^{2^{\aleph_0}} > 2^{\aleph_0} \ge |\mathcal{B}_n|. \tag{4.5} \]
In particular, this shows that $\mathcal{L}_n \setminus \mathcal{B}_n \ne \emptyset$. In fact there are many more Lebesgue measurable sets than Borel measurable sets in the sense of cardinality.
It turns out that $|\mathcal{L}_n| = 2^{2^{\aleph_0}}$ and $|\mathcal{B}_n| = 2^{\aleph_0}$, but we will content ourselves with proving only the inequalities used in (4.5). The first two inequalities are easy. To show that $|\mathcal{B}_n| \le 2^{\aleph_0}$, we start with the fact that $\mathbb{R}^n$ has a countable base $\mathfrak{B}$ of balls, e.g. the collection of all balls with rational radii and centers having rational coordinates. Since every open set $G$ in $\mathbb{R}^n$ is a union of balls from $\mathfrak{B}$, namely
\[ G = \bigcup \{ B \in \mathfrak{B} : B \subset G \}, \]
we see that $\mathcal{G} = \{ G \in \mathcal{P}(\mathbb{R}^n) : G \text{ is open} \}$ has cardinality $|\mathcal{G}| \le 2^{\aleph_0}$, and so also $|\mathcal{F}| \le 2^{\aleph_0}$, where $\mathcal{F} = \{ F \in \mathcal{P}(\mathbb{R}^n) : F \text{ is closed} \}$.
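Now we consider the $\sigma\delta$ operator $\mathcal{S}$ that maps $\mathcal{P}(\mathcal{P}(\mathbb{R}^n))$ to itself by
\[ \mathcal{S}\mathcal{E} = \Big\{ \bigcap_{k=1}^{\infty} \bigcup_{\ell=1}^{\infty} E^k_\ell : E^k_\ell \in \mathcal{E} \text{ for all } k, \ell \ge 1 \Big\}. \]
We apply $\mathcal{S}$ iteratively to the set $\mathcal{F}$ to obtain larger and larger sets of sets:
\[ \mathcal{F}_0 = \mathcal{F}, \quad \mathcal{F}_1 = \mathcal{S}\mathcal{F}, \quad \mathcal{F}_n = \mathcal{S}^n \mathcal{F} = \mathcal{S}\mathcal{F}_{n-1}, \qquad \text{for } n \ge 1. \]
At this point we assume minimal familiarity with ordinal arithmetic. Then we can continue with transfinite induction to define $\mathcal{F}_\alpha$ inductively for every ordinal $\alpha \le \omega_1$, where $\omega_1$ is the first uncountable ordinal:
\[ \mathcal{F}_\alpha = \begin{cases} \mathcal{S}\mathcal{F}_{\alpha - 1} & \text{if } \alpha \text{ is a successor ordinal} \\ \bigcup_{\beta < \alpha} \mathcal{F}_\beta & \text{if } \alpha \text{ is a limit ordinal} \end{cases}, \qquad \alpha \le \omega_1. \]
One easily sees that $|\mathcal{F}_\alpha| \le 2^{\aleph_0}$ for all $\alpha < \omega_1$ by transfinite induction. Then we have
\[ |\mathcal{F}_{\omega_1}| = \Big| \bigcup_{\alpha < \omega_1} \mathcal{F}_\alpha \Big| \le |\omega_1| \, 2^{\aleph_0} \le 2^{\aleph_0} \, 2^{\aleph_0} = 2^{\aleph_0}. \tag{4.6} \]
Claim 5. $\mathcal{F}_{\omega_1} = \mathcal{B}_n$.
It follows immediately from (4.6) and the claim that $|\mathcal{B}_n| \le 2^{\aleph_0}$, and this completes our proof of (4.5).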
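Proof of Claim: We first use transfinite induction to show that $\mathcal{F}_{\omega_1} \subset \mathcal{B}_n$. Indeed, fix $\alpha \le \omega_1$ and suppose that $\mathcal{F}_\beta \subset \mathcal{B}_n$ for all $\beta < \alpha$. If $\alpha$ is a successor ordinal, then $\mathcal{F}_{\alpha - 1} \subset \mathcal{B}_n$ and
\[ \mathcal{F}_\alpha = \mathcal{S}\mathcal{F}_{\alpha - 1} \subset \mathcal{S}\mathcal{B}_n \subset \mathcal{B}_n. \]
If $\alpha$ is a limit ordinal, then $\mathcal{F}_\beta \subset \mathcal{B}_n$ for all $\beta < \alpha$ implies that
\[ \mathcal{F}_\alpha = \bigcup_{\beta < \alpha} \mathcal{F}_\beta \subset \mathcal{B}_n. \]
Conversely, we begin by showing that the collection $\mathcal{F}_{\omega_1}$ is closed under countable unions. Suppose that $\{E_n\}_{n=1}^{\infty} \subset \mathcal{F}_{\omega_1}$. Then for each $n$, the set $E_n \in \mathcal{F}_{\alpha_n}$ for some $\alpha_n < \omega_1$. Now
\[ \alpha = \sup_{n \ge 1} \alpha_n < \omega_1, \]
and so $E_n \in \mathcal{F}_\alpha$ for all $n \ge 1$, hence
\[ \bigcup_{n=1}^{\infty} E_n \in \mathcal{S}\mathcal{F}_\alpha = \mathcal{F}_{\alpha + 1} \subset \mathcal{F}_{\omega_1}. \]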
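Next we show that $\mathcal{F}_{\omega_1}$ is closed under complementation. For this we use the complementation operator $\mathcal{C} : \mathcal{P}(\mathcal{P}(\mathbb{R}^n)) \to \mathcal{P}(\mathcal{P}(\mathbb{R}^n))$ defined by
\[ \mathcal{C}\mathcal{E} = \{ E^c : E \in \mathcal{E} \}. \]
Suppose now that $F \in \mathcal{F}_0 = \mathcal{F}$. Then $F^c \in \mathcal{S}\mathcal{F}$ since $F^c$ is open and every open set is an $F_\sigma$-set. Thus we have $\mathcal{C}\mathcal{F}_0 \subset \mathcal{F}_1$. We can now prove by transfinite induction that
\[ \mathcal{C}\mathcal{F}_\alpha \subset \mathcal{F}_{2\alpha + 1}, \qquad \text{for all ordinals } \alpha < \omega_1. \tag{4.7} \]
Indeed, fix an ordinal $\alpha < \omega_1$ and make the induction assumption that $\mathcal{C}\mathcal{F}_\beta \subset \mathcal{F}_{2\beta + 1}$ for all $\beta < \alpha$. Let $E \in \mathcal{F}_\alpha$. If $\alpha$ is a successor ordinal, then $E = \bigcap_{k=1}^{\infty} \bigcup_{\ell=1}^{\infty} E^k_\ell$ where $E^k_\ell \in \mathcal{F}_{\alpha - 1}$ for all $k, \ell \ge 1$. The induction assumption applies to $\alpha - 1$ and we obtain
\[ E^c = \Big( \bigcap_{k=1}^{\infty} \bigcup_{\ell=1}^{\infty} E^k_\ell \Big)^c = \bigcup_{k=1}^{\infty} \bigcap_{\ell=1}^{\infty} \big( E^k_\ell \big)^c \in \mathcal{S}^2 \mathcal{C}\mathcal{F}_{\alpha - 1} \subset \mathcal{S}^2 \mathcal{F}_{2(\alpha - 1) + 1} = \mathcal{S}^2 \mathcal{F}_{2\alpha - 1} = \mathcal{F}_{2\alpha + 1}. \]
If $\alpha$ is a limit ordinal, then $E \in \mathcal{F}_\beta$ for some $\beta < \alpha$. The induction assumption applies to $\beta$ and we obtain $E^c \in \mathcal{C}\mathcal{F}_\beta \subset \mathcal{F}_{2\beta + 1} \subset \mathcal{F}_{2\alpha + 1}$. This completes the proof of (4.7). Finally, if $E \in \mathcal{F}_{\omega_1}$, then $E \in \mathcal{F}_\alpha$ for some $\alpha < \omega_1$ and since $2\alpha + 1 < \omega_1$, we have from (4.7) that
\[ E^c \in \mathcal{F}_{2\alpha + 1} \subset \mathcal{F}_{\omega_1}. \]
Altogether, we have shown that $\mathcal{F}_{\omega_1}$ is a $\sigma$-algebra on $\mathbb{R}^n$ containing the closed sets $\mathcal{F}$. Thus $\mathcal{F}_{\omega_1} \supset \mathcal{B}_n$ since $\mathcal{B}_n$ is the smallest $\sigma$-algebra on $\mathbb{R}^n$ containing $\mathcal{F}$. This completes the proof of the claim.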
5. Littlewood's three principles
A valuable quote from J. E. Littlewood is this:
Quote: "The extent of knowledge required is nothing like so great as is sometimes supposed. There are three principles, roughly expressible in the following terms: Every [measurable] set is nearly a finite union of intervals; every [measurable] function is nearly continuous; every convergent sequence of [measurable] functions is nearly uniformly convergent. Most of the results of [the theory] are fairly intuitive applications of these ideas, and the student armed with them should be equal to most occasions when real variable theory is called for. If one of the principles would be the obvious means to settle the problem if it were 'quite' true, it is natural to ask if the 'nearly' is near enough, and for a problem that is actually solvable it generally is."
In this quote, Littlewood is referring to Lebesgue measure on the real line, but
the principles apply with little change to regular measures as well.
Littlewood's first principle is embodied in Theorems 33 and 34 for regular measures. In the case of Lebesgue measure, it is explicitly contained in property (3) of (4.3):
(3) Given $E \in \mathcal{L}_1$ and $\varepsilon > 0$, there is $F$ closed and $G$ open such that $F \subset E \subset G$ and $\lambda_1(G \setminus F) < \varepsilon$.
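Indeed, suppose in addition that $\lambda_1(E) < \infty$, so that $\lambda_1(G) < \infty$. Since $G$ is an open subset of $\mathbb{R}$, it follows that $G = \bigcup_{n=1}^{\infty} I_n$ is an at most countable union of pairwise disjoint intervals $I_n$. Choose $N < \infty$ such that
\[ \lambda_1(G \setminus E) + \sum_{n=N+1}^{\infty} \lambda_1(I_n) < \varepsilon. \]
Then if we define the symmetric difference of two sets $E$ and $F$ by
\[ \Delta(E, F) = (E \setminus F) \cup (F \setminus E), \]
we have
\[ \Delta\Big( \bigcup_{n=1}^{N} I_n, E \Big) = \Big\{ \Big( \bigcup_{n=1}^{N} I_n \Big) \setminus E \Big\} \cup \Big\{ E \setminus \Big( \bigcup_{n=1}^{N} I_n \Big) \Big\} \subset (G \setminus E) \cup \Big( \bigcup_{n=N+1}^{\infty} I_n \Big), \]
and so
\[ \lambda_1\Big( \Delta\Big( E, \bigcup_{n=1}^{N} I_n \Big) \Big) \le \lambda_1(G \setminus E) + \sum_{n=N+1}^{\infty} \lambda_1(I_n) < \varepsilon, \]
which is what Littlewood meant by "Every [measurable] set is nearly a finite union of intervals".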
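The choice of $N$ above is a plain tail estimate, and it can be sketched numerically. The following is a minimal illustration, assuming the tail sums $\sum_{n > N} \lambda_1(I_n)$ are available as a function of $N$; the helper name `choose_N` and the sample numbers are hypothetical and only serve the example.

```python
def choose_N(outer_gap, tail, eps):
    """Smallest N >= 0 with outer_gap + tail(N) < eps.

    outer_gap plays the role of lambda_1(G minus E), and tail(N) is the
    sum of lambda_1(I_n) over n > N, assumed to tend to 0 as N grows.
    """
    if outer_gap >= eps:
        raise ValueError("need lambda_1(G minus E) < eps to begin with")
    N = 0
    while outer_gap + tail(N) >= eps:
        N += 1
    return N


if __name__ == "__main__":
    # Illustrative data: intervals I_n of length 2^-n, so the tail after
    # N terms is 2^-N; the outer gap 0.01 is an assumed value.
    N = choose_N(0.01, lambda N: 2.0 ** (-N), 0.05)
    print(N)
```

With these sample values the first $N$ intervals already approximate $E$ to within $\varepsilon$ in the sense of the symmetric difference.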
Littlewood's second principle is embodied in Lusin's theorem.
Theorem 35 (Lusin's Theorem). Suppose that $X$ is a locally compact Hausdorff space, and that $\mu$ is a measure on a $\sigma$-algebra $\mathcal{A}$ that satisfies the four properties in the conclusion of the Riesz representation theorem 32, namely local finiteness, outer $\mathcal{A}$-regularity, limited inner $\mathcal{A}$-regularity, and completeness. Suppose also that $f : X \to \mathbb{C}$ is measurable and that $f$ vanishes outside a measurable set $E$ of finite measure. Then given $\varepsilon > 0$, there is $g \in C_c(X)$ such that both
\[ \mu(\{ x \in X : f(x) \ne g(x) \}) < \varepsilon, \tag{5.1} \]
\[ \sup_{x \in X} |g(x)| \le \sup_{x \in X} |f(x)|. \]
The following theorem of Tietze, whose proof is deferred until after we have used it to prove Lusin's theorem, is the key to our proof of Lusin's theorem.
Theorem 36 (Tietze extension theorem). Suppose that $X$ is a locally compact Hausdorff space, $A$ is a closed subset of $X$, and that $f : A \to \mathbb{R}$ is continuous with compact support. Then there is a continuous extension $g : X \to \mathbb{R}$ satisfying both
\[ g(x) = f(x), \qquad x \in A, \]
\[ \sup_{x \in X} |g(x)| \le \sup_{x \in A} |f(x)|. \]
We may take $g \in C_c(X)$.
Proof (of Lusin's Theorem): We first claim that it suffices to prove Lusin's theorem for real-valued functions. Indeed, suppose Lusin's theorem holds for real-valued functions, and let $f = u + iv$ where $u$ and $v$ are real-valued. We may assume that $0 < R = \sup_{x \in X} |f(x)| < \infty$ since otherwise the complex-valued case follows immediately from the real-valued case. Now $u$ and $v$ are both measurable, and so there are real-valued functions $\varphi, \psi \in C_c(X)$ with
\[ \mu(\{ x \in X : u(x) \ne \varphi(x) \}) + \mu(\{ x \in X : v(x) \ne \psi(x) \}) < \varepsilon. \]
Now define
\[ g(x) = \begin{cases} \varphi(x) + i\psi(x) & \text{if } |\varphi(x) + i\psi(x)| \le R \\ \dfrac{\varphi(x) + i\psi(x)}{|\varphi(x) + i\psi(x)|} R & \text{if } |\varphi(x) + i\psi(x)| \ge R \end{cases}. \]
Then $g \in C_c(X)$ and satisfies (5.1).
Now suppose that $f$ is real-valued and measurable on $X$. By outer $\mathcal{A}$-regularity and limited inner $\mathcal{A}$-regularity of $\mu$ we can choose $K$ compact and $G$ open such that $K \subset E \subset G$ and
\[ \mu(G \setminus K) = \mu(G) - \mu(K) < \frac{\varepsilon}{2}. \]
Let $\{I_n\}_{n=1}^{\infty}$ be a countable base of open intervals for $\mathbb{R} \setminus \{0\}$. Then for each $n \ge 1$, $f^{-1}(I_n)$ is a measurable subset of $E$ and so by outer $\mathcal{A}$-regularity and limited inner $\mathcal{A}$-regularity of $\mu$, there are open sets $G_n$ and compact sets $K_n$ such that
\[ K_n \subset f^{-1}(I_n) \subset G_n, \qquad \mu(G_n \setminus K_n) < \frac{\varepsilon}{2^{n+1}}. \]
Now let
\[ A = G^c \cup \Big\{ K \setminus \Big( \bigcup_{n=1}^{\infty} (G_n \setminus K_n) \Big) \Big\}. \]
Then $A$ is a closed set and the restriction $f_A : A \to \mathbb{R}$ of $f$ to $A$ is continuous and has compact support. Indeed, $\operatorname{supp} f_A$ is contained in $K$, and hence is compact. Moreover,
\[ (f_A)^{-1}(I_n) = f^{-1}(I_n) \cap A = G_n \cap A \]
is relatively open in $A$ for each $n \ge 1$, and $f$ vanishes on
\[ G^c = K^c \cap A, \]
which is relatively open in $A$ as well. It follows easily that $f_A$ is continuous. The Tietze extension theorem now yields $g : X \to \mathbb{R}$ continuous with compact support, and such that $f = g$ on $A$ and $\sup_X |g| \le \sup_A |f|$. Moreover,
\[ \mu(A^c) \le \mu(G \setminus K) + \mu\Big( \bigcup_{n=1}^{\infty} (G_n \setminus K_n) \Big) < \frac{\varepsilon}{2} + \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n+1}} = \varepsilon. \]
Proof (of the Tietze extension theorem): Let $M = \sup_{x \in A} |f(x)|$. Then $M < \infty$ since $\operatorname{supp} f$ is compact, and $f$ is continuous. We may suppose that $M > 0$ as well and, upon replacing $f$ with $\frac{f}{M}$, we may suppose that $M = 1$. Thus $f : A \to [-1, 1]$ is continuous. Define
\[ B = \Big\{ x \in A : f(x) \le -\frac{1}{3} \Big\} \quad\text{and}\quad C = \Big\{ x \in A : f(x) \ge \frac{1}{3} \Big\}. \]
Then $B$ and $C$ are compact sets since $\operatorname{supp} f$ is compact by hypothesis. Urysohn's Lemma now yields $g_1 \in C_c(X)$ with $\chi_C \le g_1 \le \chi_{B^c}$, and so $f_1 = \frac{2}{3}\big( g_1 - \frac{1}{2} \big)$ is continuous (but no longer compactly supported) and satisfies $|f_1(x)| \le \frac{1}{3}$ for all $x \in X$, as well as
\[ f_1(x) = \begin{cases} -\frac{1}{3} & \text{if } x \in B \\ \frac{1}{3} & \text{if } x \in C \end{cases}. \]
It follows that we have
\[ |f(x) - f_1(x)| \le \frac{2}{3} \quad \text{for all } x \in A, \]
\[ |f_1(x)| \le \frac{1}{3} \quad \text{for all } x \in X. \]
Indeed, to see the first inequality, simply consider the three cases $x \in B$, $x \in C$ and $x \in A \setminus (B \cup C)$ separately. In order to iterate this construction, it is important to be able to take $f_1 \in C_c(X)$. To achieve this, use Urysohn's Lemma to obtain a function $h \in C_c(X)$ satisfying $\chi_{\operatorname{supp} f} \le h \le 1$, and then replace $f_1$ with $h f_1$.
We now repeat this construction, but applied and rescaled to the continuous function
\[ (f - f_1) : A \to \Big[ -\frac{2}{3}, \frac{2}{3} \Big], \]
that has compact support since both $f$ and $f_1$ do, to obtain a continuous function $f_2 : X \to \mathbb{R}$ satisfying
\[ |(f - f_1)(x) - f_2(x)| \le \Big( \frac{2}{3} \Big)^2 \quad \text{for all } x \in A, \]
\[ |f_2(x)| \le \Big( \frac{1}{3} \Big)\Big( \frac{2}{3} \Big) \quad \text{for all } x \in X. \]
Once again we can assume that $f_2 \in C_c(X)$ upon multiplying it by a function $h \in C_c(X)$ satisfying $\chi_{\operatorname{supp}(f - f_1)} \le h \le 1$. We continue by induction to obtain for each $n \ge 1$ a continuous function $f_n : X \to \mathbb{R}$ with compact support satisfying
\[ \Big| f(x) - \sum_{j=1}^{n} f_j(x) \Big| \le \Big( \frac{2}{3} \Big)^n \quad \text{for all } x \in A, \]
\[ |f_n(x)| \le \Big( \frac{1}{3} \Big)\Big( \frac{2}{3} \Big)^{n-1} \quad \text{for all } x \in X. \]
Now the infinite series $\sum_{j=1}^{\infty} f_j$ converges uniformly on $X$ to a continuous function $g$ on $X$ that satisfies
\[ f(x) = g(x) \quad \text{for all } x \in A, \]
\[ \sup_{x \in X} |g(x)| \le \sum_{n=1}^{\infty} \Big( \frac{1}{3} \Big)\Big( \frac{2}{3} \Big)^{n-1} = 1 = \sup_{x \in A} |f(x)|. \]
If $g$ is not compactly supported we may multiply it by a Urysohn function $h \in C_c(X)$ satisfying $\chi_{\operatorname{supp} f} \le h \le 1$. This completes the proof of the Tietze extension theorem.
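The iteration in this proof can be imitated numerically in a toy setting. The following is only a sketch under simplifying assumptions: $X$ is the real line, $A$ is a finite set of distinct points (so compact support is automatic and the cutoff by $h$ is skipped), and the Urysohn function is replaced by an explicit distance quotient. The names `_bump` and `tietze_partial_sum` are hypothetical.

```python
def _bump(x, scale, B, C):
    """Continuous in x, values in [-scale, scale], -scale on B, +scale on C.

    B and C are finite, disjoint lists of reals; the distance quotient
    stands in for the function supplied by Urysohn's Lemma.
    """
    if B and C:
        d_b = min(abs(x - b) for b in B)
        d_c = min(abs(x - c) for c in C)
        return scale * (d_b - d_c) / (d_b + d_c)
    if C:
        return scale
    if B:
        return -scale
    return 0.0


def tietze_partial_sum(points, values, steps):
    """Return g = f_1 + ... + f_steps for f(points[i]) = values[i] in [-1, 1]."""
    residual = list(values)  # f minus the partial sum, on the set A
    bound = 1.0              # current bound (2/3)^j on |residual|
    terms = []
    for _ in range(steps):
        B = [p for p, r in zip(points, residual) if r <= -bound / 3]
        C = [p for p, r in zip(points, residual) if r >= bound / 3]
        terms.append((bound / 3, B, C))
        residual = [r - _bump(p, bound / 3, B, C)
                    for p, r in zip(points, residual)]
        bound *= 2 / 3
    return lambda x: sum(_bump(x, s, B, C) for s, B, C in terms)


if __name__ == "__main__":
    pts, vals = [0.0, 0.2, 0.5, 0.9], [1.0, -0.4, 0.1, -1.0]
    g = tietze_partial_sum(pts, vals, 30)
    print([round(g(p), 4) for p in pts])
```

After $n$ steps the partial sum agrees with $f$ on $A$ up to $(2/3)^n$ and stays bounded by $1$ in absolute value, mirroring the two estimates in the proof.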
Littlewood's third principle is embodied in Egoroff's theorem.
Theorem 37 (Egoroff's theorem). Suppose that $(X, \mathcal{A}, \mu)$ is a finite measure space, i.e. $\mu(X) < \infty$. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of complex-valued measurable functions on $X$ that converges pointwise at every $x \in X$. For every $\varepsilon > 0$, there is a measurable set $E \in \mathcal{A}$ satisfying
\[ \mu(X \setminus E) < \varepsilon, \qquad \{f_n\}_{n=1}^{\infty} \text{ converges uniformly on } E. \tag{5.2} \]
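Proof: For every $n, k \in \mathbb{N}$ define the set
\[ S(n, k) = \bigcap_{i, j \ge n} \Big\{ x \in X : |f_i(x) - f_j(x)| < \frac{1}{k} \Big\}. \]
Momentarily fix $k \ge 1$. The sequence of sets $\{S(n, k)\}_{n=1}^{\infty}$ is nondecreasing, i.e. $S(n, k) \subset S(n + 1, k)$ for all $n \ge 1$. Moreover, $\bigcup_{n=1}^{\infty} S(n, k) = X$ since $\{f_n(x)\}_{n=1}^{\infty}$ is a Cauchy sequence for every $x \in X$. It follows that
\[ \lim_{n\to\infty} \mu(S(n, k)) = \mu\Big( \bigcup_{n=1}^{\infty} S(n, k) \Big) = \mu(X), \qquad \text{for each fixed } k \ge 1. \]
We now construct a sequence $\{n_k\}_{k=1}^{\infty}$ of positive integers such that
\[ E = \bigcap_{k=1}^{\infty} S(n_k, k) \]
satisfies (5.2). For each $k \ge 1$ choose $n_k$ so large that
\[ \mu(X \setminus S(n_k, k)) = \mu(X) - \mu(S(n_k, k)) < \frac{\varepsilon}{2^k}. \]
Note that the first equality above uses our assumption that $\mu(X) < \infty$. Then we have
\[ \mu(X \setminus E) = \mu\Big( X \setminus \Big( \bigcap_{k=1}^{\infty} S(n_k, k) \Big) \Big) = \mu\Big( \bigcup_{k=1}^{\infty} S(n_k, k)^c \Big) \le \sum_{k=1}^{\infty} \mu(S(n_k, k)^c) < \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon. \]
Finally, given $\eta > 0$ choose $k > \frac{1}{\eta}$. Then for all $i, j \ge n_k$ and for all $x \in E \subset S(n_k, k)$ we have
\[ |f_i(x) - f_j(x)| < \frac{1}{k} < \eta, \]
which shows that $\{f_n\}_{n=1}^{\infty}$ converges uniformly on $E$.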
6. Exercises
Exercise 12. With regard to the fourth example in Example 4 above, prove that $\mathcal{A}$ is a $\sigma$-algebra on $X$ containing the open sets, and that $\lambda$ is a positive measure on $\mathcal{A}$. Hint: Show that every countable intersection of uncountable compact subsets of $X$ is uncountable. Recall that compact sets are closed in a Hausdorff space such as $X$.
Exercise 13. Prove both (4.3) and (4.4) above.
Exercise 14. (Exercise 17 page 59 in Rudin) Define a distance between points $(x_1, y_1)$ and $(x_2, y_2)$ in the plane by
\[ d((x_1, y_1), (x_2, y_2)) = \begin{cases} |y_1 - y_2| & \text{if } x_1 = x_2 \\ 1 + |y_1 - y_2| & \text{if } x_1 \ne x_2 \end{cases}. \]
(1) Show that $d$ is a metric on $\mathbb{R}^2$ and that $(\mathbb{R}^2, d)$ is a locally compact Hausdorff space.
(2) If $f \in C_c(\mathbb{R}^2)$ (where $\mathbb{R}^2$ has the metric $d$), show that there are only finitely many values of $x$ for which $f(x, y) \ne 0$ for at least one $y$. If these values are $\{x_j\}_{j=1}^{n}$ define
\[ \Lambda f = \sum_{j=1}^{n} \int_{-\infty}^{\infty} f(x_j, y)\,dy. \]
Show that $\Lambda$ is a positive linear functional on $C_c(\mathbb{R}^2)$.
(3) Let $\mu$ be the measure given by the Riesz representation theorem. Let $E = \{(x, y) : y = 0\}$ be the $x$-axis and prove that
\[ \mu(E) = \infty \ne 0 = \sup_{\text{compact } K \subset E} \mu(K). \]
CHAPTER 7
Lebesgue, Banach and Hilbert spaces
Let $(X, \mathcal{A}, \mu)$ be a measure space. We have already met the space of integrable complex-valued functions on $X$:
\[ L^1(\mu) = \Big\{ f : X \to \mathbb{C} : \int_X |f| \, d\mu < \infty \Big\}. \]
Here the superscript $1$ in $L^1(\mu)$ refers to the power of $|f|$ in the integral $\int_X |f| \, d\mu$. Using linearity and monotonicity of the integral, we see that $L^1(\mu)$ is a complex vector space:
\[ \int_X |\alpha f + \beta g| \, d\mu \le \int_X (|\alpha| |f| + |\beta| |g|) \, d\mu = |\alpha| \int_X |f| \, d\mu + |\beta| \int_X |g| \, d\mu \]
is finite for all $f, g \in L^1(\mu)$ and $\alpha, \beta \in \mathbb{C}$. In fact, the integral $\int_X |f| \, d\mu$ defines a norm on the vector space $L^1(\mu)$ provided we identify any two functions $f$ and $g$ in $L^1(\mu)$ that differ only on a set of measure zero. More precisely, we declare $f \sim g$ if
\[ \mu(\{ x \in X : f(x) \ne g(x) \}) = 0. \]
It is easy to see that $\sim$ is an equivalence relation on $L^1(\mu)$ and that the map
\[ [f] \to \int_X |f| \, d\mu \]
defines a norm on the quotient space
\[ \mathcal{L}^1(\mu) = L^1(\mu) / \sim. \]
For $f \in L^1(\mu)$, we are using the notation $[f] \in \mathcal{L}^1(\mu)$ above to denote the equivalence class containing $f$. Recall that a norm on a complex vector space $V$ is a function
\[ \|\cdot\| : V \to [0, \infty) \quad \text{by} \quad x \to \|x\| \]
that vanishes only at $x = 0$ and satisfies
\[ \|\alpha x\| = |\alpha| \, \|x\|, \qquad \alpha \in \mathbb{C},\ x \in V, \]
\[ \|x + y\| \le \|x\| + \|y\|, \qquad x, y \in V. \]
Every norm gives rise to an associated metric $d$ on $V$ defined by
\[ d(x, y) = \|x - y\|, \qquad x, y \in V. \]
If the metric space $(V, d)$ is complete, we call $V$ a Banach space.
In the next section, we will extend these considerations to the Lebesgue spaces $L^p(\mu)$ defined for each $0 < p \le \infty$. We will see that for $1 \le p \le \infty$, $L^p(\mu)$ is a complete normed linear space, referred to as a Banach space. Moreover, the special case $L^2(\mu)$ has many remarkable additional properties, and is the prototypical example of a Hilbert space.
But first we use the Dominated Convergence Theorem together with Lusin's Theorem to make a connection between the spaces $C_c(X)$ and $L^1(\mu)$ in the case that $\mu$ and $X$ are related as in the Riesz representation theorem 32. In the special case that
\[ \mu(V) > 0 \quad \text{for every nonempty open set } V, \tag{0.1} \]
$C_c(X)$ can be considered to be a subset of the space $L^1(\mu)$ of equivalence classes of integrable functions. Indeed, if $f, g \in C_c(X)$ differ only on a set of measure zero, then they differ nowhere at all. This is because $\{ x \in X : f(x) \ne g(x) \}$ is an open set, and if it has $\mu$-measure zero, then by (0.1) it is empty. On the other hand, without assuming (0.1), we can still consider the collection of equivalence classes $[f]$ of functions $f \in C_c(X)$, and we will show that this subspace is dense in $L^1(\mu)$.
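Lemma 26. Suppose that $X$ is a locally compact Hausdorff space. If a $\sigma$-algebra $\mathcal{A}$ and a positive measure $\mu$ are as in the conclusion of Theorem 32, then $C_c(X)$ is dense in the metric space $L^1(\mu)$.
Proof. Fix $f \in L^1(\mu)$ and $\varepsilon > 0$. For $n \in \mathbb{N}$ let
\[ f_n(x) = \begin{cases} f(x) & \text{if } \frac{1}{n} \le |f(x)| \le n \\ 0 & \text{if } |f(x)| < \frac{1}{n} \text{ or } |f(x)| > n \end{cases}. \]
Then $\mu(\{ x \in X : |f(x)| = \infty \}) = 0$ and so $\lim_{n\to\infty} f_n(x) = f(x)$ for $\mu$-almost every $x \in X$. Also $|f_n(x)| \le |f(x)|$ for all $n \in \mathbb{N}$ and $x \in X$ where $f \in L^1(\mu)$. Thus the Dominated Convergence Theorem shows that
\[ \lim_{n\to\infty} \int_X |f - f_n| \, d\mu = 0, \]
and there exists $n$ so that
\[ \int_X |f - f_n| \, d\mu < \frac{\varepsilon}{2}. \]
Now $f_n$ vanishes outside the set $\{ |f| \ge \frac{1}{n} \}$, which has finite measure, and so we can use Lusin's Theorem to obtain a function $g \in C_c(X)$ such that
\[ \mu(\{ x \in X : f_n(x) \ne g(x) \}) < \frac{\varepsilon}{4n}, \]
\[ \sup_{x \in X} |g(x)| \le \sup_{x \in X} |f_n(x)|. \]
Then
\[ \sup_{x \in X} |f_n(x) - g(x)| \le \sup_{x \in X} |f_n(x)| + \sup_{x \in X} |g(x)| \le 2 \sup_{x \in X} |f_n(x)| \le 2n, \]
and we have
\[ \operatorname{dist}_{L^1(\mu)}(f, g) = \int |f - g| \, d\mu \le \int |f - f_n| \, d\mu + \int |f_n - g| \, d\mu < \frac{\varepsilon}{2} + \int_{\{x : f_n(x) \ne g(x)\}} |f_n(x) - g(x)| \, d\mu(x) \]
\[ \le \frac{\varepsilon}{2} + 2n \, \mu(\{ x \in X : f_n(x) \ne g(x) \}) < \frac{\varepsilon}{2} + 2n \frac{\varepsilon}{4n} = \varepsilon. \]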
1. $L^p$ spaces
Let $(X, \mathcal{A}, \mu)$ be a measure space. For $0 < p < \infty$ and $f : X \to \mathbb{C}$ measurable define
\[ \|f\|_{L^p(\mu)} = \Big( \int_X |f|^p \, d\mu \Big)^{\frac{1}{p}}. \]
We denote by $L^p(\mu)$ the set of measurable functions $f$ satisfying $\|f\|_{L^p(\mu)} < \infty$. Just as in the case $p = 1$ above, we identify functions that differ only on a set of measure zero. We will sometimes write $\|f\|_p$ instead of $\|f\|_{L^p(\mu)}$ when no confusion can arise. The next two inequalities are called Hölder's inequality and Minkowski's inequality respectively.
Lemma 27. Let $(X, \mathcal{A}, \mu)$ be a measure space and $1 < p, p' < \infty$, $\frac{1}{p} + \frac{1}{p'} = 1$. Suppose that $f, g : X \to [0, \infty]$ are measurable functions. Then
\[ \int_X f g \, d\mu \le \|f\|_{L^p(\mu)} \|g\|_{L^{p'}(\mu)}, \]
\[ \|f + g\|_{L^p(\mu)} \le \|f\|_{L^p(\mu)} + \|g\|_{L^p(\mu)}. \]
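Proof: The geometric/arithmetic mean inequality says
\[ A^{\frac{1}{p}} B^{\frac{1}{p'}} \le \frac{1}{p} A + \frac{1}{p'} B, \qquad A, B \ge 0. \tag{1.1} \]
We may assume $0 < \|f\|_{L^p(\mu)}, \|g\|_{L^{p'}(\mu)} < \infty$. Substitute $A = \frac{f(x)^p}{\|f\|^p_{L^p(\mu)}}$ and $B = \frac{g(x)^{p'}}{\|g\|^{p'}_{L^{p'}(\mu)}}$ in (1.1) and then integrate with respect to the measure $\mu$ on $X$ to obtain
\[ \int_X \frac{f(x)}{\|f\|_{L^p(\mu)}} \frac{g(x)}{\|g\|_{L^{p'}(\mu)}} \, d\mu(x) \le \int_X \Big\{ \frac{1}{p} \frac{f(x)^p}{\|f\|^p_{L^p(\mu)}} + \frac{1}{p'} \frac{g(x)^{p'}}{\|g\|^{p'}_{L^{p'}(\mu)}} \Big\} \, d\mu(x) = \frac{1}{p} \frac{\|f\|^p_{L^p(\mu)}}{\|f\|^p_{L^p(\mu)}} + \frac{1}{p'} \frac{\|g\|^{p'}_{L^{p'}(\mu)}}{\|g\|^{p'}_{L^{p'}(\mu)}} = 1, \]
which proves Hölder's inequality.
Now we apply Hölder's inequality to obtain
\[ \|f + g\|^p_{L^p(\mu)} = \int_X (f + g)^p \, d\mu = \int_X f (f + g)^{p-1} \, d\mu + \int_X g (f + g)^{p-1} \, d\mu \le \big( \|f\|_{L^p(\mu)} + \|g\|_{L^p(\mu)} \big) \big\| (f + g)^{p-1} \big\|_{L^{p'}(\mu)}. \tag{1.2} \]
However, $(p - 1) p' = p$ and so
\[ \big\| (f + g)^{p-1} \big\|_{L^{p'}(\mu)} = \Big( \int_X \big[ (f + g)^{p-1} \big]^{p'} \, d\mu \Big)^{\frac{1}{p'}} = \Big( \int_X (f + g)^p \, d\mu \Big)^{\frac{1}{p'}} = \|f + g\|^{p-1}_{L^p(\mu)}. \]
Using $(f + g)^p \le 2^{p-1}(f^p + g^p)$ we may assume $0 < \|f + g\|_{L^p(\mu)} < \infty$ and divide both sides of (1.2) by $\big\| (f + g)^{p-1} \big\|_{L^{p'}(\mu)} = \|f + g\|^{p-1}_{L^p(\mu)}$ to obtain Minkowski's inequality.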
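Both inequalities can be checked numerically on a finite measure space. The following is a small sanity check, assuming a weighted counting measure on four points; the helper names `lp_norm` and `integral` and the sample data are illustrative only.

```python
def integral(f, weights):
    """Integral of f against the measure assigning weights[i] to point i."""
    return sum(w * v for v, w in zip(f, weights))


def lp_norm(f, p, weights):
    """L^p norm of f with respect to the weighted counting measure."""
    return integral([abs(v) ** p for v in f], weights) ** (1.0 / p)


if __name__ == "__main__":
    weights = [1.0, 0.5, 2.0, 1.0]
    f = [1.0, 2.0, 0.0, 3.0]
    g = [0.5, 0.0, 4.0, 1.0]
    p = 3.0
    q = p / (p - 1.0)  # conjugate exponent p', so that 1/p + 1/p' = 1

    holder_lhs = integral([a * b for a, b in zip(f, g)], weights)
    holder_rhs = lp_norm(f, p, weights) * lp_norm(g, q, weights)
    print(holder_lhs, "<=", holder_rhs)

    mink_lhs = lp_norm([a + b for a, b in zip(f, g)], p, weights)
    mink_rhs = lp_norm(f, p, weights) + lp_norm(g, p, weights)
    print(mink_lhs, "<=", mink_rhs)
```

Any nonnegative data may be substituted for `f`, `g` and `weights`; the inequalities hold for every such choice by Lemma 27.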
For $1 \le p < \infty$, the subadditivity of $\|\cdot\|_{L^p(\mu)}$ shows that $L^p(\mu)$ is a linear space, that the function $f \to \|f\|_{L^p(\mu)}$ defines a norm on $L^p(\mu)$, and that
\[ d_{L^p(\mu)}(f, g) = \|f - g\|_{L^p(\mu)}, \qquad f, g \in L^p(\mu), \]
defines a metric on $L^p(\mu)$. When $0 < p < 1$, $L^p(\mu)$ is still a linear space, but $\|f\|_{L^p(\mu)}$ is no longer a norm, nor is $d_{L^p(\mu)}(f, g)$ a metric. However, in this case the $p^{th}$ power
\[ \rho_{L^p(\mu)}(f, g) = \|f - g\|^p_{L^p(\mu)}, \qquad f, g \in L^p(\mu), \]
defines a metric on the linear space $L^p(\mu)$ since for $A, B \ge 0$ and $0 < p < 1$,
\[ (A + B)^p \le A^p + B^p. \]
Indeed, with $B > 0$ fixed, the function $F(A) = A^p + B^p - (A + B)^p$ is increasing since $F'(A) = p \big( A^{p-1} - (A + B)^{p-1} \big) > 0$.
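The failure of the triangle inequality for $0 < p < 1$, and its recovery for the $p^{th}$ power, can be seen on a two-point space. The sketch below assumes counting measure on two points and $p = \frac{1}{2}$; the names `quasi_norm` and `rho` are hypothetical.

```python
def quasi_norm(f, p):
    """(sum |f_i|^p)^(1/p) for counting measure; not a norm when 0 < p < 1."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


def rho(f, g, p):
    """p-th power distance sum |f_i - g_i|^p, a metric when 0 < p < 1."""
    return sum(abs(a - b) ** p for a, b in zip(f, g))


if __name__ == "__main__":
    p = 0.5
    f, g, zero = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
    s = [a + b for a, b in zip(f, g)]
    # The quasi-norm of f + g exceeds the sum of the quasi-norms.
    print(quasi_norm(s, p), ">", quasi_norm(f, p) + quasi_norm(g, p))
    # The p-th power distance still obeys the triangle inequality.
    h = [0.0, -1.0]
    print(rho(f, h, p), "<=", rho(f, zero, p) + rho(zero, h, p))
```

Here $\|f + g\|_p = 4$ while $\|f\|_p + \|g\|_p = 2$, so subadditivity fails for the quasi-norm itself.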
A key result in measure theory is the completeness of the metric space $L^p(\mu)$.
Proposition 8. Let $(X, \mathcal{A}, \mu)$ be a measure space and $0 < p < \infty$. The metric space $L^p(\mu)$ is complete.
Proof: We prove the case $1 \le p < \infty$. The case $0 < p < 1$ is proved in the same way and is left to the reader. Suppose then that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $L^p(\mu)$. Choose a rapidly converging subsequence $\{f_{n_k}\}_{k=1}^{\infty}$, by which we mean $\sum_{k=1}^{\infty} \|f_{n_{k+1}} - f_{n_k}\|_p < \infty$. This is easily accomplished inductively by choosing for example $\{n_k\}_{k=1}^{\infty}$ strictly increasing such that
\[ \|f_n - f_{n_k}\|_p < \frac{1}{2^k}, \qquad n \ge n_{k+1}. \]
Then set
\[ g = |f_{n_1}| + \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|. \]
By the Monotone Convergence Theorem and Minkowski's inequality we have
\[ \|g\|_p = \lim_{N\to\infty} \Big\{ \int_X \Big( |f_{n_1}| + \sum_{k=1}^{N} |f_{n_{k+1}} - f_{n_k}| \Big)^p d\mu \Big\}^{\frac{1}{p}} \le \limsup_{N\to\infty} \Big\{ \|f_{n_1}\|_p + \sum_{k=1}^{N} \|f_{n_{k+1}} - f_{n_k}\|_p \Big\} < \infty, \]
and it follows that
\[ 0 \le g(x) = |f_{n_1}(x)| + \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)| < \infty \]
for $\mu$-almost every $x \in X$. Thus the series
\[ f_{n_1}(x) + \sum_{k=1}^{\infty} \big\{ f_{n_{k+1}}(x) - f_{n_k}(x) \big\} \]
converges absolutely for $\mu$-almost every $x \in X$ to a measurable function $f(x)$. We claim that $f \in L^p(\mu)$ and that $\lim_{n\to\infty} f_n = f$ in $L^p(\mu)$. Indeed, Fatou's lemma gives
\[ \int_X |f(x) - f_{n_\ell}(x)|^p \, d\mu(x) = \int_X \liminf_{k\to\infty} |f_{n_k}(x) - f_{n_\ell}(x)|^p \, d\mu(x) \le \liminf_{k\to\infty} \int_X |f_{n_k}(x) - f_{n_\ell}(x)|^p \, d\mu(x) = \liminf_{k\to\infty} \|f_{n_k} - f_{n_\ell}\|_p^p \to 0 \]
as $\ell \to \infty$ by the Cauchy condition. This shows that $f - f_{n_\ell} \in L^p(\mu)$, hence $f \in L^p(\mu)$, and also that $f_{n_\ell} \to f$ in $L^p(\mu)$ as $\ell \to \infty$. Finally, this together with the fact that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence, easily shows that $f_n \to f$ in $L^p(\mu)$ as $n \to \infty$.
Porism 1: If $\{f_n\}_{n=1}^{\infty}$ is a rapidly converging sequence in $L^p(\mu)$,
\[ \sum_{n=1}^{\infty} \|f_{n+1} - f_n\|_p < \infty, \]
then
\[ \lim_{n\to\infty} f_n(x) = f_1(x) + \sum_{n=1}^{\infty} \big( f_{n+1}(x) - f_n(x) \big) \]
exists for $\mu$-almost every $x \in X$.
The completeness of $L^p(\mu)$ shows that $L^p(\mu)$ is a Banach space for $1 \le p < \infty$.


2. Banach spaces
Three famous results, namely the uniform boundedness principle, the open mapping theorem and the closed graph theorem, hold in the generality of Banach spaces and depend on the following result of Baire.
Theorem 38. If $X$ is either (1) a complete metric space or (2) a locally compact Hausdorff space, then the intersection of countably many open dense subsets of $X$ is dense in $X$.
Proof: Let $\{V_k\}_{k=1}^{\infty}$ be a sequence of open dense subsets of $X$, and let $B_0$ be any nonempty open subset of $X$. Define sets $B_n$ inductively by choosing $B_n$ open and nonempty with $\overline{B_n} \subset V_n \cap B_{n-1}$ and in addition,
$\operatorname{diam}(B_n) < \frac{1}{n}$ in case (1),
$\overline{B_n}$ is compact in case (2).
Let $K = \bigcap_{n=1}^{\infty} \overline{B_n}$. Then in case (1), if we choose points $x_n \in B_n$, the sequence $\{x_n\}_{n=1}^{\infty}$ is Cauchy and converges to a point of $K$ since each $\overline{B_n}$ is closed. Thus $K \ne \emptyset$. In case (2), $K \ne \emptyset$ since the sets $\overline{B_n}$ are compact and decreasing, hence satisfy the finite intersection property. Thus in both cases $\emptyset \ne K \subset B_0 \cap \big( \bigcap_{k=1}^{\infty} V_k \big)$, and this shows that $\bigcap_{k=1}^{\infty} V_k$ is dense in $X$.
Remark 18. A subset $V$ of $X$ is open and dense if and only if $X \setminus V$ is closed with empty interior. Thus the conclusion of Baire's Theorem can be restated as "every countable union of closed sets with empty interior in $X$ has empty interior in $X$".
Definition 16. Let $E$ be a subset of a topological space $X$. We say that $E$ is nowhere dense if $\overline{E}$ has empty interior, that $E$ is of the first category if it is a countable union of nowhere dense sets, and that $E$ is of the second category if it is not of the first category.
Thus $E$ is of first category if and only if it is a subset of a countable union of closed nowhere dense subsets; equivalently if and only if its complement $E^c$ is a superset of a countable intersection of open dense subsets. If $X$ is a complete metric space or a locally compact Hausdorff space, then $X$ is of the second category. Indeed, if $X = \bigcup_{n=1}^{\infty} E_n$ where $E_n$ are closed sets with empty interior, then
\[ \emptyset = X^c = \Big( \bigcup_{n=1}^{\infty} E_n \Big)^c = \bigcap_{n=1}^{\infty} E_n^c \]
where the $E_n^c$ are open dense sets, contradicting Baire's Theorem. Of course, the countable union of first category sets is a first category set in any topological space $X$, and so cannot be $X$ if $X$ is a complete metric space or a locally compact Hausdorff space.
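2.1. The uniform boundedness principle.
Theorem 39. (Banach-Steinhaus uniform boundedness principle) Let $X, Y$ be Banach spaces and $\Gamma$ a set of bounded linear maps from $X$ to $Y$. Let
\[ B = \Big\{ x \in X : \sup_{\Lambda \in \Gamma} \|\Lambda x\|_Y < \infty \Big\}, \]
be the subspace of $X$ consisting of those $x$ with bounded $\Gamma$-orbits. If $B$ is of the second category in $X$, then $B = X$ and $\Gamma$ is equicontinuous, i.e.
\[ \sup_{\Lambda \in \Gamma} \|\Lambda\| < \infty, \]
where $\|\Lambda\| = \sup_{\|x\| \le 1} \|\Lambda x\|_Y$.
Proof: Let $E = \bigcap_{\Lambda \in \Gamma} \Lambda^{-1}\big( \overline{B_Y(0, \tfrac{1}{2})} \big)$ where $B_Y(0, r)$ is the ball of radius $r$ about the origin in $Y$. Then $E \subset B$ and $E$ is closed by the continuity of the maps $\Lambda$. If $x \in B$, then there is $n \in \mathbb{N}$ such that $\Lambda x \in n B_Y(0, \tfrac{1}{2})$ for all $\Lambda \in \Gamma$. Thus $B = \bigcup_{n=1}^{\infty} nE$ and since $B$ is of the second category in $X$, so is $nE$ for some $n \in \mathbb{N}$. Since $x \to nx$ is a homeomorphism of $X$, we have that $E$ is of the second category in $X$. Thus $E$ has an interior point $x$ and there is $r > 0$ so that $x - E \supset B_X(0, r)$. Then we conclude
\[ \Lambda(B_X(0, r)) \subset \Lambda x - \Lambda E \subset \overline{B_Y(0, \tfrac{1}{2})} - \overline{B_Y(0, \tfrac{1}{2})} \subset \overline{B_Y(0, 1)}, \]
which implies $\|\Lambda\| \le \frac{1}{r}$ for all $\Lambda \in \Gamma$; thus $\Gamma$ is equicontinuous and $B = X$.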
2.2. The open mapping theorem. A map $f : X \to Y$ where $X, Y$ are topological spaces is open if $f(G)$ is open in $Y$ for every $G$ open in $X$. A famous open mapping theorem is that a holomorphic function $f$ on a connected open subset $\Omega$ of the complex plane is open if it is not constant. Another is the Invariance of Domain Theorem that says $f : U \to \mathbb{R}^n$ is open if it is a continuous one-to-one map from an open set $U$ in $\mathbb{R}^n$ into $\mathbb{R}^n$. If we consider continuous linear maps $\Lambda : X \to Y$ where $X, Y$ are Banach spaces, then $\Lambda$ is open if it is onto. Note that for a linear map $\Lambda : X \to Y$ from one normed linear space $X$ to another $Y$, $\Lambda$ is open if and only if $\Lambda(B_X(0, 1)) \supset B_Y(0, r)$ for some $r > 0$.
Theorem 40. (Open mapping theorem) Suppose $X, Y$ are Banach spaces and $\Lambda : X \to Y$ is bounded and onto. Then $\Lambda$ is an open map.
Remark 19. More generally, if $\Lambda : X \to Y$ is a bounded linear operator from a Banach space $X$ to a normed linear space $Y$, and if $\Lambda X$ is of the second category in $Y$, then $\Lambda$ is open and onto $Y$, and $Y$ is a Banach space. The proof is essentially the same as that given below.
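Proof: Since $\Lambda$ is onto we have $Y = \bigcup_{k=1}^{\infty} \overline{\Lambda(kB_X(0,\tfrac14))}$, and thus by Baire's Theorem, one of the sets $\overline{\Lambda(kB_X(0,\tfrac14))} = k\,\overline{\Lambda(B_X(0,\tfrac14))}$ must have nonempty interior, and hence so must $\overline{\Lambda(B_X(0,\tfrac14))}$, say
$$B_Y(y_0, r) \subset \overline{\Lambda(B_X(0,\tfrac14))}.$$
Then we have
$$\overline{\Lambda(B_X(0,\tfrac12))} \supset \overline{\Lambda(B_X(0,\tfrac14)) - \Lambda(B_X(0,\tfrac14))} \supset \overline{\Lambda(B_X(0,\tfrac14))} - \overline{\Lambda(B_X(0,\tfrac14))} \supset B_Y(y_0,r) - B_Y(y_0,r) \supset B_Y(0,r). \tag{2.1}$$
It remains only to prove that $\overline{\Lambda(B_X(0,\tfrac12))} \subset \Lambda(B_X(0,1))$. For this, fix $y_1 \in \overline{\Lambda(B_X(0,\tfrac12))}$. Now the argument above shows that $\overline{\Lambda(B_X(0,\tfrac14))}$ contains an open ball $B_Y(0,r_1)$ about the origin as well. There is $x_1 \in B_X(0,\tfrac12)$ such that $\Lambda x_1 \in \Lambda(B_X(0,\tfrac12))$ satisfies $\|\Lambda x_1 - y_1\|_Y < r_1$. Then we have
$$\Lambda x_1 \in B_Y(y_1, r_1) \subset y_1 - \overline{\Lambda(B_X(0,\tfrac14))}.$$
Now define
$$y_2 = y_1 - \Lambda x_1 \in \overline{\Lambda(B_X(0,\tfrac14))}.$$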
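We can repeat this procedure inductively to obtain sequences $\{x_n\}_{n=1}^{\infty} \subset X$ and $\{y_n\}_{n=1}^{\infty} \subset Y$ satisfying
$$x_n \in B_X(0, 2^{-n}), \qquad y_n \in \overline{\Lambda(B_X(0, 2^{-n}))}, \qquad y_{n+1} = y_n - \Lambda x_n,$$
for all $n \ge 1$. Then $x = \lim_{N\to\infty} \sum_{n=1}^{N} x_n \in B_X(0,1)$ since $\|x\| \le \sum_{n=1}^{\infty} \|x_n\| < \sum_{n=1}^{\infty} 2^{-n} = 1$, and since $\|y_n\| \le \|\Lambda\|\, 2^{-n}$,
$$\Lambda x = \lim_{N\to\infty} \sum_{n=1}^{N} \Lambda x_n = \lim_{N\to\infty} \sum_{n=1}^{N} (y_n - y_{n+1}) = y_1 - \lim_{N\to\infty} y_{N+1} = y_1.$$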
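2.2.1. Fourier coefficients of integrable functions. Here we apply the Open Mapping Theorem, together with Lusin's Theorem and the Dominated Convergence Theorem, to answer a question regarding Fourier coefficients of integrable functions on the circle group $\mathbb{T}$. Recall that for $f \in L^1(\mathbb{T})$, its Fourier coefficients $\hat f(n)$ are defined by
$$\hat f(n) = \int_0^{2\pi} f(t)\, e^{-int}\, \frac{dt}{2\pi}, \qquad n \in \mathbb{Z}.$$
Then
$$|\hat f(n)| = \left| \int_0^{2\pi} f(t)\, e^{-int}\, \frac{dt}{2\pi} \right| \le \|f\|_{L^1(\mathbb{T})}$$
for all $n \in \mathbb{Z}$, i.e. $\mathcal{F} : f \mapsto \hat f$ is a bounded linear map from $L^1(\mathbb{T})$ to $\ell^\infty(\mathbb{Z})$ of norm 1 (note that $\hat 1 = \delta_0$). More is true because of the density of trigonometric polynomials $\sum_{n=-N}^{N} c_n e^{inx}$ in $L^1(\mathbb{T})$, namely the Riemann-Lebesgue lemma:
$$\lim_{|n|\to\infty} |\hat f(n)| = 0, \qquad f \in L^1(\mathbb{T}).$$
Remark 20. The set of trigonometric polynomials $\mathcal{P}$ is a self-adjoint subalgebra of $C(\mathbb{T})$ that separates points in the compact set $\mathbb{T}$, and is nonvanishing at every point of $\mathbb{T}$. Thus the Stone-Weierstrass Theorem shows that $\mathcal{P}$ is a dense subset of the metric space $C(\mathbb{T})$ with metric $d(f,g) = \sup_{x\in\mathbb{T}} |f(x) - g(x)|$. Combining this with the density of $C(\mathbb{T})$ in $L^1(\mathbb{T})$, namely Lemma 26, we obtain that $\mathcal{P}$ is dense in $L^1(\mathbb{T})$. Indeed, given $f \in L^1(\mathbb{T})$ and $\varepsilon > 0$, choose $g \in C(\mathbb{T})$ with $\int_{\mathbb{T}} |f - g|\, \frac{d\theta}{2\pi} < \frac{\varepsilon}{2}$ and then choose $P \in \mathcal{P}$ such that $\sup_{\mathbb{T}} |g - P| < \frac{\varepsilon}{2}$. Altogether we have
$$\mathrm{dist}_{L^1(\mathbb{T})}(f, P) = \int_{\mathbb{T}} |f - P|\, \frac{d\theta}{2\pi} \le \int_{\mathbb{T}} |f - g|\, \frac{d\theta}{2\pi} + \int_{\mathbb{T}} |g - P|\, \frac{d\theta}{2\pi} < \frac{\varepsilon}{2} + \sup_{\mathbb{T}} |g - P| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
To prove the Riemann-Lebesgue lemma, simply let $\varepsilon > 0$ be given and choose $P(x) = \sum_{n=-N}^{N} c_n e^{inx}$ such that $\|f - P\|_{L^1(\mathbb{T})} < \varepsilon$. Since $\hat P(n) = 0$ for $|n| > N$, we have
$$|\hat f(n)| = |\widehat{(f - P)}(n)| \le \|f - P\|_{L^1(\mathbb{T})} < \varepsilon$$
for $|n| > N$. Thus $\mathcal{F} : L^1(\mathbb{T}) \to \ell_0^\infty(\mathbb{Z})$ with norm 1, where $\ell_0^\infty(\mathbb{Z})$ is the closed subspace of $\ell^\infty(\mathbb{Z})$ consisting of those sequences with limit zero at $\pm\infty$. The following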
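application of the open mapping theorem shows that not every such sequence arises as the Fourier transform of an integrable function on $\mathbb{T}$.
Theorem 41. The Fourier transform $\mathcal{F} : L^1(\mathbb{T}) \to \ell_0^\infty(\mathbb{Z})$ is bounded and one-to-one, but not onto.
Proof: To see that $\mathcal{F}$ is one-to-one, suppose that $f \in L^1(\mathbb{T})$ and $\hat f(n) = 0$ for all $n \in \mathbb{Z}$. Then if $P(x) = \sum_{n=-N}^{N} c_n e^{inx}$ is a trigonometric polynomial,
$$\int_0^{2\pi} f(t) P(t)\, dt = \sum_{n=-N}^{N} c_n \int_0^{2\pi} f(t)\, e^{int}\, dt = 0, \tag{2.2}$$
and since trigonometric polynomials are dense in $C(\mathbb{T})$, we have
$$\int_0^{2\pi} f(t)\, g(t)\, dt = 0$$
for all $g \in C(\mathbb{T})$. Now let $E$ be a measurable subset of $\mathbb{T}$. By Lusin's Theorem there is a sequence of continuous functions $\{g_n\}_{n=1}^{\infty}$ such that $g_n = \chi_E$ except on a set of measure at most $2^{-n}$ and where $\|g_n\|_\infty \le 1$ for all $n \ge 1$. Thus $g_n \to \chi_E$ almost everywhere on $\mathbb{T}$, and the dominated convergence theorem shows that
$$\int_E f(t)\, dt = 0.$$
With $E$ equal to $\{t : f(t) > 0\}$ and $\{t : f(t) < 0\}$ (applied to the real and imaginary parts of $f$ when $f$ is complex-valued), we see that $f = 0$ a.e.
Now we prove that $\mathcal{F}$ is not onto by contradiction. If $\mathcal{F}(L^1(\mathbb{T})) = \ell_0^\infty(\mathbb{Z})$, then the open mapping theorem shows that there is $c > 0$ such that
$$\|\hat f\|_{\ell_0^\infty(\mathbb{Z})} \ge c\, \|f\|_{L^1(\mathbb{T})}, \qquad f \in L^1(\mathbb{T}). \tag{2.3}$$
But (2.3) fails if we take $f = D_n$, the Dirichlet kernel $D_n(\theta) = \sum_{k=-n}^{n} e^{ik\theta}$, for $n$ large, since
$$\|\hat D_n\|_{\ell_0^\infty(\mathbb{Z})} = \|\chi_{\{-n, 1-n, \dots, n-1, n\}}\|_{\ell_0^\infty(\mathbb{Z})} = 1$$
while $\|D_n\|_{L^1(\mathbb{T})} \nearrow \infty$.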
2.3. The closed graph theorem. If $X$ is any topological space and $Y$ is a Hausdorff space, then every continuous map $f : X \to Y$ has a closed graph (exercise: prove this). A statement that gives conditions under which the converse holds is referred to as a closed graph theorem. Here is an elementary example. Suppose that $X$ and $Y$ are metric spaces and $Y$ is compact. If the graph of $f$ is closed in $X \times Y$ then $f$ is continuous. Indeed, for metric spaces it is enough to show that every sequence $\{x_n\}_{n=1}^{\infty}$ in $X$ converging to a point $x \in X$ has a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ such that $f(x_{n_k}) \to f(x)$ as $k \to \infty$. However, since $Y$ is compact, $\{f(x_n)\}_{n=1}^{\infty}$ has a convergent subsequence, say $f(x_{n_k}) \to y \in Y$ as $k \to \infty$. Thus $(x, y)$ is a limit point of the graph $G = \{(x, f(x)) : x \in X\}$, and since $G$ is assumed closed, we have $(x, y) \in G$, i.e. $y = f(x)$. The next theorem gives the same conclusion for a linear map from one Banach space to another. Note that linearity is needed here since $f : \mathbb{R} \to \mathbb{R}$ given by
$$f(x) = \begin{cases} \frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$
has a closed graph, but is not continuous at the origin.
Theorem 42. (closed graph theorem) Suppose that $X$ and $Y$ are Banach spaces and $\Lambda : X \to Y$ is linear. If the graph $G = \{(x, \Lambda(x)) : x \in X\}$ is closed in $X \times Y$, then $\Lambda$ is continuous.
Proof: The product $X \times Y$ is a Banach space with the norm $\|(x, y)\| = \|x\|_X + \|y\|_Y$. Since $\Lambda$ is linear and the graph $G$ of $\Lambda$ is closed, $G$ is a closed subspace of $X \times Y$ and hence also a Banach space. Now the projection $\pi_1 : X \times Y \to X$ by $(x, y) \mapsto x$ restricts to a continuous linear map from the Banach space $G$ onto the Banach space $X$, and the open mapping theorem thus implies that $\pi_1$ is an open map on $G$. However, $\pi_1$ is clearly one-to-one on $G$ and so the inverse map $\pi_1^{-1} : X \to G$ exists and is continuous. But then the composition $\pi_2 \circ \pi_1^{-1} : X \to Y$ is also continuous, where $\pi_2 : X \times Y \to Y$ by $(x, y) \mapsto y$. We are done since $\pi_2 \circ \pi_1^{-1} = \Lambda$.
As a consequence of the closed graph theorem, we obtain the automatic continuity of symmetric linear operators on a Hilbert space.
Theorem 43. (Hellinger and Toeplitz) Suppose that $T$ is a linear operator on a Hilbert space $H$ satisfying $\langle Tx, y\rangle = \langle x, Ty\rangle$ for all $x, y \in H$. Then $T$ is continuous.
Proof: It is enough to show that $T$ has a closed graph $G$. So let $(x, z)$ be a limit point of $G$. Then there is a sequence $\{x_n\}_{n=1}^{\infty} \subset H$ such that $x_n \to x$ and $Tx_n \to z$. For every $y \in H$ the symmetry hypothesis now shows that
$$\langle T(x_n - x), y\rangle = \langle x_n - x, Ty\rangle \to 0$$
as $n \to \infty$. But we also have
$$\langle T(x_n - x), y\rangle = \langle Tx_n, y\rangle - \langle Tx, y\rangle \to \langle z, y\rangle - \langle Tx, y\rangle$$
as $n \to \infty$. Thus $\langle z - Tx, y\rangle = 0$ for all $y \in H$ and so $z = Tx$, which shows that $(x, z) \in G$.
3. Hilbert spaces
There is a class of special Banach spaces that enjoy many of the properties of the familiar Euclidean spaces $\mathbb{R}^n$ and $\mathbb{C}^n$, namely the Hilbert spaces, whose norms arise from an inner product. We follow the presentation in Rudin ([3]).
Definition 17. A complex vector space $H$ is an inner product space if there is a map $\langle \cdot, \cdot \rangle$ from $H \times H$ to $\mathbb{C}$ satisfying for all $x, y, z \in H$ and $\lambda \in \mathbb{C}$,
$$\langle x, y\rangle = \overline{\langle y, x\rangle},$$
$$\langle x + z, y\rangle = \langle x, y\rangle + \langle z, y\rangle,$$
$$\langle \lambda x, y\rangle = \lambda \langle x, y\rangle,$$
$$\langle x, x\rangle \ge 0 \quad \text{and} \quad \langle x, x\rangle = 0 \iff x = 0.$$
Then $\|x\| = \sqrt{\langle x, x\rangle}$ defines a norm on $H$ (see below) and if this makes $H$ into a Banach space, i.e. the metric $d(x, y) = \|x - y\|$ is complete, then we say $H$ is a Hilbert space.
A simple example of a Hilbert space is real or complex Euclidean space $\mathbb{R}^n$ or $\mathbb{C}^n$ with the usual inner product. More generally, the space $\ell^2(\mathbb{N})$ of square summable sequences $a = \{a_n\}_{n=1}^{\infty}$ with inner product $\langle a, b\rangle = \sum_{n=1}^{\infty} a_n \overline{b_n}$ is a Hilbert space. Both of these examples are included as special cases of the Hilbert space $L^2(\mu)$ where $\mu$ is a positive measure on a measure space $X$ and the inner
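product is $\langle f, g\rangle = \int_X f\, \overline{g}\, d\mu$. Note that an inner product $\langle \cdot, \cdot \rangle$ on an inner product space $H$ can always be recovered from its norm $\|\cdot\|$ by polarization (the imaginary part is obtained from $\operatorname{Im}\langle x, y\rangle = \operatorname{Re}\langle x, iy\rangle$):
$$4 \operatorname{Re}\langle x, y\rangle = \|x + y\|^2 - \|x - y\|^2, \qquad x, y \in H.$$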
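The polarization identity can be checked numerically in $\mathbb{C}^n$. The following is a minimal sketch, assuming the standard inner product on $\mathbb{C}^2$ (linear in the first slot) and the four-term form of the identity; the helper names `inner`, `norm_sq` and `polarize` and the sample vectors are illustrative assumptions, not part of the notes.

```python
def inner(x, y):
    """Inner product on C^n, linear in x and conjugate linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm <x, x>, which is always real."""
    return inner(x, x).real

def polarize(x, y):
    """Recover <x, y> from norms alone (four-term polarization identity)."""
    def comb(c):
        return [a + c * b for a, b in zip(x, y)]   # the vector x + c*y
    return 0.25 * (norm_sq(comb(1)) - norm_sq(comb(-1))
                   + 1j * norm_sq(comb(1j)) - 1j * norm_sq(comb(-1j)))

# Hypothetical sample vectors in C^2.
x = [1 + 2j, 3]
y = [-1j, 2 - 1j]
direct = inner(x, y)          # computed from the definition
recovered = polarize(x, y)    # computed from norms only
```

Here `direct` and `recovered` should agree up to rounding, which illustrates that the norm alone determines the inner product.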
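Lemma 28. Let $H$ be an inner product space and define $\|x\| = \sqrt{\langle x, x\rangle}$ for $x \in H$. Then $\|\cdot\|$ is a norm on $H$ and for all $x, y \in H$,
$$|\langle x, y\rangle| \le \|x\|\, \|y\|,$$
$$\|y\| \le \|\lambda x + y\| \text{ for all } \lambda \in \mathbb{C} \iff \langle x, y\rangle = 0,$$
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$
Proof: For $x, y \in H$ and $\lambda \in \mathbb{C}$,
$$0 \le \|\lambda x + y\|^2 = |\lambda|^2 \|x\|^2 + 2 \operatorname{Re}\left(\lambda \langle x, y\rangle\right) + \|y\|^2. \tag{3.1}$$
Thus $\langle x, y\rangle = 0$ implies $\|y\| \le \|\lambda x + y\|$ for all $\lambda \in \mathbb{C}$. Conversely, if $x \ne 0$ we minimize the right side of (3.1) with $\lambda = -\frac{\overline{\langle x, y\rangle}}{\|x\|^2}$ to get
$$0 \le \|\lambda x + y\|^2 = -\frac{|\langle x, y\rangle|^2}{\|x\|^2} + \|y\|^2.$$
This shows that $\|y\| \le \|\lambda x + y\|$ fails for some $\lambda$ if $\langle x, y\rangle \ne 0$, and also proves the Cauchy-Schwarz inequality $|\langle x, y\rangle| \le \|x\|\, \|y\|$. With $\lambda = 1$ in (3.1) we now have
$$\|x + y\|^2 = \|x\|^2 + 2 \operatorname{Re}\langle x, y\rangle + \|y\|^2 \le \|x\|^2 + 2\|x\|\, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$
which shows $\|\cdot\|$ satisfies the triangle inequality, and $\|\cdot\|$ is now easily seen to be a norm. Finally, the parallelogram law follows from expanding the inner products on the left side.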
The next easy theorem lies at the heart of the great success of Hilbert spaces in analysis.
Theorem 44. Suppose $E$ is a nonempty closed convex subset of a Hilbert space $H$. Then $E$ contains a unique element $x$ of minimal norm, i.e. $\|x\| = \inf_{y \in E} \|y\|$.
Proof: Let $d = \inf_{y \in E} \|y\|$, which is finite since $E$ is nonempty. Pick $\{x_n\}_{n=1}^{\infty} \subset E$ with $\|x_n\| \to d$ as $n \to \infty$. Since $E$ is convex, $\frac{x_m + x_n}{2} \in E$ and so has norm at least $d$. The parallelogram law now yields
$$\left\| \frac{x_m - x_n}{2} \right\|^2 = \frac{\|x_m\|^2 + \|x_n\|^2}{2} - \left\| \frac{x_m + x_n}{2} \right\|^2 \le \frac{\|x_m\|^2 + \|x_n\|^2}{2} - d^2 \to \frac{d^2 + d^2}{2} - d^2 = 0$$
as $m, n \to \infty$. Thus $\{x_n\}_{n=1}^{\infty}$ is Cauchy and since $H$ is complete and $E$ closed, $x = \lim_{n\to\infty} x_n \in E$. Since $\|\cdot\|$ is continuous, we have $\|x\| = d$. If $x' \in E$ also
satisfies $\|x'\| = d$, then using the parallelogram law as above yields
$$\left\| \frac{x - x'}{2} \right\|^2 = \frac{\|x\|^2 + \|x'\|^2}{2} - \left\| \frac{x + x'}{2} \right\|^2 \le d^2 - d^2 = 0,$$
hence $x = x'$.
Let $H$ be a Hilbert space. We say that $x$ and $y$ in $H$ are perpendicular, written $x \perp y$, if $\langle x, y\rangle = 0$. We say subsets $E$ and $F$ of $H$ are perpendicular, written $E \perp F$, if $\langle x, y\rangle = 0$ for all $x \in E$ and $y \in F$. Finally, we define
$$E^{\perp} = \{ y \in H : \langle x, y\rangle = 0 \text{ for all } x \in E \}.$$
The next theorem uses Theorem 44 to establish an orthogonal decomposition of $H$ relative to any closed subspace $M$ of a Hilbert space $H$.
Theorem 45. Suppose that $M$ is a closed subspace of a Hilbert space $H$. Then
$$H = M \oplus M^{\perp},$$
which means that $M$ and $M^{\perp}$ are closed subspaces of $H$ whose intersection is the smallest subspace $\{0\}$, and whose span is the largest subspace $H$. The representation
$$x = m + m^{\perp},$$
where $m \in M$ and $m^{\perp} \in M^{\perp}$, is uniquely determined for each $x \in H$.
Proof: $M^{\perp}$ is a subspace since $\langle x, y\rangle$ is conjugate linear in $y$, and is closed by the Cauchy-Schwarz inequality. The fact that $\langle x, x\rangle = 0 \implies x = 0$ gives $M \cap M^{\perp} = \{0\}$. Finally, to show $M + M^{\perp} = H$, let $x \in H$ and set $E = x - M$, a nonempty closed convex set. Thus by Theorem 44 there is a unique element $m^{\perp} \in x - M$ of minimal norm, having the form $x - m$ with $m \in M$. Thus for all $z \in M$ and $\lambda \in \mathbb{C}$,
$$\|m^{\perp}\| \le \|m^{\perp} + \lambda z\|,$$
and Lemma 28 implies that $\langle z, m^{\perp}\rangle = 0$ for all $z \in M$, which yields $m^{\perp} \in M^{\perp}$. Thus $x = m + m^{\perp} \in M + M^{\perp}$. If there is another such representation $x = n + n^{\perp}$, then
$$m - n = n^{\perp} - m^{\perp} \in M \cap M^{\perp} = \{0\},$$
and so $m = n$ and $m^{\perp} = n^{\perp}$.
Corollary 16. $(M^{\perp})^{\perp} = M$.
Proof: $M \subset (M^{\perp})^{\perp}$ is obvious, and since $M \oplus M^{\perp} = H = M^{\perp} \oplus (M^{\perp})^{\perp}$, we cannot have that $M$ is a proper subset of $(M^{\perp})^{\perp}$.
Definition 18. Let $M$ be a closed subspace of a Hilbert space $H$. Define $P_M : H \to M$ and $P_{M^{\perp}} : H \to M^{\perp}$ by $P_M x = m$ and $P_{M^{\perp}} x = m^{\perp}$ where $x = m + m^{\perp}$ with $m \in M$ and $m^{\perp} \in M^{\perp}$.
Lemma 29. $P_M$ and $P_{M^{\perp}}$ are linear maps satisfying
$$\|P_M x\|^2 + \|P_{M^{\perp}} x\|^2 = \|x\|^2, \qquad x \in H,$$
$$(P_M)^2 = P_M \quad \text{and} \quad (P_{M^{\perp}})^2 = P_{M^{\perp}}.$$
Definition 19. The element $P_M x$ is called the orthogonal projection of $x$ onto $M$.
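3.0.1. Bases. A subset $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ of a Hilbert space $H$ is orthonormal if $\langle u_\alpha, u_\beta\rangle = \delta_\alpha^\beta$. Given $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ orthonormal in a Hilbert space $H$, and $x \in H$, define the Fourier coefficients of $x$ (relative to $\mathcal{U}$) by
$$\hat x(\alpha) = \langle x, u_\alpha\rangle, \qquad \alpha \in A. \tag{3.2}$$
Theorem 46. Let $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ be an orthonormal set in a Hilbert space $H$, and suppose $\{\alpha_1, \dots, \alpha_N\}$ is a finite subset of $A$. Then
(1) $x = \sum_{n=1}^{N} c_n u_{\alpha_n}$ implies that $c_n = \hat x(\alpha_n)$ and $\|x\|^2 = \sum_{n=1}^{N} |\hat x(\alpha_n)|^2$.
(2) $x \in H$ implies
$$\left\| x - \sum_{n=1}^{N} \hat x(\alpha_n)\, u_{\alpha_n} \right\| \le \left\| x - \sum_{n=1}^{N} \lambda_n u_{\alpha_n} \right\|$$
for all scalars $\lambda_1, \dots, \lambda_N$, and moreover, equality holds if and only if $\lambda_n = \hat x(\alpha_n)$ for $1 \le n \le N$.
(3) The vector $\sum_{n=1}^{N} \hat x(\alpha_n)\, u_{\alpha_n}$ is the orthogonal projection of $x$ onto the linear space spanned by $\{u_{\alpha_n}\}_{n=1}^{N}$.
Proof: Statement (1) is a straightforward computation using orthonormality, and (2) is equivalent, after squaring and expanding, to the inequality
$$\|x\|^2 - \sum_{n=1}^{N} |\hat x(\alpha_n)|^2 \le \|x\|^2 - 2 \operatorname{Re} \sum_{n=1}^{N} \hat x(\alpha_n)\, \overline{\lambda_n} + \sum_{n=1}^{N} |\lambda_n|^2,$$
which in turn follows from
$$\left| \sum_{n=1}^{N} \hat x(\alpha_n)\, \overline{\lambda_n} \right| \le \sqrt{\sum_{n=1}^{N} |\hat x(\alpha_n)|^2}\, \sqrt{\sum_{n=1}^{N} |\lambda_n|^2}$$
together with $2ab \le a^2 + b^2$. Finally, (3) follows from (2) and the definition of orthogonal projection.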
Theorem 47. (Bessel's inequality) If $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ is an orthonormal set in a Hilbert space $H$, then
$$\sum_{\alpha \in A} |\hat x(\alpha)|^2 \le \|x\|^2 \quad \text{for all } x \in H.$$
Proof: Let $F = \{\alpha_1, \dots, \alpha_N\}$ be a finite subset of $A$ and let $M$ be the subspace spanned by $\{u_\alpha\}_{\alpha \in F}$. It is an easy exercise to use (1) of Theorem 46 to see that $M$ is closed, and then (3) of Theorem 46 shows that $P_M x = \sum_{n=1}^{N} \hat x(\alpha_n)\, u_{\alpha_n}$. Then by (1) of Theorem 46 and Lemma 29, we have
$$\sum_{\alpha \in F} |\hat x(\alpha)|^2 = \left\| \sum_{n=1}^{N} \hat x(\alpha_n)\, u_{\alpha_n} \right\|^2 = \|P_M x\|^2 \le \|P_M x\|^2 + \|P_{M^{\perp}} x\|^2 = \|x\|^2.$$
Now take the supremum over all finite subsets $F$ of $A$.
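Theorem 46 (3) and Bessel's inequality can be illustrated in $\mathbb{C}^3$. The sketch below assumes the standard inner product and a hypothetical two-element orthonormal set; the names `inner` and `project` and the sample data are assumptions made only for illustration.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def project(x, orthonormal):
    """Fourier coefficients <x, u> and the projection sum of <x, u> u."""
    coeffs = [inner(x, u) for u in orthonormal]
    n = len(x)
    p = [sum(c * u[k] for c, u in zip(coeffs, orthonormal)) for k in range(n)]
    return coeffs, p

# Hypothetical data: two orthonormal vectors in C^3 and a test vector x.
s = 1 / math.sqrt(2)
u1 = [s, s * 1j, 0]
u2 = [0, 0, 1]
x = [1 + 2j, 3, -1j]

coeffs, p = project(x, [u1, u2])
bessel_lhs = sum(abs(c) ** 2 for c in coeffs)   # sum of |x^(alpha)|^2
norm_sq = inner(x, x).real                      # ||x||^2
residual = [a - b for a, b in zip(x, p)]        # x minus its projection
```

The residual should be perpendicular to both `u1` and `u2`, and `bessel_lhs` should not exceed `norm_sq`, with the gap equal to the squared norm of the residual as in Lemma 29.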
Theorem 48. (Riesz-Fischer) If $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ is an orthonormal set in a Hilbert space $H$ and $\varphi \in \ell^2(A)$, then there is $x \in H$ such that $\hat x = \varphi$.
Proof: There is $B = \{\alpha_n\}_{n=1}^{\infty} \subset A$ such that $\varphi(\alpha) = 0$ for $\alpha \notin B$. Then $x_N \equiv \sum_{n=1}^{N} \varphi(\alpha_n)\, u_{\alpha_n}$ is Cauchy in $H$, hence convergent to some $x \in H$, and continuity now yields $\hat x = \varphi$.
The following fundamental theorem regarding orthonormal sets is an easy consequence of the above results.
Theorem 49. Suppose $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ is an orthonormal set in a Hilbert space $H$. Then the following statements are equivalent:
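(1) equality holds in Bessel's inequality, i.e.
$$\|x\| = \left( \sum_{\alpha \in A} |\hat x(\alpha)|^2 \right)^{\frac12} = \|\hat x\|_{\ell^2(A)}, \qquad x \in H,$$
(2) the linear map $\hat{\ } : H \to \ell^2(A)$ defined in (3.2) is a Hilbert space isomorphism of $H$ onto $\ell^2(A)$,
(3) $\mathcal{U}$ is a maximal orthonormal set (called an orthonormal basis),
(4) the linear span
$$\operatorname{Span} \mathcal{U} = \left\{ \sum_{\alpha \in F} c_\alpha u_\alpha : c_\alpha \text{ scalar}, F \text{ a finite subset of } A \right\}$$
is dense in $H$.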
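Proof. We prove (1) $\implies$ (2) $\implies$ (3) $\implies$ (4) $\implies$ (1). If (1) holds, then (2) follows by the Riesz-Fischer theorem, which shows $\hat{\ }$ is onto, and polarization, which shows that $\hat{\ }$ preserves inner products:
$$\langle x, y\rangle_H = \frac14 \left\{ \|x + y\|_H^2 - \|x - y\|_H^2 + i \|x + iy\|_H^2 - i \|x - iy\|_H^2 \right\}$$
$$= \frac14 \left\{ \|\hat x + \hat y\|_{\ell^2(A)}^2 - \|\hat x - \hat y\|_{\ell^2(A)}^2 + i \|\hat x + i\hat y\|_{\ell^2(A)}^2 - i \|\hat x - i\hat y\|_{\ell^2(A)}^2 \right\} = \langle \hat x, \hat y\rangle_{\ell^2(A)}, \qquad x, y \in H.$$
Now assume (2) holds. Then (3) holds since otherwise, there is $w \in H$ with $\|w\|_H = 1$ such that
$$\hat w(\alpha) = \langle \hat w, \hat u_\alpha\rangle_{\ell^2(A)} = \langle w, u_\alpha\rangle_H = 0, \qquad \alpha \in A,$$
i.e. $\hat w = 0$, contradicting $\|w\|_H = \|\hat w\|_{\ell^2(A)}$.
Next, assume that (3) holds. Then $\operatorname{Span} \mathcal{U}$ is dense in $H$ since otherwise, $\left( \overline{\operatorname{Span} \mathcal{U}} \right)^{\perp} \ne \{0\}$ by Theorem 45, and so there is $z \in H$ with $\|z\|_H = 1$ such that $\langle z, x\rangle_H = 0$ for all $x \in \operatorname{Span} \mathcal{U}$. In particular, $\langle z, u_\alpha\rangle_H = 0$ for all $\alpha \in A$, contradicting maximality of $\mathcal{U}$.
Finally, assume (4) holds so that $\overline{\operatorname{Span} \mathcal{U}} = H$. The linear isometry
$$\hat{\ } : \operatorname{Span} \mathcal{U} \to \ell^2(A), \qquad \hat x(\alpha) = \langle x, u_\alpha\rangle, \quad \alpha \in A,$$
has a unique continuous extension (isometries are trivially continuous) to $\overline{\operatorname{Span} \mathcal{U}} = H$, and this extension is easily seen to be a linear isometry. But this is precisely (1).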
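Corollary 17. If $\mathcal{U} = \{u_\alpha\}_{\alpha \in A}$ is an orthonormal basis for a Hilbert space $H$, then for each $x \in H$, the set $\{\alpha \in A : \hat x(\alpha) \ne 0\}$ is at most countable, i.e.
$$\{\alpha \in A : \hat x(\alpha) \ne 0\} = \{\alpha_n\}_{n=1}^{\infty \text{ or finite}},$$
and
$$x = \sum_{n=1}^{\infty \text{ or finite}} \hat x(\alpha_n)\, u_{\alpha_n},$$
with convergence of the series in $H$.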
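Proof. Theorem 49 (1) implies that $\sum_{\alpha \in A} |\hat x(\alpha)|^2 = \|x\|^2 < \infty$, and it follows that
$$\{\alpha \in A : \hat x(\alpha) \ne 0\} = \{\alpha_n\}_{n=1}^{\infty \text{ or finite}}$$
is at most countable. Theorem 49 (4) shows that given $\varepsilon > 0$, there is an element $\sum_{n=1}^{M} \lambda_n u_{\alpha_n}$ in $\operatorname{Span} \mathcal{U}$ such that $\left\| x - \sum_{n=1}^{M} \lambda_n u_{\alpha_n} \right\| < \varepsilon$, and then with $\lambda_n = 0$ for $M < n \le N$, Theorem 46 (2) shows that
$$\left\| x - \sum_{n=1}^{N} \hat x(\alpha_n)\, u_{\alpha_n} \right\| \le \left\| x - \sum_{n=1}^{M} \lambda_n u_{\alpha_n} \right\| < \varepsilon$$
for all $N \ge M$.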
The axiom of choice shows that there are lots of orthonormal bases in a Hilbert space.
Theorem 50. Every orthonormal set $\mathcal{U}$ in a Hilbert space $H$ is contained in a maximal orthonormal set $\mathcal{B}$.
Proof: Following the standard transfinite recipe, we let $\Gamma$ be the class of all orthonormal sets containing $\mathcal{U}$, partially ordered by inclusion. By the Hausdorff Maximality Theorem, $\Gamma$ contains a maximal totally ordered class $\Omega$. It is straightforward to show that $\mathcal{B} = \bigcup \{\mathcal{V} : \mathcal{V} \in \Omega\}$ is a maximal orthonormal set in $H$.
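Example 5. Here are two examples of orthonormal bases in the Hilbert spaces $L^2(\mathbb{T})$ and $L^2(\mathbb{R}, \mu)$ respectively.
(1) The set $\mathcal{U} = \{e^{int}\}_{n \in \mathbb{Z}}$ is an orthonormal set in $L^2(\mathbb{T})$, i.e.
$$\langle e^{imt}, e^{int}\rangle = \int_0^{2\pi} e^{imt}\, \overline{e^{int}}\, \frac{dt}{2\pi} = \begin{cases} 0 & \text{if } m \ne n \\ 1 & \text{if } m = n \end{cases}.$$
The Stone-Weierstrass Theorem, together with Exercise 15, shows that $\operatorname{Span} \mathcal{U}$ is dense in $H = L^2(\mathbb{T})$, and thus by Theorem 49 the map $\mathcal{F} : L^2(\mathbb{T}) \to \ell^2(\mathbb{Z})$ given by
$$\mathcal{F} f(n) = \hat f(n) = \langle f, e^{int}\rangle = \int_0^{2\pi} f(t)\, e^{-int}\, \frac{dt}{2\pi}, \qquad n \in \mathbb{Z},$$
is a Hilbert space isomorphism of $L^2(\mathbb{T})$ onto $\ell^2(\mathbb{Z})$. Thus $\{e^{int}\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2(\mathbb{T})$.
(2) Let $\mathcal{D} = \bigcup_{k \in \mathbb{Z}} \mathcal{D}_k$ be the union of the collections $\mathcal{D}_k = \{[j 2^k, (j+1) 2^k)\}_{j \in \mathbb{Z}}$ of right open left closed intervals of length $2^k$ having left endpoint in $2^k \mathbb{Z}$. We refer to the intervals in $\mathcal{D}$ as dyadic intervals. For each dyadic interval $I$, the left half $I_-$ and the right half $I_+$ are referred to as the children of $I$. Now suppose that $\mu$ is a positive measure on $\mathbb{R}$, and for convenience we suppose that
$$\mu(I) > 0 \quad \text{for every } I \in \mathcal{D}.$$
Then for every $I \in \mathcal{D}$ we define the Haar function $h_I^\mu$ by
$$h_I^\mu(x) = \sqrt{\frac{\mu(I_-)\, \mu(I_+)}{\mu(I)}} \left( -\frac{\mathbf{1}_{I_-}(x)}{\mu(I_-)} + \frac{\mathbf{1}_{I_+}(x)}{\mu(I_+)} \right), \qquad x \in \mathbb{R}.$$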
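Here we are writing $\mathbf{1}_E(x)$ for the indicator function $\chi_E(x)$. Thus the Haar function $h_I^\mu$ is supported in $I$ and is constant on each child $I_\pm$ of $I$. In the special case $\mu$ is Lebesgue measure $\lambda_1$, and $I$ has length $2^k$, $h_I^{\lambda_1}$ takes on the value $-\frac{1}{\sqrt{2^k}}$ on the left half of $I$, and the value $\frac{1}{\sqrt{2^k}}$ on the right half of $I$ (draw a picture!). The collection of Haar functions $\{h_I^\mu\}_{I \in \mathcal{D}}$ has the following elementary properties:
$$\operatorname{supp} h_I^\mu \subset I,$$
$$\int h_I^\mu\, d\mu = \sqrt{\frac{\mu(I_-)\, \mu(I_+)}{\mu(I)}} \left( -\frac{1}{\mu(I_-)} \int_{I_-} d\mu + \frac{1}{\mu(I_+)} \int_{I_+} d\mu \right) = 0,$$
$$\int |h_I^\mu|^2\, d\mu = \frac{\mu(I_-)\, \mu(I_+)}{\mu(I)} \left( \frac{1}{\mu(I_-)^2} \int_{I_-} d\mu + \frac{1}{\mu(I_+)^2} \int_{I_+} d\mu \right) = \frac{\mu(I_-)\, \mu(I_+)}{\mu(I)} \left( \frac{1}{\mu(I_-)} + \frac{1}{\mu(I_+)} \right) = \frac{\mu(I_+) + \mu(I_-)}{\mu(I)} = 1.$$
Moreover, there follows the crucial orthogonality property:
$$\int h_I^\mu\, h_J^\mu\, d\mu = 0, \quad \text{if } I, J \in \mathcal{D} \text{ and } I \ne J.$$
Indeed, two distinct dyadic intervals are either disjoint, in which case the product vanishes, or one is a proper subinterval of the other. In the latter case orthogonality follows simply from (1): $\int h_J^\mu\, d\mu = 0$, and (2): if $J$ is a proper dyadic subinterval of a dyadic interval $I$, then $h_I^\mu$ is constant on the support of $h_J^\mu$.
Altogether we have shown that $\{h_I^\mu\}_{I \in \mathcal{D}}$ is an orthonormal set in $L^2(\mu)$. It can be shown, using the differentiation theory two chapters below, that $\{h_I^\mu\}_{I \in \mathcal{D}}$ is actually an orthonormal basis for $L^2(\mu)$.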
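The orthonormality of the Haar functions can be checked directly in the Lebesgue case. The following is a minimal sketch restricted to dyadic subintervals of $[0,1)$, sampled on a grid of $2^5$ cells so that every Haar function used is constant on each cell and the integrals reduce to exact finite sums. The resolution `M`, the helper names, and the indexing of an interval by `level` (so that its length is $2^{-\text{level}}$) are assumptions made for illustration.

```python
import math

M = 5                      # grid resolution: 2**M cells on [0, 1)
CELLS = 2 ** M
WIDTH = 1.0 / CELLS

def haar(level, j):
    """Samples of the Lebesgue Haar function on I = [j/2**level, (j+1)/2**level)."""
    length = 2.0 ** (-level)
    left, mid, right = j * length, (j + 0.5) * length, (j + 1) * length
    values = []
    for c in range(CELLS):
        x = (c + 0.5) * WIDTH          # midpoint of the c-th grid cell
        if left <= x < mid:
            values.append(-1.0 / math.sqrt(length))   # left child I_-
        elif mid <= x < right:
            values.append(1.0 / math.sqrt(length))    # right child I_+
        else:
            values.append(0.0)                        # outside I
    return values

def integral(f, g):
    """Integral of f*g over [0, 1) for functions constant on grid cells."""
    return sum(a * b for a, b in zip(f, g)) * WIDTH

# All Haar functions on dyadic subintervals of [0, 1) of length >= 2 cells.
family = [haar(level, j) for level in range(M) for j in range(2 ** level)]
gram_ok = all(
    abs(integral(f, g) - (1.0 if a == b else 0.0)) < 1e-12
    for a, f in enumerate(family) for b, g in enumerate(family)
)
```

If `gram_ok` is true, the Gram matrix of the sampled family is the identity up to rounding, which is the orthonormality statement above for this finite subfamily.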
Remark 21. In the special case that $\mu$ is Lebesgue measure on the real line $\mathbb{R}$, the set of Haar functions $\{h_I\}_{I \in \mathcal{D}}$ is generated by translation and dilation of the single function
$$h_{[0,1)} = -\mathbf{1}_{[0, \frac12)} + \mathbf{1}_{[\frac12, 1)}.$$
Thus the Haar basis $\{h_I\}_{I \in \mathcal{D}}$ is the simplest example of a wavelet basis, an orthonormal basis that is generated by translations and dilations of a fixed mother wavelet. Such wavelet bases have been characterized, and their properties catalogued, by Daubechies and others.
Next we give an application of Hilbert space theory and the uniform boundedness principle to nonconvergence of Fourier series.
3.0.2. Nonconvergence of Fourier series of continuous functions. Recall the orthonormal basis $\{e^{int}\}_{n \in \mathbb{Z}}$ of $L^2(\mathbb{T})$ in the example above. Now consider the symmetric partial sums $S_n f$ of the Fourier series of $f \in L^1(\mathbb{T})$:
$$S_n f(x) = \sum_{k=-n}^{n} \hat f(k)\, e^{ikx} = \sum_{k=-n}^{n} \left( \int_0^{2\pi} f(t)\, e^{-ikt}\, \frac{dt}{2\pi} \right) e^{ikx} = \int_0^{2\pi} f(t) \left\{ \sum_{k=-n}^{n} e^{ik(x-t)} \right\} \frac{dt}{2\pi} = \int_0^{2\pi} f(t)\, D_n(x - t)\, \frac{dt}{2\pi} = f * D_n(x),$$
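where
$$D_n(\theta) = \sum_{k=-n}^{n} e^{ik\theta} = \frac{\left( e^{i\frac{\theta}{2}} - e^{-i\frac{\theta}{2}} \right) \sum_{k=-n}^{n} e^{ik\theta}}{e^{i\frac{\theta}{2}} - e^{-i\frac{\theta}{2}}} = \frac{e^{i(n+\frac12)\theta} - e^{-i(n+\frac12)\theta}}{e^{i\frac{\theta}{2}} - e^{-i\frac{\theta}{2}}} = \frac{\sin\left( (n + \frac12)\theta \right)}{\sin\frac{\theta}{2}}$$
satisfies
$$\int_0^{2\pi} |D_n(\theta)|\, \frac{d\theta}{2\pi} \ge 2 \int_0^{\pi} \left| \frac{\sin\left( (n + \frac12)\theta \right)}{\frac{\theta}{2}} \right| \frac{d\theta}{2\pi} = \frac{2}{\pi} \int_0^{(n+\frac12)\pi} |\sin\theta|\, \frac{d\theta}{\theta} \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin\theta|\, d\theta = \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k},$$
and so tends to $\infty$ as $n \to \infty$.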
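The growth of $\|D_n\|_{L^1(\mathbb{T})}$ can be observed numerically. The sketch below approximates the normalized integral by a midpoint rule and compares it with the lower bound $\frac{4}{\pi^2} \sum_{k=1}^{n} \frac1k$ derived above; the function names, the sample count, and the choice of $n$ are assumptions for illustration, and the quadrature is only approximate.

```python
import math

def dirichlet(n, theta):
    """D_n(theta) = sum_{k=-n}^{n} e^{ik theta} = 1 + 2 * sum_{k=1}^{n} cos(k theta)."""
    return 1.0 + 2.0 * sum(math.cos(k * theta) for k in range(1, n + 1))

def l1_norm(n, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * integral over [0, 2pi] of |D_n|."""
    h = 2.0 * math.pi / samples
    total = sum(abs(dirichlet(n, (i + 0.5) * h)) for i in range(samples))
    return total * h / (2.0 * math.pi)

def lower_bound(n):
    """The bound (4/pi^2) * (1 + 1/2 + ... + 1/n) from the estimate above."""
    return 4.0 / math.pi ** 2 * sum(1.0 / k for k in range(1, n + 1))

norms = {n: l1_norm(n) for n in (1, 4, 16)}
```

For $n = 1$ the integral can be computed by hand as $\frac13 + \frac{2\sqrt3}{\pi}$, which gives a sanity check on the quadrature. The computed norms should increase with $n$ and stay above `lower_bound(n)`.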
From the Hilbert space theory above, we obtain that $S_n f$ converges to $f$ in $L^2(\mathbb{T})$ for all $f \in L^2(\mathbb{T})$:
$$\|S_n f - f\|_{L^2(\mathbb{T})}^2 = \sum_{|k| > n} |\hat f(k)|^2 \to 0 \text{ as } n \to \infty, \qquad f \in L^2(\mathbb{T}).$$
For $f \in C(\mathbb{T})$ we ask if we have pointwise convergence of $S_n f$ to $f$ on $\mathbb{T}$. However, the property $\sup_{n \ge 1} \|D_n\|_{L^1(\mathbb{T})} = \infty$ of the Dirichlet kernel $D_n$, when combined with the uniform boundedness principle, implies that there are continuous functions $f \in C(\mathbb{T})$ whose Fourier series $\sum_{k=-\infty}^{\infty} \hat f(k)\, e^{ikx}$ fail to converge at some points $x$ in $\mathbb{T}$. In fact there is a dense $G_\delta$ subset $E$ of $C(\mathbb{T})$ (a set is a $G_\delta$ subset of $X$ if it is a countable intersection of open subsets of $X$) such that $\{x \in \mathbb{T} : S_n f(x) \text{ fails to converge at } x\}$ contains a dense $G_\delta$ subset of $\mathbb{T}$ for every $f \in E$.
To see this, set $\Lambda_n f = S_n f(0) = \int_0^{2\pi} f(t)\, D_n(t)\, \frac{dt}{2\pi}$. Then $\Lambda_n \in C(\mathbb{T})^*$ and $\|\Lambda_n\|_* = \int_0^{2\pi} |D_n(t)|\, \frac{dt}{2\pi} \nearrow \infty$ as $n \to \infty$. By the uniform boundedness principle we cannot have
$$\sup_{n \ge 1} |\Lambda_n f| = \sup_{n \ge 1} |S_n f(0)| < \infty \tag{3.3}$$
for $f$ in a dense $G_\delta$ subset of $C(\mathbb{T})$. In particular, there exists a continuous function $f$ on $\mathbb{T}$ whose Fourier series fails to converge at $0$. However, since the set $B$ of those $f$ for which (3.3) holds is a subspace, we cannot in fact have (3.3) throughout any open set, and it follows that
$$E_0 = \left\{ f \in C(\mathbb{T}) : \sup_{n \ge 1} |\Lambda_n f| = \infty \right\}$$
is dense. Since the map $f \mapsto \sup_{n \ge 1} |\Lambda_n f|$ is a lower semicontinuous function of $f$, we also have that $E_0$ is a $G_\delta$ subset.
Now choose $\{x_i\}_{i=1}^{\infty}$ dense in $\mathbb{T} = [0, 2\pi)$, and by applying the above argument with $x_i$ in place of $0$, choose $E_i$ to be a dense $G_\delta$ subset of $C(\mathbb{T})$ such that
$$\sup_{n \ge 1} |S_n f(x_i)| = \infty, \qquad f \in E_i, \quad i \ge 1.$$
By Baire's Theorem, $E = \bigcap_{i=1}^{\infty} E_i$ is also a dense $G_\delta$ subset of $C(\mathbb{T})$. Thus for every $f \in E$ we have $\sup_{n \ge 1} |S_n f(x_i)| = \infty$ for all $i \ge 1$. Now we note that $\sup_{n \ge 1} |S_n f(x)|$ is a lower semicontinuous function of $x$ (since it is a supremum of continuous functions), and thus the set
$$\left\{ x \in \mathbb{T} : \sup_{n \ge 1} |S_n f(x)| = \infty \right\}$$
is a $G_\delta$ subset of $\mathbb{T}$ for every $f \in C(\mathbb{T})$. Combining these observations yields that there is a dense $G_\delta$ subset $E$ of $C(\mathbb{T})$ such that for every $f \in E$, the set of $x$ where the Fourier series of $f$ fails to converge contains a dense $G_\delta$ subset of $\mathbb{T}$.
Remark 22. In a complete metric space $X$ without isolated points, every dense $G_\delta$ subset is uncountable. Indeed, if $E = \{x_k\}_{k=1}^{\infty} = \bigcap_{n=1}^{\infty} V_n$, $V_n$ open, is a countable dense $G_\delta$ subset of $X$, then $W_n = V_n \setminus \{x_k\}_{k=1}^{n}$ is still a dense open subset of $X$, but $\bigcap_{n=1}^{\infty} W_n = \emptyset$, contradicting Baire's Theorem.
Remark 23. A famous theorem of L. Carleson shows that for every $f \in L^2(\mathbb{T})$, $\lim_{n \to \infty} S_n f(x) = f(x)$ for a.e. $x \in \mathbb{T}$.
4. Duality
Given any normed linear space $X$ we define $X^*$ to be the vector space of all continuous linear functionals on $X$, i.e. continuous linear maps $\Lambda : X \to \mathbb{C}$ (or into $\mathbb{R}$ if the scalar field is real). We recall that a map $L$ from one normed linear space $X$ to another $Y$ is linear if $L(\lambda x + y) = \lambda L x + L y$ for all $x, y \in X$ and $\lambda \in \mathbb{C}$. Recall also that $L$ is said to be bounded if there is a nonnegative constant $C$ such that $\|Lx\|_Y \le C \|x\|_X$ for all $x \in X$. The proof of the next result is easy and is left to the reader.
Lemma 30. Let $L : X \to Y$ be linear where $X, Y$ are normed linear spaces. Then $L$ is bounded $\iff$ $L$ is continuous on $X$ $\iff$ $L$ is continuous at $0$.
By Lemma 30 a linear functional is continuous on $X$ if and only if it is continuous at the origin, or equivalently bounded. If we set
$$\|\Lambda\|_* = \sup_{\|x\| \le 1} |\Lambda x|, \tag{4.1}$$
then it is easily verified that $\|\cdot\|_*$ is a norm on $X^*$, and since the scalar field is complete, so is the metric on $X^*$ induced from $\|\cdot\|_*$. Thus $X^*$ is a Banach space (even if $X$ is not).
Remark 24. Note that $\|\Lambda\|_*$ is the smallest nonnegative constant $C$ which exhibits the boundedness of $\Lambda$ on $X$ in the inequality $|\Lambda x| \le C \|x\|$.
Now we specialize this definition to a Hilbert space $H$. An example of a continuous linear functional on $H$ is the linear functional $\Lambda_y$ associated with $y \in H$ given by
$$\Lambda_y x = \langle x, y\rangle, \qquad x \in H. \tag{4.2}$$
The boundedness of $\Lambda_y$ follows from the Cauchy-Schwarz inequality $|\Lambda_y x| \le \|y\|\, \|x\|$. In fact, this together with the choice $x = \frac{y}{\|y\|}$ in (4.1) yields $\|\Lambda_y\|_* = \|y\|$. It turns out that there are no other continuous linear functionals on $H$, and this is the first major consequence of Theorem 45, and hence also of Theorem 44.
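Theorem 51 (Riesz representation theorem for Hilbert spaces). Let $H$ be a Hilbert space. Every $\Lambda \in H^*$ is of the form $\Lambda_y$ for some $y \in H$. Moreover, there is a conjugate linear isometry from $H$ to $H^*$ given by $y \mapsto \Lambda_y$ where $\Lambda_y$ is as in (4.2).
Proof: We've already shown that $\Lambda_y \in H^*$ with $\|\Lambda_y\|_* = \|y\|$, and since $\Lambda_{\lambda y} = \overline{\lambda}\, \Lambda_y$ we have that the map $y \mapsto \Lambda_y$ is a conjugate linear isometry from $H$ into $H^*$. To see that this map is onto, take $\Lambda \ne 0$ in $H^*$ and let $N = \{x \in H : \Lambda x = 0\} = \Lambda^{-1}(\{0\})$ be the null space of $\Lambda$. Since $N$ is a proper closed subspace of $H$, Theorem 45 shows that $N^{\perp} \ne \{0\}$. Take $z \ne 0$ in $N^{\perp}$ and note that
$$(\Lambda x)\, z - (\Lambda z)\, x \in N \quad \text{for all } x \in H.$$
Thus
$$0 = \langle (\Lambda x)\, z - (\Lambda z)\, x, z\rangle = (\Lambda x)\, \|z\|^2 - (\Lambda z)\, \langle x, z\rangle$$
yields
$$\Lambda x = \frac{(\Lambda z)\, \langle x, z\rangle}{\|z\|^2} = \left\langle x, \frac{\overline{\Lambda z}}{\|z\|^2}\, z \right\rangle = \Lambda_y x, \qquad x \in H,$$
with $y = \frac{\overline{\Lambda z}}{\|z\|^2}\, z$.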
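In the finite-dimensional Hilbert space $\mathbb{C}^n$ the representing vector can be written down explicitly: with the standard basis $e_1, \dots, e_n$, one has $y_k = \overline{\Lambda e_k}$. The following is a minimal sketch of that special case only; the functional `Lam`, the helper `representer`, and the sample vector are hypothetical and chosen for illustration.

```python
def inner(x, y):
    """Standard inner product on C^n, conjugate linear in the second slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def representer(functional, n):
    """Return y in C^n with functional(x) == <x, y> for a linear functional."""
    basis = [[1 if i == k else 0 for i in range(n)] for k in range(n)]
    return [complex(functional(e)).conjugate() for e in basis]

# Hypothetical linear functional on C^3.
def Lam(x):
    return (2 - 1j) * x[0] + 3j * x[1] - x[2]

y = representer(Lam, 3)
x = [1 + 1j, -2, 0.5j]
lhs = Lam(x)           # value of the functional
rhs = inner(x, y)      # value of <x, y>
```

The two values `lhs` and `rhs` should agree, and by the theorem the norm of the functional equals the Euclidean norm of `y`.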
5. Essentially bounded functions
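Suppose that $(X, \mathcal{A}, \mu)$ is a measure space with $\mu(X) = 1$. Then Hölder's inequality shows that for $f$ measurable, $\|f\|_{L^p(\mu)} \in [0, \infty]$ is a nondecreasing function of $p \in (0, \infty)$. Indeed, if $0 < p_1 < p_2 < \infty$, $\|f\|_{L^{p_2}(\mu)} < \infty$ and $p = \frac{p_2}{p_1}$, then $1 < p < \infty$ and it follows that
$$\|f\|_{L^{p_1}(\mu)} = \left( \int_X |f(x)|^{p_1}\, d\mu(x) \right)^{\frac{1}{p_1}} \le \left( \int_X |f(x)|^{p_1 p}\, d\mu(x) \right)^{\frac{1}{p_1 p}} \left( \int_X d\mu(x) \right)^{\frac{1}{p_1 p'}} = \left( \int_X |f(x)|^{p_2}\, d\mu(x) \right)^{\frac{1}{p_2}} = \|f\|_{L^{p_2}(\mu)}.$$
Thus
$$\|f\|_{\star} \equiv \lim_{p \to \infty} \|f\|_{L^p(\mu)} = \sup_{0 < p < \infty} \|f\|_{L^p(\mu)} \in [0, \infty].$$
The question now arises as to what $\|f\|_{\star}$ actually measures. The answer lies in the following two observations. If $\lambda > \|f\|_{\star}$, then
$$\lambda\, \mu(\{|f| > \lambda\})^{\frac1p} \le \left( \int_{\{|f| > \lambda\}} |f|^p\, d\mu \right)^{\frac1p} \le \|f\|_{\star},$$
which implies
$$\mu(\{|f| > \lambda\}) \le \limsup_{p \to \infty} \left( \frac{\|f\|_{\star}}{\lambda} \right)^p = 0.$$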
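Conversely, if $\mu(\{|f| > \lambda\}) = 0$, then
$$\|f\|_{L^p(\mu)} = \left( \int_{\{|f| \le \lambda\}} |f|^p\, d\mu + \int_{\{|f| > \lambda\}} |f|^p\, d\mu \right)^{\frac1p} \le \lambda\, \mu(X)^{\frac1p} = \lambda,$$
which implies $\lambda \ge \|f\|_{\star}$. Thus we conclude that
$$\|f\|_{\star} = \inf \left\{ \lambda > 0 : \mu(\{|f| > \lambda\}) = 0 \right\},$$
which suggests we define the essential supremum of a measurable function in the following way.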
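Definition 20. Suppose that $(X, \mathcal{A}, \mu)$ is a measure space and that $f : X \to \mathbb{C}$ is measurable. The essential supremum of $f$ is defined to be
$$\|f\|_\infty = \inf \left\{ \lambda > 0 : \mu(\{|f| > \lambda\}) = 0 \right\}.$$
We set
$$L^\infty(\mu) = \{ f \text{ measurable} : \|f\|_\infty < \infty \}.$$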
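The monotone convergence of $\|f\|_{L^p(\mu)}$ to the essential supremum can be seen on a toy probability space. The sketch below assumes the uniform probability measure on a four-point set, so that every point has positive mass and the essential supremum is simply the maximum of $|f|$; the sample values and the list of exponents are illustrative assumptions.

```python
def lp_norm(values, p):
    """L^p norm with respect to the uniform probability measure on a finite set."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

values = [0.5, -2.0, 1.0, 3.0]            # hypothetical sample function f
exponents = [0.5, 1, 2, 4, 16, 64, 256]   # increasing values of p
norms = [lp_norm(values, p) for p in exponents]
ess_sup = max(abs(v) for v in values)     # every point has positive mass
```

The list `norms` should be nondecreasing in $p$ and bounded above by `ess_sup`, approaching it as $p$ grows.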
It is easy to show that $L^\infty(\mu)$ is a linear space and that $\|f\|_{L^\infty(\mu)} = \|f\|_\infty$ defines a norm on $L^\infty(\mu)$ (after identifying functions that agree outside a set of measure zero). It is surprisingly easy to show that $L^\infty(\mu)$ is complete. Indeed, if $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence in the metric space $L^\infty(\mu)$, then $\{f_n\}_{n=1}^{\infty}$ converges uniformly outside the exceptional set
$$E = \bigcup_{m,n=1}^{\infty} E_{m,n} = \bigcup_{m,n=1}^{\infty} \left\{ x \in X : |(f_m - f_n)(x)| > \|f_m - f_n\|_\infty \right\},$$
to a measurable function $f : (X \setminus E) \to \mathbb{C}$. Since
$$\mu(E) \le \sum_{m,n=1}^{\infty} \mu(E_{m,n}) = \sum_{m,n=1}^{\infty} 0 = 0,$$
we may view $f$ as belonging to $L^\infty(\mu)$. It is now evident that $\|f - f_n\|_\infty = \lim_{m \to \infty} \|f_m - f_n\|_\infty$ tends to $0$ as $n \to \infty$.
We have already established the first assertion in the second exercise below.
6. Exercises
Exercise 15. Under the hypotheses of Lemma 26, show that $C_c(X)$ is dense in the metric space $L^p(\mu)$.
Exercise 16. Suppose that $\mu(X) = 1$ and $\|f\|_{L^r(\mu)} < \infty$ for some $0 < r < \infty$. Then
(1) $\lim_{p \to \infty} \|f\|_{L^p(\mu)} = \|f\|_\infty$,
(2) $\lim_{p \to 0} \|f\|_{L^p(\mu)} = \exp \left\{ \int_X \ln |f|\, d\mu \right\}$.
CHAPTER 8
Complex measures and the Radon-Nikodym theorem
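We now wish to extend the notion of a positive measure to complex-valued functionals. We begin with an example.
Example 6. Given a positive measure $\nu$ on a measurable space $(X, \mathcal{A})$, and a complex-valued function $h \in L^1(\nu)$, we can define a set functional $\mu$ by
$$\mu(E) = \int_E h\, d\nu, \qquad E \in \mathcal{A}.$$
It is easy to verify that $\mu$ is a complex measure on $\mathcal{A}$, i.e. that the countable additivity in (0.1) below holds. Indeed, Corollary 14 shows that if $E = \bigsqcup_{n=1}^{\infty} E_n$ is a pairwise disjoint union, then
$$\int_E |h|\, d\nu = \sum_{n=1}^{\infty} \int_{E_n} |h|\, d\nu,$$
and it follows that
$$\left| \int_{\bigsqcup_{n=N+1}^{\infty} E_n} h\, d\nu \right| \le \int_{\bigsqcup_{n=N+1}^{\infty} E_n} |h|\, d\nu = \sum_{n=N+1}^{\infty} \int_{E_n} |h|\, d\nu \to 0$$
as $N \to \infty$. Now for each $N \ge 1$ we have
$$\sum_{n=1}^{N} \mu(E_n) = \sum_{n=1}^{N} \int_{E_n} h\, d\nu = \int_{\bigsqcup_{n=1}^{N} E_n} h\, d\nu = \int_{E \setminus \bigsqcup_{n=N+1}^{\infty} E_n} h\, d\nu = \int_E h\, d\nu - \int_{\bigsqcup_{n=N+1}^{\infty} E_n} h\, d\nu,$$
and taking limits as $N \to \infty$, we get $\sum_{n=1}^{\infty} \mu(E_n) = \int_E h\, d\nu = \mu(E)$.
More generally, we have the following definition. Consider a measurable space $(X, \mathcal{A})$ and a functional
$$\mu : \mathcal{A} \to \mathbb{C}.$$
Note that we do not permit $\mu$ to take on infinite values here.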
Definition 21. We say that $\mu$ is a complex measure on $\mathcal{A}$, or on $X$, if for every sequence $\{E_n\}_{n=1}^{\infty}$ of pairwise disjoint measurable sets, the series $\sum_{n=1}^{\infty} \mu(E_n)$ converges and we have
$$\mu \left( \bigsqcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n). \tag{0.1}$$
1. The total variation of a complex measure
The first observation we make is that the convergence of the series in (0.1) must be absolute, i.e. $\sum_{n=1}^{\infty} |\mu(E_n)| < \infty$. Indeed, for $0 \le k \le 2$, let
$$S_k = \left\{ r e^{i\theta} \in \mathbb{C} : 0 < r < \infty \text{ and } -\frac{\pi}{3} \le \theta - \frac{2\pi k}{3} < \frac{\pi}{3} \right\}$$
denote the sector of aperture $\frac{2\pi}{3}$ centred at the angle $\frac{2\pi k}{3}$. Then with $N_k = \{n : \mu(E_n) \in S_k\}$ we have
$$\sum_{n=1}^{\infty} |\mu(E_n)| = \sum_{k=0}^{2} \sum_{n \in N_k} |\mu(E_n)|,$$
and so if $\sum_{n=1}^{\infty} |\mu(E_n)| = \infty$, there is $k$ such that $\sum_{n \in N_k} |\mu(E_n)| = \infty$. Without loss of generality we take $k = 0$ and note that for $z \in S_0$ we have
$$\frac12 |z| \le \operatorname{Re} z \le |z|.$$
Thus we conclude that
$$\infty = \sum_{n \in N_0} \operatorname{Re}(\mu(E_n)) = \operatorname{Re} \left( \sum_{n \in N_0} \mu(E_n) \right),$$
and so the series $\sum_{n \in N_0} \mu(E_n)$ does not converge, contradicting (0.1) and the fact that the sets $\{E_n\}_{n \in N_0}$ are measurable and pairwise disjoint.
The above observation suggests the possibility that there exists a closely related positive measure associated with $\mu$, namely the nonnegative set functional $|\mu| : \mathcal{A} \to [0, \infty]$ defined by
$$|\mu|(E) = \sup \left\{ \sum_{n=1}^{\infty} |\mu(E_n)| : E = \bigsqcup_{n=1}^{\infty} E_n \text{ with } E_n \in \mathcal{A} \text{ for all } n \ge 1 \right\}.$$
This set functional $|\mu|$ is referred to as the total variation of $\mu$, and turns out to be a positive measure on $\mathcal{A}$ with $|\mu|(X) < \infty$.
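Theorem 52. Let $(X, \mathcal{A})$ be a measurable space, and suppose $\mu$ is a complex measure on $\mathcal{A}$. Then the total variation $|\mu|$ of $\mu$ is a positive measure on $\mathcal{A}$ with $|\mu|(X) < \infty$.
Proof: Let $E = \bigsqcup_{n=1}^{\infty} E_n$ be a decomposition of $E \in \mathcal{A}$ into pairwise disjoint measurable sets. To prove the inequality
$$|\mu|(E) \le \sum_{n=1}^{\infty} |\mu|(E_n), \tag{1.1}$$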
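let $E = \bigsqcup_{m=1}^{\infty} F_m$ be an arbitrary decomposition of $E$ into pairwise disjoint measurable sets. Then
$$\sum_{m=1}^{\infty} |\mu(F_m)| = \sum_{m=1}^{\infty} \left| \sum_{n=1}^{\infty} \mu(F_m \cap E_n) \right| \le \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |\mu(F_m \cap E_n)| = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |\mu(F_m \cap E_n)| \le \sum_{n=1}^{\infty} |\mu|(E_n),$$
and if we take the supremum over all decompositions $E = \bigsqcup_{m=1}^{\infty} F_m$ we obtain (1.1).
Now we turn to proving
$$|\mu|(E) \ge \sum_{n=1}^{\infty} |\mu|(E_n). \tag{1.2}$$
Since at this point $|\mu|(E_n)$ could be infinite, we cannot use $|\mu|(E_n) - \varepsilon < |\mu|(E_n)$ for a small positive $\varepsilon$, and instead we let $t_n$ be any nonnegative real number satisfying $t_n < |\mu|(E_n)$. Then there is a decomposition $E_n = \bigsqcup_{m=1}^{\infty} F_m^n$ satisfying
$$t_n < \sum_{m=1}^{\infty} |\mu(F_m^n)|. \tag{1.3}$$
It follows that
$$\sum_{n=1}^{\infty} t_n \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |\mu(F_m^n)| \le |\mu|(E),$$
and taking the supremum over sequences $\{t_n\}_{n=1}^{\infty}$ satisfying $t_n < |\mu|(E_n)$, we obtain (1.2).
Finally, we prove that $|\mu|(X) < \infty$. Suppose, in order to derive a contradiction, that $|\mu|(X) = \infty$. Then there is a decomposition $X = \bigsqcup_{n=1}^{\infty} E_n$ with
$$\sum_{n=1}^{\infty} |\mu(E_n)| > 6\, (|\mu(X)| + 1).$$
Using the notation introduced before the statement of Theorem 52 we have
$$6\, (|\mu(X)| + 1) < \sum_{n=1}^{\infty} |\mu(E_n)| = \sum_{k=0}^{2} \left\{ \sum_{n \in N_k} |\mu(E_n)| \right\},$$
and so there is $k \in \{0, 1, 2\}$ such that $\sum_{n \in N_k} |\mu(E_n)| > 2\, (|\mu(X)| + 1)$. Without loss of generality, $k = 0$ and using $\frac12 |z| \le \operatorname{Re} z \le |z|$ for $z \in S_0$ we have
$$2\, (|\mu(X)| + 1) < \sum_{n \in N_0} |\mu(E_n)| \le 2 \sum_{n \in N_0} \operatorname{Re}(\mu(E_n)) = 2 \operatorname{Re} \left( \sum_{n \in N_0} \mu(E_n) \right),$$
and thus
$$\left| \sum_{n \in N_0} \mu(E_n) \right| \ge \operatorname{Re} \left( \sum_{n \in N_0} \mu(E_n) \right) > |\mu(X)| + 1.$$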
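Now set $A = \bigsqcup_{n \in N_0} E_n$ and $B = X \setminus A$ so that
$$|\mu(A)| = \left| \sum_{n \in N_0} \mu(E_n) \right| > |\mu(X)| + 1 \ge 1,$$
$$|\mu(B)| = |\mu(X) - \mu(A)| \ge |\mu(A)| - |\mu(X)| > |\mu(X)| + 1 - |\mu(X)| = 1.$$
Now $\infty = |\mu|(X) = |\mu|(A) + |\mu|(B)$ implies that at least one of $A$ and $B$ has infinite $|\mu|$-measure, say $B$. Then we define $A_1 = A$ and $B_1 = B$ so that
$$X = A_1 \sqcup B_1 \quad \text{with } |\mu(A_1)| > 1 \text{ and } |\mu|(B_1) = \infty.$$
Now iterate this construction with $B_1$ in place of $X$ to obtain measurable sets $A_2$ and $B_2$ such that
$$B_1 = A_2 \sqcup B_2 \quad \text{with } |\mu(A_2)| > 1 \text{ and } |\mu|(B_2) = \infty.$$
Continuing by induction we obtain sequences $\{A_n\}_{n=1}^{\infty}$ and $\{B_n\}_{n=1}^{\infty}$ of measurable sets satisfying
$$B_{n-1} = A_n \sqcup B_n \quad \text{with } |\mu(A_n)| > 1 \text{ and } |\mu|(B_n) = \infty, \qquad n \ge 2.$$
Now let $A = \bigsqcup_{n=1}^{\infty} A_n$ be the union of the pairwise disjoint sets $\{A_n\}_{n=1}^{\infty}$. Then we must have
$$\mu(A) = \sum_{n=1}^{\infty} \mu(A_n),$$
but this is impossible since the series on the right is divergent: $|\mu(A_n)| > 1$ for all $n$.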
Definition 22. Let $(X, \mathcal{A})$ be a measurable space. If $\mu, \nu$ are complex measures on $X$, then so is $\alpha\mu + \beta\nu$ for $\alpha, \beta \in \mathbb{C}$ where
$$(\alpha\mu + \beta\nu)(E) = \alpha\mu(E) + \beta\nu(E), \qquad E \in \mathcal{A}.$$
Denote by $M(X)$ the normed linear space of complex measures on $X$ with norm given by
$$\|\mu\| = |\mu|(X), \qquad \mu \in M(X).$$
In an exercise below you are asked to show that $M(X)$ is a Banach space.
2. The Radon-Nikodym theorem
Every complex number $z$ has a representation in polar coordinates as $z = |z|\,\eta$ where $|z| \ge 0$ and $|\eta| = 1$ (we usually write $\eta = e^{i\theta}$ as well). It turns out that there is a similar representation of a complex measure $\mu$ on a measurable space $(X, \mathcal{A})$ as (see Example 6 above)
\[ \mu = \eta\,|\mu|, \tag{2.1} \]
where $|\mu|$ is the total variation of $\mu$, and $\eta$ is a measurable function on $X$ satisfying $|\eta(x)| = 1$ for all $x \in X$. This representation of a complex measure $\mu$ is often called the polar representation of $\mu$.
In the special case that $\mu$ takes on only real values, we call $\mu$ a real measure, and in the polar representation (2.1), we have $\eta(x) = \pm 1$ for all $x \in X$. In particular, if
\[ \mu_1 = \chi_{\{x:\ \eta(x) = 1\}}\,\mu \quad \text{and} \quad \mu_2 = -\chi_{\{x:\ \eta(x) = -1\}}\,\mu, \]
then both $\mu_1$ and $\mu_2$ are positive measures on $X$ whose difference $\mu_1 - \mu_2$ is $\mu$, and $\mu_1$ and $\mu_2$ are carried by disjoint sets. We say that a positive or complex measure $\mu$ on $\mathcal{A}$ is carried by a set $A$ if $\mu(E) = 0$ for all $E \in \mathcal{A}$ such that $E \cap A = \emptyset$.
This decomposition $\mu = \mu_1 - \mu_2$, where the $\mu_i$ are positive measures carried by disjoint sets, is called the Hahn decomposition of the real measure $\mu$. Note also that $|\mu| = \mu_1 + \mu_2$. A much simpler decomposition is the Jordan decomposition
\[ \mu = \tfrac{1}{2}\,(|\mu| + \mu) - \tfrac{1}{2}\,(|\mu| - \mu) = \mu^{+} - \mu^{-}, \tag{2.2} \]
where $\mu^{\pm}$ are easily shown to be positive measures, but no claim is made regarding $\mu^{\pm}$ being carried by disjoint sets. It turns out that $\mu_1 = \mu^{+}$ and $\mu_2 = \mu^{-}$ so that $\mu^{\pm}$ are indeed carried by disjoint sets. But this is hard to prove, and we will obtain it from a much more general, and significantly deeper, decomposition of a complex measure; namely the Radon-Nikodym Theorem. To state this most important of the theorems in measure theory, we need some definitions.
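The Jordan decomposition (2.2) can be checked by hand in the simplest setting, a real measure on a finite set with every subset measurable, where the measure is determined by its point masses and $|\mu|$ has point masses $|w|$. The following is a minimal sketch under those assumptions; the weights and the helper name `jordan_parts` are made up for illustration.

```python
from fractions import Fraction

def jordan_parts(weights):
    """Return (mu_plus, mu_minus) for the real measure with the given point masses.

    On a finite set with all subsets measurable, |mu| has point masses |w|,
    so mu_plus = (|mu| + mu)/2 and mu_minus = (|mu| - mu)/2 can be computed
    atom by atom.
    """
    mu_plus = {x: (abs(w) + w) / 2 for x, w in weights.items()}
    mu_minus = {x: (abs(w) - w) / 2 for x, w in weights.items()}
    return mu_plus, mu_minus

# Hypothetical point masses on X = {a, b, c, d}.
mu = {"a": Fraction(3), "b": Fraction(-1, 2), "c": Fraction(0), "d": Fraction(-2)}
mu_plus, mu_minus = jordan_parts(mu)

# The sets that carry mu_plus and mu_minus in this discrete model.
support_plus = {x for x, w in mu_plus.items() if w > 0}
support_minus = {x for x, w in mu_minus.items() if w > 0}
```

In this discrete model the two supports are visibly disjoint, which is the Hahn property; the point of the theory that follows is that the same conclusion holds for a general real measure, where no atom-by-atom computation is available.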
Definition 23. Let $(X, \mathcal{A})$ be a measurable space. Suppose that $\mu, \nu$ are measures (complex or positive) on $\mathcal{A}$ and that $\lambda$ is a positive measure on $\mathcal{A}$. Then
(1) $\mu$ is said to be concentrated on (or carried by or lives on) a measurable set $A \in \mathcal{A}$ if
\[ \mu(E) = \mu(E \cap A) \quad \text{for all } E \in \mathcal{A}, \]
equivalently,
\[ \mu(E) = 0 \quad \text{for all } E \in \mathcal{A} \text{ with } E \cap A = \emptyset; \]
(2) $\mu$ and $\nu$ are said to be mutually singular if there are disjoint measurable sets $A, B \in \mathcal{A}$ such that $\mu$ is concentrated on $A$ and $\nu$ is concentrated on $B$. In this case we write $\mu \perp \nu$;
(3) $\mu$ is said to be absolutely continuous with respect to the positive measure $\lambda$ if
\[ \mu(E) = 0 \quad \text{for all null sets } E \text{ of } \lambda. \]
In this case we write $\mu \ll \lambda$.
Note that if in the first definition, the measure $\mu$ is a positive measure, then $\mu$ is concentrated on a set $A$ if and only if $\mu(A^c) = 0$. Of course this simple characterization doesn't extend to complex measures $\mu$. The following properties of these definitions are easy to prove.
Proposition 9. Let $(X, \mathcal{A})$ be a measurable space. Suppose that $\mu, \nu$ are measures (complex or positive) on $\mathcal{A}$ and that $\lambda$ is a positive measure on $\mathcal{A}$. Then
(1) the connections with the total variation of a measure are these:
(a) $\mu$ is concentrated on $A$ if and only if $|\mu|$ is concentrated on $A$;
(b) $\mu \perp \nu$ if and only if $|\mu| \perp |\nu|$;
(c) $\mu \ll \lambda$ if and only if $|\mu| \ll \lambda$;
(2) the connections with the additive structure of measures are these:
(a) $\mu \perp \lambda$ and $\nu \perp \lambda$ $\Longrightarrow$ $(\mu + \nu) \perp \lambda$;
(b) $\mu \ll \lambda$ and $\nu \ll \lambda$ $\Longrightarrow$ $(\mu + \nu) \ll \lambda$;
(3) the connections between $\ll$ and $\perp$ are these:
(a) $\mu \ll \lambda$ and $\nu \perp \lambda$ $\Longrightarrow$ $\mu \perp \nu$;
(b) $\mu \ll \lambda$ and $\mu \perp \lambda$ $\Longrightarrow$ $\mu = 0$.
Definition 24. A positive measure $\lambda$ on a measurable space $(X, \mathcal{A})$ is $\sigma$-finite if $X = \bigcup_{n=1}^{\infty} X_n$ is a countable union of measurable sets $X_n$ with $\lambda(X_n) < \infty$.
Now we can state the most important of the theorems in measure theory. It gives, under certain conditions, a decomposition of a complex measure $\mu$ into one piece that is absolutely continuous with respect to a given positive measure $\lambda$, and another piece that is mutually singular with respect to $\lambda$. Moreover, it describes completely the nature of the absolutely continuous piece, and shows that the measures in Example 6 are the only such pieces!
Theorem 53 (Radon-Nikodym Theorem). Let $(X, \mathcal{A})$ be a measurable space. Suppose that $\mu \in M(X)$ is a complex measure and that $\lambda$ is a positive $\sigma$-finite measure on $\mathcal{A}$.
(1) There is a unique pair of complex measures $\mu_a, \mu_s \in M(X)$ such that
\[ \mu = \mu_a + \mu_s \quad \text{where } \mu_a \ll \lambda \text{ and } \mu_s \perp \lambda. \]
If in addition $\mu$ is positive (and thus finite), so are $\mu_a$ and $\mu_s$.
(2) There is a unique $h \in L^1(\lambda)$ such that
\[ \mu_a(E) = \int_E h\,d\lambda, \quad \text{for all } E \in \mathcal{A}. \tag{2.3} \]
The function $h \in L^1(\lambda)$ in part (2) of the theorem is called the Radon-Nikodym derivative of $\mu$ with respect to $\lambda$ and is usually denoted
\[ h = \frac{d\mu}{d\lambda}. \]
This function will be obtained using the Riesz representation theorem 51 for an associated Hilbert space $L^2(\varphi)$.
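Proof: We begin with the proof of uniqueness. If
\[ \mu_a + \mu_s = \mu_a' + \mu_s' \quad \text{where } \mu_a, \mu_a' \ll \lambda \text{ and } \mu_s, \mu_s' \perp \lambda, \]
then
\[ \rho = \mu_a - \mu_a' = \mu_s' - \mu_s \quad \text{satisfies } \rho \ll \lambda \text{ and } \rho \perp \lambda, \]
hence by Proposition 9 (3)(b) we have $\rho = 0$. The uniqueness of $h \in L^1(\lambda)$ in part (2) is simply the fact that $\int_E h\,d\lambda = 0$ for all $E \in \mathcal{A}$ implies $h = 0$ $\lambda$-almost everywhere.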
Turning to existence, we first prove the special case where both $\mu$ and $\lambda$ are positive finite measures. Then the sum $\varphi = \mu + \lambda$ is also a positive finite measure, and the Cauchy-Schwarz inequality gives
\[ \Big| \int f\,d\mu \Big| \le \int |f|\,d\mu \le \int |f|\,d\varphi \le \Big( \int |f|^2\,d\varphi \Big)^{\frac{1}{2}} \sqrt{\varphi(X)} = \sqrt{\varphi(X)}\,\|f\|_{L^2(\varphi)} \]
for every $f \in L^2(\varphi)$. Thus we see that the map
\[ \Lambda f = \int f\,d\mu, \qquad f \in L^2(\varphi), \]
defines a bounded linear functional on the Hilbert space $L^2(\varphi)$! By the Riesz representation theorem 51 for Hilbert spaces, there is a unique $g \in L^2(\varphi)$ such that
\[ \int f g\,d\varphi = \langle f, \overline{g} \rangle_{L^2(\varphi)} = \Lambda f = \int f\,d\mu, \qquad f \in L^2(\varphi). \]
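We now claim that $g(x) \in [0,1]$ for $\varphi$-almost every $x \in X$. To see this, consider a ball $B(z, r)$ in the complex plane that doesn't intersect $[0,1]$, i.e. $B(z,r) \cap [0,1] = \emptyset$. Let $E = g^{-1}(B(z,r))$ and assume, in order to derive a contradiction, that $\varphi(E) > 0$. Then we have
\begin{align*}
\frac{\mu(E)}{\varphi(E)} &= \frac{1}{\varphi(E)} \int \chi_E\,d\mu = \frac{1}{\varphi(E)} \Lambda \chi_E = \frac{1}{\varphi(E)} \int \chi_E\, g\,d\varphi \\
&= \frac{1}{\varphi(E)} \int \chi_E\, z\,d\varphi + \frac{1}{\varphi(E)} \int \chi_E\,(g - z)\,d\varphi \\
&= z + \frac{1}{\varphi(E)} \int \chi_E\,(g - z)\,d\varphi,
\end{align*}
which shows that
\[ \Big| \frac{\mu(E)}{\varphi(E)} - z \Big| \le \frac{1}{\varphi(E)} \int \chi_E\,|g - z|\,d\varphi < \frac{1}{\varphi(E)} \int \chi_E\, r\,d\varphi = r, \]
since $|g(x) - z| < r$ for $x \in E$. Thus we have shown that $\frac{\mu(E)}{\varphi(E)} \in B(z, r)$. Since $0 \le \mu(E) \le \varphi(E)$ gives $\frac{\mu(E)}{\varphi(E)} \in [0,1]$, we have the desired contradiction to $B(z,r) \cap [0,1] = \emptyset$. Hence $\varphi(E) = 0$.
Now let $\{B(z_n, r_n)\}_{n=1}^{\infty}$ be a countable collection of balls satisfying
\[ \mathbb{C} \setminus [0,1] = \bigcup_{n=1}^{\infty} B(z_n, r_n). \]
It follows that
\[ \varphi\big( g^{-1}(\mathbb{C} \setminus [0,1]) \big) \le \sum_{n=1}^{\infty} \varphi\big( g^{-1}(B(z_n, r_n)) \big) = \sum_{n=1}^{\infty} 0 = 0, \]
which says that $g(x) \in [0,1]$ for $\varphi$-almost every $x \in X$.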
Thus we may assume that $g(x) \in [0,1]$ for all $x \in X$. We then have
\begin{align*}
\int (1 - g) f\,d\mu &= \int f\,d\mu - \int f g\,d\mu = \int f g\,d\varphi - \int f g\,d\mu \tag{2.4} \\
&= \int f g\,d(\mu + \lambda) - \int f g\,d\mu = \int f g\,d\lambda,
\end{align*}
for all $f \in L^2(\varphi)$. We can now define $\mu_a$ and $\mu_s$. Formally, we expect that $(1 - g)\,d\mu = g\,d\lambda$, hence
\[ \frac{d\mu}{d\lambda} = \frac{g}{1 - g}, \]
and this suggests that $\mu_a$ should live where $g < 1$ and that $\mu_s$ should live where $g = 1$. So let
\[ A = \{x \in X : 0 \le g(x) < 1\}, \qquad S = \{x \in X : g(x) = 1\}, \]
and set
\[ \mu_a(E) = \mu(E \cap A), \qquad \mu_s(E) = \mu(E \cap S), \qquad E \in \mathcal{A}. \]
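It is easy to see that $\mu_s \perp \lambda$. Indeed, since $g = 1$ on $S$, taking $f = \chi_S$ in (2.4) gives
\[ \lambda(S) = \int \chi_S\,d\lambda = \int \chi_S\, g\,d\lambda = \int (1 - g)\,\chi_S\,d\mu = 0, \]
which means that $\lambda$ is concentrated on $A = S^c$, while by definition $\mu_s$ is concentrated on $S$. To see that $\mu_a \ll \lambda$ is not much harder. If $E \in \mathcal{A}$ satisfies $\lambda(E) = 0$, then with $f = \chi_{E \cap A}$ in (2.4) we have
\[ 0 = \int \chi_{E \cap A}\, g\,d\lambda = \int f g\,d\lambda = \int (1 - g) f\,d\mu = \int \chi_{E \cap A}\,(1 - g)\,d\mu. \]
Since $1 - g > 0$ on $A$ we conclude that $\mu(E \cap A) = 0$, i.e. $\mu_a(E) = 0$.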
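Finally, to see that there is $h \in L^1(\lambda)$ satisfying (2.3), we note that for $E \in \mathcal{A}$ and $n \ge 1$, equation (2.4) with $f = (1 + g + \dots + g^n)\,\chi_E$ and the Monotone Convergence Theorem applied twice yields
\begin{align*}
\mu(E \cap A) &= \int_E \Big\{ \lim_{n \to \infty} \big( 1 - g^{n+1} \big) \Big\}\,d\mu \\
&= \lim_{n \to \infty} \int_E \big( 1 - g^{n+1} \big)\,d\mu = \lim_{n \to \infty} \int (1 - g)(1 + g + \dots + g^n)\,\chi_E\,d\mu \\
&= \lim_{n \to \infty} \int (1 + g + \dots + g^n)\,\chi_E\, g\,d\lambda \\
&= \lim_{n \to \infty} \int_E \big( g + g^2 + \dots + g^{n+1} \big)\,d\lambda \\
&= \int_E \frac{g}{1 - g}\,d\lambda,
\end{align*}
since both $\big( 1 - g^{n+1} \big) \nearrow \chi_A$ and $\big( g + g^2 + \dots + g^{n+1} \big) \nearrow \frac{g}{1 - g}$ pointwise as $n \to \infty$, the latter on $A$, which carries $\lambda$. Thus $h = \frac{g}{1 - g} \in L^1(\lambda)$, because $\int_X h\,d\lambda = \mu(A) < \infty$, and (2.3) holds. Note that both $\mu_a$ and $\mu_s$ are positive measures, and that $h$ is nonnegative.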
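The construction just completed can be traced on a finite set, where every measure is a list of point masses and each integral is a finite sum. The sketch below is only an illustration under that assumption (the masses and the name `lebesgue_decomposition` are invented): it forms $g = d\mu/d\varphi$ with $\varphi = \mu + \lambda$, splits the points into $A = \{g < 1\}$ and $S = \{g = 1\}$, and uses $h = g/(1-g)$ on $A$. On $S$, which is $\lambda$-null, $h$ is set to $0$, which changes nothing $\lambda$-almost everywhere.

```python
from fractions import Fraction

def lebesgue_decomposition(mu, lam):
    """Follow the proof on a finite set with nonnegative point masses.

    Returns (mu_a, mu_s, h) with mu = mu_a + mu_s, mu_s carried by
    S = {g = 1}, and mu_a given by the density h = g / (1 - g) against lam.
    """
    mu_a, mu_s, h = {}, {}, {}
    for x in mu:
        phi = mu[x] + lam[x]                      # phi = mu + lam
        # g = d(mu)/d(phi); a point of phi-mass zero carries nothing, so g := 0 there.
        g = mu[x] / phi if phi > 0 else Fraction(0)
        if g < 1:                                 # x belongs to A = {0 <= g < 1}
            mu_a[x], mu_s[x] = mu[x], Fraction(0)
            h[x] = g / (1 - g)
        else:                                     # x belongs to S = {g = 1}, where lam vanishes
            mu_a[x], mu_s[x] = Fraction(0), mu[x]
            h[x] = Fraction(0)
    return mu_a, mu_s, h

# Hypothetical point masses on X = {p, q, r, s}.
mu = {"p": Fraction(2), "q": Fraction(1), "r": Fraction(5), "s": Fraction(0)}
lam = {"p": Fraction(4), "q": Fraction(0), "r": Fraction(1), "s": Fraction(3)}
mu_a, mu_s, h = lebesgue_decomposition(mu, lam)
```

At the point `p`, for instance, $g = 2/6 = 1/3$ and $h = (1/3)/(2/3) = 1/2$, so $h\,\lambda(\{p\}) = 2 = \mu(\{p\})$; the point `q` has $\lambda$-mass zero and lands in the singular part.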
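Now we remove the additional assumptions on $\mu$ and $\lambda$. First, we consider the case where $\mu$ is positive and finite, and $\lambda$ is positive and $\sigma$-finite. It is easy to construct a pairwise disjoint decomposition $X = \dot{\bigcup}_{n=1}^{N \text{ or } \infty} X_n$ such that $0 < \lambda(X_n) < \infty$ for all $n$. Then define
\[ w = \sum_{n=1}^{N \text{ or } \infty} \frac{1}{2^n\,(1 + \lambda(X_n))}\,\chi_{X_n}. \]
Then we have both
\[ 0 < w(x) < 1 \quad \text{for all } x \in X, \]
and
\[ 0 < \int w\,d\lambda = \sum_{n=1}^{N \text{ or } \infty} \frac{\lambda(X_n)}{2^n\,(1 + \lambda(X_n))} < 1. \]
If we let $\lambda_0$ be the finite positive measure given by
\[ \lambda_0(E) = \int_E w\,d\lambda, \qquad E \in \mathcal{A}, \]
then from what we have already proved we obtain
\[ \mu = \mu_a + \mu_s, \quad \text{where } \mu_a \ll \lambda_0 \text{ and } \mu_s \perp \lambda_0, \]
\[ \mu_a(E) = \int_E h_0\,d\lambda_0, \quad \text{for all } E \in \mathcal{A}, \text{ where } 0 \le h_0 \in L^1(\lambda_0). \]
Since $w > 0$ everywhere, $\lambda$ and $\lambda_0$ have the same null sets, so $\mu_a \ll \lambda$ and $\mu_s \perp \lambda$ both hold, as well as
\[ \mu_a(E) = \int_E h_0\,d\lambda_0 = \int_E h_0\, w\,d\lambda, \quad \text{for all } E \in \mathcal{A}, \]
so that $h = h_0 w \in L^1(\lambda)$ satisfies (2.3). Indeed, $\int_E f\,d\lambda_0 = \int_E f w\,d\lambda$ for all $f = \chi_F$ with $F \in \mathcal{A}$, hence for all $f$ simple, hence for all $f$ nonnegative including $f = h_0$.
Finally, to remove the restriction that $\mu$ is positive, write $\mu = \nu^{+} - \nu^{-} + i\rho^{+} - i\rho^{-}$ where $\nu$ and $\rho$ are the real and imaginary parts of $\mu$, and $\nu = \nu^{+} - \nu^{-}$ and $\rho = \rho^{+} - \rho^{-}$ are the Jordan decompositions of $\nu$ and $\rho$ respectively as defined in (2.2). Each of the four measures $\nu^{\pm}, \rho^{\pm}$ is positive and finite, so the case already proved applies to each, and the general case follows by linearity.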


Remark 25. The $\sigma$-finiteness of $\lambda$ cannot be dropped from the hypotheses of the Radon-Nikodym theorem. For example, if $\mu$ is Lebesgue measure on $[0,1]$ and $\lambda$ is counting measure on $[0,1]$, then $\mu \ll \lambda$, but if there were $h \in L^1(\lambda)$ such that $\mu(E) = \int_E h\,d\lambda$, then we'd have $h(x) = \int_{\{x\}} h\,d\lambda = \mu(\{x\}) = 0$ for all $x \in [0,1]$, yielding the contradiction $\mu = 0$.
We can now obtain as corollaries, both the polar representation of a complex measure and the Hahn decomposition of a real measure. We note that $\mu \ll |\mu|$ holds trivially where $|\mu|$ is the total variation of $\mu$.
Corollary 18 (Polar representation). Let $\mu$ be a complex measure on $(X, \mathcal{A})$. Then the Radon-Nikodym derivative $h = \frac{d\mu}{d|\mu|}$ satisfies $|h(x)| = 1$ for $|\mu|$-almost every $x \in X$.
Thus for a complex measure, we can write $d\mu(x) = e^{i\theta(x)}\,d|\mu|(x)$, explaining the term polar representation.
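Proof: We first claim that if $0 < r < 1$ and $E_r = \{x \in X : |h(x)| < r\}$, then $|\mu|(E_r) = 0$. Indeed, if $E_r = \dot{\bigcup}_{n=1}^{\infty} E_n$ then
\[ \sum_{n=1}^{\infty} |\mu(E_n)| = \sum_{n=1}^{\infty} \Big| \int_{E_n} h\,d|\mu| \Big| \le \sum_{n=1}^{\infty} \int_{E_n} |h|\,d|\mu| \le r \sum_{n=1}^{\infty} |\mu|(E_n) = r\,|\mu|(E_r). \]
Taking the supremum over all decompositions $E_r = \dot{\bigcup}_{n=1}^{\infty} E_n$ we obtain $0 \le |\mu|(E_r) \le r\,|\mu|(E_r)$, which implies $|\mu|(E_r) = 0$ since $r < 1$. It follows that
\[ |\mu|\big( \{x \in X : |h(x)| < 1\} \big) = \lim_{n \to \infty} |\mu|\Big( \Big\{x \in X : |h(x)| < 1 - \frac{1}{n}\Big\} \Big) = 0. \]
To show that $|\mu|(\{x \in X : |h(x)| > 1\})$ vanishes, we apply the averaging argument used in the proof of the Radon-Nikodym theorem. It suffices to show that if $B(z, r) \cap \overline{B(0,1)} = \emptyset$, then the subset $E = h^{-1}(B(z,r))$ of $X$ satisfies $|\mu|(E) = 0$. But if $|\mu|(E) > 0$, then we obtain
\begin{align*}
\frac{\mu(E)}{|\mu|(E)} &= \frac{1}{|\mu|(E)} \int \chi_E\,d\mu = \frac{1}{|\mu|(E)} \int \chi_E\, h\,d|\mu| \\
&= \frac{1}{|\mu|(E)} \int \chi_E\, z\,d|\mu| + \frac{1}{|\mu|(E)} \int \chi_E\,(h - z)\,d|\mu| \\
&= z + \frac{1}{|\mu|(E)} \int \chi_E\,(h - z)\,d|\mu|,
\end{align*}
which shows that
\[ \Big| \frac{\mu(E)}{|\mu|(E)} - z \Big| \le \frac{1}{|\mu|(E)} \int \chi_E\,|h - z|\,d|\mu| < \frac{1}{|\mu|(E)} \int \chi_E\, r\,d|\mu| = r. \]
Since $|\mu(E)| \le |\mu|(E)$ places $\frac{\mu(E)}{|\mu|(E)}$ in the closed unit disk, this contradicts $B(z,r) \cap \overline{B(0,1)} = \emptyset$.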
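Corollary 19 (Hahn decomposition). Let $\mu$ be a real measure on $(X, \mathcal{A})$. If $\mu = \mu^{+} - \mu^{-}$ is the Jordan decomposition of $\mu$, i.e. $\mu^{\pm} = \frac{1}{2}(|\mu| \pm \mu)$, then $\mu^{+} \perp \mu^{-}$. Moreover, if $h = \frac{d\mu}{d|\mu|}$ is the Radon-Nikodym derivative of $\mu$ with respect to its total variation $|\mu|$, then $|h(x)| = 1$ for $|\mu|$-almost every $x \in X$, and for $E \in \mathcal{A}$ we have
\[ \mu^{+}(E) = |\mu|\big( E \cap \{h = 1\} \big), \qquad \mu^{-}(E) = |\mu|\big( E \cap \{h = -1\} \big). \]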
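Remark 26. Using the Radon-Nikodym theorem it is easy to see that if $\mu$ is a complex measure and $\lambda$ is a $\sigma$-finite positive measure, then $\mu \ll \lambda$ if and only if for every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ |\mu(E)| < \varepsilon \quad \text{whenever } \lambda(E) < \delta. \tag{2.5} \]
Indeed, if $f = \frac{d\mu}{d\lambda} \in L^1(\lambda)$ is the Radon-Nikodym derivative, and if $\varepsilon > 0$, the Dominated Convergence Theorem shows that there is $M < \infty$ such that $\int_{\{|f| > M\}} |f|\,d\lambda < \frac{\varepsilon}{2}$. Then with $\delta = \frac{\varepsilon}{2M} > 0$, we have
\[ |\mu(E)| = \Big| \int_E f\,d\lambda \Big| \le \int_{\{|f| > M\}} |f|\,d\lambda + \int_{E \cap \{|f| \le M\}} |f|\,d\lambda < \frac{\varepsilon}{2} + M\lambda(E) < \frac{\varepsilon}{2} + M\delta = \varepsilon, \]
if $\lambda(E) < \delta$.
In fact, even for general positive measures $\lambda$, it is true that $\mu \ll \lambda$ if and only if (2.5) holds. To see this, suppose there is $\varepsilon > 0$ and sets $\{E_n\}_{n=1}^{\infty}$ with $\lambda(E_n) < \frac{1}{2^n}$ but $|\mu(E_n)| \ge \varepsilon$ for all $n \ge 1$. Then
\[ \lambda\Big( \bigcup_{n=m}^{\infty} E_n \Big) \le \sum_{n=m}^{\infty} 2^{-n} \to 0 \quad \text{as } m \to \infty, \]
and so the set $E = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} E_n$ yields the desired contradiction to Proposition 9 (1)(c):
\[ \lambda(E) = \lim_{m \to \infty} \lambda\Big( \bigcup_{n=m}^{\infty} E_n \Big) = 0, \]
\[ |\mu|(E) = \lim_{m \to \infty} |\mu|\Big( \bigcup_{n=m}^{\infty} E_n \Big) \ge \liminf_{n \to \infty} |\mu|(E_n) \ge \varepsilon. \]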
3. Exercises
Exercise 17. Show that $M(X)$ is complete, hence a Banach space.
Exercise 18. Prove Proposition 9.
CHAPTER 9
Differentiation of integrals
In this chapter we investigate to what extent we can differentiate the Lebesgue integral $\int_{\mathbb{R}^n} f\,d\lambda_n$ in order to recover the integrand $f$. In one dimension we have for $f \in C_c(\mathbb{R})$ the two familiar statements of the Fundamental Theorem of Calculus:
\[ \frac{d}{dx} \int_a^x f\,d\lambda = f(x), \qquad x \in \mathbb{R}, \]
\[ \int_{-\infty}^{\infty} f\,d\lambda = \lim_{b \to \infty} F(b) - \lim_{a \to -\infty} F(a), \]
where $F$ is any antiderivative of $f$. The first of these statements can be rewritten in the equivalent forms
\[ \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f\,d\lambda = \lim_{h \to 0} \frac{\int_a^{x+h} f\,d\lambda - \int_a^x f\,d\lambda}{h} = f(x), \]
and
\[ \lim_{|I| \to 0:\ x \in I} \frac{1}{|I|} \int_I f\,d\lambda = f(x), \]
for all $f \in C_c(\mathbb{R})$ and $x \in \mathbb{R}$. The latter limit is taken over all intervals $I$ that contain the point $x$, and the assertion is that for $\varepsilon > 0$ there is $\delta > 0$ such that $\big| \frac{1}{|I|} \int_I f\,d\lambda - f(x) \big| < \varepsilon$ whenever $x \in I$ and $|I| < \delta$. This suggests the following analogue in higher dimensional Euclidean space $\mathbb{R}^n$.
Problem 4. To what extent is it true that
\[ \lim_{|E| \to 0:\ x \in E} \frac{1}{|E|} \int_E f\,d\lambda_n = f(x) \tag{0.1} \]
for $f \in L^1(\mathbb{R}^n)$, $x \in \mathbb{R}^n$, and a family of subsets $E$ of $\mathbb{R}^n$ containing $x$?
Of course, for continuous functions $f$, the above limit (0.1) holds at every $x \in \mathbb{R}^n$, provided only that the sets $E$ have diameters that shrink to $0$ as their measures $|E|$ tend to zero. More generally, we will see that for integrable functions $f$, and for sets $E$ which are sufficiently like balls, the above limit (0.1) holds for almost every $x$ in $\mathbb{R}^n$. The proof follows these lines:
(1) The limit (0.1) holds for every $x$ if $f$ is continuous.
(2) The space of continuous functions is dense in $L^1(\mathbb{R}^n)$.
(3) The oscillation of the limit in (0.1) is near zero except on a small set when $\|f\|_{L^1(\mathbb{R}^n)}$ is small.
(4) The connection between the oscillation of the limit of averages of $f$ in (0.1), and the $L^1(\mathbb{R}^n)$ norm of $f$, is governed by the maximal function $Mf$ and a weak type inequality.
1. Covering lemmas, maximal functions and differentiation
Let
\[ \mathcal{D} = \big\{ 2^k\,(j + [0,1)^n) \big\}_{j \in \mathbb{Z}^n,\ k \in \mathbb{Z}} = \big\{ Q_j^k \big\}_{j \in \mathbb{Z}^n,\ k \in \mathbb{Z}} \]
be the grid of dyadic cubes in $\mathbb{R}^n$, and define the dyadic maximal function $M^{dy} f$ of a locally integrable function $f$ on $\mathbb{R}^n$ by
\[ M^{dy} f(x) = \sup_{x \in Q \in \mathcal{D}} \frac{1}{|Q|} \int_Q |f(y)|\,dy, \qquad x \in \mathbb{R}^n. \]
We say that $f$ is locally integrable, written $f \in L^1_{loc}(\mathbb{R}^n)$, if $f \chi_{B(0,R)} \in L^1(\mathbb{R}^n)$ for all $R < \infty$. Clearly, $M^{dy} f$ is measurable since it is the supremum over $k$ of the functions
\[ f_k(x) = \sum_{j \in \mathbb{Z}^n} \big( \mathbb{E}_{Q_j^k} |f| \big)\,\chi_{Q_j^k}(x), \qquad x \in \mathbb{R}^n, \]
\[ \mathbb{E}_Q\, g = \frac{1}{|Q|} \int_Q g\,d\lambda_n. \]
Thus $M^{dy} f(x)$ is the least upper bound of all the dyadic averages $\mathbb{E}_Q |f|$ of $|f|$ at $x$. In order to study the convergence of the dyadic averages of $f$, we consider the limit superior of the dyadic averages of $|f|$ at $x$:
\[ \Gamma^{dy} f(x) = \limsup_{Q \searrow x} \frac{1}{|Q|} \int_Q |f(y)|\,dy, \qquad x \in \mathbb{R}^n, \]
where it is understood by the expression $Q \searrow x$ that $Q$ is a dyadic cube containing $x$ whose side length is shrinking to zero in the limit. Clearly we have
\begin{align*}
\Gamma^{dy} (f - f(x))(x) = 0 &\iff \lim_{Q \searrow x} \frac{1}{|Q|} \int_Q |f(y) - f(x)|\,dy = 0 \tag{1.1} \\
&\implies f(x) = \lim_{Q \searrow x} \frac{1}{|Q|} \int_Q f(y)\,dy.
\end{align*}
(1.2) I
J
) (r) _ /
J
) (r) ,
and the key properties of the maximal operator /
J
are that it is bounded on
1
o
(R
n
) and of weak type 1 1 on 1
1
(R
n
):
(1.3)

_
r R
n
: /
J
) (r) `
_

_
1
`
_
R
r
[) (j)[ dj, ` 0.
To see (1.3) dene
\
X
=
_
r R
n
: /
J
) (r) `
_
,
1
X
=
_
Q T :
1
[Q[
_
Q
[) (j)[ dj `
_
,
and let Q
n

n
be the set of maximal dyadic cubes in 1
X
. Then the cubes Q
n
are
pairwise disjoint and we have
\
X
=
_
Q

Q =

_
n
Q
n
.
1. COVERING LEMMAS, MAXIMAL FUNCTIONS AND DIFFERENTIATION 129
This is the most successful of covering lemmas: namely we have covered a union
\
X
of dyadic cubes with a pairwise disjoint subcollection. Unravelling denitions
yields
[\
X
[ =

n
[Q
n
[ <

n
1
`
_
Qr
[) (j)[ dj _
1
`
_
R
r
[) (j)[ dj.
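The weak type inequality (1.3) can be observed numerically in a finite model: a function given by $2^N$ values on the finest dyadic cells, with counting measure in place of Lebesgue measure and the supremum restricted to dyadic blocks inside the sample. The maximal-cube argument above applies verbatim in this model. The sketch below is an illustration only; the sample values and function names are hypothetical.

```python
import random

def dyadic_maximal(values):
    """Dyadic maximal function of a sequence whose length is a power of two.

    Each index is a cell of the finest dyadic level; the supremum runs over
    the dyadic blocks of lengths 1, 2, 4, ..., len(values) containing it.
    """
    n = len(values)
    absf = [abs(v) for v in values]
    best = absf[:]                                # blocks of length 1
    size = 2
    while size <= n:
        for start in range(0, n, size):
            avg = sum(absf[start:start + size]) / size
            for i in range(start, start + size):
                best[i] = max(best[i], avg)
        size *= 2
    return best

random.seed(0)
f = [random.uniform(-1, 1) ** 3 * 10 for _ in range(64)]   # made-up data
Mf = dyadic_maximal(f)
l1_norm = sum(abs(v) for v in f)

def weak_type_holds(level):
    """Check #{M f > level} <= ||f||_1 / level for counting measure."""
    big = sum(1 for m in Mf if m > level)
    return big <= l1_norm / level
```

Calling `weak_type_holds(t)` for several thresholds `t > 0` should return `True` each time, reflecting that the set where the maximal function is large is a disjoint union of maximal blocks whose averages exceed `t`.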
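The weak type inequality (1.3) for $M^{dy}$ yields the Lebesgue Differentiation Theorem for dyadic averages.
Theorem 54. For $f \in L^1_{loc}(\mathbb{R}^n)$ we have
\[ f(x) = \lim_{|Q| \to 0:\ x \in Q \in \mathcal{D}} \frac{1}{|Q|} \int_Q f(y)\,dy, \qquad \text{a.e. } x \in \mathbb{R}^n, \]
in fact,
\[ \lim_{|Q| \to 0:\ x \in Q \in \mathcal{D}} \frac{1}{|Q|} \int_Q |f(y) - f(x)|\,dy = 0, \qquad \text{a.e. } x \in \mathbb{R}^n. \]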
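Proof: Since the conclusion of the theorem is local it suffices to consider $f \in L^1(\mathbb{R}^n)$ with compact support. Given $\varepsilon > 0$, we can use Lemma 26 to choose $g \in C_c(\mathbb{R}^n)$ with $\int |f - g| < \varepsilon$. However, $\Gamma^{dy}(g - g(x))(x) = 0$ for every $x \in \mathbb{R}^n$ since $g$ is continuous. It follows from the subadditivity of $\Gamma^{dy}$ and (1.2) that
\begin{align*}
\Gamma^{dy}(f - f(x))(x) &\le \Gamma^{dy}\big( f - f(x) - [g - g(x)] \big)(x) + \Gamma^{dy}(g - g(x))(x) \\
&\le \Gamma^{dy}(f - g)(x) + \Gamma^{dy}\big( f(x) - g(x) \big)(x) \\
&\le M^{dy}(f - g)(x) + |(f - g)(x)|.
\end{align*}
Now we have
\[ \big\{x \in \mathbb{R}^n : \Gamma^{dy}(f - f(x))(x) > \lambda \big\} \subset \Big\{x \in \mathbb{R}^n : M^{dy}(f - g)(x) > \frac{\lambda}{2} \Big\} \cup \Big\{x \in \mathbb{R}^n : |(f - g)(x)| > \frac{\lambda}{2} \Big\}, \]
and so
\begin{align*}
\big| \big\{x : \Gamma^{dy}(f - f(x))(x) > \lambda \big\} \big| &\le \Big| \Big\{x : M^{dy}(f - g)(x) > \frac{\lambda}{2} \Big\} \Big| + \Big| \Big\{x : |(f - g)(x)| > \frac{\lambda}{2} \Big\} \Big| \\
&\le \frac{2}{\lambda} \int |f - g| + \frac{2}{\lambda} \int |f - g| < \frac{4}{\lambda}\,\varepsilon.
\end{align*}
Now let $\varepsilon \to 0$ to obtain
\[ \big| \big\{x \in \mathbb{R}^n : \Gamma^{dy}(f - f(x))(x) > \lambda \big\} \big| = 0 \quad \text{for all } \lambda > 0. \]
This proves that $\Gamma^{dy}(f - f(x))(x) = 0$ for a.e. $x \in \mathbb{R}^n$, and (1.1) now concludes the proof of Lebesgue's differentiation theorem for dyadic averages.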
We now wish to extend Lebesgue's differentiation theorem to more general averages, namely to the collection of almost-balls in $\mathbb{R}^n$. Fix a large positive constant $C$. Then we say that a subset $E$ of $\mathbb{R}^n$ is an almost-ball of eccentricity $C$ if there is $r > 0$ and two balls, $B(x, r)$ and $B(y, Cr)$, with
\[ B(x, r) \subset E \subset B(y, Cr). \tag{1.4} \]
Note that we do not require $y$ to belong to $E$, nor must $x$ equal $y$. Thus an almost-ball contains an ordinary ball, and is contained in another ordinary ball of $C$ times the radius. In order to prove this more general differentiation theorem, we will use the notion of shifted dyadic grids to reduce matters to what we have already proved.
Define a shifted dyadic grid to be the collection of cubes
\[ \mathcal{D}^{\alpha} = \Big\{ 2^k \big( j + (-1)^k \alpha + [0,1)^n \big) : k \in \mathbb{Z},\ j \in \mathbb{Z}^n \Big\}, \qquad \alpha \in \Big\{ 0, \frac{1}{3}, \frac{2}{3} \Big\}^n. \tag{1.5} \]
The basic properties of these collections are these: In the first place, each $\mathcal{D}^{\alpha}$ is a grid, namely for $Q, Q' \in \mathcal{D}^{\alpha}$ we have $Q \cap Q' \in \{\emptyset, Q, Q'\}$ and $Q$ is a union of $2^n$ elements of $\mathcal{D}^{\alpha}$ of equal volume. In the second place, and this is the novel property here, for any cube $Q \subset \mathbb{R}^n$, there is a choice of some $\alpha \in \{0, \frac{1}{3}, \frac{2}{3}\}^n$ and some $Q' \in \mathcal{D}^{\alpha}$ so that
\[ Q \subset Q' \quad \text{and} \quad |Q'| \le C_n\,|Q|. \]
Here $C_n$ is a positive constant depending only on dimension $n$. We prove that $C_1 \le 4$ in dimension $n = 1$, and leave the general case to the reader. So suppose that $[a, b]$ is an interval. Let $k \in \mathbb{Z}$ be the unique integer satisfying $2^{k-1} < b - a \le 2^k$. Now choose $j \in \mathbb{Z}$ and $\alpha \in \{0, \frac{1}{3}, \frac{2}{3}\}$ so that $\big( j + (-1)^{k+1}\alpha \big) 2^{k+1}$ is the largest such expression satisfying $\big( j + (-1)^{k+1}\alpha \big) 2^{k+1} < a$. Since these expressions are spaced $\frac{1}{3} 2^{k+1}$ apart, we have $a \le \big( j + (-1)^{k+1}\alpha + \frac{1}{3} \big) 2^{k+1}$ and so
\[ b \le 2^k + a \le 2^k + \Big( j + (-1)^{k+1}\alpha + \frac{1}{3} \Big) 2^{k+1} = \Big( j + \frac{5}{6} + (-1)^{k+1}\alpha \Big) 2^{k+1}. \]
It follows that
\[ [a, b] \subset \Big[ \big( j + (-1)^{k+1}\alpha \big) 2^{k+1},\ \big( j + 1 + (-1)^{k+1}\alpha \big) 2^{k+1} \Big), \]
where the latter interval belongs to the grid $\mathcal{D}^{\alpha}$ and has length $2^{k+1} < 4\,(b - a)$.
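We now define the $\mathcal{D}^{\alpha}$-analogs of the dyadic maximal operator, namely
\[ M^{dy}_{\alpha} f(x) = \sup_{Q \in \mathcal{D}^{\alpha}:\ x \in Q} \frac{1}{|Q|} \int_Q |f|. \tag{1.6} \]
Just as for $M^{dy}$ we have that $M^{dy}_{\alpha}$ is weak type $1$-$1$ on $L^1(\mathbb{R}^n)$,
\[ \big| \{x \in \mathbb{R}^n : M^{dy}_{\alpha} f(x) > \lambda\} \big| \le \frac{1}{\lambda} \int_{\mathbb{R}^n} |f(y)|\,dy, \qquad \lambda > 0. \]
Now fix $C > 0$ and let $\mathcal{A} = \mathcal{A}_C$ denote the collection of all almost-balls of eccentricity $C$ in $\mathbb{R}^n$. Consider the corresponding maximal function
\[ M_{\mathcal{A}} f(x) = \sup_{E \in \mathcal{A}:\ x \in E} \frac{1}{|E|} \int_E |f(y)|\,dy, \qquad x \in \mathbb{R}^n. \]
For each almost-ball $E \in \mathcal{A}$ and ball $B(y, Cr)$ as in (1.4), there is a cube $Q \supset B(y, Cr)$ with $|Q| \le (2Cr)^n$. Then the properties of the shifted dyadic grids yield the existence of $\alpha \in \{0, \frac{1}{3}, \frac{2}{3}\}^n$ and $Q' \in \mathcal{D}^{\alpha}$ such that
\[ E \subset B(y, Cr) \subset Q \subset Q', \]
and
\[ |Q'| \le C_n\,|Q| \le C_n\,(2Cr)^n \le C_n'\,|B(x, r)| \le C_n'\,|E|. \]
It follows that
\[ \frac{1}{|E|} \int_E |f(y)|\,dy \le \frac{C_n'}{|Q'|} \int_{Q'} |f(y)|\,dy \le C_n'\, M^{dy}_{\alpha} f(x), \quad \text{for each } x \in E, \]
and hence that
\[ M_{\mathcal{A}} f(x) \le C_n' \max_{\alpha \in \{0, \frac{1}{3}, \frac{2}{3}\}^n} M^{dy}_{\alpha} f(x). \]
This proves that $M_{\mathcal{A}}$ is also weak type $1$-$1$ on $L^1(\mathbb{R}^n)$:
\[ \big| \{x \in \mathbb{R}^n : M_{\mathcal{A}} f(x) > \lambda\} \big| \le \sum_{\alpha \in \{0, \frac{1}{3}, \frac{2}{3}\}^n} \Big| \Big\{x \in \mathbb{R}^n : M^{dy}_{\alpha} f(x) > \frac{\lambda}{C_n'} \Big\} \Big| \le \frac{3^n C_n'}{\lambda} \int_{\mathbb{R}^n} |f(y)|\,dy, \qquad \lambda > 0. \]
As a result we can prove the following theorem in exactly the same way as Theorem 54 above.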
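Theorem 55. Let $\mathcal{A} = \mathcal{A}_C$ be the collection of all almost-balls of eccentricity $C > 0$. For $f \in L^1_{loc}(\mathbb{R}^n)$ we have
\[ f(x) = \lim_{|E| \to 0:\ x \in E \in \mathcal{A}} \frac{1}{|E|} \int_E f(y)\,dy, \qquad \text{a.e. } x \in \mathbb{R}^n, \]
in fact,
\[ \lim_{|E| \to 0:\ x \in E \in \mathcal{A}} \frac{1}{|E|} \int_E |f(y) - f(x)|\,dy = 0, \qquad \text{a.e. } x \in \mathbb{R}^n. \]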
Corollary 20. Suppose that $\mu$ is a complex Borel measure on $\mathbb{R}^n$ and that $\mu \ll \lambda_n$ where $\lambda_n$ is Lebesgue measure. If $f = \frac{d\mu}{d\lambda_n}$ is the Radon-Nikodym derivative of $\mu$ with respect to $\lambda_n$, then $f$ can be obtained as a limit of ratios of measures:
\[ f(x) = \lim_{|E| \to 0:\ x \in E \in \mathcal{A}} \frac{\mu(E)}{|E|}, \qquad \text{a.e. } x \in \mathbb{R}^n. \]
Proof: Apply Theorem 55 using $\mu(E) = \int_E f\,d\lambda_n = \int_E f(y)\,dy$.
Corollary 21. Let $E$ be a Lebesgue measurable subset of $\mathbb{R}^n$. Then almost every point in $E$ is densely surrounded by points of $E$ in the sense that
\[ \lim_{r \to 0} \frac{|E \cap B(x, r)|}{|B(x, r)|} = 1 \quad \text{for almost every } x \in E, \]
while almost every point not in $E$ is densely surrounded by points not in $E$ in the sense that
\[ \lim_{r \to 0} \frac{|E \cap B(x, r)|}{|B(x, r)|} = 0 \quad \text{for almost every } x \notin E. \]
Proof: Apply Theorem 55 to the balls $B(x, r)$ and the function $f = \chi_E$, so that
\[ |E \cap B(x, r)| = \int_{B(x,r)} \chi_E(y)\,dy = \int_{B(x,r)} f(y)\,dy. \]
This last corollary gives a surprising insight into the structure of measurable sets, which provides yet another illustration of Littlewood's first principle: measurable sets are almost open sets. Of course it is trivial that every point in an open set $G$ is entirely surrounded by points of $G$ at a small enough scale.
2. The maximal theorem
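Our next theorem will require an expression of the $L^p$ norm of a function $f$ in terms of its distribution function
\[ |\{|f| > t\}| = |\{x \in \mathbb{R}^n : |f(x)| > t\}|, \qquad t > 0. \]
We could appeal at this point to the following special case of Fubini's theorem, proved in the next chapter. Suppose that $g : \mathbb{R}^n \to [0, \infty)$ is measurable. Then
\begin{align*}
\int_{\mathbb{R}^n} g(x)^p\,dx &= \int_{\mathbb{R}^n} \Big\{ \int_0^{g(x)} p\,t^{p-1}\,dt \Big\}\,dx \tag{2.1} \\
&= \int_{\mathbb{R}^n} \Big\{ \int_{[0,\infty)} \chi_{\{g > t\}}(x)\, p\,t^{p-1}\,dt \Big\}\,dx \\
&= \int_{[0,\infty)} \Big\{ \int_{\mathbb{R}^n} \chi_{\{g > t\}}(x)\, p\,t^{p-1}\,dx \Big\}\,dt \\
&= \int_{[0,\infty)} |\{g > t\}|\; p\,t^{p-1}\,dt.
\end{align*}
However, we only need the following easy approximation to (2.1):
\[ \int_{\mathbb{R}^n} g(x)^p\,dx = \sum_{k=-\infty}^{\infty} \int_{\{2^k < g \le 2^{k+1}\}} g(x)^p\,dx \le 2^p \sum_{k=-\infty}^{\infty} 2^{kp}\,\big| \{2^k < g \le 2^{k+1}\} \big|. \tag{2.2} \]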
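The dyadic estimate (2.2) is elementary enough to check on a finite sample, with counting measure on the sample points standing in for Lebesgue measure. The sketch below is an illustration with made-up values; `dyadic_index` and `dyadic_bound` are hypothetical helper names. It uses `math.frexp` to locate the integer $k$ with $2^k < g \le 2^{k+1}$ exactly, avoiding rounding in a logarithm.

```python
import math

def dyadic_index(g):
    """Return the integer k with 2**k < g <= 2**(k+1), for g > 0."""
    mantissa, exponent = math.frexp(g)      # g = mantissa * 2**exponent, 0.5 <= mantissa < 1
    return exponent - 2 if mantissa == 0.5 else exponent - 1

def dyadic_bound(values, p):
    """Right side of (2.2) for counting measure on the sample points."""
    counts = {}
    for g in values:
        if g > 0:
            k = dyadic_index(g)
            counts[k] = counts.get(k, 0) + 1
    return 2.0 ** p * sum(2.0 ** (k * p) * c for k, c in counts.items())

values = [0.3, 1.0, 2.5, 8.0, 0.0, 17.2]    # hypothetical samples of g >= 0
p = 2.5
lhs = sum(g ** p for g in values)           # the integral of g**p
rhs = dyadic_bound(values, p)               # the dyadic upper bound
```

Since $2^{kp} < g^p$ on the level set $\{2^k < g \le 2^{k+1}\}$, the bound loses at most the factor $2^p$, so `lhs <= rhs <= 2**p * lhs`.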
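Theorem 56. For $1 < p \le \infty$ we have
\[ \Big( \int_{\mathbb{R}^n} \big| M^{dy} f \big|^p \Big)^{\frac{1}{p}} \le C_p \Big( \int_{\mathbb{R}^n} |f|^p \Big)^{\frac{1}{p}}, \qquad f \in L^p(\mathbb{R}^n). \]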
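Proof: The following argument is from Marcinkiewicz interpolation. Define $f_\lambda = \chi_{\{|f| > \frac{\lambda}{2}\}}\, f$ so that $M^{dy}(f - f_\lambda) \le \frac{\lambda}{2}$ by the boundedness of $M^{dy}$ on $L^{\infty}(\mathbb{R}^n)$: $\big\| M^{dy} g \big\|_{L^{\infty}(\mathbb{R}^n)} \le \|g\|_{L^{\infty}(\mathbb{R}^n)}$. Consequently, by the subadditivity of $M^{dy}$ we have
\[ M^{dy} f \le M^{dy}(f - f_\lambda) + M^{dy} f_\lambda \le \frac{\lambda}{2} + M^{dy} f_\lambda, \]
and thus
\[ \big\{x \in \mathbb{R}^n : M^{dy} f(x) > \lambda \big\} \subset \Big\{x \in \mathbb{R}^n : M^{dy} f_\lambda(x) > \frac{\lambda}{2} \Big\}, \tag{2.3} \]
for any $\lambda > 0$. Now use (2.2), (2.3) and then (1.3) applied to $f_\lambda$ with $\lambda = 2^k$ to obtain
\begin{align*}
\int_{\mathbb{R}^n} \big| M^{dy} f(x) \big|^p\,dx &\le 2^p \sum_{k=-\infty}^{\infty} 2^{kp}\,\big| \{x \in \mathbb{R}^n : M^{dy} f(x) > 2^k\} \big| \\
&\le 2^p \sum_{k=-\infty}^{\infty} 2^{kp}\,\Big| \Big\{x \in \mathbb{R}^n : M^{dy} f_{2^k}(x) > \frac{2^k}{2} \Big\} \Big| \\
&\le 2^p \sum_{k=-\infty}^{\infty} 2^{kp} \Big\{ \frac{1}{2^{k-1}} \int_{\mathbb{R}^n} |f_{2^k}(x)|\,dx \Big\} \\
&= 2^{p+1} \sum_{k=-\infty}^{\infty} 2^{k(p-1)} \int_{\{x \in \mathbb{R}^n :\ |f(x)| > 2^{k-1}\}} |f(x)|\,dx \\
&= 2^{p+1} \int_{\mathbb{R}^n} |f(x)| \Big\{ \sum_{k:\ 2^k < 2|f(x)|} 2^{k(p-1)} \Big\}\,dx \\
&\le 2^{2p}\,\frac{1}{1 - 2^{1-p}} \int_{\mathbb{R}^n} |f(x)|^p\,dx,
\end{align*}
since $\sum_{k:\ 2^k < 2|f(x)|} 2^{k(p-1)} < \frac{(2|f(x)|)^{p-1}}{1 - 2^{1-p}}$. Note that we have used Corollary 9 in order to interchange summation and integration in the penultimate line above.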
2.1. The Haar basis. In our second example of an orthonormal set in Example 5 of Section 3 of Chapter 7, we showed that the collection of Haar functions $\{h_I^{\mu}\}_{I \in \mathcal{D}}$ is orthonormal in $L^2(\mu)$, but deferred the proof that it is a basis until we had Lebesgue's Differentiation Theorem at our disposal. We assumed there that $\mu$ is a positive Borel measure on the real line $\mathbb{R}$ satisfying $\mu(I) > 0$ for every $I \in \mathcal{D}$. For convenience we now assume a bit more:
\[ \mu(I) > 0 \ \text{ for every } I \in \mathcal{D}, \qquad \int_0^{\infty} d\mu = \int_{-\infty}^{0} d\mu = \infty. \tag{2.4} \]
We will need the analogues of dyadic differentiation theory for a positive measure $\mu$ in place of Lebesgue measure. The following two theorems are proved in exactly the same way as the corresponding results for Lebesgue measure above. For these two theorems we assume that $\mu$ is a positive Borel measure on $\mathbb{R}^n$ satisfying $\mu(Q) > 0$ for every $Q \in \mathcal{D}$, and $\mu(J) = \infty$ for each of the $2^n$ octants of the form $J = \prod_{i=1}^{n} J_i$ where $J_i$ is either $(-\infty, 0)$ or $[0, \infty)$.
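Theorem 57. For $f \in L^1_{loc}(\mu)$ we have
\[ f(x) = \lim_{|Q| \to 0:\ x \in Q \in \mathcal{D}} \frac{1}{|Q|_{\mu}} \int_Q f(y)\,d\mu(y), \qquad \mu\text{-a.e. } x \in \mathbb{R}^n, \]
in fact,
\[ \lim_{|Q| \to 0:\ x \in Q \in \mathcal{D}} \frac{1}{|Q|_{\mu}} \int_Q |f(y) - f(x)|\,d\mu(y) = 0, \qquad \mu\text{-a.e. } x \in \mathbb{R}^n. \]
Definition 25. Define the dyadic $\mu$-maximal function $M^{dy}_{\mu} f$ of a locally $\mu$-integrable function $f$ on $\mathbb{R}^n$ by
\[ M^{dy}_{\mu} f(x) = \sup_{x \in Q \in \mathcal{D}} \frac{1}{|Q|_{\mu}} \int_Q |f(y)|\,d\mu(y), \qquad x \in \mathbb{R}^n. \]
Theorem 58. For $1 < p \le \infty$ we have
\[ \Big( \int_{\mathbb{R}^n} \big| M^{dy}_{\mu} f \big|^p\,d\mu \Big)^{\frac{1}{p}} \le C_p \Big( \int_{\mathbb{R}^n} |f|^p\,d\mu \Big)^{\frac{1}{p}}, \qquad f \in L^p(\mu). \]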
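Now we return to dimension $n = 1$. Recall that $\mathcal{D} = \dot{\bigcup}_{k \in \mathbb{Z}} \mathcal{D}_k$ is then the set of dyadic intervals, where $\mathcal{D}_k = \big\{ \big[ j 2^k, (j+1) 2^k \big) \big\}_{j \in \mathbb{Z}}$. We defined the Haar function $h_I^{\mu}$ for $I \in \mathcal{D}$ by
\[ h_I^{\mu}(x) = \sqrt{\frac{\mu(I_-)\,\mu(I_+)}{\mu(I)}} \Big( -\frac{\mathbf{1}_{I_-}(x)}{\mu(I_-)} + \frac{\mathbf{1}_{I_+}(x)}{\mu(I_+)} \Big), \qquad x \in \mathbb{R}, \]
where $I_-$ and $I_+$ are the left and right halves of $I$, referred to as the children of $I$. The collection of Haar functions $\{h_I^{\mu}\}_{I \in \mathcal{D}}$ was shown to satisfy the elementary properties
\[ \operatorname{supp} h_I^{\mu} \subset I, \qquad \int h_I^{\mu}\,d\mu = 0, \qquad \int |h_I^{\mu}|^2\,d\mu = 1, \]
and most importantly, the crucial orthogonality property,
\[ \int h_I^{\mu}\, h_J^{\mu}\,d\mu = 0, \quad \text{if } I, J \in \mathcal{D} \text{ and } I \ne J. \]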
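These orthonormality properties can be confirmed numerically on a weighted grid, where $\mu$ assigns a positive mass to each of $2^N$ cells and the Haar functions are constant on cells. The sketch below is illustrative only: the masses are made up, and the sign convention (negative on the left child, positive on the right) is an assumption that does not affect orthonormality.

```python
import math

def haar(weights, start, length):
    """Values of the Haar function for the block I = [start, start + length).

    weights[i] is the mu-mass of cell i; the children of I are its two halves.
    Sign convention (an assumption): negative on the left child, positive on the right.
    """
    half = length // 2
    left = range(start, start + half)
    right = range(start + half, start + length)
    m_left = sum(weights[i] for i in left)
    m_right = sum(weights[i] for i in right)
    norm = math.sqrt(m_left * m_right / (m_left + m_right))
    h = [0.0] * len(weights)
    for i in left:
        h[i] = -norm / m_left
    for i in right:
        h[i] = norm / m_right
    return h

def inner(u, v, weights):
    """Inner product in L^2(mu) for functions that are constant on cells."""
    return sum(a * b * w for a, b, w in zip(u, v, weights))

weights = [0.5, 2.0, 1.0, 0.25, 3.0, 1.5, 0.75, 1.0]      # hypothetical positive masses
blocks = [(s, n) for n in (2, 4, 8) for s in range(0, 8, n)]
family = [haar(weights, s, n) for s, n in blocks]
```

The Gram matrix of `family` under `inner` should be the identity up to rounding: two Haar functions on disjoint blocks have disjoint supports, and for nested blocks the coarser function is constant on the support of the finer one, which has mean zero.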
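To see that $\{h_I^{\mu}\}_{I \in \mathcal{D}}$ is actually an orthonormal basis for $L^2(\mu)$, it suffices by Theorem 49 to establish that $\operatorname{Span}\{h_I^{\mu}\}_{I \in \mathcal{D}}$ is dense in $L^2(\mu)$. For this we introduce the expectation functions,
\[ \mathbb{E}_k^{\mu} f(x) = \sum_{I \in \mathcal{D}_k} \Big\langle f, \frac{1}{|I|_{\mu}} \mathbf{1}_I \Big\rangle_{L^2(\mu)} \mathbf{1}_I(x), \qquad x \in \mathbb{R},\ k \in \mathbb{Z}, \]
which for a given $k \in \mathbb{Z}$, are simply the functions that are constant on dyadic intervals $I$ of length $2^k$, and where the constant is the $\mu$-average of $f$ on $I$. We make three elementary observations regarding the functions $\mathbb{E}_k^{\mu} f$ for $f \in L^2(\mu)$:
(1) $\mathbb{E}_k^{\mu} f(x) \to 0$ as $k \to \infty$ for every $x \in \mathbb{R}$,
(2) $\mathbb{E}_k^{\mu} f(x) \to f(x)$ as $k \to -\infty$ for $\mu$-almost every $x \in \mathbb{R}$,
(3) $|\mathbb{E}_k^{\mu} f(x)| \le M^{dy}_{\mu} f(x)$ for every $x \in \mathbb{R}$,
and the crucial observation,
\[ \mathbb{E}_M^{\mu} f(x) - \mathbb{E}_N^{\mu} f(x) = \sum_{I \in \mathcal{D}:\ 2^{M+1} \le |I| \le 2^N} \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu}(x), \tag{2.5} \]
for all $x \in \mathbb{R}$, and for all integers $M < N$.
The first observation follows from
\[ \Big| \Big\langle f, \frac{1}{|I|_{\mu}} \mathbf{1}_I \Big\rangle_{L^2(\mu)} \Big| = \Big| \frac{1}{|I|_{\mu}} \int_I f\,d\mu \Big| \le \Big( \frac{1}{|I|_{\mu}} \int_I |f|^2\,d\mu \Big)^{\frac{1}{2}} \le \frac{1}{\sqrt{|I|_{\mu}}}\,\|f\|_{L^2(\mu)} \]
and our second assumption in (2.4). The second observation follows directly from Theorem 57. The third observation is immediate from Definition 25 since
\[ \Big| \Big\langle f, \frac{1}{|I|_{\mu}} \mathbf{1}_I \Big\rangle_{L^2(\mu)} \Big| \le \frac{1}{|I|_{\mu}} \int_I |f|\,d\mu \le M^{dy}_{\mu} f(x), \quad \text{for } x \in I. \]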
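We now turn to the verification of the crucial observation (2.5). It suffices to prove the cases $N = M + 1$, and then add them up. So for $I \in \mathcal{D}$ we must prove that
\begin{align*}
&\Big\langle f, \frac{1}{|I_-|_{\mu}} \mathbf{1}_{I_-} \Big\rangle_{L^2(\mu)} \mathbf{1}_{I_-}(x) + \Big\langle f, \frac{1}{|I_+|_{\mu}} \mathbf{1}_{I_+} \Big\rangle_{L^2(\mu)} \mathbf{1}_{I_+}(x) - \Big\langle f, \frac{1}{|I|_{\mu}} \mathbf{1}_I \Big\rangle_{L^2(\mu)} \mathbf{1}_I(x) \tag{2.6} \\
&\qquad = \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu}(x), \qquad x \in I,
\end{align*}
where
\[ h_I^{\mu}(x) = \sqrt{\frac{|I_-|_{\mu}\,|I_+|_{\mu}}{|I|_{\mu}}} \Big( -\frac{\mathbf{1}_{I_-}(x)}{|I_-|_{\mu}} + \frac{\mathbf{1}_{I_+}(x)}{|I_+|_{\mu}} \Big). \]
This is an elementary but tedious calculation. For $x \in I_-$ the left side of (2.6) is
\begin{align*}
\Big\langle f, \frac{1}{|I_-|_{\mu}} \mathbf{1}_{I_-} \Big\rangle_{L^2(\mu)} - \Big\langle f, \frac{1}{|I|_{\mu}} \mathbf{1}_I \Big\rangle_{L^2(\mu)} &= \Big( \frac{1}{|I_-|_{\mu}} - \frac{1}{|I|_{\mu}} \Big) \int_{I_-} f\,d\mu - \frac{1}{|I|_{\mu}} \int_{I_+} f\,d\mu \\
&= \frac{|I_+|_{\mu}}{|I|_{\mu}}\,\frac{1}{|I_-|_{\mu}} \int_{I_-} f\,d\mu - \frac{1}{|I|_{\mu}} \int_{I_+} f\,d\mu,
\end{align*}
and the right side is
\begin{align*}
\langle f, h_I^{\mu} \rangle_{L^2(\mu)} \Big( -\sqrt{\frac{|I_-|_{\mu}\,|I_+|_{\mu}}{|I|_{\mu}}}\,\frac{1}{|I_-|_{\mu}} \Big) &= \sqrt{\frac{|I_-|_{\mu}\,|I_+|_{\mu}}{|I|_{\mu}}}\,\frac{1}{|I_-|_{\mu}} \Big\{ \sqrt{\frac{|I_-|_{\mu}\,|I_+|_{\mu}}{|I|_{\mu}}} \Big( \int_{I_-} \frac{1}{|I_-|_{\mu}} f\,d\mu - \int_{I_+} \frac{1}{|I_+|_{\mu}} f\,d\mu \Big) \Big\} \\
&= \frac{|I_+|_{\mu}}{|I|_{\mu}} \Big( \frac{1}{|I_-|_{\mu}} \int_{I_-} f\,d\mu - \frac{1}{|I_+|_{\mu}} \int_{I_+} f\,d\mu \Big).
\end{align*}
Thus (2.6) holds for $x \in I_-$, and the case $x \in I_+$ is similar. This completes the verification of (2.5).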
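With these observations in hand, we can apply the Dominated Convergence Theorem with umbrella function $g = 3 M^{dy}_{\mu} f$ to obtain
\begin{align*}
&\lim_{M \to -\infty \text{ and } N \to \infty} \Big\| f - \sum_{I \in \mathcal{D}:\ 2^{M+1} \le |I| \le 2^N} \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu} \Big\|_{L^2(\mu)}^2 \tag{2.7} \\
&\qquad = \lim_{M \to -\infty \text{ and } N \to \infty} \int_{\mathbb{R}} \Big| f(x) - \sum_{I \in \mathcal{D}:\ 2^{M+1} \le |I| \le 2^N} \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu}(x) \Big|^2 d\mu(x) \\
&\qquad = \int_{\mathbb{R}} \Big| f(x) - \lim_{M \to -\infty \text{ and } N \to \infty} \sum_{I \in \mathcal{D}:\ 2^{M+1} \le |I| \le 2^N} \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu}(x) \Big|^2 d\mu(x) \\
&\qquad = \int_{\mathbb{R}} \big| f(x) - [f(x) - 0] \big|^2\,d\mu(x) = 0.
\end{align*}
Note that by Theorem 57 we have $|f(x)| \le M^{dy}_{\mu} f(x)$ for $\mu$-almost every $x \in \mathbb{R}$, and so for these $x$,
\begin{align*}
\Big| f(x) - \sum_{I \in \mathcal{D}:\ 2^{M+1} \le |I| \le 2^N} \langle f, h_I^{\mu} \rangle_{L^2(\mu)}\, h_I^{\mu}(x) \Big| &= \big| f(x) - [\mathbb{E}_M^{\mu} f(x) - \mathbb{E}_N^{\mu} f(x)] \big| \\
&\le |f(x)| + |\mathbb{E}_M^{\mu} f(x)| + |\mathbb{E}_N^{\mu} f(x)| \\
&\le 3 M^{dy}_{\mu} f(x),
\end{align*}
where $M^{dy}_{\mu} f \in L^2(\mu)$ by Theorem 58. Thus the integrands in (2.7) are dominated by $g^2 \in L^1(\mu)$, and the umbrella function $g = 3 M^{dy}_{\mu} f$ can be used in the above application of the Dominated Convergence Theorem.
Equation (2.7) shows that $\operatorname{Span}\{h_I^{\mu}\}_{I \in \mathcal{D}}$ is dense in $L^2(\mu)$, and Theorem 49 now shows that $\{h_I^{\mu}\}_{I \in \mathcal{D}}$ is an orthonormal basis for $L^2(\mu)$.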
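Remark 27. We can avoid the use of Theorems 57 and 58 if we appeal to the density of $C_c(\mathbb{R})$ in $L^2(\mu)$. Indeed, we then need only establish (2.7) for $f \in C_c(\mathbb{R})$. This is easy since $\mathbb{E}_k^{\mu} f(x) \to f(x)$ as $k \to -\infty$ for every $x \in \mathbb{R}$ by continuity of $f$, and if $f$ is supported in a dyadic interval $I$, then
\[ M^{dy}_{\mu} f(x) \le \|f\|_{\infty}\, M^{dy}_{\mu}(\mathbf{1}_I)(x), \]
and it is easily verified that $M^{dy}_{\mu}(\mathbf{1}_I) \in L^2(\mu)$. For example, if $I = [0,1)$ then
\[ M^{dy}_{\mu}(\mathbf{1}_I)(x) = \frac{\mu([0,1))}{\mu([0, 2^k))}, \qquad 2^{k-1} \le x < 2^k,\ k \ge 1, \]
and we have
\begin{align*}
\int_0^{\infty} \big| M^{dy}_{\mu}(\mathbf{1}_I)(x) \big|^2 d\mu(x) &= \int_0^1 \big| M^{dy}_{\mu}(\mathbf{1}_I)(x) \big|^2 d\mu(x) + \sum_{k=1}^{\infty} \Big( \frac{\mu([0,1))}{\mu([0, 2^k))} \Big)^2 \mu\big( [2^{k-1}, 2^k) \big) \\
&= \int_0^1 \big| M^{dy}_{\mu}(\mathbf{1}_I)(x) \big|^2 d\mu(x) + \mu([0,1))^2 \sum_{k=1}^{\infty} \frac{\mu\big( [0, 2^k) \big) - \mu\big( [0, 2^{k-1}) \big)}{\mu([0, 2^k))^2} \\
&\le \mu([0,1)) + \mu([0,1))^2 \sum_{k=1}^{\infty} \int_{\mu([0, 2^{k-1}))}^{\mu([0, 2^k))} \frac{1}{t^2}\,dt \\
&\le \mu([0,1)) + \mu([0,1))^2 \int_{\mu([0,1))}^{\infty} \frac{1}{t^2}\,dt = 2\mu([0,1)).
\end{align*}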
3. Exercises
Exercise 19. Prove the maximal theorem with $M^{dy}$ replaced by the larger maximal operator $M_{\mathcal{A}}$.
CHAPTER 10
Product integration and Fubini's theorem
In this chapter we investigate to what extent the order of integration can be reversed in a product integral, i.e. when do we have an equality
\[ \int_X \Big\{ \int_Y f(x, y)\,d\nu(y) \Big\}\,d\mu(x) = \int_Y \Big\{ \int_X f(x, y)\,d\mu(x) \Big\}\,d\nu(y)\,? \]
An important example of this question arose at the end of the previous chapter. However, much preparation needs to be done in order to even ask the general question intelligently. For example, what sorts of functions $f(x, y)$ have the property that for enough fixed points $x$, the function $y \to f(x, y)$ is measurable on $Y$; and for enough fixed points $y$, the function $x \to f(x, y)$ is measurable on $X$? This question brings to light the fact that we will be dealing with three $\sigma$-algebras of sets here, one in $X$, another in $Y$, and a third in the product set $X \times Y$. Thus we begin with an investigation of product $\sigma$-algebras.
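1. Product $\sigma$-algebras
Suppose that $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ are measurable spaces. A measurable rectangle is any set $R \in \mathcal{P}(X \times Y)$ having the form $R = A \times B$ where $A \in \mathcal{A}$ and $B \in \mathcal{B}$.
Definition 26. $\mathcal{A} \times \mathcal{B}$ is the smallest $\sigma$-algebra on $X \times Y$ containing all measurable rectangles.
An elementary set $E \in \mathcal{P}(X \times Y)$ is any finite pairwise disjoint union of measurable rectangles, i.e. $E = \dot{\bigcup}_{n=1}^{N} A_n \times B_n$ where $A_n \in \mathcal{A}$ and $B_n \in \mathcal{B}$. The collection of all elementary sets is denoted $\mathcal{E}$.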
Definition 27. A monotone class $\mathcal{M}$ on a set $Z$ is a collection of sets in $\mathcal{P}(Z)$ that is closed under both monotone unions and monotone intersections, i.e.
\[ \bigcup_{n=1}^{\infty} E_n \in \mathcal{M} \quad \text{if } E_n \in \mathcal{M} \text{ and } E_n \subset E_{n+1} \text{ for all } n \ge 1, \]
\[ \bigcap_{n=1}^{\infty} E_n \in \mathcal{M} \quad \text{if } E_n \in \mathcal{M} \text{ and } E_n \supset E_{n+1} \text{ for all } n \ge 1. \]
Clearly every $\sigma$-algebra is also a monotone class. Since $\mathcal{A} \times \mathcal{B}$ contains the elementary sets $\mathcal{E}$, it follows that $\mathcal{A} \times \mathcal{B}$ is a monotone class containing $\mathcal{E}$. It turns out that in order to define the notion of product measure independent of iteration, it is important that $\mathcal{A} \times \mathcal{B}$ is the smallest monotone class containing $\mathcal{E}$. Note that for any given collection of sets $\mathcal{F}$, the smallest monotone class containing $\mathcal{F}$ always exists: it is simply the intersection of all monotone classes containing $\mathcal{F}$.
Theorem 59. /E is the smallest monotone class on A 1 containing the
collection c of elementary sets.
Proof : From the remarks made prior to the theorem we have
(1.1) c / /E,
where / is the smallest monotone class containing c. Now the intersection of two
measurable rectangles is again a measurable rectangle, and the complement of a
measurable rectangle is a union of three pairwise disjoint measurable rectangles,
namely
(
1
1
1
) (
2
1
2
) = (
1

2
) (1
1
1
2
) ,
(
1
1
1
)
c
= (
c
1
1
1
)

' (
1
1
c
1
)

' (
c
1
1
c
1
) .
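From this we see that the collection $\mathcal{E}$ of elementary sets is closed under finite
unions, intersections and differences, i.e.
\[ P \cup Q,\ P \cap Q,\ P \setminus Q,\ Q \setminus P \in \mathcal{E} \quad \text{for all } P, Q \in \mathcal{E}. \tag{1.2} \]
Indeed, this is obvious for $P \cap Q$, and so then for $P \setminus Q = P \cap Q^c$, and finally then
for $P \cup Q = (P \setminus Q) \,\dot{\cup}\, Q$.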
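As a concrete check of (1.2), the following sketch works out one difference of rectangles, assuming for illustration that $X = Y = \mathbb{R}$ with the Borel sets. The particular rectangles $P$ and $Q$ are chosen here and do not appear in the notes.

```latex
% Hypothetical rectangles, chosen only for illustration
\[ P = [0,2) \times [0,2), \qquad Q = A \times B \quad \text{with } A = B = [1,3). \]
% Complement of Q as three pairwise disjoint rectangles
\[ Q^c = (A^c \times B) \,\dot{\cup}\, (A \times B^c) \,\dot{\cup}\, (A^c \times B^c). \]
% Intersect each piece with P, using the rule for intersecting two rectangles
\[
P \setminus Q = P \cap Q^c
= \big( [0,1) \times [1,2) \big) \,\dot{\cup}\, \big( [1,2) \times [0,1) \big) \,\dot{\cup}\, \big( [0,1) \times [0,1) \big).
\]
% Hence the union is again a finite pairwise disjoint union of rectangles
\[ P \cup Q = (P \setminus Q) \,\dot{\cup}\, Q. \]
```

Here $P \cup Q$ is written as four pairwise disjoint rectangles, so it is an elementary set even though it is not itself a rectangle.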
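Now for every $P \in \mathcal{P}(X \times Y)$ let
\[ \mathcal{M}_P = \{ Q \in \mathcal{P}(X \times Y) : P \setminus Q,\ Q \setminus P,\ P \cup Q \in \mathcal{M} \}. \]
It is clear that $\mathcal{M}_P$ is a monotone class for every $P \in \mathcal{P}(X \times Y)$, and moreover
that
\[ Q \in \mathcal{M}_P \iff P \in \mathcal{M}_Q, \quad \text{for all } P, Q \in \mathcal{P}(X \times Y). \tag{1.3} \]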
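We now claim that
\[ P \setminus Q,\ Q \setminus P,\ P \cup Q \in \mathcal{M} \quad \text{for all } P, Q \in \mathcal{M}. \tag{1.4} \]
Indeed, suppose first that $P \in \mathcal{E}$. Then by (1.2) we have that $Q \in \mathcal{M}_P$ for all
$Q \in \mathcal{E}$. Thus $\mathcal{E} \subset \mathcal{M}_P$ and hence also $\mathcal{M} \subset \mathcal{M}_P$ since $\mathcal{M}_P$ is a monotone class
containing $\mathcal{E}$, and $\mathcal{M}$ is the smallest such. Now fix $Q \in \mathcal{M}$. We just proved that
for $P \in \mathcal{E}$ we have $Q \in \mathcal{M}_P$, hence by (1.3) we also have $P \in \mathcal{M}_Q$. Thus $\mathcal{E} \subset \mathcal{M}_Q$,
and hence also $\mathcal{M} \subset \mathcal{M}_Q$ since $\mathcal{M}_Q$ is a monotone class. This completes the proof
of (1.4).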
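We next claim that $\mathcal{M}$ is a $\sigma$-algebra. Indeed, $\mathcal{M}$ is closed under complementation
by (1.4) since if $P \in \mathcal{M}$, then $P^c = (X \times Y) \setminus P$ where both $X \times Y$ and $P$
are in $\mathcal{M}$. Finally, $\mathcal{M}$ is closed under countable unions since if $\{P_n\}_{n=1}^{\infty} \subset \mathcal{M}$, then
\[ \bigcup_{n=1}^{\infty} P_n = \bigcup_{n=1}^{\infty} \left( P_1 \cup P_2 \cup \dots \cup P_n \right) \in \mathcal{M}, \]
since the latter union is monotone and $P_1 \cup P_2 \cup \dots \cup P_n \in \mathcal{M}$ for each $n \geq 1$ by
(1.4).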
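In particular, we have proved that $\mathcal{M}$ is a $\sigma$-algebra containing the measurable
rectangles. Since $\mathcal{A} \times \mathcal{B}$ is the smallest such we obtain $\mathcal{A} \times \mathcal{B} \subset \mathcal{M}$, which when
combined with (1.1) gives $\mathcal{A} \times \mathcal{B} = \mathcal{M}$.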
Definition 28. Given a function $f : X \times Y \to \mathbb{C}$ (or $[0, \infty]$), and a point
$x \in X$, we define the slice function $f_x : Y \to \mathbb{C}$ (or $[0, \infty]$) by
\[ f_x(y) = f(x, y), \quad y \in Y. \]
Similarly, for $y \in Y$, we define the slice function $f^y : X \to \mathbb{C}$ (or $[0, \infty]$) by
\[ f^y(x) = f(x, y), \quad x \in X. \]
Finally, for $E \in \mathcal{P}(X \times Y)$, we define the slices $E_x$ and $E^y$ by
\[ E_x = \{ y \in Y : (x, y) \in E \}, \quad x \in X, \]
\[ E^y = \{ x \in X : (x, y) \in E \}, \quad y \in Y. \]
Note that
\[ (\chi_E)_x = \chi_{E_x} \quad \text{and} \quad (\chi_E)^y = \chi_{E^y}. \]
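To make the slice notation concrete, here is a sketch for one specific set, chosen only for illustration: the closed triangle below the diagonal in the unit square, with $X = Y = \mathbb{R}$. The name $T$ and this choice of spaces are assumptions, not part of the notes.

```latex
% Hypothetical example set: closed triangle under the diagonal
\[ T = \{ (x, y) \in \mathbb{R}^2 : 0 \le y \le x \le 1 \}. \]
% Vertical slices (x fixed) and horizontal slices (y fixed)
\[
T_x = \begin{cases} [0, x] & \text{if } 0 \le x \le 1, \\ \emptyset & \text{otherwise,} \end{cases}
\qquad
T^y = \begin{cases} [y, 1] & \text{if } 0 \le y \le 1, \\ \emptyset & \text{otherwise.} \end{cases}
\]
% The slice of the indicator is the indicator of the slice
\[ (\chi_T)_x(y) = \chi_T(x, y) = \chi_{[0,x]}(y) = \chi_{T_x}(y), \qquad 0 \le x \le 1. \]
```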
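The minimality of the product $\sigma$-algebra $\mathcal{A} \times \mathcal{B}$ turns out to imply that measurability
of $f(x, y)$ with respect to $\mathcal{A} \times \mathcal{B}$ is passed on to measurability of the slice functions
$f_x$ and $f^y$ with respect to $\mathcal{B}$ and $\mathcal{A}$ respectively.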


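Theorem 60. Let $f$ be $\mathcal{A} \times \mathcal{B}$-measurable on $X \times Y$, and let $E \in \mathcal{A} \times \mathcal{B}$. Then
(1) for every $x \in X$, $f_x$ is $\mathcal{B}$-measurable on $Y$, and $E_x \in \mathcal{B}$;
(2) for every $y \in Y$, $f^y$ is $\mathcal{A}$-measurable on $X$, and $E^y \in \mathcal{A}$.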
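Proof: Let $V$ be an open set in $\mathbb{C}$ (or $[0, \infty]$) and let $G = f^{-1}(V)$. Then
\[ (f_x)^{-1}(V) = \{ y \in Y : (x, y) \in G \} = G_x. \]
Now let
\[ \mathcal{C} = \{ E \in \mathcal{A} \times \mathcal{B} : E_x \in \mathcal{B} \text{ for all } x \in X \}. \]
Since $\mathcal{B}$ and $\mathcal{A} \times \mathcal{B}$ are $\sigma$-algebras, it follows easily that $\mathcal{C}$ is a $\sigma$-algebra. Moreover,
if $E = A \times B$ is a measurable rectangle, then
\[ E_x = \begin{cases} B & \text{if } x \in A, \\ \emptyset & \text{if } x \in A^c, \end{cases} \]
and so $\mathcal{C}$ is a $\sigma$-algebra that contains all the measurable rectangles. We conclude that
$\mathcal{C} = \mathcal{A} \times \mathcal{B}$. Since $G \in \mathcal{A} \times \mathcal{B} = \mathcal{C}$, we have $(f_x)^{-1}(V) = G_x \in \mathcal{B}$ for all $x \in X$,
which shows that $f_x$ is $\mathcal{B}$-measurable for all $x \in X$. In particular, $\chi_{E_x} = (\chi_E)_x$ is
$\mathcal{B}$-measurable and $E_x \in \mathcal{B}$ for all $x \in X$. Similarly, $f^y$ is $\mathcal{A}$-measurable and $E^y \in \mathcal{A}$
for all $y \in Y$.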
2. Product measures
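Let $\mu$ be a positive measure on $(X, \mathcal{A})$, and let $\nu$ be a positive measure on
$(Y, \mathcal{B})$. In this section we consider the equality of the two natural candidates for
defining a product measure $\mu \times \nu$ on $\mathcal{A} \times \mathcal{B}$, namely for $E \in \mathcal{A} \times \mathcal{B}$,
\[ \int_X \left\{ \int_Y (\chi_E)_x(y) \, d\nu(y) \right\} d\mu(x) \quad \text{and} \quad \int_Y \left\{ \int_X (\chi_E)^y(x) \, d\mu(x) \right\} d\nu(y). \tag{2.1} \]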
We note that Theorem 60 shows that the functions $(\chi_E)_x$ and $(\chi_E)^y$ are measurable,
and hence that the inner integrals $\int_Y (\chi_E)_x(y) \, d\nu(y)$ and $\int_X (\chi_E)^y(x) \, d\mu(x)$ in
(2.1) exist for all $x$ and $y$. But we don't yet know that the functions
\[ x \mapsto \int_Y (\chi_E)_x(y) \, d\nu(y) \quad \text{and} \quad y \mapsto \int_X (\chi_E)^y(x) \, d\mu(x) \]
are measurable, and so we can't yet make sense of the iterated integrals in (2.1).
However, even when we can make sense of both iterated integrals, they may
not be equal! For example, if $\mu$ is Lebesgue measure on $(\mathbb{R}, \mathcal{L}_1)$, and $\nu$ is counting
measure on $(\mathbb{R}, \mathcal{P}(\mathbb{R}))$, and $E = \{ (x, x) : 0 \leq x \leq 1 \}$ is a diagonal segment in
$\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$, then
\[ \int_X \left\{ \int_Y (\chi_E)_x(y) \, d\nu(y) \right\} d\mu(x) = \int_{[0,1]} 1 \, d\mu(x) = 1 \cdot 1 = 1, \]
and
\[ \int_Y \left\{ \int_X (\chi_E)^y(x) \, d\mu(x) \right\} d\nu(y) = \int_{[0,1]} 0 \, d\nu(y) = 0 \cdot \infty = 0. \]
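The two values can be read off from the slices of the diagonal segment. The following sketch spells this out under the assumptions of the example above, and also records why counting measure on $\mathbb{R}$ is not $\sigma$-finite, which is the hypothesis that fails here.

```latex
% Slices of the diagonal segment E = {(x,x) : 0 <= x <= 1}
\[
E_x = \begin{cases} \{x\} & \text{if } 0 \le x \le 1, \\ \emptyset & \text{otherwise,} \end{cases}
\qquad
E^y = \begin{cases} \{y\} & \text{if } 0 \le y \le 1, \\ \emptyset & \text{otherwise.} \end{cases}
\]
% Inner integrals: counting measure of a point is 1, Lebesgue measure of a point is 0
\[
\int_Y (\chi_E)_x \, d\nu = \nu(\{x\}) = 1,
\qquad
\int_X (\chi_E)^y \, d\mu = \mu(\{y\}) = 0
\qquad (x, y \in [0,1]).
\]
% Counting measure on R is not sigma-finite: sets of finite counting measure are finite,
% and a countable union of finite sets is countable, while R is uncountable.
\[
\nu(F) < \infty \iff F \text{ is finite}, \qquad \text{so } \bigcup_{j=1}^{\infty} Y_j \neq \mathbb{R} \text{ whenever } \nu(Y_j) < \infty \text{ for all } j.
\]
```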
The following theorem resolves these difficulties when the measures $\mu$ and $\nu$ are
both $\sigma$-finite, i.e. $X = \dot{\bigcup}_{i=1}^{\infty} X_i$ with $\mu(X_i) < \infty$ for all $i$, and $Y = \dot{\bigcup}_{j=1}^{\infty} Y_j$ with
$\nu(Y_j) < \infty$ for all $j$. The proof will use the Monotone and Dominated Convergence
Theorems in conjunction with Theorem 59 on monotone classes.
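Theorem 61. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be $\sigma$-finite measure spaces, and let
$E \in \mathcal{A} \times \mathcal{B}$. Then
\[ \varphi(x) = \int_Y (\chi_E)_x(y) \, d\nu(y) = \nu(E_x), \quad x \in X, \]
is $\mathcal{A}$-measurable, and
\[ \psi(y) = \int_X (\chi_E)^y(x) \, d\mu(x) = \mu(E^y), \quad y \in Y, \]
is $\mathcal{B}$-measurable. Moreover, we have the equality
\[ \int_X \varphi(x) \, d\mu(x) = \int_Y \psi(y) \, d\nu(y). \]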
Proof: If both measures $\mu$ and $\nu$ were finite, we could use the Monotone and
Dominated Convergence Theorems to show that the class of all sets $E \in \mathcal{A} \times \mathcal{B}$
that satisfy the conclusions of the theorem is a monotone class containing the
elementary sets. We could then apply Theorem 59 to complete the proof of the
theorem. Since the measures $\mu$ and $\nu$ are only $\sigma$-finite, we must be a bit more
careful.
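Let $\mathcal{C}$ be the class of all sets $E \in \mathcal{A} \times \mathcal{B}$ that satisfy the conclusions of the
theorem. We claim that $\mathcal{C}$ has the following four properties:
(1) Every measurable rectangle $A \times B \in \mathcal{A} \times \mathcal{B}$ belongs to $\mathcal{C}$,
(2) If $\{E_n\}_{n=1}^{\infty}$ is a nondecreasing sequence of sets in $\mathcal{C}$, i.e. $E_n \subset E_{n+1}$ for
all $n \geq 1$, then $E = \bigcup_{n=1}^{\infty} E_n \in \mathcal{C}$,
(3) If $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{C}$, i.e. $E_m \cap E_n = \emptyset$
for all $m, n \geq 1$ with $m \neq n$, then $E = \dot{\bigcup}_{n=1}^{\infty} E_n \in \mathcal{C}$,
(4) Suppose that $A \times B$ is a measurable rectangle with $\mu(A) < \infty$ and $\nu(B) < \infty$.
Then if $\{E_n\}_{n=1}^{\infty}$ is a nonincreasing sequence of sets in $\mathcal{C}$, i.e. $E_n \supset E_{n+1}$
for all $n \geq 1$, and if $A \times B \supset E_1$, then $E = \bigcap_{n=1}^{\infty} E_n \in \mathcal{C}$.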
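With these four properties established for $\mathcal{C}$, it is easy to finish the proof of the
theorem. Indeed, we simply define
\[ \mathcal{M} = \{ E \in \mathcal{A} \times \mathcal{B} : E \cap (X_i \times Y_j) \in \mathcal{C} \text{ for all } i, j \geq 1 \}, \]
where $X_i$ and $Y_j$ are as in the definition of $\sigma$-finiteness of $X$ and $Y$. Properties (2)
and (4) show that $\mathcal{M}$ is a monotone class. Properties (1) and (3) show that the
elementary sets $\mathcal{E}$ are contained in $\mathcal{M}$. Theorem 59 now shows that $\mathcal{M} = \mathcal{A} \times \mathcal{B}$.
Finally, every $E \in \mathcal{A} \times \mathcal{B}$ is the pairwise disjoint union of the sets $E \cap (X_i \times Y_j) \in \mathcal{C}$,
so property (3) gives $E \in \mathcal{C}$, and the theorem is proved.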
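So it remains only to establish properties (1) through (4) for the class $\mathcal{C}$. If
$E = A \times B$, then
\[ (\chi_E)_x(y) = \chi_A(x) \chi_B(y) = (\chi_E)^y(x) \]
is $\mathcal{B}$-measurable for each $x$, and $\mathcal{A}$-measurable for each $y$, and so
\[ \varphi(x) = \int_Y \chi_A(x) \chi_B(y) \, d\nu(y) = \nu(B) \chi_A(x) \quad \text{is measurable,} \]
\[ \psi(y) = \int_X \chi_A(x) \chi_B(y) \, d\mu(x) = \mu(A) \chi_B(y) \quad \text{is measurable,} \]
\[ \int_X \varphi(x) \, d\mu(x) = \mu(A) \nu(B) = \int_Y \psi(y) \, d\nu(y). \]
This establishes property (1).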
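To prove property (2), we let $\varphi_n$ and $\psi_n$ correspond to $E_n$ in the same way
that $\varphi$ and $\psi$ correspond to $E$ above. We are assuming that $\varphi_n$ and $\psi_n$ satisfy the
conclusions of the theorem, so they are both measurable and
\[ \int_X \varphi_n(x) \, d\mu(x) = \int_Y \psi_n(y) \, d\nu(y), \quad n \geq 1. \]
Since the sequence of sets $E_n$ is nondecreasing, the sequence of functions $\varphi_n$ is
nondecreasing, and so is the sequence of functions $\psi_n$. Moreover, since the slices
$(E_n)_x$ increase to $E_x$ and the slices $(E_n)^y$ increase to $E^y$, we have $\varphi_n \to \varphi$ and
$\psi_n \to \psi$ pointwise, so that $\varphi$ and $\psi$ are measurable. The Monotone Convergence
Theorem applied twice gives
\[ \int_X \varphi(x) \, d\mu(x) = \lim_{n \to \infty} \int_X \varphi_n(x) \, d\mu(x) = \lim_{n \to \infty} \int_Y \psi_n(y) \, d\nu(y) = \int_Y \psi(y) \, d\nu(y). \]
This completes the proof of property (2).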
Property (3) is obvious for finite pairwise disjoint unions, and the general case
then follows using property (2).
Finally, the proof of property (4) is similar to that of property (2), except
that we can use the Dominated Convergence Theorem instead of the Monotone
Convergence Theorem because both $\mu(A)$ and $\nu(B)$ are finite.
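The role of the finiteness assumption in property (4) can be seen in a simple sketch. Assume, for illustration only, that $X = Y = \mathbb{R}$ and that $\mu = \nu = m$ is Lebesgue measure; the sets $F_n$ below are chosen here and do not appear in the notes. Inside a finite rectangle the functions $\varphi_n$ are dominated by an integrable function, while without such a rectangle the integrals need not converge to the integral of the limit.

```latex
% Inside a rectangle A x B with \mu(A), \nu(B) finite: an integrable dominating function
\[
E_n \subset A \times B \implies 0 \le \varphi_n(x) = \nu\big( (E_n)_x \big) \le \nu(B) \, \chi_A(x),
\qquad \int_X \nu(B) \, \chi_A \, d\mu = \mu(A) \, \nu(B) < \infty.
\]
% Hypothetical nonincreasing rectangles not contained in any rectangle of finite measure
\[ F_n = [n, \infty) \times [0, 1], \qquad F_n \supset F_{n+1}, \qquad \bigcap_{n=1}^{\infty} F_n = \emptyset. \]
% The integrals of the slice measures do not converge to the value for the limit set
\[
\int_{\mathbb{R}} m\big( (F_n)_x \big) \, dx = \int_{\mathbb{R}} \chi_{[n, \infty)}(x) \, dx = \infty \ \text{ for every } n,
\qquad
\int_{\mathbb{R}} m\big( \emptyset_x \big) \, dx = 0.
\]
```

In this particular example the limit set is empty and so belongs to $\mathcal{C}$ trivially; the point is only that the limiting argument used for property (4) needs the dominating function.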
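We can now define the product measure $\mu \times \nu$ on $\mathcal{A} \times \mathcal{B}$ that is associated with
$\mu$ and $\nu$.
Definition 29. If $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are $\sigma$-finite measure spaces, and if
$E \in \mathcal{A} \times \mathcal{B}$, define
\[ (\mu \times \nu)(E) = \int_X \left\{ \int_Y (\chi_E)_x(y) \, d\nu(y) \right\} d\mu(x) = \int_Y \left\{ \int_X (\chi_E)^y(x) \, d\mu(x) \right\} d\nu(y), \tag{2.2} \]
where the equality of the two iterated integrals follows from Theorem 61.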
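As a quick check of Definition 29, the sketch below evaluates both iterated integrals in (2.2) for one region under a parabola. It assumes $X = Y = \mathbb{R}$ with Lebesgue measure $m$ on the Borel sets; the set $S$ is a hypothetical example chosen here, not one from the notes.

```latex
% Hypothetical example set: region under the parabola y = x^2 over [0,1]
\[ S = \{ (x, y) \in \mathbb{R}^2 : 0 \le x \le 1,\ 0 \le y \le x^2 \}. \]
% Slices
\[ S_x = [0, x^2] \ \ (0 \le x \le 1), \qquad S^y = [\sqrt{y}, 1] \ \ (0 \le y \le 1). \]
% First iterated integral: integrate the length of the vertical slices
\[ \int_{\mathbb{R}} m(S_x) \, dx = \int_0^1 x^2 \, dx = \frac{1}{3}. \]
% Second iterated integral: integrate the length of the horizontal slices
\[ \int_{\mathbb{R}} m(S^y) \, dy = \int_0^1 \left( 1 - \sqrt{y} \right) dy = 1 - \frac{2}{3} = \frac{1}{3}. \]
```

Both orders give $(m \times m)(S) = \frac{1}{3}$, as Theorem 61 predicts for $\sigma$-finite measures.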
Corollary 14 applied twice shows that $\mu \times \nu$ is a positive measure on $(X \times Y, \mathcal{A} \times \mathcal{B})$,
and it is of course $\sigma$-finite. With the definition of product measure in hand, we are
more than halfway to proving the equality of iterated integrals in Fubini's theorem.
Indeed, taking finite sums of scalars times indicator functions in (2.2) shows that
\[ \int_X \left\{ \int_Y f \, d\nu \right\} d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \left\{ \int_X f \, d\mu \right\} d\nu, \tag{2.3} \]
for all simple functions $f$. Five applications of the Monotone Convergence Theorem
then show that (2.3) holds for nonnegative measurable $f$. The integrals on the far
left and far right in (2.3) are called iterated integrals, and the integral in the middle
is called a double integral. In the next section we give a precise and more general
statement, along with a detailed proof. The cases where $f$ is $[0, \infty]$-valued and
$\mathbb{C}$-valued are treated separately.
3. Fubini's theorem
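Theorem 62. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be $\sigma$-finite measure spaces, and let $f$
be $\mathcal{A} \times \mathcal{B}$-measurable on the product set $X \times Y$.
(1) If $0 \leq f(x, y) \leq \infty$ for all $(x, y) \in X \times Y$, and if
\[ \varphi(x) = \int_Y f_x(y) \, d\nu(y), \quad x \in X, \]
\[ \psi(y) = \int_X f^y(x) \, d\mu(x), \quad y \in Y, \]
then $\varphi$ is $\mathcal{A}$-measurable and $\psi$ is $\mathcal{B}$-measurable and
\[ \int_X \varphi \, d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \psi \, d\nu. \]
(2) If $f(x, y) \in \mathbb{C}$ for all $(x, y) \in X \times Y$, and
\[ \varphi^{*}(x) = \int_Y |f|_x(y) \, d\nu(y), \quad x \in X, \]
then
\[ \int_{X \times Y} |f| \, d(\mu \times \nu) = \int_X \varphi^{*} \, d\mu, \]
and so $f \in L^1(\mu \times \nu)$ if $\int_X \varphi^{*} \, d\mu < \infty$.
(3) If $f \in L^1(\mu \times \nu)$ then $f_x \in L^1(\nu)$ for $\mu$-almost every $x \in X$, $f^y \in L^1(\mu)$
for $\nu$-almost every $y \in Y$, the functions $\varphi$ and $\psi$ defined almost everywhere
by
\[ \varphi(x) = \int_Y f_x(y) \, d\nu(y), \quad \mu\text{-a.e. } x \in X, \]
\[ \psi(y) = \int_X f^y(x) \, d\mu(x), \quad \nu\text{-a.e. } y \in Y, \]
are in $L^1(\mu)$ and $L^1(\nu)$ respectively, and
\[ \int_X \varphi \, d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \psi \, d\nu. \]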
The first assertion (1) is often called Tonelli's Theorem, while the third assertion
(3) is then referred to as Fubini's Theorem. The point of assertion (2) is that if
at least one of the iterated integrals of $|f|$ is finite, then $f \in L^1(\mu \times \nu)$ and so (3)
holds.
Proof: We first prove assertion (1). If $E \in \mathcal{A} \times \mathcal{B}$, then Theorem 61 shows
that assertion (1) holds for $f = \chi_E$. By summing scalar multiples of such indicator
functions, we see that (1) holds for all simple functions $f$. Now if $f$ is $[0, \infty]$-valued,
Proposition 7 shows that there is a nondecreasing sequence $\{s_n\}_{n=1}^{\infty}$ of
nonnegative simple functions satisfying $0 \leq s_n \leq s_{n+1} \leq f$ for all $n \geq 1$ and
such that $\lim_{n \to \infty} s_n(x, y) = f(x, y)$ for every $(x, y) \in X \times Y$. Since assertion (1)
holds for $s_n$, if we let $\varphi_n$ and $\psi_n$ correspond to $s_n$ in the same way that $\varphi$ and $\psi$
correspond to $f$, then we have
\[ \int_X \varphi_n \, d\mu = \int_{X \times Y} s_n \, d(\mu \times \nu) = \int_Y \psi_n \, d\nu, \quad \text{for all } n \geq 1. \]
We now apply the Monotone Convergence Theorem five times. Two applications
show that $\varphi_n$ increases pointwise to $\varphi$, and that $\psi_n$ increases pointwise to $\psi$. Three
more applications show that
\[ \lim_{n \to \infty} \int_X \varphi_n \, d\mu = \int_X \varphi \, d\mu, \]
\[ \lim_{n \to \infty} \int_{X \times Y} s_n \, d(\mu \times \nu) = \int_{X \times Y} f \, d(\mu \times \nu), \]
\[ \lim_{n \to \infty} \int_Y \psi_n \, d\nu = \int_Y \psi \, d\nu, \]
and this completes the proof of assertion (1).
Assertion (2) is an immediate consequence of applying assertion (1) to $|f|$.
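Finally, assertion (3) is easily reduced to the case that $f$ is real-valued. Assertion
(1) then applies to both the positive $f^{+}$ and negative $f^{-}$ parts of $f$ to
give
\[ \int_X \varphi^{\pm} \, d\mu = \int_{X \times Y} f^{\pm} \, d(\mu \times \nu) = \int_Y \psi^{\pm} \, d\nu, \]
where $\varphi^{\pm}$ and $\psi^{\pm}$ correspond to $f^{\pm}$ in the same way that $\varphi$ and $\psi$ correspond to
$f$. Now we add the two equations corresponding to $\pm$ to obtain that
\[ \int_X \left( \varphi^{+} + \varphi^{-} \right) d\mu = \int_{X \times Y} |f| \, d(\mu \times \nu) = \int_Y \left( \psi^{+} + \psi^{-} \right) d\nu, \]
and the middle integral is finite since $f \in L^1(\mu \times \nu)$.
Thus the functions $\varphi^{\pm}$, $f^{\pm}$, $\psi^{\pm}$ are finite almost everywhere, and all have finite
integral. Thus we can take the difference of the two equations corresponding to
$\pm$ to obtain
\[ \int_X \varphi \, d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \psi \, d\nu. \]
Note that the indeterminate expression $\infty - \infty$ will only arise on sets of measure
zero in the differences taken above. This completes the proof of Fubini's theorem.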
The next two examples show that assertion (3) of Fubini's theorem may fail if
• $f$ is not integrable, even if all other hypotheses hold,
• $f$ is not $\mathcal{A} \times \mathcal{B}$-measurable, even if all other hypotheses hold.
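Example 7. Even if $X$ and $Y$ are finite measure spaces, $f$ is $\mathcal{A} \times \mathcal{B}$-measurable,
and both iterated integrals for $f$ exist, it may happen that the iterated integrals are
not equal, due to the fact that $f$ fails to be integrable. For example, let $X = Y = [0, 1)$,
let $\mu = \nu$ be Lebesgue measure on $[0, 1)$, and define $f$ by
\[ f(x, y) = \sum_{n=1}^{\infty} \left\{ 2^{n+1} \chi_{\left[ \frac{1}{2^{n+1}}, \frac{1}{2^n} \right)}(x) - 2^n \chi_{\left[ \frac{1}{2^n}, \frac{1}{2^{n-1}} \right)}(x) \right\} 2^n \chi_{\left[ \frac{1}{2^n}, \frac{1}{2^{n-1}} \right)}(y). \]
Then if $x \in \left[ \frac{1}{2}, 1 \right)$, we have $f_x(y) = -4 \chi_{\left[ \frac{1}{2}, 1 \right)}(y)$ and $\int f_x(y) \, dy = -2$, while if
$x \in \left[ \frac{1}{2^{n+1}}, \frac{1}{2^n} \right)$ for some $n \geq 1$, then
\[ f_x(y) = 2^{n+1} 2^n \chi_{\left[ \frac{1}{2^n}, \frac{1}{2^{n-1}} \right)}(y) - 2^{n+1} 2^{n+1} \chi_{\left[ \frac{1}{2^{n+1}}, \frac{1}{2^n} \right)}(y), \]
and $\int f_x(y) \, dy = 0$. Altogether we have
\[ \int \left\{ \int f_x(y) \, dy \right\} dx = \int_{\left[ \frac{1}{2}, 1 \right)} (-2) \, dx + \sum_{n=1}^{\infty} \int_{\left[ \frac{1}{2^{n+1}}, \frac{1}{2^n} \right)} 0 \, dx = -1. \]
On the other hand, we have $\int f^y(x) \, dx = 0$ for all $y \in [0, 1)$ and so
\[ \int \left\{ \int f^y(x) \, dx \right\} dy = \int 0 \, dy = 0. \]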
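The failure of integrability in Example 7 can be verified directly. The sketch below computes the integral of $|f|$ over the rectangles on which $f$ is nonzero; the names $R_n$ and $S_n$ are introduced here for illustration and are not notation from the notes.

```latex
% For each n >= 1, the n-th summand of f is supported on two rectangles in the
% horizontal band 2^{-n} <= y < 2^{-(n-1)}; rectangles for different n are disjoint.
\[
R_n = \left[ 2^{-(n+1)}, 2^{-n} \right) \times \left[ 2^{-n}, 2^{-(n-1)} \right),
\qquad
S_n = \left[ 2^{-n}, 2^{-(n-1)} \right) \times \left[ 2^{-n}, 2^{-(n-1)} \right).
\]
% Values of f on these rectangles
\[ f = 2^{n+1} \cdot 2^{n} = 2^{2n+1} \ \text{on } R_n, \qquad f = -2^{n} \cdot 2^{n} = -2^{2n} \ \text{on } S_n. \]
% Each rectangle contributes exactly 1 to the integral of |f|
\[
\int_{R_n} |f| = 2^{2n+1} \cdot 2^{-(n+1)} \cdot 2^{-n} = 1,
\qquad
\int_{S_n} |f| = 2^{2n} \cdot 2^{-n} \cdot 2^{-n} = 1.
\]
% Summing over n shows that f is not integrable
\[ \int_{[0,1) \times [0,1)} |f| \, d(\mu \times \nu) = \sum_{n=1}^{\infty} (1 + 1) = \infty. \]
```

So $f \notin L^1(\mu \times \nu)$, and neither assertion (2) nor assertion (3) of Theorem 62 applies.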
Example 8. Even if $X$ and $Y$ are finite measure spaces, and both iterated
integrals exist for a nonnegative bounded function $f$, it may happen that the iterated
integrals are not equal, due to the fact that $f$ fails to be $\mathcal{A} \times \mathcal{B}$-measurable. For
example, let both $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be Lebesgue measure on $[0, 1]$. Assume the
axiom of choice, and in addition the continuum hypothesis, which asserts that the
cardinality of the real numbers is the first uncountable cardinal. Then there is a
one-to-one mapping
\[ \Lambda : [0, 1] \to X \setminus \{\omega_1\}, \]
where $(X, \prec)$ is the well-ordered set whose last element is the first uncountable
ordinal $\omega_1$. See the fourth instance of a measure space in Example 4 near the
beginning of Chapter 6. We note in passing that Cohen's famous theorem shows that
the continuum hypothesis is independent of ZFC set theory, the Zermelo-Fraenkel
axioms together with the axiom of choice. Now define
\[ E = \left\{ (x, y) \in [0, 1]^2 : \Lambda(x) \prec \Lambda(y) \right\}. \]
Recall that there are at most countably many predecessors of $\alpha$ for any $\alpha \in X \setminus \{\omega_1\}$.
Thus for each $x \in [0, 1]$, the slice $E_x$ contains all but at most countably many of the
points in $[0, 1]$, and so is Borel measurable with measure 1. Also, for each $y \in [0, 1]$,
the slice $E^y$ contains at most countably many of the points in $[0, 1]$, and so is Borel
measurable with measure 0. Thus the iterated integrals of $\chi_E$ both exist and we
compute that
\[ \int_{[0,1]} \left\{ \int_{[0,1]} (\chi_E)_x(y) \, dy \right\} dx = \int_{[0,1]} 1 \, dx = 1, \]
\[ \int_{[0,1]} \left\{ \int_{[0,1]} (\chi_E)^y(x) \, dx \right\} dy = \int_{[0,1]} 0 \, dy = 0. \]
4. Exercises
Exercise 20. Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is such that every section $f_x$ is Borel
measurable on $\mathbb{R}$, and every section $f^y$ is continuous on $\mathbb{R}$. Prove that $f$ is Borel
measurable on $\mathbb{R}^2$. Recall that $f_x : \mathbb{R} \to \mathbb{R}$ and $f^y : \mathbb{R} \to \mathbb{R}$ are defined by
$f_x(y) = f(x, y) = f^y(x)$.
Bibliography
[1] R. G. Bartle and D. R. Sherbert, Introduction to Real Analysis, John Wiley and Sons, Inc.
3rd edition, 2000.
[2] C. B. Boyer, A history of mathematics, John Wiley & Sons, Inc., 1968.
[3] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, 3rd edition, 1976.
[4] W. Rudin, Real and Complex Analysis, McGraw-Hill, 3rd edition, 1987.
[5] W. Rudin, Functional Analysis, International Series in Pure and Appl. Math., McGraw-Hill,
2nd edition, 1991.
[6] E. M. Stein and R. Shakarchi, Complex Analysis, Princeton Lectures in Analysis II,
Princeton University Press, Princeton and Oxford, 2003.
[7] E. M. Stein and R. Shakarchi, Real Analysis, Princeton Lectures in Analysis III, Princeton
University Press, Princeton and Oxford, 2003.
[8] S. Wagon, The Banach-Tarski Paradox, Cambridge University Press, 1985.