
An Introduction to Lebesgue Measure and its Applications to Probability Theory


Real Analysis II
Fall 2013 Semester
Yeng Chang
2013
References (for interested parties)
From Measures to Itô Integrals, Kopp, 2011.
Very short and to the point.
A Probability Path, Resnick, 2005 (5th printing).
Warning: quite a few typos.
The Elements of Integration and Lebesgue Measure, Bartle, 1995.
An analysis-style approach to measure theory. The measure-theory material covered in this course can be found in the second section of this book.
A First Course in Probability, Ross, 1988 (3rd edition).
A somewhat outdated edition, but Ross's texts are the standard for a rigorous elementary probability course.
Background
A family (or class) of things is a set of things that have a
common quality.
Definition (σ-field/algebra)
Let A be a family of subsets of a set Ω ≠ ∅. A is said to be a σ-algebra (or σ-field) if:
(i) ∅ ∈ A.
(ii) If A ∈ A, then its complement A^c = Ω \ A ∈ A as well. That is, A is closed under complements.
(iii) If (A_n) is a sequence of sets in A, then ⋃_{n=1}^∞ A_n ∈ A as well. That is, A is closed under countable unions.
Background
Theorem
Let (A_n) be a sequence of sets in a σ-algebra A. Then ⋂_{n=1}^∞ A_n ∈ A as well. That is, A is closed under countable intersections.
Proof.
Use De Morgan's Laws: ⋂_{n=1}^∞ A_n = (⋃_{n=1}^∞ A_n^c)^c. Each A_n^c ∈ A, so the countable union is in A, and hence so is its complement.
Example (Various σ-fields)
(i) Let A = 2^Ω = P(Ω) be the power set of Ω, that is, the set of all subsets of Ω.
(ii) Let A = {∅, Ω}. A is known as the trivial σ-field.
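As a concrete illustration of these definitions (not part of the original slides), here is a minimal Python sketch that brute-forces the σ-field generated by a family of subsets of a finite Ω. The function name generate_sigma_field and the choice of generators are purely illustrative assumptions.

```python
from itertools import combinations

def generate_sigma_field(omega, generators):
    """Brute-force the sigma-field on a finite set `omega` generated by
    `generators` (a collection of subsets), by closing under complements
    and unions until nothing new appears."""
    sigma = {frozenset(), frozenset(omega)}
    sigma |= {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        # close under complements
        for a in current:
            comp = frozenset(omega) - a
            if comp not in sigma:
                sigma.add(comp)
                changed = True
        # close under pairwise unions (enough on a finite set)
        for a, b in combinations(current, 2):
            u = a | b
            if u not in sigma:
                sigma.add(u)
                changed = True
    return sigma

# Example: Omega = {1, 2, 3, 4}, generated by the single set {1, 2}
omega = {1, 2, 3, 4}
print(sorted(map(sorted, generate_sigma_field(omega, [{1, 2}]))))
# -> [[], [1, 2], [1, 2, 3, 4], [3, 4]]
```

For a finite Ω, countable unions reduce to finite ones, so closing under complements and pairwise unions is sufficient.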
Background
Definition (Measure)
Let R̄ = R ∪ {−∞, +∞} be the set of extended real numbers, and let A be a σ-algebra of subsets of Ω.
μ : A → R̄ is said to be a measure if:
(i) μ(∅) = 0.
(ii) μ(A) ≥ 0 for all A ∈ A.
(iii) μ(⋃_{n=1}^∞ A_n) = Σ_{n=1}^∞ μ(A_n) for any pairwise disjoint sequence of sets (A_n) in A. That is, μ is σ-additive or countably additive.
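A small numerical illustration (not from the slides): the sketch below defines a hypothetical discrete measure on subsets of {0, 1, 2, ...} through the weights w(n) = 2^(−(n+1)), which are my choice, and checks properties (i)-(iii) on a few disjoint sets.

```python
# A hypothetical discrete measure: mu(A) = sum of weights w(n) over n in A.
def mu(A, w=lambda n: 2.0 ** (-(n + 1))):
    return sum(w(n) for n in A)

A1, A2, A3 = {0, 2}, {1, 5}, {3}                 # pairwise disjoint sets
union = A1 | A2 | A3

print(mu(set()))                                  # (i)   mu(empty set) = 0
print(all(mu(A) >= 0 for A in (A1, A2, A3)))      # (ii)  nonnegativity
print(abs(mu(union) - (mu(A1) + mu(A2) + mu(A3))) < 1e-12)   # (iii) additivity
```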
Background
Definition (Measure Space)
Using the definitions in this lecture, a measure space is a triple (Ω, A, μ).
Borel Sets
Definition (Field/Algebra)
Let A be a family of subsets of a set Ω ≠ ∅. A is said to be an algebra (or field) if:
(i) ∅ ∈ A.
(ii) A is closed under complements.
(iii) If (A_n) is a sequence of sets in A, then A_i ∪ A_j ∈ A for i ≠ j. That is, A is closed under (finite) unions.
Notice that a σ-algebra is the same as an algebra, except that σ-algebras are required to be closed under countable unions rather than just finite unions. Whenever σ is part of the name of a structure, it usually indicates that the word "countable" is involved.
Borel Sets
Let
A_0 = { ⋃_{i=1}^n (a_i, b_i] : a_1 ≤ b_1 ≤ a_2 ≤ ··· ≤ b_n, n ≥ 1 },
where a_i ∈ R̄ and b_i ∈ R. It can be shown that A_0 is a field.
Definition (Minimal σ-field)
The minimal σ-field generated by a family A of subsets of Ω, denoted σ(A), is defined by
σ(A) = ⋂ {G : G is a σ-field and A ⊆ G}.
In particular, the Borel σ-field, denoted B(R), is B(R) = σ(A_0).
Constructing Probability
Definition (Probability Measure and Space)
Let Ω be the sample space of an experiment and A be a σ-algebra of subsets of Ω, known as the events. P : A → [0, 1] is said to be a probability measure if P is a measure and P(Ω) = 1.
The measure space (Ω, A, P) is known as a probability space.
Example
Let Ω = (0, 1], B = B((0, 1]), and S = {(a, b] : 0 ≤ a ≤ b ≤ 1}.
Define λ : S → [0, 1] by λ(∅) = 0 and λ((a, b]) = b − a. λ is known as Lebesgue measure on (0, 1]. Furthermore, (Ω, B, λ) is a probability space.
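As a quick sanity check (a sketch, not part of the original example), one can approximate λ((a, b]) by sampling uniformly from (0, 1] and counting hits; the values a = 0.2, b = 0.7 are arbitrary, and numpy is assumed to be available.

```python
import numpy as np

# Under Lebesgue measure on (0, 1], the probability of (a, b] is b - a.
rng = np.random.default_rng(0)
a, b = 0.2, 0.7
u = rng.uniform(0.0, 1.0, size=200_000)     # uniform draws from the unit interval
print(np.mean((u > a) & (u <= b)))          # ~ 0.5 = b - a
```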
Properties of P
Notation
We write AB = A ∩ B.
(i) P(A^c) = 1 − P(A).
Notice that for A ∈ A,
1 = P(Ω) = P(A ∪ A^c) = P(A) + P(A^c).
Properties of P
(ii) P(A ∪ B) = P(A) + P(B) − P(AB).
For A, B ∈ A, notice that A = AB^c ∪ AB and B = BA^c ∪ AB. Then
P(A ∪ B) = P(AB^c ∪ BA^c ∪ AB)
= P(AB^c) + P(BA^c) + P(AB)   (a disjoint union)
= P(A) − P(AB) + P(B) − P(AB) + P(AB)
= P(A) + P(B) − P(AB).
Properties of P
(iii) If A ⊆ B, then P(A) ≤ P(B).
This is because P(B) = P(A) + P(B \ A) ≥ P(A), since P ≥ 0.
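The three properties can also be checked numerically for the uniform (Lebesgue) probability measure on (0, 1], using half-open intervals as events. The helper lam and the particular intervals below are illustrative assumptions, not part of the slides.

```python
# Numerical check of properties (i)-(iii) for P = Lebesgue measure on (0, 1],
# with half-open intervals (a, b] as events, so that P((a, b]) = b - a.
def lam(a, b):                        # length of (a, b] intersected with (0, 1]
    return max(0.0, min(b, 1.0) - max(a, 0.0))

A, B = (0.1, 0.6), (0.4, 0.9)         # two overlapping events
P_A, P_B = lam(*A), lam(*B)
P_AB = lam(max(A[0], B[0]), min(A[1], B[1]))    # A ∩ B = (0.4, 0.6]
P_AuB = lam(min(A[0], B[0]), max(A[1], B[1]))   # A ∪ B = (0.1, 0.9] (they overlap)
P_Ac = lam(0.0, A[0]) + lam(A[1], 1.0)          # A^c = (0, 0.1] ∪ (0.6, 1]

print(abs(P_Ac - (1 - P_A)) < 1e-12)            # (i)   P(A^c) = 1 - P(A)
print(abs(P_AuB - (P_A + P_B - P_AB)) < 1e-12)  # (ii)  inclusion-exclusion
print(P_AB <= P_A)                              # (iii) AB ⊆ A implies P(AB) <= P(A)
```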
Random Variables
Let D ⊆ R be measurable. Recall that f : D → R is said to be measurable if, for every α ∈ R, at least one of the sets {x : f(x) > α}, {x : f(x) ≥ α}, {x : f(x) < α}, {x : f(x) ≤ α} is a measurable set.
Definition (Random Variable)
If (Ω, A, P) is a probability space, a measurable function X : Ω → R is said to be a random variable.
Integration and Expectation
Notation
Probabilists use 1_A = χ_A, what we learned as the characteristic function, and call it the indicator function.
Definition (Expectation of a Random Variable)
Let (Ω, B, P) be a probability space and X : Ω → R̄. The expectation of X, written E[X], is the Lebesgue–Stieltjes integral of X with respect to P:
E[X] := ∫_Ω X dP = ∫_Ω X(ω) P(dω).
Integration and Expectation
Notation
If (A_i), i = 1, 2, . . . , n, is a sequence of disjoint sets, Σ_{i=1}^n A_i is used to indicate the union of these sets.
Definition (Expectation of a Simple Function)
Suppose |a_i| < ∞ and Σ_{i=1}^n A_i = Ω, and let X = Σ_{i=1}^n a_i 1_{A_i} : Ω → R be simple. The expectation of X is defined as:
E[X] := Σ_{i=1}^n a_i P(A_i).
This is the definition of expectation for discrete random variables as taught in Math 346.
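A minimal sketch (assuming numpy; the values a_i and the intervals A_i ⊂ (0, 1] below are arbitrary choices) that evaluates E[X] = Σ a_i P(A_i) under Lebesgue measure and cross-checks it with a Monte Carlo average:

```python
import numpy as np

# Simple random variable on ((0,1], B, lambda): X takes value a_i on A_i = (l_i, r_i].
values    = [2.0, -1.0, 5.0]                           # the a_i (arbitrary)
intervals = [(0.0, 0.25), (0.25, 0.75), (0.75, 1.0)]   # disjoint, union = (0, 1]

E_X = sum(a * (r - l) for a, (l, r) in zip(values, intervals))
print(E_X)                                   # 2*0.25 - 1*0.5 + 5*0.25 = 1.25

# Monte Carlo cross-check: evaluate X at uniform draws and average.
u = np.random.default_rng(1).uniform(size=200_000)
X = np.select([u <= 0.25, u <= 0.75], values[:2], default=values[2])
print(X.mean())                              # ~ 1.25
```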
Properties of E
(i) E[1_A] = P(A).
This is because E[1_A] = 1·P(A) + 0·P(A^c) = P(A).
Notation
Denote by E the set of all simple functions Ω → R.
Properties of E
(ii) Let X, Y ∈ E. Then for α, β ∈ R, E[αX + βY] = αE[X] + βE[Y].
Suppose
X = Σ_{i=1}^n a_i 1_{A_i},   Y = Σ_{j=1}^m b_j 1_{B_j}.
Then
αX + βY = Σ_{all i,j} (αa_i + βb_j) 1_{A_i B_j},
so
E[αX + βY] = Σ_{all i,j} (αa_i + βb_j) P(A_i B_j).
From here, use the fact that Σ_{j=1}^m P(A_i B_j) = P(A_i) and Σ_{i=1}^n P(A_i B_j) = P(B_j), which are true due to the law of total probability.
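A Monte Carlo sanity check of linearity for two simple functions on ((0, 1], B, λ); the particular X, Y, α, β below are arbitrary choices and numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.uniform(size=500_000)                 # "points" of (0, 1]

X = np.where(u <= 0.5, 1.0, 3.0)              # simple: 1 on (0, 1/2], 3 on (1/2, 1]
Y = np.where(u <= 0.25, -2.0, 4.0)            # simple: -2 on (0, 1/4], 4 on (1/4, 1]
alpha, beta = 2.0, -1.5

# Exact value from E[X] = sum a_i P(A_i) applied to X and Y separately.
exact = alpha * (1.0 * 0.5 + 3.0 * 0.5) + beta * (-2.0 * 0.25 + 4.0 * 0.75)
print(np.mean(alpha * X + beta * Y), exact)   # Monte Carlo ~ exact value 0.25
```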
Properties of E
This brings up the question: how can we extend E to continuous random variables?
Notation
Write a ∨ b := max(a, b).
Definition (Integrable Random Variables)
Let X be a random variable and define
X^+ = X ∨ 0,   X^− = (−X) ∨ 0.
If both E[X^+] and E[X^−] are finite (or, put simply, E[|X|] < ∞), X is said to be integrable. The set of all integrable random variables is denoted L^1, or L^1(P) to emphasize the probability measure.
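The decomposition into positive and negative parts is easy to see numerically. The sketch below (standard normal draws are my choice of example, numpy assumed) checks X = X^+ − X^− and |X| = X^+ + X^−.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=100_000)                      # any integrable X would do

X_plus  = np.maximum(X, 0.0)                      # X+ = X v 0
X_minus = np.maximum(-X, 0.0)                     # X- = (-X) v 0
print(np.allclose(X, X_plus - X_minus))           # X = X+ - X-
print(np.allclose(np.abs(X), X_plus + X_minus))   # |X| = X+ + X-
print(X_plus.mean() - X_minus.mean(), X.mean())   # E[X+] - E[X-] ~ E[X]
```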
Properties of E
Definition (Cumulative Distribution Function)
Let (R, B, P) be a probability space, where P is the distribution (law) of X on R. Then define F_X(x) = P((−∞, x]), where x ∈ R. F_X is known as the (cumulative) distribution function of X.
Theorem (Law of the Unconscious Statistician)
If X : Ω → R is an (integrable) random variable and g a real Borel function, then the random variable Y = g(X) has expectation
E[Y] = ∫_R g(x) dF_X(x).
Properties of E
For those of you who have taken Math 346, you may recall that the probability density function of X, denoted f_X, is defined by
d/dx [F_X(x)] = f_X(x).
Another thing to note is that E[Y], as given by the Law of the Unconscious Statistician, is a Lebesgue–Stieltjes integral. If you have read about the Riemann–Stieltjes integral, it is not surprising that if F_X is absolutely continuous, then
E[Y] = ∫_R g(x) f_X(x) dx,
which is the familiar Math 346 definition. Furthermore, all properties of E that you learned in Math 346 still apply.
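A numerical illustration (a sketch assuming scipy and numpy are available; X ~ N(0, 1) and g(x) = x² are my choices, not the slides'): both the density form of the Law of the Unconscious Statistician and a Monte Carlo average should give E[X²] = 1.

```python
import numpy as np
from scipy import integrate, stats

g = lambda x: x ** 2
f_X = stats.norm.pdf                               # density of N(0, 1)

# E[g(X)] = integral of g(x) f_X(x) dx over the real line.
value, _ = integrate.quad(lambda x: g(x) * f_X(x), -np.inf, np.inf)
print(value)                                       # ~ 1.0

# Monte Carlo cross-check: average g over draws of X.
X = np.random.default_rng(4).normal(size=200_000)
print(np.mean(g(X)))                               # ~ 1.0
```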
Convergence
In probability theory, convergence is an extremely important concept. It is also very useful in statistics (think of unbiased and consistent estimators, the Law of Large Numbers, and the Central Limit Theorem!).
Definition (Almost Surely)
Let (Ω, B, P) be a probability space. An event E ∈ B is said to happen almost surely (a.s.) if P(E) = 1.
Concerning convergence, let (X_n) be a sequence of random variables. Then "lim_{n→∞} X_n exists a.s." means that for almost all ω ∈ Ω,¹
limsup_{n→∞} X_n(ω) = liminf_{n→∞} X_n(ω).
In this case, we write X_n →_{a.s.} X.
¹ That is, outside a set of probability zero.
Convergence
Definition (Convergence in Probability)
Let (X_n) be a sequence of random variables and X be a random variable. (X_n) is said to converge in probability (i.p.) to X, written X_n →_P X, if for every ε > 0,
lim_{n→∞} P(|X_n − X| > ε) = 0.
Theorem
If X_n →_{a.s.} X, then X_n →_P X.
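A quick illustration of convergence in probability (not from the slides; numpy assumed): with X_n the mean of n Uniform(0, 1) draws and X ≡ 1/2, the estimated P(|X_n − 1/2| > ε) shrinks as n grows.

```python
import numpy as np

rng = np.random.default_rng(5)
eps, trials = 0.05, 2_000
for n in (10, 100, 1_000, 5_000):
    means = rng.uniform(size=(trials, n)).mean(axis=1)   # trials copies of X_n
    print(n, np.mean(np.abs(means - 0.5) > eps))         # estimate of P(|X_n - 1/2| > eps)
```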
Convergence
We present some results based on the convergence modes already shown.
Theorem (Cauchy Criterion for i.p. Convergence)
X_n →_P X for some X if and only if X_n − X_m →_P 0 as n, m → ∞.
Theorem (Lebesgue Dominated Convergence)
Let X_n →_P X and suppose there exists Y ∈ L^1 such that |X_n| ≤ Y for all n. Then E[X_n] → E[X].
Convergence
Let X be a random variable. Then X ∈ L^p if E[|X|^p] < ∞. If X, Y ∈ L^p, for p ≥ 1, a metric on L^p can be defined by
d(X, Y) = (E[|X − Y|^p])^{1/p}.
Definition
(X_n) is said to converge in L^p to X, written X_n →_{L^p} X, if
E[|X_n − X|^p] → 0 as n → ∞.
Theorem
If X_n →_{L^p} X, then X_n →_P X.
Proof.
This follows from Chebyshev's inequality: P(|X_n − X| > ε) ≤ E[|X_n − X|^p] / ε^p → 0.
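The inequality behind the proof can be watched numerically. In the sketch below (numpy assumed), the role of X_n − X is played by a N(0, 1/n) error term and p = 2; both are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(6)
eps, p, trials = 0.1, 2, 200_000
for n in (10, 100, 1_000):
    err = rng.normal(scale=1.0 / np.sqrt(n), size=trials)   # stand-in for X_n - X
    lhs = np.mean(np.abs(err) > eps)                        # P(|X_n - X| > eps)
    rhs = np.mean(np.abs(err) ** p) / eps ** p              # E[|X_n - X|^p] / eps^p
    print(n, lhs, rhs, lhs <= rhs)                          # both sides -> 0, lhs <= rhs
```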
The Characteristic Function
There is a lot that can be said about convergence, but there is not sufficient time to cover everything in this presentation. Instead, we turn to the moment-generating function (MGF). Recall from Math 346 that the MGF of a random variable X with cumulative distribution function F_X is defined by
M_X(t) = E[e^{tX}] = ∫_R e^{tx} dF_X(x).
The problem with M_X is that it does not exist for every random variable X. Therefore, we turn to the characteristic function, defined by
φ(t) = E[e^{itX}] = ∫_R cos(tx) F_X(dx) + i ∫_R sin(tx) F_X(dx).
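Since E[e^{itX}] = E[cos(tX)] + i E[sin(tX)], the characteristic function can be estimated by plain averaging. The sketch below (numpy assumed; X ~ N(0, 1) chosen only as an example) compares the estimate with the known answer exp(−t²/2) for a standard normal.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=400_000)
for t in (0.0, 0.5, 1.0, 2.0):
    phi_hat = np.mean(np.cos(t * X)) + 1j * np.mean(np.sin(t * X))
    print(t, phi_hat, np.exp(-t ** 2 / 2))      # Monte Carlo estimate vs exp(-t^2/2)
```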
The Characteristic Function
In Math 346, we were interested in computing derivatives of the MGF and setting t = 0. It can be shown that
φ^(k)(t) |_{t=0} = i^k E[X^k].
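For a standard normal, φ(t) = exp(−t²/2) and E[X²] = 1, so the formula predicts φ''(0) = i²·1 = −1. A central finite difference gives a rough numerical check (a sketch assuming numpy; the step size h is arbitrary).

```python
import numpy as np

phi = lambda t: np.exp(-t ** 2 / 2)                  # characteristic function of N(0, 1)
h = 1e-4
print((phi(h) - 2 * phi(0.0) + phi(-h)) / h ** 2)    # ~ -1 = i^2 * E[X^2]
```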
A proof of the Central Limit Theorem using the characteristic function is similar to the proof that uses MGFs.