
Weak Convergence Spring 2010

2. Measures on metric spaces and their weak convergence


2.1. General properties of measures
Regular measures
Consider a measure $\mu$ on $(X, \mathcal{B}(X))$. Recall that measures are assumed to be non-negative (otherwise we speak about signed measures) and $\sigma$-additive. A probability measure has total mass 1, i.e. $\mu(X) = 1$.
Exercise 2.1. Let $\mu$ be a probability measure on an uncountable space $X$. Prove that any family of pairwise disjoint sets, each of which has (strictly) positive $\mu$-measure, is at most countable.
Definition 2.1. A measure $\mu$ is said to be regular if for every Borel set $A$ one has
$$\mu(A) = \sup\{\mu(F) : F \subset A,\ F \text{ closed}\} = \inf\{\mu(G) : G \supset A,\ G \text{ open}\}.$$
This property can be formulated as follows. For every Borel $A$ and $\varepsilon > 0$ there exist a closed set $F$ and an open set $G$ such that $F \subset A \subset G$ and $\mu(G \setminus F) \le \varepsilon$.
Theorem 2.2. Let $X$ be a metric space. Then any finite measure $\mu$ on $(X, \mathcal{B}(X))$ is regular.
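Proof. Let $\mathcal{A}$ be the family of all Borel sets $B$ which are regular, i.e. for every $\varepsilon > 0$ there exist a closed set $F$ and an open set $G$ such that $F \subset B \subset G$ and $\mu(G \setminus F) \le \varepsilon$.
First show that each closed ball $B_r(x)$ belongs to $\mathcal{A}$. Indeed,
$$F = B_r(x) \subset G_n = \{y : \rho(x, y) < r + 1/n\}.$$
Since $\bigcap_n (G_n \setminus F) = \emptyset$ and $\mu$ is finite, we have $\mu(G_n \setminus F) \to 0$.
It remains to show that $\mathcal{A}$ is a $\sigma$-algebra. Then it coincides with the $\sigma$-algebra generated by balls, i.e. the Borel $\sigma$-algebra. First, $X \in \mathcal{A}$, since $X$ is open and closed at the same time. Furthermore, if $B \in \mathcal{A}$, then $B^c \in \mathcal{A}$ (use the fact that if $F \subset B \subset G$ then $G^c \subset B^c \subset F^c$). If $B_n \in \mathcal{A}$, $n \ge 1$, choose $F_n \subset B_n \subset G_n$ with $\mu(G_n \setminus F_n) \le \varepsilon 2^{-n-1}$. Define
$$B = \bigcup_{n \ge 1} B_n, \quad G = \bigcup_{n \ge 1} G_n, \quad D_n = \bigcup_{i=1}^{n} F_i.$$
Then
$$\lim_{n \to \infty} \mu(G \setminus D_n) = \mu(G \setminus D_\infty) \le \sum_{n} \mu(G_n \setminus F_n) \le \varepsilon/2,$$
where $D_\infty = \bigcup_{i \ge 1} F_i$. Hence, there is $N$ such that $\mu(G \setminus D_N) < \varepsilon$, where $D_N \subset B \subset G$, the set $D_N$ is closed and $G$ is open.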
Corollary 2.3. If $\mu_1$ and $\mu_2$ are two finite measures on a metric space $X$ and $\mu_1(F) = \mu_2(F)$ for all closed $F$, then $\mu_1$ and $\mu_2$ coincide.
Proof. First, $\mu_1(X) = \mu_2(X)$, so the equality also holds for all open sets (take complements). For a general Borel set, use the regularity of the measures and let $\varepsilon \to 0$.
If $\mu$ is a measure on $X$, denote the corresponding Lebesgue integral by $\int f(x)\,\mu(dx)$ or $\int f\,d\mu$ if the integral is taken over the whole of $X$. Otherwise write $\int_A f\,d\mu$ if $A \subset X$.
Exercise 2.2. Let $\mu$ be a finite measure on a metric space $X$. Show that for each $B \in \mathcal{B}(X)$ and each $\varepsilon > 0$ there exists a continuous function $f$ such that
$$\Bigl| \int f\,d\mu - \mu(B) \Bigr| \le \varepsilon.$$
Further, $\mathbf{1}_A(x)$ is the indicator of the set $A$, so that $\int \mathbf{1}_A\,d\mu = \mu(A)$.
Theorem 2.4. Let $\mu_1$ and $\mu_2$ be two finite measures on a metric space $X$ with its Borel $\sigma$-algebra. If $\int f\,d\mu_1 = \int f\,d\mu_2$ for all bounded continuous functions $f$, then $\mu_1 = \mu_2$.
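Proof. Let $F$ be any closed set. The distance $\rho(x, F)$ is a continuous function of $x$ (show this!). Thus,
$$1 \ge f_n(x) = (1 + n\rho(x, F))^{-1} \downarrow \mathbf{1}_F(x)$$
for a sequence of continuous functions $f_n$. By the condition, the integrals of $f_n$ with respect to $\mu_1$ and $\mu_2$ coincide, so that the dominated convergence theorem implies that the limits coincide. Note that
$$\int f_n\,d\mu_i \to \mu_i(F), \quad i = 1, 2.$$
Hence $\mu_1(F) = \mu_2(F)$ for all closed $F$, and the statement follows from Corollary 2.3.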
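The approximation used in this proof can be checked numerically. The following is a minimal sketch, assuming $X = \mathbb{R}$ and $F = [0, 1]$ (both choices, and the helper names, are illustrative and not part of the notes): it evaluates $f_n(x) = (1 + n\rho(x, F))^{-1}$ at a point inside $F$ and at a point outside $F$.

```python
def dist_to_interval(x, a=0.0, b=1.0):
    """Distance rho(x, F) from the point x to the closed interval F = [a, b]."""
    if x < a:
        return a - x
    if x > b:
        return x - b
    return 0.0

def f_n(x, n):
    """The continuous approximation f_n(x) = 1 / (1 + n * rho(x, F))."""
    return 1.0 / (1.0 + n * dist_to_interval(x))

# On F the functions are identically 1; off F they decrease to 0 as n grows.
inside = [f_n(0.5, n) for n in (1, 10, 100, 1000)]
outside = [f_n(1.5, n) for n in (1, 10, 100, 1000)]
```

Inside $F$ every $f_n$ equals 1, while at the outside point the values decrease monotonically towards 0, in line with $f_n \downarrow \mathbf{1}_F$.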
Tight measures and Radon measures
Definition 2.5. A measure $\mu$ on a metric space $X$ is said to be tight if for each $\varepsilon > 0$ there exists a compact set $K$ such that $\mu(K^c) < \varepsilon$.
Theorem 2.6 (Ulam). Let $\mu$ be a finite measure on a Polish space $X$. Then $\mu$ is tight.
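Proof. Let $\{x_n\}$ be a countable dense subset of $X$. Then $\bigcup_i B_{1/n}(x_i) = X$ for every $n \ge 1$. Thus, there exists an $i_n$ such that
$$\mu\Bigl(\bigcup_{i \le i_n} B_{1/n}(x_i)\Bigr) \ge \mu(X) - \varepsilon 2^{-n}.$$
Define
$$K = \bigcap_{n \ge 1} \bigcup_{i \le i_n} B_{1/n}(x_i).$$
The set $K$ is totally bounded, since for any $\delta > 0$ it possesses a finite $\delta$-net $\{x_1, \dots, x_{i_n}\}$ for $n \ge 1/\delta$. Furthermore, $K$ is closed, and so is compact because $X$ is complete. Finally, notice that
$$\mu(K^c) \le \sum_{n} \mu\Bigl(\Bigl(\bigcup_{i \le i_n} B_{1/n}(x_i)\Bigr)^c\Bigr) \le \sum_{n} \varepsilon 2^{-n} = \varepsilon.$$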
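As a concrete instance of tightness, the sketch below assumes that $\mu$ is the standard normal law on $\mathbb{R}$ (an illustrative choice, not taken from the notes) and searches for a compact interval $K = [-a, a]$ with $\mu(K^c) < \varepsilon$. The function names and the search step are hypothetical.

```python
import math

def normal_tail_mass(a):
    """mu([-a, a]^c) for the standard normal law mu: P(|Z| > a) = erfc(a / sqrt(2))."""
    return math.erfc(a / math.sqrt(2.0))

def compact_radius(eps, step=0.5):
    """Smallest multiple a of `step` such that K = [-a, a] satisfies mu(K^c) < eps."""
    a = step
    while normal_tail_mass(a) >= eps:
        a += step
    return a

radius = compact_radius(0.01)   # K = [-radius, radius] carries mass > 0.99
```

In $\mathbb{R}$ closed bounded intervals are compact, so a single measure is handled by a direct search; the proof above replaces the interval by an intersection of finite unions of closed balls.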
Exercise 2.3. Prove that for every Borel set $B$ in a Polish space and every $\varepsilon > 0$ there exists a compact set $K \subset B$ such that $\mu(B \setminus K) < \varepsilon$, where $\mu$ is a finite measure.
Hint: see the proof above.
Definition 2.7. A finite Borel measure $\mu$ is called a Radon measure if, for each Borel set $B$, its measure $\mu(B)$ equals the supremum of $\mu(K)$ over all compact sets $K \subset B$.
Clearly, each Radon measure is regular and tight. The result of Exercise 2.3 means that each finite Borel measure on a Polish space is Radon.
Support of a measure
Definition 2.8. The support of a probability measure $\mu$ is the set $S_\mu$ of all points $x \in X$ such that $\mu(U) > 0$ for each neighbourhood $U$ of $x$; $\mu$ is said to possess a support if $\mu(S_\mu) = 1$.
Lemma 2.9. The support of a probability measure $\mu$ on a Polish space can be equivalently defined as the complement of the union of all open sets $U$ such that $\mu(U) = 0$.
Proof. Consider the family $\mathcal{U}$ of all open sets $U$ such that $\mu(U) = 0$. Since the topology of $X$ has a countable base, there are open sets $U_m \in \mathcal{U}$, $m \ge 1$, such that $U_\mu = \bigcup_m U_m = \bigcup_{U \in \mathcal{U}} U$. Define $S_\mu$ to be the complement of $U_\mu$. Since $\mu(U_\mu) \le \sum_m \mu(U_m) = 0$, we have $\mu(S_\mu) = 1$. If $x \in S_\mu$, then each of its neighbourhoods has positive measure, otherwise such a neighbourhood would belong to $\mathcal{U}$ and hence $x$ would be in $U_\mu$.
Exercise 2.4. Prove that the support of a probability measure $\mu$ can be equivalently defined as the intersection of all closed sets $F$ such that $\mu(F) = 1$.
Lemma 2.10. If $\mu$ is a probability measure on a Polish space $X$, then the support $S_\mu$ is separable in $X$.
Proof. If $S_\mu$ is not separable, then there exist $\varepsilon > 0$ and an uncountable family $\{x_\alpha\}$ of elements of $S_\mu$ such that $\rho(x_{\alpha_1}, x_{\alpha_2}) \ge \varepsilon$ whenever $\alpha_1 \ne \alpha_2$. Then the open balls $\{x \in X : \rho(x, x_\alpha) < \varepsilon/2\}$ are disjoint for all $\alpha$, so that we obtain an uncountable family of disjoint sets, each having positive measure, which is impossible, see Exercise 2.1.
The situation in general topological spaces is considerably more complicated. For instance, there exist a compact Hausdorff topological space and a probability measure on its Borel $\sigma$-algebra such that this measure is not regular and does not have a support.
Extension of measure
Denote by $\mathcal{C}_0(X, \Gamma)$ the family of all cylindrical sets generated by the functions from $\Gamma$, see (1.2). By definition, the cylindrical $\sigma$-algebra $\mathcal{C}(X, \Gamma)$ is the smallest $\sigma$-algebra that contains $\mathcal{C}_0(X, \Gamma)$.
Exercise 2.5. Check that $\mathcal{C}_0(X, \Gamma)$ is an algebra, i.e. it is closed under taking complements and finite unions.
The following result concerning the extension of a measure from the cylindrical algebra $\mathcal{C}_0(X, \Gamma)$ also holds in more general topological spaces. It is given without proof (see Theorem 1.3.4 from Vakhaniya et al.).
Theorem 2.11. Let $X$ be a Polish space and let $\Gamma$ be a family of continuous real functions which separate the points of $X$. If $\mu : \mathcal{C}_0(X, \Gamma) \to [0, 1]$ is finitely additive, regular and tight and $\mu(X) = 1$, then $\mu$ admits an extension to a Radon probability measure on $(X, \mathcal{C}(X, \Gamma))$ ($= (X, \mathcal{B}(X))$, see Theorem 1.11).
2.2. Weak convergence
Consider a sequence $\{\mu_n, n \ge 1\}$ of finite measures on $(X, \mathcal{B}(X))$.
Definition 2.12. We say that $\mu_n$ converges weakly to $\mu$ (notation $\mu_n \xrightarrow{w} \mu$) if for every bounded real-valued continuous function $f$ (notation $f \in C_b(X)$)
$$\int_X f\,d\mu_n \to \int_X f\,d\mu. \tag{2.1}$$
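Definition (2.1) can be illustrated numerically. The sketch below assumes $X = [0, 1]$, takes $\mu_n$ to be the uniform probability measure on the grid $\{1/n, 2/n, \dots, 1\}$ and $\mu$ the Lebesgue measure, and uses $f = \cos$ as one bounded continuous test function; all of these choices are illustrative assumptions, and a single test function of course does not prove weak convergence.

```python
import math

def integral_against_grid_measure(f, n):
    """Integral of f with respect to mu_n = (1/n) * sum_{k=1}^{n} delta_{k/n}."""
    return sum(f(k / n) for k in range(1, n + 1)) / n

f = math.cos                      # a bounded continuous test function
limit = math.sin(1.0)             # integral of cos over [0, 1] w.r.t. Lebesgue measure
errors = [abs(integral_against_grid_measure(f, n) - limit) for n in (10, 100, 1000)]
```

The errors shrink roughly like $1/n$, which is the behaviour of a Riemann sum for a smooth integrand.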
In the following we also need the family $BL(X)$ of bounded Lipschitz functions on $X$. Denote
$$BL(X) = \Bigl\{ f : X \to \mathbb{R} :\ \|f\|_\infty = \sup_{x \in X} |f(x)| < \infty,\ \|f\|_L = \sup_{x, y \in X,\, x \ne y} \frac{|f(x) - f(y)|}{\rho(x, y)} < \infty \Bigr\}.$$
It is possible to introduce a norm on $BL(X)$ as
$$\|f\|_{BL} = \|f\|_\infty + \|f\|_L.$$
Note that $BL(X) \subset C_b(X)$. It is also possible to show that $BL(X)$ with the norm $\|\cdot\|_{BL}$ is a Banach space. The following lemma is often useful.
Lemma 2.13. Let $Y \subset X$ and let $f : Y \to \mathbb{R}$ be such that
(i) $\sup_{x \in Y} |f(x)| = M < \infty$;
(ii) $|f(x) - f(y)| \le \omega(\rho(x, y))$ for all $x, y \in Y$ with a continuous function $\omega$ such that $\omega(0) = 0$ and $\omega(s) \le \omega(s + t) \le \omega(s) + \omega(t)$ for all $s > 0$ and $t \ge 0$.
Then $f$ can be extended to a function $g : X \to \mathbb{R}$ such that $f$ and $g$ coincide on $Y$ and $g$ satisfies the above two conditions on the whole of $X$.
Remark 2.14. Such a function $\omega$ in (ii) above is called the modulus of continuity of $f$, and it quantifies the uniform continuity of the function $f$ in its $(\varepsilon, \delta)$ definition.
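Proof. Define
$$\tilde{g}(x) = \inf_{y \in Y} \bigl[ f(y) + \omega(\rho(x, y)) \bigr], \quad x \in X.$$
Because of (ii), $f(x) \le f(y) + \omega(\rho(x, y))$ for all $x, y \in Y$. If $y \to x$, then the right-hand side converges to $f(x)$, so that $\tilde{g}(x) = f(x)$ for all $x \in Y$.
Furthermore,
$$|\tilde{g}(x') - \tilde{g}(x'')| = \Bigl| \inf_{y \in Y} \bigl[ f(y) + \omega(\rho(x', y)) \bigr] - \inf_{z \in Y} \bigl[ f(z) + \omega(\rho(x'', z)) \bigr] \Bigr| \le \sup_{y \in Y} |\omega(\rho(x', y)) - \omega(\rho(x'', y))| \le \omega(\rho(x', x'')),$$
meaning that $\tilde{g}$ admits the continuity modulus $\omega$. Finally, set
$$g(x) = \max\bigl\{ \min(\tilde{g}(x), M), -M \bigr\}, \quad x \in X,$$
i.e. $g(x)$ is the cut-off of $\tilde{g}(x)$ at levels $M$ and $-M$.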
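The construction in this proof is easy to try out. The following sketch assumes $X = \mathbb{R}$, a finite set $Y$, and the linear modulus $\omega(s) = Ls$ (the Lipschitz case); the function name `extend` and the sample data are hypothetical.

```python
def extend(points, values, lipschitz, bound):
    """Return g(x) = clip(min_y [f(y) + L * |x - y|], -M, M) for a function f
    given by `values` on the finite set `points` (Y), with omega(s) = L * s."""
    def g(x):
        raw = min(v + lipschitz * abs(x - y) for y, v in zip(points, values))
        return max(min(raw, bound), -bound)   # cut-off at levels M and -M
    return g

Y = [0.0, 1.0, 3.0]
f_on_Y = [0.0, 1.0, -1.0]          # Lipschitz constant 1 on Y, sup-norm M = 1
g = extend(Y, f_on_Y, lipschitz=1.0, bound=1.0)
```

Here `g` agrees with `f_on_Y` on `Y`, stays within $[-1, 1]$, and is again 1-Lipschitz, because both the infimum of 1-Lipschitz functions and the cut-off preserve the Lipschitz constant.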
Corollary 2.15. Let $f \in BL(Y)$ for $Y \subset X$. Then there exists an extension $g$ of $f$ onto $X$ such that $\|f\|_\infty = \|g\|_\infty$ and $\|f\|_L = \|g\|_L$.
Exercise 2.6. Prove that the statement of Theorem 2.4 also holds if the integrals coincide for all $f \in BL(X)$. (Hint: use the idea from the proof of (b) $\Rightarrow$ (c) in Theorem 2.16.)
For a set $A$ in a topological space $X$, the boundary $\partial A$ is the set of all points that are limits of sequences of points from $A$ and from the complement $A^c$ at the same time.
Theorem 2.16 (Portmanteau theorem). Let $\mu$ and $\{\mu_n, n \ge 1\}$ be finite measures on $(X, \mathcal{B}(X))$. Then the following conditions are equivalent.
(a) $\mu_n \xrightarrow{w} \mu$.
(b) $\int f\,d\mu_n \to \int f\,d\mu$ for all $f \in BL(X)$.
(c) $\mu(F) \ge \limsup_n \mu_n(F)$ for every closed $F$ and $\mu_n(X) \to \mu(X)$.
(d) $\mu(G) \le \liminf_n \mu_n(G)$ for every open $G$ and $\mu_n(X) \to \mu(X)$.
(e) $\mu_n(B) \to \mu(B)$ for every Borel $B$ such that $\mu(\partial B) = 0$.
(f) $\int f\,d\mu_n \to \int f\,d\mu$ for every bounded Borel function $f$ such that $\mu(\Delta_f) = 0$, where $\Delta_f$ is the set of all points where $f$ is discontinuous.
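Proof. (a) $\Rightarrow$ (b) Obvious.
(b) $\Rightarrow$ (c) Take any closed set $F$. For $\delta > 0$ choose $\varepsilon > 0$ such that
$$\mu(F^\varepsilon) - \mu(F) < \delta,$$
where $F^\varepsilon = \{x : \rho(x, F) < \varepsilon\}$ is the $\varepsilon$-neighbourhood of $F$. Consider the Lipschitz function
$$f(x) = f_\varepsilon(x) = (1 - \rho(x, F)/\varepsilon)_+,$$
which equals 1 on $F$ and 0 on the complement of $F^\varepsilon$. Then
$$\limsup_n \mu_n(F) \le \limsup_n \int f_\varepsilon\,d\mu_n = \int f_\varepsilon\,d\mu \le \mu(F^\varepsilon) < \mu(F) + \delta.$$
Finally, take $f \equiv 1$ for the total mass relation.
(c) $\Leftrightarrow$ (d) is obvious (pass to complements).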
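(c) & (d) $\Rightarrow$ (e) Note that $\operatorname{Int} B \subset B \subset \bar{B}$, where $\operatorname{Int} B$ is the interior of $B$ and $\bar{B}$ is the closure of $B$. Furthermore, $\mu(\bar{B} \setminus \operatorname{Int} B) = \mu(\partial B) = 0$. Thus,
$$\mu(\bar{B}) = \mu(B) = \mu(\operatorname{Int} B)$$
and
$$\mu(B) = \mu(\bar{B}) \ge \limsup_n \mu_n(\bar{B}) \ge \limsup_n \mu_n(B) \ge \liminf_n \mu_n(B) \ge \liminf_n \mu_n(\operatorname{Int} B) \ge \mu(\operatorname{Int} B) = \mu(B).$$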
(e) $\Rightarrow$ (f) Since $\partial X = \emptyset$, $\mu_n(X) \to \mu(X)$. By adding a constant to $f$, it suffices to assume that $f \ge 0$. For such bounded $f$ with $C = \sup f$ we have
$$\int f\,d\mu_n = \int \Bigl( \int_0^C \mathbf{1}_{f(x) > t}\,dt \Bigr) \mu_n(dx) = \int_0^C \mu_n(\{x : f(x) > t\})\,dt = \int_0^C \mu_n(G_t)\,dt,$$
where $G_t = \{x : f(x) > t\}$. Thus, by the dominated convergence theorem, it suffices to show that $\mu_n(G_t) \to \mu(G_t)$ for almost all $t$. Indeed, let $L_t = \{x : f(x) = t\}$. Since $\mu(G_t)$ is a monotone function of $t$, it may have at most a countable number of discontinuities of the first kind. So $\mu(L_t) = \mu(G_{t-}) - \mu(G_t) = 0$ for almost all $t$. Next, if $y \in \partial G_t$ and $f$ is continuous at $y$, then $f(y) = t$. Therefore $\partial G_t \subset L_t \cup \Delta_f$ which, together with the assumption $\mu(\Delta_f) = 0$, yields $\mu(\partial G_t) = 0$ for almost all $t$. Now $\mu_n(G_t) \to \mu(G_t)$ follows from (e).
(f) $\Rightarrow$ (a) is obvious.
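A standard example shows that the inequalities in (c) and (d) can be strict and why the boundary condition in (e) is needed. The sketch below assumes $X = \mathbb{R}$, $\mu_n = \delta_{1/n}$ and $\mu = \delta_0$; representing a measure as a function of a set indicator, and replacing limsup and liminf by a maximum and minimum over a tail of the sequence, are simplifications made only for illustration.

```python
def dirac(point):
    """Dirac measure at `point`, represented as a function of a set indicator."""
    return lambda indicator: 1.0 if indicator(point) else 0.0

mu = dirac(0.0)                                   # limit measure delta_0
mus = [dirac(1.0 / n) for n in range(1, 200)]     # mu_n = delta_{1/n}

closed_F = lambda x: x == 0.0                     # F = {0}, closed
open_G = lambda x: x > 0.0                        # G = (0, inf), open
cont_B = lambda x: x <= 0.5                       # B = (-inf, 1/2], boundary {1/2}

limsup_F = max(m(closed_F) for m in mus[100:])    # tail proxy for limsup
liminf_G = min(m(open_G) for m in mus[100:])      # tail proxy for liminf
tail_B = {m(cont_B) for m in mus[100:]}           # values of mu_n(B) for large n
```

For the closed set $F = \{0\}$ one gets $\mu(F) = 1 > 0 = \limsup_n \mu_n(F)$, for the open set $G = (0, \infty)$ one gets $\mu(G) = 0 < 1 = \liminf_n \mu_n(G)$, while for the continuity set $B$ the values $\mu_n(B)$ are eventually equal to $\mu(B)$.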
Part (b) of the theorem shows that instead of the wider class of bounded continuous functions in the definition of weak convergence, one may actually consider the smaller class $BL(X)$. In this respect, $BL(X)$ is a convergence determining class of functions. Another such class is the set of indicator functions of those measurable sets $B$ for which $\mu(\partial B) = 0$, as follows from part (e).
From now on we mostly consider probability measures and so can omit the requirement that $\mu_n(X) \to \mu(X)$ in (c) and (d) of Theorem 2.16. The statements (c)-(e) of Theorem 2.16 can be expressed as
$$\mu(\operatorname{Int} B) \le \liminf_n \mu_n(B) \le \limsup_n \mu_n(B) \le \mu(\bar{B}), \quad B \in \mathcal{B}(X).$$
Exercise 2.7. Let X be the set of natural numbers N with the discrete topology. Characterise
the weak convergence of probability measures on X.
Exercise 2.8. Prove the following result:
Let $\{\mu_n\}$ be a sequence of probability measures on $\mathbb{N}$ such that $\mu_n$ converges weakly to a probability measure $\mu$. Show that
$$\lim_{n \to \infty} \sum_{i \ge 1} |\mu_n(\{i\}) - \mu(\{i\})| = 0.$$
Hint: Show that the sum above equals twice the sum of all $\max(\mu(\{i\}) - \mu_n(\{i\}), 0)$ and use the dominated convergence theorem.
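The identity in the hint can be sanity-checked on a small example. The sketch below uses two hypothetical probability vectors supported on three points and exact rational arithmetic; it only checks the identity, not the limit statement.

```python
from fractions import Fraction

def l1_distance(p, q):
    """Sum over i of |p_i - q_i| for two probability vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))

def twice_positive_part(p, q):
    """Twice the sum over i of max(p_i - q_i, 0)."""
    return 2 * sum(max(a - b, Fraction(0)) for a, b in zip(p, q))

mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
mu_n = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
```

Both quantities agree because the positive and negative parts of $\mu - \mu_n$ have the same total mass when $\mu$ and $\mu_n$ are probability measures.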
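Exercise 2.9. Let $X = [0, 1]$ with the Euclidean metric and let $\mu$ be the Lebesgue measure. Show that
$$\mu_n = \frac{1}{n} \sum_{k=0}^{n} \delta_{k/n} \xrightarrow{w} \mu.$$
Exercise 2.10. Let $X = \mathbb{R}$ and let $C_b^k(\mathbb{R})$ be the family of $k \ge 1$ times continuously differentiable functions such that $\|f^{(i)}\|_\infty < \infty$ for all $i = 0, 1, \dots, k$. Show that $\mu_n \xrightarrow{w} \mu$ if and only if $S(\mu_n, \mu) \to 0$, where
$$S(\mu, \nu) = \sup\Bigl\{ \Bigl| \int f\,d\mu - \int f\,d\nu \Bigr| :\ \|f\|_\infty + \|f'\|_\infty + \dots + \|f^{(k)}\|_\infty \le 1 \Bigr\}.$$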
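Exercise 2.11. Let $\mu_n \xrightarrow{w} \mu$. Consider a continuous (but not necessarily bounded) function $f \in C(X)$. Show that
$$\sup_{n \ge 1} \int_{\{x : |f(x)| \ge a\}} |f(x)|\,\mu_n(dx) \to 0 \quad \text{as } a \to \infty$$
implies
$$\lim_n \int f\,d\mu_n = \int f\,d\mu.$$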
Definition 2.17. A set $B \in \mathcal{B}(X)$ is said to be a continuity set for a measure $\mu$ if $\mu(\partial B) = 0$. The family of all continuity sets for $\mu$ is denoted by $\mathcal{S}_\mu$.
Theorem 2.16(e) means that the weak convergence of measures is equivalent to the pointwise convergence of their values on all continuity sets for the limiting measure.
Proposition 2.18. The family $\mathcal{S}_\mu$ is a subalgebra of $\mathcal{B}(X)$. If $X$ is a metric space, then $\mathcal{S}_\mu$ contains a certain base of the topology on $X$.


Proof. The first statement follows from the fact that $\partial(A \cup B) \subset \partial A \cup \partial B$ and $\partial(A^c) = \partial A$.
Let $\Phi$ be the family of all continuous functions $X \to [0, 1]$. The sets $U_f = \{x : f(x) > 0\}$ for $f \in \Phi$ build a base of the topology on $X$, since every open set $G$ can be written as $G = \{x : \rho(x, G^c) > 0\}$. Define
$$U_{f,t} = \{x : f(x) > t\}, \quad f \in \Phi,\ t \in (0, 1).$$
For each $f$ the set $T_f = \{t \in (0, 1) : \mu(\partial U_{f,t}) > 0\}$ is at most countable, see the proof of Theorem 2.16, part (e) $\Rightarrow$ (f), above. Thus, $U_f = \bigcup_{t \in (0,1) \setminus T_f} U_{f,t}$, so that the sets $U_{f,t}$ for $f \in \Phi$ and $t \in (0, 1) \setminus T_f$ belong to $\mathcal{S}_\mu$ and form a base for the topology on $X$.


The following result concerns weak convergence of maps. Let $X$ and $Y$ be two metric spaces and let $h : X \to Y$ be a Borel function. For a measure $\mu$ on $X$, the map $h$ induces an image measure $\nu = \mu h^{-1}$ on $Y$ defined as
$$\nu(B) = \mu(h^{-1}(B)), \quad B \in \mathcal{B}(Y).$$
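For a discrete measure the image measure can be computed directly from this formula. The sketch below represents a finitely supported measure as a dictionary from points to masses (a representation chosen only for illustration) and pushes it forward under the hypothetical map $h(x) = x^2$.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(mu, h):
    """Image measure nu = mu h^{-1} of a discrete measure mu (dict point -> mass)."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[h(x)] += mass          # all of h^{-1}({y}) contributes to nu({y})
    return dict(nu)

mu = {-1: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}
nu = pushforward(mu, lambda x: x * x)
```

The points $-1$ and $1$ are both mapped to $1$, so $\nu(\{1\}) = \mu(\{-1, 1\}) = 3/4$ and $\nu(\{0\}) = 1/4$.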
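Lemma 2.19. If $h$ is continuous and $\mu_n \xrightarrow{w} \mu$, then $\nu_n = \mu_n h^{-1} \xrightarrow{w} \nu = \mu h^{-1}$.
Proof. For each $f \in C_b(Y)$, the change of variables formula for the Lebesgue integral yields
$$\int_Y f\,d\nu_n = \int_Y f(y)\,\mu_n h^{-1}(dy) = \int_X f(h(x))\,\mu_n(dx) \to \int_X (f \circ h)\,d\mu = \dots = \int_Y f\,d\nu,$$
where the convergence holds because $f \circ h \in C_b(X)$.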
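Theorem 2.20 (Mapping theorem). Let $\mu_n \xrightarrow{w} \mu$ on $X$. If $h : X \to Y$ is a Borel function and the $\mu$-measure of the set of discontinuity points of $h$ vanishes, then $\nu_n = \mu_n h^{-1} \xrightarrow{w} \nu = \mu h^{-1}$.
Proof. Denote by $D(h) \subset X$ the set of discontinuity points of $h$. Let $F$ be closed in $Y$. If $\{x_n\} \subset h^{-1}(F)$ converges to some $x \in \overline{h^{-1}(F)}$ and $h$ is continuous at $x$, then $h(x) = \lim_n h(x_n) \in F$ since $F$ is closed. Therefore
$$\overline{h^{-1}(F)} \subset h^{-1}(F) \cup D(h),$$
whence
$$\mu\bigl(\overline{h^{-1}(F)}\bigr) \le \mu(h^{-1}(F)) + \mu(D(h)) = \mu(h^{-1}(F)).$$
Thus,
$$\nu(F) = \mu(h^{-1}(F)) = \mu\bigl(\overline{h^{-1}(F)}\bigr) \ge \limsup_n \mu_n\bigl(\overline{h^{-1}(F)}\bigr) \ge \limsup_n \mu_n(h^{-1}(F)) = \limsup_n \nu_n(F).$$
Since also $\nu_n(Y) = \mu_n(X) \to \mu(X) = \nu(Y)$, the statement follows from Theorem 2.16(c).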
2.3. Metrisation of the weak convergence
The family M(X) of all probability measures is a topological space itself with the topology
generated by the weak convergence of measures.
Exercise 2.12. The space $X$ embeds in $M(X)$ by associating with every $x \in X$ the corresponding Dirac measure $\delta_x$. Prove that $x_n \to x$ in $X$ if and only if $\delta_{x_n} \xrightarrow{w} \delta_x$.
The base for the weak topology on $M(X)$ can be defined as the family of neighbourhoods of any $\mu \in M(X)$ of the form
$$\Bigl\{ \nu \in M(X) : \Bigl| \int f_i\,d\nu - \int f_i\,d\mu \Bigr| < \varepsilon,\ i = 1, \dots, k \Bigr\}$$
for $\varepsilon > 0$, $k \ge 1$ and $f_1, \dots, f_k \in C_b(X)$. By Theorem 2.16, another base is given by
$$U(\mu; F_1, \dots, F_k, \varepsilon) = \{ \nu \in M(X) : \nu(F_i) < \mu(F_i) + \varepsilon,\ i = 1, \dots, k \} \tag{2.2}$$
for closed sets $F_1, \dots, F_k \subset X$, $k \ge 1$, and $\varepsilon > 0$. Yet another base is given by
$$V(\mu; A_1, \dots, A_k, \varepsilon) = \{ \nu \in M(X) : |\nu(A_i) - \mu(A_i)| < \varepsilon,\ i = 1, \dots, k \} \tag{2.3}$$
where $A_1, \dots, A_k$ are Borel subsets of $X$ such that $\mu(\partial A_i) = 0$ for $i = 1, \dots, k$, $k \ge 1$, and $\varepsilon > 0$.
If $X$ is a metric space, it is possible to introduce a metric on $M(X)$. For each $A \subset X$ and $r > 0$ denote its $r$-envelope (or $r$-parallel set) as
$$A^r = \{x \in X : \rho(x, A) < r\}.$$
Exercise 2.13. Prove that $A^r = F^r$ if $F$ is the closure of $A$.
Definition 2.21. The Prokhorov distance between two probability measures $\mu$ and $\nu$ is defined by
$$d(\mu, \nu) = \inf\{\varepsilon > 0 : \mu(A) \le \nu(A^\varepsilon) + \varepsilon \text{ for all } A \in \mathcal{B}(X)\}.$$
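For finitely supported measures on the real line the definition can be evaluated by brute force. The sketch below is only an illustration under stated assumptions: measures are dictionaries from points to masses, it suffices to test subsets $A$ of the support of $\mu$ (intersecting $A$ with the support leaves $\mu(A)$ unchanged and can only shrink $A^\varepsilon$), and the infimum is located by bisection, which is legitimate because the condition is monotone in $\varepsilon$. The function names and tolerances are hypothetical.

```python
from itertools import combinations

def prokhorov_condition(mu, nu, eps):
    """Check mu(A) <= nu(A^eps) + eps for every non-empty A inside the support of mu.

    mu, nu: dicts mapping points of the real line to masses (finite supports).
    A^eps is the open eps-envelope {x : rho(x, A) < eps}.
    """
    points = list(mu)
    for size in range(1, len(points) + 1):
        for subset in combinations(points, size):
            mass_a = sum(mu[x] for x in subset)
            mass_envelope = sum(m for y, m in nu.items()
                                if min(abs(y - a) for a in subset) < eps)
            if mass_a > mass_envelope + eps + 1e-12:
                return False
    return True

def prokhorov_distance(mu, nu, tol=1e-9):
    """Bisection for inf{eps > 0 : condition holds}; the condition is monotone in eps."""
    lo, hi = 0.0, 1.0            # eps = 1 always works for probability measures
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prokhorov_condition(mu, nu, mid):
            hi = mid
        else:
            lo = mid
    return hi

d_near = prokhorov_distance({0.0: 1.0}, {0.3: 1.0})   # two nearby point masses
d_far = prokhorov_distance({0.0: 1.0}, {5.0: 1.0})    # two distant point masses
```

The enumeration over subsets is exponential in the size of the support, so this is meant for small examples only.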


Exercise 2.14. Prove that $d$ is indeed a metric, i.e.
1. $d(\mu, \nu) = d(\nu, \mu)$ (despite the fact that the definition of $d$ is not symmetric with respect to $\mu$ and $\nu$);
2. $d(\mu, \nu) = 0$ implies that $\mu = \nu$;
3. $d$ satisfies the triangle inequality.
Prove that it is possible to impose that $A$ is closed in the definition of the Prokhorov metric.
Exercise 2.15. Calculate the Prokhorov distance between two Dirac measures $\delta_x$ and $\delta_y$. Calculate a bound for the Prokhorov distance between two exponential distributions on $[0, \infty)$ with different parameters.
Theorem 2.22. The Prokhorov metric metrises the weak convergence of probability measures
on a Polish space.
Proof. First show that the convergence in the Prokhorov metric implies the weak convergence. It suffices to show that each open set $U(\mu; F, \varepsilon)$ from the base of the weak topology (see (2.2)) contains the set $\{\nu : d(\mu, \nu) < \delta\}$ for some $\delta > 0$. For this, fix $\delta_1$ such that $\mu(F^{\delta_1}) < \mu(F) + \varepsilon/2$. If $d(\mu, \nu) < \delta = \min(\delta_1/2, \varepsilon/4)$, then
$$\nu(F) \le \mu(F^\delta) + \delta < \mu(F^{2\delta}) + 2\delta \le \mu(F^{\delta_1}) + \varepsilon/2 < \mu(F) + \varepsilon.$$
Now we prove that the weak convergence implies the convergence in the Prokhorov metric. For this, it suffices to show that $\{\nu : d(\mu, \nu) < \varepsilon\}$ contains a neighbourhood from (2.3). Using Lemma 2.10, fix a separable set $S$ such that $\mu(S) = 1$. Cover $S$ with a countable family of open balls $B_i$ of diameter $\delta < \varepsilon/3$. Take a finite collection of these balls such that $\mu(\bigcup_{i=1}^{k} B_i) > 1 - \delta$. Since the $\mu$-content of a ball is a monotone function of its diameter and thus is continuous for almost all $\delta > 0$, one can slightly increase the diameter of all the chosen balls, if needed, so that each ball is a continuity set for $\mu$ and still $\delta < \varepsilon/3$. Denote by $\mathcal{D}$ the family of all finite unions of these balls $B_i$ and
$$D_0 = \bigcup_{i=1}^{k} B_i.$$
Let us show that $V(\mu; \mathcal{D}, \delta) \subset \{\nu : d(\mu, \nu) < \varepsilon\}$. If $\nu \in V(\mu; \mathcal{D}, \delta)$, then, in particular,
$$|\mu(D_0) - \nu(D_0)| < \delta,$$
so that $\nu(D_0) > 1 - 2\delta$ and $\nu(D_0^c) < 2\delta$. Take an arbitrary $B \in \mathcal{B}(X)$ and let $D_B$ be the union of those balls $B_i$, $i \in \{1, \dots, k\}$, that have non-empty intersection with $B$. Then $B \subset D_B \cup D_0^c$ and $D_B \subset B^\delta$, since the diameters of the balls are $\delta$ and hence they cannot hit $B$ if the distance to it is greater than $\delta$. Therefore,
$$\nu(B) < \nu(D_B) + 2\delta < \mu(D_B) + 3\delta \le \mu(B^\delta) + 3\delta < \mu(B^\varepsilon) + \varepsilon,$$
i.e. $d(\mu, \nu) < \varepsilon$ by the symmetry of $d$ (see Exercise 2.14).
Exercise 2.16. Show that the total variation metric
$$d_{TV}(\mu, \nu) = \sup_{A \in \mathcal{B}(X)} |\mu(A) - \nu(A)|$$
defines a convergence which is stronger than the weak convergence of probability measures.
NB. Often one adds a multiplicative constant 2 in front of the supremum in the definition of $d_{TV}$. The reason is that then it coincides with the total variation of the signed measure $\lambda = \mu - \nu$ defined via its Jordan decomposition $\lambda = \lambda^+ - \lambda^-$ by $\|\lambda\| = \lambda^+(X) + \lambda^-(X)$.
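Another metric on $M(X)$ can be defined as
$$\beta(\mu, \nu) = \sup_{f \in BL(X):\ \|f\|_{BL} \le 1} \Bigl| \int f\,d\mu - \int f\,d\nu \Bigr|.$$
Exercise 2.17. For $a > 0$, express
$$\sup_{f \in BL(X):\ \|f\|_{BL} \le a} \Bigl| \int f\,d\mu - \int f\,d\nu \Bigr|$$
in terms of $\beta(\mu, \nu)$.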
Theorem 2.23. The metrics $d$ and $\beta$ are equivalent, i.e. $\beta(\mu_n, \mu) \to 0$ if and only if $d(\mu_n, \mu) \to 0$.
Proof. Necessity. Let $d(\mu_n, \mu) \to 0$. By Theorem 2.22, $\mu_n \xrightarrow{w} \mu$. Then use Exercise 2.10.
Sufficiency follows from the next proposition.
Proposition 2.24. We have $d(\mu, \nu) \le \beta(\mu, \nu) + \sqrt{\beta(\mu, \nu)}$.
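Proof. Denote $\beta = \beta(\mu, \nu)$. Let $F$ be a closed set. Define
$$f(x) = \bigl(1 - \varepsilon^{-1} \rho(x, F)\bigr)_+.$$
Then $f \in BL(X)$ and $\|f\|_{BL} = 1 + \varepsilon^{-1}$. Furthermore (see Exercise 2.17 for the second inequality),
$$\mu(F) \le \int f\,d\mu \le \int f\,d\nu + (1 + \varepsilon^{-1})\beta \le \nu(F^\varepsilon) + (1 + \varepsilon^{-1})\beta.$$
Choose $\varepsilon$ from $(1 + \varepsilon^{-1})\beta = \varepsilon$, so that $\varepsilon^2 - \beta\varepsilon - \beta = 0$ and
$$\varepsilon = \frac{\beta}{2} + \sqrt{\frac{\beta^2}{4} + \beta} \le \beta + \sqrt{\beta}.$$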
2.4. Prokhorov's theorem
Recall that a set $K$ is compact if each of its open covers admits a finite subcover. In metric spaces this property is equivalent to sequential compactness, i.e. each sequence of points in $K$ admits a convergent subsequence with limit in $K$. If we do not require that the limit lies in $K$, then $K$ becomes relatively compact, i.e. a set whose closure is compact.
Exercise 2.18. What are the compact sets in $\mathbb{R}^d$? What are the relatively compact sets in $\mathbb{R}^d$?
Helly's theorem says that each uniformly bounded sequence of random variables has a weakly convergent subsequence, i.e. for each such sequence $\{\xi_n\}$ it is possible to find a subsequence $\{\xi_{n(k)}\}$ such that $\xi_{n(k)}$ converges weakly as $k \to \infty$. In other words, any such family of random variables is weakly relatively compact. This result is proved by looking at the values of the c.d.f.s at rational points, then using the diagonal procedure to show that $F_{n(k)}(r) \to F_0(r)$ for all rational $r$, and then showing that the limit is $F(x) = \inf\{F_0(r) : x < r,\ r \in \mathbb{Q}\}$.
This fact is no longer true for random elements in a general Polish space. Counterexamples can be constructed for infinite-dimensional spaces, which are not locally compact, e.g. the space $C([0, 1])$ of continuous functions with the uniform metric.
Definition 2.25. A family of probability measures $M$ is called weakly relatively compact if each sequence from $M$ possesses a weakly convergent subsequence.
In what follows we need the following important result from functional analysis, see, e.g., N. Dunford and J. T. Schwartz. Linear Operators. Part 1. General Theory. Sec. IV.6.3.
Theorem 2.26 (Riesz). Let $X$ be a compact metric space, and let $L : C(X) \to \mathbb{R}$ be a linear functional on $C(X)$ which is non-negative (i.e. $L(f) \ge 0$ whenever $f \ge 0$) and such that $L(1) = 1$. Then there exists a unique $\mu \in M(X)$ such that $L(f) = \int f\,d\mu$ for all $f \in C(X)$.
If $X$ in Theorem 2.26 is not compact, then the result holds with $\mu$ being a finitely additive measure, ibid., Sec. IV.6.2.
Lemma 2.27. Let X be a compact metric space. Then M(X) is compact in the weak topology.
Proof. The space $C(X)$ of continuous functions on $X$ with the uniform metric is separable (rational splines represent a countable everywhere dense subset). Let $\{f_i\}$ be a countable dense set in $C(X)$, and let $\{\mu_n, n \ge 1\} \subset M(X)$.
Note that $\{\int f_1\,d\mu_n, n \ge 1\}$ is a bounded sequence of real numbers, since a continuous function on a compact set is bounded, so there is a subsequence $\mu_n^1$ such that $\int f_1\,d\mu_n^1$ converges. By applying the same argument to the integrals of $f_2$ over $\{\mu_n^1\}$ and so on, we arrive (by the diagonal procedure) at a subsequence $\{\mu_n^n, n \ge 1\}$ such that $\int f_i\,d\mu_n^n$ converges for all $i \ge 1$. Denote
$$\lim_n \int f_i\,d\mu_n^n = F(f_i). \tag{2.4}$$
If $\{f_{i(k)}, k \ge 1\}$ is fundamental in $C(X)$, then $F(f_{i(k)})$ is fundamental in $\mathbb{R}$, since
$$|F(f_{i(k)}) - F(f_{i(s)})| \le \lim_n \int |f_{i(k)} - f_{i(s)}|\,d\mu_n^n \le \|f_{i(k)} - f_{i(s)}\|_\infty.$$
Thus, it is possible to extend $F$ onto $C(X)$ by continuity. The extended $F$ is linear and non-negative. By Theorem 2.26, there exists a unique $\mu \in M(X)$ such that $F(f) = \int f\,d\mu$.
It remains to prove that $\mu_n^n \xrightarrow{w} \mu$. Consider any $f \in C(X)$. Then there exists a sequence $f_{i(k)} \to f$ uniformly, hence
$$\Bigl| \int f\,d\mu_n^n - \int f\,d\mu \Bigr| \le \int |f - f_{i(k)}|\,d\mu_n^n + \Bigl| \int f_{i(k)}\,d\mu_n^n - \int f_{i(k)}\,d\mu \Bigr| + \int |f - f_{i(k)}|\,d\mu \le 2\|f - f_{i(k)}\|_\infty + \Bigl| \int f_{i(k)}\,d\mu_n^n - \int f_{i(k)}\,d\mu \Bigr|.$$
The first term can be made arbitrarily small by the choice of the sequence, the second by (2.4).
The following result is very useful to transfer the problem into a compact setting.
Lemma 2.28 (Urysohn). For any Polish space $X$, there exists a function $\varphi$ from $X$ onto a Borel subset of $[0, 1]^{\mathbb{N}}$ (with the norm $\|(x_1, x_2, \dots)\| = \sum_{n=1}^{\infty} 2^{-n} |x_n|$) such that $\varphi$ is continuous and injective on $X$ and $\varphi^{-1}$ is continuous on $\varphi(X)$.
Proof. Let $\{x_n\}$ be a countable dense subset of $X$. Define $\varphi : X \to [0, 1]^{\mathbb{N}}$ as
$$x \mapsto (\min\{\rho(x, x_1), 1\}, \min\{\rho(x, x_2), 1\}, \dots).$$
The continuity of $\varphi$ is immediate from the continuity of the metric. Let $x \ne y$, and let $\delta = \min\{\rho(x, y), 2\}$. Consider $k$ such that $\rho(y, x_k) < \frac{1}{2}\delta$. By the triangle inequality, $\rho(x, x_k) > \frac{1}{2}\delta$, so that $\varphi(x)$ and $\varphi(y)$ differ at the $k$th coordinate. Thus, $\varphi$ is a bijection $X \to \varphi(X)$.
Let $\varphi^{-1}(u) = x$ for $u = (t_1, t_2, \dots) \in \varphi(X)$. For a fixed $\varepsilon \in (0, 1/3)$ there is $k$ such that $\rho(x, x_k) < \varepsilon/3$. Now for any $v = (s_1, s_2, \dots) = \varphi(y)$ such that
$$d(u, v) = \sum_{j \ge 1} 2^{-j} |s_j - t_j| < \frac{\varepsilon}{3 \cdot 2^k}$$
we have that
$$|\rho(y, x_k) - \rho(x, x_k)| = |s_k - t_k| \le 2^k d(u, v) < \frac{\varepsilon}{3}.$$
Therefore, $\rho(y, x_k) < 2\varepsilon/3$ and
$$\rho(x, y) \le \rho(x, x_k) + \rho(y, x_k) < \varepsilon,$$
i.e. $\varphi^{-1}$ is continuous.
It remains to prove that $\varphi(X)$ is a Borel set in $[0, 1]^{\mathbb{N}}$. Let $B^o_{1/k}(x_n)$ be the open ball of radius $1/k$ centred at $x_n$. By the continuity of $\varphi^{-1}$, $\varphi(B^o_{1/k}(x_n))$ is open in $\varphi(X)$, so there exists a set $V(n, k)$ open in $[0, 1]^{\mathbb{N}}$ such that $\varphi(B^o_{1/k}(x_n)) = V(n, k) \cap \varphi(X)$ (take for $V(n, k)$, for instance, the union of suitable open balls in $[0, 1]^{\mathbb{N}}$ centred at the points of $\varphi(B^o_{1/k}(x_n))$). Let $V(k)$ be the union of $V(n, k)$ for $n \ge 1$. We claim that
$$\varphi(X) = \overline{\varphi(X)} \cap \Bigl( \bigcap_{k \ge 1} V(k) \Bigr),$$
which would imply that $\varphi(X)$ is Borel (notice that the closure, as a closed set, is always Borel). Clearly, $\varphi(X)$ is a subset of the right-hand side. Choose any $v$ from the right-hand side. Then for each $k \ge 1$ there is $n_k$ such that $v \in V(n_k, k)$. Also, each neighbourhood of $v$ contains an element of $\varphi(X)$. Choose $v_k \in \varphi(X)$ such that
$$v_k \in V(n_1, 1) \cap \dots \cap V(n_k, k) \cap \{u \in [0, 1]^{\mathbb{N}} : d(u, v) < 1/k\}.$$
Then $v_k \to v$ and $\{\varphi^{-1}(v_k)\}$ is a fundamental sequence in $X$, since for $j, k \ge m$, $v_j$ and $v_k$ both belong to $\varphi(B^o_{1/m}(x_{n_m}))$, so that $\rho(\varphi^{-1}(v_j), \varphi^{-1}(v_k)) < 2/m$. Since $X$ is complete, $y = \lim \varphi^{-1}(v_k)$ exists. By the continuity of $\varphi$, $\varphi(y) = v$, i.e. $v \in \varphi(X)$.
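The embedding from this proof can be explored numerically in a truncated form. The sketch below assumes $X = \mathbb{R}$ with the usual metric, replaces the dense sequence by a finite grid of centres, and keeps only the corresponding coordinates of the map $x \mapsto (\min\{\rho(x, x_i), 1\})_i$ together with the truncated metric $\sum_j 2^{-j}|s_j - t_j|$; the grid and the function names are hypothetical, and a finite truncation is of course not injective in general.

```python
def embed(x, centres):
    """First len(centres) coordinates of the map x -> (min{|x - x_i|, 1})_i on X = R."""
    return [min(abs(x - c), 1.0) for c in centres]

def product_metric(u, v):
    """Truncated d(u, v) = sum_j 2^{-j} |u_j - v_j| with j starting from 1."""
    return sum(abs(a - b) / 2 ** (j + 1) for j, (a, b) in enumerate(zip(u, v)))

# Hypothetical finite stand-in for a dense sequence: a grid in [-2, 2].
centres = [i / 4 for i in range(-8, 9)]
d_close = product_metric(embed(0.30, centres), embed(0.31, centres))
d_apart = product_metric(embed(0.30, centres), embed(1.30, centres))
```

Each coordinate is 1-Lipschitz in $x$ and the weights sum to at most 1, so nearby points have nearby images, which is the continuity claimed in the proof.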
Note that $[0, 1]^{\mathbb{N}}$ is a compact space (the Tikhonov theorem).
Definition 2.29. A family of measures $M$ on a topological space $X$ is said to be tight if for each $\varepsilon > 0$ there exists a compact subset $K = K_\varepsilon \subset X$ such that $\mu(K^c) < \varepsilon$ for all $\mu \in M$.
Exercise 2.19. Show that each finite family $M = \{\mu_1, \dots, \mu_k\}$ of probability measures on a Polish space $X$ is tight.
The following theorem is central to us as it provides a very useful criterion in order to establish
weak convergence of probability measures on Polish spaces.
Theorem 2.30 (Prokhorov). Let X be a Polish space. Then a family M of probability measures
on X is weak relatively compact if and only if M is tight.
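Proof. Necessity can be proved similarly to Ulam's theorem. First, we prove that for every $n \ge 1$
$$\inf\Bigl\{ \mu\Bigl(\bigcup_{i \le m} B_{1/n}(x_i)\Bigr) : \mu \in M \Bigr\} \to 1 \quad \text{as } m \to \infty. \tag{2.5}$$
Indeed, assume that this is wrong. Then, for some $\varepsilon > 0$ and every $m$, there exists $\mu_m \in M$ such that
$$\mu_m\Bigl(\bigcup_{i \le m} B_{1/n}(x_i)\Bigr) \le 1 - \varepsilon.$$
By assumption (if necessary passing to subsequences), there exists a probability measure $\mu$ such that $\mu_m \xrightarrow{w} \mu$. By Ulam's Theorem 2.6, $\mu(K) \ge 1 - \varepsilon/2$ for some compact set $K$. Since $K$ is compact,
$$K \subset \bigcup_{i \le k} B^o_{1/n}(x_i)$$
for some $k$, where $B^o_{1/n}(x_i)$ are open balls. By the Portmanteau theorem,
$$1 - \varepsilon/2 \le \mu\Bigl(\bigcup_{i \le k} B^o_{1/n}(x_i)\Bigr) \le \liminf_m \mu_m\Bigl(\bigcup_{i \le k} B^o_{1/n}(x_i)\Bigr) \le \limsup_m \mu_m\Bigl(\bigcup_{i \le k} B_{1/n}(x_i)\Bigr) \le 1 - \varepsilon.$$
Contradiction. Thus, (2.5) holds. Choose $i_n$ such that
$$\mu\Bigl(\bigcup_{i \le i_n} B_{1/n}(x_i)\Bigr) \ge 1 - \varepsilon 2^{-n}$$
for all $\mu \in M$ and define
$$K = \bigcap_{n \ge 1} \bigcup_{i \le i_n} B_{1/n}(x_i)$$
as in the proof of Ulam's theorem. Then $\mu(K) \ge 1 - \varepsilon$ for all $\mu \in M$.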
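Sufficiency. By Urysohn's Lemma 2.28, it is possible to embed $X$ into the compact metric space $[0, 1]^{\mathbb{N}}$ using the function $\varphi$. Choose $K_\varepsilon$ such that $\mu(K_\varepsilon^c) < \varepsilon$ for all $\mu \in M$. Then $C_\varepsilon = \varphi(K_\varepsilon)$ is a compact set, being the image of a compact set under the continuous map $\varphi$. Let $\nu_n = \mu_n \varphi^{-1}$ for a sequence $\{\mu_n\} \subset M$. Then $\nu_n(C_\varepsilon^c) < \varepsilon$ for all $n$. By Lemma 2.27, $\nu_{n_k} \xrightarrow{w} \nu$ on $[0, 1]^{\mathbb{N}}$ for some subsequence $\{n_k\}$. By the Portmanteau Theorem 2.16, $\nu(C_\varepsilon) \ge 1 - \varepsilon$ for all $\varepsilon$. Since $C_\varepsilon \subset \varphi(X)$, we have $\nu(\varphi(X)) = 1$. Let $\mu = \nu\varphi$ be the measure induced by $\nu$ on $X$ using $\varphi^{-1}$. The continuity of $\varphi^{-1}$ implies that $\mu_{n_k} \xrightarrow{w} \mu$ by Lemma 2.19.
Exercise 2.20. Obtain Ulam's theorem as a corollary of the Prokhorov theorem.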
2.5. Weak convergence of random elements
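If $\xi$ is a random element in $X$ defined on the probability space $(\Omega, \mathcal{F}, \mathbf{P})$, then its distribution on $X$ is defined by
$$\mu_\xi = \mathbf{P}\xi^{-1}.$$
Definition 2.31. A sequence $\{\xi_n\}$ of random elements with values in a metric space $X$ is said to converge in distribution (or converge weakly) to $\xi$ if $\mu_{\xi_n} \xrightarrow{w} \mu_\xi$. Notation: $\xi_n \xrightarrow{D} \xi$ or $\xi_n \xrightarrow{w} \xi$.
It is a matter of reformulating the definition (together with Theorem 2.16(b)) to show that $\xi_n \xrightarrow{D} \xi$ if and only if $\mathbf{E} f(\xi_n) \to \mathbf{E} f(\xi)$ for all $f \in BL(X)$.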
Recall the classical definition of convergence in distribution for random variables. Random variables $\xi_n$ are said to converge in distribution to $\xi$ if their cumulative distribution functions converge pointwise at all points where the c.d.f. of $\xi$ is continuous, i.e.
$$F_{\xi_n}(x) = \mathbf{P}\{\xi_n \le x\} \to F_\xi(x) = \mathbf{P}\{\xi \le x\}$$
whenever $F_\xi(x) = F_\xi(x-)$, i.e. $\mathbf{P}\{\xi = x\} = 0$. Note that $\mathbf{P}\{\xi \le x\} = \mathbf{P}\{\xi \in (-\infty, x]\}$. The boundary of $(-\infty, x]$ is $\{x\}$, so that the continuity of $F_\xi$ at $x$ means that $\xi$ belongs to the boundary of $(-\infty, x]$ with probability zero.
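The restriction to continuity points matters already for deterministic random variables. The sketch below assumes $\xi_n = 1/n$ and $\xi = 0$ (an illustrative example, with hypothetical helper names) and evaluates the distribution functions at the jump point $x = 0$ and at a continuity point of the limit.

```python
def cdf_point_mass(location):
    """Distribution function F(x) = P{xi <= x} of a constant random variable."""
    return lambda x: 1.0 if x >= location else 0.0

F_limit = cdf_point_mass(0.0)                            # xi = 0
F_n = [cdf_point_mass(1.0 / n) for n in range(1, 2001)]  # xi_n = 1/n

at_jump = [F(0.0) for F in F_n]             # x = 0 is a discontinuity of F_limit
at_continuity = [F(0.25) for F in F_n]      # x = 0.25 is a continuity point
```

At $x = 0$ all $F_{\xi_n}(0)$ equal 0 although $F_\xi(0) = 1$, whereas at $x = 0.25$ the values $F_{\xi_n}(x)$ equal $F_\xi(x) = 1$ from $n = 4$ onwards, so $\xi_n$ converges in distribution to $\xi$.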
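Theorem 2.32. Let $X$ be a separable metric space and let $\xi$, $\{\xi_n\}$ and $\{\eta_n\}$ be random elements in $X$. Assume that $\{\xi_n\}$ and $\{\eta_n\}$ are asymptotically stochastically equivalent, i.e.
$$\mathbf{P}\{\rho(\xi_n, \eta_n) > \varepsilon\} \to 0 \quad \text{as } n \to \infty$$
for all $\varepsilon > 0$. Then $\xi_n \xrightarrow{D} \xi$ implies that $\eta_n \xrightarrow{D} \xi$.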
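Proof. Consider $f \in BL(X)$ and any $\varepsilon > 0$. Write $\delta_n = |\mathbf{E} f(\xi_n) - \mathbf{E} f(\xi)|$, so that $\delta_n \to 0$. Then
$$\begin{aligned}
|\mathbf{E} f(\eta_n) - \mathbf{E} f(\xi)| &\le |\mathbf{E} f(\eta_n) - \mathbf{E} f(\xi_n)| + |\mathbf{E} f(\xi_n) - \mathbf{E} f(\xi)| \\
&\le \mathbf{E}|f(\eta_n) - f(\xi_n)| + \delta_n \\
&= \mathbf{E}\bigl[ |f(\eta_n) - f(\xi_n)|\,\mathbf{1}_{\rho(\xi_n, \eta_n) \le \varepsilon} + |f(\eta_n) - f(\xi_n)|\,\mathbf{1}_{\rho(\xi_n, \eta_n) > \varepsilon} \bigr] + \delta_n \\
&\le 2\|f\|_\infty \mathbf{P}\{\rho(\xi_n, \eta_n) > \varepsilon\} + \varepsilon\|f\|_L + \delta_n.
\end{aligned}$$
Letting first $n \to \infty$ and then $\varepsilon \to 0$ finishes the proof.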
Consider functions (or operators) $T_k : X \to X$, $k \ge 1$, which are Borel measurable. We assume everywhere that $X$ is a separable metric space.
Exercise 2.21. Show that if $X$ is separable, then $\rho(T_k \xi, \xi)$ is a random variable.
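Theorem 2.33 (Wichura). Let $\{\xi_n\}$ and $\xi$ be random elements in $X$. Assume that
1. $T_k \xi_n \xrightarrow{D} T_k \xi$ as $n \to \infty$ for all $k \ge 1$;
2. for all $\varepsilon > 0$
$$\lim_{k \to \infty} \limsup_{n \to \infty} \mathbf{P}\{\rho(T_k \xi_n, \xi_n) > \varepsilon\} = 0;$$
3. for all $\varepsilon > 0$
$$\lim_{k \to \infty} \mathbf{P}\{\rho(T_k \xi, \xi) > \varepsilon\} = 0.$$
Then $\xi_n \xrightarrow{D} \xi$.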
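Proof. We need to show that $\mathbf{E} f(\xi_n) \to \mathbf{E} f(\xi)$ for each $f \in BL(X)$. Fix $\varepsilon > 0$. Then
$$\begin{aligned}
|\mathbf{E} f(\xi_n) - \mathbf{E} f(\xi)| &\le \mathbf{E}|f(\xi_n) - f(T_k \xi_n)| + |\mathbf{E} f(T_k \xi_n) - \mathbf{E} f(T_k \xi)| + \mathbf{E}|f(T_k \xi) - f(\xi)| \\
&= \mathbf{E}\bigl[ |f(\xi_n) - f(T_k \xi_n)| \bigl( \mathbf{1}_{\rho(T_k \xi_n, \xi_n) \le \varepsilon} + \mathbf{1}_{\rho(T_k \xi_n, \xi_n) > \varepsilon} \bigr) \bigr] + |\mathbf{E} f(T_k \xi_n) - \mathbf{E} f(T_k \xi)| \\
&\quad + \mathbf{E}\bigl[ |f(\xi) - f(T_k \xi)| \bigl( \mathbf{1}_{\rho(T_k \xi, \xi) \le \varepsilon} + \mathbf{1}_{\rho(T_k \xi, \xi) > \varepsilon} \bigr) \bigr] \\
&\le \varepsilon\|f\|_L + 2\|f\|_\infty \mathbf{P}\{\rho(T_k \xi_n, \xi_n) > \varepsilon\} + |\mathbf{E} f(T_k \xi_n) - \mathbf{E} f(T_k \xi)| \\
&\quad + \varepsilon\|f\|_L + 2\|f\|_\infty \mathbf{P}\{\rho(T_k \xi, \xi) > \varepsilon\} \to 0
\end{aligned}$$
if we let first $n \to \infty$, then $k \to \infty$, and finally $\varepsilon \to 0$.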
Exercise 2.22. Explore what the Wichura theorem brings if $\xi$ is a random function from $C([0, 1])$ and $T_k x$ is a piecewise linear function that equals $x$ at grid points on $[0, 1]$, see Section 5.3.