AND
PROBABILITY DISTRIBUTIONS
BY
HARALD CRAMER
Chancellor of the Swedish Universities
and Professor in the University of Stockholm
CAMBRIDGE
AT THE UNIVERSITY PRESS
1962
PUBLISHED BY
THE SYNDICS OF THE CAMBRIDGE UNIVERSITY PRESS
Bentley House, 200 Euston Road, London, N.W.1
American Branch: 32 East 57th Street, New York 22, N.Y.
West African Office: P.O. Box 33, Ibadan, Nigeria
First printed 1937
Second Edition 1962
First printed at the University Press, Cambridge
Reprinted by offset-lithography by Bradford & Dickens, London, W.C.1
CONTENTS
Preface to the First Edition
Preface to the Second Edition
Abbreviations
FIRST PART. PRINCIPLES
Chap. I. Introductory remarks
II. Axioms and preliminary theorems
SECOND PART. DISTRIBUTIONS IN $R_1$
III. General properties. Mean values
IV. Characteristic functions
V. Addition of independent variables. Convergence "in probability". Special distributions
VI. The normal distribution and the central limit theorem
VII. Liapounoff's theorem. Asymptotic expansions
VIII. A class of stochastic processes
THIRD PART. DISTRIBUTIONS IN $R_k$
IX. General properties. Characteristic functions
X. The normal distribution and the central limit theorem
Bibliography
Some Recent Works on Mathematical Probability
PREFACE
The Mathematical Theory of Probability has lately become of
growing importance owing to the great variety of its applications,
and also to its purely mathematical interest. The subject of this
tract is the development of the purely mathematical side of the
theory, without any reference to the applications. The axiomatic
foundations of the theory have been chosen in agreement with
the theory given by A. Kolmogoroff in his work Grundbegriffe der
Wahrscheinlichkeitsrechnung, to which I am greatly indebted. In
accordance with this theory, the subject has been treated as a
branch of the theory of completely additive set functions. The
method principally used has been that of characteristic functions
(or Fourier-Stieltjes transforms).
The limitation of space has made it necessary to restrict the
programme somewhat severely. Thus in the first place it has
proved necessary to consider exclusively probability distributions
in spaces of a finite number of dimensions. With respect to
the advanced part of the theory, I have found it convenient to
confine myself almost entirely to problems connected with the
so-called Central Limit Theorem for sums of independent variables,
and with some of its generalizations and modifications in
various directions. This limitation permits a certain uniformity
of method, but obviously a great number of important and
interesting problems will remain unmentioned.
My most sincere thanks are due to my friends W. Feller,
O. Lundberg and H. Wold for valuable help with the preparation
of this work. In particular the constant assistance and criticism
of Dr Feller has been very helpful to me.
H. C.
Department of Mathematical Statistics
University of Stockholm
December 1936
PREFACE TO THE SECOND EDITION
This Tract has now been out of print for a number of years. Since
there still seems to be some demand for it, the Syndics of the
Cambridge University Press have judged it desirable to publish
a new edition.
However, owing to the vigorous development of Mathematical
Probability Theory since 1937, any attempt to bring the book up
to date would have meant rewriting it completely, a task that
would have been utterly beyond my possibilities under present
conditions. Thus I have had to restrict myself in the main to a
number of minor corrections, otherwise leaving the work,
including the Bibliography, where it was in 1937.
Besides the minor corrections, most of which are concerned
with questions of terminology, there are, in fact, only two major
alterations. In the first place, a serious error in the statement
and proof of Theorem 11 has been put right. Further, the contents
of Chapter IV, 4, which are fundamental for the theory of
asymptotic expansions, etc. developed in Chapter VII, have been
revised and simplified. This permits a new formulation of the
important Lemma 4, on which the proofs of Theorems 24-26
are based. Finally a brief list of recent works on the subject in
the English language has been added.
H. C.
University Chancellor's Office
Stockholm
March 1960
ABBREVIATIONS AND NOTATIONS

Symbol                     Signification                                               Explanation
d.f.                       Distribution function                                       page 11
pr.f.                      Probability function                                        11
s.d.                       Standard deviation                                          21
$E(X)$                     Mean value (or mathematical expectation) of $X$             20
$D(X)$                     Standard deviation of $X$                                   21
c.f.                       Characteristic function                                     24
$F(x) = F_1(x) * F_2(x)$   $F(x) = \int_{-\infty}^{\infty} F_1(x - t)\, dF_2(t)$       37
convergence i.pr.          Convergence in probability                                  39
$(F(x))^{n*}$              $F(x) * F(x) * \cdots * F(x)$ ($n$ times)                   53

The union or sum of any finite or enumerable sequence of sets
$S_1, S_2, \dots$ is denoted by
$S = S_1 + S_2 + \dots$.
The intersection or product of the sets $S_1, S_2, \dots$ is denoted by
$S = S_1 S_2 \dots$.
The inclusion sign $\subset$ is used in relations of the type $S_1 \subset S$
indicating that $S_1$ is a subset of $S$, and also in relations of the type
$X \subset S$ to express the fact that $X$ is an element of the set $S$.
FIRST PART
PRINCIPLES
CHAPTER I
INTRODUCTORY REMARKS
1. In the most varied fields of practical and scientific experience,
cases occur where certain observations or trials may be
repeated a large number of times under similar circumstances.
Our attention is then directed to a certain quantity, which may
assume different numerical values at successive observations.
In many cases each observation yields not only one, but a certain
number of quantities, say $k$, so that generally we may say that
the result of each observation is a definite point $X$ in a space of
$k$ dimensions ($k \ge 1$), while the result of the whole series of
observations is a sequence of points: $X_1, X_2, \dots$.
Thus if we make a series of throws with a given number of
dice, we may observe the sum of the points obtained at each
throw. We are then concerned with a variable quantity, which
may assume every integral value between $m$ and $6m$ (both limits
inclusive), where $m$ is the number of dice. On the other hand, in a
series of measurements of the state of some physical system, or
of the size of certain organs in a number of individuals belonging
to the same biological species, each observation furnishes a
certain number of numerical values, i.e. a definite point in a space
of a fixed number of dimensions.
In certain cases, the observed characteristic is only indirectly
expressed as a number. Thus if, in a mortality investigation, we
observe during one year a large number of persons, we may at
each observation (i.e. for each person) note the number of deaths
which take place during the year, so that in this case the observed
quantity assumes the value 0 or 1 according as the corresponding
person is alive at the end of the year or not.
In a given class of observations, let $R$ denote the set of points
which are a priori possible positions of our variable point $X$, and
let $S$ be a sub-set of $R$. Further, let a series of $n$ observations be
made, and count the number $\nu$ of those observations, where the
following event takes place: the point $X$ determined by the
observation belongs to $S$. Then the ratio $\nu/n$ is called the frequency of that
event or, as we may shortly put it, the frequency of the relation
(or event) $X \subset S$. Obviously any such frequency always lies
between 0 and 1, both limits inclusive. If $S = S_1 + S_2$, where $S_1$
and $S_2$ have no common point, and if $\nu_1/n$ and $\nu_2/n$ are the
frequencies corresponding to $S_1$ and $S_2$, we obviously have
$\nu = \nu_1 + \nu_2$ and thus
(1)   $\nu/n = \nu_1/n + \nu_2/n.$
When we are dealing with such frequencies, a certain peculiar
kind of regularity very often presents itself. This regularity may
be roughly described by saying that, for any given sub-set $S$,
the frequency of the relation (or event) $X \subset S$ tends to become more
or less constant as $n$ increases. In certain cases, such as e.g. cases
of biological measurements, our observations may be regarded
as samples from a very large or even infinite population, so that
for indefinitely increasing $n$ the frequency would ultimately
reach an ideal value, characteristic of the total population.
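The behaviour of such frequencies can be illustrated numerically. The following is only a sketch under stated assumptions: it simulates throws of two fair dice with Python's pseudo-random generator (the seed, the number of observations and the sub-sets `S1`, `S2` are arbitrary illustrative choices, not taken from the text) and computes the frequencies of two sub-sets without common point.

```python
import random
from fractions import Fraction


def frequency(observations, subset):
    """Return the frequency nu/n of the event 'X belongs to subset'."""
    nu = sum(1 for x in observations if x in subset)
    return Fraction(nu, len(observations))


rng = random.Random(0)   # fixed seed, an arbitrary choice for repeatability
m = 2                    # number of dice thrown at each observation
n = 10_000               # number of observations
throws = [sum(rng.randint(1, 6) for _ in range(m)) for _ in range(n)]

S1 = {2, 3, 4}           # two sub-sets of R = {m, ..., 6m} without common point
S2 = {10, 11, 12}
S = S1 | S2              # the sum S = S1 + S2

f1, f2, f = (frequency(throws, s) for s in (S1, S2, S))
print(float(f1), float(f2), float(f))
```

The additivity of the frequencies holds exactly for every run, whereas the closeness of the frequency of `S` to 12/36 = 1/3 (the value suggested by counting the 36 equally weighted outcomes of two fair dice) is only the kind of empirical regularity described above.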
It is thus suggested that in cases where the above-mentioned
type of regularity appears, we should try to introduce a number
$P(S)$ to represent such an ideal value of the frequency $\nu/n$,
corresponding to the subset $S$. The number $P(S)$ is then called the
probability of the subset $S$, or of the event $X \subset S$. It follows from
(1) that we should obviously choose $P(S)$ such that
(2)   $P(S_1 + S_2) = P(S_1) + P(S_2)$
for any two subsets $S_1$ and $S_2$ of $R$ which have no common point.
Further, it is obvious that we should always have $P(S) \ge 0$ and
that for the particular set $S = R$ we should have $P(R) = 1$.
The investigation of set functions of the type $P(S)$ and their
mutual relations is the object of the Mathematical Theory of
Probability. This theory should be considered as a branch of
Pure Mathematics, founded on an axiomatic basis, in the same
sense as Geometry or Theoretical Mechanics.¹ Once the fundamental
conceptions have been introduced and the axioms have
been laid down (and in this procedure we are, of course, guided
by empirical considerations), the whole body of the theory should
be constructed by purely mathematical deductions from the
axioms. The practical value of the theory will then have to be
tested by experience, just in the same way as a theorem in
euclidean geometry, which is intrinsically a purely mathematical
proposition, obtains a practical value because experience shows
that euclidean geometry really conforms with sufficient accuracy
to a large group of empirical facts.
2. The axiomatic basis of a theory may, of course, always be
constructed in many different ways, and it is well known that,
with respect to the foundations of the Theory of Probability,
there has been a great diversity of opinions.
The type of statistical regularity indicated above was first
observed in connection with ordinary games of chance with
cards, dice, etc., and this gave occasion to the origin and early
development of the theory.² In every game of this character, all
the results that are a priori possible may be arranged in a finite
number of cases which are supposed to be perfectly symmetrical.
This led to the famous principle of equally possible cases which,
after having been more or less tacitly assumed by earlier writers,
was explicitly framed by Laplace [1], as the fundamental principle
of the whole theory. Throughout the whole century following
the publication of Laplace's classical treatise, a large amount of
work has been spent on the discussion of this principle.
¹ This view seems to have been first explicitly expressed by v. Mises [2].
² Cf. Todhunter [1].

During the course of this discussion, it has been maintained
by various authors that the validity of the principle of equally
possible cases is necessarily restricted to the field of games of
chance, so that it is wholly incapable of serving as the basic
principle of the theory. Attempts have been made¹ to establish
the theory on an essentially different basis, the probabilities
being directly defined as ideal values of statistical frequencies.
The most successful attempt on this line is due to v. Mises [2, 3],
who endeavours to reach in this way an axiomatic foundation of
the theory in the modern sense.
The fundamental conception of the v. Mises theory is that of a
"Kollektiv", by which is meant an unlimited sequence $K$ of
similar observations, each furnishing a definite point belonging
to an a priori given space $R$ of a finite number of dimensions.
The first axiom of v. Mises then postulates the existence of the
limit
(3)   $\lim_{n \to \infty} \nu/n = P(S)$
for every simple subset $S \subset R$, while the second axiom requires
that the analogous limit should still exist and have the same
value $P(S)$ for every subsequence $K'$ that can be formed from
$K$ according to a rule such that it can always be decided whether
the $n$th observation of $K$ should belong to $K'$ or not, without
knowing the result of this particular observation.² It does, however,
seem difficult to give a precise mathematical meaning to the
condition printed in italics, and the attempts to express the
second axiom in a more rigorous way do not, so far, seem to have
reached satisfactory and easily applicable results. Though fully
recognizing the value of a system of axioms based on the properties
of statistical frequencies, I think that these difficulties
must be considered sufficiently grave to justify, at least for the
time being, the choice of a fundamentally different system.
¹ For the history of these attempts, cf. Keynes [1], chaps. VII-VIII.
² The second axiom as given by v. Mises [3], p. 18, is somewhat more
complicated. It can, however, be shown that this is equivalent to the simpler statement
given above.

The underlying idea of the system that will be adopted here
may be roughly described in the following simple way: The
probability of an event is a definite number associated with that event;
and our axioms have to express the fundamental rules for operations
with such numbers.
Following Kolmogoroff we take as our starting-point the
observation made above (cf. (2)) that the probability $P(S)$ may
be regarded as an additive function of the set $S$. We shall, in fact,
content ourselves by postulating mainly the existence of a
function of this type, defined for a certain family of sets $S$ in the
$k$-dimensional space $R_k$ to which our variable point $X$ is restricted,
and such that $P(S)$ denotes the probability of the relation $X \subset S$.
Thus the question of the validity of the relation (3) will not
at all enter into the mathematical theory. For the empirical
verification of the theory it will, on the other hand, become a
matter of fundamental importance to know if, in a given case,
(3) is satisfied with a practically sufficient approximation.¹
Questions of verification and application fall, however, outside
the scope of the present work, which will be exclusively concerned
with the development of the purely mathematical part of the
subject.
3. Before giving the explicit statement of our axioms, it will
be convenient to discuss here a few preliminary questions related
to the theory of point sets and (generalized) Stieltjes integrals in
spaces of a finite number of dimensions.²
¹ Cf. Cantelli [2], Tornier [1]. The foundations of the theory laid down by
these authors present certain analogies with the principles here used.
² Reference may be made to treatises by Hobson [1], Lebesgue [1] and de la
Vallée Poussin [1].

In the first place, we must define the family $F$ of sets $S$, for
which we shall want our additive set function $P(S)$ to be given.
If $X = (\xi_1, \dots, \xi_k)$ belongs to the $k$-dimensional euclidean space
$R_k$, the family $F$ should obviously contain every $k$-dimensional
interval $J$ defined by inequalities of the form
$a_i < \xi_i \le b_i$   ($i = 1, 2, \dots, k$),
as we may always want to know the probability of the relation
$X \subset J$. It is also obvious that $F$ should contain every set $S$
constructed by performing on intervals $J$ a finite number of additions,
subtractions and multiplications. It is even natural to require
that it should be possible to perform these operations an infinite
number of times without ever arriving at a set $S$ such that the
value of $P(S)$ is not defined. Accordingly, we shall assume that
$P(S)$ is defined for all Borel sets¹ $S$ of $R_k$.
The family of Borel sets consists precisely of all sets that can
be constructed from intervals $J$ by applying a finite or infinite
number of times the three elementary operations. If $S_1, S_2, \dots$
are Borel sets in $R_k$, this also holds true for the two sets
$\limsup S_n = \lim (S_n + S_{n+1} + \dots)$,
$\liminf S_n = \lim (S_n S_{n+1} \dots)$.
If $\limsup S_n$ and $\liminf S_n$ are identical, we put
$\lim S_n = \limsup S_n = \liminf S_n$,
and thus $\lim S_n$ is also a Borel set. In particular, the sum and
product of an infinite sequence of Borel sets are always Borel sets.
If no two of the sets $S_i$ have a common point, it follows from
the additive property (2) that
$P(S_1 + \dots + S_n) = P(S_1) + \dots + P(S_n)$
for every finite $n$. Since the limit $S_1 + S_2 + \dots$ always exists and
is a Borel set, it is natural to require that this relation should
hold even as $n \to \infty$, so that we should have
$P(S_1 + S_2 + \dots) = P(S_1) + P(S_2) + \dots$
A set function with this property will be called completely
additive, and it will be assumed that the function $P(S)$ is of
this type.
Consider now a real-valued point function $g(X)$, defined for
all points $X = (\xi_1, \dots, \xi_k)$ in $R_k$. $g(X)$ is said to be measurable B²
if, for all real $a$ and $b$, the set of points $X$ such that $a < g(X) \le b$
is a Borel set. Similarly, a vector function $Y = f(X)$, where
$Y = (\eta_1, \dots, \eta_r)$ belongs to a certain $r$-dimensional space $\mathfrak{R}_r$, is
measurable B if every component $\eta_i$, regarded as a function
of $X$, is measurable B. If $\mathfrak{S}$ denotes any Borel set in $\mathfrak{R}_r$, and if $S$
is the set of all points $X$ in $R_k$ such that $f(X) \subset \mathfrak{S}$, then $S$ is also
a Borel set. (If $f(X)$ never assumes a value belonging to $\mathfrak{S}$,
$S$ is of course the empty set.) If $f_1, f_2, \dots$ are measurable B, so are
$f_1 \pm f_2$, $f_1 f_2$, $f_1^{-1}$, $\limsup f_n$, $\liminf f_n$, and, in the case of
convergence, $\lim f_n$.

¹ Cf. Hobson [1], I, p. 179; Lebesgue [1], p. 117; de la Vallée Poussin [1], p. 33.
² Cf. Hobson [1], I, p. 563; de la Vallée Poussin [1], p. 34.
All sets of points with which we shall have to deal in the sequel are
Borel sets, while all point functions are measurable B. Generally
this will not be explicitly mentioned, and should then always be
tacitly understood.
A Lebesgue-Stieltjes integral with respect to the completely
additive set function $P(S)$ is, for every bounded and non-negative
$g(X)$ and for every set $S$, uniquely defined by the
postulates
(A)   $\int_{S_1 + S_2} g\,dP = \int_{S_1} g\,dP + \int_{S_2} g\,dP$,
$S_1$ and $S_2$ having no common point, and
(B)   $\int_S (g_1 + g_2)\,dP = \int_S g_1\,dP + \int_S g_2\,dP$,
(C)   $\int_S g\,dP \ge 0$,
(D)   $\int_S 1 \cdot dP = P(S)$.
If $g$ is not bounded, we put $g_M = \min(g, M)$ and define $\int_S g\,dP$
as the limit of $\int_S g_M\,dP$ as $M \to \infty$. If the limit is finite, $g$ is said
to be integrable over $S$ with respect to $P(S)$. The extension to
functions $g$ which are not of constant sign is performed by putting
$2\int_S g\,dP = \int_S (|g| + g)\,dP - \int_S (|g| - g)\,dP$.
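For a distribution whose whole mass sits in finitely many points, the integral reduces to a finite sum, and the postulates (A) to (D) can be checked directly. The sketch below assumes such a discrete distribution; the table `MASS`, the test functions `g1`, `g2`, `h` and the sets `S1`, `S2` are hypothetical illustrations, and the function `integral` is valid only for this discrete case, not for a general completely additive $P(S)$.

```python
from fractions import Fraction

# Hypothetical discrete distribution: finitely many points carrying mass.
MASS = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}


def P(S):
    """Set function P(S): the total mass carried by the points of S."""
    return sum((MASS[x] for x in S if x in MASS), Fraction(0))


def integral(g, S):
    """Integral of g over S with respect to P, valid for this discrete P only."""
    return sum((g(x) * MASS[x] for x in S if x in MASS), Fraction(0))


def g1(x):
    return x * x


def g2(x):
    return x + 1


def h(x):
    return x - 1          # changes sign on the carrier {0, 1, 2}


S1, S2 = {0, 1}, {2}      # no common point
S = S1 | S2

# Extension to functions of variable sign, as in the last formula above.
lhs = 2 * integral(h, S)
rhs = integral(lambda x: abs(h(x)) + h(x), S) - integral(lambda x: abs(h(x)) - h(x), S)
print(lhs, rhs)
```

Exact rational arithmetic is used so that both members of each relation can be compared for equality rather than up to rounding.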
For any $g$ such that $|g| < G$ throughout the set $S$, we then have
the mean value theorem
$\left| \int_S g\,dP \right| \le G\,P(S)$.
Let $g_1, g_2, \dots$ be a sequence of functions such that for all points
of $S$ we have $|g_n| < g$, where $g$ is integrable. Then if $\lim g_n$ exists
for every point of $S$, except possibly for a certain set of points
$S_1 \subset S$ such that $P(S_1) = 0$, we have
$\lim \int_S g_n\,dP = \int_S \lim g_n\,dP$.
It follows that the theorems on continuity, differentiation and
integration with respect to a parameter, etc. which are known
from elementary integration theory extend themselves
immediately to integrals of the type $\int_S g(X, t)\,dP$, where $t$ is a
parameter.
The ordinary theorems on repeated integrals¹ are also easily
extended to integrals of the type here considered. In particular
we have the following result which will be used in Chapter III.
Let $P(S)$ be defined in a space $R_2$ and such
that for every interval $J$ ($a_1 < \xi_1 \le b_1$, $a_2 < \xi_2 \le b_2$)
we have $P(J) = P_1(J_1)\,P_2(J_2)$,
where $P_1(S)$ and $P_2(S)$ are completely additive set functions in
$R_1$ and $J_i$ denotes the one-dimensional interval $a_i < \xi_i \le b_i$.
Then if the function $g_1(\xi_1)\,g_2(\xi_2)$ is integrable over $R_2$ with
respect to $P(S)$, we have
$\int_{R_2} g_1(\xi_1)\,g_2(\xi_2)\,dP = \int_{R_1} g_1(\xi_1)\,dP_1 \cdot \int_{R_1} g_2(\xi_2)\,dP_2$.

¹ Cf. Hobson [1], I, p. 626; de la Vallée Poussin [1], p. 50.
CHAPTER II
AXIOMS AND PRELIMINARY THEOREMS
1. We now proceed to the explicit statement of our axioms.¹
In accordance with the preceding chapter, we denote by $R_k$
a $k$-dimensional euclidean space with the variable point
$X = (\xi_1, \dots, \xi_k)$, and we consider the family of all Borel sets $S$
in $R_k$.
Axiom 1. To every $S$ corresponds a non-negative number
$P(S)$, which is called the probability of the relation (or event) $X \subset S$.
Axiom 2. We have $P(R_k) = 1$.
Axiom 3. $P(S)$ is a completely additive set function, i.e. we have
$P(S_1 + S_2 + \dots) = P(S_1) + P(S_2) + \dots$,
where $S_1, S_2, \dots$ are Borel sets, no two of which have a common point.
The variable point $X$ is then called a random variable (or
random point, random vector). The set function $P(S)$ is called
the probability function of $X$, and is said to define the probability
distribution in $R_k$ which is attached to the variable $X$. It is often
convenient to use a concrete interpretation of a probability
distribution as a distribution of mass of the total amount 1 over $R_k$,
the quantity of mass allotted to any Borel set $S$ being equal
to $P(S)$.
It follows immediately from the axioms that we always have
$0 \le P(S) \le 1$,
and $P(S) + P(S^*) = 1$,
where $S$ and $S^*$ are complementary sets. Further, if $S_1$ and $S_2$
are two sets such that $S_1 \supset S_2$, we have $S_1 = S_2 + (S_1 - S_2)$ and thus
(4)   $P(S_1) \ge P(S_2)$.

¹ The fact that we restrict ourselves here to Borel sets in $R_k$ permits some formal
simplification of the system of axioms given by Kolmogoroff [4], and of the
immediate conclusions drawn from the axioms.
Theorem 1. For any sequence of Borel sets $S_1, S_2, \dots$ in $R_k$,
we have
$P(\limsup S_n) \ge \limsup P(S_n)$,
$P(\liminf S_n) \le \liminf P(S_n)$.
Hence, if $\lim S_n$ exists, so does $\lim P(S_n)$, and we have
(5)   $P(\lim S_n) = \lim P(S_n)$.
In order to prove this theorem, we shall first show that (5)
holds for any monotone sequence $\{S_n\}$. If $\{S_n\}$ is an increasing
sequence, we may in fact write
$\lim S_n = S_1 + (S_2 - S_1) + (S_3 - S_2) + \dots$,
and thus obtain from Axiom 3
$P(\lim S_n) = P(S_1) + P(S_2 - S_1) + P(S_3 - S_2) + \dots$
$= P(S_1) + (P(S_2) - P(S_1)) + (P(S_3) - P(S_2)) + \dots$
$= \lim P(S_n)$.
For a decreasing sequence $\{S_n\}$, the same thing is shown by
considering the increasing sequence formed by the complementary
sets $S_n^*$.
For any sequence $\{S_n\}$, whether monotone or not, we have
(cf. I, 3) $\limsup S_n = \lim (S_n + S_{n+1} + \dots)$. Now, $S_n + S_{n+1} + \dots$ is
obviously the general element of a decreasing sequence, so that
(6)   $P(\limsup S_n) = \lim P(S_n + S_{n+1} + \dots)$.
For every $r = 0, 1, \dots$, we have $S_n + S_{n+1} + \dots \supset S_{n+r}$, and thus
by (4)
$P(S_n + S_{n+1} + \dots) \ge P(S_{n+r})$,
$P(S_n + S_{n+1} + \dots) \ge \limsup P(S_n)$.
We thus obtain from (6)
$P(\limsup S_n) \ge \limsup P(S_n)$.
Hence the inequality for $P(\liminf S_n)$ is obtained by considering
the sequence $\{S_n^*\}$ of complementary sets and using the identity
$\liminf S_n = (\limsup S_n^*)^*$. Thus Theorem 1 is proved.
In the particular case when every point $X$ of $R_k$ belongs at
most to a finite number of the sets $S_n$, $\lim S_n$ is the empty set,
and it follows that we have $\lim P(S_n) = 0$.
2. Consider now the particular set $S_{x_1, \dots, x_k}$ defined by the
inequalities
(7)   $\xi_i \le x_i$   ($i = 1, 2, \dots, k$).
For all real values of the $x_i$, we define a point function $F(x_1, \dots, x_k)$
by putting
$F(x_1, \dots, x_k) = P(S_{x_1, \dots, x_k})$,
so that according to Axiom 1 $F(x_1, \dots, x_k)$ represents the
probability of the joint existence of the relations (7). Then $F(x_1, \dots, x_k)$
is called the distribution function¹ of the probability distribution
defined by $P(S)$.
In the sequel, the terms probability function and distribution
function will usually be abbreviated to pr.f. and d.f. respectively.
Let $J$ denote the half-open $k$-dimensional interval defined by
the inequalities $a_i < \xi_i \le b_i$ for $i = 1, 2, \dots, k$. The corresponding
probability $P(J)$ is then easily seen to be given by the $k$-th order
difference of the d.f. $F(x_1, \dots, x_k)$ associated with the interval $J$.
We thus have, writing only the first and last terms of the
expression for this difference,
$P(J) = \Delta_J^k F(x_1, \dots, x_k)$
$= F(b_1, \dots, b_k) - \dots + (-1)^k F(a_1, \dots, a_k)$.
Theorem 2. Every d.f. $F(x_1, \dots, x_k)$ possesses the following
properties:
(a) In each variable $x_i$, $F$ is a never decreasing function, which is
everywhere continuous to the right and tends to the limit 0 as
$x_i \to -\infty$.
(b) As all the variables $x_i$ tend (independently or not) to $+\infty$, $F$
tends to the limit 1.
(c) For any half-open $k$-dimensional interval $J$, the associated
$k$-th order difference of $F$ is non-negative, i.e. $\Delta_J^k F \ge 0$.
Further, every function $F(x_1, \dots, x_k)$ which possesses the
properties (a), (b) and (c) determines uniquely a probability distribution
in $R_k$, such that $F$ represents the probability of the relations (7).

¹ The use here made of the terms probability function and distribution function
corresponds to the terminology of Kolmogoroff [4]. The latter term was used, with
the same significance, already by v. Mises [1, 2].
That $F$ is a never decreasing function of each $x_i$ follows
immediately from (4), since the set $S_{x_1, \dots, x_k}$ increases steadily with
each $x_i$. Further, we have for every $h > 0$
$F(x_1 + h, x_2, \dots, x_k) - F(x_1, x_2, \dots, x_k) = P(S_{x_1 + h, x_2, \dots, x_k} - S_{x_1, x_2, \dots, x_k})$.
If $h$ runs through a sequence of values tending to zero, the sequence
of point sets appearing in the second member obviously tends to
a definite limit, viz. the empty set. Thus by Theorem 1 the first
member tends to zero, and $F$ is continuous to the right in $x_1$.
The same argument evidently applies to every $x_i$. In the same
way it is seen that $F$ tends to zero as any given $x_i \to -\infty$, since
the set $S_{x_1, \dots, x_k}$ tends then to the empty set.
As, on the other hand, all the variables $x_i$ tend simultaneously
to $+\infty$, the set $S_{x_1, \dots, x_k}$ tends to the whole space $R_k$, and
consequently $F$ tends to the limit 1.
Further, it is obvious that any d.f. will satisfy the property
(c), as we must have $P(S) \ge 0$ for any Borel set $S$.
The last part of Theorem 2, which asserts that every d.f.
uniquely determines a non-negative set function $P(S)$ satisfying
our axioms, is equivalent to a well-known proposition in the
theory of Lebesgue integration.¹ We have already seen that the
d.f. immediately determines the value of $P(S)$ for every half-open
$k$-dimensional interval $a_i < \xi_i \le b_i$ ($i = 1, 2, \dots, k$). Now every
Borel set can be constructed from such intervals by means of
repeated passages to the limit, and the corresponding value of
the set function has then to be determined according to (5). That
this procedure leads to a uniquely determined result for every
Borel set is precisely asserted by the proposition referred to above.
¹ Lebesgue [1], pp. 168-169 (one-dimensional case); de la Vallée Poussin [1],
chap. VI.

According to Theorem 2, we are at liberty to define a
probability distribution either by its pr.f. (which is a set function)
or by its d.f. (which is a point function). Though of course the
distinction between the two methods is only formal, it will
sometimes be found convenient to prefer one of them to the
other. It is particularly in the case of distributions in a
one-dimensional space ($k = 1$) that we shall use the d.f., while for
general values of $k$ the pr.f. will be used.
In the one-dimensional case ($k = 1$), the property (c) is implied
by (a), and thus it follows from Theorem 2 that every
non-decreasing function $F(x)$ which is always continuous to the right
and is such that $F(x) \to 0$ as $x \to -\infty$, and $F(x) \to 1$ as $x \to +\infty$,
defines a probability distribution. As soon as $k > 1$, however, (c)
is no longer implied by (a), and already for $k = 2$ it is in fact easy
to construct examples of functions $F$ satisfying (a) and (b), but
not (c). Accordingly, these functions are not distribution
functions.²
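A function of this kind can be examined numerically. The sketch below takes $F(x, y) = 0$ wherever $x < 0$, or $y < 0$, or both $x < 1$ and $y < 1$, and $F = 1$ elsewhere, and evaluates the second order difference over a small interval having the point $x = y = 1$ in its interior. The half-width `eps` and the helper name `second_difference` are arbitrary illustrative choices.

```python
def F(x, y):
    """Equal to 0 where x < 0, or y < 0, or both x < 1 and y < 1; 1 elsewhere."""
    if x < 0 or y < 0 or (x < 1 and y < 1):
        return 0
    return 1


def second_difference(F, a1, b1, a2, b2):
    """Second order difference of F over a1 < x <= b1, a2 < y <= b2."""
    return F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)


eps = 0.01  # half-width of a small interval around x = y = 1 (arbitrary)
delta = second_difference(F, 1 - eps, 1 + eps, 1 - eps, 1 + eps)
print(delta)  # -1

# F is nevertheless never decreasing in each variable on a test grid.
grid = [-1, 0, 0.5, 1, 2]
monotone = all(
    F(x1, y) <= F(x2, y) and F(y, x1) <= F(y, x2)
    for y in grid
    for x1, x2 in zip(grid, grid[1:])
)
print(monotone)  # True
```

Three of the four vertices of the interval give $F = 1$ and the vertex $(1 - \varepsilon, 1 - \varepsilon)$ gives $F = 0$, so the difference is $1 - 1 - 1 + 0 = -1$, and no non-negative mass can be attached to the interval.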
3. Let $X = (\xi_1, \dots, \xi_k)$ be a random variable in $R_k$ with the
pr.f. $P(S)$, and $Y = f(X) = (\eta_1, \dots, \eta_r)$ be a B-measurable function
which is finite and uniquely defined for all points $X$ of $R_k$,
and such that its values belong to a certain space $\mathfrak{R}_r$. Then
if $\mathfrak{S}$ is a Borel set in $\mathfrak{R}_r$, the set $S$ of all points $X$ in $R_k$, such that
$Y = f(X) \subset \mathfrak{S}$, is (cf. I, 3) also a Borel set. If, now, we define a
set function $\mathfrak{P}(\mathfrak{S})$ in $\mathfrak{R}_r$ by the relation
$\mathfrak{P}(\mathfrak{S}) = P(S)$,
it is readily seen that our Axioms 1-3 are satisfied by $\mathfrak{P}(\mathfrak{S})$, so
that $\mathfrak{P}(\mathfrak{S})$ determines a probability distribution in $\mathfrak{R}_r$. This, by
definition, is the probability distribution of the random variable
$Y = f(X)$. The condition that $f$ should be finite and uniquely
defined for all points of $R_k$ may obviously be replaced by the more
general condition that the points $X$, where $f$ is not finite or not
uniquely defined, should form a set $Z$ such that $P(Z) = 0$.
For a set $\mathfrak{S}$ such that the corresponding set $S$ contains no point
$X$ we obtain, of course, $\mathfrak{P}(\mathfrak{S}) = 0$.
² A simple example is the function $F(x, y)$ defined by $F = 0$ for $x < 1$, $y < 1$, for
$y < 0$, and for $x < 0$, and by $F = 1$ elsewhere. For this function, the
difference associated with a sufficiently small interval $J$ containing the point
$x = y = 1$ in its interior is seen to be negative, so that (c) is not satisfied.

Take, e.g., $Y = (\xi_1, \dots, \xi_r)$, where $r < k$, so that $Y$ is simply the
projection of the point $X$ on a certain $r$-dimensional sub-space $R_r$.
The pr.f. of $Y$ is then $\mathfrak{P}(\mathfrak{S}) = P(S)$, where $S$ is the cylinder set in
$R_k$ defined by the relation $(\xi_1, \dots, \xi_r) \subset \mathfrak{S}$. This may be concretely
interpreted by saying that the distribution of $Y$ is formed by
projecting the mass in the original distribution on the subspace $R_r$.
In particular ($r = 1$), every component $\xi_i$ of $X$ is itself a random
variable, and the corresponding distribution is found by
projecting the original distribution on the axis of $\xi_i$.
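The projection of mass can be made concrete for a distribution carried by finitely many points. The sketch below is an illustration under that assumption only: the table `JOINT` is a hypothetical distribution in three dimensions, and `project` adds up the mass of all points sharing the same first $r$ coordinates, which is the mass of the corresponding cylinder set.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed example: a discrete distribution in R_3, given as {point: mass}.
JOINT = {
    (0, 0, 0): Fraction(1, 4),
    (0, 1, 5): Fraction(1, 4),
    (1, 0, 2): Fraction(1, 3),
    (1, 1, 7): Fraction(1, 6),
}


def project(joint, r):
    """Distribution of Y = (xi_1, ..., xi_r): project the mass on the first r axes."""
    marginal = defaultdict(Fraction)   # Fraction() is zero
    for point, mass in joint.items():
        marginal[point[:r]] += mass
    return dict(marginal)


print(project(JOINT, 1))
print(project(JOINT, 2))
```

Each projected distribution again has total mass 1, in agreement with the remark that the set function so defined satisfies the axioms.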
4. Two random variables $X_1 = (\xi_1, \dots, \xi_{k_1})$ in $R_{k_1}$ and
$X_2 = (\eta_1, \dots, \eta_{k_2})$ in $R_{k_2}$ being given, it often occurs that we have to
consider also the "combined" variable $\mathfrak{X} = (X_1, X_2)$ as a random
variable. The "values" of $\mathfrak{X}$ are all pairs of "values" of $X_1$ and
$X_2$, so that $\mathfrak{X} = (X_1, X_2) = (\xi_1, \dots, \xi_{k_1}, \eta_1, \dots, \eta_{k_2})$ is defined in the
product space $\mathfrak{R}_r = R_{k_1} \cdot R_{k_2}$, where $r = k_1 + k_2$. Obviously the
probability distribution of $\mathfrak{X}$ in $\mathfrak{R}_r$ must be such that its projections
on $R_{k_1}$ and $R_{k_2}$ coincide with the distributions of $X_1$ and $X_2$
respectively.¹ Similar remarks apply to the "combined" variable
formed with any number of random variables.
Let the probability functions of $X_1$, $X_2$ and $\mathfrak{X}$ be $P_1$, $P_2$ and $\mathfrak{P}$,
while the corresponding distribution functions are $F_1$, $F_2$ and $\mathfrak{F}$.
Then $F_1(x_1, \dots, x_{k_1})$ and $F_2(y_1, \dots, y_{k_2})$ denote the probabilities
of the relations
$\xi_i \le x_i$   ($i = 1, 2, \dots, k_1$)
and
$\eta_j \le y_j$   ($j = 1, 2, \dots, k_2$),
while $\mathfrak{F}(x_1, \dots, x_{k_1}, y_1, \dots, y_{k_2})$ denotes the
probability of the joint existence of all these $r = k_1 + k_2$ relations.
We now introduce the following important definition: The
variables $X_1$ and $X_2$ are called mutually independent if, for all
values of the $x_i$ and $y_j$,
(8)   $\mathfrak{F}(x_1, \dots, x_{k_1}, y_1, \dots, y_{k_2}) = F_1(x_1, \dots, x_{k_1})\,F_2(y_1, \dots, y_{k_2})$.
1
and S" are given sets in R' and Bit respectively, and ifwe
consider the set S in mformed by all pairs I =(Xl' X.) such that
(9) Xl cS
I
and XscS",
1 Any distribution satisfying thia eoDditioD U. of C011l'8e, logieaDy poesible.
AXIOMS AND PBELIMINA:RY THEOREMS 15
then it follows from (8) by the basic property of Borel sets that
'(6) =P
1
(Sl) Pi (S.).
Thus for two independent variables the probability of the joint
existence of the relations (9) is equal to the product of the pro
babilities for each relation separately. The validity of this
multiplicative rule for the particular sets connected with the
distribution functions is thus equivalent to the validity of the
same rule for all Borel sets.
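The equivalence between the product rule for distribution functions and the product rule for general sets can be observed on a small discrete case. In the sketch below, two hypothetical one-dimensional distributions `P1` and `P2` are combined into a joint distribution built so that (8) holds; the helper names `prob` and `joint_df` are illustrative, and only finitely many points carry mass.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two one-dimensional discrete distributions {point: mass}.
P1 = {0: Fraction(1, 3), 1: Fraction(2, 3)}
P2 = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}

# Joint distribution of the combined variable, built so that (8) holds.
JOINT = {(x, y): P1[x] * P2[y] for x, y in product(P1, P2)}


def prob(dist, S):
    """Mass that the discrete distribution dist allots to the set S."""
    return sum((m for point, m in dist.items() if point in S), Fraction(0))


def joint_df(x, y):
    """d.f. of the combined variable: probability of xi <= x and eta <= y."""
    return sum((m for (u, v), m in JOINT.items() if u <= x and v <= y), Fraction(0))


S1, S2 = {1}, {0, 2}                  # sets not of the form used in the d.f.
S = set(product(S1, S2))              # all pairs with X1 in S1 and X2 in S2
print(prob(JOINT, S), prob(P1, S1) * prob(P2, S2))
```

Both printed values are equal, which is the multiplicative rule for a pair of sets that are not of the special form connected with the distribution functions.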
5. Let Xl' XI' ... , X
n
be random variables with the pr.f.'s.
... ,Pn and the lfi, ...,F
n
defined in the spaces B', ..., H:n)
of any number of dimensions. Consider the combined variable
11. =(Xl' .. *' Xfl) with the pr.f. '71. and the d.f. fYn' defined in the
produot space ffi,<n).
Xl' ... , Xft are then called mut'Ual''ll indepe:ndent if
trn=F1F" ... F
n
,
which is the straightforward generalization of (8). As in the case of two variables, this is equivalent to the relation
(10) $\mathfrak{P}_n(\mathfrak{S}_n) = P_1(S_1) \cdots P_n(S_n)$,
where $S_1, \dots, S_n$ are given sets in $R', \dots, R^{(n)}$ respectively, while $\mathfrak{S}_n$ denotes the set in $\mathfrak{R}^{(n)}$ which consists of all points $\mathfrak{X}_n = (X_1, \dots, X_n)$ such that $X_i \subset S_i$ for $i = 1, 2, \dots, n$.
If, in (10), we put $S_n = R^{(n)}$, we obtain
$\mathfrak{P}_{n-1}(\mathfrak{S}_{n-1}) = P_1(S_1) \cdots P_{n-1}(S_{n-1})$,
where $\mathfrak{P}_{n-1}$ is the pr.f. of $\mathfrak{X}_{n-1} = (X_1, \dots, X_{n-1})$ and $\mathfrak{S}_{n-1}$ is the set of all points $\mathfrak{X}_{n-1}$ in $\mathfrak{R}^{(n-1)}$ such that $X_i \subset S_i$ for $i = 1, 2, \dots, n-1$. Thus we infer that the variables $X_1, \dots, X_{n-1}$ are independent, and in the same way we obviously find that any group of $m < n$ among the variables $X_i$ are mutually independent.
Further, it is easily found that, if the variables $X_1, \dots, X_m, Y_1, \dots, Y_n$ are all mutually independent, then the combined variables $\mathfrak{X}_m = (X_1, \dots, X_m)$ and $\mathfrak{Y}_n = (Y_1, \dots, Y_n)$ are also independent.
6. Any B-measurable vector function $f(X_1, \dots, X_m)$ of m random variables may be considered as a B-measurable function of the combined variable $\mathfrak{X}_m$. Thus according to II, 3, the probability distribution of f is uniquely determined by the distribution of $\mathfrak{X}_m$.
CHAPTER III
GENERAL PROPERTIES. MEAN VALUES
1. According to Theorem 2, the d.f. F(x) of a probability distribution in $R_1$ is always a non-decreasing function of x, which is everywhere continuous to the right and tends to 0 as $x \to -\infty$, and to 1 as $x \to +\infty$. Conversely, any F(x) with these properties determines a probability distribution.
Any d.f. F(x) being a monotone function, we can at once state a number of general properties of F(x), for the proofs of which we refer to standard treatises on the Theory of Functions of a Real Variable.1
Theorem 4. A d.f. F(x) has at most a finite number of points at which the saltus is $\ge k > 0$, and consequently at most an enumerable set of points of discontinuity. The derivative F'(x) exists for "almost all" values of x (i.e. the points of exception form a set of measure zero).
F(x) can always be represented as a sum of three components
(13) $F(x) = a_I F_I(x) + a_{II} F_{II}(x) + a_{III} F_{III}(x)$,
where $a_I$, $a_{II}$, $a_{III}$ are non-negative numbers with the sum 1, while $F_I$, $F_{II}$, $F_{III}$ are distribution functions such that:
$F_I(x)$ is absolutely continuous: $F_I(x) = \int_{-\infty}^{x} F_I'(t)\,dt$ for all values of x.
$F_{II}(x)$ is a "step-function": $F_{II}(x)$ = the sum of the saltuses of $F_{II}(x)$ at all discontinuities $\le x$.
$F_{III}(x)$, the "singular" component, is a continuous function having almost everywhere a derivative = 0.
The three components $a_I F_I$, $a_{II} F_{II}$, $a_{III} F_{III}$ are uniquely determined by F(x).
1 Hobson [1], 1, p. 338 and p. 603.
Let us consider in particular the cases when $a_I$ or $a_{II}$ is equal to 1, so that F(x) coincides with $F_I(x)$ or $F_{II}(x)$. We shall say in these cases, which are those usually occurring in applications, that F is of type I or II respectively.
I. If $F(x) = F_I(x)$, we have for all values of x
$F(x) = \int_{-\infty}^{x} F'(t)\,dt$,
and thus the probability that the random variable X with the d.f. F(x) assumes a value belonging to the given set S is
$\int_S F'(t)\,dt$.
The derivative F'(x) is then called the frequency function or probability density of X.
II. If $F(x) = F_{II}(x)$, there is a finite or enumerable set of points $x_1, x_2, \dots$ such that every $x_i$ is a point of discontinuity of F(x), while F(x) is constant on every closed interval which contains no $x_i$. If $p_i$ is the saltus of F(x) at the point $x_i$, we have $\sum_i p_i = 1$.
The probability that X belongs to the given set S is zero, if S does not contain any $x_i$, and is otherwise equal to the sum of all those $p_i$ which correspond to points $x_i$ belonging to S. Thus in this case the distribution is completely described by saying that we have the probability $p_i$ that X assumes the value $x_i$ $(i = 1, 2, \dots)$ and the probability 0 that X differs from all the $x_i$.
2. A Lebesgue-Stieltjes integral $\int_S g\,dP$ with respect to the pr.f. P(S) has been defined in I, 3. We now define the corresponding integral with respect to the d.f. F(x) simply by putting
$\int_S g\,dF = \int_S g\,dP$.
If X is a random variable with the d.f. F(x), the integral $\int_a^b x\,dF(x)$ has a uniquely determined value for all finite a and b. If this integral tends to a finite limit as $a \to -\infty$ and $b \to +\infty$ independently (i.e. if the integral is absolutely convergent), we denote this limit by1
(14) $E(X) = \int_{-\infty}^{\infty} x\,dF(x)$
and call it the mean value or expectation of the random variable X.
A B-measurable function g(X) of X may, according to II, 3, be considered as a random variable. If the d.f. of this variable is denoted by F*(x), we have by II, 3,
$F^*(x) = \int_{S_x} dF(t)$,
where $S_x$ denotes the set of all points t such that $g(t) \le x$. Thus we obtain
$\int_a^b x\,dF^*(x) = \int g(x)\,dF(x)$,
where the integral in the second member is extended to the set $S_b - S_a$. Now if the integral
$\int_{-\infty}^{\infty} |g(x)|\,dF(x)$
is convergent, we may allow a and b to tend to $-\infty$ and $+\infty$, and so obtain according to (14) for the mean value of g(X)
(15) $E\{g(X)\} = \int_{-\infty}^{\infty} g(x)\,dF(x)$.
In the same way we obtain, if $g(\mathfrak{X})$ is a real-valued function of a random variable $\mathfrak{X}$ which is defined in a space $\mathfrak{R}_r$ of any number of dimensions,
(15a) $E\{g(\mathfrak{X})\} = \int_{\mathfrak{R}_r} g\,d\mathfrak{P}$,
where $\mathfrak{P}(S)$ is the pr.f. of $\mathfrak{X}$, and the integral is assumed to be absolutely convergent. If, in particular, g depends only on a certain number $k < r$ of the coordinates of $\mathfrak{X}$, the integral is, by II, 3, directly reduced to an integral over the corresponding subspace $\mathfrak{R}_k$.
1 In the particular cases when F(x) is of type I or type II, we have $E(X) = \int_{-\infty}^{\infty} x F'(x)\,dx$ and $E(X) = \sum_i p_i x_i$ respectively.
The mean value of the particular function $(X - E(X))^2$ is called the variance of X. The non-negative square root of this mean value is called the standard deviation (abbreviated s.d.) of X and is denoted by D(X), so that we have, assuming the convergence of the integral,
(16) $D^2(X) = \int_{-\infty}^{\infty} (x - E(X))^2\,dF(x) = E(X^2) - E^2(X)$.
The square root D(X) is always to be given a non-negative value. We have D(X) = 0 if and only if F(x) is constant on every closed interval which does not contain the point x = E(X). In this extreme case, we have the probability 1 that the variable X assumes the value E(X), and we have $F(x) = \varepsilon(x - E(X))$, where $\varepsilon(x)$ denotes the particular d.f. given by
(17) $\varepsilon(x) = 0$ for $x < 0$, $\varepsilon(x) = 1$ for $x \ge 0$.
In all other cases, the standard deviation D(X) is positive.
If X is a random variable with a finite mean value, we obviously have by (15)
(18) $E(aX + b) = aE(X) + b$
for any constants a and b. Further, if the s.d. is also finite, we have
(19) $D(aX + b) = |a|\,D(X)$.
In particular, the normalized variable $\frac{X - E(X)}{D(X)}$ has the mean value 0 and the s.d. 1.
The moments $\alpha_\nu$ and the absolute moments $\beta_\nu$ of the variable X are the mean values of $X^\nu$ and $|X|^\nu$ for $\nu = 0, 1, 2, \dots$:
$\alpha_\nu = \int_{-\infty}^{\infty} x^\nu\,dF(x)$,
$\beta_\nu = \int_{-\infty}^{\infty} |x|^\nu\,dF(x)$.
($\beta_\nu$ is, of course, hereby defined also for non-integral $\nu > 0$.) It is immediately seen that, if $\beta_k$ is finite, both $\alpha_\nu$ and $\beta_\nu$ are finite for $\nu \le k$. Further, we have $\alpha_{2\nu} = \beta_{2\nu}$ and $|\alpha_{2\nu+1}| \le \beta_{2\nu+1}$. From (14) and (16) we obtain
$E(X) = \alpha_1$, $D^2(X) = \alpha_2 - \alpha_1^2$.
If $\beta_k$ is finite, it follows from well-known inequalities1 that we have
(20) $\beta_1 \le \beta_2^{1/2} \le \beta_3^{1/3} \le \dots \le \beta_k^{1/k}$.
In the sequel it will always be tacitly understood that the mean values occurring in our considerations are assumed to be finite even in the rigorous sense that the corresponding integrals are absolutely convergent.
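For a distribution of type II the quantities above reduce to finite sums, which makes them easy to compute. The sketch below assumes a hypothetical four-point distribution (the values and probabilities are illustrative only) and evaluates the mean, the variance and the chain of absolute-moment roots appearing in (20).

```python
def moment(points, probs, order, absolute=False):
    """alpha_order (or beta_order when absolute=True) of a discrete distribution."""
    return sum(p * (abs(x) if absolute else x) ** order
               for x, p in zip(points, probs))


# Hypothetical type II distribution: the value x_i is taken with probability p_i.
points = [-2.0, 0.0, 1.0, 3.0]
probs = [0.1, 0.4, 0.3, 0.2]

mean = moment(points, probs, 1)                   # E(X) = alpha_1
variance = moment(points, probs, 2) - mean ** 2   # D^2(X) = alpha_2 - alpha_1^2

# beta_nu^(1/nu) for nu = 1, ..., 4; by (20) this list should never decrease.
roots = [moment(points, probs, nu, absolute=True) ** (1.0 / nu)
         for nu in range(1, 5)]
```

The list `roots` can be inspected directly to see the monotonicity asserted by (20) for this particular distribution.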
3. Theorem 5.2 Let $\psi(x)$ denote a non-negative function such that $\psi(x) \ge M > 0$ for all x belonging to a certain set S. Then if X is a random variable, the probability that X assumes a value belonging to S is $\le \frac{E\{\psi(X)\}}{M}$.
This follows directly from the relation
$E\{\psi(X)\} = \int_{-\infty}^{\infty} \psi(x)\,dF(x) \ge M \int_S dF(x) = M\,P(S)$.
Taking here in particular $\psi(x) = (x - E(X))^2$, $M = k^2$, we obtain for every $k > 0$ the Bienaymé-Tchebycheff inequality:
The probability of the relation $|X - E(X)| \ge k$ is $\le \frac{D^2(X)}{k^2}$.
Taking further $\psi(x) = |x|^\nu$, $M = k^\nu \beta_\nu$, it follows that the probability of $|X| \ge k\,\beta_\nu^{1/\nu}$ is $\le \frac{1}{k^\nu}$.
1 Cf. Hardy-Littlewood-Pólya [1], p. 157.
2 This is an obvious generalization of theorems due to Tchebycheff and Markoff.
Of. Kolmogoroff{'], p. 37.
Choosing finally $\psi(x) = e^{cx}$, $M = e^{ca}$, where $c > 0$, we conclude that the probability of $X \ge a$ is $\le \frac{E(e^{cX})}{e^{ca}}$.
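The Bienaymé-Tchebycheff inequality can be checked numerically on any discrete distribution. The following sketch assumes a hypothetical distribution given by the lists `points` and `probs`; the function name is illustrative. It returns both the exact tail probability and the bound $D^2(X)/k^2$, so that the two can be compared.

```python
def chebyshev_check(points, probs, k):
    """Return (P(|X - E(X)| >= k), D^2(X) / k^2) for a discrete distribution."""
    mean = sum(p * x for x, p in zip(points, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(points, probs))
    tail = sum(p for x, p in zip(points, probs) if abs(x - mean) >= k)
    return tail, var / k ** 2


# Hypothetical distribution used only for illustration.
points = [-2.0, 0.0, 1.0, 3.0]
probs = [0.1, 0.4, 0.3, 0.2]

tail, bound = chebyshev_check(points, probs, 2.0)
```

For small k the bound exceeds 1 and is trivially satisfied; it becomes informative only when k is larger than the standard deviation.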
4. Let X and Y be random variables in $R_1$, such that the combined variable Z = (X, Y) has a certain pr.f. P(S) in $R_2$
Ir(t) J
t  i dt
co! t i
34
CHARACTERISTIC FUNCTIONS
(35) R(z)= r(t) e4lzdt.
2m co t
We observe that the conditions of the first part of the theorem are satisfied, in particular, whenever R(x) is the difference between two d.f.'s with finite mean values. In this case, r(t) is the difference between the corresponding c.f.'s.
In order to prove the theorem, we shall first show that both members of (33) are continuous functions of x and h, when $\omega$ is fixed between 0 and 1. In respect of the first member, this is readily seen by writing this member in the form
hif: ytlR(z+ky)dIg.
In respect of the second member, we have already remarked that r(t) is bounded for all t, and by the argument used in IV, 2, we have $r(t) = O(t)$ as $t \to 0$. Moreover, we have for $t \ne 0$
(36) I fAu/.tJIeil:tldul=l! [ht u
tU

1
eiJ,d,fJ, II'
Jo twJo wftJw
where G is an absolute constant. It follows that the integral with respect to t in the second member of (33) is absolutely and uniformly convergent for all x and h, and accordingly represents a continuous function.
Without restricting the generality, we may thus assume for the proof of (33) that x and x + h are continuity points of R(x).
The second member of (33) is the limit, as $M \to \infty$, of the expression
1 JAtI r (t) in,
 .  eit3: at u
w

1
e
w
d'U
2,,1. ,,'V t 0
1 [A 1.1.11 ed(z+u) fco
=  m ut.U1au .at eif.lI dB (y)
1'( .0 0 tt co
= _!Jco dR(y) fA dt.
Jo 0 t
According to the convergence properties of integrals (I, 3), we may here allow M to tend to infinity under the integral. Using the well-known properties of trigonometric integrals and observing that by assumption we have $R(-\infty) = 0$, we then find that the second member of (33) is equal to
IJ:t+ll. AttJ
 (lIX}OItlR(y)+ B (w+h).
W :.c OJ
An integration by parts now yields (33)..
On the other hand, replacing the first member of (33) by the
expression just obtained, we obtain the relation
(z+k
J,. (yx)WdB(y}h<R(x+h)
(I) fCf) r(t) i
A
= . eitJ:dt utul
e
it:udu
211t r:t:! t 0
= ].foo r(t} eil:cdt(ktl)eI.th +it fA'U/.lJSitUdtJ,).
21Tt 0;) t Jo
If we assume the convergence of (34), this gives (35) as $\omega \to 0$.
Thus Theorem 12 is proved.
CHAPTER V
ADDITION OF INDEPENDENT VARIABLES.
CONVERGENCE "IN PROBABILITY".
SPECIAL DISTRIBUTIONS
1. If X and Y are mutually independent random variables with given d.f.'s $F_1(x)$ and $F_2(y)$, then by II, 4, the d.f. of the combined variable (X, Y) is $F_1(x)\,F_2(y)$. Thus the pr.f. P(S) of (X, Y) is, according to Theorem 2, uniquely determined by $F_1$ and $F_2$ for all two-dimensional Borel sets S.
The sum X + Y is a one-dimensional vector function of the variable (X, Y), so that according to II, 6, its d.f. F(z) is uniquely determined by P(S), i.e. by $F_1$ and $F_2$.
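For variables of type II the way in which the two component distributions determine the distribution of the sum can be written out explicitly. The sketch below is an illustration only: it assumes two hypothetical independent variables, each uniformly distributed on the faces of a die, and the function names are not taken from the text. Every pair of values (x, y) contributes the product of its probabilities to the point x + y.

```python
from collections import defaultdict
from fractions import Fraction


def distribution_of_sum(px, py):
    """Distribution of X + Y for independent discrete X and Y (dict: value -> prob)."""
    out = defaultdict(Fraction)
    for x, p in px.items():
        for y, q in py.items():
            out[x + y] += p * q   # independence: joint probability is the product
    return dict(out)


def df(dist, z):
    """Distribution function F(z) = P(variable <= z) of a discrete distribution."""
    return sum((p for v, p in dist.items() if v <= z), Fraction(0))


# Hypothetical example: two independent fair dice.
die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = distribution_of_sum(die, die)
```

Here `df(two_dice, z)` plays the part of the d.f. F(z) of the sum, computed from the two component distributions alone.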
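For our second example, we shall anticipate the discussion of the normal distribution that will be given in the following Chapter. We shall consider a quotient of the form $X_1/\sqrt{X_2}$, where $X_1$ is normally distributed with the mean value 0 and the s.d. $\sigma$, while $X_2$ is distributed according to (49). For the c.f.'s $f_1(t)$ and $f_2(t)$ of $X_1$ and $\sqrt{X_2}$ we then have (cf. (51))
$f_1(t) = e^{-\sigma^2 t^2/2}$
and
$f_2(t) = \frac{\alpha^\lambda}{\Gamma(\lambda)} \int_0^\infty x^{\lambda-1} e^{-\alpha x + it\sqrt{x}}\,dx = \frac{2\alpha^\lambda}{\Gamma(\lambda)} \int_0^\infty v^{2\lambda-1} e^{-\alpha v^2 + itv}\,dv$,
$f_2'(t) = \frac{2i\alpha^\lambda}{\Gamma(\lambda)} \int_0^\infty v^{2\lambda} e^{-\alpha v^2 + itv}\,dv$.
In this case we may apply the last formula of Theorem 16, and so obtain for the frequency function G' of the variable $X_1/\sqrt{X_2}$
$G'(x) = \frac{\alpha^\lambda}{\pi\,\Gamma(\lambda)} \int_{-\infty}^{\infty} e^{-\sigma^2 t^2/2}\,dt \int_0^\infty v^{2\lambda} e^{-\alpha v^2 - itvx}\,dv$
$= \frac{\alpha^\lambda}{\pi\,\Gamma(\lambda)} \int_0^\infty v^{2\lambda} e^{-\alpha v^2}\,dv \int_{-\infty}^{\infty} e^{-\sigma^2 t^2/2 - itvx}\,dt$
$= \frac{2\alpha^\lambda}{\sigma\sqrt{2\pi}\,\Gamma(\lambda)} \int_0^\infty v^{2\lambda} e^{-\left(\alpha + \frac{x^2}{2\sigma^2}\right) v^2}\,dv$
$= \frac{\Gamma(\lambda + \frac12)}{\sqrt{2\pi\alpha\sigma^2}\,\Gamma(\lambda)} \left(1 + \frac{x^2}{2\alpha\sigma^2}\right)^{-(\lambda + \frac12)}$.
This is a distribution of type VII according to the classification of K. Pearson. In the particular case when $2\alpha\sigma^2 = n$, $\lambda = n/2$, we obtain a distribution defined by
$G'(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}}$,
which is known under the name of "Student's" distribution.1
1 "Student" [1]. Cf. also e.g. Rider [1].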
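The closed form for "Student's" distribution can be compared numerically with the integral over v from which it arises for a quotient of this kind. The sketch below rests on assumptions that are not in the text: the parameter values, the truncation of the integral at `upper`, and the use of Simpson's rule are all illustrative choices, and the gamma distribution is taken with density proportional to $x^{\lambda-1} e^{-\alpha x}$.

```python
import math


def student_density(x, n):
    """Closed form: Gamma((n+1)/2) / (sqrt(n pi) Gamma(n/2)) * (1 + x^2/n)^(-(n+1)/2)."""
    return (math.gamma((n + 1) / 2)
            / (math.sqrt(n * math.pi) * math.gamma(n / 2))
            * (1 + x * x / n) ** (-(n + 1) / 2))


def quotient_density(x, lam, alpha, sigma, upper=12.0, steps=20000):
    """Simpson's rule for
    2 alpha^lam / (sigma sqrt(2 pi) Gamma(lam))
        * integral_0^upper v^(2 lam) exp(-(alpha + x^2/(2 sigma^2)) v^2) dv.
    `upper` and `steps` (even) are numerical assumptions, not part of the formula."""
    b = alpha + x * x / (2 * sigma ** 2)

    def g(v):
        return v ** (2 * lam) * math.exp(-b * v * v)

    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    return (2 * alpha ** lam
            / (sigma * math.sqrt(2 * math.pi) * math.gamma(lam)) * integral)
```

With n = 5 one may take, for instance, `lam = 2.5`, `alpha = 2.5`, `sigma = 1.0`, so that $2\alpha\sigma^2 = n$ and $\lambda = n/2$, and compare `quotient_density(x, 2.5, 2.5, 1.0)` with `student_density(x, 5)`.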
CHAPTER VI
THE NORMAL DISTRIBUTION AND
THE CENTRAL LIMIT THEOREM
1. The normal distribution function1 $\Phi(x)$ is defined by the relation
$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$.
The corresponding normal frequency function is
$\Phi'(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
The mean value of this distribution is 0, and the s.d. is 1, as shown by the relations
(50) $\int_{-\infty}^{\infty} x\,d\Phi(x) = 0$, $\int_{-\infty}^{\infty} x^2\,d\Phi(x) = 1$.
The moments of odd order $\alpha_{2\nu+1}$ all vanish, while
$\alpha_{2\nu} = \int_{-\infty}^{\infty} x^{2\nu}\,d\Phi(x) = 1 \cdot 3 \cdots (2\nu - 1)$.
The c.f. is, by a well-known integral formula,
(51) $\int_{-\infty}^{\infty} e^{itx}\,d\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{itx - x^2/2}\,dx = e^{-t^2/2}$.
Hence we obtain, for $\nu = 1, 2, \dots$, by partial integration
(52) $\int_{-\infty}^{\infty} e^{itx}\,d\Phi^{(\nu)}(x) = (-it)^\nu e^{-t^2/2}$
and by differentiation
$\Phi^{(\nu)}(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} (-it)^{\nu-1} e^{-itx - t^2/2}\,dt$.
A random variable X is said to be normally distributed, if its d.f. is $\Phi\left(\frac{x-m}{\sigma}\right)$, where $\sigma \ge 0$ and m are constants. (The case $\sigma = 0$ is, of course, a degenerate limiting case which might be called an improper normal distribution. $\Phi\left(\frac{x-m}{0}\right)$ should always be interpreted as $\varepsilon(x - m)$, where $\varepsilon(x)$ is defined by (17).) The normalized variable $\frac{X-m}{\sigma}$ has then the d.f. $\Phi(x)$ and we obtain from (50)
$E(X) = m$, $D(X) = \sigma$,
while (51) shows (cf. also IV, 1) that the c.f. of the variable X is
$E(e^{itX}) = e^{mit - \sigma^2 t^2/2}$.
The semi-invariants of X, as defined in IV, 2, are
$\gamma_1 = m$, $\gamma_2 = \sigma^2$, $\gamma_3 = \gamma_4 = \dots = 0$.
1 The normal distribution was discussed already in 1733 by De Moivre in the second supplement to his Miscellanea Analytica. Cf. K. Pearson [1]. It was afterwards treated by Gauss and Laplace, and is often referred to as the Gauss or Gauss-Laplace distribution.
2. We now proceed to prove a number of theorems which show that the normal distribution plays a fundamental part in a great number of questions connected with the addition of mutually independent random variables.
Let $X_1$ and $X_2$ be independent and normally distributed variables, the parameter values being $m_1$, $\sigma_1$ and $m_2$, $\sigma_2$ respectively. Then the sum $X_1 + X_2$ has the composed d.f. (cf. V, 1)
$\Phi\left(\frac{x-m_1}{\sigma_1}\right) * \Phi\left(\frac{x-m_2}{\sigma_2}\right)$,
while the corresponding c.f. is
$e^{m_1 it - \sigma_1^2 t^2/2} \cdot e^{m_2 it - \sigma_2^2 t^2/2} = e^{(m_1+m_2)it - (\sigma_1^2+\sigma_2^2)t^2/2}$.
This is, however, obviously the c.f. of a normal distribution, and so we have the following theorem.
Theorem 17.1 The sum of two independent and normally distributed variables is itself normally distributed. Thus
$\Phi\left(\frac{x-m_1}{\sigma_1}\right) * \Phi\left(\frac{x-m_2}{\sigma_2}\right) = \Phi\left(\frac{x-m}{\sigma}\right)$,
where $m = m_1 + m_2$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$.
1 This theorem is sometimes attributed to d'Ocagne, but it seems to have been known already to Poisson and Cauchy, and possibly also to Gauss.
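Theorem 17 may be checked numerically at the level of frequency functions, since the composition of two d.f.'s of type I corresponds to the convolution of their densities. The following is a sketch only: the parameter values are hypothetical, and the truncation of the integral to twelve standard deviations and the trapezoidal rule are numerical assumptions.

```python
import math


def normal_density(x, m, sigma):
    """Frequency function of the normal distribution with mean m and s.d. sigma > 0."""
    z = (x - m) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def density_of_sum(x, m1, s1, m2, s2, steps=4000):
    """Trapezoidal approximation of the convolution
    integral phi_1(x - y) phi_2(y) dy, truncated to m2 +/- 12 s2."""
    lo, hi = m2 - 12 * s2, m2 + 12 * s2
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * normal_density(x - y, m1, s1) * normal_density(y, m2, s2)
    return total * h
```

With, say, $m_1 = 1$, $\sigma_1 = 2$, $m_2 = -3$, $\sigma_2 = 1.5$, the value `density_of_sum(x, 1, 2, -3, 1.5)` can be compared with `normal_density(x, -2, 2.5)`, since $m = m_1 + m_2 = -2$ and $\sigma^2 = 4 + 2.25 = 6.25$.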
Obviously this theorem is immediately generalized to the composition of any finite number of normal distributions.
We shall now prove three theorems which attach themselves in a natural way to Theorem 17 and reveal further remarkable properties of the normal distribution.
According to Theorem 17, the d.f.'s of the type $\Phi\left(\frac{x-m}{\sigma}\right)$ form a closed family (the "normal family") with respect to the operation of composition. Now, any d.f. with a finite mean value m and a finite s.d. $\sigma$ may be written in the form $F\left(\frac{x-m}{\sigma}\right)$, where F(x) is a d.f. with the mean value 0 and the s.d. 1. For any given F(x) with these properties, all functions $F\left(\frac{x-m}{\sigma}\right)$ may be considered as a family generated by F(x). Our next three theorems then assert (1) that no F(x) different from $\Phi(x)$ generates in this way a closed family; (2) that the composition of any two d.f.'s which do not both belong to the normal family never produces a member of that family; and (3) that every d.f. with a finite s.d. gives, by n-fold composition with itself, a d.f. which for all sufficiently large n comes (uniformly for all real x) as near as we please to a member of the normal family. We shall first give the formal statements of the three theorems and then proceed to the proofs.
Theorem 18.1 Let F(x) be a d.f. with the mean value 0 and the s.d. 1. If, to any constants $m_1$, $m_2$ (real) and $\sigma_1$, $\sigma_2$ (positive), we can find m and $\sigma$ such that
(53) $F\left(\frac{x-m_1}{\sigma_1}\right) * F\left(\frac{x-m_2}{\sigma_2}\right) = F\left(\frac{x-m}{\sigma}\right)$,
then $F(x) = \Phi(x)$.
1 Pólya [1]. The example of Cauchy's distribution (V, 6) shows that, in this theorem, it is essential that we consider only d.f.'s with finite dispersions. Further examples of non-normal d.f.'s satisfying (53) have been discussed by Pólya and Lévy [1].
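Theorem 19.1 If the sum of two independent random variables is normally distributed, then each variable is itself normally distributed. Thus if $F_1(x)$ and $F_2(x)$ are d.f.'s such that
(54) $F_1(x) * F_2(x) = \Phi\left(\frac{x-m}{\sigma}\right)$,
then
$F_1(x) = \Phi\left(\frac{x-m_1}{\sigma_1}\right)$, $F_2(x) = \Phi\left(\frac{x-m_2}{\sigma_2}\right)$,
where $m_1 + m_2 = m$, $\sigma_1^2 + \sigma_2^2 = \sigma^2$.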
Before stating the third theorem, some preliminary remarks are necessary. Denoting the composition $F * F * \dots * F$ of n equal components by $F^{n*}$, we obtain from Theorem 17
$\left(\Phi\left(\frac{x-m}{\sigma}\right)\right)^{n*} = \Phi\left(\frac{x-mn}{\sigma\sqrt{n}}\right)$,
and in particular for $m = 0$, $\sigma = 1/\sqrt{n}$,
(55) $(\Phi(x\sqrt{n}))^{n*} = \Phi(x)$.
The last relation expresses that if $X_1, \dots, X_n$ are independent variables, all with the same d.f. $\Phi(x)$, then the variable
$(X_1 + \dots + X_n)/\sqrt{n}$
has the d.f. $\Phi(x)$.
Theorem 20.2 Let F(x) be a d.f. with the mean value 0 and the s.d. 1. If $X_1, X_2, \dots$ are independent variables all having the d.f. F(x), then the d.f. of the variable $(X_1 + \dots + X_n)/\sqrt{n}$ tends to $\Phi(x)$ as $n \to \infty$, uniformly for all real x. Thus
(55a) $(F(x\sqrt{n}))^{n*} \to \Phi(x)$
uniformly in x. Hence it follows also that
(56) $\left(F\left(\frac{x-m}{\sigma}\right)\right)^{n*} - \Phi\left(\frac{x-mn}{\sigma\sqrt{n}}\right) \to 0$
uniformly in x, for all fixed m and $\sigma$.
1 Cramér [5]. The theorem had been conjectured by Lévy [2], [3]. It will be observed that in this theorem it is not assumed that the moments of any order are finite.
2 Lindeberg [1], Lévy [1], p. 233.
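The uniform convergence asserted by Theorem 20 can be observed on a simple example in which the d.f. of the normalized sum is known exactly. The sketch below assumes a hypothetical component variable taking the values +1 and -1 with probability 1/2 each (mean value 0, s.d. 1); the function names are illustrative. Since the d.f. of the sum is a step function and $\Phi$ is continuous and increasing, the largest deviation is attained at a jump point, either at the value of the d.f. or at its left-hand limit.

```python
import math


def phi_cdf(x):
    """Normal distribution function Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sup_distance(n):
    """sup over x of |F_n(x) - Phi(x)|, where F_n is the d.f. of
    (X_1 + ... + X_n)/sqrt(n) for independent X_i = +1 or -1 with probability 1/2."""
    total = 2 ** n
    cumulative = 0
    worst = 0.0
    for j in range(n + 1):                 # j = number of terms equal to +1
        x = (2 * j - n) / math.sqrt(n)     # corresponding value of the normalized sum
        left = cumulative / total          # F_n(x - 0), just before the jump
        cumulative += math.comb(n, j)
        right = cumulative / total         # F_n(x), the d.f. being right-continuous
        target = phi_cdf(x)
        worst = max(worst, abs(left - target), abs(right - target))
    return worst
```

Evaluating `sup_distance(n)` for increasing n shows the deviation from $\Phi(x)$ shrinking; for this lattice example the decrease is slow, of the order of $1/\sqrt{n}$.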
Theorem 20 is a particular case of the famous "Central Limit Theorem" in the theory of probability, which will be more fully treated in the following paragraph. We shall now first prove Theorem 20, which will then be used for the proof of Theorem 18. Finally, we shall prove Theorem 19.
Proof of Theorem 20. If f(t) is the c.f. of a d.f. F(x) with $\alpha_1 = 0$ and $\alpha_2 = 1$, it follows from formula (25) of IV, 2, that $f(t) = 1 - \frac{t^2}{2} + o(t^2)$ for small values of $|t|$. Thus we have uniformly in every finite t-interval
$f\left(\frac{t}{\sqrt{n}}\right) = 1 - \frac{t^2}{2n} + o\left(\frac{1}{n}\right)$
as $n \to \infty$. The c.f. of the variable $(X_1 + \dots + X_n)/\sqrt{n}$ is
$\left(f\left(\frac{t}{\sqrt{n}}\right)\right)^n = \left(1 - \frac{t^2}{2n} + o\left(\frac{1}{n}\right)\right)^n$.
As $n \to \infty$, this tends uniformly in every finite t-interval to the limit $e^{-t^2/2}$, which is the c.f. of $\Phi(x)$. Thus by Theorem 11 the d.f. of $(X_1 + \dots + X_n)/\sqrt{n}$ tends to $\Phi(x)$. The uniformity of the convergence follows easily from the fact that $\Phi(x)$ is continuous.
Thus (55a) is proved, and (56) follows immediately from the remark that $\left(F\left(\frac{x-m}{\sigma}\right)\right)^{n*}$ is the d.f. of the variable
$mn + \sigma\sqrt{n}\,\frac{X_1 + \dots + X_n}{\sqrt{n}}$.
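The limit passage used in this proof can be followed numerically for a component whose c.f. is known in closed form. The sketch assumes the hypothetical variable taking the values +1 and -1 with probability 1/2 each, for which $f(t) = \cos t$; the function names are illustrative.

```python
import math


def cf_normalized_sum(t, n):
    """c.f. of (X_1 + ... + X_n)/sqrt(n) when each X_i is +1 or -1 with
    probability 1/2, so that f(t) = cos t and the c.f. is cos(t/sqrt(n))^n."""
    return math.cos(t / math.sqrt(n)) ** n


def cf_error(t, n):
    """Absolute deviation of the c.f. above from the normal c.f. exp(-t^2/2)."""
    return abs(cf_normalized_sum(t, n) - math.exp(-t * t / 2))
```

For a fixed t the value of `cf_error(t, n)` decreases roughly in proportion to 1/n, in agreement with the expansion $f(t/\sqrt{n}) = 1 - t^2/(2n) + o(1/n)$.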
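Proof of Theorem 18. Both members of the relation (53) are d.f.'s, and the first order moments are $m_1 + m_2$ and m respectively, while the variances are $\sigma_1^2 + \sigma_2^2$ and $\sigma^2$, so that we obtain $m = m_1 + m_2$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$. Putting $m_1 = m_2 = \dots = 0$, we obtain by iteration
$F\left(\frac{x}{\sigma_1}\right) * F\left(\frac{x}{\sigma_2}\right) * \dots * F\left(\frac{x}{\sigma_n}\right) = F\left(\frac{x}{\sqrt{\sigma_1^2 + \dots + \sigma_n^2}}\right)$,
and thus in particular
$(F(x\sqrt{n}))^{n*} = F(x)$.
From (55a) it then follows that $F(x) = \Phi(x)$ for all x.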
Proof of Theorem 19. Let $X_1$ and $X_2$ be two independent variables with the d.f.'s $F_1$ and $F_2$, and the c.f.'s $f_1$ and $f_2$, and suppose that $X_1 + X_2$ has the d.f. $\Phi\left(\frac{x-m}{\sigma}\right)$. Since the quadrant $X_1 \le x$, $X_2 \le y$ is a subset of the half-plane $X_1 + X_2 \le x + y$, we have for all values of x and y
$F_1(x)\,F_2(y) \le \Phi\left(\frac{x+y-m}{\sigma}\right)$.
Here we choose for y any fixed value such that $F_2(y) > 0$, and use the inequality
$\Phi(x) < \frac{1}{\sqrt{2\pi}\,|x|} e^{-x^2/2}$,
which holds for all $x < 0$ and is easily proved by partial integration. It then follows that we can determine A and B independent of x, such that for all $x < 0$
$F_1(x) < A\,e^{-\frac{x^2}{2\sigma^2} + B|x|}$.
Similarly we can determine A' and B' such that for all $x > 0$
$1 - F_1(x) < A'\,e^{-\frac{x^2}{2\sigma^2} + B'|x|}$.
From the two last inequalities it follows that the integral
(57) $J = \int_{-\infty}^{\infty} e^{\frac{x^2}{4\sigma^2}}\,dF_1(x)$
is convergent. If, now, we consider the function
$f_1(t) = \int_{-\infty}^{\infty} e^{itx}\,dF_1(x)$
for complex values of the variable t, it follows from the convergence of (57) that the integral which represents $f_1(t)$ is absolutely and uniformly convergent in every finite domain in the t-plane. Thus $f_1(t)$ is an integral function of the complex variable t. For the modulus of this function we obtain by means of the elementary inequality
$|tx| \le \sigma^2 |t|^2 + \frac{x^2}{4\sigma^2}$,
$|f_1(t)| \le e^{\sigma^2 |t|^2} \int_{-\infty}^{\infty} e^{\frac{x^2}{4\sigma^2}}\,dF_1(x) = J\,e^{\sigma^2 |t|^2}$,
so that the order1 of the integral function $f_1(t)$ does not exceed 2. In the same way it is proved that $f_2(t)$ is an integral function of t, the order of which does not exceed 2. According to (54) we have, however,
$f_1(t)\,f_2(t) = e^{mit - \sigma^2 t^2/2}$,
which shows that $f_1$ and $f_2$ are integral functions without zeros. By the classical factorization theorem2 of Hadamard it then follows that
(58) $f_1(t) = e^{q_1(t)}$, $f_2(t) = e^{q_2(t)}$,
where $q_1(t)$ and $q_2(t)$ are polynomials of degree not greater than 2. The convergence of (57) implies that all moments and semi-invariants of $X_1$ are finite. Denoting the mean value by $m_1$ and the s.d. by $\sigma_1$, we then obtain from (58) according to IV, 2,
$f_1(t) = e^{m_1 it - \sigma_1^2 t^2/2}$,
and similarly
$f_2(t) = e^{m_2 it - \sigma_2^2 t^2/2}$.
This is, however, equivalent to
$F_1(x) = \Phi\left(\frac{x-m_1}{\sigma_1}\right)$, $F_2(x) = \Phi\left(\frac{x-m_2}{\sigma_2}\right)$.
Then obviously $m_1 + m_2 = m$ and $\sigma_1^2 + \sigma_2^2 = \sigma^2$, and the theorem is proved.3
1 Cf. e.g. Titchmarsh [1], p. 248.
2 Cf. e.g. Titchmarsh [1], p. 250.
3 $\sigma_1$ or $\sigma_2$ may be equal to zero. If, e.g., $\sigma_1 = 0$, we have by 1 to interpret $\Phi\left(\frac{x-m_1}{\sigma_1}\right)$ as $\varepsilon(x - m_1)$, and so obtain the trivial solution of (54): $F_1(x) = \varepsilon(x - m_1)$, $F_2(x) = \Phi\left(\frac{x-m_2}{\sigma}\right)$.
3. The Central Limit Theorem1 in the theory of probability asserts that, under certain general conditions, the sum of a large number of independent variables is approximately normally distributed. In Theorem 20, we have already met with a particular case of the general theorem, viz. the composition of n equal components with a finite s.d. We shall now consider the case when the components are not necessarily equal. Throughout this paragraph and the immediately following one, we shall suppose that every component has a finite s.d. and a mean value equal to zero. The assumption that the mean value is zero may obviously be made without loss of generality, since it is equivalent to the simple addition of a constant to each variable.
We thus consider a sequence of independent random variables $X_1, X_2, \dots$, such that $X_\nu$ has the mean value 0 and the s.d. $\sigma_\nu$. The d.f. of $X_\nu$ will be denoted by $F_\nu(x)$ and the c.f. by $f_\nu(t)$.
If the d.f. of the sum $X_1 + \dots + X_n$ is denoted by $\bar{F}_n(x)$, we have
(59) $\bar{F}_n(x) = F_1(x) * F_2(x) * \dots * F_n(x)$,
and $\bar{F}_n(x)$ has the mean value zero and the variance $s_n^2$ given by
(60) $s_n^2 = \sigma_1^2 + \sigma_2^2 + \dots + \sigma_n^2$.
The variable $(X_1 + \dots + X_n)/s_n$ then has the d.f.
(61) $\mathfrak{F}_n(x) = \bar{F}_n(s_n x)$
with the mean value 0 and the s.d. 1. It is possible to show that under fairly general conditions $\mathfrak{F}_n(x)$ tends to the normal d.f. $\Phi(x)$ as n tends to infinity. The most important case is that in which the following two conditions are satisfied:
(62)
1 This theorem was first stated by Laplace, and was further treated by several mathematicians during the nineteenth century, notably Tchebycheff and Markoff. A complete and rigorous proof under fairly general conditions was first given in 1901 by Liapounoff [1], [2]. Cf. VI, 4, and VII, 4. A comprehensive account of the modern development of the subject is given by Khintchine [2]. The central position which the Limit Theorem occupies in the Theory of Probability is well brought out in this beautiful treatise.
(64)
(66)
J
so that in particular a finite mean value $m = \int_{-\infty}^{\infty} x\,dF(x)$ exists.
(II) It is possible to find a sequence $a_1, a_2, \dots$ of positive numbers such that the d.f. of the variable
(75) $U_n = \frac{X_1 + \dots + X_n}{a_n} - \frac{nm}{a_n}$
tends to $\Phi(x)$ as $n \to \infty$.
1 Feller [1], [2], Khintchine [3], Lévy [3]. It is shown by these authors that (74) is also a necessary condition for the existence of two sequences $\{a_n\}$ and $\{b_n\}$ such that the d.f. of $(X_1 + \dots + X_n)/a_n - b_n$ tends to $\Phi(x)$. On the other hand, (74) is not a necessary condition for the convergence of $\beta_r$ for $0 \le r < 2$.
For the proof of this theorem, we may obviously assume that $\beta_2$ is not finite, as otherwise (I) is trivial and (II) is an immediate corollary of Theorem 20.
We shall first prove that $\beta_r$ is finite for $0 \le r < 2$. The function
$\psi(z) = \int_{|x| \le z} x^2\,dF(x) = -z^2 \int_{|x| > z} dF(x) + 2\int_0^z v\,dv \int_{|x| > v} dF(x)$
is never decreasing for $z > 0$ and tends to infinity with z. By (74) we have $(z \to \infty)$
$\psi(z) \le 2\int_0^z v\,dv \int_{|x| > v} dF(x) = o\left(\int_1^z \frac{\psi(v)}{v}\,dv\right)$.
$\varepsilon > 0$ being given, we denote by M(z) the upper bound of $v^{-\varepsilon}\psi(v)$ in the interval $1 \le v \le z$, and then obtain
$\int_1^z \frac{\psi(v)}{v}\,dv \le M(z) \int_1^z v^{\varepsilon - 1}\,dv < \frac{z^\varepsilon}{\varepsilon} M(z)$.
Thus we have $z^{-\varepsilon}\psi(z) = o(M(z))$, which shows that $\psi(z) = O(z^\varepsilon)$ for every $\varepsilon > 0$. It follows that, for any fixed r such that $0 \le r < 2$ and for all sufficiently large z,
$\int_{z < |x| \le 2z} |x|^r\,dF(x) < z^{r-2}\,\psi(2z) < z^{(r-2)/2}$,
and this obviously implies that $\beta_r$ is finite. Hence in particular the mean value m is finite.
We now proceed to prove the assertion (II). As by hypothesis $\beta_2$ is not finite, the first member of (74) is positive for all $z > 0$, and the function
(76) $Z(u)$ = lower bound of all $z > 0$ such that $\int_{|x| > z} dF(x) \le u$
is a positive and never increasing function of u, uniquely defined for $0 < u < 1$ and tending to infinity as u tends to zero. Further, according to (74) we can find a steadily decreasing function $\eta(z)$, tending to zero as $z \to \infty$, such that for all $z > 0$
(77) $\int_{|x| > z} dF(x) < \frac{\eta(z)}{z^2} \int_{|x| \le z} x^2\,dF(x)$.
Let {An} denote 8, decreasing sequence of numbers sucb that
0<"1&<1 and
(78)
We put
;\,.,,40,
(79) z'" =Z (AnIn) , =nf x
2
dF (x),
l:t
and are now going to show that, with this definition of $a_n$, the d.f. of the variable $U_n$ defined by (75) tends to $\Phi(x)$. Putting $X'_\nu = X_\nu - m$, we have $U_n = (X'_1 + \dots + X'_n)/a_n$, the d.f. of each $X'_\nu$ being $F(x + m)$. We now apply Theorem 22 to the sequence $X'_1, X'_2, \dots$, and then only have to show that the conditions (71)-(73) are satisfied if we put $F_\nu(x) = F(x + m)$ and define $a_n$ according to (79).
By means of (76) and (79) we obtain
(80) f tiP(x) ,f tiP(x) > An ,
n 12:1>1.. fl,
and further according to (77)(79)
zldF(x) > n f rJ,F (x)
4TJ (lz,,) l:cf>P'a
> A.
= 400,
so that Z1f, == 0 (a,.,). E" > 0 being given, we now choose no such that
for all n >ito we have Zn < !fra
n
and Im J < tc:a
n
, and then obtain
by (80)
nI dF(x+m)<ft,f dF(x)+O,
lad>... Ixf>Ztt
so that (71) is satisfied. We have further for 11,:> no
x2dF(x+m)11
n I
II (xm)2
dF
(x>j dcSdF(x) I
an I
< IlfJ (m
2
2mx)dF (x) 1+ 2a:f
an I an
< (m
2
+2,s1 i m I) +e2n f elF (x).
an .. l.itl>%ft
According to (79) and (80) the last expression tends, however, to
zero as n..+oo, so that (72) is also satisfied.
Thus it only remains to show that (73) is satisfied. By (74) we
have for every fixed S> 0 and for all sufficiently large z
zJ Ix IelF(X)=zsf dF (:t)+zjtDdvf dF (x)
Izi '>z ixl>$ 11 Ixl>'"
<8+(z) +8zJ'".o+ ail
v
=2&P(z)+8z f IxldF(x),
eJ
a.nd consequently, putting z= lEan'
for every fixed E>O, aa n+co. By (79) and (80) we have, how
ever, for all n >
and thus by (81)
Ix IdF (x)+O.
aft Ia: I >icGtt
Finally we have, the mean va.lue of each Xv being equal to zero;
r xdF(x+1n) I
ani .. an l:ct>4Ed,w I
2nf 2nf
<
an 1:I>fa,. an
Thus (73) is satisfied, and the proof of Theorem 23 is completed.
CHAPTER VII
LIAPOUNOFF'S THEOREM.
ASYMPTOTIC EXPANSIONS
1. In VI, 3, we have considered a sequence of independent variables $\{X_n\}$ such that $X_n$ has the d.f. $F_n(x)$ with the mean value zero and the s.d. $\sigma_n$. As in VI, 3, we put
$s_n^2 = \sigma_1^2 + \sigma_2^2 + \dots + \sigma_n^2$
and
(82) $\mathfrak{F}_n(x) = F_1(s_n x) * \dots * F_n(s_n x)$,
so that $\mathfrak{F}_n(x)$ is the d.f. of the variable $(X_1 + \dots + X_n)/s_n$. The corresponding c.f. is then
(83) $\mathfrak{f}_n(t) = f_1(t/s_n) \cdots f_n(t/s_n)$.
If the Lindeberg condition (64) is satisfied, it follows from Theorem 21 that $\mathfrak{F}_n(x)$ tends to the normal function $\Phi(x)$ as $n \to \infty$. It is then natural to try to investigate the asymptotic behaviour of the difference $\mathfrak{F}_n(x) - \Phi(x)$. In this respect, it might be desired: (I) to find an upper limit for the modulus of the difference $\mathfrak{F}_n(x) - \Phi(x)$, and (II) to obtain some kind of asymptotic expansion of this difference for large values of n.
In the present Chapter, both these questions will be treated. In the first place it will be shown (Theorem 24) that, under fairly general conditions, we have $|\mathfrak{F}_n(x) - \Phi(x)| < K \log n/\sqrt{n}$, where K is independent of n and x. It will then be shown (Theorems 25, 26) that, subject to conditions of a somewhat more restrictive character, an asymptotic expansion of $\mathfrak{F}_n(x) - \Phi(x)$ in powers of $n^{-1/2}$ can be obtained. From this expansion follows, in particular, the relation $|\mathfrak{F}_n(x) - \Phi(x)| < K/\sqrt{n}$, which is an improvement of the preceding inequality. In the last paragraph of the Chapter, we shall make some remarks concerning the relations between our asymptotic expansions and the expansions in series of Hermite polynomials which have been widely used in applications to mathematical statistics.
2. Throughout the whole Chapter, we shall consider a sequence $X_1, X_2, \dots$ of independent random variables such that $X_n$ has the mean value zero and the s.d. $\sigma_n$. The trivial case when all the $\sigma_n$ are equal to zero will always be excluded. The $\nu$th order moment, absolute moment and semi-invariant (cf. IV, 2) of the variable $X_n$ will be denoted by $\alpha_{\nu n}$, $\beta_{\nu n}$ and $\gamma_{\nu n}$ respectively. Thus in particular
(84) $\alpha_{1n} = \gamma_{1n} = 0$, $\alpha_{2n} = \beta_{2n} = \gamma_{2n} = \sigma_n^2$.
Throughout the whole Chapter it will be assumed that there exists an integer $k \ge 3$ such that $\beta_{kn}$ is finite for all $n = 1, 2, \dots$. It then follows that $\alpha_{\nu n}$, $\beta_{\nu n}$ and $\gamma_{\nu n}$ are finite for $\nu = 1, 2, \dots, k$. In the particular case when all moments are finite, k may be chosen as large as we please.
We shall use the letters $\vartheta$ and $\Theta_k$ to denote unspecified quantities such that $|\vartheta| \le 1$, while $|\Theta_k|$ is less than a number depending only on k.
All the results of this Chapter take a particularly simple form in the case when all the variables $X_n$ have the same d.f. We shall refer to this case as the case of equal components, and the common d.f. of the variables $X_n$ will be denoted by F(x). If, in this case, $\sigma$ denotes the s.d. of $X_n$, we have $s_n = \sigma\sqrt{n}$, and the relations (82) and (83) become
$\mathfrak{F}_n(x) = (F(\sigma x\sqrt{n}))^{n*}$, $\mathfrak{f}_n(t) = \left(f\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^n$.
3. In this paragraph we shall deduce some lemmas that are required for the proofs of the results indicated in 1. We put for $\nu = 2, 3, \dots, k$
$B_{\nu n} = \frac{1}{n}(\beta_{\nu 1} + \dots + \beta_{\nu n})$, $\Gamma_{\nu n} = \frac{1}{n}(\gamma_{\nu 1} + \dots + \gamma_{\nu n})$,
$\rho_{\nu n} = \frac{B_{\nu n}}{B_{2n}^{\nu/2}}$, $\lambda_{\nu n} = \frac{\Gamma_{\nu n}}{B_{2n}^{\nu/2}}$.
Thus for $\nu = 2$ we have $B_{2n} = \Gamma_{2n} = s_n^2/n$, $\rho_{2n} = \lambda_{2n} = 1$. $B_{\nu n}$ is the $\nu$th absolute moment of the d.f. $(F_1(x) + \dots + F_n(x))/n$, and thus by (20) $B_{\nu n}^{1/\nu}$ never decreases as $\nu$ increases from 2 to k, so that we have for $\nu = 2, 3, \dots, k$
(85)
(90)
It follows from (42) that $n^{-(\nu-2)/2}\lambda_{\nu n}$ is the $\nu$th order semi-invariant of $\mathfrak{F}_n(x)$. Further, it follows from (27) that $|\Gamma_{\nu n}| \le \nu^\nu B_{\nu n}$, and hence
(86) $|\lambda_{\nu n}| \le \nu^\nu \rho_{\nu n} \le (k^k \rho_{kn})^{\nu/k}$.
In the particular case of equal components $B_{\nu n}$, $\Gamma_{\nu n}$, $\rho_{\nu n}$ and $\lambda_{\nu n}$ are all independent of n, and we have $B_{\nu n} = \beta_\nu$, $\Gamma_{\nu n} = \gamma_\nu$, where $\beta_\nu$ and $\gamma_\nu$ denote the $\nu$th order absolute moment and the $\nu$th order semi-invariant of the common d.f. F(x).
Besides the case of equal components, we shall also sometimes consider the case when the following condition is satisfied: it is possible to find two positive constants g and G such that for all n
(87) $B_{2n} > g$, $B_{kn} < G$.
Obviously this case includes the case of equal components. If (87) is satisfied, it follows from (85) and (86) that $\rho_{\nu n}$ and $\lambda_{\nu n}$ are uniformly bounded for all $n \ge 1$ and for $\nu = 2, 3, \dots, k$.
We now consider the c.f. $\mathfrak{f}_n(t)$ of the variable $(X_1 + \dots + X_n)/s_n$, as defined by (83). Putting
vn
(88) Tkn. = 4 I/k'
. Pkn
onr first object will be to show that in the interval It I '\0/21:8
there exists a certain expansion of f. (t) which.. in the particular
case of equal components, becomes an ordinary asymptoti.c
expansion in powers of n
t
.. (In the case of equal components,
Pkn is indeJlE'ndent ofn and thus Pkn. is, for large values off'l" of the
same order of magnitude as Vn.)
Lemma 2.¹ For $|t| \le \sqrt[3]{T_{kn}}$ we have
(89) $e^{t^2/2} f_n(t) = 1 + \sum_{\nu=1}^{k-3} \dfrac{P_{\nu n}(it)}{n^{\nu/2}} + \Theta_k \dfrac{|t|^k + |t|^{3(k-2)}}{T_{kn}^{k-2}},$
where
(90) $P_{\nu n}(it) = \sum_{j=1}^{\nu} c_{j\nu n}\,(it)^{\nu+2j}$
is a polynomial of degree $3\nu$ in $(it)$, the coefficient $c_{j\nu n}$ being a polynomial in $\lambda_{3n}, \lambda_{4n}, \ldots, \lambda_{\nu-j+3,n}$ with numerical coefficients, such that
(91) $c_{j\nu n} = \Theta_k\, \rho_{kn}^{(\nu+2j)/k}.$
Thus in the case of equal components $P_{\nu n}(it)$ is independent of $n$, while in the more general case when (87) is satisfied the coefficients of $P_{\nu n}(it)$ are bounded for all $n$.
¹ Cramér [2].
For every r= 1, 2, ... , n we have by (66)
k let (it)V fl (t)k
U = 1 +& 1.. 
v:=r 2 v. 8.,.. 1(;. 811.
For It f we obtain, however, by and (88)
P
I
/
k
It I (nB )l/k !_!
(93) (nB:)lIi p1/f: =n
k
1,
and thus we obtain from (20)
IU I (f311 i t I)V e 2 < f.
v2 v. \ 8ft
For I U I<! have, however,
Ui
log(l+U)=
J
According to (92) U formally, a polynomial in t (in reality. the
factor.& depends of course on t), and the series
co ! It f)V
v. \ 8
n
is a Dlajorating expression for this polynomial. For any power
Vi, where 1 < k/2, we thus obtain from (92) the expansion
; _ k"Ql CO! It 1)11
U  ::E +,& L t
v=2, 8
n
v:::k
V
\ 8
n
kl I it)V 8
k
t
k
= 3
vir
\f:
",.2; 811./ 8
n
with coefficients 8
vjr
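which are independent of $t$. From (92) and (94) we thus obtain an expansion of $\log f_r(t/s_n)$ in powers of $it$, up to the term containing $(it)^{k-1}$, and with an error term of the order $t^k$. According to (26), the coefficient of $(it/s_n)^{\nu}$ in this expansion is, however, equal to $\gamma_{\nu r}/\nu!$, so that we have for $|t| \le \sqrt[3]{T_{kn}}$
$\log f_r(t/s_n) = \sum_{\nu=2}^{k-1} \dfrac{\gamma_{\nu r}}{\nu!}\Big(\dfrac{it}{s_n}\Big)^{\nu} + \Theta_k \dfrac{\beta_{kr}\,t^k}{s_n^k}.$
Summing here over $r = 1, 2, \ldots, n$, we obtain according to (83) and (84)
$\log f_n(t) = \sum_{\nu=2}^{k-1} \dfrac{n\Gamma_{\nu n}}{\nu!}\Big(\dfrac{it}{s_n}\Big)^{\nu} + \Theta_k \dfrac{nB_{kn}\,t^k}{s_n^k} = -\dfrac{t^2}{2} + \sum_{\nu=3}^{k-1} \dfrac{n\lambda_{\nu n}}{\nu!}\Big(\dfrac{it}{\sqrt{n}}\Big)^{\nu} + \Theta_k\, n\rho_{kn}\Big(\dfrac{t}{\sqrt{n}}\Big)^{k}.$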
Substituting $tz$ for $t$ and dividing by $z^2$, we have
$V = \log\{e^{t^2/2}(f_n(tz))^{1/z^2}\} = \sum_{\nu=1}^{k-3} \dfrac{\lambda_{\nu+2,n}}{(\nu+2)!}(it)^{\nu+2}\Big(\dfrac{z}{\sqrt{n}}\Big)^{\nu} + \Theta_k\,\rho_{kn}|t|^k\Big(\dfrac{z}{\sqrt{n}}\Big)^{k-2}.$
If we regard here $t$ and $n$ as fixed, and $z$ as a real variable such that $|z| \le 1$, we thus have for the function $V = V(z)$ an expansion in powers of $z$, with an error term of the order $z^{k-2}$. Then obviously there is a similar expansion for the function $e^V$, so that we may write for $|z| \le 1$
(95) $e^V = e^{t^2/2}(f_n(tz))^{1/z^2} = 1 + \sum_{\nu=1}^{k-3} P_{\nu n}(it)\Big(\dfrac{z}{\sqrt{n}}\Big)^{\nu} + R(z),$
where $R(z) = O(z^{k-2})$ as $z \to 0$. It is then readily seen that the coefficient $P_{\nu n}(it)$ is a polynomial of degree $3\nu$ in $it$, which may be put in the form (90).
According to (86), & majorating series for V== V(z) is
(96) 8&(PJ/: It I)s1=1 !: Itz I)V,
V'np=ro v. v'11,
and thus VI is, for j == 1, 2, ... , k  2, majorated by
(97) 0
k
(pl':l t l)31(J!1)1 i
'\1'11, 110 v.. V
n
From the development
kS Vi
eY:=:E :;+.&Vk2e1YI
; .. 0 J.
we thus obtain, since the majorating series (96) shows that
I IVI< arc for It I
R (z) =0","'i;' (Pi': It 1)S1 (l!l)1 i (jPM: Itz J)V
:1=1 v'n vk21 v. v
n
== Elk(pJJ: 1tz J)Tcll 'j;' (Pit: It/)2/ i; I, ('!:)'V
11 vov.
= 0", (.;,,)1&2{(pJ/: It l)k+(Pi': It 1>3(kt>}
= (I t I'" + It 1
8
(k1.
Putting $z = 1$ in (95), we thus obtain (89). Finally, the relation (91) for the coefficients $c_{j\nu n}$ follows immediately from the majorating series (97) if we observe that, in the expansion (95), a term containing the product $(it)^{\nu+2j}(z/\sqrt{n})^{\nu}$ can only arise from the development of the term $V^j/j!$. Thus Lemma 2 is proved.
We next consider the following Lemma 3, which gives an upper limit of $|f_n(t)|$, valid in the interval $|t| \le T_{kn}$. If the behaviour of the absolute moments $B_{2n}$ and $B_{kn}$ for large values of $n$ is not too irregular, $T_{kn}$ as defined by (88) tends to infinity with $n$, so that the interval $|t| \le \sqrt[3]{T_{kn}}$ of Lemma 2 is, for all sufficiently large $n$, contained in the interval $|t| \le T_{kn}$.
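Lemma 3.¹ For $|t| \le T_{kn}$ we have
$|f_n(t)| \le e^{-t^2/3}.$
We have
$|f_r(t)|^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \cos t(x-y)\, dF_r(x)\, dF_r(y),$
but
$\cos t(x-y) \le 1 - \tfrac12 t^2 (x-y)^2 + \tfrac16 |t|^3 |x-y|^3 \le 1 - \tfrac12 t^2 (x-y)^2 + \tfrac23 |t|^3 (|x|^3 + |y|^3),$
and thus for $|t| \le T_{kn}$ we obtain
$|f_r(t)|^2 \le 1 - \sigma_r^2 t^2 + \tfrac43 \beta_{3r}|t|^3 \le e^{-\sigma_r^2 t^2 + \frac43 \beta_{3r}|t|^3},$
$|f_n(t)|^2 = \prod_{r=1}^{n} |f_r(t/s_n)|^2 \le e^{-t^2 + \frac43 \frac{\rho_{3n}}{\sqrt{n}}|t|^3},$
$|f_n(t)| \le e^{-\frac{t^2}{2}\left(1 - \frac43 \frac{\rho_{3n}|t|}{\sqrt{n}}\right)} \le e^{-t^2/3},$
since by (85) and (88) we have $\rho_{3n}|t|/\sqrt{n} \le \tfrac14$ for $|t| \le T_{kn}$.
Thus Lemma 3 is proved.
¹ Liapounoff [2].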
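The bound of Lemma 3 can be checked numerically in a simple case. The sketch below is only an illustration, assuming equal Bernoulli components, $k = 3$, and the reading $T_{3n} = \sqrt{n}/(4\rho_{3n})$ of (88); the names `f_n` and `lemma3_holds` are illustrative.

```python
import cmath
import math

def f_n(t, n, p):
    """c.f. of (X_1 + ... + X_n)/s_n for X_r = B_r - p, B_r ~ Bernoulli(p)."""
    q = 1.0 - p
    s_n = math.sqrt(n * p * q)
    u = t / s_n
    # Each component takes the value q with probability p and -p with probability q.
    return (q * cmath.exp(-1j * u * p) + p * cmath.exp(1j * u * q)) ** n

def lemma3_holds(n, p, steps=400):
    """Check |f_n(t)| <= exp(-t^2/3) on a grid of 0 <= t <= T_{3n}."""
    q = 1.0 - p
    rho3 = (1 - 2 * p * q) / math.sqrt(p * q)
    T = math.sqrt(n) / (4 * rho3)  # assumed form of (88) with k = 3
    for i in range(steps + 1):
        t = T * i / steps
        if abs(f_n(t, n, p)) > math.exp(-t * t / 3) + 1e-12:
            return False
    return True

if __name__ == "__main__":
    print(lemma3_holds(50, 0.3), lemma3_holds(10, 0.5))
```

A grid check of this kind cannot replace the proof; it only confirms that the inequality is not violated at the sampled points.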
If, in the polynomial $P_{\nu n}(it)$, we replace each power $(it)^{\nu+2j}$ by $(-1)^{\nu+2j}\Phi^{(\nu+2j)}(x)$, we obtain a linear aggregate of the derivatives of the normal function $\Phi(x)$, that will be symbolically denoted by $P_{\nu n}(-\Phi)$. Thus by Lemma 2
(98) $P_{\nu n}(-\Phi) = \sum_{j=1}^{\nu} (-1)^{\nu+2j}\, c_{j\nu n}\, \Phi^{(\nu+2j)}(x),$
where $c_{j\nu n}$ is a polynomial in the quantities $\lambda_{\nu n}$ such that
$c_{j\nu n} = \Theta_k\, \rho_{kn}^{(\nu+2j)/k}.$
Obviously we may write
(99) $P_{\nu n}(-\Phi) = p_{3\nu-1,n}(x)\, e^{-x^2/2},$
where $p_{3\nu-1,n}(x)$ is a polynomial of degree $3\nu - 1$ in $x$. In the case of equal components, $P_{\nu n}(-\Phi)$ and $p_{3\nu-1,n}(x)$ are independent of $n$, and in the more general case when (87) is satisfied, the coefficients $c_{j\nu n}$ as well as the coefficients of $p_{3\nu-1,n}(x)$ are bounded for all $n$. According to (62) we have
(100) $P_{\nu n}(it)\, e^{-t^2/2} = \int_{-\infty}^{\infty} e^{itx}\, dP_{\nu n}(-\Phi).$
We now define two "error terms" $R_{kn}(x)$ and $r_{kn}(t)$ by writing the following expansions for the d.f. $\mathfrak{F}_n(x)$ and the c.f. $f_n(t)$:
(101) $\mathfrak{F}_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \dfrac{P_{\nu n}(-\Phi)}{n^{\nu/2}} + R_{kn}(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \dfrac{p_{3\nu-1,n}(x)}{n^{\nu/2}}\, e^{-x^2/2} + R_{kn}(x),$
(102) $f_n(t) = e^{-t^2/2} + \sum_{\nu=1}^{k-3} \dfrac{P_{\nu n}(it)}{n^{\nu/2}}\, e^{-t^2/2} + r_{kn}(t).$
From (100) we then obtain
$r_{kn}(t) = \int_{-\infty}^{\infty} e^{itx}\, dR_{kn}(x).$
Lemma 2 shows that we have $r_{kn}(t) = O(t^k)$ in the vicinity of $t = 0$, and by the argument used in IV, 2, we conclude that
$\int_{-\infty}^{\infty} x^{\nu}\, dR_{kn}(x) = 0$
for $\nu = 0, 1, \ldots, k-1$. Thus in particular $R_{kn}(x)$ satisfies the conditions of Theorem 12.
We now proceed to the proof of the following lemma, which is fundamental for the rest of the chapter.
Lemma 4.¹ For $0 < \omega < 1$, we have for all real $x$ and all $h > 0$
(103) $\omega \int_x^{x+h} (y-x)^{\omega-1} R_{kn}(y)\, dy = \Theta_k \Big( \int_{T_{kn}}^{\infty} \dfrac{|f_n(t)|}{t^{\omega+1}}\, dt + \dfrac{1}{T_{kn}^{k-2}} \Big).$
If the integral in the second member of this relation is convergent for $\omega = 0$, we further have
$R_{kn}(x) = \Theta_k \Big( \int_{T_{kn}}^{\infty} \dfrac{|f_n(t)|}{t}\, dt + \dfrac{1}{T_{kn}^{k-2}} \Big).$
For the proof of this lemma, we shall suppose that $T_{kn} > 1$, so that $\sqrt[3]{T_{kn}} < T_{kn}$. If this does not hold, only trivial modifications are necessary. (It will appear below that the conclusions which will be drawn from Lemma 4 are all trivial in the case $T_{kn} \le 1$, so that this case is not really interesting.)
(33)
l In the brat edItlon of tillS 'rrac1, LLmlna 4 was stnted In a dlftclont ftlrlll
Impbcs, ill that if the first nu:nlOCu of (103) is leph\('cd by
'" (.G'  1/1'"1 R411 III) dlj.
we ttbt.aut a, rolation valid for U 1. Tn thltt form, the J..,emms, ,rus glven by
(""raJner [2], and a,pphed to the study of the asymptotic I)ropcltie$ of certain integral
averages of }'n (Cf. below, p. 84.)
we obtain
LIAPOUNOFF'S THEOREM
and further, using the inequality (36),
ICdJ:+h(y  X)Wl Rim(y)dy I< 0f: I IdJ
o(S:I I
dt
+A1+Al!+A
s
),
where $C$ denotes an absolute constant. For $A_1$, $A_2$ and $A_3$ we have, on account of Lemmas 2 and 3,
which completes the proof of the first part of the Lemma. The second part is proved in exactly the same way, using (35) instead of (33).
4. In this paragraph, we shall use Lemma 4 to prove the following theorem, which is due to Liapounoff.
Theorem 24.¹ Let $X_1, X_2, \ldots, X_n$ be independent variables such that $X_r$ has the mean value zero and the s.d. $\sigma_r$, and put $s_n^2 = \sigma_1^2 + \cdots + \sigma_n^2$. If the absolute moment $\beta_{3r} = E(|X_r|^3)$ is finite for all $r$, the d.f. $\mathfrak{F}_n(x)$ of the variable $(X_1 + \cdots + X_n)/s_n$ satisfies for all $n > 1$ the inequality
$|\mathfrak{F}_n(x) - \Phi(x)| < C \rho_{3n} \dfrac{\log n}{\sqrt{n}},$
where $C$ is an absolute constant, and $\rho_{3n}$ is defined by (84).
(It will be remembered that, in the particular case of equal components, $\rho_{3n}$ is independent of $n$, while in the more general case when (87) is satisfied, $\rho_{3n}$ is bounded for all $n$.)
¹ Liapounoff [1], [2]. It is possible to show (cf. Cramér [1], and p. 19) that we may take $C = 3$. In the works of Esseen and Gnedenko-Kolmogoroff quoted on p. 119, it is shown that the factor $\log n$ in the evaluation of the error given in Theorem 24 may be omitted. Cf. the Remark on p. 82 below. The evaluation thus obtained is, in a certain sense, a best-possible one.
Without loss of generality, we may in the proof of this theorem assume $T_{3n} > 100$, as in the opposite case we have
$\dfrac{\rho_{3n}}{\sqrt{n}} = \dfrac{1}{4T_{3n}} \ge \dfrac{1}{400},$
so that the theorem is then trivial.
Let us denote by $X_{n+1}$ an auxiliary variable independent of $X_1, X_2, \ldots, X_n$ and having the d.f. $F_{n+1}(x) = \Phi(x/(\varepsilon s_n))$, where $\varepsilon$ is a parameter such that $0 < \varepsilon < 1$. In the notation used in the preceding paragraphs, we then have for the sequence
Xl) X2,' ... , X
n
+
1
= (1 +E
2
),
... .r;:o (Xv'1+.:1)
l+E
i
).cIl E ,
( )
e'"
ft&+l (t) = f. v
t
 e
1+E
2
We now apply Lemma 4 to the variables $X_1, \ldots, X_{n+1}$, putting $k = 3$. It is then obviously permitted to use the second part of the Lemma as, according to the above expression for $f_{n+1}(t)$, the integral occurring in the second member of (103) is absolutely convergent for $\omega = 0$. Replacing $x\sqrt{1+\varepsilon^2}$ by $x$, we then obtain
(104) lit" (z).<1l $(v=' 2) 'I < To 0 +f'" e2(r:..)dt
E 1 +E 3.n+1 TSft +
1
t
<0 e1ctTa..+I) ,
:.L8,.n.+l E 3.n+l
where $C$ is an absolute constant. During the rest of this proof, we shall use the letter $C$ to denote an unspecified absolute constant. We have further
f:>
and hence deduce, denoting by $h > 1$ a new parameter,
(105) f:49(;)
o
<A
e
s
(106)
()
2.
From (104)(106) we obtain
x )o(i
e

f
+
p
!.+
1+E
t
. .111 3, n+1 3, 1&+...
(
%) (1 1 1 )
tT,,(xhE)<4l 4/1+1' +0 he ! +Pa,n+l +E1Tl'Hl elf':7"",,+.
Replacing a: in the first inequality by %1H:, in the second by
a:+Jw, and using the relation we have
further
(107)
(
1 1 1 )
<0 1H+h:6 I + e.."7'1"+1 it
3,1f,+1 ".+1 I
We now dispose of the parameters $h$ and $\varepsilon$ by taking
.. / Vlog1:
Av 210gT
hJ
E==3 In.
a_
From the assumption $T_{3n} > 100$ it then follows that we have $h > 1$ and $0 < \varepsilon < 1$. Further, according to (84) and (88) we have
T. v;;.:+:I (1 +111)1 > Pan. ..
"_+1= 4p"1Hl  3It 1+8J;c
3
T8ft,
>
1+ 1001
and hence 1'Tl.+1> log Paa
Introducing in (101), we then obtain, since
!() (v'l:ot=)4l (a:) I<
. logn
! (:1:) 4l Cal)! < 0 T
k
3r. < apIA v'n
and the theorem is proved.
This theorem is directly applicable, e.g., to the Bernoulli distribution considered in V, 5, and VI, 5, in which case we have $\rho_{3n} = (1 - 2pq)/\sqrt{pq}$. Thus if $\mathfrak{F}_n(x)$ denotes the probability of the relation $(\nu - np)/\sqrt{npq} \le x$, where $\nu$ is the number of white balls obtained in a set of $n$ drawings, the probability of drawing a white ball being each time equal to $p = 1 - q$, we have for all $n > 1$
$|\mathfrak{F}_n(x) - \Phi(x)| < C \dfrac{\log n}{\sqrt{npq}},$
where $C$ is an absolute constant.
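For the Bernoulli case the left-hand side of this inequality can be computed exactly and compared with $\rho_{3n}\log n/\sqrt{n}$. The following sketch is an illustration only: `sup_distance` is a hypothetical helper name, and the supremum is taken over the jump points, which suffices because $\mathfrak{F}_n$ is constant and $\Phi$ is increasing between jumps.

```python
import math

def Phi(x):
    """Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sup_distance(n, p):
    """sup_x |F_n(x) - Phi(x)| for the standardized binomial d.f. F_n."""
    q = 1.0 - p
    s = math.sqrt(n * p * q)
    cdf = 0.0
    worst = 0.0
    for v in range(n + 1):
        x = (v - n * p) / s
        normal = Phi(x)
        worst = max(worst, abs(cdf - normal))  # left limit at the jump
        cdf += math.comb(n, v) * p ** v * q ** (n - v)
        worst = max(worst, abs(cdf - normal))  # value at the jump
    return worst

if __name__ == "__main__":
    n, p = 100, 0.5
    q = 1.0 - p
    rho3 = (1 - 2 * p * q) / math.sqrt(p * q)
    print(sup_distance(n, p), rho3 * math.log(n) / math.sqrt(n))
```

In such experiments the distance is typically a small fraction of $\rho_{3n}\log n/\sqrt{n}$, which is consistent with the footnote's statement that the factor $\log n$ can be dispensed with.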
5. We now return to Lemma 4 with an arbitrary $k \ge 3$. In this paragraph, we shall consider the particular case of equal components. It will be shown that, in this case, it is possible to give a very simple sufficient condition for the existence of an asymptotic expansion of the difference $\mathfrak{F}_n(x) - \Phi(x)$ in powers of $n^{-1/2}$.
In the case of equal components, the moments etc. introduced in 2-3 are independent of $n$, so that we may write $\rho_\nu$, $P_\nu$ and $p_{3\nu-1}$ in the place of $\rho_{\nu n}$, $P_{\nu n}$ and $p_{3\nu-1,n}$.
We shall say that a d.f. $F(x)$ satisfies the condition (C) if, for the corresponding c.f. $f(t)$, we have
(C) $\limsup_{|t| \to \infty} |f(t)| < 1.$
By Theorem 7, the condition (C) is certainly satisfied if, in the standard decomposition of $F(x)$ according to (13), the coefficient $a_I$ of the absolutely continuous component is different from zero.
We now proceed to prove the following theorem.
Theorem 25.¹ Let $X_1, X_2, \ldots$ be a sequence of independent variables all having the same d.f. $F(x)$ with the mean value zero, the s.d. $\sigma$, and a finite absolute moment $\beta_k$ of order $k \ge 3$. By $\mathfrak{F}_n(x) = (F(\sigma x \sqrt{n}))^{n*}$ we denote the d.f. of the variable $(X_1 + \cdots + X_n)/(\sigma\sqrt{n})$. If $F(x)$ satisfies the condition (C), we then have the expansion
(108) $\mathfrak{F}_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \dfrac{P_\nu(-\Phi)}{n^{\nu/2}} + R_{kn}(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \dfrac{p_{3\nu-1}(x)}{n^{\nu/2}}\, e^{-x^2/2} + R_{kn}(x),$
with
(109) $|R_{kn}(x)| < \dfrac{M}{n^{(k-2)/2}},$
where $M$ depends on $k$ and on the given function $F$, but is independent of $n$ and $x$.
Remark. For $k = 3$ this theorem shows that, as soon as $\beta_3$ is finite and condition (C) is satisfied, we have
$|\mathfrak{F}_n(x) - \Phi(x)| < \dfrac{M}{\sqrt{n}},$
where $M$ is independent of $n$ and $x$. Thus the condition (C) enables us to improve the Liapounoff limit of the error as given by Theorem 24.²
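The effect of the first correction term in (108) can be seen in a case where $\mathfrak{F}_n$ is known in closed form. The sketch below assumes components $X_r = E_r - 1$ with $E_r$ exponentially distributed with mean 1 (an absolutely continuous d.f., so that condition (C) holds), for which $\sigma = 1$ and $\lambda_3 = 2$; it uses $P_1(-\Phi) = -\frac{\lambda_3}{3!}\Phi^{(3)}(x)$ with $\Phi^{(3)}(x) = (x^2-1)\Phi'(x)$. The function names are illustrative.

```python
import math

def Phi(x):
    """Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Normal frequency function Phi'(x)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def erlang_cdf(y, n):
    """P(E_1 + ... + E_n <= y) for independent exponential E_r with mean 1."""
    if y <= 0:
        return 0.0
    term = math.exp(-y)
    total = term
    for j in range(1, n):
        term *= y / j
        total += term
    return 1.0 - total

def max_errors(n, xs):
    """Largest error of Phi(x) and of Phi(x) + P_1(-Phi)/sqrt(n) over xs."""
    lam3 = 2.0  # third semi-invariant of E_r - 1, with sigma = 1
    err_normal = err_corrected = 0.0
    for x in xs:
        exact = erlang_cdf(n + x * math.sqrt(n), n)
        normal = Phi(x)
        corrected = normal - lam3 / (6.0 * math.sqrt(n)) * (x * x - 1.0) * phi(x)
        err_normal = max(err_normal, abs(exact - normal))
        err_corrected = max(err_corrected, abs(exact - corrected))
    return err_normal, err_corrected

if __name__ == "__main__":
    grid = [i / 4.0 for i in range(-8, 9)]
    print(max_errors(25, grid))
```

With $k = 4$ the theorem gives an error of order $1/n$ for the corrected approximation, against $1/\sqrt{n}$ for $\Phi(x)$ alone; the two numbers printed reflect this difference.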
¹ Cramér [2].
² It is shown in the works of Esseen and Gnedenko-Kolmogoroff quoted on page 119 that this improvement holds even without the condition (C).
Proof. From Lemma 4 we obtain, using (88) and substituting $\sigma t \sqrt{n}$ for $t$ in the integral,
R3
(3:+1&
(110) (,)J:: (11 X)(I)1 B
k
,. (lI)dll
=0
k
(u(I)'I/,W/'I. flO If,.(crtv''I/,) I
J t
cu
+
l
n('k2)!J
Given any d.f. $F(x)$ satisfying the condition (C), it follows from the Remark p. 26 that we can find $c > 0$ such that $|f(t)| < e^{-c}$ for $t > 1/(4\sigma\rho_k^{3/k})$. By (83), however, $f_n(\sigma t \sqrt{n}) = (f(t))^n$, and thus we obtain from (110)
(Ill)
i(JJ J:Th (y_X)<ul RIM (lI) dU I < M e:
lI
+n<k2)/9).
$M$ denotes here, as during the rest of this proof, an unspecified quantity depending only on $k$ and on the given function $F$, but independent of $n$, $x$, $h$ and $\omega$.
Now $R_{kn}(y)$ is the difference between the never decreasing function $\mathfrak{F}_n(y)$ and the function $U(y) = \Phi + \sum_{\nu=1}^{k-3} n^{-\nu/2} P_\nu(-\Phi)$. The derivative $U'(y)$ obviously satisfies the relation $|U'(y)| < M$, so that we have for every $y$ in the interval of integration
$R_{kn}(x) - Mh < R_{kn}(y) < R_{kn}(x+h) + Mh.$
By means of these inequalities, we obtain fronl (111)
Rim(x) <M(h+ha>;cn+hn<k2)/i) ,
Bien (x+h) > M(h +ka>;cn +k<Dn(k2)/B).
Replacing in the last inequality x +h by x, we thus have generally
(112) IR
kn
(x) I< M (h + ..;: k<u
n
<k2)/2) .
Taking here $h = n^{-(k-2)/2}$, $\omega = 1/\log n$, we obtain (109), and the theorem is proved.
1
It is easily shown by examples that Theorem 25 does not hold true without the condition (C). Let, e.g., $F(x)$ be the step-function connected with the simple Bernoulli distribution (V, 5):
$F(x) = 0$ for $x < -p$, $\quad F(x) = q$ for $-p \le x < q$, $\quad F(x) = 1$ for $x \ge q$.
$F(x)$ being of type II (cf. III, 1), the condition (C) is obviously not satisfied. Taking $k = 4$, Theorem 25 would give the expansion
$\mathfrak{F}_n(x) = \Phi(x) + \dfrac{p - q}{6\sqrt{npq}}\, \Phi^{(3)}(x) + O\Big(\dfrac{1}{n}\Big).$
This can, however, not be true, as it is readily seen that $\mathfrak{F}_n(x)$ has, in the vicinity of $x = 0$, discontinuities where the saltus is of the same order of magnitude as $n^{-1/2}$.
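The order of magnitude of the saltus can be confirmed directly. The following sketch, with the hypothetical helper `max_saltus`, computes the largest jump of the binomial d.f.; multiplied by $\sqrt{n}$ it should approach $1/\sqrt{2\pi pq}$, so the jump is of order $n^{-1/2}$ and cannot be absorbed in an error term of order $1/n$.

```python
import math

def max_saltus(n, p):
    """Largest jump of the d.f. of the number of successes in n Bernoulli trials."""
    q = 1.0 - p
    return max(math.comb(n, v) * p ** v * q ** (n - v) for v in range(n + 1))

if __name__ == "__main__":
    p = 0.5
    limit = 1.0 / math.sqrt(2.0 * math.pi * p * (1.0 - p))
    for n in (100, 400):
        print(n, max_saltus(n, p) * math.sqrt(n), limit)
```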
However, it can be shown (Cramér [2], p. 56) that, even without condition (C), an asymptotic expansion of the form given in Theorem 25 holds for an appropriately weighted average of the function $\mathfrak{F}_n(x)$ over any given interval. In the second member of (108), we shall then have to introduce the corresponding average of $\Phi(x)$ and its derivatives, while the order of the error term will only differ by a factor $(\log n)^2$ from the order given by (109).
6. We shall now prove an analogue to Theorem 25 for the case of unequal components. We shall then have to lay down certain conditions which, roughly speaking, may be interpreted by saying that the d.f.'s of the variables $X_r$ will be required to satisfy the condition (C) on the average in a certain specified sense.
According to the decomposition theorem (cf. (13)), any d.f. $F_r(x)$ may be uniquely represented in the form
(113) $F_r(x) = \kappa_r G_r(x) + (1 - \kappa_r) H_r(x), \quad (0 \le \kappa_r \le 1),$
where $G_r(x)$ is a d.f. of type I (absolutely continuous), while $H_r(x)$ is a d.f. which does not contain any component of type I. We now proceed to prove the following theorem.
(115)
(114)
Theorem 26. Let $X_1, X_2, \ldots$ be independent variables such that $X_r$ has the d.f. $F_r(x)$ with the mean value zero, the s.d. $\sigma_r$, and a finite absolute moment $\beta_{kr}$ of order $k \ge 3$. Let $F_r(x)$ be represented according to (113) and suppose that the derivative $G_r'(x)$ is of bounded total variation $v_r$ in $(-\infty, +\infty)$. Suppose further that we have for infinitely increasing $n$
1 'If, I(
 '+00
lognr=11 ,
.. 1 ICr
  A.A +oo
8
110
logn"_ll
8
11
$s_n$ and $T_{kn}$ being defined as in the preceding paragraphs. For the d.f. $\mathfrak{F}_n$ of the variable $(X_1 + \cdots + X_n)/s_n$ we then have the expansion (101) with an error term $R_{kn}(x)$ satisfying the relation
(116) $|R_{kn}(x)| < \dfrac{M}{T_{kn}^{k-2}},$
where $M$ is independent of $n$ and $x$.
Remark. An important particular case is the case when (a) the conditions (87) are satisfied, and (b) the variations $v_r$ are uniformly bounded for all $r = 1, 2, \ldots$. As we have
== B'n/( t
the conditions (114:) and (115) are in this case equivalent and
reduce to the single condition
1 "
I 1::
og 7t"llIDl
$T_{kn}$ is in this case of the same order of magnitude as $\sqrt{n}$, so that (116) becomes
$|R_{kn}(x)| < \dfrac{M}{n^{(k-2)/2}}.$
For $k = 3$ we obtain here the same improvement of Liapounoff's theorem as at Theorem 25.
Proof. From (91) and (98) we obtain
This shows that for $T_{kn} \le 1$ the assertion of the theorem is trivial, so that we may assume throughout the proof $T_{kn} > 1$. From Lemma 4 we obtain, using (83), for $0 < \omega < 1$,
wJ:"(y:r:)_1Rim(g)tlg=e" .
where $Z$ = upper bound of $\prod_{r=1}^{n} |f_r(t)|$ for $t > T_{kn}/s_n$.
Hence we obtain by the same argument as that used for the
deduction of (112)
(
kfJJZ )
IR
kn
(x) I< Elk A+; +ktDTkJ,"t) 
(For this deduction we require the result that the derivative of the function $U(t) = \Phi + \sum_{\nu=1}^{k-3} n^{-\nu/2} P_{\nu n}(-\Phi)$ satisfies, for $T_{kn} > 1$, the relation $|U'(t)| < \Theta_k$. This is easily proved by means of (91) and (98).)
Taking $h = T_{kn}^{-(k-2)}$, $\omega = 1/\log T_{kn}$, we now obtain
$|R_{kn}(x)| < \Theta_k \big( T_{kn}^{-(k-2)} + Z \log T_{kn} \big).$
So far we have made no use of the assumptions (114) and (115).
If we can now show that, owing to these assumptions, we have
for every fixed $A > 0$
M
(117) Z<
where $M$ is independent of $n$, the theorem will obviously be proved.
By hypothesis we have, denoting by $g_r(t)$ the c.f. of $G_r$,
$|f_r(t)| \le \kappa_r |g_r(t)| + 1 - \kappa_r$
and
$|g_r(t)| \le v_r/|t|.$
For $|t| \ge 2v_r$ we thus have
$|f_r(t)| \le 1 - \tfrac12 \kappa_r,$
and hence for $|t| < 2v_r$ by Lemma 1
tt tt
i fr (t) I 1(Ie,.  32tJ! 1 1(,. 64tJ
z

r r
It follows that we have for all t> 0
11,,(t) I
and consequently for t> Trm./8n.
1/1'(t) I Ih
K
r
Min
( 1. ;;':;) 1
1 It,. Min (1 Pt.)
:i e 6i 1+f); , < ,
.. I :Min I.. P:ft ) i 1(,.
n f Itt(t) I e 64 \ 1, s; r_:t
1
wf
,".1
According to (114) and (115) the last expression is, however, for any fixed $A > 0$ and for all sufficiently large $n$ less than
so that (117) holds true, and the theorem is proved.
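7. It has been proved in the preceding paragraphs that, subject to certain conditions, the series¹
(101a) $\mathfrak{F}_n(x) = \Phi(x) + \dfrac{P_{1n}(-\Phi)}{n^{1/2}} + \dfrac{P_{2n}(-\Phi)}{n} + \dfrac{P_{3n}(-\Phi)}{n^{3/2}} + \cdots$
gives an asymptotic expansion of $\mathfrak{F}_n(x)$ for large values of $n$. According to (95) and (98), the $P_{\nu n}(-\Phi)$ are for $\nu = 1, 2, \ldots, k-3$ defined in the following manner. We first define an ordinary polynomial $P_{\nu n}(t)$ by the relation
$\exp\Big( \sum_{\nu=1}^{k-3} \dfrac{\lambda_{\nu+2,n}}{(\nu+2)!}\, t^{\nu+2} z^{\nu} \Big) = 1 + \sum_{\nu=1}^{k-3} P_{\nu n}(t)\, z^{\nu} + \cdots.$
Here, $\lambda_{\nu n}$ denotes the quantity defined by (84), so that $n^{-(\nu-2)/2}\lambda_{\nu n}$ is the $\nu$th order semi-invariant of $\mathfrak{F}_n(x)$; $k$ is an integer such that the $k$th order absolute moments are known to be finite for all the components of $\mathfrak{F}_n(x)$; and finally $z$ is an auxiliary variable which varies in the vicinity of $z = 0$. To obtain $P_{\nu n}(-\Phi)$ we then replace in $P_{\nu n}(t)$ each power $t^m$ by the function $(-1)^m \Phi^{(m)}(x)$. In this way we obtain the expressions
$P_{1n}(-\Phi) = -\dfrac{\lambda_{3n}}{3!}\Phi^{(3)}(x),$
$P_{2n}(-\Phi) = \dfrac{\lambda_{4n}}{4!}\Phi^{(4)}(x) + \dfrac{10\lambda_{3n}^2}{6!}\Phi^{(6)}(x),$
$P_{3n}(-\Phi) = -\dfrac{\lambda_{5n}}{5!}\Phi^{(5)}(x) - \dfrac{35\lambda_{3n}\lambda_{4n}}{7!}\Phi^{(7)}(x) - \dfrac{280\lambda_{3n}^3}{9!}\Phi^{(9)}(x)$
for the first terms of the development (101a). It will be remembered that in the case of equal components the $\lambda_{\nu n}$ (and thus also the $P_{\nu n}$) are independent of $n$.
¹ The formal definition of this series was given by Edgeworth [1].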
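The polynomials $P_{\nu n}(t)$ can be generated mechanically from the defining relation. The sketch below is an illustration, not the author's procedure: it expands $e^{A(z)}$, with $A(z) = \sum_\nu a_\nu z^\nu$ and $a_\nu = \lambda_{\nu+2,n}t^{\nu+2}/(\nu+2)!$, by the standard recurrence $m\,e_m = \sum_{j=1}^{m} j\,a_j\,e_{m-j}$, storing each coefficient as a dictionary from powers of $t$ to exact fractions. The name `edgeworth_polys` and the numerical values of the $\lambda$'s are assumptions chosen for the example.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials in t given as {power: coefficient}."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return out

def edgeworth_polys(lams, order):
    """Return [P_1, ..., P_order], where lams[nu] is lambda_nu for nu >= 3."""
    # a[nu] is the coefficient of z^nu in the exponent.
    a = [None] + [
        {nu + 2: Fraction(lams[nu + 2]) / factorial(nu + 2)}
        for nu in range(1, order + 1)
    ]
    e = [{0: Fraction(1)}]  # e[m] is the coefficient of z^m in exp(A(z))
    for m in range(1, order + 1):
        acc = {}
        for j in range(1, m + 1):
            for power, c in poly_mul(a[j], e[m - j]).items():
                acc[power] = acc.get(power, 0) + Fraction(j, m) * c
        e.append(acc)
    return e[1:]

if __name__ == "__main__":
    # Illustrative values lambda_3 = 2, lambda_4 = 6, lambda_5 = 24.
    for nu, P in enumerate(edgeworth_polys({3: 2, 4: 6, 5: 24}, 3), start=1):
        print(nu, dict(sorted(P.items())))
```

With these values the coefficient of $t^6$ in $P_2$ is $\lambda_3^2/72 = 10\lambda_3^2/6!$, and the coefficients of $t^7$ and $t^9$ in $P_3$ are $\lambda_3\lambda_4/144 = 35\lambda_3\lambda_4/7!$ and $\lambda_3^3/1296 = 280\lambda_3^3/9!$.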
On the other hand, a development of the type
(101b) $\mathfrak{F}_n(x) = \Phi(x) + \dfrac{c_{3n}}{3!}\Phi^{(3)}(x) + \dfrac{c_{4n}}{4!}\Phi^{(4)}(x) + \cdots$
has been much used by writers on mathematical statistics (cf. e.g. works by Charlier, Bruns, Gram and Thiele), and it has been claimed (without correct proof) that this expansion should possess asymptotic properties similar to those discussed above for the expansion (101a). The coefficients $c_{\nu n}$ are here determined by the relation
$c_{\nu n} = (-1)^{\nu} \int_{-\infty}^{\infty} H_\nu(x)\, d\mathfrak{F}_n(x),$
where $H_\nu(x)$ is the $\nu$th Hermite polynomial:
$H_\nu(x) = (-1)^{\nu} e^{x^2/2} \dfrac{d^{\nu}}{dx^{\nu}} e^{-x^2/2}.$
From these expressions we obtain, by means of the relations between moments and semi-invariants (IV, 2),
$c_{3n} = -\dfrac{\lambda_{3n}}{n^{1/2}}, \quad c_{4n} = \dfrac{\lambda_{4n}}{n}, \quad c_{5n} = -\dfrac{\lambda_{5n}}{n^{3/2}}, \quad c_{6n} = \dfrac{\lambda_{6n}}{n^2} + \dfrac{10\lambda_{3n}^2}{n}, \ \ldots$
For larger values of $\nu$, the expressions of the $P_{\nu n}$ and the $c_{\nu n}$ become increasingly complex, but it will be seen from the above that the two expansions (101a) and (101b) may be regarded as rearrangements of one another. It follows from our results that it is only (101a) which gives, in the ordinary sense, an asymptotic expansion of $\mathfrak{F}_n(x)$. On the other hand, the expansion (101b) may be considered formally simpler, as the coefficients are determined by the simple relation given above, which rests on the orthogonality of the Hermite polynomials.¹
¹ For a more detailed analysis of the relations between the two types of expansions cf. Cramér [2].
CHAPTER VIII
A CLASS OF STOCHASTIC PROCESSES
1. In the preceding Chapters, we have been concerned with distributions of sums of the type $Z_n = X_1 + \cdots + X_n$, where the $X_r$ are independent random variables. $Z_n$ is then a variable depending on a discontinuous parameter $n$, and the passage from $Z_n$ to $Z_{n+1}$ means that $Z_n$ receives the additive contribution $X_{n+1}$, so that we have $Z_{n+1} = Z_n + X_{n+1}$, where $Z_n$ and $X_{n+1}$ are independent.
Consider now the formation of $Z_n$ by successive addition of the independent contributions $X_1, X_2, \ldots$, and let us assume that each addition of a new contribution takes a finite time $\delta$. (In a concrete interpretation the $X_r$ might e.g. be the gains of a certain player during a series of games, every game requiring the time $\delta$, so that $Z_n$ is the total gain realized after $n$ games, or after the time $n\delta$.)
The sum $Z_n$ then arises after the time $n\delta$, and the d.f. of $Z_n$ is thus the d.f. of the sum that has been formed during the time interval $(0, n\delta)$. Suppose now that we allow $\delta$ to tend to zero and $n$ to tend to infinity, in such a way that $n\delta$ tends to a finite limit $\tau$. It is conceivable that the distribution of $Z_n$ may then tend to a definite limit, which will depend on the continuous time-parameter $\tau$. Thus instead of the variable $Z_n$ with a discontinuous parameter $n$ we should have a variable $Z_\tau$ with a continuous parameter $\tau$, and such that the increment of $Z_\tau$ during the time interval $(\tau_1, \tau_1 + \tau_2)$ is independent of $Z_{\tau_1}$:
(118) $Z_{\tau_1+\tau_2} = Z_{\tau_1} + U_{\tau_1\tau_2},$
where $Z_{\tau_1}$ and $U_{\tau_1\tau_2}$ are independent.
It is, in fact, possible to give an exact meaning to the limit passage which has thus been roughly indicated. We shall, however, prefer to consider directly a random variable which depends on a continuous parameter and which behaves in the general way described above.¹
2. Let $\tau$ be a continuous parameter which may be thought of as representing time. Suppose that, for every $\tau \ge 0$, we have a random variable $Z_\tau$ with the d.f. $F(x, \tau)$ and the c.f.
$f(t, \tau) = \int_{-\infty}^{\infty} e^{itx}\, d_x F(x, \tau).$
$Z_0$ will be supposed to be identically equal to zero, so that $F(x, 0)$ coincides with the d.f. $\varepsilon(x)$ defined by (17).
The set of variables $Z_\tau$ will be said to define a random or stochastic process with stationary and independent increments (briefly: a s.i.i. process) if, for $\tau_1 \ge 0$, $\tau_2 > 0$, the difference $U_{\tau_1\tau_2} = Z_{\tau_1+\tau_2} - Z_{\tau_1}$ is a random variable which is independent of the variable $Z_{\tau_1}$ and has a d.f. which is independent of $\tau_1$. We can then say that the increment of the variable $Z_\tau$ during any time interval is independent of the value assumed by the variable at the beginning of the interval, and also independent of the position of the interval on the time scale (but not, of course, independent of the length of the interval).
If $Z_\tau$ defines a s.i.i. process, it is seen from (118) that the d.f. of $Z_{\tau_1+\tau_2}$ is composed by the d.f.'s of $Z_{\tau_1}$ and $U_{\tau_1\tau_2}$. The latter d.f. is, however, by hypothesis independent of $\tau_1$, and for $\tau_1 = 0$ we have $U_{0,\tau_2} = Z_{\tau_2} - Z_0 = Z_{\tau_2}$, so that the d.f. of $U_{\tau_1\tau_2}$ is identical with $F(x, \tau_2)$. This gives us the following relations which may serve as an analytical definition of the s.i.i. process:
(119) $F(x, \tau_1 + \tau_2) = F(x, \tau_1) * F(x, \tau_2),$
(120) $f(t, \tau_1 + \tau_2) = f(t, \tau_1)\, f(t, \tau_2).$
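The relation (120) is easy to verify for any c.f. of the form $f(t, \tau) = e^{\tau\psi(t)}$. The sketch below takes, purely as an illustration, the product of a normal factor and a centred Poisson factor; the parameter values `sigma0`, `lam` and `c` are arbitrary assumptions.

```python
import cmath

def cf_normal(t, tau, sigma0=0.7):
    """c.f. of a normal variable with mean 0 and s.d. sigma0 * sqrt(tau)."""
    return cmath.exp(-0.5 * sigma0 ** 2 * tau * t * t)

def cf_centred_poisson(t, tau, lam=2.0, c=1.5):
    """c.f. of c * (N - lam * tau), where N is Poisson with mean lam * tau."""
    return cmath.exp(lam * tau * (cmath.exp(1j * c * t) - 1 - 1j * c * t))

def cf(t, tau):
    """c.f. of the sum of the two independent components."""
    return cf_normal(t, tau) * cf_centred_poisson(t, tau)

if __name__ == "__main__":
    t, tau1, tau2 = 0.9, 0.4, 1.1
    print(abs(cf(t, tau1 + tau2) - cf(t, tau1) * cf(t, tau2)))
```

The printed difference is zero up to rounding, and $f(t, 0) = 1$ corresponds to $Z_0 = 0$.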
¹ Particular cases of variables of this character were first studied by Bachelier [1, 2] and Lundberg [1, 2]. Further contributions were given, inter alia, by Cramér [3] and Esscher [1], in connection with the mathematical theory of insurance risk. A complete and mathematically rigorous theory, which embraces also cases much more general than the s.i.i. process, was first given by Kolmogoroff [2]. The theory of the s.i.i. process was developed by Lévy [2] under more general conditions than those considered here.
For the moments of $Z_\tau$ we use the notation
$\alpha_\nu(\tau) = E(Z_\tau^{\nu}) = \int_{-\infty}^{\infty} x^{\nu}\, dF(x, \tau).$
(Throughout the Chapter it will be understood that the variable of integration is always the first variable occurring in the function under the sign $d$, so that we may omit the index of this sign.)
Theorem 27.¹ Let $Z_\tau$ define a s.i.i. process such that $\alpha_1(\tau) = 0$ and $\alpha_2(\tau)$ is finite for all $\tau > 0$. We then have
(121) $\log f(t, \tau) = -\tfrac12 \sigma_0^2 \tau t^2 + \tau \int_{-\infty}^{\infty} \dfrac{e^{itx} - 1 - itx}{x^2}\, d\Omega(x),$
where $\sigma_0$ is a constant and $\Omega(x)$ is a bounded and never decreasing function which is continuous at $x = 0$. Conversely, given any constant $\sigma_0$ and any bounded and never decreasing function $\Omega(x)$ continuous at $x = 0$, (121) defines the c.f. $f(t, \tau)$ of a variable $Z_\tau$ corresponding to a s.i.i. process.
Before proceeding to the proof of this theorem, we shall consider some simple particular cases. Suppose first that $\Omega(x)$ reduces to a constant, so that the last term in the second member of (121) disappears. Then it follows from (121) that
$F(x, \tau) = \Phi(x/(\sigma_0\sqrt{\tau})),$
so that $Z_\tau$ is, for every $\tau > 0$, normally distributed with the mean value 0 and the s.d. $\sigma_0\sqrt{\tau}$. This case is often called the Brownian movement process, a name referring to one of its important physical applications. Suppose on the other hand $\sigma_0 = 0$ and $\Omega(x) = \lambda c^2 \varepsilon(x - c)$, where $\lambda > 0$ and $c \ne 0$ are constants, and $\varepsilon(x)$ is defined by (17). Then (121) gives
$\log f(t, \tau) = \lambda\tau (e^{ict} - 1 - cit),$
so that the variable $Z_\tau + \lambda c \tau$ has the c.f.
$e^{\lambda\tau(e^{ict} - 1)}.$
According to (47) this corresponds to a distribution of the Poisson type. The corresponding process, which has important applications, e.g. in the theory of insurance risk, is known as the Poisson process.
¹ Kolmogoroff [2]. Cf. also de Finetti [2]. If the hypothesis $\alpha_1(\tau) = 0$ is omitted, we may apply the theorem to the variable $Z_\tau - \alpha_1(\tau)$, and choose for $\alpha_1(\tau)$ any solution (continuous or not) of the functional equation
$\alpha_1(\tau_1 + \tau_2) = \alpha_1(\tau_1) + \alpha_1(\tau_2).$
If we assume, e.g., that $\alpha_1(\tau)$ is bounded in some interval, however small, we necessarily have $\alpha_1(\tau) = c\tau$, where $c$ is a real constant. Lévy [2] studies the s.i.i. process without assuming the existence of the moments $\alpha_1(\tau)$ and $\alpha_2(\tau)$.
More generally, let $\Omega(x)$ be a step-function with a finite number of steps, none of which is situated at the point $x = 0$, and put $b = \int_{-\infty}^{\infty} x^{-1}\, d\Omega(x)$. Then it follows from (121) that the distribution of the variable $Z_\tau + b\tau$ may be regarded as composed of one normal component (arising from the term containing $\sigma_0$) and a number of independent Poisson distributions, each of which corresponds to one step of $\Omega(x)$.
In the general case, the distribution of $Z_\tau$ is always composed of the normal component $\Phi(x/(\sigma_0\sqrt{\tau}))$ and another component corresponding to the term containing $\Omega(x)$ in (121).
We now proceed to the proof of Theorem 27. Let us first consider the s.d. $\sqrt{\alpha_2(\tau)}$. From the fundamental relations (119) and (120) it follows that we have
$\alpha_2(\tau_1 + \tau_2) = \alpha_2(\tau_1) + \alpha_2(\tau_2).$
The only non-negative solution of this functional equation is, however,¹
(122) $\alpha_2(\tau) = \sigma^2 \tau,$
where $\sigma^2 \ge 0$ is a constant. From (122) we deduce
(123) $f(t, \Delta\tau) = 1 - \tfrac12 \vartheta \sigma^2 t^2 \Delta\tau$
with $|\vartheta| \le 1$, so that $f(t, \Delta\tau) \to 1$ as $\Delta\tau \to 0$. According to (120) it then follows that, for every fixed $t$, $f(t, \tau)$ is a continuous function of $\tau$.
From (120) we obtain further $f(t, 1/n) = \{f(t, 1)\}^{1/n}$, and hence for all rational $m/n$ we have $f(t, m/n) = \{f(t, 1)\}^{m/n}$. By continuity this result extends immediately to all $\tau > 0$, so that we have generally
1 Oi.. Hamel (1]" Hauadorti [1], p. 17;;.
(124) $f(t, \tau) = \{f(t, 1)\}^{\tau}.$
According to (123), the expression
(125) $\dfrac{f(t, \Delta\tau) - 1}{\Delta\tau} = \dfrac{\{f(t, 1)\}^{\Delta\tau} - 1}{\Delta\tau}$
is, for every fixed $t$, bounded as $\Delta\tau \to 0$. It follows that $f(t, 1) \ne 0$ for all real $t$, and thus the expression (125) converges uniformly in every finite $t$-interval to the limit
(126) $\lim_{\Delta\tau \to 0} \dfrac{f(t, \Delta\tau) - 1}{\Delta\tau} = \log f(t, 1),$
where $\log f(t, 1)$ denotes that branch of the multi-valued function which vanishes for $t = 0$ and is for all real $t$ uniquely determined by continuity.
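On the other hand we have
(127) $\dfrac{f(t, \Delta\tau) - 1}{\Delta\tau} = \dfrac{1}{\Delta\tau} \int_{-\infty}^{\infty} (e^{itx} - 1 - itx)\, dF(x, \Delta\tau).$
Putting
(128) $H(x, \Delta\tau) = \dfrac{1}{\Delta\tau} \int_{-\infty}^{x} y^2\, dF(y, \Delta\tau),$
$H(x, \Delta\tau)$ is a never decreasing function of $x$ such that
$H(-\infty, \Delta\tau) = 0, \quad H(+\infty, \Delta\tau) = \sigma^2.$
For every fixed $\Delta\tau > 0$, $H(x, \Delta\tau)$ is continuous at $x = 0$, and we have
$\dfrac{1}{\Delta\tau} \int_{-\infty}^{\infty} (e^{itx} - 1 - itx)\, dF(x, \Delta\tau) = \int_{-\infty}^{\infty} \dfrac{e^{itx} - 1 - itx}{x^2}\, dH(x, \Delta\tau),$
where, for $x = 0$, $(e^{itx} - 1 - itx)/x^2$ is to be interpreted as $-t^2/2$. According to (124), (126) and (127) we thus obtain
(129) $\log f(t, \tau) = \tau \lim_{\Delta\tau \to 0} \int_{-\infty}^{\infty} \dfrac{e^{itx} - 1 - itx}{x^2}\, dH(x, \Delta\tau).$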
A'T+O 00 x
Consider now the function H (x, aT) for a sequence of values
a1'r, L\eT) ... tending to zero. It is then always possible to choose
a subsequence Ant T, Ll
nt
T, such that the oorresponding fune...
A OLASS OF STOCHASTIO PROOBSSBS 90
tiona B (z, tT) tend to & limit H (3:), in &ll oontinuity points :e
of the latter. From (129) we then obtain
f
co e1l:J:litx
(130) logf(t, 1') =='1' 2 dB(:r:).
00 z
Obviously H (x) is a never decreasing function such that
We can, however, show that in both these relations the sign of
equality must hold. We obtain in fact from (130) for small values
oft logf(t,'T)=  t'Tt
2
{H(+co)H( oo)} +0(t
J
) ,
but on the other hand (122) gives
logf(t,T)= iat.rt
2
+o(tI),
so that we must have
B (00)=0, H( +00)=(12.
Let, now, denote the saltus of H (x) at the point :=0 (thus
o (12) and put
(131) n(z)=B
:(x) being defined by (17). Then we have
fi( 00)==0, n(+00) =oi=alaJ.
Further, n(%) is bounded, never decreasing and continuous at
z=O, and (121) follows immediately from (130), so that the :first
part of the theorem. is proved..
The latter part of the theoremis obvious in the particular case
when n(x) is a stepfunction with a finite number of steps.
(Of. the remarks made above.) Further, ifn(z) is any function
satisfying the conditions of the theorem, the second member of
(121) may be uniformly approximated by means of a, sequence of
stepfunctions converging to the limit n (x). By Theorem 11, the
corresponding dllf.'s tend to a limit which is itself a d.f., and the
second member of (121) is equal to the logarithm of the c.f. of
this limit. Thus (121) determines uniquely a d.f. F(z,r), and it
follows immediately from the form of (121) that the fundamental
96 A CLA.SS OF STOOHASTIC PROCESSES
relations (119) and (120) are satisfied, 80 that proof of
Theorem 27 is completed.
Since (XS (1') is finite, (130) may be twice differentiated with
respect t.o t, and we obtain
fa>e
1Jz
dH :;'logj(t, T).
But H (:e)/ut is It d.f. whioh is tlniquely determined by its a.f.
It follows that we must reach the same limit H (z) for every
sequence A
l
T, 4,.", a. tending to zero. This implies, however'I that
we have lim H(z, AT) =H(x) in every continuity point of H (x).
4..1>0
This leads to an interesting interpretation of Theorem 27. For
x< 0, we have by (128) and (131) in every continuity point of
a as AT40,
== tllI ="1 (X),
and for
F(x, AT) =f> U, A'r}ioJ"'OC
dOU
)=:
A:" :z; es :e es
This may be Mitten
F (x, 41")= fit (2:) 6.1"+0 (6.1"), (z< 0',
IF(x, AT) = fl. (z) dT+O (AT), (z> 0),
The probability that, during the infinitely small time AT, 8,
variation < x <0 ocours in the value of the variable Z., is thus
asymptotically equal to fil (x) AT, while the probability of a
variation > x> 0 is asymptotically equal to il. (2:) Ar
Thus the function (1 (x) determines the discontinuous part of
the variationofZ,., while obviously the constant Godetermines the
CO'TdinUC1U8 pan..
1
Further we have
J
"'O :r:2dnt (;t}+flxii dil
t
I_fflO dO (a:)
0
as (r)="=o1"+o1"1',
1 It should be noted that the d..f" F 1') is aJway. continllOUi with respect to T,
although variable Z1' mI.,. 8uifer diIconiinuoua :Jha.npI of value, if Q is not
identie&1ly zero.
A CLASS OF STOCHASTIC PROCESSES 97
so that the variance $\sigma^2(\tau)$ of $Z_\tau$ is the sum of one term due to the
continuous part of the variation and one term due to the
discontinuous part.
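The decomposition of the variance can be checked numerically. The following is only a minimal sketch, not part of the original argument. It assumes that $\Omega(x)$ is a step-function with finitely many jumps, none of them at $x = 0$, and it reads (121) as $\log f(t,\tau) = -\tfrac12\sigma_0^2\tau t^2 + \tau\int (e^{itx} - 1 - itx)\,x^{-2}\,d\Omega(x)$. The variance is then recovered as $-\partial^2 \log f/\partial t^2$ at $t = 0$ by a central difference. The names `log_cf`, `variance`, `sigma0_sq` and `jumps` are illustrative assumptions.

```python
import cmath

def log_cf(t, tau, sigma0_sq, jumps):
    """log f(t, tau) for a step-function Omega given as (x, mass) pairs, x != 0."""
    jump_sum = sum(mass * (cmath.exp(1j * t * x) - 1 - 1j * t * x) / x**2
                   for x, mass in jumps)
    return -0.5 * sigma0_sq * tau * t**2 + tau * jump_sum

def variance(tau, sigma0_sq, jumps, h=1e-3):
    """Approximate -(d^2/dt^2) log f(t, tau) at t = 0 by a central difference."""
    second = (log_cf(h, tau, sigma0_sq, jumps)
              - 2 * log_cf(0.0, tau, sigma0_sq, jumps)
              + log_cf(-h, tau, sigma0_sq, jumps)) / h**2
    return -second.real

tau, sigma0_sq = 2.0, 0.5
jumps = [(-1.0, 0.3), (2.0, 1.2)]                    # hypothetical jumps of Omega
continuous_part = sigma0_sq * tau                    # term due to sigma_0
discontinuous_part = tau * sum(m for _, m in jumps)  # term due to Omega
print(variance(tau, sigma0_sq, jumps), continuous_part + discontinuous_part)
```

Under these assumptions the two printed numbers should agree up to the error of the finite difference, the first being the numerical variance and the second the sum of the two separate contributions.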
3. By means of the remarks made in 1, it will be easily
understood that the s.d. process, as defined in 2, presents a
great analogy with the "case of equal components" in the
problem of addition of independent variables treated in Chapters
VI-VII.¹ Roughly speaking, we are here concerned not with a
sum, but with an integral, the elements of which are independent
random variables (cf. Lévy [2]).
It is then fairly obvious that our theorems bearing on
the case of equal components, as Theorems 20 and 25, should
hold, mutatis mutandis, also for the case of an s.d. process. In
fact, the variable $Z_\tau/(\sigma\sqrt{\tau})$ with the d.f.
$$\mathfrak{F}(x,\tau) = F(\sigma x\sqrt{\tau},\,\tau)$$
and the c.f.
$$\mathfrak{f}(t,\tau) = f\!\left(\frac{t}{\sigma\sqrt{\tau}},\,\tau\right)$$
is directly analogous to the previously considered variable
$(X_1 + \ldots + X_n)/(\sigma\sqrt{n})$ with the d.f. $\mathfrak{F}_n(x)$ and the c.f. $\mathfrak{f}_n(t)$.
Instead of the discontinuous parameter $n$, we are here concerned
with the continuous parameter $\tau$.
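The relation (121) may be written
$$\log f(t,\tau) = -\frac{\sigma^2\tau}{2}\,t^2 + \tau\int_{-\infty}^{\infty}\frac{e^{itx} - 1 - itx - \tfrac12 (itx)^2}{x^2}\,d\Omega(x).$$
Substituting here $t/(\sigma\sqrt{\tau})$ for $t$, we obtain
$$\log \mathfrak{f}(t,\tau) = -\frac{t^2}{2} + \tau\int_{-\infty}^{\infty}\frac{e^{\frac{itx}{\sigma\sqrt{\tau}}} - 1 - \frac{itx}{\sigma\sqrt{\tau}} - \frac12\left(\frac{itx}{\sigma\sqrt{\tau}}\right)^2}{x^2}\,d\Omega(x). \qquad (132)$$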
¹ If we omit the condition laid down at the beginning of 2 that the distribution
of the increase $Z_{\tau_1+\tau} - Z_{\tau_1}$ should be independent of $\tau_1$, we arrive at a more general
kind of random process related to the general problem of addition of independent
variables in the same way as the process here considered is related to the particular
case of equal components. Subject to appropriate conditions, Theorems 27-30 can
be generalized to this case. (For a generalization of Theorem 27 along these lines
cf. Lévy [2], who considers also the case when $\sigma^2(\tau)$ is not finite.)
In a way which is closely similar to the proof of Theorem 20, it is
now easily shown that the last term of this expression tends to
zero as $\tau \to \infty$, uniformly in every finite $t$-interval. We thus have
the following theorem directly analogous to Theorem 20.
Theorem 28.¹ As $\tau \to \infty$, the d.f. $\mathfrak{F}(x,\tau)$ of the variable $Z_\tau/(\sigma\sqrt{\tau})$
tends to the normal function $\Phi(x)$.
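A numerical illustration of this convergence may be helpful. It is a sketch under stated assumptions, not part of the original proof. Take for $\Omega(x)$ a single jump of height $m$ at the point $x = a \neq 0$, and read (121) as $\log f(t,\tau) = -\tfrac12\sigma_0^2\tau t^2 + \tau m\,(e^{ita} - 1 - ita)/a^2$, so that $Z_\tau$ has the variance $(\sigma_0^2 + m)\tau$. The logarithm of the c.f. of $Z_\tau/(\sigma\sqrt{\tau})$ should then approach $-t^2/2$ uniformly on a finite $t$-interval as $\tau$ grows. The function names and the parameter values below are hypothetical.

```python
import cmath
import math

def log_cf(t, tau, sigma0_sq, m, a):
    """log f(t, tau) when Omega has one jump of height m at x = a (a != 0)."""
    jump_part = m * (cmath.exp(1j * t * a) - 1 - 1j * t * a) / a**2
    return -0.5 * sigma0_sq * tau * t**2 + tau * jump_part

def log_cf_normalized(t, tau, sigma0_sq, m, a):
    """log c.f. of Z_tau / (sigma sqrt(tau)), with sigma^2 = sigma0^2 + m."""
    sigma = math.sqrt(sigma0_sq + m)
    return log_cf(t / (sigma * math.sqrt(tau)), tau, sigma0_sq, m, a)

def max_deviation(tau, sigma0_sq=0.5, m=2.0, a=1.5, t_max=3.0, steps=61):
    """Largest |log c.f. + t^2/2| on a grid covering [-t_max, t_max]."""
    grid = [-t_max + 2 * t_max * j / (steps - 1) for j in range(steps)]
    return max(abs(log_cf_normalized(t, tau, sigma0_sq, m, a) + 0.5 * t**2)
               for t in grid)

for tau in (1.0, 100.0, 10000.0):
    print(tau, max_deviation(tau))
```

The printed deviations should decrease as $\tau$ increases, roughly like $1/\sqrt{\tau}$ for this choice of $\Omega$.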
In order to obtain an asymptotic expansion of $\mathfrak{F}(x,\tau)$ for
large values of $\tau$ analogous to the expansion given by Theorem 25,
we shall suppose henceforth that there is an integer $k \ge 3$, such
that the absolute moment of order $k-2$ of the function $\Omega(x)$
occurring in (121) is finite. We put for $\nu = 3, \ldots, k$
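$$\lambda_\nu = \frac{1}{\sigma^\nu}\int_{-\infty}^{\infty} x^{\nu-2}\,d\Omega(x),$$
$$\rho_\nu = \frac{1}{\sigma^\nu}\int_{-\infty}^{\infty} |x|^{\nu-2}\,d\Omega(x), \qquad (133)$$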
$$T_{k\tau} = \frac{\sqrt{\tau}}{4\,\rho_k^{1/k}}.$$
These notations are analogous to those introduced in VII, 3, by
(84) and (88). We can now prove the following lemmas, which
are directly analogous to Lemmas 2 and 3.
Lemma 5. For $|t| \le T_{k\tau}$ we have
t
l
kSP.. (t) e
ejf(t,T)==l+
p1 7" kr
where $P_\nu(it)$ is the polynomial of degree $3\nu$ in $(it)$, which is obtained
by replacing in the polynomial $P_{\nu n}(it)$ of Lemma 2 the quantities
$\lambda_{\nu n}$ defined by (84) by the quantities $\lambda_\nu$ defined by (133).
Lemma 6. For $|t| \le T_{k\tau}$ we have
t'
The proofs of these lemmas, which are based on the relation
(132), are so closely similar to the proofs of Lemmas 2 and 3 that
¹ Lévy [2].
they need not be explicitly given here. Finally, putting in analogy
to (101)
$$\mathfrak{F}(x,\tau) = \Phi(x) + \sum_{\nu=1}^{k-3}\frac{p_{3\nu-1}(x)}{\tau^{\nu/2}}\,e^{-x^2/2} + R_k(x,\tau), \qquad (134)$$
where $p_{3\nu-1}(x)$ is a polynomial of degree $3\nu-1$ independent
of $\tau$, we obtain in the same way as in VII, 3 the following fundamental
Lemma corresponding to Lemma 4.
Lemma 7. For $0 < \omega < 1$, we have for all real $x$ and all $h > 0$
(U f:+A (1/X)oo1 I Idt+
If the integral in the second member of this relation is convergent
for $\omega = 0$, we further have
1f{t,T) I 1 )
Ric (x, 'T') == ail: TM t dt +Tt"l!
Proceeding in the same way as in VII, 4-5, we can now use
Lemma 7 to obtain information as to the behaviour of $\mathfrak{F}(x,\tau)$ for
large values of $\tau$. In the first place, we have the following theorem,
the proof of which is directly analogous to that of Theorem 24
and need not be given here.
Theorem 29. If the quantity $\rho_3$ defined by (133) is finite,
we have
$$|\mathfrak{F}(x,\tau) - \Phi(x)| < C\rho_3\,\frac{\log\tau}{\sqrt{\tau}},$$
where $C$ is an absolute constant.
Further, we can now prove the following theorem which gives
an asymptotic expansion of $\mathfrak{F}(x,\tau)$ analogous to that obtained
in Theorem 25.
Theorem 30.¹ Suppose that the variable $Z_\tau$ considered in
Theorem 27 satisfies the conditions:
¹ Cramér [4].
(I) The absolute moment $\rho_k$ as defined by (133) is finite for some
integer $k \ge 3$;
(II) For some $\tau > 0$, the d.f. $F(x,\tau)$ satisfies the condition (C)
of VII, 5.
For the d.f. $\mathfrak{F}(x,\tau)$ of the variable $Z_\tau/(\sigma\sqrt{\tau})$, we then have the
expansion (134) with
(135) IRj: (x, T) 1<
$M$ being independent of $\tau$ and $x$.
Further, any of the following conditions (IIa) and (IIb) is
sufficient for the validity of (II):
(IIa) $\sigma_0 > 0$,
(IIb) $\Omega(x) = \Omega_1(x) + \Omega_2(x)$, where $\Omega_1(x)$ and $\Omega_2(x)$ are both
never decreasing, while $\Omega_1(x)$ is absolutely continuous and does not
reduce to a constant.
If (II) is satisfied for a single $\tau > 0$, it follows from (121) that
the same thing holds for every $\tau > 0$, and thus in particular for
$\tau = 1$. From Lemma 7 we obtain according to (124)
r.c+h
wJ.c
(
i
CC> If (t) 1) I'" pI )
= E)Ie ('ft.lJ.rtll tQJ+l til +7._1)/1 ,
T
which corresponds to (110). By means of the condition (C) we
then obtain
jwJ:+1s (lI X)1BA;(Y,T)d,1 <M(S:+.,<I)'I) ,
and the rest of the proof of (135) is perfectly similar to the proof
of Theorem 25. The last part of the theorem is easily proved by
considering the real part of $\log f(t,\tau)$ according to (121).
THIRD PART
DISTRIBUTIONS IN $R_k$
The object of this Part is to show that many of the results
obtained above for distributions in a one-dimensional space can
be generalized to any number of dimensions. We shall, in the
main, restrict ourselves to a brief discussion of some typical
generalizations of this kind.
CHAPTER IX¹
GENERAL PROPERTIES.
CHARACTERISTIC FUNCTIONS
1. For a distribution in a one-dimensional space, the only
possible discontinuities arise from discrete points which, in terms
of the mechanical interpretation used in Chapter II, are bearers
of positive quantities of mass. As soon as the number of dimensions
exceeds unity, the question of the discontinuities becomes
more complicated. Thus in a $k$-dimensional space, the
whole mass may be concentrated to a subspace of less than $k$
dimensions (line, surface, ...), though there is no single point that
carries a positive quantity of mass.
Given a random variable $X = (\xi_1, \ldots, \xi_k)$ in the $k$-dimensional
space $R_k$, we denote as in Chapter II the corresponding pr.f. by
$P(S)$ and the d.f. by $F(x_1, \ldots, x_k)$. Just as in the case $k = 1$, there
can at most be a finite number of points $A$ such that $P(A) > a > 0$,
and hence at most an enumerable set of points $B$ such that
$P(B) > 0$. We shall call this set the point spectrum of the
distribution.
¹ The general theory of completely additive set functions in a $k$-dimensional space
has been developed by Radon [1], Bochner [2] and Haviland [1, 2, 3]. A comprehensive
account of the principal results of the theory is given by Jessen-Wintner [1].
According to II, 3, every component $\xi_i$ of $X$ is itself a random
variable, and the corresponding (one-dimensional) distribution
is found by projecting the original distribution on the axis of $\xi_i$.
Let $Q_i$ be the set of real numbers which are discontinuities of
the distribution of $\xi_i$, and form the (at most enumerable) set
$Q = Q_1 + \ldots + Q_k$. Further, let $J$ denote a $k$-dimensional interval
and consider the probability $P(J)$ of the "event" $X \subset J$ as a
function of the variables $a_i$ and $b_i$. It is then obvious that, as
long as no $a_i$ and no $b_i$ belong to the set $Q$, $P(J)$ is a continuous
function of these variables.
Any interval $J$ such that no $a_i$ and no $b_i$ belongs to $Q$ will be
called a continuity interval of the distribution. If two distributions
coincide for every interval which is a continuity interval for both
distributions, it follows from Theorem 2 that the corresponding
d.f.'s are always equal, and thus by the same theorem the
distributions are identical.
If a sequence of pr.f.'s $\{P_n(S)\}$ converges to a completely
additive set function $P^*(S)$ in every continuity interval of the
latter, we shall say simply that $\{P_n(S)\}$ converges to $P^*(S)$. The
symbol $P_n(S) \to P^*(S)$ will be used only in this sense. From every
sequence $\{P_n(S)\}$ it is possible¹ to choose a subsequence which
converges in this way to a limit $P^*(S)$. Obviously we cannot in
general assert that $P^*(S)$ is a probability function, as we only
know that $0 \le P^*(R_k) \le 1$.
Any pr.f. can always² (cf. Theorem 4) be uniquely represented
as a sum of three components
$$P(S) = a_I P_I(S) + a_{II} P_{II}(S) + a_{III} P_{III}(S), \qquad (136)$$
where $a_I$, $a_{II}$, $a_{III}$ are non-negative numbers with the sum 1, and
$P_I$, $P_{II}$, $P_{III}$ are pr.f.'s such that
$P_I(S)$ is absolutely continuous; $P_I(S) = \int_S D(X)\,dX$, where
¹ Radon [1]. This is proved in practically the same way as in the one-dimensional
case.
² Radon [1].
$D(X)$ is a non-negative point function in $R_k$ which is called the
probability density or density function of the distribution defined
by $P_I(S)$.
$P_{II}(S)$ is purely discontinuous; $P_{II}(S) = 1$ if $S$ coincides with
the point spectrum of $P(S)$.
$P_{III}(S)$ is "singular"; the point spectrum of $P_{III}(S)$ is empty
and there exists a Borel set $S$ of measure zero such that $P_{III}(S) = 1$.
2. A real-valued function $g(X)$ which is finite and uniquely
defined for all points¹ of $R_k$ is, according to II, 3, a random
variable with a uniquely defined one-dimensional distribution.
By (15a) we have for the mean value of this variable the expression
$$E(g(X)) = \int_{R_k} g(X)\,dP,$$
subject to the condition that the integral is absolutely convergent.
The mean values of the particular functions
$$g(X) = \xi_1^{\nu_1} \ldots \xi_k^{\nu_k} \qquad (\nu_i = 0, 1, 2, \ldots)$$
are called the moments of the distribution. We shall use the
notations
$$m_i = E(\xi_i),$$
$$\mu_{ij} = E\bigl((\xi_i - m_i)(\xi_j - m_j)\bigr),$$
$$\sigma_i^2 = D^2(\xi_i) = \mu_{ii} = E\bigl((\xi_i - m_i)^2\bigr).$$
Putting
$$r_{ij} = \frac{\mu_{ij}}{\sigma_i\sigma_j},$$
it is then easily shown that we have $-1 \le r_{ij} \le 1$, and that the
extreme values $r_{ij} = \pm 1$ can only be reached if, in the
two-dimensional distribution of the "combined" variable $(\xi_i, \xi_j)$,
the whole mass is situated on one of the straight lines
$$\frac{\xi_i - m_i}{\sigma_i} = \pm\frac{\xi_j - m_j}{\sigma_j}.$$
$r_{ij}$ is called the coefficient of correlation between $\xi_i$ and $\xi_j$, and
plays an important part in the statistical applications.
More generally, the quadratic form
$$\sum_{i,j}\mu_{ij}u_iu_j = \int_{R_k}\Bigl(\sum_i u_i(\xi_i - m_i)\Bigr)^2 dP$$
¹ Except possibly for points forming a set $\Sigma$ such that $P(\Sigma) = 0$.
is never negative, which implies that the determinant $\|\mu_{ij}\|$, as
well as all its principal minors, is $\ge 0$.
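The moment notations of this paragraph may be illustrated on a discrete distribution. The sketch below is only an illustration under stated assumptions. It represents a distribution in $R_2$ as a list of (point, mass) pairs and computes $m_i$, $\mu_{ij}$, the coefficient of correlation $r_{12}$, the second-order determinant of the $\mu_{ij}$ and the quadratic form $\sum \mu_{ij}u_iu_j$. The two distributions `scattered` and `on_a_line`, and all function names, are hypothetical.

```python
import math

def mean(dist, i):
    """m_i = E(xi_i) for a discrete distribution given as (point, mass) pairs."""
    return sum(p * x[i] for x, p in dist)

def mu(dist, i, j):
    """Central moment mu_ij = E((xi_i - m_i)(xi_j - m_j))."""
    mi, mj = mean(dist, i), mean(dist, j)
    return sum(p * (x[i] - mi) * (x[j] - mj) for x, p in dist)

def correlation(dist, i, j):
    """r_ij = mu_ij / (sigma_i sigma_j)."""
    return mu(dist, i, j) / math.sqrt(mu(dist, i, i) * mu(dist, j, j))

def quadratic_form(dist, u):
    """Sum over i, j of mu_ij u_i u_j."""
    k = len(u)
    return sum(mu(dist, i, j) * u[i] * u[j] for i in range(k) for j in range(k))

# Hypothetical distributions in R_2; the masses sum to 1 in each case.
scattered = [((0.0, 0.0), 0.2), ((1.0, 2.0), 0.3), ((2.0, 1.0), 0.1), ((3.0, 4.0), 0.4)]
on_a_line = [((0.0, 1.0), 0.2), ((1.0, 3.0), 0.5), ((2.0, 5.0), 0.3)]  # xi_2 = 2 xi_1 + 1

det = mu(scattered, 0, 0) * mu(scattered, 1, 1) - mu(scattered, 0, 1) ** 2
print(correlation(scattered, 0, 1), det)
print(correlation(on_a_line, 0, 1))
```

For `scattered` the correlation lies strictly between $-1$ and $1$ and the determinant is positive. For `on_a_line`, where the whole mass is situated on a straight line of positive slope, the correlation equals $1$ up to rounding.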
3. The characteristic function of a distribution in $R_k$ is the mean
value
$$f(t_1, \ldots, t_k) = E\bigl(e^{i(t_1\xi_1 + \ldots + t_k\xi_k)}\bigr) = \int_{R_k} e^{i(t_1\xi_1 + \ldots + t_k\xi_k)}\,dP. \qquad (137)$$
Unless explicitly stated otherwise, the $t_i$ will be considered as
real variables, so that $f$ may be considered as a function of the
real point $(t_1, \ldots, t_k)$ in $R_k$.
Obviously $f$ is a uniformly continuous function of the $t_i$ in the
whole space, and we have always $|f| \le 1$. The generalization of
Theorems 6-8 to any number of dimensions is comparatively
easy, and will not be dealt with here.
If all moments up to a certain order are finite, we have for
small values of $|t_i|$ an expansion of $f$ analogous to (25). If, in
particular, all $\mu_{ij}$ are finite and all $m_i$ are equal to zero, we have
$$f(t_1, \ldots, t_k) = 1 - \frac12\sum_{i,j}\mu_{ij}t_it_j + o\Bigl(\sum_i t_i^2\Bigr). \qquad (138)$$
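The second-order expansion of the c.f. can be checked on a small example. The sketch below is not taken from the text. It assumes a hypothetical discrete distribution in $R_2$ with vanishing first moments and compares $f(t_1, t_2)$ with $1 - \tfrac12\sum\mu_{ij}t_it_j$ along the ray $t = (s, 2s)$. If the remainder is of smaller order than $t_1^2 + t_2^2$, the printed ratio must tend to zero with $s$. The names `dist`, `cf`, `quadratic_approx` and `relative_remainder` are illustrative.

```python
import cmath

# Hypothetical distribution in R_2 with m_1 = m_2 = 0, given as (point, mass) pairs.
dist = [((1.0, 1.0), 0.25), ((-1.0, -1.0), 0.25),
        ((1.0, -1.0), 0.10), ((-1.0, 1.0), 0.10), ((0.0, 0.0), 0.30)]

def cf(t):
    """f(t_1, t_2) = E exp(i (t_1 xi_1 + t_2 xi_2))."""
    return sum(p * cmath.exp(1j * (t[0] * x[0] + t[1] * x[1])) for x, p in dist)

def mu(i, j):
    """Second-order moment mu_ij (the first moments vanish for this example)."""
    return sum(p * x[i] * x[j] for x, p in dist)

def quadratic_approx(t):
    """1 - (1/2) * sum of mu_ij t_i t_j."""
    return 1 - 0.5 * sum(mu(i, j) * t[i] * t[j] for i in range(2) for j in range(2))

def relative_remainder(s):
    """|f - approximation| / (t_1^2 + t_2^2) along the ray t = (s, 2s)."""
    t = (s, 2 * s)
    return abs(cf(t) - quadratic_approx(t)) / (t[0] ** 2 + t[1] ** 2)

for s in (1e-1, 1e-2, 1e-3):
    print(s, relative_remainder(s))
```

Since this particular distribution is symmetric about the origin, the remainder is in fact of the fourth order in $s$, so the ratio decreases roughly like $s^2$.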
We shall now consider the generalization of Theorem 9, and
for the sake of formal simplicity we take first the case of a
two-dimensional space $R_2$. The generalized theorem will then be as
follows: If the interval $J$ defined by
$$x_1 < \xi_1 \le x_1 + h_1, \qquad x_2 < \xi_2 \le x_2 + h_2$$
is a continuity interval of the distribution, we have
$$P(J) = F(x_1+h_1,\,x_2+h_2) - F(x_1+h_1,\,x_2) - F(x_1,\,x_2+h_2) + F(x_1,\,x_2) \qquad (139)$$
1 JT IT leitlhlleUahs
= lim Aa 't,. 't f(t
1
t
a
)dt
1
dt
a