
Elementary Probability Theory

CLASSICAL DEFINITION OF PROBABILITY


Suppose an event E can happen in h ways out of a total of n possible equally
likely ways. Then the probability of occurrence of the event (called its success) is denoted by

p = Pr{E} = h/n

The probability of non-occurrence of the event (called its failure) is denoted by

q = Pr{not E} = (n − h)/n = 1 − h/n = 1 − p = 1 − Pr{E}

Thus p + q = 1, or

Pr{E} + Pr{not E} = 1.

The event not E is sometimes denoted by Ē, ~E, or E′.

Note that the probability of an event is a number between 0 and 1. If the event
cannot occur, its probability is 0. If it must occur, i.e. its occurrence is certain, its
probability is 1.
If p is the probability that an event will occur, the odds in favour of its
happening are p : q (read "p to q"); the odds against its happening are q : p.
Thus the odds against a 3 or 4 in a single toss of a fair die are

q : p = (2/3) : (1/3) = 2 : 1,     i.e. 2 to 1.
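
To make the counting concrete, here is a minimal Python sketch of the die example above (the event "a 3 or 4"), computing p, q, and the odds directly from the h favourable and n possible cases:

```python
from fractions import Fraction

# Sample space of a fair die: n = 6 equally likely outcomes.
outcomes = [1, 2, 3, 4, 5, 6]

# Event E: "a 3 or 4" can happen in h = 2 of the n = 6 ways.
h = sum(1 for x in outcomes if x in (3, 4))
n = len(outcomes)

p = Fraction(h, n)    # Pr{E} = h/n = 1/3
q = 1 - p             # Pr{not E} = 1 - p = 2/3

print(p, q)                                # 1/3 2/3
print(f"odds against E are {q / p} to 1")  # 2 to 1
```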

RELATIVE FREQUENCY DEFINITION OF PROBABILITY


The above definition of probability has a disadvantage in that the words "equally
likely" are vague. In fact, since these words seem to be synonymous with "equally
probable", the definition is circular because we are essentially defining probability in
terms of itself. For this reason a statistical definition of probability has been advocated by
some people. According to this, the estimated probability, or empirical probability, of an
event is taken as the relative frequency of occurrence of the event when the number of
observations is very large. The probability itself is the limit of the relative frequency as the
number of observations increases indefinitely.
The statistical definition, although useful in practice, has difficulties from a
mathematical point of view, since an actual limiting number may not really exist. For this
reason, modern probability theory has been developed axiomatically; in this approach probability
is an undefined concept, much the same as point and line are undefined in geometry.
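
The relative frequency idea is easy to illustrate by simulation. The following sketch (an illustration, not part of the original text) estimates the probability of "a 3 or 4" from repeated tosses of a simulated fair die; the relative frequency settles near the classical value 1/3 as the number of tosses grows:

```python
import random

random.seed(0)

def relative_frequency(num_tosses):
    """Empirical probability of 'a 3 or 4' in num_tosses rolls of a fair die."""
    hits = sum(1 for _ in range(num_tosses) if random.randint(1, 6) in (3, 4))
    return hits / num_tosses

for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))   # approaches 1/3 ≈ 0.3333 as n grows
```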

CONDITIONAL PROBABILITY; INDEPENDENT AND DEPENDENT EVENTS


If E1 and E2 are two events, the probability that E2 occurs given that E1 has occurred is denoted by Pr{E2|E1}, or Pr{E2 given E1}, and is called the conditional probability of E2 given that E1 has occurred.

If the occurrence or non-occurrence of E1 does not affect the probability of occurrence of E2, then Pr{E2|E1} = Pr{E2} and we say that E1 and E2 are independent events; otherwise they are dependent events.


If we denote by E1E2 the event that "both E1 and E2 occur", sometimes called a compound event, then

Pr{E1E2} = Pr{E1} Pr{E2|E1}     (1)

In particular,

Pr{E1E2} = Pr{E1} Pr{E2}     for independent events     (2)

For three events E1, E2, E3 we have

Pr{E1E2E3} = Pr{E1} Pr{E2|E1} Pr{E3|E1E2}     (3)

i.e. the probability of occurrence of E1, E2, and E3 is equal to the probability of E1 times the probability of E2 given that E1 has occurred, times the probability of E3 given that both E1 and E2 have occurred. In particular,

Pr{E1E2E3} = Pr{E1} Pr{E2} Pr{E3}     for independent events     (4)

In general, if E1, E2, E3, ..., En are n independent events having respective probabilities p1, p2, p3, ..., pn, then the probability of occurrence of E1 and E2 and E3 and ... and En is p1 p2 p3 ··· pn.
MUTUALLY EXCLUSIVE EVENTS


Two or more events are called mutually exclusive if the occurrence of any one of them excludes the occurrence of the others. Thus if E1 and E2 are mutually exclusive events, Pr{E1E2} = 0.

If E1 + E2 denotes the event that "either E1 or E2 or both occur", then

Pr{E1 + E2} = Pr{E1} + Pr{E2} − Pr{E1E2}     (5)

In particular,

Pr{E1 + E2} = Pr{E1} + Pr{E2}     for mutually exclusive events     (6)

As an extension of this, if E1, E2, ..., En are n mutually exclusive events having respective probabilities of occurrence p1, p2, ..., pn, then the probability of occurrence of either E1 or E2 or ... or En is p1 + p2 + ··· + pn.
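
Relations (5) and (6) can likewise be verified by counting. The following sketch uses a single toss of a fair die, with E1 = "an even number" and E2 = "a number greater than 3" (events chosen only for illustration):

```python
from fractions import Fraction

die = range(1, 7)                    # single toss of a fair die

def pr(event):
    return Fraction(sum(1 for x in die if event(x)), 6)

even = lambda x: x % 2 == 0          # E1: an even number
high = lambda x: x > 3               # E2: a number greater than 3

lhs = pr(lambda x: even(x) or high(x))                          # Pr{E1 + E2}
rhs = pr(even) + pr(high) - pr(lambda x: even(x) and high(x))   # relation (5)
print(lhs, rhs)                      # 2/3 2/3

# Mutually exclusive events: "a 1 or 2" and "a 5" share no outcomes,
# so the probability of either one is just the sum 1/3 + 1/6 (relation (6)).
print(pr(lambda x: x in (1, 2, 5)))  # 1/2
```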

DISCRETE PROBABILITY DISTRIBUTIONS


If a variable X can assume a discrete set of values X1, X2, ..., XK with respective probabilities p1, p2, ..., pK, where p1 + p2 + ··· + pK = 1, we say that a discrete probability distribution for X has been defined. The function p(X), which has the respective values p1, p2, ..., pK for X = X1, X2, ..., XK, is called the probability function or frequency function of X. Because X can assume certain values with given probabilities, it is often called a discrete random variable. A random variable is also
known as a chance variable or stochastic variable.
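
As an illustration, the sketch below builds the probability function p(X) for the sum of two fair dice (an example chosen here, not prescribed by the text) by counting the 36 equally likely outcomes, and checks that the probabilities sum to one:

```python
from fractions import Fraction
from itertools import product

# Discrete distribution of X = sum of two fair dice, built by counting
# the 36 equally likely outcomes.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1

p = {x: Fraction(c, 36) for x, c in counts.items()}   # probability function p(X)

print(p[7])               # 1/6, the most likely sum
print(sum(p.values()))    # 1 -- the probabilities p1 + ... + pK add to one
```
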
Note that this is analogous to a relative frequency distribution with probabilities
replacing relative frequencies. Thus we can think of probability distributions as theoretical
or ideal limiting forms of relative frequency distributions when the number of observations
is made very large. For this reason we can regard probability distributions as distributions
for populations, whereas relative frequency distributions are distributions of samples drawn
from this population.

The probability distribution can be represented graphically by plotting p(X) against X, as for relative frequency distributions. See Problem 6.11.


By cumulating probabilities we obtain cumulative probability distributions which

are analogous to cumulative relative frequency distributions. The function associated with
this distribution is sometimes called a distribution function.

CONTINUOUS PROBABILITY DISTRIBUTIONS


The above ideas can be extended to the case where the variable X may assume a
continuous set of values. The relative frequency polygon of a sample becomes, in the
theoretical or limiting case of a population, a continuous curve such as shown in Fig. 6-1,
whose equation is Y = p(X). The total area under this curve bounded by the X axis is equal
to one, and the area under the curve between the lines X = a and X = b (shaded in the figure)
gives the probability that X lies between a and b, which can be denoted by
Pr{a < X < b}.
We call p(X) a probability density function, or briefly a density function, and when such a function is given we say that a continuous probability distribution for X has been defined. The variable X is then often called a continuous random variable.

As in the discrete case, we can define cumulative probability distributions and the
associated distribution functions.
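
The area interpretation can be illustrated numerically. The sketch below assumes a simple density, p(X) = 2X on 0 ≤ X ≤ 1 (an arbitrary choice for illustration), and approximates the total area and Pr{a < X < b} with a trapezoidal rule:

```python
# Illustrative density (an assumption, not from the text): p(X) = 2X on 0 <= X <= 1.
def p(x):
    return 2 * x

def area(f, a, b, steps=100_000):
    """Trapezoidal approximation of the area under f between X = a and X = b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h

print(area(p, 0.0, 1.0))    # ~1.0     total area under the density curve
print(area(p, 0.25, 0.5))   # ~0.1875  = Pr{0.25 < X < 0.5}
```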

MATHEMATICAL EXPECTATION
If p is the probability that a person will receive a sum of money S, the mathematical expectation, or simply the expectation, is defined as pS.


The concept of expectation is easily extended. If X denotes a discrete random variable which can assume the values X1, X2, ..., XK with respective probabilities p1, p2, ..., pK, where p1 + p2 + ··· + pK = 1, the mathematical expectation of X, or simply the expectation of X, denoted by E(X), is defined as

E(X) = p1X1 + p2X2 + ··· + pKXK = Σ(j=1 to K) pj Xj = Σ pX     (7)

If the probabilities pj in this expectation are replaced by the relative frequencies fj/N, where N = Σ fj, the expectation reduces to (Σ fX)/N, which is the arithmetic mean of a sample of size N in which X1, X2, ..., XK appear with these relative frequencies. As N gets larger and larger, the relative frequencies fj/N approach the probabilities pj. Thus we are led to the interpretation that E(X) represents the mean of the population from which the sample is drawn. If we call m the sample mean, we can denote the population mean by the corresponding Greek letter μ (mu).

Expectation can also be defined for continuous random variables, but the definition
requires use of the calculus.
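
For a concrete check of definition (7), the sketch below computes E(X) for one toss of a fair die and shows the sample mean of a simulated sample drifting toward it as the sample size N grows:

```python
import random
from fractions import Fraction

# E(X) = p1*X1 + ... + pK*XK for one toss of a fair die: each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
expectation = sum(Fraction(1, 6) * x for x in faces)
print(expectation)                      # 7/2

# The sample mean m approaches E(X) as the sample size N grows.
random.seed(1)
for N in (100, 100_000):
    sample = [random.choice(faces) for _ in range(N)]
    print(N, sum(sample) / N)           # drifts toward 3.5
```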

RELATION BETWEEN POPULATION AND SAMPLE MEAN AND VARIANCE


If we select a sample of size N at random from a population (i.e. we assume all
such samples are equally probable), then it is possible to show that the expected value of
the sample mean m is the population mean μ.

It does not follow, however, that the expected value of any quantity computed from
a sample is the corresponding population quantity. For example, the expected value of the
sample variance as we have defined it is not the population variance, but (N − 1)/N times
this variance. This is why some statisticians choose to define the sample variance as our
variance multiplied by N/(N − 1).
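
This bias is easy to see in a small simulation (a sketch under assumed conditions, with a fair die as the population): the average of many sample variances computed with divisor N comes out close to (N − 1)/N times the population variance.

```python
import random

random.seed(2)
population = [1, 2, 3, 4, 5, 6]                       # fair die; variance 35/12
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

N, trials = 5, 100_000
total = 0.0
for _ in range(trials):
    sample = [random.choice(population) for _ in range(N)]
    m = sum(sample) / N
    total += sum((x - m) ** 2 for x in sample) / N    # sample variance, divisor N

print(pop_var)                    # 2.9167...
print((N - 1) / N * pop_var)      # 2.3333...
print(total / trials)             # close to 2.33, not to 2.92
```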

COMBINATORIAL ANALYSIS
In obtaining probabilities of complex events an enumeration of cases is often
difficult, tedious, or both. To facilitate the labour involved, use is made of basic principles
studied in a subject called combinatorial analysis.

FUNDAMENTAL PRINCIPLE
If an event can happen in any one of n1 ways and if, when this has occurred, another event can happen in any one of n2 ways, then the number of ways in which both events can happen in the specified order is n1·n2.
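
A short sketch with itertools.product (a die followed by a coin, an arbitrary illustration) confirms the count n1·n2:

```python
from itertools import product

# A die can fall in n1 = 6 ways; a coin can then fall in n2 = 2 ways.
die = [1, 2, 3, 4, 5, 6]
coin = ["H", "T"]

pairs = list(product(die, coin))          # every (die, coin) outcome in order
print(len(pairs), len(die) * len(coin))   # 12 12
```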

FACTORIAL

Factorial n, denoted by n!, is defined as

n! = n(n − 1)(n − 2) ··· 1     (8)

Thus 5! = 5·4·3·2·1 = 120 and 4!3! = (4·3·2·1)(3·2·1) = 144. It is convenient to define 0! = 1.

PERMUTATIONS
A permutation of n different objects taken r at a time is an arrangement of r out of the n objects with attention given to the order of arrangement. The number of permutations of n objects taken r at a time is denoted by nPr, Pn,r, or P(n, r) and is given by

nPr = n(n − 1)(n − 2) ··· (n − r + 1) = n!/(n − r)!     (9)

In particular, the number of permutations of n objects taken n at a time is

nPn = n(n − 1)(n − 2) ··· 1 = n!

The number of permutations of n objects of which n1 are alike, n2 are alike, ... is

n!/(n1! n2! ···)     where n = n1 + n2 + ···     (10)

COMBINATIONS

A combination of n different objects taken r at a time is a selection of r out of the n objects with no attention given to the order of arrangement. The number of combinations of n objects taken r at a time is denoted by nCr, Cn,r, C(n, r), or the binomial coefficient symbol (n r), and is given by

nCr = n(n − 1) ··· (n − r + 1)/r! = n!/(r!(n − r)!) = nPr/r!     (11)

We have nCr = nC(n−r). Thus 20C17 = 20C3 = (20·19·18)/3! = 1140. The number of combinations of n objects taken 1, or 2, ..., or n at a time is

nC1 + nC2 + ··· + nCn = 2^n − 1.
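
Python's math module provides factorial, perm, and comb, which can be used to check formulas (8), (9), and (11) and the identities above (the particular numbers are only illustrations):

```python
from math import comb, factorial, perm

print(factorial(5), factorial(4) * factorial(3))   # 120 144    (equation (8))

# nPr = n!/(n-r)! and nCr = n!/(r!(n-r)!)
print(perm(10, 3), factorial(10) // factorial(7))  # 720 720    (equation (9))
print(comb(20, 17), comb(20, 3))                   # 1140 1140  (nCr = nC(n-r))

# Combinations of n objects taken 1, or 2, ..., or n at a time: 2^n - 1 in all.
n = 6
print(sum(comb(n, r) for r in range(1, n + 1)), 2 ** n - 1)   # 63 63
```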

STIRLING'S APPROXIMATION TO n!

When n is large, a direct evaluation of n! is impractical. In such cases use is made of an approximate formula due to Stirling, namely

n! ≈ √(2πn) n^n e^(−n)     (12)

where e = 2.71828... is the natural base of logarithms. See Prob. 6.31.
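
A quick comparison of formula (12) with the exact factorial (a sketch using Python's math module) shows the ratio of the approximation to n! approaching 1 as n grows:

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Stirling's approximation: n! ~ sqrt(2*pi*n) * n**n * e**(-n)."""
    return sqrt(2 * pi * n) * n ** n * e ** (-n)

for n in (5, 10, 20):
    exact = factorial(n)
    approx = stirling(n)
    print(n, exact, round(approx / exact, 4))   # ratio climbs toward 1
```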

RELATION OF PROBABILITY TO POINT SET THEORY


In modern probability theory, we think of all possible outcomes or results of an
experiment, game, etc., as points in a space (which can be of one, two, three, etc.,
dimensions) called a sample space S. If S contains only a finite number of points, then with
each point we can associate a non-negative number, called a probability, such that the sum
of all the numbers corresponding to all points in S adds to one. An event is a set or
collection of points in S, such as indicated by E1 or E2 in Fig. 6-2, called an Euler diagram
or Venn diagram.


The event E1 + E2 is the set of points which are either in E1 or E2 or both, while the event E1E2 is the set of points common to both E1 and E2. Then the probability of an event such as E1 is the sum of the probabilities associated with all points contained in the set E1. Similarly, the probability of E1 + E2, denoted by Pr{E1 + E2}, is the sum of the probabilities associated with all points contained in the set E1 + E2. If E1 and E2 have no points in common, i.e. the events are mutually exclusive, then Pr{E1 + E2} = Pr{E1} + Pr{E2}. If they have points in common, then Pr{E1 + E2} = Pr{E1} + Pr{E2} − Pr{E1E2}.

The set E1 + E2 is sometimes denoted by E1 ∪ E2 and is called the union of the two sets. The set E1E2 is sometimes denoted by E1 ∩ E2 and is called the intersection of the two sets. Extensions to more than two sets can be made; thus instead of E1 + E2 + E3 and E1E2E3 we could use the notations E1 ∪ E2 ∪ E3 and E1 ∩ E2 ∩ E3, respectively.

A special symbol, ∅, is sometimes used to denote a set with no points in it, called the null set. The probability associated with an event corresponding to this set is zero, i.e. Pr{∅} = 0. If E1 and E2 have no points in common, we can write E1E2 = ∅, which means that the corresponding events are mutually exclusive and Pr{E1E2} = 0.
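
The set-theoretic viewpoint maps directly onto sets in code. The sketch below takes the sample space of one toss of a fair die, assigns probability 1/6 to each point, and checks the union and intersection relations stated above:

```python
from fractions import Fraction

# Sample space for one toss of a fair die, each point carrying probability 1/6.
S = {1, 2, 3, 4, 5, 6}
prob = {point: Fraction(1, 6) for point in S}

def pr(event):
    """Probability of an event = sum of the probabilities of its points."""
    return sum(prob[point] for point in event)

E1 = {2, 4, 6}           # the event "an even number"
E2 = {4, 5, 6}           # the event "a number greater than 3"

union = E1 | E2          # E1 + E2
intersection = E1 & E2   # E1 E2

print(pr(union), pr(E1) + pr(E2) - pr(intersection))   # 2/3 2/3

empty = E1 & {1, 3}      # no points in common: the null set
print(empty == set(), pr(empty))                       # True 0
```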

With this modern approach, a random variable is a function defined at each point of
the sample space. For example, in Problem 6.37 the random variable is the sum of the coordinates of each point.
In the case where S has an infinite number of points, the above ideas can be extended by means of concepts involving the calculus.
