11.502 QUANTITATIVE METHODS
Copyright: Rai University
LESSON 18:
DISCRETE RANDOM VARIABLES &
PROBABILITY DISTRIBUTIONS
In the last lecture we discussed the fundamentals of
probability. Can any of you give a brief summary of what we
studied?
At the end of this chapter you should be familiar with:
Random variables
Discrete random variables
Binomial probability distribution.
Poisson probability distribution.
Random Variables
When the outcomes of a random experiment are numerical
values, it is more convenient to list their possible values
through the notion of a random variable.
Random Variable
A random variable is a variable that takes on numerical values
determined by the outcome of a random experiment.
For example, suppose we pick 4 students at random from a large
university class and observe the number of courses each
student is taking that semester (the university allows a
maximum of 6 courses per semester).
Define the random variable X = # courses a student takes.
The possible values of X are: x = 1, 2, 3, 4, 5, 6.
This is an example of a discrete random variable.
Discrete Random Variable
A random variable is discrete if it can take on no more than a
countable number of values.
In the previous example, there were a finite number of possible
values.
Consider the random experiment of rolling a die until a six
appears. If the random variable X = the number of rolls until
the first 6 appears, the possible values are x = 1, 2, 3, ... Here
there is a countable (though infinite) number of possible values
for this discrete random variable.
Suppose we want to do a study on the commuting time to
school of students, at a commuter college (where no student
lives more than 2 hours away). If X = travel time (in minutes),
the possible values of the random variable x are over an
interval: 0 < x < 120.
X is an example of a continuous random variable.
Continuous Random Variable
A random variable is continuous if it can take any value in an
interval.
For now we focus on discrete random variables and the topics
that relate to them. In Chapter 6 we will look at continuous
random variables.
Probability Distributions for Discrete Random
Variables
Example
Consider the problem of rolling a die until a 6 appears;
however, we will not roll more than 3 times. If X = # rolls for
this process to end, the possible values of X are x = 1, 2, 3.
$P(X=1) = P(\text{6 on the first roll}) = \frac{1}{6}$
$P(X=2) = P(\text{no 6 on the first roll and 6 on the second roll}) = \left(\frac{5}{6}\right)\left(\frac{1}{6}\right) = \frac{5}{36}$
Since we end on the third roll as long as the first 2 rolls do not
produce a six,
$P(X=3) = P(\text{no 6 on the first roll and no 6 on the second roll}) = \left(\frac{5}{6}\right)\left(\frac{5}{6}\right) = \frac{25}{36}$
Let us list the possible values of X along with their probabilities:
Table 1 PDF of X
x       P(x) = P(X=x)
1       1/6
2       5/36
3       25/36
TOTAL   1
The above is the probability distribution of the discrete
random variable X. P(x) is called the Probability Distribution
Function (PDF).
Probability Distribution Function (PDF)
The probability distribution function, P(x), of a discrete
random variable X expresses the probability that X takes the
value x, as a function of x.
That is P(x)=P(X=x), for all values of x. (1)
Notice that all the probabilities are non-negative and they sum
to one.
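The die example can be checked with exact fractions. The following is a minimal sketch, assuming a fair die; the variable names are illustrative and not part of the lesson.

```python
from fractions import Fraction

# Probability of a six on a single roll of a fair die (assumption: fair die).
p_six = Fraction(1, 6)
p_not_six = 1 - p_six

# X = number of rolls until the process ends
# (stop at the first 6, or after the third roll at the latest).
pdf = {
    1: p_six,                  # 6 on the first roll
    2: p_not_six * p_six,      # no 6, then a 6
    3: p_not_six * p_not_six,  # no 6 on either of the first two rolls
}

for x, prob in pdf.items():
    print(x, prob)

# The two required properties: non-negative probabilities that sum to 1.
all_non_negative = all(prob >= 0 for prob in pdf.values())
total = sum(pdf.values())
print(all_non_negative, total)
```

Working with `Fraction` rather than floats keeps the values in the same form as Table 1, so the sum is exactly 1 with no rounding error.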
Required Properties of Probability Distribution Functions of
Discrete Random Variables
Let X be a discrete random variable with probability
distribution function, P(x). Then
(i) $P(x) \ge 0$ for any value x
(ii) The individual probabilities sum to 1; that is
$\sum_{x} P(x) = 1$
where the notation indicates summation over all possible
values x.
The Cumulative Probability Function is defined as follows:
Cumulative Probability Function $F(x_0)$
The cumulative probability function, $F(x_0)$, of a random variable
X expresses the probability that X does not exceed the value $x_0$,
as a function of $x_0$. That is
$F(x_0) = P(X \le x_0)$ (2)
where the function is evaluated at all values $x_0$.
If we wish to look at the Cumulative Probability Function
$F(x_0) = P(X \le x_0)$ for our example, it would be as follows:
Table 2 Cumulative Probability Function of X
x       F(x) = P(X ≤ x)
1       1/6
2       11/36
3       36/36
Clearly, to get these cumulative probabilities, we just added up
all the probabilities that satisfied the event $X \le x_0$.
Derived Relationship Between Probability Function and
Cumulative Probability Function
Let X be a random variable with probability function P(x) and
cumulative probability function $F(x_0)$. Then we can show that
$F(x_0) = \sum_{x \le x_0} P(x)$ (3)
where the notation implies that summation is over all possible
values x that are less than or equal to $x_0$.
The following properties of the cumulative probability function
should be clear from its definition.
Derived Properties of Cumulative Probability Functions for
Discrete Random Variables
Let X be a discrete random variable with cumulative probability
function $F(x_0)$. Then we can show that
(i) $0 \le F(x_0) \le 1$ for every number $x_0$
(ii) If $x_0$ and $x_1$ are two numbers with $x_0 < x_1$,
then $F(x_0) \le F(x_1)$
In practice, we often have a cumulative probability function,
and from the following rule we can get the individual
probabilities for the PDF.
$P(X = x_0) = P(X \le x_0) - P(X < x_0)$
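The relationship between the PDF and the cumulative probability function can be sketched in both directions. This is an illustrative example using the die-rolling PDF from Table 1; the names `cdf` and `recovered` are assumptions made for the sketch.

```python
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3]
pdf = [Fraction(1, 6), Fraction(5, 36), Fraction(25, 36)]

# F(x0) is the running total of P(x) over all x <= x0.
cdf = list(accumulate(pdf))

# P(X = x0) = P(X <= x0) - P(X < x0).
# For the smallest value, P(X < x0) = 0, so the first entry is F itself.
recovered = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]

for x, f, p in zip(values, cdf, recovered):
    print(x, f, p)
```

Differencing consecutive cumulative values works here because the possible values are listed in increasing order, so P(X < x0) is simply the cumulative value at the previous possible value.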
In statistics, the real importance of the PDF is its relationship
to a population. The probability distribution for a random
variable describes a population.
For example, consider a population of N = 500 values: the number
of cars sold per day at a car dealership over a 500-day period. The
following frequency table summarizes the population and
computes the mean $\mu$ and standard deviation $\sigma$ of the
population.
Table 3 Mean and Variance of a Population
x (# cars sold)   f (# days)   xf     f(x-μ)^2
0                 40           0      373.5654
1                 100          100    422.7136
2                 142          284    158.3493
3                 66           198    0.206976
4                 36           144    32.0809
5                 30           150    113.3741
6                 26           156    225.3455
7                 20           140    311.1027
8                 16           128    391.0902
9                 14           126    494.6359
10                8            80     385.7531
11                2            22     126.2143
Totals            500          1528   3034.432
$\mu = 1528/500 = 3.056$
$\sigma^2 = 3034.432/500 = 6.068864$
$\sigma = \sqrt{6.068864} = 2.463506$
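Suppose a day is chosen at random from the N = 500 days, and let the
discrete random variable X = # cars sold that day, with x = 0, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11.
What is the PDF for this random variable X? Each probability is the
relative frequency f/N from Table 3; for example, P(0) = 40/500 = 0.08.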
Table 4 PDF of a Discrete Random Variable
x P(x)
0 0.08
1 0.2
2 0.284
3 0.132
4 0.072
5 0.06
6 0.052
7 0.04
8 0.032
9 0.028
10 0.016
11 0.004
Notice that P(X=0) = .08. On 8% of the days, 0 cars were
sold, so 8% of the N = 500 values in the population are 0s.
P(X=1) = .2, so 20% of the N = 500 values in the population are
1s. We can make statements like this for every value in the
population.
We see that the probability distribution for the random variable
X describes the population of the # of cars sold on 500 days.
So, PDFs describe populations.
Descriptive Measures for Discrete Random Variables
Using the following definitions, we can find the mean, variance
and standard deviation of a discrete random variable X.
Expected Value of a Discrete Random Variable
Expected Value
The expected value, E(X), of a discrete random variable X is
defined as
$E(X) = \sum_{x} x P(x)$ (4)
where the notation indicates that summation extends over all
possible values x. The expected value of a random variable is
called its mean and is denoted $\mu_X$.
Expected Value: Functions of Random Variables
Let X be a discrete random variable with probability function
P(x) and let g(X) be some function of X. Then the expected
value, E[g(X)], of that function is defined as
$E[g(X)] = \sum_{x} g(x) P(x)$ (5)
Variance and Standard Deviation of a Discrete Random
Variable
Let X be a discrete random variable. The expectation of the
squared discrepancies about the mean, $(X - \mu_X)^2$, is called the
variance, denoted $\sigma_X^2$ and given by
$\sigma_X^2 = E[(X - \mu_X)^2] = \sum_{x} (x - \mu_X)^2 P(x)$ (6)
The standard deviation, $\sigma_X$, is the positive square root of the
variance.
Variance of a Discrete Random Variable (Alternative Formula)
The variance of a discrete random variable X can be expressed
as
$\sigma_X^2 = E(X^2) - \mu_X^2 = \sum_{x} x^2 P(x) - \mu_X^2$ (7)
Let's use formulas (4) and (6) to find the mean, variance and
standard deviation for our discrete random variable X = number of
cars sold per day in a 500-day period, using the PDF in Table 4.
Table 5 Mean and Variance of a Random Variable X
x        P(x)    xP(x)   (x-3.056)^2 P(x)
0        0.08    0       0.74713088
1        0.2     0.2     0.8454272
2        0.284   0.568   0.316698624
3        0.132   0.396   0.000413952
4        0.072   0.288   0.064161792
5        0.06    0.3     0.22674816
6        0.052   0.312   0.450691072
7        0.04    0.28    0.62220544
8        0.032   0.256   0.782180352
9        0.028   0.252   0.989271808
10       0.016   0.16    0.771506176
11       0.004   0.044   0.252428544
Totals   1.000   3.056   6.068864
$E(X) = \mu_X = 3.056$
$E[(X - \mu_X)^2] = \sigma_X^2 = 6.068864$
$\sigma_X = \sqrt{\sigma_X^2} = 2.463506$
Notice that the mean and variance of the random variable X are
just the mean and variance of the population of cars sold per
day.
If we have the PDF that describes a population we are interested
in, then we can get the mean and the standard
deviation of the population from the probability distribution.
Moreover, we can answer many questions about the population
from the PDF.
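The link between the frequency table and the PDF can be checked numerically. The sketch below rebuilds the PDF from the Table 3 frequencies and computes the mean and variance by the definition and by the alternative formula; the variable names are illustrative assumptions.

```python
import math

# Frequencies from Table 3: number of cars sold -> number of days.
days = {0: 40, 1: 100, 2: 142, 3: 66, 4: 36, 5: 30,
        6: 26, 7: 20, 8: 16, 9: 14, 10: 8, 11: 2}
n_days = sum(days.values())  # population size N

# Relative frequencies give the PDF of X = # cars sold on a random day.
pdf = {x: f / n_days for x, f in days.items()}

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in pdf.items())

# Variance by definition: sum of (x - mean)^2 * P(x).
variance = sum((x - mean) ** 2 * p for x, p in pdf.items())

# Variance by the alternative formula: E(X^2) - mean^2.
variance_alt = sum(x * x * p for x, p in pdf.items()) - mean ** 2

std_dev = math.sqrt(variance)
print(n_days, round(mean, 3), round(variance, 6), round(std_dev, 6))
```

Both variance formulas give the same value up to floating-point rounding, and the results agree with the population mean and variance computed directly from the frequency table.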
Example
The manager of a large computer network has developed the
following probability distribution for the number of
interruptions per day:
Table 6 PDF of X =#computer network interruptions
x P(x)
0 0.32
1 0.35
2 0.18
3 0.08
4 0.04
5 0.02
6 0.01
From the probability distribution we know that 32% of the
days have no interruptions. If any particular day is chosen at
random, the probability of at most 2 interruptions = P(X ≤ 2) =
.32 + .35 + .18 = .85. 85% of the time, there are at most 2
interruptions in a day.
For this population, the mean and standard deviation can be
obtained using expected values as follows:
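Table 7 Mean and Variance for Computer Network
Interruptions
x       P(x)    xP(x)   x^2 P(x)
0       0.32    0       0
1       0.35    0.35    0.35
2       0.18    0.36    0.72
3       0.08    0.24    0.72
4       0.04    0.16    0.64
5       0.02    0.1     0.5
6       0.01    0.06    0.36
Total   1.000   1.27    3.29
The last column holds $x^2 P(x)$, so its total is $E(X^2)$ and the
variance follows from the alternative formula (7):
$E(X) = \mu_X = 1.27$
$E(X^2) = \sum_{x} x^2 P(x) = 3.29$
$\sigma_X^2 = E(X^2) - \mu_X^2 = 3.29 - (1.27)^2 = 1.6771$
$\sigma_X = \sqrt{1.6771} = 1.295$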
So, the mean number of interruptions is 1.27 with a standard
deviation of about 1.295 (the square root of the variance,
3.29 - 1.27^2 = 1.6771).
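The network-interruption figures can be recomputed directly from the PDF in Table 6. This is a minimal sketch; the variance is obtained with the alternative formula E(X^2) minus the squared mean, and the variable names are illustrative.

```python
import math

# PDF of X = number of network interruptions per day (Table 6).
pdf = {0: 0.32, 1: 0.35, 2: 0.18, 3: 0.08, 4: 0.04, 5: 0.02, 6: 0.01}

# Probability of at most 2 interruptions: add P(x) for x = 0, 1, 2.
p_at_most_2 = sum(p for x, p in pdf.items() if x <= 2)

mean = sum(x * p for x, p in pdf.items())         # E(X)
mean_sq = sum(x * x * p for x, p in pdf.items())  # E(X^2)
variance = mean_sq - mean ** 2                    # E(X^2) - mean^2
std_dev = math.sqrt(variance)

print(round(p_at_most_2, 2), round(mean, 2), round(mean_sq, 2))
print(round(variance, 4), round(std_dev, 3))
```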
The following rules on means and variances will prove useful in
our later studies.
Mean and Variance of Special Functions of a Random Variable
Summary of Properties for Linear Function of a Random
Variable
Let X be a random variable with mean $\mu_X$ and variance $\sigma_X^2$, and
let a and b be any constant fixed numbers. Define the random
variable Y = a + bX. Then, the mean and variance of Y are
$\mu_Y = E(a + bX) = a + b\mu_X$ (8)
and
$\sigma_Y^2 = \mathrm{Var}(a + bX) = b^2 \sigma_X^2$ (9)
so that the standard deviation of Y is
$\sigma_Y = |b| \sigma_X$ (10)
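These rules can be checked numerically by computing the mean and variance of Y = a + bX directly and comparing with the formulas. The sketch below reuses the interruption PDF from Table 6; the constants a = 100 and b = -50 are hypothetical values chosen only for illustration (a negative b shows why the absolute value is needed in the standard deviation).

```python
import math

# PDF of X from Table 6.
pdf = {0: 0.32, 1: 0.35, 2: 0.18, 3: 0.08, 4: 0.04, 5: 0.02, 6: 0.01}

# Hypothetical constants for the linear function Y = a + bX.
a, b = 100.0, -50.0

def expect(g):
    """E[g(X)]: sum of g(x) * P(x) over all possible values x."""
    return sum(g(x) * p for x, p in pdf.items())

mu_x = expect(lambda x: x)
var_x = expect(lambda x: (x - mu_x) ** 2)

# Direct computation for Y, without using the linear-function rules.
mu_y = expect(lambda x: a + b * x)
var_y = expect(lambda x: (a + b * x - mu_y) ** 2)

print(mu_y, a + b * mu_x)
print(var_y, b ** 2 * var_x)
print(math.sqrt(var_y), abs(b) * math.sqrt(var_x))
```

Each printed pair should agree up to floating-point rounding.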
Summary Results for the Mean and Variance of Special Linear
Functions
a. Let b = 0 in the linear function W = a + bX.
Then W = a (for any constant a).
E(a) = a and Var(a) = 0 (11)
If a random variable always takes the value a, it will have a
mean a and a variance 0.
b. Let a = 0 in the linear function W = a + bX.
Then W = bX.
$E(bX) = bE(X) = b\mu_X$ and $\mathrm{Var}(bX) = b^2 \sigma_X^2$ (12)
Often, when we have data, we standardize the
values in the data. That is, each value is written in terms of
how many standard deviations above or below the mean the
value falls. To standardize, we use the formula
$Z = \frac{X - \mu_X}{\sigma_X}$
Let's standardize our population of computer interruptions per
day and find the mean and variance of this standardized
population.
Table 8 Mean and Variance for a Standardized Random
Variable Z
z          P(z)   zP(z)      z^2 P(z)
-0.98067   0.32   -0.31382   0.30775
-0.20849   0.35   -0.07297   0.015214
0.563694   0.18   0.101465   0.057195
1.335877   0.08   0.10687    0.142765
2.108061   0.04   0.084322   0.177757
2.880244   0.02   0.057605   0.165916
3.652428   0.01   0.036524   0.133402
Total      1.00   0.000000   1.000000
$E(Z) = \mu_Z = 0$
$E[(Z - \mu_Z)^2] = \sigma_Z^2 = 1$
It turns out that for any population with a non-zero standard
deviation, the mean and variance of the standardized population
will always be 0 and 1 respectively.
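The standardized table can be reproduced with a few lines of code. This is a sketch using the interruption PDF from Table 6; the mean and standard deviation are computed from the PDF rather than typed in, and the names are illustrative.

```python
import math

# PDF of X from Table 6.
pdf = {0: 0.32, 1: 0.35, 2: 0.18, 3: 0.08, 4: 0.04, 5: 0.02, 6: 0.01}

mu = sum(x * p for x, p in pdf.items())
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in pdf.items()))

# Standardize each value: z = (x - mu) / sigma. Probabilities are unchanged.
z_pdf = {(x - mu) / sigma: p for x, p in pdf.items()}

mean_z = sum(z * p for z, p in z_pdf.items())
var_z = sum((z - mean_z) ** 2 * p for z, p in z_pdf.items())

for z, p in z_pdf.items():
    print(round(z, 5), p)
print(round(mean_z, 6), round(var_z, 6))
```

The mean of Z comes out as 0 and the variance as 1, up to floating-point rounding.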
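The Mean and Variance of $Z = \frac{X - \mu_X}{\sigma_X}$
Let $a = -\mu_X/\sigma_X$ and $b = 1/\sigma_X$ in the linear function Z = a +
bX. Then,
$Z = a + bX = \frac{X - \mu_X}{\sigma_X}$
so that
$\mu_Z = E\left(\frac{X - \mu_X}{\sigma_X}\right) = -\frac{\mu_X}{\sigma_X} + \frac{1}{\sigma_X}\mu_X = 0$ (13)
and
$\sigma_Z^2 = \mathrm{Var}\left(\frac{X - \mu_X}{\sigma_X}\right) = \frac{1}{\sigma_X^2}\sigma_X^2 = 1$ (14)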
Binomial Distribution
Example
Suppose that items coming off an assembly line are classified as
defective (D) or non-defective (N). Suppose further that any
item has P(D) = .2 and all items are independent.
If the random variable X takes on the value 0 if an item is
non-defective and 1 if an item is defective, then P(X=1) = p =
.2 and P(X=0) = 1 - p = .8, and the random variable X is said to
have a Bernoulli distribution with p = .2.
Generally, suppose that a random experiment can give rise to
just two possible mutually exclusive and collectively exhaustive
outcomes, which for convenience we will label success and
failure. Let p denote the probability of success, so that the
probability of failure is (1 - p). Now define the random variable
X so that X takes the value 1 if the outcome of the experiment
is success and 0 otherwise. The probability function of this
random variable is then
P(0) = (1 - p) and P(1) = p
This distribution is known as the Bernoulli distribution.
It is straightforward to find the mean and variance of the
Bernoulli distribution as follows:
Derivation of the Mean and Variance of a Bernoulli
Random Variable
The mean is:
$\mu_X = E(X) = \sum_{x} x P(x) = (0)(1 - p) + (1)p = p$ (15)
and the variance is:
$\sigma_X^2 = E[(X - \mu_X)^2] = \sum_{x} (x - \mu_X)^2 P(x)$ (16)
$= (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1 - p)$
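As a worked check with the assembly-line example, take the probability of success (a defective item) to be .2, written p here. Substituting into the two expressions above gives:

```latex
\begin{align*}
\mu_X &= (0)(0.8) + (1)(0.2) = 0.2 \\
\sigma_X^2 &= (0 - 0.2)^2 (0.8) + (1 - 0.2)^2 (0.2) \\
           &= (0.04)(0.8) + (0.64)(0.2) \\
           &= 0.032 + 0.128 = 0.16 = (0.2)(0.8)
\end{align*}
```

The variance equals p(1 - p), as the general derivation indicates.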
Recall that $C^n_x = \frac{n!}{x!(n-x)!}$ is the number of ways of choosing
x items from a total of n items. If we look at sequences of
x successes in n trials, that is equivalent to looking at n items
and choosing x of those items to be successes. The rest would
be failures. Hence, $C^n_x = \frac{n!}{x!(n-x)!}$ is the number of
sequences of x successes in n trials.
Number of Sequences with x Successes in n Trials
The number of sequences with x successes in n independent
trials is:
$C^n_x = \frac{n!}{x!(n-x)!}$ (17)
where $n! = n(n-1)(n-2) \cdots 1$ and $0! = 1$.
These $C^n_x$ sequences are mutually exclusive, since no two of
them can occur at the same time.
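The counting rule can be illustrated by listing the sequences explicitly. The sketch below uses n = 3 trials and x = 2 successes, values chosen only for illustration, and compares the enumerated count with the factorial formula and with Python's built-in `math.comb`.

```python
from itertools import product
from math import comb, factorial

n, x = 3, 2  # illustrative values: 3 trials, 2 successes

# List every sequence of S (success) and F (failure) of length n,
# and keep those containing exactly x successes.
sequences = ["".join(seq) for seq in product("SF", repeat=n)
             if seq.count("S") == x]

# n! / (x! (n - x)!) written out with factorials.
by_formula = factorial(n) // (factorial(x) * factorial(n - x))

print(sequences)
print(len(sequences), by_formula, comb(n, x))
```

Each listed sequence is a distinct outcome of the n trials, which is why the sequences are mutually exclusive and their probabilities can be added.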
Example
Suppose that items coming off an assembly line are classified as
defective (D) or non-defective (N). Suppose further that any
item has P(D) = .2 and all items are independent. If 3 items
are chosen at random from the day's production, and
X = # defective, what is the PDF of X?
X = # defective, x = 0, 1, 2, 3
P(0) = P(NNN) = (.8)^3 = .512
P(1) = P(DNN or NDN or NND) = 3(.2)(.8)^2 = .384
P(2) = P(DDN or DND or NDD) = 3(.2)^2(.8) = .096
P(3) = P(DDD) = (.2)^3 = .008
Notice that $\sum_{x=0}^{3} P(x) = 1$, as it should be.
In general,
$P(x) = (\text{number of ways of getting } x \text{ defectives}) \, (.2)^x (.8)^{3-x} = C^3_x (.2)^x (.8)^{3-x}$, for x = 0, 1, 2, 3
This is an example of the binomial distribution!
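The same PDF can be obtained two ways: by adding up the probabilities of every D/N sequence, and by the counting formula. The following sketch does both for the 3-item example; the names `pdf_enum` and `pdf_formula` are illustrative.

```python
from itertools import product
from math import comb

p_defective = 0.2  # P(D) for a single item
n = 3              # number of items inspected

# Method 1: enumerate all 2^n sequences of D and N. Because items are
# independent, a sequence's probability is the product of item probabilities.
pdf_enum = {x: 0.0 for x in range(n + 1)}
for seq in product("DN", repeat=n):
    x = seq.count("D")
    pdf_enum[x] += p_defective ** x * (1 - p_defective) ** (n - x)

# Method 2: (number of sequences with x defectives) * (probability of one).
pdf_formula = {x: comb(n, x) * p_defective ** x * (1 - p_defective) ** (n - x)
               for x in range(n + 1)}

for x in range(n + 1):
    print(x, round(pdf_enum[x], 3), round(pdf_formula[x], 3))
```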
The Binomial Distribution
Suppose that a random experiment can result in two possible
mutually exclusive and collectively exhaustive outcomes,
success and failure, and that p is the probability of a success
resulting in a single trial. If n independent trials are carried out,
the distribution of the number of successes x resulting is
called the binomial distribution. Its probability distribution
function for the binomial random variable X = x is:
P(x successes in n independent trials)
$= P(x) = \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x}$ for x = 0, 1, 2, ..., n (18)
For a binomial distribution we have n independent
Bernoulli trials.
The mean and variance of the binomial distribution are derived
in Appendix 5.1.
Derived Mean and Variance of a Binomial Probability
Distribution
Let X be the number of successes in n independent trials, each
with probability of success p. Then X follows a binomial
distribution with mean
$\mu_X = E(X) = np$ (19)
and variance
$\sigma_X^2 = E[(X - \mu_X)^2] = np(1-p)$ (20)
Example
According to one study, 4 out of 10 World Wide Web surfers
who view Internet banner ads remember them. Suppose that a
random sample of 5 Web surfers is selected and asked if they
remember a specific Internet banner that they previously
accessed.
If X = # of people who remember the ads, then X has a
binomial distribution with n = 5 and p = .4. Understanding this, the
next few questions can easily be answered.
What is the expected number of surfers who remember (the
mean of the binomial)?
$\mu_X = E(X) = np = 5(.4) = 2$
What is the probability that exactly one of the Web surfers will
remember the banner ad?
$P(X=1) = C^5_1 (.4)^1 (.6)^4 = .2592$
What is the probability that at most 1 of the Web surfers will
remember the banner ad?
$P(X \le 1) = P(X=0) + P(X=1) = C^5_0 (.4)^0 (.6)^5 + C^5_1 (.4)^1 (.6)^4 = .07776 + .2592 = .33696$
What is the probability that more than 1 of the Web surfers will
remember the banner ad?
$P(X > 1) = P(X=2) + P(X=3) + P(X=4) + P(X=5)$
$= 1 - P(X \le 1) = 1 - .33696 = .66304$
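The Web surfer answers can be reproduced with a short function for the binomial probability function. This is a minimal sketch; `binomial_pdf` is a helper name introduced here for illustration, not something defined in the lesson.

```python
from math import comb, sqrt

def binomial_pdf(n, p):
    """Return the list [P(0), P(1), ..., P(n)] for n trials with success probability p."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

n, p = 5, 0.4
pdf = binomial_pdf(n, p)

p_exactly_1 = pdf[1]
p_at_most_1 = pdf[0] + pdf[1]
p_more_than_1 = 1 - p_at_most_1  # complement of "at most 1"

# Mean and variance from the definitions, to compare with np and np(1 - p).
mean = sum(x * px for x, px in enumerate(pdf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pdf))

print(round(p_exactly_1, 5), round(p_at_most_1, 5), round(p_more_than_1, 5))
print(round(mean, 6), round(variance, 6), round(sqrt(variance), 6))
```

Using the complement for "more than 1" avoids adding four separate terms and mirrors the calculation in the example.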
Graph of Binomial Probability Function (n=5, p=0.4)
[Figure: histogram of the Binomial (n=5, p=.4) distribution; horizontal axis: Number of Successes (0 to 5); vertical axis: P(X), scaled from 0 to 0.4]
Notice the shape of the binomial is fairly symmetric.
Table 9 Binomial Distribution Probabilities when n=5 and p=.4
Binomial Distribution: Web Surfer Example
Data
Sample size              5
Probability of success   0.4
Statistics
Mean                     2
Variance                 1.2
Standard deviation       1.095445
Binomial Probabilities Table
X P(X) P(<=X) P(<X) P(>X) P(>=X)
0 0.07776 0.07776 0 0.92224 1
1 0.2592 0.33696 0.07776 0.66304 0.92224
2 0.3456 0.68256 0.33696 0.31744 0.66304
3 0.2304 0.91296 0.68256 0.08704 0.31744
4 0.0768 0.98976 0.91296 0.01024 0.08704
5 0.01024 1 0.98976 0 0.01024