Introduction to Probability and Statistics

Probability & Statistics for Engineers & Scientists, 8th Ed.


2007
Review II
Instructor: Kuo-Jung Lee
TA: Brian Shea
The PDF file for this class is available on the class web page.
http://www.stat.umn.edu/~kjlee/STAT3021_Summer2009.html

Marginal Distribution
The marginal distributions of X alone and of Y alone are
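Discrete case:
$$g(x) = \sum_{y} f(x, y) \quad\text{and}\quad h(y) = \sum_{x} f(x, y),$$
Continuous case:
$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad\text{and}\quad h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.$$
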
Conditional Distribution
Let X and Y be two random variables, discrete or continuous.
The conditional distribution of the random variable Y given
that X = x is
$$f(y \mid x) = \frac{f(x, y)}{g(x)}, \quad g(x) > 0.$$
Similarly, the conditional distribution of the random variable X
given that Y = y is
$$f(x \mid y) = \frac{f(x, y)}{h(y)}, \quad h(y) > 0.$$

Statistical Independence
Let X and Y be two random variables with joint probability
distribution f(x, y) and marginal distributions g(x) and h(y),
respectively. The random variables X and Y are said to be
statistically independent if and only if
$$f(x, y) = g(x)\,h(y)$$
for all (x, y) within their range.

Example 1
Consider the following joint probability density function of the
random variables X and Y
$$f(x, y) = \begin{cases} \frac{1}{2}\, y e^{-x}, & 0 < x,\ 0 < y < 2; \\ 0, & \text{elsewhere.} \end{cases}$$
1. Find the marginal density functions of X and Y.
2. Are X and Y independent?
3. Find P(X > 2).

Solution:
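1.
$$g(x) = \int_{0}^{2} f(x, y)\,dy = e^{-x}, \quad x > 0;$$
$$h(y) = \int_{0}^{\infty} f(x, y)\,dx = \frac{y}{2}, \quad 0 < y < 2.$$
2. Since f(x, y) = g(x)h(y) for all (x, y), X and Y are independent.
3.
$$P(X > 2) = \int_{2}^{\infty} e^{-x}\,dx = e^{-2}.$$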
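As a numerical sanity check of Example 1, the following minimal Python sketch integrates the joint density with a midpoint rule. The helper names (`joint_pdf`, `midpoint_integral`, `marginal_x`, `marginal_y`) and the truncation of the infinite x-range are assumptions made for illustration; they are not part of the original slides.

```python
import math

def joint_pdf(x, y):
    """Joint density of Example 1: (1/2) * y * exp(-x) for x > 0, 0 < y < 2."""
    if x > 0 and 0 < y < 2:
        return 0.5 * y * math.exp(-x)
    return 0.0

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

def marginal_x(x):
    """g(x): integrate the joint density over y in (0, 2)."""
    return midpoint_integral(lambda y: joint_pdf(x, y), 0.0, 2.0)

def marginal_y(y, upper=40.0):
    """h(y): integrate over x in (0, upper); upper truncates the infinite range."""
    return midpoint_integral(lambda x: joint_pdf(x, y), 0.0, upper)

# P(X > 2) from the marginal of X, again truncating the upper limit.
prob_x_gt_2 = midpoint_integral(lambda x: math.exp(-x), 2.0, 42.0)

print(marginal_x(1.0), math.exp(-1.0))  # both close to e^{-1}
print(marginal_y(1.0), 0.5)             # both close to y/2 = 0.5
print(prob_x_gt_2, math.exp(-2.0))      # both close to e^{-2}
```
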
Definition: Expectation
Let X be a random variable with probability distribution f(x).
The mean or expected value of X is:
if X is discrete,
$$\mu = E(X) = \sum_{x} x f(x);$$
if X is continuous,
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$

Theorem
Let X be a random variable with probability function f(x). The
expected value of the random variable g(X) is
Discrete: if X is discrete,
$$\mu_{g(X)} = E[g(X)] = \sum_{x} g(x) f(x).$$
Continuous: if X is continuous,
$$\mu_{g(X)} = E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$

Definition
Let X and Y be random variables with joint probability
distribution f(x, y). The mean or expected value of g(X, Y) is:
if X and Y are discrete,
$$\mu_{g(X,Y)} = E[g(X, Y)] = \sum_{x} \sum_{y} g(x, y) f(x, y);$$
if X and Y are continuous,
$$\mu_{g(X,Y)} = E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f(x, y)\,dx\,dy.$$

Definition: Variance
Let X be a random variable with probability function f(x) and
mean $\mu$. The variance of the random variable X is
Discrete: if X is discrete,
$$\sigma^2 = \operatorname{Var}(X) = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 f(x).$$
Continuous: if X is continuous,
$$\sigma^2 = \operatorname{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx.$$
The positive square root of the variance, $\sigma$, is called the
standard deviation of X.

Theorem
The variance of a random variable X is
$$\sigma^2 = \operatorname{Var}(X) = E(X^2) - \mu^2.$$

Theorem
Let X be a random variable with probability function f(x). The
variance of the random variable g(X) is
Discrete: if X is discrete,
$$\sigma^2_{g(X)} = E\{[g(X) - \mu_{g(X)}]^2\} = \sum_{x} [g(x) - \mu_{g(X)}]^2 f(x).$$
Continuous: if X is continuous,
$$\sigma^2_{g(X)} = E\{[g(X) - \mu_{g(X)}]^2\} = \int_{-\infty}^{\infty} [g(x) - \mu_{g(X)}]^2 f(x)\,dx.$$

Definition: Covariance
Let X and Y be random variables with joint probability
distribution f(x, y). The covariance of the random variables X and Y is
Discrete: if X and Y are discrete,
$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_{x} \sum_{y} (x - \mu_X)(y - \mu_Y) f(x, y).$$
Continuous: if X and Y are continuous,
$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy.$$

Theorem
The covariance of two random variables X and Y with means
$\mu_X$ and $\mu_Y$, respectively, is given by
$$\sigma_{XY} = \operatorname{Cov}(X, Y) = E(XY) - \mu_X \mu_Y.$$

Theorem
If a and b are constants, then
$$E(aX + b) = aE(X) + b, \qquad E(aX + bY) = aE(X) + bE(Y).$$
Theorem
The expected value of the sum or difference of two or more
functions of two random variables X and Y is the sum or difference
of the expected values of the functions. That is,
$$E[g(X) \pm h(Y)] = E[g(X)] \pm E[h(Y)].$$
Theorem
Let X and Y be two independent random variables. Then
$$E(XY) = E(X)E(Y).$$

Theorem
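If a and b are constants, then
$$\sigma^2_{aX+b} = \operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X).$$
Theorem
If X and Y are random variables with joint probability distribution
f(x, y) and a and b are constants, then
$$\sigma^2_{aX+bY} = \operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + 2ab \operatorname{Cov}(X, Y) + b^2 \operatorname{Var}(Y).$$
Moreover, for constants a, b, c, and d,
$$\operatorname{Cov}(aX + b, cY + d) = ac \operatorname{Cov}(X, Y).$$
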
Example 2
Consider the following joint probability density function of the
random variables X and Y
$$f(x, y) = \begin{cases} \frac{1}{2}\, y e^{-x}, & 0 < x,\ 0 < y < 2; \\ 0, & \text{elsewhere.} \end{cases}$$
1. Find the means and variances of X and Y.
2. Find the covariance of X and Y.

Solution:
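1. With the marginal densities $g(x) = e^{-x}$, $x > 0$, and $h(y) = y/2$, $0 < y < 2$:
$$E(X) = \int_{0}^{\infty} x\,g(x)\,dx = 1; \qquad \operatorname{Var}(X) = 1.$$
$$E(Y) = \int_{0}^{2} y\,h(y)\,dy = \int_{0}^{2} \frac{y^2}{2}\,dy = \frac{4}{3};$$
$$\operatorname{Var}(Y) = E(Y^2) - \left(\frac{4}{3}\right)^2 = 2 - \frac{16}{9} = \frac{2}{9}.$$
2. Since f(x, y) = g(x)h(y), X and Y are independent, so Cov(X, Y) = 0.
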
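The moments of Y in Example 2 can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes the marginal density h(y) = y/2 on 0 < y < 2, and the helper `integrate_monomial` is a hypothetical name introduced here.

```python
from fractions import Fraction

def integrate_monomial(coeff, power, lower, upper):
    """Exact integral of coeff * y**power over [lower, upper]."""
    k = power + 1
    return coeff * (Fraction(upper) ** k - Fraction(lower) ** k) / k

# h(y) = y/2 on 0 < y < 2, so y**m * h(y) = (1/2) * y**(m + 1).
half = Fraction(1, 2)
mean_y = integrate_monomial(half, 2, 0, 2)           # E(Y)
second_moment_y = integrate_monomial(half, 3, 0, 2)  # E(Y^2)
var_y = second_moment_y - mean_y ** 2                # Var(Y) = E(Y^2) - E(Y)^2

print(mean_y, second_moment_y, var_y)  # 4/3 2 2/9
```
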
Example 3
Let X and Y have joint probability density function (p.d.f.)
$$f(x, y) = \begin{cases} \frac{3x^2 y}{32}, & \text{if } 0 < x < 2 \text{ and } 1 < y < 3; \\ 0, & \text{otherwise.} \end{cases}$$
1. Find the marginal probability density functions of X and Y,
respectively, and determine whether X and Y are independent.
2. Calculate $\operatorname{Cov}(0.652X, \sqrt{17}\,Y)$.

Solution:
1.
$$g(x) = \int_{1}^{3} \frac{3x^2 y}{32}\,dy = \frac{3x^2}{8}, \quad 0 < x < 2;$$
$$h(y) = \int_{0}^{2} \frac{3x^2 y}{32}\,dx = \frac{y}{4}, \quad 1 < y < 3.$$
Since f(x, y) = g(x)h(y), X and Y are independent.
2. Since X and Y are independent by part 1, Cov(X, Y) = 0. Hence
$$\operatorname{Cov}(0.652X, \sqrt{17}\,Y) = 0.652\sqrt{17}\,\operatorname{Cov}(X, Y) = 0.$$

Example 4
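Let X and Y be jointly distributed with correlation coefficient
$\rho(X, Y) = 1/2$, $\sigma_X = 2$, and $\sigma_Y = 3$. Find
$\operatorname{Var}(2X - 4Y + 3)$.
Solution:
Since $\operatorname{Cov}(X, Y) = \rho(X, Y)\,\sigma_X \sigma_Y = 3$,
$$\operatorname{Var}(2X - 4Y + 3) = \operatorname{Var}(2X - 4Y)$$
$$= 2^2 \operatorname{Var}(X) + 2 \cdot 2 \cdot (-4) \operatorname{Cov}(X, Y) + 4^2 \operatorname{Var}(Y)$$
$$= 16 - 48 + 144 = 112.$$
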
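The arithmetic of Example 4 can be reproduced in a few lines. This sketch assumes the given 1/2 is the correlation coefficient of X and Y and that 2 and 3 are the standard deviations; under that reading the variance formula evaluates to 112. The function name `var_linear_combination` is introduced here for illustration.

```python
def var_linear_combination(a, b, var_x, var_y, cov_xy):
    """Var(aX + bY + c) = a^2 Var(X) + 2ab Cov(X, Y) + b^2 Var(Y); c drops out."""
    return a**2 * var_x + 2 * a * b * cov_xy + b**2 * var_y

rho, sigma_x, sigma_y = 0.5, 2.0, 3.0   # assumed reading of the givens
cov_xy = rho * sigma_x * sigma_y        # Cov(X, Y) = rho * sigma_X * sigma_Y = 3.0
result = var_linear_combination(2, -4, sigma_x**2, sigma_y**2, cov_xy)

print(result)  # 112.0
```
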
Markov's Inequality
Let X be a nonnegative random variable; then for any t > 0,
$$P(X \ge t) \le \frac{E(X)}{t}.$$
Theorem (Chebyshev's Inequality)
The probability that any random variable X will assume a value
within k standard deviations of the mean is at least $1 - 1/k^2$. That is,
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}.$$

Discrete Probability Distributions
| Name | Notation | P.D.F. f(x) = P(X = x) |
|---|---|---|
| Uniform | | $\frac{1}{k}$, $x = x_1, \ldots, x_k$ |
| Bernoulli | Ber(p) | $p^x (1-p)^{1-x}$, $x = 0, 1$ |
| Binomial | Bin(n, p) | $\binom{n}{x} p^x (1-p)^{n-x}$, $x = 0, 1, \ldots, n$ |
| Multinomial | | $\binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$ |
| Geometric | Geo(p) | $p q^{x-1}$, $x = 1, 2, 3, \ldots$ |
| Negative Binomial | NB(k, p) | $\binom{x-1}{k-1} p^k q^{x-k}$, $x = k, k+1, \ldots$ |
| Hypergeometric | Hyp(N, n, k) | $\binom{k}{x}\binom{N-k}{n-x} \big/ \binom{N}{n}$ |
| Poisson | Poi($\lambda t$) | $\frac{e^{-\lambda t} (\lambda t)^x}{x!}$, $x = 0, 1, 2, \ldots$ |
Here q = 1 - p.

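Discrete Probability Distributions
| Name | Mean | Variance | M.G.F. |
|---|---|---|---|
| Bernoulli | $p$ | $p(1-p)$ | $pe^t + q$ |
| Binomial | $np$ | $np(1-p)$ | $(pe^t + q)^n$ |
| Geometric | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ | $\frac{pe^t}{1 - qe^t}$ |
| Negative Binomial | $\frac{k}{p}$ | $\frac{k(1-p)}{p^2}$ | $\left(\frac{pe^t}{1 - qe^t}\right)^k$ |
| Poisson | $\lambda t$ | $\lambda t$ | $e^{\lambda t (e^s - 1)}$ |
In the Poisson row the M.G.F. argument is written s, because t denotes the length of the time interval.
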
Example 5: Binomial Distribution
The probability that a student is accepted to a prestigious college
is 0.3. If 5 students from the same school apply, what is the
probability that at most 2 are accepted?
Solution:
To solve this problem, we compute 3 individual probabilities,
using the binomial formula. The sum of these probabilities
is the answer we seek. Let X be the number of students
accepted, so $X \sim \operatorname{Bin}(5, 0.3)$. Thus,
$$P(X \le 2) = \sum_{x=0}^{2} b(x; 5, 0.3) = 0.8369.$$

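The sum in Example 5 is small enough to evaluate directly. The following sketch uses the acceptance probability 0.3 stated in the problem; the function names `binomial_pmf` and `binomial_cdf` are illustrative choices, not notation from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): sum of the individual binomial probabilities from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

prob_at_most_2 = binomial_cdf(2, 5, 0.3)
print(round(prob_at_most_2, 4))  # 0.8369
```
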
Example 6: Multinomial Distribution
Suppose a card is drawn randomly from an ordinary deck of
playing cards, and then put back in the deck. This exercise is
repeated five times. What is the probability of drawing 1 spade,
1 heart, 1 diamond, and 2 clubs?
Solution:
• The experiment consists of 5 trials, so n = 5.
• The 5 trials produce 1 spade, 1 heart, 1 diamond, and 2
clubs; so $n_1 = 1$, $n_2 = 1$, $n_3 = 1$, and $n_4 = 2$.
• On any particular trial, the probability of drawing a spade,
heart, diamond, or club is 0.25, 0.25, 0.25, and 0.25, respectively.
Thus, $p_1 = 0.25$, $p_2 = 0.25$, $p_3 = 0.25$, and $p_4 = 0.25$.
$$\binom{5}{1, 1, 1, 2} (0.25)^1 (0.25)^1 (0.25)^1 (0.25)^2 = 0.05859.$$
Thus, if we draw five cards with replacement from an ordinary
deck of playing cards, the probability of drawing 1 spade, 1 heart,
1 diamond, and 2 clubs is 0.05859.
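A short sketch of the multinomial computation for counts (1, 1, 1, 2) follows. The function name `multinomial_pmf` is an illustrative assumption; the coefficient is 5!/(1! 1! 1! 2!) = 60.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial probability of the given category counts."""
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)  # stays an exact integer at every step
    return coefficient * prod(p**c for p, c in zip(probs, counts))

prob = multinomial_pmf([1, 1, 1, 2], [0.25, 0.25, 0.25, 0.25])
print(round(prob, 5))  # 0.05859
```
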
Example 7: Hypergeometric Distribution
Suppose we randomly select 5 cards without replacement from
an ordinary deck of playing cards. What is the probability of
getting exactly 2 red cards (i.e., hearts or diamonds)?
Solution:
This is a hypergeometric experiment in which we know the following:
• N = 52, since there are 52 cards in a deck.
• k = 26, since there are 26 red cards in a deck.
• n = 5, since we randomly select 5 cards from the deck.
• x = 2, since 2 of the cards we select are red.
We plug these values into the hypergeometric formula as follows:
$$h(x; N, n, k) = \frac{\binom{k}{x} \binom{N-k}{n-x}}{\binom{N}{n}}$$
$$h(2; 52, 5, 26) = \frac{\binom{26}{2} \binom{52-26}{5-2}}{\binom{52}{5}} = 0.32513.$$
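The same hypergeometric value can be computed with integer binomial coefficients. The function name `hypergeometric_pmf` is an illustrative assumption; its argument order follows h(x; N, n, k).

```python
from math import comb

def hypergeometric_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in n draws without replacement, k successes among N items."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

prob = hypergeometric_pmf(2, 52, 5, 26)
print(round(prob, 5))  # 0.32513
```
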
Example 8: Negative Binomial & Geometric Distributions
Bob is a high school basketball player. He is a 70% free throw
shooter. That means his probability of making a free throw is
0.70. During the season, what is the probability that Bob makes
his third free throw on his fifth shot?
Solution:
This is an example of a negative binomial experiment. The
probability of success p is 0.70, the number of trials x is 5, and
the number of successes k is 3.
$$nb(5; 3, 0.7) = \binom{4}{2} (0.7)^3 (0.3)^2 = 0.18522.$$

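A minimal sketch of the negative binomial computation, with the hypothetical function name `negative_binomial_pmf` and arguments ordered as in nb(x; k, p):

```python
from math import comb

def negative_binomial_pmf(x, k, p):
    """Probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

prob = negative_binomial_pmf(5, 3, 0.7)
print(round(prob, 5))  # 0.18522
```

Setting k = 1 gives the geometric distribution, for example `negative_binomial_pmf(x, 1, p)` for the first success on trial x.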
Example 9: Poisson Distribution
The average number of homes sold by the Acme Realty company
is 2 homes per day. What is the probability that exactly 3 homes
will be sold tomorrow?
Solution:
This is a Poisson experiment in which we know the following:
• $\lambda = 2$, since 2 homes are sold per day, on average.
• t = 1, since unit time is one day.
• x = 3, since we want to find the likelihood that 3 homes will
be sold tomorrow.
$$p(x = 3; \lambda t = 2) = \frac{e^{-2}\, 2^3}{3!} = 0.180$$

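The Poisson probability in Example 9 can be evaluated directly from the formula. The function name `poisson_pmf` and its `rate` and `t` parameters are illustrative assumptions standing for the rate per unit time and the interval length.

```python
from math import exp, factorial

def poisson_pmf(x, rate, t=1.0):
    """Probability of x events in an interval of length t at the given rate."""
    mean = rate * t
    return exp(-mean) * mean**x / factorial(x)

prob = poisson_pmf(3, rate=2.0, t=1.0)
print(round(prob, 3))  # 0.18
```
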
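Continuous Probability Distributions
| Name | Notation | P.D.F. f(x) |
|---|---|---|
| Uniform | $U[a, b]$ | $\frac{1}{b-a}$, $a \le x \le b$ |
| Normal | $N(\mu, \sigma^2)$ | $\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$, $-\infty < x < \infty$ |
| Exponential | $\operatorname{Exp}(\beta)$ | $\frac{1}{\beta} e^{-x/\beta}$, $x > 0$ |
| Gamma | $\Gamma(\alpha, \beta)$ | $\frac{1}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha-1} e^{-x/\beta}$, $x > 0$ |
| Chi-Squared | $\chi^2_{\nu} = \Gamma\left(\frac{\nu}{2}, 2\right)$ | $\frac{1}{2^{\nu/2} \Gamma(\nu/2)} x^{\nu/2-1} e^{-x/2}$, $x > 0$ |
| Lognormal | $\operatorname{LogN}(\mu, \sigma^2)$ | $\frac{1}{x \sigma \sqrt{2\pi}} e^{-\frac{1}{2\sigma^2} [\ln(x) - \mu]^2}$, $x > 0$ |
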
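Continuous Probability Distributions
| Name | Mean | Variance | M.G.F. |
|---|---|---|---|
| Uniform | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ | |
| Normal | $\mu$ | $\sigma^2$ | $\exp\left\{\mu t + \frac{\sigma^2 t^2}{2}\right\}$ |
| Exponential | $\beta$ | $\beta^2$ | |
| Gamma | $\alpha\beta$ | $\alpha\beta^2$ | |
| Chi-Squared | $\nu$ | $2\nu$ | |
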
Normal Approximation to the Binomial-I
Let X be a binomial random variable with parameters n and p.
Then X has approximately a normal distribution with $\mu = np$ and
$\sigma^2 = npq = np(1-p)$, and
$$P(X \le x) = \sum_{k=0}^{x} b(k; n, p) \qquad (1)$$
$$\approx \text{area under normal curve to the left of } x + 0.5 \qquad (2)$$
$$= P\left(Z \le \frac{x + 0.5 - np}{\sqrt{npq}}\right), \qquad (3)$$
where $Z \sim N(0, 1)$, and the approximation will be good if np and
n(1 - p) are greater than or equal to 5.

Normal Approximation to the Binomial-II
Let X be a binomial random variable with parameters n and p.
Then X has approximately a normal distribution with $\mu = np$ and
$\sigma^2 = npq = np(1-p)$, and
$$P(x_1 \le X \le x_2) = \sum_{k=x_1}^{x_2} b(k; n, p)$$
$$\approx \text{area under normal curve to the right of } x_1 - 0.5 \text{ and left of } x_2 + 0.5$$
$$= P\left(\frac{x_1 - 0.5 - np}{\sqrt{npq}} \le Z \le \frac{x_2 + 0.5 - np}{\sqrt{npq}}\right),$$
where $Z \sim N(0, 1)$, and the approximation will be good if np and
n(1 - p) are greater than or equal to 5.

Memorylessness for Geometric & Exponential Distribution
A nonnegative random variable X is called memoryless if for all
$s, t \ge 0$,
$$P(X \ge t) = P(X \ge t + s \mid X \ge s).$$

Example 10
The position X of the first defect on a digital tape (in cm)
has the exponential distribution with mean $\beta = 50$. Find the
probability that X < 200 given X > 150.
Solution:
By the memoryless property,
$$P(X < 200 \mid X > 150) = 1 - P(X > 200 \mid X > 150) = 1 - P(X > 50) = 1 - e^{-1}.$$

Example 11
The lifetime of a TV tube (in years) is an exponential random
variable with mean 10. If Jim bought his TV set 10 years ago,
what is the probability that its tube will last another 10 years?
Solution:
Let X be the lifetime of a TV tube, so $X \sim \operatorname{Exp}(\beta = 10)$.
$$P(X > 20 \mid X > 10) = P(X > 10) = e^{-1}.$$

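The memoryless property used in Examples 10 and 11 can be verified numerically from the definition of conditional probability. This is a minimal sketch; the function names `exponential_survival` and `conditional_survival` are assumptions introduced for illustration.

```python
import math

def exponential_survival(x, beta):
    """P(X > x) for an exponential random variable with mean beta."""
    return math.exp(-x / beta)

def conditional_survival(extra, elapsed, beta):
    """P(X > elapsed + extra | X > elapsed), computed from the definition."""
    return exponential_survival(elapsed + extra, beta) / exponential_survival(elapsed, beta)

# Example 10: mean 50, P(X < 200 | X > 150) = 1 - P(X > 200 | X > 150)
example_10 = 1 - conditional_survival(50, 150, beta=50)
# Example 11: mean 10, P(X > 20 | X > 10)
example_11 = conditional_survival(10, 10, beta=10)

print(example_10, 1 - math.exp(-1))  # both close to 1 - e^{-1}
print(example_11, math.exp(-1))      # both close to e^{-1}
```
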
Example 12
Let the probability density function of a random variable X be
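$$f(x) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x-2)^2}{2 \cdot 2^2}}, \quad -\infty < x < \infty.$$
Show that
$$P(|X - 2| < 4) \ge \frac{3}{4}.$$
Solution:
Since
$$X \sim N(2, 2^2),$$
by Chebyshev's inequality with k = 2, we have
$$P(|X - 2| < 4) = P(|X - 2| < 2 \cdot 2) \ge 1 - \frac{1}{2^2} = \frac{3}{4}.$$
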
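Chebyshev's bound holds for any distribution, so it is usually loose for a specific one. The sketch below compares the bound from Example 12 with the exact normal probability, using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, k = 2.0, 2.0, 2.0

# Chebyshev lower bound for P(|X - mu| < k * sigma)
chebyshev_bound = 1 - 1 / k**2

# Exact probability when X is normal with mean 2 and standard deviation 2
dist = NormalDist(mu=mu, sigma=sigma)
exact = dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

print(chebyshev_bound)  # 0.75
print(round(exact, 4))  # 0.9545
```
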
If we are sampling from a population with unknown distribution,
either finite or infinite, the sampling distribution of
$\bar{X}$ will be approximately normal with mean $\mu$ and variance
$\sigma^2/n$, provided that the sample size is large (n > 30).
Central Limit Theorem
If $\bar{X}$ is the mean of a random sample of size n taken from a
population with mean $\mu$ and finite variance $\sigma^2$, then the limiting
form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
as $n \to \infty$ is the standard normal distribution N(0, 1).

Example 13
A pair of fair 4-sided dice is rolled 192 times. Let T be the
number of times that a total of 5 occurs.
1. Find the probability function of T.
2. What are the mean (expected value) and variance of T?
3. What is the probability that a total of 5 occurs at most 49
times?

Solution:
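1. A total of 5 occurs on a single roll with probability 4/16 = 1/4,
so $T \sim \operatorname{Bin}\left(192, \frac{1}{4}\right)$. That is,
$$f(x) = \binom{192}{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{192-x}, \quad x = 0, \ldots, 192.$$
2. E(T) = 192(1/4) = 48, Var(T) = 192(1/4)(3/4) = 36.
3. Normal approximation to the binomial:
$$P(T \le 49) \approx P\left(\frac{T - 48}{6} < \frac{49.5 - 48}{6}\right) = P(Z < 0.25) = 0.599.$$
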
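To see how good the continuity-corrected approximation in Example 13 is, the exact binomial sum can be computed alongside it. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 192, 0.25, 49
mean = n * p                   # 48.0
std = sqrt(n * p * (1 - p))    # 6.0

# Exact P(T <= 49): sum of binomial probabilities
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Normal approximation with continuity correction
z = (x + 0.5 - mean) / std
approx = NormalDist().cdf(z)

print(z)                 # 0.25
print(round(approx, 3))  # 0.599
print(exact)             # exact value, for comparison with the approximation
```
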
Example 14
The random variable X, representing the number of cherries in a
cherry puff, has the following probability distribution:
| x | 4 | 5 | 6 | 7 |
|---|---|---|---|---|
| P(X = x) | 0.2 | 0.4 | 0.3 | 0.1 |
1. Find the mean $\mu$ and the variance $\sigma^2$ of X.
2. Find the mean $\mu_{\bar{X}}$ and the variance $\sigma^2_{\bar{X}}$ of the mean
$\bar{X}$ for random samples of 36 cherry puffs.
3. Find the probability that the average number of cherries in
36 cherry puffs will be less than 5.5.

Solution:
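1. E(X) = 4(0.2) + 5(0.4) + 6(0.3) + 7(0.1) = 5.3 and
$E(X^2) = 28.9$, so $\operatorname{Var}(X) = 28.9 - 5.3^2 = 0.81$.
2. $\mu_{\bar{X}} = E(\bar{X}) = 5.3$, $\sigma^2_{\bar{X}} = \operatorname{Var}(\bar{X}) = \frac{0.81}{36} = 0.0225$.
3. Central Limit Theorem:
$$P(\bar{X} < 5.5) \approx P\left(\frac{\bar{X} - 5.3}{\sqrt{0.81/36}} < \frac{5.5 - 5.3}{\sqrt{0.81/36}}\right) = P(Z < 1.33) = 0.908.$$
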
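The quantities in Example 14 can be recomputed directly from the probability table. The sketch below is illustrative only; it uses the unrounded z-value, so the final probability can differ slightly in the third decimal from a value read off a normal table at a rounded z.

```python
from math import sqrt
from statistics import NormalDist

values = [4, 5, 6, 7]
probs = [0.2, 0.4, 0.3, 0.1]
n = 36

mean = sum(v * p for v, p in zip(values, probs))              # E(X)
second_moment = sum(v**2 * p for v, p in zip(values, probs))  # E(X^2)
variance = second_moment - mean**2                            # Var(X)
var_of_mean = variance / n                                    # Var of the sample mean

# Central limit theorem approximation of P(sample mean < 5.5)
z = (5.5 - mean) / sqrt(var_of_mean)
prob = NormalDist().cdf(z)

print(round(mean, 2), round(variance, 2), round(var_of_mean, 4))  # 5.3 0.81 0.0225
print(round(z, 2), prob)
```
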
Sampling Distribution: Difference Between Two Averages
If independent samples of size $n_1$ and $n_2$ are drawn at random
from two populations, discrete or continuous, with means $\mu_1$
and $\mu_2$, and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling
distribution of the differences of means, $\bar{X}_1 - \bar{X}_2$, is approximately
normally distributed with mean and variance given by
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2, \quad\text{and}\quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
Hence
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_{\bar{X}_1 - \bar{X}_2}}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
is approximately a standard normal variable.

Example 15
A random sample of size 100 is taken from a population having
a mean of 80 and a standard deviation of 5. A second random
sample of size 125 is taken from a different population having a
mean of 75 and a standard deviation of 3. Find the probability
that the sample mean computed from the 100 measurements
will exceed the sample mean computed from 125 measurements
by at least 3 but less than 5.

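The slides state Example 15 without a solution. One way to work it, assuming the normal approximation for the difference of two sample means applies, is sketched below; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, mu1, sigma1 = 100, 80.0, 5.0
n2, mu2, sigma2 = 125, 75.0, 3.0

# Mean and standard deviation of the difference of the two sample means
mean_diff = mu1 - mu2                                # 5.0
std_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)     # sqrt(0.25 + 0.072)

# Approximate P(3 <= difference < 5) with a normal distribution
diff = NormalDist(mu=mean_diff, sigma=std_diff)
prob = diff.cdf(5.0) - diff.cdf(3.0)

print(round(std_diff, 4))
print(round(prob, 4))  # about 0.4998
```

Because 5 is exactly the mean of the difference, the upper limit contributes 0.5 and the answer is 0.5 minus a small lower-tail probability.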
Sampling Distribution of $S^2$
If $S^2$ is the variance of a random sample of size n taken from a
normal population having the variance $\sigma^2$, then the statistic
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$
has a chi-squared distribution with $\nu = n - 1$ degrees of freedom.

Example 16
Find the probability that a random sample of 25 observations,
from a normal population with variance $\sigma^2 = 6$, will have a
variance $s^2$
1. greater than 9.1;
2. between 3.462 and 10.745.

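The slides give no solution for Example 16. The sketch below converts each bound on the sample variance to the chi-squared scale with 24 degrees of freedom. Since the standard library has no chi-squared function, it uses the closed-form upper-tail sum that holds only for an even number of degrees of freedom; the function names `chi2_survival_even_df` and `to_chi2` are hypothetical helpers introduced here.

```python
from math import exp, factorial

def chi2_survival_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x); closed form valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2.0
    return exp(-half) * sum(half**j / factorial(j) for j in range(df // 2))

n, sigma_sq = 25, 6.0
df = n - 1

def to_chi2(s_sq):
    """Convert a sample variance to the statistic (n - 1) * s^2 / sigma^2."""
    return (n - 1) * s_sq / sigma_sq

# 1. P(S^2 > 9.1) = P(chi-squared > 36.4)
prob_greater = chi2_survival_even_df(to_chi2(9.1), df)
# 2. P(3.462 < S^2 < 10.745) = P(13.848 < chi-squared < 42.98)
prob_between = (chi2_survival_even_df(to_chi2(3.462), df)
                - chi2_survival_even_df(to_chi2(10.745), df))

print(round(prob_greater, 2))  # about 0.05
print(round(prob_between, 2))  # about 0.94
```
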
Sampling Distribution of $\bar{X}$ When Variance is Unknown
Let $X_1, X_2, \ldots, X_n$ be independent random variables that are all
normal with mean $\mu$ and standard deviation $\sigma$. Let
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \quad\text{and}\quad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
Then the random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t-distribution with $\nu = n - 1$ degrees of freedom.

Sampling Distribution of the Ratio of Two Sample Variances
If $S_1^2$ and $S_2^2$ are the variances of independent random samples
of size $n_1$ and $n_2$ taken from normal populations with variances
$\sigma_1^2$ and $\sigma_2^2$, respectively, then
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$
has an F-distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees
of freedom.

Example 17
If $S_1^2$ and $S_2^2$ represent the variances of independent random
samples of size $n_1 = 25$ and $n_2 = 31$, taken from normal populations
with equal variances, find
$$P\left(\frac{S_1^2}{S_2^2} > 1.26\right).$$
