CHAPTER TWO
PROBABILITY AND PROBABILITY DISTRIBUTIONS
2.1. Probability Theory
Basic Concepts
Probability is a measure of the likelihood or chance that an uncertain event will
occur. It is a numerical measure of the chance of an outcome's occurrence and can
assume any value between 0 and 1, inclusive. A probability near zero indicates that
the outcome is very unlikely to occur, while a probability near 1 indicates that
the event is almost certain to happen. At the extremes, a probability of 0 means
that something will never happen, and a probability of 1 means that something will
always happen. Thus, probabilities are non-negative values no greater than 1.
Probability is the basis for inferential statistics.
Experiment
An experiment is any well-defined situation or procedure that results in one or
more possible outcomes. Or simply, it can be defined as any process that
generates well-defined outcomes. For instance, tossing a coin, rolling a die, or
playing a football match can be taken as experiments.
Outcome
An outcome is a particular result of an experiment. For example, getting either
a head or a tail is a possible outcome of the coin-tossing experiment. Winning,
losing or a tie/draw are the possible outcomes of the football experiment, and
getting 1, 2, 3, 4, 5, or 6 are the possible outcomes of the die-rolling experiment.
Events
An event is a specific collection of basic outcomes, that is, a set containing one or
more of the basic outcomes from the sample space. An event identifies one
or more outcomes of an experiment. For example, in the die-rolling experiment,
any collection of one or more of the six possible outcomes can be taken as
an event.
Sample Space
A sample space is a complete roster or listing of all possible outcomes of an
experiment. The sample space of an experiment is usually illustrated either by a
list or by some type of diagram, such as a Venn diagram or a tree diagram.
Exercise
Identify the experiment, outcomes, events and sample space for the following
questions.
1. Sitting for an exam ………………………………. Experiment
Scoring A, B, C, D, F ……………………………... Possible outcomes
[A, B, C, D, F] ……………………………………... Sample Space
Scoring B and above ……………………………… Event
C and above ……………………………… Event
D or below ……………………………….. Event
Events
1. Independent events
Two or more events are independent if the occurrence or nonoccurrence of one of
the events does not affect the occurrence or nonoccurrence of the others. Certain
experiments, such as rolling dice, yield independent events; each die is
independent of the other. Whether a 6 is rolled on the first die has no influence
on whether a 6 is rolled on the second die. Coin tosses are always independent of
each other. The probability of getting a head on the first toss of a coin is
independent of getting a head on the second toss.
The impact of independent events on the probability is that, if two events are
independent, the probability of attaining the second event is the same regardless
of the outcome of the first event. The probability of tossing a head is always ½
regardless of what was tossed previously. Thus, if someone tosses a coin six
times and gets six heads, the probability of tossing a head on the seventh time is
½, because coin tosses are independent. In terms of symbolic notation, if X and Y
are independent: P(X/Y) = P(X) and P(Y/X) = P(Y), where P(X/Y) denotes the
probability of X occurring given that Y has occurred, and P(Y/X) denotes the
probability of Y occurring given that X has occurred.
2. Dependent Events
Relating the above three types of events, mutually exclusive events must be
dependent, but dependent events need not be mutually exclusive. Events that are
independent cannot be mutually exclusive. Therefore, mutually exclusive implies
dependence and independence implies not mutually exclusive, but no other
simple implications among these conditions hold true.
5. Complementary events
The complement of an event A is denoted A′. All elementary events of an
experiment not in A comprise its complement. For example, if in rolling one die,
event A is getting an even number, the complement of A is getting an odd
number. If event A is getting a 5 on the roll of a die, the complement of A is
getting a 1, 2, 3, 4, or 6.
Using the complement of an event can sometimes be helpful in solving for
probabilities because of the rule: P(A′) = 1 − P(A).
Principles of counting
Counting the number of ways in which events may occur in an experiment plays
a major role in probability. Some rules for counting are presented in this section.
The first of these is called the fundamental principle of counting.
The fundamental principle of counting specifies that if one event can occur in n1 ways and
another event can occur in n2 ways, the two events can occur together in n1 × n2 ways.
Permutations
Other important counting rules pertain to the arrangement of items with regard
to the order of items. In this case we use permutations.
By definition, 0! = 1.
Combinations
Permutations concern arrangements in which both order and composition are
important. In combinations, what matters is the composition of the group, not the
order of the items as in permutations.
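The counting rules above can be illustrated with a short Python sketch. The numbers used here (a die with a coin, and choosing 3 items out of 5) are illustrative assumptions, not values from the text; the permutation and combination counts use the standard formulas n!/(n − r)! and n!/(r!(n − r)!).

```python
import math

# Fundamental principle of counting: n1 * n2 ways for two events together.
n1, n2 = 6, 2  # assumed example: one die (6 outcomes) and one coin (2 outcomes)
together = n1 * n2

# Assumed example: selecting r = 3 items out of n = 5.
n, r = 5, 3
permutations = math.perm(n, r)  # order matters: n! / (n - r)!
combinations = math.comb(n, r)  # order ignored: n! / (r! * (n - r)!)

print("ways together:", together)
print("permutations:", permutations)
print("combinations:", combinations)
print("0! =", math.factorial(0))
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why the permutation count is six times the combination count.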
Classical Method
The classical method of assigning probabilities is based on the assumption that
each outcome is equally likely to occur. Classical probability utilizes rules and
laws. It involves an experiment and an event. The definition assumes that all n
possible outcomes have the same chance for occurring. In this method
probability values are assigned as follows:
$$P(E) = \frac{n_e}{N}$$
where: N = total possible number of outcomes of an experiment
n_e = the number of outcomes in which the event occurs out of N outcomes
As n_e can never be greater than N (no more than N outcomes in the population
could possibly possess attribute e), the highest value of any probability is 1. If the
probability of an outcome occurring is 1, the event is certain to occur. The
smallest possible probability is zero. If none of the N possible outcomes
possesses the desired characteristic, e, the probability is 0/N = 0, and the event is
certain not to occur. The range of possibilities for probabilities is: 0 ≤ P(E) ≤ 1.
Thus probabilities are non-negative fractions or non-negative decimal
values less than or equal to 1.
$$\text{Probability by relative frequency of occurrence} = \frac{\text{number of times an event has occurred}}{\text{total number of opportunities for the event to occur}}$$
Relative frequency of occurrence is not based on rules or laws but on what has
occurred in the past.
Subjective method
The subjective method of assigning probability is based on the feelings or
insights of a person determining the probability. Subjective probability comes
from the person’s intuition or reasoning. Although not a scientific approach to
Types of probabilities
There are four types of probabilities. These are:
Simple probability
Joint probability
Marginal probability
Conditional probability
Simple Probability
Simple probabilities are relatively straightforward and are obtained using the
formula P(A) = n(A)/n, that is, the relative frequency method.
Marginal probability
Marginal probability is denoted by P (E), where E is some event. A marginal
probability is usually computed by dividing some subtotal by the whole. An
example of marginal probability is the probability that a student is infected by
HIV/AIDS. This probability is computed by dividing the number of students
infected by HIV/AIDS by the total number of students. The probability of a
person wearing glasses is also a marginal probability. This probability is
computed by dividing the number of people wearing glasses by the total number
of people. A marginal probability is found in the margin of any joint probability
table. It is the sum of the joint probabilities for a single category of one attribute
over all possible categories of another attribute.
Example:
ABC Company manufactures window air conditioners in both a deluxe model
(D) and a standard model (S). An auditor engaged in a compliance audit of the
firm is validating the sales account for the month April. She has collected 200
invoices for the month, some of which were sent to wholesalers (W) and the
remainders to retailers (R). Of the 140 retail invoices, 28 are for the standard
model. Only 24 of the wholesale invoices are for the standard model. If the
auditor selects one invoice at random, find the following probabilities.
Solution
                Wholesale, W     Retail, R        Total
Deluxe, D       36   0.18*       112  0.56*       148  0.74**
Standard, S     24   0.12*       28   0.14*       52   0.26**
Total           60   0.30**      140  0.70**      200
(* joint probability; ** marginal probability)
P (D) = 148/200 = 0.74 P (W) = 60/200 = 0.30
P (S) = 52/200 = 0.26 P (R) = 140/200 = 0.70
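As a sketch, the joint and marginal probabilities for the ABC Company invoices can be recomputed from the cell counts. The dictionary layout and the helper name `marginal` are illustrative choices, not part of the original example.

```python
# Invoice counts from the ABC Company example: (model, customer) -> count.
counts = {
    ("D", "W"): 36, ("D", "R"): 112,
    ("S", "W"): 24, ("S", "R"): 28,
}
total = sum(counts.values())

# Joint probabilities: each cell count divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

def marginal(label):
    """Sum the joint probabilities of every cell that carries `label`."""
    return sum(p for cell, p in joint.items() if label in cell)

for label in ("D", "S", "W", "R"):
    print(label, round(marginal(label), 2))
```

Summing across a row or column of the joint table is exactly how the marginal values in the table's margins are obtained.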
Union probability
A second type of probability is the union of two events. Union probability is
denoted by P(E1 U E2), where E1 and E2 are two events. P(E1 U E2) is the
probability that E1 will occur or that E2 will occur or that both E1 and E2 will
occur. An example of union probability is the probability that a person is infected
by HIV/AIDS or cancer. To qualify for the union, the person has to be infected
with at least one of the diseases. Another example is the probability that a person
wears eyeglasses or is a soldier. All people wearing eyeglasses are included in the
union, along with all people who are soldiers and all soldiers who wear
eyeglasses.
Joint probability
A third type of probability is the intersection of two events or joint probability. A
joint probability shows the probability that an observation will possess two (or
more) characteristics simultaneously. That is, it measures the probability of two
or more events occurring together. The joint probability of events E1 and E2
occurring is denoted P(E1 ∩ E2). Sometimes P(E1 ∩ E2) is read as the probability
of E1 and E2. To qualify for the intersection, both events must occur. Joint
probability ranges from 0 to 1, inclusive [0, 1]. The joint probabilities over all
cells of a joint probability table must sum to 1.0. An example of joint probability is the probability of a
person to be infected with HIV/AIDS and Cancer. Being infected with one of the
Conditional probability
The fourth type is conditional probability. Conditional probability is denoted by
P(E1/E2). This expression is read as: the probability that E1 will occur given that
E2 is known to have occurred. The conditional probability of an event E1, given
event E2, is the ratio of the joint probability of the two events to the marginal
probability of E2.
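$$\text{For } P(Y) \neq 0:\quad P(X/Y) = \frac{P(X \text{ and } Y)}{P(Y)} = \frac{P(X \cap Y)}{P(Y)}$$
$$\text{For } P(X) \neq 0:\quad P(Y/X) = \frac{P(Y \text{ and } X)}{P(X)} = \frac{P(Y \cap X)}{P(X)}$$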
Example:
Blue Nile University recently conducted a survey of undergraduate students in
order to gather information about the usage of the library. The population for
this study included all 4000 undergraduate students enrolled in the university.
The library officers are interested in increasing usage, particularly among
females (F) and seniors (S) at the university. Of the 4000 students, 800 students
are seniors, 1800 students are females and 450 of the 1800 females are seniors.
Required:
1. What is the probability that a student selected at random is a senior given that
the selected student is female?
2. What is the probability that a student selected at random is female given that the
selected student is senior?
Solution:
               Senior, S        Non-Senior, N     Total
Female, F      450   0.1125     1350  0.3375      1800  0.45
Male, M        350   0.0875     1850  0.4625      2200  0.55
Total          800   0.20       3200  0.80        4000
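A minimal sketch of the two required conditional probabilities, computed from the survey counts as joint probability divided by marginal probability. The variable names are illustrative.

```python
# Counts from the Blue Nile University survey example.
total = 4000
females = 1800
seniors = 800
female_seniors = 450

p_f = females / total               # marginal P(F)
p_s = seniors / total               # marginal P(S)
p_f_and_s = female_seniors / total  # joint P(F and S)

# Conditional probability = joint probability / marginal of the given event.
p_s_given_f = p_f_and_s / p_f
p_f_given_s = p_f_and_s / p_s

print("P(S/F) =", round(p_s_given_f, 4))
print("P(F/S) =", round(p_f_given_s, 4))
```

The two results differ because the same joint probability is divided by different marginals: 1800 females in one case and 800 seniors in the other.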
$$P(X_i/Y) = \frac{P(X_i)\,P(Y/X_i)}{P(X_1)\,P(Y/X_1) + P(X_2)\,P(Y/X_2) + \dots + P(X_n)\,P(Y/X_n)} = \frac{P(X_i)\,P(Y/X_i)}{\sum_{j=1}^{n} P(X_j)\,P(Y/X_j)}$$
Example:
1. A company has three machines A, B and C which all produce the same two parts,
X and Y. Of all the parts produced, machine A produces 60%, machine B
produces 30%, and machine C produces the rest. 40% of the parts made by
machine A are part X, 50% of the parts made by machine B are part X, and 70% of
the parts made by machine C are part X. A part produced by this company is
randomly sampled and is determined to be an X part. With the knowledge that it
is an X part, find the probabilities that the part came from machine A, B or C.
Solution:
P (A) = 0.6 P (X/A) = 0.4 P (A/X) =?
P (B) = 0.3 P (X/B) = 0.5 P (B/X) =?
P (C) = 0.1 P (X/C) = 0.7 P (C/X) =?
Method 1
Machine
Product A B C Total
X 0.24 0.15 0.07 0.46
Y 0.36 0.15 0.03 0.54
Total 0.6 0.3 0.1 1.00
P (A/X) = 0.24/0.46 = 0.52
P (B/X) = 0.15/0.46 = 0.33
P (C/X) = 0.07/0.46 = 0.15
1.00 The sum of joint and conditional probabilities should
be equal to one.
Method 3 – Bayesian Table
Event,   Prior prob.,   Conditional prob.,   Joint prob.,    Posterior/Revised prob.,
Ei       P(Ei)          P(X/Ei)              P(Ei ∩ X)       P(Ei/X)
A        0.60           0.40                 0.24            0.24/0.46 = 0.52
B        0.30           0.50                 0.15            0.15/0.46 = 0.33
C        0.10           0.70                 0.07            0.07/0.46 = 0.15
                                             P(X) = 0.46     1.00
Method 4 – Tree Diagram
Machine   P(Ei)    Branch    Conditional prob.    Joint prob.
A         0.60     X/A       0.40                 0.24
                   Y/A       0.60                 0.36
B         0.30     X/B       0.50                 0.15
                   Y/B       0.50                 0.15
C         0.10     X/C       0.70                 0.07
                   Y/C       0.30                 0.03
P(X) = 0.24 + 0.15 + 0.07 = 0.46
P (A/X) = 0.24/0.46 = 0.52
P (B/X) = 0.15/0.46 = 0.33
P (C/X) = 0.07/0.46 = 0.15
                      1.00
Find: P(Y) = 0.54
P (A/Y) = 0.36/0.54 = 0.667
P (B/Y) = 0.15/0.54 = 0.278
P (C/Y) = 0.03/0.54 = 0.055
1.000
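The tabular method used for the machine example can be sketched as a small function: multiply each prior by its conditional probability, sum the joint probabilities to get the marginal, then divide. The function name `posterior` and the dictionary layout are illustrative assumptions.

```python
def posterior(priors, likelihoods):
    """Revise prior probabilities P(Ei) given conditional probabilities P(X/Ei)."""
    joints = {e: priors[e] * likelihoods[e] for e in priors}  # P(Ei and X)
    evidence = sum(joints.values())                           # P(X)
    return {e: j / evidence for e, j in joints.items()}

# Machine shares and part-X rates from the example.
priors = {"A": 0.6, "B": 0.3, "C": 0.1}
p_x_given = {"A": 0.4, "B": 0.5, "C": 0.7}
# A part is either X or Y, so P(Y/Ei) = 1 - P(X/Ei).
p_y_given = {e: 1 - p for e, p in p_x_given.items()}

post_x = posterior(priors, p_x_given)
post_y = posterior(priors, p_y_given)

for machine in priors:
    print(machine, round(post_x[machine], 2), round(post_y[machine], 3))
```

The same function applies to the exercises that follow by swapping in the relevant priors and conditional probabilities.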
2. Bruk, Alemayehu and Yohannes fill orders in a fast food restaurant. Bruk fills
20% of the orders he takes incorrectly. Alemayehu fills 12% of the orders he
takes incorrectly, and Yohannes fills 5% of the orders he takes incorrectly. Bruk
fills 30% of all orders, Alemayehu fills 45% of all orders, and Yohannes fills 25%
of all orders. An order has just been filled.
a) What is the probability that Alemayehu filled the order? 0.45
b) If the order was filled by Yohannes, what is the probability that it
was filled correctly? 0.95
c) Who filled the order is unknown, but the order was filled correctly. What
are the revised probabilities that Bruk, Alemayehu or Yohannes filled the
order? 0.2748, 0.4533 and 0.2719
d) Who filled the order is unknown, but the order was filled incorrectly. What
are the revised probabilities that Bruk, Alemayehu or Yohannes filled the
order? 0.4743, 0.4269 and 0.0988
3. A major league baseball team has four starting pitchers: Girma, Robel, Solomon,
and Asrat. Each pitcher starts every fourth game. The team wins 60% of all
games that Girma starts, 45% of all games that Robel starts, 35% of all games that
Solomon starts, and 40% of all games that Asrat starts. An avid fan has just returned
from a three week vacation in the wilderness and found out that the team played
yesterday.
a) What is the probability that Girma started the game? 0.25
b) What is the probability that Solomon started the game? 0.25
c) If the team won yesterday, revise the probability of each pitcher starting the
game? 0.333, 0.250, 0.194 and 0.222
Laws of Probability
Additive Law
The general law of addition is used to find the probability of the union of two
events, P (E1 U E2). The general Law of Addition is presented as follows:
$$P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$$
Example:
1. Test the matrix for the 200 executive responses to determine whether industry type is
independent of geographic location.
Geographic Location
Industry type North East, D South East, E Mid West, F West, G Total
Finance, A 24 10 8 14 56
Manufacturing, B 30 6 22 12 70
Communication, C 28 18 12 16 74
Total 82 34 42 42 200
Solution:
Select one industry type and one geographic location (Say A – Finance and G –
West). Does P (A/G) = P (A)?
P (A/G) = P (A ∩ G)/P (G) = 0.07/0.21 = 0.33
P (A) = 56/200 = 0.28
Since P (A/G) ≠P (A), industry type and geographic location are not independent.
2. Considering the above problem, if a respondent is randomly selected from these
data:
a) What is the probability that this executive is from the mid west? 0.21
b) What is the probability that a respondent is from the communication
industry or from north east? 0.64
c) What is the probability that a respondent is from the south east or from
finance industry? 0.40 (= 34/200 + 56/200 − 10/200)
d) What is the probability that this executive is from the south east or the west?
0.38
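A sketch of how the general law of addition and the independence check can be applied to the 200 executive responses. The nested-dictionary layout and helper names are illustrative assumptions; the counts are those of the table in the example.

```python
# Executive responses: industry -> {location -> count}.
table = {
    "A": {"D": 24, "E": 10, "F": 8,  "G": 14},   # Finance
    "B": {"D": 30, "E": 6,  "F": 22, "G": 12},   # Manufacturing
    "C": {"D": 28, "E": 18, "F": 12, "G": 16},   # Communication
}
total = sum(sum(row.values()) for row in table.values())

def p_industry(i):
    return sum(table[i].values()) / total

def p_location(g):
    return sum(row[g] for row in table.values()) / total

def p_joint(i, g):
    return table[i][g] / total

def p_union(i, g):
    """General law of addition: P(i or g) = P(i) + P(g) - P(i and g)."""
    return p_industry(i) + p_location(g) - p_joint(i, g)

print("P(F) =", round(p_location("F"), 2))
print("P(C or D) =", round(p_union("C", "D"), 2))
print("P(A or E) =", round(p_union("A", "E"), 2))
# Two locations are mutually exclusive, so nothing is subtracted.
print("P(E or G) =", round(p_location("E") + p_location("G"), 2))
# Independence check: compare P(A/G) with P(A).
print("P(A/G) =", round(p_joint("A", "G") / p_location("G"), 2),
      "P(A) =", round(p_industry("A"), 2))
```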
3. The results of a survey asking, “Do you have a calculator and/or a computer in
your home?” are as follows:
Calculator
Yes No
Computer Yes 46 3
No 11 15
Is the variable calculator independent of the variable computer? Why or why
not? NO
$$P(E_1 \cap E_2) = P(E_1)\,P(E_2/E_1) = P(E_2)\,P(E_1/E_2)$$
Perhaps the most widely known of all discrete probability distributions is the
binomial distribution. The binomial distribution has the following underlying
assumptions:
i. The experiment involves n identical trials, or sampling is
done with replacement.
ii. Each trial has only two possible mutually exclusive
outcomes. [Bi = Two]
iii. Each trial is independent of the previous trials.
iv. The probability of success (p) and failure (q = 1 − p) remain
constant for each trial.
v. In n trials, only X successes are possible, where X is a whole
number between 0 and n [0 ≤ X ≤ n].
vi. It is applicable if the sample size n is less than 5% of the
population size N or if samples are taken with replacement.
$$P(x) = {}_nC_x\,p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$$
where: n = number of trials (sample size)
x = number of successes desired
p = probability of success
q = 1 − p = probability of failure
Example:
1. If we toss a coin three times, what is the probability of getting exactly two heads?
Solution:
In a single toss, the probability of getting a head or a tail is 0.5. In tossing the coin
three times, the following are the possible outcomes.
HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Exactly two heads occur in HHT, HTH and THH, so the probability of getting
exactly two heads is computed as
= (0.5*0.5*0.5) + (0.5*0.5*0.5) + (0.5*0.5*0.5)
= 0.125 * 3 = 0.375
Using the Binomial formula
p = 0.50 q = 1 – 0.50 = 0.50 n = 3 x = 2
$P(x=2) = {}_nC_x\,p^x q^{n-x}$
$= {}_3C_2 \times 0.5^2 \times 0.5^1$
= 3(0.25*0.5) = 3(0.125) = 0.375
There are three ways of choosing exactly two heads from a total of three trials.
2. A researcher wants to test the claim that 10% of all people are left-handed by
randomly selecting forty students at a university. What is the probability of
getting six left handed students among forty?
Solution:
P = 0.10 q = 1 – 0.10 = 0.90 n = 40 x=6
P(x=6) = 0.1068
If 10% of the population is left-handed, about 10.68% of the time the researcher
would get six who are left handed in a sample of forty.
3. Based on past data, approximately 30% of the oil wells drilled in areas having a
certain favorable geological formation have struck oil. A company has identified
5 locations that possess this information. Assuming that the chance of striking oil
on any location is independent of any others, calculate the probability that
exactly 2 of the 5 wells strike oil.
Solution:
P = 0.30 q = 1 – 0.30 = 0.70 n=5 x=2
P(x = 2) = 0.3087
If the probability of striking oil in areas having a certain favorable geological
formation is 0.3, then about 31% of the time exactly 2 of 5 wells drilled will
strike oil.
4. The quality control department of a manufacturer tested the most recent batch of
1000 catalytic converters produced and found 50 of them defective.
Subsequently, an employee unwittingly/unintentionally mixed the defective
converters with the non-defective ones. If a sample of three converters is
randomly selected from the mixed batch, what is the probability that the
employee may get one defective item?
Solution:
Before we try to solve this problem we have to check whether all the assumptions
of a binomial distribution are satisfied. One of the assumptions states that
the sample size, n, must be less than five percent of the population size, N. In our
case, the sample size is less than 5% of the population size [3/1000 = 0.003 < 0.05],
so we can use the binomial distribution to solve this specific exercise.
N = 1000 n = 3 x = 1
p = R/N = 50/1000 = 0.05, where R is the number of successes in the population
q = 1 – 0.05 = 0.95
P(x=1) = 0.1354
If 5% of the product contains defective converters, 13.54% of the time the quality
control department would get 1 defective item in a sample of three converters.
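The binomial probabilities in the examples above can be checked with a short sketch. The function name `binomial_pmf` is an illustrative choice; it implements the standard binomial formula with the combination count from Python's `math.comb`.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# (x, n, p) for the worked examples: coin tosses, left-handed students,
# oil wells, and defective converters.
examples = {
    "two heads in three tosses": (2, 3, 0.50),
    "six left-handed of forty": (6, 40, 0.10),
    "two oil strikes in five wells": (2, 5, 0.30),
    "one defective in three": (1, 3, 0.05),
}
results = {name: binomial_pmf(*args) for name, args in examples.items()}

for name, prob in results.items():
    print(f"{name}: {prob:.4f}")
```

Summing `binomial_pmf(x, n, p)` over all x from 0 to n gives 1, which is a quick sanity check on any set of inputs.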
Some binomial tables only show values of p up to 0.5. Thus, it would appear that
these tables cannot be used when the probability of success exceeds p = 0.5.
However, such tables can be used by noting that the probability of n−x failures is
also the probability of x successes. That is, finding the probability of x successes
is equal to finding the probability of n−x failures. nCx and nCn−x are always
equal.
Example:
Suppose that 70% of all cola drinkers select non diet colas. If 10 cola drinkers are
randomly selected, what is the probability that 4 of them will be diet cola
drinkers?
Solution:
Finding the probability of 4 diet cola drinkers is equivalent to finding the
probability of 6 non diet cola drinkers.
n = 10 p= 0.7 q= 0.3 x= 6
P(x=6) = 0.2001
3. A lawyer estimates that 40% of the cases in which she represented the defendant
were won. If the lawyer is presently representing 10 defendants in different
cases, what is the probability that at least 5 of the cases will be won? What are
you assuming here?
Solution:
The assumption we are making here is that the cases in which the lawyer is
representing defendants are independent. With this assumption:
n = 10 p= 0.4 q= 0.6 x≥ 5
P (x≥5) = P(x=5) + P(x=6) + P(x=7) + P(x=8) + P(x=9) +P(x=10)
= 0.2007 + 0.1115 + 0.0425 + 0.0106 + 0.0016 + 0.0001
= 0.3670
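An "at least" probability such as this one is a sum of individual binomial terms, which can be sketched as follows. The helper names are illustrative, and the exact sum may differ from a table-based total in the last decimal place because each table entry is rounded.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def at_least(k, n, p):
    """P(X >= k): add the probabilities of k, k+1, ..., n successes."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

# Lawyer example: n = 10 cases, p = 0.4, at least 5 wins.
p_win = at_least(5, 10, 0.4)
# Same quantity via the complement: 1 - P(X <= 4).
p_win_complement = 1 - sum(binomial_pmf(x, 10, 0.4) for x in range(5))

print(f"P(x >= 5) = {p_win:.4f}")
```

When k is small relative to n, the complement form needs fewer terms.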
Example:
1. According to a study conducted approximately 55% of all hospitals in a given
town contained 100 or more beds. A researcher draws a sample of 15 hospitals by
randomly selecting names from a directory of hospitals.
a) What is the probability of selecting 10 or more hospitals that have 100 or more
beds?
b) What is the probability of selecting less than five hospitals that have 100 or
more beds?
c) What is the probability of selecting from six to ten hospitals, inclusive, that
have 100 or more beds?
2. A manufacturing company produces 10,000 plastic parts per week. This
company supplies plastic parts to another company, which packages the plastic
parts as part of picnic sets. The second company randomly samples 10 plastic
parts sent from the supplier. If two or fewer of the sampled plastic parts are
defective, the second company accepts the lot. What is the probability that the lot
will be accepted if the parts the manufacturing company actually produces are
10% defective? 20% defective? 30% defective? 40% defective?
$$\mu = E(X) = \sum [X \cdot P(X)] = \frac{\sum X_i f_i}{\sum f_i}$$
Where: E(X) = long run average
X = an outcome
P(X) = the probability of that outcome
The variance of a discrete random variable, which measures how far the values are
dispersed around the mean, is calculated as
$$\delta^2 = \sum (X - \mu)^2 \cdot P(X)$$
Where: X = an outcome
µ = mean
P(X) = the probability of that outcome
The standard deviation of a discrete random variable is calculated simply by
taking the square root of the variance: $\delta = \sqrt{\sum (X - \mu)^2 \cdot P(X)}$.
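As an illustration of these formulas, the sketch below computes the mean, variance and standard deviation of a small discrete distribution. The distribution used (number of heads in three tosses of a fair coin) is an assumed example chosen for convenience.

```python
from math import sqrt

# Assumed example: number of heads in three tosses of a fair coin.
dist = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# Expected value: sum of each outcome times its probability.
mean = sum(x * p for x, p in dist.items())
# Variance: probability-weighted squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
std_dev = sqrt(variance)

print("mean =", mean)
print("variance =", variance)
print("standard deviation =", round(std_dev, 4))
```

For a binomial variable these values agree with the shortcuts np and npq (here 3 × 0.5 and 3 × 0.5 × 0.5).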
Hypergeometric Distribution
The binomial distribution assumes that the probability of success (p) and failure
(q = 1 - p) are the same throughout the experiment. This is because
– events are independent
– sampling is done with replacement
– n < 0.05N
– population is infinite
However, in cases where sampling is without replacement and the sample size
exceeds 5% of the population size, it is necessary to use the hypergeometric
distribution to determine correct probability.
The hypergeometric distribution has the following characteristics.
- It is a discrete distribution.
- Each outcome consists of either a success or a failure.
- Sampling is done without replacement.
- The population size is finite and known.
- It is described by three parameters: N, r and n. Because of the multitude of
possible combinations of these three parameters, creating tables for the
hypergeometric distribution is practically impossible.
- The number of successes in the population, r, is known.
- The sample size is ≥ 5% of the population.
Under the above conditions, we can use the hypergeometric distribution for
determining the correct probability, with the following formula:
$$P(X) = \frac{{}_{N-r}C_{n-x} \cdot {}_rC_x}{{}_NC_n}$$
Where: P(X) = the probability of x successes in the sample
N = population size
n = sample size
r = number of successes in the population
x = number of successes in the sample for which a
probability is desired
C = combination
N-r = the number of items in the population that are
labeled as failure
NCn = the number of ways a sample of size n can be
selected from a population of size N
rCx = the number of ways x successes can be selected
from a total of r successes in the population
N-rCn-x = the number of ways n-x failures can be selected
from a total of N-r failures in the population
Example:
1. 24 people, of whom eight are women, have applied for a job. If five of the
applicants are randomly selected, what is the probability that three of those
sampled are women?
Solution:
N = 24 n=5 r=8 x=3
$$P(X = 3) = \frac{{}_{24-8}C_{5-3} \cdot {}_8C_3}{{}_{24}C_5} = \frac{120 \times 56}{42{,}504} = 0.1581$$
2. A shipment of 10 items has two defective and eight non-defective units. In the
inspection of the shipment, a sample of units will be selected and tested. If a
defective unit is found, the shipment of 10 units will be rejected.
a) If a sample of three items is selected, what is the probability that the
shipment will be rejected?
b) If management would like a 0.90 probability of rejecting a shipment with
two defective and eight non-defective units, how large a sample would
you recommend?
3. Suppose that there are 18 major insurance companies in Ethiopia and that 12 are
located in Addis. If three insurance companies are randomly selected from the
entire list, what is the probability that one or more of the selected companies are
located in Addis?
Solution:
N = 18 n=3 r = 12 x≥1
P(x ≥ 1) = P(x = 1) + P(x = 2) + P(x = 3)
= 0.2206 + 0.4853 + 0.2696
= 0.9755
2. Of a group of 300 men, 240 are physically fit. If five men are randomly selected,
what is the probability that three of them are physically fit?
Solution:
Using hypergeometric distribution
N = 300 n=5 r = 240 x=3
P(x=3) = 0.2057
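The hypergeometric examples above can be checked with a short sketch built on `math.comb`. The function name `hypergeom_pmf` and its argument order are illustrative choices.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains r successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Job applicants: N = 24, r = 8 women, n = 5 selected, x = 3 women.
p_women = hypergeom_pmf(3, 24, 8, 5)

# Insurance companies: N = 18, r = 12 in Addis, n = 3, at least one in Addis.
p_addis = sum(hypergeom_pmf(x, 18, 12, 3) for x in range(1, 4))

# Physically fit men: N = 300, r = 240, n = 5, x = 3.
p_fit = hypergeom_pmf(3, 300, 240, 5)

print(f"{p_women:.4f} {p_addis:.4f} {p_fit:.4f}")
```

For the "one or more" case, the complement 1 − P(x = 0) gives the same result with a single term.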
While a binomial random variable counts the number of successes that occur in a
fixed number of trials, a Poisson random variable counts the number of rare
events (successes) that occur in a specified continuous time interval or specified
region.
The probability of getting 10 customers during the next eight minutes in a bank is
0.0528. Or there is 5.28% chance that exactly 10 customers will arrive in eight
minutes at a bank.
3. If a real estate office sells 1.6 houses on an average weekday and sales of houses on
weekdays are Poisson distributed, what is the probability of selling:
a) Four houses in a day? 0.0551
b) No house in a day? 0.2019
c) More than five houses in a day? 0.0060
4. A secretary types 75 words per minute and averages six errors per hour of typing.
Assuming error occurrences are a Poisson process, what is the probability that a
225-word letter will be typed without error? 0.7408
5. A pen company averages 1.2 defective pens per carton produced (200 pens). The
number of defects per carton is Poisson distributed.
a) What is the probability of selecting a carton and finding no defective
pen? 0.3012
b) What is the probability of finding eight or more defective pens in a
carton? 0.0000
c) Suppose that a purchaser of these pens will quit buying from the company
if a carton contains more than three defectives. What is the probability that
the purchaser will quit buying from this company? 0.0338
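The answers to these Poisson exercises can be reproduced with a short sketch. It assumes the standard Poisson formula P(X) = λ^X e^(−λ) / X!, which is not printed in this part of the text; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam (standard formula)."""
    return lam**x * exp(-lam) / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x): sum of the probabilities of 0, 1, ..., x events."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# Real estate office, mean 1.6 houses per weekday.
four_houses = poisson_pmf(4, 1.6)
no_house = poisson_pmf(0, 1.6)
more_than_five = 1 - poisson_cdf(5, 1.6)

# Secretary: 225 words at 75 words/minute = 3 minutes;
# 6 errors/hour = 0.1 errors/minute, so the mean is 0.3 errors per letter.
no_error = poisson_pmf(0, 3 * 6 / 60)

# Pen cartons, mean 1.2 defectives per carton.
no_defective = poisson_pmf(0, 1.2)
more_than_three = 1 - poisson_cdf(3, 1.2)

print(f"{four_houses:.4f} {no_house:.4f} {more_than_five:.4f}")
print(f"{no_error:.4f} {no_defective:.4f} {more_than_three:.4f}")
```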
Binomial tables are often not available for large values of n, so in these cases the
Poisson approximation can be useful. In cases where p ≤ 0.05 and n ≥ 20, substitute the
mean of the binomial distribution (µ = np) in place of the mean of the Poisson
distribution:
$$P(X) = \frac{(np)^X e^{-np}}{X!}$$
In general, the larger n is and smaller p is, the better will be the approximation.
Why approximation?
- The Poisson formula is easier to use than the binomial formula.
- It can be tabulated more efficiently than binomial probabilities
because the Poisson distribution has only one parameter, µ (λ), whereas
the binomial distribution has two parameters, n and p.
Example:
n = 500 p= 0.02 µ = np = 500*0.02 = 10
n = 1000 p= 0.01 µ = np = 1000*0.01 = 10
If we want to calculate P(X) for both cases, we can tabulate them in a single
Poisson column, because both share µ = 10. Had we used the binomial for the above
cases, we would have needed two columns.
1. A company sells insurance policies to a random sample of 1000 men who are 35
years of age. The probability that a 35-year-old man dies within a year is
approximately 0.002. What is the probability that the insurance company will
have to pay claims on 2 or more policies next year?
Solution:
Steps: 1. Make sure P≤0.05 and n≥20
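A sketch of how the Poisson approximation compares with the exact binomial for this insurance problem. It assumes the standard Poisson formula with mean µ = np and computes "2 or more" as the complement of 0 or 1 claims; the helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 1000, 0.002
assert p <= 0.05 and n >= 20  # conditions for using the approximation
mu = n * p                    # mean of the binomial, used as the Poisson mean

# P(X >= 2) = 1 - P(X = 0) - P(X = 1)
approx = 1 - poisson_pmf(0, mu) - poisson_pmf(1, mu)
exact = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

print(f"Poisson approximation: {approx:.4f}")
print(f"Exact binomial:        {exact:.4f}")
```

With n this large and p this small, the two values agree to about three decimal places.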
$$Z = \frac{X - \mu}{\delta}$$
Where: Z = number of standard deviations from the mean
X = value of interest
µ = mean of the distribution
δ = standard deviation of the distribution
A Z-score is the number of standard deviations that a value, X, is away from the
mean. If the value of X is less than the mean, the Z-score is negative; if the value
of X is greater than the mean, the Z-score is positive. The Z-score is also known as
the z-value: a standardized score for which the mean is zero and the standard
deviation is 1. The Z-score is used to represent the standard normal distribution.
a) P (485≤X≤600) =?
1. First convert the X values into Z-scores using the formula $Z = \frac{X - \mu}{\delta}$
Z485 = 0
$Z_{600} = \frac{600 - 485}{105} = +1.10$
2. P(485≤X≤600) = P(0≤Z≤+1.10)
= P (0 to +1.10)
= 0.36433
b) P (X>650) =?
1. First convert the X value into a Z-score using the formula $Z = \frac{X - \mu}{\delta}$
$Z_{650} = \frac{650 - 485}{105} = +1.57$
2. P(X>650) = P(Z>+1.57)
= 0.5- P (0 to +1.57)
= 0.5-0.44179
= 0.05821
c) P (X<300) =?
1. First convert the X value into a Z-score using the formula $Z = \frac{X - \mu}{\delta}$
$Z_{300} = \frac{300 - 485}{105} = -1.76$
2. P(X<300) = P(Z<-1.76)
= 0.5- P (0 to -1.76)
= 0.5-0.46080
= 0.03920
d) P (350≤X≤550) =?
1. First convert the X values into Z-scores using the formula $Z = \frac{X - \mu}{\delta}$
$Z_{350} = \frac{350 - 485}{105} = -1.29$
$Z_{550} = \frac{550 - 485}{105} = +0.62$
2. P(350≤X≤550) = P (-1.29≤Z≤+0.62)
= P (0 to -1.29) + P (0 to 0.62)
= 0.40147 + 0.23237
= 0.63384
e) P (X<700) =?
1. First convert the X value into a Z-score using the formula $Z = \frac{X - \mu}{\delta}$
$Z_{700} = \frac{700 - 485}{105} = +2.05$
2. P(X<700) = P(Z<+2.05)
= P (X<485) + P (485≤X<700)
= 0.5+ P (0 to +2.05)
= 0.5 + 0.47982
= 0.97982
g) To find the expected number of applicants who score 590 or below, we first
find P (X≤590) and then multiply it by the number of applicants.
P (X≤590) =?
1. First convert the X value into a Z-score using the formula $Z = \frac{X - \mu}{\delta}$
$Z_{590} = \frac{590 - 485}{105} = +1.00$
2. P(X≤590) = P(Z≤+1.00)
= P (X<485) + P (485≤X<590)
= 0.5+ P (0 to +1.00)
= 0.5 + 0.34134
= 0.84134
If 500 applicants take the test, the number of applicants expected to score 590 or
below is 500(0.84134) = 420.67, or about 421 applicants.
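The table lookups in parts a) to g) can be cross-checked with Python's `statistics.NormalDist`. The mean of 485 and standard deviation of 105 are inferred from the worked steps above, since the problem statement itself is not shown. Results can differ from the table-based answers in the third decimal place because the Z-scores here are not rounded to two decimals.

```python
from statistics import NormalDist

# Mean and standard deviation inferred from the worked steps.
scores = NormalDist(mu=485, sigma=105)

p_a = scores.cdf(600) - scores.cdf(485)  # P(485 <= X <= 600)
p_b = 1 - scores.cdf(650)                # P(X > 650)
p_c = scores.cdf(300)                    # P(X < 300)
p_d = scores.cdf(550) - scores.cdf(350)  # P(350 <= X <= 550)
p_e = scores.cdf(700)                    # P(X < 700)
p_g = scores.cdf(590)                    # P(X <= 590)

expected_applicants = 500 * p_g          # expected count out of 500 applicants

for name, value in [("a", p_a), ("b", p_b), ("c", p_c),
                    ("d", p_d), ("e", p_e), ("g", p_g)]:
    print(name, round(value, 4))
print("expected applicants:", round(expected_applicants))
```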
2. The result of an exam score for a given class is normally distributed. If the mean
score is 85 points and the standard deviation is equal to 20 points, find the cutoff
passing grade such that 83.4% of those taking the test will pass.
Solution:
µ = 85 prob. of passing = 83.4%
δ = 20 cutoff point =?
Since 83.4% is greater than 50%, the cutoff point should be less than the mean,
and hence the Z-value is negative. The area between the cutoff and the mean is
0.834 − 0.5 = 0.334, and this calls for the inverse use of the
standard normal table.
(Z/P=0.334) = -0.97
$-0.97 = \frac{X - 85}{20}$
-19.4 = X-85
X = 65.6 points, the minimum score needed to pass the test.
3. Data accumulated by the National Climatic Data Center shows that the average
wind speed in miles per hour for Addis is 9.7mph. Suppose that wind speed
measurements are normally distributed for a given geographical location. If
22.45% of the time the wind speed measurements are more than 11.6mph, what
is the standard deviation of wind speed in Addis?
Solution:
µ = 9.7mph δ =? X > 11.6
P(X> 11.6) = 22.45%
The area between the mean and 11.6 is 0.5 − 0.2245 = 0.2755.
(Z/P = 0.2755) = +0.76
$+0.76 = \frac{11.6 - 9.7}{\delta}$
0.76δ = 1.9
δ = 2.5mph
4. The cylinder making machine has δ = 0.5mm and µ = 25mm. Within what interval
of values centered at the mean will the diameters of 80% of the cylinders lie?
Solution:
µ = 25mm δ = 0.5mm
From the statement it is clear that the interval is centered at the mean; i.e., half of
the 80% (40%) lies below the mean and the other half (40%) lies above the mean.
(Z/P=0.4) = ± 1.28
X1 = µ - Zδ X2 = µ + Zδ
$-1.28 = \frac{X_1 - 25}{0.5}$ $+1.28 = \frac{X_2 - 25}{0.5}$
-0.64 = X1-25 +0.64 = X2-25
X1 = 24.36mm X2 = 25.64mm
80% of the cylinder diameters lie between 24.36mm and 25.64mm.
5. The lives of light bulbs follow a normal distribution. If 90% of the bulbs have lives
exceeding 2000 hrs and 3% have lives exceeding 6000 hrs, what are the mean and
standard deviation of the lives of light bulbs?
Solution:
P(X>2000) = 0.90 P(X>6000) = 0.03
µ = ? δ = ?
(Z/P=0.4) = -1.28               (Z/P=0.47) = +1.88
-1.28 = (2000 - µ) / δ          +1.88 = (6000 - µ) / δ
-1.28δ = 2000 - µ               +1.88δ = 6000 - µ
µ = 2000 + 1.28δ                µ = 6000 - 1.88δ
Equating the two expressions for µ:
2000 + 1.28δ = 6000 - 1.88δ
3.16δ = 4000
δ = 1265.82 hrs
µ = 2000 + 1.28δ
= 2000 + 1.28(1265.82)
= 3620.25 hrs
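The two conditions form a pair of linear equations in the mean and the standard deviation. The sketch below solves them with unrounded Z-values from Python's statistics.NormalDist, so its results differ slightly from the figures obtained with the two-decimal table values -1.28 and +1.88.

```python
from statistics import NormalDist

std_normal = NormalDist()

# P(X > 2000) = 0.90 puts 10% of the area below 2000 hrs.
z_low = std_normal.inv_cdf(0.10)
# P(X > 6000) = 0.03 puts 97% of the area below 6000 hrs.
z_high = std_normal.inv_cdf(0.97)

# Subtracting 2000 = mean + z_low * sd from 6000 = mean + z_high * sd
# eliminates the mean and leaves the standard deviation.
bulb_sigma = (6000 - 2000) / (z_high - z_low)
bulb_mu = 2000 - z_low * bulb_sigma

print(f"standard deviation = {bulb_sigma:.1f} hrs, mean = {bulb_mu:.1f} hrs")
```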
6. On a civil service exam, the grades are normally distributed with µ = 70 points
and δ = 10 points. The police department hires the applicants whose grades are
among the top 10% of the population. What is the minimum grade required to be
hired?
Solution:
µ = 70 points δ = 10 points
The top 10% leaves an area of 0.5 - 0.1 = 0.4 between the mean and the cutoff.
(Z/P=0.4) = +1.28
+1.28 = (X - µ) / δ
+1.28 = (X - 70) / 10
12.8 = X - 70
X = 82.8, the minimum grade required to be hired.
7. A bakery shop sells loaves of freshly made bread. Any unsold loaves at the end of
the day are either discarded or sold elsewhere at a loss. The demand for this
bread has followed a normal distribution with µ = 35 loaves and δ = 8 loaves.
How many loaves should the bakery make each day so that they can meet the
demand 90% of the time?
Solution:
µ = 35 loaves δ = 8 loaves
(Z/P=0.4) = +1.28
+1.28 = (X - µ) / δ
+1.28 = (X - 35) / 8
10.24 = X - 35
X = 45.24 ≈ 46. By making 46 loaves of bread each day, the bakery will meet the
demand for this product 90% of the time.
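Problems 6 and 7 both ask for the 90th percentile of a normal distribution. A minimal sketch, assuming Python's statistics.NormalDist; the helper name level_covering is hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist

def level_covering(mu, sigma, share):
    """Value below which the given share of a normal population falls."""
    return NormalDist(mu, sigma).inv_cdf(share)

# Problem 6: the top 10% of grades start at the 90th percentile.
min_grade = level_covering(70, 10, 0.90)

# Problem 7: meeting demand 90% of the time also needs the 90th percentile;
# loaves are whole units, so round up.
loaves = math.ceil(level_covering(35, 8, 0.90))

print(f"Minimum grade: {min_grade:.1f}")
print(f"Loaves to bake: {loaves}")
```

Rounding up rather than to the nearest integer keeps the coverage at or above the 90% target.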
and hence another method of solving the problem must be found – the normal
distribution.
In the binomial distribution:
- When p is small (e.g. 0.1), the distribution is skewed to the right.
Mode < Median < Mean.
- As p increases (e.g. 0.3), the skewness is less noticeable.
- When p = 0.5, the distribution is symmetrical. Mode = Median = Mean.
- When p > 0.5, the distribution is skewed to the left. Mean < Median < Mode.
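The pattern in the list above can be observed by computing the skewness coefficient directly from the binomial probabilities. This is an illustrative sketch only; the number of trials (20) is an assumed value, and the function name is hypothetical.

```python
import math

def binomial_skewness(n, p):
    """Skewness of Binomial(n, p), computed directly from its probabilities."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    third = sum((k - mean) ** 3 * pk for k, pk in enumerate(pmf))
    return third / var ** 1.5

n_trials = 20   # assumed for illustration
skews = {p: binomial_skewness(n_trials, p) for p in (0.1, 0.3, 0.5, 0.7)}
for p, s in skews.items():
    print(f"p = {p}: skewness = {s:+.3f}")
```

A positive coefficient indicates a right skew, zero indicates symmetry, and a negative coefficient indicates a left skew.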
Example:
1. According to a recent study conducted by the Addis Ababa University,
87% of all evening college students also work. If this figure still holds and if 120
evening class college students are randomly selected, what is the probability that
less than 100 also work? Use normal distribution to approximate the binomial.
Solution:
1. n = 120 (large) and p = 0.87. Although p is not close to 0.5, n is large enough:
np = 0.87*120 = 104.40 and nq = 0.13*120 = 15.6, both greater than 5.
2. µ = np = 120*0.87 = 104.40 and δ = √npq = √(120*0.87*0.13) = 3.684
3. µ ± 3δ = 104.40 ± 3(3.684) = 104.40 ± 11.052, i.e., from 93.348 to 115.452.
Hence, the interval (93.35 to 115.45) lies between 0 and n (120).
4. P(X < 100) of the binomial is changed into P(X < 99.5) of the normal by applying
the continuity correction factor.
5. P(X < 99.5) = ?
Z99.5 = (99.5 - 104.4) / 3.684 = -1.33
P(X < 99.5) = P(Z < -1.33)
= 0.5 - P(0 to -1.33)
= 0.5 - 0.40824
= 0.09176
6. If 87% of all evening college students work, then about 9.18% of the time a
random sample of 120 evening college students would contain fewer than 100
working students.
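To see how close the approximation is, the normal result can be compared with the exact binomial sum. The sketch below uses only Python's standard library; it is a cross-check under the stated n and p, not part of the table-based method.

```python
import math
from statistics import NormalDist

n, p = 120, 0.87
mu = n * p
sigma = math.sqrt(n * p * (1 - p))   # the standard deviation, written δ in the text

# Normal approximation with continuity correction: P(X < 100) -> P(X < 99.5).
approx = NormalDist(mu, sigma).cdf(99.5)

# Exact binomial probability P(X <= 99) for comparison.
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(100))

print(f"normal approximation: {approx:.5f}")
print(f"exact binomial:       {exact:.5f}")
```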
2. In a travel study, the Ethiopian Tourism Commission reported that during the
Ethio-Eritrean war, 29% of the tourists who came to Ethiopia said that the crisis
would affect their vacation plans.
a) If the figure is still 29% at the end of the war, what is the probability that, in a
random sample of 150 travelers, 20 or fewer say that the Ethio-Eritrean crisis
would affect their vacation plans? (Answer: 0.0000)
b) However, a study at the end of the war indicated that only 7% of the travelers
felt at that time that the Ethio-Eritrean crisis would affect their vacation plans.
What is the probability that a random sample of 150 travelers would result in
20 or fewer travelers saying that the Ethio-Eritrean crisis would affect their
vacation plans? (Answer: 0.99934)
As we can see from the above result, the difference is only 0.0057 (0.0835 - 0.0778).
As µ increases, the difference decreases.
Solution:
λ = 1.38 defects/20 minutes t = 15 minutes P(t < 15) = ?
λt = (1.38 defects / 20 minutes) * 15 minutes = 1.035
P(t < 15) = P(0 to 15) = 1 - e^(-λt)
= 1 - e^(-1.035)
= 1 - 0.3552
= 0.6448
There is a probability of 64.48% that there will be less than 15 minutes between
two defects when there is an average of 1.38 defects per 20-minute interval (or
an average of 20/1.38 = 14.49 minutes between defects).
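The same calculation can be written out in a few lines. This sketch uses only Python's math module; the variable names are illustrative, and the rate is first converted to defects per minute so that it matches the units of the 15-minute window.

```python
import math

rate_per_minute = 1.38 / 20   # defects per minute
window = 15                   # minutes

# Exponential CDF: P(t < window) = 1 - e^(-rate * window)
p_within_window = 1 - math.exp(-rate_per_minute * window)

# Mean time between defects is the reciprocal of the rate.
mean_gap = 1 / rate_per_minute

print(f"P(t < 15) = {p_within_window:.4f}")
print(f"Mean time between defects = {mean_gap:.2f} minutes")
```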
4. Supermarkets usually get very busy at about 5pm on weekdays as many workers
stop by on the way home to shop. Suppose that at that time arrivals are Poisson
distributed at a supermarket’s checkout station with an average of 0.8 people per
minute. The clerk has just checked out the last person in line.
a) What is the probability that at least one minute will elapse before the next
customer arrives? (Answer: 0.4493)
b) Suppose the clerk needs to go to the manager's office to ask a quick question
and that 2.5 minutes are needed to do so. What is the probability that the
clerk will get back before the next customer arrives? (Answer: 0.1353)
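Both parts ask for the probability that no customer arrives within a given time, which for exponentially distributed gaps is e^(-λt). A minimal sketch in Python; the helper name p_no_arrival is hypothetical.

```python
import math

arrival_rate = 0.8   # customers per minute

def p_no_arrival(minutes, rate=arrival_rate):
    """P(next arrival takes longer than `minutes`) = e^(-rate * minutes)."""
    return math.exp(-rate * minutes)

p_one_minute = p_no_arrival(1)    # part a
p_errand = p_no_arrival(2.5)      # part b

print(f"a) {p_one_minute:.4f}")
print(f"b) {p_errand:.4f}")
```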
Reliability Engineering
Another situation that often fits the exponential distribution is the lifetime of
certain components in a machine; i.e., the exponential distribution is widely
used in reliability engineering to describe the time to failure of
a component or a system. The parameter µ is called the mean time to failure, and
λ = 1/µ is the failure rate of the system.
Example:
1. Suppose that an automobile battery has a useful life described by the exponential
distribution with a mean of 1000 days.
a) What is the probability that a battery will fail before its expected life time of
1000 days?
b) If the battery has a 12-month (365 days) warranty, what fraction of the
batteries fail during the warranty period?
c) Find the probabilities that batteries will last between 1000 and 2000 days.
d) Find the probabilities that such batteries will last more than 2000 days.
Solution:
µ = 1000 days
a) P(t < 1000) = ?
P(0 to X0) = 1 - e^(-X0/µ)
= 1 - e^(-1000/1000)
= 1 - e^(-1)
= 0.6321
There is a 63% chance that the battery will fail prior to its mean lifetime of 1000
days. This value is greater than 50% because the distribution is not symmetrical
and is positively skewed.
b) P(t ≤ 365) = P(0 to 365)
= 1 - e^(-X0/µ)
= 1 - e^(-365/1000)
= 1 - e^(-0.365)
= 0.3058
If the mean lifetime of the batteries is 1000 days, 30.58% of the batteries will fail
within 365 days; that is, the manufacturer will be forced to replace 30.58% of the
batteries during the one-year warranty period.
c) P(1000 ≤ t ≤ 2000) = P(0 to 2000) - P(0 to 1000)
= [1 - e^(-2000/1000)] - [1 - e^(-1000/1000)]
= (1 - e^(-2)) - (1 - e^(-1))
= (1 - 0.1353) - (1 - 0.3679)
= 0.8647 - 0.6321
= 0.2326
The batteries have a 23.26% chance of lasting between 1000 and 2000 days.
d) P(t > 2000) = 1 - P(0 to 2000)
= 1 - (1 - e^(-2000/1000))
= e^(-2)
= 0.1353
The batteries have a 13.53% chance of lasting more than 2000 days.
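All four parts of the battery example follow from the exponential cumulative probability 1 - e^(-t/µ). The sketch below is one way to check them in Python; the function name p_fail_by is hypothetical and not taken from any library.

```python
import math

def p_fail_by(t, mean_life):
    """Exponential CDF: probability that failure occurs by time t."""
    return 1 - math.exp(-t / mean_life)

mean_life = 1000   # days (mean time to failure)

fail_before_mean = p_fail_by(1000, mean_life)          # part a
fail_in_warranty = p_fail_by(365, mean_life)           # part b
last_1000_to_2000 = (p_fail_by(2000, mean_life)
                     - p_fail_by(1000, mean_life))     # part c
last_beyond_2000 = 1 - p_fail_by(2000, mean_life)      # part d

for label, value in [("a", fail_before_mean), ("b", fail_in_warranty),
                     ("c", last_1000_to_2000), ("d", last_beyond_2000)]:
    print(f"{label}) {value:.4f}")
```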