CHAPTER TWO
PROBABILITY AND PROBABILITY DISTRIBUTIONS
2.1. Probability Theory
Basic Concepts
Probability is a numerical measure of the likelihood, or chance, that an
uncertain event will occur. It can assume any value between 0 and 1, inclusive.
A probability near 0 indicates that the outcome is very unlikely to occur, while
a probability near 1 indicates that the event is almost certain to happen. At
the extremes, a probability of 0 means the event will never happen and a
probability of 1 means it will always happen. Thus, probabilities are
non-negative values that never exceed 1. Probability is the basis for
inferential statistics.

Experiment
An experiment is any well-defined situation or procedure that results in one or
more possible outcomes. Put simply, it is any process that generates
well-defined outcomes. For instance, tossing a coin, rolling a die, and playing
a football match can all be taken as experiments.

Outcome
An outcome is a particular result of an experiment. For example, getting a head
or a tail is a possible outcome of the coin-tossing experiment. Winning,
losing, or a tie/draw are the possible outcomes of the football experiment, and
getting 1, 2, 3, 4, 5, or 6 are the possible outcomes of the die-rolling
experiment.

Events
An event is a specific collection of basic outcomes, that is, a set containing
one or more of the basic outcomes from the sample space. An event identifies one
or more outcomes of an experiment. For example, in the die-rolling experiment,
any collection of one or more of the six possible outcomes (such as rolling an
even number) can be taken as an event.

Sample Space
A sample space is a complete roster or listing of all possible outcomes of an
experiment. The sample space of an experiment is usually illustrated either by a
list or by some type of diagram, such as a Venn diagram or a tree diagram.

Illustration of an experiment, outcomes, events, and sample space:

Tossing/flipping a coin twice …………………… Experiment
Heads or Tails (on each toss) ………………… Outcomes/elementary events
HH, HT, TH, TT …………………………………… 4 events
(HH, HT, TH, TT) ………………………………… Sample space

Exercise
Identify the experiment, outcomes, events, and sample space for the following
situations.
1. Sitting for an exam ……………………………… Experiment
   Scoring A, B, C, D, F …………………………… Possible outcomes
   [A, B, C, D, F] …………………………………… Sample space
   Scoring B and above …………………………… Event
   Scoring C and above …………………………… Event
   Scoring D or below ……………………………… Event

2. Football game ……………………………………… Experiment
   Win, Lose, Tie/Draw ……………………………… Outcomes
   [W, L, T] …………………………………………… Sample space
   Not winning (L, T), Not losing (W, T) ………… Events

Events
1. Independent events
Two or more events are independent if the occurrence or nonoccurrence of one of
the events does not affect the occurrence or nonoccurrence of the others. Certain
experiments, such as rolling two dice, yield independent events; each die is
independent of the other. Whether a 6 is rolled on the first die has no influence
on whether a 6 is rolled on the second die. Coin tosses are likewise independent
of each other: the probability of getting a head on the second toss of a coin is
unaffected by whether a head came up on the first toss.

The impact of independent events on the probability is that, if two events are
independent, the probability of attaining the second event is the same regardless
of the outcome of the first event. The probability of tossing a head is always ½
regardless of what was tossed previously. Thus, if someone tosses a coin six
times and gets six heads, the probability of tossing a head on the seventh time is
½, because coin tosses are independent. In terms of symbolic notation, if X and Y
are independent: P(X/Y) = P(X) and P(Y/X) = P(Y), where P(X/Y) denotes the
probability of X occurring given that Y has occurred, and P(Y/X) denotes the
probability of Y occurring given that X has occurred.
2. Dependent events
Two or more events are dependent if the occurrence or nonoccurrence of one of
the events affects the occurrence or nonoccurrence of the others. A single roll
of a die, for example, yields dependent events: the occurrence of any one of the
six faces depends on the nonoccurrence of the other five.
The impact of dependent events on probability is that, if two events are
dependent, the probability of the second event changes with the outcome of the
first event. In terms of symbolic notation, if X and Y are dependent:
P(X/Y) ≠ P(X) and P(Y/X) ≠ P(Y), where P(X/Y) denotes the probability of X
occurring given that Y has occurred, and P(Y/X) denotes the probability of Y
occurring given that X has occurred.

3. Mutually exclusive events (disjoint events), the opposite of joint events

Two or more events are mutually exclusive if the occurrence of one event
precludes the occurrence of the other events. This characteristic means that
mutually exclusive events cannot occur simultaneously and therefore can have
no intersection.
In the toss of a single coin, the events of heads and tails are mutually
exclusive. The person tossing the coin gets either a head or a tail but never
both. The probability of two mutually exclusive events occurring at the same
time is zero. In terms of set notation, if events X and Y are mutually
exclusive, P(X ∩ Y) = 0; that is, the probability of X intersecting Y is zero.

Relating the above three types of events: mutually exclusive events (each with
nonzero probability) must be dependent, but dependent events need not be
mutually exclusive. Independent events with nonzero probabilities cannot be
mutually exclusive. Therefore, mutual exclusivity implies dependence, and
independence implies that the events are not mutually exclusive, but no other
simple implications among these conditions hold true.

4. Collectively exhaustive events


A list of collectively exhaustive events contains all possible elementary events for
an experiment. Thus all sample spaces are collectively exhaustive lists. The
sample space for an experiment can be described as a list of events that are
mutually exclusive and collectively exhaustive.

5. Complementary events
The complement of an event A is denoted A′. All elementary events of an
experiment not in A comprise its complement. For example, if in rolling one die,
event A is getting an even number, the complement of A is getting an odd
number. If event A is getting a 5 on the roll of a die, the complement of A is
getting a 1, 2, 3, 4, or 6. The complement of event A contains whatever portion
of the sample space event A does not contain.

Using the complement of an event can sometimes be helpful in solving for
probabilities because of the rule: P(A′) = 1 − P(A).

Principles of counting
Counting the number of ways in which events may occur in an experiment plays
a major role in probability. Some rules for counting are presented in this section.
The first of these is called the fundamental principle of counting.
The fundamental principle of counting specifies that if one event can occur in
n1 ways and another event can occur in n2 ways, the two events can occur
together in n1 × n2 ways.
Permutations
Other important counting rules pertain to the arrangement of items when the
order of the items matters. In this case we use permutations.

Permutations are groups of items where both the composition of the groups and
the order within a group are important.

The number of permutations of n distinct items arranged x at a time is

$$ {}_{n}P_{x} = \frac{n!}{(n - x)!} $$

where n!, read "n factorial", is n! = n(n − 1)(n − 2)…(1).

By definition, 0! = 1.
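
As a small illustration of the permutation formula, the sketch below counts ordered arrangements in Python and cross-checks the count by listing them. The function name `count_permutations` and the three-letter example are assumptions made for illustration only.

```python
import math
from itertools import permutations


def count_permutations(n: int, x: int) -> int:
    """Number of ordered arrangements of x items drawn from n distinct items."""
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    # nPx = n! / (n - x)!
    return math.factorial(n) // math.factorial(n - x)


# Hypothetical example: arrange 2 of the 3 letters A, B, C.
arrangements = list(permutations("ABC", 2))
print(count_permutations(3, 2))  # 6
print(arrangements)  # AB and BA appear separately because order matters
```

Because 0! = 1, the function returns n! when x = n and 1 when x = 0.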

Combinations
Permutations concern arrangements in which both order and composition are
important. In combinations, what matters is only the composition of the group,
not the order of the items within it.
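
To make the contrast with permutations concrete, the sketch below counts unordered selections using n!/((n − x)! x!) and compares the result with the ordered count for the same items. The function name `count_combinations` and the three-letter example are illustrative assumptions.

```python
import math
from itertools import combinations


def count_combinations(n: int, x: int) -> int:
    """Number of unordered selections of x items from n distinct items."""
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    # nCx = n! / ((n - x)! * x!)
    return math.factorial(n) // (math.factorial(n - x) * math.factorial(x))


# Hypothetical example: choose 2 of the 3 letters A, B, C.
groups = list(combinations("ABC", 2))
print(count_combinations(3, 2))  # 3 unordered groups: AB, AC, BC
print(math.perm(3, 2))           # 6 ordered arrangements of the same letters
```

Each unordered group of x items corresponds to x! ordered arrangements, which is why the combination count is the permutation count divided by x!.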

The number of combinations possible by selecting x out of n distinct items is:

$$ {}_{n}C_{x} = \frac{n!}{(n - x)!\,x!} $$

Methods of assigning probabilities

The three general methods of assigning probabilities are:

• The classical method (the equally likely approach)
• The relative frequency method
• The subjective method

Classical Method
The classical method of assigning probabilities is based on the assumption that
each outcome is equally likely to occur. Classical probability utilizes rules and
laws. It involves an experiment and an event. The definition assumes that all N
possible outcomes have the same chance of occurring. In this method,
probability values are assigned as follows:

$$ P(E) = \frac{n_e}{N} $$

where N = the total possible number of outcomes of an experiment, and
n_e = the number of outcomes in which the event occurs out of N outcomes.

As n_e can never be greater than N (no more than N outcomes in the population
could possibly possess attribute e), the highest value of any probability is 1.
If the probability of an outcome occurring is 1, the event is certain to occur.
The smallest possible probability is zero. If none of the N possible outcomes
possesses the desired characteristic, e, the probability is 0/N = 0, and the
event is certain not to occur. The range of possibilities for probabilities is
0 ≤ P(E) ≤ 1. Thus, probabilities are non-negative fractions or decimal values
less than or equal to 1.
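
The classical rule P(E) = n_e / N can be sketched in a few lines of Python. The helper name `classical_probability` and the die example are assumptions chosen for illustration; the sketch also assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction


def classical_probability(event: set, sample_space: set) -> Fraction:
    """P(E) = n_e / N for equally likely outcomes."""
    if not sample_space:
        raise ValueError("sample space must not be empty")
    favorable = event & sample_space  # outcomes of the event that can occur
    return Fraction(len(favorable), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = classical_probability(even, die)
print(p_even)      # 1/2
print(1 - p_even)  # complement: probability of an odd number, 1/2
```

An impossible event (for example, rolling a 7) gives 0, and the whole sample space gives 1, matching the stated range 0 ≤ P(E) ≤ 1.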

Relative Frequency of Occurrence Method


The relative frequency of occurrence method of assigning probabilities is based
on accumulated historical data. With this method, the probability of an event
occurring is equal to the number of times the event has occurred in the past
divided by the total number of opportunities for the event to have occurred:

$$ \text{Probability by relative frequency of occurrence} = \frac{\text{number of times an event has occurred}}{\text{total number of opportunities for the event to occur}} $$

Relative frequency of occurrence is not based on rules or laws but on what has
occurred in the past.
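
As a rough sketch of the relative frequency idea, the code below estimates the probability of heads from simulated coin tosses. The simulated tosses stand in for historical data and are an assumption for illustration; the seed is fixed only so the run is repeatable.

```python
import random


def relative_frequency(occurrences: int, opportunities: int) -> float:
    """Times the event occurred divided by the opportunities for it to occur."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return occurrences / opportunities


# Simulated "history": 10,000 tosses of a fair coin (hypothetical data).
rng = random.Random(42)
tosses = [rng.choice("HT") for _ in range(10_000)]

estimate = relative_frequency(tosses.count("H"), len(tosses))
print(round(estimate, 3))  # close to, but usually not exactly, 0.5
```

The estimate depends on the data observed, which is the sense in which this method rests on what has occurred rather than on rules or laws.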

Subjective method
The subjective method of assigning probability is based on the feelings or
insights of the person determining the probability. Subjective probability comes
from the person’s intuition or reasoning. Although not a scientific approach to
probability, the subjective method is often based on the accumulation of
knowledge, understanding, and experience stored and processed in the human
mind. At times it is merely a guess. At other times, subjective probability can
yield accurate probabilities.

Subjective probability can be a useful way of tapping a person’s experience,
knowledge, and insight and using them to forecast the occurrence of some event,
for example, in a weather forecast.

Types of probabilities
There are four types of probabilities. These are:
• Simple probability
• Joint probability
• Marginal probability
• Conditional probability

Simple Probability
A simple probability is relatively straightforward. It is obtained using the
relative frequency formula P(A) = n(A)/n, where n(A) is the number of outcomes
in which A occurs and n is the total number of outcomes.

Marginal probability
Marginal probability is denoted by P (E), where E is some event. A marginal
probability is usually computed by dividing some subtotal by the whole. An
example of marginal probability is the probability that a student is infected by
HIV/AIDS. This probability is computed by dividing the number of students
infected by HIV/AIDS by the total number of students. The probability of a
person wearing glasses is also a marginal probability. This probability is
computed by dividing the number of people wearing glasses by the total number
of people. A marginal probability is found in the margin of any joint probability
table. It is the sum of the joint probabilities for a single category of one attribute
over all possible categories of another attribute.

Example:
ABC Company manufactures window air conditioners in both a deluxe model
(D) and a standard model (S). An auditor engaged in a compliance audit of the
firm is validating the sales account for the month of April. She has collected
200 invoices for the month, some of which were sent to wholesalers (W) and the
remainder to retailers (R). Of the 140 retail invoices, 28 are for the standard
model. Only 24 of the wholesale invoices are for the standard model. If the
auditor selects one invoice at random, find the following probabilities.
a) The invoice selected is for the deluxe model.
b) The invoice selected is for the standard model.
c) The invoice selected is a wholesale invoice.
d) The invoice selected is a retail invoice.

Solution
               Wholesale, W     Retail, R        Total
Deluxe, D      36  (0.18*)      112 (0.56*)      148 (0.74**)
Standard, S    24  (0.12*)      28  (0.14*)      52  (0.26**)
Total          60  (0.30**)     140 (0.70**)     200 (1.00)
* Joint probabilities
** Marginal probabilities

a) P(D) = 148/200 = 0.74
b) P(S) = 52/200 = 0.26
c) P(W) = 60/200 = 0.30
d) P(R) = 140/200 = 0.70

Each marginal probability is the sum of the joint probabilities in its row or
column:
P(D) = P(W ∩ D) + P(R ∩ D) = 0.18 + 0.56 = 0.74
P(W) = P(W ∩ D) + P(W ∩ S) = 0.18 + 0.12 = 0.30
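
The invoice table can be reproduced with a short Python sketch that turns the counts into joint probabilities and sums them into marginals. The dictionary layout and the helper name `marginal` are assumptions made for illustration; the counts are the ones given in the example.

```python
# Invoice counts from the example: (model, customer) -> number of invoices.
counts = {
    ("D", "W"): 36, ("D", "R"): 112,
    ("S", "W"): 24, ("S", "R"): 28,
}
total = sum(counts.values())

# Joint probability of each cell = cell count / grand total.
joint = {cell: n / total for cell, n in counts.items()}


def marginal(joint_probs: dict, position: int, label: str) -> float:
    """Sum the joint probabilities whose key has `label` at `position`
    (0 = model, 1 = customer type)."""
    return sum(p for cell, p in joint_probs.items() if cell[position] == label)


p_deluxe = marginal(joint, 0, "D")
p_wholesale = marginal(joint, 1, "W")
print(round(p_deluxe, 2), round(p_wholesale, 2))
```

Summing over the other attribute is exactly the "margin of the table" computation described for marginal probabilities.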

Union probability
A second type of probability is the union of two events. Union probability is
denoted by P(E1 ∪ E2), where E1 and E2 are two events. P(E1 ∪ E2) is the
probability that E1 will occur, or that E2 will occur, or that both E1 and E2
will occur. An example of union probability is the probability that a person is
infected by HIV/AIDS or cancer. To qualify for the union, the person has to be
infected with at least one of the diseases. Another example is the probability
that a person wears eyeglasses or is a soldier. All people who wear eyeglasses
are included in the union, along with all people who are soldiers, including
the soldiers who wear eyeglasses.

Joint probability
A third type of probability is the intersection of two events, or joint
probability. A joint probability shows the probability that an observation will
possess two (or more) characteristics simultaneously. That is, it measures the
probability of two or more events occurring together. The joint probability of
events E1 and E2 occurring is denoted P(E1 ∩ E2). Sometimes P(E1 ∩ E2) is read
as the probability of E1 and E2. To qualify for the intersection, both events
must occur. A joint probability ranges from 0 to 1, inclusive. The sum of all
the joint probabilities in a joint probability table must equal 1.0. An example
of joint probability is the probability that a person is infected with both
HIV/AIDS and cancer. Being infected with only one of the diseases is not
sufficient. A second example of joint probability is the probability that a
person is a soldier and also wears eyeglasses.

Conditional probability
The fourth type is conditional probability. Conditional probability is denoted by
P(E1/E2). This expression is read as: the probability that E1 will occur given
that E2 is known to have occurred. The conditional probability of an event E1,
given event E2, is the ratio of the joint probability of the two events to the
marginal probability of E2.

For P(Y) ≠ 0:

$$ P(X/Y) = \frac{P(X \text{ and } Y)}{P(Y)} = \frac{P(X \cap Y)}{P(Y)} $$

For P(X) ≠ 0:

$$ P(Y/X) = \frac{P(Y \text{ and } X)}{P(X)} = \frac{P(Y \cap X)}{P(X)} $$

Example:
Blue Nile University recently conducted a survey of undergraduate students in
order to gather information about the usage of the library. The population for
this study included all 4000 undergraduate students enrolled in the university.
The library officers are interested in increasing usage, particularly among
females (F) and seniors (S) at the university. Of the 4000 students, 800 students
are seniors, 1800 students are females and 450 of the 1800 females are seniors.
Required:
1. What is the probability that a student selected at random is a senior given that
the selected student is female?
2. What is the probability that a student selected at random is female given that the
selected student is senior?

Solution:
              Senior, S        Non-Senior, N     Total
Female, F     450 (0.1125)     1350 (0.3375)     1800 (0.45)
Male, M       350 (0.0875)     1850 (0.4625)     2200 (0.55)
Total         800 (0.20)       3200 (0.80)       4000 (1.00)

1. P(S/F) = P(S ∩ F)/P(F) = 0.1125/0.45 = 0.25
2. P(F/S) = P(S ∩ F)/P(S) = 0.1125/0.20 = 0.5625
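
The two conditional probabilities can be checked with a minimal Python sketch built on the ratio of joint to marginal probability. The function name `conditional` and the variable names are illustrative assumptions; the counts come from the library survey example.

```python
TOTAL_STUDENTS = 4000
SENIORS = 800
FEMALES = 1800
FEMALE_SENIORS = 450


def conditional(p_joint: float, p_given: float) -> float:
    """P(X/Y) = P(X and Y) / P(Y); undefined when P(Y) is zero."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given


p_joint = FEMALE_SENIORS / TOTAL_STUDENTS  # P(S and F) = 0.1125
p_senior_given_female = conditional(p_joint, FEMALES / TOTAL_STUDENTS)
p_female_given_senior = conditional(p_joint, SENIORS / TOTAL_STUDENTS)

print(round(p_senior_given_female, 4))
print(round(p_female_given_senior, 4))
```

The same joint probability is divided by a different marginal in each case, which is why P(S/F) and P(F/S) differ.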

Using conditional probability, the joint probability of X and Y is calculated as:

$$ P(X \cap Y) = P(X/Y) \cdot P(Y) = P(Y/X) \cdot P(X) $$


The Bayes’ Rule


An extension to the conditional law of probabilities is Bayes’ rule, which was
developed by and named for the English clergyman Thomas Bayes (1702-1761).
Bayes’ rule is a formula that extends the use of the law of conditional
probabilities to allow revision of original probabilities when new information
is obtained. The two core ideas in Bayes’ rule are the prior probability and
the posterior (revised) probability.
Prior probability: the initial probability, determined before new information
is obtained. It is the starting point for Bayes’ theorem.
Posterior probability: a probability that has been revised based on new
information. It is called posterior because it is calculated after the new
information is obtained.

Prior probability (determined subjectively) → New information (sample
information) → Application of Bayes’ theorem → Posterior probability
Bayes’ theorem simplifies the computation of P(X/Y) when P(X ∩ Y) and P(Y)
are not given directly.

• Conditional Probability Rule (Bayes’ Theorem for One Event)

$$ P(X/Y) = \frac{P(Y/X) \cdot P(X)}{P(Y)} $$

• Bayes’ Theorem for Two Events

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2)} $$

• Bayes’ Theorem for Three Events

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2) + P(Y/X_3) \cdot P(X_3)} $$

The general Bayes’ rule is presented below.

$$ P(X_i/Y) = \frac{P(Y/X_i) \cdot P(X_i)}{P(Y/X_1) \cdot P(X_1) + P(Y/X_2) \cdot P(X_2) + \cdots + P(Y/X_n) \cdot P(X_n)} = \frac{P(X_i) \cdot P(Y/X_i)}{\sum_{j=1}^{n} P(X_j) \cdot P(Y/X_j)} $$


Example:
1. A company has three machines, A, B, and C, which all produce the same two
parts, X and Y. Of all the parts produced, machine A produces 60%, machine B
produces 30%, and machine C produces the rest. 40% of the parts made by
machine A are part X, 50% of the parts made by machine B are part X, and 70% of
the parts made by machine C are part X. A part produced by this company is
randomly sampled and is determined to be an X part. With the knowledge that it
is an X part, find the probabilities that the part came from machine A, B, or C.
Solution:
P(A) = 0.6     P(X/A) = 0.4     P(A/X) = ?
P(B) = 0.3     P(X/B) = 0.5     P(B/X) = ?
P(C) = 0.1     P(X/C) = 0.7     P(C/X) = ?
Method 1

$$ P(A/X) = \frac{P(X/A) \cdot P(A)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{P(A \cap X)}{P(X)} = \frac{0.6 \times 0.4}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.24}{0.46} = 0.52 $$

$$ P(B/X) = \frac{P(X/B) \cdot P(B)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{P(B \cap X)}{P(X)} = \frac{0.3 \times 0.5}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.15}{0.46} = 0.33 $$

$$ P(C/X) = \frac{P(X/C) \cdot P(C)}{P(X/A) \cdot P(A) + P(X/B) \cdot P(B) + P(X/C) \cdot P(C)} = \frac{P(C \cap X)}{P(X)} = \frac{0.1 \times 0.7}{(0.6 \times 0.4) + (0.3 \times 0.5) + (0.1 \times 0.7)} = \frac{0.07}{0.46} = 0.15 $$

NB: P(A/X) + P(B/X) + P(C/X) = 1.00
P(A/X) + P(A′/X) = 1.0
Method 2

                    Machine
Product     A        B        C        Total
X           0.24     0.15     0.07     0.46
Y           0.36     0.15     0.03     0.54
Total       0.60     0.30     0.10     1.00

P(A/X) = 0.24/0.46 = 0.52
P(B/X) = 0.15/0.46 = 0.33
P(C/X) = 0.07/0.46 = 0.15
Total              = 1.00
The joint probabilities in the table sum to one, and so do the conditional
probabilities given X.
Method 3 – Bayesian Table

Event,   Prior prob.,   Conditional prob.,   Joint prob.,    Posterior/revised prob.,
Ei       P(Ei)          P(X/Ei)              P(Ei ∩ X)       P(Ei/X)
A        0.60           0.40                 0.24            0.24/0.46 = 0.52
B        0.30           0.50                 0.15            0.15/0.46 = 0.33
C        0.10           0.70                 0.07            0.07/0.46 = 0.15
Total                                        P(X) = 0.46     1.00
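Method 4 – Tree Diagram

Machine (prior)    Part (conditional)    Joint    Posterior given X
A (0.60)           X/A (0.40)            0.24     P(A/X) = 0.24/0.46 = 0.52
                   Y/A (0.60)            0.36
B (0.30)           X/B (0.50)            0.15     P(B/X) = 0.15/0.46 = 0.33
                   Y/B (0.50)            0.15
C (0.10)           X/C (0.70)            0.07     P(C/X) = 0.07/0.46 = 0.15
                   Y/C (0.30)            0.03
P(X) = 0.24 + 0.15 + 0.07 = 0.46; the posterior probabilities given X sum to 1.00.

Find: P(Y) = 0.36 + 0.15 + 0.03 = 0.54
P(A/Y) = 0.36/0.54 = 0.667
P(B/Y) = 0.15/0.54 = 0.278
P(C/Y) = 0.03/0.54 = 0.056
Total ≈ 1.000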
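
The Bayesian table for the three machines can be sketched as a small Python function: multiply each prior by its conditional probability to get the joint probability, sum the joints to get P(X), then divide. The function name `bayes_posteriors` and the dictionary layout are assumptions for illustration, not a prescribed implementation.

```python
def bayes_posteriors(priors: dict, likelihoods: dict) -> dict:
    """Revise prior probabilities P(Ei) using conditionals P(X/Ei).

    Returns the posterior P(Ei/X) for every event Ei.
    """
    # Joint probability of each event with the evidence: P(Ei) * P(X/Ei).
    joint = {event: priors[event] * likelihoods[event] for event in priors}
    evidence = sum(joint.values())  # P(X), the denominator of Bayes' rule
    if evidence == 0:
        raise ValueError("the evidence has probability 0 under every event")
    return {event: value / evidence for event, value in joint.items()}


priors = {"A": 0.6, "B": 0.3, "C": 0.1}      # share of parts per machine
p_x_given = {"A": 0.4, "B": 0.5, "C": 0.7}   # share of part X per machine

posterior_x = bayes_posteriors(priors, p_x_given)
for machine, p in posterior_x.items():
    print(machine, round(p, 2))
```

Passing the complements of the conditionals (0.6, 0.5, 0.3) gives the revised probabilities for a Y part in the same way.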
2. Bruk, Alemayehu, and Yohannes fill orders in a fast food restaurant. Bruk
fills incorrectly 20% of the orders he takes, Alemayehu fills incorrectly 12%
of the orders he takes, and Yohannes fills incorrectly 5% of the orders he
takes. Bruk fills 30% of all orders, Alemayehu fills 45% of all orders, and
Yohannes fills 25% of all orders. An order has just been filled.
a) What is the probability that Alemayehu filled the order? 0.45
b) If the order was filled by Yohannes, what is the probability that it was
filled correctly? 0.95
c) Who filled the order is unknown, but the order was filled correctly. What
are the revised probabilities that Bruk, Alemayehu, or Yohannes filled the
order? 0.2748, 0.4533, and 0.2719
d) Who filled the order is unknown, but the order was filled incorrectly. What
are the revised probabilities that Bruk, Alemayehu, or Yohannes filled the
order? 0.4743, 0.4269, and 0.0988

3. A major league baseball team has four starting pitchers: Girma, Robel,
Solomon, and Asrat. Each pitcher starts every fourth game. The team wins 60% of
all games that Girma starts, 45% of all games that Robel starts, 35% of all
games that Solomon starts, and 40% of all games that Asrat starts. An avid fan
has just returned from a three-week vacation in the wilderness and found out
that the team played yesterday.
a) What is the probability that Girma started the game? 0.25
b) What is the probability that Solomon started the game? 0.25
c) If the team won yesterday, what are the revised probabilities that each
pitcher started the game? 0.333, 0.250, 0.194, and 0.222

Laws of Probability
Additive Law
The general law of addition is used to find the probability of the union of two
events, P(E1 ∪ E2). The general law of addition is presented as follows:

$$ P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) $$

where E1 and E2 are events and (E1 ∩ E2) is the intersection of E1 and E2.

Special Rule of Addition


If two events are mutually exclusive, the probability of the union of the two
events is the probability of the first event plus the probability of the second
event. Because mutually exclusive events do not intersect, nothing has to be
subtracted out. The formula is shown below.

If E1 and E2 are mutually exclusive events,

$$ P(E_1 \cup E_2) = P(E_1) + P(E_2) $$
Example:
1. A husband and a wife, each 20 years old, are debating whether to set up a
retirement program for themselves. Benefits are paid to the man or the woman at
the age of 70. If both have died before reaching age 70, no benefits are paid.
Assume that the probability that a man aged 20 lives to age 70 is approximately
0.6 and that the corresponding probability for a woman is approximately 0.7. If
the husband and wife join the program, what is the probability that either the
man or the woman will collect benefits? Assume that the chances of the man and
the woman dying are independent of each other.
Solution:
Let M = man lives to age 70, W = woman lives to age 70.
P(M) = 0.60     P(W) = 0.70
P(W ∪ M) = P(W) + P(M) − P(W ∩ M)
Since the two events are independent, the joint probability that both the man
and the woman live to age 70 is equal to the product of the individual marginal
probabilities:
P(W ∩ M) = P(M) × P(W)
= 0.60 × 0.70
= 0.42
Therefore,
P(W ∪ M) = 0.70 + 0.60 − 0.42
= 0.88
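
The retirement example can be verified with a short sketch of the general law of addition, using the special rule of multiplication for the intersection. The function names are illustrative assumptions, and the independence of the two lifetimes is the assumption stated in the example.

```python
def union_probability(p_a: float, p_b: float, p_both: float) -> float:
    """General law of addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


def union_if_independent(p_a: float, p_b: float) -> float:
    """For independent events, P(A and B) = P(A) * P(B)."""
    return union_probability(p_a, p_b, p_a * p_b)


p_man, p_woman = 0.60, 0.70
p_collect = union_if_independent(p_man, p_woman)

# Cross-check with the complement: benefits are lost only if both die early.
p_via_complement = 1 - (1 - p_man) * (1 - p_woman)

print(round(p_collect, 2), round(p_via_complement, 2))
```

For mutually exclusive events the intersection is zero, so `union_probability(p_a, p_b, 0.0)` reduces to the special rule of addition.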
2. According to a recent study conducted by businessmen, 76% of all shareholders
have some college education. Suppose that 37% of all adults have some college
education and that 22% of all adults are shareholders. For a randomly selected
adult (let A = owns shares of stock and B = has some college education):
a) What is the probability that the person does not own shares of stock? 0.78
b) What is the probability that the person owns shares of stock or has some
college education? 0.4228
c) What is the probability that the person neither has some college
education nor owns shares of stock? P(A′ ∩ B′) = 1 − P(A ∪ B) = 0.5772
d) What is the probability that the person does not own shares of stock or
has no college education? P(A′ ∪ B′) = 1 − P(A ∩ B) = 0.8328
e) What is the probability that the person owns shares of stock or has some
college education but not both? P(A ∪ B) − P(A ∩ B) = 0.4228 − 0.1672
= 0.2556

3. A 1999 survey of 20,000 sales professionals conducted by the Ethiopian
Telecommunication Corporation (ETC) found that 15% of all sales professionals
use home fax machines and 35% use mobile telephones. Suppose that 1% of all
sales professionals both have home fax machines and use mobile telephones.
a) What is the probability that a randomly selected sales professional has a
home fax machine or uses a mobile telephone?
b) What is the probability that a randomly selected sales professional neither
has a home fax machine nor uses a mobile telephone?
c) Suppose that no sales professional both has a home fax machine and uses
a mobile telephone. What is the probability that a randomly selected sales
professional has a home fax machine or uses a mobile telephone?
Multiplicative law
The probability of the intersection of two events, P(E1 ∩ E2), is called the
joint probability. The general law of multiplication is used to find the
probability of the intersection of two events and is stated as follows:

$$ P(E_1 \cap E_2) = P(E_1) \cdot P(E_2/E_1) = P(E_2) \cdot P(E_1/E_2) $$

Special Rule of Multiplication

If events E1 and E2 are independent, a special law of multiplication can be used
to find the probability of the intersection of E1 and E2.

If E1 and E2 are independent events,

$$ P(E_1 \cap E_2) = P(E_1) \cdot P(E_2) $$

Example:
1. Test the matrix for the 200 executive responses to determine whether industry
type is independent of geographic location.

                                     Geographic Location
Industry type        North East, D   South East, E   Mid West, F   West, G   Total
Finance, A           24              10              8             14        56
Manufacturing, B     30              6               22            12        70
Communication, C     28              18              12            16        74
Total                82              34              42            42        200

Solution:
Select one industry type and one geographic location (say A, Finance, and G,
West). Does P(A/G) = P(A)?
P(A/G) = P(A ∩ G)/P(G) = 0.07/0.21 = 0.33
P(A) = 56/200 = 0.28
Since P(A/G) ≠ P(A), industry type and geographic location are not independent.
2. Considering the above problem, if a respondent is randomly selected from these
data:
a) What is the probability that this executive is from the Mid West? 0.21
b) What is the probability that a respondent is from the communication
industry or from the North East? 0.64
c) What is the probability that a respondent is from the South East or from
the finance industry? 0.40
d) What is the probability that this executive is from the South East or the
West? 0.38
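
The independence test on the executive data can be sketched in Python by comparing each joint probability with the product of its two marginals. The table layout, the region abbreviations, and the helper names are assumptions for illustration; the counts are those in the matrix of executive responses.

```python
# Counts of executives by industry type (rows) and geographic location (columns).
table = {
    "Finance":       {"NE": 24, "SE": 10, "MW": 8,  "W": 14},
    "Manufacturing": {"NE": 30, "SE": 6,  "MW": 22, "W": 12},
    "Communication": {"NE": 28, "SE": 18, "MW": 12, "W": 16},
}
grand_total = sum(sum(row.values()) for row in table.values())


def p_row(industry: str) -> float:
    """Marginal probability of an industry type."""
    return sum(table[industry].values()) / grand_total


def p_col(region: str) -> float:
    """Marginal probability of a geographic location."""
    return sum(row[region] for row in table.values()) / grand_total


def p_joint(industry: str, region: str) -> float:
    """Joint probability of an industry type and a location."""
    return table[industry][region] / grand_total


def is_independent(tol: float = 1e-9) -> bool:
    """True only if every joint probability equals the product of its marginals."""
    return all(
        abs(p_joint(i, r) - p_row(i) * p_col(r)) <= tol
        for i in table for r in table[i]
    )


p_a_given_g = p_joint("Finance", "W") / p_col("W")
print(round(p_a_given_g, 2), round(p_row("Finance"), 2), is_independent())
```

A single cell where the joint probability differs from the product of the marginals is enough to conclude that the two variables are not independent.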

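3. The results of a survey asking, “Do you have a calculator and/or a computer in
your home?” are as follows:

                      Calculator
                      Yes     No
Computer     Yes      46      3
             No       11      15

Is the variable calculator independent of the variable computer? Why or why
not? No. For example, P(calculator and computer) = 46/75 ≈ 0.61, whereas
P(calculator) × P(computer) = (57/75) × (49/75) ≈ 0.50.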

Laws of conditional probability


Conditional probabilities are based on knowledge of one of the variables. If E1
and E2 are two events, the conditional probability of E1 occurring given that E2
is known or has occurred is expressed as P(E1/E2). The formula for finding a
conditional probability is given below.

$$ P(E_1/E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} = \frac{P(E_2/E_1) \cdot P(E_1)}{P(E_2)} $$

Special Law of Conditional Probability


If E1 and E2 are independent events, the conditional probability and the
marginal probability of each event are equal. That is, P(E1/E2) = P(E1) and
P(E2/E1) = P(E2).
2.2. Probability Distributions
Basic concepts
A variable is a characteristic that can have different values or outcomes. A
variable whose numerical value is determined by the outcome of a random
experiment, that is, a variable whose outcomes occur by chance, is called a
random variable. Depending on the values a random variable can take, there are
two types of random variables: discrete random variables and continuous random
variables.
Discrete Random Variables: these are random variables that can assume only a
countable set of distinct values, typically non-negative whole numbers such as
0, 1, 2, 3, …, n. For example, the number of students in a class, the number of
telephone calls received in a given hour, and the number of people living in a
certain area can take only non-negative whole numbers. As a result, there are
gaps between the values they can take along an interval.
Continuous Random Variables: these are random variables that can take any value
over an interval. Thus, continuous random variables have no gaps or unassumed
values and can assume an uncountably infinite number of values. For example,
the height of an individual, the distance traveled by a truck driver in a given
hour, and the temperature of a room on a given day are continuous random
variables.
NB: Continuous random variables typically record the value of a measurement
such as time, weight, volume, or length, while discrete random variables count
the number of times a particular attribute is observed.

Probability Distribution: a listing of the possible values that a random
variable can assume along with their probabilities. It is any representation of
the values of a random variable and the associated probabilities. Depending on
the type of random variable we deal with, there are two types of probability
distributions: discrete probability distributions and continuous probability
distributions.
A discrete probability distribution is any representation of the values of a
discrete random variable and the associated probabilities. The most commonly
used discrete probability distributions include the binomial, hypergeometric,
and Poisson distributions.

The Binomial Distribution


Perhaps the most widely known of all discrete probability distributions is the
binomial distribution. The binomial distribution has the following underlying
assumptions:
i. The experiment involves n identical trials, or sampling is done with
replacement.
ii. Each trial has only two possible mutually exclusive outcomes [bi = two].
iii. Each trial is independent of the previous trials.
iv. The probability of success (p) and the probability of failure (q = 1 − p)
remain constant for each trial.
v. In n trials, the number of successes X is a whole number between 0 and n
[0 ≤ X ≤ n].
vi. It is applicable if the sample size n is less than 5% of the population
size N, or if samples are taken with replacement.

To compute the probability of a given number of successes in a binomial
distribution, we use the binomial formula. The probability of exactly x
successes in n trials is

$$ P(x) = \frac{n!}{x!\,(n - x)!} \cdot p^{x} \cdot q^{n - x} $$

where: P(x) = probability of x successes in n trials
n = number of trials (sample size)
x = number of successes desired
p = probability of success
q = 1 − p = probability of failure
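
The binomial formula translates directly into a few lines of Python. The function name `binomial_pmf` is an assumption for illustration; `math.comb` supplies the n!/(x!(n − x)!) term.

```python
from math import comb


def binomial_pmf(n: int, x: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q ** (n - x)


# Exactly two heads in three tosses of a fair coin.
print(binomial_pmf(3, 2, 0.5))  # 0.375
```

Summing `binomial_pmf(n, x, p)` over x = 0, …, n gives 1, since those values of x cover every possible number of successes.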

Example:
1. If we toss a coin three times, what is the probability of getting exactly two
heads?
Solution:
In a single toss, the probability of getting a head is 0.5, and so is the
probability of getting a tail. In tossing the coin three times, the following
are the eight possible outcomes:
HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Each outcome has probability 0.5 × 0.5 × 0.5 = 0.125, and exactly three of them
(HHT, HTH, THH) contain two heads. The probability of getting exactly two heads
is, therefore, computed as
= (0.5 × 0.5 × 0.5) + (0.5 × 0.5 × 0.5) + (0.5 × 0.5 × 0.5)
= 0.125 × 3 = 0.375
Using the binomial formula:
p = 0.50     q = 1 − 0.50 = 0.50     n = 3     x = 2
P(x = 2) = nCx × p^x × q^(n−x)
= 3C2 × 0.5^2 × 0.5^1
= 3(0.25 × 0.5) = 3(0.125) = 0.375
There are three ways of choosing exactly two heads from a total of three trials.
2. A researcher wants to test the claim that 10% of all people are left-handed by
randomly selecting forty students at a university. What is the probability of
getting exactly six left-handed students among the forty?
Solution:
p = 0.10     q = 1 − 0.10 = 0.90     n = 40     x = 6
P(x = 6) = 40C6 × 0.10^6 × 0.90^34 = 0.1068
If 10% of the population is left-handed, about 10.68% of the time the researcher
would get exactly six left-handed students in a sample of forty.

3. Based on past data, approximately 30% of the oil wells drilled in areas having
a certain favorable geological formation have struck oil. A company has
identified 5 locations that possess this formation. Assuming that the chance of
striking oil at any location is independent of the others, calculate the
probability that exactly 2 of the 5 wells strike oil.
Solution:
p = 0.30     q = 1 − 0.30 = 0.70     n = 5     x = 2
P(x = 2) = 5C2 × 0.30^2 × 0.70^3 = 0.3087
If the probability of striking oil in areas having this favorable geological
formation is 0.3, then about 31% of the time exactly 2 of 5 wells drilled will
strike oil.

4. The quality control department of a manufacturer tested the most recent batch of
1000 catalytic converters produced and found 50 of them defective.
Subsequently, an employee unwittingly/unintentionally mixed the defective
converters with the non-defective ones. If a sample of three converters is
randomly selected from the mixed batch, what is the probability that the
employee may get one defective item?
Solution:
Before we try to solve this problem, we have to check whether all the assumptions
of a Binomial distribution are satisfied. One of the assumptions states that
the sample size, n, must be less than five percent of the population size, N. In
our case, the sample size is less than 5% of the population size
[3/1000 = 0.003 < 0.05], so we can use the binomial distribution to solve this
specific exercise.
N = 1000 n = 3 x = 1
p = R/N = 50/1000 = 0.05, where R is the number of successes (defective
converters) in the population
q = 1 – 0.05 = 0.95
P(x=1) = 0.1354
If 5% of the converters are defective, 13.54% of the time the quality
control department would get exactly 1 defective item in a sample of three
converters.

5. A town has three ambulances for emergency transportation to a hospital. The
probability that any one of these will be available at a given time is 0.75. If a
person calls for an ambulance, what is the probability that an ambulance will be
available?
Solution:
n=3 p = 0.75 q = 0.25
Probability of getting (at least) an ambulance is calculated as one minus the
probability of getting no ambulance.
P(ambulance) = 1 – P(0 ambulance)
= 1 – (3C0 * 0.75^0 * 0.25^3)
= 1 – 0.0156
= 0.9844

Using Individual Binomial Probability Table


Tables have been developed that give the probability of x successes in n trials for
a binomial experiment. These tables are generally easy to use and quicker than
the Binomial Formula, especially when the number of trials involved or sample
size, n is large. In order to use this table, it is necessary to specify the values of n,
p and x. (See your text Van Matre Appendix A- Individual Terms).

Some Binomial tables only show values of p up to 0.5. Thus, it would appear that
these tables cannot be used when the probability of success exceeds p = 0.5.
However, such tables can be used by noting that the probability of n-x failures is
also the probability of x successes. That is, finding the probability of x successes
is equal to finding the probability of n-x failures. nCx and nC(n-x) are always
equal.
Example:
Suppose that 70% of all cola drinkers select non diet colas. If 10 cola drinkers are
randomly selected, what is the probability that 4 of them will be diet cola
drinkers?
Solution:
Finding the probability of 4 diet cola drinkers is equivalent to finding the
probability of 6 non diet cola drinkers.
n = 10 p= 0.7 q= 0.3 x= 6
P(x=6) = 0.2001
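The equivalence used in this example can be checked numerically. The sketch below is illustrative only; `binomial_pmf` and the variable names are assumptions introduced here, not part of the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n = 10
p_non_diet = 0.7  # "success" = the drinker selects a non-diet cola
p_diet = 0.3      # "success" = the drinker selects a diet cola

# 4 diet cola drinkers out of 10 is the same event as 6 non-diet cola drinkers.
four_diet = binomial_pmf(4, n, p_diet)
six_non_diet = binomial_pmf(n - 4, n, p_non_diet)
print(round(four_diet, 4), round(six_non_diet, 4))
```

Both calls should print the same value, which is why a table limited to p ≤ 0.5 is still sufficient.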

Finding the Probabilities that the Number of Successes X Lies in a Given Interval (Cumulative Probabilities)
Cumulative probabilities are the sum of individual probability values. The
Binomial formula $P(X) = \binom{n}{X} p^{X} q^{\,n-X}$ gives us the probability of
exactly x successes in n trials (sample size n). To find cumulative probabilities
such as P(x≥3), P(x≤2), P(x>10) or P(X1≤X≤X2) = P(10≤X≤20), we should add the
respective exact/individual probability values.
Example:
1. A project manager has determined that a subcontractor fails to deliver standard
orders 20% of the time. The project manager has six orders that his subcontractor
has agreed to deliver. What is the probability that
a) The subcontractor will deliver all of the orders? 0.2621
b) The subcontractor will deliver at least four of the orders? 0.9011
c) The subcontractor will deliver exactly five orders? 0.3932
d) The subcontractor will fail to deliver at most two of the orders? 0.9011
e) What do you conclude from your answers in parts (b) and (d)? Finding the
probability of x successes is equal to finding the probability of n-x failures.
2. About 20% of all pro football players are injured during a given season. A team
has four star players. What is the probability that at least one of the star players
gets injured?
Solution:
n=4 p= 0.2 q= 0.8 x≥ 1
P (x≥1) = 1 – P (X≤0) = 1 – P(x=0)
= 1 – 0.4096 = 0.5904

3. A lawyer estimates that 40% of the cases in which she represented the defendant
were won. If the lawyer is presently representing 10 defendants in different
cases, what is the probability that at least 5 of the cases will be won? What are
you assuming here?
Solution:
The assumption we are taking here is the cases in which the lawyer is
representing are independent. With this assumption:
n = 10 p= 0.4 q= 0.6 x≥ 5
P (x≥5) = P(x=5) + P(x=6) + P(x=7) + P(x=8) + P(x=9) +P(x=10)
= 0.2007 + 0.1115 + 0.0425 + 0.0106 + 0.0016 + 0.0001
= 0.3670
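Adding individual terms by hand is tedious, so a short sketch may help. The helper names below (`binomial_pmf`, `binomial_at_least`) are assumptions of this illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_at_least(x, n, p):
    """P(X >= x): add the individual probabilities from x up to n."""
    return sum(binomial_pmf(k, n, p) for k in range(x, n + 1))


# Lawyer example: at least 5 wins in 10 cases with p = 0.4.
p_lawyer = binomial_at_least(5, 10, 0.4)

# Football example: at least one injury among 4 players with p = 0.2,
# computed with the complement rule 1 - P(x = 0).
p_injury = 1 - binomial_pmf(0, 4, 0.2)

print(round(p_lawyer, 4), round(p_injury, 4))
```

Summing unrounded terms gives about 0.3669 for the lawyer example; the 0.3670 above comes from adding table values that were already rounded to four decimals.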

Using Cumulative Binomial Probability Table


If a cumulative probability table is given, one must subtract from the cumulative
probability of X the cumulative probability of X-1 to get the exact/individual
probability value of X. That is,
- P (X=a) = P (X≤a) – P (X≤a-1)
E.g. P (X=3) = P (X≤3) – P (X≤2)
- P (X≥a) = 1 – P (X≤a-1)
E.g. P (X≥3) = 1 – P (X≤2)
- P (X>a) = 1 – P (X≤a)
E.g. P (X>3) = 1 – P (X≤3)
- P (a1≤X≤a2) = P (X≤a2) – P (X≤a1-1)
E.g. P (10≤X≤20) = P (X≤20) – P (X≤9)
- P (a1<X<a2) = P (X≤a2-1) – P (X≤a1)
E.g. P (10<X<20) = P (X≤19) – P (X≤10)
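The cumulative rules can be tried out without a printed table. In the sketch below, `cdf` plays the role of the cumulative table; the function names and the choice of n = 15, p = 0.55 are assumptions made for illustration.

```python
from math import comb


def pmf(x, n, p):
    """Individual binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cdf(a, n, p):
    """Cumulative binomial probability P(X <= a), as read from a cumulative table."""
    if a < 0:
        return 0.0
    return sum(pmf(k, n, p) for k in range(0, a + 1))


n, p = 15, 0.55

exactly_3 = cdf(3, n, p) - cdf(2, n, p)    # P(X = 3)
at_least_10 = 1 - cdf(9, n, p)             # P(X >= 10)
less_than_5 = cdf(4, n, p)                 # P(X < 5)
six_to_ten = cdf(10, n, p) - cdf(5, n, p)  # P(6 <= X <= 10), inclusive

print(round(at_least_10, 4), round(less_than_5, 4), round(six_to_ten, 4))
```

Note that for an inclusive interval the lower cumulative value is taken at one less than the lower limit, so that the lower limit itself stays in the interval.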

Example:
1. According to a study conducted approximately 55% of all hospitals in a given
town contained 100 or more beds. A researcher draws a sample of 15 hospitals by
randomly selecting names from a directory of hospitals.
a) What is the probability of selecting 10 or more hospitals that have 100 or more
beds?
b) What is the probability of selecting less than five hospitals that have 100 or
more beds?
c) What is the probability of selecting from six to ten hospitals, inclusive, that
have 100 or more beds?
2. A manufacturing company produces 10,000 plastic parts per week. This
company supplies plastic parts to another company, which packages the plastic
parts as part of picnic sets. The second company randomly samples 10 plastic
parts sent from the supplier. If two or fewer of the sampled plastic parts are
defective, the second company accepts the lot. What is the probability that the lot
will be accepted if the parts the manufacturing company actually produces are
10% defective? 20% defective? 30% defective? 40% defective?

Computation of Mean (µ) and Variance (δ²) of a Discrete Random Variable


Expected value or mean of a random variable is a measure of the central location
for the random variable. It is a long run average of occurrences. We must realize
that on any one trial using a discrete random variable, there will be one outcome.
However, if the process is repeated long enough, the average of the results is
likely to approach some expected value or mean. This mean or expected value is
computed as

$$\mu = E(X) = \sum \left[ X \cdot P(X) \right] = \frac{\sum X_i f_i}{\sum f_i}$$
Where: E(X) = long run average
X = an outcome
P(X) = the probability of that outcome
Variance of a discrete random variable, which measures how far the values are
dispersed around the mean, is calculated as
δ² = ∑(X-µ)²*P(X)

Where: X = an outcome
µ = mean
P(X) = the probability of that outcome
And the standard deviation of a discrete random variable is calculated simply by
taking the square root of the variance: δ = √[∑(X-µ)²*P(X)].
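As a small worked illustration of these formulas, the sketch below computes µ, δ² and δ for the number of heads in three tosses of a fair coin. The function name `mean_variance` and the dictionary layout are assumptions of this sketch.

```python
from math import sqrt


def mean_variance(dist):
    """dist maps each outcome X to its probability P(X)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


# Number of heads in three tosses of a fair coin
# (from the eight equally likely outcomes HHH, HHT, ..., TTT).
heads = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

mu, var = mean_variance(heads)
sd = sqrt(var)
print(mu, var, round(sd, 4))
```

Here µ = 1.5 and δ² = 0.75, so on average 1.5 heads are expected per set of three tosses.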

Mean, Variance and Standard Deviation of a Binomial Distribution


The binomial probability distribution is a discrete probability distribution.
Hence, the method used to compute the mean and standard deviation of a discrete
random variable also applies when computing µ and δ for a binomial distribution.
A binomial distribution has an expected value or long run average, which is
denoted by µ. The value of µ is determined by n*p. The long run average or
expected value means that if n items are sampled over and over again for a long
period of time and if p is the probability of getting a success on one trial, the
average number of successes per sample is expected to be n*p.
As for other discrete variables, the variance of the binomial distribution is
calculated as δ² = ∑(X-µ)²*P(X), which is also equal to npq. The standard
deviation of the binomial distribution is calculated by taking the square root
of the variance: δ = √[∑(X-µ)²*P(X)] = √(npq).
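The shortcuts µ = np and δ² = npq can be compared with the general discrete formulas. The sketch below does this for one arbitrary choice, n = 10 and p = 0.4; these values and the variable names are assumptions chosen for illustration.

```python
from math import comb, sqrt

n, p = 10, 0.4
q = 1 - p

# Individual binomial probabilities P(X = x) for x = 0, 1, ..., n.
probs = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

# General formulas for a discrete random variable.
mu = sum(x * probs[x] for x in range(n + 1))
var = sum((x - mu) ** 2 * probs[x] for x in range(n + 1))

# Binomial shortcuts.
mu_shortcut = n * p
var_shortcut = n * p * q

print(round(mu, 4), round(var, 4), round(sqrt(var), 4))
print(mu_shortcut, round(var_shortcut, 4))
```

Both routes should agree up to floating-point rounding (µ = 4 and δ² = 2.4 here).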

Hypergeometric Distribution

The binomial distribution assumes that the probability of success (p) and failure
(q = 1 - p) are the same throughout the experiment. This is because
– events are independent
– sampling is done with replacement
– n < 0.05N
– population is infinite
However, in cases where sampling is without replacement and the sample size
exceeds 5% of the population size, it is necessary to use the hypergeometric
distribution to determine correct probability.
The hypergeometric distribution has the following characteristics.
- It is a discrete distribution.
- Each outcome consists of either a success or a failure.
- Sampling is done without replacement.
- The population size is finite and known.
- It is described by three parameters: N, r and n. Because of the multitude of
possible combinations of these three parameters, creating tables for the
hypergeometric distribution is practically impossible.
- The number of successes in the population, r, is known.
- The sample size is ≥ 5% of the population.
Under the above conditions, we can use the hypergeometric distribution for
determining the correct probability, with the following formula:

$$P(X) = \frac{{}_{N-r}C_{n-x} \cdot {}_{r}C_{x}}{{}_{N}C_{n}}$$

Where: P(X) = the probability of x successes in the sample
N = population size
n = sample size
r = number of successes in the population
x = number of successes in the sample for which a
probability is desired
C = combination
N-r = the number of items in the population that are
labeled as failure
NCn = the number of ways a sample of size n can be
selected from a population of size N.
rCx = the number of ways x successes can be selected
from a total of r successes in the population.
N-rCn-x = the number of ways n-x failures can be selected
from a total of N-r failures in the population
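Since tables are impractical, the formula is usually evaluated directly. A minimal sketch, assuming a helper named `hypergeom_pmf` (not part of the text) built on `math.comb`:

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """P(x successes in a sample of n drawn without replacement
    from a population of N items that contains r successes)."""
    return comb(N - r, n - x) * comb(r, x) / comb(N, n)


# 24 applicants, 8 women, 5 selected: probability that exactly 3 are women.
p_three_women = hypergeom_pmf(3, N=24, r=8, n=5)

# 18 companies, 12 in Addis, 3 selected: probability that at least one is in Addis.
p_addis = sum(hypergeom_pmf(x, N=18, r=12, n=3) for x in range(1, 4))

print(round(p_three_women, 4), round(p_addis, 4))
```

The arguments follow the notation above: N is the population size, r the number of successes in the population, and n the sample size.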
Example:

1. 24 people, of whom eight are women, have applied for a job. If five of the
applicants are randomly selected, what is the probability that three of those
sampled are women?
Solution:
N = 24 n=5 r=8 x=3
24 8
 * 8
P( X  3)  5  3 24 3 = 120x56/42,504 = 0.1581
5

2. A shipment of 10 items has two defective and eight non-defective units. In the
inspection of the shipment, a sample of units will be selected and tested. If the
sample contains a defective unit, the shipment of 10 units will be rejected.
a) If a sample of three items is selected, what is the probability that the
shipment will be rejected?
b) If management would like a 0.90 probability of rejecting a shipment with
two defective and eight non-defective units, how large a sample would
you recommend?

3. Suppose that there are 18 major insurance companies in Ethiopia and that 12 are
located in Addis. If three insurance companies are randomly selected from the
entire list, what is the probability that one or more of the selected companies are
located in Addis?
Solution:
N = 18 n=3 r = 12 x≥1
P(x ≥ 1) = P(x = 1) + P(x = 2) + P(x = 3)
= 0.2206 + 0.4853 + 0.2696
= 0.9755

4. A company produces and ships 16 personal computers knowing that 4 of them
have defective wiring. The company that has purchased the computers is going
to thoroughly test 3 of the computers. The purchasing company can detect the
defective wiring when it is there. What is the probability that the purchasing
company will find:
a) No defective computer? 0.3929
b) Exactly three defective computers? 0.0071
c) Two or more defective computers? 0.1357
d) One or less defective computers? 0.8643

The Binomial Approximation to Hypergeometric Distribution


The binomial probability distribution with parameters n and p = r/N provides a
good approximation of the hypergeometric probability distribution if the sample
size, n, is no more than five percent of the population size, N. That is, n ≤ 5%N.
As n/N decreases, the binomial distribution better approximates the
hypergeometric distribution.
Example:
1. An internal revenue service district office has files on 500 income tax returns that
were audited in 1996. After the audit, additional taxes were required on 350 of
these. In order to verify that proper audit procedures were followed, a supervisor
randomly selects and examines 10 of the 500 returns. What is the probability that
additional taxes were required exactly on six of the 10 returns sampled?
Solution:
Using hypergeometric distribution
N = 500 n = 10 r = 350 x = 6
P(x=6) = 0.2016

Using binomial distribution


n= 10 p=r/N = 350/500 = 0.7 q= 0.3 x=6
P(x=6) = 0.2001

2. Of a group of 300 men, 240 are physically fit. If five men are randomly selected,
what is the probability that three of them are physically fit?
Solution:
Using hypergeometric distribution
N = 300 n=5 r = 240 x=3
P(x=3) = 0.2057

Using binomial distribution


n= 5 p=r/N = 240/300 = 0.8 q= 0.2 x=3
P(x=3) = 0.2048
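The two calculations for the physically fit men can be placed side by side in code. The helper names are assumptions of this sketch; the inputs are taken from the example (N = 300, r = 240, n = 5, x = 3).

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """Exact probability when sampling without replacement."""
    return comb(N - r, n - x) * comb(r, x) / comb(N, n)


def binomial_pmf(x, n, p):
    """Binomial probability with constant success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


N, r, n, x = 300, 240, 5, 3

exact = hypergeom_pmf(x, N, r, n)
approx = binomial_pmf(x, n, r / N)  # p = r/N = 0.8

print(f"n/N = {n / N:.3f}")
print(f"hypergeometric: {exact:.4f}, binomial: {approx:.4f}")
```

Because n/N is well below 5% here, the two results differ only in the third decimal place.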

Similarities of the Hypergeometric and the Binomial Distributions


Distribution     Assumptions                                   Formula
Binomial         - Sampling with replacement or from an        P(X) = nCx * p^x * q^(n-x)
                   infinite population (n < 0.05N)
                 - P is constant
Hypergeometric   - Sampling without replacement from a         P(X) = (N-rCn-x * rCx) / NCn
                   finite population
                 - P changes with each sample observation

The Poisson Distribution


The Poisson distribution is named after the French mathematician Siméon Denis
Poisson (1781-1840), who published an article in 1837 discussing the distribution.
The Poisson distribution is another discrete probability distribution which is
used to describe a number of processes, including the distribution of telephone
calls going through a switchboard system, the demand of patients for service at
a health institution, the arrival of trucks and cars at a toll booth, and the number
of accidents at an intersection.

While a binomial random variable counts the number of successes that occur in a
fixed number of trials, a Poisson random variable counts the number of rare
events (successes) that occur in a specified continuous time interval or specified
region.

The Poisson distribution has the following characteristics.

1. The probability of an occurrence is the same throughout the time interval or
unit of space.
2. The number of occurrences in one interval is independent of the number of
occurrences in another interval.
3. The probability of two or more occurrences in a subinterval is small enough
to be ignored.
4. It must be possible to divide the time interval of interest into many
subintervals.
5. The expected number of occurrences in an interval is proportional to the size
of the interval.

Examples of Poisson random variable


1. The number of air planes arriving at an airport in an hour.
2. The number of accidents at a factory in a day.
3. The number of cars crossing a bridge during a five second interval
4. The number of misprints on a page of newsprint
5. The number of white blood cells in a blood suspension.
6. The number of typographical errors on a page.
7. The number of bacteria in an ounce of fluid.

The formula for Poisson distribution

$$P(X) = \frac{\mu^{x} e^{-\mu}}{X!} = \frac{(\lambda t)^{x} e^{-\lambda t}}{X!}$$

Where: µ = mean number of arrivals per unit of time or space of interest (µ = λt)
X = number of arrivals for which the probability is desired
e = the base of the natural logarithm (approximately 2.71828)
λ = expected number of occurrences in a specified interval
t = the proportion of this specified interval for the question of interest
(number of units of time)
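The Poisson formula can be sketched in Python as follows. The function name `poisson_pmf` and the variable names are assumptions of this illustration; the inputs are a rate of 60 customers per hour observed for one minute, and 3.2 customers per four minutes observed for eight minutes.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the expected number is mu."""
    return mu**x * exp(-mu) / factorial(x)


# 60 customers per 60 minutes, interval of 1 minute: mu = lambda * t = 1.
mu_one_minute = (60 / 60) * 1
p_two = poisson_pmf(2, mu_one_minute)

# 3.2 customers per 4 minutes, interval of 8 minutes: mu = 6.4.
mu_eight_minutes = (3.2 / 4) * 8
p_ten = poisson_pmf(10, mu_eight_minutes)

print(round(p_two, 4), round(p_ten, 4))
```

The key step is converting the rate λ and the interval t to the same unit before multiplying them to get µ.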
Example:
1. Assume that a bank knows from past experience that between 10 and 11 a.m. of
each day, the mean arrival rate is 60 customers per hour. Suppose that the bank
wants to determine the probability that exactly two customers will arrive in a
given one-minute interval between 10 and 11 a.m. The arrival rate is assumed
to be constant over the time interval. Calculate the probability.
Solution:
λ = 60 customers/hr t = 1 minute x = 2 customers
µ = λ*t = (60 customers/60 minutes) * 1 minute = 1 customer
P(x=2) = µ^x e^(-µ) / X! = 1^2 e^(-1) / 2! = 0.1839
The probability of getting 2 customers during the next one minute in a bank is
0.1839. Or there is 18.39% chance that exactly 2 customers will arrive in one
minute at a bank.

2. Suppose that bank customers arrive randomly on weekday afternoons at an
average rate of 3.2 customers every four minutes. What is the probability of
getting 10 customers during an eight minute interval?
Solution:
λ = 3.2 customers/4 minutes t = 8 minutes x = 10 customers
µ = λ*t = (3.2 customers/4 minutes) * 8 minutes = 6.4 customers
P(x=10) = µ^x e^(-µ) / X! = 6.4^10 e^(-6.4) / 10! = 0.0528

The probability of getting 10 customers during the next eight minutes in a bank is
0.0528. Or there is 5.28% chance that exactly 10 customers will arrive in eight
minutes at a bank.

3. If a real estate office sells 1.6 houses on an average weekday and sales of
houses on weekdays are Poisson distributed, what is the probability of selling:
a) Four houses in a day? 0.0551
b) No house in a day? 0.2019
c) More than five houses in a day? 0.0060
d) Ten or more houses in a day? 1 – 1.0000 ≈ 0.0000
e) Four houses in two days? 0.1781

4. A secretary types 75 words per minute and averages six errors per hour of typing.
Assuming error occurrences are a Poisson process, what is the probability that a
225-word letter will be typed without error? 0.7408

5. A pen company averages 1.2 defective pens per carton produced (200 pens). The
number of defects per carton is Poisson distributed.
a) What is the probability of selecting a carton and finding no defective
pen? 0.3012
b) What is the probability of finding eight or more defective pens in a
carton? 0.0000
c) Suppose that a purchaser of these pens will quit buying from the company
if a carton contains more than three defectives. What is the probability that
the purchaser will quit buying from this company? 0.0338

6. A certain manufacturer sells a machine that has numerous moving parts. A
quality control inspector counts the number of moving parts that are misaligned
as the number of nonconformities for a particular machine. It is believed that the
number of nonconformities per machine follows a Poisson distribution, with an
average of three nonconformities per machine.
a) Determine the probability that the quality control inspector
finds no more than one nonconformity on a particular machine selected at
random. 0.1991
b) What is the probability that three or more nonconformities
may be obtained by the quality control inspector on three machines? 0.9938
7. The number of paint blisters produced by an automated painting process at
Associated Industries is Poisson distributed with a rate of 0.06 blisters per square
foot. The process is about to be used to paint an item that measures 9 by 15 feet.
a) What is the probability that the finished surface will have no blister in it?
0.0003
b) What is the probability that the finished surface will have between 5 and 8
blisters, inclusive? 0.4846
c) What is the probability that the finished surface will have more than 2
blisters? 0.9873
8. The defects in an automated weaving process at Sharp Industries are Poisson
distributed at a mean rate of 0.00025 per square foot. The process is to be used to
weave a piece of material that is 5 by 16 yards.
a) What is the probability that this piece will have no defects?
b) What is the probability that it will have one defect?

Mean and Variance of Poisson Distribution


The Poisson distribution has only one parameter, the expected value λt = µ.
Additionally, for the Poisson distribution, the expected value and variance are
equal. The expected value and variance of a Poisson probability distribution are
E(X) = µ = λt = δ².

Poisson Approximation to Binomial Probability Distribution


The Poisson probability distribution can be used as an approximation to the
binomial probability distribution when P, the probability of success is small and
n, the number of trials/sample size, is large. Simply set µ = np and use the
Poisson tables. As a rule of thumb, the approximation will be good whenever
P≤0.05 and n≥20. However, this approximation is reasonably accurate if n>20
and np≤5.

Binomial tables are often not available for large values of n, so in these cases the
approximation can be useful. In cases where P≤0.05 and n≥20, substitute the
mean of the binomial distribution (µ = np) in place of the mean of the Poisson
distribution (µ = λt), so that the formula becomes

$$P(X) = \frac{(np)^{x} e^{-np}}{X!}$$

In general, the larger n is and smaller p is, the better will be the approximation.

Why approximation?
- The Poisson formula is easier to use than the binomial formula.
- It can be tabulated more efficiently than binomial probabilities
because the Poisson distribution has only one parameter µ (λt), whereas the
binomial distribution has two parameters, n and p.
Example:
n = 500 p= 0.02 µ = np = 500*0.02 = 10
n = 1000 p= 0.01 µ = np = 1000*0.01 = 10

If we want to calculate P(X) for both cases we can tabulate on a single column-
Poisson. Had it been binomial for the above cases we should have formulated
two columns.
1. A company sells insurance policies to a random sample of 1000 men who are 35
years of age. The probability that a 35-year old man dies within a year is
approximately 0.002. What is the probability that the insurance company will
have to pay claims on 2 or more policies next year?
Solution:
Steps: 1. Make sure P≤0.05 and n≥20
P = 0.002 n = 1000…………. Both requirements satisfied
2. Calculate µ = np = 1000*0.002 = 2
3. Calculate P(X)
P (X≥2) = 1 – [P(X=0) + P(X=1)]
= 1 – [2^0 e^(-2)/0! + 2^1 e^(-2)/1!]
= 1 – (0.1353 + 0.2707)
= 0.5940

The exact binomial probability is 0.5943.

2. Suppose that the probability of a bank making a mistake processing a deposit is
0.0003. If 10,000 deposits are audited, what is the probability that exactly six
mistakes were made in processing deposits?
Solution:
Steps: 1. Make sure P≤0.05 and n≥20
P = 0.0003 n = 10,000…………. Both requirements satisfied
2. Calculate µ = np = 10,000*0.0003 = 3
3. Calculate P(X)
P (X=6) = 3^6 e^(-3)/6!
= 0.0504
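The quality of the Poisson approximation can be checked by computing both the exact binomial value and the approximation. The sketch below does this for the insurance example (n = 1000, p = 0.002); the function names are assumptions of this illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """Poisson probability of x occurrences with mean mu."""
    return mu**x * exp(-mu) / factorial(x)


n, p = 1000, 0.002
mu = n * p  # mean of the approximating Poisson distribution

# P(X >= 2) = 1 - [P(X = 0) + P(X = 1)] under each model.
exact = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))
approx = 1 - (poisson_pmf(0, mu) + poisson_pmf(1, mu))

print(f"binomial: {exact:.4f}, Poisson approximation: {approx:.4f}")
```

With p this small and n this large, the two values agree to about three decimal places.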

Continuous Probability Distribution


Up to this point, we have focused our attention on discrete distributions of
random variables that have either a finite number of possible values (e.g. 0, 1, 2, 3
…n) or a countably infinite number of values (e.g. 0, 1, 2, 3 …). We can list all of
the possible values of a discrete random variable, and it is meaningful to
consider the probability that a particular individual value will be assumed. In
contrast, a continuous random variable has an uncountably infinite number of
possible values and can assume any value in the interval between two points a
and b (a<x<b). As a result, the only meaningful way to compute a probability is
the probability that the variable will fall within a specified region. That is, the
probability that a continuous random variable X will assume any particular value
is zero.
A continuous probability distribution is any representation of the values of a
continuous random variable and the associated probabilities. Continuous
probability distributions include the normal distribution and the exponential
distribution.

The Normal Distribution


The normal distribution is a continuous distribution that has a bell shape and is
determined by its mean and standard deviation. It occupies a place of central
importance in continuous probability distribution in particular and statistics in
general. It is the most important theoretical distribution because of the following
three reasons:
1. The normal distribution approximates the observed frequency distributions
of many natural and physical measurements, such as IQs, weights, heights,
sales, product lifetimes, etc.
2. The normal distribution can often be used to estimate binomial probabilities
when n (sample size) is greater than 20.
3. The normal distribution is a good approximation of distributions of both
sample means and sample proportions of large samples (n > 30).

Characteristics of normal distribution


i. It is a continuous distribution.
ii. It has a bell shape and is symmetrical about its mean.
iii. It is asymptotic to the X- axis.
iv. It extends infinitely in either direction from the mean.
v. It is defined by two parameters: µ and δ. Each combination of these two
parameters specifies a unique normal distribution. The value of µ
indicates where the center of the bell lies, while δ represents how spread
out (or wide) the distribution is.
vi. It is measured on a continuous scale and the probability of obtaining a
precise value is zero.
vii. The total area under the curve is equal to 1.0 or 100%; 50% of the area is
above the mean and 50% is below the mean.
viii. The probability that a random variable will have a value between any
two points is equal to the area under the curve between those two points.

Each combination of µ and δ specifies a unique normal distribution, which gives
an infinite family of normal distributions. The problem of dealing with an
infinite family of distributions can be solved by transforming all normal
distributions to the standard normal distribution, which has a mean equal to 0 and
a standard deviation equal to 1. It is denoted by z.

Any normal distribution can be converted to the standard normal distribution by
standardizing each of its observations in terms of Z-values. The Z-value
measures the distance in standard deviations between the mean of the normal
curve and the X-value of interest. Any random variable can be transformed to a
standard random variable by subtracting the mean and dividing by the standard
deviation.

If a random variable X has mean µ and standard deviation δ, the standardized
variable Z is defined as:

$$Z = \frac{X - \mu}{\delta}$$

Where: Z = number of standard deviations from the mean
X = value of interest
µ = mean of the distribution
δ = standard deviation of the distribution

A Z-score is the number of standard deviations that a value, X, is away from the
mean. If the value of X is less than the mean, the Z-score is negative; if the value
of X is greater than the mean, the Z-score is positive. The Z-score is also known
as the z-value. It is a standardized score for which the mean is zero and the
standard deviation is 1. The Z-score is used to represent the standard normal
distribution.

The probability calculations in normal distribution are made by computing areas
under the graph. Thus, to find the probability that a random variable lies within
any specific interval we must compute the area under the normal curve over that
interval.
Probabilities for some commonly used intervals are:
a) 68.26% of the time, a normal random variable assumes a value within ±1δ of
its mean.
b) 95.44% of the time, a normal random variable assumes a value within ±2δ of
its mean.
c) 99.73% of the time, a normal random variable assumes a value within ±3δ of
its mean.
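These areas can be reproduced without a table by using the error function, since the standard normal cumulative probability equals 0.5[1 + erf(z/√2)]. The sketch below is illustrative; the function names are assumptions.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```

Values computed this way can differ from table-based figures in the last digit, because printed tables round each half-area to four or five decimals before doubling.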
Example:
1. The Graduate Management Admission Test (GMAT) is widely used by
graduate schools of business as an entrance requirement. In one particular
year, the mean score for the GMAT was 485, with a standard deviation of 105.
Assuming that GMAT scores are normally distributed, what is the probability
that a randomly selected score from this administration of the GMAT:
a) Falls between 600 and the mean, inclusive?
b) Is greater than 650?
c) Is less than 300?
d) Falls between 350 and 550, inclusive?
e) Is less than 700?
f) Is exactly 500?
g) If 500 applicants take the test, how many would you expect to score 590 or
below?
Solution:
Steps to find the probability value of a random variable which lies over an
interval:
- Calculate the appropriate z values
- Find the areas (probabilities) in the table
- Interpret your results

µ = 485 δ = 105 485≤X≤600

a) P (485≤X≤600) =?
1. First convert the X values into Z-scores using the formula Z = (X - µ)/δ
Z485 = (485 - 485)/105 = 0
Z600 = (600 - 485)/105 = +1.10

2. P(485≤X≤600) = P(0≤Z≤+1.10)
= P (0 to +1.10)
= 0.36433
b) P (X>650) =?
1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ
Z650 = (650 - 485)/105 = +1.57

2. P(X>650) = P(Z>+1.57)
= 0.5 - P (0 to +1.57)
= 0.5 - 0.44179
= 0.05821

c) P (X<300) =?
1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ
Z300 = (300 - 485)/105 = -1.76

2. P(X<300) = P(Z<-1.76)
= 0.5 - P (0 to -1.76)
= 0.5 - 0.46080
= 0.03920
d) P (350≤X≤550) =?
1. First convert the X values into Z-scores using the formula Z = (X - µ)/δ
Z350 = (350 - 485)/105 = -1.29
Z550 = (550 - 485)/105 = +0.62

2. P(350≤X≤550) = P (-1.29≤Z≤+0.62)
= P (0 to -1.29) + P (0 to +0.62)
= 0.40147 + 0.23237
= 0.63384
e) P (X<700) =?
1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ
Z700 = (700 - 485)/105 = +2.05

2. P(X<700) = P(Z<+2.05)
= P (X<485) + P (485≤X<700)
= 0.5 + P (0 to +2.05)
= 0.5 + 0.47982
= 0.97982

f) P(X=500) = 0. The probability of an exact/single value of a continuous random
variable is zero. Consequently, the probability of an interval is the same
whether the end points are included or not.

g) To find the expected number of applicants who score 590 or below, we first
find P (X≤590) and we multiply it by the number of applicants.
P (X≤590) =?
1. First convert the X value into a Z-score using the formula Z = (X - µ)/δ
Z590 = (590 - 485)/105 = +1.00

2. P(X≤590) = P(Z≤+1.00)
= P (X<485) + P (485≤X<590)
= 0.5 + P (0 to +1.00)
= 0.5 + 0.34134
= 0.84134
If 500 applicants take the test, the number of applicants expected to score 590 or
below is 500(0.84134) = 420.67, or about 421 applicants.
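The GMAT calculations can also be sketched with Python's `statistics.NormalDist`, which evaluates normal areas directly. The variable names are assumptions of this illustration, and the code uses unrounded Z-values, so the results differ slightly from the table-based answers above, which round Z to two decimals.

```python
from statistics import NormalDist

gmat = NormalDist(mu=485, sigma=105)

between_mean_and_600 = gmat.cdf(600) - gmat.cdf(485)  # part a
above_650 = 1 - gmat.cdf(650)                          # part b
below_300 = gmat.cdf(300)                              # part c
between_350_and_550 = gmat.cdf(550) - gmat.cdf(350)    # part d
below_700 = gmat.cdf(700)                              # part e
expected_590_or_below = 500 * gmat.cdf(590)            # part g

for name, value in [
    ("a", between_mean_and_600),
    ("b", above_650),
    ("c", below_300),
    ("d", between_350_and_550),
    ("e", below_700),
]:
    print(name, round(value, 4))
print("g", round(expected_590_or_below, 2))
```

Part f needs no computation: the probability of any single exact value of a continuous variable is zero.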

2. The result of an exam score for a given class is normally distributed. If the mean
score is 85 points and the standard deviation is equal to 20 points, find the cutoff
passing grade such that 83.4% of those taking the test will pass.
Solution:
µ = 85 Probability of passing = 83.4%
δ = 20 Cutoff point =?

Since 83.4% is greater than 50%, the cutoff point should be less than the mean,
and hence the Z-value is negative. The area between the cutoff point and the
mean is 0.834 - 0.5 = 0.334, which calls for the inverse use of the
standard normal table.
(Z/P=0.334) = -0.97
-0.97 = (X - 85)/20
-19.4 = X - 85
X = 65.6 points, the minimum score needed to pass the test.
3. Data accumulated by the National Climatic Data Center shows that the average
wind speed in miles per hour for Addis is 9.7mph. Suppose that wind speed
measurements are normally distributed for a given geographical location. If
22.45% of the time the wind speed measurements are more than 11.6mph, what
is the standard deviation of wind speed in Addis?
Solution:
µ = 9.7mph δ =? X > 11.6
P(X > 11.6) = 22.45%
The area between the mean and 11.6mph is 0.5 - 0.2245 = 0.2755.
(Z/P = 0.2755) = +0.76
+0.76 = (11.6 - 9.7)/δ
0.76δ = 1.9
δ = 2.5mph

4. A cylinder making machine has δ = 0.5mm and µ = 25mm. Within what interval
of values centered at the mean will the diameters of 80% of the cylinders lie?
Solution:
µ = 25mm δ = 0.5mm

From the statement it is clear that the interval is centered at the mean; i.e., half
of the 80% (40%) lies below the mean and the other half (40%) lies above the mean.
(Z/P=0.4) = ± 1.28
X1 = µ - Zδ and X2 = µ + Zδ
-1.28 = (X1 - 25)/0.5 and +1.28 = (X2 - 25)/0.5
-0.64 = X1 - 25 and +0.64 = X2 - 25
X1 = 24.36mm and X2 = 25.64mm

80% of the cylinder diameters lie between 24.36mm and 25.64mm.

5. The lives of light bulbs follow a normal distribution. If 90% of the bulbs have
lives exceeding 2000 hrs and 3% have lives exceeding 6000 hrs, what are the mean
and standard deviation of the lives of light bulbs?
Solution:
P(X>2000) = 0.90 P(X>6000) = 0.03
µ =? δ =?
(Z/P=0.4) = -1.28 and (Z/P=0.47) = +1.88
-1.28 = (2000 - µ)/δ and +1.88 = (6000 - µ)/δ
-1.28δ = 2000 - µ and +1.88δ = 6000 - µ
µ = 2000 + 1.28δ and µ = 6000 - 1.88δ

Using simultaneous equations,
µ = 6000 - 1.88δ
µ = 2000 + 1.28δ
Setting the two expressions for µ equal gives 3.16δ = 4000
δ = 1265.82 hours

µ = 2000 + 1.28δ
= 2000 + 1.28(1265.82)
= 3620.25 hours
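The same pair of simultaneous equations can be solved in code. This sketch assumes `statistics.NormalDist().inv_cdf` in place of the printed table, so it uses unrounded Z-values and its answers differ slightly from the hand calculation; the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(X > 2000) = 0.90 means P(X <= 2000) = 0.10, and
# P(X > 6000) = 0.03 means P(X <= 6000) = 0.97.
z_low = standard.inv_cdf(1 - 0.90)   # about -1.28
z_high = standard.inv_cdf(1 - 0.03)  # about +1.88

# From 2000 = mu + z_low * sigma and 6000 = mu + z_high * sigma:
sigma = (6000 - 2000) / (z_high - z_low)
mu = 2000 - z_low * sigma

print(round(z_low, 2), round(z_high, 2))
print(round(sigma, 1), round(mu, 1))
```

Subtracting one equation from the other eliminates the mean, which is the same step as combining the two expressions for µ by hand.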

6. On a civil service exam, the grades are normally distributed with µ = 70 points
and δ = 10 points. The police department hires the applicants whose grades are
among the top 10% of the population. What is the minimum grade required to be
hired?
Solution:
µ = 70 points     δ = 10 points

The top 10% leaves an area of 0.5 - 0.1 = 0.4 between the mean and the cutoff.
(Z/P=0.4) = +1.28
+1.28 = (X - µ)/δ
+1.28 = (X - 70)/10
12.8 = X - 70
X = 82.8, the minimum grade required to be hired.

7. A bakery shop sells loaves of freshly made bread. Any unsold loaves at the end of
the day are either discarded or sold elsewhere at a loss. The demand for this
bread has followed a normal distribution with µ = 35 loaves and δ = 8 loaves.
How many loaves should the bakery make each day so that they can meet the
demand 90% of the time?
Solution:
µ = 35 loaves     δ = 8 loaves
(Z/P=0.4) = +1.28
+1.28 = (X - µ)/δ
+1.28 = (X - 35)/8
10.24 = X - 35
X = 45.24 ≈ 46. By baking 46 loaves of bread each day, the bakery will meet the
demand for this product 90% of the time.
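
Problems 6 and 7 both ask for the value below which a given share of a normal population falls. A minimal sketch of that lookup follows, assuming Python 3.8 or later and statistics.NormalDist; the helper name cutoff_value is an assumption made for illustration.

```python
import math
from statistics import NormalDist

def cutoff_value(mean, sd, prob_below):
    """Value X such that P(X' <= X) = prob_below for a normal variable X'."""
    return NormalDist(mean, sd).inv_cdf(prob_below)

# Problem 6: the top 10% of grades start where 90% of grades lie below.
min_grade = cutoff_value(mean=70, sd=10, prob_below=0.90)

# Problem 7: demand is met 90% of the time; loaves are whole, so round up.
loaves = math.ceil(cutoff_value(mean=35, sd=8, prob_below=0.90))

print(f"minimum grade is about {min_grade:.1f}; bake {loaves} loaves")
```

Rounding up rather than to the nearest integer is deliberate: baking 45 loaves would meet demand slightly less than 90% of the time.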

Normal Approximation to Binomial Probability


When a binomial problem involves an n-value larger than 20, the binomial tables
may not be usable. Where the Poisson approximation is not appropriate either,
another method of solving the problem must be found: the normal distribution.

The normal distribution is bell-shaped and symmetrical with mean, µ, and
standard deviation, δ. However, the binomial distribution is symmetrical only if
p = 0.5. Hence, if n is large and p is close to 0.5, the normal distribution provides a
good approximation to the binomial distribution. The approximation is quite
good when np and nq are both greater than 5.

In a binomial distribution:
- When p is small (e.g. 0.1), the distribution is skewed to the right:
  Mode < Median < Mean.
- As p increases (e.g. 0.3), the skewness is less noticeable.
- When p = 0.5, the distribution is symmetrical: Mode = Median = Mean.
- When p > 0.5, the distribution is skewed to the left: Mean < Median < Mode.

To approximate the binomial by the normal, the following rule is used:

When the normal probability distribution is used to approximate the binomial
distribution, the mean and standard deviation for the normal approximation are
based on the expected value and standard deviation of the binomial
probabilities, i.e., µ = np and δ = √(npq). The normal distribution provides a
good approximation to the binomial if n is large (n ≥ 50) and p is close to 0.5, so
that np and nq are both greater than 5.

When we use a normal probability value as an approximation of a binomial
probability, we are substituting a continuous probability distribution for a
discrete probability distribution. Such a substitution requires a CONTINUITY
CORRECTION FACTOR (addition or subtraction of 0.5 to the discrete value of
x), i.e., a correction of ± 0.5, depending on the problem, is required.

The need for the continuity correction factor can be summarized as follows:

   Values being determined        Correction
   1. X >                         + 0.5
   2. X ≥                         - 0.5
   3. X <                         - 0.5
   4. X ≤                         + 0.5
   5. ≤ X ≤                       - 0.5 and + 0.5
   6. < X <                       + 0.5 and - 0.5
   7. X =                         - 0.5 and + 0.5

Without the continuity correction, the normal distribution will generally
underestimate binomial probabilities, especially if n is small.
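
The effect of the correction can be seen on a small case. The sketch below is an illustration under stated assumptions: the values n = 20, p = 0.5 and the event X ≤ 8 are chosen here only for demonstration and do not come from the text, and the helper names are hypothetical. It assumes Python 3.8 or later (math.comb and statistics.NormalDist).

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_at_most(k, n, p):
    """Exact binomial probability P(X <= k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_at_most(k, n, p, corrected=True):
    """Normal approximation to P(X <= k); case 4 of the table adds 0.5."""
    mu, sd = n * p, sqrt(n * p * (1 - p))
    x = k + 0.5 if corrected else k
    return NormalDist(mu, sd).cdf(x)

n, p, k = 20, 0.5, 8                     # illustrative values only
exact = binomial_at_most(k, n, p)
with_cc = normal_at_most(k, n, p, corrected=True)
without_cc = normal_at_most(k, n, p, corrected=False)
print(f"exact {exact:.4f}, corrected {with_cc:.4f}, uncorrected {without_cc:.4f}")
```

In this case the uncorrected value falls well below the exact probability, while the corrected value is much closer to it.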

To approximate a binomial distribution by a normal distribution, a test must be
made to determine whether the interval µ ± 3δ lies between 0 and n, which are
the lower and upper limits, respectively, of a binomial distribution. This is
because the empirical rule states that approximately 99.73%, or almost all, of the
values under a normal curve lie within three standard deviations of the mean. If
µ ± 3δ does not lie between 0 and n, do not use the normal distribution to work a
binomial problem, because the approximation is not good enough.

In short, to use the normal distribution as an approximation to the binomial,
follow these steps:
1. Check that n is large (n ≥ 50) and p is close to 0.5, and that np and nq are
   both greater than 5.
2. Calculate µ = np and δ = √(npq).
3. Check that µ ± 3δ lies between 0 and n.
4. Apply the continuity correction factor and determine the appropriate
   interval.
5. Calculate the probability value by calculating the area covered by the
   interval.
6. Interpret the results.

Example:
1. According to a recent study conducted by Addis Ababa University,
87% of all evening college students also work. If this figure still holds and 120
evening college students are randomly selected, what is the probability that
fewer than 100 also work? Use the normal distribution to approximate the binomial.

Solution:
1. n = 120 is large. p = 0.87 is not especially close to 0.5, but
   np = 0.87*120 = 104.40 and nq = 0.13*120 = 15.60 are both greater than 5.
2. µ = np = 120*0.87 = 104.40 and δ = √(npq) = √(120*0.87*0.13) = 3.684
3. µ ± 3δ = 104.40 ± 3(3.684) = 104.40 ± 11.052, i.e., 93.348 ≤ µ ± 3δ ≤ 115.452.
   Hence, the interval (93.35 to 115.45) lies between 0 and n (120).
4. P(X < 100) for the binomial is changed into P(X < 99.5) for the normal by
   applying the continuity correction factor.
5. P(X < 99.5) = ?
   Z = (99.5 - 104.4)/3.684 = -1.33
   P(X < 99.5) = P(Z < -1.33)
               = 0.5 - P(0 to -1.33)
               = 0.5 - 0.40824
               = 0.09176
6. If 87% of all evening college students work, then about 9.18% of the time
   fewer than 100 students would be working in a random sample of 120 evening
   college students.
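
The six steps can be sketched as a short routine and applied to this example. This is only an illustration, assuming Python 3.8 or later; the function names are hypothetical, and the exact binomial sum is included purely as a comparison that the text itself does not carry out.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_less_than(x, n, p):
    """Approximate the binomial P(X < x) with a normal curve."""
    q = 1 - p
    # Step 1: np and nq must both exceed 5.
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq must both be greater than 5")
    # Step 2: mean and standard deviation of the approximating normal.
    mu, sd = n * p, sqrt(n * p * q)
    # Step 3: mu - 3*sd and mu + 3*sd must lie between 0 and n.
    if mu - 3 * sd < 0 or mu + 3 * sd > n:
        raise ValueError("mu +/- 3*sd falls outside the range 0 to n")
    # Steps 4 and 5: continuity correction (X < x becomes X < x - 0.5),
    # then the area under the normal curve to the left of that point.
    return NormalDist(mu, sd).cdf(x - 0.5)

def binomial_less_than(x, n, p):
    """Exact binomial P(X < x), for comparison only."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x))

approx = normal_approx_less_than(100, 120, 0.87)
exact = binomial_less_than(100, 120, 0.87)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation agrees with the table-based 0.09176 to about three decimals; any gap between it and the exact binomial value reflects the skewness that remains when p is far from 0.5.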

2. In a travel study, the Ethiopian Tourism Commission reported that during the
Ethio-Eritrean war, 29% of the tourists who came to Ethiopia said that the crisis
would affect their vacation plans.
   a) If the figure is still 29% at the end of the war, what is the probability
   that, in a random sample of 150 travelers, 20 or fewer say that the
   Ethio-Eritrean crisis would affect their vacation plans? (Answer: 0.0000)
   b) However, a study at the end of the war indicated that only 7% of
   the travelers felt at that time that the Ethio-Eritrean crisis would affect their
   vacation plans. What is the probability that a random sample of 150 travelers
   would result in 20 or fewer travelers saying that the Ethio-Eritrean crisis
   would affect their vacation plans? (Answer: 0.99934)

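3. A true-false test containing 100 questions is given to a student who is totally
ignorant of the subject matter. What is the probability that the student gets
exactly 65 correct? (Answer: approximately 0.0009, the area between Z = 2.9 and
Z = 3.1, since µ = 50, δ = 5 and the corrected interval is 64.5 to 65.5.)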

Normal Approximation to Poisson Probabilities


When the mean (λt) of a Poisson probability distribution is large, the normal
distribution can be used to approximate Poisson probabilities. Such an
approximation is generally considered acceptable when λt ≥ 10. The mean
and standard deviation used with the normal approximation are based on the
expected value and standard deviation of the number of events in the Poisson
probability distribution: µ = λt and δ = √(λt).

Exactly as for the normal approximation of binomial probabilities, a correction
factor for continuity should be used in conjunction with the normal
approximation of Poisson probabilities.
Example:
1. Suppose we wish to determine the probability that 15 or more maintenance calls
will be required on a randomly selected day, given a Poisson random variable
with λ = 10 calls per day, using the normal distribution.
Solution:
Poisson                           Normal
λ = 10 calls per day              µ = λt = 10
t = 1 day                         δ = √(λt) = √10 = 3.16
P(X ≥ 15) = ?                     P(X ≥ 14.5) = ?
P(X ≥ 15) = 1 - P(X ≤ 14)         Z = (14.5 - 10)/3.16 = +1.42
          = 1 - 0.9165            P(X ≥ 14.5) = P(Z ≥ +1.42)
          = 0.0835                            = 0.5 - P(0 to +1.42)
                                              = 0.5 - 0.42220
                                              = 0.0778

As we can see from the above result, the difference is only 0.0057 (0.0835 - 0.0778).
As µ increases, the difference decreases.
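
The comparison between the exact Poisson probability and its normal approximation can be reproduced with a short sketch. It assumes Python 3.8 or later; poisson_at_most is an illustrative helper name, not part of the original text.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_at_most(k, lam):
    """Exact Poisson probability P(X <= k) for mean lam."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

lam = 10                                       # mean number of calls per day
exact = 1 - poisson_at_most(14, lam)           # P(X >= 15)
# Normal approximation with continuity correction: X >= 15 becomes X >= 14.5.
approx = 1 - NormalDist(lam, sqrt(lam)).cdf(14.5)
print(f"Poisson {exact:.4f}, normal {approx:.4f}, gap {exact - approx:.4f}")
```

The normal value printed here differs slightly from 0.0778 in the fourth decimal place because the code does not round Z to 1.42 before looking up the area.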

Exponential Distribution - is a continuous probability distribution that is often
useful in describing the time it takes to complete a task. It is closely related to the
Poisson distribution. “The Poisson distribution for arrivals per unit of time and
the exponential distribution for time between arrivals provide two alternative
ways of describing the same thing.”
Whereas a Poisson distribution is discrete and describes random occurrences
over some interval, the exponential distribution is continuous and describes the
times between random occurrences. The interval may be the time it takes to
complete a task, the time between arrivals, or the time or distance before a
product fails. In the Poisson distribution we are interested in a discrete variable,
the number of occurrences or arrivals per unit of time.

Characteristics of Exponential Distribution


- It is a continuous distribution.
- It is skewed to the right: Mode < Median < Mean.
- The X-values vary from 0 to ∞
- Its apex is always at X=0.
- The curve steadily decreases as x gets larger.

An exponential distribution is characterized by one parameter, λ. Each unique
value of λ determines a different exponential distribution. Probabilities are
computed for the exponential distribution by determining the area
under the curve between two points. In general, 1/λ is the average (mean) value
of the exponential random variable X, and it is also equal to the standard
deviation of X. So, µ = δ = 1/λ.
The Formula for Exponential Distribution Probabilities:
          P(0 to t) = 1 - e^(-λt), or P(X ≤ t) = 1 - e^(-t/µ), for t ≥ 0
Where: P(0 to t) = probability that the next arrival will occur between now
                   and t time units from now
       λ = average arrival rate per unit of time/interval
       t = number of units of time/interval
       µ = average time between occurrences
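
The formula translates directly into code. The following is a minimal sketch in standard Python; the function names and the rate of 0.5 arrivals per time unit are assumptions chosen only to illustrate the calculation.

```python
from math import exp

def expon_within(t, rate):
    """P(0 to t) = 1 - e^(-rate * t): the next arrival occurs within t time units."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return 1 - exp(-rate * t)

def expon_between(a, b, rate):
    """Area under the exponential curve between two points a <= b."""
    return expon_within(b, rate) - expon_within(a, rate)

rate = 0.5                       # illustrative arrival rate, so the mean is 1/0.5 = 2
print(f"P(0 to 2)       = {expon_within(2, rate):.4f}")
print(f"P(1 <= X <= 3)  = {expon_between(1, 3, rate):.4f}")
```

Note that P(0 to 2), the probability of an arrival before the mean time of 2 units, exceeds 0.5, which reflects the right skew of the distribution.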
Exponential distribution has two areas of application. These are: Measuring Time
Elapsed between Arrivals and Reliability Engineering.

Measuring Time Elapsed between Arrivals


Example:
1. Arrivals at a bank are Poisson distributed with λ = 1.2 customers per minute.
   a) What is the average time between arrivals?
   b) What is the probability that at least two minutes will elapse between one
      arrival and the next arrival?
Solution:
λ = 1.2 customers per minute
a) µ = 1/λ = 1/1.2 = 0.833 minutes
b) P(t ≥ 2) = 1 - (1 - e^(-λt)) = 1 - (1 - e^(-1.2(2)))
            = 1 - (1 - e^(-2.4))
            = 1 - (1 - 0.0907)
            = 0.0907
About 9.07% of the time, when customers arrive at a rate of 1.2 customers
per minute, 2 minutes or more will elapse between arrivals.
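
As a rough check on this result, the waiting times can be simulated. The sketch below uses random.expovariate from the Python standard library; the seed and the sample size of 100,000 are arbitrary choices made here for illustration.

```python
import random
from math import exp

rate = 1.2                                  # customers per minute
mean_gap = 1 / rate                         # average minutes between arrivals
p_at_least_2 = exp(-rate * 2)               # P(t >= 2) = e^(-2.4)

random.seed(0)                              # fixed seed so the run is repeatable
gaps = [random.expovariate(rate) for _ in range(100_000)]
simulated = sum(g >= 2 for g in gaps) / len(gaps)

print(f"mean gap {mean_gap:.3f} min, formula {p_at_least_2:.4f}, simulated {simulated:.4f}")
```

The simulated fraction should land close to the formula value, with a small sampling error that shrinks as the number of simulated gaps grows.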
2. A manufacturing firm has been involved in statistical quality control for several
years. As part of the production process, parts are randomly selected and tested.
From the records of these tests it has been established that defective parts occur
in a pattern that is Poisson distributed, with an average of 1.38 defects every 20
minutes during production runs. What is the probability that less than 15
minutes will elapse between any two defects?
Solution:
λ = 1.38 defects/20 minutes     t = 15 minutes     P(t < 15) = ?
λt = (1.38 defects/20 minutes) * 15 minutes = 1.035
P(t < 15) = P(0 to 15) = 1 - e^(-λt)
                       = 1 - e^(-1.035)
                       = 1 - 0.3552
                       = 0.6448
There is a probability of 64.48% that there will be less than 15 minutes between
two defects when there is an average of 1.38 defects per 20-minute interval (or
an average of 20/1.38 = 14.49 minutes between defects).

3. The defects in an automated weaving process at X company are Poisson
distributed at a rate of 0.025 defects per foot.
   a) What is the mean distance between defects for this problem? (Answer: 40 feet)
   b) What is the probability that the next defect will be within 10 feet of the
      previous defect? (Answer: 0.2212)
   c) What is the probability that the next defect will be within 10 feet of the
      previous defect? (Answer: 0.1723)

4. Supermarkets usually get very busy at about 5pm on weekdays as many workers
stop by on the way home to shop. Suppose that at that time arrivals are Poisson
distributed at a supermarket’s checkout station with an average of 0.8 people per
minute. The clerk has just checked out the last person in line.
   a) What is the probability that at least one minute will elapse before the next
      customer arrives? (Answer: 0.4493)
   b) Suppose the clerk needs to go to the manager’s office to ask a quick question
      and that 2.5 minutes are needed to do so. What is the probability that the
      clerk will get back before the next customer arrives? (Answer: 0.1353)

Reliability Engineering
Another situation that often fits the exponential distribution is the life
time of certain components in a machine; i.e., the exponential distribution is
widely used in reliability engineering to describe the time to failure of
a component or a system. The parameter µ is called the mean time to failure and
λ = 1/µ is the failure rate of the system.
Example:

1. Suppose that an automobile battery has a useful life described by the exponential
distribution with a mean of 1000 days.
a) What is the probability that a battery will fail before its expected life time of
1000 days?
b) If the battery has a 12-month (365 days) warranty, what fraction of the
batteries fail during the warranty period?
   c) Find the probability that a battery will last between 1000 and 2000 days.
   d) Find the probability that a battery will last more than 2000 days.

Solution:
µ = 1000 days
a) P(t < 1000) = ?
   P(0 to X0) = 1 - e^(-X0/µ)
              = 1 - e^(-1000/1000)
              = 1 - e^(-1)
              = 0.6321

There is a 63% chance that the battery will fail prior to its mean life time of 1000
days. This value is greater than 50% since this distribution is not symmetrical
and is positively skewed.
b) P(t ≤ 365) = P(0 to 365)
              = 1 - e^(-X0/µ)
              = 1 - e^(-365/1000)
              = 1 - e^(-0.365)
              = 0.3058

If the mean life time of the batteries is 1000 days, 30.58% of the batteries will
fail within 365 days; that is, the manufacturer will be forced to replace 30.58% of
the batteries during the one-year warranty period.
c) P(1000 ≤ t ≤ 2000) = P(0 to 2000) - P(0 to 1000)
                      = [1 - e^(-2000/1000)] - [1 - e^(-1000/1000)]
                      = (1 - e^(-2)) - (1 - e^(-1))
                      = (1 - 0.1353) - (1 - 0.3679)
                      = 0.8647 - 0.6321
                      = 0.2326

The batteries have a 23.26% chance of lasting between 1000 and 2000 days.

d) P(t > 2000) = e^(-2000/1000)
               = e^(-2)
               = 0.1353

The batteries have a 13.53% chance of lasting more than 2000 days.
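
All four parts of the battery problem use the same expression, 1 - e^(-t/µ). A short sketch in standard Python is given below; prob_fail_by is a hypothetical helper name introduced for illustration.

```python
from math import exp

def prob_fail_by(t, mean_life):
    """P(0 to t) = 1 - e^(-t / mean_life): the component fails within t time units."""
    return 1 - exp(-t / mean_life)

mean_life = 1000                                                       # days
p_a = prob_fail_by(1000, mean_life)                                    # fails before 1000 days
p_b = prob_fail_by(365, mean_life)                                     # fails under warranty
p_c = prob_fail_by(2000, mean_life) - prob_fail_by(1000, mean_life)    # lasts 1000 to 2000 days
p_d = 1 - prob_fail_by(2000, mean_life)                                # lasts beyond 2000 days

for label, value in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"{label}) {value:.4f}")
```

Parts a, c and d together cover every possible life time, so their probabilities sum to 1, which is a quick consistency check on the hand calculation.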

2. A company that manufactures washing machines makes them to last an average
of 8 years before they have a major breakdown. The manufacturer offers a free
warranty against major breakdowns. However, the company only wants to
guarantee the machine against major breakdowns for no more than 20% of the
machines. For how many years should the warranty be promised? Assume
breakdowns are Poisson distributed. (Answer: 1.79 years)
Solution:
µ = 8 yrs     P(0 to t) = 0.2     t = ?
P(0 to t) = 0.2
1 - e^(-t/8) = 0.2
1 - 0.2 = e^(-t/8)
0.8 = e^(-t/8) = e^(-0.125t)     To solve this we take logarithms of both sides.
log 0.8 = -0.125t log e
(log 0.8)/(log e) = -0.125t
t = [(log 0.8)/(log e)] * [1/(-0.125)]
t = 1.79 years.
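
The logarithm step can be carried out directly with the natural logarithm, since (log 0.8)/(log e) is simply ln 0.8. The sketch below uses standard Python; warranty_length is an illustrative name, not from the text.

```python
from math import log

def warranty_length(mean_life, max_fraction_failing):
    """Solve 1 - e^(-t / mean_life) = max_fraction_failing for t."""
    # e^(-t / mean_life) = 1 - fraction, so t = -mean_life * ln(1 - fraction).
    return -mean_life * log(1 - max_fraction_failing)

t = warranty_length(mean_life=8, max_fraction_failing=0.20)
print(f"warranty of about {t:.2f} years")
```

Any warranty shorter than this keeps the expected share of claims under 20%, so in practice the figure would be rounded down rather than up.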
