
Probability & Probability Distributions
Carolyn J. Anderson
EdPsych 580
Fall 2005

Probability & Probability Distributions p. 1/61

Probability & Probability Distributions

Elementary Probability Theory

Definitions
Rules
Bayes' Theorem

Probability Distributions
Discrete & continuous variables.
Characteristics of distributions.

Expectations
Probability & Probability Distributions p. 2/61

Elementary Probability Theory


or: How likely are the results?

Probabilities arise when sampling individuals from a population and in
experimental situations, because different trials or replications of the same
experiment usually result in different outcomes.

Probability & Probability Distributions p. 3/61

Statistical Experiment
A (simple) statistical experiment is some well-defined act or process
(including sampling) that leads to one well-defined outcome.

It is repeatable (in principle).

There is uncertainty about the results.

Uncertainty is modeled by assigning probabilities to the outcomes.

Examples. . .

Probability & Probability Distributions p. 4/61

Examples of Statistical Experiments


Well defined, repeatable, uncertain, modeled by probabilities?

Flip a coin 5 times & record the number of heads.

Count the number of blue M&Ms in a 9 oz. package.

Roll two dice & record the total number of spots.

Ask people who they intend to vote for in the next presidential election.

Record the number of correct responses on a test.
Probability & Probability Distributions p. 5/61

Statistical Experiments
A statistical experiment may be

Real (it can actually be done).

Conceptual (completely idealized).

Probability & Probability Distributions p. 6/61

Definition: Probability
The probability of an event is the proportion of
times that the event occurs in a large number of
trials of the experiment.
It is the long-run relative frequency of the event.
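
To make the long-run idea concrete, here is a minimal simulation sketch (not
from the original slides; it assumes a fair coin): the relative frequency of
heads settles near the probability .5 as the number of trials grows.

    import random

    random.seed(1)
    for n in (100, 10_000, 1_000_000):
        # proportion of heads in n simulated flips of a fair coin
        heads = sum(random.random() < 0.5 for _ in range(n))
        print(f"{n:>9} flips: relative frequency of heads = {heads / n:.4f}")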

Probability & Probability Distributions p. 7/61

Example

Experiment: Draw a card from a standard deck of 52.

Sample space: The set of all possible distinct outcomes, S (e.g., the 52 cards).

Elementary event or sample point: a member of the sample space
(e.g., the ace of hearts).

Event (or event class): any set of elementary events, e.g., Suit (Hearts),
Color (Red), or Number (Ace).
Probability & Probability Distributions p. 8/61

Example (continued)
Probability of an Ace = (number of aces)/(number of cards) = 4/52 = .0769

Notes:

Elementary events are equally likely.

Denote events by roman letters (e.g., A, B, etc.).

Denote the probability of an event A as P(A).


Probability & Probability Distributions p. 9/61

More Definitions

Joint Event: when you consider two (or more) events at a time. e.g., A =
heads on penny, B = heads on quarter, and the joint event is heads on both
coins.

Intersection: (A ∩ B) = A and B occur at the same time.

Union: (A ∪ B) = A or B occurs:
Only A occurs.
Only B occurs.
A and B occur.
Probability & Probability Distributions p. 10/61

More Definitions

Complement of an event A is the event that A did not occur: Ā ≡ not A.
e.g., if A = red card, then Ā is a black card (not a red card).

Mutually exclusive events are events that cannot occur at the same time;
they have no elementary events in common. e.g., A = heart and B = club.

Mutually exclusive and exhaustive events form a complete partition of the
sample space. e.g.,
Suits (hearts, diamonds, clubs, spades)
Numbers (A, 2, 3, . . . , 10, J, Q, K)
Probability & Probability Distributions p. 11/61

Formal Definition of Probability

Probability is a number assigned to each and every member of the sample
space. Denote it by P(·).
A probability function is a rule of correspondence that associates with each
event A in the sample space S a number P(A) such that

0 ≤ P(A) ≤ 1, for any event A.
The sum of the probabilities for all distinct events is 1.
If A and B are mutually exclusive events, then

P(A or B) = P(A ∪ B) = P(A) + P(B)
Probability & Probability Distributions p. 12/61

Example
Let A = number card (i.e., 2–10), B = face card (i.e., J, Q, K), and C = Ace.

Probabilities of events:
P(A) = 9(4)/52 = 36/52 = .6923
P(B) = 3(4)/52 = 12/52 = .2308
P(C) = 1(4)/52 = 4/52 = .0769

P(A) + P(B) + P(C) = 1

P(A ∪ B) = P(A) + P(B) = .6923 + .2308 = .9231 = 48/52.
Probability & Probability Distributions p. 13/61

Another Example

Experiment: Randomly select a third grade student from a Unit 4 public
school in Champaign county.

Sample Space: All 3rd grade students at Unit 4 public schools.

Elementary Event: A characteristic of the child, e.g., brown hair, age (in
months), weight, gender, the response "very much" to the question
"How much do you like school?"

Probability & Probability Distributions p. 14/61

Venn Diagram
[Venn diagram: sample space S containing overlapping events A and B (with
intersection A ∩ B) and an event C.]

Addition rules. . .

Probability & Probability Distributions p. 15/61

Addition Rules

Rule 1: If 2 events, B & C, are mutually exclusive (i.e., no overlap), then the
probability that one or both occur is

P(B or C) = P(B ∪ C) = P(B) + P(C)

Rule 2: For any 2 events, A & B, the probability that one or both occur is

P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Probability & Probability Distributions p. 16/61

Example: Teachers by Region


The population consists of all elementary and secondary teachers in the US
in 1969.

                              Level
Region          Elementary    Secondary        Total
Northeast          273,687      244,013      517,700
North Central      314,614      265,848      580,462
South              240,028      183,180      423,208
West               279,445      213,021      492,466
Total            1,107,774      906,062    2,013,836
Probability & Probability Distributions p. 17/61

Example: Teachers by Region

Elementary event (or sample point): a teacher.

Event: any set of teachers (e.g., region, level, or a combination).

Simple Experiment: Select 1 teacher at random,

P(elementary) = 1,107,774 / 2,013,836 = .55
P(not elementary) = P(secondary) = 1 − .55 = .45
Probability & Probability Distributions p. 18/61

Example: Addition Rules


Rule 1: Events are an elementary teacher from the South & an elementary
teacher from the West,

P(elementary in S or W)
  = P(elementary, South) + P(elementary, West)
  = 240,028/2,013,836 + 279,445/2,013,836 = .26

Probability & Probability Distributions p. 19/61

Example: Addition Rules (continued)


Rule 2: Events are an elementary teacher and a teacher from the South,

P(elementary or from South)
  = P(elementary) + P(South) − P(elementary and South)
  = 1,107,774/2,013,836 + 423,208/2,013,836 − 240,028/2,013,836
  = .64
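
A short Python sketch (added for illustration, not part of the original slides)
that re-computes this Rule 2 example from the counts in the table:

    total = 2_013_836
    p_elementary = 1_107_774 / total
    p_south = 423_208 / total
    p_elem_and_south = 240_028 / total

    # Rule 2: P(A or B) = P(A) + P(B) - P(A and B)
    p_elem_or_south = p_elementary + p_south - p_elem_and_south
    print(round(p_elem_or_south, 2))  # 0.64, matching the slide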

Probability & Probability Distributions p. 20/61

Conditional Probability

Conditional Probability equals the probability of an event A given that we
know that event B has occurred,

P(A|B) = P(A ∩ B)/P(B) = P(A, B)/P(B)

Example: What is the probability that a teacher is from the South given that
he/she is an elementary school teacher?

Probability & Probability Distributions p. 21/61

Example: Answer
P(South|elementary) = P(elementary and South) / P(elementary)
                    = (240,028/2,013,836) / (1,107,774/2,013,836)
                    = 240,028 / 1,107,774
                    = .217
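
The same conditional probability can be checked with a brief sketch
(illustrative only):

    p_elem_and_south = 240_028 / 2_013_836
    p_elementary = 1_107_774 / 2_013_836
    # P(South | elementary) = P(elementary and South) / P(elementary)
    print(round(p_elem_and_south / p_elementary, 3))  # 0.217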

Probability & Probability Distributions p. 22/61

Example (continued)

Note that

P(South) = 423,208/2,013,836 = .210

Knowing that a teacher is an elementary school teacher changes the chance
that the teacher is also from the South,

P(South|elementary) ≠ P(South)
.217 ≠ .210
Probability & Probability Distributions p. 23/61

Bayes' Theorem

P(A ∩ B) = P(A, B) = P(A|B)P(B)

P(A ∩ B) = P(A, B) = P(B|A)P(A)

Bayes' Theorem:

P(A|B) = P(B|A)P(A) / P(B)

Example: The Monty Hall problem.

Door A    Door B    Door C

Probability & Probability Distributions p. 24/61

Monty Hall Problem

Start of Game: Probability of getting the big prize (e.g., a car):

P(A) = 1/3,   P(B) = 1/3,   P(C) = 1/3

You pick door A.

Monty opens door B and gives you the chance to switch from door A to
door C. What should you do?

Monty is trying to not let you win. . .


Probability & Probability Distributions p. 25/61

Monty Hall Problem (continued)

Choose the door that has the larger conditional probability, i.e.,
P(A|Monty opened B) or P(C|Monty opened B).

Use Bayes' Theorem. . . so we need

Conditional probabilities that Monty opens door B given the car is behind A,
behind B, and behind C.

Joint probabilities that Monty chooses door B and the car is behind door A,
door B, and door C.

The unconditional probability that Monty chooses door B.
Probability & Probability Distributions p. 26/61

Monty Hall Problem (continued)


Conditional prob. that Monty opens door B:

P(Monty opens B | car behind A) = P(B_Monty | A) = 1/2
P(Monty opens B | car behind B) = P(B_Monty | B) = 0
P(Monty opens B | car behind C) = P(B_Monty | C) = 1

Joint probabilities:

P(B_Monty, A) = P(B_Monty | A) P(A) = (1/2)(1/3) = 1/6
P(B_Monty, B) = P(B_Monty | B) P(B) = (0)(1/3) = 0
P(B_Monty, C) = P(B_Monty | C) P(C) = (1)(1/3) = 1/3
Probability & Probability Distributions p. 27/61

Monty Hall Problem (continued)


(Unconditional) Probability that Monty opens door B:

P(B_Monty) = P(B_Monty, A) + P(B_Monty, B) + P(B_Monty, C)
           = 1/6 + 0 + 1/3 = 1/2

Apply Bayes' Theorem. . .

P(A|B_Monty) = P(B_Monty|A) P(A) / P(B_Monty) = (1/2)(1/3) / (1/2) = 1/3

P(C|B_Monty) = P(B_Monty|C) P(C) / P(B_Monty) = (1)(1/3) / (1/2) = 2/3

So the larger conditional probability belongs to door C: you should switch.
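
A Monte Carlo sketch (not from the slides; the seed and number of trials are
arbitrary) that agrees with the Bayes result: a player who always switches
wins about 2/3 of the time.

    import random

    random.seed(2)
    trials, switch_wins = 100_000, 0
    for _ in range(trials):
        car = random.choice("ABC")            # the prize is placed at random
        pick = "A"                            # you always pick door A
        # Monty opens a door that is neither your pick nor the car
        monty = random.choice([d for d in "ABC" if d != pick and d != car])
        switched = next(d for d in "ABC" if d != pick and d != monty)
        switch_wins += (switched == car)
    print(switch_wins / trials)               # close to 2/3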

Probability & Probability Distributions p. 28/61

Monty Hall Problem (continued)

I got this example from: Gill, J. (2002). Bayesian Methods for the Social and
Behavioral Sciences. Chapman & Hall.

Other sources on the Monty Hall Problem:

History.

Use today.

Probability & Probability Distributions p. 29/61

Independence

If the conditional and unconditional probabilities are identical, then the two
events are Independent.

For Independent events,

P(A|B) = P(A)
P(B|A) = P(B)
P(A and B) = P(A ∩ B) = P(A)P(B), the multiplicative rule.

Probability & Probability Distributions p. 30/61

Conditional Independence (continued)

Conditional probabilities and Conditional Independence: two very important
concepts.

Conditional probability and regression.

Conditional Independence: explaining dependency (e.g., classic example:
Cal graduate admissions).

Demonstration: Toss a penny and a quarter and give information about what
happens.

Probability & Probability Distributions p. 31/61

Are Events Conditionally Independent?

Physical considerations: physically unrelated events.

Deduced from observations.

Probability & Probability Distributions p. 32/61

Independent: Physical Considerations


Examples:

Toss a penny & a quarter:

P(penny = head & quarter = head) = P(penny = head) P(quarter = head)
                                 = (.5)(.5) = .25

Roll two dice:

P(die1 = 5 & die2 = 6) = P(die1 = 5) P(die2 = 6) = (1/6)(1/6)
                       = 1/36 = .0278
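
A quick simulation sketch (illustrative; it assumes fair dice) of the
multiplicative rule for the two-dice example:

    import random

    random.seed(3)
    trials = 200_000
    hits = 0
    for _ in range(trials):
        d1, d2 = random.randint(1, 6), random.randint(1, 6)
        hits += (d1 == 5 and d2 == 6)
    print(hits / trials)  # close to 1/36 = .0278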
Probability & Probability Distributions p. 33/61

Independent: Physical Considerations


Examples:

Administer a test that measures attitude toward gun control to 2 randomly
drawn adults in the US population.

P(Score1 = 50 and Score2 = 55) = P(Score1 = 50) P(Score2 = 55)

Probability & Probability Distributions p. 34/61

Independence: Deduction
Whether events are independent can sometimes be deduced from
observations, e.g., Mendel's experiments.

Mendel postulated the existence of genes that are recessive and dominant.

Experiment: Bred pure strains of yellow peas & green peas.

1st generation: Cross the yellow and green peas.

2nd generation: Cross plants from the 1st generation with each other and
found. . .
Probability & Probability Distributions p. 35/61

Mendel's Experiments (continued)

Results: About 75% yellow and about 25% green.

Results were very regular and replicable (with other traits and plants).

Part of the explanation involves an assumption of independence.

Probability & Probability Distributions p. 36/61

Mendel's Experiments: Explanation

There exist genes which, when paired up, control seed color according to
the rules:
y/g → yellow
g/y → yellow
y/y → yellow
g/g → green

1st generation: The pure yellow strain (y/y) could only give a y gene and the
pure green strain (g/g) could only give a g gene.

2nd generation: About 1/2 of the parent plants contribute a y, about 1/2
contribute a g, and pairing is random (independent).
Probability & Probability Distributions p. 37/61

Mendel's Experiments: Explanation

                          Maternal
                       y             g
Paternal   y      y/y (.25)     y/g (.25)     (.50)
           g      g/y (.25)     g/g (.25)     (.50)
                    (.50)         (.50)      (1.00)

Probability of each cell = (.50)(.50) = .25 . . . this is the independence part
of the theory.

Probability of phenotype:

P(yellow pea) = P(y/y) + P(y/g) + P(g/y) = .75

P(green pea) = P(g/g) = .25
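
A small sketch (added for illustration) of the independence argument: each
parent contributes y or g with probability 1/2, the contributions pair
independently, so every genotype cell has probability .25 and the phenotype
probabilities are .75 and .25.

    from itertools import product

    p_gene = {"y": 0.5, "g": 0.5}   # each parent contributes y or g with prob 1/2
    # independence: the probability of each genotype cell is the product
    genotypes = {m + "/" + p: p_gene[m] * p_gene[p]
                 for m, p in product("yg", repeat=2)}
    p_yellow = sum(pr for geno, pr in genotypes.items() if "y" in geno)
    print(genotypes)                 # every cell has probability .25
    print(p_yellow, 1 - p_yellow)    # 0.75 0.25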
Probability & Probability Distributions p. 38/61

Mendel's Experiments: Explanation

Mendel's theory is an example where an abstract probability theory is
applied to observed data.

The postulated probability distribution of seed color for the 2nd generation.

Probability & Probability Distributions p. 39/61

Basic Logic

Assume some things to be true (e.g., Mendel's theory).

Make deductions about what should be true in the long run (e.g., 2nd
generation: 75% yellow and 25% green).

It is physically impossible to do all possible experiments, so we do some
(sample).

By chance the results will differ from what should be true; however, in the
long run, they would be exactly equal/true.
Probability & Probability Distributions p. 40/61

Probability Distributions
From Hayes:

Any statement of a function associating each of a set of mutually exclusive
and exhaustive events with its probability is a probability distribution.

Let X represent a function that associates a real number with each and
every elementary event in some sample space S. Then X is called a
random variable on the sample space S.
Probability & Probability Distributions p. 41/61

Random Variables

If a random variable can only equal a finite number of values, it is a
discrete random variable.
Its probability distribution is known as a probability mass function.

If a random variable can equal an infinite (or really, really large) number of
values, then it is a continuous random variable.
Its probability distribution is known as a probability density function.
Probability & Probability Distributions p. 42/61

Discrete Random Variables


From Mendals theory, assign event to real
number (arbitary):
(
1 if yellow
Y =
0 if green
Probability Mass Function:

Probability & Probability Distributions p. 43/61

Lottery Spinner

Color        Y    P(Y)
Yellow     100     .10
Blue         5     .20
Red          0     .50
Green      -10     .10
Tan       -100     .10

Probability & Probability Distributions p. 44/61

Lottery Spinner

Probability & Probability Distributions p. 45/61

Lottery Spinner

Probability & Probability Distributions p. 46/61

Continuous Random Variables

When a numerical variable is continuous, its probability distribution is
represented by a curve known as a probability density function, or just p.d.f.

Denote a p.d.f. by f(y).

P(x1 ≤ Y ≤ x2) = area under the curve between x1 and x2.
Probability = area under the curve.

P(Y = y) = 0

Probability & Probability Distributions p. 47/61

Continuous Random Variable


The event is how many miles a randomly selected graduate student
attending UIUC is from home.

What would the c.d.f. look like?


Probability & Probability Distributions p. 48/61

Continuous Random Variable


Probability that a graduate student attending
UIUC is 2,000 or more miles from home
corresponds to the shaded area.

Probability & Probability Distributions p. 49/61

Continuous Random Variables


The event is the temperature outside the education building on January 27th.

P(temperature = 22.0°) = 0
P(19.5° ≤ temperature ≤ 21°) = black area.
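
A sketch of how such an area can be computed numerically; the normal
density with mean 20 and standard deviation 1 is purely an assumed
stand-in for whatever the true temperature distribution is:

    from scipy.stats import norm

    # hypothetical temperature distribution, assumed normal for illustration
    temp = norm(loc=20.0, scale=1.0)
    p = temp.cdf(21.0) - temp.cdf(19.5)   # P(19.5 <= temperature <= 21) = area under f(y)
    print(round(p, 4))
    # P(temperature = 22.0) is exactly 0: a single point has no area under the curve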
Probability & Probability Distributions p. 50/61

Examples of p.d.f.s
Where's the mean, median, and mode?

Probability & Probability Distributions p. 51/61

Examples of p.d.f.s
Where's the mean, median, and mode?

Probability & Probability Distributions p. 52/61

Characteristics of Distributions

Discrete or continuous

Shape

Central tendency

Dispersion (variability)

Probability & Probability Distributions p. 53/61

Expected Value
If you played this game, what would you expect to win or lose?

Color        Y    P(Y)
Yellow     100     .10
Blue         5     .20
Red          0     .50
Green      -10     .10
Tan       -100     .10

μ_Y = E(Y) = .1(100) + .2(5) + .5(0) + .1(−10) + .1(−100) = 0
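
A minimal sketch of the same expectation, computed directly from the mass
function (the Green and Tan outcomes are losses, hence the negative values):

    pmf = {"Yellow": (100, .10), "Blue": (5, .20), "Red": (0, .50),
           "Green": (-10, .10), "Tan": (-100, .10)}
    expected = sum(y * p for y, p in pmf.values())   # E(Y) = sum of y * P(y)
    print(round(expected, 6))  # 0.0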
Probability & Probability Distributions p. 54/61

Expectations are Means

For a discrete random variable,

E[Y] = μ_Y = Σ_{i=1}^{n} y_i P(y_i)

For continuous variables,

E[Y] = μ_Y = ∫ y f(y) dy

Variance is the mean squared deviation,

σ²_Y = E[(Y − μ_Y)²] = E[Y² − 2Y μ_Y + μ_Y²]
     = E[Y²] − 2μ_Y E[Y] + μ_Y²
     = E[Y²] − μ_Y²
Probability & Probability Distributions p. 55/61

Expectations are Means (continued)


Example: The variance of the lottery spinner:

σ²_Y = E[(Y − μ_Y)²] = Σ_{i=1}^{5} (y_i − μ_Y)² P(y_i)
     = .1(100 − 0)² + .2(5 − 0)² + .5(0 − 0)² + .1(−10 − 0)² + .1(−100 − 0)²
     = 2,015

. . . So how much would you pay to take a spin?
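
The same variance can be computed directly from the mass function
(illustrative sketch):

    pmf = {100: .10, 5: .20, 0: .50, -10: .10, -100: .10}
    mu = sum(y * p for y, p in pmf.items())            # mean, 0
    var = sum((y - mu) ** 2 * p for y, p in pmf.items())  # mean squared deviation
    print(round(var, 2))  # 2015.0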
Probability & Probability Distributions p. 56/61

The Algebra of Expectations


Why? We don't have to deal with calculus, & it is used a lot in statistics.
From Hayes, Appendix B:

Rule 1: If a is a constant, then

E(a) = a

Rule 2: If a is a constant real number and Y is a random variable with
expectation E(Y), then

E(aY) = a E(Y)

Probability & Probability Distributions p. 57/61

The Algebra of Expectations

Rule 3: If a is a constant real number and Y is a random variable with
expectation E(Y), then

E(Y + a) = E(Y) + a

Rule 4: If X and Y are random variables with expectations E(X) and E(Y),
respectively, then

E(X + Y) = E(X) + E(Y)

Probability & Probability Distributions p. 58/61

The Algebra of Expectations

Rule 5: Given a finite number of random variables, the expectation of the
sum of those variables is the sum of their individual expectations, e.g.,

E(X + Y + Z) = E(X) + E(Y) + E(Z)

Variances:

Rule 6: If a is a constant and if Y is a random variable with variance σ²_Y,
then the random variable (Y + a) has variance σ²_Y.
Probability & Probability Distributions p. 59/61

The Algebra of Expectations

Rule 7: If a is a constant and if Y is a random variable with variance σ²_Y,
then the random variable (aY) has variance a² σ²_Y.

Rule 8: If X and Y are independent random variables with variances σ²_X
and σ²_Y, then the variance of X + Y is

σ²_(X+Y) = σ²_X + σ²_Y

What about the variance of (X − Y)?
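
A simulation sketch (not from the slides; the normal draws and seed are
arbitrary) that checks Rules 7 and 8 and suggests the answer to the question
above: the variance of X − Y is also σ²_X + σ²_Y.

    import random
    import statistics as st

    random.seed(4)
    n = 100_000
    X = [random.gauss(0, 2) for _ in range(n)]   # variance about 4
    Y = [random.gauss(0, 3) for _ in range(n)]   # variance about 9
    print(round(st.pvariance([3 * y for y in Y]), 1))             # Rule 7: about 9 * 9 = 81
    print(round(st.pvariance([x + y for x, y in zip(X, Y)]), 1))  # Rule 8: about 4 + 9 = 13
    print(round(st.pvariance([x - y for x, y in zip(X, Y)]), 1))  # X - Y: also about 13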

Probability & Probability Distributions p. 60/61

The Algebra of Expectations


Independence

Rule 9a: Given random variables X and Y with expectations E(X) and E(Y),
respectively, if X and Y are independent then

E(XY) = E(X)E(Y)

Rule 9b: If E(XY) ≠ E(X)E(Y), the variables X and Y are not independent.

. . . that's enough for now.

Probability & Probability Distributions p. 61/61
