Lecture 2
Serafeim Tsoukas
Probability
Probability underlies statistical inference: the
drawing of conclusions from a sample of data.
Some vocabulary
An experiment: an activity such as tossing a coin,
which has a range of possible outcomes.
A trial: a single performance of the experiment.
Probabilities
With each outcome in the sample space we can
associate a probability (calculated according to
either the frequentist or subjective view).
Pr(heads) = 1/2
Pr(tails) = 1/2.
This is an example of a probability distribution
(more detail in Chapter 3).
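The frequentist view can be illustrated by simulation: the relative frequency of heads over many trials should settle near 1/2. This is only a sketch; the seed and the number of trials are arbitrary choices, not taken from the lecture.

```python
import random

random.seed(0)        # arbitrary seed, only so the run is repeatable
n_trials = 100_000    # arbitrary number of simulated tosses

# Each trial is one toss of a fair coin: heads with probability 1/2.
heads = sum(random.random() < 0.5 for _ in range(n_trials))

# Relative frequency of heads; it should be close to 0.5.
freq = heads / n_trials
print(freq)
```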
Pr(not-A) = 1 − Pr(A)
not-A is the complement of the event A.
Compound events
Often we need to calculate more complicated
probabilities:
A slight complication
If A and B can occur simultaneously, the previous
formula gives the wrong answer.
[Figure: grid of playing cards by suit (Spades, Hearts, Diamonds, Clubs), with 16 dots highlighted.]
This is very unlikely!
Conditional probability
3/51 is the probability of drawing an ace given
that an ace was drawn as the first card.
This is the conditional probability and is written
Pr(Second ace | ace on first draw).
To simplify notation write as Pr(A2|A1).
This is the probability of event A2 occurring, given
A1 has occurred.
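The value 3/51 can be checked by brute force. The sketch below lists every ordered pair of cards drawn without replacement and counts the relevant cases; the variable names are illustrative only. It also checks the multiplication rule, Pr(A1 and A2) = Pr(A1) × Pr(A2|A1).

```python
from fractions import Fraction
from itertools import permutations

# A 52-card deck: 4 aces and 48 other cards (suits do not matter here).
deck = ["ace"] * 4 + ["other"] * 48

# All ordered ways of drawing two different cards without replacement.
draws = list(permutations(range(len(deck)), 2))

first_ace = [(i, j) for i, j in draws if deck[i] == "ace"]
both_aces = [(i, j) for i, j in first_ace if deck[j] == "ace"]

pr_a1 = Fraction(len(first_ace), len(draws))               # Pr(A1)
pr_a2_given_a1 = Fraction(len(both_aces), len(first_ace))  # Pr(A2|A1)
pr_a1_and_a2 = Fraction(len(both_aces), len(draws))        # Pr(A1 and A2)

print(pr_a1)            # 1/13
print(pr_a2_given_a1)   # 1/17, which equals 3/51
print(pr_a1_and_a2)     # 1/221
```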
and so
{H,H} P = 1/4
{H,T} P = 1/4
P=
{T,H} P = 1/4
{T,T} P = 1/4
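The four outcomes can be enumerated directly. This sketch assumes a fair coin and independent tosses, so each outcome's probability is the product of the two single-toss probabilities.

```python
from fractions import Fraction
from itertools import product

# Single-toss probabilities for a fair coin.
p = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Independence assumed: Pr(first and second) = Pr(first) * Pr(second).
outcomes = {pair: p[pair[0]] * p[pair[1]] for pair in product("HT", repeat=2)}

for pair, prob in outcomes.items():
    print(pair, prob)   # each of the four outcomes has probability 1/4

# Probability of exactly one head: add the outcomes {H,T} and {T,H}.
one_head = sum(prob for pair, prob in outcomes.items() if pair.count("H") == 1)
print(one_head)         # 1/2
```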
or, in general
Example
P(3 heads in 5 tosses):
n = 5, r = 3, P = 1/2 (a fair coin)
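A minimal sketch of this calculation, assuming the general formula meant here is the Binomial one, Pr(r) = nCr × P^r × (1 − P)^(n − r), and that the coin is fair (P = 1/2). The function name `binomial_pr` is illustrative.

```python
from math import comb

def binomial_pr(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# 3 heads in 5 tosses of a fair coin (P = 1/2 is an assumption).
print(binomial_pr(3, 5, 0.5))   # 0.3125
```

There are 5C3 = 10 ways of placing 3 heads among 5 tosses, each with probability (1/2)^5 = 1/32.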
Bayes Theorem
A ball is drawn at random from one of the boxes
below. It is red.
[Figure: Box A and Box B.]
Solution
We require Pr(A|R). This can be written:
$$\Pr(A \mid R) = \frac{\Pr(A \text{ and } R)}{\Pr(R)}$$
Solution (Continued)
Hence we obtain:
$$\Pr(A \mid R) = \frac{6/10 \times 0.5}{6/10 \times 0.5 + 3/10 \times 0.5} = \frac{2}{3}$$
Box     Prior   Likelihoods   Prior × likelihood   Posterior
A       0.5     0.6           0.30                 0.30/0.45 = 2/3
B       0.5     0.3           0.15                 0.15/0.45 = 1/3
Total                         0.45
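The same calculation can be sketched in code: multiply each prior by its likelihood, sum the products to get Pr(R), then divide. The numbers are those used in the solution (prior 0.5 for each box, Pr(R|A) = 6/10, Pr(R|B) = 3/10); the variable names are illustrative.

```python
from fractions import Fraction

priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}          # Pr(box)
likelihoods = {"A": Fraction(6, 10), "B": Fraction(3, 10)}   # Pr(R | box)

# Prior times likelihood gives Pr(box and R).
joint = {box: priors[box] * likelihoods[box] for box in priors}

# Pr(R) is the total over both boxes.
pr_red = sum(joint.values())

# Posterior: Pr(box | R) = Pr(box and R) / Pr(R).
posteriors = {box: joint[box] / pr_red for box in joint}

print(pr_red)            # 9/20, i.e. 0.45
print(posteriors["A"])   # 2/3
print(posteriors["B"])   # 1/3
```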
Summary
Probability underlies statistical inference.
There are rules (e.g. the multiplication rule) for
calculating probabilities.
Independence simplifies the rules.
These rules lead on to probability distributions
such as the Binomial.
Hypothesis testing
Hypothesis testing is about making decisions.
Is a hypothesis true or false?
e.g. are women paid less, on average, than men?
            H0 true            H0 false
Accept H0   Correct decision   Type II error
Reject H0   Type I error       Correct decision
Hence a trade-off.
[Figure: distributions of the sample mean under H0 and H1, showing the Type I and Type II error areas, the rejection and non-rejection regions, and the dividing point xD.]
$$z = \frac{\bar{x} - \mu}{\sqrt{s^2/n}} = \frac{4{,}900 - 5{,}000}{\sqrt{500^2/80}} = -1.79$$
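The z statistic for a sample mean of 4,900 against a hypothesised mean of 5,000 (s = 500, n = 80) can be reproduced as below. The sketch also reports one-tailed and two-tailed p-values from the standard Normal distribution; which of the two applies depends on how H1 is stated, and the function name `z_statistic` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(xbar, mu0, s, n):
    """z = (sample mean - hypothesised mean) / sqrt(s^2 / n)."""
    return (xbar - mu0) / sqrt(s**2 / n)

z = z_statistic(4900, 5000, 500, 80)
print(round(z, 2))   # -1.79

# Area in the lower tail below z (relevant to a one-tailed test).
p_lower = NormalDist().cdf(z)

# Area in both tails beyond |z| (relevant to a two-tailed test).
p_two = 2 * NormalDist().cdf(-abs(z))
print(p_lower, p_two)
```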
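[Figure: distribution of the sample mean under H0, with a 2.5% rejection region ("Reject H0") in each tail; the distribution under H1 and the sample value 4,900 are also marked.]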
$$z = \frac{\bar{x} - \mu}{\sqrt{s^2/n}} = \frac{14.5 - 15}{\sqrt{8^2/100}} = -0.625$$
z < z*
or
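A sketch of a two-tailed decision for the example with sample mean 14.5, hypothesised mean 15, s = 8 and n = 100. It assumes a 5% significance level split as 2.5% in each tail; the critical value z* is taken from the standard Normal distribution.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05                                   # assumed significance level
z_star = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, about 1.96

z = (14.5 - 15) / sqrt(8**2 / 100)

# Two-tailed rule: reject H0 if z falls in either tail beyond z*.
reject = abs(z) > z_star
print(round(z, 3), round(z_star, 2), reject)   # -0.625 1.96 False
```

Here z lies well inside the non-rejection region, so H0 is not rejected at this level.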
Testing a proportion
Same principles: reject H0 if the test statistic falls
into the rejection region.
To test H0: π = 0.5 versus H1: π ≠ 0.5 (e.g. a coin
is fair or not) the test statistic is
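$$z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \frac{p - 0.5}{\sqrt{0.5(1-0.5)/n}}$$
For example, with p = 0.6 and n = 100:
$$z = \frac{0.6 - 0.5}{\sqrt{0.5(1-0.5)/100}} = 2$$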
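The proportion test statistic with a sample proportion of 0.6, a hypothesised value of 0.5 and n = 100 can be reproduced as follows. The function name `proportion_z` is illustrative.

```python
from math import sqrt

def proportion_z(p, pi0, n):
    """z = (sample proportion - hypothesised proportion) / standard error under H0."""
    return (p - pi0) / sqrt(pi0 * (1 - pi0) / n)

z = proportion_z(0.6, 0.5, 100)
print(round(z, 2))   # 2.0
```

Note that the standard error uses the hypothesised value 0.5, not the sample proportion, because the statistic is computed assuming H0 is true.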
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
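$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{\hat{\pi}(1-\hat{\pi})}{n_1} + \dfrac{\hat{\pi}(1-\hat{\pi})}{n_2}}}, \qquad \hat{\pi} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$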
$$t = \frac{\bar{x} - \mu}{\sqrt{s^2/n}} \sim t_{n-1}$$
Testing a mean
A sample of 12 cars of a particular make average
35 mpg, with standard deviation 15. Test the
manufacturer's claim of 40 mpg as the true
average.
H0: μ = 40
H1: μ < 40.
$$t = \frac{35 - 40}{\sqrt{15^2/12}} = -1.15$$
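The mpg example can be sketched as below. The significance level is an assumption (5%, one-tailed, matching H1: μ < 40), and the critical value −1.796 is the tabulated 5% point of the t distribution with 11 degrees of freedom, typed in by hand rather than computed.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """t = (sample mean - hypothesised mean) / sqrt(s^2 / n)."""
    return (xbar - mu0) / sqrt(s**2 / n)

n = 12
t = t_statistic(35, 40, 15, n)
df = n - 1                 # degrees of freedom for the t distribution

# Assumed 5% one-tailed test; critical value from t tables for 11 df.
t_crit = -1.796
reject = t < t_crit

print(round(t, 2), df, reject)   # -1.15 11 False
```

Under these assumptions the statistic does not fall into the rejection region, so the claim of 40 mpg is not rejected.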
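$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S^2}{n_1} + \dfrac{S^2}{n_2}}}$$
where the pooled variance is
$$S^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$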
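A sketch of the two-sample t statistic with a pooled variance. The lecture gives no data for this case, so the numbers in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
from math import sqrt

def pooled_variance(s1, n1, s2, n2):
    """S^2 = ((n1 - 1) s1^2 + (n2 - 1) s2^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def two_sample_t(xbar1, xbar2, s1, s2, n1, n2, diff0=0.0):
    """t statistic for H0: mu1 - mu2 = diff0, with n1 + n2 - 2 degrees of freedom."""
    S2 = pooled_variance(s1, n1, s2, n2)
    t = ((xbar1 - xbar2) - diff0) / sqrt(S2 / n1 + S2 / n2)
    return t, n1 + n2 - 2

# Hypothetical data: two samples of 10, means 52 and 48, both with s = 5.
t, df = two_sample_t(52.0, 48.0, 5.0, 5.0, 10, 10)
print(round(t, 2), df)   # 1.79 18
```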
Summary
The principles are the same for all tests: calculate
the test statistic and see if it falls into the rejection
region.
The formula for the test statistic depends upon
the problem (mean, proportion, etc).
The rejection region varies, depending upon
whether it is a one- or two-tailed test.