
GEM2900 - Understanding Uncertainty and Statistical Thinking

David Chew and David Nott

Department of Statistics and Applied Probability
National University of Singapore

Some things about the final

The exam consists of 6 questions. 3 questions were set by
Prof Chew on the material from weeks 1-6; 3 questions
were set by me on the material I taught.

It is a closed book exam, but you're allowed one A4 help
sheet handwritten on both sides.

Calculators are allowed.

The final exam is worth 70% of your final grade.

Exercise sheet 5, Exercise 1


A gambling game involves a series of rounds. On each round
you win with probability 0.5. When you win, the amount that
you win is equal to your stake. That is, if you bet a stake of $1
say, then when you win you have your stake returned plus an
additional $1.
Your friend tells you that he has a foolproof method to earn
money. His system works like this. On the first round, bet $1.
Then if you lose, on the next round bet $2. That means that if
you lose on the first round but win on the second, then your win
(with a profit of $2) will earn back your loss of $1 on the first
round plus an additional $1.

If you lose on the first two rounds, then on the third round you
bet $4, so that if you win then you earn back your previous
losses ($1 +$2) plus an extra $1 profit.
If you keep on doubling the stake every time you lose, it is not
hard to see that you have a $1 profit when you eventually do
win (which is certain).
This sounds good to you. The problem is, your friend says, that
he is a little short of the capital needed to implement his certain
money making scheme. Could you loan him some money to
play?
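
The bookkeeping behind the doubling system can be checked with a short sketch. The function name `doubling_outcome` and its argument are illustrative assumptions, not part of the exercise. For a run of consecutive losses followed by a win, it reports the total lost before the win and the net profit after it.

```python
def doubling_outcome(losses_before_win):
    """Return (total_lost, net_profit) when the first win comes after
    `losses_before_win` consecutive losing rounds of the doubling system."""
    stake = 1          # the first bet is $1
    total_lost = 0
    for _ in range(losses_before_win):
        total_lost += stake   # a losing round costs the whole stake
        stake *= 2            # double the stake for the next round
    # The winning round pays a profit equal to the current stake.
    net_profit = stake - total_lost
    return total_lost, net_profit


for k in range(6):
    lost, profit = doubling_outcome(k)
    print(f"{k} losses: lost ${lost} before winning, net profit ${profit}")
```

In this sketch the net profit is $1 whenever the win arrives, while the amount already lost roughly doubles with each extra losing round.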

To determine how much you will loan him you decide to
compute the expected value of the amount of money lost before
the eventual win (remember you will bet \(\$2^{i-1}\) on round \(i\) if you
have lost on rounds \(1, \ldots, i-1\), and if you have lost on rounds
\(1, \ldots, i-1\) then your losses up to that point are \(2^{i-1} - 1\)).
What is the expectation of the amount of money lost before
winning?

Your expected loss before winning is

\[
\sum_{i=1}^{\infty} \left(2^{i-1} - 1\right) 2^{-i}
  = \sum_{i=1}^{\infty} \left(\frac{1}{2} - 2^{-i}\right).
\]

This follows from the fact that the probability that your first win
comes on the \(i\)th round is \(2^{-i}\), and the fact that if you first win
on the \(i\)th round then you have lost \(2^{i-1} - 1\) by that point.
Note that for \(i \ge 2\), \(1/2 - 2^{-i} \ge 1/4\), so that the sum is infinite.
So the expectation of the amount of money lost before winning
is infinite!

Exercise sheet 5, Exercise 2

A lazy lecturer comes up with the following scheme to minimize
her marking load when setting an assignment to see if her
students understand a certain topic. The assignment is
administered in two stages. In the first stage, students are
organized into groups of 20.
The lecturer poses one question to each group. The group
members are required to agree upon a response and submit a
single answer for the group to the lecturer.
(It is assumed that if one or more of the students knows the
correct answer, they are able to convince the other students in
the group that their answer is correct.)

If a group gives the wrong answer, the lecturer can say that all
students in the group do not understand the topic. In that case,
no further questions need to be asked.
On the other hand, if the group gives the correct answer the
lecturer can assume that at least one student understands the
topic and only then will a second stage of the assignment need
to be administered.
In the second stage each of the students is asked questions
individually and submits answers individually, without any
collaboration among the students.

Suppose that each student knows the correct answer to any
question the lecturer asks in either stage with probability 0.05
independently.

What is the expected number of questions that the lecturer will
have to mark for a group of size 20?

Let \(A_i\) be the event that the \(i\)th student in the group knows the
answer, and let \(A\) be the event that nobody knows the answer. Then

\[
P(A) = P(\bar{A}_1) \cdots P(\bar{A}_{20}) = (1 - 0.05)^{20}
\]

and

\[
P(\bar{A}) = 1 - (1 - 0.05)^{20}.
\]

The lecturer marks 1 question with probability \(P(A)\), and 21
questions with probability \(P(\bar{A})\).
So the expected number is

\[
(1 - 0.05)^{20} \times 1 + \left(1 - (1 - 0.05)^{20}\right) \times 21 = 13.83 < 20.
\]
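
The arithmetic for the expected marking load can be reproduced with a few lines. This is a minimal sketch; the variable names are illustrative and the only inputs are the group size of 20 and the probability 0.05 from the exercise.

```python
p_knows = 0.05      # chance that one student knows the answer
group_size = 20

# Probability that nobody in the group knows the answer (stage 1 only).
p_nobody = (1 - p_knows) ** group_size

# 1 question if the group answer is wrong, otherwise 1 + 20 = 21 questions.
expected_marked = p_nobody * 1 + (1 - p_nobody) * (1 + group_size)

print(round(p_nobody, 4), round(expected_marked, 2))
```

Changing `p_knows` shows how quickly the saving disappears once students are more likely to know the answer.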

Exercise sheet 5, Exercise 3

A secretary drops 10 matching pairs of letters and envelopes


down the stairs and then places the letters in the envelopes in a
random order.
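
For a setting this small, the chance that a random placement produces at least one match can also be computed exactly. The sketch below is an illustration and not part of the exercise sheet: it counts the placements with no match at all using the standard derangement recurrence and prints the result next to 1 - exp(-1), the well-known limiting value for a large number of letters. The helper name `derangements` is an assumption.

```python
import math


def derangements(n):
    """Number of permutations of n items with no item in its own place."""
    d_prev, d_curr = 1, 0            # D(0) = 1, D(1) = 0
    if n == 0:
        return d_prev
    for k in range(2, n + 1):
        # Standard recurrence: D(k) = (k - 1) * (D(k - 1) + D(k - 2))
        d_prev, d_curr = d_curr, (k - 1) * (d_curr + d_prev)
    return d_curr


n_letters = 10
p_no_match = derangements(n_letters) / math.factorial(n_letters)
p_at_least_one = 1 - p_no_match
print(p_at_least_one, 1 - math.exp(-1))
```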

What is the approximate probability that at least one letter and
envelope are correctly matched?

Let \(A_i\) be the event of a match for the \(i\)th letter. Clearly
\(P(A_i) = 1/10\). Let \(I_{A_i} = 1\) if \(A_i\) occurs and 0 otherwise. Then
\(E(I_{A_i}) = P(A_i)\). The number of matches is \(N = \sum_{i=1}^{10} I_{A_i}\), which
gives

\[
E(N) = \sum_{i=1}^{10} E(I_{A_i}) = \sum_{i=1}^{10} P(A_i) = 10 \times \frac{1}{10} = 1.
\]

By the law of rare events, \(N\) is approximately Poisson with
mean 1. So

\[
P(N = 0) \approx \exp(-1)
\]

and the probability of at least one match is

\[
P(N > 0) = 1 - P(N = 0) \approx 1 - \exp(-1).
\]

Exercise sheet 5, Exercise 4

The speed of vehicles along a certain stretch of Clementi Road
has an approximately normal distribution with a mean of 71
km/hr and a standard deviation of 8 km/hr. The speed limit is 60
km/hr.

What is the probability that a randomly chosen vehicle is
obeying the speed limit?
Express your answer in terms of a probability for a standard
normal distribution (there is no need to calculate the answer).

Let \(X\) be the speed of the randomly chosen vehicle, so that
\(X \sim N(71, 8^2)\). The probability that the vehicle is obeying the
speed limit is

\[
P(X < 60) = P\left(\frac{X - 71}{8} < \frac{60 - 71}{8}\right) = P(Z < -1.375),
\]

where \(Z \sim N(0, 1)\).
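
Although the exercise does not ask for a number, the standardisation is easy to evaluate. The sketch below is illustrative only and assumes nothing beyond the mean, standard deviation and speed limit stated in the exercise. It uses the identity \(\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\) with Python's `math.erf`; the function name `standard_normal_cdf` is an assumption.

```python
import math


def standard_normal_cdf(z):
    """P(Z < z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_speed, sd_speed, limit = 71.0, 8.0, 60.0
z = (limit - mean_speed) / sd_speed      # standardised speed limit
p_obeying = standard_normal_cdf(z)
print(z, round(p_obeying, 4))
```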

Exercise sheet 5, Exercise 5

A fair coin is flipped 25 times.

Using the normal approximation to the binomial, write down an


expression for the probability of getting between 15 and 18
heads inclusive in terms of a probability for a standard normal
distribution.
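
As a numerical cross-check, the exact binomial probability can be compared with a continuity-corrected normal approximation. This is a sketch under the exercise's own assumptions (25 fair flips, 15 to 18 heads inclusive); the variable names and the helper `standard_normal_cdf` are illustrative.

```python
import math

n, p = 25, 0.5
lo, hi = 15, 18

# Exact binomial probability P(lo <= X <= hi).
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def standard_normal_cdf(z):
    """P(Z < z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Continuity correction: widen the integer interval by 0.5 on each side.
approx = standard_normal_cdf((hi + 0.5 - mean) / sd) - standard_normal_cdf(
    (lo - 0.5 - mean) / sd
)
print(exact, approx)
```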

Let \(X\) be the number of heads. Then \(X \sim \mathrm{Bin}(25, 0.5)\). Using
the fact that for a binomial with parameters \(n\) and \(p\) the mean is
\(np\) and the standard deviation is \(\sqrt{np(1 - p)}\), we obtain that

\[
E(X) = 25 \times 0.5 = 12.5 \quad \text{and} \quad
\mathrm{sd}(X) = \sqrt{25 \times 0.5 \times 0.5} = 2.5.
\]

Now, let \(Y\) be normal, \(Y \sim N(12.5, 2.5^2)\). Using the normal
approximation to the binomial, we get

\[
P(15 \le X \le 18) \approx P(14.5 \le Y \le 18.5)
  = P\left(\frac{14.5 - 12.5}{2.5} \le Z \le \frac{18.5 - 12.5}{2.5}\right)
  = P(0.8 \le Z \le 2.4),
\]

where \(Z \sim N(0, 1)\).

Questions about causal inference

Make up a small example of a population in which the causal
effect is negative but the association is positive.

Consider the following population of 4 individuals:

X    Y    C0       C1
0    0    0        -1 (U)
0    0    0        -1 (U)
1    1    1 (U)    1
1    1    1 (U)    1

The (U) after an entry in the table means that entry was
unobserved.

In this example, \(E(Y \mid X = 1) = 1\) (see the bottom half of the
table) and \(E(Y \mid X = 0) = 0\) (see the top half of the table), so
the association is \(E(Y \mid X = 1) - E(Y \mid X = 0) = 1 - 0 = 1 > 0\).

However, \(E(C_0) = 0.5\) and \(E(C_1) = 0\), so the average causal
effect is \(E(C_1) - E(C_0) = 0 - 0.5 = -0.5 < 0\).
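
The two quantities in this example can be recomputed directly from the four rows. The sketch below is illustrative; it stores each individual as a tuple (X, Y, C0, C1) copied from the table, including the entries marked unobserved, which an analyst would not see in practice.

```python
# Each row is (X, Y, C0, C1); Y equals C1 when X = 1 and C0 when X = 0.
population = [
    (0, 0, 0, -1),
    (0, 0, 0, -1),
    (1, 1, 1, 1),
    (1, 1, 1, 1),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Association: difference in observed mean outcome between X = 1 and X = 0.
association = mean(y for x, y, _, _ in population if x == 1) - mean(
    y for x, y, _, _ in population if x == 0
)

# Average causal effect: E(C1) - E(C0), using both potential outcomes.
causal_effect = mean(c1 for _, _, _, c1 in population) - mean(
    c0 for _, _, c0, _ in population
)
print(association, causal_effect)
```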

Questions about causal inference


A new treatment was tested in two hospitals. Hospital A is a
major research facility, famous for its treatment of advanced
cases of diseases; whereas hospital B is a local area hospital.
Hospital A, which has many researchers on staff, randomly
assigns 1000 patients to the new treatment and the remaining
100 patients to receive the standard treatment.
Hospital B, which is reluctant to try the new treatment due to
the limited number of researchers on staff, randomly assigns
100 patients to the new treatment, and the remaining 1000
patients to receive the standard treatment.

The following table shows how the new treatment procedure
works compared with the standard treatment.

                 Hospital A                Hospital B
            Survive   Die   Total     Survive   Die   Total
Standard          5    95     100         500   500    1000
New             100   900    1000          95     5     100
Total           105   995    1100         595   505    1100

(i) Do the data seem to show that the new treatment procedure
is a success in hospital A as well as hospital B?
(ii) Next, combine the data from hospital A and hospital B.
Estimate an overall death rate for the new treatment. Do the
data seem to indicate the new treatment procedure is a
success?
(iii) Do your conclusions from parts (i) and (ii) seem to
disagree? Explain.

(i) Death rates for treatments in Hospital A:
New: 900/1000 = 0.90; Standard: 95/100 = 0.95
Death rates for treatments in Hospital B:
New: 5/100 = 0.05; Standard: 500/1000 = 0.50
The death rates from both hospitals seem to show that the new
treatment procedure works better than the standard treatment
procedure.

(ii) Death rates for treatments in combined data:
New: 905/1100 = 0.82; Standard: 595/1100 = 0.54
The overall death rate for the standard treatment procedure
appears to be lower than for the new procedure. This seems to
show that the standard treatment works better.

(iii) Why do the conclusions in parts (i) and (ii) seem to disagree?
Presumably, the more serious cases are usually treated in the
famous research hospital A. Because they are more serious
cases, they are more likely to die. Hospital A also gives the new
treatment to most of its patients, whereas hospital B gives it to
only a few.
So in the combined data the treatment assignment is not
independent of the potential outcomes.
The fact that an association between treatment and outcome
changes direction when we condition on the hospital here is
only paradoxical if we interpret the associations causally.
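
The reversal can be reproduced from the counts in the table. The following sketch is illustrative; the dictionary layout and names are assumptions, and the numbers are the deaths and patient totals from the two hospitals.

```python
# deaths[hospital][treatment] = (deaths, patients), taken from the table.
deaths = {
    "A": {"new": (900, 1000), "standard": (95, 100)},
    "B": {"new": (5, 100), "standard": (500, 1000)},
}


def rate(died, total):
    return died / total


# Death rate for each treatment within each hospital.
for hospital, arms in deaths.items():
    for treatment, (died, total) in arms.items():
        print(hospital, treatment, round(rate(died, total), 2))

# Pool the two hospitals, ignoring which hospital each patient came from.
combined = {}
for treatment in ("new", "standard"):
    died = sum(deaths[h][treatment][0] for h in deaths)
    total = sum(deaths[h][treatment][1] for h in deaths)
    combined[treatment] = rate(died, total)
    print("combined", treatment, round(combined[treatment], 2))
```

Within each hospital the new treatment has the lower death rate, yet the pooled rates point the other way because most new-treatment patients come from hospital A.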
