
Australian School of Business

Probability and Statistics


Solutions Week 9
1. The calculation of the expected value in each cell is done in the table below (expected values in parentheses). The expected value is simply the product of the (row total) and the (column total), divided by the (grand total). You can easily verify the numbers:

                     very well clothed   well clothed   poorly clothed   row total
    dull                   81 (128.67)   141 (151.94)      127 (68.38)         349
    intelligent           322 (347.31)   457 (410.11)     163 (184.58)         942
    very capable          233 (160.01)   153 (188.95)       48 (85.04)         434
    column total                   636            751              338       1,725

Thus, computing the test statistic, we have:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \frac{(81 - 128.67)^2}{128.67} + \ldots + \frac{(48 - 85.04)^2}{85.04} = 134.6854.$$

From the chi-square table with degrees of freedom equal to (rows $-$ 1) $\times$ (columns $-$ 1) = 4, we have the critical value $\chi^2 = 9.49$ at $\alpha = 5\%$. Thus, we would reject the null hypothesis of independence if the observed $\chi^2$ statistic exceeds this critical value, and in this case it does. Therefore, we conclude that the data provide strong evidence against the hypothesis that intelligence and manner of clothing are independent.
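As a cross-check of the arithmetic above, the expected counts and the chi-square statistic can be recomputed in a few lines of Python (an illustrative sketch, not part of the original solutions, which use R):

```python
# Cross-check of question 1: expected counts and the chi-square statistic
# for the 3x3 intelligence-by-clothing table.
observed = [
    [81, 141, 127],   # dull
    [322, 457, 163],  # intelligent
    [233, 153, 48],   # very capable
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count = (row total) * (column total) / (grand total).
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
print(round(chi2, 2))  # close to the 134.6854 reported above
```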
2. (a) The hypothesis is given by:

H0 : classifications are independent v.s. H1 : classifications are dependent

Using the likelihood ratio test we find the approximate chi-squared test statistic:

$$T = \sum_{i \in \{A,B\} \times \{I,II\}} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_1.$$

Note that the unconstrained model has two degrees of freedom, namely $\Pr(A = I)$ (which determines $\Pr(A = II) = 1 - \Pr(A = I)$ and is therefore no extra degree of freedom) and $\Pr(B = I)$ (which determines $\Pr(B = II) = 1 - \Pr(B = I)$ and is therefore no extra degree of freedom). In the constrained model, i.e., under the null, we have $\Pr(I) = \Pr(A = I) = \Pr(B = I)$ and thus $\Pr(A = II) = \Pr(B = II) = 1 - \Pr(I)$; hence the only parameter is $\Pr(I)$ and the constrained model has one degree of freedom. We reject the null hypothesis for large values of the test statistic (interpretation: large values of the test statistic correspond to large values of $(O - E)^2$ and hence to large deviations from what is expected under the null hypothesis, which is unlikely if the null is true).
In order to find the chi-squared test statistic, we have to find the observed and expected numbers. We have that the sum of each row and the sum of each column equals 50. Therefore, under the null hypothesis that the classification criteria are independent, the expected number in each cell should be $50 \cdot 1/2 = 25$. The 1/2 is the probability that an observation (either in A or B) is classified as I, which is $50/100 = 1/2$ (using column totals); note that under the null hypothesis this is our best estimate of the proportion. Thus we have the following observed and expected numbers:

c Katja Ignatieva

School of Risk and Actuarial Studies, ASB, UNSW

Page 1 of 7

ACTL2002 & ACTL5101

    Observed      I     II   total
    A            22     28      50
    B            28     22      50
    total        50     50     100

    Expected      I     II   total
    A            25     25      50
    B            25     25      50
    total        50     50     100

and the corresponding observed minus expected:

    Observed - Expected      I     II
    A                       -3      3
    B                        3     -3
Hence, the value of our test statistic is:

$$T = 4 \cdot \frac{3^2}{25} = 1.44.$$

From Formulae and Tables page 164 we observe that $\Pr(\chi^2_1 \le 1.44) = 0.77$. Hence, our p-value is $1 - 0.77 = 0.23$. Thus, for levels of significance of 0.23 or less (usually the case) there is no evidence of dependence between the two criteria.
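The tabulated value can be checked without tables: for one degree of freedom the chi-squared CDF has the closed form $\Pr(\chi^2_1 \le x) = \operatorname{erf}(\sqrt{x/2})$. A short Python check (an illustrative sketch, not part of the original R-based solutions):

```python
import math

# For 1 degree of freedom: Pr(chi2_1 <= x) = erf(sqrt(x / 2)).
T = 4 * 3 ** 2 / 25          # test statistic, = 1.44
cdf = math.erf(math.sqrt(T / 2))
p_value = 1 - cdf
print(round(T, 2), round(p_value, 4))  # T = 1.44, p-value about 0.2301
```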
(b) R-code for the Pearson chi-squared test:
> data <- matrix(c(22,28,28,22),nrow=2,byrow=T) # create 2x2 matrix of the data
> chisq.test(data,correct=F) # perform the test
This gives the same result as we calculated in the previous question.
3. (a) The hypothesis is that the classifications are independent (two sided):

H0 : classifications are independent v.s. H1 : classifications are dependent

Or alternatively, in terms of the following table:

              I       II    total
    A        N11     N12
    B        N21     N22
    total    n.1     n.2    n

we have the following hypothesis:

$$H_0: \frac{N_{11}}{n_{\cdot 1}} = \frac{N_{12}}{n_{\cdot 2}} \quad \text{v.s.} \quad H_1: \frac{N_{11}}{n_{\cdot 1}} \ne \frac{N_{12}}{n_{\cdot 2}}.$$

The corresponding test statistic is given by:

$$T = N_{11} \sim \text{Hypergeometric}(N, M, n).$$

We will reject the null hypothesis for small and large values of this statistic.
(b) Let $X \sim \text{Hypergeometric}(N, M, n)$ with $N = 100$, $M = 50$, $n = 50$. Then we have:

$$p_X(22) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{50}{22}\binom{50}{28}}{\binom{100}{50}} = 0.07806943.$$
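This probability is easy to verify numerically; a minimal Python check using `math.comb` (a sketch, not part of the original solutions):

```python
import math

# Hypergeometric pmf: Pr(X = x) = C(M, x) * C(N - M, n - x) / C(N, n).
N, M, n, x = 100, 50, 50, 22
p = math.comb(M, x) * math.comb(N - M, n - x) / math.comb(N, n)
print(p)  # about 0.07806943
```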

(c) R-code for the cumulative distribution function:
> p=c()
> for(x in 1:22){p[x]=choose(50,x)*choose(50,50-x)/choose(100,50)} # p is a vector whose components are the probability masses of the Hypergeometric distribution
> sum(p) # the cumulative distribution function evaluated at 22 (the x = 0 term is negligible)

(d) Step 1 (defining the hypothesis) and step 2 (defining the test statistic) have been done in question a). We now need to find the corresponding p-value of this test. To do so we use the cumulative distribution function. We find that $\Pr(X \le 22) = 0.15867$ (see question c)). We would reject the null hypothesis if $\Pr(X \le 22) \le \alpha/2$ or $\Pr(X \le 22) \ge 1 - \alpha/2$. The smallest $\alpha$ for which this holds is obtained from $\Pr(X \le 22) = \alpha/2 = 0.15867$, so the p-value is $2 \cdot 0.15867 = 0.3173$. Hence, for reasonable levels of significance (less than 31%) the test cannot reject the null hypothesis of independence.
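The two-sided p-value can be cross-checked by summing the hypergeometric pmf directly; a Python sketch mirroring the R loop in part (c) (not part of the original solutions):

```python
import math

# Pr(X <= 22) for X ~ Hypergeometric(N = 100, M = 50, n = 50),
# then the two-sided p-value 2 * Pr(X <= 22).
N, M, n = 100, 50, 50

def pmf(x):
    return math.comb(M, x) * math.comb(N - M, n - x) / math.comb(N, n)

lower_tail = sum(pmf(x) for x in range(0, 23))
p_value = 2 * lower_tail
print(round(lower_tail, 5), round(p_value, 4))  # about 0.15867 and 0.3173
```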
(e) R-code for the Fisher test:
> data <- matrix(c(22,28,28,22),nrow=2,byrow=T) # create 2x2 matrix of the data
> fisher.test(data) # perform the Fisher test

We observe that the p-value is 0.3173, so we cannot reject the null at a 5% significance level. Also, the 95% confidence interval of the odds ratio includes the value one; hence, using the confidence interval we can likewise conclude that the proportions are not significantly unequal, and we cannot reject the null hypothesis.
4. Both tests cannot reject the null hypothesis at reasonable levels of significance. However, the p-values differ substantially: the p-value is 0.3173 in the Fisher test and 0.2301 in the chi-squared test. This is due to the fact that the chi-squared test uses an approximate distribution (a normal approximation to the binomial distribution of the observed numbers), which holds if $np$ is sufficiently large; with $x = 22$ and $x = 28$ this should give a good approximation. Hence, we cannot reject the null hypothesis in either test. However, since the chi-squared test is only an approximate test, this explains the difference in the p-values.
5. (a)

1. There are two assumptions, which follow from the fact that a Binomial distribution is the sum of i.i.d. Bernoulli random variables. Hence, the two assumptions are that the probability of a house being burgled is independent of other houses being burgled (the "independent" part of i.i.d.) and that each house has the same probability of being burgled (the "identically distributed" part of i.i.d.).
2. We have that the number of houses per street which are burgled has a Bin(6, p) distribution. Each street is an observation of the random variable $X \sim \text{Bin}(6, p)$, the number of houses in a street in the sample which were burgled in the past six months. Thus the likelihood function is given by:

$$\begin{aligned}
L(p; x) &= \prod_{i=1}^{100} f_X(x_i) \\
&= \left(\Pr(X=0)\right)^{39} \left(\Pr(X=1)\right)^{38} \left(\Pr(X=2)\right)^{18} \left(\Pr(X=3)\right)^{4} \left(\Pr(X=4)\right)^{0} \left(\Pr(X=5)\right)^{1} \left(\Pr(X=6)\right)^{0} \\
&= \left[\binom{6}{0} p^0 (1-p)^6\right]^{39} \left[\binom{6}{1} p^1 (1-p)^5\right]^{38} \left[\binom{6}{2} p^2 (1-p)^4\right]^{18} \left[\binom{6}{3} p^3 (1-p)^3\right]^{4} \\
&\quad \cdot \left[\binom{6}{4} p^4 (1-p)^2\right]^{0} \left[\binom{6}{5} p^5 (1-p)^1\right]^{1} \left[\binom{6}{6} p^6 (1-p)^0\right]^{0}
\end{aligned}$$
Hence, the log-likelihood function is given by:

$$\begin{aligned}
\ell(p; x) = \log(L(p; x)) &= \sum_{i=1}^{100} \log(f_X(x_i)) \\
&\overset{*}{=} 39 \log\binom{6}{0} + 39 \log(p^0) + 39 \log((1-p)^6) \\
&\quad + 38 \log\binom{6}{1} + 38 \log(p^1) + 38 \log((1-p)^5) \\
&\quad + 18 \log\binom{6}{2} + 18 \log(p^2) + 18 \log((1-p)^4) \\
&\quad + 4 \log\binom{6}{3} + 4 \log(p^3) + 4 \log((1-p)^3) \\
&\quad + \log\binom{6}{5} + \log(p^5) + \log((1-p)^1) \\
&\overset{**}{=} \text{const} + (0 + 38 + 36 + 12 + 5) \log(p) + (234 + 190 + 72 + 12 + 1) \log(1-p) \\
&= \text{const} + 91 \log(p) + 509 \log(1-p),
\end{aligned}$$

* using $\log(a^b c^d) = b \log(a) + d \log(c)$ and ** using $\log(a^b) = b \log(a)$, with $\text{const} = 39 \log\binom{6}{0} + 38 \log\binom{6}{1} + 18 \log\binom{6}{2} + 4 \log\binom{6}{3} + \log\binom{6}{5}$. To find $\hat{p}$, the MLE of $p$, we differentiate


the log-likelihood function with respect to $p$ and set it equal to zero:

$$\frac{\partial \ell(p; x)}{\partial p} = 0 \;\Rightarrow\; \frac{91}{p} - \frac{509}{1-p} = 0 \;\Rightarrow\; 91(1-p) = 509p \;\Rightarrow\; 91 = 600p \;\Rightarrow\; \hat{p} = \frac{91}{600}.$$

Checking whether it is indeed a maximum, i.e., the second derivative should be negative:

$$\frac{\partial^2 \ell(p; x)}{\partial p^2} = -\frac{91}{\hat{p}^2} - \frac{509}{(1-\hat{p})^2} < 0,$$

hence $\hat{p} = \frac{91}{600}$ is indeed the maximum of the log-likelihood function and hence the MLE.
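A quick numerical check (a Python sketch, not part of the original solutions) confirms that $\hat{p} = 91/600$ maximizes $\ell(p) = \text{const} + 91 \log(p) + 509 \log(1-p)$:

```python
import math

# Log-likelihood up to the additive constant: l(p) = 91 log(p) + 509 log(1 - p).
def loglik(p):
    return 91 * math.log(p) + 509 * math.log(1 - p)

p_hat = 91 / 600
# l(p_hat) should beat the log-likelihood at nearby values of p.
assert loglik(p_hat) > loglik(p_hat - 0.01)
assert loglik(p_hat) > loglik(p_hat + 0.01)
print(round(p_hat, 5))  # 0.15167
```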
3. The probabilities of the Binomial distribution are given by:

$$\Pr(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad \text{for } x = 0, 1, 2, \ldots, n \text{ and zero otherwise.}$$

With $n = 6$ and $p = 91/600$ we get the following probabilities for $x = 0, 1, \ldots, 6$: 0.373, 0.400, 0.179, 0.043, 0.006, 0.000, 0.000. Hence, the expected number of streets with the number of houses burgled equal to $x = 0, 1, \ldots, 6$ is given by 100 times this probability. From this we can construct the following table:
    # burgles per street       0     1     2    3    4    5    6
    Observed # of streets     39    38    18    4    0    1    0
    Expected # of streets   37.3  40.0  17.9  4.3  0.6  0.0  0.0

The observed and expected numbers of streets with $0, 1, \ldots, 6$ burgles are similar, which suggests a good fit.
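The expected frequencies in the table above can be reproduced from the binomial pmf; a Python sketch (the tutorial's own code uses R's `dbinom`):

```python
import math

# Expected number of streets (out of 100) with x = 0, ..., 6 burgled houses,
# using the MLE p = 91/600 and X ~ Bin(6, p).
n, p = 6, 91 / 600
expected = [100 * math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(7)]
print([round(e, 1) for e in expected])  # [37.3, 40.0, 17.9, 4.3, 0.6, 0.0, 0.0]
```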
(b) We construct the following hypothesis:

H0 : p = 0.18 provides a good fit v.s. H1 : p = 0.18 does not provide a good fit

For a goodness of fit test, we use the chi-squared test statistic with $k$ bins:

$$T = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-1}.$$

Note that in this hypothesis $p$ is given and thus not estimated; hence we do not have to reduce the degrees of freedom by the number of parameters estimated. We reject the null hypothesis for large values of the test statistic.
Similar to question a)3., i.e., $X \sim \text{Bin}(n, p)$ with $n = 6$, but now with $p = 0.18$, we have the probabilities for $x = 0, 1, \ldots, 6$: 0.3040, 0.4004, 0.2197, 0.0643, 0.0106, 0.0009, 0.0000. Hence, the expected number of streets, given $p = 0.18$, with the number of houses burgled equal to $x = 0, 1, \ldots, 6$ is given by 100 times this probability. From this we can construct the following table:
    # burgles per street       0      1      2     3     4     5     6
    Observed # of streets     39     38     18     4     0     1     0
    Expected # of streets  30.40  40.04  21.97  6.43  1.06  0.09  0.00

The expected number of streets is less than 5 for 4, 5, and 6 burgles per street. Therefore, we have to aggregate cells in order to obtain only cells with an expected number of streets larger than or equal to 5. Aggregating cells 4, 5, and 6 would only lead to an aggregate of 1.15, which is still substantially smaller than 5; therefore we aggregate cells 3, 4, 5, and 6 (i.e., 3 or more burgles in a street), which results in an aggregate of 7.58.
    # burgles per street       0      1      2    3+
    Observed # of streets     39     38     18     5
    Expected # of streets  30.40  40.04  21.97  7.58

The value of our test statistic is equal to:

$$T = \frac{(39-30.40)^2}{30.40} + \frac{(38-40.04)^2}{40.04} + \frac{(18-21.97)^2}{21.97} + \frac{(5-7.58)^2}{7.58} = 4.13.$$

The degrees of freedom of the chi-squared test statistic equal $4 - 1 = 3$ (# bins $-$ 1).
The level of significance is not given in the question, so we calculate the p-value. From Formulae and Tables page 164 we observe $\Pr(\chi^2_3 \le 4.2) = 0.7593$, hence the p-value is $1 - \Pr(\chi^2_3 \le 4.2) = 0.2407$. We do not reject the null hypothesis that $p = 0.18$ provides a good fit of the data for levels of significance smaller than 0.2407, which is usually the case.
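The binning and the p-value can be cross-checked in Python; for three degrees of freedom the chi-squared CDF has the closed form $\Pr(\chi^2_3 \le x) = \operatorname{erf}(\sqrt{x/2}) - \sqrt{2x/\pi}\, e^{-x/2}$ (a sketch, not part of the original R-based solutions):

```python
import math

# Observed street counts for 0, 1, 2, 3+ burgles and expected counts
# under Bin(6, 0.18), aggregated as in the solution above.
observed = [39, 38, 18, 5]
n, p = 6, 0.18
pmf = [math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(7)]
expected = [100 * pmf[0], 100 * pmf[1], 100 * pmf[2], 100 * sum(pmf[3:])]

T = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Closed-form CDF of chi-squared with 3 degrees of freedom.
def chi2_3_cdf(x):
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = 1 - chi2_3_cdf(T)
# With unrounded expected counts T is about 4.14 (4.13 with the rounded
# table values above), and the p-value is about 0.2471, matching the
# R output in question 6.
print(round(T, 2), round(p_value, 4))
```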
6. The R-code is given by:
> Burgled <- c(39,38,18,4,0,1,0) # vector with observed number of streets with 0, ..., 6 burgles
> BurgledPred <- 100*dbinom(0:6,6,.18) # dbinom(x,n,p) gives the probability mass function of a Binomial(n,p) distribution;
# dbinom(0:6,6,.18) gives a vector with the p.m.f. for x = 0, ..., 6;
# 100*dbinom(0:6,6,.18) thus gives the expected frequencies for the number of streets with 0, ..., 6 burgles
# COMBINING CELLS INTO THE GROUP 3+
> Burgled2 <- c(Burgled[1:3],sum(Burgled[4:7])) # vector with first 3 elements the first 3 elements of Burgled and as fourth element the sum of the fourth to seventh elements of Burgled
> BurgledPred2 <- c(BurgledPred[1:3],sum(BurgledPred[4:7]))
# PERFORM TEST
> chisq.test(Burgled2,p=BurgledPred2,rescale.p=T) # chisq.test(x (observed), p = probabilities under the null (expected); if rescale.p is TRUE then p is rescaled (if necessary) to sum to 1; if FALSE and p does not sum to 1, an error is given)
Help on this function: see also:
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
We find a p-value of 0.2471. Note that the manual solution above had a p-value of 0.2407; the difference is due to rounding the value of the test statistic from 4.13 to 4.2 in order to use the Formulae and Tables table.
7. (a) We are looking at 3 populations here: the MR (marginally rich), the CR (comfortably rich) and the SR (super rich). For each of these populations we group members into one of four education groups, thus creating a multinomial classification of each of the populations.
(b) The observed proportions are given by dividing each cell count by its column total.
Table 1: Observed proportions

    Highest Education Level    Marginally Rich   Comfortably Rich   Super Rich     Total
    No college                            0.32               0.20         0.23      0.25
    Some college                          0.13               0.16         0.01      0.10
    Undergraduate degree                  0.43               0.51         0.60   0.51333
    Postgraduate study                    0.12               0.13         0.16   0.13667
    Total                                    1                  1            1         1

(c) We will test the following hypothesis:

H0 : the probabilities of MR, CR, and SR given education level are equal
v.s.
H1 : the probabilities of MR, CR, and SR given education level are unequal for at least one education level

Or equivalently:

H0 : $p_{edu,MR} = p_{edu,CR} = p_{edu,SR}$ for edu = NC, SC, UD, PS
v.s.
H1 : $p_{edu,MR} \ne p_{edu,CR}$ or $p_{edu,MR} \ne p_{edu,SR}$ for at least one edu = NC, SC, UD, PS

The test statistic is given by:

$$T = \sum_{i \in \{MR,CR,SR\} \times \{NC,SC,UD,PS\}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i} \sim \chi^2(f),$$

where $f = 6$ is the number of degrees of freedom. Note that we have to use the (maximum likelihood) estimates of the proportions to find the expected frequencies, and that the proportion of (for example) PS in each wealth group can be computed from the NC, SC, and UD proportions in that group (the four proportions sum to one). Therefore, the number of degrees of freedom of the test equals $(4 - 1) \times (3 - 1) = 6$. Note, we reject the null hypothesis for large values of the test statistic.
The critical value of our test statistic is given by $\Pr(\chi^2_6 \ge 16.81) = 0.01$ (see Formulae and Tables page 169). Hence, we reject the null hypothesis if $T > 16.81$, i.e., the rejection region is $C = (16.81, \infty)$.
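The tabulated critical value can be verified with the closed-form upper tail for even degrees of freedom, $\Pr(\chi^2_{2m} > x) = e^{-x/2} \sum_{k=0}^{m-1} (x/2)^k / k!$ (a Python sketch, not part of the original solutions):

```python
import math

# Upper-tail probability of chi-squared with 6 df (m = 3 in the closed form).
def chi2_6_sf(x):
    t = x / 2
    return math.exp(-t) * sum(t ** k / math.factorial(k) for k in range(3))

print(round(chi2_6_sf(16.81), 4))  # about 0.01, matching the tabulated critical value
```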
To calculate the test statistic we use the expected numbers under the null hypothesis:
Table 2: Expected numbers

    Highest Education Level    Marginally Rich   Comfortably Rich   Super Rich
    No college                              25                 25           25
    Some college                            10                 10           10
    Undergraduate degree                 51.33              51.33        51.33
    Postgraduate study                   13.67              13.67        13.67

Note all expected cell values are larger than 5, so we do not have to combine cells. The value of our test statistic is given by:

$$\begin{aligned}
T &= \frac{(32-25)^2 + (20-25)^2 + (23-25)^2}{25} + \frac{(13-10)^2 + (16-10)^2 + (1-10)^2}{10} \\
&\quad + \frac{(43-51.33)^2 + (51-51.33)^2 + (60-51.33)^2}{51.33} + \frac{(12-13.67)^2 + (13-13.67)^2 + (16-13.67)^2}{13.67} \\
&= 19.17233.
\end{aligned}$$
We observe that the value of the test statistic lies in the rejection region $C$, so we reject the null hypothesis that the probabilities of MR, CR, and SR given the education level are equal, at a level of significance of 1%.
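The value 19.17233 can be reproduced directly from the observed and expected tables; a Python cross-check (not part of the original solutions):

```python
# Observed counts (rows: NC, SC, UD, PS; columns: MR, CR, SR) and
# expected counts under the null, as in Tables 1 and 2 above.
observed = [
    [32, 20, 23],
    [13, 16, 1],
    [43, 51, 60],
    [12, 13, 16],
]
row_totals = [sum(r) for r in observed]                       # 75, 30, 154, 41
expected = [[rt / 3 for _ in range(3)] for rt in row_totals]  # each column total is 100

T = sum((o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row))
print(round(T, 5))  # about 19.17233
```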
(d) We are asked to find the 95% confidence interval for the difference in proportions with at least an undergraduate degree for individuals who are marginally and super rich. The corresponding numbers of observations are given in the table below.
Table 3: Observed frequencies

    Highest Education Level    Marginally Rich   Super Rich   Total
    NC, SC                                  45           24      69
    UD, PS                                  55           76     131
    Total                                  100          100     200

The proportions are given by:

Table 4: Observed proportions

    Highest Education Level    Marginally Rich    Super Rich   Total
    NC, SC                                0.45          0.24   0.345
    UD, PS                     p̂MR = 0.55         p̂SR = 0.76   0.655
    Total                                    1             1       1

Using the CLT (a proportion is a mean), the pivotal quantity is given by:

$$T = \frac{(\hat{p}_{MR} - \hat{p}_{SR}) - (p_{MR} - p_{SR})}{\sqrt{\dfrac{\hat{p}_{MR}(1-\hat{p}_{MR})}{n_{MR}} + \dfrac{\hat{p}_{SR}(1-\hat{p}_{SR})}{n_{SR}}}} \sim N(0, 1).$$

Thus, the $100(1-\alpha)\%$ confidence interval is given by:

$$(\hat{p}_{MR} - \hat{p}_{SR}) - z_{1-\alpha/2} \sqrt{\frac{\hat{p}_{MR}(1-\hat{p}_{MR})}{n_{MR}} + \frac{\hat{p}_{SR}(1-\hat{p}_{SR})}{n_{SR}}} < p_{MR} - p_{SR} < (\hat{p}_{MR} - \hat{p}_{SR}) + z_{1-\alpha/2} \sqrt{\frac{\hat{p}_{MR}(1-\hat{p}_{MR})}{n_{MR}} + \frac{\hat{p}_{SR}(1-\hat{p}_{SR})}{n_{SR}}}.$$

Thus the 95% confidence interval is given by: (-0.33850849, -0.08149151).
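The interval endpoints can be reproduced in Python using `statistics.NormalDist` for the normal quantile; a sketch mirroring the R code in part (e) (not part of the original solutions):

```python
from statistics import NormalDist
import math

# 95% CI for p_MR - p_SR with estimates 0.55 and 0.76, n = 100 in each group.
p_mr, p_sr, n_mr, n_sr = 0.55, 0.76, 100, 100
diff = p_mr - p_sr
se = math.sqrt(p_mr * (1 - p_mr) / n_mr + p_sr * (1 - p_sr) / n_sr)
z = NormalDist().inv_cdf(0.975)  # about 1.96
lower, upper = diff - z * se, diff + z * se
print(round(lower, 8), round(upper, 8))  # about (-0.33850849, -0.08149151)
```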


We observe that zero is not in the confidence interval; thus, when testing the null hypothesis that the proportions are equal against the alternative that they are unequal at a level of significance of 5%, we can reject the null hypothesis.
(e) Part c)
> rich <- matrix(c(32,20,23,13,16,1,43,51,60,12,13,16),nrow=4,byrow=T)
> E <- chisq.test(rich,correct=F)$expected; print(E) # displays the expected cell values; use this to check whether all cells are >= 5
> chisq.test(rich,correct=F)
Part d)
> p1.hat <- sum(rich[3:4,1])/100
> p3.hat <- sum(rich[3:4,3])/100
> diff <- p1.hat-p3.hat
> lower <- diff+qnorm(.025)*sqrt(p1.hat*(1-p1.hat)/100+p3.hat*(1-p3.hat)/100)
> upper <- diff+qnorm(.975)*sqrt(p1.hat*(1-p1.hat)/100+p3.hat*(1-p3.hat)/100)
> c(lower,upper)
Answer 95% confidence interval: (-0.33850849, -0.08149151).
-End of Week 9 Tutorial Solutions-

