
THE AUSTRALIAN NATIONAL UNIVERSITY

SCHOOL OF FINANCE AND APPLIED STATISTICS


Second Semester Final Examination, November 2006
QUANTITATIVE RESEARCH METHODS (STAT1008)
Study Period: 15 minutes
Time Allowed: 3 hours
Permitted Material: Calculator, dictionary and 1 A4 page with notes
on both sides

Instructions to Candidates:
Attempt ALL questions.
Each question is of equal mark value.
Start your solution to each question on a new page.
To ensure full marks show all the steps in working out your
solution. Marks may be deducted for failure to show appropriate
calculations or formulae.
Unless otherwise stated, use a significance level of 5%.
Selected statistical tables are attached to the back of the
examination paper.

Page 1 of 18

STAT1008 Quantitative Research Methods Final Examination Semester 2, 2006


Question 1: 20 marks
For each question below, choose the best answer from the options given.
Write your answer in your answer booklet clearly indicating the question (i to xx), and
your answer as the letter appropriate (A, B, C, or D).
You will gain 1 mark for each correct answer. Marks will not be deducted for
incorrect answers.
Answers are in bold and italics.
i. Statistics can be best defined as which of the following:
a. A framework for dealing with variability
b. A set of tools for answering questions
c. A subject in which no answer is ever right
d. A system of obscuring the facts and data by using formulae.
ii. If we wish to examine if it is possible that a linear relationship exists between
two continuous variables, which of the following methods is most appropriate?
a. Simple linear regression
b. Multiple linear regression
c. 1 sample t-test
d. 2 sample t-test
iii. The heights of a group of men are taken, with a sample standard deviation found
to be 2.8cm. The heights of a group of women are taken, and the sample standard
deviation is found to be 4.3cm. Which of the following statements is true?
a. There is a significant difference in the population variances of the
heights of men and women.
b. There is no significant difference in the population variances of the
heights of men and women.
c. The average heights of men and women are different.
d. We need more information about the study performed before we can
make any comments.
iv. The central limit theorem ensures that which of the following statements is true,
given sufficient observations?
a. The standard deviation gets smaller as sample size increases.
b. The sample mean gets smaller as sample size increases.
c. The standard error of the sample mean gets smaller as sample size
increases.
d. The standard error of the sample mean follows a normal distribution.
v. If two events, A and B, are mutually exclusive, which of the following is true?
a. P(A)+P(B)=1
b. P(A)+P(B)=0
c. $P(A \cap B) = 1$
d. $P(A \cap B) = 0$ - Answer d is correct
Questions (vi) to (xx) relate to the following situation. Use the information given
below to answer the questions.
A local General Practitioner (medical doctor) is interested in examining the
number of times people visit a doctor within a 12 month period. She selects a
random sample of 200 records from her files, and records from each record the
following information:
Gender (coded so that 0=male, 1=female)
Age
Number of times the patient visited the doctor in the preceding 12
months
Number of times a prescription was given.
vi. When the data are entered into a Minitab worksheet, how many columns will be
required?
a. 200
b. 2
c. 3
d. 4
vii. When the data are entered into a Minitab worksheet how many rows will be
required (excluding heading rows)?
a. 200
b. 2
c. 3
d. 4
viii. How many continuous variables are in the data set?
a. 1
b. 2
c. 3
d. 4
Some basic descriptive statistics were calculated in Minitab and are presented below.
(Note that Num of Prescript refers to the number of prescriptions given.) Drips of
medication have obscured some of the numbers.
Descriptive Statistics: Gender, Age, Number of Visits, Number of Prescriptions
Variable            N    Mean    SE Mean   StDev   Minimum
Gender            200  0.6150   *drip1*  0.4878     0.000
Age               200   37.52   *drip2*   15.87      1.00
Number of Visits  200   5.155   *drip3*   1.894     1.000
Num of Prescript  200   3.000   *drip4*   1.607     0.000

Variable             Q1  Median      Q3  Maximum
Gender            0.000  1.0000  1.0000   1.0000
Age               28.25   38.00   47.00    81.00
Number of Visits  4.000   5.000   7.000   10.000
Num of Prescript  2.000   3.000   4.000    9.000

ix. The number which should be present at drip2 is (to 3 decimal places)
a. 2.653
b. 0.188
c. 1.122
d. 0.079
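As a quick numerical check of part (ix), the standard error of a sample mean is the sample standard deviation divided by the square root of the sample size. The sketch below is not part of the original paper; the function name `standard_error` is illustrative, and the inputs are the StDev and N values from the Age row above.

```python
import math


def standard_error(stdev: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    return stdev / math.sqrt(n)


# Age row of the Minitab output: StDev = 15.87, N = 200
se_age = standard_error(15.87, 200)
print(round(se_age, 3))
```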
x. Based on the above descriptive statistics, which box in the graph below best
represents the variable age?


[Figure: Boxplot of Age 1, Age 2, Age 3, Age 4. Vertical axis: Data, 0 to 150.]
a. Age 1
b. Age 2
c. Age 3
d. Age 4

The doctor is interested in the differences between her male and female patients.
A boxplot is given below of the ages of the patients, split by gender. Use it to
answer question (xi).
[Figure: Boxplot of Age vs Gender. Vertical axis: Age, 0 to 90; horizontal axis: Gender (0, 1).]


xi. Which of the following statements is true, based only on the boxplot above?
a. There were more men than women in the sample.
b. The men had a higher average age than women.
c. The women had a smaller standard deviation than men.
d. None of the above.
Some more descriptive statistics were calculated, this time comparing the ages of men
and women. They are reproduced below, with some drips of medication again
obscuring some of the numbers.
Descriptive Statistics: Age

Variable  Gender    N   Mean  SE Mean  StDev
Age       0        77  36.09     1.98  17.37
          1       123  38.41     1.34  14.85

Variable  Gender  Minimum  Median  Maximum
Age       0          1.00   38.00    80.00
          1          1.00   38.00   *med1*

xii. HINT: to answer this question, you will need to refer to the first descriptive
statistics output, containing the statistics for the total age column.
The number which should be shown at med1 is
a. 80
b. 200
c. 81
d. 47
xiii. In order to test if the population variances of ages for males and females are the
same, the test statistic to be used would be
a. $\frac{17.37}{14.85}$
b. $\frac{17.37^2}{14.85^2}$ - Answer b is correct
c. $\frac{17.37}{14.85}$ [as extracted; the notation distinguishing this option from (a) was lost]
d. $\frac{1.98}{1.34}$
xiv. For the test in part (xiii), the test statistic would be compared to which tables for
determining the decision rule?
a. F tables with 77, 123 degrees of freedom
b. F tables with 76, 122 degrees of freedom
c. T tables with 199 degrees of freedom
d. T tables with 200 degrees of freedom
xv. Assuming the test of equal variances failed to reject the null hypothesis, which of
the following shows the correct variance to be used in calculating the test statistic
for a test of equal mean ages of males and females?
a. $\frac{(77 - 1) \times 36.09^2 + (123 - 1) \times 38.41^2}{77 + 123 - 2}$
b. $\frac{(77 - 1) \times 17.37^2 + (123 - 1) \times 14.85^2}{77 + 123 - 2}$ [as extracted; the notation distinguishing this option from (d) was lost]
c. $\frac{(77 - 1) \times 17.37 + (123 - 1) \times 14.85}{77 + 123 - 2}$
d. $\frac{(77 - 1) \times 17.37^2 + (123 - 1) \times 14.85^2}{77 + 123 - 2}$ - Answer d is correct
xvi. The test mentioned in part (xv) is performed, giving a test statistic of -1.00. To
which tables should this test statistic be compared?
a. F tables with 77, 123 degrees of freedom
b. T tables with 200 degrees of freedom
c. T tables with 199 degrees of freedom
d. T tables with 198 degrees of freedom
xvii. The p-value for the test considered in parts (xv) and (xvi), against a two-sided
alternative, is 0.317. Based on this, which of the following statements is most
correct?
a. We would accept the null hypothesis.
b. We would reject the null hypothesis.
c. There is a 31.7% chance the null hypothesis is true.
d. 31.7% of the time, the null hypothesis will be true.

The doctor is interested in the proportion of patients (regardless of gender) who have
visited 4 or fewer times over the year. She finds that in her sample of 200, 117
patients have visited 4 or fewer times. Based on this information, a 98% confidence
interval for the population proportion visiting 4 or fewer times in a year is calculated
to be $\hat{p} \pm c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.
xviii. In the confidence interval formula above, the value $\hat{p}$ should be replaced by
which of the following numbers?
a. $\frac{117}{200}$ - Answer a is correct
b. $\frac{83}{200}$
c. $\frac{83}{117}$
d. 0.01
xix. In the confidence interval formula above, the value c should be replaced by
which of the following numbers?
a. 0.01
b. 0.02
c. 1.96
d. 2.33 - Answer d is correct
xx. In the confidence interval formula above, the value n should be replaced by
which of the following numbers?
a. 117
b. 83
c. 200 - Answer c is correct
d. Not enough information is available to answer this question.
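The three substitutions in parts (xviii) to (xx) can be put together to evaluate the interval. The following is a minimal sketch, not part of the original paper: the helper name `proportion_ci` is illustrative, and it uses the exact normal quantile from Python's `statistics.NormalDist` (about 2.326) where the paper uses the table value 2.33.

```python
import math
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float):
    """Large-sample confidence interval for a population proportion.

    Returns (lower, upper, c), where c is the standard normal critical value.
    """
    p_hat = successes / n
    # Two-sided interval: put half of the remaining probability in each tail.
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = c * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width, c


# 117 of 200 patients visited 4 or fewer times; 98% confidence
lower, upper, c = proportion_ci(117, 200, 0.98)
print(round(lower, 3), round(upper, 3), round(c, 2))
```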



Question 2 (20 marks)
Research and development (R&D) can be variously seen as crucial to a firm's ability
to make scientific advances over its competitors, or a drain on financial resources with
no definite payoff. A government study wishes to investigate chemical firms with
regard to their spending on R&D both in Australia, and overseas. They take a sample
of 30 similar sized chemical firms who have their base within Australia, and obtain
the percentage of total profit spent on R&D in Australia (X), and the percentage of
total profit spent on R&D outside Australia (Y).
A scatterplot of the data appears below.
[Figure: Scatterplot of Y vs X. Vertical axis: Y, 0.0 to 0.9; horizontal axis: X, 0 to 12.]
Some basic descriptive statistics appear below.


Descriptive Statistics: X, Y

Variable   N    Mean  SE Mean   StDev  Minimum  Median  Maximum
X         30   3.798    0.629   3.443    0.510   2.045   11.040
Y         30  0.1960   0.0416  0.2279    0.000  0.0800   0.9100

Covariances: X, Y

           X          Y
X  11.855989
Y   0.572271   0.051942
A regression is performed in Minitab, but a minor chemical spill has obscured some
of the output.



Regression Analysis: Y versus X

The regression equation is
*Chemical spill1*

Table 1
Predictor      Coef   SE Coef         T      P
Constant    0.01268   0.04355  *spill2*  0.773
X          0.048269  0.008559      5.64  0.000

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       1  0.80106  0.80106  31.80  0.000
Residual Error  28  0.70526  0.02519
Total           29  1.50632

Unusual Observations
Obs    X       Y     Fit  SE Fit  Residual  St Resid
  6  8.2  0.9100  0.4085  0.0475    0.5015     3.31R

R denotes an observation with a large standardized residual.
[Figure: Residual Plots for Y. Four panels: Normal Probability Plot of the Residuals (Percent vs Standardized Residual); Residuals Versus the Fitted Values (Standardized Residual vs Fitted Value); Histogram of the Residuals (Frequency vs Standardized Residual); Residuals Versus the Order of the Data (Standardized Residual vs Observation Order).]



Predicted Values for New Observations

New Obs     Fit  SE Fit            95% CI             95% PI
      1  0.1092  0.0328  (0.0420, 0.1764)  (-0.2228, 0.4412)
      2  0.2058  0.0290  (0.1463, 0.2652)  (-0.1247, 0.5362)
      3  0.3023  0.0346  (0.2315, 0.3731)  (-0.0304, 0.6350)
      4  0.3988  0.0462  (0.3042, 0.4934)  ( 0.0602, 0.7374)

Values of Predictors for New Observations

New Obs     X
      1  2.00
      2  4.00
      3  6.00
      4  8.00

Use the output above to answer the questions below.


a. (2 marks) Describe the scatterplot.
The scatterplot shows a weak, positive relationship that is possibly linear. As X
increases, Y seems to increase on average, but there is a high amount of variability in
the relationship.
b. (2 marks) Calculate the coefficient of correlation between X and Y.
$r = \frac{\mathrm{cov}(x, y)}{s_x s_y} = \frac{0.572271}{3.443 \times 0.2279} = 0.7293$ to 4 decimal places.
c. (3 marks) Calculate the coefficient of determination for the regression
performed. Interpret this value.

$R^2 = r^2 = 0.7293^2 = 0.5319$ to 4 decimal places.


53.19% of the variation in Y is explained by the model.
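The arithmetic in parts (b) and (c) can be reproduced as follows. This is only a sketch for checking the numbers, not part of the original solution; the function name `correlation` is illustrative, and the inputs are the covariance and standard deviations from the Minitab output.

```python
def correlation(cov_xy: float, sd_x: float, sd_y: float) -> float:
    """Sample correlation coefficient: cov(x, y) / (s_x * s_y)."""
    return cov_xy / (sd_x * sd_y)


# Values read from the Covariances and Descriptive Statistics output
r = correlation(0.572271, 3.443, 0.2279)
r_squared = r ** 2  # coefficient of determination in simple linear regression

print(round(r, 4), round(r_squared, 4))
```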
d. (1 mark) Write down the regression equation (hidden by chemical
spill 1).

Y = 0.01268 +0.048269X
e. (2 marks) What number should be shown at spill2 (Constant row, T
column of Table 1)?
Spill 2 = test statistic for testing the null hypothesis $\beta_0 = 0$:
$T = \frac{\hat{\beta}_0 - 0}{s_{\hat{\beta}_0}} = \frac{0.01268}{0.04355} = 0.2912$ to 4dp

f. (3 marks) Write down the null and alternative hypotheses being tested
by the P-value of 0.000 (X row, P column of Table 1). What
conclusion would you draw from this test in terms of the original
variables?
$H_0: \beta_1 = 0$
$H_A: \beta_1 \neq 0$
P-value = 0.000
Conclusion: Reject the null hypothesis. That is, there is a significant linear
relationship between X and Y. The slope of the regression is significantly different
from zero.
g. (3 marks) Comment on the Residual plots and the Unusual
Observations flagged by Minitab. Do you see any cause for concern
about the validity of the model?
Unusual observations: One large standardised residual, from 30 observations is
approximately 3%, so is not a cause for concern, that is, it does not indicate
significant non-normality in the residuals.
Residual plots: The normal probability plot and histogram suggest that the distribution
of residuals has a slight positive skew, i.e. is not normal. This is a violation of the
assumption of normality and so is cause for concern. The residuals in order show no
pattern, and so there is no violation of the assumption of independence. The residuals
vs fitted values show a possible increasing variance with increasing fitted values; this
violates the assumption of constant variance.
So, there is cause for concern about the validity of the model with the normality and
constant variance of the residuals in some doubt.
h. (2 marks) There is interest in finding a 95% interval for the average
spend on R&D overseas for a chemical firm with Australian R&D
spending of 4%. Give the interval required from the output above.
X=4%. Want to find a 95% Confidence interval for E(Y).
From the output, the interval required is (0.1463, 0.2652).
i. (2 marks) Which value of the predictor will give the narrowest possible
prediction interval? Briefly explain your answer.
The value of X that will give the narrowest prediction interval is X = 3.798, the sample
mean of X. At this point, the maximum possible information is available as equal
information is available for X larger than and X smaller than this value. In terms of the
calculation formula, the term $(x_g - \bar{x})^2$ takes its minimum value (0) at this point,
giving the narrowest possible interval.



Question 3
One of the chemical companies studied in Question 2 was re-examined in a further
government study into employment of university graduates. For a sample of 100
employees from the company, the average yearly income of staff was recorded, along
with whether the staff member possessed a university (Bachelors level) degree or not.
The data found are shown in the table below.
                      Number of staff  Average Yearly Income ($)  Standard Deviation ($)
No university degree               53                    $44,452                 $149.00
University degree                  47                    $56,493                 $128.24
i. (2 marks) A new graduate is employed, whose honours project makes him
a particularly desirable member of the company. As an inducement, his
supervisors wish to set his salary such that he makes more than 75% of
existing employees with degrees. Assuming that incomes are normally
distributed, what annual income should he be given?
Let the income of employees with degrees be Y, with $Y \sim N(\mu = 56493, \sigma^2 = 128.24^2)$.
We want to find the income level, y, such that $P(Y > y) = 0.25$.
$P\left(\frac{Y - \mu}{\sigma} > \frac{y - 56493}{128.24}\right) = 0.25$
$P\left(Z > \frac{y - 56493}{128.24}\right) = 0.25$
From tables, $P(Z < -0.67) = 0.25$.
Therefore, by symmetry, $P(Z > 0.67) = 0.25$.
$\frac{y - 56493}{128.24} = 0.67$
$y = 56493 + 0.67 \times 128.24 = 56578.92$
That is, he should be given an annual income of $56,578.92 to the nearest cent.
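For comparison, the same cut-off can be computed without tables. The sketch below is an illustration rather than part of the original solution: the function name `salary_cutoff` is made up for this example, and it uses the exact 75th percentile of the normal distribution (z of about 0.6745), so it differs from the table-based answer by roughly 58 cents.

```python
from statistics import NormalDist


def salary_cutoff(mean: float, sd: float, proportion_below: float) -> float:
    """Income level with the given proportion of a normal population below it."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(proportion_below)


# Employees with degrees: mean $56,493, standard deviation $128.24
exact = salary_cutoff(56493, 128.24, 0.75)
from_table = 56493 + 0.67 * 128.24  # z = 0.67 as read from the printed tables

print(round(exact, 2), round(from_table, 2))
```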
ii. (8 marks) The company's owner is famed for being a success in the field
with no university degree. He claims that the difference in average salary
between staff with and without degrees is not significant. Based on these
data, would you agree with him? Perform a formal hypothesis test to
answer this question, ensuring that you check all assumptions.
We want to test if the average incomes are the same. First we need to check for equal
variances.
Let $Y_N, \mu_N, \sigma_N, s_N$ refer to staff with no degree, and $Y_D, \mu_D, \sigma_D, s_D$ refer to staff with a
university degree.
$H_0: \sigma_N^2 = \sigma_D^2$
$H_A: \sigma_N^2 \neq \sigma_D^2$
Test Statistic: $F = \frac{s_N^2}{s_D^2} = \frac{149^2}{128.24^2} = 1.350$ to 3dp.
Decision Rule: Compare to an F distribution with numerator df = 52, denominator df
= 46. Reject the null hypothesis if $TS > F_{52,46,0.025} \approx F_{60,40,0.025} = 1.80$.
Conclusion: Do not reject the null hypothesis. There is insufficient evidence to show
unequal population variances. Therefore we proceed with the test of equal population
means, assuming equal population variances.
$H_0: \mu_N = \mu_D$
$H_A: \mu_N \neq \mu_D$
Test Statistic:
$s_{pooled}^2 = \frac{(n_D - 1) s_D^2 + (n_N - 1) s_N^2}{n_D + n_N - 2} = \frac{46 \times 128.24^2 + 52 \times 149^2}{47 + 53 - 2} = 19499.438$ to 3dp
$T = \frac{(\bar{Y}_N - \bar{Y}_D) - 0}{\sqrt{s_{pooled}^2 \left(\frac{1}{n_D} + \frac{1}{n_N}\right)}} = \frac{(44452 - 56493) - 0}{\sqrt{19499.438 \left(\frac{1}{47} + \frac{1}{53}\right)}} = -430.37$ to 2dp
Decision Rule: Compare to a t-distribution with 98 degrees of freedom. For alpha =
5%, we will reject the null hypothesis if $|TS| > t_{98,0.025} \approx t_{100,0.025} = 1.984$.
Conclusion: We reject the null hypothesis. There is very strong evidence against the
null hypothesis. The average wage difference between staff with and without degrees
is significantly different from zero.
iii. (4 marks) It is claimed that 50% of the employees of this firm have
university degrees. Do the data support this claim? Perform a formal
hypothesis test at 10% level to answer this question.
$H_0: p = 0.5$
$H_A: p \neq 0.5$
Test Statistic:
$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} = \frac{0.47 - 0.5}{\sqrt{\frac{0.5 (1 - 0.5)}{100}}} = -0.6$
Decision Rule: Compare to a Z distribution. For alpha = 10%, reject the null
hypothesis if |TS|>1.645.
Conclusion: Do not reject the null hypothesis. There is insufficient evidence to dispute
the claim that 50% of staff have degrees.
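The test statistic and decision rule above can be checked numerically. The sketch below is illustrative and not part of the original solution; `proportion_z` is a made-up helper name, and the critical value is taken from `statistics.NormalDist` rather than from the printed tables.

```python
import math
from statistics import NormalDist


def proportion_z(p_hat: float, p0: float, n: int) -> float:
    """Z statistic for H0: p = p0, using the null standard error."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


z = proportion_z(47 / 100, 0.5, 100)
critical = NormalDist().inv_cdf(0.95)  # two-sided test at the 10% level
reject = abs(z) > critical

print(round(z, 2), round(critical, 3), reject)
```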
iv. (2 marks) Based on your answer to part (iii), would you expect to find the
value 0.5 within a 90% confidence interval for the population proportion
of employees with degrees? Explain why or why not.
Yes, 50% would be within a 90% confidence interval for p as the null hypothesis was
not rejected against a 2-sided alternative with 10% significance.
v. (4 marks) A university science department wishes to conduct a study into
the average income of all its graduates employed in chemical companies.
Assuming that the population standard deviation of annual incomes is
known to be $128.24, how many former students should they include in
their sample to obtain a 99% confidence interval with maximum width of
$50?



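Want the width of the interval to be no more than $50.
Width of interval = upper confidence bound − lower confidence bound
$= 2 \times Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$
$2 \times Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} < 50$
$Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} < 25$
For 99% confidence, $Z_{\alpha/2} = Z_{0.005} = 2.575$
$2.575 \times \frac{\sigma}{\sqrt{n}} < 25$
$\frac{\sigma}{\sqrt{n}} < \frac{25}{2.575}$
Given $\sigma = 128.24$:
$\sqrt{n} > 128.24 \times 2.575 / 25 = 13.209$
$n > 174.47$
That is, we need at least 175 former students in the study.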
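The rearrangement above amounts to n > (2 z σ / width)², rounded up to the next whole number. A minimal sketch, not part of the original solution, with an illustrative function name and the table value z = 2.575 passed in explicitly:

```python
import math


def required_sample_size(sigma: float, max_width: float, z: float) -> int:
    """Smallest n for which a z-interval with known sigma is no wider than max_width."""
    return math.ceil((2 * z * sigma / max_width) ** 2)


# sigma = $128.24, maximum width $50, z = 2.575 for 99% confidence (from tables)
n_required = required_sample_size(128.24, 50, 2.575)
print(n_required)
```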


Question 4
With newspaper reports of skyrocketing crime rates, a sample of 150 successful
prosecutions (i.e. crimes for which a conviction has been obtained) in the past 24
months is taken for the purpose of studying the patterns of offender age and type of
crime. The table below gives the data obtained from the sample.
                    Age of offender (in years)
Type of Crime    Under 20    20-40    Over 40
Violent                27       41         14
Nonviolent             12       34         22
a. One criminal file is selected at random from the 150 by a judge for review.
i. (1 mark) Estimate the probability that the file selected relates to a violent
crime.
P(violent) = (27+41+14) / 150 = 82/150
ii. (2 marks) If it is known that the file deals with a violent crime, what is
the probability that it also relates to a criminal under 20 years of age?
P(criminal<20 | violent) = 27/82
b. (2 marks) Are age of offender and type of crime plausibly independent? Explain
your answer using examples (or a counter-example).
If A and B are independent, P(A|B) = P(A).
Here $P(\text{Criminal}<20 \mid \text{violent}) = 27/82 \neq P(\text{Criminal}<20) = 39/150$
Therefore, age and nature of crime are not independent.
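The probabilities used in parts (a) and (b) can be read off the table with exact fractions. This is a sketch for checking the sample proportions only, not part of the original solution; the dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

# Counts from the table: type of crime by age of offender
counts = {
    "Violent": {"Under 20": 27, "20-40": 41, "Over 40": 14},
    "Nonviolent": {"Under 20": 12, "20-40": 34, "Over 40": 22},
}
total = sum(sum(row.values()) for row in counts.values())

violent_total = sum(counts["Violent"].values())
under20_total = sum(row["Under 20"] for row in counts.values())

p_violent = Fraction(violent_total, total)
p_under20 = Fraction(under20_total, total)
p_under20_given_violent = Fraction(counts["Violent"]["Under 20"], violent_total)

# Under independence the conditional and marginal probabilities would be equal.
print(p_under20_given_violent, p_under20, p_under20_given_violent == p_under20)
```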
c. (3 marks) Estimate the marginal probabilities for age of offender, presenting your
answer in a table form.
Age bracket of offender    Probability
<20                        39/150 = 0.26
20-40                      75/150 = 0.5
>40                        36/150 = 0.24

Another study is performed into non-violent criminals under 20, this time classifying
them by the number of times they have been successfully prosecuted. Upon
examining all records available, the following probability distribution is found to
apply to the group.
Number of Convictions       1       2       3       4
Probability              0.33    0.34    0.22    0.11
d. (4 marks) Find the mean number of convictions among non-violent criminals
under 20 years of age. Is this an observable value, i.e. is it possible that there will
be a non-violent criminal in this age group with that exact number of convictions?
Does this indicate a problem with the data? Explain your answer.
Let X be the number of convictions.
$E(X) = \sum_{i=1}^{n} x_i \, p(x_i)$
$= 1 \times 0.33 + 2 \times 0.34 + 3 \times 0.22 + 4 \times 0.11$
$= 2.11$



This is not an observable value, but this does not indicate a problem with the data.
This is a long run average, not an exact, observable value.
e. (4 marks) Find the standard deviation of the number of convictions among this
group of criminals.
$\mathrm{var}(X) = E(X^2) - [E(X)]^2$
$E(X^2) = \sum_{i=1}^{n} x_i^2 \, p(x_i)$
$= 1^2 \times 0.33 + 2^2 \times 0.34 + 3^2 \times 0.22 + 4^2 \times 0.11$
$= 5.43$
$\mathrm{var}(X) = 5.43 - 2.11^2 = 0.9779$ to 4dp
$\mathrm{sd}(X) = \sqrt{0.9779} = 0.9889$ to 4dp
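The mean, variance and standard deviation in parts (d) and (e) can be recomputed directly from the probability table. A minimal sketch, not part of the original solution; the variable names are illustrative.

```python
import math

# Probability distribution of the number of convictions
pmf = {1: 0.33, 2: 0.34, 3: 0.22, 4: 0.11}

mean = sum(x * p for x, p in pmf.items())              # E(X)
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                   # var(X) = E(X^2) - [E(X)]^2
sd = math.sqrt(variance)

print(round(mean, 2), round(second_moment, 2), round(variance, 4), round(sd, 4))
```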
f. (1 mark) If a single criminal of this group is sampled at random, what is the
probability that he has been convicted 3 or fewer times?
$P(X \le 3) = 1 - P(X = 4) = 1 - 0.11 = 0.89$
g. (3 marks) A sample of 50 criminals who fit this profile is taken. What is the
probability that the average number of times they have been convicted is 3 or
fewer?
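Let $\bar{X}$ be the sample average.
We know from the central limit theorem (assuming that n=50 is large enough) that
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
So, $P(\bar{X} < 3) = P\left(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{3 - 2.11}{0.9889 / \sqrt{50}}\right)$
$= P(Z < 6.364)$
$\approx 1$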
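The standardisation in part (g) can be checked as below. This sketch is not part of the original solution: the function name is illustrative, the normal approximation is the same central limit theorem assumption made in the answer, and the standard deviation is taken as the square root of the variance 0.9779 found in part (e).

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(threshold: float, mu: float, sigma: float, n: int):
    """Normal approximation to P(sample mean < threshold); returns (z, probability)."""
    z = (threshold - mu) / (sigma / math.sqrt(n))
    return z, NormalDist().cdf(z)


z_score, prob = prob_sample_mean_below(3, 2.11, math.sqrt(0.9779), 50)
print(round(z_score, 3), prob)
```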


Question 5
(a) An unnamed lecturer has a reputation for her lectures running over time, that is,
taking longer than the 50 minutes allocated for her teaching slots. The duration of
her lectures is a random variable best represented by a distribution described by the
following equation: if x is the length of a lecture, in minutes, then
$$f(x) = \begin{cases} 0, & x < 40 \\ \dfrac{x - 40}{150}, & 40 \le x < 55 \\ \dfrac{60 - x}{50}, & 55 \le x < 60 \\ 0, & 60 \le x \end{cases}$$

i. (3 marks) Draw a graph of f(x), clearly marking all axes and points of
interest.

ii. (2 marks) Is f(x) a probability distribution? Explain why or why not.


f(x) is a probability distribution as it satisfies the two basic requirements for a
continuous probability distribution function, those being that $f(x) \ge 0$ everywhere, and
the area under the curve is 1 (check: base length = 20, height = 0.1, so 0.5*base*
height = 20*0.1*0.5 = 1).
iii. (2 marks) What proportion of the time will the lecturer take less than 45
minutes for a lecture?
P(X<45) = 0.5*5* ( (45-40)/150) = 0.0833 to 4dp.
iv. (3 marks) What is the probability that a lecture runs for longer than 50
minutes?
P(X>50) = 1-P(X<50) = 1-0.5*10* ( (50-40)/150) = 1- 1/3 = 2/3
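The triangle-area calculations in parts (ii) to (iv) can be checked with a small cumulative distribution function. This is a sketch, not part of the original solution: the function names are illustrative, and the closed-form expressions in `lecture_cdf` are obtained by integrating the two linear pieces of f(x).

```python
def lecture_pdf(x: float) -> float:
    """Density of lecture length in minutes, as defined in the question."""
    if 40 <= x < 55:
        return (x - 40) / 150
    if 55 <= x < 60:
        return (60 - x) / 50
    return 0.0


def lecture_cdf(x: float) -> float:
    """P(X < x), from the areas of the triangles under lecture_pdf."""
    if x <= 40:
        return 0.0
    if x <= 55:
        return (x - 40) ** 2 / 300      # rising triangle: 0.5 * base * height
    if x < 60:
        return 1 - (60 - x) ** 2 / 100  # total area minus remaining falling triangle
    return 1.0


print(lecture_cdf(60))                 # total area under the curve
print(round(lecture_cdf(45), 4))       # P(X < 45)
print(round(1 - lecture_cdf(50), 4))   # P(X > 50)
```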
(b) One of the lecturer's colleagues, called Professor Good (to preserve anonymity),
with a much better reputation for time management of lectures, claims that the
length of his lectures is best represented by a uniform variable with positive
probability between 47 and 52 minutes.
i. (1 mark) Draw a graph representing the probability distribution of the
length of a lecture given by Professor Good, clearly marking all axes and
points of interest.
Let y be the length, in minutes, of a lecture given by Professor Good.
Y is a uniform, continuous random variable.
[Figure: graph of f(y) against y, with f(y) = 0.2 for $47 \le y < 52$. Vertical axis: f(y), 0 to 0.25; horizontal axis: y, 47 to 52.]

ii. (1 mark) Find the expected length of a randomly selected lecture given by
Professor Good.
$E(Y) = \frac{a + b}{2} = \frac{47 + 52}{2} = 49.5$
iii. (1 mark) Find the standard deviation of length of a lecture given by
Professor Good.
$\mathrm{var}(Y) = \frac{(b - a)^2}{12} = \frac{(52 - 47)^2}{12} = 2.083$ to 3dp.
standard deviation(Y) = $\sqrt{2.083} = 1.443$ to 3dp.
iv. (2 marks) Find the probability that a lecture given by Professor Good runs
for longer than 50 minutes.
P(Y>50) = 2/5
v. (1 mark) Find the probability that a lecture given by Professor Good takes
exactly 50 minutes.
P(Y=50) = 0



vi. (4 marks) If a random sample of 50 lectures given by Professor Good is
timed, find the probability that the sample average is over 50 minutes.
By the central limit theorem, we assume that n = 50 is large enough for the
distribution of the sample mean to be approximately normal, that is we assume
$\bar{Y} \sim N\left(\mu = E(Y), \; \mathrm{var} = \frac{\mathrm{var}(Y)}{n}\right)$
$P(\bar{Y} > 50) = P\left(\frac{\bar{Y} - \mu}{\sqrt{\mathrm{var}(Y) / n}} > \frac{50 - 49.5}{1.443 / \sqrt{50}}\right)$
$= P(Z > 2.45)$ rounding to 2dp
$= P(Z < -2.45)$
$= 0.0071$
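The uniform-distribution results in part (b) can be collected in one short check. This is a sketch under the same assumptions as the worked answers (a uniform distribution on 47 to 52 minutes and a normal approximation for the mean of 50 lectures), not part of the original solution; the variable names are illustrative. The final probability uses the unrounded z value, so it can differ from the table answer in the fourth decimal place.

```python
import math
from statistics import NormalDist

a, b, n = 47, 52, 50  # uniform limits in minutes, and the sample size

mean = (a + b) / 2                   # E(Y)
variance = (b - a) ** 2 / 12         # var(Y)
sd = math.sqrt(variance)
p_single_over_50 = (b - 50) / (b - a)  # P(Y > 50) for one lecture

# Sample mean of n lectures, treated as approximately normal
z_mean = (50 - mean) / (sd / math.sqrt(n))
p_mean_over_50 = 1 - NormalDist().cdf(z_mean)

print(mean, round(variance, 3), round(sd, 3), p_single_over_50)
print(round(z_mean, 2), round(p_mean_over_50, 4))
```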
END OF EXAMINATION PAPER
