Past Paper GCE O'level Statistics 4040 All Variants 2013 to 2014 inclusive of examiner reports

Sep 20, 2015

© All Rights Reserved

4040/12

STATISTICS

Paper 1

October/November 2013

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use a soft pencil for any diagrams or graphs.

Do not use staples, paper clips, highlighters, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (NH/CGW) 66916/2

UCLES 2013

Section A [36 marks]

Answer all of the questions 1 to 6.

mean, median, mode, range, interquartile range, variance and standard deviation.

In each of the following situations, one of these measures is to be found by the person described. State the appropriate measure in each case.

(i)

................................................... [1]

(ii)

An athlete who competes in the 100 metres sprint finds the difference between his

slowest and quickest practice times.

................................................... [1]

(iii)

A graduate who seeks employment with a company finds a measure of central tendency for the salaries of the company's employees. The company has twenty employees, of whom three are managers earning salaries very much higher than the other employees.

................................................... [1]

(iv)

A teacher finds a measure of dispersion for the scores of her pupils in a test, in which no

pupil scored an exceptionally high mark, and no pupil scored an exceptionally low mark.

................................................... [1]

(v)

A biologist finds a measure of dispersion for the growth of twelve plants over a period of

three months. Two plants have been attacked by insects and have grown very much less

than the others.

................................................... [1]

(vi)

A sociologist finds a measure of central tendency for the first names given to the male

babies born in a hospital over a period of six months.

................................................... [1]

4040/12/O/N/13

2

A large keep fit class for women is held at a sports club once every week. The manager of

the club asks the class instructor to select a sample of size 10 from the class.

(i)

(a) the first 10 women to arrive at the class,

................................................... [1]

(b) women at regular intervals from the class register.

................................................... [1]

The sample is required to obtain responses to a proposal to change the time of the class from

Monday evening to Monday afternoon. For class members the only items of data presently

available to the instructor are name and age.

(ii)

State, and justify, two other items of data relating to class members which the instructor

needs to know when selecting the sample in order to avoid bias in responses. You are

not required to describe how the sample is selected.

..........................................................................................................................................

..........................................................................................................................................

..........................................................................................................................................

..........................................................................................................................................

..........................................................................................................................................

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [4]

3

In a photographic equipment store a record was kept of the number of cameras sold each

day. The values, for eleven consecutive days, were as follows.

6

(i)

the mode,

................................................... [1]

(ii)

................................................... [2]

(iii)

the median.

................................................... [2]

The values recorded for the next three days were x, x + 1 and x + 2.

(iv)

If the median for the entire fourteen-day period was the same as the median for the first

eleven days, find x.

x = .................................................. [1]

4

The diagram below shows the number of actors at a film festival who have worked in one or

more of the cities Mumbai, Los Angeles and Rome.

[Venn diagram: three overlapping circles labelled Mumbai, Los Angeles and Rome; the region values 5, 13, 4 and 3 are legible.]

(i)

................................................... [1]

(ii)

..........................................................................................................................................

...................................................................................................................................... [1]

Find the probability of selecting an actor who has worked in

(iii)

................................................... [2]

(iv)

................................................... [1]

(v)

Rome, given that the actor has worked in Mumbai and Los Angeles.

................................................... [1]

5

A charity, Camfam, classifies the income it receives under the headings Special Events, Donations, Grants, and Other Sources. In Camfam's report for 2010, the following percentage bar chart was given, which represents a total income of $80 million.

[Percentage bar chart for 2010: horizontal axis "Percentage" marked from 10 to 100 in steps of 10; segments labelled Special Events, Donations, Grants and Other Sources.]

(i)

$ ................................................... [2]

(ii)

If a pie chart were to be drawn to represent this information, find the angle which would

represent the sector for Special Events.

................................................... [2]

Camfam's total income in 2011 was $60 million.

Two pie charts, one for 2010 and one for 2011, are to be presented together in a new report.

(iii)

Find, in its simplest terms, the ratio of the area of the chart representing 2010 to the

area of the chart representing 2011.

................................................... [2]
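The arithmetic behind part (iii) can be sketched as follows. This assumes the usual convention that comparative pie charts are drawn with areas proportional to the totals they represent; the variable names are illustrative only.

```python
from fractions import Fraction

# Total incomes in $ million, as given in the question.
income_2010 = 80
income_2011 = 60

# Assumption: chart area is proportional to the total represented,
# so the ratio of the areas equals the ratio of the totals.
area_ratio = Fraction(income_2010, income_2011)
print(f"{area_ratio.numerator} : {area_ratio.denominator}")
```

`Fraction` reduces the ratio to its simplest terms automatically.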

6

The following table is to show the distance, in kilometres, between any two of the five towns

A, B, C, D and E.

A
42    B
__    __    C
20    39    __    D
36    18    __    25    E

(i)

(a) The distance between B and C is 10 km more than the distance between D and E.

[1]

(b) The distance between A and C is two thirds of the distance between A and E.

[1]

(c) The distance between A and B is twice the distance between C and E.

[1]

(d) C is 19 km further from D than B is from E.

[1]

Dimitri lives in town A, but has one friend in each of the towns D and E. He makes a journey

in which he leaves his home, visits each of these friends once, and then returns home.

(ii)

............................................. km [2]

Section B [64 marks]

Each question in this section carries 16 marks.

In this question the fertility rate of a population is defined as the number of births per 1000

females.

The table below gives information about the female population and age group fertility rates in

a particular city for the year 2012, together with the standard population of the area in which

the city is situated.

Age group     Population of females   Age group        Standard population   Births
of females    in age group            fertility rate   of females (%)
Under 20      2900                    50               18
20 – 29       4500                    184              22
30 – 39       5250                    136              25
Over 39       5800                    15               35

(i)

Calculate, to 1 decimal place, the standardised fertility rate for the city.

................................................... [4]
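A minimal sketch of the standardised-rate calculation, assuming the table columns are read as age-group fertility rate (births per 1000 females) and standard population percentage. The list names are illustrative.

```python
# Age-group fertility rates (births per 1000 females) and the standard
# population percentages, in the order Under 20, 20-29, 30-39, Over 39.
fertility_rates = [50, 184, 136, 15]
standard_pct = [18, 22, 25, 35]

# Standardised rate: each age-group rate weighted by its share of the
# standard population.
weighted_sum = sum(r * s for r, s in zip(fertility_rates, standard_pct))
standardised_rate = weighted_sum / sum(standard_pct)
print(round(standardised_rate, 1))
```

Dividing by the sum of the weights keeps the calculation valid whether the standard population is given as percentages or as counts.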

(ii)

Calculate the number of births for each age group and insert the values in the table

above.

[2]

(iii)

Calculate, to 1 decimal place, the crude fertility rate for the city.

................................................... [4]
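Parts (ii) and (iii) can be sketched together. The births in each age group follow from population × rate / 1000, and the crude rate is total births per 1000 of the total female population. The values are read from the table; the names are illustrative.

```python
# Female population and fertility rate (per 1000 females) by age group.
populations = [2900, 4500, 5250, 5800]
fertility_rates = [50, 184, 136, 15]

# Births in each age group.
births = [p * r / 1000 for p, r in zip(populations, fertility_rates)]

# Crude fertility rate: total births per 1000 females overall.
crude_rate = 1000 * sum(births) / sum(populations)
print(births, round(crude_rate, 1))
```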

There are equal numbers of males and females in the city and in the standard population.

The standardised and crude death rates for the city in 2012 were 8.5 and 7.8 per thousand

of the population respectively.

(iv)

Using one of these values, and any other appropriate values from parts (i), (ii) and (iii),

find the increase in the population of the city in 2012 due to births and deaths.

................................................... [5]
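One way to approach part (iv), offered as a sketch rather than the official mark scheme. It assumes the crude death rate is the appropriate value, since it describes the city's actual population, whereas the standardised rate describes a hypothetical standard population. The births total is taken from the table values above.

```python
# Female population of the city, summed over the four age groups.
female_population = 2900 + 4500 + 5250 + 5800

# Equal numbers of males and females, as stated in the question.
total_population = 2 * female_population

total_births = 145 + 828 + 714 + 87   # births by age group, from part (ii)
crude_death_rate = 7.8                # deaths per 1000 of the population

deaths = crude_death_rate * total_population / 1000
natural_increase = total_births - deaths
print(round(natural_increase))
```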

It is not possible to obtain an accurate measure of population increase or decrease in a city

from information on births and deaths alone.

(v)

..........................................................................................................................................

...................................................................................................................................... [1]

8

In a large residential building there are 120 apartments, of which 50 are private apartments

(owned by the residents) and 70 are company apartments (owned by the company which

constructed the building).

If two apartments are chosen at random, find the probability of choosing

(i)

................................................... [2]

(ii)

................................................... [2]

The weekly rents, in dollars, charged on the company apartments are represented in the

histogram below, from which one rectangle, representing the $400 to under $500 class, has

been omitted.

[Histogram: vertical axis "Number of apartments per $50", with 10, 15, 20 and 25 marked; horizontal axis weekly rent from $200 to $500 in steps of $50; the rectangle for the $400 to under $500 class is omitted.]

Use the histogram to find the number of company apartments for which the weekly rent was

(iii)

................................................... [2]

(iv)

................................................... [2]

There were 10 company apartments for which the weekly rent was from $400 to under $500.

(v)

the $400 to under $500 class.

[1]

(vi)

Write down the term used to describe the $300 to under $350 class.

................................................... [1]

The private apartments are of three different sizes. There are 24 apartments with three

rooms, 14 with four rooms, and 12 with five rooms.

A safety expert, conducting a survey on the use of smoke detectors, chooses three private

apartments at random.

(vii)

If the apartments chosen have 12 rooms in total, find the probability that the apartments

are all of the same size.

................................................... [6]
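A sketch of the conditional probability in part (vii), counting unordered selections. It enumerates every combination of apartment sizes that totals 12 rooms, then takes the share in which all three apartments are the same size. The dictionary name and the enumeration approach are illustrative choices, not taken from the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement
from math import comb

# Number of private apartments available for each room count.
apartments = {3: 24, 4: 14, 5: 12}

# Count the selections of three apartments for each size pattern
# whose rooms total 12.
ways = {}
for sizes in combinations_with_replacement(apartments, 3):
    if sum(sizes) == 12:
        count = 1
        for size, k in Counter(sizes).items():
            count *= comb(apartments[size], k)
        ways[sizes] = count

same_size = sum(n for sizes, n in ways.items() if len(set(sizes)) == 1)
p_same_size = Fraction(same_size, sum(ways.values()))
print(ways, p_same_size)
```

Only two patterns survive the filter: three four-room apartments, or one apartment of each size.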

9

The mid-day temperature at a particular location in a city was measured every day throughout

the year 2010. The following table summarises the results obtained.

Temperature (°C)    Number of days    Cumulative frequency
0 – under 5
5 – under 10        25
10 – under 15       52
15 – under 20       81
20 – under 25       79
25 – under 30       68
30 – under 35       37
35 – under 40       15

(i)

[2]

(ii)

Plot the cumulative frequencies on the grid opposite, joining the points by a smooth

curve.

[3]

(iii)

(a) the median of these temperatures,

.............................................. °C [1]

(b) the interquartile range of these temperatures.

.............................................. °C [4]
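The paper expects readings from a hand-drawn smooth curve. As a rough numerical check, the sketch below uses linear interpolation within classes instead, so its values will differ slightly from a graph reading. It also assumes the frequency for the 0 to under 5 class, which is not legible in this copy, can be inferred from the 365 days of 2010.

```python
# Class boundaries in degrees C and the legible class frequencies.
boundaries = [0, 5, 10, 15, 20, 25, 30, 35, 40]
known = [25, 52, 81, 79, 68, 37, 15]

# Assumption: the first class frequency is inferred from the 365 days in 2010.
frequencies = [365 - sum(known)] + known

# Cumulative frequency at each class boundary.
cumulative = [0]
for f in frequencies:
    cumulative.append(cumulative[-1] + f)

def value_at(position):
    """Temperature at a cumulative position, by linear interpolation."""
    for i in range(1, len(cumulative)):
        if position <= cumulative[i]:
            lower = boundaries[i - 1]
            width = boundaries[i] - lower
            return lower + width * (position - cumulative[i - 1]) / frequencies[i - 1]
    raise ValueError("position exceeds the total frequency")

n = cumulative[-1]
median = value_at(n / 2)
iqr = value_at(3 * n / 4) - value_at(n / 4)
print(cumulative[1:], round(median, 1), round(iqr, 1))
```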

[Grid for the cumulative frequency curve: vertical axis marked from 50 to 400 in steps of 50; horizontal axis "Temperature (°C)" marked from 10 to 40 in steps of 5.]

When the results were obtained, a scientist predicted that, because of climate change, temperatures in the city would increase at the rate of 0.5 °C every ten years.

Assume that this prediction is accurate.

For this particular location,

(iv) use your answers to part (iii) to estimate, for the year 2050,

(a) the median of the mid-day temperatures,

........................................ °C [2]

(b) the interquartile range of the mid-day temperatures,

........................................ °C [1]

(v)

use your graph to estimate, for the period 2010 to 2050, the increase in the number of days with a mid-day temperature of more than 36 °C.

................................................... [3]
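Parts (iv) and (v) rest on one idea: if every temperature rises by the same amount, measures of location shift by that amount while measures of spread are unchanged. A day above 36 °C in 2050 therefore corresponds to a day above 36 °C minus the shift in 2010. The sketch below uses linear interpolation rather than a graph reading, and assumes the illegible first class frequency is 8 (365 minus the other frequencies).

```python
# Class boundaries and cumulative frequencies for 2010.
# Assumption: the first cumulative value (8) is inferred from 365 days.
boundaries = [0, 5, 10, 15, 20, 25, 30, 35, 40]
cumulative = [0, 8, 33, 85, 166, 245, 313, 350, 365]

def days_above(temp):
    """Estimated number of days warmer than temp, by linear interpolation."""
    for i in range(1, len(boundaries)):
        if temp <= boundaries[i]:
            frac = (temp - boundaries[i - 1]) / (boundaries[i] - boundaries[i - 1])
            below = cumulative[i - 1] + frac * (cumulative[i] - cumulative[i - 1])
            return cumulative[-1] - below
    return 0.0

# 0.5 degrees per decade over the four decades from 2010 to 2050.
shift = 0.5 * (2050 - 2010) / 10

# The median rises by `shift`; the interquartile range does not change.
increase_in_hot_days = days_above(36 - shift) - days_above(36)
print(shift, round(increase_in_hot_days, 1))
```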

10 Emilie, a student teacher, conducted research on the number of pupils and the number

of teachers in the schools in the town of Astra, where she lives. The schools supplied the

following data.

School
Number of pupils, x      760   1219   927   470   1361   628   381   1085
Number of teachers, y     29     44    33    34     52    24    16     40

(i)

[Grid for the scatter diagram: vertical axis y, "Number of teachers", marked from 10 to 60 in steps of 10; horizontal axis x, "Number of pupils", marked from 200 to 1400 in steps of 200.]

[2]

The data have an overall mean of (853.875, 34) and an upper semi-average of (1148, 42.25).

(ii)

[2]

(iii)

................................................... [2]

(iv)

Without plotting the averages, and without drawing the line, find the equation of the

line of best fit in the form y = mx + c.

................................................... [3]
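A sketch of part (iv), assuming the method of semi-averages, in which the line of best fit passes through the overall mean and the semi-averages. The two points are those quoted in the question; the variable names are illustrative.

```python
# Points quoted in the question.
overall_mean = (853.875, 34.0)
upper_semi_average = (1148.0, 42.25)

x1, y1 = overall_mean
x2, y2 = upper_semi_average

# Gradient and intercept of the straight line through the two points.
m = (y2 - y1) / (x2 - x1)
c = y1 - m * x1
print(f"y = {m:.4f}x + {c:.2f}")
```

The intercept is the predicted number of teachers for a school with no pupils, which is the point part (v) asks about.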

(v)

Explain briefly why the value of c which you have found in part (iv) might give you

cause for concern.

..........................................................................................................................................

...................................................................................................................................... [1]

Emilie discovered later that the data supplied by one of the schools gave, incorrectly, the

total number of people employed by the school, and not the number of teachers.

(vi)

Ignoring the point representing the school which supplied incorrect data, draw, by eye,

on the grid in part (i), a line of best fit through the remaining seven points.

[1]

(vii)

Use the line you have drawn in part (vi) to find its equation in the form y = mx + c.

................................................... [3]

Emilie repeated the research for schools in the nearby town of Belport, for which she found

the equation of the line of best fit to be y = 0.0431x + 1.72 .

(viii)

Using this equation, and your answer to part (vii), state in which of the two towns a pupil

might choose to be educated, if free to choose. Explain your answer briefly.

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [2]

11

(i)

Give one advantage and one disadvantage of forming a large set of data into a grouped

frequency distribution.

Advantage.........................................................................................................................

..........................................................................................................................................

Disadvantage ....................................................................................................................

...................................................................................................................................... [2]

The presenter of a radio programme, in which recordings of popular songs are played, plans

his programme. For each song chosen he writes down the song length, in terms of time, in

minutes, taken to play the song. The following table summarises the song lengths.

Song length (minutes)    Number of songs

(ii)

Estimate, in minutes, the mean and standard deviation of these song lengths. Give your

answers to 3 significant figures.

Mean = ......................................................

Standard deviation = .................................................. [8]

Information about five of the presenter's earlier programmes is shown below.

Programme    Number of songs    Mean song length    Standard deviation of
             played             (minutes)           song lengths (minutes)
             38                 3.70                0.339
             39                 3.52                0.328
             42                 3.69                0.294
             37                 3.83                0.305
             38                 3.74                0.291

(iii)

(a) shortest in length,

................................................... [1]

(b) most similar in length.

................................................... [1]

All the presenter's programmes are three hours in duration. Songs are not played continuously throughout each programme; for some of the time the presenter talks about the songs and the singers.

A listener switched on programme P at a random time during its transmission.

(iv)

Find the probability that a song was not being played at that moment.

................................................... [4]
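A sketch of the reasoning for part (iv): the total time spent playing songs is the number of songs multiplied by their mean length, and the required probability is the remaining share of the three-hour programme. The programme labels are not legible in this copy, so the first row of the table is used purely as a hypothetical illustration; the function name is also illustrative.

```python
def prob_no_song(num_songs, mean_length_min, programme_min=180):
    """Probability that no song is playing at a random moment."""
    playing_time = num_songs * mean_length_min
    return (programme_min - playing_time) / programme_min

# Hypothetical illustration: 38 songs with a mean length of 3.70 minutes
# (first table row; which row is programme P is an assumption).
p = prob_no_song(38, 3.70)
print(round(p, 3))
```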

Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every

reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the

publisher will be pleased to make amends at the earliest possible opportunity.

University of Cambridge International Examinations is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of

Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.

General Certificate of Education Ordinary Level

4040/13

STATISTICS

Paper 1

October/November 2013

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use a soft pencil for any diagrams or graphs.

Do not use staples, paper clips, highlighters, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (CW/CGW) 66919/2

UCLES 2013

Section A [36 marks]

Answer all of the questions 1 to 6.

A survey was carried out to discover whether the quantity of traffic on a busy road was sufficient to justify the installation of a pedestrian crossing. At intervals throughout one day an investigator recorded the number of vehicles passing the proposed location in periods of 30 seconds' duration.

The numbers he recorded were:

12 51 64 55 51 61 31 22 20 15 34 14 69 35

When his record sheet was examined the number shown here as was illegible, but it was

certainly a single-digit number.

Although this number is unknown, name, but do not calculate,

(i)

.......................................................

................................................... [2]

(ii)

................................................... [1]

(iii)

................................................... [1]

(iv)

.......................................................

................................................... [2]

4040/13/O/N/13

2

The pie chart below illustrates the distribution by location of the total net profit of $787 million

earned by an international company in the year 2011.

[Pie chart with four sectors labelled Asia, North America, Rest of the World and Europe.]

(i)

Measure, to the nearest degree, the sector angles of the pie chart, and insert them in

the appropriate places on the chart.

[2]

(ii)

Calculate, to the nearest $million, the net profit of the company in Asia.

(iii)

Measure and state the radius, in centimetres, of the above pie chart.

............................................. cm [1]

The total net profit of the same company in the year 2005 was $523 million.

(iv)

pie chart for 2005.

............................................ cm [2]
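A sketch of the scaling used for comparable pie charts, assuming area is proportional to the total represented, so that the radius scales with the square root of the ratio of the totals. The 5.0 cm radius is a hypothetical measurement, not a value from the paper.

```python
from math import sqrt

def comparable_radius(radius_cm, total, other_total):
    """Radius of a second chart whose area is proportional to other_total."""
    return radius_cm * sqrt(other_total / total)

# Hypothetical measured radius of the 2011 chart, in centimetres.
radius_2011 = 5.0
radius_2005 = comparable_radius(radius_2011, 787, 523)
print(round(radius_2005, 1))
```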

3

A factory employs both male and female staff in each of the three categories managerial,

inspection and production.

There are altogether 3500 employees, of whom 2150 are male. There are a total of 660

managerial staff, 540 male inspection staff and 785 female production staff.

(i)

          Managerial    Inspection    Production    TOTAL
Male
Female
TOTAL

[1]

Two thirds of the managerial staff are female.

(ii)

[5]
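The two-way table can be completed by working outwards from the given totals. The sketch below uses only figures stated in the question; the variable names are illustrative.

```python
# Figures stated in the question.
total_staff, male_total = 3500, 2150
managerial_total = 660
male_inspection, female_production = 540, 785

# Two thirds of the managerial staff are female.
female_managerial = managerial_total * 2 // 3
male_managerial = managerial_total - female_managerial

# Remaining cells follow from the row and column totals.
female_total = total_staff - male_total
female_inspection = female_total - female_managerial - female_production
male_production = male_total - male_managerial - male_inspection
inspection_total = male_inspection + female_inspection
production_total = male_production + female_production

print(male_managerial, male_inspection, male_production, male_total)
print(female_managerial, female_inspection, female_production, female_total)
print(managerial_total, inspection_total, production_total, total_staff)
```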

4

There are 50 girls in their final year at a school. The diagram below illustrates the number of

the girls who play each of the sports badminton, volleyball and handball.

[Venn diagram: three overlapping circles labelled Badminton, Volleyball and Handball; the region values 6, 11, 9, 8 and 3 are legible.]

(i)

x = ......................................................

..........................................................................................................................................

...................................................................................................................................... [2]

(ii)

Find

(a)

................................................... [1]

(b)

how many more girls play exactly two sports than play exactly one sport.

................................................... [1]

Half of the girls who play volleyball and two thirds of the girls who play only handball say they

intend to continue playing sport after they have left school.

(iii)

Find the number of girls who intend to continue playing sport after they have left school.

................................................... [2]

5

The times taken, in minutes, by 174 people to complete an aptitude test are summarised in

the following table.

Time (minutes)    Number of people    Height of rectangle (units)
10 – under 30     28
30 – under 40     36                  18
40 – under 45     40
45 – under 50     32
50 – under 75     20
75 – under 120    18
TOTAL             174

The times are to be illustrated by a histogram, in which the 30 – under 40 class is represented by a rectangle of height 18 units.

(i)

Calculate the height of the rectangle representing the 40 – under 45 class, and insert the value in the table.

[1]

(ii)

Calculate the heights of the rectangles representing the remaining four classes, and

insert the values in the table.

[3]

(iii)

If the final two classes were combined into a single 50 – under 120 class, calculate, to 2 decimal places, the height of the rectangle which would represent the combined class.

................................................... [2]
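In a histogram, height is proportional to frequency density (frequency divided by class width). The sketch below fixes the scale from the 30 to under 40 class, which has height 18, and applies it to the other classes. The tuple layout is an illustrative choice.

```python
# (lower bound, upper bound, frequency) for each class in the table.
classes = [(10, 30, 28), (30, 40, 36), (40, 45, 40),
           (45, 50, 32), (50, 75, 20), (75, 120, 18)]

# Scale factor: the 30 to under 40 class (36 people, width 10) has height 18.
scale = 18 * 10 / 36

# Height = scale x frequency density.
heights = [scale * freq / (upper - lower) for lower, upper, freq in classes]

# Combined 50 to under 120 class for part (iii).
combined_height = scale * (20 + 18) / (120 - 50)
print(heights, round(combined_height, 2))
```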

6

(a) (i)

Describe the situation which can lead to the method of systematic sampling

producing a biased sample.

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [1]

(ii)

20 of the students. Explain briefly how this could be achieved.

..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [3]

(b) Briefly explain how a population could be stratified, prior to taking a stratified sample, in

order to ascertain the views of members of the public on

(i)

..................................................................................................................................

.............................................................................................................................. [1]

(ii)

aircraft noise.

..................................................................................................................................

.............................................................................................................................. [1]

Section B [64 marks]

Each question in this section carries 16 marks.

(a) A test for a particular disease has a 95% chance of correctly giving a positive result for a

person who has the disease, but a 10% chance of incorrectly giving a positive result for

a person who does not have the disease.

(i)

Find the chance that the test gives a negative result for a person who has the

disease, and insert it in the following table.

                          Person has     Person does not
                          the disease    have the disease
P(test result positive)   0.95

[1]

(ii)

[1]

15% of the people who are tested are believed to have the disease.

A person is chosen at random and tested.

(iii)

Calculate the probability that the test gives a correct result for this person.

................................................... [4]
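A sketch of part (iii): the test is correct either when a person with the disease tests positive or when a person without it tests negative. The probabilities are those given in the question; the variable names are illustrative.

```python
p_disease = 0.15              # proportion believed to have the disease
p_pos_given_disease = 0.95    # correct positive result
p_pos_given_healthy = 0.10    # incorrect positive result

# Correct result: (disease and positive) or (no disease and negative).
p_correct = (p_disease * p_pos_given_disease
             + (1 - p_disease) * (1 - p_pos_given_healthy))
print(round(p_correct, 4))
```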

(b) Give all probabilities in this part of the question as fractions.

The following diagram classifies the members of a tennis club as to whether they are

male or female, left-handed or right-handed, and whether or not they have represented

the club in matches.

[Diagram classifying club members as Male or Female, as Left-handed or Right-handed, and by whether or not they have represented the club; only the value 1 is legible.]

(i)

Calculate the probability that this member has represented the club in matches.

................................................... [1]

A female member is chosen at random.

(ii)

................................................... [2]

A member who has represented the club in matches is chosen at random.

(iii)

................................................... [2]

(c) Laura walks to school. On her route she passes two shops, A and B. The probability that

she will go into shop A on any morning is 0.2, and into shop B is 0.7.

Her decision of whether to go into one of the shops is independent of whether she goes

into the other shop. If she goes into either or both shops the probability that she will be

late for school is 0.09.

(i)

Calculate the probability that on any morning she will go into exactly one shop and

be late for school.

................................................... [3]
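A sketch of part (c)(i), using the stated independence of the two shop visits. "Exactly one shop" means A but not B, or B but not A; the lateness probability then applies to that event. The variable names are illustrative.

```python
p_a, p_b = 0.2, 0.7   # probabilities of going into shop A and shop B
p_late = 0.09         # probability of being late, given at least one shop

# Independent decisions: exactly one shop is (A and not B) or (B and not A).
p_exactly_one = p_a * (1 - p_b) + (1 - p_a) * p_b
p_one_and_late = p_exactly_one * p_late
print(round(p_exactly_one, 2), round(p_one_and_late, 4))
```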

Laura has been told that she must aim to be late on no more than 5% of the schooldays

on which she goes into exactly one shop.

(ii)

..................................................................................................................................

.............................................................................................................................. [2]

8

(a) The table below summarises information about the number of GCE O Level subjects

passed by different numbers of pupils at a school in the year 2011.

Number of subjects (x)
                           12    15    10
Cumulative frequency       10    22    37    47    54

(i)

On the grid below, draw and label two axes, the horizontal axis representing the

number of subjects passed and the vertical axis representing cumulative frequency.

[2]

(ii)

[4]

(b) The cumulative frequency graph below illustrates the lengths of journey times, in

minutes, to their homes of a number of students at a college at the end of one particular

day.

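[Cumulative frequency graph: vertical axis "Cumulative frequency" marked from 40 to 200 in steps of 40; horizontal axis marked from 10 to 60 in steps of 10.]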

Use the graph to estimate

(i)

.................................... minutes [1]

(ii)

..................................... minutes [1]

(iii)

(iv)

the number of students whose journey time was longer than 23 minutes,

................................................... [3]

(v)

................................................... [2]

On the next day, due to bad weather, the journey time of all students was 5 minutes

longer than the original times illustrated in the graph.

Compared with the original times, state, without further calculation, the effect which the

bad weather had on

(vi)

.............................................................................................................................. [1]

(vii)

.............................................................................................................................. [1]


9

The following table gives information about the populations and deaths in two towns, A and

B, during the course of one year, together with the standard population of the area in which

both towns are situated.

Town A

Age

Population

Deaths

0 under 15

5000

45

15 under 45

3750

15

45 under 65

2500

25

65 and over

1250

(i)

q=

Town B

Population

Deaths

Standard

population

6000

66

400

27000

54

300

10

15000

60

200

32

2000

30

100

Death rate

(per thousand)

p=

For town A, calculate the values of p and of q and insert them in the table.

[2]

(ii)

................................................... [4]

(iii)

................................................... [4]

(iv)

Use the population figures given in the table to state why the crude death rate and the

standardised death rate of town A are equal.

..........................................................................................................................................

...................................................................................................................................... [2]


The table shows that far more deaths occurred in town B than in town A during the year,

and yet the standardised death rate for town B is much lower than that for town A.

(v)


..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [2]

whom had died during the year, had been misclassified by being included incorrectly in

the 45 under 65 class, when in fact they were all 65 and over.

(vi)

State, with a reason, the effect, if any, which correcting this error would have on the

crude death rate of town B.

..................................................................................................................................

.............................................................................................................................. [2]
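The crude and standardised death rates in this question can be checked numerically. The sketch below is an illustration, not a model answer. It assumes the town B and standard population figures belong to the four age groups in the order they appear in the table, and the function names are hypothetical.

```python
from fractions import Fraction

# Figures read from the flattened table; the row assignment is an assumption.
population_a = [5000, 3750, 2500, 1250]
population_b = [6000, 27000, 15000, 2000]
deaths_b = [66, 54, 60, 30]
standard_population = [400, 300, 200, 100]


def crude_death_rate(population, deaths):
    """Total deaths per thousand of the total population."""
    return 1000 * sum(deaths) / sum(population)


def standardised_death_rate(population, deaths, standard):
    """Age-group death rates (per thousand) weighted by the standard population."""
    group_rates = [1000 * d / p for p, d in zip(population, deaths)]
    weighted = sum(r * s for r, s in zip(group_rates, standard))
    return weighted / sum(standard)


def age_shares(population):
    """Exact share of each age group in the total population."""
    total = sum(population)
    return [Fraction(p, total) for p in population]


crude_b = crude_death_rate(population_b, deaths_b)
standardised_b = standardised_death_rate(population_b, deaths_b, standard_population)

# A town whose age structure matches the standard population has equal
# crude and standardised death rates.
town_a_matches_standard = age_shares(population_a) == age_shares(standard_population)

print(crude_b, standardised_b, town_a_matches_standard)
```

The weighting step is what separates the two rates: the crude rate uses the town's own age structure, the standardised rate uses that of the standard population.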


10 The time, in minutes, taken by each of 6 children to walk 1 kilometre, is given in the following

table.

(i)

Child

13

15

12

12

23

25

11

18

23

[Grid: vertical axis y, Time (minutes), scale marked 10 to 30; horizontal axis x, Age (years), scale marked 10 to 16.]

[2]

(ii)

Calculate the overall mean and the two semi-averages of the data, and plot them on

your graph.

[5]


(iii)

[1]

(iv)

Using any valid method, obtain the equation of your line of best fit, and write it in the

form y = mx + c.


................................................... [3]

(v)

Use your equation to estimate, to the nearest minute, the time taken to walk 1 kilometre

by a child aged 14 years.

................................................... [1]

(vi)

(a) Comment on how well your line of best fit matches the data points.

..................................................................................................................................

.............................................................................................................................. [1]

(b)

From the graph, identify the child for whom your line of best fit most overestimates

the time taken.

................................................... [1]

(vii)

State, with a reason, whether it would be valid to use your line of best fit to estimate the

time taken to walk 1 kilometre by a person whose age is outside the range of values

given in the table.

..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [2]


11 The following table summarises the increase, in dollars, of the annual income of a sample of

200 people between the years 2006 and 2011 (a negative value indicates a decrease).

Increase in annual income ($x)   Frequency (f)   y = (m - 750) / 250   fy   fy²
-2500 to under 0                 14
0 to under 1500                  99
                                 39
                                 25
                                 23
TOTAL                            200

(i)

Obtain the mid-point, m, for each of the five classes and insert the values in the table.

[1]

(ii)

For each class, obtain the value of the scaled variable, y, where

y = (m - 750) / 250,

and insert the values of y in the table.

[2]


(iii)

Obtain the values of Σfy and Σfy² and use them to estimate the values of the mean of y

and the variance of y.

Mean = ......................................................

Variance = .................................................. [7]

(iv)

(a)

the mean of x,

................................................... [2]

(b)

the variance of x.

................................................... [3]

(v)

................................................... [1]
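The scaling y = (m - 750) / 250 can be undone with mean of x = 250 × (mean of y) + 750 and variance of x = 250² × (variance of y). The sketch below checks this against a direct calculation. The frequencies and the first two mid-points come from the table; the last three mid-points are hypothetical, because those class labels are not legible here.

```python
def grouped_mean_var(values, freqs):
    """Mean and variance of grouped data from class values and frequencies."""
    n = sum(freqs)
    mean = sum(f * v for v, f in zip(values, freqs)) / n
    var = sum(f * v * v for v, f in zip(values, freqs)) / n - mean ** 2
    return mean, var


freqs = [14, 99, 39, 25, 23]
midpoints = [-1250, 750, 2000, 3000, 4500]  # last three are hypothetical

# Scale each mid-point, then work with the smaller coded values.
coded = [(m - 750) / 250 for m in midpoints]
mean_y, var_y = grouped_mean_var(coded, freqs)

# Decode: the shift affects only the mean, the scale factor affects both.
mean_x = 250 * mean_y + 750
var_x = 250 ** 2 * var_y

direct_mean, direct_var = grouped_mean_var(midpoints, freqs)
print(mean_x, var_x)
```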


Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every

reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the

publisher will be pleased to make amends at the earliest possible opportunity.

University of Cambridge International Examinations is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of

Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.


General Certificate of Education Ordinary Level

* 9 5 0 8 8 4 8 7 2 6 *

4040/22

STATISTICS

Paper 2

October/November 2013

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use a soft pencil for any diagrams or graphs.

Do not use staples, paper clips, highlighters, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (RW/CGW) 66922/2

UCLES 2013

[Turn over

2

Section A [36 marks]


Events A, B, C and D are four of the possible outcomes of an experiment such that

P(A) = 0.15, P(B) = 0.2, P(C) = 0.4 and P(D) = 0.24.

(i)

(a) P(A B),

................................................... [2]

(b) P(A B).

................................................... [2]

(ii)

(a) P(C D),

................................................... [1]

(b) P(C D).

................................................... [1]


2

(i)

The annual salaries of the employees at a company have a mean of $m and a standard

deviation of $s, where s > 0.

A new employee arrives at the company and is paid an annual salary of $m.

The mean and standard deviation of the salaries of the employees are now recalculated

to include the salary of the new employee.


For each of the mean and the standard deviation, state whether it will increase,

decrease, or stay the same when this new employee's salary is included.

Mean .......................................................

Standard deviation ................................................... [2]

(ii)

At another company, at the end of 2011, the employees' annual salaries had a mean of

$12 000 and a standard deviation of $1000.

During 2012, each of the employees' salaries increased by 5%. At the end of that year

they each also received an annual bonus of $200.

Calculate the mean and standard deviation of the annual incomes (salaries plus

bonuses) of the employees at the end of 2012.

Mean $ .......................................................

Standard deviation $ ................................................... [4]
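Part (ii) is a linear transformation of the data: each salary is multiplied by 1.05 and then increased by 200. A minimal sketch of the general rule follows. The helper name and the three-salary list are hypothetical, used only to confirm the rule on actual numbers.

```python
import statistics


def transform_mean_sd(mean, sd, scale, shift):
    """Mean and standard deviation after mapping every value x to scale * x + shift."""
    # The shift moves the mean but leaves the spread unchanged.
    return scale * mean + shift, abs(scale) * sd


new_mean, new_sd = transform_mean_sd(12000, 1000, 1.05, 200)

# Check the rule on a small made-up data set.
salaries = [11000, 12000, 13000]
incomes = [1.05 * s + 200 for s in salaries]
rule_mean, rule_sd = transform_mean_sd(
    statistics.mean(salaries), statistics.pstdev(salaries), 1.05, 200
)

print(round(new_mean, 2), round(new_sd, 2))
```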


3

They each have 4 cards, which are numbered 1, 2, 3 and 4.

Each shuffles her own cards and turns one over at random.

(i)


If the cards show the same number, Ariana wins and Bella must pay Ariana $3.

If the cards show different numbers, Bella wins and Ariana must pay Bella $1.

By finding the probabilities of Ariana and Bella winning, show whether or not the game

is fair.

[3]

(ii)

In a second game the numbers shown on the cards are added together.

If the total is 4 or less, Ariana wins and Bella must pay Ariana $5.

If the total is 5 or more, Bella wins.

If the game is to be fair, how much should Ariana pay Bella if Bella wins?

$ ................................................... [3]
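Both games can be checked by listing the 16 equally likely pairs of cards. The sketch below assumes the two players are Ariana and Bella, each turning over one card from her own set of four, and treats a game as fair when Ariana's expected gain is zero.

```python
from fractions import Fraction
from itertools import product

# Every (Ariana's card, Bella's card) pair is equally likely.
outcomes = list(product(range(1, 5), repeat=2))

# Game 1: Ariana wins $3 on a match, otherwise pays $1.
p_same = Fraction(sum(a == b for a, b in outcomes), len(outcomes))
ariana_expected_gain = 3 * p_same - 1 * (1 - p_same)

# Game 2: Ariana wins $5 when the total is 4 or less.
p_low_total = Fraction(sum(a + b <= 4 for a, b in outcomes), len(outcomes))
# Fair payment to Bella balances the two expected amounts.
fair_payment = 5 * p_low_total / (1 - p_low_total)

print(p_same, ariana_expected_gain, p_low_total, fair_payment)
```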


4

The pupils in a class should arrive for registration at 9.00 am. On one particular day, 25

pupils were early, with a mean arrival time of 8.51 am. On the same day, 9 pupils were late

with a mean arrival time of 9.21 am, and 2 pupils arrived at 9.00 am exactly.


If x represents the number of minutes a pupil was late (a pupil who was early would have a

negative value of x),

(i)

find Σx, and hence find the mean arrival time for all 36 pupils.

Σx = .......................................................

Mean = ................................................... [3]

If Σx² = 5096 for the 36 pupils,

(ii)

................................................... [3]
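The total of x comes from multiplying each group's size by its mean lateness. The sketch below works this through. The wording of part (ii) is not legible here, so using the sum of squares to find a standard deviation is an assumption.

```python
from math import sqrt

# (number of pupils, mean minutes late); early arrivals are negative.
groups = [(25, -9), (9, 21), (2, 0)]

n = sum(count for count, _ in groups)
sum_x = sum(count * mean_late for count, mean_late in groups)
mean_x = sum_x / n  # minutes relative to 9.00 am

# Assumed use of the given sum of squares: standard deviation of x.
sum_x_squared = 5096
sd_x = sqrt(sum_x_squared / n - mean_x ** 2)

print(n, sum_x, mean_x, round(sd_x, 2))
```

A mean of x below zero means that, on average, pupils arrived before 9.00 am.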


5

The change in a country's annual production (in millions of tonnes) of 4 commodities between

2011 and 2012 is shown in the change chart below.

[Change chart: bars for Wheat, Rice, Cotton and Maize on a scale from -3 to 3. Axis label: Change in annual production between 2011 and 2012 (in millions of tonnes).]

The quantity produced (in millions of tonnes) of the 4 commodities in 2011 in this country is

shown in the table below.

Commodity   2011 (millions of tonnes)   2012 (millions of tonnes)
Wheat       78.6
Rice        99.2
Cotton      22.6
Maize       17.3

(i)

Use these data and the change chart to find the quantities of the commodities produced

in 2012 and complete the table.

[2]


(ii)

On the grid below, draw a dual bar chart to show the quantities produced in 2011 and

2012 of each of the 4 commodities.


[3]

(iii)

..........................................................................................................................................

...................................................................................................................................... [1]


6

(a) For each of the following state whether the variable is discrete or continuous and

whether it is qualitative or quantitative.

Discrete or Continuous

Qualitative or Quantitative

in a football competition

[1]

in a football competition

[1]

(b) A football team used the diagram below to illustrate the number of goals it had scored

per match in a season in both the league and cup competitions.

[Diagram: vertical axis, Number of matches (scale 0 to 20); horizontal axis, Number of goals; the key includes "matches played in the league".]

(i)

................................................... [1]

(ii)

Explain why the above diagram is more appropriate than a histogram to illustrate

these data.

..................................................................................................................................

.............................................................................................................................. [1]

(iii)

Find the proportion of matches played in the cup in which the team scored 2 or

more goals.

................................................... [2]


Section B [64 marks]


Each question in this section carries 16 marks.

(a) The total number of visitors at a tourist attraction has been recorded for every quarter

over a three-year period.

(i)

establishing the trend in the number of visitors.

..................................................................................................................................

.............................................................................................................................. [1]

(ii)

................................................... [1]

(iii)

..................................................................................................................................

.............................................................................................................................. [2]


(b) A hospital records the number of patients admitted at two-monthly intervals over a

period of two years and the results are shown in the table below, together with the

6-point moving average values for these data.

Year   Period    Number of patients
2010   Jan-Feb   241
2010   Mar-Apr   208
2010   May-Jun   x =
2010   Jul-Aug   185
2010   Sep-Oct   209
2010   Nov-Dec   261
2011   Jan-Feb   259
2011   Mar-Apr   208
2011   May-Jun   174
2011   Jul-Aug   197
2011   Sep-Oct   224
2011   Nov-Dec   270

6-point total   6-point moving average value   Centred moving average value
1272            212
1290            215
1290            215
1296            216
y =             z =
1323            220.5
1332            222

(i)

[3]


(ii)

Calculate the centred moving average values and insert them in the appropriate

places in the table.


[3]

(iii)

Plot the centred moving average values on the grid below and draw a trend line

through the points.

[Grid: vertical axis, Number of patients (scale 210 to 235); horizontal axis, two-monthly periods from Jan-Feb 2010 to May-Jun 2012.]

[3]

(iv)

Explain what the trend line you have drawn tells you.

..................................................................................................................................

.............................................................................................................................. [1]

(v)

Estimate the number of patients admitted to the hospital during the period

Mar-Apr 2012.

................................................... [2]
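The 6-point totals, moving averages and centred moving averages in part (b) follow a mechanical pattern that can be sketched in a few lines. The patient series below is a reconstruction: the order of the 2011 values is inferred from the printed totals, and 168 for May-Jun 2010 is the value implied by the first total of 1272, not a figure printed in the paper.

```python
# Two-monthly admissions, Jan-Feb 2010 to Nov-Dec 2011 (reconstructed; see above).
patients = [241, 208, 168, 185, 209, 261, 259, 208, 174, 197, 224, 270]


def moving_totals(series, window):
    """Sum of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) for i in range(len(series) - window + 1)]


totals = moving_totals(patients, 6)
averages = [t / 6 for t in totals]

# An even window falls between two periods, so adjacent averages are
# averaged again to line up with an actual period.
centred = [(a + b) / 2 for a, b in zip(averages, averages[1:])]

print(totals)
print(centred)
```

The first centred value lines up with Jul-Aug 2010, the fourth period, and the last with May-Jun 2011.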


8

The students at a college take one of three programmes of study: Physics, Chemistry and

Mathematics (PCM) or Physics, Chemistry and Biology (PCB) or Economics, Geography

and Mathematics (EGM). The numbers of students who study each programme are shown in

the table below.

         PCM   PCB   EGM   TOTAL
Male     60    40    40    140
Female   40    90    30    160
TOTAL    100   130   70    300

(i)

(a) is a male studying PCM,

................................................... [1]

(b) is female,

................................................... [1]

(c) is studying Physics as part of their programme,

................................................... [1]

(d) is studying PCB, given that they are male.

................................................... [1]

(ii)

If two different students are chosen at random, find the probability that they are taking

the same programme of study.

................................................... [3]


(iii)

If three different students are chosen at random, find the probability that they are each

taking a different programme of study.


................................................... [3]

Students are required to buy textbooks for each subject that they study: one textbook for each

of Physics, Chemistry and Biology and two textbooks for each of Mathematics, Economics

and Geography.

(iv)

Find how many textbooks a student taking each programme of study must buy, and

complete the table below.

Course                PCM   PCB   EGM
Number of textbooks

[1]

(v)

If one of the textbooks owned by a student at the college is lost at random, find the

probability that it

(a) belongs to a student on the PCM programme,

................................................... [3]

(b) is a Mathematics textbook.

................................................... [2]
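Parts (ii) to (v) are counting problems, so exact fractions are a convenient check. The sketch below assumes that students are chosen without replacement, that every student owns exactly the textbooks required, and that each textbook is equally likely to be the one lost. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

students = {"PCM": 100, "PCB": 130, "EGM": 70}
total_students = sum(students.values())

# (ii) Two students on the same programme.
p_same = Fraction(sum(comb(c, 2) for c in students.values()), comb(total_students, 2))

# (iii) Three students, one from each programme.
p_all_different = Fraction(100 * 130 * 70, comb(total_students, 3))

# (iv) Textbooks per subject, then per programme.
books = {"Physics": 1, "Chemistry": 1, "Biology": 1,
         "Mathematics": 2, "Economics": 2, "Geography": 2}
programmes = {"PCM": ["Physics", "Chemistry", "Mathematics"],
              "PCB": ["Physics", "Chemistry", "Biology"],
              "EGM": ["Economics", "Geography", "Mathematics"]}
books_per_student = {p: sum(books[s] for s in subs) for p, subs in programmes.items()}

# (v) Weight by textbooks owned, not by students.
total_books = sum(students[p] * books_per_student[p] for p in students)
p_book_pcm = Fraction(students["PCM"] * books_per_student["PCM"], total_books)
maths_books = sum(students[p] * books["Mathematics"]
                  for p, subs in programmes.items() if "Mathematics" in subs)
p_book_maths = Fraction(maths_books, total_books)

print(p_same, p_all_different, books_per_student, p_book_pcm, p_book_maths)
```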


9

(a) The values of a variable are formed into a grouped frequency distribution, with one of

the classes stated as 50 - 60. State the true class limits of this class if the variable is

Lower class limit

of flats,

[1]

in mm, to the nearest mm,

[1]

in mm, to the nearest 10 mm.

[1]

(b) A fisherman recorded, in grams (g), to the nearest 100 grams, the masses of 100 fish

he had caught in river A.

Mass (g)      Number of fish   Cumulative frequency
100 - 200     12
300 - 400     31
500 - 700     29
800 - 1000    14
1100 - 1400
1500 - 2000
2100 - 3000

(i)

State, with a reason, which of the mean or the median would be the more

appropriate measure of central tendency to use in this case.

..................................................................................................................................

.............................................................................................................................. [2]

(ii)

[1]

(iii)

masses of the fish.

................................................... [6]


(iv)

The fisherman also recorded the masses of 100 fish caught in river B and found

the interquartile range of the masses of these fish to be 352 g. Explain what this

tells you about the masses of the fish caught in river B compared to those caught in

river A.


..................................................................................................................................

.............................................................................................................................. [1]

(v)

with a mass of less than 650 g.

................................................... [3]


10 A hairdresser classifies the expenditure on her business into three categories: Rent,

Equipment and Wages.

The cost of Rent has increased from $240 per month in 2010 to $256 per month in 2012.

The price relative of Equipment in 2012 is 110, taking 2010 as base year.

The hourly rate of the Wages of her employees has decreased by 2% between 2010 and

2012.

(i)

(a) Calculate the price relative, to the nearest whole number, of Rent for 2012, taking

2010 as base year.

................................................... [2]

(b) Explain what the price relative of 110 for Equipment indicates.

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [3]

(c) State the price relative of Wages for 2012, taking 2010 as base year.

................................................... [1]

(d) Present the price relatives for 2010 and 2012 for each of Rent, Equipment and

Wages in a suitable table.

[2]


The hairdresser wishes to calculate a weighted aggregate cost index, using weights

calculated in 2010, for the three categories.

(ii)


..................................................................................................................................

.............................................................................................................................. [1]

The weights in 2010 for Rent, Equipment and Wages were calculated as 7, 2 and 5

respectively.

(b) Calculate, to the nearest integer, a weighted aggregate cost index for 2012, taking

2010 as base year.

................................................... [3]

(c) Her total expenditure on the hairdressing business in 2010 came to $5760. Use

your answer to part (b) to estimate, to the nearest dollar, her total expenditure on

the business in 2012.

................................................... [2]

(d) Give two possible reasons why this estimate might be very inaccurate.

Reason 1 ...................................................................................................................

..................................................................................................................................

Reason 2 ...................................................................................................................

.............................................................................................................................. [2]
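The price relatives and the weighted index in this question reduce to a few lines of arithmetic. The sketch below is one way to lay it out. It assumes the rounded Rent relative is carried into the index and that the index is rounded before the 2012 expenditure is estimated, as the question wording suggests.

```python
# Price relatives for 2012 with 2010 = 100.
rent_relative = round(100 * 256 / 240)   # rent rose from $240 to $256
equipment_relative = 110                 # given in the question
wages_relative = 100 - 2                 # hourly rate fell by 2%

relatives = {"Rent": rent_relative, "Equipment": equipment_relative, "Wages": wages_relative}
weights = {"Rent": 7, "Equipment": 2, "Wages": 5}

# Weighted mean of the price relatives.
index_exact = sum(relatives[k] * weights[k] for k in weights) / sum(weights.values())
index_2012 = round(index_exact)

# Scale the 2010 expenditure by the index.
estimated_2012 = round(5760 * index_2012 / 100)

print(relatives, index_2012, estimated_2012)
```

The estimate assumes that the 2010 weights still describe the spending pattern in 2012, which is the weakness part (d) asks about.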


11 A small village has a population of 60 people aged 10 and over.

A group of researchers wish to find out what the people of the village think about proposed

changes to the timetable for the buses that pass through the village. Each researcher has a

list of the population and thinks of a different way to select a sample.

(i)

The first researcher plans to stand at the village bus stop at 7 am on a Monday morning

and ask the first six people from the population who come to wait for a bus. Explain why

this might not produce a reliable sample.

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [2]

(ii)

A second researcher decides to take a simple random sample of size six from the

population of 60 people.

(a) Explain what the researcher would need to do with the population list before being

able to select the sample from a random number table.

..................................................................................................................................

.............................................................................................................................. [2]

(b) Use the random number table below, starting at the beginning of the first row

and working along the row, to select a simple random sample of size six from the

population of 60 people, ensuring that no one is selected more than once.

RANDOM NUMBER TABLE

15 08 73 00 60 15 31 52 86 47 82 99 04 33

23 05 65 27 46 13 81 50 49 34 29 08 94 72

.............................................................................................................................. [2]


(iii)

A third researcher decides to take a systematic sample of size six from the population.

(a) Explain clearly how they should use a random number table to select the first value

for such a sample.


..................................................................................................................................

.............................................................................................................................. [1]

(b) Use the random number table below, starting at the beginning of the first row and

working along the row, to select a systematic sample of size six.

RANDOM NUMBER TABLE

36 04 85 06 63 22 16 64 12 51 25 92 74 43

35 75 21 44 56 20 83 59 98 35 27 08 14 69

.............................................................................................................................. [3]


The table below shows the population, split into three different age groups.

Age group          10-18 years   19-65 years   66 years and over   TOTAL
Number of people   20            30            10                  60

(iv)


A fourth researcher decides to take a random sample of size six, stratified by age group.

(a) State how many people from each age group would be needed for such a sample.

10-18 years .......................................................

19-65 years .......................................................

66 years and over ................................................... [1]

(b) Explain clearly what the researcher would need to do before selecting the random

sample, stratified by age group, from a random number table.

..................................................................................................................................

.............................................................................................................................. [2]

(c) Use the random number table below, starting at the beginning of the first row and

working along the row, to select a random sample of size six, stratified by age

group, ensuring that no one is selected more than once. Use every number if the

age group to which it relates has not yet been fully sampled.

RANDOM NUMBER TABLE

17 55 82 25 07 16 35 42 89 37 91 98 24 38

77 29 38 02 47 19 80 53 16 40 28 07 94 73

.............................................................................................................................. [2]

(d) Explain why a random sample, stratified by age group, might be a good idea in this

situation.

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [1]
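The three selection methods in this question can be sketched as small functions. This is an illustration under stated assumptions: the population is numbered 01 to 60 (numbering 00 to 59 is equally valid and gives a different sample), the systematic start is the first table entry between 01 and the sampling interval, and the function names are hypothetical.

```python
def simple_random_sample(table, population_size, sample_size):
    """Take table entries in order, skipping out-of-range numbers and repeats."""
    chosen = []
    for number in table:
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample_size:
                break
    return chosen


def systematic_sample(table, population_size, sample_size):
    """Random start within the first interval, then every interval-th person."""
    interval = population_size // sample_size
    start = next(n for n in table if 1 <= n <= interval)
    return [start + k * interval for k in range(sample_size)]


def stratified_allocation(strata_sizes, sample_size):
    """Sample sizes proportional to stratum sizes (exact division assumed)."""
    total = sum(strata_sizes.values())
    return {name: sample_size * size // total for name, size in strata_sizes.items()}


# First rows of the random number tables printed in parts (ii) and (iii).
row_ii = [15, 8, 73, 0, 60, 15, 31, 52, 86, 47, 82, 99, 4, 33]
row_iii = [36, 4, 85, 6, 63, 22, 16, 64, 12, 51, 25, 92, 74, 43]

srs = simple_random_sample(row_ii, 60, 6)
systematic = systematic_sample(row_iii, 60, 6)
allocation = stratified_allocation({"10-18": 20, "19-65": 30, "66 and over": 10}, 6)

print(srs, systematic, allocation)
```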


General Certificate of Education Ordinary Level

* 3 3 7 3 5 2 4 8 2 4 *

4040/23

STATISTICS

Paper 2

October/November 2013

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use a soft pencil for any diagrams or graphs.

Do not use staples, paper clips, highlighters, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (LEG/CGW) 66936/4

UCLES 2013

[Turn over

2

Section A [36 marks]

Answer all of the questions 1 to 6.

(i)

................................................... [1]

(ii)

................................................... [1]

The variables described above are each grouped into classes labelled 0-4, 5-9,

10-14 etc.

State the true lower and upper class limits for the 5-9 class for

(iii)

...................................................................................................................................... [2]

(iv)

the variable described in (ii), after the distances have been rounded to the nearest

integer.

...................................................................................................................................... [2]


2

Give a brief explanation of the meaning of each of the following terms when used in the

calculation of index numbers:

(i)


base year;

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [2]

(ii)

weight;

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [2]

(iii)

price relative.

..........................................................................................................................................

..........................................................................................................................................

...................................................................................................................................... [2]


3

The body lengths (including the tail) of a sample of 45 white-footed Texas mice were

measured in millimetres. 25 of the mice were found to be male and 20 female. The following

table summarises the data obtained on mouse length.

         Number of mice   Sum of lengths   Sum of squares of lengths
Male     25               4325             748 369
Female   20               3060             468 252

(i)

Explain why the mean length of the total sample of 45 mice is not just given by

(mean length of male mice + mean length of female mice) / 2.

..........................................................................................................................................

...................................................................................................................................... [1]

(ii)

Calculate, to 1 decimal place, the mean and the standard deviation of the lengths of the

total sample of 45 mice.

Mean = ..................................................

Standard deviation = .................................................. [5]
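Combining the two groups means pooling their totals, not averaging their means. The sketch below does this for the mouse data. It assumes the third figure in each row of the table is the sum of the squared lengths, a reading that gives each group a sensible variance.

```python
from math import sqrt

# (number of mice, sum of lengths, assumed sum of squared lengths)
groups = {"Male": (25, 4325, 748369), "Female": (20, 3060, 468252)}

n = sum(count for count, _, _ in groups.values())
total_length = sum(s for _, s, _ in groups.values())
total_squares = sum(sq for _, _, sq in groups.values())

# Pooled mean and standard deviation of all 45 lengths.
mean_length = total_length / n
sd_length = sqrt(total_squares / n - mean_length ** 2)

# Unweighted average of the two group means, for comparison.
naive_mean = (4325 / 25 + 3060 / 20) / 2

print(round(mean_length, 1), round(sd_length, 1), naive_mean)
```

The unweighted average differs from the pooled mean because the groups are of unequal size, which is the point of part (i).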


4

Values of experimental readings taken by different people are to be scaled for purposes of

comparison. The readings have a mean of 37 and a standard deviation of 5.

The scaled values are to have a mean of 100 and a standard deviation of 10.


Calculate

(i)

................................................... [2]

(ii)

................................................... [2]

(iii)

................................................... [2]
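A linear scaling of this kind can be sketched as below. The function names and the sample inputs 42 and 90 are hypothetical, chosen only to show the mapping in both directions; they are not taken from the question parts.

```python
def scale(x, old_mean=37, old_sd=5, new_mean=100, new_sd=10):
    """Linearly rescale x so the readings' mean and SD map to the new mean and SD."""
    return new_mean + new_sd * (x - old_mean) / old_sd

def unscale(y, old_mean=37, old_sd=5, new_mean=100, new_sd=10):
    """Inverse of scale(): recover the original reading from a scaled value."""
    return old_mean + old_sd * (y - new_mean) / new_sd

print(scale(37))     # 100.0 (the mean maps to the new mean)
print(scale(42))     # 110.0 (one SD above the mean stays one SD above)
print(unscale(90))   # 32.0
```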


5

[Bar chart for School A and School B]

The bar chart above is intended to illustrate information about how many boys and girls

attend each of two schools, A and B.

(i)

The bar chart is incomplete. List three items of detail which are missing.

...................................................

...................................................

................................................... [2]

(ii)

................................................... [1]

(iii)

Explain how you know that the bar chart illustrates the actual number of boys and girls,

and not percentages.

..........................................................................................................................................

...................................................................................................................................... [1]

(iv)

Another type of diagram which could be used to illustrate the data is a pictogram. State

a disadvantage of pictograms, compared with bar charts, when illustrating frequencies

such as the number of pupils at a school.

..........................................................................................................................................

...................................................................................................................................... [1]

(v)

Give a reason why a change chart could not be used to illustrate these data.

..........................................................................................................................................

...................................................................................................................................... [1]


6

A farmer classifies the expenditure in running his farm under four headings: Animal Feed,

Labour, Fuel and Professional Services (e.g. veterinary services). The price relatives for each

of these headings for the year 2011, taking 2006 as base year, and the weight allocated by

the farmer to each heading are given in the following table.

                         Price relative    Weight
Animal Feed                   104            14
Labour                        110
Fuel                          107
Professional Services         102

(i) Calculate, correct to 2 decimal places, the overall percentage increase in the farmer's

weighted cost index from 2006 to 2011.

................................................... [4]

(ii)

In 2011 the farmer's income was 7% greater than it had been in 2006. State, with a

reason, whether or not the farm was more profitable than it had been five years earlier.

..........................................................................................................................................

...................................................................................................................................... [2]
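The weighted-index calculation can be sketched as follows. The price relatives are those given in the question, but the weights used here are hypothetical placeholders, so the printed figures illustrate the method only and are not the answer to the question.

```python
def weighted_index(price_relatives, weights):
    """Weighted mean of price relatives: sum(w * r) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * r for r, w in zip(price_relatives, weights)) / total_weight

# Price relatives for 2011 (2006 = 100), in the order Animal Feed, Labour,
# Fuel, Professional Services.
relatives = [104, 110, 107, 102]

# HYPOTHETICAL weights, chosen only to make the sketch runnable.
weights = [4, 3, 2, 1]

index = weighted_index(relatives, weights)
cost_increase = index - 100    # percentage increase over the base year
income_increase = 7            # percentage increase in income, from part (ii)

print(round(cost_increase, 2))           # 6.2
print(income_increase > cost_increase)   # True
```

With these placeholder weights income grew faster than costs; with the real weights the comparison is made in the same way.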


Section B [64 marks]


Each question in this section carries 16 marks.

will not be awarded any marks.

7 The following table summarises the heights, in centimetres, of a sample of 8585 adult males

in the United Kingdom.

Height (cm)
Frequency               144   1232   2213   2559   1709   705   23
Cumulative frequency

(i)

(ii)

[2]

................................................... [1]

(b) Estimate, to 1 decimal place, the median height.

............................................. cm [3]

(iii)

(a) State the class in which the lower quartile height lies.

................................................... [1]

(b) Estimate, to 1 decimal place, the lower quartile height.

............................................. cm [3]
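Locating the median and quartile classes from the frequencies can be sketched as below. The positions n/2 and n/4 are an assumed convention for grouped data, and `interpolate` is a hypothetical helper showing the usual linear interpolation; it is not called here because the class boundaries are not available in this text.

```python
from itertools import accumulate

frequencies = [144, 1232, 2213, 2559, 1709, 705, 23]   # from the table
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

def class_index(position):
    """0-based index of the first class whose cumulative frequency reaches position."""
    return next(i for i, cf in enumerate(cumulative) if cf >= position)

def interpolate(lower_bound, width, cf_before, freq, position):
    """Linear interpolation within a class for the value at a given position."""
    return lower_bound + width * (position - cf_before) / freq

print(cumulative)               # [144, 1376, 3589, 6148, 7857, 8562, 8585]
print(class_index(total / 2))   # 3 (the fourth class holds the median)
print(class_index(total / 4))   # 2 (the third class holds the lower quartile)
```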


The upper quartile height, correct to 1 decimal place, is 175.9 cm.

(iv)


............................................. cm [1]

(b) Compare the distances of the quartiles from the median, and comment on whether

this is what you would expect in a distribution of the heights of a large number of

adult males.

..................................................................................................................................

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [3]

(v)

If a cumulative frequency curve were drawn to illustrate this distribution, state, with a

reason, in which part of the graph the curve would be at its steepest.

..........................................................................................................................................

...................................................................................................................................... [2]


8

Two identical bags each contain a number of coloured balls.

Bag X contains 4 white and 7 blue balls. Bag Y contains 3 blue and 8 red balls.

(i)

A bag is chosen at random and a ball selected at random from it. Find the probability

that the selected ball is blue.

................................................... [3]

(ii)

Two balls are chosen at random from bag Y. Find the probability that they are of the

same colour.

................................................... [3]

(iii)

One ball is chosen at random from each bag. Find the probability that the chosen balls

are of the same colour.

................................................... [4]

(iv)

A bag is chosen at random and two balls are selected at random from it. Find the

probability that both selected balls are white.

................................................... [3]


(v)

All the balls from both bags are emptied into a third bag, bag Z. Two balls are then

chosen at random from bag Z. Find the probability that both selected balls are white.


................................................... [2]

(vi)

Explain briefly why the answer to part (iv) is greater than the answer to part (v).

..........................................................................................................................................

...................................................................................................................................... [1]
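The probabilities in this question can be checked with exact fractions, as in the sketch below. The helper names are made up for the illustration, and all draws of two balls are taken to be without replacement.

```python
from fractions import Fraction as F

bag_x = {"white": 4, "blue": 7}
bag_y = {"blue": 3, "red": 8}
bag_z = {"white": 4, "blue": 10, "red": 8}   # both bags emptied together

def p_one(bag, colour):
    """Probability that a single ball drawn from bag has the given colour."""
    return F(bag.get(colour, 0), sum(bag.values()))

def p_two_same(bag, colour):
    """Probability that two balls drawn without replacement are both this colour."""
    k, n = bag.get(colour, 0), sum(bag.values())
    return F(k, n) * F(k - 1, n - 1)

half = F(1, 2)   # each bag is equally likely to be chosen
p_i = half * p_one(bag_x, "blue") + half * p_one(bag_y, "blue")
p_ii = sum(p_two_same(bag_y, c) for c in bag_y)
p_iii = sum(p_one(bag_x, c) * p_one(bag_y, c) for c in bag_x)
p_iv = half * p_two_same(bag_x, "white") + half * p_two_same(bag_y, "white")
p_v = p_two_same(bag_z, "white")

print(p_i, p_ii, p_iii, p_iv, p_v)   # 5/11 31/55 21/121 3/55 2/77
```

The last two values show the comparison asked for in part (vi): 3/55 is larger than 2/77.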


9

Three unbiased six-sided dice, each with faces numbered 1, 2, 3, 4, 5 and 6, are rolled

simultaneously.

Find the probability that the numbers on the uppermost faces will be

(i)

three 1s,

................................................... [1]

(ii)

................................................... [1]

(iii)

................................................... [3]

A game, in which three such dice are rolled simultaneously and for which the entry fee is $1,

is organised. Prizes are paid for certain outcomes on the uppermost faces, as given in the

following table.

(iv)

Outcome

Three 1s

some other number

Calculate, to the nearest cent, the organiser's expected profit each time the game is

played.

................................................... [3]


In another game, a contestant chooses three cards at random from a set of ten. The numbers

on the cards are 1, 1, 1, 2, 2, 2, 3, 4, 5 and 6. Prizes are again paid as given in the previous

table.

(v)


By first calculating the appropriate probabilities, calculate, to the nearest cent, the entry

fee which should be charged to make this a fair game.

................................................... [8]
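Probabilities for both games can be found by enumerating equally likely outcomes, as sketched below for the outcome "three 1s". The prize values are not available in this text, so `expected_profit` is a hypothetical helper that takes them as arguments; a fair entry fee is the one for which it returns zero.

```python
from fractions import Fraction
from itertools import combinations, product

def p_dice(event):
    """Probability of event over all 6**3 equally likely ordered dice rolls."""
    rolls = list(product(range(1, 7), repeat=3))
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def p_cards(event, cards=(1, 1, 1, 2, 2, 2, 3, 4, 5, 6)):
    """Probability of event over all equally likely selections of three cards."""
    # Choose card positions rather than values, so repeated numbers are counted.
    hands = list(combinations(range(len(cards)), 3))
    hits = sum(1 for h in hands if event(tuple(cards[i] for i in h)))
    return Fraction(hits, len(hands))

def three_ones(outcome):
    return outcome == (1, 1, 1)

def expected_profit(entry_fee, prizes_with_probs):
    """Organiser's expected profit per game: fee minus expected prize payout."""
    return entry_fee - sum(prize * p for prize, p in prizes_with_probs)

print(p_dice(three_ones))    # 1/216
print(p_cards(three_ones))   # 1/120
```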


10 (a) A large housing estate contains approximately equal numbers of three types of dwelling:

detached houses (D), semi-detached houses (S) and bungalows (B). A research

organisation wishes initially to get some idea of how many occupants there tend to be

in each type of dwelling. It has instructed an interviewer to call at four of each type of

dwelling to ask how many people live there but the choice of exactly which dwellings is

up to the interviewer.

(i)

................................................... [1]

(ii)

Give a reason why the research organisation could not just simply use a list of

registered voters for the estate.

..................................................................................................................................

.............................................................................................................................. [1]

The interviewer labelled his chosen dwellings 1 to 12, and the following is a copy of the

notes he made during a number of visits to the estate:

ad = adult(s)    ch = child(ren)

Dwelling   Type   Notes
1          B      2ad
2          S      3ch 2ad
3          S      2ad 4ch
4          D      7ch 2ad
5          B      call again later
6          D      2ad 5ch
7          D      2ad 5ch
8          B      no reply
9          S      4ch 2ad
10         S      no reply
11         D      2ad
12         B      1ch 1ad

Repeat visits
5          call again
8          still no reply
10         2ad
5          2ad
8          still no reply
8          2ad

(iii)

(a) adults,

................................................... [1]

(b) bungalows with no children.

................................................... [1]


(iv)

Draw up and complete a table showing the number of dwellings, classified by their

type and by the number of children who live in them.


[3]
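A two-way tally of dwelling type against number of children can be sketched as below. The pairing of dwelling numbers, types and notes used here is an assumed reading of the interviewer's notes (the first twelve notes taken in dwelling order, with later visits to dwellings 5, 8 and 10 each ending in "2ad"), so the counts are illustrative rather than definitive.

```python
from collections import Counter

# Final outcome for each dwelling 1 to 12: (type, number of children).
# ASSUMED reading of the notes; see the lead-in.
dwellings = [
    ("B", 0), ("S", 3), ("S", 4), ("D", 7), ("B", 0), ("D", 5),
    ("D", 5), ("B", 0), ("S", 4), ("S", 0), ("D", 0), ("B", 1),
]

counts = Counter(dwellings)
child_values = sorted({children for _, children in dwellings})

# One row per dwelling type, one column per number of children.
print("Type", *child_values, sep="\t")
for kind in ("D", "S", "B"):
    print(kind, *(counts[(kind, c)] for c in child_values), sep="\t")
```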

(v)

children's clothes. If it only has sufficient funding to investigate the expenditure on

such clothes by the inhabitants of one type of dwelling, state, with a reason, which

type it should choose.

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [2]


(b) A group of 60 people are each allocated a different two-digit random number in the

range 01 to 60. The 20 men are numbered 01 to 20 and the 40 women are numbered 21

to 60.

A sample of size six is to be selected by different sampling methods using the following

random number table, starting at the beginning of the row for each sample. No person

may be selected more than once in any one sample.

RANDOM NUMBER TABLE

21  32  07  42  98  81  21  57  81  59  31  17  36

Select

(i)

.............................................................................................................................. [2]

(ii)

a systematic sample,

.............................................................................................................................. [3]

(iii)

a sample stratified by gender, using every number if the gender to which it relates

has not yet been fully sampled.

.............................................................................................................................. [2]
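The three selection procedures can be sketched as follows. Several conventions here are assumptions rather than statements from the question: that the first sample is a simple random sample, that the systematic sample starts at the first table number falling within one sampling interval (01 to 10), and that the stratified sample uses proportional quotas of 2 men and 4 women.

```python
table = [21, 32, 7, 42, 98, 81, 21, 57, 81, 59, 31, 17, 36]   # row read left to right
POPULATION, SIZE = 60, 6

def simple_random(numbers):
    """Take in-range numbers in table order, skipping repeats, until SIZE are chosen."""
    chosen = []
    for n in numbers:
        if 1 <= n <= POPULATION and n not in chosen:
            chosen.append(n)
            if len(chosen) == SIZE:
                break
    return chosen

def systematic(numbers):
    """Start at the first number within one interval, then step by the interval."""
    step = POPULATION // SIZE
    start = next(n for n in numbers if 1 <= n <= step)
    return [start + i * step for i in range(SIZE)]

def stratified(numbers, quotas=(("men", 2), ("women", 4))):
    """Fill each stratum's quota in table order (men 01-20, women 21-60)."""
    remaining = dict(quotas)
    chosen = []
    for n in numbers:
        if not 1 <= n <= POPULATION or n in chosen:
            continue
        stratum = "men" if n <= 20 else "women"
        if remaining[stratum] > 0:
            remaining[stratum] -= 1
            chosen.append(n)
    return chosen

print(simple_random(table))   # [21, 32, 7, 42, 57, 59]
print(systematic(table))      # [7, 17, 27, 37, 47, 57]
print(stratified(table))      # [21, 32, 7, 42, 57, 17]
```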


11 (a) (i) A company's sales are recorded every month over a period of several years. Use

this example to explain briefly the meaning of the term

(a) trend,

...........................................................................................................................

...........................................................................................................................

....................................................................................................................... [1]

(b) seasonal variation,

...........................................................................................................................

...........................................................................................................................

....................................................................................................................... [1]

(c) cyclic variation,

...........................................................................................................................

...........................................................................................................................

....................................................................................................................... [1]

(ii)

State which one of trend, seasonal variation and cyclic variation the method

of moving averages removes from a time series, and explain briefly how this is

achieved.

..................................................................................................................................

..................................................................................................................................

.............................................................................................................................. [2]


(b) The following table gives the number of properties sold during each quarter of the years

2006 to 2009 by a small estate agent, together with values of relevant totals and moving

averages.

Year   Quarter   Number of   4-quarter   8-quarter   8-quarter moving
                 sales       total       total       average value
2006   I         18
       II        24
                             96
       III       28                      197         24.625
                             101
       IV        26                      205         25.625
                             104
2007   I         23                      209         26.125
                             x =
       II        27                      209         26.125
                             104
       III       29                      y =         25.625
                             101
       IV        25                      197         24.625
                             96
2008   I         20                      188         23.5
                             92
       II        22                      180         z =
                             88
       III       25                      167         20.875
                             79
       IV        21                      151         18.875
                             72
2009   I         11                      137         17.125
                             65
       II        15                      121         15.125
                             56
       III       18
       IV        12

(i)

[3]
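The totals and centred moving averages in the table can be reproduced with a few lines of code. This is a sketch of the standard centring method for quarterly data; the variable names are chosen for the illustration.

```python
sales = [18, 24, 28, 26,   # 2006, quarters I to IV
         23, 27, 29, 25,   # 2007
         20, 22, 25, 21,   # 2008
         11, 15, 18, 12]   # 2009

# Each 4-quarter total covers four consecutive quarters.
four_q = [sum(sales[i:i + 4]) for i in range(len(sales) - 3)]
# Adding neighbouring 4-quarter totals centres the value on a quarter.
eight_q = [a + b for a, b in zip(four_q, four_q[1:])]
centred = [t / 8 for t in eight_q]

print(four_q)      # [96, 101, 104, 105, 104, 101, 96, 92, 88, 79, 72, 65, 56]
print(eight_q)     # [197, 205, 209, 209, 205, 197, 188, 180, 167, 151, 137, 121]
print(centred[7])  # 22.5
```

The fourth 4-quarter total, the fifth 8-quarter total and the eighth moving average fall in the positions marked x, y and z.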


(ii)

[Graph grid: Number of sales (vertical axis, scale 10 to 28) against quarter (horizontal axis, 2006 to the first quarter of 2010)]

[2]

(iii) Describe what your plotted points show about the sales of properties during this

time period.

..................................................................................................................................

.............................................................................................................................. [2]


(iv)

State, with a reason, whether or not it would be meaningful to draw a single straight

trend line through the plotted points.


..................................................................................................................................

.............................................................................................................................. [1]

(v)

Draw a straight trend line which would be useful for estimating the number of

properties sold in the first quarter of 2010.

[1]

The seasonal components for the number of sales are given in the following table.

Quarter                 I      II     III    IV
Seasonal component      4.4    0.1    3.5    1.0

(vi)

State, with a reason, whether the actual sales of properties in the first quarter of

2010 would be likely to be greater or smaller than the value indicated by your trend

line.

..................................................................................................................................

.............................................................................................................................. [2]

reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the

publisher will be pleased to make amends at the earliest possible opportunity.

University of Cambridge International Examinations is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of

Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.


4040 Statistics November 2013

Principal Examiner Report for Teachers

STATISTICS

Paper 4040/12

Paper 12

Key Messages

If a question specifies a certain degree of accuracy for numerical answers, full marks will not be obtained if

the instruction is not followed.

Premature rounding or truncation of decimals in the middle of working should be avoided so that accuracy is

not lost.

Candidates should develop the skill of holding the intermediate values of a calculation in the calculator to

obtain maximum accuracy in the final answer.
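The effect of premature rounding can be illustrated with the pooled mouse-length totals from Paper 23 earlier in this document (45 mice, sum of lengths 7385). The sketch assumes that 1 216 621 is the pooled sum of squared lengths and that the standard deviation uses the divisor n.

```python
from math import sqrt

n, sum_x, sum_x2 = 45, 7385, 1216621   # pooled totals (sum of squares assumed)

mean = sum_x / n
sd_full = sqrt(sum_x2 / n - mean ** 2)                 # mean kept at full precision
sd_rounded = sqrt(sum_x2 / n - round(mean, 1) ** 2)    # mean rounded to 164.1 first

print(round(sd_full, 1))      # 10.2
print(round(sd_rounded, 1))   # 10.4
```

Rounding the mean to one decimal place before squaring changes the final answer in the first decimal place, which is enough to lose an accuracy mark.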

Candidates should try to relate their knowledge to the specific requirements of a question rather than simply

repeat memorised knowledge.

After performing any calculation it is worth pausing to consider if the answer obtained is a reasonable one for

the practical situation of the question.

General Comments

The overall standard of work was comparable to that of last year. Some very good marks were obtained,

and there were few exceptionally low marks. As is noted regularly in these reports, there were again

instances of marks being needlessly lost due to final answers not being given to the accuracy specifically

stated in the question. In those parts of questions requiring comment related to results calculated there is

still a tendency for some answers given to be mathematical rather than contextual (see Question 10 below).

Any candidate of statistics ought to be able to observe whether or not the result of a calculation is

reasonable in a given practical situation. If it is clearly unreasonable, the work can be checked to find the

error. For example, if it is found that the mid-day temperature in a city is set to increase by 20°C by mid-century (see Question 9 below) it should be obvious that a mistake has been made; this is far in excess of

even the direst predictions of climate change scientists.

It may seem superfluous to remark that a question should be read carefully before an answer is attempted.

Yet there was one question in particular on the paper (see Question 2 below) where this was apparently not

done.

Section A

Question 1

Parts (i) and (ii) were generally answered best. It was clear from answers to the other parts that many

candidates do not understand the terms central tendency and dispersion, for many gave a measure of

dispersion when a measure of central tendency was requested, and vice versa. Few answered all parts

correctly.

Answers: (i) mode (ii) range (iii) median (iv) variance or standard deviation (v) interquartile range

(vi) mode


Question 2

The best answers to part (ii) were those which demonstrated that the candidate had read the question

carefully, and in particular had understood that the key piece of information given was that there was a

proposal to change the time of the class. Thus, when taking her sample, it was important that the instructor

did not select, for example, all women who were in full-time employment, and who, presumably, would all

have been against the change. The answers given below are not exhaustive; but whatever was suggested,

to earn credit it had to be explained to be something that would affect the woman's ability, one way or the

other, to attend at the new time.

Weak answers did not address the situation described, but reproduced what was apparently memorised

material on avoiding bias in general. Thus in spite of the question stating clearly that this was a class for

women, and that the instructor already knew their ages, it was quite common to see gender and age

suggested for items of data needed.

Answers: (i)(a) quota (i)(b) systematic (ii) employment status, because working women may need to be

at work in the afternoon; maternal status, because a woman with children may prefer afternoon

attendance when her children are at school

Question 3

This was very well done, with only part (iv) causing problems. Success was most readily achieved by those

who tried inserting different sets of three consecutive integers into their ordered list in part (iii).

Answers: (i) 6 (ii) 3.9 (iii) 4 (iv) 3

Question 4

Whilst parts (i) and (ii) were almost always answered correctly, there were few fully correct answers to the

next three parts. As is observed regularly in these reports, many candidates do not understand clearly what

the regions of the different parts of a Venn diagram represent. In parts (iii) and (iv) common numerators

seen were 27 and 6 respectively, and in part (v) little appreciation was shown that a denominator of 9 had to

be used.

Answers: (i) 25 (ii) 6 actors have worked in Los Angeles and Rome but not Mumbai (iii) 40/48

(iv) 10/48 (v) 4/9

Question 5

Parts (i) and (ii) were almost universally well done. There were also many correct answers to part (iii), but

because past questions have usually asked about the radii of the charts, some candidates felt that squaring

or taking square roots had to be done somewhere.

Answers: (i) $12 million (ii) 126 (iii) 4 : 3

Question 6

This was another question which was almost universally well done. Candidates understood very clearly this

particular form of tabulation for the representation of the distances between different towns. Errors occurred

occasionally in part (ii) when it was not realised that three distances only had to be added for the journey

described in the question.

Answers: (i)(a) 35 in cell BC (i)(b) 24 in cell AC (i)(c) 21 in cell CE (i)(d) 37 in cell CD (ii) 81 km

Section B

Question 7

As was the case in the examination last year, most candidates were able to apply their knowledge of crude

and standardised rates to fertility rates, and there were many good answers to parts (i), (ii) and (iii).

However, as mentioned in the general comments above, this was yet again one of the questions where

marks were sometimes lost through failure to follow the given accuracy instructions.


Good answers to part (iv) showed clear understanding that the task was to find the number of deaths in the

city, as the number of births was already known from part (iii). They further showed understanding that the

calculation had to be based on the total population of the city, and not just the females. It was quite common

in weaker answers to see this last point overlooked, with 18 450 being used in the working for deaths instead

of 36 900. The least creditworthy attempts simply subtracted one of the death rates from one of the fertility

rates and stopped at that point, again failing to appreciate that, whilst fertility rates applied only to the

females, death rates applied to the whole population.

Very good general understanding of what was required was shown in part (v).

Answers: (i) 88.7 (ii) 145, 828, 714, 87 (iii) 96.2 (iv) 1486 (v) migration of people into or out of the city

Question 8

There were many correct answers to part (i), though not all candidates appreciated that this was a without

replacement situation. Most did not see the simple link between this part and the next, and attempted part

(ii) as though it was completely unrelated to what had gone before. Unfortunately, in the analysis of the

different cases this involved, one of the three possibilities was frequently omitted.

The quality of answers to the histogram was mixed, with many fully correct answers, but also many where no

allowance was made for the different widths of the rectangles.

Whilst the number of fully correct answers to part (vii) was limited, a good number of candidates were able to

obtain some marks on the question. The best answers showed clear understanding of the conditional

element, ending with a division of probabilities, even though these might not be individually correct, it being

sometimes thought that there were just three 3, 4, 5 cases. More limited answers finished at the point where

the probability of the apartments having 12 rooms had been found, the conditional element not being

recognised. A significant number of answers was seen in which it was thought that the only requirement was

to find the probability of choosing three apartments each with 4 rooms. It should have been apparent that a

question worth 6 marks must have involved more than one line of working for its solution.

Answers: (i) 35/204 (ii) 169/204 (iii) 54 (iv) 6 (v) rectangle of height 5 (vi) modal class (vii) 13/157

Question 9

Some candidates produced graphs of very high quality, the majority plotting points correctly. But the error of

using mid-class values instead of upper class boundaries continues to be seen too often.

As has been pointed out before in these reports, good answers to this type of question give some indication

on the graph (for example with lines drawn and labelled) of how the required information is being found.

Credit can then be given for method, even if the answer is incorrect. Some progress appears to have been

made in this respect, with, on this occasion, fewer graphs devoid of annotations than has been the case in

the past.

Part (iii) was reasonably well done, although a significant number of answers was seen where the serious

error of using a total frequency of 400 was made. Common errors in part (iv) were to add 2.5°C or even

20°C to the median previously found, and also to add a temperature increase to the interquartile range

previously found. In the case where 20°C was being added, it should have been realised that this was a

highly unrealistic increase.

In part (v), thought processes were not always evident from answers presented. The best solutions were

those where vertical lines were drawn on the graph at temperatures of 36°C and 34°C, with horizontal lines

linking these to the respective cumulative frequencies.

Answers: (i) 8, 33, 85, 166, 245, 313, 350, 365 (ii) plot of cumulative frequencies at upper class

boundaries joined by a smooth curve (iii)(a) 20.7°C to 21.3°C (iii)(b) 11°C to 12°C, dependent

on correct method for, and accuracy of, quartiles (iv)(a) answer to part (iii)(a) + 2°C

(iv)(b) same answer as part (iii)(b) (v) 9, 10 or 11 days


Question 10

Following an observation made in this report last year on the clarity of plotted points, this year, almost

always, Examiners were able to see points very clearly.

Very good marks were generally earned on the first three parts, with good understanding shown of the need

to order data to find the semi-averages. By far the best way to proceed in part (iv) was to use the two given

averages to find the equation of the line. Candidates who used the average they had calculated in part (iii)

risked error by using values they could not be certain were correct, unlike the values for the other averages

given in the question. Unfortunately many did exactly this, and as a consequence of working with their own

(incorrect) average obtained an incorrect equation. Incorrect equations also resulted from working with a

gradient accurate to only one significant figure.

In part (v) quite a lot of answers were written in purely mathematical language, when what was required was

an appreciation of what was implied for the schools and teachers.

Reasonable skill was shown in part (vi) in drawing a line of best fit by eye, and in part (vii) in finding its

equation. For the latter it was essential that points from the line drawn had to be used. When values were

seen which were originally given in the table, Examiners only gave credit if the line drawn passed through the

plot of these particular points.

In part (viii), most candidates knew that this had something to do with educational provision as it related to

the number of teachers employed. But a good number focused on the intercepts of the two equations rather

than the gradients. Statements to the effect that Belport was better because it employed more teachers

could not be accepted, as actual numbers for Belport were unknown.

Answers: (ii) (927+1085+1219+1361)/4 (iii) (559.75, 25.75) (iv) m = 0.0280 or 0.028, c = 10.00 to 10.11

(v) it indicates there are 10 teachers when there are no pupils (vii) m = 0.033 to 0.039, c =

intercept of line drawn in part (vi) (viii) Belport, as gradient for Belport is higher, showing that the

number of teachers per pupil there is higher than at Astra
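The sensitivity of the intercept to the accuracy of the gradient can be sketched using the figures quoted in the answers. The sketch assumes that the line passes through the point (559.75, 25.75) from part (iii), and the one-significant-figure gradient 0.03 is included only for comparison.

```python
x_bar, y_bar = 559.75, 25.75   # averages quoted in the answer to part (iii)

results = {}
for m in (0.0280, 0.03):       # gradient to 3 s.f. versus 1 s.f.
    c = y_bar - m * x_bar      # a line y = m*x + c through (x_bar, y_bar)
    results[m] = round(c, 1)
    print(m, results[m])
# 0.028 10.1
# 0.03 9.0
```

With the gradient held to three significant figures the intercept falls inside the accepted range of 10.00 to 10.11; with one significant figure it does not.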

Question 11

The answers below for part (i) are not exhaustive, but to gain credit specific advantages and disadvantages

in the statistical analysis of data had to be provided. Thus references to a process being tedious or taking a

lot of time were not considered acceptable. Also, what appear to be common assumptions about it being

easier to analyse a frequency distribution rather than a large set of data must be questioned; if a large set of

data is held in a spreadsheet a wide range of statistical measures can be found almost instantaneously.

Part (ii) was generally well answered, although a mark was commonly lost on the standard deviation through

failure to maintain sufficient accuracy in decimals in the body of the working. For such a problem candidates

should have the ability to retain intermediate values of maximum accuracy within the calculator, by making

use of the memory. Too often premature rounding or truncation of decimals is seen. Most used the method

for standard deviation based on Σfx and Σfx², which is far better for computational purposes than that which

uses Σf(x − mean)².
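The two methods for the standard deviation of a frequency distribution give the same value, as the sketch below shows. The distribution is hypothetical, made up only for the comparison, and the divisor is taken to be the total frequency.

```python
from math import isclose, sqrt

# HYPOTHETICAL frequency distribution, used only to compare the two methods.
x = [1, 2, 3, 4, 5]
f = [3, 7, 12, 6, 2]

n = sum(f)
sum_fx = sum(fi * xi for fi, xi in zip(f, x))
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))
mean = sum_fx / n

# Method based on the totals Σfx and Σfx².
sd_totals = sqrt(sum_fx2 / n - mean ** 2)
# Method based on squared deviations from the mean, Σf(x − mean)².
sd_deviations = sqrt(sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / n)

print(isclose(sd_totals, sd_deviations))   # True
```

The first method needs only two running totals, which is why it suits calculator work; the second requires the mean to be carried at full accuracy into every term.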

Part (iii) aimed to test if candidates were able to focus on the particular numbers relevant to a question,

when given a table containing a range of information. There were very mixed answers, with some giving

more than one programme for one or both answers.

Good understanding was shown in part (iv), and many clearly presented answers were seen.

Answers: (i) provides a concise summary of the data; original data are lost (ii) 3.66, 0.343 (iii)(a) Q

(iii)(b) T (iv) 197/900


4040 Statistics November 2013

Principal Examiner Report for Teachers

STATISTICS

Paper 4040/13

Paper 13

Key Messages

A valuable skill in statistical work is to be able to recognise when the results of a calculation or analytical

process are reasonable.

If a question specifies a certain degree of accuracy for numerical answers, the instruction must be followed

for full marks to be credited.

If words in a question are emphasised they should be noted carefully by the candidate so that unnecessary

errors are avoided.

General Comments

The overall standard of work was comparable to that of last year, with a wide range of marks being obtained.

As is noted regularly in these reports, there were again instances of marks being needlessly lost when

answers were not given to the required accuracy, where this was stated in the question (see Questions 2,

10 below).

A candidate of statistics ought to know whether or not the result of a calculation or analytical process is

reasonable in a given practical situation. If it is clearly unreasonable, the work can be checked to find the

error and the error corrected. If a plot of the values on a scatter diagram show clearly that as x increases y

decreases, it ought to be obvious that, if found, a line of best fit with positive gradient must be wrong (see

Question 10 below).

In questions which require written answers, candidates should try to relate their knowledge to the specific

context of the question rather than simply repeat memorised knowledge of a general nature (see Question 6

below).

Section A

Question 1

Answers to this question were mixed. It is clear that some candidates do not understand the terms central

tendency and dispersion, for a measure of dispersion was sometimes given when a measure of central

tendency was requested, and vice versa.

Answers: (i) median, mode (ii) interquartile range (iii) mean (iv) two from range, standard deviation,

variance

Question 2

This was very well answered, with many candidates obtaining full marks. Good understanding was shown of

the use of the square of the radius in part (iv), though occasionally a mark was needlessly lost as a

consequence of the accuracy instruction being ignored.

Answers: (i) Europe 164, Asia 74, North America 90, Rest of the World 32 (ii) $162 million

(iii) 4.9 cm to 5.1 cm (iv) 4.1 cm

Question 3

This was another very well done question, with many full mark answers being presented.

Answers: (i) and (ii) column totals: 220, 440, 660; 540, 125, 665; 1390, 785, 2175; 2150, 1350, 3500

Question 4

Where errors occurred they were mainly in part (iii), where the value for the total number of handball players

was occasionally used instead of the value for those who play only handball.

Answers: (i) 1; one girl did not play any of the three sports (ii)(a) 7 (ii)(b) 2 (iii) 19

Question 5

This question and the next were by far the least well answered in Section A. Whilst almost all recognised

the need for rectangle heights to correspond to frequency densities, many errors were made in using the one

given height to deduce correctly the standard class width.

Answers: (i) 40 (ii) 7, 32, 4, 2 (iii) 2.71

Question 6

It was clear that most candidates knew about systematic sampling, and there was scarcely any confusion

with other types of sampling. But in part (a)(i) there was a tendency to give examples of biased outcomes

rather than the causes of such outcomes. Answers to part (a)(ii) were reasonable, though rarely complete,

either the first or second steps (or even both) in the process being omitted. In part (b), stratification was

clearly understood, but only the strongest answers gave stratification directly relevant to the surveys being

carried out. Weaker answers offered criteria which might be employed in general, such as gender, age or

occupation.

Answers: (a)(i) occurs when there is a regular pattern in the population listing (a)(ii) three basic steps to

be given: listing the population; starting the selection at a random point; selecting every 19th

candidate from the list after the starting point (b)(i) into smokers and non-smokers (b)(ii) into

those who live near an airport and those who do not

Section B

Question 7

In this question, part (b) was answered far better than the other two parts. The diagram was well understood

and there were many correct answers.

In part (a) not everyone appreciated that the case of the person not having the disease had to be considered

as well as the case of the person having the disease, and furthermore that the test result had to be negative

in the former case to give the correct result. Nevertheless some correct solutions were seen.

But there were very few correct solutions to part (c)(i). Almost all failed to consider in their working that if

Laura went into exactly one shop it meant that she did not go into the other. Consequently 0.8 and 0.3 were

usually absent from the working. In part (c)(ii) some candidates did not seem to recognise the numerical

comparison which had to be made in order to give a decision.

Answers: (a)(i) 0.05 (a)(ii) 0.1, 0.9 in second column (a)(iii) 0.9075 (b)(i) 13/33 (b)(ii) 4/5 (b)(iii) 4/13

(c)(i) 0.0558 (c)(ii) unlikely as 0.0558 > 0.05
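
The reasoning required in part (a) of Question 7 can be sketched as follows. The report does not quote the probabilities from the question, so the three figures below are assumptions chosen for illustration; with these assumed values the total happens to agree with the published answer to (a)(iii).

```python
# Assumed figures for illustration only; they are not quoted in the report.
p_disease = 0.15             # person has the disease
p_pos_given_disease = 0.95   # test is positive when the disease is present
p_neg_given_healthy = 0.90   # test is negative when the disease is absent

# The test is correct in two separate cases, and both must be included:
# disease present with a positive result, disease absent with a negative result.
p_correct = (p_disease * p_pos_given_disease
             + (1 - p_disease) * p_neg_given_healthy)
print(round(p_correct, 4))
```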

Question 8

There were very few completely correct answers to part (a) because of the graphs presented in part (a)(ii).

Candidates do not seem to have observed the emphasis given to the word "appropriate", because almost all

produced a totally inappropriate graph. As the variable is discrete, full credit could only be given where a

step polygon was drawn.

The first five parts of part (b) were generally well answered, though with occasional errors through the

misreading of scales. Good appreciation was shown in part (b)(vii) that there would be no change, but a

mark was frequently dropped in part (b)(vi) because the 5 minutes given in the question was absent from

the answer.

Answers: (a)(ii) step polygon required (b)(i) 42 (b)(ii) 35 (b)(iii) 55 to 56 (b)(iv) 180 (b)(v) 6th or 7th

(b)(vi) increased by 5 minutes (b)(vii) unchanged

Question 9

The calculation of crude and standardised death rates is well known by most candidates, and there were

many good answers to the first three parts.

The explanatory parts were less well done. In part (iv), few focused on the population age structures, and in

part (v) it was usual to see only the first of the reasons given below, though credit was also given for the

observation that town B must have the healthier environment. In part (vi) there was widespread recognition

that the rate would not change, but incomplete explanation as to why this was so.

Answers: (i) p = 9, q = 40 (ii) 4.2 per thousand (iii) 7.3 per thousand (iv) the proportions of the

population of town A in the different age groups match exactly the proportions of the standard

population in the different age groups (v) town B has a larger population than town A; town B

has a much smaller group death rate amongst the elderly than town A (vi) value unchanged;

CDR is calculated using only total population and total deaths, and both would be unchanged
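
A minimal sketch of the two rates discussed in Question 9 is given here. The town data and standard population percentages are hypothetical, and the function names are illustrative only.

```python
def crude_death_rate(populations, deaths):
    """Deaths per thousand, using only total deaths and total population."""
    return 1000 * sum(deaths) / sum(populations)

def standardised_death_rate(populations, deaths, standard_percentages):
    """Group death rates weighted by the standard population's age structure."""
    group_rates = [1000 * d / p for d, p in zip(deaths, populations)]
    weighted = sum(r * w for r, w in zip(group_rates, standard_percentages))
    return weighted / sum(standard_percentages)

# Hypothetical town with three age groups (not the data from the question).
populations = [2000, 5000, 3000]
deaths = [4, 15, 45]

cdr = crude_death_rate(populations, deaths)
sdr = standardised_death_rate(populations, deaths, [30, 50, 20])

# If the standard population has the same age proportions as the town
# (20%, 50%, 30% here), the standardised rate equals the crude rate.
sdr_same_structure = standardised_death_rate(populations, deaths, [20, 50, 30])
print(cdr, sdr, sdr_same_structure)
```

The last line illustrates the point in part (iv): the two rates coincide when the town's age structure matches that of the standard population.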

Question 10

A good proportion of candidates answered the first four parts well, with accurately plotted points and

accurately calculated averages, leading to a good line of best fit. But for others the fact that y decreased as

x increased resulted in a common error, it being assumed that the smallest values of x always had to be

paired with the smallest values of y, when calculating the semi-averages. This error meant that the location

of the plotted averages on the grid, and the line subsequently drawn through them, bore no relationship

whatsoever to the pattern of the plotted data. The line had a positive gradient when clearly the trend of the

data indicated the gradient should be negative. When this happened the candidate ought to have realised

something was wrong and paused for reflection, instead of continuing regardless.

In part (v) the accuracy instruction was sometimes ignored.

The best answers in part (vii) were those which illustrated the dangers of extrapolation with contextual

examples, commenting on the likely performance in this situation of very young children or elderly people.

Answers: (ii) overall (10.7, 18.7); lower (8, 23.7); upper (13.3, 13.7) (iv) gradient: value rounding to -1.9;

intercept: value rounding to 39 (v) 12 minutes (vi)(a) reasonably well (vi)(b) A (vii) would not

be valid for substantial extrapolation; for example, the line of best fit indicates an impossible time

of zero for someone who is about 20 years old
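
The method of semi-averages for Question 10 can be sketched as below. The key step is that the points are ordered by x and each y value stays with its own x. The four data points are hypothetical; the lower and upper averages used at the end are the published ones, which give a gradient of about -1.9 and an intercept of about 39. The function names are illustrative only.

```python
def semi_averages(points):
    """Average the lower and upper halves of the points, ordered by x.

    Each (x, y) pair is kept intact. Assumes an even number of points.
    """
    ordered = sorted(points)  # orders by x; y travels with its x
    half = len(ordered) // 2

    def centroid(group):
        return (sum(x for x, _ in group) / len(group),
                sum(y for _, y in group) / len(group))

    return centroid(ordered[:half]), centroid(ordered[half:])

def line_through(p1, p2):
    """Gradient and intercept of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    gradient = (y2 - y1) / (x2 - x1)
    return gradient, y1 - gradient * x1

# Hypothetical data in which y falls as x rises.
demo_lower, demo_upper = semi_averages([(1, 10), (4, 3), (2, 8), (3, 7)])
demo_gradient, _ = line_through(demo_lower, demo_upper)

# Published semi-averages from the answers to part (ii).
gradient, intercept = line_through((8, 23.7), (13.3, 13.7))
print(round(gradient, 1), round(intercept))
```

A negative gradient here agrees with the downward trend of the plotted data, which is the reasonableness check the report asks candidates to make.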

Question 11

The quality of answers to this question was variable. Even though basic computation of mean and standard

deviation was required, marks were routinely lost. Sometimes this was the result of calculation errors,

sometimes the result of using incorrect formulae.

In part (iv), as emphasised in the question, the results from part (iii) had to be used. Few candidates were

able to do this successfully. The few good answers seen used the 250 and 750 appropriately and obtained

the required values quickly and easily. Unsatisfactory answers went back to the original x values and started

again.

Answers: (i) 1250, 750, 2000, 3750, 7500 (ii) 8, 0, 5, 12, 27 (iii) 5.02, 85.9896 (iv)(a) 2005

(iv)(b) 5 374 350 (v) dollars squared
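
The quick route in part (iv) of Question 11 can be sketched as follows. The coding x = 250u + 750 is an assumption inferred from the 250 and 750 mentioned above; it is consistent with the published answers, but the question paper itself is not quoted in this report.

```python
# Results for the coded variable u, from the published answer to part (iii).
mean_u = 5.02
variance_u = 85.9896

# Assumed coding: x = 250 * u + 750 (inferred, not quoted from the question).
scale, shift = 250, 750

mean_x = scale * mean_u + shift        # the mean is scaled and then shifted
variance_x = scale ** 2 * variance_u   # the variance is scaled by 250 squared only

print(round(mean_x, 2), round(variance_x, 2))
```

The shift of 750 plays no part in the variance, which is why the second result follows from a single multiplication.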

STATISTICS

Paper 4040/22

Paper 22

Key Message

The most successful candidates in this examination were able both to calculate the required statistics and to

interpret their findings. In the numerical problems, candidates scoring the highest marks provided clear

evidence of the methods they had used in logical, clearly presented solutions. In questions requiring written

definitions, justification of given techniques and interpretation, the most successful candidates provided

detail in their explanations with clear thought given to the context of the problem, where appropriate.

General Comments

In general, candidates did better on the questions requiring numerical calculations and graphical work than

on those requiring written explanations; in particular, candidates did well on the numerical and graphical

parts of Questions 1, 5, 7 and 10. Answers provided to questions requiring written explanations, such as

Questions 7(a)(i), 10(ii)(d) and 11(iv)(d), were sometimes too vague. Where candidates needed to provide

some interpretation of their calculated statistics, such as in comparing the interquartile ranges in Question

9(b)(iv), some otherwise strong candidates seemed to struggle.

Question 8, on probability, proved to be the least popular of the optional Section B questions, with each of

the remaining Section B questions proving equally popular.

Section A

Question 1

The majority of candidates were able to apply correctly the laws of probability relating to independent and

mutually exclusive events. The most common errors were for candidates simply to add the probabilities of A

and B in part (i)(b), without subtracting the intersection, and to multiply the probabilities of C and D in part

(ii)(a).

Answers: (i)(a) 0.03 (i)(b) 0.32 (ii)(a) 0 (ii)(b) 0.64
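
The two laws tested in Question 1 can be sketched as below. The report does not quote the individual probabilities: P(A) = 0.15 and P(B) = 0.20 are assumed because they reproduce the published answers to part (i), and the values for C and D are purely hypothetical.

```python
def p_both_independent(p_a, p_b):
    """P(A and B) for independent events: multiply the probabilities."""
    return p_a * p_b

def p_either(p_a, p_b, p_both):
    """General addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Assumed values, not quoted in the report.
p_a, p_b = 0.15, 0.20
p_a_and_b = p_both_independent(p_a, p_b)
p_a_or_b = p_either(p_a, p_b, p_a_and_b)   # the intersection is subtracted

# Mutually exclusive events cannot occur together, so the intersection is 0
# and must not be found by multiplying. Hypothetical values for C and D.
p_c, p_d = 0.40, 0.24
p_c_and_d = 0.0
p_c_or_d = p_either(p_c, p_d, p_c_and_d)
print(p_a_and_b, p_a_or_b, p_c_and_d, p_c_or_d)
```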

Question 2

In part (i) of this question, a new value was being added to a set of data and candidates were asked to

explain the effect on the mean and the standard deviation. Many candidates stated, incorrectly, that the

mean would increase and that the standard deviation would stay the same. Such candidates had confused

adding a single value to the set of data items with adding a constant to each data item. In

part (ii) the concept being tested was the effect on the mean and standard deviation of adding to each item a

constant and of multiplying each item by a constant. Some candidates, incorrectly, assumed that the

addition of the bonus would affect the standard deviation.

Answers: (i) Stay the same, decrease (ii) 12800, 1050
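
The distinctions drawn in Question 2 can be checked numerically. The wages below are hypothetical, and the sketch assumes, as the published answer to part (i) suggests, that the single new value equals the existing mean.

```python
from statistics import mean, pstdev

wages = [800, 1000, 1200, 1400]  # hypothetical data

# Part (i): one new value, equal to the mean, joins the data set.
extended = wages + [mean(wages)]

# Part (ii): every item is changed by the same constant.
shifted = [w + 500 for w in wages]   # adding a constant to each item
scaled = [1.05 * w for w in wages]   # multiplying each item by a constant

print(mean(extended), pstdev(extended))  # mean unchanged, spread smaller
print(mean(shifted), pstdev(shifted))    # mean shifts, spread unchanged
print(mean(scaled), pstdev(scaled))      # mean and spread both scaled
```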

Question 3

There were some good attempts at this question, with many candidates producing well organised solutions.

Some candidates got incorrect probabilities, but were nonetheless able to use expected values to decide

whether or not the game was fair. A few candidates, incorrectly, attempted to compare probabilities, rather

than expected values.

Answers: (i) , , fair game (ii) $3
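
Deciding fairness by expected value rather than by comparing probabilities can be sketched as follows. The game below is hypothetical and is not the one set in the question.

```python
from fractions import Fraction

def expected_gain(outcomes):
    """Sum of probability x net gain over all outcomes of one play."""
    return sum(p * gain for p, gain in outcomes)

# Hypothetical game: the player pays $1 to roll a fair die.
# A six returns $6 (net gain $5); any other score returns nothing (net loss $1).
game = [(Fraction(1, 6), 5), (Fraction(5, 6), -1)]

gain = expected_gain(game)
print(gain, "fair" if gain == 0 else "not fair")
```

The two probabilities here are very different, yet the game is fair because the expected gain is zero.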

Question 4

Many candidates struggled to deal with the times in this question. It was necessary to find the mean number

of minutes early/late for the two groups of candidates before trying to combine them. In part (ii) many

candidates were able to quote the correct formula for standard deviation, but again they frequently used

times rather than the number of minutes late in this formula.

Answers: (i) 36, 8.59 (ii) 11.9

Question 5

Most candidates were able to use the change chart, together with the figures provided, to calculate the

quantities of the commodities produced in 2012. They then, usually successfully, displayed this information

in the form of a dual bar chart. A mark was lost by some candidates for insufficient labelling of the vertical

axis, where it was necessary to state that the units were millions of tonnes. In part (iii) some candidates did

not explain sufficiently clearly that the advantage of a dual bar chart over a change chart is that the original

data is not lost.

Answers: (i) 80.7, 96.8, 22.1, 17.7

Question 6

Most candidates correctly identified the heights of the players as continuous, quantitative data and the towns

of birth of the players as discrete, qualitative data. In part (b), the majority of candidates were able to identify

the chart correctly as a sectional, component or composite bar chart, but many did not recognise that this

chart was more appropriate than a histogram, as the data presented here is discrete. Many candidates

simply stated that the sectional bar chart was easier to understand than a histogram. In part (b)(iii), it was

common to see the answer given as simply the number of matches played in the cup in which the team

scored 2 or more goals, rather than this expressed as a fraction of the total number of matches played in the

cup. The denominator of 11 was frequently incorrect or missing entirely.

Answers: (b)(iii) 3/11

Section B

Question 7

In part (a)(i), it was necessary for candidates to consider the merits of obtaining moving average values in

this particular situation. Therefore they needed to consider whether the number of visitors at a tourist

attraction is likely to be subject to seasonal variation, and to conclude that this is likely. Many candidates

simply stated, in general terms, the purpose of calculating moving average values, without relating their

comments to the particular situation identified. Parts (a)(ii) and (iii) were completed correctly by many

candidates with a few, incorrectly, giving an answer of 3 for part (ii).

The calculations in parts (b)(i) and (ii) were completed correctly by most candidates and the graph plots in

part (iii) were mostly accurate, with a suitable trend line drawn. Most candidates correctly interpreted the

trend line in the context of the problem presented. In part (v), some candidates did not take the reading from

the trend line at the correct place and others did not subtract 11.25 from their reading. The most common

error, however, was not to give the final estimate of the number of patients admitted to the hospital as a

whole number.

Answers: (a)(ii) 4 (b)(i) 168, 1308, 218 (b)(ii) 213.5, 215, 215.5, 217, 219.25, 221.25
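
The calculation of centred moving averages can be sketched as below. The quarterly figures and the period of 4 are hypothetical; the question itself may use a different period.

```python
def moving_averages(values, period):
    """Simple moving averages over consecutive blocks of `period` values."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

def centred_moving_averages(values, period):
    """For an even period, average neighbouring moving averages so that
    each result lines up with an actual time point."""
    ma = moving_averages(values, period)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

# Hypothetical quarterly data showing a seasonal pattern.
visitors = [20, 30, 50, 24, 24, 34, 54, 28]

ma4 = moving_averages(visitors, 4)
centred = centred_moving_averages(visitors, 4)
print(ma4)
print(centred)
```

The seasonal swings in the raw figures disappear from the centred values, leaving the steady upward trend.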

Question 8

Most candidates were successful with part (i) of this question, although it was surprisingly common to see

incorrect responses of 60/100 and 40/130 for parts (a) and (d), respectively. There were many fully correct

solutions seen to part (ii), with some errors caused by some candidates unnecessarily trying to consider the

males and females separately and omitting some of the possible combinations. Part (iii) was more

challenging, but some good attempts were seen with the most common error being multiplication by 3

instead of 6. Most candidates found part (v) the most challenging, although some fully correct solutions were

seen. Some candidates were unable to attempt the final part of this question and a common incorrect

answer seen in part (v)(a) was 4/13.

Answers: (i)(a) 1/5 (i)(b) 8/15 (i)(c) 23/30 (i)(d) 2/7 (ii) 105/299 (iii) 700/3427 (iv) 4, 3, 6

(v)(a) 40/121 (v)(b) 34/121

Question 9

Candidates found part (a), and in particular part (a)(i), of this question difficult. A common error seen in part

(a)(i) was for the true class limits to be given as 50 and 60. Candidates who were successful with parts

(a)(ii) and (iii) usually went on to complete the numerical parts of (b) correctly.

In part (b)(i), many candidates correctly chose the median and explained that it is not affected by extreme

values. Almost all candidates found the correct cumulative frequencies in (b)(ii) and most then tried to find

the 25th and 75th values, as required for the interquartile range in (b)(iii). Most candidates then applied the

correct formula, but many used wrong values for class boundaries and class widths. In part (b)(iv), it was

necessary for candidates to interpret this value. Many thought that a smaller interquartile range indicated

smaller masses in general, rather than a smaller dispersion of the masses. Again in part (b)(v), incorrect

identification of class boundaries led to wrong answers for some who used a correct approach. Common

wrong methods involved trying to divide the whole population proportionately, rather than just the 450–750

group.

Answers: (a)(i) 50, 61 (a)(ii) 49.5, 60.5 (a)(iii) 45, 65 (b)(ii) 12, 43, 72, 86, 94, 98, 100 (b)(iii) 480

(b)(v) 62.3
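
Linear interpolation for quartiles in grouped data can be sketched as follows. The class boundaries and cumulative frequencies are hypothetical, since the report does not give the boundaries used in the question, and the function name is illustrative only.

```python
def interpolate_quantile(boundaries, cum_freqs, position):
    """Value at a given cumulative position, by linear interpolation.

    boundaries: true class boundaries (one more entry than cum_freqs).
    cum_freqs:  cumulative frequency at each upper class boundary.
    """
    previous_cf = 0
    for lower, upper, cf in zip(boundaries, boundaries[1:], cum_freqs):
        if position <= cf:
            fraction = (position - previous_cf) / (cf - previous_cf)
            return lower + fraction * (upper - lower)
        previous_cf = cf
    raise ValueError("position exceeds the total frequency")

# Hypothetical grouped data with a total frequency of 100.
boundaries = [0, 10, 20, 30, 40]
cum_freqs = [10, 40, 80, 100]

q1 = interpolate_quantile(boundaries, cum_freqs, 25)
q3 = interpolate_quantile(boundaries, cum_freqs, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```

The result is only as good as the class boundaries and widths supplied, which is where most errors were reported. A smaller interquartile range indicates less spread, not smaller values.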

Question 10

Most candidates demonstrated a good understanding of price relatives in their answers to this question. The

numerical parts of (i), namely parts (a) and (c), were usually correct. In part (i)(b), candidates needed to

explain that the price relative of 110 for equipment indicates that the price or cost has increased by 10%

between 2010 and 2012. A few stated incorrectly that it indicated that the expenditure had increased by

10%. In part (i)(d), most candidates correctly drew a two-way table, with values of 100 for each category in

2010 and the price relatives that they had calculated for 2012.

Again it was the numerical parts of (ii), (b) and (c), which candidates found the most straightforward, and

many fully correct solutions were seen. In part (a), it was necessary to explain that the ratio of the

expenditure on the three different categories could be used to calculate the weights. In part (d), candidates

needed to consider the reliability of the result they had achieved and also what might contribute to an

unreliable result. In the particular context provided, these reasons might have been that the number of

employees, or number of hours worked, had changed or that the amount of equipment used had changed.

These features had not been considered within the calculation of the weighted aggregate cost index. Some

candidates gave incorrect answers, such as that there might have been inflation; this is a feature included

within the figures for the price relatives, and thus not a potential source of inaccuracy. Other answers which

did not gain credit were those which were too vague and did not relate specifically to the problem presented,

such as simply that the weights may be incorrect. It was necessary to provide a reason as to why this might

be the case.

Answers: (i)(a) 107 (i)(c) 98 (ii)(b) 104 (ii)(c) 5990
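
Price relatives and a weighted aggregate cost index can be sketched as below. All prices, relatives and weights are hypothetical; in the question the weights come from the ratio of expenditure on the three categories.

```python
def price_relative(current_price, base_price):
    """Current price as a percentage of the base-year price."""
    return 100 * current_price / base_price

def weighted_index(relatives, weights):
    """Weighted average of price relatives."""
    return sum(r * w for r, w in zip(relatives, weights)) / sum(weights)

# A relative of 110 means the price, not the expenditure, rose by 10%.
equipment_relative = price_relative(55, 50)   # hypothetical prices

# Hypothetical relatives and expenditure-based weights for three categories.
relatives = [110, 100, 95]
weights = [2, 5, 3]
index = weighted_index(relatives, weights)
print(equipment_relative, index)
```

The index reflects price changes only. A change in quantities, such as the number of employees or hours worked, is not captured, which is the kind of reason sought in part (ii)(d).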

Question 11

In part (i), candidates needed to consider the reliability of the sampling technique described. They needed to

think about the specific situation and consider who might be waiting for a bus at 7 am on a Monday morning.

In the best answers, the candidates described the method as not representative of the whole population,

because it was likely to contain a group of people such as workers or college students who are likely to

have similar requirements in terms of the buses they want to catch.

In part (ii), candidates needed to explain the need to number the population list from either 00 to 59 or 01 to

60 before selecting a simple random sample. Some candidates, who described writing names on pieces of

paper and drawing them from a hat, did not appear to have read the question carefully enough. There was a

high proportion of correct answers to (ii)(b), with the most common errors being the inclusion of both 00 and

60 or the inclusion of 15 twice.

Many candidates were able to find the correct systematic sample in part (iii) although, as with the previous

part, some candidates did not read the wording of part (a) sufficiently carefully and described the process for

selecting the whole sample rather than giving a sufficiently detailed description of the selection of the first

term. It was necessary to state that the first term came from a number between 00 and 09 (or between 01

and 10) randomly selected from the table.

In the final sampling method, stratified sampling, the numbering of the people within the age groups needed

to be considered. Some candidates appeared to be using the ages themselves for the numbering of the

groups, and thus it was common to see 07 excluded and 82 and 16 included in the stratified sample. The

correct numbering that should have been given to each of the age groups in the question was 00 to 19, 20 to

49 and 50 to 59 (or 01 to 20, 21 to 50 and 51 to 60) respectively. In part (iv)(d), it was necessary to explain

the benefit of a sample stratified by age group when considering the views of the population to the proposed

bus timetable change. Thus it was necessary to consider the relevance of age to this particular problem. In

the best answers, candidates explained that the different age groups are likely to want buses at different

times. Most candidates made a general comment about stratified samples not being biased, which did not

relate specifically to the problem that had been presented. Only the very strongest candidates scored this

final mark on the paper.

Answers: (ii)(b) 15, 08, 00, 31, 52, 47 or 15, 08, 60, 31, 52, 47 (iii)(b) 04, 14, 24, 34, 44, 54

(iv)(a) 2, 3, 1 (iv)(c) 17, 55, 25, 07, 35, 42
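
The systematic and stratified selections can be sketched as follows, with the population numbered 00 to 59 and age groups of sizes 20, 30 and 10, as implied by the numbering given above. The starting value of 4 is taken from the published answer; in the question it is read from the random number table. The function names are illustrative only.

```python
def systematic_sample(start, interval, population_size):
    """Every `interval`-th member, beginning at a randomly chosen start."""
    return list(range(start, population_size, interval))

def proportional_allocation(stratum_sizes, sample_size):
    """Number to select from each stratum, in proportion to its size.

    Assumes the proportions divide exactly, as they do in this example.
    """
    total = sum(stratum_sizes)
    return [sample_size * size // total for size in stratum_sizes]

# 60 people numbered 00 to 59; a sample of 6 gives an interval of 10.
sample = systematic_sample(4, 10, 60)

# Age groups numbered 00-19, 20-49 and 50-59.
allocation = proportional_allocation([20, 30, 10], 6)
print(sample, allocation)
```

Only the starting value comes from the random number table; the rest of the systematic sample is fixed by the interval.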

STATISTICS

Paper 4040/23

Paper 23

Key Message

The most successful candidates in this examination were able both to calculate the required statistics and to

interpret their findings. In the numerical problems, candidates scoring the highest marks provided clear

evidence of the methods they had used in logical, clearly presented solutions. In questions requiring written

definitions, justification of given techniques and interpretation, the most successful candidates provided

detail in their explanations with clear thought given to the context of the problem, where appropriate.

General Comments

In general, candidates did better on the questions requiring numerical calculations than on those requiring

written explanations; in particular, candidates did well on the numerical parts of Questions 4, 6, 7, 8 and, for

those who attempted it, Question 9. It was particularly pleasing to see, in Question 9(v), clearly laid out

logical solutions, as these were essential in this 8-mark question. Answers to questions requiring written

explanations, such as Questions 5(i) and 10(a)(ii), were sometimes too vague.

In Questions 6(ii), 7(iv)(b), 11(b)(iii) and 11(b)(iv) it was necessary for candidates to provide some

interpretation of their calculated statistics and graphs. In Questions 6(ii) and 11(b)(iv), some otherwise

strong candidates seemed to struggle to interpret and explain their results.

Question 9, on probability and expectation, proved to be the least popular of the optional Section B

questions, although those that attempted it generally scored high marks. Question 7 on linear interpolation

proved to be the most popular of the optional questions.

Section A

Question 1

The majority of candidates were able to correctly identify the number of items of mail as a discrete variable

and the distance run by a number of athletes during 1 hour as a continuous variable, although there were

slightly more errors in part (ii) than part (i). In parts (iii) and (iv), it was necessary to give true lower and

upper class limits for each of these variables, with candidates often being more successful in identifying the

lower than the upper class limit.

Answers: (i) Discrete (ii) Continuous (iii) 5 and 9 (iv) 4.5 and 9.5

Question 2

There were some good attempts by many candidates in part (i) to explain the term base year. Many scored

one of the two available marks, by describing the base year as a reference point or the point in time to which

other index numbers are referred. The second mark was for stating that it is when an index number takes

the value 100, and it was less common to see this part of the answer given.

In part (ii), many candidates found it difficult to express the meaning of the term weight in this context. A

few candidates were able to say that weights were used to calculate a weighted cost index, but not that it is a

means of taking into account the different levels of importance of items in an index number. For the second

mark, it was necessary to state that the weight is usually based on the expenditure on each item and this

was rarely seen.

As with part (i), it was common for candidates to score one mark in part (iii). It was necessary to describe

the term price relative as the ratio of two prices, or as showing the proportional or percentage change in the

price of an item, for the first mark and then, for the second mark, to state that this is relative to the base year.

Question 3

It was rare to see the correct answer to part (i), that the samples of male and female mice are of different

sizes. Some candidates, incorrectly, stated that the division should be by 45 rather than 2, because there

are 45 mice.

In part (ii), errors in the calculation of the mean included candidates multiplying the sums of lengths by the

respective frequencies and use of the incorrect formula from part (i). In the calculation of the standard

deviation, some candidates used their mean correct to only one decimal place, when a greater degree of

accuracy was required in order to find the standard deviation correct to one decimal place.

Answers: (ii) 164.1, 10.2

Question 4

The majority of candidates were successful with this question, particularly parts (i) and (ii). In part (iii), it was

quite common to see an incorrect initial equation, often with only one occurrence of the unknown, rather than

two.

Answers: (i) 136 (ii) 30.75 (iii) 26

Question 5

Correct items of detail missing from the bar chart for part (i) are the key/legend, the vertical scale, the label

on the vertical axis and the title. Some candidates gave answers which were too vague, such as labels or

axes. The vast majority of candidates scored at least one of the two marks available for this part of the

question.

In part (ii), candidates needed to describe the bar chart as either Sectional, Component or Composite. The

most common error was to see it described as a stacked bar chart.

Most candidates correctly noted, in part (iii), that had the bar chart illustrated percentages then the bars

would have been of equal heights.

In part (iv), many candidates correctly stated the disadvantage of using pictograms to illustrate frequencies

as the difficulty of determining the exact frequency when partial pictures are used. Some, however, incorrectly

referred to the difficulty of construction of a pictogram, rather than considering the impact of the diagram

once constructed.

In part (v), most candidates correctly identified that nothing had changed and therefore a change chart could

not be used.

Question 6

The most common error in part (i) was to see the weighted cost index given as the final answer, rather than

the percentage increase. There were however many fully correct solutions and the majority of candidates

scored at least three of the four available marks.

There were more difficulties, however, with part (ii), with surprisingly few candidates appreciating the need to

compare 5.56% with 7% in order to establish that the farm was more profitable than it had been, because the

income had increased by more than the costs.

Answers: (i) 5.56%

Section B

Question 7

Parts (i), (ii), (iii) and (iv)(a) of this question were completed successfully by the majority of candidates.

Some candidates did not appear to understand what they were being asked to do in part (iv)(b). Many did

not find the distances of the quartiles from the median. Those that did were often able to state correctly that

these distances were approximately equal, and that this is what would be expected in a distribution of

heights of adult males.

In part (v), some candidates were able to identify correctly that the curve would be steepest in the middle of

the graph, where the frequency is greatest or where the greatest change in the cumulative frequency occurs.

The most common error was to identify the part of the graph where the greatest change in frequency occurs.

Answers: (i) 144, 1376, 3589, 6148, 7857, 8562, 8585 (ii)(a) 170 under 175 class (ii)(b) 171.4

(iii)(a) 165 under 170 class (iii)(b) 166.7 (iv)(a) 9.2

Question 8

Most candidates were successful with part (i) of this question, realising the need to include the factor of

(the probability of selecting each bag) within their calculations.

In part (ii), some candidates did not realise that the selection of two balls implies that they will not be

replaced and hence the most common error was to see denominators of 11 and 11 in the two two-factor

products, rather than 11 and 10.

Parts (iii), (iv) and (v) were usually correctly calculated by those who attempted them, although by part (v)

some candidates were leaving this question blank.

In order to answer part (vi) of this question, it was necessary to notice that the probability that the first ball

will be white is the same in each situation. Thus the comparison can focus on the fact that, with fewer balls

in X than Z, the probability of the second ball being white is greater in part (iv) than in part (v).

Answers: (i) 5/11 (ii) 31/55 (iii) 21/121 (iv) 3/55 (v) 2/77

Question 9

Most candidates were able to correctly find the probability of getting three 1s in part (i).

In part (ii), the most common error was for candidates to calculate the probability of any three numbers

except 1, rather than three of the same number except 1.

In part (iii), some candidates did not consider the number of ways in which exactly two 1s and some other

number could be achieved; however, the majority correctly multiplied by a factor of three.

Most candidates correctly calculated the prize multiplied by the probability for each outcome and summed

their results in part (iv). Some candidates, however, left this as their final answer or subtracted the entry fee

of $1 from this amount, rather than considering the profit of the organiser and performing the subtraction the

other way around.

Clearly set out working was essential in part (v). Most candidates achieved this, with the most common error

being that some did not appreciate that the essential difference between this game and the previous one

was that the selected cards are not replaced. Thus, for example, denominators of 10, 10 and 10 were

sometimes seen instead of 10, 9 and 8. Such candidates did, however, usually appreciate the fact that the

only possible ways of getting three cards the same are three 1s or three 2s. Another common error was for

the factor of 3 to be missing from the probability of exactly two 1s and some other number. The method for

the calculation of expectation was usually correct. Those candidates that attempted this part of the question

were usually able to achieve at least 3 of the available marks, with some scoring all 8 marks. Some

candidates, however, did not attempt this part of the question.

Answers: (i) 1/216 (ii) 5/216 (iii) 5/72 (iv) 67 cents (v) 61 cents
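
The probabilities in parts (i) to (iii) of Question 9 can be checked by listing every outcome. The sketch assumes three independent, equally likely scores from 1 to 6, an assumption that is consistent with the published answers but is not stated in this report.

```python
from fractions import Fraction
from itertools import product

# All 216 equally likely ordered outcomes of three scores from 1 to 6 (assumed).
outcomes = list(product(range(1, 7), repeat=3))

def probability(event):
    """Proportion of outcomes for which the event holds."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

three_ones = probability(lambda o: o == (1, 1, 1))
# Three of the same number other than 1, not merely any three numbers except 1.
three_same_not_one = probability(lambda o: len(set(o)) == 1 and o[0] != 1)
# Exactly two 1s: the other number can fall in any of three positions.
exactly_two_ones = probability(lambda o: o.count(1) == 2)

print(three_ones, three_same_not_one, exactly_two_ones)
```

Listing the outcomes makes the factor of three in part (iii) visible. For part (v), where cards are not replaced, the outcomes are no longer equally likely in this simple way and the denominators fall with each draw.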

Question 10

Most candidates correctly named the sampling method as quota sampling in part (a)(i).

Answers to part (a)(ii) were often too general, for example simply stating that using a list of registered voters would be biased, without providing a reason. Candidates needed to state that not all inhabitants would appear in a list of voters.

Common incorrect answers of 17 and 1 were seen in parts (a)(iii)(a) and (a)(iii)(b) respectively, where

candidates had ignored the second and third visits to the properties.

In part (a)(iv), table headings were often incorrect with dwelling numbers 1 to 12, rather than number of

children 0 to 7, appearing as one of the row/column headings. Dwelling types D, S and B usually correctly

appeared as the other headings.

In part (a)(v), candidates needed to use the fact that the bigger the sample, the more accurate it is likely to

be. Some candidates incorrectly chose semi-detached houses over detached houses, because they said

that there were too many children in the detached houses.

In part (b), the simple random and stratified samples were usually correct. In part (b)(ii), some candidates picked values from the random number table at regular intervals, rather than selecting just the first value for the systematic sample from the random number table and then selecting every tenth person.

Answers: (a)(iii)(a) 23 (a)(iii)(b) 3 (b)(i) 21, 32, 07, 42, 57, 59 (b)(ii) 07, 17, 27, 37, 47, 57

(b)(iii) 21, 32, 07, 42, 57, 17
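The systematic sample in part (b)(ii) can be sketched as follows. The starting value 07, the interval of ten and the sample size of six are read off the answer listed above; the function name is hypothetical.

```python
def systematic_sample(start, interval, size):
    """Return `size` labels, beginning at `start` and stepping by `interval`."""
    return [start + k * interval for k in range(size)]


# Only the first value (07) comes from the random number table;
# every tenth person is then selected.
sample = systematic_sample(start=7, interval=10, size=6)
labels = [f"{n:02d}" for n in sample]
print(labels)
```

Only the starting point is random here, which is the distinction the report draws between this method and repeatedly reading values from the random number table.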

Question 11

Correct responses to part (a)(i)(a) described the trend as the long-term pattern after regular variations have been removed. The key to scoring the mark was to describe the increase or decrease as long-term, general or over time.

In part (a)(i)(b), some candidates described seasonal components rather than seasonal variation. The key

here was to include the idea of variation that repeats itself, such as describing a regular variation over a fixed

(relatively short) time period.

The majority of candidates were unable to explain the meaning of cyclic variation in part (a)(i)(c). The key

here was to describe long-term variation following a general pattern, but over variable lengths of time.

In part (a)(ii), a common error was to state that the trend, rather than the seasonal variation, is removed from

a time series when moving averages are calculated. Candidates were rarely able to explain that this is done

by smoothing out the variations over one time period.
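The smoothing effect described in part (a)(ii) can be illustrated with a short sketch. The quarterly figures below are hypothetical (a steady rise of 1 per quarter plus seasonal effects that sum to zero over a year) and are not taken from the question; the function name is also illustrative.

```python
def centred_moving_average(values, period=4):
    """Centred moving averages for an even period, e.g. quarterly data."""
    # Each plain moving average covers one full cycle, so the seasonal
    # effects within it cancel out.
    plain = [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]
    # With an even period the plain averages fall between two time points,
    # so adjacent pairs are averaged to centre them on actual quarters.
    return [(a + b) / 2 for a, b in zip(plain, plain[1:])]


# Hypothetical data: trend 10 + t, with seasonal effects summing to zero.
seasonal = [-3, 1, 4, -2]
sales = [10 + t + seasonal[t % 4] for t in range(12)]
trend = centred_moving_average(sales)
print(trend)  # the underlying rise of 1 per quarter, with no seasonal swing
```

Because each average spans exactly one year, the seasonal variation is removed and only the trend remains.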

The calculations in part (b)(i) were usually all correct, with accurate plots in part (b)(ii).

In part (b)(iii), most candidates correctly stated that the sales rose initially, but then declined thereafter.

Many candidates did not see that a single trend line was not appropriate in part (b)(iv). Those who did recognise this were not always able to explain that the trend in the early quarters was very different from that in the later part of the period.

Suitable trend lines, ignoring the plots before 2007, were drawn by many candidates for part (b)(v). Some

candidates incorrectly drew lines that included the early plots and others ignored the instruction to draw a

straight line and drew curves.

In part (b)(vi), most candidates correctly identified that the seasonal component for quarter one was negative

and therefore sales are likely to be smaller than indicated by the trend line.

Answers: (b)(i) x = 105, y = 205, z = 22.5


Cambridge Ordinary Level


4040/12

STATISTICS

Paper 1

October/November 2014

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use an HB pencil for any diagrams or graphs.

Do not use staples, paper clips, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (LK/SLM) 102872/4 R

© UCLES 2014


Section A [36 marks]

Answer all of the questions 1 to 6.

In an industrial process, readings, x, of a particular gauge are recorded regularly. For 6 such

recorded readings it is found that Σx = 279 and Σx² = 13 093.

(i)

Mean = .......................................................

Standard deviation = ...................................................[4]

It was discovered later that one of the readings had been incorrectly recorded as 43, when in fact

the correct reading was 34.

(ii)

State, for each of the mean and standard deviation, whether its correct value will be smaller

than, larger than, or the same as the value found in part (i).

Mean .......................................................

Standard deviation ...................................................[2]
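As a hedged illustration of the method, the sketch below takes the two totals quoted in the question as the sum of the readings and the sum of their squares, and uses the divisor n for the standard deviation (an assumption about the intended formula). It also shows the effect of replacing the reading 43 by 34. The function and variable names are illustrative only.

```python
from math import sqrt


def mean_and_sd(n, sum_x, sum_x2):
    """Mean and standard deviation (divisor n) from summary totals."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, sqrt(variance)


n, sum_x, sum_x2 = 6, 279, 13093
mean_recorded, sd_recorded = mean_and_sd(n, sum_x, sum_x2)

# Correct the totals: remove the wrong reading (43) and add the right one (34).
sum_x_corrected = sum_x - 43 + 34
sum_x2_corrected = sum_x2 - 43 ** 2 + 34 ** 2
mean_corrected, sd_corrected = mean_and_sd(n, sum_x_corrected, sum_x2_corrected)

print(round(mean_recorded, 2), round(sd_recorded, 2))
print(round(mean_corrected, 2), round(sd_corrected, 2))
```

The corrected reading lies further from the mean than the recorded one, which is why the two measures move in opposite directions.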

4040/12/O/N/14

2

A student calculated, correctly, five statistical measures for a set of data. The five values he

obtained were, in ascending order, 6, 36, 43, 48 and 53.

(i)

Statistical measure

Value

Median

Lower quartile

Upper quartile

Standard deviation

Variance

[5]

(ii)

State the value of the 75th percentile for the student's original set of data.

....................................................[1]


3

A national government plans a survey to obtain the responses of its citizens to its proposal to build

wind farms as sources of renewable energy.

(i)

A

Face to face interviews will be conducted with a total of 1000 citizens in shopping centres

in different parts of the country.

Telephone calls will be made to 1000 citizens chosen from the telephone directory.

...........................................................................................................................................

.......................................................................................................................................[1]

(b) Give one advantage of method B over method A.

...........................................................................................................................................

.......................................................................................................................................[1]

(c) Give one advantage and one disadvantage of method C.

Advantage ..........................................................................................................................

...........................................................................................................................................

Disadvantage .....................................................................................................................

.......................................................................................................................................[2]

(ii)

Are you in favour of wind farms being built in your area?

Yes

No

...........................................................................................................................................

.......................................................................................................................................[1]

(b) Write down one open question which could be asked in the survey.

...........................................................................................................................................

...........................................................................................................................................

.......................................................................................................................................[1]


4

The following diagram is to show the number of patients at a medical centre who have received

vaccine against one or more of the diseases polio, cholera and typhoid.

[Venn diagram: three overlapping sets labelled Polio, Cholera and Typhoid; the values 24, 30 and 17 are shown.]

(i)

(a) The number of patients who have received only cholera vaccine is 5 fewer than the

number of patients who have received only polio vaccine.

[1]

(b) The number of patients who have received only typhoid vaccine is two thirds of the

number of patients who have received polio and cholera vaccines but not typhoid vaccine.

[1]

(c) The number of patients who have received polio and typhoid vaccines but not cholera

vaccine is the same as the number of patients who have received all three vaccines. [1]

(d) Twice as many patients have received typhoid and cholera vaccines but not polio vaccine

as have received all three vaccines.

[1]

(ii)

Find the mode of the number of these vaccines received by these patients.

....................................................[2]


5

The table below gives information on the gender of, and number of books written by, 40 authors

attending a book fair.

Number of books written

TOTAL

1–5

6–10

11–15

More than 15

Male

15

Female

12

25

TOTAL

17

14

40

Find the probability of choosing

(i)

a male,

....................................................[1]

(ii)

....................................................[1]

(iii)

an author who has written 6–10 books, given that the author is male.

....................................................[1]

Two authors are chosen at random to lead discussion groups.

(iv)

....................................................[3]


6

A police camera at the side of a road measures the speed, in km/h, of every vehicle travelling on

the road.

The following histogram represents the information it recorded over a certain period of time.

[Histogram: vertical axis "Number of vehicles per 10 km/h", scaled from 0 to 50; horizontal axis "Vehicle speed (km/h)", scaled from 20 to 120.]

Use the histogram to find, for this period of time, the number of vehicles whose speeds were

(i)

....................................................[2]

(ii)

....................................................[2]

(iii)

under 50 km/h.

....................................................[1]

The speed limit on this road is 100 km/h. Any driver of a vehicle travelling at a speed which is

5 km/h or more greater than the speed limit must pay a fine.

(iv)

Estimate the number of drivers represented by this information who had to pay a fine.

....................................................[1]


Section B [64 marks]

Answer not more than four of the questions 7 to 11.

Each question in this section carries 16 marks.

In this question calculate all accident rates per thousand. Where values do not work out

exactly give your answers to one decimal place.

The table below gives information on the number of employees, and the number of accidents

they suffered, at a building construction company, Kwikbuild, in the year 2012. It also shows the

standard population for the building construction industry.

Job group

Number of

accidents

Number of

employees

Management

25

Office

Administration

167

35

Site Supervision

40

12

Site Labour

37

228

45

(i)

Job group

accident rate

Standard

population (%)

....................................................[4]

(ii)

Calculate the accident rate for each job group and insert the values in the table above.

[2]


(iii)

Use your results from part (ii) to calculate the standardised accident rate for Kwikbuild.

....................................................[4]
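The relationship between crude, group and standardised rates can be sketched as below. The figures in this example are hypothetical and are not the Kwikbuild data; the function names are illustrative only. All rates are per thousand, as the question requires.

```python
def crude_rate(accidents, employees):
    """Total accidents per thousand employees, ignoring job groups."""
    return 1000 * sum(accidents) / sum(employees)


def group_rates(accidents, employees):
    """Accident rate per thousand for each job group."""
    return [1000 * a / e for a, e in zip(accidents, employees)]


def standardised_rate(rates, standard_pct):
    """Group rates weighted by the standard population percentages."""
    return sum(r * p for r, p in zip(rates, standard_pct)) / 100


# Hypothetical company with three job groups (not the data in the question).
accidents = [1, 4, 10]
employees = [20, 80, 100]
standard_pct = [10, 30, 60]  # must total 100

rates = group_rates(accidents, employees)
crude = crude_rate(accidents, employees)
standardised = standardised_rate(rates, standard_pct)
print(rates, crude, standardised)
```

The standardised rate removes the effect of a company's own mix of job groups, which is why it is the fairer figure for comparing working environments.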

Fastbuild is another building construction company. In 2012 its crude and standardised accident

rates were 109.4 and 98.7 per thousand respectively.

(iv)

State, with a reason, which of the two companies most likely operates in the safer environment.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

For each company some of the accidents suffered by employees were classed as serious, and

they all occurred in the Site Labour job group.

The table below gives information on the serious accidents suffered at the two companies.

Job group

Company

Number of serious

accidents

Number of

employees

Kwikbuild

228

Fastbuild

154

Site Labour

(v)

Calculate, for each company, for the Site Labour job group only, the serious accident rate, and

hence state the company where an employee is less likely to suffer a serious accident.

....................................................[2]

(vi)

State, with a reason, whether the values you have calculated in part (v) are crude or

standardised rates.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]


8

A running club holds a cross-country race. Competitors enter in either the junior or senior age

category. When they enter, they also choose to follow one of three routes: easy, moderate or

challenging. Information about the number of competitors and the routes chosen is shown below.

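[Pictogram: "Number of competitors entering the race by age category", with one symbol representing 10 junior competitors and another representing 10 senior competitors.]

[Two pie charts, one for each age category, showing competitors by choice of route (Easy, Moderate, Challenging); the percentages 10%, 25%, 55%, 35%, 45% and 30% are shown.]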

(i)

....................................................[1]

(ii)

Show that there were 42 junior competitors who chose the moderate route.

[1]

(iii)

Find the number of senior competitors who chose the easy route.

....................................................[2]


Not all the entrants completed the race. The times taken by those who did complete the race are

shown in the table below.

Number of competitors

Completion time

(minutes)

Junior

Senior

Easy

Moderate

Challenging

Easy

Moderate

Challenging

60 – under 90

16

19

90 – under 120

29

12

32

15

18

15

20

17

14

11

16

65

39

10

71

46

37

TOTAL

(iv)

Find the number of competitors who entered the race but did not complete it.

....................................................[3]

(v)

Estimate, to the nearest minute, the mean time taken by senior competitors who completed

the challenging route.

....................................................[3]

(vi)

Of the junior competitors who completed the race in 2 hours or more, find the percentage who

had chosen the challenging route.

....................................................[3]

(vii)

Of all the senior competitors who had chosen the moderate route, find the percentage who

completed the race in under 2 hours.

....................................................[3]


9

One way to determine if an adult has a healthy weight, independent of age and gender, is to

measure their body mass index, BMI (a continuous variable).

The BMI values for the adult population of a particular country in the years 1980 and 2010 are

summarised in the cumulative frequency polygons below.

[Cumulative frequency polygons for 1980 and 2010: vertical axis "Cumulative frequency (% of adult population)", scaled from 0 to 100; horizontal axis "BMI", scaled from 15 to 40.]

Use these graphs to answer the following questions on the adult population of this country.

(i)

Estimate

(a) the median BMI value in 1980,

....................................................[1]

(b) the median BMI value in 2010,

....................................................[1]

(c) the lower quartile BMI value in 1980,

....................................................[1]

(d) the upper quartile BMI value in 2010.

....................................................[1]


An adult's weight is classified as healthy if their BMI value is between 18.5 and 25.

(ii)

Estimate the percentage of the adult population whose weights were classified as healthy

(a) in 1980,

....................................................[2]

(b) in 2010.

....................................................[1]

An adult is classified as overweight if their BMI value is 25 or more.

(iii)

Estimate the median BMI value of the overweight adult population in 2010.

....................................................[3]

Adults with the highest BMI values are classified as obese.

In 1980, 7% of the adult population were obese.

(iv)

Estimate the percentage of the adult population in 2010 who were obese.

....................................................[4]

(v)

By referring to any of the values you have estimated in parts (i), (ii), and (iv), comment on

how the health of the adult population of this country, assessed in terms of its weight, changed

between 1980 and 2010.

...................................................................................................................................................

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]


10 Baruti's teacher has suggested that pupils who enjoy studying a subject are likely to perform well

in tests in the subject.

To investigate this, Baruti asked his friends to rate their enjoyment of Statistics on a linear scale

from 0 (dislike very much) to 5 (like very much), then recorded their scores on the next class test.

His results are shown in the following table.

Friend

Enjoyment

rating, x

Test score

(%), y

57

47

78

59

26

86

53

34

(i)

[Grid for a scatter diagram: vertical axis "Test score (%), y", scaled from 0 to 100; horizontal axis "Enjoyment rating, x".]

[2]

(ii)

Explain briefly why the points (5, 78) and (4, 59) are not used if the lower semi-average is

calculated.

...................................................................................................................................................

...............................................................................................................................................[1]


(iii)

Calculate the two semi-averages and the overall mean of the data, and plot them on your

graph.

[5]

(iv)

Use your plotted averages to draw a line of best fit, and find its equation in the form y = mx + c.

....................................................[4]
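The method of semi-averages used in parts (iii) and (iv) can be sketched as follows. The data points below are hypothetical and are not Baruti's results; the function name is illustrative only. The sketch assumes the points split cleanly into a lower and an upper half when ordered by x.

```python
def semi_average_line(points):
    """Line through the lower and upper semi-averages of (x, y) points."""
    ordered = sorted(points)          # order by x (then y)
    half = len(ordered) // 2          # a middle point is left out if the count is odd
    lower, upper = ordered[:half], ordered[-half:]

    def centroid(group):
        xs = [x for x, _ in group]
        ys = [y for _, y in group]
        return sum(xs) / len(xs), sum(ys) / len(ys)

    (x1, y1), (x2, y2) = centroid(lower), centroid(upper)
    m = (y2 - y1) / (x2 - x1)         # gradient of the line of best fit
    c = y1 - m * x1                   # intercept on the y-axis
    return (x1, y1), (x2, y2), m, c


# Hypothetical ratings and scores (not the data in the question).
data = [(0, 20), (1, 30), (2, 40), (3, 50), (4, 60), (5, 70)]
lower_avg, upper_avg, m, c = semi_average_line(data)
estimate_at_3 = m * 3 + c
print(lower_avg, upper_avg, m, c, estimate_at_3)
```

With equal-sized halves the line through the two semi-averages also passes through the overall mean, which provides a check on the plotted points.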

Another friend, who had rated his enjoyment of Statistics at 3, missed the test through illness.

(v) Use the line you have drawn in part (iv) to estimate the score this friend would have obtained

if he had taken the test.

....................................................[1]

Baruti repeated his investigations for English and Science.

The equations he found for the lines of best fit were y = 1.24x + 53.8 for English and y = 13.8x + 15.1 for Science.

(vi)

State, with a reason, in which of the subjects Statistics, English and Science a pupil's test score is most affected by their enjoyment rating.

...................................................................................................................................................

...............................................................................................................................................[2]

(vii)

Explain briefly why Baruti may have experienced difficulty in deciding which of his two

variables should be the independent and which the dependent.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[1]


11 At a restaurant it is known from experience that 10% of the customers order an omelette.

Assume that customers make choices independently of each other and that nobody orders more

than one omelette.

(i)

Find the probability that at this table an omelette is ordered by

(a) no customers,

....................................................[2]

(b) at least one customer.

....................................................[2]

The restaurant serves small omelettes and large omelettes. It is known from experience that 60%

of those ordered are small and 40% are large.

(ii)

Find the probability that at this table only one customer orders an omelette and it is a large

omelette.

....................................................[4]


Small omelettes are made with 2 eggs and large omelettes with 3 eggs. The chef has a special

store of high quality eggs which are used only for making omelettes.

(iii)

Find the probability that, in preparing the food for this table, from his special store the chef

uses

(a) exactly 4 eggs,

....................................................[3]

(b) at most 4 eggs.

....................................................[5]
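The structure of the calculations in this question can be sketched as below. The number of customers at the table does not appear in the text above, so the value n = 4 used here is purely hypothetical; the function name is also illustrative. The sketch relies on the stated assumptions that customers choose independently and that nobody orders more than one omelette.

```python
from math import comb


def egg_probabilities(n, p_order=0.1, p_small=0.6):
    """Probabilities of using 0, 2, 3 or 4 eggs for a table of n customers."""
    p_s = p_order * p_small          # a given customer orders a small omelette (2 eggs)
    p_l = p_order * (1 - p_small)    # a given customer orders a large omelette (3 eggs)
    p_none = 1 - p_order             # a given customer orders no omelette
    return {
        0: p_none ** n,                                   # no omelettes at all
        2: n * p_s * p_none ** (n - 1),                   # exactly one small
        3: n * p_l * p_none ** (n - 1),                   # exactly one large
        4: comb(n, 2) * p_s ** 2 * p_none ** (n - 2),     # exactly two small
    }


n = 4  # hypothetical table size, not taken from the question
probs = egg_probabilities(n)
p_at_least_one = 1 - probs[0]
p_at_most_4_eggs = sum(probs.values())
print(probs, p_at_least_one, p_at_most_4_eggs)
```

Exactly 4 eggs can only arise from two small omelettes, and a total of 1 egg is impossible, so "at most 4 eggs" is the sum of the four cases listed.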


Cambridge Ordinary Level


4040/13

STATISTICS

Paper 1

October/November 2014

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use an HB pencil for any diagrams or graphs.

Do not use staples, paper clips, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (RW/SLM) 83695/3

© UCLES 2014


Section A [36 marks]

Answer all of the questions 1 to 6.

7 8 16 10 20 5 8 9 8 2 26 9 15 .

Three different measures of central tendency (average) of these numbers are 8, 9 and 11.

Complete the following table by giving, for each of these three measures, its name and a brief

explanation of how its value has been obtained.

Measure

Name

How obtained

8

..................................................................................................

............................

..................................................................................................

..................................................................................................

..................................................................................................

9

............................

..................................................................................................

..................................................................................................

..................................................................................................

11

............................

..................................................................................................

..................................................................................................

[6]
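The three values quoted in the question can be checked against the thirteen numbers listed, using Python's standard `statistics` module. This is offered only as a check on the arithmetic.

```python
import statistics

numbers = [7, 8, 16, 10, 20, 5, 8, 9, 8, 2, 26, 9, 15]

mode_value = statistics.mode(numbers)      # the most frequently occurring value
median_value = statistics.median(numbers)  # the middle value once sorted
mean_value = statistics.mean(numbers)      # the total divided by the count

print(mode_value, median_value, mean_value)
```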

4040/13/O/N/14

2

[Cumulative frequency graph for a discrete variable X: vertical axis "Cumulative frequency", scaled from 0 to 50; horizontal axis "X".]

(i)

...................................................................................................................................................

...............................................................................................................................................[2]

(ii)

State for which of the integer values shown in the graph the frequency of X is 0.

....................................................[2]

(iii)

x

Frequency

[2]


3

(a) The population of a town is tabulated in different age groups. A research organisation wishes

to interview, from the population, a sample which represents it in terms of age. It proposes to

do this using either stratified random sampling or quota sampling.

State one way in which the use of these sampling methods would be similar, and one way in

which it would be different.

...................................................................................................................................................

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

(b) It is wished to obtain an estimate of the mean number of words on each page of a book. For

each of the following methods state, with a reason, whether a sample obtained using it would

be likely to be biased or unbiased:

(i)

...........................................................................................................................................

...........................................................................................................................................

.......................................................................................................................................[2]

(ii)

...........................................................................................................................................

...........................................................................................................................................

.......................................................................................................................................[2]


4

The table below summarises the lengths, in millimetres, of a random sample of 50 leaves taken

from a bush.

Length (mm)

Frequency

Under 30

30 – under 32

32 – under 34

10

34 – under 36

17

36 – under 38

11

38 – under 40

Cumulative frequency

(i)

[1]

(ii)

Plot the cumulative frequencies on the grid below and draw a smooth curve through the

plotted points.

[2]

[Grid: vertical axis "Cumulative frequency", scaled from 0 to 50; horizontal axis "Length (mm)", scaled from 28 to 42.]

(iii)

(a) the lower quartile length,

.............................................mm [1]

(b) the percentage of leaves that have a length greater than 37.2 mm.

....................................................[2]

5

A company which produces different sizes of sawn wood wishes to display information about the

amount of sawn wood it produces in each of two consecutive years.

(i)

State one advantage and one disadvantage of using a dual bar chart, as opposed to a

percentage bar chart, to illustrate the amount produced in the two years.

Advantage..................................................................................................................................

...................................................................................................................................................

Disadvantage .............................................................................................................................

...............................................................................................................................................[2]

(ii)

Name a quantity which neither a dual bar chart nor a percentage bar chart would show.

...............................................................................................................................................[1]

(iii)

State the names of two types of diagram which will give a relative indication of both the

amount of different sizes of sawn wood and the total amount of sawn wood produced in each

year.

........................................................

....................................................[2]

(iv)

State the name of the type of diagram which will give a direct indication of the differences in

the total amount of sawn wood produced from one year to the next.

....................................................[1]


6

Three of the official languages of Switzerland are French, German and Romansh. The diagram

below illustrates which of these languages are spoken by a random sample of 70 Swiss citizens.

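[Venn diagram: three overlapping circles labelled French, German and Romansh inside a box; the values 17, 23, 8, 0, 3 and 12 are shown.]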

(i)

Find the value which should be written inside the box but outside the circles in order to

complete the diagram.

....................................................[2]

(ii)

...................................................................................................................................................

...............................................................................................................................................[1]

(iii)

State, with a reason, in each of the following cases, whether the value 0 would be changed if

the person described learned to speak Romansh.

(a) One of the people denoted in the diagram by the value 17.

...........................................................................................................................................

.......................................................................................................................................[1]

(b) One of the people denoted in the diagram by the value 8.

...........................................................................................................................................

.......................................................................................................................................[1]

(c) One of the people denoted in the diagram by your answer to part (i).

...........................................................................................................................................

.......................................................................................................................................[1]


Section B [64 marks]

Answer not more than four of the questions 7 to 11.

Each question in this section carries 16 marks.

(a) Mr Hassan can travel to work by either car or train. The probability that on any day he travels

by train is 4/7. If he travels by car the probability that he will be late for work is 1/9, but by train it is 1/5.

Calculate the probability that on any randomly chosen day he is not late for work.

....................................................[4]
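A minimal sketch of the calculation in part (a) follows. It reads the three probabilities as the fractions 4/7 (train), 1/9 (late by car) and 1/5 (late by train), which is an assumption about the original typesetting; the variable names are illustrative only.

```python
from fractions import Fraction

p_train = Fraction(4, 7)
p_car = 1 - p_train
p_late_given_car = Fraction(1, 9)
p_late_given_train = Fraction(1, 5)

# Add the two mutually exclusive routes to "not late":
# by car and on time, or by train and on time.
p_not_late = p_car * (1 - p_late_given_car) + p_train * (1 - p_late_given_train)
print(p_not_late)
```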

(b) Three children are to be chosen at random from a group of seven, consisting of four boys,

Ian, James, Michael and Nathan, and three girls, Karen, Lucy and Olive.

(i)

Calculate the probability that Ian, Lucy and Nathan are the three chosen.

....................................................[2]

Two of the seven are a brother and sister.

(ii)

Calculate the probability that the brother and sister will both be among the three chosen.

....................................................[3]
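As a cross-check on parts (b)(i) and (b)(ii), the equally likely groups of three can simply be enumerated. The sketch below is illustrative only. The question does not say which two children are the brother and sister, so Ian and Karen are used as an arbitrary stand-in (any boy and girl pair gives the same count).

```python
from fractions import Fraction
from itertools import combinations

children = ["Ian", "James", "Michael", "Nathan", "Karen", "Lucy", "Olive"]
groups = list(combinations(children, 3))  # every unordered group of three

# (i) exactly one group is {Ian, Lucy, Nathan}
target = {"Ian", "Lucy", "Nathan"}
p_named = Fraction(sum(set(g) == target for g in groups), len(groups))

# (ii) ASSUMPTION: Ian and Karen stand in for the unnamed brother and sister
siblings = {"Ian", "Karen"}
p_siblings = Fraction(sum(siblings <= set(g) for g in groups), len(groups))

print(len(groups), p_named, p_siblings)
```

Counting groups rather than multiplying conditional probabilities avoids having to consider the order in which the children are chosen.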


(c) Sammy and Pekos each have a bag containing a number of blue balls and white balls. Each

selects one ball from his bag at random. If the selected balls are of the same colour, Sammy

puts them both in his bag; if they are of different colours, Pekos puts them both in his bag.

Originally, Sammy's bag contains 2 blue and 6 white balls, while Pekos' bag contains 3 blue

and 5 white balls.

(i)

Calculate the probability that both selected balls are of the same colour.

....................................................[3]

(ii)

If, on the first selection, the balls were of the same colour (so both were put in Sammy's

bag before a second selection), calculate the probability that on the second selection the

balls are of different colours.

....................................................[4]
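One possible way to organise the working for part (c) is sketched below with exact fractions. It assumes that after a same-colour first selection Sammy's own ball returns to his bag together with the ball drawn by Pekos, so Sammy then holds 9 balls and Pekos 7. The function name `p_same` is made up for this illustration.

```python
from fractions import Fraction as F

def p_same(bag_s, bag_p):
    """Probability that the two selected balls match; bags are (blue, white)."""
    (sb, sw), (pb, pw) = bag_s, bag_p
    return F(sb, sb + sw) * F(pb, pb + pw) + F(sw, sb + sw) * F(pw, pb + pw)

sammy, pekos = (2, 6), (3, 5)

# (i) same colour on the first selection
p_first_same = p_same(sammy, pekos)

# (ii) condition on which colour matched, then move Pekos' ball to Sammy's bag
p_bb = F(2, 8) * F(3, 8)      # both blue
p_ww = F(6, 8) * F(5, 8)      # both white
after_bb = ((3, 6), (2, 5))   # Sammy gains a blue, Pekos loses one
after_ww = ((2, 7), (3, 4))   # Sammy gains a white, Pekos loses one
p_second_diff = (
    p_bb * (1 - p_same(*after_bb)) + p_ww * (1 - p_same(*after_ww))
) / p_first_same

print(p_first_same, p_second_diff)
```

The division by `p_first_same` turns the joint probability into the conditional probability that part (ii) asks for.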


8

The following table summarises the times, x minutes, which the visitors to an art gallery during

one day spent in the gallery. The first row of the table gives the column numbers.

(1)                  (2)             (3)    (4)    (5)    (6)
Time, x (minutes)    Frequency, f                  fy     fy²
0 – under 30
30 – under 35        11
35 – under 40
40 – under 50        40
50 – under 60        26
60 – under 70        14
70 – under 100
TOTAL                105

(i)

Insert in column (3) of the table the mid-points, m, of each of the classes.

[1]

(ii)

y = (m − 45) / 2.5.

Calculate the value of y for each class and insert the values in column (4) of the table.

[2]
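The mid-point and coding steps in parts (i) and (ii) can be sketched as follows. This is only an illustration: it assumes the coding formula reads y = (m - 45) / 2.5, and it takes the class boundaries from the first column of the table.

```python
# Class boundaries taken from the table: (lower, upper) in minutes
classes = [(0, 30), (30, 35), (35, 40), (40, 50), (50, 60), (60, 70), (70, 100)]

midpoints = [(lo + hi) / 2 for lo, hi in classes]
# ASSUMPTION: the coding formula is read as y = (m - 45) / 2.5
coded = [(m - 45) / 2.5 for m in midpoints]

for (lo, hi), m, y in zip(classes, midpoints, coded):
    print(f"{lo:>3} - under {hi:<3}  m = {m:<5}  y = {y:g}")
```

With this reading every coded value is a whole number, which is the usual purpose of such a coding.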

(iii)

For each class, calculate the value of the product fy, and insert the values in column (5) of the

table.

[1]

(iv)

For each class, calculate the value of fy², and insert the values in column (6) of the table.

[1]

(v)


[1]


(vi)

....................................................[2]

(vii)

....................................................[2]

(viii)

(a) the mean of x,

....................................................[2]

(b) the standard deviation of x.

....................................................[2]

(ix)

Comment on whether or not, for these data, the interquartile range would be a more

appropriate measure of dispersion than the standard deviation.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]


9

A bakery kept a record of the diameters, d centimetres, of the cakes it produced during one week.

The results are summarised in the histogram below.

[Histogram: vertical axis "Number of cakes per cm of diameter", scaled 0 to 50; horizontal axis "Diameter (cm)", scaled 15 to 23.]

(i)

Use the histogram to complete the following grouped frequency table for d.

Diameter, d (cm)     Frequency
15 – under 17
17 – under 18
18 – under 19
19 – under 19.5
19.5 – under 20
20 – under 20.5
20.5 – under 21
21 – under 23

[4]


(ii)

Use the frequencies you have obtained to produce a simpler grouped frequency distribution,

having four classes of equal width between 15 cm and 23 cm, and present your distribution in

a table.

[3]

(iii)

On the grid below illustrate your simpler grouped frequency distribution by a histogram.

[3]

(iv)

Use the histogram you have drawn in part (iii) to estimate the modal diameter.

.............................................. cm [2]

(v)

Cakes with a diameter between 16.5 cm and 22 cm can be sold in the bakery's shop. Find the

percentage of this week's cakes which can be sold in the shop.


....................................................[4]


10 In this question calculate all death rates per thousand, and to 2 decimal places.

The first table below gives certain information about the population and deaths in a town, Eastbury,

for the year 2012, together with the standard population of the area in which Eastbury is situated.

Age group      Deaths    Population in    Standard
                         age group        population (%)
0 – 14         25        4500             20
15 – 34        x         7000             35
35 – 59        47        6000             25
60 and over    83        7000             20

(i)

The death rate for the 15 – 34 age group is 3.00 per thousand.

Show that x = 21.

[1]

(ii)

....................................................[4]

(iii)

Calculate the death rates for the other three age groups.

0 – 14 age group ........................................................

35 – 59 age group ........................................................

60 and over age group ....................................................[2]


(iv)

Using the given rate for the 15 – 34 age group, and the rates you have calculated in part (iii),

calculate the standardised death rate for Eastbury.

....................................................[4]
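The death-rate calculations for Eastbury can be sketched as below. This is an illustration rather than a model answer: it assumes that the deaths, population and standard population figures pair up with the age groups in the order shown, and it uses x = 21 from part (i) for the 15 to 34 group.

```python
# Eastbury, 2012: (deaths, population, standard population %) per age group.
# The 15-34 deaths figure uses x = 21 from part (i).
eastbury = {
    "0-14": (25, 4500, 20),
    "15-34": (21, 7000, 35),
    "35-59": (47, 6000, 25),
    "60 and over": (83, 7000, 20),
}

# Age-specific death rates per thousand, rounded to 2 d.p. as instructed
rates = {g: round(1000 * d / n, 2) for g, (d, n, _) in eastbury.items()}

# Standardised rate: weight each group rate by its standard-population share
standardised = sum(rates[g] * w for g, (_, _, w) in eastbury.items()) / 100

# Crude rate: all deaths over the whole population
crude = 1000 * sum(d for d, _, _ in eastbury.values()) / sum(
    n for _, n, _ in eastbury.values()
)

print(rates, round(standardised, 2), round(crude, 2))
```

The crude rate depends on the town's own age structure, whereas the standardised rate applies the same standard age structure to every town, which is what makes the two towns comparable.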

The table below gives information about Westville, another town in the same area, for the year

2012.

The crude death rate for Westville in 2012 was 6.62 per thousand.

Age group      Death rate per    Population in
               thousand          age group (%)
0 – 14                           35
15 – 34                          25
35 – 59                          27
60 and over    24                13

(v)

Calculate the standardised death rate for Westville, using the same standard population as

for Eastbury.

....................................................[2]

One of the two towns has a higher crude death rate, but the other has a higher standardised death

rate.

(vi)

...................................................................................................................................................

...............................................................................................................................................[1]

(vii)

State, with a reason, which of the two towns would appear to have the healthier environment.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]


11 Three trainee technicians, A, B and C, carried out laboratory trials to examine the effect of

temperature, x, in °C, on the yield, y, in kg, of an industrial process. The following table shows the

results obtained by each technician.

Technician
Temperature, x (°C)    10    15    20    25    30    35    40    45    50    55    60    65
Yield, y (kg)          80   106    75    90   117   118    97   127    80   109   140   115

(i)

Plot the points representing these results on the grid below and label each point A, B or C

according to which technician carried out the trial.

[Grid: vertical axis "Yield (kg)", scaled 60 to 140; horizontal axis "Temperature (°C)", scaled 0 to 70.]

[3]


(ii)

[3]

The two semi-averages are (22.5, 97.7) and (52.5, 111.3).
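The quoted semi-averages can be reproduced with the short sketch below. It assumes the temperatures and yields pair up in the order they are listed in the table, and that the data are split into the six lowest and six highest temperatures.

```python
temps = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
yields = [80, 106, 75, 90, 117, 118, 97, 127, 80, 109, 140, 115]

def mean(values):
    return sum(values) / len(values)

half = len(temps) // 2
lower = (mean(temps[:half]), mean(yields[:half]))   # six lowest temperatures
upper = (mean(temps[half:]), mean(yields[half:]))   # six highest temperatures
overall = (mean(temps), mean(yields))

print(lower, upper, overall)
```

The overall mean point lies midway between the two semi-averages here because the two halves contain the same number of trials.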

(iii)

Plot the semi-averages and use the three plotted averages to draw the line of best fit.

[3]

It is known that over this range of temperatures the relationship between yield and temperature is

approximately linear.

(iv)

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

An experienced and reliable technician carried out a trial at a temperature of 40 °C and obtained a

yield of 125 kg.

(v)

[1]

(vi)

What might this extra information tell you about the performance of the trainees?

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

(vii)

Use the extra information to draw, by eye, a revised line of best fit.

[1]

(viii)

Use this revised line of best fit to estimate the yield for a temperature of 52 °C.

............................................... kg [1]


Cambridge Ordinary Level


4040/22

STATISTICS

Paper 2

October/November 2014

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use an HB pencil for any diagrams or graphs.

Do not use staples, paper clips, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (NF/SLM) 83693/3


Section A [36 marks]

Answer all of the questions 1 to 6.

(i)

...................................................................................................................................................

.............................................................................................................................................. [1]

(ii)

.............................................................................................................................................. [1]

(iii)

...................................................................................................................................................

.............................................................................................................................................. [1]

(iv)

.............................................................................................................................................. [1]

UCLES 2014

4040/22/O/N/14

3

2

Number of

people

8 or

more

Number of

homes

(i)

................................................... [1]

(ii)

................................................... [2]

It was later discovered that an error had been made, and that h homes with 8 or more people were

missing from the original data.

(iii)

Find the maximum possible value of h such that the median will be unchanged when the extra

data is included.

................................................... [2]


3

Maria and Nico each have a tin containing 5 white, 8 milk and 7 dark chocolates.

(i)

Maria selects two chocolates at random from her tin and eats them.

Find the probability that

(a) both are white chocolates,

................................................... [2]

(b) exactly one is a white chocolate.

................................................... [2]

(ii)

Nico selects chocolates at random from his tin until he finds a milk chocolate. He returns

unwanted chocolates to the tin after each selection.

Find the probability that it will take him fewer than 3 attempts to find a milk chocolate.

................................................... [2]
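The three probabilities in this question can be checked with exact fractions. The sketch below assumes Maria's two chocolates are drawn without replacement (she eats them) and that Nico's draws are with replacement (he returns unwanted chocolates).

```python
from fractions import Fraction as F

white, milk, dark = 5, 8, 7
total = white + milk + dark

# (i)(a) two whites, drawn without replacement
p_both_white = F(white, total) * F(white - 1, total - 1)

# (i)(b) exactly one white: white-then-other or other-then-white
other = total - white
p_one_white = 2 * F(white, total) * F(other, total - 1)

# (ii) with replacement: milk on attempt 1, or a miss followed by milk on attempt 2
p_milk = F(milk, total)
p_fewer_than_3 = p_milk + (1 - p_milk) * p_milk

print(p_both_white, p_one_white, p_fewer_than_3)
```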


4

The masses, in grams, of a sample of potatoes from a crop are shown in the table below.

Mass, m (g)        Number of potatoes
30 ≤ m < 50        14
50 ≤ m < 100       63
100 ≤ m < 150      82
150 ≤ m < 250      47
250 ≤ m < 400      19
400 ≤ m < 600      12

(i)

For these data, state the name of the most appropriate measure of central tendency and the

name of the most appropriate measure of dispersion. Give a reason for your answers.

Measure of central tendency .......................................................

Measure of dispersion .......................................................

Reason .....................................................................................................................................

.............................................................................................................................................. [3]

(ii)

Without drawing a graph, calculate an estimate of the number of potatoes from this sample

which are classified as large.

................................................... [3]


5

At a school the 100 male and 120 female students choose to study one of the three options Music,

Drama or Art.

Their choices are illustrated in the chart below.

[Chart: percentage of male and female students choosing each of Art, Drama and Music; vertical axis "Percentage of students", scaled 0 to 100.]

(i)

...............................................................................................................................................[1]

(ii)

Calculate the numbers of males and females taking each option and insert them into the

table below.

          Music    Drama    Art
Male
Female

[2]


(iii)

Display your data from part (ii) in a fully-labelled dual bar chart using the key provided.

Male

Female

[3]

(iv)

Give one advantage that the dual bar chart you have drawn has over the chart given at the

start of the question.

...................................................................................................................................................

.............................................................................................................................................. [1]


6

Two unbiased six-sided dice, one blue and one green, each with faces numbered 1, 2, 3, 4, 5 and

6 are thrown.

The following are some of the possible outcomes.

(i)

(a) independent events,

...................................................................................................................................... [2]

(b) mutually exclusive events.

...................................................................................................................................... [2]

(ii)

................................................... [3]

Event D and a fifth event, E, are known to be mutually exclusive.

(iii)

................................................... [1]


Section B [64 marks]

Answer not more than four of the questions 7 to 11.

Each question in this section carries 16 marks.

time

(i)

1 .......................................................................................................................................

...........................................................................................................................................

2 .......................................................................................................................................

...................................................................................................................................... [2]

(ii)

If n-point moving average values were to be calculated for the variable V, state an

appropriate value for n.

................................................... [1]

(iii)

State whether or not it would be necessary to centre the moving average values in this

case. Clearly explain the reason for your answer.

...........................................................................................................................................

...........................................................................................................................................

...................................................................................................................................... [3]


(b) The table below shows fertilizer sales (in thousands of tonnes) by a company each quarter for

a period of 3 years.

Year    Quarter    Sales            4-point            Centred 4-point
                   ('000 tonnes)    moving average     moving average
2010    I          84
        II         65
                                    a = ....................
        III        92                                  74.5
                                    74
        IV         59                                  73.625
                                    73.25
2011    I          80                                  72.625
                                    72
        II         62                                  71.5
                                    71
        III        87                                  b = ..............................
                                    70.5
        IV         55                                  70
                                    69.5
2012    I          78                                  68.875
                                    68.25
        II         58                                  67.75
                                    67.25
        III        c = .......................
        IV         51

(i)

Calculate the values of a, b and c and enter them in the table above.

[3]


(ii)

Use the Sales and Centred 4-point moving average values for quarter II of 2011 and

2012 to find an estimate for the seasonal component of quarter II. Give your answer, in

thousands of tonnes, correct to one decimal place.

....................................................[3]
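The moving-average arithmetic behind parts (i) and (ii) can be sketched as follows. The sales list assumes the three unlabelled sales figures (84, 80 and 78) belong to quarter I of each year, and the seasonal component is taken as the mean deviation of sales from the centred trend value (an additive model, which is an assumption here).

```python
# Quarterly sales ('000 tonnes), 2010 I to 2012 II, as listed in the table
sales = [84, 65, 92, 59, 80, 62, 87, 55, 78, 58]

# 4-point moving averages, each sitting between the 2nd and 3rd quarter it covers
ma4 = [sum(sales[i:i + 4]) / 4 for i in range(len(sales) - 3)]

# Centring: average adjacent 4-point values so each lines up with a quarter
centred = [(a + b) / 2 for a, b in zip(ma4, ma4[1:])]

# c follows from the printed moving average 68.25 covering 2011 IV to 2012 III
c = 4 * 68.25 - (55 + 78 + 58)

# Quarter II deviations from trend, using the centred values printed in the table
deviations = [62 - 71.5, 58 - 67.75]
seasonal_q2 = sum(deviations) / len(deviations)

print(ma4[0], centred[4], c, round(seasonal_q2, 1))
```

Here `ma4[0]` corresponds to a and `centred[4]` to b. Centring is needed because a 4-point average falls between two quarters rather than on one.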

(iii)

Plot the centred moving average values on the grid below and draw the trend line.

[Grid: vertical axis "Fertilizer sales ('000 tonnes)", scaled 60 to 75; horizontal axis showing quarters from I 2010 to II 2013.]

[2]

(iv)

Use your trend line and answer to part (ii) to estimate the sales for quarter II of 2013.

....................................................[2]


8

In order to calculate a weighted aggregate cost index, a café owner divides his expenditure into

three categories: Ingredients, Electricity and Wages.

He collects the following data for the year 2011.

Ingredients cost a total of $15 600.

Electricity cost $0.09 per unit.

A total of 5000 units of electricity were used.

A total of 4000 staff hours were worked.

The average wage per hour for all the staff was $6.50.

(i)

Show that the café owner should assign weights to the three categories Ingredients, Electricity

and Wages in the ratio 312 : 9 : 520.

[3]

(ii)

Using the following information, complete the table below, giving price relatives to the nearest

integer where appropriate.

2011 is the base year.

The cost of ingredients increased by 8% from 2011 to 2012.

The price of electricity rose to $0.11 per unit in 2012.

The average wage per hour for all staff fell by 3% from 2011 to 2012.

               Price relatives
               2011    2012
Ingredients
Electricity
Wages

[5]


(iii)

Calculate a weighted aggregate cost index for the year 2012, taking 2011 as base year, giving

your answer correct to one decimal place.

................................................... [3]

(iv)

Use the index calculated in part (iii) and the costs for 2011 to estimate, to 3 significant figures,

the total cost of running the café in 2012.

................................................... [3]
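The chain of calculations in parts (i) to (iv) can be sketched as below. This is one possible layout, not a mark scheme: the weights are the 2011 expenditures reduced to their simplest ratio, and the estimate applies the index rounded to one decimal place to the 2011 total.

```python
# 2011 expenditure in dollars, used as weights
costs_2011 = {
    "Ingredients": 15600,
    "Electricity": 450,    # 5000 units at $0.09
    "Wages": 26000,        # 4000 hours at $6.50
}
weights = {k: v // 50 for k, v in costs_2011.items()}   # 312 : 9 : 520

# 2012 price relatives (2011 = 100), rounded to the nearest integer
relatives = {
    "Ingredients": 108,                        # up 8%
    "Electricity": round(100 * 0.11 / 0.09),   # 122
    "Wages": 97,                               # down 3%
}

index_2012 = sum(weights[k] * relatives[k] for k in weights) / sum(weights.values())
estimate_2012 = sum(costs_2011.values()) * round(index_2012, 1) / 100

print(weights, round(index_2012, 1), round(estimate_2012, -2))
```

Because the weights are fixed at 2011 spending patterns, the estimate ignores any change in quantities used in 2012.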

(v) Give two possible reasons why your estimate for 2012 may be very inaccurate.

Reason 1 ...................................................................................................................................

...................................................................................................................................................

Reason 2 ...................................................................................................................................

.............................................................................................................................................. [2]


9

A turn at a game consists of throwing a pair of unbiased coins, each with a head on one side and

a tail on the other. A point is scored every time a turn produces a pair of heads. A game consists of

three turns.

(i)

(a) show that the probability of scoring three points is 1/64,

[2]

(b) find the probability of scoring exactly two points.

................................................... [3]

(ii)

If X is the number of points scored in one game, find the probability of each of the remaining

possible values of X. Hence produce a table showing all the possible values of X together with

their probabilities.

[4]
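The distribution of X can be tabulated with a short sketch. It treats the three turns as independent, each scoring a point with probability 1/4 (both fair coins showing heads), so X follows a binomial distribution.

```python
from fractions import Fraction as F
from math import comb

p_point = F(1, 2) ** 2   # both coins show heads on one turn
turns = 3

# Binomial probabilities for X = number of points in a game of three turns
dist = {
    x: comb(turns, x) * p_point**x * (1 - p_point) ** (turns - x)
    for x in range(turns + 1)
}

for x, p in dist.items():
    print(x, p)
```

A quick check on any such table is that the probabilities sum to 1.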


Each game of three turns costs $4 to play.

A player wins $58 for scoring three points.

A player wins nothing for scoring fewer than two points.

Rashid decides to play the game, and scores exactly two points.

(iii)

................................................... [3]

As an alternative to taking the money won in part (iii), a player who has scored exactly two points

is given the option of throwing another single coin once. This coin is weighted in favour of tails, and

is four times more likely to show a tail than a head.

If this option is taken the player will win $50 for a head and $12.50 for a tail.

(iv)

Determine, by calculation, whether or not Rashid should risk throwing the extra coin.

................................................... [4]
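The expected winnings from the extra coin can be sketched as follows, reading "four times more likely to show a tail than a head" as odds of 4 to 1 (an interpretation, though the usual one). The result would then be compared with the amount won for exactly two points.

```python
from fractions import Fraction as F

# Tail is four times as likely as head, so P(head) : P(tail) = 1 : 4
p_head = F(1, 5)
p_tail = F(4, 5)

expected_win = p_head * 50 + p_tail * F(25, 2)   # $50 for a head, $12.50 for a tail
print(float(expected_win))
```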


10 (a) The number of calls per day received at a fire station over a period of time is shown in the

table below.

Number of calls per day

Number of days

13

11

(i)

Calculate how many calls were received, in total, over the period of time.

................................................... [2]

(ii)

Calculate the mean number of calls per day, correct to one decimal place.

................................................... [3]

(b) The mean and standard deviation of three numbers a, b and c are 11 and 3 respectively.

Complete the table below by finding the mean and standard deviation of each of the four sets

of numbers shown.

                                              Mean    Standard deviation
The three numbers a − 1, b − 1, c − 1
The three numbers a/2, b/2, c/2
The three numbers 5a + 3, 5b + 3, 5c + 3
The six numbers a, a, b, b, c, c

[4]


(c) The students in a mathematics class were given a test in Algebra and a test in Geometry,

both with a maximum mark of 100. The following table summarises their results.

            Class mean    Class standard deviation
Algebra     55            10
Geometry    40            4.5

(i)

Explain what these figures tell you about the differences between the marks in the

Algebra test and the marks in the Geometry test.

...........................................................................................................................................

...........................................................................................................................................

...........................................................................................................................................

...................................................................................................................................... [2]

(ii)

Use the class means and standard deviations to state, with a reason, in which test

Priyanka scored better in relation to the rest of the class.

...........................................................................................................................................

...........................................................................................................................................

...................................................................................................................................... [2]

(iii)

The highest mark scored by any pupil in the Algebra test was 87. It is required to scale

the marks so that the scaled mean is 60 and the scaled highest mark is 100.

Calculate the scaled standard deviation which must be used to achieve this.

................................................... [3]
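The scaling in part (iii) can be sketched as below, assuming the Algebra test has mean 55 and standard deviation 10 as read from the table. A linear scaling keeps each mark the same number of standard deviations from the mean, which fixes the scaled standard deviation.

```python
algebra_mean, algebra_sd = 55, 10
highest_raw = 87
scaled_mean, highest_scaled = 60, 100

# The top mark keeps the same standardised score, so
# (87 - 55) / 10 = (100 - 60) / scaled_sd
scaled_sd = algebra_sd * (highest_scaled - scaled_mean) / (highest_raw - algebra_mean)

print(scaled_sd)
```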


11 At a jam-making factory, 90 jars are filled with jam in fifteen minutes. A sample of 9 of the jars

needs to be taken to check that the mass of jam in the jars is within acceptable limits.

The jars are numbered from 00 to 89.

Asad thinks that the best method for selecting the sample is to take a simple random sample.

RANDOM NUMBER TABLE

47 00 51 96 32 47 85 11 67 05 10 90 28 73

92 01 55 83 76 34 41 29 07 24 63 15 59 81

44 03 59 99 14 27 20 30 09 78 60 04 81 65

(i)

Starting at the beginning of the first row of the random number table, and working along the

row, find Asad's sample, ensuring that no jar is selected more than once.

.............................................................................................................................................. [2]
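The selection rule for the simple random sample can be sketched as follows: read two-digit numbers along the first row, skip any above 89 and any repeats, and stop once nine jars have been chosen.

```python
row1 = [47, 0, 51, 96, 32, 47, 85, 11, 67, 5, 10, 90, 28, 73]

sample = []
for n in row1:
    if n <= 89 and n not in sample:   # skip out-of-range numbers and repeats
        sample.append(n)
    if len(sample) == 9:
        break

print([f"{n:02d}" for n in sample])
```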

Omar thinks that the best method for selecting the sample is to take a systematic sample.

(ii)

By starting at the beginning of the second row of the random number table, and working

along the row, select the first jar in Omar's sample. State also the numbers of the remaining

jars in his sample.

.............................................................................................................................................. [3]
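A systematic sample of 9 from 90 jars takes every 10th jar after a random start. The sketch below assumes the common convention that the start is the first number in the row that falls within the first interval (00 to 09).

```python
row2 = [92, 1, 55, 83, 76, 34, 41, 29, 7, 24, 63, 15, 59, 81]

population, size = 90, 9
interval = population // size           # every 10th jar

# ASSUMPTION: the start is the first table entry that falls in 00..interval-1
start = next(n for n in row2 if n < interval)
sample = [start + k * interval for k in range(size)]

print([f"{n:02d}" for n in sample])
```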

The jam-making factory has three machines A, B and C which put jam into jars, and two packers

X and Y who pack jars into boxes.

Each jar is filled by one of the machines and then packed into a box by one of the packers.

The two-way table shows how many jars filled by each machine were packed by each packer in

fifteen minutes.

            Machine A    Machine B    Machine C
Packer X    10           11           18
Packer Y    10           19           22

(iii)

If a sample of size 9 stratified by machine were to be taken, calculate how many jars from

each machine would be required.

Machine A .......................................................

Machine B .......................................................

Machine C .................................................. [2]


The jars from machines A, B and C are those numbered 00 – 19, 20 – 49 and 50 – 89 respectively.

(iv)

Comment on how accurately the samples taken by Asad and Omar represent the jars filled by

each machine.

...................................................................................................................................................

...................................................................................................................................................

...................................................................................................................................................

.............................................................................................................................................. [2]

(v)

Starting at the beginning of the third row of the table, and moving along the row, select a

sample of size 9 stratified by machine. Use every number if the machine to which it relates

has not yet been fully sampled.

...................................................................................................................................................

.............................................................................................................................................. [3]
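The stratified selection can be sketched as below. It assumes proportional allocation (which divides exactly here, since the machines filled 20, 30 and 40 of the 90 jars) and it skips any number above 89, any repeat, and any number whose machine has already been fully sampled.

```python
row3 = [44, 3, 59, 99, 14, 27, 20, 30, 9, 78, 60, 4, 81, 65]

# Jar number ranges per machine: A is 00-19, B is 20-49, C is 50-89
strata = {"A": range(0, 20), "B": range(20, 50), "C": range(50, 90)}
quota = {m: len(r) * 9 // 90 for m, r in strata.items()}   # proportional allocation

chosen = {m: [] for m in strata}
for n in row3:
    for m, r in strata.items():
        if n in r and n not in chosen[m] and len(chosen[m]) < quota[m]:
            chosen[m].append(n)

print(quota, chosen)
```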

(vi)

If a sample of size 9 stratified by packer were to be taken, calculate how many jars from each

packer would be required.

Packer X ..................................................

Packer Y .................................................. [2]

(vii)

If a stratified sample were to be taken, state whether it would be more appropriate, in this

case, to stratify by machine or by packer. Explain your answer.

...................................................................................................................................................

...................................................................................................................................................

.............................................................................................................................................. [2]


Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every

reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the

publisher will be pleased to make amends at the earliest possible opportunity.

Cambridge International Examinations is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of Cambridge Local

Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.


Cambridge Ordinary Level


4040/23

STATISTICS

Paper 2

October/November 2014

2 hours 15 minutes

Additional Materials:

Pair of compasses

Protractor

Write your Centre number, candidate number and name on all the work you hand in.

Write in dark blue or black pen.

You may use an HB pencil for any diagrams or graphs.

Do not use staples, paper clips, glue or correction fluid.

DO NOT WRITE IN ANY BARCODES.

Answer all questions in Section A and not more than four questions from Section B.

If working is needed for any question it must be shown below that question.

The use of an electronic calculator is expected in this paper.

At the end of the examination, fasten all your work securely together.

The number of marks is given in brackets [ ] at the end of each question or part question.

DC (SJF/SLM) 83687/4


Section A [36 marks]

Answer all of the questions 1 to 6.

The number of DVDs bought in a year by each person in a sample of 46 people is given in the

following table.

Number of DVDs bought

12

13

14

15

16

17

18

Number of people

10

(i)

State the modal number of DVDs bought in the year by these people.

....................................................[1]

(ii) Find the median number of DVDs bought in the year by these people.

....................................................[2]

(iii)

A number, k, of other people, all of whom bought 22 DVDs in the year, have been omitted

from the table.

If these people are included, find

(a) the greatest possible value of k if the median is unchanged,

....................................................[2]

(b) the greatest possible value of k if the median and the mode are both unchanged.

....................................................[1]

UCLES 2014

4040/23/O/N/14

3

2

Letters posted in the UK may be sent by either 1st or 2nd class post. 40% are sent 1st class.

The following table shows the number of days after posting on which a letter is delivered.

Days after posting

1st class

80%

20%

2nd class

50%

30%

20%

Find the expected number of days for a randomly chosen letter to be delivered.

....................................................[6]


3

A teacher asked each of the 15 boys and 10 girls in her class to estimate the length, in cm, of a

piece of string she showed them.

The estimated lengths are summarised in the following table.

         Number of    Sum of     Sum of squares
         pupils       lengths    of the lengths
Boys     15           270        5372
Girls    10           160        2759

(i)

....................................................[1]

(ii)

....................................................[1]

(iii)

Calculate the sum of the squares of the estimated lengths of all the pupils.

....................................................[1]

(iv)

Hence calculate the standard deviation of the estimated lengths of all the pupils.

....................................................[3]
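The pooled calculation asked for in parts (iii) and (iv) can be sketched as follows. This is only an illustrative check, assuming the third column of the table holds sums of squares and that the population form of the standard deviation (dividing by n) is intended; the variable names are not from the paper.

```python
import math

# Summary figures from the table: (number of pupils, sum of lengths, sum of squares)
boys = (15, 270, 5372)
girls = (10, 160, 2759)

# Pool the two groups by adding the counts, the sums and the sums of squares.
n = boys[0] + girls[0]
total = boys[1] + girls[1]
total_sq = boys[2] + girls[2]

mean = total / n
# Assumed population form: variance = (sum of squares) / n - mean squared
variance = total_sq / n - mean ** 2
sd = math.sqrt(variance)

print(n, total, total_sq, round(mean, 2), round(sd, 2))
```

The point of the sketch is that the standard deviation of the whole class is found from the combined totals, not by averaging the two groups' standard deviations.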


4

(i)

For purposes of comparison with another variable, values of X are to be scaled so that they

have a mean of 30 and a standard deviation of 6.

Find the value of X which would be unchanged by this scaling.

....................................................[3]

(ii)

The highest value of X was 51. It is now wished to scale values of X to produce a new variable,

Y, such that the mean of Y is 50 and the highest value of Y is 100.

Find the standard deviation of Y necessary to achieve this.

....................................................[3]


5

A man who goes on a camping holiday each year classifies his expenditure under three headings:

Food, Clothing and Equipment.

The amounts, in dollars, which he spent in each of the years 2010 and 2011 are given in the

following table.

Year    Food    Clothing    Equipment
2010    135     165         200
2011    124     132         144

(i)

On the grid below illustrate the data by a dual bar chart, using one double bar for each item of

expenditure.

[3]

(ii)

On the grid below illustrate the data by a sectional (component) percentage bar chart, using

one bar for each year.

[3]
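For the percentage bar chart in part (ii), each year's amounts first have to be converted to percentages of that year's total. A minimal sketch of that conversion, using the figures from the table (the function and variable names are illustrative only):

```python
# Expenditure in dollars, taken from the table in the question.
spending = {
    2010: {"Food": 135, "Clothing": 165, "Equipment": 200},
    2011: {"Food": 124, "Clothing": 132, "Equipment": 144},
}

def percentage_components(items):
    """Return each item's share of the year's total, as a percentage."""
    total = sum(items.values())
    return {name: 100 * amount / total for name, amount in items.items()}

percentages = {year: percentage_components(items) for year, items in spending.items()}

for year, shares in percentages.items():
    print(year, shares)
```

Each bar in the percentage chart then has the same total height (100%), with sections proportional to these shares.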


6

(a) Explain briefly why it is not possible to illustrate a qualitative variable by a histogram.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

(b) A quantitative variable can be either discrete or continuous. Briefly explain the difference

between these two types of quantitative variable.

...................................................................................................................................................

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]

(c) A number of athletes all run for one hour and then the numbers of kilometres they have run

are formed into a grouped frequency distribution, of which the classes are labelled 12–13,

14–15, 16–17 etc.

State the mid-point of the 14–15 class if

(i)

....................................................[1]

(ii)

the distances have been rounded to the nearest whole number of kilometres.

....................................................[1]


Section B [64 marks]

Answer not more than four of the questions 7 to 11.

Each question in this section carries 16 marks.

The treasurer of a cricket club is carrying out an analysis of changes in club expenditure.

He has summarised expenditure for the year 2012 as follows:

Total cost of maintenance to the grounds and buildings        $10 000
Average cost of one box of three cricket balls                $50
Number of balls purchased                                     75
Cost of services such as electricity and water supply         $2500
Wage rate paid per hour to the club groundsman                $12.50
Number of hours worked by the groundsman during the year      600

(i)

Use these data to show that the treasurer should assign weights to the four categories

Maintenance, Balls, Services, Wages in the ratio 8 : 1 : 2 : 6.

[4]
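One way to check the ratio in part (i) is to work out the 2012 expenditure in each category and reduce the four amounts by their common factor. The sketch below assumes the weights are meant to be proportional to total expenditure; the variable names are illustrative.

```python
from functools import reduce
from math import gcd

# 2012 expenditure per category, in dollars, derived from the treasurer's summary.
maintenance = 10_000
balls = (75 // 3) * 50        # 75 balls = 25 boxes of three, at $50 per box
services = 2_500
wages = int(12.50 * 600)      # hourly rate times hours worked

costs = [maintenance, balls, services, wages]

# Divide by the greatest common divisor to express the costs as a simple ratio.
divisor = reduce(gcd, costs)
weights = [cost // divisor for cost in costs]

print(costs, weights)
```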

In 2013, as compared with 2012, the following changes occurred.

Maintenance costs increased by 2%.

By changing the supplier the cost of balls decreased by 10%.

Cost of services increased by 5%.

The groundsman's hourly wage rate was increased by 3%.

(ii)

Write down price relatives for 2013, taking 2012 as base year, for each of the four categories.

Maintenance .......................................................

Balls .......................................................

Services .......................................................

Wages ...................................................[3]


(iii)

Calculate a weighted aggregate cost index for 2013, taking 2012 as base year.

....................................................[4]

(iv)

Use the index you have calculated in part (iii) and the costs for 2012 to estimate the total cost

of running the club in 2013.

................................................... [3]
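Parts (ii) to (iv) can be sketched as below. The sketch assumes the index is the weighted mean of the price relatives, using the weights 8 : 1 : 2 : 6 from part (i), and that the 2012 total is the sum of the four category costs; it is an illustration, not a mark scheme.

```python
weights = {"Maintenance": 8, "Balls": 1, "Services": 2, "Wages": 6}

# Percentage changes from 2012 to 2013, as stated in the question.
percent_change = {"Maintenance": 2, "Balls": -10, "Services": 5, "Wages": 3}

# A price relative is the 2013 price as a percentage of the 2012 price.
relatives = {name: 100 + change for name, change in percent_change.items()}

# Assumed form of the index: weighted mean of the price relatives.
index = sum(weights[name] * relatives[name] for name in weights) / sum(weights.values())

# 2012 costs per category: maintenance, balls, services, wages.
total_2012 = 10_000 + 1_250 + 2_500 + 7_500
estimate_2013 = total_2012 * index / 100

print(relatives, index, estimate_2013)
```

The estimate assumes the same quantities are bought in 2013 as in 2012, which is one of the reasons the true cost may differ.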

(v)

Give two reasons why your estimate for 2013 may be very different from the true cost in 2013.

Reason 1 ..................................................................................................................................

...................................................................................................................................................

Reason 2 ..................................................................................................................................

...............................................................................................................................................[2]


8

A batch of 500 plastic rods is used as a statistical teaching aid. The following table summarises

the lengths of the rods in centimetres.

Length (cm)      Frequency    Cumulative frequency
1 – under 2      12
2 – under 3      197
3 – under 4      33
4 – under 5      13
5 – under 6      124
6 – under 7      22
7 – under 8      11
8 – under 9      88

(i)

....................................................[1]

(ii)

....................................................[1]

(iii)

(iv)

[2]

....................................................[3]

(v)

Calculate estimates of the two quartiles and hence obtain an estimate of the interquartile

range.

Upper quartile .......................................................

Interquartile range ...................................................[5]
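The quartile estimates in part (v) are usually obtained by linear interpolation within the class containing the required position. The sketch below assumes the quartiles are taken at positions n/4 and 3n/4 (and the median at n/2) and that values are spread evenly within each class; the function name is illustrative.

```python
# Lower class boundaries (cm) and frequencies from the table; every class has width 1.
lower_bounds = [1, 2, 3, 4, 5, 6, 7, 8]
frequencies = [12, 197, 33, 13, 124, 22, 11, 88]
class_width = 1

def interpolate(position):
    """Estimate the length at a given cumulative position by linear interpolation."""
    cumulative = 0
    for lower, frequency in zip(lower_bounds, frequencies):
        if cumulative + frequency >= position:
            return lower + (position - cumulative) / frequency * class_width
        cumulative += frequency
    raise ValueError("position exceeds the total frequency")

n = sum(frequencies)
lower_quartile = interpolate(n / 4)
median = interpolate(n / 2)
upper_quartile = interpolate(3 * n / 4)
iqr = upper_quartile - lower_quartile

print(n, round(lower_quartile, 2), round(median, 2), round(upper_quartile, 2), round(iqr, 2))
```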


(vi)

(a) Calculate the difference between the median and each of the two quartiles.

........................................................

....................................................[1]

(b) Comment on what your answer to part (a) tells you about the shape of this distribution.

...........................................................................................................................................

.......................................................................................................................................[1]

(vii)

If a cumulative frequency graph of these data were drawn, state, with a reason, in which part

of the graph the gradient would be at its steepest.

...................................................................................................................................................

...................................................................................................................................................

...............................................................................................................................................[2]


9

(a) (i)

...........................................................................................................................................

.......................................................................................................................................[1]

(ii)

...........................................................................................................................................

.......................................................................................................................................[1]

(iii)

P(A) = 0.5 ,

P(B) = 0.6 .

(a) Explain why, without any further information, it can be stated correctly that A and B

are not mutually exclusive.

....................................................................................................................................

................................................................................................................................[1]

(b) Find the value which P(AB ) must have if A and B are independent.

....................................................[2]

(b) A survey is being carried out at a college as to the driving status of its students.

The following table summarises the responses of a sample of 60 students.

Driving status

Males

Females

16

Learning to drive

but not yet taken a driving test

10

not yet taken a driving test

(i)

....................................................[2]


(ii)

....................................................[2]

(iii)

a female student chosen at random has not yet taken a driving test,

....................................................[2]

(iv)

two students chosen at random have both not started to learn to drive,

....................................................[2]

(v)

of two students chosen at random, without replacement, the first is male and the second

has taken a driving test but failed.

....................................................[3]


10 A statistician wishes to invite 5 people to dinner from among her fifteen closest friends and

relatives, and decides to select the 5 by applying a sampling procedure to a list of the fifteen.

The following table lists the fifteen people, numbered 00 to 14, classified by age group and by

whether they are a friend or a relative.

Person

00

01

02

03

04

05

06

07

08

09

10

11

12

13

14

Age group

II

II

II

III

II

III

II

II

III

Friend/relative

F = friend, R = relative.

You are asked to help the statistician by applying four different sampling procedures to select a

sample of size 5 from this population, using the two-digit random number table below. No person

may be selected more than once in any one sample.

TWO-DIGIT RANDOM NUMBER TABLE

61

91

06

72

(i)

12

65

15

24

00

05

09

11

18

78

79

29

12

35

08

13

07

00

59

18

09

26

10

15

53

25

03

10

01

11

04

66

15

32

37

06

74

03

02

02

45

19

99

00

14

21

01

09

Starting at the beginning of the first row of the table, and moving along the row, select a

simple random sample of the required size.

....................................................[2]

(ii)

A systematic sample is to be selected by starting at the beginning of the second row of the

table, and moving along the row.

(a) Write down the smallest possible and the largest possible two-digit numbers of the first

person selected.

....................................................[1]

(b) Write down the number of the first person selected.

....................................................[1]

(c) Write down the numbers of the other four people selected for the systematic sample.

....................................................[1]
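The two procedures in parts (i) and (ii) can be sketched as follows. The row of random numbers used here is hypothetical and is not taken from the table in the question; the function names are also illustrative. The sketch assumes people are numbered from 00 and that the systematic interval is the population size divided by the sample size.

```python
def simple_random_sample(random_numbers, population_size, sample_size):
    """Take numbers in order, skipping any outside the population or already chosen."""
    chosen = []
    for number in random_numbers:
        if number < population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

def systematic_sample(random_numbers, population_size, sample_size):
    """Pick a random start within the first interval, then step through the list."""
    interval = population_size // sample_size
    start = next(number for number in random_numbers if number < interval)
    return [start + k * interval for k in range(sample_size)]

# Hypothetical row of two-digit random numbers, for illustration only.
hypothetical_row = [47, 3, 88, 3, 14, 21, 9, 0, 60, 7, 1, 33, 12]

srs = simple_random_sample(hypothetical_row, population_size=15, sample_size=5)
systematic = systematic_sample(hypothetical_row, population_size=15, sample_size=5)

print(srs, systematic)
```

With 15 people and a sample of 5 the interval is 3, so the first person selected in the systematic sample must be numbered 00, 01 or 02.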


(iii)

(a) State how many friends and how many relatives would be selected for such a sample.

....................................... friends

.................................... relatives [1]

(b) Starting at the beginning of the third row of the table, and moving along the row, select

this sample. Use every number if the category to which it relates has not yet been fully

sampled.

...........................................................................................................................................

.......................................................................................................................................[3]

(iv)

(a) State how many people from each age group would be selected for such a sample.

....................... from age group I

...................... from age group II

..................... from age group III [1]

(b) Starting at the beginning of the fourth row of the table, and moving along the row, select

this sample. Use every number if the age group to which it relates has not yet been fully

sampled.

...........................................................................................................................................

.......................................................................................................................................[2]

(v)

State, for each of the two stratified samples, with a reason, whether or not it represents the

population exactly.

...................................................................................................................................................

Stratified by age group ..............................................................................................................

...............................................................................................................................................[4]


11 The numbers of absences recorded per day in a school over a period of three weeks are shown in

the table below.

Week    Day          Number of    5-point         5-point moving
                     absences     moving total    average

1       Monday       35
        Tuesday      20
        Wednesday    18           140             28.0
        Thursday     24           137             27.4
        Friday       43           133             26.6
2       Monday       32           x = ....        25.4
        Tuesday      16           125             25.0
        Wednesday    12           121             24.2
        Thursday     22           122             24.4
        Friday       39           124             y = ....
3       Monday       33           128             25.6
        Tuesday      18           127             25.4
        Wednesday    16           125             25.0
        Thursday     21
        Friday       37

(i)

State why it is most appropriate to calculate values of a 5-point moving average in order to

analyse these data.

...................................................................................................................................................

...............................................................................................................................................[1]

(ii)

Give a reason why the moving average values have not been centred.

...................................................................................................................................................

...............................................................................................................................................[1]


(iii)

Plot the numbers of absences on the grid below and describe what they show.

[Grid for plotting: vertical axis "Number of absences", 0 to 50 in steps of 10; horizontal axis Mon to Fri for each of Week 1 to Week 4]

...................................................................................................................................................

...............................................................................................................................................[3]

(iv)

Calculate the values of x and y and insert them into the table.

[2]
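The moving totals and averages in the table, including the two missing entries x and y, can be checked with a short sketch such as the following. Each total is aligned with the middle day of its five-day window; the variable names are illustrative.

```python
# Daily absences for the three weeks, Monday to Friday, from the table.
absences = [35, 20, 18, 24, 43,
            32, 16, 12, 22, 39,
            33, 18, 16, 21, 37]
window = 5

# Each moving total covers five consecutive days and belongs to the middle day.
totals = [sum(absences[i:i + window]) for i in range(len(absences) - window + 1)]
averages = [total / window for total in totals]

x = totals[3]     # window centred on Monday of Week 2
y = averages[7]   # window centred on Friday of Week 2

print(totals)
print(x, y)
```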

(v)

[2]

(vi)

State the purpose of calculating moving averages, and indicate whether this appears to have

been achieved in this case.

...................................................................................................................................................

...............................................................................................................................................[2]

(vii)

[1]


(viii)

The seasonal components for the days of the week are given in the following table.

Day of week

Monday

Tuesday

Wednesday

Thursday

Friday

Seasonal component

11

15

....................................................[2]

(ix)

Use your trend line and the appropriate seasonal component to estimate the number of

absences on the Tuesday of Week 4.

....................................................[2]


Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every

reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the

publisher will be pleased to make amends at the earliest possible opportunity.

Cambridge International Examinations is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of Cambridge Local

Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.


4040 Statistics November 2014

Principal Examiner Report for Teachers

STATISTICS

Paper 4040/12

Paper 12

Key Messages

After performing any calculation it is worth pausing to consider if the answer obtained is a reasonable one for

the practical situation of the question.

It is very important to carefully read the words of a question to understand precisely what is required.

Candidates should always try to relate their knowledge to the specific requirements of a question, including

the specific context involved, rather than simply writing out memorised general theory.

If a question specifies a certain degree of accuracy for numerical answers, full marks will not be obtained if

the instruction is not followed.

General Comments

The overall standard of work was higher this year. A substantial number of candidates obtained very good

marks, and there were few exceptionally low marks. It has been noted regularly in these reports that marks

are often lost due to final answers not being given to the accuracy specifically stated in the question. A

definite improvement in this respect was observed this year.

It has also been noted previously that a student of Statistics ought to be able to observe whether or not the

result of a calculation is reasonable in a given practical situation. If it is clearly unreasonable, the work can be

checked to find the error. But some candidates still seem to give no thought to the answer they obtain,

treating the exercise as one in Pure Mathematics, having no practical relevance. For example, in a cross

country running race (see Question 8 below), it should have been obvious that the mean time taken by the

senior competitors following the most difficult route could not have been less than ten minutes.

It will seem superfluous to remark that a question should be read carefully before an answer is attempted.

Yet there were several instances on the paper (see Questions 2, 4(ii), 8(vii) below) where this most basic

advice for answering examination questions was not followed, and where candidates seemed to assume

what they thought was to be done.

When questions are asked which require written answers, there is a tendency for some candidates to

respond in a very general way, repeating apparently memorised points, without relating their knowledge to

the particular context of the question (see Question 3 below). Also, for example, in a situation involving

accidents in the construction industry (see Question 7(iv) below) there should have been no explanations in

terms of death rates, when there was no mention whatsoever of deaths anywhere in the question.

Comments on Specific Questions

Section A

Question 1

In part (i) the mean and standard deviation were usually evaluated correctly. But in part (ii) many answers

said that the standard deviation would be unchanged after the error made in the gauge readings had been

corrected. It appears as though these candidates were confusing this question with the theory concerning

the effect on the mean and standard deviation of a variable by adding (or subtracting) a constant to each

observation in a set of data.

Answers: (i) 46.5, 4.46; (ii) mean smaller, standard deviation larger


Question 2

This question was not to test how statistical measures are calculated, but to test whether or not a candidate

could deduce which measure was which for a set of measures already calculated, from their relative

numerical values. Many fully correct answers were seen. Many answers were also seen where the candidate

did not read the question properly, but assumed, completely erroneously, that this was a set of raw data from

which the stated measures were to be found.

Answers: (i) 48, 43, 53, 6, 36; (ii) 53

Question 3

Strong answers in part (i) showed a good appreciation of the advantages and disadvantages of the different

survey methods in this particular situation. Weaker answers tended to be vague, speculative, or of a very

general nature. It is not enough in this type of question to say, for example, only that something is "easy":

one might validly ask in what way is it easy; what is it that makes it easy?

The difference between closed and open questions was generally well understood in part (ii). The main

weakness in answers tended to be seen in part (ii)(a) where there was sometimes too much focus on the

example given, rather than on closed questions in general.

The answers given below are far from exhaustive, but give examples of what would be considered good

answers.

Answers: (i)(a) citizens not in the telephone directory are excluded, (b) better response rate, (c) a very wide

range of people can be reached very quickly, people without internet access are excluded; (ii)(a)

only a limited number of answers is possible, (b) any relevant open question

Question 4

Many fully correct answers to part (i) were seen. In contrast, part (ii) was rarely answered well. This was

another instance of candidates not reading the question correctly. The variable is clearly stated to be the

"number of these vaccines received...", and this only takes the values 1, 2 or 3.

Answers: (i) the following numbers inserted into the correct spaces: (a) 19, (b) 20, (c) 17, (d) 34; (ii) 2

Question 5

Many candidates obtained good marks on this probability question. Errors tended to occur most frequently in

parts (ii) and (iii), with incorrect denominators being used. Some candidates made the solution more

complex than it needed to be in part (iv) by considering male and female authors separately, rather than all

of the authors together as a group.

Answers: (i) 3/5, (ii) 1/8, (iii) 2/8, (iv) 34/195

Question 6

Many candidates now recognise that, for a histogram, the frequency of a class is not always represented

simply by the height of the relevant column. In taking account of the column areas, however, occasional

errors were made which indicated that the labelling of the vertical axis had either not been read carefully, or

not properly understood. So for the under 50km/h class, for example, the height was multiplied by 50 rather

than 5. Part (iv) was least well done, with the total class frequency being offered, rather than the fraction of it

indicated in the question.

Answers: (i) 116, (ii) 62, (iii) 40, (iv) 6


Section B

Question 7

The question on crude and standardised rates continues to be answered exceptionally well, and there was

good application this year of basic knowledge to the problem of industrial accident rates. There were a few

cases in part (iv) of the reason referring to death rates, and as there was no mention of death rates in the

question, this could be given no credit. Only limited understanding was shown in part (vi) as to why the rates

found in the previous part were crude rates.

Answers: (i) 106.5; (ii) 40, 47.9, 75, 162.3; (iii) 102.0; (iv) Fastbuild, because its standardised accident rate

is lower; (v) Kwikbuild 30.7, Fastbuild 32.5, Kwikbuild; (vi) crude rates, a standardised rate is to

eliminate differences in population structures, so is meaningless for one category

Question 8

This question tested the abilities of candidates to interpret statistical information, presented in a mixture of

pictorial and tabular forms, relating to a particular situation. Responses were generally very good, with many

obtaining a high proportion of the marks available. Because of incorrect work a few candidates produced

answers to part (v) which, with pause for thought, should have been recognised as being utterly unrealistic.

When most of the competitors referred to took more than two hours to complete the route, it should have

been immediately apparent that the mean time taken could not have been, as was seen more than once,

less than ten minutes.

To answer part (vii) it was necessary to use all the sources of information: pictogram, pie chart and table.

Once more, this was an instance of a question not being read carefully, because many candidates did not

base their calculation on the senior competitors who had chosen the moderate route, as the question states,

but on the senior competitors who had completed the moderate route. Such candidates thereby used only

one of the sources of information, not all three.

Answers: (i) 280, (ii) (35/100) × 120, (iii) 72, (iv) 12, (v) 141 minutes, (vi) 17.3%, (vii) 37.5%

Question 9

Candidates responded to this question well, and a good number obtained correct answers to all the

numerical parts. One of the errors sometimes made in part (iii) was to divide the 60% by 2 for the overweight

people, but then to read BMI for a cumulative frequency of 30% rather than 70%. A general limitation in

answers to parts (iii) and (iv) was the absence of explanatory method. Whilst this did not matter when

answers were correct, marks for method could not be awarded when they were incorrect.

There is no single good answer to part (v). But to earn both marks it was necessary to say not only how the

health of the people of the country had changed, but to give this specific support by citing more than one of

the changes which had occurred, or citing actual statistics as calculated earlier in the question.

Answers: (i)(a) 23.5–23.8, (b) 26.2–26.5, (c) 21.2–21.5, (d) 29.5–29.8; (ii)(a) 57%–59%, (b) 36%; (iii) 29;

(iv) 22%; (v) the population became more unhealthy, because the percentage healthy decreased

from 58% to 36%, and the percentage obese increased from 7% to 22%

Question 10

Almost all candidates had a clear idea of the required steps in the plotting of data, and the calculation of

averages, to find the equation of the line of best fit. However, a common error seen this year in the

calculation of the semi-averages derived from the ordering of x values and y values as though they were

unconnected with each other, rather than linked pairs of values. This serious error resulted in the loss of

marks for many candidates.

There were mixed answers to part (vi), and only a minority seemed to appreciate the point in part (vii).

Answers: (ii) their x coordinates are not in the set of the four lowest x coordinates; (iii) (2, 41), (4.5, 69),

(3.25, 55); (iv) m = 11.0–11.4, c = 18–19; (v) 52; (vi) Science, because the gradient of the line of

best fit is the greatest; (vii) it would have been very difficult to know if the pupils performed well in

tests because they liked the subject, or they liked the subject because they performed well in tests

in it.
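As an illustration of how the values in (iii) and (iv) fit together, the line of best fit can be taken through the two semi-averages and checked against the overall mean. This is a sketch using the points quoted in the answers; it is not the only acceptable method, and answers read from a graph may differ slightly.

```python
# Semi-averages and overall mean, as quoted in the answers to part (iii).
lower_semi_average = (2, 41)
upper_semi_average = (4.5, 69)
overall_mean = (3.25, 55)

# Gradient and intercept of the line y = m*x + c through the two semi-averages.
m = (upper_semi_average[1] - lower_semi_average[1]) / (upper_semi_average[0] - lower_semi_average[0])
c = lower_semi_average[1] - m * lower_semi_average[0]

# The line through the semi-averages should also pass through the overall mean.
predicted_at_mean = m * overall_mean[0] + c

print(round(m, 2), round(c, 2), round(predicted_at_mean, 2))
```

The semi-averages are only valid if each x value stays paired with its own y value when the data are ordered, which is the error noted above.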


Question 11

There were some excellent fully correct answers to this question, but also some very weak ones. Where

good solutions were seen, marks were most frequently dropped in part (iii)(b) where the case of 0 eggs was

omitted. In weaker answers, little solid progress was made beyond part (i).

Answers: (i)(a) 0.81, (b) 0.19; (ii) 0.117; (iii)(a) 0.00972, (b) 0.982


STATISTICS

Paper 4040/13

Paper 13

Key Messages

Candidates should always try to relate their knowledge to the specific requirements of a question, including

the specific context involved, rather than simply writing out memorised general theory.

It is sound examination practice to show method clearly, so that marks for method can be awarded even if

the answer obtained is incorrect.

If a question specifies a certain degree of accuracy for numerical answers, the instruction must be followed

for full marks to be credited.

General Comments

The overall standard of work was comparable to that of last year. A wide range of marks was seen, but there

were few very high marks. The best performances were on Questions 1, 4 and 6 in Section A, and on

Question 10 in Section B.

This year candidates paid much better attention than has often been the case in the past to following

accuracy instructions, where given, as in Question 10.

In questions which require written answers, candidates should try to relate their knowledge to the specific

context of the question rather than simply repeat memorised knowledge of a general nature. The latter

tended to happen especially in Question 5, resulting in little creditworthy work.

Comments on Specific Questions

Section A

Question 1

Good knowledge of these measures was shown, and how they are found. There were many full mark

answers, but a mark was sometimes lost in the explanation for the median, the need to order the data initially

being omitted.

Answers: (i) 8 is the mode, and definition (ii) 9 is the median, and definition (iii) 11 is the mean, and

definition

Question 2

The variable was usually identified as discrete in part (i), but it was rarely explained what feature of the

variable made it so. Some used as an incorrect reason the fact that the cumulative frequency only has

integer values. A common error in part (iii) was to enter cumulative frequencies into the table.

Answers: (i) X is discrete, as it only takes integer values (ii) 0, 4 (iii) 0, 5, 15, 10, 0, 7, 6, 7


Question 3

In part (a), the way in which methods were different was identified more easily than the way in which they

were similar. Good answers referred to whether or not there was a need for a sampling frame, or for random

numbers, and whether or not the method was biased. Few expressed clearly the way in which they were

similar.

In part (b), whilst the correct choice between biased and unbiased was often made, the reasons offered were

usually not creditworthy. In particular, the fact that one of these is a form of random sampling, whilst the

other is not, was rarely recognised.

Answers: (a) similar in that both sample proportionately from the different age groups; different in that

stratified random sampling requires a sampling frame, whilst quota sampling does not

(b)(i) because there are likely to be fewer words on the last page of the chapter than on other

pages, the sample is likely to be biased (ii) because a systematic sample is a form of random

sampling, the sample is likely to be unbiased

Question 4

This question was a good source of marks for many. Any errors that were made were usually in plotting

cumulative frequencies at class mid points, and occasionally in part (iii)(b), finding the percentage smaller

than, instead of larger than, 37.2 mm.

Answers: (i) 0, 8, 18, 35, 46, 50 (iii)(a) correct reading from the graph presented (b) 14%–16%

Question 5

Candidates demonstrated that they had studied the advantages and disadvantages of the different forms of

chart, but often the answers seen made no reference to the practical situation of the question: that is, the

company producing wood. Answers such as "values can be compared" were not given credit. It also seems

not to have been appreciated that part (i) was not about the advantages and disadvantages of a dual bar

chart considered on its own, but the advantages and disadvantages of a dual bar chart as opposed to a

percentage bar chart. So any disadvantage offered which applies to both could not be credited.

Answers: (i) it shows actual amounts of wood; it only shows amounts for individual sizes (ii) total amount of

wood of all sizes produced (iii) pie chart, sectional bar chart (iv) change chart

Question 6

A substantial number of full-mark answers to this question was seen, with clear reasons well expressed in

part (iii).

Answers: (i) 5 (ii) none of these citizens speaks all three languages (iii)(a) no, because the person would

only speak two of the languages (b) yes, because the person would speak all three of the

languages (c) no, because the person would only speak one of these languages

Section B

Question 7

Correct answers were most frequently seen to part (a) and part (c)(i). It was puzzling in part (b) to see

products of two fractions frequently offered, when clearly three children were being chosen. In part (c)(ii)

only a few candidates recognised that two sums of two products would have to be worked out, corresponding

to the first choices being both blue, or both white.

Answers: (a) 88/105 (b)(i) 1/35 (ii) 1/7 (c) 86/189

Question 8

Good answers were those in which the measures for y were calculated correctly, then used with minimal

work to find the measures for x. Marks were frequently lost in all of parts (iv) to (viii).

For part (iv), the values in column 6 were often found by squaring those in column 5. For part (vi) and part

(vii) it was often assumed that there had been just 7 visitors to the gallery, not 105. And for part (viii) many

started again with the original data, instead of following the instructions of the question to use what had just

been calculated in the previous parts.

Most answers to part (ix) did not make a judgment by looking at the nature of this particular distribution, but

simply repeated apparently memorised theory about the use of the interquartile range.

Answers: (i) 15, 32.5, 37.5, 45, 55, 65, 85 (ii) -12, -5, -3, 0, 4, 8, 16 (iii) -72, -55, -12, 0, 104, 112, 64

(iv) 864, 275, 36, 0, 416, 896, 1024 (v) 141, 3511 (vi) 1.34 (vii) 5.62 (viii)(a) 48.4 (b) 14.1

(ix) because the distribution is reasonably symmetrical, standard deviation is preferable
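
The coded-data method behind parts (iv) to (viii) can be sketched in Python. The report does not reproduce the question's table, so the frequencies and the coding y = (x - 45)/2.5 used below are assumptions inferred from the published answers (they reproduce the totals 141 and 3511); they are not quoted from the question paper.

```python
from math import sqrt

# Class mid-values x, as given in the answer to part (i).
x_mid = [15, 32.5, 37.5, 45, 55, 65, 85]
# Frequencies inferred from the published fy values and the total of 105 visitors (assumption).
freq = [6, 11, 4, 40, 26, 14, 4]

# Assumed coding: y = (x - 45) / 2.5
origin, scale = 45, 2.5
y = [(x - origin) / scale for x in x_mid]

n = sum(freq)  # 105 visitors, not the 7 classes
sum_fy = sum(f * v for f, v in zip(freq, y))        # total of column 5
sum_fy2 = sum(f * v * v for f, v in zip(freq, y))   # column 6 is f * y^2, not (f * y)^2

mean_y = sum_fy / n
sd_y = sqrt(sum_fy2 / n - mean_y ** 2)

# Undo the coding: the mean is scaled and shifted, the standard deviation is only scaled.
mean_x = origin + scale * mean_y
sd_x = scale * sd_y

print(round(mean_y, 2), round(sd_y, 2), round(mean_x, 1), round(sd_x, 1))
```

Dividing by n = 105 rather than 7, and computing f * y^2 rather than squaring the fy column, are the two steps where the report says marks were most often lost.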

Question 9

Many candidates recognised in part (i) that they could not simply read off the heights of these columns to

produce the frequency table, but some did not. For part (ii) most formed the correct grouping, but a

good number made the mistake of grouping the classes in part (i) in pairs. In the latter case the classes did

not have equal width, so an appropriate histogram could not be produced in part (iii). A frequent problem

with the histogram was that the vertical axis was not properly labelled, so a possible follow through mark

could not be awarded on the heights drawn. Candidates should note that a vertical axis labelled "fd" or even

"frequency density" is not good enough; the labelling should be of the form given for the histogram at the

start of the question.

For part (v) marks were available for correct method following earlier errors, but the method had to be clearly

shown to earn these.

Answers: (i) 24, 36, 32, 21, 18, 22, 19, 28 (ii) frequencies 24, 68, 80, 28 (iv) 19.3 cm to 19.4 cm (v) 84%

Question 10

The question was very well done and a good source of marks for many candidates. The most common error

was in part (v), where the percentages in the second table were sometimes used instead of the percentages

for the standard population.

Candidates understand very well that it is the standardised rate that has to be used to make fair comparisons

in this type of situation; but they understand less well why it is that one town can have a higher crude rate,

but a lower standardised rate, than the other.

Answers: (ii) 7.18 (iii) 5.56, 7.83, 11.86 (iv) 6.49 (v) 7.90 (vi) populations of the towns are differently

structured in terms of age groups (vii) because its standardised death rate is lower, Eastbury

Question 11

It was a pity that, at the outset in part (i), some candidates did not label the plotted points as instructed, as

this labelling was needed later in the question to make judgments on the performance of the trainees. It was

in these judgment parts, (iv) and (vi), that most marks were lost.

Good answers were able to point out in part (iv) that results for A and B seemed to follow (different) straight

lines, whilst those for C were quite erratic. Once the experienced technician's result was plotted in part (v)

they then further added that B's results were clearly accurate. The best revised lines drawn in part (vii)

passed very closely through B's results and that of the experienced technician.

Answers: (ii) (37.5, 104.5) (iv) results of A and B fall approximately on straight lines, whilst results of C are

erratic (vi) experienced technician's result fits results of B very well, so it seems that results of B

are accurate (viii) correct reading from revised line drawn

STATISTICS

Paper 4040/22

Paper 22

Key Message

The most successful candidates in this examination were able both to calculate the required statistics and to

interpret their findings. In the numerical problems, candidates scoring the highest marks provided clear

evidence of the methods they had used in logical, clearly presented solutions. In questions requiring written

definitions, justification of given techniques and interpretation, the most successful candidates provided

detail in their explanations with clear thought given to the context of the problem, where appropriate.

General Comments

In general, candidates did better on the questions requiring numerical calculations than on those requiring

written explanations; in particular, candidates did well on the numerical parts of Questions 8 and 10. It was

particularly pleasing to see, in Questions 8(iii) and 10(c)(iii), clearly laid out logical solutions. There were

however three numerical questions that caused difficulty this year, namely parts (ii) and (iii) of Question 2

and Question 4(ii). Answers to questions requiring written explanations, such as Questions 8(v), 10(c)(i)

and 10(c)(ii), were sometimes too vague or insufficiently detailed. However, in Question 1, for example,

there were some very good descriptions of different types of variable and in Question 7(a)(i) clear purposes

stated for finding moving average values. Graphs and charts were often accurately produced where

necessary, but a common error in Question 5 was for the vertical axis label to be missing.

Question 9, on probability and expectation, proved to be the least popular of the optional Section B

questions, with each of the remaining Section B questions proving equally popular.

Comments on Specific Questions

Section A

Question 1

Most candidates found it easier to find examples for parts (ii) and (iv) than to produce the required definitions

for parts (i) and (iii). There were, however, some good responses seen in all parts and, in general,

candidates did better on this question than on similar questions in the past. In part (i) the most common

correct definitions seen for a discrete variable were "a variable whose outcomes can only take specific or

exact values" or "a variable which can be counted". The most common incorrect answer seen was that

discrete variables must take whole number values. Many correct examples were

seen in part (ii) including height, weight and length. In part (iii) many candidates correctly described a

qualitative variable as one which "does not involve numbers" or one which "can only be described in words".

The mark was not awarded to explanations which simply said that a qualitative variable is one which "has

quality", as further explanation was required. In part (iv) many correct examples of a discrete quantitative

variable were seen including shoe size, the number of people on a bus and the number of leaves on a tree.

Question 2

Most candidates correctly identified the mode in part (i). Many candidates struggled, however, with the

remainder of this question. In part (ii), for example, many candidates, rather than adding 1 to the total

frequency before dividing by 2 to find the correct position for the median, simply divided 29 by 2. In part (iii)

many candidates, correctly, made an attempt to work with a cumulative frequency of 18, but a common

incorrect answer seen was h = 7. This occurred as a result of using an incorrect method for finding the

position of the median for ungrouped data, as was also seen in part (ii).

Answers: (i) 6; (ii) 5; (iii) 6.
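
The position rule for the median of ungrouped data, as described for part (ii), can be illustrated with a short Python sketch. The frequency table below is hypothetical (the report does not reproduce the question's data); it is chosen only so that the total frequency is 29, as in the question. The function name is likewise illustrative.

```python
def median_from_frequency_table(values, freqs):
    """Median of ungrouped discrete data; `values` must be in ascending order."""
    n = sum(freqs)

    def value_at(position):
        # Walk the cumulative frequencies until the required position is reached.
        running = 0
        for value, f in zip(values, freqs):
            running += f
            if position <= running:
                return value
        raise ValueError("position exceeds total frequency")

    if n % 2 == 1:
        # Odd total: the median is the ((n + 1) / 2)th value, not the (n / 2)th.
        return value_at((n + 1) // 2)
    # Even total: average the two middle values.
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2


# Hypothetical table with total frequency 29.
values = [3, 4, 5, 6, 7]
freqs = [4, 6, 7, 8, 4]
median = median_from_frequency_table(values, freqs)
print(median)
```

With a total frequency of 29 the median is the 15th value; dividing 29 by 2 gives 14.5, which is not the position required for ungrouped data.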

Question 3

The majority of candidates scored full marks in part (i)(a) of this question. In part (i)(b) there were many good

answers, with the most common errors being either not multiplying by 2 or not realising that if Maria eats the

chocolates then this implies that the situation is without replacement. Part (ii) proved to be more difficult for

many candidates. Some candidates incorrectly included 3 attempts in their working or found only the

probability of exactly 3 attempts. Some candidates produced tree diagrams with probabilities written on the

branches, but no indication of further working which might have gained them some method marks. Some

candidates missed the fact that in this part unwanted chocolates were returned to the tin, and thus incorrect

denominators were sometimes seen.

Answers: (i)(a) 1/19, (b) 15/38; (ii) 16/25

Question 4

Correct answers, together with a correct reason, were seen in the work of the more able candidates in

part (i). Many candidates did not realise that the fact that the data contains extreme values is both the reason

for the choice of the median as the measure of central tendency and the reason for the choice of the

interquartile range as the measure of dispersion. Some candidates did not notice the presence of extreme

values in the data (namely some large masses). Candidates tended to be more successful with part (ii) of

this question. The most common error was for candidates to find one third, rather than two thirds, of 19

before adding it on to 12.

Answers: (ii) 25

Question 5

In part (i) most candidates were able to name the chart as a percentage sectional or a percentage

component bar chart. The interpretation of this chart required in part (ii) was very well done by the vast

majority of the candidates, with almost all candidates getting the correct numbers of males and most of those

also getting the correct numbers of females. Accurately drawn dual bar charts were seen in part (iii),

although some candidates omitted the label on the vertical axis. In part (iv) it was encouraging to see that

many candidates recognised that the dual bar chart provided them with actual numbers rather than simply

percentages. Some candidates, however, simply stated that the dual bar chart was easier to read, without

explaining that this was because it shows actual numbers.

Answers: (ii) 42, 36, 22; 36, 24, 60

Question 6

Many candidates found this to be the most difficult of the Section A questions. It was quite common in part

(i) for candidates to give just one pair, rather than stating all of the pairs of the independent and mutually

exclusive events. Candidates often gave A and D as their only answer to part (i)(a) and B and C as their only

answer to part (i)(b). In part (ii) many candidates successfully wrote down that the probability of A is 1/6 and

the probability of B is 1/2. These two probabilities were then often simply added together by candidates,

rather than, using the fact that these are independent events, finding the intersection by multiplying the two

probabilities together and then subtracting this from the total. Some good attempts were made in part (iii),

but it was very rare to see both the smallest and the largest possible values both correct.

Answers: (i)(a) A and B, A and C, A and D, (b) B and C, C and D; (ii) 7/12; (iii) 0 and 5/6
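
The calculation expected in part (ii) can be checked with exact fractions. This is a minimal sketch, assuming that part (ii) asks for the probability that A or B occurs, which is consistent with the published answer of 7/12.

```python
from fractions import Fraction

p_a = Fraction(1, 6)
p_b = Fraction(1, 2)

# A and B are independent, so the intersection is the product of the probabilities.
p_a_and_b = p_a * p_b

# Addition rule: subtract the intersection so that it is not counted twice.
p_a_or_b = p_a + p_b - p_a_and_b

print(p_a_and_b, p_a_or_b)
```

Simply adding the two probabilities gives 2/3, which counts the overlap of 1/12 twice.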

Section B

Question 7

Many candidates were able to give two correct purposes of finding moving average values in part (a)(i). Any

two from "to smooth out/eliminate the variation", "to look for the trend", "to find the seasonal components" or

"to make predictions" were required. In part (a)(ii) many candidates gave the correct answer of 3, although a

considerable minority gave the incorrect answer of 4. In part (a)(iii) answers were usually consistent with any

value suggested in part (a)(ii), though many stopped at "because n is odd/even" without the further

explanation as to whether or not the moving average values correspond to original data points, which was

required for full marks. In parts (b)(i) and (b)(iii) the values of a, b and c were usually correct, the centred

moving average values were usually correctly plotted and a reasonable trend line was obtained. Candidates

often had difficulty, however, in part (b)(ii), with finding the seasonal component and, in part (b)(iv), with

using that seasonal component to estimate sales. Those candidates who correctly found the seasonal

component in part (b)(ii), by finding the differences between sales and moving average values for quarter II

and then averaging these differences, often went on to correctly use the seasonal component in part (b)(iv).

A common error in part (b)(iv) was for a reading to be taken from the trend line, but for the seasonal

component not then to be added to this reading.

Answers: (a)(ii) 3; (b)(i) a = 75, b = 70.75, c = 82; (ii) 9.6; (iv) 54 400
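
The method described for parts (b)(ii) and (b)(iv) can be sketched as follows. The quarterly sales figures and the trend-line reading are hypothetical, since the report does not reproduce the question's data, and the function name is illustrative.

```python
def centred_moving_average(series, period=4):
    """Centred moving averages for an even period.

    Entry k of the result lines up with series[k + period // 2].
    """
    totals = [sum(series[i:i + period]) for i in range(len(series) - period + 1)]
    return [(totals[i] + totals[i + 1]) / (2 * period) for i in range(len(totals) - 1)]


# Hypothetical sales for quarters I to IV over three years.
sales = [60, 40, 50, 70, 64, 44, 54, 74, 68, 48, 58, 78]
cma = centred_moving_average(sales)

# Seasonal component for quarter II: average of (sales - centred moving average)
# over every quarter II that has a centred moving average value.
offset = 2  # period // 2
quarter_ii = [i for i in range(len(sales)) if i % 4 == 1 and 0 <= i - offset < len(cma)]
differences = [sales[i] - cma[i - offset] for i in quarter_ii]
seasonal_ii = sum(differences) / len(differences)

# Estimate = reading from the trend line + seasonal component.
trend_reading = 66.5  # hypothetical reading for the next quarter II
estimate = trend_reading + seasonal_ii

print(cma, seasonal_ii, estimate)
```

The final line of the calculation is the step the report says was often omitted: the trend reading on its own is not the estimate until the seasonal component has been added.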

Question 8

Most candidates obtained the correct ratio in part (i). In part (ii) most candidates produced correct or almost

correct price relatives with some candidates omitting the 100s in the first column. Some weaker candidates

appeared not to understand the term "price relatives" and entered what looked like total expenditure values in

the various categories. These candidates sometimes did the part (ii) calculation in part (iii) and recovered.

Others continued with their very wrong values. Many candidates, however, did this part perfectly, producing

well set out solutions and giving their answer to the required degree of accuracy. Part (iv) was also done

perfectly by many candidates, though some ignored the instruction to use the index from part (iii) and did the

calculation by finding the new cost in each category. Many candidates correctly gave reasons specific to the

context of the problem presented, for example, that the amount of electricity used may have changed, that

the number of hours worked may have changed or the amount of ingredients used may have changed for

their answer to part (v). A few candidates referred, incorrectly, to the inaccuracy introduced by rounding

errors and some referred, incorrectly, to changes in prices. Vague answers, such as "the weights or the

quantities may have changed", were sufficient for some of the marks, but in order to score full marks the

reasons provided needed to be in the context of the problem.

Answers: (ii) 100s in first column; 108, 122, 97 in second column; (iii) 101.3 or 101.4; (iv) $42 600
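
The chain of calculations in parts (ii) to (iv) (price relatives, weighted index, estimated cost) can be sketched in Python. All prices, weights and the base-year cost below are hypothetical, as the report does not reproduce the question's figures.

```python
# Hypothetical base-year and current-year prices for three cost categories.
base_prices = {"electricity": 20, "wages": 50, "ingredients": 40}
current_prices = {"electricity": 22, "wages": 60, "ingredients": 38}

# Hypothetical weights (for example, shares of base-year expenditure).
weights = {"electricity": 2, "wages": 5, "ingredients": 3}

# Price relative = 100 * current price / base price (the base year itself is 100).
price_relatives = {k: 100 * current_prices[k] / base_prices[k] for k in base_prices}

# Weighted index = sum(relative * weight) / sum(weights).
index = sum(price_relatives[k] * weights[k] for k in weights) / sum(weights.values())

# Estimated current cost, using the index rather than re-costing each category.
base_total_cost = 40000  # hypothetical
estimated_cost = base_total_cost * index / 100

print(price_relatives, index, estimated_cost)
```

The estimate assumes that the quantities behind the weights are unchanged, which is why changes in the amount of electricity used, hours worked or ingredients used are the kind of reasons credited in part (v).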

Question 9

This was the question omitted by most candidates or started and abandoned after part (i)(a). Those

candidates who continued with this question were usually able to score both marks in part (i)(a). Although

some perfect answers were produced, many of those who did attempt the whole question did not seem to

have understood the basic process and, for example, thought that exactly 2 points required 2 heads and 1

tail. In part (i)(b) some candidates gave partially correct responses by calculating , but many

missed the fact that 2 points could be achieved in 3 ways. In part (ii) many candidates correctly realised that

X could take the values 0, 1, 2 or 3, but the probabilities of each of these outcomes were usually incorrect

and very often these probabilities did not sum to 1. In part (iii) some correct attempts to use expectation

were seen, although sometimes the working was rather disorganised. In part (iv) many candidates realised

that the probability of a head is 1/5. Again some correct attempts to use expectation were seen, but many

candidates had abandoned this question by this stage.

Answers: (i)(b) 9/64; (ii) 27/64, 27/64, 9/64, 1/64; (iii) $22; (iv) $20 so should not risk throwing extra coin
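
The published probabilities in part (ii) are consistent with three independent throws, each scoring a point with probability 1/4. That probability is an inference from the answers, not a statement taken from the question, so the sketch below should be read under that assumption.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 4)  # assumed probability of scoring a point on a single throw
n = 3               # number of throws

# P(X = k) = (number of ways of choosing which k throws score) * p^k * (1 - p)^(n - k)
distribution = {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

# A valid probability distribution must sum to 1.
total = sum(distribution.values())

print(distribution, total)
```

The factor comb(3, 2) = 3 is the "3 ways" in which exactly 2 points can be achieved; omitting it gives 3/64 instead of 9/64.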

Question 10

Candidates did well on the numerical parts of this question. Correct answers with clear working were often

seen in parts (a)(i) and (a)(ii). In part (b) correct values for the mean and standard deviation were usually

seen in the first three rows of the table. A common error was to see the values 22 and 6 for the mean and

standard deviation, respectively, in the final row of the table. In part (c)(i) many candidates achieved one of

the two available marks. They were often able to state that the marks for Algebra were better, but a correct

comparison using the class standard deviations was much less common. The more able candidates were

able to state that the marks in Algebra were generally more varied and there were more correct responses to

this question than on a similar question requiring the comparison of interquartile ranges last year. Part (c)(ii)

was a difficult question requiring candidates to compare Priyanka's mark with the class mean in terms of the

class standard deviation for each of Algebra and Geometry. It was necessary to show that she did better in

Geometry because her mark was two standard deviations above the mean in this subject, whereas in

Algebra her mark was only one standard deviation above the mean. Only the most able candidates were able to

score both of the marks in this part. Many candidates scored full marks in part (c)(iii) with clearly set out

solutions.

Answers: (a)(i) 64; (ii) 1.5; (b) 10, 3; 5.5, 1.5; 58, 15; 11, 3; (c)(iii) 12.5.

Question 11

Responses to questions of this sort have improved over the years. In part (i) many candidates obtained the

correct simple random sample and it was pleasing to see that the number 47 was not repeated too often. In

part (ii) many candidates correctly found a systematic sample, although a few used an interval of 9 instead of

10, and of the three types of sample being tested in this question, the systematic sample was the least well

done. Most candidates were able to find the correct sample sizes in part (iii), even though they had not been

provided with the necessary totals. Many correct answers were seen in part (iv), but some candidates did not

give enough detail when explaining why Asad's sample was not accurate. It was necessary to state that

Asad's sample over-represents machine A (or under-represents machine B or C) and that Omar's sample

accurately represents jars filled by each machine. In part (v) many correct stratified samples were seen and

in part (vi) correct sample sizes were calculated, which included the need to round answers to the nearest

whole number. Part (vii), which required candidates to explain why it was more appropriate to stratify by

machine, was a difficult final part to this question. Only the most able candidates looked back to the start of

the question to find the purpose of sampling in this case. As the purpose was to check the mass of jam in

each jar, it was appropriate to stratify by machine, rather than packer, as it was the machine that was

responsible for the mass of jam. While only the most able candidates were successful in this part of the

question, there were more correct responses than in a similar part on last year's paper.

Answers: (i) 47, 00, 51, 32, 85, 11, 67, 05, 10; (ii) 01, 11, 21, 31, 41, 51, 61, 71, 81; (iii) 2, 3, 4; (v) 44, 03,

59, 14, 27, 20, 78, 60, 81; (vi) 4, 5
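
The systematic sample in part (ii) can be reproduced with a short sketch. The population size of 90 (jars numbered 00 to 89) is an assumption inferred from the published sample; the function name is illustrative.

```python
def systematic_sample(start, interval, size):
    """Select every `interval`-th item, beginning at item `start`."""
    return [start + k * interval for k in range(size)]


population_size = 90  # assumption: jars numbered 00 to 89
sample_size = 9

# The sampling interval is population size / sample size, giving 10 rather than 9.
interval = population_size // sample_size

sample = systematic_sample(start=1, interval=interval, size=sample_size)
labels = [f"{number:02d}" for number in sample]

print(labels)
```

Using an interval of 9, the error noted above, would give a sample ending at 73 and leave the last part of the list with no chance of selection.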

STATISTICS

Paper 4040/23

Paper 23

Key Message

The most successful candidates in this examination were able both to calculate the required statistics and to

interpret their findings. In the numerical problems, candidates scoring the highest marks provided clear

evidence of the methods they had used in logical, clearly presented solutions. In questions requiring written

definitions, justification of given techniques and interpretation, the most successful candidates provided

detail in their explanations with clear thought given to the context of the problem, where appropriate.

General Comments

In general, candidates did better on the questions requiring numerical calculations than on those requiring

written explanations; in particular, candidates did well on Questions 3 and 4 and the numerical parts of

Questions 7 and 8. It was particularly pleasing to see in Questions 3(iv) and 8(v), on finding the standard

deviation and the interquartile range, respectively, clearly laid out logical solutions. There were however two

numerical questions that caused difficulty this year, namely parts (ii) to (iv) of Question 1 and Question 2.

Answers to questions requiring written explanations, such as Questions 6(a) and 8(vii), were sometimes too

vague. However, in Question 6(b), for example, there were some very good explanations provided for the

difference between discrete and continuous variables. Graphs and charts were often accurately produced

where necessary, but a common error in Question 5 was for labelling to be missing.

Question 9, on probability, proved to be the least popular of the optional Section B questions and it was

also the question that those who attempted it found the most difficult. Question 8 on linear interpolation

proved to be the most popular of the optional questions.

Comments on Specific Questions

Section A

Question 1

Most candidates successfully identified the mode in part (i). Many candidates struggled,

however, with the remainder of this question. In part (ii), for example, many candidates, rather than adding 1

to the total frequency before dividing by 2 to find the correct position for the median, simply divided 46 by 2.

In part (iii) many candidates, correctly, made an attempt to work with a cumulative frequency of 29, but a

common incorrect answer seen was k = 12. This occurred as a result of using an incorrect method for finding

the position of the median for ungrouped data, as was also seen in part (ii). It was rare in part (iv) to see a

correct answer of 9.

Answers: (i) 17 (ii) 1 (iii) 11 (iv) 9

Question 2

This proved to be a difficult question on expectation. Many candidates were unable to deal with the fact that

40% and 60% of letters are sent by 1st and 2nd class post, respectively, with many ignoring this information

altogether in their solutions. Attempts were seen to multiply incorrect probabilities by the number of days in

an attempt to find the expected number of days for a letter to be delivered. Those candidates who had

successfully found the probabilities were often able to go on and correctly find the expectation.

Answer: 2.1

Question 3

Most candidates were able to find the required totals and the mean estimate. In part (iv) the correct answer

was seen in many cases, with well set out solutions, but some candidates appeared not to know the correct

formula for the variance or the standard deviation.

Answers: (i) 430 (ii) 17.2 (iii) 8131 (iv) 5.42
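
The standard deviation in part (iv) follows directly from the totals found in parts (i) and (iii). In the sketch below the total frequency of 25 is inferred from the published answers (430 / 17.2) and is therefore an assumption.

```python
from math import sqrt

sum_fx = 430    # answer to part (i)
sum_fx2 = 8131  # answer to part (iii)
n = 25          # inferred total frequency: 430 / 17.2

mean = sum_fx / n

# Variance = (sum of f * x^2) / n - mean^2; the standard deviation is its square root.
variance = sum_fx2 / n - mean ** 2
sd = sqrt(variance)

print(round(mean, 1), round(sd, 2))
```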

Question 4

As with Question 3, many candidates produced fully correct solutions to this question. The most common

error seen in part (i) was to have two, rather than one, unknown in the equation. A few candidates had a

correct expression but were unable to rearrange correctly and then solve it. In part (ii) many candidates had

a correct standardised term, with the unknown standard deviation as the denominator, but this did not always

appear in a fully correct equation.

Answers: (i) 33 (ii) 25

Question 5

Many accurate charts were seen in both parts of this question. Marks were, however, sometimes lost due to

a lack of labelling of the vertical axes. It was very important in part (i) to label the vertical axis as expenditure

(in dollars) and in part (ii) as percentages. Some weaker candidates did not start their scales on the vertical

axes at 0 and thus the heights of their bars were not proportional to the expenditure (in part (i)) or the

percentage of the expenditure (in part (ii)).

Answers: (ii) 27%, 33%, 40%; 31%, 33%, 36%

Question 6

There were many partially correct responses to part (a). Candidates were often able to explain what is meant

by a qualitative variable, namely one with non-numerical outcomes, but they were not always able to explain

why this means that it is not possible to illustrate such data in the form of a histogram. In addition candidates

needed to explain that in a histogram area is proportional to frequency and, with no class widths, calculation

of such an area is not possible. In part (b) there were some well explained comparisons made and this was

good to see in a question requiring written explanation. Examples of correct comparisons seen were "a

discrete variable can only take certain values within its range, whereas a continuous variable can take all

values within its range" and "a discrete variable is counted whereas a continuous variable is measured". A

commonly seen incorrect answer was that discrete variables can only take whole number values. Answers to

part (c)(i) were often incorrectly given as 14.5 or 14, whereas answers to part (c)(ii) were more often correct.

Answers: (c)(i) 15 (c)(ii) 14.5

Section B

Question 7

Some candidates who embarked upon this question abandoned it after part (i). The most common error in

part (i) was caused by candidates not noticing that each box contained 3 cricket balls. Most candidates who

found the correct ratio in part (i) were able to go on and find correct price relatives in part (ii) and then use

these values to find a correct weighted aggregate cost index in part (iii). The majority of candidates were

also able to find a correct estimate for the total cost of running the club in part (iv). Fewer candidates,

however, were able to provide reasons why their estimate may be very different from the true cost. The most

commonly seen correct answers in part (v) were that the number of balls used may have changed or that the

number of hours worked by the groundsman may have changed. Some commonly seen incorrect answers

were those which had been accounted for within the information included in the calculation, for example "the

wage rate may have changed" or "there may have been inflation". Some answers did not give sufficient

detail, for example, "the groundsman may have become ill", without giving further details as to how this

would affect the calculation. In this case it would be necessary to add that he would then be able to work

fewer hours.

Answers: (ii) 90, 102, 105, 103 (iii) 102 (iv) $21 675

Question 8

Most candidates found the correct modal class in part (i) and the correct cumulative frequencies in part (iii).

It was rare, however, to see a correct answer for the maximum possible value of the range in part (ii). Common

incorrect answers seen included 197, the maximum frequency, and 186, the difference between the

maximum and minimum frequency. In parts (iv) and (v) estimates of the median and the interquartile range

were usually correct with clearly set out working. The position of the median, for example, was usually

correctly identified for grouped data by taking the total frequency and dividing it by two. In part (vi) most

candidates were unable to comment that the distribution was not symmetrical. In part (vii) some candidates

correctly identified that the gradient would be steepest around the 2 under 3 class, but the reason was

sometimes incorrectly given as "the difference in the frequency is greatest at this point", rather than that the class

frequency is greatest at this point or that the difference in the cumulative frequency is greatest at this point.

Answers: (i) 2 under 3 (ii) 8 cm (iii) 12, 209, 242, 255, 379, 401, 412, 500 (iv) 4.62 (v) 2.57, 5.97, 3.4

(vi)(a) 2.04 or 2.05 and 1.35
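
The linear interpolation used in parts (iv) and (v) can be sketched as follows. The cumulative frequencies are those given in the answer to part (iii); the class boundaries (classes of width 1 starting at 2, 4 and 5) are inferred from the published answers and should be read as assumptions. The function name is illustrative.

```python
def interpolate(position, lower, width, cf_before, class_freq):
    """Estimate the value at `position` within a class by linear interpolation."""
    return lower + (position - cf_before) / class_freq * width


n = 500  # total frequency

# For grouped data the median is at position n / 2 (no +1 is needed).
median = interpolate(n / 2, lower=4, width=1, cf_before=242, class_freq=255 - 242)

# Quartiles at positions n / 4 and 3n / 4.
q1 = interpolate(n / 4, lower=2, width=1, cf_before=12, class_freq=209 - 12)
q3 = interpolate(3 * n / 4, lower=5, width=1, cf_before=255, class_freq=379 - 255)
iqr = q3 - q1

print(round(median, 2), round(q1, 2), round(q3, 2), round(iqr, 1))
```

The class frequency in each call is the difference between consecutive cumulative frequencies, which is also the quantity that determines how steep the cumulative frequency graph is over that class.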

Question 9

This was the least popular of the Section B questions. In part (a)(i) many candidates correctly stated that

mutually exclusive events were those that could not occur at the same time. In part (a)(ii) candidates needed

to give an example of a pair of mutually exclusive events. These were often incorrect as they were not

outcomes of the same experiment, for example, a candidate may give one event as getting a head on a coin

and the other event as getting a 6 on a die. A correct pair of mutually exclusive events would be, for

example, getting a 6 on a die and getting a 4 on the die. In part (a)(iii)(a) candidates often incorrectly stated

that A and B are not mutually exclusive because the sum of the probabilities is not 1, rather than stating that

the sum of the probabilities is greater than 1. The answer to part (a)(iii)(b) was often correct. Most

candidates were able to find the correct probability in part (b)(i). Parts (b)(ii) and (b)(iii) were also usually

correct with any errors occurring in the denominator of the fraction. Fewer candidates were successful with

part (b)(iv), with a common error being that the problem was considered to be with rather than without

replacement. Occasionally addition rather than multiplication of the fractions was seen. Most solutions to part

(b)(v) were incorrect, however some candidates correctly had denominators of 60 and 59 in their

expressions. There are a number of ways to solve this problem, the most commonly correct method seen

being 7/60 × 12/59 + 28/60 × 13/59.

Answers: (a)(iii)(b) 0.3 (b)(i) 2/5 (ii) 23/35 (iii) 11/25 (iv) 1/177 (v) 112/885
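
The most commonly seen correct method for part (b)(v) can be checked with exact fractions. This sketch takes the four fractions from the report as given; the underlying counts are not reproduced in the report.

```python
from fractions import Fraction

# Two mutually exclusive routes, each a product of two selections without replacement,
# so the second denominator drops from 60 to 59.
route_1 = Fraction(7, 60) * Fraction(12, 59)
route_2 = Fraction(28, 60) * Fraction(13, 59)

probability = route_1 + route_2

print(probability)
```

Multiplying along each route and adding between routes gives 112/885, in agreement with the published answer.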

Question 10

Most candidates produced a correct simple random sample in part (i), with the most common error being the

inclusion of the number 18. In part (ii)(a) the most common error was 03 being given for the largest possible

two-digit number for the first person selected. In part (ii)(b) the most common error was 05 for the first

person selected and this occurred sometimes in cases where the candidate had got the previous part

correct. In part (ii)(c) values outside the range were sometimes seen. The stratified samples in parts (iii) and

(iv) were usually correct. In part (v) reasons for each conclusion needed to be stated clearly. In the case of

the sample stratified by friend/relative it was necessary to state that this sample is also representative in

terms of age group. In the case of the sample stratified by age group it was necessary to state that this

sample over-represents friends or under-represents relatives.

Answers: (i) 12, 00, 07, 09, 01 (ii)(a) 00, 02 (ii)(b) 00; (ii)(c) 03, 06, 09, 12 (iii)(a) 3 friends, 2 relatives

(iii)(b) 06, 09, 08, 04, 02 (iv)(a) 2 from Group I, 2 from Group II, 1 from Group III

(iv)(b) 11, 13, 10, 02, 09

Question 11

Some candidates misunderstood the question in part (i) and gave general reasons for calculating moving

averages rather than reasons for calculating a 5-point moving average specifically. It was necessary to

explain that each cycle is of length 5 days. A general question regarding the purpose of calculating moving

averages appears in part (vi). In part (ii) most candidates correctly answered that each cycle contains

an odd number of observations. Alternatively, a correct answer was that the

moving average values fall at the same points in time as the original values. In part (iii) the plots were usually

correct. Most candidates spotted the clear cyclical pattern. An alternative answer would have been that there

is no clear upward or downward long-term trend. The calculations in part (iv) and the plots in part (v) were

usually correct. In part (vi) most candidates correctly stated the purpose of calculating moving averages,

namely to eliminate seasonal variation or to find the trend, but the subsidiary question regarding how well

this had been achieved in this case was less well answered. It was necessary for candidates to state that the

purpose had been achieved well in this case. In part (vii) most candidates were able to draw a suitable trend

line. However, some candidates followed the earlier moving average values too closely and ignored the later

ones. In part (viii) a common incorrect answer was q = 3. In part (ix) working was sometimes missing, which

might have been worth a mark had it been shown.

Answers: (iv) x = 127, y = 24.8 (viii) q = 3 (ix) 17
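The points made in parts (i), (ii) and (vi) can be illustrated with a short sketch. The data below are hypothetical (two identical 5-day cycles), not the values from the question, and the function name is illustrative.

```python
def moving_average(values, window):
    """Return (centre_index, mean) pairs for an odd-length window."""
    if window % 2 == 0:
        raise ValueError("this sketch handles odd windows only; even windows need centring")
    half = window // 2
    # With an odd window, each mean sits at the index of the middle observation.
    return [
        (i + half, sum(values[i:i + window]) / window)
        for i in range(len(values) - window + 1)
    ]

# Hypothetical observations: the same 5-day pattern repeated twice.
observations = [20, 24, 30, 26, 25, 20, 24, 30, 26, 25]
averages = moving_average(observations, 5)

for centre, mean in averages:
    print(centre, mean)  # every mean is 25.0
```

Because every window of 5 covers exactly one full cycle, the cyclical variation cancels and only the level (here constant at 25.0) remains. Each average also aligns with an original observation, so no centring is required.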
