STATS 101/101G/108 Introduction to Statistics

Assignment 2, Summer Semester 2014


Due: 3pm Tuesday 21st January

Read these instructions carefully

Marks
Assignment 2 is worth 5% of your final mark.
It will be marked out of 45 marks, 40 marks for the questions and 5 marks for communication
and presentation. See below for how communication and presentation marks are allocated. Your
final mark will be converted to a mark out of 10 which will be recorded towards your course
work.
Statistics is about summarising, analysing and communicating information. Communication is
an important part of statistics. For this reason you will be expected to write answers which
clearly communicate your thoughts.

Communication and Presentation marks
Demonstrate clear sentence structure: this includes correct use of full stops and capital
letters; not writing overly long or complicated sentences; reasonable spelling and grammar.
Demonstrate ability to communicate information clearly in sentences: this includes
sentences clearly conveying the correct idea; sentences making sense; comments not being
excessively long or short; conclusions following logically from previous statements.
Assignment tidily set out and easy to follow: this includes the answers being clearly set
out in the correct order; the assignment not being messy; graphs and plots tidy with correct
labelling of axes; the assignment (including the correct cover sheet) being clipped together
or stapled.
Follow the Step-by-Step Guide to Performing a Confidence Interval by Hand as
required.
Student ID number shown somewhere on assignment: this can be on the inside of the
coversheet or on top of the first page of the assignment.

Handing in
Hand into the appropriate assignment drop-off box to the left of the counter in the Student
Resource Centre, ground floor of building 301, by the plaza that connects buildings 301
and 303. Do not hand your assignment in to the unsecured assignment return boxes!
Assignments handed in to the wrong place or received after the due time will not be marked.

Question guide
Attempt questions 1 and 2 when chapter 2 has been covered.
Attempt question 3 when chapter 3 has been covered.
Attempt questions 4 and 5 when chapter 4 has been covered.
Attempt questions 6 and 7 when chapter 5 has been covered.
Question 1 will require use of SPSS. Hand in the required computer output.

Notes
The format and handing in of Assignment 2 is the same as that for Assignment 1. Refer to the
instructions on page 1 of Assignment 1.
Refer to the Worked Examples under Assignment Resources on Cecil for examples of how to set
out your answers.
Refer to the Lecture Workbook, Section A (Course Information), page 3, Assignment Rules:
Working together versus cheating.

Question 1. [3 marks] [Chapter 2]

The following data are the marks (out of 100) for 29 students in a final exam.

49  59  49  58  32  69  49  71  67  77
69  49  49  37  71  38  45  49  88  49
51  53  49  49  49  53  29  54  65

(a) Use SPSS commands: Analyze -> Descriptive Statistics -> Explore to:
    (i) Generate descriptive statistics.
    (ii) Create a stem-and-leaf plot of the data.
    (iii) Create a box plot of the data.
    [See pages 12&13, SPSS Tutorial, available on Cecil.]

(b) Briefly comment on what the plots reveal.
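
The question requires SPSS output, so the following is only a minimal cross-checking sketch in Python, not a substitute for the Explore procedure. It assumes the 29 marks listed above; the names `marks`, `summary` and `stem_and_leaf` are illustrative. Quartiles are left out on purpose, because different packages use different quartile conventions and the values may not match SPSS.

```python
from collections import defaultdict
import statistics

# The 29 exam marks from Question 1.
marks = [49, 59, 49, 58, 32, 69, 49, 71, 67, 77,
         69, 49, 49, 37, 71, 38, 45, 49, 88, 49,
         51, 53, 49, 49, 49, 53, 29, 54, 65]


def stem_and_leaf(values):
    """Return stem-and-leaf rows as strings, using the tens digit as the stem."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(stems.items())]


# Basic descriptive statistics (sample standard deviation uses n - 1).
summary = {
    "n": len(marks),
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "sd": statistics.stdev(marks),
    "min": min(marks),
    "max": max(marks),
}

for name, value in summary.items():
    print(f"{name}: {value}")
for row in stem_and_leaf(marks):
    print(row)
```

A plot like this makes repeated values easy to spot, which is the kind of feature part (b) asks you to comment on.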

Question 2. [2 marks] [Chapter 2]


A store manager was interested in the number of items purchased by customers in each
transaction. He took a sample of 100 transactions and recorded the number of items purchased
in each one. The resulting data is given in the following frequency table:
Number of items    1    2    3    4    5    6   Total
Frequency         31   28   17   13    9    2     100

What are the sample mean and sample standard deviation of these 100 observations?
Note: Show the sample standard deviation to 3 decimal places.
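
For a frequency table, each value counts as many times as its frequency, so the sample mean is sum(f*x)/n and the sample variance is sum(f*(x - mean)^2)/(n - 1). The sketch below illustrates that arithmetic with the table above; the function name `grouped_mean_sd` is hypothetical, and the hand working expected in the assignment should still be shown separately.

```python
import math

# Number of items purchased -> number of transactions, from the frequency table.
freq = {1: 31, 2: 28, 3: 17, 4: 13, 5: 9, 6: 2}


def grouped_mean_sd(freq):
    """Sample mean and sample standard deviation from a value -> frequency table."""
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    # Sum of squared deviations, each weighted by its frequency.
    ss = sum(f * (x - mean) ** 2 for x, f in freq.items())
    sd = math.sqrt(ss / (n - 1))  # divisor n - 1 for a sample
    return n, mean, sd


n, mean, sd = grouped_mean_sd(freq)
print(f"n = {n}, mean = {mean:.2f}, sd = {sd:.3f}")
```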
Question 3. [10 marks] [Chapters 2/3]
For this question you need to do a Cecil Quiz on plots, tables and summary statistics. This is a
short quiz where you load a data set into iNZightVIT to produce plots, tables and summary
statistics and then answer 10 true/false questions on what you see.
Before attempting the quiz, read the iNZight Quiz Guide from Cecil under Resources and
Course Information -> Quiz.
You will have 3 attempts at the quiz and your best mark will be used. The deadline is 11pm on
Monday 20th January (the night BEFORE the assignment is due). You can only score marks for
this question if you submit the assignment and earn marks for at least one other question.
On your assignment answers for this question, write whether or not you have attempted the quiz
AND clearly write out your ID number.

Question 4. [6 marks] [Chapter 4]


The Roller Coaster Database is a website that contains a database of all the roller coasters in the
world. Information on steel tracked roller coasters was taken from the latest census of operating
roller coasters. Coasters were classified by the region they were in and by type: inverted,
sit-down or other (which included pipeline, bobsled, flying, stand-up, suspended and 4th dimension
coasters). This information was cross-classified and is presented in the table below.

                              Type
Region           Inverted   Sit Down   Other   Total
Africa                  3         56       0      59
Asia                   44       1326      33    1403
Australia/NZ            2         20       0      22
Europe                 28        731      25     784
North America          50        554      31     635
South America           3        135       4     142
Total                 130       2822      93    3045
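
A cross-classified table like this supports three kinds of proportion: joint (cell / grand total), row-conditional (cell / region total) and column-conditional (cell / type total). The sketch below shows one way to hold the counts and compute each kind; all names (`counts`, `joint`, `type_given_region`, `region_given_type`) are illustrative assumptions, not part of the course material.

```python
# Counts copied from the table: region -> (inverted, sit down, other).
counts = {
    "Africa": (3, 56, 0),
    "Asia": (44, 1326, 33),
    "Australia/NZ": (2, 20, 0),
    "Europe": (28, 731, 25),
    "North America": (50, 554, 31),
    "South America": (3, 135, 4),
}
TYPES = ("inverted", "sit_down", "other")


def cell(region, coaster_type):
    """Count for one region/type combination."""
    return counts[region][TYPES.index(coaster_type)]


def region_total(region):
    return sum(counts[region])


def type_total(coaster_type):
    return sum(cell(region, coaster_type) for region in counts)


grand_total = sum(region_total(region) for region in counts)


def joint(region, coaster_type):
    """Proportion of ALL coasters that are in this region and of this type."""
    return cell(region, coaster_type) / grand_total


def type_given_region(coaster_type, region):
    """Proportion of the region's coasters that are of this type."""
    return cell(region, coaster_type) / region_total(region)


def region_given_type(region, coaster_type):
    """Proportion of this type's coasters that are in the region."""
    return cell(region, coaster_type) / type_total(coaster_type)


print(joint("Asia", "other"))
print(type_given_region("other", "Asia"))
print(region_given_type("Asia", "other"))
```

The three printed values differ only in their denominator, which is the distinction to watch for when reading the wording of a proportion question.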

Use the table to answer the following questions about these roller coasters.
(a) What proportion of steel tracked roller coasters are in Europe?
(b) What region has the lowest proportion of inverted steel tracked roller coasters?
(c) Given that a randomly chosen steel tracked roller coaster is in North America, what is the
    probability that it is sit down?
(d) What proportion of steel tracked roller coasters are North American sit down coasters?
(e) What proportion of North American steel tracked roller coasters are sit down?
(f) Of all sit down steel tracked roller coasters, what proportion are in North America?

Question 5. [5 marks] [Chapter 4]

Auditors developing systems to check the accuracy of regular tax returns for such taxes as GST
look at changes in the firm's returns between tax periods. If the change is greater than some
threshold the firm's return is tagged to be subject to rigorous audit. To check the accuracy of one
such system a large sample of returns were all audited. It was found that 23% of returns tagged
for audit by the system revealed tax evasion while only 1 out of 200 returns that were not tagged
for audit by the system revealed tax evasion.
The system was implemented at a tax department and run on a sample of 10,000 tax returns. Of
these, 600 were tagged for audit.
(a) (i) How many of the 600 returns tagged are estimated to be for firms trying to evade tax?
    (ii) How many of the returns that were not tagged are estimated to be for firms trying to evade
    tax?
(b) Use your answer from (a) to help construct a 2×2 table of counts displaying the results for this
    sample. Complete the table.
(c) What is the estimated proportion of firms that are trying to evade tax which have their return
    tagged for audit?

Question 6. [6 marks] [Chapter 5]

A statistics student was interested in investigating how long it takes to get a pizza delivered
from the local pizzeria. Over a few weeks, a random sample of 10 delivery times (in minutes)
was recorded. The data are displayed below:
17.9, 22.2, 29.3, 33.1, 13.6, 18.3, 21.7, 16.3, 23.2, 29.8
(a) What are the sample mean and sample standard deviation of these 10 observations?
(b) Calculate a 95% confidence interval for the underlying mean delivery time. Interpret your
    results.
Note: You must follow the step-by-step guide to producing a confidence interval by hand given
in the Lecture Workbook, Chapter 5. At step 6 it is necessary to use either a graphics
calculator, SPSS, Excel or t-tables to determine the t-multiplier.

Question 7. [8 marks] [Chapter 5]

As part of the National Centre on Addiction and Substance Abuse at Columbia University's
nationwide "Back to School Teen Survey", 1000 American teenagers aged between 12 and 17
were interviewed by telephone on many lifestyle issues. The teenagers have been split into two
age groups (12 - 14 and 15 - 17) each of size 500. The study also included surveys of 825
teachers and 822 principals. Some of the questions with the number of responses to each answer
are given below:

Question 1: What are you most likely to do in the afternoon after school?

                                    Age
Response                       12-14   15-17
Hang out with friends            106      96
Go home, do homework             111      80
Go home, watch TV                 64      59
Go home, do something else        79      63
Play on sports team               90      92
Go to a job                       12      66
Other organised activity          33      40
Don't know / refused               5       4
Total                            500     500

Question 2: Do you know a friend or classmate who uses illegal drugs?

12-14 year olds:   Yes (195)   Total (500)
15-17 year olds:   Yes (337)   Total (500)

Question 3: Is it possible to use marijuana every weekend and still do well at school?

12-14 year olds:   Yes (51)    Total (500)
15-17 year olds:   Yes (114)   Total (500)
Teachers:          Yes (355)   Total (825)
Principals:        Yes (288)   Total (822)


(a) State the sampling situation (a, b or c) for calculating the standard error of the difference in:
    (i) estimating the difference between the proportion of 15 - 17 year old teenagers who know a
    friend or classmate who uses illegal drugs and the proportion who believe it is possible to
    use marijuana every weekend and still do well at school.
    (ii) estimating the difference between the proportion of 12 - 14 year old teenagers who hang
    out with friends after school and the proportion of 15 - 17 year old teenagers who hang out
    with friends after school.
    (iii) estimating the difference between the proportion of 12 - 14 year old teenagers who hang
    out with friends after school and the proportion who go home and watch TV.
(b) By hand, calculate a 95% confidence interval for the difference between the proportion of
    principals who think students can use marijuana every weekend and still do well at school, and
    the corresponding proportion of 15 - 17 year old teenagers.
Note: You must follow the step-by-step guide to producing a confidence interval by hand given
in the Lecture Workbook, Chapter 5.
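
As a way to check hand working for part (b), here is a minimal sketch of a confidence interval for a difference between two proportions. It assumes the principals and the 15 - 17 year olds are two independent samples, so the standard error is sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), and it assumes a normal (z) multiplier; the Lecture Workbook guide may set the steps out differently. The function name `diff_proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """CI for p1 - p2, ASSUMING two independent samples and a z-multiplier."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference for independent samples.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided multiplier, roughly 1.96 when level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Principals answering yes: 288 of 822; 15 - 17 year olds answering yes: 114 of 500.
diff, lower, upper = diff_proportion_ci(288, 822, 114, 500)
print(f"estimate = {diff:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

This formula does not apply to the other sampling situations in part (a), where both proportions come from the same sample.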