Faculty of Medicine and Health Sciences

Department of Nursing

MDJ 4442 Basic Statistics


Semester 1

2017 / 2018 Academic Session


Assignment II

Student Name: ____________________________ Matric No.: _______________

*****************************************

OBJECTIVE:

The objective of this assignment is to enable you to apply working knowledge and skills in
statistical analysis. Read the recommended books on statistical analysis.

INSTRUCTIONS:

Answer all questions in this assignment. This is an open-book, open-notes assignment.
Make sure you show all your work.

This assignment is worth 25% of the assessment for this course. You are not allowed
to give or receive unauthorized assistance on this assignment. Assignment II should be
completed and uploaded as a PDF file to the course website by 12 noon on 29th November
2017. Plagiarized assignments will be rejected.

Part I: SCAQ (15 marks)

The following scenario applies to Questions 1 and 2:

A study was done to compare the lung capacity of coal miners to the lung capacity of farm
workers. The researcher studied 200 workers of each type. Other factors that might affect
lung capacity are smoking habits and exercise habits. The smoking habits of the two
worker types are similar, but the coal miners generally exercise less than the farm workers.

1. Which of the following is the explanatory variable in this study?


A. Exercise
B. Lung capacity
C. Smoking or not
D. Occupation

2. Which of the following is a confounding variable in this study?


A. Exercise
B. Lung capacity
C. Smoking or not
D. Occupation

The following scenario applies to Questions 3 to 5:

A randomized experiment was done by randomly assigning each participant either to walk
for half an hour three times a week or to sit quietly reading a book for half an hour three
times a week. At the end of a year the change in participants' blood pressure over the year
was measured, and the change was compared for the two groups.

3. This is a randomized experiment rather than an observational study because:


A. Blood pressure was measured at the beginning and end of the study.
B. The two groups were compared at the end of the study.
C. The participants were randomly assigned to either walk or read.
D. A random sample of participants was used.

4. The two treatments in this study were:
A. Walking for half an hour three times a week and reading a book for half an hour three
times a week.
B. Having blood pressure measured at the beginning of the study and having blood
pressure measured at the end of the study.
C. Walking or reading a book for half an hour three times a week and having blood
pressure measured.
D. Walking or reading a book for half an hour three times a week and doing nothing.

5. If a statistically significant difference in blood pressure change at the end of a year for
the two activities was found, then:
A. It cannot be concluded that the difference in activity caused a difference in the change
in blood pressure because in the course of a year there are lots of possible confounding
variables.
B. Whether or not the difference was caused by the difference in activity depends on what
else the participants did during the year.
C. It cannot be concluded that the difference in activity caused a difference in the change
in blood pressure because it might be the opposite that people with high blood pressure
were more likely to read a book than to walk.
D. It can be concluded that the difference in activity caused a difference in the change in
blood pressure because of the way the study was done.

6. Which one of these statistics is unaffected by outliers?


A. Mean
B. Interquartile range
C. Standard deviation
D. Range

7. A list of 5 pulse rates is: 70, 64, 80, 74, 92. What is the median for this list?
A. 74
B. 76
C. 77
D. 80

8. Which of the following would indicate that a dataset is not bell-shaped?


A. The range is equal to 5 standard deviations.
B. The range is larger than the interquartile range.
C. The mean is much smaller than the median.
D. There are no outliers.

9. A scatter plot of number of teachers and number of people with college degrees for cities
in Sarawak reveals a positive association. The most likely explanation for this positive
association is:
A. Teachers encourage people to get college degrees, so an increase in the number of
teachers is causing an increase in the number of people with college degrees.
B. Larger cities tend to have both more teachers and more people with college degrees, so
the association is explained by a third variable, the size of the city.
C. Teaching is a common profession for people with college degrees, so an increase in the
number of people with college degrees causes an increase in the number of teachers.
D. Cities with higher incomes tend to have more teachers and more people going to college,
so income is a confounding variable, making causation between number of teachers and
number of people with college degrees difficult to prove.

The following scenario applies to Questions 10 to 12:

A survey asked people how often they exceed speed limits. The data are then categorized
into the following contingency table of counts showing the relationship between age group
and response.

                Exceed Limit if Possible?
Age         Always    Not Always    Total
Under 30    100       100           200
Over 30     40        160           200
Total       140       260           400

10. Among people with age over 30, what's the "risk" of always exceeding the speed limit?
A. 0.20
B. 0.40
C. 0.33
D. 0.50

11. Among people with age under 30 what are the odds that they always exceed the speed
limit?
A. 1 to 2
B. 2 to 1
C. 1 to 1
D. 50%

12. What is the relative risk of always exceeding the speed limit for people under 30 compared
to people over 30?
A. 2.5
B. 0.4
C. 0.5
D. 30%

13. A newspaper reported that "The number of children in the study who contracted asthma
was relatively small, 265 of 3,535." Which of the following statements is represented by
265/3535 = 0.075?
A. The overall risk of getting asthma for the children in this study.
B. The baseline risk of getting asthma for the non-athletic peers in the study.
C. The risk of getting asthma for children in the study who participated in sports.
D. The relative risk of getting asthma for children who routinely participate in vigorous
after-school sports on smoggy days and their non-athletic peers.

The following scenario applies to Questions 14 to 16:

The following histogram shows the distribution of the difference between the actual and
ideal weights for 119 female students. Notice that percent is given on the vertical axis.
Ideal weights are responses to the question "What is your ideal weight?" The difference =
actual - ideal.

14. What is the approximate shape of the distribution?


A. Nearly symmetric.
B. Skewed to the left.
C. Skewed to the right.
D. Bimodal (has more than one peak).

15. The median of the distribution is approximately
A. -10 pounds.
B. 10 pounds.
C. 30 pounds.
D. 50 pounds.

16. Most of the women in this sample felt that their actual weight was
A. about the same as their ideal weight.
B. less than their ideal weight.
C. greater than their ideal weight.
D. no more than 2 pounds different from their ideal weight.

17. Pick the choice that best completes the following sentence.
If a relationship between two variables is called statistically significant, it means the
investigators think the variables are
A. related in the population represented by the sample.
B. not related in the population represented by the sample.
C. related in the sample due to chance alone.
D. very important.

18. A researcher hypothesizes that a new drug he invented will enhance mice's memories.
He feeds the drug to the experimental group and gives the control group a placebo. He
then times the mice as they learn to run through a maze. In order to know whether his
hypothesis is supported, the researcher needs to use:
A. scatter plot
B. descriptive statistics
C. inferential statistics
D. histograms

19. Which of the following correlation coefficients indicates the weakest relationship
between two variables?
A. 0.51
B. -0.28
C. 0.08
D. 1.00

20. Researchers went to the Specialist outpatient clinics posing as new patients and
requesting appointments for non-urgent problems. The waiting time, in days, was
recorded, for each request. Boxplots for two different samples of requests, labeled A and
B, are shown below.

The samples are the same size, and the distributions have symmetric shapes, so that
the sample mean is very close to the sample median in each case.
Which sample, A or B, offers more convincing evidence that the population mean
waiting time exceeds 30 days?
A. It is impossible to tell from the box plots
B. The two samples offer about the same evidence
C. Sample B
D. Sample A

Part II: SAQ

Question 1
Create a new data file (called cinema.sav) according to this questionnaire. Responses are
listed in order for respondents 1 to 12.

Q1. How often do you go to the cinema?
1 Several times a week
2 Once a week
3 Every second week
4 Once a month
5 Rarely
6 Never
Responses: 2 3 1 2 4 2 3 3 4 3 4 5

Q2. Gender
1 Male
2 Female
Responses: 1 1 1 1 1 1 2 2 2 2 2 2

Q3. Date of birth (year/month/day)
Respondent    Date of birth    Respondent    Date of birth
1             1985/04/12       7             1986/02/22
2             1981/12/30       8             1986/05/27
3             1991/11/11       9             1970/09/23
4             1992/08/03       10            1988/03/11
5             1985/05/04       11            1972/01/17
6             1997/07/01       12            1960/05/19

1.1. Attach screenshots of (i) the Data View editor and (ii) the Variable View window.
(1 mark)

1.2. How old are the respondents? List the steps used to create a new variable, age.
(2 marks)
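
As a cross-check on the SPSS computation for question 1.2, the ages can also be worked out in a few lines of Python. This is only a sketch: the reference date below (the submission deadline) is an assumption, as are the names `REFERENCE_DATE` and `age_on`; substitute the date on which the survey was actually taken.

```python
from datetime import date

# Assumption: ages are measured on the submission deadline, 29 November 2017.
REFERENCE_DATE = date(2017, 11, 29)

def age_on(dob, ref=REFERENCE_DATE):
    """Return the number of completed years between dob and ref."""
    had_birthday = (ref.month, ref.day) >= (dob.month, dob.day)
    return ref.year - dob.year - (0 if had_birthday else 1)

# Dates of birth for respondents 1 to 12, taken from Q3.
births = [
    date(1985, 4, 12), date(1981, 12, 30), date(1991, 11, 11),
    date(1992, 8, 3), date(1985, 5, 4), date(1997, 7, 1),
    date(1986, 2, 22), date(1986, 5, 27), date(1970, 9, 23),
    date(1988, 3, 11), date(1972, 1, 17), date(1960, 5, 19),
]

ages = [age_on(dob) for dob in births]
print(ages)
```

Counting completed years, rather than dividing a day count by 365, avoids off-by-one errors around birthdays and leap years.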

1.3. Describe the sample characteristics of this study in a table and in text.
(2 marks)

1.4. Create a column chart showing the proportion of respondents, grouped by gender
and by how often they go to the cinema. Embellish the graph.
(1 mark)

1.5. Using APA-style sentence structure, interpret the graph.


(2 marks)

Question 2

A study is investigating the consumption of popcorn at movie theaters. Researchers expect
that no relationship exists between the rating of the movie viewed (U, PG, PG-13, PG-18,
R) and theater patrons' popcorn purchases. The following cross-tabulation displays
observed values based upon interviews of randomly selected individuals exiting randomly
selected movie theaters. Input this data into SPSS to use for questions 2.1 to 2.3.

                     Movie Rating
Popcorn    U     PG    PG-13    PG-18    R     Total
Yes        45    35    38       12       58    188
No         28    46    48       10       36    168
Total      73    81    86       22       94    356

2.1. Use SPSS to perform a chi-square test based upon the hypothesis that equal
percentages of those who do and do not buy popcorn view movies with each rating.
Attach the SPSS output below.
(1 mark)

2.2. How many degrees of freedom does the test use? Why?
(1 mark)

2.3. What is the value of χ²?


(0.5 mark)
10.857

2.4. What is the value of p?


(0.5 mark)
0.028
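
The values recorded for questions 2.3 and 2.4 can be checked outside SPSS. The sketch below computes the Pearson chi-square statistic from the cross-tabulation using only the Python standard library. The helper name `chi2_sf_even_df` is illustrative, and its closed-form tail probability is valid only when the degrees of freedom are even, which is assumed to hold for this table.

```python
import math

# Observed counts; columns are the ratings U, PG, PG-13, PG-18, R.
observed = [
    [45, 35, 38, 12, 58],  # bought popcorn
    [28, 46, 48, 10, 36],  # did not buy popcorn
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

def chi2_sf_even_df(x, dof):
    """Upper-tail chi-square probability; closed form valid only for even dof."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))

p_value = chi2_sf_even_df(chi2, df)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

This is a verification aid only; the assignment still asks for the SPSS output.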

2.5. Is it necessary to further investigate the source of differences within the data? If so,
what source of difference exists?
(2 marks)

2.6. Write a short summary of the results for the test conducted for question #2.4.
(2 marks)

Question 3

An investigator predicts that individuals that fit the Type A Behavior Pattern (highly
competitive and time conscious) will have higher scores on a questionnaire measure of need
for achievement than individuals that fit the Type B Behavior pattern (absence of Type A
qualities). The investigator collects need for achievement scores from 10 Type A subjects and
10 Type B subjects. Higher scores reflect greater levels of need for achievement.

Type A Type B
12 8
10 10
8 5
11 7
15 8
12 5
9 4
16 7
11 8
8 10

3.1. Write the null and alternative hypotheses for testing this prediction.
(2 marks)

3.2. Input this data into SPSS to test the null hypothesis stated above. Attach the SPSS
output here.
(1 mark)
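
To sanity-check the SPSS output for question 3.2, the test statistic can be reproduced by hand. The sketch below assumes that an independent-samples t-test with pooled (equal) variances is the intended analysis; that choice is an assumption, not something the question states. It computes only the t statistic and degrees of freedom, because the Python standard library has no t distribution for the p-value.

```python
import math
from statistics import mean, variance

# Need-for-achievement scores from the table above.
type_a = [12, 10, 8, 11, 15, 12, 9, 16, 11, 8]
type_b = [8, 10, 5, 7, 8, 5, 4, 7, 8, 10]

n_a, n_b = len(type_a), len(type_b)
df = n_a + n_b - 2

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n_a - 1) * variance(type_a) + (n_b - 1) * variance(type_b)) / df
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_stat = (mean(type_a) - mean(type_b)) / std_error
print(f"t({df}) = {t_stat:.3f}")
```

Compare the printed value with the "Equal variances assumed" row of the SPSS Independent Samples Test table.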

3.3. Using APA-style sentence structure, write up the result for the Results chapter,
stating the inference the investigator will most likely draw.
(2 marks)
3.4. Use SPSS to create an appropriate graph to describe the distribution of scores between
both Types of individuals. Attach graph here.
(1 mark)

3.5. Using APA-style sentence structure, interpret the output of the graph you selected
for question 3.4.

Question 4

An investigator is interested in the effects of stress on reaction time. She gives a reaction
time test to three groups of subjects: (i) one group that is under a great deal of stress, (ii)
one group under a moderate amount of stress, and (iii) one group that is under almost no
stress.
Theory A suggests that subjects under a moderate amount of stress will perform better
than subjects in either the very low or the very high stress groups. Theory B predicts that
subjects in the high stress group will perform better than subjects in either the low stress
or the moderate stress groups. Theory C predicts that subjects in the moderate stress
condition will perform significantly more poorly than subjects in the low stress condition.

a1 a2 a3
High Stress Moderate Stress Low Stress
6 8 10
7 10 11
4 11 8
5 7 6
6 12 13
3 11 12

4.1. Using an alpha level of .05, use SPSS to perform the overall ANOVA and then
determine whether the data provide evidence for Theory A, Theory B, or Theory C.
(Explain how your interpretation of the ANOVA output tables led you to your
conclusion.)

(3 marks)
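
The overall one-way ANOVA in question 4.1 can be cross-checked against the SPSS table with the sketch below, which uses only the Python standard library. The p-value line relies on a closed form that holds only when the between-groups degrees of freedom equal 2, as they do for three groups; for any other design, use SPSS or a statistics library instead.

```python
from statistics import mean

# Reaction-time scores from the table above.
groups = {
    "High Stress": [6, 7, 4, 5, 6, 3],
    "Moderate Stress": [8, 10, 11, 7, 12, 11],
    "Low Stress": [10, 11, 8, 6, 13, 12],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Partition the total variability into between-groups and within-groups parts.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed-form upper tail of F(2, df_within); valid only when df_between == 2.
p_value = (1 + df_between * f_stat / df_within) ** (-df_within / 2)

print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.4f}")
```

The omnibus F only indicates whether the group means differ somewhere. Deciding between the three theories still requires the planned comparisons mentioned in question 4.2.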

4.2. Write a paragraph that summarizes the conclusions that the investigator is entitled
to draw from these results. In this paragraph, be sure to make it clear how you
handled the issue of conducting the three planned comparisons. Also provide the
ANOVA table for these results.
(2 marks)

Question 5

A psychologist hypothesises that "sadotoothpullerophobia" (fear of dentists) develops
because of bad childhood experiences. On the basis of an initial interview, two groups of
seven adults are identified: a group who report having had bad childhood experiences of
dentists, and a group who report no such experiences. For each participant, their rating of
their fear of the dentist (on a 100-point scale) was recorded. Here is some of the SPSS
output.

Test Statistics^a
                                  DentistFear
Mann-Whitney U                    8.000
Wilcoxon W                        36.000
Z                                 -2.111
Asymp. Sig. (2-tailed)            .035
Exact Sig. [2*(1-tailed Sig.)]    .038^b

On the basis of the above-mentioned information, answer all the following 10 questions.
(10 marks)

5.1. Which is the most appropriate statistical test to perform on these data?
(a) Wilcoxon matched-pairs test.
(b) Mann-Whitney test.
(c) Spearman's correlation test.

5.2. The test results shown in the "Test Statistics" table are all:
(a) Statistically significant at p < .05.
(b) Statistically significant at p > .05.
(c) Not statistically significant at p < .05.

5.3. On the basis of the description, what kind of hypothesis would be most appropriate
for this study?
(a) A one-tailed hypothesis.
(b) A two-tailed hypothesis.
(c) A neutral hypothesis.

5.4. The variation in rating for the bad experiences group is:
(a) higher than for the good experiences group.
(b) the same as for the good experiences group.
(c) lower than for the good experiences group.

5.5. What is the standard deviation for the bad experiences group? ___________

5.6. What is the standard deviation for the good experiences group? __________

5.7. What is the standard error of the mean for the bad experiences group? __________

5.8. What is the standard error of the mean for the good experiences group? __________

5.9. Using the data in the "Descriptives" table, draw the box-plots for the bad experiences
and good experiences group for comparison.

5.10. On the basis of these results, the researcher should conclude that:
(a) bad childhood experiences lead to greater fear of dentists in adulthood.
(b) bad childhood experiences lead to less fear of dentists in adulthood.
(c) bad childhood experiences are not significantly related to fear of dentists in
adulthood.

Question 6 (Choosing the Right Statistics)

Note: In the following scenario, you are given some details of an experiment, the results of
a number of statistical tests, and a set of conclusions.

6.1. An experimenter develops a method that may help the memory of people with brain
damage. It is thought that these people have problems because when they make a
mistake in recall, they cannot remember whether the mistake or the real answer was
correct. Therefore they are given strong clues to the answer, to stop them ever making
mistakes and hopefully improving their memory (this method is known as errorless
learning). To test the effectiveness of this method, the experimenter asked ten head
injury patients to learn the names of 20 of their caregivers. Ten of the names were learnt
using errorless learning, and ten using simple trial and error. The data are not normally
distributed.

Patient  Method A: Number of names learnt  Method B: Number of names learnt
         using trial and error             using errorless learning
A 7 4
B 7 7
C 6 6
D 6 6
E 5 3
F 6 3
G 3 4
H 7 3
I 5 4
J 6 5
Mean: 5.8 4.5
S.D.: 1.23 1.43

6.1.1. The MOST appropriate statistical test would be:


(1 mark)
(a) Mann-Whitney test: U(10, 10) = 25, p = .05.
(b) Wilcoxon test: z = -2.05, p = .04.
(c) Pearson's correlation of A and B: r(8) = .25, p = .48.
(d) Spearman's correlation of A and B: rs(10) = .21, p = .56.

6.1.2. Which of the following conclusions is CORRECT?


(1 mark)
(a) The number of names learnt by trial and error is related to the number of names
learnt with errorless learning.
(b) Memory for names is unaffected by the type of learning.
(c) Memory for names is significantly better if errorless learning is used.
(d) Memory for names is significantly better if trial and error learning is used.

After a Statistics exam, a study was carried out on the relationship between the time
spent studying for the exam and the exam result. The following results come from a
random sample of students.

Student  Time spent learning for exam (hours)  Exam results (marks)
A 2 9
B 3 15
C 5 20
D 10 50
E 15 58
F 20 75
G 22 80
H 18 65
I 25 100
J 30 95

6.2. Create a new SPSS data set and attach screenshots of (i) the Data View editor
and (ii) the Variable View window.
(1 mark)

6.3. Determine the strength and nature of this relationship.


(2 marks)
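
For question 6.3, the strength and direction of a linear relationship are commonly summarised with Pearson's correlation coefficient. The sketch below computes it from the table above as a cross-check on SPSS. Using Pearson's r rather than a rank-based coefficient is an assumption about the intended analysis, and `pearson_r` is an illustrative helper name.

```python
import math

# Study time (hours) and exam result (marks) for students A to J.
hours = [2, 3, 5, 10, 15, 20, 22, 18, 25, 30]
marks = [9, 15, 20, 50, 58, 75, 80, 65, 100, 95]

def pearson_r(x, y):
    """Pearson correlation: co-deviation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(hours, marks)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the relationship and its magnitude gives the strength. A scatter plot is still worth inspecting to confirm that a straight line is a reasonable description.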

Question 7

The table shows information about the distance travelled (in kilometers) by a sample of
132 patients to a government district hospital.

Distance travelled, x (km)    0 < x ≤ 5    5 < x ≤ 10    10 < x ≤ 20    20 < x ≤ 35    35 < x ≤ 60
Frequency                     18           30            48             21             15

7.1. Estimate how many of these patients travelled 7 km or more.


(1 mark)

7.2. Plot the histogram for this data.


(2 marks)
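
Because the classes in this table have unequal widths, a histogram is normally drawn with frequency density (frequency divided by class width) on the vertical axis, so that bar area represents frequency. The sketch below computes those heights. It also shows the usual interpolation estimate for a threshold inside a class, which assumes that patients are spread evenly within each class. The class boundaries are read from the table with ≤ as the upper limit, and the function name `estimate_at_least` is illustrative.

```python
# (lower bound, upper bound, frequency) for each distance class, in km.
classes = [(0, 5, 18), (5, 10, 30), (10, 20, 48), (20, 35, 21), (35, 60, 15)]

# Bar height for each class: frequency per km of class width.
densities = [freq / (hi - lo) for lo, hi, freq in classes]

def estimate_at_least(threshold):
    """Estimate how many values are >= threshold, assuming an even spread per class."""
    total = 0.0
    for lo, hi, freq in classes:
        if threshold <= lo:
            total += freq  # the whole class lies at or above the threshold
        elif threshold < hi:
            total += freq * (hi - threshold) / (hi - lo)  # partial class
    return total

for (lo, hi, freq), density in zip(classes, densities):
    print(f"{lo:>2} < x <= {hi:<2}  frequency {freq:>2}  density {density:.1f}")
print(f"Estimated number travelling 7 km or more: {estimate_at_least(7):.0f}")
```

Plotting raw frequencies instead of densities would exaggerate the wide classes, which is why the density scale matters here.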

7.3. Describe the distribution of the data based on the histogram plotted in question 7.2.
(1 mark)

7.4. Later the data were regrouped using the following classes:
0 < x ≤ 20 and 20 < x ≤ 60.

Give ONE disadvantage of grouping the data using these classes.


(1 mark)
