
STAT E-150 - Statistical Methods

Practice Final Examination Questions


1. True or False? The equal variance assumption will be met if Levene's test is significant.

2. Suppose you are interested in knowing the impact on starting salary of a job candidate's
educational background and whether or not the job candidate had previous work
experience. Educational background was categorized as arts or science/engineering.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?

3. Researchers wanted to investigate whether there is a relationship between a subject's
number of years of education and his or her driving record, recorded as total tickets in a
lifetime.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?

4. Imagine we have a weighted coin where the probability of heads is .65.
a. What is the probability of it landing on tails?
b. What are the odds of getting heads in a toss?
c. What is the odds ratio for heads?
d. Describe the odds ratio you found in context.


5. Researchers wanted to predict whether someone was hired or not based on their age in
years, gender, and years of education.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?

6. Regression analysis was used to predict the cost of milk in cents from the cost of a
barrel of corn in dollars. The resulting least-squares equation is y = 25 + 5x. The actual
cost of milk is 500 cents when corn costs 100 dollars a barrel. What is the residual?

7. Researchers want to investigate the relationship between subjects' scores on a test of
empathy and their gender, age in years, whether they were a parent or not, and their
education level in years.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?


8. The following questions refer to the SPSS output shown below.
Researchers used temperature to predict failure time for a superconductive material
with the following linear model: y = β0 + β1·x_temp
a. Write the regression equation based on the results shown below.

b. Assess the model utility.

c. Would you recommend the model? Why or why not?


9. Use the SPSS output shown below to answer all parts of this question.
Engineers want to use three pond characteristics: depth, surface strength, and surface
area, to predict whether ice type was landfast or not.
a. What is the regression equation?

b. What are the hypotheses?

c. Assess the model utility.

d. Describe how the odds of pond ice being landfast change with depth.


10. Use the SPSS output shown below to answer all parts of this question:
Researchers investigated whether caffeine helps to counteract the effects of alcohol
consumption. They randomly assigned students to four groups that received alcohol (A),
alcohol with caffeine (AC), a placebo that looked and tasted like alcohol (P), or nothing
(AR). After 25 minutes, their performance on a memory task was recorded.
a. Does the data satisfy the assumptions of ANOVA?

b. Is there a significant effect of group?


c. Use the appropriate post hoc tests to determine whether there are any significant
differences between the groups.


11. Researchers investigated the mean whipping capacity for eggs that were randomly
selected from chickens that were raised in four different types of housing: cage, barn, free
range, and organic. The eggs were also classified as being either from a medium or large
weight class. The results from a two-way ANOVA are shown below.
a. Is there a significant main effect of weight class?

b. Is there a significant main effect of housing?

c. Is there an interaction?

d. What test would you conduct to follow up with these results?


12. The following questions refer to the SPSS output below. Researchers wanted to
investigate a program to help obsessive-compulsive disorder. They randomly selected 13
subjects with the mental illness and measured their percentage of time spent obsessing.
This was measured on entrance into the program, and every two months after for 6
months.
a. Are the assumptions for this test met?
b. Is there a significant effect of obsessive thoughts?
c. If you wanted to find out if there was a significant difference between the first and
last rating of obsessive thoughts, what test would you conduct?


13. Researchers wanted to test whether taking a GMAT prep course would improve
subjects' scores on the GMAT. They had students take the GMAT before the prep
course, midway through the course, and after the course was over. They then
compared their scores.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?

14. Researchers wanted to investigate how the amount spent on homes purchased in
Boston varied by whether it was cape or colonial style, and whether it was sold in the
spring, summer, fall, or winter. They randomly selected 100 of the homes sold in the last 2
years and recorded the season and style of the home.
a. What are the factors?

b. How many levels does each factor have?

c. Is each factor observational or experimental?

d. What are the experimental units?

e. What are the treatments?


15. Researchers conducted an experiment investigating whether the average SAT score
differed for people with high or low short-term memory capacity. They randomly selected
100 incoming freshmen and tested their short-term memory. 50 were classified as having
low short-term memory capacity and 50 were classified as having high short-term memory
capacity. When the researchers plotted the data, they noticed it was not normally
distributed. To overcome this, they used the Wilcoxon rank sum test.
a. Find μ_W, the expected value of the rank sum W under the null hypothesis.

b. If the Wilcoxon rank sum statistic for students with a low short-term memory capacity
is 1500 and the standard error is 145.1, what is the value of the test statistic?

c. Use the value you found for the test statistic to write your conclusion and conclusion
in context.

16. Researchers wanted to test whether staying up all night affected memory recall. They
randomly assigned subjects to three groups; one group stayed up all night, one group
stayed up for half of the night, and the third group slept normally. The next morning they
recorded their performance on a memory test and their averages were tallied.
a. Which test would you use?
1. Simple linear regression
2. Multiple regression
3. Logistic regression

4. ANOVA
5. Two-way ANOVA
6. Repeated measures ANOVA

b. What are the null and alternative hypotheses?


Solutions
1. False

2. a. two-way ANOVA
b. H0: μ_AW = μ_ANW = μ_SW = μ_SNW; Ha: the means are not all equal.
3. a. simple linear regression
b. H0: β1 = 0; Ha: β1 ≠ 0
4. a. P(tails) = 1 - P(heads) = 1 - .65 = .35
b. The odds of getting heads = P(heads) / (1 − P(heads)) = .65/(1 − .65) = .65/.35 = 1.857
c. odds ratio for heads = (odds of heads) / (odds of tails)
   = [.65/(1 − .65)] / [.35/(1 − .35)] = 1.857/.538 = 3.449
d. The odds of getting heads are 3.45 times the odds of getting tails.
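
The arithmetic in question 4 can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the exam.

```python
# Weighted coin from question 4: P(heads) = .65
p_heads = 0.65
p_tails = 1 - p_heads                  # part (a)

# Odds = probability of the event / probability of its complement
odds_heads = p_heads / (1 - p_heads)   # part (b)
odds_tails = p_tails / (1 - p_tails)

# Odds ratio for heads relative to tails, part (c)
odds_ratio = odds_heads / odds_tails

print(round(p_tails, 2), round(odds_heads, 3), round(odds_tails, 3), round(odds_ratio, 3))
```

Dividing the unrounded odds gives 3.449; dividing the rounded values 1.857 and .538 gives a slightly different third decimal place.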

5. a. Logistic regression
b. H0: β_age = β_gender = β_education = 0; Ha: the betas are not all zero
6. predicted value = 25 + 5(100) = 525 cents
residual = observed value − predicted value = 500 − 525 = −25 cents
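
A short Python sketch of the same residual calculation, assuming the least-squares equation y = 25 + 5x from question 6 (the function name is hypothetical):

```python
def predicted_milk_cost(corn_cost_dollars):
    """Predicted milk cost in cents from the fitted line y = 25 + 5x."""
    return 25 + 5 * corn_cost_dollars

observed = 500                          # actual milk cost in cents
predicted = predicted_milk_cost(100)    # corn at 100 dollars a barrel
residual = observed - predicted         # observed minus predicted

print(predicted, residual)
```

A negative residual means the model overpredicts the cost of milk at this corn price.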

7. a. multiple regression
b. H0: β_gender = β_age = β_parent = β_education = 0; Ha: the betas are not all zero


8. a. y = 30,855.911 − 191.567·x_temp
b. H0: β_temp = 0; Ha: β_temp ≠ 0.
F = 107.323 and p is close to 0. Since p is small, we can reject the null hypothesis and
conclude that there is a relationship between temperature and failtime. These results
suggest that the model predicts a significant amount of variation in failtime. This is
supported by the high adjusted R² of .835, which suggests that the model accounts for
83.5% of the variation in failtime. However, the scatterplot of the data suggests that the
relationship between temperature and failtime might be curvilinear, and the residuals
do not appear to be normally distributed.
c. I would not recommend the above model. Even though the model is significant (p< .05),
the scatterplot of the data suggests a curvilinear relationship and the residual plot violates
the assumption of equal variance, as the residual plot appears to thicken.

9. a. y = log(odds) = .296 + 4.128·x_depth + 47.123·x_strength − 31.144·x_area
b. H0: β_depth = β_strength = β_area = 0; Ha: the betas are not all zero
c. The Hosmer-Lemeshow (HL) test has a p-value of .198; since p is large, we fail to reject
the null hypothesis that there is no difference between the predicted and observed values.
Taken together, these tests suggest this is a useful model for predicting whether pond ice
is landfast or not.
d. For every one-unit increase in depth, the odds of being landfast increase by a factor
of 62.065 for given values of surface area and surface strength.
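
The factor in part (d) is the exponentiated depth coefficient. A minimal check in Python, assuming the coefficient 4.128 from the equation in part (a):

```python
import math

beta_depth = 4.128                   # depth coefficient on the log-odds scale
odds_factor = math.exp(beta_depth)   # multiplicative change in odds per unit of depth

print(round(odds_factor, 2))         # approximately 62.05
```

The result is close to, but not exactly, the 62.065 reported in the solution. The gap is presumably because SPSS exponentiates the unrounded coefficient.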


10. a. Normality condition:
H0: the data is normally distributed; Ha: the data is not normally distributed. The results
for the Kolmogorov-Smirnov (KS) test of normality suggest that we fail to reject the null
hypothesis for three of the groups, with p-values > .05. However, we reject the null
hypothesis and conclude that the assumption of normality is not met for group AR, the
control group, which has a p-value < .05.
Equal variances condition:
H0: σ²_A = σ²_AC = σ²_AR = σ²_P; Ha: the variances are not all equal.
Since the p-value for Levene's test is greater than .05, we fail to reject the null hypothesis
that the variances are all equal. Therefore, the assumption of equal variance is satisfied.
Independence: The description suggests that this assumption was met, as people were
randomly assigned to each group.
Since the assumptions of randomization and equal variance are met, but the assumption
of normality is not met, we can continue with the ANOVA but should proceed with caution.
b. H0: μ_A = μ_AC = μ_AR = μ_P; Ha: the means are not all equal.
Since p is small (0+), we reject the null hypothesis that the means are equal. The data
suggests that there are differences in people's performance on the memory test
depending on what they drank before the test.
c. H0: μ_i − μ_j = 0; Ha: μ_i − μ_j ≠ 0, for each pair of groups i and j.
At a .05 level of significance, there is evidence of a significant difference between the
group who drank alcohol and each of the other three groups: the group who drank the placebo (p = 0+),
the control group (p = 0+), and the alcohol with caffeine group (p = .048). There is no
evidence of a difference between the placebo and control groups (p = .95), or between
the alcohol with caffeine and either the placebo (p = .289) or the control group (p = .108).


11. a. H0: μ_Medium − μ_Large = 0
Ha: μ_Medium − μ_Large ≠ 0
For the main effect of weight class, since p is large (.233), we fail to reject the null
hypothesis that the means are equal. The data indicates that there is no difference
between the whipping capacity for medium eggs and large eggs.
b. H0: μ_C = μ_B = μ_F = μ_O
Ha: the means are not all equal
For the main effect of housing, F = 19.882 and p = 0+ and so the null hypothesis is
rejected. The data suggests that the environment chickens are raised in affects the
whipping capacity of their eggs.
c. H0: μ_MC = μ_MB = μ_MF = μ_MO = μ_LC = μ_LB = μ_LF = μ_LO
Ha: the means are not all equal
Since the p-value of the interaction term is not small (p = .516), we cannot reject the null
hypothesis. The data suggests that there is no interaction between housing and weight
class.
d. It would be appropriate to conduct a Scheffé post hoc test to see which housing types
differ.


12. a. Normality:
H0: the data is normally distributed
Ha: the data is not normally distributed.
The Kolmogorov-Smirnov (KS) test of normality is nonsignificant, with all p-values greater
than alpha of .05. Therefore, we fail to reject the null hypothesis that the data is normally
distributed and we can conclude that the assumption of normality is met.
Sphericity:
H0: the variances of the pairwise differences are equal.
Ha: the variances of the pairwise differences are not all equal.
Since Mauchly's test has a small p-value (.001), we reject the null hypothesis that the
variances of the differences are equal. Therefore, the assumption of sphericity is not met.
Independence:
We can assume this assumption is met as it is stated in the study description that
subjects were randomly selected.
We can proceed with the ANOVA analysis with caution, and use the Greenhouse-Geisser
test for significance since the sphericity assumption is not met.
b. H0: μ1 = μ2 = μ3 = μ4
Ha: the means are not all equal.
Using the Greenhouse-Geisser results, the main effect for obsessive thoughts has a small
p-value of .018, and so we can reject the null hypothesis that the means are equal. This
suggests that the treatment program resulted in a change in percentage of time devoted
to obsessive thoughts.
c. Since the groups are not independent, a paired t-test is appropriate.

13. a. Repeated measures ANOVA
b. H0: μ_before = μ_during = μ_after
Ha: the means are not all equal.


14. a. The factors are Season and Style
b. Season has four levels: Spring, Summer, Fall, Winter
Style has two levels: Cape and Colonial
c. Both factors are observational
d. The experimental unit is a house
e. There are 8 treatments: SpringCape, SpringColonial, SummerCape, SummerColonial,
FallCape, FallColonial, WinterCape, WinterColonial.
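
The treatments in part (e) are every combination of one season with one style. A small Python sketch of that enumeration (the list names are illustrative only):

```python
from itertools import product

seasons = ["Spring", "Summer", "Fall", "Winter"]   # factor 1: four levels
styles = ["Cape", "Colonial"]                      # factor 2: two levels

# Each treatment is one (season, style) combination: 4 x 2 = 8 in total
treatments = [season + style for season, style in product(seasons, styles)]

print(len(treatments))
print(treatments)
```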

15. a. μ_W = n1(n1 + n2 + 1)/2 = 50(50 + 50 + 1)/2 = 2525
b. z = (W − μ_W)/SE = (1500 − 2525)/145.1 = −7.06
c. Since the absolute value of z (7.06) is greater than the critical value of 1.96 associated
with α = .05, we can reject the null hypothesis that the two groups are equal.
The data suggests that SAT scores differ for students with low and high short-term
memory capacity.
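
The numbers in question 15 can be reproduced with the normal approximation to the Wilcoxon rank sum statistic. This sketch uses the standard formulas for the mean and standard error of W under the null hypothesis; the variable names are illustrative.

```python
import math

n1 = n2 = 50          # low and high short-term memory groups
w_observed = 1500     # rank sum for the low-capacity group (given in part b)

# Mean and standard error of W under the null hypothesis
mu_w = n1 * (n1 + n2 + 1) / 2
se_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# z statistic, using the standard error stated in the question (145.1)
z = (w_observed - mu_w) / 145.1

print(mu_w, round(se_w, 1), round(z, 2))
```

The computed standard error rounds to the 145.1 given in the question. The z statistic is negative because the observed rank sum falls below its expected value.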

16. a. ANOVA
b. H0: μ_nosleep = μ_halfsleep = μ_fullsleep
Ha: the means are not all equal.
