
Chapter 13

Multiple Regression

True / False Questions

1. In regression the dependent variable is referred to as the response variable.

True False

2. If a regression model's F test statistic is Fcalc = 43.82, we could say that the
explained variance is approximately 44 percent.

True False

3. In a regression, the model with the best fit is preferred over all other models.

True False

4. A common misinterpretation of the principle of Occam's Razor is that a simple regression model (rather than a multiple regression model) is always best.

True False
5. A predictor whose pairwise correlation with Y is near zero can still have a
significant t-value in a multiple regression when other predictors are included.

True False

6. The F statistic in a multiple regression is significant if at least one of the predictors has a significant t statistic at a given α.

True False

7. R2adj can exceed R2 if there are several weak predictors.

True False

8. A binary (categorical) predictor should not be used along with nonbinary predictors.

True False

9. In a multiple regression with 3 predictors in a sample of 25 U.S. cities, we would use F3, 21 in a test of overall significance.

True False
10. Evans' Rule says that if n = 50 you need at least 5 predictors to have a good
model.

True False

11. The model Y = β0 + β1X + β2X² cannot be estimated by Excel because of the nonlinear term.

True False

12. The random error term in a regression model reflects all factors omitted from the
model.

True False

13. If the probability plot of residuals resembles a straight line, the residuals show a
fairly good fit to the normal distribution.

True False

14. Confidence intervals for Y may be unreliable when the residuals are not normally
distributed.

True False
15. A negative estimated coefficient in a regression usually indicates a weak
predictor.

True False

16. For a certain firm, the regression equation Bonus = 2,000 + 257 Experience +
0.046 Salary describes employee bonuses with a standard error of 125. John has 10
years' experience, earns $50,000, and earned a bonus of $7,000. John is an
outlier.

True False

17. There is one residual for each predictor in the regression model.

True False

18. If R2 and R2adj differ greatly, we should probably add a few predictors to improve
the fit.

True False

19. The effect of a binary predictor is to shift the regression intercept.

True False
20. A parsimonious model is one with many weak predictors but a few strong ones.

True False

21. The F statistic and its p-value give a global test of significance for a multiple
regression.

True False

22. In a regression model of student grades, we would code the nine categories of
business courses taken (ACC, FIN, ECN, MGT, MKT, MIS, ORG, POM, QMM) by
including nine binary (0 or 1) predictors in the regression.

True False

23. A disadvantage of Excel's Data Analysis regression tool is that it expects the
independent variables to be in a block of contiguous columns so you must delete
a column if you want to eliminate a predictor from the model.

True False

24. A disadvantage of Excel's regression is that it does not give as much accuracy in
the estimated regression coefficients as a package like MINITAB.

True False
25. Nonnormality of the residuals from a regression can best be detected by looking
at the residual plots against the fitted Y values.

True False

26. A high variance inflation factor (VIF) indicates a significant predictor in the
regression.

True False

27. Autocorrelation may be detected by looking at a plot of the residuals against time.

True False

28. A widening pattern of residuals as X increases would suggest heteroscedasticity.

True False

29. Plotting the residuals against a binary predictor (X = 0, 1) reveals nothing about
heteroscedasticity.

True False
30. The regression equation Bonus = 2,812 + 27 Experience + 0.046 Salary says that
Experience is the most significant predictor of Bonus.

True False

31. A multiple regression with 60 observations should not have 13 predictors.

True False

32. A regression of Y using four independent variables X1, X2, X3, X4 could also have
up to four nonlinear terms (Xj²) and six simple interaction terms (XjXk) if you have
enough observations to justify them.

True False

33. When autocorrelation is present, the estimates of the coefficients will be unbiased.

True False

34. If the residuals in your regression are nonnormal, a larger sample size might help
improve the reliability of confidence intervals for Y.

True False
35. Multicollinearity can be detected from t tests of the predictor variables.

True False

36. When multicollinearity is present, the regression model is of no use for making
predictions.

True False

37. Autocorrelation of the residuals may affect the reliability of the t values for the
estimated coefficients of the predictors X1, X2, . . . , Xk.

True False

38. The first differences transformation might be tried if autocorrelation is found in a time-series data set.

True False

39. Statisticians who work with cross-sectional data generally do not anticipate
autocorrelation.

True False
40. The ill effects of heteroscedasticity might be mitigated by redefining totals (e.g.,
total number of homicides) as relative values (e.g., homicide rate per 100,000
population).

True False

41. Nonnormal residuals lead to biased estimates of the coefficients in a regression model.

True False

42. A large VIF (e.g., 10 or more) would indicate multicollinearity.

True False

43. Heteroscedasticity exists when all the errors (residuals) have the same variance.

True False

44. Multicollinearity refers to relationships among the independent variables.

True False

45. A squared predictor is used to test for nonlinearity in the predictor's relationship
to Y.

True False
46. Nonnormality of residuals is not usually considered a major problem unless there
are outliers.

True False

47. In the fitted regression Y = 12 + 3X1 - 5X2 + 27X3 + 2X4 the most significant
predictor is X3.

True False

48. Given that the fitted regression is Y = 76.40 - 6.388X1 + 0.870X2, the standard error of b1 is 1.453, and n = 63. At α = .05, we can conclude that X1 is a significant predictor of Y.

True False

49. Unlike other predictors, a binary predictor has a t-value that is either 0 or 1.

True False

50. The t-test shows the ratio of an estimated coefficient to its standard error.

True False
51. In a multiple regression with five predictors in a sample of 56 U.S. cities, we would
use F5, 50 in a test of overall significance.

True False

Multiple Choice Questions

52. In a multiple regression with six predictors in a sample of 67 U.S. cities, what would be the critical value for an F-test of overall significance at α = .05?

A. 2.29

B. 2.25

C. 2.37

D. 2.18

53. In a multiple regression with five predictors in a sample of 56 U.S. cities, what would be the critical value for an F-test of overall significance at α = .05?

A. 2.45

B. 2.37

C. 2.40

D. 2.56
54. When predictor variables are strongly related to each other, the __________ of the
regression estimates is questionable.

A. logic

B. fit

C. parsimony

D. stability

55. A test is conducted in 22 cities to see if giving away free transit system maps will
increase the number of bus riders. In a regression analysis, the dependent variable
Y is the increase in bus riders (in thousands of persons) from the start of the test
until its conclusion. The independent variables are X1 = the number (in thousands)
of free maps distributed and a binary variable X2 = 1 if the city has free downtown
parking, 0 otherwise. The estimated regression equation is

In city 3, the observed Y value is 7.3 and X1 = 140 and X2 = 0. The residual for city
3 (in thousands) is:

A. 6.15.

B. 1.15.

C. 4.83.

D. 1.57.
56. If X2 is a binary predictor in Y = β0 + β1X1 + β2X2, then which statement is most nearly correct?

A. X2 = 1 should represent the most desirable condition.

B. X2 would be a significant predictor if b2 = 423.72.

C. X2 = 0, X2 = 1, X2 = 2 would be appropriate if three categories exist.

D. X2 will shift the estimated equation either by 0 units or by b2 units.

57. The unexplained sum of squares measures variation in the dependent variable Y
about the:

A. mean of the Y values.

B. estimated Y values.

C. mean of the X values.

D. Y-intercept.

58. Which of the following is not true of the standard error of the regression?

A. It is a measure of the accuracy of the prediction.

B. It is based on squared vertical deviations between the actual and predicted values of Y.

C. It would be negative when there is an inverse relationship in the model.

D. It is used in constructing confidence and prediction intervals for Y.


59. A multiple regression analysis with two independent variables yielded the
following results in the ANOVA table: SS(Total) = 798, SS(Regression) = 738,
SS(Error) = 60. The multiple correlation coefficient is:

A. .2742

B. .0752

C. .9248

D. .9617

60. A fitted multiple regression equation is Y = 12 + 3X1 - 5X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?

A. Decrease by 2

B. Decrease by 4

C. Increase by 2

D. No change in Y
61. A fitted multiple regression equation is Y = 28 + 5X1 - 4X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?

A. Increase by 2

B. Decrease by 4

C. Increase by 4

D. No change in Y

62. Which is not a name often given to an independent variable that takes on just two
values (0 or 1) according to whether or not a given characteristic is absent or
present?

A. Absent variable

B. Binary variable

C. Dummy variable
63. Using a sample of 63 observations, a dependent variable Y is regressed against two variables X1 and X2 to obtain the fitted regression equation Y = 76.40 - 6.388X1 + 0.870X2. The standard error of b1 is 3.453 and the standard error of b2 is 0.611. At α = .05, we could:

A. conclude that both coefficients differ significantly from zero.

B. reject H0: β1 ≥ 0 and conclude H1: β1 < 0.

C. reject H0: β2 ≤ 0 and conclude H1: β2 > 0.

D. conclude that Evans' Rule has been violated.

64. Refer to this ANOVA table from a regression:

Which statement is not accurate?

A. The F-test is significant at α = .05.

B. There were 50 observations.

C. There were 5 predictors.

D. There would be 50 residuals.


65. Refer to this ANOVA table from a regression:

For this regression, the R2 is:

A. .3995.

B. .6005.

C. .6654.

D. .8822.
66. Refer to the following regression results. The dependent variable is Abort (the
number of abortions per 1000 women of childbearing age). The regression was
estimated using data for the 50 U.S. states with these predictors: EdSpend =
public K-12 school expenditure per capita, Age = median age of population,
Unmar = percent of total births by unmarried women, Infmor = infant mortality
rate in deaths per 1000 live births.

Which statement is not supported by a two-tailed test?

A. Unmar is a significant predictor at α = .01.

B. EdSpend is a significant predictor at α = .20.

C. Infmor is not a significant predictor at α = .05.

D. Age is not a significant predictor at α = .05.


67. Refer to the following correlation matrix that was part of a regression analysis. The
dependent variable was Abort (the number of abortions per 1000 women of
childbearing age). The regression was estimated using data for the 50 U.S. states
with these predictors: EdSpend = public K-12 school expenditure per capita, Age
= median age of population, Unmar = percent of total births by unmarried
women, Infmor = infant mortality rate in deaths per 1000 live births.
Correlation Matrix

Using a two-tailed correlation test, which statement is not accurate?

A. Age and Infmor are not significantly correlated at α = .05.

B. Abort and Unmar are significantly correlated at α = .05.

C. Unmar and Infmor are significantly correlated at α = .05.

D. The first column of the table shows evidence of multicollinearity.


68. Part of a regression output is provided below. Some of the information has been
omitted.

The approximate value of F is:

A. 1605.7.

B. 0.9134.

C. 89.66.

D. impossible to calculate with the given information.


69. Part of a regression output is provided below. Some of the information has been
omitted.

The SS (residual) is:

A. 3177.17.

B. 301.19.

C. 17.71.

D. impossible to determine.
70. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). Part of the regression output is provided below, based on a sample of
20 homes. Some of the information has been omitted.

The estimated coefficient for Size is approximately:

A. 9.5.

B. 13.8.

C. 122.5.

D. 1442.6.
71. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). The regression output is provided below. Some of the information has
been omitted.

How many predictors (independent variables) were used in the regression?

A. 20

B. 18

C. 3

D. 2
72. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). The regression output is provided below. Some of the information has
been omitted.

Which of the following conclusions can be made based on the F-test?

A. The p-value on the F-test will be very high.

B. At least one of the predictors is useful in explaining Y.

C. The model is of no use in predicting selling prices of houses.

D. The estimates were based on a sample of 19 houses.


73. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). Part of the regression output is provided below, based on a sample of
20 homes. Some of the information has been omitted.

Which statement is supported by the regression output?

A. At α = .05, FP is not a significant predictor in a two-tailed test.

B. A fireplace adds around $6476 to the selling price of the average house.

C. A large house with no fireplace will sell for more than a small house with a
fireplace.

D. FP is a more significant predictor than Size.

74. A log transformation might be appropriate to alleviate which problem(s)?

A. Heteroscedastic residuals

B. Multicollinearity

C. Autocorrelated residuals
75. A useful guideline in determining the extent of collinearity in a multiple regression
model is:

A. Sturge's Rule.

B. Klein's Rule.

C. Occam's Rule.

D. Pearson's Rule.

76. In a multiple regression all of the following are true regarding residuals except:

A. their sum always equals zero.

B. they are the differences between observed and predicted values of the
response variable.

C. they may be used to detect multicollinearity.

D. they may be used to detect heteroscedasticity.


77. The residual plot below suggests which violation(s) of regression assumptions?

A. Autocorrelation

B. Heteroscedasticity

C. Nonnormality

D. Multicollinearity

78. Which is not a standard criterion for assessing a regression model?

A. Logic of causation

B. Overall fit

C. Degree of collinearity

D. Binary predictors
79. If the standard error is 12, a quick prediction interval for Y is:

A. 15.

B. 24.

C. 19.

D. impossible to determine without an F table.

80. Which is a characteristic of the variance inflation factor (VIF)?

A. It is insignificant unless the corresponding t-statistic is significant.

B. It reveals collinearity rather than multicollinearity.

C. It measures the degree of significance of each predictor.

D. It indicates the predictor's degree of multicollinearity.


81. Which statement best describes this regression (Y = highway miles per gallon in 91
cars)?

A. Statistically significant but large error in the MPG predictions

B. Statistically significant and quite small MPG prediction errors

C. Not quite significant, but predictions should be very good

D. Not a significant regression at any customary level of α


82. Based on these regression results, in your judgment which statement is most
nearly correct (Y = highway miles per gallon in 91 cars)?

A. The number of predictors is rather small.

B. Some predictors are not contributing much.

C. Prediction intervals would be fairly narrow in terms of MPG.

D. The overall model lacks significance and/or predictive power.


83. In the following regression, which are the three best predictors?

A. ManTran, Wheelbase, RearStRm

B. ManTran, Length, Width

C. NumCyl, HPMax, Length

D. Cannot be ascertained from given information


84. In the following regression, which are the two best predictors?

A. NumCyl, HpMax

B. Intercept, NumCyl

C. NumCyl, Domestic

D. ManTran, Width
85. In the following regression (n = 91), which coefficients differ from zero in a two-tailed test at α = .05?

A. NumCyl, HPMax

B. Intercept, ManTran

C. Intercept, NumCyl, Domestic

D. Intercept, Domestic
86. Based on the following regression ANOVA table, what is the R2?

A. 0.1336

B. 0.6005

C. 0.3995

D. Insufficient information to answer


87. In the following regression, which statement best describes the degree of
multicollinearity?

A. Very little evidence of multicollinearity.

B. Much evidence of multicollinearity.

C. Only NumCyl and HPMax are collinear.

D. Only ManTran and RearStRm are collinear.

88. The relationship of Y to four other variables was established as Y = 12 + 3X1 - 5X2
+ 7X3 + 2X4. When X1 increases 5 units and X2 increases 3 units, while X3 and X4
remain unchanged, what change would you expect in your estimate of Y?

A. Decrease by 15

B. Increase by 15

C. No change

D. Increase by 5
89. Does the picture below show strong evidence of heteroscedasticity against the
predictor Wheelbase?

A. Yes

B. No

C. Need a probability plot to answer

D. Need VIF statistics to answer

90. Which is not a correct way to find the coefficient of determination?

A. SSR/SSE

B. SSR/SST

C. 1 - SSE/SST
91. If SSR = 3600, SSE = 1200, and SST = 4800, then R2 is:

A. .5000

B. .7500

C. .3333

D. .2500

92. Which statement is incorrect?

A. Positive autocorrelation results in too many centerline crossings in the residual plot over time.

B. The R2 statistic can only increase (or stay the same) when you add more
predictors to a regression.

C. If the F-statistic is insignificant, the t-statistics for the predictors also are insignificant at the same α.

D. A regression with 60 observations and 5 predictors does not violate Evans' Rule.
93. Which statement about leverage is incorrect?

A. Leverage refers to an observation's distance from the mean of X.

B. If n = 40 and k = 4 predictors, a leverage statistic of .15 would indicate high leverage.

C. If n = 180 and k = 3 predictors, a leverage statistic of .08 would indicate high leverage.

94. Which statement is incorrect?

A. Binary predictors shift the intercept of the fitted regression.

B. If a qualitative variable has c categories, we would use only c - 1 binaries as predictors.

C. A binary predictor has the same t-test as any other predictor.

D. If there is a binary predictor (X = 0, 1) in the model, the residuals may not sum
to zero.

95. Heteroscedasticity of residuals in regression suggests that there is:

A. nonconstant variation in the errors.

B. multicollinearity among the predictors.

C. nonnormality in the errors.

D. lack of independence in successive errors.


96. If you rerun a regression, omitting a predictor X5, which would be unlikely?

A. The new R2 will decline if X5 was a relevant predictor.

B. The new standard error will increase if X5 was a relevant predictor.

C. The remaining estimated β's will change if X5 was collinear with other
predictors.

D. The numerator degrees of freedom for the F test will increase.

97. In a multiple regression, which is an incorrect statement about the residuals?

A. They may be used to test for multicollinearity.

B. They are differences between observed and estimated values of Y.

C. Their sum will always equal zero.

D. They may be used to detect heteroscedasticity.

98. Which of the following is not a characteristic of the F distribution?

A. It is a continuous distribution.

B. It uses a test statistic Fcalc that can never be negative.

C. Its degrees of freedom vary, depending on α.

D. It is used to test for overall significance in a regression.


99. Which of the following would be most useful in checking the normality
assumption of the errors in a regression model?

A. The t-statistics for the coefficients

B. The F-statistic from the ANOVA table

C. The histogram of residuals

D. The VIF statistics for the predictors

100.The regression equation Salary = 25,000 + 3200 YearsExperience + 1400 YearsCollege describes employee salaries at Axolotl Corporation. The standard
error is 2600. John has 10 years' experience and 4 years of college. His salary is
$66,500. What is John's standardized residual?

A. -1.250

B. -0.240

C. +0.870

D. +1.500
101.The regression equation Salary = 28,000 + 2700 YearsExperience + 1900
YearsCollege describes employee salaries at Ramjac Corporation. The standard
error is 2400. Mary has 10 years' experience and 4 years of college. Her salary is
$58,350. What is Mary's standardized residual (approximately)?

A. -1.150

B. +2.007

C. -1.771

D. +1.400

102.Which Excel function will give the p-value for overall significance if a regression
has 75 observations and 5 predictors and gives an F test statistic Fcalc = 3.67?

A. =F.INV(.05, 5, 75)

B. =F.DIST(3.67, 4, 74)

C. =F.DIST.RT(3.67, 5, 69)

D. =F.DIST(.05, 4, 70)
103.The ScamMore Energy Company is attempting to predict natural gas
consumption for the month of January. A random sample of 50 homes was used
to fit a regression of gas usage (in CCF) using as predictors Temperature = the
thermostat setting (degrees Fahrenheit) and Occupants = the number of
household occupants. They obtained the following results:

In testing each coefficient for a significant difference from zero (two-tailed test at α = .10), which is the most reasonable conclusion about the predictors?

A. Temperature is highly significant; Occupants is barely significant.

B. Temperature is not significant; Occupants is significant.

C. Temperature is less significant than Occupants.

D. Temperature is significant; Occupants is not significant.

104.In a regression with 60 observations and 7 predictors, there will be _____ residuals.

A. 60

B. 59

C. 52

D. 6
105.A regression with 72 observations and 9 predictors violates:

A. Evans' Rule.

B. Klein's Rule.

C. Doane's Rule.

D. Sturges' Rule.

106.The F-test for ANOVA in a regression model with 4 predictors and 47 observations would have how many degrees of freedom?

A. (3, 44)

B. (4, 46)

C. (4, 42)

D. (3, 43)

107.In a regression with 7 predictors and 62 observations, a t-test for each coefficient would use how many degrees of freedom?

A. 61

B. 60

C. 55

D. 54
Essay Questions
108.Using state data (n = 50) for the year 2000, a statistics student calculated a matrix
of correlation coefficients for selected variables describing state averages on the
two main scholastic aptitude tests (ACT and SAT). (a) In the spaces provided, write
the two-tailed critical values of the correlation coefficient for α = .05 and α = .01
respectively. Show how you derived these critical values. (b) Mark with * all
correlations that are significant at α = .05, and mark with ** those that are
significant at α = .01. (c) Why might you expect a negative correlation between
ACT% and SAT%? (d) Why might you expect a positive correlation between SATQ
and SATV? Explain your reasoning. (e) Why is the matrix empty above the
diagonal?
109.Using data for a large sample of cars (n = 93), a statistics student calculated a
matrix of correlation coefficients for selected variables describing each car. (a) In
the spaces provided, write the two-tailed critical values of the correlation
coefficient for α = .05 and α = .01 respectively. Show how you derived these
critical values. (b) Mark with * all correlations that are significant at α = .05, and
mark with ** those that are significant at α = .01. (c) Why might you expect a
negative correlation between Weight and HwyMPG? (d) Why might you expect a
positive correlation between HPMax and Length? Explain your reasoning. (e) Why
is the matrix empty above the diagonal?
110.Analyze the regression below (n = 50 U.S. states) using the concepts you have
learned about multiple regression. Circle things of interest and write comments in
the margin. Make a prediction for Poverty for a state with Dropout = 15,
TeenMom = 12, Unem = 4, and Age65% = 12 (show your work). The variables are
Poverty = percentage below the poverty level; Dropout = percent of adult
population that did not finish high school; TeenMom = percent of total births by
teenage mothers; Unem = unemployment rate, civilian labor force; and Age65% =
percent of population aged 65 and over.
111. Analyze the regression results below (n = 33 cars in 1993) using the concepts you
have learned about multiple regression. Circle things of interest and write
comments in the margin. Make a prediction for CityMPG for a car with EngSize =
2.5, ManTran = 1, Length = 184, Wheelbase = 104, Weight = 3000, and Domestic
= 0 (show your work). The variables are CityMPG = city MPG (miles per gallon by
EPA rating); EngSize = engine size (liters); ManTran = 1 if manual transmission
available, 0 otherwise; Length = vehicle length (inches); Wheelbase = vehicle
wheelbase (inches); Weight = vehicle weight (pounds); Domestic = 1 if U.S.
manufacturer, 0 otherwise.
Chapter 13 Multiple Regression Answer Key

True / False Questions

1. In regression the dependent variable is referred to as the response variable.

TRUE

Y is also sometimes called the dependent variable.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression

2. If a regression model's F test statistic is Fcalc = 43.82, we could say that the
explained variance is approximately 44 percent.

FALSE

The R2 statistic (not the F statistic) shows the percent of explained variation.

AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
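As a worked illustration of this answer: R² comes from the ANOVA sums of squares, not from Fcalc. A minimal sketch, reusing the SS values given in question 59:

```python
# R-squared is the fraction of total variation explained by the regression:
# R2 = SSR/SST. The F statistic is a different quantity, not a percentage.
ss_regression = 738.0   # SS(Regression) from question 59
ss_total = 798.0        # SS(Total) from question 59

r_squared = ss_regression / ss_total
multiple_r = r_squared ** 0.5   # the multiple correlation coefficient

print(round(r_squared, 4))   # 0.9248
print(round(multiple_r, 4))  # 0.9617
```

The square root of R² reproduces the multiple correlation coefficient asked for in question 59.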

3. In a regression, the model with the best fit is preferred over all other models.

FALSE

Occam's Razor says that complexity is justified only if it is necessary for a good
model.

AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression

4. A common misinterpretation of the principle of Occam's Razor is that a simple regression model (rather than a multiple regression model) is always best.

TRUE

Occam's Razor says that complexity is justified only if it is necessary for a good model.

AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
5. A predictor whose pairwise correlation with Y is near zero can still have a
significant t-value in a multiple regression when other predictors are included.

TRUE

The t-statistic for a predictor depends on which other predictors are in the
model.

AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance

6. The F statistic in a multiple regression is significant if at least one of the predictors has a significant t statistic at a given α.

TRUE

At least one predictor coefficient will differ from zero at the same α used in the F test.

AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
7. R2adj can exceed R2 if there are several weak predictors.

FALSE

R2adj is smaller than R2 and a large difference suggests unnecessary predictors.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
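The usual adjusted R² formula behind this answer is R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1). A hedged sketch (the numbers below are hypothetical, chosen only to show that adding weak predictors can lower R²adj even as R² creeps up):

```python
# Adjusted R-squared penalizes each added predictor; with several weak
# predictors, R2adj falls below R2 (it cannot exceed it for k >= 1).
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

base = adjusted_r2(0.50, 30, 3)    # 3 useful predictors (illustrative values)
padded = adjusted_r2(0.51, 30, 6)  # 3 weak predictors tacked on

print(round(base, 4))    # 0.4423
print(round(padded, 4))  # 0.3822
```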

8. A binary (categorical) predictor should not be used along with nonbinary predictors.

FALSE

Binary predictors behave like any other except they look weird on a scatter plot.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
9. In a multiple regression with 3 predictors in a sample of 25 U.S. cities, we would
use F3, 21 in a test of overall significance.

TRUE

For the F-test we use d.f. = (k, n - k - 1).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
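The degrees-of-freedom rule in this answer can be captured in a one-line helper (a sketch; the function name is illustrative):

```python
# The overall F test in multiple regression uses d.f. = (k, n - k - 1),
# where k = number of predictors and n = sample size.
def f_test_df(n, k):
    return (k, n - k - 1)

print(f_test_df(25, 3))   # question 9:  F(3, 21)
print(f_test_df(56, 5))   # question 51: F(5, 50)
```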

10. Evans' Rule says that if n = 50 you need at least 5 predictors to have a good
model.

FALSE

On the contrary, Evans' Rule is intended to prevent having too many predictors.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit
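Evans' Rule, as applied in questions 92 and 105, limits the model to roughly one predictor per ten observations (n/k ≥ 10). A hedged sketch with an illustrative helper name:

```python
# Evans' Rule guards against too many predictors; it sets a ceiling on k,
# not a minimum (which is why the statement in question 10 is false).
def violates_evans_rule(n, k):
    return n / k < 10   # fewer than 10 observations per predictor

print(violates_evans_rule(60, 5))   # question 92: False (12 obs per predictor)
print(violates_evans_rule(72, 9))   # question 105: True (8 obs per predictor)
```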
11. The model Y = β0 + β1X + β2X² cannot be estimated by Excel because of the
nonlinear term.

FALSE

The X2 predictor is just a data column like any other.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction

12. The random error term in a regression model reflects all factors omitted from
the model.

TRUE

The errors are assumed normally distributed with zero mean and constant
variance.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
13. If the probability plot of residuals resembles a straight line, the residuals show a
fairly good fit to the normal distribution.

TRUE

The probability plot is easy to interpret in a general way (linearity suggests normality).

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

14. Confidence intervals for Y may be unreliable when the residuals are not
normally distributed.

TRUE

If serious nonnormality exists and n is small, confidence intervals may be affected.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Violations of Assumptions
15. A negative estimated coefficient in a regression usually indicates a weak
predictor.

FALSE

It is the t-statistic that indicates the strength of a predictor.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance

16. For a certain firm, the regression equation Bonus = 2,000 + 257 Experience +
0.046 Salary describes employee bonuses with a standard error of 125. John has
10 years' experience, earns $50,000, and earned a bonus of $7,000. John is an
outlier.

FALSE

John's standardized residual is (yactual - yestimated)/se = (7,000 - 6,870)/(125) = 1.04, which is not unusual.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
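The arithmetic behind this answer can be sketched in Python (a minimal sketch; the helper name is illustrative and the values come from the question):

```python
# Standardized residual for Q16: (actual - fitted) / standard error.
def standardized_residual(y_actual, y_fitted, std_error):
    return (y_actual - y_fitted) / std_error

# John: 10 years' experience, $50,000 salary, $7,000 bonus, se = 125.
y_fitted = 2000 + 257 * 10 + 0.046 * 50000   # fitted bonus = 6,870
z = standardized_residual(7000, y_fitted, 125)  # 1.04, well under the usual |z| > 2 flag
```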
17. There is one residual for each predictor in the regression model.

FALSE

There are k predictors, but there are n residuals e1, e2, …, en.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression

18. If R2 and R2adj differ greatly, we should probably add a few predictors to
improve the fit.

FALSE

Evidence of unnecessary predictors can be seen when R2adj is much smaller than
R2.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
19. The effect of a binary predictor is to shift the regression intercept.

TRUE

The omitted category becomes part of the intercept.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors

20. A parsimonious model is one with many weak predictors but a few strong
ones.

FALSE

On the contrary, a lean (parsimonious) model has strong predictors and no weak ones.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression
21. The F statistic and its p-value give a global test of significance for a multiple
regression.

TRUE

The F-test tells whether or not at least some predictors are significant.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

22. In a regression model of student grades, we would code the nine categories of
business courses taken (ACC, FIN, ECN, MGT, MKT, MIS, ORG, POM, QMM) by
including nine binary (0 or 1) predictors in the regression.

FALSE

We can code c categories with c - 1 predictors (i.e., omit one).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
23. A disadvantage of Excel's Data Analysis regression tool is that it expects the
independent variables to be in a block of contiguous columns so you must
delete a column if you want to eliminate a predictor from the model.

TRUE

This is why we might want to use MINITAB, MegaStat, SPSS, or Systat.

AACSB: Technology
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression

24. A disadvantage of Excel's regression is that it does not give as much accuracy in
the estimated regression coefficients as a package like MINITAB.

FALSE

Excel's accuracy is good for most of the common regression statistics.

AACSB: Technology
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression
25. Nonnormality of the residuals from a regression can best be detected by
looking at the residual plots against the fitted Y values.

FALSE

Use a probability plot to check for nonnormality (a residual plot tests for
heteroscedasticity).

AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

26. A high variance inflation factor (VIF) indicates a significant predictor in the
regression.

FALSE

A high VIF indicates that a predictor is related to the other predictors in the
model.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
27. Autocorrelation may be detected by looking at a plot of the residuals against
time.

TRUE

Too many or too few crossings of the zero axis suggest nonrandomness.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

28. A widening pattern of residuals as X increases would suggest heteroscedasticity.

TRUE

The absence of a pattern would be ideal (homoscedastic).

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
29. Plotting the residuals against a binary predictor (X = 0, 1) reveals nothing about
heteroscedasticity.

FALSE

You can still spot wider or narrower spread at the two points X = 0 and X = 1.

AACSB: Analytic
Blooms: Remember
Difficulty: 3 Hard
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

30. The regression equation Bonus = 2,812 + 27 Experience + 0.046 Salary says
that Experience is the most significant predictor of Bonus.

FALSE

You need a t-statistic to assess significance of a predictor.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
31. A multiple regression with 60 observations should not have 13 predictors.

TRUE

Evans' Rule suggests no more than n/10 = 60/10 = 6 predictors.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
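The rule used in this answer is simple enough to express directly (a sketch, using the usual n/10 form of Evans' Rule):

```python
# Evans' Rule sketch: a regression with n observations should use at most n/10 predictors.
def evans_rule_max_predictors(n):
    return n // 10

max_k = evans_rule_max_predictors(60)  # 6, so 13 predictors would be far too many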

32. A regression of Y using four independent variables X1, X2, X3, X4 could also have
up to four nonlinear terms (Xj²) and six simple interaction terms (XjXk) if you
have enough observations to justify them.

TRUE

We must count all the possible squares and two-way combinations of four
predictors.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction
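The counting in this answer is a small combinatorics exercise; a sketch:

```python
import math

# Counting candidate quadratic and two-way interaction terms for k = 4 predictors.
k = 4
quadratic_terms = k                  # X1^2 ... X4^2
interaction_terms = math.comb(k, 2)  # distinct pairs XjXk with j < k
```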
33. When autocorrelation is present, the estimates of the coefficients will be
unbiased.

TRUE

There is no bias in the OLS estimates, though variances and t-tests may be
affected.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

34. If the residuals in your regression are nonnormal, a larger sample size might
help improve the reliability of confidence intervals for Y.

TRUE

Asymptotic normality and consistency of the OLS estimators may help.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
35. Multicollinearity can be detected from t tests of the predictor variables.

FALSE

The t-tests only indicate significance (we use VIFs to detect multicollinearity).

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity

36. When multicollinearity is present, the regression model is of no use for making
predictions.

FALSE

Multicollinearity makes it hard to assess each predictor's role, but predictions may be useful.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
37. Autocorrelation of the residuals may affect the reliability of the t values for the
estimated coefficients of the predictors X1, X2, . . . , Xk.

TRUE

Autocorrelation can affect the variances of the estimators, hence their t-values.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

38. The first differences transformation might be tried if autocorrelation is found in a time-series data set.

TRUE

First differences may help and are an easily understood transformation.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
39. Statisticians who work with cross-sectional data generally do not anticipate
autocorrelation.

TRUE

We are more likely to see autocorrelation in time-series data.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

40. The ill effects of heteroscedasticity might be mitigated by redefining totals (e.g.,
total number of homicides) as relative values (e.g., homicide rate per 100,000
population).

TRUE

Large magnitude ranges for X's and Y (the "size" problem) can induce
heteroscedasticity.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Violations of Assumptions
41. Nonnormal residuals lead to biased estimates of the coefficients in a regression
model.

FALSE

There is no bias in the estimated coefficients, though confidence intervals may be affected.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

42. A large VIF (e.g., 10 or more) would indicate multicollinearity.

TRUE

Some multicollinearity is inevitable, but very large VIFs suggest competing predictors.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
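The VIF-of-10 threshold mentioned here has a one-line formula; a hedged sketch (R_j² would come from an auxiliary regression of predictor j on the other predictors, which is not shown):

```python
# VIF sketch: VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from
# regressing predictor X_j on the remaining predictors.
def vif(r_squared_j):
    return 1.0 / (1.0 - r_squared_j)

# R_j^2 = .90 gives VIF = 10, the common rule-of-thumb cutoff.
vif_at_threshold = vif(0.90)
```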
43. Heteroscedasticity exists when all the errors (residuals) have the same variance.

FALSE

The statement would be true if we change the first word to "homoscedasticity."

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

44. Multicollinearity refers to relationships among the independent variables.

TRUE

When one predictor is predicted by the other predictors, we have multicollinearity.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
45. A squared predictor is used to test for nonlinearity in the predictor's
relationship to Y.

TRUE

Including a squared predictor is an easy way to test whether the relationship is nonlinear.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction

46. Nonnormality of residuals is not usually considered a major problem unless there are outliers.

TRUE

Serious nonnormality can make the confidence intervals unreliable.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
47. In the fitted regression Y = 12 + 3X1 - 5X2 + 27X3 + 2X4 the most significant
predictor is X3.

FALSE

We must have the t-statistics (not just the coefficients) to assess each
predictor's significance.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance

48. Given that the fitted regression is Y = 76.40 - 6.388X1 + 0.870X2, the standard
error of b1 is 1.453, and n = 63. At α = .05, we can conclude that X1 is a
significant predictor of Y.

TRUE

tcalc = (-6.388)/(1.453) = -4.396, which is beyond -t.025 = -2.000 for d.f. = 60 in a two-tailed test.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
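The test in Q48 can be checked in a few lines (a sketch; the critical value is taken from a t table, as in the answer):

```python
# Two-tailed t test for b1 in Q48, using the numbers stated in the question.
b1, se_b1 = -6.388, 1.453
t_calc = b1 / se_b1          # about -4.396
t_crit = 2.000               # t.025 for d.f. = 60, from a t table
x1_significant = abs(t_calc) > t_crit  # True: X1 is a significant predictor
```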
49. Unlike other predictors, a binary predictor has a t-value that is either 0 or 1.

FALSE

The t-value for a binary predictor is like any other t-value.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors

50. The t-test shows the ratio of an estimated coefficient to its standard error.

TRUE

In a test for zero coefficient (and in computer output) tcalc = bj/sbj.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
51. In a multiple regression with five predictors in a sample of 56 U.S. cities, we
would use F5, 50 in a test of overall significance.

TRUE

F.05 = 2.40 for d.f. = (k, n - k - 1) = (5, 56 - 5 - 1) = (5, 50).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

Multiple Choice Questions

52. In a multiple regression with six predictors in a sample of 67 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?

A. 2.29

B. 2.25

C. 2.37

D. 2.18

F.05 = 2.25 for d.f. = (k, n - k - 1) = (6, 67 - 6 - 1) = (6, 60).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

53. In a multiple regression with five predictors in a sample of 56 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?

A. 2.45

B. 2.37

C. 2.40

D. 2.56

F.05 = 2.40 for d.f. = (k, n - k - 1) = (5, 56 - 5 - 1) = (5, 50).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
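The degrees of freedom in questions 52-53 follow one formula; a sketch (the critical value itself still comes from an F table, or from scipy.stats.f.ppf if SciPy is available):

```python
# Degrees of freedom for the overall F test: (k, n - k - 1).
def f_test_df(n, k):
    return (k, n - k - 1)

df_q52 = f_test_df(67, 6)   # (6, 60), so F.05 = 2.25 from the F table
df_q53 = f_test_df(56, 5)   # (5, 50), so F.05 = 2.40 from the F table
# With SciPy installed: scipy.stats.f.ppf(0.95, 6, 60) returns the critical value.
```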
54. When predictor variables are strongly related to each other, the __________ of
the regression estimates is questionable.

A. logic

B. fit

C. parsimony

D. stability

High interpredictor correlation affects their variances, so coefficients are less certain.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
55. A test is conducted in 22 cities to see if giving away free transit system maps
will increase the number of bus riders. In a regression analysis, the dependent
variable Y is the increase in bus riders (in thousands of persons) from the start
of the test until its conclusion. The independent variables are X1 = the number
(in thousands) of free maps distributed and a binary variable X2 = 1 if the city
has free downtown parking, 0 otherwise. The estimated regression equation is
Y = 1.32 + 0.0345X1 - 1.45X2.

In city 3, the observed Y value is 7.3 and X1 = 140 and X2 = 0. The residual for
city 3 (in thousands) is:

A. 6.15.

B. 1.15.

C. 4.83.

D. 1.57.

yestimated = 1.32 + .0345(140) - 1.45(0) = 6.15, so the residual is (7.3 - 6.15) = 1.15.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit
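The fitted value and residual for city 3 can be verified directly (a sketch, using the coefficients given in the answer):

```python
# City 3 in Q55: X1 = 140 (thousand maps), X2 = 0 (no free parking), observed Y = 7.3.
y_fitted_city3 = 1.32 + 0.0345 * 140 - 1.45 * 0   # 6.15
residual_city3 = 7.3 - y_fitted_city3             # 1.15 (thousand riders)
```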
56. If X2 is a binary predictor in Y = β0 + β1X1 + β2X2, then which statement is most
nearly correct?

A. X2 = 1 should represent the most desirable condition.

B. X2 would be a significant predictor if β2 = 423.72.

C. X2 = 0, X2 = 1, X2 = 2 would be appropriate if three categories exist.

D. X2 will shift the estimated equation either by 0 units or by β2 units.

If X2 = 0 then nothing is added to the equation, while if X2 = 1 we add β2 units.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors

57. The unexplained sum of squares measures variation in the dependent variable
Y about the:

A. mean of the Y values.

B. estimated Y values.

C. mean of the X values.

D. Y-intercept.

We are trying to explain variation in the response variable around its mean.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

58. Which of the following is not true of the standard error of the regression?

A. It is a measure of the accuracy of the prediction.

B. It is based on squared vertical deviations between the actual and predicted values of Y.

C. It would be negative when there is an inverse relationship in the model.

D. It is used in constructing confidence and prediction intervals for Y.

The standard error is the square root of a sum of squares so it cannot be negative.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Confidence Intervals for Y
59. A multiple regression analysis with two independent variables yielded the
following results in the ANOVA table: SS(Total) = 798, SS(Regression) = 738,
SS(Error) = 60. The multiple correlation coefficient is:

A. .2742

B. .0752

C. .9248

D. .9617

R2 = SSR/SST = 738/798 = .9248, so the multiple R = (R2)^(1/2) = .9617.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
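The two-step calculation in this answer, sketched in Python:

```python
import math

# R-squared and the multiple correlation coefficient from the ANOVA sums of squares.
sst, ssr = 798.0, 738.0
r_squared = ssr / sst              # .9248
multiple_r = math.sqrt(r_squared)  # .9617
```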
60. A fitted multiple regression equation is Y = 12 + 3X1 - 5X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?

A. Decrease by 2

B. Decrease by 4

C. Increase by 2

D. No change in Y

The net effect is + 3X1 - 5X2 = 3(2) - 5(2) = 6 - 10 = -4.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
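The net-effect arithmetic in Q60 is just a sum of coefficient-times-change terms; a sketch:

```python
# Q60: Y = 12 + 3X1 - 5X2 + 7X3 + 2X4; X1 and X2 each rise 2 units, X3 and X4 fixed.
b1, b2 = 3, -5
delta_y = b1 * 2 + b2 * 2   # 6 - 10 = -4, a decrease of 4
```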
61. A fitted multiple regression equation is Y = 28 + 5X1 - 4X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?

A. Increase by 2

B. Decrease by 4

C. Increase by 4

D. No change in Y

The net effect is + 5X1 - 4X2 = 5(2) - 4(2) = 10 - 8 = +2.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
62. Which is not a name often given to an independent variable that takes on just
two values (0 or 1) according to whether or not a given characteristic is absent
or present?

A. Absent variable

B. Binary variable

C. Dummy variable

A two-valued predictor is a binary or dummy variable (special cases of categorical predictors).

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
63. Using a sample of 63 observations, a dependent variable Y is regressed against
two variables X1 and X2 to obtain the fitted regression equation Y = 76.40 -
6.388X1 + 0.870X2. The standard error of b1 is 3.453 and the standard error of b2
is 0.611. At α = .05, we could:

A. conclude that both coefficients differ significantly from zero.

B. reject H0: β1 ≥ 0 and conclude H1: β1 < 0.

C. reject H0: β2 ≤ 0 and conclude H1: β2 > 0.

D. conclude that Evans' Rule has been violated.

For β1 we have tcalc = (-6.388)/(3.453) = -1.849, which is beyond -t.05 = -1.671 for
d.f. = 60 in a left-tailed test. For β2 we have tcalc = (0.870)/(0.611) = +1.424, which
does not exceed t.05 = +1.671 for d.f. = 60 in a right-tailed test. For a two-tailed
test, t.025 = 2.000, so neither coefficient would differ significantly from zero at
α = .05. Evans' Rule is not violated because n/k = 63/3 = 21.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
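The two-tailed comparison ruling out answer A can be sketched for both coefficients at once (values from the question; t_crit from a t table):

```python
# Two-tailed t tests for both coefficients in Q63 (t.025 = 2.000 for d.f. = 60).
coeffs = {"b1": (-6.388, 3.453), "b2": (0.870, 0.611)}
t_crit = 2.000
significant = {name: abs(b / se) > t_crit for name, (b, se) in coeffs.items()}
# Neither coefficient is significant in a two-tailed test at alpha = .05.
```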
64. Refer to this ANOVA table from a regression:

Which statement is not accurate?

A. The F-test is significant at α = .05.

B. There were 50 observations.

C. There were 5 predictors.

D. There would be 50 residuals.

d.f. = (k, n - k - 1) = (4, 45), so k = 4 predictors.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
65. Refer to this ANOVA table from a regression:

For this regression, the R2 is:

A. .3995.

B. .6005.

C. .6654.

D. .8822.

R2 = SSR/SST = (1793.2356)/(4488.3352) = .3995.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
66. Refer to the following regression results. The dependent variable is Abort (the
number of abortions per 1000 women of childbearing age). The regression was
estimated using data for the 50 U.S. states with these predictors: EdSpend =
public K-12 school expenditure per capita, Age = median age of population,
Unmar = percent of total births by unmarried women, Infmor = infant mortality
rate in deaths per 1000 live births.

Which statement is not supported by a two-tailed test?

A. Unmar is a significant predictor at α = .01.

B. EdSpend is a significant predictor at α = .20.

C. Infmor is not a significant predictor at α = .05.

D. Age is not a significant predictor at α = .05.

For Infmor, tcalc = (-3.7848)/(1.0173) = -3.720, which is beyond -t.025 = -2.014 for d.f. = 45.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
67. Refer to the following correlation matrix that was part of a regression analysis.
The dependent variable was Abort (the number of abortions per 1000 women
of childbearing age). The regression was estimated using data for the 50 U.S.
states with these predictors: EdSpend = public K-12 school expenditure per
capita, Age = median age of population, Unmar = percent of total births by
unmarried women, Infmor = infant mortality rate in deaths per 1000 live births.
Correlation Matrix

Using a two-tailed correlation test, which statement is not accurate?

A. Age and Infmor are not significantly correlated at = .05.

B. Abort and Unmar are significantly correlated at = .05.

C. Unmar and Infmor are significantly correlated at = .05.

D. The first column of the table shows evidence of multicollinearity.

Use rcrit = t.025/(t.025² + n - 2)^(1/2) = (2.011)/(2.011² + 50 - 2)^(1/2) = .2788 for
d.f. = 50 - 2 = 48 for a two-tailed test at α = .05. Using this criterion, we see that
two pairs of variables, (Abort and Unmar) and (Unmar and Infmor), have
correlations that differ significantly from zero.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
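The critical-correlation formula used in this answer, as a sketch:

```python
import math

# Critical correlation for a two-tailed test: r_crit = t / sqrt(t^2 + n - 2).
def r_critical(t_crit, n):
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

r_crit_50 = r_critical(2.011, 50)   # about .2788 for d.f. = 48
```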

68. Part of a regression output is provided below. Some of the information has
been omitted.

The approximate value of F is:

A. 1605.7.

B. 0.9134.

C. 89.66.

D. impossible to calculate with the given information.

Fcalc = MSR/MSE = (1588.6)/(17.717) = 89.66.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
69. Part of a regression output is provided below. Some of the information has
been omitted.

The SS (residual) is:

A. 3177.17.

B. 301.19.

C. 17.71.

D. impossible to determine.

SSE = SST - SSR = 3478.36 - 3177.17 = 301.19.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
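The ANOVA bookkeeping behind questions 68-69 is sketched below (SS and MS values are those quoted in the answers; the full output is not shown):

```python
# ANOVA identities: SSE = SST - SSR, and F = MSR/MSE.
sst, ssr = 3478.36, 3177.17
sse = sst - ssr                # 301.19
msr, mse = 1588.6, 17.717      # mean squares from the (partial) output
f_calc = msr / mse             # about 89.66
```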
70. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). Part of the regression output is provided below, based on a
sample of 20 homes. Some of the information has been omitted.

The estimated coefficient for Size is approximately:

A. 9.5.

B. 13.8.

C. 122.5.

D. 1442.6.

Coefficient = (t Stat) × (Std Err) = (11.439)(1.2072436) = 13.81.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
71. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). The regression output is provided below. Some of the
information has been omitted.

How many predictors (independent variables) were used in the regression?

A. 20

B. 18

C. 3

D. 2

d.f. = (k, n - k - 1) = (2, 17), so k = 2.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
72. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). The regression output is provided below. Some of the
information has been omitted.

Which of the following conclusions can be made based on the F-test?

A. The p-value on the F-test will be very high.

B. At least one of the predictors is useful in explaining Y.

C. The model is of no use in predicting selling prices of houses.

D. The estimates were based on a sample of 19 houses.

Fcalc = MSR/MSE = (1588.6)/(17.717) = 89.66, which exceeds F.05 = 3.59 for d.f. =
(2, 17).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
73. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). Part of the regression output is provided below, based on a
sample of 20 homes. Some of the information has been omitted.

Which statement is supported by the regression output?

A. At α = .05, FP is not a significant predictor in a two-tailed test.

B. A fireplace adds around $6476 to the selling price of the average house.

C. A large house with no fireplace will sell for more than a small house with a
fireplace.

D. FP is a more significant predictor than Size.

The estimated coefficient of FP is 6.476 (our home prices are in thousands).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
74. A log transformation might be appropriate to alleviate which problem(s)?

A. Heteroscedastic residuals

B. Multicollinearity

C. Autocorrelated residuals

By reducing data magnitudes, the log transform may help equalize variances.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

75. A useful guideline in determining the extent of collinearity in a multiple regression model is:

A. Sturge's Rule.

B. Klein's Rule.

C. Occam's Rule.

D. Pearson's Rule.

Klein's Rule suggests severe collinearity if any r exceeds the multiple correlation
coefficient.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity

76. In a multiple regression all of the following are true regarding residuals except:

A. their sum always equals zero.

B. they are the differences between observed and predicted values of the
response variable.

C. they may be used to detect multicollinearity.

D. they may be used to detect heteroscedasticity.

Residuals help in all these except to detect multicollinearity (we need VIFs for
that task).

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
77. The residual plot below suggests which violation(s) of regression assumptions?

A. Autocorrelation

B. Heteroscedasticity

C. Nonnormality

D. Multicollinearity

There seems to be a "fan-out" pattern (nonconstant residual variance).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
78. Which is not a standard criterion for assessing a regression model?

A. Logic of causation

B. Overall fit

C. Degree of collinearity

D. Binary predictors

Binary predictors may be a useful part of any regression model.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression

79. If the standard error is 12, a quick prediction interval for Y is:

A. 15.

B. 24.

C. 19.

D. impossible to determine without an F table.

Double the standard error to get the approximate half-width (Y ± 2s) of a prediction interval for Y.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Confidence Intervals for Y

80. Which is a characteristic of the variance inflation factor (VIF)?

A. It is insignificant unless the corresponding t-statistic is significant.

B. It reveals collinearity rather than multicollinearity.

C. It measures the degree of significance of each predictor.

D. It indicates the predictor's degree of multicollinearity.

The larger the VIFs, the more we suspect that the predictors are multicollinear.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
81. Which statement best describes this regression (Y = highway miles per gallon in
91 cars)?

A. Statistically significant but large error in the MPG predictions

B. Statistically significant and quite small MPG prediction errors

C. Not quite significant, but predictions should be very good

D. Not a significant regression at any customary level of α

The p-value for the F-test indicates significance, but the quick prediction
interval is Y ± 2(4.019), or Y ± 8 mpg, which would not permit a very precise
prediction.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
82. Based on these regression results, in your judgment which statement is most
nearly correct (Y = highway miles per gallon in 91 cars)?

A. The number of predictors is rather small.

B. Some predictors are not contributing much.

C. Prediction intervals would be fairly narrow in terms of MPG.

D. The overall model lacks significance and/or predictive power.

There is a gap between R2 and R2adj, which suggests some superfluous predictors were used.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
83. In the following regression, which are the three best predictors?

A. ManTran, Wheelbase, RearStRm

B. ManTran, Length, Width

C. NumCyl, HPMax, Length

D. Cannot be ascertained from given information

The absolute t-statistics indicate a ranking.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
84. In the following regression, which are the two best predictors?

A. NumCyl, HPMax

B. Intercept, NumCyl

C. NumCyl, Domestic

D. ManTran, Width

Absolute t-statistics indicate a ranking, so find tcalc = (Coef)/(Std Err) for each
predictor.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
85. In the following regression (n = 91), which coefficients differ from zero in a two-
tailed test at α = .05?

A. NumCyl, HPMax

B. Intercept, ManTran

C. Intercept, NumCyl, Domestic

D. Intercept, Domestic

If the confidence interval includes zero, the predictor is not significant in a two-
tailed test.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
86. Based on the following regression ANOVA table, what is the R2?

A. 0.1336

B. 0.6005

C. 0.3995

D. Insufficient information to answer

R2 = SSR/SST = (1793.2356)/(4488.3352) = .3995.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
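The ratio in the explanation can be checked directly. A minimal sketch in Python, using the SSR and SST values quoted above:

```python
# R-squared from the ANOVA decomposition: R2 = SSR/SST.
ssr = 1793.2356  # regression (explained) sum of squares
sst = 4488.3352  # total sum of squares
r_squared = ssr / sst
print(round(r_squared, 4))  # 0.3995
```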
87. In the following regression, which statement best describes the degree of
multicollinearity?

A. Very little evidence of multicollinearity.

B. Much evidence of multicollinearity.

C. Only NumCyl and HPMax are collinear.

D. Only ManTran and RearStRm are collinear.

Many predictors have large VIFs.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
88. The relationship of Y to four other variables was established as Y = 12 + 3X1 -
5X2 + 7X3 + 2X4. When X1 increases 5 units and X2 increases 3 units, while X3
and X4 remain unchanged, what change would you expect in your estimate of
Y?

A. Decrease by 15

B. Increase by 15

C. No change

D. Increase by 5

The net effect is 3ΔX1 - 5ΔX2 = 3(5) - 5(3) = 15 - 15 = 0.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Predictor Significance
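The arithmetic above can be sketched as a coefficient-times-change sum (coefficients and changes taken from the question; the dictionary layout is just illustrative):

```python
# Change in predicted Y = sum of (coefficient * change in X), other X's held fixed.
coefs = {"X1": 3, "X2": -5, "X3": 7, "X4": 2}
deltas = {"X1": 5, "X2": 3, "X3": 0, "X4": 0}
delta_y = sum(coefs[x] * deltas[x] for x in coefs)
print(delta_y)  # 0
```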
89. Does the picture below show strong evidence of heteroscedasticity against the
predictor Wheelbase?

A. Yes

B. No

C. Need a probability plot to answer

D. Need VIF statistics to answer

Scatter appears random (no systematic difference in vertical spread).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
90. Which is not a correct way to find the coefficient of determination?

A. SSR/SSE

B. SSR/SST

C. 1 - SSE/SST

R2 = SSR/SST or R2 = 1 - SSE/SST.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

91. If SSR = 3600, SSE = 1200, and SST = 4800, then R2 is:

A. .5000

B. .7500

C. .3333

D. .2500

R2 = SSR/SST = 3600/4800 = .7500.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
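Both formulas for R2 can be verified to agree here, since SST = SSR + SSE; a quick sketch:

```python
ssr, sse, sst = 3600.0, 1200.0, 4800.0
r2_a = ssr / sst       # R2 = SSR/SST
r2_b = 1 - sse / sst   # R2 = 1 - SSE/SST, identical because SST = SSR + SSE
print(r2_a, r2_b)  # 0.75 0.75
```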
92. Which statement is incorrect?

A. Positive autocorrelation results in too many centerline crossings in the
residual plot over time.

B. The R2 statistic can only increase (or stay the same) when you add more
predictors to a regression.

C. If the F-statistic is insignificant, the t-statistics for the predictors also are
insignificant at the same α.

D. A regression with 60 observations and 5 predictors does not violate Evans'
Rule.

Positive autocorrelation results in too few crossings of the zero point on the
axis (cycles).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
93. Which statement about leverage is incorrect?

A. Leverage refers to an observation's distance from the mean of X.

B. If n = 40 and k = 4 predictors, a leverage statistic of .15 would indicate high
leverage.

C. If n = 180 and k = 3 predictors, a leverage statistic of .08 would indicate high
leverage.

2(k + 1)/n = 2(4 + 1)/40 = .25, so hi = .15 would not indicate high leverage.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
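The quick rule used in the explanation can be sketched as a small helper (the function name is illustrative):

```python
def high_leverage(h, n, k):
    """Quick rule: flag a leverage statistic h_i that exceeds 2(k + 1)/n."""
    return h > 2 * (k + 1) / n

print(high_leverage(0.15, n=40, k=4))   # False: threshold is 2(5)/40 = .25
print(high_leverage(0.08, n=180, k=3))  # True: threshold is 2(4)/180, about .044
```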
94. Which statement is incorrect?

A. Binary predictors shift the intercept of the fitted regression.

B. If a qualitative variable has c categories, we would use only c - 1 binaries as
predictors.

C. A binary predictor has the same t-test as any other predictor.

D. If there is a binary predictor (X = 0, 1) in the model, the residuals may not
sum to zero.

Residuals always sum to zero using the OLS method.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors

95. Heteroscedasticity of residuals in regression suggests that there is:

A. nonconstant variation in the errors.

B. multicollinearity among the predictors.

C. nonnormality in the errors.

D. lack of independence in successive errors.

Heteroscedasticity is nonconstant residual variance.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions

96. If you rerun a regression, omitting a predictor X5, which would be unlikely?

A. The new R2 will decline if X5 was a relevant predictor.

B. The new standard error will increase if X5 was a relevant predictor.

C. The remaining estimated β's will change if X5 was collinear with other
predictors.

D. The numerator degrees of freedom for the F test will increase.

Numerator df is the number of predictors, so omitting one would have the
opposite effect.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
97. In a multiple regression, which is an incorrect statement about the residuals?

A. They may be used to test for multicollinearity.

B. They are differences between observed and estimated values of Y.

C. Their sum will always equal zero.

D. They may be used to detect heteroscedasticity.

To check for multicollinearity we would look at the VIFs or a correlation matrix.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity

98. Which of the following is not a characteristic of the F distribution?

A. It is a continuous distribution.

B. It uses a test statistic Fcalc that can never be negative.

C. Its degrees of freedom vary, depending on α.

D. It is used to test for overall significance in a regression.

In ANOVA we use d.f. = (k, n - k - 1). The value of α does not affect the d.f.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit

99. Which of the following would be most useful in checking the normality
assumption of the errors in a regression model?

A. The t-statistics for the coefficients

B. The F-statistic from the ANOVA table

C. The histogram of residuals

D. The VIF statistics for the predictors

A histogram could reveal skewness or possibly outliers.

AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
100. The regression equation Salary = 25,000 + 3200 YearsExperience + 1400
YearsCollege describes employee salaries at Axolotl Corporation. The standard
error is 2600. John has 10 years' experience and 4 years of college. His salary is
$66,500. What is John's standardized residual?

A. -1.250

B. -0.240

C. +0.870

D. +1.500

John's predicted salary is 25,000 + 3200(10) + 1400(4) = 62,600, so his
standardized residual is (66,500 - 62,600)/(2600) = 1.500 (he is somewhat
overpaid according to the fitted regression).

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
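A sketch of the calculation, using the fitted equation and standard error from the question:

```python
# Standardized residual = (actual - predicted) / standard error.
predicted = 25_000 + 3_200 * 10 + 1_400 * 4   # John's predicted salary: 62,600
residual = (66_500 - predicted) / 2_600
print(residual)  # 1.5
```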
101. The regression equation Salary = 28,000 + 2700 YearsExperience + 1900
YearsCollege describes employee salaries at Ramjac Corporation. The standard
error is 2400. Mary has 10 years' experience and 4 years of college. Her salary is
$58,350. What is Mary's standardized residual (approximately)?

A. -1.150

B. +2.007

C. -1.771

D. +1.400

Mary's predicted salary is 28,000 + 2700 (10) + 1900 (4) = 62,600, so her
standardized residual is (58,350 - 62,600)/(2400) = -1.771 (she is somewhat
underpaid according to the fitted regression).

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
102. Which Excel function will give the p-value for overall significance if a regression
has 75 observations and 5 predictors and gives an F test statistic Fcalc = 3.67?

A. =F.INV(.05, 5, 75)

B. =F.DIST(3.67, 4, 74)

C. =F.DIST.RT(3.67, 5, 69)

D. =F.DIST(.05, 4, 70)

In pre-2010 versions of Excel the function was =FDIST(3.67, 5, 69) for d.f. = (k, n
- k - 1).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
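Outside Excel, the same right-tail p-value can be approximated numerically. A sketch using only the standard library, integrating the F density by the trapezoid rule (scipy.stats.f.sf(3.67, 5, 69) would give the same answer directly):

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    log_c = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
             - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    return math.exp(log_c + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2))

def f_sf(x, d1, d2, upper=400.0, steps=200_000):
    """Right-tail area P(F > x); the tail beyond `upper` is negligible here."""
    h = (upper - x) / steps
    total = 0.5 * (f_pdf(x, d1, d2) + f_pdf(upper, d1, d2))
    for i in range(1, steps):
        total += f_pdf(x + i * h, d1, d2)
    return total * h

p_value = f_sf(3.67, 5, 69)   # analogous to =F.DIST.RT(3.67, 5, 69)
print(round(p_value, 4))      # about .005, well below alpha = .01
```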
103. The ScamMore Energy Company is attempting to predict natural gas
consumption for the month of January. A random sample of 50 homes was
used to fit a regression of gas usage (in CCF) using as predictors Temperature
= the thermostat setting (degrees Fahrenheit) and Occupants = the number of
household occupants. They obtained the following results:

In testing each coefficient for a significant difference from zero (two-tailed test
at α = .10), which is the most reasonable conclusion about the predictors?

A. Temperature is highly significant; Occupants is barely significant.

B. Temperature is not significant; Occupants is significant.

C. Temperature is less significant than Occupants.

D. Temperature is significant; Occupants is not significant.

Find the test statistic tcalc = (Coef)/(StdErr) for each predictor and compare with
t.05 = 1.678 for d.f. = n - k - 1 = 50 - 2 - 1 = 47.

AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
104. In a regression with 60 observations and 7 predictors, there will be _____
residuals.

A. 60

B. 59

C. 52

D. 6

There are 60 residuals e1, e2, . . . , e60 (one residual for each observation).

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit

105. A regression with 72 observations and 9 predictors violates:

A. Evans' Rule.

B. Klein's Rule.

C. Doane's Rule.

D. Sturges' Rule.

Evans' Rule suggests n/k ≥ 10, but in this example n/k = 72/9 = 8.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
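The rule can be sketched as a one-line check (the function name is illustrative):

```python
def violates_evans_rule(n, k):
    """Evans' Rule: want at least 10 observations per predictor (n/k >= 10)."""
    return n / k < 10

print(violates_evans_rule(72, 9))  # True: 72/9 = 8
print(violates_evans_rule(60, 5))  # False: 60/5 = 12
```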

106. The F-test for ANOVA in a regression model with 4 predictors and 47
observations would have how many degrees of freedom?

A. (3, 44)

B. (4, 46)

C. (4, 42)

D. (3, 43)

d.f. = (k, n - k - 1) = (4, 47 - 4 - 1) = (4, 42).

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
107. In a regression with 7 predictors and 62 observations, degrees of freedom for a
t-test for each coefficient would use how many degrees of freedom?

A. 61

B. 60

C. 55

D. 54

d.f. = n - k - 1 = 62 - 7 - 1 = 54.

AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
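The two degrees-of-freedom rules from this question and the previous one can be sketched together (helper names are illustrative):

```python
def f_test_df(n, k):
    """Degrees of freedom for the overall F test: (k, n - k - 1)."""
    return k, n - k - 1

def t_test_df(n, k):
    """Degrees of freedom for each coefficient's t test: n - k - 1."""
    return n - k - 1

print(f_test_df(47, 4))  # (4, 42)
print(t_test_df(62, 7))  # 54
```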

Essay Questions
108. Using state data (n = 50) for the year 2000, a statistics student calculated a
matrix of correlation coefficients for selected variables describing state averages
on the two main scholastic aptitude tests (ACT and SAT). (a) In the spaces
provided, write the two-tailed critical values of the correlation coefficient for
α = .05 and α = .01, respectively. Show how you derived these critical values. (b)
Mark with * all correlations that are significant at α = .05, and mark with **
those that are significant at α = .01. (c) Why might you expect a negative
correlation between ACT% and SAT%? (d) Why might you expect a positive
correlation between SATQ and SATV? Explain your reasoning. (e) Why is the
matrix empty above the diagonal?

(a) As explained in Chapter 12, for d.f. = n - 2 = 50 - 2 = 48, the critical values
of Student's t for a two-tailed test for zero correlation are t.025 = 2.011 and t.005
= 2.682. The critical values of the correlation coefficient are:
(b) No correlation in the first column (ACT) is significant at either α, but all the
other correlations differ significantly from zero at either value of α. (c) An
inverse correlation between ACT% and SAT% might be expected because
students in a given state usually take one or the other, but not both (depending
on what their state universities prefer). (d) If the tests measure general ability,
test-takers who score well on SATQ tend also to score well on SATV. (e) Entries
above the diagonal are redundant, so they are omitted.


AACSB: Reflective Thinking


Blooms: Evaluate
Difficulty: 3 Hard
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
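The critical correlation values referred to above (the table itself is not reproduced here) follow from the standard conversion r = t / sqrt(t² + d.f.); a sketch:

```python
import math

def critical_r(t_crit, df):
    """Critical correlation from a critical t value: r = t / sqrt(t^2 + d.f.)."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# d.f. = n - 2 = 48 for the 50 states
print(round(critical_r(2.011, 48), 3))  # 0.279 for a two-tailed test at alpha = .05
print(round(critical_r(2.682, 48), 3))  # 0.361 for a two-tailed test at alpha = .01
```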
109. Using data for a large sample of cars (n = 93), a statistics student calculated a
matrix of correlation coefficients for selected variables describing each car. (a)
In the spaces provided, write the two-tailed critical values of the correlation
coefficient for α = .05 and α = .01, respectively. Show how you derived these
critical values. (b) Mark with * all correlations that are significant at α = .05, and
mark with ** those that are significant at α = .01. (c) Why might you expect a
negative correlation between Weight and HwyMPG? (d) Why might you expect
a positive correlation between HPMax and Length? Explain your reasoning. (e)
Why is the matrix empty above the diagonal?

(a) As explained in Chapter 12, for d.f. = n - 2 = 93 - 2 = 91, the critical values of
Student's t for a two-tailed test are t.025 = 1.986 and t.005 = 2.631. The critical
values of the correlation coefficient are:
Given the large sample, it would also be reasonable to use z.025 = 1.960 (giving
r.05 = .202) or z.005 = 2.576 (giving r.01 = .261). However, none of the sample
correlations is close to the decision point. (b) All the correlations are significant
at either value of α. (c) An inverse correlation between Weight and HwyMPG is
expected because larger cars have more mass that must be accelerated and
moved. (d) Longer cars require bigger engines, so HPMax and Length are
correlated. In fact, many measurable aspects of a car are correlated. (e) Entries
above the diagonal are redundant, so they are omitted.


AACSB: Reflective Thinking


Blooms: Evaluate
Difficulty: 3 Hard
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
110. Analyze the regression below (n = 50 U.S. states) using the concepts you have
learned about multiple regression. Circle things of interest and write comments
in the margin. Make a prediction for Poverty for a state with Dropout = 15,
TeenMom = 12, Unem = 4, and Age65% = 12 (show your work). The variables
are Poverty = percentage below the poverty level; Dropout = percent of adult
population that did not finish high school; TeenMom = percent of total births
by teenage mothers; Unem = unemployment rate, civilian labor force; and
Age65% = percent of population aged 65 and over.
The regression is significant overall (F = 18.74, p < .0001). All the predictors are
significant at α = .05 (p-values less than .05). TeenMom and Unem are the best
predictors, while Age65% and DropOut are barely significant. The intercept is
not meaningful since no state would have all predictors equal to zero.
Regarding leverage, we can apply the quick rule to check for any leverage
greater than 2(k + 1)/n = 2(5)/50 = .20. By this criterion, only AK (leverage .434)
has unusual leverage. We would want to check each predictor to see which X
values are unusual for Alaska, but this is not possible without the raw data.
There are no outliers in the Studentized residual column, although there are
three unusual ones: AK (t = -2.251), IN (t = -2.129), and NM (t = +2.829).
Autocorrelation is not an issue since these are not time-series observations
(and, in any event, the residual plot against observation order crosses the zero
centerline 22 times, which is not far from what would be expected for 50
observations). The residual plot against predicted Y has no pattern (suggesting
homoscedasticity) and the residual probability plot is linear (suggesting
normality). Overall, there are no serious problems. The fitted (estimated)
regression equation is: Poverty = - 5.3546 + 0.2065 Dropout + 0.4238
TeenMom + 1.1081 Unem + 0.3469 Age65%, so the predicted value of the
dependent variable Poverty for a state with Dropout = 15, TeenMom = 12,
Unem = 4, and Age65% = 12 is: Poverty = -5.3546 + 0.2065(15) + 0.4238(12) +
1.1081(4) + 0.3469(12) = 11.42. This prediction question is to see whether the
student knows how to interpret the regression coefficients and use them
correctly. The given values of the predictors are very close to their respective
means, so the prediction actually corresponds well to an "average" state.


AACSB: Reflective Thinking


Blooms: Evaluate
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
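The prediction in this answer can be reproduced directly from the reported coefficients (the dictionary layout is just illustrative):

```python
# Predicted Poverty from the fitted equation, coefficients as reported above.
coef = {"Intercept": -5.3546, "Dropout": 0.2065, "TeenMom": 0.4238,
        "Unem": 1.1081, "Age65%": 0.3469}
x = {"Dropout": 15, "TeenMom": 12, "Unem": 4, "Age65%": 12}
poverty = coef["Intercept"] + sum(coef[v] * x[v] for v in x)
print(round(poverty, 2))  # 11.42
```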
111. Analyze the regression results below (n = 33 cars in 1993) using the concepts
you have learned about multiple regression. Circle things of interest and write
comments in the margin. Make a prediction for CityMPG for a car with EngSize
= 2.5, ManTran = 1, Length = 184, Wheelbase = 104, Weight = 3000, and
Domestic = 0 (show your work). The variables are CityMPG = city MPG (miles
per gallon by EPA rating); EngSize = engine size (liters); ManTran = 1 if manual
transmission available, 0 otherwise; Length = vehicle length (inches); Wheelbase
= vehicle wheelbase (inches); Weight = vehicle weight (pounds); Domestic = 1 if
U.S. manufacturer, 0 otherwise.
The regression is significant overall (F = 20.09, p < .0001). There are four strong
predictors. Weight and Wheelbase are highly significant at α = .01 (p-values less
than .01), while EngSize and Domestic are significant at α = .05 (p-values less
than .05). The other two predictors, Length and ManTran, are not significant
at the customary levels, although their t-values (at least 1.00 in absolute
magnitude) suggest that they may be contributing to the regression (that is, if
they are omitted, the R2adj would probably decline). The intercept is not
meaningful since no car would have all these predictors equal to zero (e.g.,
Weight = 0 is impossible). Regarding leverage, we can apply the quick rule to
check for any leverage greater than 2(k + 1)/n = 2(7)/33 = .424. By this criterion,
only the Ford AeroStar (leverage .583) has unusual leverage. We would want to
check the values of each independent variable in the regression to see which
one(s) is(are) unusual. However, this is not possible without having the raw
data. There are no outliers in the Studentized residual column, although
observation 15 (Honda Civic, t = 2.862) is unusual. If we refer to the Studentized
deleted residual, observation 15 (Honda Civic, t = 3.392) is in fact an outlier. Its
actual mileage (42 mpg) is much better than predicted (34.1 mpg).
Autocorrelation is not an issue since these are not time-series observations. The
residual plot against predicted Y has no pattern (suggesting homoscedasticity)
and the residual probability plot is linear (suggesting normality). Regarding
multicollinearity, the VIFs are rather large, suggesting lack of independence
among predictors. Since none of the VIFs exceeds 10, most students will
conclude that there is no serious problem with multicollinearity. It is a fact that
many car measurements are correlated, which is a simple characteristic of the
data. However, experimentation might be needed to see whether their
contributions are truly necessary. The unexpected positive signs of EngineSize
and Wheelbase may be symptomatic of intercorrelation among the predictors.
Overall, there are no serious problems aside from one possible outlier. Nothing
should be done since this outlier is simply part of the data set. However, it
might be prudent to verify the MPG for observation 15 to make sure it is not a
typo. The fitted (estimated) regression equation is CityMPG = 34.27 + 3.824
EngSize - 2.014 ManTran - 0.08573 Length + 0.5420 Wheelbase - 0.01909
Weight - 4.285 Domestic, so the predicted value of the response variable
CityMPG for a car with EngSize = 2.5, ManTran = 1, Length = 184, Wheelbase =
104, Weight = 3000, and Domestic = 0 is CityMPG = 34.27 + 3.824(2.5) -
2.014(1) - 0.08573(184) + 0.5420(104) - 0.01909(3000) - 4.285(0) = 34.27 + 9.56 -
2.01 - 15.77 + 56.37 - 57.27 - 0 = 25.14. The given values of the predictors are
very close to their respective means, so the prediction actually corresponds well
to an "average" car. The prediction is strongly affected by the two terms
involving Wheelbase and Weight.


AACSB: Reflective Thinking


Blooms: Evaluate
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
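As with the previous essay question, the prediction can be reproduced directly from the reported coefficients:

```python
# Predicted CityMPG from the fitted equation, coefficients as reported above.
coef = {"Intercept": 34.27, "EngSize": 3.824, "ManTran": -2.014,
        "Length": -0.08573, "Wheelbase": 0.5420, "Weight": -0.01909,
        "Domestic": -4.285}
x = {"EngSize": 2.5, "ManTran": 1, "Length": 184,
     "Wheelbase": 104, "Weight": 3000, "Domestic": 0}
city_mpg = coef["Intercept"] + sum(coef[v] * x[v] for v in x)
print(round(city_mpg, 2))  # 25.14
```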
