
Multiple Regression Model Building

CHAPTER 15: MULTIPLE REGRESSION MODEL BUILDING

1. A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. The business literature involving human capital shows that education influences an individual's annual income. Combined, these may influence family size. With this in mind, what should the real estate builder be particularly concerned with when analyzing the multiple regression model?
a) Randomness of error terms
b) Collinearity
c) Normality of residuals
d) Missing observations
ANSWER: b
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: collinearity, assumption

2. A microeconomist wants to determine how corporate sales are influenced by capital and wage spending by companies. She proceeds to randomly select 26 large corporations and record information in millions of dollars. A statistical analyst discovers that capital spending by corporations has a significant inverse relationship with wage spending. What should the microeconomist who developed this multiple regression model be particularly concerned with?
a) Randomness of error terms
b) Collinearity
c) Normality of residuals
d) Missing observations
ANSWER: b
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: collinearity, assumption

3. In multiple regression, the __________ procedure permits variables to enter and leave the model at different stages of its development.
a) forward selection
b) residual analysis
c) backward elimination
d) stepwise regression
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: stepwise regression, model building

4. A regression diagnostic tool used to study the possible effects of collinearity is
a) the slope.
b) the Y-intercept.
c) the VIF.
d) the standard error of the estimate.
ANSWER: c
TYPE: MC DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, collinearity

5. Which of the following is not used to find a "best" model?
a) Adjusted $r^2$
b) Mallows' $C_p$
c) Odds ratio
d) All of the above
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: model building

6. The Variance Inflationary Factor (VIF) measures the
a) correlation of the X variables with the Y variable.
b) correlation of the X variables with each other.
c) contribution of each X variable with the Y variable after all other X variables are included in the model.
d) standard deviation of the slope.
ANSWER: b
TYPE: MC DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, collinearity

7. The $C_p$ statistic is used
a) to determine if there is a problem of collinearity.
b) to determine if the variances of the error terms are all the same in a regression model.
c) to choose the best model.
d) to determine if there is an irregular component in a time series.
ANSWER: c
TYPE: MC DIFFICULTY: Easy
KEYWORDS: C-p statistic, model building

TABLE 15-1
To explain personal consumption (CONS) measured in dollars, data is collected for

INC:      personal income in dollars
CRDTLIM:  $1 plus the credit limit in dollars available to the individual
APR:      average annualized percentage interest rate for borrowing for the individual
ADVT:     per person advertising expenditure in dollars by manufacturers in the city where the individual lives
SEX:      gender of the individual; 1 if female, 0 if male

A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and SEX as the independent variables. The estimated model was
$\hat{Y} = 2.28 - 0.29\,\ln(\text{CRDTLIM}) + 5.77\,\ln(\text{APR}) + 2.35\,\ln(\text{ADVT}) + 0.39\,\text{SEX}$
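The coefficient on a logged predictor can be read off numerically. The sketch below is illustrative only and assumes the model is linear in Y and logarithmic in the predictor, as the estimated equation above is written; the function name `level_log_change` is hypothetical. The exact change in Y when X rises by a proportion p is beta * ln(1 + p). For small p this is close to beta * p, which is the rule of thumb that a 1% increase shifts Y by about beta/100. Scaling that rule linearly to a 100% increase gives beta itself, while the exact log difference for a doubling gives beta * ln 2.

```python
import math

def level_log_change(beta: float, pct_increase: float) -> float:
    """Change in Y when a logged predictor X rises by pct_increase percent.

    For Y = ... + beta * ln(X), moving X to X * (1 + p) changes Y by
    beta * ln(1 + p), where p = pct_increase / 100.
    """
    return beta * math.log(1.0 + pct_increase / 100.0)

BETA_ADVT = 2.35  # coefficient on ln(ADVT) in the estimated model

one_percent = level_log_change(BETA_ADVT, 1.0)   # exact effect of a 1% rise
rule_of_thumb = BETA_ADVT / 100.0                # beta/100 approximation
doubling = level_log_change(BETA_ADVT, 100.0)    # exact effect of a 100% rise

print(f"1% rise:   exact {one_percent:.5f}, rule of thumb {rule_of_thumb:.5f}")
print(f"100% rise: exact {doubling:.4f}, linear scaling {BETA_ADVT:.4f}")
```

The two 1% figures agree to about four decimal places. The gap widens as the percentage change grows.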

8. Referring to Table 15-1, what is the correct interpretation for the estimated coefficient for ADVT?
a) A $1 increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of $2.35 on personal consumption holding other variables constant.
b) A 100% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of $2.35 on personal consumption holding other variables constant.
c) A 100% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of 2.35% on personal consumption holding other variables constant.
d) A 1% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of 2.35% on personal consumption holding other variables constant.
ANSWER: b
TYPE: MC DIFFICULTY: Difficult
KEYWORDS: transformation, slope, interpretation

9. Referring to Table 15-1, what is the correct interpretation for the estimated coefficient for APR?
a) A one percentage point increase in average annualized percentage interest rate will result in an estimated average increase of $5.77 on personal consumption holding other variables constant.
b) A 100% increase in average annualized percentage interest rate will result in an estimated average increase of 5.77% on personal consumption holding other variables constant.
c) A 100% increase in average annualized percentage interest rate will result in an estimated average increase of $5.77 on personal consumption holding other variables constant.
d) A 1% increase in average annualized percentage interest rate will result in an estimated average increase of 5.77% on personal consumption holding other variables constant.
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: transformation, slope, interpretation

TABLE 15-2
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:

$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$

where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:

SUMMARY OUTPUT

Regression Statistics
Multiple R        0.994
R Square          0.988
Standard Error    12.42
Observations      12

ANOVA
              df    SS        MS       F      Signif F
Regression    2     115145    57573    373    0.0001
Residual      9     1388      154
Total         11    116533

              Coeff       StdError    t Stat    P-value
Intercept     286.42      9.66        29.64     0.0001
Price         0.31        0.06        5.14      0.0006
Price Sq      0.000067    0.00007     0.95      0.3647

10. Referring to Table 15-2, what is the value of the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)?
a) -5.14
b) 0.95
c) 373
d) None of the above.
ANSWER: b
TYPE: MC DIFFICULTY: Easy
KEYWORDS: quadratic regression, t test on slope, test statistic

11. Referring to Table 15-2, what is the p-value associated with the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)?
a) 0.0001
b) 0.0006
c) 0.3647
d) None of the above.
ANSWER: c
TYPE: MC DIFFICULTY: Easy
KEYWORDS: quadratic regression, t test on slope, p-value

12. Referring to Table 15-2, what is the correct interpretation of the coefficient of multiple determination?
a) 98.8% of the total variation in demand can be explained by the linear relationship between demand and price.
b) 98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price.
c) 98.8% of the total variation in demand can be explained by the addition of the square term in price.
d) 98.8% of the total variation in demand can be explained by just the square term in price.
ANSWER: b
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: coefficient of multiple determination, interpretation

13. Referring to Table 15-2, does there appear to be significant upward curvature in the response curve relating the demand (Y) and the price (X) at 10% level of significance?
a) Yes, since the p-value for the test is less than 0.10.
b) No, since the value of $\beta_2$ is near 0.
c) No, since the p-value for the test is greater than 0.10.
d) Yes, since the value of $\beta_2$ is positive.
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, decision, conclusion

14. True or False: Referring to Table 15-2, a more parsimonious simple linear model is likely to be statistically superior to the fitted curvilinear model for predicting demand (Y).
ANSWER: True
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, interpretation

TABLE 15-3
In Hawaii, condemnation proceedings are under way to enable private citizens to own the property that their homes are built on. Until recently, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following model was fit to data collected for n = 20 properties, 10 of which are located near a cove.

Model 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_1^2 X_2 + \varepsilon$

where
Y = Sale price of property in thousands of dollars
$X_1$ = Size of property in thousands of square feet
$X_2$ = 1 if property located near cove, 0 if not

Using the data collected for the 20 properties, the following partial output obtained from Microsoft Excel is shown:

SUMMARY OUTPUT

Regression Statistics
Multiple R        0.985
R Square          0.970
Standard Error    9.5
Observations      20

ANOVA
              df    SS       MS      F       Signif F
Regression    5     28324    5664    62.2    0.0001
Residual      14    1279     91
Total         19    29063

               Coeff    StdError    t Stat    P-value
Intercept      32.1     35.7        0.90      0.3834
Size           12.2     5.9         2.05      0.0594
Cove           104.3    53.5        1.95      0.0715
Size*Cove      17.0     8.5         1.99      0.0661
SizeSq         0.3      0.2         1.28      0.2204
SizeSq*Cove    0.3      0.3         1.13      0.2749

15. Referring to Table 15-3, is the overall model statistically adequate at a 0.05 level of significance for predicting sale price (Y)?
a) No, since some of the t tests for the individual variables are not significant.
b) No, since the standard deviation of the model is fairly large.
c) Yes, since none of the $\beta$-estimates are equal to 0.
d) Yes, since the p-value for the test is smaller than 0.05.
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: F test on the entire regression, p-value, decision, conclusion

16. Referring to Table 15-3, given a quadratic relationship between sale price (Y) and property size ($X_1$), what null hypothesis would you test to determine whether the curves differ between cove and non-cove properties?
a) $H_0: \beta_2 = \beta_3 = \beta_5 = 0$
b) $H_0: \beta_4 = \beta_5 = 0$
c) $H_0: \beta_3 = \beta_5 = 0$
d) $H_0: \beta_2 = 0$
ANSWER: c
TYPE: MC DIFFICULTY: Difficult
KEYWORDS: interaction, partial F test, form of hypothesis

17. Referring to Table 15-3, given a quadratic relationship between sale price (Y) and property size ($X_1$), what test should be used to test whether the curves differ between cove and non-cove properties?
a) F test for the entire regression model.
b) t test on each of the coefficients in the entire regression model.
c) Partial F test on the subset of the appropriate coefficients.
d) t test on each of the subsets of the appropriate coefficients.
ANSWER: c
TYPE: MC DIFFICULTY: Difficult
KEYWORDS: interaction, partial F test, interpretation

18. If a group of independent variables are not significant individually but are significant as a group at a specified level of significance, this is most likely due to
a) autocorrelation.
b) the presence of dummy variables.
c) the absence of dummy variables.
d) collinearity.
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: collinearity, assumption, properties

19. As a project for his business statistics class, a student examined the factors that determined parking meter rates throughout the campus area. Data were collected for the price per hour of parking, blocks to the quadrangle, and one of the three jurisdictions: on campus, in downtown and off campus, or outside of downtown and off campus. The population regression model hypothesized is

$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i$

where
Y is the meter price
$x_1$ is the number of blocks to the quad
$x_2$ is a dummy variable that takes the value 1 if the meter is located in downtown and off campus and the value 0 otherwise
$x_3$ is a dummy variable that takes the value 1 if the meter is located outside of downtown and off campus, and the value 0 otherwise

Suppose that whether the meter is located on campus is an important explanatory factor. Why should the variable that depicts this attribute not be included in the model?
a) Its inclusion will introduce autocorrelation.
b) Its inclusion will introduce collinearity.
c) Its inclusion will inflate the standard errors of the estimated coefficients.
d) Both (b) and (c).
ANSWER: d
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: dummy variable, collinearity, properties
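The collinearity in Question 19 can be seen directly in the dummy columns. The sketch below uses made-up jurisdiction labels (the `jurisdictions` list and the `dummies` helper are hypothetical, not part of the question) and shows that three dummies for three categories always sum to the intercept column of ones, which is an exact linear dependence.

```python
# Hypothetical meters, one jurisdiction label each (illustrative data only).
jurisdictions = ["campus", "downtown", "outside", "campus", "outside", "downtown"]

def dummies(labels, levels):
    """One 0/1 indicator column per level, returned as a dict of lists."""
    return {lvl: [1 if lab == lvl else 0 for lab in labels] for lvl in levels}

cols = dummies(jurisdictions, ["campus", "downtown", "outside"])
n = len(jurisdictions)
intercept = [1] * n

# With all three dummies, every row sums to 1, duplicating the intercept
# column: an exact linear dependence among the regressors.
row_sums = [sum(cols[lvl][i] for lvl in cols) for i in range(n)]
all_three_match_intercept = row_sums == intercept

# Leaving out the on-campus dummy (the baseline category), the two
# remaining dummies no longer sum to the intercept column.
reduced_sums = [cols["downtown"][i] + cols["outside"][i] for i in range(n)]
reduced_match_intercept = reduced_sums == intercept

print(all_three_match_intercept, reduced_match_intercept)
```

On-campus meters are then represented by $x_2 = x_3 = 0$, so no information is lost by omitting the third dummy.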

20. True or False: The Variance Inflationary Factor (VIF) measures the correlation of the X variables with the Y variable.
ANSWER: False
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: variance inflationary factor, collinearity

21. True or False: Collinearity is present when there is a high degree of correlation between independent variables.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: collinearity

22. True or False: Collinearity is present when there is a high degree of correlation between the dependent variable and any of the independent variables.
ANSWER: False
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: collinearity, properties

23. True or False: A high value of $R^2$ significantly above 0 in multiple regression accompanied by insignificant t-values on all parameter estimates very often indicates a high correlation between independent variables in the model.
ANSWER: True
TYPE: TF DIFFICULTY: Difficult
KEYWORDS: collinearity, properties

24. True or False: One of the consequences of collinearity in multiple regression is inflated standard errors in some or all of the estimated slope coefficients.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: collinearity, properties

25. True or False: One of the consequences of collinearity in multiple regression is biased estimates on the slope coefficients.
ANSWER: False
TYPE: TF DIFFICULTY: Difficult
KEYWORDS: collinearity, properties

26. True or False: Collinearity is present if the dependent variable is linearly related to one of the explanatory variables.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: collinearity, properties

27. True or False: Collinearity will result in excessively low standard errors of the parameter estimates reported in the regression output.
ANSWER: False
TYPE: TF DIFFICULTY: Difficult
KEYWORDS: collinearity, properties

28. True or False: The parameter estimates are biased when collinearity is present in a multiple regression equation.
ANSWER: False
TYPE: TF DIFFICULTY: Difficult
KEYWORDS: collinearity, properties

29. True or False: Two simple regression models were used to predict a single dependent variable. Both models were highly significant, but when the two independent variables were placed in the same multiple regression model for the dependent variable, $R^2$ did not increase substantially and the parameter estimates for the model were not significantly different from 0. This is probably an example of collinearity.
ANSWER: True
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: collinearity, properties

30. True or False: So that we can fit curves as well as lines by regression, we often use mathematical manipulations for converting one variable into a different form. These manipulations are called dummy variables.
ANSWER: False
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: quadratic regression, transformation

TABLE 15-4
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a centered curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been centered.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.747
R Square             0.558
Adjusted R Square    0.478
Standard Error       863.1
Observations         14

ANOVA
              df    SS          MS         F       Signif F
Regression    2     10344797    5172399    6.94    0.0110
Residual      11    8193929     744903
Total         13    18538726

              Coeff     StdError    t Stat    P-value
Intercept     1283.0    352.0       3.65      0.0040
CenDose       25.228    8.631       2.92      0.0140
CenDoseSq     0.8604    0.3722      2.31      0.0410
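The summary statistics in Table 15-4 all follow from the two sums of squares. As a rough check, assuming the standard definitions (F = MSR/MSE, $R^2$ = SSR/SST, and the usual adjusted $R^2$ and standard error formulas), the sketch below recomputes them. The helper name `anova_summary` is hypothetical.

```python
def anova_summary(ssr: float, sse: float, n: int, k: int) -> dict:
    """Overall F, R-squared, adjusted R-squared and standard error.

    ssr: regression sum of squares, sse: residual sum of squares,
    n: number of observations, k: number of independent variables.
    """
    sst = ssr + sse                  # total sum of squares
    msr = ssr / k                    # mean square regression
    mse = sse / (n - k - 1)          # mean square error
    return {
        "F": msr / mse,
        "r2": ssr / sst,
        "adj_r2": 1.0 - mse / (sst / (n - 1)),
        "std_error": mse ** 0.5,
    }

# Sums of squares from the Table 15-4 ANOVA (CenDose and CenDoseSq, so k = 2)
table_15_4 = anova_summary(ssr=10344797, sse=8193929, n=14, k=2)

for name, value in table_15_4.items():
    print(f"{name}: {value:.4f}")
```

Rounded to the precision Excel reports, these reproduce the F of 6.94, R Square of 0.558, Adjusted R Square of 0.478, and Standard Error of 863.1.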

31. Referring to Table 15-4, the prediction of time to relief for a person receiving a dose of the drug 10 units above the average dose (i.e., the prediction of Y for X = 10), is ________.
ANSWER: 1,621
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: quadratic regression, prediction of individual values
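The prediction in Question 31 comes from substituting the centered dose into the fitted quadratic. A minimal sketch using the Table 15-4 coefficients (the function name `predict_time` is hypothetical):

```python
def predict_time(cen_dose: float) -> float:
    """Fitted time to relief (seconds) from the Table 15-4 coefficients."""
    b0, b1, b2 = 1283.0, 25.228, 0.8604  # Intercept, CenDose, CenDoseSq
    return b0 + b1 * cen_dose + b2 * cen_dose ** 2

# Dose 10 units above the average dose, i.e. centered X = 10
y_hat = predict_time(10)
print(round(y_hat, 2))
```

The three terms are 1283.0 + 252.28 + 86.04 = 1621.32, which rounds to the 1,621 given in the answer.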

32. Referring to Table 15-4, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. The p-value of the test is ________.
ANSWER: 0.041
TYPE: FI DIFFICULTY: Difficult
KEYWORDS: quadratic regression, partial F test, p-value

33. Referring to Table 15-4, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. The value of the test statistic is ________.
ANSWER: $2.31^2$ or 5.3361
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: quadratic regression, partial F test, test statistic
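The answers to Questions 32 and 33 rest on a standard identity: when a partial F test concerns a single coefficient, its statistic is the square of that coefficient's t statistic, and the two tests share the same p-value. A short worked version, using the rounded t statistic for CenDoseSq from Table 15-4 (the symbols $b_2$ and $S_{b_2}$ for the estimate and its standard error are notation assumed here):

```latex
F_{1,\,n-k-1} = t_{n-k-1}^{2}
             = \left(\frac{b_2}{S_{b_2}}\right)^{2}
             \approx (2.31)^{2} = 5.3361,
\qquad n - k - 1 = 14 - 2 - 1 = 11 .
```

The p-value of 0.041 reported for the t test therefore carries over to the F test unchanged.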

34. True or False: Referring to Table 15-4, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.05, she would decide that there is a significant curvilinear relationship.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: quadratic regression, partial F test, decision, conclusion

35. True or False: Referring to Table 15-4, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.01, she would decide that there is a significant curvilinear relationship.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: quadratic regression, partial F test, conclusion

36. Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. The p-value of the test statistic for the contribution of the curvilinear term is ________.
ANSWER: 0.041
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, p-value

37. Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. The value of the test statistic is ______.
ANSWER: 2.92
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, test statistic

38. Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. The p-value of the test is ______.
ANSWER: 0.0140
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, p-value

39. True or False: Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. If she used a level of significance of 0.05, she would decide that the linear model is sufficient.
ANSWER: False
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, decision

40. True or False: Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. If she used a level of significance of 0.02, she would decide that the linear model is sufficient.
ANSWER: True
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, decision

41. True or False: Referring to Table 15-4, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. Using a level of significance of 0.05, she would decide that the curvilinear model should include a linear term.
ANSWER: True
TYPE: TF DIFFICULTY: Moderate
KEYWORDS: quadratic regression, t test on slope, decision

42. In multiple regression, the __________ procedure permits variables to enter and leave the model at different stages of its development.
ANSWER: stepwise regression
TYPE: FI DIFFICULTY: Easy
KEYWORDS: stepwise regression

43. A regression diagnostic tool used to study the possible effects of collinearity is ______.
ANSWER: VIF
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: collinearity, variance inflationary factor

44. The _______ (larger/smaller) the value of the Variance Inflationary Factor, the higher is the collinearity of the X variables.
ANSWER: larger
TYPE: FI DIFFICULTY: Moderate
KEYWORDS: variance inflationary factor, collinearity, properties

45. The logarithm transformation can be used
a) to overcome violations to the autocorrelation assumption.
b) to test for possible violations to the autocorrelation assumption.
c) to overcome violations to the homoscedasticity assumption.
d) to test for possible violations to the homoscedasticity assumption.
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: transformation, homoscedasticity, assumption

46. The logarithm transformation can be used
a) to overcome violations to the autocorrelation assumption.
b) to test for possible violations to the autocorrelation assumption.
c) to change a nonlinear model into a linear model.
d) to change a linear independent variable into a nonlinear independent variable.
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: transformation, autocorrelation, homoscedasticity, assumption

47. Which of the following will NOT change a nonlinear model into a linear model?
a) Quadratic regression model
b) Logarithmic transformation
c) Square-root transformation
d) Variance inflationary factor
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: transformation

48. An independent variable $X_j$ is considered highly correlated with the other independent variables if
a) $VIF_j < 5$
b) $VIF_j > 5$
c) $VIF_j < VIF_i$ for $i \neq j$
d) $VIF_j > VIF_i$ for $i \neq j$
ANSWER: b
TYPE: MC DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, collinearity

49. True or False: The goals of model building are to find a good model with the fewest independent variables that is easier to interpret and has lower probability of collinearity.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: model building, collinearity

50. Using the best-subsets approach to model building, models are being considered when their
a) $C_p > k$
b) $C_p \le k$
c) $C_p > (k + 1)$
d) $C_p \le (k + 1)$
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: model building, C-p statistic

51. True or False: In data mining where huge data sets are being explored to discover relationships among a large number of variables, the best-subsets approach is more practical than the stepwise regression approach.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: model building, stepwise regression, Cp

52. True or False: The stepwise regression approach takes into consideration all possible models.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: model building, stepwise regression

53. True or False: In stepwise regression, an independent variable is not allowed to be removed from the model once it has entered into the model.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: model building, stepwise regression

54. True or False: Using the $C_p$ statistic in model building, all models with $C_p \le (k + 1)$ are equally good.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: model building, Cp

TABLE 15-5
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.

Let Y = % Passing as the dependent variable, $X_1$ = % Attendance, $X_2$ = Salaries and $X_3$ = Spending.

The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.

The output from the best-subset regressions is given below:

Model    Variables    Cp       k+1    R Square    Adjusted R Square    Std. Error
1        X1           3.05     2      0.6024      0.5936               10.5787
2        X1X2         3.66     3      0.6145      0.5970               10.5350
3        X1X2X3       4.00     4      0.6288      0.6029               10.4570
4        X1X3         2.00     3      0.6288      0.6119               10.3375
5        X2           67.35    2      0.0474      0.0262               16.3755
6        X2X3         64.30    3      0.0910      0.0497               16.1768
7        X3           62.33    2      0.0907      0.0705               15.9984

Following is the residual plot for % Attendance:

[Figure: % Attendance Residual Plot. Vertical axis: Residuals (-40 to 20). Horizontal axis: % Attendance (88 to 98).]

Following is the output of several multiple regression models:

Model (I):
                Coefficients    Standard Error    t Stat     P-value     Lower 95%    Upper 95%
Intercept       -753.4225       101.1149          -7.4511    2.88E-09    -957.3401    -549.5050
% Attendance    8.5014          1.0771            7.8929     6.73E-10    6.3292       10.6735
Salary          6.85E-07        0.0006            0.0011     0.9991      -0.0013      0.0013
Spending        0.0060          0.0046            1.2879     0.2047      -0.0034      0.0153

Model (II):
                Coefficients    Standard Error    t Stat     P-value
Intercept       -753.4086       99.1451           -7.5991    1.5291E-09
% Attendance    8.5014          1.0645            7.9862     4.223E-10
Spending        0.0060          0.0034            1.7676     0.0840

Model (III):
              df    SS            MS           F          Significance F
Regression    2     8162.9429     4081.4714    39.8708    1.3201E-10
Residual      44    4504.1635     102.3674
Total         46    12667.1064

                        Coefficients    Standard Error    t Stat     P-value
Intercept               6672.8367       3267.7349         2.0420     0.0472
% Attendance            -150.5694       69.9519           -2.1525    0.0369
% Attendance Squared    0.8532          0.3743            2.2792     0.0276

55. Referring to Table 15-5, what are, respectively, the values of the variance inflationary factor of the 3 predictors?
ANSWER: 1.03, 1.88 and 1.90
TYPE: PR DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, coefficient of multiple determination, collinearity
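The answer to Question 55 follows from the definition $VIF_j = 1/(1 - R_j^2)$. A small sketch using the three $R_j^2$ values stated in the table description; the function name `vif` is hypothetical, and the cutoff of 5 is the rule of thumb this chapter's questions use.

```python
def vif(r2_j: float) -> float:
    """Variance inflationary factor for predictor j: 1 / (1 - R_j^2)."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)

# R_j^2 of each predictor regressed on the remaining predictors
r2_values = {"X1": 0.0338, "X2": 0.4669, "X3": 0.4743}
vifs = {name: vif(r2) for name, r2 in r2_values.items()}

# Rule of thumb used in this chapter: VIF_j > 5 signals high collinearity
flagged = [name for name, value in vifs.items() if value > 5]

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.2f}")
print("flagged:", flagged)
```

No predictor exceeds the cutoff, which is why the collinearity questions that follow conclude that nothing needs to be dropped.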

56. True or False: Referring to Table 15-5, there is reason to suspect collinearity between some pairs of predictors.
ANSWER: False
TYPE: PR DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, collinearity

57. Referring to Table 15-5, which of the following predictors should first be dropped to remove collinearity?
a) $X_1$
b) $X_2$
c) $X_3$
d) None of the above
ANSWER: d
TYPE: MC DIFFICULTY: Easy
KEYWORDS: variance inflationary factor, collinearity

58. Referring to Table 15-5, which of the following models should be taken into consideration using the Mallows $C_p$ statistic?
a) $X_1, X_3$
b) $X_1, X_2, X_3$
c) both of the above
d) None of the above
ANSWER: c
TYPE: MC DIFFICULTY: Easy
KEYWORDS: C-p statistic, model building
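The screening rule behind Question 58 is $C_p \le k + 1$, where k is the number of independent variables in the model. The sketch below applies it to the seven best-subsets rows. It assumes the parameter-count column in the table includes the intercept (so the one-variable models show 2), and therefore stores k directly as the number of variables. The function name `cp_candidates` is hypothetical.

```python
# (variables, Cp, k) with k = number of independent variables in the model
models = [
    ("X1", 3.05, 1),
    ("X1X2", 3.66, 2),
    ("X1X2X3", 4.00, 3),
    ("X1X3", 2.00, 2),
    ("X2", 67.35, 1),
    ("X2X3", 64.30, 2),
    ("X3", 62.33, 1),
]

def cp_candidates(rows):
    """Keep the models whose Cp is at or below k + 1."""
    return [name for name, cp, k in rows if cp <= k + 1]

candidates = cp_candidates(models)
print(candidates)
```

Only the models with $X_1, X_3$ and with $X_1, X_2, X_3$ pass the screen, matching answer (c).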

59. Referring to Table 15-5, the best model using a 5% level of significance among those chosen by the $C_p$ statistic is
a) $X_1, X_3$
b) $X_1, X_2, X_3$
c) either of the above
d) None of the above
ANSWER: a
TYPE: MC DIFFICULTY: Easy
KEYWORDS: model building, t test on slope, test statistics, p-value, decision, conclusion

60. Referring to Table 15-5, the best model chosen using the adjusted R-square statistic is
a) $X_1, X_3$
b) $X_1, X_2, X_3$
c) either of the above
d) None of the above
ANSWER: a
TYPE: MC DIFFICULTY: Easy
KEYWORDS: model building, adjusted coefficient of determination
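The adjusted R-square comparison in Question 60 can be reproduced from the R Square column and the sample size of 47 schools. This is a sketch under the usual definition, adjusted $R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)$; the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

N_SCHOOLS = 47  # sample size stated for Table 15-5

# Both models report the same R Square of 0.6288 but differ in k
adj_x1_x3 = adjusted_r2(0.6288, N_SCHOOLS, k=2)   # X1, X3
adj_full = adjusted_r2(0.6288, N_SCHOOLS, k=3)    # X1, X2, X3

print(round(adj_x1_x3, 4), round(adj_full, 4))
```

With identical R Square values, the extra predictor only costs a degree of freedom, so the two-variable model has the higher adjusted value (0.6119 against 0.6029).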

61. Referring to Table 15-5, the better model using a 5% level of significance derived from the best model above is
a) $X_1$
b) $X_3$
c) $X_1, X_3$
d) $X_1, X_2, X_3$
ANSWER: a
TYPE: MC DIFFICULTY: Easy
KEYWORDS: model building, t test on slope, test statistics, p-value, decision, conclusion

62. True or False: Referring to Table 15-5, the residual plot suggests that a nonlinear model on % attendance may be a better model.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: residual plot, quadratic regression, transformation

63. Referring to Table 15-5, what is the value of the test statistic to determine whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance?
ANSWER: 2.2792
TYPE: PR DIFFICULTY: Easy
KEYWORDS: quadratic regression, model building, t test on slope, test statistic

64. Referring to Table 15-5, what is the p-value of the test statistic to determine whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance?
ANSWER: 0.0276
TYPE: PR DIFFICULTY: Easy
KEYWORDS: quadratic regression, model building, t test on slope, p-value

65. True or False: Referring to Table 15-5, the null hypothesis should be rejected when testing whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance.
ANSWER: True
TYPE: TF DIFFICULTY: Easy
KEYWORDS: quadratic regression, model building, t test on slope, decision

66. True or False: Referring to Table 15-5, the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is not significant at a 5% level of significance.
ANSWER: False
TYPE: TF DIFFICULTY: Easy
KEYWORDS: quadratic regression, model building, t test on slope, conclusion