
CHAPTER 19 REGRESSION ANALYSIS IN MARKETING RESEARCH
LEARNING OBJECTIVES
To understand the basic concept of prediction
To learn how marketing researchers use regression analysis
To learn how marketing researchers use bivariate regression analysis
To see how multiple regression differs from bivariate regression
To appreciate various types of stepwise regression, how they are applied, and the interpretation of their findings
To learn how to obtain and interpret regression analyses with SPSS
CHAPTER OUTLINE
UNDERSTANDING PREDICTION
  Two Approaches to Prediction
  How to Determine the Goodness of Your Predictions
BIVARIATE LINEAR REGRESSION ANALYSIS
  Basic Procedure in Bivariate Regression Analysis
  Independent and Dependent Variables
  Computing the Slope and the Intercept
  The Hobbit's Choice Restaurant Survey: How to Run and Interpret Bivariate Regression Analysis on SPSS
  Testing for Statistical Significance of the Intercept and the Slope
  Making a Prediction and Accounting for Error
MULTIPLE REGRESSION ANALYSIS
  An Underlying Conceptual Model
  Multiple Regression Analysis Described
  Basic Assumptions in Multiple Regression
  The Hobbit's Choice Restaurant Survey: How to Run and Interpret Multiple Regression Analysis on SPSS
  Using Results to Make a Prediction
  Special Uses of Multiple Regression Analysis
  Using a Dummy Independent Variable
  Using Standardized Betas to Compare the Importance of Independent Variables
  Using Multiple Regression as a Screening Device
STEPWISE MULTIPLE REGRESSION
  How to Do Stepwise Multiple Regression with SPSS
THREE WARNINGS REGARDING MULTIPLE REGRESSION ANALYSIS

KEY TERMS
Prediction
Extrapolation
Predictive model
Analysis of residuals
Bivariate regression analysis
Intercept
Slope
Dependent variable
Independent variable
Least squares criterion
Standard error of the estimate
Outlier
General conceptual model
Multiple regression analysis
Regression plane
Additivity
Independence assumption
Multicollinearity
Variance inflation factor (VIF)
Dummy independent variable
Standardized beta coefficient
Screening device
Stepwise multiple regression

TEACHING SUGGESTIONS
1. Students may need some additional help understanding the difference between extrapolation and building a predictive model. A way to help them comprehend the difference is to note that extrapolation always relies on some pattern that is seen over time, while prediction requires the use of a factor other than time. Extrapolation uses the average change in the focal variable per relevant time period, while prediction uses the average change in the focal variable per relevant unit of the other variable.
You can use sales and marketing variables as an example. If sales have increased at 10% per year for the past 5 years, you can extrapolate that they will increase 10% in the coming year. However, if sales have increased by 20% for every 10% decrease in price, you can predict that they will increase by 20% if the price is decreased by 10%. In both cases, however, all other variables are assumed to have the same influence as in the past.
2. The analysis of residuals underpins assessment of the goodness of a predictive model, and it is an important foundational concept. To help students understand analysis of residuals, consider the following in-class exercise. Show students the following number series, and ask them what straight-line formula will correctly predict the next number.
15, 20, 25, 30, ?
To find the intercept, use y = a + bx and set x = 0: a + b(0) = 15, so a = 15. Next, experiment with different values of b, and look at how close the results are to the given series.

x                      Series (y)   15 + 1x   15 + 2x   15 + 3x   15 + 4x   15 + 5x
0                      15           15        15        15        15        15
1                      20           16        17        18        19        20
2                      25           17        19        21        23        25
3                      30           18        21        24        27        30
4                      ?            19        23        27        31        35
Residual (sum: 0-3)                 24        18        12        6         0
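The trial-and-error search in this exercise can be sketched in a few lines of Python. This is only an illustration for instructors, not part of the textbook exercise; the residual is taken as the sum of absolute differences over x = 0 to 3, and the function and variable names are made up for the sketch.

```python
# Observed series from the exercise: y values at x = 0, 1, 2, 3.
series = [15, 20, 25, 30]
intercept = 15  # the value of the series at x = 0


def residual_sum(slope):
    """Sum of absolute differences between the series and 15 + slope * x."""
    return sum(abs(y - (intercept + slope * x)) for x, y in enumerate(series))


# Try the candidate slopes 1 through 5, as in the table.
residuals = {b: residual_sum(b) for b in range(1, 6)}

# The best model is the one with the smallest residual sum.
best_slope = min(residuals, key=residuals.get)
prediction_at_4 = intercept + best_slope * 4

print(residuals)                    # {1: 24, 2: 18, 3: 12, 4: 6, 5: 0}
print(best_slope, prediction_at_4)  # 5 35
```

The printed residual sums match the bottom row of the table, which gives students a way to check their hand calculations.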
The residual (sum: 0-3) is the sum, over x = 0 to 3, of the differences between each equation's predicted values and the series. The 15 + 5x equation has the lowest residual, so it is the best predictive model; because its residual is 0, its prediction of 35 for x = 4 is correct.
3. Use of the Novartis data to illustrate bivariate regression is intentional, as it explicitly ties regression to correlation. The text notes that the same data is used, but it is worthwhile to point out the connection to students who may have skipped over this point or otherwise overlooked it.
4. There are many nuances to regression analysis not treated in this chapter's introduction to the topic. The intent is to describe the basic concepts and to have students identify their related values on a printout. SPSS, on the other hand, does provide a number of statistical options that are beyond the scope of the chapter,
particularly in the case of multiple regression. Instructors who desire more in-depth coverage of this technique may provide it with their own materials and rely on SPSS to accommodate this deeper coverage.
5. Regression analysis is complicated and difficult for undergraduate students to understand. To help with the comprehension of regression analysis, we have provided a number of regression application examples. If one's students relate well to concrete examples, it may be beneficial to use these examples in class or to go over them in more detail than the examples in earlier chapters.
6. The section on the underlying conceptual model for multiple regression analysis has two pedagogical benefits. First, it can be used to help students understand the distinction between independent and dependent variables. The independent variables come from the constructs that are on the outside of the diagram that have arrows pointing toward the center. Dependent variables emanate from the center of the diagram, and the diagram implies that the central variables (dependent) are affected or influenced by the surrounding variables (independent). Second, the abbreviated lists of examples of variables for each circle in the diagram should help students to identify the specific variables (such as demographic variables) that would or could be used in the multiple regression model.
7. In earlier editions of the textbook, Chapter 18 included a section on time series analysis. This topic was deleted, and the section on multiple regression was expanded in response to what the authors perceived to be a low level of interest in time series analysis by adopting instructors. The Student Version of SPSS does have time series (exponential smoothing) analysis capabilities as well as graphing procedures for time series data. Instructors who wish to teach time series analysis concepts can still do so using SPSS; however, they will need to draw from sources other than the textbook for reading or study materials for their students.
8. Because of the many assumptions of regression analysis that can be easily violated with a tool such as SPSS, we emphasize caution when unleashing students on multiple regression analysis. We have provided some readable references in the endnotes (13 and 14), which we list below in case the instructor wants his or her students to be exposed to practitioner-oriented literature on this topic. (The Quirk's Marketing Research Review articles are available at www.quirks.com.) See, for example, Kennedy, Peter (2005, Winter), Oh No! I got the wrong sign! What should I do? Journal of Economic Education, Vol. 36, No. 1, 77-92. For readable treatments of problems encountered in multiple regression applied to marketing research, see Mullet, Gary (1994, October), Regression, regression, Quirk's Marketing Research Review, electronic archive; Mullet, Gary (1998,
June), Have you ever wondered, Quirk's Marketing Research Review, electronic archive; and Mullet, Gary (2003, February), Data abuse, Quirk's Marketing Research Review, electronic archive.

ACTIVE LEARNING EXERCISES
Perform a Bivariate Regression with SPSS
Using the clickstream and annotated SPSS output in Figures 19.2 and 19.3, respectively, and the Hobbit's Choice Restaurant survey dataset provided to you, perform bivariate regression analysis using the amount respondents expect to pay, on average, for an evening meal entree item in the new restaurant. When you have determined the results, make a prediction of how much a person who earns an income of $100,000 per year expects to pay for this entree.
Students should use the recoded income variable (using midpoints of the ranges in thousands) to perform this bivariate regression.
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .754(a)   .569       .567                $6.46485
a. Predictors: (Constant), Recoded income to $1,000s using midpoints of questionnaire ranges

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   18616.303        1     18616.303     445.427   .000(a)
Residual     14126.474        338   41.794
Total        32742.776        339
a. Predictors: (Constant), Recoded income to $1,000s using midpoints of questionnaire ranges
b. Dependent Variable: What would you expect an average evening meal entree item alone to be priced?

Coefficients(a)
                                                                     Unstandardized Coefficients   Standardized Coefficients
Model 1                                                              B       Std. Error            Beta                        t        Sig.
(Constant)                                                           5.932   .705                                              8.417    .000
Recoded income to $1,000s using midpoints of questionnaire ranges    .148    .007                  .754                        21.105   .000
a. Dependent Variable: What would you expect an average evening meal entree item alone to be priced?
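The coefficients table gives everything needed for a point prediction and its approximate 95% range. The following is a minimal Python sketch for checking student answers, assuming the intercept and the standard error of the estimate are rounded to two decimals and the range is taken as plus or minus 1.96 standard errors; the function name is illustrative.

```python
# Values read from the SPSS output, rounded to two decimals where noted.
INTERCEPT = 5.93      # (Constant), unstandardized B, rounded
SLOPE = 0.148         # B for recoded income, where income is in $1,000s
STD_ERROR_EST = 6.46  # standard error of the estimate, rounded
Z_95 = 1.96           # z value for 95% confidence


def predict_entree_price(income_thousands):
    """Return (prediction, lower bound, upper bound) for an income in $1,000s."""
    prediction = INTERCEPT + SLOPE * income_thousands
    margin = Z_95 * STD_ERROR_EST
    return prediction, prediction - margin, prediction + margin


# A respondent earning $100,000 per year is entered as 100.
pred, low, high = predict_entree_price(100)
print(f"${pred:.2f} (95% range ${low:.2f} to ${high:.2f})")
```

Entering income in thousands matters: passing 100000 instead of 100 would produce a nonsensical prediction, which is a common student mistake with recoded variables.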
The equation is
Amount expected to pay = $5.93 + .148 times (income level in $1,000s)
So, for an income level of $100,000
Amount expected to pay = $5.93 + .148 times ($100) = $5.93 + $14.80 = $20.73
To apply 95% confidence intervals, students must use 1.96 times $6.46, or $12.66, so the boundaries are $8.07 to $33.39.

The General Conceptual Model for Intentions to Patronize the Hobbit's Choice Restaurant
What is the general conceptual model apparent in the Hobbit's Choice Restaurant survey dataset?
The central dependent variable is: How likely would it be for you to patronize this restaurant (new upscale restaurant)?
Demographics are:
Year born
What is your highest level of education?
What is your marital status?
Including children under 18 living with you, what is your family size?
Please check the letter that includes the Zip Code in which you live (coded by letter).
Which of the following categories best describes your before tax household income?
What is your gender?
Attitudes are preferences for various restaurant features:
Prefer Waterfront View
Prefer Drive Less than 30 Minutes
Prefer Formal Waitstaff Wearing Tuxedos
Prefer Unusual Desserts
Prefer Large Variety of Entrees
Prefer Unusual Entrees
Prefer Simple Decor
Prefer Elegant Decor
Prefer String Quartet
Prefer Jazz Combo
Also, an attitude is:
What would you expect an average evening meal entree item alone to be priced?
Media habits are:
Would you describe yourself as one who listens to the radio?
To which type of radio programming do you most often listen?
Would you describe yourself as a viewer of TV local news?
Which newscast do you watch most frequently?
Do you read the newspaper?
Which section of the local newspaper would you say you read most frequently?
Do you subscribe to City Magazine?
Past behavior is:
Do you eat at this type of restaurant at least once every two weeks? (qualifier, so not to be included in any analysis)
How many total dollars do you spend per month in restaurants (for your meals only)?
Comment on the usefulness of this general conceptual model to Jeff Dean. That is, assuming that the regression results are significant, what marketing strategy implications will become apparent?
Jeff will gain market segmentation implications from the demographics and the past behavior (amount spent on restaurants per month), promotional strategy implications from the media habits, restaurant design implications from the preferences, and pricing implications from the average price expected to pay variable.

Segmentation Associates, Inc.
Note: this was an end-of-chapter case in the fourth edition. The full case solution follows even though the Active Learning Exercise only asks students about questions 1 and 2. Instructors may want to use questions 3-5 for class discussion.
Case Objective
This case requires students to interpret the results of multiple regression and to apply them to market segmentation and target marketing considerations using the underlying conceptual model concept described in the chapter. It also illustrates the use of multiple regression to identify market segment differences.
Answers to Case Questions
1. What is the underlying conceptual model used by Segmentation Associates that is apparent in these three sets of findings?
One can refer to the general conceptual model presented in Figure 19.6 and pick out those variables that are apparent in the Segmentation Associates example. The dependent variable is type of automobile purchased (compact, sports car, or luxury car). In order to satisfy multiple regression assumptions, the dependent variable(s) would be metric, so they could be preference measures (such as 1 = do not prefer and 5 = greatly prefer) as to automobile type. The independent variables are demographic and life style measures. Thus, the conceptual model is that demographics and life style predict automobile type preference.
2. What are the segmentation variables that distinguish compact automobile buyers and in what ways do they distinguish them?
The relevant section of the table is reproduced here. Recall that the cell entries are standardized beta coefficients that are statistically significant. Compact automobile buyers have strong family values, are not cosmopolitan, have larger families, take pride in America, but do not embrace change. They are younger, financially insecure, and they have less education and earn less income.
Segmentation Variable
Demographics: Age, Education, Family size, Income
Life Style/Values: Active, American pride, Bargain hunter, Conservative, Cosmopolitan, Embrace change, Family values, Financially secure, Optimistic
Compact Automobile Buyers: -.28 -.12 +.39 -.15 +.30 +.45 -.40 -.30 +.69 -.28
3. What are the segmentation variables that distinguish sports car buyers and in what ways do they distinguish them?
Segmentation Variable
Demographics: Age, Education, Family size, Income
Life Style/Values: Active, American pride, Bargain hunter, Conservative, Cosmopolitan, Embrace change, Family values, Financially secure, Optimistic
Sports Car Buyers: -.15 +.38 -.35 +.25 +.59 -.33 -.38 +.68 +.65 +.21 +.71
Sports car buyers are optimistic and cosmopolitan, and they embrace change. They also lead active lives. They are not conservative, and they are not bargain hunters. Demographically, sports car buyers have more education and more income, are younger, and represent smaller families.
4. What are the segmentation variables that distinguish luxury automobile buyers and in what ways do they distinguish them?
Segmentation Variable
Demographics: Age, Education, Family size, Income
Life Style/Values: Active, American pride, Bargain hunter, Conservative, Cosmopolitan
Luxury Automobile Buyers: +.59 +.68 -.39 +.24 +.54
Life Style/Values (continued): Embrace change, Family values, Financially secure, Optimistic
Luxury Automobile Buyers: +.21 +.50 +.37
Luxury car buyers are older with higher incomes. They are conservative, financially secure, and optimistic. They do not lead active lives, and they believe in family values and American pride.
5. Contrast the segmentation variable classification differences among the three types of automobile buyers.
Three differences are apparent. First, not all segmentation variables are statistically significant for all three car buyer segments. Second, the standardized beta coefficients differ from segment to segment; in fact, the signs are different in some cases. Third, the relative importance (absolute values of the standardized betas) differs between market segments.

ANSWERS TO END-OF-CHAPTER QUESTIONS
1. Construct and explain a reasonable simple predictive model for each of the following cases:
Application question. Students must use the predictive model concept introduced in the chapter to suggest reasonable relationships. A reasonable model is described under each case.
a. What is the relationship between gasoline prices and distance traveled for family automobile touring vacations?
As gasoline prices at the pump increase, the number of automobile touring miles for family vacations decreases.
b. How do hurricane force warnings relate to purchases of flashlight batteries in the expected landfall area?
The severity of the forecasted hurricane will be positively related to purchases of flashlight batteries because hurricanes commonly cause electricity blackouts.
c. What do florists do with regard to inventory of flowers for the week prior to and the week after Mother's Day?
During the week prior, they stock up, so inventories will increase to peak on the day before Mother's Day, but the demand falls significantly in the next week, so stocks fall to their normal levels.
2. Indicate what the scatter diagram and probable regression line would look like for two variables that are correlated in each of the following ways. In each instance, assume a negative intercept.
Application question. To answer each item, students must understand how to apply scatter diagrams in the context of a correlation coefficient's information. The scatter diagram appearance and regression line are described after each correlation.
a. -.89
The scatter diagram would be a well-defined and narrow ellipse with a negative slope. The regression line would begin at the (negative) intercept and trace the midpoint line of the ellipse from end to end.
b. +.48
The scatter diagram would be an ill-defined and wide ellipse with a positive slope. The regression line would begin at the (negative) intercept and cut through the middle of the ellipse from end to end.
c. -.10
The scatter diagram would have no definable shape. The regression line would begin at the (negative) intercept and move down to the right. The precise angle of the regression line could not be determined from the scatter diagram.
3. Circle K runs a contest inviting customers to fill out a registration card. In exchange, they are eligible for a grand prize drawing of a trip to Alaska. The card asks for the customer's age, education, gender, estimated weekly purchases (in dollars) at the Circle K, and approximate distance the Circle K is from his or her home. Identify each of the following if a multiple regression analysis were to be performed.
a. Independent variables
b. Dependent variable
c. Dummy variable
Application question. Students will need to comprehend and apply these basic regression concepts. The dependent variable is estimated weekly purchases, while the independent variables are age, education, gender, and distance from his or her home. Gender, although categorical (male or female), is entered as a dummy variable.
4. Explain what is meant by the independence assumption in multiple regression. How can you examine your data for independence, and what statistics are issued by most statistical analysis programs? How is this statistic interpreted? That is, what would indicate the presence of multicollinearity, and what would you do to eliminate it?
Review question. In order to answer this question correctly, students must be conversant in the notion of independence as it pertains to multiple regression analysis. The independence assumption refers to the necessity for the independent variables in a multiple regression to be statistically independent, or uncorrelated. This is a fundamental assumption of multiple regression. The statistic referred to is the variance inflation factor, or VIF. A rule of thumb is that as long as VIF is less than 10, multicollinearity is not a concern. With a VIF of greater than 10 associated with any independent variable in the multiple regression equation, the researcher should remove that variable from the independent variable set and rerun the multiple regression. This iterative process is used until only independent variables that are statistically significant and that have acceptable VIFs are in the final multiple regression equation.
5. What is multiple regression? Specifically, what is "multiple" about it, and how does the formula for multiple regression appear? In your indication of the formula, identify the various terms and also indicate the signs (positive or negative) that they may take on.
Review question. This is a test of a student's basic knowledge of the underlying model in multiple regression analysis. Multiple regression is conceptually identical to bivariate regression except that more than one, and perhaps several, independent variables are used. The general equation for multiple regression is as follows:
$$y = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$$
where y is the dependent variable, a is the intercept, the b_i are the regression coefficients or betas, and the x_i are the values of the independent variables. There are n independent variables in the equation. The intercept and each regression coefficient may be positive or negative.
6. If one uses the "enter" method for multiple regression analysis, what statistics on an SPSS for Windows output should be examined to assess the result? Indicate how you would determine each of the following:
a. Variance explained in the dependent variable by the independent variables
b. Statistical significance of each of the independent variables
c. Relative importance of the independent variables in predicting the dependent variable
Review question. Students will need to be familiar with SPSS multiple regression output and how its various statistics relate to multiple regression model assumptions/results. For a., inspect the adjusted R square; for b., inspect the significance of the t value associated with each independent variable's beta; and for c., look at the standardized beta coefficients.
7. Explain what is meant by the notion of "trimming" a multiple regression result. Use the following example to illustrate your understanding of this concept. A bicycle manufacturer maintains records over 20 years of the following measured in appropriate units per year: unit sales (dependent variable), average retail price in dollars, co-operative advertising amount in dollars, competitors' average retail price in dollars, number of retail locations selling the bicycle manufacturer's brand, and whether or not the winner of the Tour de France was riding the manufacturer's brand (coded as a dummy variable where 0=no and 1=yes). The initial multiple regression result determines the following:

Variable                                        Significance Level
Average retail price in dollars                 .001
Cooperative advertising amount in dollars       .202
Competitors' average retail price in dollars    .028
Number of retail locations                      .591
Tour de France                                  .032

Using the "enter" method, what would be the trimming steps you would expect to undertake to identify the significant multiple regression result? Explain your reasoning.
Application question. This question simulates the first result of an SPSS multiple regression model analysis and requires students to apply the notion of trimming. Trimming refers to the iterative process of eliminating nonsignificant independent variables and rerunning the multiple regression until only statistically significant ones remain. In the bicycle example, the first step would be to eliminate the number of retail locations, which has the largest significance level (.591), and rerun the multiple regression. If cooperative advertising...
is again nonsignificant and has the largest significance value, eliminate it and rerun the trimmed multiple regression. Continue until only statistically significant betas are left. The logic is based on the null hypothesis that any nonsignificant independent variable in the multiple regression equation has a beta of zero. Thus, to give it a zero beta, the independent variable must be removed; otherwise, the statistical analysis will still compute a nonzero value for that independent variable.
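The trimming loop for the bicycle example can be sketched in Python as follows. This is an illustration only: the .05 cutoff is an assumption (the question does not state one), and `rerun_regression` is a hypothetical stand-in that returns the significance levels from the question unchanged, whereas a real rerun in SPSS would produce new values at each step.

```python
ALPHA = 0.05  # assumed significance cutoff; not stated in the question

# Significance levels from the initial multiple regression run.
initial_sig = {
    "average retail price": 0.001,
    "cooperative advertising": 0.202,
    "competitors' average retail price": 0.028,
    "number of retail locations": 0.591,
    "Tour de France": 0.032,
}


def rerun_regression(variables):
    """Hypothetical stand-in for rerunning the regression in SPSS.

    Returns the initial significance levels for the variables still in
    the model; an actual rerun would recompute them.
    """
    return {name: initial_sig[name] for name in variables}


def trim(variables, alpha=ALPHA):
    """Drop the least significant variable, rerun, and repeat."""
    kept = list(variables)
    removed = []
    while kept:
        sig = rerun_regression(kept)
        worst = max(kept, key=lambda name: sig[name])
        if sig[worst] <= alpha:
            break  # every remaining variable is significant
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


kept, removed = trim(initial_sig)
print("Removed, in order:", removed)
print("Final model:", kept)
```

Only one variable is dropped per pass, which mirrors the one-at-a-time reruns described in the answer.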
8. Using the bicycle example in question 7, what do you expect would be the elimination of variables sequence using stepwise multiple regression? Explain your reasoning with respect to the operation of this technique.
Application question. This question tests a student's comprehension of how the stepwise feature eliminates nonsignificant independent variables from the regression equation. With stepwise regression, the first independent variable to be included is the one that is most significant and explains the most variance. The multiple regression equation is then recomputed with the remaining independent variables, and the most significant one in that set is added. This process is repeated until only those independent variables with statistically significant coefficients are in the equation. There is no explained variance information in question 7, so students must work with the significance levels. Assuming that the significance levels do not differ with each iteration from those given in the bicycle example, the order of entry would be (1) average retail price in dollars, (2) competitors' average retail price in dollars, and (3) Tour de France. At this point, the stepwise procedure would stop and compute a multiple regression result for these three independent variables.
9. Using SPSS graphical capabilities, diagram the regression plane for the following variables.

Number of gallons of gasoline used per week   Miles commuted for work per week   Number of riders in carpool
5                                             50                                 4
10                                            125                                3
15                                            175                                2
20                                            250                                0
25                                            300                                0
Application question. Students must use the scatter diagram graphing capabilities of SPSS for Windows to create these graphs. Enterprising students may attempt to create a three-dimensional graph, but the data range and number of data points are too few to create a good-looking graph.
[Scatter diagram: GAS (vertical axis) against WORK (horizontal axis)]
[Scatter diagram: GAS (vertical axis) against CARPOOL (horizontal axis)]
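Instructors who want numbers to accompany the graphs can also estimate the regression plane directly. The sketch below is a Python illustration, not SPSS output: it fits gallons = a + b1(miles) + b2(riders) to the five data points in question 9 by least squares, using the closed-form solution for two independent variables. The function names are made up for the sketch.

```python
# Data from question 9.
gas = [5, 10, 15, 20, 25]           # gallons of gasoline used per week
miles = [50, 125, 175, 250, 300]    # miles commuted for work per week
riders = [4, 3, 2, 0, 0]            # number of riders in carpool


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_plane(y, x1, x2):
    """Least squares intercept and slopes for y = a + b1*x1 + b2*x2."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return a, b1, b2


a, b1, b2 = fit_plane(gas, miles, riders)
fitted = [a + b1 * m + b2 * r for m, r in zip(miles, riders)]
residuals = [y - f for y, f in zip(gas, fitted)]

print(f"a = {a:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
print([round(r, 3) for r in residuals])
```

With only five observations and two closely related independent variables, the coefficients should be read as an illustration of the plane rather than as stable estimates.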
10. The Maximum Amount is a company that specializes in making fashionable clothes in large sizes for large people. Among its customers are Sinbad and Shaquille O'Neal. A survey was performed for the Maximum Amount, and a regression analysis was run on some of the data. Of interest in this analysis was the possible relationship between self-esteem (dependent variable) and number of Maximum Amount articles purchased last year (independent variable). Self-esteem was measured on a 7-point scale where 1 signifies very low and 7 indicates very high self-esteem. Following are some items that have been taken from the output.
Pearson product moment correlation = +.63
Intercept = 3.5
Slope = +0.2
Standard Error = 1.5
All statistical tests significant at the .01 level or less
What is the correct interpretation of these findings?
Application question. This question tests students' comprehension of regression findings. This is a bivariate regression, so the square of the correlation indicates the amount of variance explained by the regression, which is .63 squared, or about .40. The intercept suggests that people who buy no Maximum Amount articles have a self-esteem of 3.5 on average. Self-esteem increases by the slope (.2) with each article purchased. The standard error reveals, however, that there is a considerable range in any prediction: at the 95% level of confidence, the confidence interval is plus or minus 1.5 times 1.96, or about 3.0. Essentially, prediction is very imprecise.
11. Wayne LaTorte is a safety engineer who works for the U.S. Postal Service. For most of his life, Wayne has been fascinated by UFOs. He has kept records of UFO sightings in the desert areas of Arizona, California, and New Mexico over the past 15 years, and he has correlated them with earthquake tremors. A fellow engineer suggests that Wayne use regression analysis as a means of determining the relationship. Wayne does this and finds a "constant" of 30 separate earth tremor events and a slope of 5 events per UFO sighting. Wayne then writes an article for the UFO Observer claiming that earthquakes are largely caused by the subsonic vibrations emitted by UFOs as they enter the Earth's atmosphere. What is your reaction to Wayne's article?
Application question. Can students spot this misuse of regression? Although Wayne has determined a statistical relationship, he cannot claim to have found a causal relationship with this data. There is only an association, and regression is a means of making a prediction, but it is inappropriate to infer a causal relationship.
(Students may point out that the R square and standard error are unknown, so it is impossible to assess the goodness of Wayne's model, but the larger point is that he is using regression to substantiate his causal analysis.)
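Returning to question 10, the figures in the interpretation can be checked with a short calculation. The Python sketch below uses the values reported for the Maximum Amount output; the example of a customer who bought 5 articles is hypothetical and is there only to show how wide the range is on a 7-point scale.

```python
# Values reported in question 10.
R = 0.63          # Pearson product moment correlation
INTERCEPT = 3.5
SLOPE = 0.2
STD_ERROR = 1.5
Z_95 = 1.96       # z value for 95% confidence

variance_explained = R ** 2     # share of variance explained in a bivariate regression
margin = Z_95 * STD_ERROR       # half-width of the 95% range around a prediction


def predict_self_esteem(articles):
    """Predicted self-esteem (1 to 7 scale) for a number of articles bought."""
    return INTERCEPT + SLOPE * articles


# Hypothetical customer who purchased 5 articles last year.
prediction = predict_self_esteem(5)

print(f"Variance explained: {variance_explained:.2f}")
print(f"Prediction: {prediction:.1f} plus or minus {margin:.2f}")
```

For this hypothetical customer the range runs past the top of the 7-point scale, which is a concrete way to show students why the prediction is called very imprecise.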
CASE SOLUTIONS
Case 19.1 Don't You Hate it When Part IV
Case Objective
This case is another one where Josh fails to do the analysis correctly, and Marsha must do the work. It involves understanding of the two-step, trimming process and interpretation of standardized beta coefficients.
Answers to Case Questions
1. Describe the two-step process and trimming approach that Josh should have used in running his three multiple regression analyses with the Pets, Pets & Pets data.
The two-step process is (1) determine if there is a linear relationship by examining the significance level for the F-test, and, if this test is significant, (2) inspect the coefficients of the independent variables for significance. Trimming is the process of iteratively eliminating independent variables that are not statistically significant and rerunning the regression.
2. Assume that the independent variables reported in each of Josh's three tables are the result of correctly using the two-step process and trimming the nonsignificant independent variables. Describe the relationships revealed in each table, and indicate the implications of these relationships for Pets, Pets & Pets' marketing strategy.
The interpretations are provided in the following tables.
Table 1: Times visited PPP in past year
Independent Variable(s)                                                                 Standardized Beta
I usually purchase pet supplies from the same company.*                                  0.31
Pets, Pets & Pets helps me stretch my wallet.*                                          -0.25
Buying pet supplies at Pets, Pets & Pets gives me time to do more important things.*     0.25
My pet is a large part of my life.*                                                      0.30
I am pleased with my pet right now.*                                                     0.13
I enjoy taking care of my pet.*                                                          0.15
How many miles do you live from Pets, Pets & Pets?                                      -0.19
Indicate your gender (1=male, 2=female)                                                 -0.18
*Based on a scale where 1=strongly disagree and 5=strongly agree

The most important variables are "I usually purchase pet supplies from the same company" and "My pet is a large part of my life," with "PPP helps me stretch my wallet" (negative) and "Buying pet supplies at PPP gives me time to do more important things" slightly less important.
The profile of the PPP frequent visitor:
Company loyal
Pet is a large part of life
Uses PPP to save time
Does not use PPP to save money
Lives close to PPP
Male
Enjoys caring for pet
Pleased with pet
Table 2: How likely to buy at PPP next time (1-7 scale)
Independent Variable(s)                               Standardized Beta
Wide variety of pet supplies at Pets, Pets & Pets*    -0.25
Good values at Pets, Pets & Pets*                      0.49
Helpful employees at Pets, Pets & Pets*                0.33
*Based on a scale where 1=strongly disagree and 5=strongly agree

The most important variable is good values, with helpful employees next, and a narrow variety (negative sign) of PPP supplies third in importance.
Why patrons are likely to buy at PPP next time:
Good values
Helpful employees
Narrow variety (PPP is a specialty store)
Table 3: Amount spent at PPP last time
Independent Variable(s)                                                   Standardized Beta
Number of pets owned                                                      -0.17
Recall seeing a PPP newspaper ad in the past month? (1=yes, 2=no)         -0.21
Family income level                                                        0.38

Most important is family income level, next is recall of a PPP newspaper ad (because the variable is coded 1=yes and 2=no, the negative sign means that those who recall the ad spent more), and third is (few) number of pets owned.
The profile of high spenders at PPP:
High income level
Recall seeing PPP newspaper ads
Few pets owned
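The importance rankings given for Tables 2 and 3 follow from sorting the variables by the absolute value of their standardized betas. A minimal Python sketch, with the variable labels shortened for readability and the function name chosen only for illustration:

```python
# Standardized betas copied from Tables 2 and 3 (labels shortened).
table_2 = {
    "Wide variety of pet supplies": -0.25,
    "Good values": 0.49,
    "Helpful employees": 0.33,
}
table_3 = {
    "Number of pets owned": -0.17,
    "Recall seeing a PPP newspaper ad (1=yes, 2=no)": -0.21,
    "Family income level": 0.38,
}


def rank_by_importance(betas):
    """Order variables by the absolute value of their standardized beta."""
    return sorted(betas, key=lambda name: abs(betas[name]), reverse=True)


for title, table in (("Table 2", table_2), ("Table 3", table_3)):
    print(title)
    for name in rank_by_importance(table):
        # The sign is kept in the printout because it gives the direction.
        print(f"  {name}: {table[name]:+.2f}")
```

The absolute value decides the ranking, while the sign still has to be read against each variable's coding to interpret the direction of the relationship.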
Case 19.2
Sales Training Associates, Inc.
Case Objective This case requires students to perform bivariate and multiple regression analyses and to interpret the results.
Answers to Case Questions
1. Using SPSS for Windows, perform a series of bivariate regressions using the sales performance measure as the dependent variable, and each of the other factors in the table as independent measures. What did you find, and how do you interpret these findings?
The bivariate regression outputs are listed in order, each with an interpretation. For convenience, confidence intervals for predictions are set at 95%.
Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .723a   .523       .497                4.07

a. Predictors: (Constant), TRAINHRS

ANOVA b

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   326.264          1    326.264       19.738   .000a
Residual     297.536          18   16.530
Total        623.800          19

a. Predictors: (Constant), TRAINHRS
b. Dependent Variable: RATING

Coefficients a

Model 1      B          Std. Error   Beta    t       Sig.
(Constant)   3.357      2.127                1.578   .132
TRAINHRS     6.45E-02   .015         .723    4.443   .000

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: RATING
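As a check on how the summary statistics relate to the ANOVA table, the sketch below recomputes R Square, Adjusted R Square, the F ratio, and the standard error of the estimate from the sums of squares and degrees of freedom reported for the TRAINHRS regression. The variable names are illustrative; the formulas are the standard ones for least squares regression.

```python
import math

# Values copied from the ANOVA table for the TRAINHRS regression.
ss_regression, ss_residual = 326.264, 297.536
df_regression, df_residual = 1, 18

ss_total = ss_regression + ss_residual       # total sum of squares
n = df_regression + df_residual + 1          # number of cases (20 salespersons)

r_square = ss_regression / ss_total          # share of variance explained
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)  # standard error of the estimate

print(f"R Square           = {r_square:.3f}")
print(f"Adjusted R Square  = {adj_r_square:.3f}")
print(f"F                  = {f_ratio:.3f}")
print(f"Std. Error of Est. = {std_error_estimate:.2f}")
```

The recomputed values agree with the SPSS output to the reported number of decimals.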
Interpretation. The bivariate regression is significant (F = 19.738, Sig. = .000), and the Adjusted R Square indicates that training hours explain about 50% of the variance in ratings. The slope is significant (Sig. = .000) and is .065 in size. The constant is not significantly different from zero (Sig. = .132), so it is treated as zero. The equation is: rating = 0 + .065 times number of training hours. Confidence intervals for predictions at the 95% level of confidence will be ±1.96 times 4.07.
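A prediction under this approach can be sketched as follows. The slope, the zero constant, and the standard error of the estimate come from the output above; the function name predict_rating and the example value of 100 training hours are hypothetical, chosen only for illustration.

```python
# Coefficients from the TRAINHRS regression; the constant is treated as zero
# because it was not significantly different from zero.
INTERCEPT = 0.0
SLOPE = 0.065                # rating points per training hour
STD_ERROR_ESTIMATE = 4.07    # standard error of the estimate
Z_95 = 1.96                  # z value for the 95% level of confidence


def predict_rating(training_hours):
    """Return (lower, predicted, upper) for a 95% confidence interval."""
    predicted = INTERCEPT + SLOPE * training_hours
    margin = Z_95 * STD_ERROR_ESTIMATE
    return predicted - margin, predicted, predicted + margin


# Hypothetical salesperson with 100 training hours (not a case in the data set).
lower, predicted, upper = predict_rating(100)
print(f"Predicted rating: {predicted:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

The interval is wide relative to the predicted value, which reflects the roughly 50% of the variance in ratings that training hours leave unexplained.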
Coefficients a

Model 1      B       Std. Error   Beta    t       Sig.
(Constant)   5.595   2.187                2.558   .020
CERTS        1.314   .401         .612    3.279   .004

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Predictors: (Constant), CERTS. Dependent Variable: RATING
Interpretation. The bivariate regression is significant (Sig. = .004), and both the slope of number of certificates and the constant are significantly different from zero (.004 and .020, respectively). The equation is: rating = 5.59 + 1.31 times number of certificates earned. Confidence intervals for predictions at the 95% level of confidence will be ±1.96 times 4.66. The R square is lower than in the previous bivariate regression, indicating a weaker linear relationship.
Coefficients a

Model 1      B        Std. Error   Beta    t       Sig.
(Constant)   -2.124   2.962                -.717   .483
AGE          .353     .071         .759    4.946   .000

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Predictors: (Constant), AGE. Dependent Variable: RATING
Interpretation. The bivariate regression is significant (Sig. = .000), and the slope for age is significantly different from zero (.000), but the constant is not (.483), so it is treated as zero. The regression equation is: rating = 0 + .35 times years of age. Confidence intervals for predictions at the 95% level of confidence will be ±1.96 times 3.83. The R square value is comparable to that of the first regression and larger than that of the second one.
Coefficients a

Model 1      B       Std. Error   Beta    t       Sig.
(Constant)   6.906   1.668                4.140   .001
COMPYRS      .401    .108         .659    3.720   .002

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Predictors: (Constant), COMPYRS. Dependent Variable: RATING
Interpretation. The bivariate regression is significant (Sig. = .002), and both the slope and intercept are significantly different from zero. The regression equation is: rating = 6.91 + .40 times number of years with company. Confidence intervals for predictions at the 95% level of confidence will be ±1.96 times 4.43.
Note: The independent variable of gender (coded 1 or 2) is a nominally scaled variable, and it should not be used in a regression analysis, as this analysis assumes metric data for the independent and dependent variables.
2. Use multiple regression to determine the relationship of the various factors to self-evaluated sales performance for last year. What did you find, and what are the implications of the findings for STA?
Preliminary analysis involves inspecting the correlations between the various independent variables to spot multicollinearity problems. The correlation matrix follows.
Correlations (Pearson Correlation, with Sig. (2-tailed) in parentheses; N = 20 for every pair)

            TRAINHRS         COMPYRS          GENDER          CERTS            AGE
TRAINHRS    1.000            .605** (.005)    -.339 (.144)    .873** (.000)    .558* (.011)
COMPYRS     .605** (.005)    1.000            -.436 (.055)    .590** (.006)    .867** (.000)
GENDER      -.339 (.144)     -.436 (.055)     1.000           -.306 (.189)     -.200 (.398)
CERTS       .873** (.000)    .590** (.006)    -.306 (.189)    1.000            .428 (.060)
AGE         .558* (.011)     .867** (.000)    -.200 (.398)    .428 (.060)      1.000

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
High correlations exist between trainhrs and certs (.873) and between age and compyrs (.867). It is reasonable to drop trainhrs and use certs because the number of certificates earned in STA programs is more managerially relevant than the total training hours. Also, STA would probably like to advertise that salespersons can benefit from its training programs regardless of age, so drop compyrs and keep age for the multiple regression that follows. Note that gender is a dummy independent variable.
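The screening step described here can be sketched as a scan of the correlation matrix for pairs of independent variables whose correlation is large in absolute value. The correlations below are copied from the matrix in this answer; the cutoff of .80 and the function name flag_high_correlations are assumptions made for illustration, not values taken from the case.

```python
from itertools import combinations

# Pearson correlations among the independent variables (from the matrix above).
correlations = {
    ("TRAINHRS", "COMPYRS"): 0.605,
    ("TRAINHRS", "GENDER"): -0.339,
    ("TRAINHRS", "CERTS"): 0.873,
    ("TRAINHRS", "AGE"): 0.558,
    ("COMPYRS", "GENDER"): -0.436,
    ("COMPYRS", "CERTS"): 0.590,
    ("COMPYRS", "AGE"): 0.867,
    ("GENDER", "CERTS"): -0.306,
    ("GENDER", "AGE"): -0.200,
    ("CERTS", "AGE"): 0.428,
}
variables = ["TRAINHRS", "COMPYRS", "GENDER", "CERTS", "AGE"]


def flag_high_correlations(corr, names, cutoff=0.80):
    """Return variable pairs whose absolute correlation meets the cutoff."""
    flagged = []
    for first, second in combinations(names, 2):
        r = corr[(first, second)]
        if abs(r) >= cutoff:
            flagged.append((first, second, r))
    return flagged


for first, second, r in flag_high_correlations(correlations, variables):
    print(f"{first} and {second}: r = {r:.3f}")
```

With this cutoff, the only flagged pairs are the two named in the answer, so one variable from each pair is dropped before running the multiple regression.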
Coefficients a

Model 1      B        Std. Error   Beta     t       Sig.
(Constant)   -2.098   4.085                 -.513   .615
GENDER       -.507    1.749        -.043    -.290   .776
CERTS        .729     .348         .340     2.097   .052
AGE          .282     .073         .605     3.848   .001

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Predictors: (Constant), AGE, GENDER, CERTS. Dependent Variable: RATING
Interpretation. The multiple regression is significant (Sig. = .000). The slopes for certs and age are significantly different from zero (albeit only marginally for certificates, at .052), but this is not the case for gender (.776). Drop gender and perform a trimmed model regression. The output follows:
Coefficients a

Model 1      B        Std. Error   Beta    t        Sig.
(Constant)   -2.970   2.686                -1.106   .284
CERTS        .754     .328         .351    2.303    .034
AGE          .283     .071         .609    3.993    .001

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Predictors: (Constant), AGE, CERTS. Dependent Variable: RATING
Interpretation. The multiple regression is significant (Sig. = .000), and the slopes of both independent variables are significantly different from zero (.034 and .001), while the constant is not. The high multiple R of .823 indicates a strong linear relationship, with much of the ratings variance explained by the two independent variables. The regression model is: rating = 0 + .75 times number of certificates earned + .28 times the person's age in years. Using the 95% level of confidence, predictions can be made with confidence intervals of ±1.96 times 3.44 (the standard error of the estimate). According to the standardized beta coefficients (.609 versus .351), age is about twice as important as the number of certificates earned. Age can be interpreted as a general indicator of sales experience, operating independently of training gained in the STA certification programs.
Case 19.3
The Hobbits Choice Restaurant Survey Predictive Analysis
Case Objective Students must apply predictive analysis on the SPSS integrated case data set and interpret the findings.
Answers to Case Questions
1. What is the demographic target market definition for the Hobbits Choice Restaurant?
The dependent variable to use is the likelihood of patronizing the Hobbits Choice Restaurant. Independent variables must be metric or dichotomous (dummy). The recoded income level is metric; family size is metric; year born (or age) is metric; and gender can be used as a dummy variable. The multiple regression findings follow.
Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .776a   .602       .598                .785

a. Predictors: (Constant), Recoded income to $10,000's using midpoints of questionnaire ranges, Including children under 18 living with you, what is your family size?, What is your gender?, Year Born

ANOVA b

Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   367.834          4     91.958        149.379   .000a
Residual     243.164          395   .616
Total        610.998          399

a. Predictors: (Constant), Recoded income to $10,000's using midpoints of questionnaire ranges, Including children under 18 living with you, what is your family size?, What is your gender?, Year Born
b. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?

Coefficients a

Model 1                                                                          B       Std. Error   Beta    t        Sig.
(Constant)                                                                       9.103   9.308                .978     .329
Recoded income to $1,000s using midpoints of questionnaire ranges                .018    .001         .762    20.957   .000
Including children under 18 living with you, what is your family size?           .019    .029         .021    .664     .507
What is your gender?                                                             .038    .079         .016    .490     .624
Year Born                                                                        -.004   .005         -.030   -.816    .415

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?
As can be seen, the only significant independent variable is the recoded income level. Trimming and rerunning the regression results in the following:
ANOVA b

Model 1      Sum of Squares   df    F         Sig.
Regression   366.920          1     598.309   .000a
Residual     244.078          398
Total        610.998          399

a. Predictors: (Constant), Recoded income to $10,000's using midpoints of questionnaire ranges
b. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?

Coefficients a

Model 1                                                               Beta    t        Sig.
(Constant)                                                                    23.625   .000
Recoded income to $1,000s using midpoints of questionnaire ranges     .775    24.460   .000

Beta is the standardized coefficient.
a. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?
The target market definition is very simple: the Hobbits Choice Restaurant target market is upper-income residents.
2. What is the restaurant spending behavior target market definition for the Hobbits Choice Restaurant?
This question calls for restaurant spending behavior variables as the independent variables. There are two questions on the survey that pertain to restaurant spending: total dollars spent per month in restaurants and expected average price for an evening meal entree item alone.
ANOVA b

Model 1      Sum of Squares   df    F         Sig.
Regression   260.859          2     432.827   .000a
Residual     101.553          337
Total        362.412          339

a. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?, How many total dollars do you spend per month in restaurants (for your meals only)?
b. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?

Coefficients a

Model 1                                                                               B      Std. Error   Beta   t        Sig.   Tolerance   VIF
(Constant)                                                                                                       20.871   .000
How many total dollars do you spend per month in restaurants (for your meals only)?                       .275   4.567    .000   .229        4.361
What would you expect an average evening meal entree item alone to be priced?         .063   .006         .597   9.908    .000   .229        4.361

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.
a. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?
The significance level of both independent variables is .000, and there is no problem with multicollinearity, as no VIF value exceeds 10. Both total dollars spent in restaurants per month and the expected average price for an evening meal entree predict the likelihood of patronizing the Hobbits Choice Restaurant. The standardized beta coefficients (.597 versus .275) reveal that the average price variable is about twice as important as total dollars spent on restaurants in predicting this likelihood. The target market definition for the Hobbits Choice Restaurant is: (1) people who patronize restaurants in general, and (2) those who expect to spend more for an evening entree (i.e., bigger spenders).
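The multicollinearity check relies on the relationship between the two collinearity statistics: VIF is the reciprocal of tolerance. A minimal sketch of the check is given below. The tolerance values come from the output above and the cutoff of 10 is the rule applied in this answer; the function names and short labels are illustrative assumptions.

```python
VIF_CUTOFF = 10.0  # rule applied in this answer: a VIF above 10 signals a problem

# Tolerance values reported by SPSS for the two independent variables.
tolerances = {
    "Total dollars spent per month in restaurants": 0.229,
    "Expected average evening entree price": 0.229,
}


def vif_from_tolerance(tolerance):
    """VIF is defined as 1 divided by tolerance."""
    return 1.0 / tolerance


def has_multicollinearity(tolerance_by_variable, cutoff=VIF_CUTOFF):
    """True if any variable's VIF exceeds the cutoff."""
    return any(vif_from_tolerance(t) > cutoff for t in tolerance_by_variable.values())


for name, tolerance in tolerances.items():
    print(f"{name}: tolerance = {tolerance:.3f}, VIF = {vif_from_tolerance(tolerance):.3f}")
print("Multicollinearity problem:", has_multicollinearity(tolerances))
```

The recomputed VIF differs from the reported 4.361 only in the third decimal because the tolerance shown in the output is itself rounded.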
3. Develop a general conceptual model of market segmentation for the Hobbits Choice Restaurant. Test it using multiple regression analysis and interpret your findings for Jeff Dean.
The general conceptual model should be based on the variables in the survey. There are three classes of variables: (1) demographics, (2) restaurant patronage, and (3) restaurant feature preferences. The media usage variables are categorical and not suited to regression analysis. Following is the result of stepwise multiple regression.
Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .838a   .702       .702                .565
2       .852b   .725       .724                .543
3       .855c   .731       .729                .538

a. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?
b. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?, Prefer Formal Waitstaff wearing Tuxedos
c. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?, Prefer Formal Waitstaff wearing Tuxedos, Year Born

ANOVA d

Model                Sum of Squares   df    Mean Square   F         Sig.
1   Regression       254.574          1     254.574       797.917   .000a
    Residual         107.838          338   .319
    Total            362.412          339
2   Regression       262.880          2     131.440       445.038   .000b
    Residual         99.531           337   .295
    Total            362.412          339
3   Regression       265.085          3     88.362        305.049   .000c
    Residual         97.327           336   .290
    Total            362.412          339

a. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?
b. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?, Prefer Formal Waitstaff wearing Tuxedos
c. Predictors: (Constant), What would you expect an average evening meal entree item alone to be priced?, Prefer Formal Waitstaff wearing Tuxedos, Year Born
d. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?

Coefficients a

Model                                                                                B        Std. Error   Beta    t        Sig.   Tolerance   VIF
1   (Constant)                                                                       1.663    .066                 25.080   .000
    What would you expect an average evening meal entree item alone to be priced?    .088     .003         .838    28.247   .000   1.000       1.000
2   (Constant)                                                                       1.593    .065                 24.446   .000
    What would you expect an average evening meal entree item alone to be priced?    .069     .005         .652    14.428   .000   .399        2.508
    Prefer Formal Waitstaff Wearing Tuxedos                                          .163     .031         .240    5.303    .000   .399        2.508
3   (Constant)                                                                       40.085   13.953               2.873    .004
    What would you expect an average evening meal entree item alone to be priced?    .065     .005         .618    13.323   .000   .371        2.696
    Prefer Formal Waitstaff Wearing Tuxedos                                          .107     .037         .157    2.918    .004   .276        3.629
    Year Born                                                                        -.020    .007         -.136   -2.759   .006   .331        3.021

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.
a. Dependent Variable: How likely would it be for you to patronize this restaurant (new upscale restaurant)?
The analysis identified three significant variables: expected average evening entree price, preference for formal waitstaff wearing tuxedos, and year born. The negative coefficient for year born means that likelihood of patronage rises with age. Jeff Dean should therefore target older, big spenders who prefer a formal waitstaff attired in tuxedos in his upscale restaurant.
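To see how much each step of the stepwise regression adds, R Square can be recomputed at each step from the ANOVA sums of squares and the increments compared. The sketch below uses the regression and total sums of squares reported for the three models; the short step labels and variable names are illustrative assumptions.

```python
# Regression sum of squares at each stepwise model, from the ANOVA output.
SS_TOTAL = 362.412
ss_regression_by_step = [
    ("1: entree price", 254.574),
    ("2: + formal waitstaff", 262.880),
    ("3: + year born", 265.085),
]

r_squares = []
previous = 0.0
for label, ss_regression in ss_regression_by_step:
    r_square = ss_regression / SS_TOTAL   # variance explained at this step
    gain = r_square - previous            # increment contributed by the new variable
    r_squares.append(r_square)
    print(f"Model {label:24s} R Square = {r_square:.3f} (gain {gain:.3f})")
    previous = r_square
```

Expected entree price accounts for most of the explained variance; the two variables entered later are statistically significant but add only a few percentage points.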