
Teddy Hecht
MKMR 310: Marketing Analytics
Exercise 6: SPSS Regression
October 22, 2011

For this regression, the dependent (explained) variable (DV) is men, sales of men's clothing. The two independent (explanatory) variables (IVs) are mail, the number of catalogs mailed, and page, the number of pages in the catalog. In regressing men on mail and page, we seek to determine whether the number of catalogs mailed, the page count of those catalogs, or both have a significant effect on the dollar sales of men's clothing and, where an effect is statistically significant, to estimate its magnitude. The null hypothesis is that mail and page do not have any effect on men.

SPSS Variable View confirms that both IVs and the DV are numeric with a scale level of measurement. The absolute values of the skewness (1.120, |-.683|, .014) and kurtosis (1.762, 6.194, .066) of men, mail, and page, respectively, all lie within the appropriate limits of |skewness| < 3 and |kurtosis| < 8. The regression equation has the form men = b0 + b1(mail) + b2(page) + e, where b0 is the intercept, b1 and b2 are the unstandardized coefficients, and e is the error term.
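
The skewness and kurtosis screen described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the report (|skewness| < 3, |kurtosis| < 8) applied to the reported values; the names `reported` and `passes_screen` are hypothetical and are not part of the SPSS workflow.

```python
# Screening limits as stated in the report.
SKEW_LIMIT = 3.0
KURT_LIMIT = 8.0

# Skewness and kurtosis reported for the full data set (120 cases).
reported = {
    "men": {"skewness": 1.120, "kurtosis": 1.762},
    "mail": {"skewness": -0.683, "kurtosis": 6.194},
    "page": {"skewness": 0.014, "kurtosis": 0.066},
}


def passes_screen(stats):
    """Return True when both absolute values fall inside the limits."""
    return (
        abs(stats["skewness"]) < SKEW_LIMIT
        and abs(stats["kurtosis"]) < KURT_LIMIT
    )


# One pass/fail flag per variable.
screen = {name: passes_screen(stats) for name, stats in reported.items()}

for name, ok in screen.items():
    print(f"{name}: {'within limits' if ok else 'outside limits'}")
```

Note that mail's kurtosis of 6.194 is the value closest to its limit, which is consistent with the outlier discussed later in the report.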

The regression model's p-value (reported by SPSS as .000, that is, p < .001) is below .05, so the regression is significant at the .05 level of significance. The model has an adjusted R-square of .661, meaning that 66.1% of the variance in the DV men can be explained by the two IVs mail and page together. I found that both Number of catalogs mailed (B = 2.913, p < .001) and Number of pages in catalog (B = 72.544, p = .006) were significant, positive predictors of Sales of men's clothing. According to the model, a one-unit increase in mail predicts a 2.913 increase in men, holding page constant; that is, every additional catalog mailed predicts a $2.91 increase in men's clothing sales. Similarly, every additional page in the catalog predicts a $72.54 increase in men's clothing sales.

Testing for assumptions. Tolerance and VIF are .980 and 1.020, respectively, for both IVs, so we can rule out multicollinearity. Of note, the partial regression plot of mail on men does not pass the test for linearity: the difference between the linear and quadratic R-square values is .063, which is greater than the 3% (.03) cutoff. The quadratic and cubic fits are materially (and equally) better at describing this relationship. Visual inspection of the plot shows one extreme outlier corresponding to the 09/01/1997 observation; this may be why the non-linear fits appear more accurate than the linear fit. Plotting the studentized deleted residuals (SDR_1) against Cook's distance (COO_1) again shows one visibly extreme outlier. I excluded the observation using the Select Cases command (keeping all cases with Cook's distance below 1) and reran the regression. Changes from the initial results are presented in the table below.

| | All cases | Outlier removed |
|---|---|---|
| Number of cases | 120 | 119 |
| Absolute skewness of (men, mail, page) | (1.120, .683, .014) | (1.203, .717, .029) |
| Absolute kurtosis of (men, mail, page) | (1.762, 6.194, .066) | (1.828, .748, .061) |
| Adjusted R-square | .661 | .696 |
| ANOVA: sig. | .000 | .000 |
| mail: unstandardized B; sig. | 2.913; .000 | 3.375; .000 |
| page: unstandardized B; sig. | 72.544; .006 | 57.051; .023 |
| Tolerance | .980 | .962 |
| VIF | 1.020 | 1.039 |
| Linearity: partial of mail | fail | pass |
| Linearity: partial of page | pass | pass |
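
Two of the assumption checks above reduce to simple arithmetic, sketched here in Python. VIF is the reciprocal of tolerance, and the linearity rule used in this report fails a partial plot when a curved fit adds more than .03 to the linear R-square. The function names are hypothetical, and the small gaps between computed and reported VIF values are assumed to come from SPSS rounding tolerance to three decimals.

```python
# Cutoff used in the report: a curved fit must not add more than .03 to R-square.
LINEARITY_GAP = 0.03


def vif_from_tolerance(tolerance):
    """VIF is defined as 1 / tolerance."""
    return 1.0 / tolerance


def passes_linearity(r2_gain, gap=LINEARITY_GAP):
    """Pass when the quadratic/cubic fit adds no more than `gap` to the linear R-square."""
    return r2_gain <= gap


# Tolerances reported for the two-IV model, before and after removing the outlier.
for label, tol in [("all cases", 0.980), ("outlier removed", 0.962)]:
    print(f"{label}: tolerance={tol:.3f}, VIF={vif_from_tolerance(tol):.3f}")

# The partial plot of mail on men showed a gain of .063 with all cases included.
print("mail passes linearity:", passes_linearity(0.063))
```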

The next regression adds a third independent variable to the equation: phone, the number of phone lines open for ordering. For this analysis, I will continue to exclude the outlier as described above.

I will begin by checking the model assumptions. The absolute value of skewness (1.203, .717, .029, .343) is < 3 and the absolute value of kurtosis (1.828, .748, .061, |-.251|) is < 8 for men, mail, page, and phone, respectively. Like the other three variables, phone is numeric with a scale level of measurement. All three partial regression plots pass the linearity test; by this I mean that the cubic and quadratic best-fit curves are materially no better than the linear fit at explaining the effect of the IV on the DV. The IVs mail, page, and phone have Tolerances of (.437, .961, .441) and VIFs of (2.288, 1.041, 2.269), respectively. All of these values are outside the range that indicates multicollinearity.

The results of this regression. The adjusted R-square of this regression is .769, meaning that 76.9% of the variance in men, sales of men's clothing, can be explained by the three independent variables: the number of catalogs mailed, the number of pages in the catalog, and the number of phone lines available for ordering. Compared to the previous regression, this model explains (.769 - .696) = .073, or 7.3 percentage points, more of the variance in sales of men's clothing. This does not, however, mean that phone predicts only 7.3% of that variance. That number reflects only phone's unique contribution to the explained variance in men beyond the other two predictors. Due to collinearity, some of the effect of phone might overlap with the effect of mail and/or page.

There appear to be two influential cases; I filtered them out by keeping only cases with Cook's Distance < 0.1. After finding that the new adjusted R-square had decreased by more than 3 percentage points, to .726, I returned those two cases to the regression.
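
The adjusted R-square comparisons above can be checked with a short Python sketch. It uses the standard adjustment formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), and assumes n = 119 cases (the one outlier excluded, the two influential cases returned) with k = 3 predictors; those values are inferred from the report rather than read from SPSS output, and the function names are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-square for n cases and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjustment to recover the unadjusted R-square."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)


# Adjusted R-square values reported in the text.
two_iv = 0.696        # men on mail and page, outlier removed
three_iv = 0.769      # men on mail, page, and phone
three_iv_cook = 0.726  # same model after dropping cases with Cook's distance >= 0.1

gain = three_iv - two_iv          # added by including phone
drop = three_iv - three_iv_cook   # lost by removing the two influential cases

print(f"gain from adding phone: {gain:.3f}")
print(f"drop after Cook's filter: {drop:.3f}")
print(f"implied unadjusted R-square: {r2_from_adjusted(three_iv, 119, 3):.3f}")
```

The gain is a difference between two whole-model fits, which is why it measures only what phone adds beyond mail and page rather than phone's total association with sales.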
I found that Number of catalogs mailed, Number of pages in catalog, and Number of phone lines open for ordering were significant, positive predictors of Sales of men's clothing. According to the model, a one-unit increase in mail predicts a 2.104 increase in men; that is, each additional catalog mailed predicts a $2.10 increase in men's clothing sales. Similarly, each additional page in the catalog predicts a $51.68 increase in men's clothing sales; each additional phone line open for ordering predicts a $301.02 increase.

It is not surprising that more catalogs mailed, bigger catalogs mailed, and more phone lines open for ordering all have positive effects on sales of men's clothing; that much is probably true of women's clothing, too, and of pretty much every direct mail product or service. Now, we might say that the number and page count of catalogs mailed affect the early stages of the buying process: need awareness and information search. The more people we reach with catalogs, and the more thorough our catalog's pitch, the greater the number of customers likely to call the store for information or to order merchandise. Any potential customer who calls the store is prequalified, that is, more likely to buy than the average person in our target market, because he or she is calling us. Not having enough phone lines available to take orders is like shooting ourselves in the foot. That $300 increase in sales for every additional line available tells the story of the revenue-choking bottleneck we experience when we don't have enough phone infrastructure or staff.

Next, I recoded YEAR_ into year_split, assigning the first 5 years of data a 0 and the last 5 years a 1. I ran the same regression as above for each 5-year split. The model for the first 5 years shows no significant effect at all of page on men (partial regression R² linear = 0). By contrast, the later 5 years model does show a significant, positive effect (partial regression R² linear = .088). The later years model explains less of the variance of men's clothing sales, however (adj. R-square = .623, p < .001), than the earlier years model (adj. R-square = .735, p < .001). These two findings, taken together, indicate that either mail or phone (or both) were better predictors of men in the first 5 years than in the last 5. Indeed, the biggest change in partial regression R² linear is in mail. In the first 5 years, the number of catalogs mailed out explained, by itself, 52.1% of the variance in sales of men's clothing; in the last 5 years, this variable explained, by itself, only 21.7% of the variance in those sales. Further, the marginal benefit on men predicted by increasing mail by one declined from B = 2.625, p < .001 in the first period to B = 2.020, p < .001 in the second. Put simply, the number of catalogs mailed out brought smaller marginal sales gains and played a much less significant role in predicting total sales volume in the latter five years.

The managerial implication is twofold: if we're committed to the catalog format, we should avoid mailing it to every address we can get and instead do some targeted mailing. We also need to evaluate alternative formats, such as an online catalog and/or online store. My recommendation is to pursue or enhance our online storefront, because doing so may also benefit the company by taking pressure off our currently underequipped phone staff.
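
The split-sample comparison above can be summarized as relative changes between the two periods. The Python sketch below only restates figures already reported in the text; the dictionary keys and the `relative_change` helper are hypothetical names chosen for illustration.

```python
# Figures reported for the two five-year splits.
first_five = {"b_mail": 2.625, "partial_r2_mail": 0.521, "adj_r2": 0.735}
last_five = {"b_mail": 2.020, "partial_r2_mail": 0.217, "adj_r2": 0.623}


def relative_change(old, new):
    """Change from old to new, expressed as a fraction of old."""
    return (new - old) / old


# Relative change of each figure from 1989-1993 to 1994-1998.
changes = {
    key: relative_change(first_five[key], last_five[key]) for key in first_five
}

for key, change in changes.items():
    print(f"{key}: {first_five[key]} -> {last_five[key]} ({change:+.1%})")
```

Read this way, the marginal dollar return per catalog fell by roughly a quarter, while mail's share of explained variance fell by more than half, which is the pattern behind the targeted-mailing recommendation.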

Statistics

For all ten variables, Valid N = 120, Missing = 0, Std. Error of Skewness = .221, and Std. Error of Kurtosis = .438.

| Variable | Label | Mean | Std. Error of Mean | Median | Std. Deviation | Variance | Skewness | Kurtosis | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|---|---|
| date | Date | 12/15/1993 | 96 15:39:44.816 | 12/16/1993 | 1058 18:37:44.334 | 8368272845471193.000 | .000 | -1.200 | 01/01/1989 | 12/01/1998 |
| men | Sales of Men's Clothing | 16242.8134 | 577.38975 | 15452.2750 | 6324.98781 | 40005470.747 | 1.120 | 1.762 | 3245.18 | 38609.66 |
| women | Sales of Women's Clothing | 40583.6799 | 1113.37665 | 39779.3150 | 12196.43013 | 148752907.989 | .603 | .818 | 16578.93 | 80245.97 |
| jewel | Sales of Jewelry | 16740.7083 | 614.68769 | 14241.0850 | 6733.56625 | 45340914.505 | 1.339 | 1.226 | 5983.55 | 38231.57 |
| mail | Number of Catalogs Mailed | 10131.77 | 154.996 | 10073.50 | 1697.898 | 2882856.063 | -.683 | 6.194 | 1147 | 15263 |
| page | Number of Pages in Catalog | 80.58 | 1.190 | 80.50 | 13.030 | 169.791 | .014 | .066 | 51 | 114 |
| phone | Number of Phone Lines Open for Ordering | 34.83 | .773 | 34.00 | 8.464 | 71.641 | .325 | -.267 | 17 | 59 |
| print | Amount Spent on Print Advertising | 28518.8104 | 319.15366 | 28512.6400 | 3496.15321 | 12223087.250 | .277 | 1.238 | 18061.20 | 40027.78 |
| service | Number of Customer Service Representatives | 35.97 | .999 | 36.00 | 10.942 | 119.730 | .313 | -.323 | 15 | 68 |
| YEAR_ | YEAR, not periodic | 1993.50 | .263 | 1993.50 | 2.884 | 8.319 | .000 | -1.225 | 1989 | 1998 |

Section 2:

Section 3:

Section 3 grouped by year_split:

First five years: 1989-1993

Last five years: 1994-1998

GET
  FILE="C:\Users\Teddy\Documents\Syncplicity Folders\Teddy's Syncplicity\Dropbox\Current Semester\MKMR 310\Assignments\06 SPSS Regression\06.6 Data for Homework catalog(1).sav".
DATASET NAME DataSet1 WINDOW=FRONT.

* Descriptive statistics for the DV and the two IVs.
FREQUENCIES VARIABLES=men mail page
  /FORMAT=NOTABLE
  /STATISTICS=STDDEV MINIMUM MAXIMUM SEMEAN MEAN MEDIAN SKEWNESS SESKEW KURTOSIS SEKURT
  /ORDER=ANALYSIS.

* Regress men on mail and page; save Cook's distance (COO_1) and studentized deleted residuals (SDR_1).
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT men
  /METHOD=ENTER mail page
  /PARTIALPLOT ALL
  /SAVE COOK SDRESID.

GRAPH
  /SCATTERPLOT(BIVAR)=COO_1 WITH SDR_1
  /MISSING=LISTWISE.

* Keep only cases with Cook's distance below 1 (drops the single extreme outlier).
USE ALL.
COMPUTE filter_$=(COO_1 < 1).
VARIABLE LABELS filter_$ 'COO_1 < 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Rerun the regression and descriptives with the outlier excluded; saves COO_2 and SDR_2.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT men
  /METHOD=ENTER mail page
  /PARTIALPLOT ALL
  /SAVE COOK SDRESID.

FREQUENCIES VARIABLES=men mail page
  /FORMAT=NOTABLE
  /STATISTICS=STDDEV MINIMUM MAXIMUM SEMEAN MEAN MEDIAN SKEWNESS SESKEW KURTOSIS SEKURT
  /ORDER=ANALYSIS.

GRAPH
  /SCATTERPLOT(BIVAR)=COO_2 WITH SDR_2
  /MISSING=LISTWISE.

* Stricter filter: keep only cases with both Cook's distances below .1.
USE ALL.
COMPUTE filter_$=(COO_1 < .1 & COO_2 < .1).
VARIABLE LABELS filter_$ 'COO_1 < .1 & COO_2 < .1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Split the data into the first and last five years, then run the three-IV regression for each split.
RECODE YEAR_ (MISSING=SYSMIS) (1989 thru 1993=0) (1994 thru 1998=1) INTO year_split.
VARIABLE LABELS year_split 'first/last 5 yrs'.
EXECUTE.
SORT CASES BY year_split.
SPLIT FILE SEPARATE BY year_split.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT men
  /METHOD=ENTER mail page phone
  /PARTIALPLOT ALL
  /SAVE COOK SDRESID.
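
For readers without SPSS, the RECODE step above maps onto a small Python function. This is a hedged equivalent rather than part of the original analysis: `year_split` here is a hypothetical function named after the SPSS variable, and returning `None` for unlisted years mirrors the way a new RECODE ... INTO target stays system-missing for values no rule covers.

```python
def year_split(year):
    """Return 0 for 1989-1993, 1 for 1994-1998, and None (missing) otherwise."""
    if year is None:
        return None  # (MISSING=SYSMIS): missing input stays missing
    if 1989 <= year <= 1993:
        return 0     # first five years
    if 1994 <= year <= 1998:
        return 1     # last five years
    return None      # year outside both ranges: left missing


# Recode every year covered by the data set and count cases per split.
years = list(range(1989, 1999))
codes = [year_split(y) for y in years]

print(dict(zip(years, codes)))
print("years in first split:", codes.count(0))
print("years in last split:", codes.count(1))
```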
