Abstract
We introduce a multivariate binomial logit model measuring cross-category dependence and sales promotion effects of a retail
assortment. This model requires as data both the market baskets of individual shoppers and the categories currently promoted in
a retail outlet. A separate section describes the stepwise procedure used to estimate the parameters of this model. Its application is
demonstrated by analyzing 6147 purchases recorded in a medium-sized supermarket. Finally, we discuss the managerial
relevance of this model for sales promotion decisions of retail firms. © 1999 Elsevier Science Ltd. All rights reserved.
0969-6989/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved.
PII: S0969-6989(98)00026-5
100 H. Hruschka et al. / Journal of Retailing and Consumer Services 6 (1999) 99—105
model specifications without independent variables. Manchanda et al. (1997) analyse multi-category purchases in four categories (laundry detergents, fabric softeners, cake mix and cake frosting) using a multivariate probit model. They obtain significant complementary price effects between laundry detergents and fabric softeners, and between cake mix and cake frosting.

In the next section we lay out the multivariate logit model. This is followed by a section on the estimation method. Then results of an empirical study are presented. We conclude with a discussion of the managerial relevance of this model for sales promotion decisions of retail firms.

2. Multivariate logit model

We extend the model of Hruschka (1991) by including cross-category sales promotion effects that influence purchase probabilities. Purchases $Y_i$ ($i = 1, \ldots, I$) and sales promotions $X_i$ ($i = 1, \ldots, I$) in $I$ product categories are binary variables. We assume that promotion of category $i$ may influence purchases of category $i$ via its main effect, as well as joint purchases with other categories $j \neq i$ via interaction parameters.

We start from the loglinear model for the joint purchase probabilities $P(Y_1, \ldots, Y_I)$:

$$\ln P(Y_1, \ldots, Y_I) = a_0 + \sum_{i=1}^{I} (a_i + b_i X_i)\, Y_i + \sum_{i=1}^{I-1} \sum_{j=i+1}^{I} (a_{ij} + b_{iji} X_i + b_{ijj} X_j)\, Y_i Y_j, \tag{1}$$

where $a_i$ is the main effect of category $i$ (the change in the log expected joint probabilities caused by a purchase of category $i$) and $a_{ij}$ is the first-order interaction between the two categories $i$ and $j$. Interactions measure the deviation of the log observed joint probabilities from the log expected joint probabilities if only main effects are considered.

The model includes interactions between pairs of categories (first-order interactions) and neglects higher-order interactions (e.g. between triples of categories). This may be justified by the large number of variables (categories) in retail applications and the better interpretability of such a simplified model. A similar approach is taken in conjoint-analysis models (e.g. Green et al., 1989).

Omitting a first-order interaction $a_{ij}$ gives a model in which the purchase of category $i$ is conditionally independent of the purchase of category $j$ given purchases of other categories. By leaving out any $a_{ik}$ with $k \neq j$ as well, one obtains a model according to which the purchase of both categories $i$ and $k$ is independent of the purchase of category $j$. If all interactions $a_{ij}$ with $j \neq i$ are excluded, we arrive at total independence of purchases of category $i$ from purchases of the rest of the assortment.

$b_i$ denotes the effect of a sales promotion of category $i$ on the main effect of the same category, and $b_{iji}$ the effect of a sales promotion of category $i$ on the interaction of categories $i$ and $j$.

Conditional probabilities of purchases of category $i$, given purchases of other related categories (those with $a_{ij} \neq 0$, collected in the index set $Z_i$) and sales promotions $X = (X_1, \ldots, X_I)'$, are derived from the loglinear model as

$$P_{i \mid Z_i, X} = \frac{1}{1 + \exp\left(-\left(a^i + b^i X_i + \sum_{j \in Z_i} (a^i_j + b^i_{ji} X_i + b^i_{jj} X_j)\, Y_j\right)\right)}$$

with

$$a^i_j = a_{ij}, \quad b^i_{ji} = b_{iji}, \quad b^i_{jj} = b_{ijj}. \tag{2}$$

This model consists of one binomial logit equation for each of the categories considered. It is a multivariate binomial logit model in the terminology of Nerlove and Press (1973).

The relationship of the multivariate logit formulation to the loglinear model implies cross-equation parameter restrictions of the following type (Maddala, 1987):

$$a^i_j = a^j_i = a_{ij}. \tag{3}$$

These restrictions amount to equality conditions for first-order interactions. The coefficient of category $j$ in the equation of category $i$, $a^i_j$, equals the coefficient of category $i$ in the equation of category $j$, $a^j_i$.

We call two categories purchase complements (substitutes) if joint purchases are more (less) frequent than in the case of stochastic independence (Mulhern and Leone, 1991). Our definition is based on product interdependencies in terms of customers' purchases (Betancourt and Gautschi, 1990).

A parameter $a^i_j$ greater (less) than zero indicates that both categories are complements (substitutes). A parameter equal to zero shows that they are independent. To be more specific, category $i$ is conditionally independent of those categories $j \neq i$ that have no interaction parameter different from zero in the logit equation of category $i$. If the logit equation of category $i$ has no interaction parameters at all, its purchases are totally independent of purchases of the rest of the assortment.

We distinguish the following types of effects of a sales promotion of category $i$:

• more purchases of the same category ($b^i > 0$);
• fewer joint purchases of categories $i$ and $j$ ($b^i_{ji} < 0$);
• more joint purchases of categories $i$ and $j$ ($b^i_{ji} > 0$).

In the extreme, a promotion may turn two categories into purchase complements whose joint purchases are stochastically independent without promotion.
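As a minimal sketch of the binomial logit equation in Eq. (2), the following Python function evaluates the conditional purchase probability of one category given the promotion indicators and the related categories in the basket. The function name, the tuple layout and all numeric values are illustrative assumptions, not estimates or code from the study.

```python
import math
from typing import Iterable, Tuple

# One entry per related category j in the index set Z_i:
# (a_ij, b_iji, b_ijj, x_j, y_j). This tuple layout is a hypothetical choice.
Related = Tuple[float, float, float, int, int]


def purchase_probability(a_i: float, b_i: float, x_i: int,
                         related: Iterable[Related]) -> float:
    """Conditional purchase probability of category i following Eq. (2)."""
    # Main effect plus the effect of promoting category i itself.
    logit = a_i + b_i * x_i
    # An interaction term only contributes if category j is in the basket (y_j = 1).
    for a_ij, b_iji, b_ijj, x_j, y_j in related:
        logit += (a_ij + b_iji * x_i + b_ijj * x_j) * y_j
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical parameter values, chosen only for illustration.
basket = [(0.5, 0.3, 0.0, 0, 1)]  # one related category, purchased, not promoted
p_plain = purchase_probability(-1.0, 0.2, 0, basket)
p_promo = purchase_probability(-1.0, 0.2, 1, basket)
print(f"without promotion: {p_plain:.3f}, with promotion: {p_promo:.3f}")
```

Passing an empty list for `related` corresponds to a category whose logit equation has no interaction parameters, i.e. one that is independent of the rest of the assortment.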
3. Estimation of the multivariate logit model

Estimation of the multivariate logit model proceeds in the following way:

1. Basic multivariate logit model:
   • determination of significant cross effects for all pairs of categories;
   • specification of the multivariate logit model by combining interaction parameters corresponding to all significant cross effects and all main effect parameters;
   • single equation estimation of the multivariate logit model;
   • stepwise elimination of interaction parameters;
   • estimation of the multivariate model for all categories taking cross-equation equality restrictions into account.
2. Introduction of the additional parameters of the extended model.
3. Stepwise elimination of additional parameters.

Single logit equations for each category are estimated by generalized least squares. To include the parameter restrictions, the whole system of binomial logit equations is formulated as one multivariate nonlinear regression model. Parameter estimates (or variance-weighted averages) obtained in the single equation step serve as initial values for the multivariate nonlinear least squares estimation (Gallant, 1987).

Because of the large number of categories, the number of possible model specifications is vast even when restricted to constant terms and first-order interactions, which forces us to use a coarse model search heuristic. In each of several steps, the least significant parameter is selected as a candidate for elimination from the model. It is actually eliminated if the normed fit index of the unrestricted model (with this parameter included) compared to the restricted model (without this parameter) is less than 0.02. Stepwise elimination stops when all parameters are significant at $\alpha = 0.05$.

The normed fit index gives the relative improvement of the sum of weighted squared errors of an unrestricted model ($SSE_U$) compared to a restricted model ($SSE_R$):

$$\frac{SSE_R - SSE_U}{SSE_R}. \tag{4}$$

In the case of perfect fit the normed fit index assumes the value one.

4. Empirical study

The empirical study is based on a data set consisting of 6147 purchases acquired in a medium-sized supermarket of the same retail chain. Usual scanner data were read in, transformed to and stored as market basket data by special software installed in the data-processing center of the retail chain. The purchases in the data set occurred on four successive Saturdays. As frequencies for individual items as a rule become very low, we analyze data at the category level. In agreement with the classification scheme of the retail chain, 150 categories are distinguished.

Results demonstrate that dependence exists for 73 categories. Only 4.9% of the pairs formed by these 73 categories have significant interaction parameters. The purchases of certain categories are totally independent of purchases of the remaining categories. Table 1 shows those isolated categories that attain at least 100 purchases in the data set, together with their main effects.

Using interaction parameters of the multivariate binomial model one can identify clusters consisting of more than one category that are independent of other parts of the assortment (Fig. 1 contains an MDS map computed on the basis of interaction parameters). These clusters are:

• detergents and related products;
• household cleansers, other cleansers;
• tobacco, cigars, cigarette paper;
• tropical fruit, frozen fruit;
• baby related products (food, care, hygienic);
• red wine, white wine;
• beer, water.

The categories most frequently bought are:

bread (1078 purchases);
fruit (1050 purchases);
vegetables (846 purchases);
yoghurt (782 purchases);
journals (713 purchases);
milk (705 purchases).

Table 2 contains main effects and interaction parameters of the logit equations of these categories. Bread has the strongest interactions with cut cheese and fruit or vegetables, fruit with vegetables and yoghurt, vegetables with fruit and milk, yoghurt with milk and fruit, milk with yoghurt and vegetables. Journals only interact with bread.

Table 1
Independent categories

Category                            Purchases   Main effect a^i
Champagne                           116         -1.398
Cigarettes                          443         -0.796
Frozen potato and flour products    108         -1.398
Frozen poultry                      157         -1.301
Gifts                               193         -1.222
Office articles                     100         -1.523
Rolls                               265         -1.046
Snacks                              251         -1.046
Soft drinks                         555         -0.770
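The elimination rule of Section 3 can be sketched in a few lines of Python. The snippet computes the normed fit index of Eq. (4) and applies the 0.02 threshold stated in the text. The function names and the sums of squared errors in the example are hypothetical, and the significance test that selects the candidate parameter is not reproduced here.

```python
def normed_fit_index(sse_restricted: float, sse_unrestricted: float) -> float:
    """Relative improvement of the unrestricted over the restricted model, Eq. (4)."""
    if sse_restricted <= 0.0:
        raise ValueError("sse_restricted must be positive")
    return (sse_restricted - sse_unrestricted) / sse_restricted


def should_eliminate(sse_restricted: float, sse_unrestricted: float,
                     threshold: float = 0.02) -> bool:
    """Drop the candidate parameter if keeping it improves fit by less than the threshold."""
    return normed_fit_index(sse_restricted, sse_unrestricted) < threshold


# Hypothetical sums of weighted squared errors, for illustration only.
print(should_eliminate(sse_restricted=100.0, sse_unrestricted=99.0))  # small gain from the parameter
print(should_eliminate(sse_restricted=100.0, sse_unrestricted=90.0))  # larger gain from the parameter
```

Here the restricted model is the one without the candidate parameter, so a small index means the parameter contributes little to the fit.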
On the whole, results confirm expectations that most categories of a retail assortment are complements as they allow customers to do one-stop shopping (Betancourt and Gautschi, 1990). Almost all of the interactions discovered comprise categories that are complements. The only substitutes found are cigars and tobacco as well as cigars and cigarette paper. This may be explained by the fact that these categories are restricted to basically the same consumption activity.

The logit model is computationally more efficient than the data mining approach of Agrawal and Srikant (1994) mentioned above. For small simulation problems with 20 categories, the latter leads to computing times that are much higher than those necessary for the multivariate binomial logit model when applied to a real-world data set.

At most 47 categories are promoted per week. After eliminating 26 categories that are promoted in every week and categories with very low purchase frequencies, 28 promoted categories remain. For the following categories promotion does not change their own main effect ($b^i = 0$), i.e. the purchase frequency does not significantly increase (or decrease) if a category is promoted:

• deli;
• spread;
• sweets;
• vermouth and dessert wine;
• dog food;
• toilet tissue;
• red wine;
• dishwashing detergents;
• electric appliances.

It may be possible that promotion in these categories only leads to switching of customers within a category. Of course, to study this hypothesis one has to use brand-specific data.

For salt & garlic as well as cat food the multivariate logit model indicates more purchases if these categories are promoted (i.e. higher main effects because of $b^i$ equal
to 0.176 and 0.230, respectively), but no effects on interactions (i.e. the interaction parameters are the same as without promotion, $b^i_{ji} = 0$). In particular, salt & garlic is a category only weakly related to consumption activities. Promoting it therefore does not change consumption patterns of other categories.

Table 2
Main and interaction effects

For some of the other categories effects of promotion on interactions can be confirmed. Promotion increases complementary relationships in some categories (parameters $b^i_{ji}$ are shown in parentheses):

canned vegetables → pasta (0.236);
canned vegetables → soups & sauces (0.251);
dried fruit → baking products (0.338);
hair care → hygienic products (0.021);
soups & sauces → canned vegetables (0.206).

Most of these category pairs are complements with regard to consumption activities. Note that promotion of canned vegetables increases the complementary relationship with soups & sauces, while promotion of soups & sauces in turn increases the complementary relationship with canned vegetables.

Promotion in some categories leads to complementary relationships with certain other categories, i.e. categories $i$ and $j$ are only related if $i$ is promoted:

rice & legume → milk (0.318);
soups & sauces → baking products (0.223).

The multivariate logit model demonstrates that for some categories promotion decreases the joint purchase probability compared to the situation without promotion:

flour → baking products (-0.276);
flour → fat & oil (-0.276);
bread → fruit (-0.375);
bread → milk (-0.108);
deli → canned fish (-0.046);
exotic fruit → frozen fruit (-0.495);
fruit → bread (-0.215);
hygienic tissue → dental care (-0.260).

Accelerated purchases (e.g. of flour or tissue), reduction of consumption activities including the categories on the right-hand side (e.g. fruit or bread) or substitution in consumption activities (e.g. of frozen by exotic fruit) caused by promotion may be responsible for these results.
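To give a feel for the size of these effects, the sketch below converts a few of the reported promotion interaction parameters into odds multipliers. It assumes the logit form of the model: when category i is promoted and category j is in the basket, the logit of buying i shifts by the interaction parameter, so the conditional odds ratio between the two categories is multiplied by the exponential of that parameter. The parameter values are those quoted above; the dictionary layout and function name are hypothetical.

```python
import math

# Promotion interaction parameters quoted in the text:
# (promoted category i, related category j) -> parameter value.
promotion_effects = {
    ("rice & legume", "milk"): 0.318,
    ("dried fruit", "baking products"): 0.338,
    ("bread", "fruit"): -0.375,
    ("exotic fruit", "frozen fruit"): -0.495,
}


def odds_multiplier(b: float) -> float:
    """Factor by which promotion of i scales the odds ratio between i and j."""
    return math.exp(b)


for (promoted, related), b in promotion_effects.items():
    print(f"{promoted} -> {related}: b = {b:+.3f}, odds multiplier = {odds_multiplier(b):.3f}")
```

A multiplier above one corresponds to a stronger complementary relationship under promotion, a multiplier below one to a weaker one.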
store performance. International Review of Retail, Distribution and Consumer Research, 351–379.
Gallant, A.R., 1987. Nonlinear Statistical Models. Wiley, New York.
Green, P.E., Krieger, A.M., Zelnio, R.N., 1989. A componential segmentation model with optimal product design features. Decision Sciences 20, 220–238.
Hruschka, H., 1991. Bestimmung der Kaufverbundenheit mit Hilfe eines probabilistischen Meßmodells. Zeitschrift für betriebswirtschaftliche Forschung 43, 418–434.
Julander, C.-R., 1992. Basket analysis. A new way of analysing scanner data. International Journal of Retail & Distribution Management 20, 10–18.
Maddala, G.S., 1987. Limited-Dependent and Qualitative Variables in Econometrics. Cambridge University Press, Cambridge.
Manchanda, P., Ansari, A., Gupta, S., 1997. The shopping basket: A model for multi-category purchase incidence decisions. Working Paper, Columbia University, New York.
Mulhern, F.J., Leone, R.P., 1991. Implicit price bundling of retail products: a multiproduct approach to maximizing store profitability. Journal of Marketing 55, 63–76.
Nerlove, M., Press, J., 1973. Univariate and Multivariate Log Linear and Logistic Models. RAND Report.
Schmalen, H., Pechtl, H., 1995. Die Absatzwirkung von Sonderangebotsaktionen im Lebensmitteleinzelhandel. Zeitschrift für Betriebswirtschaft 65, 587–607.
Walters, R.G., 1991. Assessing the impact of retail promotions on product substitution, complementary purchase, and inter-store sales displacement. Journal of Marketing 55, 17–28.