Haemoon Oh
and
Sara C. Parks
The Pennsylvania State University
ABSTRACT
There is a desperate need for new research that will advance customer
satisfaction (CS) and service quality (SQ) methodologies in the hospitality industry.
This comprehensive review of the theories and methodologies reported in CS and
SQ studies cited in the hospitality literature provides suggestions for future CS and
SQ research in the hospitality field. First, the theoretical and methodological issues
are critically reviewed. Next, major developments in CS and SQ research method-
ologies are discussed. The concept of importance and its role in behavioral models
are included as they have been recently applied in hospitality CS and SQ research.
The final section of this study is devoted to developing and proposing new directions
for future CS and SQ research in the hospitality industry.
Key words: customer satisfaction, service quality, expectation, performance,
disconfirmation, behavioral intention, importance
INTRODUCTION
Today, companies within and outside the hospitality industry are striving to deliver
not only their products and services but also high "quality" and "satisfaction" that will
lead to increased brand loyalty and market share. The importance of customer
satisfaction (CS) and its relationship with service quality (SQ), occupancy rate and
profitability has long been exhorted by both management experts and researchers in
the hospitality field (Brewton, 1990; Edwards, 1992; Florida Hotel & Motel Journal,
1989; Greger and Withiam, 1991; Hirst, 1992; Kirwin, 1992; Knutson, 1988; Ravenel,
1992; Shifflet, 1989; Walker, 1988; Withiam, 1991; Wolff, 1992). CS and SQ frequently
top the list of the most important issues that must be addressed by hospitality marketers
(HR Focus, 1992 a, b). These concerns for measuring CS and SQ in the hospitality
industry have been precipitated by the need to position firms competitively in the
marketplace.
Responding to the growing demands for more reliable ways to measure CS and
SQ, several hospitality researchers have recently attempted to introduce theoretical
and methodological frameworks for measuring CS and SQ in hotel services (Barsky,
1992 a, b; Barsky and Labagh, 1992; Getty and Thompson, 1994; Saleh and Ryan,
1991), restaurant services (Dube, Renaghan and Miller, 1994), and international travel
(Pizam and Milman, 1993). Most of these studies employed theories and methods
grounded in the well-developed domains of research in product marketing; however,
they did not address the theoretical and methodological issues raised in consumer
behavior literature. Although these recent efforts in hospitality research have made
substantive contributions to understanding hospitality customers' behavior, more
rigorous theoretical and methodological treatments are needed to advance the
underdeveloped pedagogy of hospitality CS and SQ research. Therefore, a critical
review of the CS and SQ literature is valuable at this incipient stage of developing
hospitality-specific CS and SQ paradigms.
The primary purpose of this paper is to review critically the main issues in CS and
SQ research and their applications for the hospitality industry. Specifically, this paper
will: 1) provide a critical review of theoretical and methodological issues revealed in the
CS and SQ literature; 2) review the major developments in CS and SQ research; 3)
discuss the concept of importance and its role in CS and SQ models; 4) develop specific
suggestions to assist hospitality researchers in designing more robust CS and SQ
studies; and 5) offer several directions for future hospitality CS and SQ research. In
achieving these objectives, this study considers CS and SQ research together.
Because these two paradigms share considerable areas of theoretical and method-
ological overlap, they can be meaningfully compared to each other.
Olshavsky and Krishnan, 1994; Spreng and Mackay, 1994; Westbrook, 1987;
Westbrook and Oliver, 1991).
In an attempt to understand CS better, most CS researchers have focused on the
cognitive consumption processes (Churchill and Surprenant, 1982). However, it is
clear that CS may be more than a simple cognitive evaluative process. Rather, it is
probably a complex human process involving extensive cognitive, affective and other
undiscovered psychological and physiological dynamics. Considering the recent
movement towards broadening the definition of CS, it is desirable that future research
measures satisfaction more broadly in order to reflect the constant interplay of cognition
and emotion in processing external stimuli (i.e., products and services).
Similarly, the SQ literature also reflects the movement of measuring customer
perceptions as an indication of quality. Parasuraman, Zeithaml and Berry (1985)
developed the GAPS model, which is widely accepted in today's service industry. By
subsequently proposing a measurement model (SERVQUAL) for SQ within the GAPS
framework, Parasuraman, Zeithaml and Berry (1988) defined SQ as "the degree and
direction of discrepancy between consumers' perceptions and expectations." Their
model measures SQ as the gap between a customer's expectations and the percep-
tions of what is actually delivered. The model is built on the assumption that the smaller
the gap, the better the quality of service provided.
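The gap computation described above can be sketched in a few lines of Python. This is a minimal illustration and not the SERVQUAL instrument itself: the function name `gap_scores`, the four sample items and the 7-point ratings are assumptions made for the example.

```python
from statistics import mean


def gap_scores(perceptions, expectations):
    """Return item-level gap scores (perception minus expectation)."""
    if len(perceptions) != len(expectations):
        raise ValueError("each item needs both a perception and an expectation rating")
    return [p - e for p, e in zip(perceptions, expectations)]


# Hypothetical ratings from one customer on four items (7-point scale).
expectations = [7, 6, 6, 5]
perceptions = [5, 6, 4, 6]

gaps = gap_scores(perceptions, expectations)
overall_gap = mean(gaps)

print(gaps)         # [-2, 0, -2, 1]
print(overall_gap)  # -0.75
```

A negative score means that perceived service fell short of what the customer expected on that item, and a score of zero means that the expectation was exactly met.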
Although Parasuraman, Zeithaml and Berry attempted to lay a distinct framework
for research on SQ, their definition of SQ happened to be similar to that of CS. This
definitional overlap has caused mixed conceptualization and interpretations of CS and
SQ among researchers. Therefore, in addition to methodological differentiation, a
stand-alone conceptual framework is needed in order to pursue an independent
research tradition of SQ.
Most of the other theories listed above have often been applied within the
expectancy-disconfirmation framework. For example, several researchers tested
assimilation theory by examining how performance perceptions were assimilated
towards prior expectations in product purchase situations (Cardozo, 1965; Oliver,
1977; Olshavsky and Miller, 1972; Olson and Dover, 1979). Also, the expectancy-
disconfirmation model has served as the basic research framework in studies of the
effects of contrast. Anderson (1973) and Cohen and Goldberg (1970) interpreted the
tendency of exaggeration for disparity between expectations and perceptions as a
typical example of contrasting effects. Nonetheless, it should be noted that other
researchers have argued that value-precept theory replace expectancy-disconfirmation
theory because it was believed to be more valid and reliable (Locke, 1967).
With the exception of expectancy-disconfirmation, most CS theories define the CS
judgment process too narrowly. Consequently, they have received applications and
testing in a laboratory setting where the CS process was tightly controlled, situation-
specific, and individually focused. This is well-evidenced by their frequent applications
nested in the more broadly conceptualized expectancy-disconfirmation theory. There-
fore, when applied in a field survey research study, the hypothesized effects may fail
to be demonstrated, leading to a hasty rejection of the research hypotheses. Further
research is needed to define under what situations the theorized effects are most likely
to occur. Also, researchers might want to pose these theoretical effects as a null
framework for their investigation of the CS and SQ processes.
Despite the problems with CS theories, many of them, when applied appropriately,
possess strong potential for applications in service consumption situations. For
example, hospitality customers engaged in service encounters may be very sensitive
to the fairness of the transaction, thereby demonstrating the effects of equity theory. It
is also possible that the customers develop satisfaction or dissatisfaction by attributing
their good or bad service experience to either themselves or the other parties involved
in the service transactions (i.e., the effects of attribution theory). Comparison-level
theory also holds importance when applied to the situation in which customers are
heavily involved in comparing multiple brands in the decision-making processes.
Finally, value-precept theory, if fully developed, may also provide cogent explanations
of the CS and SQ processes.
In order to advance CS research in the lodging context, Barsky (1992) and Barsky
and Labagh (1992) introduced the expectancy-disconfirmation paradigm into lodging
research. Basically, the proposed model in these studies was that CS was the function
of disconfirmation, measured by nine "expectations met" (EM measures) that were
weighted by attribute-specific importance. The model was tested with data collected
from 100 random subjects via guest comment cards. As a result, CS was found to be
correlated with a customer's willingness to return.
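Multiplicative importance weighting of this kind can be sketched as follows. The sketch only illustrates the general idea. The function name, the three sample attributes, the ratings and the decision to sum the weighted terms are all assumptions, and none of them is taken from the scoring procedure reported by Barsky (1992).

```python
def weighted_disconfirmation(em_scores, importance):
    """Sum of 'expectations met' ratings, each multiplied by its importance weight."""
    if len(em_scores) != len(importance):
        raise ValueError("each attribute needs an EM rating and an importance weight")
    return sum(em * weight for em, weight in zip(em_scores, importance))


# Hypothetical ratings for three attributes (assumed 4-point scales).
em_scores = [3, 4, 2]     # how well expectations were met
importance = [4, 1, 3]    # how important each attribute is to the guest

score = weighted_disconfirmation(em_scores, importance)
print(score)  # 22
```

Because each term is a product, an attribute rated low on importance contributes little to the total even when its expectations were fully met.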
Although Barsky (1992) added to an extant research base by attempting to
formulate a modified (i.e., weighted) disconfirmation model for lodging services, a close
look at the proposed model, weighting procedures and hypothesis testing procedures
suggests a number of methodological problems. For example, the null hypothesis
posited in this study is not the one actually tested by the employed Chi-square. That is,
the Chi-square test does not assume a negative relationship between the two variables
in its null hypothesis. Rather, its null hypothesis is that the two variables are unrelated
(independent), and the test detects any association between them, whether positive or
negative. Also, multiplicative weighting assumes the two factor variables (i.e., EM and
importance measures) to be continuous, normally distributed variables. However, the two
variables were later treated as nonparametric variables through a modified
weighting scheme. The employed weighting method therefore risks biasing
the results.
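The point about the Chi-square null hypothesis can be illustrated with a short plain-Python sketch of the Pearson Chi-square statistic. The contingency tables and the function name are hypothetical. Because the statistic sums squared deviations from the counts expected under independence, it carries no sign and cannot distinguish a positive from a negative relationship.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a two-way contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = satisfied / dissatisfied,
# columns = would return / would not return.
positive_association = [[30, 10], [10, 30]]
negative_association = [[10, 30], [30, 10]]

print(chi_square_statistic(positive_association))  # 20.0
print(chi_square_statistic(negative_association))  # 20.0
```

Both tables depart from independence by the same amount in opposite directions, and the statistic is identical for the two.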
Pizam and Milman (1993) utilized Oliver's (1980 a, b, c; 1981) expectancy-
disconfirmation model to improve the predictive power of travelers' satisfaction. They
introduced the basic dynamic nature of the disconfirmation model into hospitality
research, while testing part of the original model in a modified form. In order to assess
the causal relationship between two different disconfirmation methods, they employed
a regression model with a single "expectation-met" measure as the dependent variable,
and 21 difference-score measures (performance minus expectation as in Parasuraman,
Zeithaml and Berry's [1988] definition of SQ) as the independent variables. Thus,
Pizam and Milman's regression model can be written in CS research terms as
"subjective disconfirmation = f(objective disconfirmation)," which can be rewritten as
"subjective disconfirmation = f(SQ)." Here, the CS and SQ paradigms were mixed,
without supportive justification, which resulted in an equation contradictory to that
agreed upon by other researchers (Parasuraman, Zeithaml and Berry, 1994; Teas,
1993 a, b; 1994). No explanation was provided for this modification in the model
specifications or the departure from earlier CS/SQ methodologies. Their model tested
only the reliability of the two different methodological operationalizations for the same
concept.
Some research efforts on CS are also notable in tourism behavior. For example,
Pizam, Neumann and Reichel (1978) investigated the factor structure of tourists'
satisfaction with their destination areas. The authors showed eight distinguishable
dimensions of tourist satisfaction. In addition, by applying a series of Sirgy's (1982,
1983, 1984, 1985, 1987) research on a social cognition and self-image model, Chon
(1989, 1992 a, b) attempted to show how the congruities and incongruities between
the tourists' self-image and their perceived destination image relate to tourist
satisfaction with the destination. Sirgy's conceptualization of social cognition and
self-image was supported in tourists' destination behaviors. For interested readers,
questionable, because they obtained the P-E difference scores based on factor
structure rather than item levels (Parasuraman, Zeithaml and Berry, 1988).
Finally, careful assessment must be made of the source of Getty and Thompson's
(1994) factor structure for future lodging research. Customers' perceived performance
is likely to be a state, rather than trait, variable. Therefore, perceived performance may
not be a psychometric property that is enduring in customers' minds because it depends
heavily on the company's performance per se. Rather, perceived performance could
be simple performance patterns resulting in some company-specific partitioning of
performance scores, and not factors generic to customer wants and desires. Moreover,
the results have limited generalizability because the study sample used to generate the
factor structures was composed of students. Therefore, Getty and Thompson's (1994)
results may not serve broadly as the basis for future lodging research.
Study Design
One of the most critical issues in CS and SQ research is the nature of the study
setting used to model the two constructs. The debate focuses on two data generation
methods: experimental designs and field surveys. Although experiments were a
popular study design among many researchers (Bolton and Drew, 1991; Cardozo,
1965; Churchill and Surprenant, 1982; Tse and Wilton, 1988), other researchers
continued to prefer field survey approaches (Bearden and Teel, 1983; Oliver and Swan,
1989; Westbrook and Newman, 1978). Most notably, Bolton and Drew (1991)
conducted a field experiment by manipulating the quality of telecommunication
services in selected areas in order to measure subscribers' satisfaction with the
telephone companies. Although both experimental design and field survey methods
have respective strengths and limitations in producing generalizable study results, the
results of studies using either of the methods have been generally accepted without any
critical considerations of the employed study designs.
In general, most of the experimental studies have attempted to test the validity of
the employed conceptual theories (Anderson, 1973; Oliver and DeSarbo, 1988), while
in the field survey CS studies the size and directionality of causal influences among
model constructs have been the major concerns (Swan and Trawick, 1981; Oliver,
1980 a). Due to the methodological limitations, experimental studies have investigated
almost exclusively the short-term aspects of the CS processes. However, both the
short-term (Cadotte, Woodruff and Jenkins, 1987; Swan and Trawick, 1981) and the
longitudinal (Bearden and Teel, 1983; Oliver, 1980 a; LaBarbera and Mazursky, 1983)
aspects of the CS processes have received equivalent attention in field survey
approaches.
Study design
  Experiments: Cardozo (1965), Cohen & Goldberg (1970), Olshavsky & Miller (1972), Anderson (1973), Churchill & Surprenant (1982), Oliver & DeSarbo (1988), Tse & Wilton (1988), Bolton & Drew (1991). Bolton & Drew's is a field experiment. Focus: conceptual theory testing and short-term CS process.
  Field survey: Westbrook & Newman (1978), Oliver (1980 a), Swan & Trawick (1981), Bearden & Teel (1983), Cadotte et al. (1983), Oliver & Swan (1987), Halstead (1989). Focus: size and direction of causal forces, longitudinal and market-level CS.
Performance omitted: Oliver (1980 a), Bearden & Teel (1983), Oliver & Swan (1989). Disconfirmation was measured.
Both performance and disconfirmation: Swan & Trawick (1981), Churchill & Surprenant (1982), Oliver & DeSarbo (1988), Bolton & Drew (1991). Expectations were measured.
Affect-belief scales: Olson & Dover (1979), Oliver (1980 a), Oliver (1981). Not at all likely - very likely.
7-point scales: Bearden & Teel (1983), PZB (1988, 1991), Cronin & Taylor (1992). Likely-unlikely bipolar; agree-disagree Likert.
7-point: Bearden & Teel (1983), Swan & Trawick (1981). Oliver's and Yes-No.
The field survey approach has been a dominant study design for CS research in
the hospitality field. For example, Barsky (1992 b) and Barsky and Labagh (1992)
studied customers staying at a local hotel, while Pizam and Milman (1993)
collected data from travelers to Spain by using a "before and after" measurement
design. Traditionally, experimental studies have not been widely used in hospitality
research. Perhaps one of the major reasons for this is that hospitality services, due to
their multifunctional nature, can be overly sensitive to extraneous variables that cannot
easily be controlled in typical laboratory experimental designs. In the future, it is likely
that the field survey approach will remain a popular mode of CS research in both product
and services marketing research, including the hospitality industry. However, there
is a continuous need for experiments that can help to develop and refine theories.
Sampling
  Convenience student sample: Cardozo (1965), Cohen & Goldberg (1970), Olshavsky & Miller (1972), Anderson (1973), Oliver & DeSarbo (1988). Size varies from 40 to 144 subjects.
  Systematic resident sample: Westbrook & Newman (1978), Oliver (1980 b), Bolton & Drew (1991). 119 - 39'4 subjects; 3 waves of sampling.
  Random: Cadotte et al. (1987), Oliver & Swan (1989), PZB (1988). 120 - 415 (2-stage sample).
  Panel: Swan & Trawick (1981), Bearden & Teel (1983), LaBarbera & Mazursky (1983). 87 - 749 (sampled in waves); longitudinal studies.
Another critical issue with hospitality CS research design is that most studies have not
investigated the dynamic nature of the CS processes. The majority of studies have focused
exclusively on transaction-specific CS processes without incorporating the potential effects
of the long-term CS elements, such as attitude changes. Thus, the results of the
transaction-specific CS studies, which did not provide experimental control over the effects of customers'
attitudes towards the focal brands, have often exposed validity problems. The long-term
aspect of CS processes should be considered even in designing a CS study with a
transaction-specific nature. This view of CS processes is particularly worthwhile in the
hospitality industry, which thrives on customers' repeat visits over an extended period of time.
Study design in SQ research has been consistent in employing the fixed format
of the original SERVQUAL model suggested by Parasuraman, Zeithaml and Berry
(1988). Most of the SQ studies, including those in the hospitality field, were based on
field surveys (Saleh and Ryan, 1991; Bojanic and Rosen, 1994). In particular,
hospitality SQ studies were conducted mainly in replication of the SERVQUAL model.
However, SQ researchers should expand their study design to the longitudinal aspect of SQ,
both as an indicator of company performance and as a predictor of consumer behavior.
The measure of attribute presence has been widely used in SQ research, whereas
both presence and likelihood measures are frequently employed by CS researchers.
Although this distinction is possible in measurement practices, it is not clear whether
customers really distinguish these two measures in expressing their expectations. It is
more likely that customers consider the two components simultaneously, because the
likelihood notion already captures the presence concept. Moreover, this distinction is
particularly difficult when applied to hospitality services that contain numerous intan-
gible components. Therefore, the likelihood measure seems to be more appropriate for
measuring customers' expectations towards hospitality services.
The recommended scale for belief measures consists of a continuum with two
ends of the scale anchored as "not at all likely" and "very likely." The affective dimension
of each attribute is to be measured on scales of like-dislike, good-bad, desirable-
undesirable or attractive-unattractive. Further, Cadotte, Woodruff and Jenkins (1987)
suggested a 5-point bipolar scale to measure beliefs related to normative product and
brand expectations. Similarly, a uniform "should" (normative) expectation has been
measured in all SQ research.
Despite these suggestions, however, researchers have used different scales to
measure expectations. Halstead (1989) used a 4-point scale ranging from "definitely
would not expect" to "definitely would expect." Both 6-point and 7-point scales were
used to measure attitudinal beliefs in Swan and Trawick's study (1981). Another
example is Bearden and Teel (1983), who used a 7-point bipolar scale ranging from
"likely" to "unlikely" to measure perceived expectations. In hospitality CS research,
Pizam and Milman (1993) employed 5-point scales, while Barsky (1992 a, b) adopted
a 4-point scale with different measure-specific wording.
In general, SQ research advocates the use of a 7-point normative, unipolar Likert
scale ranging from "strongly agree" to "strongly disagree." Although diverse measure-
ment methods have been used to measure expectation, 5- to 7-point interval scales
seem to be appropriate for measuring expectation.
expected to wait in line for more than ten minutes in a restaurant. However, suppose
that the customer was treated nicely while waiting in line for 15 minutes and that the
dining experience was much better than she had initially expected. It is doubtful that the
customer would develop dissatisfaction regarding the 15-minute waiting period, which
was not her original expectation. Instead, it is likely that the customer would develop
more tolerance for waiting in line, because she received outstanding performance in
the other attributes of the dining experience. Thus, the performance evaluation criterion
for waiting in line, in this case, is likely to be lowered post hoc. This example illustrates
a potential mismatch in criteria between expectation and performance, thereby
suggesting that multicriteria for performance evaluation are likely in measuring
expectations. However, the validity of this criterion mismatch between expectations
and performance perception has not been questioned in the literature.
Sampling Methods
CS and SQ researchers have collected data from various types of samples.
Convenience, systematic and random sampling methods were frequently employed by
CS researchers. Student subjects were used mainly in experimental studies. Cardozo
(1965), Anderson (1973) and Oliver and DeSarbo (1988), for example, utilized either
undergraduate or graduate business school students for their experiments. The size
of the sample for these experiments ranged from 40 (Oliver and DeSarbo, 1988) to 144
(Anderson, 1973). Bolton and Drew (1991) and Westbrook and Newman (1978)
systematically sampled residents in selected service areas, while random sampling
was used by Cadotte, Woodruff and Jenkins (1987) and Oliver and Swan (1989). A
panel-type sample was used for longitudinal studies (Bearden and Teel, 1983; Swan
and Trawick, 1981 ), while Oliva, Oliver and MacMillan (1992) utilized a data bank.
In order to develop measurement scales for SQ, Parasuraman, Zeithaml and
Berry (1985, 1988) used a national sample. Also, Cronin and Taylor (1992) used
random samples from four industries to evaluate SERVQUAL and other alternatives.
Hospitality researchers have utilized various sampling techniques. These include
random sampling (Barsky, 1992 b; Barsky and Labagh, 1992; Callan, 1994), systematic
random sampling (Pizam and Milman, 1993), stratified random sampling (Saleh
and Ryan, 1991) and convenience sampling (Getty and Thompson, 1994).
The sampling and the generalization of study results are thorny issues in
hospitality CS and SQ research due to the high levels of fragmentation in both product
class and market segments. The lodging and restaurant industries, for example, have
a distinct hierarchy of products that are differentiated by price, and at the same time
require a wide variety of market segments for the same products. Thus, any study
drawing samples from a specific property or market segment has limited generalizability.
Also, time is an important factor to be considered when sampling for such hospitality
products as hotels, resorts and tourist attractions. Considering all the constraints in
hospitality sampling, it is suggested that researchers build a generalizable model by
developing it from a specific sample and testing it with diverse samples, rather than by
pursuing an aggregate model from the beginning.
Validity
Perhaps the two most salient issues related to validity in CS and SQ research are:
1) the discriminant validity of measured expectations; and 2) the dimensionality of CS
and SQ (construct validity). Discriminant validity is established, for example, when the
between-factor item correlations are lower than the within-factor item correlations. The
questions associated with discriminant validity were raised first by Miller (1977) who
argued that multiple comparison standards for expectations do exist. In accordance
with this viewpoint, CS researchers have measured different expectations in their
studies. Churchill and Surprenant (1982), Tse and Wilton (1988) and Bolton and Drew
(1991) measured subjects' expectations on expected [predicted] product performance,
whereas Swan and Trawick (1981) gauged customers' beliefs. Cadotte, Woodruff and
Jenkins (1987) measured both product norms and brand norms as comparison
standards.
Similarly, SQ research has typically measured normative expectations, although
Teas (1993 a) strongly argued that the discriminant validity of these expectation
measures was a methodological problem. In particular, Boulding et al. (1993) included
measures of ideal expectations alongside the normative measures and found discriminant
validity between the two types of expectations.
To date, it is not clear what type of expectation, under what type of situation, has
greater validity. More empirical evidence should be accumulated to support any
particular type of expectation. Obviously, the validity of any particular expectation must
be assessed in terms of performance measures due to their close relationship.
Along with discriminant validity, several researchers have provided evidence of
the convergent validity of each model construct (Bearden and Teel, 1983; Cadotte,
Woodruff and Jenkins, 1987; Churchill and Surprenant, 1982). Following the sugges-
tions made by Nunnally (1978) and Churchill (1979), the convergent validity of each
construct can be tested by examining the correlation coefficients among construct
measurement items. For convergent validity to hold, the correlation among the items
that measure the same construct should be higher than the correlation among the items
that measure the different constructs. Cadotte, Woodruff and Jenkins (1987) and
Westbrook (1980) examined both convergent and discriminant validity by using a
multitrait-multimethod matrix (MTMM) developed by Campbell and Fiske (1959).
Convergent and discriminant validities were also discussed in SQ studies (Cronin and
Taylor, 1992; Parasuraman, Zeithaml and Berry, 1988, 1991; Teas, 1993 a, 1993 b).
Nonetheless, the majority of the hospitality CS and SQ research reported in the
literature lacks a discussion of convergent and discriminant validities of the employed
scales. Thus, it appears that some of the conclusions drawn from these studies may
be inappropriate.
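The correlation comparison described above can be sketched in plain Python. The item names, the responses and the two helper functions are hypothetical, and comparing average within-construct and between-construct correlations is a simplified check, not the full multitrait-multimethod procedure of Campbell and Fiske (1959).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_correlations(items, construct_of):
    """Return (mean within-construct r, mean between-construct r) over all item pairs."""
    within, between = [], []
    for a, b in combinations(items, 2):
        r = pearson(items[a], items[b])
        if construct_of[a] == construct_of[b]:
            within.append(r)
        else:
            between.append(r)
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical responses of six customers to two expectation and two performance items.
items = {
    "exp_1": [5, 6, 4, 7, 3, 6],
    "exp_2": [5, 7, 4, 6, 3, 6],
    "perf_1": [6, 3, 4, 5, 5, 3],
    "perf_2": [6, 3, 5, 5, 6, 2],
}
construct_of = {
    "exp_1": "expectation",
    "exp_2": "expectation",
    "perf_1": "performance",
    "perf_2": "performance",
}

within_r, between_r = average_correlations(items, construct_of)
print(within_r > between_r)  # True for this made-up data
```

In this made-up data set the items written for the same construct correlate more strongly with each other than with items written for the other construct, which is the pattern the validity argument requires.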
Only a few studies have directly investigated the dimensionality of CS measures
(Yi, 1992; Czepiel, Rosenberg and Akerele, 1974; Leavitt, 1977). The most frequently
proposed theory about the dimensionality of CS is a dual-factor theory, which was
proposed early in Herzberg's two-factor theory of job satisfaction (Herzberg, Mausner
and Snyderman, 1959). The theory asserts that satisfaction and dissatisfaction are
different constructs, and that they are caused by different facets of interaction between
a product or service and a customer. A low correlation between the two constructs is
believed to imply their relative independence from each other. While there is an ongoing
argument about the two-factor theory, the service and hospitality literature has not
reported evidence of this dual-factor possibility. Further empirical evidence is needed
to determine the viability of the two-factor CS measures.
The dimensionality question for SQ has also been raised. As discussed earlier,
although Parasuraman, Zeithaml and Berry (1988) proposed five dimensions of
service, many replication studies have disagreed about the number and types of
dimensions. For instance, Cronin and Taylor (1992) found that 22 SERVQUAL items
loaded on the same dimension. Bojanic and Rosen (1994) identified six dimensions of
restaurant service when the original SERVQUAL scale was used. Three plausible
sources of this discrepancy in the dimensionality of SQ are: 1) the SQ score's situation-
specific dependence on the expectation or performance scores; 2) the structural
differences of SQ across services; and 3) the differences in the level of factor
abstraction achieved by researchers. Research should be conducted to determine
which of these sources of discrepancy is the most likely.
Reliability
Cronbach's (1951) alpha statistic has been the most frequently used indicator of
reliability in CS and SQ research. A number of CS studies have reported reliability
indices of adopted scales (Oliver and Linda, 1981; Westbrook, 1980). Westbrook
(1980) and Maddox (1985), for example, compared the test-retest reliability of several
single-item CS scales such as the Delighted-Terrible [D-T] percentage, need
(satisfied-dissatisfied) [S-D], content analytic and graphic scales. As a result, they found the
D-T scale to be the most reliable among the examined scales with an alpha value ranging
from .65 to .85. However, most of the reliability estimates for these repeated single-item
scales were low to moderate, suggesting that caution is needed in using single-item
measures (Yi, 1992; Churchill, 1979).
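For multi-item scales, Cronbach's alpha can be computed from the item variances and the variance of the summed scale. The following is a minimal sketch using only the Python standard library. The function name and the sample responses are assumptions for illustration and are not data from any of the cited studies.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item scale.

    items: one list of scores per item, with respondents in the same order in each list.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Total scale score of each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


# Hypothetical 7-point responses of six customers to a three-item satisfaction scale.
items = [
    [4, 5, 3, 6, 2, 5],
    [5, 5, 3, 7, 2, 4],
    [4, 6, 2, 6, 3, 5],
]

alpha = cronbach_alpha(items)
print(round(alpha, 2))  # 0.95
```

Alpha cannot be computed for a single-item measure, which is why reliability for single-item scales is assessed by test-retest comparison.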
Reliability scores also vary for multi-item CS scales. Several researchers have
reported reliabilities of selected multi-item scales: Bearden and Teel (1983), Oliver
(1980 a) and Oliver and Bearden (1983) for the Likert scale; and Oliver and Linda (1981)
for a semantic differential scale. Westbrook and Oliver (1991) investigated five different
scales to assess their reliabilities: Likert, semantic differential, graphic, verbal and
Porter. The results of their study showed that the Likert and semantic-differential scales
performed equally well for multi-item CS measures with alpha values of .75 - .96 and
.90 - .95, respectively. In comparing these results with those of single-item measures,
it was found that the multi-item measures outperformed most of the single-item
measures of CS.
Studies on SQ have reported favorable reliabilities for normative expectations and
performance. For example, Parasuraman, Zeithaml and Berry (1988), who utilized a
7-point Likert-type scale, achieved reliabilities ranging from .72 (tangibles) to .86
(empathy) for five SQ dimensions. In similar efforts to develop lodging-specific SQ
scales, Getty and Thompson (1994) identified three dimensions of customers' per-
ceived performance of hotel services with improved reliabilities ranging from .84 to .97.
Overall, the hospitality literature does not report reliability and validity results
sufficiently. Only a few studies have discussed validity issues. For example, following
Churchill's (1979) and Parasuraman, Zeithaml and Berry's (1988) suggestions, Getty
and Thompson (1994) discussed the face, trait and predictive validities of their
performance scales. Not included in their discussion, however, were the convergent
and the discriminant validity issues as emphasized by Churchill (1979). More rigorous
examination of validity and reliability issues is necessary in future hospitality research.
In this regard, Churchill (1979) and Cadotte, Woodruff and Jenkins (1987) have
demonstrated excellent examples of reliability assessment using the multitrait-
multimethod-matrix.
New theories: In response to the fundamental concern raised above, Oliva, Oliver
and MacMillan (1992) proposed a catastrophe model theorizing the relationship of
satisfaction with transaction costs and brand loyalty. Originating in catastrophe
theory (Thom, 1975; Zeeman, 1976) and chaos theory (Gleick, 1988), this approach
hypothesizes that satisfaction and dissatisfaction occur at different points. Because a gap
separates the two threshold points, these behaviors are associated with transaction
costs and brand loyalty and therefore are not monotonic. The gap between the two
thresholds is explained by the concept of a catastrophe, which is similar to the zone of
indifference proposed by Woodruff, Cadotte and Jenkins (1983). If this reasoning is
true, two questions emerge: (1) Where are the triggering points of satisfaction and
dissatisfaction, respectively? and (2) How does one measure the two constructs
independently? These questions remain unanswered in the literature.
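The two-threshold reasoning can be made concrete with a minimal sketch. It is not the catastrophe model of Oliva, Oliver and MacMillan (1992) itself; it is only a simple hysteresis rule, with hypothetical threshold values on a hypothetical 7-point performance scale, showing why the same performance level may coincide with different customer states.

```python
def track_state(performance_series, lower=3.0, upper=5.0, start="neutral"):
    """Follow a customer's state across successive service encounters.

    The state changes only when performance crosses a threshold:
    at or above `upper` the customer becomes satisfied, at or below
    `lower` the customer becomes dissatisfied. Between the thresholds
    (the zone of indifference) the previous state persists.
    Both thresholds are illustrative assumptions.
    """
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    state = start
    states = []
    for p in performance_series:
        if p >= upper:
            state = "satisfied"
        elif p <= lower:
            state = "dissatisfied"
        states.append(state)
    return states

# The performance level 4.0 occurs three times and maps to three states.
encounters = [4.0, 5.5, 4.0, 2.5, 4.0]
print(list(zip(encounters, track_state(encounters))))
```

In this sketch the response to performance depends on the customer's history, so it cannot be described by a single monotonic function of performance. Locating the two thresholds empirically is exactly the open question raised above.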
Another serious problem with the SERVQUAL difference score is the variance
restriction, which occurs when one of the component scores used to calculate the
difference score is consistently higher than the other component. This situation
applies to SERVQUAL in that the expected or desired level of service is almost always
higher than the perceived level of actual service (Peter, Churchill and Brown, 1993),
thereby potentially limiting the variability in perceptions scores. Therefore, the validity
issues of difference scores are not yet clearly settled. More rigorous theoretical
consideration must be given to using difference scores.
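The difference score at issue can be sketched as follows, assuming each item is rated once for expectation and once for perceived performance on the same scale. The function names and ratings are hypothetical; the sketch only computes the perception-minus-expectation gap and reports how often expectation exceeds perception, the condition associated above with variance restriction.

```python
from statistics import mean

def gap_scores(perceptions, expectations):
    """Item-level difference scores: perception minus expectation."""
    if len(perceptions) != len(expectations):
        raise ValueError("each item needs both a perception and an expectation")
    return [p - e for p, e in zip(perceptions, expectations)]

def share_expectation_higher(perceptions, expectations):
    """Proportion of items on which expectation exceeds perception."""
    gaps = gap_scores(perceptions, expectations)
    return sum(1 for g in gaps if g < 0) / len(gaps)

# Hypothetical 7-point ratings for six items from one respondent.
expectations = [7, 6, 7, 7, 6, 7]
perceptions = [5, 6, 4, 6, 5, 5]

gaps = gap_scores(perceptions, expectations)
print("gaps:", gaps)
print("mean gap:", mean(gaps))
print("share with expectation higher:", share_expectation_higher(perceptions, expectations))
```

When expectations sit near the top of the scale for nearly every item, as in this hypothetical data, almost all gaps fall on one side of zero.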
Teas' (1993 a, b) propositions about SERVQUAL also deserve attention. He
raised questions about the locus of the ideal point by which Parasuraman, Zeithaml and
Berry's expectation construct is believed to be developed and to which performance is
to be compared. According to Teas, Parasuraman, Zeithaml and Berry's "ideal
standard" subsumed in expectations poses two possible interpretations: 1) a classic
attitudinal ideal point that predicts, in contrast to Parasuraman, Zeithaml and Berry's
assumption of monotonic relationship between performance and SQ, decreasing
perceived quality as performance increasingly exceeds the ideal point (Ginter, 1974;
Green and Srinivasan, 1978); and 2) a feasible ideal point that represents a feasible or
the best level of performance by the highest-quality provider under perfect circum-
stances. Although this feasible ideal point interpretation could justify assumptions
made by Parasuraman, Zeithaml and Berry, it still depends on whether the attributes
are vector attributes (i.e., infinite or maximum classic attitudinal ideal points) or finite
ideal point attributes (i.e., non-infinite or intermediate classic attitudinal ideal points), as
proposed by Lilien, Kotler and Moorthy (1992). Building on these researchers, Teas
(1993 b) proposed two alternative approaches to SERVQUAL: 1) a Modified SERVQUAL
(MQ) model; and 2) a Normed Quality (NQ) model.
Cronin and Taylor (1992, 1994) also proposed an alternative performance-based
model (SERVPERF) which was found to perform better than SERVQUAL. They tested
the two models in four different industries for nomological validities and structural
consistency. Also raised in their study was the issue related to the temporal order
between CS and SQ. They concluded that SERVPERF outperformed SERVQUAL,
and that SQ is an antecedent of CS. However, they were inclined to over-interpret their
study results because their claimed difference in model performance might not prove
"significant."
In sum, several alternative approaches to traditional CS and SQ models have
recently been proposed to provide a more robust prediction of customers' behaviors
over time. Additionally, these new models address more practical issues of measuring
attitude, SQ and CS over time. Similar model specification problems with SQ models
are expected in the future because structural models will become more frequently used
in SQ modeling. Therefore, a critical task pending for CS and SQ researchers is to
integrate pieces of models and recreate them as a few parsimonious models that would
be useful to industry managers.
(Fishbein and Ajzen, 1975; Mazis, Ahtola and Klippel, 1975; Mitchell, 1974; Oliver,
1979, 1981) did not recommend the inclusion of importance in the attitude models,
other researchers (Barsky, 1992 b; Bojanic and Rosen, 1994; Getty and Thompson,
1994; Goering, 1985; Kotler, 1988; Lewis and Pizam, 1981; Teas, 1993 a, 1993 b)
seemed to support the inclusion of attribute importance in evaluating customers'
satisfaction levels.
In hospitality research, the concept of importance has received considerable
attention recently. For example, Barsky (1992 b) and Barsky and Labagh (1992)
included direct measures of importance in their investigation of hotel customers'
satisfaction. Also, following Parasuraman, Zeithaml and Berry's (1988) suggestions,
Bojanic and Rosen (1994) indirectly assessed, through a regression model, the relative
importance of six quality dimensions of restaurant services. However, although studies have
investigated the concept of importance, the benefits of measuring attribute importance have
not been thoroughly discussed. Studies are needed to assess accurately the contribution of
attribute importance to increasing the model's predictive power.
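Indirect assessment of importance through regression can be sketched as follows. This is not the procedure used by Bojanic and Rosen (1994); it is a generic ordinary least squares fit of an overall quality rating on dimension scores, written without external libraries. The function name and the data are hypothetical, and the data are constructed so that the true coefficients are known.

```python
def ols_coefficients(predictors, outcome):
    """Ordinary least squares via the normal equations.

    predictors: list of rows, one row of dimension scores per respondent.
    outcome: list of overall ratings, one per respondent.
    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + [float(x) for x in row] for row in predictors]
    m = len(rows[0])
    # Build X'X and X'y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    c = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
                c[r] -= factor * c[col]
    return [c[i] / a[i][i] for i in range(m)]

# Hypothetical scores on two quality dimensions for five respondents.
dimension_scores = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
# Overall rating constructed as 1 + 2 * dim1 + 0.5 * dim2 (no noise).
overall_quality = [1 + 2 * d1 + 0.5 * d2 for d1, d2 in dimension_scores]

coefficients = ols_coefficients(dimension_scores, overall_quality)
print([round(b, 6) for b in coefficients])
```

Under this approach the dimension with the larger coefficient is read as the more important one. That reading is meaningful only when the dimensions are measured on comparable scales or when standardized coefficients are used.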
The two contradictory views on the inclusion or exclusion of importance in CS and
SQ models stem from doubts about the validity of the concept, which has insufficient empirical support. That
is, most researchers agree that customers' evaluations of performance are based upon
subjective relative comparisons (Cadotte, Woodruff and Jenkins, 1987; Oliva, Oliver
and MacMillan, 1992; Parasuraman, Zeithaml and Berry, 1985). Here, the term
"relative" should mean between-attributes, between-brands (competitors) and even
between-alternative products simultaneously. However, empirical investigations have
attempted to focus on only one source of importance. Thus, study results have been
unable to support the role of importance in attitude models because no significant
improvement in the predictive power was shown (Mazis, Ahtola and Klippel, 1975).
In general, two methods of measuring importance are popular: 1) direct question-
ing of subjects; and 2) indirect inference through regression models. Studies that have
attempted to measure importance have only considered the absolute or within-brand
importance of each attribute or factor (Bojanic and Rosen, 1994; Cronin and Taylor,
1992). The dynamics of importance, however, may not be so simple. Rather, the
concept of importance must apply to a between-attribute and between-brands trade-
off that may be the basis of a choice against another brand being considered. In the case
of the direct question methods, the multiplication of a variable and its importance has
been the most popular weighting method, even if the underlying assumption of
statistical independence of the two variables has not been clearly established.
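The multiplicative weighting described above can be sketched as follows, assuming directly questioned importance ratings and performance ratings for the same attributes. The function names and ratings are hypothetical, and the normalized variant is one possible convention rather than a method taken from the studies cited.

```python
def weighted_score(importance, performance):
    """Sum of importance x performance products across attributes."""
    if len(importance) != len(performance):
        raise ValueError("each attribute needs both ratings")
    return sum(w * p for w, p in zip(importance, performance))

def normalized_weighted_score(importance, performance):
    """Weighted sum divided by total importance, so the result stays
    on the original performance scale."""
    return weighted_score(importance, performance) / sum(importance)

def unweighted_score(performance):
    """Simple mean of performance ratings, ignoring importance."""
    return sum(performance) / len(performance)

# Hypothetical ratings for three hotel attributes.
importance = [5, 3, 1]
performance = [6, 4, 2]

print("weighted sum:", weighted_score(importance, performance))
print("normalized:", normalized_weighted_score(importance, performance))
print("unweighted:", unweighted_score(performance))
```

In this hypothetical case the weighted score exceeds the unweighted one because the best-performing attribute is also rated most important. The product treats importance and performance as separable, which is the independence assumption questioned above.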
In summary, the inclusion of importance in measuring attitudes seems to be a
philosophical question. Those who advocate inclusion tend to focus on the conceptual
and realistic role of importance in human decision processes, while those who dismiss
inclusion tend to emphasize statistical contributions and methodological efficiency of
the concept when it is included in attitude models. However, if the concept holds a role
in the CS processes, further efforts should be made to improve the methods of
measuring and modeling the concept of importance. This can be done by: 1) measuring
the importance construct more accurately (that is, the validity for the source of
importance should be established first); and 2) discovering a better way to incorporate
the concept with the other related variables (that is, an appropriate weighting method
must be developed if importance is to be used as a weighting parameter).
CONCLUSION
It is most gratifying to see an increasing interest in CS and SQ research in the
hospitality industry. Although this research interest is not completely new to the
industry, these paradigm-oriented efforts have been unprecedented. In an effort to
improve marketers' understanding of consumer behavior in the hospitality industry,
many researchers are introducing CS and SQ models and testing them with different
hospitality services. Nevertheless, their research efforts are still within the conventional
approach that has prevailed in the hospitality industry. The simple, and often incom-
plete, introduction of new research paradigms without critical consideration of theory
and methodology is delaying the development of hospitality-oriented CS and SQ
research pedagogy. It is time for hospitality researchers to emphasize domain-specific
theories and the legacy of a discovery-oriented research tradition to the next-
generation researchers.
This review was conducted to provide hospitality CS and SQ researchers with
more useful guidelines for future research that would result in more rigorous theoretical
and methodological progress. The terms "satisfaction" and "quality" have been a
central hospitality management philosophy, and their importance continues with the
promise of a renewed, foreseeable prosperity for hospitality organizations of the future.
Nevertheless, hospitality research has not, on the whole, developed any substantive
theories and innovations. Partial responsibility for this inevitably lies with the method-driven
research traditions of the past. Without a priori consideration of theoretical underpinnings,
the outcomes can only be ad hoc.
Throughout this review, an effort has been made to define a number of important
issues related to CS and SQ research. However, this review still leaves many more
issues unresolved. Hospitality researchers are encouraged to address the more
fundamental issues pending in their research agendas. More discussion and more
rigorous investigations are urged. Certainly, opportunities to build more productive
research paradigms abound, especially in the hospitality industry.
REFERENCES
Anderson, E.W., Fornell, C. and Lehmann, D.R. (1994). Customer satisfaction, market
share, and profitability: Findings from Sweden. Journal of Marketing, 58, 53-66.
Anderson, A. E. (1973). Consumer dissatisfaction: The effect of disconfirmed expect-
Ravenel, C. S. (1992). A new way to satisfy guests. Lodging Hospitality, 48, 32.
Saleh, F. and Ryan, C. (1991, July). Analyzing service quality in the hospitality industry
using the SERVQUAL model. The Service Industries Journal, 11, 324-343.
Shifflet, D. K. (1989). How are you doing? Measure guest satisfaction. Hotel & Resort
Industry, 12, 72.
Sirgy, J. M. (1982). Self-image/product image congruity and advertising strategy. In
Vinay Kothari (Ed.), Development in marketing science, Proceedings of the
Academy of Marketing Science, 5, 129-33.
Sirgy, J. M. (1983). Social cognition and consumer behavior. New York: Praeger
Publishers.
Sirgy, J. M. (1984, Summer). A social cognition model of CS/D: An experiment.
Psychology and Marketing, 1, 27-44.
Sirgy, J. M. (1985). Self-image/product image congruity and consumer decision
making. International Journal of Marketing, 2(4), 49-63.
Sirgy, J.M. (1987). A social cognition model of consumer problem recognition. Journal
of the Academy of Marketing Science, 15(4), 53-61.
Spreng, R. A. and Mackoy, R. D. (1994). A dynamic model of affect, disconfirmation,
and satisfaction judgment. A paper presented at the 1994 Annual Conference of
Journal of Consumer Research, Boston.
Swan, J. E. and Trawick, I. F. (1981). Disconfirmation of expectations and satisfaction
with a retail service. Journal of Retailing, 57, 48-67.
Teas, R. K. (1993 a). Consumer expectations and the measurement of perceived
service quality. Journal of Professional Services Marketing, 8(2), 33-53.
Teas, R. K. (1993 b). Expectations, performance evaluation, and consumers' perceptions
of quality. Journal of Marketing, 57, 18-34.
Teas, R. K. (1994). Expectations as a comparison standard in measuring service
quality: An assessment of a reassessment. Journal of Marketing, 58, 132-139.
Thom, R. (1975). Structural stability and morphogenesis. Reading, MA: W. A.
Benjamin.
Tse, D. K. and Wilton, P. C. (1988). Models of consumer satisfaction formation: An
extension. Journal of Marketing Research, 25, 204-212.
Walker, J. R. (1988). The viability of quality assurance in hotels. Hospitality Education
and Research Journal, 12, 461-470.
Webster, C. (1991). Influences upon consumer expectations of services. The Journal
of Services Marketing, 5, 5-17.
Westbrook, R. A. (1980, Fall). A rating scale for measuring product/service satisfaction.
Journal of Marketing, 44, 68-72.
Westbrook, R. A. (1987). Product/consumption-based affective responses and
postpurchase processes. Journal of Marketing Research, 24, 258-270.
Westbrook, R. A. and Newman, J. W. (1978). An analysis of shopper dissatisfaction for