Bajpai Chapter 3

Copyright © 2017 Pearson India Education Services Pvt. Ltd
Chapter 3
Measurement and Scaling
Business Research Methods, 2e Author: Naval Bajpai

Learning Objectives

Upon completion of this chapter, you will be able to:
• Understand the scale of measurement and the four levels of data
measurement
• Understand the criteria for good measurement
• Learn about the various established measurement scales used
in business research
• Understand the factors to be considered in selecting the
appropriate measurement scales

What Should be Measured?

• Measuring physical properties is not complex, whereas measuring
psychological properties requires careful attention from the
researcher.
• The quality of research depends on which measurement techniques
the researcher adopts and how well these fit the prevailing
research circumstances.

Scales of Measurement

• Nominal scale
• Ordinal scale
• Interval scale
• Ratio scale


• Nominal Scale: When data are labels or names used to identify an
attribute of an element, the nominal scale is used.
Eg: A survey is conducted in three cities of Gujarat, namely Rajkot,
Baroda, and Ahmedabad. If we code ‘1’ for Rajkot, ‘2’ for Baroda, and
‘3’ for Ahmedabad, then 1, 2, and 3 are labels used only to identify
the city.

Nominal Data

• Nominal data are numerical in name only,
because they do not share any of the
properties of the numbers we deal with in
ordinary arithmetic.
• For instance, if we code marital status as 1, 2,
3, or 4, we cannot meaningfully write 4 > 2
or 3 < 4, and we cannot write 3 – 1 = 4 – 2,
1 + 3 = 4, or 4 ÷ 2 = 2.

• Ordinal Scale: In addition to the capacities of nominal-level data,
the ordinal scale can be used to rank or order objects.
• Eg: If consumers are asked to judge a company’s products as
excellent, good, or poor, the company can assign ‘1’ to
excellent, ‘2’ to good, and ‘3’ to poor.
• However, the exact difference between the ranks in numeric terms
cannot be determined using an ordinal scale.
• We can only conclude that excellent is superior to good, and good is
superior to poor.


• Interval Scale: In interval-level measurement, the difference between
two consecutive numbers is meaningful.
• Interval scale data are always numeric.
Eg: Suppose three students score 65, 75, and 85 respectively. We can
determine not only that the score of 85 is superior to 75 but also
that the numeric difference between consecutive scores is 10.


• Ratio Scale: Ratio-level measurements possess all the
properties of interval data, with the addition that the ratio of
two values is meaningful.
Eg: Suppose a company has two categories of products, A and B,
priced at Rs. 30 and Rs. 15 respectively. We can determine not
only that product A is priced higher than product B, but also
that the difference between them is Rs. 15 and that product A
is priced at twice the price of product B.
• In terms of measurement capacity, nominal,
ordinal, interval, and ratio level data are
placed in ascending order.

Figure 3.1: A comparison between the four levels of data
measurement in terms of usage potential
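
The operations that are meaningful at each level can be illustrated with a short Python sketch built on the examples above. The variable names and the extra data points are illustrative assumptions, not part of the original slides.

```python
from statistics import mode, median

# Nominal: codes are labels only, so counting and the mode are meaningful.
city_codes = [1, 2, 3, 1, 1, 2]      # 1 = Rajkot, 2 = Baroda, 3 = Ahmedabad
most_common_city = mode(city_codes)

# Ordinal: order is meaningful, so a median rank can be reported.
ratings = [1, 2, 2, 3, 1]            # 1 = excellent, 2 = good, 3 = poor
median_rating = median(ratings)

# Interval: differences between values are meaningful.
scores = [65, 75, 85]
differences = [b - a for a, b in zip(scores, scores[1:])]

# Ratio: ratios of values are meaningful as well.
price_a, price_b = 30, 15
price_ratio = price_a / price_b

print(most_common_city, median_rating, differences, price_ratio)
```

Each level supports the operations of the levels before it, which is the sense in which the four levels are placed in ascending order.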

The Criteria for Good Measurement

1. Validity

• Validity is the ability of an instrument to measure what it is
designed to measure.
• It sounds simple that a measure should measure what it is
supposed to measure, but this is difficult to achieve in real life.
Eg: Using the behaviour of employees to measure consumer satisfaction
in a big shopping mall raises a validity issue.
Employees’ behaviour may not be the determinant of satisfaction;
other factors such as price and discounts may be responsible for
satisfaction.

1(a) Content Validity

• The content validation includes, but is not limited to, careful
specification of constructs, review of scaling procedures by content
validity judges, and consultation with experts and the members of
the population.
• Sometimes, content validity is also referred to as face validity.
• A researcher or a group of experts examines whether the measuring
instrument is capable of providing adequate coverage of the
concept.
• Content validity is thus a subjective evaluation of the scale
for its ability to measure what it is supposed to measure.

1(b) Criterion Validity

• Criterion validity is the ability of the variable to predict the
key variables or criteria (Lehmann et al., 1998).
• It involves determining whether the scale performs as expected
with respect to the other variables or criteria.
• Criterion variables may include demographic and psychographic
characteristics, attitudinal and behavioural measures, or scores
obtained from other scales (Malhotra, 2004).
• Eg: “Offering snacks during shopping” is a new measure. It is
correlated with traditional measures such as price, discount,
and after-sales service.

1(b) Criterion Validity

Criterion validity has two forms:
• Concurrent validity: the data are collected and evaluated at the
same time and are found to be valid.
• Predictive validity: the new measure is able to predict future
events, such as an increase in sales.

1(c) Construct Validity

• The construct is the initial concept, notion, question, or
hypothesis that determines which data are to be generated and
how they are to be gathered (Golafshani, 2003).
• If a researcher has developed a scale to measure consumer
preference, then to evaluate construct validity the researcher will
correlate the results obtained by the new measure with those of an
existing, well-known measure.

1(c) Construct Validity

• To achieve construct validity, the researcher must focus
on convergent validity and discriminant validity.
• Convergent validity is established when the new measure
correlates or converges with other similar measures.
• Correlation or convergence here means the degree to which the
score on one measuring instrument (scale) is correlated with the
score on another measuring instrument (scale) developed to
measure the same construct.

Discriminant validity

• Discriminant validity is established when a new measuring
instrument has low correlation or non-convergence with the
measures of dissimilar concepts.
• No correlation or non-convergence here means the degree to which
the score on one measuring instrument (scale) is not correlated
with the score on another measuring instrument (scale) developed
to measure a different construct.
• To establish the construct validity, a researcher has to establish the
convergent validity and discriminant validity.

2. Reliability

• Reliability is the tendency of a respondent to respond in the same
or in a similar manner to an identical or a near identical question
(Burns & Bush, 1999).
• A measure is said to be reliable when it elicits the same response
from the same person when the measuring instrument is
administered to that person successively in similar or almost
similar circumstances.
• Reliable measuring instruments provide confidence to a researcher
that the transient/ temporary and situational factors are not
intervening in the process, and hence, the measuring instrument is
robust/healthy.

• A researcher can adopt three ways to handle
the issue of reliability: test–retest reliability,
equivalent forms reliability, and internal
consistency reliability.

2(a) Test–Retest Reliability

• To execute test–retest reliability, the same questionnaire is
administered to the same respondents to extract responses in
two different time slots.
• As a next step, the degree of similarity between the two sets of
responses is determined.
• To assess the degree of similarity between the two sets of
responses, the correlation coefficient is computed. A higher
correlation coefficient indicates a more reliable measuring
instrument, and a lower correlation coefficient indicates an
unreliable measuring instrument.
• Problems: choosing the ideal time difference; respondents may
become sensitised to the subject, which influences the second
answer; and respondents’ views may change between the two rounds.
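
The correlation step described above can be sketched in Python as follows. The two score lists are hypothetical data for five respondents, and `pearson_r` is a helper name introduced here for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores of the same five respondents in two time slots
first_round = [4, 3, 5, 2, 4]
second_round = [4, 3, 4, 2, 5]

r = pearson_r(first_round, second_round)
print(round(r, 2))  # about 0.81 for this made-up data
```

A value close to 1 would suggest that respondents answered consistently across the two administrations.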

2(b) Equivalent Forms Reliability

• In test–retest reliability, a researcher considers personal and
situational fluctuation in responses in two different time periods,
whereas in equivalent forms reliability, two equivalent forms are
administered to the subjects at two different times.
• To measure the desired characteristics of interest, two
equivalent forms are constructed with different samples of items.
Both forms contain the same type of questions and the same
structure, with some specific differences.

2(c) Internal Consistency Reliability
• Internal consistency reliability is used to assess the reliability of a
summated scale, in which several items are summed to form a total
score (Malhotra, 2004).
• The basic approach to measuring internal consistency reliability is the
split-half technique.
• In this technique, the items are divided into two equivalent groups. The
division is done on some predefined basis, such as odd versus
even numbered questions in the questionnaire, or a random split of items.
• After division, responses on the two halves are correlated. A high
correlation coefficient indicates high internal consistency, and a low
correlation coefficient indicates low internal consistency.
• The subjectivity involved in splitting the items into two parts is a
common problem for researchers.
• A very common approach to deal with this problem is coefficient alpha
or Cronbach’s alpha.

The coefficient alpha or Cronbach’s alpha

• Coefficient alpha, or Cronbach’s alpha, is a mean reliability
coefficient for all the different ways of splitting the items
included in the measuring instrument.
• Unlike the correlation coefficient, coefficient alpha typically
varies from 0 to 1, and a value of 0.6 or less is considered to
be unsatisfactory.
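
As a minimal sketch, coefficient alpha can be computed from raw item scores with the widely used variance-based formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The formula is standard but is not shown on the slides, and the response data and the function name `cronbach_alpha` are assumptions made for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items in each)."""
    k = len(rows[0])                                    # number of items
    item_vars = [variance(col) for col in zip(*rows)]   # variance of each item
    total_var = variance([sum(r) for r in rows])        # variance of summed score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: five respondents, three items on a 5-point scale
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
    [4, 4, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))  # about 0.897 for this made-up data
```

For this made-up data, alpha is well above the 0.6 threshold mentioned above.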

3. Sensitivity

• Sensitivity is the ability of a measuring instrument to measure the
meaningful difference in the responses obtained from the subjects
included in the study.
• It is to be noted that dichotomous response categories such as
yes or no can mask a great deal of the variability in the
responses.
• Hence, a more sensitive measure, with more items or more
response categories, is required.
• For example, a scale based on five categories of responses, such as
“strongly disagree,” “disagree,” “neither agree nor disagree,”
“agree,” and “strongly agree,” presents a more sensitive measuring
instrument.

Measurement Scales

• Rating scales are used to measure theoretical constructs like
customer satisfaction, brand loyalty, etc.
• Comparative scales are based on the direct comparison of stimuli
and generally generate ranking or ordinal data. This is why these
scales are sometimes referred to as non-metric scales.
• Non-comparative scaling techniques generally involve the use of a
rating scale, and the resulting data are interval or ratio in nature.
This is why these scales are referred to as monadic scales or
metric scales by some business researchers.
• This section discusses the various types of scales in the light of
the items included in the scales: single-item scales, multi-item
scales, and continuous rating scales.
FIGURE 3.3: The classification of measurement scales

1. Single-Item Scales
• As is clear from the name, single-item scales measure only
one item as a construct.
• Some of the commonly used single-item scales in the field of
business research are multiple-choice scales, forced-choice ranking
scales, paired-comparison scales, constant-sum scales, direct
quantification scales, and Q-sort scales.

1(a) Multiple-Choice Scale
• A researcher tries to generate some basic information to conduct
his or her research work and, for the sake of convenience or
further analysis, codes it by assigning different numbers
to different characteristics of interest.
• This type of measurement is commonly referred to as a multiple-choice
scale and generates nominal data. In this type of scale, the
researcher poses a single question with multiple response
alternatives.
• Purely for quantification, a researcher assigns 1 to the
first response, 2 to the second response, and so on. It is important
to note that the numbers provide only nominal information.

FIGURE 3.4 : Examples of multiple-choice scales

1(b) Forced-Choice Ranking

• In the forced-choice ranking scaling technique, the respondents
rank different objects simultaneously from a list of objects
presented to them.

FIGURE 3.5: Example of forced-choice scale

1(c) Paired-Comparison Technique
• As the name indicates, in the paired-comparison scaling
technique, a respondent is presented with a pair of objects, stimuli,
or brands and is asked to indicate which one of the pair he or she
prefers.
• When n items (objects or brands) are included in the study, a
respondent has to make n(n − 1) / 2 paired comparisons.
• Sometimes, a researcher uses the “principle of transitivity” to
analyse the data obtained from a paired-comparison scaling
technique. Transitivity is a simple concept that says that if Brand
“X” is preferred over Brand “Y” and Brand “Y” is preferred over
Brand “Z,” then Brand “X” is also preferred over Brand “Z”.
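
The number of comparisons and the transitivity principle can be illustrated with a small Python sketch. The brand labels, the preference data, and the helper name `is_transitive` are hypothetical.

```python
from itertools import combinations

brands = ["W", "X", "Y", "Z"]          # hypothetical brand labels
pairs = list(combinations(brands, 2))  # every unordered pair, shown once
n = len(brands)
expected_pairs = n * (n - 1) // 2      # n(n - 1) / 2 comparisons

def is_transitive(prefers):
    """prefers: set of (preferred, other) tuples from the paired comparisons."""
    return all(
        (a, d) in prefers
        for (a, b) in prefers
        for (c, d) in prefers
        if b == c
    )

consistent = {("X", "Y"), ("Y", "Z"), ("X", "Z")}
circular = {("X", "Y"), ("Y", "Z"), ("Z", "X")}

print(len(pairs), expected_pairs)      # both are 6 for four brands
print(is_transitive(consistent), is_transitive(circular))
```

The second preference set is circular, so it fails the transitivity check; the number of pairs grows quickly with n, which limits how many brands a respondent can reasonably compare.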

FIGURE 3.6: Example of paired comparison
scaling technique

1(d) Constant-Sum Scales

• In the constant-sum scaling technique, the respondents allocate
points to more than one stimulus object, object attribute, or
object property, such that the total remains a constant sum,
usually 10 or 100.
• Because the points must add up to this predefined constant, the
scale is called the constant-sum scale. This scaling technique
generates ratio-level data.
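
A minimal sketch of how a constant-sum response might be checked and read as ratio-level data is given below. The attribute names, the point allocation, and the helper `is_valid_allocation` are assumptions made for illustration.

```python
def is_valid_allocation(points, total=100):
    """True if all points are non-negative and add up to the constant sum."""
    return all(p >= 0 for p in points.values()) and sum(points.values()) == total

# Hypothetical allocation of 100 points across three product attributes
allocation = {"price": 40, "quality": 35, "service": 25}

valid = is_valid_allocation(allocation)
# Ratio-level reading: how many times more important is price than service?
price_vs_service = allocation["price"] / allocation["service"]

print(valid, price_vs_service)
```

Because the points form a ratio scale, statements such as "price received 1.6 times the points given to service" are meaningful.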

FIGURE 3.7 : Example of a constant-sum scale

1(e) Direct Quantification Scale

• The simplest form of obtaining information is to directly ask a
question related to some characteristics of interest resulting in
ratio-scaled data. Researchers generally ask a question related
to payment intention of consumers.

Figure 3.8: Example of the direct quantification scale

1(f) Q-Sort Scales

• The objective of the Q-sort scaling technique is to quickly
classify a large number of objects. In this kind of scaling
technique, the respondents are presented with a set of
statements, and they classify them into a predefined
number of categories (piles), usually 11, ranging from most
strongly agree to least strongly agree.

2. Multi-Item Scales
• Multi-item scaling techniques generally generate interval-type
information.
• In an interval scaling technique, a scale is constructed with a
number or description associated with each scale position.
• The respondent’s rating on certain characteristics of
interest is thereby obtained.
• For the majority of researchers, rating scales are the preferred
measuring device to obtain interval (or quasi-interval) data on the
personal characteristics (i.e., attitudes, preferences, and opinions)
of individuals of all kinds (Peterson, 1997).

2(a) Summated Scaling Technique: The Likert Scales

• In a Likert scale, each item response has five rating categories, with
“strongly disagree” and “strongly agree” as the two extremes and “disagree,”
“neither agree nor disagree,” and “agree” in the middle of the scale.
Typically, a 1- to 5-point rating scale is used, but a few researchers
use another set of numbers such as −2, −1, 0, +1, and +2.
• The analysis can be done by using either profile analysis or summated
analysis.
• Profile analysis is an item-by-item analysis: the respondents’
scores are obtained for each item of the scale, and the analysis is
done on the basis of individual item scores.
• In the summated approach, each respondent’s scores are summed across
the scale items, and an average of these totals is then obtained for all
the respondents. The summated approach is widely used, which is why the
Likert scale is also referred to as the summated scale.
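
The two approaches can be contrasted with a short Python sketch. The response matrix is hypothetical: three respondents answering three Likert items on a 1 to 5 scale.

```python
# Hypothetical responses: one row per respondent, one column per item
responses = [
    [5, 4, 4],
    [3, 2, 4],
    [4, 4, 5],
]

# Profile analysis: look at each item separately (here, the mean per item).
item_means = [sum(col) / len(col) for col in zip(*responses)]

# Summated analysis: sum across items for each respondent,
# then average the totals over all respondents.
totals = [sum(row) for row in responses]
mean_total = sum(totals) / len(totals)

print(item_means, totals, mean_total)
```

Profile analysis keeps item-level detail, while the summated score gives one number per respondent for the whole construct.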

FIGURE 3.9 : Example of Likert scale

2(b) Semantic Differential Scales

• The semantic differential scale consists of a series of bipolar
adjectival words or phrases placed on the two extreme points
of the scale.
• Good semantic differential scales keep some negative adjectives
and some positive adjectives on the left side of the scale to
tackle the problem of the halo effect.

FIGURE 3.11: Example of semantic differential scale

2(c) Stapel Scales
• The Stapel scale is generally presented vertically with a single
adjective or phrase in the centre of the positive and negative
ratings.
• Similar to the Likert scale and the semantic differential scale, in a
Stapel scale the points are equidistant both physically
and numerically, which usually results in interval-scaled
responses.

FIGURE 3.12: Example of Stapel scale

2(d) Numerical Scales

• Numerical scales provide equal intervals separated by numbers, as
scale points to the respondents. These scales are generally 5- or 7-point
rating scales.

FIGURE 3.13: Example of numerical scale

(3) Continuous Rating Scales

• In a continuous rating scale, the respondents rate the object by
placing a mark on a range to indicate their attitude. In this scale,
the two ends of range represent the two extremes of the
measuring phenomenon.
• This scale is also referred to as a graphic rating scale and allows a
respondent to select his or her own rating point instead of the
rating points predefined by the researcher.

FIGURE 3.14: Example of a continuous rating scale

Factors in Selecting an Appropriate
Measurement Scale

• Level of data: nominal, ordinal, interval, or ratio
• Question format: open ended or close ended
• Scale points: 5, 7, or 11 (5 point, or other than 5 point)

Reliability Analysis
• In Chapter 3, Figure 3.10 exhibits a multi-item scale (E-S-QUAL) to
measure the service quality delivered by websites on which online
shopping is available for customers.
• To understand the application of SPSS for reliability analysis,
we will take the first component of the scale, ‘efficiency’.
• Suppose data are collected from 10 respondents to measure the
efficiency of these online shopping portals. The data are presented in
the data editor window of SPSS, as exhibited in Figure 3.19.
• SPSS output is exhibited from Figure 3.17 to Figure 3.24.

Figure 3.10 : Multi-item scale (E-S-QUAL) to measure the service quality delivered by
Websites in which online shopping is available for the customers

Figure 3.19: SPSS Data Editor Window
Figure 3.20: Test of Reliability Statistics

Figure 3.25: Acceptable Values for Cronbach’s Alpha

Interpretation: Figure 3.20
• Figure 3.20, which presents the reliability statistics table, gives
Cronbach’s alpha for our example as 0.699.
• This value seems to be within the acceptable limit as per George and
Mallery’s rule of thumb discussed earlier.
• The second column in Figure 3.20 gives Cronbach’s alpha
based on standardized items.
• This is the alpha value obtained when all the items are
standardized.
• This value is used when the individual scale items are not
uniformly scaled.

Figure 3.21: Table of Item Statistics
Figure 3.22: Inter Item Correlation Matrix

Interpretation: Figure 3.21 and 3.22

• Figure 3.21 exhibits the item statistics table. It presents the
mean of each item, its standard deviation, and the sample size (in
our case, this is 10).
• Figure 3.22 presents the inter-item correlation matrix. This
shows the correlation of each variable with every other variable in
matrix form.

Figure 3.23: Summary of Item Statistics
Figure 3.24: Item-Total Statistics Matrix

Interpretation: Figure 3.23
• Figure 3.23 presents the summary item statistics. It reports a very
important result: the mean (0.223) of the inter-item correlations.
• This is the r value in the formula used to calculate Cronbach’s alpha.
Each inter-item correlation is the correlation between one pair of items,
for example, between the first variable EFF1 and the second variable EFF2.
• With eight items, there are 8(8 − 1)/2 = 28 such pairs.
• The r value in the formula is the average of these pairwise
correlations. Hence, using this r value, Cronbach’s alpha can be
computed as:

Cronbach’s Alpha Formula and Computation
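
The formula on this slide did not survive extraction. A commonly used form that takes the mean inter-item correlation r and the number of items k is alpha = k·r / (1 + (k − 1)·r), which is the alpha based on standardized items. Whether this is exactly the formula the slide displayed is an assumption; the sketch below simply applies it to the values quoted above (k = 8, r = 0.223), and the function name is introduced for illustration.

```python
def alpha_from_mean_r(k, r_bar):
    """Standardized alpha from k items and mean inter-item correlation r_bar."""
    return k * r_bar / (1 + (k - 1) * r_bar)

# Values quoted in the text: eight efficiency items, mean inter-item r = 0.223
alpha = alpha_from_mean_r(8, 0.223)
print(round(alpha, 3))  # about 0.697
```

The result is close to, but not identical with, the 0.699 reported in Figure 3.20; rounding of r and the difference between raw and standardized alpha are possible reasons for the small gap.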


• Figure 3.24 exhibits the item-total statistics matrix, and it is the most
important figure for interpreting the internal consistency of the scale.
• The second column of the table presents ‘scale mean if item deleted’.
This is the mean of the summated score of the remaining items, excluding
the concerned item. So, 28.30 is the mean of the summated score of the
other seven variables, excluding the first variable EFF1.
• The third column, ‘scale variance if item deleted’, can be interpreted
in a similar manner.
• The fourth column in Figure 3.24 exhibits ‘corrected item-total
correlation’. This is the correlation of the concerned item with the
summated score of all other items. Against EFF1, 0.525 is the correlation
of the first variable EFF1 with the summated score of the remaining seven
items of the construct.
• The fifth column in the same figure exhibits ‘squared multiple
correlation’. This is the squared multiple correlation coefficient
obtained by regressing the concerned item on all the other items of the
construct.
• Against EFF1, 0.652 is the squared multiple correlation coefficient
obtained by regressing the first item EFF1 on the other seven
items of the construct.
• The last column in Figure 3.24 indicates ‘Cronbach’s alpha if item
deleted’. This is the overall value of alpha when the concerned item is
not included in the calculation.
• For example, when the first item EFF1 is not included in the
calculation, the value of Cronbach’s alpha will be 0.642.

• If this column’s values show a reasonable increase in the value of
alpha after deletion of an item, that item can be struck off.
• The last column indicates no reasonable increase in Cronbach’s
alpha after deletion of any item.
• So, there is no rationale for deleting any item from the study.
• This is, however, at the researcher’s discretion. For example, a few
researchers would like to drop variable seven, EFF7.
• This would increase alpha to 0.738.
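
The ‘Cronbach’s alpha if item deleted’ column can be reproduced in principle by recomputing alpha with each item left out in turn. The sketch below uses the standard variance-based alpha formula and a small hypothetical data set (five respondents, four items); it does not use the E-S-QUAL data from the figures, and the function names are introduced for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Variance-based alpha; rows hold one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item dropped in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(r) if j != i] for r in rows])
        for i in range(k)
    ]

# Hypothetical data: the fourth item moves against the other three
responses = [
    [4, 5, 4, 3],
    [3, 3, 2, 4],
    [5, 4, 5, 3],
    [2, 2, 3, 4],
    [4, 4, 4, 3],
]

full_alpha = cronbach_alpha(responses)
deleted = alpha_if_item_deleted(responses)
print(round(full_alpha, 3), [round(a, 3) for a in deleted])
```

In this made-up data, dropping the fourth item raises alpha noticeably, which is the kind of pattern that might lead a researcher to strike an item off.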

Reliability Analysis: Using SPSS

• Reliability Analysis\ReliabilitY Prob.xlsx
• Reliability Analysis\Reliability Analysis Prob.sav
• Reliability Analysis\Output Relaibility Analysis.spv
