
Measurement

An overview of measurement
Measurement is central to the process of obtaining
data. How, and how well, the measurements in a
research project are made is critical in determining
whether the research project will be a success.

“You cannot manage what you cannot measure. You cannot
measure what you cannot operationally define. You cannot
operationally define what you do not understand. You will not
succeed if you do not manage.”
— Defense Management System
Measuring Anything that Exists
• Measurement – Careful, deliberate observations
of the real world for the purpose of describing
objects and events in terms of the attributes
composing the variable.
• There are usually several different options for
measuring any particular variable:
direct measurement (e.g. height, weight)
vs. indirect measurement
(e.g. motivation, knowledge, memory,
marital satisfaction)
THE NATURE OF MEASUREMENT
1. The process of assigning numbers or scores to
attributes of people or objects.
2. The process of describing some property of a
phenomenon of interest by assigning numbers in a
reliable and valid way
Precise measurement requires:
a) Careful conceptual definition – i.e. careful definition of
the concept (e.g. loyalty) to be measured
b) Operational definition of the concept
c) Assignment rules by which numbers or scores are
assigned to different levels of the concept that an
individual (or object) possesses.
• Conceptualization – The mental process whereby
fuzzy and imprecise notions (concepts) are made
more specific and precise.

– How would you conceptualize…


• Learning?
• Attitude?
• Quality?
Conceptualization
Conceptualization is the refinement and
specification of abstract concepts.
Conceptualization – The process through
which we specify what we mean when we use
particular terms in research.
Dimensions (D) and elements (E) of the concepts (C) learning and achievement motivation

Concept: Learning
– Dimension: Understanding
  – Element: Answer questions correctly
  – Element: Give appropriate examples
– Dimension: Retention (recall)
  – Element: Recall material after some lapse of time
– Dimension: Application
  – Element: Solve problems applying concepts understood and recalled
  – Element: Integrate with other relevant material

Concept: Achievement motivation
– Dimension 1: Driven by work
  – Element: Constantly working
  – Element: Persevering despite setbacks
  – Element: Very reluctant to take time off for anything
– Dimension 2: Unable to relax
  – Element: Thinks of work even at home
  – Element: Does not have any hobbies
– Dimension 3: Impatience with ineffectiveness
  – Element: Swears under one’s breath when even small mistakes occur
  – Element: Does not like to work with slow or inefficient people
– Dimension 4: Seeks moderate challenge
  – Element: Opts to do a challenging rather than a routine job
  – Element: Opts to take on a moderate rather than an overwhelming challenge
– Dimension 5: Seeks feedback
  – Element: Asks for feedback on how the job has been done
  – Element: Is impatient for immediate feedback
Operationalization
• Conceptualization is the refinement and
specification of abstract concepts.
• Operationalization is the development of
specific research procedures that will result in
empirical observations representing those
concepts in the real world.
1. Conceptual Definition
 Concept - A generalized idea about a class of
objects, attributes, occurrences, or processes.
 Examples: Gender, Age, Education, brand
loyalty, satisfaction, attitude, market
orientation

Variable - Anything that varies or changes from
one instance to another; can exhibit differences
in value, usually in magnitude or strength, or in
direction. The variable for the study can be the
“concept”.
1. Conceptual Definition
 Concepts must be precisely defined for effective
measurement.
 E.g. consider the following definitions of “brand
loyalty”:
1. “The degree to which a consumer consistently
purchases the same brand within a product
class.” (Peter & Olson)
2. “A favorable attitude toward, and consistent
purchases of, a particular brand”. (Wilkie, p.276)
 The two definitions have different implications for
measurement – they imply different
operationalizations of the concept of brand loyalty
2. Operational Definition/Operationalization
 Operational definition - A definition that gives meaning to a
concept by specifying what the researcher must do (i.e.
activities or operations that should be performed) in order to
measure the concept under investigation.
 Operationalization - The process of identifying scales that
correspond to variance in a concept.
For example:
 Conceptual definition # 1 for brand loyalty in the previous slide
implies that in order to measure loyalty for brand A (operational
definition), you will need to:
1) Observe consumers’ brand purchases over a period of time, and
2) Compute the percent of purchases going to brand A
 For conceptual definition # 2 you will need to:
1) Observe consumers’ brand purchases over a period of time,
2) Compute the percent of purchases going to brand A, and
3) Ask consumers questions to determine their attitudes toward
brand A
3. Rules of Measurement
 Guidelines established by the researcher for assigning
numbers or scores to different levels of the concept (
dimensions and elements) that different individuals (or
objects) possess
 The process is facilitated by the operational definition.
 For example, if you operationalized brand loyalty as “purchase
sequences” (conceptual definition # 1), then you may establish the
following rules for assigning scores:
 If consumer purchased brand A:

 90% or more –> loyalty for brand A = 1 (Extremely loyal)


 80 - 89% –> loyalty for brand A = 2 (Very loyal)
 70 - 79% –> loyalty for brand A = 3 (Loyal)
 Etc.
 In this case, we have assigned the numbers 1, 2, 3 to
different levels of loyalty toward brand A. We have
measured loyalty for brand A.
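The assignment rules above can be sketched as a small function. The cut-offs follow the slide; the catch-all category below 70% is an assumption filling in the slide’s “Etc.”:

```python
def loyalty_score(pct_brand_a: float) -> tuple:
    """Map the percent of purchases going to brand A to a loyalty score,
    following the assignment rules above (the <70% category is assumed)."""
    if pct_brand_a >= 90:
        return (1, "Extremely loyal")
    elif pct_brand_a >= 80:
        return (2, "Very loyal")
    elif pct_brand_a >= 70:
        return (3, "Loyal")
    else:
        return (4, "Not loyal")  # assumed catch-all for the slide's "Etc."

print(loyalty_score(92))  # (1, 'Extremely loyal')
print(loyalty_score(75))  # (3, 'Loyal')
```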
MEASUREMENT SCALES
 To effectively carry out any measurement (whether
in the physical or social sciences) we need to use
some form of a scale.
 A scale is any series of items (numbers) arranged
along a continuous spectrum of values for the purpose
of quantification (i.e. for the purpose of placing
objects based on how much of an attribute they
possess)
 E.g. the thermometer consists of numbers
arranged in a continuous spectrum to indicate the
magnitude of “heat” possessed by an object.
The Four Levels of Scale Measurement
 Four levels of scale measurement result from this mapping
1. Nominal Scale: a scale in which the numbers or letters assigned
to an object serve only as labels for identification or
classification, e.g. Gender (Male=1, Female=2)
2. Ordinal Scale: a scale that arranges objects or alternatives
according to their magnitude in an ordered relationship, e.g.
a 100-meter race (Gold = 8.9 sec, Silver = 9.1 sec, Bronze = 13 sec)
3. Interval Scale: a scale that arranges objects according to
their magnitude and distinguishes this ordered arrangement in
units of equal intervals, but does not have a natural zero
representing absence of the given attribute, e.g. the
temperature scale (40°C is not twice as hot as 20°C)
4. Ratio Scale: a scale that has absolute rather than relative
quantities and an absolute (natural) zero where there is an
absence of a given attribute, e.g. income, age.
Characteristics of Different Levels of Scale Measurement

Type of Scale | Data Characteristics | Numerical Operation | Descriptive Statistics | Examples
Nominal | Classification but no order, distance, or origin | Counting | Frequency in each category; percent in each category; mode | Gender (1=Male, 2=Female)
Ordinal | Classification and order but no distance or unique origin | Rank ordering | Median; range; percentile ranking | 100-meter race status (1=Gold, 2=Silver, 3=Bronze)
Interval | Classification, order, and distance but no unique origin | Arithmetic operations that preserve order and magnitude | Mean; standard deviation; variance | Temperature in degrees; satisfaction on a semantic differential scale
Ratio | Classification, order, distance and unique origin | Arithmetic operations on actual quantities | Geometric mean; coefficient of variation | Age in years; income in Saudi riyals

Note: All statistics appropriate for lower-order scales (nominal being lowest) are appropriate for
higher-order scales (ratio being the highest)
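The “descriptive statistics” column can be illustrated with Python’s standard statistics module; the data below are hypothetical, one small list per scale level:

```python
import statistics

# Hypothetical data at each scale level, paired with a statistic appropriate to it
gender  = [1, 2, 2, 1, 2, 2]        # nominal: 1=Male, 2=Female
medals  = [1, 2, 3, 1, 2]           # ordinal: 1=Gold, 2=Silver, 3=Bronze
temps_c = [18.0, 21.5, 19.0, 22.5]  # interval: degrees Celsius
ages    = [23, 35, 41, 29]          # ratio: years

print(statistics.mode(gender))          # nominal  -> mode: 2
print(statistics.median(medals))        # ordinal  -> median: 2
print(statistics.mean(temps_c))         # interval -> mean: 20.25
print(statistics.geometric_mean(ages))  # ratio    -> geometric mean
```

Per the note above, each level also supports all the statistics of the levels below it.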
Likert Scale
Strongly Disagree = 1, Disagree = 2, Neutral = 3, Agree = 4, Strongly Agree = 5

1. My job offers me the chance to test myself and my abilities.   1 2 3 4 5
2. Mastering these jobs meant a lot to me.                        1 2 3 4 5

What type of scale is it?
INDEX MEASURES
• Index measures use combinations (or collection) of
several variables to measure a single construct (or
concept); they are multi-item measures of
constructs.

Example 1: Index Measure


Construct: Social class
Measures: Linear combination (index) of occupation,
education, income.
Social class = β1·Education + β2·Occupation + β3·Income
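A minimal sketch of computing such an index; the weights (betas) and component scores below are hypothetical, since in practice the betas would come from theory or a statistical model:

```python
# Hypothetical weights and component scores for the social-class index
betas  = [0.5, 0.3, 0.2]    # assumed weights for education, occupation, income
person = [14.0, 6.0, 52.0]  # years of education, occupation rank, income (thousands)

# Linear combination: beta1*Education + beta2*Occupation + beta3*Income
social_class = sum(b * x for b, x in zip(betas, person))
print(round(social_class, 2))  # 0.5*14 + 0.3*6 + 0.2*52 = 19.2
```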
CRITERIA FOR GOOD MEASUREMENT
 Three criteria are commonly used to assess
the quality of measurement scales in
marketing research:
1. Reliability
2. Validity
3. Contextual sensitivity
RELIABILITY

 The degree to which a measure is free from


random error and therefore gives consistent
results.
 An indicator of the measure’s internal
consistency
Reliability
– Stability (Repeatability)
  – Test-Retest
– Internal Consistency
  – Split halves
  – Equivalent forms
Assessing Stability (Repeatability)
• Stability  the extent to which results obtained
with the measure can be reproduced.
1. Test-Retest Method
• Administering the same scale or measure to the same
respondents at two separate points in time to test for
stability.
2. Test-Retest Reliability Problems
• The pre-measure, or first measure, may sensitize the
respondents and subsequently influence the results of the
second measure.
• Time effects that produce changes in attitude or other
maturation of the subjects.
Assessing Internal Consistency
• Internal Consistency: the degree of homogeneity
among the items in a scale or measure
1. Split-half Method
• Assessing internal consistency by checking the results of one-
half of a set of scaled items against the results from the other
half.
• Coefficient alpha (α)
– The most commonly applied estimate of a multiple item
scale’s reliability.
– Represents the average of all possible split-half reliabilities
for a construct.
2. Equivalent forms
• Assessing internal consistency by using two scales designed to
be as equivalent as possible.
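Coefficient alpha can be computed directly from a respondents-by-items score matrix; a sketch with hypothetical 5-point Likert responses:

```python
import statistics

def cronbach_alpha(items):
    """Coefficient alpha for a list of respondents' item-score lists.

    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items[0])
    columns = list(zip(*items))  # per-item score lists
    item_var_sum = sum(statistics.variance(col) for col in columns)
    total_var = statistics.variance([sum(row) for row in items])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 5-point Likert responses: 6 respondents x 4 items
scores = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 2, 3, 3],
]
print(round(cronbach_alpha(scores), 3))  # → 0.93
```

The high value reflects items that vary together across respondents, i.e. a homogeneous scale.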
VALIDITY
• The accuracy of a measure or the extent to which
a score truthfully represents a concept.
• The ability of a measure (scale) to measure what
it is intended to measure.
• Establishing validity involves answering the following:
– Is there a consensus that the scale measures what it is
supposed to measure?
– Does the measure correlate with other measures of
the same concept?
– Does the behavior expected from the measure predict
actual observed behavior?
Validity
– Face or Content validity
– Criterion validity
  – Concurrent
  – Predictive
– Construct validity
ASSESSING VALIDITY
1. Face or content validity: The subjective agreement
among professionals that a scale logically appears to
measure what it is intended to measure.
2. Criterion Validity: the degree of correlation of a
measure with other standard measures of the same
construct.
• Concurrent Validity: the new measure/scale is taken at
same time as criterion measure.
• Predictive Validity: new measure is able to predict a future
event / measure (the criterion measure).
3. Construct Validity: degree to which a measure/scale
confirms a network of related hypotheses generated
from theory based on the concepts.
• Convergent Validity.
• Discriminant Validity.
Relationship Between Reliability & Validity

1. A measure that is not reliable cannot be


valid, i.e. for a measure to be valid, it must
be reliable  Thus, reliability is a
necessary condition for validity
2. A measure that is reliable is not
necessarily valid; indeed a measure can be
reliable but not valid  Thus, reliability is not a
sufficient condition for validity
3. Therefore, reliability is a necessary but not
sufficient condition for Validity.
CONTEXTUALLY SENSITIVE
• The ability of a measure/scale to accurately measure
variability in stimuli or responses across different places, times, and jobs;
• The ability of a measure/scale to make fine distinctions
among respondents or objects with different levels of
the attribute (construct).
– Example: A typical bathroom scale is not sensitive enough to be used to
measure the weight of jewelry; it cannot make fine distinctions among
objects with very small weights.
– Example: an instrument measuring some aspect of a culture-sensitive
concept such as “organizational commitment”
QUESTIONNAIRE DEVELOPMENT
MAJOR WAYS TO COLLECT DATA
• Administer a standardized or existing instrument,
e.g. SERVQUAL, 16PF, etc.
– Need to thoroughly check validity, reliability, suitability and
feasibility.
– Due to some factors, such as cultural effects, we may need to
modify the standardized instrument.
– After modification, recheck the validity and reliability.
• Record naturally available data,
e.g. sales, absenteeism, etc.
– The data sometimes cannot fit exactly the needs of the study.
PROCESS FOR SELECTING AN EXISTING INSTRUMENT
• List the variables to be measured by the content of the
instruments.
• Examine sources of information about existing instruments.
– Buros’ Mental Measurement Yearbooks [10]. Lists published instruments in
print at the time of publication and provides descriptions and critiques of
the instruments. Found in libraries. Published periodically, not annually.
– Tests in Print [11]. Provides title, publication date, authors, publishers and
comments. Found in libraries.
– Publishers’ catalogs. Provide information about instrument’s purpose,
content, administration time, cost and scoring services available. Acquired
from publisher.
– Specimen sets. Usually includes a specimen instrument, technical manual,
administrator’s manual, answers sheets and information on ordering costs
and additional services such as scoring and/or interpretation. Acquired
from publisher.
– Professional journals. Occasionally new instruments are available
from authors of journal articles who usually have conducted studies
using these instruments. Many of these have insufficient evidence
of validity and reliability.
• Examine the technical manual to assess the evidence
provided for validity and reliability. Check that types,
magnitudes and sampling for validity and reliability are
appropriate.
• Examine the information on interpretation of the
instrument. There should be sufficient information for the
user to interpret the results correctly.
• Examine the directions to assess their clarity and ease of
use.
• Examine administration procedures including the time
necessary for subjects to complete the instrument.
• Examine the scoring procedures for sufficient detail, ease
and clarity to avoid errors.
The process for estimating content validity
1. Examine the variables of interest and list the tasks or skill or other
characteristics involved.
2. Add to the list the importance (criticality) and frequency of occurrence
of each of the tasks or skills.
3. Reexamine the list and make sure that all skills or tasks which are crucial
to the variable are included even if they occur infrequently. Add any
which may have been omitted.
4. Compare each of the tasks or skills on the list to the items of the
measure to ensure that each crucial and frequently occurring task or
skill is measured by at least one item. Usually more items are included
to assess those skills which are more important and those which occur
frequently, an additional aspect of the representativeness of the
measure to the variable being measured.
5. Examine each item of the measure to ensure that the difficulty level is
appropriate for the variable being measured. For example, if algebra
ability is being assessed a series of items requiring calculus would
indicate poor content validity. If the ability to follow simple directions is
being assessed, items which require college level reading ability would
indicate poor content validity.

NOTE:
Content validity is rarely represented by a numerical figure because it is a
logical process of comparing the components of a variable to the items of a
measure.
THE PROCESS OF ESTIMATING CONCURRENT VALIDITY

1. Gather scores from the non-validated instrument
administered to a validity sample.
2. Gather scores from a previously validated instrument which
purports to measure the same variable and which is
administered to THE SAME SAMPLE at APPROXIMATELY THE
SAME TIME.
3. Compute a correlation coefficient between the two sets of
scores.
THE PROCESS OF ESTIMATING PREDICTIVE VALIDITY

1. Gather scores on the predictor variable from a group of
subjects (the validity sample) for whom the instrument is
appropriate.
2. Gather scores on the criterion variable from the SAME
sample AT A LATER TIME.
3. Compute a correlation coefficient between the two sets of
scores.
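Both criterion-validity estimates end with the same step: correlating two score vectors. A minimal sketch with hypothetical scores from a small validity sample:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical validity sample: new-instrument scores vs. criterion scores
new_scale = [12, 15, 9, 20, 17, 11]
criterion = [30, 35, 25, 44, 40, 28]
print(round(pearson_r(new_scale, criterion), 3))  # → 0.997
```

For concurrent validity the criterion scores come from the same sample at approximately the same time; for predictive validity they come from the same sample at a later time.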
THE PROCESS OF ESTIMATING
CONSTRUCT VALIDITY
1. Examine the theory associated with the variable of interest.
2. Select behaviors which the theory indicates would
differentiate subjects with differing amounts of the variable.
For example, employees with high self-concept would achieve
at a higher level, be promoted more often and be more open in
interpersonal relationships than would low self-concept
employees.
3. Administer the instrument measuring the variable of
interest to the validity sample and record the scores.
4. Gather scores for the validity sample on each of the
behavior selected in step #2.
5. Analyze the data using appropriate statistical tests to
ascertain if subjects scoring high on the major variable and
those scoring low are statistically differentiated on each of
the selected criterion variables.
6. Accept evidence of construct validity if each of the
statistical tests indicates a significant difference or a
significant relationship between high and low scores on the
major variable and the criterion variables. If even one of the
hypothesized relationships is not supported statistically,
then the instrument cannot be said to evidence construct
validity.
7. Examine reasons if construct validity is not supported.
Possible reasons include: (1) the theory is incorrect, (2) the
instrument was not a valid measure of the variable of
interest, or (3) there may have been errors in the
administration of the instrument, scoring or analysis of the
data.
THE PROCESS OF ESTIMATING TEST-RETEST
RELIABILITY.
NOTE: One form and two administrations of the instrument are
required.

1. Administer the instrument to the reliability sample at Time 1.
2. Wait a period of time (e.g., 2-4 weeks).
3. Administer copies of the same instrument to the same
sample at Time 2.
4. Correlate the scores from Time 1 and Time 2.
THE PROCESS OF ESTIMATING EQUIVALENT
FORMS RELIABILITY
• Note: two forms and two administrations of the instrument
are required.
1. Administer Form A of the instrument to the reliability
sample.
2. Break the sample for a short rest period (10-20 minutes).
3. Administer Form B of the instrument to the same reliability
sample.
4. Correlate the scores from Form A and Form B.
THE PROCESS OF ESTIMATING SPLIT-HALF
RELIABILITY
• NOTE: One form and one administration of the instrument are required.
1. Obtain or generate an instrument in which the two halves were
formulated to measure the same variables.
2. Administer the instrument to the reliability sample.
3. Correlate the summed scores from the first half (often the odd-numbered
items) with the summed scores from the second half (often the
even-numbered items).
4. Compute the Spearman-Brown prophecy formula* to correct for
splitting one instrument into halves.

*Spearman-Brown Prophecy Formula: r_corrected = n·r_xx / (1 + (n − 1)·r_xx)
where r_xx = uncorrected reliability, and n = number of splits (for two
halves, n = 2)
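The correction formula is a one-liner; the split-half correlation of 0.70 below is a hypothetical example value:

```python
def spearman_brown(r_xx, n=2):
    """Spearman-Brown prophecy formula: corrects the half-test correlation
    for splitting one instrument into n parts (n = 2 for two halves)."""
    return (n * r_xx) / (1 + (n - 1) * r_xx)

# A hypothetical split-half correlation of 0.70 corrects to:
print(round(spearman_brown(0.70), 3))  # → 0.824
```

The corrected value is always at least as large as the uncorrected one, because a full-length test is more reliable than either half alone.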
THE PROCESS FOR COMPUTING KR-20
RELIABILITY ESTIMATES
• Note: One form and one administration of the instrument is
required.
1. Generate or select an instrument.
2. Administer the instrument to the reliability sample.
3. Compute the variance (σ²x) of the total scores.
4. Compute the proportion of correct responses (p) to each item.
5. Compute the proportion of incorrect responses (q) to each item.
6. Compute the KR-20 formula.*

*r_KR-20 = (k / (k − 1)) · (1 − Σpq / σ²x)
where k = number of items
σ²x = variance of total scores
p = proportion of correct (passing) responses
q = proportion of wrong (not passing) responses, or 1 − p
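The KR-20 formula applied to hypothetical dichotomous (pass/fail) data, following the steps above:

```python
import statistics

def kr20(responses):
    """KR-20 reliability for dichotomous (0/1) item responses,
    given as a list of per-respondent answer lists."""
    k = len(responses[0])
    n = len(responses)
    columns = list(zip(*responses))
    pq_sum = 0.0
    for col in columns:
        p = sum(col) / n       # proportion correct for this item
        pq_sum += p * (1 - p)  # p*q for this item
    total_var = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# Hypothetical pass/fail answers: 5 respondents x 4 items
answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
print(round(kr20(answers), 3))  # → 0.848
```

KR-20 is the special case of coefficient alpha for items scored 0/1.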
THE PROCESS OF ESTIMATING INTERRATER
RELIABILITY
1. Select or generate an instrument
2. Randomly select a number of objects or events to be rated.
3. Train the raters.
4. Have rater #1 judge each object or event independently.
5. Have rater #2 judge each object or event independently.
6. Correlate the scores of the two raters.*
*Other statistical techniques may be used if there are more
than two raters. The statistical technique also depends on
the level of measurement of the rating instrument used.
STAGES IN THE DEVELOPMENT OF A QUESTIONNAIRE

Planning Stage:
– Exploratory research
– Formulation of hypotheses
– Information required
– Population of relevance
– Target group
– Method of data collection

Design Stage:
– Order of topics
– Wording and instructions
– Type of question
– Layout
– Scales
– Probes and prompts

Pilot Stage:
– Pilot testing
– Is the design efficient?
– Coding
– Time and cost

Final questionnaire
DESIGNING QUESTIONNAIRES OR INTERVIEW SCHEDULES

1. List the variables and their operational definitions.
2. Consider including demographic variables, if relevant.
3. Specify the type of respondents.
4. Select a mail, telephone or personal approach.
5. Determine amount of structure for the questionnaire or interview.
6. Determine the level of measurement desired and the type of response
format for each item.
7. Write the questionnaire items.
8. Check the items for invalidating factors and appropriate levels of
measurement.
9. Determine the placement and sequencing of items.
10. Write the introduction, directions and ending.
11. Determine the degree of interviewer direction to respondent, if interviewing.
12. Train the questionnaire administrator or interviewer.
13. Compute the interrater reliability, if interviewing.
14. Use techniques for increasing response, if using the mail.
15. Conduct a pilot study.
16. Check that questions elicit appropriate measures of the variables and levels of
measurement, ease of response, ease of administration, and that the time of
response for the instrument is appropriate.
17. Revise the instrument, if necessary.
SELECTED WAYS TO INCREASE RESPONSE RATE OF
MAILED QUESTIONNAIRES
• Include a cover letter which appeals to the respondent’s affiliation, such as “as
a graduate of Universiti Utara Malaysia we feel you will want to_____”
• Mail a reminder postcard about __ days after the first mailing. Every respondent
will have to receive this postcard unless subjects who have returned
questionnaires are identified.
• Mail a second questionnaire about ___ weeks after the postcard.
• Contact nonrespondent by telephone
• Enclose a token
• Write clear directions.
• Mention how little time is required to complete the questionnaire.
• Avoid open-ended items, if possible. People are more likely to respond to a
format in which they can check items rather than generate responses.
• Structure item responses so respondent can answer quickly and easily.
• Structure the entire questionnaire so the respondent can complete it easily
and quickly. Make placement and sequencing of items logical and easy to
follow.
• Ensure that the questionnaire is professionally typed and printed so that
its appearance gives the impression of credibility and professionalism.
WRITING TIPS FOR MULTIPLE CHOICE
ITEMS
• Avoid leading or biased stems.
• Avoid items which respondents cannot or will not answer, or will not
answer truthfully.
• Ensure that the categories are mutually exclusive – that the response can
only be in one category.
• Ensure that the categories are inclusive – that they include all reasonable
answers to the question.
• Add a category “Other, please specify_____” if uncertain that the categories
are inclusive.
• Read about existing instruments, if including a scale.
• Limit the number of choices. Three to seven alternative responses are
usually sufficient with five alternatives commonly used. Beyond seven
choices the respondent has some difficulty in discriminating and the time
of response increases.
WRITING TIPS FOR OPEN-ENDED ITEMS

• “What is your reaction to the current Federal Budget Deficit?” is an open-
ended item. This item could elicit responses from “I think it stinks” to a
four-paragraph treatise on the subject. It would be better to place that
item into a multiple-choice format and save open-ended items for
measuring variables which require creative responses, such as “What
would you recommend to decrease the Federal Budget Deficit?”
• Never use written open-ended items to elicit dichotomous responses
such as yes/no or agree/disagree. Provide these responses for the
respondent to check, to save both respondent and researcher time.
CHECKLIST FOR EACH QUESTIONNAIRE ITEM*

• Will item yield data in the form required by the hypotheses or research
questions and the operational definitions?
• Will item yield data at the level of measurement required for the selected
statistical analysis?
• Does item avoid “leading” respondent to a specific response?
• Is item unbiased?
• Will most respondents have sufficient knowledge to answer item?
• Will most respondents be willing to answer item?
• Will most respondents answer item truthfully?

* All answers should be yes before proceeding with questionnaire.


INFORMATION USUALLY INCLUDED IN QUESTIONNAIRE
INTRODUCTIONS AND COVER LETTERS
• Name of the organization conducting the study. This is not always the same as
the sponsor of the study. Often giving the name of the sponsor, such as
_____________, will elicit different responses than will “the survey
corporation.”
• Purpose of the project. This should not be a lengthy explanation but a brief
general statement such as “we are interested in your opinions of various
products.”
• How respondent was selected. Phrases such as “your name was randomly
selected _________.”
• Expression of appreciation for respondent’s help. Phrases such as “we
would very much appreciate your help _____.”
• Estimate of questionnaire completion time. Phrases such as “this should
only take about 10 minutes of your time.”
• Assurance of non-identification of respondent. Phrases such as “you will
not be identified by name.”
• Assurance of confidentiality of responses. Phrases such as “all responses
will remain confidential.”
• Directions for completion. Complete information on how to mark
responses should be given such as “ Please circle ONE answer for each of
the following questions.” Directions are not included in a cover letter but
are placed directly on the questionnaire where the respondent can easily
see them.
SUGGESTIONS FOR ITEM PLACEMENT
• Place introduction in a separate cover letter or at the top of the first page.
• Place non-sensitive demographic items at the beginning of the
questionnaire because they are easy to answer, non-threatening and tend
to put the respondent at ease.
• Place items of major interest next, as there is a greater probability of
respondents completing the first part of the questionnaire.
• Sequence items of major interest in logical order, usually with items on the
same topic grouped together. Item sequence can lead to biased responses
so be careful that one item does not influence the response to the
following item.
• Place sensitive items (e.g. income) last so that resentment of the
intrusiveness of these items does not affect other responses.
• Group items with the same response formats together, if using mixed
response formats, unless this interferes with the desired sequencing of
items.
THE PROCESS FOR CONDUCTING A PILOT TEST
• Select a sample similar to the sample to be used in the study but which
will not be included in the study.
• Copy a sufficient number of questionnaires for the pilot-test sample.
• Instruct the questionnaire administrators to make notes of respondent
questions about the items or directions.
• Administer the questionnaires or interview schedules.
• Check the results of the pilot test:
– Did the items yield the desired information in the appropriate form?
– Did the items yield the selected levels of measurement?
– Were the directions clear to the respondents?
– Were respondents able to answer the questions easily?
– How much time did the fastest 90% of the respondents take to complete the
questionnaire or interview?
– How much time did it take 90% of the fastest questionnaire administrators to
give directions and complete the probing phase?
• Revise the questionnaire if the answers to the pilot-test check above are unsatisfactory.
• Retrain the questionnaire administrators or interviewers, if their
performance was unsatisfactory.
Survey Research
Surveys

Surveys ask respondents for information


using verbal or written questioning