
Chapter 4: Reliability & Validity

Reliability: To what extent can we say that the data are consistent?

Reliability is about consistency.
Several statistical techniques are available; each produces a
reliability coefficient ranging from 0.00 (totally inconsistent)
to +1.00 (totally consistent).

Different techniques to measure reliability

Approach/Method               Description                           Remarks

Test-Retest                   Measures consistency across time      Article needs to state the time
(coefficient of stability)                                          lapse (reliability tends to be
                                                                    higher when the interval is shorter)

Parallel-Forms                Measures consistency across forms     Two forms of the same instrument,
(coefficient of equivalence)                                        both intended to measure the same
                                                                    thing

Internal Consistency
1. Split-half                 Scores on the odd- and even-numbered  Expected to be high (a low value
                              items are computed separately and     is unusual)
                              then correlated

2. Kuder-Richardson #20       Order of items does not matter (all
   (K-R 20)                   possible split-half combinations
                              are taken into account)

3. Cronbach's alpha           Same as K-R 20 if items are
                              dichotomous; more versatile because
                              it also handles items with more than
                              2 possible values

Interrater Reliability
1. Kendall's coefficient      For ranked (ordinal) data
   of concordance

2. Cohen's kappa              For nominal (categorical) data

3. Intraclass correlation     Reliability of ratings (for raw
   (ICC)                      scores)

4. Pearson's product-moment   For raw scores
   correlation
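
As a rough illustration of the internal-consistency methods listed in the table, the sketch below computes a split-half coefficient, K-R 20, and Cronbach's alpha for a small made-up set of dichotomous responses. The response matrix and all function names are assumptions for illustration only, not taken from the chapter. The Spearman-Brown step-up applied to the split-half correlation is a standard correction that these notes do not mention.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(scores):
    """Correlate odd-item and even-item totals, then apply Spearman-Brown."""
    odd = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Step the half-test correlation up to full test length.
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(scores):
    """Cronbach's alpha; items may take any numeric values."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """Kuder-Richardson formula 20; items must be scored 0 or 1."""
    n, k = len(scores), len(scores[0])
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n  # proportion scoring 1
        sum_pq += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


# Hypothetical data: 6 respondents (rows) x 4 dichotomous items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print("split-half (Spearman-Brown):", round(split_half(responses), 3))
print("K-R 20:", round(kr20(responses), 3))
print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
```

On dichotomous items the last two values coincide, which matches the table's remark that alpha is the same as K-R 20 in that case. The split-half value depends on which items land in each half, whereas K-R 20 and alpha do not.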

Standard Error of Measurement:

Used to estimate the range within which a score would likely fall if a given
measured object were to be remeasured
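
The notes do not give a formula. One commonly used form is SEM = SD x sqrt(1 - reliability), and the sketch below uses it to build a band around an observed score. The function names, the example numbers (SD of 15, reliability of 0.91, observed score of 100), and the choice of z = 1.96 for a roughly 95% band are all assumptions for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0.00 and 1.00")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.96):
    """Band of observed +/- z * SEM; z = 1.96 gives roughly a 95% band."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical values, chosen only to make the arithmetic easy to follow.
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = score_band(observed=100.0, sd=15.0, reliability=0.91)
print(f"SEM = {sem:.2f}, band = ({low:.1f}, {high:.1f})")
```

The higher the reliability coefficient, the narrower the band: with a reliability of 1.00 the SEM is zero, and with a reliability of 0.00 it equals the standard deviation itself.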
Some warnings about reliability

1. Different methods of assessing reliability consider the issue of
consistency from different perspectives. For example, a high coefficient
of stability does not necessarily mean high internal consistency.

2. Reliability coefficients really apply to data, not to the measuring
instruments. They are characteristics of the data rather than of the
instruments that produce the data.

3. Place more faith in good results for large groups than for small
groups.

4. If a test is administered under time pressure, the various estimates of
internal consistency (split-half, K-R 20, alpha) will be spuriously high.
So don't be overly impressed.

5. Reliability is not the only criterion used to assess the quality of data.

Validity

- concerned with accuracy
- i.e. whether the measuring instrument measures what it purports to
measure
- reliability is a necessary but not sufficient condition for validity; valid data
are reliable, but not all reliable data are valid
- three kinds of validity:
1. Content Validity
   o Content experts evaluate the instrument; we should ask:
     - Who did the evaluations?
     - What did they check/do?
     - What was the outcome?
2. Criterion-Related Validity
   o Scores are compared with a relevant, established criterion
     variable
   o The two sets of scores are correlated to produce the validity
     coefficient
   o Two kinds:
     - Concurrent validity (the two measures are administered at the
       same or nearly the same time)
     - Predictive validity (the test is administered before the
       criterion is measured)
3. Construct Validity
   o Three ways to establish it:
     - Provide correlations with convergent and discriminant
       variables; scores should correlate highly with convergent
       variables (convergent validity) and weakly with discriminant
       variables (discriminant validity).
     - Show that certain groups obtain higher mean scores on the new
       instrument than other groups, with the groups determined on
       logical grounds prior to the test.
     - Conduct a factor analysis.
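
To make the correlational evidence described above concrete, the sketch below correlates scores on a new instrument with one convergent and one discriminant variable. All three score lists are invented for illustration, and the variable and function names are assumptions, not part of the chapter. The same correlation, computed between test scores and a criterion variable, would serve as a validity coefficient for criterion-related validity.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for the same 8 people on three measures.
new_instrument = [12, 15, 9, 20, 17, 11, 14, 18]
convergent_var = [30, 36, 25, 45, 40, 27, 33, 41]  # same construct, established measure
discriminant_var = [5, 3, 4, 4, 5, 3, 5, 3]        # construct expected to be unrelated

r_convergent = pearson_r(new_instrument, convergent_var)
r_discriminant = pearson_r(new_instrument, discriminant_var)

print(f"convergent r   = {r_convergent:.2f}")    # expected to be high
print(f"discriminant r = {r_discriminant:.2f}")  # expected to be near zero
```

A pattern of one high and one near-zero correlation is the kind of evidence the first construct-validity approach looks for. With only 8 made-up cases, the numbers illustrate the computation and nothing more.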

Warnings about Validity

1. Validity (like reliability) is a characteristic of the data, not the
instrument.
2. Correlation plays a central role in assessing construct validity
(the first two ways); hence, remember the warnings about
correlation.

Final Comments

1. How high should reliability and validity coefficients be?
Answer: They should be judged relative to those of other available
instruments.

2. Researchers should use multiple methods to assess reliability and
validity.

3. Reliability and validity relate to data quality, which by itself does not
determine the degree to which the study's results can be trusted.
Conclusions can still be worthless if the wrong statistical procedure is
used or the study design is deficient. In other words,
reliability and validity are important, but other important concerns must
be attended to as well.
