
Understanding the error of our ways: Mapping the concepts of validity and reliability

Patricia A. Higgins, RN, PhD
Andrew J. Straub, ANP, GNP

Clinicians increasingly desire evidence upon which to base their practice decisions. One of the difficulties in their decision-making, however, is answering the fundamental question, "How do I evaluate the relevance and applicability of the findings?" There are a number of factors involved in such an evaluation and, frequently, readers can easily determine the usefulness of a study's findings based on similarities to their own clinical setting, timeframe, and/or patient population. It may be more difficult, however, to understand and evaluate a study's measurement error, or the reliability (trustworthiness) and validity (truth) of its methods and measurement strategies, in part because the extensive body of literature associated with validity and reliability can be overwhelming. The purpose of this article is to provide a comprehensive overview of measurement error as it applies to research design and instrumentation issues. It is intended to serve as a succinct, practical reminder of the definitions and relationships of the concepts of validity and reliability. It is not intended to replace the essential, detailed discussions found in numerous textbooks and journal articles. The different dimensions of validity and reliability are briefly discussed and a concept map is used to illustrate their relationships. In the process of explaining or predicting the phenomena and/or processes of health care, researchers and clinicians must be able to evaluate the truthfulness, precision, and dependability of the instruments and measurement methods used to generate the knowledge for evidence-based practice.

Patricia A. Higgins is an Assistant Professor at Frances Payne Bolton School of Nursing, Case Western Reserve University, Cleveland, OH. Andrew J. Straub is a Nurse Practitioner at Evercare, United Health Care Services, Inc., Cleveland, OH.
Reprint requests: Dr. Patricia A. Higgins, Case Western Reserve University Nursing, 10900 Euclid Avenue, Cleveland, OH 44106-4904. E-mail: Patricia.Higgins@case.edu
Nurs Outlook 2006;54:23-29. 0029-6554/06/$-see front matter. Copyright © 2006 Mosby, Inc. All rights reserved. doi:10.1016/j.outlook.2004.12.004

Nurses, like all those involved in health care research and practice, continuously grapple with how to evaluate the importance of evidence produced from research investigations. Among the multiple issues to be considered, some may seem more straightforward than others; for example, the timeliness, research setting, cultural relevance, and/or types of subjects involved. On the other hand, evaluating the methodology (design and instruments) that produced those findings may not be as easily accessible, in part because of the complexity of the terms and relationships used to describe the reliability (trustworthiness) and validity (truth) of the study's measurement strategies.

Measurement error is defined as the difference between an abstract concept or idea, considered the "true" state, and the "observed" measurement provided by an empirical instrument.1 Measurement error can be described as either systematic or random. Although there is overlap, the statistical methodology associated with validity primarily focuses on reducing systematic error, or the relationship between the true score and the concept (or variable), while the methodology of reliability is concerned with minimizing random error, or the relationship of the observed score to the variable.1,2 The relationship between validity and reliability is mathematically illustrated through the classical test theory equation, X = t + e, in which X is the observed score, t is the true score, and e is error.3

The purpose of this article is to provide a comprehensive overview of measurement error as it applies to research design and instrumentation issues. Although this multifaceted approach is complicated, and the discussion for each type of validity and reliability is limited, we believe that it provides the most informative concept map. By necessity, however, this article is limited to domains, relationships, and statistical methods related to the most common types of validity and reliability in quantitative research.

Uniting concept mapping with a modified form of concept analysis seems an especially good fit, as they both provide mechanisms to increase our understanding of complex concepts. Concept analysis uses the literature to explain a complex mental abstraction, while concept mapping provides the diagram that helps us better visualize the concept and all its dimensions. As one method of conveying an idea, concept mapping is used to clarify the meaning of a concept, delineate its domains, and summarize relationships. As a strategy to promote meaningful learning, it also has been used to explain complex ideas and as a tool for individuals or
groups to graphically represent their personal ideas of a situation or event.4,5,6,7 A concept map is a hierarchical structure that is read vertically, from top to bottom, proceeding from the most abstract to the most concrete concepts. Figure 1, the concept map of validity and reliability, begins with the overarching concept of measurement error and ends with statistical measurement strategies. Its organization parallels the text of the article, and the numbers (lowest level of the map) correspond to numbers in the text and the legend.

VALIDITY

Validity addresses the inference of truth of a set of statements.8 The set consists of symbols that represent mathematical formulae and/or words and statements that comprise a line of reasoning in which some of the statements support acceptance of others (conclusions). In science, validity is essential to a research proposal's theoretical framework, design, and methodology, including how well specific tools or instruments measure what they are intended to measure. As such, it is considered an ideal state, to be pursued but never obtained. Knowing that we will never achieve an absolute outcome, scientists nevertheless engage in continuous efforts to attain a degree of validity.9

Even a cursory reading of the literature quickly alerts a reader to the many different, and sometimes confusing, discussions of validity. We have chosen the approach first used by Cook and Campbell8 and adapted by Pedhazur and Schmelkin.3 In this approach, validity is considered a unitary concept because its multiple domains have common characteristics that are not mutually exclusive.3 But similar to other abstract concepts, in which the whole is better understood through a discussion of its particulars, separate discussions are presented for its 2 primary domains, construct validity and design validity. Additionally, each of these concepts is broken down into even more specific types of validity; for example, criterion-related validity is examined as a subset of construct validity.

Construct Validity

Construct validity answers the question, "Is the concept accurately defined, and does the instrument or tool actually measure the concept that it is supposed to measure?" Understanding construct validity is challenging, but knowledge of 3 essential steps assists in grasping its complexity: (1) specification of context, (2) specifying and testing predicted relationships, and (3) interpreting results. The first step is determining the context or situation in which a construct or concept is used. Researchers work to establish a degree of construct validity for a particular concept that is specific to a theoretical framework.1 For example, if used within a physiological framework, the concept of "function" is defined differently than when it is used within a social interaction framework. In experimental designs, context may be described as statistical interaction, in which the meaning of the concept is constant but the effect of an intervention differs across populations, time, and locale.8 For instance, for the older adult, "function" may consistently be defined according to activities of daily living (ADL), but the effect of an intervention to improve ADL function may vary depending on subjects' age, medical diagnosis, culture, and/or setting (community outpatient clinic, acute care division, or nursing home unit). Theoretical specification is crucial, therefore, for construct validation,3 and, ideally, construct validation occurs through accrual of evidence from multiple studies that use congruent (but not necessarily identical) theoretical frameworks.

With an understanding that theoretical specification is a fundamental assumption of construct validity, there has been an historical shift in the thinking of measurement theorists from a focus on test validity, which emphasized corroborating a specific test or instrument, to validation of inferences based on the test results.10 This brings us to the second and third steps of construct validation: specifying and empirically testing the relationships (hypotheses) that involve a particular concept, and interpreting the results to determine if the measure/instrument validates the concept as defined by the theoretical relationships. Construct validity is a lengthy process that typically involves many years and several different studies and measurement iterations, with subsequent revisions of the concept's meaning and its indicators. Its strongest results are produced in situations where all the concepts of a theoretical framework have a degree of validity and reliability and there is strength and consistency in all the measured relationships of the concepts.1 For nurses, several decades of multidisciplinary research have produced a much richer understanding of the construct validity of such multifaceted concepts as pain, social support, uncertainty, and self-efficacy. Although the following 2 types of validity, content and criterion, are sometimes discussed separately from construct validity, they are presented here as 2 of its dimensions, representing steps 2 and 3 described above.

Content validity. Content validity is concerned with the adequacy of the items (questions) of an instrument to assess the concept, or domain, of interest.9 The processes of content validity are preceded by concept analysis (domain identification) and a developmental stage in which there is generation of an instrument.1,11 Content validity is an interpretation of the results of the tool development, a critical review of the instrument's items in order to assess semantic clarity, domain sampling adequacy, and coherence of items. The evaluation methods include (Figure 1, 1-3):

1. Literature review. Includes historical and current uses of the concept/instrument.

2. Personal reflection.

Figure 1. Concept Map

3. Analytical critique. (a) Analytical critique of the instrument by experts (clinicians and researchers), either individually or as a panel, in which both the individual items and the entire instrument are evaluated; (b) analytical critique of the instrument by potential subjects (focus groups).

Criterion-related validity. Criterion validity is concerned with the statistical testing of theoretical relationships within an instrument, between 2 instruments, and/or between an instrument and an event that occurs before, during, or after an instrument is used to measure the concept of interest.3 There are 4 types of criterion validity (Figure 1, 4-7):

4. Concurrent validity. Ability to detect a positive or negative statistical relationship between 2 instruments measuring the same concept at the same time. Reported as a correlation coefficient (r).

5. Convergent or divergent validity. Ability to detect a relationship (r) between the concept of interest and a concept that has a similar meaning (convergent), or the ability to detect the absence of a significant relationship between the concept of interest and one that has an opposite meaning (divergent).

6. Predictive validity. Ability of the measure of a concept to predict future phenomena or events, even if we cannot explain why the concept is predictive. The time period for prediction must be specified. Reported as a correlation coefficient (r).

7. Factor analysis. Ability of an instrument to operationalize a theoretical construct by determining the relationships of a set of variables.9 Requires strict adherence to theoretical guidelines for interpretation.

Design Validity

Design validity is concerned with the evaluation of a researcher's design (blueprint) of a study.12 Given that scientists seek to draw valid conclusions about the effect of the independent variable on the dependent variable and make valid generalizations about the population and settings of interest, statistical as well as non-statistical factors must be considered in their design deliberations.13 For the clinician, it can be just as daunting to determine if research results are relevant; that is, were the results obtained under circumstances that are somewhat similar to clinical practice? It is helpful, therefore, to know that knowledge of the clinical domain of interest must be combined with 3 closely related factors. The first is concerned with the
ethical treatment of human subjects; the second, the features and constraints of the clinical setting; and the third, the factors of design validity.

Design validity includes 3 broad categories: internal validity, external validity, and statistical conclusion validity.12 In some cases, statistical conclusion validity is considered a special case of internal validity; in other approaches, it is a separate category. It also differs from other types of validity in that it includes random error (reliability). To best illustrate its characteristics, statistical conclusion validity is discussed here as a third dimension of design validity.

The following descriptions of the 3 dimensions of design validity are drawn from the work of Cook and Campbell8 and Kirk,13 but discussions also are found in other research texts; for example, Burns and Grove.12 Internal and external validity claims or "threats" primarily apply to experimental or quasi-experimental designs; the claims of statistical conclusion validity are relevant for all designs. As with construct validity, the 3 dimensions are overlapping. Further, some of the claims of design validity overlap, or are based on, reliability measures. For example, low reliability of the measure(s) for any variable may create a threat to internal validity; that is, low reliability of measures may inflate the error variance and lead to a Type II error (concluding that there is no statistical difference between subject groups when there is an actual difference).8 And finally, strengthening one type of design validity may weaken another. For example, the more the researcher controls the effects of the treatment on test subjects, the better the statistical conclusion validity; however, this greater control of the treatment can also decrease external validity and construct validity.

Internal validity. Internal validity is concerned with the congruence between the theoretical assertions and the statistical relationship of 2 variables; that is, are 2 variables causally related, and is the direction of their relationship as hypothesized? Internal validity is most frequently discussed in terms of "threats": those factors that researchers are expected to anticipate and control for during proposal writing (a priori) or attempt to explain (a posteriori) when unexpected results materialize after completion of the data collection. Although discussed singly, the threats may occur simultaneously and in a cumulative or opposing manner. Based on the purpose, setting, theory, and design of a study, the investigator must decide the priority of validity claims.8 Similarly, the clinician must prioritize the validity claims of a particular study and decide if internal threats are sufficient to reject its results as unsuitable (Figure 1, 8-18):

8. History or concurrent events. Refers to events unrelated to the research that occur during data collection and may incorrectly contribute to a relationship between 2 variables.

9. Maturation. Occurs in pre-test and post-test designs when changes in subjects affect the dependent variable.

10. Testing. Repeated testing alters a subject's performance through familiarity with the tool.

11. Regression toward the mean. In pre-test/post-test and matched-group designs, the natural tendency of subjects' extreme scores to move toward the mean; thus, it cannot be concluded that the treatment caused the change.

12. Differential selection of participants. Occurs if there is systematic bias on the part of the researcher in recruiting subjects and/or assigning them to groups.

13. Differential loss of participants. May occur during recruitment and/or data collection.

14. Diffusion of treatment. Occurs when subjects in the control group inadvertently learn information intended only for the treatment group, or they receive the treatment intended for the treatment group.

15. Ambiguity about direction of causal relationships among variables. Occurs most frequently in cross-sectional designs.

16. Compensatory equalization of treatments. May be ethically required and/or socially desired. Results in the loss of the control group.

17. Compensatory rivalry of subjects. Competition between treatment groups may occur if the research protocol is known and subjects learn of their group membership (control or treatment).

18. Demoralization of control group subjects. Occurs when the group receiving the less desirable treatment becomes discouraged or resentful.

External validity. External validity is concerned with generalizing to particular people, settings, and times and across types of people, settings, and times. The former is concerned with whether research goals for an identified population have been met; the latter, with assessing how far one can generalize.8 The 4 factors discussed here may also be referred to as "threats," or statistical interaction effects. One additional factor that is not identified explicitly as a threat to external validity is sampling bias, in the form of non-probability or convenience sampling. It is used in many nursing study designs and limits the generalizability of the findings in the same way as the other threats to external validity (Figure 1, 19-22).

19. Interaction of subject selection and treatment. Sample bias occurs when a large number of subjects inadvertently share a salient characteristic or a large number of potential subjects decline to participate. Either can affect the experiment's dependent variable.

20. Interaction of setting and treatment, also known as ecological validity. Occurs when a treatment effect
happens only in a specific setting, preventing further generalization.

21. Interaction of history and treatment. The circumstances or times in which the study takes place are temporary and unique and unlikely to be repeated.

22. Interaction of multiple treatments. Treatments other than the experimental one affect the dependent variable. The treatments may be unknown or unanticipated by the researcher.

Statistical conclusion validity. Statistical conclusion validity is concerned with both systematic and random error and the correct use of statistics and statistical tests.8 It addresses the question, "Is there a relationship between two variables?" The types listed below are essential for experimental and quasi-experimental designs, and most are important for descriptive designs (Figure 1, 23-29):

23. Power. Low statistical power may lead to a Type II error: falsely concluding that no relationship or difference exists between 2 subject groups when one actually does.

24. Violation of assumptions of statistical tests. May result in incorrect inferences.

25. Confounded significance test levels. With certain statistical tests (for example, t-tests), the probability of drawing one or more erroneous conclusions increases as a function of the number of tests performed.

26. Reliability of measures. Includes a wide range of instrumentation error, from poor calibration of a physiological instrument to changes in an observer's performance.

27. Reliability of treatment implementation. Failure to standardize the treatment may lead to an increase in either Type I or Type II error.

28. Heterogeneity of the environmental setting. Unanticipated irrelevancies in the experimental setting of the study may lead to Type II error.

29. Heterogeneity of respondents. Idiosyncratic differences in subjects may lead to Type II error.

RELIABILITY

Reliability in the most general sense is "the extent to which an experiment, test, or any measuring procedure yields the same results on repeated trials."1 According to classical measurement theory, it is the ratio of true-score variance to observed-score variance. Similar to validity, perfect reliability is an ideal state that is never achieved. Additionally, there are a number of issues pertaining to the measurement and/or use of the term. First, for physiological measures, reliability is frequently referred to as accuracy and/or precision. Second, reliability should be used only to describe inferences from a particular sample's data set, test, or instrument. Also, although the concept map shows that reliability is multidimensional, with each dimension reflecting a different conceptual perspective, and although the dimensions overlap, a high degree of reliability from one dimension (for example, stability) does not indicate a high degree of reliability from another (for example, internal consistency). Therefore, authors who mistakenly refer to "the high reliability" of an instrument, without further identifying the conceptual or statistical perspective, prohibit readers from making an informed decision about the evidence presented.

The direct connection between the concepts of reliability and validity is illustrated through the understanding that "valid measurements require consistency of observation"14 but, also, that reliability is considered a necessary but not sufficient condition for validity.3 In other words, it is possible for a data set to have high reliability measures but low validity, but in order to have a high degree of validity, the data set also must be reliable. As an illustration, consider an electronic scale used to measure patient weights. Suppose we are testing the validity and reliability of this scale by repeatedly placing a 100-lb barbell on the scale and recording the reading each time. If the scale is to have a degree of validity, it must record consistent readings at or very near the "true" weight of 100 lbs each time. But suppose the researcher mistakenly calibrated the empty scale to 0.5 lb instead of 0 lbs, and subsequent readings are at or near 100.5 lbs each time; consequently, this data set would have high reliability but low validity.

As will become evident from the discussion below, the determination of reliability in human test subjects presents additional challenges. The following measures are all population-dependent indicators of reliability (random error).

Internal Consistency Reliability. Internal consistency is a measure of the degree to which each of an instrument's items measures the same characteristic. To compute internal consistency, a single version of an instrument is administered to a single group of test subjects at a single time point. The data are then analyzed for consistency using 1 of the following statistical methods1 (Figure 1, 30-32):

30. Cronbach's alpha. Also known as "coefficient alpha" or "alpha," Cronbach's alpha provides a general estimate of how well all the items on a test instrument measure the same phenomenon. It is based on the number of test items and their average inter-item correlations. The possible range of scores for alpha is 0.0-1.0, with lower scores indicating less internal reliability. Estimates of an acceptable alpha for an instrument are dependent on the maturity of the instrument and the sample population for which it was used. An alpha of ≥ 0.80 is recommended for well-established and widely used instruments.1
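Cronbach's alpha, as just described, can be computed directly from an item-score table. The sketch below uses a small invented set of Likert-type responses (not data from this article) and only the Python standard library:

```python
from statistics import variance  # sample variance (n - 1 denominator)

def cronbach_alpha(items):
    """Cronbach's alpha for k item-score columns (one list per item).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of the
    subjects' total scores); it rises with the number of items and their
    average inter-item correlation.
    """
    k = len(items)
    sum_item_vars = sum(variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]  # one total per subject
    return k / (k - 1) * (1 - sum_item_vars / variance(totals))

# Invented responses of 5 subjects to a 3-item scale (1-5 Likert ratings).
item1 = [4, 5, 3, 2, 4]
item2 = [4, 4, 3, 2, 5]
item3 = [5, 5, 2, 3, 4]
alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 2))  # 0.89
```

For these invented data, alpha is about 0.89, which would clear the 0.80 benchmark suggested above for well-established instruments; with real data the estimate describes only the sample on which it was computed.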


31. Kuder-Richardson Formulas. Similar to Cronbach's alpha, but used with scale items that are scored dichotomously.

32. Coefficient Theta. Reliability estimator, computed from principal component or factor analysis, that was developed to account for non-parallelism or multi-dimensionality in a set of items.15

Stability. Stability is the consistency of repeated measurements. A sample of subjects is tested more than once, under similar circumstances, utilizing the same instrument, which can be either a physiological or a paper and pencil instrument. It is used with phenomena in which little or no change is expected between the first and second trials. The data sets from the 2 test administrations are statistically compared from one test to the next. Assessing stability of measurement requires theoretical understanding of the concept of interest, the time between measurements, and intervening factors. In assessing a physiological measurement, for example, a subject's body weight, measured at two 15-minute intervals in an outpatient clinic, should be unchanged. Even if appropriate timing can be determined, assessing the stability of repeated measures of attitudes or psychological conditions can be more difficult, or even inappropriate, considering the variability of subject perceptions and/or the number of intervening factors. There is 1 common statistical method for testing stability1 (Figure 1, 33):

33. Test-Retest Method. Also called "Retest Reliability." Involves applying the same test instrument to the same test subjects at different points in time. Reported as a correlation coefficient (r).

Inter-rater Reliability and Equivalence. Inter-rater reliability refers to the degree to which researchers agree in administering and scoring a test instrument given to a group of subjects, or to the degree of consistency with paper and pencil data collection.12 Many potential threats to inter-rater reliability exist in any test situation. For instance, 2 researchers may vary considerably in rating an infant's response to pain, even if they have clear criteria and were trained in the same protocol. A second threat to inter-rater reliability can occur when data are copied from one source to another (for example, copying laboratory values from the subject's medical record to a data collection tool). To compute inter-rater reliability, ≥ 2 raters should complete at least 10 identical instruments. Generally, 80% agreement between raters is the minimum required.12 Four common methods for determining inter-rater reliability are (Figure 1, 34-37):

34. Alternative Form Method. Measure of test equivalence. Involves creating at least 2 versions of the test instrument and applying each version to the same group of test subjects at the same time, on at least 2 different occasions. Pearson's correlation coefficient is calculated.

35. Kendall's Coefficient of Concordance. Measures the degree to which ranked data from numerous raters agree with one another.

36. Percent Agreement. Ratio of the number of measurements in agreement to the total number of measurements.

37. Kappa and Phi. Computation of Cohen's Kappa or its equivalent, Phi, provides nearly equal inter-rater reliability coefficients. Because they control for chance agreement, they are preferable to the percent agreement statistic.16

The Bland-Altman plot17 is a relatively new statistical methodology that increasingly is being used to assess the equivalence between 2 instruments measuring the same phenomenon. Bland and Altman17 argue against the common method of using correlation coefficients to compare one set of repeated measures to another set of repeated measures, on the basis that correlations can lead to misleading conclusions about the agreement between 2 variables. The Bland-Altman method compares the measurements' differences to the mean of the measurements (in this analysis, the mean of the measurements is assumed to be the true or correct measurement against which both sets of experimental measurements are compared). The statistical limits of agreement are specified in advance of performing the statistical analysis. Most frequently used to assess equivalence between measures of physiological phenomena,17,18,19 Bland-Altman also has been used to assess equivalence of paper and pencil tests20 or between a paper and pencil test and a physiological instrument.21

38. Bland-Altman plots. Generated with MedCalc software (www.medcalc.be, Belgium).22 If the differences between the 2 methods fall within the mean difference ± 1.96 SD and are not clinically important, then the 2 methods are equivalent and may be used interchangeably.

CONCLUSION

Measurement permeates all aspects of our everyday lives. Daily activities and decisions are guided by it, and very often its systematic predictability is taken for granted. When we move into scientific inquiry, however, our perspective changes. In the process of explaining or predicting the phenomena and/or processes of health care, researchers and clinicians must carefully scrutinize the truthfulness, precision, and dependability of the instruments and measurement methods used to generate the knowledge for evidence-based practice. There are few hard and fast rules to guide us through the measurement maze23 and, thus, an understanding of validity and reliability becomes crucial for anyone who wants to correctly use the research process to advance nursing science and practice.
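To make the distinction between percent agreement and kappa (items 36 and 37 above) concrete, the following sketch computes both for a pair of invented nominal ratings; the data and labels are made up for illustration only:

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of paired ratings on which the 2 raters agree."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for the agreement
    expected by chance, given each rater's marginal category frequencies."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Invented ratings of 10 infants ("P" = pain, "N" = no pain) by 2 observers.
rater1 = ["P", "P", "N", "P", "N", "P", "N", "N", "P", "P"]
rater2 = ["P", "N", "N", "P", "N", "P", "N", "P", "P", "P"]
print(percent_agreement(rater1, rater2))       # 0.8
print(round(cohens_kappa(rater1, rater2), 2))  # 0.58
```

The raters agree on 80% of cases, meeting the minimum cited above, yet because both assign "P" 60% of the time, just over half of that agreement (0.52) would be expected by chance alone; kappa therefore drops to about 0.58, which is why kappa and phi are preferred to the percent agreement statistic.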

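The limits-of-agreement arithmetic behind the Bland-Altman plots described above can likewise be sketched without MedCalc (the paired readings are invented; a full analysis would also plot each difference against the pair's mean):

```python
from statistics import mean, stdev

def limits_of_agreement(a, b):
    """Bland-Altman limits of agreement for paired readings from 2 methods.

    Returns (bias, lower, upper): the mean of the paired differences and
    that mean +/- 1.96 sample standard deviations of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width

# Invented paired cardiac-output readings (L/min) from 2 instruments.
method_a = [4.8, 5.1, 5.6, 4.2, 6.0, 5.3]
method_b = [4.6, 5.3, 5.5, 4.4, 5.8, 5.4]
bias, lower, upper = limits_of_agreement(method_a, method_b)
print(round(lower, 2), round(upper, 2))  # -0.37 0.37
```

Here the bias is essentially zero and the limits are roughly ±0.37 L/min; whether differences of that size are clinically unimportant is the judgment, specified in advance, that the method requires.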

The authors would like to thank our colleagues and the anonymous reviewers for their comments.

REFERENCES

1. Carmines EG, Zeller RA. Reliability and validity. Newbury Park, CA: Sage Publications; 1979.
2. Knapp TR. Validity, reliability, and neither. Nurs Res 1985;34:189-92.
3. Pedhazur EJ, Schmelkin LP. Measurement, design and analysis: an integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates; 1991.
4. Artinian BM. Conceptual mapping: development of the strategy. West J Nurs Res 1982;4:379-93.
5. Artinian BM. Guiding students in theory development through conceptual mapping. J Nurs Educ 1985;24:156-8.
6. Beitz JM. Concept mapping: navigating the learning process. Nurse Educ 1998;23:35-41.
7. Irvine LMC. Can concept mapping be used to promote meaningful learning in nurse education? J Adv Nurs 1995;21:1175-9.
8. Cook TD, Campbell DT. Quasi-experimentation: design and analysis issues for field settings. Boston, MA: Houghton Mifflin Co; 1979.
9. Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York, NY: McGraw-Hill Publishing Company; 1994.
10. Goodwin LD. Changing conceptions of measurement validity. J Nurs Educ 1997;36:102-7.
11. Lynn MR. Determination and quantification of content validity. Nurs Res 1986;35:382-5.
12. Burns N, Grove SK. The practice of nursing research. 4th ed. Philadelphia, PA: WB Saunders Co; 2001.
13. Kirk RE. Experimental design. 2nd ed. Belmont, CA: Wadsworth, Inc; 1982.
14. Froman RD. Measuring our words on measurement. Res Nurs Health 2000;23:421-22.
15. Ferketich S. Internal consistency estimates of reliability. Res Nurs Health 1990;13:437-40.
16. Topf M. Three estimates of inter-rater reliability for nominal data. Nurs Res 1986;35(4):253-55.
17. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.
18. Albert NM, Hail MD, Li J, Young JB. Equivalence of bioimpedance and thermodilution methods in measuring cardiac output in hospitalized patients with advanced decompensated chronic heart failure. Am J Crit Care 2004;13:469-79.
19. Barton SJ, Chase T, Latham B, Rayens MK. Comparing two methods to obtain blood specimens from pediatric central venous catheters. J Pediatr Oncol Nurs 2004;21:320-6.
20. Pesudovs K, Garamendi E, Elliott DB. The quality of life impact of refractive correction (QIRC) questionnaire: development and validation. Optom Vis Sci 2004;81:769-77.
21. Trivel D, Calmels P, Leger L, Busso T, Devillard X, Castells J, et al. Validity and reliability of the Huet questionnaire to assess maximal oxygen uptake. Can J Appl Physiol 2004;29:623-38.
22. MedCalc. www.medcalc.be. (Accessed December 1, 2004).
23. Knapp TR, Brown JK. Ten measurement commandments that often should be broken. Res Nurs Health 1995;18:465-9.

