
Rheumatology 2011;50:500–505

RHEUMATOLOGY doi:10.1093/rheumatology/keq357
Advance Access publication 11 November 2010

Concise report
Measurement properties of the osteoarthritis of
knee and hip quality of life OAKHQOL questionnaire:
an item response theory analysis
Christophe Goetz1,2, Emmanuel Ecosse1,3, Anne-Christine Rat1,2,4,
Jacques Pouchot1,5, Joel Coste1,3 and Francis Guillemin1,2

Abstract
Objective. To further document the measurement properties of each domain of the OA of knee and hip
quality of life (OAKHQOL) questionnaire by a Rasch analysis.
Methods. The OAKHQOL self-administered questionnaire has been developed to assess health-related
quality of life in lower limb OA. Patients with various degrees of severity of knee or hip OA answered the
questionnaire. For each domain, their responses to the items were analysed with a Rasch family model,
the partial credit model. We examined the fit of data to model expectations, as well as assumptions of
unidimensionality and local independence. Invariance was assessed by analysis of differential item functioning
(DIF) by sex, age and joint. Analyses used the RUMM2020 software (Rumm Laboratory, Perth,
Western Australia).
Results. Responses for 544 patients were analysed: 297 had medically managed OA and 247 were
waiting for arthroplasty surgery. For the 40 items of the OAKHQOL, data analysis showed 5 with disordered
thresholds and 9 with DIF (5 for joint, 3 for sex and 1 for age). Ten pairs of items showed local
dependence and four domains showed unidimensionality. Full-item domains and domains without the
misfitted items did not differ in patient-estimates data; therefore any bias at the item level is negligible
when considering the domain scores.

Conclusion. The five domains of the OAKHQOL questionnaire show good measurement properties by
Rasch analysis and provide valid scales.
Key words: Osteoarthritis, Quality of life, Item response theory, Rasch, Questionnaire.

1 EA 4360 Apemac, Nancy University, Paris Descartes University, Paul Verlaine University, Nancy,
2 Department of Clinical Epidemiology and Evaluation, INSERM CIC-EC CIE6, Nancy University Hospital, Nancy,
3 Biostatistics and Epidemiology Unit, Hopital Cochin, AP-HP, Paris,
4 Department of Rheumatology, Nancy University Hospital, Nancy and
5 Department of Internal Medicine, Hopital Europeen Georges Pompidou, Assistance Publique, Hopitaux de Paris,
Paris, France.

Submitted 8 April 2010; revised version accepted 23 September 2010.

Correspondence to: Francis Guillemin, Ecole de Sante Publique, Faculte de Medecine BP 184,
54505 Vandoeuvre-les-Nancy, France. E-mail: francis.guillemin@medecine.uhp-nancy.fr

© The Author 2010. Published by Oxford University Press on behalf of the British Society for Rheumatology.
All rights reserved. For Permissions, please email: journals.permissions@oup.com

Introduction

Quality of life (QoL) instruments are widely used to measure the impact of diseases in terms of pain and
symptoms as well as physical functioning, mental health and social functioning. Their usefulness is
recognized once they have undergone a rigorous validation process and show good measurement
properties [1]. Because lower limb OA affects specific aspects of QoL [2], specific instruments may be
useful [3].
The OA knee and hip QoL (OAKHQOL) is a self-administered questionnaire specifically developed for knee and
hip OA [4]. Its factorial structure was assessed by exploratory principal components analysis (PCA). Its
psychometric properties were evaluated according to the classical test theory (CTT): construct validity,
reproducibility and sensitivity to change [5]. Results were satisfactory but the CTT did not consider
important aspects of measurement such as item properties, ordering of response categories and invariance
across conditions and populations.
The item response theory (IRT) provides supplementary methods to investigate the properties of health
outcome instruments [6, 7]. This theory describes the level of a subject's characteristics on the measured
outcome (underlying trait) as a function of responses to the items as well as item properties. The simplest
IRT model is the Rasch model [8], which has proved useful to evaluate the properties of health outcome
measures [9]. We aimed to further document and validate the measurement properties of each item of the
OAKHQOL, using Rasch analysis.

Materials and methods

This non-interventional study was approved by the French Consultative Committee for data processing in
health research (CCTIRS) and authorized by the National Committee on Informatics and Freedom (CNIL).
Written consent was obtained from each included patient.

OAKHQOL scales

The OAKHQOL questionnaire contains 43 items and describes QoL in five domains: physical activities
(16 items), mental health (13 items), pain (4 items), social support (4 items), social functioning
(3 items); and 3 independent items [5]. Each item is scored on a scale from 0 to 10. When at least half the
item scores in a domain are missing, the score for that domain is dropped. Scores are obtained by computing
the mean of the item scores for each domain and normalized to a scale from 0 (worst) to 100 (best possible
QoL).

Patients

We used data from a multicentre cohort constituted for the first validation study of the OAKHQOL
questionnaire [4, 5] and for the AMISAT study [10]. Patients were recruited between 2002 and 2005 from six
rheumatology and orthopaedic surgery outpatient clinics in Paris and in the Lorraine region (France).
Inclusion criteria were hip or knee OA according to Altman's criteria [11], age >18 years and no other
disabling disorder. Exclusion criteria were an indication of total hip or knee replacement surgery for a
reason other than OA and another total hip or knee replacement within the previous year.
For the analyses, patients at the post-operative stage were excluded because their number was low, while
patients medically treated and waiting for surgery were considered to represent a wide enough variety of OA
severity. Data for patients with both hip and knee OA or with missing data on the involved joint were
excluded to avoid a bias in analysis of the measurement invariance by the involved joint. Data for
questionnaires with >50% missing data in a domain were excluded.

Rasch analysis

Analyses were conducted separately for each OAKHQOL domain. We used a partial credit model [12], which
extends the Rasch model for polytomous items. Distance between response categories was constrained to be
equal within each item, as an 11-category rating scale shows near-interval-scale properties [13].
Fit of the data to the models was assessed according to recent recommendations [7]. Details regarding the
analyses are presented in supplementary data available at Rheumatology Online. Item fit was explored with
standardized residuals (expected to be between −2.5 and +2.5, with a mean (S.D.) value of 0 (1)), chi-square
statistics and examination of the item characteristic curves (ICCs). The internal consistency of the domains
was examined with a person separation index (PSI) as defined by Andrich and Douglas [14]. A PSI > 0.85 is
required to use the scores at the individual level.
The invariance of the scales was assessed by looking for differential item functioning (DIF) [15] across
several factors: sex, age (<60, 60–70 and >70 years) and involved joint (hip or knee). Items with DIF were
split by group to allow item parameters to be estimated separately in each group.
Residual correlation matrices were examined to determine local independence of the scales. Local dependence
between two items is detected when the correlation of their residuals is higher than that of the residuals
between most pairs of items. Unidimensionality was determined through the PCA of the residuals.
Finally, to check whether the misfitting items of a domain create a significant bias in the domain score,
we compared the person-estimates with and without these items. A non-significant difference in
person-estimates (i.e. levels of QoL of patients) means that the misfitting items do not create significant
bias at the domain level.
The final sample size was 544 and met literature recommendations [16, 17]. Rasch analysis was conducted
using RUMM2020 v4.1 software (Rumm Laboratory, Perth, Western Australia). All P-values were adjusted
according to the Bonferroni method, and overall significance level was set to 0.05.

Results

A sample of 648 patients was available. Data for 50 patients with >50% missing responses for a domain,
28 patients at a post-operative stage and 26 patients with both hip and knee OA or missing data regarding
the involved joint were excluded from the Rasch analysis. Of the remaining 544, 62% were women; 56 and 44%
had hip and knee OA, respectively; 55% were receiving medical care and 45% were waiting for surgery. The
item difficulties [level of QoL needed to answer positively to an item (see supplementary data available at
Rheumatology Online)] demonstrated good coverage of the sample for all domains except social functioning
(see supplementary figure 1, available as supplementary data at Rheumatology Online).

Physical activities domain (16 items)

The domain showed a lack of fit to the model (Table 1). Items 4, 5, 7, 8 and 10 showed a DIF for joint;
Item 28 a DIF for sex. The residuals and ICC suggested misfit for Items 3, 6, 9, 14 and 28 (Table 2). Local
dependence was found between Items 4 and 5 (r = 0.42), Items 7 and 8 (r = 0.49) and Items 13 and 14
(r = 0.48). The PCA of the residuals showed a significant deviation from unidimensionality.

www.rheumatology.oxfordjournals.org 501

TABLE 1 Fit statistics for the models used for each OAKHQOL domain

                       Mean (S.D.) of fit residuals (a)    Item–trait interaction
Domain/model           Items           Persons             Chi-square   Df (b)   P-value    PSI (c)

Physical activities
  Full-item domain     0.79 (2.402)    0.35 (1.59)         235.7        64       <0.0001    0.94
  Eight-item set       0.07 (1.380)    0.62 (1.54)         53.2         32       0.05       0.89
Mental health
  Full-item domain     1.01 (3.425)    0.16 (1.36)         267.3        52       <0.0001    0.93
  Six-item set         0.18 (1.095)    0.30 (1.06)         42.3         24       0.002      0.88
Pain
  Full-item domain     0.32 (1.702)    0.71 (1.33)         30.3         16       0.08       0.89
Social support
  Full-item domain     0.39 (0.582)    0.47 (1.21)         76.0         36       0.006      0.91
Social functioning
  Full-item domain     1.02 (1.540)    0.41 (1.18)         60.7         12       <0.0001    0.80

(a) Expected mean and S.D. are 0 and 1, respectively. (b) Df: degrees of freedom. (c) Person separation index.
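
As an illustration of the PSI column in Table 1, the sketch below uses a common formulation of the index: the share of the observed variance of the person estimates that is not attributable to measurement error. The function name, the input values and the use of the sample variance are assumptions made for the example, and RUMM2020 may differ in computational detail.

```python
from statistics import mean, variance


def person_separation_index(estimates, standard_errors):
    """PSI sketch: (observed variance - mean error variance) / observed variance.

    estimates:       person location estimates (logits), one per person.
    standard_errors: standard error of each person estimate, same order.
    """
    observed_var = variance(estimates)  # sample variance of person estimates
    error_var = mean([se ** 2 for se in standard_errors])  # mean squared SE
    return (observed_var - error_var) / observed_var


# Invented values for five persons (illustration only).
toy_estimates = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_errors = [0.5, 0.5, 0.5, 0.5, 0.5]
psi = person_separation_index(toy_estimates, toy_errors)
print(round(psi, 2), psi > 0.85)
```

With these invented values the index clears the 0.85 threshold that the methods give for using scores at the individual level.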

Removal of items showing misfit, DIF or local dependence one by one resulted in a set of eight items:
Items 1, 2, 3, 6, 9, 11, 13 and 24 (the fit of Items 3, 6 and 9 improved during the iterative process, so
they were kept in the core set). This set of items showed good fit to the model (Table 1), no local
dependence was left and unidimensionality was respected. The comparison between person-estimates of the
core set and person-estimates of the full-item domain showed that the difference was non-significant,
meaning that the removed items do not create significant bias in the full-item domain.

Mental health domain (13 items)

The domain showed lack of fit to the model (Table 1). Items 16 and 36 showed a DIF for sex and Item 29 a DIF
for age. The residuals and ICC suggested misfit for Items 18, 20, 29 and 38 (Table 2). Local dependence was
found between Items 16 and 17 (r = 0.51), Items 19 and 20 (r = 0.32) and Items 36 and 37 (r = 0.41).
Unidimensionality was respected according to the PCA of residuals.
Removal of items showing misfit, DIF or local dependence one by one resulted in a set of six items
(Items 15, 17, 19, 21, 35 and 37) showing good fit to the model, despite a significant chi-square statistic
(Table 1). The difference between person-estimates of the reduced domain and the full-item domain was
non-significant, meaning that the removed items do not create significant bias in the full-item domain.

Pain domain (four items)

The domain demonstrated good fit to the model (Table 1). No DIF was found for joint, sex or age. No item was
misfitting according to the residuals and the ICC (Table 2). Residual correlations were found between
Items 26 and 27 (r = 0.22) and Items 33 and 34 (r = 0.30). Unidimensionality was respected according to the
PCA of residuals. Removing Items 27 and 34 left no local dependence but had no impact on person-estimates.

Social support domain (four items)

The domain showed moderate fit to the model (Table 1). No item was misfitting according to the residuals and
the ICC (Table 2). No DIF was found for joint, sex or age. Local independence was respected; only a weak
residual correlation was found between Items 42 and 43 (r = 0.15). Unidimensionality was respected according
to the PCA of residuals. Removing Item 43 left no local dependence but had no impact on person-estimates.

Social functioning domain (three items)

The domain showed a lack of fit to the model (Table 1). No DIF was found and no item was misfitting according
to the residuals and the ICC (Table 2). Local dependence was found between Items 31 and 32 (r = 0.10).
Removing Item 32 left no local dependence but had no impact on person-estimates. Unidimensionality was
respected according to the PCA of residuals.

Discussion

The Rasch analysis of the OAKHQOL confirms the validity of the five scales, despite some misfitting items
that did not compromise the measurement properties of the five domains. Of 40 items, 9 showed DIF, 2 showed
misfit and 10 pairs showed local dependence, but these findings do not create a significant bias at the
domain level. The target population covered most of the severity levels of OA. For all domains except social
functioning (only three items), the range of QoL for the sample was well covered by the item difficulties.
The questionnaire is thus well suited to discriminate small variations in QoL for our target population.
This corroborates the good results obtained when mapping the OAKHQOL items with the OA core set of the


TABLE 2 Difficulties and fit statistics of the OAKHQOL items after splitting items showing DIF

Domain/item Difficulty Fit residual P-value

Physical activities
Q1. Walking 0.15 0.31 0.56
Q2. Bending or straightening 0.29 0.24 0.53
Q3. Carrying heavy things 0.30 3.19 0.03
Q4. Going down stairs (hip) 0.18 1.10 0.67
Q4. Going down stairs (knee) 0.41 0.50 0.71
Q5. Climbing stairs (hip) 0.16 1.67 0.76
Q5. Climbing stairs (knee) 0.40 0.93 0.18
Q6. Taking a bath 0.18 5.64 0.001*
Q7. Dressing (hip) 0.13 0.47 0.95
Q7. Dressing (knee) 0.25 0.29 0.63
Q8. Cutting toenails (hip) 0.33 1.85 0.51
Q8. Cutting toenails (knee) 0.03 0.02 0.39
Q9. Getting moving after staying in the same position 0.29 3.91 0.01
Q10. Getting in and out a car (hip) 0.22 1.06 0.41
Q10. Getting in and out a car (knee) 0.11 1.24 0.25
Q11. Using public transport 0.16 2.31 0.01
Q13. Need to spare oneself 0.11 0.42 0.80
Q14. Take longer doing things 0.15 2.61 0.06
Q24. Staying for a long time in the same position 0.32 0.66 0.84
Q25. Need a stick to walk 0.88 1.64 0.01
Q28. Need help (women) 0.37 0.13 0.13
Q28. Need help (men) 1.09 2.00 0.0002*
Mental health
Q15. Feel depressed because of pain 0.03 0.09 0.04
Q16. Been afraid of being dependent on others (women) 0.36 0.47 0.54
Q16. Been afraid of being dependent on others (men) 0.23 0.16 0.26
Q17. Been afraid of becoming an invalid 0.31 0.03 0.24
Q18. Embarrassed when people see me 0.23 2.02 <0.0001*
Q19. Worry 0.06 0.89 0.03
Q20. Feel depressed 0.26 1.45 <0.0001*
Q21. Hindered in family life 0.20 3.36 0.12
Q29. Feel older than my years (<60-years old) 0.12 3.25 0.56
Q29. Feel older than my years (>60-years old) 0.06 1.96 0.42
Q35. Wonder what is going to happen 0.06 1.54 0.005
Q36. Feel aggressive and irritable (women) 0.31 1.52 0.79
Q36. Feel aggressive and irritable (men) 0.10 0.12 0.34
Q37. Feel being a burden to close relatives 0.24 0.03 0.72
Q38. Worried about the side effects of treatment 0.14 6.08 <0.0001*
Q41. Feel embarrassed to ask for help 0.48 1.53 0.004
Pain
Q26. Frequency of pain 0.35 1.40 0.38
Q27. Intensity of pain 0.21 0.88 0.99
Q33. Having difficulties getting to sleep because of pain 0.26 1.97 0.002
Q34. Wake up at night because of pain 0.30 1.57 0.06
Social support
Q39. Talking about arthritis problems 0.01 0.32 0.04
Q40. Feel others understand arthritis problems 0.03 0.14 0.01
Q42. Feel support from people close to me 0.12 0.18 0.001*
Q43. Feel support from people around 0.10 1.22 0.39
Social functioning
Q30. Able to plan for the future 0.12 2.67 0.37
Q31. Going out whenever would like 0.07 0.38 <0.0001*
Q32. Have friends in whenever would like 0.05 0.76 0.0002*

When the fit residual was not between −2.5 and +2.5 or the chi-square test was significant, the misfit was confirmed by
examining the ICC. Items considered misfitting are shown in italics in the table. *Significant at 0.05 level after Bonferroni
adjustment.
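
The screening rule stated in this footnote can be written out as a short sketch. It is an assumption-laden illustration rather than the authors' code: the function name `needs_icc_review`, the item labels, the example values and the number of tests (40) are hypothetical, and the paper does not report how many comparisons entered the Bonferroni adjustment.

```python
def needs_icc_review(fit_residual, p_value, n_tests, alpha=0.05, limit=2.5):
    """Return True when an item should be checked against its ICC.

    An item is flagged if its fit residual lies outside [-limit, +limit]
    or if its chi-square P-value is below the Bonferroni-adjusted level
    alpha / n_tests. n_tests is a hypothetical parameter here.
    """
    outside_range = abs(fit_residual) > limit
    significant = p_value < alpha / n_tests
    return outside_range or significant


# Invented (fit residual, P-value) pairs, not rows of Table 2.
examples = {
    "item A": (-3.2, 0.03),    # residual outside the expected range
    "item B": (0.4, 0.0002),   # significant after adjustment
    "item C": (1.1, 0.04),     # neither criterion met
}
n_tests = 40  # hypothetical number of comparisons
flags = {
    label: needs_icc_review(residual, p, n_tests)
    for label, (residual, p) in examples.items()
}
print(flags)
```

Dividing the significance level by the number of tests is equivalent to multiplying each P-value by that number and comparing it with 0.05.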


International Classification of Functioning, Disability and Health [18].
We used a conservative approach and retained in the scales any misfitting item as long as the domain score
remained unbiased. Indeed, the Rasch model was originally considered a confirmatory tool [8] and may not be
appropriate to select items to be removed. Removing items to obtain fit is not a guarantee that the scale
improved [19]. Moreover, removing items should imply a consideration of the content validity and a
validation of the shortened instrument with an independent sample [20].
Another issue justifying caution when interpreting misfit of items is that fit results depend on the power
of the tests. Power increases with the sample size and variance of the subjects (i.e. the PSI). With our
large sample and high PSI, even small differences between observed and expected values may result in
statistically significant misfit [16]. This situation can explain why some chi-square statistics remained
significant, while the comparison of person-estimates showed that the bias in the full-item domains was
negligible.
For every item demonstrating DIF, a clinical interpretation can be found. For example, Items 4 and 5 show
that going down or climbing stairs is, as expected, more difficult for people with knee than for those with
hip OA. In contrast, dressing, cutting toenails and getting in and out of a car are more difficult with hip
OA. Some other explanations are obvious ('feel older than my years' and DIF for age), some are more subtle
('been afraid of being dependent on others' and DIF for sex), but all make sense. In the same way, all
observed issues of local dependence appear between items that are conceptually close (e.g. 'need to spare
oneself' and 'take longer doing things'). These are violations of the Rasch model assumptions, but we showed
that in the present case they do not create a significant bias for the domain scores.

Conclusion

The five domains of the OAKHQOL questionnaire show good measurement properties by Rasch analysis and provide
valid scales to measure the specific QoL of patients with hip or knee OA who are under medical management
or waiting for surgery.

Rheumatology key messages
- The OAKHQOL questionnaire is valid for use in its current five-scale format.
- The OAKHQOL shows good measurement properties through Rasch analysis.

Acknowledgements

Funding: This work was supported by the Clinical Epidemiology Center, INSERM CIE6 (Institut National de la
Sante et de la Recherche Medicale), Health Ministry PHRC (Programme Hospitalier de Recherche Clinique),
Nancy University Hospital, France.

Disclosure statement: The authors have declared no conflicts of interest.

Supplementary data

Supplementary data are available at Rheumatology Online.

References

1 Leidy NK, Revicki DA, Geneste B. Recommendations for evaluating the validity of quality of life claims for
labeling and promotion. Value Health 1999;2:113–27.
2 Bombardier C, Melfi CA, Paul J et al. Comparison of a generic and a disease-specific measure of pain and
physical function after knee replacement surgery. Med Care 1995;33:131–44.
3 Hawker G, Melfi C, Paul J, Green R, Bombardier C. Comparison of a generic (SF-36) and a disease specific
(WOMAC) (Western Ontario and McMaster Universities Osteoarthritis Index) instrument in the measurement of
outcomes after knee replacement surgery. J Rheumatol 1995;22:1193–6.
4 Rat AC, Coste J, Pouchot J et al. OAKHQOL: a new instrument to measure quality of life in knee and hip
osteoarthritis. J Clin Epidemiol 2005;58:47–55.
5 Rat AC, Pouchot J, Coste J et al. Development and testing of a specific quality-of-life questionnaire for
knee and hip osteoarthritis: OAKHQOL (OsteoArthritis of Knee Hip Quality Of Life). Joint Bone Spine
2006;73:697–704.
6 Edelen MO, Reeve BB. Applying item response theory (IRT) modeling to questionnaire development,
evaluation, and refinement. Qual Life Res 2007;16:5–18.
7 Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: what is it and why use it? When should
it be applied, and what should one look for in a Rasch paper? Arthritis Rheum 2007;57:1358–62.
8 Rasch G. Probabilistic models for some intelligence and attainment tests. Copenhagen: Danmarks
Paedagogiske Institut, 1960.
9 Tennant A, McKenna SP, Hagell P. Application of Rasch analysis in the development and application of
quality of life instruments. Value Health 2004;7:S22–6.
10 Baumann C, Rat AC, Osnowycz G et al. Do clinical presentation and pre-operative quality of life predict
satisfaction with care after total hip or knee replacement? J Bone Joint Surg Br 2006;88:366–73.
11 Altman R, Alarcon G, Appelrouth D et al. The American College of Rheumatology criteria for the
classification and reporting of osteoarthritis of the hip. Arthritis Rheum 1991;34:505–14.
12 Masters GN. A Rasch model for partial credit scoring. Psychometrika 1982;47:149–74.
13 Coste J, Walter E, Venot A. A new approach to selection and weighting of items in evaluative composite
measurement scales. Stat Med 1995;14:2565–80.
14 Andrich D, Douglas GA. Reliability: distinctions between item consistency and subject separation with the
simple logistic model. Paper presented at the Annual Meeting of the American Educational Research
Association. New York, 1977.
15 Teresi JA, Fleishman JA. Differential item functioning and health assessment. Qual Life Res
2007;16:33–42.
16 Smith AB, Rush R, Fallowfield LJ et al. Rasch fit statistics and sample size considerations for
polytomous data. BMC Med Res Methodol 2008;8:33.
17 Linacre JM. Optimizing rating scale category effectiveness. J Appl Meas 2002;3:85–106.
18 Rat AC, Guillemin F, Pouchot J. Mapping the osteoarthritis knee and hip quality of life (OAKHQOL)
instrument to the international classification of functioning, disability and health and comparison to five
health status instruments used in osteoarthritis. Rheumatology 2008;47:1719–25.
19 Gustafsson J. Testing and obtaining fit of data to the Rasch model. Br J Math Stat Psychol
1980;32:205–33.
20 Coste J, Guillemin F, Pouchot J, Fermanian J. Methodological approaches to shortening composite
measurement scales. J Clin Epidemiol 1997;50:247–52.

