DOI 10.1007/s10862-012-9309-2
Although these guidelines can assist the health care professional in determining the severity of depressive symptomatology, it is recommended to determine norm scores in a variety
of non-clinical samples as well. The current study aimed to
establish norm scores in a large adult community sample.
The traditional approach to deriving norm scores is splitting a group into subgroups based on relevant background
variables. A disadvantage of this approach is that the sample
size per subgroup is reduced, resulting in less reliable norms. A multiple
regression analysis approach to norming questionnaire data
overcomes this problem. This approach allows for examination of whether background variables (e.g., gender, age)
are important for calculating norm scores. Importantly, it is
possible to test for interactions between predictors. If
these interactions are significant, norms should be created on the basis of
subgroups, which amounts to the traditional approach to norming questionnaire data. The
strength of a multiple regression approach is thus that one
can examine whether it is necessary to provide norm data
separately for various background variables. For example,
Van Breukelen and Vlaeyen (2005) found that pain coping
and cognitions were not predicted by gender, but by level of
education instead, suggesting that norm data do not have to
be given for males and females separately. Similarly, Van
der Elst et al. (2006) found that performance on the Concept
Shifting Test was influenced by age, gender, and level of
education, but not by any of their interactions. Consequently, the most stable and simple norming was obtained by
applying a regression model with age, gender and education
as predictors of test score, and applying that model to the
complete sample.
In the current study, we calculated norm scores of the
BDI-II in a large community sample of adults. Gender was
hypothesized to be a significant predictor of BDI-II scores
as prevalence rates of depression in females are generally
higher than in males (Picinelli and Wilkinson 2000).
Evidence for a relation between age and symptoms of depression is equivocal, with studies finding positive associations (e.g., Glenn et al. 2001), a curvilinear relationship
(Steer et al. 1999), or no relationship (Beck et al. 1996b).
Age was therefore examined as both a linear and curvilinear
(quadratic) effect. There is also some evidence that education level is (negatively) related to depressive symptoms
(Arnau et al. 2001) and education level was therefore hypothesized to predict BDI-II scores as well. The aim of the
current study was to determine norm scores for the BDI-II
following a state-of-the-art methodology in a large community sample. The effects of gender, age, education level and
the interaction effects of these predictors were examined to
determine whether norming should take place in the total
group or in subgroups. Norming of the BDI-II was done on
the total scale scores as factor analytic studies have yielded
variability in, and instability of, obtained factor solutions of
Method
Participants and Procedure
Data were collected as part of a large-scale community
screening program for the recruitment of individuals who
could participate in a randomized clinical trial on the effectiveness of computerized cognitive behavioral therapy for
depression. A random selection of individuals in the general
population (age range: 18–65 years) received an invitation
letter to complete a screening questionnaire via the internet.
Six municipalities in the Southern part of the Netherlands
cooperated by providing names and addresses of their residents. More specifically, the municipalities provided a total
of 217,816 names and addresses, and letters were sent to
these addresses with the request to participate in this study.
In the letter, no reference was made to research on the
treatment of depression but this information was provided
on the website where participants completed the BDI-II. In
the letter, it was emphasized that everyone could participate
even if one had no symptoms of depression. A total of 8,960
individuals responded. The first 1,460 completed the BDI-primary care version;
thereafter, the BDI-II was used instead, and 7,500
individuals completed it. Participants were not
reimbursed for their efforts (see de Graaf et al. 2009). A
comparison of the demographic variables of the current
sample and the population in the Southern part of the Netherlands (Statistics Netherlands, www.cbs.nl) did not reveal
any major discrepancies. The Ethical Committee of the
academic hospital Maastricht and Maastricht University approved the study protocol. Of the community sample, 57.3 % were female (N = 4,300) and 42.7 % male
(N = 3,200). Mean age was 43.3 years (SD = 13.3 years,
range 18–65 years). Mean duration of depressive complaints
was 2.2 months (SD = 3.2). A total of 70.7 % had a paid job,
10 % were students, 8.2 % received a disablement insurance
benefit, 6.5 % were retired early, and 4.2 % had no work. A
total of 97 % of the total sample were Caucasian.
Instruments
Beck Depression Inventory
The Beck Depression Inventory-II (BDI-II; Beck et al.
1996b; Dutch version: Van der Does 2002) is a 21-item
self-report depression inventory designed to assess symptoms and level of depression. Each
item consists of four statements about a
particular symptom of depression, scored from zero to three.
The respondent chooses the statement that best represents his or her mood
during the last 2 weeks. Total scores can range between 0
and 63 with higher scores reflecting higher levels of depression. Reliability and validity of the BDI-II have been supported (e.g., Osman et al. 2008; Van der Does 2002).
Demographic Variables
Demographic variables included gender, age, and education.
Regarding education, the highest completed level of training
was rated on an 8-point scale with 1 = no education,
2 = elementary school, 3 = lower technical and vocational
training, 4 = medium technical and vocational training,
5 = higher general secondary education, 6 = pre-university education, 7 = bachelor's degree, 8 = master's degree. Education
level was further categorized as follows: low education
(1, 2, 3: 14.7 %), medium education (4, 5, 6: 50.4 %), and
high education (7, 8: 35 %). Duration of depressive symptoms was rated on an 8-point scale with 1 = less than 1 month,
2 = 1 month, 3 = 2 months, 4 = 3 months, 5 = 4 months,
6 = 5 months, 7 = 6 months, and 8 = more than 6 months.
Statistical Analyses
The Statistical Package for the Social Sciences (SPSS, version 15.0) was used to perform regression analyses in order
to determine a parsimonious model for obtaining BDI-II
norms (see Van Breukelen and Vlaeyen 2005, for a detailed
description). Total BDI-II score was the dependent variable
in the regression analyses, and gender, age, education level,
and their interactions were the predictor variables. Dummy
coding was used for the categorical predictors gender
(females = 0, males = 1) and education level (low, medium,
high), with low education as reference group. Dummy coding involves the inclusion of a regression weight in the
model to represent the mean scale difference between the
reference category and each other category, adjusted for all
other predictors in the model. Linear and quadratic terms
were included for the quantitative predictor age, which was
centered to prevent collinearity between linear and quadratic
age terms.
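As an illustration of the coding scheme described above, the following Python sketch builds the predictor values for one respondent. It is a minimal sketch and not the SPSS procedure used in the study: the function names `education_category` and `design_row` are hypothetical, and centering age at the reported sample mean of 43.3 years is an assumption, since the text does not state the centering value.

```python
from typing import Dict

# Assumption: age is centered at the sample mean reported in the text.
MEAN_AGE = 43.3


def education_category(level: int) -> str:
    """Collapse the 8-point education scale into low / medium / high."""
    if level in (1, 2, 3):
        return "low"
    if level in (4, 5, 6):
        return "medium"
    if level in (7, 8):
        return "high"
    raise ValueError("education level must be an integer from 1 to 8")


def design_row(male: bool, education_level: int, age: float,
               mean_age: float = MEAN_AGE) -> Dict[str, float]:
    """Return dummy-coded and centered main-effect predictors for one person."""
    category = education_category(education_level)
    age_c = age - mean_age  # centering reduces collinearity with the squared term
    return {
        "male": 1.0 if male else 0.0,                          # females = 0, males = 1
        "edu_medium": 1.0 if category == "medium" else 0.0,    # low = reference group
        "edu_high": 1.0 if category == "high" else 0.0,
        "age_c": age_c,
        "age_c_sq": age_c ** 2,
    }
```

Interaction terms, which the full model also contained, could be formed as products of these columns (for example `male * edu_high`).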
The regression model was reduced in a stepwise fashion
by eliminating the least significant predictor, starting with
the interaction terms between the various predictors. Variables with a two-tailed p > .001 were excluded to limit type
I error due to multiple testing. Note that with α = .001 two-tailed, the present sample size of N = 7,500 still gives a
power of 90 % to detect a correlation as small as .05, so
that the risk for type II error is negligible. For the final
model, residuals were plotted and analyzed to check the
assumptions of normality and homogeneity of residual variance across the entire range of predicted scale scores and
the absence of outliers. Within the final model, a raw scale
score of an individual can be converted into a standardized
z-score by computing the predicted score Y (by means of
filling in the regression equation), computing the residual
error (subtracting predicted Y from observed Y), and finally,
dividing the residual error by the SD(e), which is the square
root of the MS(residual). If the residuals are normally distributed with the same variance, then z is normally distributed and the standard normal distribution can be used to
interpret z-values (e.g., Van Breukelen and Vlaeyen 2005).
If normality or homogeneity is seriously violated, percentiles of the residuals can be used instead of z-values for
norming.
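The conversion described above can be summarized as follows, where $b_0$ and $b_j$ denote the intercept and regression weights of the final model and $X_j$ the retained predictors. This notation is introduced here for illustration and is not taken from the original.

```latex
\hat{Y} = b_0 + \sum_{j} b_j X_j, \qquad
e = Y - \hat{Y}, \qquad
z = \frac{e}{SD(e)}, \qquad
SD(e) = \sqrt{MS_{\text{residual}}}
```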
Results
Before addressing the main results, two remarks need to be
made. First, total BDI-II scores were not normally distributed with skewness and kurtosis outside the acceptable
range of −1 to +1. A square root transformation of the
BDI-II total score was successful in normalizing the total
scores to a reasonable extent (note that the residual of the
regression requires a normal distribution, not the dependent
variable itself). These square root-transformed BDI-II scores
were back-transformed into normal BDI-II scores after the
regression analyses in order to obtain norm data. Second,
mean BDI-II score was 10.6 (SD = 10.9; range 0–62) and the
BDI-II was reliable in terms of internal consistency with an
alpha of .95. Mean of the square root-transformed BDI-II
scores was 2.77 (SD = 1.71; range 0–7.87).
Predictors of the BDI-II Score
The final model containing significant predictors of the
BDI-II score consisted of gender and education level. None
of the interactions was significant, and after their deletion
from the model, age (linear and quadratic terms) did not
predict BDI-II total scores either. So norming can be performed on the total sample, using gender and education
level as the only predictors. The final model is presented
in Table 1.
Model Checks
To apply the model for norming purposes, the model assumptions need careful checking as prediction of individual scores
depends even more on such assumptions than the regression
analysis does. More specifically, the use of (standardized)
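Table 1 Final regression model (dependent variable: square root of the BDI-II total score)

Predictor            B        SE of B    p (two-tailed)
Constant             3.51     .050       <.001
Gender               −.25     .039       <.001
Medium education     −.55     .055       <.001
High education       −1.03    .058       <.001
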
Gender was coded 0 for females and 1 for males. Low education was
the reference group. Square root depression scores (BDI-II) range
roughly between 0 and 8
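To make the use of the final model concrete, the following Python sketch converts a raw BDI-II total score into a z-score. It rests on stated assumptions: the regression weights for gender and education are taken as negative, which is inferred from the subgroup means in the z-score formulas of Table 2 (for example, 3.51 − .25 = 3.26 for low-educated males), and the residual SD of 1.66 is likewise read from those formulas. The function names are hypothetical.

```python
import math

# Weights of the final model on the square-root scale (signs inferred, see above).
B_CONSTANT = 3.51
B_MALE = -0.25
B_EDUCATION = {"low": 0.0, "medium": -0.55, "high": -1.03}
SD_RESIDUAL = 1.66  # SD(e), read from the z-score formulas in Table 2


def predicted_sqrt_bdi(male: bool, education: str) -> float:
    """Predicted square-root BDI-II score for a gender by education subgroup."""
    return B_CONSTANT + (B_MALE if male else 0.0) + B_EDUCATION[education]


def z_score(bdi_total: float, male: bool, education: str) -> float:
    """Standardized residual of an observed BDI-II total score (0 to 63)."""
    if not 0 <= bdi_total <= 63:
        raise ValueError("BDI-II total scores range from 0 to 63")
    residual = math.sqrt(bdi_total) - predicted_sqrt_bdi(male, education)
    return residual / SD_RESIDUAL
```

Under these assumptions, a raw score of 29 gives a z-score of roughly 1.13 for a low-educated female and roughly 1.90 for a high-educated male, which shows how the same raw score is positioned differently depending on the reference group.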
Discussion
The aim of the present study was to develop reliable and
representative norms for the BDI-II in a large-scale Dutch
community sample. Using multiple regression, predictors
for BDI-II total scores were identified (i.e., gender, education level) and, after some model checks, norm scores were
calculated.
Predictors for BDI-II Norm Scores
As hypothesized, gender was a significant predictor of BDI-II total scores, with females having somewhat higher BDI-II
scores than males. These findings are in line with prevalence
rates in depression being higher in females than in males
(Picinelli and Wilkinson 2000) and with research findings
showing that females have higher scores than males (Arnau
et al. 2001; Beck et al. 1996b; Coelho et al. 2002; Kojima et
al. 2002; Kumar et al. 2002; Steer et al. 1997, 1998, 1999),
but contradict other studies reporting no significant gender
differences on the BDI-II (Beck et al. 1996a; Dozois et al.
1998; O'Hara et al. 1998; Penley et al. 2003; Schulenberg
and Yutrzenka 2001; Steer and Clark 1997; Steer et al.
2000). Age did not correlate with the BDI-II total scores
(i.e., no linear or quadratic age effects). These findings add
to research showing little relationship between the BDI-II
scores and age in samples of outpatients or college students
(Beck et al. 1996b; Kojima et al. 2002; Penley et al. 2003;
Steer et al. 1997), but are not in line with studies reporting
1 As an ancillary analysis, we determined the BDI-II norm scores for a
model with no predictors to allow for a comparison of an individual
BDI-II score with the total sample. Table 2 presents the BDI-II norm
scores for that model.
Table 2 Norms of the BDI-II total score in a community sample, broken down by gender and education level

                              Percentile
                              5th         20th       40th       60th      80th      95th
Group                         (z=−1.64)   (z=−.84)   (z=−.25)   (z=.25)   (z=.84)   (z=1.64)   z-score
Males     Low education       0           3          8          14        22        36         $z = (\sqrt{\text{BDI score}} - 3.26)/1.66$
          Medium education    0           2          5          10        17        30         $z = (\sqrt{\text{BDI score}} - 2.71)/1.66$
          High education      0           1          3          7         13        25         $z = (\sqrt{\text{BDI score}} - 2.23)/1.66$
Females   Low education       0           4          10         15        24        39         $z = (\sqrt{\text{BDI score}} - 3.51)/1.66$
          Medium education    0           2          6          11        19        32         $z = (\sqrt{\text{BDI score}} - 2.96)/1.66$
          High education      0           1          4          7         15        27         $z = (\sqrt{\text{BDI score}} - 2.48)/1.66$
Total sample                  0           2          6          11        18        32         $z = (\sqrt{\text{BDI score}} - 2.77)/1.71$

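The subgroup norms in Table 2 can be approximated by inverting the z-score formula and back-transforming to the raw scale. The Python sketch below is offered under stated assumptions: the negative signs of the gender and education weights are inferred from the subgroup means in the z-score formulas, truncating negative square-root values at zero is a choice made here, and the names `norm_score` and `norm_table` are hypothetical.

```python
from typing import Dict, Tuple

B_CONSTANT = 3.51
B_MALE = -0.25                                   # sign inferred from Table 2
B_EDUCATION = {"low": 0.0, "medium": -0.55, "high": -1.03}
SD_RESIDUAL = 1.66                               # SD(e) in the subgroup formulas
Z_BY_PERCENTILE = {5: -1.64, 20: -0.84, 40: -0.25, 60: 0.25, 80: 0.84, 95: 1.64}


def norm_score(predicted_sqrt: float, z: float, sd: float = SD_RESIDUAL) -> float:
    """Back-transform a z-value to the raw BDI-II scale (0 to 63)."""
    # Assumption: a negative square-root value is truncated at zero before squaring.
    root = max(0.0, predicted_sqrt + z * sd)
    return min(63.0, root ** 2)


def norm_table() -> Dict[Tuple[str, str], Dict[int, int]]:
    """Rounded raw-score norms for each gender by education subgroup."""
    table = {}
    for gender, b_gender in (("male", B_MALE), ("female", 0.0)):
        for education, b_edu in B_EDUCATION.items():
            predicted = B_CONSTANT + b_gender + b_edu
            table[(gender, education)] = {
                pct: round(norm_score(predicted, z))
                for pct, z in Z_BY_PERCENTILE.items()
            }
    return table
```

This sketch reproduces nearly all subgroup cells of Table 2 but not every one: it yields 1 rather than 0 for the 5th percentile of low-educated females and 8 rather than 7 for the 60th percentile of high-educated females. The published values may therefore rest on steps not captured here, such as different rounding or the use of residual percentiles.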
Implications
With respect to the norm data, the interpretation of raw
BDI scores depends on gender and education level. Z-scores computed from the residuals of the present regression
of (square root) scores of BDI-II on these variables can provide a more objective picture of the
meaningfulness of depressive symptoms in adults. That
is, z-scores can be helpful to identify the severity of the
problems and also to evaluate treatment success. However, the results should be cautiously interpreted in
terms of what BDI-II scores may have (preventive)
treatment implications. In this respect, a number of
cutoff scores for the BDI-II have been identified ranging
from 14 to 18 in different samples (Sprinkle et al. 2002;
Arnau et al. 2001; Dutton et al. 2004). The choice of a
particular cutoff point depends in part on the purpose
for using the test. If the purpose is to detect the maximum number of persons with depression, then the cut-score
threshold must be lowered to minimize false negatives. If it is important to obtain as pure a group of
persons with depression as possible, the cut-score
should be raised to reduce the number of false positives.
Thus, the norm data are indicative of where a person with a
certain BDI-II score lies in terms of z-scores compared to the
best possible reference group, but the norm data cannot be used
for determining thresholds for a certain intervention. That is, a
score of 29 (cutoff for severe depression according to Beck et
al. 1996b) on the BDI-II corresponds to different z-scores
depending on gender and education level, but it does not
indicate that the personal burden is also different. Furthermore, a decision to offer treatment should never be based on a
BDI-II score alone, but should follow a diagnostic
evaluation. In clinical practice, much more is needed for an
informed decision: the severity of the depressive symptomatology alone is not sufficient, and additional data regarding
comorbidity, functioning, and motivation for treatment are needed.
Normative data merely position a respondent in the normal
population.
Strengths and Limitations of the Current Study
The results of the current study advance the BDI-II as a screening instrument for depression and provide reliable and
representative norms for an adult community sample. It is a strength
of the BDI-II that it takes only 5–10 min to fill in and its
reliability is good. Another strength of the study is that a large
sample of adults was involved, deriving from different education levels, which enhances the generalizability. A number of
limitations of the present study need to be addressed. First,
norm data obtained in the current study were found for the
Dutch version of the BDI-II. It should be borne in mind that
norming is culture and translation bound. As a consequence,
the present norms are not generalizable to other language
versions of the BDI or to populations stemming from other
countries. It is quite likely that respondents from other
countries or cultures respond differently to self-report questionnaires. Second, the current study used an internet-based
approach to completing the BDI-II (see also Schulenberg and
Yutrzenka 2001). Paper-and-pencil and web-based administrations appear to yield equivalent results at least for the Child
Depression Inventory (see Roelofs et al. 2010). Third, it is
possible that selection bias might have occurred. Although
there was no reference to research on the treatment of
References
American Psychiatric Association. (2000). DSM-IV-TR: Diagnostic
and statistical manual of mental disorders, text revision (4th
ed.). Washington, D.C.: American Psychiatric Association.
Arnau, R. C., Meagher, M. W., Norris, M. P., & Bramson, R. (2001).
Psychometric evaluation of the Beck Depression Inventory-II with
primary care medical patients. Health Psychology, 20, 112–119.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J.
(1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561–571.
Beck, A. T., Steer, R. A., Ball, R., & Ranieri, W. F. (1996a). Comparison of Beck Depression Inventories-IA and II in psychiatric
outpatients. Journal of Personality Assessment, 67, 588–597.
Beck, A. T., Steer, R. A., & Brown, G. K. (1996b). Beck Depression
Inventory (2nd ed.). San Antonio: The Psychological Corporation.
Coelho, R., Martins, A., & Barros, H. (2002). Clinical profiles relating
gender and depressive symptoms among adolescents ascertained
by the Beck Depression Inventory II. European Psychiatry, 17,
222–226.
De Graaf, L. E., Gerhards, S. A. H., Arntz, A., Riper, H., Metsemakers,
J. F. M., Evers, S. M. A. A., et al. (2009). Clinical effectiveness of
online computerised cognitive-behavioural therapy without support for depression in primary care: randomised trial. The British
Journal of Psychiatry, 195, 73–80.
Dozois, D. J. A., Dobson, K. S., & Ahnberg, J. L. (1998). A psychometric evaluation of the Beck Depression Inventory-II. Psychological Assessment, 10, 83–89.
Dutton, G. R., Grothe, K. B., Jones, G. N., Whitehead, D., Kendra, K.,
& Brantley, P. J. (2004). Use of the Beck Depression Inventory-II
with African American primary care patients. General Hospital
Psychiatry, 26, 437–442.
Glenn, M. B., O'Neil-Pirozzi, T., Goldstein, R., Burke, D., & Jacob, L.
(2001). Depression amongst outpatients with traumatic brain injury. Brain Injury, 15, 811–818.
Hunt, M., Auriemma, J., & Cashara, A. C. (2003). Self-report bias and
underreporting of depression on the BDI-II. Journal of Personality
Assessment, 80, 26–30.