CES-D Scale, Radloff PDF

The CES-D Scale: A Self-Report Depression
Scale for Research in the General Population

Lenore Sawyer Radloff
Center for Epidemiologic Studies
National Institute of Mental Health

The CES-D scale is a short self-report scale designed to measure depressive symptomatology in the general population. The items of the scale are symptoms associated with depression which have been used in previously validated longer scales. The new scale was tested in household interview surveys and in psychiatric settings. It was found to have very high internal consistency and adequate test-retest repeatability. Validity was established by patterns of correlations with other self-report measures, by correlations with clinical ratings of depression, and by relationships with other variables which support its construct validity. Reliability, validity, and factor structure were similar across a wide variety of demographic characteristics in the general population samples tested. The scale should be a useful tool for epidemiologic studies of depression.

The Center for Epidemiologic Studies Depression Scale (CES-D Scale) was developed for use in studies of the epidemiology of depressive symptomatology in the general population. Its purpose differs from previous depression scales which have been used chiefly for diagnosis at clinical intake and/or evaluation of severity of illness over the course of treatment. The CES-D was designed to measure current level of depressive symptomatology, with emphasis on the affective component, depressed mood. The symptoms are among those on which a diagnosis of clinical depression is based but which may also accompany other diagnoses (including "normal") to some degree.

This definition of the variable being measured determines the appropriate criteria of validity and reliability (Standards for Educational and Psychological Tests, 1974). Content validity will be based on the clinical relevance of the symptoms which comprise the items of the scale. Criterion-oriented validity will include correlations with other valid self-report depression scales, correlations with clinical ratings of severity of depression, and discrimination between psychiatric patients and general population samples. Construct validity will be based on what is known about the theory and epidemiology of depressive symptoms. Evidence that the scale is reliable but is also sensitive to current levels of symptomatology will be based on predictability of test-retest changes in scores (e.g., scores of patients before and after treatment, or scores of household respondents before and after "Life Events Losses"). Since several comparable samples (essentially replications) were tested, consistency of results across the
Downloaded from the Digital Conservancy at the University of Minnesota, http://purl.umn.edu/93227.

May be reproduced with no cost by students and faculty for academic use. Non-academic reproduction
requires payment of royalties through the Copyright Clearance Center, http://www.copyright.com/
samples will also be shown as indirect evidence of reliability.

The CES-D was designed for use in general population surveys, and is therefore a short, structured self-report measure. It is usable by lay interviewers, acceptable to the respondent, and not substantially influenced by the normal range of conditions during a household interview. The scale was designed for use in studies of the relationships between depression and other variables across population subgroups. To compare results from one subgroup to another, the scale must be shown to measure the same thing in both groups. Therefore, it will be shown that properties of the scale (validity, reliability, factor structure) are similar for the various population subgroups to be studied.

Development of the Scale

The CES-D items were selected from a pool of items from previously validated depression scales (e.g., Beck, Ward, Mendelson, Mock, & Erbaugh, 1961; Dahlstrom & Welsh, 1960; Gardner, 1968; Raskin, Schulterbrandt, Reatig, & McKeon, 1969; Zung, 1965). The major components of depressive symptomatology were identified from the clinical literature and factor analytic studies. These components included: depressed mood, feelings of guilt and worthlessness, feelings of helplessness and hopelessness, psychomotor retardation, loss of appetite, and sleep disturbance. Only a few items were selected to represent each component. Four items were worded in the positive direction to break tendencies toward response set as well as to assess positive affect (or its absence). To emphasize current state, the directions read: "How often this past week did you ..." Each response was scored from zero to three on a scale of frequency of occurrence of the symptom.

Pretests on small "samples of convenience" indicated appropriate performance of the scale and guided minor revisions for clarity and acceptability. The 20-item scale used in the studies reported here is shown in Table 1. The possible range of scores is zero to 60, with the higher scores indicating more symptoms, weighted by frequency of occurrence during the past week.

Field Tests: Methods

First Questionnaire Survey (Q1 Survey)

The CES-D scale was included in a structured interview containing over 300 items, including other scales designed to measure depression or depressed mood (Bradburn Negative Affect, 1969; Lubin, 1967), psychological symptoms (Langner, 1962), well-being (Bradburn Positive Affect, 1969; Cantril Ladder, 1965) and Social Desirability (Crowne & Marlowe, 1960). It also included standard sociodemographic items (age, sex, education, occupation, marital status) and measures of life events, alcohol problems, social functioning, physical illness and use of medications. The interview, which took about an hour, was conducted by an experienced lay interviewer in the home of the respondent.

Probability samples of households designed to be representative of two communities (Kansas City, Missouri, and Washington County, Maryland) were selected. An individual (aged 18 and over) was randomly selected for interview from each household in the sample. Independent samples of households were designated for each week of the study. Strong efforts were made to complete interviews in the assigned week, but up to three weeks (and unlimited numbers of callbacks) were allowed to maximize response rate. Interviewing was done from October 1971 through January 1973 in Kansas City and from December 1971 through July 1973 in Washington County. The response rate in Kansas City was about 75%, with a total of 1173 completed interviews; in Washington County the response rate was about 80%, with 1673 completed interviews. Informed consent was obtained from all respondents. Both sites had a refusal rate of about 17%, plus a small percentage of not-at-home and other reasons for nonresponse. Demographic distributions of the samples are

reported elsewhere (Comstock & Helsing, in press), as are analyses of characteristics of those who refused to be interviewed (Comstock & Helsing, 1973; Klassen & Roth, 1974). Refusals were significantly more likely to have lower education and come from smaller households than respondents. Analyses have been made of respondents interviewed in the assigned week ("on time") versus the harder to find respondents interviewed in the following three weeks ("late") (Mebane, 1973). Males and working people were slightly overrepresented among the "late" respondents, but the "late" did not differ from the "on time" on the psychological measures in the interview, including the CES-D scale. The samples probably have some underrepresentation of males and the poorly educated. However, they include respondents with a wide range of demographic characteristics, in numbers adequate for analyses of relationships among variables.

Table 1. CES-D Scale

Second Questionnaire Survey (Q2 Survey)

The CES-D scale was also included in a slightly revised (mainly shortened) version of the questionnaire (Q2) used in Washington County only, from March 1973 through July 1974 (for three months Q1 and Q2 were used alternately). Samples were drawn for four-week periods. The revision was not expected to affect the CES-D, since the scale was placed very early in both interviews, with identical preceding sections. The major differences between the Q1 and Q2 surveys were: length of interview (60 vs. 30 minutes); the time-basis of the sampling frame (weekly vs. four-week); and the site (Kansas City and Washington County vs. Washington County only). The response rate for the Q2 survey was about 75%, with 1089 completed interviews, and about 22% refusals. Therefore, the obtained sample for Q2 may be slightly less representative than that of the Washington County Q1 survey.

Mail-backs

From May 1973 through March 1974, each respondent to Q2 was asked to fill out and mail back one retest on the CES-D scale either two, four, six, or eight weeks after the original interview. A total of 419 mail-backs was received (about 56% response rate).

Reinterview Survey (Q3 Survey)

The CES-D was also included in a reinterview (Q3) of samples of the original respondents to Q1 or Q2. In Kansas City, from July 1973 through December 1973, 343 respondents (78% of those attempted) were reinterviewed about 12 months after the original interview. From August 1973 through April 1974 in Washington County, 1209 respondents (about 79% of those attempted) were reinterviewed once, either three, six, or twelve months after the original interview.

Psychiatric Patient Samples

Two clinical validation studies have been done in coordination with the survey program: one in Washington County, Maryland (Craig & Van Natta, in press) and one in New Haven, Connecticut (Weissman, Prusoff, & Newberry, 1975). In the Washington County study, seventy patients residing in a private psychiatric facility were selected on the basis of willingness and ability to participate. Each patient was rated on the Rockliff Depression Rating Scale (Rockliff, 1971) by the nurse-clinician who was most familiar with the patient's current status. Immediately following this, the patient was interviewed by one of the interviewers from the Washington County general population survey, using the original interview form (Q1). In the New Haven study, thirty-five people admitted to outpatient treatment for severe depression and scoring seven or higher on the Raskin Depression Rating Scale (Raskin et al., 1969) participated in the study. They were given the CES-D scale and the SCL-90 (Derogatis, Lipman, & Covi, 1973) as self-reports and rated by clinicians on the Hamilton Rating Scale (Hamilton, 1960) as well as the Raskin. The measures were taken upon admission for treatment, after one week, and after four weeks of treatment, using psychotropic medication and supportive psychotherapy.

Summary of Field Tests

In the present paper, results will be reported for the following populations: All Q1 Whites, All Q2 Whites, and All Q3 Whites (these analyses were confined to Whites to make the samples comparable, since the Q2 sample contained less than 3% nonwhites). These results are treated as replications to demonstrate repeatability of the properties of the scale across two samples considered equivalent (Q1 vs. Q2) and across two tests on essentially the same sample (Q1/Q2 vs. Q3). For test-retest reliability, scores of the same people at different times will be compared (Q2 vs. mail-backs and Q1/Q2 vs. Q3). To demonstrate generalizability across different groups and thereby justify the epidemiologic uses of the scale, results will be compared across age, sex, race, and educational subgroups of the combined Q1 and Q2 general populations and with the Washington County patient group.
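
The test-retest comparison described above pairs each respondent's initial CES-D score with a later score from the same person. A minimal sketch of that comparison follows, assuming the product-moment (Pearson) correlation is the statistic of interest; the function name and the example scores are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Product-moment correlation between paired test and retest scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sums of squares and cross-products around the means.
    s12 = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    s11 = sum((a - mean_1) ** 2 for a in first)
    s22 = sum((b - mean_2) ** 2 for b in second)
    if s11 == 0 or s22 == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return s12 / math.sqrt(s11 * s22)


# Hypothetical paired scores (interview, retest) for five respondents.
interview = [4, 10, 22, 7, 15]
retest = [6, 8, 18, 9, 11]
r = pearson_r(interview, retest)
```

Each pair must come from the same respondent, so records would be matched by respondent before the two lists are built.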

Suitability for Use in Household Surveys

Acceptability

The CES-D scale proved acceptable to both general and clinical populations. In the Q1 and Q2 survey data, the average nonresponse to single items was less than 0.2%. Two items were answered "don't know" or "not applicable" somewhat more often than average: "I felt I was just as good as other people" (1.6%) and "I felt hopeful about the future" (2.2%). For comparison, the Q1 item with the highest nonresponse was household income (8%). The entire CES-D scale was considered missing if more than four items were missing. This occurred only twice in the Q1 and Q2 data combined.

Distributions

Parameters of the distributions of CES-D scale scores in the general population samples and the Washington County patients are shown in Table 2. The distribution of scores in the patient group was symmetrical, with a large standard deviation, while the general population distributions were very skewed, with a smaller standard deviation and a much larger proportion of low scores. This pattern is consistent with an interpretation of the scale as related to a pathological condition more typical of a patient population than a household sample. However, there was a wide enough range of scores in the general population to allow meaningful identification of relationships between depressive symptomatology and other variables.

The very skewed distributions and the fact that groups with higher means also tended to have higher variances should be noted. Standard parametric significance tests on these data will not be exact. However, several basic analyses (e.g., analysis of variance of CES-D scores by sex and marital status) have been replicated using normalizing transformations and nonparametric tests. In no case was the decision (accept vs. reject H0) reversed. Nevertheless, probability levels reported should be considered approximate, and borderline levels should be interpreted with caution.

Conditions of Interview

Controlling for age, race, sex, marital status, income and occupational role, analyses of conditions during the interview revealed no significant differences in the CES-D scores associated with the time of day, day of week, or month of year in which the interview took place (tables available on request). There were some differences in the average CES-D scores obtained by the various interviewers, significant at borderline levels, which warrant further analyses (Choi & Comstock, 1975; Handlin et al., 1974). The Washington County Q2 survey obtained significantly lower CES-D scores than did the Washington County Q1 survey. The Q3 reinterview survey (Kansas City and Washington County combined) also had significantly lower average CES-D scores than the initial (Q1 or Q2) survey of the same respondents. Further analyses will be required to assess the relative contribution of several factors (including possible real time trends, response bias in test-retest effects, interview forms, possible nonresponse bias, and sampling procedures).

These possible differences (among interviewers and among the three surveys) are small in magnitude and of minor practical importance. If further analyses confirm true differences, then suitable controls could be introduced in the sampling or analytic procedures. It is more important for present purposes that the properties of the scale were consistent across interviewers and questionnaire forms. Analysis of the interviewers (excluding those who completed fewer than 20 interviews) revealed very similar levels of reliability and patterns of relationships to other variables across interviewers (tables available on request). Results from the Q1, Q2 and Q3 surveys are reported separately in subsequent sections of this paper to demonstrate that they are also very similar on these properties.
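
The scoring rules stated so far (20 items, each scored zero to three, a possible range of zero to 60, and the whole scale treated as missing when more than four items are missing) can be sketched in code. What follows is an illustration under stated assumptions, not the scoring procedure used in the surveys. Three things are assumed: the four positively worded items are reverse-scored so that higher totals mean more symptoms; those items sit at positions 4, 8, 12 and 16, following the commonly published item order (Table 1 is not reproduced here); and an unanswered item adds nothing to the total, since the text does not say how one to four missing items were handled.

```python
from typing import Optional, Sequence

N_ITEMS = 20
MAX_MISSING = 4  # from the text: scale is missing if more than four items are missing

# Assumption: 1-based positions of the four positively worded items.
POSITIVE_ITEMS = {4, 8, 12, 16}


def score_cesd(responses: Sequence[Optional[int]]) -> Optional[int]:
    """Return a total score from 0 to 60, or None if the scale counts as missing.

    `responses` holds the 20 frequency ratings (0-3) as answered;
    None marks an unanswered item.
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected exactly 20 item responses")
    missing = sum(r is None for r in responses)
    if missing > MAX_MISSING:
        return None
    total = 0
    for position, rating in enumerate(responses, start=1):
        if rating is None:
            continue  # assumption: an unanswered item contributes nothing
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"item {position}: rating must be 0, 1, 2 or 3")
        # Assumption: positive items are reversed so that absence of
        # positive affect raises the score.
        total += (3 - rating) if position in POSITIVE_ITEMS else rating
    return total
```

Under these assumptions a respondent who answers zero to every item scores 12 rather than zero, because the four positive items are reversed.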

Reliability

Internal Consistency

The scale contains 20 symptoms, any of which may be experienced occasionally by healthy people; a seriously depressed person would be expected to experience many but not necessarily all of these symptoms. In a healthy population, positive and negative affect are expected to coexist, with a low (negative) correlation. However, it has been suggested (Klein, 1974) that severely depressed patients are characterized by absence of positive as well as presence of negative affect, so that positive and negative affect would be more highly (negatively) correlated. There is also evidence that different kinds of people may manifest different types of symptoms; e.g., lower socioeconomic status people report more physical symptoms while higher socioeconomic status people report more affective symptoms (Crandell & Dohrenwend, 1967). In summary, in a general population sample, we would expect a great deal of heterogeneity, with many people experiencing a few symptoms and a few experiencing many. Therefore, some inter-item correlations may be quite low, but the direction of correlations should be consistent enough to produce reasonably high measures of internal consistency. In a patient group, we would expect higher item means, higher inter-item correlations, and very high internal consistency.

The results support these expectations (see Table 3). Both inter-item and item-scale correlations were higher in the patient sample than in the general population samples (even when the small N and, therefore, greater sampling error of the patient sample is taken into account). Expectations were also confirmed by measures of internal consistency (coefficient alpha and the Spearman-Brown, split-halves method; Nunnally, 1967). They were high in the general population (about .85) and even higher in the patient sample (about .90).

This high internal consistency may include some component of response bias, i.e., the tendency of an individual to answer all questions in the same (positive or negative) direction. In the extreme, this would result in a bimodal distribution of scores, which was not observed in the present data. Evidence of discriminant validity and of validity based on clinicians' ratings, independent of self-report, also suggests that response bias is not the major contributor to the reliability of the scale.

Test-retest Correlations

Predictions regarding test-retest correlations depend on several factors. The CES-D scale was explicitly designed to measure current ("this week") level of symptomatology, which is expected to vary over time. Changes over time may not be monotonic; they are more likely to be cyclic in at least some individuals, and the phase (length) of cycles may vary across individuals. The CES-D was designed to be sensitive to possible depressive reactions to events in a person's life; the timing of these events is unpredictable but presumably aperiodic. There are also methodological complications in test-retest measures. For example, there may be biases due to nonresponse, biasing effects of repeated testing, and asymmetric regression toward the mean due to the very skewed distribution of CES-D scores. Furthermore, in the present data, the test-retest time interval was confounded with differences in style of data collection: all initial scores were based on interviews; the short-interval (weeks) retests were different (i.e., self-administered mail-backs); the long-interval (months) retests were the same (i.e., interviews).

In light of these properties of the variable being measured, we would expect only moderate levels of test-retest correlations in the overall samples. Shorter test-retest time intervals should produce somewhat higher correlations than longer intervals. However, if people were selected by the information we have about what happened during the time interval, the correlations should be better differentiated. Specifically, life events are expected to introduce variability (i.e., some individuals may react more than others) and thus lower the test-retest correlations. Tables 4 and 5 show that the results were consistent with these predictions.

Table 4 shows the test-retest correlations for those who responded to the request to fill out and mail in a retest of the CES-D (mail-backs) and those who were reinterviewed (Q3). All respondents were retested only once; each time interval represents a different group of people. The correlations were in the moderate range (all but one were between .45 and .70) and were, on the average, larger for the shorter time intervals.

Table 4. Test-retest Correlations by Time Interval Between Test and Retest

In Table 5, all Q3 respondents (test-retest time interval ranging from three to twelve months) were classified by whether any one of 14 negative life events had occurred in the year prior to the first interview and in the interval between interviews. Those with no life events at either time had the highest test-retest correlation; those with life events at both times had the lowest correlation. Those with events at one time but not the other had intermediate correlations. The correlation for those with no events (r = .54) might be considered the fairest estimate of test-retest reliability, in the sense of repeatability with conditions replicated, for the three- to twelve-month time interval. In the New Haven patient group, the correlation of CES-D scores at admission with scores obtained after four weeks of treatment was .53 (compared with r = .58 for the SCL-90). In this group, "events" had certainly occurred, but the effect of treatment may be assumed to be in the same direction for all or most patients. Therefore, it is reasonable that the correlation was about the same as that in the "no events" group.

Validity

Although not designed for clinical diagnosis, the CES-D scale is based on symptoms of depression as seen in clinical cases. Therefore, it

should discriminate strongly between patient and general population groups, be sensitive to levels of severity of depressive symptomatology, and reflect improvements after psychiatric treatment. In addition, it should correlate well with other scales designed to measure depression and less well with scales which measure related but different variables; be related to a felt need for psychiatric services; and be sensitive to possible reactive depression in the face of certain life events.

Table 5. Test-retest Correlations by Life Events Losses Before Each Test

Clinical Criteria

The CES-D scores discriminated well between psychiatric inpatient and general population samples and discriminated moderately among levels of severity within patient groups. Table 2 shows that the average CES-D score for the group of 70 Washington County psychiatric inpatients was substantially and significantly higher than the average for the general population samples. Seventy percent of the patients but only 21% of the general population scored at and above an arbitrary cutoff score of 16. In the patient group, the correlation between the CES-D scale and ratings of severity of depression by the nurse-clinician was .56 (Craig & Van Natta, in press). In the New Haven patient group, the average CES-D score at admission was 39.11, with no score below 16 (note that this group was screened to include only those above 6 on the Raskin scale). The correlations of the CES-D with the Hamilton Clinician's Rating scale and with the Raskin Rating scale were moderate (.44 to .54) at admission. After four weeks of treatment, the correlations were substantially higher (.69 to .75). These correlations were almost as high as those obtained for the 90-item SCL-90 (Weissman et al., 1975).

Self-report Criteria

Table 6 shows correlations of the CES-D scale with other self-report scales in the several samples. (Note that Q2 and Q3 did not include all scales.) In all the samples, the pattern of correlations of the CES-D with other scales gives reasonable evidence of discriminant validity. The highest correlations were with scales designed to measure symptoms of depression (i.e., Lubin, Bradburn Negative Affect and Bradburn Balance) or general psychopathology (Langner) and the Cantril Ladder. The correlation of the CES-D with the Bradburn Positive Affect scale was negative; correlations were low positive with scales designed to measure different variables (medications, disability days, social functioning, aggression). The CES-D correlated moderately with interviewer ratings of depression but low negative to zero with interviewer ratings of cooperation and understanding of the question.

Table 6 also shows support for the concept of a "syndrome" of depression which is more consistent in the patient sample than in the general population samples. In the patient groups, the correlations with other depression scales were higher positive (in the New Haven patients, correlation with the SCL-90 was .83); with the Bradburn Positive Affect, higher negative; and with other scales, the same low positive.

The low negative correlations with the Marlowe-Crowne scale of "social desirability" suggest that there may be some general response set involved in the CES-D scores (see also Klassen, Hornstra, & Anderson, 1975). However, the pattern of correlations in Table 6 suggests that this bias is small and does not entirely mask meaningful relationships with other variables.

Need for Services

In the Q1 and Q2 surveys, the respondents were asked whether they had had an emotional

problem in the past week for which they felt they needed help. Those who answered the "need help" question "yes" or "no, because it's no use to look for help" are considered "at risk" of becoming patients. Table 7 shows parameters of the CES-D scale by answers to this question. The Washington County patient population is included for comparison. The general population "need help" groups were more similar to the patients than were the "not need help" groups. The "need help" groups had high means (significantly higher than the "not need help" groups) and standard deviations, symmetrical distributions (low skew), high percentages of high scores (16 and above), and moderately high correlations with the Bradburn Positive Affect scale. The patterns of correlations with the Bradburn Positive and Negative Affect scales can also be considered in terms of discriminant validity.
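
The distribution parameters used in these group comparisons (mean, standard deviation, skew, and percentage scoring 16 and above) can be computed as in the sketch below. It is illustrative only: the function name and dictionary keys are hypothetical, and because the text does not say which skewness statistic was reported, the moment coefficient m3 / m2^1.5 is assumed here.

```python
import math
from typing import Dict, Sequence

CUTOFF = 16  # the "arbitrary cutoff score" mentioned in the text


def describe_scores(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize a group's CES-D total scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    # Assumption: skew is the moment coefficient m3 / m2**1.5.
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    pct_high = 100.0 * sum(x >= CUTOFF for x in scores) / n
    return {
        "mean": mean,
        "sd": sd,
        "skew": skew,
        "pct_at_or_above_cutoff": pct_high,
    }
```

A skew near zero would correspond to the symmetrical distributions described for the patient and "need help" groups, and a large positive skew to the pile-up of low scores described for the general population.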
Life Events
Past research has shown an association of illness, including mental illness, with certain significant life events (Dohrenwend & Dohrenwend, 1974). Table 8 shows the average CES-D scores for those who do and do not report certain events in the year (or during the retest interval for Q3) preceding the interview. The results were as predicted: the more negative the event, the higher the depression score of those who experienced it. Vacations were associated with low CES-D scores (possibly biased by socioeconomic status); marriage was ambiguous; separation was more strongly associated with depression than was divorce.

Table 9 shows the interview-reinterview (i.e., Q1/Q2 vs. Q3) CES-D scores by Life Events Losses (using the same criterion of Life Events Losses as was used for Table 5). The overall trend was for lower scores on Q3 than on the original interview, except in the group with no events before the first interview and at least one event in the retest interval. The four groups were significantly different in amount of change in CES-D score by several different methods of

testing change scores. These relationships of the CES-D scores to life events are considered validation of its sensitivity to current mood state, which is a property desired for the scale.

Table 9. Test-retest CES-D Average Scores by Life Events Losses Before Each Test
a. Overall significance of difference between groups in change scores: p < .01 in one-way analysis of variance and in one-way analysis of covariance, with score at time 1 as covariate.

Improvement After Treatment

Further evidence of response to change is furnished by the New Haven clinical study (Weissman et al., 1975). The average CES-D score, along with the SCL-90, the Hamilton, and the Raskin, decreased significantly from the time of admission to one week and to four weeks of treatment (see Table 10). The mean for each of the 20 items was lower after four weeks of treatment than upon admission (tables available on request). The change was particularly large for patients rated "recovered" (by a Raskin score of less than 7) after four weeks. The average CES-D score went down 20 points in the recovered group and 12 points in the group rated "still ill."

Factor Analysis

Principal components factor analysis (with ones in the main diagonal) of the 20-item scale was done for the three general population groups (All Q1 Whites, All Q2 Whites, and All Q3 Whites). For each group, there were four eigenvalues greater than one, which together accounted for a total of 48% of the variance; therefore, the normal varimax rotation to four factors was examined (see Table 11). The pattern of factor loadings is quite consistent across the three groups. Including items with loadings above .40 in all three groups, the four factors are readily interpretable as follows:

I. Depressed affect (blues, depressed, lonely, cry, sad)
II. Positive affect (good, hopeful, happy, enjoy)
III. Somatic and retarded activity (bothered, appetite, effort, sleep, get going)
IV. Interpersonal (unfriendly, dislike)
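
The step of counting eigenvalues greater than one and summing the variance they account for can be illustrated as follows. This is a sketch under assumptions rather than the analysis actually run: it takes an item correlation matrix (ones on the main diagonal) as given, obtains eigenvalues with a cyclic Jacobi rotation routine chosen here only to keep the example free of external libraries, and does not attempt the varimax rotation. The function names and the small example matrix are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def symmetric_eigenvalues(matrix: Matrix, tol: float = 1e-12,
                          max_sweeps: int = 100) -> List[float]:
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [list(row) for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_components(corr: Matrix, threshold: float = 1.0) -> Tuple[int, float]:
    """Count eigenvalues above `threshold` and the share of variance they explain."""
    eigenvalues = symmetric_eigenvalues(corr)
    kept = [ev for ev in eigenvalues if ev > threshold]
    # The trace of a correlation matrix equals the number of items.
    return len(kept), sum(kept) / len(corr)


# Hypothetical 4-item correlation matrix with two correlated pairs.
example = [
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
]
n_kept, share = retained_components(example)
```

For the 20-item scale the input would be the 20 by 20 item correlation matrix for one sample, and the returned share would be compared with the 48% figure reported in the text.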

Table 10. Average Scores of New Haven Patients at Admission and After Treatment(a) (N = 35)
a. From Weissman et al. (1975), Tables 3 and 4.
b. Matched t-test, p < .001 for all 4 measures, for 3 comparisons: Time 1 vs. Time 2, Time 2 vs. Time 3, Time 1 vs. Time 3.

Including items with loadings of at least .35 in at least two groups would add the items failure, fearful, happy, and enjoy to the depressed affect factor and the items blues, mind, depressed, and talk to the somatic factor. In all three groups, the depressed affect factor shares the largest proportion of the variance (about 16%) and the interpersonal factor, the smallest proportion (about 8%).

Similarity of factor structure of the three samples was estimated by the Factorial Invariance Coefficient (Derogatis, Kallmen, & Davis, 1971; Derogatis, Serio, & Cleary, 1972; Pinneau & Newhouse, 1964). The coefficient is a measure of the correlation of the loadings of all items on one factor in one group versus the loadings on one factor in another group. If the factor structure of two groups is similar, the coefficient will be very high when loadings on the same factor in both groups are correlated (the "diagonal" coefficients) and very low when different factors are correlated (the "off-diagonal" coefficients). Comparing Q1 with Q2 and Q1 with Q3, the diagonal coefficients were very high (.87 to .99). The off-diagonal coefficients (i.e., the similarity of different factors) were very low (the largest was .13). This is very strong evidence that the CES-D has a similar factor structure in two samples from similar populations (Q1 vs. Q2) and across two tests on essentially the same sample (Q1 vs. Q3).

The factors found in the general population are consistent with the components of depression built into the scale. However, the high internal consistency of the scale found in all groups argues against undue emphasis on separate factors. The items are all symptoms related to depression. For epidemiologic research, a simple total score is recommended as an estimate of the degree of depressive symptomatology.

Generalizability Across Subgroups

To be useful for epidemiologic studies (e.g., distribution of depression across demographic subgroups), the CES-D scale must have adequate reliability and validity and a similar factor structure within each subgroup of the population. Therefore, the analyses of Tables 3, 4, 6, 7 and 11 were repeated on each of three age groups (under 25, 25-64, over 64), the two sexes, two races (Black and White), three levels of education (less than high school, high school, greater than high school), and the two "need

400
help" groups ("need help," "not need help"). For these analyses, the data from
Kansas City and Washington County Q1 and Q2 were combined to maximize numbers
in the subgroups.

With few exceptions, the results for the total population were confirmed in all
subgroups (tables available on request). In all subgroups, coefficient alpha
was .80 or above. Test-retest correlations were moderate (.40 or above) in all
but three groups (Blacks, age under 25, and "need help"). The subgroup patterns
of correlations with other scales (as in Table 6) and relationships to "need
help" (as in Table 7) were very similar to those in the total population. The
subgroups did not differ from each other or from the total population in factor
structure. The "need help" group (which had been found to be similar to the
Washington County patient group by various criteria above) was not like that
patient group in factor structure but was very similar to the total general
population.

Cautions and Conclusions

Some limitations in use of the scale should be noted. It is not intended as a
clinical diagnostic tool, and interpretations of individual scores should not
be made. Even group averages should be interpreted in terms of level of
symptoms which accompany depression, not in terms of rates of illness.
Appropriate cutoff scores for clinical screening are yet to be validated. There
are some hints that understanding of the items may be a problem; there was a
very small but consistent correlation between the CES-D score and the
interviewer ratings of understanding of the questions, independent of education
of the respondent. Analyses indicate that this was not simply due to
respondents who did not notice the reversal of the positive items. Special
caution is needed with bilingual respondents (Trieman, 1975). Further study of
this issue is needed, with possible revision for simplicity of wording and
removal of colloquial expressions. There is still some question as to the
effect of the interviewer and the interview form on the mean level of scale
scores. Until further study decides this issue, a sampling design balanced by
interviewer may be appropriate.

On the positive side, the results reported here are very favorable for the uses
of the CES-D scale for which it was designed. The scale has high internal
consistency, acceptable test-retest stability, excellent concurrent validity by
clinical and self-report criteria, and substantial evidence of construct
validity. These properties hold across the general population subgroups
studied. The scale is suitable for use in Black and White English-speaking
American populations of both sexes with a wide range of age and socioeconomic
status for the epidemiologic study of the symptoms of depression. A group with
a high average score may be interpreted to be "at risk" of depression or in
need of treatment. The scale is a valuable tool to identify such high-risk
groups and to study the relationships between depressive symptoms and many
other variables.

References

Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. An inventory for measuring depression. Archives of General Psychiatry, 1961, 4, 561-571.
Bradburn, N. M. The structure of psychological well-being. Chicago: Aldine, 1969.
Cantril, H. The pattern of human concern. New Brunswick: Rutgers University, 1965.
Choi, I. C., & Comstock, G. W. Interviewer effect on responses to a questionnaire relating to mood. American Journal of Epidemiology, 1975, 101, 84-92.
Comstock, G. W., & Helsing, K. Characteristics of respondents and nonrespondents to a questionnaire for estimating community mood. American Journal of Epidemiology, 1973, 97, 233-239.
Comstock, G. W., & Helsing, K. Symptoms of depression in two communities. Psychological Medicine (in press).
Craig, T. J., & Van Natta, P. Recognition of depressed affect in hospitalized psychiatric patients: Staff and patient perceptions. Diseases of the Nervous System (in press).
Crandell, D. L., & Dohrenwend, B. P. Some relations among psychiatric symptoms, organic illness and social class. American Journal of Psychiatry, 1967, 123, 1527-1537.
Crowne, D. P., & Marlowe, D. A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology, 1960, 24, 349-354.
Derogatis, L. R., Kallmen, C. H., & Davis, D. M. FMATCH: A program to evaluate the degree of equivalence of factors derived from analyses of different samples. Behavioral Science, 1971, 16, 271-273.
Derogatis, L. R., Lipman, R. S., & Covi, L. SCL-90: An outpatient psychiatric scale: Preliminary report. Psychopharmacology Bulletin, 1973, 9, 13-27.
Derogatis, L. R., Serio, J. C., & Cleary, P. A. An empirical comparison of three indices of factorial similarity. Psychological Reports, 1972, 30, 791-804.
Dohrenwend, B. S., & Dohrenwend, B. P. (Eds.) Stressful life events: Their nature and effects. New York: Wiley-Interscience, 1974.
Gardner, E. A. Development of a symptom check list for the measurement of depression in a population. Unpublished, 1968.
Hamilton, M. A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 1960, 23, 56-62.
Handlin, V., Klassen, D., Hornstra, R., & Roth, A. Interviewer effects in a community mental health survey. Technical Report, 1974, The Greater Kansas City Mental Health Foundation, Contract PH 43-66-1324, National Institute of Mental Health.
Klassen, D., Hornstra, R., & Anderson, P. The influence of social desirability on symptom and mood reporting in a community survey. Journal of Consulting & Clinical Psychology, 1975, 43, 448-452.
Klassen, D., & Roth, A. Characteristics of non-respondents in the Community Mental Health Assessment survey. Technical Report, 1974, The Greater Kansas City Mental Health Foundation, Contract PH 43-66-1324, National Institute of Mental Health.
Klein, D. F. Endogenomorphic depression. Archives of General Psychiatry, 1974, 31, 447-454.
Langner, T. S. A twenty-two item screening score of psychiatric symptoms indicating impairment. Journal of Health and Human Behavior, 1962, 3, 269-276.
Lubin, B. Manual for the depression adjective check lists. San Diego: Educational and Industrial Testing Service, 1967.
Mebane, I. On time and late respondents. Technical Report, 1973, Center for Epidemiologic Studies, National Institute of Mental Health.
Nunnally, J. C. Psychometric theory. New York: McGraw-Hill, 1967.
Pinneau, S. R., & Newhouse, A. Measures of invariance and comparability in factor analysis for fixed variables. Psychometrika, 1964, 29, 271-281.
Raskin, A., Schulterbrandt, J., Reatig, N., & McKeon, J. Replication of factors of psychopathology in interview, ward behavior, and self-report ratings of hospitalized depressives. Journal of Nervous and Mental Disease, 1969, 148, 87-96.
Rockliff, B. W. A brief rating scale for anti-depressant drug trials. Comprehensive Psychiatry, 1971, 12, 122-135.
Standards for educational and psychological tests. Washington, D.C.: American Psychological Association, 1974.
Trieman, B. Depressive mood among middle class urban ethnic groups. Technical Report, 1975, Contract HSM 42-73-238, National Institute of Mental Health.
Weissman, M. M., Prusoff, B., & Newberry, P. Comparison of the CES-D with standardized depression rating scales at three points in time. Technical Report, 1975, Yale University, Contract ASH-74-166, National Institute of Mental Health.
Zung, W. W. K. A self-rating depression scale. Archives of General Psychiatry, 1965, 12, 63-70.

Acknowledgements

The CES-D Scale was originally developed by Mr. Ben Z. Locke, Chief, Center for
Epidemiologic Studies (CES), National Institute of Mental Health and Dr. Peter
Putnam, formerly at CES. The overall program was initiated by Dr. Robert
Markush, former Chief, CES. The field studies were carried out by the
Epidemiologic Field Station, Greater Kansas City Mental Health Foundation,
Kansas City, Missouri (Dr. Robijn Hornstra, Director) and The Training Center
for Public Health Research, Johns Hopkins University, Hagerstown, Maryland
(Dr. George Comstock) under contract with CES. The clinical validation studies
were carried out by Dr. Thomas Craig, formerly of Johns Hopkins and Dr. Myrna
Weissman, Yale University, under contract with CES. Important advice on and
review of this report was provided by those connected with the study,
especially Dr. Thomas Craig and Dr. Evelyn Goldberg, Johns Hopkins; by Dr. Len
Derogatis, Johns Hopkins; and by the reviewers and editor of this journal.

Author's Address

Lenore S. Radloff, Room 10C-09, Parklawn Building, 5600 Fishers Lane,
Rockville, Maryland 20852.
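
Illustrative sketch: Factorial Invariance Coefficient

The Factor Analysis section describes the Factorial Invariance Coefficient as the correlation between the item loadings on one factor in one group and the item loadings on one factor in another group. The sketch below is a minimal illustration of that description only. It assumes a plain Pearson correlation, which may differ from what the FMATCH program cited above actually computes. The function names (`pearson`, `invariance_matrix`), the factor labels, and all loadings are hypothetical and are not taken from this study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of loadings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    if denominator == 0:
        raise ValueError("loadings have zero variance")
    return numerator / denominator


def invariance_matrix(loadings_a, loadings_b):
    """Correlate every factor in group A with every factor in group B.

    Each argument maps a factor name to that factor's item loadings,
    with items listed in the same order in both groups.
    """
    return {
        (factor_a, factor_b): pearson(items_a, items_b)
        for factor_a, items_a in loadings_a.items()
        for factor_b, items_b in loadings_b.items()
    }


# Made-up loadings for six items on two factors in two samples.
group_1 = {
    "depressed_affect": [0.70, 0.65, 0.60, 0.10, 0.05, 0.15],
    "somatic": [0.10, 0.15, 0.05, 0.60, 0.70, 0.55],
}
group_2 = {
    "depressed_affect": [0.68, 0.60, 0.66, 0.12, 0.08, 0.10],
    "somatic": [0.12, 0.10, 0.08, 0.65, 0.62, 0.60],
}

coefficients = invariance_matrix(group_1, group_2)

# "Diagonal" coefficients pair the same factor across groups;
# "off-diagonal" coefficients pair different factors.
diagonal = {k: v for k, v in coefficients.items() if k[0] == k[1]}
off_diagonal = {k: v for k, v in coefficients.items() if k[0] != k[1]}

for pair, value in sorted(coefficients.items()):
    print(pair, round(value, 2))
```

With similar factor structures, the diagonal entries should be high and the off-diagonal entries clearly lower, which is the pattern reported above for Q1 versus Q2 and Q1 versus Q3. In this two-factor toy example the off-diagonal values come out negative rather than near zero, because items that load high on one made-up factor load low on the other.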



Downloaded from the Digital Conservancy at the University of Minnesota, http://purl.umn.edu/93227.
