Shear et al.
INTRODUCTION
The Hamilton Anxiety Rating Scale [HAM-A or
HARS: Hamilton, 1959, 1969] is a 14-item clinical interview measure of somatic and psychic anxiety symptoms. This scale was one of the first attempts to
measure the clinical status of patients diagnosed with
neurotic anxiety states quantitatively and has become
one of the most widely used symptom rating scales in
the world. Although the scale assesses a broad range of
symptoms that are common to all eight of the DSM-IV
Anxiety Disorders, it is most often used to assess the severity of Generalized Anxiety Disorder (GAD). The
Hamilton Anxiety Rating Scale is the main
outcome measure in most treatment studies of this disorder. However, in its original form, the scale has no
established reliability aids, such as instructions for administration or for scoring, and there are no scripted
questions to guide the interviewers who administer this
scale. Without such guidelines, the method of administering each item and assigning the level of symptom
severity can be quite arbitrary. These deficiencies could
result in inconsistent use of this scale, which could increase the variability of treatment outcome ratings and
decrease the accuracy of cross-site or cross-rater comparisons [Bruss et al., 1994]. The industry sponsor of
2001 WILEY-LISS, INC.
METHODS
The study sample included individuals who sought
treatment for an anxiety disorder at one of three sites
(Western Psychiatric Institute and Clinic, Massachusetts General Hospital, and the Medical University of
South Carolina) between April 1, 1997 and February
28, 1998. Eligible patients signed informed consent as
approved by the Institutional Review Boards associated
with the three study sites. All interviews were videotaped for the purpose of co-rating. In addition to completing structured interviews, all patients completed
self-report questionnaires assessing anxiety-related
symptoms. Participants underwent two interviews on
each of 2 days and were compensated $50 for each day.
Study participants were consenting patients, age 18
years or older, who met criteria for a DSM-IV Anxiety
Disorder of at least 6 months in duration, as determined by trained raters using the Structured Clinical Interview for DSM-IV [SCID-P with Psychotic Screen;
Spitzer et al., 1995]. Participants were excluded from
the study if they had a primary diagnosis of Major Depressive Disorder, Panic Disorder without comorbid
Generalized Anxiety Disorder, psychotic disorder, or if
they met criteria for any psychoactive substance abuse
or dependence within the past 6 months, or if they were
judged unable to participate reliably in the interviews
because of a chaotic lifestyle, practical
problems, or other characteristics that the coordinator judged likely to interfere with providing accurate
or complete data.
Participants underwent 2 days of testing within a 7-day period. On each day all participants completed
both the traditional Hamilton Anxiety Rating Scale
and the structured interview form of the Hamilton
scale, with the order of these scales randomly assigned
and counterbalanced across the two testing days. On
day 2 the scales were administered in the opposite order
from day 1. In addition, all participants completed the Beck Anxiety Inventory (BAI). Patients
completed the Patient Global Improvement (PGI) on
day 2 to ensure that there was no substantial change in
clinical state. The mean PGI score on day 2 was 3.73
(SD 1.02) (3 = minimally improved; 4 = unchanged).
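The counterbalancing described above can be illustrated with a minimal sketch. It is not the procedure the sites actually used: the function name, the seed, and the alternating-assignment rule are assumptions made for illustration only.

```python
import random

SCALES = ("HAM-A", "SIGH-A")


def assign_orders(participant_ids, seed=0):
    """Return a hypothetical schedule: participant -> scale order on each day."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)  # random assignment of participants to a starting order
    schedule = {}
    for index, pid in enumerate(ids):
        # Alternate the starting scale over the shuffled list so that
        # the two day-1 orders are used equally often (counterbalancing).
        day1 = SCALES if index % 2 == 0 else SCALES[::-1]
        # Day 2 uses the opposite order from day 1.
        schedule[pid] = {"day1": day1, "day2": day1[::-1]}
    return schedule
```

Alternating over a shuffled list keeps the two orders balanced while leaving each participant's assignment random.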
Raters who participated in this study were experienced research interviewers. Similar to standard procedures in pharmaceutical trials, they received a brief
introduction and instructions for using each scale but
did not undergo formal training and certification procedures. There was no cross-site training. In order to
avoid crossover effects, e.g., inadvertent use of the
guidelines from the structured interview, raters at each
site administered either the structured SIGH-A only
or the unstructured HAM-A version only. A different
rater administered each scale on day 1 and day 2, thus
providing a more stringent test of inter-rater reliability
than co-rated videotapes. All interviews were videotaped and sent to Western Psychiatric Institute and
Clinic. Two raters at each site, along with two additional raters from New York State Psychiatric Institute,
carried out co-ratings of 32 videotaped interviews for
the structured interview form of the Hamilton Scale
and Hamilton Anxiety Rating Scale. The 32 tapes were
selected at random, 16 from the first half
of the sample and 16 from the second half of the
sample, with an even distribution of the range of Clinical Global Impression severity scores across sites.
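The tape selection can be sketched as follows, assuming the tape identifiers are listed in enrollment order. The function and parameter names are hypothetical, and the sketch does not implement the balancing of Clinical Global Impression severity across sites.

```python
import random


def select_tapes(tape_ids, per_half=16, seed=0):
    """Randomly draw `per_half` tapes from each half of the sample."""
    rng = random.Random(seed)
    tapes = list(tape_ids)  # assumed to be in enrollment order
    midpoint = len(tapes) // 2
    first_half, second_half = tapes[:midpoint], tapes[midpoint:]
    # Sampling without replacement within each half gives 2 * per_half tapes.
    return rng.sample(first_half, per_half) + rng.sample(second_half, per_half)
```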
INSTRUMENTS
HAMILTON ANXIETY RATING SCALE
[HARS; HAMILTON, 1959]
This instrument was developed to assess and quantify symptom severity among patients with anxiety
neurosis. Inter-rater reliability has been reported as an
Intraclass Correlation Coefficient of 0.74–0.96 [Bruss
et al., 1994].
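For readers unfamiliar with the statistic, an intraclass correlation can be computed from a subjects-by-ratings matrix as in the sketch below. The text does not state which ICC form was used; the one-way random-effects form shown here is an assumption, and the function name is illustrative.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for n subjects, each with k ratings.

    `ratings` is a list of per-subject lists of equal length k.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

With two ratings per subject (for example, day 1 and day 2 totals), identical ratings give an ICC of 1, and the value falls as within-subject disagreement grows relative to between-subject spread.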
BECK ANXIETY INVENTORY [BAI; BECK ET
AL., 1988]
This instrument is a 21-item, self-report questionnaire designed to assess and evaluate the frequency of
anxiety symptoms over a one-week period. This test
assesses two factors: cognitive and somatic symptoms.
The instrument has good internal consistency (α =
0.92), test-retest reliability (r = 0.75; df = 81, P
< .001), and convergent and discriminant validity.
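Internal consistency of the kind reported above is commonly summarized with Cronbach's alpha. The sketch below shows the standard formula applied to a respondents-by-items score matrix; the function name and input layout are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    items = list(zip(*item_scores))  # transpose to one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in item_scores])
    # alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```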
RESULTS
DEMOGRAPHIC AND CLINICAL
CHARACTERISTICS
Eighty-nine adults participated in the study including 30 at Massachusetts General Hospital, 30 at the
Medical University of South Carolina, and 29 at
Western Psychiatric Institute and Clinic. Sixty percent
Figure 1.
Figure 2.
TABLE 1. Reliability of SIGH-A and HAM-A Scales in Patients With and Without Current GAD

                                            Total sample              Current GAD               No current GAD
ICC (test-retest) for HAM-A                 .86 (.78–.91)             .79 (.66–.87)             .94 (.85–.97)
ICC (test-retest) for SIGH-A                .89 (.83–.93)             .88 (.80–.92)             .93 (.84–.97)
ICC (inter-rater reliability) for HAM-A     .98 (.97–.99)             .98                       .98
ICC (inter-rater reliability) for SIGH-A    .99 (.98–.99)             .98                       .99
HAM-A/SIGH-A correlation                    .77 (day 1), .75 (day 2)  .70 (day 1), .72 (day 2)  .89 (day 1), .84 (day 2)
Internal consistency (alpha) for SIGH-A     .82                       .79                       .88
Internal consistency (alpha) for HAM-A      .85                       .82                       .92
Mean HAM-A total score, day 1 (S.D.)        20.58 (8.48)              20.85 (7.47)              19.80 (11.05)
Mean SIGH-A total score, day 1 (S.D.)       24.62 (9.09)              24.67 (8.68)              24.48 (10.38)
of the two forms was essentially the same (0.53 for the
traditional Hamilton scale and 0.57 for the SIGH-A).
This finding provides further confirmation of the convergent validity of the two forms of the instrument.
DISCUSSION
The Hamilton Anxiety Rating Scale is an extensively used assessment instrument that was developed for rating the severity of anxiety symptoms before
the development of reliable diagnostic criteria for
different Anxiety Disorders. In some studies it has
shown sensitivity to change and may be useful as an
outcome measure in clinical settings. The HAM-A is
the primary outcome measure most often used in
treatment studies of Generalized Anxiety Disorder,
and it is also used to rate severity of anxiety symptoms in other disorders. The lack of instructions for
administration and the absence of clear anchor points
for severity ratings make training difficult, and decisions about both administration and scoring can be idiosyncratic.
We developed a structured interview to provide an
explicit guide for the use of the Hamilton scale. The
study reported here documents good reliability of the
SIGH-A, though the traditional scale also performed
well. Notably, rater reliability across the 2 days was
significantly lower on the unstructured
HAM-A in GAD patients than in non-GAD patients. Scores on the two instruments were highly correlated in this study, with a uniform and reliable
difference between them. One possible reason that the
SIGH-A yielded slightly higher scores on average is
that, unlike the traditional scale, the SIGH-A instructs
clinicians to probe subject responses for frequency, distress, and interference before making ratings. Furthermore, these ratings are based on distinct severity scale
anchor points. Because the HAM-A does not instruct
clinicians to probe subject responses before making ratings, raters may obtain less information from subjects,
which may contribute to its lower total scores. A second possibility is that the
raters, despite not having formal training on the SIGH-A instrument, may have been aware of the hypothesis of
the study. These rater expectancies could have influenced the ratings in a consistent direction.
Our study is limited by the fact that this was a
sample of subjects who presented to our university-based clinics. The subjects we recruited were similar
to those who come to our settings for treatment of
anxiety and depressive disorders in having a range of
diagnoses and severity. Moreover, collection of data
from three sites improves generalizability. However,
we cannot be certain if results would be similar for patients presenting to other clinical settings or for those
who do not seek treatment. In addition, all raters in
this study had prior experience as research raters; we
do not know whether untrained raters would achieve
similar levels of reliability on either instrument.
We conclude that either form of the Hamilton scale
can be used with confidence by trained research raters.
The main advantage of the traditional format is that it
has been used for many years. However, the advantage
of the structured interview is that it provides instructions that assist in training and increase the consistency of
administration and scoring, which may also permit
more appropriate cross-site comparisons and reduce
the variability of treatment outcome ratings. The fact
that raters in this study were all experienced research
assessors may have contributed to the lack of significant differences between the two instruments. Providing clear instructions may be especially useful when
raters are inexperienced and extensive training is impractical. A study comparing ratings in such a situation would be of interest.
ACKNOWLEDGMENTS
The authors acknowledge the contributions of assessors at Columbia University: Richard Blumenthal,
REFERENCES
Bartko JJ, Carpenter WT. 1966. The Intraclass Correlation Coefficient as a measure of reliability. Psychol Rep 19:3–11.
Beck AT, Epstein N, Brown G, Steer RA. 1988. An inventory for
measuring clinical anxiety: psychometric properties. J Consult
Clin Psychol 56:893–897.
Bruss GS, Gruenberg AM, Goldstein RD, Barber JP. 1994. Hamilton anxiety rating scale interview guide: joint interview and test-retest methods for interrater reliability. Psychiatry Res 53:
191–202.
Hamilton M. 1959. The assessment of anxiety states by rating. Br J
Med Psychol 32:50–55.
Hamilton M. 1969. Diagnosis and rating of anxiety. Br J Psychiatry
Special Pub 3:76–79.
Williams JB. 1988. A structured interview guide for the Hamilton
Depression Rating Scale. Arch Gen Psychiatry 45:742–747.