Do the DSM Decision Trees Improve
Diagnostic Ability?

Robert D. Morgan
Oklahoma State University

Kenneth R. Olson, Randy M. Krueger,
Richard P. Schellenberg, and Thomas T. Jackson
Fort Hays State University
Experiment 1 examined whether the use of the DSM-III-R decision trees
increased the accuracy of DSM-III-R diagnoses. Results indicated that the
use of the decision trees interacted with the level of DSM-III-R experience
to affect diagnostic accuracy. The use of the decision trees resulted in a
modest increase in diagnostic accuracy for participants with less DSM-
III-R experience; for participants with more DSM-III-R experience, the use
of the decision trees had no significant effect on diagnostic accuracy.
Experiment 2 examined whether the use of the DSM-III-R decision trees
increased the accuracy and confidence and decreased the time of DSM-III-R
diagnosis across participants with varying levels of DSM-III-R experience.
The primary analyses consisted of a 3 × 2 × 2 multivariate analysis
of variance (MANOVA) to determine whether the use of the decision trees
increased diagnostic accuracy and diagnostic confidence and decreased
diagnostic time. Results indicated (1) the experienced participants made
more accurate diagnoses than the less-experienced and no-experience
participants; (2) the decision trees, combined with practice, increased
class diagnostic accuracy and decreased diagnostic time; and (3) partici-
pants were more confident in their diagnosis when they used the decision
trees than when they did not use the decision trees. Supplementary analyses
consisted of two one-way analysis of variance (ANOVA) procedures
and indicated that participants' preference for and knowledge of how to
use the decision trees did not significantly affect their diagnostic accuracy.
© 2000 John Wiley & Sons, Inc. J Clin Psychol 56: 73–88, 2000.
Robert D. Morgan is now a postdoctoral fellow in forensic psychology in the Department of Psychiatry at the
University of Missouri-Kansas City.
Correspondence concerning this article should be addressed to Robert D. Morgan, Department of Forensic
Services, Western Missouri Mental Health Center, 600 E. 22nd Street, Kansas City, MO 64108; e-mail:
RobertDMorgan@juno.com
JOURNAL OF CLINICAL PSYCHOLOGY, Vol. 56(1), 73–88 (2000)
© 2000 John Wiley & Sons, Inc. CCC 0021-9762/00/010073-16
74 Journal of Clinical Psychology, January 2000
Introduction
Clinical diagnosing is salient to the helping process as it is critical to developing a successful
treatment plan (Stout, 1991, p. 141), facilitates meaningful communication among
mental health professionals (Malt, 1986), and is a requirement for insurance reimburse-
ments; however, to be useful in clinical and research settings, diagnoses must be valid
and reliable. Several studies have pointed out the problems of low reliability in clinical
diagnosing (e.g., Ash, 1949; Mehlman, 1952; Spitzer & Fleiss, 1974).
Clinicians and theorists alike have attempted to develop a more reliable diagnostic
classification system. The Diagnostic and Statistical Manual of Mental Disorders (DSM)
has been under continual revision since the first edition was published in 1952 (American
Psychiatric Association, 1952). Each edition of the DSM has sought to improve its clinical
usefulness with efforts being directed at overcoming the serious weaknesses in reliability
and validity encountered with the previous editions (Carson, Butcher, & Coleman, 1988).
Ward, Beck, Mendelson, Mock, and Erbaugh (1962) indicated three reasons for diag-
nostic disagreement: patient inconsistencies, diagnostician inconsistencies, and more impor-
tantly, inadequacies in the nosology. Revised editions of the DSM, with their attempts at
improving reliability, take aim at these problems. In an attempt to increase the validity of
diagnosing, DSM-III-R revisions have consisted of major changes in the diagnostic dis-
orders and their criteria rather than altering the DSM criteria-based format (e.g., Vaglum,
Friis, Vaglum, & Larsen, 1989).
A study by Malt (1986) demonstrated that the DSM-III classification system is supe-
rior in reliability to other systems of classification. Further evaluations of the DSM-III
suggested that its reliability was enhanced over previous editions (Carson et al., 1988).
The extensive field trials conducted by Spitzer, Forman, and Nee (1979) are further
evidence of the increase in diagnostic reliability of the DSM-III over previous versions,
and Adams and Cassidy (1993) concluded that DSM-III and its successors represent a
decided improvement over previous efforts (p. 23).
In addition to reliability, the accuracy of clinical diagnosis must be considered and is
especially important with recent emphasis in developing and identifying specific treat-
ments for specific DSM-III-R disorders (e.g., American Psychiatric Association, 1989).
A reliable diagnostic system is of little use if it does not result in more accurate diagnoses.
While DSM-IV is a continued attempt at developing the most reliable and accurate diag-
nostic system, efforts must continue to focus on facilitating the most reliable and accurate
diagnostic decision-making process. One such attempt would be to use the manual in the
most efficient manner possible, which may include the use of the decision trees.
Clinical judgment or intuition is integral to the clinical diagnostician's repertoire;
however, as LeLaurin (1990) argues, judgment-based assessment is only as good as the
objectivity, reliability, and validity of the data collection used to make judgments and
diagnoses. Diagnosing relies heavily on clinical judgment and intuition, but even this
approach is based on underlying baseline data as gathered by the clinician through expe-
rience. As LeLaurin points out, this baseline data is only good if used objectively and
reliably. In an analysis of medical decision-making, Elstein, Shulman, and Sprafka (1978)
indicate that diagnostic errors occur as the result of mistakes in the analysis of large
quantities of complex information, and diagnostic accuracy may be improved by the use
of strategies aimed at systemizing a clinician's inferences (e.g., flow charts). Decision
trees offer a systematic method for determining a diagnosis that may eliminate much of
the subjective decision-making on the part of the clinician.
The DSM decision trees do not follow a fixed methodological approach; rather they
function as a guide by encouraging clinicians to be more comprehensive in their consid-
erations of history, signs, and symptoms (Reid & Wise, 1989). The DSM-III-R (Ameri-
can Psychiatric Association, 1987) decision trees for differential diagnosis were developed
to assist professionals in understanding the organization and hierarchic structure of the
classification (p. 377). It does not seem unreasonable to suggest that increased under-
standing in this regard would result in increased diagnostic accuracy.
Millon (1983) described what are probably two typical but contrasting views of the
usefulness of the decision trees: on the one hand he commented that the decision trees are
likely to be considered an unnecessary encumbrance for routine diagnostic tasks, quite
impractical for everyday decision making and perhaps most relevantly, abhorrent to cli-
nicians accustomed to the diagnostic habit of intuitive synthesis (p. 809). Contrary to
Millon's hypothesis, Timmermans and Vlek (1992) indicate that decision aids are designed
to solve difficult decision problems. The justification for using decision aids lies in the
shortcomings of human judgment (p. 50). Furthermore, complex problems require more
cognitive effort and often result in the use of simplified decision strategies and a less
complete evaluation of information (e.g., Olshavsky, 1979; Payne, 1976). Millon (1983)
also appears to recognize the potential of the decision trees as he observed that should
the method guarantee significantly greater diagnostic accuracy . . . then it might gain a
sufficient following to override the inertia of traditional practice (p. 809).
The DSM decision trees appear to be a novel experiment by the DSM-III task force
that has not been taken too seriously, as no studies have investigated their
use in diagnostic decision-making. The purpose of these experiments
was to evaluate the effects of the DSM decision trees on diagnostic accuracy. Experiment
1 examined whether the use of the decision trees increased the accuracy of DSM-III-R
diagnoses, and it was hypothesized that they would. The purpose of Experiment 2 was to:
(a) replicate and expand Experiment 1, and (b) examine whether the use of the DSM-
III-R decision trees increased the accuracy of diagnoses for those participants with less
DSM experience and also decreased the time of diagnoses. It was hypothesized that the
use of the decision trees would increase diagnostic accuracy across groups with various
levels of DSM-III-R experience and that the use of the decision trees would decrease the
diagnostic time across these groups.
Experiment 1
Method
Participants. Participants consisted of 15 graduate students in a Master's degree program
at a midwestern university. The students had been enrolled in an advanced Abnormal
Psychology and/or an Applied Practicum course and had received training (i.e., classroom
lecture and practical experience) in the use of the DSM-III-R. Of the 15 participants,
10 were female and 5 were male. Their ages ranged from 23 to 43 (M = 30.1, SD = 7.9).
The participants had varying amounts of previous DSM-III-R experience and training
(M = 22.0 weeks; SD = 21.9); three of the participants were first-year graduate
students; 11 were second-year graduate students; one was a third-year graduate student.
Materials. Ten case vignettes were selected randomly from the DSM-III-R casebook
(Spitzer, Gibbon, Skodol, Williams, & First, 1989) and used as case studies for the par-
ticipants to make their diagnoses. Two case studies were selected for each decision tree.
For each category of decision trees (e.g., psychotic symptoms) two disorders were selected
randomly, and then case studies representing those disorders were selected randomly
from the DSM-III-R casebook. Case vignettes were assigned randomly to each condition
(with and without the decision trees), and the order of presentation of the vignettes also
was assigned randomly. Thus, each vignette had an equal opportunity to be in either
condition and appear in any position (e.g., first, second, third, etc.).
The participants used the DSM-III-R and photocopies of the five DSM-III-R deci-
sion trees when making their diagnoses. An 11-item questionnaire concerning the use of
the decision trees also was administered to the participants. The first six questions inquired
about the participants' previous use of, understanding of, and attitude toward the decision
trees, and possible future use of the decision trees. Participants responded to these questions
by making ratings on Likert-type scales ranging from one ("not at all") to five
("very much"). The last five questions pertained to the method employed when using the
decision trees and demographics including sex, age, year in graduate school, and number
of weeks of DSM-III-R training.
Procedure. The participants first were informed that they would receive five case
vignettes and that they were to make an Axis I DSM-III-R diagnosis. They were instructed
to use the method that they usually used when making their diagnoses with the exception
that if they usually used the decision trees provided in the manual they were to refrain
from using them with these vignettes. The participants then were given five of the case
vignettes and instructed to make an Axis I diagnosis for each case. They were told that the
Axis I diagnosis was to come from one of 11 categories, all of which were represented in
the decision trees (e.g., schizophrenia, mood disorder, and anxiety disorder).
Upon completion of the five vignettes, the participants were informed that they would
be given five more case vignettes, and that again they were to make an Axis I DSM-III-R
diagnosis. This time, however, they were to use the decision trees when making their
diagnosis. The experimenter then gave two suggestions on how to implement the decision
trees when making a diagnosis. Method one consisted of using the decision trees as a
check of their usual method of diagnosing (i.e., method used on the first five vignettes).
Method two consisted of locating the appropriate decision tree and then proceeding down
the tree, referring to the diagnostic criteria in the manual as needed, until the participants
arrived at a diagnosis.
The participants then were given the other five case vignettes (again representing the
five decision trees) and copies of the five DSM-III-R decision trees. They were instructed
to implement the decision trees to make an Axis I DSM-III-R diagnosis from one of the
11 categories using either one of the two methods suggested by the experimenter or any
other preferred method. When finished with these case vignettes, the participants were
asked to complete the questionnaire.
Results and Discussion
The primary analysis was a one-way repeated-measure analysis of variance (ANOVA)
design with (and without) decision trees being the independent variable. The purpose of
the analysis was to determine whether the use of the decision trees increased diagnostic
accuracy. Accuracy scores were computed by determining the number of diagnoses on
the vignettes that correctly matched those of the casebook. A participant's accuracy score
in each condition (with and without the trees) ranged from 0 to 5. The results indicated that
using or not using the decision trees had no significant effect on the participants' diagnostic
accuracy, F(1, 13) = 2.27, p = 0.156. Supplementary analyses were conducted in order to
determine whether other variables might interact with the use of the decision trees in
affecting diagnostic accuracy. These variables included (a) whether or not participants
liked using the trees, (b) the method participants employed when using the trees (Method 1
or Method 2, as described previously), and (c) amount of DSM-III-R experience and
training. Analyses implementing repeated-measure ANOVA designs for the first two of
these variables were not significant.
The supplementary analysis involving the experience variable entailed dividing the
sample into those participants with less than 15 weeks of DSM-III-R training or experience
(n = 7) and those participants with 15 or more weeks of training or experience (n = 8).
A 2 (with vs. without decision trees) × 2 (less vs. more experienced) repeated-measure
ANOVA performed on the accuracy scores yielded a significant interaction,
F(1, 13) = 4.99, p = 0.044, indicating that without the decision trees the more experienced
participants were more accurate in their diagnoses of the vignettes than were the
less-experienced participants. However, when both groups used the decision trees there
was virtually no difference in accuracy of diagnoses. This interaction is presented in
Figure 1. With the use of the decision trees, the less-experienced participants increased
their accuracy of diagnoses by more than one point. These results suggest that without
using the decision trees, participants who have less than 15 weeks of training or experi-
ence with the DSM-III-R do not make as accurate a diagnosis as those participants with
15 or more weeks of training or experience. These results are consistent with those of
Lambert and Wertheimer (1988), who showed that diagnostic ability is related to training
and experience. However, when the less-experienced participants used the decision trees
this difference disappeared. This increase in accuracy may be a result of a better under-
standing of the DSM organization and hierarchic structure as stated in the manual (p. 377),
or it may be a reflection of the organized step-by-step decision-making process of diag-
nosing that was absent in the less-experienced group when not using the trees.
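For a 2 (within) × 2 (between) design of this kind, the interaction F test is equivalent to an independent-samples t test on each participant's with-minus-without difference score, with F = t² and error degrees of freedom equal to N − 2 (13 for 15 participants). The Python sketch below illustrates that equivalence. The scores are hypothetical values invented for illustration, not the study's data, and the function name is an assumption.

```python
from statistics import mean


def interaction_f(diff_group_a, diff_group_b):
    """Return (F, (df1, df2)) for the within-by-between interaction.

    Computed as the squared pooled-variance t statistic comparing the two
    groups' difference scores (with trees minus without trees).
    """
    n_a, n_b = len(diff_group_a), len(diff_group_b)
    mean_a, mean_b = mean(diff_group_a), mean(diff_group_b)
    ss_a = sum((d - mean_a) ** 2 for d in diff_group_a)
    ss_b = sum((d - mean_b) ** 2 for d in diff_group_b)
    df_error = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df_error
    t = (mean_a - mean_b) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return t ** 2, (1, df_error)


# Hypothetical accuracy scores (0 to 5) for illustration only.
less_without = [1, 2, 1, 0, 2, 1, 2]      # n = 7, less experienced
less_with = [2, 3, 2, 2, 3, 2, 3]
more_without = [3, 2, 3, 2, 3, 3, 2, 3]   # n = 8, more experienced
more_with = [3, 2, 2, 2, 3, 3, 2, 2]

less_diff = [w - wo for w, wo in zip(less_with, less_without)]
more_diff = [w - wo for w, wo in zip(more_with, more_without)]

f_value, df = interaction_f(less_diff, more_diff)
print(f"F{df} = {f_value:.2f}")
```

A large F here means the benefit of using the trees differs between the two experience groups, which is the pattern the interaction describes.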
The design of the present study required participants to diagnose the first five vignettes
without the trees and then diagnose the second five vignettes with the trees. This feature
was included intentionally in order to minimize the lack of control over participants'
diagnostic processes that might result from teaching them to use a possibly helpful tool
(trees) for the first set of vignettes and then relying on them not to make implicit use of
this tool for the second set of vignettes. Therefore, the effect of practice must be consid-
ered when interpreting the results. For example, this design feature raises the question of
whether the increase in accuracy scores for the less-experienced participants in the present
study might be attributable to practice effects during the second set of vignettes. There
are two considerations that appear to weigh against this possibility. One pertains to the
expectation that any practice effects might benefit the experienced participants as well as
the less-experienced participants. However, as observed previously, while nearly all of
the less-experienced participants had improved accuracy scores on the second set of
vignettes, nearly all of the experienced participants had the same or lower scores on the
second set of vignettes. The second consideration pertains to the fact that if there were
practice effects they would most likely be indicated by difference(s) in accuracy scores
for the various positions in which the vignettes were administered (i.e., first position
receiving a low score, last position receiving a high score). To test this, two regression
analyses examining the relationship between the percentage of vignettes correct and the
number of trials were conducted. The first analysis examined the first five vignettes
without use of the decision trees; the second analysis considered the remaining vignettes
with use of the decision trees. The obtained results failed to provide evidence for order
effects: slope = −11.42, F(1, 3) = 2.997, p = 0.18 (without decision trees) and
slope = −14.27, F(1, 3) = 7.469, p = .07 (with decision trees).
Figure 1. Diagnostic accuracy when using the decision trees by weeks of experience.
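With five serial positions per condition, a simple linear regression of percentage correct on position has 1 and 3 degrees of freedom, which matches the F(1, 3) values reported for the order-effect check. A minimal sketch of that computation follows. The percentages are hypothetical, not the study's data, and the function name is an assumption.

```python
def slope_and_f(positions, pct_correct):
    """Least-squares slope and overall F test for a one-predictor regression."""
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(pct_correct) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(positions, pct_correct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Partition the variability into regression and residual parts.
    ss_regression = slope * sxy
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(positions, pct_correct))
    df_residual = n - 2
    f_value = ss_regression / (ss_residual / df_residual)
    return slope, f_value, (1, df_residual)


# Hypothetical percentage of participants correct at each serial position.
positions = [1, 2, 3, 4, 5]
pct_correct = [60.0, 47.0, 53.0, 27.0, 20.0]

slope, f_value, df = slope_and_f(positions, pct_correct)
print(f"slope = {slope:.2f}, F{df} = {f_value:.3f}")
```

A practice effect would show up as a reliably positive slope. A negative or nonsignificant slope gives no support for improvement across trials.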
Additional supplementary findings included descriptive statistics for participants'
ratings with respect to the decision trees, and these ratings are presented in Table 1. These
results indicate that participants usually do not incorporate the decision trees into their
diagnostic repertoire. However, there were indications they would use them somewhat in
the future, and they would use them to a moderate degree if they were provided with
further training and if research supported the reliability and validity of the decision trees.
In addition, participants moderately liked using the trees and perceived the trees as some-
what helpful in making their diagnoses. Finally, the participants indicated they under-
stood how to use the decision trees when making a diagnosis.
While this study offers promise for the utility of the DSM decision trees, there are
two limitations that need to be noted. The first is the small sample size and its lack of
representativeness with respect to DSM-III-R users. The second limitation is that the
main finding of the interaction between the use of trees and experience level was an
unpredicted result that was observed in the context of post-hoc analyses.
Even with the above limitations, the present results caution against dismissing the
decision trees as being of minimal use in training professionals to use the DSM-III-R in
ways that would increase diagnostic accuracy. Clarification of the potential usefulness of
the trees most likely would result from studies similar to the present one that would
involve larger sample sizes and well-reasoned predictions about how level of experience
might interact with the use of the trees to produce increased diagnostic accuracy. Find-
ings consistent with the present results could provide a basis for considering how the
trees might be used to improve training in the use of the DSM-III-R and its successor,
DSM-IV.
Table 1
Descriptive Statistics for Preference and Use Ratings With Respect to the Decision Trees

Question                                                 Mean     Standard Deviation
Like using trees?                                        3.467    1.407
Did the trees help?                                      3.533    1.457
Understand how to use the trees?                         4.257    0.597
Would you use the trees?                                 3.333    1.496
Would you use the trees if provided further training?    4.333    1.047
Do you usually use the trees?                            1.400    0.828
Experiment 2
Experiment 2 expands upon Experiment 1 and addresses some of the limitations in the
original study. Lambert and Wertheimer (1988) found that diagnostic accuracy increases
significantly with relevant education and relevant experience. A study by Webb, Gold,
Johnstone, and DiClemente (1981) showed that after a two-and-one-half-day training
program, participants with no previous DSM-III exposure were able to agree with an
expert opinion on 74% of diagnoses of videotaped cases. While the training program
used by Webb et al. lasted two and one half days, Malt (1986) demonstrated that reliable
DSM-III diagnoses could be achieved with a training program that lasts only a few hours
and with reference to the decision trees and the diagnostic criteria alone. These studies
indicate that diagnostic accuracy can be improved with training.
A final note concerning the decision trees is their practicality for use in clinical
practice. Millon (1983) indicated that the decision trees would seem to increase the diag-
nostic time and would be an encumbrance for the diagnostic process; however, an
alternative hypothesis would be that the use of the decision trees may provide clinicians
with a more effective diagnostic approach; therefore, diagnostic time would decrease.
Experiment 2 improves upon Experiment 1 with an increased sample size, an a priori
distinction between participants' levels of DSM experience, and an assessment of
diagnostic speed and diagnostic confidence.
Method
Participants. Participants included 20 undergraduate students, 20 graduate psychology
students, and 20 graduate students and professional mental-health workers who use the
DSM-III-R to make diagnoses (i.e., psychologists and counselors). The undergraduate
students were or had been enrolled in an undergraduate Abnormal Psychology course and
served as a no-experience control group. The 20 graduate students were a relatively
inexperienced group with regard to the DSM-III-R. They had never been enrolled in a
graduate-level psychopathology course involving extensive coverage of the DSM-III-R
but had limited exposure to the DSM-III-R during practicum coursework. The third group
had more DSM-III-R experience, as they had been enrolled in a graduate-level course
involving extensive coverage of the use of the DSM-III-R and/or used the DSM-III-R in
clinical practice. This group consisted of graduate students and professional mental-
health workers who were using the DSM-III-R to make diagnoses.
Materials. Ten case vignettes were selected randomly from the DSM-III-R casebook
(Spitzer et al., 1989) and used as case studies for the participants to make their diagnoses.
Case vignettes were selected from the casebook in the same manner as in Experiment 1
with the exception that all vignettes used in a graduate-level Abnormal Psychology class
at the university under study, and those used in Experiment 1, were eliminated from the
pool of possible vignettes.
Following selection of the ten vignettes, the two case studies for each decision tree
then were assigned randomly to one of two conditions (A & B). Each condition included
five vignettes, with the five decision trees equally being represented in each condition.
That is, Condition A had five case vignettes with each vignette representing one of the
decision trees, and the same was true for Condition B. The case vignettes in both condi-
tions partially were counterbalanced using a Latin-square procedure so that each vignette
appeared in each position of the condition an equal number of times. Vignette one, for
example, was represented equally in conditions A and B, and appeared in each position of
the conditions an equal number of times.
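One common way to build such a counterbalancing scheme is a cyclic Latin square, in which each row is the previous row rotated by one position. The source does not state which square was used, so the sketch below is only one possible construction, and the vignette labels are placeholders.

```python
def cyclic_latin_square(items):
    """Return len(items) presentation orders forming a cyclic Latin square.

    Row r is the item list rotated left by r places, so every item occupies
    every serial position exactly once across the rows.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Placeholder labels for the five vignettes within one condition.
vignettes = ["V1", "V2", "V3", "V4", "V5"]
orders = cyclic_latin_square(vignettes)

for order in orders:
    print(" ".join(order))
```

Across the five rows, each vignette appears once in every serial position, which is the property the procedure requires.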
Conditions A and B then were assigned randomly to one of two treatment orders
(decision trees used on the first five vignettes or second five vignettes) for each partici-
pant. Thus, every participant received Condition A and Condition B, but half (10 from
each level of experience group) used the decision trees on their first five vignettes and
half (the remaining 10 from each level of experience group) used the decision trees on
their second five vignettes. The conditions (A & B) were assigned randomly to the order
of tree use, with the exception that the two conditions appeared in both orders of tree use
an equal number of times (30 times). Thus, of the 60 participants, 15 (five from each level
of experience group) received condition A first and used the decision trees when making
their diagnosis; 15 participants received condition A first and did not use the decision
trees when making their diagnosis; 15 participants received condition A second and used
the decision trees when making their diagnosis; and 15 participants received condition A
second and did not use the decision trees when making their diagnosis. The same proce-
dure was incorporated for condition B.
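The allocation described above can be enumerated as a check on the cell counts. The sketch below crosses experience group, order of tree use, and which condition is presented first, with five participants per cell. The labels are placeholders, and the deterministic enumeration stands in for the random assignment the authors describe.

```python
from collections import Counter
from itertools import product

GROUPS = ["no experience", "less experience", "experienced"]
TREE_ORDERS = ["trees first", "trees second"]   # when the trees are used
FIRST_CONDITIONS = ["A", "B"]                   # which vignette set comes first
PER_CELL = 5                                    # participants per cell

participants = [
    {"group": group, "tree_order": tree_order, "first_condition": first}
    for group, tree_order, first in product(GROUPS, TREE_ORDERS, FIRST_CONDITIONS)
    for _ in range(PER_CELL)
]


def used_trees_on_a(participant):
    """True when the set on which trees were used is Condition A."""
    a_is_first = participant["first_condition"] == "A"
    trees_on_first_set = participant["tree_order"] == "trees first"
    return a_is_first == trees_on_first_set


# Count participants by (Condition A presented first?, trees used on A?).
cell_counts = Counter(
    (p["first_condition"] == "A", used_trees_on_a(p)) for p in participants
)
per_group_order = Counter((p["group"], p["tree_order"]) for p in participants)

print(len(participants), dict(cell_counts))
```

Each of the four combinations of Condition A's position and tree use contains 15 participants, and each experience group contributes 10 participants to each order of tree use.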
The participants were allowed to use the DSM-III-R when making their diagnoses. Also,
a 5-item questionnaire about the use of the decision trees was administered to the participants
after they had diagnosed all of the vignettes. Participants responded to these questions by
making ratings on Likert-type scales with one being "not at all" and seven equaling "very much."
Finally, a standard digital stopwatch was used for the timing of the diagnoses.
Procedure. The participants from the different groups were assigned to one of two
Orders of decision-tree use before they arrived to participate in this experiment. Partici-
pants in the first Order used the decision trees on the first five vignettes and did not use
the decision trees on the second five vignettes. Participants in Order 2 did not use the
decision trees on the first five vignettes and did use the decision trees on the second five
vignettes. Each Order of decision-tree use consisted of 10 participants from each expe-
rience group so that of the 20 participants from the experienced group, 10 were in Order 1
and 10 were in Order 2. A partially random assignment procedure was used in that after
10 participants from a particular experience group had been assigned randomly to one of
the two Orders, the rest of the participants in that group were assigned to the other Order.
The participants were informed that they would receive five case vignettes one at a
time and that they were to make an Axis I DSM-III-R diagnosis for each vignette. They
were informed that while they were to make an Axis I diagnosis, they did not need to
make detailed diagnostic specifications (e.g., severity). An example of an Axis I diagno-
sis without specifications was given to the participants. The participants were told that
the Axis I diagnosis was to come from one of 11 categories, all of which were represented
in the decision trees. A listing of these categories was given to the participants. The
participants also were informed that they would be timed on each vignette but that there
was no time limit and they could take as much time as they needed to complete each
vignette. They also were asked to make confidence ratings pertaining to each diagnosis.
The participants then were given five of the case vignettes one at a time and instructed to
make an Axis I diagnosis for each case vignette.
The participants were instructed to use the decision trees or to not use the decision
trees when making their diagnoses, depending upon order. Because some participants
might have been unfamiliar with the decision trees, the experimenter provided two sug-
gestions on how to implement the decision trees when making a diagnosis (see Experi-
ment 1). When using the decision trees the participants could use one of the two methods,
experiment with both methods, or use another preferred method. Finally, the participants
were timed for each vignette starting from the time they were given the vignette and
ending when they had completed their diagnosis (i.e., when they were done writing
their diagnosis).
Upon completion of the first five vignettes, the participants were informed that they
would be given five more case vignettes, again one at a time, and that they were again to
make an Axis I DSM-III-R diagnosis. This time, however, participants in Order 1 were
told not to use the decision trees and participants in Order 2 were told to use the decision
trees on the five vignettes. The participants then were given the other five case vignettes
(again representing the five decision trees) one at a time. They again were instructed to
make an Axis I DSM-III-R diagnosis from one of the 11 categories without using the
decision trees or by using the decision trees, depending on order. The participants again
were timed on each vignette and asked to make confidence ratings. When finished with
these case vignettes, the participants were asked to complete the questionnaire.
Results and Discussion
This study consists of a 3 (experience level) × 2 (decision tree) × 2 (order) split-plot
multivariate analysis of variance (MANOVA) research design with four dependent variables
(specific diagnostic accuracy, class diagnostic accuracy, diagnostic time, and diagnostic
confidence scores). Supplementary analyses were conducted to determine if other
factors (e.g., understanding and method of using the trees) affected diagnostic accuracy.
Specific diagnostic accuracy was defined as making a diagnosis on a vignette that
correctly matched those of the casebook, excluding the diagnostic specifications (e.g.,
severity), and the schizophrenic diagnosis. The schizophrenic disorders are not broken
down according to subtype in the decision trees, thus a diagnosis of schizophrenia with-
out specific type was considered an accurate diagnosis for the vignettes. In addition to
specific diagnostic accuracy, class diagnostic accuracy was also assessed. Of the 20 par-
ticipants in each group, only two of the no-experience and seven of the less-experienced
participants made a specific diagnosis on all of the vignettes. Instead, the majority of the
participants in these two groups made a class diagnosis or a combination of class and
specific diagnosis for the vignettes. For example, rather than making a specific diagnosis
of Major Depression, these participants tended to make a class diagnosis, such as Mood
Disorder. Thus it was necessary to analyze class accuracy as a dependent variable to
ensure more accurate data analysis and results. Class accuracy then was defined as mak-
ing a diagnosis on the vignettes that was in the same class of disorders as the diagnosis in
the casebook. As in Experiment 1, the correct diagnoses for both specific diagnostic
accuracy and class accuracy were summed to yield accuracy scores that ranged between
zero and five for the five vignettes diagnosed with the decision trees and for the five
vignettes diagnosed without the decision trees. Diagnostic time was calculated in minutes
and seconds.
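The two scoring rules can be sketched as follows. The class lookup table and the sample diagnoses are illustrative assumptions, not the casebook's answer key, and the sketch does not implement the schizophrenia-subtype rule described above.

```python
# Hypothetical mapping from a specific diagnosis to its diagnostic class.
# Labels that are not listed are treated as class-level labels already.
DIAGNOSTIC_CLASS = {
    "major depression": "mood disorder",
    "dysthymia": "mood disorder",
    "panic disorder": "anxiety disorder",
    "generalized anxiety disorder": "anxiety disorder",
}


def normalize(label):
    """Lower-case and trim a diagnosis so comparisons ignore formatting."""
    return label.strip().lower()


def to_class(label):
    """Map a diagnosis to its class; unknown labels map to themselves."""
    key = normalize(label)
    return DIAGNOSTIC_CLASS.get(key, key)


def score(responses, answer_key):
    """Return (specific accuracy, class accuracy) for one set of vignettes."""
    pairs = list(zip(responses, answer_key))
    specific = sum(normalize(r) == normalize(k) for r, k in pairs)
    class_level = sum(to_class(r) == to_class(k) for r, k in pairs)
    return specific, class_level


# Hypothetical answer key and one participant's responses for five vignettes.
answer_key = ["major depression", "panic disorder", "schizophrenia",
              "dysthymia", "generalized anxiety disorder"]
responses = ["Mood Disorder", "panic disorder", "schizophrenia",
             "major depression", "adjustment disorder"]

specific_score, class_score = score(responses, answer_key)
print(specific_score, class_score)
```

A class-level response such as "Mood Disorder" earns no credit for specific accuracy but does count toward class accuracy, which is why the two scores can diverge for less-experienced participants.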
Table 2 presents the means and standard deviations for specific accuracy, class
accuracy, diagnostic time, and diagnostic confidence by experience level, trees, and order.
The multivariate test of significance was conducted according to Wilks' Lambda and is
presented in Table 3. Subsequent univariate analyses indicated significant main effects of
experience level for specific diagnostic accuracy, F(2, 54) = 30.08, p = .001, and class
diagnostic accuracy, F(2, 54) = 10.88, p = .001. A follow-up analysis using the Scheffé
procedure showed significant (p < .05) differences in specific diagnostic accuracy between
the experienced group and the less-experienced group and between the experienced group
and the no-experience group, but not between the less-experienced and no-experience
groups. For class diagnostic accuracy, significant differences were found between the
experienced and no-experience groups only. Thus, the experienced group was significantly
more accurate in their specific diagnoses than both of the other two groups; however,
on class diagnostic accuracy, the experienced participants were significantly more
accurate than the no-experience participants only. These results substantiated Lambert
and Wertheimer's (1988) study in that experienced participants made more accurate
diagnoses.
Table 2
Means and Standard Deviations for Experience Level With and Without the Trees
on Diagnostic Accuracy, Time, and Confidence
                                        No Experience           Less Experience         Experienced
                                        Order 1     Order 2     Order 1     Order 2     Order 1     Order 2
Dependent Variable                      M     SD    M     SD    M     SD    M     SD    M     SD    M     SD
Specific accuracy with tree             0.50  0.71  0.20  0.42  0.70  0.95  0.50  0.71  1.90  0.57  2.20  1.30
Specific accuracy without tree          0.50  0.85  0.60  0.84  1.00  1.05  0.70  0.68  2.20  0.79  1.90  0.99
Class accuracy with tree                2.10  1.29  2.40  1.17  2.20  0.79  3.10  0.99  2.50  0.85  3.60  0.84
Class accuracy without tree             1.60  1.43  1.60  1.27  2.80  1.62  1.80  1.03  3.20  0.92  2.90  0.88
Time with tree (per vignette)           6.27  2.14  5.95  1.23  7.45  2.54  5.11  1.48  7.03  2.23  5.94  0.99
Time without tree (per vignette)        4.91  1.83  7.20  2.31  5.99  1.37  5.77  2.02  5.68  1.36  5.81  1.14
Confidence with tree (per vignette)     4.24  0.84  5.04  0.84  3.92  0.68  3.96  0.91  5.54  0.71  4.74  0.72
Confidence without tree (per vignette)  3.78  1.01  4.60  1.36  4.20  1.22  3.56  0.95  5.01  0.70  4.84  0.90
Note. Order 1 = Trees used only on first 5 vignettes. Order 2 = Trees used only on second 5 vignettes. Range of scores for
each dependent variable for each group can be determined from the raw data.
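As a consistency check on the multivariate test reported in Table 3, the df column can be reproduced with a short sketch. It assumes, which the article does not state, that the approximate F is the usual F approximation to Wilks' Lambda, whose numerator degrees of freedom equal the number of dependent variables multiplied by the degrees of freedom of the effect. The names `FACTOR_LEVELS`, `numerator_df`, and `TABLE_3_DF` are illustrative only.

```python
from math import prod

# Four dependent variables: specific accuracy, class accuracy, time, confidence.
N_DEPENDENT_VARIABLES = 4

# Levels per factor in the 3 x 2 x 2 design.
FACTOR_LEVELS = {"Experience": 3, "Trees": 2, "Order": 2}


def numerator_df(*factors: str) -> int:
    """Numerator df under the assumed approximation: (number of DVs) x (effect df)."""
    effect_df = prod(FACTOR_LEVELS[name] - 1 for name in factors)
    return N_DEPENDENT_VARIABLES * effect_df


# df values as reported in Table 3.
TABLE_3_DF = {
    ("Experience",): 8,
    ("Trees",): 4,
    ("Order",): 4,
    ("Experience", "Trees"): 8,
    ("Experience", "Order"): 8,
    ("Trees", "Order"): 4,
    ("Experience", "Trees", "Order"): 8,
}

mismatches = {
    effect: (reported, numerator_df(*effect))
    for effect, reported in TABLE_3_DF.items()
    if reported != numerator_df(*effect)
}
print(mismatches)  # {} : every reported df matches the assumed rule
```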
In addition to the main effect of experience level, a significant Trees × Order
interaction, F(1, 54) = 8.06, p = .006, was found for class diagnostic accuracy (see Fig. 2). In
this interaction, t-tests indicated that when using decision trees, the participants were
significantly more accurate using the trees when they were accompanied by some practice
(i.e., using the trees on the second five vignettes rather than on the first five), t =
2.85, p = .005; however, when not using the trees, practice did not significantly affect
class diagnostic accuracy. In addition, on the first five vignettes, there was no significant
difference in class diagnostic accuracy between using the trees and not using the trees;
however, class diagnostic accuracy was significantly higher when using the trees on the
second five vignettes than when not using the trees on the first five vignettes, t = 3.382,
p < .001. This Tree × Order interaction for class diagnostic accuracy showed that participants
were most accurate if they used the trees on the second five vignettes. Thus, as indicated
by the interaction, the combination of practice and the decision trees led to the most
accurate class diagnoses; neither the trees alone nor practice alone was sufficient to
improve class diagnostic accuracy.
Table 3
Multivariate Analysis of Variance
(Experience Level × Decision Tree × Order)
Source                       df   Approximate F   Exact p
Experience                    8        9.21        0.001
Trees                         4        5.24        0.001
Order                         4        0.62        0.650
Experience × Trees            8        0.76        0.640
Experience × Order            8        1.49        0.169
Tree × Order                  4        6.68        0.001
Experience × Tree × Order     8        1.68        0.113
Figure 2. Class diagnostic accuracy when using the decision trees by order. For Order 1, when the decision
trees were used, it was on the first five vignettes; thus no decision trees were used on the second five
vignettes. For Order 2, when the decision trees were used, it was on the second five vignettes; thus no
trees were used on the first five vignettes.
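The pattern behind the Tree × Order interaction for class accuracy can be recovered from the cell means in Table 2. The sketch below averages the three experience groups within each tree-by-order cell; because every cell holds 10 participants, an unweighted average of the group means equals the cell mean across all participants. The variable names are illustrative, and the output is descriptive only (it does not reproduce the reported t-tests).

```python
from statistics import mean

# Class-accuracy cell means from Table 2, listed as
# [no experience, less experience, experienced].
class_accuracy = {
    ("with tree", "Order 1"): [2.10, 2.20, 2.50],     # trees on first five vignettes
    ("with tree", "Order 2"): [2.40, 3.10, 3.60],     # trees on second five vignettes
    ("without tree", "Order 1"): [1.60, 2.80, 3.20],  # no trees on second five vignettes
    ("without tree", "Order 2"): [1.60, 1.80, 2.90],  # no trees on first five vignettes
}

# Equal cell sizes (n = 10), so a simple mean of group means is appropriate.
cell_means = {cell: round(mean(values), 2) for cell, values in class_accuracy.items()}

for (trees, order), value in cell_means.items():
    print(f"{trees:12s} {order}: {value:.2f}")
```

Under these figures, the trees-on-second-five cell (3.03) is the highest, and the other three cells fall between 2.10 and 2.53, consistent with the description that the trees helped only when combined with practice.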
There were no other significant univariate main effects or interaction effects for
specific diagnostic accuracy or class diagnostic accuracy. Analysis of the diagnostic
time data revealed a significant Tree × Order interaction, F(1, 54) = 17.06, p = .001, which
is shown in Figure 3. t-Tests showed that participants who used the trees on the first five
vignettes were significantly slower on these vignettes than when not using the trees on
the second five vignettes, t = 3.898, p < .001. There were no significant differences
between participants who did not use the trees on the first five vignettes and participants
who did use the trees on the second five vignettes. In addition, when using the trees,
participants took significantly less time to make their diagnosis if they used the trees on
the second five vignettes rather than on the first five vignettes, t = 2.63, p < .02.
However, when not using the trees, there were no significant differences between the two
orders (indicating that practice did not affect diagnostic time when not using the trees).
This interaction indicates that when using the decision trees, participants took less time to
make a diagnosis if they had some practice, and when not using the decision trees,
practice made no significant difference in diagnostic time. However, participants took less
time to make a diagnosis when they were not using the trees but did have some practice
than when using the trees but without practice. There were no other significant effects for
diagnostic time.
Figure 3. Diagnostic time when using the decision trees by order. For Order 1, when the decision trees
were used, it was on the first five vignettes; thus no decision trees were used on the second five vignettes.
For Order 2, when the decision trees were used, it was on the second five vignettes; thus no trees were used
on the first five vignettes.
Finally, univariate analyses performed on the diagnostic confidence data showed
significant main effects for Experience Level, F(2, 54) = 8.18, p = .001, and Trees,
F(1, 54) = 4.92, p = .031. Specific comparisons were made using the Scheffé procedure,
and results indicated that the experienced group was significantly (p < .05) more
confident in their diagnosis than were the less-experienced participants. However, they were
not significantly more confident than the no-experience participants. Also, the
less-experienced and no-experience participants did not differ significantly in their diagnostic
confidence. The main effect for the Trees variable indicates that when the participants
used the decision trees, they were more confident in their diagnosis than when they did
not use the decision trees. A significant Experience Level × Order interaction, F(2, 54) =
3.49, p = .038, and a significant Experience Level × Trees × Condition interaction,
F(2, 54) = 4.01, p = .024, were found. These interactions were not significant in the
multivariate analysis; therefore, less weight can be given to the results of the univariate
analyses. Nevertheless, these results indicate that the decision trees have a significant
effect on diagnostic confidence. When the participants used the decision trees they were
more confident in their diagnosis than when they did not use the decision trees. One
possible explanation for the increased confidence is that the trees elicited more effective
decision-making strategies. The participants may have detected the difference in their
diagnosing strategy and felt more confident with the aid of the trees. Results also
indicated that the experienced participants were significantly more confident in their diagnosis
than the less-experienced participants.
Supplementary analyses were conducted to examine the participants' opinions across
several variables concerning the use of the decision trees. These variables included: (a)
whether or not participants liked using the decision trees, (b) the method participants
employed when using the decision trees (Method 1 or Method 2, as described previously),
(c) whether or not the participants understood how to use the decision trees, (d)
whether or not the participants thought using the decision trees helped in their diagnostic
decision, and (e) whether the participants had previously memorized the decision trees. Table 4
presents the frequency of responses on a 7-point Likert-type scale for these variables.
Analyses for these variables consisted of comparing the sum of response 1 (not at all)
through response 3 (a little) with the sum of response 5 (somewhat) through response 7
(very much). This split provided a comparison of those participants who responded more
negatively to a particular question with those who responded more positively. For questions
pertaining to whether the participants liked using the decision trees and whether they
understood how to use the decision trees, the analyses consisted of one-way ANOVA designs
for the specific and class-accuracy data when using the decision trees.
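The negative-versus-positive split described above can be sketched using the response frequencies reported in Table 4. The sketch assumes that the scale midpoint (response 4) is left out of both groups, which the text implies but does not state outright; the function and variable names are illustrative.

```python
from typing import Dict, List, Tuple

# Response frequencies for scale points 1 through 7, as reported in Table 4.
likert_counts: Dict[str, List[int]] = {
    "liked using the trees": [1, 11, 8, 5, 15, 13, 6],
    "understood how to use the trees": [1, 2, 13, 4, 23, 9, 7],
    "thought the trees helped": [0, 0, 3, 1, 20, 19, 16],
}


def split_responses(counts: List[int]) -> Tuple[int, int, int]:
    """Return (negative, midpoint, positive) totals for a 7-point item."""
    negative = sum(counts[0:3])  # responses 1-3
    midpoint = counts[3]         # response 4, assumed excluded from the comparison
    positive = sum(counts[4:7])  # responses 5-7
    return negative, midpoint, positive


splits = {question: split_responses(counts) for question, counts in likert_counts.items()}

for question, (negative, midpoint, positive) in splits.items():
    print(f"{question}: {negative} negative, {midpoint} midpoint, {positive} positive")
```

The two resulting groups for each question would then serve as the levels of the single factor in the one-way ANOVAs on accuracy when using the trees.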
The first one-way ANOVA for whether or not the participants liked using the
decision trees resulted in no significant differences for specific diagnostic accuracy,
F(1, 53) = 3.43, p = .07, or class diagnostic accuracy, F(1, 53) = .82, p = .37. The second
ANOVA, whether or not the participants thought the decision trees helped in their
diagnostic decision-making, resulted in no significant differences for specific diagnostic
accuracy, F(1, 54) = 1.97, p = .17, or class diagnostic accuracy, F(1, 54) = .47, p = .50. Thus,
whether or not the participants liked using the decision trees or understood how to use the
decision trees did not significantly affect their diagnostic accuracy when using the trees.
This result indicates that the reported differences in regard to the trees are the result of the
use of the trees, not of whether the participants liked using the trees or understood how to
use the trees.
Of the 60 participants in this study, only three responded in the lower end of the
Likert scale on whether or not they thought the decision trees helped in their diagnostic
accuracy. Thus, the great majority of participants perceived the decision trees to be at
least somewhat helpful. Similarly, 58 of the 60 participants responded that they did not
have the decision trees memorized prior to this study and that they used the trees
essentially as a check of their usual method of diagnosing or as a check of an initial diagnosis.
Table 4
Frequency for Response on Decision-Tree Use
Question                                                 1    2    3    4    5    6    7
Did you like using the trees?                            1   11    8    5   15   13    6
Did you understand how to use the trees?                 1    2   13    4   23    9    7
Do you think the trees helped?                           0    0    3    1   20   19   16
Prior to this study had you memorized the trees?        58    1    1    0    0    0    0
What method of using the trees did you use (1, 2, 3)?   58    1    1  N/A  N/A  N/A  N/A
Some limitations with regard to the present study need to be addressed. First, this
study did not adequately assess specific diagnostic accuracy. As indicated above, those
participants with less DSM-III-R experience tended to make class diagnoses on some or
all of the vignettes, thus possibly distorting the accuracy of the specific diagnostic results.
In addition, the no-experience participants were undergraduate psychology majors; thus,
their representativeness as DSM users, and the generalizability of their results, is limited.
Another possible limitation of this study is subject familiarity with the case vignettes.
While an attempt was made to exclude those vignettes that may have been utilized
previously in academic coursework, subjects were not asked about their familiarity with the
case vignettes. This poses a potential limitation to the internal validity of the study, as
subjects' diagnostic accuracy may have been affected by familiarity with any of the case
vignettes. Finally, to simplify diagnostic comparisons, only those diagnoses that are
represented by the decision trees (i.e., 11 diagnostic classes) were selected for inclusion in
this study; however, this experimental control may limit the generalizability of the study.
In spite of these limitations, the findings of this study suggest that the decision trees
warrant consideration as a diagnostic tool for those clinicians with limited diagnostic
experience. As indicated in the results, the decision trees did have a significant effect on
class diagnostic accuracy and time depending on the order in which the trees were used.
Like the initial experiment, these results indicated that if the decision trees and some
practice are provided, the result will be improved diagnostic accuracy and decreased
diagnostic time. While the results from the two experiments are similar, the latter exper-
iment expanded the initial experiment by including diagnostic time. This experiment also
used actual experience-level differences whereas the first experiment included experience-
level differences found in post-hoc analysis. The present experiment used 60 participants
whereas the first experiment used only 15 participants. In addition, the present study had
equal sample sizes for the order of using the trees. That is, 30 participants (10 from each
group) used the decision trees on the first five vignettes and 30 used the trees on the
second five vignettes. In the first experiment all participants used the decision trees on
the second five vignettes, thus leaving open the possibility of practice effects. Thus, the
first experiment had several limitations (possibility of practice effects, groups selected on
post-hoc analysis) for which the latter experiment accounted.
From these findings, it is evident that the trees facilitate a more accurate class diag-
nosis as well as decreased diagnostic time when at least minimal practice is employed.
This finding suggests that teaching of the trees alone is not sufficient, but if the trees are
taught and practice is made available, then participants will tend to make a more accurate
class diagnosis and decrease their diagnostic time.
General Discussion
The results of these studies parallel Elstein et al.'s (1978) results with medical
students in deemphasizing an overreliance on diagnostic intuition or insight and stressing
the inclusion of a systematized guide for determining diagnosis. More specifically, the
two experiments presented here demonstrate that the DSM decision trees can be a useful
tool, especially for practitioners with minimal DSM experience. It was demonstrated that
for those clinicians with less experience, the decision trees, when combined with minimal
practice, facilitated more accurate diagnoses and also decreased participants' diagnostic
time. Thus, Millon's (1983) initial assessment of the trees as unnecessary and an
encumbrance was premature. These results suggest that neophyte clinicians may benefit from
instruction using the trees and that with minimal practice the trees can be a useful tool in
the neophyte's clinical repertoire. Finally, this study provides support for the inclusion of
the decision trees in the DSM classification system. While these experiments used the
DSM-III-R, it does not seem unreasonable to suggest that similar findings would be
obtained with the DSM-IV.
While the results presented here provide preliminary evidence for the usefulness of
the decision trees, further studies are warranted. Future studies need to account for the
limitations of these studies and need to investigate how the trees may be implemented
more effectively as a tool for clinicians in the real world. First, a similar study needs to
incorporate a more representative sample of participants' experience and training
levels (e.g., doctoral-level practitioners, master's degree-level clinicians, advanced
graduate students, and intermediate graduate students). Additionally, future studies should
replicate these studies with professionals from other disciplines who engage in diagnostic
decision-making (e.g., social workers and psychiatrists). In addition, further studies need
to incorporate the full range of diagnoses and not limit the diagnostic options to only
those covered by the decision trees. Such studies would increase the generalizability
of the results to real-world practitioners.
In conclusion, while these studies provide evidence for the utility of the DSM deci-
sion trees, a caution is necessary. While inexperienced participants were used in this
study, and the use of the decision trees with some practice on vignettes appeared to
facilitate a more accurate class diagnosis as well as decreased diagnostic time, the use of
the decision trees should not be construed as a replacement for experience or clinical
intuition; rather, the decision trees may serve as an adjunct to the diagnostic decision-
making process.
References
Adams, H.E., & Cassidy, J.F. (1993). The classification of abnormal behavior. In P.B. Sutker &
H.E. Adams (Eds.), Comprehensive handbook of psychopathology (pp. 3-25). New York:
Plenum Press.
American Psychiatric Association. (1952). Diagnostic and statistical manual of mental disorders.
Washington, D.C.: Author.
American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders
(3rd ed.). Washington, D.C.: Author.
American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders
(3rd ed., rev.). Washington, D.C.: Author.
American Psychiatric Association. (1989). Treatments of psychiatric disorders. Washington, D.C.:
Author.
Ash, P. (1949). The reliability of psychiatric diagnoses. Journal of Abnormal and Social
Psychology, 44, 272-276.
Carson, R.C., Butcher, J.N., & Coleman, J.C. (1988). Abnormal psychology and modern life (8th
ed.). Glenview, IL: Scott, Foresman and Company.
Elstein, A.S., Shulman, L.S., & Sprafka, S.A. (1978). Medical problem solving: An analysis of
clinical reasoning. Cambridge, MA: Harvard University Press.
Lambert, L.E., & Wertheimer, M. (1988). Is diagnostic ability related to relevant training and
experience? Professional Psychology: Research and Practice, 19, 50-52.
LeLaurin, K. (1990). Judgment-based assessment: Making the implicit explicit. TECSE, 10, 96-110.
Malt, U.F. (1986). Teaching DSM-III to clinicians. Acta Psychiatrica Scandinavica Supplementum,
73, 68-75.
Mehlman, B. (1952). The reliability of psychiatric diagnoses. Journal of Abnormal and Social
Psychology, 47, 577-578.
Millon, T. (1983). The DSM-III: An insider's perspective. American Psychologist, 38, 804-814.
Olshavsky, R.W. (1979). Task complexity and contingent processing in decision making: A
replication and extension. Organizational Behavior and Human Performance, 24, 300-316.
Payne, J.W. (1976). Task complexity and contingent processing in decision making: An information
search and protocol analysis. Organizational Behavior and Human Performance, 16, 366-387.
Reid, W.H., & Wise, M.G. (1989). DSM-III-R training guide: For use with the American
Psychiatric Association's diagnostic and statistical manual of mental disorders (3rd ed. rev.). New
York: Brunner/Mazel.
Spitzer, R.L., & Fleiss, J.L. (1974). A re-analysis of the reliability of psychiatric diagnosis. British
Journal of Psychiatry, 125, 341-347.
Spitzer, R.L., Forman, J.B.W., & Nee, J. (1979). DSM-III field trials: I. Initial interrater diagnostic
reliability. American Journal of Psychiatry, 136, 815-817.
Spitzer, R.L., Gibbon, M., Skodol, A.E., Williams, J.B.W., & First, M.B. (1989). Diagnostic and
statistical manual of mental disorders casebook (rev. ed.). Washington, D.C.: American
Psychiatric Press.
Stout, C. (1991). A methodological approach to differential diagnostics. In K.N. Anchor (Ed.), The
handbook of medical psychotherapy: Cost effective strategies in mental health (p. 141). Lewiston,
NY: Hogrefe and Huber.
Timmermans, D., & Vlek, C. (1992). Multi-attribute decision support and complexity: An
evaluation and process analysis of aided versus unaided decision making. Acta Psychologica, 80,
49-65.
Vaglum, P., Friis, S., Vaglum, S., & Larsen, F. (1989). Comparison between personality disorder
diagnoses in DSM-III and DSM-III-R: Reliability, diagnostic overlap, predictive validity.
Psychopathology, 22, 309-314.
Ward, C.H., Beck, A.T., Mendelson, M., Mock, J.E., & Erbaugh, J.K. (1962). The psychiatric
nomenclature. Archives of General Psychiatry, 7, 60-67.
Webb, L.J., Gold, R.S., Jonstone, E.E., & Diclemente, C.C. (1981). Accuracy of DSM-III diagnoses
following a training program. American Journal of Psychiatry, 138, 376-378.
