
This article was downloaded by: [Illinois State University Milner Library]

On: 29 November 2012, At: 16:36


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH,
UK

Journal of Personality
Assessment
Publication details, including instructions for
authors and subscription information:
http://www.tandfonline.com/loi/hjpa20

Should Human Figure Drawings Be Admitted Into Court?
Stephen J. Lally
Version of record first published: 10 Jun 2010.

To cite this article: Stephen J. Lally (2001): Should Human Figure Drawings Be
Admitted Into Court?, Journal of Personality Assessment, 76:1, 135-149
To link to this article: http://dx.doi.org/10.1207/S15327752JPA7601_8




Downloaded by [Illinois State University Milner Library] at 16:36 29 November 2012

JOURNAL OF PERSONALITY ASSESSMENT, 76(1), 135–149
Copyright © 2001, Lawrence Erlbaum Associates, Inc.

Should Human Figure Drawings Be Admitted Into Court?
Stephen J. Lally
American School of Professional Psychology
Arlington, Virginia

In recent years, there has been debate about the validity of figure drawings, although
surveys of clinicians in both general and forensic practice still find them to be one of
the most widely used tests of personality functioning. Using both Heilbrun's (1992)
guidelines for the use of psychological tests in a forensic evaluation and the U.S. Supreme Court's Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993) criteria for the
admission of scientific evidence, I examine the admissibility of human figure drawings in court. The results suggest that the most commonly used methods for interpreting human figure drawings fall short of meeting the standards for admissibility. The
use of overall rating scales, although weak in validity, appears to minimally meet these
standards.

The formal use of human figure drawings to assess personality function dates back
more than 50 years (Machover, 1949). Although in recent years there has been debate about their validity (Dumont & Smith, 1996; Hammer, 1969, 1996; Joiner &
Schmidt, 1997; Joiner, Schmidt, & Barnett, 1996; Motta, Little, & Tobin, 1993b;
Riethmiller & Handler, 1997; Safran, 1996; Smith & Dumont, 1995; Waehler,
1997; Wanderer, 1969), surveys of clinicians in general practice (Lubin, Larsen, &
Matarazzo, 1984; Piotrowski, 1984; Watkins, Campbell, Nieberding, & Hallmark,
1995) have found human figure drawings to be one of the most widely used tests of
personality functioning. Similarly, surveys of tests used in forensic practice
(Ackerman & Ackerman, 1997; Borum & Grisso, 1995; Naar, 1961; Pinkerman,
Haynes, & Kaiser, 1993) also cite human figure drawings as one of the most frequently used tests. In fact, two of the most widely used child custody instruments,
the Bricklin Perceptual Scales (Bricklin, 1984) and the Ackerman–Schoendorf
Scales for Parent Evaluation of Custody (Ackerman & Schoendorf, 1992), incorporate human figure drawings in their assessment procedures. Within psychology,
there has recently been discussion (Heilbrun, 1992) about the minimum standards
for selecting psychological tests in forensic assessments. Paralleling this move, the
courts have begun to articulate criteria for the admission of both scientific evidence
and evidence based on technical and other specialized knowledge (Daubert v.
Merrell Dow Pharmaceuticals, Inc., 1993; General Electric v. Joiner, 1997;
Kumho Tire v. Carmichael, 1999). With the use of Heilbrun's guidelines and the
courts' criteria, it is possible to address the question of whether human figure drawings should be admitted into court.
Heilbrun (1992), a clinical psychologist, published guidelines for determining
whether a given psychological test should be used in a forensic evaluation. These
seven guidelines cover both the characteristics of a given test and the particular legal context to which it is applied. The first guideline requires that the test be commercially available and adequately documented, both in its manual and in a
recognized publication that reviews psychological tests (e.g., Mental Measurements Yearbook). This requirement permits opposing counsel to examine and
evaluate the bases of the expert's opinions. Next, the test is required to have adequate reliability. Heilbrun argued for a reliability coefficient of at least .80 because
a lower reliability would lead to unacceptably low validity. The third guideline requires that the test be relevant to the legal issue or to a psychological construct that
is relevant to the legal issue. It is further suggested that this relevance be supported
by validated research.
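The link between reliability and validity in the second guideline can be made concrete with the classical test theory bound, under which an observed validity coefficient cannot exceed the square root of the product of the test's and the criterion's reliabilities. The following is a minimal sketch of that bound, not a reconstruction of Heilbrun's own argument; the function name `max_validity` and the sample reliability values are illustrative assumptions.

```python
import math


def max_validity(test_reliability: float, criterion_reliability: float = 1.0) -> float:
    """Ceiling on the observed test-criterion correlation under classical test theory.

    Assumption: the standard attenuation bound r_xy <= sqrt(r_xx * r_yy).
    The default treats the criterion as perfectly reliable (best case).
    """
    for value in (test_reliability, criterion_reliability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return math.sqrt(test_reliability * criterion_reliability)


if __name__ == "__main__":
    # Illustrative reliability values only; .80 is the threshold named in the text.
    for reliability in (0.50, 0.70, 0.80, 0.95):
        ceiling = max_validity(reliability)
        print(f"reliability {reliability:.2f} -> validity ceiling {ceiling:.2f}")
```

Under this bound, even a test at the .80 level could correlate no higher than about .89 with a perfectly reliable criterion, and the ceiling drops further as reliability falls.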
Heilbrun's (1992) fourth guideline requires that the test have a standard method
of administration and that the examiner follow the instructions and conditions of
administration used when the norms were developed. The applicability of the test
to the population and the purpose for which it is used make up the fifth guideline.
In other words, an intelligence test would not be used to measure the presence of
psychopathology. There should ideally be a close fit between the validation research and the individual being evaluated. Heilbrun argued in his sixth guideline
that objective and actuarial tests are preferable. He based this argument on the general superiority in the research literature of objective methods of data collection
and actuarial methods of data combination over clinical methods for these tasks.
Finally, Heilbrun wrote that response style needs to be explicitly assessed by the
test; in other words, there must be a way to detect whether someone is malingering
pathology or feigning health. In short, the key points of Heilbrun's argument are
that for a psychological test to be used for a forensic issue, it must have a solid scientific foundation, be commercially available, be able to detect malingering, and
be appropriate for answering the legally relevant question.
The courts have moved in a similar direction in terms of requiring expert testimony to be based on scientific method. Specifically, the U.S. Supreme Court, in its
triptych of rulings, Daubert (1993), General Electric (1997), and Kumho Tire
(1999), established a two-pronged test to determine whether an expert's testimony can be admitted. Testimony must be both scientifically valid and relevant.
The court articulated criteria that the trial judge should use to decide whether testimony is based on junk or genuine science; in other words, whether the reasoning
and methodology underlying the testimony are scientifically valid. The four criteria are that (a) the theory or technique can be and has been tested (falsifiability); (b)
the theory or technique has been subjected to peer review and published in professional journals; (c) there is general acceptance of the theory or technique in the scientific community; and (d) a fourth, two-part criterion: the theory or technique has
a known error rate, and there exist standards to control the operation of the technique. These four criteria were not meant to be exhaustive, and the court did not
state that testimony had to meet all four elements.
Although the standard for admission is relatively low, it is clear that the court
is trying to get trial judges to screen out testimony that is based on unreliable or
speculative sources. This conclusion was reaffirmed on February 22, 2000, in
Weisgram v. Marley Co. In this case, the court wrote, "since Daubert parties
relying on expert evidence have had notice of the exacting standards of reliability such evidence must meet" (p. 1021). The justices went on to note that "it is
implausible to suggest, post-Daubert, that parties will initially present less than
their best expert evidence in the expectation of a second chance should their first
try fail" (p. 1021). Although this ruling is binding only in federal court, it is
likely that most state courts will follow this move to require that testimony be
based on scientifically valid evidence.
Before the application of these guidelines and criteria to human figure drawings, it is first necessary to define the test. Even limiting human figure drawings
to those techniques designed to assess personality functioning and ignoring
Goodenough's (1926) and others' work that has used drawings to estimate intellectual functioning, human figure drawings cover a wide universe. The most
widely known members are the Draw-A-Person, House–Tree–Person (HTP),
Draw-A-Family, and Kinetic Family Drawing. The slightly less popular members include Kinetic House–Tree–Person, Draw-A-Person in the Rain, and
Draw-A-School. Further complicating the matter is that the instructions and requirements for identically named tests can vary from researcher to researcher
and from examiner to examiner.
For this article, I have divided human figure drawings into three groups, largely
on the basis of the method used to interpret them. This division is not original, and
others have suggested similar variants (Groth-Marnat, 1997; Naglieri, McNeish, &
Bardos, 1991; Scribner & Handler, 1987; Shaffer, Duszynski, & Thomas, 1984).
The first method consists of clinicians and researchers using their global impressions to arrive at conclusions about the artist's personality characteristics and level
of pathology. This is arguably the most widely used method and involves little or no
formal scoring. Instead, clinicians rely on their phenomenological experience of
the drawing, affective or visceral reactions to it, and relatively loosely reined impressions and associations (Scribner & Handler, 1987, p. 112) to arrive at their conclusions. The second method attempts to link single signs with specific aspects of
personality or specific diagnosis. This method originates with Machover's (1949)
writing, and many of the current texts (e.g., Buck, 1992; Burns, 1987; Kaufman &
Wohl, 1992) on human figure drawings provide a kind of glossary to assist with this
method of interpretation. The final method does not attempt to link particular signs
with a specific interpretation but instead focuses on the frequency of indicators of pathology in a drawing. By comparing the number of such items in a drawing with normative information about the expected frequency, conclusions can be drawn about
the presence of maladjustment. This overall rating method originates with Koppitz's
(1968) work, and two of its current representatives are the Draw A Person: Screening
Procedure for Emotional Disturbance (DAP:SPED; Naglieri et al., 1991) and Van
Hutton's scoring system for the HTP and Draw-A-Person (Van Hutton, 1994).

HEILBRUN'S (1992) GUIDELINES

I apply Heilbrun's (1992) seven guidelines to this three-way division of human figure drawings. I evaluate each scoring method to determine whether it meets the
guidelines (see Table 1).

TABLE 1
Heilbrun's (1992) Guidelines Applied to the Three Human Figure Drawing Interpretive Methods

Columns: Guideline; Global Impressions; Specific Signs; Overall Ratings

1. Commercially available, with manual and independent review
2. Reliability of .80
3. Relevant to the legal issue or an underlying psychological construct relevant to the issue
4. Standard method of administration
5. Applicable to the population and purpose used
6. Objective test and actuarial method
7. Measures response style

[The rating symbols for the individual cells are not recoverable from this copy.]

Note. – = method does not meet criterion; ± = equivocal whether method does meet criterion; + = method does meet criterion.

1. The Test Should Be Commercially Available, With a
Manual and at Least One Independent Review
There have been a number of manuals (e.g., Buck, 1992; Burns, 1987), books (e.g.,
Hammer, 1958), and articles (e.g., Handler & Riethmiller, 1998) that describe the
global-impressionistic manner of scoring human figure drawings. Often, these
sources will both discuss the global-impressionistic method and include a catalog
of specific signs and associated interpretations. It is not surprising that the manuals
are often somewhat vague when it comes to describing the impressionistic scoring
method and are very detailed in articulating the specific-signs interpretation
method. Similarly, the independent reviews (Chase, 1984; Cundick, 1989; Ellis,
1953; Gersten, 1978; Harriman, 1953; Harris, 1978; Haworth, 1965; Killian, 1984;
Kitay, 1965; Krugman, 1949; Rosen, 1953; Stewart, 1953; Weinberg, 1989;
Wilcox, 1949) have tended to examine both methods. The reviews have almost uniformly been negative not only about the adequacy of the test but also about the test
manuals. This is true not only of the early reviews, for example, Buck's original
manual for the HTP is certainly one of the worst horrors ever perpetrated in the
field of clinical psychology (Ellis, 1953, p. 178), but also of more recent ones, for
example:
Based on the evidence reviewed herein, there is not a single scientific reason for using
the DAS [Draw-A-Story] ... The DAS along with virtually all projective instruments,
belongs in a museum chronicling the history of simple-minded assessment practices
along with mood rings, phrenology, tarot cards, and horoscopes. (Witt & Gresham,
1998, p. 379)

Although the manuals are flawed, this first guideline is, to a limited degree, met by
the first two interpretative methods. The overall rating method, as exemplified by
the DAP:SPED (Naglieri et al., 1991) and Van Hutton's scoring system (Van
Hutton, 1994), does have commercially available manuals and has been independently reviewed (Cosden, 1995; Dowd, 1998; Knoff, 1998; Morrison, 1995). The
adequacy of the manuals and of the tests themselves has also been criticized, but the
criticisms are less global and severe than they are for the other two methods.

2. The Test Should Have Adequate (.80) Reliability


Heilbrun (1992) set a reliability level of at least .80 for both interrater reliability and
test–retest reliability. He argued that reliability levels should be published in both
the test manual and the independent reviews. It is difficult to measure the reliability
of the global-impressionistic method because by its very nature it is subjective and
heavily influenced by the personality pattern of the interpreter. The degree to which
interpreters have an affiliative personality style (Scribner & Handler, 1987) or
are empathic and so able to allow themselves to become the person drawn (Burley & Handler, 1997, p. 371) has been linked by proponents of this method with
greater accuracy. Although it is claimed that this method is able to achieve a high
level of reliability (Handler & Riethmiller, 1998, p. 269), this is achieved only
when the raters are given explicit criteria (Tharinger & Stark, 1990). Yuma's
(1990) study, in which the raters appeared not to be given explicit criteria, resulted
in interrater correlations below Heilbrun's level. These lower levels of reliability
are likely more representative of this approach in clinical practice. This conclusion
is supported by comments made by Hammer (1997), a major proponent of the impressionistic method:
Personality drawing interpretation is revealing itself to be more of an Art than a Science ... half of the psychologists are adept at this art, and about half are not. ... The research findings may be more dependent on the psychologist's personality than on the
drawings, per se. (p. 377)

The data on the reliability of the specific-signs approach are not good (Groth-Marnat,
1997). Swenson's (1968) review of the early literature found low interrater reliability (.23–.52) and test–retest reliability (.21–.85). He concluded that structural and
content variables have reliabilities that are probably too low for making reasonably
reliable clinical judgments (p. 40). Adler (1970) also found low interrater reliability
for specific signs (.16–.92), with only 12 of 32 specific signs reaching Heilbrun's
(1992) suggested mark. A more recent review of the research (Kahill, 1984) found
interrater reliabilities for the most part over .80, and test–retest reliability in one specific-signs
study of .81. Kahill hypothesized that the improved reliability reflected researchers' increased motivation to objectify their ratings and adequately train their raters.
The various overall ratings have generally been seen as having good interrater
and test–retest reliability (Kahill, 1984; Swenson, 1968). The DAP:SPED, a specific overall rating scoring system, was reported (Naglieri et al., 1991) to have an
interrater reliability of .84 and a test–retest reliability of .67 over a 1-week time period. These reliability scores partially meet Heilbrun's (1992) standard, although it
is notable that some researchers (Bruening, Wagner, & Johnson, 1997) have suggested that because the majority of items on the DAP:SPED are scored infrequently, the method suggested by the test authors to calculate interrater reliability
might yield a somewhat inflated estimate. Van Hutton's (1994) scoring method is another specific overall rating. The manual (Van Hutton, 1994) reports that three of its four
scales have interrater reliability at or above .95, and the fourth has a reliability of
.70. The manual was criticized (Knoff, 1998) for not providing any information
about test–retest or internal reliability and for relying on only two raters to provide
the estimate of interrater reliability.
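The concern that infrequently scored items can inflate an agreement figure can be illustrated in general terms. The sketch below compares raw percent agreement with Cohen's kappa, a chance-corrected index, for a single rarely scored item. It does not reproduce the DAP:SPED authors' procedure or Bruening et al.'s analysis; the two raters' scores are hypothetical data invented for the example, and the function names are assumptions.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of drawings on which two raters gave the same 0/1 score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring a binary (0/1) item."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    p_a = sum(rater_a) / n  # proportion scored "present" by rater A
    p_b = sum(rater_b) / n  # proportion scored "present" by rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # both raters gave one identical constant score
    return (observed - expected) / (1 - expected)


# Hypothetical data: 100 drawings, one item that each rater scores as
# present only 4 times, with the raters overlapping on 2 of those.
RATER_A = [1, 1, 1, 1] + [0] * 96
RATER_B = [1, 1, 0, 0, 1, 1] + [0] * 94

if __name__ == "__main__":
    print(f"percent agreement: {percent_agreement(RATER_A, RATER_B):.2f}")
    print(f"Cohen's kappa:     {cohens_kappa(RATER_A, RATER_B):.2f}")
```

In this invented case the raters agree on 96% of drawings, yet kappa is only about .48, because nearly all of the agreement comes from both raters marking the rare item as absent.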
3. The Test Should Be Relevant to the Legal Issue or to a
Psychological Construct Underlying the Legal Issue
None of the scoring methods purport to explicitly measure a legal construct; however, they all claim to measure psychological constructs (e.g., degree of pathology,
the presence of sexual abuse) that are potentially relevant to legal issues. The problem is that it is not clear if they actually measure these constructs.

Kahill (1984) cited a number of attempts made with the global-impressionistic
method to have judges sort figure drawings into normal versus pathological categories. He concluded that the judges' efforts were largely unsuccessful and, even
when judges could differentiate between normal and pathological drawings, that
their improvement over chance was so slight as to be meaningless. Part of the difficulty appears to result from the fact that drawing ability and quality are confounding variables (Groth-Marnat & Roberts, 1998; Kahill, 1984; Swenson, 1968). In
addition, others (Chapman & Chapman, 1967; Gresham, 1993) have argued that illusory correlations, false relations based on verbal associations rather than actual
correlations, may be confounding the results. To the degree that studies ground the
impressionistic method in articulated criteria, the validity appears to improve.
Still, at best, only general conclusions about the presence or absence of pathology
can be made (Kahill, 1984; Swenson, 1968; Tharinger & Stark, 1990).
The evidence to support the specific-signs method of interpretation is even
worse. Reviewers (Kahill, 1984; Swenson, 1968) have found either no support or,
at best, contradictory support for the theory that specific signs are associated with
specific personality or diagnostic characteristics. Conclusions based on these signs
clearly would not meet Heilbrun's (1992) relevancy standard.
Overall rating methods have generally fared somewhat better than the other two
methods, although it is notable that in at least one study (Tharinger & Stark, 1990),
a more qualitative approach was superior to Koppitz's (1968) rating. The validity
data for both the DAP:SPED (Naglieri & Pfeiffer, 1992) and Van Hutton's (1994)
scoring method suggest that in both rating systems, emotionally disturbed children
score significantly worse than normal children. However, both tests have been
criticized (Cosden, 1995; Dowd, 1998; Knoff, 1998; Morrison, 1995) for the
weakness of these data. In addition, human figure drawings have been criticized
(Gresham, 1993; Knoff, 1993) for their lack of incremental validity. In the words
of Motta et al. (1993b):
One does not use a less valid measure to support a more valid one. Figure drawings do
not have established validity as measures of behavior or personality and as such can
add little to existing valid measures for these characteristics. (p. 163)

To fully meet Heilbrun's (1992) standard, additional positive validation studies are
needed.
4. The Test Should Have a Standard Method of
Administration and Should Be Administered as
Close as Possible to This Standard
Manuals for both the global-impressionistic and specific-signs methods vary from
small (Burns, 1987) to considerable degrees (Buck, 1992) in the amount of information they provide about the standard method of administration. Both in clinical
practice and in the research literature, there is considerable variability in how human figure drawing tests are administered and scored. Kahill (1984) noted that
some environmental factors (e.g., sex of the examiner, group vs. individual administration, specific instructions), which are often not controlled for in studies or clinical practice, can affect the results.
In developing overall rating systems, authors generally provide detailed information on the administration and scoring of the system. For example, the
DAP:SPED manual (Naglieri et al., 1991) describes in great detail administration guidelines and even provides plastic templates to aid in scoring. Van Hutton
(1994) was less detailed in her discussion of administration, and her scoring criteria for some items could be more detailed; however, the manual does appear to
meet Heilbrun's (1992) minimal requirement.

5. The Test Should Be Applicable to the
Population and Purpose Used
This standard is largely dependent on the specific forensic testing situation. It refers to
the degree of fit between the individual being evaluated and the population and condition on the basis of which the test was developed. In other words, can one generalize
from the norm sample to the particular person? For both the global-impressionistic and
specific-signs methods, there are few good normative data. Samples may not be random, and many important demographic variables go unreported (Killian, 1984).
Without this information, it is not possible to assess the degree of fit between the individual currently being evaluated and the validation sample of the test.
Van Hutton's (1994) method was severely criticized (Dowd, 1998; Knoff, 1998)
because her sample was extremely small, nonrandom, unstratified, and geographically limited. Additional concerns were raised about whether her controls were actually free from sexual or physical abuse (Dowd, 1998). In contrast, Naglieri et al.
(1991) used a relatively large sample of children and adolescents who were representative of U.S. geographic regions, sex, ethnicity, and socioeconomic status in developing the DAP:SPED. Morrison (1995) noted that it was not clear if children with
emotional problems were excluded from the sample. However, even with this weakness, the DAP:SPED appears to allow one to estimate the degree of fit between its
norm sample and the individual undergoing a forensic evaluation.

6. Objective Tests and Actuarial Data
Combination Are Preferable
By their very definition, human figure drawing tests are not objective. Similarly,
both the global-impressionistic and specific-signs methods emphasize the clinical

Downloaded by [Illinois State University Milner Library] at 16:36 29 November 2012

HUMAN FIGURE DRAWINGS IN COURT

143

processing of data. Although the overall rating system has a more objective scoring
method, the test is still projective in its data collection. The global-impressionistic
system exemplifies the clinical method of data combination as does, to a large extent, the specific-signs method. The overall rating method, with its cutoffs to determine the presence of pathology on the basis of normative samples, is akin to an
actuarial method. However, the overall ratings do not systematically measure outcomes, identify predictor variables, weigh the variables, and then cross-validate the
weighted variables.

7. The Test Should Measure Response Style


None of the three methods have built-in controls or methods to assess response set.
There has been little research conducted to evaluate the susceptibility of human figure drawings to being malingered. Ponzo (1957) asked participants to draw figures
as an idiot would and found that the drawings were more primitive and careless
than what would normally be expected. In general, it has been suggested that aberrant response sets for projective tests are fakable (Schretlen, 1997, p. 220).

DAUBERT (1993) CRITERIA


I next apply the Supreme Court's criteria, as articulated in Daubert (1993), to the
three interpretive methods for human figure drawings. I evaluate each method to
determine whether it meets the criteria (see Table 2).

TABLE 2
Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993) Criteria Applied to the Three Human Figure Drawing Interpretive Methods

Columns: Criteria; Global Impressions; Specific Signs; Overall Ratings

1. Falsifiability (has been or can be tested)
2. Peer reviewed and published in a professional journal
3. General acceptance by scientific community
4a. Known error rate
4b. Standards for administration

[The rating symbols for the individual cells are not recoverable from this copy.]

Note. – = method does not meet criterion; ± = equivocal whether method does meet criterion; + = method does meet criterion.

1. The Theory or Technique Can Be and
Has Been Tested (Falsifiability)
A number of studies have used the global-impressionistic method. As previously
mentioned, the results are mixed at best. An additional complication with this
method is its dependence on the interpreter being able to develop an empathic
attunement with the artist. Because such attunement is difficult to define, much less
measure, it is not easy to falsify this method. It has been argued (Hammer, 1997)
that negative results are due to the interpreter's personality and not to a flaw in the
method; however, such an argument does not increase the falsifiability of this
method.
The specific-signs method is much more amenable to being tested, and as mentioned previously, it has been the subject of numerous studies. However, the results have been contradictory as often as they have been negative with respect to this
method. It is ironic that although the validity appears to be poor for this method, it
does meet this criterion by virtue of both being amenable to testing and having been repeatedly tested.
The overall rating scales can also easily be tested. For the DAP:SPED (Naglieri
et al., 1991) and Van Hutton's (1994) method, the main limitation is the small
number of studies that have been completed on each test. It is this limitation that
keeps them from fully meeting the criterion.

2. The Theory or Technique Has Been Subjected to
Peer Review and Published in Professional Journals
Because of a lack of precision in the definition of the global-impressionistic
method, many studies vary in how they define it. Some combine aspects of the
impressionistic method with the specific-signs method (e.g., Stawar & Stawar,
1989), and others (e.g., Tharinger & Stark, 1990) provide raters with criteria that
eliminate some of the intuitive process. If the global-impressionistic method is
broadly defined, it does, to an extent, meet this criterion. The fact that some
studies are not supportive of the method does not directly apply to its admissibility within this criterion. To an even greater degree, the specific-signs method
(Kahill, 1984; Swenson, 1968) has been peer reviewed and published in the professional literature. Again, the fact that most of the studies are negative does not
apply directly to its admissibility under this criterion. There have been a number
of peer reviewed studies of Koppitz's (1968) overall rating system (e.g.,
Tharinger & Stark, 1990), or some modification of it (e.g., Yuma, 1990), but
there are fewer than a handful of peer reviewed and published articles on the
DAP:SPED (e.g., Bruening et al., 1997; Naglieri & Pfeiffer, 1992) and none on
Van Hutton's (1994) system.

3. There Is General Acceptance of the Theory or
Technique in the Scientific Community
Although the frequency of use of human figure drawings suggests a continued acceptance by the practitioner community, the response from the scientific community is
anything but acceptance. Both the global-impressionistic and the specific-signs methods have been severely criticized by authors (Gersten, 1978; Gresham, 1993; Harris,
1978; Joiner & Schmidt, 1997; Joiner et al., 1996; Kahill, 1984; Knoff, 1993; Motta et
al., 1993b; Smith & Dumont, 1995; Swenson, 1968; Ziskin, 1995), with some even
suggesting that the continued use of the tests is unethical (Motta, Little, & Tobin,
1993a). Although it is clear that there are those who advocate for the continued use of
human figure drawings (Hammer, 1969, 1996; Riethmiller & Handler, 1997; Safran,
1996), it cannot be concluded that there is general acceptance of this testing modality.
At least in the case of the DAP:SPED, one of the overall rating scales comes closer to
receiving general acceptance (Cosden, 1995; Morrison, 1995).
4a. The Theory or Technique Has a Known Error Rate
Because there is so much variability in the definition, administration, and scoring of
the global-impressionistic method, it is difficult to even estimate its error rate. Similarly, it is difficult to derive a precise error rate from the specific-signs method because studies have varied as to how signs are defined and to what extent they
publish information about the number of false positives and negatives. However, in
general this method appears to have a very high error rate.
There is more information available about the error rate of the different overall
rating methods. For example, in a study of children with mood and anxiety disorders
(Tharinger & Stark, 1990), the Koppitz (1968) emotional indicators misidentified
50% of the mood-disordered children, 38% of the mixed mood/anxiety-disordered
children, 36% of the anxiety-disordered children, and 22% of the normal controls. In
two studies with the DAP:SPED (Bruening et al., 1997; Naglieri & Pfeiffer, 1992),
samples of emotionally disturbed children were misidentified between 35% and
52% of the time, and normal controls were misidentified 22% of the time. The
DAP:SPED was also not able to distinguish between sexual and emotional abuse.
For three of the four scales in Van Hutton's (1994) scoring system, error rates range from 20% to 93% for sexually abused children, 0% to 18% for emotionally
disturbed children, and 0% to 2.8% for her normal controls.
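To see what per-group misidentification rates of this size imply in practice, the following is a minimal sketch in Python, not a procedure taken from any of the cited studies. It reads the DAP:SPED figures above as a false-negative rate of 35% to 52% for emotionally disturbed children and a false-positive rate of 22% for normal controls, which is an interpretive assumption. The base rate of 10% is purely hypothetical and is not drawn from the literature, and the function name `screening_summary` and its parameters are illustrative only.

```python
def screening_summary(false_negative_rate, false_positive_rate, base_rate):
    """Combine per-group misidentification rates into population-level figures.

    false_negative_rate: share of truly disturbed children the scale misses.
    false_positive_rate: share of normal controls the scale wrongly flags.
    base_rate: assumed (hypothetical) share of disturbed children screened.
    """
    sensitivity = 1.0 - false_negative_rate
    true_pos = sensitivity * base_rate
    false_neg = false_negative_rate * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    # Overall error: proportion of all screened children who are misclassified.
    overall_error = false_pos + false_neg
    # Positive predictive value: proportion of flagged children truly disturbed.
    ppv = true_pos / (true_pos + false_pos)
    return {"overall_error": overall_error, "ppv": ppv}


# Lower and upper misidentification rates reported for disturbed samples,
# paired with the 22% rate reported for normal controls.
for fn_rate in (0.35, 0.52):
    result = screening_summary(fn_rate, 0.22, base_rate=0.10)  # base rate is hypothetical
    print(f"FN rate {fn_rate:.0%}: overall error {result['overall_error']:.1%}, "
          f"PPV {result['ppv']:.1%}")
```

Under the assumed base rate, most children flagged by the scale would be false positives, which illustrates why the rate for normal controls matters at least as much as the rate for the clinical samples. A different base rate would change these figures.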
4b. There Exist Standards for Administration to
Control the Operation of the Technique
This Daubert (1993) criterion parallels Heilbrun's (1992) fourth guideline. As reviewed during the discussion of that guideline, both the global-impressionistic and the
specific-signs methods vary considerably in the degree to which they have a standard
method of administration. This variability is reflected in both clinical practice and
the research literature. The overall rating systems, by contrast, generally provide detailed information on administration and scoring, and so these methods basically meet this
criterion.

CONCLUSIONS
It is clear that the two most commonly used methods for scoring human figure
drawings, global impressions and specific signs, do not meet most of Heilbrun's
(1992) guidelines for use in forensic assessment and fall short of meeting the
Daubert (1993) criteria. These methods may arguably have a place in clinical practice, but they clearly do not belong in a courtroom. The conclusions with regard to
the overall rating scales, such as the DAP:SPED and Van Hutton's (1994) scoring system,
are less clear cut. They meet, or partially meet, a number of the guidelines and criteria, and they have the potential, with additional research, to meet more. Although
their validity is weak, their conclusions are limited in scope, and they appear to offer no additional information over other psychological tests, it can at least be argued
that they cross the relatively low hurdle of admissibility.

REFERENCES
Ackerman, M. J., & Ackerman, M. C. (1997). Custody evaluations practices: A survey of experienced
professionals (revisited). Professional Psychology: Research and Practice, 28, 137–145.
Ackerman, M. J., & Schoendorf, K. (1992). Ackerman–Schoendorf Scales for Parent Evaluation of
Custody (ASPECT). Los Angeles: Western Psychological Services.
Adler, P. T. (1970). Evaluation of the figure drawing technique: Reliability, factorial structure, and diagnostic usefulness. Journal of Consulting and Clinical Psychology, 35, 52–57.
Borum, R., & Grisso, T. (1995). Psychological test use in criminal forensic evaluations. Professional
Psychology: Research and Practice, 26, 465–473.
Bricklin, B. (1984). Bricklin Perceptual Scales. Furlong, PA: Village.
Bruening, C. C., Wagner, W. G., & Johnson, J. T. (1997). Impact of rater knowledge on sexually abused
and nonabused girls' scores on the Draw-A-Person: Screening Procedure for Emotional Disturbance (DAP:SPED). Journal of Personality Assessment, 68, 665–677.
Buck, J. N. (1992). House–Tree–Person projective drawing technique: Manual and interpretive guide
(Rev. ed.; revised by W. L. Warren). Los Angeles: Western Psychological Services.
Burley, T., & Handler, L. (1997). Personality factors in the accurate interpretation of projective tests. In
E. F. Hammer (Ed.), Advances in projective drawing interpretation (pp. 359–377). Springfield, IL:
Thomas.
Burns, R. C. (1987). Kinetic–House–Tree–Person drawings (K–H–T–P). New York: Brunner/Mazel.
Chase, C. I. (1984). Psychological evaluation of children's human figure drawings [Review]. In D. J.
Keyser & R. C. Sweetland (Eds.), Test critiques (Vol. 1, pp. 189–194). Kansas City, MO: Test Corporation of America.
Chapman, L. J., & Chapman, J. P. (1967). Genesis of popular but erroneous psycho-diagnostic observations. Journal of Abnormal Psychology, 72, 193–204.
Cosden, M. (1995). Review of the Draw A Person: Screening Procedure for Emotional Disturbance. In
J. C. Conoley & J. C. Impara (Eds.), The twelfth mental measurements yearbook (pp. 321–322). Lincoln: University of Nebraska Press.
Cundick, B. P. (1989). Review of the Kinetic Drawing System for Family and School: A handbook. In J.
C. Conoley & J. J. Kramer (Eds.), The tenth mental measurements yearbook (pp. 422–423). Lincoln:
University of Nebraska Press.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 113 S. Ct. 2786 (1993).
Dowd, E. T. (1998). Review of the House–Tree–Person and Draw-A-Person as measures of abuse in
children: A quantitative scoring system. In J. C. Impara & B. S. Plake (Eds.), The thirteenth mental
measurements yearbook (pp. 486–487). Lincoln: University of Nebraska Press.
Dumont, F., & Smith, D. (1996). Projectives and their infirm research base. Professional Psychology:
Research and Practice, 27, 419–420.
Ellis, A. (1953). [Review of H–T–P]. In O. K. Buros (Ed.), The fourth mental measurements yearbook
(pp. 178–181). Highland Park, NJ: Gryphon.
General Electric v. Joiner, 118 S. Ct. 512 (1997).
Gersten, J. C. (1978). [Review of Kinetic Family Drawings]. In O. K. Buros (Ed.), The eighth mental
measurements yearbook (pp. 882–884). Highland Park, NJ: Gryphon.
Goodenough, F. L. (1926). Measurement of intelligence by drawings. New York: Harcourt Brace.
Gresham, F. M. (1993). What's wrong in this picture?: Response to Motta et al.'s review of human figure drawings. School Psychology Quarterly, 8, 182–186.
Groth-Marnat, G. (1997). Handbook of psychological assessment (3rd ed.). New York: Wiley.
Groth-Marnat, G., & Roberts, L. (1998). Human figure drawings and House Tree Person drawings
as indicators of self-esteem: A quantitative approach. Journal of Clinical Psychology, 54,
219–222.
Hammer, E. F. (1958). The clinical application of projective drawings. Springfield, IL: Thomas.
Hammer, E. F. (1969). DAP: Back against the wall? Journal of Consulting and Clinical Psychology, 33,
151–156.
Hammer, E. F. (1996). Deception? Professional Psychology: Research and Practice, 27, 418.
Hammer, E. F. (1997). Editor's comment. In E. F. Hammer (Ed.), Advances in projective drawing interpretation (p. 377). Springfield, IL: Thomas.
Handler, L., & Riethmiller, R. (1998). Teaching and learning the administration and interpretation of
graphic techniques. In L. Handler & M. J. Hilsenroth (Eds.), Teaching and learning personality assessment (pp. 267–294). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Harriman, P. L. (1953). [Review of Machover Draw-A-Person Test]. In O. K. Buros (Ed.), The fourth
mental measurements yearbook (pp. 186–188). Highland Park, NJ: Gryphon.
Harris, D. B. (1978). [Review of Kinetic Family Drawings]. In O. K. Buros (Ed.), The eighth mental
measurements yearbook (pp. 884–885). Highland Park, NJ: Gryphon.
Haworth, M. R. (1965). [Review of H–T–P]. In O. K. Buros (Ed.), The sixth mental measurements yearbook (pp. 435–436). Highland Park, NJ: Gryphon.
Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law and Human Behavior, 16, 257–272.
Joiner, T. E., Jr., & Schmidt, K. L. (1997). Drawing conclusions – or not – from drawings. Journal of
Personality Assessment, 69, 476–481.
Joiner, T. E., Jr., Schmidt, K. L., & Barnett, J. (1996). Size, detail and line heaviness in children's drawings as correlates of emotional distress: (More) negative evidence. Journal of Personality Assessment, 67, 127–141.
Kahill, S. (1984). Human figure drawings in adults: An update of the empirical evidence, 1967–1982.
Canadian Psychology, 25, 269–292.
Kaufman, B., & Wohl, A. (1992). Casualties of childhood: A developmental perspective on sexual
abuse using projective drawings. New York: Brunner/Mazel.
Killian, G. A. (1984). House–Tree–Person technique [Review]. In D. J. Keyser & R. C. Sweetland
(Eds.), Test critiques (Vol. 1, pp. 338–353). Kansas City, MO: Test Corporation of America.
Kitay, P. M. (1965). [Review of Machover Draw-A-Person Test]. In O. K. Buros (Ed.), The sixth mental
measurements yearbook (pp. 466–468). Highland Park, NJ: Gryphon.
Knoff, H. M. (1993). The utility of human figure drawings in personality and intellectual assessment:
Why ask why? School Psychology Quarterly, 8, 191–196.
Knoff, H. M. (1998). Review of the House–Tree–Person and Draw-A-Person as measures of abuse in
children: A quantitative scoring system. In J. C. Impara & B. S. Plake (Eds.), The thirteenth mental
measurements yearbook (pp. 487–490). Lincoln: University of Nebraska Press.
Koppitz, E. (1968). Psychological evaluation of children's human figure drawings. New York: Grune &
Stratton.
Krugman, M. (1949). [Review of H–T–P]. In O. K. Buros (Ed.), The third mental measurements yearbook (pp. 84–85). Highland Park, NJ: Gryphon.
Kumho Tire v. Carmichael, 119 S. Ct. 1167 (1999).
Lubin, B., Larsen, R. M., & Matarazzo, J. D. (1984). Patterns of psychological test usage in the United
States: 1935–1982. American Psychologist, 39, 451–454.
Machover, K. (1949). Personality projection in the drawing of the human figure. Springfield, IL:
Thomas.
Morrison, G. M. (1995). Review of the Draw A Person: Screening Procedure for Emotional Disturbance. In J. C. Conoley & J. C. Impara (Eds.), The twelfth mental measurements yearbook (pp.
321–322). Lincoln: University of Nebraska Press.
Motta, R. W., Little, S. G., & Tobin, M. I. (1993a). A picture is worth less than a thousand words: Response to reviewers. School Psychology Quarterly, 8, 197–199.
Motta, R. W., Little, S. G., & Tobin, M. I. (1993b). The use and abuse of human figure drawings. School
Psychology Quarterly, 8, 162–169.
Naar, R. (1961). Testing in juvenile courts: A survey. Journal of Clinical Psychology, 17, 210.
Naglieri, J. A., McNeish, T. J., & Bardos, A. N. (1991). Draw A Person: Screening Procedure for Emotional Disturbance examiner's manual. Austin, TX: PRO-ED.
Naglieri, J. A., & Pfeiffer, S. L. (1992). Performance of disruptive behavior disordered and normal samples on the Draw A Person: Screening Procedure for Emotional Disturbance. Psychological Assessment, 4, 156–159.
Pinkerman, J. E., Haynes, J. P., & Kaiser, T. (1993). Characteristics of psychological practice in juvenile
court clinics. American Journal of Forensic Psychology, 11, 3–12.
Piotrowski, C. (1984). The status of projective techniques: Or wishing it would go away. Journal of
Clinical Psychology, 40, 1495–1502.
Ponzo, E. (1957). An experimental variation of the Draw-A-Person technique. Journal of Projective
Techniques, 21, 278–285.
Riethmiller, R. J., & Handler, L. (1997). The great figure drawing controversy: The integration of research and clinical practice. Journal of Personality Assessment, 69, 488–496.
Rosen, E. (1953). [Review of H–T–P]. In O. K. Buros (Ed.), The fourth mental measurements yearbook
(p. 181). Highland Park, NJ: Gryphon.
Safran, S. (1996). DAP or method? Professional Psychology: Research and Practice, 27, 418–419.
Schretlen, D. J. (1997). Dissimulation on the Rorschach and other projective measures. In R. Rogers (Ed.),
Clinical assessment of malingering and deception (2nd ed., pp. 208–222). New York: Guilford.
Scribner, C. M., & Handler, L. (1987). The interpreter's personality in Draw-A-Person interpretation: A
study of interpersonal style. Journal of Personality Assessment, 51, 112–122.
Shaffer, J. W., Duszynski, K. R., & Thomas, C. B. (1984). A comparison of three methods for scoring
figure drawings. Journal of Personality Assessment, 48, 245–254.
Smith, D., & Dumont, F. (1995). A cautionary study: Unwarranted interpretations of the Draw-A-Person test. Professional Psychology: Research and Practice, 26, 298–303.
Stawar, T. L., & Stawar, D. E. (1989). Kinetic family drawings and MMPI diagnostic indicators in adolescent psychiatric inpatients. Psychological Reports, 65, 143–146.
Stewart, N. (1953). [Review of Machover Draw-A-Person Test]. In O. K. Buros (Ed.), The fourth mental
measurements yearbook (pp. 188–190). Highland Park, NJ: Gryphon.
Swenson, C. H. (1968). Empirical evaluations of human figure drawings: 1957–1966. Psychological
Bulletin, 70, 20–44.
Tharinger, D. J., & Stark, K. (1990). A qualitative versus quantitative approach to evaluating the
Draw-A-Person and Kinetic Family Drawing: A study of mood- and anxiety-disorder children. Psychological Assessment, 2, 365–375.
Van Hutton, V. (1994). House–Tree–Person and Draw-A-Person as measures of abuse in children: A
quantitative scoring system. Odessa, FL: Psychological Assessment Resources.
Waehler, C. A. (1997). Drawing bridges between science and practice. Journal of Personality Assessment, 69, 482–487.
Wanderer, Z. W. (1969). Validity of clinical judgments based on human figure drawings. Journal of
Consulting and Clinical Psychology, 33, 143–150.
Watkins, C. E., Jr., Campbell, V. L., Nieberding, R., & Hallmark, R. (1995). Contemporary practice of
psychological assessment by clinical psychologists. Professional Psychology: Research and Practice, 26, 54–60.
Weinberg, R. A. (1989). Review of the Kinetic Drawing System for Family and School: A handbook. In
J. C. Conoley & J. J. Kramer (Eds.), The tenth mental measurements yearbook (pp. 423–425). Lincoln: University of Nebraska Press.
Weisgram v. Marley Co., 120 S. Ct. 1011 (2000).
Wilcox, K. W. (1949). [Review of H–T–P]. In O. K. Buros (Ed.), The third mental measurements yearbook (pp. 85–86). Highland Park, NJ: Gryphon.
Witt, J. C., & Gresham, F. M. (1998). Review of the Draw-a-Story: Screening for depression and age or
gender differences. In J. C. Impara & B. S. Plake (Eds.), The thirteenth mental measurements yearbook (pp. 377–379). Lincoln: University of Nebraska Press.
Yuma, M. F. (1990). The usefulness of human figure drawings as an index of overall adjustment. Journal of Personality Assessment, 54, 78–86.
Ziskin, J. (1995). Coping with psychiatric and psychological testimony (5th ed.). Los Angeles: Law and
Psychology Press.

Stephen J. Lally
American School of Professional Psychology
1550 Wilson Blvd, Suite 600
Arlington, VA 22209
Received March 30, 2000
Revised June 1, 2000
