RESEARCH UPDATE REVIEW
This series of 10-year updates in child and adolescent psychiatry began in July 1996. Topics are selected in
consultation with the AACAP Committee on Recertification, both for the importance of new research and
its clinical or developmental significance. The authors have been asked to place an asterisk before the five
or six most seminal references.
MKD.
Psychological Testing for Child and Adolescent
Psychiatrists: A Review of the Past 10 Years
JEFFREY M. HALPERIN, PH.D., AND KATHLEEN E. McKAY, PH.D.
ABSTRACT
Objective: To provide a review of psychological tests often used with children and adolescents. Method: A description
of how psychological tests are used and how to interpret various types of scores is provided. Subsequently, psy-
chological tests used to assess intelligence, academic achievement, neuropsychological functions, and personality are
reviewed. Results: There are numerous well-normed, reliable, and valid instruments that are available for assessing
intellectual and academic functioning in children and adolescents. Neuropsychological tests, designed to assess
objectively a wide range of cognitive functions, are available and extremely useful for designing treatment plans for
patients with cognitive difficulties. Despite their popularity, most projective tests have relatively weak psychometric data
supporting their reliability and/or validity. Conclusions: Psychological testing provides objective measures of behavior
that are of considerable utility for evaluating children and adolescents. However, psychological test data, in isolation, will
rarely be adequate for providing a DSM diagnosis, and test scores are best interpreted in the context of other clinical
data. Psychological test data can be very useful for developing a comprehensive treatment plan that addresses the
patient's cognitive and emotional needs. J. Am. Acad. Child Adolesc. Psychiatry, 1998, 37(6):575-584. Key Words:
intelligence testing, academic achievement, neuropsychological testing, personality assessment, projective tests.
This article will provide a review of psychological tests
that are often used with children and adolescents, high-
lighting more recently developed or revised instru-
ments. The focus will be on tests of intelligence,
academic achievement, personality, and neuropsycho-
logical functioning. Space limitations preclude a com-
prehensive review of all psychological tests or even an
in-depth review of any individual test, but several books
provide comprehensive reviews of psychological assess-
ment instruments (e.g., Anastasi and Urbina, 1997;
Kamphaus and Frick, 1996).
Accepted October 2, 1997.
Dr. Halperin is with the Psychology Department, Queens College of the City
University of New York, Flushing, and the Psychiatry Department, Mount Sinai
School of Medicine, New York. Dr. McKay is with the Psychiatry Department,
Mount Sinai School of Medicine.
Reprint requests to Dr. Halperin, Psychology Department, Queens College,
65-30 Kissena Blvd., Flushing, NY 11367.
0890-8567/98/3706-0575/$03.00/0©1998 by the American Academy
of Child and Adolescent Psychiatry.
J. AM. ACAD. CHILD ADOLESC. PSYCHIATRY, 37:6, JUNE 1998
Furthermore, this review will not cover behavior
rating scales and personality inventories because the
data they generate are qualitatively different from data
derived from objective psychological testing. Unlike
rating scales, which provide direct (although subjective)
information about how an individual functions within
his/her environment, psychological tests do not measure
functioning in the natural environment and, only rarely,
provide direct evidence about the presence or absence of
psychiatric symptoms. Rather, psychological testing
provides objective measures of behavior and/or func-
tions derived in a “laboratory-like” setting. Psycholog-
ical testing is particularly good for assessing current
cognitive or emotional status, what someone has
learned, and/or a person’s thinking process. Test data are
best interpreted in the context of a comprehensive
clinical evaluation and cannot, in isolation, provide
DSM Axis I diagnoses. This is not to say that psy-
chological testing cannot be useful for determining
diagnosis or for clarifying etiological factors. However,
such determinations generally require clinical inferences
that go beyond the test data. Psychological testing is
more often useful for assessing aspects of a child's
cognitive or emotional status that may have important
implications for treatment planning rather than for
differential diagnosis per se.
UNDERSTANDING TEST SCORES
Before discussing specific psychological tests, we will
present a brief review of how to interpret test scores.
Psychological testing reports rarely refer to raw scores,
which indicate the number of items correct (or number
of errors). Rather, they provide scores that indicate how
performance relates to that of similar others on the
same measures. For example, knowing that a 10-year-
old child correctly answered 41 of 50 questions on a test
means little unless you know how 10-year-olds in
general perform on the same measure.
There are three common methods for reporting per-
formance on psychological tests: developmental scores,
percentiles, and standard scores. In somewhat different
ways, each of these reflects performance relative to that
of others. The most common developmental scores are
“mental age” and “grade equivalents,” although many
tests provide age-equivalent scores. The primary
strength of developmental scores is their descriptive
appeal. Hearing that Johnny has a mental age of 7 years,
or a third grade reading level, provides what seems to be
a vivid picture of where Johnny stands within these
domains. Yet one must be cautious when interpreting
developmental scores which, unlike chronological age,
are not on a ratio or even an interval scale of measure-
ment. The unit of measure on developmental scales
systematically shrinks with age. A 5-year-old child
functioning at the 3-year-old level might be quite
impaired, whereas a 12-year-old functioning at the 10-
year-old level might be only moderately behind. The
difference in functioning between a 19- and 17-year-old
might be meaningless. Thus, at different ages, discrep-
ancies in developmental scores mean different things.
Furthermore, developmental scores provide little infor-
mation about the variability of test performance, which
often varies across ages. Thus, within a given measure,
how much do normal 10-year-olds vary around a 10-
year-old age score?
Percentile scores provide an index of where one
stands relative to others on a scale of 1 to 100.
Importantly, a score at the first or 100th percentile does
not mean that the person got all of the questions on the
test right or wrong. Rather, it means that the individual
performed worse or better than everybody else in the
comparison group. If the comparison group is made up
of children of similar age, percentile scores have the
advantage over developmental scores of maintaining
their meaning at different ages. Nonetheless, like devel-
opmental scores, percentile scores are on an ordinal
scale. The unit of measure varies across the range. There
is relatively little difference between scores at the 40th
and 60th percentiles (these are equivalent to IQ scores
of approximately 96 and 104, respectively), but a 20-
point difference near either tail of the distribution will
be substantial (e.g., the 1st and 21st percentiles equal IQ
scores of about 65 and 88, respectively).
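The uneven spacing of the percentile scale can be checked numerically. What follows is only a minimal sketch in Python, assuming that scores are normally distributed with a mean of 100 and SD of 15; the function name percentile_to_iq is hypothetical and introduced here purely for illustration.

```python
from statistics import NormalDist

# Assumption: IQ scores are normally distributed with mean 100 and SD 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def percentile_to_iq(percentile: float) -> float:
    """Return the IQ score at a percentile strictly between 0 and 100."""
    return IQ_SCALE.inv_cdf(percentile / 100)


# Compare a 20-percentile-point gap near the middle with one near the tail.
for low, high in [(40, 60), (1, 21)]:
    iq_low = percentile_to_iq(low)
    iq_high = percentile_to_iq(high)
    print(f"percentiles {low} and {high}: "
          f"IQ {iq_low:.0f} and {iq_high:.0f}, "
          f"gap of {iq_high - iq_low:.1f} IQ points")
```

Under these assumptions, the same 20-percentile-point gap corresponds to roughly 8 IQ points in the middle of the distribution and roughly 23 IQ points near the lower tail.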
In contrast, not only do standard score scales have
the advantage of being indicative of performance rel-
ative to others, but the unit of measure remains con-
stant across the range of scores. Standard score scales
report scores in standard deviation (SD) units from the
normative sample’s mean. Some tests report z scores,
which directly indicate SD units. Thus, a z score of 0
means that the child scored exactly at the mean of the
normative sample, a score of +1.0 means the child
scored 1 SD above the mean, and a score of -0.2 means
that the child scored 0.2 SD below the mean as assessed
via the normative sample. Most tests, however, do not
present scores as z scores. Rather, a wide array of stand-
ard score scales are used which can all be interpreted in
the same manner. Whereas a z score reflects perform-
ance on a scale with a mean of 0 and SD of 1, an IQ
score is more likely to be on a scale with a mean of 100
and SD of 15 or 16. To follow the example above,
someone who scored exactly at the mean of the
normative sample would receive an IQ score of 100.
Someone scoring 1 SD above the mean would receive
an IQ score of 115 (or 116), and someone scoring 0.2
SD below the mean would receive an IQ score of 97
(i.e., 0.2 × 15 = 3). Thus, to interpret standard scores,
one must know the mean and SD of the scale on which
they are based.
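The arithmetic in the preceding paragraph amounts to a linear transformation of the z score. The sketch below restates it in Python; the function names z_to_standard and standard_to_z are hypothetical, and the default mean of 100 and SD of 15 are assumptions that must be replaced with the values of the scale actually in use.

```python
def z_to_standard(z: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Convert a z score to a standard score on a scale with the given mean and SD."""
    return mean + z * sd


def standard_to_z(score: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Convert a standard score back to SD units from the normative mean."""
    return (score - mean) / sd


# The three cases described in the text, on a scale with mean 100 and SD 15.
for z in (0.0, 1.0, -0.2):
    print(f"z = {z:+.1f} corresponds to a score of {z_to_standard(z):.0f}")

# The same z score on a scale with SD 16 yields a slightly different score.
print(f"z = +1.0 on an SD-16 scale: {z_to_standard(1.0, sd=16.0):.0f}")
```

The second function makes the closing point of the paragraph concrete: a score of 116 is exactly 1 SD above the mean on an SD-16 scale but slightly more than 1 SD above the mean on an SD-15 scale.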
ASSESSMENT OF INTELLIGENCE
Intelligence tests usually provide an estimate of
global cognitive functioning as well as information
about functioning within more specific domains.
Compared to measures of virtually all other human
traits, intelligence test scores are quite stable. However,
the degree of stability increases with age such that early
childhood and preschool measures of intellectual
function are far less predictive of later functioning than
assessments taken during middle childhood. Further-
more, despite their relative stability, intelligence test
scores may change as a function of important environ-
mental factors. Therefore, intelligence test scores are
descriptive of a child’s functioning at that point in time.
This could change with alterations in the child’s psy-
chiatric status, environmental conditions, or educa-
tional program.
The Wechsler Scales
The Wechsler intelligence scales are the most popular
among intelligence tests (Watkins et al., 1996; Wilson
and Reschly, 1996), and therefore they will be described
in greater detail than others. There are three different
Wechsler intelligence tests that are structurally similar but
differ with regard to the target age-range. The Wechsler
Preschool and Primary Scale of Intelligence-Revised
(WPPSI-R) (Wechsler, 1989) is the most recent version
of the test normed for ages 3 to 7 years, 3 months; the Wechsler
Intelligence Scale for Children-Third Edition (WISC-III)
(Wechsler, 1991) is normed for ages 6 to 16 years, 11
months; and the Wechsler Adult Intelligence Scale-Third
Edition (WAIS-III) (Wechsler, 1997) is normed for ages
16 through 74 years. All three are well-normed tests with
considerable data supporting their reliability and validity.
The Wechsler tests generate three major IQ scores:
Verbal IQ (VIQ), Performance IQ (PIQ), and Full
Scale IQ (FSIQ). These are all deviation IQ scores stan-
dardized by age with a mean of 100 and SD of 15.
Classification ranges are provided in the manuals such
that the “average” range is considered to be between 90
and 109. “High average” is considered to be 110 to 119;
“superior” is 120 to 129; and “very superior” is 130 and
greater. Going downward, “low average” is between 80
and 89; “borderline” is 70 to 79; and the “intellectually
deficient” range is below 70. These tests are not
particularly sensitive to individual differences below the
“mildly deficient” range (i.e., 2 to 3 SD below the
mean). Furthermore, as discussed below, mental retar-
dation should not be diagnosed only on the basis of
data from an intelligence test. The individual must also
be assessed by a measure of adaptive function.
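The classification ranges listed above can be written as a simple lookup. The following Python sketch uses a hypothetical function name, wechsler_range, and encodes only the cutoffs quoted in this section; the label it returns is descriptive and, as noted above, is not a diagnosis.

```python
def wechsler_range(iq: int) -> str:
    """Map a deviation IQ (mean 100, SD 15) to the descriptive range quoted in the text."""
    if iq >= 130:
        return "very superior"
    if iq >= 120:
        return "superior"
    if iq >= 110:
        return "high average"
    if iq >= 90:
        return "average"
    if iq >= 80:
        return "low average"
    if iq >= 70:
        return "borderline"
    # Descriptive label only: a diagnosis also requires a measure of adaptive function.
    return "intellectually deficient"


for score in (135, 112, 100, 85, 72, 65):
    print(score, wechsler_range(score))
```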
The Verbal and Performance scales of all three
Wechsler intelligence tests are composed of subtests
which are scaled with a mean of 10 and SD of 3, and
they generate scores ranging from 1 to 19 (i.e., ±3 SD
from the mean). All Verbal subtests are administered
orally by the examiner and require a verbal response
from the examinee. They require no reading or writing
by the examinee, they do not involve manipulation of
objects, and other than the Arithmetic subtest, all
Verbal subtests are untimed. The Verbal subtests vary
considerably with regard to the content of the material
ascertained as well as the relative degree of receptive and
expressive linguistic demands. For example, questions
on the Information and Arithmetic subtests can
generally be answered accurately with a one- or two-
word response, whereas the Vocabulary, Similarities,
and Comprehension subtests frequently require lengthy
explanations. With regard to receptive skills, several
questions on the Arithmetic and Comprehension sub-
tests are semantically and syntactically complex, whereas
those on the Vocabulary and Similarities subtests
involve a single word or two words, respectively.
The Performance subtests primarily involve visual
perceptual organization, motor speed and coordination,
and visual-motor integration, along with reasoning
abilities. All Performance subtests are timed. Importantly,
Performance tasks are not unaffected by poor verbal
abilities. Instructions are verbally administered and many
of the tasks can be best performed using verbal mediation.
When interpreting Wechsler scores one typically
looks initially at the FSIQ score, followed by VIQ-PIQ
discrepancies, and finally patterns of subtest scatter. The
FSIQ may be the single best indicator of overall
functioning, but this is only true when there is not a
significant difference between VIQ and PIQ scores. In
the context of large VIQ-PIQ differences, the FSIQ
may be of little utility. In general, large discrepancies
between the VIQ and PIQ are suggestive of uneven
development across domains of function. Although
there is some variability across ages, for the WISC-III,
differences of approximately 11 points are statistically
significant at the .05 level. Yet large differences are not
necessarily pathological. More than half of the children
in the standardization sample for the WISC-III had
VIQ-PIQ discrepancies greater than 9 points, and
about 25% had discrepancies of 15 points or greater
(Wechsler, 1991). Nonetheless, large differences may be
indicative of language (where the VIQ is lower) or
perceptual motor (where the PIQ is lower) problems.
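The distinction between a statistically significant discrepancy and an unusual one can be sketched in a few lines of Python. The function name describe_viq_piq is hypothetical, and the 11-point threshold is the approximate WISC-III value quoted above, which varies somewhat with age; the test manual remains the authority for any individual case.

```python
# Assumption: approximate .05-level threshold quoted in the text for the WISC-III.
SIGNIFICANT_DIFFERENCE = 11


def describe_viq_piq(viq: int, piq: int) -> str:
    """Describe a VIQ-PIQ discrepancy using the approximate threshold above."""
    difference = abs(viq - piq)
    if difference < SIGNIFICANT_DIFFERENCE:
        return f"{difference}-point difference: not statistically significant"
    lower_scale = "VIQ" if viq < piq else "PIQ"
    # Significant is not the same as rare: about 25% of the standardization
    # sample had discrepancies of 15 points or greater.
    return (f"{difference}-point difference, {lower_scale} lower: statistically "
            f"significant, but not necessarily pathological")


print(describe_viq_piq(104, 98))
print(describe_viq_piq(88, 105))
```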
Further analysis of subtest scatter is often based on
the factor structure of the subtests. The WISC-III sub-
tests are best accounted for by a four-factor solution:
Verbal Comprehension, Perceptual Organization,
Freedom From Distractibility, and Processing Speed
(Wechsler, 1991). While poor performance on one or
more of these factors is not diagnostic, it may be
indicative of difficulties within that domain. Typically,
children with language disabilities perform poorly on
the Verbal Comprehension factor. Children with
attention-deficit/hyperactivity disorder and/or learning
disabilities perform poorly on the Freedom From
Distractibility and Processing Speed factors (Wechsler,
1991). Yet children may perform poorly on Wechsler
subtests for a variety of reasons. Furthermore, the relia-
bility of the subtests is considerably lower (and thus their
measurement error greater) than that of the major scales. The reliability
(averaged across ages) for WISC-III subtests varies from
a low of 0.69 for Object Assembly, to a high of 0.87 for
the Vocabulary and Block Design subtests. In contrast,
the reliability values of the VIQ, PIQ, and FSIQ scores
are 0.95, 0.91, and 0.96, respectively. Therefore, inter-
pretations based on subtest patterns should be made
with caution and only in the context of other sup-
porting clinical and psychometric data.
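The link between reliability and measurement error can be illustrated with the standard error of measurement from classical test theory, SEM = SD × √(1 − reliability). This formula is a standard psychometric result and is not taken from the test manual, whose published age-specific values may differ; the function name below is hypothetical, and the reliability values are those averaged across ages as quoted above.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# (label, SD of the scale, reliability quoted in the text)
scores = [
    ("Object Assembly subtest", 3, 0.69),
    ("Vocabulary subtest", 3, 0.87),
    ("VIQ", 15, 0.95),
    ("PIQ", 15, 0.91),
    ("FSIQ", 15, 0.96),
]

for label, sd, reliability in scores:
    sem = standard_error_of_measurement(sd, reliability)
    # Express the error as a fraction of the scale's SD so that subtests
    # and IQ scales can be compared directly.
    print(f"{label}: SEM = {sem:.2f} points ({sem / sd:.2f} SD)")
```

Expressed in SD units, the error attached to the least reliable subtest (more than half an SD) is nearly three times that attached to the FSIQ (one fifth of an SD), which is why subtest patterns warrant more cautious interpretation.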
Stanford-Binet Intelligence Scale
The Fourth Edition of the Stanford-Binet (SB-IV)
(Thorndike et al., 1986) represents several advances
over previous versions of this, the first of all intelligence
tests. Like the Wechsler tests, the SB-IV is an individ-
ually administered test that requires extensive training
to administer. It is well-normed for ages 2 through
adulthood, thus allowing assessment of younger chil-
dren than does the WPPSI-R.
The SB-IV is composed of 15 tests which are divided
into four cognitive areas: Verbal Reasoning, Abstract/
Visual Reasoning, Quantitative Reasoning, and Short-
term Memory. However, not all 15 tests span the
SB-IV age/difficulty range. Thus, no individual is
administered all 15 tests. The SB-IV no longer uses the
term “IQ,” and instead it generates “Standard Age
Scores” with a mean of 100 and SD of 16 for each of
the four cognitive areas and a composite score. The
Standard Age Scores for the individual subtests have a
mean of 50 and SD of 8.
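Because the SB-IV uses different means and SDs than the Wechsler scales, scores are most easily compared in SD units. The sketch below, with a hypothetical function name rescale, re-expresses a score from one metric on another; it changes only the unit of measure and does not make scores from tests with different content and norms interchangeable.

```python
def rescale(score: float, from_mean: float, from_sd: float,
            to_mean: float, to_sd: float) -> float:
    """Re-express a standard score on another scale by way of its z score."""
    z = (score - from_mean) / from_sd
    return to_mean + z * to_sd


# An SB-IV area or composite score of 132 (mean 100, SD 16) lies 2 SD above
# the mean, which is 130 on a scale with mean 100 and SD 15.
print(rescale(132, 100, 16, 100, 15))

# An SB-IV subtest score of 58 (mean 50, SD 8) lies 1 SD above the mean,
# which is 13 on a scale with mean 10 and SD 3.
print(rescale(58, 50, 8, 10, 3))
```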
The normative data for the SB-IV are quite good in
that the sample characteristics closely represent U.S.
census data, reliability coefficients are excellent, and
validity data indicate high correlations with other
578
intelligence tests, with older versions of the Stanford-
Binet, and with measures of academic achievement
(Thorndike et al., 1986; Laurent et al., 1992). It is an
excellent, well-normed test that is particularly useful for
assessing gifted children in that its ceiling is quite high.
However, similar to the Wechsler scales, there are floor
effects, rendering it less sensitive for assessing varying
levels of mental retardation, especially in the younger
age groups.
Kaufman Assessment Battery for Children
The Kaufman Assessment Battery for Children (K-
ABC) (Kaufman and Kaufman, 1983a) was designed
for uses similar to those of the Wechsler and Stanford-
Binet tests. However, the K-ABC was designed from a
theoretical orientation which posits a distinction
between information that is processed via simultaneous
versus sequential processing (Kaufman and Kaufman,
1983b). Simultaneous processing is used on infor-
mation that is presented in its entirety or as a whole.
Sequential processing is used on temporal or succes-
sively presented information. In general, the Simultan-
eous subtests are visually presented perceptual tasks,
whereas the Sequential tasks are more likely to involve
verbal processing, memory, and/or sequential move-
ments. As such, it could be argued that the model does
not hold up particularly well in practice and that the
distinction across scales may be more modality- than
process-related.
The test, which is well-normed for ages 2 to 12½
years, generates a General Cognitive Index along with
separate scores for Simultaneous Processing, Sequential
Processing, and Achievement. These scores are stand-
ardized with a mean of 100 and SD of 15. Separate sub-
test scores for the Simultaneous and Sequential scales
are normed with means of 10 and SDs of 3, whereas the
Achievement scale subtests are normed with a mean of
100 and SD of 15. In addition to its distinct theoretical
orientation, the K-ABC differs from the Wechsler and
SB-IV scales in several ways. First, the items tend to be
more colorful, child-oriented, and engaging for chil-
dren. Second, an important difference exists in the
manner in which the subtests are presented. Similar to
other tests, each subtest is preceded by carefully worded
instructions for the child. Unlike the others, however,
the K-ABC manual instructs the examiner to use the
initial items to teach the child what he/she must do if
the task demands (as opposed to content of the material)
are not clearly understood. Third, the nature of the sub-
tests is different; they are more “neuropsychological-
like.” That is, they target more specific processes.
Finally, this test distinguishes itself from the WISC-III
and SB-IV in that its two cognitive scales (Sequential
and Simultaneous) require minimal language skills on
the part of the child. As such, scores are less likely to be
influenced by cultural and linguistic factors.
Two more recently developed Kaufman scales are the
Kaufman Adolescent and Adult Intelligence Test
(KAIT) (Kaufman and Kaufman, 1993) and the
Kaufman Brief Intelligence Test (K-BIT) (Kaufman and
Kaufman, 1990). The KAIT, which was developed for
ages 11 to 85 years, distinguishes between crystallized
and fluid abilities. Crystallized abilities are presumably
related to what one has learned, either through his/her
environment or through schooling, whereas fluid abil-
ities relate to one’s capacity to solve novel problems. To
date only limited research on the validity of this
instrument has been conducted. Yet one attribute of
this test is the nature of the subtests and items, which
are generally unique in character and of greater interest
to most adults than items on other intelligence tests.
The K-BIT is a screening instrument designed to
estimate intellectual functioning for ages 4 through 90
years. It is composed of a Vocabulary test, which is
divided into “Expressive Vocabulary” and “Definitions,”
and a separate “Matrices” subtest. The limited scope of
this test makes it less appealing as a clinical instrument.
Yet it may be useful for research studies in which a
quick assessment of overall cognitive function is needed
to characterize the sample.
Infant Assessment
Several of the tests described above are appropriate
for assessing preschool children, but none are adequate
for testing infants. The second edition of the Bayley
Scales of Infant Development (Bayley-II) (Bayley,
1993), which is the most commonly used test for assess-
ing infants (Wilson and Reschly, 1996), consists of
three subsections: Mental Scale, Motor Scale, and
Behavior Rating Scale. The Mental Scale assesses
responsivity to environmental stimulation, as well as an
array of sensory/perceptual, memory, learning, and
early language/communication abilities. The Motor
Scale assesses both gross and fine motor skills. The
Behavior Rating Scale is not an objective psychological
test, but rather a rating of several behaviors that the
clinician bases on information gathered from the parent
and from his/her own observations.
The Bayley-II has norms based on 1,700 children,
broken down into 17 different age groups (50 boys and
50 girls) between 1 and 42 months. The Mental and
Motor scales yield separate standardized scores with a
mean of 100 and SD of 15. The Behavior Rating Scale
yields a percentile score which gets translated into one
of three categories: Non-Optimal, Questionable, or
Within Normal Limits.
As mentioned earlier, the stability of cognitive func-
tion increases with age. As such, the predictive ability of
the Bayley-II is limited. This instrument should be used
to assess current developmental level, not to predict
later potential. Thus, for children within the “normal”
range this test provides only limited utility. However,
among the ever-increasing population of “high-risk”
children (due to pre- and/or perinatal complications,
substance abuse, prematurity), this instrument may be
of substantial value for assessing current function and
determining early intervention strategies.
Assessment of Mental Retardation
Over the past few decades, conceptual and political
changes have had substantial impact on the assessment
of mental retardation. Of particular relevance to this
review is the position that intelligence tests alone should
not be used to diagnose mental retardation. Rather, it is
essential that a measure of functional ability is used in
addition to an intelligence test. Although it is common
practice to consider a person scoring more than 2 SD
below the mean on an intelligence test as mentally
retarded, such individuals vary considerably in their
degree of functional impairment. Furthermore, most
intelligence tests have difficulties with floor effects. As
such, they lack sensitivity to varying degrees of mental
retardation.
Two instruments are particularly useful aids for the
assessment of mentally retarded people: the Vineland
Adaptive Behavior Scales (Sparrow et al., 1984) and the
second edition of the American Association on Mental
Retardation Adaptive Behavior Scales (Lambert et al.,
1993). These scales assess functional capacities in a wide
array of domains including daily living skills, com-
munication skills, and socialization. Several versions of
these scales exist, but most are administered as semi-
structured interviews to a caregiver. These scales have
good norms, reliability, and validity, and they are
generally far more useful for setting up a treatment plan
for mentally deficient people than are standard intel-
ligence tests.
Assessment of the Physically Handicapped
Increasing societal awareness of the needs of hand-
icapped persons has facilitated advances in psycho-
logical testing for handicapped individuals. Assessment
of hearing-impaired children is complicated not only
by their sensory loss, which can be dealt with by admin-
istering orally presented verbal items in written format,
but also by the language deficits that often accompany
early hearing loss. One approach to this complication is
to use performance-type subtests from various intelli-
gence tests. However, only limited research related to
the validity of these scales with this population has been
conducted (Maller and Braden, 1993). A more appro-
priate method may be through the use of the Hiskey-
Nebraska Test of Learning Aptitude (Hiskey, 1966).
This test was developed and standardized on samples of
hearing-impaired children and children with adequate
hearing, and it is normed for ages 3 to 17 years. The
test, which is untimed, assesses a wide range of cogni-
tive functions. Instructions are presented primarily
through the use of pantomime and practice exercises.
Assessment of visually impaired individuals is
commonly done through the use of verbal—and
elimination of performance—tests from standard
intelligence tests. For example, the Wechsler scales have
been modified for blind examinees through the elim-
ination of the Performance scale and the few Verbal
scale items that require sight. Several adaptations of the
Stanford-Binet have also been developed and normed
for visually impaired individuals. The most recent of
these is the Perkins-Binet Test of Intelligence for the
Blind (Davis, 1980).
Cross-Cultural Testing
The use of psychological testing with children from
diverse cultures has increased in recent years. Although
no test is free of all cultural influences, attempts have
been made to make culture-fair tests. These tests limit
or avoid completely the use of language, timing,
reading, and stimuli that may have greater familiarity in
one culture relative to another. As described above, the
K-ABC uses less language than the WISC-III and may
have greater validity with children of non-English-
speaking backgrounds. However, it still uses speed and
several stimuli characteristic of American/Western
cultures.
In contrast, the Leiter International Performance
Scale-Revised (Roid and Miller, 1997) is an untimed
test, which is normed for ages 2 to 20 years, and is
administered using essentially no verbal instructions.
Each set of items begins with a simple example which is
prompted through pantomime. This revised version
covers four domains of functioning: Reasoning, Visuali-
zation, Attention, and Memory. Unlike its predecessor,
the revised Leiter generates standard scores rather than
the cruder ratio IQ scores.
Another test that is relatively free of cultural biases is
the Ravens Progressive Matrices (Court and Ravens,
1995). This test comes in three forms, two of which are
appropriate for use with children: the Colored
Progressive Matrices (normed for ages 5½ to 11 years)
and the Standard Progressive Matrices (normed for ages
6 to 80 years). The Ravens Matrices are administered in
a multiple-choice format. The test begins with simple
visual discrimination and gradually moves to more
difficult perceptual analogies and reasoning problems.
The Ravens Matrices are untimed and can be admin-
istered using virtually no language.
ACADEMIC ACHIEVEMENT TESTS
Academic achievement tests have a wide range of
uses including the assignment of grades, identification
of special needs for remediation, and assessment of
progress. High-quality, group-administered general
achievement batteries are typically administered by
schools. Examples of these are the California Achieve-
ment Tests, the Comprehensive Tests of Basic Skills, the
Iowa Tests of Basic Skills, the Metropolitan Achieve-
ment Tests, and the Stanford Achievement Tests. These
tests have excellent norms and psychometric properties
and are often quite useful for identifying children with
educational deficits.
However, individually administered tests of academic
achievement are generally warranted for children with
cognitive, emotional, and/or learning problems because
these characteristics frequently have a negative impact
on the child’s performance within the group format.
Furthermore, the individualized assessment, which is
carefully structured and observed by the clinician, is
likely to provide a more detailed assessment of the
nature of the child's difficulties along with a profile of
strengths and weaknesses. Finally, individualized assess-
ments are particularly useful for determining the pres-
ence of a learning disability and for highlighting specific
achievement-related deficiencies that may be targeted
for treatment.
There are numerous well-normed individualized tests
of academic achievement. Three of the most popular
(Watkins et al., 1996; Wilson and Reschly, 1996) will be
reviewed here: the Wide Range Achievement Test 3
(WRAT3) (Wilkinson, 1993), the Wechsler Individual
Achievement Test (WIAT) (Psychological Corporation,
1992), and the Woodcock-Johnson Psychoeducational
Battery-Revised (Woodcock and Johnson, 1989).
The WRAT3 contains separate tests of reading/
decoding, spelling to dictation, and arithmetic. The test
is normed for ages 5 through 75 years and ranges in
difficulty level from preschool skills (e.g., recognizing/
naming letters, counting) through problems that are
beyond high school level. Unlike most other tests, the
WRAT3 contains matched forms of each test, making it
useful for retesting after remedial intervention. Whereas
the WRAT3 is generally adequate for assessing a child’s
level of function in the basic skills of decoding words,
spelling, and arithmetic, and thus for assessing the
presence of a learning disability, its narrow range of
focus is limiting with regard to elucidating more subtle
aspects of learning difficulties such as reading com-
prehension, language difficulties, and writing problems.
The WIAT is an individually administered achieve-
ment battery that was developed and normed (for ages
5 to 19 years) along with the WISC-III. This facilitates
the ability to draw comparisons between a child's
intellectual and academic functioning, which is an
important component of diagnosing most specific
developmental disorders as defined by DSM-IV. The
WIAT has two separate recommended formats for
administration. The WIAT Screener consists of three
tests: Basic Reading, Spelling, and Mathematics
Reasoning. The Screener, which can usually be admin-
istered in less than 20 minutes, provides limited infor-
mation similar to that generated by the WRAT3.
However, for a more comprehensive assessment of
academic achievement, the full WIAT contains
additional tests of Reading Comprehension, Numerical
Operations, Listening Comprehension, Oral Expres-
sion, and Written Expression. These latter tests, which
may be necessary only for children with known or sus-
pected learning disabilities, provide a picture of the
child's abilities in a wider range of academic domains.
Among the most comprehensive individually admin-
istered academic achievement batteries is the Woodcock-
Johnson Psychoeducational Battery-Revised. This test
battery, which was designed for ages 2 to 95 years, has
21 tests of “Cognitive Ability” and 18 Achievement
Tests. Cognitive factors assessed by the Woodcock-
Johnson include long- and short-term memory,
auditory and visual processing, processing speed, com-
prehension, and reasoning. Within the achievement
domains, tests assess not only level of functioning, but
also underlying processes such as word attack skills,
reading comprehension, letter-word identification, and
vocabulary. Similar subcomponents are assessed for
writing and math skills. As such, this test battery is
particularly useful for evaluating the underlying
component skills that go into academic competency.
Therefore, it can provide information necessary for
developing remedial plans.
NEUROPSYCHOLOGICAL TESTING
Neuropsychological testing assesses a wide array of
cognitive functions and interprets the data in the context
of a comprehensive understanding of brain-behavior
relationships. Approximately 25 years ago,
when neuropsychological testing began a rapid expansion
in popularity, the primary goal of testing was to
determine whether the patient had brain damage (e.g.,
“Is it organic?”) and, if so, which part of the brain was
damaged. Whereas this application is still used with certain
patient populations (e.g., closed head injury), this is
rarely the purpose of neuropsychological testing in child
and adolescent psychiatric patients. More often than
not, the goal of neuropsychological testing in children
and adolescents is to provide a detailed assessment of
the individual's cognitive functioning. Comparing performance
across tests allows areas of strengths and weaknesses
to be identified, and a comprehensive assessment
of how the individual encodes, processes, stores, and
outputs information is often provided. The data can
then be examined to determine the ways in which the
patient's “style” of information processing either impairs
functioning or can be modified to improve functioning.
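The idea of comparing performance across tests to identify relative strengths and weaknesses can be sketched as follows. The sketch assumes that each domain score has already been converted to a standard score (mean 100, SD 15) and compares each domain with the individual's own average. The domain names, the scores, and the 15-point threshold are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def profile(scores, threshold=15.0):
    """Label each domain relative to the individual's own mean score.

    threshold is a hypothetical cutoff (here 1 SD) for illustration.
    """
    personal_mean = mean(scores.values())
    labels = {}
    for domain, score in scores.items():
        deviation = score - personal_mean
        if deviation >= threshold:
            label = "relative strength"
        elif deviation <= -threshold:
            label = "relative weakness"
        else:
            label = "within personal range"
        labels[domain] = (deviation, label)
    return personal_mean, labels


# Hypothetical standard scores for one child.
example = {
    "language": 112,
    "visual-spatial": 104,
    "memory": 98,
    "attention/executive": 78,
}
personal_mean, labels = profile(example)
for domain, (deviation, label) in labels.items():
    print(f"{domain}: {deviation:+.0f} ({label})")
```

In this hypothetical profile only the attention/executive score falls far enough below the child's own average to be labeled a relative weakness. Any such pattern is a starting point for clinical interpretation, not a diagnosis.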
Although rarely diagnostic by themselves, neuropsy-
chological assessments may play a particularly useful
role in understanding the deficits in many child psychiatric
patients and in treatment planning. Neuropsychological
testing is most useful in patients with a
wide array of at least partially neurologically based disorders
such as learning disabilities, Tourette's disorder,
autism, pervasive developmental disorders, and
attention-deficit/hyperactivity disorder.
Many neuropsychologists use standardized neuropsychological
test batteries such as the Halstead-Reitan
(Reitan and Wolfson, 1993) or Luria-Nebraska Battery
(Golden et al., 1986). These test batteries have separate
versions for children and adults. A more recent
neuropsychological test battery, developed specifically
for children, is the NEPSY (Korkman et al.,
1997). The NEPSY, which is normed for ages 3 through
12 years, was designed to detect subtle deficits that inter-
fere with learning in five functional domains: language
and communication, sensorimotor functions, visual-
spatial abilities, learning and memory, and executive
functions. This latter domain includes functions such as
attention, planning, and problem solving.
The use of a standardized battery is likely to ensure
that the assessment is comprehensive with regard to the
breadth of domains assessed. In addition, normative
data for the individual tests that make up the battery are
usually adequate, and comprehensive manuals facilitate
interpretation of the test scores, which is generally done
via various pattern analyses. Yet many neuropsychologists
contend that the fixed format of test batteries
requires excessive testing in some domains, which may
not be necessary for certain patients, while lacking more
in-depth measures in other domains when needed.
Thus, they lack the flexibility to tailor the assessment to
the individual patient. Furthermore, they may not be
ideal for comprehensively assessing the process by which
individuals take in and use information.
Therefore, many neuropsychological evaluations
comprise a wide array of tests which are “hand-picked”
by the examiner. This approach may have several
advantages for a skilled neuropsychologist, with regard
to selecting the most appropriate test for assessing the
process of interest for an individual patient. However,
caution must be used because many smaller neuropsychological
tests do not have adequate norms, and the
reliability and validity may not have been adequately
assessed.
Whether using a standardized or “self-made” set of
tests, a comprehensive neuropsychological battery
generally assesses a wide array of sensory, perceptual,
linguistic, cognitive, motor, and executive functions.
Table 1 indicates the domains that are generally assessed
in a comprehensive neuropsychological assessment. In
addition, the table provides examples of several
commonly used tests within each functional domain.
PERSONALITY ASSESSMENT
Personality assessment in children and adolescents
involves several approaches including behavior rating
scales, self-report inventories, and projective techniques.
As discussed above, behavior rating scales and
self-report inventories differ from true psychological tests
and will not be reviewed here.
TABLE 1
Components of a Neuropsychological Evaluation
in Children and Examples of Tests
Overall cognitive function
    Standard intelligence tests
Motor function (fine, gross, apraxias)
    Purdue Pegboard
    Neurological Examination for Subtle Signs
Perception (visual, auditory, somatosensory)
    Hooper Visual Organization Test
    Motor-Free Visual Perception Test
    Two-point discrimination
    Wepman Auditory Discrimination Test
Visual-motor integration (construction, graphomotor)
    Bender-Gestalt
    Beery-Buktenica Developmental Test of Visual-Motor Integration
    Benton Visual Retention Test
Language
    Token Test for Children
    Peabody Picture Vocabulary Test
    Boston Naming Test
    Expressive One-Word Vocabulary Test
    Test of Language Development
Memory (long-term/short-term; verbal/visual; storage/retrieval)
    Wide Range Assessment of Memory and Learning
    Buschke Selective Reminding Test
    California Verbal Learning Test-Children's Version
Academic abilities
    Standardized achievement tests
Executive functions (attention, inhibitory control, planning,
organization)
    Stroop Color-Word Test
    Wisconsin Card Sort
    Trail Making
    Continuous Performance Tests
Note: A more complete compilation of neuropsychological tests,
along with detailed descriptions and normative data, can be found
in Lezak (1995) and Spreen and Strauss (1991).

Projective testing is based on the notion that, when
presented with a vague, unstructured, or ambiguous
stimulus or task, the production of the individual will
reflect aspects of the personality that might be otherwise
unavailable to consciousness or for assessment. In most
cases the examinee is unaware of what the examiner is
looking for, and thus the interpretation of the test is disguised
and less susceptible to faking. Yet this lack of
structure, which results in a nearly infinite number of
potential responses, creates psychometric problems for
most projective tests. In general, normative data are
sparse and interscorer reliability is problematic. Never-
theless, these are extremely popular tools for assessing
children. The most commonly used projective instruments
(Watkins et al., 1996; Wilson and Reschly, 1996)
fall under one of three categories: drawings, inkblot
techniques, and verbal/storytelling techniques.
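The interscorer reliability problem noted above can be made concrete with a small sketch. It computes Cohen's kappa, one common chance-corrected index of agreement between two scorers; the review itself does not name a specific statistic, so the choice of kappa is an assumption, and the ratings below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(scorer_a, scorer_b):
    """Chance-corrected agreement between two scorers' categorical codes."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("both scorers must rate the same non-empty set")
    n = len(scorer_a)
    # Proportion of responses on which the two scorers agree.
    observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n
    # Agreement expected by chance from each scorer's category rates.
    counts_a, counts_b = Counter(scorer_a), Counter(scorer_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for one indicator across 10 drawings.
scorer_a = ["present", "absent", "present", "present", "absent",
            "absent", "present", "absent", "present", "absent"]
scorer_b = ["present", "absent", "absent", "present", "absent",
            "present", "present", "absent", "present", "absent"]
print(round(cohens_kappa(scorer_a, scorer_b), 2))
```

Here the scorers agree on 8 of 10 drawings, but because half of that agreement would be expected by chance, kappa is about 0.6, noticeably lower than the raw 80% agreement suggests.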
Drawings
According to a survey by Wilson and Reschly (1996),
the three most commonly used projective techniques
are the Human Figure Drawing Test, the House-Tree-Person
Test, and the Kinetic Family Drawing. The
Human Figure Drawing Test (Koppitz, 1984), which is
standardized for children aged 5 to 12 years, is scored
for the presence of “emotional indicators.” While the
frequency, or aggregate, of emotional indicators has
been found to distinguish between normal and patient
populations (Naglieri and Pfeiffer, 1992), individual
indicators cannot be used for diagnostic purposes.
Furthermore, the data should only be interpreted in the
context of other clinical material. The House-Tree-Person
Test requires the child to produce separate
drawings of a house, tree, and person. Again, data
should be interpreted with caution and should be used
primarily to generate, not confirm, hypotheses about
the child. The Kinetic Family Drawing (Handler and
Habenicht, 1994) requires that the child draw a picture
of his/her family doing something together and is inter-
preted in terms of the distances between individuals and
the degree to which they are interacting.
Inkblot Techniques
The most popular inkblot technique is the Rorschach
(Watkins et al., 1996), which consists of 10 bilaterally
symmetrical inkblots. The lack of adequate assessments
of reliability and validity, as well as the absence of a
single clear procedure for administration and scoring,
led to a decline in confidence in the Rorschach through-
out the 1960s and 1970s. However, the development of
Exner’s (Exner and Weiner, 1994) Comprehensive
System for administration and scoring throughout the
past two decades has begun to reverse that trend. By
gleaning aspects of several previously described systems,
Exner's atheoretical Comprehensive System has begun
to apply modern psychometric procedures to the
Rorschach. There are now clear guidelines for admin-
istration and scoring, as well as normative data for chil-
dren and adults. Furthermore, several reliability and
validity studies have yielded favorable results. Although
Rorschach data must still be interpreted with caution
and should never be used in isolation for making
important decisions about diagnosis, etiology, or prognosis,
this newly revived test may provide useful data
regarding aspects of thinking, perception, and affective
responsivity in children.
Storytelling Techniques
Several techniques require the child to tell a story in
response to a picture. Two popular tests (Watkins et al.,
1996) are the Thematic Apperception Test (TAT)
(Bellak, 1993), which is reported to be applicable for
adults as well as children down to the age of 4 years,
and the Children’s Apperception Test (CAT) (Bellak,
1993), which was designed for children aged 3 to 10
years. Whereas the TAT consists of sets of black-and-white
pictures depicting various scenes, the CAT
uses cartoon-like pictures of animals in human situations
that relate to various developmental themes (e.g.,
toilet training, feeding, sibling rivalry). The task is for
the examinee to tell a story based on the picture. Despite
the common use of the TAT and CAT, few clinicians
use systematic procedures for administration (even
varying which cards they choose to present), and true
scoring of responses is rarely done. Rather, stories are
generally interpreted in the context of what is known
about the patient and inferences about social relation-
ships and interpersonal interactions are often made.
Because of the lack of standardized procedures and
objectivity in scoring, these results must be interpreted
with extreme caution.
CONCLUSIONS
Unlike other forms of clinical assessment, psychological
testing provides standardized and objective
measures of behavior that can be of considerable utility
for evaluating children and adolescents. While psy-
chological testing data, in isolation, will rarely be
adequate for providing a DSM diagnosis, the test data
are likely to provide important information about
intellectual/cognitive, academic, and personality characteristics
of the patient. When interpreted in the context
of other clinical information, these data are very
useful for developing a comprehensive treatment plan.
Furthermore, objective reevaluation using psychological
tests is of considerable utility for determining the
effectiveness of an ongoing treatment plan and for
modifying the intervention to coincide with the patient's
changing needs.
REFERENCES
Anastasi A, Urbina S (1997), Psychological Testing, 7th ed. Upper Saddle
  River, NJ: Prentice Hall
Bayley N (1993), Bayley Scales of Infant Development, 2nd ed: Manual. San
  Antonio, TX: Psychological Corporation
Bellak L (1993), The TAT, CAT, and SAT in Clinical Use, 5th ed. Boston:
  Allyn & Bacon
Court JH, Raven J (1995), Manual for the Raven Progressive Matrices and
  Vocabulary Scales. Oxford, England: Oxford Psychologists Press
Davis CJ (1980), Perkins-Binet Tests of Intelligence for the Blind. Watertown,
  MA: Perkins School for the Blind
Exner JE Jr, Weiner IB (1994), The Rorschach: A Comprehensive System, Vol
  3: Assessment of Children and Adolescents, 2nd ed. New York: Wiley
Golden CJ, Purisch AD, Hammeke TA (1986), Luria-Nebraska
  Neuropsychological Battery: Forms I and II Manual. Los Angeles: Western
  Psychological Services
Handler L, Habenicht D (1994), The Kinetic Family Drawing technique: a
  review of the literature. J Pers Assess 62:440-464
Hiskey MS (1966), The Hiskey-Nebraska Test of Learning Aptitude. Lincoln,
  NE: Union College Press
*Kamphaus RW, Frick PJ (1996), Clinical Assessment of Child and Adolescent
  Personality and Behavior. Boston: Allyn & Bacon
Kaufman AS, Kaufman NL (1983a), Kaufman Assessment Battery for
  Children: Administration and Scoring Manual. Circle Pines, MN:
  American Guidance Service
Kaufman AS, Kaufman NL (1983b), Kaufman Assessment Battery for Children:
  Interpretive Manual. Circle Pines, MN: American Guidance Service
Kaufman AS, Kaufman NL (1990), Kaufman Brief Intelligence Test: Manual.
  Circle Pines, MN: American Guidance Service
Kaufman AS, Kaufman NL (1993), Kaufman Adolescent and Adult
  Intelligence Test: Manual. Circle Pines, MN: American Guidance Service
Koppitz EM (1984), Psychological Evaluation of Human Figure Drawings by
  Middle School Pupils. Orlando, FL: Grune & Stratton
Korkman M, Kirk U, Kemp S (1997), NEPSY. San Antonio, TX:
  Psychological Corporation
Lambert N, Nihira K, Leland H (1993), AAMR Adaptive Behavior Scale-
  School, 2nd ed: Examiner's Manual. Austin, TX: PRO-ED
Laurent J, Swerdlik M, Ryburn M (1992), Review of validity research on the
  Stanford-Binet Intelligence Scale: Fourth Edition. Psychol Assess 4:102-112
Lezak MD (1995), Neuropsychological Assessment, 3rd ed. New York: Oxford
  University Press
Maller SJ, Braden JP (1993), The construct and criterion-related validity of
  the WISC-III with deaf adolescents. J Psychoeduc Assess (WISC-III
  Monograph):105-113
Naglieri JA, Pfeiffer SI (1992), Performance of disruptive behavior
  disordered and normal samples on the Draw A Person: Screening Procedure
  for Emotional Disturbance. Psychol Assess 4:156-159
Psychological Corporation (1992), Wechsler Individual Achievement Test:
  Manual. San Antonio, TX: Psychological Corporation
Reitan RM, Wolfson D (1993), The Halstead-Reitan Neuropsychological Test
  Battery: Theory and Clinical Interpretation, 2nd ed. Tucson, AZ:
  Neuropsychology Press
Roid GH, Miller LJ (1997), Examiner's Manual for the Leiter International
  Performance Scale-Revised. Wood Dale, IL: Stoelting
Sparrow SS, Balla DA, Cicchetti DV (1984), Vineland Adaptive Behavior
  Scales: Interview Edition, Expanded Form Manual. Circle Pines, MN:
  American Guidance Service
*Spreen O, Strauss E (1991), A Compendium of Neuropsychological Tests:
  Administration, Norms, and Commentary. New York: Oxford University Press
Thorndike RL, Hagen EP, Sattler JM (1986), The Stanford-Binet Intelligence
  Scale, 4th ed: Technical Manual. Chicago: Riverside
Watkins CE Jr, Campbell VL, Nieberding R, Hallmark R (1996),
  Comparative practice of psychological assessment. Prof Psychol Res Pract
  27:316-318
Wechsler D (1989), Wechsler Preschool and Primary Scale of Intelligence-
  Revised: Manual. San Antonio, TX: Psychological Corporation
Wechsler D (1991), Wechsler Intelligence Scale for Children-Third Edition:
  Manual. San Antonio, TX: Psychological Corporation
Wechsler D (1997), Wechsler Adult Intelligence Scale-Third Edition (WAIS-
  III): Administration and Scoring Manual. San Antonio, TX: Psychological
  Corporation
Wilkinson GS (1993), Wide Range Achievement Test 3: Administration
  Manual. Wilmington, DE: Wide Range, Inc
Wilson MS, Reschly DJ (1996), Assessment in school psychology training
  and practice. School Psychol Rev 25:9-23
Woodcock RW, Johnson MB (1989), Woodcock-Johnson Psychoeducational
  Battery-Revised. Allen, TX: DLM Teaching Resources