Brain and Language 80, 488–509 (2002)

doi:10.1006/brln.2001.2610, available online at http://www.idealibrary.com on

Novel Metaphors Appear Anomalous at Least Momentarily: Evidence from N400

Vivien C. Tartter, Hilary Gomes, Boris Dubrovsky, Sophie Molholm,
and Rosemarie Vala Stewart
City College of the City University of New York
Published online February 11, 2002

This study addresses a central question in perception of novel figurative language: whether
it is interpreted intelligently and figuratively immediately, or only after a literal interpretation
fails. Eighty sentence frames that could plausibly end with a literal, truly anomalous, or figurative word were created. After validation for meaningfulness and figurativeness, the 240 sentences were presented to 11 subjects for event-related potential (ERP) recording. The ERP's first
200 ms is believed to reflect the structuring of the input; the prominence of a dip at around
400 ms (N400) is said to relate inversely to how expected a word is. Results showed no
difference between anomalous and metaphoric ERPs in the early window, metaphoric and
literal ERPs converging 300–500 ms after the ending, and significant N400s only for anomalous endings. A follow-up study showed that the metaphoric endings were less frequent (in
standardized word norms) than were the anomalous and literal endings and that there were
significant differences in cloze probabilities (determined from 24 new subjects) among the
three ending types: literal > metaphoric > anomalous. It is possible that the low frequency
of the metaphoric element and lower cloze probability of the anomalous one contributed to
the processes reflected in the early window, while the incongruity and near-zero cloze probability of the anomalous endings produced an N400 effect in them alone. The structure or parse
derived for metaphor during the early window appears to yield a preliminary interpretation
suggesting anomaly, while semantic analysis reflected in the later window renders a plausible
figurative interpretation. © 2002 Elsevier Science (USA)
Key Words: figurative language; metaphor; anomaly; N400; standard pragmatic model;
cloze probability; semantic processing; selectional restrictions.

By definition, figurative language can be taken in two ways: literally and often
anomalously, as when we call a person a rock, and creatively (if the figure is novel),
abstracting across the literal meanings of the component words. A central question
in understanding how we comprehend figurative language is whether both the literal
and figurative meanings of the figure are immediately activated or if one of those
meanings is normally achieved preferentially. The standard pragmatic model (cf.
Gibbs & Gerrig, 1989; Glucksberg, 1991) assumes that we attempt a creative interpretation only after recognizing the nonsense of the literal interpretation. Alternatively,
constructivist approaches (e.g., Gibbs, 1994) assume that we arrive at the sensible
interpretation guided by context and shared conventions and presumptions, noticing
the literal interpretation only if the figure of speech is highlighted.
Hilary Gomes and Sophie Molholm are also at the Albert Einstein College of Medicine.
Address correspondence and reprint requests to Vivien C. Tartter, Psychology Department, City College of CUNY, 138th St. at Convent Avenue, New York, NY 10031. Fax: (212) 650-5865. E-mail:
VickyT@aol.com.
0093-934X/02 $35.00
2002 Elsevier Science (USA)
All rights reserved.

METAPHOR, ANOMALY AND N400

The standard pragmatic model grew out of generative semantics, in which understanding of words in combination was assumed to derive from a logical conjunction
of general features: Understanding "a canary has feathers" entails understanding
that canaries are birds; that birds have feathers; and by implication, that canaries are
feathered. Understanding which meaning of "bill" (money, debt, or bird mouth)
pertains in a particular phrase depends on whether other bird features occur in that
phrase; understanding that "colorless green" is anomalous depends on recognizing
the contradiction of "green," marked as color, and "colorless," marked as not-color
(Katz, 1990). Following the principles of logic connecting semantic features (e.g.,
"person" is animate and "rock" is inanimate), metaphor yields an anomaly which
might trigger some special meaning-creating process. The standard pragmatic model
contrasts metaphor and simile, since in simile the figurative nature of the language
is indicated explicitly using "like" or "as" (as in "The person is like a rock").
Simile is therefore hypothesized to be understood more immediately, with the conjunction triggering the special process. Indeed, one proposal has been that metaphors
are understood by conversion mentally to similes, spurring the special process.
The standard pragmatic model suggests that figurative language should be slower
to understand than literal language, since it requires a two-step process. It further
suggests for metaphor that the literal meaning would be derived before the figurative
one. And it suggests that similes should be easier and faster to understand than metaphors. A number of experiments have demonstrated that these implications are false.
First, given appropriate context, figurative sentences take no longer than literal
sentences to verify (see, e.g., Ortony, 1980). Second, metaphors are no harder to
comprehend than similes, and subjects do not feel that metaphor and simile are meaning-identical, indicating that they are not translating one into the other (Gibbs,
1994). Third, McElree and Nordlie (1999) have shown that meaningfulness decisions
for literal and figurative strings of words are made following the same course, suggesting that the interpretation of figurative strings does not require first a literal parse.
Finally, figurative interpretations seem to be derived automatically, not secondarily,
and can interfere with comprehending literal meaning. Glucksberg (1989; Glucksberg,
Gildea, & Bookin, 1982) found that subjects are slower at verifying sentences based
on literal meaning if there is also a figurative meaning: It takes longer to decide that
"some jobs are jails" is literally false than it does to decide so for "some desks are
melons." This suggests that the figurative meaning is created naturally along with
the literal meaning, not as a secondary process.
While there is evidence that figurative interpretation is a normal process not secondary to literal interpretation, we must note that good figurative language does not
feel normal. Verbrugge and McCarrell (1977) describe original figures of speech
as creating tension, as do jokes, relieved by the insight into the metaphor's meaning.
The insight is what makes metaphors fun. Johnson (1980) proposes that the tension arises from the apparent contradiction, and the fun arises from the construction of the common ground relating the metaphor topic (roughly the subject) to the
metaphor vehicle (how you get from the topic to the ground, usually the predicate
of the metaphor). If such constructive processes happen in understanding literal sentences, they do not produce as strong a sensation of tension and relief.
It seems quite possible that the tension arises from semiawareness of the anomaly,
and the relief, from its resolution. This is not to say that it is the perception of anomaly
that triggers a search for figurative meaning (this Glucksberg's research belies) although it could for new, creative metaphors. Rather, it could be that several interpretations are derived in parallel, that the anomaly of the literal meaning is noted, and that
the simultaneous perception of anomaly and meaning is enjoyable. Both alternatives
suggest that the appreciation of figurative language derives from perceiving potential

TARTTER ET AL.

anomaly, contrary to the conclusion of much of the current research on metaphor
comprehension.
The current study tests specifically for the possibility that anomaly is recognized
at some stage in the understanding of a metaphor. To this end, we recorded event-related potentials (ERPs) from subjects as they read novel metaphors. ERPs provide
a very powerful technique for studying the temporal and spatial patterns of cortical
electrical activity during stimulus processing. We were particularly interested in the
N400, a component of ERPs which is sensitive to semantic relatedness and expectancy. Typically, the N400 (a negative-going peak about 400 ms after the target word
which generally has a central parietal maximum) elicited by a specific word is smaller
when the word is expected and related to the semantic context than when it is less
expected and less related to the context. This difference in amplitude is referred to
as the N400 priming effect (Friederici, 1997; Kutas & Hillyard, 1980). We assumed
that in semantic processing of a metaphor if there is a point before common ground
is constructed when the vehicle of the metaphor is seen as incongruous, there should
be a more negative N400 than for a literal sentence (an N400 effect).
We note that ours is not the first study of ERPs and metaphor. Pynte, Besson,
Robichon, and Poli (1996) reported N400s for very short common French metaphors
(e.g., "those fighters are lions" and "those apprentices are jars" [clumsy]), literal
sentences, and unfamiliar metaphors, created by scrambling topic and vehicle from
the familiar ones (e.g., "those apprentices are lions"). Across experiments the metaphors were presented alone or with a preceding sentence that contextualized the metaphor appropriately (e.g., "They are not cowardly: Those fighters are lions") or not
(e.g., "They are not naive: Those apprentices are lions"). Pynte et al. found a reliable
N400 effect even for the common figures of speech and a greater one for the novel
figures of the same form. Unsurprisingly, providing an inappropriate context increased the size of the N400 for both kinds of sentences. It is important to note,
though, that Pynte et al. included no anomalous conditions. Since overall context
(i.e., the interrelationships among the stimuli in the set) has been shown to affect the
N400, at least for pairs of associated or unassociated words (Brown, Hagoort, &
Chwilla, 1996; Deacon, Breton, Ritter, & Vaughan, 1991; Holcomb, 1988), results
from a stimulus set including no truly anomalous sentences are difficult to interpret.
Relative to the literal sentences the metaphors would be more anomalous, but we
have no way of knowing whether they are processed as anomalously as are truly
anomalous sentences.
Moreover, while their N400 results suggest that the common metaphors were
viewed as anomalous, it is not clear that they were in fact viewed as figurative:
Subjects were simply asked to read the sentences for comprehension and their brain
waves were recorded; they were not asked to paraphrase the meanings, so we do not
know how they interpreted the figurative ones. In addition, the sentences were all
short and of a given form, contexts unlike those of natural figurative language and
therefore unlikely to provoke the shared conventions for figurative speech. Finally,
since the figures of speech in this study were all common, i.e., dead metaphors,
their comprehension could require recognition, rather than construction, of the common ground. Thus this study does not really tap into the process of understanding
figures of speech or show whether it entails a moment when the anomaly of the literal
reading surfaces.
The present study is designed to record N400s in unfamiliar figures of speech,
rated as good metaphors and understood metaphorically by subjects. We hypothesize
that if subjects automatically use context to construct the ground, new but good figurative sentences should not yield a more negative N400 (no larger an N400) than do
new, good literal sentences. On the other hand, if the anomalous literal meaning of
a new figurative sentence is recognized at some point, there should be a reliable N400
effect. We include in our test materials both literal and truly anomalous sentences
to provide anchor points for judging the magnitude of the N400 and to mitigate context effects.
PRELIMINARY STUDY: STIMULUS VALIDATION

Our purpose is to examine ERPs during the processing of metaphors, comparing
them to those occurring in understanding literal and anomalous sentences. Unlike
earlier research, we were interested in looking at novel, creative metaphors, which
we hoped would produce on first encounter the experience of tension that other researchers have described. It was thus necessary to create new metaphors and ensure
that subjects would indeed perceive them figuratively and as good metaphors. We
also needed sets of control sentences (literal and truly anomalous versions) that
naive subjects would perceive as such. The first experiment therefore measured subjects' ratings of candidate sentences on meaningfulness and figurativeness scales
to produce valid sets of stimuli.
Methods
Sentence preparation. Eighty metaphors were created such that (1) each sentence expressed sufficient
context to be appropriately interpreted with no additional information needed, (2) the figure was provided
in only one word, and (3) the figurative word was the last word in the sentence. The last two requirements
derived from the need to have a clear starting point for assessing N400, which should be present approximately 400 ms after the onset of the last word in the sentence. Once the metaphors were created, literal
and anomalous versions were produced by altering the last word. So, for example, the metaphor "His
face was contorted by an angry cloud" was made literal by substituting the word "frown" for "cloud"
and was rendered anomalous by substituting the word "map" for "cloud."
The Appendix displays the ultimate sets of sentences. As indicated, the metaphors are not all original.
Some were borrowed from the published literature on metaphor: For example, "the teenager's face was
a coral reef" was a version of a sentence used by Radencich and Baldwin (1985). Others were created
by consulting poems or noting figures of speech appearing in the newspaper, conversation, and so on.
Subjects. Seven monolingual native American English speakers attending college or graduate school
were volunteers for this experiment.
Procedure. Three blocks of sentences were created from the 240 stimuli. Each block contained all
80 sentence frames, each with only one ending. Endings were pseudorandomly assigned to a block so
that there would be roughly equal numbers of literal, anomalous, and metaphorical sentences in each
(27 of two types and 26 of the third in each block). Across the three blocks each sentence was presented
with all three endings. The order of the sentences was separately randomized in each block.
Each subject was asked to rate two blocks of the sentences, each on one of two 5-point scales. One
scale assessed Meaningfulness (1 = completely nonsensical to 5 = very meaningful) and the other
assessed Figurativeness (1 = strictly literal to 5 = strictly metaphorical). Thus each sentence frame
was seen twice by each participant, once in one group and once in another group, with one ending being
rated on one scale and another ending being rated on the other scale. Each ending was rated on each
scale by between one and three of the subjects. The order of the scales and the blocks of sentences were
roughly counterbalanced across the seven subjects.
Sentences were presented to the subjects in written format for the subjects to take home and rate at
their leisure. When the forms were given to the subjects, written and oral instructions were provided,
and any questions the subject had were answered. For the meaningfulness scale, the written instructions
were as follows: "Please rate each of the following sentences on the meaningfulness scale (1 = completely nonsensical; 5 = very meaningful). For example, the sentence 'All reserve materials must be
eaten within the library' would be rated as 1 (completely nonsensical), and the sentence 'American
society does not prioritize sleep as an important health factor' would be rated as 5 (very meaningful).
Circle your answer." For the metaphor scale, the instructions read as follows: "Please rate each of the
following sentences on the metaphor scale (1 = strictly literal; 5 = strictly metaphorical). For example,
the sentence 'Cigarette smoke contains carbon monoxide' is rated 1 (strictly literal) and the sentence
'Smoking causes dandruff of the lungs' is rated 5 (strictly metaphorical). Circle your answer."

Results
For each scale, the ratings of all participants on a given ending were averaged.
For a sentence triad to be successful, the metaphor sentence was to score high (an
average of 3.5 or more) on metaphoricalness and meaningfulness, the literal sentence
was to score low on metaphoricalness (less than 2.5) and high on meaningfulness
(more than 3.5), and the anomalous sentence was to score low on meaningfulness
(less than 2.5). We anticipated that the anomalous sentences might be rated highly
metaphoric, since subjects had to select between metaphoric and literal, and
the anomalous sentences were clearly not literal.
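
As an illustrative aside, the inclusion criteria just described can be written as a simple check. This is a minimal sketch rather than the procedure as actually carried out; the class and field names are hypothetical, and only the thresholds (3.5 and 2.5) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class TriadRatings:
    """Mean ratings (5-point scales) for one sentence frame; names are illustrative."""
    metaphor_meaningful: float
    metaphor_figurative: float
    literal_meaningful: float
    literal_figurative: float
    anomalous_meaningful: float


def triad_passes(r: TriadRatings) -> bool:
    """Apply the stated criteria to one triad of endings."""
    # Metaphor: 3.5 or more on both scales.
    metaphor_ok = r.metaphor_figurative >= 3.5 and r.metaphor_meaningful >= 3.5
    # Literal: below 2.5 on figurativeness, above 3.5 on meaningfulness.
    literal_ok = r.literal_figurative < 2.5 and r.literal_meaningful > 3.5
    # Anomalous: below 2.5 on meaningfulness (figurativeness is not constrained).
    anomalous_ok = r.anomalous_meaningful < 2.5
    return metaphor_ok and literal_ok and anomalous_ok
```

Note that the anomalous ending is deliberately left unconstrained on the figurativeness scale, in line with the expectation that such endings might be rated as highly metaphoric.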
Sixty-four of the original 80 triads met these criteria. For the criteria-meeting sentences, the literal endings were rated as 4.6 ± .9 on the meaningfulness scale and
1.6 ± .8 on the metaphoric scale. The anomalous sentences were rated as 1.8 ± .9
on the meaningfulness scale and, as we had anticipated, 4.3 ± .7 on the metaphoric
scale. Finally, and most importantly, the metaphoric sentences were rated both metaphoric (3.5 ± 1.0) and meaningful (3.7 ± 1.3).
A score for each sentence ending on each scale was computed by averaging subject
ratings for the 64 sentences that met the criteria. These were submitted to a one-way
repeated- (across sentence frame) measures analysis of variance assessing the effect
of ending type (literal vs metaphoric vs anomalous). Both scales showed highly significant differences [for meaningfulness, F(2, 62) = 145.94, p < .0005; for metaphoricalness, F(2, 62) = 258.22, p < .0005]. Post hoc t tests (p < .002) showed that the
anomalous endings were rated as significantly less meaningful and more metaphorical
than the literal and metaphorical endings, and the metaphorical sentences were rated
as significantly less meaningful and more metaphorical than the literal sentences.
(We note that McElree and Nordlie, 1999, also found that subjects judged figurative
sentences as less meaningful than literal ones; the absence of the fuller appropriate
pragmatic context of natural use may be responsible.)
Of the 16 triads that did not pass muster, 10 were because the anomalous ending
was rated as meaningful, and this ending was changed. For 4 of the 16 sentences the
anomalous endings were also changed, although they had been rated as unmeaningful
by the subjects, because the authors judged them as too easily interpreted. On one
sentence all three endings were rated as both metaphorical and meaningful, and so
these were changed. A final sentence was changed after the ratings to maintain the
same phrase structure, the article "an" introducing the noun phrase across all three
versions. The new endings were validated by consensus of all the authors and four
new subjects.
ERP Experiment

Methods
Subjects. Eleven paid volunteers served as subjects. Their average age was 26 years. All subjects
were right-handed fluent English speakers, either monolingual or having learned English by the age of
five years. By self-report all subjects were healthy with normal or corrected-to-normal vision.
Stimuli. The stimuli were produced and evaluated as described in the previous study. The sentences
employed are displayed in the Appendix.
The stimuli were presented visually on a dark computer screen typed in uppercase, 1.20-cm-tall, white
characters. A sentence frame with dots indicating the position of the final word, or ending, was presented
first, and then the final word was displayed. Both the sentence frames and their final words were centered
on the screen. The final words subtended approximately 0.87° vertically and, on the average, 5.52°
horizontally.
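
The subtended sizes reported above can be checked against the character height and the viewing distance given in the Procedure (31 in.). The following is a minimal sketch, assuming that the reported values are degrees of visual angle and that the standard formula 2·atan(size / (2·distance)) applies; the function names are illustrative.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_from_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Inverse relation: physical size implied by a visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


distance_cm = 31 * 2.54                        # 31 in. viewing distance
vertical = visual_angle_deg(1.20, distance_cm)  # 1.20-cm-tall characters
mean_width_cm = size_from_angle_cm(5.52, distance_cm)

print(f"vertical angle: {vertical:.2f} degrees")
print(f"implied mean word width: {mean_width_cm:.1f} cm")
```

Under these assumptions the 1.20-cm characters subtend about 0.87 degrees, matching the reported vertical value, and the mean horizontal angle corresponds to a word roughly 7.6 cm wide on the screen.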
Preceding the display of the frame a small green square indicating the focal point appeared on the
screen. It remained on the screen until the subject pressed a response button, triggering the sentence
frame 1 s later. Another button press indicated that the subject had read the frame, and 500 ms later
the ending completed the sentence. The final word was displayed for 1 s, and the green square reappeared
2 s later.

FIG. 1. The configuration of recording electrodes viewed from the top of the head.

ERP recording. The recordings were obtained using a 31-channel electrode cap that incorporated
a subset of the International 10-10 system (new and modified combinatorial nomenclature, American
Electroencephalographic Society, 1990): Fpz, Fz, Cz, Pz, Oz, FP1, FP2, F7, F8, F3, F4, FC1, FC2, FC5,
FC6, T7, T8, C3, C4, CP1, CP2, CP5, CP6, P7, P8, P3, P4, O1, O2, LM (left mastoid), and RM (right
mastoid). Electrode placements are displayed in Fig. 1.
Vertical eye movements were recorded with a bipolar configuration of FP1 and an electrode below
the left eye. Horizontal eye movements were recorded at F7 and F8. A left earlobe electrode served as
the reference and an electrode at PO9 served as the ground. All impedances were maintained below 5
kΩ. Amplifiers with a gain of 30,000 and filter settings of .05–100 Hz were used. The continuous
EEG for all channels was monitored during the recordings so that any problems with the electrodes
could be identified and feedback about excessive motor movement could be given. The total recording
epoch was 1100 ms including a prestimulus interval of 100 ms. The digitization rate was 256 Hz. Each
epoch was baseline-corrected across the entire sweep before artifact rejecting and averaging. The averages from each block were baseline-corrected again using the average amplitude of the prestimulus
portion of the epoch. Artifact reject levels were set at ±100 µV for all electrodes to exclude blinks
and movement artifacts. Individual block averages were visually examined for residual artifact.
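
The sequence of steps described for each block (whole-sweep baseline correction, artifact rejection, averaging, and a second baseline correction on the prestimulus interval) can be sketched for a single channel as follows. This is an illustration under stated assumptions, not the recording software actually used: the function names are hypothetical, epochs are plain lists of microvolt values, and the rejection criterion is assumed to apply to the absolute amplitude after the first correction.

```python
from statistics import fmean

SAMPLE_RATE_HZ = 256   # digitization rate given in the text
EPOCH_MS = 1100        # total epoch, including the prestimulus interval
PRESTIM_MS = 100       # prestimulus interval
REJECT_UV = 100.0      # artifact criterion of +/-100 microvolts


def ms_to_samples(ms: float) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * SAMPLE_RATE_HZ / 1000)


def demean(values, reference):
    """Subtract the mean of `reference` from every value in `values`."""
    offset = fmean(reference)
    return [v - offset for v in values]


def block_average(epochs):
    """Return (average waveform, number of epochs kept) for one channel."""
    n_pre = ms_to_samples(PRESTIM_MS)
    kept = []
    for epoch in epochs:
        corrected = demean(epoch, epoch)  # baseline across the entire sweep
        if max(abs(v) for v in corrected) <= REJECT_UV:
            kept.append(corrected)        # epoch survives artifact rejection
    if not kept:
        raise ValueError("all epochs were rejected")
    average = [fmean(samples) for samples in zip(*kept)]
    # Second correction: re-reference the average to its prestimulus mean.
    return demean(average, average[:n_pre]), len(kept)
```

At 256 Hz the 1100-ms epoch corresponds to roughly 282 samples, of which about 26 fall in the prestimulus interval.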
Design. The 240 sentences constituted 3 blocks of 80. For a first group, 27 frames were randomly
selected to be used with literal endings, 26 frames were selected from the remainder to be used with
metaphorical endings, and the final set was used with anomalous endings. For a second group the sentence
frames in the first group that had been used with literal endings were now used with anomalous endings,
the ones that had had metaphorical endings now ended literally, and the ones with anomalous endings
ended metaphorically. A third group was created by using the frames with the endings not used in the
other two groups. Each group was then divided in half (yielding six half-groups) and sentences within
it were pseudorandomized with three restrictions: (1) no more than two sentences of the same category
appeared in a row; (2) both halves of each group contained near-equal (13 or 14) sentences of each
category; and (3) for three pairs of sentences the frames of which were structurally or semantically
similar (e.g., 20 and 21 in the Appendix), one member only of each pair appeared in a given half.

Presentation of all six half-groups was administered to each subject, with order determined by a Latin
square counterbalancing the three pairs of blocks with order within each pair counterbalanced between
paired subjects.
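
The rotation of endings across the three groups can be made concrete with a short sketch. It assumes frames are identified by the integers 1 to 80 and uses a seeded random split into 27, 26, and 27 frames; the function name and the seed are illustrative, and the further division into half-groups and the ordering restrictions are not modeled here.

```python
import random

CONDITIONS = ("literal", "metaphoric", "anomalous")

# Rotation described in the text for the second group.
ROTATE = {"literal": "anomalous", "metaphoric": "literal", "anomalous": "metaphoric"}


def build_groups(n_frames: int = 80, seed: int = 0):
    """Return three dicts mapping frame number -> ending type."""
    rng = random.Random(seed)
    frames = list(range(1, n_frames + 1))
    rng.shuffle(frames)

    first = {}
    for frame in frames[:27]:
        first[frame] = "literal"        # 27 frames with literal endings
    for frame in frames[27:53]:
        first[frame] = "metaphoric"     # 26 frames with metaphorical endings
    for frame in frames[53:]:
        first[frame] = "anomalous"      # remaining 27 frames

    second = {frame: ROTATE[cond] for frame, cond in first.items()}
    # Third group: whichever ending has not yet been used with the frame.
    third = {
        frame: next(c for c in CONDITIONS if c not in (first[frame], second[frame]))
        for frame in frames
    }
    return [first, second, third]
```

With this scheme every frame appears once with each ending across the three groups, and no group contains the same frame twice.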
Procedure. Subjects were tested individually. Upon signing the consent form and filling out the
personal information sheet, each subject was fitted with an electrode cap and seated comfortably in a
reclining chair in a normally illuminated, sound-attenuated room in front of a monitor. The distance
between the center of the screen and the subject's right eyeball was set at the beginning of the session
to 31 in.
The subjects were told that a small green square would appear in the center of the screen as
an indication that a sentence was ready. They were instructed to press a response button to display
the sentence frame and that when they had completed reading the frame, they should press the response button to display the sentence end. They were asked to try to stay still during the recording and
to move and blink only when the green square was displayed. Fifteen practice sentences were presented,
as the subjects were monitored for conformity to the instructions. No subject required a second practice
run.
The subjects were also instructed that after each block they would be given a recognition task, for
which they would have to choose from among several sentences those presented during that block. The
recognition task was necessary to ensure that subjects were in fact reading the sentences. For the recognition test, one literal, one anomalous, and one metaphorical sentence were randomly selected from each
half of each block and then pseudorandomly mixed with six foils, two from each semantic category
(sentences that were experimental in other blocks), such that no more than two experimental or two
foil sentences and no more than two sentences from each semantic category appeared in a row.
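
One simple way to obtain an ordering that satisfies run-length restrictions of this kind is to reshuffle until the constraints hold. The sketch below is an assumption about how such pseudorandomization could be done, not a description of the method actually used; the function names and the tuple representation of sentences are hypothetical.

```python
import random


def max_run(labels) -> int:
    """Length of the longest run of identical consecutive labels."""
    longest = run = 0
    previous = object()  # sentinel that equals no label
    for label in labels:
        run = run + 1 if label == previous else 1
        previous = label
        longest = max(longest, run)
    return longest


def constrained_shuffle(items, keys, max_in_row=2, seed=None, max_tries=10000):
    """Shuffle `items` until no key function yields a run longer than `max_in_row`."""
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(max_run([key(x) for x in order]) <= max_in_row for key in keys):
            return order
    raise RuntimeError("no ordering satisfying the constraints was found")


# Nine recognition items: three experimental sentences and six foils,
# represented here as (status, semantic category) pairs.
categories = ("literal", "metaphoric", "anomalous")
items = [("experimental", c) for c in categories] + [("foil", c) for c in categories * 2]
order = constrained_shuffle(
    items, keys=[lambda x: x[0], lambda x: x[1]], max_in_row=2, seed=3
)
```

Rejection sampling is adequate for short lists such as these; for much longer lists or tighter constraints a constructive method would be preferable.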
The subjects were not alerted to the different semantic categories of the experimental sentences. All
but one subject completed the six experimental runs and following recognition tasks in one sitting; the
remaining subject took a 5-min break between the third and fourth blocks.
After the trial was complete, subjects were disconnected from the recording computer and allowed
to rest for 10–15 min. Then each subject was seated in a separate room, given a tape recorder and printed
list of the 240 sentences [in a different random order (the same for all subjects), created in the same
manner as for the experimental trials], and asked to state the number of each sentence and to restate in
his/her own words the meaning of the sentence.
The entire procedure took 3 to 4 h: 1 h for hook-up, 1 h for the six blocks, 1 h for paraphrasing, and
up to 30 min for breaks.
Data analysis. As a preliminary step in the data analysis, two judges scored the subjects' paraphrases
in order to determine whether the sentences were perceived by the subjects as expected. Each paraphrase
was categorized as constituting a literal, anomalous, or metaphoric interpretation of the original. The
judges had to know what the stimulus sentences were in order to judge the paraphrase. Consequently,
they were aware of the category to which each original sentence was assigned. In rare cases of disagreement, the judges discussed the paraphrase in question until an agreement was reached. For the final
analysis, only those sentence triads were chosen for which none of the sentences in the triad was paraphrased incorrectly by more than four subjects. This procedure yielded 55 sentence triads.
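
The exclusion rule can be stated compactly in code. In this minimal sketch the input is assumed to be, for each sentence frame, the number of subjects whose paraphrase of each ending was judged incorrect; the function name and the example counts are hypothetical and are not data from the study.

```python
def select_triads(misparaphrase_counts, max_subjects=4):
    """Keep frames for which no ending was misparaphrased by more than `max_subjects`."""
    return [
        frame
        for frame, counts in misparaphrase_counts.items()
        if all(n <= max_subjects for n in counts.values())
    ]


# Hypothetical counts for two frames (not taken from the study).
example_counts = {
    1: {"literal": 0, "metaphoric": 4, "anomalous": 2},  # retained
    2: {"literal": 0, "metaphoric": 5, "anomalous": 1},  # dropped: 5 > 4
}
retained = select_triads(example_counts)
```

Because the rule operates on whole triads, a single poorly paraphrased ending removes all three versions of that frame from the analysis.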
For the 55 triads that were selected for the final analysis, ERPs time-locked to the onset of the final
word were averaged for each subject within each semantic category. The data were then averaged across
subjects and plotted as a function of time elapsed after the onset of the final word, with each of the
three semantic conditions shown separately. Examination of the grand averages indicated that the waveform elicited by the literal sentences began separating from those elicited by the metaphoric and the
anomalous sentences approximately 160 ms after the onset of the final word (see Figs. 2 and 3). Further,
the waveforms elicited by the anomalous and the metaphoric sentences began separating from each other
at approximately 280 ms after the final word. Consequently, average amplitude was measured in two
windows that reflected the peaks of these separations: 200–300 ms (early) and 300–500 ms (late). Individual ERP data were measured for the two windows separately for literal, anomalous, and metaphor
conditions, resulting in six average amplitude measures per subject. The effect of semantic condition
on average amplitude at Fz, Cz, Pz, FC1, FC2, CP1, CP2, C3, and C4 was tested separately for the
early and late windows using two-way repeated-measures analyses of variance (electrode × sentence
condition). These electrode sites were chosen because N400 is typically found to be largest in the regions
they represent. Post hoc analyses of variance were used to compare pairs of sentence types.
As a second step in the analysis, the effect of semantic category on the late window was further
evaluated, controlling for the amplitude of the waveform in the early period. Each subject's average
amplitude in the early window was subtracted from his or her average amplitude in the later window
for the respective semantic condition. The difference amplitudes thus obtained were then tested for an
effect of semantic conditions using a two-way analysis of variance (electrode × sentence condition).
The electrode sites used in this analysis were the same as above. Post hoc analyses of variance were
again used to compare pairs of sentence types.
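
The two window measures and the late-minus-early difference can be sketched as follows for a single averaged waveform. The sketch assumes that the waveform is a list of microvolt values sampled at 256 Hz and beginning 100 ms before word onset, as described for the recording epoch; the function names are illustrative.

```python
from statistics import fmean

SAMPLE_RATE_HZ = 256   # digitization rate given in the text
PRESTIM_MS = 100       # the epoch starts 100 ms before word onset


def window_mean(waveform, start_ms: float, end_ms: float) -> float:
    """Mean amplitude between two poststimulus latencies (in ms)."""
    first = round((start_ms + PRESTIM_MS) * SAMPLE_RATE_HZ / 1000)
    last = round((end_ms + PRESTIM_MS) * SAMPLE_RATE_HZ / 1000)
    return fmean(waveform[first:last])


def early_late_difference(waveform):
    """Return (early mean, late mean, late minus early) for one waveform."""
    early = window_mean(waveform, 200, 300)
    late = window_mean(waveform, 300, 500)
    return early, late, late - early
```

Applied to the grand-mean values listed for Cz in Tables 1 and 2, the same subtraction gives 6.5 − 5.4 = 1.1 for literal endings, 4.0 − 3.3 = 0.7 for metaphoric endings, and 2.0 − 3.6 = −1.6 for anomalous endings.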

FIG. 2. Average amplitudes at Cz for literal (thick), anomalous (dashed), and metaphoric (thin)
sentences at early and late processing points.

FIG. 3. Grand mean ERPs elicited by the final words in the sentences at all electrode sites. The
thin lines are the ERPs elicited in the literal condition, the dashed lines are the ERPs elicited in the
metaphoric condition, and the thick lines are the ERPs elicited in the anomalous condition. In this and
all subsequent figures, stimuli were presented at time zero and marks on the x axis represent 100-ms
steps. Waveforms in this and all subsequent figures were filtered at 30 Hz for clarity of display.

The topography of the early and late effects was also examined and compared across sentence type.
For this analysis, the average amplitudes of the difference waves (anomalous minus literal and metaphoric
minus literal) for the early and late windows were scaled according to the procedure described in McCarthy and Wood (1985). Electrodes were then grouped by region: anterior (Fpz, Fz, Fp1, Fp2, F3, and
F4), central (Cz, FC1, FC2, C3, C4, CP1, and CP2), posterior (Pz, Oz, P3, P4, O1, and O2), lateral left
(F7, FC5, T7, CP5, P7, and LM), and lateral right (F8, FC6, T8, CP6, P8, and RM). The average scaled
amplitudes for each region were submitted to a three-way repeated measures analysis of variance. Factors
for this analysis were region (five levels), difference waveform (anomalous/literal difference vs
metaphor/literal difference), and time window (early vs late).
An α level of .05 was used for all statistical tests. The Geisser–Greenhouse procedure was used to
correct the degrees of freedom and p value for the respective F test. Only corrected degrees of freedom
and p values are reported.

Results
The grand average waveforms elicited by the three different endings are presented
in Fig. 3. The waveforms are similar to those reported in other N400 studies (for
example, Kutas, Lindamood, & Hillyard, 1984), except that the amplitude of the N400
in the anomalous condition is not large enough to offset the early positive shift and
pull the waveform below the baseline (also see Federmeier & Kutas, 1999; Kutas &
Iragui, 1998). N400, which generally has a central parietal maximum (Friederici,
1997), is seen for each of the three conditions at most electrode sites depicted as the
large negative-going deflection peaking between 390 and 430 ms.
An examination of the ERPs in Fig. 3 suggests that the ERPs elicited by the final
words in the anomalous and metaphoric sentences begin to differ in amplitude from
those elicited by the final word in the literal sentences at approximately 160 ms and
remain separated for the rest of the epoch. The ERPs elicited by the anomalous and
metaphoric endings, while similar during early processing, begin to diverge at approximately 280 ms. These observations are supported by the mean amplitudes at
Fz, Cz, Pz, FC1, FC2, CP1, CP2, C3, and C4 in the early (200-300 ms) and late
(300-500 ms) windows depicted in Tables 1 and 2, respectively.
Statistical analyses also support these observations. An analysis of variance comparing the average amplitudes in the early window at nine electrode sites found a
significant effect of Semantic Condition [F(1, 10) = 10.8, p < .01]. Neither electrode
nor the interaction was significant. The amplitude of the waveform in the early window elicited by the literal endings was significantly more positive than those elicited by the anomalous or metaphoric endings [F(1, 10) = 16.02, p < .005, and
F(1, 10) = 15.49, p < .005, respectively], with the latter two not being different from

TABLE 1
Average Mean Amplitude (in Milliseconds) and Standard Deviations (in Parentheses) for the Early Window (200-300 ms) for All Electrode Locations

Electrode   Literal      Metaphoric   Anomalous
Fz          5.4 (4.2)    3.3 (3.5)    3.6 (3.2)
Cz          5.4 (4.1)    3.3 (3.5)    3.6 (3.7)
Pz          3.6 (3.4)    2.0 (2.9)    2.0 (3.6)
FC1         5.4 (3.9)    3.5 (3.5)    3.5 (3.1)
FC2         5.9 (4.3)    3.7 (3.3)    4.1 (3.3)
CP1         4.7 (3.7)    2.8 (3.1)    3.0 (3.6)
CP2         5.1 (3.8)    2.9 (3.0)    3.1 (3.5)
C3          4.6 (3.2)    3.1 (2.9)    3.1 (3.0)
C4          5.6 (3.6)    3.5 (2.8)    3.8 (3.1)

TABLE 2
Average Mean Amplitude (in Milliseconds) and Standard Deviations (in Parentheses) for the Late Window (300-500 ms) for All Electrode Locations

Electrode   Literal      Metaphoric   Anomalous
Fz          6.9 (5.1)    4.6 (4.4)    2.9 (3.8)
Cz          6.5 (4.6)    4.0 (4.2)    2.0 (3.9)
Pz          5.1 (3.4)    3.0 (3.3)    1.4 (3.2)
FC1         6.5 (4.5)    4.0 (4.2)    2.2 (3.9)
FC2         7.1 (4.8)    4.8 (4.0)    2.8 (3.8)
CP1         5.9 (3.5)    3.4 (3.6)    1.9 (3.6)
CP2         6.0 (4.0)    3.6 (3.6)    1.9 (3.5)
C3          5.6 (3.2)    3.5 (3.4)    1.8 (3.6)
C4          6.7 (3.9)    4.2 (3.7)    2.5 (3.3)

each other. An analysis of variance comparing the amplitudes in the later time window also found a significant effect of Semantic Condition [F(1, 10) = 13.6, p <
.005]. Again, neither electrode nor the interaction was significant. Post hoc analyses
indicated significant differences among all three sentence types. Thus, anomalous
endings resulted in significantly more negative deflections in the later window than
did literal or metaphoric endings [F(1, 10) = 15.36, p < .005 and F(1, 10) = 10.04,
p = .01, respectively], and metaphorical endings produced a significantly more negative response than did literal ones [F(1, 10) = 11.80, p < .01]. That is, there is a
greater N400 effect for anomalous sentences than for metaphorical sentences. Moreover, if only the late time window region of the ERPs is examined, it also seems
that there is a greater N400 for metaphorical than literal sentences.
However, the amplitudes of the ERPs for the literal and metaphorical sentences
differ greatly in the region of the early time window after which the waveforms
follow a parallel trajectory. Thus the apparent N400 effect in the later time window
for metaphor may be attributable to the increased negativity in the early region. The
change in average amplitude from the early to the late time region is relatively similar
for the metaphoric and literal endings, suggesting that the difference in amplitude
between these two conditions in the later window is due to the earlier negativity. In
contrast, the amplitude separation between the waveforms elicited by the anomalous
and the literal endings continues to increase from the early time window to the late
time window.
Therefore, the effect of semantic category on the later time window encompassing
the N400 was further evaluated, controlling for the amplitude in the early time window by using a measure of the difference in amplitude between early and late periods.
An analysis of variance comparing these difference amplitudes found a significant
effect of type of ending [F(1, 10) = 6.94, p < .05]. Post hoc tests indicated no
difference in amplitude in the later time region between the literal and metaphorical
conditions [F(1, 10) = 0.41, ns] when the amplitude of the waveform in the early
window was controlled. However, the increased difference in amplitude of the late
window (N400) elicited by the anomalous ending was significantly greater than that
elicited by either literal or metaphor endings [F(1, 10) = 9.39, p < .025 and
F(1, 10) = 10.99, p < .01, respectively].
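The early-to-late difference measure can be illustrated with the grand-mean values printed in Tables 1 and 2. The sketch below is not the analysis itself, which was run on per-subject amplitudes that are not reproduced here. It only shows the direction of the change for each ending type, and the variable and function names are hypothetical.

```python
# Grand-mean amplitudes at Fz, Cz, Pz, FC1, FC2, CP1, CP2, C3, C4 (in that order).
EARLY = {  # Table 1, 200-300 ms
    "literal":    [5.4, 5.4, 3.6, 5.4, 5.9, 4.7, 5.1, 4.6, 5.6],
    "metaphoric": [3.3, 3.3, 2.0, 3.5, 3.7, 2.8, 2.9, 3.1, 3.5],
    "anomalous":  [3.6, 3.6, 2.0, 3.5, 4.1, 3.0, 3.1, 3.1, 3.8],
}
LATE = {  # Table 2, 300-500 ms
    "literal":    [6.9, 6.5, 5.1, 6.5, 7.1, 5.9, 6.0, 5.6, 6.7],
    "metaphoric": [4.6, 4.0, 3.0, 4.0, 4.8, 3.4, 3.6, 3.5, 4.2],
    "anomalous":  [2.9, 2.0, 1.4, 2.2, 2.8, 1.9, 1.9, 1.8, 2.5],
}


def mean_change(condition):
    """Mean (late minus early) amplitude across the nine electrodes."""
    diffs = [late - early for early, late in zip(EARLY[condition], LATE[condition])]
    return sum(diffs) / len(diffs)


changes = {condition: round(mean_change(condition), 2) for condition in EARLY}
# changes == {"literal": 1.18, "metaphoric": 0.78, "anomalous": -1.16}
```

On these tabled means, the literal and metaphoric waveforms both become more positive from the early to the late window, whereas the anomalous waveform becomes more negative, which matches the pattern the difference analysis reports.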
Topography of the early and late (N400) effects. The grand average difference
waveforms derived by subtracting the waveforms elicited in the literal condition from
those elicited in the anomalous (thick line) and the metaphoric (thin line) conditions

FIG. 4. Grand mean difference waveforms constructed by subtraction of the ERPs elicited in the
literal condition from those elicited in the metaphoric (thin lines) and anomalous conditions (thick lines)
at Fpz, Fz, Cz, Pz, Oz, FC5, CP5, FC6, and CP6.

are depicted in Fig. 4 for a subset of the electrodes to illustrate the topography. Both
the early and late effects were largest at the central electrodes and somewhat smaller
at the right and left lateral, anterior, and posterior electrodes. Further, for the anomalous sentences, the effect in the late time window appeared to be somewhat larger
on the right than on the left. Both the central maximum and the right-sided bias are
consistent with the topography reported for the N400 effect in the literature (Federmeier & Kutas, 1999; Friederici, 1997; Kutas & Van Petten, 1994). However, the
topographic effects were small and found to be nonsignificant.
DISCUSSION

The present experiment was designed to determine whether novel metaphors would
elicit an N400 effect as do anomalous sentences, indicating perhaps that, initially,
figurative language is seen as anomalous, following which a constructive process
allowing figurative interpretation is triggered. We obtained similar ERP results for
metaphoric and anomalous sentences only during an early epoch not generally associated with semantic analysis (Friederici, 1997), after which the metaphoric sentences
produced ERPs following similar trajectories to those produced by literal sentences.
Most critically, in the later time window associated with N400, a pronounced negative
deflection which usually signals incongruity was observed relative to the amplitudes
in the early window only for the anomalies.
Friederici (1997) has identified a window in the vicinity of N200 (encompassed
by our early epoch) as reflecting a primarily syntactic process, with the window
around N400 attributed to lexical-semantic processing. The divergence in the early
window of literal from metaphoric and anomalous sentences in our study (all of
which were syntactically well-formed) suggests, if Friederici is correct, that some
aspect of this syntactic process is sensitive to selectional constraints among words,
a possibility consistent with Chomsky's (1965) Extended Standard Theory of syntax.
Thus the early window also reflects a form of lexical processing, with perhaps a
fuller semantic analysis indicated by the later window activity, where, in our study,
metaphor converges with literal interpretation and diverges from anomaly.
It is important to note, however, that the N400 effect has been obtained with word
pairs, which do not constitute a phrase or clause to which a parse or semantic analysis
could be applied. Thus, to some extent the N400 may reflect simply the expectancy
of a particular lexical item, where expectancy is determined either by lexical parameters such as word frequency or by a fuller linguistic context. Before making inferences
on the nature of the syntactic and semantic processing underlying literal, metaphoric,
and anomalous sentences as reflected by our ERP results, we need to determine the
degree to which our endings differ in lexical parameters related to expectancy.
FOLLOW-UP EXPERIMENT

The present study was undertaken to determine if novel metaphors are processed
as are literal sentences, with sense constructed from the onset, or as are anomalous
sentences, with initial surprise at the metaphoric element, followed when possible
by, or perhaps even triggering, construction of figurative meaning. To test this we
put together a list of novel metaphors, some created and some taken from the literature
on metaphor processing, and from them derived literal and anomalous sentences by
changing the metaphoric element appropriately. We then rated the set of sentences
for metaphoricalness and meaningfulness, yielding a stimulus set that could be read
by subjects while we recorded their ERPs. ERP analyses revealed similar processing
for anomalous and metaphoric sentences in an early window, but their divergence
at N400, when the metaphoric sentences, like the literal sentences, showed a positive
deflection.
The literature on N400 suggests that a pronounced negative deflection for anomalous elements is caused by their incongruity or the resulting failure at constructing
a sensible interpretation. To some extent both congruity and ease of construction of
interpretation relate to predictability: The final word in "For breakfast I had bacon
and socks" is not predictable given both probability in the language and the semantic
interpretation constructed by the preceding context.
Apart from anomaly, there are other variables that lead to low predictability of a
word in context. For example, statistically, low frequency words should be less expected than high frequency words, and so if unpredictability underlies the N400 effect
we would expect a greater effect for low than high frequency words. As a second
example, consider the predictability of different words within a particular context,
the cloze probability. "A stitch in time saves ten" is as sensible as "a stitch in time
saves nine," but is much less predicted by context. If the N400 effect arises from a
simple discrepancy between the word expected and the word supplied, and if anomalous endings are less expected than literal or metaphoric ones, there could be a greater
N400 effect for the anomalies independent of the ability to construct sense for them.
Indeed Kutas, Lindamood, and Hillyard (1984) demonstrated that the size of the
N400 varies inversely with cloze probability.
Therefore, to follow up on our ERP results we further analyzed our stimuli, determining the expectedness of our ending elements as a function of word frequency
and cloze probability.

Methods
Subjects. For the cloze portion of the test a French-English native bilingual examined the stimuli
of Pynte et al. (1996) and then created sentences using metaphors as well-known to English speakers
as the French ones would have been to the French speakers in the Pynte et al. study. The English and
English-translated French metaphors were informally evaluated by several of her bilingual friends and
deemed equivalently trite.
Twenty-four City College students, native or near-native English speakers (began speaking English
before the age of 5, with all their schooling in English), were recruited primarily from psychology classes
for the cloze experiment. They received course credit or were paid for their participation.
Materials. For the frequency analysis, the frequency of the final word for each semantic condition
was computed from the Kucera and Francis (1967) word norms.
For the cloze analysis, 40 trite metaphors were created so that (a) the metaphor would be familiar
to English speakers, and (b) like the experimental sentences, the metaphor would hinge on only one,
the final, word. Examples of these are "Despite her good grades, my roommate is a complete airhead";
"Wherever she goes, Donna spreads sunshine"; and "Please open a window; this room is a sauna."
The 40 new metaphors and the 80 original sentence frames were randomized together into two different
random orders. The final word of the sentence was replaced by dashes, and, if a noun, the preceding
article was rendered as "a(n)." The two random orders were printed into booklets.
Procedure. For the cloze experiment, subjects were tested for the most part in small groups. Each
subject received a booklet, with about half the subjects tested in each order. The booklet opened with
the written instructions (modified minimally from Bloom & Fischler, 1980):
On the following pages are 120 sentences each with the final word left blank. Your task is simply
to read each sentence at your normal rate, and write down the word that first occurs to you as a
likely end of that sentence. For example, if the sentence frame were, "The party did not end
until ______," possible responses might include dawn, three, late, midnight, and
so forth. Don't try to be unique or average; just be natural. You should keep within the following
bounds however: (1) only one response word per sentence; (2) the word should make sense
of the sentence, and be from an appropriate class of words (nouns, verbs, adjectives, etc.); (3)
English words only; (4) try to avoid repetitions. For some of the sentences, the response will seem
obvious; for others, any number of words will seem possible. This is of course intentional since
we are interested in the whole range of sentence constraints.

Results
To calculate cloze probability for each word, the number of times that word was
provided by subjects for that sentence frame was calculated and then divided by 24
(the number of subjects). For each of the three endings for the 80 sentence frames
used in the ERP experiment, both frequency and cloze probability are displayed in
the Appendix, with the item.
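The cloze calculation described above amounts to counting how many of the 24 subjects produced a given word for a frame and dividing by 24. A minimal sketch follows. The function name and the response list are invented for illustration and are not the study's raw data.

```python
from collections import Counter


def cloze_probabilities(responses, n_subjects=24):
    """Map each response word to the proportion of subjects who supplied it."""
    counts = Counter(word.strip().lower() for word in responses)
    return {word: count / n_subjects for word, count in counts.items()}


# Hypothetical responses from 24 subjects to a single sentence frame.
responses = ["grass"] * 18 + ["lawn"] * 4 + ["flowers"] * 2
probs = cloze_probabilities(responses)

# An ending that no subject produced has a cloze probability of 0.
target_cloze = probs.get("carpet", 0.0)
```

With these invented responses, an ending supplied by 18 of the 24 subjects receives a cloze probability of .75, and any ending that was never supplied receives 0.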
Word frequency. Table 3 displays the average frequency for the 55 literal, metaphoric, and anomalous endings of the sentences analyzed in the ERP experiment.
A one-way repeated-measures (the sentence frame yoked a particular literal, metaphoric, and anomalous word) analysis of variance corrected for sphericity showed a
significant effect of semantic condition [F(1, 54) = 6.31, p < .002]. Post hoc tests
showed that the metaphoric endings were drawn from a less frequently occurring
TABLE 3
Mean Word Frequency (as Derived from the Kucera and Francis, 1967, Word Norms) for the 55 Literal, Metaphoric, and Anomalous Endings to the Sentences Analyzed in the ERP Experiment

Literal   Metaphoric   Anomalous
60.21     10.36        41.73

TABLE 4
Mean Cloze Probability (as Determined from 24 Subject Responses) for the 55 Literal, Metaphoric, and Anomalous Endings to the Sentences Analyzed in the ERP Experiment, as Well as to 40 Sentence Frames Designed to Have Trite Metaphoric Endings

Literal   Metaphoric   Trite Metaphoric   Anomalous
.19       .017         .227               .0005

sample than were the literal or anomalous endings, which were not significantly different from one another.
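A one-way repeated-measures analysis of variance with sentence frames as the yoking unit can be sketched as below. The data are a toy example, not the study's frequencies, and the function names are hypothetical. The df correction shown is the lower-bound (conservative) form, epsilon = 1/(k - 1). That choice is an assumption: it is an inference from the reported degrees of freedom (1, 54) for 55 frames and three conditions, not something the text states.

```python
def rm_anova_f(data):
    """One-way repeated-measures F; rows are frames, columns are conditions."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    row_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_error = ss_total - ss_cond - ss_rows  # condition-by-frame residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


def lower_bound_df(df_cond, df_error, k):
    """Assumed correction: multiply both df by epsilon = 1 / (k - 1)."""
    epsilon = 1.0 / (k - 1)
    return df_cond * epsilon, df_error * epsilon


# Toy data: three frames by three conditions.
toy = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
f_value, df1, df2 = rm_anova_f(toy)

# For 55 frames and 3 conditions: uncorrected df (2, 108) become (1.0, 54.0).
corrected = lower_bound_df(2, 108, 3)
```

The F ratio itself is unchanged by the correction; only the degrees of freedom used to evaluate it are reduced.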
Cloze probabilities. Table 4 displays the mean cloze probabilities for the sentences used in the ERP study, along with the average cloze probability for the trite
metaphors created for this study for comparison purposes. A repeated-measures analysis of variance as was conducted for frequency, comparing the three endings used
in the ERP experiment, showed a significant effect of semantic condition [F(1, 54) =
37.54, p < .001], with post hoc tests revealing a higher cloze probability for the
literal endings than for either of the other two, as well as a higher cloze probability
for metaphoric endings than for anomalous ones. A t test for independent samples
with unequal variance comparing the cloze probabilities of the test metaphors with
those of the trite ones showed significantly less predictability of the test sentences
[t(39) = 4.89, p < .005].
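A t test for independent samples with unequal variance is commonly computed as Welch's test. The sketch below assumes that is the form meant here. The two samples are toy numbers, not the cloze data, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy samples for illustration only.
t_value, df = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Because the degrees of freedom depend on the two sample variances, they generally fall below the pooled value of n1 + n2 - 2 and need not be an integer.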
Discussion
The purpose of the follow-up experiment was to determine the word variables that
might have differentiated our stimuli, apart from their metaphoricalness or meaningfulness. Confirming our design, our metaphorical endings were shown to be significantly less predictable than a foil set of metaphors created to be comparable to the
set of metaphors that Pynte et al. had used for French speakers. Thus, our ERP test,
while replicating Pynte et al.'s general finding of a smaller (less positive) N400 for
metaphoric than literal sentences, is actually a quite different result: (1) Our test used
more creative, unfamiliar metaphors and (2) our test did not show a significant N400
effect, given the baseline of the early window.
The follow-up results demonstrated that apart from meaningfulness and metaphoricalness, our endings differed in more mundane semantic measures: the metaphoric
endings had a lower word frequency than did the other two ending types and a lower
cloze probability than did the literal endings; the anomalous endings had a significantly smaller, near-0 cloze probability. In some sense, these results are quite reasonable given the way the stimuli were derived. First, one would expect a near-zero
cloze probability for anomalous endings, since subjects in the cloze experiment were
asked to "make sense," and in selecting the anomalous ending, the experimenters
had the opposite purpose. Likewise, since subjects in the cloze experiment were instructed not to try to be unique (or average), they would not be seeking a poetic
creative-metaphor ending, in contrast to the selection process of the experimenters
designing the stimuli. Second, in creating the anomalous and literal sentences the
experimenters began with the metaphoric sentence, found in the literature or spontaneously occurring and therefore relatively unconstrained, and then thought of alternative endings. The endings that are most likely to come to mind are words that are

more available or codable, closer to the "top of cognitive deck" (Brown, 1958, p.
236), which are more often high frequency than low frequency words.
However, the finding of semantic variables, apart from meaningfulness and metaphoricalness, covarying with them in our study, mandates interpretation. It may be
that the deviance for the literal sentences in the early window of the ERPs from
those of the metaphoric and anomalous sentences reflects an ease of construction of
interpretation and structure, determined not by a literal process per se, but by the
availability of the words and their predictable fit to context, both of which may be
part and parcel of normal, literal language interpretation. The greater N400 effect
for the anomalous words as compared to the metaphoric and literal words may reflect
the near-zero predictability for anomaly, the extreme deviance from expectedness,
rather than, or in addition to, the difficulty in creating a sensible semantic interpretation (see also Kutas et al., 1984). It is more difficult to isolate an effect of word
frequency per se on the ERP waveforms. The metaphoric words were of a lower
frequency class, yet their waveforms grouped with those of the anomalous words in
the early window, and with those of the literal words in the later window. So, if
word frequency affected the waveforms, it did so in conjunction with some other
variable(s).

GENERAL DISCUSSION

The present studies were designed to determine whether good novel metaphoric
sentences are interpreted using constructive processes in common with literal sentences or whether their novel, literally anomalous meaning is noted first, perhaps
triggering a special figurative process as suggested by the standard pragmatic theory.
To this end we created, collected, had rated, weeded, and edited sets of sentences
that could then be confidently considered as metaphoric, literal, or anomalous. We
then measured event-related potentials (N400) believed to reflect recognition of
anomaly, a presumably semantic process, for these sentences in new subjects. Those
subjects were also given a recognition task to ensure that they were reading the sentences and a paraphrasing task to ensure that they interpreted them as we intended
and as the results from the rating study suggested they would. Finally, we examined
the frequency of occurrence in English of our final words and their cloze probabilities
with respect to the sentence frames.
Our ERP results suggest that the question of whether figurative language is perceived first as anomalous cannot be answered with a simple yes-no. The results
clearly show a divergence in processing of anomalous sentences from literal and
metaphoric sentences, a larger N400 effect for the anomalous sentences. In the region
of N400, metaphoric sentences also differ from literal sentences, suggesting at first
glance that they too may be perceived as more anomalous. However, the ERP results
also showed a significant difference in the early epoch among the three sentence
types, with literal and metaphoric sentences diverging in an early window encompassing N200, and thereafter following parallel trajectories. Thus, when the amplitudes in the early epoch were controlled for, there was no N400 effect for the metaphors.
Friederici (1997) has identified the N200 window with preliminary sentence structuring: "an initial syntactic structure is assigned to the incoming information on the
basis of word category information alone: during this stage, incoming words are
structured into phrases (noun phrase, verb phrase), and grammatical roles (subject,
object) are considered" (p. 64). Since in the current study, across semantic conditions, endings associated with the same frame shared part of speech, and yet we found

a difference, we suggest that more than word category information and grammatical
role figure into this initial structuring. One aspect that may figure in is word accessibility: Our metaphoric endings were significantly less frequently occurring according
to the Kucera and Francis (1967) word norms than were our other endings. A second
aspect that may come into play was suggested intriguingly many years ago by Chomsky (1965) in his Extended Standard Theory of syntax, a suggestion for a syntactic
operation that goes beyond phrase structure assignment. In this theory, word category
information is incorporated in "strict subcategorization rules" and word selection
is governed by "selectional restrictions." The latter ensured, for example, that if an
animate word was required for the frame, one would be selected. If, as Chomsky
proposed, selectional restrictions operate in the assignment of structure, a metaphoric
sentence like "the camel is a desert taxi" would contain a violation, since "camel"
would select for an animate object.
Thus, it seems quite possible that a structure could be completely assigned in the
early epoch only for the literal sentences, not for the others. While for each frame
all three sentence types should be assigned the same phrase marker, for the literal
sentences only would a preliminary check of selectional features yield a match and
certainty. If Friederici is correct that processing in the region of the N200 reflects
syntactic or structure processes, the separation we obtained for literal sentences from
anomalous and metaphoric ones in this window suggests that early structure processes
entail examination of selectional restrictions, resurrecting this aspect of the Extended
Standard Theory of generative grammar.
If the ERPs for literal sentences remained different from those for metaphor and
anomaly we could conclude that the standard pragmatic theory for figurative language
is correct: that a literal interpretation is first attempted, and only when rejected is
a figurative interpretation tried. However, in fact, our ERP amplitudes for metaphoric
and literal sentences were statistically equivalent in the region of the N400 showing
similar relative changes from the early to the late window encompassing N400, the
component identified with anomaly detection by a considerable body of research (for
a review, see Kutas & Van Petten, 1994). In this later epoch, in Friederici's (1997)
framework, lexical-semantic processes are operating. Following this framework, our
results suggest that after initial assignment of structure, including that based on selectional features, semantic processes, perhaps constructive, kick in to assign interpretations to both literal and metaphoric sentences. In contrast, for anomalous sentences,
an interpretation cannot be assigned, resulting in a large N400 effect for them alone.
As we discussed, ours is not the first study of N400 and metaphor. Pynte et al.
(1996) measured N400s for common French metaphoric expressions. They reported
a significant N400 effect for these, a result which ours appears to contradict. The
contradiction is more remarkable given that their study used familiar metaphors and
ours used novel ones, which one might expect would produce more surprise and
therefore greater apparent anomaly. Indeed, metaphoric sentences in English which
we created to be similarly trite to the French ones used by Pynte et al. showed a
significantly greater cloze probability than did our more original metaphors. Our
interest in metaphors and ERPs was to try to tap the constructive process in language
understanding, a process that could be unnecessary for common, already interpreted,
expressions. And in this regard, we must emphasize that (1) we used only sentences
validated by independent subjects as metaphoric/literal, anomalous/meaningful appropriate to the categories literal, metaphoric, and anomalous; and (2) subjects from whom we recorded ERPs later paraphrased the sentences and for the most
part interpreted them in accordance with our preassigned categories (91% of the sentences were correctly paraphrased by a half or more of the subjects).
So why might Pynte et al. have found a significant N400 effect and we not for

metaphoric sentences? First, we measured the ERP amplitude in the region of the
N400 relative to the ERP amplitude in an earlier region, and used the difference as
our measure of negativity; they compared the amplitudes of N400 across sentence
types. We found a significant N400 effect only for anomalous sentences. They found
that N400 was more negative for their metaphoric than their literal sentences, which
is not to say that the effect was specific to this time window. Second, our study
employed anomalous and literal sentences as controls and randomized all three sentence types in each block, so expectation of a particular sentence type would be
neutralized. In their first experiment they included 50% literal sentences, 25% familiar metaphors, and 25% unfamiliar metaphors (these were endings for familiar metaphors matched with wrong frames and may in fact have been uninterpretable,
anomalous, in the minds of the subjects). For word pairs, it has been shown that
list probability and context affect N400 (Brown, Hagoort, & Chwilla, 1996; Deacon,
Breton, Ritter, & Vaughan, 1991; Holcomb, 1988). If this holds true also for sentences, subjects expecting literal sentences (because they were twice as likely as either
of the others) could show an N400 effect for either of the others because they were
relatively unexpected. Thus the N400 they obtained could be an artifact of their design, not a sign that metaphor is initially processed as anomaly. (Similar list probabilities that could affect the N400 in and of themselves exist in their other experiments.)
Nevertheless, we concur with Pynte et al., that metaphoric sentences are not processed
as are literal sentences, but we locate the difference in the earlier epoch, which they
did not report. We disagree that metaphors are processed as are anomalies and have
ERP measures for anomalous sentences in the same subjects who read the metaphors
to substantiate this position.
With regard to list probabilities affecting the size of the N400, it is important to
note that our metaphors, while creative and interpretable (and in many cases used
in previous studies in the literature on metaphor processing), differ from our anomalous and literal endings in predictability. The literal and anomalous endings were
drawn from a more frequently occurring-in-the-language sample, and the literal endings were provided in a cloze experiment as a filler for the frame more often than
were the metaphoric words, while the anomalous ones were offered as endings to
the frames significantly less often, indeed almost never. While frequency does not
seem to have a clear impact by itself on the activity in either the early or the late
windows, the cloze probabilities could be responsible for the relative amplitudes in
the different semantic conditions in the later window, at N400: The lower the cloze
probability the more negative the N400, a result consistent with the findings of Kutas
et al. (1984). However, what produces a particular word selection for a sentence
frame in the cloze task is a combination of syntactic and semantic fit. So, given the
full sentence context, what we believe we see reflected in both the cloze probabilities
of production and the ERPs of comprehension is the implementation by the late window of a constructive semantic interpretation and a parse for both the metaphors and
the literal sentences, but not for the anomalies.
In sum, we have demonstrated that metaphoric sentences are neither fish nor
fowl: they are processed differently from literal sentences in the early structure-assigning epoch, and they are processed differently from anomalies in the region of
the N400, when semantic interpretation may be assigned, and where they, and not
anomalous sentences, are indeed interpretable. The results suggest that some semantic analysis may occur as part of structure assignment in line with Chomsky's (1965)
Extended Standard Theory, but that fuller semantic analysis does not read the metaphor as anomalous in contradistinction to the standard pragmatic theory. The partially overlapping syntactic and semantic analyses indicating violation of selectional

restrictions but interpretability may be what renders the tension that makes figurative
language fun.
APPENDIX
Stimulus Sentences Validated by Subjects

Asterisks indicate those used in the final ERP analyses. The metaphoric version
is presented on the top line. Substituting the word marked (a) for the last word
yields the anomalous version, and substituting the word marked (l), the literal version. Following each final word are two numbers. The first is the word's frequency, and the second
the cloze probability (see under Follow-Up Experiment).
1. The winter wind tossed the earth's lacy blanket (30, 0)
placemat (0, 0) (a) snow (59, .29) (l)
2. The chimney belched forth soiled wisps of cotton (38, 0)
wool (10, 0) (a) smoke (41, .58) (l)
*3. The orchestra filled the concert hall with sunshine (8, 0)
hail (10, 0) (a) music (216, .54) (l)
*4. The skillful diplomat alleviated the internal tug-of-war (2, 0)
hopskotch (1, 0) (a) crisis (82, .08) (l)
*5. Spring makes green the woodland's bare skeletons (1, 0)
spheres (4, 0) (a) trees (0, .08) (l)
*6. The children playing in the park trampled the soft green carpet (13, 0)
bedspread (2, 0) (a) grass (53, .75) (l)
*7. The hunter's approach silenced the chattering underbrush (1, 0)
moss (9, 0) (a) animals (58, .13) (l)
*8. The flowers were watered by nature's tears (34, .08)
laughter (22, 0) (a) rain (70, .38) (l)
9. The leaves were tossed by the earth's gentle whisperings (1, 0)
rollings (0, .04) (a) breeze (14, .58) (l)
*10. His face was contorted by an angry cloud (28, 0)
map (13, 0) (a) frown (1, .04) (l)
11. The country's border was marked by a concrete serpent (2, 0)
elephant (7, 0) (a) wall (160, .58) (l)
The previous 11 sentences were adapted from Gerrig and Healy (1983).
*12. The camel is a desert taxi (16, 0)
table (198, 0) (a) animal (68, .67) (l)
*13. Not even Einstein's ideas were all gold (52, 0)
coal (32, 0) (a) great (665, .08) (l)
*14. Some jobs are prisons (3, 0)
houses (83, 0) (a) boring (5, .33) (l)
The previous three sentences were adapted from Glucksberg (1989).
*15. The bell sounded and the employees streamed from the anthill (0, 0)
bedroom (52, 0) (a) factory (32, .29) (l)
*16. The rush hour train stopped and out poured the sardines (2, .04)
tuna (0, 0) (a) commuters (0, .21) (l)
17. For the musical, the actress captured Broadway's superbowl (0, 0)
helmet (1, 0) (a) Tony (1, .08) (l)
18. On his head he sported a rug (13, 0)
chair (66, 0) (a) toupee (0, .08) (l)
*19. The shampoo effectively removed the snowflakes (1, 0)
ice (45, 0) (a) dandruff (0, .5) (l)
*20. The teenager's face was a coral reef (11, 0)
sea (95, 0) (a) pock-marked (0, 0) (l)
The previous sentence was adapted from Radencich and Baldwin (1985).
*21. The avenger's face was a sealed furnace (11, 0)
shower (15, 0) (a) hiding anger (48, 0) (l)
*22. He hates the slime that sticks on filthy deeds (8, 0)
rules (0, 0) (a) toilets (4, 0) (l)
*23. In the photograph he was doing a Napoleon (7, 0)
Lincoln (47, 0) (a) salute (3, 0) (l)
The previous two sentences were adapted from Gibbs (1994).
24. Sermons are like sleeping pills (0, .08)
grapefruit (3, 0) (a) lectures (15, .29) (l)
*25. Cigarettes are like timebombs (0, .04)
furniture (39, 0) (a) cigars (2, .04) (l)
The previous two sentences were adapted from Glucksberg and Keysar (1990).
*26. We wait for the sun to blow out (33, 0)
drip (1, 0) (a) set (414, .17) (l)
27. The Beatles were more popular than Christ (97, .08)
the navy (37, 0) (a) the Stones (12, .04) (l)
28. The prison guard was a hard rock (75, 0)
noodle (0, 0) (a) judge (77, 0) (l)
The previous sentence was adapted from Winner, Rosenstiel, and Gardner (1976).
*29. The worker exhausted his fuel (17, 0)
chickens (13, 0) (a) energy (100, .08) (l)
*30. For their daughter's wedding the couple proposed a ceasefire (7, 0)
highchair (0, 0) (a) toast (19, .42) (l)
*31. The Mona Lisa was DaVinci's Hamlet (7, 0)
play (200, 0) (a) masterpiece (9, .54) (l)
32. Federal funding for abortion is a minefield (0, 0)
pasture (14, 0) (a) controversy (26, .08) (l)
33. He recognized it as a great idea as soon as it erupted (7, 0)
sank (18, 0) (a) appeared (135, .25) (l)
*34. He sank into the featherbed, enjoying its soft embrace (13, .13)
taste (59, 0) (a) feel (216, .17) (l)
*35. Hoping to prevent a scene, she tried to lower his thermostat (6, 0)
computer (13, 0) (a) rage (16, 0) (l)
*36. The lifeguard sparkled with the healthy glow of tanned cancer (25, 0)
headache (5, 0) (a) skin (47, .46) (l)
37. The archeologists found the ancient dump to be an encyclopedia (1, 0)
atlas (12, 0) (a) treasure-trove (0, .04) (l)
*38. The buttery pastries melted all over his arteries (16, 0)
kidney (6, 0) (a) chin (27, 0) (l)
*39. Touching a turtle causes it to retreat into its armor (4, 0)
nightgown (0, 0) (a) shell (22, .96) (l)
*40. It was hard to see the road, the air was so soupy (0, 0)
beefy (1, 0) (a) foggy (5, .33) (l)
*41. When the car broke down she had to thumb (10, .04)
pinky (1, 0) (a) hitchhike (0, 0) (l)
42. From every corner cockroaches peeked like so many black plums (0, 0)
apples (6, 0) (a) ants (7, .13) (l)
*43. The cheap cushion seemed stuffed with old rocks (23, 0)
shoes (44, 0) (a) clothes (89, 0) (l)
*44. In the spring the brown branches are covered in tiny emeralds (6, 0)
sapphires (0, 0) (a) leaves (49, .13) (l)
45. He argued his test grade with the dragon (1, 0)
horse (117, 0) (a) teacher (80, .33) (l)
46. The rock star sweated hormones (2, 0)
kidneys (5, 0) (a) profusely (3, .38) (l)
*47. Mt. Everest is Nepal's Yellowstone (0, 0)
tree (59, 0) (a) attraction (15, .08) (l)
48. Billboards are a highway's warts (5, .08)
dimples (0, 0) (a) eyesores (0, 0) (l)
The previous sentence was adapted from Verbrugge and McCarrell (1977).
49. A hat-trick in hockey, three goals, is a player's grand slam (1, .46)
heart (173, 0) (a) dream (64, 0) (l)
50. Before the tornado the sky was a brilliant amethyst (0, 0)
diamond (8, 0) (a) purple (13, 0) (l)
51. On hot summer nights, the children played in the fire hydrant geysers (2, 0)
mudbath (0, 0) (a) water (442, .29) (l)
*52. Cave paintings are ancient graffiti (1, 0)
music (216, 0) (a) art (208, 0) (l)
*53. After their leafy feasts, caterpillars wrap in silk hammocks (0, 0)
underwear (3, 0) (a) cocoons (0, .67) (l)
*54. A night of heavy drinking makes your stomach a whirlpool (1, 0)
lobster (1, 0) (a) upset (14, .04) (l)
*55. Hypnosis opens memory's dams (3, 0)
trees (101, 0) (a) secrets (20, 0) (l)
*56. The Christmas stocking was stuffed to the gills (0, 0)
lungs (20, 0) (a) top (204, .21) (l)
*57. Before gargling his breath smelled swampy (1, 0)
red (197, 0) (a) horrible (15, .25) (l)
58. The gym-teacher taught warm-up exercises as boot-camp (0, 0)
algebra (2, 0) (a) required (182, 0) (l)
59. There is only so much deposit I am willing to eat (61, 0)
drink (82, 0) (a) forfeit (3, 0) (l)
*60. Homing pigeons have a built-in compass (13, .04)
ruler (3, 0) (a) locator (0, 0) (l)
61. He jumped from the chair and evaporated (2, 0)
disintegrated (0, 0) (a) left (480, .08) (l)
*62. The multi-vehicle accident set off an alarm symphony (33, 0)
violin (11, 0) (a) many alarms (1, 0) (l)
*63. The spring wind created a pollen blizzard (7, .04)
blackout (5, 0) (a) cloud (28, .04) (l)
64. After the engine was replaced the car needed to be debugged (0, 0)
baked (8, 0) (a) retuned (0, .33) (l)
*65. She inhaled the sea perfume (10, 0)
cat (23, 0) (a) breeze (14, .21) (l)
*66. Rice is the Orient's wheat (9, .04)
clothes (89, 0) (a) grain (27, 0) (l)
*67. Her favorite outfit was plagiarized (0, 0)
watered (7, 0) (a) copied (3, 0) (l)
*68. After a week of no rain the plants were panting (9, 0)
typing (7, 0) (a) wilting (0, .13) (l)
*69. The heavy smog created an emergency room flood (19, .04)
snow (59, 0) (a) crowd (53, .04) (l)
*70. Her wrinkled appearance was smoothed with a scalpel (0, 0)
grape (3, 0) (a) facelift (0, .13) (l)
*71. The teacher had trouble with the student's hieroglyphics (0, 0)
acorn (0, 0) (a) writing (117, .13) (l)
*72. Samson was a biblical Hercules (3, 0)
nymph (1, 0) (a) strongman (0, .04) (l)
*73. Tuberculosis was the 19th century AIDS (27, .04)
month (130, 0) (a) plague (6, .5) (l)
*74. Alzheimer's slowly destroys one's hard-drive (0, 0)
magazine (39, 0) (a) brain (45, .29) (l)
*75. After five o'clock the financial district is a cemetery (15, 0)
sky (58, 0) (a) deserted (15, .08) (l)
*76. The therapist helped the patient reach shore (6, 0)
leaves (49, 0) (a) peace (198, .04) (l)
77. The theatre company produced a Shakespeare banquet (6, 0)
floor (158, 0) (a) festival (27, .04) (l)
78. His imagination could be seen with a microscope (8, .13)
hammer (9, 0) (a) difficulty (76, 0) (l)
*79. Nelson Mandela is South Africa's Lincoln (47, 0)
paper (157, 0) (a) savior (6, .21) (l)
*80. Fluoridating water is dental penicillin (1, 0)
cook (47, 0) (a) hygiene (3, .21) (l)
REFERENCES
American Electroencephalographic Society. (1990). Standard electrode position nomenclature. Bloomfield, CT.
Bloom, P. A., & Fischler, I. (1980). Completion norms for 329 sentence contexts. Memory & Cognition,
8, 631–642.
Brown, C. M., Hagoort, P., & Chwilla, D. J. (1996). An event-related brain potential analysis of visual
word priming effects. In D. Chwilla (Ed.), Electrophysiology of word processing: The lexical processing nature of the priming effect. Self-published dissertation ISBN 90-9009317-6; printed and
bound by Koninklijke Wohrmann b. v., Zutphen.
Brown, R. (1958). Words and things. New York: The Free Press.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Deacon, D., Breton, F., Ritter, W., & Vaughan, H. G., Jr. (1991). The relationship between N2 and
N400: Scalp distribution, stimulus probability, and task relevance. Psychophysiology, 28, 185–200.
Federmeier, K. D., & Kutas, M. (1999). A rose by any other name: Long-term memory structures and
sentence processing. Journal of Memory and Language, 41, 469–495.
Friederici, A. D. (1997). Neurophysiological aspects of language processing. Clinical Neuroscience, 4,
64–72.
Gibbs, R. W., Jr. (1994). The poetics of mind: Figurative thought, language, and understanding. New
York: Cambridge Univ. Press.
Gibbs, R. W., Jr., & Gerrig, R. J. (1989). How context makes metaphor comprehension seem special.
Metaphor and Symbolic Activity, 4, 145–158.
Glucksberg, S. (1989). Metaphors in conversation: How are they understood? Why are they used? Metaphor and Symbolic Activity, 4, 125–143.
Glucksberg, S. (1991). Beyond literal meanings: The psychology of allusion. Psychological Science, 2,
146–152.
Glucksberg, S., Gildea, P., & Bookin, H. A. (1982). On understanding speech: Can people ignore metaphors? Journal of Verbal Learning and Verbal Behavior, 21, 85–98.
Glucksberg, S., & Keysar, B. (1990). Understanding metaphorical comparisons: Beyond similarity. Psychological Review, 97, 3–18.
Holcomb, P. J. (1988). Automatic and attentional processing: An event-related brain potential analysis
of semantic priming. Brain and Language, 35, 66–85.
Johnson, M. (1980). A philosophical perspective on the problems of metaphor. In R. P. Honeck &
R. R. Hoffman (Eds.), Cognition and figurative language (pp. 25–46). Hillsdale, NJ: Erlbaum.
Katz, J. J. (1990). The metaphysics of meaning. Cambridge, MA: Bradford Books of MIT Press.
Kucera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown Univ. Press.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic
incongruity. Science, 207, 203–205.
Kutas, M., & Iragui, V. (1998). The N400 in a semantic categorization task across 6 decades. Electroencephalography and Clinical Neurophysiology, 108, 456–471.
Kutas, M., Lindamood, T. E., & Hillyard, S. A. (1984). Word expectancy and event-related potentials
during sentence processing. In S. Kornblum & J. Requin (Eds.), Preparatory states and processing
(pp. 217–237). Hillsdale, NJ: Erlbaum.
Kutas, M., & Van Petten, C. K. (1994). Psycholinguistics electrified: Event-related brain potential investigations. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 83–143). San Diego, CA:
Academic Press.
McCarthy, G., & Wood, C. C. (1985). Scalp distribution of event-related potentials: An ambiguity associated with analysis of variance models. Electroencephalography and Clinical Neurophysiology, 62,
203–208.
McElree, B., & Nordlie, J. (1999). Literal and figurative interpretations are computed in equal time.
Psychonomic Bulletin & Review, 6, 486–494.
Ortony, A. (1980). Some psycholinguistic aspects of metaphor. In R. P. Honeck & R. R. Hoffman (Eds.),
Cognition and figurative language (pp. 69–83). Hillsdale, NJ: Erlbaum.
Pynte, J., Besson, M., Robichon, F.-H., & Poli, J. (1996). The time-course of metaphor comprehension:
An event-related potential study. Brain and Language, 55, 293–316.
Radencich, M. C., & Baldwin, R. S. (1985). Cultural and linguistic factors in metaphor interpretation.
Bilingual Review, 12, 43–53.
Verbrugge, R. R., & McCarrell, N. S. (1977). Metaphoric comprehension: Studies in reminding and
remembering. Cognitive Psychology, 9, 494–533.
Winner, E., Rosenstiel, A. K., & Gardner, H. (1976). The development of metaphoric understanding.
Developmental Psychology, 12, 289–297.
