Hearing Tsur’s “Poetic Mode”:
The Text/Music Relationship from Monody to Björk
by
Lorena M. Guillén
May 25, 2007
A Dissertation submitted to the
Faculty of the Graduate School of the State
University of New York at Buffalo
in partial fulfillment of the requirements for the
degree of
Doctor of Philosophy
Department of Music
UMI Number: 3262037
Copyright 2007 by
Guillen, Lorena M.
All rights reserved.
UMI Microform 3262037

Copyright 2007 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company

300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
Copyright by
Lorena M. Guillén
2007
ACKNOWLEDGEMENTS
Several people made possible the realization of this work. First, I wish to
acknowledge my advisor Michael Long who patiently and wisely guided me through this
process. I also really appreciate the time and insightful comments provided by my
committee members—Martha Hyde, Peter Schmelz and Jeffrey Stadelman.
The last years of research and writing were made possible by The Doctoral
Dissertation Fellowship from the College of Arts and Sciences of the State University of
New York at Buffalo. I want to thank them for their support.
I want to mention Barbara Hein and Martina Anderson, who carefully read and
edited my document, Martina Möetz for assisting with translations, and William
Egginton for introducing me to some of the important linguistics literature.
I would like to thank Gloria Escobar and Esperanza Roncero for allowing me to
conduct my questionnaires in their classes, and certainly, all the Hartwick College students
who volunteered their time to answer my questions.
And last, I am grateful to my family, my husband, Alejandro Rutty, and my
babies, Xul y Mora, who patiently waited for their mother to finish her dissertation.
Alejandro with his sharp comments on my topic provided a continuous dialogue that
made me build a coherent discourse to defend my arguments.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF EXAMPLES
LIST OF TABLES
ABSTRACT
I. INTRODUCTION. THE POETIC MODE
    The Poetic Mode of Text Perception
    The Perceptual Process of Text
    Expressive Potential: Three Types of Poetic Modes
    Two Contributions from the Study of Prosody
II. MONODY, RECITATIVE AND ART SONG
    Intonation Analysis Terminology
    Low-Level Mimesis: Monody and Recitative
    High-Level Mimesis: The 19th Century Lied
III. FURTHER EXPLORATIONS: MEREDITH MONK AND LUCIANO BERIO
    Volcano Songs: Meredith Monk
    A-Ronne: Luciano Berio
IV. THE POPULAR SONG
    The Highly Structured Song: Tin-Pan Alley Tune
    The Narrative Type: Strophic Form
    Redundancy: Variation on the “Verse-Chorus” Form
    Empirical Data: Questionnaires’ Results
V. CONCLUSION
BIBLIOGRAPHY
LIST OF EXAMPLES
EX. 1.1: From Caccini’s Sfogava con le stelle, mm. 5-16
EX. 1.2: Recitative from Scene V, W. A. Mozart’s Don Giovanni
EX. 1.3: Alto’s recitative n. 8 from Handel’s The Messiah
EX. 1.4: Reichardt, Kennst du das Land, 1st stanza
EX. 1.5: Zelter’s Kennst du das Land, mm. 1-27
EX. 1.5 (cont.): Zelter’s Kennst du das Land, mm. 28-53
EX. 1.6: Beethoven’s Kennst du das Land, mm. 1-17
EX. 1.6 (cont.): Beethoven’s Kennst du das Land, mm. 18-43
EX. 1.7: Beethoven’s “Kennst du”-questions’ rhythmic pattern
EX. 1.8: Schubert’s Kennst du das Land, mm. 1-18
EX. 1.8 (cont.): Schubert’s Kennst du das Land, mm. 19-40
EX. 1.9: Schumann’s Kennst du das Land, mm. 1-20
EX. 1.9 (cont.): Schumann’s Kennst du das Land, mm. 21-41
EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35
EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”
EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hip Dance”
EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”
EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the Things You Are”
EX. 3.2: “All the Things You Are”: Two versions and published score of the A section. Each in the key performed or published. Trans. by L. Guillén
EX. 3.3: Transcription of opening three measures of Dylan’s “Simple Twist of Fate”
EX. 3.4: Transcription of mm. 4 to 6 of Dylan’s “Simple Twist of Fate”
EX. 3.5: Transcription of the refrain of Dylan’s “Simple Twist of Fate”
EX. 3.6: Transcription of all the sections of Björk’s “Isobel”
EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”
EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”
EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”
LIST OF TABLES
TABLE 1.1: Intoneme analysis of Caccini’s Sfogava con le stelle
TABLE 1.2: Intoneme analysis of recitative from Scene V, W. A. Mozart’s Don Giovanni
TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah
TABLE 2.1: Classification of performance instructions in Berio’s A-ronne
TABLE 2.2: Exchange between tenor 1 and baritone 1 in Berio’s A-ronne, 18 to 20
TABLE 3.1: Song format of Björk’s “Isobel”
TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”
TABLE 3.3: Song format of Gabriel’s “Sky Blue”
TABLE 3.4: Results from question #1
TABLE 3.7: Results from question #4 on Dylan’s “Simple Twist of Fate”
TABLE 3.8: Results from question #4 on Gabriel’s “Sky Blue”
TABLE 3.9: Results from question #4 on Björk’s “Isobel”
TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You Are”
TABLE 3.11: Results from question #5
ABSTRACT
The pleasure we experience through hearing a song depends largely on musical
gestures—sonorous stimuli that are a complex web of musical and language parameters.
The listener’s emotional experience is mostly independent of the understanding of the
semantic meaning of the lyrics.
This dissertation looks into how people listen to song and how, consciously or
unconsciously, that affects the strategies implemented by composers and songwriters
while facing the task of creating their pieces: What are the diverse compositional tactics
employed to manipulate the focus of the listener’s perception of the text? How can
composers and songwriters emphasize, compensate for, or oppose this sonic connection
of the listener to the song’s text?
Our first tendency as listeners is to connect musically to the popular song or
vocal “art” work. Although there is an intention of deciphering the semantic message of
the lyrics, it is only after repeated listening that the audience is able to apprehend the
piece as a cohesive discourse. Aside from paying attention to the strict musical elements
of the piece, listeners predominantly perceive the sonic or musical aspects of its lyrics:
the colors of its phonemes, the prosodic arch of intonation of its phrases, the sonic quality
of the performing voice, and the specific colors and inflections that the voice adopts at
each phrase.
In order to test the hypothesis previously proposed and then explore further
ramifications, two different research tools were implemented: direct observation of
materials (analysis of vocal pieces through listening to specific recordings and analysis of
scores) and surveys of college students to observe their perception of the four popular
songs analyzed.
I.
INTRODUCTION
THE POETIC MODE
In many non-English speaking countries, Anglo-American popular song is
consumed as frequently and as enthusiastically as indigenous musical repertoires. Even
listeners with no command of the English language engage at some internal level with
their favorite songs. Growing up in Argentina, I was no exception, and many songs in
foreign languages marked my youth. While singing along, my peers and I often
mimicked the sound content of the words, making up nonsense syllables that stood in for
the actual lyrics. This phenomenon has always intrigued me. How did we engage
“emotionally” with these songs while ignoring the meaning of their texts? And what sort
of pleasure resided in the singing of nonsense syllables rather than meaningful words?
Years later, while watching the 2005 Super Bowl, I witnessed a scene that
resonated with my childhood experience. During that year’s game, the halftime
entertainment was provided by Paul McCartney, who performed some of his well-known
hits, including “Hey Jude.” The audience’s participation increased dramatically at this
point of the show, demonstrating the great popularity of this song among the American
public. The crowd sang along enthusiastically with the chorus, and, surprisingly, some
audience members held up signs containing the nonsense syllabic utterance “na na na
nananana.” Was this the most memorable text phrase of this famous song? Was this event
in any way related to the way non-English speakers experience Anglo-American song?
The fundamental questions behind this dissertation, which considers how mainstream
listeners “hear” song (i.e., music-with-text), were generated by my own history but were
cast into a more generalized context by this very American musical moment.
Rethinking these issues, I came to the conclusion that the pleasure that we
experience through hearing a song depends largely on musical gestures—sonorous
stimuli that are a complex web of musical and language parameters. These “sound
patterns” interplay through repetition, contrast, reinforcement and de-emphasis. The
listener’s emotional experience is mostly independent of the understanding of the
semantic meaning of the lyrics.
The hypothesis in this dissertation proposes that our first tendency as listeners is
to connect musically to the popular song or vocal “art” work. Although there is an
intention of deciphering the semantic message of the lyrics, it is only after repeated
listening that the audience is able to apprehend the piece as a cohesive discourse. Aside
from paying attention to the strict musical elements of the piece, listeners predominantly
perceive the sonic or musical aspects of its lyrics: the colors of its phonemes, the prosodic
arch of intonation of its phrases, the sonic quality of the performing voice, and the
specific colors and inflections that the voice adopts at each phrase.
Numerous essays have been devoted to exploring the issue of the music and text
relationship, and many others have proposed insightful analysis of song and other vocal
genres, with special emphasis on observing the structuring of music around a poem.1
However, very little has been said about how people listen to song and how, consciously
or unconsciously, that affects the strategies implemented by composers and songwriters
while facing the task of creating a song: What are the diverse compositional tactics
employed to manipulate the focus of the listener’s perception of the text? How can
composers and songwriters emphasize, compensate for, or oppose this sonic connection
of the listener to the song’s text?
In an effort to define the nature of the relationship between the two semiotic
realms converging into song, music and language, some poststructuralist analysts and
philosophers have given special attention to the preponderant role of music in this fusion.
Thus, there is the “assimilation model” of Susanne Langer. Although she acknowledges the
capacity of the poem to trigger the composer’s imagination, she admits that music
transforms “the entire verbal material, sound, meaning, and all—into musical elements.”2
She argues:
When words and music come together in song, music swallows words; not only mere
words and literal sentences, but even literary word-structures, poetry. Song is not a
compromise between poetry and music, though the text taken by itself may be a great
poem; song is music.3
On the opposite side of the semiotic analytical arena, Lawrence Kramer conceives
of song as a structure where words and music coexist without losing their individual
essences: “A poem is never really assimilated into a composition; it is incorporated, and
it retains its own life, its own body, within the body of the music.”4
1
For specific articles on text and music relationship issues, see Walter Bernhart, Steven Paul Scher and
Werner Wolf, eds., Word and Music Studies 1: Defining the Field (Amsterdam-Atlanta, GA: Rodopi,
1999). This volume gathers essays written by members of the International Association for Word and
Music Studies (WMA). Of particular interest are: Steven Paul Scher, “Melopoetics Revisited. Reflections
on Theorizing Word and Music Studies,” and Suzanne M. Lodato, “Recent Approaches to Text/Music
Analysis in the Lied, A Musicological Perspective.”
Although it is impossible to name them all, here are some texts that contain original song analysis: Charles Rosen,
The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995), particularly his
“Chapter Three: Mountains and Song Cycles”; Kofi Agawu, “Theory and Practice in the Analysis of the
Nineteenth-Century ‘Lied’” in Music Analysis 11, no. 1 (Blackwell Publishing: March, 1992); Lawrence
Kramer, Music and Poetry: The Nineteenth Century and After (Berkeley: University of California Press,
1984); David B. Lewin, “Figaro’s Mistakes,” and Carl Schachter, “Motive and Text in Four Schubert Songs,” in
Engaging Music: Essays in Music Analysis, ed. Deborah Stein (New York-Oxford: Oxford University
Press, 2005); Susan Youens, Schubert’s Poets and the Making of Lieder (Cambridge-New York: Cambridge
University Press, 1996); John Daverio, Robert Schumann: Herald of a “New Poetic Age” (New York-
Oxford: Oxford University Press, 1997).
2
Susanne Langer, Feeling and Form, as quoted by Kofi Agawu in “Theory and Practice in the Analysis of
the Nineteenth-Century ‘Lied,’” 5.
3
Ibid., 5.
Kramer describes song as “a regressive form of utterance.”5 He argues that music
alienates the singing of the words as a speech-act:
The style of the classical art song since the Renaissance heightens the tension between
words and music in two fundamental ways: first, by adopting an intonational manner that
presents the voice as a precisely tuned instrument rather than as a source of utterance, and
second, by opening the possibility of a musical response to the poetry that is complex
enough to raise questions of interpretation. Other features—the expressive forcing of high
and low tessitura, where the sound of the words inevitably fades into the effort of
attacking the pitch; the complication of rhythm and the varied movement of the voice
toward and away from speech-like patterns, the repetition, alteration, and syntactic
breakdown of the text—also contribute to alienating the singing of the words...6
According to Kramer, song undergoes deconstructive processes simultaneously in
two ways: “overvocalization” and “songfulness.” These processes create an effacement
of meaning. In technical terms, the “overvocalization” applies a kind of topological
distortion to song, dissolving the employed language into its physical origin, the
vocalization. Kramer defines “overvocalization” as the overstretching and twisting of
text to accommodate a musical setting with melismas or sustained notes. This
“purposeful effacement of text by voice” is also produced by “songfulness.”7 More than
technically, the difference between “overvocalization” and “songfulness” is found in an
almost metaphysical ground:
What separates them is a blend of purpose and circumstance. Overvocalization projects
meaning loss as the outcome of a rupture, a wrenching of song beyond the symbolizing
terrain of language and even conception, and therefore beyond the type of regulated
subjectivity mandated on the terrain by the laws of what Lacan calls the symbolic order.
Songfulness projects meaning loss as the outcome of a relative indifference of meaning, a
kind of higher carelessness or forgetfulness that simply does not avail itself of the
symbolic, allows the symbolic to lie unused even if its words may still be heard clearly.8
4
Lawrence Kramer, Music and Poetry (Berkeley: University of California Press, 1984), 127.
5
Ibid.
6
Ibid.
7
Lawrence Kramer, Musical Meaning: Toward a Critical History (Berkeley and Los Angeles, California:
University of California Press, 2002), 63.
“Songfulness” refers to the transformation of song into a fusion of vocal and musical
utterance judged to be pleasurable independent of verbal content. It depends on the
enveloping effect of the human voice, which at the physical level surrounds the listener
with the fullness of its vibrations and always implies potential meaning.
Kramer purposely avoids any objective definition of “songfulness,” arguing that it
is one of those aesthetic qualities that is immediately recognized but difficult to account
for. Is “songfulness” an attribute of the vocal music itself, the particular performance or
the ears of the listener? This dissertation proposes that the nature of the “songfulness”
attribute may be explained in terms of the linguistic field of cognitive poetics.
According to Reuven Tsur, a linguist in the field of poetics, poetry is considered
more “poetic” wherever it makes full use of disruptive tactics. Such tactics might create
“sound patterns”—webs of phonemes associated by alliteration, assonance and rhyme
across the poem. These sonic webs may trigger new narrative operations parallel to the
“primary narratemes” and explore the potential of phonemes to communicate different
emotional moods. The poetic text is perceived in what he calls “the poetic mode,” where
“some non-speech qualities of the signal seem to become accessible, however faintly, to
consciousness” along with “speech mode” processing.9
8
Ibid.
9
Reuven Tsur, What Makes Sound Patterns Expressive?: The Poetic Mode of Speech Perception (Durham
and London: Duke University Press, 1992), 14.
This dissertation proposes that the natural disruption produced by the
“overvocalization” effect is mounted over the already existent speech disruption of
poems in general. The result is what Kramer calls “songfulness.” But why do we perceive
song in this “songful” way? It is because we hear it in what Tsur calls the “poetic mode.”
Poetry is considered more “poetic” when it makes full use of disruptive tactics. In
a similar manner, music history shows that vocal setting styles have been considered
closer to a “singing quality” when “overvocalizing” takes place, and as a consequence
disruption of the natural flow of the discourse occurs. The degree of discourse
dismembering is directly proportional to the “singing” musicality of that song or vocal
piece. However, vocal setting modalities that aim for a “speech quality” try to keep their
musical elements as plain and unobtrusive as possible in order to guarantee a closer result
to the usual processing of the “straight” syntactic flow.
The type of musical setting determines if the listeners will appreciate a song in (1)
a “poetic mode,” in which the “speech mode” of perception dominates; (2) a “poetic
mode,” in which there is a balance between the “speech mode” and the “auditory mode;”
or (3) a “poetic mode,” in which the “auditory mode” dominates. First, those settings that
aim to express through their semantic content try to preserve the integrity of the discourse
and therefore tend toward a “speech mode” of processing the text. They situate the
affective potential of language in its prosody, as found in its natural speech state. Second,
those settings that aim to express through their “songfulness” accept the new, artificial
inflections of language under the influence of a song’s melodies and a musical
arrangement’s textures and tend toward a balanced “poetic mode.” They situate the
affective potential of language in its musical-poetic qualities—sonic and semantic
elements intermingled. Lastly, those settings that find the distortion that the text
undergoes as irreversible further fragment language in its minimal components
(phonemes) and paralingual gestures and use them as compositional elements. They
situate the affective potential of language in its sonic qualities.
After reviewing central theoretical issues on text-processing in the introduction of
this dissertation, sections two to four are dedicated to the application of those
perspectives in the analysis of musical pieces of diverse musical styles and periods. These
serve to exemplify variations on the “poetic mode” of text perception and the various
compositional tactics that affect the way people listen to them. Thus, the selected
examples of monody and recitative represent the kind of “poetic mode,” in which the
“speech mode” dominates; the nineteenth-century German lieder, as well as the four
analyzed popular song cases, exemplify the “poetic mode,” in which there is a balance
between the “speech mode” and the “auditory mode”; finally, the two twentieth-century
avant-garde vocal selections illustrate the “poetic mode,” in which the “auditory mode”
dominates.
This selection by no means pretends to be a systematic historical overview of the
approaches to the relationship between text and music; it serves only to exemplify
different modes of perception derived from the application of Tsur’s concepts. Each
analysis presents a different methodological approach that most clearly articulates those
compositional procedures that reveal the way people perceive those kinds of settings.
In order to test the hypothesis previously proposed and then explore further
ramifications, two different research tools were implemented: direct observation of
materials (analysis of vocal pieces through listening to specific recordings and analysis of
scores) and surveys of college students to observe their perception of the four popular
songs analyzed. Both data collection methods take into account the listeners’ perspective.
This research has exploratory intentions. It introduces a new explanation for the
phenomenon of perceiving text set into music in a “poetic” way and, at the same time,
observes the strategies implemented by composers and songwriters to manipulate the
focus of the listener’s perception. Although preserving as its ultimate goal the creation of
a theory of perception of text in vocal music, its immediate intentions are more modest:
to achieve definitional clarity and to generate an initial hypothesis. This research is a
beginning, a pilot test of a theory imported from the linguistics of poetics—reformulated,
expanded and adapted to explain the perception of vocal music.
The Poetic Mode of Text Perception
The different degrees of linear connection and disruption of a text influence the
cognitive processing of it. Disruptive textual qualities can be produced by a divergence
between linguistic strings of arbitrary verbal signs (words) and repeated phonetic sound
clusters (produced by alliteration, assonance and rhyme), or between syntactic
organization (sentences) and the prosodic units (poetry lines or intonation phrases). These
two organizational levels, the linguistic/syntactic and the phonetic/prosodical, contrast
with one another and pull the reader’s or listener’s attention in opposite directions. Tsur argues:
In connected speech there is a tendency to proceed linearly rather than move in different
directions from the central sequence...to allow for the disruption of the linear sequencing
of speech sounds (that is, for segregating the relevant portions of the auditory stream), the
whole message must be less thoroughly organized on all levels as the linguistic stress
pattern diverges from the conventional metric pattern, and as does the syntactic unit
(clause, sentence) from the prosodic unit (line)...10
10
Ibid., 73.
Poetic texts usually tend to disruption, which triggers a “poetic mode of
listening,” where there is freedom for segregating or grouping portions of the sound
stream and moving back and forth between auditory and phonetic modes of listening. The
resultant “sound patterns” assume the emotive effects of non-referential sound gestures;
they are perceived as music by the brain’s right hemisphere—a process that involves the
identification of overtone structures as individual tone colors, a musical kind of listening.
More specifically, Tsur comments—based on findings by A. M. Liberman and his
colleagues of the Haskins Laboratories at Yale University11—that we have a “speech
mode” and a “non-speech mode” of listening, which follow different paths in the neural
system. The same transitions between phonemes that in the speech continuum are heard
as speech sound (because they appear to carry linguistic information), when isolated are
heard as musical sounds. But, according to Tsur, there is a third mode, the “poetic mode,”
in which “some non-speech qualities of the signal seem to become accessible, however
faintly, to consciousness” along with the speech mode processing.12 These three
“listening modes” depend on the way the acoustic signal is processed. In the “non-speech
mode” (processed by the right hemisphere of the brain), we attend away from the
overtone structure to tone color, as when we hear musical sounds or natural noises. In the
“speech mode” (processed by the left hemisphere of the brain), we process the signal
attending away from the overtone or formant structure to the phoneme, and the tone color
is taken into account or almost suppressed. In the “poetic mode,” the main processing is
identical with the speech mode except that certain precategorical information (such as
tone color) enters consciousness.13 In this mode, it is possible to switch back and forth
between “auditory” and “phonetic” modes of listening, either simultaneously or in rapid
succession.
11
Some of the specific articles that summarized these findings are: A.M. Liberman and David Isenberg,
“Duplex Perception of Acoustic Patterns as Speech and Nonspeech” in Status Report on Speech Research
SR-62 (Haskins Laboratories, 1980): 47-57; A.M. Liberman, I.M. Mattingly and M.T. Turvey, “Language
Codes and Memory Codes” in Coding Processes in Human Memory, A. Melton and E. Martin, ed. (New
York: Winston, 1972); A.M. Liberman, F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy,
“Perception of the Speech Code” in Psychological Review 74 (1967): 431-61; B. Repp, C. Milburn and
John Ashkenas, “Duplex Perception: Confirmation of Fusion” in Perception & Psychophysics 33, no. 4
(1983): 333-337.
12
Tsur, What Makes Sound Patterns Expressive?, 13.
The musical setting of text produces disruption to differing degrees, posing an
irresolvable agonic relation between understandability of text and musicality of the
vocalized word. This disruption is mounted over the disruptive sound groupings produced
by rhyme schemes and phonetic interplays among words already present in any poetic
text as specified in previous paragraphs. The resulting effect could communicate two
kinds of messages: one that, according to Tsur, explores the double-edged expressive
capacity of phonetic sounds in connection with the words that contain them; and the other
concerns itself mainly with the semantic meaning of the words or the “parallel
narratemes” that words make up after being associated by phonetic similarities. These
“parallel narratemes” could oppose or complement the “primary narratemes” made up
from the “straight” syntactic flow of narrative.14
Regarding the double-edged expressive capacity of phonemes, Tsur says that
those sounds have meaning but not a specific one: “they may express vastly different or
even opposing qualities.”15 As its syntactic and semantic context, the “word” influences
the enhancement of different potentialities of those sounds. An abstraction or general
meaning of the combined sounds is grasped and runs parallel to the semantic abstraction
of the words united by sonorous similarities such as alliterations, rhyme schemes, etc.
Regarding the second possible message, Didier Coste, in his book Narrative as
Communication, explains that when poetry carries a narrative type of discourse, there is a
tension between verse and narrative. Besides the (usual) way of processing the straight
syntactic flow of narrative, other messages can be constructed from “narrative
operations” based on phonetic or rhyme connections between two words. In some cases,
the ambiguity of the contrast between phonetic affinity and semantic disjunction points to
the insufficiency of the “primary narratemes” to account fully for the narrative
significance of the poem.16 Poems are usually prized for their incompatibility with
straight narrative.
Narrative is not the only type of discourse used in vocal settings, although it is the
predominant type. Other kinds of non-narrative discourse, both linguistic (such as
description, definition, and injunctive17) and paralinguistic (phonetic or
onomatopoeic), are used in other vocal pieces, the latter mostly in 20th- and 21st-century
avant-garde pieces. The poetry or the song lyric format makes use of these types of
discourses.18
13
The term “precategorical” refers to the “categorical perception” phenomena explored at the Haskins
Laboratories at Yale University. The human ear has the tendency to fuse the continuous variation of color
and pitch within phonetic linguistic categories (the repetition of one particular phoneme, for example,
minimal variations in the formants of the [b] phoneme). This is a similar process to the fusion of overtones
in sound stimulus. But, at the same time, the ear perceives the change from one category to the other
(the change from one particular phoneme to a different one, for example, from [b] to [d]) as a quite
distinctive difference. We, as humans, perceive the phonemes that make speech as
individual categories beyond their minimal formant variations. In contrast, natural noises and music are
perceived in a continuous manner, attending to every single formant variation.
14
The term “narrateme” was coined by Didier Coste in his book Narrative as Communication
(Minneapolis: University of Minnesota Press, 1989). The narrateme is the minimal unit of narrative
discourse; it is “an utterance that contains an actional predicate” (Coste, p. 36); in other words, it represents
an event.
15
Tsur, What Makes Sound Patterns Expressive?, 2.
16
Didier Coste, Narrative as Communication (Minneapolis: University of Minnesota Press, 1989).
17
The injunctive type of discourse is understood as laws, orders to somebody else, or ordering of things.
18
There are vocal pieces that use prose. In some cases, this prose has poetic tendencies (it explores the
sound patterns of the words and phonemes or interplay between words at a semantic level). But, in other
cases, straight prose has been used in chamber vocal pieces like Berio’s Sinfonia, in which some of the
sources are philosophical texts by Levi-Strauss and political speeches.
The Perceptual Process of Text
Both song lyrics and poems set into popular or art songs are conceived by their
creators and received by their audiences in a different manner than the common speech
text of an everyday conversation. These literary texts are more like oral poetic
storytelling. But they have one characteristic in common with speech: their orality. They are
literary forms that are communicated verbally during the song performance. The
communication process of the text of the song lyrics and poems, whatever its nature,
begins at this aural dimension. At a macro-level, the communication processes of both
literary texts and everyday speech are similar. At a micro-level, however, the specifics of
the processes are different.
At the first stage (macro-level), the auditory perception takes the form of a
process of “analysis and synthesis”:
The listeners “decode” the input speech signal by using their knowledge of the
constraints that are imposed by the human articulatory “output” apparatus.19
A reference to the articulatory gestures that are involved in the production of
speech helps the listener to decode segmental phonemes, intonation and stress. The
acoustic signal may be partially ignored and filled in by the listeners’ own syntactic and
semantic knowledge of language and the social context of the communication act.
The listener apparently comprehends the message by a process of “hypothesis formation”
that involves analysis-by-synthesis where the context guides the recognition routine. The
listener may consider a comparatively large “chunk” of speech, and he is often able to
“guess” what the speech signal should be from the context that is furnished by the
“chunk.”20
19
Philip Lieberman, Intonation, Perception, and Language (Cambridge, Massachusetts: The M.I.T. Press,
1967), 162.
20
Ibid., 163.
At times, when the speaker realizes that the listener may infer the rest of the
message with only certain minimal information, he simplifies his articulatory control over
it. Lieberman says that a speaker may neglect to articulate a word carefully in such a
case. The listener will then create a hypothesis regarding the phonetic character of the
segments that are unrecognizable from the acoustic signal and, applying phonological and
syntactic rules, will form a hypothetical phrase. This hypothetical phrase may be
semantically reasonable and consistent with its context. If that is not the case, the listener
may try another hypothesis or simply not understand the message.
Once this primary or macro level of communication of speech is completed,
another process is triggered in which the listener tries to make sense of the whole
message at a deeper, or micro, level. The literary text and strategies used in it are simply
a starting point, from which the reader, or listener in our case, must construct for himself
the aesthetic object. The communicative act is initiated by the text but depends on the
active involvement of the reader. The texts should stimulate the individual reader’s
faculty of perception and processing.
The decoding proceeds in “chunks” rather than by single words. These chunks
correspond to the syntactic units of a sentence. These individual sentences do not directly
denote objects. Literary text does not denote empirically existing objects; although text
may select objects from the empirical world, they are depragmatized. The literary
aesthetic object is built up in such a way that these intentional “sentence correlates” join
in semantic units. Wolfgang Iser says that “the semantic pointers of individual sentences
always imply an expectation of some kind…As this structure is inherent in all intentional
sentence correlates, it follows that their interplay will lead not so much to the fulfillment
of expectations as to their continual modification.”21
According to Roman Ingarden—as Iser paraphrases—during the flow of thinking
a sentence, after completing the thought of that sentence, we are prepared “to think its
‘continuation’ as a new sentence, especially one that has connection with the previous
one.”22 Each of these “sentence correlates” contains what Iser calls “a hollow section,”
which creates expectation pointers towards the next sentence, and a retrospective section,
which complies with the expectations of the preceding correlate, which at this point is
part of the background. According to Iser, this creates a constant “dialectic of protension
and retention, conveying a future horizon yet to be occupied, along with the past horizon
already filled…”23 What has already been heard undergoes a permanent synthesizing
process. Every sentence shrinks in the memory and becomes some sort of background,
which is constantly restructured by the correlates that evoke it by associative relations.
If the new sentence answers the expectations aroused by the previous correlate, the range
of semantic horizons narrows. Descriptive texts, especially, behave in this way in order to
individualize the particular object. But when the new sentence does not fulfill the
expectations, the resulting frustration retroactively affects what has already been read.
Although connectability is fundamental to the construction of texts in general, literary
fictional texts and pragmatic expository language behave very differently. In order to
guarantee the reception of a specific given fact, the expository text tends to stay as
cohesive as possible.
21
Wolfgang Iser, The Act of Reading: A Theory of Aesthetic Response (Baltimore and London: The Johns
Hopkins University Press, 1978), 111.
22
Ibid., 112.
23
Ibid.
Whenever the expository text unfolds an argument or conveys information, it
presupposes reference to a given object; this, in turn, demands a continuous
individualization of the developing speech act, so that the utterance may gain its intended
precision. Thus, the multiplicity of possible meanings must be constantly narrowed down
by observing the connectability of textual segments, whereas in fictional texts the very
connectability broken up by the blanks tends to become multifarious. It opens up an
increasing number of possibilities, so that the combination of schemata entails selective
decisions on the part of the reader.24
The involvement of the listener is necessary to “activate the interplay of correlates
prestructured by the sequence of sentences.”25 The listener reconstructs the gaps that the
text leaves from what is revealed along its development. In return, the information made
explicit is transformed when what is left open is discovered. These blanks are one of the
fundamental differences between literary language—written or oral, as in the case of
“song”—and everyday speech.
The coherence and connectability of speech also depend upon certain extra-
textual conditions that in pragmatic language are a given and in fiction have to be
recreated every time, such as a “‘non-verbal frame of action…as matrix for utterances’;
the relation between the recipient and ‘the common referential system of experiences
assumed by the speaker,’ as well as ‘the common area of perception’; and the relation
between the recipient and the communication situation, as well as the ‘speaker’s range of
associations.’”26 Certainly, since in the act of listening to songs there is no direct contact
between the songwriter or the speaker’s voice and the listener, these preconditions need
to be recreated in every piece, as in any other literary fictional language.
Either when listening to a recording or in a live concert situation, when the singer
is not the songwriter himself, an indirect communicative situation exists. The performer
24
Ibid., 184.
25
Ibid., 110.
is only interpreting, or creating her own reading of the text (text conceived as the already
combined product of text plus music into the song), and then communicating her version
to the audience. But even in live performances when the performer is the songwriter, an
unavoidable distance exists between the performing artist and her audience that does not
exist in a common dialogue situation. Although in these live settings the performer could
partially provide the “non-verbal frame of action,” the rest of the assumed extra-textual
conditions are a leap of faith that every songwriter makes every time that he composes a
song.
The songwriter, as well as the performer, is alone in this realm since her
knowledge of the audience is limited and general (at least more than in a direct
conversation). The relation between the recipient and “the common referential system of
experiences assumed by the speaker” is probably estimated by the songwriter based on
the audience that she has in mind at the moment of composing. “The common area of
perception” does not apply in this situation because this is not a conversation that is
developing in real time, so it cannot be modified and affected according to the
surrounding environment. In the same way, the relation between the recipient and the
communication situation, as well as the “speaker’s range of associations,” escapes this
kind of communicative situation because of the predetermined nature of the message
being delivered. Beyond the interpretative nuances that any performer may add in
reaction to a participative audience, the song is a pre-composed form not improvised
according to the audience reaction. So, in every case, the live performance as
communicative act comes closer to the way a fictional literary text reaches its recipients
26
Ibid., 183. Iser summarizes a list of factors given by S. J. Schmidt in Texttheorie (UTB 202) (Munich:
Fink, 1973).
than to the way an everyday conversation does. This is even clearer in the case of
listening to a recording of a song where even the visual contact is lacking.
Concerned with the “orality” of the song’s literary medium, Roland Barthes
elaborates on the concept of what he calls “the grain of the voice”:
...we never listen to a voice en soi, in itself, we listen to what it says. The voice has the
very status of language, an object thought to be graspable only through what it transmits;
however, just as we are now learning, thanks to the notion of “text,” to read the linguistic
material itself, we must in the same way learn to listen to the voice’s text, its meaning,
everything in the voice which overflows meaning.27
It is the sound of the voice, its quality of tone and intonation patterns that offer the
context from which the listener departs in his interpretive journey of the conveyed
message. First, the quality depends on the circumstantial performer, her involvement with
the musical piece and her interpretation of what the composer/lyricist wants to transmit—
unless performer and composer are the same person. Second, the intonation patterns
depend mostly on the way in which the lyrics were set to music by the composer. But,
ultimately, it is the receiver or listener that activates the connections suggested by these
pointers.
The literary use of the text gaps challenges the listener or reader by withholding
information that could be a given in normal language, so he “…must reformulate a
formulated text if he is to be able to absorb it.”28 In pragmatic speech, this challenge does
not exist and the imagination of the listener is not tested as it is in the literary medium.
The listener may fill in the blanks in the disconnected discourse by asking the speaker.
Iser points to the natural need of language to leave holes in the continuum of any
discourse to allow real meaning to flourish. He quotes Maurice Merleau-Ponty:
The lack of a sign can itself be a sign; expression does not consist in the fact that there is
an element of language to fit every element of meaning…Speaking does not mean
substituting a word for every thought: if we did that, nothing would ever be said…we
would remain in silence, because the sign would at once be obliterated by
meaning…Language is meaningful when, instead of copying the thought, it allows itself
to be broken up and then reconstituted by the thought.29
Meaning comes out of the unsaid as much as out of the message carried by the spoken
words. Thus, literary text engages the listener in a constant exercise of interpretation.
Expressive Potential: Three Types of Poetic Modes
All vocal setting modalities, from those that aim for a “speech quality” to those
that aim for a “musical-poetic quality,” want to communicate some kind of affect, but they
locate that expressive potential in different aspects of the text. As a consequence, the
way the musical setting interacts with its lyrics or poem will vary.
As mentioned earlier, those settings that rely on the semantic meaning of their
lyrics for the communication of their expressive potential tend to preserve the integrity of
the speech qualities. They locate this affective potential of language in its “prosody,” as
found in its natural speech state. Other kinds of vocal settings accept the new, artificial
inflections of language under the influence of a song’s melodies and a musical
arrangement’s textures and aim to express through their “songfulness.” The
fragmentation of the sequence of the original discourse can vary to a vast degree but
tends toward a balanced “poetic mode,” which situates the affective potential of language
in its musical-poetic qualities—sonic and semantic elements intermingled. Lastly, those
27
Roland Barthes, The Grain of the Voice: Interviews 1962-1980 (Berkeley and Los Angeles: University of
California Press, 1985), 183-184.
28
Iser, The Act of Reading, 185.
29
Maurice Merleau-Ponty, Das Auge und der Geist. Philosophische Essays, trans. Hans Werner Arndt
(Reinbeck, 1967), p. 73f, as quoted by Iser in The Act of Reading, 186.
settings that regard the distortion that the text undergoes as irreversible further fragment
language into its minimal components (phonemes) and paralinguistic gestures and use them
as compositional elements. They situate the affective potential of language in its sonic
qualities.
Two Contributions from the Study of Prosody
In addition to the important concepts developed by Tsur in the field of cognitive
poetics, linguistic studies of prosody also help to illuminate the intermingling of sonic
and semantic elements of language. “Prosody” is understood as the kind of shape that
each intonation unit—which corresponds to syntactic clauses or units of information in
language—takes according to the variation of the parameters that govern its contour and
dynamic: variation of intensity, segmental duration, temporal organization or rhythm, and
variation of the fundamental frequency, as the primary parameter. This is a higher level
of prosody than the one that is generally applied at the lexical level, where the word is the
unit and all the above mentioned parameters take place inside its limits. Despite complex
controversies among researchers in this field, most linguists agree that intonation systems
convey, as Dwight Bolinger says, “how we feel about what we say, or how we feel when
we say.”30 Researchers in this area, including B. Shapiro and M. Danly (1985), have
conducted tests that provide neurological evidence for those linguists, among them
Bolinger, who maintain the fundamental affectivity, rather than grammaticality, of
intonation.
30
Dwight Bolinger, Intonation and Its Uses: Melody in Grammar and Discourse (Stanford, California:
Stanford University Press, 1989), 1.
Two phenomena described by some linguists involved in prosody studies are
crucial to the understanding of how different vocal setting modalities act on the
perception process. The first one is found when Bolinger explains that the vocal tones
employed in language are made of overtones produced by the shaping of the different
phonemes, which carry the semantic message, and fundamental pitches that are “mostly
used for mood and punctuating effects”—what is known as prosody.31 In any case, since
the melodic manipulation is essentially the musical arrangement of successive
fundamental pitches, types of song focusing on tune over speech quality affect the
expressive/affective mood of the text. However, these kinds of musical settings do not
modify the overtones produced by the phonemes, and, as a consequence, the strict
message of the text remains intact. By modifying the intonation arches in this manner, we
are essentially presented with a new reading of the text in performative terms.
The second phenomenon is one described by Daniel Hirst. When defining the
difference between speech and song, he says that while normal speech consists of a
continuous sequence of movements from one target-point to the next, in song, the
common prosodic characteristic of these patterns is that the contour is produced as a
sequence of static level tones.32 This description will prove instrumental when analyzing
31
Bolinger, Aspects of Language (New York/Chicago/San Francisco/Atlanta: Harcourt, Brace & World, Inc.,
1968), 31.
32
Daniel Hirst, “Intonation in British English” in Intonation Systems: A Survey of Twenty Languages
(Cambridge, UK: Cambridge University Press, 1998), 71.
the different degrees in the spectrum of possible vocal tones employed in the following
musical examples.
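Hirst's distinction can be pictured as two ways of realizing the same series of pitch targets. The sketch below is only an illustration under simplifying assumptions: the target values (in semitones), the sampling resolution, and the function names are hypothetical, and real speech contours are not strictly linear.

```python
def speech_contour(targets, samples_per_target=4):
    """Glide continuously from each pitch target to the next (speech-like)."""
    contour = []
    for start, end in zip(targets, targets[1:]):
        for i in range(samples_per_target):
            # Points on the straight line between two neighboring targets.
            contour.append(start + (end - start) * i / samples_per_target)
    contour.append(targets[-1])
    return contour


def song_contour(targets, samples_per_target=4):
    """Hold each pitch target as a static level tone (song-like)."""
    contour = []
    for target in targets:
        contour.extend([target] * samples_per_target)
    return contour


# Hypothetical targets, in semitones above an arbitrary reference pitch.
targets = [0, 4, 2]
print(speech_contour(targets))
print(song_contour(targets))
```

The first list changes value at every sample, while the second stays level and then jumps, which is the contrast between target-to-target movement and level tones described above.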
II.
MONODY, RECITATIVE AND ART SONG
Intonation Analysis Terminology
In order to proceed with the prosodic analysis of monody and recitative in Italian,
it is necessary to provide a brief explanation of the specialized terminology of linguistic
studies as presented by Mario Rossi.1 There is a certain agreement among linguists that
an “intonation unit” is dictated by syntactic rules. Although it does not coincide with the
boundaries of a sentence, it always contains certain specific syntactic elements. The
“intonation unit” is usually a fraction of a sentence composed of a VP (verbal term) and an
NP (noun term), including its modifying adjectives. Whatever its place in the phrase, the
S-ADV (sentence adverb) is separated from the other constituents of the intonation unit
during analysis.
The intonation curve at the last syllable of each intonation unit may be either a
“major ConTinuative intoneme” (CT) or a “major ConClusive intoneme” (CC). Each
intonation unit has two “internal Accents” (AC1 and AC2). These accents are at the
lexical, or word, level. As we may observe in the first two examples analyzed, the Italian
language in particular tends not to synchronize its intonemes with these
“internal Accents” (ACs) because most of the time it carries its lexical stress on the
penultimate or antepenultimate syllable, while the intoneme occurs on the final syllable.
As Rossi describes, usually the CT manifests as “a contour whose pitch is equal to or
higher than that of the AC.”2 Actually, “the pitch contour of the ‘continuative intoneme,’
1
Mario Rossi, “Intonation in Italian,” in Intonation Systems: A Survey of Twenty Languages (Cambridge,
UK: Cambridge University Press, 1998).
2
Ibid., p. 225.
after AC1, may vary freely between the two pitch extremes of AC1 and AC2, that is to
say between the Mid and the Mid High levels.”3 In contrast, CCs (“major conclusive
intonemes”) tend to be lower in pitch than the ACs, and in general a falling pitch contour
takes over the utterance, which concludes in low pitch levels.
Rossi also mentions that “the duration of the vowel under AC and the
continuative intoneme is longer than of the unstressed vowels…”4 The last intoneme
syllable is probably significantly longer than all the previous atonic prestressed vowels.
An interesting characteristic of the Italian intonation’s dynamic is that “the loci of
temporal prominence are not synchronized with those of pitch prominence.”5 In an ideal
neutral intonation expression, AC2 has the pitch prominence, higher than the rest, even
than AC1 and CT or CC. AC2 may be somewhere between the Mid High and the Mid
pitch levels, while AC1 is around the Mid to Mid Low. The temporal prominence belongs
to AC1. It is longer than any other element, even than AC2, and this factor indicates that
what follows is an intoneme of any type, a CT or a CC. The difference between the AC2
and its preceding and following unstressed syllables is 3 PUs, while the difference
between the AC1 and its surrounding group of unstressed syllables is 6 PUs.6 That
difference between the two ACs and their surrounding atonic syllables indicates the
difference between a stressed group around AC2 and an intoneme around AC1. Between
AC2 and AC1 there is no intoneme. The unstressed vowels lying in between those two
3
Ibid., p. 227.
4
Ibid., p. 225.
5
Ibid., p. 226.
6
The PU is a duration unit that is calculated as “the log of the ratio of the duration of a given vowel to that of
the vowel carrying AC in the utterance. This value is then normalized by dividing by log (1.22).” (Rossi, p.
238).
accents adjust their prosodic values to “the temporal and melodic continuums obtained by
linear interpolation between those two points.”7
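The duration measure defined in note 6 and the interpolation just described can be made concrete with a small numerical sketch. Only the formula (the log of a duration ratio divided by log 1.22) and the idea of linear interpolation between the two accents come from Rossi as quoted above; the function names and the sample values are hypothetical.

```python
import math

PU_BASE = 1.22  # normalizing ratio quoted from Rossi (p. 238)


def duration_in_pu(vowel_ms, accented_vowel_ms):
    """Duration of a vowel relative to the vowel carrying AC, in PUs."""
    return math.log(vowel_ms / accented_vowel_ms) / math.log(PU_BASE)


def ratio_for_pu(pu):
    """Inverse relation: the duration ratio that corresponds to a PU value."""
    return PU_BASE ** pu


def interpolate(start, end, steps):
    """Values for `steps` unstressed vowels lying between two accents,
    obtained by linear interpolation between the accent values."""
    return [start + (end - start) * i / (steps + 1) for i in range(1, steps + 1)]


# A 3 PU difference (AC2 vs. its neighbors) and a 6 PU difference (AC1).
print(round(ratio_for_pu(3), 2))
print(round(ratio_for_pu(6), 2))
# Hypothetical pitch values for two unstressed vowels between AC2 and AC1.
print(interpolate(0.0, 3.0, 2))
```

Read this way, a difference of 3 PUs corresponds to a vowel roughly 1.8 times as long as its neighbors, and a difference of 6 PUs to one roughly 3.3 times as long.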
Low-Level Mimesis: Monody and Recitative
Certain vocal styles establish a “low-level mimetic relation” with their texts, a
relation at the lexical and prosodic level of their literary sources. These styles have as a
principle the preservation of the text integrity as a logical linear discourse as far as is
musically possible. In order to allow the discourse to flow without interruption, its
practitioners look at intonation and inflections of the text as speech. If language as speech
has the ability to communicate ideas, it is by allowing the texts to behave as such that
their musical settings are able to transmit emotions. One might say that these vocal styles
and genres locate the expressive qualities of language in the intonation system. By
attempting to maintain the integrity of the discourse, these pieces trigger a “poetic mode”
of perception in which the “speech mode” dominates. Two vocal setting modalities may
serve as examples of this approach: Italian monody of the late sixteenth and early
seventeenth centuries and recitative in general.
Monody composers, such as Caccini, D’India, Galilei, Peri and Monteverdi (in
his new monody phase), claimed to be imitating nature with the new stile rappresentativo
of their vocal compositions. This was not an original claim since earlier and
contemporary madrigalists, such as Cipriano de Rore, Marenzio, Gesualdo and Zarlino,
maintained the same about the counterpoint and “word painting” devices used in their
compositions. The stile rappresentativo or “representational style” consisted of the
imitation of the rhythm and intonation of speech. Monody composers wanted to
7
Ibid., p. 227.
avoid the madrigal analogies of “word painting,” which illustrated the meaning of the
words with specific harmonies created by intervals among the voices, runs up and down,
silent parts or notation devices. They based their monodic settings on the homology of
the spoken word as the closest to human nature that they could get.8
The proponents of this seconda prattica put their efforts into following the textual
rhythm, as in direct natural speech.9 They observed intonational accents, elongation or
shortening and pitch variation of morphemes as much as they musically could. But, in the
end, a general musical sense and sensitivity ruled over the strict transcription of those
intonational parameters. They also followed the phrasing of intonational units and order
of the text, only adding some repetitions toward the end of the poems. These repetitions
were meant to emphasize certain concepts and words. The restatement of words allowed
melodic embellishment, such as melismas, trillo and gruppo, without detracting from the
intelligibility of the semantic content. Once a term was introduced, it could be
stretched and twisted over long and elaborate ornamentation and still be present in the
short-term memory of the listener.
Some embellishments were conceived as imitations of the tones of the voice or
“manners of speaking,” as Caccini explains in his preface to Le nuove musiche. For
example, the esclamazione consists of “a gradual loudening of the voice on long notes
into an outcry, made more artful by first diminishing the volume before beginning the
increase…Clearly the esclamazione is the likeness of a sigh.”10 Other ornamentations
described by Caccini are the gruppo and the trillo, which imitate unsteady speech. The
8
For a detailed discussion of the monodists’ reform, see Richard Taruskin, The Oxford History of Western
Music, vol. 1 (Oxford-New York: Oxford University Press, 2005), pp. 797-847.
former is “the artfully simulated vocal tremble” made of the “rapid alternation of
contiguous notes of the scale.”11 The latter consists of the rapid repetition of a single
pitch.
In their score settings, seventeenth-century Italian monodists closely followed the
prosodic characteristics of the text phrases. On the one hand, they set the text phrases to
melodic patterns that mimicked their intonation shapes pitch-wise. On the other hand,
they also replicated the kinds of prolongation performed over accented syllables and the
shorter values of the syllables in between with similar musical rhythmic patterns.
There are two factors that are decisive in bringing monody closer to natural
speech. First, the shapes and rhythm indicated in the score served only as a point of
departure for the real interpretation of the performer. These composers allowed and
expected flexibility in the performing beat of their monodies, giving the performer the
chance to vary the articulation pace according to her or his dramatic interpretation of the
different phrases. Second, the quasi-parlato nature of the vocal sound—with less vibrato
and more continuous movements between pitches—better recreated the sound of the
speaking voice. It is only by these means that monody achieved the emulation of speech
manners.
Monodies kept the accompanying texture as simple as possible: only chords,
usually played by a string instrument—sometimes chitarrone—and notated in the
shorthand method known as figured bass. Dissonance was used to highlight certain “rhetorical
9
The “second practice” referred to the stile rappresentativo, as defined by Giulio Cesare Monteverdi in the
postface “Declaration” of Claudio Monteverdi’s score of Scherzi musicali (Venice, 1607). For these
monodists, the madrigalists represented the prima prattica, the stile antico.
10
Taruskin, The Oxford History of Western Music, p. 817.
11
Ibid., p. 817.
effects of vocal inflection and delivery.”12 The focus is still on the speech qualities of the
vocal line; and music is only at its service.
By analyzing some examples of this repertoire, it is possible to observe several of
the features described above. In Sfogava con le stelle, Giulio Caccini sets a poem by
Ottavio Rinuccini:
Sfogava con le stelle un’infermo d’amore
Sotto notturno cielo il suo dolore;
E dicea fisso in loro,
O immagini belle del idol mio ch’adoro,
Sì come a me mostrate
Mentre così splendete
La sua rara beltate,
Così mostrate à lei
Mentre cotanto ardete
I vivi ardori miei;
La fareste col vostro aureo sembiante
Pietosa, sì come me ne fata amante.13
O. Rinuccini
Turning now to the score of Caccini’s Sfogava con le stelle, most of its musical
phrases coincide with the intonation units into which the poem could be divided. Thus,
from the third verse to the tenth verse, there are nine intonation units that correspond with
nine musical phrases.
12
Ibid., p. 829.
13
Translation: “There appeared under the stars a man sick with love, and under the night sky he disclosed
his pain; and he said, his eyes fixed on them, ‘Oh, lovely images of my adored Idol, just as you show me
her rare beauty as you shine so brightly, in the same way show her my keen pangs. Perhaps you might
make her pitiful with your golden aspect, just as you made me loving.’” Carol MacClintock, ed., The Solo
Song 1580-1730 (New York: W.W. Norton & Company, Inc., 1973)
EX. 1.1: From Caccini’s Sfogava con le stelle, mm. 5-16.
Unit  Measures    Text (accented syllables in capitals)       Intoneme analysis
1     m. 5        E diCEa,                                    AC1 CT
2     mm. 5-6     FIsso in LOro,                              AC2 AC1 CT
3     mm. 6-8     O immagini BElle del Idol mio ch’aDOro,     AC2 AC1 ct AC2 AC1 CT
4     mm. 8-10    Sì come a ME mosTRAte,                      AC2 AC1 CT
5     mm. 10-11   mentre coSI splenDEte                       AC2 AC1 CT
6     mm. 11-12   la sua RAra belTAte,                        AC2 AC1 CT
7     mm. 12-13   coSI mostrate à LEi                         AC2 AC1 CT
8     mm. 13-14   mentre coTANto arDEte                       AC2 AC1 CT
9     mm. 14-16   I VIvi ardori MIei.                         AC2 AC1 CC
TABLE 1.1: Intoneme analysis of Caccini’s Sfogava con le stelle.
As the musical excerpt and table illustrate, all of the continuative intonemes end
on the same or a higher note than their preceding AC1. On the other hand, as expected, in
measure 16, the conclusive intoneme (CC) resolves on a lower note than the AC1: from
A4 to G4 (this last pitch is the modal center of the piece). In most of the phrases, the
highest pitch of each of these units is set on the AC2, while the rest of the morphemes
move downwards and in shorter rhythmic values toward the longest note of the phrase,
on which we find the AC1. This is true at least for the first three intonation units, but
certain exceptions are found in the rest of the units in which the character of the narration
changes.
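The two pitch rules applied in this paragraph (a CT ends on the same or a higher note than its AC1, a CC ends lower) can be stated as a short check. In the sketch below, the helper names and the sample CT pitches are hypothetical placeholders; only the conclusive descent from A4 to G4 is taken from the analysis above.

```python
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi(note):
    """Convert a pitch name such as 'F#5' or 'A4' to a MIDI number (C4 = 60)."""
    letter, rest = note[0], note[1:]
    accidental = 0
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    return 12 * (int(rest) + 1) + NOTE_OFFSETS[letter] + accidental


def intoneme_fits(kind, ac1_note, final_note):
    """True when the final pitch stands in the expected relation to AC1."""
    ac1, final = midi(ac1_note), midi(final_note)
    if kind == "CT":  # continuative: same pitch as AC1 or higher
        return final >= ac1
    if kind == "CC":  # conclusive: lower than AC1
        return final < ac1
    raise ValueError(f"unknown intoneme type: {kind}")


print(intoneme_fits("CC", "A4", "G4"))  # the conclusive descent in m. 16
print(intoneme_fits("CT", "D5", "D5"))  # hypothetical pitches, for illustration only
```

A phrase that fails such a check is not an error in the setting; as the following paragraphs argue, departures from the neutral pattern are where expressive focusing shows up.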
In the phrases in which the narrator evokes what the “infermo d’amore” (the man
sick with love) said, the hierarchy of the pitches is reversed—with the exception of the
two “Mentre…” phrases that are time clauses directly connected to what follows. They
tend to ascend, thus making the AC2 lower than the AC1. This kind of departure from the
basic pattern of rhythmic values and pitch hierarchies is a sign of highlighting pragmatic
content or a specific expressive intention. In cases like this one, when AC1 is pronounced
at the level of AC2 or higher, the speaker is trying to draw the listener’s attention to the
topic. This focus attracts the listener at the same time as it imitates the persuasive tone of
the “sick man” while looking for approval of his “Idol.” Ascending lines are actually
more persuasive and create expectation.
The last intonation unit, “I vivi ardori miei,” is even a further case of focusing.
Although it respects the pitch hierarchy of AC2 (morpheme “vi” on D5) over AC1
(morpheme “mie” on A4), the penultimate syllable of “ardori” is elongated, displaying a
florid run that takes almost the whole of measure 15, except the last sixteenth-note. Thus,
an otherwise unstressed syllable of the intonation unit (although a stressed one at the
word level) gains prominence. This stretching treatment of the syllable “do” certainly
highlights the word “ardori,” enhancing with it the importance of the man’s suffering
with his “keen pangs.”
Another vocal setting style that engages in a “low-level mimetic relation” with its
text is the recitative. The operatic, oratorio or cantata recitative of all musical historical
periods continues the same kind of approach to text setting as monody, which respects
most of the intonational contours and phrasing of the text and gives the performer
flexibility on the beat for the final touch of speech-likeness. The recitative between Don
Giovanni, Donna Elvira and Leporello, before the aria “Madamina! il catalogo è questo,”
(Scene V from W. A. Mozart’s Don Giovanni) shows the same kind of close observation
of the phrasing and intonation of Lorenzo Da Ponte’s words as Caccini’s attention to
Rinuccini’s. From the opening seven measures of their recitative, it is
possible to extract the following seven intonation units.

Unit  Measures   Speaker       Text (accented syllables in capitals)        Intoneme analysis
1                D. Elvira     Sei QUI! / MOStro! / felLON! /               CT CT AC1 AC2 AC1 CC
                               NIdo d’inGAnni!
2     m. 3       Leporello     Che TItoli crusCANti!                        AC2 AC1 CC
3     mm. 4-5    Leporello     manco MAle che lo conosce BEne!              AC2 AC1 CC
4                D. Giovanni   VIa, cara Donna ElVIra,
5                D. Giovanni   calMAte quella COllera;
6     m. 7       D. Giovanni   senTIte…                                     AC1 CT
7     mm. 7-8    D. Giovanni   laSCIAtemi parLAR!14                         AC2 AC1 CT
TABLE 1.2: Intoneme analysis of recitative from Scene V, W. A. Mozart’s Don Giovanni.
These intonation units set into music observe most of the prosodic rules
previously summarized. This recitative fragment has a peculiar characteristic; it is a
dialogue between three characters in the middle of an agitated discussion. It shows a wide
range of emotions and circumstances, from Donna Elvira’s insults to Leporello’s aside
comments, which translate into intonational contextual effects such as focusing
techniques on specific fragments of the speech.
14
Translation of this fragment: D.E: You are here! Monster! Traitor! Nest of deceits!; Lep: Such pure
Tuscan titles! So much the better that she knows him!; D.G: Come now, dear Donna Elvira, calm this
anger; listen…let me talk! (translation by L. Guillén)
EX. 1.2: Recitative from Scene V, W. A. Mozart’s Don Giovanni.
The first section of Donna Elvira’s intervention is a direct call for Don Giovanni’s
attention. “Sei qui!” not only has the expected rising contour of a continuative intoneme
(CT), but it also has the coincidence of the AC and the CT on the last syllable “qui!” The
exclamation mark, which indicates the surprise and anger of the character when
discovering Don Giovanni, is represented by the doubling of the rhythmic value of the
quarter-note A4 on which “qui!” is set with respect to the preceding unstressed “Sei” of
only an eighth-note. Another emphasizing factor for “qui!” is that it falls on one of the
strong beats of the measure (the fourth one). “Fellon” is set in the same way as “Sei qui.”
But the previous “mostro,” which has the accent on the penultimate syllable, shows the
splitting of attributes that is typical between AC1-CT. The former is rhythmically longer
and has the same or a higher note than the CT.
The last section of Donna Elvira’s list of insults extends beyond these three short
periods of two syllables. The next insult, “Nido d’inganni,” opens with the AC2 on “ni”
holding the normally expected higher pitch of the whole unit, an F#5. From then on, the
pitch contour mostly descends—with some brief upward deviations—toward the AC1,
which falls on a D5. This whole descending line moves a major third below the AC2’s
F#5. But contrary to the expected lengthening of the AC1 on “ga,” this one stays on the
same eighth-note value as the previous unstressed syllables. In this particular case, this
setting may try to convey the urgency and tripping of the words in an infuriating
situation, such as the one Donna Elvira is experiencing with Don Giovanni. Although in
the score “ga” appears on a D5, if the performer applies the traditional appoggiatura on
an E5 before the resolution on the D5 of “nni!” the descending contour of the AC1-CC
succession takes place, at least in performance practice. On the other hand, if the
performer does not apply this appoggiatura rule, ending the phrase with two equal D5
notes on “-ganni,” the effect produced could be that of an AC1-CT succession. Under
these circumstances, there is a sensation of continuation, as if Donna Elvira were still
speaking while Leporello’s next aside to the public takes place. The audience will
listen to Leporello while Donna Elvira keeps insulting Don Giovanni.
Turning now to the next two intonation units pronounced by Leporello, both
clearly depart from the basic pattern because of the emphasis that the character wants to
put on the sarcastic adjectives modifying the word “titoli.” Donna Elvira’s insults are
now ironically called “titles.” This is immediately followed by the double-meaning
comment of Leporello: “manco male che lo conosce bene!” (So much the better that she
knows him!). Rossi mentions that when AC1 is pronounced “at the top level of AC2 or
higher, the speaker is drawing the listener’s attention to the topic as an effect of the
pragmatic accent (PA).” The AC1 in “Che titoli cruscanti” falls on “-can” and is set on a
D4, a fourth above the AC2, which is on an A3. That setting directs all the attention of
the audience to the adjective attached to it, “cruscanti.”15
The second phrase of Leporello presents the same kind of contour and “pragmatic
accent”—AC1 is higher than the AC2 by a minor third, from F#3 to A3. The same kind
of attention-calling effect is produced by Don Giovanni’s “Via, cara Donna Elvira,”
which is achieved by the same kind of pitch contour that Don Giovanni has just used in
the previous phrase. Of the last three units, one follows an unusual intonation profile
and the other two are shaped around the point of focus being highlighted. In “calmate quella
collera,” in order to keep attention on the verb, which embodies the order given by Don
Giovanni to Donna Elvira, Mozart respects the usual hierarchy of pitches. But the last
15
“Cruscanti” literally means “pure Tuscan.” During this period, Tuscan culture and society was
considered the richest and most refined of all the regions of Italy. Even today the Tuscan dialect is
considered the purest Italian. But in this case, Leporello is using it to make a sarcastic contrast with the
insults that D. Elvira has been enumerating, which are not very refined.
two phrases are conceived as intonational units of the same nature as the previous ones,
both presenting ascending melodic lines toward their AC1s.
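The interval comparisons made in this analysis can be checked mechanically. The following Python sketch is an illustration added for that purpose and is not part of the original analytical method. It assumes scientific pitch notation (C4 as middle C, numbered 60 as in MIDI), and all function names are hypothetical. The test for a “pragmatic accent” simply encodes Rossi’s condition as quoted above (AC1 at the level of AC2 or higher); interval names are assigned by semitone count only, so enharmonic spellings are ignored.

```python
import re

# Semitone offsets of the natural notes above C.
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Simple interval names by semitone count (a simplification: an augmented
# fourth and a diminished fifth, for instance, are both called "tritone").
INTERVAL_NAMES = {
    0: "unison", 1: "minor second", 2: "major second", 3: "minor third",
    4: "major third", 5: "perfect fourth", 6: "tritone", 7: "perfect fifth",
    8: "minor sixth", 9: "major sixth", 10: "minor seventh",
    11: "major seventh", 12: "octave",
}


def midi_number(note):
    """Convert a note name such as 'F#5' or 'A3' to a MIDI number (C4 = 60)."""
    match = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", note)
    if match is None:
        raise ValueError(f"unrecognized note name: {note!r}")
    letter, accidental, octave = match.groups()
    shift = {"": 0, "#": 1, "b": -1}[accidental]
    return 12 * (int(octave) + 1) + PITCH_CLASS[letter] + shift


def semitones(from_note, to_note):
    """Signed distance in semitones; negative values mean a descent."""
    return midi_number(to_note) - midi_number(from_note)


def interval_name(from_note, to_note):
    """Name the interval between two notes regardless of direction."""
    size = abs(semitones(from_note, to_note))
    return INTERVAL_NAMES.get(size, f"{size} semitones")


def has_pragmatic_accent(ac2_note, ac1_note):
    """Rossi's condition as quoted in the text: AC1 at or above the AC2 pitch."""
    return midi_number(ac1_note) >= midi_number(ac2_note)


# AC2 and AC1 pitches as reported in the analysis of the recitative.
units = [
    ("Nido d'inganni", "F#5", "D5"),
    ("Che titoli cruscanti", "A3", "D4"),
    ("manco male che lo conosce bene", "F#3", "A3"),
]
for text, ac2, ac1 in units:
    print(text, interval_name(ac2, ac1), has_pragmatic_accent(ac2, ac1))
```

Under these assumptions, the two Leporello units satisfy the condition, while Donna Elvira’s “Nido d’inganni,” whose AC1 lies below its AC2, does not.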
The first two examples analyzed were in Italian. At this point of the argument, it
is necessary to expand these observations to other languages, English in particular. As the
nature of English prosody differs greatly from that of other languages, a general explanation
of usual intonational tendencies will clarify the subsequent analysis of the Alto’s
recitative n. 8 from George Frideric Handel’s oratorio The Messiah. The recitative of the oratorio
also pays close attention to the intonational nature of its text source.
Daniel Hirst mentions that, according to Jassem (1952), English speech is
organized into two kinds of rhythmic units: the Narrow Rhythm Unit, which consists of a
stressed syllable followed by a sequence of unstressed syllables, and the Anacrusis,
which consists of a sequence of proclitic16 unstressed syllables.17 In the Anacrusis, the
syllables tend to be pronounced rapidly moving toward the subsequent stressed word. In
contrast, in the Narrow Rhythm Unit, the duration of each unstressed syllable tends to
be inversely proportional to the number of them in that unit, giving the impression of
isochrony. Tonal Units group Anacrusis with the preceding Narrow Rhythm Unit. At a
higher structural level, these Tonal Units are grouped into Intonation Units.
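The inverse proportionality described above can be made concrete with a toy calculation. The Python sketch below is only an illustration: the function names are hypothetical, and the durations (200 ms for the stressed syllable, 300 ms shared by the unstressed ones) are arbitrary placeholders rather than measurements taken from Jassem or Hirst. It shows why a unit with more unstressed syllables need not last longer, which is the impression of isochrony.

```python
def unstressed_syllable_durations(n_unstressed, unstressed_span_ms=300.0):
    """Share a fixed time span equally among the unstressed syllables.

    Each syllable receives span / n, so its duration is inversely
    proportional to the number of unstressed syllables in the unit.
    """
    if n_unstressed < 0:
        raise ValueError("number of unstressed syllables cannot be negative")
    if n_unstressed == 0:
        return []
    return [unstressed_span_ms / n_unstressed] * n_unstressed


def unit_duration(n_unstressed, stressed_ms=200.0, unstressed_span_ms=300.0):
    """Total duration of one Narrow Rhythm Unit in this toy model."""
    tail = unstressed_syllable_durations(n_unstressed, unstressed_span_ms)
    return stressed_ms + sum(tail)


# Placeholder values only: compare units with one to four unstressed syllables.
for n in range(1, 5):
    print(n, unstressed_syllable_durations(n)[0], unit_duration(n))
```

In this simplified model every unit with at least one unstressed syllable has the same total duration, while the individual unstressed syllables shorten as their number grows.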
According to David Crystal, most of the intonation units consist of five to eight
words.18 Utterances longer than this are usually broken into two or more intonation units.
16
Proclitic: adj. In Greek Gram., used of a monosyllabic word that is so closely attached in pronunciation
to the following word as to have no accent of its own; hence, generally, used of a word in any language,
which in pronunciation is attached to the following stressed word, as in an ounce, as soon, at home, for
nobody, to comprehend. (Oxford English Dictionary Online, second edition, 1989).
17
Hirst, “Intonation in British English,” p. 58.
18
David Crystal, Prosodic Systems and Intonation in English. (London: Cambridge University Press,
1969).
Although pragmatic or phonological reasons dominate the final decision, syntactic
criteria define where these breaks may occur.
The final accent of the Intonation Unit is usually referred to as the “nucleus.” In
defined assertions, the stressed syllables form a descending scale until the last stressed
syllable when the pitch of the voice falls abruptly to a lower level. The intermediate
unstressed pitches may actually remain more or less at the same level with some
fluctuations and do not necessarily descend toward the last stressed syllable. As Hirst
comments: “In fairly slow deliberate speech, the ‘down stepping’ effect can be quite
striking.”19 However, in spontaneous speech, the pitch drop is reduced to the point of
almost being imperceptible, “giving rise to the ‘hat’; or ‘bridge’ type of pattern of
sentences that has been described as typical of unemphatic utterances in a number of
different languages,” such as English.20 A second kind of tune, besides this descending or
“hat” one, is used for statements with implications, “Yes-No” questions, requests and
incomplete utterances. In these cases, the descending or “hat” shape is used until the
nucleus (last stressed syllable) is reached. Usually this last accent is on a low note and the
syllables that follow rise from then on. The rise of the last pitch is not a way of
transforming a statement into a syntactic question, “but rather a way of indicating that a
syntactic statement is being used pragmatically as a request for information.”21
“Incompleteness” in a sentence takes the form of rising nuclear tones; and those involve a
pragmatic evocative value. In contrast, “falling nuclear tones have proclamatory value.”22
19
Ibid., p. 61.
20
Ibid., p. 62.
21
Ibid., p. 65.
22
Ibid., p. 66.
Both of these contours, descending and ascending ones, indicate to the listener how the
utterance should be processed.
In emphatic statements, the final nuclear pitch accent rises to a higher level than
usual with respect to the preceding unstressed syllable. It is common practice to switch
the first accent of the intonation unit for a low accent coming from high pitch unstressed
syllables to reinforce the later high final accent and subsequent falling pitch.
The Alto’s recitative n. 8 from Handel’s oratorio The Messiah uses biblical text
from Isaiah vii: 14 and Matt. i: 23. The text is divided below into its Intonation Units.

1 NUC
BeHOLD!
2 AC1 NUC
a VIRgin shall conCEIVE,
3 NUC
and bear a SON,
4 NUC
and shall call his NAME
5 NUC
EmMAnuel:
6 AC1 NUC
GOD with US.
TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah
EX. 1.3: Alto’s recitative n. 8 from Handel’s The Messiah
The phrasing of Handel’s setting follows the division into intonation units
that the text could have in normal speech, except for the explicit separation of
“Emmanuel” into an independent unit. This gesture prepares the solemn delivery of the
sacred name of the “son of God,” while also creating expectancy. All nuclei, as well as
most of the first accents of these intonation units, fall on strong beats of the 4/4 meter in
which this recitative evolves. The only exceptions are the first accents on “bear” and
“call.” This may be explained as a way of de-emphasizing the separation of the utterances
that contain them in three independent intonation units, focusing instead on the continuity
of the syntactic unit starting with the subject/noun phrase “a virgin” to the last verbal
unit, “and call his name…” The simple chord accompaniment also connects these four
intonation units by holding a double bass pedal on the notes D3 and D2 for three and a
half measures.
All the units end with an ascending interval toward the last note, which carries the
main and last accent, the nucleus. Since these are incomplete statements, the ascending
interval in each of these units also creates a sense of continuity. As an exception, the
abrupt drop of the pitch a fifth below (from A4 to D4) of “his name” may be explained as
preparation for the first note of “Emmanuel,” a B3. The first two syllables of the name,
“Em” and “ma,” open with a solemn ascending perfect fourth interval. The second
syllable “-ma” carries the main accent of the unit, the accent of the nucleus. The closing
intonation unit, “God with us,” is the definite and final statement of this recitative, and as
such, it ends with a conclusive descending interval, a perfect fourth. The syllabic setting
of this recitative contrasts especially with the overvocalization of long melismas in the alto
aria that comes immediately after.
High-Level Mimesis: The 19th Century Lied
The 19th century Lied serves as an example of the kind of “poetic mode” in which
the listener negotiates through the song a balanced perception between the fragmented
semantic content and the sonic components of the lyrics. Nineteenth-century Lied
composers engaged in a high-level mimetic relationship with the text. In their attempts to
offer a reading of the poetic text by enhancing semantic meaning and poetic structure
through music—harmony, melodic and rhythmic patterns and form—nineteenth-century
Lied composers departed from the strict prosodic features of the text and offered an
interesting but distracting musical setting. This disengagement from the intonational
properties of the poetic text emphasizes the music/poetic aspects of its words and negates
the possibility of understanding the discourse in a linear way—as close as any poetic text
may get to be perceived in the “speech mode.”
Composers at the transition between the eighteenth and nineteenth centuries
struggled to make their musical settings a natural extension of the text. Johann Reichardt
(1752-1814) alleged that his melodies sprang automatically from repeated readings of the
poem and they were so closely interwoven with the text that they spoke and sang
pleasantly.23 Reichardt, as well as other composers belonging to the “Second Berlin
School” (Carl Zelter and Johann Schulz), committed to the least musical intervention
possible and put their efforts into letting the poetic text speak for itself as conceived by its
author.
The primacy of text over music did not last long. The next generation of Lied
composers—Schubert, Schumann, Brahms—allowed their music to speak and offer a
new perspective on the poem. By doing this, they offered an alternative reading of the
text. This new reading took the form of particular musical phrasings of the text and
treatment of textures employed in the piano accompaniment. These changes challenged
the cohesion of the poem but at the same time highlighted certain text fragments.
However, beyond the difference in approaches, the end result in both cases (earlier and
later Lied composers) is settings that are perceived in a balanced “poetic mode.” The
listener receives a re-elaborated version of the poetic text, stretched, fragmented and
webbed into melodic and harmonic treatments. Thus, new poetic images, rhymes, words,
23
J.F. Reichardt, cited by Jack M. Stein, Poem and Music in the German Lied from Gluck to Hugo Wolf
(Cambridge, Mass.: Harvard University Press, 1971), p. 34.
sounds that before may have been unattended are brought to the attention of the listener,
which altogether contribute to the reinterpretation of the poem.
Examining one particular poem, which was set several times by different
composers during the nineteenth century, will allow us to contrast the change in
compositional approaches and aesthetic conception of poetic text and music relationship.
Furthermore, this analysis will allow us to understand the compositional decisions that
highlight certain fragments or words of the poetic text and understate others.
From 1795 to 1907, an extensive number of songs were composed using the
poems from Johann Wolfgang von Goethe’s novel Wilhelm Meister, and more than a
hundred of these were settings of “Kennst du das Land?” Of special interest are the
settings of “Kennst du das Land?” by Johann Reichardt (1795), Carl Zelter (1795),
Ludwig van Beethoven (1809), Franz Schubert (1815), and Robert Schumann (1849).
Reichardt and Zelter intended to make their musical settings a natural extension of
the text and limit their musical intervention as much as possible. However, their settings
are songful renditions far from any speech quality of the text. Their melodies are a
genuine example of the simplicity and modesty of the Volkstümlichkeit style—in the
manner of folk melodies.24 These simple melodies (syllabic settings constrained in
register, stripped of embellishments and extended vocalizations) were respectful of
the obvious prosodic characteristics of the text, but followed a strict musical logic. The
lack of text order modification and piano interludes aims to limit the disruption of the
narrative flow. Although the repetitious nature of their strophic settings and simple chord
24
Although until the late eighteenth-century folklore was thought to belong only to the peasants and
“assigned a low cultural or intellectual prestige,” in the nineteenth-century folklore “was seen as
embodying the essential authentic wisdom of a language community or nation.” (Taruskin, The Oxford
History of Western Music, vol. 3, p. 122).
accompaniment allows listeners to ease their attention from the pure musical elements of
the song, they still apprehend the text in the “poetic mode” of listening. The listeners hear
the original sound patterns of the poem mounted over the “songfulness” of the sustained
tones of a voice singing a melody.
Beethoven’s, Schubert’s and Schumann’s settings of Mignon’s Lied further
emphasize the “poetic mode” of processing by unleashing their compositional creativity
in further elaborated arrangements abounding in piano interludes, more complex and
colorful accompanying textures and text modification. They built musical forms that
direct the attention of the listener to specific words or phrases in their text. Those may
synthesize ideas relevant to the reading that the composer makes of the poem. They
highlight these text fragments by: creating harmonic tension, announcing or delaying this
phrase with piano interludes, repeating it several times, detaining the flow of the piece
rhythmically, etc.
A brief explanation of the insertion of the poem in the novel and Mignon’s
previous story could help to understand some musical decisions made in the settings of
the composers under study. “Kennst du das Land?” is a poem inserted in Wilhelm
Meisters Lehrjahre (Book Three, Chapter One). The poem is actually a song that the
character Mignon sings. She is supposedly an orphan girl, whom Wilhelm rescued from
the abusive master of a circus troupe. She has neither a home nor parents that are known
at that point in the novel. Mignon’s traumatic kidnapping made her become completely
averse to remembering her past in detail. When she sings this song, she is already living
with Wilhelm, and although she addresses him as “father,” a secret passion for him as a
man has started to grow in her. At the beginning of this first chapter in the third book,
when Wilhelm hears her singing from his room accompanying herself with a zither, he
becomes interested in the lyrics of the song. He asks Mignon to repeat the song; he
writes it down and translates it; however, as Goethe describes in the novel:
He found, however, that he could not even approximate the originality of the phrases, and
the childlike innocence of the style was lost when the broken language was smoothed
over and the disconnection removed. The charm of the melody was also quite unique. She
intoned each verse with a certain solemn grandeur, as if she were drawing attention to
something unusual and imparting something of importance. When she reached the third
line, the melody became more somber; the words “Do you know it, indeed?” were given
weightiness and mystery, the “There!, there” was suffused with longing, and she
modified the phrase “Let us go” each time it was repeated, so that one time it was
entreating and urging, the next time pressing and full of promise.25
This description was closely followed by some of the composers, like Zelter, when
setting this poem into music.
In Mignon’s Lied each of the three stanzas introduces a description from a
different perspective of her lost paradise. The first stanza is the description of an earthly
paradise to which she urges her “beloved” to take her. In the second stanza, the life is
gone; it is an architectural paradise, where everything is glittery and cold as the statues
that pronounce the central question of the song, which reveals Mignon’s suffering (“Was
hat man dir getan?”). Then the term “beloved” is displaced by “protector.” The third
stanza is the description of nature—similar to the first stanza—but this time it is a misty
landscape where everything is confusing and intimidating; therefore, her protector now
becomes her father. Thus, her lost paradise has a warm and voluptuous side, a cold and
glittery side, and, finally, a confusing, misty and intimidating side. For that reason,
although she wishes to go back to her homeland some day, she wants to go in the
company and under the protection of Wilhelm: her lover, protector and father.
25
Johann W. von Goethe, Wilhelm Meister’s Apprenticeship. Edited and translated by E.A.Blackall.
(Suhrkamp Publishers: New York, 1989)
Reichardt, as well as Zelter, gave preference to a strophic setting with regular
phrasing. They sought a style of setting whose clarity was close to the dignified
simplicity that they admired in folk art. As Carl Dahlhaus remarks, “Any composer who
tried to recapture the natural state of folksong had to conceal the exertions of his art.”26
These ideals of the “Second Berlin School” are echoed by Reichardt who states, “For the
artist the supreme art lies not in the ignorance of his art but in its renunciation” (Geist des
Musikalischen Kunstmagazins, 1791).27 This composer’s renunciation of his craft to
accomplish clarity and simplicity in his compositions is in accordance with Goethe’s own
opinion about the degree of interference of the composer’s musical creativity with the
poem. Goethe uses the term “false participation” to describe any musical response to the
poem’s meaning beyond the strict accompaniment of the declamation of the poem. The
composer surrenders his creative space to the art of the poem. The composer’s duty is
only to create the appropriate musical ambiance, which is drawn from the general
meaning of the poem. By creating this musically suggestive context, the composer helps
the audience to appreciate the richness of the meaningful inflections of the text itself.
Thus, Goethe wrote in a letter sent to Carl Zelter on May 2, 1820:
The thing to do is to place the auditor in the mood that the poem suggests, letting the
imagination then create its own figures at the instance of the text, without his knowing
anything of the how of the process...To paint tone with tones, to thunder, crash, paddle
and plash, is detestable.28
While in “Kennst du das Land?” Zelter sets the mood of the poem as Goethe
requires, Reichardt subscribes to a simpler and less intrusive folk-like song
26
Carl Dahlhaus, Nineteenth-Century Music. Translated by J. Bradford Robinson. (University of
California: Berkeley, Los Angeles, 1989).
27
Ibid., p.109.
28
Carl Friedrich Zelter, J.W. Goethe: Briefwechsel (Leipzig, 1987), p. 216. Translation of the quote by
Christopher Gibbs.
style. His setting is strictly strophic. In the accompaniment, he harmonizes with simple
chord support and doubles the melody throughout the song. Melodic phrases are
completely regular and the setting of the words is mainly syllabic. The song has only a
short modulation to its dominant and has no instrumental interludes. Zelter’s song is a
slightly modified strophic setting; the second and third strophes show a slight harmonic
modification in the piano and the voice between mm. 11 and 15. This song presents, as in
Reichardt’s, mainly a chord accompaniment with mm. 6 through 13 arpeggiated in
triplets in each strophe.
Zelter gives careful indications of the expression for each strophe. These
expressive indications closely follow the description of Mignon’s performance in
Goethe’s novel. Thus, Zelter asks for a Pathetisch (pathetic) mood mit Anmut (with
grace), in correspondence with Goethe’s description of the solemn grandeur of the
opening of Mignon’s song. In the third line, Zelter asks for a more getragen (sustained, solemn)
mood as Goethe talks of the melody being more somber. For the words “Kennst du es
wohl?” (“Do you know it, indeed?”), he asks for an anwachsend (crescendo in emotional
intensity), according to the weightiness and mystery of Goethe’s description. The
variation in the expressive mood of this last refrain is also closely followed by Zelter.
Even in Reichardt’s and Zelter’s settings, where most of the obtrusive musical
devices are minimized, the listeners will tend to perceive the poetic text in a “poetic
mode.” The intrinsic musicality of the poem itself mounted over a melody—although
simple, syllabic, unembellished, and constrained register-wise—draws the listeners’
attention away from the linear discourse.
EX. 1.4: Reichardt, Kennst du das Land, 1st stanza.
EX. 1.5: Zelter’s Kennst du das Land, mm.1-27.
EX. 1.5 (cont.): Zelter’s Kennst du das Land, mm.28-53.
Convinced of the power of music to communicate feelings and sensations
otherwise ungraspable in words, the next generation of Lied composers broke free from
the word-setting constraints delineated by the “Second Berlin School” and
unleashed their musical voices into their compositions. They ventured into busier textures
and more elaborate melodies, which certainly fragmented the poetic discourse further. At
the same time as they emphasized musical qualities already existent in the poem, they
focused their expressive forces on the strict musical elements of the setting. Almost as if
they were conscious of the unavoidable distortive effect that their settings inflicted upon
the poem, in the music of the settings they offered their listeners a synthesis of the
feelings at play in the poem: highlighting specific words with repetition, rhythmic and
melodic procedures, or underlying accompanying textures or harmonies.
Edward T. Cone says that a composer cannot set a poem with all its connotations;
some aspects of the poem will always be left out. In a case where the composer wants to
consider all the possible readings of the poem, he should include every point of view
translated into music in order to give the total meaning of the poem.29 Otherwise, what
results is a new creation that does not show the poet’s persona but, rather, the
composer’s. Inevitably, this will be a new reading from the composer’s point of view.
His or her particular setting will highlight certain words and sounds, which will combine
in a completely new set of images associated with different moods and ideas.
If we turn to Beethoven’s and Schubert’s settings, we are able to appreciate an
aesthetic change in the following terms: the composer’s voice increases its active
participation in the musical result of the song. As Taruskin comments, “The basic vocal
29
Edward T. Cone. “Some Thoughts on ‘Erlkönig.’” In The Composer’s Voice. (University of California
Press: Berkeley, Los Angeles, 1974) p.19.
idiom is always that of Volksweise (folk tune), the ‘natural’ music representing the ‘We,’
inflected by eccentric details of melody, harmony, or accompaniment that at extreme
moments allow the ‘I’ to intrude.”30 Both Beethoven’s and Schubert’s voices translate
into their music the anxiety and urgency of Mignon’s request. They especially focus on
aspects of those concerns that they are interested in emphasizing from the poem. All
those become melodic, rhythmic, harmonic and textural effects at the hands of Beethoven
and Schubert.
Both composers set their song in strophic form with minor variations in the third
stanza. Both songs are in the key of A major and follow a similar tonal plan. Within this
formal frame, the contrasting textures, change of dynamics and rhythmic acceleration of
the second section of each stanza (“Dahin! Dahin!”) show more than a mere transcription
of the song that Mignon could have actually sung. Through their particular musical
treatments, Beethoven and Schubert show a reinterpretation of the unconscious concerns
of the character and her anxiety, once confronted with her critical personal situation: an
exposition of the unconscious feelings of Mignon told in the musical language of these
composers.
30
Earlier in his Chapter 35 “Volkstümlichkeit,” Taruskin explains the kinds of negotiations established
between the “I” and the “We” in previous nineteenth-century Lied, which crystallized the Volkstümlichkeit
ideals. He comments about the “impossibility of a particular ‘I’ without a particular ‘We,’” which may be
explained by a reformulated idea of cultural relativism, “the irreducible human difference”: “A human was
human only in the society of other humans, and the natural definer of societies was language. Since there
could be no thought without language, it followed that human thought, too, was a social or community
product…” In this manner extending language as expressive of all cultural aspects of a society, the concept
of a collective spirit idiosyncratic to each particular society arises. And this one was found in folklore
manifestations. So, the Lied was mainly concerned with the faithful portrayal of that “We.” (Taruskin, The
Oxford History of Western Music, vol. 3, p. 120-123).
EX. 1.6: Beethoven’s Kennst du das Land, mm.1-17.
EX. 1.6 (cont.): Beethoven’s Kennst du das Land, mm.18-43.
In Beethoven’s Lied, the gay anxiety of this adolescent is manifested in a playful
Più mosso in 6/8. This section functions as an answer to the “Kennst du es wohl?”
rhetorical question of Mignon, which together with the other “Kennst du?” questions are
the structural columns of Beethoven’s song. All these questions are set to the same
rhythmic pattern:
EX. 1.7: Rhythmic pattern of Beethoven’s “Kennst du” questions.
This pattern contrasts with the rest of the musical phrases with the use of one long
rhythmic value opening and another closing it: a quarter note at the beginning of the
phrase and a dotted quarter, eventually extended, at the end. This produces a slowing
of the flow of the song and, as a consequence, the highlighting of these questions in the
poem. The last “Kennst du es wohl?” of each stanza is also preceded by a short piano
interlude, anticipating instrumentally with the same melody and harmony the question
that will arise afterwards. Thus, the attention of the listener is drawn inevitably towards
those questions.
Schubert also directs the flow of the first part of each strophe to the same question
(“Kennst du es wohl?”), this time set in recitative style. This offers a quasi-speech effect
in the middle of a completely “songful” melody. The expectation is built through the two
preceding measures, which serve as preparation to the D# of the French augmented sixth
chord sustained under the question, which in the next measure resolves on the dominant
chord, E Major. These procedures signal the arrival of a phrase to which the composer
wants the listener to pay special attention, “Kennst du es wohl?” (Do you know it
indeed?), which at the same time prolongs the expectation for an answer, musically as
well as lyrically.
In Beethoven’s setting, the answer is found in the next measure; the first “Dahin”
resolves on the tonic of the original key, A Major—after a short deviation to C Major in
the preceding section. Schubert delays the answer by displacing the clear resolution until
the end of the strophe—twenty-two measures later. The harmonic tension built measure
after measure while waiting for the final cadence parallels the frantic searching of
Mignon for the realization of her dream, to go back to her homeland, which always seems
far from fulfillment. While Beethoven portrays a calmer attitude on Mignon’s part, a
kind of contained anxiety, Schubert does the contrary. Mignon’s over-excitation is set
into music with unresolved harmonies. Furthermore, the frantic driving flow of triplets
from m. 8 of the song does not stop until the end of his “Etwas geschwinder” (“A little
faster”) section. The only moment when the triplet texture is suspended is under the
question “Kennst du es wohl?” Finally, the alteration of the text, especially the desperate
repetition of “Dahin,” emphasizes her emotional state. This whole delayed answer section
lasts twenty-two measures in Schubert and fifteen in Beethoven’s setting—and only six
measures in Reichardt’s setting.
EX. 1.8: Schubert’s Kennst du das Land, mm.1-18.
EX. 1.8 (cont.): Schubert’s Kennst du das Land, mm.19-40.
More than any of the other settings, Schumann’s setting of “Kennst du das Land?”
achieves the new synthesis of the “I” and the “We” of Lied in romantic terms. He
approaches this poem with a strict strophic form, which contains a very slight
modification in the interlude between the second and the third stanzas—a deceptive
cadence in place of the original cadence to tonic. This cadence is as deceptive as
Mignon’s answer to the preceding question of the statues, “Was hat man dir, du armes
Kind, gethan?” (“Poor child, what have they done to you?”)—this is the central question
of the poem in meaning and placement. In his setting of Mignon’s Lied, Schumann seems
to have the same intentions as the “Second Berlin School” composers. His choice of a
strophic setting and his own indication at the beginning of the song of “Langsam, die
beiden letzten Verse mit gesteigertem Ausdruck” (“Slow, the last two verses with
heightened expression”) give that impression. But this is not the case. The melody
sung by Mignon is neither the ideal Volksweise (folk tune) nor the simple and transparent
melody of a fragile adolescent. The accompaniment, with its thick harmony, suspensions,
appoggiaturas and deceptive cadences, builds a musical fabric that neither serves as an
unobtrusive support of the text nor represents the simplicity of Mignon’s zither playing.
Also, the relation between accompaniment and vocal melody with its displaced doubling
in the piano, so characteristic of Schumann’s songs, is far from the clear chord
accompaniment and strict doubling of the melody necessary for the delivery of the “only
possible reading of the poem,” according to the “Second Berlin School.” All these
features are put in place by Schumann to delineate Mignon’s psychological and
emotional state, or his interpretation of it. In the song, Mignon talks with the voice of the
composer.
The difference from Beethoven’s and Schubert’s settings, which are structured
around the “Kennst du?” rhetorical questions of Mignon, is that Schumann seems to drive
the flow of the song to each of the three forms of address that Mignon uses for Wilhelm.
The thick web of triplets in the two hands of the piano that starts in m. 10 does not stop
until m. 25, coinciding with “o mein Geliebter” (“o my beloved”). The same procedure
is repeated in the following two stanzas where the flow of the triplets ceases upon
arriving at “o mein Beschützer” (“o my protector”) the first time and “o mein Vater” (“o
my father”) the second time. Thus, the flow of the song seems to be organized around
these climactic points, which reflect one of Mignon’s major concerns: her relationship
with Wilhelm. The ambiguity of this relationship drives her to ask herself three reflexive
questions: Are you my lover? Are you my protector? Are you my father?
Furthermore, Schumann’s is the only setting that opens with an idiomatically
pianistic introduction, which hints at the chromatic world that he will develop later on in
the piece. This introduction will become the interlude played in between stanzas. When
the voice starts, this pianistic treatment gives place to a more open texture, which allows
the text to transcend and reach the listener in a relatively clear manner. But once the
audience is introduced to the landscape of each stanza, the thick web of triplets takes over
—starting in m. 10—with its displaced doubling and complex chromatic language until
the next landmark: “o mein Geliebter,” “o mein Beschützer,” “o mein Vater.”
EX. 1.9: Schumann’s Kennst du das Land, mm. 1-20.
EX. 1.9 (cont.): Schumann’s Kennst du das Land, mm. 21-41.
By use of the described musical procedures, these Lied composers permeated the
poem with their musical voices, suggesting, through the accompaniment and its relation
with the vocal line, things that are not said in the words of the poem. They relied on the
music for this task because they conceived music as a language equal to literature.
Music was capable of transmitting sensations and feelings that the audience would
appreciate only through the direct experience of listening—a type of physical connection.
Rosen says that for the nineteenth-century composers, the word is no longer
embellished and imitated by the music.31 Now, the music becomes a language by itself, a
separate symbolic universe with its own logic and communicative-expressive power.
However, although music transmits feelings and sensations captured by the reading of the
composer, it only represents their form and not their content. The listener feels the
movement and impulses of the music conveying those feelings as a physically empty
message, which only his own imagination will fill with a determined content. In this way,
the listener will capture the structure of the composer’s personal reading and complete
the content of it with his own personal reading.
The particular dramatic point of view adopted in a song exerts an enormous
influence on the concrete musical manifestation of the special poetic features of a poem.
And the features that attracted Romantic composers dwelt at a structural level of the
poem. Since they were mainly concerned with content resulting from the elaboration of
several internal layers of meaning of the text, they molded their musical setting to portray
these emotional states or concepts. Neither narration of the dramatic events nor speech
qualities of the text were major concerns at this point in music history. The Lied was
31
Charles Rosen, The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995)
p. 68.
mainly music. The music of its poetic text ran parallel to the highlighted text fragments
and, integrated into those purely musical elements of the song, engaged the listener in a
strictly “poetic mode” of perception.
III.
FURTHER EXPLORATIONS:
MEREDITH MONK AND LUCIANO BERIO
As an academic and a performer of new music, I felt it was important to test my
text perception theories in “post-tonal” repertoire. I was especially interested in certain
composers of the second half of the twentieth century and the beginning of the twenty-first
century, such as Meredith Monk and Luciano Berio, who have produced music that
reflects their concerns about the semiotics of paralingual vocal gestures and intonation. In
exploring these issues, they created music that is the practical representation of the way
we listen in the poetic mode taken to an extreme—a “poetic mode” with a special
emphasis on the “auditory mode.” By employing the sonic elements of language as the
structural components in their pieces, they strived to offer their audiences a direct
experience of the struggle and negotiations that text undergoes once set into music. They
wanted people to attend to those paralingual nuances that we usually disregard when
listening to speech and disregard even more when speech is set into music. In their hands
these paralingual elements become music to our ears.
Monk and Berio share concerns and interests in exploring the tensions between
text and music. They observe the fragmentation and deformation that any text set into
music undergoes and its resulting unavoidable degrees of unintelligibility. As an
outgrowth of these observations, Berio proposes to create a “new kind of relationship
between word and sound, poetry and music...,” the function of which “would not be the
contrasting or mixing up of two separate expressive systems but rather the creation of
complete continuity, so that the shift from one to the other would be imperceptible,
without drawing attention to the difference between a logical-semantic mode of
apprehension (as adopted for the spoken message) and a musical mode...”1 This kind of
word and sound relation would activate in the listeners the “poetic mode” of perception,
as defined by Reuven Tsur.
Monk and Berio are not the only composers who have introduced new
perspectives on the language and music relationship and influenced subsequent
compositional tendencies in the vocal music realm. Composers such as Pierre Boulez and
Karlheinz Stockhausen have also promoted rethinking this relationship with their writings
and compositions.2 On the one hand, Boulez developed his concept of “centre and
absence” in which the text remains at the notional center of the composition process. On
the other hand, Stockhausen envisioned a “sound-word continuum” in a perpetual
transition from listening to comprehension by softening the boundaries between these
two media. As explained later in this chapter, Berio conceives of a similar fluid
transitional process between sound and word.
It is also relevant to mention two developments of the second half of the twentieth
century, “concrete poetry” and “text-sound composition.” Both have predecessors in
movements taking place during the first two decades of the twentieth century. The former
has an earlier direct predecessor in the Dada poetry of Kurt Schwitters, Hugo Ball, and
Tristan Tzara, and the latter in the Italian Futurist experiments of Russolo and Marinetti.
The “concrete poetry” movement aims to create a new artistic reality. Without the
complete suppression of semantic meaning, this movement seeks to eradicate
1
Luciano Berio, “Poesia e musica un’esperienza,” in Incontri Musicali 3 (1959), 99.
2
An explanation of Boulez’s concept of “centre and absence” may be found in Orientations: Collected
Writings (1986). Stockhausen explains his ideas about text and music in his paper Speech and Music read
in 1959 at the Darmstadt Summer School and later published in “Die Reihe.” Some of the most influential
representation of any external reality. Thus, its focus moves towards the phonetic sounds
of words, shapes of letters, breaking of the formal semantic units and punctuation rules,
etc. “Text-sound compositions” renounce the optic dimension and concentrate on the
relationship of sound and meaning. These works exist only in recording format (sound
pieces without a written version). This branch of electro-acoustic music holds among its
more important examples compositions such as Steve Reich’s Come Out (1966), Nono’s
La fabbrica illuminata (1964) or the first region of Stockhausen’s Hymnen (1966).
The non-semantic internal reality of language or indefinable vocal sounds are of
interest to Monk and Berio, but unlike in “text-sound compositions,” these composers
explore these concepts without any electronic interventions. Their pieces may be
reproduced in live performance by one or several singers without any processing of their
voices. The fact that the full palette of vocal sounds employed by these composers
originates acoustically from the natural resources of the human voice is of special interest
to this dissertation.
As mentioned before, both Berio and Stockhausen have proposed to soften the
boundaries between speech and music, creating a “sound-word continuum.” This
continuum is created when speech approaches music and music approaches speech to the
point of the dissolution of the boundaries of sound and meaning. Berio considers the first
and primordial step in creating this “sound-word continuum” to be the dissolution of the
speech continuity as a logical/semantic discourse. Thus, he proposes to explore beyond the
natural fragmentation that any text undergoes when set to music—by breaking words into
their phonetic elements, stretching them, masking their enunciation, and mixing them
pieces in the realm of explorations of the tensions between language and music were Boulez’s Pli selon pli
(....), Stockhausen’s Gesang der Jünglinge (1955-6) and Momente (1962-4).
with paralingual sounds—to submerge the listener in the deepest nuances of language and
the human phonatory apparatus. In this way, he intends to dissect the elements of
language and observe their relations and tensions from inside out, while at the same time
revealing the communicative power of the sonic aspects of language beyond the
semantic-linguistic content of the syntactic units and system. He makes use of poems,
political speeches, academic texts, literary narrations and other kinds of discursive texts
in their entirety or in fragments. In A-Ronne, as in many of his previous and later pieces,
Berio recreates the semiotic structural manifestation of the agony of language when set to
music. As Berio himself describes, A-Ronne is a dramatization of the sonic aspects of
language in a radiophonic theater.
In A-Ronne, as well as some of his early vocal pieces, Berio extracts the purely
musical elements from his literary textual sources and, as Osmond-Smith comments, uses
them “to explore the borderline where sound as the bearer of linguistic sense dissolves
into sound as the bearer of musical meaning: a territory that…he was to make very much
his own.”3 The words’ musical elements become structural components in his pieces.
Thus, Osmond-Smith describes the process of creating tension between the sonic
elements extracted from the words in Berio’s Thema (Omaggio a Joyce):
…he then proceeds to work in tension with it, juxtaposing and superposing phonetic
elements so as to produce consonant groupings that the human voice would normally find
hard to articulate in rapid succession (such as voiced and plosives)…Out of this
impossible vocalism, comprehensible speech…momentarily emerges, only to be
engulfed: relative comprehensibility has become a compositional parameter to be handled
in much the same way as textural density or, within a pitched context, harmonic
density…It may be achieved by the fragmentation of originally linear texts…by
superposition of texts…by dissolution of texts into their component phonetic materials, or
more usually by a combination of these.4
3
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1991), 62.
4
Ibid., 62-63.
These same kinds of procedures are explored in A-Ronne; only in this piece Berio uses
the natural voices of eight singers instead of electronic techniques.
Departing from similar observations, Monk arrives at different results. While
Berio places a magnifier on the sonic transitions of language but always in the syntactic
context of a real text, which could be stretched and deformed beyond recognition, Monk
steps out of the syntactic/linguistic frame and treats the phonemes as pure sounds. No
linguistic text of any kind precedes her pieces. Although she shares with Berio an interest
in the communicative power of the sonic aspects of vocal sounds, she specifically focuses
on their emotional communicative potential. Monk conceives of the voice as a tool for
“demonstrating primordial/prelogical consciousness...a direct line to emotions [and]
...Feelings that we have no words for.”5 In this way she exploits the potential of the
“songfulness” of the human voice.
For Monk the voice is in complete connection with the body. At the same time,
the physicality of the voice is one of her fundamental concerns: “The body of the voice/
the voice of the body.”6 As a consequence, in the mid-sixties, she began a methodical
exploration of the voice as an instrument that could develop its own idiosyncratic
vocabulary: “I realized that the voice could be as fluid as the spine, that it could have the
flexibility and range of the body.”7 She immersed herself in the study of vocal color,
voice placement and nuances in the articulatory/phonatory apparatus and applied her
discoveries to controlled explorations of vocal pitch, volume, speed, texture, timbre,
breath, and strength.
5
Meredith Monk, “Notes on the Voice,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The
Johns Hopkins University Press, 1997), 56.
6
Ibid.
7
Robert Schwarz, Minimalists (London: Phaidon Press Limited, 1996), 189.
Berio does not ignore the physicality of the voice either and exposes the gestures
of vocal sound production through his music. Although in A-Ronne Berio makes
extensive use of paralingual sounds, such as breathing, sighs and other vocal noises, as
part of the musical process, he explains that he does not conceive of them as mere sound
effects but signs carrying meaning:
I am not interested in sound by itself—and even less in sound effects, whether of vocal or
instrumental origin. I work with words because I find new meaning in them by analyzing
them acoustically and musically, I rediscover the word. As far as breathing and sighing
are concerned, these are not effects but vocal gestures which also carry a meaning; they
must be considered and perceived in their proper context.8
By exploring in detail human vocal nuances, both Monk and Berio create proximity with
their audience. Listeners may directly relate to these tangible sounds because they are
produced by the same gestures that any human being uses in everyday normal speech.
Instead of having a passive audience admiring the virtuosic sound production of
the performer, Berio’s and Monk’s music invites its audience to experience it physically.
Richard Middleton comments that listeners “… identify with the motor structure,
participating in the gestural patterns, either vicariously, or even physically, through dance
or through miming vocal…performances.”9 Ethnomusicologists such as John Baily argue
that the movements that players perform while playing their instruments directly affect
musical structure, so “Music can be viewed as a product of body movement transduced
into music.”10 In this manner, the listener gains a firsthand experience of the gestures that
are structural to Monk’s and Berio’s pieces.
8
Rossana Dalmonte and Bálint András Varga, Two Interviews/Luciano Berio, trans. David Osmond-Smith
(New York: M. Boyars, 1985), 141.
9
Richard Middleton, Studying Popular Music (Philadelphia: Open University Press, 1990), 243.
10
John Baily, “Movement patterns in playing the Herati dutar” as quoted in R. Middleton, Studying
Popular Music (Philadelphia: Open University Press, 1990), 243.
Monk comments that “By working with your own instrument, you actually come
across gestures that are trans-cultural, and in certain ways you become part of the world
vocal family.”11 Her vocabulary involves human sounds that many men and women
could find natural and organic; thus, her music triggers a close emotional connection
between her audiences and her musical idiom. One could feel that those sounds are part
of our essential primal vocabulary: pre-lingual and, at the same time, beyond language.
Monk explores the communicative possibilities of vocal sound devoid of any
form of linguistic meaning—beyond semantic content. Vocal sound is presented as pure
sound. In this way, it opens to the audience the wide spectrum of potential meanings that
any sound usually evokes. The baggage of meanings that any human vocal sound carries
is not always precise and easy to define, allowing all sorts of associations. In a linguistic
communicative setting where human sound is the carrier of language, these multi-
associative meanings are usually overlooked. Monk wants to bring to her audience an
awareness of the potential multi-faceted meaning of vocal sounds.
To open up this kind of associative vocal sound/meaning spectrum, Monk keeps most
of her pieces wordless, restricting herself to moaning, shouting,
sighing, breathing, whispering, trilling, sliding, doing glottal breaks, and chanting on
nonsense syllables. This is a conscious aesthetic decision, since she starts from the
conviction that it is almost impossible to comprehend text put into music, or at least to
understand it fully as in a normal colloquial situation. As a result, she attempts to direct
our attention directly to the sound of the voice without any obstacle. Even when she
composes pieces such as Three Heavens and Hells (1992), where she uses exclusively and
11
Schwarz, Minimalists, 190.
exactly the four words of the title as the text of the piece, she almost strips the words
of their semantic/linguistic meaning. The quasi-mechanical repetition of these words
over the piece’s duration of twenty-one minutes and ten seconds produces a progressive
fading of meaning until these words become empty vessels. These words keep their
pragmatic sense but not their semantic meaning. The effect is finally similar to the pieces
in which language is completely absent; the audience turns its attention toward the vast
spectrum of meanings of vocal sound.
In regard to her piece Atlas, Monk explains that it was meant to bypass discursive
thought to “go directly to the heart.”12 She argues that in any case, she usually is not able
to understand a word in opera. Starting from the idea of language “as a screen in front
of the emotion and the action,” she prefers a direct communication that “bypasses that
step so that you’re really dealing with a very primary and direct emotion.”13
Volcano Songs: Meredith Monk
Monk’s Volcano Songs: Duets (1993) are an interesting example of wordless
songs. They explore the full potential of the “songfulness” of the voice and the pure
musicality of the human vocal gestures. As said before, Monk’s piece is a self-conscious
representation of the way we listen in the “poetic mode” with emphasis on the “auditory”
elements of perception. She chooses to make music from the stripped musical elements of
the voice that we usually unconsciously apprehend and to which we emotionally connect
when listening to any other vocal piece—whether popular song or “art” song.
12
Ibid., 191.
13
William Duckworth, Talking Music (New York: Simon & Schuster Macmillan, 1995), 359.
In an interview given in 1996, Monk commented to the ethnomusicologist
David Gere that these duets were conceived as processes of nature. Each of them only
explores a single particular vocal quality. Monk preferred simplicity over compositional
fanciness; she says: “I was thinking: Why don’t you just take the purest color in each
song and only work with that. Like one brush stroke or a haiku.”14 The creation of a
particular character in each song is central to Monk. In Volcano Songs as in other pieces,
she looks for “the voice” of each piece, the one that creates a world in itself and is not
similar to any previous one.
This kind of restrained canvas that Monk imposes on herself in each Volcano Song is
not unusual for her music in general and appears to be an intentional procedure in other
pieces like Vessel (1971). This restraint manifests itself in two aspects of her music. First,
raw materials tend to be simple, but her controlled delivery—a certain solemnity in her
performance that creates “momentum”—transforms them into music of a universal scope
and stature. Second, she is interested in slowing down musical processes to get a slice of
them. She wants the audience to taste every single moment. The same detailed delivery
that Marcia Siegel and Kenneth Bernard have observed in her theatrical and dance
movements is present in her music.15 Most of her pieces are constructed as a succession
of single episodes, repetitive sequences of slow, sustained notes
or glissandi, or swirling rhythm.
Several of her compositional techniques were revealed to me in a palpable way
through my direct experience in workshops held by The Meredith Monk Ensemble. They
guided participants through processes similar to those they established with Monk during the
14
Meredith Monk, interview by David Gere, Volcano Songs (CD insert), ECM, June 19, 1996.
creative process of some of her ensemble pieces. She usually proposes materials and
processes, and through improvisatory techniques, they mold those materials, each in their
own idiosyncratic ways. She wants to hear, through their vocal sounds, their backgrounds,
their experiences, their personalities, their humanity, their imperfections. After long
sessions of experimentation, a final version is put together and fixed. For the most part,
there is a preference for the oral transmission of her pieces—although these versions
finally do get scored, which was the way in which her ensemble members taught
fragments of her repertoire to the workshop participants.
In terms of the musical processes and materials that Monk employs in her pieces,
the musical structures show a predominant horizontal conception: short cells that develop
linearly, “plain chant” style or “folk-flavor” simple melodies that succeed one another.
These sometimes undergo slow processes of gradual transformation. At other times, each
component succeeds another, but in their transition, there is a period of overlapping in
which one theme fades out and the other slips in. Monk calls this process “wash,” and
this is one of the several musical processes that are directly associated with cinematic
editing techniques. Other musical procedures, such as canonic textures, are rooted purely
in the musical tradition.
Monk’s approach to music-theater connects to a general non-narrative conception,
which she applies in her explorations across media: music, dance, theater, and video. In
terms of the structure of her staged pieces, Monk’s preference for non-narrative models
causes her to choose a more fragmented poetic style in which things happen one at a
15
Marcia B. Siegel, “Virgin Vessel” and Kenneth Bernard, “Observations On Recent Ruins” in Meredith
Monk, ed. D. Jowitt.
time, and it is not until the end that the spectators are able to intermingle the separate
episodes or scenes and make sense of them as a whole theater piece.
According to Deborah Jowitt, during the theatrical presentation of Volcano Songs,
Monk walks to a row of three rectangles that lie on the floor and, in a ritualistic manner,
removes the black pieces of cloth that cover these rectangles.16 After each uncovering
action, she lies in a crumpled position on each pallet, while a bright light flashes on and
off. Once she stands up, the light turns the pallet a luminous green, revealing on it a
dark imprint left by her body. Then she proceeds to the next rectangle to perform the
same task and the previous one fades away. This seemingly magical theatrical effect
transforms advanced technology, such as photosensitive paper, into a “poetic and
apocalyptic” memorial of victims of volcanic eruptions or nuclear disasters such as
Pompeii or Hiroshima.17
The volcanic theme brings in the motive of “transformation” that lies under all
these songs. According to the composer, although volcanic activity implies potential for
destruction, it has also been instrumental in the creation of the Earth. Furthermore,
“volcanic land is some of the most fertile land on earth.”18 The tension between death and
destruction and rebirth and growth implies a kind of cyclic transformation, which
translates into musical processes of transformation of the vocal textures and themes that
are used throughout the Volcano Songs: Duets: morphic overlappings between materials.
The first song of the cycle, called “Walking Song,” explores the opposition of
pure vowels against a backdrop of voiceless, breathy vocal sounds. This duet
16
Deborah Jowitt, “Introduction,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The Johns Hopkins
University Press, 1997).
17
Ibid., 15
18
Meredith Monk, interview by David Gere.
concentrates on the vocal color of [a] and [o] connected once in a while by semi-vowel
consonants [n] and [l], and glide [j]. The piece evolves through a restless motif of what
could be called a “galloping” rhythmic nature: a six-eight meter made of a quarter note
followed by an eighth note, of iambic characteristics. The melody seems to move mostly
in conjunct intervals around an F# minor tonal center. Despite this constrained melodic
beginning, throughout the duet, the pitch content and contour evolve from a very narrow
register to a register more than a fifth wide and then an almost total loss of tonal center to
later return to the previous, more constrained and defined version of the motif. Departing
from this basic version, the piece explores augmentations and diminutions of the
following melody:
EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35
Only the last three notes, on the phonemes [o-a-jo], of this motive remain constant
throughout the piece. They become a sort of refrain that reappears at the end of every phrase
despite any kind of transformation that the beginning of the motive could have
undergone.
The duet may be divided into fourteen sections across which the musical processes
described in previous paragraphs take place.
Section # 1 - (0 to 0:15 minute):
This opening section is the introduction of the first and simplest version of the
theme, as already presented in Ex. 2.1. This theme is made of two identical melodic
phrases but with different phonetic material. At this point, each of these phrases lasts four
measures (4 seconds).
Section # 2 - (0:15 to 0:25 minute):
In this second section, the introductory theme is expanded in two dimensions,
pitch and duration (one vertical and the other horizontal). Regarding the former, it adds
two whole steps up, thus reaching C#5. For the latter, although it keeps the pattern
of presenting two phrases, it adds a fifth measure to the first phrase, which produces
instability and breaks the balance and regularity that the theme had in its introductory
state. This means a whole second of new music, which surprises the listener, refreshing the
perceptual experience. But the second phrase goes back to the established
four-second, four-measure period.
In this third section, there is another kind of expansion, this time in the texture:
the addition of a second female voice. Again unpredictability and irregularity are
emphasized by surprising the listener with the unexpected new element not at the
beginning of the phrase but on the second note (at 0:27). But this second singer
only emits breathy, almost voiceless sounds without specific phonemes, which follow the
same rhythmic patterns of the leading female voice, which still carries the tune. Once
introduced, this second voice keeps singing for the two normal melodic phrases of four
measures. The melodic content of these phrases is a variation over the given pitch
spectrum until this point of the piece. In the refrain [o-a-jo], the second voice gains more
presence with faintly defined phonemes but without completely abandoning the
voiceless/breathy quality. The phonetic material is basically the same recomposed in a
different order.
The second voice keeps singing, and now, interspersed within its voiceless
texture, some phonemes are completely voiced—in addition to those of the refrain. The
two regular phrases are maintained. Interestingly, when the melody reaches the first
“refrain,” the second voice splits from the first one and sings it with a delay of one beat,
thus creating an echo-effect. But by the end of the second phrase, they are in unison
again. The melody expands its pitch range even more. The first phrase opens with a
perfect fifth, A4 to E5.
Some elements stay constant, such as the two thematic phrases and the
sporadically sung pitches of the second female voice, but the new expansion is harmonic.
Along the “refrain” the second singer adds a second parallel voice a third above.
Now the expansion is in the variety of vocal colors or timbres employed. Both
singers, but more prominently the second one, shake their voices, giving the impression
of trembling.
In this section a process of simplification begins. There is a reversal of all the
musical effects applied and compiled up until now in the piece. The second voice drops
back into more breathy sounds. The voices play with delays and anticipations of the
“refrain” both times that it appears at the end of each of the two phrases.
In this longer section, the second voice starts to drop at times, producing rests
between the breathy sounds. The first voice progressively drops the phonemes and starts
to sing bocca chiusa.
This section goes back to the same kind of articulation and texture as before
minute 1:14.
The melody loses its tonal center at the same time as its range expands. It moves
in wondrous ways out of the F# minor center that dominated until now. The trembling
timbre is more prominent and frequent, and by the end of the second phrase, before the
refrain, there is a whole second of pause in both voices—again surprising the listener
with the unexpected.
The new element is the broken nature of the melody. Unexpected rests interrupt
the melody’s flow. But those rests have more of an effect of stops or suspensions in time
than silence as a product of suppression of sonorously existing material. After each
suspension, the melody resumes its natural flow from the point before the interruption.
In this section several of the previously used musical tactics and processes are
used all together: delay/echo, harmonization in thirds, rests, etc.
Completely new melodic material replaces the two-phrase theme. The first
voice alone sings the following pattern based on descending ninth intervals (G#4 to F#3).
The phonetic material is limited to alternations between [a-ε], the [a] as old material and
the newly introduced [ε].
The second voice returns and sings along in unison with the two complete phrases
of the theme and then a single [a] on F#3, one second rest, and the last single [ε] on G#4,
this time sung only by the first voice. Thus, the piece ends with this open ascending
ninth.
The second duet of Volcano Songs, “Lost Wind,” consists of two voices playing
with the friction of two notes a half step apart and the partials produced by this action.
The motif is introduced by one of the voices and repeated twice by this single voice. It
consists of two short notes on the phonemes [ni - a], on C#5, followed by a long note a
half step up (D5) on the syllable [no]. This last D note immediately decays, sliding down
a third.
EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”
The duration of the long note before the decay varies in every repetition. The
second voice, which enters the third time the motif is repeated, reproduces the
same melody but with a delay. It accentuates the friction by desynchronizing the voices
while sustaining and sliding down. Although it is very difficult because the timbres
of the voices blend so finely, one may perceive, up to a certain point in the piece, that the two
voices take turns starting each repetition. The singer that follows always applies a more
distant timbre in her voice, as if the sound had a veil over it. This is achieved with a
further back placement in the vocal cavity and a certain breathy quality (in contrast to a
more forward and projected full sound). Although less veiled, the leading voice presents
to a certain degree a similar distant sound.
By these means, the piece explores distance in two dimensions: space and time. It
explores the effects of distance between the sound of the two singing voices, the singers
with respect to the performing space, or between the listeners and the singers (their
proximity or remoteness). The piece gives the impression of an echo effect, in which one
singer sings and her voice comes back with a far and diluted color. This illusion is
created by the fine imitation of vocal colors between the two singers. The echo suggests a
vast expanse of space. Thus, the human and her or his voice in solitude face the
immensity of nature. Nevertheless, the proximity is present in the friction produced by
the rubbing partials created by the semitone interval between the voices. At the same
time, “Lost Wind” experiments with the distance in the historical time spectrum. These
could be voices coming from the past, a prehistoric time in the evolution of the earth or
more recent events of natural or nuclear disasters.
The third duet of Volcano Songs, called “Hips Dance,” concentrates on the
proximity and mingling of the two voices. This effect is taken to a point at which the two
voices intertwine and “you can hardly tell that two different people are singing...In ‘Hips
Dance’ we push it even further by creating the illusion of more than two voices
overlapping.”19 Both voices start together from the beginning of the piece. One of them
maintains the following drone on the semi consonant [m] throughout the whole piece:
EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hips Dance”
19
Meredith Monk, interview by David Gere. .
Some hard exhalations—which sound close to [ho]—are interspersed between the
two-beat long notes. The “[m] drone” is only interrupted once in the middle of the piece
by a whole section of those exhalations. At that point, the two voices exchange hard
breathy [ho]s, which create a rather percussive effect. This section acts like a rhythmic
improvisation or percussion solo in the middle of a piece, in which time stops moving
forward, and a suspended syncopation takes place. This is one of those instances when
the voices are indistinguishable from each other, thus creating the sensation of more than
two singers singing at the same time.
While the “[m] drone” continues, the other voice varies a pattern based on the
following material: a short eighth-note sung on [e] followed by four or more sixteenth
notes sung on [m –a] a minor sixth down. These sixteenth notes are interrupted in
irregular patterns by hard exhalations on [ho]. The rhythmic interventions of the
percussive [ho] in between the melodic patterns of the [m-a] and the hummed drone of
the other voice reinforce the fictional sensation of a third voice intervening. Towards the
end of the piece, the tempo accelerates and the frequency and irregularity of inserted
exhalations in between the sung notes increases, exacerbating the sensation of several
voices singing at the same time.
The last of the duets, as its own title “Cry #1” suggests, is a lament. It is not by
chance that this duet uses a phoneme of intense emotional charge. All through the duet
one of the singers vocalizes on [ηg a – ηgæ] while the other slides over a single [η]. The
[η], like other posterior nasals (or velar nasals), is one of the last phonemes to be acquired
by children in languages that employ them. Its late incorporation into the linguistic
system of arbitrary signs in language implies that the child experiments for a longer time
during her/his infancy with these sounds in onomatopoeia or sound-gestures. As a
consequence, this nasal-velar sound carries a heavy load of emotional connotations, for
which the child has no words. Roman Jakobson, who developed these theories in his
famous Child Language, Aphasia, and Phonological Universals (1968), argues that
“Sound-gestures, which tend to form a layer even apart in the language of the adult,
appear to seek out those sounds which are inadmissible in a given language.”20 These
sounds will coexist with those employed in the vocabulary for a long time and even be
used as expressive sound-gestures in one’s adult life. Thus, these sounds have a playful
and affective charge.
Reuven Tsur makes further observations about the nature of these nasal
phonemes. He comments that “periodic sounds” such as nasals, vowels and liquids—
those having similar structure for their recurrent acoustic signal portions—arouse a
certain relaxed kind of attentiveness, prediction and order, a quasi-hypnotic effect.21 One
may observe the extended use of these kinds of phonemes in Monk’s Volcano Songs,
which induce audiences to experience similar effects to those described above.
The musical development of “Cry #1” consists of the erratic wandering of the
voices as they slide around a half-step up and down from a central note in oblique
interweaving. The first singer’s voice glides around this pattern, which is constantly
transformed: [ηg a – ηgæ]
20
Roman Jakobson, Child Language, Aphasia and Phonological Universals. (The Hague: Mouton, 1968),
25.
21
Reuven Tsur, What Makes Sound Patterns Expressive? The Poetic Mode of Speech Perception (Durham,
London: Duke University Press, 1992), 44.
EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”
The second voice is introduced in minute 0:40 of the piece and slowly slides in a
humming manner on a sustained [η], departing from a different note than the first voice.
Similar to the previous duet, this second voice acts as a drone. From minute 0:43 in the
piece, each voice alternately moves its pivot note, but when it seems that both voices are
going to coincide, the other moves away and again produces dissonant intervals.
Progressively, after minute 1:00, both voices accelerate the frequency of their pitch
fluctuation, becoming an undulation, and the phonetic articulation in the first voice
becomes more blurry and muddy. While the second singer keeps humming on [η], the
first one conserves only a murky [æ] from her phonetic set. A short section follows, from
minute 1:58 to 2:18, in which both voices closely and in parallel movements sing on [ηg
a – ηgæ] up and down a half step in an insistent mourning. From this point to the end,
both voices start a process of simplification and assimilation until they collide in an
undulating unison around the same pivot note on a fluctuating [æ], to finally close on the
same pivot note sustained by both voices.
The employment of the nasal phonemes on top of the musical procedures
previously described gives “Cry # 1” a relentless lament quality, which lends itself to a
variety of interpretations. With an unmistakable primary human emotional quality, it is
vague enough to relate to and make all sorts of associations around.
A-Ronne: Luciano Berio
In 1974 Luciano Berio composed A-Ronne for Radio Hilversum arranged for five
actors-singers, but in 1975 he revised the piece and expanded it for eight singers, in a
double vocal quartet format. The premiere of this second version was given by the group
Swingle II in 1976. Berio’s usual collaborator, the poet Edoardo Sanguinetti, is the author
of the text set in the piece. As Osmond-Smith comments, A-ronne is the product of a
dynamic process of improvisation among the original five actors instigated by Berio,
which resulted in a series of fragmentary sonic dramas derived from Sanguinetti’s text.
Berio recorded these sessions, which, after reworking, were transformed into the eight-voice
concert version.22 While this original first version profited greatly from the vivid
imagination of these five actors, Swingle II brought their imprint to the second version
through their staple sound: a kind of vocally imitated instrumental fusion of jazz and
“classical” styles.
A-Ronne, conceived as a piece of “theater for the ears,” is one of the pieces resulting
from “his [Berio’s] eight years of work in the Milan radio” which added “a sharp sense of
the extraordinary flexibility of the aural imagination, where images can flow into, or
coexist with one another with an ease denied to the eye.”23 This shows Berio’s special
concern for the human listening process. A-Ronne, like many of his other pieces, acts
directly upon the different ways that we, as listeners, may apprehend text. Again, like
Monk’s Volcano Songs, A-Ronne is the concrete representation of the way we listen to
text in the “poetic mode,” with the exception that, in this case, Berio manipulates words
as well as pure paralingual gestures.
22
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1990), 98.
23
Ibid., 90.
The following analysis focuses particularly on the ways Berio operates on the
literary materials to reproduce a “poetic mode” of listening, which could alert the
listeners to the language elements that we unconsciously relate to in vocal music. This
analysis is based on the published score of the eight-singer version as well as the
recorded version of A-Ronne by Swingle II in 1976, which was conducted by Berio.24
Sanguinetti’s text quotations come from different sources: literary documents,
languages, periods, and cultural circles. Each quote appears in the actual text divided by
colons to denote a change from one literary source to the other.
1. The gospel of John in Latin and Greek, as well as in Luther’s German translation
with the modifications of Goethe’s “Faust I”
2. Dante Alighieri’s “Convivio” and “Divina Commedia”
3. The beginning of the Communist Manifesto of Marx and Engels
4. An essay by Roland Barthes about Georges Bataille
5. The old Italian alphabet25
Even before Sanguinetti’s employment and extraction of them, fragments of these
cultural documents became expressions independent from their historical context or
background. Sanguinetti takes advantage of the original fragmentation that these phrases
have gone through in the real world and deepens their fractures, presenting sections or
even single words. The colon marks clarify further the fractures between each element.
He takes these new wholes and works them through to show their original alienation.
Besides setting the quotes in disturbing contexts, the use of six different languages
highlights the split. Sanguinetti organizes the text of A-Ronne in sections, which are
themed “beginning,” “middle” and “end.” Time is also thematized in words like “run” or
24
Luciano Berio, A-ronne: documentary for 8 singers on a poem by E. Sanguinetti (Wien: Universal
Edition, 1975); Swingle II, Luciano Berio, conductor. A-ronne. London: Decca, 1976.
25
List extracted from Norbert Dreßen, Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen (Regensburg: Bosse, 1982), 159.
“beginning,” as well as space in the parts of the body: bocca, labbro, annus, pied, etc.
Among the cultural and historical quotes, Sanguinetti interpolates an extra
literary/musical allusion to Guillaume de Machaut’s rondeau Ma fin est mon
commencement, which fits with the thematization of space and time and at a certain point
almost articulates the dissolution of time “in my end...is my beginning.” The title A-ronne
is related to the conceptualized idea of time and the sections of the piece. “A” was the
first letter of the old Italian alphabet and “ronne” the last one. All material in between
represents the rest of the alphabet. This is the poem as Sanguinetti presents it:
1.
a:ah:ha:hamm:anfang:
in:in principio:nel mio
principio:
am anfang:in my beginning:
das wort:en arché en:
verbum:am anfang war:in principo:o lògos:è la mia
carne:
am anfang war:in principio:die kraft:
die tat:
nel mio principio:
2.
nel mezzo:in medio:
nel mio mezzo:où commence?:nel mio corpo:
où commence le corps humain?
nel mezzo del cammino:nel mezzo
della mia carne:
car la bouche est le commencement:
nel mio principio
è la mia bocca:parce qu’il y a opposition:paradigme:
la bouche:
l’annus:
in my beginning:aleph:is my end:
ein gespenst geht um:
3.
l’uomo ha un centro:qui est le sexe:
en méso en:le phallus:
nel mio centro è il mio corpo:
nel mio principio è la mia parola:nel mio
centro è la mia bocca:nella mia fin:am ende:
in my end:is my
beginning:
l’âme du mort sort par le pied:
par l’anus:nella mia fine
war das wort:
in my end is my music:
ette, conne, ronne:
According to Norbert Dreßen, although A-ronne continues a similar kind of
confrontation with the human voice as Sequenza III, the former “enriches the work with
the possibilities that are offered by a double quartet and through a referring system that
extends the concert frame of his Sequenza.”26 Regarding the first point, instead of the
isolation of the vocal actions of Sequenza27 performed by the solo singer, A-ronne knits
the individual vocal expressions of each of the eight voices into the full structure of the
piece: “the interpretation of each voice horizontally constantly refers to the collective
connection vertically.”28
Departing from Sanguinetti’s poem, Berio recomposes a new text for the score,
which adds sounds in phonetic writing, syllables, words, sound gestures, and a
completely new language. These new elements may or may not be related to words or
fractions of those that already are part of the poem’s text. Although not precisely in its
original order, the whole poem is repeated about twenty times in its entirety. As Dreßen
comments, in this manner Berio unravels the “meaningless sign chain” nature of this text
specifically, and any text in general, within the general state of crisis of the language.29
26
Dreßen, Sprache und Musik bei Luciano Berio, trans. L. Guillén, 157.
27
In the instructions for performing Sequenza III, Berio requests that attention be paid to the timing
indications on the score for each section to maintain the rapid succession of vocal events. He aims to create
the illusion of one voice polyphony, such a rapid articulation of diverse vocal sounds that the listener could
perceive as several voices.
28
Dreßen, Sprache und Musik, 158.
29
Dreßen, Sprache und Musik, 162.
The rather negative approach of Boulez to text is transformed into a more positive
conception of the same issues in Berio’s hands. Boulez asserts that any text set to music
will be inevitably destroyed. Text may be at the “centre” molding the musical piece but at
the same time “absent” or completely unintelligible as a logical discourse. In contrast,
Berio proposes a new combined and fluent media as previously commented on in this
chapter. For Berio, certain fragmented text elements may be perceived as music and still
retain some affective communicative potential. These incomplete phrases, words,
syllables and phonemes inhabit the transitional realm of his new “combined and fluent
media.” Berio enacts the language crisis and transforms it into a new communicative
experience without dwelling on the impossibility of keeping the integrity of the
discourse.
Thema (Omaggio a Joyce) (1958) was Berio’s compositional turning point with
regard to text treatment. Departing from an impressionistic reading of Sanguinetti’s
poem, which breaks the logical/semantic continuous discourse, Berio first set down only
those features of the text that could be perceived during a first reading: text fragments,
words, syllables, phonemes, paralingual vocal gestures, intonational inflections and
contour, vocal timbre modifications. His phonological studies and observations led him
to purposely take into account the perceptual experience from the listener’s point of view.
According to his beliefs, in the already disturbing context of a musical setting, only those
features immediately perceived have a chance of being captured by the listener. This
same kind of approach is found in other pieces by Berio, such as Laborintus II and A-
ronne.
Laborintus II not only uses texts of the same poet, Sanguinetti, it also sets a
precedent in developing a multilayered counterpoint in a small vocal ensemble. The
theatrical element is present in the human interrelations represented in the corresponding
layers of eight singers. The following comments of the composer about his Laborintus II
(1965) find further realization in A-ronne:
The first step...was to set down some features of the text spontaneously so realizing the
polyphony attempted on the page...Nor should it be forgotten that only those features
immediately perceptible on a simple reading of the text have been taken into
consideration...30
This rather spontaneous approach materializes Berio’s conception of communication and
expressive connection between the text and its receptor through its more rudimentary and
firsthand apprehensible sonic elements. With respect to Laborintus II, A-ronne refines
and deepens the exploration of the phonetic and paralingual aspects of its text.
In A-ronne, Berio continues the process of transforming the prose into a poem
already initiated by Sanguinetti.31 By increasing the isolation—decontextualization and
recontextualization of the material—the text is opened to a variety of interpretations and
ambiguity of meaning.
Verbum caro in principio die Kraft die tat der sinn (alto 2, p. 3)
In my beginning am anfang o lògos in principio erat (tenor 2, p. 3)
Nel mezzo della mia carne: la bouche: ou commence le corps humain:[o] in my
beginning:l’anus:in medio (alto 1, p. 7)
The repetition of words or partial phrases adds to the recontextualization of the material,
assisting the transformation of prose into poetic form.
Nel mezzo nel mio mezzo nel mio corpo oh nel mezzo: nel mezzo del camino
nel mezzo della mia carne la bouche è la mia bocca la bouche la bouche la
bouche (alto 2, p. 22)
Neither Sanguinetti’s text source nor Berio’s final scored text preserves cohesive
blocks of discourse with patterns of ordinary speech—formed sentences that make
paragraphs. A-ronne uses two formal types of speech: disjointed and enumerative. Most
of the passages are made up of short phrases that never develop into a more fluent kind of
speech. Both types explore contrast and repetition as a source of musical/sonic interplay.
The repetition of words and phrases, and especially accumulative repetition,
creates emphasis, climax and expectancy. This kind of repetition often creates points of
assonance and alliteration inside a word or among words. In several cases, in cataloguing
and enumerative passages, the words are bound together more by their similarity in sound
than by their logical connection. Berio comments, in an interview with Bálint András
Varga, that “the grammar of A-Ronne is focused on a single technique: alliteration. It is
with the help of alliteration that I musically reorganize this rather complex text.”32
The repetition of words or phrases is not an exclusive prerequisite to achieve
alliteration and assonance. The percussive effect created by the vertical superposition of
the following words is achieved by their similar phonetic content. Stop-plosives such as
[t], [d], and [k] (the last in its two typographic representations, “c” or “k”), the fricatives
[v]-[f], and the flipped and rolled [r] dominate the phonetic field in this passage. The following
30
As quoted in Peter Stacey, Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio) (New York/London: Garland Publishing,
Inc., 1989), 156.
31
Sanguinetti’s poem is made, for the most part, from quotes coming from famous texts in prose form.
32
Dalmonte and András Varga, Two Interviews, 142.
example comes from the first three coincidences between some of the eight singers on
page 5 of the published score:
S1) verbum ach ach
S2) ach die Tat die Tat
A1) das Wort das Wort ach
A2) die Kraft die Kraft die Kraft
B1) caro caro caro
In some instances, the phonetic material employed in the piece is textually
derived from surrounding words, but in other instances, phonemes seem not to have any
textual connection and have a purely musical function. On page 4, the sung notes in the
baritone 1 on the phoneme [o] are clearly derived from the preceding “caro” of the alto 2
and soprano 2 or from “principio” and “wort” of baritone 2 or soprano 1. Another case of
direct correspondence is found on page 6 where the baritone 2, while “stuttering,
coughing, suffocated by words and saliva” (as the composer indicates on the score)
anticipates the consonants [d]-[s]-[z]-[d] of the words he is trying to deliver: “der Sinn.”
But other episodes, such as the succession of phonemes [u]-[i]-[Λ]-[o]-[a] on pages 16 and
17 in the baritone 2, are for the most part unconnected to any surrounding word. The case
of the counterpoint between tenor 1 and baritone 1—starting at the end of page 20 and
running into the first half of page 21—is a purely musical event, which focuses on the
colors and sonic quality of these phonemes. In rapid succession tenor 1 delivers a set of
consonants: [l]-[f]-[s]-[m]-[n]-[d] alternating in a kind of timbric counterpoint with a set
of vowels spoken by the baritone 1: [a]-[e]-[i]-[ai]-[e]. Both singers receive the
composer’s indication to perform this passage as if they were “teaching vowels” or
“teaching consonants.”
Besides the transformation of the literary text, Berio “performs a second
transformation process, by interpreting the meaning of the fragment in literature as the
‘representation of the invisible in the visible’ as the relation between the written picture
and its acoustical realization.”33 For this realization, Berio resorts to a vast and
meticulous musical vocabulary: twelve forms of notation, three procedures for time
organization of the piece, seven dynamic steps, and ninety-six performance instructions.
Dreßen makes an exhaustive list of these procedures.34 The following list shows the
range of vocal sounds that Berio requires from the performers: from the spoken to the
sung, going through a variety of physical modifications that affect the sound result.
1. written text: spoken
2. sounds in phonetic writing, each according to sounding rules
3. scratching through throat
4. breathed, almost whispered
5. singing or speaking with closed mouth
6. as high or as low as possible
7. moving the voice in that register
8. ornamentations that are easily articulated
9. bridge over an interval (by sliding of the voice)
10. spoken/singing (with the given intervals)
11. sung pitch that is to be kept to the end of the time unit
12. sung materials35
The ninety-six performance instructions gather terminology from the area of
prosody (including paralingual sounds), emotional moods, sounds produced by different
33
Dreßen, Sprache und Musik bei Luciano Berio, 162.
34
For a complete list of notation forms, time organization, dynamics, and performance instructions in
A-ronne, please refer to pages 162 to 167 of Dreßen, Sprache und Musik bei Luciano Berio, Chapter V.
35
List reproduced from Dreßen, Sprache und Musik, 162-163.
body parts, and musical or theatrical/acting situations—interpreting different characters
in various settings. The following table shows examples of each of those areas:
Performance Instructions       Examples                                                 Page
Prosody                        “highly inflected”                                       3
                               “in an explanatory manner”                               3
                               “discontinuous with occasional questions”                3
                               “violent and quick”                                      4, 5
                               “Stuttering, coughing, suffocated by words and saliva”   6
                               “gasping out the words”                                  6
                               “tense murmuring”                                        7
Emotional Moods                “outgoing and happy”                                     3
                               “cold”                                                   3
                               “angry”                                                  4
                               “sadly”                                                  15
Sounds Produced by Body Parts  “flickering tongue against the upper lip”                6
                               “’Pop’ sliding finger inside-out of mouth”               6
                               “’Chewing’ quickly-mike against the mouth”               7
Musical Situations             “Singing unrelated pitches”                              1, 2
                               “like a bumpkin’s marching song”                         16, 17
                               “Vocalizing”                                             27
Theatrical/Acting Situations   “like a dictator’s harangue”                             4, 6
                               “like two priests murmuring a prayer”                    7
                               “like a drill sergeant’s questioning”                    16, 18
                               “intimidated by the sergeant”                            16, 18
TABLE 2.1: Classification of performance instructions in Berio’s A-ronne.
A variety of vocal sound production styles dominate the performance spectrum of
A-ronne:
1. Unnotated recitation (p. 7-8 every singer; p. 9-11 baritone 1 and 2; p. 34 tenor 1
and baritone 1; p. 35 tenor 2; p. 41 tenor 1)
2. Spoken short phrases, words or syllables with notated rhythms and no pitch
indication, (p. 1, p. 18-19, p.23-24)
3. Spoken short phrases with notated rhythms on a single line denoting the central
point of the speaker’s register: p. 1-3; sporadic use between p. 44–46; the scat-like
unspecified melodic contours on the syllables “de” and “den” at the beginning p.
16, 17, 18, 20, and the longest episode on p. 21.
4. Singing with an unstriated pitch, notated around a central line (usually small
vocalises on one of the pure vowels): p. 1 alto 1; p. 2 alto 1 and tenor 1. All
these have the following performance indication on the score: “(singing unrelated
pitches).” Unless this is indicated, the rest of the syllables notated in this way are
spoken.
5. Singing with musical parameters (melismatic and fragmented). This piece
presents several instances of singing, which evoke different musical styles.
Among the passages sung with lyrics: p. 8-9 a Marenzio-like-madrigal among
most of the singers; p. 28 soprano 1 and alto 1 indicated “(as a folk song)”; p.12-
13 and p.39 baritone 2 sings like a double bass detached monosyllabic mixed text
“(Expressivo like an accompanying DB)”; tenor 2 starts with two fragments on p.
15, but the whole melody appears on p.16 “(like a bumpkin’s marching song)”
6. The other singing passages are sung on single phonetic sounds that do not combine
to make any word: “(dreamy and distant)” tenor 2 melody; chords sustained by
several voices on transitional vowel phonemes p. 25; p.27-28 “(vocalizing
independently from other voices)” alto 2 sings arpeggiated tonic-dominant
seventh chord sequences on [a]; p. 29-31 singers take turns on a melismatic
vocalization with specific rhythmic notation to serve as background of folk-like
singing of mainly alto 1 and then soprano 1 in p.31; a pseudo-romantic
transformation of a Bach-oratorio-like counterpoint sung among all voices except
tenor 2 who is reciting on vowel phonemes; from p. 42 to the end, staccatos at
different rhythmic patterns superposed among all the voices, which progressively
transform into more sustained, longer chords.
7. Syllabic singing (nonsense syllables), the most prominent instances are the
passages on “de-de-den” as on p. 23, 25 and then the whole p. 27-28; melodically
and rhythmically these passages imitate the jazzy vocal style of commercial jingles.
8. Singing “bouche fermée” (bocca chiusa or closed mouth) appears in short
passages or as a transitional articulation mixed with phonetic or syllabic singing:
p. 13-14 in tenor 2 and baritone 1; p.17 baritone 2; p. 23-24 sustained chord by
altos and sopranos; p.31 sopranos, alto 2 and baritone 1; p. 38 everybody; p. 40
tenors and baritones.
The spoken passages, such as the “unnotated recitation” and the “spoken words
and short phrases” expose a wide range of performance styles according to the mood
indicated by the composer on the score. Since these are not musically notated, all
parameters are open to the performers’ interpretation. These passages have neither pitch
indication nor rhythmic notation; their performance intonation arches and speed are only
determined by the mood indicated by Berio over each event. On pages 7 and 8, the
indication of “tense murmuring” and the whisper sign, o--------, turn to a very fast
masked delivery of the text that, added to the polytextuality among the singers, makes
understanding almost impossible. The pitch results in the medium-high register of each
normal speaking voice. Perhaps to avoid the natural performance tendency of associating
fast and high pitch, on page 9, the indication for baritone 1 and 2 is “Like two priests
murmuring a prayer: fast and low tones.” Superposing the other voices singing the
“quasi-motet” passage over the low pitch and volume of the baritone 2’s singing, the text
is masked, and its understanding obscured. The low pitch speaking quality is kept
through p. 10 when alto 2 is introduced in a counterpoint with baritone 2; but the pace
slows down to create the sensual and intimate ambiance suggested by the indication
“Like an intimate dialogue: with a husky and hesitating tone.” Sighs are incorporated
between these hesitant-toned phrases. Intelligibility is regained because of the slower pace
and overarticulation of the words as a mode of sensualizing their sound, their phonetic
coloring.
A similar situation is encountered on page 22, this time between tenor 2 and alto
2, “intimate and sensual with occasional o-------.” In this passage, all the voices, except
baritone 2, engage in an overlapped multitextual recitation “colloquial, gigglish and
o-------.” The gigglish quality makes every voice explore the high end of its speaking register
and move in a fast manner. Right after this passage, tenor 2 and baritone 1 engage in a
process of “Stuttering, gradually faster” recitation for tenor 2 and “Very fast, gradually
stuttering” for baritone 1. The “faster” indication for the tenor towards the end of this
section, added to the maintained “very fast” of the baritone, builds excitement toward a
climax via a natural crescendo and acceleration of the delivery pace. Certain words
become clear at moments because of the thinner texture—only baritone on a syllabic
monotonous line and the two tenors. But the overlapping of two different texts and the
fragmentation of words produced by the stuttering diverts the listener’s attention from the
meaning of the words to the phonetic colors that constitute them.
The paralingual realm of A-ronne is highly developed, suggesting with this sonic
material a “radiophonic” drama. Nevertheless, it is rather difficult to categorize or
interpret the exact dramatic meaning of these sonic gestures. They are ambiguous and
simultaneously evocative of different scenarios. Some of them may have only a musical,
quasi-instrumental, function. The last event of page 21 is a 25-second long section where
the eight singers are instructed to perform alternately the following paralingual sounds:
gasping, cough, grunting, snorting, straining, groaning, exhaling loudly, moaning. All
these—with the exception of the two baritones—conclude in a general “breath” which
precedes the sensually charged exchange—indicated by the composer as “intimate and
sensual”—between tenor 2 and alto 2. These last two singers alternate phrases about the
human body with breathy sounds, such as “nel mezzo…nel mio corpo…nel mezzo della
mia carne…nel mezzo della mia carne…nel principio…la bouche…l’anus.”
On the one hand, altogether, this paralingual superposition does not make sense as
a cohesive action or a chain of reactions in relation to each other. On the other hand, this
episode may be seen as the deconstructing turning point of the previous authoritarian
situation in which baritone 1 “like a drill sergeant’s questioning” intimidates the
answering tenor 1. At the beginning of this section (page 16), tenor 1 seems to answer
imitating exactly what soprano 2 indicates to him in a whispering mode. Later, in the
second half of page 18, soprano 2 drops her indications and for the next two pages,
baritone 1 and tenor 1 continue their “angry and hysterical” (as indicated by the
composer) counterpoint. Throughout this authoritarian exchange, the delivered text is
very similar to the already quoted text of the seduction section between alto 2 and the
same tenor 2:
Tenor 1 ha un centro le sexe le phallus
Baritone 1 L’uomo? Qui est? en meso en?
Tenor 1 è la mio corpo è la mia parola etc….
Baritone 1 nel mio centro? Nel mio principio? Etc…
TABLE 2.2: Exchange between tenor 1 and baritone 1 in Berio’s A-ronne, p. 18 to 20
The meaning, however, is completely different. The semantic content of the
words is almost obliterated by the harsh delivery. And the only thing that a listener
perceives is the aggressive tone of the baritone and the intimidated tone of the tenor, who
is progressively taught (or brainwashed) by the other. To a certain degree, the text being
used could have been this one or any other because what we apprehend is the aggressive
tone and procedure of forcefully imposing answers at any cost. This effect becomes
progressively clearer when, in the second half of score page 18, their phrases start to
collapse with each other. On page 19, phrases or even words are completed between each
other in a kind of “hocket,” while in other cases different words or complete phrases are
overlapped. By page 21, the only material exchanged between these two singers is a set
of vowels and consonants that they pretend to teach each other. Thus, we arrive at the
paralingual episode of score page 21—described above—which should be thought of as a
cathartic preparation for the sensual section in which the physicality is exacerbated and
certain body parts become the focus of sexual desire. The chaotic superposition is the
release and final step in this gradual relaxation and disintegration of language and vocal
sound expression: from phrases to phonemes, from vocal expressive intonation to the
isolated vocal gesture (paralingual gestures). Finally, although this chaos is difficult to
explain as a coherent act, each paralingual action preserves its everyday concrete
meaning.
A completely different effect is produced by the paralingual episode on page 6,
which has a purely musical function. It offers a sonic mattress, a texture, against which
the baritone 2 stutters through the phrases that he is trying to enunciate: “der sinn…o
logos…Am anfang war…die kraft…die tat.” While he fights with these words, the other
seven singers engage in a loop of spoken and whispered single phonemes and
paralingual sounds, such as: “inhaling and exhaling through teeth,” “flickering tongue
against upper lip,” “whistle,” “‘Pop’ sliding finger inside-out of mouth,” “squeak.” Most
of these sounds have no immediate reference to any usual human action—with the
exception of the whistle—and are mere playful effects with a purely musical textural
function.
The three main factors that affect the intelligibility of the text in Laborintus II
can be observed operating under similar conditions in A-ronne: 1) vocal or
performance style; 2) text condition; and 3) masking. The optimal case of intelligibility is
achieved when the text is spoken, well articulated and more or less intact, in its “prime
condition.” When the same text or different texts are recited by two or more voices, the
intelligibility depends on how much the individual words or short units of text are superimposed. In
some cases, two singers alternate in a rapid, moderate, or slow paced delivery. In this
case, the text could be clearly understood unless another masking factor, such as
underarticulation (babbling, whispering, low volume combined with rapid articulation, or
monotone reciting in low volume), blurs its intelligibility. Superposition of words or
phrases, plus any of these masking factors, only intensifies the ambiguity of the text.
Whenever a melody with lyrics is superimposed on a spoken text, the listener’s
attention will be drawn to the message of the latter, while the melody with text becomes
the background. This is especially true when the sung melodic passages are soft in
volume and melismatic in their lyric setting, while the spoken ones are clear and
overarticulated. Only in certain circumstances, when stylistic performance effects mask the
spoken text, could the sung melody take a leading role and directly grasp the listener’s
attention. Whenever the sung passages with lyrics are the center of attention, the text
tends to be more unclear in melismatic settings than in syllabic ones. Any vocal or
performance masking effect could obviously add to clarity, or obstruct the mentioned
perceptual tendencies.
The following is a list of passages in A-ronne in which one, or more than one, of
these intelligibility-affecting factors intervene, modifying the perception and
understanding of the text:
1. Pages 1–3: In the first three pages of the piece, although the “prime condition” of
the text is not conserved, its fragmentation is counteracted by the repetition of the
same words, “in mio principio.” In this way, the listener has several opportunities
to grasp them.
2. Page 3: The first example of complete unintelligibility caused by superimposition
is found on this page. For twenty seconds, the eight singers recite different texts
simultaneously. On top of the already obstructed text, the individual recitations
are fragmented by interspersed sung pitches on the last phoneme of the previous
word. A similar situation is also encountered on page 7. Page 8 is an expanded
version of page 3.
3. Page 4: Although all the voices articulate words or short phrases at the same time,
the repetition of some of those in the vertical axis (the same word or phrase
pronounced at the same time by two singers) or in the horizontal axis (successive
repetition of the same material by the same singer or a different one) brings a
certain accessibility to the text.
4. Page 6: The vocal style employed by baritone 2, “stuttering,” offers a clear
instance of text obstruction.
5. Page 22: The “intimate and sensual” dialogue between tenor 2 and alto 2 on page
22 offers almost no intelligibility because of the masking effect of the whispering
mode and the low volume of the voices.
6. Page 9: The low volume and pitch performance style mask the two different texts
that baritones 1 and 2 speak. This effect is especially emphasized by the upfront
sounding presence of the female quartet singing a quasi-Marenzio madrigal in
German created by Berio.
7. Page 13: Again, Berio further alters the “prime condition” of the text. The
composer takes, from the already fragmented text compiled by Sanguineti, only
monosyllables and mixes them in a completely incoherent succession.
8. Page 16: Baritone 1, soprano 2 and tenor 2 offer an instance of alternating phrase
delivery in the manner of a dialogue. Although certain masking factors, such as
whispering in soprano 2, are intervening, the relaxed pace, alternation with the
baritone who pronounces different text, plus the superposition of the tenor 2
repeating the same text in a different vocal style, contribute to the understanding
of the text without major difficulties. At the same time, on this same page, the
composer adds an extra textural layer, a sung melody with lyrics by tenor 2. The
melismatic and soft quality of his singing, plus the natural tendency of the listener
to be attracted by the strong shouted delivery of baritone 2, places it as a
melodious background texture.
9. Page 18: The multilingual and extreme fragmentation of the text does not
contribute to the otherwise potentially understandable text of the dialogue
between tenor 1 and baritone 1.
10. Page 21: The composer pushes the fragmentation of the text of Sanguineti to an
extreme. On page 21, Berio breaks the text into its phonetic components,
grouping vowels on one side and consonants on the other. The effort of
reconstruction is almost impossible at the time of listening; only after listening several
times or looking at the score does one realize that there are phonetic
fragments of the words immediately preceding this passage.
11. Pages 23–26: In these pages, the process that the text undergoes transforms it from
a clear understandable dialogue between tenor 1 and baritone 1 to a broken
phonetic extraction of only the vowels of the preceding words. This again
alienates its comprehension. Only the hocket between the two singers of
consonants and vowels in their original order at the end of page 26 offers the
listener the opportunity of reconstructing the message, “in my end is my music.”
12. Page 26: In this passage, not only are tenor 1’s and baritone 1’s recitations
superimposed, but the “stuttering, gradually faster” and “very fast, gradually
stuttering” indication obliterates the comprehension of the delivered text.
13. Page 29: Another instance of sung text is found on page 29, but this time—in
contrast to what happened on page 16—the syllabic setting, successive repetition
of the same melody and lyrics by different female voices and the medium high
volume of the performance contribute to a better understanding of the text.
14. Page 35: Tenor 2’s recitation is acoustically set to the background and buried in
the thick vocal polyphonic texture of the rest of the voices. The intimate
monotone and low volume delivery of tenor 2’s recitation also does not contribute
to a clear understanding of the text, relegating it to an almost percussive
background effect: a percussive texture in a melismatic context.
15. Page 41: At the end of this page, the inaudible recitation of tenor 1 is
overpowered by the forced and overarticulated whispering of alto 2, which takes
the sonic front stage over the tenor’s textural background. While alto 2’s text is
clearly understood, tenor 1’s becomes a mere incomprehensible babbling.
As mentioned before, in A-Ronne, Berio recreates the semiotic structural
manifestation or dramatization of the elements of language and their agony when set to
music. In terms of its overall structure, this piece explores all the possible degrees in the
music-text continuum proposed by Berio. While traveling across the whole spectrum of
possibilities, A-Ronne contrasts and overlaps in a single texture the pure extremes, word
or music, or any intermediate scale degree of the synthesis of the two mediums.
One may divide A-Ronne into several sections according to the types of processes,
vocal styles and text materials employed.
a) Section 1: The section spanning score pages 1 to 6 focuses especially on
the contrast between fragmented text—words that are repeated—and free vocalizations
on vowels extracted from preceding words. The spoken material is mostly set to notated
rhythms and shouted in different moods or whispered in unrelated undetermined pitches.
Among the briefly sung material, besides the vocalizations, there are also short sustained
pitches on single vowel phonemes. There are occasional paralingual vocal gestures, such
as sighs and belches, and sound effects such as bocca chiusa. But the climax of the
paralingual realm is not reached until page 6.
Most of these performing styles—part of the continuum—are mixed by the eight
voices vertically as well as horizontally, creating a chaotic textural mix of short phrases
delivered in different manners, which overlap and succeed one another. Page 3 starts
clarifying this notated sonic multi-texture and proceeds into a section of 20 seconds in
which the eight singers deliver straight spoken text in different moods and insert short
sung phonemes—extracted from preceding words—in determined pitches. Next, page 4
again concentrates on coordinated short phrases of the text, which are rushed and shouted
by all voices except tenor 1—who inserts his phrases in between with a prescribed
rhythm and later adds a specific pitch contour. Page 6 is mostly devoted to the
paralingual realm. All the voices—except baritone 2 who stutters over the consonants of
mixed text—loop over a delirious combination of non-linguistic sounds and vocal
gestures.
In this way, this first section opens the palette to most of the vocal styles and
textures that will be developed in the rest of the piece.
b) Section 2: Score pages 7 to 12 focus, for the most part, on the contrast between the two
vocal styles that constitute the extremes in Berio’s continuum: speaking and singing. After
the previous introduction to a big portion of the universe of vocal possibilities, now, in a
more economical manner, this section overlaps—in vertical contrast—the extremes, the
opposites: straight spoken text against sung melody—rhythm and pitch notated—in a
parody of Renaissance counterpoint.
Page 7 opens with all the voices superposing spoken complete sections of the text
of the second part of Sanguineti’s poem. Then it proceeds to the insertion of short sung
phrases of what on page 8 will become a full contrapuntal melody. So far the text of the
sung phrases is clear and understandable.
The transitional page 8 gives way to page 9 on which four of the singers
alternately take the singing role; baritone 2 and a second singer—who changes from
voice to voice—recite passages from the first and third parts of the original poem.
c) Section 3: Score pages 12 to 18 are devoted to the contrast of different singing styles
with sporadic interventions of paralingual sounds—imitation of animals—and some
nonsense spoken syllables on notated rhythmic patterns—scat-like style. Besides short
“desperate” shouted phrases by alto 2, the spoken text in free rhythm is not contrasted
with singing until page 16.
This section opens at the end of page 12 with baritone 2 singing in a “basso
continuo” manner, on syllables that, although extracted from words of the poem’s
original text, are decontextualized to such a degree that the result seems a mix without
any syntactic sense. On page 13, tenor 2 performs a distant and soft contrapuntal melody,
which articulates the vowel phoneme of the immediately preceding syllable sung by
baritone 2. These two voices offer an instance of complete unintelligibility of the text,
resulting from the extreme fragmentation of the discourse. This procedure transforms the
two voices, resulting in an instrumental effect in which the vowel becomes pure sounds
with a delay or reverberant effect created by the tenor’s post articulation.
By page 14, some of the other voices intervene in the background with singing
lines in “ppp” on half-notes and slow rhythms on vowels also derived from the syllables
of the baritone. The rest of the voices perform animal sounds, scat-like quick short
phrases, or short bits of melodious popular chants—which make use of more organic
phrases of the text. All of them add to the purely musical or instrumental multiple texture.
By score page 16, the spoken word becomes dominant over any background
singing. The predominant baritone 1 shouts authoritarian commands “like a drill
sergeant’s questioning” and intimidates tenor 1 who answers with whichever phrase is
prompted by soprano 2. But this brief spoken passage is shortly invaded again by most of
the singing multiple textures of the previous pages of this section.
d) Section 4: This section explores the medium zone of the music-text “continuum.” Two
types of spoken words are set to musical parameters: a quasi-“Sprechstimme” and
fragmented spoken words articulated in a hocket with precise notated rhythms. Regarding
the quasi-“Sprechstimme” vocal sound production style, in these pages there are several
instances in which most of the singers pronounce the syllables “den” or “my” in a manner
in between speaking and singing. This may be described as speech with a melodic
contour of unspecified pitches.
The second procedure articulates plain spoken words or short fragments of a
sentence in hocket. Usually, it starts to alternate freely but soon the composer assigns a
very well-structured rhythmic pattern to each singer. It is at that moment that the text
becomes fragmented word by word or syllable by syllable, and the phrases are put
together only by the interplay between the two singers. The two times that the text is
broken into syllables or phonemes, those particles derive directly from the immediately
previous clearly pronounced material: page 19 “beginning” and page 20 “aleph is my
end.” Showing the word immediately before its disintegration allows the listener to
appreciate and understand its meaning at the same time that it challenges and involves
her/his musical perceptive capabilities in reconstructing the words.
This section closes with a 25-second episode of decontextualized paralingual
sounds and high pitches “imitating the ‘call’ of Algerian women” produced by the four
female vocalists. This sonic conglomerate becomes almost a mechanical symphony of
indistinguishable sounds or the turning of a gigantic squeaky wheel. Again the lingual, or
paralingual in this case, becomes pure music.
e) Section 5: After the isolated transitional page 22, the section between pages 23 and 28
further explores elements and procedures similar to those of the 4th section. But now these text
hockets and “den” passages, on the one hand, are subject to better defined musical
parameters, and on the other hand, can be perceived more as either text or music, the
extremes of Berio’s text-sound continuum.
The short “den” motifs are now rhythmically and melodically notated. From page
23 to 26, every time these motifs appear, they are used to trigger a dialogue between
tenor 1 and baritone 1, who declaim text over a sustained chord by the other voices.
The first time (pp. 23–24) this chord is sung in bocca chiusa, the second time (pp. 25–26) on
vowels derived from the text declaimed by tenor 1 and baritone 1 on pages 23 and 24.
This purely musical texture serves as background to the dialogue made of short fractions
of sentences. Tenor 1 and baritone 1 each complete the last syllable of the last word left
incomplete by the other. This is one of the few instances in which a large section of
Sanguineti’s poem (the whole Part 3) is delivered in its original order, from “L’uomo ha
un centro...” to “...ette, conne, ronne.”
Although in sections 3 and 4 there are fractions of the text in its original state, the text
has never been as clear as in this passage; this happens for several reasons. First, the
text is not as fragmented as before; now it is delivered in bigger chunks by each singer.
Second, the rhythmic patterns, to which the text is set, allow the logical, natural flow of
intonation (not too frantic, not too slow) and do not disturb or distort its understanding.
Berio allows the performers to take those rhythmic patterns with flexibility as he
indicates in a footnote: “Suggested rhythm and speed; minor modifications and
adaptations are possible.” Third, the sustained chord texture accompanying this dialogue
is completely unobtrusive.
Finally, the whole section, with its harmonized “den”s and declamations, creates
the illusion of listening to a radio commercial. The melody, rhythm, harmony, vocal
tone and performance style of the vocal ensemble resemble commercial jingles, which
generally introduce and draw the attention of the listener to the commercial selling
speech that follows. Then, usually two announcers declaim that selling speech,
alternating their voices with overarticulated inflections of their speaking voices. This is
certainly the fourth and most decisive reason why this passage is as intelligible as it
is.
Thus, A-ronne reaches its maximum intelligibility and textual integrity exactly in
these two pages, which fall in the middle of the whole score—pages 23 and 24 of a total
of 48.
f) Section 6: This section—pages 28 to 33—is completely devoted to the singing realm.
Soprano 1 proposes the first half of a singable melody, which then is completed by alto 2.
On page 30, soprano 1 restates the same first part of the melody, but this time alto 1
repeats this first section after her and completes the rest of the melody. While these
singers perform this theme, the other voices accompany with vocalizations on vowels.
The accompanying texture thickens progressively toward page 31, on which not only all
the other voices have been introduced, but also their rhythmic activity has increased. By pages
32 and 33, the eight singers vocalize on frantic scales in triplets, quintuplets, and
sixteenth-notes. This overlapping creates polyrhythm and cacophony, since now the
vocalists perform a succession of quick repetitious syllables: “lo-go-lo-go-lo-go-etc;” “ra-
ga- ra-ga- ra-ga-etc;” “ca-ro- ca-ro- ca-ro-etc;” etc. By the end of this section—page 33—
all collide on a unison D on the [o] phoneme.
The syllabic setting and simplicity of the melody proposed by soprano 1 and the
alti allow the listener to understand the words. The fragment set to this melodic passage
is extracted in its original layout from the first half of Part 1 of Sanguineti’s poem
without any further editing by Berio. But shortly after its introduction, the accompanying
texture gets thicker and busier, progressively obstructing the previous clarity of this text
fragment.
g) Section 7: This section—score pages 34 to 40—goes back to the contrast of the two
pure extremes of the text-music continuum. Although several of the voices have spoken
passages,36 those recitations are perceived as a percussive texture. In the beginning of
page 34, this effect is created because all the singers’ overlapped polytextuality makes the
literal understanding of the text impossible and at the same time highlights the richness of
the consonants’ phonetic colors. The rest of the recitations are either overlapped and
delivered in a fast stuttering manner, or become like a murmuring praying background to
the sung parts. In the first case, the stuttering again emphasizes the percussive effect of
the consonants of the words. In the second case, the whole text melts in a muddy and
monotonous “sonic mattress.” The text employed in these passages is drawn from the
36
At the beginning, all except baritone 2 have a spoken passage, then tenor 1 and baritone 1 and finally
only tenor 2.
three parts of Sanguineti’s poem; in some cases only fragments are extracted and in
others the text is incorporated in its original order or phrase by phrase backwards.
The sung parts consist of an ornamented vocal line, usually carried by one voice,
harmonized by the others. All these lines are textless and performed on isolated vowel
phonemes. But harmonic richness and dramatic melodic content dominate this section.
h) Section 8: The last section—pages 41 to 48—opens with a similar “den” jingle-like
ensemble. But in contrast with the one in the 5th section, this one is decomposed:
rhythmically fragmented and harmonically dense (more dissonant). Also, this time, the
announcers, although they use the same text from Part 3 of the poem, do not establish the
dialogue game as they did before. Tenor 1 is almost inaudible, murmuring in the
background, and alto 2, who loudly whispers “in my beginning Aleph is my end” (from
the poem’s Part 2), dominates on page 41 against the murmuring and a sustained sung C4
of the other voices. By the next page, that sustained C4 breaks into quarter notes that
immediately start to unlock from their homophonic layout into a slightly displaced
polyrhythm, which creates an echo or delay effect. In the following pages, the eight
voices also progressively expand their harmonic spectrum. They mostly sing on isolated
phonemes or syllables derived from different parts of the text. The phonetic material also
increases in complexity, both horizontally—in the successive articulation of a single
voice—and vertically—in the overlapping of several voices.
In this section, the singers interrupt the singing with isolated phrases from
different parts of the poem. These phrases are mostly whispered in Sprechstimme style
and performed according to the different character indications: “dreamy,” “solemn,”
“urgent,” “ecstatic,” “frantic,” “sensual,” etc.
On score page 47, the eight voices lock into a dissonant homophony that
crescendos until colliding into a held open perfect fifth, which soon after dissolves into a
fading dissonance until the closing spoken letters of the Italian alphabet: “ette, conne,
ronne.”
These detailed analyses of Volcano Songs: Duets and A-ronne reveal the
procedures through which Monk and Berio transform the sonic elements of language into
structural components of their pieces. Through these means, they explore the
communicative potential of the non-linguistic aspects of language and human vocal
gestures that are usually disregarded in text set to music and make a conscious
representation of the “poetic mode of listening.”
IV.
THE POPULAR SONG
The cases analyzed so far have been pieces from the “art-music” realm: art-song,
opera and oratorio repertoire. Examining this dissertation’s hypothesis in the context of
the repertoire of the “popular-music” realm (jazz tunes, folk/bluegrass and pop songs)
may provide further insights. This section, therefore, will proceed with an analysis of four
popular songs from the English-speaking repertoire, which serve as examples of different
approaches to the articulation between words and music: first, a highly structured Tin
Pan Alley tune, Jerome Kern and Oscar Hammerstein’s “All the Things You Are”;
second, the narrative type in strophic form, Bob Dylan’s “A Simple Twist of Fate”; and
finally, two with “redundancy variation” in the “verse-chorus” format, Björk’s “Isobel”
and Peter Gabriel’s “Sky Blue.”
Popular song confronts us with new issues that lead us to rethink even the way we
approach “art song” analysis. From the debate over popular music studies, three
particular issues are relevant to the approach that this chapter seeks in analyzing these
four songs. These perspectives are fundamental to understanding the way in which
audiences listen to these songs and how they are conceived by their songwriters.
The first issue addresses the fact that an analysis of the musical text—the song
and its constituent parts, words and music—is not sufficient in isolation since its elements
only gain significance in relation to their context. Contexts directly influence the way
text is perceived. As David Brackett summarizes, in accordance with similar positions of
Richard Middleton and Simon Frith in the musicological debate over analysis of popular
music, “one of the most important aspects of context is that it establishes the codes that
listeners are most likely to apply in certain listening situations.”1 Style conventions are
indicators of which musical elements are valued in each specific popular genre. These
elements are the main focus of artists at the creative moment and of the audience in the
listening situation. The analysis of those conventions reveals the code under which
popular songs are read as “texts.” It is necessary, however, to note that more than one
perspective comes into play in the formulation of that code and that the code may be
different according to the agents interpreting the musical object. This coding is actually
the result of a diversity of discourses converging in a dialectical manner into the object of
study, the song. Cultural studies theorists of the 1970s assumed that “the meaning of
music could be deduced from its users’ characteristics…ignoring lyrical analysis
altogether,” as Simon Frith points out in a criticism of his own analyses in The Sociology
of Rock (1978).2 These “consumptionism” theories—mainly concerned with the values
that each “subcultural group” assigns to the song styles with which they identify—gave
way to studies that also took into account the “changing modes of lyrical production” in
the record industries.3
The second relevant issue is concerned with the problem that traditional score
analysis presents when dealing with popular song. Most traditional formalistic analysis
based on the visual information provided by scores may ignore important musical
aspects. This “visual” approach—concerned with the kind of musical development
known as “extensional” or “syntactical,” in terminology coined by Charles Keil and John
1
David Brackett, Interpreting Popular Music (Berkeley, Los Angeles, London: University of California
Press, 2000), p. 18. For further details on these issues, see Richard Middleton, Studying Popular Music
(Philadelphia: Open University Press, 1990), and Simon Frith, Music for Pleasure: Essays in the Sociology
of Pop (Cambridge, Oxford: Polity Press, 1988).
2
Frith, Music for Pleasure, 119.
3
Ibid. “Lyrical production” refers not only to the content and format of the lyrics but also the musical
structure of the song.
Shepherd—departs from scores and concentrates only on the musical syntax (mainly
harmony and melody), ignoring rhythmic nuances, texture, vocal and instrumental
arrangement, timbre nuances of the sound inflections, and sound mixing (in recording
and amplified sound in concerts).4 In contrast, a reformulated listening approach, which
focuses on the “processual” or “intensional” musical development as identified by
Charles Keil and John Shepherd, allows observation of the previously disregarded
rhythmic and sound nuances alongside the melodic and harmonic aspects.
Certainly recording technology has been partly responsible for the reconsideration
of several of these analytical perspectives. First of all, recording technology provided a
new way of registering particular performances of musical pieces: this is the third
relevant issue that contributes to the analytical approach used in this chapter.
The possibility of listening to recorded versions of performances opened a vast
corpus of new questions such as the ones mentioned in the previous paragraph.
Furthermore, Albin J. Zak III proposes that “records are not reproductions of anything;
they are ‘realities in themselves.’”5 He is appealing to the rock band leaders’ conception
of songs as only the starting point; “for them…the sound of the recording represented the
ultimate form of the artwork, and their compositional intention was to have a hand in
shaping the sonic relationships that made their identity.”6 Rock historian Carl Belz stated
as early as 1969 that although rock was not the first genre to use records and radio as its
4
For definitions of these terms, see Charles Keil, “Motion and Feeling through Music,” The Journal
of Aesthetics and Art Criticism, 24 (Spring 1966), and John Shepherd, “Media, Social Process and Music”
in Whose Music? A Sociology of Musical Languages, John Shepherd, Phil Virden, Graham Vulliamy,
Trevor Wishart, ed. (London: Latimer, 1977) and “A Theoretical Model for Sociomusicological Analysis
of Popular Musics,” in Popular Music 2, David Horn and Richard Middleton, ed. (Cambridge: Cambridge
University Press, 1982) as quoted by D. Brackett in Interpreting Popular Music, 21.
5
Albin J. Zak, The Poetics of Rock: Cutting Tracks, Making Records (Berkeley, Los Angeles, London:
University of California Press, 2001), 21.
6
Ibid.
primary media of expression, for rock, “records became the primary, common bond
among artists and listeners.”7 Rock recording is not to be assumed as a mere “‘acoustic
presentation’ of a written text (the score). It is itself a text, a sonic one; ‘what it sounds
like’ is precisely ‘what it is.’”8 Records as well as scores are semiotically mediated texts
open to interpretation. But we must not overlook that by their own nature, records have a
material content, sound directly experienced by the listeners and that “in addition to
whatever we make them to be, they insist as well on being exactly what they are.”9
These “acoustic publications” or “electric prints,” as Richard Middleton came to
call the recordings, “represent a reified abstraction,” which include more than “musical
thought.”10 They “encompass musical utterances and sonic relationships—material—
whose particularity is immutable and thus essential to the work’s identity.”11 As musical
ideas are not only expressed in sound but also become sound, we must take into
consideration new elements that are integral to the final artistic product, such as recording
tools, space and dynamics among the members involved in the actual recording and
mixing. What primarily concerns this dissertation is that the recording tracks bring
awareness that we are hearing song words in somebody’s voice and that voice delivers
linguistic meaning filtered through a particular expressive interpretation—a particular
intonation, timbre and rhythmic inflection.12
Taking into consideration the discussion of these three issues, this chapter
concentrates on the analysis of the text itself—the recorded track, the song. My work falls
among the textually oriented studies of popular music that approach lyric analysis in
7
Carl Belz, The Story of Rock (New York: Oxford University Press, 1969) as quoted in A. Zak, 13.
8
Zak, 41. Comments in brackets and italic type by L. Guillen.
9
Ibid.
10
Middleton, Studying Popular Music, 83.
particular “with awareness of their function not as verbal texts but as sung words,
linguistically marked vocal sound-sequences mediated by musical conventions.”13
Popular song often offers the chance of finding the composer of the music and the writer
of the lyrics in the same person. Sometimes the songwriter is even the performer herself,
as in three of the cases under study in this chapter. This particular circumstance provides
the opportunity of observing the manipulation of the lyrics—as Middleton calls them,
those “linguistically marked vocal sound-sequences”—as well as the musical elements
(including the sound of the recordings) as part of the materials that songwriters count on
to reach and influence their audiences.
Although “people may not listen to pop songs as ‘messages,’” it is obvious that
they take them into account.14 As Simon Frith says, “So the question remains: why and
how do song words…work?” And he answers himself by saying:
In songs, words are the sign of a voice…Singers use non-verbal as well as verbal devices
to make their points—emphases, sighs, hesitations, changes of tone…(which is why some
singers, such as the Beatles and Bob Dylan in Europe in the sixties, can have profound
significance for listeners who do not understand a word they are singing).15
In approaching the task of analyzing songs to find out “why and how do song
words…work?” several points must be taken into account:16
1. The way a singer performs a song determines what the singer means to
us and our relationship to him/her as the audience.
2. “Different pop forms engage their listeners in different narratives of

desire.”17 In the process of identifying themselves with different
11
Zak, The Poetics of Rock, 42.
12
Simon Frith, “Try to Dig What We All Say,” The Listener (June 26, 1980), as cited by A. Zak, 43.
13
Middleton, ed., Reading Pop: Approaches to Textual Analysis in Popular Music (Oxford, New York:
Oxford University Press, 2000), 7.
14
15
Ibid. Frith’s remarks resonate with my comment about consumers of Anglo-American popular song in
non-English speaking countries in the introduction of this dissertation.
16
The following three points have been adapted from Simon Frith, Music for Pleasure, 121.
17
Ibid., 121.
musical genres, listeners engage in fantasizing about belonging to
different sorts of communities.
3. Songs put ordinary language—common speech—into a refreshed poetic

form. “Songwriters give them a new sort of resonance” finding “the
pressure points of language…the syllables that locked a phrase up and
were begging to be prodded.”18
As already mentioned in Section 4 of this dissertation, “Expressivity Location:
Speech Mode vs. Poetic Mode,” we may find a wide range of vocal setting modalities,
from those that aim at a “speech quality” to those that aim at a “musical-poetic quality.”
Similarly to this classification, Richard Middleton describes the extremes of this
spectrum as “one characterized by verbal predominance over relatively vague musical
meanings, the other by the ‘musicalization’ of the words, often through paralinguistic
techniques.”19 Furthermore, he identifies three different approaches to setting lyrics into
music: “affect,” “story,” and “gesture.”
In the first case, the “affect” mode of setting lyrics absorbs words as expression,
merging them with the melody. Middleton explains that in this case “voice tends towards
song…intoned feeling.”20 This brings us back to Lawrence Kramer’s “songfulness”
concept, as explained in the introductory section of this dissertation. In these kinds of
settings, words are mainly perceived in the “poetic mode” of listening, in which certain
denotative aspects of words are captured parallel to their sonic qualities. This is the way
we listen to songs such as Kern/Hammerstein’s “All the Things You Are” and certain
sections of Björk’s “Isobel” and Gabriel’s “Sky Blue.”
In the second case, the “story” mode of setting lyrics retains the focus on the
denotative effect of words over the rhythmic and harmonic flow. In this case, words are
18
Clive James, “The Beatles,” Cream (October 1972), as quoted by S. Frith, Music for Pleasure, 122.
19
Middleton, Studying Popular Music, 228.
perceived in a “poetic mode” in which there is a preponderance of the “speech mode”—
the listener hears more of the speech qualities of the words than the sonic ones. The
straightforward discourse keeps its integrity by relegating rhythm, melody and harmony
to the background. We may find an example of this in Dylan’s “Simple Twist of Fate.”
In the third case, the “gesture” mode, words tend to be absorbed into music to the
point of becoming sound while the voice becomes almost an instrument. In some
instances, “verbal denotations can be almost completely subordinated to musical
effects—through rhythmic ‘non-sense’ language …and the organization of
inconsequential verbal phrases into rhyming musical parallelisms.”21 In this kind of
setting, words are perceived in a “poetic mode” in which there is a preponderance of the
“acoustic mode” of listening—the listener hears more of the sonic qualities of the words
than their speech denotative content. We may find an example of this in certain sections
of Björk’s “Isobel” and Gabriel’s “Sky Blue.”
The Highly Structured Song: Tin-Pan Alley Tune
During the “golden years of the Tin Pan Alley”—1910s to 1950s in the United
States—one of the song formats most often employed by composers such as George
Gershwin, Cole Porter, Irving Berlin, and Jerome Kern was a format that opened with an
introductory section—sometimes with a quasi-recitative flavor in a “story” mode word
setting—called the “verse,” followed by what they called the “refrain,” which was the
“real” tune, in an “affect” mode setting. This is the case in “All the Things You Are,”
composed by Jerome Kern with lyrics by Oscar Hammerstein in 1939 as part of the now
20
Ibid., 231.
21
Ibid., 228.
rarely performed musical Very Warm For May. Larry Starr and Christopher Waterman22
indicate that this verse-refrain form resulted from the fusion of the
nineteenth-century AABA structure and the verse-and-chorus form influenced by “the
craze of ragtime and jazz music” of the early twentieth-century. After the introductory
“verse,” the “refrain” follows in AA’BA form. The A section introduces the main
melody, which is repeated with new lyrics and some slight melodic changes (A’). Then
the B section or bridge immediately follows with new musical material and lyrics. It then
finishes with the return of the A melody, usually with new lyrics and some melodic
alteration, which may include an addition or “tag,” becoming A’’.
Several talented composers explored variations on this format, but what made it
especially successful was its predictability. Peterson and Berger comment that the
oligopoly in airwaves and recording studios before the 1950s demanded standardization of
the Tin Pan Alley tune formula in the market.23 Once this formula proved to be widely
accepted among audiences, its reproduction meant a guaranteed commercial success.
Before turning to the analysis of “All the Things You Are,” it is necessary to point
out that this is the only example among the popular songs considered in this section that
presents different actors in the role of songwriter and performer. Although the focus of
this section is the diversity of compositional procedures used by songwriters to achieve
their desired effects on listeners, in this specific song, it may prove productive to compare
the published score—as the only document giving testimony to Kern and Hammerstein’s
compositional intentions—with two radically different renditions of the same song: the
22
Larry Starr and Christopher Waterman, American Popular Music: from Minstrelsy to MTV (New York,
Oxford: Oxford University Press, 2003), 62, 64.
23
R. A. Peterson and D. G. Berger, “Three Eras in the Manufacture of Popular Music Lyrics,” in The
Sounds of Social Change, eds. Denisoff and Peterson, as quoted in Simon Frith, Music for Pleasure, 119.
first one sung by Ella Fitzgerald (the version used during the listening experience) and
the second one performed by Barbra Streisand. These “metteurs en scène,” as David
Laing calls these performers, approach “a song as an actor does his part—as something to
be expressed, something to get across.”24 “His aim is to render the lyric faithfully. The
vocal style of the singer is determined almost entirely by the emotional connotations of
the words.”25 Working through her interpretation, each singer brings her own
idiosyncratic vocal rhythmic articulations and vocal timbre nuances to the phrases of “All
the Things You Are.” By contrasting these differences, we may observe the very moment
of “songfulness” as the personal creation of each artist.
The lyrics of this song are highly structured and abound in redundancy devices.
Mark W. Booth comments that the “repetition of phrasing in successive stanzas, where
small modifications adapt the words to a new use or effect, is the signature of the
ballad.”26 Booth considers this device not a mere stylistic convention but a mnemonic
resource related to the oral nature of the primitive ballad. Although here we are not
dealing with a traditional “oral ballad,” which resorted to redundancy in order to help the
creator and later to help singers to remember the lyrics, the internal repetition certainly
contributes to this song’s popular ballad flavor, creating a dent in its audience’s memory.
Tin Pan Alley lyrics show a prominent concern for “privacy” and “romance.” The
rapidly growing American middle-class of the first quarter of the twentieth century had
elite aspirations and cared about property ownership and privacy. These kinds of interests
are reflected in some of the lyrics of this period: romantic love, a wife, a home to share.
The third-person narration of the old European ballads gave way to the first-person stories
24
David Laing, as quoted by Simon Frith, Music for Pleasure, 122.
25
Frith, 122.
of Tin Pan Alley. “This first-person mode of address was reminiscent of elite poetic
forms such as sonnet, but Tin Pan Alley songwriters avoided the flowery
language…opting instead for a more down-to-earth manner of speech,” which “allowed
the listener to identify his or her personal experience more directly with that of the
singer.”27
“All the Things You Are” talks about romantic love. In the first section of “the
verse,” the three lines of the first stanza tell us of longing for something still unknown;
the next three lines of the second stanza give us the answer for each of the three needs in
the same order that they were introduced. With the conflict resolved in this introductory
section, the “refrain” of the song proceeds into a more luxurious melody that almost
incorporates the lyrics as an additional colorful instrumental element.
Observing the published score, in the “verse,” Kern creates a colloquial sensation
by rhythmically moving with the intonation inflections of the text in a “quasi-recitative”
style.28 Each verse is set to the same rhythmic pattern. Its pace is rather fast, especially in
comparison with the way the lyrics in the second half of the song (the “refrain”) are set.
Each line of the first two stanzas lasts two measures of 2/2 and is ten syllables long, while
at least the first two lines of the third stanza set in the “refrain” consist of nine syllables
stretched over four measures. Melodically, each line of the “verse” opens with an
ascending perfect fourth—sometimes transformed into a perfect fifth—which will become
a motif later in the “refrain” of the song, followed by a simple arching melody. The
26
Mark W. Booth, The Experience of Songs (New Haven and London: Yale University Press, 1981), 59.
27
Starr and Waterman, American Popular Music, 67.
28
Oscar Hammerstein II and Jerome Kern, All the Things You Are (Polygram International Publishing, Inc.,
1939).
repetitive rhythm and unattractive melodic contour allow the lyrics to take the foreground
in this “quasi-recitative” section.
The clear and stable tonality of G major—harmonizing with a simple I–V–I on G
major without too many deviations—does not distract the listener’s attention from the
lyrics either, especially once it is compared with the sequential and modulatory nature of
the “refrain” that follows. Once the action is set and resolved, the song may indulge
in a busier harmony over a static descriptive text such as the one found in the “refrain.” In
contrast, in the first section of the song, the “verse,” the text is highly structured, with
repetition of phrasing as explained in previous paragraphs. The predictable syntactic
structure offers the audience a grid to follow to comprehend the lyrics.
We know from several accounts of songwriter teams of this period that the music
usually came first, followed by the lyrics. The text was written to fit a previously
composed melody or as a parallel process. This is particularly evident in the “refrain” of
“All the Things You Are.” The music takes over in the second half of the song. It even
dictates the structure of the text, which molds around the musical phrasing and reinforces
certain harmonic procedures. The number of syllables changes from one verse to the
next. Also, the rhyme is loosely structured, which controls the natural tendency to
engage with the musicality of the words in combination with the flow of the melody
and attracts more attention to a linear reading of the text. By using this tactic, the
songwriters guarantee a certain attention of the listener to the denotative content of the
lyrics.
The text remains simple and does not try to address multiple semantic levels; it
consists of a straightforward enumeration of images that reminds the song’s “persona” of
his or her “object of love.” The lyrics’ structure signals musical events such as the
beginning of harmonic sequences, the end of musical sections, and similarities between
sections or the return of one of them.
The “refrain” has four sections: A–A’–B–A’’. Both A and A’ sections are made
of two phrases of sequences in fourths (dominant-tonic type): Fm–Bbm–Eb7–Ab7–Db7–G7–C7,
and Cm–Fm–Bb7–Eb–Ab–D7–G. The B section—or bridge—follows, taking
over the G major but breaking this sequence and still keeping the two-phrase structure,
although this time the phrases are much shorter, only four measures compared to the
seven measures of the previous ones. The first phrase of the B section stays on G major
developing a typical cadential progression of I–ii–V–I to immediately modulate in the
second phrase. Although this second phrase of B parallels exactly the previous cadential
progression, now everything is in E major. The A’’ section opens with the same sequence
as the one employed at the beginning of the refrain in the A section. It even starts on F
minor, but stops halfway on the Ab major of the sequence in fourths, collapsing the
previous two phrases into one of seven measures followed by a cadential coda on the new
and final tonality of Ab major.
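The interval pattern of the two sequences in the A and A’ sections can be checked with a short computation. The following is only an illustrative sketch in Python: the chord roots are taken from the sequences as spelled in this section, while the pitch-class table and the function names are assumptions introduced here for illustration and do not come from the published score or from any analytical library. Counting upward in semitones, a perfect fourth equals 5 and an augmented fourth equals 6.

```python
# Pitch classes for the root names used in the two sequences (C = 0).
# This table is an illustrative assumption, not part of the score.
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def root_of(chord):
    """Strip the quality suffixes ('7', 'm') and return the root name."""
    return chord.rstrip("7").rstrip("m")


def ascending_intervals(chords):
    """Semitone distance, counted upward, between successive chord roots."""
    roots = [PITCH_CLASS[root_of(c)] for c in chords]
    return [(b - a) % 12 for a, b in zip(roots, roots[1:])]


# Chord sequences as spelled in the text.
phrase_a = ["Fm", "Bbm", "Eb7", "Ab7", "Db7", "G7", "C7"]
phrase_a_prime = ["Cm", "Fm", "Bb7", "Eb", "Ab", "D7", "G"]

intervals_a = ascending_intervals(phrase_a)
intervals_a_prime = ascending_intervals(phrase_a_prime)

# Upward distance from the first root of A (F) to the first root of A' (C).
transposition = (PITCH_CLASS["C"] - PITCH_CLASS["F"]) % 12

print(intervals_a)
print(intervals_a_prime)
print(transposition)
```

Both phrases yield the same pattern, [5, 5, 5, 5, 6, 5]: every step is a perfect fourth except the fifth one (Db to G, and Ab to D), which is an augmented fourth. The second phrase begins seven semitones above the first, which is equivalent to a perfect fourth below it.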
The text of the “refrain” signals the beginning of each sequence as well as the
parallelism between them by using the same phrase “you are” both times (mm. 1 and
9 of the published score). Kern achieves this focalization by carefully setting “you
are” on two long notes, a whole and a dotted half-note, which stop the flowing of tempo
in the music. The rest of the text immediately following “you are” gains a quicker pace
by fitting more syllables into each measure. Although the harmonic rhythm of the
sequence is constant—one chord per measure—the layout of the text varies in density.
While the two syllables of “you are” are spread over two measures, the remaining sixteen
syllables of the text (eighteen in A’), where the explanation of what “you are” takes
place, are crammed into the next five measures. In the context of a syllabic setting such
as this one, the pace and density over the measures will have a direct influence on the
perception of the text in general. The slower rhythmic pace allows a better understanding
of the contained lyrics. In contrast, a tighter layout produces a blending of syllable with
syllable and of syllable with melodic line.
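The contrast in text density described in this paragraph reduces to simple arithmetic. The sketch below, in Python, uses only the syllable and measure counts given in this section; the function name is a hypothetical label chosen for illustration.

```python
def syllables_per_measure(syllables, measures):
    """Average number of sung syllables per measure."""
    return syllables / measures


# Counts taken from the discussion of the published score.
motto = syllables_per_measure(2, 2)          # "you are" spread over two measures
rest_a = syllables_per_measure(16, 5)        # remainder of the A text
rest_a_prime = syllables_per_measure(18, 5)  # remainder of the A' text
verse_line = syllables_per_measure(10, 2)    # one ten-syllable line of the "verse"

print(motto, rest_a, rest_a_prime, verse_line)
```

On these figures the motto moves at one syllable per measure, while the text that follows it moves at 3.2 (A) and 3.6 (A’) syllables per measure; a line of the “verse,” by comparison, averages five.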
The result is a generalized idea of what has been said in these phrases set into the
A and A’ sections. While the listener attends to the words “you are,” the specifics of the
description of her or his “object of affection” are overlooked, and he or she remembers only
that this person is a series of things. The listener attends away from the precise meaning
of this description and is satisfied with the assurance that it has one without its mattering
what it is.
In this same description in the A and A’ sections, Hammerstein explores most of
the alliteration devices at hand with his lyrics. The associations between similar
phonemes that are placed close to each other create a certain sustained musicality in the
lyrics. Thus, the listener tends to attend away from the meaning of the words and listen to
their instrumental cacophony.
(A section)
You are the promised kiss of springtime
That makes the lonely winter seem long.
(A’ section)
You are the breathless hush of evening
That trembles on the brink of a lovely song.
The first line explores the alliteration between sibilant [s] phonemes that by the
second line make a counterpoint against the liquid [l]. The third line connects the
unvoiced fricative phonemes [θ] – [s] – [ʃ] in a backward movement of the point of
articulation of the tongue (upper teeth, teeth ridge, and hard palate).
All the devices described so far contribute to a fragmented grasping of isolated
words and phrases. In fact, listeners tend to grab chunks of lyrics in an imprecise
non-linear way. The usual fragmented nature of a song’s lyrics contributes to the broken way
the listener tends to grasp the text. According to Booth, these fragments behave like
“standing patterns as opposed to linear sequences of growth, evolution, discovery,
catharsis,” which is a common procedure in most other types of discourse.29 This
does not mean that song text is shapeless but that its elaborated patterns connect to each
other in a different manner. Booth comments on the particulars of this relation through a
quote from Edward Doughtie’s book Elizabethan Air:
In song lyric, although the images and ideas may be related to a central theme or an
obvious central conceit, they tend to be isolated from each other; they accumulate
rather than develop. Rarely, in fact, does an image or thought extend beyond two
lines…the listener is rarely able to make connections of much complexity over a
longer space of time.30
Returning to the setting of the song’s lyrics, after twice establishing comparisons
of the “object of love” with certain pieces of nature—“You are the promised kiss of
springtime” and “You are the breathless hush of evening”—Kern and Hammerstein allow
the next “you are” to move in quarter-notes over the arpeggio of G Major. This is one of
the modulatory turns that the song takes from m. 15 until m. 20. Although this seems
to break with the device of focusing attention on the phrase “You are,” a couple of
measures later (mm. 22-23), the song returns to a modified version of the emphatic tactic.
This time, the B section closes with a palindromic effect that brings back the phrase “you
are,” although this time as a closure of the lyrics’ statement. This happens at this point of
the song for two reasons. First, because the phrase “you are” has already been well
established during its two previous appearances. By now it is only a reminder—empty of
specific meaning—that a new enumeration is starting. There are several meditational
practices, as Booth comments, which “buil[d] on the fact that any word sheds its sense
upon a small number of consecutive repetitions.”31 Nevertheless, these two words still
produce their denting effect on the listener; they keep their pragmatic value while their
semantic one is almost extinguished. Second, the change in the way “you are” is
introduced is a sign that announces the beginning of a completely different musical
section.
This section behaves harmonically differently. Instead of a sequence, a
progression modulates the second time around. Melodically it is also different; instead of
a seven-measure melodic phrase, there is a four-measure one. The fact that m. 25
reintroduces the same melody of the first five measures of the refrain, together with the
fact that by the fifth measure the same “some day” of the beginning of this A’’ section is
repeated, indicates that things seem simultaneously similar but different. This is a return
of the A section but not under the same conditions as before.
A’’ establishes the same game of two words as a “motto” heading a melodic
phrase that, at least in section A, is repeated a fourth down the second time around. But
new words are now used: “some day.” This not only breaks the monotony, but also puts
29
Booth, 25.
30
Ibid., 24.
the listener on alert. Contrary to his or her expectations, the listener is surprised by the
sudden return of the “motto,” “some day,” set into what seems to be a variation of the
opening melody. This acceleration of events propels the listener toward the end of the
song, expressing the hopeful wish that “some day” all the things that represent him in the
song become hers: “all the things you are, are mine.” The title of the song occupies this
strategic place and has the function of summarizing the song.
(verse)
Time and again I’ve longed for adventure, (10) a
Something to make my heart beat the faster. (10) b
What did I long for? I never really knew. (10) c
Finding your love I’ve found my adventure, (10) a
Touching your hand, my heart beats the faster, (10) b
All that I want in all of this world is you. (10) c
(refrain)
You are the promised kiss of springtime (9) a
That makes the lonely winter seem long. (9) b
You are the breathless hush of evening (9) c
That trembles on the brink of a lovely song. (11) b
You are the angel glow that lights a star, (10) d
The dearest things I know are what you are. (10) d
Some day my happy arms will hold you, (9) e
And some day I’ll know that moment divine, (10) f
When all the things you are, are mine. (8) f
Before proceeding to the analysis of the two recordings, there is a final
observation to make on the choice of lyrics employed as the “motto” or heading. Both
“you are” and “some day” are made up of what is known in linguistics as deictics.
Phrases made of words like “you are” and “some day” are semantically empty and
depend totally on the context of the utterance. They have the “function of situating the
31
Ibid., 39.
speaker’s utterance in a specific time and place. They do not characterize or qualify
someone or something, but ‘point to’ a person, an object, a time.”32 Deictics are used
more frequently in spoken language than in written. As Mauro Calcagno says, theater
scholars regard the high incidence of deictics in dramatic texts as one of the main factors
that distinguishes the language of theater from that of narrative or poetry. I argue in
addition that the colloquial flavor of several song lyrics is created through the extensive
use of this kind of deictic phrase.
The employment of the deictic phrases emphasizes the performative and oral
nature of the song’s texts. Deictics contribute to creating a kind of direct immediacy to
the audience during the act of communication, regardless of the context: whether a live
concert or a recording. In theater or opera the audience identifies with specific characters
on stage. However, the general public approaches song by identifying themselves with
different “personas” coexisting in it. Whether the singing voice is male or female, the
listener never identifies with the person being addressed, in this song with the “you” of
“you are.” On the contrary, the listener assumes the place of “I.” In the case of a
narration, the listener tends to assume the perspective of the narrator of the story. If the
perspective and opinion on the topic is not shared, the process of identification does not
take place. In contrast, if the song embodies bits of the ideals of the group to which the
audience belongs, the communion takes place. In both cases, when the identification is
with “I” or when it is with the narrator, it is the power of the human voice that invites the
listener to put him or herself in the place of the singing voice.33
32
Mauro Calcagno, “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for Musical
Theater,” Journal of the American Musicological Society (Vol. 55, n. 3, Fall 2002), p. 390.
33
See Booth, pp. 16-17 for further details of these arguments.
Turning now to the two recorded versions of “All the Things You Are,” these
renditions represent two very different approaches to the same song, which in turn
provoke distinctive reactions in their listeners. Of course, what is known of both artists’
careers and styles is read into these versions too. Ella Fitzgerald, diva of the “big-band-era,”
recorded “All the Things You Are” in 1963 for her album Ella Fitzgerald Sings
the Jerome Kern Songbook—the seventh in a series of “Songbook” albums dedicated to big
Tin Pan Alley and Jazz songwriters such as Cole Porter, Duke Ellington, and Harold
Arlen. She initiated this series of recordings under the guidance of her manager and
producer, the owner of Verve Labels, Norman Granz. Especially with this “Kern” album,
Fitzgerald ventures outside the emblematic raw, energetic, jazzy vocal style of her
performances into the more well-polished sound of these Broadway musical tunes, which
may appeal to a broader audience. Recorded only four years later, Barbra Streisand’s
track represents the late 1960s-early 1970s style identified as “adult contemporary,”
which was an extension of the old crooner tradition.34 Streisand recorded “All the Things
You Are” on her album Simply Streisand, which was released in 1967 by Columbia
Records, the label for which she had been recording since 1962. This album was produced by
Jack Gold and Howard Roberts with orchestral arrangements by Ray Ellis, her long-term
partner in this business.
Both versions may be classified as “vocal with orchestra,” meaning a vocal
soloist, who is clearly the leading figure, and a “more or less anonymous” orchestra just
accompanying.35 These tracks also share a moderate tempo: 104 quarter
notes per minute in Fitzgerald’s performance and 92 in Streisand’s. But the arrangements and
34
Starr and Waterman, p. 307.
35
David Brackett, Interpreting Popular Music, p. 58.
sound of the orchestras are quite different in these two recordings. First of all, Fitzgerald
does not sing the “verse” at all. She delivers the entire “refrain” and then goes back and
repeats sections B and A’’. Streisand opens with the “verse” followed by the complete
“refrain” ending with the repetition of only A’’.
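The difference between the two tempi mentioned above can be made concrete. The sketch below, in Python, assumes that the tempo figures are quarter notes per minute and that each measure holds four quarter notes, as in the 2/2 of the published score; the function name is illustrative only.

```python
def seconds_per_measure(quarter_notes_per_minute, quarters_per_measure=4):
    """Duration of one measure in seconds at the given tempo."""
    return quarters_per_measure * 60.0 / quarter_notes_per_minute


fitzgerald = seconds_per_measure(104)  # tempo reported for Fitzgerald's track
streisand = seconds_per_measure(92)    # tempo reported for Streisand's track

# How much longer a measure lasts in Streisand's version.
ratio = streisand / fitzgerald

print(round(fitzgerald, 2), round(streisand, 2), round(ratio, 2))
```

Under these assumptions a measure lasts about 2.3 seconds in Fitzgerald’s version and about 2.6 seconds in Streisand’s, that is, roughly 13 percent longer.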
Fitzgerald’s version is backed by an arrangement that recreates the “Big Band”
sound. This is achieved by the prominent use of the brass section playing a swing
rhythmic riff in block—with sforzati—as an introduction and reappearing during
interludes, or otherwise, punctuating certain beats with brief chords under the vocals. The
rhythm of that riff has a strong sense of swing in its ternary subdivided pattern, here
transcribed:
EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the Things You Are”
The arrangement is held together by the rhythm section: ride cymbal and bass. There is
only a touch of strings that comes briefly in the B section.
Streisand’s version recreates a Latin soft jazz ballad sound by molding the piece
around a slow bossa pattern with rim shots and triangle—the latter reminiscent of
Northeastern Brazilian music. The main difference is that there are no strong swing brass
interventions in this version. The softer timbres of the string sections (duplicated at times
by backing vocals) and woodwinds are used in textures that privilege lyrical
countermelodies over the smoother bossa beat.
In comparing the vocal renditions of Fitzgerald and Streisand against the
published score, we find some jazz improvisational elements. The following are the
transcriptions of the eight opening measures of the “refrain.” These have been transcribed
in the original keys in which each singer performs them. Rhythmic and melodic details
have been transcribed as faithfully as possible to show the different nuances of each
singer.
EX. 3.2: “All the Things You Are”: Two versions and published score of the A section. Each in the key
performed or published. Transcriptions by Lorena Guillen.
While Fitzgerald sings mostly on the beat—almost parallel to the score version—
with slight delays on “that” and “the” of “the lonely” and anticipations on both syllables
of “winter,” by the fourth measure Streisand is already two beats behind and most of her
rhythms are slightly modified from the score. However, it is Streisand who sings the
pitches straightforwardly and will for the most part stay faithful to the melody until the
end, with minor improvised vocalizations in the last phrases. Fitzgerald’s rendition
abounds in pitch bendings in between notes and scooping in almost every attack. She also
later introduces major melodic changes, such as the three repeated G4 natural pitches,
which take the place of the upwards arpeggio of G3-C4-G4 of the original in “You are
the angel glow” and the subsequent G4-Ab4-G4-E4 on “that lights a star.”
Although Fitzgerald clearly articulates every word sound in a soft crooning vocal
timbre (recorded close to the microphone), Streisand deliberately emphasizes and
prolongs certain consonants. She brings up those phonemes that are associated by
alliteration inside each verse, such as the [s] in “kiss” and “springtime” and [θ] and [s] in
“breathless,” or prolongs notes on the final [m] of certain words, such as “time” and
“seem.” These stretched out sounds are reinforced by a more prominent reverb effect
applied to the mix of her voice.
These particular vocal effects, together with the orchestral sound in each case,
contribute to conveying a specific type of sound, which triggers different emotional states
in the audiences. Fitzgerald keeps the swing big band style with an on-the-beat
articulation combined with a serene and pleasing vocal tone. Streisand emphasizes the
lounge bossa style with a simple and relaxed vocal tone that floats freely over the beat,
falling behind in a lazy manner. So, while the former calls the listener to a comfortable
but punctuated and energizing sound experience, the expansive sustained sonorities of the
latter invite the audience to relax.
The Narrative Type: Strophic Form
American songwriter Bob Dylan provides several good examples of pure
narrative texts. One of them, in his typical “strophic song form” (A, A’, A’’, A’’’, etc.), is “Simple
Twist of Fate,” a track on his 1975 album Blood on the Tracks. As an American urban
folk icon, he dragged this genre into the modern era of rock by introducing electric band
sound to his recordings and live performances, starting with his 1965 album Bringing It
All Back Home and following in July of that same year with his performance at the
Newport Folk Festival. Throughout his career, however, his songwriting style has
remained faithful to the early American folk tradition. Some of his songs have been
modeled “implicitly or explicitly, on the musical and poetic content of preexisting folk
material.”36 Furthermore, his performing style has “demonstrated strong affinities to rural
models in blues and earlier country music” favoring “a rough-hewn, occasionally
aggressive vocal, guitar and harmonica style.”37
The object analyzed in this case is the recorded track itself, conceived as the
auteur version (following Laing’s distinction) in its full dimension. Since Dylan is both
the songwriter and performer of this track, its value and interest lie not in the features of
the song but in the unique way he sings it, his personal inflections. “The appeal of
auteurs is that their meaning is not organized around the words…in the situations his
songs portray, but in the exceptional nature of his singing style and its instrumental
accompaniment.”38
As Albin Zak points out in talking about Dylan’s John Wesley Harding (1967),
“against the contemporary trends in recording, which tended in varying degrees towards
36 Starr and Waterman, 281.
37 Ibid., 278.
the sonic opulence exemplified by Sgt. Pepper’s, and in contrast even to the ‘thin wild
mercury sound’ of Dylan’s own Blonde on Blonde album, it strips things down to an
elemental level—bass, drums, acoustic guitar, voice, harmonica, three chords, and no
obvious sonic manipulations.”39 Eight years later, “Simple Twist of Fate” returns to that
stripped sonority with his strummed acoustic guitar, bass, harmonica and a quasi-spoken
singing quality. Blood on the Tracks is a mixture of some recordings that feature this bare
sonority and others that have a more stylish band sound with arpeggios and
countermelodies between two guitars (one of them a steel guitar played by Buddy Cage), Tony
Brown on electric bass, Paul Griffin on organ, drums, and Dylan’s own harmonica and
voice.
The lyrics of this song are made up of six stanzas, each set to the same melody with a
“quasi-refrain” at its end. Although these last verses of the stanzas are set to the
same music and finish with the same words (which are, not surprisingly, the title of the
song, “A Simple Twist of Fate”), they open with a different heading each time. These
heading words state the action or verbal phrase that will affect this “simple twist of fate”:
“And watched out for a simple twist of fate;” “Moving with a simple twist of fate;” “And
forgot about a simple twist of fate;” etc.
This song employs a narrative type of discourse, which unfolds events in a linear
way. In the first four stanzas, the action takes place in the past. A third-person,
omniscient narrator tells about the first encounter of a man and a woman.
The second part of the song—the remaining two stanzas coming after a harmonica solo
that, in a way, marks the passage of time—takes place in the present. In the last stanza,
38
39 Zak, The Poetics of Rock, 48.
the narrator reveals himself as the male protagonist of the amorous encounter. A short
harmonica solo closes the song.
The strophic setting that Dylan chooses goes well with the folkish story-telling
style and allows the audience to concentrate on the details of the story. The songwriter
wants the listener to focus on the lyrics without big musical distractions or fragmentation
of the linear development of the story. The traditional heavy rhyming of the verses seems
to aim toward having the same effect on the audience.
(1st Stanza)
They sat together in the park (8) a
As the evening sky grew dark, (7) a
She looked at him and he felt a spark (9) a
Tingle to his bones. (5) b
’Twas then he felt alone (6) b
And wished that he’d gone straight (6) c
And watched out for a simple twist of fate. (12) c
The rhyme scheme of the first three verses and the next two consecutive pairs,
although strong and attractive, does not distract from the main point of the story; instead,
it contributes to the narration. This rhyme scheme points to words that are key to the
story and creates a parallel narration that synthesizes and contributes to the essential
thread of the theme:
• The place is the “park.”
• The opposition “dark”/“spark”: first it was “dark” but then there was a
“spark” of hope in a new relationship.
• “Bones”/“Alone” proposes the extreme loneliness and bareness
represented by the “bones,” especially after this casual relationship ends.
• “Straight”/“Fate” gives the unavoidable sense or direction of destiny.
This rhyme scheme, as well as the repetition and placement of the song title at the
end of each stanza, clarifies the poetic structural frame. This kind of predictable form
liberates the mind of the listener, who in this way can trust and concentrate on the linear
succession of events of the story being told. The rhyme scheme is also very regular and
its placement predictable (at the end of each verse). There are no further strong
alliterations or internal vowel rhymings that could deviate or offer alternative webs of
phonemes or morphemes. The rhyme moves the lines ahead, propelling the rhythm of the
stanza in a straightforward motion.
The melodic development also contributes to this sense. The melody that is
repeated for every stanza is fifteen bars long. In contrast with the way Kern and
Hammerstein approached making “All the Things You Are,” “Simple Twist of Fate”
shows evidence that Dylan may have written the lyrics first and then set them to music.
In this case, it is the music that follows the lyrics’ structure and not the other way around.
The first three verses, which are assonantly rhymed (vowel rhyme), are set to the same
musical phrase that repeats three times.
EX. 3.3: Transcription of opening three measures of Dylan’s “Simple Twist of Fate”
The next two rhymed verses—verse four and five—are set to a second melodic
phrase, which is also repeated twice to fit each one of the mentioned verses.
EX. 3.4: Transcription of mm. 4 to 6 of Dylan’s “Simple Twist of Fate”
Again, each verse is set to a two-measure melodic phrase. The stanza
keeps this regular pace for the next verse and slows down only in the last one, which
is the refrain. The regular structure of the melodic phrases is evidence of the lyrics
preceding the music.
EX. 3.5: Transcription of the refrain of Dylan’s “Simple Twist of Fate”
If song lyrics resemble poetry in some way, it is in their rhyme and metric
schemes. Lyrics, as well as poetry, are created by feeling the feet—the number of accents
per verse—regardless of the number of syllables in between those accents. Dylan’s
melody tries to fit and follow the feet and accents preexisting in his lyrics. This results in
the subdivision of beats into their proportional rhythmic values to fit the extra syllables
of the irregular verses. Otherwise, as happens in “All the Things You Are,” the lyrics
would have had to be created as well-proportioned parts to fit the music exactly. For
example, the first three verses in stanza one have, respectively, 8-7-9 syllables; the first
three verses in stanza two have 8-10-8 syllables; and stanza three has 9-9-9 syllables. All
these verses are set to the same melody, as is usual in any strophic song setting, whether
popular or art song.
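The irregularity can be made concrete with a short tabulation. The following Python sketch is an illustration added here, not part of the original analysis. It uses only the syllable counts reported above, and the names `counts` and `spread_per_verse` are hypothetical. For each of the three verse positions, it shows the range of syllables that the same repeated melodic phrase has to absorb from stanza to stanza.

```python
# Syllable counts for the first three verses of stanzas one to three,
# exactly as reported in the text (no automatic syllable counting is attempted).
counts = {
    "stanza 1": [8, 7, 9],
    "stanza 2": [8, 10, 8],
    "stanza 3": [9, 9, 9],
}


def spread_per_verse(counts_by_stanza):
    """Return (min, max) syllable counts for each verse position across stanzas."""
    positions = zip(*counts_by_stanza.values())
    return [(min(p), max(p)) for p in positions]


spreads = spread_per_verse(counts)
for verse_number, (low, high) in enumerate(spreads, start=1):
    extra = high - low
    print(f"verse {verse_number}: {low} to {high} syllables, "
          f"up to {extra} extra syllable(s) to fit by subdividing beats")
```

Under these counts, the second verse position varies the most, from 7 to 10 syllables, which is consistent with the claim that the melody bends to accommodate lyrics written first.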
Fitting lyrics to music in this particular way is an indication of where the
expressive value of the song is located. In this case, it is in the semantic meaning of the
lyrics and in the communicational value of text as carrier of denotative content. This
discourse needs to be uninterrupted and fluid to make any sense. Text as an acoustical
phenomenon is almost ignored.
This musical setting follows only the lyrics’ main accents and shape to make it
understandable without further prosodical details. Here the strophic song tries to solve the
inconvenience of not molding exactly to the intonational arch of the text with the
repetition of the melody. This procedure is taken to the point that the melody and its
arrangement almost completely lose their ability to surprise the listener. In this way, they
release the listener’s attention to focus on the story and its logical sequence away from
the melodic swirls of the music. In sum, the strophic setting eases the ears and mind of
the listener.
As a counterpart, Dylan’s vocal delivery of the lyrics is almost spoken at times.
He breaks his sustained tone into a sound contour of indeterminate pitch. This quality
prevails over the singing, especially toward the end of each melodic phrase of the stanzas
and each time the refrain appears. By manipulating his voice in this way, Dylan
counteracts the lack of prosodical observance of his melodies to the speech intonemes of
his lyrics. The strophic repetition does not allow the flexibility of following speech
intonation arches. When Dylan speaks the lyrics, the words break free into quasi-speech.
In terms of the lyrics themselves, Dylan intends to counteract the natural tendency
of a song’s text to be processed in the “poetic mode” by: first, using predictable rhyme
schemes at the end of verses; second, avoiding further alliteration inside verses, which
could deviate the attention of the listener from the linear succession of concepts; third,
avoiding repetitive semantic schemata (beyond the repetitive refrain). Resorting to these
tactics, the songwriter minimizes the musical structures, Tsur’s “sound patterns,” of his
lyrics, and assures a propositional temporal processing similar to the one speech follows.
The left hemisphere of the brain composes speech by retrieving from memory
“several morpheme units…according to grammatical rules” and ordering them “into a
specified temporal arrangement.”40 The left side of the brain is usually associated with
propositional thought: speaking, reading, and writing. In contrast, songs or phrases of
their lyrics are remembered as wholes. As Booth says, “The parts of these units are not
pieced together tone by tone, word by word, but rather are recalled all at once as a
complete unit.”41 The appositional capacity of the right hemisphere of the brain is the
ability of “comparing perceptions, schemas, engrams…,” which are remembered and
produced as intact wholes.42 This is the way listeners retrieve fragments of a song’s lyrics.
But Dylan counteracts this appositional tendency by controlling the “sound patterns” of
his lyrics and keeping the flow of the narration. He delivers the story in his usual quasi-
spoken vocal tone and idiosyncratic story-telling style.
40 Booth, 68.
41 Ibid.
Redundancy: Variation on the “Verse-Chorus” Form
Relying on other effects, the imprint that Björk’s “Isobel” leaves on the listener is
quite different from that of Dylan’s song. “Isobel” was written by Björk, Nellee Hooper
and Marius De Vries, with lyrics by Sjón. Nellee Hooper and Björk produced it together
and released it in 1995 on Björk’s second album Post. This song offers the unusual
opportunity of comparing the sound properties between two differently mixed versions:
the first in Post, where Björk herself participated in the mixing process; and the second
made by Eumir Deodato for Björk’s 1996 CD Telegram. The latter is a remix album made up
largely of songs from Björk’s album Post. Björk personally commissioned nine artists
and gave them complete freedom to remix her tracks. After receiving the mixes, she went
back to the studio and re-recorded the vocals to complement these artists’ versions.
Deodato’s spin on “Isobel” opens up the texture with a straightforward pop sound
version. Actually, Björk comments on her website:
For me Telegram is really Post as well but all the elements of the songs are just
exaggerated. It’s like the core of Post. That’s why it’s funny to call it a remix album, it’s
like the opposite. It’s like the-cover-of-Post-me like this [she smiles beatifically] in pink
and orange and big ribbon and it’s like a pressie for you. But Telegram is more stark,
naked. Not trying to make it pretty or peaceable for the ear. Just a record I would buy
myself. (Like a letter to yourself?) Yeah, more, sort of...fuck what people think. It’s a
truth thing. Which is maybe a contradiction because it’s other people’s remixes. (Blah
Blah Blah, December 1996)43
In her Post version of “Isobel,” Björk and De Vries play keyboard over a
rhythmic base of “ethnic” percussion also programmed by De Vries. Deodato and Björk
add a string arrangement. What differs radically between this original version of
“Isobel” in Post and the one remixed by Deodato in Telegram are the levels of volume in the
mix and the inclusion or suppression of certain recorded instrumental tracks. From the
42 Ibid., 69.
opening sustained harmonic string sequence with trumpet solo in Post, Deodato only
keeps his own arrangement of strings. The softer and diluted “ethnic” percussion is
replaced by a pop drum-set pattern that is brought quite prominently into the mix. The
original bass, which was muffled and back in the Post mix, is replaced by a “funkier”
bass that is also up front in the Telegram mix. All the programmed sequences and
keyboard sounds of Post are stripped out in Deodato’s mix. This now clean-cut pop track
directly affects the way in which Björk herself interpreted her vocals when she
rerecorded them after the new mix. She goes for a less affected vocal inflection of the
lyrics. Her voice also is mixed with less processing, a more “in-your-face” sound. The
hermetic and mysterious aura of “Isobel” in Post is replaced by a banal and
straightforward sound, which contrasts, like an ironic comment, with the still hermetic
lyrics.
“Isobel” is a variation on the “verse-chorus” form.44 In the case of this song, it
follows this form:
Part   Section
1st    A (verse with refrain)
2nd    Chorus (1)
3rd    A’
4th    Chorus (2)
5th    Bridge (with climb at the end)
6th    Chorus (3)
7th    A’’
8th    Chorus (4)
9th    Bridge (with climb)
10th   Chorus (5)
TABLE 3.1: Song format of Björk’s “Isobel”
43 See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas
Ltd.
44 Here the term “verse” is used to denote a complete strophe of a song. This is the most commonly used
term among popular music songwriters to refer to this part of a song. In this section of the dissertation, it
will be introduced in quotation marks each time it is employed in this way.
EX. 3.6: Transcription of all the sections of Björk’s “Isobel”
Although the song has three strophes or “verses,” it repeats the chorus five times.
After its third reappearance, instead of being sung to the same lyrics, the chorus’ melody
is performed on babbling: “na, na, na…” Another device to add variation to the
traditional succession of “verse-chorus” is the insertion of “bridges” and “climbs.”45
Between the second and the third chorus the first “bridge” is introduced, which brings a
“climb” of two lines attached at the end. The bridge itself is three lines long. The second
45 The “bridge” is a fresh new section inserted to offset the predictable verse/chorus pattern: “…a bridge
works to provide contrast in lyrical content, meter, and melody.…Lyrically speaking, your bridge doesn’t
justify its existence if it merely restates a fact we’ve been told. Ideally, a bridge adds dimension to a lyric
by expanding the content of the verse or chorus, or by giving new insight into the singer’s feelings.” Sheila
Davis. The Craft of Lyric Writing. (Cincinnati, Ohio: Writer’s Digest Books, 1985), 57.
A “climb” gives the “verse/chorus” song a fresh contour. Although it is also new material, it is usually
shorter than a “bridge” section: “…a climb is a couplet (two rhymed lines in the same meter) which pull
away from the verse both verbally and musically, and reach up toward the chorus. A climb functions as
aural foreplay, to extend and increase the song’s emotional tension by delaying the arrival of its climatic
section.” Davis, 55.
time that the bridge is introduced is in between the fourth and fifth chorus. The song
unfolds under a well-planned structure full of musical repetition and text redundancy.
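The layout in Table 3.1 can be checked with a small tally. The following Python sketch is an illustration added here and not part of the original analysis. The list `form` copies the ten parts of the table, and the helper names `base_label` and `choruses_around` are hypothetical. It counts each kind of section and locates each bridge relative to the choruses around it.

```python
from collections import Counter

# Section order of "Isobel" as laid out in Table 3.1 (plain apostrophes
# stand in for the prime marks of A' and A'').
form = ["A", "Chorus", "A'", "Chorus", "Bridge",
        "Chorus", "A''", "Chorus", "Bridge", "Chorus"]


def base_label(section):
    """Strip prime marks so that A, A' and A'' count as the same kind of section."""
    return section.rstrip("'")


def choruses_around(sections, index):
    """Return the ordinal numbers of the choruses heard before and after a position."""
    before = sum(1 for s in sections[:index] if s == "Chorus")
    return before, before + 1


tally = Counter(base_label(s) for s in form)
bridges = [choruses_around(form, i) for i, s in enumerate(form) if s == "Bridge"]

print(tally)    # how many strophes, choruses and bridges the form contains
print(bridges)  # which choruses each bridge falls between
```

The tally gives three strophes, five choruses and two bridges, with the bridges falling between the second and third and between the fourth and fifth chorus, as described in the text.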
The lyrics, although rather hermetic, describe the character of Isobel. But the
obscure images depicting her personality contrast with the colorful and well-planned use
of rhyme, alliteration, parallelism and word repetition. This gives the verse a quite
attractive musicality. This attraction does not overlook the semantic content of the words
in favor of their musical possibilities. If something detracts from this otherwise inevitable
effect, it is the repetition of whole phrases or “verses” in an insistent manner. For
example, the refrain at the end of each strophe, chorus, and bridge:
In a forest pitch-dark a (6)
Glowed the tiniest spark a (6)
It burst into flame b (6)
Like me, c (2)
Like me. c (2)
(Chorus)
My name Isobel d (5)
Married to myself d (5)
My love Isobel d (5)
Living by herself. d (5)
In a heart full of dust e (6)
Lives a creature called lust e (6)
It surprises and scares b (6)
Like me, c (2)
Like me. c (2)
(Chorus)
(Bridge)
When she does it she means to f (7)
Moth delivers her message g (7)
Unexplained on your collar h (7)
Crawling in silence i (5)
A simple excuse. j (5)
(Chorus on “Na, na, na…”)
In her tower of steel k (6)
Nature forges a deal k (6)
To raise wonderful hell b (6)
Like me c (2)
Like me. c (2)
(Chorus with lyrics)
(Bridge)
(Chorus on “Na, na, na…”)
Each of the three “verses” opens with an assonant rhyme between its first two
lines, while its third line rhymes with the third lines of the other “verses.” These three words
connected at a distance form an interesting imaginary scheme throughout the song. This
scheme becomes a kind of parallel or alternative sub-line plot that describes a possible
scenario for the description of Isobel: flame—scares—hell. The “verse” closes with one
of the first instances of word repetition, “like me, like me,” which is the refrain that not
only repeats twice there, but appears at the end of each “verse” in the same couplet form.
The chorus takes to an extreme all the versification devices at hand. Its four
lines are metrically regular: five syllables each. This gives regularity, balance and a
perfect rhythmic pattern. On top of that, the rhyme scheme is also extremely regular. Not
only do the four lines share an assonant rhyme among them, but line one has perfect
rhyme with line three in the same manner as line two with four. In addition to these
rhyming schemes, the lines present parallelism in their grammatical construction.
Similarities in ideas are brought to the surface by the similarities in sound and
grammatical construction of the two parallel phrases that make up the chorus.
Subject Verb Object (direct or indirect)
Line 1 and 2 My name Isobel Married to myself
Line 3 and 4 My love Isobel Living by herself
TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”
The first and third lines also open with an anaphora46 that starts only with the repetition
of the opening possessive pronoun “My” and, although the noun is different, is followed
by the same proper name “Isobel.”
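The pairings described for the chorus can be verified mechanically on the written text. The following Python sketch is an illustration added here, not part of the original analysis. It compares spellings only, which is a rough stand-in for the sound-based judgments made above, and the helper names `first_word` and `last_word` are hypothetical.

```python
# The four chorus lines as quoted in the lyric layout above.
chorus = [
    "My name Isobel",
    "Married to myself",
    "My love Isobel",
    "Living by herself",
]


def first_word(line):
    """Return the opening word of a line, lowercased."""
    return line.split()[0].lower()


def last_word(line):
    """Return the closing word of a line, lowercased."""
    return line.split()[-1].lower()


# Lines 1 and 3 close on the same word ("Isobel").
same_close_1_3 = last_word(chorus[0]) == last_word(chorus[2])
# Lines 2 and 4 close on words that share the ending "self".
shared_ending_2_4 = all(last_word(chorus[i]).endswith("self") for i in (1, 3))
# Anaphora: lines 1 and 3 open with the same word ("My").
anaphora_1_3 = first_word(chorus[0]) == first_word(chorus[2])

print(same_close_1_3, shared_ending_2_4, anaphora_1_3)
```

All three checks hold for the chorus as quoted, which matches the pairing of line one with line three and line two with line four.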
The second “verse” follows immediately after this chorus, but the perfectly
balanced sound of the words of the chorus keeps resonating in the ears of the listener.
This is remembered as an instrumental cacophony. The parallel constructions of the lyrics
in the first and second lines and third and fourth lines are set over a melody that repeats
twice for each of these couplets.
By the time the song moves into the second “verse,” the ears of the listener are
already attuned to a musical way of listening to the lyrics. The alliteration of consonants
between the first two lines and inside the third one is more prominent than any other
effect. “In a heart full of dust/ lives a creature called lust” encloses the alliteration of
the following sounds, now quoted in their phonetic symbols: [I]-[æ]-[rt]-[u]-[l]-[ʌst].
Inside the third line, the reiteration of [s] and [r] creates a sensation of smooth continuity
in these sounds and over those that are in between them: “It surprises and scares.”
46 Repetition of a word (or like-sounding words) or a short phrase at the start of successive lines or verses
(Davis, p. 143).
The refrain “Like me” holds our attention because of its redundancy and exact
repetition after its previous appearance. It intends to remark on the correspondence
between the persona of Isobel and the other, who may be one and the same.
The second time the chorus comes around with its features and effects intact, but
is followed by the unexpected “bridge.” After two rounds of “verse-chorus,” fresh
musical and textual material is more than welcome. Usually the function of the “bridge”
is to introduce further information not yet provided in the “verses,” to offer a new perspective,
and/or to offset the musical predictability. The “bridge” should also offer contrast in its
meter and versification compared to the verses and chorus. Certainly this is the case in
“Isobel.” The lines are now seven syllables long and have free rhyme. There are no new
parallel grammatical constructions or anaphoras. The previous highly structured lyrics
give way to a looser text approach.
Björk also proposes a musical change in this “bridge.” The melodic setting of
each line consists of a first phrase that carries part of the lyrics, followed by a second
period that is a vocalization of its melodic contour on a babbling “uh, uh, uh…” Attached
at the end of this bridge is a couplet that functions as a “climb” toward the next chorus.
Although climbs normally “pull away from the verse both verbally and musically,
and reach up toward the chorus,” this one is not pulling from known material, but from
the newly introduced bridge’s music.47 This couplet anticipates the five-syllable length of
the chorus lines in an open melodic phrase that does not find its harmonic and melodic
closure until the first notes of the following chorus. Thus, it creates the expectation and
tension usually ascribed to all climbs.
47 Sheila Davis, The Craft of Lyric Writing (Cincinnati, Ohio: Writer’s Digest Books, 1985), p. 55.
This third chorus is sung on a babbling “nana na nana, nana na nana…” By this
time the lyrics of the chorus, which have been repeated twice, have probably been
imprinted on our memory, and the playful musicality of the text has contributed to this
sense. But more than the tangled semantic meaning, what the listener remembers is the
musicality. The babbling is an extension of that musicality, an iconic remainder of the
almost musico/instrumental characteristics of the chorus. The “nana-na-nana” represents
what is remembered of this section.
The conscious or unconscious choice of the babbling syllable “na” also has
certain emotional implications. Part of the delight that any listener feels, and even the
performer experiences while singing, is situated in the regressive value and childhood
connotations that those phonemes evoke. According to Roman Jakobson, the emotional
charge of each phoneme is proportional to the amount of time that the infant has used it
in his prelanguage babbling stage in language development.48 So, any association with
sounds out of their syntagmatic or referential relation to a linguistic sign (a word) refers
us to that period. Nasal phonemes are among the latest acquisitions of children into the
vocabulary, but are part of the mass of sounds used in their onomatopoeia and emotional
manifestations. The “nana-na-nana” certainly brings back those associations in a
pleasurable regression to an earlier age that is compelling and attractive.
Only after a close listening and analysis of the recorded material were Björk’s
comments about the story line of Isobel taken into account. And, surprisingly, as I came
to believe after listening to it, the “na na na nana” proved to be the expression of an
instinctive impulse. In an interview for MTV’s Eurotrash in 1995, Björk tells the story in
this way:
This is the story of Isobel; she was born in a forest by a spark, and as she grew up, she
realized that the pebbles on the forest floor were actually skyscrapers. And by the time
she was a grown-up woman and the skyscrapers had taken over the forest. She found
herself in a city, and she didn’t like all the people there so much, because they were a bit
too clever for her.
She decided to send to the world, all these moths, that she had trained to go and fly all
over the world and go inside windows of people's houses—the ones that were too
clever—and they’d sit on their shoulder and remind them to stop being clever and start to
function by their instincts. They do that by saying “Nah-nah-nan-nah-nah!” to them...
(Björk waves a finger in front of her face)
...and then they’d say “Oh! Sorry! I was being all clever there!” and start functioning on
instinct.49
The rest of the song proceeds with the same tactics described previously. In the end,
what the song offers its listener musically, the rich realm of its text webs and musical events,
overwhelms any pretension of perceiving its discourse linearly, without
distortions or fragmentations. Björk is certainly looking for this kind of musical sonic
experience.
Peter Gabriel’s “Sky Blue,” from his 2003 album Up, shares certain similarities
with Björk’s “Isobel.” Both rely on redundancy and repetition as a way of creating a grid,
which the listener can use to guide him/herself along the fragmented lyrics. But this is
approached differently in each song. Although Gabriel’s lyrics seem more accessible than
Björk’s, they are long and stretched over lengthy periods. This accessibility is due to the
fact that each fragment of Gabriel’s lyrics corresponds to a line of the strophe and a
single musical phrase. Each of these fragments carries a single idea or concept. This
procedure reestablishes some of the integrity missing in the lyrics as a whole. Björk’s
ideas and images stretch over more than one musical phrase or longer musical phrases
and that makes them less graspable.
48 Roman Jakobson, Child Language, Aphasia, and Phonological Universals (The Hague: Mouton, 1968).
49 See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas Ltd.
(accessed April 16, 2007).
The song is almost seven minutes long. A skillfully crafted handling of the long-term
musical and lyric structure is what makes this song work. I argue that what holds
this song together is mainly its musical elements and not so much the linearity of its
narration—which, as stated before, is rather fragmentary. In a way, the personal story
being told becomes secondary. The listener probably ends up singing the vocal riff of the
chorus (described below) and the “group answer” on “sky blue.” The following diagram
shows the form of the song.
Part   Section
1st    A
2nd    A’
3rd    Chorus (1). Two layers: lyrics and riff.
4th    A’’
5th    Chorus (2). Two layers: lyrics and riff.
6th    Bridge
7th    Chorus (3). Only riff.
TABLE 3.3: Song format of Gabriel’s “Sky Blue”
At the level of the lyrics, Gabriel offers less repetition of whole sections than
Björk, with the obvious exception of the two words of the title, “sky blue,” which are
introduced every other line of the “verses.” The simple instrumentation of the band and
bare chord accompaniment of the very first “verse” provide a chance to understand the
opening lines clearly. The first and fifth lines resort to another highlighting literary
device, parallel grammatical constructions such as “Lost my time/lost my place in” and “I
know how to fly/I know how to drown in.” The first, second and third lines have internal
and final assonant rhyme.
(Verse 1)
Lost my time lost my place in
Sky blue
Those two blue eyes light your face in
Sky blue
I know how to fly, I know how to drown in
Sky blue.
These kinds of structures keep recurring in the next verses. The second “verse”
holds a literary palindrome construction at a semantic level: “I sing through the land, the
land sings through me.” It also contains the alliteration of the phonemes [w], [m] and [n]
on “warm wind blowing.” The third verse ends each of its lines with an assonant final
rhyme: “goodbyes/ sky/denies.”
(Verse 2)
Warm wind blowing over the earth
Sky blue
I sing through the land, the land sings through me
Sky blue
Reaching into the deepest shade of
Sky blue
(Verse 3)
Train pulled out said my goodbyes
Sky blue
Back on the road alone with the sky
Sky blue
There’s a presence here no one denies
Sky blue
By the second “verse,” however, all of these literary and versification devices are
almost overlooked. The only structural repetition that appeals to the listener’s attention is
the group vocalization of “sky blue.” These two words are sung by a choir that answers
the opening melody proposed by the soloist, Gabriel. The first time around it is actually
Gabriel who answers his own “calling.” From the second “verse” on, a group of voices
takes over the “sky blue” response. This procedure reproduces the typical “call-and-
response” form of several Afro musical styles. This “sky blue” response also acts as the
“hook” in the song. The hook is the repeated section of the song that more often than not
contains the song’s title, but it could also be a melodic phrase. In this case, the song
makes use of both things: a particular melodic snippet that answers the previous “call,”
the melodic opening phrase, the lyrics of which are the song’s title.
The hook serves the function of snagging the listener into the song and grabbing
his/her attention. The hook remains in memory even after the song is over. Booth says
“self-reference is often visible in the verbal form of…[a] hook, returning upon itself as a
paradox, or as a repetitive regression, or as an absurd phrase refusing to connect to the
expected context.”50 The lyrics of this particular hook, “sky blue,” do not hold any
paradox or absurdity but only a certain innocent redundancy. Although at the beginning
of the song, “sky blue” is the grammatical continuation of a phrase started in the
preceding line, by the second or third verse this discursive connection is broken.
“Sky blue” holds a loose and indirect relation to the previous phrase: “warm wind
blowing over the earth/sky blue/I sing through the land, the land sings through me/sky
blue;” “Train pulled out said my goodbyes/sky blue.” Musically speaking, however, the
little melodic phrase of “sky blue” keeps connecting as the closing answer to the
“calling.” The following score of the first verse shows this musical procedure:
50
Booth, 179.
EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”
The chorus, a typical Western musical element of the song form, actually makes
use of another Afro musical device. This time, the group of voices sings a four-measure
riff over which the soloist performs a quasi-parlato melody that gives the impression of
being improvised. The fragmented and scattered nature of its melodic contour provides
its improvised character. Of course, the force of tradition has some influence on what is
perceived and the fact that improvisation is what is expected in this kind of musical genre
—or at least the style that the song is trying to evoke—reinforces the way we listen to it.
The following score shows the progressive overlapping of these vocals in chorus
1 and 2.
EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”
EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”
As the scores above show, the riff is a four-measure pattern in 3/4, made of
two phrases built on the same harmonic progression that repeats
throughout the chorus: C# minor–A major–B major–G# minor. These two phrases are
identical with the exception of their resolution, alternately the last note on G4 or F4. By
the second chorus, the increased number of repetitions of the complete four-measure version and
the earlier introduction of the riff contribute to the shift of the listener’s attention from the
solo voice to this group vocalization.
(Music) surely provides the shortest, the least arduous, perhaps even the most natural
solvent of artificial boundaries between the self and others…The words of folk song…are
not directed by one person to another or by many persons to many others; the voice is
that of the group…there is no “other being,” no mere listeners…If one member happens
to lead the chorus, his words are certainly not addressed to the others…He does not tell
them anything they don’t know; he does not speak to the others but for them.51
After the first three times that the soloist proposes the “sky blue” answer, we are
invited to join and participate in this collective act, vocalizing these words. From then on,
it becomes a habit to participate in the game through the riff of the chorus until the end,
when only the collective voice is left.
Empirical Data: Questionnaires’ Results
To supplement the observations obtained from the previous popular song
analyses, I undertook an experimental project, in which questionnaires were distributed to
a group of forty-six college students, the “listeners” referred to throughout.
Although I intend to apply my hypothesis to listeners in general, it was
necessary for practical reasons to limit my exploratory subjects to college students. These
undergraduate students of Hartwick College, a small private liberal arts college in upstate
New York, were between eighteen and twenty-three years of age (with the exception of
three older students between thirty-four and forty-five years old). Twenty-four of these
students were music majors; the others were majoring in visual art (twelve people) or
modern languages (ten people). Thus, the group of subjects covers a broader range of
people and interests beyond music, which reduces the possibility of obtaining a
misleading result. Otherwise, the exclusive
selection of music students could be questioned because of the possible conditioning by
their musical training. Interestingly enough, the results proved to be similar among all the
groups. For this reason, the data was processed and presented all together, and not
separated by group.
51
Booth, 18-19. From Victor Zuckerkandl, Man the Musician, trans. Norbert Guterman, Bollingen Series
44.2 (Princeton: Princeton University Press, 1973), 51, 24-25, 26-27.
The stimulus was systematic and consistent in each administration of the
questionnaire. Every group listened to the same recorded songs and same versions of
them. The recordings used in this experimental project were the same songs analyzed in
the previous section: Fitzgerald’s version of “All the Things You Are;” Bob Dylan’s “A
Simple Twist of Fate;” Björk’s “Isobel” and Peter Gabriel’s “Sky Blue.”
The experience took place in a formal non-structured environment. Formal
because the listening action was not observed in its ordinary place but in an artificially set
one, a classroom; and non-structured because the variables intervening in the act of
listening were not controlled during the experience. Faced with the impossibility of
observing people in their natural environment where they listen as an
everyday activity (concert, home, ambient music in cafes or any other social situation in
which music is encountered), the experience took place in classrooms where the subjects
were asked to listen to the songs without any specific instruction. Only after the pieces
had been played once were they asked to open the page in front of them and read the
questions.
At all times the intention was to minimize the artificiality of the situation and
reproduce as much as possible the casual listening that people experience in everyday
life. Giving the listeners a blind first exposure to the songs, and keeping them unaware of
the actual goal of the experience by hiding the particular questions, provided a chance for an
objective result. Although a certain kind of general attentive listening may have been
taking place, at least the subjects were not guided from the beginning of the experience to
observe and retain individual musical events in an “analyzing” or “deceptive” listening
way.
Finally, the same five questions were presented to each group. These questions
were designed in a non-systematic and open manner. The listeners did not have
formulated answers to choose from. This guaranteed that their responses were not
influenced or narrowed by any outside instruction. Each listener volunteered their free-
form responses, which I later grouped, noting the frequency of each. Thus, these
variables found in the following tables are not a mere listing of the people’s answers, but
a grouping in categories, which embrace and conceptualize their spontaneous responses.
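The grouping step described above can be illustrated with a minimal sketch in Python. It is not the procedure used in this study, where the free-form responses were grouped by hand; simple keyword matching merely stands in for that judgment. The codebook, its keywords, the function names and the sample responses are all invented for illustration.

```python
from collections import Counter

# Hypothetical codebook: category -> keywords that signal it.
# These keywords are assumptions for illustration, not the study's scheme.
CODEBOOK = {
    "voice sound": ("voice", "vocal tone"),
    "band sound": ("guitar", "instrumentation", "mix"),
    "rhythm or beat": ("beat", "rhythm"),
}


def code_response(text):
    """Return every category whose keywords appear in one free-form answer."""
    lowered = text.lower()
    matches = [
        category
        for category, keywords in CODEBOOK.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Answers that fit no category are kept under "other" rather than dropped.
    return matches or ["other"]


def tally(responses):
    """Count how often each category occurs across all answers."""
    counts = Counter()
    for response in responses:
        counts.update(code_response(response))
    return counts


# Invented sample answers, not taken from the questionnaires.
sample = ["Her voice", "The beat and the guitar", "I liked it"]
counts = tally(sample)
print(counts)
```

One answer may fall under several categories, which is why the frequencies in a table built this way need not add up to the number of listeners.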
Question 1: What do you remember from the song?
SKY BLUE    SIMPLE TWIST OF FATE    ISOBEL    ALL THE THINGS YOU ARE
lyrics’ fragments 6 6 1 4
instrumental solos 1 3 1
melodies or harmonization 2 4 4
rhythm or beat 3 2 3 1
back up vocals or other non-word vocalizations 8 (It does not apply) 2 (It does not apply)
band sound (instrumentation, mix and effects) 8 7 4 3
voice sound 8 8 5 5
performance style 1 4 2 4
character, mood 1
Other 2 (dark tone of piece; overall sound) 1 (everything, I know the song) 1 (Lion King?) 2 (enjoyable, familiar; I knew the song)
TABLE 3.4: Results from question #1
The following is a list of the specific fragments of lyrics that the listeners remembered
and wrote down:
For Dylan’s “Simple Twist of Fate”:

• “Simple twist of fate” (3 answers)
• “twist of fate” (2 answers)
• “fate” (2 answers)
• all the lyrics, she knows the song (1 answer)
For Gabriel’s “Sky Blue”:

• “Sky blue” (4 answers)
• “Blue” (1 answer)
For Björk’s “Isobel”:

• “My name is...” (1 answer)
For Fitzgerald’s version of “All the Things You Are”:

• “You are” (x)
• “All the things you are, are mine.” (x)
• “springtime” (x)
• lyrics in general without specification (x)
During this first exposure to the four songs, the listeners showed a special interest
in the particularities of the performers’ voices and the sound of the bands of each track.
From the few phrases that they could remember, we gather that these were in strategic
places of the songs. They were either part of the songs’ refrains or part of the choruses.
These are sections that usually work through the song by melodic and harmonic
repetition, creating a dent in the listeners’ memory with their recurrent musical schemata.
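The preference for voice and band sound can be checked against Table 3.4 with a short Python sketch. It is only one simple way of ranking the categories, under two assumptions: it uses only the rows of the table that report an entry for every song, and it treats “It does not apply” as missing. Rows with blank cells are left out, and the variable and function names are hypothetical.

```python
# Rows of Table 3.4 that have an entry for all four songs, in the order:
# Sky Blue, Simple Twist of Fate, Isobel, All the Things You Are.
# None marks an "It does not apply" cell.
TABLE_3_4 = {
    "lyrics' fragments": (6, 6, 1, 4),
    "rhythm or beat": (3, 2, 3, 1),
    "back up vocals": (8, None, 2, None),
    "band sound": (8, 7, 4, 3),
    "voice sound": (8, 8, 5, 5),
    "performance style": (1, 4, 2, 4),
}


def total_mentions(row):
    """Sum the mentions of one category across songs, skipping missing cells."""
    return sum(value for value in row if value is not None)


# Rank categories from most to least mentioned across the four songs.
ranked = sorted(
    ((category, total_mentions(row)) for category, row in TABLE_3_4.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for category, total in ranked:
    print(f"{category}: {total}")
```

On these rows, voice sound (26 mentions) and band sound (22) rank first and second, ahead of lyrics’ fragments (17).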
Furthermore, when we cross-reference the information obtained from the analysis of each
song with these results, it is possible to observe that the words or phrases that the
listeners remembered were highlighted by additional procedures such as parallel lyric
constructions, manipulation of the rhythmic pace of the piece over those words, or simple
lyric redundancy (recurrence of the same word). In the particular case of Gabriel’s “Sky
Blue,” the hypothesis proposed during the analysis was confirmed: the back up vocals
performed along Gabriel’s solo in the chorus were a main focus of the listeners’ attention.
Question 2: Why do you think you were able to remember specifically that?
SKY BLUE    SIMPLE TWIST OF FATE    ISOBEL    ALL THE THINGS YOU ARE
Focused on the voice’s sound because of its prominence in the mix 2 1
Focused on vocal or instrumental sounds because the richness of colors makes them memorable 2 1 2 6
Focused on instrumentation because gave a particular feeling and character to the song 3 1 1
Focused on that word or phrase because was repeated several times 5 1
Focused on chorus because of its strategic placement and contrasting quality 7 3 3
Focused on instrumental intro, interlude and end because were playing by themselves for a while 2
Focused on that line because sticks out 1
Focused on voice quality because of its compelling, different and evocative sound 2 3 1
Focused on lyrics because I knew the song 1 4
Focused on melody because the song is catchy, repetitive 1 2
Focused on rhythm because of danceability 2
Focused on melody because was repeated constantly 1 1
Focused on harmony and rhythm because unexpected, interesting 1 3 1 2
I liked those things, because I liked the song, or the complete opposite, it was annoying. 3 1 (annoying)
Other 1 2 (folk make you focus on male voice, guitar and harmonica) 1
The second question proved to be the most difficult to categorize and organize in
separate variables: first, because it is completely linked to the first question, and second,
because the many alternative combinations of factors presented a challenge at the
moment of synthesizing them into more embracing categories. But it proved useful in
confirming tendencies already marked in the answers given to the first question. The
reasons given by the listeners coincide with the information obtained in the songs’
analyses section. Thus in the particular case of “All the Things You Are,” the listeners
found themselves attracted to Fitzgerald’s vocal tone because of the richness of its color.
Dylan, Gabriel and Björk directed the attention of the listeners towards the chorus or
refrain of their songs by a crafted handling of the form which creates momentum and
expectancy, either by contrast or repetition of words and music.
Question 3: Is the song telling some kind of story? Briefly describe what is about.
SKY BLUE    SIMPLE TWIST OF FATE    ISOBEL    ALL THE THINGS YOU ARE
General description of topic in one word 1 4
Not sure, I paid attention to the sound in general 2 1
I do not remember; I do not know 4 5 7 1
There is no story that I remember 2 1
In a new song I listen to the music and not pay attention to words 1 1
Too abstract to grasp meaning 1
I believe there was one, but I do not remember 2 2
More detailed description, but still general, no specifics 1 5 9
Too busy listening to the music 2
It does not tell a story, it paints a picture, shows emotions 1 2
Blank answer, no idea 4
The answers to the third question show the difficulty of grasping the meaning of
the lyrics after only one listening. Most of the listeners said that they could not remember
what the song was about or they did not know. The only two songs that seem to be more
accessible in a first time listening situation were “Simple Twist of Fate” and “All the
Things You Are.”
Question 4: How does the song make you feel? Why? Which elements of the song put
you in that mood? Musical elements, voice quality, words?
Simple content/ think/ Being

relaxed/ pensive sad Nostalgia fun Curious good reflect outdoors,
Twist of restful mood like riding
Fate on the road
not connected to any music element 1 1 1 1 1 3
Simplicity 1 1
catchy and friendly beat 1 3
Interesting chord progression 1 2
serene tone, awesome vocal quality 2 1 1
soft and smooth guitar part 2 1
because of the lyrics, story-telling 3 1
TABLE 3.7: Results from question #4 on Dylan’s “Simple Twist of Fate”
Sky Blue    no answer    reflective/spiritual    peaceful/relaxed    deep/intense    melancholic    good mood    Yearning    problematic/melodramatic    uplifting
no musical element related 2 1 1 1 1 1 1
no answer 1
soothing vocal quality 2
beat and tempo 1 1 2
Chorus 1 1
because of the accompaniment 2 2
TABLE 3.8: Results from question #4 on Gabriel’s “Sky Blue”
Isobel    relaxing    involved/attracted    sad/dramatic    good    sleepy/in trance    angry
band/accompaniment 6 1
pulsating beat 2 1
her voice/her vocal quality 2 1 1
general pace of the music 1 1
repetition of sections and melodic material 1
TABLE 3.9: Results from question #4 on Björk’s “Isobel”
All the Things You Are    light/happy    like dancing    Good    like singing tunes in a bar
no musical element related 1 1 1 1
upbeat/danceable 1 2 2
major chords 2
singer’s vocal interpretation 1
musical style 1
TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You Are”
Question four required a separate table for each song. The answers regarding
moods and sensations that arose while listening to the songs were far too many and
idiosyncratic to each song. The cause-effect relationship among these variables is
individual to each piece. This question also received the most ambiguous and personal
answers.
This question had an ulterior purpose. Asking about the mood or emotional state
was only a way of obtaining the information actually sought: which elements of the song
were the listeners paying attention to? The results of this question complement those
from the first and second questions, where listeners were asked to say what they
remembered from the songs they had heard and why. Seeking the same information from a new
angle confirmed tendencies already marked in those two previous questions. Except for
“Simple Twist of Fate,” where the lyrics seemed to be one of the elements with some
influence, listeners indicated that their moods arose while listening to the vocal
tone of the performers, the band sound and the accompaniment, or the beat of the song.
For the most part there was no mention of the lyrics.
Question 5: Does the story of the song have a protagonist? Who is speaking to you in
this song? Who is narrating or describing the situation?
5. Does the story of the song have a protagonist? Who is speaking to you in this song?    SKY BLUE    SIMPLE TWIST OF FATE    ISOBEL    ALL THE THINGS YOU ARE
do not know 3 3 7 1
Yes 1 4 2
No 2 2 2
the narrator of the song 1
the writer of the song 2
fictional character 1
the singer 3 3 1 4
blank answer 2 2
the orchestra 1
Question number five is directly related to how much of the story or description
the listener was able to grasp. It was too complex for listeners to determine the nature and
identity of each song’s protagonist or narrator in a first listening. The fragmentary story
gathered in this brief exposure did not provide sufficient information. The answer could
be established only after a careful attentive listening to the lyrics of each song. In the act
of reading or even oral story-telling there is enough information in other sentences
around, time to go back mentally, and link previous statements to arrive at a satisfactory
conclusion.
Because of the nature of the question and the answers received, tables do not
clearly translate the information obtained. In this specific case, it is more appropriate to
proceed to describe and group the results in the following paragraphs and then show the
bare data in a table format.
The formulation of the question itself is vague and imprecise, but that decision
could be justified by the need to not influence or direct the answers of the listeners. Its
outcome was of special interest for this exploratory project. It actually proved how much
further attentive listening is needed to comprehend text at the level required to resolve
this issue: who is the “persona” singing through the song?
The question is ambiguous in itself. It may have several correct answers. It
engenders in itself two enigmas: first, the difference between the voice singing and the
first person in the narration; second, the difference between the protagonist of the story
and narrator. These could coexist in one person or they could be three different people.
Of the people questioned, all but one failed to notice the switch of
narrator in Bob Dylan’s “Simple Twist of Fate.” The first five verses are narrated in the third
person, as if the story of these two lovers were told by somebody else. The last verse
switches to the first person; the narrator becomes the protagonist (the male lover). Only
one of the listeners made a special comment about this.
Some subjects established a difference between when the singer was talking from
personal experience and when he or she was interpreting and voicing a fictional character.
One could argue that such a distinction is the result of associating authenticity values
with certain musical styles more than others. For example, folk or grassroots influenced
musical styles such as Dylan’s song are expected to be sincere, personal and intimate
story-telling of the singer’s past experiences, while pop singers could take different
masks and become different characters.
Of the songs used in the experiment, “All the Things You Are” is the only
one in which the composer and author were different from the singer: the composer is
Jerome Kern and the singer Ella Fitzgerald. But that does not mean that the author of the
lyrics and the narrator (the “I” first person of the story or protagonist) are the same
person. In the other three cases, the singers are also the composers: Bob Dylan’s “Simple
Twist of Fate,” Björk’s “Isobel,” and Peter Gabriel’s “Sky Blue.” And again they may or
may not be the protagonists of their own stories.
The most problematic and unexpected answers were the straight “yes” and “no.”
The “yes,” besides not providing any specification of who they think is the protagonist,
does not give a precise idea if they really understood something from the story. They
could be assuming—a generalized idea—that any story has a protagonist as default. But
on the contrary, that is not the only option. The lyrics of a song could be unconnected
ideas, a description, loose words, or the perspective of who is talking in this song could
be very vague, unpredictable or completely absent. The “no” answers did not provide any
insight about the listeners’ understanding of the song.
“Isobel” was, according to the results, the most confusing song of all. Most of the
people did not know who was speaking or who was the protagonist, and the others
directly said that there was none. Observing the answers to some of the previous
questions about this same song, it appears that the listeners did not grasp the lyrics of this
particular song and their attention was mainly devoted to other aspects of the piece.
In order to arrive at a satisfactory answer in question five as well as in question
three, the listener needed at least a second listening. The second time the listeners were
prepared to pay attention to certain aspects of the lyrics. Text was listened to in an
attentive manner guided by the formulated questions.
Question 6: After listening for a second time to the same songs, do you feel you grasped
more of the meaning of the lyrics? Why? Only because you have a second chance or
because you pay more attention guided by the questions? What is the story about in every
song?
In the answers given for question number six, the last one of this experience, the
listeners confirmed that only after this second time could they start to understand the
song’s content. They also admitted that this time they paid more attention guided by the
questions they already knew. Some of them specifically pointed out that they usually do
not pay attention to the lyrics the first time they listen to a song. They immerse
themselves in the music: the singing, the band, the melody, the instrumental solos, the
harmonies. Only after repeated listening do they feel they concentrate on the lyrics.
As the end result of this experiment, we can conclude that although listeners do
not ignore completely the songs’ lyrics, they tend to remember only certain isolated
words or short phrases. These lyrics’ fragments are usually part of choruses, refrains or
short motives that work through the song by melodic and harmonic repetition—on top of
the repetition of the words themselves. Only after listening several times in an attentive
manner may people grasp the meaning of the lyrics. Otherwise, their attention is
diverted towards mostly sonic aspects of the performer’s voice, the band or the
arrangements of the songs.
V.
CONCLUSION
In their compositional approach to song, songwriters and composers evidence a
conscious or instinctive knowledge of how people tend to listen to vocal music. They
manipulate their textual and musical materials either to compensate, reinforce or oppose
the usual “poetic mode” of listening.
Faced with the challenge of the unavoidable fragmentation of text under any kind
of musical setting, songwriters and composers emphasize words from their lyrics or
poetic texts that they hope will help listeners to create their own narratives. Although the
songwriters and composers mold this emphasis on certain words according to their own
readings of their texts, the listener will reinterpret the text through his or her own reading,
to create a possible meaning for the song to which they are listening.
The vocal examples analyzed in this dissertation make the case for how different
compositional approaches act on the way people listen to text set to music. I start from
the idea that any musical setting of text produces a natural disruption of the discourse.
Even monody and recitative, with their bare settings, produce a certain degree of
disruption. The fact that they mount the words to sung tones already invites the listener to
shift her or his attention to the “songfulness” of the performing voice—paying attention
to the timbral color of its sustained sound.
Caccini’s monody and Handel’s and Mozart’s recitatives operate in a “poetic
mode” of perception, which privileges the “speech qualities” of the text. In their score
settings, they follow the prosodic characteristics of the text phrases as closely as possible.
On the one hand, they set the text phrases to melodic patterns that mimic their intonation
shapes pitch-wise. On the other hand, they also replicate the kinds of prolongation
performed over accented syllables and shorter values of the syllables in between with
similar musical rhythmic patterns. But the shapes and rhythm indicated in the score only
serve as points of departure for the real interpretation of the performer. It is in this
instance that both monody and recitative become quasi-speech. Any seventeenth-century
composer of monodies—as well as composers of recitative in any period in history—
expects flexibility in the tempo, without a steady beat in the performance of their settings.
This beat fluctuation gives the performer the chance to vary the articulation pace
according to her or his dramatic interpretation of the different phrases, creating the
definitive sense of speech so characteristic of monodies and recitatives.
Nineteenth-century Lied serves as an example of the way the balanced “poetic
mode” operates. Although Reichardt and Zelter proclaimed their intentions of making
their musical settings a natural extension of the text—allowing it to speak for itself—and
limiting their musical intervention as much as possible, their settings are “songful”
renditions far from any speech quality of the text. Their simple melodies—constrained in
register, syllabic, and stripped of embellishments and extended vocalizations—did not
have any speech quality but those of any other sung melody. The lack of modification in
text order and piano interludes aims to limit the disruption of the narrative flow.
Although the repetitious nature of their strophic settings and unobtrusive chord
accompanying textures allow listeners to ease their attention from the pure musical
elements of the song, they still apprehend the text in the “poetic mode” of listening. The
listeners hear the original sound patterns of the poem mounted over the “songfulness” of
the sustained tones of a voice singing a melody. Dylan’s “Simple Twist of Fate” operates
in a similar manner to Reichardt’s and Zelter’s settings with the strophic form holding his
own lyrics.
Beethoven’s, Schubert’s and Schumann’s settings of Mignon’s Lied further
emphasize the “poetic mode” of processing by unleashing their compositional creativity
in further elaborated arrangements abounding in piano interludes, more complex and
colorful accompanying textures and harmonies and text modification. By the same
means, they also built musical forms that directed the attention of the listener to certain
specific words or phrases in their texts that synthesized the meaning behind the narration.
They highlighted these text fragments by creating harmonic tension, announcing or
delaying this phrase with piano interludes, repeating it several times, detaining the
rhythmic flow of the piece, etc.
Monk’s Volcano Songs as well as Berio’s A-Ronne offer a conscious
representation of the way listeners perceive text in the “poetic mode.” By
overemphasizing the sonic aspects of language, they concretely manifest what we hear
and how we hear it. In the hands of Monk, this process explores the human voice’s
timbral and gestural possibilities as deployed by any kind of syntactic text. Berio departs
from literary sources subjected to fragmentation and masking processes, which interfere
with their intelligibility. By breaking words into their phonetic components, overlapping
multi-texts, exploring different masking vocal gestures and paralingual sounds, he brings
awareness of speech elements that for the most part we unconsciously hear but do not pay
attention to in musical settings.
The four popular songs analyzed in the final section introduce four different
procedures of manipulating song structure to overcome or emphasize the “songfulness”
effect. What do songwriters have to do when they want to give a place to the narrative of
the lyrics? And what do they have to do when they want to indulge in this “songfulness”
effect and engage their audience in an experience of emotional and physical involvement
with their songs?
Observing the questionnaire’s results, we see that listeners do not completely
ignore text in songs, but they tend to remember only certain isolated words, mainly for
musical reasons. These words tend to occupy a prominent place in the songs, either by 1)
repetition of the word itself, 2) highlighting techniques such as slowing down the
rhythmic pace over the words, 3) repetition of the melodic motive over which the words
are set, or 4) over-articulation of the words’ phonetic components in performance.
The songwriters of the four songs used in the listening experiment—the same
songs analyzed in the previous section—show awareness of these highlighting procedures
and a crafted use of them. The highly structured song form of the Tin Pan Alley “All the
Things You Are” gives certain musical predictability to the listener but otherwise, as any
other song, resorts to musical manipulation to direct the attention of the listener toward
certain memorable phrases of the lyrics. Additionally, as we were able to observe,
Fitzgerald’s and Streisand’s versions bring out different qualities in the same song. In
“Simple Twist of Fate,” Dylan acts on the “songfulness” effect and musicality of the
abundant sound patterns of his lyrics by using a repetitive strophic setting and a “quasi-
speech” vocal quality in his performance. Björk and Gabriel anchor their songs, “Isobel”
and “Sky Blue,” on certain phrases, such as refrains, or on engaging vocal riffs and
choruses, which for the most part are sung on nonsense syllables. The listeners are taken
through structures that build expectancy and redundancy devices.
Aside from paying attention to the strict musical elements of the piece, listeners
predominantly perceive the sonic or musical aspects of its lyrics: the colors of its
phonemes; the prosodic arch of its phrases’ intonation; the sonic quality of the
performing voice; the specific colors and inflections that the voice adopts at each phrase.
Composers know that audiences engage with their songs through these musical
gestures resulting from the alchemy of music and words. Even Goethe, despite his
caution against oversensitive and overcomplicated settings of his poems, had to admit
that only when words are set to music “is the poetic inspiration, whether nascent or fixed,
sublimated (or rather fused) into the free and beautiful element of sensory experience.
Then we think and feel at the same time, and are enraptured thereby.”1 Then we hear the
music of the words.
1
Goethe to Zelter, 21 December 1809; quoted in Eric Sams and Graham Johnson, “Lied (IV),” in New
Grove Dictionary of Music and Musicians, Vol. XIV (2nd ed. New York: Grove, 2001), 672.
BIBLIOGRAPHY
Aiello, Rita and John Sloboda, ed. Music Perception. New York, Oxford: Oxford
University Press, 1994.
Agawu, Kofi. “Theory and Practice in the Analysis of the Nineteenth-Century ‘Lied.’” In
Music Analysis 11, no.1 (March 1992): 3-36.
Barthes, Roland. The Grain of the Voice: Interviews 1962-1980. Berkeley and Los
Angeles: University of California Press, 1985.
Bauman, Richard, ed. Folklore, Cultural Performances, and Popular Entertainments: A
Communications-Centered Handbook. New York: Oxford University Press, 1992.
Berger, Karol. A Theory of Art. New York, Oxford: Oxford University Press, 2000.
Berio, Luciano. A-ronne: documentary for 8 singers on a poem by E. Sanguinetti. Wien:
Universal Edition, 1975.
———. “Poesia e musica un’esperienza.” In Incontri Musicali 3 (1959): 98-110.
——— and Swingle II. A-ronne. London: Decca, HEAD 15, 1976.
Beethoven, Ludwig van. Lieder und Gesänge mit Klavier. München: G. Henle, 1992.
Björk. Post. Elektra Entertainment Group. 61740-2, 1995.
Björk. Telegram. Elektra Entertainment Group. 61897-2, 1996.
Bolinger, Dwight. Intonation and Its Uses: Melody in Grammar and Discourse. Stanford,
California: Stanford University Press, 1989.
———. Aspects of Language. New York/Chicago/San Francisco/Atlanta: Harcourt, Brace
& World, Inc., 1968.
Boulez, Pierre. Orientations: Collected Writings. Ed. Jean-Jacques Nattiez. Cambridge,
Mass.: Harvard University Press, 1986.
Booth, Mark W. The Experience of Songs. New Haven and London: Yale University
Press, 1981.
Brackett, David. Interpreting Popular Music. Berkeley, Los Angeles, London: University
of California Press, 2000.
Calcagno, Mauro. “Signifying Nothing: On the Aesthetics of Pure Voice in Early
Venetian Opera.” In The Journal of Musicology 20, no. 4 (Autumn, 2003): 461-497.
———. “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for
Musical Theater.” In Journal of the American Musicological Society 55, no. 3 (Fall
2002): 383-433.
———. “Monteverdi’s parole sceniche.” In Journal of Seventeenth-Century Music 9,
no. 1 (2004). http://www.sscm-jscm.org/jscm/v9/no1/Calcagno.html
Cone, Edward. The Composer’s Voice. Berkeley, Los Angeles, London: University of
California Press, 1974.
———. “Words into Music: The Composer’s Approach to the Text.” In Sound
and Poetry. New York, London: Columbia University Press, 1957.
Coste, Didier. Narrative as Communication. Minneapolis: University of Minnesota Press,
1989.
Crystal, David. Prosodic Systems and Intonation in English. London: Cambridge

Dahlhaus, Carl. Nineteenth-Century Music. Trans. J. Bradford Robinson. Berkeley,
Los Angeles: University of California Press, 1989.
Dalmonte, Rossana and Bálint András Varga, Two Interviews/Luciano Berio, trans.
David Osmond-Smith. New York: M. Boyars, 1985.
Dame, Joke. “Voices Within the Voice: Geno-text and Pheno-text in Berio’s Sequenza
III.” In Music/Ideology: Resisting the Aesthetic, ed. Adam Krims. Amsterdam: G&B
Arts International, 1998.
Daverio, John. Robert Schumann: Herald of a “New Poetic Age.” New York-Oxford:
Oxford University Press, 1997.
Davis, Sheila. The Craft of Lyric Writing. Cincinnati, Ohio: Writer’s Digest Books, 1985.
Dreßen, Norbert. Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen. Regensburg: Bosse, 1982.
Duckworth, William. Talking Music. New York: Simon & Schuster Macmillan, 1995.
Dylan, Bob. Blood on the Tracks. CBS CDBS 69097, 1975.
Fitzgerald, Ella. Ella Fitzgerald Sings the Jerome Kern Song Book. Verve Records 314
519 847-2, 1993.
Forte, Allen. The American Popular Ballad of the Golden Era, 1924-1950. Princeton,
N.J.: Princeton University Press, 1995.
Frith, Simon. Music for Pleasure: Essays in the Sociology of Pop. Cambridge, Oxford:
Polity Press, 1988.
———. Performing Rites: On the Value of Popular Music. Cambridge, Mass.: Harvard
Fubini, Enrico. A History of Music Aesthetics. London: The Macmillan Press Limited, 1990.
Gabriel, Peter. Up. Geffen Records 0694933882, 2002.
Goethe, Johann W. von. Wilhelm Meister’s Apprenticeship. Ed. and trans. E. A. Blackall.
New York: Suhrkamp Publishers, 1989.
Hill, Walter J. “Beyond Isomorphism Towards a Better Theory of Recitative.” In Journal
of Seventeenth-Century Music 9, no. 1 (2004).
http://www.sscm-jscm.org/jscm/v9/no1/Hill.html
Hirst, Daniel. “Intonation in British English.” In Intonation Systems: A Survey of Twenty
Languages. Cambridge, UK: Cambridge University Press, 1998.
Iser, Wolfgang. The Act of Reading: A Theory of Aesthetic Response. Baltimore and
London: The Johns Hopkins University Press, 1978.
Jakobson, Roman. Child Language, Aphasia and Phonological Universals. The Hague:
Mouton, 1968.
———. Language in Literature. Cambridge, London: The Belknap Press of
Harvard University Press, 1987.
Jowitt, Deborah, ed. Meredith Monk. Baltimore: The Johns Hopkins University Press,
1997.
Kern, Jerome and Oscar Hammerstein II. All the Things You Are. Polygram International
Publishing, Inc., 1939.
Kramer, Lawrence. Music and Poetry: The Nineteenth Century and After. Berkeley:
University of California Press, 1984.
———. Musical Meaning: Towards a Critical History. Berkeley and Los Angeles,
California: University of California Press, 2002.
Lewin, David B. “Figaro’s Mistakes.” In Engaging Music: Essays in Music Analysis, ed.
Deborah Stein. New York-Oxford: Oxford University Press, 2005.
Liberman, A. M., and David Isenberg. “Duplex Perception of Acoustic Patterns as Speech
and Nonspeech.” In Status Report on Speech Research SR-62. Haskins Laboratories
(1980): 47-57.
———, I. M. Mattingly and M. T. Turvey. “Language Codes and Memory Codes.” In
Coding Processes in Human Memory, ed. A. Melton and E. Martin. New York: Winston,
1972.
———, F. S. Cooper, D. P. Shankweiler and M. Studdert-Kennedy. “Perception of the
Speech Code.” In Psychological Review 74 (1967): 431-61.
Lieberman, Philip. Intonation, Perception, and Language. Cambridge, Massachusetts:
The M.I.T. Press, 1967.
Lodato, Suzanne M. “Recent Approaches to Text/Music Analysis in the Lied: A
Musicological Perspective.” In Word and Music Studies 1: Defining the Field, ed.
Walter Bernhart, Steven Paul Scher and Werner Wolf. Amsterdam-Atlanta, GA:
Rodopi, 1999.
Lyotard, Jean-François. “A Few Words to Sing.” In Music/Ideology: Resisting the
Aesthetic, ed. Adam Krims. Amsterdam: G&B Arts International, 1998.
MacClintock, Carol, ed. The Solo Song 1580-1730. New York: W.W. Norton &
Company, Inc., 1973.
Menezes, Flo. Luciano Berio et la Phonologie: Une Approche Jakobsonienne de son
Oeuvre. Frankfurt, Berlin, Bern, New York, Paris, Wien: Peter Lang, 1993.
Middleton, Richard. Studying Popular Music. Philadelphia: Open University Press, 1990.
———, ed., Reading Pop: Approaches to Textual Analysis in Popular Music. Oxford,
New York: Oxford University Press, 2000.
Minsky, Marvin. “Music, Mind and Meaning.” In Music, Mind and the Brain: The
Neuropsychology of Music, ed. Manfred Clynes. New York, London: Plenum Press,
1982.
Monelle, Raymond. The Sense of Music: Semiotic Essays. Princeton and Oxford:
Princeton University Press, 2000.
Monk, Meredith. Volcano Songs. ECM 1589 453 539-2, 1997.
Mozart, W.A. Don Giovanni. Opera completa per canto e pianoforte. Milano: Ricordi,
1946.
Nattiez, Jean-Jacques. Music and Discourse: Toward a Semiology of Music. Princeton,
New Jersey: Princeton University Press.
Neubauer, John. The Emancipation of Music from Language: Departure from Mimesis in
Eighteenth-Century Aesthetics. New Haven, London: Yale University Press, 1986.
Osmond-Smith, David. Berio. Oxford, New York: Oxford University Press, 1991.
———. Playing with Words: A Guide to Luciano Berio’s Sinfonia. Cambridge: B.
Jordon Music Books, 1985.
Palisca, Claude V. Music and Ideas in the Sixteenth and Seventeenth Centuries. Chicago:
University of Illinois Press, 2006.
Repp, B., C. Milburn and J. Ashkenas. “Duplex Perception: Confirmation of Fusion.” In
Perception & Psychophysics 33, no. 4 (1983): 333-337.
Schwarz, Robert. Minimalists. London: Phaidon Press Limited, 1996.
Tsur, Reuven. What Makes Sound Patterns Expressive? The Poetic Mode of Speech
Perception. Durham and London: Duke University Press, 1992.
Reichardt, Johann F. 31 Lieder, Oden, Balladen und Romanzen. Huntsville, Tex.: Recital
Publications, 2000.
Rosen, Charles. The Romantic Generation. Cambridge, Massachusetts: Harvard

Rossi, Mario. “Intonation in Italian.” In Intonation Systems: A Survey of Twenty
Languages. Cambridge, UK: Cambridge University Press, 1998.
Schachter, Carl. “Motive and Text in Four Schubert Songs.” In Engaging Music: Essays in
Music Analysis, ed. Deborah Stein. New York-Oxford: Oxford University Press, 2005.
Sams, Eric and Graham Johnson. “Lied (IV).” In New Grove Dictionary of Music and
Musicians, vol. XIV, 2nd edition. New York: Grove, 2001.
Schwarz, David. Listening Subjects: Music, Psychoanalysis, Culture. Durham, London:
Duke University Press, 1997.
Scher, Steven Paul. “Melopoetics Revisited. Reflections on Theorizing Word and Music
Studies.” In Word and Music Studies 1: Defining the Field, ed. Walter Bernhart, Steven
Paul Scher and Werner Wolf. Amsterdam-Atlanta, GA: Rodopi, 1999.
Schumann, Robert. Selected Songs for Solo Voice and Piano from the Complete Works
Edition. New York: Dover, 1981.
Stacey, Peter. Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio). New York/London:
Garland Publishing, Inc., 1989.
Stacey, Peter. “Towards the Analysis of the Relationship of Music and Text in
Contemporary Composition.” In Contemporary Music Review. United Kingdom:
Harwood Academic Publishers GmbH, 1989.
Starr, Larry and Christopher Waterman, American Popular Music: from Minstrelsy to
MTV. New York, Oxford: Oxford University Press, 2003.
Stein, Jack M. Poem and Music in the German Lied from Gluck to Hugo Wolf.
Cambridge, Mass.: Harvard University Press, 1971.
Stockhausen, Karlheinz and Herbert Eimert. Die Reihe: Speech and Music. Bryn Mawr,
Pennsylvania: Theodore Presser Company, 1968.
Streisand, Barbra. Simply Barbra. Sony B0000024TI, 1990.
Taruskin, Richard. The Oxford History of Western Music. Oxford-New York: Oxford
Youens, Susan. Schubert’s Poets and the Making. Cambridge-New York: Cambridge
Zak, Albin J. The Poetics of Rock: Cutting Tracks, Making Records. Berkeley, Los
Angeles, London: University of California Press, 2001.
Zelter, Carl F. Lieder. München: G. Henle, 1995.



