by
Lorena M. Guillén
Doctor of Philosophy
Department of Music
UMI Number: 3262037
Copyright 2007 by
Guillen, Lorena M.
ACKNOWLEDGEMENTS
Several people made the realization of this work possible. First, I wish to
acknowledge my advisor Michael Long who patiently and wisely guided me through this
process. I also really appreciate the time and insightful comments provided by my
The last years of research and writing were made possible by The Doctoral
Dissertation Fellowship from the College of Arts and Sciences of the State University of
I want to mention Barbara Hein and Martina Anderson, who carefully read and
edited my document, Martina Möetz for assisting with translations, and William
I would like to thank Gloria Escobar and Esperanza Roncero for allowing me to
conduct my questionnaires in their classes, and certainly, all Hartwick College students
babies, Xul y Mora, who patiently waited for their mother to finish her dissertation.
Alejandro with his sharp comments on my topic provided a continuous dialogue that
TABLE OF CONTENTS
ACKNOWLEDGEMENTS…………………………………………………………….iii
LIST OF EXAMPLES………………………………………………………………….vi
LIST OF TABLES……………………………………………………………………..viii
ABSTRACT………………………………………………………………………….......ix
Empirical Data: Questionnaires’ Results………………………………157
V. CONCLUSION…………………………………………………………………….170
BIBLIOGRAPHY……………………………………………………………………………175
LIST OF EXAMPLES
EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35…………...74
EX. 2.2: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:05 to 0:14…………...75
EX. 2.3: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:15 to 0:20…………...76
EX. 2.4: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:36 to 0:43…………...77
EX. 2.5: Monk, Volcano Songs: Duets, “Walking Song,” min. 2:27 to 2:46…………...79
EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”………………….80
EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hip Dance”………………….81
EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”……………………..84
EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the
Things You Are”……………………………………………………………………132
EX. 3.2: “All the Things You Are”: Two versions and published score of the A section.
Each in the key performed or published. Trans. by L. Guillén……………………..133
EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”…….154
EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”……….155
EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”……156
LIST OF TABLES
TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah…..37
TABLE 2.2: Exchange between tenor 1 and baritone 1 in Berio’s A-ronne, 18 to 20…..99
TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”..147
TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You
Are”………………………………………………………………………………....164
ABSTRACT
gestures—sonorous stimuli that are a complex web of musical and language parameters.
This dissertation looks into how people listen to song and how, consciously or
while facing the task of creating their pieces: Which are the diverse compositional tactics
employed to manipulate the focus of the listener’s perception of the text? How can
composers and songwriters emphasize, compensate for, or oppose this sonic connection
vocal “art” work. Although there is an intention of deciphering the semantic message of
the lyrics, it is only after repeated listening that the audience is able to apprehend the
piece as a cohesive discourse. Aside from paying attention to the strict musical elements
of the piece, listeners predominantly perceive the sonic or musical aspects of its lyrics:
the colors of its phonemes, the prosodic arch of intonation of its phrases, the sonic quality
of the performing voice, and the specific colors and inflections that the voice adopts at
each phrase.
In order to test the hypothesis previously proposed and then explore further
materials (analysis of vocal pieces through listening to specific recordings and analysis of
scores) and surveys of college students to observe their perception of the four popular
songs analyzed.
I.
INTRODUCTION
listeners with no command of the English language engage at some internal level with
their favorite songs. Growing up in Argentina, I was no exception, and many songs in
foreign languages marked my youth. While singing along, my peers and I often
mimicked the sound content of the words, making up nonsense syllables that stood in for
the actual lyrics. This phenomenon has always intrigued me. How did we engage
“emotionally” with these songs while ignoring the meaning of their texts? And what sort
of pleasure resided in the singing of nonsense syllables rather than meaningful words?
Years later, while watching the 2005 Super Bowl, I witnessed a scene that
resonated with my childhood experience. During that year’s game, the halftime
entertainment was provided by Paul McCartney, who performed some of his well-known
hits, including “Hey Jude.” The audience’s participation increased dramatically at this
point of the show, demonstrating the great popularity of this song among the American
public. The crowd sang along enthusiastically with the chorus, and, surprisingly, some
audience members held up signs containing the nonsense syllabic utterance “na na na
nananana.” Was this the most memorable text phrase of this famous song? Was this event
The fundamental questions behind this dissertation, which considers how mainstream
listeners “hear” song (i.e., music-with-text), were generated by my own history but were
cast into a more generalized context by this very American musical moment.
Rethinking these issues, I came to the conclusion that the pleasure that we
stimuli that are a complex web of musical and language parameters. These “sound
The hypothesis in this dissertation proposes that our first tendency as listeners is
to connect musically to the popular song or vocal “art” work. Although there is an
intention of deciphering the semantic message of the lyrics, it is only after repeated
listening that the audience is able to apprehend the piece as a cohesive discourse. Aside
from paying attention to the strict musical elements of the piece, listeners predominantly
perceive the sonic or musical aspects of its lyrics: the colors of its phonemes, the prosodic
arch of intonation of its phrases, the sonic quality of the performing voice, and the
specific colors and inflections that the voice adopts at each phrase.
Numerous essays have explored the relationship between music and text, and many
others have proposed insightful analyses of song and other vocal genres, with special
emphasis on how the music is structured around a poem.1
1
For specific articles on text and music relationship issues, see Walter Bernhart, Steven Paul Scher and
Werner Wolf, eds., Word and Music Studies 1: Defining the Field (Amsterdam-Atlanta, GA: Rodopi,
1999). This volume gathers essays written by members of the International Association for Word and
Music Studies (WMA). Of particular interest are: Steven Paul Scher, “Melopoetics Revisited. Reflections
on Theorizing Word and Music Studies,” and Suzanne M. Lodato, “Recent Approaches to Text/Music
Analysis in the Lied, A Musicological Perspective.”
Although it is impossible to name them all, here are some texts that contain original song analyses: Charles Rosen,
The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995), particularly his
“Chapter Three: Mountains and Song Cycles”; Kofi Agawu, “Theory and Practice in the Analysis of the
However, very little has been said about how people listen to song and how, consciously
while facing the task of creating a song: Which are the diverse compositional tactics
employed to manipulate the focus of the listener’s perception of the text? How can
composers and songwriters emphasize, compensate for, or oppose this sonic connection
In an effort to define the nature of the relationship between the two semiotic
realms converging into song, music and language, some poststructuralist analysts and
philosophers have given special attention to the preponderant role of music in this fusion.
Thus, there is the “assimilation model” of Susanne Langer. Although she conceives the
capacity of the poem to trigger the composer’s imagination, she admits that music
transforms “the entire verbal material, sound, meaning, and all—into musical elements.”2
She argues:
When words and music come together in song, music swallows words; not only mere
words and literal sentences, but even literary word-structures, poetry. Song is not a
compromise between poetry and music, though the text taken by itself may be a great
poem; song is music. 3
On the opposite side of the semiotic analytical arena, Lawrence Kramer conceives
of song as a structure where words and music coexist without losing their individual
Nineteenth-Century ‘Lied’” in Music Analysis 11, no.1 (Blackwell Publishing: March, 1992); Lawrence
Kramer, Music and Poetry: The Nineteenth Century and After (Berkeley: University of California Press,
1984); David B. Lewin, “Figaro’s Mistakes,” and Carl Schachter, “Motive and Text in Four Schubert Songs,” in
Engaging Music: Essays in Music Analysis, ed. Deborah Stein (New York-Oxford: Oxford University
Press, 2005); Susan Youens, Schubert’s Poets and the Making of Lieder (Cambridge-New York: Cambridge
University Press, 1996); John Daverio, Robert Schumann: Herald of a “New Poetic Age” (New York-
Oxford: Oxford University Press, 1997).
2
Susanne Langer, Feeling and Form, as quoted by Kofi Agawu in “Theory and Practice in the Analysis of
the Nineteenth-Century ‘Lied,’” 5.
3
Ibid., 5.
essences: “A poem is never really assimilated into a composition; it is incorporated, and
it retains its own life, its own body, within the body of the music.”4
The style of the classical art song since the Renaissance heightens the tension between
words and music in two fundamental ways: first, by adopting an intonational manner that
presents the voice as a precisely tuned instrument rather than as a source of utterance, and
second, by opening the possibility of a musical response to the poetry that is complex
enough to raise questions of interpretation. Other features—the expressive forcing of high
and low tessitura, where the sound of the words inevitably fades into the effort of
attacking the pitch; the complication of rhythm and the varied movement of the voice
toward and away from speech-like patterns, the repetition, alteration, and syntactic
breakdown of the text—also contribute to alienating the singing of the words...6
distortion to song, dissolving the employed language into its physical origin, the
kind of higher carelessness or forgetfulness that simply does not avail itself of the
symbolic, allows the symbolic to lie unused even if its words may still be heard clearly.8
“Songfulness” refers to the transformation of song into a fusion of vocal and musical
enveloping effect of the human voice, which at the physical level surrounds the listener
with the fullness of its vibrations and always implies potential meaning.
is one of those aesthetic qualities that is immediately recognized but difficult to account
for. Is “songfulness” an attribute of the vocal music itself, the particular performance or
the ears of the listener? This dissertation proposes that the nature of the “songfulness”
more “poetic” wherever it makes full use of disruptive tactics. Such tactics might create
across the poem. These sonic webs may trigger new narrative operations parallel to the
emotional moods. The poetic text is perceived in what he calls “the poetic mode,” where
“some non-speech qualities of the signal seem to become accessible, however faintly, to
8
Ibid.
9
Reuven Tsur, What Makes Sound Patterns Expressive?: The Poetic Mode of Speech Perception (Durham
and London: Duke University Press, 1992), 14.
This dissertation proposes that the natural disruption produced by the
poems in general. The result is what Kramer calls “songfulness.” But why do we perceive
song in this “songful” way? It is because we hear it in what Tsur calls the “poetic mode.”
Poetry is considered more “poetic” when it makes full use of disruptive tactics. In
a similar manner, music history shows that vocal setting styles have been considered
disruption of the natural flow of the discourse occurs. The degree of discourse
piece. However, vocal setting modalities that aim for a “speech quality” try to keep their
musical elements as plain and unobtrusive as possible in order to guarantee a closer result
The type of musical setting determines if the listeners will appreciate a song in (1)
a “poetic mode,” in which the “speech mode” of perception dominates; (2) a “poetic
mode,” in which there is a balance between the “speech mode” and the “auditory mode;”
or (3) a “poetic mode,” in which the “auditory mode” dominates. First, those settings that
aim to express through their semantic content try to preserve the integrity of the discourse
and therefore tend toward a “speech mode” of processing the text. They situate the
affective potential of language in its prosody, as found in its natural speech state. Second,
those settings that aim to express through their “songfulness” accept the new, artificial
arrangement’s textures and tend toward a balanced “poetic mode.” They situate the
elements intermingled. Lastly, those settings that find the distortion that the text
(phonemes) and paralingual gestures and use them as compositional elements. They
this dissertation, sections two to four are dedicated to the application of those
perspectives in the analysis of musical pieces of diverse musical styles and periods. These
serve to exemplify variations on the “poetic mode” of text perception and the various
compositional tactics that affect the way people listen to them. Thus, the selected
examples of monody and recitative represent the kind of “poetic mode,” in which the
“speech mode” dominates; the nineteenth-century German lieder, as well as the four
analyzed popular song cases, exemplify the “poetic mode,” in which there is a balance
between the “speech mode” and the “auditory mode”; finally, the two twentieth-century
avant-garde vocal selections illustrate the “poetic mode,” in which the “auditory mode”
dominates.
approaches to the relationship between text and music; it serves only to exemplify
different modes of perception derived from the application of Tsur’s concepts. Each
analysis presents a different methodological approach that most clearly articulates those
compositional procedures that reveal the way people perceive those kinds of settings.
In order to test the hypothesis previously proposed and then explore further
materials (analysis of vocal pieces through listening to specific recordings and analysis of
scores) and surveys of college students to observe their perception of the four popular
songs analyzed. Both data collection methods take into account the listeners’ perspective.
This research has exploratory intentions. It introduces a new explanation for the
phenomenon of perceiving text set into music in a “poetic” way and, at the same time,
focus of the listener’s perception. Although preserving as its ultimate goal the creation of
a theory of perception of text in vocal music, its immediate intentions are more modest:
The different degrees of linear connection and disruption of a text influence the
between linguistic strings of arbitrary verbal signs (words) and repeated phonetic sound
organization (sentences) and the prosodic units (poetry lines or intonation phrases). These
one another and pull the reader’s or listener’s attention in opposite directions. Tsur argues:
In connected speech there is a tendency to proceed linearly rather than move in different
directions from the central sequence...to allow for the disruption of the linear sequencing
of speech sounds (that is, for segregating the relevant portions of the auditory stream), the
whole message must be less thoroughly organized on all levels as the linguistic stress
pattern diverges from the conventional metric pattern, and as does the syntactic unit
(clause, sentence) from the prosodic unit (line)...10
10
Ibid., 73.
Poetic texts usually tend to disruption, which triggers a “poetic mode of
listening,” where there is freedom for segregating or grouping portions of the sound
stream and moving back and forth between auditory and phonetic modes of listening. The
resultant “sound patterns” assume the emotive effects of non-referential sound gestures;
they are perceived as music by the brain’s right hemisphere—a process that involves the
mode” and a “non-speech mode” of listening, which follow different paths in the neural
system. The same transitions between phonemes that in the speech continuum are heard
as speech sound (because they appear to carry linguistic information), when isolated are
heard as musical sounds. But, according to Tsur, there is a third mode, the “poetic mode,”
in which “some non-speech qualities of the signal seem to become accessible, however
faintly, to consciousness” along with the speech mode processing.12 These three
“listening modes” depend on the way the acoustic signal is processed. In the “non-speech
mode” (processed by the right hemisphere of the brain), we attend away from the
overtone structure to tone color, as when we hear musical sounds or natural noises. In the
“speech mode” (processed by the left hemisphere of the brain), we process the signal
attending away from the overtone or formant structure to the phoneme, and the tone color
11
Some of the specific articles that summarize these findings are: A. M. Liberman and David Isenberg,
“Duplex Perception of Acoustic Patterns as Speech and Nonspeech” in Status Report on Speech Research
SR-62 (Haskins Laboratories, 1980): 47-57; A. M. Liberman, I. M. Mattingly and M. T. Turvey, “Language
Codes and Memory Codes” in Coding Processes in Human Memory, ed. A. Melton and E. Martin (New
York: Winston, 1972); A. M. Liberman, F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy,
“Perception of the Speech Code” in Psychological Review 74 (1967): 431-61; B. Repp, C. Milburn and
John Ashkenas, “Duplex Perception: Confirmation of Fusion” in Perception & Psychophysics 33, no. 4
(1983): 333-337.
12
Tsur, What Makes Sound Patterns Expressive?, 13.
is taken into account or almost suppressed. In the “poetic mode,” the main processing is
identical with the speech mode except that certain precategorical information (such as
tone color) enters consciousness.13 In this mode, it is possible to switch back and forth
succession.
vocalized word. This disruption is mounted over the disruptive sound groupings produced
by rhyme schemes and phonetic interplays among words already present in any poetic
text as specified in previous paragraphs. The resulting effect could communicate two
kinds of messages: one that, according to Tsur, explores the double-edged expressive
capacity of phonetic sounds in connection with the words that contain them; and the other
concerns itself mainly with the semantic meaning of the words or the “parallel
narratemes” that words make up after being associated by phonetic similarities. These
those sounds have meaning but not a specific one: “they may express vastly different or
13
The term “precategorical” refers to the “categorical perception” phenomena explored at the Haskins
Laboratories at Yale University. The human ear has the tendency to fuse the continuous variation of color
and pitch within phonetic linguistic categories (the repetition of one particular phoneme, for example,
minimal variations in the formants of the [b] phoneme). This is a similar process to the fusion of overtones
in sound stimulus. But, at the same time, the ear perceives the change from
one category to the other (the change from one particular phoneme to a different one, for example, from [b]
to [d]) as a quite distinctive difference. We, as humans, perceive the phonemes that make speech as
individual categories beyond their minimal formant variations. In contrast, natural noises and music are
perceived in a continuous manner, attending to every single formant variation.
14
The term “narrateme” has been coined by Didier Coste in his book Narrative as Communication
(Minneapolis: University of Minnesota Press, 1989). Narrateme is the minimal unit of the narrative
even opposing qualities.”15 As its syntactic and semantic context, the “word” influences
meaning of the combined sounds is grasped and runs parallel to the semantic abstraction
of the words united by sonorous similarities such as alliterations, rhyme schemes, etc.
Regarding the second possible message, Didier Coste, in his book Narrative as
Communication, explains that when poetry carries a narrative type of discourse, there is a
tension between verse and narrative. Besides the (usual) way of processing the straight
operations” based on phonetic or rhyme connections between two words. In some cases,
the ambiguity of the contrast between phonetic affinity and semantic disjunction points to
the insufficiency of the “primary narratemes” to account fully for the narrative
significance of the poem.16 Poems are usually prized for their incompatibility with
straight narrative.
Narrative is not the only type of discourse used in vocal settings, although it is the
onomatopoeic)—are used in other vocal pieces, the latter mostly in 20th and 21st century
avant-garde pieces. The poetry or the song lyric format makes use of these types of
discourses.18
discourse; it is “an utterance that contains an actional predicate” (Coste, p. 36); in other words, it represents
an event.
15
Tsur, What Makes Sound Patterns Expressive?, 2.
16
Didier Coste, Narrative as Communication (Minneapolis: University of Minnesota Press, 1989).
17
The injunctive type of discourse is understood as laws, orders to somebody else, or ordering of things.
18
There are vocal pieces that use prose. In some cases, this prose has poetic tendencies (it explores the
sound patterns of the words and phonemes or interplay between words at a semantic level). But, in other
cases, straight prose has been used in chamber vocal pieces like Berio’s Sinfonia, in which some of the
sources are philosophical texts by Lévi-Strauss and political speeches.
The Perceptual Process of Text
Both song lyrics and poems set into popular or art songs are conceived by their
creators and received by their audiences in a different manner than the common speech
text of an everyday conversation. These literary texts are more like oral poetic
storytelling. But they have one characteristic in common with speech: their orality. They are
literary forms that are communicated verbally during the song performance. The
communication process of the text of the song lyrics and poems, whatever its nature,
literary texts and everyday speech are similar. At a micro-level, however, the specifics of
At the first stage (macro-level), the auditory perception takes the form of a
The listeners “decode” the input speech signal by using their knowledge of the
constraints that are imposed by the human articulatory “output” apparatus.19
speech helps the listener to decode segmental phonemes, intonation and stress. The
acoustic signal may be partially ignored and filled in by the listeners’ own syntactic and
semantic knowledge of language and the social context of the communication act.
19
Philip Lieberman, Intonation, Perception, and Language (Cambridge, Massachusetts: The M.I.T. Press,
1967), 162.
20
Ibid., 163.
At times, when the speaker realizes that the listener may infer the rest of the
message with only certain minimal information, he simplifies his articulatory control over
it. Lieberman says that a speaker may neglect to articulate a word carefully in such a
case. The listener will then create a hypothesis regarding the phonetic character of the
segments that are unrecognizable from the acoustic signal and, applying phonological and
syntactic rules, will form a hypothetical phrase. This hypothetical phrase may be
semantically reasonable and consistent with its context. If that is not the case, the listener
another process is triggered in which the listener tries to make sense of the whole
message at a deeper, or micro, level. The literary text and strategies used in it are simply
a starting point, from which the reader, or listener in our case, must construct for himself
the aesthetic object. The communicative act is initiated by the text but depends on the
active involvement of the reader. The texts should stimulate the individual reader’s
The decoding proceeds in “chunks” rather than by single words. These chunks
correspond to the syntactic units of a sentence. These individual sentences do not directly
denote objects. Literary text does not denote empirically existing objects; although text
may select objects from the empirical world, they are depragmatized. The literary
aesthetic object is built up in such a way that these intentional “sentence correlates” join
in semantic units. Wolfgang Iser says that “the semantic pointers of individual sentences
always imply an expectation of some kind…As this structure is inherent in all intentional
sentence correlates, it follows that their interplay will lead not so much to the fulfillment
a sentence, after completing the thought of that sentence, we are prepared “to think its
‘continuation’ as a new sentence, especially one that has connection with the previous
one.”22 Each of these “sentence correlates” contains what Iser calls “a hollow section,”
which creates expectation pointers towards the next sentence, and a retrospective section,
which complies with the expectations of the preceding correlate, which at this point is
part of the background. According to Iser, this creates a constant “dialectic of protension
and retention, conveying a future horizon yet to be occupied, along with the past horizon
already filled…”23 What has already been heard undergoes a permanent synthesizing
process. Every sentence shrinks in the memory and becomes some sort of background,
If the new sentence answers the expectations aroused by the previous correlate, the range
of semantic horizons narrows. Descriptive texts, especially, behave in this way in order to
individualize the particular object. But when the new sentence does not fulfill the
expectations, the resulting frustration retroactively affects what has already been read.
fictional texts and pragmatic expository language behave very differently. In order to
guarantee the reception of a specific given fact, the expository text tends to stay as
cohesive as possible.
21
Wolfgang Iser, The Act of Reading: A Theory of Aesthetic Response (Baltimore and London: The Johns
Hopkins University Press, 1978), 111.
22
Ibid., 112.
23
Ibid.
Whenever the expository text unfolds an argument or conveys information, it
presupposes reference to a given object; this, in turn, demands a continuous
individualization of the developing speech act, so that the utterance may gain its intended
precision. Thus, the multiplicity of possible meanings must be constantly narrowed down
by observing the connectability of textual segments, whereas in fictional texts the very
connectability broken up by the blanks tends to become multifarious. It opens up an
increasing number of possibilities, so that the combination of schemata entails selective
decisions on the part of the reader.24
prestructured by the sequence of sentences.”25 The listener reconstructs the gaps that the
text leaves from what is revealed along its development. In return, the information made
explicit is transformed when what is left open is discovered. These blanks are one of the
The coherence and connectability of speech also depend upon certain extra-
textual conditions that in pragmatic language are a given and in fiction have to be
recreated every time, such as a “‘non-verbal frame of action…as matrix for utterances’;
the relation between the recipient and ‘the common referential system of experiences
assumed by the speaker,’ as well as ‘the common area of perception’; and the relation
between the recipient and the communication situation, as well as the ‘speaker’s range of
associations.’”26 Certainly, since in the act of listening to songs there is no direct contact
between the songwriter or the speaker’s voice and the listener, these preconditions need
Either when listening to a recording or in a live concert situation, when the singer
is not the songwriter himself, an indirect communicative situation exists. The performer
24
Ibid., 184.
25
Ibid., 110.
is only interpreting, or creating her own reading of the text (text conceived as the already
combined product of text plus music into the song), and then communicating her version
to the audience. But even in live performances when the performer is the songwriter, an
unavoidable distance exists between the performing artist and her audience that does not
exist in a common dialogue situation. Although in these live settings the performer could
partially provide the “non-verbal frame of action,” the rest of the assumed extra-textual
conditions are a leap of faith that every songwriter makes every time that he composes a
song.
The songwriter, as well as the performer, is alone in this realm since her
knowledge of the audience is limited and general (at least more than in a direct
conversation). The relation between the recipient and “the common referential system of
the audience that she has in mind at the moment of composing. “The common area of
perception” does not apply in this situation because this is not a conversation that is
surrounding environment. In the same way, the relation between the recipient and the
being delivered. Beyond the interpretative nuances that any performer may add in
according to the audience reaction. So, in every case, the live performance as
communicative act comes closer to the way a fictional literary text reaches its recipients
26
Ibid., 183. Iser summarizes a list of factors listed by S. J. Schmidt in Texttheorie (UTB 202) (Munich:
Fink, 1973).
than to the way an everyday conversation does. This is even clearer in the case of
Concerned with the “orality” of the song’s literary medium, Roland Barthes
...we never listen to a voice en soi, in itself, we listen to what it says. The voice has the
very status of language, an object thought to be graspable only through what it transmits;
however, just as we are now learning, thanks to the notion of “text,” to read the linguistic
material itself, we must in the same way learn to listen to the voice’s text, its meaning,
everything in the voice which overflows meaning.27
It is the sound of the voice, its quality of tone and intonation patterns that offer the
context from which the listener departs in his interpretive journey of the conveyed
message. First, the quality depends on the circumstantial performer, her involvement with
the musical piece and her interpretation of what the composer/lyricist wants to transmit—
unless performer and composer are the same person. Second, the intonation patterns
depend mostly on the way in which the lyrics were set to music by the composer. But,
ultimately, it is the receiver or listener that activates the connections suggested by these
pointers.
The literary use of the text gaps challenges the listener or reader by withholding
formulated text if he is to be able to absorb it.”28 In pragmatic speech, this challenge does
not exist and the imagination of the listener is not tested as it is in the literary medium.
The listener may fill in the blanks in the disconnected discourse by asking the speaker.
Iser points to the natural need of language to leave holes in the continuum of any
The lack of a sign can itself be a sign; expression does not consist in the fact that there is
an element of language to fit every element of meaning…Speaking does not mean
substituting a word for every thought: if we did that, nothing would ever be said…we
would remain in silence, because the sign would at once be obliterated by
meaning…Language is meaningful when, instead of copying the thought, it allows itself
to be broken up and then reconstituted by the thought.29
Meaning comes out of the unsaid as much as out of the message carried by the spoken
words. Thus, literary text engages the listener in a constant exercise of interpretation.
All vocal setting modalities, from those that aim at a “speech quality” to those
that aim at a “musical-poetic quality,” seek to communicate some kind of affect, but they
locate that expressive potential in different aspects of the text. As a consequence, the
way the musical setting interacts with its lyrics or poem will vary.
As mentioned earlier, those settings that rely on the semantic meaning of their
lyrics for the communication of their expressive potential tend to preserve the integrity of
the speech qualities. They locate this affective potential of language in its “prosody,” as
found in its natural speech state. Other kinds of vocal settings accept the new, artificial
fragmentation of the sequence of the original discourse can vary to a vast degree but
tends toward a balanced “poetic mode,” which situates the affective potential of language
27
Roland Barthes, The Grain of the Voice: Interviews 1962-1980 (Berkeley and Los Angeles: University of
California Press, 1985): 183-184.
28
Iser, The Act of Reading, 185.
29
Maurice Merleau-Ponty, Das Auge und der Geist. Philosophische Essays, trans. Hans Werner Arndt
(Reinbeck, 1967), p.73f, as quoted by Iser in The Act of Reading, 186.
settings that find the distortion that the text undergoes as irreversible further fragment
language in its minimal components (phonemes) and paralingual gestures and use them
as compositional elements. They situate the affective potential of language in its sonic
qualities.
field, linguistic studies in prosody also help to illuminate the intermingling of sonic
and semantic elements of language. “Prosody” is understood as the kind of shape that
language—takes according to the variation of the parameters that govern its contour and
variation of the fundamental frequency, as the primary parameter. This is a higher level
of prosody than the one that is generally applied at the lexical level, where the word is the
unit and all the above mentioned parameters take place inside its limits. Despite complex
controversies among researchers in this field, most linguists agree that intonation systems
convey, as Dwight Bolinger says, “how we feel about what we say, or how we feel when
we say.”30 Researchers in this area, including B. Shapiro and M. Danly (1985), have
conducted tests that provide neurological evidence for those linguists, among them
intonation.
30
Dwight Bolinger, Intonation and Its Uses: Melody in Grammar and Discourse (Stanford, California:
Stanford University Press, 1989), 1.
Two phenomena described by some linguists involved in prosody studies are
crucial to the understanding of how different vocal setting modalities act on the
perception process. The first one is found when Bolinger explains that the vocal tones
employed in language are made of overtones produced by the shaping of the different
phonemes, which carry the semantic message, and fundamental pitches that are “mostly
used for mood and punctuating effects”—what is known as prosody.31 In any case, since
fundamental pitches, types of song focusing on tune over speech quality affect the
expressive/affective mood of the text. However, these kinds of musical settings do not
modify the overtones produced by the phonemes, and, as a consequence, the strict
message of the text remains intact. By modifying the intonation arches in this manner, we
are essentially presented with a new reading of the text in performative terms.
The second phenomenon is one described by Daniel Hirst. When defining the
difference between speech and song, he says that while normal speech consists of a
continuous sequence of movements from one target-point to the next, in song, the
sequence of static level tones.32 This description will prove instrumental when analyzing
31
Bolinger, Aspects of Language (New York/Chicago/San Francisco/Atlanta: Harcourt, Brace & World, Inc.,
1968), 31.
32
Daniel Hirst, “Intonation in British English” in Intonation Systems: A Survey of Twenty Languages
(Cambridge, UK: Cambridge University Press, 1998), 71.
the different degrees in the spectrum of possible vocal tones employed in the following
musical examples.
II.
In order to proceed with the prosodic analysis of monody and recitative in Italian,
studies as presented by Mario Rossi.1 There is a certain agreement among linguists that
an “intonation unit” is dictated by syntactic rules. Although it does not coincide with the
NP (noun term), including its modifying adjectives. Whatever its place in the phrase, S-
ADV (sentence adverb) is separated from the other constituents of the intonation unit
during analysis.
The intonation curve at the last syllable of each intonation unit may be either a
intonation unit has two “internal Accents” (AC1 and AC2). These accents are at the
lexical, or word, level. As we may observe in the first two examples analyzed, the Italian
language in particular tends not to synchronize its intonemes with these
“internal Accents” (ACs) because most of the time it carries its lexical stress on the
penultimate or antepenultimate syllable, while the intoneme occurs on the final syllable.
higher than that of the AC.”2 Actually, “the pitch contour of the ‘continuative intoneme,’
1
Mario Rossi, “Intonation in Italian,” in Intonation Systems: A Survey of Twenty Languages (Cambridge,
UK: Cambridge University Press, 1998).
2
Ibid., p. 225.
after AC1, may vary freely between the two pitch extremes of AC1 and AC2, that is to
say between the Mid and the Mid High levels.”3 In contrast, CCs (“major conclusive
intonemes”) tend to be lower in pitch than the ACs, and in general a falling pitch contour
Rossi also mentions that “the duration of the vowel under AC and the
continuative intoneme is longer than of the unstressed vowels…”4 The last intoneme
syllable is probably significantly longer than all the previous atonic prestressed vowels.
temporal prominence are not synchronized with those of pitch prominence.”5 In an ideal
neutral intonation expression, AC2 has the pitch prominence, higher than the rest, even
than AC1 and CT or CC. AC2 may be somewhere between the Mid High and the Mid
pitch levels, while AC1 is around the Mid to Mid Low. The temporal prominence belongs
to AC1. It is longer than any other element, even than AC2, and this factor indicates that
what follows is an intoneme of any type, a CT or a CC. The difference between the AC2
and its preceding and following unstressed syllables is 3 PUs, while the difference
between the AC1 and its surrounding group of unstressed syllables is 6 PUs.6 That
difference between the two ACs and their surrounding atonic syllables indicates the
difference between a stressed group around AC2 and an intoneme around AC1. Between
AC2 and AC1 there is no intoneme. The unstressed vowels lying in between those two
3
Ibid., p. 227.
4
Ibid., p. 225.
5
Ibid., p. 226.
6
The PU is a duration unit that is calculated as “the log of the ratio of the duration of a given vowel to that of
the vowel carrying AC in the utterance. This value is then normalized by dividing by log (1.22).” (Rossi, p.
238).
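The definition quoted in this note can be sketched in Python as follows. This is only an illustrative reading of the quoted formula: the function name `duration_in_pu` and the millisecond values are hypothetical and are not measurements taken from Rossi or from the pieces analyzed here.

```python
import math

# Normalizing ratio quoted in the note above (Rossi, p. 238).
NORMALIZING_RATIO = 1.22


def duration_in_pu(vowel_ms: float, accented_vowel_ms: float) -> float:
    """Express a vowel's duration in PUs relative to the vowel carrying AC.

    Follows the quoted definition: log(vowel / AC vowel) / log(1.22).
    Both durations are hypothetical inputs, here given in milliseconds.
    """
    if vowel_ms <= 0 or accented_vowel_ms <= 0:
        raise ValueError("durations must be positive")
    return math.log(vowel_ms / accented_vowel_ms) / math.log(NORMALIZING_RATIO)


# Hypothetical durations, chosen only to illustrate the scale.
same_length = duration_in_pu(100.0, 100.0)      # vowel as long as the AC vowel
one_step_longer = duration_in_pu(122.0, 100.0)  # duration ratio of exactly 1.22
shorter = duration_in_pu(80.0, 100.0)           # vowel shorter than the AC vowel
```

Under this reading, a vowel as long as the AC vowel scores 0, a vowel 1.22 times as long scores 1, and shorter vowels receive negative values.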
accents adjust their prosodic values to “the temporal and melodic continuums obtained by
Certain vocal styles establish a “low-level mimetic relation” with their texts, a
relation at the lexical and prosodic level of their literary sources. These styles have as a
principle the preservation of the text integrity as a logical linear discourse as far as is
musically possible. In order to allow the discourse to flow without interruption, its
practitioners look at intonation and inflections of the text as speech. If language as speech
has the ability to communicate ideas, it is by allowing the texts to behave as such that
their musical settings are able to transmit emotions. One might say that these vocal styles
and genres locate the expressive qualities of language in the intonation system. By
attempting to maintain the integrity of the discourse, these pieces trigger a “poetic mode”
of perception in which the “speech mode” dominates. Two vocal setting modalities may
serve as examples of this approach: Italian monody of the late sixteenth- and early
Monody composers, such as Caccini, D’India, Galilei, Peri and Monteverdi (in
his new monody phase), claimed to be imitating nature with the new stile rappresentativo
of their vocal compositions. This was not an original claim since earlier and
maintained the same about the counterpoint and “word painting” devices used in their
7
Ibid., p. 227.
avoid the madrigal analogies of “word painting,” which illustrated the meaning of the
words with specific harmonies created by intervals among the voices, runs up and down,
silent parts or notation devices. They based their monodic settings on the homology of
the spoken word as the closest to human nature that they could get.8
The proponents of this seconda prattica put their efforts into following the textual
shortening and pitch variation of morphemes as much as they musically could. But, in the
end, a general musical sense and sensitivity ruled over the strict transcription of those
intonational parameters. They also followed the phrasing of intonational units and order
of the text, only adding some repetitions toward the end of the poems. These repetitions
were meant to emphasize certain concepts and words. The restatement of words allowed
melodic embellishment, such as melismas, trillo and gruppo, without detracting from the
stretched and twisted over long and elaborate ornamentation and still be present in the
example, the esclamazione consists of “a gradual loudening of the voice on long notes
into an outcry, made more artful by first diminishing the volume before beginning the
described by Caccini are the gruppo and the trillo, which imitate unsteady speech. The
8
For a detailed discussion of the monodists’ reform, see Richard Taruskin, The Oxford History of Western
Music, vol. 1 (Oxford-New York: Oxford University Press, 2005), pp. 797-847.
former is “the artfully simulated vocal tremble” made of the “rapid alternation of
contiguous notes of the scale.”11 The latter consists of the rapid repetition of a single
pitch.
prosodic characteristics of the text phrases. On the one hand, they set the text phrases to
melodic patterns that mimicked their intonation shapes pitch-wise. On the other hand,
they also replicated the kinds of prolongation performed over accented syllables and the
shorter values of the syllables in between with similar musical rhythmic patterns.
There are two factors that are decisive in bringing monody closer to natural
speech. First, the shapes and rhythm indicated in the score served only as a point of
departure for the real interpretation of the performer. These composers allowed and
expected flexibility in the performing beat of their monodies, giving the performer the
chance to vary the articulation pace according to her or his dramatic interpretation of the
different phrases. Second, the quasi-parlato nature of the vocal sound—with less vibrato
and more continuous movements between pitches—better recreated the sound of the
speaking voice. It is only by these means that monody achieved the emulation of speech
manners.
hand method known as figured bass. Dissonance was used to highlight certain “rhetorical
9
The “second practice” referred to the stile rappresentativo, as defined by Giulio Cesare Monteverdi in the
postface “Declaration” of Claudio Monteverdi’s score of Scherzi musicali (Venice, 1607). For these
monodists, the madrigalists represented the prima prattica, the stile antico.
10
Taruskin, The Oxford History of Western Music, p. 817.
11
Ibid., p. 817.
effects of vocal inflection and delivery.”12 The focus is still on the speech qualities of the
the features described above. In Sfogava con le stelle, Giulio Caccini sets a poem by
Ottavio Rinuccini:
Turning now to the score of Caccini’s Sfogava con le stelle, most of its musical
phrases coincide with the intonation units into which the poem could be divided. Thus,
from the third verse to the tenth verse, there are nine intonation units that correspond with
12
Ibid., p. 829.
13
Translation: “There appeared under the stars a man sick with love, and under the night sky he disclosed
his pain; and he said, his eyes fixed on them, ‘Oh, lovely images of my adored Idol, just as you show me
her rare beauty as you shine so brightly, in the same way show her my keen pangs. Perhaps you might
make her pitiful with your golden aspect, just as you made me loving.’” Carol MacClintock, ed., The Solo
Song 1580-1730 (New York: W.W. Norton & Company, Inc., 1973)
EX. 1.1: From Caccini’s Sfogava con le stelle, mm. 5-16.
Intoneme Unit   Measures    Intoneme Analysis        Text
1               mm. 5       AC1 CT                   E diCEa,
2               mm. 5-6     AC2 AC1 CT               FIsso in LOro,
3               mm. 6-8     AC2 AC1 ct AC2 AC1 CT    O immagini BElle del Idol mio ch’aDOro,
4               mm. 8-10    AC2 AC1 CT               Sì come a ME mosTRAte,
5               mm. 10-11   AC2 AC1 CT               mentre coSI splenDEte
6               mm. 11-12   AC2 AC1 CT               la sua RAra belTAte,
7               mm. 12-13   AC2 AC1 CT               coSI mostrate à Lei
8               mm. 13-14   AC2 AC1 CT               mentre con TANto arDEte
9               mm. 14-16   AC2 AC1 CC               I VIvi ardori MIei.
As the musical excerpt and table illustrate, all of the continuative intonemes end
on the same or a higher note than their preceding AC1. On the other hand, as expected, in
measure 16, the conclusive intoneme (CC) resolves on a lower note than the AC1: from
A4 to G4 (this last pitch is the modal center of the piece). In most of the phrases, the
highest pitch of each of these units is set on the AC2, while the rest of the morphemes
move downwards and in shorter rhythmic values toward the longest note of the phrase,
on which we find the AC1. This is true at least for the first three intonation units, but
certain exceptions are found in the rest of the units in which the character of the narration
changes.
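The pitch relations just described, and the intervals named in the analyses that follow, can be checked with a small Python sketch. The helper names `note_to_midi`, `semitones` and `contour` are hypothetical, and the numbering convention (A4 = 69, as in MIDI) is an assumption introduced here purely for arithmetic convenience; only pitches named in the text are used.

```python
# Semitone offsets of the natural notes within an octave.
PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a name such as 'A4' or 'F#5' to a number (assumed: A4 = 69)."""
    letter, rest = name[0].upper(), name[1:]
    offset = 0
    # Read any sharps (#) or flats (b) before the octave digit.
    while rest and rest[0] in "#b":
        offset += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    return 12 * (int(rest) + 1) + PITCH_CLASSES[letter] + offset


def semitones(start: str, end: str) -> int:
    """Signed distance in semitones from start to end (negative = descending)."""
    return note_to_midi(end) - note_to_midi(start)


def contour(ac1: str, intoneme: str) -> str:
    """Describe the motion from the AC1 pitch to the intoneme pitch."""
    step = semitones(ac1, intoneme)
    if step > 0:
        return "rising"
    if step < 0:
        return "falling"
    return "level"


print(contour("A4", "G4"))     # falling: the conclusive intoneme in measure 16
print(semitones("F#5", "D5"))  # -4: a descending major third
print(semitones("A3", "D4"))   # 5: an ascending perfect fourth
```

Read this way, a "level" or "rising" result corresponds to the behavior described for the continuative intonemes, and "falling" to the conclusive one; the sketch compares only the two pitches it is given and does not analyze the score itself.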
In the phrases in which the narrator evokes what the “infermo d’amore” (the man
sick with love) said, the hierarchy of the pitches is reversed—with the exception of the
two “Mentre…” phrases that are time clauses directly connected to what follows. They
tend to ascend, thus making the AC2 lower than the AC1. This kind of departure from the
basic pattern of rhythmic values and pitch hierarchies is a sign of highlighting pragmatic
content or a specific expressive intention. In cases like this one, when AC1 is pronounced
at the level of AC2 or higher, the speaker is trying to draw the listener’s attention to the
topic. This focus attracts the listener at the same time as it imitates the persuasive tone of
the “sick man” while looking for approval of his “Idol.” Ascending lines are actually
The last intonation unit, “I vivi ardori miei,” is an even stronger case of focusing.
Although it respects the pitch hierarchy of AC2 (morpheme “vi” on D5) over AC1
florid run that takes almost the whole of measure 15, except the last sixteenth-note. Thus,
an otherwise unstressed syllable of the intonation unit (although a stressed one at the
word level) gains prominence. This stretching treatment of the syllable “do” certainly
highlights the word “ardori,” enhancing with it the importance of the man’s suffering
Another vocal setting style that engages in a “low-level mimetic relation” with its
text is the recitative. The operatic, oratorio or cantata recitative of all historical
periods continues the same kind of approach to text setting as monody, which respects
most of the intonational contours and phrasing of the text and gives the performer
flexibility on the beat for the final touch of speech-likeness. The recitative between Don
Giovanni, Donna Elvira and Leporello, before the aria “Madamina! il catalogo è questo,”
(Scene V from W. A. Mozart’s Don Giovanni) shows the same kind of close observation
Rinuccini’s. From the opening seven measures of the beginning of their recitative, it is
CT
AC1 AC2 AC1 CC
feLON! / NIdo d’inGAnni!
2 mm. 3 AC2 AC1 CC
Leporello: Che TItoli crusCANti!
3 mm. 4-5 AC2 AC1 CC
Leporello: manco MAle che lo conosce BEne!
4 mm. 5-6 AC2 AC1 CT
D. Giovanni: VIa, cara Donna ElVIra,
5 mm. 6-7 AC2 AC1 CT
D. Giovanni: calMAte quella COllera;
6 mm. 7 AC1 CT
D. Giovanni: senTIte…
7 mm.7-8 CT
AC2 AC1
D. Giovanni: laSCIAtemi parLAR! 14
TABLE 1.2: Intoneme analysis of recitative from Scene V, W. A. Mozart’s Don Giovanni.
These intonation units set into music observe most of the prosodic rules
dialogue between three characters in the middle of an agitated discussion. It shows a wide
range of emotions and circumstances, from Donna Elvira’s insults to Leporello’s aside
14
Translation of this fragment: D.E: You are here! Monster! Traitor! Nest of deceits!; Lep: Such pure
Tuscan titles! So much the better that she knows him!; D.G: Come now, dear Donna Elvira, calm this
anger; listen…let me talk! (translation by L. Guillén)
EX. 1.2: Recitative from Scene V, W. A. Mozart’s Don Giovanni.
The first section of Donna Elvira’s intervention is a direct call for Don Giovanni’s
attention. “Sei qui!” not only has the expected rising contour of a continuative intoneme
(CT), but it also has the coincidence of the AC and the CT on the last syllable “qui!” The
exclamation mark, which indicates the surprise and anger of the character when
discovering Don Giovanni, is represented by the doubling of the rhythmic value of the
quarter-note A4 on which “qui!” is set with respect to the preceding unstressed “Sei” of
only an eighth-note. Another emphasizing factor for “qui!” is that it falls on one of the
strong beats of the measure (the fourth one). “Fellon” is set in the same way as “Sei qui.”
But the previous “mostro,” which has the accent on the penultimate syllable, shows the
splitting of attributes that is typical between AC1-CT. The former is rhythmically longer
The last section of Donna Elvira’s list of insults extends beyond these three short
periods of two syllables. The next insult, “Nido d’inganni,” opens with the AC2 on “ni”
holding the normally expected higher pitch of the whole unit, an F#5. From then on, the
pitch contour mostly descends—with some brief upward deviations—toward the AC1,
which falls on a D5. This whole descending line moves a major third below the AC2’s
F#5. But contrary to the expected lengthening of the AC1 on “ga,” this one stays on the
same eighth-note value as the previous unstressed syllables. In this particular case, this
setting may try to convey the urgency and tripping of the words in an infuriating
situation, such as the one Donna Elvira is experiencing with Don Giovanni. Although in
the score “ga” appears on a D5, if the performer applies the traditional appoggiatura on
an E5 before the resolution on the D5 of “nni!” the descending contour of the AC1-CC
succession takes place, at least in performance practice. On the other hand, if the
performer does not apply this appoggiatura rule, ending the phrase with two equal D5
notes on “-ganni,” the effect produced could be that of an AC1-CT succession. Under
speaking while Leporello’s next aside to the public takes place. The audience will
Turning now to the next two intonation units pronounced by Leporello, both
clearly depart from the basic pattern because of the emphasis that the character wants to
put on the sarcastic adjectives modifying the word “titoli.” Donna Elvira’s insults are
comment of Leporello: “manco male che lo conosce bene!” (So much the better that she
knows him!). Rossi mentions that when AC1 is pronounced “at the top level of AC2 or
higher, the speaker is drawing the listener’s attention to the topic as an effect of the
pragmatic accent (PA).” The AC1 in “Che titoli cruscanti” falls on “-can” and is set on a
D4, a fourth above the AC2, which is on an A3. That setting directs all the attention of
The second phrase of Leporello presents the same kind of contour and “pragmatic
accent”—AC1 is higher than the AC2 by a minor third, from F#3 to A3. The same kind
which is achieved by the same kind of pitch contour that Don Giovanni has just used in
the previous phrase. Of the last three units, one follows an unusual intonation profile
and the other two shape around the point of focus being highlighted. In “calmate quella
collera,” in order to keep attention on the verb, which embodies the order given by Don
Giovanni to Donna Elvira, Mozart respects the usual hierarchy of pitches. But the last
15
“Cruscanti” literally means “pure Tuscan.” During this period, Tuscan culture and society were
considered the richest and most refined of all the regions of Italy. Even today the Tuscan dialect is
considered the purest Italian. But in this case, Leporello is using it to make a sarcastic contrast with the
insults that D. Elvira has been enumerating, which are not very refined.
two phrases are conceived as intonational units of the same nature as the previous ones,
The first two examples analyzed were in Italian. At this point of the argument, it
nature of English prosody differs greatly from that of any other language, a general explanation
of usual intonational tendencies will clarify the subsequent analysis of the Alto’s
recitative n. 8 from George Frideric Handel’s oratorio The Messiah. The recitative of the oratorio
also attends closely to the intonational nature of its source text.
organized into two kinds of rhythmic units: the Narrow Rhythm Unit, which consists of a
syllables tend to be pronounced rapidly moving toward the subsequent stressed word. In
contrast, in the Narrow Rhythm Unit, the duration of each unstressed syllable tends to
be inversely proportional to the number of them in that unit, giving the impression of
isochrony. Tonal Units group Anacrusis with the preceding Narrow Rhythm Unit. At a
higher structural level, these Tonal Units are grouped into Intonation Units.
According to David Crystal, most of the intonation units consist of five to eight
words.18 Utterances longer than this are usually broken into two or more intonation units.
16
Proclitic: adj. In Greek Gram., used of a monosyllabic word that is so closely attached in pronunciation
to the following word as to have no accent of its own; hence, generally, used of a word in any language,
which in pronunciation is attached to the following stressed word, as in an ounce, as soon, at home, for
nobody, to comprehend. (Oxford English Dictionary Online, second edition, 1989).
17
Hirst, “Intonation in British English,” p. 58.
18
David Crystal, Prosodic Systems and Intonation in English. (London: Cambridge University Press,
1969).
Although pragmatic or phonological reasons dominate the final decision, syntactic
The final accent of the Intonation Unit is usually referred to as the “nucleus.” In
defined assertions, the stressed syllables form a descending scale until the last stressed
syllable when the pitch of the voice falls abruptly to a lower level. The intermediate
unstressed pitches may actually remain more or less at the same level with some
fluctuations and do not necessarily descend toward the last stressed syllable. As Hirst
comments: “In fairly slow deliberate speech, the ‘down stepping’ effect can be quite
striking.”19 However, in spontaneous speech, the pitch drop is reduced to the point of
almost being imperceptible, “giving rise to the ‘hat’ or ‘bridge’ type of pattern of
different languages,” such as English.20 A second kind of tune, besides this descending or
“hat” one, is used for statements with implications, “Yes-No” questions, requests and
incomplete utterances. In these cases, the descending or “hat” shape is used until the
nucleus (last stressed syllable) is reached. Usually this last accent is on a low note and the
syllables that follow rise from then on. The rise of the last pitch is not a way of
transforming a statement into a syntactic question, “but rather a way of indicating that a
“Incompleteness” in a sentence takes the form of rising nuclear tones; and those involve a
pragmatic evocative value. In contrast, “falling nuclear tones have proclamatory value.”22
19
Ibid., p. 61.
20
Ibid., p. 62.
21
Ibid., p. 65.
22
Ibid., p. 66.
Both of these contours, descending and ascending ones, indicate to the listener how the
In emphatic statements, the final nuclear pitch accent rises to a higher level than
usual with respect to the preceding unstressed syllable. It is common practice to switch
the first accent of the intonation unit for a low accent coming from high pitch unstressed
syllables to reinforce the later high final accent and subsequent falling pitch.
The Alto’s recitative n. 8 from Handel’s oratorio The Messiah uses biblical text
from Isaiah vii: 14 and Matt. i: 23. This text is already divided into its Intonation Units.
TABLE 1.3: Intoneme analysis of Alto’s recitative n. 8 from Handel’s The Messiah
EX. 1.3: Alto’s recitative n. 8 from Handel’s The Messiah
The phrasing of Handel’s settings falls into the division of the intonation units
that the text could have in normal speech, except for the explicit separation of
“Emmanuel” into an independent unit. This gesture prepares the solemn delivery of the
sacred name of the “son of God,” while also creating expectancy. All nuclei, as well as
most of the first accents of these intonation units, fall on strong beats of the 4/4 meter in
which this recitative evolves. The only exceptions are the first accents on “bear” and
“call.” This may be explained as a way of de-emphasizing the separation of the utterances
that contain them in three independent intonation units, focusing instead on the continuity
of the syntactic unit starting with the subject/noun phrase “a virgin” to the last verbal
unit, “and call his name…” The simple chord accompaniment also connects these four
intonation units by holding a double bass pedal on the notes D3 and D2 for three and a
half measures.
All the units end with an ascending interval toward the last note, which carries the
main and last accent, the nucleus. Since these are incomplete statements, the ascending
interval in each of these units also creates a sense of continuity. As an exception, the
abrupt drop of the pitch a fifth below (from A4 to D4) of “his name” may be explained as
preparation for the first note of “Emmanuel,” a B3. The first two syllables of the name,
“Em” and “man,” open with a solemn ascending perfect fourth interval. The second
syllable “man” carries the main accent of the unit, the accent of the nucleus. The closing
intonation unit, “God with us,” is the definite and final statement of this recitative, and as
such, it ends with a conclusive descending interval, a perfect fourth. The syllabic setting
of this recitative contrasts especially with the overvocalization of long melismas in the alto
The nineteenth-century Lied serves as an example of the kind of “poetic mode” in which
the listener negotiates through the song a balanced perception between the fragmented
semantic content and the sonic components of the lyrics. Nineteenth-century Lied
composers engaged in a high-level mimetic relationship with the text. In their attempts to
offer a reading of the poetic text by enhancing semantic meaning and poetic structure,
Lied composers departed from the strict prosodic features of the text and offered an
interesting but distracting musical setting. This disengagement from the intonational
properties of the poetic text emphasizes the musical-poetic aspects of its words and negates
the possibility of understanding the discourse in a linear way—as close as any poetic text
struggled to make their musical settings a natural extension of the text. Johann Reichardt
(1752-1814) alleged that his melodies sprang automatically from repeated readings of the
poem and they were so closely interwoven with the text that they spoke and sang
School—Carl Zelter and Johann Schulz, committed to the least musical intervention
possible and put their efforts into letting the poetic text speak for itself as conceived by its
author.
The primacy of text over music did not last long. The next generation of Lied
new perspective on the poem. By doing this, they offered an alternative reading of the
text. This new reading took the form of particular musical phrasings of the text and
the cohesion of the poem but at the same time highlighted certain text fragments.
However, beyond the difference in approaches, the end result in both cases (earlier and
later Lied composers) is settings that are perceived in a balanced “poetic mode.” The
listener receives a re-elaborated version of the poetic text, stretched, fragmented and
webbed into melodic and harmonic treatments. Thus, new poetic images, rhymes, words,
23
J.F. Reichardt, cited by Jack M. Stein, Poem and Music in the German Lied from Gluck to Hugo Wolf
(Cambridge, Mass.: Harvard University Press, 1971), p. 34.
sounds that may previously have gone unnoticed are brought to the attention of the listener,
Examining one particular poem, which was set several times by different
composers during the nineteenth century, will allow us to contrast the change in
compositional approaches and in aesthetic conceptions of the relationship between poetic text and music.
Furthermore, this analysis will allow us to understand the compositional decisions that
highlight certain fragments or words of the poetic text and understate others.
From 1795 to 1907, an extensive number of songs were composed using the
poems from Johann Wolfgang von Goethe’s novel Wilhelm Meister, and more than a
hundred of these were settings of “Kennst du das Land?” Of special interest are the
settings of “Kennst du das Land?” by Johann Reichardt (1795), Carl Zelter (1795),
Ludwig van Beethoven (1809), Franz Schubert (1815), and Robert Schumann (1849).
Reichardt and Zelter intended to make their musical settings a natural extension of
the text and limit their musical intervention as much as possible. However, their settings
are songful renditions far from any speech quality of the text. Their melodies are a
genuine example of the simplicity and modesty of the Volkstümlichkeit style—in the
the obvious prosodic characteristics of the text, but followed a strict musical logic. The
absence of text reordering and of piano interludes aims to limit the disruption of the
narrative flow. Although the repetitious nature of their strophic settings and simple chord
24
Although until the late eighteenth century folklore was thought to belong only to the peasants and
“assigned a low cultural or intellectual prestige,” in the nineteenth century folklore “was seen as
embodying the essential authentic wisdom of a language community or nation.” (Taruskin, The Oxford
History of Western Music, vol. 3, p. 122).
accompaniment allows listeners to relax their attention to the purely musical elements of
the song, they still apprehend the text in the “poetic mode” of listening. The listeners hear
the original sound patterns of the poem mounted over the “songfulness” of the sustained
colorful accompanying textures and text modification. They built musical forms that
direct the attention of the listener to specific words or phrases in their text. Those may
synthesize ideas relevant to the reading that the composer makes of the poem. They
highlight these text fragments by creating harmonic tension, announcing or delaying the
phrase with piano interludes, repeating it several times, halting the flow of the piece
rhythmically, etc.
A brief account of the poem’s place in the novel and of Mignon’s
backstory helps to explain some musical decisions made in the settings by
the composers under study. “Kennst du das Land?” is a poem inserted in Wilhelm
Meisters Lehrjahre (Book Three, Chapter One). The poem is actually a song that the
character Mignon sings. She is supposedly an orphan girl, whom Wilhelm rescued from
the abusive master of a circus troupe. She has neither a home nor parents that are known
at that point in the novel. Mignon’s traumatic kidnapping made her become completely
averse to remembering her past in detail. When she sings this song, she is already living
with Wilhelm, and although she addresses him as “father,” a secret passion for him as a
man has started to grow in her. At the beginning of this first chapter in the third book,
when Wilhelm hears her singing from his room, accompanying herself on a zither, he
becomes interested in the lyrics of the song. He asks Mignon to repeat the song; he
writes it down and translates it; however, as Goethe describes in the novel:
He found, however, that he could not even approximate the originality of the phrases, and
the childlike innocence of the style was lost when the broken language was smoothed
over and the disconnection removed. The charm of the melody was also quite unique. She
intoned each verse with a certain solemn grandeur, as if she were drawing attention to
something unusual and imparting something of importance. When she reached the third
line, the melody became more somber; the words “Do you know it, indeed?” were given
weightiness and mystery, the “There!, there” was suffused with longing, and she
modified the phrase “Let us go” each time it was repeated, so that one time it was
entreating and urging, the next time pressing and full of promise.25
This description was closely followed by some of the composers, like Zelter, when
different perspective of her lost paradise. The first stanza is the description of an earthly
paradise to which she urges her “beloved” to take her. In the second stanza, life is
gone; it is an architectural paradise, where everything is as glittery and cold as the statues
that pronounce the central question of the song, which reveals Mignon’s suffering (“Was
hat man dir getan?”). Then the term “beloved” is replaced by “protector.” The third
stanza is the description of nature—similar to the first stanza—but this time it is a misty
landscape where everything is confusing and intimidating; therefore, her protector now
becomes her father. Thus, her lost paradise has a warm and voluptuous side, a cold and
glittery side, and, finally, a confusing, misty and intimidating side. For that reason,
although she wishes to go back to her homeland some day, she wants to go under the
25
Johann W. von Goethe, Wilhelm Meister’s Apprenticeship. Edited and translated by E. A. Blackall.
(Suhrkamp Publishers: New York, 1989)
Reichardt, as well as Zelter, gave preference to a strophic setting with regular
phrasing. They sought a style of setting whose clarity was close to the dignified
simplicity that they admired in folk art. As Carl Dahlhaus remarks, “Any composer who
tried to recapture the natural state of folksong had to conceal the exertions of his art.”26
These ideals of the “Second Berlin School” are echoed by Reichardt who states, “For the
artist the supreme art lies not in the ignorance of his art but in its renunciation” (Geist des
accomplish clarity and simplicity in his compositions is in accordance with Goethe’s own
opinion about the degree of interference of the composer’s musical creativity with the
poem. Goethe uses the term “false participation” to describe any musical response to the
poem’s meaning beyond the strict accompaniment of the declamation of the poem. The
composer surrenders his creative space to the art of the poem. The composer’s duty is
only to create the appropriate musical ambiance, which is drawn from the general
meaning of the poem. By creating this musically suggestive context, the composer helps
the audience to appreciate the richness of the meaningful inflections of the text itself.
The thing to do is to place the auditor in the mood that the poem suggests, letting the
imagination then create its own figures at the instance of the text, without his knowing
anything of the how of the process...To paint tone with tones, to thunder, crash, paddle
and plash, is detestable.28
While in “Kennst du das Land?” Zelter sets the mood of the poem as Goethe
requires, Reichardt subscribes to a simpler and less intrusive folk-like song
26
Carl Dahlhaus, Nineteenth-Century Music. Translated by J. Bradford Robinson. (University of
California Press: Berkeley, Los Angeles, 1989).
27
Ibid., p. 109.
28
Carl Friedrich Zelter, J. W. Goethe: Briefwechsel (Leipzig, 1987), p. 216. Translation of the quote by
Christopher Gibbs.
style. His setting is strictly strophic. In the accompaniment, he harmonizes with simple
chord support and doubles the melody throughout the song. Melodic phrases are
completely regular and the setting of the words is mainly syllabic. The song has only a
short modulation to its dominant and has no instrumental interludes. Zelter’s song is a
slightly modified strophic setting; the second and third strophes show a slight harmonic
modification in the piano and the voice between mm. 11 and 15. This song presents, as in
Zelter gives careful indications of the expression for each strophe. These
Goethe’s novel. Thus, Zelter asks for a Pathetisch (pathetic) mood mit Anmut (with
opening of Mignon’s song. In the third line, Zelter asks for a more getragen (hesitating)
mood as Goethe talks of the melody being more somber. For the words “Kennst du es
wohl?” (“Do you know it, indeed?”), he asks for an anwachsend (crescendo in emotional
variation in the expressive mood of this last refrain is also closely followed by Zelter.
Even in Reichardt’s and Zelter’s settings, where most of the obtrusive musical
devices are minimized, the listeners will tend to perceive the poetic text in a “poetic
mode.” The intrinsic musicality of the poem itself mounted over a melody—although
EX. 1.4: Reichardt, Kennst du das Land, 1st stanza.
EX. 1.5: Zelter’s Kennst du das Land, mm.1-27.
EX. 1.5 (cont.): Zelter’s Kennst du das Land, mm.28-53.
Convinced of the power of music to communicate feelings and sensations
otherwise ungraspable in words, the next generation of Lied composers broke free from
the constraints on word setting laid down by the “Second Berlin School” and
unleashed their musical voices into their compositions. They ventured into busier textures
and more elaborate melodies, which certainly fragmented the poetic discourse further. At
the same time as they emphasized musical qualities already existent in the poem, they
focused their expressive forces on the strict musical elements of the setting. Almost as if
they were conscious of the unavoidable distortive effect that their settings inflicted upon
the poem, in the music of the settings they offered their listeners a synthesis of the
feelings at play in the poem: highlighting specific words with repetition, rhythmic and
Edward T. Cone says that a composer cannot set a poem with all its connotations;
some aspects of the poem will always be left out. In a case where the composer wants to
consider all the possible readings of the poem, he should include every point of view
translated into music in order to give the total meaning of the poem.29 Otherwise, what
results is a new creation that does not show the poet’s persona but, rather, the
composer’s. Inevitably, this will be a new reading from the composer’s point of view.
His or her particular setting will highlight certain words and sounds, which will combine
in a completely new set of images associated with different moods and ideas.
aesthetic change in the following terms: the composer’s voice increases its active
participation in the musical result of the song. As Taruskin comments, “The basic vocal
29
Edward T. Cone. “Some Thoughts on ‘Erlkönig.’” In The Composer’s Voice. (University of California
Press: Berkeley, Los Angeles, 1974), p. 19.
idiom is always that of Volksweise (folk tune), the ‘natural’ music representing the ‘We,’
moments allow the ‘I’ to intrude.”30 Both Beethoven’s and Schubert’s voices translate
into their music the anxiety and urgency of Mignon’s request. They focus especially on
those aspects of her concerns that they wish to emphasize in the poem. All of
these become melodic, rhythmic, harmonic and textural effects in the hands of Beethoven
and Schubert.
Both composers set their song in strophic form with minor variations in the third
stanza. Both songs are in the key of A major and follow a similar tonal plan. Within this
formal frame, the contrasting textures, change of dynamics and rhythmic acceleration of
the second section of each stanza (“Dahin! Dahin!”) show more than a mere transcription
of the song that Mignon could have actually sung. Through their particular musical
of the character and her anxiety, once confronted with her critical personal situation: an
exposition of the unconscious feelings of Mignon told in the musical language of these
composers.
30
Earlier in his Chapter 35 “Volkstümlichkeit,” Taruskin explains the kinds of negotiations established
between the “I” and the “We” in previous nineteenth-century Lied, which crystallized the Volkstümlichkeit
ideals. He comments about the “impossibility of a particular ‘I’ without a particular ‘We,’” which may be
explained by a reformulated idea of cultural relativism, “the irreducible human difference”: “A human was
human only in the society of other humans, and the natural definer of societies was language. Since there
could be no thought without language, it followed that human thought, too, was a social or community
product…” In this manner extending language as expressive of all cultural aspects of a society, the concept
of a collective spirit idiosyncratic to each particular society arises. And this one was found in folklore
manifestations. So, the Lied was mainly concerned with the faithful portrayal of that “We.” (Taruskin, The
Oxford History of Western Music, vol. 3, pp. 120-123).
EX. 1.6: Beethoven’s Kennst du das Land, mm.1-17.
EX. 1.6 (cont.): Beethoven’s Kennst du das Land, mm.18-43.
In Beethoven’s Lied, the gay anxiety of this adolescent is manifested in a playful
Più mosso in 6/8. This section functions as an answer to Mignon’s rhetorical question
“Kennst du es wohl?”, which together with the other “Kennst du?” questions forms
the structural pillars of Beethoven’s song. All these questions are set to the same
rhythmic pattern:
This pattern contrasts with the rest of the musical phrases through its use of one long
rhythmic value opening and another closing it: a quarter note at the beginning of the
phrase and a dotted quarter, eventually extended, at the end. This slows down
the flow of the song and, as a consequence, highlights these questions in the
poem. The last “Kennst du es wohl?” of each stanza is also preceded by a short piano
interlude, which anticipates instrumentally, with the same melody and harmony, the question
that will arise afterwards. Thus, the attention of the listener is drawn inevitably towards
those questions.
Schubert also directs the flow of the first part of each strophe to the same question
(“Kennst du es wohl?”), this time set in recitative style. This offers a quasi-speech effect
in the middle of a completely “songful” melody. The expectation is built through the two
preceding measures, which serve as preparation to the D# of the French augmented sixth
chord sustained under the question, which in the next measure resolves on the dominant
chord, E Major. These procedures signal the arrival of a phrase to which the composer
wants the listener to pay special attention, “Kennst du es wohl?” (“Do you know it,
indeed?”), which at the same time prolongs the expectation for an answer, musically as
well as lyrically.
In Beethoven’s setting, the answer is found in the next measure; the first “Dahin”
resolves on the tonic of the original key, A Major—after a short deviation to C Major in
the preceding section. Schubert delays the answer by displacing the clear resolution until
the end of the strophe—twenty-two measures later. The harmonic tension built measure
after measure while waiting for the final cadence parallels the frantic searching of
Mignon for the realization of her dream, to go back to her homeland, which always seems
far from concretion. While Beethoven portrays a calmer attitude on Mignon’s part—a
into music with unresolved harmonies. Furthermore, the frantic driving flow of triplets
from m. 8 of the song does not stop until the end of his “Etwas geschwinder” (“A little
faster”) section. The only moment when the triplet texture is suspended is under the
question “Kennst du es wohl?” Finally, the alteration of the text, especially the desperate
repetition of “Dahin,” emphasizes her emotional state. This whole delayed answer section
lasts twenty-two measures in Schubert and fifteen in Beethoven’s setting—and only six
EX. 1.8: Schubert’s Kennst du das Land, mm.1-18.
EX. 1.8 (cont.): Schubert’s Kennst du das Land, mm.19-40.
More than any of the other settings, Schumann’s setting of “Kennst du das Land?”
achieves the new synthesis of the “I” and the “We” of Lied in romantic terms. He
approaches this poem with a strict strophic form, which contains a very slight
modification in the interlude between the second and the third stanzas—a deceptive
Mignon’s answer to the preceding question of the statues, “Was hat man dir, du armes
Kind, gethan?” (“Poor child, what have they done to you?”)—this is the central question
of the poem in meaning and placement. In his setting of Mignon’s Lied, Schumann seems
to have the same intentions as the “Second Berlin School” composers. His choice of a
strophic setting and his own indication at the beginning of the song of “Langsam, die
beiden letzten Verse mit gesteigertem Ausdruck” (“Slowly, the last two verses with
heightened expression”) give that impression. But this is not the case. The melody
sung by Mignon is neither the ideal Volksweise (folk tune) nor the simple and transparent
melody of a fragile adolescent. The accompaniment, with its thick harmony, suspensions,
appoggiaturas and deceptive cadences, builds a musical fabric that neither serves as an
unobtrusive support of the text nor represents the simplicity of Mignon’s zither playing.
Also, the relation between accompaniment and vocal melody with its displaced doubling
in the piano, so characteristic of Schumann’s songs, is far from the clear chord
accompaniment and strict doubling of the melody necessary for the delivery of the “only
possible reading of the poem,” according to the “Second Berlin School.” All these
emotional state, or his interpretation of it. In the song, Mignon talks with the voice of the
composer.
The difference from Beethoven’s and Schubert’s settings, which are structured
around the “Kennst du?” rhetorical questions of Mignon, is that Schumann seems to drive
the flow of the song to each of the three addressing names that Mignon uses for Wilhelm.
The thick web of triplets in the two hands of the piano that starts in m. 10 does not stop
until m. 25, coinciding with “o mein Geliebter” (“o my beloved”). The same procedure
is repeated in the following two stanzas where the flow of the triplets ceases upon
arriving at “o mein Beschützer” (“o my protector”) the first time and “o mein Vater” (“o
my father”) the second time. Thus, the flow of the song seems to be organized around
these climactic points, which reflect one of Mignon’s major concerns: her relationship
with Wilhelm. The ambiguity of this relationship drives her to ask herself three reflexive
questions: Are you my lover? Are you my protector? Are you my father?
pianistic introduction, which hints at the chromatic world that he will develop later on in
the piece. This introduction will become the interlude played in between stanzas. When
the voice starts, this pianistic treatment gives way to a more open texture, which allows
the text to come through and reach the listener in a relatively clear manner. But once the
audience is introduced to the landscape of each stanza, the thick web of triplets takes over,
starting in m. 10, with its displaced doubling and complex chromatic language until
EX. 1.9: Schumann’s Kennst du das Land, mm. 1-20.
EX. 1.9 (cont.): Schumann’s Kennst du das Land, mm. 21-41.
By use of the described musical procedures, these Lied composers permeated the
poem with their musical voices, suggesting, through the accompaniment and its relation
with the vocal line, things that are not said in the words of the poem. They relied on the
music for this task because they conceived music as a language equal to literature.
Music was capable of transmitting sensations and feelings that the audience would
appreciate only through the direct experience of listening—a type of physical connection.
Rosen says that for the nineteenth-century composers, the word is no longer
embellished and imitated by the music.31 Now, the music becomes a language by itself, a
separate symbolic universe with its own logic and communicative-expressive power.
However, although music transmits feelings and sensations captured by the reading of the
composer, it only represents their form and not their content. The listener feels the
movement and impulses of the music conveying those feelings as a physically empty
message, which only his own imagination will fill with a specific content. In this way,
the listener will capture the structure of the composer’s personal reading and complete
influence on the concrete musical manifestation of the special poetic features of a poem.
And the features that attracted Romantic composers dwelt at a structural level of the
poem. Since they were mainly concerned with content resulting from the elaboration of
several internal layers of meaning of the text, they molded their musical setting to portray
these emotional states or concepts. Neither narration of the dramatic events nor speech
qualities of the text were major concerns at this point in music history. The Lied was
31
Charles Rosen, The Romantic Generation (Cambridge, Massachusetts: Harvard University Press, 1995)
p. 68.
mainly music. The music of its poetic text ran parallel to the highlighted text fragments
and, integrated into those purely musical elements of the song, engaged the listener in a
III.
FURTHER EXPLORATIONS:
composers of the second half of the twentieth century and the beginning of the twenty-first
century, such as Meredith Monk and Luciano Berio, who have produced music that
reflects their concerns about the semiotics of paralingual vocal gestures and intonation. In
exploring these issues, they created music that is the practical representation of the way
we listen in the poetic mode taken to an extreme—a “poetic mode” with a special
emphasis on the “auditory mode.” By employing the sonic elements of language as the
structural components in their pieces, they strived to offer their audiences a direct
experience of the struggle and negotiations that text undergoes once set into music. They
wanted people to attend to those paralingual nuances that we usually disregard when
listening to speech and disregard even more when speech is set into music. At their hands
Monk and Berio share concerns and interests in exploring the tensions between
text and music. They observe the fragmentation and deformation that any text set into
between word and sound, poetry and music...,” the function of which “would not be the
contrasting or mixing up of two separate expressive systems but rather the creation of
complete continuity, so that the shift from one to the other would be imperceptible,
without drawing attention to the difference between a logical-semantic mode of
apprehension (as adopted for the spoken message) and a musical mode...”1 This kind of
word and sound relation would activate in the listeners the “poetic mode” of perception,
Monk and Berio are not the only composers who have introduced new
compositional tendencies in the vocal music realm. Composers such as Pierre Boulez and
Karlheinz Stockhausen have also promoted rethinking this relationship with their writings
and compositions.2 On the one hand, Boulez developed his concept of “centre and
absence” in which the text remains at the notional center of the composition process. On
two media. As explained later in this chapter, Berio conceives of a similar fluid
It is also relevant to mention two developments of the second half of the twentieth
movements taking place during the first two decades of the twentieth century. The former
has an earlier direct predecessor in the Dada poetry of Kurt Schwitters, Hugo Ball, and
Tristan Tzara, and the latter in the Italian Futurist experiments of Russolo and Marinetti.
The “concrete poetry” movement aims to create a new artistic reality. Without the
1
Luciano Berio, “Poesia e musica un’esperienza,” in Incontri Musicali 3 (1959), 99.
2
An explanation of Boulez’s concept of “centre and absence” may be found in Orientations: Collected
Writings (1986). Stockhausen explains his ideas about text and music in his paper “Speech and Music,” read
at the Darmstadt Summer School in 1959 and later published in Die Reihe. Some of the most influential
representation of any external reality. Thus, its focus moves towards the phonetic sounds
of words, shapes of letters, breaking of the formal semantic units and punctuation rules,
etc. “Text-sound compositions” renounce the optic dimension and concentrate on the
relationship of sound and meaning. These works exist only in recording format (sound
pieces without a written version). This branch of electro-acoustic music holds among its
more important examples compositions such as Steve Reich’s Come Out (1966), Nono’s
interest to Monk and Berio, but unlike in “text-sound compositions,” these composers
explore these concepts without any electronic interventions. Their pieces may be
reproduced in live performance by one or several singers without any processing of their
voices. The fact that the full palette of vocal sounds employed by these composers
originates acoustically from the natural resources of the human voice is of special interest
to this dissertation.
As mentioned before, both Berio and Stockhausen have proposed to soften the
continuum is created when speech approaches music and music approaches speech to the
point of the dissolution of the boundaries of sound and meaning. Berio considers the first
and primordial step in creating this “word-sound continuum” to be the dissolution of the
natural fragmentation that any text undergoes when set to music—by breaking words into
their phonetic elements, stretching them, masking their enunciation, and mixing them
pieces in the realm of explorations of the tensions between language and music were Boulez’s Pli selon pli
(....), Stockhausen’s Gesang der Jünglinge (1955-6) and Momente (1962-4).
with paralingual sounds—to submerge the listener in the deepest nuances of language and
the human phonatory apparatus. In this way, he intends to dissect the elements of
language and observe their relations and tensions from inside out, while at the same time
revealing the communicative power of the sonic aspects of language beyond the
semantic-linguistic content of the syntactic units and system. He makes use of poems,
political speeches, academic texts, literary narrations and other kinds of discursive texts
in their entirety or in fragments. In A-Ronne, as in many of his previous and later pieces,
Berio recreates the semiotic structural manifestation of the agony of language when set to
In A-Ronne, as well as some of his early vocal pieces, Berio extracts the purely
musical elements from his literary textual sources and, as Osmond-Smith comments, uses
them “to explore the borderline where sound as the bearer of linguistic sense dissolves
into sound as the bearer of musical meaning: a territory that…he was to make very much
his own.”3 The words’ musical elements become structural components in his pieces.
Thus, Osmond-Smith describes the process of creating tension between the sonic
…he then proceeds to work in tension with it, juxtaposing and superposing phonetic
elements so as to produce consonant groupings that the human voice would normally find
hard to articulate in rapid succession (such as voiced and plosives)…Out of this
impossible vocalism, comprehensible speech…momentarily emerges, only to be
engulfed: relative comprehensibility has become a compositional parameter to be handled
in much the same way as textural density or, within a pitched context, harmonic
density…It may be achieved by the fragmentation of originally linear texts…by
superposition of texts…by dissolution of texts into their component phonetic materials, or
more usually by a combination of these.4
3
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1991), 62.
4
Ibid., 62-63.
These same kinds of procedures are explored in A-Ronne; only in this piece Berio uses
Berio places a magnifier on the sonic transitions of language but always in the syntactic
context of a real text, which could be stretched and deformed beyond recognition, Monk
steps out of the syntactic/linguistic frame and treats the phonemes as pure sounds. No
linguistic text of any kind precedes her pieces. Although she shares with Berio an interest
in the communicative power of the sonic aspects of vocal sounds, she specifically focuses
on their emotional communicative potential. Monk conceives of the voice as a tool for
...Feelings that we have no words for.”5 In this way she exploits the potential of the
For Monk the voice is in complete connection with the body. At the same time,
the physicality of the voice is one of her fundamental concerns: “The body of the voice/
the voice of the body.”6 As a consequence, in the mid-sixties, she began a methodic
exploration of the voice as an instrument that could develop its own idiosyncratic
vocabulary: “I realized that the voice could be as fluid as the spine, that it could have the
flexibility and range of the body.”7 She immersed herself in the study of vocal color,
voice placement and nuances in the articulatory/phonatory apparatus and applied her
discoveries to controlled explorations of vocal pitch, volume, speed, texture, timbre,
5
Meredith Monk, “Notes on the Voice,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The
Johns Hopkins University Press, 1997), 56.
6
Ibid.
7
K. Robert Schwarz, Minimalists (London: Phaidon Press Limited, 1996), 189.
Berio does not ignore the physicality of the voice either and exposes the gestures
of vocal sound production through his music. Although in A-Ronne Berio makes
extensive use of paralingual sounds, such as breathing, sighs and other vocal noises, as
part of the musical process, he explains that he does not conceive of them as mere sound
I am not interested in sound by itself—and even less in sound effects, whether of vocal or
instrumental origin. I work with words because I find new meaning in them by analyzing
them acoustically and musically, I rediscover the word. As far as breathing and sighing
are concerned, these are not effects but vocal gestures which also carry a meaning; they
must be considered and perceived in their proper context.8
By exploring in detail human vocal nuances, both Monk and Berio create proximity with
their audience. Listeners may directly relate to these tangible sounds because they are
produced by the same gestures that any human being uses in everyday normal speech.
the performer, Berio’s and Monk’s music invites its audience to experience it physically.
Richard Middleton comments that listeners “… identify with the motor structure,
participating in the gestural patterns, either vicariously, or even physically, through dance
that the movements that players perform while playing their instruments directly affect
into music.”10 In this manner, the listener gains a firsthand experience of the gestures that
8
Rossana Dalmonte and Bálint András Varga, Two Interviews/Luciano Berio, trans. David Osmond-Smith
(New York: M. Boyars, 1985), 141.
9
Richard Middleton, Studying Popular Music (Philadelphia: Open University Press, 1990), 243.
10
John Baily, “Movement patterns in playing the Herati dutar” as quoted in R. Middleton, Studying
Popular Music (Philadelphia: Open University Press, 1990), 243.
Monk comments that “By working with your own instrument, you actually come
across gestures that are trans-cultural, and in certain ways you become part of the world
vocal family.”11 Her vocabulary involves human sounds that many men and women
could find natural and organic; thus, her music triggers a close emotional connection
between her audiences and her musical idiom. One could feel that those sounds are part
of our essential primal vocabulary: pre-lingual and, at the same time, beyond language.
sound. In this way, it opens to the audience the wide spectrum of potential meanings that
any sound usually evokes. The baggage of meanings that any human vocal sound carries
is not always precise and easy to define, allowing all sorts of associations. In a linguistic
communicative setting where human sound is the carrier of language, these multi-
associative meanings are usually overlooked. Monk wants to bring to her audience an
of Monk’s pieces are wordless, as she has restricted herself to moaning, shouting,
sighing, breathing, whispering, trilling, sliding, doing glottal breaks, and chanting on
nonsense syllables. This is a conscious aesthetic decision, since she starts from the
premise that it is almost impossible to comprehend text put into music, or at least to
our attention directly to the sound of the voice without any obstacle. Even when she
composes pieces such as Three Heavens and Hells (1992), where she uses exclusively and
11
Schwarz, Minimalists, 190.
exactly the four words of the title as the text of the piece, she almost strips the words
along the twenty-one minute and ten second duration of the piece produces a progressive
fade away of meaning until these words become empty vessels. These words keep their
pragmatic sense but not their semantic meaning. The effect is finally similar to the pieces
in which language is completely absent; the audience turns its attention toward the vast
In regard to her piece Atlas, Monk explains that it was meant to bypass discursive
thought to “go directly to the heart.”12 She argues that in any case, she usually is not able
to understand a word in opera. Starting from the idea of language “as a screen in front
of the emotion and the action,” she prefers a direct communication that “bypasses that
step so that you’re really dealing with a very primary and direct emotion.”13
songs. They explore the full potential of the “songfulness” of the voice and the pure
musicality of the human vocal gestures. As said before, Monk’s piece is a self-conscious
representation of the way we listen in the “poetic mode” with emphasis on the “auditory”
elements of perception. She chooses to make music from the stripped musical elements of
the voice that we usually unconsciously apprehend and to which we emotionally connect
when listening to any other vocal piece—whether popular song or “art” song.
12
Ibid., 191.
13
William Duckworth, Talking Music (New York: Simon & Schuster Macmillan, 1995), 359.
In an interview given in 1996, Monk commented to the ethnomusicologist
David Gere that these duets were conceived as processes of nature. Each of them
explores only a single vocal quality. Monk preferred simplicity over compositional
fanciness; she says: “I was thinking: Why don’t you just take the purest color in each
song and only work with that. Like one brush stroke or a haiku.”14 The creation of a
particular character in each song is central to Monk. In Volcano Songs as in other pieces,
she looks for “the voice” of each piece, the one that creates a world in itself and is not
This kind of restrained canvas that Monk imposes on herself in each Volcano Song is
not unusual for her music in general and appears to be an intentional procedure in other
pieces like Vessel (1971). This restraint manifests itself in two aspects of her music. First,
raw materials tend to be simple, but her controlled delivery—a certain solemnity in her
and stature. Second, she is interested in slowing down musical processes to get a slice of
them. She wants the audience to taste every single moment. The same detailed delivery
that Marcia Siegel and Kenneth Bernard have observed in her theatrical and dance
movements is present in her music.15 Most of her pieces are constructed as a succession
of single episodes, repetitive sequences of slow, sustained notes
through my direct experience in workshops held by The Meredith Monk Ensemble. They
guided participants through similar processes that they established with Monk during the
14
Meredith Monk, interview by David Gere, Volcano Songs (CD insert), ECM, June 19, 1996.
creative process of some of her ensemble pieces. She usually proposes materials and
processes, and through improvisatory techniques, they mold those materials, each in their
own idiosyncratic ways. She wants to hear, through their vocal sounds, their backgrounds,
their experiences, their personalities, their humanity, their imperfections. After long
sessions of experimentation, a final version is put together and fixed. For the most part,
there is a preference for the oral transmission of her pieces—although these versions
finally do get scored, which was the way in which her ensemble members taught
In terms of the musical processes and materials that Monk employs in her pieces,
the musical structures show a predominant horizontal conception: short cells that develop
linearly, “plain chant” style or “folk-flavor” simple melodies that succeed one another.
These sometimes undergo slow processes of gradual transformation. At other times, each
which one theme fades out and the other slips in. Monk calls this process “wash,” and
this is one of the several musical processes that are directly associated with cinematic
editing techniques. Other musical procedures, such as canonic textures, are rooted purely
which she applies in her explorations across media: music, dance, theater, and video. In
terms of the structure of her staged pieces, Monk’s preference for non-narrative models
causes her to choose a more fragmented poetic style in which things happen one at a
15
Marcia B. Siegel, “Virgin Vessel” and Kenneth Bernard, “Observations On Recent Ruins” in Meredith
Monk, ed. D. Jowitt.
time, and it is not until the end that the spectators are able to intermingle the separate
Monk walks to a row of three rectangles that lie on the floor and, in a ritualistic manner,
removes the black pieces of cloth that cover these rectangles.16 After each uncovering
action, she lies in a crumpled position on each pallet, while a bright light flashes on and
off. Once she stands up, the light turns the pallet a luminous green, revealing on it a
dark imprint left by her body. Then she proceeds to the next rectangle to perform the
same task and the previous one fades away. This seemingly magical theatrical effect
Pompeii or Hiroshima.17
The volcanic theme brings in the motive of “transformation” that lies under all
these songs. According to the composer, although volcanic activity implies potential for
destruction, it has also been instrumental in the creation of the Earth. Furthermore,
“volcanic land is some of the most fertile land on earth.”18 The tension between death and
destruction, on one side, and rebirth and growth, on the other, implies a kind of cyclic transformation, which
translates into musical processes of transformation of the vocal textures and themes that
are used throughout the Volcano Songs: Duets: morphic overlappings between materials.
The first song of the cycle, called “Walking Song,” explores the opposition of
pure vowels against a backdrop of voiceless, breathy vocal sounds. This duet
16
Deborah Jowitt, “Introduction,” in Meredith Monk, ed. Deborah Jowitt (Baltimore, Maryland: The Johns Hopkins
University Press, 1997).
17
Ibid., 15.
18
Meredith Monk, interview by David Gere.
concentrates on the vocal color of [a] and [o] connected once in a while by semi-vowel
consonants [n] and [l], and glide [j]. The piece evolves through a restless motif of what
could be called “a galloping rhythmic” nature: a six-eight meter made of a quarter note
in conjunct intervals around an F# minor tonal center. Despite this constrained melodic
beginning, throughout the duet the pitch content and contour evolve from a very narrow
register to one more than a fifth wide, and then to an almost total loss of tonal center, before
returning to the earlier, more constrained and defined version of the motif. Departing
from this basic version, the piece explores augmentations and diminutions of the
following melody:
EX. 2.1: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:25 to 0:35
Only the last three notes of this motive, on the phonemes [o-a-jo], remain constant
throughout the piece. They become a sort of refrain that reappears at the end of every phrase
despite any transformation that the beginning of the motive may have
undergone.
The duet may be divided into fourteen sections along which the musical processes
Section # 1 - (0 to 0:15 minute):
This opening section is the introduction of the first and simplest version of the
theme, as already presented in example 2.1. This theme is made of two identical melodic
phrases but with different phonetic material. At this point, each of these phrases lasts four
measures (4 seconds).
EX. 2.2: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:05 to 0:14
pitch and duration (one vertical and the other horizontal). Regarding the former, it adds
two whole steps up, thus reaching C#5. For the latter, although it keeps the dynamic
of presenting two phrases, it adds a fifth measure to the first phrase, which produces
instability and breaks the balance and regularity that the theme had in its introductory
state. This means a whole second of new music and surprises the listener, refreshing the
perceptual experience. But the second motive phrase goes back to the established four
EX. 2.3: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:15 to 0:20
In this third section, there is another kind of expansion, this time in the texture:
the addition of a second female voice. Again unpredictability and irregularity are
emphasized by surprising the listener with the unexpected new element not at the
beginning of the phrase but on the second note (at 0:27). This second singer, however,
emits only breathy, almost voiceless sounds without specific phonemes, which follow the
same rhythmic patterns as the leading female voice, which still carries the tune. Once
introduced, this second voice keeps singing for the two normal melodic phrases of four
measures. The melodic content of these phrases is a variation on the pitch
spectrum presented up to this point in the piece. In the refrain [o-a-jo], the second voice gains more
presence with faintly defined phonemes but without completely abandoning the
different order.
The second voice keeps singing, and now, interspersed within its voiceless
texture, some phonemes are completely voiced—in addition to those of the refrain. The
two regular phrases are maintained. Interestingly, when the melody reaches the first
“refrain,” the second voice splits from the first one and sings it with a delay of one beat,
thus creating an echo-effect. But by the end of the second phrase, they are in unison
again. The melody expands its pitch range even more. The first phrase opens with a
EX. 2.4: Monk, Volcano Songs: Duets, “Walking Song,” min. 0:36 to 0:43
Some elements stay constant, such as the two thematic phrases and the
sporadically sung pitches of the second female voice, but the new expansion is harmonic.
During the “refrain,” the second singer adds a parallel voice a third above.
Now the expansion is in the variety of vocal colors or timbres employed. Both
singers, but more prominently the second one, shake their voices, giving the impression
of trembling.
musical effects applied and compiled up until now in the piece. The second voice drops
back into more breathy sounds. The voices play with delays and anticipations of the
“refrain” both times that it appears at the end of each of the two phrases.
Section # 8 - (1:15 to 1:27 minute):
In this longer section, the second voice starts to drop out at times, producing rests
between the breathy sounds. The first voice progressively drops the phonemes and starts
This section goes back to the same kind of articulation and texture as before
minute 1:14.
The melody loses its tonal center at the same time as its range expands. It moves
in wondrous ways out of the F# minor center that dominated until now. The trembling
timbre is more prominent and frequent, and by the end of the second phrase, before the
refrain, there is a whole second of pause in both voices—again surprising the listener
The new element is the broken nature of the melody. Unexpected rests interrupt
the melody’s flow. But those rests have more of an effect of stops or suspensions in time
suspension, the melody resumes its natural flow from the point before the interruption.
Section # 12 - (2:13 to 2:27 minute):
In this section several of the previously used musical tactics and processes are
Completely new melodic material replaces the theme of two phrases. The first
voice alone sings the following pattern based on descending ninth intervals (G#4 to F#3).
The phonetic material is limited to alternations between [a-ε], the [a] as old material and
EX. 2.5: Monk, Volcano Songs: Duets, “Walking Song,” min. 2:27 to 2:46
The second voice returns and sings along in unison with the two complete phrases
of the theme and then a single [a] on F#3, one second rest, and the last single [ε] on G#4,
this time sung only by the first voice. Thus, the piece ends with this open ascending
ninth.
The second duet of Volcano Songs, “Lost Wind,” consists of two voices playing
with the friction of two notes a half step apart and the partials produced by this action.
The motif is introduced by one of the voices and repeated twice by this single voice. It
consists of two short notes on the phonemes [ni - a], on C#5, followed by a long note a
half step up (D5) on the syllable [no]. This last D note immediately decays, sliding down
a third.
EX. 2.6: Fragment from Monk, Volcano Songs: Duets, “Lost Wind”
The duration of the long note before the decay varies in every repetition. The
second voice, which enters at the third repetition of the motive, reproduces the
same melody but with a delay. It accentuates the friction by desynchronizing the voices
while sustaining and sliding down. Although the fine blending of the vocal timbres
makes it very difficult, one may perceive, up to a certain point in the piece, that the two
voices take turns starting each repetition. The singer that follows always applies a more
distant timbre in her voice, as if the sound had a veil over it. This is achieved with a
further back placement in the vocal cavity and a certain breathy quality (in contrast to a
more forward and projected full sound). Although less veiled, the leading voice presents
By these means, the piece explores distance in two dimensions: space and time. It
explores the effects of distance between the sounds of the two singing voices, between the singers
and the performing space, or between the listeners and the singers (their
proximity or remoteness). The piece gives the impression of an echo effect, in which one
singer sings and her voice comes back with a distant, diluted color. This illusion is
created by the fine imitation of vocal colors between the two singers. The echo suggests a
vast expanse of space. Thus, the human and her or his voice in solitude face the
the rubbing partials created by the semitone interval between the voices. At the same
time, “Lost Wind” experiments with the distance in the historical time spectrum. These
could be voices coming from the past, a prehistoric time in the evolution of the earth or
The third duet of Volcano Songs, called “Hips Dance,” concentrates on the
proximity and mingling of the two voices. This effect is taken to a point at which the two
voices intertwine and “you can hardly tell that two different people are singing...In ‘Hips
Dance’ we push it even further by creating the illusion of more than two voices
overlapping.”19 Both voices start together from the beginning of the piece. One of them
maintains the following drone on the semi consonant [m] throughout the whole piece:
EX. 2.7: Fragment from Monk, Volcano Songs: Duets, “Hips Dance”
19
Meredith Monk, interview by David Gere.
Some hard exhalations—which sound close to [ho]—are interspersed between the
two-beat long notes. The “[m] drone” is only interrupted once in the middle of the piece
by a whole section of those exhalations. At that point, the two voices exchange hard
breathy [ho]s, which create a rather percussive effect. This section acts like a rhythmic
improvisation or percussion solo in the middle of a piece, in which time stops moving
forward, and a suspended syncopation takes place. This is one of those instances when
the voices are indistinguishable from each other, thus creating the sensation of more than
While the “[m] drone” continues, the other voice varies a pattern based on the
following material: a short eighth note sung on [e] followed by four or more sixteenth
notes sung on [m-a] a minor sixth down. These sixteenth notes are interrupted in
percussive [ho] in between the melodic patterns of the [m-a] and the hummed drone of
the other voice reinforce the fictional sensation of a third voice intervening. Towards the
end of the piece, the tempo accelerates and the frequency and irregularity of the
exhalations inserted between the sung notes increase, exacerbating the sensation of several
The last of the duets, as its own title “Cry #1” suggests, is a lament. It is not by
chance that this duet uses a phoneme of intense emotional charge. All through the duet
one of the singers vocalizes on [ŋga – ŋgæ] while the other slides over a single [ŋ]. The
[ŋ], like other posterior nasals (or velar nasals), is one of the last phonemes to be acquired
by children in languages that employ them. Its late incorporation into the linguistic
system of arbitrary signs in language implies that the child experiments for a longer time
during her/his infancy with these sounds in onomatopoeia or sound-gestures. As a
consequence, this nasal-velar sound carries a heavy load of emotional connotations, for
which the child has no words. Roman Jakobson, who developed these theories in his
famous Child Language, Aphasia, and Phonological Universals (1968), argues that
“Sound-gestures, which tend to form a layer even apart in the language of the adult,
appear to seek out those sounds which are inadmissible in a given language.”20 These
sounds will coexist with those employed in the vocabulary for a long time and even be
used as expressive sound-gestures in one’s adult life. Thus, these sounds have a playful
Reuven Tsur makes further observations about the nature of these nasal
phonemes. He comments that “periodic sounds” such as nasals, vowels and liquids—
those having similar structure for their recurrent acoustic signal portions—arouse a
certain relaxed kind of attentiveness, prediction and order, a quasi-hypnotic effect.21 One
may observe the extended use of these kinds of phonemes in Monk’s Volcano Songs,
The musical development of “Cry #1” consists of the erratic wandering of the
voices as they slide around a half-step up and down from a central note in oblique
interweaving. The first singer’s voice glides around this pattern, which is constantly
20
Roman Jakobson, Child Language, Aphasia and Phonological Universals. (The Hague: Mouton, 1968),
25.
21
Reuven Tsur, What Makes Sound Patterns Expressive? The Poetic Mode of Speech Perception (Durham,
London: Duke University Press, 1992), 44.
EX. 2.8: Fragment from Monk, Volcano Songs: Duets, “Cry # 1”
The second voice is introduced at 0:40 and slowly slides in a
humming manner on a sustained [ŋ], departing from a different note than the first voice.
Similar to the previous duet, this second voice acts as a drone. From 0:43 on,
each voice alternately moves its pivot note, but when it seems that both voices are
going to coincide, one of them moves away and again produces dissonant intervals.
Progressively, after 1:00, both voices accelerate the frequency of their pitch
fluctuation, which becomes an undulation, and the phonetic articulation in the first voice
becomes more blurry and muddy. While the second singer keeps humming on [ŋ], the
first one conserves only a murky [æ] from her phonetic set. A short section follows, from
1:58 to 2:18, in which both voices, moving closely in parallel, sing on [ŋga – ŋgæ]
up and down a half step in an insistent mourning. From this point to the end,
both voices start a process of simplification and assimilation until they collide in an
undulating unison around the same pivot note on a fluctuating [æ], to finally close on the
previously described gives “Cry # 1” a relentless lament quality, which lends itself to a
A-Ronne: Luciano Berio
In 1974 Luciano Berio composed A-Ronne for Radio Hilversum, scored for five
actors, but in 1975 he revised the piece and expanded it for eight singers, in a
double vocal quartet format. The premiere of this second version was given by the group
Swingle II in 1976. Berio’s usual collaborator, the poet Edoardo Sanguinetti, is the author
of the text set in the piece. As Osmond-Smith comments, A-ronne is the product of a
dynamic process of improvisation among the original five actors instigated by Berio,
which resulted in a series of fragmentary sonic dramas derived from Sanguinetti’s text.
He recorded these sessions, which, after reworking, were transformed into the
eight-voice concert version.22 While this original first version profited greatly from the vivid
imagination of these five actors, Swingle II brought their imprint to the second version
through their staple sound: a kind of vocally imitated instrumental fusion of jazz and
“classical” styles.
A-Ronne, conceived as a piece of “theater for the ears,” is one of the pieces resulting
from “his [Berio’s] eight years of work in the Milan radio” which added “a sharp sense of
the extraordinary flexibility of the aural imagination, where images can flow into, or
coexist with one another with an ease denied to the eye.”23 This shows Berio’s special
concern for the human listening process. A-Ronne, like many of his other pieces, acts
directly upon the different ways that we, as listeners, may apprehend text. Again, like
Monk’s Volcano Songs, A-Ronne is the concrete representation of the way we listen to
text in the “poetic mode,” with the exception that, in this case, Berio manipulates words
22
David Osmond-Smith, Berio (Oxford, New York: Oxford University Press, 1990), 98.
23
Ibid., 90.
The following analysis focuses particularly on the ways Berio operates on the
literary materials to reproduce a “poetic mode” of listening, which could alert the
listeners to the language elements that we unconsciously relate to in vocal music. This
analysis is based on the published score of the eight-singer version as well as the
languages, periods, and cultural circles. Each quote appears in the actual text divided by
background. Sanguinetti takes advantage of the original fragmentation that these phrases
have gone through in the real world and deepens their fractures, presenting sections or
even single words. The colon marks clarify further the fractures between each element.
He takes these new wholes and works them through to show their original alienation.
Besides setting the quotes in disturbing contexts, the use of six different languages
highlights the split. Sanguinetti organizes the text of A-Ronne in sections, which are
themed “beginning,” “middle” and “end.” Time is also thematized in words like “run” or
24
Luciano Berio, A-ronne: documentary for 8 singers on a poem by E. Sanguinetti (Wien: Universal
Edition, 1975); Swingle II, Luciano Berio, conductor, A-ronne (London: Decca, 1976).
25
List extracted from Norbert Dreßen, Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen (Regensburg: Bosse, 1982), 159.
“beginning,” as well as space in the parts of the body: bocca, labbro, annus, pied, etc.
commencement, which fits with the thematization of space and time and at a certain point
almost articulates the dissolution of time “in my end...is my beginning.” The title A-ronne
is related to the conceptualized idea of time and the sections of the piece. “A” was the
first letter of the old Italian alphabet and “ronne” the last one. All material in between
represents the rest of the alphabet. This is the poem as Sanguinetti presents it:
1.
a:ah:ha:hamm:anfang:
in:in principio:nel mio
principio:
am anfang:in my beginning:
das wort:en arché en:
verbum:am anfang war:in principio:o lògos:è la mia
carne:
am anfang war:in principio:die kraft:
die tat:
nel mio principio:
2.
nel mezzo:in medio:
nel mio mezzo:où commence?:nel mio corpo:
où commence le corps humain?
nel mezzo del cammino:nel mezzo
della mia carne:
car la bouche est le commencement:
nel mio principio
è la mia bocca:parce qu’il y a opposition:paradigme:
la bouche:
l’annus:
in my beginning:aleph:is my end:
ein gespenst geht um:
3.
l’uomo ha un centro:qui est le sexe:
en méso en:le phallus:
nel mio centro è il mio corpo:
nel mio principio è la mia parola:nel mio
centro è la mia bocca:nella mia fin:am ende:
in my end:is my
beginning:
l’âme du mort sort par le pied:
par l’anus:nella mia fine
war das wort:
in my end is my music:
ette, conne, ronne:
confrontation with the human voice as Sequenza III, the former “enriches the work with
the possibilities that are offered by a double quartet and through a referring system that
extends the concert frame of his Sequenza.”26 Regarding the first point, instead of the
isolation of the vocal actions of Sequenza27 performed by the solo singer, A-ronne knits
the individual vocal expressions of each of the eight voices into the full structure of the
piece: “the interpretation of each voice horizontally constantly refers to the collective
connection vertically.”28
Departing from Sanguinetti’s poem, Berio recomposes a new text for the score,
which adds sounds in phonetic notation, syllables, words, sound gestures, and a
completely new language. These new elements may or may not be related to words or
fractions of those that already are part of the poem’s text. Although not precisely in its
original order, the whole poem is repeated about twenty times in its entirety. As Dreßen
comments, in this manner Berio unravels the “meaningless sign chain” nature of this text
specifically, and any text in general, within the general state of crisis of the language.29
26
Dreßen, Sprache und Musik bei Luciano Berio, trans. L. Guillén, 157.
27
In the instructions for performing Sequenza III, Berio requests that attention be paid to the timing
indications on the score for each section to maintain the rapid succession of vocal events. He aims to create
the illusion of one-voice polyphony: such a rapid articulation of diverse vocal sounds that the listener could
perceive them as several voices.
28
Dreßen, Sprache und Musik, 158.
29
Dreßen, Sprache und Musik, 162.
The rather negative approach of Boulez to text is transformed into a more positive
conception of the same issues in Berio’s hands. Boulez asserts that any text set to music
will be inevitably destroyed. Text may be at the “centre” molding the musical piece but at
Berio proposes a new combined and fluent media as previously commented on in this
chapter. For Berio, certain fragmented text elements may be perceived as music and still
syllables and phonemes inhabit the transitional realm of his new “combined and fluent
media.” Berio enacts the language crisis and transforms it into a new communicative
discourse.
Thema (Omaggio a Joyce) (1958) was Berio’s compositional turning point with
poem, which breaks the logical/semantic continuous discourse, Berio first set down only
those features of the text that could be perceived during a first reading: text fragments,
contour, vocal timbre modifications. His phonological studies and observations led him
to purposely take into account the perceptual experience from the listener’s point of view.
According to his beliefs, in the already disturbing context of a musical setting, only those
features immediately perceived have a chance of being captured by the listener. This
same kind of approach is found in other pieces by Berio, such as Laborintus II and A-
ronne.
Laborintus II not only uses texts by the same poet, Sanguinetti, it also sets a
layers of eight singers. The following comments of the composer about his Laborintus II
The first step...was to set down some features of the text spontaneously so realizing the
polyphony attempted on the page...Nor should it be forgotten that only those features
immediately perceptible on a simple reading of the text have been taken into
consideration...30
expressive connection between the text and its receptor through its more rudimentary and
firsthand apprehensible sonic elements. With respect to Laborintus II, A-ronne refines
and deepens the exploration of the phonetic and paralingual aspects of its text.
In A-ronne, Berio continues the process of transforming the prose into a poem
ambiguity of meaning.
Verbum caro in principio die Kraft die tat der sinn (alto 2, p. 3)
The repetition of words or partial phrases adds to the recontextualization of the material,
Nel mezzo nel mio mezzo nel mio corpo oh nel mezzo: nel mezzo del camino
nel mezzo della mia carne la bouche è la mia bocca la bouche la bouche la
Neither Sanguinetti’s text source nor Berio’s final scored text preserves cohesive
paragraphs. A-ronne uses two formal types of speech: disjointed and enumerative. Most
of the passages are made up of short phrases that never develop into a more fluent kind of
speech. Both types explore contrast and repetition as a source of musical/sonic interplay.
The repetition of words and phrases, and especially accumulative repetition,
creates emphasis, climax and expectancy. This kind of repetition often creates points of
assonance and alliteration inside a word or among words. In several cases, in cataloguing
and enumerative passages, the words are bound together more by their similarity in sound
than by their logical connection. Berio comments, in an interview with Bálint András
with the help of alliteration that I musically reorganize this rather complex text.”32
alliteration and assonance. The percussive effect created by the vertical superposition of
the following words is achieved by their similar phonetic content. Stop-plosives, such as
[t]-[k]-[d], in their two typographic representations, “c” or “k,” the fricatives [v]-[f] and
the flipped and rolled [r], dominate the phonetic field in this passage. The following
30
As quoted in Peter Stacey, Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio), (New York/London: Garland Publishing,
Inc., 1989), 156.
31
Sanguinetti’s poem is made, for the most part, from quotes coming from famous texts in prose form.
32
Dalmonte and András Varga, Two Interviews, 142.
example comes from the first three coincidences between some of the eight singers on
derived from surrounding words, but in other instances, phonemes seem not to have any
textual connection and have a purely musical function. On page 4, the sung notes in the
baritone 1 on the phoneme [o] are clearly derived from the preceding “caro” of the alto 2
and soprano 2 or from “principio” and “wort” of baritone 2 or soprano 1. Another case of
coughing, suffocated by words and saliva” —as the composer indicates on the score—
anticipates the consonants [d]-[s]-[z]-[d] of the words he is trying to deliver: “der Sinn.”
But other episodes, such as the succession of phonemes [u]-[i]-[ʌ]-[o]-[a] on pages 16 and
17 in the baritone 2, are for the most part unconnected to any surrounding word. The case
of the counterpoint between tenor 1 and baritone 1—starting at the end of page 20 and
running into the first half of page 21—is a purely musical event, which focuses on the
colors and sonic quality of these phonemes. In rapid succession tenor 1 delivers a set of
composer’s indication to perform this passage as if they were “teaching vowels” or
“teaching consonants.”
‘representation of the invisible in the visible’ as the relation between the written picture
and its acoustical realization.”33 For this realization, Berio resorts to a vast and
meticulous musical vocabulary: twelve forms of notation, three procedures for time
organization of the piece, seven dynamic steps, and ninety-six performance instructions.
Dreßen makes an exhaustive list of these procedures.34 The following list shows the
range of vocal sounds that Berio requires from the performers: from the spoken to the
sung, going through a variety of physical modifications that affect the sound result.
33
Dreßen, Sprache und Musik bei Luciano Berio, 162.
34
For a complete list of notation forms, time organization, dynamics, and performance instructions in A-
ronne, please refer to pages 162 to 167 of Dreßen, Sprache und Musik bei Luciano Berio, Chapter V.
35
List reproduced from Dreßen, Sprache und Musik, 162-163.
body parts, and musical or theatrical/acting situations—interpreting different characters
in various settings. The following table shows examples of each of those areas:
6
“gasping out the words”
7
“tense murmuring”
“cold” 3
“angry” 4
“sadly”
15
“Vocalizing” 27
A variety of vocal sound production styles dominate the performance spectrum of
A-ronne:
1. Unnotated recitation (p. 7-8 every singer; p. 9-11 baritone 1 and 2; p. 34 tenor 1
2. Spoken short phrases, words or syllables with notated rhythms and no pitch
3. Spoken short phrases with notated rhythms on a single line denoting the central
point of the speaker’s register: p. 1-3; sporadic use between p. 44–46; the scat-like
unspecified melodic contours on the syllables “de” and “den” at the beginning p.
4. Singing with an unstriated pitch: notated around a central line (usually small
vocalizes on one of the pure vowels, on p. 1 alto 1, p. 2 alto 1 and tenor 1; all
these have the following performance indication on the score “(singing unrelated
pitches).” Unless this is indicated, the rest of syllables notated in this way are
spoken.
most of the singers; p. 28 soprano 1 and alto 1 indicated “(as a folk song)”; p.12-
13 and p.39 baritone 2 sings like a double bass detached monosyllabic mixed text
15, but the whole melody appears on p.16 “(like a bumpkin’s marching song)”
6. The other singing passages are sung on single phonetic sounds that do not combine
to make any word: “(dreamy and distant)” tenor 2 melody; chords sustained by
different rhythmic patterns superposed among all the voices, which progressively
7. Syllabic singing (nonsense syllables), the most prominent instances are the
and rhythmically these passages imitate the jazzy vocal style of commercial jingles.
The spoken passages, such as the “unnotated recitation” and the “spoken words
and short phrases” expose a wide range of performance styles according to the mood
indicated by the composer on the score. Since these are not musically notated, all
parameters are open to the performers’ interpretation. These passages have neither pitch
indication nor rhythmic notation; their performance intonation arches and speed are only
determined by the mood indicated by Berio over each event. On pages 7 and 8, the
indication of “tense murmuring” and the whisper sign, o--------, lead to a very fast,
masked delivery of the text that, added to the polytextuality among the singers, makes
understanding almost impossible. The resulting pitch lies in the medium-high register of each
singer’s normal speaking voice. Perhaps to avoid the natural performance tendency of associating
fast delivery with high pitch, on page 9, the indication for baritones 1 and 2 is “Like two priests
murmuring a prayer: fast and low tones.” Superposing the other voices singing the
“quasi-motet” passage over the low pitch and volume of baritone 2’s singing, the text
is masked, and its understanding obscured. The low pitch speaking quality is kept
through p. 10 when alto 2 is introduced in a counterpoint with baritone 2; but the pace
slows down to create the sensual and intimate ambiance suggested by the indication
“Like an intimate dialogue: with a husky and hesitating tone.” Sighs are incorporated
between these hesitant-toned phrases. Intelligibility is regained because of the slower pace
and overarticulation of the words as a mode of sensualizing their sound, their phonetic
coloring.
A similar situation is encountered on page 22, this time between tenor 2 and alto
2, “intimate and sensual with occasional o-------.” In this passage, all the voices, except
--.” The gigglish quality makes every voice explore the high end of its speaking register
and move in a fast manner. Right after this passage, tenor 2 and baritone 1 engage in a
process of “Stuttering, gradually faster” recitation for tenor 2 and “Very fast, gradually
stuttering” for baritone 1. The “faster” indication for the tenor towards the end of this
section, added to the maintained “very fast” of the baritone, builds excitement toward a
climax via a natural crescendo and acceleration of the delivery pace. Certain words
monotonous line and the two tenors. But the overlapping of two different texts and the
fragmentation of words produced by the stuttering diverts the listener’s attention from the
The paralingual realm of A-ronne is highly developed, suggesting with this sonic
interpret the exact dramatic meaning of these sonic gestures. They are ambiguous and
simultaneously evocative of different scenarios. Some of them may have only a musical,
quasi-instrumental function. The last event of page 21 is a 25-second-long section in which
the eight singers are instructed to perform alternately the following paralingual sounds:
gasping, cough, grunting, snorting, straining, groaning, exhaling loudly, moaning. All
sensual”—between tenor 2 and alto 2. These last two singers alternate phrases about the
human body with breathy sounds, such as “nel mezzo…nel mio corpo…nel mezzo della
On the one hand, altogether, this paralingual superposition does not make sense as
a cohesive action or a chain of reactions in relation to each other. On the other hand, this
episode may be seen as the deconstructing turning point of the previous authoritarian
answering tenor 1. At the beginning of this section (page 16), tenor 1 seems to answer
imitating exactly what soprano 2 indicates to him in a whispering mode. Later, in the
second half of page 18, soprano 2 drops her indications and, for the next two pages,
baritone 1 and tenor 1 continue their “angry and hysterical” (as indicated by the
very similar to the already quoted text of the seduction section between alto 2 and the
same tenor 2:
words is almost obliterated by the harsh delivery. And the only thing that a listener
perceives is the aggressive tone of the baritone and the intimidated tone of the tenor, who
is progressively taught (or brainwashed) by the other. To a certain degree, the text being
used could have been this one or any other because what we apprehend is the aggressive
tone and procedure of forcefully imposing answers at any cost. This effect becomes
progressively clearer when, in the second half of score page 18, their phrases start to
collapse into each other. On page 19, phrases or even words are completed between each
other in a kind of “hocket,” while in other cases different words or complete phrases are
overlapped. By page 21, the only material exchanged between these two singers is a set
of vowels and consonants that they pretend to teach each other. Thus, we arrive at the
cathartic preparation for the sensual section in which the physicality is exacerbated and
certain body parts become the focus of sexual desire. The chaotic superposition is the
release and final step in this gradual relaxation and disintegration of language and vocal
sound expression: from phrases to phonemes, from vocal expressive intonation to the
isolated vocal gesture (paralingual gestures). Finally, although this chaos is difficult to
explain as a coherent act, each paralingual action preserves its everyday concrete
meaning.
which has a purely musical function. It offers a sonic mattress, a texture, against which
baritone 2 stutters through the phrases that he is trying to enunciate: “der sinn…o
logos…Am anfang war…die kraft…die tat.” While he fights with these words, the
other seven singers engage in a loop of spoken and whispered single phonemes and
paralingual sounds, such as: “inhaling and exhaling through teeth,” “flickering tongue
against upper lip,” “whistle,” “‘Pop’ sliding finger inside-out of mouth,” “squeak.” Most
of these sounds have no immediate reference to any usual human action—with the
exception of the whistle—and are mere playful effects with a purely musical textural
function.
The three main factors that affect the intelligibility of the text in Laborintus II
performance style; 2) text condition; and 3) masking. The optimal case of intelligibility is
achieved when the text is spoken, well articulated and more or less intact, in its “prime
condition.” When the same text or different texts are recited by two or more voices, the
intelligibility depends on how much the words or short units of text are superimposed. In
some cases, two singers alternate in a rapid, moderate, or slow paced delivery. In this
case, the text could be clearly understood unless another masking factor, such as
phrases, plus any of these masking factors, only intensifies the ambiguity of the text.
Whenever a melody with lyrics is superimposed on a spoken text, the listener’s
attention will be drawn to the message of the latter, while the melody with text becomes
the background. This is especially true when the sung melodic passages are soft in
volume and melismatic in their lyric setting, while the spoken ones are clear and
overarticulated. Only in certain circumstances, when stylistic performance effects mask the
spoken text, could the sung melody take a leading role and directly grasp the listener’s
attention. Whenever the sung passages with lyrics are the center of attention, the text
tends to be more unclear in melismatic settings than in syllabic ones. Any vocal or
performance masking effect could obviously add to clarity, or obstruct the mentioned
perceptual tendencies.
The following is a list of passages in A-ronne in which one, or more than one, of
1. Page 1–3: In the first three pages of the piece, although the “prime condition” of
the text is not conserved, its fragmentation is counteracted by the repetition of the
same words, “in mio principio.” In this way, the listener has several opportunities
to grasp them.
is found on this page. For twenty seconds, the eight singers recite different texts
are fragmented by interspersed sung pitches on the last phoneme of the previous
version of page 3.
3. Page 4: Although all the voices articulate words or short phrases at the same time,
the repetition of some of those in the vertical axis (the same word or phrase
pronounced at the same time by two singers) or in the horizontal axis (successive
repetition of the same material by the same singer or a different one) brings a
5. Page 22: The “intimate and sensual” dialogue between tenor 2 and alto 2 on page
6. Page 9: The low-volume, low-pitch performance style masks the two different texts
that baritone 1 and 2 speak. This effect is especially emphasized by the upfront
7. Page 13: Again, Berio further alters the “prime condition” of the text. The
composer takes, from the already fragmented text compiled by Sanguinetti, only
8. Page 16: Baritone 1, soprano 2 and tenor 2 offer an instance of alternating phrase
whispering in soprano 2, are intervening, the relaxed pace, alternation with the
baritone who pronounces different text, plus the superposition of the tenor 2
repeating the same text in a different vocal style, contribute to the understanding
of the text without major difficulties. At the same time, on this same page, the
composer adds an extra textural layer, a sung melody with lyrics by tenor 2. The
melismatic and soft quality of his singing, plus the natural tendency of the listener
9. Page 18: The multilingual and extreme fragmentation of the text does not
10. Page 21: The composer pushes the fragmentation of the text of Sanguinetti to an
extreme. On page 21, Berio breaks the text into its phonetic components,
grouping vowels on one side and consonants on the other. The effort of
times listening or looking at the score does one realize that there are phonetic
11. Pages 23–26: In these pages, the process that the text undergoes transforms it from
phonetic extraction of only the vowels of the preceding words. This again
alienates its comprehension. Only the hocket between the two singers of
consonants and vowels in their original order at the end of page 26 offers the
12. Page 26: In this passage, not only are tenor 1’s and baritone 1’s recitations
superimposed, but the “stuttering, gradually faster” and “very fast, gradually
13. Page 29: Another instance of sung text is found on page 29, but this time—in
of the same melody and lyrics by different female voices and the medium high
14. Page 35: Tenor 2’s recitation is acoustically set to the background and buried in
the thick vocal polyphonic texture of the rest of the voices. The intimate
monotone and low volume delivery of tenor 2’s recitation also does not contribute
15. Page 41: At the end of this page, the inaudible recitation of tenor 1 is
overpowered by the forced and overarticulated whispering of alto 2, which takes
the sonic front stage over the tenor’s textural background. While alto 2’s text is
As mentioned before, in A-Ronne, Berio recreates the semiotic structural
manifestation or dramatization of the elements of language and their agony when set to
music. In terms of its overall structure, this piece explores all the possible degrees in the
music-text continuum proposed by Berio. While traveling across the whole spectrum of
possibilities, A-Ronne contrasts and overlaps in a single texture the pure extremes, word
or music, or any intermediate scale degree of the synthesis of the two mediums.
One may divide A-Ronne into several sections according to the types of processes,
the contrast between fragmented text—words that are repeated—and free vocalizations
on vowels extracted from preceding words. The spoken material is mostly set to notated
Among the briefly sung material, besides the vocalizations, there are also short sustained
pitches on single vowel phonemes. There are occasional paralingual vocal gestures, such
as sighs and belches, and sound effects such as bocca chiusa. But the climax of the
voices vertically as well as horizontally, creating a chaotic textural mix of short phrases
delivered in different manners, which overlap and succeed one another. Page 3 starts
clarifying this notated sonic multi-texture and proceeds into a section of 20 seconds in
which the eight singers deliver straight spoken text in different moods and insert short
again concentrates on coordinated short phrases of the text, which are rushed and shouted
by all voices except tenor 1—who inserts his phrases in between with a prescribed
rhythm and later adds a specific pitch contour. Page 6 is mostly devoted to the
paralingual realm. All the voices—except baritone 2 who stutters over the consonants of
gestures.
In this way, this first section opens the palette to most of the vocal styles and
b) Section 2: Score pages 7 to 12 focus, for the most part, on the contrast between the two
vocal styles that constitute the extremes in Berio’s continuum: speaking and singing. After
the previous introduction to a big portion of the universe of vocal possibilities, now, in a
more economical manner, this section overlaps—in vertical contrast—the extremes, the
opposites: straight spoken text against sung melody—rhythm and pitch notated—in a
Page 7 opens with all the voices superposing spoken complete sections of the text
of the second part of Sanguinetti’s poem. Then it proceeds to the insertion of short sung
phrases of what on page 8 will become a full contrapuntal melody. So far the text of the
The transitional page 8 gives way to page 9 on which four of the singers
alternately take the singing role; baritone 2 and a second singer—who changes from
voice to voice—recite passages from the first and third parts of the original poem.
c) Section 3: Score pages 12 to 18 are devoted to the contrast of different singing styles
“desperate” shouted phrases by alto 2, the spoken text in free rhythm is not contrasted
This section opens at the end of page 12 with baritone 2 singing in a “basso
continuo” manner, on syllables that, although extracted from words of the poem’s
original text, are decontextualized to such a degree that the result seems a mix without
any syntactic sense. On page 13, tenor 2 performs a distant and soft contrapuntal melody,
which articulates the vowel phoneme of the immediately preceding syllable sung by
baritone 2. These two voices offer an instance of complete unintelligibility of the text,
resulting from the extreme fragmentation of the discourse. This procedure transforms the
two voices, resulting in an instrumental effect in which the vowels become pure sounds
By page 14, some of the other voices intervene in the background with singing
lines in “ppp” on half-notes and slow rhythms on vowels also derived from the syllables
of the baritone. The rest of the voices perform animal sounds, scat-like quick short
phrases, or short bits of melodious popular chants—which make use of more organic
phrases of the text. All of them add to the purely musical or instrumental multiple texture.
By score page 16, the spoken word becomes dominant over any background
sergeant’s questioning” and intimidates tenor 1 who answers with whichever phrase is
prompted by soprano 2. But this brief spoken passage is shortly invaded again by most of
d) Section 4: This section explores the medium zone of the music-text “continuum.” Two
fragmented spoken words articulated in a hocket with precise notated rhythms. Regarding
the quasi-“Sprechstimme” vocal sound production style, in these pages there are several
instances in which most of the singers pronounce the syllables “den” or “my” in an
in-between speaking and singing manner. This may be described as speech with a melodic
sentence in hocket. Usually, it starts to alternate freely but soon the composer assigns a
very well-structured rhythmic pattern to each singer. It is at that moment that the text
becomes fragmented word by word or syllable by syllable, and the phrases are put
together only by the interplay between the two singers. The two times that the text is
broken into syllables or phonemes, those particles derive directly from the immediately
end.” Showing the word immediately before its disintegration allows the listener to
appreciate and understand its meaning at the same time that it challenges and involves
sounds and high pitches “imitating the ‘call’ of Algerian women” produced by the four
undistinguishable sounds or the turning of a gigantic squeaky wheel. Again the lingual, or
e) Section 5: After the isolated transitional page 22, the section between pages 23 and 28
further explores elements and procedures similar to those of the 4th section. But now these text
hockets and “den” passages, on the one hand, are subject to better defined musical
parameters, and on the other hand, can be perceived more as either text or music, the
The short “den” motifs now are rhythmically and melodically notated. From page
23 to 26, every time these motifs appear, they are used as the trigger for a dialogue between
tenor 1 and baritone 1, who declaim text over a sustained chord by the other voices.
The first time (p. 23-24) this chord is sung in bocca chiusa, the second time (p. 25-26) on
vowels derived from the text declaimed by tenor 1 and baritone 1 on pages 23 and 24.
This purely musical texture serves as background to the dialogue made of short fractions
of sentences. Tenor 1 and baritone 1 each complete the last syllable of the last word left
incomplete by the other. This is one of the few instances in which a large section of
Sanguinetti’s poem (the whole Part 3) is delivered in its original order; from “L’uomo ha
Although in sections 3 and 4 there are fractions of the text in its original state, it
has never been as clear as in this passage; this happens for several reasons. First, the
text is not as fragmented as before; now it is delivered in bigger chunks by each singer.
Second, the rhythmic patterns to which the text is set allow the logical, natural flow of
intonation (not too frantic, not too slow) and do not disturb or distort its understanding.
Berio allows the performers to take those rhythmic patterns with flexibility as he
adaptations are possible.” Third, the sustained chord texture accompanying this dialogue
is completely unobtrusive.
Finally, the whole section, with its harmonized “den”s and declamations, creates
the illusion of listening to a radio commercial. The melody, rhythm, harmony, vocal
tone, and performance style of the vocal ensemble resemble those of commercial jingles, which
generally introduce and draw the attention of the listener to the commercial selling
speech that follows. Then, usually two announcers declaim that selling speech,
alternating their voices with overarticulated inflections of their speaking voices. This is
certainly the fourth and most definitive reason why this passage is as intelligible as it
is.
Thus, A-ronne reaches its maximum intelligibility and textual integrity exactly in
these two pages, which fall in the middle of the whole score—pages 23 and 24 of a total
of 48.
Soprano 1 proposes the first half of a singable melody, which then is completed by alto 2.
On page 30, soprano 1 restates the same first part of the melody, but this time alto 1
repeats this first section after her and completes the rest of the melody. While these
singers perform this theme, the other voices accompany with vocalizations on vowels.
The accompanying texture thickens progressively toward page 31, by which not only have all
the other voices been introduced, but their rhythmic activity has also increased. By pages
32 and 33, the eight singers vocalize on frantic scales in triplets, quintuplets, and
sixteenth-notes. This overlapping creates polyrhythm and cacophony, since now the
ga- ra-ga- ra-ga-etc;” “ca-ro- ca-ro- ca-ro-etc;” etc. By the end of this section—page 33—
The syllabic setting and simplicity of the melody proposed by soprano 1 and the
alti allow the listener to understand the words. The fragment set to this melodic passage
is extracted in its original layout from the first half of Sanguinetti’s part 1 of the poem
without any further editing by Berio. But shortly after its introduction, the accompanying
texture gets thicker and busier, progressively obstructing the previous clarity of this text
fragment.
g) Section 7: This section—score pages 34 to 40—goes back to the contrast of the two
pure extremes of the text-music continuum. Although several of the voices have spoken
page 34, this effect is created because all the singers’ overlapped polytextuality makes the
literal understanding of the text impossible and at the same time highlights the richness of
the consonants’ phonetic colors. The rest of the recitations are either overlapped and
the sung parts. In the first case, the stuttering again emphasizes the percussive effect of
the consonants of the words. In the second case, the whole text melts in a muddy and
monotonous “sonic mattress.” The text employed in these passages is drawn from the
36
At the beginning, all except baritone 2 have a spoken passage, then tenor 1 and baritone 1 and finally
only tenor 2.
three parts of Sanguinetti’s poem; in some cases only fragments are extracted and in
others the text is incorporated in its original order or phrase by phrase backwards.
The sung parts consist of an ornamented vocal line, usually carried by one voice,
harmonized by the others. All these lines are textless and performed on isolated vowel
phonemes. But harmonic richness and dramatic melodic content dominate this section.
ensemble. But in contrast with the one in the 5th section, this one is decomposed:
rhythmically fragmented and harmonically dense (more dissonant). Also, this time, the
announcers, although they use the same text from Part 3 of the poem, do not establish the
dialogue game as they did before. Tenor 1 is almost inaudible, murmuring in the
background, and alto 2, who loudly whispers “in my beginning Aleph is my end” (from
the poem’s Part 2), dominates on page 41 against the murmuring and a sustained sung C4
of the other voices. By the next page, that sustained C4 breaks into quarter notes that
immediately start to unlock from their homophonic layout into a slightly displaced
polyrhythm, which creates an echo or delay effect. In the following pages, the eight
voices also progressively expand their harmonic spectrum. They mostly sing on isolated
phonemes or syllables derived from different parts of the text. The phonetic material also
In this section, the singers interrupt the singing with isolated phrases from
different parts of the poem. These phrases are mostly whispered in Sprechstimme style
and performed according to the different character indications: “dreamy,” “solemn,”
On score page 47, the eight voices lock into a dissonant homophony that
crescendos until colliding into a held open perfect fifth, which soon after dissolves into a
fading dissonance until the closing spoken letters of the Italian alphabet: “ette, conne,
ronne.”
These detailed analyses of Volcano Songs: Duets and A-ronne reveal the
procedures through which Monk and Berio transform the sonic elements of language into
structural components of their pieces. Through these means, they explore the
gestures that are usually disregarded in text set to music and make a conscious
IV.
The cases analyzed so far have been pieces from the “art-music” realm: art-song,
opera and oratorio repertoire. Examining this dissertation’s hypothesis in the context of
the repertoire of the “popular-music” realm (jazz tunes, folk/bluegrass, and pop songs)
may provide further insights. This section, therefore, will proceed with an analysis of four
popular songs from the English-speaking repertoire, which serve as examples of different
approaches to the articulation between words and music: first, a highly structured Tin
Pan Alley tune, Jerome Kern and Oscar Hammerstein’s “All the Things You Are;”
second, the narrative type in strophic form, Bob Dylan’s “A Simple Twist of Fate;” and
finally, two with “redundancy variation” in the “verse-chorus” format, Björk’s “Isobel”
Popular song confronts us with new issues that lead us to rethink even the way we
approach “art song” analysis. From the debate over popular music studies, three
particular issues are relevant to the approach that this chapter seeks in analyzing these
four songs. These perspectives are fundamental to understanding the way in which
audiences listen to these songs and how they are conceived by their songwriters.
The first issue addresses the fact that an analysis of the musical text—the song
and its constituent parts, words and music—is not sufficient in isolation since its elements
only gain significance in relation to their context. Contexts directly influence the way
Richard Middleton and Simon Frith in the musicological debate over analysis of popular
music, “one of the most important aspects of context is that it establishes the codes that
listeners are most likely to apply in certain listening situations.”1 Style conventions are
indicators of which musical elements are valued in each specific popular genre. These
elements are the main focus of artists at the creative moment and of the audience in the
listening situation. The analysis of those conventions reveals the code under which
popular songs are read as “texts.” It is necessary, however, to note that more than one
perspective comes into play in the formulation of that code and that the code may be
different according to the agents interpreting the musical object. This coding is actually
the result of a diversity of discourses converging in a dialectical manner into the object of
study, the song. Cultural studies theorists of the 1970s assumed that “the meaning of
altogether,” as Simon Frith points out in a criticism of his own analyses in The Sociology
that each “subcultural group” assigns to the song styles with which they identify—gave
way to studies that also took into account the “changing modes of lyrical production” in
The second relevant issue is concerned with the problem that traditional score
analysis presents when dealing with popular song. Most traditional formalistic analysis
based on the visual information provided by scores may ignore important musical
1
David Brackett, Interpreting Popular Music (Berkeley, Los Angeles, London: University of California
Press, 2000), p. 18. For further details on these issues, see Richard Middleton, Studying Popular Music
(Philadelphia: Open University Press, 1990), and Simon Frith, Music for Pleasure: Essays in the Sociology
of Pop (Cambridge, Oxford: Polity Press, 1988).
2
Frith, Music for Pleasure, 119.
3
Ibid. “Lyrical production” refers not only to the content and format of the lyrics but also to the musical
structure of the song.
Shepherd—departs from scores and concentrates only on the musical syntax (mainly
harmony and melody), ignoring rhythmic nuances, texture, vocal and instrumental
arrangement, timbre nuances of the sound inflections, and sound mixing (in recording
Charles Keil and John Shepherd—allows observation alongside the melodic and
Certainly recording technology has been partly responsible for the reconsideration
new way of registering particular performances of musical pieces: this is the third
relevant issue that contributes to the analytical approach used in this chapter.
corpus of new questions such as the ones mentioned in the previous paragraph.
Furthermore, Albin J. Zak III proposes that “records are not reproductions of anything;
they are ‘realities in themselves.’”5 He is appealing to the rock band leaders’ conception
of songs as only the starting point; “for them…the sound of the recording represented the
ultimate form of the artwork, and their compositional intention was to have a hand in
shaping the sonic relationships that made their identity.”6 Rock historian Carl Belz stated
as early as 1969 that although rock was not the first genre to use records and radio as its
4
For details on the definitions of these terms, see Charles Keil, “Motion and Feeling through Music,” The Journal
of Aesthetics and Art Criticism, 24 (Spring 1966), and John Shepherd, “Media, Social Process and Music”
in Whose Music? A Sociology of Musical Languages, John Shepherd, Phil Virden, Graham Vulliamy,
Trevor Wishart, ed. (London: Latimer, 1977) and “A Theoretical Model for Sociomusicological Analysis
of Popular Musics,” in Popular Music 2, David Horn and Richard Middleton, ed. (Cambridge: Cambridge
University Press, 1982) as quoted by D. Brackett in Interpreting Popular Music, 21.
5
Albin J. Zak, The Poetics of Rock: Cutting Tracks, Making Records (Berkeley, Los Angeles, London:
University of California Press, 2001), 21.
6
Ibid.
primary media of expression, for rock, “records became the primary, common bond
among artists and listeners.”7 Rock recording is not to be assumed as a mere “‘acoustic
presentation’ of a written text (the score). It is itself a text, a sonic one; ‘what it sounds
like’ is precisely ‘what it is.’”8 Records as well as scores are semiotically mediated texts
open to interpretation. But we must not overlook that by their own nature, records have a
material content, sound directly experienced by the listeners and that “in addition to
whatever we make them to be, they insist as well on being exactly what they are.”9
call the recordings, “represent a reified abstraction,” which include more than “musical
whose particularity is immutable and thus essential to the work’s identity.”11 As musical
ideas are not only expressed in sound but also become sound, we must take into
consideration new elements that are integral to the final artistic product, such as recording
tools, space and dynamics among the members involved in the actual recording and
mixing. What primarily concerns this dissertation is that the recording tracks bring
awareness that we are hearing song words in somebody’s voice and that voice delivers
Taking into consideration the discussion of these three issues, this chapter
concentrates on the analysis of the text itself—the recorded track, the song. My work falls
among the textually oriented studies of popular music that approach lyric analysis in
7
Carl Belz, The Story of Rock (New York: Oxford University Press, 1969), as quoted in A. Zak, 13.
8
Zak, 41. Comments in brackets and italic type by L. Guillen.
9
Ibid.
10
Middleton, Studying Popular Music, 83.
particular “with awareness of their function not as verbal texts but as sung words,
Popular song often offers the chance of finding the composer of the music and the writer
of the lyrics in the same person. Sometimes the songwriter is even the performer herself,
as in three of the cases under study in this chapter. This particular circumstance provides
the opportunity of observing the manipulation of the lyrics—as Middleton calls them,
(including the sound of the recordings) as part of the materials that songwriters count on
Although “people may not listen to pop songs as ‘messages,’” it is obvious that
they take them into account.14 As Simon Frith says, “So the question remains: why and
In songs, words are the sign of a voice…Singers use non-verbal as well as verbal devices
to make their points—emphases, sighs, hesitations, changes of tone…(which is why some
singers, such as the Beatles and Bob Dylan in Europe in the sixties, can have profound
significance for listeners who do not understand a word they are singing).15
In approaching the task of analyzing songs to find out “why and how do song
1. The way a singer performs a song determines what the singer means to
us and our relationship to him/her as the audience.
11
Zak, The Poetics of Rock, 42.
12
Simon Frith, “Try to Dig What We All Say,” The Listener (June 26, 1980), as cited by A. Zak, 43.
13
Middleton, ed., Reading Pop: Approaches to Textual Analysis in Popular Music (Oxford, New York:
Oxford University Press, 2000), 7.
14
Frith, Music for Pleasure, 120.
15
Ibid. Frith’s remarks resonate with my comment about consumers of Anglo-American popular song in
non-English speaking countries in the introduction of this dissertation.
16
The following three points have been adapted from Simon Frith, Music for Pleasure, 121.
17
Ibid., 121.
musical genres, listeners engage in fantasizing about belonging to
different sorts of communities.
Speech Mode vs. Poetic Mode,” we may find a wide range of vocal setting modalities,
from those that aim at a “speech quality” to those that aim at a “musical-poetic quality.”
meanings, the other by the ‘musicalization’ of the words, often through paralinguistic
In the first case, the “affect” mode of setting lyrics absorbs words as expression,
merging them with the melody. Middleton explains that in this case “voice tends towards
settings, words are mainly perceived in the “poetic mode” of listening, in which certain
denotative aspects of words are captured parallel to their sonic qualities. This is the way
we listen to songs such as Kern/Hammerstein’s “All the Things You Are” and certain
In the second case, the “story” mode of setting lyrics retains the focus on the
denotative effect of words over the rhythmic and harmonic flow. In this case, words are
18
Clive James, “The Beatles,” Cream (October 1972), as quoted by S. Frith, Music for Pleasure, 122.
19
Middleton, Studying Popular Music, 228.
perceived in a “poetic mode” in which there is a preponderance of the “speech mode”—
the listener hears more of the speech qualities of the words than the sonic ones. The
straightforward discourse keeps its integrity by relegating rhythm, melody and harmony
to the background. We may find an example of this in Dylan’s “A Simple Twist of Fate.”
In the third case, the “gesture” mode, words tend to be absorbed into music at the
point of becoming sound while the voice becomes almost an instrument. In some
setting, words are perceived in a “poetic mode” in which there is a preponderance of the
“acoustic mode” of listening—the listener hears more of the sonic qualities of the words
than their speech denotative content. We may find an example of this in certain sections
During the “golden years of the Tin Pan Alley”—1910s to 1950s in the United
States—one of the song formats most often employed by composers such as George
Gershwin, Cole Porter, Irving Berlin, and Jerome Kern was a format that opened with an
setting—called the “verse,” followed by what they called the “refrain,” which was the
“real” tune, in an “affect” mode setting. This is the case in “All the Things You Are,”
composed by Jerome Kern with lyrics by Oscar Hammerstein in 1939 as part of the now
20
Ibid., 231.
21
Ibid., 228.
rarely performed musical Very Warm For May. Larry Starr and Christopher Waterman22
indicate that the origin of this verse-refrain form is the result of the fusion of the
craze of ragtime and jazz music” of the early twentieth century. After the introductory
“verse,” the “refrain” follows in AA’BA form. The A section introduces the main
melody, which is repeated with new lyrics and some slight melodic changes (A’). Then
the B section or bridge immediately follows with new musical material and lyrics. It then
finishes with the return of the A melody, usually with new lyrics and some melodic
Several talented composers explored variations on this format, but what made it
especially successful was its predictability. Peterson and Berger comment that the
the Tin Pan Alley tune formula in the market.23 Once this formula proved to be widely
Before turning to the analysis of “All the Things You Are,” it is necessary to point
out that this is the only example among the popular songs considered in this section that
presents different actors in the role of songwriter and performer. Although the focus of
their desired effects on listeners, in this specific song, it may prove productive to compare
the published score—as the only document giving testimony to Kern and Hammerstein’s
compositional intentions—with two radically different renditions of the same song: the
22
Larry Starr and Christopher Waterman, American Popular Music: from Minstrelsy to MTV (New York,
Oxford: Oxford University Press, 2003), 62, 64.
23
R. A. Peterson and D. G. Berger, “Three Eras in the Manufacture of Popular Music Lyrics,” in The
Sounds of Social Change, eds. Denisoff and Peterson, as quoted in Simon Frith, Music for Pleasure, 119.
first one sung by Ella Fitzgerald (the version used during the listening experience) and
the second one performed by Barbra Streisand. These “metteurs en scène,” as David
Laing calls these performers, approach “a song as an actor does his part—as something to
be expressed, something to get across.24 His aim is to render the lyric faithfully. The
vocal style of the singer is determined almost entirely by the emotional connotations of
the words.”25 Working through her interpretation, each singer brings her own
idiosyncratic vocal rhythmic articulations and vocal timbre nuances to the phrases of “All
the Things You Are.” By contrasting these differences, we may observe the very moment
The lyrics of this song are highly structured and abound in redundancy devices.
Mark W. Booth comments that the “repetition of phrasing in successive stanzas, where
small modifications adapt the words to a new use or effect, is the signature of the
ballad.”26 Booth considers this device not a mere stylistic convention but a mnemonic
resource related to the oral nature of the primitive ballad. Although here we are not
dealing with a traditional “oral ballad,” which resorted to redundancy in order to help the
creator and later to help singers to remember the lyrics, the internal repetition certainly
contributes to this song’s popular ballad flavor, creating a dent in its audience’s memory.
Tin Pan Alley lyrics show a prominent concern for “privacy” and “romance.” The
rapidly growing American middle class of the first quarter of the twentieth century had
elite aspirations and cared about property ownership and privacy. These kinds of interests
are reflected in some of the lyrics of this period: romantic love, a wife, a home to share.
The third person narration of the old European ballads gave way to the first-person stories
24
David Laing, as quoted by Simon Frith, Music for Pleasure, 122.
25
Frith, 122.
of Tin Pan Alley. “This first-person mode of address was reminiscent of elite poetic
forms such as sonnet, but Tin Pan Alley songwriters avoided the flowery
the listener to identify his or her personal experience more directly with that of the
singer.”27
“All the Things You Are” talks about romantic love. In the first section of “the
verse,” the three lines of the first stanza tell us of longing for something still unknown;
the next three lines of the second stanza give us the answer for each of the three needs in
the same order that they were introduced. With the conflict resolved in this introductory
section, the “refrain” of the song proceeds into a more luxurious melody that almost
Observing the published score, in the “verse,” Kern creates a colloquial sensation
style.28 Each verse is set to the same rhythmic pattern. Its pace is rather fast, especially in
comparison with the way the lyrics in the second half of the song (the “refrain”) are set.
Each line of the first two stanzas lasts two measures of 2/2 and is ten syllables long, while
at least the first two lines of the third stanza set in the “refrain” consist of nine syllables
stretched over four measures. Melodically, each line of the “verse” opens with an
a motif later in the “refrain” of the song, followed by a simple arching melody. The
26
Mark W. Booth, The Experience of Songs (New Haven and London: Yale University Press, 1981), 59.
27
Starr and Waterman, American Popular Music, 67.
28
Oscar Hammerstein II and Jerome Kern, All the Things You Are (Polygram International Publishing, Inc.,
1939)
repetitive rhythm and unattractive melodic contour allow the lyrics to take the foreground
major without too many deviations—does not distract the listener’s attention from the
lyrics either, especially once it is compared with the sequential and modulatory nature of
the “refrain” that follows. Once the action is set and resolved, the song may indulge
in a busier harmony over a static descriptive text such as the one found in the “refrain.” In
contrast, in the first section of the song, the “verse,” the text is highly structured, with
We know from several accounts of songwriter teams of this period that the music
usually came first, followed by the lyrics. The text was written to fit a previously
“All the Things You Are.” The music takes over in the second half of the song. It even
dictates the structure of the text, which molds around the musical phrasing and reinforces
certain harmonic procedures. The number of syllables changes from one verse to the
next. Also, the rhyme is loosely structured, which controls the natural tendency of
engaging with the musicality of the words in combination with the flow of the melody
and attracts more attention to a linear reading of the text. By using this tactic, the
songwriters guarantee a certain attention of the listener to the denotative content of the
lyrics.
The text remains simple and does not try to address multiple semantic levels; it
his or her “object of love.” The lyrics’ structure signals musical events such as the
The “refrain” has four sections: A –A’–B–A’’. Both A and A’ sections are made
over the G major but breaking this sequence and still keeping the two-phrase structure,
although this time the phrases are much shorter, only four measures compared to the
seven measures of the previous ones. The first phrase of the B section stays on G major
second phrase. Although this second phrase of B parallels exactly the previous cadential
progression, now everything is in E major. The A’’ section opens with the same sequence
as the one employed at the beginning of the refrain in the A section. It even starts on F
minor, but stops halfway on the Ab major of the sequence in fourths, collapsing the
previous two phrases into one of seven measures followed by a cadential coda on the new
The text of the “refrain” signals the beginning of each sequence as well as the
parallelism between them by using the same phrase “you are” both times (mm. 1 and
9 of the published score). Kern achieves this focalization by carefully setting “you
are” on two long notes, a whole and a dotted half-note, which stop the flow of tempo
in the music. The rest of the text immediately following “you are” gains a quicker pace
by fitting more syllables into each measure. Although the harmonic rhythm of the
sequence is constant—one chord per measure—the layout of the text varies in density.
While the two syllables of “you are” are spread over two measures, the remaining sixteen
syllables of the text (eighteen in A’), where the explanation of what “you are” takes
place, are crammed into the next five measures. In the context of a syllabic setting such
as this one, the pace and density over the measures will have a direct influence on the
perception of the text in general. The slower rhythmic pace allows a better understanding
of the contained lyrics. In contrast, a tighter layout produces a blending of syllables over
The result is a generalized idea of what has been said in these phrases set into the
A and A’ sections. While the listener attends to the words “you are,” the specifics of the
description of her or his “object of affection” are overlooked, and she only remembers
that this person is a series of things. The listener attends away from the precise meaning
of this description and is satisfied with the assurance that it has one without its mattering
what it is.
the alliteration devices at hand in his lyrics. The associations between similar
phonemes that are placed close to each other create a certain sustained musicality in the
lyrics. Thus, the listener tends to attend away from the meaning of the words and listen to
(A section)
You are the promised kiss of springtime
That makes the lonely winter seem long.
(A’ section)
You are the breathless hush of evening
That trembles on the brink of a lovely song.
The first line explores the alliteration between sibilant [s] phonemes that by the
second line make a counterpoint against the liquid [l]. The third line connects the
unvoiced fricative phonemes [θ] – [s] – [ʃ] in a backward movement of the point of
articulation of the tongue (upper teeth, teeth ridge, and hard palate).
words and phrases. In fact, listeners tend to grab chunks of lyrics in an imprecise non-
linear way. The usual fragmented nature of a song’s lyrics contributes to the broken way
the listener tends to grasp the text. According to Booth, these fragments behave like
catharsis,” which is a common procedure in most of the other types of discourses.29 This
does not mean that song text is shapeless but that its elaborated patterns connect to each
other in a different manner. Booth comments on the particulars of this relation through a
In song lyric, although the images and ideas may be related to a central theme or an
obvious central conceit, they tend to be isolated from each other; they accumulate
rather than develop. Rarely, in fact, does an image or thought extend beyond two
lines…the listener is rarely able to make connections of much complexity over a
longer space of time. 30
Returning to the setting of the song’s lyrics, after twice establishing comparisons
of the “object of love” with certain pieces of nature—“You are the promised kiss of
springtime” and “You are the breathless hush of evening”—Kern and Hammerstein allow
the next “you are” to move in quarter-notes over the arpeggio of G Major. This is one of
the modulatory turns that the song takes from m. 15 to m. 20. Although this seems
to break with the device of focusing attention on the phrase “You are,” a couple of
measures later (mm. 22-23), the song returns to a modified version of the emphatic tactic.
This time, the B section closes with a palindromic effect that brings back the phrase “you
are,” although this time as a closure of the lyrics’ statement. This happens at this point of
the song for two reasons. First, because the phrase “you are” has already been well
practices, as Booth comments, which “buil[d] on the fact that any word sheds its sense
upon a small number of consecutive repetitions.”31 Nevertheless, these two words still
produce their denting effect on the listener; they keep their pragmatic value while their
semantic one is almost extinguished. Second, the change in the way “you are” is
section.
progression modulates the second time around. Melodically it is also different; instead of
a seven-measure melodic phrase, there is a four-measure one. The fact that m. 25
reintroduces the same melody of the first five measures of the refrain, together with the
fact that by the fifth measure the same “some day” of the beginning of this A’’ section is
repeated, indicates that things seem simultaneously similar but different. This is a return
A’’ establishes the same game of two words as a “motto” heading a melodic
phrase that, at least in section A, is repeated a fourth down the second time around. But
new words are now used: “some day.” This not only breaks the monotony, but also puts
29
Booth, 25.
30
Ibid., 24.
the listener on alert. Contrary to his or her expectations, the listener is surprised by the
sudden return of the “motto,” “some day,” set into what seems to be a variation of the
opening melody. This acceleration of events propels the listener toward the end of the
song, expressing the hopeful wish that “some day” all the things that represent him in the
song become hers: “all the things you are, are mine.” The title of the song occupies this
(verse)
Time and again I’ve longed for adventure, (10) a
Something to make my heart beat the faster. (10) b
What did I long for? I never really knew. (10) c
(refrain)
You are the promised kiss of springtime (9) a
That makes the lonely winter seem long. (9) b
You are the breathless hush of evening (9) c
That trembles on the brink of a lovely song. (11) b
observation to make on the choice of lyrics employed as the “motto” or heading. Both
“you are” and “some day” are made up of what is known in linguistics as deictics.
Phrases made of words like “you are” and “some day” are semantically empty and
depend totally on the context of the utterance. They have the “function of situating the
31
Ibid., 39.
speaker’s utterance in a specific time and place. They do not characterize or qualify
someone or something, but ‘point to’ a person, an object, a time.”32 Deictics are used
more frequently in spoken language than in written. As Mauro Calcagno says, theater
scholars regard the high incidence of deictics in dramatic texts as one of the main factors
that distinguishes the language of theater from that of narrative or poetry. I argue in
addition that the colloquial flavor of several song lyrics is created through the extensive
The employment of the deictic phrases emphasizes the performative and oral
nature of the song’s texts. Deictics contribute to creating a kind of direct immediacy to
the audience during the act of communication, regardless of the context: whether a live
concert or a recording. In theater or opera the audience identifies with specific characters
on stage. However, the general public approaches song by identifying themselves with
different “personas” coexisting in it. Whether the singing voice is male or female, the
listener never identifies with the person being addressed, in this song with the “you” of
“you are.” On the contrary, the listener assumes the place of “I.” In the case of a
narration, the listener tends to assume the perspective of the narrator of the story. If the
perspective and opinion on the topic is not shared, the process of identification does not
take place. In contrast, if the song embodies bits of the ideals of the group to which the
audience belongs, the communion takes place. In both cases, when the identification is
with “I” or when it is with the narrator, it is the power of the human voice that invites the
32
Mauro Calcagno, “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for Musical
Theater,” Journal of the American Musicological Society 55, no. 3 (Fall 2002): 390.
33
See Booth, pp. 16-17 for further details of these arguments.
Turning now to the two recorded versions of “All the Things You Are,” these
renditions represent two very different approaches to the same song, which in turn
provoke distinctive reactions in their listeners. Of course, what is known of both artists’
careers and styles is read into these versions too. Ella Fitzgerald, diva of the “big band
era,” recorded “All the Things You Are” in 1963 for her album Ella Fitzgerald Sings
Tin Pan Alley and Jazz songwriters such as Cole Porter, Duke Ellington, and Harold
Arlen. She initiated this series of recordings under the guidance of her manager and
producer, Norman Granz, the owner of Verve Records. Especially with this “Kern” album,
Fitzgerald ventures outside the emblematic raw, energetic, jazzy vocal style of her
performances into the more polished sound of these Broadway musical tunes, which
may appeal to a broader audience. Recorded only four years later, Barbra Streisand’s
track represents the late 1960s-early 1970s style identified as “adult contemporary,”
which was an extension of the old crooner tradition.34 Streisand recorded “All the Things
You Are” on her album Simply Streisand, which was released in 1967 by Columbia
Records—and which she had been recording since 1962. This album was produced by
Jack Gold and Howard Roberts with orchestral arrangements by Ray Ellis—her long term
soloist, who is clearly the leading figure, and a “more or less anonymous” orchestra just
notes in Fitzgerald’s performance and ninety-two in Streisand’s. But the arrangements and
34
Starr and Waterman, p. 307.
35
David Brackett, Interpreting Popular Music, p. 58.
sound of the orchestras are quite different in these two recordings. First of all, Fitzgerald
does not sing the “verse” at all. She delivers the entire “refrain” and then goes back and
repeats sections B and A’’. Streisand opens with the “verse” followed by the complete
sound. This is achieved by the prominent use of the brass section playing a swing
interludes, or otherwise, punctuating certain beats with brief chords under the vocals. The
rhythm of that riff has a strong sense of swing in its ternary subdivided pattern, here
transcribed:
EX. 3.1: Rhythmic riff played by the brass section in Fitzgerald’s version of “All the Things You Are”
The arrangement is held together by the rhythm section: ride cymbal and bass. There is
Streisand’s version recreates a Latin soft jazz ballad sound by molding the piece
around a slow bossa pattern with rim shots and triangle—the latter reminiscent of
Northeastern Brazilian music. The main difference is that there are no strong swing brass
interventions in this version. The softer timbres of the string sections (duplicated at times
by backing vocals) and woodwinds are used in textures that privilege lyrical
In comparing the vocal renditions of Fitzgerald and Streisand against the
published score, we find some jazz improvisational elements. The following are the
transcriptions of the eight opening measures of the “refrain.” These have been transcribed
in the original keys in which each singer performs them. Rhythmic and melodic details
have been transcribed as faithfully as possible to show the different nuances of each
singer.
EX. 3.2: “All the Things You Are”: Two versions and published score of the A section. Each in the key
performed or published. Transcriptions by Lorena Guillen.
While Fitzgerald sings mostly on the beat—almost parallel to the score version—
with slight delays on “that” and “the” of “the lonely” and anticipations on both syllables
of “winter,” by the fourth measure Streisand is already two beats behind and most of her
rhythms are slightly modified from the score. However, it is Streisand who sings the
pitches straightforwardly and will for the most part stay faithful to the melody until the
end, with minor improvised vocalizations in the last phrases. Fitzgerald’s rendition
abounds in pitch bends between notes and scoops on almost every attack. She also
later introduces major melodic changes, such as the three repeated G4 natural pitches,
which take the place of the upwards arpeggio of G3-C4-G4 of the original in “You are
the angel glow” and the subsequent G4-Ab4-G4-E4 on “that lights a star.”
Although Fitzgerald clearly articulates every word sound in soft crooning vocal
prolongs certain consonants. She brings up those phonemes that are associated by
alliteration inside each verse, such as the [s] in “kiss” and “springtime” and [θ] and [s] in
“breathless,” or prolongs notes on the final [m] of certain words, such as “time” and
“seem.” These stretched out sounds are reinforced by a more prominent reverb effect
These particular vocal effects, together with the orchestral sound in each case,
contribute to conveying a specific type of sound, which triggers different emotional states
in the audiences. Fitzgerald keeps the swing big band style with an on-the-beat
articulation combined with a serene and pleasing vocal tone. Streisand emphasizes the
lounge bossa style with a simple and relaxed vocal tone that floats freely over the beat
getting behind in a lazy manner. So, while the former calls the listener to a comfortable
but punctuated and energizing sound experience, the expansive sustained sonorities of the
The Narrative Type: Strophic Form
narrative texts. In his typical “strophic song form” (A, A’, A’’, A’’’, etc.) is “A Simple
Twist of Fate,” a track on his 1975 album Blood on the Tracks. As an American urban
folk icon, he dragged this genre into the modern era of rock by introducing electric band
sound to his recordings and live performances, starting with his 1965 album Bringing It
All Back Home and following in July of that same year with his performance at the
Newport Folk Festival. Throughout his career, however, his songwriting style has
remained faithful to the early American folk tradition. Some of his songs have been
modeled “implicitly or explicitly, on the musical and poetic content of preexisting folk
material.” 36 Furthermore, his performing style has “demonstrated strong affinities to rural
The object analyzed in this case is the recorded track itself, conceived as the
the songwriter and performer of this track, its value and interest lies not in the features of
the song but in the unique way he sings it, his personal inflections. “The appeal of
auteurs is that their meaning is not organized around the words…in the situations his
songs portray, but in the exceptional nature of his singing style and its instrumental
accompaniment.”38
As Albin Zak points out in talking about Dylan’s John Wesley Harding (1967),
“against the contemporary trends in recording, which tended in varying degrees towards
36
Starr and Waterman, 281.
37
Ibid., 278.
the sonic opulence exemplified by Sgt. Pepper’s, and in contrast even to the ‘thin wild
mercury sound’ of Dylan’s own Blonde on Blonde album, it strips things down to an
elemental level—bass, drums, acoustic guitar, voice, harmonica, three chords, and no
obvious sonic manipulations.”39 Eight years later, “Simple Twist of Fate” returns to that
stripped sonority with his strummed acoustic guitar, bass, harmonica and a quasi-spoken
singing quality. Blood on the Tracks is a mixture of some recordings that feature this bare
sonority and others that have a more stylish band sound with arpeggios and
countermelodies between two guitars—one steel guitar played by Buddy Cage, Tony
Brown on electric bass, Paul Griffin on organ, drums, and Dylan’s own harmonica and
voice.
The lyrics of this song consist of six stanzas, each set to the same melody with a
“quasi-refrain” at the end of them. Although these last verses of the stanzas are set to the
same music and finish with the same words—which are not surprisingly the title of the
song, “A Simple Twist of Fate”—they open with a different heading each time. These
heading words state the action or verbal phrase that will affect this “simple twist of fate”:
“And watched out for a simple twist of fate;” “Moving with a simple twist of fate;” “And
This song employs a narrative type of discourse, which unfolds events in a linear
way. In the first four stanzas, the action takes place in the past. A third person, an
omniscient narrator, tells about the first encounter of a man and a woman in the past.
The second part of the song—the remaining two stanzas coming after a harmonica solo
that, in a way, marks the passage of time—takes place in the present. In the last stanza,
38
Frith, Music for Pleasure, 122.
39
Zak, The Poetics of Rock, 48.
the narrator reveals himself as the male protagonist of the amorous encounter. A short
The strophic setting that Dylan chooses goes well with the folkish story-telling
style and allows the audience to concentrate on the details of the story. The songwriter
wants the listener to focus on the lyrics without big musical distractions or fragmentation
of the linear development of the story. The traditional heavy rhyming of the verses seems
(1st Stanza)
They sat together in the park (8) a
As the evening sky grew dark, (7) a
She looked at him and he felt a spark (9) a
Tingle to his bones. (5) b
’Twas then he felt alone (6) b
And wished that he’d gone straight (6) c
And watched out for a simple twist of fate. (12) c
The rhyme scheme of the first three verses and the next two consecutive pairs,
although strong and attractive, does not distract from the main point of the story; instead,
it contributes to the narration. This rhyme scheme points to words that are key to the
story and creates a parallel narration that synthesizes and contributes to the essential
• The opposition “dark”/“spark”: first it was “dark” but then there was a
“spark” of hope in a new relationship.
This rhyme scheme, as well as the repetition and placement of the song title at the
end of each stanza, clarifies the poetic structural frame. This kind of predictable form
liberates the mind of the listener, who in this way can trust and concentrate on the linear
succession of events of the story being told. The rhyme scheme is also very regular and
its placement predictable (at the end of each verse). There are no further strong
alliterations or internal vowel rhymings that could deviate or offer alternative webs of
phonemes or morphemes. The rhyme moves the lines ahead, propelling the rhythm of the
The melodic development also contributes to this sense. The melody that is
repeated for every stanza is fifteen bars long. In contrast with the way Kern and
Hammerstein approached making “All the Things You Are,” “Simple Twist of Fate”
shows evidence that Dylan may have written the lyrics first and then set them to music.
In this case, it is the music that follows the lyrics’ structure and not the other way around.
The first three verses, which are assonantly rhymed (vowel rhyme), are set to the same
EX. 3.3: Transcription of opening three measures of Dylan’s “Simple Twist of Fate”
The next two rhymed verses—verses four and five—are set to a second melodic
phrase, which is also repeated twice to fit each one of the mentioned verses.
Again, each verse is set to a two measure melodic phrase. And the stanza will
actually keep this regular pace for the next verse to slow down only in the last one, which
is the refrain. The regular structure of the melodic phrases is evidence of the lyrics
If song lyrics resemble poetry in some way, it is in their rhyme and metric
schemes. Lyrics, as well as poetry, are created by feeling the feet—the number of accents
melody tries to fit and follow the feet and accents preexisting in his lyrics. This results in
the subdivision of beats into their proportional rhythmic values to fit the extra-syllables
of the irregular verses. Otherwise, as happens in “All the Things You Are,” the lyrics
should have been created as well-proportioned parts to fit the music exactly. For
example, the three first verses in stanza one have, respectively, 8-7-9 syllables; the first
three verses in stanza two have 8-10-8 syllables; and stanza three has 9-9-9 syllables. All
these verses are set to the same melody as is usual in any strophic song setting—whether
expressive value of the song is located. In this case, it is in the semantic meaning of the
lyrics and in the communicational value of text as carrier of denotative content. This
discourse needs to be uninterrupted and fluid to make any sense. Text as acoustical
This musical setting follows only the lyrics’ main accents and shape to make it
understandable without further prosodical details. Here the strophic song tries to solve the
inconvenience of not molding exactly to the intonational arch of the text with the
repetition of the melody. This procedure is taken at the point that the melody and its
arrangement almost completely lose their ability to surprise the listener. In this way, they
release the listener’s attention to focus on the story and its logical sequence away from
the melodic swirls of the music. In sum, the strophic setting eases the ears and mind of
the listener.
He breaks his sustained tone into a non-determinate pitch sound contour. This quality
takes over the singing, especially toward the end of each melodic phrase of the stanzas
and each time the refrain appears. By manipulating his voice in this way, Dylan
counteracts the lack of prosodical observance of his melodies to the speech intonemes of
his lyrics. The strophic repetition does not allow the flexibility of following speech
intonation arches. By speaking the lyrics, the words break free into quasi-speech.
In terms of the lyrics themselves, Dylan intends to counteract the natural tendency
of a song’s text to be processed in the “poetic mode” by: first, using predictable rhyme
schemes at the end of verses; second, avoiding further alliteration inside verses, which
could deviate the attention of the listener from the linear succession of concepts; third,
avoiding repetitive semantic schemata (beyond the repetitive refrain). Resorting to these
tactics, the songwriter minimizes the musical structures, Tsur’s “sound patterns,” of his
lyrics, and assures a propositional temporal processing similar to the one speech follows.
The left hemisphere of the brain composes speech by retrieving from memory
specified temporal arrangement.”40 The left side of the brain is usually associated with
their lyrics are remembered as wholes. As Booth says, “The parts of these units are not
pieced together tone by tone, word by word, but rather are recalled all at once as a
complete unit.”41 The appositional capacity of the right hemisphere of the brain is the
produced as intact wholes.42 This is the way listeners retrieve fragments of a song’s lyrics.
But Dylan counteracts this appositional tendency by controlling the “sound patterns” of
his lyrics and keeping the flow of the narration. He delivers the story in his usual quasi-
40
Booth, 68.
41
Ibid.
Redundancy: Variation on the “Verse-Chorus” Form
Relying on other effects, the imprint that Björk’s “Isobel” leaves on the listener is
quite different from that of Dylan’s song. “Isobel” was written by Björk, Nellee Hooper
and Marius De Vries, with lyrics by Sjón. Nellee Hooper and Björk produced it together
and released it in 1995 on Björk’s second album Post. This song offers the unusual
opportunity of comparing the sound properties of two differently mixed versions:
the first in Post, where Björk herself participated in the mixing process; and the second
made by Eumir Deodato for Björk’s 1996 CD Telegram. The latter is a remix album made up
largely of songs from Post. Björk personally commissioned nine artists
and gave them complete freedom to remix her tracks. After receiving the mixes, she went
back to the studio and re-recorded the vocals to complement these artists’ versions.
Deodato’s spin on “Isobel” opens up the texture with a straightforward pop sound
For me Telegram is really Post as well but all the elements of the songs are just
exaggerated. It’s like the core of Post. That’s why it’s funny to call it a remix album, it’s
like the opposite. It’s like the-cover-of-Post-me like this [she smiles beatifically] in pink
and orange and big ribbon and it’s like a pressie for you. But Telegram is more stark,
naked. Not trying to make it pretty or peaceable for the ear. Just a record I would buy
myself. (Like a letter to yourself?) Yeah, more, sort of...fuck what people think. It’s a
truth thing. Which is maybe a contradiction because it’s other people’s remixes. (Blah
Blah Blah, December 1996)43
In her Post version of “Isobel,” Björk and De Vries play keyboard over a
rhythmic base of “ethnic” percussion also programmed by De Vries. Deodato and Björk
add a string arrangement. What is radically different between this original version of
“Isobel” in Post and the one remixed by Deodato in Telegram are the volume levels in the
mix and the inclusion or suppression of certain recorded instrumental tracks. From the
42
Ibid., 69.
opening sustained harmonic string sequence with trumpet solo in Post, Deodato only
keeps his own arrangement of strings. The softer and diluted “ethnic” percussion is
replaced by a pop drum-set pattern that is brought quite prominently into the mix. The
original bass, which was muffled and back in the Post mix, is replaced by a “funkier”
bass that is also up front in the Telegram mix. All the programmed sequences and
keyboard sounds of Post are stripped out in Deodato’s mix. This now clean-cut pop track
directly affects the way in which Björk herself interpreted her vocals when she
re-recorded them after the new mix. She goes for a less affected vocal inflection of the
lyrics. Her voice also is mixed with less processing, a more “in-your-face” sound. The
straightforward sound, which contrasts, like an ironic comment, with the still hermetic
lyrics.
43
See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas
Ltd.
44
Here the term “verse” is used to denote a complete strophe of a song. This is the most commonly used
term among popular music songwriters to refer to this part of a song. In this section of the dissertation, it
will be introduced in quotation marks each time it is employed in this way.
EX. 3.6: Transcription of all the sections of Björk’s “Isobel”
Although the song has three strophes or “verses,” it repeats the chorus five times.
After its third reappearance, instead of being sung to the same lyrics, the chorus’ melody
is performed on babbling: “na, na, na…” Another device to add variation to the
Between the second and the third chorus the first “bridge” is introduced, which brings a
“climb” of two “verses” attached at the end. The bridge itself is 3 “verses.” The second
45
The “bridge” is a fresh new section inserted to offset the predictable verse/chorus pattern: “…a bridge
works to provide contrast in lyrical content, meter, and melody.…Lyrically speaking, your bridge doesn’t
justify its existence if it merely restates a fact we’ve been told. Ideally, a bridge adds dimension to a lyric
by expanding the content of the verse or chorus, or by giving new insight into the singer’s feelings.” Sheila
Davis. The Craft of Lyric Writing. (Cincinnati, Ohio: Writer’s Digest Books, 1985), 57.
A “climb” gives the “verse/chorus” song a fresh contour. Although it is also new material, it is usually
shorter than a “bridge” section: “…a climb is a couplet (two rhymed lines in the same meter) which pull
away from the verse both verbally and musically, and reach up toward the chorus. A climb functions as
aural foreplay, to extend and increase the song’s emotional tension by delaying the arrival of its climatic
section.” Davis, 55.
time that the bridge is introduced is in between the fourth and fifth chorus. The song
unfolds under a well-planned structure full of musical repetition and text redundancy.
The lyrics, although rather hermetic, describe the character of Isobel. But the
obscure images depicting her personality contrast with the colorful and well-planned use
of rhyme, alliteration, parallelism and word repetition. This gives the verse a quite
attractive musicality. This attraction does not overlook the semantic content of the words
in favor of their musical possibilities. If something detracts from this otherwise inevitable
example the refrain at the end of each strophe, chorus, and bridge:
(Chorus)
My name Isobel d (5)
Married to myself d (5)
My love Isobel d (5)
Living by herself. d (5)
(Chorus)
(Bridge)
When she does it she means to f (7)
Moth delivers her message g (7)
Unexplained on your collar h (7)
Crawling in silence i (5)
A simple excuse. j (5)
(Chorus on “Na, na, na…)
(Bridge)
Each of the three “verses” opens with an assonant rhyme between the first two
lines, but the third one rhymes with the third line of each “verse.” These three words
connected at a distance form an interesting imaginary scheme throughout the song. This
scheme becomes a kind of parallel or alternative sub-line plot that describes a possible
scenario for the description of Isobel: flame—scares—hell. The “verse” closes with one
of the first instances of word repetition, “like me, like me,” which is the refrain that not
only repeats twice there, but appears at the end of each “verse” in the same couplet form.
The chorus takes to an extreme all the versification devices at hand. Its four
“verses” are metrically regular: five syllables each. This gives regularity, balance and a
perfect rhythmic pattern. On top of that, the rhyme scheme is also extremely regular. Not
only do the four lines share an assonant rhyme among them, but line one has perfect
rhyme with line three in the same manner as line two with four. In addition to these
Similarities in ideas are brought to the surface by the similarities in sound and
grammatical construction of the two parallel phrases that make up the chorus.
Subject Verb Object (direct or
indirect)
Line 1 and 2 My name Isobel Married to myself
TABLE 3.2: Layout of parallel lyrics constructions in the chorus of Björk’s “Isobel”
The first and third lines also open with an anaphora46 that starts only with the repetition
of the opening possessive pronoun “My” and, although the noun is different, is followed
The second “verse” follows immediately after this chorus, but the perfectly
balanced sound of the words of the chorus keeps resonating in the ears of the listener.
in the first and second lines and third and fourth lines are set over a melody that repeats
By the time the song moves into the second “verse,” the ears of the listener are
already attuned to a musical way of listening to the lyrics. The alliteration of consonants
between the first two lines and inside the third one is more prominent than any other
effect. “In a heart full of dust/ lives a creature called lust” encloses the alliteration of
Inside the third line, the reiteration of [s] and [r] creates a sensation of smooth continuity
in these sounds and over those that are in between them: “It surprises and scares.”
46
Repetition of a word (or like-sounding words) or a short phrase at the start of successive lines or verses
(Davis, p. 143).
The refrain “Like me” holds our attention because of its redundancy and exact
between the persona of Isobel and the other, who may be one.
The second time the chorus comes around with its features and effects intact, but
musical and textual material is more than welcome. Usually the function of the “bridge”
is to introduce further information still not provided in the “verses,” a new perspective
and/or to offset the musical predictability. The “bridge” should also offer contrast in its
meter and versification compared to the verses and chorus. Certainly this is the case in
“Isobel.” The lines are now seven syllables long and have free rhyme. There are no new
Björk also proposes a musical change in this “bridge.” The melodic setting of
each line consists of a first phrase that carries part of the lyrics, followed by a second
period that is a vocalization of its melodic contour on a babbling “uh, uh, uh…” Attached
at the end of this bridge is a couplet that functions as a “climb” toward the next chorus.
Although climbs normally “pull away from the verse both verbally and musically,
and reach up toward the chorus,” this one is not pulling from known material, but from
the newly introduced bridge’s music.47 This couplet anticipates the five-syllable length of
the chorus lines in an open melodic phrase that does not find its harmonic and melodic
closure until the first notes of the following chorus. Thus, it creates the expectation and
47
Sheila Davis, The Craft of Lyric Writing (Cincinnati, Ohio: Writer’s Digest Books, 1985), p. 55.
This third chorus is sung on a babbling “nana na nana, nana na nana…” By this
time the lyrics of the chorus, which have been repeated twice, have probably been
imprinted on our memory, and the playful musicality of the text has contributed to this
sense. But more than the tangled semantic meaning, what the listener remembers is the
The conscious or unconscious choice of the babbling syllable “na” also has
certain emotional implications. Part of the delight that any listener feels, and even the
performer experiences while singing, is situated in the regressive value and childhood
connotations that those phonemes evoke. According to Roman Jakobson, the emotional
charge of each phoneme is proportional to the amount of time that the infant has used it
during the prelanguage babbling stage of language development.48 So, any association with
sounds out of their syntagmatic or referential relation to a linguistic sign (a word) refers
us to that period. Nasal phonemes are among the latest acquisitions of children into the
vocabulary, but are part of the mass of sounds used in their onomatopoeia and emotional
Only after a close listening and analysis of the recorded material were Björk’s
comments about the story line of Isobel taken into account. And, surprisingly, as I came
to believe after listening to it, the “na na na nana” proved to be the expression of an
instinctive impulse. In an interview for MTV’s Eurotrash in 1995, Björk tells the story in
this way:
This is the story of Isobel; she was born in a forest by a spark, and as she grew up, she
realized that the pebbles on the forest floor were actually skyscrapers. And by the time
she was a grown-up woman and the skyscrapers had taken over the forest. She found
herself in a city, and she didn’t like all the people there so much, because they were a bit
too clever for her.
She decided to send to the world, all these moths, that she had trained to go and fly all
over the world and go inside windows of people's houses—the ones that were too
clever—and they’d sit on their shoulder and remind them to stop being clever and start to
function by their instincts. They do that by saying “Nah-nah-nan-nah-nah!” to them...
(Björk waves a finger in front of her face)
...and then they’d say “Oh! Sorry! I was being all clever there!” and start functioning on
instinct.49
The rest of the song proceeds with the same tactics described previously. Finally,
what the song offers musically—the rich realm of its text webs and musical events—to its
distortions or fragmentations. Björk is certainly looking for this kind of musical sonic
experience.
Peter Gabriel’s “Sky Blue,” from his 2002 album Up, shares certain similarities
with Björk’s “Isobel.” Both rely on redundancy and repetition as a way of creating a grid,
which the listener can use to guide him/herself along the fragmented lyrics. But this is
approached differently in each song. Although Gabriel’s lyrics seem more accessible than
Björk’s, they are long and stretched over lengthy periods. This accessibility is due to the
fact that each fragment of Gabriel’s lyrics corresponds to a line of the strophe and a
single musical phrase. Each of these fragments carries a single idea or concept. This
procedure reestablishes some of the integrity missing in the lyrics as a whole. Björk’s
ideas and images stretch over more than one musical phrase or longer musical phrases
48
Roman Jakobson, Child Language, Aphasia, and Phonological Universals (The Hague: Mouton, 1968).
49
See http://unit.bjork.com/specials/gh/SUB-12/index.htm. Copyright © 1995-2007 by Björk Overseas Ltd,
(accessed April 16, 2007).
The song is almost seven minutes long. A skillfully crafted handling of the long-
term musical and lyric structure is what makes this song work. I argue that what holds
this song together are mainly its musical elements and not so much the linearity of its
being told becomes secondary. The listener probably ends up singing the vocal riff of the
chorus (described below) and the “group answer” on “sky blue.” The following diagram
At the level of the lyrics, Gabriel offers less repetition of whole sections than
Björk, with the obvious exception of the two words of the title, “sky blue,” which are
introduced every other line of the “verses.” The simple instrumentation of the band and
bare chord accompaniment of the very first “verse” provide a chance to understand the
opening lines clearly. The first and fifth lines resort to another highlighting literary
device, parallel grammatical constructions such as “Lost my time/lost my place in” and “I
know how to fly/I know how to drown in.” The first, second and third lines have internal
(Verse 1)
Lost my time lost my place in
Sky blue
Those two blue eyes light your face in
Sky blue
I know how to fly, I know how to drown in
Sky blue.
These kinds of structures keep recurring in the next verses. The second “verse”
holds a literary palindrome construction at a semantic level: “I sing through the land, the
land sings through me.” It also contains the alliteration of the phonemes [w], [m] and [n]
on “warm wind blowing.” The third verse ends each of its lines with an assonant final
(Verse 2)
Warm wind blowing over the earth
Sky blue
I sing through the land, the land sings through me
Sky blue
Reaching into the deepest shade of
Sky blue
(Verse 3)
Train pulled out said my goodbyes
Sky blue
Back on the road alone with the sky
Sky blue
There’s a presence here no one denies
Sky blue
By the second “verse,” however, all of these literary and versification devices are
almost overlooked. The only structural repetition that appeals to the listener’s attention is
the group vocalization of “sky blue.” These two words are sung by a choir that answers
the opening melody proposed by the soloist, Gabriel. The first time around it is actually
Gabriel who answers his own “calling.” From the second “verse” on, a group of voices
takes over the “sky blue” response. This procedure reproduces the typical “call-and-
response” form of several Afro musical styles. This “sky blue” response also acts as the
“hook” in the song. The hook is the repeated section of the song that more often than not
contains the song’s title, but it could also be a melodic phrase. In this case, the song
makes use of both things: a particular melodic snippet that answers the previous “call,”
the melodic opening phrase, the lyrics of which are the song’s title.
The hook serves the function of snagging the listener into the song and grabbing
his/her attention. The hook remains in memory even after the song is over. Booth says
“self-reference is often visible in the verbal form of…[a] hook, returning upon itself as a
expected context.”50 The lyrics of this particular hook, “sky blue,” do not hold any
paradox or absurdity but only a certain innocent redundancy. Although at the beginning
of the song, “sky blue” is the grammatical continuation of a phrase started in the
preceding line, by the second or third verse this discursive connection is discontinued.
“Sky blue” holds a loose and indirect relation to the previous phrase: “warm wind
blowing over the earth/sky blue/I sing through the land, the land sings through me/sky
blue;” “Train pulled out said my goodbyes/sky blue.” Musically speaking, however, the
little melodic phrase of “sky blue” keeps connecting as the closing answer to the
“calling.” The following score of the first verse shows this musical procedure:
50
Booth, 179.
EX. 3.7: Transcription of the opening eleven measures of Gabriel’s “Sky Blue”
The chorus, a typical Western musical element of the song form, actually makes
use of another Afro musical device. This time, the group of voices sings a four-measure
riff over which the soloist performs a quasi-parlato melody that gives the impression of
being improvised. The fragmented and scattered nature of its melodic contour provides
its improvised character. Of course, the force of tradition has some influence on what is
perceived and the fact that improvisation is what is expected in this kind of musical genre
—or at least the style that the song is trying to evoke—reinforces the way we listen to it.
The following score shows the progressive overlapping of these vocals in chorus
1 and 2.
EX. 3.8: Layout of solo and vocals in the first chorus of Gabriel’s “Sky Blue”
EX. 3.9: Layout of solo and vocals in the second chorus of Gabriel’s “Sky Blue”
As can be observed in the score above, the riff is a four-measure pattern in 3/4
made of two phrases built on the same harmonic progression that repeats
throughout the chorus: C# minor–A major–B major–G# minor. These two phrases are
identical with the exception of their resolution, alternately the last note on G4 or F4. By
the second chorus, the increase in repetitions of the complete four-measure version and
the earlier introduction of the riff contribute to the shift of the listener’s attention from the
(Music) surely provides the shortest, the least arduous, perhaps even the most natural
solvent of artificial boundaries between the self and others…The words of folk song…are
not directed by one person to another or by many persons to many others; the voice is
that of the group…there is no “other being,” no mere listeners…If one member happens
to lead the chorus, his words are certainly not addressed to the others…He does not tell
them anything they don’t know; he does not speak to the others but for them.51
After the first three times that the soloist proposes the “sky blue” answer, we are
invited to join and participate in this collective act, vocalizing these words. From then on,
it becomes a habit to participate in the game through the riff of the chorus until the end,
necessary for practical reasons to limit my exploratory subjects to college students. These
undergraduate students of Hartwick College, a small private liberal arts college in upstate
New York, were between eighteen and twenty-three years of age (with the exception of
three older students between thirty-four and forty-five years old). Some of these students
were music majors (twenty-four people) and others were from other degrees such as
visual art (twelve people) and modern languages (ten people). Thus, the analytical group
embraces a more comprehensive universe of people and interests beyond music, and the
51
Booth, 18-19. From Victor Zuckerkandl, Man the Musician, trans. Norbert Guterman. Bollingen Series
selection of music students could be questioned because of the possible conditioning by
their musical training. Interestingly enough, the results proved to be similar among all the
groups. For this reason, the data was processed and presented all together, and not
separated by group.
questionnaire. Every group listened to the same recorded songs and same versions of
them. The recordings used in this experimental project were the same songs analyzed in
the previous section: Fitzgerald’s version of “All the Things You Are”; Bob Dylan’s
“Simple Twist of Fate”; Björk’s “Isobel”; and Peter Gabriel’s “Sky Blue.”
because the listening action was not observed in its ordinary place but an artificially set
one, a classroom; and non-structured because all the variables intervening in the act of
listening were not controlled during the experience. Faced with the impossibility of
observing people in their natural environment where they perform the listening as an
everyday activity (concert, home, ambient music in cafes or any other social situation in
which music is encountered), the experience took place in classrooms where the subjects
were asked to listen to the songs without any specific instruction. Only after the pieces
were played once were they asked to open the page in front of them and read the
questions.
At all times the intention was to minimize the artificiality of the situation and
reproduce as much as possible the casual listening that people experience in everyday
life. Giving the listeners a blind first exposure to the songs, and keeping them unaware of
the actual goal of the experience by hiding the particular questions, provided a chance for an
objective result. Although a certain kind of general attentive listening may have been
taking place, at least the subjects were not guided from the beginning of the experience to
way.
Finally, the same five questions were presented to each group. These questions
were designed in a non-systematic and open manner. The listeners did not have
pre-formulated answers to choose from. This was meant to ensure that their responses were not
influenced or narrowed by any outside instruction. Each listener volunteered their
free-form responses, which I later grouped, noting the frequency of each. Thus, these
variables found in the following tables are not a mere listing of the people’s answers, but
The following is a list of specific answers that the listeners wrote down as lyrics
During this first exposure to the four songs, the listeners showed a special interest
in the particularities of the performers’ voices and the sound of the bands of each track.
From the few phrases that they could remember, we gather that these were in strategic
places of the songs. They were either part of the songs’ refrains or part of the choruses.
These are sections that usually work through the song by melodic and harmonic
repetition, creating a dent in the listeners’ memory with their recurrent musical schemata.
Furthermore, when we cross-reference the information obtained from the analysis of each
song with these results, it is possible to observe that the words or phrases that the
constructions, manipulation of the rhythmic pace of the piece over those words, or simple
lyric redundancy (recurrence of the same word). In the particular case of Gabriel’s “Sky
Blue,” the hypothesis proposed during the analysis was confirmed: the backup vocals
performed alongside Gabriel’s solo in the chorus were a main focus of the listeners’ attention.
Question 2: Why do you think you were able to remember specifically that?
The second question proved to be the most difficult to categorize and organize in
separate variables: first, because it is completely linked to the first question, and second,
because the many alternative combinations of factors presented a challenge at the
moment of synthesizing them into more embracing categories. But it proved useful in
confirming tendencies already marked in the answers given to the first question. The
reasons given by the listeners coincide with the information obtained in the songs’
analyses section. Thus in the particular case of “All the Things You Are,” the listeners
found themselves attracted to Fitzgerald’s vocal tone because of the richness of its color.
Dylan, Gabriel and Björk directed the attention of the listeners towards the chorus or
refrain of their songs by a crafted handling of the form which creates momentum and
Question 3: Is the song telling some kind of story? Briefly describe what is about.
The answers to the third question show the difficulty of grasping the meaning of
the lyrics after only one listening. Most of the listeners said that they could not remember
what the song was about or they did not know. The only two songs that seem to be more
accessible in a first time listening situation were “Simple Twist of Fate” and “All the
Question 4: How does the song make you feel? Why? Which elements of the song put
you in that mood? Musical elements, voice quality, words?
Sky Blue | no answer | reflective/spiritual | peaceful/relaxed | deep/intense | melancholic | good mood | yearning | problematic/melodramatic | uplifting
no musical element related: 2 1 1 1 1 1 1
no answer: 1
soothing vocal quality: 2
beat and tempo: 1 1 2
chorus: 1 1
because of the accompaniment: 2 2
Isobel | relaxing | involved/attracted | sad/dramatic | good | sleepy/in trance | angry
band/accompaniment: 6 1
pulsating beat: 2 1
her voice/her vocal quality: 2 1 1
general pace of the music: 1 1
repetition of sections and melodic material: 1
All the Things You Are | light/happy | like dancing | good | like singing tunes in a bar
no musical element related: 1 1 1 1
upbeat/danceable: 1 2 2
major chords: 2
singer’s vocal interpretation: 1
musical style: 1
TABLE 3.10: Results from question #4 on Fitzgerald’s version of “All the Things You Are”
Question four required a separate table for each song. The answers regarding
moods and sensations that arose while listening to the songs were far too many and
individual to each piece. This question also received the most ambiguous and personal
answers.
This question had an ulterior purpose. Asking about the mood or emotional state
was only a way of obtaining the information actually sought: which elements of the song
were the listeners paying attention to? The results of this question complement those
from the first and second questions, where listeners were requested to tell what they
remembered from the songs they had heard and why. Seeking the same information from a new
angle confirmed tendencies already marked in those two previous questions. Except for
“Simple Twist of Fate,” where the lyrics seemed to be one of the elements with some
influence, listeners pointed out that they got into certain moods while listening to the vocal
tone of the performers, the band sound and the accompaniment, or the beat of the song.
Question 5: Does the story of the song have a protagonist? Who is speaking to you in
this song? Who is narrating or describing the situation?
Question number five is directly related to how much of the story or description
the listener was able to grasp. It was too complex for listeners to determine the nature and
identity of each song’s protagonist or narrator in a first listening. The fragmentary story
gathered in this brief exposure did not provide sufficient information. The answer could
be established only after a careful attentive listening to the lyrics of each song. In the act
around, time to go back mentally, and link previous statements to arrive at a satisfactory
conclusion.
Because of the nature of the question and the answers received, tables do not
clearly translate the information obtained. In this specific case, it is more appropriate to
proceed to describe and group the results in the following paragraphs and then show the
The formulation of the question itself is vague and imprecise, but that decision
could be justified by the need to not influence or direct the answers of the listeners. Its
outcome was of special interest for this exploratory project. It actually proved how much
further attentive listening is needed to comprehend text at the level required to resolve
engenders in itself two enigmas: first, the difference between the voice singing and the
first person in the narration; second, the difference between the protagonist of the story
and narrator. These could coexist in one person or they could be three different people.
Of the people questioned, all but one failed to notice the switching of the
narrator in Bob Dylan’s “Simple Twist of Fate.” The first five verses are narrated in third
person as if the story of these two lovers was told by somebody else. The last verse
switches to the first person; the narrator becomes the protagonist (the male lover). Only
Some subjects established a difference between when the singer was talking from
a personal experience and when she was interpreting and voicing a fictional character.
One could allege that such discrimination is the result of associating authenticity values
with certain musical styles more than others. For example, folk or grassroots-influenced
musical styles such as Dylan’s song are expected to be sincere, personal and intimate
story-telling of the singer’s past experiences, while pop singers could take different
Of the songs used in the experiment, “All the Things You Are” is the only
one in which the composer and author were different from the singer: the composer is
Jerome Kern and the singer Ella Fitzgerald. But that does not mean that the author of the
lyrics and the narrator (the “I” first person of the story or protagonist) are the same
person. In the other three cases, the singers are also the composers: Bob Dylan’s “Simple
Twist of Fate,” Björk’s “Isobel,” and Peter Gabriel’s “Sky Blue.” And again they may or
The most problematic and unexpected answers were the straight “yes” and “no.”
The “yes,” besides not providing any specification of who they think is the protagonist,
does not give a precise idea if they really understood something from the story. They
could be assuming, as a generalized idea, that any story has a protagonist by default. But
that is not the only option. The lyrics of a song could be unconnected
ideas, a description, loose words, or the perspective of who is talking in this song could
be very vague, unpredictable or completely absent. The “no” answers did not provide any
“Isobel” was, according to the results, the most confusing song of all. Most of the
people did not know who was speaking or who was the protagonist, and the others
directly said that there was none. Observing the answers to some of the previous
questions about this same song, it appears that the listeners did not grasp the lyrics of this
particular song and their attention was mainly devoted to other aspects of the piece.
three, the listener needed at least a second listening. The second time the listeners were
prepared to pay attention to certain aspects of the lyrics. Text was listened to in an
Question 6: “After listening for a second time to the same songs, do you feel you grasped
more of the meaning of the lyrics? Why? Only because you have a second chance or
because you pay more attention guided by the questions? What is the story about in every
song?”
In their answers to question six, the last one of this experiment, the
listeners confirmed that only after this second hearing could they start to understand the
songs’ content. They also admitted that this time they paid more attention, guided by the
questions they already knew. Some of them specifically pointed out that they usually do
not pay attention to the lyrics the first time they listen to a song. They immerse
themselves in the music: the singing, the band, the melody, the instrumental solos, the
harmonies. Only after repeated listening do they feel they concentrate on the lyrics.
As the end result of this experiment, we can conclude that although listeners do
not completely ignore the songs’ lyrics, they tend to remember only certain isolated
words or short phrases. These lyric fragments are usually part of choruses, refrains, or
short motives that work through the song by melodic and harmonic repetition, on top of
the repetition of the words themselves. Only after listening attentively several times
may people grasp the meaning of the lyrics. Otherwise, their attention is
diverted mostly toward sonic aspects of the performer’s voice, the band, or the
V. CONCLUSION
conscious or instinctive knowledge of how people tend to listen to vocal music. They
manipulate their textual and musical materials either to compensate, reinforce or oppose
Faced with the challenge of the unavoidable fragmentation of text under any kind
of musical setting, songwriters and composers emphasize words from their lyrics or
poetic texts that they hope will help listeners to create their own narratives. Although the
songwriters and composers mold this emphasis on certain words according to their own
readings of their texts, the listener will reinterpret the text through his or her own reading,
to create a possible meaning for the song to which he or she is listening.
The vocal examples analyzed in this dissertation make the case for how different
compositional approaches act on the way people listen to text set to music. I start from
the idea that any musical setting of text produces a natural disruption of the discourse.
Even monody and recitative, with their bare settings, produce a certain degree of
disruption. The fact that they mount the words on sung tones already invites the listener to
shift her or his attention to the “songfulness” of the performing voice—paying attention
mode” of perception, which privileges the “speech qualities” of the text. In their score
settings, they follow the prosodic characteristics of the text phrases as closely as possible.
On the one hand, they set the text phrases to melodic patterns that mimic their intonation
shapes pitch-wise. On the other hand, they also replicate, with similar musical rhythmic
patterns, the prolongation of accented syllables and the shorter values of the syllables
in between. But the shapes and rhythm indicated in the score only
serve as points of departure for the real interpretation of the performer. It is in this
instance that both monody and recitative become quasi-speech. Any seventeenth-century
expects flexibility in the tempo, without a steady beat in the performance of their settings.
This beat fluctuation gives the performer the chance to vary the articulation pace
according to her or his dramatic interpretation of the different phrases, creating the
mode” operates. Although Reichardt and Zelter proclaimed their intention of making
their musical settings a natural extension of the text, allowing it to speak for itself, and
of limiting their musical intervention as much as possible, their settings are “songful”
renditions far from any speech quality of the text. Their simple melodies—constrained in
have any speech quality but those of any other sung melody. The absence of changes to the
text order and of piano interludes aims to limit the disruption of the narrative flow.
Although the repetitious nature of their strophic settings and their unobtrusive chordal
accompaniments allow listeners to relax their attention to the purely musical
elements of the song, listeners still apprehend the text in the “poetic mode” of listening. The
listeners hear the original sound patterns of the poem mounted over the “songfulness” of
the sustained tones of a voice singing a melody. Dylan’s “Simple Twist of Fate” operates
in a similar manner to Reichardt’s and Zelter’s settings with the strophic form holding his
own lyrics.
colorful accompanying textures and harmonies and text modification. By the same
means, they also built musical forms that directed the attention of the listener to certain
specific words or phrases in their texts that synthesized the meaning behind the narration.
delaying this phrase with piano interludes, repeating it several times, detaining the
overemphasizing the sonic aspects of language, they concretely manifest what we hear
and how we hear it. In the hands of Monk, this process explores the human voice’s
timbral and gestural possibilities as deployed by any kind of syntactic text. Berio starts
from literary sources subjected to fragmentation and masking processes, which interfere
with their intelligibility. By breaking words into their phonetic components, overlapping
multi-texts, and exploring different masking vocal gestures and paralinguistic sounds, he brings
awareness of speech elements that for the most part we unconsciously hear but do not pay
The four popular songs analyzed in the final section introduce four different
effect. What do songwriters have to do when they want to give a place to the narrative of
the lyrics? And what do they have to do when they want to indulge in this “songfulness”
effect and engage their audience in an experience of emotional and physical involvement
ignore text in songs, but they tend to remember only certain isolated words, mainly for
musical reasons. These words tend to occupy a prominent place in the songs, by 1)
repetition of the word itself, 2) highlighting techniques such as slowing down the
rhythmic pace over the words, 3) repetition of the melodic motive over which the words
The songwriters of the four songs used in the listening experiment—the same
and a crafted use of them. The highly structured song form of the Tin Pan Alley “All the
Things You Are” gives the listener a certain musical predictability but otherwise, like any
other song, resorts to musical manipulation to direct the attention of the listener toward
certain memorable phrases of the lyrics. Additionally, as we were able to observe,
Fitzgerald’s and Streisand’s versions bring out different qualities in the same song. In
“Simple Twist of Fate,” Dylan acts on the “songfulness” effect and musicality of the
abundant sound patterns of his lyrics by using a repetitive strophic setting and a
“quasi-speech” vocal quality in his performance. Björk and Gabriel anchor their songs, “Isobel”
and “Sky Blue,” on certain phrases, such as refrains, or on engaging vocal riffs and
choruses, which for the most part are sung on nonsense syllables. The listeners are taken
Aside from paying attention to the strictly musical elements of the piece, listeners
predominantly perceive the sonic or musical aspects of its lyrics: the colors of its
phonemes; the prosodic arch of its phrases’ intonation; the sonic quality of the
performing voice; the specific colors and inflections that the voice adopts at each phrase.
Composers know that audiences engage with their songs through these musical
gestures resulting from the alchemy of music and words. Even Goethe, despite his
caution against oversensitive and overcomplicated settings of his poems, had to admit
that only when words are set to music “is the poetic inspiration, whether nascent or fixed,
sublimated (or rather fused) into the free and beautiful element of sensory experience.
Then we think and feel at the same time, and are enraptured thereby.”1 Then we hear the
1 Goethe to Zelter, 21 December 1809; quoted in Eric Sams and Graham Johnson, “Lied (IV),” in New
Grove Dictionary of Music and Musicians, 2nd ed., vol. XIV (New York: Grove, 2001), 672.
BIBLIOGRAPHY
Aiello, Rita and John Sloboda, eds. Music Perception. New York, Oxford: Oxford
University Press, 1994.
Agawu, Kofi. “Theory and Practice in the Analysis of the Nineteenth-Century ‘Lied.’” In
Music Analysis 11, no. 1 (March 1992): 3-36.
Barthes, Roland. The Grain of the Voice: Interviews 1962-1980. Berkeley and Los
Angeles: University of California Press, 1985.
Berger, Karol. A Theory of Art. New York, Oxford: Oxford University Press, 2000.
Berio, Luciano and Swingle II. A-ronne. London: Decca, HEAD 15, 1976.
Beethoven, Ludwig van. Lieder und Gesänge mit Klavier. München: G. Henle, 1992.
Bolinger, Dwight. Intonation and Its Uses: Melody in Grammar and Discourse. Stanford,
California: Stanford University Press, 1989.
Booth, Mark W. The Experience of Songs. New Haven and London: Yale University
Press, 1981.
Brackett, David. Interpreting Popular Music. Berkeley, Los Angeles, London: University
of California Press, 2000.
Calcagno, Mauro. “‘Imitar col canto chi parla’: Monteverdi and the Creation of a Language for
Musical Theater.” In Journal of the American Musicological Society 55, no. 3 (Fall
2002): 383-433.
Cone, Edward. The Composer’s Voice. Berkeley, Los Angeles, London: University of
California Press, 1974.
Cone, Edward. “Words into Music: The Composer’s Approach to the Text.” In Sound
and Poetry. New York, London: Columbia University Press, 1957.
Dalmonte, Rossana and Bálint András Varga. Two Interviews/Luciano Berio, trans.
David Osmond-Smith. New York: M. Boyars, 1985.
Dame, Joke. “Voices Within the Voice: Geno-text and Pheno-text in Berio’s Sequenza
III.” In Music/Ideology: Resisting the Aesthetic, ed. Adam Krims. Amsterdam: G&B
Arts International, 1998.
Daverio, John. Robert Schumann: Herald of a “New Poetic Age.” New York-Oxford:
Oxford University Press, 1997.
Davis, Sheila. The Craft of Lyric Writing. Cincinnati, Ohio: Writer’s Digest Books, 1985.
Dreßen, Norbert. Sprache und Musik bei Luciano Berio: Untersuchungen zu seinen
Vokalkompositionen. Regensburg: Bosse, 1982.
Duckworth, William. Talking Music. New York: Simon & Schuster Macmillan, 1995.
Fitzgerald, Ella. Ella Fitzgerald Sings the Jerome Kern Song Book. Verve Records 314
519 847-2, 1993.
Forte, Allen. The American Popular Ballad of the Golden Era, 1924-1950. Princeton,
N.J.: Princeton University Press, 1995.
Frith, Simon. Music for Pleasure: Essays in the Sociology of Pop. Cambridge, Oxford:
Polity Press, 1988.
———. Performing Rites: On the Value of Popular Music. Cambridge, Mass.: Harvard
University Press, 1996.
Fubini, Enrico. A History of Music Aesthetics. London: The Macmillan Press Limited, 1990.
Iser, Wolfgang. The Act of Reading: A Theory of Aesthetic Response. Baltimore and
London: The Johns Hopkins University Press, 1978.
Jakobson, Roman. Child Language, Aphasia and Phonological Universals. The Hague:
Mouton, 1968.
Jowitt, Deborah, ed. Meredith Monk. Baltimore: The Johns Hopkins University Press,
1997.
Kern, Jerome and Oscar Hammerstein II. All the Things You Are. Polygram International
Publishing, Inc., 1939.
Kramer, Lawrence. Music and Poetry: The Nineteenth Century and After. Berkeley:
University of California Press, 1984.
———. Musical Meaning: Towards a Critical History. Berkeley and Los Angeles,
California: University of California Press, 2002.
Lewin, David B. “Figaro’s Mistakes.” In Engaging Music: Essays in Music Analysis, ed.
Deborah Stein. New York-Oxford: Oxford University Press, 2005.
Liberman, A.M., and David Isenberg. “Duplex Perception of Acoustic Patterns as Speech
and Nonspeech.” In Status Report on Speech Research SR-62. Haskins Laboratories
(1980): 47-57.
Liberman, A.M., I.M. Mattingly and M.T. Turvey. “Language Codes and Memory Codes.” In
Coding Processes in Human Memory, ed. A. Melton and E. Martin. New York: Winston,
1972.
MacClintock, Carol, ed. The Solo Song 1580-1730. New York: W.W. Norton &
Company, Inc., 1973.
Middleton, Richard. Studying Popular Music. Philadelphia: Open University Press, 1990.
———, ed., Reading Pop: Approaches to Textual Analysis in Popular Music. Oxford,
New York: Oxford University Press, 2000.
Minsky, Marvin. “Music, Mind and Meaning.” In Music, Mind and the Brain: The
Neuropsychology of Music, ed. Manfred Clynes. New York, London: Plenum Press,
1982.
Monelle, Raymond. The Sense of Music: Semiotic Essays. Princeton and Oxford:
Princeton University Press, 2000.
Mozart, W.A. Don Giovanni. Opera completa per canto e pianoforte. Milano: Ricordi,
1946.
Neubauer, John. The Emancipation of Music from Language: Departure from Mimesis in
Eighteenth-Century Aesthetics. New Haven, London: Yale University Press, 1986.
Osmond-Smith, David. Berio. Oxford, New York: Oxford University Press, 1991.
Palisca, Claude V. Music and Ideas in the Sixteenth and Seventeenth Centuries. Chicago:
University of Illinois Press, 2006.
Tsur, Reuven. What Makes Sound Patterns Expressive? The Poetic Mode of Speech
Perception. Durham and London: Duke University Press, 1992.
Reichardt, Johann F. 31 Lieder, Oden, Balladen und Romanzen. Huntsville, Tex.: Recital
Publications, 2000.
Schachter, Carl. “Motive and Text in Four Schubert Songs.” In Engaging Music: Essays in
Music Analysis, ed. Deborah Stein. New York-Oxford: Oxford University Press, 2005.
Sams, Eric and Graham Johnson. “Lied (IV).” In New Grove Dictionary of Music and
Musicians, vol. XIV, 2nd edition. New York: Grove, 2001.
Scher, Steven Paul. “Melopoetics Revisited. Reflections on Theorizing Word and Music
Studies.” In Word and Music Studies 1: Defining the Field, ed. Walter Bernhart, Steven
Paul Scher and Werner Wolf. Amsterdam-Atlanta, GA: Rodopi, 1999.
Schumann, Robert. Selected Songs for Solo Voice and Piano from the Complete Works
Edition. New York: Dover, 1981.
Stacey, Peter. Contemporary Tendencies in the Relation of Music and Text with Special
Reference to Pli selon pli (Boulez) and Laborintus II (Berio). New York/London:
Garland Publishing, Inc., 1989.
Stacey, Peter. “Towards the Analysis of the Relationship of Music and Text in
Contemporary Composition.” In Contemporary Music Review. United Kingdom:
Harwood Academic Publishers GmbH, 1989.
Starr, Larry and Christopher Waterman. American Popular Music: From Minstrelsy to
MTV. New York, Oxford: Oxford University Press, 2003.
Stein, Jack M. Poem and Music in the German Lied from Gluck to Hugo Wolf.
Cambridge, Mass.: Harvard University Press, 1971.
Stockhausen, Karlheinz and Herbert Eimert. Die Reihe: Speech and Music. Bryn Mawr,
Pennsylvania: Theodore Presser Company, 1968.
Taruskin, Richard. The Oxford History of Western Music. Oxford-New York: Oxford
University Press, 2005.
Youens, Susan. Schubert’s Poets and the Making of Lieder. Cambridge-New York: Cambridge
University Press, 1996.
Zak, Albin J. The Poetics of Rock: Cutting Tracks, Making Records. Berkeley, Los
Angeles, London: University of California Press, 2001.