Introduction
This paper is both a proposal and an initial experiment en route to a dynamical systems approach to autonomously generated music. Many recent efforts in generative music are affected by the structure/novelty problem, which
appears in any purely symbolic system [8]. To overcome this issue, it is proposed that a generative music system be based on situated, embodied dynamics,
following similar moves in the disciplines that now contribute to a dynamical
systems framework for understanding life, cognition, and adaptive behaviour [3,
5, 12, 20].
In particular, this paper explores the songbird and its musical actuator, the
syrinx, as a possible agent template for this system, concluding with an evaluation
of how this research might be extended to realize an embodied physical agent.
The songbird is an ideal model system on which to explore this approach to
generative music because the songbird's vocalizations and neural mechanisms
for song learning and control are extraordinarily well studied, providing a wealth
of data for our modeling efforts.
To begin creating a sensory-motor loop for generative music systems, efforts
were concentrated on the modeling and dynamical analysis of the syrinx, whose
structural dynamics will both constrain and inform the production of musical
sonic forms. The labial masses responsible for sound production in the syrinx
are modeled using modified nonlinear oscillators after the work of Gardner, Laje,
Mindlin and colleagues [18, 22, 23]. In addition to the realistic reproduction of
birdsong, the syrinx oscillator acts as an appropriate actuator for the proposed
autonomous musician. Manipulation of the model's parameters can produce rich,
non-discrete sonic dynamics including a continuum of frequencies, rhythms, and
tonal characteristics commonly associated with music such as portamento, vibrato, and tonal stacks (chords). Any potential control system will need to be
able to learn to control the dynamics of the oscillator, rather than predefined
musical entities.
Fig. 1. Phase dynamics of the syrinx oscillator showing stable limit cycles and
point attractors. Axis label: Time (ms).
To control the syrinx oscillator parameters, simple sine waves are used as
a stand-in for what will ultimately be output from an artificial neural network
controller. The oscillations are rough approximations of the neuromuscular gestures produced by the nucleus robustus archistriatalis (RA), which coordinates
the movements of the syringeal muscles. In addition to mimicking the neuromuscular dynamics of the syrinx, it is of specific interest to the author that potential
control systems possess dynamics which will explore the parameter space of the
syrinx oscillator model.
Before the model and the experiments are described further,
we will look at the motivations for this project, which are rooted in a shared
desire for improved autonomous behaviour in generative music systems.
This is the mapping of algorithm to musical phenomena [26]. It can also sometimes
be seen as the sonification of arbitrary algorithms.
Also characterized as classical AI, Good Old Fashioned AI (GOFAI), or symbolic
AI, this approach to AI focused almost purely on symbolic systems, often with the
intent that they would be connected to the world at some point in the future.
Referring to the tenet of embodiment in new AI.
of the world or their impact on it. In a way, it has missed the most important
point. Music, like all behaviour, is an embodied activity.
The work described herein builds on the conviction that embodiment (the
ability for a system to sense and act on the world) is a necessary consideration in the automation of the compositional process as well as the creation of
autonomous artificial composers able to produce novel and appropriate musical
works. From a dynamical and embodied perspective, autonomous adaptive music can be generated without the pre-specification of music theory. Furthermore,
the musical behaviour is inherent in the dynamics of the brain-body system.
Eventually, musical context (also important for musicality) can be learned by
an embodied, situated musical automaton through its interaction with the sonic
environment.
Extending the credo of new AI, the first steps toward creating an artificial
composer with human-level proficiency may lie in the study of qualitatively simpler examples of musical behaviour from the biological world. For this, the most
familiar and well studied of biological musicians has been chosen as a model: the
songbird. It is beyond the scope of this paper to build the entire device; rather,
it is intended to provide an account of how musical structures can arise from the
dynamics of the embodiment alone.
Fig. 2. Spectrogram of the song of the White-crowned Sparrow (sound courtesy of the
Macaulay Library of Natural Sounds). Axes: Time (ms) against Freq (Hz).
4.2 Song structures
Typical song has been shown to have components which can be purely tonal,
harmonic, inharmonic, and even contain complex and coupled amplitude and
frequency modulations. More specifically, spectrograms of birdsong contain the
fingerprints of nonlinear dynamics [13]. Under spectrographic analysis, specific nonlinear dynamical signatures can be observed, such as frequency doublings or sudden jumps (as shown in Figure 2), transitions between harmonic
and inharmonic frequency stacks, and even deterministic chaos (which appears as
spectrographic noise).
These dynamics are reproducible with a class of dynamical system called
nonlinear oscillators. Nonlinear oscillators have been used frequently to model
biological sound production and can roughly approximate the motion of the
membranous structures found in the windpipe of vertebrates. In the songbird,
the anatomical feature that contains these vibratory structures is the syrinx.
While filtering by the trachea and beak also plays a role in sound quality, the
syrinx is responsible for much, if not all, of the sound source.
The quality and structure of vocalizations in the songbird are obviously intimately related to the physical and neural structures of the bird. Fortunately for
our purposes, few in the animal kingdom have enjoyed as much intensive study
as the songbird. Many hundreds of studies have been carried out on localizing
the muscles, membranes and neural pathways which contribute to song production. For the current paper, we are most interested in the structure involved in
sound generation: the syrinx.
5.1
The avian vocal organ responsible for the physical production of sound is the
syrinx. Though there are a wide variety of syrinx morphologies, they all share
some basic structures relevant to the production of sound.
The syrinx is located at the junction of the bronchi, projecting from the lungs,
and the trachea. It is composed of cartilaginous and membranous structures as
well as a series of specialized muscles. The inaccessibility of the syrinx, due to its
size and position in the bird, has historically made studies difficult.
The classical model [17], based on morphological observations, attributes sound production to vibrations of the syringeal medial tympaniform membrane
(MTM) in each bronchus, which vibrates through the Bernoulli effect [14]. More recently,
though, direct endoscopic observations of the syrinx during sound production by
Goller and Larsen [19] have shown that, at least in those species studied, sound is generated by connective tissues called the medial and lateral labia, located at the
cranial end of each bronchus [30]. Surgical disruption of the MTM was not shown
Fig. 3. The syrinx. The actual structure, removed and cleaned, reprinted from [15]
(left). Major anatomical features, slightly modified and reprinted from [29] (right).
The experiments by Goller and Larsen [19], having convincingly shown the
labia to be the sound source in the syrinx through direct videography and MTM
destruction, have inspired the development of oscillator models of bird sound
production. Because of the evidence that the syrinx labia function in a way which is
similar to the human vocal folds (also in [19]), one of the first oscillator models to
be produced, by Fee et al. [16], was based on a classic two-mass model of human
vocal fold oscillation by Ishizaka and Flanagan (1972). This model was further
taken up by Mergell et al. [24].
The strength of the Ishizaka and Flanagan based models is the ability to
account for transitions to chaotic and period-doubling vibrations as observed in
vivo [16]. To their detriment is the models' complexity. Not only are the equations
describing the system complex, but this complexity also translates into prohibitive computational and coding demands, making them impractical for use as the phonator
(sound source) in the proposed larger projects.
More recently, Gardner, Laje, Mindlin and colleagues [18, 22, 23] have been
developing variants of a model originally introduced by Titze, thereby extending
their previous work on human vocal fold oscillations [21]. They claim the Titze
model is one of the simplest models to account for the transfer of kinetic energy
to vocal fold oscillations. As simple as the Titze model is, Gardner et al. were
able to arrive at an even simpler equation for a nonlinear oscillator which can
produce the periodic dynamics observed in Canaries, and extended to Sparrows.
With simple sweeps of two control parameters representing the stiffness of the
oscillator and the bronchial pressure, they could mimic the song of the Chingolo
Sparrow (Zonotrichia capensis).
The strength of the Titze-based model is its utter simplicity which makes
dynamical analysis, implementation, and integration into other systems easier.
Unfortunately though, this model initially seems incapable of producing chaotic
dynamics [23] without some further modification. Another drawback may be a
lack of biologically supportable parameter values, though this is not something
of great concern for our current purposes. Though highly simplified, these basic
models can provide insights into how a simplified system operates, and can be
controlled.
Body abstraction and sensory-motor simplification are central to the study
of adaptive behaviour, of which this paper is a part, and can be seen in many
works viewing the body as a state-dependent agent with simple actuators (such
as Beer's minimally cognitive agents [5], among countless others). Similarly, the
object of actuation in our study is ultimately a nonlinear oscillator, based on
the songbird's syrinx. By understanding the syrinx oscillator's dynamics, we not
only get a picture of how our autonomous instrument will sound, but we are
also better prepared to analyze how a control system exploits these dynamics in
sound production.
6
6.1
The syrinx oscillator model chosen for these experiments is the one developed
by Gardner, Laje, Mindlin and colleagues, which appears in [22], extended from
the original model proposed in [18]. It was chosen on the basis of simplicity,
controllability, and computational efficiency, as well as its ability to accurately
produce the sounds of songbirds.
The syrinx oscillator is essentially a second-order ordinary differential equation
(ODE), written as a pair of first-order equations of the following form:
$$\dot{x} = y, \tag{1}$$
$$\dot{y} = -kx - cx^{2}y + by - f_{0}, \tag{2}$$
Fig. 4. The idealized syrinx. Shows the relation of parameters in the syrinx oscillator,
reprinted from [22].
generally referred to as nonlinear oscillators. More specifically they are relaxation oscillators.
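Equations 1 and 2 can be integrated numerically with very little code. The following is a minimal sketch rather than the implementation used for the experiments: it assumes the sign convention x' = y, y' = -kx - cx^2 y + by - f0, uses a fixed-step fourth-order Runge-Kutta scheme chosen here for simplicity, and runs with illustrative dimensionless parameter values. The function names (`syrinx_rhs`, `rk4_step`, `simulate`) are hypothetical.

```python
def syrinx_rhs(x, y, k, c, b, f0):
    """Right-hand side of Equations 1 and 2 (sign convention assumed)."""
    return y, -k * x - c * x * x * y + b * y - f0


def rk4_step(x, y, dt, k, c, b, f0):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = syrinx_rhs(x, y, k, c, b, f0)
    k2x, k2y = syrinx_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, k, c, b, f0)
    k3x, k3y = syrinx_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, k, c, b, f0)
    k4x, k4y = syrinx_rhs(x + dt * k3x, y + dt * k3y, k, c, b, f0)
    x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x, y


def simulate(k, c, b, f0, x0=0.01, y0=0.01, dt=0.01, steps=20000):
    """Return lists of x and y samples, including the initial state."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        x, y = rk4_step(xs[-1], ys[-1], dt, k, c, b, f0)
        xs.append(x)
        ys.append(y)
    return xs, ys


# Illustrative, dimensionless parameter values (not fitted to any bird).
xs_pos, ys_pos = simulate(k=1.0, c=1.0, b=0.5, f0=0.1)   # positive pressure
xs_neg, ys_neg = simulate(k=1.0, c=1.0, b=-0.5, f0=0.1)  # reversed pressure

tail = xs_pos[-5000:]               # last 50 time units of the b > 0 run
swing_pos = max(tail) - min(tail)   # peak-to-peak size of the oscillation
print("peak-to-peak x for b > 0:", swing_pos)
print("final state for b < 0:", xs_neg[-1], ys_neg[-1])
```

With these values the positive-b run settles onto a sustained oscillation, while the negative-b run spirals into the rest state at x = -f0/k, y = 0.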
6.2 Relaxation oscillators
Relaxation oscillators are dynamical systems which exhibit stable limit cycles and have been successfully used to model biological phenomena that have
self-sustained oscillations such as the heartbeat and neurons [28]. In fact, a large
variety of periodic biological phenomena can be characterized by relaxation oscillations [32].
Limit cycles are isolated closed trajectories in the phase space (one variable
plotted against another) of a nonlinear system. As Strogatz notes, limit cycles
are inherently nonlinear phenomena (p. 196), and the amplitude and frequency
of the oscillation are ultimately determined by the structure of the system itself rather than by initial conditions. Unlike linear oscillators, where any disturbance to the
amplitude will persist in the system forever, nonlinear oscillators of this kind are resistant
to perturbations, returning to the preferred limit cycle for a given set of parameters [28]. For a more exhaustive treatment, the reader is referred to [32] and [28].
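The contrast between a limit cycle and a linear oscillation can be illustrated numerically. The sketch below assumes the oscillator form x' = y, y' = -kx - cx^2 y + by with f0 set to zero for simplicity, and compares two very different starting points; setting b = c = 0 turns the same code into a plain linear oscillator. Parameter values and the helper name `settled_amplitude` are illustrative assumptions.

```python
def rhs(x, y, k, c, b):
    """Oscillator with f0 = 0 (assumed form; see the lead-in)."""
    return y, -k * x - c * x * x * y + b * y


def rk4_step(x, y, dt, k, c, b):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y, k, c, b)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, k, c, b)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, k, c, b)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, k, c, b)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


def settled_amplitude(x0, y0, k, c, b, dt=0.01, steps=20000, tail=5000):
    """Largest |x| seen during the last `tail` steps of the run."""
    x, y = x0, y0
    peak = 0.0
    for i in range(steps):
        x, y = rk4_step(x, y, dt, k, c, b)
        if i >= steps - tail:
            peak = max(peak, abs(x))
    return peak


# Nonlinear case: a small and a large starting point end on the same cycle.
nonlinear_small = settled_amplitude(0.01, 0.01, k=1.0, c=1.0, b=0.5)
nonlinear_large = settled_amplitude(3.0, 0.0, k=1.0, c=1.0, b=0.5)

# Linear case (b = c = 0): the amplitude simply remembers the starting point.
linear_small = settled_amplitude(0.01, 0.01, k=1.0, c=0.0, b=0.0)
linear_large = settled_amplitude(3.0, 0.0, k=1.0, c=0.0, b=0.0)

print(nonlinear_small, nonlinear_large)
print(linear_small, linear_large)
```

The two nonlinear runs report essentially the same amplitude, whereas the two linear runs differ by the same factor as their initial conditions.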
6.3
Fig. 5. Dynamics of the syrinx oscillator. (left column) Graph of y(t) for +b showing
relaxation oscillations and phase portrait of (x, y) for solutions to Equations 1 and 2
starting from (x, y) = (0.01, 0.01). (right column) Graph of y(t) for −b showing a more
sinusoidal waveform leading to a point attractor for Equations 1 and 2 starting from
(x, y) = (0.01, 0.01).
The syrinx oscillator has dynamics which are curiously different from the
standard van der Pol oscillator, inasmuch as its relaxation oscillation is mirrored. The b
term in the syrinx oscillator leads to a Hopf bifurcation [22] of the system at
b = 0. For negative values of b, representing a reversal in pressure and an inhalation
by the bird, the state of the system moves towards a fixed point attractor.
Practically, this means that there is a loss of oscillatory behavior, and an ultimate
loss of phonation. As can be seen in Figure 5, the shape of the limit cycle toward
the point attractor is also quite different, more resembling a pure sine wave. This
sinusoidal waveshape appears as b moves toward 0.
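A linearization of Equations 1 and 2 makes the location of the bifurcation explicit. The following is a sketch that assumes the sign convention $\dot{y} = -kx - cx^{2}y + by - f_{0}$ and $k > 0$; the symbols $x^{*}$, $y^{*}$, $J$, $\tau$ and $\lambda_{\pm}$ are introduced here for illustration only.

```latex
% Fixed point of Equations 1 and 2
x^{*} = -\frac{f_{0}}{k}, \qquad y^{*} = 0

% Jacobian evaluated at the fixed point
J = \begin{pmatrix} 0 & 1 \\ -k - 2c\,x^{*}y^{*} & b - c\,{x^{*}}^{2} \end{pmatrix}
  = \begin{pmatrix} 0 & 1 \\ -k & b - c f_{0}^{2}/k^{2} \end{pmatrix}

% Eigenvalues in terms of the trace \tau (the determinant is k)
\lambda_{\pm} = \frac{\tau \pm \sqrt{\tau^{2} - 4k}}{2},
\qquad \tau = b - \frac{c f_{0}^{2}}{k^{2}}
```

When $\tau^{2} < 4k$ the eigenvalues are complex with real part $\tau/2$, so the fixed point changes from attracting to repelling as $\tau$ passes through zero. For $f_{0} = 0$ this happens exactly at $b = 0$, and for small $c f_{0}^{2}/k^{2}$ it stays close to $b = 0$, consistent with the bifurcation point quoted above. Near the bifurcation the angular frequency of the oscillation is close to $\sqrt{k}$.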
High values for f0 and c also result in loss of phonation, but for different
reasons. If c is too high, all kinetic energy is lost to the medium/labia,
that is, the material is too dense. If f0 is too high, the model labia are completely
closed, limiting airflow and thus oscillation of the membrane.
The value k, which represents labial stiffness as caused by the musculature,
affects the frequency of the oscillations. As expected, high values of k result in
high frequency oscillations and vice versa.
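The dependence of pitch on k can be checked with a short numerical experiment. This sketch assumes the oscillator form x' = y, y' = -kx - cx^2 y + by with f0 = 0 and a small positive b, and estimates the oscillation frequency from upward zero crossings of x; the function name `oscillation_frequency` and all parameter values are illustrative assumptions.

```python
def rhs(x, y, k, c, b):
    """Oscillator with f0 = 0 (assumed form; see the lead-in)."""
    return y, -k * x - c * x * x * y + b * y


def rk4_step(x, y, dt, k, c, b):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y, k, c, b)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, k, c, b)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, k, c, b)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, k, c, b)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


def oscillation_frequency(k, c=1.0, b=0.2, dt=0.01, steps=20000):
    """Cycles per unit time, from upward zero crossings in the second half."""
    x, y = 0.01, 0.01
    crossings = []
    for i in range(steps):
        x_new, y_new = rk4_step(x, y, dt, k, c, b)
        if i >= steps // 2 and x < 0.0 <= x_new:
            # Linear interpolation of the crossing time inside the step.
            frac = -x / (x_new - x)
            crossings.append((i + frac) * dt)
        x, y = x_new, y_new
    return (len(crossings) - 1) / (crossings[-1] - crossings[0])


freq_low = oscillation_frequency(k=1.0)
freq_high = oscillation_frequency(k=4.0)
print("frequency at k = 1:", freq_low)
print("frequency at k = 4:", freq_high)
```

For weak nonlinearity the frequency is close to sqrt(k) / (2 pi), so quadrupling k roughly doubles the pitch in this sketch.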
In order to faithfully reproduce birdsong, the parameters b and k cannot be
left at a single setting and are best treated as functions of time. The original work
on the syrinx oscillator was able to reproduce many features of birdsong using
smooth oscillatory sweeps of the control parameters b and k (and later f0 for
abrupt phonation control), using a pair of coupled oscillators tracing ellipses in
(b, k) parameter space. These periodic sweeps have since been experimentally verified
[25], though they are not the result of coupled oscillators.
Studies have shown that the neuromuscular response of the syringeal muscles follows oscillatory trajectories in response to stimulation from RA projection
neurons [30]. For the following simulations, simple uncoupled sine wave oscillators were used for the continuous control parameters (b, k) of the syrinx model,
thereby approximating the actions of the RA. This method follows the work
presented in [18, 23]. An understanding of how the oscillatory dynamics of the
syringeal muscles influence the sound will be of great use in defining a control
system and its range of operation.
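The idea of driving b and k with slow, uncoupled sine waves can be sketched as follows. This is an illustration under stated assumptions, not the control signals used in the experiments: the offsets, depths and periods of `b_of_t` and `k_of_t` are hypothetical, the sweeps are much slower than the oscillation itself, b is given a small positive offset so that it spends part of each cycle below zero, and f0 is set to zero. The loop records the envelope of x over twenty windows of the last pressure cycle.

```python
import math

T_B = 20.0  # period of the pressure sweep (hypothetical value)
T_K = 10.0  # period of the stiffness sweep (hypothetical value)


def b_of_t(t):
    """Slow sine sweep of pressure; negative for part of every cycle."""
    return 0.5 + 2.5 * math.sin(2.0 * math.pi * t / T_B)


def k_of_t(t):
    """Slow sine sweep of stiffness; always positive."""
    return 400.0 + 200.0 * math.sin(2.0 * math.pi * t / T_K)


def rhs(t, x, y, c=1.0, f0=0.0):
    """Equations 1 and 2 with time-dependent b and k (signs assumed)."""
    return y, -k_of_t(t) * x - c * x * x * y + b_of_t(t) * y - f0


def rk4_step(t, x, y, dt):
    """One classical fourth-order Runge-Kutta step at time t."""
    k1x, k1y = rhs(t, x, y)
    k2x, k2y = rhs(t + 0.5 * dt, x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(t + 0.5 * dt, x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(t + dt, x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


dt = 0.005
steps_per_cycle = int(round(T_B / dt))
n_windows = 20
window = steps_per_cycle // n_windows

x, y = 0.01, 0.01
envelope = [0.0] * n_windows  # largest |x| in each window of the last cycle
for i in range(3 * steps_per_cycle):
    x, y = rk4_step(i * dt, x, y, dt)
    if i >= 2 * steps_per_cycle:
        w = (i - 2 * steps_per_cycle) // window
        envelope[w] = max(envelope[w], abs(x))

loudest = max(range(n_windows), key=lambda w: envelope[w])
quietest = min(range(n_windows), key=lambda w: envelope[w])
b_at_loudest = b_of_t((loudest + 0.5) * window * dt)
print("envelope over one pressure cycle:", envelope)
```

In this sketch the envelope collapses around the interval in which b is negative and recovers, with some lag, once b is positive again, so the slow pressure sweep acts as an on/off gate for the much faster oscillation.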
Fig. 6. Bifurcation of the syrinx oscillator. Image modified from the original, which
appears in Laje et al. (2002) [22].
It should also be noted here, and will be pointed out again in the simulation
results, that the frequency of the control parameter sweeps should be much lower
than the frequency of the oscillations induced in the system as a whole.
7
7.1
Fig. 7. Natural and synthetic songs. Spectrogram of the song of the Rufous-sided
Towhee (top) and spectrogram of a synthesized birdsong (bottom) where b is varied by
a low frequency sine wave and k is varied using a combination of a sine driven, adaptive
van der Pol at t < 500 ms, and a sine wave thereafter. Syllable and phrase similarity is
readily apparent. Axes: Time against Frequency.
7.2 Results
The range of musical dynamics that could be achieved with the syrinx oscillator was quite surprising but, upon examination, makes sense within the context
of the model as previously discussed. The spectrograms in Figures 7 and 8 reveal
several of the interesting dynamical features of this oscillator. The phase dynamics are better observed in the 3-dimensional traces in Figures 1 and 9, showing
the predicted movement of the system at the bifurcation point from sharp stable
limit cycles to smoother cycles which quickly lead to point attractors.
It is the shape of the limit cycles which ultimately results in the harmonic
stacks and their ratios in sound space, which collapse as b nears the bifurcation
point of the system (b = 0). This can be clearly seen in Figure 8, which is
representative of the main findings of these experiments. Swept values for the
b parameter, representing bronchial pressure, resulted in a smooth transition of
the harmonic stacks, increasing the space between harmonics as the value of
b increased. Additionally, as b turns negative, phonation ceases, as predicted,
only to resume again when b turns positive. High top values in the range of b
resulted in increased steepness of the slide of harmonic stacks. The frequency of b
ultimately characterizes the starts and stops of phonation in the synthetic song.
It is this parameter that is essential in the creation of rhythms for structuring
syllables and phrases.
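The link between limit-cycle shape and harmonic content can be probed with a small spectral measurement. The sketch below assumes the oscillator form x' = y, y' = -kx - cx^2 y + by with f0 = 0 and fixed, illustrative parameter values; it measures the share of spectral power lying outside a narrow band around the fundamental, once for b close to the bifurcation and once for a larger b. The function name `harmonic_fraction` and the band width of three bins on each side are assumptions made for this example.

```python
import numpy as np


def rhs(x, y, k, c, b):
    """Oscillator with f0 = 0 (assumed form; see the lead-in)."""
    return y, -k * x - c * x * x * y + b * y


def rk4_step(x, y, dt, k, c, b):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y, k, c, b)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, k, c, b)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, k, c, b)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, k, c, b)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


def harmonic_fraction(b, k=1.0, c=1.0, dt=0.01, settle=30000, n=16384):
    """Share of spectral power of x outside a band around the fundamental."""
    x, y = 0.01, 0.01
    for _ in range(settle):          # discard the transient
        x, y = rk4_step(x, y, dt, k, c, b)
    samples = np.empty(n)
    for i in range(n):               # record the settled oscillation
        x, y = rk4_step(x, y, dt, k, c, b)
        samples[i] = x
    samples -= samples.mean()
    power = np.abs(np.fft.rfft(samples * np.hanning(n))) ** 2
    fundamental = int(np.argmax(power[1:])) + 1
    lo, hi = max(fundamental - 3, 0), fundamental + 4
    return 1.0 - power[lo:hi].sum() / power.sum()


weak = harmonic_fraction(b=0.1)    # close to the bifurcation
strong = harmonic_fraction(b=3.0)  # well inside the oscillatory regime
print("power outside the fundamental, b = 0.1:", weak)
print("power outside the fundamental, b = 3.0:", strong)
```

Near the bifurcation almost all of the power sits in the fundamental, in line with the nearly sinusoidal waveform described earlier; at larger b a much larger share moves into the harmonics.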
The value of the k parameter had a noticeable effect on the frequency of the
oscillator and a broad range was necessary to ensure phonation in the frequency
range characteristic of the organism it models. Though further experimentation
is necessary, this model was initially unable to produce period doubling, a feature
commonly observed in birdsong.
Fig. 8. Axes: time (top panel), Frequency against Time (middle panel), time (bottom panel).
One additional consideration was left out of these experiments, and that is
the low pass filtering effects of the trachea and beak. In other experiments modeling birdsong, great lengths have been taken to make mathematical tube models
of the trachea and beak, resulting in much more natural sounding synthetic song.
However, for the purposes of our applications as described previously, this should
not be an issue. Instantiation of this model as a sound generator in a physical body
will allow for a trachea and beak that require no math; that is, a real tube can
be affixed to filter the sound output from the model.
Fig. 9. Close-up of syrinx oscillator x, y phase dynamics in Figure 1 with time plotted
on the vertical axis.
The syrinx oscillator seems to be quite capable of providing a basis for the
production of musical material. Through the manipulation of parameters in this
simple dynamical system, we were able to experimentally synthesize many of the
components of musical sound. The bifurcation of the system at b = 0 results in
the generation of rhythm as well as having an effect on the timbre of the sound.
Sweeping of the k parameter allows for access to frequencies in a sufficient range
for varied melodic content. The results of the present experiment suggest that
this model can provide a rich canvas to work with en route to musical automata.
Moreover, these experiments suggest that the dynamics of a system can indeed
give rise to potentially musical material without musical specifications per se.
The relative ease and success of the present implementation and analysis
owes much to the numerous obsessive studies and simulations of the avian vocal
organ, as well as to the birds sacrificed [19] in order to recreate them. Though
the models can effectively imitate simple song phrases, they still seem a far cry
from the complex structures found in species such as Luscinia megarhynchos (the
Nightingale), though it is assumed that these structures are a neural phenomenon
and do not require a drastic rework of the physical model.
Now that the body of the proposed musical automaton has been chosen, a
suitable neural controller can be investigated. Fortunately, there is a wealth
of information concerning the neural mechanisms and pathways involved in the
control and production of birdsong. In fact, the learning of songs and mechanisms
of vocalization in humans and songbirds share some striking parallels. As humans
must learn speech, songbirds must learn their songs from tutors. This convergent
evolution has made the oscines an extraordinarily well studied group of birds,
with much of that study centered around illuminating possible mechanisms of human speech
acquisition and learning.
Combining what is known about the behavioural neurobiology of birdsong
with techniques of simulated learning and artificial neural networks, the prospects
for a complete model seem encouraging. Future research aims to combine what
is known of songbird neural architectures with the model syrinx to produce a
device that is able to sense and respond to the world, hopefully outperforming its
designer's specifications (which should be little more than the dynamic structure
of the system).
For many researchers, songbird vocalization and learning models allow for the
testing of hypotheses concerning how birdsongs are learned and produced, based
on experimental evidence with living specimens. To the author, birdsong models
represent an opportunity to explore new possibilities of autonomously generated,
adaptive music using these models as self-playing instruments, in much the same
way that eighteenth-century bird fanciers used living birds, but with a physical
realization that is wholly non-organic and more akin to clockwork musical automata.
Acknowledgments
This paper is dedicated to Marilyn Anne Losco Michael. Thanks to Sarah
Angliss for her research and for turning me on to The Bird Fancyer's Delight, Alice Eldridge for discussions on nonlinear oscillators, the nightingales in Sheffield
and the Royal Pavilion in Brighton, Dr. Max Michael III, Fernando Almeida e
Costa and Eduardo Izquierdo-Torres who organized the activaite.d workshop,
and finally thanks to two anonymous reviewers for the helpful comments and
references.
References
1. Sarah Angliss. Flights of fancy? thoughts on the ancient practice of teaching
songbirds anthropogenic tunes. 2005.
2. W. Ross Ashby. Can a mechanical chess-player outplay its designer? British
Journal for the Philosophy of Science, pages 44–57, 1952.
3. W. Ross Ashby. Design for a Brain, 2nd Edition. Chapman & Hall Ltd., 1960.
4. Mira Balaban, Kemal Ebcioglu, and Otto Laske. Understanding Music with AI.
AAAI Press/MIT Press, 1992.
5. Randall Beer. A dynamical systems perspective on agent-environment interaction.
Artificial Intelligence, 72:173–215, 1993.
6. Rodney Brooks. Intelligence without reason. In Luc Steels and Rodney Brooks,
editors, The Artificial Life Route to Artificial Intelligence, chapter 2, pages 25–81.
Lawrence Erlbaum Associates 1995, 1991.
30. Rodrick A. Suthers and Daniel Margoliash. Motor control of birdsong. Current
Opinion in Neurobiology, 12, 2002.
31. Peter M. Todd and Gregory M. Werner. Frankensteinian methods for evolutionary
music composition. In N Griffith and P. M. Todd, editors, Musical Networks:
Parallel distributed perception and performance. 1998.
32. DeLiang Wang. Wiley Encyclopedia of Electrical and Electronics Engineering, volume 18, pages 396–405. Wiley and Sons, 1999.
33. Bennet Woodcroft. The Pneumatics of Hero of Alexandria. Taylor Walton and
Maberly, 1851.