Académique Documents
Professionnel Documents
Culture Documents
Dϳ ŵϳ ϵ
ABSTRACT /ŶƉƵƚ͗ ŵ 'ϳ ŵ
&dKZ/d/KE
нϳ нϮ нϱ нϮ
ŵĂũŽƌ ŵŝŶŽƌ ƐĞǀĞŶƚŚ ŵŝŶŽƌ
ďĂƐĞƐ
This paper presents a system named ChordSequenceFac- Ϭ͘ϲ Ϭ͘Ϯ Ϭ͘Ϯ Ϭ͘ϭ Ϭ͘ϱ Ϭ͘ϰ Ϭ͘Ϯ Ϭ͘Ϯ Ϭ͘ϲ Ϭ͘ϭ Ϭ͘ϱ Ϭ͘ϰ
tory for automatically generating chord arrangements. A Ϭ͘ϯdž н Ϭ͘ϳdž
key element of musical composition is the arrangement of
WKK>K&^^
с
chord sequences because good chord arrangements have ďĂƐĞƐ
ZZE'DEd
Ϭ͘ϲ Ϭ͘Ϯ Ϭ͘Ϯ Ϭ͘ϭ Ϭ͘ϱ Ϭ͘ϰ Ϭ͘Ϯ Ϭ͘Ϯ Ϭ͘ϲ Ϭ͘ϭ Ϭ͘ϱ Ϭ͘ϰ
the potential to enrich the listening experience and create
a pleasant feeling of surprise by borrowing elements from
different musical styles in unexpected ways. While chord нϳ нϱ ŵĂũŽƌ нϵ нϱ ŵĂũŽƌ
ŵĂũŽƌ ƐĞǀĞŶƚŚ
sequences have conventionally been modeled by using N- Ěŝŵ &ŵϳ ƐƵƐϰ 'ŵϳ ƐĞǀĞŶƚŚ ƐĞǀĞŶƚŚ
ƌƌĂŶŐĞĚ͗ &Dϳ ϳ 'Dϳ
grams, generative grammars, or music theoretic rules, our
system decomposes a matrix consisting of chord transi- Figure 1. Overview of generating chord arrangements
tion probabilities by using nonnegative matrix factoriza- with ChordSequenceFactory.
tion. This enables us to not only generate chord sequences
from scratch but also transfer characteristic transition pat- selves, known as user-generated content (UGC) on video
terns from one chord sequence to another. ChordSequence- sharing services, the majority of users are generally music
Factory can assist users to edit chord sequences by modi- listeners who do not create much themselves. We aim to
fying factorized chord transition probabilities and then au- encourage such people to create more derivative works by
tomatically re-arranging them. By leveraging knowledge changing chord sequences. This enables music listeners to
from chord sequences of over 2000 songs, our system can personalize songs by customizing chord sequences without
help users generate a wide range of musically interesting any special training [4]. Such content personalization was
and entertaining chord arrangements. achieved by Drumix [15], a system for arranging the drum
track in music audio signals, but content personalization
1. INTRODUCTION using chord arrangements has not yet been studied.
To generate chord arrangements, it is necessary to model
Chord sequences are essential when composing and ar- and manipulate chord sequences. Conventionally, mod-
ranging music. Different songs, composers, arrangers, and els for generating chord sequences have been represented
musical genres have different tendencies to use chord se- by probabilistic models or N-grams [1, 6, 8, 9, 11, 12, 14],
quences, which contributes to increased variety in music. genetic algorithms [2], generative grammars [10, 13], and
Each song could have different natural chord sequences exploiting examples or templates [3, 7]. Although these
that give different impressions. Although an arranger can existing models can easily be used to generate new chord
change (i.e., arrange) chord sequences of a song to alter its sequences from scratch [2], it is difficult to arrange chord
mood, this is very difficult for people who lack knowledge sequences of an existing song while referring to other ex-
of chord sequences to arrange the chords in an appropriate isting songs.
way. The goal of this research is to assist people to gen- In this paper we propose a new mathematical formu-
erate variations of chord sequences from an input original lation of chord sequences using nonnegative matrix fac-
sequence by leveraging knowledge from a large number of torization (NMF) and use it to build a system, Chord-
other existing chord sequences called references. SequenceFactory, that enables users to arrange chord se-
Chord sequence arrangement is a promising approach quences of existing songs (Fig. 1). The chord sequence of
to create derivative works from existing songs. Although each song (a sequence of symbols) is first converted into
amateur creators could create such derivative works them- a chord differential matrix that represents chord transitions
in short regions. The matrix is then decomposed into a set
Permission to make digital or hard copies of all or part of this work for of bases (characteristic chord transition patterns) and a set
personal or classroom use is granted without fee provided that copies are of corresponding temporal activations using NMF. Chord
not made or distributed for profit or commercial advantage and that copies arrangements can be achieved by interpolating the bases of
bear this notice and the full citation on the first page. a song with similar bases obtained from other songs while
c 2013 International Society for Music Information Retrieval. preserving the activations. An arranged chord sequence is
457
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
ŚŽƌĚ^ĞƋƵĞŶĐĞ
ŵ 'ϳ ŵ
ŚŽƌĚdƌĂŶƐŝƚŝŽŶ
d
^ĞƋƵĞŶĐĞ dž ƚ ƚсϭ
нϮ ŵŝŶŽƌ нϱ ƐĞǀĞŶƚŚ нϮ ŵŝŶŽƌ
ηͬď ηͬď ηͬď
ηͬď нϮ ηͬď ηͬď
458
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
ŚŽƌĚŝĨĨĞƌĞŶƚŝĂůDĂƚƌŝdž
s
sŽĐĂďƵůĂƌLJŽĨŚŽƌĚdƌĂŶƐŝƚŝŽŶƐ
;Ŷсϭ͕Ϯ͕͙͕EͿ
dŝŵĞ;ƚсϭ͕Ϯ͕͙͕dͿ
ŚĂƌĂĐƚĞƌŝƐƚŝĐŚŽƌĚ
dƌĂŶƐŝƚŝŽŶWĂƚƚĞƌŶƐ
;ĂƐĞƐͿ dĞŵƉŽƌĂůĐƚŝǀĂƚŝŽŶƐ
t ,
sŽĐĂďƵůĂƌLJŽĨŚŽƌĚdƌĂŶƐŝƚŝŽŶƐ
ĂƐĞƐ ;Ŭсϭ͕Ϯ͕͙͕<Ϳ
;Ŷсϭ͕Ϯ͕͙͕EͿ
Figure 4. Regeneration of a chord transition sequence dž
from a chord differential matrix representation by means
of V and the bi-gram probability obtained a priori from
ĂƐĞƐ ;Ŭсϭ͕Ϯ͕͙͕<Ϳ dŝŵĞ;ƚсϭ͕Ϯ͕͙͕dͿ
the data.
Figure 5. Factorizing the chord deferential matrix: Anal-
tions in the vicinity of frame t. The process of representing ysis of the frequent combinations and the temporal occur-
the chord transition sequence as a chord differential matrix rences of chord transitions.
is shown in Fig. 3.
This frame-based vectorial representation of chord tran-
sition probabilities has useful properties for chord arrange- 3. FORMULATION OF
ment as follows: CHORD-SEQUENCE-FACTORY
T
nonnegative matrix factorization (NMF) (Fig. 5), i.e.,
P {xt }Tt=1 = (ξvt (xt ) + (1 − ξ) P (xt |xt−1 )) , V = (v 1 · · · v T ) (6)
t=1
(3) is decomposed into matrices W (N × K) and H (K × T )
where ξ is an interpolation coefficient such that 0 ≤ ξ ≤ 1. as:
The chord transition
sequence
{x∗t }Tt=1 is reconstructed
T
by maximizing P {xt }t=1 as follows: V W H, (7)
where W is the matrix of bases:
{x∗t }Tt=1 = argmax P {xt }Tt=1 . (4)
{xt }T
t=1
W = (w1 · · · wK ) (8)
Since there are N possibilities for each xt , it is computa-
tionally infeasible to test all N T possible sequences with and H is the matrix of activations:
⎛ ⎞
the naive method of exhaustive search. Fortunately, we can h11 · · · h1T
obtain the solution {x∗t }Tt=1 with O (N ) using dynamic ⎜ ⎟
H = ⎝ ... . . . ... ⎠ . (9)
programming. The process for generating a sequence is
shown in Fig. 4. hK1 · · · hKT
459
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
REFERENCE
FACTORIZATION
ORIGINAL
+5 sus2 +2 major +2 minor +10 major +5 major . . .
FACTORIZATION
(ANALYSIS)
Characterisc Chord
Transion Paerns
(bases) Temporal Acvaons Chord Differenal Matrix
W H V
x ≃
(1–a) +a
POOL OF BASES
Preserved
INTERPOLATION OF BASES CONVERTED CONVERTED
=
x =
ARRANGEMENT
Characterisc Chord
Transion Paerns Temporal Acvaons
(Bases) Chord Differenal Matrix
W* H (SYNTHESIS)
V*
Chord Differenal Matrix to Chord Transion Sequence
ARRANGED
+3 dim7 +2 seventh +7 minor +11 sus2 +2 major7 . . .
Figure 6. Overview of the process executed by ChordSequenceFactory for generating chord arrangements.
460
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
sŽĐĂďƵůĂƌLJŽĨŚŽƌĚdƌĂŶƐŝƚŝŽŶƐ sŽĐĂďƵůĂƌLJŽĨŚŽƌĚdƌĂŶƐŝƚŝŽŶƐ
Ăсϭ͘Ϭ Ă сϬ͘ϳ
;Ŷсϭ͕Ϯ͕͙͕EͿ
4. EVALUATION
found our system was, in many cases, able to provide mu-
4.1 Experimental conditions sically coherent chord sequence arrangements (Table. 1).
To evaluate our system, we used 2,123 lead sheets down- However, our evaluation also revealed some limitations
loaded from the Wikifonia (www.wikifonia.org) website. that should be addressed in our future work. Since the vo-
All lead sheets we used were formatted in the MusicXml cabulary of chord transitions defined for the system does
format including information of chord sequence. Each not include the absolute pitch for the root note, the gener-
chord name in a file included the step (C, D, ...), alternation ated results tended to exhibit transposition frequently. In
(#, ) of the root note, and the chord type. addition, finding the optimal number of bases when de-
composing the probability should be investigated for better
To define a vocabulary {cn }N n=1 , we extracted chord
performances. Finally, a slider for changing the activation
transitions that appeared more than 10 times in the data. A
can bring about more effects in the generated sequence.
special symbol “Unknown” ∈ {cn }N n=1 was used for repre-
senting the rest. The vocabulary size was N = 163. Using Although simply adding a seventh note to chord se-
the vocabulary, we represented each song as a chord tran- quences (rather than using our approach) has a similar ef-
sition sequence {xt }Tt=1 and calculated a chord differen- fect on chord arrangements, dissonance may appear in the
tial matrix from that sequence by using an exponentially- connection between the chords. In contrast, with our sys-
decaying window of length λ = 6, corresponding to the tem, dissonant chords can be avoided by using the con-
number of chord changes. The transition probabilities be- straints given by the transition probabilities. Furthermore,
tween {cn }N there are no restrictions in terms of combining two songs
n=1 were calculated in advance for all 2,123
songs and for each song, respectively. The chord differ- that have different structures, since the interpolation is done
ential matrix was decomposed using NMF based on the between the decomposed basis.
Euclidean distance with K = 5. Conventional methods based on N-grams cannot con-
We interpolated characteristic chord transition patterns trol the dynamic characteristics of chord transitions and
(basis vectors) of a reference song with those of an origi- need label sequences to reflect human intention. In our
nal song according to an interpolation coefficient a = 1.0, method, we can see that the probability is changed for each
0.7, 0.5, 0.3, or 0.1. A chord transition sequence was syn- time t and that the activations at time t play the same role
thesized from a modified chord differential matrix. Since as the label sequences in the conventional methods.
the vocabulary of chord transitions only holds information
of relative root positions, we set the root note of the initial 5. CONCLUSIONS
chord to the original root note.
We have described a system ChordSequenceFactory that
can assist a user to arrange chord symbol sequences of
4.2 Experimental results and discussions
a song by finding latent frequent patterns of chord tran-
As shown in Fig. 7, we can combine two different charac- sitions and modifying them on the basis of other songs.
teristic chord transition patterns by controlling the interpo- We proposed a new analysis and synthesis framework for
lation factor a on the user interface of ChordSequenceFac- chord symbol sequences, where the temporal changes of
tory (Fig. 8). The generated examples of chord arrange- chord transition sequences are represented as a chord dif-
ments are shown in Table. 1. We confirmed that the mood ferential matrix. NMF is used to decompose this matrix
of an original song can be modified while incorporating the into bases corresponding to the frequent patterns of chord
mood of a reference song if these songs have some simillar transitions and activations corresponding to their temporal
characteristic chord transition patterns. occurrences. The matrix can then be updated for a new ar-
Through informal evaluation and listening tests we rangement by modifying these bases by mixing them with
461
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
Table 1. Generated results of ChordSequenceFactory with five different reference songs (A,B,C,D,E). For each reference
song, three values of interpolation factor a = 0.7, 0.6, 0.5 were used. The arranged chord sequences were decoded from
the chord transition sequence {xt }Tt=1 , by setting the root note of the first chord to C as in the original sequence.
similar bases in the pool of bases obtained from more than Book Songs,” Proceedings of the ISMIR conference,
2000 songs. The updated matrix is finally used to generate pp. 255–258, 2007.
a new re-arranged chord sequence using dynamic program-
ming. In our experience, ChordSequenceFactory gener- [7] F. Pachet: “Surprising Harmonies,” International Jour-
ated musically interesting and entertaining chord arrange- nal of Computing Anticipatory Systems, Vol. 4 1999.
ments. In the future, we plan to extend our framework to [8] J. Paiement, D. Eck, S. Bengio: “A Probabilistic Model
consider melody lines as constraints on arranged chord se- for Chord Progressions,” Proceedings of the ISMIR
quences. We will also include audio signal processing so conference, pp. 312–319, 2005.
that the chord differential matrix can be used directly on
music audio signals. [9] H. Papadopoulos, G. Peeters: “Large-scale study of
chord estimation algorithms based on chroma repre-
Acknowledgement sentation and hmm,” Proceedings of the ISMIR con-
This work was supported in part by OngaCREST, CREST, ference, pp 225-258, 2007.
JST.
[10] M. Rohermeier: “Towards a generative syntax of tonal
harmony,” Journal of Mathematics and Music, Vol. 5,
6. REFERENCES No. 1, pp. 35–53, 2011.
[1] C. Ames: “The Markov Process as a Compositional
[11] R. Scholz, E. Vincent, F. Bimbot: “Robust modeling of
Model: A Survey and Tutorial,” Leonardo Music Jour-
musical chord sequences using probabilistic N-grams,”
nal, Vol. 22, No. 2, pp. 175–187, 1989.
Proceedings of the ICASSP, pp. 53–56, 2009.
[2] J. Biles: “GenJam: A Genetic Algorithm for Generat-
[12] A. Sheh, D. Ellis: “Chord Segmentation and Recogni-
ing Jazz Solos,” Proceedings of ICMC, pp. 131–137,
tion using EM-Trained Hidden Markov Models,” Pro-
1994.
ceedings of the ISMIR conference, pp. 185–191, 2003.
[3] D. Cope: Experiments in Musical Intelligence, A-R
Editions, 1996. [13] M. J. Steedman: “A Generative Grammar for Jazz
Chord Sequences,” Music Perception, 2:1 pp. 52–77,
[4] M. Goto: “Active Music Listening Interfaces based on 1984.
Signal Processing,” Proceedings of ICASSP, pp. 1441–
1444, 2007. [14] D. Temperley: Music and Probability, The MIT Press,
2007.
[5] C. Harte, M. B. Sandler, S. A. Abdallah, E. Gómez:
“Symbolic Representation of Musical Chords: A Pro- [15] K. Yoshii, M. Goto, K. Komatani, T. Ogata and H. G.
posed Syntax for Text Annotations,” Proceedings of Okuno: “Drumix: An Audio Player with Real-time
the ISMIR conference, pp. 66–71, 2005. Drum-part Rearrangement Functions for Active Music
Listening,” IPSJ Digital Courier, Vol. 3, pp. 137–144,
[6] M. Mauch, S. Dixon, C. Harte, M. Casey, B. Fields: 2007.
“Discovering Chord Idioms through Beatles and Real
462