

Université de Montréal

The expression and production of piano timbre: gestural control and technique,
perception and verbalisation in the context of piano performance and practice

par
Michel Bernays

Faculté de musique

Thèse présentée à la Faculté des études supérieures et postdoctorales


en vue de l’obtention du grade de Philosophiæ Doctor (Ph.D.)
en musique, option musicologie

Janvier, 2013

© Michel Bernays, 2013.


Université de Montréal
Faculté des études supérieures et postdoctorales

Cette thèse intitulée:

The expression and production of piano timbre: gestural control and technique,
perception and verbalisation in the context of piano performance and practice

présentée par:

Michel Bernays

a été évaluée par un jury composé des personnes suivantes:

Marie-Hélène Benoit-Otis, président-rapporteur


Caroline Traube, directrice de recherche
Maneli Pirzadeh, membre du jury
John Rink, examinateur externe

RÉSUMÉ

Cette thèse a pour objet l’étude interdisciplinaire et systématique de l’expression du timbre au piano par les pianistes de haut niveau, dans le contexte de l’interprétation et la
pratique musicales. En premier lieu sont exposées la problématique générale et les dif-
férentes définitions et perspectives sur le timbre au piano, selon les points de vue scien-
tifiques et musicaux. Suite à la présentation de la conception du timbre au piano telle
qu’établie par les pianistes dans les traités pédagogiques, la perception et la verbalisation
du timbre au piano sont examinées à l’aide de méthodes scientifiques expérimentales et
quantitatives. Les mots dont usent les pianistes pour décrire et parler de différentes
nuances de timbre sont étudiés de façon quantitative, en fonction de leurs relations sé-
mantiques, et une carte sémantique des descripteurs de timbre communs est dressée.
Dans deux différentes études, la perception du timbre au piano par les pianistes de haut
niveau est examinée. Les résultats suggèrent que les pianistes peuvent identifier et nom-
mer les nuances de timbre contrôlées par l’interprète dans des enregistrements audio, de
façon cohérente et convergente entre production et perception. Enfin, la production et le contrôle gestuel du timbre au piano en interprétation musicale sont explorés à l’aide du
système d’enregistrement d’interprétation Bösendorfer CEUS. La PianoTouch toolbox,
développée spécialement sous MATLAB afin d’extraire des descripteurs d’interprétation
à partir de données de clavier et pédales à haute résolution, est présentée puis mise en
œuvre pour étudier la production expressive du timbre au piano par le toucher et le geste
au sein d’interprétations par quatre pianistes exprimant cinq nuances de timbre et enre-
gistrées avec le système CEUS. Les espaces et portraits gestuels des nuances de timbre
ainsi obtenus présentent différents degrés d’intensité, attaque, équilibre entre les mains,
articulation et usage des pédales. Ces résultats représentent des stratégies communément
employées pour l’expression de chaque nuance de timbre en interprétation au piano.
Mots-clés: piano, timbre, expression, interprétation, geste, toucher, perception, description verbale, Bösendorfer CEUS

ABSTRACT

This dissertation presents an interdisciplinary, systematic study of the expression of piano timbre by advanced-level pianists in the context of musical performance and practice. To begin, general issues and aims are introduced, as well as differing definitions
and perspectives on piano timbre from scientific and musical points of view. After the
conception of piano timbre is presented as documented by pianists in pedagogical treatises, the perception and verbalisation of piano timbre are investigated with experimental and quantitative scientific methods. The words that pianists use to describe and talk
about different timbral nuances are studied quantitatively, according to their semantic
relationships, and a semantic map of common piano timbre descriptors is drawn out. In
two separate studies, the perception of piano timbre by highly skilled pianists is investi-
gated. Results suggest that advanced pianists can identify and label performer-controlled
timbral nuances in audio recordings with consistency and agreement from production to
perception. Finally, the production and gestural control of piano timbre in musical per-
formance is explored using the Bösendorfer CEUS piano performance recording system.
The PianoTouch toolbox, specifically developed in MATLAB for extracting performance
features from high-resolution keyboard and pedalling data, is presented and used to study
the expressive production of piano timbre through touch and gesture in CEUS-recorded
performances by four pianists in five timbral nuances. Gestural spaces and portraits of
the timbral nuances are obtained with differing patterns in intensity, attack, balance be-
tween hands, articulation and pedalling. The data represents common strategies used for
the expression of each timbral nuance in piano performance.

Keywords: piano, timbre, expression, performance, gesture, touch, perception, verbal description, Bösendorfer CEUS

CONTENTS

RÉSUMÉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii

LIST OF APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

I Theoretical and musical perspectives on the expression of piano timbre 9

CHAPTER 1: DEFINITIONS OF PIANO (AND) TIMBRE . . . . . . . . 10


1.1 Timbre . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.2 Acoustic and perceptual definition . . . . . . . . . . . . . . . . 11
1.1.3 Functional definition . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.4 Etymology, synonymy, vocabulary . . . . . . . . . . . . . . . . 15
1.1.5 Timbre in piano music . . . . . . . . . . . . . . . . . . . . . . 18
1.2 The Piano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2.2 Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

1.2.3 Timbre of the instrument . . . . . . . . . . . . . . . . . . . . . 24


1.2.4 Pianist, control and controversy . . . . . . . . . . . . . . . . . 25
1.2.5 Scientific perspective on piano timbre-quality and its control . . 27

CHAPTER 2: EXPRESSION AND PRODUCTION OF PIANO TIMBRE
ACCORDING TO TREATISES . . . . . . . . . . . 30
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 Gesture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.1 Hands and touch . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.2 Arms and body . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Piano techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.1 Chords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.2 Setting the notes in time . . . . . . . . . . . . . . . . . . . . . 38
2.3.3 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.4 Pedals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.5 Articulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4 Mental conception . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5 Composers and their works . . . . . . . . . . . . . . . . . . . . . . . . 44
2.6 Aesthetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

II Perception and verbalisation of piano timbre 52

CHAPTER 3: TIMBRE PERCEPTION AND VERBALISATION
STUDIES . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.1 Timbre perception studies . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.1.1 Methodologies in timbre perception studies . . . . . . . . . . . 54
3.1.2 Analysis of matching data . . . . . . . . . . . . . . . . . . . . 57

3.1.3 Analysis of proximity data: Multidimensional Scaling . . . . . 58


3.1.4 Cluster analysis of proximity data . . . . . . . . . . . . . . . . 64
3.1.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2 Timbre verbalisation studies . . . . . . . . . . . . . . . . . . . . . . . 65
3.2.1 Verbal attributes of timbre . . . . . . . . . . . . . . . . . . . . 67
3.2.2 Free verbalisations of timbre and semantic studies . . . . . . . 70
3.2.3 Verbalisation of intra-instrumental timbre . . . . . . . . . . . . 75
3.2.4 Verbalisation of piano timbre . . . . . . . . . . . . . . . . . . . 81
3.2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

CHAPTER 4: PERCEPTION AND IDENTIFICATION OF PIANO
TIMBRE: A PILOT STUDY . . . . . . . . . . . . . . . . 87
4.1 Introduction – Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.2.1 Production of piano timbre and audio recordings . . . . . . . . 88
4.2.2 Perceptual identification test of piano timbre . . . . . . . . . . 92
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.3.1 Preliminary timbre identification test . . . . . . . . . . . . . . . 94
4.3.2 Main timbre perception test . . . . . . . . . . . . . . . . . . . 94
4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

CHAPTER 5: VERBAL EXPRESSION OF PIANO TIMBRE:
SEMANTIC SCALING OF ADJECTIVAL DESCRIPTORS . . 106
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.2 Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.1 Familiarity with timbre descriptors . . . . . . . . . . . . . . . . 111
5.4.2 Dissimilarities and semantic space . . . . . . . . . . . . . . . . 112

5.4.3 Cluster analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 120


5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

CHAPTER 6: PERCEPTION AND IDENTIFICATION OF PIANO
TIMBRE: A FOLLOW-UP STUDY . . . . . . . . . . . . . 125
6.1 Introduction and aims . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.2.1 Piano timbre descriptors . . . . . . . . . . . . . . . . . . . . . 126
6.2.2 Audio stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.2.3 Perceptual identification test of piano timbre . . . . . . . . . . 130
6.2.4 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.2.5 Testing interface and protocol . . . . . . . . . . . . . . . . . . 131
6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.3.1 Timbre identification rate per participant . . . . . . . . . . . . . 133
6.3.2 Effects of piece, performer and timbre of the stimuli on identifi-
cation rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.3.3 Confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

III Production and gestural control of piano timbre 141

CHAPTER 7: SCIENTIFIC STUDIES OF PIANO PERFORMANCE
AND TIMBRE PRODUCTION . . . . . . . . . . . . . . . 142
7.1 Illustrated epistemological perspective on performance studies . . . . . 142
7.1.1 History of empirical music performance studies . . . . . . . . . 142
7.1.2 Movement and gesture in performance . . . . . . . . . . . . . . 145
7.1.3 Data acquisition technologies for piano performance . . . . . . 147
7.2 Findings on piano performance . . . . . . . . . . . . . . . . . . . . . . 154

7.2.1 Expressive patterns in piano performance . . . . . . . . . . . . 155


7.2.2 Synchrony, articulation and pedalling in piano performance . . 157
7.2.3 Timbre, touch and a single tone . . . . . . . . . . . . . . . . . 160
7.2.4 Timbre in combination of tones . . . . . . . . . . . . . . . . . 168

CHAPTER 8: PIANO TOUCH ANALYSIS: A MATLAB TOOLBOX FOR
EXTRACTING PERFORMANCE FEATURES . . . . . . . 171
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.2 High-resolution data acquisition . . . . . . . . . . . . . . . . . . . . . 172
8.3 Data processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.3.1 From streamed data files to piano rolls . . . . . . . . . . . . . . 175
8.3.2 Retrieving notes . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.3.3 Identifying chords . . . . . . . . . . . . . . . . . . . . . . . . 179
8.4 Piano touch features . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
8.4.1 Single note features . . . . . . . . . . . . . . . . . . . . . . . . 181
8.4.2 Chord features . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.5 Analysis and visualisation functions . . . . . . . . . . . . . . . . . . . 186
8.5.1 Comparison of piano rolls . . . . . . . . . . . . . . . . . . . . 186
8.5.2 Chords and notes selection . . . . . . . . . . . . . . . . . . . . 187
8.5.3 Graphical representation of performance features . . . . . . . . 189
8.5.4 Score matching . . . . . . . . . . . . . . . . . . . . . . . . . . 193
8.5.5 MIDI to boe conversion . . . . . . . . . . . . . . . . . . . . . 195
8.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

CHAPTER 9: EXPRESSION AND GESTURAL CONTROL OF PIANO
TIMBRE: TOUCH AND GESTURE FOR TIMBRE
PRODUCTION IN PIANO PERFORMANCE . . . . . . . 200
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

9.2.1 Timbre descriptors . . . . . . . . . . . . . . . . . . . . . . . . 201


9.2.2 Musical pieces . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.2.3 Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
9.2.4 Participants and task outline . . . . . . . . . . . . . . . . . . . 204
9.2.5 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . 205
9.3 General results and discussion . . . . . . . . . . . . . . . . . . . . . . 206
9.3.1 Significant, timbre-discriminating piano performance features . 206
9.3.2 Gestural spaces of piano timbre . . . . . . . . . . . . . . . . . 207
9.3.3 Gestural description of piano timbre . . . . . . . . . . . . . . . 212
9.3.4 Pairwise comparisons between timbres . . . . . . . . . . . . . 217
9.3.5 Gestural descriptions of piano timbre in time . . . . . . . . . . 219
9.3.6 Piece-wise gestural description of piano timbre . . . . . . . . . 230
9.4 Individual results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
9.4.1 Comments from the participants . . . . . . . . . . . . . . . . . 239
9.4.2 Pianist-wise gestural description of piano timbre . . . . . . . . 242
9.4.3 Comparison between participants’ comments and individual
performance analysis results . . . . . . . . . . . . . . . . . . . 252
9.5 General discussion: gestural portraits of piano timbre . . . . . . . . . . 254

CONCLUSION AND FUTURE WORK . . . . . . . . . . . . . . . . . . . . . 259

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

LIST OF TABLES

4.I Timbre descriptors provided (mostly in French) per audio excerpt
in the free identification task . . . . . . . . . . . . . . . . 95
4.II Synonyms and antonyms of the eight timbre descriptors used as
performance instructions, as extracted and listed from Bellemare
and Traube (2005). . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.III Numerical evaluation of synonymy for the verbal descriptors pro-
vided in the free identification of each excerpt. . . . . . . . . . . 98
4.IV Confusion matrix of forced-choice timbre identification . . . . . . 100

5.I Uses and citations of the 14 timbre descriptors selected for the study 110
5.II Evaluation of familiarity with piano timbre descriptors . . . . . . 111
5.III Coordinates of the 14 piano timbre descriptors along the four di-
mensions of the MDS semantic similarity space . . . . . . . . . . 116

6.I Timbre-wise confusion matrix and timbre identification rates (180
answers and stimuli per timbre). . . . . . . . . . . . . . . 136

9.I Summary of performance features significant in post-hoc pairwise
timbre comparisons. . . . . . . . . . . . . . . . . . . . . 218
LIST OF FIGURES

1.1 Cristofori’s Pianoforte . . . . . . . . . . . . . . . . . . . . . . . 21


1.2 Erard’s Grand Pianoforte . . . . . . . . . . . . . . . . . . . . . . 22
1.3 Piano action mechanism . . . . . . . . . . . . . . . . . . . . . . 24

2.1 Debussy’s La cathédrale engloutie, last page of the score. . . . . 45


2.2 Recitative section near the end of the first movement in Beethoven’s
Sonata no.17 in D minor (right hand solo) (Neuhaus, 1973, p. 59). 46

3.1 Semantic atlas of piano timbre . . . . . . . . . . . . . . . . . . . 85

4.1 Score of the first miniature piece composed for this pilot study. . . 89
4.2 Score of the second miniature piece composed for this pilot study. 90
4.3 Score of the third miniature piece composed for this pilot study. . 91
4.4 The Bösendorfer Imperial 290 grand piano and recording setup . . 92
4.5 Timbre identification rate per participant . . . . . . . . . . . . . . 100
4.6 Timbre identification rate per audio excerpt . . . . . . . . . . . . 101

5.1 Semantic atlas of piano timbre . . . . . . . . . . . . . . . . . . . 108


5.2 Mean evaluation of familiarity with piano timbre descriptors . . . 112
5.3 Scree plot of the stress values corresponding to MDS spaces of
different dimensionalities . . . . . . . . . . . . . . . . . . . . . . 115
5.4 Individual linear representation of each of the four MDS dimensions 117
5.5 Planar projections of the MDS space . . . . . . . . . . . . . . . . 117
5.6 3D representation of the first three MDS dimensions . . . . . . . 118
5.7 Shepard plot of distances in the MDS space and disparities vs.
averaged original dissimilarities. . . . . . . . . . . . . . . . . . . 119
5.8 Dendrogram of the hierarchical clustering of semantic dissimilar-
ities between the 14 piano timbre descriptors. . . . . . . . . . . . 121

5.9 Semantic MDS space plans with groups of neighbouring descrip-
tors highlighted . . . . . . . . . . . . . . . . . . . . . . . 122

6.1 Scores of the four miniature pieces composed and selected for the
study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.2 Cardioid microphones setup used in recording performances on
the Bösendorfer Imperial 290 grand piano . . . . . . . . . . . . . 130
6.3 Page of the testing interface in its initial, pre-manipulation state. . 131
6.4 Page of the testing interface in its final, pre-validation state, with
each excerpt (grey disk) assigned to a different timbre descriptor
(yellow disk). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.5 Timbre identification rate per participant . . . . . . . . . . . . . . 134
6.6 Identification rate per piece by timbre . . . . . . . . . . . . . . . 136
6.7 Identification rate per timbre by performer . . . . . . . . . . . . . 137

8.1 The Imperial Bösendorfer grand piano with embedded CEUS system 173
8.2 Details of the CEUS system . . . . . . . . . . . . . . . . . . . . 174
8.3 Piano roll display of a performance (detail) . . . . . . . . . . . . 176
8.4 Piano roll display of another performance (detail) . . . . . . . . . 177
8.5 Piano roll display with framed chords . . . . . . . . . . . . . . . 180
8.6 Illustration of some basic note features . . . . . . . . . . . . . . . 182
8.7 Note key tracking sectioning analogous to the acoustic temporal
ADSR envelope . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.8 Piano roll display of successive chords with their interval and over-
lap relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
8.9 Superimposed display of two performance piano rolls (detail) . . 187
8.10 Chord selection interface . . . . . . . . . . . . . . . . . . . . . . 188
8.11 Graphical user interface for feature selection . . . . . . . . . . . 190

8.12 Graphical display of one performance piano roll and evolution
over time of selected features . . . . . . . . . . . . . . . . 191
8.13 Comparison in time of two performances, according to three features 192
8.14 Comparison in time of six performances grouped by performer,
according to three features . . . . . . . . . . . . . . . . . . . . . 194
8.15 Graphical output of the score matching function . . . . . . . . . . 196
8.16 Piano rolls generated from MIDI-like text data . . . . . . . . . . 197

9.1 Semantic similarity Multidimensional Scaling space of piano tim-
bre descriptors (Dimensions 1 and 2) . . . . . . . . . . . 202
9.2 Scores of the four miniature pieces composed and selected for the
study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
9.3 Principal Component Analysis of the 192 significant gestural fea-
tures over 80 samples: Dimensions 1 and 2 . . . . . . . . . . . . 208
9.4 Principal Component Analysis of 192 significant gestural features
over the 60 performances of Pieces no.1 and no.2: Dimensions 1
and 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
9.5 Principal Component Analysis of 192 significant gestural features
over the 60 performances of Pieces no.3 and no.4: Dimensions 1
and 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
9.6 Kiviat chart of the 13 gestural features giving a minimal and unique
description of the five timbral nuances explored in the study . . . 213
9.7 Evolution in time, over all performances of Piece no.1 grouped by
timbre, of four gestural control features . . . . . . . . . . . . . . 221
9.8 Evolution in time, over all performances of Piece no.2 grouped by
timbre, of four gestural control features . . . . . . . . . . . . . . 224
9.9 Evolution in time, over all performances of Piece no.3 grouped by
timbre, of four gestural control features . . . . . . . . . . . . . . 226

9.10 Evolution in time, over all performances of Piece no.4 grouped by
timbre, of four gestural control features . . . . . . . . . . 228
9.11 Kiviat chart of the four gestural features giving a minimal and
unique description in Piece no.1 of the five timbral nuances ex-
plored in the study . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.12 Kiviat chart of the eight gestural features giving a minimal and
unique description in Piece no.2 of the five timbral nuances ex-
plored in the study . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.13 Kiviat chart of the six gestural features giving a minimal and unique
description in Piece no.3 of the five timbral nuances explored in
the study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.14 Kiviat chart of the seven gestural features giving a significant, min-
imal and unique description in Piece no.4 of the five timbral nu-
ances explored in the study . . . . . . . . . . . . . . . . . . . . . 237
9.15 Kiviat chart of the 11 minimal performance gesture and touch fea-
tures that can account in RB’s performances for a unique descrip-
tion of each of the five timbral nuances explored in the study . . . 243
9.16 Kiviat chart of the 10 minimal performance gesture and touch fea-
tures that can account in PL’s performances for a unique descrip-
tion of each of the five timbral nuances explored in the study . . . 245
9.17 Kiviat chart of the 8 minimal performance gesture and touch fea-
tures that can account in FP’s performances for a unique descrip-
tion of each of the five timbral nuances explored in the study . . . 247
9.18 Kiviat chart of the 8 minimal performance gesture and touch fea-
tures that can account in BB’s performances for a unique descrip-
tion of each of the five timbral nuances explored in the study . . . 250

LIST OF APPENDICES

Appendix I: Questionnaire used in the semantic study of piano timbre
verbal descriptors . . . . . . . . . . . . . . . . . . xix

Appendix II: Publications stemming from the work described herein . . xxiii

ACKNOWLEDGMENTS

First and foremost, I wish to extend my deepest thanks to my supervisor, Prof. Car-
oline Traube, for initiating this research project and for offering me the opportunity to
work on it; for her constant and contagious enthusiasm and kindness; for her always
judicious advice; for her impressive competence and skill in all technical and musical
matters; for her trust in me from the beginning; and in the end for more than five years
of support and intellectual stimulation.
I wish to thank Prof. Douglas Eck, for investing in the Bösendorfer CEUS piano, a
wondrous research tool at the heart of my Ph.D. project. I also wish to thank him for
his co-supervision in the early years of my Ph.D., until better professional opportunities
were offered him elsewhere.
I would like to express my profound gratitude to BRAMS (International Laboratory
for Brain, Music and Sound Research), for providing me with a perfect working envi-
ronment and all the research equipment a Ph.D. student can dream of. I wish to thank
the Faculté de musique, Université de Montréal, for everything I could learn, and all the
people there, professors and students, with whom I had the chance to interact. I would
also like to thank CIRMMT (Centre for Interdisciplinary Research in Music Media and
Technology) and OICRM (Observatoire interdisciplinaire de création et de recherche en
musique) for the productive research environment I had the privilege to be a part of, for
the opportunities they gave me to present my research, and for their financial support
over the years. And I would like to thank FRQSC (Fonds de recherche du Québec –
Société et culture) for their crucial financial support for the last three years of my Ph.D.
I wish to thank greatly all the people I worked with during my Ph.D. Many thanks
to the BRAMS staff in general, and more particularly to Bernard Bouchard, for his in-
valuable help in all technical and musical matters: my research could not have come
to fruition without his talents and knowledge as a composer and pianist. I would like
to thank all the participants in our studies for their time and for their contributions. In

particular, I wish to address special thanks to the composers of the musical pieces that
were used, and to the five great pianists who spent lots of time and effort in order to play
for us; they made me realise what such talented pianists can accomplish musically.
Big thanks go to Nicolas Riche, for the great work he accomplished in his intern-
ship with me, and to Sébastien Bel, my colleague, collaborator, disciple, successor and
friend. I also wish to thank Mark Zadel and Mike Winters for their proofreading of this
dissertation. I would also like to thank all my colleagues, at BRAMS, CIRMMT, the
Faculté de musique and elsewhere in Montreal and around the world, for all the invalu-
able discussions and sharing of knowledge.
On a personal level, I would like to pay tribute to the musicians who have made all
that great music that has enhanced my life throughout the years.
I now wish to thank all my friends in Montreal, colleagues or otherwise, for all the
great things we have experienced together over these last five years. Likewise, I wish to thank my friends in France and around the world. You are too many to name, but you know who
you are. I love you all.
Last but not least, I wish to address my most heartfelt thanks to my family for their
unyielding love and support throughout my life.

INTRODUCTION

Musical performance is essential to the art and experience of music. Classical performers, in particular, aim to illuminate composers’ works through their own interpretations, a complex task that allows for creative expression. This holds equally true in
piano performance, for which a vast repertoire has been composed in the few centuries
since the instrument’s inception. An extensive, empirical body of knowledge has con-
sequently been developed within the pianistic community, allowing for the most vivid
expressivity to shine through in piano performance. Guidelines about technique, ges-
ture, touch, mental approach and so on have been provided by teachers and pedagogues
throughout piano history — a written trace of which can notably be found in master
treatises from the twentieth century (e.g. Hofmann, 1920; Kochevitsky, 1967; Lhevinne,
1924; Matthay, 1932; Neuhaus, 1973) — in order to help pianists effectively generate the
most expressive performances and refine their “sound” to best convey emotions toward
an audience.
Over the course of the twentieth century, an important body of scientific research
has been devoted to understanding and quantifying the intricacies of musical perfor-
mance. Indeed, a sequence of technological developments since the late nineteenth century has progressively allowed us to capture sound as an acoustic signal. These fixed-medium recordings of musical performances thus form objects suitable for rigorous, quantitative analysis. Moreover, measurements of the act of musical performance itself
— the mechanical action of performers on their instruments, the physiological mechan-
ics or movements of performers — have become possible, with ever increasing accuracy,
with analog and later digital sensor technologies. Meanwhile, many disciplines in the humanities and natural sciences (e.g. psychology and psychoacoustics) have gained much weight and recognition through methodological progress, and data analysis in general has benefited greatly from the ever-increasing computing power of computers.

Consequently, musical performance has become a major focal point of systematic musicology, whose interdisciplinary approach to the musical object, combining the meth-
ods and tools of both humanities (history, anthropology, sociology, semiotics, linguis-
tics) and natural and technological sciences (acoustics, psychoacoustics, cognitive neu-
roscience, kinesiology, computer science) has been applied to studying empirically, ex-
perimentally, quantitatively and qualitatively the many aspects and parameters of expres-
sion in musical performance. Such a catalysis through technological and methodological
means came together with regard to piano performance in the first part of the twentieth
century, with seminal works from Ortmann (1925, 1929) and Seashore (1936) for in-
stance. Much has since been learnt about the timing, dynamics, articulation, pedalling
and overall expressive patterns of piano performance, thanks to systematic studies by
Repp, Rink, Goebl and many others.

However, there remains one crucial musical parameter whose expression in piano
performance is still not well understood: timbre. Defined as a fourth musical parameter
after pitch, loudness and duration, timbre has been studied according to its acoustic and
perceptual features since the late nineteenth century (von Helmholtz, 1885) (then Grey,
1977; McAdams et al., 1995; Risset and Mathews, 1969, for instance). Yet musical
timbre is a multidimensional acoustic parameter — characterised by both spectral and
temporal envelopes and their intricate inter-relations — as well as a complex percep-
tual attribute, which is involved in several functions, from sound source categorisation
to the surface attributes (or quality) of the sound itself. Thus, although timbre can be
first used, broadly, to identify the type of musical instrument one hears (e.g. piano or
violin) or the characteristics of the instrument (e.g. to differentiate between a Steinway
and a Bösendorfer grand piano), more subtle nuances of timbre appear in the sound of
a performance itself. Hence, specific timbral nuances can be brought out by performers
by employing specific gestures and playing techniques, i.e. by controlling the subtle
performance parameters at their disposal.

This concept of performer-controlled timbre is widely acknowledged within the pianistic
community. At an advanced level of piano performance teaching, much attention
is typically devoted to timbre, both as an abstract concept and as the shaping and mod-
ulation of sound the teacher demonstrates at the keyboard. Students are then generally
left to learn by ear what a certain timbre sounds like, and to reproduce the timbre in
their own performances by fine-tuning their playing through trial and error and attention
to auditory feedback, until it sounds “right” to both student and teacher. Consequently,
advanced piano students can develop an acute auditory sensitivity to very subtle sonic
variations. The vast palette of timbres they can discern is illustrated in a rich, detailed
and extensive vocabulary, composed notably of adjectival descriptors, such as bright,
round or velvety. Such terms however remain subjective and attached to the images and
metaphors abstractly evoked by timbre in a sound object. This top-down cognitive inte-
gration of timbre from imagery is definitely useful to pianists, in that it allows them to
develop an individual approach to timbre production and to colour their performances
with very personal timbral nuances which are integrated within the general concept of a
personal “sound” that each pianist ought to aim for. Moreover,this top-down approach
to timbre affords the pianists with more flexibility in adapting to the different pianos
they may come to play. However, this approach may then fall short of reaching the
gestural, technical level of timbre production in piano performance. Indeed, although
highly-skilled pianists can, through their exceptional dexterity, control their instrument
to the finest degree for producing a personal timbral nuance, their understanding of a
precise timbre (which for instance a teacher may request to be obtained) can sometimes
be so deeply ingrained within an abstract construct (built in the learning process through
careful ear-driven refinement) that they may not always explicitly grasp the subtleties
of gesture that control the production of this timbral nuance. It may be presumed that,
beyond the idiosyncrasies of each pianist and the specificities of each piano, there must
be some common patterns of gesture and touch governing the production of a specific
timbral nuance. Thus, however clear the concept of timbre may be to advanced-level
pianists, its actual physical method of production and gestural control can remain blurry.

It would then seem quite relevant and appropriate to explore the production of piano
timbre according to systematic and quantitative methods. However, little scientific re-
search on the performer-controlled production of piano timbre has actually been done.
Despite the groundbreaking works of Ortmann (1925, 1929) on the relation between
touch and timbre in one tone, further investigations have been significantly impeded by
the common misconception in the scientific realm that piano timbre control would be
limited (by the mechanical constraints of the instrument) to sheer keystroke velocity,
thus making timbre inseparable from intensity. This idea, already questionable when a
single key is considered, is utterly debunked once we take into account the subtleties
of articulation and combination of tones in chords, of dynamic differentiation between
tones, of pedalling and so on that pianists can control in a polyphonic musical context
to colour their performances in timbre. In order to quantitatively explore the produc-
tion and gestural control of piano timbre in a polyphonic musical context, it is therefore
essential to identify the fine-grained nuances of these piano performance parameters,
which may have simply been out of reach of early mechanical piano performance mea-
surement tools and digital MIDI data. But the latest-generation technologies of piano
performance-recording systems — including the Bösendorfer CEUS system, which I
had the privilege to use for this work — have now reached a level of accuracy and resolution
in key, hammer and pedal tracking that allows for measuring the subtleties in piano
performance gesture that are involved in producing different timbral nuances.

Aims

This is how we could set about exploring the production and gestural control of pi-
ano timbre for the first time, at the high level of detail and subtlety that the Bösendorfer
CEUS system can bring to light. The aims of this project, however, extend to broader
concerns, as the research is intended to grasp a more holistic understanding of the ex-
pression of piano timbre in performance.
A part of this research (and of this dissertation) will thus concern the verbalisation of
piano timbre, in order to reveal the words used to describe its nuances and their relative
meanings. In particular, we will aim at building a semantic map for the verbal description
of piano timbre. A crucial facet of the auditory perception of piano timbre in a musi-
cal context will also be explored: can pianists agree on the timbral nuances (and their
verbal descriptors) illustrated in different performances of performer-controlled timbre-
colouring? Furthermore, the connections between the verbalisation, perception and pro-
duction of piano timbre will be thoroughly analysed, around the following question:
will the verbal descriptor of a specific timbral nuance used to colour a performance be
retrieved by listeners when asked to label timbre in the performance recording? In other
words, we will explore the relations from words, to performance, to sound, to perception,
and back to words, in the expression of piano timbre.
In parallel, the production of piano timbre through precise gesture, touch,
keyboard control and pedalling will be measured and explored for several different tim-
bral nuances, as defined by specific verbal descriptors. In comparing, in a musical yet
controlled context, how several performers manage to adjust their performances in order
to produce those different timbres, and then in distinguishing the performance features
characteristically employed in producing each timbral nuance, we intend to draw ges-
tural portraits of piano timbre nuances, which would represent the common strategies
essential for producing the timbral nuances, and over which the pianists could apply
additional subtleties to convey their own expression.
Yet such quantitative exploration of the expression of piano timbre would be in vain
without its being set in perspective with regard to the immense empirical body of knowl-
edge developed by master pianists and pedagogues throughout the years.

Thesis overview

This thesis is divided into three parts. The first is devoted to theoretical and musical
perspectives on the expression of piano timbre. In Chapter 1, definitions of the concepts
of piano, timbre, and more specifically, piano timbre, are examined. In particular, several
definitions of timbre are posited, according to the relevant perspectives set forward in
the scientific literature — acoustic, perceptual, functional, and etymological. The role
of timbre in music is then described. The piano is then concisely presented through
its history and its sound-production mechanism, with particular attention given to its
inherent timbre as well as the control over piano timbre accessible to performers —
revealing conflicting opinions both amongst and between musicians and scientists.
Chapter 2 meanwhile focuses on the perspectives on piano timbre described in peda-
gogical treatises, and the recommended gesture, technique and mental conception in the
expression of piano timbre.

The second part is devoted to the perception and verbalisation of piano timbre. Chap-
ter 3 contains a review of the scientific literature on timbre perception and verbalisation
research, the procedural and analytical methodologies employed in such studies, and
relevant findings on the verbal description of piano timbre.
Chapter 4 details a pilot study on the perception, identification and labelling of
piano timbre in recordings of performances coloured in timbral nuances — as con-
trolled by a performer according to specific timbral instructions. This exploratory study
thus attempts to find out whether pianists can identify and consistently label performer-
controlled timbre nuances from only hearing the sound.
In Chapter 5, the verbal expression of piano timbre is studied in detail. The semantic
proximity between the most common adjectival descriptors of piano timbre is evaluated
via questionnaires, and scaling and clustering methods are applied to draw out a semantic
space of piano timbre verbalisation. The descriptors most representative of this whole
semantic space of piano timbre are then identified.


And in Chapter 6, the auditory perception and identification of piano timbre is re-
envisioned, with a more refined and controlled procedure. Preliminary results of this
study are presented and discussed.

The third and last part of the thesis concerns the production and gestural control of
piano timbre. Chapter 7 reviews the scientific research on piano performance and tim-
bre production. First, the epistemology, methodologies and purposes of the research on
movement and gesture in music performance are presented. Data acquisition technolo-
gies for piano performance are then detailed, setting the capacities of the Bösendorfer
CEUS digital piano performance recording system in context.
The findings of piano performance studies are next presented, first regarding the ex-
pressive piano performance parameters that can be considered relevant to the production
of piano timbre, then oriented towards a thorough account of timbre control, through
touch, for a single tone, and for the polyphonic tone combinations in a musical context.
Chapter 8 details the technical, computational aspect of this research, and presents
the MATLAB PianoTouch toolbox developed for analysing piano touch by extracting
performance features from the Bösendorfer CEUS high-resolution keyboard and ped-
alling data. The CEUS technology, the raw data it can acquire, its computational pro-
cessing, note and chord retrieval, extraction of numerous performance features, and the
additional analysis and visualisation functions included in the toolbox are described in
turn.
Lastly, Chapter 9 details a study of touch and gesture in the expressive production
of piano timbre in performance. In a rigorous, quantitative exploration of timbre as an
expressive device in piano performance, CEUS data recordings are collected for several
performances coloured in different timbre nuances. These recordings are analysed with
the PianoTouch toolbox, and the gesture and touch features extracted from each perfor-
mance are compared, with statistical tests, according to the timbral nuance highlighted.
In light of the performance features that significantly differ in the production of differ-
ent timbral nuances, gestural spaces and portraits of piano timbre nuances could thus be
drawn out.

This dissertation concludes with a general discussion of this research, its applica-
tions and perspectives. The results of this research could be used in piano pedagogy,
for a more concrete, gesture-based approach to piano timbre that would complement
the more abstract, mental perspective generally adopted. Moreover, the findings may
also be used for providing better control over digital piano sound synthesis engines.
Indeed, even though the best piano-modelling engines currently available are able to
accurately reproduce the sound of an acoustic piano, their playability, as actual musical
instruments, may be improved with the integration of finer control parameters that would
reflect the subtle details of the keystrokes used for expression and timbral colouring in
piano performance. New digital keyboard interfaces built with this aim could respond to
the subtleties of piano touch, and provide the user with a more realistic sense of touch
as well. Furthermore, an intermediate software layer could be used between the digi-
tal controller and the synthesis algorithm, with the aim of enhancing or simulating the
nuances of piano timbre.
Part I

Theoretical and musical perspectives


on the expression of piano timbre

CHAPTER 1

DEFINITIONS OF PIANO (AND) TIMBRE

1.1 Timbre

1.1.1 Introduction

The notion of timbre, as attached to any sound phenomenon, is relatively easy to
comprehend. Timbre, for instance, can tell us several things about the human voice.
First, we can recognize a voice as such, as timbre allots its categorical nature. Then,
we can distinguish between different voices, as different people possess different vocal
timbres. Finally, for one person and voice, timbre helps us detect the emotional state and
intentions of the speaker through subtle nuances in voice quality.
These three functions of timbre were formalised by Marozeau (2004, p. 11), from
coarse to fine:
– TIMBRE-IDENTITY characterises the broad properties of a sound that let us recognize
the sound source category;
– TIMBRE-INDIVIDUALITY lets us identify a particular sound to a unique source,
thus defined more narrowly than by timbre-identity, yet still including a palette of
sounds — such as the different notes playable on one instrument or the specific
“sound” of one musician;
– TIMBRE-QUALITY is used in the perceptual assessment of a sound, regardless of
its pitch, duration, intensity or localisation — like the sound of one piano note.
However, this rather simple and intuitive, functional and categorical comprehension
of timbre as sound parameter does not result in an equally simple, quantitative definition.

1.1.2 Acoustic and perceptual definition

Indeed, once one delves into deeper analysis of timbre as an acoustic parameter,
its complexity and intangibility soon arise.
Timbre is considered here as a perceptual phenomenon, i.e. an artefact of the infor-
mation from the outer physical world, transformed, organised and structured by the mind
in the form of sensation or memory (Hajda et al., 1997, p. 254). Timbre also constitutes
one of the five essential psychoacoustic parameters of a sound, in addition to pitch, loud-
ness, duration and localisation (see Chailley, 1982, for instance), yet that is as far as a
consensual definition goes.
In 1960, the American National Standards Institute posited the following definition of timbre:

Timbre is that attribute of auditory sensation in terms of which a listener
can judge that two sounds similarly presented and having the same loudness
and pitch are dissimilar. Timbre depends primarily upon the spectrum of
the stimulus, but it also depends on the waveform, the sound pressure, the
frequency location of the spectrum, and the temporal characteristics of the
stimulus. 1

The difficulties in quantifying timbre per se are highlighted in the New Grove Dic-
tionary of Music and Musicians (Campbell and Emerson, 2001):

[Timbre is] a term describing the tonal quality of a sound; a clarinet and
an oboe sounding the same note at the same loudness are said to produce
different timbres. Timbre is a more complex attribute than pitch or loud-
ness, which can each be represented by a one-dimensional scale (high–low
for pitch, loud–soft for loudness); the perception of timbre is a synthesis
of several factors, and in computer-generated music considerable effort has
been devoted to the creation and exploration of multi-dimensional timbral
spaces. The frequency spectrum of a sound, and in particular the ways in
1. The second sentence of this definition was presented as a footnote (ANSI, 1960, p. 45).

which different partials grow in amplitude during the starting transient, are
of great importance in determining the timbre.

On the perceptual end, while timbre is thus defined as a way of differentiating be-
tween two sounds or identifying one sound (Chailley, 1982), only a negative definition
could capture the entire perceptual concept of timbre, by defining what timbre is not:
neither pitch, nor loudness, nor presentation (i.e. duration and localisation). Moreover,
these definitions do not account for the fact that one can still differentiate two sounds in
timbre even if they also differ in pitch and loudness.
These definitions also highlight the multidimensionality of timbre as an acoustic and
psychoacoustic parameter. Timbre thus markedly differs from pitch and loudness, which
can both be assessed on a one-dimensional scale, meaning they can both be described
by one psychoacoustic parameter — itself closely correlated to an acoustic parameter,
frequency or amplitude respectively. Pitch, loudness and duration are directly associated
with physiological auditory processes — the location and frequency of the stimulated
auditory filters defining pitch, the degree of stimulation defining loudness, and the dura-
tion of innervation defining duration (Ortmann, 1935). On the contrary, timbre cannot be
simply measured along one acoustic parameter, for it is multidimensional: many acous-
tic parameters, and their interactions, are involved in forming timbre. Moreover, timbre
is orthogonal to neither pitch nor loudness (Melara and Marks, 1990). Indeed, Krumhansl
and Iverson (1992) identified perceptual interactions between musical pitch and timbre.
Marozeau (2004) also showed that pitch (as the fundamental frequency in a complex
tone) influences timbre perception. And Fletcher (1934) showed that large changes in
loudness can also affect timbre perception.
Most generally, timbre is known to be associated with the spectral and temporal en-
velopes of a sound (Risset, 2004). In more detail, a finer, multidimensional acoustic
description of timbre is still problematic. Many studies on timbre perception have iden-
tified perceptual dimensions of timbre, and sought out their acoustical correlates. The
methodologies and results employed in these studies will be thoroughly detailed in
Chapter 3. While no clear consensus has emerged for an acoustic assessment of perceptual
timbre dimensions (Faure, 2000), the notion of brightness, which appeared as the most
significant perceptual dimension of timbre in most studies, was proven to be closely
correlated with the acoustic parameter of spectral centroid — the “center of mass” in
frequency of the amplitude spectrum of a complex sound.
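For the quantitatively inclined reader, the spectral centroid is simple to compute. The following sketch (in Python, chosen purely for illustration — it is not the tooling used in this research, and the function name is arbitrary) contrasts two synthetic complex tones sharing a 220 Hz fundamental: the tone whose upper partials carry more energy yields a higher centroid, matching the perceptual notion of brightness.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency ("center of mass") of the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

# Two one-second tones, same 220 Hz fundamental, eight partials each;
# the "bright" tone gives its upper partials more weight (1/k vs 1/k^2).
sr = 44100
t = np.arange(sr) / sr
dull = sum((1.0 / k**2) * np.sin(2 * np.pi * 220 * k * t) for k in range(1, 9))
bright = sum((1.0 / k) * np.sin(2 * np.pi * 220 * k * t) for k in range(1, 9))

print(spectral_centroid(dull, sr), spectral_centroid(bright, sr))
# The brighter tone has the higher centroid, despite the identical fundamental.
```

Note that the centroid of both tones lies well above 220 Hz: the measure summarises the whole spectral balance, not the pitch.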
On the broad acoustic picture, Helmholtz established in 1877, by acoustic analysis,
that timbre was characterised by the frequency spectrum, i.e. the relative amplitude of
the different partials in a complex sound (von Helmholtz, 1885). Yet he only consid-
ered stationary sounds — the musical equivalent of which is the sustain portion of an
instrumental sound — and thus studied the long-term frequency spectra, over sounds of
relatively long durations.
Later, Seashore (1938) still agreed with Helmholtz’s definition of timbre as a time-
independent spectrum: “physically the timbre of the tone is a cross section of the tone
quality for the moment represented by the duration of one vibration in the sound”, and
“timbre is that characteristic of a tone which depends on its harmonic structure as mod-
ified by absolute pitch and total intensity. The harmonic structure is expressed in terms
of the number, distribution, and relative intensity of its partials” (Seashore, 1938, p. 97).
Yet he introduced a second notion, SONANCE, to describe the changes in tone quality
over time: “sonance is that aspect of tone quality which results from fluctuations in
pitch, intensity, time, and timbre within a tone” (Seashore, 1938, p. 108). 2
The temporal characterisation of timbre was then brought to light by Schaeffer (1966)
with manipulations of sound recordings. By cutting the attack (the beginning) of a bell
sound, timbre was altered to the point that it sounded like an oboe. He thus confirmed
the role of attack transients in timbre perception that Stumpf (1926) had posited. This
realisation brought Schaeffer to introduce the notion of DYNAMIC TIMBRE for the dy-
namic evolution in time of a sound object. It complements the HARMONIC TIMBRE,

2. This concept of SONANCE is a first example of the issue of synonymy around a general notion of
timbre, which will be discussed in Section 1.1.4.

akin to Helmholtz’s timbre, i.e. characterised entirely by the frequency spectrum. 3 This
temporal aspect of timbre was confirmed by Risset and Mathews (1969), who applied
the analysis–synthesis paradigm to recreate brass tones, and showed that the variation in
time of the spectrum was crucial to obtaining a realistic brass timbre.
Furthermore, timbre perception studies have revealed that some perceptual dimen-
sions of timbre could be correlated with acoustic descriptors of the temporal evolution
of sound spectra. Although studies do not fully agree on the best such acoustic descriptor,
spectral flux (in some variant) appears to apply best.
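To make the descriptor concrete, here is a minimal sketch of one common variant of spectral flux — the frame-to-frame change of the magnitude spectrum. This is an illustrative example in Python, not the analysis code used in this research, and the function name and frame parameters are arbitrary choices. A steady tone, whose spectrum barely evolves, shows far lower flux than a glissando whose spectrum drifts upward.

```python
import numpy as np

def spectral_flux(signal, frame_len=2048, hop=1024):
    """Euclidean distance between magnitude spectra of consecutive windowed frames."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len, hop)]
    mags = [np.abs(np.fft.rfft(f)) for f in frames]
    return np.array([np.linalg.norm(mags[i] - mags[i - 1])
                     for i in range(1, len(mags))])

sr = 22050
t = np.arange(sr) / sr                            # one second of audio
steady = np.sin(2 * np.pi * 440 * t)              # fixed spectrum over time
gliss = np.sin(2 * np.pi * (440 + 400 * t) * t)   # frequency sweeps upward

print(spectral_flux(steady).mean() < spectral_flux(gliss).mean())
# The evolving spectrum yields the higher mean flux.
```

The absolute values depend on frame length, hop size and normalisation — which is precisely why the variants of this descriptor differ between studies.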
Schaeffer underlined that timbre is a temporal construct in his definition of instru-
mental timbre: “a musical variation which relaxes and ‘compensates’ for a causal per-
manency” 4 (Schaeffer, 1966, p. 239). Yet in this definition, and in the rest of Schaeffer’s
opus, is also ingrained another perspective on timbre: its functional use.

1.1.3 Functional definition

Schaeffer thus posits two separate functions of timbre. The first, INSTRUMENTAL
TIMBRE, applies to the whole register and associated sounds a given instrument can
reach. This is “how one can recognize that various sounds come from the same instru-
ment” 5 (Schaeffer, 1966, p. 55). This notion includes the “abstract” variations in timbre
that appear along the range of an instrument, and that are imputable to its conception and
design. This causal definition thus associates timbre with its sound source. In line with
Gibson’s (1966) ecological theory, perception is direct and reflects the physical cause of
sound production (vibrating string, resonant cavity, etc.).
Yet this definition is insufficient. As Schaeffer (1966, p. 233) explains, “instruments
have a timbre, and at the same time each sound object extracted from them has its own
3. Schaeffer also proposes the notion of ATTACK TIMBRE, which encapsulates both temporal and
spectral criteria in the description of a sound attack (and its stiffness) and its role on timbre perception.
4. “Une variation musicale assouplissant et ‘compensant’ une permanence causale.” [our own trans-
lation into English, as for all the following translations from French]
5. “Ce à quoi on reconnaît que divers sons proviennent du même instrument.”

specific timbre”. 6 Consequently, Schaeffer’s second notion, TIMBRE OF A SOUND
OBJECT, defines the intrinsic properties of a given sound, i.e. “the timbre of a sound without
clear association to a given instrument, but rather considered as a proper characteristic
of this sound, as perceived for itself” 7 (Schaeffer, 1966, p. 232). In the musical con-
text, this kind of timbre stems from the instrumental control, thus the fine nuances and
“concrete” variations applied by the performer.
This dual function of timbre is also underlined by Castellengo and Dubois (2007),
as categorisation by timbre follows two points of view: categories of sources which
produced the sound (akin to classification of smells) and categories of sounds themselves
(like colours in the visual modality). This perceptual duality is summed up as sound
source vs. surface attributes (Caclin, 2004), or source vs. interpretative modes of per-
ception (Hajda et al., 1997). I also propose, in a musical, instrumental context, to define a
dichotomy between inter-instrument and intra-instrument timbre, where musical instru-
ments de facto represent identifiable and categorisable sound sources, whereas the tim-
bral nuances available to the performer within an instrument range represent the surface
attributes, or sound quality. We in fact retrieve here Marozeau’s three functions of
timbre that were posited in the introduction to this chapter: timbre-identity concerns the sound
source, while both timbre-individuality and timbre-quality correspond to the sound ob-
ject and its surface attributes. 8

1.1.4 Etymology, synonymy, vocabulary

The term “timbre” is derived from the French language. Its French etymology is
exhaustively described in Lavoie (2008). The term first appeared around 1150 (accord-
ing to the Dictionnaire Larousse de l’ancien français) and comes from the medieval
6. “A la fois les instruments [ont] un timbre, et chaque objet sonore qu’on en tire [a], pourtant, son
timbre particulier”
7. “Le timbre d’un son sans le rapporter clairement à un instrument déterminé, mais plutôt en le
considérant comme une caractéristique donnée propre de ce son, perçue pour elle-même.”
8. Timbre-individuality may also be included within the sound source timbre, by considering the sys-
tem instrument+performer as the actual sound source.

Greek small drum-like instrument tumbanon — whose own name comes from a verb
(transliteration: typtein) describing a hitting action. Risset (2004) refers to the tympanon,
a word derived from tumbanon that designates a thirteenth-century medieval stringed
drum. Later the term “timbre” would designate a bell without clapper. The meaning of
timbre then somehow evolved into designating a seal, an authentication mark, and in its
most common French use, a postage stamp. All those meanings are actually consistent
with the function of identifying the sound source. As the name of an instrument, it came
by metonymy to designate its sound, while a seal clearly holds an identification role.
By extension, timbre also came to define a quality of sound, as attested by Rousseau’s
definition of timbre: “we thus call [timbre], by metaphor, this quality of sound by which
it is sour or soft, dull or bright, dry or mellow” 9 (Rousseau, 1768, p. 528).
According to the Oxford English Dictionary, the word “timbre” first appeared in
English literature in 1849, in Charlotte Brontë’s Shirley: “your voice... has another
‘timbre’ than that hard, deep organ of Miss Mann’s”. Four years later (1853), “timbre”
was first used in scientific literature, in a medical context, referring to the quality of
sound in auscultation by stethoscope, when W.O. Markham, a British medical doctor,
translated Joseph Skoda’s Abhandlung über Perkussion und Auskultation from German
into English (Skoda, 1853) and wrote the following passage:

The voices of individuals, and the sounds of musical instruments, differ, not
only in strength, clearness, and pitch, but (and particularly) in that qual-
ity also for which there is no common distinctive expression, but which is
known as the tone, the character, or timbre of the voice. The timbre of the
thoracic, always differs from the timbre of the oral, voice. . . A strong tho-
racic voice partakes of the timbre of the speaking-trumpet.

The definition of timbre, in this context and translation, focuses on both the quality of
sound and identifying analogy to a musical instrument.
9. “On appelle ainsi [timbre], par métaphore, cette qualité du son par laquelle il est aigre ou doux,
sourd ou éclatant, sec ou mœlleux.”

A more modern definition of timbre as qualification is posited, in English, in Scholes (1960, p. 1028):

Timbre means tone quality — coarse or smooth, ringing or more subtly pen-
etrating, “scarlet” like that of the trumpet, “rich brown” like that of the cello,
or “silver” like that of the flute. These colour analogies come naturally to
every mind, as does the metaphorical term now become a commonplace as
synonym for “timbre” — tone colour; the German for “timbre” is Klang-
farbe — literally “sound-colour”.

We can discern from these definitions that timbre, as sound quality, is generally qual-
ified with trans-modal, analogical adjectival descriptors: sour, soft, dull, bright, dry, mel-
low, scarlet, rich brown are used to characterise different timbral nuances as if they per-
tain to visual or tactile perception. Such metaphors can serve a communicative purpose
in both musical or everyday contexts (Halmrast et al., 2010, p. 184). Most noticeably,
it is suggested that timbre is to sound as colour is to visual perception. It results in a
first common-use synonym for timbre: sound colour. 10 Indeed, the Oxford English Dic-
tionary mentions timbre as equivalent to the German term Klangfarbe (literally sound
colour). Moreover, Helmholtz used Tonfarbe (literally, tone colour) to designate tim-
bre in his seminal work. Yet the German-to-English translator, A.J. Ellis (1885), felt
compelled to address his choice of translation for Tonfarbe among many options such
as timbre, clangtint, quality of tone or colour. His dominant leaning toward timbre was
motivated by the foreign origins of the term and recent inclusion in the English lexicon,
which let the term free of pre-existing connotations. As for the French translation, by
M.G. Guéroult (1868), timbre is almost always used to translate Tonfarbe (with qualité
du son as a rare alternative).
Otto Ortmann, in his empirical, scientific research on piano touch (Ortmann, 1925)
and the physiology of piano playing (Ortmann, 1929), already refers to tone-quality 11 ,
10. “Sound Color” is notably the title of the scientific book by Slawson (1985).
11. Most notably, Ortmann’s (1929) chapter XXIII is entitled “Tone-Qualities”.

a term that he would later define in his article “What is tone-quality?” (Ortmann, 1935).
Accounting for its subjective, perceptual nature, he refutes its status as a fourth, in-
dependent parameter of sound, which he believes is wrongly yielded by the “colour-
ing” descriptive metaphors. He thus posits tone-quality as a psychological, perceptual
and sensory attribute, which explains why its description essentially solicits trans-modal
terms (which describe for other senses a similar sensory stimulation). 12
There are other synonyms of timbre, frequently used in different musical genres,
like sonority, or even simply tone or sound. The term of choice will vary, depending on
times, musical style — be it classical, jazz, electronic or contemporary music — but also
countries, languages 13 , schools, and individuals.
We may yet, for all intents and purposes, understand by any of those terms the same
concept of timbre-quality. We will concern ourselves with this aspect of timbre through-
out this dissertation, in the context of piano performance.

1.1.5 Timbre in piano music

Musically, timbre is an essential guiding principle of orchestration. A good balance
between the timbres of the various orchestral instruments is indispensable for creating
an artful orchestral piece of music — according to Rimsky-Korsakov, “to orchestrate is
to create” (Rimsky-Korsakov and Steinberg, 1964, p. X, editor’s preface) — and had
become primordial to the art of composition by the time Berlioz published his seminal
Grand Traité d’Instrumentation et d’Orchestration Modernes (Berlioz, 1844). Yet or-
chestration does not depend solely on choosing the right combination of instruments, in
which timbre relates to the sound source (timbre-identity): orchestration also involves
the organisation and articulation of timbres in time, for which timbre then has to be
considered as sound quality.
12. For instance, sharp primarily designates a tactile sensation stemming from the stimulation of only
a small patch of skin (thus few end-organs). As a high, loud sound only stimulates a few hair cells from
the most external auditory filters, it is by analogy referred to as sharp (Ortmann, 1935, p. 448).
13. For instance, the use of timbre as sound quality seems more common in French than in English or
German throughout the scientific and musical literature.
Consequently, musical timbre is also relevant in performance. Indeed, musical timbre
can be understood as “an emergent phenomenon dependent on both the physical
features of instruments and the gestures that produce the sound” (Halmrast et al., 2010,
p. 184). As such, within the timbral constraints imposed by the physics of an instrument,
it remains possible for the musician to produce different timbral qualities and nuances.
Pianists are well aware of the major role of timbre in their performances. Tobias Matthay,
the English pianist, teacher and composer, actually distinguishes the two components of
timbre and tone-quality. The first, “tone-colouring”, represents the macrostructure, in-
tentions and movements posited by the composer. On the other hand, “tone-inflexion”
is left to the pianist’s expression (Matthay, 1913, pp. 167–168). In his piano treatise, the
Hungarian pianist György Sándor (1995, p. 8), a former student of Bartók and Kodály,
highlights timbre-quality as characteristic of a pianist: “it is tone quality — the sound
— that is the most essential artistic ingredient in the world of music. Every artist has
a touch and timbre that we can recognize as his own”. Likewise, Heinrich Neuhaus,
the great Russian pedagogue who taught Sviatoslav Richter (among others), emphasizes
the importance of tone (as timbre-quality) in piano performance and practice (Neuhaus,
1973, p. 54): “since music is a tonal art, the most important task, the primary duty of
any performer is to work on tone”. However, pianists cannot rely on a standardised mu-
sic notation or theoretical model for timbre in Western piano music. 14 Additional score
indications are the only notated source of information about the timbral nuance with
which a pianist ought to perform a work of classical (especially tonal) music. Timbre
thus needs to be a prime concern in piano teaching and pedagogy. As Neuhaus (1973,
p. 56) explains, “three-quarters of all work with my pupils is work on tone”.
What then makes timbre such an arduous expressive device to master at the piano,
requiring such diligent and extensive commitment in work and practice? We may now
look for an answer in the nature of the instrument itself, its design and mechanism.
14. Such a system of timbre notation does exist elsewhere, however: for the Guqin (a Chinese string
instrument), around a hundred symbols are used to describe timbral nuances (Van Gulik, 1969).
1.2 The Piano

By no means is this section intended to give an exhaustive and thorough account
of the whole history, making, structure and mechanism of the piano. 15 Yet, a summary
description of the instrument may provide meaningful information about how timbre can
be produced and controlled at the piano.

1.2.1 History

Born in its ancestral Fortepiano form out of Bartolomeo Cristofori’s design in the
eighteenth century (Figure 1.1), the piano started approaching its modern form by 1820
through its development by piano makers Pleyel and Erard (Figure 1.2). In particular, Erard
and his son invented the mechanism of double escapement, which allows for the rapid
repetition of a note before the key is completely released.
The creation of this new instrument was presumably motivated both by the need for
loudness (required in the new, bigger concert halls) and by the growing need for
expressive control (already required in the clavichord repertoire, and which would be-
come indispensable by the end of the eighteenth century with the new style of Haydn or
Mozart). The piano would thus combine the qualities of the harpsichord (loud but lack-
ing in expressive control) and the clavichord (providing appropriate expressive control
but far too quiet).
From these early stages, many technical advancements would soon follow. Interest-
ingly, many refinements to the instrument were initiated by the requests from composers
looking for a better tone and a larger palette of timbres. Many composers became in-
trigued by the tonal possibilities offered by the pianoforte, but were often left wanting
more tonal and expressive qualities. For instance, Gottfried Silbermann’s pianoforte
did not make a good first impression on J.S. Bach in 1726, who pointed out its hard
action and weak treble. But after two years of work by Silbermann to correct these
flaws, J.S. Bach praised the instrument and its tone (although he eventually kept favouring
the harpsichord). Later in the eighteenth century, when the pianoforte really rose
to prominence, many composers were in frequent dialogue with piano makers (for instance
W.A. Mozart with Johann Andreas Stein), and asked for refinements, in both tone
and playability, to their instruments so that they could fulfil their expressive intentions.
15. For such information, one shall refer to Closson (1944) (among others) for piano history, and to
Askenfelt and Jansson (1990, 1991, 1993), Conklin Jr (1996a, b, c), Hall and Askenfelt (1988), Suzuki
and Nakamura (1990) or Boutillon (1990) for its mechanism and acoustic behaviour.

Figure 1.1: Pianoforte by Cristofori (Florence, Italy, 1720), one of the earliest hammer-
action keyboard instruments still conserved. (Picture by the author, © The Metropolitan
Museum of Art, New York City.)

Likewise, Beethoven famously corresponded extensively with Viennese piano maker
Andreas Streicher, asking for a pianoforte that would provide the singing, personal tone
he sought (Skowroneck, 2010, p. 117), in all dynamic registers and especially in the for-
tissimo highlighted by his playing style. These requests yielded for instance the tripling
of strings for each note in the treble register, and both greater resonance and elasticity in
Streicher’s pianos (De Silva, 2008).
Figure 1.2: Grand Pianoforte by Erard et Cie (London, UK, ca. 1840). Including
80 notes and Pierre Erard’s patented double escapement action, it was considered the
most advanced piano of its time. (Picture by the author, © The Metropolitan Museum
of Art, New York City.)

Much of the early technical progress in piano making was thus imputable to the
search for new timbres and expressive features by the composers. With all
these developments, which would continue in the same fashion until at least the early
twentieth century, the essence of the instrument was reshaped, as described by Closson
(1944, p. 53):

We can say the piano, which we can play and listen to nowadays, goes back
hardly as far as the beginning of the [twentieth] century and is already very
much different from the keyboards virtuosi of yesteryear had at their dis-
posal, most especially regarding its power and timbre richness, with the ff
of yesteryear seldom reaching the mf from nowadays; the Pleyel played by
Chopin has nothing left in common with its descendants. 16
16. “On peut dire que le piano actuel, tel que nous le pratiquons et l’entendons, remonte à peine au
commencement du siècle et qu’il est déjà très éloigné des claviers dont disposèrent les grands virtuoses
d’autrefois, tout particulièrement au point de vue de la puissance et de la richesse du timbre, le ff
d’autrefois atteignant à peine le mf d’aujourd’hui ; le Pleyel joué par Chopin n’a plus rien de commun
avec ses descendants.”

1.2.2 Mechanism

The piano as a musical instrument works indirectly 17 by mechanical transfer of energy.
Each key among the (generally) 88 on a keyboard launches a felted hammer that strikes
the strings. In the piano action, a complex system of levers (see Figure 1.3) operates
between the key and the hammer, before the hammer is launched into a ballistic trajectory 18
towards the strings — three strings per note in the high register, one in the lowest. The
hammer can then fall back without damping the string vibration.
Additional springs and levers form the double escapement mechanism on grand pianos,
which speeds up the resetting of the whole key-to-hammer mechanism and lets a
note be repeated rapidly as soon as the key is about half-released (up to the escapement
point at which the reset system activates). A damper also mutes the strings while the key
is at rest; pressing the key lifts the damper and releases the strings. The sustain pedal
also releases the strings from their dampers, but does so for all keys at once, so that
sympathetic vibrations of other strings can occur. The soft pedal displaces the whole
keyboard horizontally, altering the alignment of hammers and strings. Thus fewer strings
are hit (in the high register), and the hammer–string contact point is displaced to a less
worn-out portion of the hammer felt. The result is an attenuated, dulled sound. Finally,
string vibrations are propagated, via the metal frame to which the strings are attached,
to the soundboard, the large piece of soft wood which resonates and amplifies the sound,
while giving it most of the timbre characteristic of the instrument.
The piano thus pertains to the family of chordophones in the Hornbostel–Sachs
classification, yet, as its strings are struck, it can also be considered both a percussion
and a stringed instrument.
17. Contrary to, for instance, the guitar, where the strings are set into vibration by direct plucking action
from the finger.
18. There is no contact at this point between the lever mechanism and the hammer.
Figure 1.3: Piano action mechanism — from key to string. 19

1.2.3 Timbre of the instrument

Without entering into the details of the complex physical phenomena that shape the
piano sound, some global specificities can already differentiate the timbre of the instru-
ment from any others.
First, and contrary to what was stated as a generality in Section 1.1.2, attack tran-
sients are not essential to the identification of piano timbre. As a piano tone does not
possess a sustain portion, its temporal envelope is only divided into an (extremely brief)
attack section and a (long) decay. As Schaeffer (1966) and Leipp (1971) reported, cut-
ting off the attack of a piano tone does not alter its identification as such. On the other
hand, piano tone recordings with their decay phases presented backwards (reversed in
time) are all but unrecognisable (an experiment by Houtsma reported in Rossing et al.,
1990). The decay portion is thus sufficient to identify a piano tone as such.
19. Remesz, O. and ‘Bechstein’, A. (2006). Fortepian – mechanizm angielski,
http://fr.wikipedia.org/wiki/Fichier:Fortepian_-_mechanizm_angielski.svg, accessed on 2009/01/12.
A piano tone is also characterised by the inharmonicity of its partials. The thickness
(especially in the low register) and extreme rigidity of the strings, and the hammer per-
cussion, cause the deviation of partials from the harmonic series (integer multiples of the
fundamental frequency); the higher the partial, the larger the deviation (Guigue, 1994b,
p. 7). Yet this inharmonicity has been proven by Fletcher et al. (1962, p. 758), through
perception tests of digitally synthesized sounds, to be a defining characteristic of piano
tones. The inharmonicity of synthesized tones was identified as a bearer of warmth in
piano timbre. Moreover, the two or three strings that correspond to each key and hammer
(except in the low register) are tuned out of unison by a few cents (Kirk, 1959). This
has two main effects on piano tone: it creates beats between the strings, and it sets the
strings progressively out of phase, which affects decay duration and timbre. Finally, as
the number of strings, their thickness and winding vary with register, piano tones also
vary in quality. 20 Acoustically it affects the observable number, amplitude, harmonicity
and decay rate of partials.
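As a point of reference (this equation does not appear in the source text), the partial deviation described above is commonly captured by the standard stiff-string model from piano acoustics, in the tradition of Fletcher’s work:

```latex
% Standard stiff-string model of piano inharmonicity (a textbook result,
% not a formula from this dissertation). The n-th partial is sharpened
% relative to the harmonic value n f_0, and the deviation grows with the
% partial number n, as described in the text:
\[
  f_n = n f_0 \sqrt{1 + B n^2},
  \qquad
  B \propto \frac{E d^4}{T L^2}
\]
% The inharmonicity coefficient B increases with string stiffness
% (Young's modulus E) and diameter d, and decreases with tension T and
% speaking length L -- hence the especially strong inharmonicity of the
% short, thick strings of the low register.
```

Under this model, cutting B to zero recovers the exact harmonic series, which is why synthesized tones without inharmonicity were judged to lack the “warmth” of real piano tones in the Fletcher et al. (1962) perception tests cited above.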

1.2.4 Pianist, control and controversy

In his musical analysis of Busoni, Desfray (2004, p. 9) thus comments on the techni-
cal progress in piano making:

The evolution of piano making since the beginning of the nineteenth century
reveals a constant concern: the artist has much less control over the tone
quality produced [on the piano] than a violinist or wind instrument player.
This frustration has been progressively compensated for through consider-
able technical progress: the double escapement which allows for very fast
note repetitions and provides for a better control of phrasing. The introduc-
tion of metallic materials (frame, bases) has brought a gain in power and
accuracy. Crossing strings has enriched harmonic resonances. The addition
of the soft, una corda pedal that shifts the hammers leftward can make for a
more distant sound. 21
20. Schaeffer (1966, p. 233) posited a “piano law” according to which the product of ‘dynamic stiffness’
(increasing with register) and ‘harmonic richness’ (decreasing with register) remains constant across the
register.

Despite all these improvements, most of timbre is actually imputable to the piano’s
inherent structural characteristics, both independent from and inaccessible to the
performer. The timbre of a piano is fixed, imposed and defined by the materials, design,
and style of the piano maker, by the characteristics of the hammer felt and of the strings,
and by the tuning.
In this context, the possibility of timbre control by the performer remains far from
obvious. The pianist Emile Bosquet (who studied with Busoni) laconically stated that
“the only variable at the performer’s disposal is the hammer velocity. The prodigious
diversity of effects a virtuoso can obtain with his/her piano is imputable to the range of
different speeds he/she can impart to the hammers; all the rest is literature” 22 (as cited
in Ott, 1987, p. 195).
On the other hand, many musicians and most pianists have defended the role of the
performer in timbre production. Berlioz remarked that “piano, at the level of improve-
ment our skilful piano makers have brought it to nowadays, can be considered in two
ways: as an orchestra instrument or as a small, full orchestra in itself” 23 (Berlioz, 1844,
p. 12). This latter perspective is fully used for piano reduction of orchestral works. A
pianist, with the score of the reduction at hand, can then emulate various orchestra
instruments, in all their timbral diversity. Likewise, Neuhaus (1973, p. 55) thus quoted
Anton Rubinstein: “you think [piano] is one instrument? It is a hundred instruments!”
21. “L’évolution de la facture du piano depuis le début du XIXe siècle laisse apparaître une préoccupation
constante : le contrôle de l’artiste sur la qualité du son produit est très inférieur au contrôle
que peut en avoir le violoniste ou l’instrumentiste à vent. Cette frustration a progressivement été compensée
par des progrès techniques considérables : le double échappement qui permet de répéter des
notes très rapidement et assure un meilleur contrôle du phrasé. L’introduction de matériaux métalliques
(cadres, sommiers...) permit de gagner de la puissance et une plus grande précision. Les cordes croisées
enrichissent la résonance des harmoniques. L’ajout de la pédale douce una corda qui décale les marteaux
vers la gauche permet d’obtenir un son plus lointain.”
22. “La seule variable dont l’exécutant dispose est la vitesse du marteau. La variété prodigieuse des
effets qu’un virtuose obtient à son piano est due à la gamme de vitesses différentes qu’il est capable
d’imprimer aux marteaux, tout le reste est littérature.”
23. “Le piano, au point de perfectionnement où nos habiles facteurs l’ont porté aujourd’hui, peut être
considéré sous un double point de vue: comme instrument d’orchestre ou comme étant lui-même un petit
orchestre complet.”

These statements illustrate the vast range of timbres that the harmonic richness and dy-
namic finesse of the instrument allow the pianist to control and produce — to such an
extent that virtuosi can make their pianos sound like almost all orchestral instruments.
Performance thus has an essential role in piano timbre production, provided the per-
former is up to the task: “everyone knows how to play but only a few know how to
perform” (Neuhaus, 1973, p. 19, quoting Anton Rubinstein once again). The celebrated
pianist John Browning thus explained (Noyle, 1987, p. 29):

[Piano] is not an instrument that has a special character of sound all its own.
The piano will do what you want to do with it, but it in itself has very little
character. Its character comes from what you want it to be. [. . . ] You have
to give the piano character, you have to give it the ability to sing.

And in another take on the same matters, Sándor (1995, pp. 8–9) insists on the possi-
bility for technically skilled and tonally aware pianists to control timbre and tone quality:

Although the piano is far less responsive and sensitive than string instru-
ments (not to mention the human voice), it does nevertheless respond to
one’s technique, and it produces sound accordingly. [. . . ] If we succeed in
cultivating tone quality through a well-coordinated and natural technique,
we will realise that we are the fortunate possessors of a miraculously ex-
pressive and complete instrument, one that is capable of every shading.

1.2.5 Scientific perspective on piano timbre-quality and its control

Several scientific studies have explored the matter of whether pianists can control
piano timbre. The most common conclusion in those studies, and consequently the
prevalent opinion in the scientific world, is that it is impossible for pianists to control
piano timbre independently from hammer velocity. 24
24. It is well proven that hammer velocity, by altering string vibration modes, affects the sound spectrum:
the faster the hammer, the more energy in higher partials — although this relation is nonlinear
and dependent on multiple material parameters (Askenfelt and Jansson, 1990; Hall and Askenfelt, 1988;
Leipp, 1971).

Hart et al. (1934) reported a general agreement between physicists and engineers about
pianists’ lack of control over timbre. It resulted from the simple assumption that at the
time of impact with the strings, the hammer is not in contact with the key, prohibiting
any control other than sheer hammer velocity — loudness and tone quality could not be
dissociated. Several quantitative studies of piano touch and tone-quality (Bryan, 1913;
Hart et al., 1934; Ortmann, 1925; Seashore, 1937) have concluded that pianists could not
control timbre independently of loudness, on a single, isolated note.
Parncutt and Troup (2002, p. 289) sum up an opinion endemic among scientists:

Acousticians and psychologists have often wondered why, in spite of [the]
evidence that the pianist cannot influence how the hammer hits the string,
only the speed with which it does so, so many pianists still believe that the
timbre of a piano tone depends on touch — not only how fast but also how
the key is depressed.

Yet this point of view is directly debatable. As Ortmann (1929), Báron (1958), Goebl
et al. (2004) and Goebl and Fujinaga (2008) have posited, studied and demonstrated, the
various percussive noises produced at the piano by finger–key, key–keybed or hammer–
string contacts play a large part in the perception of a tone and its timbre. Those per-
cussive contact noises were even used musically by twentieth-century composers such
as Stravinsky and Bartók (Risset, 1978, p. 78). Indeed, this percussive property is seminal
to the piano, as Cristofori’s instrument was first named clavicembalo a martelletti. Piano
makers also select meticulously the wood from which to build the keybed, in order to obtain
the desired percussive sound (Askenfelt and Jansson, 1991). This noise was qualitatively
defined as a “rather dull impact sound [that] slightly precedes in time the main [piano]
sound” 25 (Boutillon, 1990, p. 811).
25. “Ce bruit d’impact est assez sourd et précède très légèrement dans le temps le son principal.”
And pianists, for most of whom this point of view of a control-less timbre does
not bode well, have an even more convincing argument at their disposal: in a musical
context, notes are never played independently. This matter of fact is vigorously stated
by Sándor (1995, p. 14):

It has been “proven” by some “experts” that it is only the volume of the
sound that can be altered and that altering tone quality is purely a matter of
imagination. This may be true in playing one single note, but a series of
sounds in sequence is quite another matter: touch and tone quality are most
personal things, and they are quite recognisable. Even if they are hard to
define, the difference in tone qualities [. . . ] undoubtedly exists and is not
imagined. Perhaps it is caused by the rate of acceleration of the speed of the
hammers; perhaps it is the way the damper stops the sound when it descends
on the strings; perhaps it is the spacing of notes, the agogic qualities of the
playing, or the flexibility of metric units — these and many other factors
may influence tone quality. But differences do exist!

Lillie Hermann-Philipp, the renowned German-American piano teacher, concurs on
the involvement of touch in timbre production when in a musical context: “through
combination of different intensities, simultaneously or in succession, one can produce
all the beautiful effects which we attribute to the secret of touch” (Philipp, 1969, p. 43).
Thus, by the complex interactions between tones in addition to their individual shaping,
pianists can control piano timbre and reach a palette of nuances diverse enough to
befit a musical work and illuminate the emotions and characters the performer intends to
bring forward. The timbre of a piano performance can thus be shaped by the performer.
We will see in the following chapter the various strategies, methods and thoughts
that pianists and pedagogues propose to best serve the expression and production of
piano timbre in a performance. In the third part of this thesis, we will quantitatively
explore the processes of gesture and touch, including those just suggested by Sándor, that
allow pianists to control and produce different timbral nuances in their performances.
CHAPTER 2

EXPRESSION AND PRODUCTION OF PIANO TIMBRE ACCORDING TO TREATISES

This chapter is mostly derived from the article published (in French) in the journal
Recherche en éducation musicale (Bernays, 2012).

2.1 Introduction

Virtuoso piano playing, in all its nuances and subtleties that serve the expressiveness
of the performance, is bestowed upon the audience as a manifestation of sheer brilliance.
Behind the ease and genuineness often transpiring from the greatest pianists of the last
two centuries — Chopin, Liszt, Rachmaninov, Busoni, Field, Cortot, Horowitz, Schn-
abel, Anton and Arthur Rubinstein, Gould, Brendel, Argerich, and so many more —
lie hours and hours of intensive work, both technical and artistic, for them to bring out
musically their personal sensibility towards the work performed.
Master piano teachers (often renowned pianists themselves) have documented
their ideas on piano playing technique and pedagogy in treatises. Their precepts and
theories could thus spread farther and be conserved in time thanks to the written medium.
The diversity in empirical experiences between different teachers has yielded as many
different ideas and methods. Some remained grounded in conservatism and inspired by
old traditions of piano performance. Others followed the Zeitgeist and the most fashion-
able ideas of their time. A few proposed revolutionary methods, sometimes predicting
or influencing the future of performance. Amongst the various subjects treatises deal
with, from the most mechanical to the farthest abstraction, the question of timbre some-
times arises. In respect of this essential musical component, what advice could help a
pianist both find his/her “sound” and modulate it for colouring different expressive per-
formances in the most appropriate timbre? This challenge involves even more complex
pedagogical constructs, as the sheer communication and verbal expression of something
that subtle and abstract is problematic.
As we saw in Chapter 1 (Section 1.1, p. 10), even defining timbre is an arduous task,
as every point of view yields a different definition and different terms denote essentially
the same concept. We shall therefore investigate how the pedagogues envision timbre,
and from which level of sound production they believe piano timbre takes its source. We
shall also trace the evolution of these ideas in history through the perspectives of pianists,
composers and teachers, as they are represented in treatises. I will conduct towards
this aim a critical analysis of the most renowned treatises published since the dawn
of the twentieth century — a time by which pianos had become sufficiently modern,
technically and mechanically, to stand comparison with present-day instruments (see
Chapter 1, Section 1.2, p. 20). The points of view of prominent pianists will also be
considered, as presented in interview collections as well as in relevant musicological
studies and analyses. The focus will be particularly set on the written works which deal
with the process of shaping timbre and tone-quality through the subtleties of pianistic
gesture and their holistic integration within the performance.
We shall see that, from treatise to treatise, dominant thinking has evolved or shifted
about the approach to the instrument itself, the focus of gesture (from hands to arm and
body) and even the abstract perspective (mental conception, musical ear and feeling,
aesthetic criteria and integration into the work). Still, even the personal understand-
ings of timbre that pianists may have come to, through both their empirical experience
and the influence of their specific pianistic upbringing, will intersect in some aspects,
and common conceptions of timbre will show through. On the basis of these shared,
empirically-formed perspectives, hypotheses will be proposed about the possible com-
monalities in the expression of piano timbre, according to different aspects — auditory
perception, verbal description, production and expressive control. These hypotheses will
be tested in a scientific, quantitative fashion in the later parts of this dissertation.
2.2 Gesture

The instillation of timbre, or tone-quality, in piano performance is of paramount
importance for pianists and teachers, be it in practice or in the learning process. Such a
marked concern from pianists and teachers about timbre stands despite the controversy
about the degree of control one has over timbre production and the sound of a piano
performance (see Chapter 1, Section 1.2, p. 20). This is consequently highlighted in
treatises by the detailed descriptions and advice about the various approaches to gesture
best suited to obtaining the desired sound and timbre.

2.2.1 Hands and touch

The definition of a “good” performance and of the corresponding technical methods
has changed in time, and can still differ between pianists and treatises. Likewise, meth-
ods for controlling timbre can differ. Gesture is crucial to timbre control, the way for
a pianist to create his/her “sound”. For instance, the pianist Maria Curcio, a disciple
of Artur Schnabel, was said 1 to be able to recognise the generating gesture from the
sound. She could detect technical errors in her students’ gesture without watching their
hands. Hands can be “pattes de velours ([. . . ] velvet paws)”, as Liszt said “when [he]
heard for the first time Henselt, who had an extraordinary ‘velvet’ tone” (Neuhaus, 1973,
p. 56). Neuhaus (1973, p. 72) also explained how different touches can yield different
tone qualities:

To get a [sic] tender, warm, penetrating tone, you have to press the keys very
intensively, deeply, keeping the fingers as close to the keys as possible [. . . ].
But to get an open, broad, flowing tone [. . . ] you have to use the whole
swing of the finger and hand.

Meanwhile, Tobias Matthay developed revolutionary ideas at the dawn of the twen-
tieth century. He especially insists on touch, and on a novel use of the hand and fingers.
1. As cited in Xardel (2002, p. 63).
In particular, he underlines the necessity of musical sound control, which he attributes
to “accurately apply[ing] the right degree of force required for each key-descent”, while
“sensing the varying resistance the key itself offers during descent” (Matthay, 1913,
p. 12). Those notions are respectively summarised as concepts of “Work-sense” (or
“Muscular-sense”) and “Key-sense” (or “Key-resistance-sense”). Consequently, he ad-
vocates that one should avoid striking the finger on the key, instead applying a soft
contact enhancing the tactile sensation. This advice is to be supported by pedagogy: the
conscientious teacher shall diagnose all errors in tone and explain how to avoid them
by respecting the rules of touch (Matthay, 1913, p. 117). He later insists further on the
wrongfulness of striking the key, and commends the enhanced control of timbre variations
induced by an appropriate touch. He thus takes as an example the audible difference
between a solid, rigid chord produced in a sharp arm movement, a sound that “kills mu-
sic”, and a chord played with relaxed elbow and forearm, clearing out the “nasty noise”.
He also separates timbre and tone-quality from dynamics. Contrary to natural tendencies
present even in the performances of virtuosi, he advocates playing forte relaxed and with
a soft touch for improving dynamic control and gradation. A struck touch ought to be
reserved for a voluntarily “ugly” sound (Matthay, 1932, pp. 135–139). 2
2. Citing his own article published in The Music Teacher, March 1931.
Around the same time in France, the great pianist Marie Jaëll was developing her
own method. Inspired by the budding science of physiology, her method rests on the
harmony of touch (Jaëll, 1897, 1899). It emphasizes a relation between the part of the
finger in contact with the key and the sound colour obtained, and even defends the influence
on sound colour of the orientation of papillary lines in fingerprints. In regard to the
part of the fingers that should come into contact with the keys, the great Russian pianist
and teacher Josef Lhevinne (1924, p. 18) advises one to “see that the key is touched with
as resilient a portion of the finger as possible, if a lovely, ringing, singing tone is desired
instead of the hard, metallic one”. That is, one should use the cushions of flesh above
the fingertips, “more elastic, less resisting, more springy” and offering a larger surface
than the bony ends. Still, for “a passage requiring a very brilliant, brittle tone [one shall]
employ a small striking surface, using only the tips of the fingers” (Lhevinne, 1924,
p. 19). Philipp (1969, p. 43) emphatically asserts that “the varying shades of touch or
strike intensity are tone production”. Technically, with tone and touch thus coupled, she
advocates that “good tone and touch can be created with curved fingers, with weight or
without weight. It can also be produced by playing on the cushions of the fingers held
comparatively straight, à la Chopin”. The second part of that sentence is in agreement
with the guidance on touch advocated by Matthay, Jaëll and Lhevinne, while its begin-
ning (curved fingers) is rather at odds with them, and more akin to older principles of
piano playing. The finger strategies for obtaining the best touch may thus vary between
methods.
Touch had taken on a central role in timbre production by the 1900s. However, fo-
cusing on touch requires keeping the fingers in contact with the keys as much as possible
(for tactile feedback), and generally staying close to the keyboard at all times. The am-
plitude of movements is thus restrained, which limits the available dynamic range.
Moreover, according to a more modern perspective (Sándor, 1995, p. 60), “the tone
quality tends to become rather bland and dull”. Consequently, in order to access a larger
dynamic range and control a richer palette of timbral nuances, the role of the arms and
body in expressive piano performance must also be considered.

2.2.2 Arms and body

In the twentieth century a school of thought developed that favours the involvement
of the arms and body and their articulations in piano playing. Indeed, in the late nine-
teenth century, Ludwig Deppe proposed that “tone must be produced, not by finger stroke
— that is, not by requiring unnatural strength from the relatively weak muscles of the
hand and fingers — but by coordinated action of all parts of the arms” (Kochevitsky,
1967, p. 8). The French pedagogue Monique Deschaussées (1982, p. 43) explains that
“the combination of those articulations [wrist, elbow, shoulder] in motion creates varied
attacks and an infinity of sonorities”. 3 Likewise, the degree of flexibility in the artic-
ulations directly influences tone, as for instance “it is next to impossible to produce a
good singing tone with a stiff wrist”, although the stiff wrist is considered “absolutely
necessary” to a brilliant tone (Lhevinne, 1924, p. 19).
The control of body weight and the position of its centre of gravity are also essential
for tone production. The Californian piano teacher Christie Skouser (2012, pp. 96–97)
for instance advises, when playing Chopin, that:

To produce a rich and deep tone that utilizes the weight of the arm, [you
should] sit with your full body weight centred on the bench so you sense a
solid centre of gravity. This will help you to feel the power come up through
your back, to your shoulders and down through your arm. The arm must be
relaxed, with the weight coming from this solid, strong core, not just from
the arm itself.

We can see here how the way of using the body depends on the school and the period.
In the late eighteenth century, the prevailing method of Muzio Clementi asked for the
utmost rigidity and immobility from the hands up, with only the fingers left to play.
About a century later, the German pianist Heinrich Ehrlich still advised one to keep the
arms immobile and held against the body. Even in the twentieth century, Arthur Rubinstein con-
sidered that the body ought to remain still, with the straightest posture, and the whole
weight projected into the hands for better sound control. Yet Sándor warns against still-
ness, rigidity and tense playing: “some may think that it is easier to achieve control with
tensed-up muscles and with fixed hands, but under these circumstances not only the flow
of music suffers, but the tone quality as well” (Sándor, 1995, p. 70). Muscular relaxation
thus became paramount in piano teachings of the twentieth century. For Neuhaus (1973,
p. 69), “the condition sine qua non for a good tone is complete freedom and relaxation
of the arm and wrist from the shoulders to the tips of the fingers”.
3. “La combinaison de ces articulations [poignet, coude, épaule] en mouvement crée des attaques
variées et une infinité de sonorités.”

Through his physiological approach to piano playing, Schultz (1936) asserted the
conjoined and indivisible muscular involvement of fingers and arms in piano playing.
Inspired by Ortmann’s (1929) scientific research on the physiological mechanics of pi-
ano technique (which he integrated in his pedagogical theory), he highlighted the impor-
tance of finger coordination in obtaining good tone control, and insisted on the complex
combinations of muscles — to coordinate through the arm down to the weaker finger
muscles — required to apply physiologically appropriate types of movements. Yet as
his theory required stabilising and fixating joints for better control, some degree of mus-
cular tension was inevitable. This ran contrary to the predominant recommendation of
complete arm relaxation — while not reaching however the extreme stiffness advised by
the old finger-school.
Still, whatever the approach, the arms can be technically engaged in the control of
tone-quality with articulation. For the American pianist and professor Seymour Fink
(1992, p. 142), “the arms facilitate the control, accuracy, sonority, and speed of the
fingers managing connected lines”.

2.3 Piano techniques

Thus, the gestural control of piano timbre does not rely solely on the disposition of
the body; it can also be conceived of at the level of piano technique. Indeed, technical
elements in the playing of chords, and timing and rhythmic considerations such as rubato,
are largely involved as well.

2.3.1 Chords

A chord can be understood as a sound molecule, made up of atom notes of musical
content (Neuhaus, 1973, p. 58). This metaphor underlines the emergent properties of
sound and timbre that can arise at the molecule–chord level, thus creating a new musical
identity that goes beyond its constitutive notes. For instance, a C-major triad (C–E–G)
cannot be reduced to the simple sum of three independent notes C, E and G, nor does it
coincide with the [C–E–G] component in the [C–E–G–Bb] C7 chord. Matthay likewise
insists on the importance of the balance in sound colour between notes in a chord, as the
right combination of the colours in each note will provide the intended, emerging overall
tone-quality. Risset (2004, p. 146) further details this point of view, taking the example
of the renowned pianist Alfred Cortot:

We often talk about the sound of a pianist. Alfred Cortot was renowned for
his exceptional tone: when he started playing after teaching his students,
he would transfigure the timbre of the piano and would give the impression
he was playing on a different instrument. However, for each note, a pianist
can only control intensity and the duration of key depression. Yet he can
gauge the respective intensities in the notes of a chord, and also shift them
imperceptibly in time. 4

Yet finding the right balance in tone between notes in a chord is not that easy. Philipp
(1969, p. 46) remarks that “if the dynamics of the fingers are not under strict control, the
longer fingers may produce more tone than the shorter ones. This produces uncalled-for
overtones”. But once the balance is right, playing one of the components in a chord louder will
result in different timbral nuances: “if we play the top note louder, the chord sounds
brilliant; if we bring out the bass, it will sound warmer; but if one of the middle tones
comes out more strongly, the sound will be harsh” (Philipp, 1969, p. 46). This remains a
broad picture, as many more subtleties are involved in producing such timbral nuances,
but it nonetheless gives a fair indication of the role that balancing the tones in a chord plays
in timbre production. Meanwhile, balancing tones is also crucial for voicing (in counterpoint
especially), in order for each melodic line to be distinguished in a proper tone quality.
4. “On parle souvent de la sonorité d’un pianiste. Alfred Cortot était réputé pour sa sonorité excep-
tionnelle : lorsqu’il se mettait au piano après ses élèves lors de ses cours d’interprétation, il transfigurait
le timbre du piano et donnait l’impression qu’il jouait sur un autre instrument. Or, pour chaque note,
le pianiste ne peut contrôler que l’intensité et la durée d’enfoncement. Mais il peut doser les intensités
respectives des notes d’un accord, et aussi les décaler insensiblement dans le temps.”

2.3.2 Setting the notes in time

Moreover, the temporal delays between the notes of a chord may also be involved
in timbre production. Many treatises indeed mention a relation between tone-quality
and rhythm. For example, the notion of “Time-spot” (Matthay, 1932, p. 29) defines
the instant at which a note should sound so as best to serve the intended musicality. This
plays a prime role in controlling the sound, as does the haptic information conveyed by
the tactile sensation of the key’s resistance. Matthay also insists on continuity in
performance, both rhythmical and timbral, a point especially relevant to rubato playing.
Neuhaus later follows on this point of view, insisting on the relation between tone and
rhythm in rubato:

It is impossible to determine the degree of rhythmic freedom of a phrase if
the correct nuances have not been found. Tone and rhythm go hand in hand,
help each other and only jointly can they solve the problem of ensuring an
expressive performance. (Neuhaus, 1973, p. 53).

In short: “everything [tone, rhythm, expression] is part of one whole” (Neuhaus, 1973,
p. 53).

2.3.3 Dynamics

Timbre and dynamics are likewise interdependent. Philipp (1969, p. 43)
refers to “Hermann [von] Helmholtz, the great physiologist and physicist, [who] claimed
that tone color changes with the varying speed of the hammer”. Likewise, for Sándor
(1995, p. 15), pianists “are able to modify the dynamic level and tone quality by altering
the speed with which the hammer strikes the key”. Matthay thus advocates presenting,
by way of dynamics, a wide range of tone-qualities in a musical passage. One should
reach pianissimo with a very soft sound, so that the timbre in forte may stand out.
Hence, he warns against teachers who may demand the loudest sounds instead of looking
for “Tone-contrasts” (Matthay, 1913, p. 117).

2.3.4 Pedals

Finally, pedals are of the utmost importance for timbre, as “the pedal gives color” to
the performance, according to the late American pianist John Browning (cited in Noyle,
1987, p. 30). With the soft (or una-corda) pedal, “a contrasting, slightly muffled tone
color is created” (Fink, 1992, p. 69). According to Josef Hofmann (1920, p. II-43),
former pupil of Anton Rubinstein, the soft pedal should actually “serve to change the
quality of tone, not the quantity”, and “be employed only when the softness of tone is
coupled with a change of colouring”. On the other hand, the damper pedal is “one of the
greatest means for colouring” (Hofmann, 1920, p. II-39). Seymour Fink (1992, p. 66)
further describes its effect on tone:

The tonal characteristics of the piano are greatly enhanced by its ability to
enrich and sustain the tones produced [with the damper pedal] [. . . ] [whose]
inflections and transformations are unique to the instrument, allowing it to
sing in ways no other instrument can.

Technically, anticipatory pedalling is advised for tonal richness, as “depressing the
pedal before starting to play will produce a richer sound than will striking the note or
chord first and depressing the pedal a split second later, since with the dampers already
raised, the sympathetic partials will vibrate fully” (Banowetz, 1985, p. 70).

2.3.5 Articulation

Fink (1992, p. 126) presents articulation as another key to timbral colouring:

Few skills contribute to the quality of piano sound more than a pianist’s
ability to vary and manage the degree of connection between tones. [. . . ]
The ability to perform a consciously controlled, overlapping finger legato
enables pianists to enrich their sound in ways that are more subtle than the
use of the pedal alone.

The articulation between tones essentially depends on key release as, still accord-
ing to Fink (1992, p. 173), “the ability to control the timing and speed of finger releases
vastly increases your potential for tonal control. [. . . ] For example, [pianists] often over-
lap fingers (finger pedaling) as a subtle means of enriching the sonority”. Controlling
note releases and finger-overlap techniques thus provides a finer means of inflecting
timbral nuances, complementary to pedalling.

2.4 Mental conception

Yet beyond these technical and physical perspectives, George Kochevitsky proposed
at the end of the 1960s a theory of piano playing, inspired by physiology, that gives
the central nervous system a primary role (Kochevitsky, 1967, pp. 21–27). Regarding
timbre, “the kind of movement (its form) and amount of energy exerted for producing
this or that tone are of secondary consideration” (Kochevitsky, 1967, p. 38). Walter
Gieseking goes even further: “it is useless to look for the reason of the beautiful tone in
some particular finger position or hand position” (Kochevitsky, 1967, p. 38).
That is why, for most pianists and pedagogues (especially in the twentieth century),
the key to inflecting a performance with the right timbre resides in one’s mind. Kochevit-
sky thus declares “the quality of a pianist’s tone depends mainly on his mental concep-
tion, his inward imagination of the tone which has to be produced” (Kochevitsky, 1967,
p. 38). According to him, such capacity for the mental conception of timbre does not rely
on gesture analysis. He relates the case of John Field, a pianist lauded for his singing
tone, a particularity explained at the time by the positioning of his fingers,
almost perpendicular to the keyboard — an idea that Kochevitsky adamantly rejects.
Kochevitsky also criticises the extreme idea according to which a pianist’s sound
stems from the physiognomy of his/her hands. He thus flatly rejects as “ridiculous” [sic]
Leschetizky’s explanation of Anton Rubinstein’s sound by his broad and thick fingers.
This leads him to conclude: “it is hopeless to look for beauty of tone in some kind of
pianistic movement. [. . . ] [Young students] should listen rather than look. Visual per-
ception without inner conception is of little help: it can lead to quite wrong conclusions”
(Kochevitsky, 1967, p. 38). Or, to complete Gieseking’s quote: “I am convinced that
the only way to learn to produce beautiful tone is systematic ear training” (Kochevitsky,
1967, p. 38).

Here we come to the role of teaching. Yet “work on tone is the most difficult
work of all, since it is closely connected with the ear” (Neuhaus, 1973, p. 56). Teaching
piano tone can thus work in two complementary ways:

By training [the pupil’s] ear, we directly influence his tone. By working at
the instrument, persevering relentlessly in an attempt to improve the quality
of tone we influence the ear and develop it (Neuhaus, 1973, p. 56).

Ludwig Deppe wanted to “awaken a keen sense of tonal beauty in the minds of his
pupils” (Kochevitsky, 1967, p. 8) by ear training with a deep attention to each tone.
He was thus going against the dominant methods of his time (the end of the nineteenth
century) epitomised by Czerny’s precepts for a mechanical gesture. More generally,
Matthay advises both students and teachers to “learn to think” more and better, to display
“vividness of imagination” in mentally conceiving of a performance and its sonority, to
anticipate the sound to be produced, and to approach tone and rhythm hand in hand for
achieving a natural performance, flowing at the “timing of the consciousness” (Matthay,
1913, pp. 3, 10 and 31 respectively). Likewise, Hofmann (1920, p. I-43) suggests that
“the student should strive to acquire the ability to form the tonal picture in his mind,
rather than the note picture”, i.e. preconceive the whole tonal production instead of just
preparing for a sequence of key depressions.
Practice and rehearsal, for concert pianists like Jorge Bolet and André Watts (inter-
viewed in Noyle, 1987, pp. 17 and 143 respectively), thus become a mental matter. Leon
Fleisher (Noyle, 1987, p. 145) advised his students to confront technical difficulties, not
by playing, but by looking for a driving mental image. For the student taking on a new
work, since “a tonal picture of perfect clarity should be prepared in the mind before the
mechanical (or technical) practicing begins”, Hofmann recommends that “it will be best
to ask the teacher to play the piece for [the student], and thus help [him/her] in forming
a correct tonal picture in [his/her] mind” (Hofmann, 1920, p. I-37). The right gesture
would then naturally follow: “make the mental tonal picture
sharp; the fingers must and will obey it” (Hofmann, 1920, p. I-39).
Mental conception and musical ear are also involved in the immediate, critical self-
analysis feedback. Kochevitsky mentions this feedback as an evolving process of con-
structing an inner conception of tone. Influenced by the tone produced and its enriching
of the imagination, this process results in a mental image which, once it is robust and
precise enough, can guide the motor functions of piano playing, and thus control arms,
hands and fingers. For the American pianist and professor André Watts, one shall thus
develop the “ability to hear as you play, to hear and to judge at the same time so that
you can listen to what you’re producing” (Noyle, 1987, p. 145). In this context, accord-
ing to Vladimir Ashkenazy (interviewed in Noyle, 1987, p. 8), memorisation becomes
multimodal, with digital–tactile, visual, auditory and mental memory combined and in-
tertwined. For Kochevitsky (1967, p. 37), timbre, thus mentally internalised, “defies
abstract description and must be confirmed by live illustration and auditory perception”.
There exists, however, a recourse in the verbal description of piano timbre, whose
frequent use in piano teaching is highlighted by Neuhaus (1973, p. 62): “we teachers
inevitably and constantly use metaphors to define the various ways of producing tone
on the piano”. Beyond metaphor alone, the pianist can also turn to imitation. As
Neuhaus (1973, p. 64) explains, “the piano is the best actor of all the instruments for it
can play the most varied roles”. A pianist can in particular try to imitate the human voice.
John Browning (cited by Noyle, 1987, p. 29) actually tried in his playing to “imitate a
good bel canto singing, both stylistically and in terms of sound”. To achieve this, one
must get rid of the instrument’s percussiveness, according to Neuhaus (1973, p. 59): “one
must first of all learn to make the piano sing, and not only ‘strike’ it”.

The musical sense now comes into play, driving the expression of an intention, a mu-
sical idea, by way of technique. According to Abbey Simon (concert pianist interviewed
in Noyle, 1987, p. 120), the same goes for daily practice: to be effective, it must be driven
by a musical idea. This must be instilled as early as possible, with particular insistence
on the conveying of emotions:

The child should be made, at the earliest possible stage, to play a sad melody
sadly, a gay melody gaily, a solemn melody solemnly, etc. and should make
his musical and artistic intention completely clear.

This reliance on musical sense can be understood as the introduction of character within
piano performance.

To conclude these last two parts, working on piano performance can be approached
at different levels. Neuhaus (1973, p. 57) distinguishes three parts:

First — the image (i.e. the meaning, content, expression, the what-it-is-all-
about [sic]); second — tone in time — the embodiment, the materialization
of the image, and finally, the third — technique as a whole, as the sum total
of all means essential for solving the artistic problem of piano playing as
such, i.e. mastery of the muscular movements of the performer and of the
mechanism of the instrument.

This quote effectively encompasses all the concepts presented up to this point that
concern piano performance and especially timbre production. However, timbre, sonor-
ity, tone-quality, sound production, whichever the name given, cannot be a complete
construct of the performer freely channelling his/her artistic sensibility: the composer
has first laid the sonic framework. We shall now see how composers can infuse timbre
in their piano works.

2.5 Composers and their works

Debussy is perhaps the paragon of composers for whom timbre holds a central role
in their works. A pianist himself, Debussy composed many piano works. This section
examines the uses and functions of timbre in some of these compositions.
First, Debussy’s attention to timbre and the precise nuances of colour he wants for
his works appear in additional score indications. For instance, in the piece Des pas sur
la neige from his Piano Preludes, the first bar is annotated by this wonderful metaphoric
depiction: “ce rythme doit avoir la valeur sonore d’un fond de paysage triste et glacé” 5
(Howat, 1997, p. 86). Debussy can be credited with single-handedly bringing about an
evolution in the perspective on piano tone, which he used as a painter does colours. Piano
tone was thus transformed into an “orchestra of timbres” 6 (Dubé, 2003, p. 28). In those
times, French music was probably characterised above all by the colours and voices driving
the expression, rather than by rubato or other rhythmical variations. Beyond straightforward
indications, Debussy’s writing style itself imposes the timbral expression to adopt.
For instance, in the last page of the score of La cathédrale engloutie, the melodic line
drawn by the right hand gets drowned in the left-hand accompaniment, thus “ringing” as
waves of cathedral bells (see Figure 2.1, first 12 bars). Such an effect flows so naturally
from the score that it is almost irrepressible for the pianist (Howat, 1997, p. 87).
Since the origins of the piano, however, composers have praised its tone and the new tim-
bral possibilities it offered. When in 1773 Muzio Clementi composed the first work
especially intended for the pianoforte (his Sonata Op.2), he “indulged in experiments and
tried all sorts of technical and coloristic effects” (Kochevitsky, 1967, p. 1), thus giving a
central role to the yet unexplored timbres of the pianoforte. Beethoven later praised the
pianoforte for the personal tone character and the singing tone it affords.
He made use of this singing tone for instance in the recitative section of the first move-
ment of his Sonata no.17 in D minor, op.31 no.2 (Figure 2.2), in which he particularly
5. Which translates as: “this rhythm must have the sound value of a sad and icy landscape background.”
6. “un orchestre de timbres”
Figure 2.1: Debussy’s La cathédrale engloutie, last page of the score.

aimed at imitating the human voice (an example cited in Neuhaus, 1973, p. 59). More
generally, Beethoven conceived of timbre as a full dimension in the musical discourse
of his pianoforte compositions (Guigue, 1994a). Liszt later used the capacity of the pi-
ano to imitate orchestral instruments, and worked consciously in his piano transcriptions
toward reproducing orchestral timbres.

Figure 2.2: Recitative section near the end of the first movement in Beethoven’s Sonata
no.17 in D minor (right hand solo) (Neuhaus, 1973, p. 59).

Further evidence of the conscious use of timbre expression in piano compositions can
be found in score annotations, as indications such as maestoso (marked for instance in
Chopin’s Third Sonata) can be understood as indications of timbre and/or character. The
integration and role of timbre in composition would only grow from there.
Meanwhile, pianists, more or less guided by the composer, must extract from the
work and its score the information relevant to colouring their performances with the
appropriate timbral nuances. To this end, the musical structure can provide in-
dications. The French-Cypriot pianist Cyprien Katsaris, cited in Xardel (2002, p. 177),
projects within his performance the forms, sculptures or architecture that the work evokes
to him. Ashkenazy (cited in Noyle, 1987, p. 11) first adopts a global perspective on a
composition. He thus aims at bringing out the meaning intended by the composer and
communicating it through his expressive performance. At a finer, local structure level,
according to the late Cuban-American pianist Jorge Bolet (cited in Noyle, 1987, p. 17),
each phrase carries its own nuances. Matthay (1913, p. 34) nonetheless advises pianists
to keep the whole work in mind, as a “continuous movement” or progression.
Neuhaus (1973, p. 55) sums up the relation of tone quality to the work: “the concept
of beauty of tone is not sensuously static but dialectic; the best tone, and consequently
the most beautiful, is the one which renders a particular meaning in the best possible
manner.” Tone is thus defined not in isolation but as contextually anchored in the work
according to the composer’s intentions. In conclusion, “it is possible to work effectively
at the tone quality only when working at the work itself, the music and its components”
(Neuhaus, 1973, p. 68).

2.6 Aesthetics

The artistic and aesthetic dimension in the expressive production of piano timbre is
the achievement of a mental conception, guided by the work and driven by technique.
This definition would actually extend to music as a whole:

What is the “artistic image of a musical composition” but music itself, the
living fabric of sound, musical language with its rules, its component parts,
which we call melody, harmony, polyphony, etc., a specific formal structure,
an emotional and poetic content? (Neuhaus, 1973, p. 7)

Thus piano technique can be used as a tool for artistic expression, as Neuhaus (1973,
p. 61) wittily sums up:

Is he a pianist because he has a good technique? No, of course not; he has a
good technique because he is a pianist, because he finds meaning in sounds,
the poetic content of music, its regular structure and harmony.

This expression is naturally, and frequently, set in comparison with painting. Matthay
identifies the same kind of movement between the “movement upon the canvas”, first
from the painter’s brush then from the audience’s eye, and music movement “upon a
time-surface” — as defined by a “sequence of beats or pulses”, which forms a “thor-
oughly tangible time-canvas of Pulse” over which pianists “lay out the progression of
[their] musical picture”. Music and painting thus share both the same “sense of Pro-
gression, or Movement” and “the necessity of some medium upon which to fix [these]
progressions” (Matthay, 1913, pp. 32–33). Neuhaus (1973, p. 63) thus details the con-
nection to sound:

How often the playing of a grand master makes us think of a picture with
a deep background and varying planes; the figures in the foreground almost
leap out of the frame whereas in the background the mountains and clouds
are lost in a blue haze. Remember for instance, Perugino, Raphael, Claude
Lorrain, Leonardo, our own great [Russian] painters, and let them influence
your playing, your tone.

Consequently, performers can, by their playing, vary the tone-qualities so as to cre-
ate “tone perspectives” that separate polyphonic or harmonic lines like the foreground,
middle-ground and background in the composition of a painting. Furthermore, the great-
est pianists use this depth of tonal variations to display their idiosyncratic identities in
the several levels and layers of their performances. Neuhaus (1973, p. 68), continuing
the analogy with painters, explains:

The difference and variety of the tonal picture presented by various great
performers is infinite because of the differences in their personalities, just as
the paint, colour and light of great painters differ.

2.7 Conclusion

In the end, timbre for pianists seems to be a matter of sensation and sensibility. Its
rightful construction first requires the visualisation of a mental image derived from the
work performed. The gestural process for producing the desired timbre is then
channelled through this mental image, in forms that differ between pianistic schools.
“Music is a chaos which must be organised to find the ineffable” 7 , as the French
musicologist Vladimir Jankélévitch (1961) put it. To this end, Horowitz advocates, “technique
must be used as a mechanism” 8 , and complemented by phrasing, colours, levels and
layers in order to transcend the physical instrument and induce emotion. The piano, once
upon a time dismissed by Voltaire as an “instrument for coalmen” 9 , has since revealed
the subtleties of playing it offers. Berlioz, speaking about a passage of his Fantaisie sur
la Tempête de Shakespeare, thus paid tribute to the piano:
7. “La musique est un chaos à organiser pour trouver l’ineffable.”
8. “La technique doit être utilisée comme mécanisme” (Cyprien Katsaris in Xardel, 2002, p. 179).
9. “Un instrument de charbonniers”, in Correspondance à Mme Du Deffand, 1774.
49

No other known instrument would produce this kind of harmonious crack-
ling sound that the piano can make without difficulty and that the sylphlike
intent of the piece there made conceivable. 10 (Berlioz, 1844, excerpt)

Thus is set the fundamental connection between expressivity in performance and its
acoustic trace, i.e. the sound. Consequently, the formal function of timbre is revealed: to
organise the sound matter in order to clarify the form. As Neuhaus (1973, p. 68) stated:

What gives us the impression of a beautiful tone is in actual fact something
much greater; it is the expressiveness of the performance, or in other words,
the ordering of sound in the process of performing a composition.

The mental image of the timbral nuances that colour a performance is thus inspired by
what the composition calls for, both formally and expressively.
Yet the following opinion has been highlighted in most of the piano literature re-
viewed here: working on tone is above all about the personal ‘sound’ each pianist ought
to develop. As a consequence, the proper timbral nuances that highlight the composi-
tion would actually be overlaid in the general tonal picture of the performance by the
idiosyncratic tone quality that the pianist should aim at bringing forth. There could thus
seem to be little in common, in both sound and production, between the same timbral
nuance applied by two different pianists, both idiosyncratic in tone quality. However, as
precise timbral nuances can be understood, discussed and shared among pianists through
metaphors and images, there must be recognizable sound qualities that can show through
the personal tone qualities of different pianists, qualities that pianists are able to identify.
Moreover, such a specific sound quality of a given timbral nuance must be produced at
the piano, at least to some degree, by common gestures and strategies of expressive con-
trol. In some respect, the advice on gesture presented in Section 2.2 could serve not only
the development of a personal tone, but also the production of specific timbral nuances
— in complement to the abstract image of timbre formed in respect of the composition.
10. “Aucun autre instrument connu ne produirait cette sorte de grésillement harmonieux que le piano
peut rendre sans difficulté et que l’intention sylphidique du morceau rendait là concevable.”

In conclusion, the pianists and pedagogues whose written works and interviews were
reviewed in this chapter thus show the utmost respect toward empiricism and the cre-
ative talent of the performer. Yet, while the empirical body of knowledge stemming
from piano literature is essential in understanding the expression of piano timbre, it still
remains informed by individual perspectives and opinions, however worthy and enlight-
ened. Whether this empirical body of knowledge holds true and can be generalised is
yet to be verified.

To this end, we shall examine those ideas and thoughts from a different perspec-
tive: that of scientific, quantitative studies of timbre, piano performance and the expres-
sion and production of piano timbre.
The next part will detail the quantitative account of the perception and verbalisation
of timbre in general and piano timbre in particular. What can the scientific literature
and our own research reveal on the ways in which piano timbre is perceived? And how
does it relate to the guidelines suggested by the greatest teachers — a mental conception
based upon a global and local analysis and understanding of the composer’s intentions
and the musical structure, driving timbre production through the appropriate technical
and gestural approaches for faithfully realising the image abstractly conceived of? Cor-
respondingly, what can we learn from scientific research on the verbalisation of piano
timbre, the adjectival descriptors and semantics used for characterising the different tim-
bral nuances that correspond to specific mental images and musical intentions?
The third part of this thesis will then aim at confirming and complementing, through
quantitative analyses, the empirical perspective on the expression and production of piano
timbre. We will thus try to determine direct relations between gesture and timbre. This
research, however, in no way undermines the empirical perspective of pianists. By keeping
aware of their own sensations, they could build subjective methods based on their own
mental intuitions, thus adjusted to their own musical universe.

A more concrete and generalisable knowledge of the relations between gesture and
timbre may nonetheless benefit piano teaching, by offering a more direct, analytic and
objective way for teachers to communicate with students, one that would judiciously complement the empirically developed, metaphoric and example-based descriptions of piano
timbre.
Part II

Perception and verbalisation of piano timbre

CHAPTER 3

TIMBRE PERCEPTION AND VERBALISATION STUDIES

This chapter details the literature on timbre perception and verbalisation. Begin-
ning with inter-instrumental timbre perception and verbalisation, we will progressively
narrow our focus toward intra-instrumental timbral nuances and piano timbre.

3.1 Timbre perception studies

Compared to other perceptual parameters of sound — first and foremost loudness and pitch — timbre has been relatively understudied. Its acoustic complexity, multi-
dimensionality, and vague, non-consensual functional and constitutive definitions have
indeed posed obstacles to well-controlled, replicable experimental studies (Hajda et al.,
1997, p. 253).
While timbre in music has been theorised and written about in orchestration treatises
since Berlioz (1844), and timbre as an acoustic parameter had been first theoretically
explored by von Helmholtz (1885), then notably by Seashore (1938), experimental re-
search on timbre as a perceptual artefact started in earnest in the 1960s. 1
Yet such perceptual studies are crucial to understanding timbre. As Cadoz (1991,
p. 17) remarked, “if we must speak of timbre, it is as perceptual attribute, by which we
must understand that its value and final function are to be decided by the ear and the
musical intelligence, whichever the producing device.” 2 Research on timbre percep-
tion gained major ground in the 1970s, when Plomp (1970) introduced to this field a
new methodology and analysis procedure that was soon followed by Wedin and Goude
(1972) and Grey (1975, 1977). Proximity ratings were gathered in perception tests of
1. One earlier such study by Lichte (1941) stands as the exception.
2. “S’il faut parler du timbre, c’est en tant qu’attribut perceptuel. Ce qu’il faut entendre par là, c’est
que sa valeur, sa fonction finale, ce sont l’oreille et l’intelligence musicale qui en décident, quel que soit
le dispositif de production.”

similarity (or dissimilarity) between pairs of stimuli, and subsequently analysed with the
Multidimensional Scaling procedure. It allowed researchers to identify the most promi-
nent perceptual dimensions of timbre, for which acoustic correlates could be sought out.
Several studies have followed to this day, using the same general methodology.
Yet many timbre perception studies concerned only inter-instrumental timbre.
And most used digitally synthesized stimuli, aimed at imitating acoustic instruments
while guaranteeing total control of their acoustic features. Thus, in order to avoid any
effect of extraneous parameters on perceptual judgements, the stimuli were normalised,
independently, in pitch, loudness and duration, which also facilitated the extraction of
acoustic features. Additionally, several studies presented steady-state stimuli, excluding
the attack portion of a musical tone — thus far removed from actual musical sounds.
Finally, the stimuli were presented in most studies out of any musical context. As almost
all timbre perception studies presented at least one of these limitations in the musical
representativeness of the stimuli, and as intra-instrumental timbral nuances were rarely
considered, the results and conclusions of those studies are of little interest to the con-
cerns of this dissertation, namely the nuances of piano timbre that can be instilled in mu-
sical performance. Thus, the results, perceptual spaces and acoustic correlates brought
forth by such timbre perception studies will not be discussed here in detail. However,
the methodologies and procedures used in timbre perception studies that are relevant in
the context of this dissertation are presented below.
For more complete and exhaustive information on the topic, and for a thorough ac-
count of the whole body of work, one shall refer to Hajda et al. (1997) and Risset and
Wessel’s (1999) reviews of literature.

3.1.1 Methodologies in timbre perception studies

The various methodologies employed in studying timbre perception were exhaustively described in Hajda et al. (1997, pp. 257–264) and McAdams (1999).
Several procedures were employed in perception tests, with different tasks asked of the participant listeners. I hereby propose a classification of those procedures into three categories: matching, adjustment and evaluation.

3.1.1.1 Matching procedures

Early studies on the perception of instrumental timbre (Clark Jr et al., 1964; Sal-
danha and Corso, 1964; Wedin and Goude, 1972) relied on identification tasks, in which
participants were asked to attach a verbal label to a perceived audio stimulus. Most of-
ten in experiments on musical timbre, the task consisted in naming the instrument (or
its category) that most closely corresponded to each sound presented. Several types of
identification tasks can be employed. A first method is forced-choice identification, for
which verbal labels are provided for the participants to choose from. There can be as
many labels as stimuli to identify (e.g. in Clark Jr et al. (1964)), or the label list can
be distracting, including labels not corresponding to any of the presented stimuli (e.g.
in Saldanha and Corso (1964)). In another case, the task can consist of free identifica-
tion, where participants themselves come up with the label to attach to each stimulus
(e.g. Wedin and Goude, 1972). These identification procedures are simple and efficient,
yet require prior knowledge from participants about the instruments’ identities: non-
musicians can only be tested reliably on the most well-known instruments.
With the matching method this issue is avoided. Each stimulus is paired with one ref-
erence sound — which in effect serves as label without requiring any prior knowledge.
A close variant of matching is categorisation. In this method, participants are asked
to assign stimuli to one of several categories, each of which is defined by a reference
stimulus. This method was used in timbre perception studies by Pitt (1994), Kendall
et al. (1995), Guyot (1996) and Gaillard et al. (2005). All these test methods produce fre-
quency data, with binary, hit-or-miss results per stimulus. Such data can then be sorted
as confusion matrices (stimuli vs. labels/categories).

3.1.1.2 Adjustment procedures

In another variant of matching, the adjustment procedure, participants have to manipulate each stimulus so that it matches its target (reference stimulus) with regard to some
attribute. In Lichte (1941) for instance, that attribute was brightness. The adjustment
procedure is thus a user-controlled matching, where the participants apply variations
to the stimulus and have to identify whether the controlled stimulus and its target are
identical.
Discrimination, as used in Grey (1975), presents a sequence of stimuli in which one stimulus differs from all the others. The participants' task is to identify it.
This method is most common in identifying just noticeable differences, with some target
attribute progressively increased or decreased in a series of stimuli until the participant
detects the change. This procedure is used for instance in audiology tests, and essen-
tially works as an automated adjustment procedure, except that its aim is generally not
matching but differentiating.
These methods can provide very precise quantitative information on the effect of a
physical/acoustic attribute on timbre perception. Yet for adjustment and discrimination
to be effective, the stimuli have to be very well controlled. Sound parameters other than
timbre (and recording and ambient noises) must not interfere with participants’ timbre
assessments. The stimuli must also be controllable: the listener must be able to make
adjustments, and the experimenter must be able to build sequences. Adjustment and
discrimination procedures are thus essentially applicable only with synthetic stimuli.

3.1.1.3 Evaluation procedure

Whereas the previous testing methods asked participants to identify something
(a stimulus with a verbal label, with another stimulus or with a category, or a stimulus
differing in a sequence), the following method uses evaluations, or ratings, from partici-
pants.

In the proximity rating procedure, participants are asked to evaluate the degree of
either similarity or dissimilarity between pairs of stimuli. The degree of (dis)similarity
between a pair of stimuli is to be rated, most generally along a scale. The rating scale, or
Likert scale (Likert, 1932), traditionally includes five or seven discrete steps, although
it can vary between experiments: Grey and Gordon (1978) used 30 steps, McAdams
et al. (1995) nine. An even number of steps is used when a neutral evaluation is not
desired. Scales can also appear continuous to the participant, with a narrow underlying
discretisation (e.g. 100 steps) (Wedin and Goude, 1972). The scale ranges from most
dissimilar to most similar.
From a set of n stimuli, a total of n(n − 1)/2 pairs are to be presented, each pair to
be rated in (dis)similarity. This is the main drawback of this procedure, as the number
of presentations grows quadratically with the number of stimuli. With 25 stimuli, for
instance, a participant must make 300 comparisons; with twice as many (50) stimuli,
there would be 1225 comparisons. Keeping the cognitive load on participants manageable
thus severely limits the number of stimuli that can be used in one experiment.
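The quadratic growth can be checked with a few lines of code; a minimal sketch (the function name is ours):

```python
def n_pairs(n_stimuli: int) -> int:
    """Number of unordered pairs among n stimuli: n(n - 1)/2."""
    return n_stimuli * (n_stimuli - 1) // 2

print(n_pairs(25))  # 300 comparisons for 25 stimuli
print(n_pairs(50))  # 1225 comparisons for twice as many stimuli
```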
Nevertheless, proximity rating has been the methodological procedure of choice in
timbre perception studies, in the wake of Plomp (1970, 1976), Wedin and Goude (1972)
and Grey (1975, 1977) studies. With n stimuli, the proximity rating procedure natu-
rally yields triangular n × n proximity matrices, where each element (x, y) contains the
(dis)similarity rating between stimuli x and y. This data design is perfectly adapted to
the Multidimensional Scaling analysis method.

3.1.2 Analysis of matching data

As we saw, matching-type procedures produce frequency data that can be gathered in confusion matrices. They can be analysed with descriptive statistics, including mean
scores and standard deviations. If different factors (e.g. musician vs. non-musician par-
ticipants) or conditions (e.g. natural vs. synthesized stimuli, or by instrumental family)

are included in the experimental design, other procedures can be performed, such as
correlations, either linear (Pearson's r) or non-parametric (Spearman's ρ), or analysis of
variance, with Student’s t-test or ANOVA for identification data and repeated-measures
ANOVA for matching and discrimination data.
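As an illustration of these correlation analyses, a small sketch using SciPy on hypothetical identification scores (the data below are invented for the example):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical identification rates for six stimuli, each presented in
# two conditions (natural vs. synthesized versions of the same sources).
natural = np.array([0.95, 0.80, 0.75, 0.60, 0.90, 0.55])
synthesized = np.array([0.90, 0.70, 0.72, 0.50, 0.85, 0.40])

r, p_linear = pearsonr(natural, synthesized)    # Pearson's r (linear)
rho, p_rank = spearmanr(natural, synthesized)   # Spearman's rho (rank-based)
```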

3.1.3 Analysis of proximity data: Multidimensional Scaling

The Multidimensional Scaling (MDS) method aims at building a low-dimensionality space where the proximities between evaluated stimuli are reproduced as faithfully as
possible in the distances between points. It thus results in a spatial representation of
proximity matrices, with the stimuli represented as points and their relative distances
accounting for their pairwise proximity. Judgements of proximity between stimuli are
thus represented as perceptual distances. As we can consider an n × n proximity matrix
(with n the number of variables, or stimuli) as the representation of the distances between
variables in a space of n dimensions, the MDS procedure provides a mechanism for
dimensionality reduction.
The MDS method was developed for application in the field of psychometry. It was
first applied to perceptual data by Ekman (1954), and theorised accordingly by Torgerson
(1952, 1958). It has since been used in fields as diverse as biology, chemistry, sociology
and economics. Several techniques and algorithms have been successively developed,
building and improving on the previous ones. Four main types of MDS algorithms —
classical, metric, non-metric and generalised — are introduced below. Additional mod-
els improving the MDS procedure are then presented, the most relevant of which is the
weighted model — as it is included in the PROXSCAL algorithm that will be used in
Chapter 5. The extended, latent classes and non-linear models are also mentioned, as
they were used in important timbre perception studies.

3.1.3.1 Classical MDS

Torgerson’s seminal, Classical MDS algorithm (also called Principal Coordinates


Analysis) was based on eigen-decomposition, and is connected to Principal Component
Analysis (PCA) (Gower, 1966; Pearson, 1901). It proceeds by iteratively defining di-
mensions (i.e. a linear combination of input variables) that can account for the most
variance in the dataset. After each iteration, the variance explained by the last dimen-
sion obtained is taken out of the dataset, and the algorithm then seeks the dimension
that explains the most variance in what information is left in the dataset. Classical MDS
differs from PCA by the input data it processes: whereas PCA uses data descriptive of
the variables, Classical MDS takes in proximity information.
A specificity of Classical MDS is that it uses inner products to measure distances
between variables/objects. The variance not accounted for by a given dimension (i.e.
inner product residuals) is then evaluated with a loss function. In the Classical MDS
algorithm, the loss function, called strain, is an estimation of fit between inner products
in the solution and inner products in the input matrix.
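The eigen-decomposition at the core of Classical MDS can be sketched in a few lines, assuming a small symmetric distance matrix as input (NumPy only; the function name is ours):

```python
import numpy as np

def classical_mds(D, p=2):
    """Classical MDS (Principal Coordinates Analysis): double-centre the
    squared distances to recover the inner-product matrix B, then take the
    p leading eigenvectors, scaled by the square roots of their eigenvalues."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    B = -0.5 * J @ (D ** 2) @ J               # inner products
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:p]     # largest eigenvalues first
    scale = np.sqrt(np.maximum(eigvals[order], 0.0))
    return eigvecs[:, order] * scale          # n x p coordinates

# Three points on a line (at 0, 3 and 7): a 1-D solution recovers
# the distances exactly, up to translation and reflection.
D = np.array([[0., 3., 7.],
              [3., 0., 4.],
              [7., 4., 0.]])
X = classical_mds(D, p=1)
```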

3.1.3.2 Metric MDS

Classical MDS soon evolved into the Metric MDS method (Kruskal, 1964a, b; Shep-
ard, 1962a, b), also called least-squares MDS model, which measures Euclidean dis-
tances between objects and points. Let X be the n × p matrix of the objects' coordinates
in the computed MDS space, with n the number of objects (or dimension of the triangu-
lar input matrix) and p the dimensionality of the solution (p < n). Then the Euclidean
distance between objects i and j in the Metric MDS space is:

d_{i,j}(X) = \left( \sum_{s=1}^{p} (x_{is} - x_{js})^2 \right)^{1/2} \qquad (3.1)

The Metric MDS method also introduces a different family of loss functions: stress.

In its simple form 3, raw stress σ²(X) evaluates the difference between all d_{i,j}(X) (distances between all objects in the MDS solution) and δ_{i,j} (the input proximities, i.e. distances in the original n-dimension space), according to the formula (Kruskal, 1964a):

\sigma^2(X) = \sum_{i=2}^{n} \sum_{j=1}^{i-1} \left( \delta_{ij} - d_{ij}(X) \right)^2 \qquad (3.2)

Consequently, stress can assess the goodness of fit of a complete MDS space, i.e.
over all its dimensions at the same time, whereas the Classical MDS loss was calculated
one dimension after another. Metric MDS can thus identify an optimal p-dimension
space, where Classical MDS would successively identify the first p best dimensions,
one by one, without guaranteeing the combination of those p dimensions is optimal.
Additionally, as Euclidean distances are invariant by rotation, translation, and reflec-
tion, raw stress is not affected by such operations on the MDS solution. Thus, the MDS
space can be seamlessly translated (for instance so that coordinates reach zero-sum on
each dimension), rotated (so that axes fit the principal components and explain the most
variance), and reflected (i.e. axes directions can be changed).
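Raw stress (Equation 3.2) is straightforward to compute for a candidate configuration; a sketch, with the function name ours:

```python
import numpy as np

def raw_stress(delta, X):
    """Raw stress: sum over the lower triangle of squared differences between
    input proximities delta[i, j] and Euclidean distances d_ij(X) in the
    candidate configuration X (an n x p coordinate matrix)."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))     # n x n distance matrix
    i, j = np.tril_indices(len(X), k=-1)      # pairs with j < i
    return float(((delta[i, j] - d[i, j]) ** 2).sum())

# A configuration that reproduces the proximities exactly has zero stress;
# perturbing it makes the stress strictly positive.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
delta = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 2 ** 0.5],
                  [1.0, 2 ** 0.5, 0.0]])
```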

3.1.3.3 Non-metric MDS

Introduced in Kruskal (1964a, b), the Non-metric MDS method uses non-parametric
distance estimations. Dissimilarities between objects (the values in the input proximity
matrices) are replaced by their rank order — i.e., for an object i, all other objects are
ranked according to their proximity with i, and numbered accordingly. These rankings
are called disparities d̂, and can be obtained with the monotone regression procedure. In effect, the judgements of proximity between objects are no longer considered by their numerical values, but only as rankings. These proximity rankings are more suitable to perceptual evaluations, as they can be relative (to the pair of stimuli presented) and non-linear
3. Other stress functions differ from raw stress by some normalising parameter or quadratic transformation, but remain comparable in essence.

(in the use of Likert scales). This method can also be used with ordinal data instead of
proximities (data from a ranking task).
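The ordinal transformation at the heart of the non-metric method can be sketched as follows; this minimal version simply replaces the below-diagonal proximities by their ranks and ignores ties, which monotone regression would handle (the function name is ours):

```python
import numpy as np

def ordinal_disparities(delta):
    """Replace each below-diagonal proximity by its rank (1 = smallest),
    the ordinal transformation used by non-metric MDS; ties are ignored."""
    i, j = np.tril_indices(delta.shape[0], k=-1)
    ranks = delta[i, j].argsort().argsort() + 1
    d_hat = np.zeros_like(delta, dtype=float)
    d_hat[i, j] = ranks
    d_hat[j, i] = ranks                       # keep the matrix symmetric
    return d_hat

delta = np.array([[0.0, 0.2, 0.9],
                  [0.2, 0.0, 0.5],
                  [0.9, 0.5, 0.0]])
d_hat = ordinal_disparities(delta)
```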
The MDS transformation thus tries to respect the disparities in the distances between
points (i.e. d̂_{ij} < d̂_{ik} < d̂_{im} ⇒ d_{ij} < d_{ik} < d_{im}) but does not conserve the actual proximi-
ties/perceptual distances (thus the non-metric, i.e. non-Euclidean, moniker).
The raw stress loss function is thus now expressed as a function of d̂ (disparities matrix) and X (coordinates in the MDS space):

\sigma^2(\hat{d}, X) = \sum_{i=2}^{n} \sum_{j=1}^{i-1} \left( \hat{d}_{ij} - d_{ij}(X) \right)^2 \qquad (3.3)

Other, non-ordinal transformations are also possible. In ratio transformations, for all
objects i and j, d̂_{ij} = a · d_{ij}; and in interval transformations, d̂_{ij} = a · d_{ij} + b, where a and b are scalars for the algorithm to define for optimising d̂. As both transformations
conserve relative distances, they are considered metric.
The non-metric method is implemented in the MDSCAL (Kruskal, 1964a, b) algo-
rithm, and was used for instance by Wessel (1979) for building a perceptual timbre space
from which to define control parameters for sound synthesis. With MDSCAL, the user
must first define the number of dimensions desired for the MDS solution. Then MD-
SCAL runs iteratively until stress is sufficiently minimised. As stress minimisation is
too complex a problem to be solved in closed-form, optimisation algorithms have to run
iteratively. At each iteration, a solution (i.e. an MDS space) is proposed, and its stress function is calculated. The algorithm is then run again, this time guided by heuristics informed by the previous stress values obtained. The procedure is repeated until conver-
gence of the stress function is obtained — i.e. when another iteration of the algorithm
would not improve the stress value by more than a pre-fixed threshold, e.g. 0.01. Various
heuristic algorithms can be used for driving the iterations, from the (relatively) simple
gradient descent (Kruskal, 1964a), to the more complex and efficient SMACOF (Scaling
by MAjorizing a COmplicated Function) algorithm (De Leeuw, 1977).

3.1.3.4 Generalised MDS

The Generalised MDS method aims at building a non-Euclidean solution. It involves complex spatial transformations aimed at minimising distortion from dissimilarities to
the MDS solution.

3.1.3.5 Weighted model

In order to account for the differences between participants, and more precisely the
relative salience of each dimension to each participant, Kruskal (1964b) introduced a
weighting system by dimension and participant, where w_{n,s} describes the weight [0–1]
of dimension s for participant n. Consequently, dimensions become meaningful, elimi-
nating Euclidean invariances. Distances are then expressed:

d_{i,j,n}(X) = \left( \sum_{s=1}^{p} w_{n,s} \, (x_{is} - x_{js})^2 \right)^{1/2} \qquad (3.4)

This procedure was incorporated in the INDSCAL algorithm (Carroll and Chang,
1970), which was notably used by Grey (1977) and Grey and Gordon (1978).
More recent algorithms (ALSCAL, PROXSCAL) rely on a weighted model, but use
more advanced stress functions. The PROXSCAL algorithm will be set in action, and
thoroughly detailed, in Chapter 5.
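Equation 3.4's weighted distances can be written directly; a sketch in which a participant's weight vector stretches or collapses the shared space along its dimensions (the function name is ours):

```python
import numpy as np

def weighted_distances(X, w):
    """Weighted Euclidean distances (Equation 3.4): X is the n x p matrix of
    shared coordinates, w a participant's p weights in [0, 1], one per dimension."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=-1))

# A participant insensitive to the second dimension (weight 0) perceives
# two objects differing only on that dimension as identical.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
d = weighted_distances(X, np.array([1.0, 0.0]))
```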

Although the following MDS models were not directly used for the research pre-
sented in the rest of this thesis, they ought to be mentioned in light of their use in impor-
tant timbre perception studies whose results are detailed later in this chapter.

3.1.3.6 Extended model

Developed by Winsberg and Carroll (1989), the Extended two-way MDS model
(EXSCAL) compensates for possible specificities of each stimulus/object that would

not be accounted for in the general space. Scalars representing the specificities of each stimulus are thus introduced linearly. They can be understood as the square of the
perceptual salience of some specific feature in each object, or equivalently as its coordi-
nate along the dimension that would represent this specific feature. This extended model
was applied to timbre perception by Krumhansl (1989) and Faure (2000).

3.1.3.7 Latent classes model

Winsberg and De Soete’s (1993) CLASCAL model was developed to reduce the
number of parameters used in a MDS model. Sources (participants) are grouped in a
small number of classes, called ‘latent’ as they are derived from the data, not prede-
termined. Weights can then be applied to each class, instead of each participant. As
the number of latent classes and the repartition of sources are a priori unknown, an arbitrary number of classes is first fixed, and the repartition of sources in these classes is optimised with the Expectation-Maximisation algorithm before being tested a posteriori with Bayes' theorem. Afterwards, the optimal number of classes is sought with a
Monte Carlo procedure testing the hypothesis that T latent classes fit the data better than T + 1; if not, the process is iterated with one more latent class. Finally, the obtained latent-class design is tested according to statistical information criteria (AIC or BIC) for optimal representation of the proximity data.
The CLASCAL algorithm was most prominently used in McAdams et al. (1995).

3.1.3.8 Nonlinear models

A method such as Isomap (Tenenbaum et al., 2000) transforms proximity data non-linearly, by applying geodesic transformations to large distances. Large vs. local
distances are defined with a k-nearest-neighbour algorithm. Burgoyne and McAdams
(2008) used this method in combination with CLASCAL for a meta-analysis of timbre
perception.

3.1.4 Cluster analysis of proximity data

An alternate method for analysing proximity data is hierarchical cluster analysis (Johnson, 1967). From (dis)similarities collected between sources (which we can here
again consider as distances), a hierarchical clustering tree is built that establishes suc-
cessive levels of connection between sources according to their relative distance. The
approach can be bottom-up — starting from the sources as ‘leaves’, then iteratively con-
necting the closest sources (or group of sources) together until the ‘trunk’ is reached —
or top-down — starting from the trunk then iteratively splitting sources in two clusters.
Several metrics (Euclidean or not) can be used to evaluate distances from dissimilarities.
Hajda et al. (1997) built hierarchical clustering trees with the data from three studies
by Plomp (1970, 1976) and Wedin and Goude (1972), as well as their own data. These
trees, and the quantitative information on the categorical clustering of sources, comple-
mented the MDSCAL spaces (re)computed. The hierarchical clustering method was also
used in Kendall and Carterette (1993a, b) and Barthet (2008). This method was applied
in the study presented in Chapter 5.
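A bottom-up tree of this kind can be built with SciPy's hierarchical clustering routines; a sketch on a hypothetical 4 × 4 dissimilarity matrix (the data are invented for the example):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical dissimilarities for four stimuli: the first two and the
# last two are close to each other, and the two groups are far apart.
D = np.array([[0.00, 0.10, 0.90, 0.80],
              [0.10, 0.00, 0.85, 0.90],
              [0.90, 0.85, 0.00, 0.15],
              [0.80, 0.90, 0.15, 0.00]])

# Agglomerative (bottom-up) clustering: the closest 'leaves' are joined first.
Z = linkage(squareform(D), method='average')
labels = fcluster(Z, t=2, criterion='maxclust')   # cut the tree into 2 clusters
```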

3.1.5 Results

A brief summary of the results obtained in the timbre perception studies of the last 42
years that used MDS to build perceptual timbre spaces is presented in this section. First,
Plomp (1970, 1976) used MDSCAL to build 3D timbral spaces of synthetic, steady-
state sounds of wind and bowed-string instruments, and a 2D space for organ. Wedin
and Goude (1972) then tested recordings of natural wind and bowed-string instruments,
and likewise found a 3D MDS solution. They identified spectral correlates to these
dimensions, the first two of which they called ‘overtone richness’, or ‘sonority’, and
‘overtone poorness’, or ‘dullness’. Grey (1977) applied INDSCAL to data of timbral
proximity between several synthesized instrument sounds, and thus built a 3D space,
whose dimensions were correlated to the acoustic parameters of brightness, spectral flux

and transient density. Most of the following studies confirmed those findings, obtaining
3D models of timbre perception with the same acoustical correlates as Grey (1977) for
the first two dimensions, yet with many different acoustical correlates for the third di-
mension. Of particular interest, Krumhansl (1989) used EXSCAL to study the details of
each timbre in synthesized imitations or hybrids of instruments. McAdams et al. (1995)
used the CLASCAL model to study the perception of timbre in synthesized instrumental
sounds, and identified five latent classes and a 3D space where the dimensions correlated
with spectral centroid, spectral flux and log rise time. Burgoyne and McAdams’s (2008)
meta-study replicated Grey (1977) and Grey and Gordon’s (1978) studies with the orig-
inal stimuli, and re-analysed McAdams et al.’s (1995) data, with the Isomap extension
to CLASCAL. Their results confirmed the findings of acoustic parameters that enable
distinguishing between timbres: log rise time, spectral centroid, and a third one harder
to define, the best candidate being spectral shape.
Finally, in regard to intra-instrumental timbre, Barthet (2008) tested the perception
of different timbral nuances produced by a sound synthesis model of a clarinet, and
identified with MDSCAL a perceptual space of three dimensions. The first dimension
correlated with both log rise time and spectral centroid, the second with linear spectral
spread, and the third with the odd/even harmonics ratio and temporal centroid.

3.2 Timbre verbalisation studies

The studies and procedures presented up to this point explored timbre perception
through direct assessments of audio stimuli, either by identification of the instrument
name or family to which the stimulus pertains, or most often by direct comparative
judgements between sounds. The Multidimensional Scaling method was used to build
perceptual spaces from pairwise proximity ratings, with the sounds arranged according
to how similar their timbres were perceived to be. By exploring the organisational pat-
terns according to their acoustical characteristics, the dimensions of these perceptual

spaces could then be associated with acoustical correlates.


However, timbre perception can be evaluated in another way. Instead of having
sounds compared according to the single, direct criterion of similarity, a wide range of
subjective sound qualities can be assessed by rating sounds according to several percep-
tual attributes, as defined by verbal descriptors. With this paradigm, timbre perception
can be either evaluated by comparing sounds, or directly assessed in each sound along
verbal attribute rating scales. The first benefit of this method is to give meaning to dimen-
sions of perceptual spaces, as each dimension can then be characterised by one verbal
attribute or a combination thereof.
Yet the conundrum in this paradigm is whether timbre and its facets can be verbally
described in an accurate, consensual, and meaningful way. Indeed, as it was suggested
in the first part of this thesis and as the rest of this chapter may indicate, language lacks
direct vocabulary for talking about musical timbre (and arguably music at large), and
is mostly restricted to using images and metaphors that can ably describe emotions but lack the objectivity of common language. However, as toddlers learn
the meanings of words by empirical experience, so may the discourse on music acquire
a shared meaning among individuals, thus overcoming its subjective limitations. Even
von Helmholtz (1885), in his seminal, scientific exploration of timbre as an acoustic
parameter, studied the timbral nuances corresponding to qualitative descriptors such as
pleasant, harmonious, rich, poor or hollow in light of their acoustic correlates. The
relations between sound and words in speaking of timbre were thoroughly explored in
Faure (2000), who found that it is possible to speak consensually and understandably
about timbre. And as we will see by the end of this chapter, performers in particular
have managed to develop a common vocabulary, using a specialised lexicon to describe
subtle timbral nuances that can arise from their instrument.
This section now presents timbre verbalisation studies that explored different con-
texts (perception of audio stimuli or sound-free tasks), methods (verbal attributes scale
ratings or free verbalisations), analyses (statistical methods such as MDS and PCA, or

semantic methods), objects (timbre at different levels, from general sounds or a family
of instruments to different instruments of the same kind, i.e. timbre-individuality, or
performer-dependent timbral nuances) and principal aims (timbre perception and its ver-
bal correlates, or timbre verbalisation per se, i.e. the perception of its verbal attributes),
all the while narrowing our focus toward the verbal description of performer-controlled
piano timbral nuances.

3.2.1 Verbal attributes of timbre

Other studies on timbre perception have used alternative methodologies, with timbre
perception assessed via verbal attributes instead of direct proximity ratings of stimuli, or
stimuli matching.
With verbal attribute rating, participants are asked to evaluate a stimulus (or the dif-
ference between two stimuli presented in pairs) along certain scales, each characterising
a perceptual attribute that is described by verbal label(s).
Lichte (1941) was the first to use this method in a timbre perception study. He asked
participants to evaluate whether the second stimulus in a given pair was brighter or duller
(in the first two experiments) and thinner or fuller (in the third) than the first. Those terms
were empirically (and tentatively) chosen to best reflect the qualitative differences in
timbre between the spectrally designed stimuli. In the first set, stimuli varied in spectral
centroid; in the second, in spectral spread (mid-frequencies either hollowed or peaked);
and in the third, in even/odd partial amplitude ratio. Brightness, roughness and fullness
were conclusively identified as perceivable dimensions of tone quality.
The method was then refined as the semantic differential by Osgood et al. (1957).
The procedure uses bipolar Likert scales, each extreme of which is associated to one
adjective among a pair of antonyms.
Crucial to this methodology is the definition and choice of verbal attributes to use for
participants to assess timbre by. In this aim, von Bismarck (1974) listed in the psychoa-
coustical and psycholinguistic literature a total of 69 semantic differential scales formed
of verbal attributes that could describe the timbre of steady-state synthetic tones. From
those 69 scales, 35 were selected that provided the highest mean ratings in the literature.
Seven of these were then discarded for their apparent synonymy with other scales, and two were
added — soft–loud and low–high — to check loudness and pitch equalization. Thirty-
five digitally synthesized, static, steady-state stimuli were rated along those 30 scales.
The correlations of ratings between stimuli across the scales were then compiled, and
Principal Component Analysis yielded four main scales that accounted for 91% of total
variance: dull–sharp (accounting in itself for 44% of total variance), compact–scattered,
empty–full and colourless–colourful.
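As an aside on method, the dimensionality-reduction step used throughout these studies can be sketched in a few lines of numpy. This is an illustrative stand-in only: the ratings below are random numbers in place of von Bismarck's 35 × 30 data matrix, so the number of components retained on this synthetic data is arbitrary, and the 90% threshold simply echoes the reported four-factor, 91%-variance solution.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical ratings: 35 stimuli x 30 semantic-differential scales,
# standing in for von Bismarck's (1974) data.
ratings = rng.normal(size=(35, 30))

# Centre each scale, then diagonalise the covariance matrix.
X = ratings - ratings.mean(axis=0)
cov = X.T @ X / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]           # largest variance first
explained = eigvals[order] / eigvals.sum()  # proportion of total variance

# Retain as many components as needed to reach ~90% of total variance.
n_components = int(np.searchsorted(np.cumsum(explained), 0.90)) + 1
scores = X @ eigvecs[:, order[:n_components]]
print(n_components, scores.shape)
```

On real semantic-differential data, the loadings of each original scale on the retained components are what allow labels such as dull–sharp to be attached to the factors.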
Two decades later, Kendall and Carterette (1993b) studied the verbal attributes of 10
natural wind-instrument dyads, using the same procedure as von Bismarck (1974) and
employing eight of his semantic differential scales. However, analyses proved that the
bipolar verbal scales failed to differentiate the 10 stimuli in timbre. Attributes intended as antonyms did not actually constitute true opposites along a single perceptual aspect, but rather each characterised a different quality of the sound. They thus introduced a new method, verbal attribute magnitude estimation (VAME),
in which each rating scale is associated to only one verbal attribute, the extremes of the
scale corresponding respectively to the attribute and its negation. The procedure then
consists of evaluating the salience of a verbal attribute in a stimulus. Ratings of the stimuli along the eight VAME scales distinguished them better, and were submitted to Principal
Component and Hierarchical Clustering analyses, revealing a main factor illustrated by
the terms heavy, hard and loud, and a second factor of sharpness and complexity — thus,
very different from von Bismarck (1974). Yet as they suspected that von Bismarck’s
(1974) adjectives lacked ecological validity, Kendall and Carterette (1993a) drew new
verbal attributes from Piston’s 1955 Orchestration treatise. They collected all the adjec-
tives found in the treatise, reduced the list to 61, then to 21 after check-list procedure —
i.e. direct evaluation of the verbal attributes themselves, in the absence of audio stimuli. The same 10 stimuli were rated over those 21 VAME scales, and PCA revealed
four major factors: power (for which soft, light, mellow, weak hold the most weight),
stridency (represented by edgy, brittle, nasal), plangency (crisp, brilliant) and reediness
(reedy, fused). The attributes’ cross-correlations were also mapped into a 3D MDS se-
mantic space. Superposition of the first two dimensions of this space with Kendall and
Carterette’s (1991) perceptual space provided a good correlation fit, which let the authors
assign semantic meaning to the space dimensions — nasality vs. richness, and reediness
vs. brilliance, respectively.
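The correlation fit between two such spatial configurations can be sketched as a per-dimension Pearson correlation over the stimuli's coordinates. The coordinates below are invented (the second space is the first plus noise), standing in for the semantic and perceptual spaces being superposed.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical coordinates of the same 10 stimuli in two 2-D spaces,
# e.g. a perceptual MDS space and a semantic MDS space.
perceptual = rng.normal(size=(10, 2))
semantic = 0.8 * perceptual + 0.2 * rng.normal(size=(10, 2))

def pearson(a, b):
    # Pearson correlation between two coordinate vectors.
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# One correlation coefficient per matched dimension.
fits = [pearson(perceptual[:, d], semantic[:, d]) for d in range(2)]
print(fits)
```

High per-dimension correlations are what license attaching semantic labels (here, nasality vs. richness and reediness vs. brilliance) to the perceptual dimensions.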
Other research on the verbal attributes of timbre has been motivated by its possible
applications to sound synthesis. Indeed, verbal attributes can be efficiently used to de-
scribe control parameters that the user can access in graphic user interfaces and adjust
along a scale associated with a verbal attribute. 4 This application was for instance the
motivation for Disley et al.’s (2006) study of the timbral description of a diverse set
of stimuli, according to seven verbal attributes selected from recent scientific literature
and eight others suggested by pilot participants. Twelve samples of musical instruments
were rated over the 15 VAME scales corresponding to verbal descriptors bright, clear,
warm, thin, harsh, dull, nasal, metallic, wooden, rich, gentle, ringing, pure, percussive
and evolving. Results showed that 10 terms were worth conserving, as participants were
in general agreement over them, and as they could be efficiently used for stimuli discrim-
ination. Those 10 descriptors could be arranged in two dimensions. The first one ranges
from bright, thin and harsh to dull, warm and gentle. The second dimension ranges from
percussive to nasal.
With the same aim in mind, Gounaropoulos and Johnson (2006) used machine learn-
ing to associate timbre descriptors with elements of the parameter space of their sound
synthesis model. To this end, reference instrument sample sounds were submitted to
listening tests, in which the goodness of fit of several verbal descriptors to these sounds
was evaluated. The nine best descriptors were selected, and used as outputs to a neural
network. Reference sound files with their known corresponding ratings along those nine
descriptors were used to train the neural network — sounds were input until the network
output matched the predefined ratings. Sounds produced with their synthesis algorithm
could then be fed to the network, and their ratings according to the nine verbal descriptors would be automatically output. Sound synthesis control parameters could then be
mapped to the verbal descriptors accordingly, and the algorithm could be controlled
via a user interface with sliders associated directly with each verbal descriptor.
4. It is for instance arguably more intuitive and accessible for users to control the brightness of a sound synthesis device than the spectral centroid.
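The sounds-to-ratings training loop can be illustrated at its very simplest. The sketch below is a deliberately minimal, numpy-only stand-in — a single logistic layer trained by gradient descent on synthetic features and ratings — not Gounaropoulos and Johnson's actual network architecture or feature set.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in data: 40 reference sounds described by 12 acoustic
# features, each labelled with ratings on 9 verbal descriptors in (0, 1).
X = rng.normal(size=(40, 12))
W_true = rng.normal(size=(12, 9))
y = 1 / (1 + np.exp(-(X @ W_true)))  # synthetic "listener" ratings

# One logistic output layer trained by gradient descent: a minimal
# stand-in for the neural network mapping described above.
W = np.zeros((12, 9))
for _ in range(2000):
    pred = 1 / (1 + np.exp(-(X @ W)))
    W -= 0.5 * (X.T @ (pred - y)) / len(X)

# After training, new feature vectors can be fed in and descriptor
# ratings read off the outputs.
pred = 1 / (1 + np.exp(-(X @ W)))
mse = float(np.mean((pred - y) ** 2))
print(round(mse, 4))
```

Once trained, such a model can be inverted in spirit: synthesis parameters are adjusted until the predicted descriptor ratings match the slider positions set by the user.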

3.2.2 Free verbalisations of timbre and semantic studies

Free verbalisations of timbre have also been collected from participants. These free
verbalisations serve the purpose of identifying the attributes to associate with rating
scales, without being limited to a pre-selected list whose relevance and exhaustiveness are not ensured. Additionally, collections of free verbalisations can be analysed with semantic methods, revealing much about timbre verbalisation and offering a more direct
access to mental representations (Samoylenko et al., 1996).

3.2.2.1 Semantic analyses

Nosulenko et al. (1994) applied such methods to studying timbre comparisons and
their verbal descriptions. They combined the proximity-based perceptual approach with
a semantic study of free verbalisations. A first task consisted in dissimilarity ratings
between pairs of synthetic sounds (imitations and hybrids of acoustic instruments). The
second was providing free verbal descriptions of the similarities or differences in timbre
between each pair of stimuli. To analyse this corpus of verbal descriptions, a method of
verbal protocol analysis was developed that focused on the logical structure, the object
and the references (i.e. semantic content) of these descriptions. This protocol revealed
the correspondence of patterns in verbal descriptions with the level of similarity rated by
individuals in the first task. Verbal portraits of timbres were drawn by listing all verbal
units that directly referred to a particular instrument or class of instruments. Comparison
of those portraits between musicians and non-musicians revealed notable differences in the classification of hybrid instrument sounds and in the overall precision.
Samoylenko et al. (1996) refined the same semantic method, applied to a subset of the
same corpus (non-musicians), by adding a hierarchical analysis and identifying several
levels of categorisation of verbal units. Their results revealed significant differences
in verbal description between the groups making small and large dissimilarity ratings
respectively. The occurrences of verbal units differed significantly between the groups
in several categories. The small-rating group tended to use more verbal units describing
similarity in categories that described spectral features. The large-rating group tended
to use more verbal units describing differences in categories that described temporal
features or were of concrete or classificational nature.

3.2.2.2 From sounds to words

Faure (2000) performed three experiments with the aim of identifying the verbal correlates of the perceptual dimensions of timbre, i.e. matching perceptual timbre spaces
to semantic profiles.
The first consisted in the construction of a perceptual space of timbre by proximity rating: 20 participants judged the pairwise dissimilarity of 12 synthesized sounds.
Those dissimilarity ratings were analysed with Winsberg and Carroll’s (1989) EXS-
CAL MDS algorithm, with the BIC index as optimisation function. It resulted in a
4D space, whose first two dimensions strongly correlated with the first two dimensions
in McAdams et al.’s (1995) space. Additionally, Faure gathered free verbalisations, pro-
vided by the participants in describing the stimuli and the qualitative nature of their
pairwise (dis)similarities.
The verbal units that were used to compare the timbre of stimuli were extracted from
the whole free verbalisation corpus. Among the 596 verbal units identified, the 28 with
the most occurrences were selected. Semantic dimensions were formed, according to
the way those 28 verbal descriptions aligned and ordered the stimuli (through the comparisons of every pair of stimuli). Those semantic dimensions were then compared to
the perceptual MDS dimensions. Statistical analyses showed that the first perceptual
dimension, acoustically described by log rise time (a temporal feature), was most cor-
related with verbal descriptors of temporal features (either in the attack or steady-state).
The second dimension, acoustically described by spectral centroid, was correlated with
notions of treble, lack of roundness, agreeability and warmth. The third dimension,
acoustically described by spectral deviation, was only correlated with notions of (lack
of) richness and length. And the only descriptor uniquely correlated to the fourth dimen-
sion was (lack of) softness. Multidimensional correlations were identified between some
verbal descriptors to the perceptual space, and thus to its acoustic correlates. These ver-
bal descriptors could thus be portrayed summarily by the combination of a few acoustic
features.
For the second experiment, 23 verbal attributes were selected from the previous free
verbalisations. They were each assigned to a VAME scale, along which 32 participants
had to rate the same 12 stimuli. Correlations were then sought out between the mean
semantic rating for each scale and the dimensions of the perceptual MDS space. Yet,
the only significant correlations were of the first dimension with the verbal attributes soufflé (breathy; negative correlation), pincé (plucked), attaque rapide (fast attack) and résonnant (resonant), and of the second dimension with grave (low) and rond (round) (both negative).
Then from the semantic profiles associated with each stimulus and produced with
the 23 VAME ratings, semantic distances between stimuli were calculated. Five different
types of distance calculations were used, two of them Euclidean (based directly upon the
rating values for each verbal attribute) and three based on Tversky’s contrast model (by
correlation of verbal attributes ratings between each combination of two stimuli). Those
semantic distances were then used for Multidimensional Scaling (EXSCAL algorithm,
including specificities per stimulus). Of the five types of semantic distances, the two Eu-
clidean distances could not be interpreted in a MDS space of low enough dimensionality
that satisfied the BIC optimisation index — i.e., low (< 7) dimensionality MDS spaces
did not contain enough information from the input dataset. With the other three Tver-
sky distances, MDS spaces were found, of either two dimensions without specificities
or one dimension with specificities. Correlations were then sought out for the position
of each stimulus between the perceptual space and each of the semantic spaces. Those
correlations were low overall, as only the correlation between the fourth perceptual di-
mension and the second semantic dimension (for the only type of distance for which it
existed) was significant at the 0.05 level. Likewise, the dimensions of those semantic
spaces were poorly correlated with acoustic parameters and even with mean ratings over
the 23 VAME scales — i.e. semantic dimensions cannot be directly described by the
verbal attributes. From those results, semantic and perceptual timbre spaces thus could
not be considered isomorphic.
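The semantic-distance step can be sketched in code. Faure's actual measures comprised two Euclidean forms and three Tversky contrast-model variants; the sketch below, on made-up semantic profiles, contrasts a plain Euclidean distance with a simple correlation-based distance standing in for the latter family (an assumption, not Faure's exact formulation).

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical semantic profiles: 12 stimuli rated on 23 VAME scales.
profiles = rng.uniform(0, 1, size=(12, 23))

def euclidean(a, b):
    # Distance computed directly from the rating values.
    return float(np.linalg.norm(a - b))

def correlation_distance(a, b):
    # Two stimuli whose attribute ratings covary closely are near.
    a = a - a.mean()
    b = b - b.mean()
    r = a @ b / np.sqrt((a @ a) * (b @ b))
    return float(1 - r)

n = len(profiles)
d_euc = np.zeros((n, n))
d_cor = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        d_euc[i, j] = euclidean(profiles[i], profiles[j])
        d_cor[i, j] = correlation_distance(profiles[i], profiles[j])
print(d_euc.shape, d_cor.shape)
```

Either matrix could then be submitted to an MDS algorithm, as in the experiment described above.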
Finally, following and extending the methodology previously employed in Nosu-
lenko et al. (1994) and Samoylenko et al. (1996), verbal portraits were drawn of each
sound. Several portraits were drawn per sound, from the free verbalisations collected
in the first study. They each used different categories of descriptors: descriptions of
the sound source with instrument names (or family of instruments), descriptions of the
excitation source and material, descriptions of actions, and quality assessments. Then,
another portrait was drawn of each sound from the three descriptors among the 23 verbal
attributes used in the second experiment that drew the most ‘extreme’ mean ratings for
the sound.
Those verbal portraits of each sound were then used in the third experiment. Par-
ticipants were asked to associate verbal portraits with the stimuli they listened to. The
results showed that ‘quality’ portraits were associated with more sounds, yet less agreed
upon, than ‘sound source’ portraits. There was also more agreement in the association
of portraits with sounds among musicians than among non-musicians. And most verbal portraits were significantly associated with the sound from whose free verbal description in the first experiment the verbal attributes used in drawing the portrait were taken.
Thus, despite the lack of a vocabulary specific to describing sound and its timbre or
quality, it still seems possible, from Faure’s (2000) findings, to communicate verbally
about sound perception in an understandable and rather consensual way.

3.2.2.3 A top-down, sound-context-free approach to timbre verbalisation

The verbal description of musical sound timbre in the realm of musicianship was
further explored by Stěpánek and Moravec (2005). An interesting approach was used
for this purpose, which, in opposition to the bottom-up approach employed in the stud-
ies detailed in Section 3.2.1, can be defined as top-down. 5 Contrary to von Bismarck
(1974) and others, Stěpánek and Moravec (2005) performed sound-context free exper-
iments: instead of asking participants to verbally describe timbre in specific exemplar
sounds and then generalise the findings, verbal descriptions were provided here from the
general experience of participant musicians, unattached to sound stimuli — thus, top-down. In the first sound-context-free experiment, a questionnaire survey was used to collect, first, free verbalisations describing musical sound timbre, then lists of synonym and antonym groups. A total of 120 questionnaires were filled in by musicians (bowed-string, wind or keyboard instrumentalists) and provided 1964 different verbal
attributes, of which 30 were mentioned with over 25% frequency. Overall, adjectives
sharp, gloomy, soft, clear and velvety were cited by more than 50% of participants. 6
Significant differences were observed between the separate groups of instrumentalists,
as bowed-instrument players used sweet and hearty more frequently (and gloomy less),
wind-instrument players mentioned round and narrow more often, and keyboard players
cited ringing more frequently.
5. Perceptually, Plomp (1976) differentiated between the bottom-up, auditory process and the top-down, cognitive process.
6. The study was conducted in Czech. I mention the English translations provided by the authors.
The second experiment then used the proximity rating procedure detailed in Section 3.1, yet sound-context free. The pairwise dissimilarity ratings were thus assessed between the 25 verbal attributes most frequently used in the questionnaires, instead of sound stimuli. The resulting dissimilarity matrices were analysed with the CLASCAL
algorithm. The best solution was obtained with a three-dimensional common perceptual space of verbal attributes. The three dimensions were thus described by the verbal
attributes: gloomy/dark vs. clear/bright, harsh/rough vs. delicate, and full/wide vs. nar-
row. Additionally, the most frequently cited attribute, sharp, saliently appears in the
dimensions-1-vs.-2 plane as both very clear and harsh (in other words, sharp = clear +
harsh). Separate analyses of each professional/instrument group among the 34 partici-
pants revealed that at least two common dimensions were shared by all groups.
Finally, in a music performance context, 20 musicians rated the suitability of the 60
verbal attributes most cited in questionnaires. The results corroborated the dimensions
and their attributes identified in the MDS space.
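The scaling of attribute dissimilarities into a low-dimensional space can be illustrated with classical (Torgerson) MDS, a simpler relative of the CLASCAL algorithm actually used in the study (CLASCAL additionally models latent listener classes and specificities). The dissimilarities below are random placeholders for the averaged attribute-pair ratings.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical symmetric dissimilarity matrix between 25 verbal attributes.
n = 25
upper = rng.uniform(0.2, 1.0, size=(n, n))
D = np.triu(upper, 1)
D = D + D.T  # symmetric, zero diagonal

# Classical (Torgerson) MDS: double-centre the squared dissimilarities,
# then keep the leading eigenvectors as coordinates.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1][:3]  # three dimensions, as reported
coords = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))
print(coords.shape)
```

Each row of `coords` places one verbal attribute in the common space, whose axes can then be interpreted from the attributes at their extremes.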

3.2.3 Verbalisation of intra-instrumental timbre

These methods for studying the verbal attributes and description of timbre in general
(as a characteristic of diverse musical sounds, at both the identity and quality levels)
were also applied in verbalisation studies of timbral nuances in a single instrument.

3.2.3.1 Saxophone

Nykänen and Johansson (2003) studied the verbal description and perception of the
timbre of the saxophone. Timbre was studied at the level of individuality — i.e. the
influence on timbre of different saxophones and saxophonists.
In interviews, ten saxophone players provided verbal descriptors (in Swedish) of sax-
ophone timbre, in different contexts: free verbalisations, picking from a list, describing
sounds, and commenting on their own playing on different saxophones. The nine most
cited descriptors were selected for a listening test. Each stimulus consisted in a target
note, either isolated or in a melody, and played by one of two performers on one of
two saxophones. Participants were asked to rate on VAME scales the appropriateness
of each verbal descriptor in describing the timbre of each stimulus. PCA was applied
to several variables — perceptual quality ratings from the listening test, acoustical prop-
erties and additional psychoacoustic indices of the stimuli — so as to highlight their
relationships. Six significant dimensions were thus obtained, the first of which was well
described by perceptual qualities sharp and its opposite soft, and the second by coreful
[sic]. Sharpness, roughness and tonality proved to be the most salient psychoacoustical characteristics of saxophone timbre. The first dimension was correlated with
sharpness, while the second corresponded to roughness vs. tonality — but not to rough-
ness as a verbal descriptor, which must have then held a different meaning to listeners in
a musical context. Dimensions 2 and 6 best differentiated between the two saxophones,
and dimensions 2 and 4 between the two saxophone players.

3.2.3.2 Pipe organ

In the same manner, Disley and Howard (2003, 2004) studied the timbral seman-
tics of different pipe organs in their polyphonic, natural context. Fifty participants were
asked to describe the sound of recordings from four different pipe organs (set at their
Principal stops). From their free verbalisations, 99 adjectives were extracted. Seven
were selected from this corpus, with regard to their common occurrence and unambiguity:
balanced, bright, clear, flutey, full, thin and warm. Seven corresponding VAME rating
scales were then used in a listening test. Each participant was presented with four stimuli
out of six recordings of five different organs — taken from a typical listener position in
the church or concert hall, thus an ecologically valid environment. The results revealed
a common understanding among participants of the timbre descriptors flutey, warm and
thin, and to some degree of bright and clear; neither full nor balanced showed any
common understanding. Spectral correlates were then sought out for the five most con-
sensual verbal descriptors, then verified by synthesis, with listening tests conducted over
four synthesized, acoustically controlled sounds. Flutey and warm thus corresponded to
a low spectral centroid, bright and clear to a high spectral centroid, and thin to weak low
harmonics.
Stěpánek (2006) likewise studied pipe organ, according to both its perception and
its verbalisation. Following a bottom-up approach, 60 pipe organ sounds were submit-
ted to listening tests. The sounds were recorded in situ from 12 different organs set
on Principal 8’ stops. Five tones were recorded for each, all C notes from C2 to C6.
The perceptual influence of transients in the stimuli was minimised by applying uniform
fade-ins and fade-outs. Twelve organists rated the pairwise dissimilarity between those
stimuli, and also provided spontaneous verbal descriptions of their timbre dissimilar-
ity judgements. 7 With the CLASCAL algorithm, a three-dimensional perceptual MDS
space was identified as optimal. The accompanying verbal attributes collected were used
for external interpretation of the perceptual dimensions. Their contextual frequency of
occurrence was correlated to the stimuli coordinates in the perceptual space, and the
verbal attributes were accordingly embedded in the perceptual space — their immersion
represented as vectors of different relative angles (i.e. directions). The verbal attributes’
embedding would not fill up the whole 3D perceptual space, yet the best-embedded ver-
bal attributes — round, soft, noisy, sharp and narrow — could effectively be projected
on a plane. Those attributes differ from those identified in Disley and Howard (2004),
yet it must be noted that terms specific to the pipe organ were dismissed here (explicitly so for flutey, for instance), as the correspondence of the verbal description of timbre across different instruments (and in general) was sought above all.
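The vector-immersion idea — fitting a direction for each attribute by regressing its frequency of occurrence on the stimuli's coordinates in the perceptual space — can be sketched as follows. The data are invented, constructed so that the attribute's use happens to track the first dimension.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical data: 60 stimuli located in a 3-D perceptual MDS space, and
# the frequency with which one verbal attribute was used for each stimulus.
coords = rng.normal(size=(60, 3))
freq = coords @ np.array([0.9, 0.1, 0.0]) + 0.1 * rng.normal(size=60)

# Multiple regression of occurrence frequency on the space coordinates;
# the fitted coefficient vector, normalised, gives the attribute's
# direction of immersion in the perceptual space.
X = np.column_stack([coords, np.ones(len(coords))])  # add intercept
beta, *_ = np.linalg.lstsq(X, freq, rcond=None)
direction = beta[:3] / np.linalg.norm(beta[:3])
print(direction)
```

The quality of each regression fit indicates how well a given attribute can be "embedded" in the space, as with round, soft, noisy, sharp and narrow above.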

3.2.3.3 Violin

Stěpánek (2006), however, was most focused on violin timbre. In a preliminary study, Stěpánek (2004) used the same procedure described for pipe organ with the aim of
matching perceptual and verbal spaces of violin timbre. Dissimilarity ratings were collected in listening tests involving 20 violinists, over tones from 17 different violins and over five different pitches.
7. Those verbal descriptions were provided in German in the experiment. I mention the English translations proposed by the authors.
These ratings were analysed with the CLASCAL algorithm,
separately for each pitch group of stimuli. In the case of pitch D6 (on which the study
focused thereafter), the optimal MDS perceptual space was found with two dimensions.
Spontaneous verbal descriptions were also gathered for 11 of the 17 tones. 8 The verbal
attributes thus cited with the highest frequency of occurrence with regard to each pitch
group were selected. For pitch D6, it resulted in a list of 65 verbal attributes.
Qualifications of the perceptual dimensions based on these verbal attributes were
then sought out. Correlations were calculated between coordinates of the sounds along
each dimension and the frequency of occurrence of each verbal attribute in the context
of qualification of these sounds. Each verbal attribute was thus assigned a correlation
coefficient along each dimension. Each dimension was then interpreted according to the
verbal attributes which showed the most significant correlation with the dimension. The
first perceptual dimension for pitch D6 was thus best qualified as rustle, sharp, sandy,
not soft and not round, while the second dimension was best qualified as dark and not
high.
Verbal attributes’ immersion in the perceptual space was subsequently represented
by vectors, whose direction was defined by the optimal multiple-regression fit between
frequency of verbal attribute occurrence and perceptual space coordinates. Immersion
was most successful for soft, sharp and round.
Furthermore, the verbal attribute projection method was used to define regions in the
perceptual space best characterised by certain verbal descriptors. In each region, all the
sounds thus shared common properties regarding the frequency of occurrence of certain
descriptors. The projection of one verbal attribute was defined as the centre of gravity
of all sounds, each weighted by the frequency of occurrence of the verbal attribute for
this sound. Hierarchical clustering complemented these projections. Around the most
stable such projections, several regions in the perceptual plane of pitch D6 could thus
hold local meaning around velvety, wide, strident and rustle.
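The centre-of-gravity projection just described reduces to a weighted mean. A toy numpy example, with invented coordinates and occurrence counts:

```python
import numpy as np

# Hypothetical data: 11 sounds in a 2-D perceptual plane, and the number of
# times one verbal attribute (e.g. 'velvety') was used for each sound.
coords = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.5, -1.0],
                   [1.5, 2.0], [-1.0, 0.0], [0.0, 0.0], [2.0, 2.0],
                   [-0.5, 1.5], [1.0, -0.5], [0.5, 0.5]])
freq = np.array([5, 3, 0, 0, 1, 0, 2, 0, 4, 0, 1])

# Projection of the attribute = centre of gravity of all sounds, each
# weighted by the attribute's frequency of occurrence for that sound.
projection = (freq[:, None] * coords).sum(axis=0) / freq.sum()
print(projection)  # -> [0.1875, 1.03125]
```

Repeating this for each attribute, and clustering the resulting projections, yields the locally meaningful regions of the perceptual plane described above.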
8. Verbal descriptions were given in Czech, with English translations provided by the authors.
Stěpánek (2006) then furthered the exploration of verbal descriptions for violin timbre. Eleven violin sounds (for each of five tones) from the 17 × 5 stimuli previously
described were submitted for evaluation to 11 other expert participants. The evalua-
tion method used is called Verbal Attribute Ranking and Rating (VARR), essentially a
ranking extension of the VAME procedure, consisting of two parts. The verbal attributes
most frequently cited and most representative (over the five different pitches explored)
— sharp, dark, clear and narrow — were selected from the previous spontaneous ver-
bal descriptions of violin sounds. Participants then ranked the stimuli according to the
salience of the four verbal attributes as well as of perceived sound quality. Participants
were also asked to evaluate the magnitude of the four verbal attributes and sound quality
criterion in each stimulus. Factor analysis was applied to the ratings for each pitch class,
and an optimal two-dimensional factor space was obtained in all five cases. Sharp and
dark were opposite on this plane for all pitch classes, with clear in the close vicinity of
sharp, and with narrow and sound quality mostly opposite and orthogonal to dark–sharp
(although much variation is observed depending on pitch).
These studies thus constitute a methodologically comprehensive bottom-up explo-
ration of the perception and verbal description of musical instruments’ timbre.

Still on the topic of the verbal descriptors of violin timbre, Fritz et al. (2008) employed a top-down, sound-context-free approach. In a perspective deliberately oriented towards timbre as an individual characteristic of a given violin, descriptors were collected in free verbalisations by 19 English-speaking violinists. Other descriptors were
taken from violin magazine articles concerning the characterisation of famous violins.
In the end, the 61 descriptors quoted more than three times were selected. Fourteen
violinists were then asked to arrange those descriptors according to their relative sim-
ilarity (the more similar the closer) in the two-dimensional virtual plane of an Excel
spreadsheet. The maps thus obtained were converted into matrices of distances between
descriptors. Multidimensional Scaling of these distance matrices with the ALSCAL algorithm yielded an optimal 3D spatial configuration. Descriptors of ‘good’ and ‘bad’ sound qualities (respectively) were diagonally set apart in the dimensions-1-vs.-2 plane.
The first dimension was interpreted as related to the balance, noisiness and high fre-
quency content of the instrument. The second dimension was interpreted as related to
brightness and responsiveness. The third dimension was tentatively related to depth of
sound.
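The conversion from participants' spatial arrangements to the distance matrices fed to MDS can be sketched as below, with random coordinates standing in for the fourteen spreadsheet maps (the ALSCAL algorithm itself is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical data: 14 participants each place 61 descriptors in a 2-D
# plane, as in the spreadsheet arrangement task described above.
maps = rng.uniform(0, 10, size=(14, 61, 2))

def distance_matrix(points):
    # Pairwise Euclidean distances between all placed descriptors.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# One 61 x 61 matrix per participant, averaged before scaling.
mean_D = np.mean([distance_matrix(m) for m in maps], axis=0)
print(mean_D.shape)
```

The averaged matrix `mean_D` is what a scaling algorithm would then reduce to the 3D configuration reported.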

3.2.3.4 Guitar

Lastly, Traube (2004) studied the verbalisation of the timbre of the classical guitar, at
the level of timbral nuances performers can perceive and control. Twenty-two guitarists
were each asked to provide, in semi-free form, 9 a list of 10 adjectives best describing the timbral nuances that can be produced on the guitar. 10 Synonyms and antonyms were also to
be provided in complement. 108 adjectives were thus collected, among which metal-
lic, round and bright, then thin, warm, velvety, nasal and dry, were the most frequently
mentioned. The timbre descriptors were classified into referral categories of sound-
ing object, luminosity, shape, matter, taste and character. They were also organised in
clusters, according to their synonymic relationships. The resulting mapping — with an
adjective as the centre of each cluster, its synonyms around it and clusters overlapping
from multiple synonymic relationships — displayed a linear organisation, whose axis
would correspond to the plucking point along the strings: from hollow and dull when plucking near the middle of the strings, to thin and nasal when plucking near the bridge. Some verbal descriptors of the timbre of the classical guitar were also proven to refer to phonetic
gestures: indeed, the different resonances and comb-filter shapes of spectra that guitar
sounds can present were appropriately described by analogy with spectral formants of
vowels and phonetic resonators (mouth and its openness, lips, nose, larynx).
9. A non-restrictive list of 50 descriptors was proposed to the participants for reference.
10. Those adjectives were given in French, with their English translations provided by the author and
validated by a professional, bilingual guitarist.

3.2.4 Verbalisation of piano timbre

Now, on the same matter of intra-instrumental timbre verbalisation, yet of more di-
rect relevance to the concerns of this dissertation, some studies have focused on the
verbal description of piano timbre.
Cheminée et al. (2005) and Cheminée (2006) studied the pianists’ timbral lexicon in
free verbalisations about piano sound. Eighteen pianists were asked to verbally describe
(in French) the sound of the nine different pianos they were testing. This collection
of free verbalisations was analysed with the lexical corpus method. Verbal units, most
of them adjectival descriptors, were identified, and their context of use accounted for.
Special attention was given to reformulations (i.e. markers of synonymic relationships
between descriptors), as well as collocations (words used concomitantly) and semantic
equivalences, and to the frequencies of occurrence.
The study first aimed at assessing the semantic nature of the lexicon employed by pi-
anists in describing piano sound. The first observation was that among the 74 adjectives
cited more than once, most occurrences were concentrated in only a few terms: 18 ad-
jectives represented 70% of the occurrences, among which 10 accounted for almost 50%
of the occurrences. As for the enunciative classification of descriptors, the vocabulary
employed to describe piano sound was posited as inherently subjective, in that it describes immaterial sound images, and a priori constrained to either referring to other sensory
modalities, resorting to general polysemous words, or using metaphors. Cheminée also
acknowledged that the experimental design itself, i.e. asking for evaluations, favoured
the occurrence in the corpus of axiological terms. Among the most used adjectives were
the indisputably axiological terms beautiful, good, agreeable, interesting and power-
ful. 11 Other frequently used adjectives (clear, homogeneous, bright, round 12 ) taken out
of context would be classified as objective and non-axiologically evaluative.
11. Our own translation of the adjectives presented in French in the study: beau, bon, agréable, intéressant and puissant.
12. clair, homogène, brillant, rond.
Yet in their context of use by interviewed pianists, they were employed figuratively, with axiological
connotations. Deeper investigation of clear — the adjective most frequently used among
these figuratively used terms — in light of its context of use and reformulations revealed
that it was employed with three different meanings, positively with clear as luminous
(sound quality) and clear as well-defined (like an image or drawing), or negatively with
clear as dry.
The study was then aimed at observing whether the lexicon employed by different
pianists in describing piano sound is consensual in its meaning, despite its inherent sub-
jectivity. Given the many terms commonly employed by most if not all pianists, their
common context of use, and the synonymic relationships identified in the corpus, it was
concluded that the words employed in describing piano sound hold a consensual mean-
ing among the pianists interviewed. Most notably, the meanings hereby held by differ-
ent adjectives and their synonymic relationships in this context largely differed from the
common use of these terms in the standard French lexicon. Thus, this vocabulary con-
stituted a sub-lexicon within the common lexicon, in which two main axes of meaning
transpired: a sense of percussion, corresponding to the negative connotation of clear,
and a sense of resonance, corresponding to the positive connotation of clear.
In conclusion, the vocabulary used by pianists in describing piano sound was identi-
fied as forming a consensual, specialised lexicon of an affective and axiological nature,
and following two axes of meaning: percussion and resonance.
This semantic analysis of free verbalisations about piano sound was complemented
by listening tests and basic acoustical analyses (Cheminée et al., 2005). In qualitative
piano timbre listening tests, 40 musician participants provided (1) pairwise dissimilarity
ratings and (2) free categorisations of sound recordings of nine different pianos (played
by one pianist), followed by verbal descriptions and explanations. For the first task and
half the second task, downward chromatic scales were recorded, while for the remaining
half of the second task, short musical sequences were used. Tree analyses highlighted
cascading pairwise piano groupings by timbre, with the piano pairs 1–2 and 5–7 most
closely grouped in all tasks. Acoustical correlates of spectral richness in the medium-high
frequencies were associated with pianos 1 and 2, and a low spectral centroid with pianos 5 and
7. From the semantic description provided after the second task, pianos 1 and 2 were very
similarly described by all participants, with diverse adjectival synonyms pertaining to the
same semantic field and perceptual sensations, and with adjectives clear, bright, sharp,
metallic and open most recurring. 13 We can actually interpret those terms, according to
the proposed axes of meaning, as describing high percussiveness and/or resonance. As
for the second tight group of pianos 5 and 7, their timbre was most characterised with
attributes dull, soft, hushed, inside. 14 We can also interpret those as meaning a lack of
percussion and/or resonance.
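The tree-analysis step described above can be illustrated with a minimal, self-contained sketch: given pairwise dissimilarity ratings, agglomerative clustering recovers cascading groupings. The dissimilarity values below are invented for illustration; only the resulting pairings 1–2 and 5–7 mirror the study's findings.

```python
# Minimal single-linkage clustering sketch (pure Python, toy data): how
# pairwise dissimilarity ratings can yield tree-like piano groupings.
# The dissimilarity values are invented for illustration only.

def single_linkage(labels, dist):
    """Agglomerative single-linkage clustering; returns merged sets in merge order."""
    clusters = [frozenset([l]) for l in labels]
    merges = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest minimum pairwise dissimilarity
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: min(dist[frozenset((u, v))]
                                 for u in pair[0] for v in pair[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append(a | b)
    return merges

# Toy dissimilarities between five pianos (symmetric, keyed by unordered pairs):
pianos = [1, 2, 5, 7, 9]
d = {frozenset(p): 0.9 for p in
     [(1, 5), (1, 7), (1, 9), (2, 5), (2, 7), (2, 9), (5, 9), (7, 9)]}
d[frozenset((1, 2))] = 0.1   # "bright" pair, grouped first
d[frozenset((5, 7))] = 0.2   # "dull" pair, grouped second

merges = single_linkage(pianos, d)
print(sorted(merges[0]), sorted(merges[1]))  # [1, 2] [5, 7]
```

With these toy values, pianos 1 and 2 merge first and pianos 5 and 7 second, reproducing the shape (though not the data) of the cascading groupings reported by Cheminée et al.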
The lexicon of piano sound that was first characterised in a top-down approach was
thus confirmed bottom-up, with consensual use of specialised semantic fields. Terms
like clear and soft are consistent with the axes of percussion and resonance and describe
the specific sounds of two groups of pianos.

Bellemare and Traube (2005, 2006) meanwhile explored the verbal descriptions of
piano timbre according to its performer-dependent dimensions. The aims of this study
were three-fold: to determine whether there is a common language among pianists with
regard to timbre, to establish and classify an inventory of verbal descriptors of piano tim-
bre, and to investigate how verbal descriptors relate to proprioceptive gestures, dynamic
level and register, and onomatopoeic vocal analogies.
To these ends, two series of interviews were conducted, in French and/or English,
with participants of diverse origins and native tongue, but all based in Quebec for at
least two years. For the first series, eight advanced-level pianists were asked to select
and define ten adjectives that they thought best described timbres that can be produced at
the piano. These adjectives were selected from a list of 69 terms in French and 73 in English 15, or were provided by free additional suggestions. They were then asked, for each adjective, to provide synonyms and antonyms, to describe and demonstrate the ways to produce this timbre, to suggest a fitting musical context, and to vocalise it through onomatopoeia. In a second series of interviews, eight other advanced-level pianists were asked to define the piano timbres corresponding to 20 selected adjectives, designate the register and dynamic level to which they are most suitable, vocalise them through onomatopoeia, and organise them on a plane, according to similarity (both semantic and technical).
13. In French: clair, brillant, net, métallique and ouvert.
14. In French: sourd, doux, feutré, intérieur.
98 verbal descriptors of piano timbre were thus gathered. Frequencies of occur-
rence for both direct and indirect use were compiled. Among those 98 descriptors,
the 6 most cited accounted for one-fourth of the occurrences, while the 15 following
terms accounted for almost half of the occurrences: round, clear, warm, harsh, full-
bodied, velvety, sparkling, rich, soft, bright, resonant, brassy, transparent, shimmering
and metallic.
Analyses of the provided definitions and synonyms/antonyms revealed a tight net-
work of semantic relationships between the descriptors of piano timbre. For instance,
full-bodied was defined as round and rich, harsh as percussive and metallic, clear as
pure, bright, glassy and detached. Among these 15 descriptors, 6 were mentioned as
very constrained in register: distant and shimmering could not be produced over mf
nuances, while harsh, full-bodied and especially brassy required high dynamic levels,
and round could only be obtained between p and f registers. Links to proprioceptive
gestures were also traced, through the specific technical details for the production of
each timbre that the participants provided (see Chapter 7, Section 7.2.4, p. 168). Al-
though onomatopoeia was frequently employed in defining timbre descriptors, the task
of explicit vocalisation was found extremely arduous. Still, the onomatopoeias provided
ranged from sole consonants to plosive consonants combined with nasal vowels and to open vowels. Lastly, the arrangement of piano timbre descriptors on a plane according to similarity was found to form a wheel-like, circular diagram. Given the definitions and vocalisations of the descriptors, Hellwag's articulatory triangle (which traces the continuum of vowel articulation between the three poles of open [a], closed [u] and pointed [i]) could be meaningfully superimposed over the semantic wheel of piano timbre descriptors (see Figure 3.1). One can remark that this arrangement of the descriptors of piano timbre is very different from that of the classical guitar: instead of a linear, open-ended disposition, the descriptors of piano timbre are arranged in a closed form, befitting a circular continuum of timbral nuances.
15. Some of the verbal timbre descriptors used in English did not have an appropriate, meaningful and familiar translation in French; this explains why the list contained four more terms in English than in French.

[Figure: circular arrangement of piano timbre descriptors, with Hellwag's articulatory triangle ([i], [a], [u]) superimposed]
Figure 3.1: Semantic atlas of piano timbre – qualitative arrangement of descriptors through assessment of similarity (Bellemare and Traube, 2006).
3.2.5 Conclusion

This chapter thus presented the diverse methodologies, aims and objects contained
in the literature on timbre perception and verbalisation. Relevant methods of stimu-
lus identification, proximity ratings and Multidimensional Scaling, verbal attribute rat-
ing and magnitude estimation, free verbalisations, and semantic analyses through verbal
protocol and corpus methods were detailed. Meanwhile, the successive timbre studies
presented converged gradually in context towards the subject of this dissertation, piano
timbre.
The most suitable methods among these were used to further extend the knowledge
on the perception and verbalisation of piano timbre detailed above. The following chap-
ters will thus present the experimental studies that I conducted on the perception and
verbal expression of piano timbre.
CHAPTER 4

PERCEPTION AND IDENTIFICATION OF PIANO TIMBRE: A PILOT STUDY

The pilot study and results presented in this chapter were previously included in
Bernays and Traube (2010, 2012a). Prof. Caroline Traube is to be credited with the
idea for the study, the recruiting of participants and the organisation of the listening
test. Dominic Thibault designed the testing interface, and was in charge, along with the
author of this thesis, of the performance recordings. The author of this thesis carried out
the listening test, data collection and analysis.

4.1 Introduction – Aims

As we saw by the end of the previous chapter, advanced-level pianists use a shared,
common and organised lexicon of verbal descriptors to refer to a large palette of piano
timbre nuances (Bellemare and Traube, 2005; Cheminée et al., 2005). Yet could we
expect this apparently consensual meaning of verbal descriptors of piano timbre among
highly-skilled pianists to be reflected in their auditory perception of such piano tim-
bre nuances? That is, can pianists identify and consensually label performer-controlled
piano timbre nuances in audio recordings? Furthermore, do the verbal descriptions of pi-
ano timbre remain consistent from piano performance to the listening experience? That
is, can the timbral intentions of the performer (and their corresponding verbal descrip-
tors) be effectively perceived, understood and labelled accordingly by pianist listeners?
In summary, this study examines whether the production, perception and verbalisa-
tion of piano timbre can be consensually understood and agreed upon. The relations,
regarding piano timbre, from words, to performance, to sound, to perception and back
to words, are investigated in order to determine whether they form a closed chain.
4.2 Method

For this aim, a pilot study was designed. In the first part of the procedure, several
piano performances highlighting different timbral nuances were recorded. In the second
part, perception tests probing the identification of piano timbre were conducted, with the
audio recordings as stimuli.

4.2.1 Production of piano timbre and audio recordings

A professional pianist (MG) was asked to perform short musical pieces several times,
while expressing different timbral nuances each time.

4.2.1.1 Musical pieces and timbral nuances

Several constraints were imposed on the musical pieces to be performed. First of all, each piece had to be suitable for the expression of several different timbral nuances.
Second, the pieces had to be short in order for a timbral nuance to remain sufficiently sta-
ble throughout. Lastly, the timbral preconceptions that would be associated with a piece
taken from a known repertoire had to be avoided. For these aims, Sylvie-Anne Ménard, a
student composer at Université de Montréal, was asked to compose short original pieces.
The timbral nuances to feature were taken from the list of common verbal descriptors
of piano timbre highlighted by Bellemare and Traube (2005). After agreement with
the composer, the eight following timbre descriptors were selected: Bright, Dark, Dis-
tant, Full-bodied, Harsh, Round, Matte and Shimmering. Given the French-Canadian
linguistic context of this study, six timbre descriptors were used in French. However,
two descriptors were more familiar in English to the composer and the performer (both
bilingual French/English). The timbre descriptors actually used were thus Bright, Som-
bre, Lointain, Plein, Dur, Rond, Mat and Shimmering. The linguistic problems this use
of two languages may cause are discussed at the end of the chapter. In the rest of this
chapter, the timbre descriptors are all mentioned in English for easier reading.
In order for all these timbral nuances to be properly highlighted in the compositions,
three different pieces were written and used (see Figures 4.1, 4.2 and 4.3). The attribution
of timbres in those three pieces was as follows:
– First piece: Bright, Dark, Distant, Harsh, Round
– Second piece: Dark, Full-bodied, Harsh, Round
– Third piece: Matte, Shimmering
With this attribution, the Dark, Harsh and Round timbral nuances could be compared
between the first and second pieces. The pieces had a duration of roughly 20, 30 and 15
seconds respectively.
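The attribution above can be sketched as a small lookup, confirming which nuances are shared between the first two pieces and that all eight descriptors are covered:

```python
# Timbre-to-piece attribution from the pilot-study design (Section 4.2.1.1).
pieces = {
    1: {"Bright", "Dark", "Distant", "Harsh", "Round"},
    2: {"Dark", "Full-bodied", "Harsh", "Round"},
    3: {"Matte", "Shimmering"},
}

shared = pieces[1] & pieces[2]   # timbres comparable across pieces 1 and 2
print(sorted(shared))            # ['Dark', 'Harsh', 'Round']

all_timbres = set().union(*pieces.values())
print(len(all_timbres))          # 8 timbre descriptors in total
```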

[Musical score: "Étude de timbre no 1" – Zviane]
Figure 4.1: Score of the first miniature piece composed for this pilot study.
[Musical score: "Étude de timbre no 2" – Zviane]
Figure 4.2: Score of the second miniature piece composed for this pilot study.

[Musical score: "Étude de timbre no. 3" – Zviane]
Figure 4.3: Score of the third miniature piece composed for this pilot study.

4.2.1.2 Recording sessions

After sufficient preparation time was given to MG, four recording sessions were arranged (three on the same day, one a week later). The sessions took place in the BRAMS facilities, on a Bösendorfer Imperial grand piano. 1 MG's performances were recorded with dual stereophonic sound takes, one from close above the pianist's head (AB type, with two DPA4006 omni microphones) and one further away in a more standard fashion (quasi-coincident NOS type, 90 and 30 cm between capsules, with two DPA4011 cardioid microphones). The recording studio equipment and the microphone setup are displayed in Figure 4.4. Stereo audio files were created by editing and mixing the takes together (with Digital Performer®) in order to achieve as much realism as possible from the performer's perspective. MG was given the leeway during each series to proceed with several performances corresponding to the eight timbres, in free order and with the possibility to retry after an unsatisfactory attempt. The same process was repeated over the four series, which allowed for comparing and selecting the optimal performances and audio recordings. In total, 34 performances were recorded. Additionally, comments from the pianist were gathered (through a questionnaire-guided interview) on the verbal description of each timbre and the underlying technique of production, as well as general comments on the pieces.
1. Model 290: 9’6” long, low-register-extended 97-key keyboard.
Figure 4.4: The Bösendorfer Imperial 290 grand piano installed in the BRAMS studio,
and the microphones setup used in performance recordings. (Photographer and model:
Dominic Thibault)

4.2.2 Perceptual identification test of piano timbre

4.2.2.1 Preliminary test

A preliminary timbre perception test was conducted with the performer, MG, him-
self. Shortly after the end of the fourth recording session, MG was presented with the
recordings of his performances from the first three sessions, and was asked to identify
and label the timbre performed in each audio excerpt. The recordings were presented
through two Genelec 8050A monitoring speakers, in a free order controlled by the experimenter to ensure that each timbre was tested and to follow up on the most uncertain answers. In total, 21 of the 40 recordings were played. The second experimenter, who was also blind to the timbre instructions, collected MG's answers.

4.2.2.2 Main perception test

The identification procedure was chosen for this test (see Chapter 3, Section 3.1.1.1,
p. 55). Indeed, by asking participants to attach timbre labels to audio stimuli, both the
auditory perception of piano timbre and its verbal description could be tested at the same
time. This pilot study was not aimed at determining the extraneous parameters (acoustic,
gestural, and so on) that influenced the perception and verbal labelling of piano timbre
(if so, adjustment or evaluation procedures would have been preferable). While the
results of an identification task would not inform us on the actual cause of errors —
whether wrong perception or wrong verbal label attachment — this procedure was most
suitable for directly testing the consistency of verbal descriptions across performance
and perception.
Seventeen pianists were gathered for a collective listening test session. Sixteen of
them were currently related to the Faculté de musique, Université de Montréal (students
or professors), and 11 were Quebec natives. Twelve were native French speakers, and
the other five were at least fluent in French. All were fluent in English as well, and all
confirmed their understanding of the eight timbre descriptors. The average age of the
participants was 26 years old (S.D. 7.4 years). Six participants had perfect pitch.
Thirteen stimuli were selected among the whole set of recordings, so as to best il-
lustrate the eight timbres played by MG. These audio recordings are provided alongside
this thesis, in the first archive of additional audio files. Two stimuli among the 13 were
used to highlight each of the Bright, Round and Shimmering timbres, and three high-
lighted Harsh, in order to check the consistency of answers between different recordings
and especially between different pieces. Three successive listening sessions took place.
The first one consisted of a preliminary familiarisation with the stimuli, to help the par-
ticipants better apprehend the range of timbral nuances that they would have to evaluate.
Then, for the first testing process, the stimuli were played, one by one, grouped by piece,
and in random order (unknown in advance to the experimenter). After the presentation
of each stimulus, the participants had to write down a free verbal description of the tim-
bre, as well as technical comments on the gesture that they felt could be associated with
it. Lastly, after a break, the stimuli were all presented again (in a different, random or-
der) for a forced-choice task that consisted in choosing the most fitting verbal descriptor
among a list of the eight preselected timbre labels. Written questionnaires were used to
collect the answers. The participants also indicated their background, age, piano practice
and other personal information.

4.3 Results

4.3.1 Preliminary timbre identification test

This informal, preliminary identification task was performed successfully by MG, although he judged it rather difficult and unusual. He was able to identify timbre
with the right verbal descriptor in almost all of the audio recordings presented, some-
times by elimination/default, but mostly by direct deduction. Only the terms Full-bodied
and Round yielded some confusion, as MG did not really differentiate the two terms
in his timbre lexicon. Therefore, the pianist’s conception of timbre was overall highly
consistent between performance and listening.

4.3.2 Main timbre perception test

4.3.2.1 Free identification task

The complete table of timbre descriptors used to qualify each of the 13 audio excerpts
in free verbalisations is presented in Table 4.I. Almost all descriptors were provided in
French, the native language of most participants (and a second language for the others).
They are provided verbatim.
In a first qualitative account, the timbre descriptors suggested by participants exhibit
varying degrees of correspondence with the timbres performed, depending on the ex-
cerpt. Sometimes the exact descriptor used as performance instruction was freely cited
(excerpts 3, 6, 7 and 9). More often, though, the descriptors mentioned were mostly syn-
onymous with the performance instruction, at least evoking neighbouring notions (e.g.
excerpts 11 and 12). For other excerpts, however, several of the descriptors cited repre-
Table 4.I: Timbre descriptors provided per audio excerpt in the free identification task. Identified synonyms of the timbre performed are written in blue, while identified antonyms are written in red.

Excerpt | Timbre performed | Timbre descriptors proposed in the free identification task
Excerpt 1 | Bright | sec x5, articulé x4, direct x4, sautillant x3, percussif x2, clair x2, honky tonk x2, piquant, nasillard, âpre, incisif, précis, tapé, décidé, discipliné, carré, osseux, froid, naïf, rythmé, segmenté, non legato, découpé, compact, dense, enjoué
Excerpt 2 | Harsh | chantant x2, legato, sensible, coulant, tendre, velouté, profond, réverbéré, cuivré, neutre, staccato, tapé, élégant, équilibré, dandy, rond, fluide, swingué, rythmé, articulé, phrasé, gentil, caressant, amorti, vivant, souple, élastique, clair, sans nuance, sans résonance, sage, studieux, sans fantaisie, sans couleur
Excerpt 3 | Distant | direct x4, rond x3, clair x2, cru x2, lourd, sec, chaud, intime, dur, lointain, marqué, volontaire, franc, contrasté, plein, présent, bouncy, enjoué, acidulé, coquet, passif, rythmé, lean, loud, défini, précis, dynamique, vivant, dirigé, autoritaire, martelé, saturé, intransigeant, indirect, taché, fort, vague, dialogué, gonflé
Excerpt 4 | Bright | rond x2, mou, feutré, intime, délicat, doux, timide, fluide, coulant, plein, courbe, maniéré, gentil, narratif, mouillé, nostalgique, triste, fatigué, moelleux, lointain, déconnecté, sous-accentué, assourdi, aérien, flottant, superficiel, cotonneux, chantant, élégant, détendu, expressif, timbré, souple, fin, mesquin, réservé
Excerpt 5 | Round | mou x2, déconnecté x2, feutré x2, doux x2, timide x2, neutre x2, lointain x2, léger x2, pas soutenu, maigre, défini, concis, chantant, plein, smooth, absent, peu d'intention, coloré, subtil, pâle, pastel, clair, junk, lié, amorti, gentil, vivant, transparent, aéré, innocent, sans relief, effacé, écho
Excerpt 6 | Harsh | dur x2, profond x2, pesant x2, large x2, grand, étouffé, peu résonant, inorganique, froid, sans élan, léger, incisif, volontaire, poignant, soliste, inégal, tapé, grave, sérieux, cérébral, contrôlé, intransigeant, caverneux, sombre, sec, élancé, sonore, pas libre, lourd, traînant, déchirant, majestueux, expressif, plein, fin, noble
Excerpt 7 | Round | rond x3, plein x2, chaud x2, clair, fluide, liquide, indirect, félin, violet, cordé, long, riche, piano, tragique, pathétique, spontané, douloureux, organique, brumeux, profond, sonore, balancé, libre, vibrant, parlant, ouvert, haché, articulé, directionnel, actif, droit, fier, interrogatif, mystérieux, connecté, fin, sec, puissant, glacial, respirant, fantaisiste
Excerpt 8 | Dark | sec x5, dur x4, direct x4, fort, clair, cérébral, marcato, percussif, étouffé, théorique, défiant, âpre, intense, métallique, ferme, inflexible, carré, puissant, affirmé, fin, strict, sévère
Excerpt 9 | Harsh | sec x9, dur x3, lourd x2, clair x2, rythmique x2, précis x2, forcé x2, direct x2, non legato, maigre, musclé, nasillard, pas plein, froid, frustré, amer, rond, cassé, vide, superficiel, pas clair, indéfini, énervé, martelé, cru, découpé, déconnecté, élastique
Excerpt 10 | Full-bodied | sensible x2, mouillé x2, retenu, résonant, organique, diffus, lancinant, riche, onctueux, doux, brouillé, résigné, nostalgique, fatigué, large, perlé, calme, mélancolique, large, sombre, vivant, chaud, intime, intense, caressant, mystérieux, envelopé, lisse, lent, relief, expressif, gluant
Excerpt 11 | Shimmering | clair x4, aquatique x3, coulant x2, perlé x2, cristallin x2, articulé x2, transparent x2, aérien x2, intentions x2, feutré, focusé, cloché, précis, mouillé, proche, plat, pluvieux, vitreux, sec, voilé, chatoyant, lumineux, concret, ordinaire, fontaine, ruisselant
Excerpt 12 | Shimmering | clair x4, aquatique x2, dur x2, articulé x2, lumineux x2, brillant x2, appuyé, soutenu, fluide, tapé, défini, rond, pluvieux, éclatant, lourd, libre, concret, pointu, présent, substantiel, nourri, confortable, expressif, intense, connecté, liquide
Excerpt 13 | Matte | rond x3, léger x2, aquatique x2, absent x2, articulé x2, céleste, pur, chantant, mélodieux, linéaire, dirigé, inégal, troué, pas articulé, chatoyant, spirituel, clair, sec, voilé, diffus, fluide, libre, timbré, présent, relief, flou, indéfini, aérien, raffiné, inactif, vide
sented timbral aspects other than those held by the performance instruction. For instance,
the first excerpt, performed Bright, was nonetheless most characterised by terms per-
taining to semantic fields of dryness or sharpness. Excerpt 3 (Distant) was surprisingly
understood in either of two different ways by the participants. For some participants, its
timbre was indeed Distant, Indirect, Vague. But for others, the exact opposite stood out:
its timbre was considered Direct, Present, Lively. Finally, for excerpts 2, 4 and 8, the
timbre descriptors freely cited completely differed from the timbre descriptor that served
as the instruction to the performer. Excerpt 8 (Dark) was predominantly considered Dry
or Harsh. Excerpt 4 (Bright) was mostly described as Round. And excerpt 2 (Harsh)
was essentially described antonymically (Singing, Tender, Velvety, Round, etc.).
In order to explore more quantitatively the relations of the timbre descriptors pro-
posed by participants to the timbre descriptors performed, synonyms and antonyms (in
the context of piano timbre description) of the eight timbre descriptors performed were
sought out. The results from Bellemare and Traube’s (2005) interviews of pianists about
the description of piano timbre were used to this end. The synonyms and antonyms of
the eight timbre descriptors — either directly provided or suggested in the definitions of
timbral nuances — were thus gathered, and are presented in Table 4.II.
Those synonyms and antonyms of each timbre performed were then identified among
the timbre descriptors proposed in the free identification task of each excerpt, and are
represented in colour-code in Table 4.I.
Given the number of exact descriptors and attested synonyms and antonyms to the
timbre performed provided in the free identification of each excerpt, a total index of
synonymy per excerpt was calculated — with synonyms and antonyms prorated to the
number of synonyms and antonyms (respectively) identified for each timbre performed.
This index of synonymy is given by the following formula (where Nk is the number of
terms of type k):

\mathrm{Index} = N_{\mathrm{exact\ descriptors}} + \frac{N_{\mathrm{synonyms}}}{N_{\mathrm{listed\ possible\ synonyms}}} - \frac{N_{\mathrm{antonyms}}}{N_{\mathrm{listed\ possible\ antonyms}}} \qquad (4.1)
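As a quick check, Equation 4.1 can be applied to the counts later reported in Table 4.III; the sketch below (plain Python, with values copied from that table) reproduces the indexes for two excerpts.

```python
# Worked check of Equation 4.1 using counts reported in Table 4.III.
def synonymy_index(n_exact, n_syn, n_poss_syn, n_ant, n_poss_ant):
    index = n_exact
    if n_poss_syn:                 # guard: Matte has no listed synonyms
        index += n_syn / n_poss_syn
    if n_poss_ant:
        index -= n_ant / n_poss_ant
    return index

# Excerpt 6 (Harsh): 2 exact descriptors, 13 synonyms out of 10 possible,
# 2 antonyms out of 12 possible
print(round(synonymy_index(2, 13, 10, 2, 12), 2))   # 3.13
# Excerpt 3 (Distant): 1 exact, 5/10 synonyms, 14/6 antonyms
print(round(synonymy_index(1, 5, 10, 14, 6), 2))    # -0.83
```

Note that synonym counts can exceed the number of listed possible synonyms (repeated citations across participants), which is why the index is unbounded above.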
Table 4.II: Synonyms and antonyms of the eight timbre descriptors used as performance instructions, as extracted and listed from Bellemare and Traube (2005).

Bright
  synonyms: luminous, extroverted, open, sunny, sparkling, brilliant, percussive, brassy, clear, radiant, shimmering, bell-like
  antonyms: dark, tempestuous, shadowed, stifled, smothered, matt
Harsh
  synonyms: brutal, violent, big, loud, hard, rugged, upfront, destructive, dry, brash
  antonyms: brassy, clear, resonant, soft, delicate, fragile, full-bodied, mellow, round, tender, caressing, warm
Distant
  synonyms: removed, faraway, detached, uninvolved, clear, cold, mysterious, uncertain, veiled, emotionless
  antonyms: centred, soulful, present, transparent, precise, well-defined
Round
  synonyms: full-bodied, rich, resonant, caressing, sunny, tender
  antonyms: harsh, thin, dry, sharp, metallic, transparent
Dark
  synonyms: cloudy, heavy, moody, tempestuous, evil
  antonyms: bright, shimmering, resonant, radiant, sparkling
Full-bodied
  synonyms: round, rich, imposing, resonant
  antonyms: harsh, metallic, transparent
Matte
  synonyms: (none listed)
  antonyms: bright, shimmering, sparkling
Shimmering
  synonyms: bright, brassy, glassy, sparkling, clear, crystal, metallic, champagne-like
  antonyms: dark, matt

All this numerical information is contained in Table 4.III. These evaluations and indexes
of synonymy are admittedly very tentative, as the number of synonyms and antonyms
that could be retrieved in Bellemare and Traube (2005) largely varies between the timbre
descriptors used as performance instructions. Yet they could be used for a rough outline
of the consistency between the timbre descriptors that instructed the performances and
the verbal descriptors provided in the free identification of timbre in the audio record-
ings. Consistency is thus shown as largely varying between excerpts, and especially
between pieces, as free descriptions of the first piece were (but for the first excerpt)
more antonymically inclined.
Table 4.III: Numerical evaluation of synonymy for the verbal descriptors provided in the free identification of each excerpt.

Excerpt | Timbre performed | # possible synonyms to the timbre performed | # possible antonyms to the timbre performed | Total # descriptors provided by participants | # exact descriptors | # synonymic relations | # antonymic relations | Index of synonymy of provided descriptors
Excerpt 1 | Bright | 12 | 6 | 41 | 0 | 4 | 0 | 0.33
Excerpt 2 | Harsh | 10 | 12 | 35 | 0 | 1 | 8 | -0.57
Excerpt 3 | Distant | 10 | 6 | 46 | 1 | 5 | 14 | -0.83
Excerpt 4 | Bright | 12 | 6 | 37 | 0 | 1 | 2 | -0.25
Excerpt 5 | Round | 6 | 6 | 41 | 0 | 1 | 2 | -0.17
Excerpt 6 | Harsh | 10 | 12 | 40 | 2 | 13 | 2 | 3.13
Excerpt 7 | Round | 6 | 6 | 45 | 3 | 3 | 1 | 3.33
Excerpt 8 | Dark | 5 | 5 | 32 | 0 | 4 | 1 | 0.60
Excerpt 9 | Harsh | 10 | 12 | 44 | 3 | 21 | 3 | 4.85
Excerpt 10 | Full-bodied | 4 | 3 | 34 | 0 | 2 | 0 | 0.50
Excerpt 11 | Shimmering | 8 | 2 | 38 | 0 | 8 | 0 | 1.00
Excerpt 12 | Shimmering | 8 | 2 | 34 | 0 | 9 | 0 | 1.13
Excerpt 13 | Matte | 0 | 3 | 37 | 0 | 0 | 1 | -0.33
4.3.2.2 Forced-choice identification task

For the forced-choice timbre identification task, the participants had to choose the
label to attach to each stimulus between the eight timbre descriptors that had served as
instructions to the performer. The results are presented for each participant in Figure 4.5.
Timbre identification rate for the 13 excerpts — i.e. the proportion of answers where the
timbre descriptor chosen by the participant corresponds to the timbre descriptor used
as instruction to the performance — was significantly above chance (1/8 = 0.125) for
10 of the 17 participants. One participant in particular (#10) performed extremely well,
with a 0.846 identification rate (11/13). The average identification rate was 0.371 (SD
0.164), three times above chance. Differences in individual scores did not correlate to
any personal information such as age, years and daily hours of practice, academic level
or favourite repertoire, perfect pitch or not. 2
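The thesis does not state which statistical test underlies the significance markers in Figure 4.5; a one-sided binomial test against the 1/8 chance level over 13 excerpts is one plausible reading, sketched below under that assumption.

```python
# Sketch (assumption, not the thesis's stated method): one-sided binomial
# test of an identification score against chance (p = 1/8) over 13 excerpts.
from math import comb

def p_at_least(k, n=13, p=1/8):
    """P(X >= k) for X ~ Binomial(n, p): chance of scoring k or better."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(p_at_least(5))    # 5/13 correct (rate .385): significant at p < .05
print(p_at_least(11))   # 11/13 correct (participant #10): far beyond p < .01
```

Under this reading, scores of 5/13 or more clear the p < .05 threshold while 4/13 does not, which is consistent with the bars flagged as significant in Figure 4.5.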
Timbre identification rate per excerpt was significantly above chance for 8 of the
13 excerpts (see Figure 4.6). The standard deviation between excerpts is higher than
between participants, at 0.226. Indeed, there is a somewhat large discrepancy in timbre
identification rate between excerpts, with, on the one hand, the two excerpts performed
with Round timbral intentions never identified as such — even though excerpt #7 was
described as Round three times in the free identification task — while two of the three
Harsh excerpts were correctly identified by 11 participants.
Those results are more understandable in light of the confusion matrix presented
in Table 4.IV, which details the timbre descriptors given as answers by participants
for each timbre descriptor used as performance instruction (for one or several audio
excerpts). 3 We can see that performed timbres Bright, Dark, Distant, Full-bodied, Harsh
and (to some degree) Shimmering were mostly well identified. Identification rates of
those six timbres are significantly above chance (p < 0.01).
2. Although participant #10 was one of six with perfect pitch, the five others performed within one
S.D. of the mean — thus no overall significant correlation of individual scores with perfect pitch.
3. The total number of answers provided does not sum to 13 excerpts × 17 participants = 221, as
seven answers were left blank.

Figure 4.5: Timbre identification rate per participant, by forced-choice between eight
timbre descriptors, over 13 audio excerpts. Horizontal lines show chance level (blue)
and average identification rate (red). Stars indicate individual significance over chance
level. The rightmost bar shows the average identification rate and confidence interval
(±2 S.E. between participants).

Table 4.IV: Confusion matrix of forced-choice timbre identification, with eight timbre
descriptors as both performance instructions (rows) and possible answers (columns).
The final column indicates the identification rate per timbre performed. The final rows
indicate the occurrence and accuracy of each timbre label used as answer.

Performance     Excerpts   Bright  Dark  Distant  Full-b.  Harsh  Matte  Round  Shimm.   # of answers   Ident. rate
instructions
Bright              2        16      3      0       2        3      3      4      3           34            .471
Dark                1         1      8      2       0        0      1      3      1           16            .471
Distant             1         0      1     10       0        0      4      1      1           17            .588
Full-bodied         1         0      0      0       8        0      2      4      1           15            .471
Harsh               3        11      2      1       4       27      3      1      2           51            .529
Matte               1         2      0      6       0        1      2      0      4           15            .118
Round               2         1      8      6       5        0     12      0      0           32            0
Shimmering          2        13      0      4       3        1      1      1     11           34            .324
Occurrences        13        44     22     29      22       32     28     14     23          214            .371
Adequacy                   .364   .364   .345     .364     .844   .071     0    .478                        .383
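The row-wise identification rates and column-wise adequacy values of Table 4.IV can be recomputed directly from the answer counts; a minimal sketch using the table's own numbers:

```python
labels = ["Bright", "Dark", "Distant", "Full-bodied",
          "Harsh", "Matte", "Round", "Shimmering"]
# Rows: performance instruction; values: answer counts in the same label order.
confusion = {
    "Bright":      [16, 3, 0, 2, 3, 3, 4, 3],
    "Dark":        [1, 8, 2, 0, 0, 1, 3, 1],
    "Distant":     [0, 1, 10, 0, 0, 4, 1, 1],
    "Full-bodied": [0, 0, 0, 8, 0, 2, 4, 1],
    "Harsh":       [11, 2, 1, 4, 27, 3, 1, 2],
    "Matte":       [2, 0, 6, 0, 1, 2, 0, 4],
    "Round":       [1, 8, 6, 5, 0, 12, 0, 0],
    "Shimmering":  [13, 0, 4, 3, 1, 1, 1, 11],
}

def ident_rate(timbre):
    """Identification rate: correct answers over all answers for that row."""
    row = confusion[timbre]
    return row[labels.index(timbre)] / sum(row)

def adequacy(timbre):
    """Adequacy: correct uses of a label over all its uses (column-wise)."""
    col = labels.index(timbre)
    return confusion[timbre][col] / sum(row[col] for row in confusion.values())
```

For instance, `ident_rate("Harsh")` gives 27/51 ≈ .529 and `adequacy("Harsh")` gives 27/32 ≈ .844, matching the table.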

Figure 4.6: Timbre identification rate per audio excerpt, by forced-choice between eight
timbre descriptors, for the 17 participants. Horizontal lines show chance level (blue) and
average identification rate (red). Stars indicate individual significance over chance level.
The rightmost bar shows the average identification rate and confidence interval (±2 S.E.
between excerpts).

Yet some frequent errors can be noticed. A Distant performance was identified as
Matte, and a Full-bodied performance as Round, four times each (23.5%). Harsh per-
formances were also identified as Bright in 11 cases over three excerpts (21.6%). As for
the Shimmering performances, errors are almost solely due to a confusion with Bright.
Shimmering excerpts were actually more often identified as Bright than as Shimmering.
Meanwhile, the other two timbres performed were not accurately identified in audio
recordings — i.e. not significantly above chance at the 5% level. The Matte perfor-
mances were mistakenly identified as Distant (35.3%) or as the antonymous Shimmering
(23.5%). And the Round-timbre performances, never identified correctly, were instead
labelled Matte (35.3%) or Dark (23.5%).
On the other hand, the confusion matrix also indicates the frequency and accuracy
with which each descriptor was employed in qualifying the audio excerpts. The Bright

timbre descriptor was the most often used, yet was frequently misapplied to Harsh
or Shimmering performances. Meanwhile, the Harsh label was almost always applied
appropriately. The timbre labels Dark, Distant, Full-bodied, Matte and Shimmering were
employed between 20 and 30 times, with average or mediocre accuracy. Finally, the
Round label was used very sparingly by participants, only 14 times.

4.4 Discussion

First, plausible explanations can be provided for some of the results. The frequent
assessments of Shimmering performances as Bright can be explained by the semantic
proximity between the terms. Bright and Shimmering were explicitly denoted as syn-
onyms in Bellemare and Traube (2005) (see Table 4.II). We may also postulate that the
Round timbre descriptor was relatively seldom used in labelling the excerpts because it
would correspond to an “average” performance, lacking salient timbral features — and
one usually aimed for as the timbre of choice in piano performance (Bellemare and Traube,
2005; Cheminée, 2006). The participants may have searched for any timbral feature
that could characterise each audio excerpt, thus avoiding description of any excerpt as
Round. On the contrary, the Harsh timbre may have the most distinguishing features,
and is generally (except in specific contexts) deemed undesirable in piano performance.
The Harsh descriptor could thus be used very accurately, i.e. seldom used mistakenly to
qualify a non-Harsh audio excerpt.
Meanwhile, there were notable differences in accuracy between the free and forced-
choice tasks for some of the stimuli. The audio excerpts that were more accurately
described in the free identification task, with the exact timbre descriptor or with syn-
onyms of the timbre performed, were not necessarily those most correctly identified
in the forced-choice task, and vice versa. For instance, the second excerpt was freely
described mostly with antonyms of the timbre it was performed with, Harsh, but was
correctly identified by 11 of 17 participants in the forced-choice task. And the fourth

excerpt, correctly identified as Bright in the forced-choice task by nine participants, was
freely described as anything but Bright. On the contrary, the ninth excerpt was freely
described mostly with synonyms of the Harsh timbre it was performed with: it was
described nine times as Dry, and even three times correctly as Harsh. But in the forced-
choice task, the ninth excerpt was poorly identified as Harsh (5/17) and often perceived
as Bright. This pattern of free and forced answers might actually reveal a dual under-
standing of a Dry timbre as Harsh and Bright at the same time. Meanwhile, the accuracy
in timbre identification was essentially equivalent between tasks for some excerpts: high
identification rate in both tasks for excerpt 6 (performed Harsh), average for excerpt 1
(performed Bright), and low for excerpt 5 (performed Round) for instance. Other dis-
crepancies in accuracy between the free and forced-choice tasks can be explained by the
points previously stated — Round avoided as timbre descriptor in the forced-choice task,
Shimmering mixed up with Bright.
In conclusion, while the results cannot be deemed sufficient to indicate a total con-
sensus on the perception and verbal description of piano timbre, they still outline a cer-
tain common ground among the participants about the identification and meaning to
attach to piano timbre upon listening.
The pianists’ ability to identify timbre in the context of various timbral interpreta-
tions of a piece can already be considered convincing, as the performer himself could
easily retrieve the timbres he played and as the participants in the listening tests performed,
on average, three times above chance. However, some aspects of the experimental design
effectively rendered timbre identification more difficult. The audio recordings used as
stimuli were relatively long (up to 30 seconds), which affected timbre consistency over
the duration of an excerpt. The compositional character of each miniature piece was
also arguably too salient, as none of the three pieces was suited to being played with all
eight timbres; the influence of the composition thus bore on the timbre-colouring of the
performances, without enough cross-comparisons between different pieces to set apart
the expression of timbre through sheer performance. Moreover, although the timbre
descriptors used were known to be common in the pianistic lexicon (Bellemare and Traube,
2005), they were not selected to optimally encompass the whole timbral space that can
be reached on the piano. As a consequence, the semantic distances between the timbre
descriptors used in this study were not optimised. The timbre descriptors could thus be
very close in meaning (e.g. Bright and Shimmering).
On the other hand, given the relatively short durations of the three musical pieces,
and the constraints of the timbral nuances to be expressed consistently during each per-
formance, the audio performance recordings that were used as stimuli were limited in
their ecological validity. Moreover, the intended condition of having each stimulus rep-
resent one single timbre throughout does not reflect how performers, in an ecological
context, apply many variations in timbral nuances during their performances of long
pieces. However, in the context of piano teaching, short demonstrations at the piano
are often offered by teachers to students, with the aim of highlighting a precise timbral
nuance. Students can then be asked to replay the same excerpt with the same timbral
colour. In this perspective, the performance recordings could reflect a context to which
advanced-level piano students are accustomed. The experimental control required to
test the hypotheses of the study may thus not have impeded too radically the ecological
validity of the experiment.
It has to be acknowledged that the English translations of the verbal descriptors pre-
sented in French may not have held the exact same meaning as in French (about piano
timbre) to the participants. The results may thus have slightly differed had all timbre
descriptors been presented in English (or conversely, all in French). However, it can
be presumed that the differences in meaning, regarding piano timbre, between a verbal
descriptor in French and its English translation are relatively small compared to the dif-
ferences between two different verbal descriptors. In the context of this pilot study, and
with participants fluent in both French and English, the results that would have been
obtained had all descriptors been presented in English arguably would not have contra-
dicted the observed tendencies.

Another issue, however, is that all but one of the participants were from the Faculté
de musique, Université de Montréal. Given that they belonged to the same institution and
its distinctive piano school, 16 of the participants may have shared with the composer
and the performer of the pieces an idiosyncratic vocabulary for timbre description. This
may thus limit the validity of the results outside of this local piano school, although
the participants had all previously pursued high-level piano studies elsewhere, and 13
mentioned that their conception of timbre had been shaped before they joined Université
de Montréal.
In the end, the results were considered encouraging enough for pursuing the explo-
ration of the expression, verbalisation, production and perception of piano timbre in
more controlled designs. Consequently, in order to learn more about the perception and
identification of piano timbre, the experimental protocol will be revised before conduct-
ing a follow-up study. The first step will be to optimise the selection of the timbral
nuances to study, by identifying the verbal descriptors of piano timbre that are the most
representative and different semantically.
Chapter 5 will thus detail the methodology employed for selecting an optimally en-
compassing subset of piano timbre descriptors. New audio stimuli will then be created,
through the designing of miniature musical pieces and audio recordings of their perfor-
mances (see Chapter 6, Section 6.2). And the preliminary results of a new, methodologi-
cally optimised identification test for the perception and verbal labelling of piano timbre
will be presented in Chapter 6.
CHAPTER 5

VERBAL EXPRESSION OF PIANO TIMBRE: SEMANTIC SCALING OF ITS
ADJECTIVAL DESCRIPTORS

The study described in this chapter was presented at the International Symposium on
Performance Science (Toronto, 2011), and appeared in its proceedings (in a shorter ver-
sion) (Bernays and Traube, 2011). The whole study, from design to data collection and
analysis, was carried out by the author of this thesis, with guidance from Prof. Caroline
Traube.

5.1 Introduction

The study presented in this chapter proposes a semantic similarity mapping of piano
timbre. Timbre is hereby considered at the intra-instrumental level, as the different nu-
ances highly skilled performers are able to draw from the piano, modulating and shaping
sounds in order to express their musical intentions. Piano timbre is thus not envisioned
here as the specific, performer-independent sonic characteristics of one given piano in
certain performance conditions (room reverberation, recording apparatus, etc.).
The different timbral nuances used for musical expression are generally familiar to
pianists, as attested by the vast vocabulary of verbal descriptors called upon to describe
its subtleties (see Chapter 3, Section 3.2.4). Moreover, the results from the study pre-
sented in Chapter 4 tend to indicate that pianists are able to identify timbral nuances in
audio recordings of piano performances, and to use the appropriate verbal descriptor (or
a synonym) to label the timbral intentions of the performer.
In this study, the verbal descriptors of piano timbre are only considered in light of
their sheer meaning to pianists — i.e. their mental representations and interpretations of
the words as abstractly envisioned in a musical context. This practice differs from most

of the timbre verbalisation studies presented in Chapter 3 that used audio stimuli to which
participants had to attach or rate verbal descriptors. Instead, the sound-context-free, top-
down methodology previously employed in Stěpánek and Moravec (2005) (see Chap-
ter 3, Section 3.2.2.3) is adopted here. Moreover, the results obtained with a free verbal-
isation methodology by Bellemare and Traube (2005, 2006), Cheminée et al. (2005) and
Cheminée (2006) formed the basis upon which this study builds.
In particular, Bellemare and Traube (2005, 2006) laid the groundwork for this study
by exploring the verbalisation of piano timbre through interviews of pianists (see Chap-
ter 3, Section 3.2.4, p. 81). Nearly 100 verbal descriptors of piano timbre were collected,
detailed with descriptions, synonymic relationships and frequency of occurrence. Some
of these verbal descriptors were also arranged in a synthetic, qualitative atlas of piano
timbre descriptors (see Figure 5.1). Within this spatial design, two dimensions of the
verbal description of piano timbre most come forward, which can be associated with
aspects of brightness and warmth respectively.

5.2 Aims

On the basis of this verbal data collection, this study further explores these verbal descriptors
of piano timbre and quantifies their semantic proximities, with the aim of building
a multidimensional spatial representation of the semantic relationships between adjecti-
val descriptors of piano timbre. A further aim of this study is to identify by those means
the most encompassing subset of verbal descriptors that would accurately describe the
whole semantic space of piano timbre. This selection of verbal descriptors will then
be used for exploring the gestural control of piano timbre, with experiments aimed at
identifying the mapping of gesture characteristics towards the different timbral nuances
performed (see Chapter 9).

Figure 5.1: Semantic atlas of piano timbre: qualitative arrangement of verbal descriptors
through assessment of synonymy degree (Bellemare and Traube, 2006).

5.3 Method

Questionnaires were conceived to probe the semantic similarities between common
verbal descriptors of piano timbre. The descriptors were selected among the 98 timbre
descriptors highlighted in Bellemare and Traube (2005). This subset had to be small
enough so that the task of filling in the questionnaires would not become overwhelming.
Indeed, $n$ verbal descriptors yield $\frac{n \cdot (n-1)}{2}$ pairwise proximity ratings — the number
of ratings thus increases quadratically with the number of verbal descriptors. Bellemare
provided the table in which are listed the frequencies of use of verbal descriptors in the

interviews — either directly, i.e. mentioned as relevant adjectival descriptors of piano


timbre, or indirectly, i.e. used in definitions or as synonyms of other descriptors. The
29 descriptors that were cited more than once were first selected. 1 This corpus was then
downsized in light of the synonymic relationships listed between them. Some adjectival
descriptors were listed as synonymous (e.g. Sparkling and Shimmering, or Clear and
Transparent). The least frequently used of such synonyms could thus be discarded with-
out affecting the semantic range of piano timbre still represented. Additionally, several
more complex combinations of synonymic relationships were identified in the defini-
tions of adjectival descriptors. For instance, Full-bodied was expressed as a combination
of Round and Rich, Resonant as a combination of Full-bodied and Warm, Metallic as a
combination of Brassy and Piercing, Piercing as a combination of Sharp and Brassy, etc.
Following these synonymic combinations sequentially, it was possible to eliminate sev-
eral more adjectival descriptors from the list, as their timbral meaning could be contained
in other descriptors. Finally, the disposition of the remaining adjectival descriptors in the
semantic atlas of Figure 5.1 was verified, so that no large portion of the atlas would be
left unrepresented by the subset of descriptors. This way, a subset of 14 adjectival de-
scriptors of piano timbre was obtained: Brassy, Bright, Clear, Dark, Distant, Dry,
Full-bodied, Harsh, Metallic, Muddled, Round, Shimmering, Soft and Velvety (see
Table 5.I).
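The quadratic growth of the rating task is easy to make concrete; the counts below reproduce the 91 ratings per questionnaire and the 406 ratings for 29 descriptors mentioned in the footnote:

```python
def n_pairs(n):
    """Number of pairwise proximity ratings for n descriptors: n(n-1)/2."""
    return n * (n - 1) // 2

n_pairs(14)  # 91 ratings, the size retained for the questionnaires
n_pairs(29)  # 406 ratings, judged far too demanding for participants
```
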
In the first part of the questionnaires, those 14 verbal descriptors were printed out in
random order, and participants were asked to rate their familiarity with each adjective on
a zero-to-five Likert scale (in increasing order of familiarity). In the second part, the par-
ticipants had to assess the semantic proximity between each of the 91 pairs of adjectives
from the 14-term set, on a six-degree Likert scale from zero (very different) to five (very
close). The printout order was randomised, for each questionnaire, between pairs and
within each pair of verbal descriptors. In this aim, a VBA macro was developed. It was
1. At this point, the participants would still be asked for 406 proximity ratings, which was considered
far too much.

Table 5.I: Uses and citations, by pianists interviewed in Bellemare and Traube (2005),
of the 14 timbre descriptors selected for the study.

Descriptors    Direct uses   Indirect uses   Total

Round               3              9           12
Clear               5              6           11
Harsh               -              9            9
Velvety             5              3            8
Full-bodied         4              4            8
Bright              3              4            7
Soft                2              5            7
Brassy              2              4            6
Muddled             1              5            6
Dark                3              3            6
Shimmering          2              4            6
Dry                 2              3            5
Metallic            -              5            5
Distant             1              3            4

included in a spreadsheet containing each of the 14 adjectival descriptors. The macro


then automatically filled in a template document with each descriptor (for the familiarity
ratings) and pair of descriptors (for the semantic proximity ratings), in random order. An
example of such a generated questionnaire is provided in Appendix I.
Seventeen pianists, most of them from the Faculté de musique, Université de Mon-
tréal, plus others from elsewhere in Canada, France and Finland, took part in the study
by filling in the questionnaires, anonymously, either on paper or on an electronic docu-
ment. The questionnaires were submitted in either French or English. When submitted
in French, the 14 timbre descriptors were translated (Cuivré, Brillant, Clair, Sombre,
Lointain, Sec, Plein, Dur, Métallique, Trouble, Rond, Scintillant, Doux and Velouté).

5.4 Results

5.4.1 Familiarity with timbre descriptors

The evaluations of familiarity with the 14 piano timbre descriptors, gathered from
the 17 filled-in questionnaires, were averaged per descriptor. The resulting means and
standard deviations are detailed in Table 5.II and presented as histograms in Figure 5.2.

Table 5.II: Evaluation of familiarity with the 14 common piano timbre descriptors se-
lected for the study: means and standard deviations (sorted in descending mean order).

Rank   Timbre descriptors   Familiarity (0–5)
                            Mean     S.D.
 1     Soft                 4.706    0.686
 2     Bright               4.353    0.862
 3     Round                4.176    1.425
 4     Clear                4.059    1.088
 5     Harsh                4.000    1.414
 6     Dry                  3.765    1.715
 7     Dark                 3.588    1.661
 8     Full-bodied          3.118    1.453
 9     Velvety              3.000    1.581
10     Metallic             2.765    1.640
11     Shimmering           2.471    1.546
12     Distant              2.412    1.583
13     Brassy               2.118    1.996
14     Muddled              1.353    1.367

Independent-sample t-tests revealed no influence of procedural factors (paper or elec-


tronic questionnaire, French or English language) on these familiarity ratings. It espe-
cially shows that the familiarity with each timbre descriptor did not significantly differ
between their French and English versions.
Several remarks can be drawn from these familiarity ratings. First, the rankings
are not directly comparable to Bellemare and Traube’s (2005), as the participants were
handed a different task — scale ratings here, vs. the frequency of free-form mentions

Figure 5.2: Mean evaluation of familiarity with 14 piano timbre descriptors, sorted in
descending order. Error bars indicate ±2 S.E.

in Bellemare and Traube (2005). This fact can explain why descriptors Soft and Bright
are here the most consistently familiar and are ranked consistently higher than what is
shown in Table 5.I. Dark and Dry are also ranked higher, while Muddled is less familiar.
Moreover, there are large variations between participants in their assessments of famil-
iarity, as the error bars (±2 standard errors) show in Figure 5.2. While Soft and Bright,
then Round, Clear and Harsh, are rated consistently as more familiar and Muddled, then
Brassy, Distant, Shimmering and Metallic are less familiar, the variations in these ratings
all but preclude a definitive ranking of the adjectival descriptors with regard to their
familiarity. Consequently, the precise familiarity rankings only reflect tendencies that
cannot be generalised. However, they can still be used by default for the purpose of
highlighting one descriptor within a subset.

5.4.2 Dissimilarities and semantic space

For the assessment of semantic proximity, numerical data were collected from the
questionnaires, for each participant and each of the 91 pairs of descriptors. These data

were compiled as similarity matrices (one for each participant), i.e. 14 × 14 matrices
where each row and column corresponds to one descriptor, and the intersection of one
row and one column contains the similarity rating (from zero to five) between the two
corresponding descriptors. The values in the similarity matrices were then inverted
(so that a higher value marks less similarity) and metrically rescaled over a
zero-to-one interval to form dissimilarity matrices. These dissimilarities were tested
for possible effects of procedural factors (paper or electronic questionnaire, French or
English language) and independent-sample t-tests revealed no significant correlation of
dissimilarity variances to such groups, which most importantly indicates that the seman-
tic relations between timbre descriptors were not significantly influenced by the language
they were presented in.
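The inversion and rescaling of the ratings is described but not given in closed form; a linear mapping consistent with the description would be (an assumption for illustration):

```python
def to_dissimilarity(rating, scale_max=5):
    """Map a 0..scale_max similarity rating (scale_max = very close) to a
    dissimilarity in [0, 1] (1 = very different)."""
    return (scale_max - rating) / scale_max
```
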
The dissimilarity matrices were then input into a Multidimensional Scaling algo-
rithm. Many such algorithms exist (see Chapter 3, Section 3.1.3, p. 58), which all pro-
ceed by iteratively testing different configurations of reduced-dimensionality spaces un-
til reaching the best fit to the input dissimilarities — interpreted, in this case of 14 × 14
dissimilarity matrices, as 14-dimension spaces. Yet those algorithms differ in dissimi-
larity weighting procedures, and in the loss function used for assessing the goodness of
fit between the tested space and the input dissimilarities.
After thorough examination, I opted for the PROXSCAL algorithm (Busing et al.,
1997), as implemented in SPSS® (SPSS, 2007). First, this three-way algorithm can di-
rectly deal with several dissimilarity matrices at the same time. Moreover, PROXSCAL
uses a weighting model between sources, as originally introduced in INDSCAL (Carroll
and Chang, 1970). Individual spaces (for each source) can be stretched or shrunk along
each dimension, as well as rotated, so as to obtain the most relevant common space from
the combination of all sources. This algorithm can thus compensate for a different use
of the similarity scale range between participants — i.e. some participants may have
extensively used the extremes of the scale in their ratings while others may have opted
for narrower ratings, without this idiosyncratic use of the scale necessarily representing
a stronger or weaker sense of similarity between descriptors. Additionally, the


PROXSCAL algorithm uses an advanced loss function derived from the SMACOF algo-
rithm (De Leeuw, 1977), largely considered as one of the most efficient (De Soete and
Heiser, 1993; Groenen and van de Velden, 2004). The SMACOF algorithm proceeds
by minimising the weighted mean squared error between the transformed dissimilarities
and the distances of n objects within m sources, that is, the following loss function:

\[
\sigma^2 = \frac{1}{m} \cdot \sum_{k=1}^{m} \sum_{i<j} w_{ijk} \cdot \bigl( \delta_{ijk} - d_{ij}(X_k) \bigr)^2 \tag{5.1}
\]

where:
– $w_{ijk}$ is the weight for pair $(i, j)$ and source $k$,
– $\delta_{ijk}$ is the transformed proximity for pair $(i, j)$ and source $k$,
– $X_k$ is the $n \times p$ matrix of coordinates, for source $k$, defining the $p$-dimension space
tested at this iteration as a possible solution to the algorithm — i.e. the space
which provides the best fit between $\delta_{ijk}$ and $d_{ij}(X_k)$,
– $d_{ij}(X_k) = \sqrt{\sum_{s=1}^{p} (x_{is} - x_{js})^2}$ is the distance between elements $x_{i\cdot}$ and $x_{j\cdot}$ of $X_k$ in
the tested space,
– $m = 17$ sources (i.e. participants) and $n = 14$ elements (i.e. timbre descriptors).
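The loss function in Equation 5.1 can be transcribed directly; the sketch below computes the raw stress of candidate configurations for m sources (variable names follow the equation; this is an illustration, not the PROXSCAL implementation):

```python
import math

def raw_stress(deltas, weights, configurations):
    """Weighted squared error between transformed dissimilarities
    deltas[k][i][j] and configuration distances, averaged over m sources."""
    m = len(deltas)
    total = 0.0
    for k in range(m):
        X = configurations[k]          # list of n coordinate tuples
        n = len(X)
        for i in range(n):
            for j in range(i + 1, n):  # pairs i < j only
                d = math.dist(X[i], X[j])
                total += weights[k][i][j] * (deltas[k][i][j] - d) ** 2
    return total / m
```

With a single source whose configuration reproduces the dissimilarities exactly, the stress is zero, as expected for a perfect fit.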
In detail, the goodness of fit for successive iterations was assessed with the normalized
raw stress, the following normalized variant of Equation 5.1:

\[
\sigma^2_{\mathrm{norm}} = \frac{\sum_{k=1}^{m} \sum_{i<j} w_{ijk} \cdot \bigl( \delta_{ijk} - d_{ij}(\alpha X_k) \bigr)^2}{\sum_{k=1}^{m} \sum_{i<j} w_{ijk} \cdot \delta_{ijk}^2} \tag{5.2}
\]

where $\alpha = \dfrac{\sum_{k=1}^{m} \sum_{i<j} w_{ijk} \cdot \delta_{ijk} \cdot d_{ij}(X_k)}{\sum_{k=1}^{m} \sum_{i<j} w_{ijk} \cdot d_{ij}(X_k)^2}$.

The optimal number of dimensions for the resulting reduced-dimensionality space


was examined with the help of a scree plot of stress values against several numbers
of dimensions (see Figure 5.3). The “point of inflexion” rule was then followed, i.e.

Figure 5.3: Scree plot of the stress values corresponding to MDS spaces of different
dimensionalities (from 1 to 13); around the 4D configuration lies the curve’s steepest
point of inflexion.

identifying the dimensionality for which the drop-off in stress value slows down (i.e.
where the curve bends the most). This criterion may yield the selection of either a three-
or four-dimensional space, depending on the inclusion of the point of inflexion itself.
However, as the distances on the fourth dimension are of significant range (about the
same as the third dimension) and seem meaningful and interpretable, the number of
dimensions for the MDS spatial reduction was set at four.

Four-dimension Multidimensional Scaling space

In this four-dimension setting, the PROXSCAL algorithm was set to Euclidean scaling
and ratio-based (i.e. metric) proximity transformations. Convergence of the configuration
— i.e. the point after which further iterations do not significantly improve the fit any
more — was evaluated according to a stability criterion of less than a $10^{-4}$ improvement
of the normalized raw stress. Convergence was thus obtained in 15 iterations. Moreover,
the final spatial organisation remains identical regardless of the starting configuration
given to the algorithm. The final normalized raw stress value is $\sigma^2_{\mathrm{norm}} = 0.07$.
Goodness of fit is indicated by the dispersion accounted for, $1 - \sigma^2_{\mathrm{norm}} = 0.93$,
and Tucker's coefficient of congruence, $\sqrt{1 - \sigma^2_{\mathrm{norm}}} = 0.964$. Both
denote an excellent fit of the MDS space to the input dissimilarities.
The coordinates of each of the 14 piano timbre descriptors along those four dimensions
are presented in Table 5.III. Each dimension of the MDS space is represented sep-
arately in Figure 5.4. The planar projections of the 4D space along dimensions 1–2 and
3–4 are represented in Figure 5.5. The 3D plot in Figure 5.6 displays the first three
dimensions.
Table 5.III: Coordinates of the 14 piano timbre descriptors along the four dimensions of
the MDS semantic similarity space.

Timbre descriptors   Dimension 1   Dimension 2   Dimension 3   Dimension 4

Brassy                  -0.300         0.127         0.352        -0.172
Bright                  -0.484         0.232        -0.072        -0.074
Clear                   -0.342         0.392        -0.209         0.029
Dark                     0.563        -0.381         0.225         0.250
Distant                  0.493        -0.335        -0.413        -0.054
Dry                     -0.436        -0.455        -0.120         0.355
Full-bodied              0.117         0.440         0.301         0.134
Harsh                   -0.552        -0.383         0.251         0.172
Metallic                -0.586        -0.200         0.064        -0.236
Muddled                  0.529        -0.496         0.170        -0.428
Round                    0.308         0.560         0.157        -0.041
Shimmering              -0.441         0.115        -0.414        -0.099
Soft                     0.524         0.092        -0.264        -0.001
Velvety                  0.607         0.293        -0.029         0.165

The fit of the reconstruction is further assessed in the Shepard plot (Figure 5.7),
in which both the 91 inter-point distances in the MDS space and the disparities (i.e.
the averaged final transformed dissimilarities) are plotted against the averaged original
dissimilarities. Were the MDS reconstruction ideal, all MDS distances and disparities
would align on the 1 : 1 diagonal line, because all disparities would be equal to their

Figure 5.4: Individual linear representation of each of the four MDS dimensions, with
the position of each descriptor indicated along each dimension.


Figure 5.5: Planar projections of the four-dimension MDS space: Dimensions 1–2 (left)
and 3–4 (right). Each of the 14 descriptors is positioned in both planes.


Figure 5.6: 3D representation of the first three MDS dimensions, which account for
91.1% of the MDS reconstruction. The position of each descriptor in the space is in-
dicated by a red dot, with a blue dotted line and cross figuring its projection on the
dimensions 1–2 plane.

corresponding dissimilarities, and distances would not be distorted. Consequently, the


deviations of disparities and distances from the 1 : 1 diagonal indicate the distortion
induced by the MDS reconstruction/reduction of the input dissimilarities, and illustrate
its goodness of fit (or lack thereof). As MDS distances are narrowly scattered around the
1 : 1 line, and the disparities line closely matches the 1 : 1 line as well, this Shepard plot
indicates a good fit between disparities and dissimilarities and an accurate reconstruction
of the distances between descriptors.
Numerically, the distances between descriptors in the 4D space show a linear correlation
of r² = 0.924 with the original, 14-dimension averaged dissimilarities.


Figure 5.7: Shepard plot of distances in the MDS space and disparities vs. averaged
original dissimilarities. The X-axis corresponds to the input dissimilarities and the Y-
axis to their MDS reconstruction. The red line represents the disparities, i.e. the MDS-
transformed dissimilarities. The 1:1 diagonal line represents the ideal, perfect recon-
struction. The better the red line matches the 1:1 diagonal, the more accurate the MDS
reduction of the original dissimilarities is. Finally, each of the 91 blue dots corresponds
to the semantic distance between two of the 14 descriptors, with the deviation of their
position from the diagonal indicating the distortion caused by the MDS reduction.

Assessment of the MDS dimensions

The relative contribution of each of the four MDS dimensions to reproducing the in-
put dissimilarities and distances was assessed with the scalar product matrix of the space
coordinates. With C the MDS coordinate matrix presented in Table 5.III, I calculated
the relative magnitude of the four eigenvalues from the ᵗC × C matrix product. Each

dimension thus accounts for respectively 49.3%, 27.9%, 13.9% and 8.9% of the MDS
reconstruction.
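This eigenvalue computation can be sketched in a few lines of numpy; the coordinate matrix below is a random stand-in for the actual values of Table 5.III, so only the procedure (not the percentages) is illustrated:

```python
import numpy as np

# Hypothetical 14 x 4 coordinate matrix C (stand-in for Table 5.III)
rng = np.random.default_rng(1)
C = rng.normal(size=(14, 4))

# Eigenvalues of the scalar product matrix tC x C measure the dispersion
# carried by each dimension; normalising yields relative contributions.
eigvals = np.linalg.eigvalsh(C.T @ C)[::-1]    # sorted in descending order
contributions = eigvals / eigvals.sum()
print(np.round(100 * contributions, 1))        # percentage per dimension
```

Applied to the actual coordinate matrix, this procedure yields the 49.3%, 27.9%, 13.9% and 8.9% contributions reported above.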
As for assigning semantic meanings to the dimensions, conjectures may be made that
the first dimension is associated with “sharpness” or “brightness” — which would acous-
tically refer to the spectral centroid.² The second dimension may account for “warmth”
— acoustically, the relative amount of low-to-mid frequencies (Beranek, 1979). This
description essentially fits the semantic atlas of adjectival descriptors of piano timbre ob-
tained by Bellemare and Traube (2006) (see Figure 5.1), in which the same dimensions
of brightness and warmth were identified. The dimensions 1-vs.-2 plane corresponds quite
exactly to this semantic atlas after slight rotation and distortion of the latter — which is
allowed given the qualitative nature of this atlas. The third and fourth dimensions are
more difficult to assess, although the third dimension might relate to timbre “loudness”
or “fullness” (of spectral content), and the fourth may seem akin to “presence”.

5.4.3 Cluster analysis

In addition to the Multidimensional Scaling of the dissimilarity data, Hierarchical
Clustering was performed, with distances transformed into a phylogenetic tree by itera-
tion of the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) (Sokal
and Michener, 1958). Each iteration of this method groups the two closest objects or
clusters together, with the distance between two clusters X and Y (containing objects x
and y respectively) defined as:

d_{XY} = \frac{1}{|X| \times |Y|} \sum_{x \in X} \sum_{y \in Y} d(x, y)    (5.3)
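Equation 5.3 amounts to averaging the distances over all cross-cluster pairs. A minimal sketch, using an illustrative toy distance matrix (not the thesis dissimilarities):

```python
from itertools import product

def upgma_distance(X, Y, d):
    """Average-linkage (UPGMA) distance between clusters X and Y:
    the mean of d(x, y) over all cross-cluster pairs (Equation 5.3)."""
    return sum(d[x][y] for x, y in product(X, Y)) / (len(X) * len(Y))

# Toy symmetric distance matrix over four objects (illustrative values only)
d = [[0.0, 0.25, 0.75, 1.0],
     [0.25, 0.0, 0.5, 0.75],
     [0.75, 0.5, 0.0, 0.25],
     [1.0, 0.75, 0.25, 0.0]]

# Distance between clusters {0, 1} and {2, 3}:
print(upgma_distance([0, 1], [2, 3], d))  # (0.75 + 1.0 + 0.5 + 0.75) / 4 = 0.75
```

Iterating this rule (always merging the two closest clusters, then recomputing inter-cluster distances) produces the dendrogram of Figure 5.8.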

The dendrogram resulting from this hierarchical clustering of the semantic dissimi-
larities between the 14 piano timbre descriptors is presented in Figure 5.8.
2. See for instance the acoustical description of the second dimension of the perceptual MDS timbre
space in McAdams et al. (1995).

Figure 5.8: Dendrogram of the hierarchical clustering (through UPGMA transformations)
of semantic dissimilarities between the 14 piano timbre descriptors. The five
fourth-level branches are highlighted in different colours.

5.5 Discussion

In order to identify clusters of piano timbre descriptors within the semantic structure,
the MDS semantic space was examined. Dimensions 1 and 2 were first explored, as
together they account for 77.2% of variance in the 4D reconstruction of the dissimilarities,
i.e. 71.8% of the dispersion within the source data. Within the dimensions 1-vs.-2 plane,
five groups could be singled out within the set of descriptors (see Figure 5.9):
– Dry, Harsh and Metallic,
– Brassy, Bright, Clear and Shimmering,
– Full-bodied and Round,
– Soft and Velvety,
– Dark, Distant and Muddled.
Those groups exactly match the five-branch subsets resulting from cluster analysis,
as highlighted in Figure 5.8.
Yet further information may be garnered within MDS dimensions 3 and 4. Although
the subset structure is much less salient and those dimensions combined only account for
21.2% of the dispersion, it is still worth noting that some descriptors can be regrouped in


Figure 5.9: Semantic MDS space: dimensions 1-vs.-2 and 3-vs.-4 planes. Left is the
dimensions 1-vs.-2 plane, in which five separate groups of neighbouring descriptors are
framed in red. The arrangement pattern of those five groups is traced in blue. Right is
the dimensions 3-vs.-4 plane, with only some saliently neighbouring descriptors framed
as groups.

the dimensions 3-vs.-4 plane: Distant and Shimmering; Soft and Clear; Dark, Harsh and
Full-bodied. Dry and Muddled are each singled out as outliers. The other descriptors
lie somewhere in between these groups (see Figure 5.9). The other four planes opposing
dimensions 1 or 2 and dimensions 3 or 4 did not reveal much more about the arrange-
ment of subsets, as dimension 1 was clearly predominant in setting possible groups of
neighbouring descriptors, and as the descriptors were otherwise too scattered or grouped
in the subsets previously identified.
All the clusters identified in both these planes of the MDS semantic space could be
accounted for with some combination of the terms Dry, Bright/Clear, Round, Dark,
Soft/Velvety, Distant/Shimmering and Muddled. Yet, given the relatively low contri-
bution of dimensions 3 and 4 to the MDS representation, it was deemed preferable to
favour descriptor familiarity over the subsets identified in the dimensions 3-vs.-4 plane
when choosing which groups to highlight. Therefore, the subsets of Muddled

(singled out) and Distant/Shimmering were discarded, as those terms yielded very low
familiarity ratings and were solely identified over dimensions 3 and 4.
Five groups thus remain, each clearly identified in both the MDS space (along its
most meaningful dimensions) and the hierarchical tree obtained by cluster analysis. For
each of these groups, one single representative term was sought out. The familiarity
ratings were used for this aim. The relations between timbres and dynamic level — as
described in Bellemare and Traube (2006) — were also factored in. Indeed, some tim-
bre descriptors would double as descriptors of dynamics, and/or were dynamically con-
strained — especially problematic when constrained to very low or very high dynamics,
because the follow-up experiment would require different timbres to be applied to the
same piece at the same dynamic level. Soft was thus discarded in favour of Velvety.
In the end, the five terms that were selected to describe the whole semantic space of
the verbal descriptors of piano timbre are: Dry, Bright, Round, Velvety and Dark.
These five verbal descriptors cannot, by themselves, represent all the timbral nuances
that pianists can create, use, and verbally describe. Yet, given the need for a minimal
number of timbre descriptors to highlight, these five terms provide the most representative
reduction of the semantic space of piano timbre description that can be achieved with
such a small set of descriptors.
Moreover, statistical analyses did not reveal that the language (French or English)
in which the verbal descriptors of piano timbre were presented to the participants could
significantly influence the assessments of familiarity and semantic proximity. The mean
familiarity ratings, MDS semantic space, and hierarchical clustering tree thus remain a
priori valid whether the timbre descriptors are used in French or in English. The dif-
ferences in meaning between the direct translations of timbre descriptors from French
to English (or vice versa) are presumably small and subtle enough that the general pat-
terns of semantic relationships between different timbre descriptors are not significantly
affected. Consequently, it can be assumed that the French translations of Dry, Bright,
Round, Velvety and Dark (i.e. Sec, Brillant, Rond, Velouté and Sombre) constitute the

most representative reduction of the semantic space of verbal description of piano timbre
in French.

Perspectives

Further data collection is currently under way. Fifteen more questionnaires have
been filled in at the time of writing. Preliminary analyses tend to confirm the semantic
familiarities and proximities (and the resulting spatial arrangement) presented in this
chapter, and further support the selection of Dry, Bright, Round, Velvety and Dark as
the five adjectival descriptors most representative of the semantic space associated with
piano timbre.
This selection of the five most representative verbal descriptors of piano timbre will
be employed for studying the gestural control of the five corresponding piano timbre
nuances, the details of which appear in Chapter 9. Moreover, the spatial arrangement of
the adjectival descriptors of piano timbre, in itself, may prove a handy pedagogical tool
that could facilitate students’ access to timbral nuances and their understanding of timbre
as a concept. But first, the auditory perception and identification of those five
timbral nuances in specifically controlled piano performances is explored in Chapter 6.
CHAPTER 6

PERCEPTION AND IDENTIFICATION OF PIANO TIMBRE:
A FOLLOW-UP STUDY

The study presented in this chapter was carried out for the most part by the author
(idea, design, method, recordings, the majority of participants’ testing, data collection
and analysis). Sébastien Bel programmed the testing interface, under guidance from
Prof. Caroline Traube and the author, and also contributed to the recruiting and testing
of participants. Bernard Bouchard helped the author select the musical pieces to be
performed and recorded. Prof. Traube provided general advice and guidance.

6.1 Introduction and aims

Preliminary new results on the auditory perception and identification of piano timbre
are presented in this chapter. In comparison with the pilot study presented in Chapter 4,
the methodology employed in this study has been improved, with carefully selected tim-
bre descriptors to both guide the performances (whose recordings were used as stimuli)
and serve as labels for timbre identification, and with a better controlled musical context,
including refined musical pieces and several performers.
The verbal description of piano timbre was already shown to rely on a shared, organ-
ised lexicon among advanced-level pianists (see Chapter 3, Section 3.2.4, p. 81). And
the pilot study presented in Chapter 4 already indicated a certain common ground among
participant pianists about the identification and meaning of piano timbre as perceived in
audio recordings. This study aims at exploring whether, with a refined musical con-
text underlying the audio stimuli and with verbal descriptors of piano timbre optimised
in representativeness and distinctiveness, a clearer consensus can be revealed among
skilled pianists regarding the perception and verbal-label identification of timbral inten-
tions in piano performances. In other words, this study aims at exploring the relations
from words, to performance, to sound, to perception and back to words about piano tim-
bre, in order to determine whether the production, perception and verbalisation of piano
timbre can be consensually understood and agreed upon in a closed chain of meaning.

6.2 Method

An experiment was designed to achieve these aims. Audio recordings of several


piano performances, in which different timbral nuances were highlighted, were used as
stimuli in an auditory test of the perception and identification of piano timbre.

6.2.1 Piano timbre descriptors

The five verbal descriptors of piano timbre most familiar and representative of the
whole semantic field of piano timbre description were identified in Chapter 5. This study
explored the perception and identification of the five following piano timbre nuances:
Dry, Bright, Round, Dark and Velvety. 1 To begin, these timbral nuances were to be
expressed in piano performances whose audio recordings would serve as stimuli in the
perception test.
A procedure was thus developed to create such stimuli. Short musical pieces were
composed to fit the constraints of the study, and were performed by highly skilled pi-
anists.

6.2.2 Audio stimuli

6.2.2.1 Musical pieces

In order to set a musical context in which the five timbral nuances can be properly
expressed in piano performance, short solo piano pieces were specially composed for
1. As the study was conducted in French (see later), the verbal descriptors employed were Sec, Brillant,
Rond, Sombre and Velouté.

the study. Indeed, it proved all but impossible to find repertoire pieces that could be
played naturally with each of the five timbres. Prior knowledge of a repertoire piece by
a performer could have also proved problematic, as the pre-established musical idea of a
piece could not necessarily be bent to the timbral constraints of the experiment.
Four composers, all Ph.D. students at the Faculté de musique, Université de Montréal,
were asked for miniature pieces that could be performed with each of the five
proposed timbral nuances. Fifteen pieces were submitted. A selection process was then
conducted by the author with the help of a pianist with professional experience. The
main selection criterion was the quality of fit of the pieces to each of the five timbres.
We also considered whether pianists could express each timbre consistently through-
out the pieces, and whether the pieces formed a coherent musical whole — criteria that
are easier to achieve in short pieces. The submissions were evaluated at the piano, with
regard to the quality, relevance and consistency of the expression of each timbre. Tech-
nical performance considerations were also taken into account. In the end, a consensus
was reached over four of the submitted pieces — written by three composers: Stacey
Brown, Frédéric Chiasson and Ana Dall’Ara-Majek — that could allow for a meaning-
ful and consistent expression of each of the five timbral nuances. The pieces, just a few
bars long (from four to seven, in different meters), ranged in duration from 12 to 15
seconds at the notated score tempo. The scores are presented in Figure 6.1.

6.2.2.2 Performance recordings

In order for the four pieces to be performed with the five timbral nuances, three
pianists were recruited. Each had extensive professional experience, and had obtained
an advanced-level diploma in piano performance. One had perfect pitch.
As in the pilot study presented in Chapter 4, the recordings took place in the BRAMS
facilities, in a professional-level recording studio, on a Bösendorfer Imperial 290 grand
piano. The performers had received in advance the scores of the pieces and the timbral
nuances to express, and were given sufficient rehearsal time, including on the Bösendor-

[Musical scores: Pièce 1 (Frédéric Chiasson, Moderato ♩ = 100, pp); Pièce 2 (Stacey
Brown, ♩ = 110, Rubato, pédale ad lib.); Pièce 3 (Ana Dall’Ara-Majek, ♩ = 72);
Pièce 4 (Frédéric Chiasson, Moderato ♩. = 72, rall.).]
Figure 6.1: Scores of the four miniature pieces composed and selected for the study.

fer piano. Each pianist was asked to perform each of the four pieces, with each of the
five timbres. With three such successive runs of 20 performances (twice in an order
of pieces and timbres chosen by the participant, and once in random forced order), 60
performances were recorded with each pianist — including three performances of each
combination of piece and timbre.
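The count of 60 performances per pianist follows directly from the design: three runs, each covering every piece-by-timbre combination once. A quick check:

```python
from itertools import product

pieces = ["Pièce 1", "Pièce 2", "Pièce 3", "Pièce 4"]
timbres = ["Dry", "Bright", "Round", "Dark", "Velvety"]
runs = 3  # two free-order runs plus one random forced-order run

# One run covers every piece x timbre combination exactly once
performances_per_run = list(product(pieces, timbres))
print(len(performances_per_run))         # 20 performances per run
print(runs * len(performances_per_run))  # 60 performances per pianist
```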
Two DPA 4011-TL professional cardioid microphones were used for the in-close
stereophonic sound take. They were set in XY configuration, in a “sweet spot” about 1 m off the
open-side rim at mid-frame, at a height of about 1.20 m (pointing down toward the far side of
the soundboard). This recording setup is pictured in Figure 6.2. Two omnidirectional
DPA 4006 microphones, set up in XY type, were placed about 50 cm below the cardioids,
at a height of about 1.40 m (and pointing toward the far side of the soundboard as well),
for additional ambiance and reverberation in the recordings. The microphone signals
were routed to a Millennia HV-3D preamp, and were recorded with Logic 9®. These
four-channel recordings were then mixed in stereo, and a 44.1 kHz, 24-bit wav sound
file was created for each separate performance.
A fourth pianist had been previously recorded with the same methodology, with a
different recording setup (only one omnidirectional microphone, and different micro-
phone placements). However, a failure of one of the two cardioid microphones left us
with only a monophonic recording of the performances. It was therefore decided to use
these recordings only for the familiarisation and training phase of the perception test.
Out of the three performances that featured the same timbral nuance, piece and pi-
anist, one recording was selected, according to criteria of performance quality and fi-
delity to the score, quality of recording (especially the absence of background noise,
including a creaking chair or a singing pianist), and consistency and reliability of timbre
expression. Sixty recordings were thus selected for the test, featuring every combination
of the five timbral nuances, the four pieces and the three performers. Twenty additional
recordings from the fourth pianist were selected for training to the test per se.

Figure 6.2: Cardioid microphone setup used in recording performances on the
Bösendorfer Imperial 290 piano installed in the BRAMS studio.

The 60 audio recordings used as stimuli in this study are contained in the second
archive of additional audio files provided with this thesis.

6.2.3 Perceptual identification test of piano timbre

6.2.4 Participants

At the time of writing, 15 participants have taken the test. They all spoke French
as their everyday language, had all followed formal piano training and obtained an
advanced-level academic diploma in piano performance (and/or were currently study-
ing for one), and had at least 10 (up to 40) years of piano experience. Three had perfect
pitch. Six participants were currently enrolled in an academic piano performance pro-
gram, with five of them at the Faculté de musique, Université de Montréal. Two of the
five had previously obtained advanced-level diplomas in piano performance elsewhere

(in France). Two other participants had obtained their highest-level piano performance
diploma at Université de Montréal. Six participants were Canadian (born and raised in
Quebec), four were French, two Iranian, one Argentinian, one Belgian and one Japanese.

Figure 6.3: Page of the testing interface in its initial, pre-manipulation state.

6.2.5 Testing interface and protocol

In order to examine the perceptual identification of timbre in those 60 audio stim-


uli, an interactive computer interface was designed by our colleague Sébastien Bel. The
interface was developed in Max/MSP®. Several successive pages are presented to the
user, who can use a mouse and keyboard to interact with set elements. In each testing
page, five audio stimuli are presented, corresponding to performances of the same piece
by the same pianist with the five timbral nuances. An example of this testing page is
presented, in its initial configuration before user interaction, in Figure 6.3. Each sound

is represented by a grey disk. The sounds play in a loop, and users can control which
one they hear by moving the mouse over the corresponding disk. The sound currently
heard is identified by a large green dot within the disk. Other sound-playing controls
are available to the user: a scrolling bar and play/pause button below the main frame,
and keyboard controls for all five sounds simultaneously (play/pause with the space bar,
stop/reset with the M key). Five yellow disks are situated in a fixed position within the
main frame. Each corresponds to a verbal descriptor of piano timbre (written inside).
The users must then drag-and-drop each audio excerpt (grey disk) to the timbre descrip-
tor (yellow disk) they deem most appropriate. The users are allowed to play and listen
to all five excerpts as long and as many times as they wish, and can switch seamlessly
from one excerpt to another. Once all five excerpts are assigned to a timbre descriptor,
the validation button turns from red to green, and the user can follow on to the next page.
This final, pre-validation state of the testing page is illustrated in Figure 6.4.
After a first page of general instructions, each participant was presented with a train-
ing page, which functioned like the testing page. It contained the five excerpts, each
expressing a different timbral nuance, from the fourth pianist’s performances of one
piece, chosen at random among the four pieces. Once the participant had familiarised
themselves with the interface and the audio stimuli, assigned all the excerpts to a timbre
descriptor-disk, and validated the training page, they were offered the choice to retry the
training (with excerpts from a different piece) or to proceed with the test. For the test
itself, 12 pages were successively presented, each containing five excerpts correspond-
ing to the five timbral nuances expressed by one pianist in one piece. The presentation
order of these [piece × pianist] combinations was randomised, as was the initial assignment
of the five excerpts/timbral nuances to the five grey disks.
As all the participants used French as primary language, all the text and timbre de-
scriptors in the interface were written in French. Each participant’s answers were auto-
matically output in text files. Each excerpt (designated by pianist, piece and timbre) was
set in correspondence with the timbre descriptor assigned by the participant.

Figure 6.4: Page of the testing interface in its final, pre-validation state, with each excerpt
(grey disk) assigned to a different timbre descriptor (yellow disk).

6.3 Results

6.3.1 Timbre identification rate per participant

The individual timbre identification rates for the 15 participants are displayed in Fig-
ure 6.5. On average, the overall timbre identification rate across all participants, pieces,
performers and timbres (900 answers in total) is 0.627, more than three times the chance
level (0.2) — which is highly significant (p < 10⁻⁴). The standard deviation between
participants is 0.115, with individual identification rates ranging from 0.5 to 0.85.
Individual identification rates are significantly above the 0.2 chance level (according to
a binomial distribution) for all participants.
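The binomial significance of an individual rate against the 0.2 chance level follows from the one-sided binomial tail. A minimal sketch, applied to the lowest individual rate reported above (0.5, i.e. 30 of 60 stimuli correctly identified):

```python
from math import comb

def binomial_p_above_chance(correct, n, chance=0.2):
    """One-sided binomial tail: probability of observing at least
    `correct` successes out of `n` trials at the given chance level."""
    return sum(comb(n, k) * chance**k * (1 - chance)**(n - k)
               for k in range(correct, n + 1))

p = binomial_p_above_chance(30, 60, 0.2)
print(p < 1e-4)  # True: even the lowest-scoring participant is far above chance
```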

Figure 6.5: Timbre identification rate per participant. 60 stimuli per participant, five
different timbre descriptors. The rightmost bar shows the average timbre identification
rate and 95% confidence interval (±2 S.E. between participants).

6.3.2 Effects of piece, performer and timbre of the stimuli on identification rates

The effects (on the identification rates for each stimulus) of the piece performed, of
the performer, and of the timbre expressed (and of the interactions between these three
factors), were explored in SPSS® with the Generalized Estimating Equations method
— an iterative procedure of generalised linear modelling able to process binomially dis-
tributed dependent variables, i.e. for this study the binary timbre identification in each
stimulus by each participant. Piece, performer and timbre (and their interactions) were
used as within-subject variables and factors. 2

2. In further detail, the GEE model was built with an identity link function for the dependent variable.
Factor significance was evaluated with the Wald test. Ten dummy cases were added to eliminate the
singularities in the factor combinations for which all 15 participants answered correctly.

The GEE model revealed significant effects of Performer (χ²(2) = 16.066, p < 10⁻³),
Timbre (χ²(4) = 80.768, p < 10⁻³) and of the interactions Performer × Timbre
(χ²(8) = 39.449, p < 10⁻³), Piece × Timbre (χ²(12) = 38.146, p < 10⁻³) and Per-
former × Piece × Timbre (χ²(16) = 5.4 · 10¹³, p < 10⁻³). The most relevant of these
results is the major effect of timbre (in itself and in interaction with other stimulus fac-
tors), which means that the identification rate depends on the timbral nuance to be identi-
fied — i.e. some timbres (Dry and Bright) were easier to identify than others. The effect
of the timbre-by-piece interaction is highlighted in Figure 6.6, and the effects of both
timbre and timbre-by-pianist interaction are highlighted in Figure 6.7. Furthermore, the
effect of the performer was significant for the identification of the timbre in a stimulus.
Timbre was easier to identify in some pianists’ performances (RB and FP) than others
(BB) (see Figure 6.7, green bars).
Yet timbre identification rates remain significantly above chance (at the 5% level
at least) for each pianist, each piece, and each timbre, and for each combination of
piece-by-performer, performer-by-timbre, and all but one (Piece no.2 with a Dark tim-
bre) piece-by-timbre combinations. Identification rates also remain significantly above
chance for 46 of the 60 piece-by-performer-by-timbre combinations (76.7%), even with
only 15 data points for each such combination.

6.3.3 Confusion matrix

The timbre-wise confusion matrix presented in Table 6.I sheds another light on the
results, by displaying the patterns of errors and ‘false alarms’ in identifying the timbres
featured in the stimuli.
Dry-timbre performances were the most accurately identified (0.856 rate). In most
errors, Dry performances were deemed Bright. Dry answers were also mistakenly used
most with Bright performances. Likewise, Bright performances were mostly identified
correctly (0.8 rate). Most errors show a confusion with Dry, and some with Round. The
same is true of the mistaken uses of Bright as an answer. As for Round performances,

Figure 6.6: Identification rate per piece by timbre. All rates are significantly above chance
(p < 10⁻³) except for Dark-timbre performances of Piece no. 2. Error bars show the 95%
confidence intervals around averages per piece (±2 S.E. between timbres).

Table 6.I: Timbre-wise confusion matrix and timbre identification rates (180 answers
and stimuli per timbre).

Timbre        Answers                                 Identification
performed     Dry   Bright   Round   Dark   Velvety   rate
Dry           154     19       4       2       1      .856
Bright         17    144      11       4       4      .800
Round           7     13     104      33      23      .578
Dark            0      4      42      72      62      .400
Velvety         2      0      19      69      90      .500
Overall                                               .6267
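As a check, the identification rates in Table 6.I follow directly from the confusion counts (diagonal counts divided by row totals); a short numpy sketch:

```python
import numpy as np

# Confusion counts from Table 6.I: rows = timbre performed, columns =
# answers, both in the order Dry, Bright, Round, Dark, Velvety.
confusion = np.array([[154,  19,   4,   2,   1],
                      [ 17, 144,  11,   4,   4],
                      [  7,  13, 104,  33,  23],
                      [  0,   4,  42,  72,  62],
                      [  2,   0,  19,  69,  90]])

rates = confusion.diagonal() / confusion.sum(axis=1)    # per-timbre rates
overall = confusion.diagonal().sum() / confusion.sum()  # overall rate
print(np.round(rates, 3))   # .856, .800, .578, .400, .500
print(round(overall, 4))    # 0.6267
```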

the identification rate is still relatively high (0.578). In cases of errors, Round perfor-
mances were identified as Dark (18.3%), Velvety (12.8%), sometimes as Bright (7.2%)
and quite rarely as Dry (3.9%). Round as an answer was mistakenly used most for Dark

[Bar chart: identification rate per timbre (Dry, Bgt = Bright, Rnd = Round, Drk = Dark,
Vel = Velvety) for each pianist (RB, BB, FP) and overall, with per-timbre and per-pianist
averages, the 0.2 chance level, and significance above chance marked (** p << .01,
* p < .05).]

Figure 6.7: Identification rate per timbre, by performer and overall. The three green bars
(and horizontal lines) represent the averages per performer. The five dark red bars on the
right represent the averages per timbre performed. Error bars show the 95% confidence
intervals (±2 S.E. between pianists or timbres).

performances (more than half the misuses of Round). Dark performances were the least
accurately identified (0.4 rate). They were considered Velvety nearly as often (0.344
rate). They were identified as Round often as well (23.3%), yet hardly ever as Bright,
and never as Dry. Dark answers were actually employed about as often for Velvety
performances (0.383 rate) as for Dark ones, with other misuses of Dark occurring with
Round performances. Finally, Velvety performances were correctly identified half the
time (0.5 rate), with the vast majority of errors (76.7% of them) due to a confusion with
Dark. Misuses of Velvety as an answer were also most frequent with Dark performances,
and sometimes with Round ones.

6.4 Discussion

In the end, the main conclusion to draw from these preliminary results on the auditory
perception and identification of piano timbre is that advanced-level pianists are indeed
able, for the most part, to correctly identify the timbral nuances highlighted in performances
of short piano pieces. The results thus indicate a fair agreement between the meaning
attached to timbre descriptors by the performers and the meaning associated with the same
timbre descriptors in describing the audio recordings of the performances. In particular,
different semantic regions of piano timbre description can be consensually understood
from production to perception. These four regions can be respectively associated with
the terms Dry, Bright, Round and Dark/Velvety.
It appears however that the five timbre descriptors, used both as performance instructions
(production) and as identification labels (perception), reached different degrees of
agreement and identification accuracy. Dry and Bright timbres were
indeed very easy to distinguish. Round was also rather well identified, despite some
confusion with Velvety and mostly Dark. Meanwhile, Dark and Velvety timbres were
largely mixed up, with Dark also sometimes confused with Round. These patterns of
confusion between timbre descriptors are mostly consistent between the subsets of re-
sults by performer and by piece, which indicates the confusions may stem from the
understanding or general perception of the terms. Additionally, the timbre identifica-
tion rates differed depending on the performer of the audio recordings, both overall and
by timbre. It first suggests that the performers may have employed different degrees of
expression in highlighting the different timbral nuances (some with more ‘exaggeration’
than others). Some timbral nuances were more difficult to identify in the audio record-
ings of only one of the three performers (e.g. the Dark nuance was least identified with
the third performer), which suggests that the performers may have conceived of and pro-
duced the same timbral nuances slightly differently.

As was the case in the pilot study presented in Chapter 4, the performances recorded
were of very short pieces, with only one timbral nuance expressed throughout. The per-
formances and their audio recordings thus cannot offer a degree of ecological validity on
par with an actual concert performance of a full musical work. Yet they may accurately
reflect the piano teaching practice of short masterly demonstrations of a specific timbre
followed by attempts at correct reproduction by the student. Consequently, the experi-
mental control over the musical context of the performances and their audio recordings
may have preserved enough ecological validity for the preliminary findings to remain
generally relevant.
As the study was conducted in French, the participants used verbal descriptors in
French (Sec, Brillant, Rond, Sombre and Velouté) to identify the timbral nuances ex-
pressed in the audio excerpts. It may thus be argued that slight differences in meaning (as
regards timbre) between a verbal descriptor in French and its translation in English may
have caused the results to differ had the study been conducted in English. However, the
subtle shifts in meaning that can occur between the French and English versions of a ver-
bal descriptor remain very small compared with the different meanings held by different
descriptors — especially as the five descriptors were chosen to give the most thorough
account of the whole semantic space of piano timbre and held very different meanings.
Consequently, the language in which the timbre descriptors were presented presumably
would not have affected how the audio excerpts were identified. This question shall be
examined further. Perception tests will be extended to English-speaking participants,
and the influence of the language in which the timbre descriptors are presented will be
assessed. On the other hand, unlike in the pilot study presented in Chapter 4, the 15
participants were of various origins and had studied in different piano schools (although
seven were or had been affiliated with Université de Montréal). The preliminary results
are thus more generalisable, and less confined to a local piano school than those of the
pilot study.
In conclusion, while these preliminary results do not suggest a complete, utterly
precise consensus on the exact meaning in sound that the timbre descriptors studied
here hold for pianists, they still clearly outline the common ground
among the 15 participants on the general meaning, perception, identification and under-
standing of piano timbre.
In the near future, more participants are to take part in the perception test. The ten-
dencies shown in the preliminary results will be examined against new data, and more gener-
alised, possibly different tendencies may emerge. Moreover, the information highlighted
in this study about the perception and identification of piano timbre shall be further set
in relation with other aspects of the expression of piano timbre, including its production
and gestural control. As we will see in Chapter 9, the subtleties in the production of the
performances whose audio recordings were used as stimuli in this study were explored,
and gestural characteristics of the performances and their timbres were identified. By ex-
ploring the correlations between the timbre identification rate in each performance and
its gestural features, the performances which best represented each timbral nuance will
be identified, and the production and gesture strategies best able to convey the expression
of each timbre may be revealed. We may thus understand how a certain timbral nuance
can be best produced and controlled by pianists’ gestures for their timbral intention to
be most consensually and accurately perceived.
Part III

Production and gestural control of piano timbre

CHAPTER 7

SCIENTIFIC STUDIES OF PIANO PERFORMANCE AND TIMBRE PRODUCTION

With the core characteristics of the verbal description of piano timbre now quan-
titatively explored and its main adjectival descriptors identified and mapped out, this
dissertation now focuses on the production of piano timbre.
This chapter provides an overview of the relevant scientific literature on piano perfor-
mance. After briefly discussing the epistemological tenets of the empirical, quantitative
and experimental research on music performance, its concepts and methodologies, we
will focus on the results and insights from piano performance studies, especially with
regards to timbre, touch and tone.
This chapter gathers relevant information in piano performance studies about the
methodologies to employ in understanding pianists’ gesture and control in expressive
performance, about the specific expressive parameters, patterns and gesture used by pi-
anists, and about how such precise parameters of gesture and touch can serve the pro-
duction of tone and timbre.

7.1 Illustrated epistemological perspective on performance studies

7.1.1 History of empirical music performance studies

A thorough review of “Music Performance Research at the Millennium” (up to 2002)
is given in Gabrielsson (2003) in an updated follow-up to Gabrielsson (1999). The fol-
lowing paragraphs present an augmented summary of his review, oriented towards our
research goals and complemented by some alternative perspectives.
Up to the twentieth century, most of the information on music performance came
from observations and advice on proper performance, written in musical treatises, letters,
reviews, diaries, or annotations in scores. Then starting in 1900 and peaking in the 1930s,
the first empirical studies of music performance were conducted, under a broad affiliation
with the nascent discipline of psychology. Probably the most significant work of this
era is contained in Seashore (1936). Those studies however were essentially focused
on physical measurements of music performance, as a rather direct application of the
technological means accessible to researchers in those times. After a hiatus through the
troubled times of the 1940s and early 1950s, music performance research was revived
in the 1960s, with better equipment (and recording technologies). Music performance
research has since been growing exponentially, 1 especially since the mid-1970s and the
dawning of the digital age.
Now, empirical music performance research covers a wide range of topics, and differ-
ent segmenting paradigms have been proposed. Sloboda (2000, p. 398) has highlighted
two major components of skilled musical performance, separate although interacting:
performance technique (the mechanics required in competently playing an instrument)
and performance expressiveness (intentional variations in performance parameters used
by performers to convey artistic meaning and emotions to the audience and the listen-
ers). Both of these components have been extensively covered in empirical music perfor-
mance research, and can be considered as relevant for our own research. Yet Sloboda’s
definition of performance expressiveness seems too limited, as it may reduce expres-
sion to deviations from a “neutral” reference (e.g. the score). A deeper perspective on
the expression in performance is provided by Clarke (1995), who applied Peirce’s semi-
otic differentiation of index, icon and symbol. Index refers to the direct signification
in sound of the physicality of the performance — e.g. the sonic trace of the effort ap-
plied by the performer. Meanwhile, icon holds an implicit signification, such as a rubato
applied to marking phrase structure. These two aspects result in the actual sound pro-
duction in expressive performance. I also propose to highlight the following two facets
1. For instance, whereas Gabrielsson could find about 500 publications on empirical musical research
from the beginning up to the mid-nineties, he found 200 new works published between 1995 and 2002.
of performance research, depending on the object: studies on the act of performance
itself (physical measurements, models, motor processes and physiology) and studies of
the process of performance (more psychology-oriented, including planning schemes and
practice, teaching and learning, sight-reading, evaluation of performance quality and
psychological and social factors at large). Those two facets however are not intended to
cover all the objects and topics explored in empirical performance research, which are
far too numerous and diverse to be properly represented in any simple dichotomy. Our
purpose here is to highlight that the object of our research lies in the act of performance
(and not in the process that leads to it). For an exhaustive list of reference publications on
empirical music performance, one can refer to Gabrielsson (2003). Let us now present
different research topics that are related to the study of the act of performance.
Measurements of performance are the oldest, and by far the most explored, topic.
The aspects of performance most measured relate to timing, dynamics, other instrument-
specific criteria (vibrato in violin and voice, touch in piano performance) and perfor-
mance errors.
From there, other studies were oriented toward building models of expressive mu-
sic performance (Goebl et al., 2008). Such models can either stem from measurement
data through the analysis/synthesis paradigm (Risset and Wessel, 1999; Wessel, 1979),
be based on generative rules defining the performance constraints for the model to re-
spect (Friberg, 1991; Sundberg et al., 1983, 1989), or use machine learning (Flossmann
and Widmer, 2011; Lauly, 2010; Widmer et al., 2009). Models then generally require
perceptual validation through listening tests. Expressive performance modelling has
recently attracted major interest — illustrated for instance by Rencon, 2 an annual com-
petition where the latest expressive performance models are to be presented and judged.
An overview of this field is presented in Kirke and Miranda (2013).

2. Musical Performance Rendering Contest for Computer Systems (Hashida et al., 2008; Hiraga et al.,
2002). http://renconmusic.org/.
Other studies have examined the performer side of expressive performance. Topics
explored thus far include the motor processes of performance, multimodal feedback, ex-
pressive movements, and physiological matters. Although research on motor processes
in performance may concern the cognitive theories of motor skill and motor program-
ming (Palmer, 1997) and their neuropsychological interpretation (thus the process of
performance), motor processes in performance can also be studied in and of themselves
— e.g. the fingering techniques in piano performance (Clarke et al., 1997; Parncutt et al.,
1997; Sloboda et al., 1998). Moreover, the role of multimodal feedback has been studied
in piano performance by suppressing or impeding performing pianists’ auditory (Pfor-
dresher and Palmer, 2002; Repp, 1999b) and/or kinaesthetic (Goebl and Palmer, 2008;
Repp, 1999a) feedback, and analysing the effects on select performance parameters.
While we may not delve here into the psychological, neurological, physiological,
teaching, learning, memorising, practising, sight-reading and improvising aspects of the
performance process, one can find thorough information on these topics (and comple-
ments to the information on the topics presented in this chapter) in Rink (1995, 2002)
and Parncutt and McPherson (2002).

7.1.2 Movement and gesture in performance

Movement is a fundamental requirement for producing sound with an acoustic instrument.
Only the movement of an instrumental gesture (e.g. plucking, blowing, hitting,
or pressing) can provide the energy that sets the vibrator (string, air column, membrane)
in motion. As movements can yield sound, music and movement are intertwined in
the web of sound-producing gestures that is music performance. Otto Ortmann stated
for instance, in the context of his research on piano playing, that “music is movement”
(Ortmann, 1929).
This relationship is also fairly evident in another sense of movement, as in many
traditional African musics body movement, dance and song are indissociable (Blacking,
1973). In Western culture as well, Descartes noticed the integration of musical tempo
and metric structure through body movement, stating “it even seems that music natu-
rally brings us to [. . . ] distinguish[ing] each musical bar exactly by the set gestures and
movements of our bodies” 3 (Descartes, 1768, p. 451). Even earlier, Saint Augustine had
defined music as ars bene movendi, the art of good motion (Risset, 1994, p. 720).
Movement is also prominently featured in music education, in piano teaching meth-
ods such as Dalcroze’s eurhythmics, with exercises to enforce the sense of musical
rhythm through movement, and the Alexander technique, which teaches physiologically
appropriate posture and an efficient use of the body so as to optimise movement.
Accordingly, movements in music performance are not limited to sound-producing
gestures. 4 From studying Glenn Gould’s performance gestures, Delalande (1988) iden-
tified three functional types of performance gesture: effective, ancillary and figurative
(Cadoz and Wanderley, 2000, pp. 77–78). Effective gestures are necessary to mechan-
ically producing the sound. Ancillary (or accompanist) gestures are imposed by phys-
iological constraints, physiologically easing the application of an effective gesture or
helpful to performers in their musical expression (keeping the rhythm, etc.). Figu-
rative gestures hold no apparent correspondence to sound production, but are perceived
by the audience and thus hold a symbolic role.
These indirect aspects of movement and gesture in performance have been exten-
sively studied in the last twenty years. The importance of ancillary gestures to perform-
ers was notably demonstrated by Wanderley et al. (2005) in analyses of solo clarinet
performances. Different ancillary gestures marked rhythmic and phrasing structures,
helping both the performers and the audience communicate and perceive these struc-
tural levels in the performances. Likewise, Davidson (1993, 2002, 2007), Davidson and
Correia (2002), Dahl (2005), and Thompson (2012) for instance have explored the com-
3. “Il semble même que la musique nous porte naturellement [. . . ] [à] distinguer exactement chaque
mesure de musique par les gestes et les mouvements réglés de notre corps”.
4. Movement and gesture are closely related concepts, as we may consider movement as the phe-
nomenological counterpart to gesture, and conversely gesture as the cognitive operation producing move-
ment (Cadoz and Wanderley, 2000). While this is admittedly a coarse definition of gesture–movement
relations, it remains sufficient for our purpose.
munication of expressive intentions through body movement (whether intended for the
audience or for co-performers), revealing that performers’ movements convey informa-
tion about their expressive intentions. Furthermore, the theory of embodied cognition
(Johnson, 1987) was applied to music, most notably by Leman (2008), in studying the
role of the body as mediator between the mind and the physical environment, i.e. in the
transformation of musical intentions into sound. The body would accordingly hold a
major role in music performance and production, and in music reception as well — with
listeners’ bodies as bottom-up catalysts to experiencing the music.
For a more complete account of the relations between music and gesture in perfor-
mance, the reader shall refer to Gritten and King (2006, 2011) and Godøy and Leman
(2010).

7.1.3 Data acquisition technologies for piano performance

In order to study expressive piano performance and gesture according to experimental
and quantitative principles, measurements are required. The following is a summary
of the various technologies used in scientific studies for measuring piano performance,
a thorough account of which is provided in Goebl et al. (2008).

7.1.3.1 External systems

Audio

The first, most employed and rather immediate method for measuring piano per-
formance consists in analysing audio recordings of piano performances. Such audio
recordings are widely available, both commercially and in open archives, and can be
easily obtained in experimental designs. 5 Acoustical analyses of the recorded signal can
then provide useful temporal, dynamic and timbral information about the performances.
5. Provided one possesses a piano, adequate microphones, recording apparatus and a recording room,
and can find willing pianists.
However, such extraction of audio-based performance parameters (McAdams et al.,
2004) can prove extremely delicate. The recording process itself exerts a vast influence
on the audio result — in microphone placement, for instance, the slightest displacement
can cause a major difference in sound. Thus two recordings can be reliably compared
only if the recording situation was identical. Moreover, as the piano is a polyphonic
instrument, several notes can be played together simultaneously or overlapped in time,
and held or altered with pedals. Audio recordings of solo piano performances contain all
this information mixed together. In particular, the onsets of the several notes in a chord
would not be timed precisely, nor could their individual dynamics be disentangled —
despite attempts by Repp (1993) for instance. Even in single-note contexts, the reported
timing accuracies would be larger than 5 ms — e.g. 10 ms in Cook (1987), 6.5 ms in
Repp (1990).
Only recently have methods arisen to extract precise piano performance parame-
ters from audio recordings. Audio note individuation technologies such as the Direct
Note Access algorithm — commercially available in the Celemony Melodyne® software
— can rather reliably (although not perfectly) separate individual notes in poly-
phonic audio. More impressively and thoroughly, John Q. Walker and colleagues in the
Zenph company 6 have managed to extract from old, noisy audio recordings (e.g. early
twentieth-century recordings of Rachmaninov) a comprehensive number of performance
parameters, thanks to proprietary acoustic analysis software (Walker et al., 2009). 7
While thus ostensibly achievable, this strategy of extracting piano performance fea-
tures from audio recordings remains extremely arduous, especially in the absence (as far
as I know) of open-source or commercially available software.
6. http://www.zenph.com.
7. ‘Re-performances’ (Seaver, 2010) were then produced by feeding the extracted performance pa-
rameters to a ‘player piano’ (see Section 7.1.3.2). Audio recordings of these ‘re-performances’, intended
to render exactly the original performance with a much higher sound quality, were made commercially
available. This endeavour did not prove profitable however — the company folded recently.
Video

Meanwhile, other researchers interested in pianists’ movements have made use of
video technologies. Video-recorded piano performances can be analysed in order to
extract performance parameters from pianists’ movements. These movements can be
traced and quantified from standard video recordings with dedicated image- and video-
processing software (Camurri et al., 2000; Jensenius, 2007; Payeur et al., 2006). Other
video motion capture technologies can also be used (Thompson, 2012). In motion cap-
ture, light-reflecting markers are positioned on the participant at the key points and joints
whose movements are to be measured, and an array of infrared or photo-emitting cameras is
used to track the markers’ motions in three dimensions. 8
The Microsoft Kinect® system, with its dual cameras (one standard for 2D RGB
recording and one infrared for depth sensing), can also serve as a non-invasive 3D
motion capture device — albeit one with coarse depth estimation (Hadjakos, 2012).
Such video recordings and motion captures were effectively used in assessing the in-
fluence of body movements on the perception of expressiveness in piano performance, by
comparing how expressiveness is perceived in either audiovisual, audio-only or video-
only stimuli, and by correlating perceived expressiveness with the amount of different
body movements. For instance, most recently Thompson (2012) confirmed Davidson’s
(1993) findings that visual perception of piano performance can effectively convey the
expressiveness of a performance and relies on the visual cues provided by head, neck
and shoulders movements.
However, these methods tend to be ill-suited to measuring a pianist’s hand and finger
movements at the keyboard (MacRitchie, 2011). Motion capture markers are gen-
erally too intrusive to be set on pianists’ fingers without affecting their playing; visual
masking between fingers or hands is bound to happen; and the subtle, three-dimensional
finger movements are hard to track from standard video recordings. It should be noted
8. Other motion capture systems involve active markers, but otherwise proceed accordingly.
that, in this dissertation, the exploration of gestural control in the expression of piano
timbre will only concern the effective gestures applied by pianists to the keyboard and
pedals. As such, those video-based procedures used in studying body movements and
figurative gestures are not applicable for our aims.
MacRitchie (2011) however has developed a custom, non-intrusive solution for track-
ing finger movement: ‘FingerDance’. UV-reflective paint was used to apply passive
markers on each finger and hand joint. A black-light neon tube was set up above the
keyboard to make the markers fluoresce. Performances were filmed with a fast frame
rate camera, set high over the keyboard for an optimal, unimpeded view of the pianist’s
hands. Custom image-processing software was then used to extract a 3D skeletal-frame
model of the hands in the video recordings, and extrapolate with regard to morphological
hand constraints the missing information from occlusions (hands or fingers overlapping)
and from the third, z-axis dimension.

Sensors

Other motion capture systems however do not use video, instead acquiring data from
motion sensors such as accelerometers, gyroscopes or pressure sensors. By fixing such
sensors on pianists’ arms, hands and/or fingers, pianists’ gestures in performance can
thus be measured (Furuya et al., 2010; Grosshauser et al., 2012; Hadjakos, 2011). How-
ever, this procedure remains at least somewhat invasive, especially when fixing sensors
on pianists’ fingers. Moreover, it is difficult with such a procedure to differentiate
between effective and ancillary gestures — it was actually one of the main points of
Hadjakos’s (2011) thesis.
In order for the measurements to reflect the effective gestures alone as closely as possible,
they can be collected directly at the keyboard level, through key and pedal depressions
and hammer motions. The gestures effectively applied to the keyboard can thus be fully
measured via the energy they transfer toward the keyboard. This way, the ancillary
gestures used by pianists in movements separate from the keyboard action are left out.
Such ancillary gestures, from the involvement of the whole body to the fingertips, re-
main crucial to pianists, as they are involved in music embodiment, the self-perception
of the performance, the reception of the performance by the audience, and physiolog-
ical effectiveness (see Section 7.1.2 of this chapter). Regarding piano touch, they are
especially important as preparatory gestures, greatly contributing to an appropriate ex-
pressive touch (Doğantan-Dack, 2011). Yet the ancillary gestures in piano performance
can prove highly idiosyncratic (constrained for instance by the specific physiognomies
of different pianists, as well as by their own playing styles), and different movements of
this type could actually result in the same effective gestures as applied to the keys.
While the instrumental gestures applied on the keyboard may not be integrally ef-
fective toward actual sound production, they are still fully involved in pianists’ control
over their own performance, if only for the kinaesthetic feedback they provide (Goebl
and Palmer, 2008). We shall thus consider them as effective gestures of piano playing,
and consequently the most relevant to our studying the gestural control of piano timbre.
To obtain such measurements, though, embedded systems are required.

7.1.3.2 Embedded systems

In order to measure piano performance parameters via the movement of piano keys,
pedals and/or hammers, different systems embedded in pianos have been developed since
the early twentieth century, making use of the technologies available at the time.

Mechanical apparatus

Early studies used mechanical or electro-mechanical setups to collect piano rolls of
key depressions. Piano rolls were actually one of the first performance recording and
storage means, along with the phonograph. The first piano roll system developed for
research purposes (Binet and Courtier, 1895) used rubber tubes placed under the keys,
which when compressed by the keys would transfer air pressure to a cylindric graphical
recorder. Strokes of the different keys would thus be inscribed on a piano roll. The
piano rolls commercially distributed by companies such as Duo-Art were also used as a
data source. Heinlein (1929a) for instance studied pedalling in Duo-Art roll recordings
of four performances of Schumann’s Träumerei, and Vernon (1936) used Duo-Art rolls
to study chord synchronisation. However, as the recording procedures of commercial
piano rolls were withheld, their accuracy and validity are questionable (Goebl et al.,
2008, p. 4).
Meanwhile, Ortmann (1929) devised various custom, mechanical systems for record-
ing piano performance and key movement. His most advanced mechanical apparatus
involved springs and levers connecting the keys to a dynamograph and revolving drum,
and could acquire continuous measurements of key motion and represent them as curves
detailing the slightest fluctuations.
Others used optical systems. Hart et al. (1934) tracked hammer motion with an
apparatus including a light source, lenses, a cardboard shutter with two slits mounted on
the hammer shank, and a rotating film. With the light passing through the slits recorded
on film, the hammer trajectory was thus stored. A major technological achievement of
its time, the Iowa Piano Camera was developed by Henderson et al. (1936). It could
record hammer motion and velocity while at the same time marking note onsets and
offsets, and also pedal depression. While the method used was similar in essence to that
of Hart et al. (1934), it was much more technically elaborate. The slit in the shutter
would let the light pass through for the last 12 mm of the hammer course toward the
string, with the shutter blocking light before and after. Hammer velocity could thus be
inferred from the distance of exposure on the film — which corresponds to the time
taken by the hammer for its travel. The Iowa Piano Camera was thus used to analyse
several dynamic, rhythmic/temporal and structural features, with a temporal resolution
as low as 10 ms and 17 dynamic levels (Seashore, 1936).
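The velocity inference behind the Iowa Piano Camera can be sketched numerically. The film transport speed below is a hypothetical value chosen for illustration; the principle is simply that the length of the exposed streak converts, via the film speed, into the hammer’s travel time over its final 12 mm.

```python
def hammer_velocity_mm_per_s(exposure_mm, film_speed_mm_per_s, slit_travel_mm=12.0):
    """Infer mean hammer velocity from the exposed streak on the moving film.

    The shutter slit admits light only over the last `slit_travel_mm` of the
    hammer's course, so the exposure duration equals the hammer's travel time:
        t = exposure_mm / film_speed_mm_per_s,   v = slit_travel_mm / t
    """
    travel_time_s = exposure_mm / film_speed_mm_per_s
    return slit_travel_mm / travel_time_s

# With a hypothetical film speed of 500 mm/s, a 3 mm streak implies the hammer
# covered its last 12 mm in 6 ms, i.e. at 2000 mm/s (2 m/s).
print(hammer_velocity_mm_per_s(exposure_mm=3.0, film_speed_mm_per_s=500.0))
```

A longer streak thus means a slower hammer, and the film itself provides the time base, which is why the method needed no electronic clock.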
Digital technologies

After more than forty years of status quo, Shaffer (1980) renewed the optical system
approach by using the more modern photocell technology. A major advantage was that
mounting two photocells per key into the piano action let the instrument remain fully
playable, whereas the older systems required at best a few adjacent keys to be blocked,
and at worst one piano action/key to be dismounted from the instrument. Now, the data
output by the photocells (note onsets and offsets) could be stored digitally with 12-bit
resolution.
Thus, the age of digital technologies dawned on piano performance research. In
1983, the MIDI (Musical Instrument Digital Interface) standardised protocol was re-
leased, which would soon become universally used by all digital keyboard controllers.
Studies then followed that used digital synthesizers and pianos to collect MIDI data —
note number, onset, duration and 7-bit velocity information — and thus explore aspects
of expressive piano performance such as timing (Palmer, 1989), its relation to tempo
(Desain and Honing, 1994; Repp, 1994), or the legato articulation (Repp, 1995a). How-
ever, the digital keyboards used in such studies do not offer a proper, realistic ‘touch’
comparable to the action of an acoustic piano.
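As a concrete illustration of what MIDI data offer an analyst, the following minimal sketch (hypothetical names, not any study’s actual tooling) models the note-level information of a MIDI stream and derives one of the timing measures used in such studies, the inter-onset intervals.

```python
from dataclasses import dataclass

# A minimal sketch of the note-level information a standard MIDI stream
# yields for performance analysis: pitch, onset, duration, 7-bit velocity.
@dataclass
class NoteEvent:
    note_number: int   # MIDI pitch, 0-127 (60 = middle C)
    onset_s: float     # note onset time, in seconds
    duration_s: float  # offset minus onset
    velocity: int      # 7-bit key velocity, 0-127, standing in for dynamics

def inter_onset_intervals(notes):
    """A basic timing profile: the gaps between successive note onsets."""
    onsets = sorted(n.onset_s for n in notes)
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

melody = [NoteEvent(60, 0.00, 0.45, 64),
          NoteEvent(62, 0.50, 0.45, 72),
          NoteEvent(64, 1.05, 0.40, 80)]
print(inter_onset_intervals(melody))  # two intervals: 0.50 s and 0.55 s
```

Everything beyond these four fields — continuous key position, the shape of the touch, the acoustic result — is absent from such a representation, which is precisely the limitation of keyboard-derived MIDI data.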
This problem was circumvented with computer-controlled grand pianos (Coenen and
Schäfer, 1992), which can output MIDI data from the information on the motion of the
hammers gathered by embedded optical sensors. 9 The widely used Yamaha Disklavier
system was released in 1989. Moog proposed the Piano Bar system, to be installed over the
keyboards of acoustic pianos. Bösendorfer also offered its SE system (Moog and Rhea,
1990), of better quality and timing accuracy (Goebl and Bresin, 2003) — and much more
expensive, thus more seldom used. These digital MIDI recording acoustic pianos vastly
eased the quantitative piano roll-like data acquisition of some piano performance param-
eters, and were employed in many systematic studies by Repp (1995b, 1996b, 1997a),
9. In addition to such pianos having solenoids mounted on each key, which can be MIDI-controlled
for a digital player piano experience.
Bresin and Battel (2000), Windsor et al. (2001), Goebl (2003), Goebl et al. (2010)
(among many others) of such expressive features as timing, dynamics and articulation,
which have given us a better grasp of the role of gesture in expressive piano performance.
Yet the MIDI standard provides only limited information (Moore, 1988) (note on-
set, duration, velocity) that cannot suffice to detail the subtleties of piano touch control.
Various solutions have been envisioned for digital music notation and representation
formats (Selfridge-Field, 1997). As for performance data acquisition, custom solutions
have been developed, with keyboard interfaces responsive to multiple parameters of a
pianist’s touch — Moog’s Multiply-Touch-Sensitive Keyboard (Moog and Rhea, 1990),
and more recently McPherson and Kim’s (2011) multidimensional gesture sensing key-
board.
Meanwhile, Bösendorfer has internally developed and commercialised the CEUS
digital recording and reproducing system. 10 With its high-precision optical sensors and
integrated electronics and computer, the CEUS system can track key and pedal positions
and hammer velocities with unprecedented accuracy (see Chapter 8, Section 8.2,
p. 172). I had the privilege of using this prodigious tool, unique in offering such a high
level of precision, to measure piano performance parameters and the effective
instrumental gestures from which they stem. Such precision is essential for studying
piano touch and the finest nuances from which timbre expression arises.

7.2 Findings on piano performance

Among all the studies on piano performance that followed, differing in objectives
and methodologies, several have unearthed results which are either directly or indirectly
relevant to our current research. This section first presents findings on the identifiable
patterns in expressive piano performance and on the properties of synchrony, articula-
tion, fingering and pedalling that bear significant interest toward understanding the
gestural control of piano timbre. The scientific literature on piano touch, tone and timbre
is then explored, and directly constitutes the knowledge base from which our research
intends to go forward.
10. The meaning of the CEUS acronym is rather obscure, but it was mentioned in an early version of
the operating manual as standing for “Create Emotions with Unique Sound”.

7.2.1 Expressive patterns in piano performance

Scientific research on the patterns followed by diverse expressive parameters in piano


performance has been essentially devoted to identifying the regularities one can expect
to find in the expressive deviations pianists employ in infusing performances of a piece
with character. It should be noted that expression in piano performance is not limited to
sheer deviations from the score, especially when considering the expression of timbral
nuances. Nevertheless, studying the expressive deviations from the score, despite its
limited scope, has provided useful insight.
Several comparative studies of piano interpretation and musical structure have thus
explored how the patterns followed by piano performance parameters, as measured fol-
lowing some of the methods previously described, related to the scores and their under-
lying structures, as highlighted through musical analysis. Henderson et al. (1936) thus
identified, in two performances of Chopin’s Nocturne no.6 (op.15 no.3) (a piece whose
musical form and phrasing they considered ‘obvious’), that accentuations in rhythm,
note duration, rubato, intensity, voice and chord synchronisation and pedalling (as mea-
sured through the Iowa Piano Camera) would correspond with structural patterns and
boundaries at different levels, from long multi-phrase sentences to intra-phrase accents.
Cook (1987), Repp (1992) and Shaffer (1980, 1992, 1995) also highlighted, in studying
expert performances of several well-known pieces, the correspondence of performance
timing and dynamics to musical structure — thus suggesting that performers, function-
ing first as music analysts, use the performance parameters in their control to infuse their
interpretations of the piece with character. Finally, MacRitchie (2011) recently demon-
strated that, in performances of two Chopin preludes, phrasing structures and underlying
expressive accents were reflected in tempo, dynamics and finger movements as well.
156

Patterns of expressive performance parameters were also compared with the local
structure, by which performances deviate from the score. Early on (Seashore, 1938),
such deliberate deviations from the score in piano performance were attested and proven
to follow, in timing and dynamics, precise and constant patterns — i.e. performed ac-
cents within the musical structure (Parncutt, 2003). Importantly, pianists also proved
highly consistent in reproducing the same expressive patterns in successive performances
of the same piece (Seashore, 1938). Those conclusions were later confirmed by Repp
in several studies conducted in the 1990s. The timing profiles extracted from MIDI
data acquired in performances of both Schumann’s Träumerei (Repp, 1992, 1995b) and
Debussy’s prelude La fille aux cheveux de lin (Repp, 1997c) proved quite similar — es-
sentially following parabolic patterns peaking at the melodic accents — even between
professionals and students, and were also largely repeatable by the performers. The
same consistency also proved true for dynamics (especially for the right hand) (Repp,
1996a). Likewise, Repp’s study of timing and dynamics in more than 100 commercial
audio recordings of Chopin’s Etude in E major (bars 1 to 5 only) (Repp, 1998a, 1999c)
revealed all performances were constrained within a narrow expressive domain, inside
which the performances only differed in that pianists used somewhat different combina-
tions of a few expressive timing and dynamic parameters.
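The parabolic timing profiles reported by Repp can be made operational with a least-squares quadratic fit to a phrase's inter-onset intervals. The sketch below is illustrative only: the IOI values are invented, and the pure-Python normal-equation solver is mine, not a method from these studies.

```python
def fit_parabola(ys):
    """Least-squares quadratic fit y ~ a*x^2 + b*x + c over x = 0..n-1,
    solved via the 3x3 normal equations (no external dependencies).
    Repp modelled expressive timing profiles with such parabolas."""
    n = len(ys)
    xs = range(n)
    s = [sum(x ** k for x in xs) for k in range(5)]          # power sums
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 system of normal equations
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gauss-Jordan elimination (the Gram matrix is positive definite,
    # so naive pivoting is safe here)
    for i in range(3):
        piv = m[i][i]
        m[i] = [v / piv for v in m[i]]
        for j in range(3):
            if j != i:
                f = m[j][i]
                m[j] = [vj - f * vi for vj, vi in zip(m[j], m[i])]
    return m[0][3], m[1][3], m[2][3]  # a, b, c

# Hypothetical IOIs (ms) for an eight-note phrase, slowing toward both ends
iois = [520, 480, 455, 445, 450, 470, 505, 560]
a, b, c = fit_parabola(iois)
print(a > 0)  # U-shaped profile: positive quadratic coefficient -> True
```

A positive quadratic coefficient captures the phrase-boundary lengthening that these timing studies describe; comparing fitted coefficients across performers is one simple way of quantifying how far they share a common expressive pattern.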
Such studies thus inform us that the expressive parameters of piano performance, es-
pecially timing and dynamics (the most easily accessible), follow rules stemming from
musical structure (phrasing and accents) and from the ‘acceptable’ deviations from the
score. Consequently, even as pianists can each follow idiosyncratic expressive patterns
which they can themselves repeat identically, some common expressive ground is sys-
tematically shared among all performers of the same piece.
Even though expressive deviations from the score cannot account for all the expres-
sive strategies pianists can employ (cf. Clarke, 1995), the patterns they follow can es-
tablish more general expectations regarding expressive patterns at large. Considering
timbre production as an expressive parameter in piano performance, we shall expect to find common patterns among different performers in the gestural control of piano timbre.

7.2.2 Synchrony, articulation and pedalling in piano performance

Meanwhile, other studies have focused on specific elements of piano performance,
whose role in expressive performance may suggest they can also be conjured in timbre
control.
Synchrony was first quantitatively studied in chords by Vernon (1936). The timing of
notes in chords was explored in Duo-Art roll recordings of eight performances by four
pianists. With a 10 ms accuracy in note timing (thus a discretisation of timing in 10-
ms units), as many as half the chords were identified as asynchronous — most of those
by several timing units, which suggests that most such deviations were intentional. The
synchrony in chords was shown as idiosyncratic and varying between different passages,
yet locally consistent for each pianist. The synchrony in chords was also positively
correlated with tempo and its stability: the slower or the more variable the tempo becomes,
the more asynchronous (‘rolled’) the chords become. Right-hand asynchronous chords
were actually explained by a timing emphasis on the melody note — set apart usually
by being played early by several units, but sometimes played late, mostly at the end of a
phrase to simulate a ritardando. The asynchrony between hands was explained by bass
note anticipation, aimed at avoiding the obscuring of higher notes. Finally, the overall
asynchrony in chords did not prove to be related to the musical structure.
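Vernon's 10-ms timing units suggest a simple operational measure of chord asynchrony: quantise each onset to units and take the spread. The function and the example onset times below are my own illustration, not Vernon's procedure.

```python
def chord_asynchrony(onsets_ms, unit_ms=10):
    """Quantise the note onsets of one chord to timing units (Vernon worked
    with 10-ms units from piano-roll recordings) and return the spread in
    units. A spread of 0 registers as synchronous; a spread of several
    units suggests an intentionally 'rolled' chord."""
    units = [round(t / unit_ms) for t in onsets_ms]
    return max(units) - min(units)

# Hypothetical chords: one nearly simultaneous, one with a note clearly early
print(chord_asynchrony([1000, 1004, 1007]))   # -> 1
print(chord_asynchrony([1000, 1032, 1036]))   # -> 4
```

On this criterion, a one-unit spread is indistinguishable from measurement granularity, while a multi-unit spread matches Vernon's reasoning that such deviations were intentional.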
Much later, in studying note onset asynchronies from MIDI recordings of hammer
motion on a Disklavier, Repp (1996b) found (1) that some pianists would consistently
lead with the left hand, (2) that inner notes of within-hand chords tended to lag behind
outer notes, (3) that melody notes led the others, especially those from the same-hand
chord, and (4) that such lead times were strongly correlated to MIDI velocity. The third
result was also identified by Palmer (1989, 1996b). Through measurements of key-bed
contact, the melody note was shown to lead by around 20 ms and between 20 and 40 ms
respectively. This melody lead effect was envisioned as an expressive strategy, as it was less salient in performances instructed to be played ‘unmusically’. Yet, following from Repp’s fourth result, this effect of melody lead could be explained as a consequence
of the dynamic accentuation of the melodic voice. This effect was confirmed by Goebl
(2001, 2003), who found that melody notes consistently led the other voices by about
30 ms, and that the earlier the relative onset of the melody note, the greater the intensity.
This finding supports the understanding of melody lead as a velocity artefact: the dy-
namic emphasis (higher hammer velocity) on the melody note is produced with a faster
key depression, which also results in an earlier hammer-string contact. Furthermore, regression from directly measured hammer travel times and velocities showed that the keystroke onsets of all the notes in a chord are almost simultaneous.
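The melody lead discussed above admits a straightforward operational definition: the amount by which the melody note's onset precedes the mean onset of the other chord notes. The function and example timings are my own sketch, not data from these studies.

```python
def melody_lead(chord):
    """Melody lead of one chord, in ms: mean onset of the accompanying notes
    minus the melody note's onset (positive = melody leads), in the spirit
    of the measures used by Repp, Palmer and Goebl. Input: a list of
    (onset_ms, is_melody) pairs; assumes exactly one melody note."""
    melody = next(t for t, is_mel in chord if is_mel)
    others = [t for t, is_mel in chord if not is_mel]
    return sum(others) / len(others) - melody

# Hypothetical chord: melody note about 30 ms ahead of the accompaniment
print(melody_lead([(970, True), (1000, False), (1002, False)]))  # -> 31.0
```

Correlating this value with the melody note's MIDI velocity across many chords would reproduce the kind of analysis by which melody lead was attributed to dynamic accentuation.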
As for between-hand synchronisation, Shaffer (1981) found (although in studying
only one performance) that the right hand lagged on the first beat but led elsewhere.
Goebl et al. (2010) performed an automated computational analysis of Nikita Maga-
loff’s MIDI-recorded performances of the complete works of Chopin. The distribution
of between-hand chord asynchronies over the whole corpus showed many cases where
the right hand led (albeit to a slight degree, attributable to melody lead) and slightly
more cases overall where the left hand led (which results from bass anticipation, with
the lowest note played early).

Articulation has also been explored as an expressive device. Repp (1995a, 1997a)
found that in legato articulation, key overlap times (KOT) increased with register (both
between and within hands), step sizes (intervals between connected notes) and interval
consonance 11 , and decreased over an arpeggio sequence and (non-linearly) with tempo
— a faster tempo yields smaller inter-onset intervals (IOI), which in turn yield smaller
key overlap times. Bresin and Battel (2000) used key overlap ratios (KOR) of KOT
to IOI, and key detached ratio (KDR) of key detached time (KDT) to IOI, to show a
general trend in right-hand articulation (legato and staccato respectively) amid considerable variation between participants. KOR in legato depended on IOI, while staccato KDRs were consistently around 60%. Instructed changes in expressive character between performances also tended to be conveyed with the same articulation strategies by all pianists. The most relevant ideas in these studies are actually the concepts of inter-onset interval, key overlap/detached time and key overlap/detached ratio as quantitative specifiers of articulation.
11. i.e. overlaps were longer between tones separated by a minor third than by a minor second.
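These quantitative specifiers are easy to express in code. The following sketch computes IOI, KOT/KOR and KDT/KDR for a pair of successive notes; the function signature and the example timings are mine, assumed only for illustration.

```python
def articulation_metrics(on1_ms, off1_ms, on2_ms):
    """Quantify the articulation between two successive notes, following the
    measures used by Repp and by Bresin & Battel: inter-onset interval
    (IOI), key overlap time (KOT, when the first key is still down at the
    second onset) or key detached time (KDT, when there is a gap), and
    their ratios to the IOI (KOR and KDR respectively)."""
    ioi = on2_ms - on1_ms
    kot = off1_ms - on2_ms   # > 0: overlap (legato); < 0: gap (detached)
    if kot >= 0:
        return {"IOI": ioi, "KOT": kot, "KOR": kot / ioi}
    kdt = -kot
    return {"IOI": ioi, "KDT": kdt, "KDR": kdt / ioi}

# Hypothetical note timings in ms
print(articulation_metrics(0, 520, 500))  # legato: KOT = 20, KOR = 0.04
print(articulation_metrics(0, 200, 500))  # staccato: KDT = 300, KDR = 0.6
```

The second example reproduces the order of magnitude reported above: a staccato KDR around 60% of the inter-onset interval.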

Meanwhile, the role of pedalling in expressive piano performance has been explored
much less. An integral yet intriguing part of piano performance, the role of damper-
pedalling is less apparent to an audience than keyboard action and touch, yet crucial to
the appreciation of a piano performance, inducing flow and legato (Heinlein, 1929b).
However, pedalling patterns recorded and analysed by Heinlein (1929a) proved highly
idiosyncratic, a fact consistent with the stated lack of pedal teaching at the time. Ped-
alling was used in different roles: melodic or harmonic style (depending on whether the
pedalling patterns chosen by the performer match the harmonic or melodic structure of
the piece), ‘pumping’ (shown by a high frequency of pedal depressions/releases), syn-
copation (anticipation or delay on note onsets). Most surprisingly, pedalling patterns
also proved impossible for pianists to replicate reliably in consecutive performances.
Pedalling was shown as a highly integrated response into a unity of effect, intrinsically
related to fingering patterns — thus not integrated into the abstract musical image but in
the motor process (Heinlein, 1930).
More recently, Repp (1996c, 1997b) found no absolute or relative overall invariance
in the timing of damper pedal depression — with regard to key depressions and releases,
and as a function of global and local tempo. He did however observe consistent patterns
in same-performer repetitions of the same piece. Meanwhile, Palmer’s (1996a) case
study of a Mozart sonata showed good synchronisation of the sustain pedal with finger
movements, with a better synchronisation of sustain pedal releases to the following note
onsets than sustain pedal depressions after note onsets — as the latter are less urgent for musical purposes. As for the soft pedal, it was clearly used for reducing dynamics (i.e.
not for its other, frequently acknowledged role of shaping tone quality; see Chapter 2,
p. 39).
As far as I know, however, the role of piano pedalling as an expressive control pa-
rameter to convey different characters or timbral nuances has not been quantitatively
explored — although the acoustic effect of pedals on piano sound has been studied
(Lehtonen, 2010; Lehtonen et al., 2009, 2007) and is undeniable.

7.2.3 Timbre, touch and a single tone

An expressive piano performance parameter paramount to the control and expression
of timbral nuances and tone qualities is the concept of piano touch, whose various forms
and facets, and effect on key depression patterns, have been most notably assessed by
Ortmann (1929, pp. 175–375).

7.2.3.1 Touch and no tone(-quality)?

As we have seen at the end of Chapter 1 (p. 27), studies exploring piano touch and its
effect on tone quality have concluded that, as far as a single key/note/tone is concerned,
timbre, as the spectro-temporal envelope of the sound produced, cannot be effected in-
dependently of loudness. In particular, Hart et al. (1934) explored in quantitative detail
the chain of actions from piano touch, to hammer motion, to the sound produced and
its wave form. Piano touch was varied with a mechanical key striker, which replicated different key strokes while producing the exact same hammer velocity (one of several tested values), as directly measured with an optical system. As no difference appeared in the resulting
sounds, they only confirmed that pianists could not control tone-quality independently of
loudness. Loudness in turn has been shown to be determined only by hammer velocity
at string contact time. In this vein, Palmer and Brown (1991) identified a linear rela-
tionship from hammer velocity to peak sound amplitude, thus a logarithmic relationship
to sound level — i.e. intensity, the acoustic parameter underlying the psychoacoustical, frequency-weighted notion of loudness. As the hammer hits the string(s) in free flight,
released from the action mechanism for the last 1 to 3 mm of its course (Askenfelt and
Jansson, 1990, p. 54), hammer velocity at string impact is imprinted by the instant ve-
locity of key depression at hammer let-off. Thus, the tone-quality of a single note would
be controlled only by a single parameter accessible to the pianist, the key depression
velocity at the instant of hammer let-off. This is arguably true for the timbre of the
sound resulting from string vibrations, as conjectures such as the effect on the strings of
hammer-shank vibrations (at about 50 Hz), that some kind of touch can produce or ac-
centuate (Askenfelt and Jansson, 1991), were proven acoustically negligible (Askenfelt
et al., 1998; Hart et al., 1934).
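The chain from hammer velocity to sound level can be sketched numerically: with peak sound pressure linear in hammer velocity, as per Palmer and Brown (1991), level in dB SPL grows logarithmically with velocity. The calibration constant `k` below is invented for illustration.

```python
import math

def peak_spl_db(hammer_velocity, k=0.02, p_ref=2e-5):
    """Sketch of Palmer and Brown's (1991) finding: peak sound pressure
    grows linearly with hammer velocity (slope k is a made-up calibration
    constant, in Pa per m/s), so the sound level in dB SPL (re 20 uPa)
    grows logarithmically with velocity."""
    p_peak = k * hammer_velocity          # linear velocity -> pressure
    return 20 * math.log10(p_peak / p_ref)

# Under this linear model, doubling hammer velocity adds ~6 dB
d = peak_spl_db(4.0) - peak_spl_db(2.0)
print(round(d, 2))  # -> 6.02
```

The constant 6-dB increment per velocity doubling, independent of `k`, is exactly what "logarithmic relationship to sound level" implies.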

7.2.3.2 Contact noises

However, another possible source for the timbre of a single tone may be the contact
noises. In their determining that the timbre of a single tone could not be controlled
independently of loudness, Hart et al. (1934) did not account for the early contact noises,
as they only took in consideration the steady-state portion of piano tones, leaving out the
attack, and as they kept their acoustical analyses to rudimentary wave-form comparisons.
In particular, the sound of a piano tone at attack is not limited to sheer string vibrations.
Several contact and friction noises are involved, the three main of which result from
finger-key, key-keybed and hammer-string contacts (Parncutt and Troup, 2002, p. 291).
Such noise elements were adamantly posited as an integral part of piano tone and its quality by Ortmann (1925), then Báron (1958). Of those, finger-key contact is the most directly affected by different types of touch. 12
12. The key-keybed contact noise, while it was proven to audibly influence the timbre of a piano tone (Goebl and Fujinaga, 2008), can hardly be controlled independently of key velocity (Ortmann, 1925; Parncutt and Troup, 2002).

Two prototypical touch types are illustrated in the scientific literature, under diverse denominations: non-percussive vs. percussive (Ortmann, 1925, 1929), legato vs. staccato (Askenfelt and Jansson, 1991), soft vs. hard (Suzuki, 2003) or pressed vs. struck (Goebl et al., 2004, 2005). As these latter monikers suggest, a pressed touch designates a
key depression that starts with the finger already resting on the key surface, whereas for
a struck touch the finger hits the key from above, thus with some initial velocity. Con-
sequently, opting for a struck touch rather than a pressed touch adds finger-key contact
noise to a piano tone (Goebl et al., 2005; Kinoshita et al., 2007). This contact noise is
called ‘touch precursor’ as it precedes hammer-string contact by 40 ±20 ms (Askenfelt
and Jansson, 1991; Kinoshita et al., 2007). Its intensity relative to the subsequent piano sound was shown to vary from 30 dB below at pp and p dynamics to 15 dB below at f and ff (Kinoshita et al., 2007). This noise is not only audible, but even the sole
distinguishing audio feature between pressed and struck tones (Goebl et al., 2004). Put
otherwise, the timbre of a piano tone is affected by touch — and the amount of finger-key
contact noise it generates.

7.2.3.3 Characteristics of different types of touch

Moreover, those two types of touch yield very different key depression and hammer
velocity patterns (cf. Goebl et al., 2005). A pressed touch indeed presents gradually in-
creasing key and hammer velocities, until the hammer hits the strings just after reaching
full speed, a few milliseconds after the key has reached the keybed. On the other hand,
the struck touch first imprints a sudden high-velocity jerk to the key which does not im-
mediately transfer to the hammer, but produces the touch precursor noise. The key then
decelerates, slowing to a near-total stop while at the same time the hammer receives a
large acceleration, delayed by several milliseconds from the initial striking of the key.
Its peak velocity is reached at string contact, almost simultaneously to key-keybed con-
tact. At equal resulting intensities, such a keystroke takes at least 20 ms less (close to
50%) from finger-key to hammer-string contacts than with a pressed touch. Such differ-
ences between pressed and struck touch thus result in different key depression profiles
(McPherson and Kim, 2011), where initial key depression is gradual with a pressed touch
and abrupt with a struck touch.

McPherson and Kim (2011) explored piano touch in the perspective of designing a
digital keyboard controller that could respond to the multiple dimensions of instrumental
gesture and touch that can be applied on keys — whether those aspects of gesture are
effective toward sound production with an acoustic piano or not. They identified five
separate dimensions of touch involved in expressive performance: velocity, percussive-
ness, rigidity, weight and depth. In addition to the direct parameter of velocity and to
percussiveness 13 , rigidity (of finger, wrist and arm muscles and joints when striking the
key), weight (of key depression on the keybed) and depth (of key depression within its
motion range) were also revealed as intuitively controllable touch parameters that can be
measured and quantified from key motion. Furthermore, these five controllable param-
eters were successfully identified and quantified, from a set of 12 descriptive features
extracted from key motion, by a machine learning decision-tree classifier algorithm.
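As an illustration of how such touch dimensions can be quantified from key motion, here is a toy percussiveness feature computed from a key-position trace. Both the feature definition and the traces are my own assumptions, not McPherson and Kim's actual 12-feature set.

```python
def percussiveness(key_positions, dt=0.001):
    """Toy estimate of the 'percussiveness' dimension of touch (one of the
    five dimensions McPherson and Kim identify): the ratio of the peak key
    velocity in the early part of the descent to the mean velocity over
    the whole descent. A struck touch shows an initial velocity spike;
    a pressed touch accelerates gradually. Illustrative definition only."""
    vels = [(b - a) / dt for a, b in zip(key_positions, key_positions[1:])]
    early_peak = max(vels[: max(1, len(vels) // 4)])
    mean_v = sum(vels) / len(vels)
    return early_peak / mean_v

# Hypothetical key-depression traces (mm), sampled every 1 ms
pressed = [0.0, 0.5, 1.2, 2.1, 3.2, 4.5, 6.0, 7.7, 9.6]   # gradual descent
struck  = [0.0, 3.0, 4.0, 4.5, 5.0, 5.8, 6.9, 8.2, 9.6]   # initial jerk
print(percussiveness(struck) > percussiveness(pressed))   # -> True
```

A feature of this kind, together with others describing weight, depth and rigidity, is the sort of input a decision-tree classifier could use to label touch types from key motion alone.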

7.2.3.4 Touch and tone control

Actually, from such differences in touch, timbre could be indirectly affected. Touch
indeed provides the kinaesthetic feedback which can help tone control, but also in-
fluences how the performers themselves perceive timbre, via multimodal integration
(Askenfelt et al., 1998; Parncutt and Troup, 2002). In a survey of several piano teachers
from European conservatoires (MacRitchie and Zicari, 2012), touch was qualitatively
described as a conveyor of musical intentions regarding timbre, conceived of as images
or metaphors, to project in performance. Touch was said to connect these timbral in-
tentions to the body, as the purveyor of the sensory feedback information from and for
which fingers, hands, arms and shoulders can adapt in many ways — point and surface of
finger-key contact, finger rigidity and curvature, limb tension, weight, velocity of move-
ment — in order to serve the musically rightful actuation of physical gestures. Between
types of touch, Ortmann (1925) noticed that a struck, percussive touch generally results in greater key speeds, and that varying forms of touch from pressed to struck can indirectly serve to control key depression speeds and hammer velocities, and affect timbre (along with loudness) accordingly.
13. From which stem contact noises, and which Ortmann (1929) already deemed an inherent part of tone-quality control.
Furthermore, Goebl et al. (2005) found that the relation between the free-flight travel time of the hammer (FFT) and its velocity at key contact (HV) was characterised by a power law of the form FFT = a × HV^b, where parameters a and b depend on the type of touch. In effect, for the same hammer velocity a pressed touch yielded a shorter free-flight hammer travel time, which was interpreted
as yielding better tone control. The same advantage in tone control with a pressed, non-
percussive touch had also been highlighted by Ortmann (1925), who noticed that key
control with a percussive touch is possible only at impact, while with a non-percussive
touch the key is (and has to be) controlled all along its descent. Such differences in tone
control through touch may affect expressive performance parameters (e.g. articulation)
and their application at a microstructural level, in a way that may indirectly affect timbre
as well (cf. Parncutt and Troup, 2002).
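Goebl et al.'s power law can be made concrete with a small sketch. The parameter values below are invented, chosen only to reproduce the reported ordering: a shorter free-flight time for a pressed touch at equal hammer velocity.

```python
def free_flight_time(hv, a, b):
    """Goebl et al. (2005): hammer free-flight travel time (FFT) relates to
    hammer velocity at string contact (HV) by a power law FFT = a * HV**b,
    where a and b depend on the type of touch. The negative exponent
    reflects that a faster hammer spends less time in free flight."""
    return a * hv ** b

# Hypothetical parameters for the two touch types (not Goebl's fitted values)
pressed = dict(a=2.0, b=-0.9)
struck = dict(a=3.0, b=-0.9)
hv = 2.5  # m/s, same final hammer velocity for both touches
print(free_flight_time(hv, **pressed) < free_flight_time(hv, **struck))  # -> True
```

Under this model the pressed touch leaves the hammer uncontrolled for a shorter time, which is the basis for interpreting it as affording better tone control.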
Ortmann (1929) 14 recorded the precise key depression profiles (and muscular con-
tractions) produced when different tone-qualities were intended. He therefore intended
to highlight how the different ‘touches’ associated with different tone-qualities would af-
fect key velocity and finger percussion, in the belief that “all differences in tonal qualities
must show in the degree of percussiveness and in the velocity of the finger-stroke” (Ort-
mann, 1929, p. 243). Among the many tone-qualities and corresponding key depression
profiles that he examined (several of which he reported), let us detail a few.
Ortmann (1929, pp. 243–246) first recorded finger movement via finger-lever at-
tachment, then directly recorded key-action movements with the apparatus described in
Section 7.1.3.2, plus muscle tension (Ortmann, 1929, pp. 337–352). Both procedures
showed that a ‘cantabile’, ‘singing’ tone quality was obtained with a non-percussive
touch (gradual finger and key descents) of moderate intensity, long duration (key held down) and gradual release, with a relaxed arm.
14. Even though he believed that touch on a single tone could only quantitatively determine its quality through intensity (key depression velocity) and percussion (on the key surface, from which stem contact noises).
As for the axiologically opposite, ‘dry’
tone quality, it is characterised by the relative preponderance of noise over tonal ele-
ments, which is shown by its percussive touch (a sharp initial rise in finger and key
descents) and marked muscular contraction, but low intensity — as the finger descent is
interrupted after the initial blow, and thus the key slows before reaching the escapement
point. Likewise, a ‘full, round’ tone production differed from its axiological opposite
‘shallow’ by a regular finger descent, yet with a rather percussive touch — which thus
requires elimination of the retardation effect of a slowing key following the initial rapid
descent from the blow. A ‘shallow’ tone on the other hand clearly showed an inter-
rupted finger descent, and the faster initial descent (yielding more percussion noise) was
lost at escapement point, resulting in a lower intensity. A ‘sparkling’ quality mean-
while, in accordance with the visual sensation from which the qualifier is borrowed, was
characterised by percussiveness (abrupt initial rise in key depression), moderate to great
intensity (amplitude of the key depression curve) and extreme brevity, with a rapid and
short muscular contraction. Finally, for a ‘velvety’ tone quality,
like the corresponding soft and smooth touch sensation, the key is gradually depressed
(non-percussive attack) with little intensity, long duration, and gradual release.
These prototypes of key depression profiles for some tone qualities thus illustrate the
range of variations in percussiveness, intensity and duration by which Ortmann (1929)
could integrally characterise all the studied tone qualities. Put otherwise, the quality of
a single tone could be completely defined — either directly or by association — by its
percussiveness, intensity and duration (Ortmann, 1929, p. 355). In a broad generalisation
from all observations, the most ‘musical’ tones were those least noisy and of moderate
tonal intensity and duration (Ortmann, 1929, p. 354). Yet this only holds for single,
isolated tones.

7.2.3.5 Other timbre control parameters for a single tone

Finally, the quality of a single tone can be controlled with performance parameters
other than touch itself, especially when considered in its duration. Key release is one
such parameter. Ortmann (1929) indicates that the timing of key release can affect tone
quality, as previously stated in the key-depression-profile descriptions of specific tone
qualities. He specifies that the manner of key release does not matter, as the finger ascent
only controls the instant of string damping (thus the actual timing of key release) and is
non-influential afterwards. Furthermore, the non-elastic key ascent will not follow a very
fast finger ascent, only giving a kinaesthetic impression of staccatissimo to the performer.
The characteristics of tone stopping, i.e. the releasing of the key or sustain pedal
which triggers the damping of the strings, were however proven to affect the perceived
quality of a piano tone, depending on how it changes the durations and slopes of sustain
and decay (Taguti et al., 2002). Yet this study used simulated tone stoppings, i.e. acoustic
post-processing of an actual piano tone recording. It has not been shown, as far as I know,
how the string-damping can be controlled and whether the effects studied by Taguti et al.
(2002) are mechanically reproducible. Still, it is possible to affect the decay of a piano
tone with partial damping of the vibrating strings, thanks to part-pedalling (Lehtonen
et al., 2009).
Moreover, pedal use can clearly modify the timbre of a single tone (Parncutt and
Troup, 2002). Ortmann (1925) emphasized the richness that the sustain pedal can add
by holding the resonance of previous tones, and to varying degrees by the timing of
pedal depression with a tone onset. On the other hand, the soft pedal shifts the action
and the hammer consequently hits two of the three strings with a less compressed part
of its felt; the sound becomes slightly less brilliant, less noisy, and contains sympathetic
resonances of the third string (Ortmann, 1925, via Gustafson (2007)).
Bellemare and Traube (2005) interviewed 16 pianists about verbally-described pi-
ano timbre nuances and gathered detailed descriptions and definitions of many timbre descriptors. In particular, technical information about their gestural ways of production was provided. The pianists interviewed could thus describe the production of some timbral nuances according to the role taken by their centre of gravity — and its influence on
the movements and position toward the keyboard of body, arms and hands — and by the
control of attack and its speed and depth through fingers. The following timbral nuances
were thus produced:
– rich: a slow attack with fingerpads, often with a low centre of gravity
– harsh: great speed of attack, much weight, no attack absorption, with the wood of
the keybed heard (contact noise)
– distant: slightly varying attack speeds, with a round touch and body weightless-
ness
– clear: fast attack, close to the keys, with firm fingertips

Guigue (1994b) provided a summarising table of all acoustic elements constitutive of piano timbre and the degree of control over them for pianists. First, pianists can control
the following mechanical actions:
– hammer velocity — controlled by keystroke velocity,
– damper activation — controlled by releasing a key and/or the sustain pedal,
– the number of strings set in vibration by a keystroke — with the soft, una-corda
pedal (except in the low register where only one string corresponds to each key),
– to a much lesser extent, hammer motion — through keystroke velocity and playing
modes.
Pianists were thus indicated to have great control over:
– the acoustic pressure (i.e. sound level) of partials — determined by keystroke
velocity,
– the number of audible partials — controlled by keystroke velocity and pedals,
– the relative extinction of partials — yielded (except in the higher register) by key
and/or sustain pedal releases,
– resonance — as beyond the minimal threshold of natural resonance of the instrument, pedals and playing modes can control resonance.
And pianists could also control, to a lesser extent:
– the relative rising of partials — influenced by keystroke velocity and articulation,
– the sound-to-noise ratio — affected by attack and touch type,
– the frequency ratios between partials — slightly affected by the soft pedal.
We can notice that, in this list of performance control parameters that affect timbre,
one (articulation) requires — and others (pedals, playing modes) are more effective with
— a combination of tones instead of a single, isolated tone.

7.2.4 Timbre in combination of tones

Ortmann (1925, 1929) discussed the myriad possibilities for tone-quality opened by
tone combinations. Indeed, the three control parameters he envisioned for a single tone
(velocity, percussiveness and duration) could be differentially applied to either “simul-
taneous key depressions or successive key depressions, representing, respectively, the
harmonic and melodic aspects of piano playing” (Ortmann, 1925, p. 50, via Gustafson
(2007)). Likewise, Dahl et al. (2010, p. 42) explain that “the combined control of sound
level and articulation [. . . ] is what results in the ‘touch quality’.” Touch and tone quali-
ties can thus only be fully expressed in tone combinations.
In particular, simultaneous key depressions (i.e., essentially, chords) can vary in rela-
tive intensity, percussiveness and duration of each tone, as well as in synchrony (at both
onsets and offsets). Acoustically, the timing and relative intensity of tones in a chord
affect the resulting spectral and temporal envelopes, with timing especially altering the
attack portion of the temporal envelope, and relative intensities affecting the spectral en-
velope, consonance and salience (of the louder tone) (Parncutt and Troup, 2002). This is
for instance essential to voicing (Lanners, 2002) and balance.
Successive key depressions can likewise vary in relative intensity, percussiveness
and duration of each tone, but also in their relative articulation, as well as in the timing patterns in a melodic line (e.g. rubato). The ‘velvety’ quality previously described for a
single tone is indeed better suited to a sequential combination of tones, of equal intensity
and highly legato-connected. Such tone combinations thus offer an “inexhaustible field
of tone-colour” (Ortmann, 1925, p. 131, via Gustafson (2007)).
Tone combinations in a musical context can also be ‘musical’ without the constraints
of noiselessness, moderate loudness and medium duration imposed on a single tone.
Indeed, as Ortmann (1929, p. 354) said, “a tone or chord played with a percussiveness
and at an intensity at which tonal beauty is decidedly impaired, may readily be reacted
to pleasantly if, for example, it marks the climax of a phrase, or awakens associations
the strength of which outweighs the purely tonal sensations”.
Finally, Bellemare and Traube’s (2005) interviews revealed that some timbral nu-
ances can only result from the polyphonic combination of several simultaneous and/or
sequential tones. We thus propose to define such timbral nuances as aggregate timbres:
combinations of at least two sonic elements into one resulting auditory object, where
articulation — the relation of one note to the next — is of prime importance.
The pianists interviewed in Bellemare and Traube (2005) indeed stated that, in order
to produce aggregate timbres, they could vary articulation parameters such as the dis-
tribution of attack speeds between notes, the timings of attack and release, the weight
distribution and synchrony in a chord, or the overlap between sequential notes.
The following descriptions of the production of some aggregate timbres (each pro-
vided by a different pianist) aptly reflect the processes involved:
– glassy is a two-step process: “you lay a chord down, then you play on top of its
resonance”,
– brassy is obtained with an “incredibly firm and aligned attack”, with maximum
resonance and sustain and a minimum of two simultaneous notes,
– transparent demands an equal weight distribution between simultaneous notes
(i.e. balanced voicing),
– muddled requires a minimum of two simultaneous or sequential notes, “blurry,
without definition” between them,
– velvety involves a legato touch, slow attacks, and “floating” connections between
tones, with no sharp edge: “you feel like a kitten walking along the keys”, thus
requiring several sequential notes,
– shimmering: “a carpet of sound serves as a backdrop onto which ring notes that
are rapidly attacked into the surface of the keybed and sustained with a surface
pedal”, a process which requires a minimum sequence of four or five notes.

In the end, drawing on the general information and methods concerning the expressive
parameters of piano performance, and on all these cues about the gestural means of
producing piano timbre, a methodology could be defined to quantitatively explore the
touch and gesture parameters of piano timbre production in their finest detail.
CHAPTER 8

PIANO TOUCH ANALYSIS: A MATLAB TOOLBOX FOR EXTRACTING
PERFORMANCE FEATURES FROM HIGH-RESOLUTION KEYBOARD AND
PEDALLING DATA

This chapter is for the most part based on an article published in the proceedings of
(and the work presented at) the Journées d’Informatique Musicale 2012 (JIM 2012) in
Mons, Belgium (cf. Bernays and Traube, 2012b).
The software tools presented here were designed and developed by the author, in
collaboration with Nicolas Riche for some of the functions, and under general guidance
from Prof. Caroline Traube.

8.1 Introduction

This chapter will detail the creative solutions set up for exploring the production
of piano timbre nuances, with the aim of identifying the control parameters in gesture
and touch that advanced-level pianists can use to colour their performances in a certain
timbre. Such an exploration is no easy task: measuring the degree of finesse in the
gestural control used to produce such subtle timbral nuances requires high-precision
measurement tools and large amounts of data, and thus powerful data-processing software
to extract the most relevant details from the data acquired.
Consequently, as we saw in Chapter 7, Section 7.1.3, the MIDI digital recording
systems that were successfully employed in studying properties of timing, dynamics or
articulation in expressive piano performance cannot provide the details and accuracy
required for the exploration of gestural control and touch in piano timbre production.
Meanwhile, other custom-built devices designed to acquire detailed information
about gestural control in piano performance, however valuable for the particular
research aims they were developed for, are not optimally suited to our needs either — the
bulk of the mechanical apparatus impeding their use for studying more than a single key
at a time, finger trackers (Hadjakos, 2011; MacRitchie, 2011) not clearly distinguishing
effective gestures of piano playing from its ancillary components, augmented keyboards
(McPherson and Kim, 2011) too different from a natural, acoustic piano — nor easy to
make use of in a time-effective fashion.
Thankfully, in order to acquire the highly precise data required to thoroughly as-
sess the intricacies of key strokes and the finest-grained nuances of pianists’ touch that
let them express different timbres in their performances, I had the opportunity to use
the Bösendorfer CEUS piano digital recording system. Its high-resolution piano key,
hammer and pedal tracking abilities gave us access to highly precise descriptions of the
instrumental gesture in performances, defined as the effective, non-ancillary part of the
performance gesture that conveys energy from the performer directly to the keys.
Yet to understand the gestural and touch control of piano timbre production, the raw
data accessible with the CEUS system needs to be interpreted according to high-level,
meaningful descriptors of a pianist’s gesture and touch, through which performances can
be characterised and compared with regard to the timbral nuances they are coloured with.
This chapter thus presents the analytic tools I developed in the MATLAB® (MATLAB,
2009) environment, with the aim of exploring the most subtle features of touch and gesture
in piano performance: the PianoTouch toolbox.

8.2 High-resolution data acquisition

The CEUS system I used is embedded in the Imperial Bösendorfer grand piano (see
Figure 8.1) installed at BRAMS in a dedicated studio, and which was used for audio
recordings in the studies presented in Chapters 4 and 6. The system includes optical sen-
sors behind the keys, hammers and pedals, microprocessors and electronic boards (see
Figure 8.2) that process sensor data and send it to an embedded computer on whose hard
drive data is stored. CEUS is also a reproducing system, with solenoids attached to each
key for precisely replicating the motion stored from a human performance.

Figure 8.1: The Imperial Bösendorfer grand piano installed in the BRAMS studio, with
embedded CEUS system. (Also pictured: Prof. Douglas Eck.)

The system can be controlled via the piano keyboard, over which a LED fallboard
display provides general information and indicates the functions (such as record, play,
browse recordings, etc.) controlled by predefined black keys. Four additional buttons on
the left side of the fallboard allow for navigation in the settings menu and recordings/hard
drive browser, and can be used to stop recordings or playback — as while recording or
replaying the keyboard interface and fallboard display are deactivated. The interface and
fallboard display can also be deactivated by a double push on the middle pedal, so that
the interface controls on the black keys do not interfere with performance, for instance
while rehearsing before starting a recording. The CEUS system can also be controlled
Figure 8.2: Details of the CEUS system: fallboard display interface and embedded elec-
tronics. (© L. Bösendorfer Klavierfabrik GmbH)

with a LabVIEW® graphical user interface installed on the embedded computer. As the
embedded computer can be accessed remotely via Ethernet, the CEUS system can thus
be remotely controlled.
The CEUS system tracks key and pedal positions (via their angle relative to rest
level) at an output sampling rate of 500 Hz (timing measurement error: ±1 ms), and
over 250 steps (8-bit encoding depth), which means a 32 µm tracking accuracy for the
extremity of the key within its 8 mm travel range (or about 30” for the key angle). As for
hammers, their timing is measured directly at the hammer head, 1 mm before it hits the
string, with a temporal accuracy of 1 µs. Dynamic range is measured internally over up
to 25000 steps per millisecond, then interpolated for output over 250 linear steps.
The resulting dataset is recorded as binary files. Successive data chunks correspond
to each timestamp — one every two milliseconds. Each chunk starts with a break code
(255), followed by a 24-bit value indicating milliseconds since the start of the recording.
It is followed by a series of 16-bit blocks, one for each key, pedal or hammer activated
at this timestamp. Each block first contains the 8-bit number of the key depressed (as in
MIDI, with the central C4 equal to 60, and up to 108), or pedal (109 for the soft pedal,
111 for the sustain pedal), or hammer (corresponding MIDI-key number + 128). The
second 8-bit number indicates the key/pedal position or hammer velocity. Datasets are
stored in the boe format.
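As an illustration of this chunk layout, the raw stream can be parsed as sketched below (in Python for concision; the toolbox's actual parser is the MATLAB function extract.m). The big-endian byte order of the 24-bit timestamp, and the assumption that position/velocity values never equal the break code, are assumptions made here, not specifications of the system.

```python
def parse_ceus_stream(data: bytes):
    """Parse a CEUS-style binary stream into (timestamp_ms, id, value) events.

    Chunk layout, as described above: a break code (255), a 24-bit
    millisecond timestamp, then one 16-bit block per active element,
    i.e. an 8-bit identifier (keys up to 108, soft pedal 109, sustain
    pedal 111, hammers = key number + 128) followed by an 8-bit value.
    """
    events, i, n = [], 0, len(data)
    while i < n:
        if data[i] != 255:        # resynchronise on the break code
            i += 1
            continue
        if i + 4 > n:             # truncated chunk header: stop
            break
        # 24-bit timestamp in milliseconds (big-endian assumed)
        t = (data[i + 1] << 16) | (data[i + 2] << 8) | data[i + 3]
        i += 4
        # 16-bit blocks until the next break code or the end of the stream
        while i + 1 < n and data[i] != 255:
            events.append((t, data[i], data[i + 1]))
            i += 2
        if i < n and data[i] != 255:
            i += 1                # odd trailing byte: skip
    return events
```

For instance, a stream holding two chunks, at t = 10 ms (key 60 at position 100, sustain pedal at 50) and t = 12 ms (key 60 at position 120), yields the three corresponding events.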
The Bösendorfer CEUS recording system thus constitutes an extremely precise tool
to observe the finest subtleties in pianists’ touch and to measure the part of the gesture
actually transferred to the piano action. And with the Bösendorfer Imperial grand piano
and its ad hoc studio setting, performers are provided with a premium musical setup.
However, in order to get a clear understanding of piano touch peculiarities involved
in one performance, the raw data must be processed in a more intelligible way. This
processing is accomplished in the main functions in the PianoTouch toolbox.

8.3 Data processing

8.3.1 From streamed data files to piano rolls

The first step in the processing chain leading to a thorough a posteriori analysis of
CEUS recordings is to parse the raw files and translate them into
MATLAB matrices. Binary-to-decimal conversion is handled by MATLAB, and the
function extract.m deals with identifying each chunk of data relative to one time-
stamp, essentially by finding the break tags “255”, then storing timestamp-and-value (of
key or pedal depression or hammer velocity) data pairs on one line. The parser can also
deal with the older boe format, in which the data is encoded in ASCII with hexadecimal
numbers. We thus get a timeline of events that happened in the recorded performance.
Yet data is easier to use and interpret once restructured in a key-by-events matrix, where
each row corresponds to one key, pedal or hammer, referred to by its MIDI-like num-
ber as referent, and successive blocks indicate each event with its timestamp and value.
From a timestamp-driven structure, the notetime.m function thus reorganises data
into a key-by-key account of events. Special attention is given to correcting for poten-
tially missing information, such as one key being depressed at two timestamps t and t+4
yet missing information at t+2. Missing or redundant timestamps are filtered, as well as
singleton events. 1 This structured data format (which can be saved as a CSV text file)
is especially useful to get a visual representation of the recorded performance, in the
form of a piano roll. In this aim, the pianoroll.m function can display the motion of
keys and pedals, one line for each, with respect to time. All recorded maximum hammer
velocities are displayed for their relative key, with a red vertical arrow (see Figures 8.3
and 8.4). The function thus provides the graphic equivalent of the well-known MIDI
piano roll display, but with the exact level of key depression instead of a fixed-velocity
block per note. The design of this function was partially inspired by the MIDI Toolbox
(Eerola and Toiviainen, 2004) and its MIDI pianoroll function.
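The restructuring performed by notetime.m, from a timestamp-ordered stream to a key-by-events layout, can be sketched as follows. This is an illustrative Python counterpart with a hypothetical function name; the singleton filtering and missing-timestamp corrections described above are omitted for brevity.

```python
from collections import OrderedDict

def by_key(events):
    """Reorganise time-ordered (timestamp, id, value) events into a
    per-key mapping of (timestamp, value) pairs, one row per key, pedal
    or hammer, as in the "notetime" structure. Redundant timestamps for
    the same key are dropped, keeping the first occurrence."""
    rows = OrderedDict()
    for t, ident, value in events:
        row = rows.setdefault(ident, [])
        if row and row[-1][0] == t:   # redundant timestamp: skip
            continue
        row.append((t, value))
    return rows
```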

[Figure 8.3 plot: piano roll; Y axis: Key (57 to 72, pedals 109 and 111); X axis: Time in milliseconds.]

Figure 8.3: Piano roll display of a performance (detail), where key motions are shown
as blue lines, peak hammer velocities as red vertical arrows, and soft and sustain pedals
(109 and 111 respectively) in light blue. The high precision of the 500 Hz sample rate,
8-bit resolution key motion tracking is apparent in the details of the note profiles.

1. This procedure has essentially become a precaution, as with the latest CEUS software updates such
acquisition errors have been all but eliminated.
[Figure 8.4 plot: piano roll; Y axis: Key (57 to 72, sustain pedal 111); X axis: Time in milliseconds.]

Figure 8.4: Piano roll display of another performance of the same piece excerpt, by a
different pianist and coloured in a different timbre (Dry vs. Dark for the former). Key
motions are also shown as blue lines, peak hammer velocities as red vertical arrows, and
the sustain pedal (111) in light blue. Subtle differences in depression profiles of keys
and sustain pedal are salient when comparing the two piano rolls.

Consequently, one can already use these piano roll displays to directly observe the
profiles of key depressions, and thus qualitatively gain an idea of the nuances of touch
in a performance. Indeed, the comparison between the two piano rolls of Figures 8.3
and 8.4, which represent the same excerpts in two different performances of the same
piece, by two different pianists, with different timbre nuances (Dark and Dry respec-
tively), can immediately reveal — in addition to differences in tempo and first onset
time, as the second performance starts earlier and is played faster — some salient dif-
ferences in key depression profiles, with sharper notes, more abrupt key releases, deeper
key depressions, higher hammer velocities, and deeper, more abrupt sustain pedal de-
pressions (and absence of soft pedal use) in the second, Dry-timbre performance relative
to the first, Dark-timbre performance.
However, within this data structure and piano roll display, we can only qualitatively
observe the high-precision linear response from each key motion, and do not have direct,
quantitative information on the most basic musical structure, that is, the note. While
visually immediate, this basic information, which MIDI provides directly, has to be
retrieved here through additional processing.

8.3.2 Retrieving notes

As events (key angles) are only registered when the key is out of its rest position,
most note onsets can be retrieved by finding discontinuities in the timeline related to
one key. Note onsets thus occur whenever two consecutive, timestamp-and-value blocks
stored in “notetime” files possess non-consecutive timestamp values (i.e. separated by
more than 2 ms). This main process is run in the notes.m function.
Additionally, correction procedures are applied for missing timestamps (up to 8 ms
deep) and to discard noisy information at note onsets — e.g. a key being slightly pushed
down less than a millimeter when the pianist sets his/her finger on top of it in preparation
for the next note — which could shift the onset time earlier than actually performed.
And key depressions irrelevant to the performance, i.e. the retrieved notes that prove
too feeble to launch the hammer and produce a sound, are also filtered out. For each
note onset detected, a forward loop until the next onset checks the key depression values
and returns their maximum, which is compared to the minimum key depression level
required for hammer launch, and the note is then either kept or discarded. This threshold
was determined empirically by exploring, over a large corpus of performances, the
relation between the maximum key depression per note and the existence of a
corresponding hammer event. The level was thus set at 80 (out of 250). This method was deemed preferable
to simply assessing the existence of a hammer event corresponding to the target note,
as it is more robust to potential missing hammer events, and allows us to account for
notes that were intended to launch the hammer and be played but failed to do so; it was
relevant for the purposes of our project to compensate for these kinds of pianists’ errors.
Moreover, the note-detection function can account for successive notes played on
the same key, with the second note starting before the key is fully released from the
first. Indeed, the double escapement action featured on grand pianos allows for faster
note repetition after just a partial key release, down to the threshold of the escapement
point. As two notes such as these are not separated in the data by a discontinuity in
the timeline, a forward-loop exploration of local minima was used. The escapement
threshold was assessed empirically through performance corpus analysis, and was set at
140 (out of 250). Local minima lower than this threshold define the onset of a new note.
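The note-retrieval logic described above (onsets at timeline discontinuities, repeated notes at local minima below the escapement threshold, and discarding of depressions too feeble to launch the hammer) can be sketched as follows. This is an illustrative Python re-implementation using the empirical thresholds quoted in the text, not the toolbox's notes.m; the noisy-onset and missing-timestamp corrections are omitted.

```python
SAMPLE_MS = 2      # CEUS output period: one sample every 2 ms
HAMMER_MIN = 80    # empirical key-depression level (of 250) for hammer launch
ESCAPE_MAX = 140   # empirical escapement threshold for repeated notes

def segment_notes(samples):
    """Split one key's (timestamp, depression) samples into notes.

    A new note starts at a gap in the timeline (the key returned to
    rest), or at a local depression minimum below the escapement
    threshold (a repeated note on a partially released key). Notes too
    feeble to launch the hammer are discarded."""
    segments, current = [], []
    for t, v in samples:
        gap = bool(current) and t - current[-1][0] > SAMPLE_MS
        rebound = (len(current) >= 2 and current[-1][1] < ESCAPE_MAX
                   and current[-2][1] > current[-1][1] and v > current[-1][1])
        if gap:
            segments.append(current)
            current = []
        elif rebound:
            low = current.pop()       # the local minimum sample...
            segments.append(current)
            current = [low]           # ...is the onset of the new note
        current.append((t, v))
    if current:
        segments.append(current)
    # keep only notes strong enough to have launched the hammer
    return [s for s in segments if max(v for _, v in s) >= HAMMER_MIN]
```

In a partial-release repetition, the key dips to, say, 130 (below the 140 threshold) before rising again: the local minimum sample becomes the onset of the second note.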
Once the notes have thus been identified, the higher-level structure is explored: chords.

8.3.3 Identifying chords

The function chords.m identifies groups of notes which have near-synchronous
onsets. The task is performed in two rounds. First, a group of notes is formed when
their onsets are less than 50 ms apart. This group is defined as a chord, whose onset
is defined as the earliest note onset. Another note can then be assigned to this chord if
it has an under-50-ms onset timing difference. This ensures that the most synchronous
notes are grouped together. In the second round, chords can be merged if any of the
note onsets (instead of the earliest) within one chord falls within a looser interval, 100
ms, to any of the note onsets from other chords. The 50 and 100 ms onset-synchrony
intervals are consistent with mean and upper-bound values found by Repp (1996b) and
Shaffer (1981) and were tested empirically for matching between the chords identified
in a corpus of performances and the designations in their respective scores.
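The two-round grouping can be sketched as follows (a simplified Python illustration of the procedure, not the toolbox's chords.m; since onsets are processed in ascending order, only adjacent groups need to be compared in the merging round):

```python
def group_chords(onsets, tight=50, loose=100):
    """Two-round chord grouping of note onset times (in ms).

    Round 1: a note joins the current chord if its onset is within
    `tight` ms of the chord onset (the earliest onset in the group).
    Round 2: adjacent chords are merged if any onset of one lies within
    `loose` ms of any onset of the other. Returns a list of chords,
    each a list of onsets."""
    chords = []
    for t in sorted(onsets):
        if chords and t - chords[-1][0] < tight:
            chords[-1].append(t)
        else:
            chords.append([t])
    # round 2: merge when the closest onsets of two chords are < loose apart
    merged = [chords[0]] if chords else []
    for ch in chords[1:]:
        if ch[0] - merged[-1][-1] < loose:
            merged[-1].extend(ch)
        else:
            merged.append(ch)
    return merged
```

For instance, onsets at 0, 30 and 45 ms form one chord in the first round; an onset at 120 ms, too far from the chord onset at 0 ms, is only merged in the second round, as it falls within 100 ms of the onset at 45 ms.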
Additionally, in the context of our target experiment (detailed in the following chap-
ter) which involved four different pieces, thresholds in register could be set for separating
the range of each hand — i.e. a note above which the left hand, and below which the
right hand, never play. Chords could thus be separated between hands.
Chords are now identified, and the directly available parameters of chord onset, offset
and note keys can be input to the pianoroll.m function, as an optional addition to
the mandatory “notetime” matrix or file input. Chords thus get framed in the piano roll
display, with rectangles of four different alternating colours set around each chord onset,
offset and highest and lowest note keys. Such piano roll displays with framed chords
provide easily legible visual information for chord identification and comparison.

[Figure 8.5 plot: piano roll with framed chords; Y axis: Key (48 to 72, sustain pedal 111); X axis: Time in milliseconds.]

Figure 8.5: Piano roll display of a full performance (from the same CEUS recording as
the piano roll excerpt from Figure 8.4), where chords are framed in rectangles of four
alternating colours. Key motions are still shown as blue lines, peak hammer velocities
as red vertical arrows, and the sustain pedal in light blue.

From the high-precision response tracking of key depressions, pedal depressions
and hammer velocities, we thus retrieve the fundamental musical structure of notes and
chords. 2 This allows us to achieve MIDI-like note identification, add the definition of
chords, and deal with the fine-grained information through reduction to an exhaustive
and relevant feature set to describe each note and chord.
2. The notion of “chord” is hereby used loosely, as it can account for a single note when not deemed
synchronous to any other note played by the same hand.
8.4 Piano touch features

The PianoTouch toolbox also provides a thorough analysis of notes and chords, and
extracts numerous features from the high-resolution 8-bit, 500 Hz tracking of key
depressions, pedal depressions and maximum hammer velocities recorded. The exhaustive
information related to each note or chord is reduced to a large set of features most rel-
evant to understanding piano touch. The most relevant features to identify were chosen
on the basis of studies of various aspects of expressive piano performance that used
MIDI data (Bresin and Battel, 2000; Goebl et al., 2005; Repp, 1996b, 1997a), as well
as studies focused on piano touch (Ortmann, 1929; Parncutt and Troup, 2002) or key-
board action (McPherson and Kim, 2011). Details of these studies and the features they
used were presented within Chapter 7 (p. 142). I have programmed, adapted, extended
and added piano performance features which can be sorted in several broad categories:
dynamic level, attack speed and type, duration and sustain, release, synchrony, intervals
and overlaps, and use of pedals.

8.4.1 Single note features

Each note is individually described by 46 features. First are its basic characteristics
(see Figure 8.6): key number, onset, offset and duration; maximum hammer velocity
(MHV), angle of maximum key depression (Amax) and their corresponding timestamps.
Additional attack features are then calculated: attack durations (related to instants of
either Amax or MHV), attack speeds (as a ratio of Amax or MHV to its duration), and
timing between Amax and MHV. Those features essentially assess the attack as dynamic
level — the faster the attack and the earlier MHV occurs compared to Amax, the higher
the dynamic level is. The attack type can also be defined by its percussiveness. Indeed,
as explained in Chapter 7, Section 7.2.3 (p. 160), key depression patterns differ between
the pressed and struck types of touch, as the struck touch displays a fast initial key
depression — yielded by the non-zero velocity of the finger striking the key from above
[Figure 8.6 plot: a note's key-depression curve annotated with note onset and offset, attack duration and speed, maximum key depression (Amax), maximum hammer velocity (MHV), and MHV-Amax synchrony.]

Figure 8.6: Illustration of some basic note features: note onset and offset, attack duration
and speed, maximum key depression and hammer velocity, and their relative synchrony.

— while a pressed touch has the key depressed slowly first and gradually accelerating.
Two methods are used to identify this behaviour: the ratio of key depression at half the
attack duration to the maximum key depression, and the mean key depression during
attack — akin to the area swept by the key depression curve during attack. Both features
will have higher values with a percussive touch.
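Both percussiveness descriptors can be sketched as follows (an illustrative Python helper with a hypothetical name, operating on equally spaced key-depression samples taken from note onset to the instant of Amax):

```python
def percussiveness(attack):
    """Two percussiveness descriptors of an attack, from equally spaced
    key-depression samples running from note onset to the instant of Amax.

    Returns (half_ratio, mean_depth): the depression at half the attack
    duration divided by the maximum depression, and the mean depression
    over the attack (akin to the area swept by the depression curve,
    normalised by the attack duration). Both are higher for a struck
    touch, which depresses the key fast at first, than for a pressed
    touch, which accelerates gradually."""
    amax = attack[-1]
    half_ratio = attack[len(attack) // 2] / amax
    mean_depth = sum(attack) / len(attack)
    return half_ratio, mean_depth
```

For instance, a concave, struck-like profile such as [0, 150, 200, 230, 250] scores higher on both descriptors than a convex, pressed-like profile such as [0, 20, 50, 120, 250].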
Two other ways of assessing note profiles were designed. First, critical points akin to
the acoustic temporal envelope were retrieved, thus defining up to four zones in key de-
pression: attack, decay (a short decrease in key depression certainly due to the reaction
of the keybed felt, and found in many notes), sustain and release (see Figure 8.7). At-
tack, sustain and release durations and ratio to the total note duration were thus assessed.
Second, a threshold was empirically defined, over which the key can be deemed deeply
depressed. From there, three sections (and their durations) could be defined in each note:
the first before the key reaches the threshold, the second while the key is depressed over
the threshold, and the third when the key falls below it — akin in most cases to attack,
sustain and release respectively.
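The threshold-based sectioning can be sketched as follows (illustrative Python; the threshold value below is a placeholder, as the empirically determined level is not restated here):

```python
def deep_sections(samples, threshold=200):
    """Split a note's key-depression samples into three sections relative
    to a deep-depression threshold: before the key first reaches it (akin
    to the attack), between the first and last samples at or above it
    (akin to the sustain), and after it last falls below (akin to the
    release). Returns the three section lengths in samples."""
    above = [i for i, v in enumerate(samples) if v >= threshold]
    if not above:                      # key never deeply depressed
        return len(samples), 0, 0
    first, last = above[0], above[-1]
    return first, last - first + 1, len(samples) - last - 1
```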
[Figure 8.7 plot: a note's key-depression curve divided into its attack, decay, sustain and release portions.]

Figure 8.7: Note key tracking sectioning analogous to the acoustic temporal ADSR en-
velope. The four portions — attack, decay, sustain and release — of key depression in
a note are highlighted in different colours. The attack starts at note onset and ends at
the instant of Amax. Decay ends, and sustain starts, when key depression stabilises at a
near-constant level. Release starts when the key begins to move up (down in the figure),
and ends at note offset.

Finally, the uses of sustain and soft pedals during the note were gathered: for each
pedal, their duration of use and amount of depression during the note, as well as their
depression at note onset, offset and at the instant of MHV.

8.4.2 Chord features

Each chord is first detailed by basic features: the number of notes it contains, its
onset and offset (earliest and latest of its note onsets and offsets respectively), duration
and maximum of its note Amax and MHV. Then, each of its notes is assigned, besides
its 46 individual features, 10 additional characteristics of its synchrony with respect to
the chord: note onset lag on chord onset, its ratio to the chord duration, and the onset
lag amount (defined as the sum of all other key depressions within the chord before the
onset of the target note); note offset lead, ratio and amount with regard to chord offset
(defined in the same way); its synchrony with the chord as the ratio of note duration to
chord duration; and its amount of synchrony, defined as the ratio of the total amount
from which the other notes within the chord are depressed for the duration of the target
note, compared with their total amount of depression. Each note within a chord is thus
characterised with 56 features.
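A few of these per-note synchrony features can be sketched as follows (an illustrative Python helper with a hypothetical name; a note is an (onset, offset) pair in milliseconds, and the chord is the list of such pairs for all its notes):

```python
def note_sync_features(note, chord_notes):
    """Per-note synchrony features within a chord: onset lag on the
    chord onset, its ratio to the chord duration, offset lead on the
    chord offset, and the note's duration ratio to the chord's. The
    chord onset/offset are the earliest/latest of its note onsets and
    offsets, as described above."""
    ch_on = min(on for on, _ in chord_notes)
    ch_off = max(off for _, off in chord_notes)
    ch_dur = ch_off - ch_on
    on, off = note
    return {
        "onset_lag": on - ch_on,
        "onset_lag_ratio": (on - ch_on) / ch_dur,
        "offset_lead": ch_off - off,
        "duration_ratio": (off - on) / ch_dur,
    }
```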
And each chord in itself is also described by 56 other features, with descriptions of
internal synchrony and pedal use in addition to the primary features explained above.
The smallest, non-zero note onset lag defines the chord melody lead (how early the first
note is compared with the next), and the smallest, non-zero note offset lead sets the
one-note trail within the chord. As for the use of the soft and sustain pedals, both are
exhaustively described by the following features: duration and amount of depression
during the chord; duration of deep-depression and mid-depression (when the level of
pedal depression falls above or between certain thresholds, respectively); and assessment
at chord onset and offset of the levels of pedal depression, activation (on or off) and
timing (how long before or after the pedal was or will be activated).
Finally, the following features assess the relations between chords. First, intervals
are measured: the inter-onset interval (IOI) from one chord onset to the next, the in-
terval from chord offset to the next chord onset (OffOnI), and its direction — negative
indicates legato, positive staccato. Then, overlaps between chords are defined through
their duration (how long one chord is overlapped by others), amount (of depression) and
number (of chords overlapping with the target) (see Figure 8.8). All these features of
intervals and overlap are calculated with regard to any chord (in onset temporal order),
to same-hand chords only, and to chords played by the other hand only. They are meant
to identify different articulation strategies.
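The interval and overlap features can be sketched as follows (illustrative Python, restricted for brevity to pairs of successive chords in onset order; the toolbox also computes these per hand):

```python
def chord_intervals(chords):
    """Intervals between successive chords given as (onset, offset)
    pairs in onset order: the inter-onset interval (IOI) and the
    offset-to-onset interval (OffOnI), negative for legato overlap and
    positive for staccato separation. Also returns how long each chord
    is overlapped by the next one."""
    out = []
    for (on1, off1), (on2, off2) in zip(chords, chords[1:]):
        out.append({
            "IOI": on2 - on1,
            "OffOnI": on2 - off1,                       # < 0 => legato
            "overlap": max(0, min(off1, off2) - on2),   # time both held
        })
    return out
```

In the test below, the first pair overlaps by 50 ms (negative OffOnI, legato) while the second is separated by 100 ms (positive OffOnI, staccato).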
In the end, each multi-note chord is described by 56 chord-specific features, plus the
mean and standard deviation of the 56 features describing each of its notes. This amounts
to 168 features per chord. 3 Such an exhaustive account clearly shows the depths at which
3. In the case of a single-note “chord”, due to redundancy and no standard deviation, there remain 85
valid features.
[Figure 8.8 plot: successive framed chords annotated with IOI (inter-onset interval), positive and negative OffOnI (offset-to-onset interval), and overlapping chords.]

Figure 8.8: Piano roll display (X-axis: time in ms; Y-axis: keys) of successive chords
with their interval and overlap relations pointed out. Each chord is framed in a colour
rectangle. Intervals between successive chord onsets (IOI) and between offset and onset
of two successive chords (OffOnI) are indicated. Overlaps between chords are shaded in
yellow.

the CEUS system allows us to observe piano performance, and gather quantitative infor-
mation about piano touch and its dynamics, percussiveness, articulation, depth, timing,
pedalling, etc.
The features are extracted by two functions, notes.m and chords.m (the latter
calling the former), and are output in a three-dimensional, chord-by-note-by-feature ma-
trix. For each chord of n notes, there are (n+3) vectors, the first being the features of
the chord itself, the second the mean of its note features and the third their standard
deviation. Additionally, the mean and standard deviation of all chord features gives an
overall portrayal of the performance, as a whole, through 322 characteristics. Moreover,
each performance is similarly described with regard to left-hand chords only, right-hand
chords only and left-vs-right hand differences. The results matrix can also be printed out
as a formatted text file (restructured in 2 dimensions), or as two separate files containing
chords alone and notes alone respectively.
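The per-chord block layout, (n + 3) feature vectors for a chord of n notes, can be sketched as follows (illustrative Python with plain lists; the toolbox stores the equivalent as a three-dimensional MATLAB matrix):

```python
from statistics import mean, stdev

def chord_block(chord_features, note_features):
    """Assemble one chord's block of feature vectors: the chord's own
    features first, then the mean and standard deviation of its note
    features, then one vector per note, i.e. (n + 3) vectors for a
    chord of n notes. For a single-note chord the standard deviation
    is undefined and set to zero here."""
    cols = list(zip(*note_features))             # feature-wise columns
    means = [mean(c) for c in cols]
    stds = [stdev(c) if len(c) > 1 else 0.0 for c in cols]
    return [chord_features, means, stds] + note_features
```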

8.5 Analysis and visualisation functions

In addition to the main chord-and-note structuring and feature-extracting functions,
the PianoTouch toolbox also contains several additional tools devoted to data visualisa-
tion and analysis.

8.5.1 Comparison of piano rolls

First, the pianoroll_plus.m function was designed for the direct visual comparison
of the piano rolls of two or more performances. To this end, the function pro-
duces superimposed representations of the piano rolls of the input performances (via
their “notetime” matrices or files). The piano roll of each performance is displayed as in
the individual piano roll described in Section 8.3, except that key and pedal depressions
and MHVs share the same colour. Each of the piano rolls superimposed within one fig-
ure is assigned a different colour. Performances are synchronized at the first chord onset,
so that the piano rolls start at the same time, regardless of the start of the CEUS record-
ing and the idle time that might precede a performance. Such a visualisation allows for
a direct comparison of each note and chord between several performances, although it
requires that the performances be at the same tempo, lest the same notes and chords
in the score be played at too different a time for notes and chords to remain visually
comparable between performances. Such an example of superimposed piano rolls of
two performances of the same piece is provided in Figure 8.9. One can thus visually
compare the differences between these two performance excerpts: differences in onset
timings, in note durations, key depression depths, attack and release profiles, hammer
velocities, and pedal use.
[Figure 8.9 plot: two superimposed piano rolls; Y axis: Key (57 to 79, sustain pedal 111); X axis: Time in milliseconds.]

Figure 8.9: Superimposed display of two performance piano rolls (detail). These are two
different performances of the same piece, by the same pianist, yet differing in timbre
expression (Dark vs. Velvety). The first performance is here represented in blue for
both its key motion and MHVs, while the second performance is represented in red.
The upper key line (# 111) corresponds to the sustain pedal. One can thus visually
compare the differences in timing, dynamics, sustain, attack, release, pedal use between
performances.

8.5.2 Chords and notes selection

Furthermore, subsets of chords or notes can be selected with two separate functions,
select_chords.m or select_notes.m. There are two ways of selecting the sub-
set: either by indicating the notes or chords to select in a matrix or text file provided as
an input argument, or graphically, by clicking on the chords/notes one wishes to select
on the piano roll display of the performance. With the first method, which is most useful
for batch processing, the notes to select are specified, one per line, by their key number
and a timestamp falling between their onset and offset. The same is required for the
chords to select, and here any key number falling within the range between the lowest
and highest notes in the chord can be used as reference. In case such coordinates could
refer to more than one note or chord (due to overlap), the closest fit in the time range is
selected. The graphical counterpart works essentially the same way: each click within
the piano roll figure sends back a key number and a timestamp, which select a note or
chord. Chord selection by click is made easier by the framing of all chords in
the piano roll (see Figure 8.10). Any number of chords or notes can thus be selected.

[Figure 8.10 screenshot: chord selection on a piano roll, with the prompt “Click within the chord(s) you wish to select; when done press ENTER”; Y axis: Key (34 to 79, sustain pedal 111); X axis: Time in milliseconds.]

Figure 8.10: Chord selection interface, where a black target cursor (here centred on key
73 (C#5) at 6500 ms) indicates which chord a click within the frame will select. The
features of each selected chord are output, as well as the mean and standard deviation
of each feature over the selected chord. Local characteristics within a performance can
thus be quantitatively assessed.
The function then retrieves the features of each selected note or chord, and calculates
means and standard deviations over the subset. It outputs a 3D matrix of results similar
to the main performance-feature matrix previously described, with the same printout
options. One can thus quantitatively assess local characteristics within a performance.
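The coordinate-based selection can be sketched as follows (illustrative Python; the tie-breaking rule used here, the distance from the given timestamp to a note's temporal midpoint, is one plausible reading of the “closest fit in the time range”, not necessarily the toolbox's):

```python
def select_note(notes, key, t):
    """Select, among notes given as (key, onset, offset) triples, the
    one matching a key number and a timestamp falling between its onset
    and offset; when several overlapping notes match, the one whose
    temporal midpoint is closest to the timestamp wins. Returns None
    when no note matches."""
    hits = [n for n in notes if n[0] == key and n[1] <= t <= n[2]]
    if not hits:
        return None
    return min(hits, key=lambda n: abs(t - (n[1] + n[2]) / 2))
```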

8.5.3 Graphical representation of performance features

8.5.3.1 Evolution of the features in time during a performance

In order to visualise the evolution of features over the duration of a performance,
the g_pianostats.m function plots the normalized feature values against the perfor-
mance piano roll display. First, a graphical user interface allows the user to select the
features to plot and to specify some options (hand separation, error bars for note features
and the standard deviation between the notes in a chord) (see Figure 8.11).
The normalized value (Z-score) of each selected feature, for each chord, is then plot-
ted over time at the instant of chord onset (see Figure 8.12). With the piano roll as
reference, one thus can see the evolution of the feature during the performance, and pos-
sibly identify its relation to the musical structure (e.g. phrasing) of the piece.
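The per-chord normalization can be sketched as follows (illustrative Python, not the MATLAB implementation; whether the toolbox uses the population or sample standard deviation is not specified here, so the population form is assumed):

```python
def zscore(values):
    """Normalize a feature's per-chord values to zero mean and unit variance,
    as plotted against the piano roll by g_pianostats.m.
    Population standard deviation is assumed."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]
```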

8.5.3.2 Comparison in time between performances

The second function, g_compare.m, graphically compares several performances over time. Essentially the same graphical user interface (see Figure 8.11) is
used to select the features to display, with a few additional specific options to select if
desired — display of horizontal lines at mean feature value, standard deviation between
chords displayed as error bars, synchronization of performances for matching the first
and the last chord onsets between performances. The evolution in time, from chord to
chord, of each selected feature for each input performance is tracked in a separate plot.
In order to improve the relevance of direct comparisons of performances in time, perfor-
Figure 8.11: Graphical user interface for feature selection towards their plotting in time
beside the performance piano roll. Each feature is presented under its working label,
alongside a checkbox button for selecting it. Additional options are available: split
chords by hand (and display either one or both), and display the standard deviations of
note features between each note in a chord as error bars.

mances can be time-stretched so that both the first and last chord onsets are synchronized.
Synchronizing the first chord onset between performances eliminates the delay between the start of the recording and the beginning of the actual performance. Synchronization of the last chord onsets is then obtained by time stretching: each performance, and all of its chord onsets, is rescaled uniformly in time by the ratio of the mean actual duration (minus the initial offsets) across all performances to the actual duration of that performance. Thus, the last chord onsets are exactly
synchronized between all performances, and the timing correspondence of the chords
[Figure 8.12 image: performance piano roll (Key × Time in milliseconds) above per-chord Z-score traces of MHV and sustain pedal depression, separated into left and right hand]

Figure 8.12: Graphical display of one performance piano roll and evolution over time of
selected features. MHV and sustain pedal depression are displayed separately for each
hand, with their normalized value (Z-score) per chord plotted at each chord onset.

corresponding to the same musical events in different performances is vastly improved.


Additionally, the means per performance for each selected feature are displayed as his-
tograms alongside their evolution in time. This representation mode thus allows us to
identify what differs between performances, and in particular when such differences oc-
cur (see Figure 8.13).
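The time-stretching procedure described above can be sketched as follows (illustrative Python, not the toolbox's MATLAB code):

```python
def time_stretch(onsets_per_perf):
    """Align the first chord onset of each performance at 0 (removing the
    initial recording offset), then uniformly rescale each performance so
    that its last chord onset matches the mean actual duration (first to
    last onset) across all performances."""
    aligned = [[t - o[0] for t in o] for o in onsets_per_perf]
    durations = [o[-1] for o in aligned]  # actual duration, first-to-last onset
    mean_dur = sum(durations) / len(durations)
    return [[t * (mean_dur / d) for t in o] for o, d in zip(aligned, durations)]
```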
[Figure 8.13 image: three stacked panels — Melody lead, Key depression depth, Attack speed — plotted over time for Performance1 and Performance2, with per-performance means as histograms on the right]
Figure 8.13: Comparison in time of two performances according to three features:


melody lead, key depression depth and attack speed. For each performance, feature val-
ues for each chord are plotted at adjusted (for correspondence between performances)
chord onsets. Means per performance for each feature are represented on the right side.

8.5.3.3 Comparison in time between performances grouped by factor

The last graphical function of this ilk, g_factor.m, is also aimed at comparing
performances, in time, according to one or several features. Yet instead of tracking
individual performances, they can be grouped here according to a common factor, e.g.
the same performer, or the same timbre. This feature is particularly useful in trying to
identify and assess in time the gestural correlates of piano timbre nuances within a large
set of performances (see Chapter 9).
An example of this representation is provided in Figure 8.14, with the performer as
factor, for six performances of the same piece by two different pianists (three perfor-
mances each). Each performance was time-stretched with the same method used in the
g_compare.m function. The function uses sliding windows, with 50% overlap and
a size fixed by the user — in this figure the size is set at 500 ms. This ensures that
the chords from different performances that correspond to the same musical event are
grouped in the same window, provided their time-stretched onsets are sufficiently close
— i.e. the windows are large enough. The overlap between windows induces a smoothed
out representation. Within each window, the mean between all included chords from
same-factor performances is plotted, and an error bar indicates ±2 standard errors (95%
confidence interval). The means per factor/performer over all the time windows are also
presented as histograms on the right side.
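The windowing scheme can be sketched as follows (illustrative Python; the toolbox is in MATLAB, and the handling of singleton windows here is an assumption):

```python
import math

def windowed_stats(onsets, values, win=500):
    """Group per-chord feature values into sliding windows of `win` ms with
    50% overlap (hop = win/2); return (window_start, mean, 2*SE) per non-empty
    window, where 2*SE approximates a 95% confidence interval."""
    hop = win / 2
    t_end = max(onsets)
    out = []
    start = min(onsets)
    while start <= t_end:
        in_win = [v for t, v in zip(onsets, values) if start <= t < start + win]
        if in_win:
            n = len(in_win)
            mean = sum(in_win) / n
            var = sum((v - mean) ** 2 for v in in_win) / (n - 1) if n > 1 else 0.0
            se = math.sqrt(var / n)
            out.append((start, mean, 2 * se))
        start += hop
    return out
```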

8.5.4 Score matching

The score_matching.m function is designed to assess the fit between two performances, or between a performance and its score. It compares the notes identified in one performance to those in another, or to the score (rendered as MIDI). The function only compares MIDI-like information: key number, onset, duration and velocity (MHV). This function, developed by Riche (2010), is an adaptation, with a custom CEUS-data-compliant parser, of Edward Large’s score matching functions 4 (Large, 1993), essentially
intended to assess performance errors (such as missing notes). The function returns a
graphical display of two corresponding MIDI piano rolls, where each note within one
performance is linked to its corresponding note in the other performance/score. The
4. Available online at https://www.jyu.fi/hum/laitokset/musiikki/en/
research/coe/materials/miditoolbox/matchingPerformancetoNotation.zip
[Figure 8.14 image: three stacked panels — Melody lead, Key depression depth, Attack speed — plotted over windowed time for Pianist1 and Pianist2, with per-pianist means as histograms on the right]
Figure 8.14: Comparison in time of six performances grouped by performer, according


to three features: melody lead, key depression depth and attack speed. Time (and chords)
is segmented in 500-ms, 50%-overlap sliding windows. Error bars represent ±2 S.E.,
and overall means per performer are included as histograms.

cross-correlation matrix of the two performances/scores in time is also displayed, and we added the calculation of a matching rate, which indicates the percentage of notes that
match (in key number, timing, duration and velocity) between the two performances.
An example of the matching comparison between a CEUS-recorded performance and its score is displayed in Figure 8.15. For this specific comparison, the function also returned a matching rate of 0.7533, thus an error rate of 0.2467.
Compared with the rest of the toolbox, this score-matching function is much more
limited in power and scope. It can however prove useful in identifying the deviations of
a performance from the score (especially the missing notes) or in comparing the notes
played in two performances. This function however cannot account for the subtleties of
an expressive performance, which extend farther than sheer deviations from the score.
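The idea of the matching rate can be sketched as follows (a greedy Python illustration matching on key number and onset only; the actual matcher adapted from Large (1993) is more sophisticated and also compares duration and velocity, and the 100 ms tolerance here is an assumption):

```python
def matching_rate(perf, ref, tol_ms=100):
    """Fraction of reference notes matched in the performance: same key
    number, onset within tol_ms, each reference note matched at most once.
    Greedy sketch of the idea, not the actual matching algorithm."""
    used = set()
    matched = 0
    for p in perf:
        for i, r in enumerate(ref):
            if i in used:
                continue
            if p['key'] == r['key'] and abs(p['onset'] - r['onset']) <= tol_ms:
                used.add(i)
                matched += 1
                break
    return matched / len(ref)
```

An unmatched reference note (a white hole in the cross-correlation display) then directly lowers the rate, as a missing note would.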

8.5.5 MIDI to boe conversion

Last but not least, Riche (2010) and I designed the boe_gen.m function to generate
CEUS boe format files from MIDI input. First, the MIDI files are parsed and converted
into MATLAB variables with Ken Schutte and Tuomas Eerola’s midi2nmat.m func-
tion, from whose output we retain the key number, onset, duration and velocity of
each note. MIDI velocity (7-bit) is linearly converted to CEUS MHV (8-bit). Our func-
tion can also take text files as input, using the same format as the midi2nmat output. The
instant of MHV can also be specified as a fifth parameter. When unspecified, the instants
of MHV are extrapolated as an empirical function of MHV and duration.
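The velocity conversion can be sketched as follows (illustrative Python; 250 is used as the upper bound following the maximum hammer-velocity value cited in this thesis, and the rounding choice is an assumption):

```python
def midi_to_mhv(vel):
    """Linearly rescale a 7-bit MIDI velocity (0-127) to the CEUS maximum
    hammer velocity (MHV) range, assumed here to top out at 250."""
    return round(vel * 250 / 127)
```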
Then, boe files can be generated with a flat rendering of the input information, i.e.
with MIDI-style notes of constant key depression depth (akin to MIDI velocity) over
their duration. Yet the function can also generate more realistic boe files, in which the
patterns of key depression are rendered as if key motion were precisely tracked. De-
pending on the input parameters, each note is assigned one of three fine-grained note
prototypes, each representing a typical profile of key depression — as commonly identi-
fied in CEUS recordings. The note prototype is then warped by polynomial interpolation
so as to fit the note input parameters, with longer durations accounted for by stretching
the sustain phase (thus keeping attack and release in valid forms). The key-by-events
“notetime” matrix is first created, and then transcribed in the output boe file.
[Figure 8.15 image: left, two piano rolls (notated event vs. performed note; pitch × time in sec / score position in beats); right, the performance-score cross-correlation matrix]
Figure 8.15: Graphical output of the score matching function adapted by Riche (2010)
from Large (1993): comparison of a CEUS recording of a performance to its score. The
upper-left figure is a piano roll of the performance, reduced to its MIDI-like features. The
lower-left figure is the pianoroll of the MIDI score. Notes that correspond between piano
rolls are matched. On the right side, the cross-correlation matrix between performance
(Y axis) and score (X axis) is shown, where deviations from the diagonal indicate time
shifts in matching, lighter shades of grey indicate imperfect match, and white holes
indicate missing notes.
An example of this boe-file generation process is presented in Figure 8.16, which contains two piano rolls stemming from the intermediate “notetime” files produced, not from the final boe file. Both piano rolls were generated from the same text file, in which
each of the twelve notes it contains was described by its key number, onset, duration
and MHV, with the instant of MHV extrapolated from MHV and note duration. The
left-side piano roll represents the flat, MIDI-like rendering, while the right-side piano
roll represents the augmented rendering, more similar to a typical CEUS recording.

[Figure 8.16 image: two piano rolls (Key × Time in milliseconds) — left, flat MIDI-like rendering; right, augmented CEUS-like rendering]

Figure 8.16: Piano rolls generated from MIDI-like text data. Both piano rolls stem from
the same text file data — 12 notes with key number, onset, duration and hammer velocity
for each. Left: flat, MIDI-like rendering. Right: augmented, CEUS-like rendering.

This MIDI (or MIDI-like) conversion and/or augmentation into the CEUS system
boe format has proven useful for testing the feature-extracting functions, and for com-
paring performances, feature-wise, to the “flat” reference of their MIDI-rendered scores.
8.6 Discussion

The PianoTouch toolbox was developed to take advantage of the high-resolution, high-sample-rate Bösendorfer CEUS key, hammer and pedal tracking system data, and to
offer an exhaustive and thorough account of piano touch and gesture, through meaningful
features that can be interpreted in a musically relevant fashion. From performance-
tracking data collected with the Bösendorfer piano-embedded CEUS digital recording
system, the toolbox functions can retrieve notes and chords, then describe them using an
exhaustive set of features that can provide a thorough account of nuances in dynamics,
attack, touch, timing, articulation and pedalling within a pianist’s performance.
While the PianoTouch toolbox was especially designed for the CEUS data format, it
could conceivably be used with any similar high-precision equipment — such as high-
frame-rate video tracking or motion capture, or any other high-accuracy piano perfor-
mance recording systems like the latest-generation Yamaha Disklavier — with the key-
board captured as discrete key units in one dimension, and the high-accuracy tracking of
key depression in another. To those ends, we plan on making the PianoTouch toolbox
available online under a GNU licence.
The most immediate use of the PianoTouch toolbox resides in comparing perfor-
mances. Indeed, the piano touch features can reveal the quantitative differences and
similarities in gestural control among several expressive performances. This can be used
for instance in comparing different nuances of a specific expressive parameter. In this
respect, the analytic functions in the toolbox — with piano roll comparison, selection
and assessment of chords and notes, feature visualisation and comparison in time, and
score-performance matching — allow for a rigorous, quantitative exploration of expres-
sive performance and its gestural control.
For the purposes of this thesis, the PianoTouch toolbox is used for exploring the
characteristics of gestural control in the expression of piano timbre. CEUS-recorded
performances, expressing different timbre nuances, can be characterised by the gesture
and touch features extracted with the PianoTouch toolbox. The specific aspects of gestu-
ral control and technique used in the production of each timbral nuance can be assessed
by comparing same- and different-timbre performances. To this end, the analytic func-
tions of the toolbox were complemented by advanced, automated statistical analysis and
visualisation functions oriented towards studying the influence of a performance factor
such as timbre on the performance features, with the help of several statistical tests of
variance and correlation. We can thus determine how different timbre nuances involve
different control strategies, which performance features vary significantly depending on
timbre, and therefore identify all the features useful in determining the timbre expressed
in a performance. These various procedures are featured and applied towards exploring
the production of piano timbre in the next chapter.

In conclusion, the PianoTouch toolbox can serve to increase understanding of the


subtle nuances in expressive musical performance through which artists masterfully
manage to convey emotion and feeling.
CHAPTER 9

EXPRESSION AND GESTURAL CONTROL OF PIANO TIMBRE: TOUCH


AND GESTURE FOR TIMBRE PRODUCTION IN PIANO PERFORMANCE

9.1 Introduction

This chapter presents a study which investigates the strategies and technical nuances
of gestural control involved in pianists’ use of timbre as an expressive device in piano
performance. The study was carried out by the author. Bernard Bouchard provided
invaluable help and advice concerning all technical and pianistic aspects. Prof. Caroline
Traube provided general advice and guidance.
The importance of timbre to pianists as an expressive musical parameter was de-
tailed in the first part of this dissertation. This research, which concerns the gesture and
technique underlying timbre expression, follows in the steps of the previous systematic
studies of expressive piano performance that were presented in Chapter 7, and especially
Ortmann’s (1929) work on the effect of piano touch-forms on tone-qualities.
With the Bösendorfer CEUS piano performance recording system at our disposal,
and with the PianoTouch Toolbox developed to exhaustively extract meaningful gesture
and touch features (see Chapter 8), the fine-grained features of gestural control in piano
performance can be explored in more detail than was ever accessible to mechanical or
MIDI systems. Piano performance can thus be observed at the level of finesse solicited
in timbre production.
The experiment presented in this chapter was designed to respect a musically relevant
context. Short pieces were composed for the study, and performed on the Bösendorfer
CEUS piano by advanced-level pianists, with different timbre nuances. A large dataset
of CEUS-recorded performances coloured in different timbre nuances was collected. I
explored how features of articulation, timing, dynamics, attack, touch, pedalling, and
so on were used in these musical performances in order to produce different timbres.


With the help of exhaustive statistical analyses of variance and correlation, I could thus determine how different timbre nuances involve different control strategies and which gestural features vary significantly depending on timbre, and therefore identify all the features useful in determining the timbre expressed in a performance.

9.2 Method

This section details the experimental design and the different steps taken in its con-
ception: selection of the timbre descriptors to study, selection of the musical pieces to
be performed, equipment used, participants and instructions.

9.2.1 Timbre descriptors

The timbres for which to seek out gesture and touch patterns in piano performance
were to be defined and called upon according to their verbal descriptors. For the pur-
pose of this experiment, in order to select the timbre descriptors that are most common
and representative of piano timbre, a study was conducted for evaluating the semantic
relationships between common descriptors of piano timbre, the details of which were
presented in Chapter 5. Five descriptors were thus identified as the most representative
of the five clusters appearing in the semantic structure of the verbal description of piano
timbre: Dry, Bright, Round, Velvety and Dark. On the graph representing the first two
dimensions of the MDS space of piano timbre descriptors, the five clusters are arranged
in a circular arc pattern (see Figure 9.1).

9.2.2 Musical pieces

Four pieces, specifically composed for the study, were selected to be performed with
the five timbre descriptors as expressive instructions of the timbral nuance to instil. The
selection process, motivation for using original pieces, and compositional constraints
[Figure 9.1 image: MDS plot, Dimension 1 × Dimension 2, with descriptor labels Round, Full-bodied, Clear, Bright, Velvety, Brassy, Shimmering, Soft, Metallic, Distant, Harsh, Dark, Dry, Muddled]

Figure 9.1: Semantic similarity Multidimensional Scaling space of piano timbre descrip-
tors (Dimensions 1 and 2). Five clusters are identified by red rectangles; the five selected
timbre descriptors are encircled in red.

imposed are detailed in Chapter 6, Section 6.2 (p. 126), as the study of auditory percep-
tion and identification of piano timbre used the same pieces for the creation of stimuli.
In the end, four miniature pieces were selected, written by composers Stacey Brown,
Frédéric Chiasson (for two of them) and Ana Dall’Ara-Majek (see Figure 9.2). These
pieces are just a few bars long (from four to seven, with different meters), which can
favour a consistent expression of each of the five timbral nuances. Each piece was also
considered appropriate for the expression of each of the five timbral nuances in a musi-
cally meaningful way, and featured many aspects of piano technique that we wanted to
explore.
[Musical scores: Pièce 1 — Frédéric Chiasson (Moderato, q = 100, pp); Pièce 2 — Stacey Brown (q = 110, Rubato, Pédale ad lib.); Pièce 3 — Ana Dall’Ara-Majek (q = 72); Pièce 4 — Frédéric Chiasson (Moderato, q. = 72, rall.)]
Figure 9.2: Scores of the four miniature pieces composed and selected for the study.
9.2.3 Equipment

In order to investigate the finest-grained nuances of pianists’ touch, highly precise


data were required, from which to thoroughly assess the intricacies of key strokes. For
this aim, the Bösendorfer CEUS piano digital recording system, embedded in the Impe-
rial grand piano installed at BRAMS, was used. The technical details of its recording
ability are detailed in Chapter 8. The CEUS recording system thus constitutes an ex-
tremely precise tool to observe the finest subtleties in pianists’ touch and to measure the
part of gesture actually effective and transferred to the piano action. Furthermore, with
the Bösendorfer Imperial grand piano and its ad hoc studio setting, participant perform-
ers are provided with a premium musical setup.

9.2.4 Participants and task outline

The piano performance recordings with the CEUS system were performed concomi-
tantly with the audio recordings that were used as stimuli in the study of piano timbre
perception and identification presented in Chapter 6.
Four pianists participated in the study. They are identified further by their initials:
PL, RB, BB and FP. Each had extensive professional experience. The four pianists (one
female, three male; one Canadian, two French, one Italian) ranged in age from 22 to 46,
and had all obtained advanced diplomas in piano performance. One of them (FP) had
perfect pitch. Each participant had received in advance the scores of the pieces and the
timbral nuances to explore, and was given the required practice time. Rehearsal ses-
sions were also allotted on the Bösendorfer piano, to allow for familiarisation with the
instrument and the room. Participants were also instructed not to overuse tempo and
dynamic levels in marking the different timbre nuances, as those performance parame-
ters are distinct from timbre per se. They were then asked to perform each of the four
pieces, with each of the five timbres. Three such runs of 20 performances were con-
ducted successively — twice in an order of pieces and timbres chosen by the participant,
and once in random forced order — so as to get three performances for each condition
(piece × timbre). Each of the 60 performances per participant was recorded through the
CEUS system in order to acquire performance gesture data — in addition to the audio
recordings used in the study described in Chapter 6.
We thus collected 240 CEUS boe-format recordings of 4 pianists performing 4
pieces with 5 different timbres, 3 times each. The participants were also interviewed
about the experiment, the piano, the pieces (musical and technical thoughts) and timbre
(as a general musical concept, about terminology and especially about the five presented
timbre descriptors, and about the musical and technical expression of those five timbres,
both in general and within the four pieces hereby performed).
More pianists (around 10) were originally expected to take part in the study. Un-
fortunately, a technical failure of the CEUS system and the cracked soundboard of the
Bösendorfer Imperial grand piano prevented any further recordings. Still, the 240 per-
formances recorded provide a large enough dataset for exploring the differences in the
production of five timbral nuances. With only four participants, the idiosyncrasies of
each pianist’s playing style may however exert a larger influence than initially hoped for.

9.2.5 Performance analysis

The data thus gathered with the CEUS system provides a high-precision account of
key and pedal positions and hammer velocities through time. In order to extract from
these data pianistically meaningful assessments of the performance subtleties in gesture
and touch, I used the MATLAB PianoTouch Toolbox that I developed for this purpose,
the details of which are presented in Chapter 8. The successive processing steps of
converting the streamlined data files into key-by-key position matrices, then retrieving
note and chord structures, and characterising each with several features of performance
touch and gesture — 46 for each note, and 168 for each chord (including the means
and standard deviations between its constituent notes, plus its chord-specific features)
— provided an exhaustive set of features for each performance, spanning several broad
areas of piano performance, gesture and touch: dynamic levels; attack speed, depth, type,
percussiveness and synchrony; sustain, release durations and synchrony; articulation,
intervals and overlaps; detailed use of pedals. For each of the 240 performances, the
means and standard deviations between all its chords were computed for all features. 1
Each performance was thus described by 322 piano performance features. Additionally,
the same 322 mean and SD features were computed for both left-hand-only and right-
hand-only chords. In the end, each of the 240 recorded performances is characterised by
966 gesture and touch features.
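The deviation rate described in footnote 1 can be sketched as follows (illustrative Python, not the MATLAB implementation; the sample standard deviation is assumed):

```python
def deviation_rate(values):
    """Per-performance summary of one feature over all chords: the mean and
    the deviation rate, i.e. the standard deviation divided by the mean, so
    that relative deviations are comparable across performances."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return mean, sd / mean
```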

9.3 General results and discussion

9.3.1 Significant, timbre-discriminating piano performance features

Statistical analyses of variance were performed over the dataset of 966 features for
each of the 240 recorded performances. The dataset was organised in a three-repetition
design, with 80 samples (one for each same-pianist, same-piece, same-timbre condition),
timbre as factor (thus five groups) and the performance features as dependent variables. 2
MATLAB functions were developed to automate the analysis procedure and call
upon statistical tests partly adapted from the MATLAB Statistics Toolbox. Several tests
of analysis of variance were performed, depending on the violations of their assumptions
for each feature. In the case of within-group normal distributions (via Kolmogorov-Smirnov tests with Lilliefors correction) and homoscedasticity 3 (via Levene’s test of ho-
mogeneity of variances), repeated-measures ANOVA was used; if only the homoscedas-
1. Each standard deviation of a feature between chords was then divided by the features’ mean. The
resulting deviation rate thus represents the percentage of deviation from the mean feature value, instead
of the sheer amount of deviation. This allows for more meaningful comparisons of relative deviations
between performances, regardless of the differences in means between performances.
2. This repeated-measures design is required for independence of samples according to the (pianist × piece × timbre) experimental conditions, lest the variance within timbre-groups be underestimated due
to the assumable consistency in same-conditions performances — which would yield Type I errors, with
overestimated dependent-variable significance.
3. i.e. homogeneity of variances within each timbre group.
ticity assumption was violated, the Welch robust test of equality of means was used;
otherwise 4 , the non-parametric Kruskal-Wallis rank analysis of variance was used.
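As an illustration of the non-parametric branch, the Kruskal-Wallis H statistic can be computed as follows (a minimal Python sketch that omits the tie correction; the study itself used functions partly adapted from the MATLAB Statistics Toolbox):

```python
def kruskal_wallis_h(groups):
    """H statistic of the Kruskal-Wallis rank analysis of variance (without
    tie correction): pool all samples, rank them (average ranks for ties),
    then H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), where R_i is the rank
    sum of group i."""
    pooled = sorted(v for g in groups for v in g)

    def rank(v):
        # 1-based average rank of value v in the pooled, sorted sample
        lo = pooled.index(v)
        hi = lo + pooled.count(v)
        return (lo + 1 + hi) / 2

    n_total = len(pooled)
    rank_term = sum(sum(rank(v) for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
```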
In the end, amongst 966 performance features, 192 proved significant at the 5% level.

9.3.2 Gestural spaces of piano timbre

First, Principal Component Analysis was applied to the subset of the 192 significant
performance features. Thus, the 192-variable set-space was reduced (by removal of
redundancies between correlated variables) to fewer orthogonal dimensions (principal
components), ordered in decreasing percentage of total variance accounted for — i.e.
each dimension contains more information from the original dataset than the following,
and no information is duplicated between dimensions. Each principal component is
defined as a vector of weights, each weight being ascribed to one input feature.
PCA was first applied to the whole 80-sample, 192-dependent-variable dataset. 5
Each input sample obtained new coordinates along the principal components: each coordinate is the sum, over features, of the sample’s value for each feature multiplied by the corresponding principal-component weight. The visual representation of the first two principal components — which com-
bined explain 53.1% of the variance in the input dataset (34.63%+18.47% respectively) 6
— and the position of each sample according to their coordinates in these two dimen-
sions is presented in Figure 9.3.
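The projection onto principal components can be sketched as follows (illustrative Python; mean-centring the features before projection is the standard PCA convention and is assumed here):

```python
def project(samples, components):
    """Project mean-centred samples onto principal-component weight vectors:
    each new coordinate is the dot product of a sample's (centred) feature
    values with one component's weights."""
    n_feat = len(samples[0])
    means = [sum(s[j] for s in samples) / len(samples) for j in range(n_feat)]
    centred = [[s[j] - means[j] for j in range(n_feat)] for s in samples]
    return [[sum(x * w for x, w in zip(s, comp)) for comp in components]
            for s in centred]
```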
In order to obtain a clearer view, and especially to compare the same-condition per-
formances (collapsed as one sample in the previous figure), PCA was applied separately
to each piece and its dataset of 60 performances and 192 significant-overall features. In
all four cases, the first two principal components can explain a large enough amount of
4. In particular, the assumption of normality of distributions within each timbre group could be affected
by a ceiling effect, for dependent variables whose range of values is limited by a maximum threshold,
such as hammer velocity whose maximum possible value is 250; in the case of timbres for which hammer
velocity is generally close to this ceiling, their distributions can be skewed leftward.
5. For the sake of display clarity, the 240 performances were not used here as samples, only the 80
means over the three same-pianist, same-piece, same-timbre performances.
6. As the third dimension explains only 12% more of the variance, it does not need to be factored in.
[Figure 9.3 image: scatter of the 80 samples, labelled by pianist initials (PL, RB, BB, FP), in the space of Dimension 1 (explained variance = 0.34627) × Dimension 2 (explained variance = 0.18468), with legends for timbre (Bright, Round, Dry, Dark, Velvety) and piece (Pieces 1-4)]

Figure 9.3: Principal Component Analysis of the 192 significant gestural features over
80 samples: Dimensions 1 and 2. Timbre averages are indicated by coloured crosses, and
±1 S.E with ellipses. The black arc shows the arrangement pattern of timbre averages.
Dimension 1 was reversed in direction for correspondence between figures.

the variance in the inputs dataset (53.09%, 55.27%, 57.71% and 53.82% respectively
for each piece). These two dimensions, and the corresponding position of each per-
formance in this space, are thus displayed for each piece in Figures 9.4a, 9.4b, 9.5a
and 9.5b respectively. Here the three performances of one timbral nuance by one pianist
are identified, linked by straight lines forming triangles.
The first thing to notice in those five gestural spaces is that the five mean positions
of performances sharing the same timbre are arranged along a circular arc. This shape is
quite consistent in all figures, be it the overall results or any of the four pieces. This in-
dicates that according to the 192 performance features identified as significant in telling
timbres apart, the gestural control of piano timbre is applied in quite the same way in all
pieces, thus producing a consistent gestural space along the two most significant princi-
pal components.
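The gestural spaces in Figures 9.3 to 9.5 come from principal component analysis of the performance-feature matrix. A minimal numpy-only sketch of the projection onto the first two components follows; the matrix here is a random stand-in for the 80 × 192 table of feature means, not the study's data.

```python
import numpy as np

def pca_2d(X):
    """Project a (samples x features) matrix onto its first two principal
    components; return the scores and the explained-variance ratios."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # variance ratio per component
    scores = Xc @ Vt[:2].T                       # coordinates on dims 1 and 2
    return scores, explained[:2]

# Stand-in for the 80 performance means over 192 significant features
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 192))
scores, explained = pca_2d(X)
```

A sign flip of a column of `scores` corresponds to the axis reversals applied in the figures for visual correspondence; the component directions themselves are arbitrary up to sign.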
[Figure: scatter plot of the 60 performances of Piece no.1 in the plane of Dimension 1 (explained variance = 0.3637) and Dimension 2 (explained variance = 0.16723); legend: timbres Bright, Round, Dry, Dark, Velvety and pianists PL, RB, BB, FP.]

(a) Piece no.1.

[Figure: scatter plot of the 60 performances of Piece no.2 in the plane of Dimension 1 (explained variance = 0.34055) and Dimension 2 (explained variance = 0.21217); same legend.]

(b) Piece no.2.

Figure 9.4: Principal Component Analysis of 192 significant features over the 60 per-
formances of Pieces no.1 (a) and no.2 (b): Dimensions 1 and 2. The coloured lines link
same-timbre, same-pianist repeated performances. Timbre averages are indicated by
coloured crosses, and ±1 S.E. with ellipses. The black arc shows the arrangement pat-
tern of timbre averages. Both dimensions were reversed in direction for correspondence
between figures.
[Figure: scatter plot of the 60 performances of Piece no.3 in the plane of Dimension 1 (explained variance = 0.38572) and Dimension 2 (explained variance = 0.19142); legend: timbres Bright, Round, Dry, Dark, Velvety and pianists PL, RB, BB, FP.]

(a) Piece no.3.

[Figure: scatter plot of the 60 performances of Piece no.4 in the plane of Dimension 1 (explained variance = 0.36891) and Dimension 2 (explained variance = 0.16928); same legend.]

(b) Piece no.4.

Figure 9.5: Principal Component Analysis of 192 significant features over the 60 per-
formances of Pieces no.3 (a) and no.4 (b): Dimensions 1 and 2. The coloured lines
link same-timbre, same-pianist repeated performances. Timbre averages are indicated
by coloured crosses, and ±1 S.E. with ellipses. The black arc shows the arrangement
pattern of timbre averages. Dimension 2 was reversed in direction for correspondence
between figures.
Moreover, this circular arc pattern is quite reminiscent (albeit more open) of the ar-
rangement of the five timbre descriptors in the semantic similarity space of Figure 9.1.
On the other hand, in these gestural spaces, the order of timbres Dark and Velvety along
the arc is reversed compared to the semantic similarity space. Here, Dark-timbre per-
formances are closer to Round ones, with Velvety-timbre performances at the extreme
opposite from Dry. The general semantic and gestural expressions of piano timbre
explored in our studies thus present some degree of correspondence, although the
positions of timbres Dark and Velvety do not match across the two studies.
In the overall gestural space showing the means of same-pianist, same-piece, same-
timbre performances, we can observe a combination of three scattering effects, one for
each condition. Indeed, performances tend to be grouped by performer (most especially
BB's, concentrated in the upper right region) and by piece (essentially in interac-
tion with timbre). Yet the most salient grouping effect is indubitably due to timbre.
A timbre-by-timbre account of performance position shows the westward position of all
Dry-timbre performances, with most grouped in the upper-left region. On the other hand,
most Velvety-timbre performances are situated on the far-right side. Bright and Round-
timbre (and Dark-timbre save for two outliers) performances are closely scattered around
their respective means.
A clearer, more complete account is available in the piece-by-piece PCA Figures 9.4a,
9.4b, 9.5a and 9.5b, in which the three repetitions of the same piece, pianist and timbre
conditions are represented. We can thus assess the pianists' consistency in the production
of each timbre. Bright and Round timbres are the most consistent, both between pianists
and between individual same-timbre takes, in Pieces no.1, 2 and 4. Other timbres in
Piece no.1 are less consistent between pianists, but still essentially consistent between
individual same-timbre takes, save for a few outliers (most noticeably one of FP's Dry
performances). Piece no.2 also shows much consistency, between both performers and
individual takes, for a Dark timbre, and consistency for the individual takes (and between
performers as well, except for RB) on the Velvety timbre. The gestural space for Piece no.3
shows more scattering: for each timbre, three of the performers are somewhat
consistent with each other, but the fourth (FP for a Bright timbre, BB for
the others) is far off. Consistency between individual takes is still apparent for a Dark
timbre, and somewhat for timbres Round and Velvety (save one outlying performance
each). As for Piece no.4, for each timbre and pianist, at least two of the three takes
are noticeably consistent.
Performer-wise, BB is very consistent between same-timbre takes, but also between
timbres. RB is also very consistent in his same-timbre takes, except for timbres Dark in
Piece no.2 and Bright in Piece no.3. Moreover, the distances between his performances of
different timbres are larger, which lends more salience to his consistency in same-timbre
takes. Finally, PL and FP are less consistent in their same-timbre takes, oftentimes with
one performance set apart from its two same-timbre counterparts.
We can conclude that there is fair consistency in the gestural production of the five
different nuances of piano timbre, although less so for the timbres at the extremities
of the arc (Dry and Velvety). Same-timbre production consistency also varies some-
what between pieces, and especially between performers: some participants were more
consistent than others in producing the same timbral nuance. Yet the most important
conclusion is that the five timbral nuances, on average, are arranged in all gestural
spaces along an arc which resembles their arrangement in the semantic similarity space,
with one major difference though: the inversion of the Dark and Velvety timbres.

9.3.3 Gestural description of piano timbre

In order to obtain a minimal, unique gestural portrait for each of the five piano timbre
nuances explored in this study, the set of 192 significant features was reduced to 13
essential features. To this end, redundancies between features were removed. Indeed,
some of the significant features shared the same pianistic or technical meaning, either
exactly 7 or approximately. 8 When such features presented almost exactly the same
patterns of mean values for each timbre, only one of them, the most statistically significant,
was conserved. This allowed us to identify, with hardly any loss of relevant information,
a minimal set of 13 performance gesture and touch features that describe each of the
five timbral nuances appropriately and uniquely. These results are presented in the
Kiviat (radar) chart of Figure 9.6.
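The reduction from 192 to 13 features relied on pianistic meaning and statistical significance. As a rough automated analogue (hypothetical, using a correlation threshold in place of expert judgement), one can keep, within each group of near-collinear features, only the most statistically significant one:

```python
import numpy as np

def prune_redundant(X, pvals, threshold=0.95):
    """Greedily drop features that are near-collinear with a more
    significant (lower p-value) feature already kept."""
    order = np.argsort(pvals)                    # most significant first
    corr = np.abs(np.corrcoef(X, rowvar=False))  # |r| between features
    kept = []
    for j in order:
        if all(corr[j, k] < threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)

# Toy matrix: six features, the last a near-duplicate of the first
rng = np.random.default_rng(2)
base = rng.normal(size=(80, 5))
dup = base[:, 0] + 0.01 * rng.normal(size=80)   # redundant copy of feature 0
X = np.column_stack([base, dup])
pvals = np.array([0.001, 0.02, 0.03, 0.04, 0.05, 0.010])
kept = prune_redundant(X, pvals)
```

Here feature 5 is discarded because it duplicates the more significant feature 0; the five genuinely distinct features survive.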

Figure 9.6: Kiviat chart of the 13 gestural features giving a minimal and unique descrip-
tion of the five timbral nuances explored in the study. Z-scores per timbral nuance for
each feature are indicated with colour-coded dots. The five colour-coded, dot-linking
closed lines thus represent the gestural portraits of each timbral nuance. The shades
around each closed line show the ±1.96 S.E. intervals (95% confidence interval).

7. For instance several ways of measuring attack duration were used.


8. For instance, pedal depression was measured at several points for each note. Moreover, one feature
could be significant overall as well as for one or both hands separately.
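The z-scores plotted in Figure 9.6 can be computed along these lines (a sketch of the presumed procedure, with toy values: each feature is standardised over all performances, then averaged per timbre):

```python
import numpy as np

def timbre_zscores(values, timbres):
    """Standardise one feature over all performances, then average the
    z-scores per timbre label (one point per timbre on the radar chart)."""
    values = np.asarray(values, dtype=float)
    timbres = np.asarray(timbres)
    z = (values - values.mean()) / values.std()
    return {t: float(z[timbres == t].mean()) for t in np.unique(timbres)}

# Toy example: a single feature for six performances of two timbres
zs = timbre_zscores([10, 12, 11, 30, 28, 29],
                    ["Dry", "Dry", "Dry", "Velvety", "Velvety", "Velvety"])
```

Positive z-scores mark timbres for which the feature sits above the grand mean, negative ones below, which is how the colour-coded dots of the Kiviat chart should be read.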
Below are the descriptions, statistical scores 9 and explanations of these 13 most
relevant features. Counter-clockwise, starting from the right of the chart, are the features
of attack and dynamics:
– Hammer velocity (χ²(4) = 23.195, p < 10⁻³, effect size r = 0.294 overall; χ²(4) =
20.935, p < 10⁻³, effect size r = 0.265 left hand; χ²(4) = 25.156, p < 10⁻³, effect
size r = 0.318 right hand): maximum hammer velocity for each note, as directly
measured by the piano sensors. As a direct correlate of intensity, it serves as a
descriptor of dynamic level. This feature is first presented overall (both hands to-
gether), then for the left and right hands separately. Intensity is thus shown
to be high for timbres Bright and Dry, medium for Round, lower for Dark and much
lower for Velvety. Additionally, left-hand hammer velocity is equivalent for Bright
and Dry, whereas right-hand hammer velocity is much higher for Bright than Dry.
A Dark timbre is also played at a higher dynamic level with the left hand than the
right.

– Key depression depth (χ²(4) = 21.412, p < 10⁻³, effect size r = 0.271): indi-
cates how deep (close to the keybed) the key is depressed for each note. On
average, keys are nearly fully depressed for a Bright timbre, then less and less
from Dry to Round, Dark, then Velvety. This pattern resembles that of hammer
velocity, yet with a more marked difference between Bright and Dry.

– Variations in key attack speed (F(4, 75) = 3.117, p = 0.02, effect size r =
0.062): variations between chords, thus mostly over time during the performance,
of mean key attack speed. These are more salient for Dark and Velvety timbres
(largest ranges in attack speed) than for Round, then Bright and Dry. This is some-
what the opposite order to intensity, which makes sense, as the higher dynamic
registers tend to stay consistently forte, while lower mean dynamic registers have
9. Depending on the appropriate statistical test as dictated by the assumptions met, the first reported
statistic can be the ANOVA F-ratio F(df1, df2), the Welch F-ratio FWelch(df1, df2′) or the Kruskal-Wallis
chi-square χ²(df1).
some leeway to occasionally reach higher than piano (at least in the musical context
set here). In detail, a Dark timbre, while presenting higher mean hammer ve-
locities than Velvety, shows more variations in key attack speed; conversely, a
Dry timbre, with hammer velocities a little lower than Bright, still shows even
fewer variations in key attack speed.

– Attack duration (F(4, 75) = 3.881, p = 0.006, effect size r = 0.133 overall;
F(4, 75) = 3.591, p = 0.01, effect size r = 0.149 left hand; F(4, 75) = 3.432,
p = 0.012, effect size r = 0.105 right hand): durations of note attacks, from the
start of key depression to the instant of maximum hammer velocity. While pri-
marily inversely proportional to hammer velocity (the faster the attack, the shorter
its duration), it also depends on nuances of touch. 10 Indeed, Dark-timbre attacks,
overall, are as long as for Velvety although Dark-timbre hammer velocities are
greater. While left-hand Dark-timbre attacks are shorter than for Velvety (which
concurs with the dynamic insistence on the left hand for a Dark timbre), right-hand
Dark-timbre attacks are longer than for Velvety despite their higher right-hand
hammer velocities. On the short-attack side, Dry and Bright timbre averages are
nearly identical overall and for each hand, in contrast with the higher right-hand
hammer velocity of the Bright timbre.

The next three features account for the use of both pedals:
– Soft pedal depression (FWelch(4, 110.994) = 4.629, p = 0.002, effect size r =
0.291): its amount of depression during the performance. The soft pedal was thus
heavily depressed for producing a Velvety timbre, less depressed for a Dark tim-
bre, and hardly depressed at all for the others.

10. More precisely, hammer velocity depends on key attack speed at the instant of hammer launch
towards the strings, whereas attack duration depends on the mean attack speed over the key course, and
on the key depression depth as well.
– Sustain pedal use (F(4, 75) = 9.916, p < 10⁻³, effect size r = 0.315): duration of
sustain pedal depression during the performance. The sustain pedal is thus heavily
used for a Dark timbre, and for Velvety and Round timbres as well, not so much for
a Bright timbre and almost not at all for Dry.

– Sustain pedal depression (FWelch(4, 116.114) = 7.727, p < 10⁻³, effect size r =
0.438): its amount of depression during the performance. While this pattern resembles
that of duration of use, here Dark, Velvety and Round timbres show equally
high sustain pedal depression. This must indicate at least some use of surface ped-
alling in producing a Dark timbre, which lowers the amount of depression while
keeping its duration of use high. The higher amount of depression (compared with
duration) for the Bright timbre indicates constant full depression when the pedal is used.

Finally, the last two features concern articulation:


– Release duration (FWelch(4, 115.915) = 13.795, p < 10⁻³, effect size r = 0.32):
time taken for key release. It mostly accounts for articulation: a note released
slowly (thus slowed by the finger) is likely to overlap with the next. Releases
are long (legato articulation) for Velvety and Dark timbres, long as well for Round,
shorter for Bright and very short (staccato) for Dry.

– Right-hand chord overlaps (F(4, 75) = 2.561, p = 0.045, effect size r = 0.111):
descriptor of right-hand articulation: the more overlap, the more legato in right-
hand play. In this case, right-hand articulation is most legato for a Dark timbre
(which the longer right-hand attacks could also support). Round then Velvety
follow. Right-hand legato is thus less prominent for Velvety than what release
duration indicates for both hands, which may suggest more left-hand than right-
hand legato for a Velvety timbre. Dry is the most right-hand staccato, with Bright
in-between, neither legato nor staccato.
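The statistics reported above mix three omnibus tests depending on the assumptions met (footnote 9). The Welch F-ratio, whose fractional second degree of freedom explains values such as FWelch(4, 116.114), can be sketched in plain numpy using the textbook Welch formula; the data below are a random stand-in, not the study's.

```python
import numpy as np

def welch_anova(groups):
    """Welch's heteroscedasticity-robust one-way ANOVA.
    Returns (F, df1, df2); df2 is generally fractional."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                  # precision weights n_j / s_j^2
    grand = np.sum(w * m) / np.sum(w)          # weighted grand mean
    num = np.sum(w * (m - grand) ** 2) / (k - 1)
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    den = 1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    df2 = (k ** 2 - 1) / (3 * lam)
    return num / den, k - 1, df2

# Random stand-in: five timbres, 16 performances each, unequal variances
rng = np.random.default_rng(3)
groups = [rng.normal(loc=mu, scale=sd, size=16)
          for mu, sd in zip((0, 1, 2, 3, 4), (1.0, 1.5, 2.0, 2.5, 3.0))]
F, df1, df2 = welch_anova(groups)
```

With five groups, df1 = 4, as in the statistics above; df2 depends on the group variances and sizes, hence values like 110.994 or 116.114.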
9.3.4 Pairwise comparisons between timbres

The statistical analyses of variance were followed up by post-hoc pairwise comparisons,
using Tukey's Honest Significant Difference test to estimate the significance of the
features, with the aim of assessing which performance features differ most significantly
between each of the ten timbre pairs. 11
Those results, reduced for each timbre pair to a set of non-redundant (both in mean-
ing and values), significant features, are presented in Table 9.I.
In summary, the pairwise post-hoc tests clearly highlight differences in gestural con-
trol between either Dry or Bright timbres and either Dark or Velvety timbres. Moreover,
the use of pedals is essential in characterising some timbres. The Dry timbre differs
from the other four by its scarce or non-use of the sustain pedal. Meanwhile, heavy
depression of the soft pedal for a Velvety timbre distinguishes it from all the other tim-
bres but Dark. Dry or Bright essentially differ from Dark or Velvety through a higher
dynamic register, deeper key depression (but for between Dry and Dark), shorter and
faster right-hand attacks and a more staccato articulation. The Dark timbre also shows
longer left-hand chords than Dry, and more sustain pedal use than Bright. However, the
Round timbre cannot be thus distinguished from either Bright or Dark, and differs from
Dry and Velvety only by the depression of sustain and soft pedals (respectively). Finally,
only sustain pedal depression separates Bright and Dry, while Dark and Velvety cannot
be distinguished here.
This description is consistent with the gestural spaces of Section 9.3.2 and the ar-
rangement patterns of timbres, with Round in the middle, Dry and Bright at one end
and Dark and Velvety at the other end, with only the extremes clearly distinguished in
pairwise comparisons, save for the use of pedals (which allows for further discrimination
between closer timbres).
11. In short, Tukey's HSD test is used to adjust the α significance level (generally fixed at 5%) downward,
so as to avoid an inflation of Type-I errors caused by the multiplication of pairwise tests performed.
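The motivation given in footnote 11 is easy to verify numerically: ten uncorrected pairwise tests at α = 0.05 inflate the family-wise Type-I error rate to roughly 40%. The sketch below illustrates the problem and a Bonferroni-style down-tuning of α; note this is a simpler, more conservative stand-in for Tukey's studentized-range adjustment, not the HSD test itself.

```python
# Five timbral nuances give ten pairwise comparisons
k = 5
n_pairs = k * (k - 1) // 2
alpha = 0.05

# Family-wise Type-I error if the ten tests were run uncorrected,
# assuming independent tests: about 40% instead of the nominal 5%
familywise = 1 - (1 - alpha) ** n_pairs

# Bonferroni-style down-tuning of the per-test significance level
alpha_corrected = alpha / n_pairs
```

Tukey's HSD achieves the same family-wise control less conservatively by using the studentized range distribution instead of a flat division of α.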
Table 9.I: Summary of performance features significant in post-hoc pairwise timbre
comparisons (for each pair, features are listed on the side of the timbre showing the
higher value).

Dry vs. Bright: Dry < Bright for sustain pedal depression.
Dry vs. Round: Dry < Round for sustain pedal use and depression.
Dry vs. Dark: Dry > Dark for variations in right-hand overlap durations; Dry < Dark for right-hand attack duration, sustain pedal use and depression, left-hand chord durations, and release durations.
Dry vs. Velvety: Dry > Velvety for hammer velocity, attack speed and key depression depth (esp. left hand); Dry < Velvety for right-hand attack duration, soft pedal depression, sustain pedal use and depression, and release durations.
Bright vs. Round: none.
Bright vs. Dark: Bright > Dark for hammer velocity, right-hand attack speed and key depression depth; Bright < Dark for right-hand attack duration and sustain pedal use.
Bright vs. Velvety: Bright > Velvety for hammer velocity, key attack speed and key depression depth; Bright < Velvety for right-hand attack duration and soft pedal depression.
Round vs. Dark: none.
Round vs. Velvety: Round < Velvety for soft pedal depression.
Dark vs. Velvety: none.
9.3.5 Gestural descriptions of piano timbre in time

Furthermore, we assessed the behaviour in time, across all the pianists' performances,
of the performance gesture and touch features from which minimal, complete and unique
portrayals of the five timbral nuances can be drawn. Performances of each piece
were analysed separately, as musical events cannot be made to match in time between
different pieces or be meaningfully synchronized; thus a global analysis of all pieces,
regardless of their different sequences of musical events, would be meaningless. The
evolution in time of each performance was defined as the sequence of all its chord onsets.
Features were thus analysed as series of feature values for each chord, set in time at
chord onsets. Moreover, all performances of one piece were time-stretched and offset so
that the earliest and latest chord onsets were all synchronized, regardless of the duration
of each performance. 12
In order to ensure that the chords and notes corresponding to the same musical events
in all performances of the same piece could be directly compared and assigned to the
same time slot (without resorting to heavy, complex, and more error-prone time warp-
ing), sliding windows were used, with 50% overlap. This way, each chord whose onset
falls within the time frame defined by one position of the window is taken into account,
and the mean over all such chords for the feature examined is assigned to this window,
set at its central position in time. As the design of the sliding windows lets chords be
included within one or two time frames, we thus obtain a smoothed-out representation
of the evolution of features in time across several performances of the same piece, with
the chords from different performances corresponding to the same musical event taken
into account within the same window. To ensure the latter is the case, different window
sizes were tested, with the best fit empirically found for a 1000-ms window size.

12. For each performance, time stretching was applied uniformly, according to the ratio of the mean
duration (for first to last chord onsets) of all performances of the piece to the duration of this performance.
Finally, the values assessed for one given feature over all performances of the same
piece were averaged by timbre for each 1000-ms time frame. Thus five curves, one for
each timbre, were produced for each feature evaluated, representing the average evolution
in time over all same-timbre performances of one piece, along with vertical bars showing
the 95% confidence intervals (±2 standard errors), plotted at each central position of the
1000-ms, 50%-overlap sliding windows (see Figures 9.7, 9.8, 9.9 and 9.10). Correspondence
with the score of each piece was established through the audio recordings of the
performances. In each figure, the score, the waveform of a performance of the piece, and
the relative position of bars are indicated with dashed vertical lines. Thus, the evolution
of gestural features in time can be set in relation to the musical structures.
Means per timbre across time frames were also calculated; these can differ from the
averages per timbre used for all previous results. Indeed, those previous averages were
first calculated as means over all the chords of each performance, then averaged over
same-timbre performances. Here, however, averages are first calculated between same-timbre
performances at a given time frame (which can also include several chords per
performance), and the means per timbre are then computed across time frames. This
different process can thus yield different means per timbre.
Among the 13 features that make up the minimal, complete and unique gestural por-
trait of each of the five timbral nuances, only the four features whose evolution in time is
most relevant, sensible and interpretable were selected: key depression depth, hammer
velocity, sustain pedal depression and chord overlap. Indeed, the discarded features ei-
ther remained essentially constant per timbre over time (e.g. soft pedal depression), were
too redundant at each time frame level with one of the selected features (e.g. attack dura-
tion with hammer velocity), or simply were not available for this mode of representation
(e.g. variations in time of key attack speed).
[Figure: Pièce 1 (Frédéric Chiasson). Four stacked panels plot key depression depth, hammer velocity, sustain pedal depression and chord overlap against time (0–10000 ms), with per-timbre means shown at the right of each panel; legend: Bright, Round, Dry, Dark, Velvety.]

Figure 9.7: Evolution in time, over all performances of Piece no.1 grouped by timbre,
of four gestural control features essential in timbre differentiation (X-axes: time in ms;
Y-axes: feature values). The coloured lines follow the averages per timbre per 1000-ms
time frame, with ±2 S.E. bars at the centre of each frame. Means between time frames
per timbre (with ±2 S.E. bars) are presented on the right side for each feature.
9.3.5.1 Piece no.1

Piece no.1 (Figure 9.7) starts quite piano, with low hammer velocity and shallow
key depression in the first bar. Key depression depth and hammer velocity then increase
immediately and stay consistently high until the third bar for the Bright timbre; mean-
while for other timbres (especially Velvety), the increases in key depression depth and
hammer velocities are progressive (crescendo). Up to this point those four timbres are
also quite separated in hammer velocity, though not so much in key depression depth. A
singular point occurs between 4s and 5s, which corresponds to the beginning of bar 4.
Key depression becomes far deeper there for Bright and Dark timbres, while it remains
stable for the other three timbres. There is synchronously a large decrease in hammer
velocity for Bright, while Dry stays stable and the other three continue their crescendo.
Both key depression depth and hammer velocity are then fairly stable until the end of
the piece. In more detail, though, the Velvety timbre first follows a decrescendo in ham-
mer velocity before stabilising low; likewise, to a lesser extent, for Dark and Round at
mostly identical, higher intensity, save for a peak for Round around the half-note chords
in bar 6. As for key depression depth, a slight decrease is first observed during bar 4,
then followed by a slight, slow increase during bars 5 and 6, with the Velvety timbre
fluctuating more. For the final left-hand chord, only the Dark timbre shows a change
in key depression depth patterns by decreasing noticeably, while all timbres increase in
intensity, especially Round and Velvety. On average over time frames, for both features
Bright and Dry are significantly higher, Round and Dark in the middle and Velvety the
lowest.
Piece no.1 also starts with rather little depression of the sustain pedal and not much
overlap. Sustain pedal depression then tends to vary, per timbre, from time frame to
time frame for the rest of the piece, oscillating around its average value. However, pedal
depression for the Dry timbre tends to increase from 5s to 10s (bars 4 to 6). There are
also two singular points: the first between 4s and 5s corresponding to the end of bar 3
/ beginning of bar 4, where the sustain pedal depression drops for Dry and Velvety, and
the second at the end, when depression increases sharply, except for Round, and most
noticeably for Dark, in order to hold the last chord. On average, the Dry timbre is clearly
set apart, with around one third less average pedal depression than the other four, fairly
equal timbres.
Finally, chord overlaps progress in a rather consistent pattern between timbres over the
first two bars: the most overlap for Dark, the least for Bright and Dry, with Round and
Velvety in between. Overlap increases over the first three time frames, resulting in a peak
at the third time frame (except for Dry). The pattern then becomes messier for the rest of
the piece, with timbres intertwined and oscillating. On average, Dry and Bright show the
least overlap, followed by Round, Velvety and Dark in order of increasing overlap. Yet the
error bars are quite large, as overlap varies wildly in time.

9.3.5.2 Piece no.2

Piece no.2 (Figure 9.8) is most evidently characterised by extremely low key depres-
sion depths and hammer velocities at the second time frame, which must correspond to
the sole, left-hand quarter-note dyad at the beginning of bar 2. This chord is thus played
very faintly, for all timbres but especially for Dry. After that, both patterns are fairly
stable, with Bright and Dry higher, Velvety and Dark lower, and Round in between.
One may notice a crescendo between 2s and 9s for the Velvety timbre, and a slight de-
crescendo for all timbres from 11s on to the end. Time averages per timbre are largely
affected by a decrease in the second time frame; thus Bright, Round and Dry become
equal in key depression depth, with Dark and Velvety lower, while for hammer velocity
Round is significantly lower than Bright and Dry, with Dark and Velvety still lower.
As for sustain pedal depression, we can also observe a decrease at the second time
frame. Sustain pedal depression is constantly lowest for the Dry timbre, most signifi-
cantly in the middle section (bars 3–4), and augments a little by the end. The other four
timbres are fairly closely matched, except for Bright in the first two bars (much lower).
[Figure: Pièce 2 (Stacey Brown). Four stacked panels plot key depression depth, hammer velocity, sustain pedal depression and chord overlap against time (0–16000 ms), with per-timbre means shown at the right of each panel; legend: Bright, Round, Dry, Dark, Velvety.]

Figure 9.8: Evolution in time, over all performances of Piece no.2 grouped by timbre,
of four gestural control features essential in timbre differentiation (X-axes: time in ms;
Y-axes: feature values). The coloured lines follow the averages per timbre per 1000-ms
time frame, with ±2 S.E. bars at the centre of each frame. Means between time frames
per timbre (with ±2 S.E. bars) are presented on the right side for each feature.
The amount of depression is rather consistently high for those four timbres from 3s (end
of bar 2) to the end, with the exception of two peaks where it increases similarly for
all timbres. Those peaks (one between 6s and 7s, the other between 11s and 12s) cor-
respond to the (right-hand) dotted quarter-note dyad at the beginning of bar 4, and to
the quarter-note chords at the beginning of bar 6 (respectively), with the sustain pedal
depressed to hold them. On average, the Dry timbre is clearly set apart again for this piece,
this time with Bright showing a little less pedal depression than the other three timbres.
Overlaps increase over bar 2 (from 2s to 5s), decrease slightly over the end of bar 3,
then reach a first peak at the beginning of bar 4 and a second at the left-hand eighth
notes at the end of bar 4. Those peaks in overlap are most salient for Dark and least
for Dry (both peaks) and Bright (the second only). After that, overlapping decreases
sharply, only to increase gradually (though with fluctuations) until the end. During this
increase, Dark has the most overlap, Dry the least. On average over time, timbres are
ordered Dry, Bright, Velvety, Round and Dark in increasing overlap.

9.3.5.3 Piece no.3

In Piece no.3 (Figure 9.9), the saliences in key depression depth and hammer veloc-
ity diverge between timbres, with increases or decreases occurring over different time
frames for different timbres. We can notice shallower key depressions and lower ham-
mer velocities in the first time frame, then a decrease in key depression depth in the
fourth (corresponding to the end of bar 1), more marked for Bright, Round and Dry tim-
bres, and accompanied by lower intensities for Dry, Dark and Round to some extent.
Intensities and key depression depths then stay mostly stable during bar 2, tend to de-
crease during the trill, and finally increase gradually until the end. On average timbres
are rather well separated, ordered Bright, Dry, Round, Dark and Velvety in decreasing
intensity and key depression depth.
The sustain pedal is used non-stop over bar 1, with mostly constant depression for
Bright, increasing for Round and Dry (albeit the least), and decreasing over the end
[Figure: Pièce 3 (Ana Dall'Ara Majek). Four stacked panels plot key depression depth, hammer velocity, sustain pedal depression and chord overlap against time (0–12000 ms), with per-timbre means shown at the right of each panel; legend: Bright, Round, Dry, Dark, Velvety.]

Figure 9.9: Evolution in time, over all performances of Piece no.3 grouped by timbre,
of four gestural control features essential in timbre differentiation (X-axes: time in ms;
Y-axes: feature values). The coloured lines follow the averages per timbre per 1000-ms
time frame, with ±2 S.E. bars at the centre of each frame. Means between time frames
per timbre (with ±2 S.E. bars) are presented on the right side for each feature.
for Dark and Velvety. The pedal is mostly released before bar 2, then depressed again
(decreasingly so) over bar 2. The pedal is used less during the trill, then reaches its
highest depression at the end of bar 3. Finally, it is progressively released until the end
of the piece. While this pattern is mostly similar for all timbres, the curves sit around
vastly different amounts of depression for Dry, for Bright, and for the other three timbres.
Patterns of chord overlaps per timbre are quite similar over the first two bars, with
overlaps quickly increasing over bar 1 then decreasing slowly or stabilising over bar
2. Overlapping inevitably increases during the trill, then decreases sharply, with little
overlap for all timbres at the end of bar 3. Finally, overlaps increase in bar 4, more so for
timbres Round, Dark and Velvety. Still, on average, overlaps are much smaller for Dry,
then Bright, larger for Velvety and Round, and significantly larger still for Dark.

9.3.5.4 Piece no.4

In Piece no.4 (Figure 9.10), both key depression depth and hammer velocity evolve
comparably for all five timbres. In more detail, key depression is quite shallow over the
first three left-hand quarter-note dyads, while hammer velocity is somewhat low as well.
Then, as regards key depression depth, timbres Bright, Dry and Dark ebb and flow rather
similarly, with slight decreases during each bar. Meanwhile, the Round timbre is more
stable, and the Velvety timbre, always the lowest, tends to increase in key depression
depth over the first two bars then to slightly decrease until the end. Likewise, hammer
velocity shows for all timbres a pattern of crescendo over the first part of the piece, then
decrescendo till the end. Intensity is by far the lowest for the Velvety timbre, and the
highest for both Dry and Bright (yet with more fluctuations between time frames for
Dry). Round and Dark are of comparable, average intensity in the middle of the piece,
but with decreases for Dark and increases for Round in other time frames.
Sustain pedal depression increases for all timbres over the first bar, then decreases
sharply before bar 2 (which interrupts the sustained right-hand chord), most significantly
for Dry, then Bright, and least significantly for Velvety. Over the next bar, pedal depres-
[Figure 9.10 plot: four panels (Key depression depth, Hammer velocity, Sustain pedal depression, Chords overlap) for Pièce 4 by Frédéric Chiasson; legend: Bright, Round, Dry, Dark, Velvety.]
Figure 9.10: Evolution in time, over all performances of Piece no.4 grouped by timbre,
of four gestural control features essential in timbre differentiation (X-axes: time in ms;
Y-axes: feature values). The coloured lines follow the averages per timbre per 1000-ms
time frame, with ±2 S.E. bars at the centre of each frame. Means between time frames
per timbre (with ±2 S.E. bars) are presented on the right side for each feature.
sion is both high and stable for all timbres but Dry, yet in slow, constant decrease for
Bright (and for Round in bar 3). In the meantime, pedal depression remains lowest for
Dry (although it increases along bars 2 and 4). Between the bars the pedal is released
to some degree, most strongly for Bright. On average, sustain pedal depression is by far
the lowest for Dry, and slightly lower for Bright than the other three timbres.
Chord overlaps are small at the beginning, when only the left hand plays. Then
over bars 1, 2 and 3 overlaps gradually increase, only to fall back low at the beginning
of the next bar. Finally, overlaps tend to decrease over the last bar. This pattern is
roughly similar for all five timbres, yet less salient for Velvety. Furthermore, for a Bright
timbre the gradual increase per bar is much larger in range in bar 2. And overlaps are
relatively smaller for a Dry timbre over bar 1 than afterwards. Otherwise, on average as
at each time frame, overlaps are largest for Dark then Round, smaller for Velvety, and
the smallest for Dry and Bright.

9.3.5.5 Conclusion

To summarise this section: overall, the timbre profiles in time depend most on the
position within bars or phrases, as those profiles (the evolution in time of mean feature
values per timbre) tend to follow a certain pattern for the duration of a bar, and tend
to undergo more or less drastic changes at the transitions between bars. Indeed, means
per timbre of the four features tend to remain stable, or increase/decrease linearly over
the duration of each bar, yet at each bar transition those directions can either change
(e.g. chord overlaps between bars 5 and 6 in Piece no.2), stay the same (e.g. the rather
linear, continuous decrease in hammer velocity over bars 3 and 4 in Piece no.4), or be
drastically reset (e.g. the sustain pedal at least partially released and the chord overlaps
interrupted between the bars in Piece no.4), after which timbre profiles can either be
repeated from the previous bar or change course.
Within those bar-long patterns though, averages per timbre can differ relative to each
other, in mean value (like sustain pedal depression for a Dry timbre in all pieces, found
almost always much lower than for other timbres) but also in amount, speed or acceler-
ation of increase or decrease, and especially in fluctuations between time frames within
a bar-length pattern.
Finally, timbres mostly differed at bar transitions, especially by the degree to which
the feature values would drop, and at the extremes reached during bars — i.e. specific
peaks or troughs of differing amplitude and sometimes timing.
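The frame-wise averaging behind the timbre profiles discussed in this section can be sketched as follows — a minimal illustration with hypothetical sample times and values, not the study's data: feature samples are grouped into fixed 1000-ms frames, and each frame receives a mean and a ±2 standard-error band.

```python
import numpy as np

def frame_means(times_ms, values, frame_len=1000):
    """Mean and ±2 S.E. of `values` grouped into fixed-length time frames."""
    times_ms = np.asarray(times_ms, dtype=float)
    values = np.asarray(values, dtype=float)
    frames = (times_ms // frame_len).astype(int)
    out = {}
    for f in np.unique(frames):
        v = values[frames == f]
        se = v.std(ddof=1) / np.sqrt(len(v)) if len(v) > 1 else 0.0
        out[int(f)] = (v.mean(), 2 * se)  # (frame mean, 2 standard errors)
    return out

# Hypothetical hammer-velocity samples (arbitrary units) over two 1000-ms frames
res = frame_means([100, 600, 900, 1200, 1800], [80, 90, 100, 60, 70])
```

The same computation, run per timbre, yields the coloured curves and error bars of Figures 9.7 to 9.10.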

9.3.6 Piece-wise gestural description of piano timbre

In order to draw a more complete picture of the gestural control of piano timbre
within the context of this experimental design, each of the four pieces’ performance
datasets was analysed individually. The same statistical procedure of variance analysis
was thus applied to the performance dataset of each piece, separately, with 20 samples
of differing (pianist x timbre) conditions in a three-repetition design, and timbre as factor for the dependent variables (i.e. the performance features). The variance analysis tests
yielded, for Pieces no.1, no.2, no.3 and no.4 respectively, 37, 45, 124 and 40 features
significant at the 5% level. 13
These sets of significant features were reduced for each piece to a smaller number of
features, by removing redundancies in both meaning and mean-value patterns between
features. Minimal and unique gestural portraits for each of the five timbres were thus
identified for each of the four pieces.
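As a rough sketch of this screening step (with hypothetical numbers, not the study's data), a one-way ANOVA with timbre as the single factor can be computed per feature; the thesis also reports Welch's variant where group variances are unequal.

```python
import numpy as np

def one_way_anova(groups):
    """Classic one-way ANOVA: F statistic and degrees of freedom."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = np.concatenate(groups).mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_b, df_w = k - 1, n - k
    F = (ss_between / df_b) / (ss_within / df_w)
    return F, (df_b, df_w)

# Hypothetical values of one performance feature, 4 performances per timbre
bright, dry = [10, 11, 12, 11], [9, 10, 10, 9]
round_, dark, velvety = [6, 7, 6, 7], [4, 5, 4, 5], [2, 3, 2, 3]
F, dfs = one_way_anova([bright, dry, round_, dark, velvety])
# With 20 samples in 5 groups, dfs == (4, 15), matching the F(4, 15) tests
# reported in this chapter; the feature counts as significant at the 5% level
# when F exceeds the corresponding critical value (about 3.06 for F(4, 15)).
```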

9.3.6.1 Piece no.1

Four features suffice to give a minimal yet complete account of the 37 features signif-
icant in distinguishing timbre in Piece no.1. Those four features of right-hand dynamics
and depth, and of sustain pedal depression (overall and at hammer impact on the strings),
are presented in Figure 9.11.
13. For purely statistical reasons, with smaller sample sizes fewer features could prove significant in
discriminating between timbres.
Figure 9.11: Kiviat chart of the four gestural features giving a minimal and unique de-
scription in Piece no.1 of the five timbral nuances explored in the study. Z-scores per
timbral nuance for each feature are indicated with colour-coded dots. The five colour-
coded, dot-linking closed lines thus represent the gestural portraits of each timbre. The
shades around each closed line show the ±1.96 S.E. intervals (95% confidence interval).
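The z-scores plotted on such charts can be sketched as standardising each timbre's mean against the feature's pooled distribution — a minimal illustration with hypothetical pedal-depression values, not the study's data.

```python
import numpy as np

def timbre_z_scores(values, labels):
    """z-score of each timbre's mean against the pooled feature distribution."""
    values = np.asarray(values, dtype=float)
    mu, sigma = values.mean(), values.std(ddof=1)
    return {t: float((values[labels == t].mean() - mu) / sigma)
            for t in np.unique(labels)}

# Hypothetical sustain-pedal depression values for ten performances
vals = np.array([5, 6, 40, 42, 55, 57, 60, 62, 58, 61])
labs = np.array(["Dry", "Dry", "Bright", "Bright", "Round", "Round",
                 "Dark", "Dark", "Velvety", "Velvety"])
z = timbre_z_scores(vals, labs)  # one point per timbre on a Kiviat axis
```

Each feature axis of the Kiviat chart carries the five resulting z-scores, one per timbral nuance.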

– Right-hand hammer velocity (F(4, 15) = 3.966, p = 0.022, effect size r = 0.49):
each timbre is rather clearly distinguished in intensity, with Bright, Dry, Round,
Dark and Velvety in descending order (same as in the general case).

– Right-hand key depression depth (χ²(4) = 9.8, p = 0.044, effect size r = 0.516)
shows almost the same pattern as hammer velocity, yet with Dark closer to Velvety
(less depth in key depression in proportion to its intensity).

– Sustain pedal depression (F(4, 15) = 3.999, p = 0.021, effect size r = 0.442):
like in the general case, the sustain pedal is most heavily depressed for Velvety,
Round and Dark, and all but absent for Dry. Yet sustain pedal depression is much
higher for Bright in this piece.
232

– Sustain pedal depression at hammer hit (F(4, 15) = 7.839, p = 0.001, effect
size r = 0.581): it indicates, for each note, the depth of pedal depression at the
instant at which the hammer reaches its maximum velocity (which is essentially
simultaneous with the hammer impact on the strings). In comparison with the
very similar previous pattern, the sustain pedal is just slightly more depressed
at hammer impact for Dark than Round then Velvety, which may indicate (ever
so slightly) that for a Dark timbre the sustain effect is especially sought at the
instant a note is produced rather than during note sustain, the opposite being true
of Velvety. For a Bright timbre, however, the noticeably smaller relative value in
pedal depression at hammer impact vs. overall indicates that the sustain effect is
especially sought over note sustains.
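Under the assumption that the pedal position is recorded as a continuous trajectory, sampling it at each hammer-impact instant can be sketched with linear interpolation (all numbers hypothetical):

```python
import numpy as np

# Hypothetical sustain-pedal trajectory: sample times (ms) and depression values
pedal_t = np.array([0, 100, 200, 300, 400], dtype=float)
pedal_v = np.array([0, 50, 100, 100, 20], dtype=float)

# Instants of maximum hammer velocity for three notes
# (essentially simultaneous with hammer impact on the strings)
hit_t = np.array([150.0, 250.0, 350.0])

# Pedal depression at each hammer hit, then averaged over the excerpt
pedal_at_hit = np.interp(hit_t, pedal_t, pedal_v)
mean_at_hit = float(pedal_at_hit.mean())
```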

9.3.6.2 Piece no.2

Among the 45 significant features for Piece no.2, 8 were selected that give a mini-
mally complete portrayal of the five timbres (see Figure 9.12).
– Hammer velocity (FWelch(4, 26.937) = 3.28, p = 0.026, effect size r = 0.386):
same pattern of timbre z-scores as in the general case.

– Left-hand variations in key attack duration (FWelch(4, 26.794) = 3.382, p = 0.023, effect size r = 0.257): variations between chords of the durations of left-
hand key attacks. Related to variations in both attack speed and touch, they are
largest by far for Dark, thus the timbre which features the most diverse profiles in
left-hand attacks. In decreasing order of variations, Round, Velvety, Bright then
Dry follow.

– Note duration (F(4, 15) = 8.652, p = 0.001, effect size r = 0.588): as defined at
the keyboard by the time elapsed from beginning to end of a key depression, this
feature can be related not solely to tempo (a non-timbral performance parameter)
but to articulation as well, with one note finishing well after the next one starts
Figure 9.12: Kiviat chart of the eight gestural features giving a minimal and unique
description in Piece no.2 of the five timbral nuances explored in the study. Z-scores per
timbral nuance for each feature are indicated with colour-coded dots. The five colour-
coded, dot-linking closed lines thus represent the gestural portraits of each timbre. The
shades around each closed line show the ±1.96 S.E. intervals (95% confidence interval).

when played legato. Timbres are ordered Dark, Velvety, Round, Bright and Dry
in decreasing note durations (thus decreasing legato).

– Left-hand note sustain duration (F(4, 15) = 4.426, p = 0.015, effect size r =
0.454): note duration minus its attack and release durations, or duration of the por-
tion of the note where the key is kept depressed, after maximum depth is reached
in attack and before its release. Likewise, this feature is not only an indication
of tempo but of articulation as well. In contrast with overall note durations,
left-hand note sustain is much shorter for Velvety.
These last two features are more prominent in Piece no.2, as it contains more
single notes (especially for the left hand), and thus offers more opportunities to
connect them in a legato articulation.
– Release duration (F(4, 15) = 6.065, p = 0.004, effect size r = 0.468): time taken
for key release, mostly accounting for articulation. Key releases are longer for a
Velvety timbre in this piece than in the general case.

Concerning articulation, in Piece no.2 the production of a Dark timbre requires long-
held notes with short releases — thus a legato articulation with notes overlapping at the
keyboard while deeply depressed. For a Velvety timbre, although notes are held long,
with the left hand especially they are held deep for only a short time, and consequently
releases are very long; the articulation is still legato in the right hand, but less so in the
left hand, and with smooth overlap transitions between keys. The three durations hereby
discussed are fairly balanced in a Round timbre, and indicate a somewhat legato, tending
to intermediate, articulation. For a Bright timbre, notes are a little more articulated in
the left hand, but overall are non-legato. Finally, Dry is definitely played staccato.

– Variations in onset synchrony between chords (FWelch(4, 27.386) = 3.149, p = 0.03, effect size r = 0.109): variations in the onset timing of notes in a chord,
ranging from tight (all notes in sync) to loose. The Dry timbre shows the fewest
variations, with Dark also fairly constant, then Bright, Velvety and Round showing
the most differences in chords’ “tightness” during performances.

– Sustain pedal depression (F(4, 15) = 3.61, p = 0.03, effect size r = 0.47): simi-
lar to the general case, although with some more pedal depression for Bright.

– Sustain pedal use (FWelch(4, 26.325) = 10.309, p < 10⁻³, effect size r = 0.659):
like the general case, with even less pedal for Dry. For a Dark timbre, the sustain
pedal is used longer but not more depressed, which indicates some surface ped-
alling.
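The keyboard duration features discussed for this piece (attack, sustain and release portions of a note) can be sketched from a sampled key-depth trace — a minimal illustration; the 5%-of-maximum threshold delimiting the sustain portion is an assumption, not the thesis's exact definition.

```python
import numpy as np

def note_durations(t_ms, depth):
    """Split one key depression into attack / sustain / release durations.

    Attack: start of depression to the instant of maximum depth.
    Sustain: time spent near maximum depth (within 5%, an assumed threshold).
    Release: from the end of the sustain until the key is back up.
    """
    t = np.asarray(t_ms, dtype=float)
    d = np.asarray(depth, dtype=float)
    i_max = int(np.argmax(d))
    near_max = np.nonzero(d >= 0.95 * d[i_max])[0]
    i_end = int(near_max[-1])
    return t[i_max] - t[0], t[i_end] - t[i_max], t[-1] - t[i_end]

# Hypothetical key-depth trace sampled every 10 ms
t = np.arange(0, 110, 10)
d = [0, 60, 120, 240, 240, 240, 240, 240, 120, 60, 0]
attack, sustain, release = note_durations(t, d)
# note duration = attack + sustain + release = t[-1] - t[0]
```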
9.3.6.3 Piece no.3

Although 124 features proved significant in differentiating timbres in Piece no.3 (about three times as many features as for the other pieces), after removal of redun-
dancies a minimal set of just 6 features was obtained. The gestural portrait of the five
timbres over those 6 features for Piece no.3 is presented in Figure 9.13.

Figure 9.13: Kiviat chart of the six gestural features giving a minimal and unique de-
scription in Piece no.3 of the five timbral nuances explored in the study. Z-scores per
timbral nuance for each feature are indicated with colour-coded dots. The five colour-
coded, dot-linking closed lines thus represent the gestural portraits of each timbre. The
shades around each closed line show the ±1.96 S.E. intervals (95% confidence interval).

– Attack duration (F(4, 15) = 4.058, p = 0.02, effect size r = 0.296): from longest
to shortest are timbres Velvety, Dark, Round, Bright then Dry, with Velvety much
longer than Dark and Dry much shorter than Bright compared to the general case.

– Sustain pedal depression at onsets (F(4, 15) = 19.527, p < 10⁻³, effect size
r = 0.753): depression depth at the instant of note onset (here slightly more significant
than mean depression over chord durations). The sustain pedal proves a little more
depressed at onsets for a Dark timbre than for Velvety and Round; much less
depression appears here for a Bright timbre, and still all but none for Dry.

– Sustain pedal use (F(4, 15) = 11.038, p < 10⁻³, effect size r = 0.638): sustain
pedal is used relatively longer for Dark and much shorter for Bright than in the
general case.

– Release duration (F(4, 15) = 5.292, p = 0.007, effect size r = 0.533): almost
comparable to the general case.

– Overlap durations between same-hand chords (F(4, 15) = 9.341, p = 0.001, effect size r = 0.64): describes articulation in each hand. Overlaps are longest for
Dark, still long for Velvety and Round, shorter for Bright and shortest for Dry.
From those last two features, articulation is most legato for Dark, very legato for
Velvety and Round as well, non-legato for Bright, and staccato for Dry. Release
durations for Velvety performances may be lengthened not during articulation be-
tween notes, but rather at the end of suspended notes, of which there are many in
this piece.

– Variations in overlap between same-hand chords (F(4, 15) = 7.604, p = 0.001, effect size r = 0.576): evaluates the variations in the depression amount of overlap-
ping chords or single notes played by the same hand. The more staccato timbres
show more variations in overlap because the range from no overlap to some over-
lap is greater, and especially because of the trill in this piece, where some overlap
between notes is unavoidable regardless of the staccato intentions. More interest-
ingly, the Round timbre, while not played the most legato, is the most consistent
in overlap amount.
9.3.6.4 Piece no.4

Seven features were selected among the 40 that proved significant for Piece no.4, and
give the minimal, complete portrayal of the five timbres presented in Figure 9.14.

Figure 9.14: Kiviat chart of the seven gestural features giving a minimal and unique
description in Piece no.4 of the five timbral nuances explored in the study. Z-scores per
timbral nuance for each feature are indicated with colour-coded dots. The five colour-
coded, dot-linking closed lines thus represent the gestural portraits of each timbre. The
shades around each closed line show the ±1.96 S.E. intervals (95% confidence interval).

– Mean key depression (F(4, 15) = 3.125, p = 0.047, effect size r = 0.339 overall;
F(4, 15) = 3.304, p = 0.039, effect size r = 0.376 left hand) gives for each note
the sum of its instant key depressions divided by its duration; it will be greater for
a long, deeply depressed note. As such, mean key depression is greater for Dark
and Round timbres, then Bright, and quite low for Velvety (shallow key depression)
and Dry (short notes). Meanwhile, mean key depression for left-hand notes only
is greater for Dark, the rest being equal.
– Right-hand variations in hammer hit timing (FWelch(4, 22.202) = 3.763, p = 0.018, effect size r = 0.143): variations for right-hand notes in the delays between
the instants of deepest key depression and maximum hammer velocity; this in
fact accounts for variations in attack speed, as the faster the attack the earlier the
hammer impact on the strings will be compared to the deepest key depression.
Thus, right-hand attack speeds vary most for Bright, moderately for Round, Dry
and Dark, and least for Velvety.

– Variations in melody lead (χ²(4) = 10.871, p = 0.028, effect size r = 0.572): variations in the lead time of the first note played in a chord over the others. These
variations are more pronounced for Dry and Bright, and smaller for Round and
Dark then Velvety.

– Sustain pedal depression (F(4, 15) = 6.593, p = 0.003, effect size r = 0.62):
same pattern as in the general case, yet with more pedal depression for Bright.

– Sustain pedal use (F(4, 15) = 6.681, p = 0.003, effect size r = 0.597): same
pattern as in the general case (just slightly less pedal for Velvety).

– Release duration (F(4, 15) = 3.942, p = 0.022, effect size r = 0.471): same pat-
tern as in the general case, with just slightly shorter releases for Velvety, and even
shorter releases for Dry.
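The mean key depression feature defined above (sum of instantaneous key depressions over a note, divided by its duration) can be sketched as follows, with a hypothetical key-depth trace and sampling period:

```python
import numpy as np

# Hypothetical key-depth samples for one note, taken every 10 ms
dt_ms = 10.0
depth = np.array([0, 80, 160, 240, 240, 240, 160, 80, 0], dtype=float)
duration_ms = dt_ms * (len(depth) - 1)

# Sum of instantaneous depressions over the note, divided by its duration;
# a long, deeply depressed note yields a larger value
mean_key_depression = float(depth.sum() * dt_ms / duration_ms)
```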

In summary, some characteristics of the gestural control of piano timbre could come
out as significant in differentiating timbres for only one of the four pieces performed in
this study, as some performance techniques could be fittingly achieved only in one piece
due to the compositional peculiarities of each piece. As a consequence, note durations
(especially in the left hand) could prove significant only in Piece no.2, as a descriptor
of articulation between the numerous left-hand single notes in this piece. Articulation
and overlap durations were most significant in Piece no.3, as its compositional style of
alternating dyads and single notes and its many suspended notes seem well suited for
using articulation as a timbre expression technique. Meanwhile, mean key depression
(a descriptor of combined depression depth and duration of a note) and the variations in
both melody lead (synchrony of the earliest note in a chord) and hammer hit timing (es-
sentially, attack speed) are significant only for Piece no.4, as its repeated left-hand dyads
and “rolling” right-hand dyads (respectively) may well befit the control and expressive
timbral use of each of these parameters.
Conversely, the results for some pieces lack features that were significant in the
general-case, all-pieces analysis. For instance, Piece no.1, with its chords, is ill-
suited to using articulation as a timbre expression tool; likewise, resorting primarily to
intensity for timbre expression was easier to avoid in Pieces nos.3 and 4.
Nevertheless, complementarity between the four pieces was sought out above all,
in order to account for diverse techniques and styles more or less prominent in each
composition. The general results in timbre discrimination significance of performance
gesture and touch features could thus be confirmed in all four pieces regarding the use
and depression of the sustain pedal, in all but Piece no.1 for articulation and release
durations, and in the first two pieces regarding intensity, while the previously mentioned
features idiosyncratically significant for only one piece give further information about
timbre expression in specific compositional contexts.

9.4 Individual results

9.4.1 Comments from the participants

The participants were interviewed right after they completed the task, and their obser-
vations during the task were collected as well. The task was deemed a positive, original
and beneficial (even playful for FP) experience by all, in that it forces a direct reflection
on timbre in and of itself. RB stressed how this direct reflection differs from the usual
process of trying to find the one most appropriate interpretation of a piece, and it required
much thought to find five approaches to performing each piece with each timbre nuance,
with justifiable choices to be made by the performer instead of relying on the texts to
respect the composer’s idea. The four proposed pieces actually helped in that, as all
participants agreed they were short, neutral and complementary enough to suit the task
very well (personal musical tastes notwithstanding). The pieces were also considered
technically approachable (with more reserve for Piece no.2 though). The participants
could thus find natural ways to colour them in the five submitted timbres.
The participants had different affinities with the concept of timbre. For PL, timbre
was not a focal point in her piano learning process, but was essentially a by-product of the
imagery evoked by a piece through its musicality, its historical context or synaesthetic
impressions, colours mostly. This resulted in her understanding of the timbre descriptors
through colours and images.
On the contrary, BB learned to approach timbre through gesture and especially touch,
as an integrated technical process paramount to the musicality in interpretations. The
understanding of timbre descriptors however went through imagery, which enabled BB
to go beyond the surface semantic dichotomies between Bright–Dark and Round–Dry
and picture five distinct timbral nuances with their own, complex meaning.
Timbre is a familiar concept to RB, and the terminology for the five descriptors was
deemed traditional, although some terms could designate more than just timbre amongst
the several aspects of musical expression. For RB, Dark could describe an atmosphere
as well, Bright and Dry a playing style, and Velvety touch.
Lastly, FP sees timbre as a characteristic of a “pure” sound, mostly contained in attacks,
and as one of several control parameters for musical character. Timbre production is instinctual for him.
Different approaches to timbre production were also adopted, some holistic, others
adapted to each piece. PL played Dark mysterious and “black” by using the soft pedal,
Dry staccato and with sharp releases of the forte pedal to cut note sustains, Round with-
out much accent, like a waltz in Piece no.4, Velvety even smoother than Round, and
Bright with accents and forte.
BB applied one touch to each timbre, whichever the piece, and provided precise techni-
cal information. For Bright, keys were pressed deep and fast, with an angle towards the
back of the keyboard. Velvety was played the same way yet with much shallower key
depressions, only down to escapement level, and with a less ample gesture. For Round,
BB used the same touch as Velvety but with much more involvement of arms and shoul-
ders, to make the motion more fluid. Round was actually the “pivot” timbre, with regard
to which BB defined his playing of the other four. The production of a Dark timbre was
also characterised by a very flat hand. On the contrary, Dry was played with the fingers
straight up, perpendicular to the keyboard.
FP also adopted a holistic approach to timbre production, common to all pieces. While
his method was primarily instinctual, he could still provide a posteriori technical de-
tails. He defined the Round timbre as neutral, not too loud, with hands well balanced.
Dry involved a short articulation, no pedal and some rubato. FP played Velvety piano,
centred, with stiff fingers, some rubato, not much bass, and a good deal of pedal. Dark
was “dramatic”, with a lot of bass and much backward rubato (delay). And Bright con-
tained much more treble, an acceleration rubato (advance), clear articulation and not
much pedal.
For RB, the process of timbre production was preceded by an analysis of each piece in
light of the historical repertoire, and thus derived separately for each piece. Still, the
production of each timbre shared many technical characteristics. Bright was always em-
phatic, conquering (sic), decided, forte, with pedal. Round meant cantabile, with the left
hand legato. Dry was played staccato, aggressive, with little or no pedal. Production of
the Dark timbre focused on different facets for each piece: the first piece played piano
with each chord played late, Piece no.2 played with few nuances, a focus on legatissimo in
Piece no.3, and for Piece no.4 cold regularity in the left hand and deep key depressions.
Finally, for Velvety RB used the una corda pedal and shallow key depressions at escape-
ment level.
The participants’ comments and perspectives on piano timbre production thus give
clues about the technical strategies involved in their individual approaches to timbre
expression, as they perceived them to apply to their performances of the five different
timbre nuances. Let us now see how their intentions match up with the performance
analysis results, pianist by pianist.

9.4.2 Pianist-wise gestural description of piano timbre

Each of the four participants’ performance datasets was explored separately. The
same statistical procedure of variance analysis was applied to the performances of each
pianist, separately, with 20 samples of differing (piece x timbre) conditions in a three-
repetition design, and timbre as factor for the dependent variables (i.e. performance
features). The analyses revealed 128, 107, 52 and 36 (respectively) features significant
at the 5% level for pianists RB, PL, FP and BB. 14 These sets of significant features
were here again reduced for each pianist to a smaller number of features, by removing
redundancies in both meaning and mean-value patterns between features. Minimal and
unique gestural portraits for each of the five timbres were thus identified for each of the
four pianists.

9.4.2.1 Pianist: RB

Among the 128 features that proved significant for timbre differentiation in RB’s
performances, 11 were selected. The resulting, minimal and complete portrayal of the
five timbres for RB’s performances is presented in Figure 9.15.
– Hammer velocity (F(4, 15) = 12.438, p < 10⁻³, effect size r = 0.74 overall;
F(4, 15) = 12.833, p < 10⁻³, effect size r = 0.744 left hand; F(4, 15) = 15.023,
p < 10⁻³, effect size r = 0.765 right hand): overall and for each hand, the patterns
14. The different numbers of significant features do not necessarily imply more differences in gesture
and touch between the five timbral nuances, as at this point, the redundancy between performance features
has yet to be removed.
Figure 9.15: Kiviat chart of the 11 minimal performance gesture and touch features that
can account for a unique description, in RB’s performances, of each of the five timbral
nuances explored in the study. Each axis on the chart corresponds to one feature, on
which each dot represents its mean z-score value for each timbre. The five colour-coded
closed lines linking same-colour dots thus represent the gestural portraits of each timbre.
The shades around each closed line show the ±1.96 S.E. intervals.

of hammer velocities per timbre are essentially identical to those of the 4-pianist
general case, yet with wider gaps in dynamic range between Velvety, Dark and
Round on the one hand, and Dry and especially Bright on the other. Dark and Dry
timbres also show more intensity in the left hand.

– Key depression depth (χ²(4) = 10.471, p = 0.033, effect size r = 0.551): while
in the same order as in the general case (and as for hammer velocity), distances
between the timbre means differ: Dark is close to Velvety in lack of depression
depth, and deep key depressions are further accentuated for Dry and Bright.

– Attack duration (F(4, 15) = 8.181, p = 0.001, effect size r = 0.661): compared
with the general case, the pattern of timbre means almost match the inverse of
hammer velocity, but with Dry attacks shorter than Bright.
– Variations in hammer velocity (χ²(4) = 10.057, p = 0.039, effect size r = 0.529): while both Bright and Dry timbres show little variation in hammer ve-
locity (thus a consistently high intensity), the pattern for the three timbres with
lower intensity is different, with somewhat low variations for Dark, and high vari-
ations for Velvety and especially for Round, which thus involve a larger dynamic
range.

– Sustain pedal use (F(4, 15) = 19.447, p < 10⁻³, effect size r = 0.806): like in
the general case, the sustain pedal is barely used for a Dry timbre, a little more for
Bright, and extensively for Round, Dark and Velvety. RB makes as much use of
the sustain pedal for a Velvety timbre as for Dark.

– Release duration (FWelch(4, 24.867) = 39.433, p < 10⁻³, effect size r = 0.759
overall; FWelch(4, 26.444) = 59.317, p < 10⁻³, effect size r = 0.746 left hand;
F(4, 15) = 4.557, p = 0.013, effect size r = 0.511 right hand): while the five
timbres are ordered the same way as in the general case, they are much more
spread out here. Patterns of release duration also differ between hands for the
Round (longer releases, i.e. a more legato left hand) and Dry (much shorter
releases on left-hand notes) timbres.

– Overlap duration (χ²(4) = 10.5, p = 0.033, effect size r = 0.553): confirms the
legato articulation for timbres Velvety, Dark and Round, the staccato articulation
for Dry, and for RB’s performances a rather staccato (or at least non-legato) artic-
ulation for Bright.

9.4.2.2 Pianist: PL

Among the 107 features that were significant for timbre differentiation in PL’s per-
formances, 10 were selected. The resulting, minimal and complete portrayal of the five
timbres for PL’s performances is presented in Figure 9.16.
Figure 9.16: Kiviat chart of the 10 minimal performance gesture and touch features that
can account for a unique description, in PL’s performances, of each of the five timbral
nuances explored in the study. Each axis on the chart corresponds to one feature, on
which each dot represents its mean z-score value for each timbre. The five colour-coded
closed lines linking same-colour dots thus represent the gestural portraits of each timbre.
The shades around each closed line show the ±1.96 S.E. intervals.

– Hammer velocity (χ²(4) = 9.614, p = 0.047, effect size r = 0.506 overall; χ²(4) =
10.114, p = 0.039, effect size r = 0.532 left hand; χ²(4) = 11.547, p = 0.022, ef-
fect size r = 0.603 right hand): while the pattern of hammer velocities per timbre
resembles the general case, the Bright-timbre performances have higher intensity
relative to other timbres. On the other hand, intensity is much lower for Dark than
in the general case. Within this pattern, the same characteristics are found in each
hand as in the general case (more intensity in the left hand for Dry and Dark).
– Key depression depth (F(4, 15) = 4.772, p = 0.011, effect size r = 0.304): this
pattern is quite different from the general case for PL’s performances. Key depres-
sion depth is even higher (relative to other timbres) for Bright, lower for Dark, and
very much lower for Dry — which is consistent with the lower relative intensity
and staccato articulation.

– Key attack duration (F(4, 15) = 3.079, p = 0.049, effect size r = 0.299): time
elapsed from the start of key depression to its instant of maximum depression.
Compared with the general case, attacks are longer for Dark, and shorter for Bright
— which can be explained by the lower and higher (respectively) intensities.

– Sustain pedal depression (F(4, 15) = 10.006, p < 10⁻³, effect size r = 0.679),
Sustain pedal use (F(4, 15) = 3.364, p = 0.037, effect size r = 0.423): equivalent
to the general case, with just slightly more depression for Dark — thus less surface
pedalling.

– Degree of staccato same-hand (χ²(4) = 13.214, p = 0.01, effect size r = 0.695): as this feature indicates the time interval from the end of one chord to the be-
ginning of the next chord, it represents the articulation type and salience: with a
positive value, the articulation is staccato, and with a negative value, the articula-
tion is legato.

– Overlap duration between same-hand chords (c 2 (4) = 10.114, p = 0.039, ef-


fect size r = 0.532): the longer the overlaps, the more legato the articulation.
From these two articulation features, we find once again that the Dry timbre is
the most staccato, Bright is rather non-legato, and Round, Velvety and especially
Dark are played legato.
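These two articulation features can be computed directly from chord onset and offset times. A minimal sketch with hypothetical timings (not the CEUS data format):

```python
# Hypothetical same-hand chord timings in seconds: (onset, offset) pairs.
chords = [(0.0, 0.45), (0.5, 1.02), (1.0, 1.5)]

def articulation_gaps(chords):
    """Interval from the end of one chord to the beginning of the next.
    Positive gap -> staccato; negative gap (an overlap) -> legato."""
    return [next_on - prev_off
            for (_, prev_off), (next_on, _) in zip(chords, chords[1:])]

gaps = articulation_gaps(chords)   # first gap staccato, second legato
```

The overlap-duration feature is simply the magnitude of the negative gaps.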

– Variations in number of overlaps between same-hand chords (F(4, 15) = 3.53,
p = 0.032, effect size r = 0.409): variations in the number of outer notes overlapping
a given chord. While variations are high for the two least-legato timbres
(Bright and Dry), the most relevant information is that Round, even though less
legato than Velvety and Dark, is more consistent.

9.4.2.3 Pianist: FP

Eight features could be selected among the 52 features that were significant for tim-
bre differentiation in FP’s performances, and which give the minimal and complete por-
trayal of the five timbres for FP’s performances that is presented in Figure 9.17.

Figure 9.17: Kiviat chart of the 8 minimal performance gesture and touch features that
can account for a unique description, in FP’s performances, of each of the five timbral
nuances explored in the study. Each axis on the chart corresponds to one feature, on
which each dot represents its mean z-score value for each timbre. The five colour-coded
closed lines linking same-colour dots thus represent the gestural portraits of each timbre.
The shades around each closed line show the ±1.96 S.E. intervals.
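Each point plotted on a Kiviat axis is a mean z-score: the feature is standardised across all performances, then averaged within each timbre. A minimal numpy sketch with illustrative random values standing in for the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)
# 20 performances: 4 per timbral nuance, in recording order (illustrative).
timbres = np.repeat(["Bright", "Dry", "Round", "Dark", "Velvety"], 4)
hammer_velocity = rng.normal(2.0, 0.5, 20)   # placeholder raw feature

# Standardise the feature across all 20 performances ...
z = (hammer_velocity - hammer_velocity.mean()) / hammer_velocity.std()

# ... then average within each timbre: one point on the Kiviat axis.
profile = {t: float(z[timbres == t].mean()) for t in np.unique(timbres)}
```

With equal-sized groups, the five per-timbre means necessarily balance around zero, which is why the charts contrast timbres relative to each other rather than in absolute units.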

– Hammer velocity (χ²(4) = 10.614, p = 0.031, effect size r = 0.559 left hand;
F(4, 15) = 3.579, p = 0.031, effect size r = 0.438 right hand): significant for
each hand, the patterns presented are singularly idiosyncratic. Hammer velocity
is by far lowest, in each hand, for Velvety, but then differs very much between
hands for the other four timbres. For Round, intensity is relatively high, especially
for the right hand. Likewise, while intensity is rather high in the left hand for a
Bright timbre, it is much higher (the highest of all timbres by far) in the right hand.
On the other hand, both Dark and Dry timbre show higher hammer velocities in
the left hand than the right, especially for Dark — the second highest left-hand
intensity behind Dry, vs. only the fourth highest for the right hand.

– Left-hand key depression depth (F(4, 15) = 3.062, p = 0.05, effect size r =
0.422): mostly correlates to the left-hand hammer velocity, except the Dark timbre
shows by far the deepest key depressions.

– Left-hand key attack duration and Right-hand Attack duration (F(4, 15) =
3.354, p = 0.038, effect size r = 0.378 left hand; F(4, 15) = 5.523, p = 0.006,
effect size r = 0.383 right hand): while Velvety key attacks are by far the longest
(corresponding to the lowest intensity) in both hands, the other four timbres are
quite close in left-hand key attack duration — ordered Bright, Round, Dry and
Dark in decreasing duration. Meanwhile, Right-hand attack durations match right-
hand intensities in timbre order, but Dark and Round (longer attacks) are quite
separated from Dry and Bright. In particular, right-hand Dry attacks, although having
less intensity than Bright ones, are almost as short.
FP’s performances thus differ most from the other participants’ performances in
the dynamics and attacks that he used for a Dark timbre, with higher intensity and
deeper, shorter attacks, especially in the left hand.

– Release duration (F(4, 15) = 3.587, p = 0.03, effect size r = 0.362) is quite com-
parable to the general case, albeit with slightly longer releases for Round.

– Soft pedal use (FWelch (4, 24.832) = 3.824, p = 0.015, effect size r = 0.364): the
only pianist for whom soft pedal use was significant, 15 FP used the soft pedal
most for Velvety, to a considerable extent for Dark as well, yet also a little for Dry
and Round, while the least for Bright.

– Sustain pedal depression (F(4, 15) = 8.297, p = 0.001, effect size r = 0.631):
matches the general case but for the Dark timbre, for which FP depressed the sus-
tain pedal less.

9.4.2.4 Pianist: BB

Although only 36 features were significant for timbre differentiation in BB’s per-
formances, eight features could still be selected. The resulting, minimal and complete
portrayal of the five timbres for BB’s performances is presented in Figure 9.18. This por-
trayal peculiarly only concerns nuances of dynamics and attack, meaning neither pedal
use nor articulation proved sufficiently significant in discriminating timbres within BB’s
performances.

– Hammer velocity (F(4, 15) = 3.884, p = 0.023, effect size r = 0.479 overall;
F(4, 15) = 4.085, p = 0.02, effect size r = 0.485 left hand; F(4, 15) = 5.018,
p = 0.009, effect size r = 0.532 right hand): Dark and Velvety timbres show by far
the lowest intensities both overall and in each hand, while Round, and Dry/Bright
more so, are far higher. Intensity is mostly well balanced between hands, with
just a little more accent on the left hand for Velvety and the right hand for Dry.
Thus, the patterns of left hand accents for Dry and Dark are absent from BB’s
performances.
15. For the other pianists, the non-use of the soft pedal in some performances yielded too small a sample
size for statistical processing.

Figure 9.18: Kiviat chart of the 8 minimal performance gesture and touch features that
can account for a unique description, in BB’s performances, of each of the five timbral
nuances explored in the study. Each axis on the chart corresponds to one feature, on
which each dot represents its mean z-score value for each timbre. The five colour-coded
closed lines linking same-colour dots thus represent the gestural portraits of each timbre.
The shades around each closed line show the ±1.96 S.E. intervals.

– Attack speed (F(4, 15) = 4.884, p = 0.01, effect size r = 0.527): while com-
parable overall to hammer velocity, attacks are noticeably faster for Bright than
Dry and for Velvety than Dark, despite similar hammer velocities. This may be
explained by the non-linearities in attack speeds, which means that the average
attack speed can differ slightly from the instantaneous attack speed at hammer launch
(directly correlated with hammer velocity). Differences in attack touch may thus
have occurred between Bright and Dry, as well as between Velvety and Dark.

– Key depression depth (F(4, 15) = 3.519, p = 0.032, effect size r = 0.46) differs
from the general case, once again with more separation between Velvety–Dark and
Round–Dry–Bright, and with deeper key depressions for Dry than Bright.

– Variations in hammer velocity (F(4, 15) = 4.505, p = 0.014, effect size r =
0.306), while generally larger for lower-intensity timbres, are noticeably larger
(thus a larger dynamic range) for Dark than Velvety.

– Key attack duration (F(4, 15) = 4.812, p = 0.011, effect size r = 0.453) mostly
inversely corresponds to hammer velocity patterns, yet with slightly shorter attacks
for Round and longer attacks for Dry and Dark.

– Left-hand touch percussiveness (FWelch(4, 27.215) = 2.767, p = 0.047, effect
size r = 0.302): this feature is an evaluation of the attack profile and its
acceleration curve: the higher the early key acceleration, the more percussive the attack
touch — as it corresponds to a key struck rather than pressed (Goebl et al., 2005).
While touch percussiveness here increases with intensity, it is more pronounced
for Bright and clearly shows Dark as the least touch-percussive timbre.
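A percussiveness measure of this kind can be approximated from the key-position signal by numerical differentiation. This is a hedged sketch: the trajectories, sampling rate and window length are illustrative, not the CEUS specifics or the exact measure of Goebl et al. (2005).

```python
import numpy as np

def touch_percussiveness(positions, fs, window_ms=10):
    """Peak key acceleration just after contact. A struck key shows a
    sharp velocity discontinuity (high early acceleration); a pressed
    key builds up speed gradually (after Goebl et al., 2005)."""
    onset = int(np.flatnonzero(positions > 0)[0])
    accel = np.gradient(np.gradient(positions)) * fs ** 2
    n = int(window_ms / 1000 * fs)
    return float(np.max(np.abs(accel[onset:onset + n])))

fs = 1000
rest = np.zeros(10)
ramp = np.minimum(40 * np.arange(40) / fs, 1.0)       # struck: velocity step
curve = np.minimum((np.arange(40) / 25.0) ** 2, 1.0)  # pressed: gradual build
struck, pressed = np.r_[rest, ramp], np.r_[rest, curve]
```

Under these assumptions the struck trajectory yields a clearly higher early-acceleration peak than the pressed one, matching the struck/pressed distinction above.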

Contrary to the other participants, the use of pedals was not significant in differen-
tiating between timbres in BB’s performances. 16 This largely explains why
BB’s performances are so closely grouped, regardless of timbre, in the gestural spaces
of piano timbre presented in Section 9.3.2. Indeed, those PCA spaces were built upon
the 192 gestural control features that were significant overall in differentiating timbres,
among which several are pedal-related — given the importance of pedalling for the other
performers.
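PCA spaces of this kind can be sketched with a plain SVD on the centred feature matrix; the dimensions below (20 performances × 192 features, random values) are only placeholders for the study's data, and the thesis's actual implementation may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder feature matrix: 20 performances x 192 z-scored features.
X = rng.standard_normal((20, 192))

Xc = X - X.mean(axis=0)                    # centre each feature
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = Xc @ Vt[:2].T                     # 2-D coordinates per performance
explained = s[:2] ** 2 / np.sum(s ** 2)    # variance ratio of each axis
```

Performances with similar gestural profiles then land close together in `coords`, which is how same-timbre clusters (or BB's tight grouping) appear in the plotted spaces.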

In conclusion, we can observe some idiosyncrasies in the gestural patterns of timbre
production for each pianist’s performances. Compared with the overall observations, RB
employed a larger separation in dynamic register and attacks between Velvety–Dark–
Round and Dry–Bright. PL used shallower key depressions for Dry timbral nuances,
16. BB actually took to an extreme the instruction (ill-advised in retrospect) not to overuse pedals as a
way to separate timbres.

and played Dark timbral nuances with lower intensity and Bright with higher intensity.
On the other hand, FP used a higher dynamic register, as well as deeper and shorter
left-hand attacks, for Dark timbral nuances. Meanwhile, BB played with more balance
in intensity between hands for Dark and Dry timbres. He also used larger separation
in dynamic register and attacks between Velvety–Dark and Round–Dry–Bright, and his
articulation and use of pedals was non-significant between timbres.
However, several characteristics of timbre production remain common to all pianists
and confirm the general-case, all-pianists results. Hammer velocity proved significant
(overall and/or per hand), and always shows high intensities for Bright and Dry, low
intensities for Velvety and Dark (save for FP), and mid-intensity for Round (although to
varying degrees). Likewise, attack duration is significant in some form for each pianist,
with the same separation as intensity between short attacks for Dry and Bright, and long
attacks for Velvety and Dark (save for FP). Moreover, key depression depth is significant
in differentiating timbres for each pianist, with the attacks always deep for Bright and
shallow for Velvety (and for Dark but for FP). The sustain pedal is also used essentially
the same way by three of the four pianists, that is heavily for Dark, Round and Velvety
timbres, less for Bright, and almost not for Dry. Finally, articulation plays a large role in
differentiating the timbral nuances for three pianists out of four, with an articulation most
legato for Velvety, Dark and Round, more intermediate for Bright, and most staccato
with Dry.

9.4.3 Comparison between participants’ comments and individual performance
analysis results

In conclusion, the gestural descriptions of piano timbre production obtained through
data analysis separately for each participant closely reflect in many ways the intentions
and self-descriptions of the process of timbre production shown in the participants’ com-
ments. RB’s descriptions of a Bright timbre as forte, of Round with left-hand legato, of
Dry staccato and with almost no pedal, of Dark more piano and legato, and of Velvety
with escapement-level key depressions, were confirmed. In PL’s performances, the stac-
cato and shallow depression of the sustain pedal for a Dry timbre, the relative lack of
distinctive features for Round (as the “medium” timbre), the softer, smoother (with shal-
lower key depressions) Velvety, and the forte dynamics and accents (with short and deep
attacks) for Bright were found in accordance with her expressed intentions. Data analyses
of FP’s performances demonstrated his strategies of playing a Round timbre neutral, not
too loud and with balance between hands, of playing a Dry timbre staccato and with
no sustain pedal, of playing a Velvety timbre piano, centred (in that it implies balance
between hands) and with a lot of pedal, of using a good deal of (left-hand) bass for
Dark, and of using much more (right-hand) treble, clear articulation (short releases) and
not much pedal for Bright. And the stated intentions of BB were accurately retrieved
in his performances’ analysis results: deep key depressions and fast attacks for Bright,
shallower key depressions for Velvety, the use of Round as the “pivot”, central timbre,
flat hands and therefore low touch percussiveness for Dark, and straight, perpendicular
fingers for Dry resulting in a high touch percussiveness.
Among the self-stated gestural and technical intentions toward timbre production that
were not seen in the performance analysis results, most were simply beyond the scope
of our explorations. As the study was focused on touch and gesture as transferred to key
motion and on pedals, other performance control parameters, like rubato, and ancillary
or indirect gesture like arms, shoulders and body involvement, were not examined. In
PL and RB’s case, their intended use of the soft pedal was not shown, though only for
statistical reasons. Little contradictory information between the pianists’ intentions and
the analytic results thus remains: RB’s pedal use for a Bright timbre was not as salient as
intended; RB’s differing approaches between pieces for producing a Dark timbre were
not identified; and the lack of bass in FP’s Velvety performances was not apparent in the
features of left-hand attack and intensity.
Yet the performance analyses also revealed more subtleties in the touch and gestural
control of piano timbre than the pianists stated. While some unmentioned control
parameters could be simple oversights, many are far from obvious and very precise, and
may not have been consciously or directly considered by the participants. These non-trivial
parameters are thus worth mentioning.
For RB, the Bright timbre is well-balanced in intensity between hands, shows deep
key depressions, and non-legato articulation; Round shows higher intensity for the right
hand, and a lot of variation in intensity overall; the Dark timbre emphasizes the left hand
dynamically; and the Velvety timbre is very legato (and piano). Meanwhile, PL applied
more intensity with the left hand and played very legato to produce the Dark timbre;
and she used shallow key depressions in producing the Dry timbre. As for FP, he played
Round and Dark with much legato and sustain pedal; and he tended to play with more
left-hand intensity to produce the Dry timbre. And for BB, the Bright timbre was also
very percussive (left hand) while the Velvety touch was non-percussive; and the Dark
timbre showed much variation in intensity through the performances.

9.5 General discussion: gestural portraits of piano timbre

In conclusion, in light of an exhaustive exploration of performance gesture and touch
in the context of piano timbre production, gestural portraits of five timbral nuances were
drawn. The production of each timbral nuance (defined by a verbal descriptor) could be
characterised by a unique pattern in the use of certain gestural control parameters.
These gestural portraits, drawn from the strategies employed by only four partici-
pants to this study, cannot be expected to represent the only solutions for producing these
timbral nuances. Both the gestural patterns of timbre production and the understanding
of the timbral nuances and their verbal descriptors could differ between pianists on a
larger scale, especially between different piano schools. Yet in the context of this study,
the gestural portraits presented below remain valid strategies for producing the five tim-
bral nuances, and were shared by four pianists of different origins and piano training.

– A Dry timbre can be obtained with high intensity, with slightly more dynamic
emphasis on the left hand, constantly fast and short attacks, yet without fully deep
key depressions; the latter also favours a very staccato articulation, with clearly
separated notes and chords; finally, both the soft and sustain pedals are hardly
used.

– The Bright timbre is produced in a high dynamic register, with a slight emphasis
on the right hand, with very short attacks, and with keys deeply depressed down
to the keybed; the soft pedal is barely used; the sustain pedal is used sparingly, but
strongly depressed then; and the articulation is intermediate, non-legato yet not
staccato either.

– The Round timbre requires an average kind of play, with no salient trait; intensity
and attacks are quite moderate, well balanced between hands, and rather constant
through performances; key depressions are not very deep, yet well below escape-
ment point; the soft pedal is barely used; the sustain pedal is used frequently and
significantly; and the articulation is quite legato.

– For a Dark timbre, one needs a sharp contrast between hands in dynamic register
and attack, with attacks and intensity very light in the right hand, while much more
marked in the bass, left hand; keys lightly depressed; a fair use of the soft pedal;
a massive, quasi-constant use of the sustain pedal; and a very legato articulation,
especially in the right hand.

– And a Velvety timbre is characterised by its very low dynamic register (piano or
even pianissimo), long attacks (especially in the right hand), very shallow key de-
pression, at the escapement point, a very legato articulation (much more so in the
left hand), prominent use of the soft pedal and heavy use of the sustain pedal as
well.

Overall, these features are characteristic of the production of each of the five timbral
nuances examined in this study, independently of the performer and the musical context
— at least to the extent of musical diversity represented in the four pieces composed for
this study, and in the limits of the strategies chosen by the four participants. Therefore, in
the context of this study, the production and gestural control of different timbre nuances
required differences in dynamics, attack (and their balance between hands), key depres-
sion depth, pedalling and articulation, whereas neither the synchrony between notes in
chords nor note sustain, intervals between chords or left-hand overlaps proved significant
in differentiating timbres.
Post-hoc, pairwise timbre comparisons also made it clear that timbres Dry–Bright
on one hand, and Dark–Velvety on the other, are the most different, in dynamic register,
key depression depth, right-hand attacks and articulation. Meanwhile, the Dry timbre
also significantly differs from all the others by its (lack of) sustain pedal use, the Velvety
timbre differs from all the others but Dark by its substantial soft pedal use, and Bright
differs from Dark by its lesser use of the sustain pedal. But Round, as the most “average”
timbre, cannot be significantly distinguished from Bright or Dark, and is only set apart
from Dry and Velvety by the use of both pedals. Likewise, Dark and Velvety cannot be
differentiated in post-hoc analyses through their gestural control.
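Pairwise post-hoc comparisons of this kind can be sketched as follows. The sketch uses Welch t-tests with a Bonferroni correction on illustrative random z-scores; this is one plausible procedure, not necessarily the thesis's exact one.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Illustrative hammer-velocity z-scores, four performances per timbre.
samples = {"Dry": rng.normal(1.0, 0.3, 4), "Bright": rng.normal(0.8, 0.3, 4),
           "Round": rng.normal(0.0, 0.3, 4), "Velvety": rng.normal(-1.0, 0.3, 4)}

pairs = list(combinations(samples, 2))
alpha = 0.05 / len(pairs)                  # Bonferroni-corrected threshold
pvalues = {(a, b): stats.ttest_ind(samples[a], samples[b], equal_var=False).pvalue
           for a, b in pairs}
significant = {pair for pair, p in pvalues.items() if p < alpha}
```

A pair like Round–Bright, whose distributions overlap heavily, would typically fail the corrected threshold, mirroring the indistinguishable pairs reported above.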
Additionally, some features dependent on musical context could be identified as char-
acteristic of some timbres, through the separate analysis of each of the four pieces in light
of their compositional specificities. The results clearly show that for Piece no.2, single
notes and melodic lines (especially in the left hand) have to be held long and deeply, and
sustained in overlap with the following notes, to produce the Dark timbre; for the Velvety
timbre meanwhile, left-hand notes are still held long and legato, yet their releases and
articulation are more smooth and gradual. And in a context of repeating, rhythmic dyads
like in Piece no.4, timbres Dark, Round and Bright are characterised by simultaneously
loud, deeply depressed and long-sustained chords.

In further detail, as regards the application of the most timbre-discriminative gesture
and touch features in time through the performances, one can see at first that the
five timbral nuances can differ in mean value at all times (such as a constantly lower
sustain pedal depression for Dry). Yet the manner in which a gestural feature is used
over time is clearly marked by discrete patterns corresponding to each phrase part (often
matching the score bars) respecting a constant direction (either increasing, decreasing
or stationary). Over one given phrase, those patterns can differ between timbres, in di-
rection, amplitude, slope and/or skewness. Moreover, at each phrase transition, changes
occur for one pattern to the next. They can result in continuity, in a change of direction,
but also in a drastic, sudden shift in value (like pedals or keys released at the end of a
phrase before starting anew) the extent of which differs between timbres. The evolution
in time of gestural features is thus characterised by the succession of patterns and tran-
sitions from bar to bar, the details of which may differ between performances coloured
in different timbres, in addition to sheer differences in mean feature values.
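One simple way to classify a feature's per-phrase direction is to fit a line to each bar segment and threshold the slope. The bar boundaries, tolerance and pedal trajectory below are illustrative, and the thesis's actual pattern analysis may be more refined.

```python
import numpy as np

def bar_directions(values, bar_bounds, flat_tol=0.01):
    """Label the trend of a gestural feature within each bar as
    'increasing', 'decreasing' or 'stationary', from the slope of a
    per-bar linear fit. `flat_tol` is an illustrative threshold."""
    labels = []
    for start, end in bar_bounds:
        y = values[start:end]
        slope = np.polyfit(np.arange(len(y)), y, 1)[0]
        labels.append("stationary" if abs(slope) < flat_tol
                      else "increasing" if slope > 0 else "decreasing")
    return labels

# Hypothetical sustain-pedal depth over three bars of 50 samples each.
pedal = np.r_[np.linspace(0, 1, 50), np.full(50, 1.0), np.linspace(1, 0.2, 50)]
dirs = bar_directions(pedal, [(0, 50), (50, 100), (100, 150)])
```

Comparing such per-bar labels (and the fitted slopes) across timbres captures the differences in direction, amplitude and phrase-transition behaviour described above.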
Finally, the account of individual patterns of gestural control in the production of
five timbral nuances for each of the four participant pianists could both confirm a com-
mon recourse to dynamics, attack (and their balance between hands), key depression
depth, pedalling and articulation as control parameters for producing different timbres,
and also reveal idiosyncrasies both within the details of this framework and outside of it.
Indeed, while pianists RB, PL and FP used the full extent of those control parameters,
different strategies could be observed for certain timbres, especially through varying rel-
ative values for these parameters between timbral nuances. BB especially used the most
idiosyncratic approach to the gestural control of piano timbre by solely resorting to dy-
namics, attack and touch in order to produce different timbre nuances.

In summary, despite some idiosyncrasies in each participant’s degree of use of gestural
control parameters for timbre production, and despite some different, conscious
strategy choices due to the participants’ individual pianistic upbringing, a general,
robust system of gestural control in the production of different piano timbre nuances was
identified, involving dynamics, attack (and their balance between hands), key depression
depth, pedalling and articulation as control parameters, according to which five timbral
nuances could each be obtained with a unique pattern in the use and application of these
parameters.
CONCLUSION AND FUTURE WORK

This thesis presented an interdisciplinary investigation of the expression of piano
timbre by skilled performers. The multiple perspectives from the disciplines, fields
and domains that could help shed light on the musicological question of how
advanced-level pianists can envision, understand, speak about, hear, control and produce
piano timbre nuances in their performances were successively explored.

Summary and discussion

Verbalisation of piano timbre

In order to orient our approach toward the expression of piano timbre by highly-
skilled pianists, the concept of timbre itself was first defined. Through acoustical, per-
ceptual, functional and musical perspectives on timbre, its role in characterising the fine-
grained surface attributes or quality of a sound, further than the sheer categorisation of
the sound source or musical instrument, was highlighted. At this level of timbre-quality,
the expression of performer-controlled piano timbre nuances can arise, within the frame
of sonic possibilities offered by the instrument.
An experiment was focused on the verbalisation of piano timbre, and in particular
on the adjectival descriptors employed by pianists to designate the different timbral nu-
ances they can control and produce at the piano. These verbal descriptors make up an
essential lexicon of references with which piano timbre nuances can be identified and
spoken about, between pianists and in the pedagogical process. Verbal descriptors of
piano timbre were found throughout pedagogical piano treatises, and were exhaustively
listed in the interviews of pianists conducted by Bellemare and Traube (2005). The 14
descriptors of piano timbre most frequently cited in this study were analysed quanti-
tatively, with regard to their semantic relationships. Evaluations of pairwise semantic
proximity between the descriptors were transformed by Multidimensional scaling into a
semantic map representing the space of timbre expression in piano performance that can
be described with these words. The five most representative descriptors were singled out
within the semantic space: Dry, Bright, Round, Velvety and Dark.
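The transformation of pairwise proximities into a spatial map can be illustrated with classical (Torgerson) MDS. The dissimilarity matrix below is invented for four unnamed descriptors, and the study may well have used a different MDS variant (e.g. non-metric scaling).

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed points in k dimensions from a
    symmetric matrix D of pairwise dissimilarities."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n    # centring matrix
    B = -0.5 * J @ (D ** 2) @ J            # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]     # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Invented dissimilarities between four descriptors (0 = identical meaning).
D = np.array([[0.0, 1.0, 4.0, 4.2],
              [1.0, 0.0, 3.8, 4.0],
              [4.0, 3.8, 0.0, 1.1],
              [4.2, 4.0, 1.1, 0.0]])
coords = classical_mds(D)   # two near clusters in the 2-D map
```

Descriptors judged semantically close land near each other in the map, which is what allows the most mutually distant, representative descriptors to be singled out.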
However, the extent of validity of this semantic map of piano timbre descriptors may
be limited by some issues. It remains possible that idiosyncratic meanings that some
participants attached to the verbal descriptors of piano timbre persist to
some degree within the semantic map. Moreover, it may be assumed that the verbal
description of piano timbre bears the influence of a piano school, a culture and espe-
cially a language. Despite the attempt to diversify the origins of the participants to the
study, many of them were native French speakers and/or were related to the Faculté de
musique, Université de Montréal (where arguably a distinctive vocabulary for timbre has
been developed). Furthermore, as the study was conducted in either French or English,
the vocabulary used to describe piano timbre in any other language has not been consid-
ered. The direct translations of the descriptors used in this study may thus hold different
meanings with regard to timbre in Russian, German or Italian for instance. This issue
could have also arisen between French and English, although a comparative statistical
analysis suggested that this was not the case.
In the end, this semantic map of piano timbre can still outline the tendency among
different pianists to understand these verbal descriptors of piano timbre (and their rela-
tions) in comparable fashion. While the semantic map may not be accurate enough to
represent the most minute nuances of piano timbre description (which can be influenced
by linguistic and personal factors), for the general purpose of this research it could still
reliably allow one to distinguish broad categories of timbre descriptors. This semantic
study is now being extended to include more participants in an attempt to confirm the
validity of the semantic space model.

Perception of piano timbre

The auditory perception, identification and labelling of piano timbre was explored
in two separate studies. The first one revealed a certain common ground among 17
participant pianists about the verbal qualifiers used to describe eight different performer-
controlled timbral nuances, upon listening to piano performances of short pieces. Like-
wise, preliminary results from the second study, sharing the same goals but using refined
experimental conditions (more control over the musical context, more representative and
salient verbal descriptors to define the timbral nuances), strongly indicate that pianists
can identify the timbral intentions of a performer, at least in broad outline. In an exper-
imental design where three different factors (participants’ understanding of the verbal
descriptors, participants’ auditory perception, and the performers’ timbral expression
in the recorded stimuli) combine to influence the results, obtaining significant timbre
identification rates is particularly meaningful.
However, the results of the pilot study are limited in their extent, because the com-
poser of the pieces performed, the pianist whose performances were recorded for the
stimuli, and all but one of the participants were related to the Faculté de musique, Uni-
versité de Montréal. Thus, they arguably shared (at least to some extent) the specific
conception of piano timbre that prevails in this institution. While this aspect of repre-
sentativeness is significantly improved in the follow-up study, the results are still pre-
liminary and more participants are required before the findings can be generalised. In
any case, both studies faced limitations in ecological validity that were necessary for
obtaining a sufficient level of experimental control. The results of these perception tests
cannot prove that timbre can be perfectly identified and labelled in live piano perfor-
mances. Nonetheless, the results indicate that in a controlled context, pianists are able
to identify broad nuances of timbre and to consistently and consensually use verbal de-
scriptors to designate these nuances.

The auditory perception of piano timbre was highlighted as essential for pianists in
order to understand and express timbral nuances in piano performance. Great piano
teachers of the twentieth century, in their treatises and pedagogical works, agree on
the utmost importance of the musical ear in driving the pianist’s sound, with auditory
feedback guiding the refinement of performance parameters and gestural control. Ear
training is thus stressed in treatises as a prime concern of piano teaching, for students to
form their own understanding of timbre from masterly demonstrations, and for them to
integrate the subtleties of timbre production through imitation and repetition, by careful
listening and adapting. Informed by the composition and structure of the musical piece,
the pianists can form their mental conceptions of the sound and timbre with which to
colour their performances, and refine their playing so that its timbre best serves the
aesthetics of their performances and the emotions to be conveyed.
From such a perspective, however, the actual, exact details of timbre production in
performance through gesture and touch are hidden in a holistic, implicit integration of
timbre-coloured performance. Students could then find themselves bereft of any estab-
lished, generalised concrete method of timbre production to fall back on in the face of
obstacles in the mental conception and musical sense of a specific timbral nuance.

Production and gestural control of piano timbre

Nonetheless, precise recommendations are provided in piano treatises for the pianist
to refine his/her own sound — in general, as an idiosyncratic sound-print of the per-
former, but sometimes also with regard to the timbre nuance with which to colour a
piece. Special advice is given about the proper gesture (depth, weight, percussion and
tactile control in touch, fingers curvature and angle, flexibility in wrist, elbow and shoul-
der joints, degree of tension/relaxation in arms and body muscles) and technique (bal-
ance between notes in chords, relative timing, articulation and dynamics of successive
notes, pedalling) to employ and vary in order to convey the most appropriate expression
in terms of timbre. Yet the opinions of these great teachers about the practical, concrete
ways of controlling piano timbre nuances could actually diverge to some degree, depend-
ing on the Zeitgeist, school or individual. In order to further investigate the empirical
body of knowledge documented in piano literature regarding the production, control and
expression of piano timbre in performance, experimental measurements were conducted
of the gesture, touch and technique employed by performers in producing different piano
timbre nuances.
In comparison with previous studies that were restricted to studying a single, isolated
tone — for which timbre control is limited by the mechanical de-coupling between per-
former gesture (key stroke) and sound-producing interaction (hammer impact on strings)
— and that hastily concluded that piano timbre cannot be controlled independently from
intensity, the production of piano timbre was explored for this research in a polyphonic,
musical context, where the combination of tones and their control in timing and inten-
sity can provide — as pianists adamantly claim — a large expressive palette from which
many timbral nuances can arise.
For this aim, the Bösendorfer CEUS piano performance recording system was used
to acquire high-resolution keyboard and pedalling data, and a MATLAB Toolbox was
specifically created to extract the performance features involved in the production of pi-
ano timbre. Performances by four pianists of four different miniature pieces specially
designed for the study were recorded with the CEUS system. Five different timbral nu-
ances, described as Dry, Bright, Round, Velvety and Dark, were successively featured
by each performer for each piece in separate performances. The performance features
extracted from the CEUS data were then analysed according to the timbral nuance char-
acterising each performance. Gestural portraits of the five timbral nuances were drawn,
each characterised by specific degrees of intensity (overall hammer velocity and its dif-
ference between hands), attack (speed and its variations, duration overall and by hand),
key depression depth, use and amount of depression of the soft and sustain pedals, and
articulation (release durations and right-hand overlaps).
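The keystroke-level features listed above can be made concrete with a small sketch. The following Python fragment is illustrative only — it is not the actual PianoTouch toolbox, and the thresholds, sampling period and data are hypothetical — but it shows how attack duration, mean attack speed and depression depth might be derived from a high-resolution key-position time series of the kind a CEUS-like system provides:

```python
# Illustrative sketch (not the actual PianoTouch toolbox): basic keystroke
# features from a key-position time series. Positions are normalised key
# depression (0.0 = rest, 1.0 = key bottom); thresholds are hypothetical.

def keystroke_features(positions, dt, onset_threshold=0.05, bottom_threshold=0.95):
    """Return attack duration (s), mean attack speed (depth/s) and max depth."""
    onset = next(i for i, p in enumerate(positions) if p >= onset_threshold)
    bottom = next((i for i, p in enumerate(positions) if p >= bottom_threshold),
                  len(positions) - 1)
    attack_duration = (bottom - onset) * dt
    depth = max(positions)
    speed = ((positions[bottom] - positions[onset]) / attack_duration
             if attack_duration else 0.0)
    return attack_duration, speed, depth

# Example: a simulated linear key descent sampled at 2 kHz (dt = 0.5 ms)
dt = 0.0005
stroke = [min(1.0, 0.04 * i) for i in range(40)]
duration, speed, depth = keystroke_features(stroke, dt)
```

On real data, the same thresholding logic would be applied per keystroke after segmenting the continuous key-position stream into individual note events.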
264

Further investigations highlighted the performance features that significantly differed between each pair of timbres among the five nuances, and ensured that a unique gestural
portrait was associated with each of the five piano timbre nuances. Meanwhile, in each
of the data subsets per piece and performer, the timbres could be distinguished by essentially the same performance features, although not all of the features that were significant overall were highlighted for each piece or performer.
Additionally, the performances were mapped in two-dimensional gestural spaces,
overall and for each piece. In all cases, mean positions of same-timbre performances (i.e.
the mean position of each timbral nuance) were arranged in the same order (Dry, Bright,
Round, Dark and Velvety) along a circular arc — a qualitative indication of differences in
the production of these five timbral nuances. Finally, timbre-discriminating performance
parameters were shown to follow temporal patterns matching bars and phrases, differing between timbres mostly in their direction, amplitude, slope and/or skewness within a phrase pattern, and in their behaviour at phrase transitions.
It is unfortunate that, owing to a technical failure of the CEUS system and a crack in the soundboard of the Bösendorfer Imperial grand piano, we could not record as many participant performers as originally intended. In particular, the gestural portraits of piano
timbre nuances that were obtained with only four participants can only reflect the strate-
gies that the four pianists chose to employ for producing the five timbral nuances. These
strategies cannot be considered the only possibilities for producing the timbral nuances.
Different gestural patterns of timbre production may have been favoured by other pi-
anists, depending on their piano schools, individual styles, personal sounds, and under-
standing of timbral nuances. Yet, in light of the large dataset of performances analysed
(240), the general results were not undermined by salient idiosyncrasies from any one
pianist’s performances. Moreover, as I had the chance to work with four pianists of different origins who had followed different piano programs, the strategies of production of piano timbre that were highlighted cannot be attributed to the specificities of one particular piano school.
Ultimately, the production of five different piano timbre nuances could be depicted in precise gestural portraits, each representing a possible strategy for obtaining one of the timbral nuances.

General discussion

In the end, this research has sought to build a bridge between the empirical knowledge and practice of pianists and the quantitative measurements enabled by scientific methods. By respecting both the points of view expressed by pianists about the expression of piano timbre and a reasonably valid musical context in which to set the experiments, we have managed, at least to some extent, to obtain new quantified information that illustrates what pianists believe in their understanding, verbal description, perception and ability to control piano timbre. In particular, thanks to the Bösendorfer CEUS piano
performance digital recording system, we have managed to measure and explore piano
performance and its gestural control parameters in greater precision than ever before. We
were thus able to shine a scientific light upon what pianists mean by touch in piano per-
formance, and upon how the subtleties in relative dynamics, attack, depth, articulation
and pedalling could serve, in polyphonic musical performance, the actual production of
a concept widely envisioned within the pianistic community as an abstract construct: the
expression of piano timbre.
In holistically assessing the expression of piano timbre, one fundamental aspect is
still to be explored: sound itself, i.e. the acoustic features that characterise different
timbre nuances. This work is currently under way within our laboratory, in the doctoral
research of our colleague Sébastien Bel.
However, the quantitative information on the production of piano timbre, upon which precise gestural portraits of five different timbre nuances were drawn, stands at a low level of definition with regard to piano technique and gesture — closer to the mechanical actions a pianist can control than to prototypical musical gestures. This information remains
to be integrated into musically meaningful constructs fitting the lexicon and concepts of
piano technique and gesture. In any case, this additional quantitative information on the production of piano timbre ought to be regarded as complementary to the empirical approach each pianist refines through learning, practising and performing.

Applications and future work

Piano pedagogy

Following these principles, future applications of this research could be found in piano pedagogy. The semantic map of piano timbre descriptors could help piano students
better grasp the meanings of such terms, and find a recourse when confronted with a
timbre descriptor they do not understand, by referring to its position on the map and its
neighbouring terms. However, in order for the semantic map of timbre descriptors to be
considered as a reliable pedagogical tool, the map must become more robust, i.e. reflect
a larger consensus among pianists. For this aim, more participants are needed than the
17 from whose answers the results presented in Chapter 5 were obtained. Further data
collection is currently under way, and stronger, more reliable results can be expected in
the near future.
Furthermore, new pedagogic methods could be developed. The traditional focus on
piano timbre — through mental conception, imitation and careful self-actualisation in
performance as guided by the musical ear — would be complemented by concrete, gen-
eralised advice on the appropriate gesture to use for producing specific timbres. There
is no denying the merits, for artistic, cognitive and practical reasons, of a mental ap-
proach to piano timbre production, especially in performance. For the pianists to bring
forth their personal sounds while focusing on their expression, on whichever piano and
in whichever hall they come to perform, they need prior mental integration of the timbre
nuances with which to colour their performances. Yet in the context of training and
practice, the production of specific timbre nuances could be improved or facilitated by
concrete gestural guidelines. Such gesture indications, designed so as to be meaningful
to pianists, would be built upon the lower-level gestural information and portraits of tim-
bral nuances identified in this research. This complementary approach to timbre in piano
teaching would thus offer a tangible, gesture-based alternative to students in conceiving
of a timbral nuance, in cases where the abstract approach through imagery and the musical ear does not make clear sense.

Individuality in expressive piano performance

In complement to the common strategies employed by different pianists in the production and gestural control of different timbral nuances that were explored in the third
part of this dissertation, another perspective can be envisioned: the exploration of id-
iosyncratic patterns of expressive performance employed by different pianists, regard-
less of the timbral nuance emphasized in any given performance. Indeed, the compre-
hensive dataset of CEUS-recorded piano performances collected for the study presented
in Chapter 9, although especially designed for setting forth different timbral nuances and
their gestural control, still constitutes a great potential source of information regarding
the general use of expressive parameters in piano performance. With performance data
available for four different pianists, and with 60 CEUS-recorded performances for each
pianist, the opportunity is there for exploring the performance features extracted with
the PianoTouch toolbox (that was described in Chapter 8), with statistical tests employed
this time to identify individual differences between performers and their idiosyncratic patterns in the measured features of expressive performance. Such analyses could highlight both the individual strategies of expressive piano performance in general and those employed for expressing each of the five timbral nuances.
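As a sketch of the kind of statistical test this would involve, the following Python fragment computes a one-way ANOVA F statistic from scratch for one performance feature measured across four performers. The data are invented for illustration; in practice the feature values would come from the PianoTouch extraction:

```python
# Hedged sketch: one-way ANOVA F statistic for a single performance feature
# (e.g. mean attack speed) grouped by performer. Data are invented.
from statistics import mean

def one_way_anova_F(groups):
    k = len(groups)                         # number of performers
    n = sum(len(g) for g in groups)         # total number of performances
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Four hypothetical performers, one feature value per performance
performers = [
    [2.1, 2.3, 2.2, 2.4],
    [3.0, 3.1, 2.9, 3.2],
    [2.0, 2.1, 1.9, 2.2],
    [2.6, 2.7, 2.5, 2.8],
]
F = one_way_anova_F(performers)  # a large F suggests a performer-dependent feature
```

A significant F for a given feature would flag it as a candidate carrier of performer idiosyncrasy, to be examined further with post-hoc pairwise comparisons.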

Control of sound synthesis

Meanwhile, the results of this research may be applied to controlling piano sound
synthesis algorithms. Although efficient algorithms for piano sound synthesis exist 17, they would serve little purpose if not actually playable as a musical instrument, just
as a note removed from its musical context bears little meaning. The quality of the
control parameters of piano sound synthesis is thus vital in conveying naturalness in
the reproduction of an acoustic piano. A more thorough understanding of the subtleties
pianists can use to produce timbral nuances could significantly improve the development
of new digital keyboard interfaces.
Within this paradigm, several perspectives can be envisioned. First, the vast majority
of digital keyboard interfaces commercially available nowadays follow the MIDI pro-
tocol to send information to audio software. The information sent for a keystroke is thus limited to a single velocity level (encoded in 7 bits), the timing of note onset and the
note duration. The finer details of the keystroke are thus lost. Yet as this research has
shown, pianists consistently deploy subtleties in their keystrokes for controlling timbre
— depression depth, attack and release lengths, degree of pedal depression — that can-
not be rendered with a MIDI keyboard. Whether those subtleties in the use of keyboard
and pedals are integrally transferred to sound production shall be revealed in further
acoustical analyses.
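The information gap can be sketched concretely. In the fragment below, the field names are illustrative — they are not part of MIDI or of any specific protocol extension — and simply contrast what a standard note-on/note-off pair encodes with the finer keystroke features measured in this research:

```python
# Sketch of the information gap between plain MIDI and fine keystroke data.
# All field names and values are illustrative.

midi_note = {
    "note": 60,           # pitch (middle C)
    "velocity": 90,       # single intensity value, 0-127 (7 bits)
    "onset_s": 1.250,     # note-on time
    "duration_s": 0.480,  # note-off minus note-on
}

# Measured by a CEUS-like system, but lost over plain MIDI:
fine_keystroke = dict(midi_note,
    attack_duration_s=0.011,   # key-descent time
    depression_depth=0.96,     # fraction of full key travel
    release_duration_s=0.055,  # key-ascent time
    sustain_pedal_depth=0.40,  # continuous value, not just on/off
)

lost_over_midi = sorted(set(fine_keystroke) - set(midi_note))
```

Everything in `lost_over_midi` is exactly the kind of timbre-relevant subtlety this research showed pianists to control consistently.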
If this is the case, then developing digital keyboard interfaces able to render finer
control parameters will be paramount to reaching a realistic digital simulation of an
acoustic piano on which natural-sounding performances can be achieved.
If this is not the case and fine-grained control parameters are not essential to sound
production, it remains that pianists know how to use these control parameters. They
can therefore be employed as input to sound synthesis engines. This additional control
information can be used for other, non-piano-realistic aspects of digital sound produc-
17. Such as the physical-modelling virtual piano software Pianoteq®.
tion, just as the additional aftertouch control parameter of recent MIDI keyboards can
be used, for instance with effects such as tremolo or vibrato. The subtleties that pianists
use on the acoustic instrument to obtain precise timbral nuances can also serve indirectly
the production of timbre with digital piano sound synthesis engines, by accentuating the
variations on other, coarser control parameters relevant to producing a certain timbral
nuance. For instance, the articulation in a performance could be further pushed toward
legato when fine control parameters show that the timbral intentions of the pianist tend
toward Dark or Velvety nuances.
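Such an indirect mapping can be sketched as follows. The thresholds and scaling are invented for illustration, and the single fine cue used here (key-overlap time between successive notes) stands in for the full set of timbre-discriminating features:

```python
# Hedged sketch of the mapping suggested above: a fine control cue
# (hypothetical key-overlap time) nudges a coarser synthesis parameter
# (articulation) toward legato. Thresholds and scaling are invented.

def adjust_articulation(overlap_s, articulation):
    """articulation in [0, 1]: 0 = staccato, 1 = full legato."""
    if overlap_s > 0.02:  # overlapping releases suggest a Dark/Velvety intention
        articulation = min(1.0, articulation + 5.0 * overlap_s)
    return articulation

a = adjust_articulation(0.05, 0.6)  # overlap detected: pushed toward legato
b = adjust_articulation(0.00, 0.6)  # no overlap: unchanged
```

A production implementation would of course weigh several fine features together, and calibrate the mapping against the gestural portraits reported in this thesis.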
Such implementations are technically feasible. The Open Sound Control protocol
can be used to convey exhaustive information from a keyboard controller to a sound
synthesis engine. Building a digital keyboard interface able to measure and send fine-
grained control parameters is clearly possible, even within a reasonable budget — as
shown for instance by the Evo ‘touch-sensitive’ keyboard controller. 18 As for the pi-
ano sound synthesis engine, an intermediate software layer replacing the MIDI data-
responding code of the sound synthesis algorithm could suffice to create a fine, timbre-level sound response.
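As a minimal illustration, an OSC message carrying one hypothetical fine keystroke parameter can be assembled in a few lines. The address and arguments are invented, but the binary layout — null-terminated strings padded to four bytes, big-endian numeric arguments — follows the OSC 1.0 specification:

```python
# Minimal sketch of an OSC 1.0 message from a keyboard controller to a
# synthesis engine. Address and arguments are hypothetical; the encoding
# follows the OSC specification.
import struct

def osc_string(s):
    """Null-terminate an ASCII string and pad its length to a multiple of 4."""
    b = s.encode("ascii") + b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address, key_number, depth):
    return (osc_string(address)
            + osc_string(",if")              # type tags: int32, float32
            + struct.pack(">i", key_number)  # big-endian int32
            + struct.pack(">f", depth))      # big-endian float32

msg = osc_message("/key/depth", 60, 0.96)
# len(msg) is a multiple of 4, as required for OSC transport
```

In practice a library such as liblo or python-osc would handle this encoding; the point is only that arbitrarily rich keystroke data fits naturally in the protocol.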
Additionally, knowing how pianists use the keyboard of an acoustic instrument could
help develop more realistic digital keyboards. It has indeed been a major drawback of
MIDI keyboards in general that they do not provide a natural touch. The best solution
to this problem consists in implementing the whole piano action, including hammers, in
digital keyboards, and using hammer hits on sensors as the MIDI information source.
This is of course ideal, yet hardly cost-effective. Other efforts in building more realistic digital keyboards have consisted in adding a so-called ‘heavy touch’, whose appreciation remains highly subjective. Additional quantitative information about what actually matters to pianists in touch could provide the basis upon which to refine the feel of touch.
18. On this commercially available keyboard, the surface of each key is fitted with sensors that can track finger position along the key. An advanced communication protocol sends this additional information as control parameters to a sound synthesis engine. http://www.endeavour.de/.
On a personal level, my future work will be accomplished as a Postdoctoral Fellow in the Music Technology Area, Schulich School of Music, McGill University (Montreal)
under the supervision of Profs. Philippe Depalle and Marcelo Wanderley, thanks to a
two-year grant from FRQSC. My research will be conducted within a general framework
of piano sound synthesis and control at the level of timbre production. Among other
things, I will work on offline timbre colouring of digitally-recorded piano performances
(or scores), so that control parameters can be automatically modified to accentuate the
salience of a certain timbral nuance, which the user could specify via a graphical user
interface. To begin, performance recordings will be digitally manipulated, through the performance features involved in timbre control, so as to modify the timbral nuances expressed. To do so, I will either directly manipulate CEUS recordings in light of the results presented in this thesis, or develop a transformation model from fine-grained performance features to coarser digital control parameters (MIDI, for instance) and then manipulate digital (MIDI) piano recordings. This work aims to identify the level of finesse required in digital
control parameters for the expression of piano timbre to remain perceivable.
Overall, the aim of my postdoctoral research will be to learn even more about the
production and perception of piano timbre — perhaps with a greater, tentative achieve-
ment in mind: the virtual pianist (Parncutt, 1997), that is, a holistic model of piano
performance which would include the pianist himself, and account for all the physio-
logical and cognitive factors that crucially underlie a natural, human, expressive musical
performance.
BIBLIOGRAPHY

ANSI (1960). USA Standard Acoustical Terminology. New York: American National Standards Institute.

Askenfelt, A., Galembo, A., and Cuddy, L.L. (1998). On the acoustics and psychology of piano touch and tone. Journal of the Acoustical Society of America, 103(5):2873. Abstract.

Askenfelt, A. and Jansson, E.V. (1990). From touch to string vibrations. I: Timing in the grand piano action. Journal of the Acoustical Society of America, 88(1):52–63.

Askenfelt, A. and Jansson, E.V. (1991). From touch to string vibrations. II: The motion of the key and hammer. Journal of the Acoustical Society of America, 90(5):2383–2393.

Askenfelt, A. and Jansson, E.V. (1993). From touch to string vibrations. III: String motion and spectra. Journal of the Acoustical Society of America, 93(4):2181–2196.

Banowetz, J. (1985). The pianist’s guide to pedaling. Bloomington, IN: Indiana University Press.

Báron, J.G. (1958). Physical basis of piano touch. Journal of the Acoustical Society of America, 30(2):151.

Barrière, J.B. (1991). Le timbre, métaphore pour la composition. Paris, France: C. Bourgois.

Barthet, M. (2008). De l’interprète à l’auditeur: une analyse acoustique et perceptive du timbre musical. PhD thesis, Université Aix-Marseille II, France.

Bellemare, M. and Traube, C. (2005). Verbal description of piano timbre: Exploring performer-dependent dimensions. In Digital proceedings of the second Conference on Interdisciplinary Musicology (CIM05). Montreal, QC: Observatoire interdisciplinaire de création et de recherche en musique (OICRM). URL http://oicrm.org/ressources/publications-recentes/actes-de-colloques/actes-electroniques-cim05/.

Bellemare, M. and Traube, C. (2006). Investigating piano timbre: Relating verbal description and vocal imitation to gesture, register, dynamics and articulation. In Proceedings of the 9th International Conference on Music Perception and Cognition (ICMPC9), Bologna, Italy, edited by M. Baroni, A.R. Addessi, R. Caterina, and M. Costa. ICMPC–ESCOM, 59–60. Abstract, URL http://www.marcocosta.it/icmpc2006/proceedings.htm.
Beranek, L.L. (1979). Music, Acoustics and Architecture. New York: Wiley.

Berlioz, H. (1844). Grand Traité d’Instrumentation et d’Orchestration Modernes. Paris, France: Schonenberger. Excerpts online accessed 2008/03/20, URL http://www.hberlioz.com/Scores/BerliozTraite.html.

Bernays, M. (2012). Expression et production du timbre au piano selon les traités: Conception du timbre instrumental exprimée par les pianistes et professeurs dans les ouvrages à vocation pédagogique. Recherche en éducation musicale, 29:7–27.

Bernays, M. and Traube, C. (2010). Expression of piano timbre: gestural control, perception and verbalization. In Proceedings of the 11th International Conference on Music Perception and Cognition (ICMPC11). Seattle, WA: University of Washington, 290–295.

Bernays, M. and Traube, C. (2011). Verbal expression of piano timbre: Multidimensional semantic space of adjectival descriptors. In Proceedings of the International Symposium on Performance Science (ISPS2011), Toronto, ON, edited by A. Williamon, D. Edwards, and L. Bartel. Utrecht, Netherlands: European Association of Conservatoires (AEC), 299–304.

Bernays, M. and Traube, C. (2012a). Expression of piano timbre: verbal description and gestural control. In La musique et ses instruments (actes du CIM09), edited by M. Castellengo and H. Genevois. Paris, France: Delatour.

Bernays, M. and Traube, C. (2012b). Piano touch analysis: A MATLAB toolbox for extracting performance descriptors from high-resolution keyboard and pedalling data. In Proceedings of Journées d’Informatique Musicale (JIM2012), Gestes, Virtuosité et Nouveaux Medias, edited by T. Dutoit, T. Todoroff, and N. d’Alessandro. Mons, Belgium: UMONS/numediart, 55–64. URL http://www.jim2012.be.

Binet, A. and Courtier, J. (1895). Recherches graphiques sur la musique. L’année psychologique, 2(1):201–222.

von Bismarck, G. (1974). Timbre of steady sounds: A factorial investigation of its verbal attributes. Acustica, 30(3):146–159.

Blackham, E.D. (1965). The Physics of the Piano. Scientific American, 213:88–99.

Blacking, J. (1973). How musical is man? Seattle, WA: University of Washington Press.

Borg, I. and Groenen, P.J.F. (2005). Modern Multidimensional Scaling: theory and applications. New York: Springer-Verlag.
Boutillon, X. (1990). Le piano: modélisation physiques et développements technologiques. Journal de physique. Colloques, 2:811–820.

Bresin, R. and Battel, G.U. (2000). Articulation strategies in expressive piano performance. Journal of New Music Research, 29(3):211–224.

Bryan, G.H. (1913). Pianoforte touch. Nature, 91:246–248.

Burgoyne, J.A. and McAdams, S. (2008). A meta-analysis of timbre perception using nonlinear extensions to CLASCAL. In Sense of Sounds. Lecture Notes in Computer Science, Volume 4969, edited by R. Kronland-Martinet, S. Ystad, and K. Jensen. Berlin, Germany: Springer-Verlag, 181–202.

Busing, F., Commandeur, J., and Heiser, W. (1997). PROXSCAL: A multidimensional scaling program for individual differences scaling with constraints. In Softstat ’97: Advances in Statistical Software, edited by W. Bandilla and F. Faulbaum. Stuttgart, Germany: Lucius & Lucius, 237–258.

Caclin, A. (2004). Interactions et indépendances entre dimensions du timbre des sons complexes : Approche psychophysique et électrophysiologique chez l’Humain. PhD thesis, Université Pierre et Marie Curie (Paris VI), Paris, France.

Cadoz, C. (1991). Timbre et causalité. In Le timbre: métaphore pour la composition, edited by J.B. Barrière. Paris, France: C. Bourgois, 17–46.

Cadoz, C. and Wanderley, M.M. (2000). Gesture-music. In Trends in gestural control of music, edited by M.M. Wanderley and M. Battier. Paris, France: IRCAM, 71–94. Electronic book.

Campbell, M. and Emerson, J.A. (2001). Timbre (i), Timbre (ii). In The New Grove Dictionary of Music and Musicians, edited by S. Sadie and J. Tyrrell. Macmillan, 478–479. Volume 25, URL www.grovemusic.com.

Camurri, A., et al. (2000). EyesWeb: Toward gesture and affect recognition in interactive dance and music systems. Computer Music Journal, 24(1):57–69.

Carroll, J.D. and Chang, J.J. (1970). Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika, 35(3):283–319.

Castellengo, M. and Dubois, D. (2007). Timbre ou timbres ? Propriété du signal, de l’instrument, ou construction(s) cognitive(s) ? In Les cahiers de la Société québécoise de recherche en musique: Le timbre musical: Composition, interprétation, perception et réception, Volume 9 (1–2), October, edited by C. Caron, C. Traube, and S. Lacasse. Montreal, QC: OICRM, 25–38.
Chailley, J. (1982). Timbre. In Larousse de la musique (2), edited by M. Vignal. Paris, France: Larousse, 1556.

Cheminée, P. (2006). «Vous avez dit «clair» ? » Le lexique des pianistes, entre sens commun et terminologie. Cahiers du LCPE: Dénomination, désignation et catégories, 7:39–54.

Cheminée, P., Gherghinoiu, C., and Besnainou, C. (2005). Analyses des verbalisations libres sur le son du piano versus analyses acoustiques. In Digital proceedings of the second Conference on Interdisciplinary Musicology (CIM05). Montreal, QC: Observatoire interdisciplinaire de création et de recherche en musique (OICRM). URL http://oicrm.org/ressources/publications-recentes/actes-de-colloques/actes-electroniques-cim05/.

Clark Jr, M., Robertson, P.T., and Luce, D. (1964). A preliminary experiment on the perceptual basis for musical instrument families. Journal of the Audio Engineering Society, 12(3):199–203.

Clarke, E.F. (1995). Expression in Performance: Generativity, Perception and Semiosis. In The Practice of Performance: Studies in Musical Interpretation, edited by J. Rink. Cambridge, UK: Cambridge University Press, 21–54.

Clarke, E.F. and Cook, N. (2004). Empirical Musicology: Aims, Methods, Prospects. Oxford, UK: Oxford University Press.

Clarke, E.F., Parncutt, R., Raekallio, M., and Sloboda, J.A. (1997). Talking fingers: an interview study of pianists’ views on fingering. Musicae Scientiae, 1:87–108.

Closson, E. (1944). Histoire du piano. Brussels, Belgium: Editions universitaires.

Coenen, A. and Schäfer, S. (1992). Computer-controlled player pianos. Computer Music Journal, 16(4):104–111.

Conklin Jr, H.A. (1996a). Design and tone in the mechanoacoustic piano. Part I. Piano hammers and tonal effects. Journal of the Acoustical Society of America, 99(6):3286–3296.

Conklin Jr, H.A. (1996b). Design and tone in the mechanoacoustic piano. Part II. Piano structure. Journal of the Acoustical Society of America, 100(2):695–708.

Conklin Jr, H.A. (1996c). Design and tone in the mechanoacoustic piano. Part III. Piano strings and scale design. Journal of the Acoustical Society of America, 100(3):1286–1298.
Cook, N. (1987). Structure and Performance Timing in Bach’s C Major Prelude (WTC I): An Empirical Study. Music Analysis, 6(3):257–272.

Dahl, S. (2005). On the beat: Human movement and timing in the production and perception of music. PhD thesis, KTH, Stockholm, Sweden.

Dahl, S., et al. (2010). Piano. In Musical Gestures: Sound, Movement, and Meaning, edited by R.I. Godøy and M. Leman. New York: Routledge – Taylor & Francis, 36–68.

Davidson, J.W. (1993). Visual perception of performance manner in the movements of solo musicians. Psychology of Music, 21(2):103–113.

Davidson, J.W. (2002). Communicating with the body in performance. In Musical Performance: A Guide to Understanding, edited by J. Rink. Cambridge, UK: Cambridge University Press, 144–152.

Davidson, J.W. (2007). Qualitative insights into the use of expressive body movement in solo piano performance: a case study approach. Psychology of Music, 35(3):381–401.

Davidson, J.W. and Correia, J.S. (2002). Body movement. In The Science and Psychology of Music Performance: Creating Strategies for Teaching and Learning, edited by R. Parncutt and G. McPherson. New York: Oxford University Press, 237–250.

De Leeuw, J. (1977). Applications of convex analysis to multidimensional scaling. In Recent Developments in Statistics, edited by J.R. Barra, F. Brodeau, G. Romier, and B. van Cutsem. Amsterdam, Netherlands: North-Holland, 133–145.

De Silva, P. (2008). The Fortepiano Writings of Streicher, Dieudonné, and the Schiedmayers: Two Manuals and a Notebook. Lewiston, NY: Edwin Mellen Press.

De Soete, G. and Heiser, W.J. (1993). A latent class unfolding model for analyzing single stimulus preference ratings. Psychometrika, 58(4):545–565.

Delalande, F. (1988). La gestique de Gould: éléments pour une sémiologie du geste musical. In Glenn Gould, Pluriel, edited by G. Guertin. Paris, France: Louise Courteau Editrice Inc., 83–111.

Desain, P. and Honing, H. (1994). Does expressive timing in music performance scale proportionally with tempo? Psychological Research, 56(4):285–292.

Descartes, R. (1768). Abrégé de musique: Compendium musicae. Paris, France: Presses Universitaires de France. Revised edition by F. de Buzon, 1987.

Deschaussées, M. (1982). L’homme et le piano. Fondettes, France: Van de Velde.
Desfray, C. (2004). Réflexions autour de la transcription de Bach à Busoni: La problématique émergente de la conquête du timbre. Académie de Caen, compte-rendu du Stage National en éducation musicale. Accessed 2008/02/16, URL ftp://trf.education.gouv.fr/pub/educnet/musique/base/pdf/chaconne-busoni-desfray.pdf.

Disley, A.C. and Howard, D.M. (2003). Timbral semantics and the pipe organ. In Proceedings of the Stockholm Music Acoustic Conference (SMAC03), Stockholm, Sweden. 607–610.

Disley, A.C. and Howard, D.M. (2004). Spectral correlates of timbral semantics relating to the pipe organ. KTH Speech, Music and Hearing Quarterly Progress and Status Report, 46:25–40.

Disley, A.C., Howard, D.M., and Hunt, A.D. (2006). Timbral description of musical instruments. In Proceedings of the 9th International Conference on Music Perception and Cognition, Bologna, Italy. 61–68.

Doğantan-Dack, M. (2011). In the beginning was gesture: piano touch and an introduction to a phenomenology of the performing body. In New perspectives on music and gesture, edited by A. Gritten and E. King. Aldershot, UK: Ashgate Publishing, Ltd., 243–265.

Dubé, F. (2003). Les Préludes pour piano de Claude Debussy : une œuvre musicale qui favorise le développement musical et pianistique de tout étudiant de niveau universitaire. Recherche en éducation musicale, 21:19–39.

Eerola, T. and Toiviainen, P. (2004). MIDI Toolbox: MATLAB Tools for Music Research. Jyväskylä, Finland: University of Jyväskylä. URL www.jyu.fi/musica/miditoolbox/.

Ekman, G. (1954). Dimensions of color vision. Journal of Psychology, 38(2):467–474.

Faure, A. (2000). Des sons aux mots, comment parle-t-on du timbre musical ? PhD thesis, Ecole des Hautes Etudes en Sciences Sociales, Paris, France.

Fink, S. (1992). Mastering Piano Technique: A guide for students, teachers, and performers. Portland, OR: Amadeus Press.

Fletcher, H. (1934). Loudness, pitch and the timbre of musical tones and their relation to the intensity, the frequency and the overtone structure. Journal of the Acoustical Society of America, 6(2):59–69.

Fletcher, H., Blackham, E.D., and Stratton, R. (1962). Quality of piano tones. Journal of the Acoustical Society of America, 34(6):749–761.
Flossmann, S. and Widmer, G. (2011). Toward a multilevel model of expressive piano performance. In Proceedings of the International Symposium on Performance Science (ISPS2011), Toronto, ON, edited by A. Williamon, D. Edwards, and L. Bartel. Utrecht, Netherlands: European Association of Conservatoires (AEC), 641–646.

Friberg, A. (1991). Generative Rules for Music Performance: A Formal Description of a Rule System. Computer Music Journal, 15(2):56–71.

Fritz, C., Blackwell, A.F., Cross, I., Moore, B.C.J., and Woodhouse, J. (2008). Investigating English violin timbre descriptors. In Proceedings of the 10th International Conference on Music Perception and Cognition (ICMPC 10), Sapporo, Japan. 638–639.

Furuya, S., Altenmüller, E., Katayose, H., and Kinoshita, H. (2010). Control of multi-joint arm movements for the manipulation of touch in keystroke by expert pianists. BMC neuroscience, 11(1):82–96.

Gabrielsson, A. (1999). Music Performance. In The Psychology of Music, edited by D. Deutsch. San Diego, CA: Academic Press, 501–602. Second edition (first ed.: 1982).

Gabrielsson, A. (2003). Music performance research at the millennium. Psychology of music, 31(3):221–272.

Gaillard, P., Castellengo, M., and Dubois, D. (2005). L’apport de la catégorisation à l’étude du transitoire d’attaque du Steeldrum; contribution à la définition du timbre causal. In Digital proceedings of the second Conference on Interdisciplinary Musicology (CIM05). Montreal, QC: Observatoire interdisciplinaire de création et de recherche en musique (OICRM). URL http://oicrm.org/ressources/publications-recentes/actes-de-colloques/actes-electroniques-cim05/.

Gibson, J.J. (1966). The Senses Considered as Perceptual Systems. Boston, MA: Houghton-Mifflin.

Gingras, B. (2008). Expressive strategies and performer-listener communication in organ performance. PhD thesis, McGill University, Montreal, QC.

Godøy, R.I. and Leman, M. (2010). Musical Gestures: Sound, Movement, and Meaning. New York: Routledge – Taylor & Francis.

Goebl, W. (2001). Melody lead in piano performance: expressive device or artifact? Journal of the Acoustical Society of America, 110(1):563–572.
Goebl, W. (2003). The Role of Timing and Intensity in the Production and Perception of
Melody in Expressive Piano Performance. PhD thesis, Institut für Musikwissenschaft,
der Karl-Franzens-Universität Graz, Austria.

Goebl, W. and Bresin, R. (2003). Measurement and reproduction accuracy of
computer-controlled grand pianos. Journal of the Acoustical Society of America,
114(4):2273–2283.

Goebl, W., Bresin, R., and Galembo, A. (2004). Once again: The perception of
piano touch and tone. Can touch audibly change piano sound independently of
intensity? In Proceedings of the 2004 International Symposium on Music Acoustics
(ISMA'04), Nara, Japan. 332–335.

Goebl, W., Bresin, R., and Galembo, A. (2005). Touch and temporal behavior of
grand piano actions. Journal of the Acoustical Society of America, 118(2):1154–1165.

Goebl, W., Dixon, S., De Poli, G., Friberg, A., Bresin, R., and Widmer, G.
(2008). ‘Sense’ in Expressive Music Performance: Data Acquisition, Computational
Studies, and Models. In Sound to Sense — Sense to Sound: A State of the Art in
Sound and Music Computing, edited by P. Polotti and D. Rocchesso. Berlin, Germany:
Logos, 195–242.

Goebl, W., Flossmann, S., and Widmer, G. (2010). Investigations of between-hand
synchronization in Magaloff's Chopin. Computer Music Journal, 34(3):35–44.

Goebl, W. and Fujinaga, I. (2008). Do key-bottom sounds distinguish piano tones? In
Proceedings of the 10th International Conference on Music Perception and Cognition,
Sapporo, Japan. 292.

Goebl, W. and Palmer, C. (2008). Tactile feedback and timing accuracy in piano
performance. Experimental Brain Research, 186(3):471–479.

Gounaropoulos, A. and Johnson, C. (2006). Synthesising timbres and timbre-changes
from adjectives/adverbs. In Applications of Evolutionary Computing. Lecture Notes in
Computer Science, Volume 3907. Berlin, Germany: Springer-Verlag, 664–675.

Gower, J.C. (1966). Some distance properties of latent root and vector methods used
in multivariate analysis. Biometrika, 53(3-4):325–338.

Grey, J.M. (1975). An exploration of musical timbre. Technical report STAN-M-2,
Center for Computer Research in Music and Acoustics, Department of Music, Stanford
University, Palo Alto, CA.

Grey, J.M. (1977). Multidimensional perceptual scaling of musical timbres. Journal of
the Acoustical Society of America, 61(5):1270–1277.

Grey, J.M. and Gordon, J.W. (1978). Perceptual effects of spectral modifications on
musical timbres. Journal of the Acoustical Society of America, 63(5):1493–1500.

Gritten, A. and King, E. (2006). Music and Gesture. Aldershot, UK: Ashgate
Publishing, Ltd.

Gritten, A. and King, E. (2011). New Perspectives on Music and Gesture. Aldershot,
UK: Ashgate Publishing, Ltd.

Groenen, P.J.F. and van de Velden, M. (2004). Multidimensional scaling. Econometric
Institute Report, 15.

Grosshauser, T., Tessendorf, B., Tröster, G., Hildebrandt, H., and Candia, V. (2012).
Sensor Setup for Force and Finger Position and Tilt Measurements for Pianists. In
Proceedings of the International Computer Music Conference (ICMC), Ljubljana,
Slovenia. Ann Arbor, MI: MPublishing, University of Michigan, 1–7.

Guigue, D. (1994a). Beethoven et le pianoforte: l'émergence d'une pensée des timbres
comme dimension autonome du discours musical. Revue de Musicologie, 80(1):81–96.

Guigue, D. (1994b). Timbre & Écriture au Piano. Technical report, IRCAM.

Gustafson, A.E. (2007). Tone production on the piano: the research of Otto Rudolph
Ortmann. PhD thesis, University of Texas at Austin.

Guyot, F. (1996). Étude de la perception sonore en termes de reconnaissance et
d'appréciation qualitative: une approche par la catégorisation. PhD thesis,
Université du Mans, France.

Hadjakos, A. (2011). Sensor-Based Feedback for Piano Pedagogy. PhD thesis, TU
Darmstadt, Germany.

Hadjakos, A. (2012). Pianist Motion Capture with the Kinect Depth Camera. In
Proceedings of the International Computer Music Conference (ICMC), Ljubljana,
Slovenia. Ann Arbor, MI: MPublishing, University of Michigan, 1–8.

Hajda, J.M., Kendall, R.A., Carterette, E.C., and Harshberger, M.L. (1997).
Methodological Issues in Timbre Research. In Perception and Cognition of Music,
edited by I. Deliège and J. Sloboda. London, UK: Psychology Press, 278–300.

Hall, D.E. and Askenfelt, A. (1988). Piano string excitation V: Spectra for real
hammers and strings. Journal of the Acoustical Society of America, 83(4):1627–1638.

Halmrast, T., Guettler, K., Bader, R., and Godøy, R.I. (2010). Gesture and
Timbre. In Musical Gestures: Sound, Movement, and Meaning, edited by R.I. Godøy
and M. Leman. New York: Routledge – Taylor & Francis, 183–211.

Hart, H.C., Fuller, M.W., and Lusby, W.S. (1934). A precision study of piano touch
and tone. Journal of the Acoustical Society of America, 6:80–94.

Hashida, M., et al. (2008). Rencon: Performance rendering contest for automated
music systems. In Proceedings of the 10th International Conference on Music
Perception and Cognition (ICMPC 10), Sapporo, Japan. 53–57.

Heinlein, C.P. (1929a). A discussion of the nature of pianoforte damper-pedalling
together with an experimental study of some individual differences in pedal
performance. Journal of General Psychology, 2(4):489–508.

Heinlein, C.P. (1929b). The functional role of finger touch and damper-pedalling in
the appreciation of pianoforte music. Journal of General Psychology, 2(4):462–469.

Heinlein, C.P. (1930). Pianoforte damper-pedalling under ten different experimental
conditions. Journal of General Psychology, 3(4):511–528.

von Helmholtz, H.L. (1885). On the Sensations of Tone as a Physiological Basis for
the Theory of Music. New York: Courier Dover Publications. Translated from German
to English by A.J. Ellis (2nd ed.: 1954).

Henderson, M.T., Tiffin, J., and Seashore, C.E. (1936). The Iowa piano camera
and its use. In Objective Analysis of Music Performance, edited by C.E. Seashore.
Iowa City, IA: University of Iowa Press, 252–262.

Hiraga, R., Hashida, M., Hirata, K., Katayose, H., and Noike, K. (2002). Rencon:
toward a new evaluation method for performance. In Proceedings of the International
Computer Music Conference (ICMC). Ann Arbor, MI: MPublishing, University of
Michigan, 357–360.

Hofmann, J. (1920). Piano Playing with Piano Questions Answered. Philadelphia, PA:
Theodore Presser Co.

Howat, R. (1997). Debussy's piano music: sources and performance. In Debussy
Studies, edited by R.L. Smith. Cambridge, UK: Cambridge University Press, 78–107.

Jaëll, M. (1897). Le mécanisme du toucher : L'étude du piano par l'analyse expéri-
mentale de la sensibilité tactile. Paris, France: A. Colin.

Jaëll, M. (1899). Le Toucher. Enseignement du piano basé sur la physiologie. Leipzig,
Germany; Paris, France: Breitkopf, Härtel, Costallat.

Jankélévitch, V. (1961). La musique et l'ineffable. Paris, France: Armand Colin.

Jensenius, A.R. (2007). Action-Sound: Developing Methods and Tools to Study Music-
Related Body Movement. PhD thesis, Department of Musicology, University of Oslo,
Norway.

Johnson, M. (1987). The Body in the Mind: The Bodily Basis of Meaning, Imagination,
and Reason. Chicago, IL: University of Chicago Press.

Johnson, S.C. (1967). Hierarchical clustering schemes. Psychometrika, 32(3):241–254.

Kendall, R.A. and Carterette, E.C. (1991). Perceptual scaling of simultaneous
wind instrument timbres. Music Perception, 8:369–404.

Kendall, R.A. and Carterette, E.C. (1993a). Verbal attributes of simultaneous
wind instrument timbres: II. Adjectives induced from Piston's Orchestration. Music
Perception, 10(4):469–502.

Kendall, R.A. and Carterette, E.C. (1993b). Verbal attributes of simultaneous
wind instrument timbres: I. von Bismarck's adjectives. Music Perception, 10(4):445–
468.

Kendall, R.A., Carterette, E.C., and Hajda, J.M. (1995). Perceptual and acous-
tical attributes of natural and emulated orchestral instrument timbres. In Proceedings
of the Symposium on Musical Acoustics. Paris: Société Française d'Acoustique, 596–
601.

Kinoshita, H., Furuya, S., Aoki, T., and Altenmüller, E. (2007). Loudness con-
trol in pianists as exemplified in keystroke force measurements on different touches.
Journal of the Acoustical Society of America, 121(5):2959–2969.

Kirk, R.E. (1959). Tuning preference for piano unison groups. Journal of the Acoustical
Society of America, 31(12):1644–1648.

Kirke, A. and Miranda, E.R. (2013). Guide to Computing for Expressive Music
Performance. London, UK: Springer.

Kochevitsky, G. (1967). The Art of Piano Playing: A Scientific Approach. Secaucus,
NJ: Summy-Birchard Music.

Krumhansl, C.L. (1989). Why is musical timbre so hard to understand? In Structure
and Perception of Electroacoustic Sound and Music, edited by S. Nielzén and O. Ols-
son. Amsterdam, Netherlands: Elsevier, 43–53.

Krumhansl, C.L. and Iverson, P. (1992). Perceptual interactions between musical
pitch and timbre. Journal of Experimental Psychology, 18(3):739–751.

Kruskal, J.B. (1964a). Multidimensional scaling by optimizing goodness of fit to a
nonmetric hypothesis. Psychometrika, 29(1):1–27.

Kruskal, J.B. (1964b). Nonmetric multidimensional scaling: A numerical method.
Psychometrika, 29(2):115–129.

Lanners, T. (2002). Hearing Voices? Addressing the Subject of Balancing Voices in
Pianistic Textures. American Music Teacher, 51(5):30–34.

Large, E.W. (1993). Dynamic programming for the analysis of serial behaviors. Be-
havior Research Methods, 25(2):238–241.

Lauly, S. (2010). Modélisation de l'interprétation des pianistes & applications d'auto-
encodeurs sur des modèles temporels. Master's thesis, Université de Montréal,
QC.

Lavoie, M. (2008). Le timbre intra-instrumental en musicologie: proposition de méth-
odes d'analyse dérivées de la linguistique. Master's thesis, Université de Montréal,
QC.

Lehtonen, H.M. (2010). Analysis, perception, and synthesis of the piano sound. PhD
thesis, Aalto University School of Science and Technology, Espoo, Finland.

Lehtonen, H.M., Askenfelt, A., and Välimäki, V. (2009). Analysis of the
part-pedaling effect in the piano. Journal of the Acoustical Society of America,
126(2):EL49–EL54.

Lehtonen, H.M., Penttinen, H., Rauhala, J., and Välimäki, V. (2007). Analysis
and modeling of piano sustain-pedal effects. Journal of the Acoustical Society of
America, 122(3):1787–1797.

Leipp, E. (1971). Acoustique et musique. Paris, France: Masson.

Leman, M. (2008). Embodied Music Cognition and Mediation Technology. Cambridge,
MA: MIT Press.

Lhevinne, J. (1924). Basic Principles in Pianoforte Playing. Philadelphia, PA:
Theodore Presser Co.

Lichte, W.H. (1941). Attributes of complex tones. Journal of Experimental Psychology,
28(6):455–480.

Likert, R. (1932). A Technique for the Measurement of Attitudes. Archives of Psy-
chology, 140:1–55.

MacRitchie, J. (2011). Elucidating musical structure through empirical measurement
of performance parameters. PhD thesis, University of Glasgow, UK.

MacRitchie, J. and Zicari, M. (2012). The Intentions of Piano Touch. In Pro-
ceedings of the 12th International Conference on Music Perception and Cognition,
Thessaloniki, Greece. 636–643.

Marozeau, J. (2004). L'effet de la fréquence fondamentale sur le timbre. PhD thesis,
Université Pierre et Marie Curie (Paris VI), France.

MATLAB (2009). Release 7.8.0 (R2009a). Natick, MA: The MathWorks Inc.

Matthay, T. (1913). Musical Interpretation: Its Laws and Principles, and Their Appli-
cation in Teaching and Performing. Boston, MA: Boston Music Co. Revised edition.

Matthay, T. (1932). The Visible and Invisible in Pianoforte Technique. London, UK:
Oxford University Press.

McAdams, S. (1999). Perspectives on the contribution of timbre to musical structure.
Computer Music Journal, 23(3):85–102.

McAdams, S. and Deliège, I. (Eds.) (1988). La musique et les sciences cognitives.
Liège, Belgium: Mardaga.

McAdams, S., Depalle, P., and Clarke, E.F. (2004). Analyzing musical sound. In
Empirical Musicology: Aims, Methods, Prospects, edited by E.F. Clarke and N. Cook.
Oxford, UK: Oxford University Press, 157–196.

McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., and Krimphoff,
J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions,
specificities, and latent subject classes. Psychological Research, 58:177–192.

McPherson, A. and Kim, Y. (2011). Multidimensional gesture sensing at the piano
keyboard. In CHI '11: Proceedings of the 2011 Annual Conference on Human Factors
in Computing Systems, Vancouver, BC. New York: ACM, 2789–2798.

Melara, R.D. and Marks, L.E. (1990). Interaction among auditory dimensions: Tim-
bre, pitch, and loudness. Perception & Psychophysics, 48(2):169–178.

Moog, R.A. and Rhea, T.L. (1990). Evolution of the keyboard interface: The
Bösendorfer 290 SE recording piano and the Moog multiply-touch-sensitive key-
boards. Computer Music Journal, 14(2):52–60.

Moore, F.R. (1988). The dysfunctions of MIDI. Computer Music Journal, 12(1):19–28.

Nattiez, J.J. (2007). Le timbre est-il un paramètre secondaire? In Les cahiers de
la Société québécoise de recherche en musique: Le timbre musical: Composition,
interprétation, perception et réception, Volume 9 (1–2), October, edited by C. Caron,
C. Traube, and S. Lacasse. Montreal, QC: OICRM, 13–24.

Neuhaus, H. (1973). The Art of Piano Playing. London, UK: Barrie and Jenkins.
Translated from Russian by K.A. Leibovitch.

Nosulenko, V.N., Samoylenko, E.S., and McAdams, S. (1994). L'analyse de
descriptions verbales dans l'étude des comparaisons de timbres musicaux. Journal de
Physique IV, 4(C5):637–640.

Noyle, L.J. (1987). Pianists on Playing: Interviews with Twelve Concert Pianists.
Metuchen, NJ: Scarecrow Press.

Nykänen, A. and Johansson, Ö. (2003). Development of a Language for Specify-
ing Saxophone Timbre. In Proceedings of the Stockholm Music Acoustics Conference
(SMAC03), Stockholm, Sweden. 647–650.

Ortmann, O. (1925). The Physical Basis of Piano Touch and Tone: An Experimental
Investigation of the Effect of the Player's Touch upon the Tone of the Piano. New York:
E.P. Dutton.

Ortmann, O. (1929). The Physiological Mechanics of Piano Technique: An Experi-
mental Study of the Nature of Muscular Action as Used in Piano Playing and of the
Effects Thereof upon the Piano Key and the Piano Tone. New York: E.P. Dutton.

Ortmann, O. (1935). What is Tone-Quality? Musical Quarterly, 21(4):442–450.

Osgood, C.E., Suci, G.J., and Tannenbaum, P. (1957). The Measurement of Mean-
ing. Urbana-Champaign, IL: University of Illinois Press.

Ott, B. (1987). Liszt et la pédagogie du piano: Essai sur l'art du clavier selon Liszt.
Issy-les-Moulineaux, France: E.A.P.

Palmer, C. (1989). Mapping musical thought to musical performance. Journal of
Experimental Psychology: Human Perception and Performance, 15(2):331–346.

Palmer, C. (1996a). Anatomy of a performance: Sources of musical expression. Music
Perception, 13(3):433–453.

Palmer, C. (1996b). On the assignment of structure in music performance. Music
Perception, 14(1):23–56.

Palmer, C. (1997). Music performance. Annual Review of Psychology, 48(1):115–138.

Palmer, C. and Brown, J.C. (1991). Investigations in the amplitude of sounded piano
tones. Journal of the Acoustical Society of America, 90(1):60–66.

Parncutt, R. (1997). Modeling piano performance: Physics and cognition of a virtual
pianist. In Proceedings of the International Computer Music Conference (ICMC 97),
Thessaloniki, Greece, edited by T. Rikakis. San Francisco, CA: International Com-
puter Music Association, 15–18.

Parncutt, R. (2003). Accents and expression in piano performance. In Perspektiven
und Methoden einer Systemischen Musikwissenschaft, edited by K.W. Niemöller and
B. Gätjen. Frankfurt, Germany: Peter Lang, 163–185.

Parncutt, R. and McPherson, G. (2002). The Science and Psychology of Music
Performance: Creative Strategies for Teaching and Learning. Oxford, UK: Oxford
University Press.

Parncutt, R., Sloboda, J.A., Clarke, E.F., Raekallio, M., and Desain, P.
(1997). An ergonomic model of keyboard fingering for melodic fragments. Music
Perception, 14(4):341–382.

Parncutt, R. and Troup, M. (2002). Piano. In The Science and Psychology of Music
Performance: Creative Strategies for Teaching and Learning, edited by R. Parncutt
and G. McPherson. New York: Oxford University Press, 285–302.

Payeur, P., Côté, M., and Comeau, G. (2006). Les technologies de l'imagerie au
service de l'analyse du mouvement en pédagogie du piano. Recherche en éducation
musicale, 24:61–87.

Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in Space.
Philosophical Magazine, 2(6):559–572.

Pfordresher, P. and Palmer, C. (2002). Effects of delayed auditory feedback on
timing of music performance. Psychological Research, 66(1):71–79.

Philipp, L.H. (1969). Piano Technique: Tone, Touch, Phrasing and Dynamics. New
York: MCA Music. Second edition: 1982, New York: Dover.

Pitt, M.A. (1994). Perception of pitch and timbre by musically trained and untrained
listeners. Journal of Experimental Psychology: Human Perception and Performance,
20(5):976–986.

Plomp, R. (1970). Timbre as a multidimensional attribute of complex tones. In Fre-
quency Analysis and Periodicity Detection in Hearing, 397–414.

Plomp, R. (1976). Aspects of Tone Sensation. New York: Academic Press.

Pratt, R.L. and Doak, P.E. (1976). A subjective rating scale for timbre. Journal of
Sound and Vibration, 45(3):317–328.

Repp, B.H. (1990). Patterns of expressive timing in performances of a Beethoven minuet
by nineteen famous pianists. Journal of the Acoustical Society of America, 88(2):622–
641.

Repp, B.H. (1992). Diversity and commonality in music performance: An analysis of
timing microstructure in Schumann's "Träumerei". Journal of the Acoustical Society
of America, 92(5):2546–2568.

Repp, B.H. (1993). Some empirical observations on sound level properties of recorded
piano tones. Journal of the Acoustical Society of America, 93(2):1136–1144.

Repp, B.H. (1994). Relational invariance of expressive microstructure across global
tempo changes in music performance: An exploratory study. Psychological Research,
56(4):269–284.

Repp, B.H. (1995a). Acoustics, perception, and production of legato articulation on a
digital piano. Journal of the Acoustical Society of America, 97(6):3862–3874.

Repp, B.H. (1995b). Expressive timing in Schumann's "Träumerei": An analysis of per-
formances by graduate student pianists. Journal of the Acoustical Society of America,
98(5):2413–2427.

Repp, B.H. (1996a). The dynamics of expressive piano performance: Schumann's
"Träumerei" revisited. Journal of the Acoustical Society of America, 100(1):641–650.

Repp, B.H. (1996b). Patterns of note onset asynchronies in expressive piano perfor-
mance. Journal of the Acoustical Society of America, 100(6):3917–3932.

Repp, B.H. (1996c). Pedal timing and tempo in expressive piano performance: A pre-
liminary investigation. Psychology of Music, 24(2):199–221.

Repp, B.H. (1997a). Acoustics, perception, and production of legato articulation on
a computer-controlled grand piano. Journal of the Acoustical Society of America,
102(3):1878–1890.

Repp, B.H. (1997b). The effect of tempo on pedal timing in piano performance. Psy-
chological Research, 60(3):164–172.

Repp, B.H. (1997c). Expressive timing in a Debussy Prelude: A comparison of student
and expert pianists. Musicae Scientiae, 1(2):257–268.

Repp, B.H. (1998a). A microcosm of musical expression: I. Quantitative analysis of
pianists' timing in the initial measures of Chopin's Etude in E major. Journal of the
Acoustical Society of America, 104(2):1085–1100.

Repp, B.H. (1998b). Perception and Production of Staccato articulation on the Piano.
Unpublished manuscript, Haskins Laboratories.
URL http://www.haskins.yale.edu/haskins/STAFF/repp.html.

Repp, B.H. (1999a). Control of expressive and metronomic timing in pianists. Journal
of Motor Behavior, 31(2):145–164.

Repp, B.H. (1999b). Effects of auditory feedback deprivation on expressive piano per-
formance. Music Perception, 16(4):409–438.

Repp, B.H. (1999c). A microcosm of musical expression: II. Quantitative analysis of
pianists' dynamics in the initial measures of Chopin's Etude in E major. Journal of
the Acoustical Society of America, 105(3):1972–1988.

Riche, N. (2010). Conception d'une application informatique pour l'analyse des
paramètres articulatoires en lien avec la production du timbre au piano. Master's
thesis, Université de Montréal, Canada and Polytech Mons, Belgium.

Rimsky-Korsakov, N. and Steinberg, M. (1964). Principles of Orchestration: With
Musical Examples Drawn from His Own Works. New York: Dover. First edition: 1873.

Rink, J. (1995). The Practice of Performance: Studies in Musical Interpretation. Cam-
bridge, UK: Cambridge University Press.

Rink, J. (2002). Musical Performance: A Guide to Understanding. Cambridge, UK:
Cambridge University Press.

Risset, J.C. (1978). Musical Acoustics. Technical report 8, IRCAM, Paris, France.

Risset, J.C. (1994). Modèle physique et perception – Modèle physique et composition.
In Colloque International Modèles Physiques, Création Musicale et Ordinateur, Vol.
III, Grenoble, France. Paris, France: Fondation de la Maison des sciences de l'homme,
711–720.

Risset, J.C. (2004). Timbre. In Musiques. Une encyclopédie pour le XXIe siècle, 2: Les
savoirs musicaux, edited by J.J. Nattiez. Arles, France: Actes Sud, 134–161.

Risset, J.C. and Mathews, M. (1969). Analysis of musical-instrument tones. Physics
Today, 22(2):23–30.

Risset, J.C. and Wessel, D.L. (1999). Exploration of timbre by analysis and synthesis.
In The Psychology of Music, edited by D. Deutsch. San Diego, CA: Academic Press,
113–169. Second edition (first ed.: 1982).

Rossing, T.D., Wheeler, P.A., and Moore, F.R. (1990). The Science of Sound. San
Francisco, CA: Addison-Wesley.