

Emerging Communication
Studies in New Technologies and Practices in Communication

Emerging Communication publishes state-of-the-art papers that examine a broad range of issues
in communication technology, theories, research, practices and applications.
It presents the latest development in the field of traditional and computer-mediated
communication with emphasis on novel technologies and theoretical work in this
multidisciplinary area of pure and applied research.
Since Emerging Communication seeks to be a general forum for advanced communication
scholarship, it is especially interested in research whose significance crosses disciplinary and
sub-field boundaries.

Giuseppe Riva, Applied Technology for Neuro-Psychology Lab., Istituto Auxologico
Italiano, Milan, Italy
Fabrizio Davide, TELECOM ITALIA Learning Services S.p.A., Rome, Italy

Editorial Board
Luigi Anolli, University of Milan-Bicocca, Milan, Italy
Cristina Botella, Universitat Jaume I, Castellon, Spain
Martin Holmberg, Linköping University, Linköping, Sweden
Ingemar Lundström, Linköping University, Linköping, Sweden
Salvatore Nicosia, University of Tor Vergata, Rome, Italy
Brenda K. Wiederhold, Interactive Media Institute, San Diego, CA, USA
Luciano Gamberini, State University of Padua, Padua, Italy

Volume 11
Previously published in this series:

Vol. 10. F. Morganti, A. Carassa and G. Riva (Eds.), Enacting Intersubjectivity: A Cognitive
and Social Perspective on the Study of Interactions
Vol. 9. G. Riva, M.T. Anguera, B.K. Wiederhold and F. Mantovani (Eds.), From
Communication to Presence
Vol. 8. R. Baldoni, G. Cortese, F. Davide and A. Melpignano (Eds.), Global Data
Vol. 7. L. Anolli, S. Duncan Jr., M.S. Magnusson and G. Riva (Eds.), The Hidden Structure
of Interaction
Vol. 6. G. Riva, F. Vatalaro, F. Davide and M. Alcañiz (Eds.), Ambient Intelligence
Vol. 5. G. Riva, F. Davide and W.A. IJsselsteijn (Eds.), Being There
Vol. 4. V. Milutinović and F. Patricelli (Eds.), E-Business and E-Challenges
Vol. 3. L. Anolli, R. Ciceri and G. Riva (Eds.), Say Not to Say: New Perspectives on

ISSN 1566-7677 (print)

ISSN 1879-8349 (online)
Reinterpreting Gesture as Language
Language in Action

Nicla Rossini

Amsterdam • Berlin • Tokyo • Washington, DC

© 2012 The author and IOS Press.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-975-2 (print)

ISBN 978-1-60750-976-9 (online)

Cover illustration: Nicla Rossini, "Language", Mixed Media, 2006.

Library of Congress Control Number: 2011943120

IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
fax: +31 20 687 0019
e-mail: order@iospress.nl

Distributor in the USA and Canada

IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
fax: +1 703 323 3668
e-mail: iosbooks@iospress.com

The publisher is not responsible for the use which might be made of the following information.



Foreword

At present, the enquiry into gesture has reached its maturity as a branch of study which
endorses a multidisciplinary approach to communication. Notwithstanding its spread
into a great number of sciences (Psychology, Psycholinguistics, Ethnology, among
others), in recent times little attention has been paid to the phenomena involved, as far
as the linguistic point of view is concerned. In particular, the communicative function
of gesture has not been addressed enough from a strictly linguistic point of view.
The aim of the present volume is to exploit some methodological instruments
provided by Linguistics in order to restore gesture to its original purview within the
field. Such a project implies the use of those empirical methodological tools with which
psychologists (and also linguists) are familiar. In doing so, the data presented here are
analysed as pieces of information that describe behaviour, that are also an integral part
of the more complex phenomenon of human communication. To the extent that a study
of this kind deals with gesture, a number of theoretical linguistic questions must be
The major claim of this book is that gesture and speech share the same cognitive,
psychological and physiological roots. In fact, gesture will here be claimed to be
integral to human language, its function within human communication being as much
goal-directed (MacKay, 1972) and, subsequently, communicative as speech.
Evidence for this assumption is provided by means of experiments on hearing and
deaf subjects, in addition to a review of the major findings about the use and function
of gesture in situations of handicap, such as aphasia and blindness. The ideas proposed
here are the result of a long speculation on the role of gesture in communicative acts, on
the one hand, and with respect to language, on the other hand, matured during the
decade of my professorship in Non-Verbal Communication, which began at the
University of Pavia, and is now continuing at both national and international levels.


Acknowledgements

A book is always the result of an incessant dialectic between oneself and others. I am
profoundly grateful to David McNeill for his unfailingly enthusiastic willingness to discuss
every idea with me: his encouragement, and especially his suggestions when I was
missing the point, have been extremely helpful. My deep gratitude also goes to Karl-
Erik McCullough, Fernando Poyatos, and Anna Esposito for their insightful
suggestions and their support. I am indebted to Dafydd Gibbon at the University of
Bielefeld for extensively discussing my ideas with me. At the University of Bielefeld I
also had the privilege of discussing these ideas with Katharina Rohlfing, Stefan Kopp,
and Ipke Wachsmuth. I was also blessed with the friendly comments and
encouragement of Iris Nomikou, Carolin Kirchhoff, Sascha Griffith and Zofia Malisz.
This book stems from my Ph.D. experience at the Department of Linguistics,
Università di Pavia, where I had the opportunity to work with Marina Chini and
Gianguido Manzelli, and to meet friends and brilliant colleagues such as Andrea Sansò,
Federica Da Milano, Cristina Mariotti, Cristiano Broccias, and Nicoletta Puddu. A
research stay at the Center for Speech and Gesture at the University of Chicago was
crucial. There I had the opportunity to talk with outstanding scholars in a particularly
welcoming environment: among them, Susan Duncan, Gale Stam, Mika Ishino, Irene
Kimbara, Karl-Erik McCullough and Fey Parrill who strongly influenced my scholarly
development and growth.
A particular acknowledgment goes to Sotaro Kita for his willingness to discuss
this latest version of the book with me at the University of Birmingham, and to Andrea
Sansò for his friendly and insightful suggestions on a previous version of Chapter 7. I
am of course also indebted to all my students, who are constantly coming up with new
ideas from new perspectives.
I should also like to thank Erik Lautenschlager at the University of Luxembourg,
my coordinator first, and now a friendly presence, for being an unchanging reference
point during my experience as visiting faculty at the BSE.
Finally, I should also thank Jimmy the cat for teaching me about communication
beyond the limitations of a single species.

Index of Figures

Figure 1: Kendon's (1986) analysis of kinesics compared to that by Birdwhistell (1952)
Figure 2: Ekman and Friesen's parameters for gesture categorization
Figure 3: Kendon's continuum (McNeill, 1992:37)
Figure 4: an example of retelling inaccuracy with manner-mismatching input (Cassell, McNeill and McCullough, 1999:15)
Figure 5: gesture as a prototype category
Figure 6: the development of the gesture category as a metonymic chain (Rossini, 2001, revised)
Figure 7: Percentage of gestures performed during each session
Figure 8: an instance of emblem performed by S1 during the third session
Figure 9: instance - emblematic phrase performed by S2 during the third session
Figure 10: the occurrence of gesture and speech within communicative acts
Figure 11: "Addio monti" (from the novel I Promessi Sposi by Alessandro Manzoni, ch. VIII)
Figure 12: "San Martino" (by Giosuè Carducci, Rime Nuove)
Figure 13: S1 during the first nine seconds
Figure 14: S1 during seconds 9-20
Figure 15: S1 at 30-40 seconds
Figure 16: S1 at 50-60 seconds
Figure 17: S1 at 140-150 seconds
Figure 18: S1 at 220-230 seconds
Figure 19: S1 at 150-160 seconds
Figure 20: S2's poetry part with multi-tasking
Figure 21: S9 during hesitation and false start
Figure 22: Locus in S1
Figure 23: Locus in S1
Figure 24: Locus in S2
Figure 25: place of articulation in Italian Sign Language: the case of "house"
Figure 26: Gesturing rate: analysis
Figure 27: Gesturing rate: results
Figure 28: an instance of kinetic unit composed of several gesture phrases
Figure 29: a hypothesis for the evolution of gesture as a communicative device
Figure 30: The Audio-Visual Communication System
Figure 31: The determination of Size in gesture
Figure 32: Loci in gesture
Figure 33: key to abbreviations
Figure 34: Levelt's model (1989: 9)
Figure 35: Krauss et al.'s (2001: 34) model for speech and gesture production
Figure 36: De Ruiter, 2000: 198
Figure 37: Computational model for AVC output
Figure 38: instance of on-line integration of the verbal and non-verbal modalities by the speaker
Figure 39: instance of on-line integration of the verbal and non-verbal modalities by both speaker and receiver
Figure 40: case of gestural syntax
Figure 41: instances of complex gestures in a) map task and b) face-to-face interaction, compared with data available from c) spontaneous route description (McCullough, 2005: 116)
Figure 42: probable palm-down flap in an Italian subject intent in a face-to-face guessing
Figure 43: probable case of palm-down flap in an American subject (from McCullough, 2005: 121)
Figure 44: Case of lateralized gestural response to planning activity in S1
Figure 45: Lateralized gestural response with palm-down flap in S3
Figure 46: Lateralized planning gesture in S6
Figure 47: Lateralized gestural response in a left-handed participant in the role of Follower
Figure 48: Lateralized response to space description in S5. Left hand describing a path on the left side of the map
Figure 49: Lateralized response to space description in S5. Right hand describing the same path on the left side of the map
Figure 50: Lateralized linguistic planning in S7
Figure 51: Online lateralized gestural response in S7
Figure 52: Software of an ECA (Cassell, Vilhjálmsson, and Bickmore, 2001: 479)
Figure 53: architecture of the robot Maggie, with focus on the decision-making system (Malfaz et al., 2011: 237)
Figure 54: Architecture of the iCub (Vernon, von Hofsten, Fadiga, 2011: 126)
Figure 55: expressivity of the MIT social robot Nexi
Figure 56: mimicry in the iCub
Figure 57: facial mimicry in GRETA (Mancini, Bresin, Pelachaud, 2007: 1839)
Figure 58: Nexi's synchronisation between speech, gesture, and expressions
Figure 59: transcription of a chunk of GRETA's production. Square brackets show the parts of speech with which the non-verbal cues are synchronized (Rossini, 2011: 99)
Figure 60: "hello" gesture in GRETA's performance. As can be seen, the hand performing the gesture is rigid and completely spread, as if performing a sign language token
Figure 61: proposal for a new architecture

Index of Tables

Table 1: Birdwhistell's model for kinetic analysis
Table 2: Gesture and Prototype Theory. S1 Experiment Results
Table 3: Gesture and Prototype Theory. S2 Experiment Results
Table 4: Gesture and Prototype Theory. S3 Experiment Results
Table 5: Gesture and Prototype Theory. S4 Experiment Results
Table 6: Gesture and Prototype Theory. S5 Experiment Results
Table 7: number of gestures performed during each session
Table 8: multi-tasking experiment. Overview of subjects' performances
Table 9: Gesture in Deaf Subjects. Statistics

Foreword
Acknowledgements
Index of Figures
Index of Tables

1. Introduction
1.1. Precise

2. Non-Verbal Communication: Towards a Definition
Overview
2.1. State of the Art
The Place for Linguistics
2.2. Non-verbal Communication vs. Non-verbal Behaviour: Towards a Definition
Summary

3. Defining Gesture
Overview
3.1. What Is Gesture? Getting More Focused
3.2. Terminological Note About the Classification of Gestures: Adopting McNeill's Model
Summary

4. The Cognitive Foundations of Gesture
Overview
4.1. On the Psychological Foundations of Gesture: Is Gesture Non-Verbal?
4.2. The Functions of Gesture Within Communicative Acts
4.3. The Emergence of Gesture in Infants
4.4. Gesture and Aphasia
4.5. Gesture in Blind Subjects

5. Towards the Interpretation of Gesture as a Prototype Category: Gestures for the Speaker?
Overview
5.1. Gestures for the Speaker? State of the Art
5.2. Reinterpreting Gesture as a Prototype Category
5.2.1. Results
5.3. Is Gesture Communicative?
Summary

6. Language in Action
Overview
6.1. The Neurological Correlates of Language
6.2. Gesture in the Brain: Experiment on Gesture-Speech Synchronisation in Multi-Tasking Activities
6.2.1. State of the Art
6.2.2. Experiment Setting
6.2.3. Results
6.2.4. Discussion and Further Research
Summary

7. Gesture in Deaf Orally-Educated Subjects: An Experiment
Overview
7.1. The Experiment
7.2. Analysis of the Data
7.3. Gesture in Deaf Subjects: Some Remarkable Phenomena
7.3.1. Locus
7.3.2. Point of Articulation
7.3.3. Gesturing Rate
7.4. Why Do We Gesture? First Conclusions
Summary

8. Reintegrating Gesture: Towards a New Parsing Model
Overview
8.1. The Audio-Visual Communication System
8.2. About the Morphology of Gesture
8.3. Handling Recursion
8.3.1. Existing Models
8.4. Towards a Computational Model for AVC Parsing
Summary

9. Private Language
Overview
9.1. State of the Art
9.2. The Map-Task Experiment
9.2.1. Co-Verbal Gestures and Other Non-Verbal Cues in Map-Task Activities: Language for the Self
9.2.2. A Case Study of Map-Task Activity: Full Transcripts
9.3. Co-Verbal Gestures and Planning in Conditions of Blocked Visibility and Face-to-Face: An Overall View
9.4. Lateralization Phenomena in Gesture
9.4.1. Instances of Lateralized Gestural Processing
9.5. Discussion
Summary

10. The Importance of Gesture and Other Non-Verbal Cues in Human-Machine Interaction: Applications
Overview
10.1. State of the Art
10.1.1. Architecture of ECAs
10.1.2. Architecture of a Robot
10.2. Expressions and Gestures in Artificial Agents
10.3. Patterns of Synchronisation of Non-Verbal Cues and Speech in Agents: Analysis of Common Problems
10.4. Proposal for a More Natural Agent
Summary

Conclusions
References
Appendix I
Appendix II
Index of Topics
Index of Authors

1. Introduction
"Every thought of mine, along with its
content, is an act or deed that I perform – my
own individually answerable act or deed"
(M. M. Bakhtin, Toward a Philosophy of the
Act, 3)

The modern study of gesture has its roots in the 17th century, when the first books
devoted exclusively to this topic appeared,¹ although the enquiry into the role of
gesture began to flourish in France, in the 18th century, during which time the study of
gesture came to be considered a key to the comprehension of the origin of human
language and thought. In particular, Condillac (1756) and Diderot (1751) extensively
wrote about gesture. In the 19th century, Tylor (1878) and Wundt (1901) expressed the
belief that the study of gesture might illuminate the transition from spontaneous
individual expression to codified language. And yet, the phenomenon was always
addressed in an unsystematic way largely aimed at providing evidence for
philosophical constructions focused on the origin of society. In fact, it was within the
fields of Ethology and Biology that non-verbal communication was first studied as an
autonomous system (see for instance Lorenz, 1939; Eibl-Eibesfeldt, 1949 and
following; or Darwin, 1872), the topic becoming an increasing focus of interest
from the end of the 19th century onward. The rising importance of this relatively new
branch of study is shown by the interest of scholars from a range of different
disciplines, including both psychologists (see Paul Ekman, 1957 and following; George
F. Mahl, 1968; David McNeill, 1985 and subsequent), and anthropologists (see for
example Andrea De Jorio, 1835; De Laguna, 1927; Marcel Jousse, 1925-1974). More
recently, linguists have turned their attention to non-verbal communication, although
work on this topic is still for the most part considered to be a particular
application of relatively independent branches of the discipline, such as Pragmatics
(which investigates the practical instantiations of language use in given
communicative contexts) and Computational Linguistics.
Throughout the twentieth century, only a few linguists were interested in the
analysis of non-verbal cues during speaking. Among the most famous, however, we
should remember Leonard Bloomfield (1933), who does take gesture into consideration
in his speculation about human language. In doing so, he remarks that gesture
accompanies speech, and is subject to social conventions, but that it is obvious in its
function (Bloomfield, 1933:39). Another linguist interested in gesture, Dwight
Bolinger (1946), reckoned that the boundary between what is considered to be
language and what is not depends on arbitrary classifications contingent upon the
extent to which the phenomena in question can align with a structural analysis. His
reflections on the topic led him to the conclusion, strikingly modern even today,
that language is embedded in gesture (1975:18).

¹ According to Morris (1979), the first treatise exclusively devoted to the use of gesture is
due to Cresollius and was published in Paris, in 1620. Kendon (1982, 2004) notes that one of the
first works exclusively devoted to gesture in Europe is due to Giovanni Bonifacio (see Kendon,
2004: 23-24 for this author) and was published in 1616, while the first work in English was
published in 1644 by John Bulwer, and it was entitled Chirologia, or the Natural Language of
the Hand, whereunto is added Chironomia, or the Art of Manual Rhetoric (see Kendon, 2004: 25-

The most extensive theorization of gesture as part of the human linguistic capacity
is probably that of Kenneth L. Pike (1967), who approaches the field from the
perspective of his Unified Theory of the Structure of Human Behaviour: according to
this theory, language is only one phase of human activity, which should not be
dissociated from other phases of human behaviour, in order to be understood
completely. To demonstrate his point, he reports a famous and still popular children's game in
which the words in a sentence are progressively replaced with gestures, which would
clearly show that non-spoken forms can be structurally integrated with spoken ones.
Still, the interest of linguists in this branch of studies is certainly not systematic:
gesture has always been regarded as a minor aspect, or even merely a collateral effect,
of human communication and, thus, a relatively uninteresting part of it. David
McNeill's decades-long research on gestures and their relation to language represents
an eminent exception, probably the only one in this field. Nevertheless, the importance
of an analysis of gesture and other non-verbal cues for a better understanding of language
has been recognized by at least some linguists during the past three decades. In Italy,
Alberto A. Sobrero was the first linguist to encourage the study of gesture in his book
on the usage of contemporary Italian published in 1993. At about the same time,
Giacomo Ferrari (1997) suggested that the analysis of non-verbal phenomena is
relevant to linguistic enquiry. His interest in this branch of investigation subsequently
gave birth to a model for the interpretation of the communicative functions of gesture,
face, and gaze within a Computational Linguistic framework, and has also led him to
suggest a reorganization of Linguistics as a discipline, with a redefinition of the basic
concepts and object of enquiry (Ferrari, 2007). Also, Tullio De Mauro devotes a
conspicuous section of his 1982 publication to the semiotics of non-verbal codes and
strongly encourages the study of gestures as relevant to the interpretation of human
communication in his report closing the 40th meeting of the Società di Linguistica
Italiana, held at the Università del Piemonte Orientale "A. Avogadro", Vercelli (Italy),
in September 2006.
Lately, some linguists have devoted their attention to gesture studies and
non-verbal communication: particularly worthy of mention are Marianne Gullberg, who
is currently working on the relation between co-verbal gestures and speech in first and
second language acquisition, and Sotaro Kita, Susan Duncan, Fey Parrill and other
scholars of David McNeill's school.
This book follows the ideas of the linguists who focus on gesture as a key for the
comprehension of language use in face-to-face interaction, and the aim here pursued is
to provide a linguistic interpretation of gesture, as systematic and coherent as possible.
Some of the basic questions about gesture and non-verbal communication will be
addressed, with special focus on the perspective that linguistic enquiry may bring to
this topic.
The major theoretical premise of my research is that, communication being what
MacKay (1999:3) defines as simply communicatio (from Latin communicatio,
communicationis: to share with someone the act of communication or distributing), no
theoretical distinction is needed between the act of communication itself and the act of
interacting, on the one hand, and between language as faculty and communication as
act, on the other hand. According to MacKay, what counts as communication is a
cognitive phenomenon, taking place whenever an organism interacts with itself or with
other living organisms in order to modify their behaviour. Language is therefore
interpreted as either perception, introspection, self-control, self-orientation of thought,
or output of a message. Such a message is, by its very nature, multimodal, resulting
from many different mechanisms taking place simultaneously.
The theoretical framework of this work is rooted in that put forward by David
McNeill (1992, 2004, among others) positing a profound correlation between speech
and gesture, although by reconceiving the phenomena of relevance during spoken
interaction and the relations between them, a somewhat modified version of the theory
emerges. Not only, in fact, is gesture here claimed to be integral to language-as-faculty,
the means of expression of human thought (Whitney, 1899), but it is also
considered as a code itself, strikingly similar to the codes instantiating the faculty of
language, i.e., the spoken languages, which are unanimously considered as the object of
investigation par excellence of linguistic enquiry. As a consequence, the multimodal
signal is described and analysed as a whole, following a mainly structural approach
aimed at providing an exhaustive description of the relations existing between the parts
that constitute the final output. Substantial emphasis is placed on the predominant role that
linguistic investigation has come to play in my view of non-verbal phenomena as related to
language and communication. In particular, this book is informed by the work I have
done during the years of my professorship of non-verbal communication at the
University of Pavia. In addition, the data gathered through both independent field
research and several research projects that I have conducted at the Department of
Linguistics, Università di Pavia, at the Center for Speech and Gesture, The University
of Chicago, and at the Laboratory of Computational Linguistics and Text
Understanding (Li.Co.T.T.), Università del Piemonte Orientale, provide a sizeable
corpus from which many examples of the phenomena in question have been taken.

1.1. Précis

Although conceived as a research monograph, the different chapters of this book can be
read independently. Nonetheless, it would be useful to first read Chapter 2 and Chapter
3, which define the object of investigation and provide the indispensable terminological
coordinates for the reading of the chapters to come, as well as a brief survey of the state
of the art. Chapter 4 addresses the central question of the cognitive foundations of
gesture. The chapter is divided into two parts: the first presents previous research into
this topic, with special attention to the findings reported in McNeill (1992) and Cassell,
McNeill and McCullough (1999) about the function of (co-verbal) gestures in face-to-
face interaction. The second part of the chapter provides evidence available from
research already conducted on the emergence of language in children, the study of
aphasia and the study of gesture in blind subjects. Chapter 5 develops a reinterpretation
of the problem of intentionality: a solution to the still debated question of intentionality
in gesture is proposed by means of the application of the Prototype Theory (Taylor,
1995) to gesture, which is interpreted as a modular category that is, in turn, integral to a
wider phenomenon named audio-visual communication. In doing so, gestural
phenomena and concurrent speech are described as deeply interrelated sub-modules
obeying a compensatory rule. The gestural sub-module is analysed according to several
linguistic features, such as intentionality and awareness (following Ekman and Friesen,
1969), abstraction, arbitrariness, and (linguistic) extension. The establishment of such
features for the synchronic description of the gesture provides a linguistic explanation
for the identification of emblems as the most language-like members of the gesture
inventory (McNeill, 1992), although, in my view, these gestures are still co-verbal [2].
Moreover, the synchronic analysis of gesture provides a theoretical framework for
addressing the question of the communicativeness of certain co-verbal gestures. The
theoretical approach adopted is corroborated by the results of an experiment on gestural
inhibition in formal vs. informal situations, which was conducted in 2002 at the Center
for Gesture and Speech, Department of Psychology, The University of Chicago.
Chapter 6 is devoted to the discussion of the neuro-motor basis for the hypothesis
of a single cognitive origin for speech and gesture. The recent and still somewhat
controversial discovery of the existence of mirror neurons in the human brain is
discussed within a general framework that views communication as a process of
abstraction that exploits the whole range of human neuro-motor production in order to
convey meaning. Edelman's theory of neuronal selection and global mappings is
discussed, with particular attention to the new perspective on the functions of brain
areas put forward by his most recent findings relating to mirror-neuron-like structures
in humans. A further consideration of Massaro's (1994) theory on speech perception
and brain modularity leads to a reinterpretation of communication qua language as
deeply rooted in action performance and perception. Moreover, the results of a research
project conducted at the Li.Co.T.T., Università del Piemonte Orientale, aimed at
assessing gesture-speech synchronisation patterns in multi-tasking activities are
presented as further evidence of the deep neuro-motor linkage between gesture and
speech.
Chapter 7 presents the results of a study of speech and gesture synchronisation in
profoundly deaf orally educated subjects. The study, originally conceived to provide
further evidence of a single psychological and cognitive origin for speech and gesture,
also leads to interesting further insights into the qualitative and quantitative nature of
co-verbal gestures performed by deaf subjects educated in oral production either with
or without the support of acoustic devices. In the latter case, the oral production was
elicited using Italian Sign Language.
Chapter 8 is devoted to a closer analysis of audio-visual communication, with
special attention to gesture. A structural description of the audio-visual communication
system in both signal production and perception is developed, together with some
consideration of the question of morphology in gesture. A proposal for a formal model
for the description of the gestural production phase is also attempted. The formal model
proposed for the description of audio-visual communication, based on a review of
models developed following upon Levelt's (1989) model for speech production, is not
based upon the theoretical approach to language, which I define as linearism, most
commonly assumed in computational linguistics. Instead, the proposal of this formal
model is taken as an opportunity to address the debate over recursion in language
originally put forward by Chomsky and Miller (1963), and recently revised in Hauser,
Chomsky and Fitch (2002), with further implications for recursion in the gesture sub-module.
In concluding the chapter, a parsing model integrating gestural production,
apparently the first attempt so far, is presented and discussed.

[2] In his 1992 book, David McNeill interprets emblems as a separate class of gestures with
regard to those unwitting pieces of gesticulation usually accompanying speech, which he calls
co-verbal gestures. This distinction is based, among other reasons, on the fact that emblems can
sometimes replace speech while co-verbal gestures are more tightly bound to speech production.
As we will see in Chapter 5, McNeill's distinction is probably not necessary within the
theoretical framework adopted in these pages.

Chapter 9 reports the results of a set of ongoing research projects on the function
of gestures within communicative acts, with a special focus on the self-directional
aspect of language. Most importantly, a case study of co-verbal gestures in map-task
activities shows how the recursive pattern of certain metaphors can be taken as a
reliable index of the communicative strategy adopted by the speaker when engaged in a
collaborative task requiring the activation of multiple capabilities, such as self-orientation
in space, planning, and communication in unnatural, or marked,
conditions, such as blocked vision of the interlocutor. Recurrent metaphors indicating
the adoption of a plan, its abandonment, or its confirmation are shown and analysed as
evidence of the involvement of spatial reasoning in communication. An interesting case
of the lateralization of referential versus self-directional and planning functions into the
dominant and non-dominant hands is reported and discussed.
Finally, Chapter 10 presents the applications of the study of non-verbal
communication and gesture to Artificial Intelligence, with particular emphasis on the
relation between non-verbal cues and speech synthesis in robots and conversational
agents.
This volume also contains two appendices concerning the data collected within the
experiment on deaf subjects presented in Chapter 7. In particular, Appendix 2 provides
a summary of the conversational turns recorded within the experiment.
The small number of subjects for some of the experiments presented in this book
makes it impossible to assess the statistical reliability of the results; nevertheless, in
these cases, the phenomena under examination are assumed to be attributable to the
human neuro-motor system and thus universal: if a given phenomenon is found in
a small group of subjects, it may at least potentially hold for the whole population. In
any case, no sample is ever completely representative of the entire population.

2. Non-Verbal Communication: Towards a Definition

Motus est corporis gestus et vultus
moderatio quaedam, quae probabiliora
reddit ea, quae pronuntiantur (De Ratione
Dicendi ad C. Herennium libri IV. Liber
III: 26).


As pointed out in the introduction, the study of non-verbal phenomena in general and
gesture in particular has gradually attracted more interest since the first
modern studies, mostly done within ethological frameworks, appeared. There has
also been an evolution of the scholarly definition of the concept of non-verbal
communication over this time. This chapter offers a brief summary of the major
studies in the field, with particular emphasis on the role of Linguistics, and a
discussion of the principal definitions of non-verbal communication. Finally, an
alternative definition of communication within the realm of non-verbal phenomena is
adopted, in order to better suit the theoretical views of those scholars primarily
concerned with language studies.

2.1. State of the Art

As for what constitutes the study of non-verbal communication, and gesture in
particular, great importance is attributed to the contribution of the anthropologist David
Efron (1941), who was one of the first to engage in the field from an anthropological
perspective. His research was aimed at demonstrating the non-genetic origin of
culturally determined gestures by means of a study of gesture within the Italian and
Jewish emigrant communities of New York City. This study, which is still considered a
milestone in field research, is particularly interesting for the linguist, and is universally
acknowledged to be one of the most successful attempts to develop a structural analysis of
non-verbal cues. In effect, he adopts for kinetics [3] the same structural approach
developed in classical phonological studies. In particular, Efron coined new words for
kinetic phenomena, which are overtly reminiscent of phonetic concepts, e.g., kineme as
the non-verbal counterpart of phoneme. His insight was to interpret non-verbal
behaviour as a structured and articulated phenomenon, which could be analyzed with
the same analytic tools as sound in speech. This particular approach to kinesics seems
to have influenced in one way or another all subsequent attempts to develop a
classification of non-verbal cues. Among the most famous ones, suffice it here to
mention those proposed by Ray Birdwhistell (1952), Condon and Ogston (1966), and
Ekman and Friesen (1969).

[3] The first use of this word is due to Efron (1941) himself, and, after him, Birdwhistell.

Another interesting contribution to this field, although less known, is that offered
by Marcel Jousse, whose decades of teaching activity at the Sorbonne in Paris gave
birth to a new branch of studies within theology, which he named l'anthropologie du
geste. The aim of his study, mainly conducted between 1925 and 1957, was the
reinforcement of the Catholic theological claim of revelation by means of a wide
anthropological study of the anthropos (i.e., mankind) as center of such a revelation. In
doing so, he focused on the oral dimension and potential of man, which he considered
to be overlooked in the classical approach to the question, for the studies about
revelation had been exclusively based on its written form (i.e., the Bible). Although his
attempt to argue that the revelation of God can also be seen through the gestures of the
men receiving it has now been abandoned, his pioneering study still contains interesting
clues. Among these is the understanding of human mimicry as springing from the
perception of action, which he understands in terms of a basic Agent-Action-Patient
relationship that he names interaction triphase [4].
More recently, Proxemics [5] has offered valuable contributions to the study of non-verbal
communication. This branch of Anthropology is concerned with the study of the
different ways in which different cultures use space in communicative contexts. As far
as we know, for example, the minimum interpersonal space considered acceptable in
the interaction between two speakers varies significantly depending on the culture in
question [6]. Watson and Graves (1966), for instance, investigated the difference between
American and Arab college students in the use of space and found that, when
speaking to a stranger, Arab subjects tend to speak louder, to face the listener more,
to sit closer to him/her, and to touch him/her more frequently. Recently, Hudson (1997)
suggested an interpretation for the use of space among speakers equating physical
distance with social distance. Thus, the wider the interpersonal space between the
speakers, the wider the social distance dividing them. According to his interpretation,
this approach could facilitate the investigation of power-solidarity relationships in
dyadic situations. To date these elements have been successfully used in human
geography studies, sociology, sociolinguistics, and psychology.
More recently, Fernando Poyatos (1992, 1997, 2002) has provided a most
extensive and interdisciplinary approach to the study of non-verbal communication, in
particular by proposing a basic triple structure of human communication consisting
of language, paralanguage and kinesics (Poyatos, 2002). Of particular interest within
his anthropological framework is his proposal of basic cultural units that he names
Within a more ethological framework, in fact, the first observations of the
expression of emotions and social behaviour in animals are due to Darwin (1872).
Ethology's origins as a discipline go back to 1939, when Konrad Lorenz published his
first paper (the Manifesto of Ethology) in Verhandlungen der Deutschen Zoologischen
Gesellschaft. With him, and subsequently, a group of scholars have dedicated
themselves to the study of non-verbal behaviour: among them, suffice it to mention
Tinbergen (1935), Eibl-Eibesfeldt (1949), von Frisch (1967), Goodall (1968), Thorpe
(1972a, 1972b), Morris (1971, 1977), and Green and Marler (1979).

[4] Jousse (1974: 46).
[5] For the first use of this term, see Hall (1966).
[6] Hall (1966) described four different zones for social interaction: intimate, personal,
social, and public. He was also the first to study the intercultural variance
in the use of these zones.

The first attempts to combine Ethology and Anthropology in the study of non-
verbal behaviour are those of Eibl-Eibesfeldt (since 1967) and von Cranach and Vine
The most important aim for ethologists studying non-verbal phenomena is the
determination of physiological causes in animal behaviour, namely, to investigate the
development of non-verbal communicative behaviour in human ontogenetic and
phylogenetic processes. Their interest is particularly focused on adaptive behavioural
patterns, or those patterns that seem to result from a specific evolutionary process, in
response to very specific environmental conditions. Eibl-Eibesfeldt's (1967)
investigation into evocators or expressive movements used by animals for
interspecific and intraspecific communication focuses directly on the concept of
communication itself. Shifting towards the realm of human behaviour, we can find
interesting contributions, such as the attempt to show the innateness of behavioural
patterns that we repeat in our daily social interactions. A well known example of this
sort of behavioural pattern is the eyebrow flick, first described by Eibl-Eibesfeldt
(1975), which would universally indicate a willingness to socialize, and, for this reason,
would often be used as a quick and informal salutation. Desmond Morris (1971),
among others, analyzes the zoological roots of human behaviour in mother-child
interaction, such as cuddling, patting and cooing. He also provides an analysis of the
most widespread substitutes for mother-infant contact. The ethological approach shows
up as well in psychological studies of gesture and non-verbal communication during
the Sixties. Because of the methodological crisis in social psychology during this
period, in which strictly experimental studies were called into question, research
tended towards direct observation rather than experimental designs. The prevailing
methodology had, in fact, been mainly based on laboratory experiments, the
trustworthiness of which was challenged more and more by scholars. For this reason,
field research based on observation was preferred.
In particular, the attention of scholars was caught by studies relating to the
expression of emotions, and to interpersonal behavioural patterns, such as dominance-submission
attitudes. The effects of external appearance on the emotional response
of interlocutors were also investigated. Argyle (1988), for example, besides working on
facial expression, non-verbal vocalizations, and gestures, also takes spatial behaviour
into account, namely, orientation in speaking, posture, and bodily contact. He even
considers clothes and personal look as noteworthy pieces of communicative behaviour,
although they may not be considered intentional or communicative in a proper sense, as
they are not directly linked to specific communicative intentions. This is because of his
particular interest in analysing the psychic component of the manifestations of human
As for what more strictly concerns a linguistic interpretation of gesture and non-verbal
cues, great significance can be attributed to the numerous classifications of
gestures proposed by Rosenfeld (1966), Freedman and Hoffman (1967), Mahl (1968),
and Argyle (1972) among others. Recently, in the field of cognitive psychology,
extensive research on verbal and non-verbal communication has been carried out,
aimed at providing a comprehensive view of these phenomena. In this regard, suffice it
to mention McNeill (1985 and following), who considers gestures and speech as parts
of a single psychological structure, basing his hypothesis on an adaptation of
Vygotskij's (1934) model of the development of cognition by means of social
interaction, contra Butterworth and Hadar (1989), who consider gestures as a mere
epiphenomenon of verbal expression. These issues will be more thoroughly discussed
in Chapter 4.

2.1.1. The place of Linguistics

An important contribution to non-verbal communication studies which involves
a linguistic approach and suggests a structural analysis of the phenomena involved is
due to the anthropologist David Efron (1941), who wanted to demonstrate the non-genetic
origin of culturally determined gestures by means of a study of the Italian and
Jewish communities of New York City.
His methodological approach is that of Linguistics: in fact, he applies to kinetics [7] a
structural analysis derived from the tools of phonological analysis. In particular, the
author coins words for kinetic phenomena which are overtly reminiscent of phonetic
concepts, e.g., kineme, modelled on phoneme. The point is that he interprets non-verbal behaviour
as a structured and articulated phenomenon, which can be analyzed with the same tools
as sound in speech. This particular approach to kinesics is further developed by
Birdwhistell (1952), who divides non-verbal behaviour into kinesic units (see Table 1),
which, in their turn, are subdivided into kinemes, or smallest units of behaviour.

Table 1: Birdwhistell's model for kinetic analysis

Non-verbal behaviour    Equivalent in speech
Kineme                  Phoneme or morpheme
Kinemorphic class
Complex kinemorph       Word
Kinetic Unit            Sentence

Kendon (1972) successfully walked this same path: he adopted Condon and
Ogston's approach to the analysis of gestures, but his initial interest was mainly
focused on the way speech and movements are organized and synchronized in speech
acts [8]: his primary aim was to determine a relationship between bodily movements and
speech movements, and his discoveries about dramatic posture shifts in correspondence
with speech onset are extremely interesting within the economy of pragmatics. Posture
shifting, in fact, is shown to have a basic role, together with eye gaze, in the control of
conversational turns. His approach to nonverbal behaviour seems to have a taxonomic
flavor in its effort to reduce both verbal and non-verbal phenomena to a hierarchical
structure, although no classification of gesture is made in advance (see Figure 1).

[7] The first use of this word is due to Efron (1941) himself, and, after him, Birdwhistell.
[8] ... The primary aim here is to describe how the movements observed to occur as X
speaks are organized, not from the point of view of their possible significance as gestures ...
(Kendon 1972:79).

According to Kendon, if one observes manual gesticulation in a speaker, it is possible
to show how movements are organized as excursions, in which the gesticulating limb
moves away from a rest position, engages in one or more of a series of movement
patterns, and is then returned to its rest position. Ordinary observers identify the
movement patterns that are performed during such excursions as gestures. They see
the movement that precedes and succeeds them as serving merely to move the limb into
a space in which the gesture is to be performed. A Gesture Phrase may be
distinguished, thus, as a nucleus of movement having some definite form and dynamic
qualities, which is preceded by a preparatory movement and succeeded by a movement
which either moves the limb back to its rest position or repositions it for the beginning
of a new Gesture Phrase. (Kendon, 1986: 34) [9]

Kendon (1986)      Birdwhistell
Excursion          Kinesic unit
Gesture Phrase     Complex kinemorph
Stroke             (Kineme)

Figure 1: Kendon's (1986) analysis of kinesics compared to that by Birdwhistell (1952)

Kinesics, in other words, is analysed as composed of excursions (which
Birdwhistell (1952) calls kinesic units) and gesture phrases (corresponding to
Birdwhistell's complex kinemorph), which have, in their turn, a stroke or peak
(McNeill, 1979). This phase is the meaningful part of the gesture phrase and can be
compared to Birdwhistell's (1952) kineme, although the correspondence is not so clear
due to Birdwhistell's variable definition for kineme, which he described as either a
formal unit which combines with others to convey meaning, or as the smallest
significant unit of behaviour (see Noth, 1995).
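The hierarchy described by Kendon (excursion, gesture phrase, stroke) can be illustrated with a minimal sketch. The phase labels ("rest", "preparation", "stroke", "retraction"), the class names and the segmentation rule are assumptions introduced here for illustration only; they are not Kendon's notation, nor an annotation scheme used in this book. The sketch assumes that a stretch of movement has already been labelled phase by phase and simply groups the labels: an excursion runs from the moment the limb leaves the rest position until it returns, and a new gesture phrase begins whenever a preparation follows a stroke, that is, when the limb is repositioned rather than returned to rest.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical phase labels chosen for this sketch (not Kendon's own notation).
REST, PREPARATION, STROKE, RETRACTION = "rest", "preparation", "stroke", "retraction"


@dataclass
class GesturePhrase:
    """A nucleus (stroke) with its preparatory and closing movements."""
    phases: List[str] = field(default_factory=list)


@dataclass
class Excursion:
    """Everything between leaving the rest position and returning to it."""
    phrases: List[GesturePhrase] = field(default_factory=list)


def segment(labels: List[str]) -> List[Excursion]:
    """Group a sequence of phase labels into excursions and gesture phrases."""
    excursions: List[Excursion] = []
    current = None  # excursion in progress, if any
    phrase = None   # gesture phrase in progress, if any
    for label in labels:
        if label == REST:
            if current is not None:  # limb is back at rest: close the excursion
                current.phrases.append(phrase)
                excursions.append(current)
                current, phrase = None, None
            continue
        if current is None:  # limb leaves rest: open a new excursion
            current, phrase = Excursion(), GesturePhrase()
        elif label == PREPARATION and STROKE in phrase.phases:
            # Repositioning after a stroke starts a new gesture phrase.
            current.phrases.append(phrase)
            phrase = GesturePhrase()
        if not phrase.phases or phrase.phases[-1] != label:
            phrase.phases.append(label)  # collapse repeated labels
    if current is not None:  # input ended before the limb returned to rest
        current.phrases.append(phrase)
        excursions.append(current)
    return excursions


if __name__ == "__main__":
    labels = [REST, PREPARATION, STROKE, PREPARATION, STROKE, RETRACTION, REST]
    for excursion in segment(labels):
        print([p.phases for p in excursion.phrases])
```

In this toy sequence a single excursion contains two gesture phrases, the first ending in a repositioning and the second in a return to rest, which mirrors the two ways in which, according to the passage quoted above, a gesture phrase may end.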
Kendon's 2004 book makes a substantial contribution to linguistic enquiry as
applied to gesture studies, especially in emphasizing the deep interrelation existing
between gesticulation and speech, and in outlining the communicative function of some
gesture families in face-to-face interaction.
David McNeill is unanimously considered a pioneer in gesture studies, especially
from the perspective of linguistic investigation. His Growth Point Theory (McNeill,
1992; 2005) is in fact the only current theoretical psycholinguistic model integrating
speech and gesture. McNeill's hypothesis posits a single cognitive and psychological
origin for speech and gesture together. His suggestion for an evolutionary model for the
human communication system (McNeill, 2005) pervades the theoretical structure and
research contained in this book, but with some distinctions that will be set out in the
following chapters.

[9] Emphasis theirs.

Recently, keynote contributions have come from studies within the Conversation
Analysis framework that assess the role of gesture and other non-verbal cues such as
gaze and posture in turn-taking and conversation (Goodwin, 1984; Mondada, 2007;
Schegloff, 1984, 2006 among others).
Finally, Information Technology and Computational Linguistics have offered
important contributions to this field: suffice it to mention the relevance of the Language-Action
Perspective, also known as Language-Action Theory, first put forward by the
computationalists Flores and Ludlow (1980), and subsequently applied within
computational linguistics. The scholars draw from Austin's (1962) concept of the
Illocutionary Act and Searle's (1969) Speech Act Theory in order to interpret language
as a means for action. Speech Act Theory, together with Language-Action Theory, has
led to the development of a new branch of studies, Informal Logic, originating in the
U.S. during the Sixties, that proposes to give an account of everyday reasoning, or
everyday argumentation. Within this field, attempts have been made to include non-verbal
phenomena such as expressions, displays, and physical acts in the notion of
argument (see e.g. Gilbert 2006).
Semioticians and philosophers of mind have also contributed to the study of
gesture and non-verbal phenomena: Ferruccio Rossi-Landi (2005), for instance, in
reinterpreting Herbert Mead's (1934) studies, considers the whole of human social
behaviour as a semiotic unit, while Tullio De Mauro (1982) has focused on gestures
from within a semiotic perspective. Wittgenstein (1966, p.39) speculates on gestures in
a beautiful reflection upon the rules for performing gestures and the variations tolerable
in those rules, thus hinting at the existence of a morphology of hand gestures. Ryle
(2002) seems to go in a similar direction while analyzing gestural performance within
the concept of self-consciousness.
Nevertheless, despite these keynote contributions coming from Philosophy, and
across Pragmatics, Computational Linguistics and Semiotics, no specifically linguistic
study has addressed gestural and non-verbal communication in a systematic and
dedicated fashion.
The most relevant contributions to the roles, functions, and nature of gesture and
nonverbal cues come from Applied and Computational Linguistics: Justine Cassell's
work on the implementation of embodied conversational agents (see, for instance,
Cassell, 1998), among others, focuses on the primary questions of the role of gesture,
gaze, and posture within dialogic acts. The association of fields such as Information
Technology and Psycholinguistics in the investigation by Cassell, McNeill and
McCullough is a good example: one of their studies, in fact, provided strong support to
the theory of a single computational and psychological origin for speech and gesture
(Cassell, McNeill, McCullough, 1999) by proving that, in the case of gesture-speech
mismatches, the listener is as likely to attend to the information provided by gesture as
that in speech. Their contribution will be discussed in more depth in Chapter 4. For
now, suffice it to mention briefly the important findings about co-verbal gestures, their
psychological origin and their function obtained by Computational Linguistics and
Psychology, and the questions still left open for further research:

• What is the function of the verbal and non-verbal modalities in discourse?
• Where do gestures occur with respect to discourse structures and semantic
• When do gestures and speech convey the same meaning and when they are
• When do speech and gestures have the same functions and when do they
  integrate each other?
• Which semantic features are conveyed by gestures and which by speech?
• How can we describe the way by which the hands convey meaning? Is there a
  morphology in gesture?
(adapted from Cassell and Stone, 2000).

These are also the questions that inspired this volume, and will be addressed in
depth in the following chapters.

2.2. Non-verbal Communication vs. Non-verbal Behaviour: Towards a Definition

To date, the labels non-verbal communication and non-verbal behaviour have been
used interchangeably to refer to various phenomena: the terminological problem is not
just a consequence of the variety of theoretical frameworks used in the field, but of the
very conception of communication used therein. In this book, however, the term non-verbal
communication will refer only to a limited subset of non-verbal cues: precisely
those that are intended as communicative and/or interactive in the sense of Ekman and
Friesen (1969).
This leads the discussion to another key point in the study of non-verbal
phenomena: the lack of a clear-cut, universally accepted, definition of the phenomena
involved. Although a number of scholars have provided different systematizations of
what is usually considered to be non-verbal, a clear and universal distinction between
behaviour and communication in the study of non-verbal phenomena is missing. Over
time, non-verbal communication has come to be defined as a realm of meanings carried
by what is not verbal: both intentional and unintentional movements of the body, hands,
mouth and head, as well as intentional and unintentional sound effects, usually termed
vocal gestures [10], have been collected under the label, together with clothing (Morris,
1979; Argyle, 1982) and written texts [11]. Unsurprisingly, many scholars find the field as
it has been conceived to be exceedingly broad. One way to avoid ambiguity in the
definition of non-verbal communication can perhaps be found through limiting and
thus simplifying the field. To achieve this, a clear definition of what counts as
communication is desirable.
As regards non-verbal behaviour, one of the first attempts to diversify and classify
its numerous manifestations is found in Rosenfeld (1966), who divides it into
gesticulation and self-manipulation. A better and more precise categorization of non-
verbal behaviour is found in Ekman and Friesen (1969). In this article, they classify the
repertoire of non-verbal behaviour according to the following six different features:

• External conditions
• Relationship of the act to the associated verbal behaviour
• Person's awareness of emitting the act
• Person's intention to communicate
• Feedback from the person observing the act
• Type of information conveyed by the act

[10] Hockett, 1960:393.
[11] See for instance the work by Piastra (2006) on image-language co-reference in a
multimedia corpus.

Non-verbal behaviour is consequently categorized as follows:

• Informative if it is not intended by the speaker to convey meaning, but still
  provides pieces of shared meaning to the listener(s);
• Communicative when it is clearly and consciously intended by the sender to
  transmit a specifiable message to the receiver [12];
• Interactive if it tends to modify or influence the interactive behavior [13] of

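The three categories can be read as a small decision procedure, sketched below under explicit assumptions. The three boolean attributes and the function names are hypothetical simplifications introduced for illustration: Ekman and Friesen's own classification rests on six features, not three flags, and nothing here should be taken as their procedure. The last function only encodes the terminological choice made earlier in this section, namely that non-verbal communication refers to those cues that are communicative and/or interactive.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class NonVerbalAct:
    """Hypothetical, highly simplified description of a non-verbal act."""
    intended_as_message: bool      # sender consciously intends a specifiable message
    provides_shared_meaning: bool  # observers can draw shared meaning from the act
    influences_interaction: bool   # act tends to modify others' interactive behaviour


def categorize(act: NonVerbalAct) -> Set[str]:
    """Return the categories that apply to the act (they may overlap)."""
    labels: Set[str] = set()
    if act.provides_shared_meaning and not act.intended_as_message:
        labels.add("informative")
    if act.intended_as_message:
        labels.add("communicative")
    if act.influences_interaction:
        labels.add("interactive")
    return labels


def is_nonverbal_communication(act: NonVerbalAct) -> bool:
    """Restricted usage: only communicative and/or interactive cues count."""
    return bool(categorize(act) & {"communicative", "interactive"})


if __name__ == "__main__":
    unwitting_cue = NonVerbalAct(False, True, False)
    deliberate_wave = NonVerbalAct(True, True, True)
    print(categorize(unwitting_cue), is_nonverbal_communication(unwitting_cue))
    print(categorize(deliberate_wave), is_nonverbal_communication(deliberate_wave))
```

Returning a set rather than a single label is a deliberate choice in this sketch: an act that is consciously intended as a message may at the same time influence the interlocutor's behaviour, so the categories are not treated as mutually exclusive.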
Initially, the methodological approach to non-verbal behaviour was focused on
uncovering its cultural or neurological origin. Research was mainly focused, on the one
hand, on the functional use of non-verbal behaviour observations in the diagnosis of
mental diseases; and on the other hand, on the biological and ethological roots of
human non-verbal behaviour.
A further step towards the definition of what communication is can be found in
MacKay (1972), who seeks to provide non-verbal studies with a systematic conceptual
apparatus, focusing in particular on the notion of communication itself. As he points
out,
the etymological root of the term (communicatio) means
sharing or distributing. ... In this general sense, A communicates
with B if anything is shared between A and B or transferred
from A to B. For scientific reasons, we need a more restricted
usage if the term is not to become trivial. (Otherwise, the study
of 'non-verbal communication' covers every interaction in the
universe except the use of words!)... (MacKay, 1972:3-4).

In this regard, MacKay (1972) suggests keeping the distinction between
informative and communicative behaviour, which is already present in the analysis of
non-verbal behaviour proposed by Ekman and Friesen (1969). In particular, he outlines
the distinction between signaling and communication, signaling being

... the activity of transmitting information, regardless of
whether or not the activity is goal-directed, what impact if any it
has on a recipient, or even whether the source is animate or
not. ... Thus, one is allowed to say, for example, ... that 'A is
signaling but not communicating'... (MacKay, 1972:6).

Communication, on the other hand, is defined as goal-directed
action: "A communicates with B only when A's action is
goal-directed towards B..." (MacKay, 1972:25).

[12] Ekman and Friesen, 1969:56.
[13] Ekman and Friesen, 1969:56.

According to MacKay's model, a goal-directed action by an organism

... is distinguished from mere undirected activity by an element
of evaluation: a process whereby some indication of the current
or predicted outcome is compared against some internal 'target
criterion' so that certain kinds of discrepancy or 'mismatch' ...
would evoke activity calculated to reduce that discrepancy (in
the short or long term). (MacKay, 1972:11).
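
MacKay's criterion of evaluation can be read as a simple feedback loop. The following Python sketch is offered only as an illustration, under assumptions that are not found in MacKay (1972): the outcome and the 'target criterion' are reduced to single numbers, and the names `goal_directed_step`, `run_until_matched`, `gain` and `tolerance` are hypothetical.

```python
def goal_directed_step(outcome, target, gain=0.5, tolerance=1e-3):
    """One evaluation cycle: compare the outcome with the target criterion
    and return an outcome adjusted so as to reduce any mismatch.
    (Numeric outcome, gain and tolerance are illustrative assumptions.)"""
    mismatch = target - outcome
    if abs(mismatch) <= tolerance:
        return outcome  # no relevant discrepancy, so no corrective activity
    return outcome + gain * mismatch  # activity that reduces the discrepancy


def run_until_matched(outcome, target, max_cycles=100, tolerance=1e-3):
    """Repeat the evaluation cycle until the mismatch falls within tolerance."""
    history = [outcome]
    for _ in range(max_cycles):
        adjusted = goal_directed_step(outcome, target, tolerance=tolerance)
        if adjusted == outcome:
            break  # the outcome already matches the criterion
        outcome = adjusted
        history.append(outcome)
    return history


history = run_until_matched(outcome=0.0, target=1.0)
```

In this reading, mere undirected activity would correspond to a process with no comparison step at all: the outcome would change, but never as a function of its distance from a target.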

The further step suggested by MacKay is that, when studying the non-verbal
domain, the label communication be used if, and only if, the subject matter of the
research proves to be as goal-oriented as speech, beyond any reasonable doubt.
Notwithstanding MacKay's contribution to the definition of the term
communication, not all scholars adopt such a distinction between behaviour and
communication: in 1988, in fact, Argyle published the second edition of a book entitled
Bodily Communication which collects the most varied observations about intentional
and unintentional behaviour. The same approach is adopted by the zoologist Desmond
Morris (1977), who draws up a complete guide to human behaviour, including in his
definition of gesture diverse sorts of nonverbal behaviour. Despite the different aims of
these analyses, the criterion adopted for the definition and description of nonverbal
communication risks a high degree of vagueness, at least for the purposes of the present
research, where communicativeness and, ultimately, intentionality are determinant
features of the subject matter of my analysis.
To avoid this vagueness, my research is restricted to what Ekman and Friesen
(1969) define as communicative and interactive, for only these elements can be defined
as communicative in the sense suggested by MacKay (1972). However, there is a
clarification to make: this work will zero in on the intentional aspects of interactive
behaviour. This is a distinction that Ekman and Friesen do not make.
One of the more interesting scholars focused on the value of interaction within
non-verbal communication is Fernando Poyatos. Of particular interest is his definition
of kinesics: "Conscious and unconscious psychomuscularly-based body movements
and intervening or resulting still positions, either learned or somatogenic, of visual,
visual-acoustic and tactile kinesthetic perception, which, whether isolated or combined
with the linguistic and paralinguistic structures and with other somatic and
object-manipulating behavioural systems, possess intended or unintended communicative
value." (Poyatos, 2002:101)
This definition is interlaced with that of non-verbal communication, which to
Poyatos is as follows:

Las emisiones de signos activos o pasivos, constituyan o no
comportamiento, a través de los sistemas no léxicos somáticos,
objectuales y ambientales contenidos en una cultura,
individualmente o en mutual coestructuración. (Poyatos, 2004:

Finally, in the same paper, the author suggests an interesting way to integrate
communication and interaction within the realm of non-verbal phenomena:

El intercambio consciente o inconsciente de signos
comportamentales o no comportamentales, sensible o
inteligibles, del arsenal de sistemas somáticos y extra somáticos
(independiente de que sean actividades o no-actividades) y el
resto de los sistemas culturales y ambientales circundantes, ya
que todos ellos actúan como componentes emisores de signos (y
como posibles generadores de subsiguientes emisiones) que
determinan las características peculiares del encuentro.
(Poyatos, 2004: 59) [15]

[14] Ekman and Friesen, 1969:56: the emission of active or passive signs, which may or may not
constitute behaviour, by means of the somatic and non-lexical systems, both objectual and
environmental, contained in a culture, either individually or mutually co-structured

Clearly, the distinction between intentional and unintentional phenomena is not
considered to be significant, the discriminating feature being instead identified
in the cultural flavor of communication. On the other hand, the opposition between
non-verbal communication and interaction is unfortunately not made explicit.
In order to better focus this work, which will be especially dedicated to the
linguistic aspects of human behaviour, Non-verbal Communication will be defined as
the intentional transmission of information, whether for
representational, emotive, poetic, or conative purposes,
from a transmitter A to a receiver B, mainly and
prototypically through the visual channel, but also
through the vocal-auditory channel, by means of specific
codes, either innate or culturally-determined, that are not
usually specialized for verbal communication.

Clearly, such a definition is intended to reflect the discussion up to this point, with
special emphasis on the intentionality of communication. What is not encompassed by
this definition will be hereinafter defined as behaviour, while non-verbal
communication will henceforth be referred to as a strictly communicative - and,
consequently, intentional - phenomenon.
Of course, adopting a fragile feature such as intentionality as a discrimen between
communication and behaviour can be precarious. As we shall see in the next pages, not
all facets of language show intentionality to the same extent. On the contrary, some
aspects of speech itself are held to be unintentional and unaware. Nevertheless, the act
of communicating, usually interpreted as responding to an emotional-volitional
impulse (see McNeill, 1992), is in these pages assumed to be unquestionably
intentional.

Summary

This chapter has addressed the basic questions pertaining to a definition of non-verbal
communication as the object of study not only within the framework of disciplines such
as Ethology and Psychology, but also from the perspective of Linguistics. In defining
communication within the realm of non-verbal phenomena, in a way that allows
linguistic investigation, the trait of intentionality is thus adopted following MacKay
(1972). Of course, the adoption of such a fragile trait for the distinction between
communication and behaviour can be problematic, especially to the extent that not all
pieces of language and non-verbal phenomena which are usually understood to be
communicative are intentional. This basic question is further addressed in Chapter 4,
Chapter 6, and Chapter 7.

[15] Translation: the conscious or unconscious exchange of behavioural or non-behavioural
signals, either noticeable or intelligible, that are part of the somatic or extra-somatic repertory
(no matter if they are activities or non-activities) and the rest of the surrounding cultural and
environmental systems: all these serve as components that transmit signals (and as potential
transmitters of subsequent emissions) that determine the peculiar characteristics of the
encounter.

3. Defining Gesture
curabit etiam ne extremae syllabae
intercidant, ut par sibi sermo sit, ut
quotiens exclamandum erit lateris conatus
sit ille, non capitis, ut gestus ad vocem,
vultus ad gestum accommodetur.
(Marcus Fabius Quintilianus, Institutio
Oratoria, Liber I, VIII).


In this chapter I define the basic terms related to gesture studies that will recur
throughout the book. After the definition of non-verbal communication provided in
Chapter 2, a further delineation of the field leads to the definition of gesture as a subset
of non-verbal communication. Gesture is here introduced by means of a review of the
major definitions provided by scholars in the field. A description of the
classification of gestures and of the parameters for gesture analysis and transcription
adopted in this book is also provided.

3.1. What Is Gesture? Getting More Focused

Now that a definition of non-verbal communication has been provided, we are in a
position to attempt a definition of gesture as a subset of the same. The common
usage of this word in English does not simplify the task: the word, in fact, can be
defined as either

1 archaic: carriage, bearing;
2 : a movement usually of the body or limbs that expresses or
emphasizes an idea, sentiment, or attitude
3 : the use of motions of the limbs or body as a means of
expression
4 : something said or done by way of formality or courtesy, as a
symbol or token, or for its effect on the attitudes of others, or
something said or done as a formality or as an indication of
intention (Merriam-Webster Dictionary)

The etymology of the word gesture [16] goes back to a Latin verb, gerere, which
means to bear or carry; to perform or to accomplish. The word in its modern use
derives from the Medieval Latin word gestura, which means mode of action (Partridge,
1959). The word was later used in rhetoric treatises to refer to the expressive use of the
body - namely, of the hands and face - in making speeches. Recently, scholars have
used this word to also refer to unconscious movements, vocal actions (Hockett, 1960),
or even sub-actions of speaking (Armstrong, Stokoe, Wilcox, 1995; Pouplier and
Goldstein, 2010).

[16] This brief review is based on Kendon (1982; 2004).

Moreover, numerous classifications of this phenomenon have been proposed, each
one based on different theoretical premises and research commitments. The first
detailed classification of gestures is due to David Efron (1941), who divided them into
emblems, or arbitrary movements, which do not show any iconic correlation with the
meaning they convey; ideographs, which express mental paths; deictics, which show
present objects or persons; spatial movements, which express spatial concepts (such as
size); kinetographs, which depict a physical action; and batons, which express
conversational rhythm.
The first attempts to categorise gestures within a psychological framework are
essentially bipolar and emphasise the distinction between communicative and non-
communicative gestures: Rosenfeld (1966), for example, divides non-verbal behaviour
into gesticulation, which he defines as arm and hand movements emphasizing the
speech rhythm, and self-manipulation, namely hand and arm movements interacting
with other body parts. This classification is also adopted by Freedman and Hoffman
(1967) - who divide the phenomenon into object-oriented and body-oriented gestures -
and Mahl (1968) - who divides it into autistic and communicative gestures.
Later, Ekman and Friesen (1969) produced a modified version of Efron's
classification: emblems, for example, in their rendition also include gestures which are
not totally arbitrary, but show to some extent an iconic relationship with the conveyed
meanings. Furthermore, they collect batons, ideographs, deictics, spatials, and
kinetographs into the single category of Illustrators.
Their most significant innovation is the introduction of a set of parameters for
gesture categorization: these are Intentionality, Awareness, Culturally Shared Meaning,
and Modification of Listener's Behaviour. These parameters help classify gestures as
Communicative, Informative, Interactive, or Idiosyncratic (see Figure 2).

                                       Communicative  Informative  Interactive  Idiosyncratic
Shared Meaning
Awareness                              YES            NO           YES/NO       -
Modification of listener's behaviour   NO             NO           YES          -

Figure 2: Ekman and Friesen's parameters for gesture categorization

According to the authors, communicative gestures have a culturally determined and
culturally shared meaning, and show a high degree of awareness and intentionality;
informative gestures differ from communicative ones in intentionality, for they are
claimed by the authors to be unintentional; interactive gestures differ from both
communicative and informative ones, for their main function is to modify the listener's
behaviour, so that their degree of intentionality is not relevant to their definition; and,
lastly, idiosyncratic gestures are not communicative at all, since they do not convey a
shared meaning.
In conclusion, Ekman and Friesen divide gestures into five categories, which are
defined as follows:

- Emblems: communicative - that is, intentional - gestures that convey culturally
shared meanings. This meaning can be easily translated into words. These
gestures may either occur together with speech, or substitute for it;
- Illustrators: gestures that illustrate the part of the speech flow they occur with.
These gestures are exclusively co-verbal, in the sense that they can only occur
together with speech. Some of them can be defined as communicative (for their
meaning is culturally shared and their degree of intentionality is usually
comparable to that of Emblems), informative (since the performer's awareness
can be low and, in some cases, they may even be semi-unconscious, but the
conveyed meaning is still largely shared), and interactive (which means that they
function to modify the interlocutor's behaviour). The authors further divide them
into ideographs (movements showing a logical itinerary); deictics (movements
showing a present object/person); kinetographs (movements representing a
physical action); pictographs (movements depicting an image); batons
(movements following and underlining the rhythm of speech);
- Affect displays: intentional facial expressions displaying the speaker's emotions.
These gestures can be informative and/or communicative depending on the
speaker's degree of intentionality when performing them;
- Regulators: acts that regulate the rhythm of dialog. Since their degree of
intentionality is very low, they can be defined as interactive informative gestures;
- Adaptors: non-communicative and unaware acts, which can be considered as
relics of the human adaptive system. The authors divide them into self-adaptors,
when the speaker's hands contact other body-parts, and alter-directed adaptors,
when they tend to interact with others. These acts include movements tending to
give or receive an object, and attack/protective behaviour; and lastly, object
adaptors, which are movements learnt in daily interaction with objects for the
achievement of a precise aim. All of these acts are solely informative.

Argyle (1975) also bases his own categorization on Ekman and Friesen, although
his classification is simplified into four classes. These are as follows:

- Conventional gestures: arm and hand movements conveying a culturally shared
meaning, which can be easily translated into words; this definition is very similar
to that proposed by Ekman and Friesen for Emblems;
- Speech-related gestures: movements illustrating the meaning conveyed by
speech;
- Affect displays: movements expressing the speaker's emotions;
- Personality displays: non-communicative idiosyncratic gestures.

Argyle also focuses on ritual gestures, providing a valuable contribution to
psychological and anthropological studies.
Lastly, McNeill and Levy's (1982) categorization exclusively takes into account the
gestures they define as co-verbal, that is, the whole range of gestures that can only
occur together with speech. Ekman and Friesen's Emblems are not taken into account.
As for co-verbal gestures, the authors divide them into metaphors, iconics, and beats [17].
No substantial change is made in the definition of these categories, which closely
resemble Ekman and Friesen's ideographs, pictographs, and batons. McNeill (1985)
later adopts Stephens and Tuite's (1984) suggestion to further divide iconics into two
sub-classes according to their resemblance to their original object-manipulation function.
He thus divides them into iconics 1, which manipulate virtual objects, and iconics 2,
which represent entities or movements not directly related to a manipulative function.
He also indicates subtypes of metaphors such as mathematical metaphors, which
express specific concepts such as limits, and conduits, which represent abstract
meanings as objects.
The word gesture has often been used, following Ekman and Friesen's work, to
refer to communicative, informative, and idiosyncratic non-verbal phenomena alike.
Kendon (1986) states that "... the word gesture [18] serves as a label for that domain of
visible action that participants routinely separate out and treat as governed by an
openly acknowledged communicative intent" (Kendon, 1986: 28). Nonetheless, he later
adds that "... if the notion of gesture [19] is to embrace all kinds of instances where an
individual engages in movements whose communicative intent is paramount, manifest,
and openly acknowledged, it remains exceedingly broad" (Kendon, 1986:31).
This problem is also recognized by McNeill (1992), who states that "Many authors
refer to all forms of nonverbal behavior as gesture, failing to distinguish among
different categories, with the result that behaviors that differ fundamentally are
confused or conflated." (McNeill, 1992:37)
To avoid confusion, Kendon (1986) defines "... all gesturing that occurs in
association with speech and which seems to be bound up with it as part of the total
utterance" as gesticulation (Kendon, 1986:31). Standardized gestures, which can
function independently of speech as a complete utterance, he calls autonomous
gestures.
For the purposes of the present work, the label gesture will be used in a more
restricted sense. We will also address it from a semiotic perspective, and introduce the
notion of lexical access, already proposed by Levelt (1987): if one interprets gestures as
semiotic means, it is easy to see that a form or combination of forms and trajectories is
usually aimed at conveying a precise content, or signified (see Saussure 1917).
Whenever the meaning conveyed corresponds to a precise entry in the speaker's and
receiver's vocabulary, the corresponding meaning will be called the gesture's lexical
access.
As a consequence, gestures will be here defined as follows:

[17] Bavelas et al. (1992) propose that beat gestures are in fact interactive gestures, the
function of which goes well beyond their semantic content. Despite my complete agreement with
this assumption, I nevertheless prefer to distinguish between the semiotic and the functional
interpretations of gestures. Thus, gestures will be labeled here according to their inner semantics.
An analysis of their functions, both within the communicative and self-directional spheres of
language, will also be provided in the following chapters.
[18] Emphasis theirs.
[19] Emphasis theirs.

intentional movements of hands, arms, shoulders and head,
occurring within communicative acts [20], whose lexical
access [21] is shared both by the speaker and the receiver [22]

co-verbal gestures being here defined as follows:

a subset of gestures strictly correlated to and co-occurring
with speech within communicative acts.

These definitions meet both McNeill and Levy's (1982) definition of co-verbal
gestures, which the authors identify as hand movements co-occurring with speech, and
Kendon's definition of gesticulation. Yet, both McNeill and Levy (1982) and Kendon
(1986) exclude from this type of gesture the category of emblems: emblems are
gestures that show the highest degree of arbitrariness (Kendon, 1986 names them
autonomous gestures). These gestures have a precise culturally determined meaning
and this meaning is shared by a relatively closed geographic or cultural group. McNeill
and Levy excluded them on the basis of methodology: their study focused exclusively
on the function of those gestures which can only occur together with speech (such as
iconics), in order to support the claim of a common psychological origin for both
speech and gesture.
Kendon, on the other hand, considers the category of emblems (or autonomous
gestures) as completely separate from gesticulation. This distinction is outlined on the
basis of what has been called "Kendon's continuum" (McNeill, 1992; see Figure
3), which differentiates gestures according to three main parameters:

- the obligatory presence of speech;
- language-like properties;
- culturally shared meaning.

Gesticulation → Language-like Gestures → Pantomimes → Emblems → Sign Languages

Figure 3: Kendon's continuum (McNeill, 1992:37)

[20] A communicative act is here defined after MacKay's (1972: 25) statement: "A
communicates with B only when A's action is goal-directed towards B".
[21] For the notion of lexical access, see Levelt (1982).
[22] There might be a problem inherent to this definition: as we will see in Chapter 5, there is
a particular class of gestures (i.e. beats) which are to be considered gestures, although they do not
have a lexical access. A possible solution to this problem is also provided in Chapters 5, 8
and 9.

According to these parameters, gesticulation is claimed to be idiosyncratic, with no
culturally shared meaning, and no language-like properties, while emblems are closer
to sign language.
Nevertheless, the occasional use of emblems in substitution for speech does not
imply that they are not deeply related to speech, not to mention the apparent function of
adverbial modifier taken by some emblematic gestures [23]. As shown in Chapter 5,
gestures may be interpreted as a prototype category: some may convey concepts (such
as spatial concepts; see e.g. McNeill, 1985) which are largely shared among the human
race: this makes possible the description of sizes and paths even between speakers who
do not understand each other's language. On the other hand, gestures may have a more
restricted and more arbitrary meaning that, by definition, is shared only among a
specific group of speakers, usually living in the same area, and speaking the same
language (see Morris, 1977). Yet, one must consider the possibility that even so-called
idiosyncratic gestures (such as iconics, beats, or metaphors) may have, just like
emblems, a precise culturally determined form that may vary according to geographic
and cultural areas.
A further clarification concerning head-articulated signs is in order here. The
opinions of scholars about this issue diverge: some of them (e.g. McNeill, 1992) do not
consider head signs in their works on gesture; others (see Cassell et al., 1994) refer to
"yes" and "no" head signs as facial movements; Morris (1977) generically defines these
pieces of human behaviour as signals; Davis and Vaks (2001) consider them to be
gestures. Since these movements are consciously and intentionally performed, and have
a precise lexical access, I consider them to be communicative acts. In particular, "yes"
and "no" signs may serve essential feedback functions in the regulation of dialogue
interactions (see e.g. Person et al., 2000). Furthermore, a head gesture which is similar
to the "yes" head sign may be used as an autonomous salutation symbol in some areas
of Italy and Spain (see Morris, 1977): this implies that these signals do convey a
precise shared meaning.
As for vocal gestures, eye-gaze, or posture shifting, they have important functions
in the regulation of communicative acts, but their degree of intentionality is not easily
determinable, while their relationship with speech is weaker than that of co-verbal
gestures. For this reason, they will not be taken into account in this research. As we
will see in the next chapters, the questions so far mentioned have no easy solution: for
example, we know that some gestures (such as beats) are not definable with a precise
shared meaning (in other words, they do not have a clear lexical access), and still may
vary considerably along the geographical axis.

[23] Thanks to Karl-Erik McCullough for discussing this question with me.

3.2. Terminological Note About the Classification of Gestures: Adopting McNeill's


As regards the question of classifying and categorizing gestures into types, I agree with
Kendon's (1986) suggestion that the relationship between gesticulation and the speech
it occurs with should be discussed on its merits, with no classification being assumed
in advance. Nevertheless, for current purposes, I will adopt a classification pattern for
gestures starting from the remark that not all gestures show the same relationship with
their lexical access (as we will see, some gestures do not even have a lexical access).
More precisely, I will classify gestures as a prototype category [24], as follows:
- Emblems: co-verbal gestures whose lexical access is arbitrary and
thus strictly culturally determined and culturally shared.
- Metaphors: gestures whose lexical access, which is less strictly
culturally determined, represents an abstract mental content.
- Iconics: gestures provided with a lexical access that is not strictly
culturally determined and culturally shared.
- Beats: gestures provided with no lexical access. Such gestures seem
to follow the rhythm of the concurrent speech flow.
- Deictics: gestures whose referent is in fact the only lexical access.
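
As a purely illustrative aid, the five types listed above can be encoded as a lookup table keyed on the kind of lexical access each type shows. The Python sketch below is not part of the classification itself: the names `LexicalAccess`, `GESTURE_TYPES` and `has_lexical_access` are hypothetical labels of convenience, and the encoding deliberately flattens what is described here as a prototype category into discrete labels.

```python
from enum import Enum


class LexicalAccess(Enum):
    """Kinds of lexical access, paraphrasing the list above (hypothetical labels)."""
    ARBITRARY = "arbitrary, strictly culturally determined and shared"
    ABSTRACT = "less strictly culturally determined, abstract mental content"
    NOT_STRICTLY_CULTURAL = "not strictly culturally determined or shared"
    NONE = "no lexical access"
    REFERENT_ONLY = "the referent is the only lexical access"


# One entry per gesture type in the classification adopted in this section.
GESTURE_TYPES = {
    "emblem": LexicalAccess.ARBITRARY,
    "metaphor": LexicalAccess.ABSTRACT,
    "iconic": LexicalAccess.NOT_STRICTLY_CULTURAL,
    "beat": LexicalAccess.NONE,
    "deictic": LexicalAccess.REFERENT_ONLY,
}


def has_lexical_access(gesture_type):
    """Return True for every type except those with no lexical access (beats)."""
    return GESTURE_TYPES[gesture_type] is not LexicalAccess.NONE
```

A call such as `has_lexical_access("beat")` singles out the one type that the definition of gesture given in 3.1 does not straightforwardly cover.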

This classification follows in its essentials that proposed by McNeill (1992),
although the underlying premises are different, being designed in order to reflect the
conception of gesture as a (synchronically) modular category.
The distinction between metaphoric and iconic gestures is particularly
controversial, and thus not always accepted (see for instance Cienki 2005; Mittelberg
2007; Müller 2004; Parrill and Sweetser 2004). David McNeill in his 2005 book
explains that the distinctions between co-verbal gestures presented in Hand and Mind
are to be interpreted as dimensions rather than categorical types of gestures.
In this classification, gesture types remain and are determined by a surface analysis
of the major relation existing between signifier and signified. Of course, the relation
existing between signifier and signified in metaphors has an iconic component, but
such iconicity mediates between a concrete spatial representation and an abstract
mental content. Assuming that metaphoric relations are indeed culture-specific (see
McNeill, 2005), but that the capability for creating an abstract association between separate
events is universal, and probably due to the structure of the human brain (Edelman, 2006),
the distinction between iconics and metaphors is retained here.
Of course, the analysis of gesture goes well beyond a first definition of the
phenomenon to be taken into account, and the division into types. More specific
parameters for the analysis of gesture and its morphology are presented in Chapter 8.


Summary

This chapter has introduced the concept of gesture as the major object of investigation
of the research presented in this book, also by means of a review of the principal
definitions of the phenomenon already provided by scholars who focused on this
question. A brief presentation of the major classifications of gesture types is also
provided, with the adoption of McNeill's (1992) semiotic model.

[24] See Chapter 5.

4. The Cognitive Foundations of Gesture

The hand's gestures run everywhere
through language, in their most perfect
purity precisely when man speaks by
being silent (Martin Heidegger, What
Calls for Thinking?, 357).


The study of gesture usually raises a number of questions. The most debated and still
unresolved problem concerns the ultimate psychological origin of gesture. In effect, the
major difficulty is that gestures seem to be inevitable: it has been observed that, when
prevented from gesticulating, subjects tend to intensify both the use of vocal gestures
and the activity of facial muscles (Rauscher, Krauss, and Chen, 1996): in other words,
they still gesticulate. This finding led scholars to claim that gestures - and non-verbal
behaviour in general - should be considered the unintentional output of a sort of bug in our
neuro-muscular system.
Moreover, gestures serve rather controversial functions relative to communicative
acts. While they seem to convey relevant information (McNeill, 1985, 1992, 2005) that
is usually attended to by the listener (Cassell, McNeill, McCullough, 1999), they also help
the speaker's computational task, as remarked by scholars concerned with phenomena
such as gesticulation on the phone or similar conditions when the speaker is perfectly
aware that his/her gestures are not likely to be seen by his/her listener.
This chapter will deal with these and other issues. In particular, gestures will be
claimed to be communicative (ultimately, verbal, as suggested by McNeill, 1985) or
linguistic, as long as they have a relevant part in human interaction. Evidence for the
assumption that gesture is verbal (McNeill, 1985) will be provided by means of an
overview of the most important studies concerning the intentional nature of gesture
(§ 4.1): specifically, the major studies on the development of gesture in children (§ 4.3),
and gesticulation phenomena in both aphasic (§ 4.4) and blind subjects (§ 4.5) will be
discussed.

4.1. On the Psychological Foundations of Gesture: Is Gesture Non-Verbal?

As anticipated in the introduction, the opinions of scholars about the psychological
foundations of gesture diverge: some (e.g. McNeill, 1985 and subsequent; Kendon,
1986 and subsequent) argue that gesture and speech are closely related and
communicative phenomena; others consider gesture to be ancillary to speech and
language production. In particular, those scholars who consider gesture as having the
same psychological origin as speech claim it to be "the overt product of the same
internal processes that produce the other overt product, speech" [25] (McNeill, 1985:350)
and base their hypothesis on the consideration that gesture and speech are
"partners in the same enterprise, separately dependent upon a single set of intentions"
(Kendon, 1986:33). Scholars who, on the other hand, consider gestures as a mere
epiphenomenon of speech remark that they are not always intentional and that the meaning
they convey is by no means intelligible without relying on an interpretation of the
concurrent speech. Among them, I will here mention Feyereisen and Seron (1982), and
Butterworth and Hadar (1989). These scholars adopt a different theoretical approach to
gesture studies. Such an approach, here defined as linearism, is influenced by a wider
theoretical approach, originally developed for language description (see, e.g., Bock,
1982), and generally known as the linear model. Such a model stems from early Natural
Language Processing and describes sentence generation and parsing as a self-contained
process consisting of a defined and recursive succession of stages, which linearly
follow each other.

[25] In his papers McNeill distinguishes between referential and discourse oriented
gestures (McNeill, 1985: 350). According to his view, only those gestures that he defines as
Discourse-oriented are considered to be verbal, in the sense that they seem to be strictly related

Following this model, Butterworth and Hadar hypothesise that sentence and,
subsequently, gesture generation follows an eight-stage process, which is as follows:

Stage 1. Pre-linguistic message construction.
Stage 2. Determination of the grammatical form of the sentence under construction.
Stage 3. Selection of the lexical items in abstract form from a semantically organized
         lexicon.
Stage 4. Retrieval of phonological word forms on the basis of Stage 3.
Stage 5. Selection of prosodic features including the location of sentence stress points.
Stage 6. Phonological stage in which word forms are ordered syntactically and prosodic
         features marked.
Stage 7. Full phonetic specification with all timing parameters specified.
Stage 8. Instructions to articulators. (Butterworth and Hadar, 1989: 172).

According to this hypothesis, not all gestures originate at the same stage: iconics, for
example, would originate at Stage 3, when word meanings are available, while batons
would originate at Stage 7, when timing parameters and stress positions are specified.
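
As a rough illustration of the linear model just described, the eight stages can be laid
out as a strictly ordered pipeline, with each gesture type attached to the stage at which it
is said to originate. This is only a sketch, not something taken from Butterworth and
Hadar: the names STAGES, GESTURE_ORIGIN and run_pipeline are hypothetical, the stage
descriptions are abridged, and only the two gesture types mentioned above (iconics and
batons) are included.

```python
# Abridged stage descriptions after Butterworth and Hadar (1989: 172).
STAGES = {
    1: "Pre-linguistic message construction",
    2: "Determination of the grammatical form of the sentence",
    3: "Selection of lexical items in abstract form",
    4: "Retrieval of phonological word forms",
    5: "Selection of prosodic features, including sentence stress points",
    6: "Ordering of word forms and marking of prosodic features",
    7: "Full phonetic specification with timing parameters",
    8: "Instructions to articulators",
}

# Stage at which each gesture type would originate (illustrative mapping,
# restricted to the two examples given in the text).
GESTURE_ORIGIN = {"iconic": 3, "baton": 7}


def run_pipeline():
    """Yield (stage number, description, gesture types originating there) in linear order."""
    for number in sorted(STAGES):
        gestures = sorted(g for g, stage in GESTURE_ORIGIN.items() if stage == number)
        yield number, STAGES[number], gestures


if __name__ == "__main__":
    for number, description, gestures in run_pipeline():
        note = f"  <- {', '.join(gestures)} may originate here" if gestures else ""
        print(f"Stage {number}: {description}{note}")
```

The point of the sketch is simply that, on a linearist reading, gesture types are tied to
distinct points of a single speech-production sequence rather than to a shared origin.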
As already stated, McNeill (1985, 1989) replies to Butterworth and Hadar's position
by claiming a single psychological origin for speech and gestures. The title of his
article published in 1985 ("So you think gestures are nonverbal?") is particularly
eloquent, as it vigorously claims that gesture be restored to its proper domain,
namely, the realm of communicative and thus verbal interaction.
He bases his claim on five main points:

- gestures [26] occur during speech;
- they have semantic and pragmatic functions that parallel those of speech;
- they are synchronized with linguistic units in speech;
- they dissolve together with speech in aphasia [27];
- they develop together with speech in children. (Butterworth and Hadar, 1989: 172)

[25, continued] with the verbal flow they occur with. As will be clearer in the next
chapters, the hypothesis to be here stated is that all gestures may, in a sense, be
considered as verbal.
[26] In his article (1985), McNeill only takes into account co-verbal gestures, which he
defines as "discourse-oriented" (McNeill, 1985: 350). He does not consider emblems. Such a
choice is easily explained by methodological restriction. On the contrary, emblems are here
considered as co-

The first point seems to be uncontroversial, although Butterworth and Hadar replied
by providing some examples of gestures (emblems) that can be performed without
speech. They did not, however, take into account that even though these gestures do
occur in substitution of speech, they still occur in the communicative process.
Nor do gestures performed by listeners constitute satisfactory counter-evidence. In the
first place, gestures performed by listeners are in fact still well integrated within the
communicative act: these gestures usually have a commentary function, and listeners
perform them when they intend to give feedback to the speaker without interrupting
him. Secondly, gestures performed in a listener role are quite rare compared to those
performed in a speaker role: McNeill (1985) provides data that confirm this claim.
Stephens (1983), for instance, showed that in about 100 hours of video-recording, only
one gesture was performed by a listener. Moreover, the majority of gestures occur
during a speaker's actual speech process (McNeill, 1985: 354): as McNeill claims, only
10% of the gestures performed in a sample of six narrations occurred during silence,
and these were immediately followed by further speech. These data appear to support
the hypothesis that speech and gestures share the same computational stage: all
gestures performed by speakers in silent pauses were batons and conduits, whose role
is that of reactivating the speech flow when computational problems, such as
tip-of-the-tongue situations, occur.
As for point 3, Butterworth and Hadar (1989) objected that it is not possible to find
any synchronisation pattern between speech and gesture, since gesture strokes may
occur slightly before the verbal units they refer to. McNeill (1989) replied by pointing
out that synchronisation does not necessarily mean a mere overlapping of gesture
strokes and verbal units, but, rather, a deeper coordination between the gesture's stroke
and the accent of the word it is related to. In this regard, Kendon (1980, 1986) showed
that the gesture's stroke may occur either together with or before, but never after, its
corresponding speech Tone Unit [28]. This leads McNeill (1985) to the conclusion that if
it is necessary to establish a hierarchy between gesture and speech, then gesture should
be claimed to have a primary position, since its performance may even precede speech.
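
The timing regularity attributed to Kendon above (a stroke occurs together with or before
its Tone Unit, but never after it) can be restated as a simple check over annotated data.
The following is a minimal sketch under explicit assumptions: each Tone Unit is represented
as a start and end time in seconds, each stroke as a single onset time, and the function
names and sample values are invented for illustration rather than drawn from any of the
studies cited.

```python
def classify_stroke(stroke_onset, unit_start, unit_end):
    """Classify a stroke onset relative to its Tone Unit interval (times in seconds)."""
    if unit_start > unit_end:
        raise ValueError("Tone Unit start must not follow its end")
    if stroke_onset < unit_start:
        return "before"
    if stroke_onset <= unit_end:
        return "together"
    return "after"


def violations(annotations):
    """Return the annotations whose stroke begins after the Tone Unit has ended."""
    return [a for a in annotations if classify_stroke(*a) == "after"]


if __name__ == "__main__":
    # Hypothetical (stroke_onset, unit_start, unit_end) triples.
    sample = [(0.8, 1.0, 1.6), (1.2, 1.0, 1.6), (1.9, 1.0, 1.6)]
    for triple in sample:
        print(triple, classify_stroke(*triple))
    print("violations:", violations(sample))
```

On this formulation, Butterworth and Hadar's observation that strokes may slightly precede
the related verbal unit falls under "before" and is therefore compatible with the
regularity; only the "after" case would count against it.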
The fourth point seems also to be unquestionable, although Butterworth and Hadar
hypothesise several plausible explanations that would account for the high number of
iconics produced during hesitations or aphasia:

- (gestures) act as interruption-suppression signals (Duncan, 1972) and the presence of a
  long silent pause may trigger a gesture to prevent the listener from interrupting;
- word finding is delayed by the slow buildup of activation in the searched form. By
  raising the overall activation in the system through the production of a motor movement,
  the word will reach a firing level more quickly (especially in aphasic
- the production of an iconic gesture somehow assists word finding by exploiting another
  route to the phonological
- A word may be selected from the semantic lexicon and then censored by an executive
  function for nonlinguistic reasons (e.g., emotional or social inhibitions) (Butterworth
  and Hadar, 1989: 173).

[27] More precisely, in Broca's aphasics batons dissolve, while iconics remain; in Wernicke's
aphasics, on the other hand, the opposite phenomenon takes place (iconics dissolve, while batons
[28] Tone Units are to be interpreted as "phonologically defined syllabic groupings united by a
single intonation tune" (Kendon, 1986: 34).

Still, investigations on speech and gesture disconnection in different cases of
aphasia are more consistent with the hypothesis of a deep interrelation between the two
phenomena (for a wider discussion, see § 4.4).
Finally, another interesting hypothesis on the relationship between gesture and
speech is put forward by Kita et al. (in progress). These scholars claim that gesture
does have a close relationship to language, but can be separated from speech. Kita and
his colleagues thus put forward the concept of "co-thought gestures", in analogy to
McNeill's idea of "co-speech gestures": co-thought gestures are self-directed
meaningful manual actions such as those observed during mental rotation tasks.
Participants in Kita et al.'s experiments are asked to perform mental (i.e., silent)
calculations on the rotation of given objects. In order to suppress inner speech
(Vygotskij and Lurija 1930), subjects are also asked to repeat a given sequence of
numbers aloud while resolving the image rotation problems. The experiment has led
to interesting results, marked by a high rate of iconic and metaphoric gestures during
mental rotation, which is even higher when participants are asked to resolve problems
in the multi-tasking condition. These results seem to call into question the idea that
gesture is inseparable from speech. It is my opinion, however, that an internal and
close interdependence of gestures and language is also demonstrated, at least insofar as
we consider thought, speech, and gesture to be different and interdependent instances
of language.

4.2. The Functions of Gesture Within Communicative Acts

Let us now analyze the second point, which I consider to be the most relevant one for
the aims of the present research: McNeill notes that gestures have semantic and
pragmatic functions, which can also be complementary to those of speech. If this is true,
gestures have a more complex and rich function than that hypothesized by linearists
when they define gesture as a mere appendix to speech.
In fact, experiments conducted by McNeill and others show that the meaning
conveyed by gestures does not always repeat the information expressed by the speech
unit it is related to. For example, Cassell and Prevost (1996) discuss the results of a
story-telling experiment based on six subjects. These subjects were asked to watch a
Road Runner cartoon and retell it to six listeners. The aim of the experiment was to
establish the percentages of redundant versus non-redundant gestures, and the results
show that 50% of the 90 gestures performed by the six subjects have semantic
features that are not redundant with respect to speech.
In 1998, Cassell, McNeill and McCullough published the results of a similar
experiment. The aim this time was to find out which channel people prefer to attend to
when the pieces of information provided by speech and gestures are incongruent. To
achieve this, the authors created a "speech-gesture mismatching" experiment: one of
the experimenters was videotaped while telling a cartoon story, following a
predetermined script which was taken from the "Canary Row" episode of the Tweety
and Sylvester cartoon. The storytelling was videotaped twice for each event
sequence in the cartoon: during each of the two narrations, exactly the same words and
intonation were used. Only co-verbal gestures were modified so that the second
narration had speech-gesture mismatches. These mismatches were of three types:

- anaphor mismatches: the speaker introduced two separate referential loci in the
gesture space by means of deictic gestures, while verbally introducing two
characters. After a while, he would intentionally point to the wrong locus when
referring to one of the characters;
- origo mismatches: gestures provided a perspective on the action different from
that assumed by the accompanying speech;
- manner mismatches: gestures provided further pieces of information about the
manner in which an action was carried out;

The stimulus was designed so as to contain 14 target phrases with concurrent gestures
for each videotape: each such phrase had a concurrent gesture which in one version carried
the correct information and in the other the mismatched one. Six additional phrases
were introduced, always with a concurrent gesture of manner mismatch, which
provided additional information with regard to that expressed in speech. The six
additional phrases were introduced in both videotapes, with the additional information
differing from one videotape to the other. Each version was divided into three episodes
so that the cognitive demands placed on the subjects were not excessive.
The videotapes were then shown to eight subjects, four of whom observed the
"correct" videotape, while the other four observed the mismatched one. The subjects were
then asked to retell the narration they had viewed to another eight subjects, who played
a listener role. The video-recorded material was coded independently by two separate
researchers. The two transcriptions were subsequently compared and cases of
uncertainty were resolved by discussion. The results clearly show that the subjects
exposed to the mismatched narration tended to retell the story integrating the pieces
of information acquired via gestures. In particular, when the information conveyed by
the gesture contradicted that conveyed by speech, the subjects tended to show a higher
percentage of inaccuracies in retelling. Moreover, information acquired via gesture
alone was usually reported using speech. An example of a retelling inaccuracy of this
sort is shown in Figure 4: in this case, the subject heard the narrator say "and Granny
whacked him one", while seeing him perform a punching gesture. When retelling
the story, this subject opts for an uncommon strategy: he conveys the contrasting pieces
of information by means of both a gesture and the following speech output: "And
Granny like punches him or something and you know he whacks him" (Cassell,
McNeill, and McCullough, 1999: 15). The most telling example from this experiment,
however, is a retelling that integrates the mismatched gesture into speech while being
accompanied by a further mismatched gesture. The stimulus video, in fact, says "luckily
Granny is there, Tweety's friend, and she [whacks] him one and throws it back out of the
window", with a manner-mismatched punching gesture (Cassell, McNeill, and McCullough,
1998: 11); one of the subjects presented the following response: "but Tweety's Granny is
there and she punches Sylvester out", while performing with the hand a vague waving motion
(Cassell, McNeill, and McCullough, 1999: 11). Since manner mismatches were
inserted in both the video stimuli, the coder had to determine which one of the videos
the subjects had seen. Since one version of the video showed this event sequence with a
punching gesture and the other one showed a slapping gesture, the coder judged that
the subject was exposed to the second version. The results show that there is no
substantial difference in retelling inaccuracies among anaphor, origo, and
manner mismatches. In particular, the retelling inaccuracies in which a mismatched
stimulus was expressed and integrated in speech accounted for about 36% of all retelling
inaccuracies, which is a striking datum. Such results lead the authors to the
following considerations: "If gesture is communicative, but not an equal partner in the
building of a representation of information, then one might expect manner mismatches
to be regularly attended to by the listeners, while the two other kinds of mismatches
would not. This is because manner mismatches expand on and support speech, as
opposed to contradicting it, and this is the most unmarked (at least controversial)
function for gestures to have. Origo and anaphor gesture mismatches convey
information that contradicts that conveyed by accompanying speech." (Cassell,
McNeill, and McCullough, 1999: 13).

Figure 4: an example of retelling inaccuracy with manner-mismatching input (Cassell, McNeill and
McCullough, 1999:15)

To summarize, the results of this experiment lead the authors to the conclusion that
not only is gesture listener-oriented, but it also has the same communicative properties
as speech. This hypothesis is confirmed by the fact that not only do subjects attend to the
mismatched gestures, but these gestures seem to provoke a new representation of the
information conveyed by speech and gesture (Cassell, McNeill, and McCullough, 1999: 11).
Yet, some problems arise from this conclusion: first of all, the mechanism by which
subjects take note of the information conveyed by gestures is still uncertain. Further
research related to this issue, conducted by means of eye-tracking (see Gullberg and
Holmqvist, 2001), shows that the listener's gaze shifts to the hands of the speaker only
when these leave the zero area (i.e., the trunk area) to occupy a lateral zone, be it
left, right, up, or down. In other words, listeners do not seem to deliberately pay
attention to the information expressed in gestures, their process of information retrieval
being more similar to an unintentional one. In effect, Cassell, McNeill and McCullough
(1999) affirm that none of the subjects taking part in the experiment noticed any strange
behaviour in the videos (both the "normal" and the mismatched one) that they had
seen. Most of them only noted a certain animation in the speaker. This detail
should lead to the conclusion that none of the subjects really paid attention to the
gestures (otherwise, at least the mismatched version of the story should have been
easily noticed). As a consequence, the results provided by Cassell, McNeill and
McCullough only constitute evidence for the informative rather than communicative
function of co-verbal gestures, just as other pieces of non-verbal behaviour, which are
classically considered to be unintentional (and thus non-communicative), might also
be picked up by the listener, either intentionally or unintentionally (see Freud, 1901).
More precisely, their results seem to complicate the question rather than simplify it,
for not only do they fail to provide evidence of the speaker's intention in gesture
performance, but they also leave open the question of the receiver's intention to gain
information from the speaker's gestures. Besides, the gestural response shown in
Figure 4 can provide further analyzable pieces of information, especially if compared
to the stimulus it stems from. In fact, the response performed by the subjects is
typically less emphatic than the stimulus it stems from. It is perfectly possible that the
lack of emphasis in the gestural response is attributable to different factors: for instance,
it could be due to an idiosyncratic response to a stressful situation such as a video-
recorded experimental session; although, on the other hand, one must consider the
phenomenon by which both gesture emphasis and production increase in situations that
place a cognitive demand on the subject (McCullough, 1995). Also, the emphasis [29] of
the gestural response could be determined by the actual size of the video-stimulus: it is
reasonable to expect that a wider screen should elicit a noticeably higher emphasis in
the gestural response. Yet, a possible explanation of such a response is perhaps its deep
cognitive function: my hypothesis is that gestures elicited by means of a
reasoning/memory task (McCullough, 1995) are more likely to serve the speaker's
cognitive effort. In particular, the case reported in Figure 4 may be interpreted as a
subliminal gestural response aimed at providing a metaphorical landmark for
self-orientation: the subject in question seems to be intent on two different cognitive tasks,
namely, a) recalling information and b) attempting a coherent interpretation of
divergent pieces of it. In brief, the gesture described above seems to provide a good
instance of speaker-oriented gesture, since it appears to help the speaker's cognitive
function. Further evidence for this interpretation could be found in a particular
phenomenon that McCullough (2005) defines as "mirror reproduction of the stimulus".
According to McCullough's findings, this phenomenon affects the gestural expression
of the Path feature in a given Motion Event (Talmy, 2001). In particular, subjects are
always found to reproduce Paths shown on a screen by means of a mirror gesture
(objects going right-to-left of the screen are always reproduced as going right-to-left of
the speaker). Of course, such a phenomenon is patently unintentional and could be
interpreted, according to my hypothesis, as speaker-oriented. Moreover, the fact that
it has only been recorded in gestures elicited by an animated stimulus (such as that
provided by card stories, cartoons, or movies; McCullough, 2005) does not necessarily
imply that it is actually restricted to this instance. Unfortunately, the example provided
by Cassell, McNeill and McCullough (1999) shows an apparent mirror response, which
could also be interpreted as a simple right-handed response to a left-handed stimulus.
The hypothesis thus remains precarious.

[29] For this feature in gesture morphology, see Rossini (2004a) and Chapter 8.

In conclusion, the results of the experiment conducted by Cassell, McNeill and
McCullough are apparently inconsistent with the hypothesis that gesture is
communicative. In particular, the observed phenomenon that subjects attend to the
information provided by gestures, even when this is not consistent with that conveyed
by speech, cannot help determine the communicativeness of gesture, given that other
unintentional pieces of behaviour are attended to as well. If, in fact, the [+ intentional]
trait fails, communicative intent cannot be assumed.
On the other hand, such results are liable to be interpreted as indirect evidence of a
deep psychological correlation between gesture and speech. Furthermore, they are
consistent with those provided by other investigations. Suffice it to mention Rogers'
(1978) and Thompson and Massaro's (1986) experiments, which clearly show the
higher degree of attention people pay to the meaning conveyed by gestures in cases
of verbal ambiguity and in dialogs taking place in noisy environments.
The intentional trait of gestures seems to remain a vexata quaestio, which will be
addressed in Chapter 5. For now, let it suffice to state that the intentional trait also fails
for speech phenomena, such as phonetics, prosody, intonation, and lexical selection
(see McCullough, 2005). As will become clear in the next pages, human language is not
necessarily intentional in any of its modes of expression.

4.3. The Emergence of Gesture in Infants

The study of gesture evolution in infants relies on the observation that infants achieve
verbal communicative skills relatively late (first vocalizations do not appear before 5
months, while first words usually appear around 9 months; see Goldin-Meadow, 1998),
and that the frequency with which gestures accompany children's first words seems more
than incidental (e.g. Dore, 1974).
Initially, the prevailing ontogenetic hypothesis concerning gesture development in
children was that gesture and speech were different and independent phenomena. This
hypothesis was supported by Hewes (1973), who noted that early iconic gestures in
children are progressively replaced by vocalizations and verbal expression. De Laguna
(1927) and Werner and Kaplan (1963) also considered gestures to be a primitive mode
of cognitive representation.
Somewhat more recently, Bates et al. (Bates et al., 1977; Bates et al., 1983)
suggested a continuity between preverbal and verbal expression.
These initial positions, together with the later development of theories on the
psychological origin of gesture, led to the examination of the relationship between
gestural development and the language acquisition process in infants. The first studies
focused exclusively on pointing gestures, or deictics (Bates et al. 1975; Bruner, 1975),
since their object-distinguishing function was considered a precursor to verbal naming
(Werner and Kaplan, 1963; Bruner, 1975; Bates et al. 1979). In addition to deictics,
reaching towards objects and extending objects to others were considered crucial in the
infant's transition to verbal communication. In particular, open-handed reaching
towards an object has been interpreted as a proto-imperative (Bates et al., 1977).
More recently (e.g. D'Odorico and Levorato, 1994), eye gaze has also been
implicated in infants' acquisition of communicative processes. In fact, the value of eye
contact in infants' social behaviour has been recognized by several studies (e.g. Wolff,
1961; Robson, 1967; Bloom, 1974; Samuels, 1985), although the study of eye gaze was
limited to identifying regularities in early interactive behaviour (see e.g. Friedman,
1972), and establishing its function as a strategy for the achievement and maintenance
of attachment between mother and child (see e.g. Robson, 1967).
D'Odorico and Levorato (1994) addressed the question of the cognitive
determinants of the infant's capacity to interrupt active exploration of external reality to
share the experience with the mother through eye contact (D'Odorico and Levorato,
In their research, D'Odorico and Levorato start from the assumption that, since
infants only begin to address intentional vocalizations to their mothers at about 15
months of age in order to comment and share their experience of the physical world,
this type of sharing must occur earlier by means of another type of communicative
exchange, eye contact. As the authors note, although after birth eye contact is
regulated in the infant by a homeostatic mechanism of attention/disattention, it very
soon becomes a real psychological behavior (D'Odorico and Levorato, 1994: 10).
In particular, they hypothesized that:

- In the first months of life looking towards mother's eyes has the value of an answer
  to mother's solicitations, while in the following period the infant becomes more and
  more capable of initiating the exchange by her/him-self.
- In the first months of life there is a sort of antagonism between the activity of
  interacting with a social partner and that of exploring objects.
- The capacity to coordinate a social schema of communication and a cognitive schema
  of action is demonstrated when infant's experience of knowing becomes the signified
  of eye contact with mothers and the gaze becomes a real significant. (D'Odorico and
  Levorato, 1994: 10)

In order to prove this hypothesis, the authors conducted an experiment on two male
infants who were videorecorded through a one-way mirror during interactions with
their mother and an object. Each session was divided into four different interactive
situations: 1) mother-infant interaction; 2) mother-infant interaction with a toy; 3)
experimenter-infant interaction with a toy; 4) infant alone with a toy. Each phase
occurred twice in each session. In mother-infant interactions with a toy, the first time
mother and infant played with an object which was familiar to the infant, while the
second time the mother showed a new object to her baby.
Experimental sessions started when the infants were 3.19 and 5.28 months old
respectively, and took place regularly until the infants were 8.3 and 11.7 months old.
Results show that only for the older infant does the duration of eye-contact increase
with age in any significant way. This first result confirms the hypothesis that in this
period (i.e. 5.28 months) eye contact becomes a means of exchanging information.
Another element corroborating the hypothesis is the analysis of the variable
"familiarity of the object". In fact, during the first age level (3.19-6.15 months for the
younger infant, 5.28-8.20 months for the older one) familiar and unfamiliar objects
elicit the same proportion of gaze towards the mother, while in the second age level
(6.19-8.3 months for the younger infant, 9-11.7 months for the older one) new objects
elicit more gaze than familiar objects.
These results lead to the hypothesis that at about 8/9 months of age communicative
exchanges through eye contact have the role of sharing new experiences with adults.
The results are consistent with the phenomenon reported by Trevarthen and Hubley
(1978) of the tendency to share new experiences with adults more and more
systematically by means of eye contact. Moreover, in their second age level, the two
infants showed an increasing capacity for shifting their interest spontaneously from the
object to the mother. At this level, infants also used eye contact as a request for
repetition of an event, while during the first age level the mother's interactions with an
object produced an increase of interest in the object, but not in the adult. These results
seem to fully confirm the theory that the function of eye gaze in infants develops
together with the development of their cognitive and social capacities.
As for the relationship between gestural development and the transition to words,
scholars suggest different hypotheses. As stated above, the first gestures acquired by
the infant (i.e. pointing, reaching towards an object, and extending objects to others)
are substantially involved with a deictic function. Yet, several claims have been made
concerning the origin of this function. In particular, Vygotskij (1966) and Lock (1980)
have hypothesized that pointing originates in the failed reaching activity of the infant.
Pointing would thus have a communicative function, that is, a request for adult
intervention. Werner and Kaplan (1963) and Bates (1976) have proposed that the origin
of pointing lies outside any communicative intent: instead, pointing would be an
orientating mechanism for the self. Leung and Rheingold (1981), on the basis of an
experiment on 32 children, claim that pointing replaces reaching as a reference gesture.
When the children grow older, the deictic gesture is acquired by modeling, or imitation
of adult behaviour.
Several experiments on the development and functions of these gestures have been
conducted more recently. Masur's (1994) research, for instance, explores the
relationship between gestural development and the transition to words. In particular,
the emergence and development of three communicative gestures are analyzed in an
experiment with four infants (two males and two females) who were videorecorded
during normal interaction with their caregivers in bi-weekly 30-minute experimental
sessions. At the time of the first sessions, all four subjects were 9 months old. The
experiment continued until the subjects were 18 months old. The analyzed gestures
were as follows:

- pointing at an object
- extending an object toward the mother
- open-handed reaching toward an object

The results show that all the children acquired open-handed reaching early, by 8 or
9 months, while extending objects and pointing appeared later. The mean time of
appearance of pointing for the four children was 12.25 months. None of the three
gestures emerged from an imitative context. Furthermore, a progression in the
acquisition and development of these gestures with regard to language acquisition was
identified: this progression followed a sequence starting from gesture plus
vocalization (mean time of appearance: 12.25 months), continuing through
dual-directional signaling (i.e., the capacity to point to an object in one direction while
looking at a person in a different visual field [30]; time of appearance: 12 months or
later), and culminating in gesture plus gesture (i.e., pointing to an object while
nodding toward the mother; mean time of appearance: 16-18 months), and gesture
plus conventional verbalization (mean time of appearance: within 1.5 months of the
first dual-directional signaling). Note that for all children words only emerged when
dual-directional signaling had been productively demonstrated with two different kinds
of gestures. This timing suggests that the capacity for dual nonverbal signaling may be
a necessary but not a sufficient prerequisite for the production of verbal conventional
signals with gestures.
The results of the above described experiments are consistent with the theory that
gesture and speech are different overt products of the same neuro-motor complex,
whose development can be traced and analyzed. Other studies seem to support this
interpretation: studies by Overton and Jackson (1973) and Jackson (1974) on 3-to-8-
year-old children showed that the younger subjects had marked difficulties in
performing gestural symbolic acts. Moreover, developmental changes have been
shown not only to modify the capacity to perform symbolic gestures, but also to affect
temporal synchronisation between gesture and speech.
These results are consistent with the findings of Van Meel (1984), who observed
that 4-to-6-year-old children make gestures before the beginning of their speech, while
older ones (8 to 10 years old) tend to gesture at the beginning of their speech and to
continue gesturing throughout their verbal production. Moreover, the older subjects
showed a better synchronisation between their gestures and the corresponding speech
flow units conveying the same meaning.
Experiments on subjects with different impairments provide further evidence of this.

4.4. Gesture and Aphasia

Aphasia and brain pathology in general have long been the subject of neuroscience
studies aimed at assessing lateralization functions in the human brain (see Feyereisen,
Thanks to these studies, a great deal of information has been gathered about the
locus and anatomy of brain structures, and about the brain areas in
which lesions disrupt visuogestural, visuofacial, and auditory-vocal processes
(Kolb and Whishaw, 1985). However, the information we have about
brain processes and the involvement of brain areas in communicative processes is
difficult to interpret and has not led to homogeneous results.
Many attempts have been made to localise brain areas involved in nonverbal
behaviour and communication. The localization of these areas would allow assessment
of whether verbal and nonverbal processes are to be considered as separate and
independent processes (Feyereisen, 1991). Unfortunately, the lesions resulting in
nonverbal impairments cover large areas, so that generalizations are allowed only in
terms of gross distinctions between the right and the left hemispheres, or among the

30. According to the results of the experiment, visual gaze towards the mother was present
in the earliest children's gestures, but only in those cases where the gesture was itself directed
towards the mother (Masur, 1994:26).

frontal, parietal, temporal, and occipital lobes (Feyereisen, 1991). Nowadays, these
gross distinctions seem to be far from the needed precise localization.
Nonetheless, several left-hemisphere sites, both cortical and subcortical, have been
found to be involved in the production of symbolic gestures (Basso, Luzzati, Spinnler,
1980; Heilman et al., 1983; Jason, 1985), and the comprehension of pantomimic
gestures and nonverbal sounds has been found to be disrupted by very different left-
hemisphere lesions (Varney and Damasio, 1987).
Moreover, observation of the behaviour of brain-damaged subjects showed a
dissociation between components that are closely related in normal subjects. For
example, some aphasics are able to pantomime the function of an object, but cannot
name it (Feyereisen, Barter, Clerebaut, 1988). These results led to the hypothesis that
nonverbal and verbal communicative modalities were functionally independent, at least
in late response stages. However, an alternative hypothesis has been proposed, namely,
that one task, gesturing, is easier than another, naming (Feyereisen, 1991:36). In
addition, it has been clearly stated that in order to demonstrate a separation of the
verbal and nonverbal communication processes, it is necessary to find subjects without
a language deficit who show some impairment in nonverbal tasks (Shallice, 1988,
Feyereisen (1991) also focused on the dichotomy between verbal and nonverbal
processes. This dichotomy is assumed in the earliest descriptions of aphasia, where the
role of the right hemisphere of the brain was described as nonverbal. Accordingly, the
first neuropsychological studies of nonverbal behaviour focused on the differences
between the roles of brain hemispheres in communication. As a result, the dichotomy
between left and right hemispheres was extended to the dichotomy between verbal and
nonverbal processes. Later discoveries concerning the functions of brain areas showed
the inadequacy of this model. Several alternative explanations have challenged the
verbal-nonverbal distinction: these explanations focus on different reasons for the left-
hemisphere verbal superiority, which could result from the asymmetry of different
functions (Feyereisen, 1991). One of these explanations relies on the hypothesis of
asymmetry in symbolic functioning. In this hypothesis the left hemisphere of the human
brain specializes in the symbolic function.
Experiments within this framework showed that left-hemisphere-damaged subjects
encounter difficulties in performing symbolic gestures or emblems. These difficulties
do not seem to be due to impairments in understanding verbal instructions, since they
are also present in imitation tasks (see De Renzi, 1985), and may arise in the absence of
auditory disorders (Feyereisen, 1991).
These results led to the hypothesis of the existence of a central communication
system. According to this hypothesis, the fact that aphasic subjects may or may not be
able to perform spontaneous gestures does not depend on a separate nonverbal
communication system, but, rather, on a single computational stage in which both
speech and gesture originate (McNeill, 1985).
The hypothesis relies on the observation of parallel changes in verbal and nonverbal
behaviour after brain lesions (Duffy et al., 1984). This hypothesis is also consistent
with the results of a study that described parallels in the communicative content of
speech and gesture (Cicone et al., 1979). In fact, two of the observed subjects showed
telegraphic speech and 80% of their gestures were informative31.

31. For the definition of informative, communicative, and interactive gestures, see Chapter 2.

Cicone et al.'s research also shows a relation between gestural and verbal fluency.
In fact, the subjects showing a speech rate below normal also showed fewer gestures,
although these results seem to be contradicted by other studies that showed no definite
relationship between speech fluency and gesture production rate (Feyereisen et al.,
1990). Further investigation of right-hemisphere aphasia helped to highlight the
functions of this hemisphere, for a long time called the minor hemisphere (see
Feyereisen, 1991), within verbal communication processes. Studies of Wernicke's
aphasics (see Gardner et al., 1983; Rehak et al., 1992) indicate that, although these
subjects show no deficits in understanding or producing linguistic forms, they are
impaired in their ability to organise narrations and to understand metaphors and jokes.
In particular, research by McNeill and Pedelty (1995) zeroed in on the role of the right
hemisphere in language processes by means of a multimodal analysis of
right-hemisphere aphasic subjects and normal subjects. The results show that while the
normal subjects are able to perform numerous gestures to tie together episodes, the
aphasic ones are unable to build up any cohesive chain of gestures. When taken
individually, the gestures were semantically interpretable, but, when taken collectively,
they were not linked in space.
The subjects reset their gestures after each segment, performing only one stroke
between the onset of hand movement and the hands' return to rest position. According
to the results of this research, right-hemisphere damaged subjects also lack access to
the referential spatial regime, wherein different regions of the gesture space are used to
depict different characters and objects (McNeill and Pedelty, 1995:69). In fact, when
these subjects point, they tend to point to the same locus for all references. No
instances of consistent deictic coreference were observed. Another phenomenon
observed in right-hemisphere damaged subjects is left-side neglect: the subject ignores
events occurring on the side of perceptual and motor space that is contralateral to the
damage (see also Morrow and Ratcliff, 1988). Moreover, the phenomenon of left-side
neglect was also observed in iconic gesture production.
On a different level, Goodwin's (2000, 2003) work on aphasia and homesign can
help reconsider the place for gestures in communication. In analyzing the interactional
and communicative capabilities of Chil, a patient suffering from almost complete
aphasia resulting from a stroke32, Goodwin shows how gesture and other non-verbal
cues such as gaze and posture can structure into meaningful and effective
Other pieces of evidence regarding the cognitive origin of gesture and speech can
be provided by experiments on congenitally deaf, orally educated subjects.

4.5. Gesture in Blind Subjects

The effects of congenital blindness on gesture have been examined by scholars in order
to analyze the relationship between visual input and gesticulation. The first study by
Blass, Freedman and Steingart (1974) asked congenitally blind and sighted subjects to
give a five-minute monologue, noting the almost total absence of communicative
gestures (object-oriented) among blind subjects. Later on, Manly (1980) analyzed the
nonverbal behaviour of congenitally blind adults engaged in conversations and found

32. Chil is only capable of uttering the words yes, no, and some intonational patterns
isolated from speech.

evidence of changes in posture in correlation with the end of conversational turns,
although gesticulation was reported to be almost totally absent. Another study (Parke,
Shallcross, and Anderson, 1980) found appropriate use of head nodding in congenitally
blind children.
More recently, Iverson (1998) suggested that, since all of the research previously
conducted focused on gesture production in relatively unstructured conversation,
congenitally blind subjects may be found to gesture under other circumstances. In fact,
two experiments on congenitally blind and sighted children/adolescents (Iverson, 1996;
Iverson and Goldin-Meadow, 1997) observed the incidence, form, and content of the
gestures performed by the subjects as they participated in two tasks differing from each
other and from the informal conversation tasks employed in previous studies. In the
first task (route directions), subjects were asked to describe the route from a fixed
starting point to a set of familiar locations. In the second task (Piagetian conversation),
participants were asked to reason about invariance across transformation in the context
of making judgments about liquid quantity, length, number, and mass. These tasks
were selected both on the basis of previous observations noting that sighted adults and
children gesture extensively when performing them (see e.g. McCullough, 1995), and
because of the cognitive demand that these tasks place on the speaker.
The results show increased speech production (especially in the route directions
task) among congenitally blind individuals with respect to sighted subjects. This
finding seems to be consistent with the results of Dekker and Koole's (1992) research,
which shows more efficient use of verbal memory in blind children compared to their
sighted peers.
As for gesture production, congenitally blind subjects were found to gesture with a
frequency similar to their sighted peers only during the Piagetian reasoning task 33. The
gestures performed by congenitally blind subjects were subsequently compared to
those performed by their sighted peers, in order to determine whether they conveyed
similar kinds of information. The form (i.e. handshape and motion) of each gesture was
noted, and the distribution of gesture forms for both sighted and blind participants was
compared in order to assess whether the blind subjects produced any gesture forms that
were not in the repertoire of their sighted peers.
The results of the analysis showed that the informational content in the gestures of
blind and sighted participants was remarkably similar (Iverson, 1996). In a later study
(Iverson and Goldin-Meadow, 1997), the most commonly performed gestures in both
groups were object indications (e.g. pointing to a container). Gestures focusing on one
of the dimensions of an object (e.g. placing a flat hand on the top of one of the glasses
to indicate its height) were the next most frequent. As for handshape, the overall set
produced in both experiments by the blind subjects was similar to that used by their
sighted peers, and no handshape was ever used by a blind subject that was not also
present among the control group. Among blind participants, however, pointing gestures
were extremely rare, and most of the handshapes employed by the blind subjects
followed the natural configuration of the relaxed hand. These data suggest that
blind subjects replaced pointing with the use of a flat hand to
indicate or call attention to specific objects. Moreover, blind participants were found to
add an auditory cue to their gestures by tapping the referent of the gesture. This
additional auditory cue may have served to ensure that the blind subject and the

33. Gesture production was analyzed by calculating the number of gestures contained in each
participant's response for each task.

experimenter were attending to the same object. In my opinion, the auditory cue also
provides us with the necessary evidence that the gesture was effectively intended to be
In fact, there is no substantial evidence that the flat hand gestures performed by
blind subjects were intended to communicate something to the listener, rather than
being an orientating mechanism for the self, or even a mere exploratory act (e.g. the
reported gesture of placing a flat hand on the top of one of the glasses of water while
speaking of its height could be interpreted as an exploratory act rather than a deictic

5. Towards the Interpretation of Gesture as a Prototype Category: Gestures for the Speaker?

Indeed, if we notice in the prosody
of the Greeks and Romans some remnants of the
character of the language of action, we must,
with all the more reason, perceive them in the
movements with which they accompanied their
discourse. (Etienne Bonnot de Condillac, Des
connaissances humaines, IV, 31)


This chapter addresses the question of the intentional, and thus communicative, value
of gestures. An original theory concerning the interpretation of gesture as a Prototype
Category will also be presented. Such a theory, which may help to resolve the main
issues calling the communicativeness of gesture into question, is also supported
by the results of a dedicated experiment with five Italian subjects.

5.1. Gestures for the Speaker? State of the Art

Gestures have long been thought of as mainly listener-oriented. That is, they have been
interpreted as facilitating the listener's decoding task. This assumption has been
accepted by the majority of scholars studying the phenomenon, including Argyle
(1988), Butterworth and Hadar (1989), and Kendon (1980, 1986, 1994).
Nonetheless, some authors hypothesized a different primary function for gestures:
helping the speaker's computational task. In particular, De Laguna (1927) and Werner
and Kaplan (1963) define gesture as a primitive mode of cognitive representation, that
is used when the speaker is unable to express his ideas by means of words. Another
suggestion is that gestures associated with speech are the result of an overflow into the
motor system of tensions that arise when speech production is blocked (see Dittmann,
1968. See also Frick-Horbury & Guttentag, 1998 for gesturing in tip-of-the-tongue
situations, when the addressee already knows the target word). The hypothesis that
gestures facilitate the speech encoding process by enacting the ideas into movements
has also been proposed (see Moscovici, 1967). In this regard, Rauscher, Krauss and
Chen (1996) found that preventing people from gesturing reduced the fluency of their
speech with spatial content. This idea was also adopted by Freedman (1972), Rimé
(1982) and Krauss et al. (1990)34.

34. Butterworth and Hadar also partially accept this hypothesis (see 4.1).

In particular, Rimé notes that we gesture even in situations when we are fully
aware of the fact that our interlocutor is not able to see us: typical examples of this
behaviour are, according to the author, telephone conversations. The real problem,
according to Rimé, is that, when speaking on the telephone, not only do we perform
batons (which are under the awareness level), but even illustrators35; moreover, the
performance is not limited to hesitation pauses, but accompanies the whole speech flow,
as in face-to-face conversations.
Kendon (1994), on the other hand, hypothesized that gestures may serve to
convey meanings that are particularly difficult to put into words, and claims that

the gestures that people produce when they talk do play a part in
communication as they do provide information to the co-
participants about the semantic content of the utterances
(Kendon, 1994:192).

More recently, De Ruiter (2000) has argued that there is no conflict between the
two views. Gestures may well be intended by the speaker to communicate and yet fail to
do so in some or even most cases (De Ruiter, 2000:290). The author also replies to
Rimé's consideration, stating that

the fact that people gesture on the telephone is not necessarily in
conflict with the view that gestures are generally intended to be
communicative. It is conceivable that people gesture on the
telephone because they always gesture while they speak
spontaneously - they simply cannot suppress it. (De Ruiter, 2000:

This position seems to be consistent with the later development of the theoretical
view about the function of gestures put forward by Adam Kendon (2004) and David
McNeill (2005), among others.
Yet, as some scholars remark, this assumption may be taken as evidence of the non-
communicativeness of gesture: if people cannot suppress gesture, then gesture is
unintentional; if it is unintentional, then it is not communicative (see Krauss et al.,
2000). Krauss et al. (2000) also note that some gestures, namely iconic ones, are
hard to interpret without the accompanying speech, and thus draw the conclusion
that gestures do not communicate. Further evidence is provided by the analysis of an
example reported by Kendon (1980): in this example, a speaker is described as
saying "with a big cake on it" while making a series of circular motions of the
forearm with the index finger pointing downward. The described gesture is an iconic one,
conveying the [ROUND] meaning. The point raised by Krauss et al. regarding this
example is that the cake described by the speaker had other
properties, such as color, flavor, and texture, which the speaker did not mention,
probably because they were not relevant. But was the [ROUND]
property relevant? The authors note the following:

Although it may well have been the case that the particular cake
the speaker was talking about was round, ROUND is not a
semantic feature of the word cake (cakes come in a variety of
shapes), and for that reason ROUND was not a part of the
speaker's communicative intention as it was reflected in the
spoken message. (Krauss et al., 2000: 272)

35. For the notion of illustrators, see Chapter 3 (3.1).

The problem with this discussion is that Krauss et al. seem to ignore a central
question, namely, the real meaning of the gesture, which will be here defined as lexical
access: the lexical access of the iconic gesture described by Kendon (1980) is clearly
[BIG/round] not [ROUND/big] (where capitalized letters show the Rhematic part of
the lexical access). In other words, it is clear that gestures can convey their lexical
access through the means of limb movements and hand shapes. As a result, the
meaning conveyed will have other representational features that do not strictly pertain
to the message to be conveyed. In this particular case, we face a gesture expressing
roundness and bigness, this last feature being the focus of the message being expressed.
The fact that the cake the subject is trying to describe is round matches the
prototypical idea of cake, which is conveyed by the gesture. The same prototypical idea
is also expressed in speech, in a sense, by means of a particular strategy, namely the
absence of focus on the shape of the cake. In other words, if the cake the subject had
seen had been a square one, this non-prototypical feature would have been expressed in
speech, and also by means of a different iconic gesture (perhaps a square one).
As for the hypothesis that gestures are not communicative, since the meaning (or
lexical access) of some of them is not comprehensible without the accompanying
speech, let us not forget that speech and gestures always co-occur, as they are
inseparable phenomena (see McNeill, 1992 and 2005).
Finally, a study by Justine Cassell (1999) notes that gestures seem to be actually
listener-oriented, for they are usually performed together with the rhematic36 part of the
accompanying speech. Thus, since gestures are performed together with the
introduction of new pieces of information, their function should be listener-oriented:
gestures, in fact, would be related to suprasegmental traits of the speech flow (that is,
to their pragmatic and communicative functions), rather than to computational
processes supervising the pre-communicative lexical retrieval processes. Nonetheless,
this conclusion is far from obvious. In effect, one should not ignore that both Theme
and Rheme are concepts coined in the realm of psycholinguistics in order to account
for the phenomena involved in speech production: in other words, such concepts
describe the main strategies adopted by the speaker for information ordering and
packaging by means of speech. Accordingly, the observed co-occurrence of a high
percentage of gestures with the Rheme of a sentence could also be interpreted as
speaker-oriented behaviour, with gestures having the sole function of supporting the
speaker in his computational task. Nevertheless, it is my opinion that acknowledging a
speaker-oriented function for gesticulation does not undermine the interpretation of it
as a communicative and linguistic phenomenon37.

36. In a speaker-oriented phrase analysis, we can distinguish between Theme, which is the
part of the phrase establishing the subject, or theme of the message, and Rheme, which is the part
conveying some pieces of information about the theme. If we, on the contrary, analyse the phrase
with a listener-oriented analysis, we can distinguish between Given, which is the information
already known by the listener, and New, which is the information added by the speaker. Cassell
does not take this distinction into account.

37. Most scholars seem to agree with this view, among them Adam Kendon (2004) and
McNeill (2005). In particular, McNeill states that the orientation of co-verbal gestures within
communicative acts is a false problem. See Chapter 9 for a wider discussion of this issue.

5.2. Reinterpreting Gesture as a Prototype Category

As stated above, the most perplexing problem concerning the study of gesture is the
apparent lack of consistency between the numerous instantiations of this phenomenon.
It is my opinion that such apparent inconsistencies could be resolved by means of an
analysis of gesture as a subset class integral to a wider system. The system in question
can be described as composed of both speech and gesture. Since more extensive information
about speech is available, thanks to classical linguistic studies entirely devoted to the
subject, we will here focus on a structural description of gesture38. In particular,
the gesture sub-module will be here interpreted as an incoherent class having the
properties of a prototype category.
More precisely, the category of gesture may be interpreted as a modular category
(Figure 5) whose elements can be arranged by means of the five parameters listed below:

- intentionality (Ekman and Friesen, 1969);
- awareness (Ekman and Friesen, 1969);
- abstraction: degree of independence from perceptible objects;
- arbitrariness: degree of autonomy of the signifier from a morphological point
of view with respect to the contents it expresses;
- extension: the number of mental representations which can be related to the same
signifier. This notion is equivalent to that of lexical access (see 5. 2).

The core of gesture is constituted of arbitrary emblems, for these gestures have the
highest degree of intentionality, awareness, arbitrariness, and abstraction. In fact,
these gestures are unquestionably intentionally performed and can substitute for speech.
Moreover, the speaker is perfectly aware of his performance and the meaning it
conveys, and is usually able to effortlessly recall the gesture even after the passage of
On the other hand, these gestures have a minimum extension: their content is
easily definable in terms of semantics.
Iconic emblems have a slightly lower degree of arbitrariness and abstraction.
Metaphors show a considerably wider extension (which means that their semantic
content is less easily definable), but still have a good degree of intentionality.
Awareness, however, is lower, for the speaker is not always able to recall the gesture
after its execution. Their degree of arbitrariness is medium. Iconics show a lower
degree of arbitrariness and abstraction than metaphoric gestures. They are still
intentional, although awareness is low. Batons, which are at the periphery of this
category, are unintentional and performed without awareness. Their extension cannot be
determined, so that arbitrariness and abstraction cannot be qualified for these gestures.
Note that deictics do not fit in this categorization: in fact, although they still have a
good degree of intentionality and awareness, their extension is not determinable, as for
batons. This could lead to the hypothesis that these gestures are the archetype of the
gesture category, perhaps even a relic of a human proto-language.

38. Of course, any description is usually facilitated if the phenomenon taken into account is
isolated from the context and analysed as independent. This same method will be adopted,
keeping in mind that any gesture, as a communicative phenomenon, is only completely
understood if analysed as a subset of a wider phenomenon, namely human Language. Note that
the same applies to speech.

To summarize, arbitrary emblems are at the core of the gesture category for the
following reasons:

- they are extremely intentional;
- their degree of awareness is highest;
- their degree of arbitrariness is the highest of the category;
- their extension is minimal.

Figure 5: gesture as a prototype category

As for the other members of the category, I assume that their relationships with the
core can be described as shown in Figure 6. As one may notice, iconicity is the only
feature common to all the classes into which gesture can be divided, if one categorizes
it from a synchronic perspective. Arbitrary emblems, as claimed above, are the most
intentional, aware, arbitrary and abstract class of gestures. Iconic emblems, on the other
hand, have the same degree of intentionality and awareness, although their arbitrariness
is lower than that of arbitrary emblems, since the signifier has an iconic relationship
with its signified. Metaphors, in turn, owe their name to the fact that their main
function is to attribute physical content to abstract meanings.
Finally, beats do not have a lexical access, but follow and resemble the rhythm of
the co-occurring speech flow. Beats may therefore also be considered as having,
to some extent, iconic properties. For this reason, if gesture were classified from
a diachronic point of view, iconics would probably be the prototypic instance of the

If this theory is correct, then an experimental session would show that, in varying
contexts (formal vs. informal), subjects gradually inhibit gestures starting from the core.
In fact, even a superficial analysis demonstrates that speakers in formal contexts show
a tendency to inhibit gestures that are usually performed in colloquial contexts because
those gestures are considered inappropriate in such situations.

[Figure 6 node labels, in order: Arbitrary, Iconic emblems, Iconics, Metaphors]

Figure 6: the development of the gesture category as a metonymic chain (Rossini, 2001, revised)

The hypothesis put forward here is that such inhibition of gestures in formal
contexts follows the prototype theory here described, arbitrary emblems being
inhibited by the speakers to a greater degree than less prototypical gestures. The more
prototypical a gesture is, the more it will be likely to undergo inhibition.
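
The prediction just stated can be made concrete with a minimal sketch. The core-to-periphery ordering below is read off the prose description of the category and is an assumption of this sketch (Figure 6 may order the classes differently); the function names and the notion of a numeric "inhibition measure" are hypothetical and are not part of the original study.

```python
from typing import Dict

# Core-to-periphery ordering as read from the prose description of the
# category (an assumption of this sketch). Deictics are left out because
# the text says they do not fit the categorization.
CORE_TO_PERIPHERY = [
    "arbitrary emblems",
    "iconic emblems",
    "metaphors",
    "iconics",
    "batons",
]


def prototypicality_rank(gesture_class: str) -> int:
    """Return 0 for the core class and larger values towards the periphery."""
    return CORE_TO_PERIPHERY.index(gesture_class)


def consistent_with_hypothesis(inhibition: Dict[str, float]) -> bool:
    """Check that inhibition never increases from core to periphery.

    `inhibition` maps a gesture class to any hypothetical inhibition measure
    (for example, the drop in its share of gestures in the formal session).
    Classes missing from the mapping are skipped.
    """
    values = [inhibition[c] for c in CORE_TO_PERIPHERY if c in inhibition]
    return all(a >= b for a, b in zip(values, values[1:]))
```

Under this reading, a data set in which arbitrary emblems are inhibited at least as strongly as every less prototypical class would count as consistent with the hypothesis.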
This theory has been tested by means of a three-session experimental data collection
conducted at McNeill's Center for Gesture and Speech, The University of Chicago, in
October 2001. The data collected involved five Italian subjects (three males, two
females) aged 22-29 with a similar educational background (i.e., University students),
and a similar competence in English as a second language (i.e., a TOEFL test score
equal to or higher than 26/30). All sessions were video-recorded by means of a digital
camcorder, on a recording set. None of the subjects were aware of the specific aim of
the experiment but they did know that the experiment was related to a psychological
During the first session, each subject was asked to hold a five-minute conversation
in a foreign language39 (English) with an unknown interviewer (in the transcriptions,
I1) simulating a job interview.
The second session (about five minutes per subject) was structured so that each
subject could hold a conversation in his/her own mother language (Italian) with an
acquaintance (in the transcriptions, I2), who is the author of this book. During the third
session, the subjects were asked to perform a guessing game in pairs: I2 described the
ending scene of a story and asked them to reconstruct it. The guessing game was meant
to distract the participants from the presence of lights, microphone, and camera in the
recording set: the task of providing a solution to an amusing scenario for a story was
likely to lead to a more relaxed interaction, even though each participant reported no
previous relationship with their intended partner in this phase of the experiment.
The subjects were subsequently informed about the actual aim of the experiment and
gave their informed consent.

39. A foreign language was chosen in order to add a cognitive task to the first session: while
speaking in a foreign language, in fact, a greater number of gestures is triggered in the effort of
lexical-retrieval and period-structuring. My special acknowledgement goes to Karl-Erik
McCullough who kindly consented to act as native speaker interviewer during this session.

5.2.1. Results

The results of the experiment are shown in Tables 2-6. For each table, the first part
shows the number of gestures actually performed for each session, while the second
part shows the percentages of gestures for each session.
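
As a minimal sketch of how the second part of each table relates to the first, the percentages can be derived from the raw counts for one subject in one session. The function name is hypothetical, and the counts in the usage note are invented placeholders, not the values reported in Tables 2-6.

```python
from typing import Dict


def gesture_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-class gesture counts for one session into percentages.

    Each percentage is the class count divided by the total number of
    gestures in that session, rounded to one decimal place.
    """
    total = sum(counts.values())
    if total == 0:
        # No gestures were performed in this session: every share is zero.
        return {gesture_class: 0.0 for gesture_class in counts}
    return {
        gesture_class: round(100.0 * n / total, 1)
        for gesture_class, n in counts.items()
    }
```

For instance, placeholder counts of 1 emblem, 3 metaphors, 4 iconics, and 2 beats and conduits (10 gestures in total) would give shares of 10%, 30%, 40%, and 20% respectively.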

Table 2: Gesture and Prototype Theory. S1 Experiment Results

Table 3: Gesture and Prototype Theory. S2 Experiment Results


Table 4: Gesture and Prototype Theory. S3 Experiment Results

Table 5: Gesture and Prototype Theory. S4 Experiment Results


As shown in Figure 7, Emblems tended to occur only in sessions II and III, when
formality decreased: in fact, only one Emblem was performed during the first session,
by S1. Moreover, the percentage of Emblems noticeably increased in session II. The
percentage of performed Metaphors also gradually increased from session I through
session III in four subjects out of five. Iconics show the same tendency in all subjects.
By contrast, beats were never inhibited, although in four subjects out of five a
reduction of beats and conduits40 from session I to session II was reported. Note that all
subjects performed a remarkable number of beats and conduits in all sessions, with a
percentage of no less than 27%41 (see S1's performance of beats and conduits during
the first phase in Table 2).

Table 6: Gesture and Prototype Theory. S5 Experiment Results

Moreover, the results for each subject are consistent with the results obtained by
assessing the number of gestures performed in each session by all subjects (Table 7). In
fact, only one emblem was performed by all five subjects in the first session, while the
number of emblems increases in the second and third session. On the other hand, beats
and conduits decrease noticeably from the first session to the second, a fact that
is here claimed to be due to their replacement with emblems, since beats and conduits
are not involved in gesture inhibition in formal situations.

⁴⁰ Note that beats and conduits have been gathered under the same label: conduits are a
particular sub-class of Metaphors, whose function is the metaphoric presentation of the speaker's
idea to the listener. Since all conduits analysed within this experiment showed a superimposed
beat, they are assumed to have, only in this particular case, the same degree of awareness and
intentionality as beats. Note that Pavelin-Lesic (2009) reports the same phenomenon in stating
that conduit gestures are usually recorded with a superimposed beat.
⁴¹ See S1's performance of beats and conduits during the first phase (Table 2).

Figure 7: Percentage of gestures performed during each session

One objection to these results is that gesture variation from the first session through
the third one could be content-related. In fact, this is not the case: during both first and
second sessions, the subjects were asked to talk about their life, so that the content of
the speech was kept constant, the language being the only changing factor. Moreover,
no particular variation related to the cognitive challenge of the experiment was found:
all five subjects were proficient in their second language, and had improved their skills
by means of a stay in the United States. All these conditions being constant, one would
have expected the performance of at least some American emblems, but this was not
the case. Only one emblem was performed during the first session, and it was a
surrender gesture, which is a typically Italian one.

Table 7: number of gestures performed during each session

Session   Emblems   Metaphors   Iconics   Conduits/Beats
I            1         40          7          182
II          34         40          0           89
III         35         25         29           65
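
The share of each gesture type within a session can be recomputed from the raw counts in Table 7. The following is a minimal sketch in Python, not part of the original study: the names COUNTS and session_percentages are illustrative assumptions, and the figures are pooled over all five subjects, so they need not match the per-subject percentages reported in Tables 2-6.

```python
# Gesture counts per session, copied from Table 7 (all five subjects pooled).
# COUNTS and session_percentages are hypothetical names used for illustration.
COUNTS = {
    "I":   {"Emblems": 1,  "Metaphors": 40, "Iconics": 7,  "Conduits/Beats": 182},
    "II":  {"Emblems": 34, "Metaphors": 40, "Iconics": 0,  "Conduits/Beats": 89},
    "III": {"Emblems": 35, "Metaphors": 25, "Iconics": 29, "Conduits/Beats": 65},
}


def session_percentages(counts):
    """Return each gesture type's share (in %) of all gestures in one session."""
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


if __name__ == "__main__":
    for session, counts in COUNTS.items():
        # Print the session label, its total gesture count, and the shares.
        print(session, sum(counts.values()), session_percentages(counts))
```

On these pooled counts, conduits and beats fall from roughly 79% of all gestures in session I to about 55% in session II, while emblems rise from under 1% to about 21%, which is the pattern the text attributes to the replacement of beats and conduits with emblems.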

During the third session, the risk of content-related emblems increased, but, in fact,
the emblems performed by the subjects were more likely to be due to the decreasing
formality, since these gestures were observed during pauses (but not hesitation pauses),
or when the subjects did not agree with each other. Instances of the emblems performed
by S1 and S2 are provided in Figure 8 and Figure 9 respectively.

Figure 8: an instance of emblem performed by S1 during the third session

Figure 9: an instance of emblematic phrase performed by S2 during the third session


5.3. Is Gesture Communicative?

The results reported above lead to a general consideration about the question of
intentionality in co-verbal gesticulation. If, in fact, co-verbal gesticulation is analysed
as a sort of sub-module of what will be defined as Audio-Visual Communication (see
chapter 8), the results of the above-mentioned experiment suggest the following:

- Since Emblems are provided with the highest degree of both intentionality and
awareness, their suppression in a formal situation is easiest;
- Metaphors and Iconics, whose intentionality and awareness are lower, have a
higher probability of occurring even in very formal situations, although their
percentage of occurrence noticeably decreases in formal situations: this
phenomenon can be explained with the hypothesis that subjects partially succeed
in their inhibition efforts;
- Beats and Conduits, which have the lowest degree of intentionality and
awareness, are not involved in inhibition processes, even in formal situations.
Indeed, one might also hypothesise that such pieces of co-verbal gesticulation
normally lack both intentionality and awareness. The decreasing
percentage of these types of gesture in informal situations is due to their
substitution with other pieces of the gestural repertoire (i.e. Emblems). This
particular interpretation is consistent with the widespread hypothesis (Dittman
and Llewelyn, 1969; Freedman, 1977; Frick-Horbury and Guttentag, 1998) that
such gestures are an unintentional response to cognitive and/or emotional arousal.

In conclusion, the results discussed above seem to be consistent with the claim
that the function of co-verbal gesticulation is to help the speaker's computational task
(De Laguna, 1927; Werner and Kaplan, 1963), either by dissipating emotional arousal,
or by facilitating the speaker's computational and cognitive exertion (Krauss et al.,
2001). This same idea can also be inferred from the observed higher gestural response
elicited by demanding cognitive tasks (see McCullough, 1995). On the other hand, my
results also seem to support the claim that co-verbal gestures are highly intentional, and
thus communicative.
More precisely, it is reasonable to assume that not all the different instances of co-
verbal gesticulation are consistent with respect to both intentionality and awareness in
performance. At first, such an assumption could appear severely incongruent with the
main claim put forward in these pages, namely, that of the communicativeness of
gesture. Nevertheless, this is not the case.
My hypothesis is that such a relevant and still unanswered question as the
communicativeness of gesture can be resolved by referring gesture back to the wider
communicative phenomenon it is part of, namely, Audio-Visual Communication, which,
in its turn, can be interpreted as a particular instantiation of human language. Of course,
such a phenomenon is by definition intentional, although not all the parts it is
structurally composed of are in their turn intentional.
As regards gesture in particular, it appears to be composed of strongly
intentional, speech-like segments that can be used in place of speech. Such
gestures are likely to be listener-oriented. Metaphors, deictics, and iconics have a lower
degree of both intentionality and awareness in performance. These are more likely to be
either speaker or listener oriented, depending on the communicative act taking place.
They are more speech-related, that is, they show a deeper correlation with concurrent
speech, which is often co-referential. As we will see later on, such gestures also seem
to play a determinant role in the pragmatic phenomenon that Natural Language
Processing defines as Planning (Chapter 9).
Finally, beats and conduits, which have been shown to be the least communicative,
can only occur together with speech: speech and gesture perfectly merge and interact
with each other, for the more a gesture is speech-like, the wider is the possibility that
it substitutes for speech in its pragmatic and semantic functions.

Figure 10: the occurrence of gesture and speech within communicative acts

In particular, gesture usually co-occurs with speech within non-marked
communicative acts, that is, when all the conditions required for successful face-to-face
communication are guaranteed. But, if one analyses the relationship between
speech and gesture, one will notice that not all types of gesture have the same degree of
association with concurrent speech. In particular, as one moves from the periphery to
the core of the gesture category (see Figure 10), the obligatory presence of speech
declines, emblems being more codified (i.e., culturally determined) than metaphors,
iconics, and beats. Moreover, while the occurrence of metaphors or iconics without co-
occurring speech is possible, beats are the most speech-oriented class of gestures:
such gestures, in fact, never occur without speech.
In other words, the more a gesture is unaware and unintentional, the more the
presence of concurrent speech is mandatory.

Finally, let us not disregard the fact that the degree of intentionality (and,
consequently, communicativeness) also varies in the different aspects of the speech
module: in particular, key verbal language phenomena such as phonetic and
phonological encoding and perception, intonation, and prosody are unconscious ones.
Lastly, the lexical-retrieval process taking place during speech encoding has also
recently been suggested to be unconscious (McCullough, 2005).


This chapter has addressed one of the major and most debated questions related to the
study of gesture from a linguistic viewpoint, namely, whether it is possible to postulate
that gestures as a semiotic class are endowed with intentionality and, consequently,
whether gestures are communicative. The analysis presented here, together with the
results from experimental data and the reference to previous studies in this direction,
has helped to propose different levels of intentionality for different typologies of
gestures provided that there is always a semiotic continuum together with an
overlapping of diverse semiotic components (McCullough, 2005) and dimensions
(McNeill, 2005) in the same class of gestures. The trait of intentionality is analysed
here also in the different parts of speech, and called into question as far as particular
classes of speech and gesture are involved. Nevertheless, the intentional, and thus
communicative value of gesture is here claimed to be extended from the overall process
to its parts and peculiar phenomena: when the overall and general interactional process
(i.e., the communicative interaction as a whole) is intentional, its components recover
their trait of general intentional and volitional value. Further enquiry on the topic of
communicative versus self-orientational functions of language can be found in Chapter

6. Language "in Action"
Sed cum haec magna in Antonio tum actio
singularis; quae si partienda est in gestum
atque vocem, gestus erat non verba
exprimens, sed cum sententiis congruens:
manus humeri latera supplosio pedis status
incessus omnisque motus cum verbis
sententiisque consentiens; Marcus Tullius
Cicero, Brutus, 141.


As argued in the last chapter, the results of numerous investigations into the cognitive
and psychological foundations and functions of gesture are consistent with the theory
of a single origin for speech and gesture. If this is so, it is also plausible to hypothesise
that together with speech, gesture is an instantiation of the human language capacity.
This chapter addresses the question from a motor and neurological perspective by
reviewing the research studies conducted in the field and by discussing the data from a
multi-tasking experiment that I conducted at the Li.Co.T.T., Università del Piemonte
Orientale. Addressing this topic also involves discussing the principal theories of the
phylogenetic evolution of language.

6.1. The Neurological Correlates of Language

An extensive number of experiments have addressed the neurological correlates of
language, in the attempt to shed some light onto its neurobiological foundations and
evolution. As for the neurobiological foundations of language and the structure of mind,
scholars have put forward two major hypotheses. The first and older one is the
Theory of Brain Modularity, which draws mostly on Fodor's (1983) theory of the Modularity
of Mind. The hypothesis of brain modularity rests on classical
descriptions of the human brain as constituted of specialized areas devoted to specific
tasks (Broca 1861). As a consequence of a strong interpretation of Fodor's (1983)
hypothesis, mostly based on Chomsky's (1957) concept of Universal Grammar, the
brain is usually thought to be structured into different sub-modules, each of which controls a
particular function. The Theory of Brain Modularity in its usual conception
identifies different brain areas for the regulation of different activities. For instance,
it is assumed that the right hemisphere regulates motor-spatial and emotional functions,
while language and linguistic functions are controlled by the left hemisphere. Within
the latter, different areas would control different linguistic functions: the perception of
language, for example, is attributed to Wernicke's area, while linguistic production is
commonly identified with Broca's area (see e.g. Lenneberg 1967). On the other
hand, recent investigation into the involvement of the hemispheres and areas of the
brain in language seems to dismiss the classical modular hypothesis and re-propose a
model anticipated by Freud (1891). This model, usually referred to as the Connectionist
model, is based on the hypothesis that the brain relies on connections of neurons and
synapses into devoted networks rather than on devoted modules. Investigation aimed at
assessing the involvement of different brain areas in language usually bases its analysis
on the observation of different relationships between injuries in given brain areas and
resulting aphasias. In addition to this field of inquiry, the recent growth of neuroscience
has led to several findings concerning the involvement of neural synapses in language
perception and production.
Because the neurological research aimed at assessing the biological foundations of
language frequently involves models and hypotheses about the evolution of language, I
will briefly address here the three major hypotheses. The so-called gesture-first
hypothesis (Corballis, 2002; Givón, 2002; Tomasello, 2008) consists in the
interpretation of ontogenetic patterns in language evolution, with a particular interest in
the emergence of language in children as a model for the phylogenetic evolution of
language. Because gesture appears to emerge before speech in infants, and because
empirical data on the breathing apparatus in earlier species of Homo show that
phonation would be impeded, the scholars who put forth this hypothesis claim that
gesture emerged as the first means of communication and was subsequently replaced
by vocal communication.
Another hypothesis regarding the evolution of language is that it evolved as a
vocal communication system (see e.g. Jackendoff 2002): scholars convinced of this
hypothesis claim that language as vocal production (i.e., language as speech) is
unique to the species Homo. Other scholars are convinced that language originated as
gestural and stayed gestural (Armstrong et al. 1995), while an interesting hypothesis is
that human language emerged as a multimodal system ever since the beginning
(McNeill, 2005, in press), and probably originated in mother-infant interaction.
Despite these several hypotheses about the evolution of language and its
neurological correlates, neuroscientific evidence is often contradictory. Moreover, even
when findings are consistent (as in the case of mirror neurons in man), the
interpretation of this evidence can vary significantly among scholars.
The recent and still debated discovery of mirror neurons in Broca's area, for
instance, has questioned the hypothesis that the left hemisphere is completely devoted
to the control of linguistic functions and also the idea of Broca's and Wernicke's areas
as controlling, respectively, the production and perception of language (Arbib 2006).
This latter finding seems to be consistent with the hypothesis of a strong linkage
between manual action, gestural production, and language evolution (Armstrong,
Stokoe and Wilcox 1995, Arbib 2006, McNeill 2005 and in preparation). Several left-
hemisphere sites, cortical and subcortical, have also been found to be involved in the
production of symbolic gestures (Basso, Luzzati & Spinnler 1980, Heilman et al. 1983,
Jason 1985). Conversely, the comprehension of pantomimic gestures and nonverbal
sounds seems to be disrupted by very different left-hemisphere lesions (Varney &
Damasio 1987), while observation of the behaviour of brain-damaged subjects shows a
dissociation between components that are closely related in normal subjects: some
aphasics, for instance, are able to pantomime the function of an object, but are unable
to name it (Feyereisen et al., 1988). These data have led to the hypothesis that nonverbal
and verbal communicative modalities are functionally independent. Studies on right
hemisphere aphasics already discussed in this volume (McNeill and Pedelty 1995)
showed an involvement of the right hemisphere in both the organization and coherence
of speech and gesture production. This finding - one of the key points confirming
McNeill's theory (1992, 2005, in press) of a single psychological and, to some extent,
neurological origin for speech and gesture - has been recently dismissed by a study on
aphasic subjects (Carlomagno et al. 2005, Carlomagno and Cristilli 2006) whose results
show no relation between gestural and speech impairment. Moreover, an inquiry
conducted by Moro (2006) by means of an fMRI (functional magnetic resonance
imaging) investigation of brain areas during language training tasks has shown
exceptional results: participants engaged in brief linguistic training with fictional
languages designed by the experimenters showed an activation of Broca's area only in
response to fictional grammar rules obeying linguistic universals. Moro's results lead to
the conclusion that linguistic universals rely on a neurological foundation and could
thus be interpreted as consistent with the hypothesis of the modularity of mind.
Conversely, in this book (chapter 9) I suggest a lateralized gestural response to
different linguistic functions, which is also strikingly consistent with McNeill's
hypothesis (2005) about the involvement of the right hemisphere in linguistic functions:
the shifting of symbolic movement from the dominant to the non-dominant hand during
linguistic planning seems to suggest a role for the right hemisphere in the organization,
coherence and cohesiveness of the linguistic message, and thus in the organization and
mediation of the different functions of language. The results presented in chapter 9
about lateralization in gesture and linguistic functions are also consistent with the
observation of a right-hand preference for metaphor in healthy subjects with left-
hemisphere dominance for language (Kita, de Candappa & Mohr 2007), and the
parallel finding of non-lateralized gestural production in split-brain patients (Kita &
Lausberg 2008). These results are also consistent with the data available from patients
who had undergone commissurotomy or hemispherectomy of the left dominant
hemisphere, revealing that the right non-dominant hemisphere alone is capable of
distinguishing between words and meaningless word-like forms (Zaidel 1985).
As one proceeds from questions of mere lateralization in language dominance
(whose causes are still controversial, see Pulvermüller 2002), to the brain areas
representing more fine-grained linguistic abilities, such as phonemic perception, word
recognition, semantic representation and syntax, the inquiry becomes even more
contentious: phonemic perception is suggested to be the ultimate cause of human brain
lateralization with the subsequent dominance of the left hemisphere (Miller 1996),
despite the fact that lateralization has also been found in animals (Pulvermüller 2002).
fMRI enquiry into phonemic perception highlights, apart from the activation of the
transverse temporal gyri, also a significant activation of the planum temporale and the
left superior temporal sulcus (Jäncke et al. 2002). Nevertheless, cases of double
dissociation in agraphia of kanji and kana studied by means of fMRI seem to reveal an
involvement of the middle frontal gyrus in phonological representation (Sakurai et al.
1997). Different specialists have proposed a number of brain areas for lexical
representation, including the left inferior frontal areas (Posner and Di Girolamo 1999),
the left superior temporal lobe (Tranel and Damasio 1999), the occipital lobes
(Skrandies 1999), and the primary motor, pre-motor, and prefrontal areas (Pulvermüller
1999). More recently, Pulvermüller (2002) demonstrates the involvement of the temporo-
occipital and fronto-central areas for visual- and action-related words. Finally, there is
also metabolic evidence for the involvement of Broca's Area in the integration of
information available from iconic gestures and speech (Willems et al. 2007) and some
evidence of the activation of Brodmann Area 45 in gesture-only perception. The latest
findings lead Hagoort (2005) to propose an extension of the language area to BA 44,
BA 45, BA 47 and BA 6 in the left inferior frontal gyrus and to postulate the involvement
of the left temporal cortex, the dorsolateral prefrontal cortex, and the anterior cingulate
cortex in language cognition and processing.

Moving towards a more abstract level, the opinion of many is that language itself,
although perceived over time, is far from a linear system: as we speak, different
processes take place simultaneously, and they are all due to a complex neuro-motor
system that is deeply involved in communicative acts. Communication is thus
ultimately an abstraction process that exploits the whole range of human neuro-motor
production in order to convey meaning. This concept of human language, which had
already been put forward by a number of scholars, was adopted by Armstrong, Stokoe
and Wilcox (1995), who assumed a gestural origin for language. They base their claim,
in part, on Edelman's (1987) theory of Neuronal Group Selection.
Edelman's theory finds its roots in recent discoveries (to which he contributed
considerably) about the immune system and how it functions. In fact, the molecules
and cells composing the immune system have been found to obey the Darwinian
principle of selection, following an a posteriori process and contrary to the classically
instructive conception of antibodies. He subsequently applied these findings to the
study of brain development, giving birth to the Theory of Neuronal Group Selection
(Edelman, 1987). This theory is based on Neural Darwinism, or the approach in which
neural circuits are built up by means of a selection involving both the phylogenesis and
the ontogenesis of organisms. According to this perspective, the brain is constituted,
from birth, of a redundant number of neurons which are subsequently organised into
neural circuits by means of processes that parallel those of natural selection:
depending on the intensity of their use, some neurons die and others grow. The
process of neural development is structured into three main phases:

- Diversification of anatomical connectivity occurs
epigenetically during development, leading to the formation
by selection of primary repertoires of structurally variant
neuronal groups. The diversification is such that no two
individual animals are likely to have identical connectivity in
corresponding brain regions.
- A second selective process occurs during postnatal behaviour
through epigenetic modifications in the strength of synaptic
connections within and between neuronal groups. As a result,
combinations of those particular groups whose activities are
correlated with various signals arising from adaptive
behavior are selected.
- Coherent temporal correlations of the responses of sensory
receptor sheets, motor ensembles, and interacting neuronal
groups in different brain regions occur by means of reentrant
signaling. Such signaling is based on the existence of
reciprocally connected neural maps. (Edelman, 1987: 5)

Still, this sort of selection does not affect the single neurons but, rather, it involves
neuronal groups. The arrangement and activity of neuronal groups form, in turn,
neural maps, which are not to be confused with single neuron-connections. These maps
are highly and individually variant in their intrinsic connectivity.
As Edelman states, these structures provide the basis for the formation of large
numbers of degenerate neuronal groups in different repertoires linked in ways that
permit reentrant signalling (Edelman, 1987: 240). The degenerate system allows
functional elements to perform more than one function and single functions to be
performed by more than one element (Edelman, 1987: 57).
Yet, neural maps are not isolated: they interact in a process called reentry, which
the author defines as a process of temporally ongoing parallel signalling between
separate maps along ordered anatomical connections (Edelman, 1989: 49). The
interaction among neural maps, as well as the interaction between neural maps and
non-mapped brain regions (i.e. the frontal lobes) forms the final component, the global
mapping, which is, ultimately, responsible for perceptual categorization:

The concept of a global mapping takes account of the fact that
perception depends upon and leads to action - the results of
continual motor activities are considered to be an essential part
of perceptual categorization. Neuronal group selection in global
mapping occurs in a dynamic loop that continually matches
gesture and posture to several kinds of sensory signals
(Edelman, 1989: 54-56).

Global mappings have a dynamic structure that involves both reentrant local maps
and unmapped regions of the brain, and is responsible for the management of the flow
from perception to action. Motor activity, which is an essential input to perceptual
categorization, closes the dynamic loop. Moreover, Edelman proposes that global
mappings would control not only perceptual categorization but also concept formation
(Edelman, 1989:146).
This theory, which questions the conception of the brain as a modular system, finds
substantiation in the results of other recent fMRI studies. The observed activation of
the auditory cortex in congenitally deaf subjects (Emmorey et al. 2003), and the
mirroring observation of activation of the visual cortex in congenitally blind subjects
(Sadato et al., 1996) can be interpreted as an index of brain plasticity and the non-
specificity of neural maps. Moreover, these findings are consistent with studies
highlighting the presence of so-called mirror neurons in the cerebral cortex area F5 of
monkeys (Rizzolatti, Luppino and Matelli, 1998), which revealed the activation of
neurons both when a monkey performs a given action and when it observes this action.
The supposed presence of mirror neurons in the primate brain is a further piece of
evidence of the non-specificity of the brain areas in contrast to the classical and
widespread conception of separate perception and production modules (see also Arbib,
2006). Inspired by the findings for the primate brain, Nishitani and Hari (2000)
conducted a similar study on humans.
Seven subjects were tested while performing several tasks, such as repeated
movements (execution phase), on-line imitation of the movements performed by
another person (imitation phase), and observing precision movements performed by
another person (control phase). The results show that during execution, imitation, and
control the left IFA, the occipital area, and the primary motor areas of both hemispheres
responded to the task. During execution the left Brodmann Area (BA) 44 was activated
first, followed by the left BA 4, the left occipital area, and the right BA 44. During
control and imitation the left occipital area responded first, followed by the left Inferior
Frontal Area (BA 44), and both BA 4. Activation of the left BA 44 and the left BA 4 was
significantly stronger during the imitation phase. These results may be interpreted as
strong evidence for the existence of an action execution/observation matching system
in humans as well as in monkeys. This mirror-neuron system would be placed in the
left BA 44. Subsequently, another experiment (Damasio et al., 2001) found greater
activity in the left inferior parietal lobe, compared with the right one, during the
recognition of transitive actions.
These experiments support the theory that there is no clear boundary between action
and perception of action, action and representation of action, action and verbal
expression of action.
This idea is also clearly supported by Bongioanni et al. (2002), in their review
of experiments on language impairments in Amyotrophic Lateral Sclerosis and
Motor Neuron Disease, whose results show that there is a deep link between action and
verbs and consequently between the syntax of action and the syntax of speech.
These results seem to provide evidence for the hypothesis suggested by Armstrong
et al. (1995):
We cannot accept the opinion of many that language
originated from and resides in a three-part system composed of
the brain, the vocal tract, and the auditory system. We believe
that not brain, voice, and hearing constitute language but that
the whole organism, and especially the visual and motor systems,
are involved in language (Armstrong, Stokoe and Wilcox,
1995: 19).

In particular, the authors assert that language is not essentially formal; rather,
the essence of language is richly intermodal - it is motoric, perceptual and
kinesthetic (Armstrong, Stokoe & Wilcox, 1995: 42). Their interpretation is based
on several main claims, which can be summarized as follows:

- Language has the primary purpose of supporting social
interaction and cooperation
- The neurological organization of language in the brain is not
- Language acquisition in children is organized along general
principles and is not guided by a language acquisition device
(Armstrong, Stokoe & Wilcox, 1995: 18)

The hypothesis of brain plasticity is based on a review of Edelman's theory together
with recent discoveries about the role of the right hemisphere in signed and spoken
language (Armstrong and Katz, 1981). A piece of evidence for this tight interaction
between left and right hemisphere in spoken language comes from experiments on
verbal fluency: in fact, we know that females score higher than males in verbal fluency
tasks (Maccoby and Jacklin, 1978), and we also know that females have less lateralized
language function than males. These data are corroborated by magnetic resonance
experiments on female subjects who scored particularly high in verbal fluency tests
(Hines et al., 1992): the results show that these subjects had a larger splenium, that is to
say, their brain hemispheres were better connected. These findings lead the authors to
the conclusion that "...the involvement of the right hemisphere in language processing
supports the theory that spatialization/object recognition underlies linguistic abilities"
(Armstrong, Stokoe & Wilcox, 1995: 104).
Nevertheless, the issue of language representation in the brain is still controversial.
Undoubtedly, massive injury of Broca's area affects speech production and
grammatical processing (see, e.g., Lenneberg, 1973), although the function of Broca's
and Wernicke's areas in speech production and decoding has undergone numerous
revisions, each of them inspired by a given linguistic model⁴². Edelman's theory itself
has been interpreted as consistent with both the associationist and the modular position
(global maps, in this last case, are seen as a representation of the different modules of
the brain). On the other hand, there is evidence that early left-brain damage may not
compromise language acquisition in children (Levy, 1969). Ironically, less
controversial evidence can probably come from indirect experiments aimed at
assessing the possible existing linkages between speech and action.

6.2. Gesture in the Brain: Experiment on Gesture-Speech Synchronisation in Multi-Tasking Activities

It is one of the major hypotheses of this book that speech and gesture are different
outputs of the same neuro-motor process, following Armstrong, Stokoe and Wilcox (1995)
and McNeill's hypothesis for the unbreakable bond between gesture and speech,
together with his hypothesis for language evolution, which will be here called the
multi-modal hypothesis. As already shown in the previous section, the results of
fMRI experiments aimed at assessing the neurological correlates of the basic
mechanisms underlying the production and perception of speech and gesture have so
far led to contrasting results.
Nevertheless, the hypotheses maintained here can find indirect corroboration in the
results of experiments aimed at gauging data on the synchronisation between speech
and co-occurring gesture or body movements. Both multi-tasking experiments and
gesture-speech synchronisation in deaf subjects, presented here and in the next chapter,
can serve this purpose. This chapter reports a multi-tasking study conducted in 2004 at
the Li.Co.T.T., Università del Piemonte Orientale. In this study, healthy participants
were asked to accomplish two simple tasks at the same time, such as reading and
imitating a given beat with their dominant hand. This type of
experiment proves to be particularly suitable for testing the ability to reproduce
different types of rhythm simultaneously.

6.2.1. State of the art

A number of studies have already tested the rhythmic correlation between speech and
gesture (Kendon, 1972); posture shiftings and speech (Scheflen, 1973; Condon and
Ogston, 1971); and rhythmic action and breathing movements (Kelso et al. 1981;
Hayashi et al. 2005). David McNeill's (1992) work with delayed auditory feedback
also reports a more than casual association between gestural disfluencies and speech
impediments, and suggests a profound motor linkage between action and speech. Nobe
(1996), on the other hand, found a close synchronisation between prosodic saliency and
gesture strokes. It has also been observed that subjects prevented from gesticulating
tend to intensify both the use of vocal gestures and the activity of facial muscles
(Rauscher, Krauss and Chen 1996), as a sort of unwitting discharge of motor activity.
This finding has led some scholars to claim that gestures during speaking and non-
verbal behaviour during speaking in general should be considered the unintentional
output of our neuromuscular system.

[42] See, for instance, the latest hypotheses following Chomsky's theory of language evolution.
64 6. Language in Action

6.2.2. Experiment setting

A multi-tasking experiment was designed in order to assess the linkage between action
and speech flow. In this experimental session, 10 subjects were asked to read two texts
(the first one in prose, the second one in poetry) while repeating a rhythmic beat given
by an experimenter. The beat was repeated on the table with the participant's dominant hand,
previously assessed by means of a standard lateralization test. The texts
were both in prose and poetry in order to assess whether the rhythmic patterns of
speech during poetry reading have some different effects with respect to prose reading.
The purpose of this study was to investigate the linkage between manual action and
speech. If the linkage were strong, all the subjects should show a significant disruption
of their manual action due to speech influence during multi-tasking.
The experiment was divided into two phases: during the first phase, the subjects
read the texts aloud to the experimenter. During the second phase, they were asked to
read while repeating a rhythmic beat on the table with their dominant hand. The rhythm
was suggested by the interviewer who had previously listened to their reading
performance and it was purposely asynchronous with the reading rhythm shown by
each subject during the first phase. Once the subjects learned the rhythm, the
interviewer ceased its reproduction and let the subject begin the reading task.
The texts constituting the stimulus of the test (i.e. Addio monti, from the novel I
Promessi Sposi by Alessandro Manzoni, and San Martino, from Rime Nuove by Giosuè
Carducci), reproduced in Figures 11 and 12, were chosen both because of their
pronounced rhythm (the former is in fact in rhythmic prose, the latter in settenari [43])
and because of their pervasiveness in Italian culture. Such characteristics, in fact, are likely to
cause chant-like reading.
If gesture and speech do not rely on the same neuro-motor process, then it is
reasonable to expect that all the subjects should succeed in the multi-tasking activity. If,
on the contrary, the subjects do not manage to perform the two tasks simultaneously,
then it is more reasonable to assume that gesture and speech are outputs of the same
neuro-motor process.
The data were collected by means of a digital camera with integrated microphone.
The digital video was subsequently captured with dedicated software that allowed the
extraction of the audio stream. In order to ensure accuracy in coding, the analysis was
conducted both on audio and video together and on audio only: the analysis of the
audio waveform was conducted with Praat, which allows for accurate segmentation of the
intervals occurring between the beats performed by the hand, and an even more
accurate and independent segmentation of the syllables in the speech string.
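
The segmentation of inter-beat intervals described above can be illustrated with a minimal sketch. It assumes that beat onsets have already been exported as a list of timestamps in seconds; the function names, the sample timestamps, and the 25% tolerance used to flag a lengthened interval are hypothetical choices made for illustration, not the coding criteria used in the study.

```python
from statistics import median


def inter_beat_intervals(beat_times):
    """Return the intervals (in seconds) between consecutive beat onsets."""
    return [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]


def flag_lengthened_intervals(beat_times, tolerance=0.25):
    """Return (index, interval) pairs for intervals that exceed the median
    interval by more than `tolerance` (a fraction of the median).

    The tolerance is an illustrative assumption, not a value from the study.
    """
    intervals = inter_beat_intervals(beat_times)
    threshold = median(intervals) * (1 + tolerance)
    return [(i, iv) for i, iv in enumerate(intervals) if iv > threshold]


# Hypothetical beat onsets: a regular beat with one lengthened interval.
beats = [0.00, 0.34, 0.68, 1.02, 1.36, 1.95, 2.29, 2.63]
lengthened = flag_lengthened_intervals(beats)
```

Using the median as the baseline keeps a single long interval from inflating the reference value, which a mean would do on short recordings.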

[43] A settenario is an Italian verse with lines of seven syllables. Whenever the verse seems to
be composed of more than 7 syllables, a poetic rhetorical figure of sound called synalepha is
adopted by the poet.
Addio, monti sorgenti dall'acque, ed elevati al cielo; cime inuguali, note a chi è cresciuto tra voi, e impresse
nella sua mente, non meno che lo sia l'aspetto de' suoi più familiari; torrenti, de' quali distingue lo scroscio,
come il suono delle voci domestiche; ville sparse e biancheggianti sul pendio, come branchi di pecore
pascenti; addio! Quanto è tristo il passo di chi, cresciuto tra voi, se ne allontana! Alla fantasia di quello
stesso che se ne parte volontariamente, tratto dalla speranza di fare altrove fortuna, si disabbelliscono, in quel
momento, i sogni della ricchezza; egli si maraviglia d'essersi potuto risolvere, e tornerebbe allora indietro, se
non pensasse che, un giorno, tornerà dovizioso. Quanto più si avanza nel piano, il suo occhio si ritira,
disgustato e stanco, da quell'ampiezza uniforme; l'aria gli par gravosa e morta; s'inoltra mesto e disattento
nelle città tumultuose; le case aggiunte a case, le strade che sboccano nelle strade, pare che gli levino il
respiro; e davanti agli edifizi ammirati dallo straniero, pensa, con desiderio inquieto, al campicello del suo
paese, alla casuccia a cui ha già messo gli occhi addosso, da gran tempo, e che comprerà, tornando ricco a'
suoi monti. Ma chi non aveva mai spinto al di là di quelli neppure un desiderio fuggitivo, chi aveva composti
in essi tutti i disegni dell'avvenire, e n'è sbalzato lontano, da una forza perversa! Chi, staccato a un tempo
dalle più care abitudini, e disturbato nelle più care speranze, lascia que' monti, per avviarsi in traccia di
sconosciuti che non ha mai desiderato di conoscere, e non può con l'immaginazione arrivare a un momento
stabilito per il ritorno! Addio, casa natia, dove, sedendo, con un pensiero occulto, s'imparò a distinguere dal
rumore de' passi comuni il rumore d'un passo aspettato con un misterioso timore. Addio, casa ancora
straniera, casa sogguardata tante volte alla sfuggita, passando, e non senza rossore; nella quale la mente si
figurava un soggiorno tranquillo e perpetuo di sposa. Addio, chiesa, dove l'animo tornò tante volte sereno,
cantando le lodi del Signore; dov'era promesso, preparato un rito; dove il sospiro segreto del cuore doveva
essere solennemente benedetto, e l'amore venir comandato, e chiamarsi santo; addio! Chi dava a voi tanta
giocondità è per tutto; e non turba mai la gioia de' suoi figli, se non per prepararne loro una più certa e più
grande. Di tal genere, se non tali appunto, erano i pensieri di Lucia, e poco diversi i pensieri degli altri due
pellegrini, mentre la barca li andava avvicinando alla riva destra dell'Adda.

Farewell, ye mountains springing from the waters, and elevated to the heavens; unequal summits, known to
him who has grown in your midst, and impressed upon his mind, as clear as the countenance of his dearest
ones; torrents, whose roar he recognizes, like the sound of familiar voices; villages scattered and glistening
on the slope, like flocks of grazing sheeps; farewell! How mournful it is the step of him who, grown in your
midst, is going far away! In the imagination of that very one who willingly departs, attracted by the hope of
making a fortune elsewhere, all dreams of wealth at this moment lose their charms; he wonders he could
form such resolution, and back he would then turn, but for the thought of one day returning in wealth. As he
advances into the plain, his eye withdraws, disgusted and wearied, by that uniform amplitude; the air seems
to him burdensome and lifeless; he sadly and listlessly enters the tumultuous cities; the houses crowded upon
houses, the streets that lead into streets, seem to rob him of his breath; and before edifices admired by the
stranger, he recalls, with restless longing, the little field of his village, the little house he has already set his
heart upon, long ago, and which he will acquire, returning rich to his mountains. But he who had sent
beyond those not even a passing wish, who had composed in them all his designs for the future, and is driven
afar, by a perverted power! Who, suddenly parted from his dearest ways, and disturbed in his dearest hopes,
leaves these mountains, to go in search of strangers whom he had never desired to know, and is unable to
look forward to a fixed time of return! Farewell, native home, where, indulging in unconscious thought, one
learnt to distinguish from the noise of common footsteps the noise of a step expected with mysterious awe.
Farewell, still stranger house, so often hastily glanced at, in passing, and not without a blush; in which the
mind figured a tranquil and lasting home of a wife. Farewell, my church, where the heart was so often
soothed, while chanting the praises of the Lord; where it was promised, prepared a rite; where the secret
sighing of the heart was to be solemnly blessed, and love to be commanded, and called holy; farewell! He
who gave you so much cheerfulness is everywhere; and He never disturbs the joy of his children, but to
prepare them for one more certain and greater. Of such a nature, if not exactly these, were the thoughts of
Lucia, and not so dissimilar those of the two other pilgrims, while the boat approached the right bank of the
Adda.

Figure 11: Addio monti (from the novel I Promessi Sposi by Alessandro Manzoni, ch. VIII) [44]

[44] English translation: http://ercoleguidi.altervista.org/manzoni/psch_8_4.htm
La nebbia agli irti colli        The fog the precipitous hills
Piovigginando sale,              pattering ascends
E sotto il maestrale             and under the northwest wind
Urla e biancheggia il mar;       hollers and bubbles the sea

Ma per le vie del borgo          but across the village streets
Dal ribollir de' tini            from the boiling vats
Va l'aspro odor de i vini        spreads the tart smell of wines
L'anime a rallegrar.             the souls to cheer

Gira su ceppi accesi             rolls upon flaring stumps
Lo spiedo scoppiettando:         the broach popping
Sta il cacciator fischiando      stays the hunter whistling
Su l'uscio a rimirar             at his door to regard

Tra le rossastre nubi            in between reddish clouds
Stormi d'uccelli neri,           flocks of black birds
Com'esuli pensieri,              like exile thoughts
Nel vespero migrar.              in that vesper migrate

Figure 12: San Martino (by Giosuè Carducci, Rime Nuove) [45]

6.2.3. Results

The results show clearly that all the subjects who took part in the experiment failed
in multi-tasking. In particular, the rhythm given by the experimenter was lost at the
very beginning of the reading task in order to synchronise with the concurrent speech
flow. In one case, the multi-tasking caused severe interference and made reading
particularly difficult. Moreover, the analysis of the data showed some interesting
phenomena that were not predicted in the pre-experimental phase: not only, in fact, did
the beats undergo several syncopes in order to synchronise with the concurrent speech
flow, but the latter tended to be adapted in speed to the rhythm imposed by the beats. In
particular, all the subjects but two showed an increase of speech rate during
multi-tasking (see Table 8) [46]. Interestingly, such an increase of speech rate is more evident
for the prose section. The phenomenon is probably due to an intrinsic difficulty in
designing a beat rhythm for the hand that could effectively interfere with the rhythm of
the poetry. The shortness of the chosen poem itself might have contributed to these
results, although a pilot study structured with two poems in the second session
produced the same results and was found exceedingly difficult by the subjects, the
majority of whom abandoned the task in the middle of the second poem because of
severe speech disruptions. Chant-like reading was noticeably higher in poem reading
than prose for all subjects.
Let us analyse the case of the first subject, which is particularly indicative of the
overall performance. Some screen-prints of her audio analysed with Praat are shown in

[45] English translation by the author of this book.
[46] The timings reported refer to each performance from speech onset to end. Pauses,
hesitations, and false starts are not subtracted from the timing count, since each subject
(strikingly) presented roughly the same number of reading errors during both reading only and
multi-tasking. The (p) symbol indicates a partial timing, corresponding to the reading of the first
paragraph or, in the case of S6, to the reading of the text until the word dovizioso (line 6 in
Figure 11).
Figures 13-20. As can be seen in Figure 13, the beats performed with the dominant hand are
quite regular before the reading, in both rhythm and intensity, each beat
phase [47] lasting about 0.340 seconds.
Nevertheless, the beginning of the reading task produces a syncope with an
expansion of the upbeat.

Table 8: Multi-tasking experiment. Overview of subjects' performances.

Subject   Prose only   Prose M-T   Poem only   Poem M-T
S1        2:30         2:11        0:16        0:20
S2        3:03         2:48        0:28        0:23
S3        2:28         2:10        0:17        0:21
S4        1:30 (p)     1:23 (p)    0:26        0:25
S6        0:51 (p)     0:39 (p)    0:23        0:23
S7        1:22 (p)     1:13 (p)    0:22        0:21
S8        1:25 (p)     1:15 (p)    0:20        0:23
S9        1:36 (p)     1:27 (p)    0:26        0:21
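
The timings in Table 8 can be compared condition by condition with a short sketch. It assumes that the timings are in minutes:seconds and that each partial timing (p) covers the same stretch of text in both conditions, so that a shorter duration under multi-tasking corresponds to a higher speech rate. The variable and function names are hypothetical.

```python
def to_seconds(timing):
    """Convert a 'm:ss' timing string to a number of seconds."""
    minutes, seconds = timing.split(":")
    return int(minutes) * 60 + int(seconds)


# Timings copied from Table 8: (prose only, prose M-T, poem only, poem M-T).
TABLE_8 = {
    "S1": ("2:30", "2:11", "0:16", "0:20"),
    "S2": ("3:03", "2:48", "0:28", "0:23"),
    "S3": ("2:28", "2:10", "0:17", "0:21"),
    "S4": ("1:30", "1:23", "0:26", "0:25"),
    "S6": ("0:51", "0:39", "0:23", "0:23"),
    "S7": ("1:22", "1:13", "0:22", "0:21"),
    "S8": ("1:25", "1:15", "0:20", "0:23"),
    "S9": ("1:36", "1:27", "0:26", "0:21"),
}


def duration_changes(table):
    """Return, per subject, the change in seconds from reading only to
    multi-tasking (negative values mean a faster multi-tasking reading)."""
    changes = {}
    for subject, (prose_only, prose_mt, poem_only, poem_mt) in table.items():
        changes[subject] = {
            "prose": to_seconds(prose_mt) - to_seconds(prose_only),
            "poem": to_seconds(poem_mt) - to_seconds(poem_only),
        }
    return changes


changes = duration_changes(TABLE_8)
faster_in_prose = [s for s, c in changes.items() if c["prose"] < 0]
faster_in_poem = [s for s, c in changes.items() if c["poem"] < 0]
```

On these figures every listed subject reads the prose faster under multi-tasking, while the poem timings move in both directions.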

This syncope seems to be caused by an inability to synchronise the beating task and
the reading one: the speech string, in fact, starts with a tri-syllabic word, addio (Eng.:
farewell), that has the main accent on the second syllable, so a perfect multi-tasking
performance would have seen the beat coordinated with an upbeat in speech, and the
relevant syllable of the speech flow coordinated with an upbeat of the hand.
On the contrary, the subject seems to lengthen the upbeat of the hand in order to
synchronise the beats and accented syllables. This is only partly successful, with the
vocalic center of the accented syllable DI partly pronounced in concurrence with a
subsequent upbeat of the hand.

Figure 13: S1 during the first nine seconds

[47] A beat phase comprises both beat and upbeat.
The synchronisation seems to be complete with the eleventh beat, where the
accented syllable of the word acque, noted in the transcript with the proclisis of the
article L', is synchronized with the beat of the hand (Figure 13). Nevertheless, some
syncopes are still visible, especially in concurrence with a hesitation in the speech flow
(Figure 15), or following a breath pause (Figures 14-18).
Figure 16 shows an interesting case of syncope in concurrence with different
phenomena: the first syncope recorded in the figure co-occurs with neither hesitation
nor breath pause. Nevertheless, the syncope in question follows the performance in
speech of one stressed and two atonic syllables and it is probably due to the attempt to
synchronise the following stressed syllable with the hand beat.

Figure 14: S1 during seconds 9-20

The second syncope is more dramatic: it follows one stressed and three atonic
syllables and tends to cover the utterance of the stressed syllable (the first one of the
word stanco) and three atonic ones. Still, interestingly enough, the accented syllable is
not uttered in concurrence with the beat, but with the upbeat. This phenomenon is quite
common in all subjects who took part in the experiment, and seems to signal a
synchronisation pattern, although not the most natural one. Anyhow, the beat-to-beat
synchronisation is reestablished with the subsequent beat phase that is perfectly
synchronized with the stressed syllable of the concurrent speech.

Figure 15: S1 at 30-40 seconds

Figure 16: S1 at 50-60 seconds

Lastly, an interesting phenomenon that can probably be
explained as an attempt to synchronise the speech flow and hand beats was recorded for
S1, and is reported in Figure 19 (highlighted segment): the picture refers to the
multi-tasking performance at seconds 150-160.

Figure 17: S1 at 140-150 seconds

The subject produced a speech error while reading the sequence al di là da quelli
(Eng.: beyond those, see Figure 19), and read al di là di quelli. After the error, a pause
is recorded (interpreted in Figure 19 as a hesitation pause) and the subject restarts in
order to provide the exact reading. The interesting phenomenon observable in this
particular case is that the hand holds the upbeat during the hesitation pause and restarts
with the repetition of the segment so as to allow a perfect synchronisation with the
accented syllable quel (Eng.: that). As already stated, the performance in the reading
of the poetry section did not allow for recording any particular phenomenon that had not
been observed during the prose.

Figure 18: S1 at 220-230 seconds

Nevertheless, chant-like reading is more liable to be recorded during multi-tasking
than when reading only, and is more marked during poetry reading than
during prose. Figure 20 shows the overall performance of S8 during the second part of
the recording.

Figure 19: S1 at 150-160 seconds

Chant-like reading is so marked for this subject that his F0 contour is particularly
regular in both curve and length throughout the entire reading. Moreover, this subject
showed an outstanding regularity in beat performances, with a noticeable adaptation of
the speech flow to the hand-beats. Nevertheless, the rhythm of hand beats breaks down
three times: the first one (highlighted in Figure 20) constitutes the most evident
disruption of a hand beat recorded in this experiment. The disruption happens in
concurrence with a lengthy complete breakdown of speech, consisting of a silent pause,
a false start with a brief hesitation and a restart. No beat is audible during hesitation and
restart; the first beat recorded is in synchronisation with the most prominent syllable of
the restored speech. Nevertheless, the video shows some rhythmic movements of the
finger in concurrence with the syllables of the false start. Figure 21 reports frame-by-
frame one second of the video, while the subject is experiencing the break-down. In
particular, the frames refer to the last beat of the hand in concurrence with the atonic
syllable of the word vini (frames 1-2), a long hold in the beat phase (frames 3-18) in
concurrence with a silent pause, and two upbeat-beat phases in concurrence with the
first (accented) and last syllable of the word anime (frames 20-25).
The fact that noise created by these beats is not recorded in the audio is due to the
particular weakness of the beats in question. Moreover, while the cases of syncopation
in other subjects are always caused by a prolongation of the upbeat phase, in this
subject's case and in those reported in Figure 20 they are all due to a hold of the beat
phase, that is, to a stronger disruption of the system.

Figure 20: S2's poetry part with multi-tasking

wines / the souls

Figure 21: S9 during hesitation and false start

6.2.4. Discussion and Further Research

The results obtained with the multi-tasking experiment are strikingly consistent
with the expected ones, especially as far as synchronisation between hand beats and
accented syllables in speech is concerned. The failure in multi-tasking seems to support
the hypothesis of a single neuro-muscular system controlling both the
performance of action and that of speech, perhaps attributable to the (still controversial)
presence of F5-homologues in the human brain, recently put forward by Arbib (2006).
Moreover, the unexpected influence that the hand beat is shown to have on the
speech rate reinforces this supposition and suggests a mutual psychological dependence
between manual action and speech, based on neurological biases. These results are also
consistent with the theories put forward by Edelman (1987) about perception and
global mappings, in so far as they suggest that the human brain is not modular, or at
least not completely modular. In fact, in a strictly modular system, different areas
would control different tasks allowing multitasking. The data collected so far appear to
be consistent with the hypothesis of a strong connection between manual action and
language, not only in terms of adaptation of the hand beats to speech, but also, and
more notably, in terms of a disruption of speech in adaptation to the hand beat.
This tight synchronisation and mutual influence is also consistent with the
hypothesis of a unitary language system in which gesture and speech appear to be
inseparable, precisely as put forward by McNeill (1992, 2005) and, in a different
fashion, by Armstrong et al. (1995) and Arbib (2006).
While the data discussed here can provide only indirect evidence for any theory of
language evolution, the results of this study appear to be more consistent with the
hypothesis that human language evolved as a joint and combined system of action and
speech (McNeill 2005) rather than as an exclusively gestural one (Armstrong et al.
1995). Further research replicating these results in fMRI experimental conditions could
provide additional evidence for understanding the relations between action, gesture and
language. In particular, it would be interesting to design the fMRI investigation in
question so as to have the same subjects undergo several brief tasks under magnetic
resonance, such as speaking only, reading only, beating only, imitating a beat given by
another person, and, finally, multi-tasking.
The next chapter reports the results of other experiments conducted on inborn deaf
subjects, which can serve as further indirect evidence for this still controversial


This chapter has addressed the question of the neurological correlates of the deep
linkage existing between gesture and speech, together with the major hypotheses about
the evolution of language. The question of the neurological correlates of language has
been more thoroughly addressed by means of fMRI and other metabolic studies aimed
at assessing the locus of the representation of linguistic objects. More recent fMRI
studies on the neurological correlates of gestures have also been presented.
Because this typology of studies usually refers to different models of mind and
brain structure, quite apart from the different hypotheses about the evolution of
language, the principal models concerning brain and mind structure and language
evolution have been presented and discussed, also bringing to bear recent discoveries
concerning mirror neurons in the human brain's BA 44 and BA 45 areas, roughly
coincident with Broca's area, in the left hemisphere. Both the neuroscientific results
and their numerous interpretations have been rather controversial. Nevertheless, the
theories of brain plasticity and the connectionist model on the one hand, and McNeill's
hypothesis of a multi-modal origin and evolution of language on the other, have been discussed
and adopted as possible frameworks for interpreting language as action. Because
metabolic evidence of the biological bases of language and gesture is often
controversial, indirect evidence of a strong linkage between manual action and speech
has been provided here by means of a multi-tasking experiment. Further indirect
evidence of this strong linkage between action and language, and gesture and speech is
provided in the following chapter.

7. Gesture in Deaf Orally-Educated Subjects: An Experiment

Non, ut arbitror, dubitas, quisquis ille motus
corporis fuerit, quo mihi rem quae hoc verbo
significatur, demonstrare conabitur, non ipsam
rem futuram esse, sed signum. Quare hic quoque
non quidem verbo verbum, sed tamen signo
signum nihilominus indicabit; ut et hoc
monosyllabum, ex, et ille gestus, unam rem
quamdam significent, quam mihi ego vellem non
significando monstrari. (Aurelius Augustinus
Hipponensis, De Magistro, Liber 1, 3.6).


In the previous chapter, the hypothesis of a single neurological foundation for speech
and gesture was proposed and tested by means of a multi-tasking experiment. This
chapter is aimed at further investigating the possibility of a single neuro-motor basis for
speech and gesture by means of an analysis of multi-modal spontaneous
communication in two congenitally profoundly deaf subjects. The study of gesture-
speech synchronisation in congenitally deaf subjects can be an important means of
shedding light on the cognitive origin of gesture, and its motor linkage to speech. In
fact, the experiment proposed in these pages was designed to assess whether co-verbal
gestures follow the same synchronisation pattern in inborn profoundly deaf subjects as
in hearing ones.
The existence of a uniform pattern for gesture and speech was first outlined by
Kendon (1980; 1986), who hypothesized a relationship between the Tone Unit and
Gesture Phrase. This synchronisation pattern, adopted and put forward by McNeill
(1985), who stated that "gestures are synchronized with linguistic units in speech"
(McNeill, 1985:360), was dismissed by Butterworth and Hadar (1989), who provided a
review of different studies on gesture/speech synchronisation that led to different
findings. According to the authors, these findings complicate McNeill's (1985) first
assertion that a gesture occurs at the same time as the related speech event, and they
refute the claim of universal synchrony between gesture and speech, even in the
minimal sense of temporal overlap [48] (Butterworth and Hadar, 1989:171).
In a reply to Butterworth and Hadar (1989), McNeill (1989) stated that, on the
question of temporal relations between gesture and speech, Butterworth and Hadar
(1989) failed to distinguish the successive phases of gesture production (Kendon,
1980) (McNeill, 1989:176). The gesture phases in question are preparation, stroke (or
peak: see McNeill, 1982) and retraction.
Of course, the different phases synchronise differently with the accompanying
speech. In particular, the preparation phase has been observed to slightly anticipate the
onset of semantically related speech (Bull and Connelly, 1985), while the stroke phase
has been found to end at, or before, but not after, the phonologically most
prominent syllable of the accompanying speech [49] (Kendon, 1980 quoted in McNeill,

[48] Emphasis mine.

The synchronisation pattern explained above may help further research on the
assessment of the cognitive and computational origin of gesture, and could be consistent with
other research carried out on the synchronisation of gesture and other non-verbal cues
with speech (Kelso et al. 1981; Hayashi et al. 2005; Kendon, 1972; McNeill, 1992; see
also the experiment presented in the previous chapter).
My hypothesis is that the observation of this synchronisation pattern in
congenitally deaf orally educated subjects would prove that this pattern is inborn and,
therefore, that it is not learned by imitation. In fact, if the above-mentioned
synchronisation pattern were acquired by imitation, an experiment on congenitally
profoundly deaf subjects would highlight some problems with synchronisation between
gesture and speech, since these subjects have no acoustic feedback of their speech.
Of course, such an experiment required congenitally profoundly deaf subjects with
neither acoustic prostheses nor cochlear implants. Although these essential requirements
complicated the selection of suitable subjects for the experiment, it was made possible
thanks to the kind collaboration of a deaf couple living in Atessa (Chieti), a small town
in Abruzzo, central Italy.

7.1. The experiment

A couple of inborn profoundly deaf orally educated subjects with no acoustic
prosthesis (in transcriptions, S1 and S2), aged 45 and 47 respectively, were
video-recorded in their house in a familiar situation. The participants in question are
both profoundly deaf, with an acoustic loss of 70 dB per ear, and were educated in a
special school in both spoken Italian and Italian Sign Language. Given the lack of
acoustic devices, the proficiency of these participants in Italian is rather poor, as their
phonetic output is mostly not understandable to non-specialists. For this reason, both
subjects rely mostly on co-speech gesture in order to convey their meaning. Because
these subjects happen to be the only profoundly deaf people within the small community of
Atessa, their communicative strategy cannot rely on Italian Sign Language.
Nevertheless, they seem to be integrated with the community and manage to have
near-normal social interactions with their fellow townspeople. Even so, the participants
admit avoiding social interaction, when possible, because of their condition.
The primary interviewer of the recorded encounter (in transcriptions, I1) was well
known to the participants, since she had been the teacher of one of their sons. The
experimenter (who is the author of these pages) was introduced as a friend, who was
interested in psychological studies. She asked and obtained consent to video-record the
meeting for her studies.
The recordings took place in the family kitchen. During the first 20 minutes, S1,
her mother, and her son were present. Later, S2 also joined the rest of the family. The
conversations were encouraged to be as spontaneous as possible: for this reason,
conversational turns often overlapped, which caused significant problems during
the transcription phase. The total length of the recorded session is 28 minutes.

[49] Emphasis mine.
7.2. Analysis of the Data

The results of data analysis for S1, S2, and I1 are shown in Table 9. As one may notice,
all gesture strokes were performed either in correspondence with or slightly before the
accented syllable of the speech flow the gesture occurred with. No strokes were
performed after the accented syllable. In particular, S1 produced 134 strokes out of 302
(45%) in correspondence with the accented syllable, 67 strokes (22%) before the
corresponding accented syllable, and 91 (30%) with no speech.

Table 9: Gesture in Deaf Subjects. Statistics

The synchronisation of 10 (3%) gestures with the corresponding speech was not
determinable due to overlapping turns and the participants' limited oral
proficiency. S2, on the other hand, produced 51 strokes out of 121 (42%) in
correspondence with the accented syllable, 23 strokes (19%) before the corresponding
accented syllable, and 47 (39%) with no speech.
Note that all strokes beginning before the accented syllable have been counted as
performed before it, even when they were held up to the conclusion of the
word/utterance they corresponded to. These results seem to be consistent with
Kendon's synchronisation pattern for gesture and speech.
Moreover, this synchronisation pattern has been verified for the main interviewer
of this experiment (in transcripts, I1). Her test results (see the complete list of the
performed gestures provided in Appendix 2) show the same synchronisation pattern
hypothesized by Kendon (1980) and observed in the deaf subjects. I1 performed 42
strokes out of 65 (64%) in correspondence with the accented syllable, 14 strokes (22%)
before the corresponding accented syllable, and 9 (14%) with no speech.
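The proportions reported above can be re-derived from the raw stroke counts. The following Python sketch is only an illustration of that arithmetic: the category labels and the `shares` helper are hypothetical names, not part of the author's coding scheme, and the S2 total of 121 strokes is the one given in the Locus and gesturing-rate analyses of this chapter.

```python
# Stroke counts by timing relative to the accented syllable, as reported in the text.
# Category labels are illustrative names, not the author's coding scheme.
STROKES = {
    "S1": {"with_accent": 134, "before_accent": 67, "no_speech": 91, "undetermined": 10},
    "S2": {"with_accent": 51, "before_accent": 23, "no_speech": 47},
    "I1": {"with_accent": 42, "before_accent": 14, "no_speech": 9},
}


def shares(counts):
    """Return each category's share of the subject's total strokes, in percent."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    for subject, counts in STROKES.items():
        total = sum(counts.values())
        summary = ", ".join(f"{c}: {p:.1f}%" for c, p in shares(counts).items())
        print(f"{subject} ({total} strokes): {summary}")
```

No category for strokes performed after the accented syllable is needed, since none were observed.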

7.3. Gesture in Deaf Subjects: Some Remarkable Phenomena

This section offers an analysis of interesting phenomena observable in the co-verbal
gestures performed by inborn deaf subjects. These characteristics are analysed in terms
of the parameters presented in Chapter 6, namely, Locus, Point of Articulation, Size,
and Gesturing Rate. Although not essential for the aims of the experiment presented
above, these phenomena prove to be useful in determining the characteristics of
gestural performance in deaf subjects.

7.3.1. Locus

Compared to the hearing interviewer, the deaf subjects performed their gestures at a
noticeably higher Locus. As shown in Table 9, these subjects tend to perform their
gestures at Upper Torso and Head: in fact, S1 performed 104 gestures out of 302 (35%)
at Upper Torso, 101 gestures (33%) at Head, 20 gestures (7%) at an intermediate area
between Upper Torso and Head, and only 45 gestures (15%) at Lower Torso. Note that
gestures whose intrinsic morphology (in the tables, IM) required that they be
performed at a precise Locus were not taken into account. Nonetheless, the ratio of
gestures performed at the Upper Torso-Head Loci to gestures performed at the Lower
Torso Locus is 5 to 1.
On the other hand, S2 performed 61 gestures out of 121 (50%) at Head, 24 gestures
(24%) at Upper Torso, 4 gestures (3%) at an intermediate area between Upper Torso
and Head, and only 8 gestures (7%) at Lower Torso. The ratio of gestures performed at
the Upper Torso-Head Loci to gestures performed at the Lower Torso
Locus is more than 10 to 1. These data differ remarkably from those of the main
interviewer, who performed 27 gestures out of 65 (41%) at the Lower Torso, 16
gestures (24%) at Head, 12 gestures (19%) at Upper Torso, and 6 gestures (9%) at an
intermediate area between Upper Torso and Head, with a ratio of about 1 to 1. Indeed,
as stated above, the intrinsic morphology of some gestures (i.e., some Emblems or
Iconic gestures) implies a specific Locus: the emblematic gesture for "mad" described in
Poggi's Gestionary (Poggi, 1997), for example, intrinsically requires that the hand
point repeatedly to the speaker's temple. Nevertheless, the gestures performed by the
two deaf subjects in this experiment show a generally higher Locus compared to those
by the interviewer.
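The ratios quoted above compare gestures performed in the upper regions (Head, the intermediate area, and Upper Torso) with those performed at Lower Torso. A minimal Python sketch of that comparison follows, using the counts as reported in the text; the dictionary keys and the function name `upper_to_lower_ratio` are illustrative assumptions.

```python
# Gestures per Locus as reported in the text. Gestures whose intrinsic
# morphology fixes the Locus are already excluded from these counts.
LOCUS = {
    "S1": {"head": 101, "intermediate": 20, "upper_torso": 104, "lower_torso": 45},
    "S2": {"head": 61, "intermediate": 4, "upper_torso": 24, "lower_torso": 8},
    "I1": {"head": 16, "intermediate": 6, "upper_torso": 12, "lower_torso": 27},
}


def upper_to_lower_ratio(counts):
    """Ratio of Head + intermediate + Upper Torso gestures to Lower Torso gestures."""
    upper = counts["head"] + counts["intermediate"] + counts["upper_torso"]
    return upper / counts["lower_torso"]


if __name__ == "__main__":
    for subject, counts in LOCUS.items():
        print(f"{subject}: {upper_to_lower_ratio(counts):.2f} to 1")
```

Computed this way, the two deaf subjects fall at 5 to 1 and above 10 to 1, while the interviewer stays close to 1 to 1, as stated in the text.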
An instance of this generalization is shown in Figure 22. In this case, the gesture
shown in ff.114-116, which is an Emblem for "almost", is performed at a significantly
higher Locus than normal. One explanation of the phenomenon in question could be
provided by the table on which the speaker's elbows rest, but the same phenomenon is
also recorded in other situations in which the table does not serve as a base for the
speaker's hands.

Figure 22: Locus in S1

A further example is provided in Figure 23, where a Metaphor with a
superimposed Beat performed by S1 is shown. Although S1's elbows do not rest on the
table, the gesture is still performed at a higher Locus than normal.
Figure 24 shows the same behaviour in S2: in this case, the subject performs at
Head an Iconic gesture whose intrinsic morphology does not imply a particular Locus.
The gesture is performed without leaning. A possible explanation for this particular
phenomenon is that the location of the addressee(s) can modify the speaker's use
of common space (see Özyürek, 2000), but the influence of the addressee's location on the
speaker's gesture size has never been tested. Furthermore, even if one assumes that
there is a correlation between the addressee's location and gesture size, this assumption
could explain only S2's behaviour: in fact, while S2 stood at a distance of about 2
meters from his main addressees, S1 displayed a comparable behaviour while sitting
right beside her addressee, at a distance of about 50 centimeters.

Figure 23: Locus in S1

My hypothesis is that the Locus raising observed in the deaf subjects is influenced
by Sign Language, for the majority of signs in Sign Language are characterized by a
place of articulation in the chest/neck area, as shown in Figure 25 [50].
In this figure, the Italian sign for "House" is shown: the place of articulation of
this gesture does not involve contact with any body part; still, the sign is performed
almost against the head.

Figure 24: Locus in S2

[50] This sign has been taken from the Italian Sign Language on-line dictionary:

A reason for this peculiarity of Sign Language is that signing needs to be quite
visible to the interlocutor, and it is thus usually performed at an intermediate place
between chest and neck, unless it has a required place of articulation.

Figure 25: place of articulation in Italian Sign Language: the case of house

7.3.2. Point of Articulation

As regards Point of Articulation, a striking difference between S1 and S2 was
recorded: S1 produced 242 gestures out of 302 (80%) whose major Point of
Articulation was the elbow, 44 gestures (15%) whose major Point of Articulation was
the wrist, 10 gestures (3%) whose major Point of Articulation was the shoulder, and 6
gestures (2%) whose major Point of Articulation was the finger; S2, on the other hand,
produced 64 gestures out of 121 (53%) whose major Point of Articulation was the
elbow, 55 gestures (45%) whose major Point of Articulation was the shoulder, and 2
gestures (2%) whose major Point of Articulation was the finger.
S2's behaviour shows a greater resemblance to that of the main interviewer, who
produced 40 gestures out of 65 (62%) whose major Point of Articulation was the elbow,
12 gestures (18%) whose major Point of Articulation was the wrist, 10 gestures (15%)
whose major Point of Articulation was the shoulder, and 3 gestures (5%) whose major
Point of Articulation was the finger. The principal difference is that none of the
gestures performed by S2 were articulated at the wrist, versus the 18% of the gestures
performed with this Point of Articulation by the interviewer.
The analysis of Point of Articulation is significant for the determination of
emphasis and mobility in gesture production: in particular, a gesture is claimed to be
more emphatic when it is articulated at the shoulder, while mobility is claimed to be
higher when the Point of Articulation is located at shoulder/elbow and changes
frequently. The analysis of Point of Articulation for the interviewer (I1) shows that her
gestures are not particularly emphatic, for only 15% of her gestures were articulated at the
shoulder; moreover, the majority of the gestures articulated at the shoulder were
deictics whose referent was far from the speaker (e.g., the location of a house in the town).
Furthermore, I1 showed a good degree of mobility, with 77% of her gestures
articulated at the shoulder/elbow, at a shoulder-to-elbow ratio of about 1 to 4. Moreover, all Points of
Articulation are used, which confirms the good mobility of I1's gestures.
The analysis of Points of Articulation in S2 shows a similar degree of mobility, but
a higher emphasis in gesturing: in fact, S2 performed 98% of his gestures with the
shoulder/elbow, at a shoulder-to-elbow ratio of about 1 to 1. This strikingly higher percentage of
gestures performed with shoulder/elbow reveals a higher degree of emphasis than in
I1's gestures; on the other hand, S2's mobility is slightly lower than I1's, since
he did perform a great number of gestures with shoulder/elbow, but did not use all
Points of Articulation.
The analysis of S1's gestures reveals a lower degree of both emphasis and mobility:
in fact, S1 performed the majority of her gestures with the elbow (80%), with only 3%
of her gestures articulated at the shoulder, at a ratio of 26 to 1. All Points of
Articulation are used, but the elbow prevails as the major one.
A further piece of information can be provided by the analysis of gesture size: this
parameter, together with Point of Articulation, helps determine emphasis and
mobility in gesture. Gestures, in fact, are claimed to be more emphatic when their average
size exceeds the normal average, while mobility is higher when the angle
determined by the moving part of the joint in the repositioning phase is greater than 5°.
Unfortunately, no studies have been devoted to the definition of average size and
mobility in gesture.
In a previous study of gesture in deaf and hearing subjects (Rossini, 2004a), the
normal average for gesture size was calculated on the basis of a 30-minute spontaneous
conversation between 7 subjects, 5 of them profoundly deaf and 2 hearing: the average
gesture size was found to be about 30° for an Italian hearing person. Unfortunately, no
further studies are available to confirm this result. However, the analysis of gesture size
in the experiment discussed in this chapter showed that the average size of
the gestures produced by the main interviewer (I1) is 33.41° (see Table 9): this
figure seems to be consistent with the result of the previous experiment mentioned above.
The average size of the gestures performed by the two deaf subjects is lower than
that observed in I1's gestures: in fact, the average size of S1's gestures is 25.73°, while
the average for S2 is 29.74°.
These data seem to be consistent with those related to Point of Articulation, since
S2's gestures are more emphatic than those performed by S1, while mobility seems to be
higher in the second subject. In particular, mobility seems to be influenced by Sign
Language, as is Locus, since a restricted Locus for gestures (place of articulation,
in the case of signs) implies a lower mobility of the limb joint.

7.3.3. Gesturing rate

Gesturing rate is another important parameter that highlights the differences between
deaf and hearing subjects: this parameter has been analyzed by means of two different
calculations: firstly, the relationship between total speech timing and the number of
strokes performed has been calculated for both the deaf subjects and the interviewer;
subsequently, the rate of strokes for a single kinetic unit has been determined. The
results are shown in Figures 26-27.
According to the results of the first analysis, the deaf subjects gesture much more
than the hearing interviewer. In particular, I1's conversational turns cover a time length
of 5.46 minutes (without taking into account silent pauses); during her conversational
turns, I1 performed 65 strokes, with an average rate of about 12 strokes per minute. S1's
conversational turns cover a time length of 9.16 minutes; during this time, she
performed 302 strokes, with an average rate of about 33 strokes per minute.

                      I1         S1         S2
Speech turns (TOT)    05.46      09.18      01.50
n. of strokes         65         302        121
Average rate          12/min.    33/min.    80/min.

Figure 26: Gesturing rate: analysis

S1's average rate is almost three times I1's. Finally, S2's
conversational turns cover a time length of 1.50 minutes, during which he performed
121 strokes: in this case, the average rate is about 80 strokes per minute, which means
that S2's average rate is almost 7 times I1's.
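The first calculation divides the number of strokes by the total speaking time. A minimal Python sketch of this arithmetic follows. It assumes that the durations are decimal minutes, which the text does not state explicitly: this is the reading that reproduces the reported rate of about 80 strokes per minute for S2. The names `TURNS` and `strokes_per_minute` are illustrative, and the S1 duration is the 9.16 given in the running text.

```python
# (speaking time in minutes, number of strokes), as reported in the text.
# Durations are read as decimal minutes; this is an assumption.
TURNS = {
    "I1": (5.46, 65),
    "S1": (9.16, 302),
    "S2": (1.50, 121),
}


def strokes_per_minute(minutes, strokes):
    """Average gesturing rate: strokes divided by speaking time."""
    return strokes / minutes


if __name__ == "__main__":
    baseline = strokes_per_minute(*TURNS["I1"])
    for subject, (minutes, strokes) in TURNS.items():
        rate = strokes_per_minute(minutes, strokes)
        print(f"{subject}: {rate:.1f} strokes/min, {rate / baseline:.1f} x I1")
```

Under this reading the rates for S1 and S2 come out at roughly three and seven times the interviewer's, matching the comparison made in the text.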



Figure 27: Gesturing rate: results (chart categories: I1, S1, S2)

The second analysis more clearly highlights the difference between the deaf
subjects and the interviewer in gesture.
In the transcripts, hand movements have been segmented into kinetic units, which
include the whole time span from hand movement onset to retraction phase. This
interval can sometimes, but not always, correspond to a single gesture phrase, which is
in turn divided into different phases, namely preparation, stroke, and retraction.
However, a single kinetic unit is frequently composed of several gesture phrases.
An example of this phenomenon is shown in Figure 28, which displays a kinetic unit
performed by the first interviewer of the experiment on gesture and Prototype Theory
(Chapter 5): in this case, the kinetic unit is composed of three gesture phrases, namely,
two conduits and one deictic.
This phenomenon is strictly correlated to the extent of the co-occurring speech,
namely, with the co-occurring Tone Unit (Kendon, 1986).
Moreover, gesture production is influenced by other factors, such as arousal in
formal situations, when gestures are more likely to be suppressed by the speaker (see
Chapter 5).
The analysis of kinetic units in the deaf subjects showed their tendency to perform
numerous gesture phrases within a single tone unit: in particular, S1 performed an
average of 2.5 gesture phrases per kinetic unit, while the average for S2 is about 4
gesture phrases per kinetic unit.
By contrast, the interviewer performed one gesture phrase per tone unit, but
this result may be partly due to the phenomenon of gesture suppression in formal
situations. The presence of the camera might have influenced I1's behaviour, at least
for the first minutes of conversation. The fact that the interviewer did not perform
complex kinetic units after these first minutes may be due to the length of her conversational
turns, which did not exceed 0.12 (for a complete table of conversational turns, see
Appendix II).
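The segmentation described in this section (kinetic units containing one or more gesture phrases, each divisible into phases) can be pictured as a small nested data structure. The Python sketch below is an illustration only: the class and field names are hypothetical, not the transcription format used in this study, and the phases of the individual phrases are left unspecified because the text does not list them for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Phase labels named in the text for a gesture phrase.
PHASES = ("preparation", "stroke", "retraction")


@dataclass
class GesturePhrase:
    gesture_type: str               # e.g. "conduit", "deictic"
    phases: Tuple[str, ...] = ()    # left empty when the phases are not transcribed


@dataclass
class KineticUnit:
    # Spans the time from hand movement onset to the retraction phase.
    phrases: List[GesturePhrase]


def mean_phrases_per_unit(units):
    """Average number of gesture phrases per kinetic unit."""
    return sum(len(unit.phrases) for unit in units) / len(units)


# The kinetic unit described for Figure 28: two conduits and one deictic.
figure_28_unit = KineticUnit(
    [GesturePhrase("conduit"), GesturePhrase("conduit"), GesturePhrase("deictic")]
)

if __name__ == "__main__":
    print(len(figure_28_unit.phrases))
```

Averages such as the 2.5 and 4 gesture phrases per kinetic unit reported for S1 and S2 would correspond to `mean_phrases_per_unit` applied to each subject's full list of units.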

Figure 28: an instance of kinetic unit composed of several gesture phrases


7.4. Why do we gesture? First Conclusions

The results of the experiment on deaf subjects confirm the hypothesis that gesture and
speech share the same cognitive-computational origin: in fact, both subjects show a
synchronisation between gesture strokes and the accented syllables of the co-occurring
speech, confirming the pattern hypothesized by Kendon (1980). The fact that the
subjects in question were also signers does not affect the reliability of the experiment,
since they used a mixed code made of co-verbal gestures and LIS signs re-adapted to
the conversational situation. Moreover, when using co-verbal gestures together with
speech, the subjects perfectly synchronized the events according to the
above-mentioned pattern. The fact that co-verbal gestures were used in perfect synchrony is a
problem for the hypothesis that the synchronisation pattern is a learnt phenomenon.
Besides, the fact that sign languages also show synchronisation between sign stroke
and articulatory gestures may be considered a piece of evidence in favour of the deep
correlation existing between movement and speech, both being dependent on a
complex neuro-motor system deeply involved in communicative acts.
Ultimately, the answer to some of the principal questions of this book, i.e., what is
the origin of gesture, and why are gestures embedded in our daily interactions, can
probably be found in the unavoidable nature of gesture. The hypothesis put forward here is
that gesture is closely bound neurologically to speech activity. Our brain structure, with
the close adjacency of the so-called Broca's area to the motor area, on the one hand,
and the mirror neurons discovered by Rizzolatti (see Rizzolatti and Arbib 1998), which
link both action and the perception of action to language mechanisms, on the other,
probably make limb movement unavoidable during speech. Yet, this statement is not to be
interpreted as evidence for the non-communicativeness of gesture (Rimé, 1982; Krauss
et al. 2000), but, rather, as a phylogenetic hypothesis: if one analyses gesture as a
prototype category according to a variety of parameters such as intentionality,
awareness, abstraction, arbitrariness, and extension (see 5.2.), one will find that there
is a particular class of gestures (i.e., beats) whose extension is not determinable, since
no lexical access is conveyed, and which are unintentional and performed without
awareness; thus, one
hypothesis about the ontogenesis of gesture is that our neuro-motor system does not
allow speech production without neuro-motor epiphenomena, such as head and limb
movements, or facial expressions. To some extent, then, this hypothesis is consistent
with Butterworth and Hadar's (1989) statement that gesture is a mere epiphenomenon
of the speech process. Nevertheless, Butterworth and Hadar's statement is true only
from a phylogenetic perspective, which they do not adopt in their study: as we speak,
our neuro-motor system produces articulatory movements together with limb
movements which are not necessarily communicative in nature. It is
possible, however, that the communicative function of limb movements during speech evolved
from the inevitability of body movements during speech, which is, in turn, due to
neurological phenomena: indirect evidence for this statement is provided by studies on
deictic gestures and other non-verbal interactional cues in primates (see Silberberg and
Fujita, 1996; Kendon 2009). A hypothesis for the evolution of the use of these limb
movements with a communicative intent is provided in Figure 29. Ultimately, the
hypothesis suggested here is that the limb movements produced by the human
neuro-motor system in concurrence with articulatory movements have been subsequently
used with communicative intents and have evolved as a means of signification together
with speech.

Figure 29: a hypothesis for the evolution of gesture as a communicative device (diagram labels: Neuro-motor system; articulatory movements, hence, limb movements; communicative limb movements with speech; deictics)

A relic of these movements may be seen in beats, usually unintentionally
performed. A further step is the use of limb movements while speaking to point at
objects present at the scene where the communicative act is being performed: these are
the deictic gestures that are perhaps the archetypical limb movements provided with
some sort of lexical access, and are in fact seen by a majority of gesture scholars as
proto-gestures (Condillac, 1746; Arbib, 2002; Tomasello, 2008, but see also Place
2000 and Corballis 2002 for the hypothesis that iconic gestures are more likely to have
developed first).
A further step is describing objects that are not present in the scene during the
communicative act: these are firstly abstract deictic gestures and, as a consequence,
iconic gestures, whose function is to reproduce sizes and shapes of objects, or paths
that are not completely visible. Metaphors, on the other hand, may constitute an
advanced stage, since they consist of the attribution of physical form to abstract
concepts. Lastly, emblems, having a precise lexical access that is culturally determined
and culturally shared, may represent the endpoint of this evolutionary process of human
This evolutionary model appears to be consistent with Kendon's Continuum
(McNeill, 1992), if the continuum is interpreted as a model of ontogenetic evolution of
the semiotic content of gestures. Moreover, the hypothesis proposed here is consistent
with the "multi-modal" hypothesis of the evolution of language put forward by
McNeill (1992 and in press): the profound neurological interlinkage between manual
action and the articulatory movements required for speech production cannot be accounted for
by any evolutionary hypothesis other than one based on the co-presence of speech and
gesture ever since the dawn of human history.


Summary

This chapter has offered the results of an experiment with two inborn deaf subjects
aimed at assessing whether the synchronisation pattern between gesture and speech
first described by Kendon (1980) is also observable in inborn deaf subjects. With this
aim, two subjects were video-recorded during a spontaneous 30-minute interview. An
analysis of the materials thus obtained showed that the synchronisation pattern for
inborn deaf subjects is perfectly consistent with that observed for hearing subjects. This
finding leads to the hypothesis that this synchronisation is not learnt, but attributable to
the human neuro-motor system.
As a consequence, gesture while speaking is probably unavoidable to some
extent, at least because of an ontogenetic property of our brain, probably the same
one that Edelman (2006) defines as degeneracy. It is then plausible to suppose that
gesture, in terms of movement of head, hands, limbs, and facial muscles, originates as
unavoidable and unwitting. This hypothesis is partially consistent with Butterworth
and Hadar's (1989) statement that gesture is a mere epiphenomenon of the speech
process. Nevertheless, this is true only from an ontogenetic perspective: as we speak,
our neuro-motor system produces articulatory movements together with limb
movements that are not necessarily communicative. The limb movements produced by
our motor system, together with the articulatory movements, came to be used with
communicative intent at a certain point of our evolution. A relic of these movements
may be seen in beats, which are usually unwittingly performed. Finally, some relevant
phenomena observed in the performance of the gestures produced by deaf subjects have
been shown and discussed.

8. Reintegrating Gesture: Towards a New Parsing Model

A gesture can have a very precise meaning in
a given context, but nothing remains of it
unless you filmed it, and apart from the fact
that both actor and spectator can remember and
perhaps repeat it. (Ferruccio Rossi-Landi,
Signs and Non-Signs, 290).


The considerations put forward in Chapter 5 and partly confirmed by the field studies
presented in Chapter 5, Chapter 6, and Chapter 7 lead to the hypothesis that the final
object of linguistic research should be expanded in order to provide a systematic view
of phenomena so far disregarded. In particular, the recorded interrelation between
verbal and nonverbal behaviour [51] leads to the position that the final object of linguistic
investigation should be the whole of human communicative behaviour rather than just a
part of it.
In this chapter, the Audio-visual Communication System is defined, and its structure
is described. Moreover, the question of recursion in language is addressed, in an
attempt to design an original model for the parsing and understanding of gesture and
speech signals, apparently the first in this field.

8.1. The Audio-visual Communication System

As already stated, the deep interrelations between verbal and non-verbal behaviour
suggest a restructuring of the model of language as the object of linguistic enquiry: in
effect, linguistic speculation has often focused on a rather small part of the human
communicative potential. A further step in linguistic investigation can thus be to take
into consideration the basic verbal and non-verbal phenomena of ongoing face-to-face
interaction, and the relation between speech and gesture as communicative devices
complementing each other. This complex system is structured into different levels
including sound (the result of complex neuromuscular processes) and gesture (the
reinterpretation of basic neuro-motor acts for meaning-conveying purposes).
This interpretation leads to an account of speech and gestures as different
embodiments of the same neuro-motor process and, subsequently, as a whole.
This whole will be hereinafter labeled Audiovisual communication [AVC], AVC
being the communication system observable in human face-to-face interaction.
To consider human language a complex, basically oral process means to analyse it
as a whole, and, subsequently, to claim that speech and gestures cannot be studied
separately, without first determining the main bases of the phenomenon they are related
to, for otherwise many of their properties, rules, and functions will not be evident.

[51] See the results of the multi-tasking experiments in Chapter 6 and the recorded
synchronisation between speech and gestures in deaf subjects in Chapter 7, both consistent with
the results of the experiments on D.A.F. presented in McNeill, 1992.

As one analyses this system, it is evident that messages are conveyed by
neuromuscular impulses, which control movements and sounds.
Sounds are ultimately definable as the results of precise neuromuscular processes.
We can attempt a description of such a phenomenon by stating that the idea to be
expressed activates the neuro-motor system, which produces articulatory gestures, on
the one hand, and gesture phases, on the other. Articulatory gestures will, in their turn,
form words by means of phonological oppositions, while gesture phases will combine
to form gesture phrases, or gestures. Words will combine into sentences, and gesture
phrases will form kinetic units. The sounds produced by the articulatory gestures also
give rise to intonation that is pragmatically relevant within the economy of language.
In particular, the AVC system seems to communicate by means of two levels, one
of which conveys meaning in a mainly simultaneous way, while the other does so in a
mainly linear way. Both levels are produced and perceived simultaneously with one
another. Let us analyse the first level with particular regard to gestures, which often
begin prior to the message conveyed by the second level (Kendon, 1972; McNeill,
1985). Gestures and other non-verbal phenomena have already been structurally
described by Birdwhistell who divided them into kineme, kinemorph, kinemorphic
class, and kinetic units, kinemes being the minimum psychological units of movement
provided with meaning. Kinemes should thus be the analogues to morphemes in speech,
which have been defined as the minimum psychological units of speech provided with
some meaning (Simone, 1999).
Nevertheless, although linguistic theories agree that a morpheme is the minimum
psychological unit, some problems arise if one attempts to assign it an
unambiguous meaning. A morpheme is identified by segmentation, which is made
possible by comparison between monemes, also known as lexemes (Simone, 1999),
although such segmentation does not always reach the goal of identifying an
unambiguous meaning for the morpheme in question. Morphology can be divided into
derivation and inflection. Some morphemes serve the purpose of providing a lexical
access (the base, or root of the lexeme in question), while other morphemes are
somehow combined with the root to form other lexemes, or monemes. Unfortunately,
the problem is even more complex, since there is not always a one-to-one
correspondence between morphemes and expressed meanings. For now, let it suffice to
cite the Italian pair shown in examples 1 and 2 below:

1- acr-e: sour

2- acer-rim-o: cruelest


In these cases, a synchronic segmentation is not possible. Scalise (1994) would
consider them as supplementary entries in the paradigm, because of a change in the
lexical access, so that, from a synchronic point of view, acerrimo is a suppletive
superlative for acre. Still, another solution can be found if one analyses examples 1
and 2 from a diachronic point of view, with its implication of morpho-phonological
variation. Italian is a Romance language, which conserves some phonological,
morphological and syntactic vestiges of Latin.
If one now analyses the examples above, the lexical morpheme, or root, can be
described as *ac_r: in example 1, the root *ac_r is presented in its zero grade, while
example 2 shows the normal grade (see Jakobson, 1933 for a broader discussion). Still,
the problem remains with the morpheme -rim-, which could be further analysed into -r-
(intrusion), -i- (thematic vowel), -m- (paradigmatic with other superlatives such as car-o:
dear, car-i-ssim-o: dearest). Simone (1998) also examines the opposition between
buon-o (good) and ottim-o (excellent): in this case, the paradigm seems to be
incomplete both in English and in Italian. Still, in colloquial use, Italian has developed
a new superlative for the adjective buono (i.e., buonissimo) which is analogous to
other superlatives. This phenomenon is usually defined as columnar analogy: since
suppletive forms require a computational effort to recall them, in many languages they
have been replaced with forms which follow the paradigm.
A final consideration is that Italian is an inflectional language: if we consider again
the example cited by Simone (1998), buon- is the lexical morpheme, while -o should
be a morpheme expressing both the singular and the masculine meaning. Inflectional
languages tend to express different meanings by means of the same morpheme. On the
other hand, other languages (see, for instance, Turkish) show a one-to-one
correspondence between morpheme and meaning conveyed. Linguists usually avoid
confusion by simplifying the problem: when morphemes cannot be clearly identified,
they are taken to be not positional elements, but factors.
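The point about buon-o can be made concrete by writing the segmentation down as data and checking whether each morpheme carries exactly one meaning. The Python sketch below is a toy illustration under that assumption: the representation of a lexeme as (morpheme, meanings) pairs and the function names are hypothetical, and the glosses are those given in the text.

```python
# Segmentation of the Italian adjective "buono" as described in the text:
# each entry pairs a morpheme with the meanings it expresses.
BUONO = [
    ("buon-", ("good",)),
    ("-o", ("masculine", "singular")),
]


def multi_meaning_morphemes(segmentation):
    """Return the morphemes that express more than one meaning."""
    return [morpheme for morpheme, meanings in segmentation if len(meanings) > 1]


def is_one_to_one(segmentation):
    """True when every morpheme expresses exactly one meaning."""
    return not multi_meaning_morphemes(segmentation)


if __name__ == "__main__":
    print(multi_meaning_morphemes(BUONO))
    print(is_one_to_one(BUONO))
```

For a language with a one-to-one correspondence between morphemes and meanings, every entry would carry a single meaning and `is_one_to_one` would return `True`.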
In such cases, linguists use the concept of morph, that is, a packet of phonetic
material conveying all the morphological information which cannot be segmented into
phonic material (see, e.g., the Italian lexeme è, which is the third person singular of the
present indicative of the verb essere, "to be"). For this reason, a description of speech,
and thus of language, cannot be limited to structure, although structural description is
important.
Given the utility of a synchronic description of that which takes place during
face-to-face interaction, we can extend the structural model in order to account for the
non-verbal phenomena involved. A structural description of the non-verbal subset of the
AVC is further complicated by the (generally assumed) lack of structure in the
sub-system itself. In fact, the idea that gestures and other non-verbal cues such as posture
shifting and gaze do not respond to an articulated code is almost unanimously accepted
and has constituted the strongest point in favor of those linguists who claim that such
non-verbal events are not to be considered linguistic. Nevertheless, given the fact that
communication is taking place by means of a code of some sort, a structural description
of such a code should be attempted, at least within the broader communicative system
this code is part of. Recently, some leading scholars within the realm of non-verbal
communication have been revising this model and introducing the concept of gesture
grammar (Fricke et al., in prep.).
In effect, both verbal and nonverbal behaviour, although not linear, are perceived
over time, which leads to the impression that they take place linearly. Speech, in
particular, seems to be more linear than nonverbal behaviour and gesture in general
because structural speculation has already provided scholars with the methodology to
decompose such a complex and parallel phenomenon into a linear and articulated one.
Rossi-Landi (1968) proposes a less structural model of language in overt opposition
to the concept of articulation, and double articulation in particular, as proposed by
Martinet (1956). The author underlines the idea of first and second articulation,
stating that the analysis of sentences into words, and the analysis of words into
monemes, morphemes and phonemes is the exact opposite of the activity of speakers in
the real process of linguistic production, which starts from disarticulated sounds. In
other words, analyzing language as Martinet does is an abstract analysis that misses a
real and more profound aspect of language, its social dimension.
In his book (Rossi-Landi 1968, chapter VI), he proposes, against the theory of
double articulation, a homological schema for linguistic production that is potentially
interdisciplinary: "The theory of articulation to be expounded here is new in two
respects: (i) it maintains that there are two more articulations to be taken into account
with regard to any language provided that language is not viewed as a machine in
isolation. (ii) The four levels of articulation are to be found not only in the field of
language, but also in the field of material production. The principle of economy which
allowed man to construct his languages was also applied to the construction of
nonverbal sign systems" (Rossi-Landi, 1992: 189).
Moreover, there are particular phenomena taking place during speech that are not
segmental. A good instance of this is intonation, which can be defined as a
superimposed melody, following precise rhythmic and tonal patterns conveying
important pragmatic functions, and adding relevant pieces of information. Intonation,
thus, still takes place over time, but is not articulated, and, moreover, takes place in co-
occurrence with words, so that it can be claimed to be simultaneous.

Figure 30: The Audio-Visual Communication System.

8.1. The Audio-Visual Communication System 91

On the other hand, we have gestures that also take place over time, and are
performed simultaneously together with speech. Moreover, although gesture can be
subdivided into smaller phases, which are linear, these phases are only performed to
position the hand to the locus where the gesture is to be performed, the stroke phase
being the only one which conveys the lexical access of the gesture. Furthermore,
gesture may not be articulated (some doubts about this issue remain), but can be
Lastly, intonation has important pragmatic functions and is closely interrelated
with gesture, especially in terms of synchronisation52. A model for speech and gesture
perception can thus be attempted following Massaro (1994), who stresses the
importance of simultaneous perception (see Figure 30).
According to this model, articulatory movements in the speaker would produce
both vocal and gestural outputs. These outputs are perceived and controlled by the
speaker himself and allow for synchronisation between the sender of the message and
its receiver. This synchronisation is mediated by a linguistic interpretation that
segments the parallel output into linear and separate channels: the vocal one, on the one
hand, and the gestural/behavioural on the other.
Of course, such a structural description is a simplification of the phenomenon of
perception itself, which is assumed to be much more complex. Nevertheless, describing
such a phenomenon as a structure helps to determine the range of simultaneous threads.
Shifting towards the static dimension of language, as it is conceived by
Ferdinand de Saussure (1917) and the scholars of the Prague School more generally,
AVC can be analysed into three levels, as follows:
• Surface level: the macro-semiotic level of both speech and gesture,
which can be further divided into utterances in Kendon's (2005) sense;
these are characterized by intonational patterns for speech, on the one
hand, and kinetic units on the other;53
• Meaning-conveying level: the level of the minimum significant pieces
that have the potential to stand independently: these are lexical entries, or
lexemes, being either words, on the one hand, or gestures, on the other;
• Basic level: the level of the minimum units of meaning regardless of
their potential independence. These are usually defined as morphemes,
from a purely formal point of view. Because morphemes are not always
clear-cut in linguistic form, the notion of morph is often introduced.
Morphs can be seen in both speech and gesture (for the notion of morph
in gesture, see McNeill 2005 and forthcoming). The basic level is usually
divided according to Hjelmslev's (1961) duality of patterning; this
addresses the classical opposition between morphemes and phonemes: for
speech, of course, the phonemes are constituted of phonetic material.
Gestures in turn rely on handshape, trajectory, and other "morphological"
features in space.
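
As a reading aid, the three levels can be pictured as nested records. The following Python sketch is only an illustration under stated assumptions: the class names (Morph, Lexeme, Utterance) and the sample feature labels are introduced here for convenience and are not part of the notation used in this book.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Morph:
    """Basic level: a minimum unit of meaning, in either modality."""
    form: str
    modality: str  # "speech" or "gesture"


@dataclass
class Lexeme:
    """Meaning-conveying level: a word or a gesture that can stand alone."""
    label: str
    modality: str
    morphs: List[Morph] = field(default_factory=list)


@dataclass
class Utterance:
    """Surface level: an intonational pattern (speech) or a kinetic unit (gesture)."""
    modality: str
    lexemes: List[Lexeme] = field(default_factory=list)

    def morphs(self) -> List[Morph]:
        # Flatten the hierarchy down to the basic level.
        return [m for lex in self.lexemes for m in lex.morphs]


# Hypothetical sample: a pointing gesture described by handshape and trajectory.
pointing = Lexeme(
    "deictic",
    "gesture",
    [Morph("D-shape", "gesture"), Morph("away from body", "gesture")],
)
unit = Utterance("gesture", [pointing])
print([m.form for m in unit.morphs()])
```

The same three classes serve both modalities, which mirrors the claim that speech and gesture can be analysed on the same levels.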

52 For the relationship between gesture and intonation, see Kendon (1980 and following)
and Cassell (1998).
53 For the relationship between intonation and kinetic units, see Kendon (1972, 1986).

8.2. About the Morphology of Gesture

As already noted several times (see also Chapter 2), some scholars have tried to
describe the morphology of gestures, basing their analysis on the principal assumption
that if gesture is perceived and recognized without error by the receiver,
notwithstanding idiosyncrasies in performance, then it may be described by means of
recurrent features, i.e., morphology.
Birdwhistell (1952) is probably the only researcher to attempt a description of
kinetics that resembles the formal model proposed by structural linguists for minimal
linguistic units. After Birdwhistell, research about non-verbal communication and
gesture focused on other topics and, when addressing the problem of morphology,
abandoned the idea of a parallelism between linguistic properties and non-verbal
ones. David McNeill, for instance, while claiming a single origin for gesture and speech,
avoids the hypothesis of a morphological description of co-verbal gestures (McNeill,
1992), but suggests that emblematic gestures may have some morphological traits
(McNeill, 1992, 2005). Nevertheless, in his 2004 book, McNeill also suggests that
some morphology may be possible for those gestures that he terms co-verbal, when
suggesting some form of morphology for emblems and metaphors, with reference to
McCullough's (2005) thesis of basic semiotic gestural components as somehow
opposed to Parrill's (2003) study pointing out that naive speakers do not recognise
restricted and fixed forms even in the case of emblems. The author has also recently
accepted the hypothesis of morphological features in some gestures, claiming that the
existence of morphs in gesture would not affect the dialectic of speech and imagery54.
Recently, the author has proposed the concept of morphs in the proper sense for
gestures (McNeill, in press), partially accepting the ideas of those scholars who
envisage the possibility of describing the morphology of gestures (see Rossini 2004a
for an attempt). The thesis of basic semiotic components is particularly consistent with
the hypothesis I put forward (Rossini 2004b) about the analysis of gesture. The
hypothesis in question combined the parameters for analysis proposed by David
McNeill (1992) and Adam Kendon (1972). Kendon's method of analysis is structured
as follows: during film analysis, two maps are made. In the first one, speech is
transcribed and changes of sound are recorded, while in the second a description of
movement changes (each of them labeled by means of a set of terms relating to
articulator function) is thoroughly matched with the speech segment they occur with.
He also introduced a set of fundamental concepts for the interpretation of gesture
morphology and a pattern for the analysis of gesture-speech synchronisation. In fact, he

54 morphological gestures may also engage language in a dialectic. There is a way this
can take place. The Neapolitan gestures comment on and/or regulate social interactions by
groups of interlocutors (Kendon). Such gestures add speech-act content to idea units and this
content becomes a component of the imagery-language dialectic. I understand this to refer to the
generation of idea units themselves. Imagine, for example, sequentially blending a mano borsa
(purse hand) with a PUOH or other metaphoric gesture. The idea unit, the very element of
thought, can then encode the conventionalized speech-act significance of the mano borsa
(roughly, insistent query or assertion). So one idea unit encodes a culturally specified way of
interacting while it is also about this discursive object; one idea unit existing on two levels
simultaneously. The dialectic produces a separation of levels. I believe layering is the means of
dialectically combing two morphological systems at the same instant. It arises because the
encoded form of the gesture morpheme asserts its identity in the dialectic, and layering is the
way that is accommodated. (David McNeill, personal communication, 2007).

analyzed gesture as composed of a nucleus of movement having some definite form
and enhanced dynamic qualities, which is preceded by a preparatory movement and
succeeded by a movement which either moves the limb back to its rest position or
repositions it for the beginning of a Gesture Phrase55; a Gesture Phrase being
composed of preparation, nucleus, and retraction/reposition. He also defined a
correspondence pattern between Gesture Phrases and Tone Units, or
phonologically defined syllabic groupings united by a single intonation tune.56
McNeill and his laboratory, on the other hand, developed a method for
transcription mainly based on the same assumptions as Condon and Ogston. In this
case, speech and gesture transcriptions are perfectly matched: the first line shows
speech flow, with symbols representing hesitations, breath pauses and laughter. Square
brackets clearly determine the part of speech flow a gesture is related to. Boldface in
transcription precisely shows the syllables synchronizing with each gesture stroke.
Right below each speech flow report, an accurate description of the gesture is made
following the same method as for A.S.L. transcription. As with Kendon's method, the
major parameter for McNeill's gesture analysis is timing. He divided gesture into
different phases, which he named preparation, (optional) pre-stroke hold, stroke,
(optional) post-stroke hold and retraction.
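
The ordering of phases just listed can be checked mechanically. The sketch below is a minimal illustration in Python, not part of McNeill's transcription method: the one-letter codes and the function name is_well_formed are assumptions made here, and only the two holds are treated as optional, as in the list above.

```python
import re

# One-letter codes are an assumption of this sketch, not McNeill's notation.
CODES = {
    "preparation": "p",
    "pre-stroke hold": "h",
    "stroke": "s",
    "post-stroke hold": "H",
    "retraction": "r",
}

# preparation, optional hold, stroke, optional hold, retraction
PHRASE = re.compile(r"ph?sH?r")


def is_well_formed(phases):
    """Return True if the phase labels follow the order given in the text."""
    try:
        encoded = "".join(CODES[p] for p in phases)
    except KeyError:
        return False  # unknown phase label
    return PHRASE.fullmatch(encoded) is not None


print(is_well_formed(["preparation", "stroke", "retraction"]))
print(is_well_formed(["stroke", "preparation", "retraction"]))
```

A transcription tool could run such a check on each annotated gesture before computing synchronisation with speech.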
A great contribution also comes from Sign Language scholars, who provided an
extensive description of phonology and morphology in signs (see e.g. Stokoe 1960,
1972). The classical parameters for the description of sign morphology, including such
elements as handshape, orientation, movement and position in gesture space,
have been recently adopted for the description of gestures (Bressem and Ladewig 2008),
with interesting results. In particular, Bressem (in prep.) proved that the gestures of
German speakers have standardized and recurrent forms.
An attempt to define interdisciplinary parameters for the description of the
morphology of gesture may thus start from the methods developed by these scholars.
According to my proposal, the morphology of gesture can be easily described by
means of the following parameters, based on a reinterpretation of those usually applied
for the description of sign language:
• Size: the angle determined by the moving part of the articulation with
respect to the horizontal plane (see Figure 31);
• Gesture timing: the gesture phrase (which begins with the hand onset and
ends when the hand goes back to rest position/reposition) should be
further divided into different phases, which will be noted in transcriptions
along with their timing. Gesture phases: each gesture is composed of
several phases, which are as follows: pre-stroke phase, which is the
preparation phase. It is defined as the phase in which the hand leaves the
rest position and reaches the area in which the meaningful part of the
gesture will be performed (usually the torso area; see McNeill, 1992).
During this phase, the hand may begin to acquire the shape needed for the
gesture performance; stroke phase, or the meaningful part of gesture.
During this phase the hand acquires the full shape needed for the gesture
performance, and the gesture size is maximum.
• Oscillations: the intrinsic morphology of some gestures requires repeated
strokes, or oscillations. In these cases, although the stroke phase covers

Kendon, 1986:34. Emphasis theirs.

the whole period, oscillations will be noted separately. This will help with
the determination of synchronisation patterns between Gesture Phrase
and Tone Unit; post-stroke phase, or retraction, when the hand loses the
configuration needed for the gesture performance and goes back to rest
position;
• Point of Articulation: the main articulator involved in the gesture movement;
• Locus: the body space involved in the gesture (see McNeill, 1992, based
on Pedelty, 1987). Locus will be identified by giving the name of the
body part whose space is involved in the hand movement, e.g., L:
lower torso. For further indications, see Figure 32.
Figure 33 displays a key to the main abbreviations used in transcripts in this book.

Figure 31: The determination of Size in gesture

The above-mentioned notion of gesture morphology, understood in its etymological
sense, that is, as an attempt to track some recurrent forms in performance, can be
further divided into intrinsic morphology (including handshape, and locus), which
conveys the meaning (or lexical access) of the gesture, and extrinsic morphology
(including gesture size, point of articulation, and timing variation), which may confer
more or less emphasis to the meaning conveyed by the gesture.
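
The parameters proposed above, together with the split into intrinsic and extrinsic morphology, can be gathered into a single annotation record. The following Python sketch is a hypothetical illustration: the class name GestureRecord, the field names, the units, and the sample values are all assumptions made here and do not reproduce the transcription key of Figure 33.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GestureRecord:
    handshape: str              # intrinsic morphology
    locus: str                  # intrinsic morphology, e.g. "lower torso"
    size_degrees: float         # extrinsic: angle to the horizontal plane
    point_of_articulation: str  # extrinsic: main articulator involved
    stroke_seconds: float       # extrinsic: timing of the stroke phase
    oscillations: int = 0       # repeated strokes, noted separately

    def intrinsic(self) -> Dict[str, str]:
        """Features that convey the meaning (lexical access) of the gesture."""
        return {"handshape": self.handshape, "locus": self.locus}

    def extrinsic(self) -> Dict[str, object]:
        """Features that add more or less emphasis to that meaning."""
        return {
            "size_degrees": self.size_degrees,
            "point_of_articulation": self.point_of_articulation,
            "stroke_seconds": self.stroke_seconds,
        }


# Hypothetical values, for illustration only.
record = GestureRecord("D-shape", "lower torso", 45.0, "wrist", 0.4)
print(record.intrinsic())
print(record.extrinsic())
```

Keeping the two groups apart makes it possible to compare two performances of the same gesture: identical intrinsic features with different extrinsic ones would count as the same gesture performed with different emphasis.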

As for the handshape, it can probably be divided into simple configurations that
could serve as minimum units, together with simple trajectories. A combination of
minimum handshape units and trajectories might constitute the signifier for those basic
semiotic components hypothesized by McCullough (2005).

Figure 32: Loci in gesture

Figure 33: Key to abbreviations

Some experiments could be run in order to assess whether a morphology in its strict
sense can be hypothesized for gestures as well: an fMRI study of brain behaviour in
subjects given the task of judging the correct form of familiar gestures and familiar
words, or judging whether the movements and strings of sounds can be considered
gestures and words could probably shed some light on the biological foundations of
perception and provide elements to confirm or dismiss the idea of gesture morphs.

8.3. Handling Recursion

Recursion is one of the most discussed properties of language (see e.g. Chomsky, 1957,
Chomsky and Miller, 1963, but also Simone, 1998). In fact, it has recently been
deemed to be the only distinguishing feature of human communication (Hauser et al.
2002). This claim has raised an interesting discussion with the reply by Pinker and
Jackendoff (2005) and the subsequent clarification by Fitch, Hauser and Chomsky
(2005). In particular, Hauser et al. (2002) distinguish between the Faculty of Language
in a Broad sense [FLB], that is supposed to be relevant to animals and man, and the
Faculty of Language in a Narrow sense [FLN], that is particular to humans. The only
distinction between FLB and FLN would be, precisely, recursion. Articulation, which
presupposes the recursion of a finite number of meaningful segments, is in fact
recognized as the only distinction between animal and human communication. Contra
both Hauser et al. (2002) and Pinker and Jackendoff (2005), but more in line with
Lieberman (2008), I will here assert that the debate about recursion in language arises
from a false premise, that is, from the hypothesis that recursion in its narrow sense
is a property that is exclusive to human language, or speech. In effect, if one analyses
speech from a synchronic perspective, the property of recursion, at least in a broader
(i.e., not formal) sense, will appear to be unquestionable, as a consequence of the
well-known principle of economy in language. Nevertheless, if one analyzes language as a
process rather than simply a code, then this property will only be true in theory, given
the limited capabilities of our working memory, while it happens to be perfectly true
for calculators, which are able to reapply the same rule to the results of its first
application without incurring a system error. This means that the operation is potentially
unlimited for calculators, while the structure of the human brain only allows for recurrent
items and rules, rather than recursive ones in a strict sense. If, on the other hand,
language is claimed to be recursive as a code, in the sense that the code itself is
structured by means of finite and recurrent parts, I suggest that gesture can also be
described as a recursive system. Moreover, if recursion is a label for any code built with
recurrent parts, then animal behaviour and communication can also be described as
recursive, insofar as animals are provided with working and long-term memory as well,
while relying on a finite number of behavioural patterns. Interestingly, Byrne (2003) has recently
come to the same conclusion. In this section, we will focus on human language and I
will explain how gesture can be recursive.

8.3.1. Existing Models

In order to understand the concept of recursion fully, we need to place it back in
the context of computational linguistics, within which the notion first arose.
As we know, computational linguistics is mainly devoted to language parsing, on
the one hand, and to the creation of speech simulators on the other. A vast variety of
parsers have been proposed since the 1950s, when linguists and other scholars first
devoted their attention to this field. It is impossible to summarise here the numerous
studies aimed at building interactive programs, although Weizenbaum's (1966) ELIZA,
also known as DOCTOR, is probably the most famous attempt to develop a program
based on both language parsing and language simulation. The program in question
simulates a psychotherapist in an online written interaction with a human patient. Ever
since then, a great number of formal models for both speech production and parsing
have been proposed. The first application of a formal model to speech (see Figure 34)
is due to Levelt (1989). According to this model, human sentence production follows a
modular pattern, with different stages. In this model, boxes represent processing
components, while circles and ellipses represent knowledge stores. Utterances begin as
non-language specific communicative intentions, in the Conceptualizer, whose
function is to determine the semantic content of the to-be-spoken utterance.
The preverbal message of the Conceptualizer is stored in Working Memory, and is
subsequently transmitted to the Formulator, where the lexical items are selected. The
Formulator also selects phonological representations for the representation of the
lexical items (Phonological Encoding), and generates the surface structure of the
utterance (Grammatical Encoding). What emerges from the Formulator is the
articulatory plan of the utterance, which can be sent to the Articulator. Finally, the
Articulator sends the instructions for the production of overt speech to the motor

Figure 34: Levelt's model (1989: 9)

This model has been recently customized in order to take into account gesture
production. I will only mention the attempts of Krauss et al. (2001) and De Ruiter
(2000), which are shown in Figure 35 and Figure 36, respectively.
As shown in Figure 35, Krauss et al.'s model is obtained by adding a separate
process for gesture production, which starts from the Working Memory and ends in
the Conceptualizer. This model is based on the presupposition that gesture is a mere
epiphenomenon of speech, and has to be considered to be a non-communicative
process, which is a side effect of speech production. The main problem with this model
is that no overt gesture production is considered. Moreover, gesture has a completely
separate planner, which has no direct feedback. It is also possible to hypothesise
that the motor planner, here kept separate from the Conceptualizer, should be
integrated with it, or at least have a connection to it. The impression provided by this
model is that the phenomena are serial: my hypothesis is that, on the contrary, the
phenomena involved in language production and perception are parallel.

Figure 35: Krauss et al.'s (2001: 34) model for speech and gesture production

On the other hand, De Ruiter's (2000) model is more complex: the Conceptualizer
has both a Sketch Generation, which sends a sketch to the Gesture Planner, and a
Message Generation, which sends the pre-verbal message to the Formulator. The
Gesture Planner receives the sketch from the Conceptualizer, draws material from the
Gestuary, which is the nonverbal counterpart of the Lexicon, plans the to-be-gesture
and sends instructions to the motor control, which originates overt movements. As one
can observe, the process of speech production on the one hand, and the process of
gesture production on the other, are still separate in De Ruiter's model, and work
simultaneously. Nevertheless, another model can be proposed.

Figure 36: De Ruiter, 2000: 198

8.4. Towards a Computational Model for AVC Parsing

Starting from the assumption that speech and gestures are different aspects of an
encompassing phenomenon, here called AVC, a formal model for the system is
needed. The model I propose is shown in Figure 37. In this model, both long-term
memory and working memory are placed in parallel. Moreover, they are deeply linked
and interdependent. The idea to be expressed is in the Conceptualizer, which, of course,
is linked to both long-term and working memory and sends the non-linguistic message
to the Formulator. The Formulator is composed of a Kinetic Encoder, which
controls the motor system, and of a Grammatical Encoder, which provides the
instructions to the Motor System in order to produce both overt gestures and overt
speech. As one may notice, no phonological encoder is placed inside the Formulator:
this is because the Kinetic Encoder should be able to send instructions to the Motor
System to control head and limb movements as well as the phonological apparatus. The
Kinetic Encoder and the Grammatical Encoder are interrelated and depend on one
another, as shown by the arrows. Note that the Gestuary and the Lexicon are kept
separate from Memory. This is done for consistency with formal rules: a hypothetical
calculator running this flow diagram would not retrieve the needed information unless
it were separately available. Nevertheless, both Gestuary and Lexicon are linked to
Memory (namely, both Working and Long-Term Memory), for they have to be
considered part of it.
The result of the input sent to the Motor system, which is here underlined because I
consider it a key passage, is overt gestural production. According to this model, if the
computational process beginning in the Conceptualizer and ending with overt gesture
production is interrupted by the re-application of the same rule, then AVC is recursive.

Figure 37: Computational model for AVC output

Still, the phenomenon can be analysed in further detail: in particular, Chomsky's
model for Sentence generation, which is shown in example 3, can be revised in order to
account for the production of both speech and gesture.
First, we need to define recursion: in particular, we need to distinguish between
recursion in a narrow sense [RN], and recursion in a broad sense [RB]. RN can be
defined as the possibility in a code of reapplying the same rule to the result of its first
application for an infinite number of times. As already stated, this property is typical of
programming languages, and is particularly evident in the behaviour of calculators.
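
The definition of RN can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the rule embed, the sample sentence, and the parameter memory_span are invented here. The generator reapplies a rule to its own output with no built-in limit, while the bounded variant simply stops after a fixed number of applications, standing in for any finite resource.

```python
from itertools import islice


def embed(sentence: str) -> str:
    """One rule application: embed the previous result in a new clause."""
    return f"I think that {sentence}"


def reapply(rule, start):
    """RN: the rule is reapplied to its own output, with no built-in limit."""
    current = start
    while True:
        current = rule(current)
        yield current


def reapply_bounded(rule, start, memory_span):
    """The same rule, cut off after memory_span applications."""
    return list(islice(reapply(rule, start), memory_span))


outputs = reapply_bounded(embed, "it rains", 3)
print(outputs[-1])
```

The generator never terminates on its own, which is the sense in which the operation is potentially unlimited for a machine; any actual run has to impose a bound from outside.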

Is human spoken language provided with this feature? My opinion is that it is
impossible to postulate it for human language, principally because of finite working
memory, the overload of which usually causes satiation (syntactic satiation, as
explained in Snyder 2000; semantic satiation as reported by Osgood 1988, etc.).
Human spoken language is thus more likely to be characterized by RB, since it is
constituted of a finite number of small segments that recur in sentence generation. In
particular, the sentence shown in example 3 shows different recursive features:
from a strictly formal point of view, the system is recursive every time a rule is
reapplied, which means that recursion appears every time a node such as, for instance,
NP is written both before and after the arrows.

3- The dress I bought yesterday shrinked

S::> NP VP
NP::> det NP
NP::> N S
S2::> NP VP
NP::> Pron
VP::> V
VP::> V
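
The formal criterion stated before the listing for example 3 (a node written both before and after the arrow) can be tested mechanically. The Python sketch below encodes the rules as printed, including the S2 label; the function names directly_recursive and reachable are assumptions of this illustration, and the second function adds a looser check, not discussed in the text, for nodes that lead back to themselves through other rules.

```python
# The rewrite rules of example 3, as (left-hand node, right-hand nodes).
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["det", "NP"]),
    ("NP", ["N", "S"]),
    ("S2", ["NP", "VP"]),  # label kept as printed in the listing
    ("NP", ["Pron"]),
    ("VP", ["V"]),
    ("VP", ["V"]),
]


def directly_recursive(rules):
    """Rules whose left-hand node also appears on the right-hand side."""
    return [(lhs, rhs) for lhs, rhs in rules if lhs in rhs]


def reachable(rules, start):
    """All nodes that can be derived from start in one or more steps."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for lhs, rhs in rules:
            if lhs == node:
                for symbol in rhs:
                    if symbol not in seen:
                        seen.add(symbol)
                        stack.append(symbol)
    return seen


print(directly_recursive(RULES))
print("S" in reachable(RULES, "S"))
```

On these rules the first check singles out the rule rewriting NP as det NP, and the second shows that S can lead back to S through the rule that rewrites NP as N S, which corresponds to the embedded clause.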

An instance of this phenomenon is visible at line 2 of the listing for example 3, where
NP::> det NP is a binary description of an NP composed of both article and other
material, which is still labeled NP. The other material to be recognized by the system
is an embedded sentence (usually marked as S2), which is dependent on the first NP being
parsed by the calculator. The sentence analysed in example 3 is recursive both formally
(for a rule is reapplied to the result of its first application, although the rule cannot be
reapplied an infinite number of times as would happen with recursion in a narrow
sense) and because of the linguistic property that allows the embedding of a new
clause in the principal one in order to obey the rule of minimum effort, which is a
structuralist principle57. It is, nevertheless, a case of RB. Let us now try to combine
sentence parsing with the other overt product of AVC, gesture. Take for example the
sentences shown in 4 and 5. With these sentences, there is a common expectation that
they will be accompanied by concurrent gesturing.

4- I'll take that sweater

5- that cat that sleeps on the couch, I don't like it

The sentences in question show various linguistic phenomena, including deixis,
focus, and reference. Let us now imagine that the sentence in Example 4 could be
accompanied by a deictic gesture in concurrence with the segment "that sweater". The
resulting signal would be as shown in Example 4a:

Example 4a

I'll take [that sweater]

D.: dominant hand in D-shape

palm down away from body

57 The principle of economy based on the structuralist principle of minimum effort is
considered in the Minimalist Program first suggested in Chomsky (1993).

An attempt to parse the sentence in Example 4a can be as follows:

S::> NP VP
NP::> Pr
VP::> V G[NP]
G[NP]::> Det G[NP]
G[NP]::> G[N]

In this case, square brackets indicate a specification of the sentence that is the
Holder, or a given segment of speech that acts as the anchor for further meaning
conveyed by the gesture.
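
The bracket notation lends itself to a simple mechanical reading. The Python sketch below is a hypothetical illustration: it copies the five rules given for Example 4a, splits each at the arrow, and collects the node labels that carry a gesture specification G[...], that is, the candidate Holders. The function names parse_rules and holders are assumptions introduced here.

```python
import re

# The rules given for Example 4a, copied as printed.
LISTING = """\
S::> NP VP
NP::> Pr
VP::> V G[NP]
G[NP]::> Det G[NP]
G[NP]::> G[N]
"""

# A node of the form G[X]: X is the speech segment anchoring the gesture.
GESTURE_NODE = re.compile(r"^G\[(\w+)\]$")


def parse_rules(text):
    """Split each line at the arrow into (left-hand node, right-hand nodes)."""
    rules = []
    for line in text.splitlines():
        if not line.strip():
            continue
        lhs, rhs = line.split("::>")
        rules.append((lhs.strip(), rhs.split()))
    return rules


def holders(rules):
    """Plain node labels that appear with a gesture specification G[...]."""
    found = set()
    for lhs, rhs in rules:
        for symbol in [lhs, *rhs]:
            match = GESTURE_NODE.match(symbol)
            if match:
                found.add(match.group(1))
    return found


rules = parse_rules(LISTING)
print(sorted(holders(rules)))
```

In this listing the gesture specification attaches to the Noun Phrase and, within it, to the Noun, while the Determiner is left unmarked.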
Given the particular relevance of the gesture, which in such cases is perhaps the most
significant part of the communicative segment, the Holder can be interpreted as a
feature of the concurrent gesture. Moreover, the presence of a deictic gesture in
concurrence with the second Holder confers on the sentence a definite pragmatic
meaning, that is, it establishes the Focus of the predication by indexing it in the real
world. On the other hand, the sentence reported in Example 5 could even be performed
with two concurrent gestures: a deictic synchronized with the Noun Phrase, and an
Emblem synchronized with the Verb Phrase. The result would be as shown in Example
5a:

Example 5a
[that cat that sleeps on the couch] [I don't like it]
D: dominant hand, D shape, points

E: dominant hand in D-shape palm away from body

oscillates several times from left to right

The result is even more complex than that in Example 4a: an attempt to parse it by
providing an account of both speech and gesture is as follows:

S::> [G] NP VP
[G] NP::> [G] [0 S]
[G] S::> [G] [NP VP]
[G] NP::> [G] Rel
[G] VP::> [G] V PP
[G] PP::> [G] [Prep SN]
[G] SN ::> [G] [Det N]
S::> [G] NP VP
[G] NP ::> Pron
[G] VP::> [G] Neg VP
[G] VP::> [G] [V NP]
[G]NP::> [G]Pron

In this case, the gestures performed in concurrence with the speech string are
interpreted as features of the segment of speech they synchronise with, since the
function they have is more likely to be a pragmatic one. The first gesture, a deictic,
can in fact act as a further index in synergy with the Determiner by defining the object
in the real world. This redundant repetition of the referential linguistic act probably has
a further function, that is, of individuating the focus of the predication; the second
gesture, an Emblem, would probably share the illocutionary function already expressed
in speech by the negative adverb. The result is a particularly emphatic sentence that
would probably have a marked prosody, in terms of voice quality and/or
intonation pattern. The further pragmatic sense added by such features is self-evident.
Let us now move a step further towards the analysis of a sentence where gesture
provides parallel independent pieces of information. As already pointed out in Chapter
4, evidence has already been provided for the function of co-verbal gestures in face-to-
face interaction: it is well known, for example, that the information provided by
gestures can integrate with that provided in the speech signal even though no direct
anchoring to that information is made in speech. Instances of such a phenomenon are
already presented in Cassell et al. (1999)58. As shown in Cassell et al., the listener takes
into account the information conveyed in gestures and integrates it into his personal
reprocessing of the message received. Interesting instances of such a reinterpretation by
the listener were also found in the data recorded at the University of Chicago, within
the experiment on the intentionality of gestures discussed in Chapter 5. The two
instances here discussed were recorded during the third phase of data collection, when
the subjects, in pairs, were asked to solve a guessing game provided by the interviewer.
The interesting characteristic of the examples examined here is that
they were not prompted in any way and the subjects are not reproducing a stimulus of
any sort. The particular emphasis and vividness of the gestures reported (see Figure 38
and Figure 39) are thus perfectly spontaneous. Both figures record performances
by the same subject, indicated in transcripts as S1, who is attempting, with his
interlocutor, to solve the problem proposed by the experimenter. As already
stated in Chapter 5, the guessing game consisted in reconstructing a story starting from
its final scene, which is described by the interviewer. The final scene, in this case, is as
follows for all the subjects: "There's a room with an open window. On the floor, you
can see shards, water, and Romeo and Juliet lying dead. What happened?". The
recorded segments in the figures are both related to attempts by S1 to provide a
sensible explanation for this final scene.
These instances are particularly interesting because they provide a vivid snapshot of
the capacity of the listener to integrate information on-line, while communication is
running. In particular, both the examples are good instances of synchronisation
between speaker and listener: in both cases, S2 uses her turn to expand upon S1's
sentence. Nevertheless, Figure 38 shows a case in which the additional information
conveyed by S1's gesture is overruled by S2. Note that S1 performs the gesture
conveying additional information in concurrence with a five-second pause indicating
the end of his conversational turn.
An attempt to provide a formal description of S1's bimodal production relative to
the segment of speech "[he] killed himself with the glass" + iconic could be as follows:

See 4.2 for a discussion.

S::> [G] [NP VP]

[G]NP::> 0
[G]VP::> [G] [VP Prep P1]
[G]VP::>V[0] ([G]Prep P2)
[G]Prep P2::> Prep [G]NP
[G]NP::> Det N[0]

[G]Prep P1::> G

Figure 38: instance of on-line integration of the verbal and non-verbal modalities by the speaker

This solution is more likely to be appropriate for the description of a sentence
where gestures convey additional information but are not directly anchored by speech.
In this case, gestures are assumed to be a mandatory specification of the Sentence node,
and, more specifically, a feature of the S rule. In other words, the rule S is interpreted
as generating two phrases and a kinetic unit (G) that is a specification of the two
phrases in question [note 59]. Each phrase is subsequently rewritten with the realization of its
implied specification. When the feature is not recorded in the multi-modal signal, its
absence will be recorded as shown in lines 4 and 7. The additional information
conveyed by the gesture is represented by adding a further node, in this case a second
Prepositional Phrase, noted between parentheses. Such a passage, though questionable,
is made possible by treating the beginning of the listener's dialogue turn (in this case,
S2) as a potential explanation of S1's speech, at least for the proposed speech structure.

Note that according to the minimalist approach one of the phrases is considered to be a
specification of the other. According to some scholars, the Verb Phrase would be the head, while
the Noun Phrase would be its modifier. Nevertheless, this interpretation is not unanimously

This way, it is possible to make formally explicit the elements that are implicit in S1's
In this particular case, therefore, the node Verb Phrase at line 4 is potentially
modified by a Prepositional Phrase (Prep P1) with a second Sentence node embedded in
it. Nevertheless, this potential Prepositional Phrase is expressed in S1's performance by
means of a gesture. This means that a specification, already implied in the generation of
the Phrase in question, replaces it.
The same phenomenon is recorded in Figure 39. In this case, S2 shows a perfect on-line
synchronisation with S1 and, again, exploits her conversational turn in order to
expand S1's sentence. In doing so, she chooses to accept the suggestion conveyed in
S1's gesture by making it explicit.
Again, a possible analysis of S1's sentence ("[he] grabbed Juliet") that could take
into account the information conveyed in the iconic gesture is as follows:

S::> [G] [NP VP]

[G]NP::> 0
[G]VP::> [G] [VP (Prep P)]
[G]VP::> [G]V [G]NP
[G]NP::> [G]N

Prep P[G]::> G

Figure 39: instance of on-line integration of the verbal and non-verbal modalities by both speaker and listener

In this particular case the syntactic model adopted for the synchronic description of
the AVC system is shown to be particularly apt at representing the complexity of the
phenomenon in play: in fact, in this particular instance, the specification expected to be
produced with the Verb Phrase is actually performed and synchronizes with it. Still, the
lexical access of the gesture provides further information about the inner representation
of the action the speaker is trying to convey. In particular, an intricate blend of Manner
and Path for the action in question, expressed in speech with a general verb, is fully
depicted. Such information is so relevant that it is unpacked by S1's interlocutor in her
dialogue turn. Since her speech production is a complex Prepositional Phrase with an
embedded Sentence that modifies the Verb Phrase node, I have chosen to add
this Phrase to the syntactic structure of S1's sentence, by hypothesizing its implicit
production represented in gesture form. As a consequence, the sentence reported in
Figure 39 is shown to have a syntactic structure where a specification already implied
by the application of the rule Prep P, that is G, replaces the content of the rule itself.
This phenomenon can be interpreted as an instance of recursion. Of course, recursion
here is to be understood in a weak sense. Other instances of stronger recursion, with
some suggestion of articulation, can be traced to instances where not only do gestures
completely replace the speech signal, but they are also performed within the syntax.
This phenomenon is usually recorded with emblematic gestures, but is not impossible
with other types of gestures such as deictics. An instance of "gestural syntax" was recorded
in the data available from the experiment conducted at the University of Chicago,
already presented in Chapter 5, and is shown in Figure 40.

Figure 40: case of gestural syntax


In this case, a string of two emblematic gestures is performed with no concurrent
speech. Still, some sort of communication is taking place, despite the lack of speech
production. The analysis of the utterance recorded this time for S2 could be as follows:

S::> [G] [NP VP]

S::> [0] G
G::> G G

An instance of strong recursion in the gesture sub-module is recognizable, since the
rule G is reapplied to the result of its first application.
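
The reapplication of rule G can be made concrete with a small rewriting sketch. The Python below is only an illustration under stated assumptions: the dictionary `RULES` and the helpers `rewrite_leftmost` and `derive` are hypothetical names introduced here, and the symbol "0" stands for the unrealised speech component, as in the rule S ::> [0] G.

```python
from typing import Dict, List

# Hypothetical encoding of the two rules given for the gesture-only utterance.
# "0" marks the unrealised speech component; "G" is a gesture unit.
RULES: Dict[str, List[str]] = {
    "S": ["0", "G"],  # S ::> [0] G
    "G": ["G", "G"],  # G ::> G G  (the right-hand side contains G again)
}


def rewrite_leftmost(symbols: List[str], lhs: str) -> List[str]:
    """Replace the leftmost occurrence of `lhs` with its right-hand side."""
    i = symbols.index(lhs)  # raises ValueError if `lhs` is absent
    return symbols[:i] + RULES[lhs] + symbols[i + 1:]


def derive(g_applications: int) -> List[str]:
    """Expand S once, then apply the G rule `g_applications` times."""
    symbols = rewrite_leftmost(["S"], "S")
    for _ in range(g_applications):
        symbols = rewrite_leftmost(symbols, "G")
    return symbols


if __name__ == "__main__":
    print(derive(1))  # one application of the G rule: a two-gesture string
    print(derive(2))  # the G rule applied again to its own output
```

With one application of the G rule the derivation yields a string of two gesture units with no speech; every further application lengthens the string, which is the sense in which the rule can be called recursive.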
Of course, this analysis is not meant to be exhaustive. Further research and theoretical
speculation are needed in order to outline a structural model of Audio-Visual
Communication. Nevertheless, this attempt to apply the basic formal model to the system
as a whole should at least have suggested that recursion can be traced in any code, at
least to some extent, and that non-verbal communication is in fact based on a code of some
sort. On the other hand, this analysis should also highlight the complexity of the bimodal
system, which can hardly be fully described by a linear model. The functions of
gestures and other non-verbal phenomena within human communication are
nevertheless undeniable. Disregarding them leads to an impoverishment of the
linguistic analysis.

Summary


This chapter has addressed some key theoretical points about the reinterpretation of
language as essentially multi-modal. A subset of language, here defined as the
Audio-Visual Communication [AVC] system, has been analyzed from a structural and thus
synchronic perspective. The hypothesis of a morphology in gesture, not only in
symbolic gestures but in all types of gestures, has been put forward. The AVC system
has also been analyzed from a formal perspective, with a presentation and discussion of
previous attempts. Finally, the question of recursion in language and gesture has been
addressed within the wider system here proposed by means of examples. Now that the
system in question has been analyzed and described in formal linguistic terms, we are
ready to address some of its other functions which are often disregarded in linguistic
analysis: the planning and self-orientational phenomena thus far almost exclusively
considered within a psychological framework.

9. Private Language

"Language is a system of orienting points,
necessary to act in this world. This system can be
used for self-orientation or for the orientation of
others: this difference is not a matter of
principle." (Alexander A. Leont'ev)


This chapter addresses the values and functions of language, and thus of speech and
gesture, not just from a communicative and interactional viewpoint, but as means for
self-orientation and self-organisation of thought. The enquiry conducted by Piaget,
Vygotskij, and Lurija on the topic constitutes a frame of reference from which we will
start our investigation. The theoretical part is here combined with an analysis of data
from an experimental study.

9.1. State of the Art

As suggested so far, the classical conception of language as an exclusively spoken
phenomenon is reductive, but new theories have been put forward which involve
actions and manual (or visible) gestures as phylogenetically and ontogenetically
linked to the primary mechanisms allowing both perception and the capacity of
expressing mental concepts. According to these theories, language is to be considered a
physical phenomenon not only because of the physical material (i.e., sounds and/or
movements of the limbs) that language makes use of for the expression of mental
contents, but also and above all because of the physical grounding of concept
formation (Edelman, 1989; Armstrong, Stokoe and Wilcox, 1994). These theories
constitute a solid starting point for the reinterpretation of gesture as a determinant piece
of human language. The previous chapter has provided an analysis of the role of
gesture within communicative acts, and an overall description of the audio-visual
communication system. Nevertheless, a still rather neglected aspect of language is its
self-orientational and private function. A study of gestures and other non-verbal cues
in face-to-face interactions with blocked visibility addresses these issues. The role of
language as a private phenomenon having to do with planning and self-directional
functions has been addressed only sporadically as far as linguistic studies are
concerned. As stated in the introduction, Leonard Bloomfield (1933) is the only well-
known linguist to devote some attention to the self-directional side of language, in
stating that thought is no more than communicating with oneself. Nevertheless,
psychologists such as Piaget (1926), Vygotskij and Lurija (1930) have thoroughly
addressed the self-orientational function of language. Piaget (1926) was the first to coin
the term "egocentric speech" for a particular stage in communication development that he
observed in 3-to-5-year-old children. Piaget's interpretation of the phenomenon of
self-oriented speech in children involves a supposed inability to take the interlocutor's
point of view, which causes an egocentric perspective. Vygotskij and Lurija (1930) divide

speech into different categories, such as external speech, egocentric speech, and inner
speech. External speech is the communicative, social, and external manifestation of
language that is acquired by imitation; during the acquisition process, speech is used by
the child in order to organize his behavior. This step, which has no communicative
intent, is defined as egocentric speech. The latter is subsequently internalized and
evolves into inner speech, or the equivalent of thought. Because of a confusion
between Vygotskij and Lurija's (1930) definition of egocentric speech, which has no
communicative intent, and its interpretation provided by Piaget (1929), the authors
subsequently renamed this step of the ontogenetic evolution of language "private
speech" (Vygotskij, 1961). Of course, the question of whether language arises first as a
communicative device or rather as a private phenomenon is still debated.
Nevertheless, it is possible that both aspects of the linguistic phenomenon take place
simultaneously, both during the language acquisition process of the child and during
Although some studies address the private and self-directional function of language
in children, less is known about the use of language for self-direction and planning in
adult subjects. Vygotskij and Lurija (1930), for example, seem to imply that the further
evolution of egocentric speech into private or inner speech, which is silent, does
not allow for the use of some sort of egocentric speech in adulthood. The data
presented here show the contrary.

9.2. The Map-Task Experiment

In this chapter, the role of gestures in conditions of blocked visibility, such as the
map-task experiment, is addressed. This experiment is structured so as to have dyads
of interactants giving and following route directions on a map. The two participants
sit facing each other, with an artificial wall that completely blocks the visibility of
the other person. The participants are not warned in advance that their
maps are partially mismatched, so as to add further cognitive load to their task. The
condition of facing the other without being able to see him/her, together with the
cognitive load placed on the interactants by the task of synchronizing to mismatched
maps, makes possible the isolation of recurrent features, such as alterations in the role
of posture and gaze with respect to normal conditions, and results in a high production
of "planning" gestures. This typology of interaction is particularly suitable for
investigating the private and self-directional function of language: the speakers, in fact,
are in the presence of the other person, in a particularly interactive situation. The
co-verbal gestures observable in this condition are not derived from imagistic short-term
memory, as happens with cartoon story retelling (McNeill 1992), but rather from
self-orientation in space and planning.
The map-task experiment was originally conducted at the Li.Co.T.T., Università
del Piemonte Orientale, within a national project aimed at assessing the colloquial uses
of motion events in Italian. Because all sessions were video-recorded, an analysis of
gestural and non-verbal cues is also possible, and will be presented in these pages. The
data analyzed here were collected by Monica Mosca. The corpus consists of 4 hours, 5
minutes, and 22 seconds of map-task conversations, with a total of 44 participants. The
data collected were analyzed separately for speech by two coders. As for the non-verbal
part, the same researcher transcribed the whole corpus twice under blind conditions. In
particular, the rater was prevented from accessing the information determined during
the data collection. This information concerned, for instance, the degree of
acquaintance between the interactants, their education, personal details, and handedness.
The measures adopted were aimed at ensuring the reliability of both the transcription
and the interpretation of the data. The inter-rater reliability for the speech transcript is
+0.89. This comparatively low rate is due to a different sensitivity to filled pauses, which
is not a factor within the framework of the present study. The test-retest reliability for
the non-verbal analysis gave a +0.99 correlation coefficient.
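
A correlation coefficient of this kind can be computed as sketched below. The text does not state which statistic was used or what the coded units were, so this sketch assumes Pearson's r over per-segment counts; the function name `pearson_r` and the sample counts are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long series of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gesture counts per segment from two coding passes (not study data).
first_pass = [4, 7, 2, 9, 5]
second_pass = [4, 6, 2, 9, 5]

if __name__ == "__main__":
    print(round(pearson_r(first_pass, second_pass), 2))
```

The same function would serve for both figures reported above: two coders' scores for the inter-rater value, or one coder's two passes for the test-retest value.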

9.2.1. Co-Verbal Gestures and Other Non-Verbal Cues in Map-Task Activities: Language for the Self

The analysis of planning gestures (Kendon, 2004) versus referential gestures in
map-task activities with blocked visibility is particularly interesting as far as the
phenomenon of "private language" is concerned. The topic of planning, self-oriented,
or self-directional gestures has been addressed in a number of previous studies (see e.g.
Goodwin and Goodwin, 1992; Kita, 2000) aimed at assessing both the role of gestures
within communicative acts and, on a more general level, the self-directional function of
language.
The question of the role of gestures in dialogue has been extensively addressed in
the past, including experiments involving conditions of visibility or non-visibility. Mahl
(1961) was one of the first scholars to suggest a key role and influence of visibility on
the production of gestural and behavioural cues, while Rimé (1982) proved that
blocked visibility does not completely prevent the production of gestures, and
consequently argued that the role of gestures is not strictly linguistic or
communicative. Other scholars following the same experimental line, such as Krauss
et al. (1995), have come to the same conclusions. A different hypothesis is suggested
by Cohen and Harrison (1973), and more recently De Ruiter (2000): their suggestion is
that the resilience of gestures in conditions of interaction without visibility is due to the
adoption of behavioral patterns typical of default conditions, that is, face-to-face
interaction. Alibali et al. (2001) focus their investigation on the quality of gestures
(representational or descriptive versus beat gestures) during cartoon retelling in both
conditions of visibility and blocked visibility. Their research underlines the fact that,
because representational gestures are performed at a considerably higher rate during
conditions of visibility, these gestures are more likely to serve communicative
functions, while the function of gestures in general is both communicative and self-
directional. Janet Bavelas and colleagues (Bavelas et al. 2008) further suggest that
dialogue and dialogic conditions influence the production of gestures in interactions,
and that, because gestures, together with facial expressions and figurative language are
ultimately demonstrations, dialogical and visibility conditions profoundly influence
these cues.
We will focus here on a particularly interesting case in order to examine the non-
verbal and verbal cues present during map-task activities. The case discussed here
involves two subjects who knew each other before the experiment. This seems to affect
their performance during the task, since they display a striking synchronisation despite
both their lack of eye contact and the deceptive qualities of the task itself. Moreover,
their acquaintance with their interlocutor helps control the arousal due to the
experimental situation, and allows the subjects to express themselves more directly in
the case of a disappointing and/or frustrating response from their partner. These
conditions also seem to elicit a much higher number of gestures, probably because of
the same phenomenon predicted within the experiment on the intentionality and
awareness of gestures: the possibility to interact in a more relaxed and informal way.
Given the peculiarity of such an effective exchange, a full transcript is provided in
the next section. The coding methodology is that adopted in the other reported
experiments, with the addition of some interpretation of the function of the gestures
recorded within the communicative exchange.
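
For readers who wish to process the transcripts automatically, the recurring annotation fields can be separated with a short script. The sketch below is an assumption-laden illustration and not part of the coding methodology itself: the function `parse_annotation` and the dictionary keys are hypothetical, and the values of the P.A. and S. fields are kept as opaque strings because their definitions are not given in this section.

```python
import re
from typing import Dict

# Field labels as they appear in the transcripts; "S:" and "S.:" both occur.
_LABEL = re.compile(r"\b(P\.A\.:|S\.?:|Types?:|Functions?:)")
_KEY = {
    "P.A.": "pa", "S": "s", "S.": "s",
    "Type": "type", "Types": "type",
    "Function": "function", "Functions": "function",
}

SAMPLE = (
    "LH leaves rest position and flaps repeatedly in the air. P.A.: w S: 30. Type:\n"
    "Metaphor (palm-down-flap) with superimposed beat. Function: marks an easy\n"
    "passage. Route direction is understood."
)


def parse_annotation(text: str) -> Dict[str, str]:
    """Split one annotation into a description plus its labelled fields."""
    flat = " ".join(text.split())  # undo line wrapping
    parts = _LABEL.split(flat)     # [description, label, value, label, value, ...]
    record = {"description": parts[0].strip()}
    for label, value in zip(parts[1::2], parts[2::2]):
        value = value.strip()
        # Drop a sentence-final period, but keep the abbreviation "n.d." intact.
        if value.endswith(".") and not value.endswith("n.d."):
            value = value[:-1]
        record[_KEY[label[:-1]]] = value
    return record


if __name__ == "__main__":
    for key, value in parse_annotation(SAMPLE).items():
        print(f"{key}: {value}")
```

Annotations that omit a field simply yield a record without that key, so incomplete entries can be parsed without special handling.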

9.2.2. A Case Study of Map-Task Activity: Full Transcripts

00:01:18 -00:02:61
ff. 29-65

G: <eeh> { [tu [pa<a>rti e [vai [dritto<o>]

you start and go straight
LH leaves rest position and flaps repeatedly in the air. P.A.: w S: 30. Type:
Metaphor (palm-down-flap) with superimposed beat. Function: marks an easy
passage. Route direction is understood.

G: [# [poi [gi[ri ]
then you turn
LH raises slightly and flaps repeatedly in the air. P.A.: w S: 45. Type: Metaphor
with superimposed beat (palm-down-flap). Function: marks second step
ff. 94-104

G: [<eeh>]
LH slightly rotates counter-clockwise (to left). Head bends to right. P.A.: w/f S: n.d.
Type: metaphor-deictic. Function: self-orientation

ff. 95-139

G: a [sinistra<a]a> #
LH moves to right preceded by a turn of the head towards the same direction
P.A.: w. Type: metaphor-deictic. Function: self-orientation in space

ff. 140-204

F: {[dritto dove verso la ruota o verso il Viale dei Li]}ll?

straight where? Towards the wheel or towards Viale dei Lill?
F: RH holding a pencil spreads towards the interlocutor
P.A.: f. Type: metaphor-deictic. Function: marks a critical passage

G: head goes back to place. LH with C shape holds. P.A.: f. Type: metaphor.
Function: though the idea expressed deals with manipulation, the gesture in question is
to be interpreted as a vestige of an aborted plan, which was interrupted by the


G: /[/ [no]
LH still holding a C-shape flaps twice. P.A.: w. Type: metaphor with
superimposed beat. Function: marks the abandonment of the old plan

ff. 222-263

G: [/ <eeh> [vai<ii>]
you go
LH flat swings twice forward-left. P.A.: e/w S.: 10. Type: deictic-iconic.
Function: self-orientation during hesitation pause (lexical-retrieval?)
ff. 263-268

G: [tu]

LH flat swings once forward-left. P.A.: e S.: 5. Type: deictic-iconic. Function:
marks the adoption of a new plan (resolution of lexical retrieval impasse?)

G: [praticamen[te<e>]
LH flat goes back and beats twice. P.A.: e/w S.: 45 e. Type: beat-iconic. Function:
underlines a significant passage, i.e., the continuation of the plan just adopted. The
gesture may also convey the idea of direction

G: [guardi<i] [i> # ]
look towards
LH flat, points leftward and slightly rotates counter-clockwise. Torso and head
self-orienting movement. P.A.: w. Type: iconic for direction + metaphor. Function:
self-orientation and eventual manipulation of a concept (representation of path?)

ff. 311-341

G: * [sei girato verso<o] [oo>

you're facing
LH flat points leftwards and beats once. Subsequently the fingers loosen. P.A.: w S.:
5 left; 7 w. Type: iconic-beat.
Function: self-orientation and marking of a relevant passage

G: un ba][nano]*[ vedi in lontananza ] } un banano #

a banana tree. You can see a banana tree far away

LH flat points left, goes to reposition and points again (ff. 1-4 above). The same
movement is repeated twice, with less and less energy. Then hand goes to rest position
(f. 5 above). P.A.: s/e/w/f S.: 90. Type: iconic with superimposed beat. Function:
self-orientation in space. The superimposed beat is due to the difficulty of expression.
ff. 406-485

G: più {[o meno a<a> trecento metri]} # /// no?

at 300 meters, more or less
LH flat swings slightly back and forth. P.A.: w/f S.: 20 w. Type: emblem.
Function: conveys the meaning of approximation

F: no non lo vedo il {[banano per esempio là ///]}

No, I can't see the banana tree there, for instance

G: {[///]
RH holding the pen spreads and holds, then closes. P.A.: f S.: n.d. Type: conduit.
Function: shows the evidence of the situation being described
Precision grip with LH. P.A.: f S.: n.d. Type: metaphor. Function: introduces a
clarification which does not take place.


G: [vab al[lora [senti] #

Ok then listen
LH spreads palm away from body. The movement is repeated twice, with less
emphasis. Concurrent activation of the trunk in a self-adaptor movement. P.A.: e/w/f
S.: 30 e. Type: metaphor with superimposed beat. Function: marks the abandonment
of the old strategy and the adoption of a new plan. The beat and concurrent movement
of the trunk may function as arousal-restrain
ff. 570-612

G: [<eeeh>] [/ [girati<ii>]
BH move towards the table. RH goes to rest. LH with C shape,
palms away from body facing down, seems to grab something. The movement is
repeated twice. P.A.: e/w/f S.: 90 e Type: metaphor with superimposed beat. Function:
individuates and underlines a concept, in this case the new strategy to be adopted

ff. 613-630

G: [parallela<a> #]
LH flattens and faces the table. RH beats rapidly on the table. P.A.: w/f S.: 15w.
Type: iconic.
Function: depicts a path in lexical retrieval impasse
ff. 631-653

G: [<aa]a>
LH beats rapidly on the table. P.A.: w S.: 5w. Type: beat.
Function: arousal-restrain in lexical retrieval impasse

G: [al[le gal[li<i>ne]
with respect to the chickens
BH flat, palm down, flap repeatedly in the air. P.A.: w S.: 10w. Type: metaphor
with superimposed beat (palm-down-flap). Function: individuation of a landmark in
both space and reasoning.

ff. 682-720

G: e al [pra[ti][cello che hai affianco # ///]

and the small meadow you have alongside
BH repeat the gesture above with a superimposed beat. Type and function: same as
BH acquire a C shape and face each other. Type: iconic. Function: introduces a
landmark in space. The long hold continuing during breath pause and silence may
be interpreted as waiting for feedback


F: {[si]} G: [///]
Head sign BH, still holding the shape described above, beat once. Head nods
slightly. P.A.: w S.: n.d. Type: beat-metaphor. Function: underlines the acquisition of
positive feedback as a landmark for self-orientation in reasoning


G: [<%smack%> hai un [pra]ti[cel]lo affian]co #

LH, C-shape, moves left, beats twice and loosens
RH, flat, beats twice and holds. Self-adapting trunk movement on the chair follows

P.A.: w S.: 5 v-40left for LH, n.d. for RH. Types: metaphor with superimposed
beat and beat. Functions: the metaphor ideally sets aside a relevant piece of information,
beats and eventually trunk movement underline the relevance of passage


G: [fai<iii>] [fai tre o quattro passi in avanti ///

F: ma<aaa> * si un praticello
LH flat rotates repeatedly right-left; RH, flat, moves slightly to right. P.A.: w/f
S.: 5 LH. Type: emblem for LH, deictic for RH. Functions: estimation and marking of a
landmark both geographic and in discourse.
BH repeat the emblem previously performed with LH only, then hold
P.A.: w S.: 10
Function: prevention of interruption. The hold corresponds to a silent
pause indicating waiting for feedback

ff. 829-873

F: mi trovo davanti una roulotte ///

G: BH hold in position shown at f.5 above
ff. 874-967

G: ///] [no!] [Tu devi essere paralle<e>la non perpendicolare al <eee>

No, you have to be parallel not perpendicular to the <eee>
F: {[e vab]}
BH flat, palms facing each other, beat slightly. The gesture is repeated with more
emphasis with long preparation and post-stroke hold in concurrence with the
subsequent utterance. P.A.: e/w S.: 45 e for the second Gphrase. Type: metaphor
Function: highlights a concept
F: Head nodding
ff. 968-116
F: allora mi trovo al Viale dei Lill ///
Then I am in the Viale dei Lill
ff. 1017-1108

G: /// (3.13s) ]
F: ci sei?
Can you follow? (Lit.: are you there?)
BH hold the stroke performed at f. 3 above during a silent pause and the further
question of the follower. Type: metaphor. Function: processing feedback self-
orientation in thought

G: [no! /// ] G: [Non c'è nessun Viale dei Lill]

F: non c'è il Viale dei Lill There's no Viale dei Lill
There's not the Viale dei Lill
LH loosens and opens with palm up. Self-adaptor
The stroke is held throughout the Follower's dialogue turn
P.A.: w/f S.: 3w
Type: metaphor
Function: ideally shows to the interlocutor something obvious

Second Clip
ff. 1195-1249

F: {[si]} G: [e<e>cco] # [al[lora] # [tu<u>]

Yes ok then you
Head nod. BH spread, palms away from body, move down. The movement is
repeated three times. P.A.: w S.: 20 (average) Type: metaphor (palm-down-flap).
The repetition suggests a superimposed beat. Function: highlights the acquisition of a
common landmark in space. The repetition of the gesture indexes a landmark in the
communicative strategy


G: [paralle<e>la<a>]

LH C-shape with thumb and index marks a starting point on the table and
describes a path towards left. P.A.: e/w/f S.: w 20 down e 40 left. Type: iconic.
Function: describes the path to be conveyed in speech. Anticipates the concept of
binary tracks

G: *[ come<ee>]
LH spread, palm down, goes left and right. P.A.: e/w/f S.: 20 e (average). Type:
iconic-metaphor. Function: repeats the path described above with less accuracy. The
gesture may be a prevention from interruption by the interlocutor.
00:03:86- 00:04:30

[le<e> * i binari del]

Repetition of the gesture described at ff. 55-77 above. P.A.: e/f S: 70 e. Type:
iconic. Function: describes a path

[treno] F: si
LH spreads. P.A.: w/f S.: 100 w. Type: metaphor. Function: ideally marks an
end-point for a route in reasoning

G: [ok] #
LH, holding the same shape, beats downward. P.A.: e S.: 25. Type: metaphor-beat.
Function: underlines the acquisition of positive feedback as a landmark for
self-orientation in reasoning
ff. 134-140

[ti muovi<i>]
you move

LH spread palm downwards goes down. P.A.: w S.: 50 w. Type: metaphor.

Function: marks the following step

G: [dritta] <eee> [parallela] a [questo<o>] [questo]

straight on parallel with respect to this
LH C-shape with thumb and index marks a starting point on the table and
describes a path towards left. The gesture is repeated three times more, the second time
and last with more emphasis. P.A.: e/w/f S.: 10e; 40; 35; 45. Type: iconic.
Function: resolve a lexical retrieval impasse

G: [<oo][o>]
LH sloppy repeats once more the movement shown in ff.1-8 above. Subsequently
fingers close as if holding something. P.A.: e; f S.: 40e; n.d. Types: iconic + iconic.
Function: resolve a lexical retrieval impasse
ff. 229-259

G: [reci][piente ///]
BH alternatively rotate (ff.1-3). Subsequently, RH goes back to place, while LH
spreads and holds (f.4). P.A.: e/w/f; f S.: 360w-10e. Type: metaphor+metaphor.
Function: the first gesture is commonly used when the speaker refers to common
knowledge, and/or when precision in discourse is not at best. The second one marks an
end-point in reasoning-speech

G: [fai tre o quattro pas] [si<i># F: eh G: poi]}<i> <eee>

make three or four steps then

BH flat palms facing each other swing alternatively right and left then stroke
downwards and hold. P.A.: e/w/f S.: 10 vertical 20 horizontal (average). Type:
emblem + metaphor. Function: estimation + individuation of a landmark in reasoning

G: {[va * giri a sinistra<a> ///

Go turn left
LH flat, palm away from body, turns left. The stroke is held during three
conversational turns (00:20:12, f. 503). P.A.: e/w/f S.: 3e 80w. Type: iconic-deictic.
Function: orientation in space

ff. 399-452

F: {[vado in su praticamente]} G: non vai in [giù] /

I go up in fact no, you go down
RH flat points in front of the speaker's space. P.A.: e/w/f S.: 40e 45w. Type:
iconic-deictic. Function: orientation in space

Head nod

G: se vai a sinistra<a>/ G: eh! #]

If you go left
F: {[ah si vab ]}vado in giù /e poi?
Ah OK I go down. And then?
RH holding the pen rotates clockwise. P.A: e/w S.: 30 e. Type:
emblem. Function: commonly used when the speaker refers to common knowledge,
and/or when precision in discourse is not at best. LH still holds the stroke achieved at f.
398 (00:15:92).
Head nods in concurrence with F's "{[ah si vab ]}vado in giù"

G: [e ok]#
LH spread rotates twice. P.A.: w S.: 85right; 30vertical. Type: metaphor.
Function: refers to common knowledge- probably marks degradation of New to

G: [e vai [avanti<i>]
And you go straight
BH palms downwards move twice down. Subsequently, RH goes back to rest
position. P.A.: w/f S.: 20w. Type: metaphor with superimposed beat. Function:
highlights the acquisition of a common landmark in space. The repetition of the gesture
indexes a landmark in the communicative strategy

G: [per <eee>] [non so<oo>]

for I don't know
LH C-shape rotates counter-clockwise. Then hand moves slightly left and repeats
the movement. P.A.: w/f S.: 360w. Type: metaphor. Function: conveys the idea of

G: [quindici o venti passi///] F: hm

15 or 20 steps
BH C-shape facing each other swing alternatively right and left. RH . P.A.: w/f S.:
180 w. Type: emblem. Function: estimation

ff. 632-649

G: [no?]
RH beats on the table, while left hand holds the stroke. P.A.: w S.: 10 Type: beat
Function: highlights a landmark in both reasoning and space

G: # [e<ee>] [ti trovi<i>]

and you find
LH, still holding the same shape, rotates slightly counter-clockwise. Subsequently,
BH flat, palms downward, move down. P.A.: w; w/f S.: 360w; 15w. Type: metaphor;
metaphor (palm-down-flap). Function: the first metaphor conveys the idea of
approximation; the second one marks the following step
ff. 671-719

G: [una [pianta di [banan [*banano alla tua] [destra<a>///

a banana tree on your right
BH repeat the movement described above 4 times. Subsequently, RH bends right
P.A.: w S.: 5w (average); 90 w right. Type: metaphor (palm-down-flap) with
superimposed beat; iconic-deictic. Function: the first gesture stresses the speech and

helps with a pronunciation impasse; the second helps orientation in space and
eventually underlines an end-point in route description
ff. 710-780

F: {[non ce l'ho la pianta di bana]}no alla mia destra! ///

I have no banana tree on my right!
RH spreads. P.A.: w/f. S.: 5°. Type: metaphor-deictic.
Function: shows the idea conveyed in speech. Underlines an obvious piece of

G: ///
BH hold the stroke. Then right hand goes to rest position. Type: metaphor.
Function: self-orientation in reasoning


F: comunque io vado dritto a un certo punto giro verso il basso

anyhow I go straight on and at a certain point I turn downward

G: LH still holds the stroke described above. In concurrence with F's "a un certo
punto", head moves forward and down, while LH beats once. P.A.: w. S.: 2°. Type: beat.
Function: highlights a relevant passage. The movement of the head is indicative of

G: [allo][ra] [giri] [verso ][il basso] e vai [sempre dri<i>tto poi F: eh!
ok you turn downward and then go straight on
LH, still holding the same shape, beats downwards 5 times. In concurrence with
"sempre dritto", the hand moves slightly left and beats once more, then holds the stroke.
RH joins the movement only for the first beat, in concurrence with "allora". P.A.: w. S.:
20° (average). Type: beat. Function: the first one, performed with BH, underlines a
starting point and acts as an index of a slight change in strategy (palm-down-flap). The
following four strokes stress the information provided in speech, which is repeated.
The last one (palm-down-flap) marks an end-point in route-description. Note that the
stroke is held throughout F's dialogue turn as if waiting for feedback.

F: e finisce il foglio
then the map ends
G: [no<o>?]}
LH beats once more. Type: beat
Function: marks the endpoint of route-description, probably with a synchronisation


G: ok {[///] <%laughing%> [prima [che finisca il foglio] <%> ///}

Before the map ends
<%laughing%> si

BH flat, palms away from body facing downwards, move down once during the silent
pause. Afterwards, BH move up towards body and move down-up twice. LH slightly
rotates counter-clockwise. P.A.: w. S.: 40° (average). Type: metaphor with
superimposed beat. Function: the first gesture highlights a landmark in reasoning. The
second one acts as signal for the activation of attention

G: giri in {[su] //
you turn upwards
F: si
Index finger of RH points up in front of speaker's space. P.A.: f. S.: 40°. Type:
deictic. Function: self-orientation

G: a [destra] F: si
(you turn) right
RH flat moves downward. P.A.: w/f. S.: 35°. Type: metaphor (palm-down-flap).
Function: marks a landmark in both space and the organization of route-direction

G: [no<o>?] #
RH flat repeats the movement described above. P.A.: w. S.: 25°. Type: metaphor
(palm-down-flap). Function: marks a landmark in route-description

G: [e<e>cco]
RH repeats the movement once again. P.A.: w. S.: 30°. Type: metaphor (palm-down-flap).
Function: marks the degradation of New to Given

ff. 1187-1220

G: [girando a [de<e>stra<aa>]
turning right
The same downward movement is repeated once with RH, once with BH. P.A.: w.
S.: 15° (average). Type: metaphor (palm-down-flap) with superimposed beat.
Function: highlights the Given as a starting point for subsequent description

G: [alla<a][a> <hmm>
RH flat moves down then rotates clockwise and holds
P.A.: w; e/w/f. S.: 3° w; 90° left w
Type: metaphor + preparation phase of an iconic-deictic
Function: the first gesture is a vestige of the metaphor (palm-down-flap)
described above. It may underline the following step in the communicative strategy.
The second kinetic phase is interpretable as a long preparation phase of an iconic-
deictic which is held in concurrence with a lexical retrieval impasse

G: alla tua [destra sempre]

on your right - always
RH rotates clockwise with a slight shift towards right then points right on the table
P.A.: w/f S.: n.d. Type: iconic-deictic with superimposed beat.
Function: self-orientation in space. The superimposed beat helps control arousal in
a lexical retrieval impasse

G: [c'è una * una pianta di]

there's a banana tree
BH flat palms downwards rotate. P.A.: w. S.: 360° w. Type: metaphor.
Function: conveys the idea of approximation

G: [banano ///
BH spread with palms up and hold. P.A.: w/f. S.: 260° w. Type: metaphor
(conduit?). Function: metaphorically presents the idea to be conveyed. In this case, the
gesture may also act as marker for a relevant landmark in space. The stroke is held
during most of F's turn, which suggests the additional function of feedback-

G: /// ]
F: e daie / <%laughing%>non ce l'ho<o> io la pianta di banano c'ho un {[Viale
dei Lillà]}
Again! I don't have the banana tree, I have a Viale dei Lillà!

F: BH spread, palms facing each other, move towards the map. P.A.: w. S.: 70° w.
Type: deictic. Function: orientation in space

9.3. Co-Verbal Gestures and Planning in Conditions of Blocked Visibility and Face-to-Face: An Overall View

While the number of gestures performed under conditions of blocked visibility is
comparable to that observed in normal conditions, posture shifts and gaze towards the
interlocutor are dramatically reduced. Posture shifts appear in cases of particular
arousal, whereas gaze is recorded when the speaker is waiting for his interlocutor's
response or is instinctively trying to obtain visual feedback.
An instance of the latter case is shown in the transcripts provided here (00:03:10-
00:03:85): the Giver is trying to describe a location on the map but has problems
retrieving a good term of comparison.
In pronouncing the word "come" (Eng.: "like"), followed by a long filled pause
consisting of the prolongation of the final vowel, the Giver looks towards the Follower.
Cases of posture-shifting probably related to planning are also recorded, as in the
example shown at 00:21:33-00:22:76 in the transcripts provided here. In this
case, the Giver performs two palm-down-flaps, interpreted as a single gesture with a
superimposed beat because of the observed decrease of movement dynamics in the
second stroke, and a concurrent activation of the trunk in a self-adaptor movement.
Because the trunk movement in question is synchronized with the word "vabbè"
(roughly, "OK") in speech, and with a palm-down-flap, it is likely to mark the
abandonment of the old strategy and the adoption of a new plan. Both the beat and
the concurrent movement of the trunk may also function as arousal-restraint.
As for gestures, an informal comparison with the data available from other
experiments on face-to-face interaction seems to confirm the impression that the
number of phrases performed is strikingly similar. In particular, the rate of gesture
phrases recorded during the map-task tends to match that performed by subjects
involved in normal interaction,[60] with frequent cases of communicative gestures.[61]
On the other hand, the gestures recorded during the map-task tend to be simplified in
form, and more recurrent, which helps to characterize them as "extra-coherent": a
higher number of catchments in the McNeillian sense can in fact be identified. An
instance of such a simplification is visible in Figure 41, which offers a comparison
between the most complex[62] gestural performance recorded during the map-task (plate
a) and a complex gestural performance recorded during a face-to-face road description
from McCullough (2005)[63] (plate b).

[60] A mean of 17 gesture phrases per minute was recorded for subjects in blind map-task,
which is strikingly comparable to the overall average of 15 gesture phrases per minute recorded
in face-to-face interaction for the Italian subjects, and to the overall average of 19 gesture
phrases per minute recorded for the American subjects engaged in road-description. This last datum
was extrapolated from the data shown in McCullough, 2005, Table 17, page 87.
[61] Again, an interesting instance of communicative behaviour despite the fact that the
subjects are not visible to each other.
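
The per-minute rates reported for the map-task and face-to-face data are simple counts normalised by recording time. As a minimal sketch of that arithmetic: the function name `phrases_per_minute` and the example recording (68 phrases in 240 seconds) are hypothetical illustrations, and only the three mean rates (17, 15 and 19) come from the text.

```python
def phrases_per_minute(n_phrases: int, duration_s: float) -> float:
    """Return the number of gesture phrases per minute for one recording."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return n_phrases * 60.0 / duration_s


# Mean rates reported in the text (gesture phrases per minute).
REPORTED_RATES = {
    "blind map-task": 17,
    "face-to-face, Italian subjects": 15,
    "road description, American subjects": 19,
}

# Hypothetical recording, used only to illustrate the computation.
example_rate = phrases_per_minute(n_phrases=68, duration_s=240)

# Spread between the highest and lowest reported mean rate.
rate_spread = max(REPORTED_RATES.values()) - min(REPORTED_RATES.values())

if __name__ == "__main__":
    print(f"example rate: {example_rate:.1f} phrases/min")
    print(f"spread across reported means: {rate_spread} phrases/min")
```

The spread of the three reported means is what the informal comparison relies on; no significance test is implied by this sketch.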
Such a peculiarity makes possible the isolation of recurrent gestural patterns
referring to both spatial reasoning and discourse organisation. A good instance of such
a phenomenon is a recurrent gesture, which will be here named "palm-down-flap". This
gesture was observed in the performance of two subjects in the role of the Giver during
map-task and probably in two subjects engaged in face-to-face interaction.


a) LH C-shape describes a path towards left while RH goes slightly down.
b) shape in three strokes embedded in a single gesture
c) RH describes a complex route pattern (McCullough, 2005: 116)

Figure 41: instances of complex gestures in a) map task and b) face-to-face interaction, compared with data
available from c) spontaneous route description (McCullough, 2005: 116)

[62] Complexity is here determined by means of the following parameters:
- number of hands engaged in the performance
- number of separable trajectories
- number of separable hand shapes
- number of strokes embedded in the same gesture phrase
[63] Thanks to Karl-Erik McCullough for his kind permission to reproduce images from his
University of Chicago Ph.D. Dissertation.
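
The four complexity parameters listed above can be recorded per gesture phrase and compared side by side. The sketch below is only an illustration: the class name `GestureComplexity`, the parameter values for the two example gestures, and above all the unweighted sum in `total()` are assumptions, since the text does not say how the parameters are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureComplexity:
    """Counts for the four complexity parameters of one gesture phrase."""
    hands: int          # number of hands engaged in the performance
    trajectories: int   # number of separable trajectories
    handshapes: int     # number of separable hand shapes
    strokes: int        # number of strokes embedded in the gesture phrase

    def profile(self) -> tuple:
        """Return the four counts in a fixed order for side-by-side comparison."""
        return (self.hands, self.trajectories, self.handshapes, self.strokes)

    def total(self) -> int:
        """Unweighted sum of the counts (an assumption, not the author's metric)."""
        return sum(self.profile())


# Hypothetical codings, chosen only to show how two gestures would be compared.
map_task_gesture = GestureComplexity(hands=2, trajectories=2, handshapes=1, strokes=1)
face_to_face_gesture = GestureComplexity(hands=1, trajectories=3, handshapes=1, strokes=3)

if __name__ == "__main__":
    for label, g in [("map task", map_task_gesture), ("face-to-face", face_to_face_gesture)]:
        print(label, g.profile(), g.total())
```

Keeping the four counts separate in `profile()` avoids committing to any particular weighting; the sum is offered only as one possible summary.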
The gesture in question is a downward movement performed with palm(s) flat,
down and away from body (see transcripts 00:01:18-00:02:61, frames 29-65[64]),
recorded for two subjects (S1 and S2) in synchronisation with crucial passages during
both route-direction (mainly soon after the acquisition of common landmarks) and
discourse organisation (i.e., in concurrence with the confirmation of a successful
communicative strategy).
The downward movement depicted by the gesture shows no direct iconic
relationship with the co-expressive speech[65] but rather a more abstract and metaphoric
relationship with a self-directed thought related to planning. For this reason, it is here
interpreted as a metaphor, as defined in McNeill (1992), the downward movement
depicting a given state in the organisation of the speaker's communicative plan,
probably identifiable as degradation of New to Given (see Halliday, 1985).
Interestingly, this gesture is often recorded in synchronisation with adverbs
signalling alignment with the interlocutor, such as "ecco!" ("Good!", when used as an
interjection; see transcripts, second clip, 00:48:77-00:50:94, frames 1195-1249).
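
The transcripts pair frame numbers with timecodes, for example f. 398 with 00:15:92. A small converter makes such pairs easier to check. The sketch below rests on two assumptions that the text does not state: a frame rate of 25 frames per second (which reproduces the f. 398 example) and a timecode layout of minutes:seconds:hundredths. The function name `frame_to_timecode` is hypothetical.

```python
ASSUMED_FPS = 25  # assumption: not stated in the text, but consistent with f. 398 = 00:15:92


def frame_to_timecode(frame: int, fps: int = ASSUMED_FPS) -> str:
    """Convert a frame number to a mm:ss:cc string (cc = hundredths of a second)."""
    if frame < 0 or fps <= 0:
        raise ValueError("frame must be non-negative and fps positive")
    total_hundredths = round(frame * 100 / fps)
    minutes, remainder = divmod(total_hundredths, 6000)
    seconds, hundredths = divmod(remainder, 100)
    return f"{minutes:02d}:{seconds:02d}:{hundredths:02d}"


if __name__ == "__main__":
    print(frame_to_timecode(398))
```

Other frame ranges in the transcripts match their timecodes only approximately under this assumption, and frames from the second clip appear to carry a different offset, so the conversion is best treated as a rough cross-check.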
The first occurrence of the palm-down-flap in S1, who is in the role of the Giver
(in transcripts, "g"), is recorded at 00:01:19. In this case, S1 has just established the
basic coordinates in two brief dialogue turns in order to begin with route-direction.
After having received positive feedback from the Follower (in transcripts, "f"), the
Giver starts with route-direction by saying "tu parti e vai dritto" ("you start and go
straight") with the concurrent performance of a palm-down gesture with superimposed
beat. This same gesture is repeated after the first pause, at 00:02:61 (see ff. 66-103) and
subsequently at 00:08:16 (see ff. 205-221), after a silent pause that is undoubtedly
an index of perplexity. In this case, the gesture seems to signal the continuation of the
plan already adopted before the Follower's interruption. In a sense, the Giver is
overruling the Follower's interruption here, in order to continue with her description of
the segment of route-direction. The same gesture is then repeated with both hands at
00:26:13, after a 2-second hesitation in speech, and at 00:48:77, in
concurrence with the utterance "ecco, allora tu" ("ok, then you"), which follows a
long negotiation over spatial landmarks ending with positive feedback from the
Follower. In this case, the gesture is interestingly superimposed on the vestige of a
C-shape stroke-hold that takes place during the Follower's dialogue turn, which is
interpretable as an index of an aborted idea. Another occurrence is recorded at 00:71:77,
after the resolution of a wrong-footing due to the incongruence between the two maps,
immediately following the performance of another recurrent metaphor used to index
common knowledge. Other occurrences of this gesture are synchronized with a request
for feedback (00:76:21) and the following utterance (00:76:93; 00:77:77), probably
with a superimposed beat, in a sort of gestural "semantic satiation" due to the
fossilization of the gesture.
A palm-down flap is also observed in another subject during map-task, at the very
beginning of her interaction with the interlocutor, within a complex phase of
negotiation about the starting-point (see Figure 41). In particular, the gesture is
recorded in concurrence with the utterance "allora", an overt index of planning an
effective communicative strategy, and is repeated soon after in concurrence with the
following utterance.

[64] The conventions for transcription follow those proposed in McNeill (1992 and 2005). See
also Chapter 8.
[65] See McNeill, 1992.

Figure 42: probable palm-down flap in an Italian subject engaged in a face-to-face guessing game

The planning is difficult nonetheless, because of several overlapping utterances by
the Follower. This lack of synchronisation between the interactants is resolved at
00:03:40, when S2 accepts the interruption and clearly states the intention of beginning
her route description from the starting point. Interestingly, during this alignment, S2
performs two palm-down flaps in strict succession (00:03:40-00:04:97), with

JA: hm okay / good [so that helps me]

Figure 43: probable case of palm-down flap in an American subject (from McCullough, 2005: 121).

[66] Note that the gesture size is about 40° at the wrist level.
Finally, a strikingly comparable gesture was also recorded in face-to-face
interaction. Figure 42 shows the only occurrences of the gesture for the Italian subjects:
S3, after having received a hint, formulates a new hypothesis for the solution of the
guessing game. Figure 43 shows a palm-down gesture from McCullough (2005), which
can perhaps be interpreted as having the same function.[67]

9.4. Lateralization Phenomena in Gesture

Granted the ambivalent function of gestures and language, a new phenomenon has been
observed in conditions of blocked visibility during a route direction task: the data
presented here show an interesting gestural strategy adopted in order to handle
overlapping pieces of information related to different linguistic functions.
In particular, a tendency to use space for several purposes has been identified:
a) conveying information, the aspect of gesture that is closely related to the
referential function of language, and
b) handling other self-regulation functions such as the degradation of "New" to
"Given" (for the psycholinguistic concepts of Given and New see Halliday, 1985).
Participants handle such functions by lateralizing them, so that the dominant hand is
devoted to referential aspects, while the weak hand handles self-regulation functions
and/or other psycho-pragmatic ones. This finding is potentially important, as it shows
that gestures, like language generally, can handle these different functions, and
points out that an entirely listener-oriented linguistic theory is rather limited.
The fact that people tend to gesticulate even when aware that their gestures cannot
be seen has already been noted by Rimé (1982). Our data support this: participants
produced non-verbal cues (gaze, posture shifting, and gestures) even during a map
task with blocked visibility, when they were aware of the impossibility of seeing or
being seen by their interlocutors.
Of course, co-verbal gestures are the major indices of planning and self-orienting
thought. As mentioned above, the gestures performed by the participants engaged in the
map-task are simplified and "extra-coherent", as they are less complex in both handshape
and trajectory. This simplification has allowed the isolation of a particular gesture that
has been named "palm-down-flap". Because the gesture in question does not have a
simple iconic relationship with the co-expressive speech, but rather a metaphorical
connection, and because it has always been found in synchronisation with crucial
passages during either route-direction (mainly soon after the acquisition of common
landmarks) or discourse organization (i.e., in concurrence with the confirmation of a
successful communicative strategy), it is most likely attributable to planning activity.
This type of metaphoric gesture has also been recorded during face-to-face interaction
among both Italian subjects and American subjects.

The gesture in question is semiotically more complex (McCullough, personal

9.4.1. Instances of Lateralized Gestural Processing

Many map-task participants lateralize (that is, assign to the dominant hand or the other)
the major linguistic functions. The dominant hand is typically used to perform
gestures during face-to-face interactions, depicting shapes or trajectories or more
abstract mental content; the non-dominant hand, meanwhile, is used for gestures related
to self-organization processes that are usually performed in concurrence with planning
passages, such as the abandonment of an ineffective communicative strategy. These
latter gestures are relatively infrequent in face-to-face interactions, but
quite common in blocked visibility conditions (for a contrasting point of view on
lateralization, see Hadar and Butterworth 1997).
Gestures related to planning activities often emerge while carrying out complex
tasks with others, particularly at frustrating moments. Frustration and a marked
condition of interaction are common contexts for these gestures. Of the 44
participants in the role of Giver, 43 showed some degree of lateralized gestural response
to opposite linguistic functions, with the tendency to perform just one planning gesture
with their non-dominant hand during the whole task. In particular, two types
of lateralization were observed. The first one is a lateralized response associated with
metaphoric gestures[68] that have already been identified as indices of planning (Rossini,
2007). In these cases, the non-dominant hand is devoted to planning gestures while the
dominant hand handles the referential function. The second type of lateralization occurs
in cases of restatement after a wrong-footing: in these cases the gestural production
shifts from the dominant hand to the non-dominant one. A good example of the second
type of lateralization is the following: the participant, in the role
of Giver, is attempting a description of a path around a landmark in his map. After a
filled pause and a silent hesitation, he resumes his route-description, but then interrupts
it and restates the route described thus far while using a different orientation strategy.
The word "allora" (adapted into English as "OK"), usually a clear sign of planning, is
here an index of the shift in the Giver's communicative strategy.

Figure 44: Case of lateralized gestural response to planning activity in S1

[68] Interestingly enough, a left-hand preference for metaphor in right-handed subjects with a
Left-Hemisphere dominance for language functions has already been recorded (see Kita, de
Condappa and Mohr, 2007).
Soon after the word "allora", in concurrence with the segment "vai avanti" (Engl.:
"go ahead"), he performs a gesture signalling the abandonment of the original linguistic
plan. The gesture in question is performed with the Giver's non-dominant hand, which
was previously disengaged from gestural production in a rest position. After the gesture
is performed, the Giver's left hand goes back to the original rest position and is not
engaged in any further gestural performance.
An instance of a planning gesture observable in more than one participant is a
horizontal trajectory with "palm-down" handshape. Under normal conditions, this
gesture is considered to be symbolic or emblematic: by depicting a clear line in the air,
the gesture conveys the idea of something incontestable (Kendon, 2004). The gesture
shown in Figure 44 is a variant of the normal symbolic gesture described above. It uses
the same handshape, but with a "sloppy hand" (for the concept of "sloppy hand" see
Kita et al., in prep.). It also has the same trajectory, although the path of the movement
is not sideways but away from body. Other instances of palm-down-flap gestures are
recorded in the case study presented in the previous section, as highlighted in the
transcripts. In particular, the sequence 00:22:77-00:24:47 in these transcripts shows an
interesting case of simultaneous lateralized gestural response to reference and self-
The participant depicted in the figure, who is left-handed, is in the role of the
Giver: after receiving a negative response from her interlocutor, who is not able to find
the previously specified landmark in her map, the Giver changes strategy and says
"vabbè, allora senti" (Eng.: "OK, then listen").
After the abandonment of the old communicative plan, the Giver runs into speech
disfluencies: her verbal output consists of two single words preceded and followed by
long hesitations. During this long lexical-retrieval impasse, she performs a self-adaptor
followed by a complex gestural representation with both hands, which is synchronized
with the word "girati" (Engl.: "turn around"). The movement performed with the right
hand is marginal with respect to referential attempts. The referential function is instead
handled by the left hand, which performs a metaphoric representation of the Giver's
attempt at lexical retrieval. The movement performed by the Giver with her right hand
appears to be a way to focus attention on the map, although the interpretation of this
particular case is somewhat uncertain.
A clearer example of lateralized response is visible immediately afterward
(transcripts, 00:24:48-00:25:20), when the Giver retrieves an effective orientation
strategy and the exact word to convey the instruction: she thus says "parallela" (Engl.:
"parallel"), and her right hand performs a rapid beat, while the dominant hand is in hold
phase, still depicting a route.
This beat gesture performed with the non-dominant hand is here considered to be
an instance of a planning gesture: as soon as she finds an effective way to convey her
idea, the speaker beats to stress a new starting point while her dominant hand is
engaged in the depiction of a path in concurrence with a lexical-retrieval impasse. This
seems to be a response to a referential linguistic function.
At 00:29:19-00:30:48 and 00:09:12-00:10:35, a case of lateralized
response with gestural anchoring towards the same reference is shown. In the first case
the Giver resolves a lexical retrieval impasse signaled by a non-verbal sound and says
"hai un praticello affianco" (Engl.: "[you] have a little meadow alongside"). Her gestural
production starts in concurrence with the non-verbal sound. Both hands are engaged in
gestural performance, although each hand is handling a different linguistic function.
Her dominant hand is engaged in an iconic gesture with a superimposed beat pattern
whose onset is synchronized with the non-verbal sound: this kind of gesture is clearly
referential and conflates the mental contents of "lateral placement" and
"roundness", the latter content being due to the round shape of the picture representing
the meadow in the Giver's map. Her non-dominant hand performs a palm-down-flap
gesture with superimposed beat that apparently handles a different function related to
planning: the gesture is synchronized with the word "praticello" and underlines a
relevant passage in the Giver's communicative strategy.
The same referential anchoring performed by the dominant hand is evident 9
seconds later, as the Giver describes the path once again. While referring to the same
landmark on her map, she describes its shape and calls it a "container" ("recipiente"):
in synchrony with the word "recipiente", the Giver performs two gestures, the first
being a metaphor commonly used when referring to common knowledge, or when
discourse is not intended to be precise. The second gesture, synchronized with the
second part of the word, is exclusively performed with the speaker's dominant hand,
and seems to have a strictly referential function. When referring twice to the same
object in the map, the participant seems to use synonymic reference in both her
linguistic and gestural output. Although the handshape is roughly similar for both
gestures, which suggests a catchment in the McNeillian sense, the iconicity of the
gestural production is more evident in the latter reference. This phenomenon may be
due to focus-shifting from the concept of meadow to the description of a round object.
A similar instance of lateralization is recorded in another participant, who
performed only one planning gesture with her non-dominant hand at the beginning of
the task (Figure 45): in this case, the participant performs a palm-down-flap with her
left hand while saying "allora" (literally: "then"), which is, again, a clear index of
planning. The same behavior is also observed in another participant in the role of the Giver
(Figure 46), and in a participant in the role of Follower.

Figure 45. Lateralized gestural response with palm-down-flap in S3

Figure 46 shows the Giver's performance at 00:01:17: after an initially
unsuccessful attempt to describe the path to his interlocutor, the Giver tries to
synchronize with the Follower by going back to the start, and says "allora.
Praticamente la partenza è in basso a sinistra, no?" (Engl.: "let's see. Practically, the
start is down left, no?"). As soon as his interlocutor confirms, the Giver performs a
gesture with his non-dominant hand: the gesture in question is a horizontal cut,
indicating definiteness (Kendon, 2004). The fact that the gesture in question is
performed with no concurrent speech, and follows a successful attempt to synchronize
with the interlocutor, makes it seem to be an index of planning: in its performance, the
subject is signalling alignment with his Follower on a common geographical point.
After this gesture, the Giver uses his non-dominant hand only for self-orientation in
space, such as pointing to the left, or describing a path placed on the left side of the
map. Interestingly, he uses the non-dominant hand only four times during a four-minute
route description.

Figure 46. Lateralized planning gesture in S6

Another case of strong lateralized response to different linguistic functions was
observed in a participant in the role of Follower (Figure 47). On this occasion, the
participants have already finished with their task, and are checking the route, when the
Follower expresses some doubts about the reliability of the path and the landmarks in it,
and tries to understand the appearance of the Giver's map.

Figure 47. Lateralized gestural response in a left-handed participant in the role of Follower

The first attempt at alignment with the Giver is accompanied by a beat gesture
performed with his dominant hand. Since the alignment is not successful, the Follower
restates his question and simultaneously performs a palm-down-flap with his
non-dominant hand. Other instances of lateralized gestural processing concern the shift of
gestural movement from the dominant to the non-dominant hand. Instances of this
phenomenon in another participant in the role of the Giver are shown in Figure 48 and
Figure 49. In this case, the Giver is describing a path on the left side of her map.
During her first attempt to do so, she consistently uses her left hand to iconically depict
the path (Figure 48), the iconic gesture in question being an attempt at self-orientation
in space. After having described this segment, the Giver attempts to proceed with her
map description, and suddenly decides to restate the round path just described with the
parenthetical sentence "cioè, la giri e curvi verso destra" (Engl.: "that is, you turn
around it and curve towards right"; Figure 49).

Figure 48. Lateralized response to space description in S5. Left hand describing a path on the left side of the
map

In describing the round path once again, she now uses her right hand. This case is
particularly interesting because of the objective location of the path and landmark to be
described, which is reinterpreted twice with opposite hands. The first description of it
has a referential anchor, while the second one seems to be the result of both a
referential function and one of linguistic planning. The dominant hand takes on the
referential function until the segment "fai un pezzo dritto" (Engl.: "you go straight on
for a bit"), which is related to a new section in the route description, and instantly goes to
rest when she engages in a linguistic planning activity. At this point, soon after the
speech string "cioè" (Engl.: "that is"), an obvious index of restatement, the non-dominant
hand engages in movement. Moreover, the gestures that the Giver performs with her
non-dominant hand during the restatement of the path description are not the exact
repetition of the gestures already performed with the other hand. The gestures
performed the first time are more iconic and space-anchored; during restatement, the
Giver's gestures are more global, as if indexing a reinterpretation and appropriation of
the space by the speaker. This same kind of lateralization is recorded in S7, with a
more emphatic transition from dominant to non-dominant hand and vice versa. Figure
50 shows the first use of S7's non-dominant hand. In this case, Giver and Follower
have just realized that their maps do not match exactly. Nevertheless, the mismatch has
not caused disorientation, and S7 manages to guide his interlocutor through the second
landmark and is now describing a long stretch through a part of the map that is empty of

Figure 49. Lateralized response to space description in S5. Right hand describing the same path on the left
side of the map

After this easy segment, S7 needs to refer to the third landmark in his map, but
prefers to synchronize with the Follower by directly asking if the landmark in question
(a mine) is reported on his map. During this phase, and also during the Follower's
answer, S7's dominant hand holds the stroke of a pointing gesture. The Follower's
response is confused at first, but the interactants succeed in synchronisation.
As the Follower says "No. Yes! I have it", S7 marks the successful alignment with
his interlocutor with "ecco"[69] (Engl.: "good/ok") and proceeds with the following
segment in his map. Interestingly enough, the non-dominant hand leaves the rest
position in concurrence with the word "ecco", performs an iconic gesture depicting the
next segment of the path, and goes back to rest position. As soon as the Follower says
"ok" (Figure 50), S7's non-dominant hand leaves rest position and performs a beat with
a loose precision grip.

[69] See Rossini (2007) for the word "ecco" as a clear index of planning.


Figure 50. Lateralized linguistic planning in S7

The gesture's onset is synchronized with a silent pause in speech, which leads to
considering it to be an index of linguistic planning. Subsequently, S7's non-dominant
hand is engaged in an abstract reference to the next route segment, when he is
interrupted by his interlocutor, who asks for a clarification about a landmark (Figure
51). During the Follower's conversational turn, S7's hands are in rest position. After
the Follower's question, the referential function is activated in S7 together with his
dominant hand: the participant performs an iconic gesture of proximity while saying
"più verso la miniera" (Engl.: "[keep yourself] rather towards the mine"). After the
alignment between the participants has taken place, the Giver almost exclusively uses
his dominant hand.

Figure 51. Online lateralized gestural response in S7

9.5. Discussion

Various hypotheses have been proposed, both about the role of gesture in
communication, and about the role of various brain areas in the production of language
and gesture. Some scholars, such as Butterworth and Hadar (1989), have suggested that
gestures are not communicative, while others are convinced that gestures do have a
communicative role. Among these, Melinger and Levelt (2004) and De Ruiter (2000)
hypothesize that gestures are intended by the speaker to be informative, regardless of
the fact that their gestural production may be completely ignored by the interlocutor.
Others put forward the idea that gestures play a significant role in face-to-face
interactions: Bavelas et al. (2008) have recently shown that gestures are mostly
performed in dialogic situations as opposed to monological ones, no matter what the
face-to-face condition is. Both McNeill (1992, 2005) and Kendon (1983, 2004)
highlight a single cognitive process underlying speech and gesture. In particular,
McNeill (1985 and following) hypothesizes that not only do gestures share the same
cognitive, psychological, and ontogenetic origin as speech, but they also interlace in
handling language functions.
The data discussed here, and already briefly introduced in Rossini (2007), are
consistent with this hypothesis and with the findings discussed in Bavelas et al. (2008)
to the extent that they show some interactive properties in face-to-face interactions with
blocked visibility. The self-orienting role of the gestures recorded during the map-task
is consistent with McNeill's (2005: 53-54) idea that the problem of gestures being
exclusively produced for the speaker's or the receiver's benefit is a false one.
Gestures, as well as speech, may serve self-regulation and planning functions and
be a means of self-orientation and self-organization for each individual (Alibali et al.
2001), independently of being a means of communication and interaction. The analysis
of co-verbal gestures in map-task activities has revealed interesting phenomena,
contingent on the lack of the common semiotic space usually established by gaze. Such
a condition produces, among other phenomena, a simplification in the gestural
performance and allows the isolation of recurrent patterns of movement related to both
spatial reasoning and discourse organisation.
The palm-down flap, which is presented here, is a good example of increased
gestural coherence when face-to-face interaction is not possible. The fact that this
gestural pattern is recorded in more than one subject suggests some cultural-specificity
of the metaphor behind it: further research is thus desirable, in order to assess its
possible cross-cultural use. The observed persistence of interactive and even
communicative non-verbal behaviour when the interlocutor is not visible can perhaps
contribute to speculation about the complex relationship between behavioural patterns
and language.
Moreover, the results of this enquiry are particularly interesting concerning the
function of gestures within communicative acts. As stated in Chapter 5, gestures have
sometimes been assumed to have a speaker-oriented function, their role being closer
to self-orientation and self-organization of thought than to a communicative one.
Throughout the pages of this book, the idea that speech and gesture can have
speaker-oriented functions has been suggested several times. Such an assertion, at least
when related to gestures, is generally associated with the hypothesis that gesture is not
communicative, or that it is a mere epiphenomenon of the speech-encoding process.
Nevertheless, speculation about language has led some linguists to assert that,
beside its communicative function, language itself also has another: a self-orientation
and self-regulation function. In particular, Leonard Bloomfield (1933) suggests this
association when he defines thought as a way to speak to ourselves. This idea is
addressed in a more systematic way in McNeill's theory, especially in his most recent
book, but is also recurrent in other linguists not specifically concerned with gesture
studies, such as, for instance, Bashir and Singer (1999). Interestingly enough, this
broader framework of analysis leads to a more thorough inquiry into the relationship
between language and behaviour, despite the neglect of behaviourism as such in
linguistic theory. Moreover, interesting cases of lateralized gestural response to
different linguistic functions, such as the referential and self-orienting ones, bring back
to the fore the hypothesis of an involvement of the right hemisphere in the organization
of language production.
This finding is strikingly consistent with the findings of a Right-Hemisphere
implication in linguistic and gestural production in experiments in healthy subjects with
Left-Hemisphere dominance for language (Kita, de Condappa and Mohr, 2007) and in
split-brain patients (Kita and Lausberg, 2008). If confirmed by further investigation,
the results of the present research can provide further evidence for McNeill's (1992,
2005) hypothesis of the function of the right hemisphere in language, and also
contribute to a reconsideration of the hypothesis of modularity in brain activity. The
fact that a marked lateralized gestural response to different linguistic functions has been
so clearly identified during map-task activities can perhaps be attributed to the nature
of the task itself, which places a significant cognitive demand on the participants, both
in terms of orientation in space and in terms of planning: linguistic planning is elicited
by the need to find an effective communicative strategy despite the mismatches
between the maps provided to the interactants. Further research aimed at assessing
whether these results are replicable is desirable: a task-oriented experiment involving
step-by-step instructions to the interlocutor with no possibility of direct interaction
could serve this purpose. The experiment should be structured in two phases for each
participant in order to allow for the investigation of possible differences in language
production during face-to-face and blind interactions.

Summary

This chapter has addressed the issue of the private phenomenon of language with
special focus on the role of gesture in conditions of blocked visibility. Interesting
phenomena such as the resilience of communicative gestures during a map-task with
blocked visibility have been uncovered. A new gesture closely related to linguistic
planning activities, apparently never described before, has been presented and named
the palm-down flap. Moreover, cases of the gestural lateralization of different linguistic
functions, specifically the representational and the planning or self-directional functions,
have been highlighted. The themes brought back to the fore in these pages clearly
question the reliability of intentionality as the only feature for the definition of
language and communication. Moreover, the lateralization observed here is deemed to
be relevant to the assessment of a possible involvement of the right hemisphere in
language production, consistent with McNeill's (1992) hypothesis. Indeed, further
research is needed in order to judge to what extent the right hemisphere is actually
involved in the organization and coherence of linguistic perception and performance.
Nevertheless, the ideas and data presented in these pages will hopefully be a source of
inspiration for such further research, whether it be neurological or observational.

10. The Importance of Gesture and Other Non-Verbal Cues in Human-Machine Interaction: Applications



This chapter considers open questions in human-computer interaction and weighs the
importance of current knowledge on non-verbal communication and gesture as applied
to Embodied Conversational Agents [ECAs] and robotics. Studies of the
synchronisation of speech and non-verbal cues in order to create a more trustable
agent are also proposed, with some suggestions for future research. The importance of
gesture and other non-verbal cues in human-computer interaction has been taken into
account in several existing studies. Research and development of ECAs (see e.g.
Cassell et al. 1999, 2001; Hartman, Mancini and Pelachaud, 2006; Boukricha and
Wachsmuth 2011) and robots (see e.g. Vernon, von Hofsten and Fadiga, 2011; Breazel
et al., 2008) has also brought to light some interesting findings on topics that had,
perhaps surprisingly, previously been considered to be fully addressed.
The relationship between gesture and speech, for instance, was claimed to be
trivial until Cassell et al.'s work on the creation of an ECA proved that the then-current
state of knowledge did not account for the number of gestures usually produced with a
speech utterance, nor for the question of what produces a gesture (see for these topics
Cassell, 2004). ECAs programmed with prior knowledge tended to produce one
gesture with one lexical entry; as a result, agents tended to gesticulate too much in
interaction with native speakers.
Other questions include the synchronisation between speech and different non-
verbal cues (Rossini, 2011) and the socio-pragmatic influences on the occurrence of
several instances of the non-verbal repertoire (Rossini, 2005; 2011). We will here
analyse the behaviour of some ECAs and robots in order to suggest improvements in
the trustability and reliability of these agents for the final user.

10.1. State of the Art

A good number of ECAs and robots have been designed and implemented thus far,
with interesting results for both human-machine interaction and human-human
interaction.
As for ECAs, the M.I.T. MediaLab's Real Estate Agent (REA, see Cassell et al.
1999; 2001) and the IUT de Montreuil's GRETA (Hartman, Mancini, Pelachaud, 2006),
together with the ECA Max and its emotional counterpart EMMA (Boukricha and
Wachsmuth 2011), are probably the best-known ones. Among robots, the most
widespread and best-known ones are the iCub (see e.g. Vernon, Metta, Sandini, 2007;
Vernon, von Hofsten and Fadiga, 2011) and the MIT MediaLab social robot Nexi
(Breazel et al., 2008).
The architecture of robots and ECAs is complex enough to pose two basic and
opposite questions: one has to do with the capability for on-line response by the agent,
a capacity that normally requires a light system; the other has to do with the
reliability and trustability of the response of the agent, something that requires, on the
contrary, a highly descriptive heavy system.
Agents are normally programmed in the C/C++ language and are composed of two
main sub-systems: the parser, which allows for the recognition of the user's speech,
gestures, and sometimes facial expressions, and the planner, which allows for a response
to the user by the agent.
Normally, the parser for the agents also requires a user model, or a cognitive and
behavioural model of the final user, as well as some sort of emotional intelligence to
allow for emotional recognition and emotional simulation. While robot hardware and
middleware are complex, and require high engineering skills to address the creation of
mobile arms and legs, contact and impedance sensors to allow for grabbing without
breaking, movable joints for more flexible kinesics, and movable eyes for better
recognition and appreciation of the environment, ECAs are usually rendered in
MP4 video chunks.
Despite the fact that ECAs do not pose complex engineering problems related to
manipulation and interaction with the environment, their implementation is far from
trivial. We will examine some examples of behavioural features implemented in some
robots and ECAs without considering in detail the technical side of implementation;
afterwards, we will discuss the current state of development and propose some changes
for the sake of trustability and behavioural naturalness of the agents.

10.1.1. Architecture of ECAs

As already stated, ECAs are less demanding on the hardware side of the
implementation. Nevertheless, their architecture is particularly refined and leads to
interesting responsive results. Figure 52 shows a common architecture for an Embodied
Conversational Agent.

Figure 52: Software of an ECA (Cassell, Vilhjálmsson, and Bickmore, 2001: 479).

As may be seen, the basic structure of a conversational agent is a program
connecting different modules, or separate subprograms that are responsible for separate
agent functions, that are linked together into a self-organized system. The knowledge
base, or the set of pre-determined information about the world and the user (this latter
section is often called the user model) that the agent needs to have in order to analyse
the information derived from microphone and video camera sources, is linked to the
discourse model that constitutes the source for the language parser (in Figure 52, the
Language Tagging module).
The Knowledge Base module is also linked to a Behaviour Generation module,
that selects the proper behavioural response of the agent to the information present in
the user model. The Behavioural Planner sends information to the Behavioural
Scheduling module and synchronises the behavioural response with the speech
response of the agent. Of course, the speech response also requires a Speech
Parsing module, a Speech Planning module, and a Speech Generation module that are
not shown in Figure 52. Both the Speech Parsing and the Speech Planning module are
linked to the Knowledge Base module. Normally, Speech Generation and Behavioural
Scheduling are synchronized with each other.
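
The flow of information between the modules described above can be illustrated with a minimal sketch. It is not the implementation of REA or of any other agent: the class names, the "greeting" intent label, and the canned responses are assumptions introduced only to show how a knowledge base, speech planning, behaviour generation, and scheduling might be wired together so that speech and behaviour leave the system as one synchronised unit.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Pre-determined information about the world and the user (user model)."""
    world: dict = field(default_factory=dict)
    user_model: dict = field(default_factory=dict)


@dataclass
class ScheduledOutput:
    """A speech chunk paired with the behaviour to be performed with it."""
    speech: str
    behaviour: str


class ToyAgent:
    """Hypothetical agent: every module is reduced to a single method."""

    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def parse(self, utterance: str) -> str:
        # Speech parsing: reduce the input to a coarse, made-up intent label.
        return "greeting" if "hello" in utterance.lower() else "other"

    def plan_speech(self, intent: str) -> str:
        # Speech planning consults the knowledge base (here: the user's name).
        name = self.kb.user_model.get("name", "there")
        if intent == "greeting":
            return f"Hello, {name}!"
        return "Could you rephrase that?"

    def generate_behaviour(self, intent: str) -> str:
        # Behaviour generation selects a non-verbal response for the same intent.
        return "wave" if intent == "greeting" else "head_tilt"

    def respond(self, utterance: str) -> ScheduledOutput:
        # Behaviour scheduling: speech and behaviour are emitted together,
        # so that they stay synchronised.
        intent = self.parse(utterance)
        return ScheduledOutput(self.plan_speech(intent),
                               self.generate_behaviour(intent))
```

A call such as `ToyAgent(KnowledgeBase(user_model={"name": "Ada"})).respond("Hello agent")` returns one speech string and one behaviour label in a single object; a real agent would replace each method with a full module.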

10.1.2. Architecture of a Robot

Even though robots are often regarded by possible final users as a futuristic topic, the
literature on robotics is extensive and goes back to the Seventies, and the topic is
currently attracting the attention of scholars with diverse backgrounds. Although it will
be impossible to present all of the studies and results, herein we will take into account
some notable examples that are particularly interesting and call for further research in
verbal and non-verbal communication.
The architecture of robots is usually more complex than that presented in ECAs,
although the programming language is usually the same, i.e., C/C++. It requires that the
agent gauge information from the outer world by means of its sensors, analyse it
through the Knowledge Base module, plan a response, and execute it (see e.g. Ishiguro
et al., 1999 for a review). Architectures are typically function-based (Nilsson, 1984),
or behaviour-based (Brooks, 1991). Function-based architectures are commonly
linear and are composed of function modules that operate linearly, while
behaviour-based architectures are composed of response modules that react to the
environment without requiring a planning stage. Because both types of architecture
have their drawbacks and strong points, hybrid architectures such as the
deliberative/reactive autonomous robot architecture (Arkin, 1998; but see also Oka,
Inaba, and Inoue, 1997) are commonly found. It has been observed, in fact, that
function-based architectures are slow, because any action of the agent is the result of
planning that must happen beforehand. Behaviour-based architectures, in turn, are
composed of automatic responses to the input received via the sensors, but deliberate
action by the virtual agent cannot take place without function modules. For this reason,
hybrid deliberative/reactive architectures are commonly used nowadays, in order to combine
the response readiness typical of a behaviour-based architecture with the deliberative
behaviour that results from function-based architectures.
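
The contrast between the two layers can be made concrete with a toy controller. This is only a sketch under stated assumptions: the class name, the reflex table, and the three-step placeholder plan are hypothetical and do not reproduce Arkin's architecture or any robot mentioned here. It shows a planned sequence (function-based layer) being pre-empted by an immediate reflex (behaviour-based layer) and then resumed.

```python
from collections import deque


class HybridController:
    """Toy hybrid controller: reactive rules pre-empt a deliberative plan."""

    def __init__(self, reflexes):
        # reflexes: mapping from a sensor event to an immediate action
        self.reflexes = dict(reflexes)
        self.plan = deque()

    def deliberate(self, goal):
        # Function-based layer: the whole action sequence is planned beforehand.
        # The three-step plan is a placeholder, not a real planner.
        self.plan = deque([f"locate {goal}", f"approach {goal}", f"grasp {goal}"])

    def step(self, sensor_event=None):
        # Behaviour-based layer: a matching reflex fires without any planning.
        if sensor_event in self.reflexes:
            return self.reflexes[sensor_event]
        # Otherwise continue with the deliberate plan, if one is pending.
        if self.plan:
            return self.plan.popleft()
        return "idle"
```

With `c = HybridController({"obstacle": "stop"})` and `c.deliberate("cup")`, successive calls to `c.step()` walk through the plan, while `c.step("obstacle")` answers with the reflex and leaves the remaining plan untouched.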
Moreover, some robots also show Emotional Intelligence [EI], or a module that is
responsible for the internal state of the robot: this module can be either juxtaposed to
other modules in the architecture in the creation of a social robot, or be included in the
decision making module of a deliberative-reactive system, as in the robot Maggie
(Malfaz et al. 2011), and in the iCub (Vernon, von Hofsten and Fadiga, 2011). The
overall architecture of the robot Maggie is shown in Figure 53, while that of the iCub is
shown in Figure 54. As is visible, the procedure includes emotional states and drives in
the decision making of the robot in order to deliver a self-organized system. In
particular, Maggie has drives for boredom, loneliness, and energy. Motivations derived
from the drives are thus either social, recreational, or survival ones.
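
As a rough illustration of how drives might feed motivations, the sketch below picks the motivation attached to the strongest drive. The pairing of each drive with one motivation, the intensity scale, and the threshold are assumptions made for the example; they are not taken from Malfaz et al. (2011).

```python
# Hypothetical pairing of each drive with the motivation it feeds.
DRIVE_TO_MOTIVATION = {
    "loneliness": "social",
    "boredom": "recreational",
    "energy": "survival",
}


def dominant_motivation(drives, threshold=0.5):
    """Return the motivation of the strongest drive, or None if all are weak.

    `drives` maps a drive name to an intensity in [0, 1]; the threshold is an
    arbitrary illustrative value.
    """
    name, level = max(drives.items(), key=lambda item: item[1])
    if level < threshold:
        return None
    return DRIVE_TO_MOTIVATION[name]
```

For instance, a robot whose boredom drive is the strongest and above threshold would be steered towards a recreational motivation.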
The iCub has a comparatively heavier system that needs to be run by several
computers in parallel and thus requires middleware. The most salient feature of the
iCub is the distinction between the endogenous and exogenous factors that control the
Action Selection module via the Affective State.

Figure 53: architecture of the robot Maggie, with focus on the decision-making system (Malfaz et al., 2011:

The same architecture is implied for the social robot Nexi. Nevertheless, despite
the fact that all new generation robots need an EI to work properly, the expressive side
of the agents differs considerably.

Figure 54: Architecture of the iCub (Vernon, von Hofsten, Fadiga, 2011: 126)

10.2. Expressions and Gestures in Artificial Agents

As already stated, despite the fact that conversational agents and, most notably,
robots do have an EI, their expression of emotions differs considerably. While Maggie
does not show emotional expression, both iCub and Nexi have some ways of
expressing emotions. Figure 55 shows the expressive means of Nexi: the face has
mobile eyebrows that are used for the expression of emotions. The somatic features of
the robot are deliberately far from a human-like appearance, in order to avoid the
so-called "uncanny valley" phenomenon.

Figure 55: expressivity of the MIT social robot Nexi


The iCub (Figure 56) also has stylized expressive features that are obtained through the
mimicry of the face, while Maggie has no facial expressions. As for ECAs, the most
accurate facial mimicry can be seen in GRETA (see Figure 57), although MAX and
EMMA also display features of facial expression, and EMMA also offers emotion recognition.

Figure 56: mimicry in the iCub

Figure 57: facial mimicry in GRETA (Mancini, Bresin, Pelachaud, 2007: 1839).

The use of limbs in robots varies considerably from the use of limbs in ECAs: it is
more likely that ECAs show gestures synchronized with speech, while robots are
programmed to achieve fine object manipulation and navigation in the environment,
although Nexi does show some gestures synchronized with speech. The conversational
agent REA also shows posture shifts that are synchronized with topic shifts.
When a set of gestural, expressive, and, in the applicable cases, manipulative
features needs to be synchronized with speech, a conflict between the efficiency of the
system and the trustability of the agent for the final user arises (see Rossini, 2011). We will
here address some common problems of synchronisation between speech and other
non-verbal cues in agents and the importance of this synchronisation for the
naturalness of the agent, with a special focus on the synchronisation between
behaviour and speech on the one hand, and determining the socio-cultural
appropriateness of the gesture and expression selected by the system on the other hand.

10.3. Patterns of Synchronisation of Non-Verbal Cues and Speech in Agents: Analysis of Common Problems

Despite the interesting architecture and striking responsiveness of Embodied
Conversational Agents and robots, whenever these are tested with the intended final
users the results are usually discouraging (see Rossini 2011 for ECAs). Participants
with no specific knowledge of programming and computational linguistics testing
ECAs and robots usually find them unnatural and, if asked (Cassell 2004), strongly
prefer to interact with a touch screen.
This negative impression is most likely caused by a combination of factors, such as,
for instance, the synthesized voice, which often has trouble duplicating natural prosody
and intonation. On the other hand, the basic generated non-verbal traits are often not
natural, both in terms of the graphic quality of the MPEG video stream, and for specific
problems of gestural and expressive production. In this section we shall focus on the
behaviour of GRETA and Nexi, as recorded in state-of-the-art clips of the systems that
are easily available online. The analysis of GRETA's and Nexi's behaviour will be taken
as a starting point for suggesting improvements in verbal and non-verbal synthesis.
While other robots are implemented to operate directly on the world, Nexi also
shows some gestures in its online production. Here, we will analyse the
synchronisation between speech synthesis and gesture performance, in a video clip
retrievable online of the first test of the agent. Figure 58 shows a transcription of the
clip. Because of the complexity of the phenomena involved, we will here transcribe
kinetic units with curly brackets and other non-verbal cues (such as head movements
and expressions) with square brackets. The boldface will highlight the part of speech
that synchronises with a gesture stroke.

[[Hello!]// {My name is [Ne]xi] and I'm an [MDS] robot} // MDS stands for [mobile] [dexterous], [social]//
Head bends slightly towards left
Eyebrows flick Eyebrows flick (left) Eyebrows flick repeated 3 times
(right, left, and both respectively)
pointing gesture towards self (deictic)
[mobile] because {I can move around //} [dexterou{s] because I can use my hands to touch things //}
Left eyebrows flick right eyebrows flick
Pointing gesture with the head bending towards left
Iconic gesture with right hand (precision grip), with
superimposed beats
Figure 58: Nexi's synchronisation between speech, gesture, and expressions

Nexi shows an interesting complexity in the production of both expressions and
hand movements. In greeting, Nexi bends the head towards the left in an informal
salutation. The robot also uses eyebrow flicks that are meant to underline the relevant
parts of the speech. The first problem, though, is a hyper-analysis of the movement of
the eyebrows, leading to the agent using just the right or left eyebrow to underline
different keywords. This behaviour is simply not recorded in human nonverbal
behaviour. Besides, the indexical reference created with the use of each eyebrow (the
right one flicks with the word "mobile" and the left one is active with the word
"dexterous") is inverted soon after the pause and causes an apparent indexical error,
which is rare in humans (see Cassell, McNeill and McCullough, 1999).
The hand gestures are slow, and seem to synchronize badly with both the
coexpressive part in speech and the prosodic emphasis. While the prosodic peak is on
"my name", the stroke of the deictic gesture that starts correctly with "my name" is
synchronized with "and I'm an", which is an irrelevant part of the message in
terms of both prosody and semantics. Other gestures, such as the pointing of the head and
the precision grip, are nevertheless well synchronised.
If we move to GRETA, we will see that the behaviour is not completely different
from that of Nexi. We will here report a brief segment of GRETA's performance
(Figure 59), or the interaction with Mr. Smith in an attempt to simulate the normal
doctor-patient dialogue. The parameters for gesture analysis used here are explained in
detail in a 2004 article (Rossini, 2004b) while the coding technique adopted here is that
provided in McNeill (1992).

Figure 59: transcription of a chunk of GRETA's production. Square brackets show the parts of speech with
which the non-verbal cues are synchronized (Rossini, 2011: 99).

As can be seen in the transcription, GRETA greets her virtual patient with an
informal upraising of the hand, palm flat and away from body. This emblem gesture
(McNeill 1992) is used in informal and relaxed occasions and can be felt to be
inappropriate in the given formal context, as it may involve a violation of pragmatic
expectations on the part of the human interactant, unless what follows is an informal
interaction. It has already been shown that gestures obey the same
socio-pragmatic rules as speech (Rossini, 2005), and this means that both the speech
generation module and the gesture generation module need to be equally sensitive to
these rules.
Another interesting problem with the performance in Figure 59 is that the
nonverbal expressions have an excessively brief duration, apart from being incoherent
within the overall context and in coordination. Soon after having uttered the chunk
"Good morning Mr. Smith" with a synchronized "hello" gesture, GRETA's face adapts
to the next chunk and performs a sad expression with a lowering and slight frowning of
her eyebrows. The speech that follows contains in fact a keyword for emotion, that is,
the adjective "sorry". The sad expression covers the whole chunk "I am sorry to tell
you" but disappears after a brief silent pause (the pause in question is also misplaced
and due to chunking generation). The rest of the sentence ("that you have been
diagnosed") begins with a neutral expression, while the syllable "gno" of "diagnosed"
is synchronized with an eyebrow flick (Eibl-Eibesfeldt, 1972) that is usually a signal
of joy or openness to socialization, and is thus completely incoherent with the context.
Moreover, if we analyse the gesture performed by GRETA more in depth (see
Figure 60), we see a rather unnatural performance due to an excessive rigidity of the
hand. Such a rigidity is more likely to be comparable to a sign language production,
while co-verbal gestures have been found to be performed with a sloppy hand (Kita,
van Gijn, van der Hulst, in progress). Also, the gesture in question is performed at a
higher locus (Rossini, 2004b) than normal and it seems to occupy the zero space for
Sign Language.

Figure 60: "hello" gesture in GRETA's performance. As can be seen, the hand performing the gesture is
rigid and completely spread, as if performing a sign language token.

10.4. Proposal for a More Natural Agent

The problems highlighted so far are basically due to the chunking generation on the
one hand and to probabilistic rules based on fuzzy logic for the selection of a
synchronized gesture or expression, on the other hand. It seems, in fact, that the socio-
linguistic variation of gestural use (Rossini, 2005) is completely disregarded, probably
due to an operational problem: an even more sophisticated architecture would
excessively slow down the system and cause a higher number of breakdowns.
Nevertheless, a definite improvement should be observed with a different
architecture relying less on Fuzzy Logic and a review of the lexicon for the generation
of gestures and expressions. The lexicon in question should allow for a more thorough
description of context-driven and socio-linguistic variation of gestures (Rossini,
More precisely, co-verbal gestures should be marked up according to their normal
occurrence in informal versus formal contexts, and related to registers in speech, thus
shifting from a fuzzy logic program to a mixed system relying on both fuzzy logic and
rule-based operations. A special focus should also be placed on the often disregarded
gesture syntax, or the way gestures combine into kinetic utterances (Rossini, 2004a,
2004b; Gibbon, to appear). A separate problem is the unsatisfactory synchronisation of
facial expressions and speech.
Because the synchronisation between facial expressions and speech follows a
completely different timing when compared to hand gestures and speech
synchronisation (see e.g., Rossini, 2009), this issue can be resolved by allowing for a
distinct behavior planner exclusively devoted to facial expressions, with its own
synchronisation timings with respect to speech.
An alternative model is proposed in Figure 61. The ideal architecture for a more
natural agent is based on the separation of a discourse planner, that is devoted to
speech and gesture planning and word-gesture timing, and an expression planner, that
is exclusively devoted to facial expressions. The Expression Planner is still linked with
the discourse planner, but follows a different timing for the synchronisation of
expressions and speech. This planner will select the appropriate expressions by means
of an analysis of keywords in the planned discourse, if any, and hold the selected
expression for the total duration of the utterance.
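
The behaviour expected of such an expression planner can be sketched in a few lines. The keyword vocabulary, the function names, and the chunk representation below are assumptions made for illustration; the sketch only shows the two properties named above, namely selection by keyword and holding the selected expression over the whole utterance rather than re-selecting it at each chunk.

```python
# Hypothetical keyword-to-expression vocabulary; the entries are illustrative.
EXPRESSION_KEYWORDS = {
    "sorry": "sad",
    "unfortunately": "sad",
    "glad": "happy",
    "congratulations": "happy",
}


def plan_expression(utterance, default="neutral"):
    """Pick one expression for the utterance from its first emotion keyword."""
    for word in utterance.lower().split():
        token = word.strip(".,;:!?\"'")
        if token in EXPRESSION_KEYWORDS:
            return EXPRESSION_KEYWORDS[token]
    return default


def schedule_expression(utterance, chunks):
    """Hold the selected expression over every chunk of the utterance.

    `chunks` stands for the speech chunks produced by the discourse planner;
    the expression is not re-selected at chunk boundaries.
    """
    expression = plan_expression(utterance)
    return [(chunk, expression) for chunk in chunks]
```

Applied to the two chunks "I am sorry to tell you" and "that you have been diagnosed", the sketch keeps the sad expression on both, instead of dropping back to a neutral face after the pause.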

Figure 61: proposal for a new architecture

Of course, since the expression planner is not intended to also decode the
emotional states of the interlocutor, it will have a separate vocabulary of expressions
among which one will be selected for output. If the system must be able to cope with
expression decoding, a separate module for emotional appraisal should be linked to the
microphone and the camera. As already stated, the gestures and expressions in the
agent's response should also be planned according to the correct pragmatic rules of
social interaction and avoid complete reliance on random selection.
The present proposal does not address the implied slow-down of such a system.
If the modifications proposed here prove computationally feasible, however, higher
reliability and usability would certainly result from them.
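
The mixed selection strategy argued for in this section, with rules for socio-pragmatic appropriateness applied before any probabilistic choice, might look as follows. The lexicon entries, register labels, and weights are invented for the example, and a plain weighted random choice stands in for the fuzzy-logic component, which is not modelled here.

```python
import random

# Hypothetical gesture lexicon: each entry is marked up with the registers in
# which the gesture normally occurs and with an illustrative selection weight.
GESTURE_LEXICON = {
    "greeting": [
        {"gesture": "raised_open_hand", "registers": {"informal"}, "weight": 0.7},
        {"gesture": "head_nod", "registers": {"formal", "informal"}, "weight": 0.3},
        {"gesture": "handshake_offer", "registers": {"formal"}, "weight": 0.6},
    ],
}


def select_gesture(function, register, rng=random):
    """Rule-based filter by register, then weighted choice among survivors."""
    candidates = [
        entry for entry in GESTURE_LEXICON.get(function, [])
        if register in entry["registers"]
    ]
    if not candidates:
        return None  # no appropriate gesture is available for this context
    weights = [entry["weight"] for entry in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]["gesture"]
```

In a formal doctor-patient greeting, `select_gesture("greeting", "formal")` can never return the informal raised open hand, because the rule-based filter removes it before the random step.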

Summary

We have addressed here some common problems with gestures and non-verbal cues in
conversational agents and robots, with particular attention to the problems of
synchronisation between non-verbal cues and speech on the one hand, and the socio-
pragmatic selection of those cues in formal versus informal interactions on the other
hand. A review of some conversational agents and robots has been offered, with an
analysis of some instances of their online performance. The shortcomings highlighted
by this analysis have been exploited in order to propose a different architecture for the
generation of appropriate behaviour, in order to encourage further research in this
fast-growing field.


This book has focused on the main questions about gesture and its function, although it
is not meant to be exhaustive. In particular, gesture has been defined as a relevant part
of non-verbal communication, non-verbal communication being, in its turn, assumed to
be the intentional part of non-verbal behavior. We have also seen that gesture is
perhaps the only subset of non-verbal communication provided with some form of
lexical access.
In Chapter 4, gesture is claimed to share the same psychological origin as speech,
gesture being communicative to the same extent that speech is, since they can be
interpreted as integral parts of a wider phenomenon named audio-visual
communication (see Chapter 8). Audio-visual communication is claimed to be a
relevant part of human language. But if gesture is integral to language, rather than a
mere paralinguistic phenomenon, the classical definition of language as
an exclusively spoken system needs to be revised: in this regard, Chapter 6 attempts a
new interpretation of human language as made up of speech and gesture, which are
subsets of the language system and have complementary functions within the economy
of human communication. Moreover, gesture has been shown to have a number of
linguistic properties that have classically been defined as distinctive of human speech,
such as, for instance and to some extent, morphology and recursion. Of course, such
phenomena are observable in gesture itself in a more diffuse fashion if compared to
speech. Still, the simplefact that a structural analysis of the bi-modal system can be
attempted should encourage further speculation on the relations between gestures, non-
verbal phenomena, and the speech signal in order to come to a more detailed model of
the Audio-visual communication system.
Furthermore, gesture has been shown to vary according to the classical socio-
linguistic parameters, not only from a semantic perspective, but also in its intrinsic
morphology, which provides further evidence for the claim that gesture is part of a
system. This finding also provides an interpretation of the morphology of gesture. In
particular, the description of the morphology of gesture and its division between
intrinsic and extrinsic morphology allowed for the creation of a set of parameters for
the interdisciplinary study of gestural phenomena that are presented in Chapter 8.
These parameters have been successfully applied to the analysis of gesture in both of
the experiments presented in this book.
Gesture has also been analysed as a prototype category, in order to provide a
solution to the still debated question of its communicativeness. In particular, emblems
have been claimed to be the core of the category, their intentionality, awareness and
abstraction being highest, while their extension is clearly definable. This interpretation
helps to identify a particular sub-class of the gesture category, i.e., beats, which are not
intentional in themselves. This theory does not undermine the claim that gesture is
communicative, for it has been observed that the less aware and intentional a particular
subset of gesture is, the more the presence of speech is mandatory. On the contrary,
this analysis helps to underline the deep interrelation between speech and gesture
within human communication.
164 Conclusions

The application of prototype theory to the analysis of gesture has also been tested
by means of an experiment on five Italian subjects, the results being consistent with the
hypothesis about the high intentionality of emblems versus the low intentionality of
beats.
Evidence has been provided for the main claim of this book, namely, that gesture
and speech share the same cognitive-psychological origin: new pieces of evidence for
this claim have been provided by means of experiments on speech/gesture
synchronisation in both multi-tasking activities (Chapter 6) and congenitally deaf,
orally educated subjects (Chapter 7). The analysis of the data collected within these
experiments shows that the synchronisation pattern between speech and gesture is not a
learnt one, but, rather, is inborn and thus is due to the human neuro-motor system. For
this reason, gesture and speech are claimed to be overt products of the same inner
system. Therefore, in Chapter 7 I attempt to explain the physiological evolution that
has led to the use of gestures with communicative intent. The hypothesis is that,
ultimately, we use gestures because it is unavoidable. Still, this statement is not
intended as a claim about the non-communicativeness of gesture but, rather, as a
hypothesis of the phylogenetic emergence of gesture as communication.
In Chapter 9 language is interpreted in its self-directional function. Questions are
raised and suggestions for further field research are provided: data from experiments
with blocked visibility outline phenomena such as the resilience of communicative
gestures, a higher number of planning gestures, such as the palm-down-flap that is
described and discussed here, and the gestural lateralization of diverse linguistic
functions, namely, the referential and self-directional ones. The data at hand suggest that
the usage of gesture in particular and language in a broader sense as a means of self-
orientation and self-control should be addressed in a more methodical way, without
disregarding the complex relationship existing between language and behavioral
Of course, the formal, structural and computational approach to audio-visual
communication attempted in these pages (Chapter 8), far from being exhaustive of the
topic, is rather meant to encourage speculation, with the hope of eventually designing a
theoretical linguistic model that is able to account for the complexity of human
Finally, I hope to have offered in Chapter 10 some suggestions for further research
on the applied side of this field of study, with special interest in the new perspectives
that Artificial Intelligence can provide in the study of human behaviour and human

ABERCROMBIE, D. 1968. Paralanguage. British Journal of Disorders of Communication,
ACREDOLO, L. P., GOODWYN, S. W. 1988. Symbolic gesturing in normal infants. Child
Development, 59: 450-466.
ALBANO LEONI, F. 2009. Dei suoni e dei sensi. Il volto fonico delle parole. Il Mulino:
ALIBALI, M. W., HEATH, D. C. AND MYERS, H. J. 2001. Effects of visibility between
speaker and listener on gesture production: Some gestures are meant to be seen.
Journal of Memory and Language, 44: 169188.
ALLPORT, G. W. 1924. The study of undivided personality. Journal of Abnormal and
Societal Psychology, 34: 612-15.
ARBIB, M. A. 2002. The mirror system, imitation and the evolution of language. In
Nehaniv, C. and Dautenhahn, K. (Eds.), Imitation in Animals and Artifacts. Pp.
229- 280. MIT Press. Cambridge, MA
ARBIB, M. A. 2006. Action to Language via the Mirror Neuron System. Cambridge
University Press: Cambridge.
ARGYLE, M. 1972. The Psychology of Interpersonal Behavior. Penguin:
ARGYLE, M. 1988. Bodily Communication, Second Edition. Methuen and Co. Ltd:
ARGYLE, M. AND COOK, M. 1976. Gaze and Mutual Gaze. Cambridge University Press:
Cambridge and New York.
ARKIN, R. C. 1998. Behaviour-Based Robotics. M.I.T. Press: Cambridge, M.A.
ARMSTRONG, D. F., STOKOE W. C. AND WILCOX, S. 1995. Gesture and the Nature of
Language. Cambridge University Press: Cambridge and New York.
ARMSTRONG, D. F., AND KATZ, S. H. 1981. Brain laterality in signed and spoken
language: A synthetic theory of language use. Sign Language Studies, 33: 319-50.
AUSTIN, J. 1962. How to Do Things with Words. Harvard University Press: Cambridge,
Massachusetts .
ATTILI, G. AND RICCI BITTI, P. E. 1983. I gesti e i segni. La comunicazione non verbale
in psicologia e neurologia clinica e il linguaggio dei segni dei sordi. Bulzoni
Editore: Roma.
BAKHTIN, M.M. 1993. Toward a Philosophy of Mind. Translation and notes by V.
Liapunov, Ed. by M. Holquist and V. Liapunov. University of Texas Press: Austin,
BASHIR, A.S. AND SINGER, B.D. 1999. What are executive functions and self-regulation
and what do they have to do with language-learning disorders? Language, Speech,
and Hearing Services in Schools, July 1999, 30: 265-273.
BASSO, A., LUZZATTI, C. AND SPINNLER, H. 1980. Is ideomotor apraxia the outcome of
damage to well-defined regions of the left hemisphere? A neuropsychological
study of CT correlation. Journal of Neurology, Neurosurgery and Psychiatry, 43:
BATES, E. 1976. Language and Context. Academic: New York.
BATES, E., CAMAIONI, L., AND VOLTERRA, V. 1975. The Acquisition of Performatives
Prior to Speech. Merrill Palmer Quarterly, 21: 205-226.
166 References

Gesture to First Word: On Cognitive and Social Prerequisites. In Lewis, M. and
Rosenblum, L. A. (Eds.), Interaction, Conversation, and theDevelopment of
Language. Wiley: New York.
Emergence of Symbols: Cognition and Communication in Infancy. Academic: New
BATES, E., BRETHERTON, I., SHORE, C. AND MCNEW, S. 1983. Names, Gestures and
Objects: Symbolization in Infancy and Aphasia. In Nielson, K. (Ed.), Childrens
Language. Erlbaum: Hillsdale.
BAVELAS, J. B., CHOVIL, N., LAWRIE, D. A., AND WADE, A. 1992. Interactive gestures.
Discourse Processes, 15: 469-489.
BAVELAS, J. B., GERWIG, J., SUTTON, C., AND PREVOST, D. 2008. Gesturing on the
telephone: Independent effects of dialogue and visibility. Journal of Memory and
Language, 58: 495-520.
BEATTIE, G. 1978. Sequential temporal patterns of speech and gaze in dialogue.
Semiotics, 23: 29-52.
BEATTIE, G. 1980. The role of language production processes in the organization of
behavior in face-to-face interaction. In Butterworth, B. (Ed.), Language
Production, Vol. 1, pp. 69-107.
BIRDWHISTELL, R. L. 1952. Introduction to Kinetics. U. S. Department of State Foreign
Service Institute: Washington, D. C.
BLASS, T., FREEDMAN, N. AND STEINGART, I. 1974. Body Movement and Verbal
Encoding in the Congenitally Blind. In Perceptual and Motor Skills, 39: 279-293.
BLOOM, K. 1974. Eye Contact as a Setting Event for Infant Learning. Journal of
Experimental Child Psychology, 17: 250-263.
BLOOMFIELD, L. 1933. Language. Holt, Rinehart and Winston: New York.
BOCK, J. K. AND WARREN, R.K, 1985. Conceptual Accessibility and Syntactic Structure
in Sentence Formulation. Cognition, 21: 47-67.
BOCK, J. K. 1982. Toward a Cognitive Psychology of Syntax: Information Processing
Contributions to Sentence Formulation. Psychological Review, 89:1-47.
BOLINGER, D. 1946. Some Thoughts on Yep and Nope. American Speech, 21: 90-
BOLINGER, D. 1975. Aspects of Language. 2nd ed. Harcourt Brace and Jovanovich: New
BONGIOANNI, P., BUOIANO, G., AND MAGONI, M. 2002. Language impairments in
ALS/MND (Amyotrophic Lateral Sclerosis/Motor Neuron Disease). In
Proceedings European Society for Philosophy and Psychology Meeting 2002, pp.
20-21, Lyon, France.
BOUKRICHA, H. AND WACHSMUTH, I. 2011. Empathy-Based Emotional Alignment for a
Virtual Human: A Three-Step Approach. KI - Knstliche Intelligenz, Springer:
Berlin and Heidelberg. Online Open Source ISSN 1610-1987.
NARENDRAN, K. AND MCBEAN, J. 2008. Mobile, dexterous, social robots for
mobile manipulation and human-robot interaction. SIGGRAPH '08: ACM
SIGGRAPH 2008 new tech demos, New York, 2008.
BRESSEM, J. In progress. Recurrent form features in coverbal gestures. In Bressem, J.
and Ladewig, S. (Eds.), Hand made patterns. Recurrent forms and functions in
gestures. Planned for submission to Semiotica.
References 167

BROCA, P. 1861. Remarques sur le sige de la facult du langage articul; suivies d'une
observation d'aphmie. Bulletins Socit Anthropologique, 2: 235-238. [Remarks
on the seat of the faculty of articulate language, followed by an observation of
aphemia. In Von Bonin, G. (Ed.), Some papers on the cerebral cortex, pp. 49-72.
Charles C. Thomas Publisher: Springfield, Illinois.]
BRODMANN, K. 1909. Vergleichende lokalisationslehre der grosshirnrinde in ihren
prinzipien dargestellt auf grund des zellenbaues. Leipzig: Johann Ambrosius Barth
Verlag. English translation: Garey, L.J. (Ed.), 1999. Brodmanns localisation in
the cerebral cortex. Imperial College Press: London.
BROOKS, R. A. 1991. Intelligence Without Reason. Proceedings of 12th Int. Joint Conf.
on Artificial Intelligence, Sydney, Australia, August 1991, pp. 569-595.
BROWMAN, C. P. AND GOLDSTEIN, L. 1990. Gestural Structures: Distinctiveness,
phonological Processes, and Historical Change. In. Mattingly, I. G and Studdert-
Kennedy, M. (Eds.), Modularity and the Motor Theory of Speech Perception.
Laurence Erlbaum: Hillsdale.
BRUNER, J. S. 1975. The Ontogenesis of Speech Acts. In Journal of Child Language, 2:
BULL, P. E., AND CONNELLY, G. 1985. Body movement and emphasis in speech.
Journal of Nonverbal Behaviour, 9: 169187.
BUTTERWORTH, B. AND U. HADAR, 1998. Gesture, Speech, and Computational Stages:
A Reply to McNeill. Psychological Review, 96, 1: 168-174.
BYRNE, R. W. 2003. Imitation as behaviour parsing. The Philosophical Transactions of
the Royal Society. B, 358: 529536.
CACCIARI, C. 2001. Psicologia del linguaggio. Il Mulino: Bologna.
CALBRIS, G. 1985. Espace-Temps: Expression Gestuelle du Temps. Semiotica, 55: 43-
CALBRIS, G. 1990. Semiotics of French Gesture. Indiana University Press,
Coverbal gestures in Alzheimer's type dementia. Cortex, 1, 41: 535-46.
CARLOMAGNO, S AND CRISTILLI, C. 2006. Semantic attributes of iconic gestures in
fluent and non-fluent aphasic adults. Brain and Language, 99,1-2: 102-103
CASSELL, J. 1998. A Framework for Gesture Generation and Interpretation. In Cipolla,
R. and Pentland, A. (Eds.), Computer Vision in Human-Machine Interaction.
Cambridge University Press: Cambridge and New York.
CASSELL, J. 2005. Trading spaces: Gesture Morphology and Semantics in Humans and
Virtual Humans. Second ISGS Conference Interacting bodies. cole normale
suprieure Lettres et Sciences humaines Lyon - France, June 15-18.
DOUVILLE, B., PREVOST, S. AND STONE, M. 1994. Animated Conversation: Rule-
Based Generation of Facial Expression, Gesture and Spoken Intonation for
Multiple Conversational Agents. Proceedings of SIGGRAPH '94.
CASSELL, J., MCNEILL, D. AND MCCULLOUGH, K.-E. 1999. Speech-Gesture
Mismatches: Evidence for One Underlying Representation of linguistic and
Nonlinguistic Information. Pragmatics and Cognition, 7,1: 1-33.
CASSELL, J. AND PREVOST, S. 1996. Distribution of Semantic Features across Speech
and Gesture by Humans and Machines. Proceedings of the Workshop on the
Integration of Gesture in Language and Speech.
168 References

VILHJLMSSON AND YAN, A. 1999. Embodiment in Conversational Interfaces: Rea.
Proceedings of the CHI 1999 Conference, Pittsburgh, PA, pp. 520527.
CASSELL, J. AND STONE, M. 2000. Coordination and Context-Dependence in the
Generation of Embodied Conversation. In Proceedings of the International
Natural Language Generation Conference, pp. 171-178. June 12-16, Mitzpe
Ramon, Israel.
Expression Animation Toolkit. Proceedings of SIGGRAPH '01, pp. 477-486.
August 12-17, Los Angeles, CA.
and Generating Posture from Discourse Structure in Embodied Conversational
Agents. Workshop on Representing, Annotating, and Evaluating Non-Verbal and
Verbal Communicative Acts to Achieve Contextual Embodied Agents, Autonomous
Agents 2001 Conference, Montreal, Quebec, May 29.
CHOMSKY, N. 1957. Syntactic Structures. Mouton: The Hague.
CHOMSKY, N. AND MILLER, G. A. 1963. Introduction to the Formal Analysis of Natural
Languages. In Luce, R. D., Bush, R. R. and Galanter, E. (Eds.), Handbook of
Mathematical Psychology, vol. 2. Wiley: New York.
CICONE, M., WAPNER, W., FOLDI, N., ZURIF, E. AND GARDNER, H. 1979. The Relation
between Gesture and Language in Aphasic Communication. Brain and Language,
8: 324-349.
CIENKI, A. 2005. Image schemas and gesture. In Hampe, B. (Ed.), From perception to
meaning: Image schemas in cognitive linguistics (Vol. 29). Mouton de Gruyter:
CIMATTI, F.1998. Mente e linguaggio negli animali. Carocci: Roma.
COHEN, A.A. AND HARRISON, R. P. 1973. Intentionality in the use of hand illustrators in
face-to-face communication situations. Journal of Personality and Social
Psychology, 28: 276-279.
CONDILLAC, E. B. DE 1756/1971 An essay on the origin of human knowledge : Being a
supplement of Mr. Lockes essay on the human understanding. Translated by
Thomas Nugent. Scholars Reprints and Facsimiles:Gainesville, Florida.
VON CRANACH, M. AND VINE, I. 1973. Social Communication and Movement.
Academic press: London.
CONDON, W. S. AND OGSTON, W. D. 1966. Sound Film Analysis of Normal and
Pathological Behaviour Patterns. Journal of Nervous and Mental Disease, CXLII:
CONDON, W. S. AND OGSTON, W. D. 1971. Speech and body motion synchrony of the
speaker-hearer. In Horton, D. H. and Jenkins, J. J. (Eds.), The perception of
language, pp. 150-184. Academic Press: New York.
COOLEY, C. H. 1902 . Human Nature and the Social Order. Scribners: New York.
CORBALLIS, M. C. 2002. From hand to mouth: The gestural origins of language.
Princeton University Press: Princeton, NJ.
DAMASIO, A.R 2001. Neural correlates of naming actions and naming spatial
relations. NeuroImage, 13: 1053-1064.
DARWIN, C. 1872. Expression of emotions in man and animals. Appleton, London.
References 169

DAVIS, J. W. AND VAKS, S. 2001. A Perceptual User Interface for Recognizing Head
Gesture Acknowledgements. ACM Workshop on Perceptual User Interfaces,
Orlando, Florida.
DEKKER, R. AND KOOLE, F. D. 1992. Visually Impaired Childrens Visual
Characteristics and Intelligence. Developmental Medicine and Child Neurology,
DE LAGUNA, G. A. 1927. Speech: Its Function and Development. Yale University
Press: New Haven.
DE MAURO, T. 1982. Minisemantica delle lingue verbali e non verbali. Laterza: Roma-
DE RENZI, E. 1985. Methods of limb apraxia examination and their bearing on the
interpretation of the disorder. In Roy, E. A. (Ed.), Neuropsychological Studies of
Apraxia and Related Disorders, pp. 45-62.Elsevier Science Publishers B. V.: New
DE RUITER, J. P. 2000. The production of gesture and speech. In McNeill, D. (Ed.),
Language and Gesture, pp. 284-311.Cambridge University Press: Cambridge.
DIDEROT, D. 1751/1916. Letter on the deaf and Dumb. Translated and edited by H.
Jourdain in Diderots philosophical works. Open Court Publishing Company:
DITTMANN, A. T. 1972. The body movement-speech rhythm relationship as a cue to
speech encoding. In Siegman, A. and Pope, B. (Eds.), Studies in Dyadic
Communication. Pergamon Press: New York.
DITTMANN, A. T., AND LLEWELYN, L. G. 1969. Body movement and speech rhythm in
social conversation. Journal of Personality and Social Psychology, 23: 283-292.
DODORICO, L. AND LEVORATO, M. C. 1994. Social and Cognitive Determinants of
Mutual Gaze Between mother and Infant. In Volterra, V. and Erting, C. J. (Eds.),
From Gesture to Language in Hearing and Deaf Children. Gallaudet University
Press: Washington, DC.
DORE, J. A. 1974. A Pragmatic Description of Early Language Development. In
Journal of Psycholinguistic Research, 3: 343-350.
DUFFY, R. J., DUFFY, J. R. AND MERCAITIS, P. A. 1984. Comparison of the
Performances of a Fluent and a Nonfluent Aphasic on a Pantomimic Referential
Task. Brain and Language, 21: 260-273.
EDELMAN, G. M. 1987. Neural Darwinism: Theory of Neuronal Group Selection. Basic
Books: New York.
EDELMAN, G. M. 1989. The Remembered Present: A Biological Theory of
Consciousness. Basic Books: New York.
EDELMAN, G. M. 2006. Second Nature: Brain Science and Human Knowledge. Yale
University Press.
EFRON, D. 1941. Gesture and Environment. Kings Crown Press: New York.
EIBL-EIBESFELDT, I. 1949. ber das Vorkommen von Schreckstoffen bei
Erdkrtenquappen. Experientia, 5: 236.
EIBL-EIBESFELDT, I. 1967. Concepts of Ethology and their Significance for the Study of
Human Behaviour. In Stevenson, H. W. (Ed.), Early Behaviour, Comparative and
Development Approaches. Wiley: New York.
EIBL-EIBESFELDT, I. 1949. Ethology: The Biology of Behaviour. Holt, Rinehart and
Winston: New York.
170 References

EIBL-EIBESFELDT, I. 1972. Similarities and differences between cultures in expressive

movements. In Hinde, A. (Ed.), Non-verbal Communication, pp. 297312.
Cambridge University Press: Cambridge.
EKMAN, P. AND FRIESEN, W. V. 1969. The repertoire of nonverbal behaviour:
Categories, origins, usage, and coding. Semiotica, 1: 49- 98.
EMMORREY, K. AND CASEY, S. 2001 Gesture, thought and spatial language. Gesture,
1:1: 3550.
morphometric analysis of auditory brain regions in congenitally deaf adults.
Proceedings of the National Academy of Science of the United States of America,
Aug 19; 100(17): 10049-54.
FEYEREISEN, P. 1991. Brain Pathology, Lateralization and Nonverbal Behavior. In
Feldman, S. and Rim, B. Fundamentals of Nonverbal Behavior. Cambridge
University Press: Cambridge.
FEYEREISEN, P. 1991. Communicative behavior in aphasia. Aphasiology, 5: 323-333.
FEYEREISEN, P. AND SERON, X. 1982. Nonverbal Communication and Aphasia: a
Review. II. Expression. Brain and Language,16: 213-236.
FEYEREISEN, P., VAN DE WIELE, M. AND DUBOIS, F. 1988. The Meaning of Gestures:
What can be Understood without Speech? Cahiers de Psychologie Cognitive, 8: 3-
speech in referential communication by aphasic subjects: channel use and
efficiency. Aphasiology 2: 21-32.
FEYEREISEN, P., BOUCHAT, MP, DERY, D., AND RUIZ, M. 1990. The concomitance of
speech and manual gesture in aphasic subjects. In Hammond, G. R. (Ed.), The
cerebral control of speech and limb movements, pp. 279-301. Advances in
Psychology, Vol. 70. North Holland: Amsterdam.
FERRARI, G. 1991. Introduzione al Natural Language Processing. Edizioni Calderini:
FERRARI, G. 1997. Elementi non verbali nel dialogo reale e nel dialogo riportato. In
Ambrosini R., Bologna, M. P., Motta, F. and Orlandi, C. (Eds.), Schrbthair a
ainm n-ogaim. Scritti in memoria di Enrico Campanile. Pacini Editore: Pisa.
FERRARI, G. 2007. Linguistica eoltre(?). Studi in onore di Riccardo Ambrosini, Studi
e Saggi linguistici, XLIII-XLIV, 2005-2006. ETS: Pisa.
FITCH, W. T., HAUSER, M.D. AND CHOMSKY, N. 2005. The evolution of the language
faculty: Clarifications and implications. Cognition, 97: 179210.
FLORES, F. AND LUDLOW, J. 1980. Doing and Speaking in the Office. In Fick, G. and
Sprague, R.H. (Eds.), Decision Support Systems: Issues and Challenges pp. 95-
118. Pergamon Press: New York.
FODOR, J. A. 1983. The Modularity of Mind. MIT Press: Cambridge, MA.
FREEDMAN, N. 1972. The Analysis of Movement Behavior during the Clinical
Interview. In Siegman, A. and Pope, B. (Eds.), 1972. Studies in Dyadic
Communication. Pergamon Press: New York.
FREEDMAN, N. 1977. Hands, words and mind: On the structuralization of body
movement during discourse and the capacity for verbal representation. In
Freedman, N. and Grand, S. (Eds.), Communicative structures and psychic
structures: A psychoanalytic approach. Plenum: New York.
FREEDMAN, N. AND HOFFMAN, S. P. 1966. Kinetic Behavior in Altered Clinical States.
Perceptual and Motor Skills, XXIV: 527-39.
References 171

FREGE, F. L. G.1892. ber Sinn und Bedeutung (On Sense and Meaning). Zeitschrift
fr Philosophie und philosophische Kritik, C: 25-50
FREUD, S. 1891. Zur Auffassung der Aphasien. Leipzig : Deuticke. Available in English
as On Aphasia: A Critical Study. Translated by E. Stengel, International
Universities Press (1953).
FREUD, S. 1901. Psychopathology of Everyday Life. Translation by A. A. Brill (1914)
Originally published in London by T. Fisher Unwin.
FRIEDMAN, S. 1972. Habituation and Recovery of Visual Response in the Alert Human
Newborn. Journal of Experimental Child Psychology, 13: 339-349.
FRICKE, E., LAUSBERG, H., LIEBAL, K. AND MLLER, C. In progress. Towards a
grammar of gesture: evolution, brain, and linguistic structures. Book series
Gesture Studies. John Benjamins: Amsterdam.
FRICK-HORBURY, D. AND GUTTENTAG, R. E. 1998. The effects of restricting hand
gesture production on lexical retrieval and free recall. In American Journal of
Psychology, 111, 43-62.
VON FRISCH, K. 1967. The Dance Language and the Orientation of Bees. Harvard
University Press, Cambridge.
Point: The Role of the Right Hemisphere in the Processing of Complex Linguistic
Materials. In Perecman, E. (Ed.), Cognitive Processing in the Right Hemisphere.
Academic Press: New York.
GARFINKEL, H. 1967. Studies in Ethnomethodology. Prentice-Hall, Englewood Cliffs.
GIBBON, D. To appear. Modelling gesture as speech: A linguistic approach. Poznan
Studies in Contemporary Linguistics 47(3).
GILBERT, M. A. 1995. Emotional Argumentation, or, Why Do Argumentation Theorists
Argue with their Mates? In van Eemeren, F.H., Grootendorst, R., Blair, J.A. and
Willard, C.A. (Eds.), Analysis and Evaluation: Proceedings of the Third ISSA
Conference on Argumentation Vol II. SICSAT: Amsterdam.
GILBERT, M. A. 2003. But why call it an Argument?: In Defense of the Linguistically
Inexplicable. Presented at Informal Logic at 25. 2003. Windsor, ON.
GIVN, T. 2002. The visual information-processing system as an evolutionary precursor
of human language. In Givn, T. and Malle, B. F. (Eds.), The Evolution of
Language out of Pre-Language. John Benjamins: Amsterdam.
GOLDIN-MEADOW, S. 1998. The Development of Gesture and Speech as an Integrated
System. In Iverson, J. M. and Goldin-Meadow. S. (Eds.), The Nature and
Functions of Gesture in Childrens Communication. Jossey-Bass Publishers: San
GOODALL-VAN LAWICK, J. 1967. The Behaviour of Free-Living Chimpanzees in the
Gombe Stream Reserve, Animal Behaviour Monographs, 1:161-311.
GOODWIN, C. 1984. Notes on Story Structure and the Organization of Participation. In
Atkinson, M. and Heritage, J. (Eds.), Structures of Social Action, pp. 225-246.
Cambridge University Press: Cambridge.
GOODWIN, C. 2000. Gesture, Aphasia and Interaction. In McNeill, D. (Ed.), Language
and Gesture: Window into Thought and Action, pp. 84-98. Cambridge University
Press: Cambridge.
GOODWIN, C. 2003. Conversational Frameworks for the Accomplishment of Meaning
in Aphasia. In Goodwin, C. (Ed.), Conversation and Brain Damage, pp. 90-116.
Oxford University Press: Oxford.
172 References

GOODWIN, C. AND GOODWIN, M. H. 1992. Assessments and the construction of context.

In Duranti, A. A. and Goodwin, C. (Eds.), Rethinking Context: Language as an
Interactive Phenomenon, pp. 147-190. Cambdrigde University Press: New York.
GREEN, S. AND MARLER, P. 1979. The Analysis of Animal Communication. In Marler,
P. and Vandenbergh, J. G. (Eds.), Handbook of Behavioural Neurobiology, vol.3.
Plenum: New York and London.
GRICE, P. 1989. Studies in the Way of Words. Harvard University Press: Cambridge,
GULLBERG, M. AND HOLMQVIST, K. 2001. Eye tracking and the perception of gestures
in face-to-face interaction vs. on screen. In Cav, C., Guatella, I., Santi, S. (Eds.),
Oralit et gesturalit: Interactions et comportements multimodaux dans la
communication, pp. 381-384. L'Harmattan: Paris.
HAGOORT, P. 2005. On Broca, brain, and binding: a new framework. Trends in
Cognitive Science, 9(9): 416 423.
HALL, E.T. 1966. The Hidden Dimension. Doubleday: Garden City.
HALLIDAY, M. A. K. 1967. Some Aspects of the Thematic Organization of the English
Clause, Theme and Information in the English Clause. In Kress, G. (Ed.), System
and Function in Language. Selected Papers. Oxford University Press: Oxford.
HARD, S. C., AND STEIN, B. E. 1988. Small lateral suprasylvian cortex lesions produce
visual neglect and decreased visual activity in the superior colliculus. Journal of
Comparative Neurology, 273: 527-542.
HARTMANN, B., MANCINI, M. AND PELACHAUD, C. 2006. Implementing Expressive
Gesture Synthesis for Embodied Conversational Agents. In Gibet, S., Courty, N.,
Kamp, J.-F. (Eds.), GW 2005 LNCS (LNAI), vol. 3881, pp. 188199. Springer:
HAUSER, M.D. CHOMSKY, N. AND FITCH, W. T. 2002. The Faculty of Language: What
Is It, Who Has It, and How Did It Evolve? Science, 298: 1569-1579.
HAYASHI, K., FURUYAMA, N. AND TAKASE, H. 2005. Intra- and Inter-personal
Coordination of Speech, Gesture and Breathing Movements. Transactions of the
Japanese Society for Artificial Intelligence, 20: 247-258.
HEIDEGGER, M. 1978. What class for thinking? In Heidegger, M., Krell, D. F. (Eds.),
Basic Writings: From Being and time (1927) to The task of thinking (1964).Pp.
341-268. Taylor & Francis: Abingdon, Oxford.
HEILMAN, K., WATSON, R. T. AND BOWERS, D. 1983. Affective Disorders Associated
with Hemisphere Disease. In Heilman, K. M. and Satz, P. (Eds.), Neuropsychology
of Human Emotion. Guilford Press: New York.
HEWES, G. W.1973. Primate Communication and the Gestural Origins of Language.
Current Anthropology, 14:5-24.
Cognition and the corpus callosum: verbal fluency, visuospatial ability, and
language lateralization related to midsagital surface areas of callosal subregions.
Behavioural neuroscience, 106: 3-14.
HJELMSLEV, L. 1961. Prolegomena to a theory of language. University Wisconsin
Press: Madison.
HOCKETT, C. F. 1960. Logical Considerations in the Study of Animal Communication.
In Lanyon, W.E. and Tavolga, W. N. (Eds.), Animal Sounds and Communication.
American Institute of Biological Sciences: Washington D.C.
HUDSON, R. A. 1997. Sociolinguistics. II ed. Cambridge University Press, Cambridge.
References 173

ISHIGURO, H., KANDA, T., KIMOTO, K. AND ISHIDA, T. 1999. A Robot Architecture
Based on Situated Modules. IEEERSJ Conference on Intelligent Robots and
Systems 1999 IROS (1999), 3: 1617-1624.
IVERSON, J. M. 1996. Gesture and Speech: Context and Representational Effects on
Production in Congenitally Blind and Sighted Children and Adolescents. PhD.
thesis. Department of Psychology, University of Chicago.
IVERSON, J. M. AND GOLDIN-MEADOW, S. 1997. Whats Communication got to do with
it? Gesture in Children Blind from Birth. Developmental Psychology, 33: 453-467.
JACKENDOFF, R. 2002. Foundations of Language. Oxford University Press: Oxford,
New York.
JACKLIN, C.N. AND MACCOBY, E. E. 1978. Social behavior at 33 months in same-sex
and mixed-sex dyads. Child Development 49(3): 557569
JACKSON, J. P. 1974. The Relationship between the Development of Gestural Imagery
and the Development of Graphic Imagery. Child Development, 45: 432-438.
JAKOBSON, R. 1960. Linguistics and Poetics: Closing Statement. In Sebeok, T. (Ed.),
Style in Language, pp. 350-77. MIT Press: Cambridge, MA.
JAKOBSON, R. 1960. Language in Relation to Other Communication Systems. In
Roman Jakobson (Ed.), Selected Writings, Vol. 2, pp. 570-79. Mouton: The Hague.
Perception and the Temporal Cortex NeuroImage, 15(4): 733-746.
JASON, G. W. 1985. Manual sequence learning after focal cortical lesions.
Neuropsychologia, 23: 483-496.
KELSO, J.A.S., HOLT, K.G., RUBIN, P. AND KUGLER, P. N. 1981. Patterns of human
interlimb coordination emerge from the properties of nonlinear, limit cycle
oscillatory processes: Theory and data. Journal of Motor Behavior 13: 226261.
KELSO, J. A., SALTZMAN, E. L. AND TULLER, B. 1986. The dynamical perspective on
speech production: data and theory. Journal of Phonetics, 14: 29-59.
KENDON, A. 1972. Some Relationships between Body Motion and Speech. An Analysis
of an Example. In Wolfe, A. and Pope, B. (Eds.), Studies in Dyadic
Communication. Pergamon Press: New York.
KENDON, A. 1980. Gesticulation and Speech: Two Aspects of the Process of Utterance.
In Key, M.R. (Ed.), The Relation Between Verbal and Nonverbal Communication.
The Hague: Mouton.
KENDON, A. 1981. A Geography of Gesture. Semiotica, 37: 129-163.
KENDON, A. 1982. The study of gesture: Some remarks on its history. Recherches
Semiotique/Semiotic Inquiry 2: 45-62.
KENDON, A. 1986. Current Issues in the Study of Gesture. In Nespolous, J. L., Perron,P.
Lecours, A. R. (Eds.), The Biological Foundations of Gestures: Motor and
Semiotic Aspects. Laurence Erlbaum Associates: Hillsdale, London.
KENDON, A. 1990. Conducting Interaction: Patterns of Behavior in Focused
Encounters. Cambridge University Press: Cambridge.
KENDON, A. 1992. Abstraction in Gesture. Semiotica, 90 (3 4): 225 25.
KENDON, A. 1994. Do Gestures Communicate? A Review. Research on Language and
Social Interaction, 27, 3:175-200.
KENDON, A. 2000. Language and gesture: unity or duality?. In McNeill, D. (Ed.),
Language and Gesture. Cambridge University Press.
KENDON, A. 2004. Gesture. Visible Action as Utterance. Cambridge University Press:
174 References

KENDON, A. 2009. Why do people sometimes move their hands about when they talk.
International Conference Gesture and Speech in Interaction, Pozna, September,
24th - 26th 2009.
KITA, S. 2000. How representational gestures help speaking. In McNeill, D. (Ed.),
Language and Gesture, pp. 162-185. Cambridge University Press: Cambridge.

KITA, S., VAN GIJN, I. AND VAN DER HULST, H. In progress. The non-linguistic status of
the Symmetry Condition in Signed Languages: Evidence from a Comparison from
Signs and Speech Accompanying Representational Gestures.
KITA, S., DE CONDAPPA, O. AND MOHR, C. 2007. Metaphor explanation attenuates the
right-hand preference for depictive co-speech gestures that imitate actions. Brain
and Language, 101: 185-197.
KITA, S. AND LAUSBERG, H. 2008. Generation of co-speech gestures based on spatial
imagery form the right-hemisphere: Evidence form split-brain patiens. Cortex, 44:
KLEIBER, G. 1990. La Sémantique du prototype: Catégories et sens lexical. Presses
Universitaires de France: Paris.
KLIMA, E. AND BELLUGI, U. 1979. The Signs of Language. Harvard University Press:
KOLB, B. AND WHISHAW, I. 1985. Fundamentals of Human Neuropsychology (2nd
Edition). W.H. Freeman and Co.: New York.
KRAUSS, R., MORREL-SAMUELS, P. AND COLASANTE, C. 1991. Do Conversational Hand
Gestures Communicate? Journal of Personality and Social Psychology, 61,5: 743-
KRAUSS, R. M., CHEN, Y. AND GOTTESMAN, R. F. 2000. Lexical gestures and lexical
access: a process model. In McNeill, D. (Ed.), Language and Gesture. Cambridge
University Press: Cambridge.
LENNEBERG, E. H. 1973. The neurology of language. Daedalus, 102: 115-134.
LEONT'EV, A. A. n. d. Non-published paper, p. 3. Quoted by Robbins, D. (2007) Alexei
Alexeevitch Leontiev's non-classical psycholinguistics. In Alanen, R. and
Pöyhönen, S. (Eds.), Language in Action: Vygotsky and Leontievian Legacy Today,
pp. 8-18. Cambridge Scholars Publishing: Cambridge.
LEVELT, W. J. M. 1989. Speaking. From Intention to Articulation. MIT Press: Cambridge,
LEVY, J. 1969. Possible basis for the evolution of lateral specialization of the human
brain. Nature 224: 614-615.
LEUNG, E. H. L. AND RHEINGOLD, H. L. 1981. Development of Pointing as a Social
Gesture. In Developmental Psychology, 17: 215-220.
LICHTMAN, R. 1970. Symbolic Interactionism and Social Reality: Some Marxist Queries.
Berkeley Journal of Sociology, XV: 76-94.
LIEBERMAN, P. 2008. Cortical-striatal-cortical neural circuits, reiteration, and the
narrow faculty of language. Behavioral and Brain Sciences, 31: 527-528.
LOCK, A. J. 1980. The Guided Reinvention of Language. Academic: London.
LOMBER, S. G., PAYNE, B. R., CORNWELL, P. AND LONG, K. D. 1996. Perceptual and
Cognitive Visual Functions of Parietal and Temporal Cortices in the Cat. Cerebral
Cortex, 6: 673-695.
LORENZ, K. 1939. Vergleichende Verhaltensforschung. Verhandlungen der Deutschen
zoologischen Gesellschaft, 12: 60-102.
LYONS, J. 1972. Human Language. In Hinde, R. A. (Ed.), Non-Verbal Communication.
Cambridge University Press: London, New York, Melbourne.
MACKAY, D. M. 1972. Formal Analysis of Communicative Processes. In Hinde, R. A.
(Ed.), Non-Verbal Communication. Cambridge University Press: London, New
York, Melbourne.
MACCOBY, E.E. AND JACKLIN, C. N. 1978. The Psychology of Sex Differences. Stanford
University Press: Stanford.
MAGNO CALDOGNETTO, E. 1997. La gestualità coverbale in soggetti normali e afasici.
In Poggi, I. and Magno Caldognetto, E. (Eds.), Mani che parlano. Gesti e
psicologia della comunicazione. Unipress: Padova.
MAGNO CALDOGNETTO, E. AND POGGI, I. 1997a. Conoscenza e uso dei gesti simbolici.
Differenze di sesso e di età. In Poggi, I. and Magno Caldognetto, E. (Eds.), Mani
che parlano. Gesti e psicologia della comunicazione. Unipress: Padova.
MAGNO CALDOGNETTO, E. AND POGGI, I. 1997b. Il sistema prosodico intonativo e
l'analisi multimodale del parlato. In Poggi, I. and Magno Caldognetto, E. (Eds.),
Mani che parlano. Gesti e psicologia della comunicazione. Unipress: Padova.
MAHL, G. F. 1968. Gestures and Body Movements. In Shlien, J. (Ed.), Research in
Psychotherapy vol. III, American Psychological Association: Washington.
MALINOWSKI, B. 1946. Supplement I. In Ogden, C. K. and Richards, I. A. (Eds.), The
Meaning of Meaning, 8th edition. Routledge and Kegan Paul: London.
biologically inspired architecture for an autonomous and social robot. IEEE
Transactions on Autonomous Mental Development, 3(3): 1.
MANCINI, M., BRESIN, R. AND PELACHAUD, M.A. 2007. An expressive virtual agent
head driven by music performance. IEEE Transactions on Audio, Speech and
Language Processing, 15(6): 1833-1841.
MANLY, L. 1980. Nonverbal Communication of the Blind. In Von Raffler-Engel, W.
(Ed.), Aspects of Nonverbal Communication. Swets and Zeitlinger: Lisse, The
MARTINET, A. 1960. Éléments de linguistique générale. Colin: Paris.
MASUR, E. F. 1994. Gestural Development, dual-Directional Signaling, and the
Transition to Words. In Volterra, V. and Erting, C. J. (Eds.), From Gesture to
Language in Hearing and Deaf Children. Gallaudet University Press: Washington,
MCCULLOUGH, K.-E. 1995. Representation and Meaning of Space in Narrative and
Apartment Descriptions. Conference on Gestures Compared Cross-Linguistically,
Summer Linguistic Institute, University of New Mexico.
MCCULLOUGH, K.-E. 2005. Using Gestures in Speaking: Self-generating indexical
fields. Ph.D. Thesis, The University of Chicago.
MEAD, G. H. 1934. Mind, Self and Society. University of Chicago Press: Chicago.
MCNEILL, D. 1979. The Conceptual Basis of Language. Erlbaum: Hillsdale.
MCNEILL, D. 1985. So You Think Gestures Are Nonverbal? Psychological Review, 92
(3): 350-371.
MCNEILL, D. 1987. Psycholinguistics: A new Approach. Harper and Row: New York.
MCNEILL, D. 1989. A Straight Path-to Where? Reply to Butterworth and Hadar.
Psychological Review, 96 (1): 175-179.
MCNEILL, D. 1991. Hand and Mind: What Gestures Reveal about Thought. University
of Chicago Press: Chicago and London.
MCNEILL, D. (Ed.), 2000. Language and Gesture. Cambridge University Press:

MCNEILL, D. 2005. Gesture and Thought. University of Chicago Press: Chicago and
MCNEILL, D. In progress. Notes on the origin of language: what evolved, and how.
MCNEILL, D. AND LEVY, E. 1982. Conceptual Representations in Language Activity
and Gesture. In Jarvella, R. J. and Klein, W. (Eds.), Speech, place and action.
Wiley and Sons: Chichester.
MCNEILL, D. AND PEDELTY, L. 1995. Right brain and gesture. In Emmorey, K. and
Reilly, J. S. (Eds.), Language, gesture, and space. (International Conference on
Theoretical Issues in Sign Language Research), pp. 63-85. Erlbaum: Hillsdale, N.J.
VAN MEEL, J. M. 1982. The Nature and Development of the Kinetic Representational
System. In de Gelder, B. (Ed.), Knowledge and Representation. Routledge and
Kegan Paul: London.
MELINGER, A. AND LEVELT, W. 2004. Gesture and the communicative intention of the
speaker. Gesture, 4: 119-141.
MILLER, R. 1996. Axonal conduction times and human cerebral laterality. A
psychobiological theory. Harwood: Amsterdam.
MITTELBERG, I. 2007. Methodology for multimodality: One way of working with
speech and gesture data. In Gonzalez-Marquez, M., Mittelberg, I., Coulson, S. and
Spivey, M. J. (Eds.), Methods in Cognitive Linguistics, pp. 225-248. John
Benjamins: Amsterdam/Philadelphia.
MONDADA, L. 2006. Participants' online analysis and multimodal practices: projecting
the end of the turn and the closing of the sequence. Discourse Studies, 8 (1): 117-
MORO, A. 2006. I confini di Babele. Il cervello e il mistero delle lingue impossibili,
Longanesi, Milano; English Translation: The Boundaries of Babel. The Brain and
the Enigma of Impossible Languages. MIT Press: Cambridge, Massachusetts.
MORRIS, D. 1971. Intimate Behavior: A Zoologist's Classic Study of Human Intimacy.
Kodansha International: New York, Tokyo, London.
MORRIS, D. 1977. Manwatching. Abrams: New York.
MORROW, L. AND RATCLIFF, G. 1988. Neuropsychology of Spatial Cognition: Evidence
from Cerebral Lesions. In Stiles-Davis, J., Kritchevsky, M. and Bellugi, U. (Eds.),
Spatial Cognition: Brain Bases and Development. Lawrence Erlbaum: Hillsdale.
MOSCOVICI, S. 1967. Communication processes and the properties of language. In
Berkowitz, L. (Ed.), Advances in Experimental Social Psychology. Academic
Press: New York.
MÜLLER, C. 2004. Forms and uses of the Palm Up Open Hand: A case of gesture
family? In Müller, C. and Posner, R. (Eds.), The semantics and pragmatics of
everyday gestures, pp. 233-356. Weidler Verlag: Berlin.
1997. Object representation in the ventral premotor cortex (area F5) of the monkey.
Journal of Neurophysiology,78: 2226-2230.
NEISSER, U. 1976. Cognition and Reality: Principles and Implications of Cognitive
Psychology. Freeman: New York.
NIEWIADOMSKI, R., OCHS, M., AND PELACHAUD, C. 2008. Expressions of Empathy in
ECAs. In Prendinger, H., Lester, J.C., Ishizuka, M. (Eds.) IVA 2008. LNCS (LNAI),
vol. 5208, pp. 37-44. Heidelberg: Springer.
NILSSON, N. (Ed.), 1984. Shakey the Robot. Technical Note 323, SRI International,
Menlo Park, CA.
NISHITANI, N. AND HARI, R. 2000. Temporal dynamics of cortical representation for
action. Proceedings of the National Academy of Sciences of the United States of
America, 97: 913-918.
NOBE, S. 1996. Cognitive rhythms, gestures, and acoustic aspects of speech: a
network/threshold model of gesture production. Ph.D. Dissertation, University of
NÖTH, W. 1995. Handbook of Semiotics. Indiana University Press: Bloomington and
Indianapolis, USA.
OKA, T., INABA, M. AND INOUE, H. 1997. Describing a modular motion system based
on a real time process network model, Proceedings of the IEEE/RSJ International
Conference on Intelligent Robots and Systems, pp. 821-827.
OSGOOD, C. E. 1988. Psycholinguistics, Cross-Cultural Universals, and Prospects for
Mankind. Praeger: Westport, CT.
OVERTON, W. F. AND JACKSON, J. P. 1973. The Representation of Imagined Objects in
Action Sequences: A Development Study. Child Development, 44: 309-314.
ÖZYÜREK, A. 2000. The influence of addressee location on spatial language and
representational gestures of direction. In McNeill, D. (Ed.), Language and
Gesture (pp. 64-83). Cambridge University Press: Cambridge.
PARKE, K. L., SHALLCROSS, R. AND ANDERSON, R.J. 1980. Differences in coverbal
behavior between blind and sighted persons during dyadic communication.
Journal of Visual Impairment and Blindness, 74: 142-146.
PARRILL, F. 2003. Intuitions and violations of good form in metaphoric conduit
gestures. Invited presentation, theme session on gesture and metaphor.
International Cognitive Linguistics Conference. Logroño, Spain.
PARRILL, F. 2008. Form, meaning and convention: An experimental examination of
metaphoric gestures. In Cienki, A. and Müller, C. (Eds.), Metaphor and Gesture,
pp. 195-217. John Benjamins: Amsterdam.
PARRILL, F. AND SWEETSER, E. 2004. What we mean by meaning: Conceptual
integration in gesture analysis and transcription. Gesture, 4: 197-219.
PARTRIDGE, E. 1959. Origins: A Short Etymological Dictionary of Modern English.
Macmillan: New York.
PAVELIN-LESIC, B. 2009. Speech gestures and the pragmatic economy of oral
expression in face-to-face interaction. International Conference Gesture and
Speech in Interaction, Poznań, September 24th-26th, 2009.
TUTORING RESEARCH GROUP, 2000. Incorporating human-like conversational
behaviors into AutoTutor. Agents 2000 Proceedings of the Workshop on Achieving
Human-like Behavior in the Interactive Animated Agents: 85-92. ACM Press:
PIAGET, J. 1926. The language and thought of the child. Harcourt, Brace, Jovanovich:
New York.
PICA, S. 2008. Gestures of apes and pre-linguistic human children: Similar or different?
First Language, Vol. 28, No. 2: 116-140.
PIKE, K. 1967. Language in Relation to a Unified Theory of the Structure of Human
Behavior. 2nd ed. Mouton: The Hague.
PINKER, S. AND JACKENDOFF, R. 2005. The Faculty of Language: What's Special about
it? Cognition, 95: 201-236.
PLACE, U. T. 2000. The role of the hand in the evolution of language. Psycoloquy:
11(007), Language Gesture (1).
POGGI, I., 1980. La mano a borsa: analisi semantica di un gesto emblematico
olofrastico. In Attili, G. and Ricci Bitti, P. E. (Eds.), Comunicare senza parole. La
comunicazione non verbale nel bambino e nell'interazione sociale tra adulti.
Bulzoni Editore: Roma.
POGGI, I. AND MAGNO CALDOGNETTO, E. 1997. Il gestionario: un dizionario dei gesti
simbolici italiani. In Poggi, I. and Magno Caldognetto, E. (Eds.), Mani che
parlano. Gesti e psicologia della comunicazione. Unipress: Padova.
POGGI, I. AND PELACHAUD, C. 1998. Performative facial expressions in animated
faces. Speech Communication, 26: 5-21.
POSNER, M. I. AND DI GIROLAMO, G. J. 1999. Flexible neural circuitry in word
processing. Behav Brain Sci, 22: 299-300.
POUPLIER, M. AND GOLDSTEIN, L. 2011. Intention in articulation: Articulatory timing in
alternating consonant sequences and its implications for models of speech
production. Language and Cognitive Processes, 25 (5): 616-649.
PULVERMÜLLER, F. 1999. Words in the brain's language. Behavioral and Brain
Sciences, 22: 290-291.
PULVERMÜLLER, F. 2002. The neuroscience of language. Cambridge University Press:
RAUSCHER, F., KRAUSS, R.M., AND CHEN, Y. 1996. Gesture, speech and lexical access:
The role of lexical movements in speech production. Psychological Science, 7:
GARDNER, H. 1992. Story Processing in Right-Hemisphere Brain-Damaged
Patients. In Brain and Language, 42: 320-336.
RIMÉ, B. 1982. The Elimination of Visible Behavior from Social Interactions: Effects
of Verbal, Nonverbal and Interpersonal Variables. European Journal of Social
Psychology, 73: 113-129.
RIZZOLATTI, G. 2005. The mirror neuron system and its function in humans. Anatomy
and Embryology, 210(5-6): 419-21.
RIZZOLATTI, G. AND ARBIB, M.A. 1998. Language within our grasp. Trends in
Neurosciences, 21: 188-194.
RIZZOLATTI, G., LUPPINO, G. AND MATELLI, M. 1998. The organization of the cortical
motor system: New concepts. Electroencephalography and Clinical
Neurophysiology, 106: 283-96.
ROBSON, K. 1967. The Role of Eye to Eye Contact in Maternal-Infant Attachment.
Journal of Child Psychology and Psychiatry, 8: 13-25.
ROGERS, W. T. 1978. The Contribution of Kinetic Illustrators towards the
Comprehension of Verbal Behavior within Utterances. Human Communication
Research, 5: 54-62.
ROSENFELD, H. M. 1966. Instrumental Affiliative Functions of Facial and Gestural
Expressions. Journal of Personality and Social Psychology, IV: 65-72.
ROSSI-LANDI, F. 1967-1972. Ideologie. Roma.
ROSSI-LANDI, F. 1983 [1968]. Language as Work and Trade. Bergin and Garvey:
South Hadley.
ROSSI-LANDI, F. 1985. Metodica filosofica e scienza dei segni. Bompiani: Milano.
ROSSI-LANDI, F. 1992. Articulations in Verbal and Objectual Sign Systems. In
Rossi-Landi, F. Petrilli, F. Between signs and non-signs, pp. 189-252. John Benjamins
Publishing Company: Berlin.
ROSSINI, N. 2001. Gestualità e teoria dei prototipi: per una nuova interpretazione della
comunicazione non verbale. Studi Italiani di Linguistica Teorica e Applicata,
XXX, 3: 489-511.
ROSSINI, N. 2003. Gestures and Prototype Theory: a New Approach to Gesture
Categorization. 5th International Workshop on Gesture and Sign Language Based
Human-Computer Interaction (Gesture Workshop), Genova, Italy.
ROSSINI, N. 2004a. The Analysis of Gesture: Establishing a Set of Parameters. In
Camurri, A. and Volpe, G. (Eds.), Gesture-Based Communication in Human-
Computer Interaction. 5th International Gesture Workshop, GW 2003, Genova,
Italy, April 2003. Selected Revised Papers, pp. 124-131. Springer-Verlag: Berlin
Heidelberg New York.
ROSSINI, N. 2004b. Gesture and its cognitive origin: Why do we gesture? Experiments
on hearing and deaf people. Università di Pavia Ph.D. thesis.
ROSSINI, N. 2005. Sociolinguistics in Gesture: How about the Mano a Borsa?
Intercultural Communication Studies, XIII: 3: 144-154. Proceedings of the 9th
International Conference on Cross-Cultural Communication (CSF 2003).
ROSSINI, N. 2007. Unseen gestures and the Mind of the Speaker: An analysis of
co-verbal gestures in map-task activities. In Esposito, A., Bratanic, M., Keller, E.
and Marinaro, M. (Eds.), Fundamentals of Verbal and Nonverbal Communication
and the Biometric Issue. IOS Press, NATO Security through Science Series E:
Human and Societal Dynamics Vol. 18.
ROSSINI, N. 2009. Il gesto. Gestualità e tratti non verbali in interazioni diadiche.
Pitagora: Bologna.
ROSSINI, N. 2011. Patterns of Synchronization of Non-verbal Cues and Speech in
ECAs: Towards a More "Natural" Conversational Agent. In Esposito, A., Esposito,
A. M., Martone, R., Mueller, V. C. and Scarpetta, G. (Eds.), Toward Autonomous,
Adaptive, and Context-Aware Multimodal Interfaces: Theoretical and Practical
Issues, pp. 97-104. Springer-Verlag: Berlin.
DE RUITER, J. P. 2000. The Production of Gesture and Speech. In McNeill, D. (Ed.),
Language and Gesture. Cambridge University Press: Cambridge.
RYLE, G. 2002. The Concept of Mind. University of Chicago Press: Chicago.
1996. Activation of the primary visual cortex by Braille reading in blind subjects.
Nature, Apr 11; 380(6574): 526-8.
agraphia for kanji or kana: Dissociation between morphology and phonology.
Neurology, 49: 946-952.
SAMUELS, C. A. 1985. Attention to Eye-Contact Opportunity and Facial Motion by
Three-Month-Old Infants. Journal of Experimental Child Psychology, 40:105-114.
DE SAUSSURE, F. 1917. Cours de linguistique générale.
SCALISE, S. 1994. Morfologia. Il Mulino: Bologna.
SCHEGLOFF, E. A. 1984. On Some Gestures' Relation to Talk. In Atkinson, J. M. and
Heritage, J. (Eds.), Structures of Social Action, pp. 266-298. Cambridge
University Press: Cambridge.
SCHEGLOFF, E. A. 2006. Sequence organization in interaction: A primer in
conversation analysis. Cambridge University Press: Cambridge.
SCHEFLEN, A. E. 1973. Analysis of a Psychotherapy Transaction. Indiana University
Press: Bloomington.
SEARLE, J. 1969. Speech Acts. Cambridge University Press: Cambridge.
SEARLE, J. 1983. Intentionality. Cambridge University Press: Cambridge.
SHALLICE, T. 1988. From Neuropsychology to Mental Structure. Cambridge University
Press: Cambridge.
SHANNON, C. E. 1948. A mathematical theory of communication. In Bell System
Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October.
SILBERBERG, A. AND FUJITA, K. 1996. Pointing at Smaller Food Amounts in an
Analogue of Boysen and Berntson's (1995) procedure. Journal of the Experimental
Analysis of Behavior, 66: 143-147.
SIMONE, R. 1998. Fondamenti di linguistica. Editori Laterza: Roma.
SKRANDIES, W. 1999. Early Effects of Semantic Meaning on Electrical Brain Activity.
Behavioral and Brain Sciences 22(2): 301-302.
SNYDER, W. 2000. An experimental investigation of syntactic satiation effects.
Linguistic Inquiry 31: 575-582.
SOBRERO, A. 1993. Pragmatica. In Sobrero, A. (Ed.), Introduzione all'italiano
contemporaneo. Editori Laterza: Roma.
SPERBER, D. AND WILSON, D. 1986. Relevance: Communication and Cognition.
Harvard University Press: Cambridge, MA.
STEPHENS, D. 1983. Hemispheric language dominance and gesture hand preference.
Doctoral Dissertation, University of Chicago.
STEPHENS, D. AND TUITE, K. 1980. The Hermeneutics of Gesture. Paper presented at the
Symposium on Gesture at the Meeting of the American Anthropological
Association, Chicago.
STOKOE, W. C. 1960. Sign Language Structure. Buffalo Univ. Press: Buffalo, NY.
STOKOE, W. C. 1972. Semiotics and human sign languages. Mouton: The Hague.
TAYLOR, J. R. 1995. Linguistic Categorization. Prototypes in Linguistic Theory.
Clarendon Press: Oxford.
THOMPSON, L. A. AND MASSARO, D. W. 1985. Evaluation and Integration of Speech
and Pointing Gestures during Referential Understanding. Journal of Experimental
Child Psychology, 42:144-168.
THORPE, W. H. 1972a. The Comparison of Vocal Communication in Animals and Man.
In Hinde, R. A. (Ed.), Non-Verbal Communication, 27-48. Cambridge University
THORPE, W. H. 1972b. Vocal Communication in Birds. In Hinde, R. A. (Ed.),
Non-Verbal Communication, 153-174. Cambridge University Press: Cambridge.
TINBERGEN, N. 1935. Über die Orientierung des Bienenwolfes. Z. vgl. Physiol., 21:
TOMASELLO, M. 2008. Origins of human communications. MIT Press: Cambridge, MA.
TRUBECKOJ, N. S. 1939. Grundzüge der Phonologie. In Travaux du Cercle linguistique
de Prague, VII.
TREVARTHEN, C. AND HUBLEY, P. 1978. Secondary Intersubjectivity: Confidence,
Confiding, and Acts of Meaning in the First Year. In Lock, A. (Ed.), Action,
Gesture, and Symbol. Academic: London.
TYLOR, E. B. 1865. Researches into the Early History of Mankind and the Development
of Civilization. John Murray: London.
VARNEY N. R. AND DAMASIO, H. 1987. Locus of lesion in impaired pantomime
recognition. Cortex, 23: 699-703.
VERNON, D., VON HOFSTEN, C. AND FADIGA, L. 2011. A Roadmap for Cognitive
Development in Humanoid Robots. Springer: Berlin.
VERNON, D., METTA, G. AND SANDINI, G. 2007. The iCub Cognitive
Architecture: Interactive Development in a Humanoid Robot, IEEE International
Conference on Development and Learning, Imperial College, London, July 2007.
VOLTERRA, V. (Ed.), 1985. La Lingua Italiana dei Segni. La comunicazione visivo-
gestuale nei sordi. Il Mulino: Bologna.
VYGOTSKIJ, L. S. 1962. Thought and language. MIT Press: Cambridge, MA.
VYGOTSKIJ, L. S. 1966. Development of the Higher Mental Functions. In Psychological
Research in the USSR.
VYGOTSKIJ, L. S. AND LURIJA, A. R. 1930. The function and fate of ego-centric speech.
Proceedings of the 9th International Congress of Psychology, pp. 464-465. The
Psychological Review: Princeton.
WATSON, O. M. AND GRAVES, T. D. 1966. Quantitative Research on Proxemic
Behaviour. American Anthropologist, 68: 382-409.
WEIZENBAUM, J. 1966. ELIZA- A Computer Program For the Study of Natural
Language Communication Between Man and Machine. Communications of the
ACM, 9(1): 36-45.
WERNER, H., AND KAPLAN, B. 1963. Symbol formation: An organismic developmental
approach to language and the expression of thought. John Wiley: New York.
WHITNEY, W. D. 1899. The Life and Growth of Language: An Outline of Linguistic
Science. Appleton: New York.
WILLEMS, R. M., ÖZYÜREK, A. AND HAGOORT, P. 2007. When language meets action:
the neural integration of gesture and speech. Cerebral Cortex, 17: 2322-2333.
WINOGRAD, T. AND FLORES, F. 1986. Understanding Computers and Cognition: A New
Foundation for Design. Addison-Wesley Professional: Boston.
WITTGENSTEIN, L. 1966. Lectures and Conversations on Aesthetics, Psychology, and
Religious Belief. Edited by Cyril Barrett. Blackwell: Malden, M.A., Oxford.
WOLFF, C. 1945. A Psychology of Gesture. Methuen: London.
WOLFF, P. 1961. Observations on early Development of smiling. In Foss, B. M. (Ed.),
Determinants of Infant Behavior. Vol. 2. Methuen, London.
WUNDT, W. 1900/1973. The language of gestures. Translation of Völkerpsychologie:
Eine Untersuchung der Entwicklungsgesetze von Sprache, Mythus und Sitte.
ZAIDEL, E. 1985. Language in the right hemisphere. In Benson, D. F. and Zaidel, E. (Eds.),
The dual brain, pp. 205-31. Guilford: New York.
Appendix I. Gesture in Deaf Subjects: Table of the Performed Gestures
Subject 1 Interviewer 1

strokes locus size P.A. gesture h strokes Locus size P.A. gesture h
corresp. ub 50 e/w/f d rh
corresp. ub 7 e/w/f E rh
NS ub 50 e/w E bh
corresp. ub 5 e/w E rh
NS ub 2w w/f d rh
in corresp. ub 30s/25e s/e/w E rh in corresp. 20 ub e/w m/E rh
before lb 0 e/w/f md rh
corresp. ub/h 50 e/w/f E rh
NS lb 3 e/w/f d rh
corresp. ub 80 e/f d rh
NS lb 28 e/f d rh
corresp. ub 30 e/f d rh
NS ub 30 e/w E rh
corresp. ub 3 e/w/f d rh
corresp. ub 1 e/w/f c rh
corresp. ub/h 50 e/w/f d rh before 10 lb e/w m/E bh
in corresp. h 5 e/w/f d rh in corresp. 25 ub e/w d rh
corresp. h 5 e/w E rh
corresp. ub/h 3 e/w m rh
NS h 3 e/w E rh
before h 18 e/w/f d rh
corresp. h 10 e/w/f E rh
corresp. h 5 e/w/f d rh
corresp. h(IM) 45 e/w/f m rh
corresp. lb 40 e/w/f m rh
NS ub 23 e/w/f d rh
corresp. ub nd f E rh
before ub 7 e/w/f E rh
NS lb bp f IE
corresp. ub 30 e/f i
corresp. ub 5 e/f d
before lb/ub 50 e/f d
corresp. lb 3 w md
before h 70 e/f d
NS h 10 e/f d
NS h 5 e/w E
before h 2 e/w E
corresp. h 2 e/f md
NS ub 20 e/f E/LIS
NS lb 30 e/w E
before ub 2 e/w E
corresp. ub 10 e/f dm
nd ub 5 e/f d

Subject 1 Interviewer 1

strokes locus size P.A. gesture h strokes locus size P.A. gesture h
corresp. ub 5 e/w md
NS ub 0 w E
NS h 50 e/f E
before lb 30 w/f md
corresp. lb 10 e/w ic
before ub 50 e/w/f IE
before h 5 e/w m
NS ub 10 e/w d
corresp. h 45 e/w E
NS ub 20 e/w c
before h 60 e/f i
before ub 30 e/w E
corresp. ub 50w w/f E
NS h 90 e/w E
before h(IM) 10 e/f m
corresp. h 20 e/w c
in corresp. h(IM) 85 e/f m in corresp. 10 lb w c
corresp. h 15 e/f d NS(hes) 10 lb e m
in corresp. ub 10 e/f d in corresp. 45 h e/f d
in corresp. h 30 e/f E in corresp. 20 lb e m
before ub 40 e/f d corresp. 110 lb e i
before h 15 e/f E NS 20 h e/f d
NS h 5 e/f d corresp. 60 h e/w E
before h 10 e/w m before 10 h e/w E
NS h 5 e/f d corresp. 10 ub e/f i
NS ub 60 e/f d before 40 ub e/w d
before h 5 e/w m corresp. 92 ub e/w d rh
NS h 50 e/f d corresp. 10 lb e/w/f E rh
in corresp. h 70 e/w E in corresp. 50 ub e/w/f m rh
in corresp. h 5 e/w m in corresp. 30s ub s/e/w/f d rh
NS h 2 e/f d corresp. 12 lb e/w m rh
in corresp. h 2 e/w E in corresp. 100w lb w E
before lb 35 e/f md NS(hes) 40w lb w/f m
corresp. lb(IM) 30 w i NS(hes) 10 lb e/w m
in corresp. lb(IM) 45 e/f md in corresp. 7w lb w/f d lh
in corresp. lb 180 e/f md in corresp. 50w lb w E bh
NS lb(IM) 20 e/f md before 110w lb w d
in corresp. lb(IM) 40 e/f md in corresp. 110w lb w/f d
NS ub 5 e/w i corresp. 20s ub s/e/w/f E
corresp. ub 180 e/w m NS 40s ub/h s/e/w/f d
NS lb(IM) 0 w/f md corresp. 93w lb w c bh
in corresp. ub 5 e/w i in corresp. 43 ub/h e/w/f d
before ub 50 e/w/f m corresp. 7 lb e/w/f m
nd ub 5 e/w i NS 10w lb w/f m
corresp. ub 10 e/f d NS(hes) 34 ub e/w/f d
corresp. ub 50 e/w E NS Nd lb f d
nd lb 2 e/f d corresp. Nd lb f m
Subject 1 Interviewer 1

strokes locus size P.A. gesture h strokes Locus size P.A. gesture h

NS ub 50 e/w/f E corresp. 30s/55e h s/e/w/f d
NS ub 10 e/w m corresp. 5s/20e h s/e/w/f E
NS h(IM) 10 e/w/f E . 3w lb w/f m bh
NS lb 10 e/w E . 30 lb(IM) e/w/f i
before h 5w w/f i . 28 lb(IM) e/w/f i
in corresp. h 60w w/f md . 35 ub/h e/w/f d
in corresp. h 5w w/f m before 43 h e/w/f i lh
before h 10w w/f md before 2 lb e/f d lh
NS h 15w w/f E . 5w lb w/f d lh
NS h 15f f md . 30s/48e h s/e/w d
in corresp. h 55w w/f md before 45 lb e/w m bh
in corresp. h 30w w E ) 50 lb e/w m
in corresp. h nd f E . 75s/60e h s/e/w m
in corresp. lb nd e/w E . 177f h f c
in corresp. ub 50 e/f m . 45 h e/w/f md lh
before ub 5 e/f m . 90 h e/w/f d
before ub 20w w/f LIS . 70 h e/w E
before ub 30w w/f d . 80w lb w/f b
NS ub 90w w/f d . 3 lb e/w b
NS ub 90w w/f d corresp 3 lb/ub e/w b
NS ub 10w w/f d before 20 lb(IM) e/w/f md

in corresp. ub 20 e/w/f m . 28 ub e/w/f d
in corresp. lb 40 e/w E before 40 ub e/w/f d
NS lb(IM) 90w w/f m? before 100s/95e h s/e/w d lh
in corresp. ub 10 e/w m . 50 ub/h e/w/f d/b lh
in corresp. ub 35 e/w c before 25 lb e/w/f md lh
before ub/h 20 e/w LIS+E? . 35 ub/h e/w/f md/b lh
nd ub 90w w d before 40 ub/h e/w/f i
in corresp. h 20 e/w md before e h s/e/w d
NS(pause) ub 60 e/w/f d before 40 ub e/w/f d lh
NS(pause) h 3 e/w/f E corresp. 20s/30e h s/e/w/f E lh
NS(pause) h(IM) 12 e/w/f d(LIS?)
in corresp. ub 45 e/w c
in corresp. ub/h 30 e/w/f m
in corresp. ub 48 e/w/f d
in corresp. h(IM) 5 e/w/f d(LIS?)
in corresp. ub/h 30 e/w/f m
in corresp. ub/h 25 e/f E
in corresp. ub 20 e/w/f c
in corresp. h 50 e/w/f m
NS ub 5 w/f d
in corresp. ub/h 32 e/w/f E?
before h 18 e/w/f m
in corresp. ub 55 e/w m?
before h(IM) 23 e/w/f m?
nd h 5 e/w/f d
before h 360 e m
in corresp. h 10 e d/c?(5.34,23)
in corresp. ub 3 e/w/f d
in corresp. lb 58w w E
in corresp. ub 40 e/w/f m(5.38.06)
in corresp. ub 60 e/w/f d
NS h 25 e/f d
NS ub 18 e/f d
in corresp. h 3 e )
NS h 10 e d
in corresp. h 28 e m
in corresp. ub 45 e d
before ub 6 e/f m
before h 63 e/f i bh
NS ub 3 e/w/f d rh
before ub 3 e/w/f m bh
before ub 3 e/f i bh(IM)
in corresp. ub 4 e/w m rh
in corresp. h 5 e/w E rh
NS ub 8 e/f d rh
in corresp. h 20 e/w/f m(LIS?) rh
before h 1 e/f i bh(IM)
before h 15 e/w md bh
before ub 3 e/f md rh
NS ub 3 e/w/f d rh
NS h 70 e/f d rh
NS ub 5 e/w m bh
in corresp. h 7w w/f d rh
in corresp. ub/h 2 e/w/f m bh
in corresp. ub/h 55w w/f d rh
NS ub/h 10w w d rh
in corresp. h 15 e/w/f m rh
in corresp. ub 5 e/w/f d rh
in corresp. h 7 e/f E rh
in corresp. h 10 e/w/f E bh
NS lb 2 w/f E rh
in corresp. lb 10 e/w/f m rh
in corresp. lb 20 w E bh
before lb 10 e/w/f m(LIS?) bh
NS lb 45 w c bh
in corresp. lb 4 e/w E bh
NS lb 10 w E bh
in corresp. lb 55w w/f m rh
in corresp. lb 7 e/w E rh
in corresp. ub 85 e/w/f m bh
before h 40w w/f E rh
before h 5 e/w/f d rh
before h(IM) 8 e/w/f md(LIS?) rh
in corresp. h 20 e/w/f d rh
before h 5 e/w/f d rh
in corresp. h(IM) 10 e/w/f md(LIS?) rh
in corresp. ub 27 e/w/f E(m?) rh

Subject 1 Interviewer 1

strokes locus size P.A. gesture h strokes Locus size P.A. gesture h

before h 75 e/w/f d(md) rh

in corresp. lb(IM) 5 e/w/f md rh
in corresp. ub 4 e/w/f E rh
in corresp. h 40 e/w d rh
NS ub 75 e/w m bh
in corresp. h 15 e/w/f i rh
before h(IM) 5 e/w md(LIS?) rh
in corresp. h 20 e/w/f E(m?) bh
NS h 18 e/w/f E rh
in corresp. ub/h 8 e/w/f E(m?) bh
nd ub 3 e/w/f m(LIS?) bh
nd h(IM) 13 e/w/f d(LIS?) rh
in corresp. ub/h 10 e/w/f m bh
before h(IM) 17 e/w/f md rh
NS ub 20 e/w c bh
in corresp. ub/h 4 e/w/f E(m?) rh
in corresp. h 95s/90e s/e i rh
NS ub 80s/15e f d rh
before ub 5s/35e s/e m rh
NS ub 2 e/w c rh
before ub 90 e/w c bh
before h(IM) 78 e/w/f md(LIS?) rh
before ub/h 5 e/w E rh
in corresp. ub 20 e/w/f c bh
NS lb/ub 10 e/w E bh
in corresp. lb 4w w/f c rh
NS lb 8 e/w/f d/c rh
in corresp. lb 10 e/w/f E rh
in corresp. lb 2 w/f c rh
in corresp. lb 87f f d rh
in corresp. lb 90f f d rh
before lb(IM) 64 e/w/f md rh
NS ub 3 e/w d rh
before lb(IM) 7 e/w/f md rh
before h 24 e/w/f i rh
NS lb(IM) 15 e/w/f md rh
in corresp. h 18 e/w/f i rh
NS lb 20 e/w/f md rh
NS lb 20 e/f i rh
NS ub 10w w/f d rh
NS ub 3 e/w d rh
in corresp. h 19 e/w/f m bh
nd lb 20 e/w d rh
in corresp. h 5s/90e f m rh
NS lb 3 e/w/f d rh
before lb 47 e/w m bh

Subject 1 Interviewer 1

strokes locus size P.A. gesture h strokes locus size P.A. gesture h

in corresp. lb 35 e/w E rh
in corresp. lb 34 e/w/f md rh
before h(IM) 26 e/w/f m(LIS?) rh
in corresp. h 57 e/w/f E rh
before lb 40 e/w/f m rh
in corresp. h 16 e/w E rh
NS h 4 w/f c rh
before h 5 e/w/f m rh
NS ub 15 e/w/f d rh
in corresp. ub/h 2 e/w/f c rh
nd ub/h 143w w m rh
in corresp. ub 7 e/w/f c/d rh
NS ub 53 e/w/f d rh
NS h 68 e/w/f d rh
before h 10 e md rh
NS ub/h 9 e d rh
before h 3 e/w E rh
before h 4 e E rh
before h 3 e c rh
NS h 2 e/w/f d rh
before h 40w w/f d rh
NS h nd w/f md rh
NS h 2 e/w/f c rh
in corresp. h 4 e/w/f E rh
in corresp. ub 10 e/w E rh
in corresp. ub 3 e/w d rh
in corresp. h 8 e/w/f E rh
before ub/h 18 e/w d rh
before ub 15 e/w E rh
NS ub 20 e/w c rh
NS ub 3 s/e/w E rh/S
NS h 20w w/f d rh
in corresp. ub 22 e/w E lh
before lb(IM) 20 e/w/f md lh
in corresp. ub 40 e/w E rh
NS h 10 e/w/f m rh
in corresp. h 5 e/w E rh
in corresp. lb 45 e/w d rh
in corresp. h 63 e/w/f E rh
in corresp. h 22 e/w/f m rh
nd lb(IM) 44 e/w/f d rh
in corresp. lb 15 e/w/f md rh
in corresp. ub 14 e/w/f E rh
NS lb(IM) nd e/w/f d rh
before ub 48 e/w/f c rh
NS lb/ub 45 e/w/f E rh

NS ub 8 e/w d rh
NS lb/ub 40w w/f E rh
before h 80 e/w/f E rh
in corresp. ub 10 e/w/f E rh
NS ub 4 e/w c rh
before h(IM) 87 e/w/f m rh
NS ub 100 e/w/f d rh
NS ub 40 e/w d rh
in corresp. ub/h 60 e/f md? rh
before h 7 e E rh
in corresp. h 4s/60w s/e/w m bh
NS h 40s s/e/w m rh
before h 20s/37e f E rh
NS(listener) lb 20w w/f d lh
NS(listener) h 10 e/w/f d rh
NS(listener) h 74 e/w E rh
in corresp. lb 35 e/w m lh
NS lb 50s s/e i bh
in corresp. h 88 e/w/f d rh
in corresp. ub 55 e/w/f E rh
NS lb 60w w/f m rh
NS ub 60w w/f E rh

Subject 2

strokes locus size P.A. gesture h

in corresp. lb 25s/100e f d lh
NS ub/h 30 e/w/f E rh
NS ub 10 e/w/f d rh
NS ub 30s/30e f E rh
NS lb(IM) 20s/150e f d rh
NS ub 55s/110e f E rh
NS lb(IM) 30s/150e f d rh
before ub 10 e/w/f c rh
NS ub/lb 40 e/w/f E rh
NS lb(IM) 25 e/w/f d rh
NS h 80s/90e f m rh
in corresp. h 40s/10e f m rh
NS h 10 e m rh
in corresp. h 7 e/w E rh
in corresp. h 90s/100e f i bh
NS lb(IM) 115 e/w/f d lh
NS h 3 e/w/f E bh
NS(hes) h 15 e/f md lh
in corresp. ub 10s/3e s/e/w m bh
before h(IM) 5 e/w/f d rh
NS h 15 e/w/f m rh
NS ub 15 e/w/f E bh
before h 85 e/w/f d rh
NS ub 5 e/w/f E rh
NS ub 3 e/w/f E rh
in corresp. h 80s/95e f d rh
in corresp. h 85s/120e f d bh
in corresp. ub 10 e/w/f LIS bh
before ub 3 e/w/f E bh
in corresp. lb 10 e/w/f i rh
in corresp. h 45 e/w/f m bh
in corresp. h 43s/50e f b rh
NS h 90s s d lh
NS ub 60 e/w E rh
in corresp. h 30 e/w/f md lh
in corresp. h 5s/10e f md lh
NS h 3 e/w/f E lh
NS ub/h 23 e/w/f E lh
in corresp. ub/lb 40 e/w/f E bh
before ub/h 45 e/w/f d lh
before h(IM) 10 e/w/f E lh
NS ub 10 e/w/f c bh
in corresp. ub/h 3 e/w/f E bh
before ub 13 e/w/f c bh
in corresp. ub 3 e/w/f E bh
NS ub nd f E bh
before ub 10 e/w/f E bh
NS ub 13 e/w/f E bh
NS h(IM) 23 e/w/f E rh
NS ub 50 e/w/f E bh
before h 65 e/w/f i rh
before h 20 e i rh
NS lb(IM) 40 e/w/f E bh
in corresp. ub 90 e/w/f E rh
before h 28 e/w/f i rh
in corresp. h 15s/20e s/e/f i rh
before h 10 e/f i bh
before h 8 e/w E bh
NS ub 15s/30e f m bh
NS ub/lb 40 e/w/f E bh
in corresp. ub 50 e/w/f m rh
before lb(IM) 35s/50e s/e/w d rh
before h 110s/90e f i rh
before h 20s/20e f i rh
in corresp. h 10 e/w/f E rh
in corresp. h 35s/25e f m rh
in corresp. h nd f m rh
NS h 5s/60e s/e/f m rh
NS h 10 e/w E rh
before h 20s/60e f i lh

Appendix II. Gesture in Deaf Subjects: Table of Conversational Turns
Timing of speech turns

I1 S1 S2
01.14 01.01 00.02
00.35 00.25 00.01
00.20 00.40 00.01
00.15 00.10 00.01
00.06 00.12 00.02
00.02 00.20 00.05
00.05 00.02 00.01
00.05 00.11 00.01
00.02 00.05 00.06
00.02 00.10 00.01
00.05 00.05 00.09
00.03 00.04
00.03 00.02 00.02
00.02 00.02 00.08
00.04 00.04 00.01
00.02 00.06 00.16
00.02 04.00 00.10
00.04 00.09 00.01
00.03 00.03 00.02
00.03 00.01 00.01
00.04 00.06
00.05 00.04 00.01
00.02 00.02 00.06
00.03 00.01 00.08
00.10 00.01 00.04
00.04 00.05 00.10
00.02 00.04
00.01 00.01
00.11 00.01
00.01 00.04
00.02 00.03
00.02 00.01
00.12 00.04
00.01 00.02
00.04 00.02

Index of Topics
Abstraction; 3; 4; 46; 60; 83; 163 Gesture Planner; 98
Adaptors; 21 Gesture timing; 93
Affect displays (Argyle); 21 Global mappings. See Neuronal Group
Aphasia; 37; 166; 170 Selection Theory
Arbitrariness; 3; 23; 46; 47; 83 GRETA; 151; 156; 157; 158; 159
Articulatory gestures; 83; 88 Head nodding in congenitally blind
Audio-visual Communication; 87 children; 40
Autonomous gestures (Kendon); 22 Hemisphere (brain); 140; 148; 171; 172;
AVC; 87; 88; 89; 91; 99; 100; 101; 105; 178
107 iconic; 20
Awareness; 3; 13; 20; 21; 44; 46; 47; 51; iconics; 20; 22; 23; 24; 25; 28; 29; 46; 47;
54; 83; 112; 163 52; 54; 55
Batons. See beats; Bavelas; 22; 111; 147; Ideographs; 20; 21; 22
166 Illocutionary Act; 12
Beats; 22; 24; 25; 51; 52; 54 Illustrators; 20; 21
Behavioural Scheduling; 153 Informal Logic; 12; 171
Behaviour-based Informative and communicative behaviour;
architectures; 153 14
Biology; 1 Inner speech; 30; 110
Classification of gestures; 24 Intentionality; 3; 15; 16; 20; 21; 24; 46;
Communication 47; 51; 54; 56; 83; 103; 112; 149; 163;
communication definition; 14 164
Conceptualizer; 97; 98; 99; 100 Interaction triphase (Jousse); 8
Conduits; 22 Interactive behavior; 15
Connectionist model; 57 Intrinsic morphology; 76; 77; 93; 94; 163
Conventional gestures (Argyle); 21 Kendon's continuum; 23; 84
Co-verbal gestures, definition; 23 Kineme; 7; 10; 11; 88
Cultureme; 8 Kinemorph; 10; 11; 88
Deictics; 20; 25 Kinesics
Discourse planner; 160 Poyatos; 15
ECA; 151 Kinetic unit; 10; 11; 80; 81; 82; 104
Egocentric speech; 110 Kinetics; 7; 10; 92
Emblems; 20; 21; 23; 24; 25; 29; 51; 52; Kinetographs; 20; 21
54; 76 Language Action Theory; 12
EMMA; 152 Left Brodmann Area (BA) 44
Ethology; 1 mirror neurons; 61
Evocators; 9 Lenneberg; 57; 62; 174
Expression planner; 160 Lexicon; 99; 100
Extension; 46 LIS; 184; 186; 187; 188; 189; 190; 192
External speech; 110 Locus in gesture; 76; 94; 180
Function of gestures; 3 Manual action; 58; 64; 71; 72; 84
Function-based Mathematical metaphors; 22
architectures; 153 MAX; 152
Fuzzy logic; 159 Metaphors; 5; 22; 24; 25; 39; 46; 47; 51;
Gesticulation; 13; 22; 23; 24 52; 54; 55; 84; 92
Gestuary; 99; 100 Mismatches; 31; 32; 148
Gesture definition; 22 Modularity of Mind; 57
Gesture phase; 73; 88 Morphology; 4; 12; 88; 92; 93; 94; 96;
Gesture phrase; 11; 73; 81; 82; 88; 93; 94; 107; 163
135; 136
Multi-tasking; 4; 30; 57; 63; 64; 66; 67; Recursion; 4; 87; 96; 97; 100; 101; 106;
69; 70; 71; 72; 73; 87; 164 107; 163
Natural Language Processing; 28; 55; 170 Recursion in a narrow sense; 100
Neural Darwinism; 60; See Neuronal Regulators; 21
Group Selection Theory Rheme; 45
Neural maps. See Neuronal Group Selection RN
Theory recursion narrow; 100
Nexi; 152; 154; 155; 156; 157; 158 Self-manipulation; 20
Non-verbal behavior. See NVB Size; 76; 93; 94
Non-verbal Communication; 13; See NVC "Sloppy hand" (Kita); 159
definition; 16 Speaker-oriented; 33; 45
Oscillations; 93 speech; 1; 7; 12; 21; 22; 23; 87
Personality displays (Argyle); 21 Speech Act Theory; 12
Phoneme; 7; 10 Speech Generation; 153
Phonological oppositions; 88 Speech-related gestures (Argyle); 21
Pictographs; 21; 22 split-brain; 59; 148
Palm-down-flap; 139 Stroke; 11; 29; 39; 73; 81; 83; 91; 93;
Planning; 55 122; 127; 128; 131; 132; 134; 135; 145
Point of Articulation; 79; 94 Synchronization; 37; 76; 164
Posture shifting; 10 Synchronisation pattern; 29; 68; 74; 76;
Pragmatics; 1; 12; 167 83; 84
Prototype Category; 43; 46 Theme; 45; 172
Prototype Theory; 3; 49; 50; 51; 81; 179 Theory of Brain Modularity; 57
Proxemics; 8 Theory of Neuronal Group Selection; 60
RB; Tone Unit; 29; 73; 81; 94
recursion broad; 100; 101 Unified Theory of the Structure of Human
REA; 151; 156 Behavior; 2; 177
User model; 152; 153
Working Memory; 97; 98

Index of Authors
De Laguna; 1; 34; 43; 54; Davis & Vaks; 24
Alibali et al.; 111; 147 De Jorio; 1
Armstrong & Katz; 62 De Mauro; 2; 12
Armstrong, Stokoe & Wilcox; 58 De Renzi; 38
Arbib; 58; 71; 72; 83; 84; De Ruiter; 44; 98; 99; 111; 147;
Argyle; 9; 13; 15; 21; 43; Dekker & Koole; 40
Armstrong; 19; 58; 60; 62; 63; 72; 109; Diderot; 1
165 Dittman & Llewelyn; 54
Armstrong, Stokoe & Wilcox; 58; 62; 72; Dittmann; 43;
109 Dore; 34;
Austin; 12; Duffy; 38;
Bashir; 148 Duncan; 29
Basso Luzzati & Spinnler; 38; 58 Edelman; 4; 25; 60; 61; 62; 63; 71; 85;
Bates; 34; 35; 36; 109
Bates et al; 34; 35 Efron; 7; 10; 20
Bavelas; 22; 111; 147; Eibl-Eibesfeldt; 1; 8; 9; 159;
Birdwhistell; 10; 11; Ekman; 1; 3; 7; 13; 14; 15; 20; 21; 22; 46;
Blass, Freedman & Steingart; 39 Ekman & Friesen; 3; 7; 13; 14; 15; 20; 21;
Bloomfield; 1; 109; 148; 22; 46
Bock; 28; Ferrari; 2
Bolinger; 1; Feyereisen; 28; 37; 38; 39; 58;
Bongioanni; 62; Feyereisen & Seron; 28
Boukricha; 151; 152 Feyereisen, Barter, Clerebaut,; 38
Breazeal; 151; 152 Feyereisen, Paul; 38;
Bressem; 93 Flores and Ludlow; 12
Bressem & Ladewig; 93 Fodor; 57;
Broca; 29; 57; 58; 59; 62; 72; 83; Freedman; 9; 20; 39; 43; 54;
Bruner; 34; 35; Freedman & Hoffman; 9; 20
Bull & Connelly; 73 Freud; 33; 57
Butterworth & Hadar; 9; 28; 29; 30; 43; Fricke et al.; 89
73; 83; 85; 147 Frick-Horbury & Guttentag; 43; 54
Byrne; 96 Friedman; 35;
Carducci; 64 Gardner et al; 39
Carlomagno & Cristilli; 59 Gilbert; 12;
Cassell; 3; 12; 13; 24; 27; 30; 31; 32; 33; Givón; 58;
34; 45; 91; 103; 151; 153; 157; 158; Goldin-Meadow; 34; 40;
Cassell & Prevost; 30 Goldstein; 19
Cassell, McNeill & McCullough; 3; 27; 31; Goodall; 8
32; 33; 34 Goodwin; 12; 39
Chomsky; 4; 57; 96; 100; 101; Green and Marler; 8
Chomsky & Miller; 96 Gullberg; 2; 33
Cicone; 39; Hagoort; 59
Cienki; 25 Hadar & Butterworth; 140
Cohen & Harrison; 111 Halliday; 139;
Condillac; 1; 84; Hartman; 151
Condon & Ogston; 7; 63 Hauser and Chomsky; 4; 96
Corballis; 58; 84; Hayashi et al.; 63; 74
Cristilli; 59 Heilman et al.; 38; 58
D'Odorico & Levorato; 35 Hewes; 34
Damasio; 61; Hines; 62
Darwin; 1; 8 Hockett; 19
Holquist; 33 Morris; 1; 9; 13; 15; 24

Hudson; 8 Morrow & Ratcliff; 39
Iverson; 40; Moscovici; 43;
Iverson & Goldin-Meadow; 40 Müller; 25;
Jackendoff; 58; Nobe; 63
Jackson; 37; Nöth; 11
Jakobson; 89; Osgood; 101
Jäncke et al.; 59 Parke, Shallcross, & Anderson; 40
Jason; 38; 58; Parrill; 2; 25; 92
Jousse; 1; 8 Parrill & Sweetser; 25
Kelso; 63; 74; Partridge; 19
Kendon; 1; 10; 11; 19; 22; 23; 24; 27; 28; Pavelin-Lesic; 51
29; 43; 44; 45; 63; 73; 74; 76; 81; 83; Pelachaud; 151; 156;
84; 88; 91; 92; 93; 111; 141; 142; 147; Person; 13; 24;
Kita & Lausberg; 59; 148 Piaget; 109; 110
Kita de Candappa & Mohr; 59; 140; 148 Pike; 2
Kita; 30; 59; 140; 141; 148 Pinker & Jackendoff; 96
Kolb & Whishaw; 37 Place; 84
Krauss; 27; 43; 44; 45; 54; 83; 98 Posner & Di Girolamo; 59
Krauss et al.; 43; 44 Poyatos; 8; 15; 16
Lenneberg; 57; 62 Poyatos, F.; 8; 16
Leung & Rheingold; 36 Pulvermüller; 59;
Levelt; 4; 22; 23; 97 Rauscher, Krauss & Chen; 27; 43; 63
Levy; 23; 63 Rehak et al; 39
Lieberman; 96 Rimé; 43; 44; 83; 111; 139
Lock; 36 Rizzolatti; 83
Lurija; 109; 110 Rosenfeld; 9; 13; 20
Maccoby & Jacklin; 62 Rossi-Landi; 12; 89; 178; 179
MacKay; 2; 14; 15; 16; 23 Rossini; 33; 48; 80; 92; 140; 145; 147;
Mahl; 1; 9; 20; 111 151; 156; 157; 158; 159; 160
Mancini; 151 Ryle; 12
Manly; 40 Sakurai; 59
Manzoni; 64 Saussure; 22; 91
Martinet; 89 Scalise; 88
Massaro; 4; 34; 91 Scheflen; 63
Masur; 36; Schegloff; 12
McCullough; 3; 4; 9; 11; 12; 16; 21; 22; Searle; 12
23; 24; 25; 27; 28; 29; 30; 31; 32; 33; Shallice; 38
34; 38; 39; 40; 44; 45; 48; 54; 56; 58; Silberberg & Fujita; 83
63; 72; 73; 74; 84; 87; 88; 92; 93; 94; Simone; 88; 89; 96
95; 110; 135; 136; 137; 138; 139; 147; Skrandies; 59
148; 149; 158 Sobrero; 2
McNeill; 1; 2; 3; 4; 9; 11; 12; 16; 21; 22; Stephens & Tuite; 22; 29
23; 24; 25; 27; 28; 29; 30; 31; 32; 33; Stokoe; 19; 58; 60; 62; 63; 93; 109
34; 38; 39; 44; 45; 48; 58; 63; 72; 73; Talmy; 33
74; 84; 87; 88; 91; 92; 93; 94; 110; Taylor; 3
137 Thorpe; 8
McNeill & Levy; 23 Tomasello; 58; 84
McNeill & Pedelty; 39; 58 Tranel & Damasio; 59
Melinger & Levelt; 147 Trevarthen & Hubley; 36
Miller; 4; 59 Tylor; 1
Mittelberg; 25 Van Meel; 37
Mondada; 12 Varney & Damasio; 38; 58
Moro; 59 von Cranach & Vine; 9
von Frisch; 8 Wernicke; 39; 62

Vygotskij; 9; 30; 36; 109; 110 Whitney; 3
Vygotskij & Lurija; 30; 109 Wilcox; 19; 60; 63
Wachsmuth; 151; 152 Willems et al.; 59
Watson & Graves; 8 Wittgenstein; 12
Weizenbaum; 97 Wundt; 1
Werner & Kaplan; 35; 36; 43; 54 Zaidel; 59