
CogInfoCom 2014 5th IEEE International Conference on Cognitive Infocommunications November 5-7, 2014, Vietri sul Mare, Italy

Temporal Dynamics of Speech and Gesture


in Autism Spectrum Disorder
Anna Lambrechts, Sebastian Gaigg, Kielan Yarrow
Autism Research Group
City University London
London, UK
anna.lambrechts.2@city.ac.uk

Katie Maras
University of Bath
Bath, UK

Riccardo Fusaroli
Aarhus University
Aarhus, Denmark

Abstract— Autism Spectrum Disorder (ASD) is characterized by difficulties in communication and social interaction. Abnormalities in the use of gestures or the flow of conversation are frequently reported in clinical observations and contribute to a diagnosis of the disorder, but the mechanisms underlying these communication difficulties remain unclear. In the present study, we examine the hypothesis that the temporal dynamics of speech and gesture production are atypical in ASD and affect the overall quality of communication. The context of a previously published study of memory in ASD (Maras et al., 2013) provided the opportunity to examine video recordings of 17 ASD and 17 TD adults attempting to recall details of a standardized event they had participated in (a first aid scenario). Results indicated no group difference in the use and coordination of speech and gesture: both groups produced the same quantity of movement over time (t(33)=-0.165, p>.8), and gestures were produced within the same time window and with a similar distribution by ASD and TD individuals (p=.042). Similarly, no group differences were found in the subjective ratings of the quality of communication: in both groups the use of gestures improved comprehension and engagement from the listener. Overall the current data do not suggest that ASD individuals experience more difficulties than TD participants in time processes relevant to communicating personally experienced events. However, large inter-individual differences could contribute to communication difficulties in some participants. It will be important for future studies to examine the timing of communicative behaviors during reciprocal interactions, which place demands not only on coordinating speech with gesture but also on coordinating one's own behavior with that of others.

Keywords— Autism Spectrum Disorder; Gesture; Speech; Communication; Temporal dynamics

I. INTRODUCTION

Individuals with Autism Spectrum Disorder (ASD) experience difficulties with social interactions and communication. Some of the observations leading to a diagnosis of ASD during a clinical interview, such as the Autism Diagnostic Observation Schedule [1], include atypical prosody and atypical use of gestures. The evidence published so far supports some aspects of these observations. Wetherby et al. (2004) reported that the inventory of gestures at 2 years of age was the strongest predictor of autism, and Colgan et al. (2006) found that infants who would later be diagnosed with ASD produced a lesser variety of gestures, but with similar frequency and initiation to their Typically Developing (TD) counterparts. De Marchena & Eigsti (2010) found that adults with ASD produce iconic gestures with the same frequency as TD individuals but with a longer delay between a gesture and its related speech. Other aspects of speech-accompanying movements in ASD could modify the efficiency of gesture. Tantam, Holmes, & Cordess (1993) reported that ASD individuals produce more self-stimulatory gestures, which could interfere with the use of gestures. Cook, Blakemore, & Press (2013) tested the kinematics of movement in adults and found that ASD individuals did not minimize jerk to the same extent as their TD counterparts, and moved with greater acceleration and velocity. Overall, the evidence suggests that rather than showing reduced or impaired gesturing, individuals with ASD demonstrate atypical dynamics when using gestures and affected embodiment [7]. In another domain, recent research on timing and time perception in ASD points to some difficulties with short duration processing [8]. These are precisely the durations useful in communication, for instance to determine the appropriate duration of a silence between two sentences or to coordinate speech with accompanying gestures.

In the present study we aimed to characterize the temporal dynamics of speech and speech-accompanying gesture in ASD using naturalistic recordings of high-functioning adults with and without ASD. Using naturalistic recordings of speech from both diagnosis groups we looked at different quantifiers. First, the temporal lock between quantity of movement and speech volume provided an unbiased measure of temporal

978-1-4799-7280-7/14/$31.00 ©2014 IEEE

dynamics. Second, segmenting gestures using a classic linguistic approach allowed us to compare the performance of our participants with the existing literature. Finally, we collected quality of communication ratings to evaluate the impact of gestures on the overall perception of speech.

Table 1. Demographic data including age, verbal IQ (VIQ), performance IQ (PIQ) and full-scale IQ (FIQ).
II. METHOD

A. Participants
17 adults with a diagnosis of ASD and 17 TD adults were recruited in the context of a previous study [9] and matched on age, verbal, performance and full-scale IQ as measured by the WAIS-R or WAIS-III UK (The Psychological Corporation, 2000) (see Table 1). After taking part in a live event scenario, participants were asked to recall what had happened, and their answers were videotaped for later transcription. They retrospectively gave their informed consent for their videos to be used in the context of the present analysis.

B. Material
Participants took part in a first-aid live scenario and were later interviewed about what they could recall. The first part of the interview was a free recall task ("tell me everything you can remember that happened from the moment you entered the room"), followed by a series of questions. Participants were videotaped from the waist upwards, and no mention was made of gesture. In this study we consider only the free recall part of the interview, in which participants addressed the experimenter but did not engage in a back-and-forth conversation. The collected recordings had an average duration of 5'59" (±2'01") in the TD group and 6'26" (±3'01") in the ASD group. An independent t-test showed that recording length did not differ between groups (p>.6).

III. ANALYSIS

Recordings were subjected to 3 different analyses.

A. Temporal dynamics of communication
A measure of quantity of movement was obtained for each recording by counting the number of pixels whose luminosity changed by more than 30 dB (the change in luminosity observed when filming the empty room) from frame n to frame n+1 in the grayscale videos. The volume of speech was extracted using Praat [10]. Cross-correlation scores were then computed using Matlab (http://www.mathworks.com/). This score indicates how systematically a change in movement was locked in time to a change in speech volume, and what the characteristic delay (lag) was. In order to compare coefficients across participants, we normalised the scores for each participant using their individual maximum score.

B. Gesture segmentation
Two coders (one author and one naive coder) segmented the videos using the ANVIL software [11]. For each gesture, the onset, offset and stroke times were defined, as well as the category of gesture in McNeill's classification [12]. Gestures were classified as iconic (describing a concrete element), metaphoric (illustrating an abstract concept), deictic (pointing gestures, in space or time) or beats (content-deprived gestures equivalent to a visual accent). All other movements (e.g. scratching) were not considered in the analysis.

C. Quality of communication ratings
For each recording, a 2-minute excerpt was extracted. Videos were edited so that a black mask hid the speaker's face at all times. The auditory stream was also extracted as an independent file. 30 naive adults were randomly assigned to one of two groups. Each 2-minute excerpt was presented as an audio-visual video to one group and as an audio-only recording to the other group. The format of presentation was counterbalanced between groups. Viewers were asked to fill in a 6-item questionnaire (based on and extended from de Marchena & Eigsti, 2010) evaluating the quality of communication in the excerpt (see Table 2).
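The movement quantification and normalised cross-correlation described in the temporal-dynamics analysis can be sketched as follows. This is an illustrative sketch only: the original analysis was run in Matlab on Praat-extracted volume, whereas here the array shapes, the default threshold value and the lag-sign convention are assumptions for demonstration.

```python
import numpy as np

def quantity_of_movement(frames, threshold=30.0):
    """Per frame transition, count pixels whose luminosity change exceeds
    a noise threshold (in the paper, estimated from an empty-room recording)."""
    frames = np.asarray(frames, dtype=float)      # (n_frames, height, width), grayscale
    diffs = np.abs(np.diff(frames, axis=0))       # |frame n+1 - frame n|
    return (diffs > threshold).sum(axis=(1, 2))   # changed-pixel count per transition

def normalised_xcorr(movement, volume, max_lag):
    """Cross-correlation between quantity of movement and speech volume for
    lags in [-max_lag, +max_lag] (in frames), normalised by the participant's
    peak absolute score so that coefficients are comparable across people.
    Assumes the two series have equal length."""
    m = np.asarray(movement, dtype=float)
    v = np.asarray(volume, dtype=float)
    m = m - m.mean()                              # remove means before correlating
    v = v - v.mean()
    full = np.correlate(m, v, mode="full")        # scores for lags -(N-1)..(N-1)
    mid = len(m) - 1                              # index of the zero-lag score
    scores = full[mid - max_lag : mid + max_lag + 1]
    return scores / np.max(np.abs(scores))        # per-participant normalisation
```

For two identical zero-mean series the normalised score peaks at the central (zero-lag) position, mirroring the null-lag maximum reported in the Results.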

IV. RESULTS

D. Temporal dynamics of communication
A mixed-sample, repeated-measure Analysis of Variance (ANOVA) was conducted on normalized cross-correlation
TABLE I.

      ASD (n=17)              TD (n=17)
      Mean   SD    Range      Mean   SD    Range
Age   41.9   13.5  25-62      46.6   12.2  26-61
VIQ   110.4  11.2  81-123     111.1  12.7  82-128
PIQ   106.7  13.5  84-128     109.4  13.8  75-136
FIQ   109.4  12.4  81-127     111.3  13.7  77-135

TABLE II.

General comprehension
  Q1  How well were you able to follow what the person was saying?
  Q2  Was the person's report organised in a clear sequence of events?
Quality of expression, flow
  Q3  How well did the person express himself/herself?
  Q4  Was the person speaking fluently and clearly?
  Q5  How engaged were you while listening to the recording?
  Q6  How well could you picture the scene based on the person's description?

Table 2. Quality of communication questionnaire. Viewers rated each recording (audio-visual or audio only) on these 6 questions using a 1 (low) to 9 (high) scale.



Fig. 1. Normalised cross-correlation scores between quantity of movement and speech volume, for lags ranging from -5 to +5 seconds. A negative lag indicates that a change in movement is followed by a change in volume, whereas a positive lag indicates that a change in volume is followed by a change in movement.

scores between quantity of movement and speech volume, with lag as a within-group factor and diagnosis as a between-group factor. Results showed a main effect of lag (p<.05), indicating that in both groups the maximum cross-correlation score was obtained for a null lag, i.e. a change in quantity of movement was most often synchronous with a change in speech volume (Fig. 1). There was no main effect of, or interaction with, the factor group (t(33)=-0.165, p>.8; p=.042).

Fig.2. Frequency and duration of gestures segmented in a sample of


recordings (7 ASD and 5 TD participants).
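The frequency and duration summaries plotted in Fig. 2 can be derived directly from the segmented annotations. The sketch below assumes a simple (onset, offset, category) representation of each gesture; the times and categories shown are invented for illustration, not taken from the study's data.

```python
from collections import defaultdict

def per_category_stats(annotations):
    """Frequency and mean duration (seconds) of gestures per McNeill category,
    given (onset_s, offset_s, category) tuples as produced by segmentation."""
    durations = defaultdict(list)
    for onset, offset, category in annotations:
        durations[category].append(offset - onset)
    return {cat: {"count": len(d), "mean_duration": sum(d) / len(d)}
            for cat, d in durations.items()}

# Invented example annotations for one participant.
gestures = [
    (1.2, 2.0, "iconic"),
    (3.5, 3.8, "beat"),
    (5.0, 6.1, "iconic"),
    (7.4, 7.9, "deictic"),
]
stats = per_category_stats(gestures)
```

Running such a summary per participant, with category as the within-group factor, yields the frequency and duration matrices entered into the ANOVAs reported below.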

E. Gesture segmentation
Gestures were segmented in a sample of recordings (7 ASD and 5 TD). Results show that participants produced a majority of iconic gestures. Preliminary mixed-sample, repeated-measure ANOVAs conducted on the frequency and the duration of gestures, with type of gesture as a within-group factor and diagnosis as a between-group factor, revealed no main effect or interaction involving the factor group. This suggests that participants with and without ASD produced a similar amount of gesture in each category and that each gesture lasted on average the same time in both groups (Fig. 2).

F. Quality of communication ratings
A mixed-sample, repeated-measure ANOVA was conducted with question as a within-subject factor and diagnosis as a between-group factor. Results show a main effect of question but no main effect or interaction involving the factor group. Post-hoc t-tests indicated that Q6 (relative to picturability) showed a significantly positive gain.

Fig.3. Gain in the quality of communication rating when gestures were visible.
A positive gain indicates that gestures improved the perceived quality whereas
a negative gain shows that gestures decreased the perceived quality of
communication.
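The gain plotted in Fig. 3 is simply the difference between the mean ratings of the audio-visual and audio-only presentations of the same excerpt. A minimal sketch of that computation (the ratings below are invented for illustration; a positive value for a question means visible gestures improved its perceived quality):

```python
from statistics import mean

def rating_gain(audio_visual, audio_only):
    """Per-question gain: mean audio-visual rating minus mean audio-only
    rating on the 1-9 scale. Positive gain = gestures improved the rating."""
    return {q: mean(audio_visual[q]) - mean(audio_only[q]) for q in audio_visual}

# Invented ratings for two of the six questions (each list = one rater group).
audio_visual = {"Q1": [7, 8, 6], "Q6": [8, 9, 7]}
audio_only = {"Q1": [7, 7, 7], "Q6": [5, 6, 7]}
gain = rating_gain(audio_visual, audio_only)   # here Q6 shows a positive gain
```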

V. DISCUSSION
In the current study we investigated the temporal dynamics of speech and speech-accompanying gestures in Autism Spectrum Disorder. Participants recalled a live event they had previously taken part in, addressing an experimenter.




A cross-correlation analysis looking at the dynamics of movement and speech volume revealed that on average participants changed how loud they spoke at the same time as they moved. Movement is used here as a proxy for gesture: some noise is introduced by non-gesture movements, but these are nevertheless part of the dynamics of speech. In addition, this measure allowed us to look at the temporal lock between speech and gesture without any a priori assumptions about the onset and offset of gestures. The observed synchrony of changes in movement and changes in volume is in line with the gesture literature, which indicates that gestures happen exactly as one demonstrates or accentuates a phrase [13]. It contrasts, however, with de Marchena and Eigsti's [4] finding that iconic gestures are produced with a short delay before the occurrence of the related word. It is possible that the characteristic delay between speech and gesture depends on the category of gesture, which our analysis cannot distinguish. Importantly, this measure did not differ between groups, indicating that ASD individuals coordinated their gestures with speech in a similar way to their typical counterparts.


Segmentation of gestures in a sample of ASD and TD speech recordings does not support the infancy and childhood literature claiming that individuals with ASD produce fewer gestures or a lesser variety of them. In the current setting we found no difference between groups in the frequency, type or duration of gestures. It is notable, however, that these counts showed high variability between individuals. One possibility is that atypical use of gesture characterizes not the whole ASD population but a subset of it, e.g. individuals with greater language difficulties. It is interesting in any case that participants in both groups produced appropriate types of gesture, in this case iconic gestures, which allowed them to describe and represent the scene for their interlocutor.
Finally, an interesting finding was that gestures were found to increase the perceived quality of communication in ASD individuals as in TD individuals. This indicates that, in spite of potential atypicalities in the kinematics of gestures in ASD, movement is perceived as contributing positively to the quality of communication. Noticeably, the aspect of communication that was enhanced by the use of gesture was the picturability of the scene, which fits nicely with the fact that participants produced a majority of iconic (descriptive) gestures.
Overall the current results do not support differential dynamics of speech and gesture coordination in ASD. However, the disorder is characterized by large inter-individual variability, and it is possible that atypicalities are visible only in some individuals, whereas other adults might be unimpaired or able to compensate.
REFERENCES


[1] C. Lord, S. Risi, L. Lambrecht, E. H. Cook, B. L. Leventhal, P. C. DiLavore, A. Pickles, and M. Rutter, Autism Diagnostic Observation Schedule (ADOS), vol. 30, no. 3. Western Psychological Services, 2000, pp. 205-23.
[2] A. M. Wetherby, J. Woods, L. Allen, J. Cleary, H. Dickinson, and C. Lord, "Early indicators of autism spectrum disorders in the second year of life," J. Autism Dev. Disord., vol. 34, no. 5, pp. 473-93, Oct. 2004.
[3] S. E. Colgan, E. Lanter, C. McComish, L. R. Watson, E. R. Crais, and G. T. Baranek, "Analysis of social interaction gestures in infants with autism," Child Neuropsychol., vol. 12, no. 4-5, pp. 307-19, Aug. 2006.
[4] A. de Marchena and I.-M. Eigsti, "Conversational gestures in autism spectrum disorders: asynchrony but not decreased frequency," Autism Res., vol. 3, no. 6, pp. 311-22, Dec. 2010.
[5] D. Tantam, D. Holmes, and C. Cordess, "Nonverbal expression in autism of Asperger type," J. Autism Dev. Disord., vol. 23, no. 1, pp. 111-33, Mar. 1993.
[6] J. L. Cook, S.-J. Blakemore, and C. Press, "Atypical basic movement kinematics in autism spectrum conditions," Brain, vol. 136, no. Pt 9, pp. 2816-24, Sep. 2013.
[7] I.-M. Eigsti, "A review of embodiment in autism spectrum disorders," Front. Psychol., vol. 4, p. 224, Jan. 2013.
[8] M. J. Allman, "Deficits in temporal processing associated with autistic disorder," Front. Integr. Neurosci., vol. 5, p. 2, Jan. 2011.
[9] K. L. Maras, A. Memon, A. Lambrechts, and D. M. Bowler, "Recall of a live and personally experienced eyewitness event by adults with autism spectrum disorder," J. Autism Dev. Disord., vol. 43, no. 8, pp. 1798-1810, Dec. 2013.
[10] P. Boersma and D. Weenink, "Praat: Doing phonetics by computer," Ear and Hearing, vol. 32, no. 2, p. 266, 2011.
[11] M. Kipp, "ANVIL: A generic annotation tool for multimodal dialogue," in Seventh European Conference on Speech Communication and Technology, 2001, pp. 1367-1370.
[12] D. McNeill, Hand and Mind: What Gestures Reveal about Thought. University of Chicago Press, 1992, pp. 104-133.
[13] A. Kendon, "Classifying gestures," in Gesture: Visible Action as Utterance. Cambridge: Cambridge University Press, 2004, pp. 84-107.
