
Gesture-Speech Unity or Embodiment

Its Origin in Human Evolution [1]

David McNeill
University of Chicago
Abstract
1. Why we gesture
2. Language origin
2.1 Gesture-first
2.1.1 Models of supplanting and scaffolding
2.1.2 Problems with pantomime
2.1.3 Two current gesture-first theories
The mirror system hypothesis
The recursion hypothesis
2.1.4 End of gesture-first
2.2 Mead's Loop
2.2.1 How Mead's Loop engendered Growth Point properties
2.2.2 Answering puzzles
Comments by Renia Lopez-Ozieblo
Responses by David McNeill
2.3 Last word on the origin
3. Timeline
References

Abstract

Minimal packages of language embodiment have been called growth points (GPs). In a GP,
gesture and speech are inherent and equal parts. Out of a GP comes speech orchestrated
around a gesture. Can theories of language origin explain this dynamic process? A popular
theory, gesture-first, cannot; in fact, it fails twice: it predicts what did not evolve (that
gesture was marginalized when speech emerged), and it does not predict what did evolve (that
there is gesture-speech unity). A new theory, called Mead's Loop, is proposed that meets the
test. Mead's Loop agrees that gesture was indispensable to the origin of language but holds
that gesture was not first, that any gesture-first could not have led to language, and that to
reach it gesture and speech had to be equiprimordial.


Keywords: Gesture, dialectic, dynamic dimension, psychological predicates, language,
thought and action, cognitive being, origin of language

[1] Based largely on Chapters 2, 3, 5 and 6 of McNeill, D. How Language Began: Gesture and
Speech in Human Evolution, Cambridge, 2012. Prepared for Embodied Language II, Christ's
College, Cambridge, 2-4 Sept., 2013.

1. Why we gesture

Why do we gesture? Many would say that it brings emphasis, energy and
ornamentation to speech (which is assumed to be the core of what is taking place); in
short, gesture is an add-on. However, evidence is against this. The reasons we
gesture are more profound. Language is inseparable from it. While gestures enhance
the material carriers of meaning, the core is gesture and speech together. They are bound
more tightly than saying the gesture is an add-on or ornament implies. They are
united as a matter of thought itself. Even if, for some reason, the hands are
restrained and a gesture is not externalized, the imagery it embodies can still be
present, hidden but integrated with speech (it may surface in some other part of the
body, the feet for example). But the ultimate reason we gesture is that the origin of
language was the origin of gesture and language jointly; they cannot be separated, and
this is the reason we gesture.
I can do no better than start with Humboldt's distinction between Ergon and Energeia:

"An important distinction kept reemerging; this was Humboldt's distinction between
language as Ergon (language viewed as structure) and as Energeia (language as an
embodied moment of meaning located both in the organism and in the medium that the
organism uses for expression). The latter is language at the moment of its use, alive,
in an actor." (Elena Levy, quoting Glick 1983, on Heinz Werner).

Saussure (1959, in lectures around 1910) crystalized Ergon into the synchronic
approach, wherein language is considered a static entity, its components all seen
panoramically at one theoretical instant. In language, Saussure said, there are only
differences. To see them the synchronic view is essential. This approach is
fundamental to most current-day linguistics, an academic field founded upon it, and
a century of insights attests to its intellectual vigor.
But there is also Energeia. "Why do we gesture?" is a question in this domain, that of
the dynamics of language. As it intersects language form, gesture brings language to
life; without it, language stands inert, a crystalline object.
Humboldt's Ergon and Energeia are conceptualized here as dimensions of
language that cross in the growth point, described below. The dimensions have
classically been called "linguistic" (Ergon) and "psychological" (Energeia), but better, if
less colorful (and less proprietary), terms are "static" and "dynamic." Some phenomena
are more accessible or prominent on one dimension, others on the other, but the
dynamic and static dimensions cannot be isolated. They intersect and interact. The
dimensions are structured on different principles and draw on different methods of
description and analysis (the field of linguistics, as mentioned, specializing on the
static dimension).
In a trenchant remark concerning the poetic function of language, Roman
Jakobson concisely explained the two principal axes of the static dimension. He
wrote that "[t]he poetic function projects the principle of equivalence from the axis of
selection into the axis of combination" (Jakobson 1960, p. 358). By poetic function, he
means a process whereby sequences come to have contrastive values. His axis of
selection is the paradigmatic axis: the contrasts established when a linguistic form
is selected from a set of equivalents ("sheep" and "mutton" are equivalents that
contrast on an axis of selection in English). The axis of combination is the
syntagmatic axis: by combining words, new linguistic units and values appear
(combining "hit" and "ball" into "hit the ball" generates a verb phrase, a unit at a
higher level, and a direct object, a value "ball" does not have outside the
combination).
The dynamic dimension, cross-cutting the static, could be termed the
activity of language, but I am calling it "inhabiting" language. Inhabiting is done
with one's being, thought and action, and becomes part of the speaker's cognitive
being at the moment of speech (Merleau-Ponty 1962). It is more than just making
language your own. Inhabitance is all-encompassing, organizing thought and action,
effecting goals and fulfilling presuppositions. The "inhabiting" terminology has the
advantage of alluding both to langue, Saussure's static system revealed
synchronically, and to whatever animates it, the dynamic dimension. Besides
Merleau-Ponty, a historical figure associated with language regarded dynamically is
Vygotsky (1987). On the dynamic dimension, units come and go, emerge and
disperse in real time. It crosses at 90 degrees the abstractions from time and the
unmoving totality of synchronic langue.

Kendon (2008), who also argues against the view.


2. Language origin

How did this dynamic system emerge in human evolution? Even though we are
referring to events that took place hundreds of thousands of years ago, origin theories
can be empirically tested. We ask: does a theory explain how the gesture-speech unity
dialectic evolved? Can it explain gesture-speech synchrony? If it cannot, that is one
kind of falsification. Does it also positively predict that these things could not evolve?
That is another kind. We shall see that a popular theory, the mirror neuron
hypothesis of gesture-first, is such a theory, failing to predict in both senses.



2.1 Gesture-first.

This theory, recurring in many recent books and articles, says that the first steps
toward language phylogenetically were not speech, nor speech with gesture, but
were gestures alone. Vocalizations in non-human primates, the presumed precursors
of speech without gesture's assistance, are too rigid and restricted in their functions
to offer a plausible starting point for language, but primate gestures appear to offer
the desired flexibility. Thus, the argument goes, gesture could have been the
linguistic launching pad (speech evolving later). The gestures in this theory are
regarded as the mimicry of real actions, a kind of pantomime, hence the appeal of
mirror neurons as the mechanism. Current chimps show this kind of action mimicry.
However, gesture-first could not have produced the gesture-speech unity that
we find in ourselves. It predicts what did not evolve (that gesture withered or was
marginalized when speech arose) and does not predict what did evolve (that gesture
is an integral part of speaking). That so many have adopted it I explain by folk (and
fabricated) beliefs about gestures that do not stand up to scrutiny when actual speech
and gesture are examined. The contradiction of gesture-first is that speech first
supplants gesture, it says, yet ends up integrated with it.
Why does gesture-first say gesture must have withered when speech
emerged? The logic of gesture-first, at its very core, means that supplantation of
gesture by speech, overt or hidden, is inescapable. This is why every advocate
automatically posits it. Empirically, there is a perfect correlation between advocating
gesture-first and positing the supplantation step (Table 1).
Moreover, there is a conceptual point that explains it; namely,
supplantation is built into the gesture-first theory. It is important to see that
gesture-first is a theory about the origin of speech (not gesture). Given that aim, it must
logically consider that from gesture one gets to speech; and here supplantation
enters: it is unavoidable logically.
I do not deny that gesture-first may once have existed, but if it did it could
not have led to human language. It would have created pantomime, which does not
synchronize with speech, and then extinguished or branched off into a dead-end. In
fact, in children's language, we see that it possibly did once exist but entered a
dead-end. The earliest stages of language development look much as gesture-first would
expect, but then nothing comes from it; having no effects, it disappears from the
child's development. The imagery-language dialectic at the heart of our language
then emerges separately and later.

Advocate: Henry Sweet (and presumably Henry
Quote: "Gesture helped to develop the power of forming sounds while at the same time
helping to lay the foundation of language proper. When men first expressed the idea of
'teeth', 'eat', 'bite', it was by pointing to their teeth. If the interlocutor's back was
turned, a cry for attention was necessary which would naturally assume the form of the
clearest and most open vowel. A sympathetic lingual gesture would then accompany the
hand gesture which later would be dropped as superfluous so that ADA or more
emphatically ATA would mean 'teeth' or 'tooth' and 'bite' or 'eat', these different
meanings being only gradually differentiated" (pp. 3-4). (Thanks to Bencie Woll for
bringing this passage to my attention.)

Advocate: Rizzolatti and Arbib.
Quote: "Manual gestures progressively lost their importance, whereas, by contrast,
vocalization acquired autonomy, until the relation between gestural and vocal
communication inverted and gesture became purely an accessory factor to sound
communication" (p. 193).

Advocate: Stefanini et al. (referring to Gentilucci and others).
Quote: "the primitive mechanism that might have been used to transfer a primitive arm
gesture communicative system from the arm to the mouth..." (p. 218).

Advocate: Tomasello (2008), thinking in terms of primates and very young (one-year and
less) human infants but with the suggestion that something similar took place in
phylogenesis.
Quote: "Infants' iconic gestures emerge on the heels of their first pointing; they are
quickly replaced by conventional language because both iconic gestures and linguistic
conventions represent symbolic ways of indicating referents" (p. 323).

Table 1. Gesture-first advocates and supplantation of gesture by speech.

2.1.1 Models of supplanting and scaffolding

Contemporary coded gesture systems, such as the Warlpiri sign language or ASL
signs with speech performed by hearing sign-speech bilinguals, do not combine with
speech: either speech and sign mutually repel each other in time, so breaking synchrony,
or they are simultaneous but are not co-expressive. Either way, language-like coded
gestures, semiotically like spoken language, do not form gesture-speech unities. The
principle is that juxtaposing two codes does not form a dialectic of semiotic
opposites; in fact, they are semiotically similar, not opposite, and an
imagery-language dialectic cannot form. This affected evolution: it could not have
progressed from a gesture-first language, and did require something else that
enabled a combination of opposites. Below, I argue that Mead's Loop provided it.
The following models demonstrate inherent gesture-first limitations.
a) Warlpiri sign language. Women use the Warlpiri sign language of
Aboriginal Australia when they are under (apparently quite frequent) speech bans
and also, casually, when speech is not prohibited. When this latter happens, signs and
speech co-occur, letting us see what may have occurred at the hypothetical gesture-
or sign-speech crossover. Here is one example from Kendon (1988):

The spacing is meant to show relative durations, not that signs and speech were
performed with temporal gaps (they were performed continuously). Speech and sign
start out together at the beginning of each phrase but, since signing is slower, they
immediately fall out of step. Each is on a track of its own and they do not unify.
Speech does not slow down to keep pace with gesture, as would be expected if
speech and gesture were unified (mutual speech-gesture slowing is shown with
gesticulations, described in McNeill et al. 2008). They then reset (there is one reset in
the example) and immediately commence to separate again. So, according to this
model, co-expressive speech-gesture synchrony would be systematically interrupted
at the crossover point of gesture and speech codes. Yet synchrony of co-expressive
speech and gesture is what evolved.
b) English-ASL bilinguals. The second model is Emmorey et al.'s (2005)
observation of the pairings of signs and speech by hearing ASL/English bilinguals.
While 94% of such pairings are signs and words translating each other, 6% are not
mutual translations. In the latter, sign and speech collaborate to form sentences, half
in speech, half in sign. For example, a bilingual says, "all of a sudden
[LOOKS-AT-ME]" (from a Sylvester and Tweety narration; capitals signify signs simultaneous
with speech). This could be "scaffolding" but it does not create the combinations of
unlike semiotic modes at co-expressive points that we are looking for. First, signs
and words are of the same semiotic type: segmented, analytic, repeatable, listable,
and so on. Second, there is no global-synthetic component, no built-in merging of
analytic/combinatoric forms with gesture's global synthesis, and the spoken and
gestured elements are not co-expressive but are the different constituents of a
sentence. Of course, ASL/English bilinguals have the ability to form GP-style
cognitive units. But if we imagine a transitional species evolving this ability, the
bilingual ASL-spoken English model suggests that scaffolding did not lead to
GP-style cognition; on the contrary, it implies two analytic/combinatoric codes dividing
the work. If we surmise that an old pantomime/sign system did scaffold speech and
then withered away, this leaves us unable to explain how gesticulation emerged and
became engaged with speech. We conclude that scaffolding, even if it occurred,
would not have led to current-day speech-gesticulation linkages.
Corballis, in his 2002 argument for speech supplanting a gesture-first system
of communication, points out the advantages of speech over gesture. There is the
ability to communicate while manipulating objects and to communicate in the dark.
Less obviously, speech reduces demands on attention since interlocutors do not have
to look at one another (p. 191). While valid, these qualities are not necessary. There
are also positive reasons for gestures not being language-like, and they would be so
even if gesture and speech co-evolved as a single adaptation. All across the world,
languages are spoken/auditory unless there is some interference to the channel
(deafness, acoustic incompatibility, religious practice, etc.), and no culture has a
visual/gestural primary language. Susan Goldin-Meadow, Jenny Singleton and I
(1996) once proposed that gesture is the non-linguistic side of the gesture-speech
dual semiotic because it is better than speech for imagery: gesture has multiple
dimensions on which to vary, while speech has only the one dimension of time.
Given this asymmetry, even if speech and gesture were jointly selected, as proposed,
it would work out that speech is the medium of linguistic segmentation.

2.1.2 Problems with pantomime

A central reason why gesture-first cannot lead to gesture-speech unity is that its
gestures would be pantomimes. It portrays the initial communicative actions as
symbolic replications of actions of self, others and entities, and it was these
replications that scaffolded speech. The process appeals because it clearly taps the
(straight) mirror neuron response. Arbib (2012) gives it a central role. Donald (1991)
likewise posited mimesis as an early stage in the evolution of human intelligence. It
is conceivable that pantomime is something that an apelike brain is capable of and
was already in place in the last common chimp-human ancestor, some 8 million
years back. Contemporary bonobos are capable of it, supporting this idea (Pollick
The problem is not a lack of pantomime precursors but that pantomime
repels speech. The distinguishing mark of pantomime compared to gesticulation is
that the latter is integrated with speech; it is an aspect of speaking itself. In
pantomime this does not occur. There is no co-construction with speech and no
co-expressiveness; timing is different (if there is speech at all), and there are no dual
semiotic modes. Pantomime, if it relates to speaking at all, does so, as Susan Duncan
points out, as a "gap filler," appearing where speech does not, for example completing a
sentence ("the parents were OK but the kids were [pantomime of knocking things
over]"). Movement by itself offers no clue to whether a gesture is gesticulation or
pantomime; what matters is whether two modes of semiosis combine to co-express
one idea unit simultaneously. Pantomime does not; it does not have this dual
semiosis.

2.1.3 Two current gesture-first theories

Table 1 is a roster of gesture-first advocates, going back to Henry Sweet (said to be
Shaw's model for Pygmalion). Note how all at some point say that speech supplants
the original gesture language, which is then marginalized. This
logical position is inescapable, and is the undoing of every gesture-first theory.
Gesture-speech unity is permanently blocked. It is the undoing as well of two recent
gesture-first theories.

The mirror system hypothesis

Michael Arbib (2005, 2012), in his gesture-first theory, envisions an "expanding
spiral" of increasingly sophisticated "protosign" and "protospeech," a spiral moving
from gesture-first to speech with pantomime a central bridge. He writes (2012, p.
229), "the path to language went through protosign, rather than building speech
directly from vocalizations. It shows how praxic hand movements could have
evolved from the communicative gestures of apes and then, along the hominid line,
via pantomime to protosign." The Warlpiri sign/speech and bilingual ASL/English
situations above, however, suggest that pantomime and any kind of sign language
alone could never have evolved into spoken language. The spiral model's gradual
change may seem unlike the crossovers modeled above, but they still apply. With
each turn the two codes meet, and this is the mechanism of the spiral. A pantomime
(or sign) spins off a bit more of itself into speech; speech then becomes to this extent
another code. Far from shaping gesture or being shaped by it, as a code speech repels
coded gesture and/or divides the labor between itself and its former gesture master;
these are the demonstrable effects of two codes pitted against each other in the sign
language models above and they still exist in the expanding spiral.
The spiral however might work by exchange: as gesture and speech spiral, bits
of co-expressive gesture and speech trade places; a gesture (a bit of sign language up
till now) gives structure to a bit of speech (a vocal gesture till now) and the speech
(gesture till now) also gives a bit of global-synthetic semiosis to gesture; so they
remain united. [3] But note what this does. It supplants gesture by speech, yes; but it
also supplants speech by gesture. Each bit of gesture, formerly a bit of language, ceases
to be language and becomes a bit of global-synthetic gesture. The global-synthetic
gesture bits accumulate. The outcome is that gesture, unless the whole phylogenetic
exchange for some reason reverses, cannot be a language again. And the existence of
deaf sign languages, including home signs, invented by children themselves
(Goldin-Meadow 2003), shows the falsity of this prediction.

[3] Arbib writes for example (2012, p. 231), "Protospeech builds on protosign in an
expanding spiral: Neural mechanisms that evolved for the control of protosign production
came to also control the vocal apparatus with increasing flexibility, yielding protospeech
as, initially, an adjunct to protosign" (emphasis added), which seems close to the
exchange model.

Mead's Loop does not use gesture-speech exchanges. Gesture and speech
evolved together, equiprimordially, the Loop bringing one's own gestures into the
orchestration of speech, with the effect of gesture-speech unity, not exchange.
Moreover, when speech is blocked, sign languages are natural outcomes of a
gesture-speech unity. Unity provides the modes equal potentials for codification.
Sign languages arise as naturally as speech (language is everywhere spoken unless
there is blockage, because gesture is better than speech for imagery). In keeping with
this unity (now, gesture-sign unity), signs should also have their own spontaneous
gesticulations. Susan Duncan (2005) observed co-expressive gestures in Taiwanese
Sign Language. The gestures were manual: iconic distortions of canonical sign forms,
and accordingly were perfectly synchronous with the signs. The distortions
incorporated the context and differentiated "interiority" in Canary Row narrations
when the signer had just before described Sylvester's climbing the pipe on the
outside, exactly as gestures by hearing speakers did in the natural experiment and
Fig. 1.
So a spiral either produces mutually repellent gestures and speech, contrary
to fact, or it makes sign languages impossible, also contrary to fact, and the only way
to avoid the supplantation trap is for gesture and speech to have been
equiprimordial, to have evolved together, and this is the process of Mead's Loop.

The recursion hypothesis

Michael Corballis (2011) likewise continues to advocate gesture-first in a new work,
which takes as its central theme a posited universal, recursion, the embedding of like
things into like. This ability, Corballis proposes, is part of the psychological
foundation of human culture, providing the creative potential for such diverse
activities as reconstructing past episodes or imagining future ones, telling stories,
creating music or art, and manufacturing edifices and complicated machines (p.
181), but the gesture-first creature would not have been capable of it for language.
This is because recursion is not only the embedding of like into like in speech but has
a dual semiosis: it enters into gesture-speech unities. Gesture-first, trapped in its
own logic, does not achieve this.
Gesture-first would have driven recursion into out-of-synchrony gesture and
speech or divided it so that speech but not gesture (or vice versa), and not both, had
it. But we, surviving the extinction of gesture-first, evolved Mead's Loop and the
possibility of recursion in a gesture-speech unity, wherein gesture imagery includes
recursion and orchestrates speech with it as well (as in this example: "you can't tell,"
the containing clause stating an ambiguity in the cartoon, while, simultaneously,
the left hand moves into a new space with the value: Ambiguity; and speech
continues, "if the bowling ball is under Sylvester or inside of him," two embedded
clauses giving the poles of the ambiguity, and simultaneously the left hand, still in
the Ambiguity space, moving inward with "under" and outward with "inside of
him"; Figure 4).

2.1.4 End of gesture-first

The upshot is that gesture-first has little light to shed on the origin of language as we
know it; at best it explains the evolution of pantomime as a stage of phylogenesis
that, if it once occurred, went extinct as a code and landed at a different point on the
continuum of gestures. Instead, I propose the somewhat out-of-the-box theory of
Mead's Loop. The next section shows how Mead's Loop created a dynamic system of


2.2 Mead's Loop.

Instead of gesture-first, I propose that speech and gesture were what Liesbet
Quaeghebeur has called "equiprimordial." Gesture and speech had to be naturally
selected together. Neither gesture-first nor speech-first could have led to language. I
propose instead a mechanism for its selection that I call, after the early 20th Century
philosopher George Herbert Mead, "Mead's Loop."
Mead's Loop is a hypothesis of what emerged some one-half to one million
years ago in the evolution of the human brain. There began to evolve, in the brain, a
thought-language-hand link, localized at least in part in the area now called Broca's
Area (other brain areas also must have been involved: the right hemisphere for
imagery and metaphoricity, Wernicke's area for categorized semantics, the prefrontal
cortex for fields of equivalents and psychological predicates, and Broca's area for
constructions and unpacking; all are "language areas" in the brain).
This link was a new kind of mirror neuron, a "twisted" mirror neuron.
Mirror neurons have been directly recorded in monkeys and reside supposedly in all
primate brains, including ours. To quote a Wikipedia article, "[a] mirror neuron is a
neuron that fires both when an animal acts and when the animal observes the same
action performed by another." I call these mirror neurons "straight," to distinguish
them from the Mead's Loop "twist." Note what they provide. The significance of
the mirror neuron response is that of the action it mimics. The action of another is
repeated and becomes as one's own. If the mirror neuron circuit is used to produce a
gesture it will be a mimicked action, a pantomime. However, pantomime repels
speech. Its timing vis-à-vis speech is loose; often there is no speech at all, and if there
is speech the speech and the pantomime do not combine into a gesture-speech unit.
These empirical limits are inevitable if pantomime is the product of an earlier
gesture-first language which evolution sidelined when Mead's Loop began.
Here is the twist: G. H. Mead said that a gesture is meaningful when it
evokes the same response in the one making it as it evokes in the one receiving it:

The gesture which indicates the object indicates the object both to the other and
to the individual himself. In so far as this takes place the gesture is called
significant. The meaning of significance is that the individual, in indicating the
object or a part of the object to another and by this same process to one's self,
takes the role of the other and the indication of the object involves one's tending to act
upon this gesture, as the other. The gesture is said then to signify the action, and
comes therefore to stand for this character of the object. It stands for the reactions
to the object, or for its meaning. [4]

For evolution, this suggests a kind of "twist," in which mirror neurons came
to respond to one's own gestures as if from another. They thereby brought one's own
gesture imagery and its significance into the motor areas for orchestrating speech.
Gesture became the dynamic unit, not syntax (which arises as a means to stabilize the
dialectic of imagery and form). With metaphoricity, the expansion of meaning is
unlimited. Twisted mirror neurons also built into gesture a social orientation. While
straight mirror neurons reproduce the actions of another, with meanings that are
those of the other's actions, the Mead's Loop twist responds to one's own gestures
as if from another, and brings into the action-orchestration areas of the brain
different meanings, those of the gestures.
But was the twist needed? It was, because the gesture, although emanating
from the same brain area as speech, does not unite with it without self-response.
This gesture would be incomplete. It is neither synchronous nor co-expressive with
speech. Gesture-speech unity happens only when the gesture gets a self-response via
Mead's Loop. Then it becomes able to orchestrate speech movements. It also explains
the gestalt, a percept, during action. This was Mead's insight: gesture (and speech)
have fundamentally a social character and, to be meaningful, must have a
social/public presence. With Mead's Loop, this occurs when the one making and the
one receiving are the same. This is the twist; then the gesture is a meaningful and
socially pertinent event, with the potential to connect to everything else in language
dynamically, and specifically to orchestrate vocal and manual movements.
How does this produce gesture-speech unity? It is built in from the start.
Again, this was Mead's insight; only with a self-response is the gesture meaningful;
it is then a social/public response even though it is one's own (not sequentially, but
self-response is an essential aspect of the gesture's meaning, the meaning under
which speech and gesture combine). A self-response makes the gesture the basis of
the orchestration of the actions of speech and gesture together. Straight mirror
neurons do not respond at all, because there is no external action. [5] The Mead's Loop
twist, bringing the gesture's meaning into the orchestration process, merges and
synchronizes speech with gesture at points where they co-express the same idea.
Hence the unity: it is built in. In all of this the gesture is fundamental. Because
Mead's Loop gave it the power to orchestrate speech, Mead's Loop was the beginning
of everything in language.
Mead defined significance like this because he wanted to explain how
gestures are inherently social: they are significant when the person making them
responds to them the same way as the recipient. We've taken this insight and made
it into a brain mechanism involving mirror neurons. Mead of course didn't think of
the self-response in this way. All he meant was that, for significance, the person
making it and the person receiving it must respond the same. He was assuming a
dyad. But in the passage quoted he imagined the self-response aspect of the Loop:
the gesturer "takes the role of the other and the indication of the object involves one's
tending to act upon this gesture, as the other." Thus the dyad response also becomes an
individual property, a self-response.
Another way to see Mead's Loop is to envision a gesture having significance
only if it is public: not made for social recognition, but public in its inner essentials.
What does this mean? You need not be conscious of it, but when you gesture you are
automatically in a public mode. I think this property explains how Mead's Loop was
naturally selected. It was adaptive where this sense of being in a public mode was
important, such as in instructing children.

4 Emphasis added.
5 Which suggests a way to detect Mead's Loop activity via brain imagery.

By treating imagery as a social stimulus, Mead's Loop also explains why
gestures occur preferentially in a social context of some kind (face-to-face, on the
phone, but not alone talking to a tape recorder).
The psychological predicate, the differentiation of the newsworthy in the
immediate context, was inherent to Mead's Loop by virtue of how it brought gesture
in as a speech-orchestrating force. This inseparability from context resides in Mead's
Loop's self-response to gestures. A gesture of the gesticulation kind (synchronous
with its co-expressive speech) is not portable from context to context. Over Mead's
Loop it orchestrates vocal movements under the gesture's significance. The result is
inherently context bound. From this comes an explanation of one meaning as two
things: the thing differentiated, which is inseparable from the thing being
differentiated, the field of meaningful equivalents. Moreover, given the social
reference of Mead's Loop, the contexts that gesture differentiates mesh with ongoing
social interactions. So what is newsworthy is meaningful as a social framework.
Then speech passes from display (of which chimps and young children are
capable) to communicating messages to the other (from Werner & Kaplan 1963,
which Mead's Loop produces).
Finally, it is clear that Mead's Loop could not have started from a
hypothetical gesture-first language. This is because Mead's Loop required
gesture-speech equiprimordiality. Gesture-first (or speech-first) blocked it. Even if a
gesture-first language existed, it could not have evolved into Mead's Loop. Extinction or
sidelining was the only path. Mead's Loop drew on other pre-linguistic gestures,
perhaps resembling those of contemporary chimps or bonobos, but not a sign
language or pantomime code.

2.2.1 How Mead's Loop engendered Growth Point properties.

A growth point or GP is a cognitive package that combines semiotically opposite
linguistic categorial and imagistic components (McNeill & Duncan 2000). Table 2
summarizes the semiotic oppositions inside a GP.
Imagery side | Language side
Global: meanings of parts dependent on meaning of whole. | Compositional: meaning of whole dependent on meaning of parts.
Synthetic: distinguishable meanings in single image. | Analytic: distinguishable meanings separated.
Additive: no new syntagmatic values when images combine. | Combinatoric: new syntagmatic values when parts combine.
Table 2. Semiotic Oppositions Within a GP
The GP becomes the minimal unit of the dynamic dimension itself. It is called a
growth point because it is meant to be the initial pulse of
thinking-for-(and while)-speaking, out of which a dynamic process of organization
emerges. The linguistic component categorizes the visuo-actional imagery component.6
The linguistic component is important since, by categorizing the imagery, it brings
the gesture into the system of language. Imagery is equally important, since it grounds
sequential linguistic categories in an instantaneous visuospatial frame. It provides
the GP with the property of chunking, a hallmark of expert performance (cf. Chase &
Ericsson 1981), whereby a chunk of linguistic output is organized around the
presentation of an image. Synchronized speech and gesture are the key to this
theoretical GP unit. It means that the same idea is realized in two opposite semiotic
modes, and the GP as a gesture-speech unity combines them in a dialectic. This model
is intrinsically dynamic, as semiotic opposition is unstable and seeks resolution. A
grammatical construction is this resolution, and one that is able to unpack the GP
dialectic without distortion is the spoken result.

6 For this reason questions like, how can the image of a sunset be a component of speaking?, are
irrelevant. There is no actional component, and possibly not a global-synthetic semiotic
component either.

Mead's Loop led to the GP because it had both semiotic and motor effects:

• Semiotically, it brought the gesture's meaning into the mirror neuron area.
Mirror neurons no longer were confined to the semiosis of actions. One's own
gestures (such as rising hollowness) entered, as if it were liberating action from
action and opening it to imagery in gesture. Extended by metaphoricity, the
significance of imagery is unlimited. So from this one change, the meaning
potential of language moved away from only action and expanded vastly.
• At the motor level, in Brodmann's areas 44 and 45 (i.e. Broca's Area), the areas of
the brain where speech movements are orchestrated, Mead's Loop enabled
significant imagery gesture to chunk vocal motor control, the foundation of
the GP.

How does this produce gesture-speech unity? It is built in from the start. Twisted
mirror neurons complete Mead's Loop.
Our hypothesis is that part of human evolution was that mirror neurons
participated in one's own gesture imagery. This hooks into Mead's Loop: one's own
gestures activate the part of the brain that responds to intentional actions, including
gestures, by someone else, and thus treats one's own gesture as a social stimulus.
The evolutionary step was self-response by mirror neurons. This brought
meanings into the 44/45 areas, where meanings other than instrumental
action could orchestrate sequences of action.
This was Mead's insight: only with a self-response is the gesture meaningful; it is
then a social/public response even though it is one's own (not sequentially, but
self-response is an essential aspect of the gesture's meaning, the meaning under which
speech and gesture combine).
This is why the gesture, coming from the same brain area as speech, does not
merge with it without the Mead's Loop self-response; it is not meaningful without it.
Only a self-response makes the gesture the basis of the orchestration of the actions of
speech and gesture together. Straight mirror neurons would not respond because
there is no action by another. The Mead's Loop twist, because it brings the gesture's
meaning into the orchestration process, merges and synchronizes speech with
gesture at points where they co-express the same idea; hence gesture-speech unity.
In all of this the gesture is fundamental. Because Mead's Loop gave gestures the
power to orchestrate speech, Mead's Loop was the beginning of everything in
language.
Mead's Loop treats imagery as a social stimulus and explains why gestures occur
preferentially in a social context of some kind (face-to-face, the phone, but not a tape
recorder). Mirror neurons complete Mead's Loop in a part of the brain where action
sequences are organized: two kinds of sequential actions, speech and gesture,
converging, and with meaning as an integral component. Co-opting sequential
actions by a socially referenced stimulus (imagery) makes a new kind of action.
I conclude with a listing of the chief properties of the dynamic dimension and
GP, and how Mead's Loop engendered them. For the GP itself, I refer to McNeill
(2012). Using only general primate cognition, this ensemble of dynamic dimension
properties would not have come together. It goes without saying that other factors
could have influenced the dual semiotic, social reference and psychological predicate
properties, but all would seem to have their roots in the semiotic and motor effects of
Mead's Loop.
Inhabitance. Gesture, the instantaneous, global, nonconventional component, is
not an external accompaniment of speech, which is the sequential, analytic,
combinatoric component; it is not a representation of meaning, but instead
meaning inhabits it:

The link between the word and its living meaning is not an external
accompaniment to intellectual processes, the meaning inhabits the word, and
language is not an external accompaniment to intellectual processes.7 We are
therefore led to recognize a gestural or existential significance to speech.
Language certainly has inner content, but this is not self-subsistent and
self-conscious thought. What then does language express, if it does not express
thoughts? It presents or rather it is the subject's taking up of a position in the
world of his meanings (p. 193; emphasis in the original).8

The GP is a mechanism geared to this existential content of speech, this
taking up a position in the world. Gesture, orchestrating speech, is inhabited by
the same living meaning that inhabits the word (and beyond, the discourse). The
positive correlation, as widely observed, of speech and gesture complexity thus
involves greater inhabitance (if this concept can be gradual, half in, half out); it
involves more of the body, in movements more coordinated by the significance being
differentiated from context. It is the thought-language-hand link that Mead's Loop
created, simultaneously affecting both motor domains.
The dual semiotic and dialectic. The new form of mirror neuron response in
Mead's Loop merged vocal movements and gesture, and synchronized them at
points where they were co-expressive of an underlying idea unit, laying the ground
for the imagery-language dialectic and the dynamic dimension of language. Once
codified linguistic forms evolved (and this presumably would have been along with
GPs), a dialectic would be the immediate response. Without Mead's Loop, gesture
and speech would have had only the loose connection seen with pantomime, and
language could not have escaped the single-semiosis box.
The social reference. The social orientation of mirror neurons with the Mead's
Loop twist gave gestures and GPs a social reference character. Without Mead's
Loop, gestures could have social reference only if directed at an interlocutor. But
speech-unified gesticulations are not necessarily aimed at someone else (indeed, in
contrast to emblems, points or pantomimes, they rarely are), yet gestures and their
GPs are social (public) entities. Also, crucially, a foundation in Mead's Loop
opened a route over which the social conventions of speech and thought could form,
GPs absorbing social-interactive content. An important effect of the inherent sociality
of GPs due to Mead's Loop arose in the origin of syntax, namely, shareability
(Freyd 1983).
Origin of psychological predicates. The psychological predicate, the
differentiation of what the speaker deems newsworthy in the immediate context, was
inherent to Mead's Loop by virtue of how it brought gesture in as a
speech-orchestrating force. The inseparability of the psychological predicate from context
resides in Mead's Loop's self-response. A gesture of the gesticulation kind (in
contrast to sign language signs) is not portable from context to context. Over Mead's
Loop it orchestrates vocal movements under the gesture's significance. The result is
inherently context bound, as a matter of how it is formed. Moreover, given the social
reference of Mead's Loop, the contexts that gesture differentiates mesh with ongoing
social interactions. So what is newsworthy is meaningful in a social framework.
Eventually, the clever Mead's Loop creature was able to shape contexts to fit
intended differentiations. This was an elaboration of the psychological predicate
functionality of Mead's Loop and may or may not have been part of it from the
beginning (it seems possible that the ability to shape context to fit the intended
meaning is linked to another ability, which also had to emerge, to use metapragmatic
indicators to orchestrate unpackings for goals and intentions; together, they
promote utterances that will be true to goals and intentions).

7 Merleau-Ponty's quotation is from Gelb and Goldstein (1925, p. 158).
8 I am indebted to Jan Arnold for this quotation.

Origin of catchments. Catchments are threads of consistent imagery attached to
themes. They also arose with Mead's Loop as a matter of course. Mead's Loop binds
imagery with speech and brings the meaning of the image into the (twisted) mirror
neuron circuit. Although not shown here, each time the "it down" speaker regarded
the bowling ball as an antagonistic force, its iconic imagery returned along with this
theme. There were a half dozen such occurrences. Theme and image were bound
together throughout. The theme is bound to the image because mirror neurons,
echoing the gesture, also echo its significance as an antagonistic force plus whatever
other iconicity it had. These iconic details varied, but the recurring antagonistic forces
theme was a constant.
Metapragmatic indicators. Mead's Loop engendered gesture-speech unities, as
we have seen, but it also had a role to play in what Michael Silverstein (2003) has
pioneered, the collection of metapragmatic scaffolding that encompasses the GP
dialectic. The essential feature of metapragmatics, awareness and control of the
pragmatic effects of one's utterance, stemmed from the social impetus that Mead's
Loop provided. Without its sense of one's own actions as social and public, there
could be no metapragmatic indicators.
The equiprimordiality of speech and gesture. To select Mead's Loop, speech
and gesture had to evolve together. One could not have come first and the other
later. This basic difference from gesture-first is perhaps the one step, if we try to
single out one, that sets human linguistic evolution apart. Some avian species (crows,
ravens) have evolved surprisingly elaborate vocal and gestural repertoires but have
not taken the step that led to language, evolving a unit that is both sound production
and gesture integrally. The entire Mead's Loop process involves one's own gesture
impinging on the same area of the brain where vocal actions are orchestrated, so
both are necessary.
Origin of unpacking. The GP is the core idea at the moment of speaking,
differentiating a context; unpacking cradles it and intuitions of well-formedness
comprise the stop-order. Some constructions unpack their GPs and nothing more,
others add further meanings such as caused-motion, but all arise from the GP. GP
and unpacking are not necessarily sequential. In fact, simultaneous activation
occurred with "it down", but even if simultaneous they are functionally distinct,
with unpacking dependent on the GP. This is the key to its origin as well. By
combining imagery with codified form, a self-created selection pressure for the static
dimension also arose.
But not theory of mind. Mead's Loop, however, is not theory of mind. In a sense,
they are opposites. The Mead's Loop adaptation brings self-awareness of one's own
behavior as social, not a theory of the cognitions and intentions of another (a theory
of mind could evolve from straight mirror neurons in any case, which further limits
its role in the origin of language). Both theory of mind and Mead's Loop depend on
a more fundamental faculty, self-aware agency, a topic the late Susan Hurley
explored in depth. Awareness of self as agent seems to develop in the child around
age 4, and both Mead's Loop and theory of mind show themselves then as well
(children's language is discussed in the fifth part of this series). The origin of Mead's
Loop depended on this sense. It is only through awareness of one's own agency that
gestures can be responded to as if from another.
Summary. Gesture-speech unity is felt across a spectrum of dynamic effects:
prosody (the peak corresponding to the GP), the formation and differentiation of
contexts, and the energizing of communicative dynamism and, with it, the amount of
coding material in both gesture and speech, in motion together in a positive
relationship. All of this sorts out the two modes of consciousness of sentences.

2.2.2. Answering puzzles


I have discussed Mead's Loop with well-informed linguists and have, I believe,
identified, with their help, sticking points for understanding Mead's Loop and the
arguments in support of it. Most insidious is a seemingly irresistible attraction for
many readers of a linear, cause-line style of thinking that gets in the way of Mead's
Loop and the GP, which require thinking that is non-linear, simultaneous and all at
once (cf. Quaeghebeur 2012). The following exchange clarifies this obstacle.

Comments by Renia Lopez-Ozieblo

Responses by David McNeill

Comment: Self-response and gesture-speech unit at the same time: I can, at a push, follow the
idea of the self response and the unit formation all happening together. But if for
Mead's Loop to originate I need a gesture, but this gesture is not the same as the GP

Response: This is the key: you must not think sequentially. Think instead of Mead's
Loop all at once.
And how? There is just one gesture. The gesture has reality on two
levels; think of a gesture as occurring and, at the same time, as
self-responding; the trick is to avoid sequential thinking. The word "response"
sounds sequential, but think of it not as a stimulus-response sequence but,
all at once, as a gesture that has self-response as an integral aspect. This is
no different than self-awareness of any action. If you act and are aware of
your agency, it is not that you act, then become aware that you were the
one who did it; self-awareness is part of the act from its inception, and this
reality makes it qualitatively different from unaware action. It's thus not
that you gesture and then respond; the self-response is part of the gesture

Comment: Why would the gesture come by itself? And wouldn't this be fuel for the gesture first

Response: It's not a separate gesture. It is a level or residue in the new gesture that
Mead's Loop made possible. A primate gesture in any case is not
necessarily a gesture-first gesture but is better considered to be a precursor
to Mead's Loop, not a gesture facing extinction. The figure (from Amy
Pollick) shows a precursor type gesture; in this case, an iconic-deictic
gesture by one bonobo to show another bonobo where to go:
Why is this not Mead's Loop? While not impossible, in incipient form,
Mead's Loop means the left bonobo also would have had a self-response to
her gesture, so that it was more than an expression of her wish to get the
other bonobo to move forward, and also had significance for her as a
public act.

Comment: I think this is solved if instead of thinking of gesture we can think of the thesis
imagery. But was this what you meant?

Response: Yes, but a gesture is an image, one that is materialized by the body itself.
Image and gesture are not separate entities. This is true even if an overt
gesture is lacking. A gesture is still present as imagery, at the low-material
end of a continuum with a full gesture at the high end. Imagery is
gesture's most constant aspect; the outward movement is its material
embodiment, but since less newsworthiness summons less material, the
absence of gesture is just the endpoint of this continuum. The dual
semiosis of imagery and form still is present.

Comment: What is Mead's Loop adding?

Response: Without it, the gesture is a single-level pantomime or point, and as such is
repelled by or otherwise unable to unify with speech. With it, the gesture
becomes two levels, and gains control over speech orchestration. (Try the
bonobo gesture yourself, in the same imagined situation: "that way" or
some equivalent speech is automatic. This would be Mead's Loop coming
into play.)

Comment: De Ruiter in his paper "Gesture and Speech Production" (1998), p. 60, says: It seems
therefore necessary to assume that the gesture planner, after having constructed a
motor program for the gesture, sends a "resume" signal to the conceptualizer. Upon
receiving this gesture, the conceptualizer can send the preverbal message to the
formulator. Is this not Mead's Loop?

Response: It is a loop but it lacks two essentials of Mead's Loop. First, there is no
social reference, which for Mead was crucial, and was the reason Mead's
Loop had an adaptive advantage in evolution. This loop has no biological
reason. Second, it does not capture gesture-speech unity. The gesture is a
signal to the conceptualizer but it has no orchestration power. It is a
"resume" signal. Accordingly, it produces a delay of speech, not an
orchestration of it. It tells the conceptualizer to let go, but the unit of
speech it releases is the unit the conceptualizer has already created, and
the gesture does not orchestrate it. This and other Speaking-inspired
models (such as Kita & Özyürek's 2003) face the empirical fact of synchrony
and include it, but for them it is arbitrary and external, not inherent, as in
the GP.


Last word on the origin

Pulling the threads together, we have a new idea of how language began: the
scenario is the emergence of human family life. How Mead's Loop works in outline:
• Gestures emanating from Broca's Area are mirrored by it.
• The gesture acquires significance according to Mead by being responded to in
the same way it is responded to by others.
• It acquires the sense of being social/public.
• It is now ready to orchestrate actions of the vocal tract (or hands in the case of
a sign language).
• Thus speech (or signing) becomes a dual semiotic: imagery orchestrating
symbols that are themselves codified by sociocultural conventions.
The family, and child-rearing in particular, are environments where the
social/public value of one's own gestures is adaptive, and there Mead's Loop could
have been naturally selected (no doubt it was adaptive in other contexts as well).
Archeologists date the dawn of family life (with cooking hearths, for example) to
about one million years ago, with a stable family membership and a division of
labor. So it was back then that the natural selection of Mead's Loop possibly also
Mead's Loop also gave the speaker the sense that gestures have a public,
social significance. Whereas straight mirror neurons take a socially received action
and make it into a personal property, Mead's Loop makes one's own gesture into a
socially referenced action. For the speaker it gains public significance.

9 A modification of the de Ruiter model, namely, that the gesture, when sent back, is not a "resume"
signal but has the power to reorchestrate the already formulated speech. This requires new
connections that further undermine the modularity of Speaking and hardly seems economical.
However, a loop within a loop like this is not impossible.

This social reference was a vital innovation. It gave the adult, typically a
mother inculcating cultural norms in infants, the sense of being an instructor as
opposed to being just a doer with an onlooker (the chimpanzee way). Entire cultural
practices of human childrearing depend upon this sense (Tomasello 1999). To do this
the adult must be sensitive to her own gestures as social/public actions. Sensing
actions as social would impact the next generation of children who, as a result of it,
do better at coping.
Given the social references of Mead's Loop, natural selection for it would
arise in situations where sensing one's own actions as social and public is
advantageous: for example, as suggested here, imparting information to infants,
where it gives the adult, typically a mother, the sense of being an instructor. The
focus of this selection would be adults; in this scenario language evolved in women.
Their infants, both female and male, benefit from cultural inculcation and inherit any
genetic dispositions. Sarah Hrdy (2009) has highlighted the group rearing of infants,
including the infants of other adults, as an aspect of early human family life. The
ability to engage in collective infant rearing clearly demands (and naturally selects)
seeing one's own actions as social. Such practices stand in sharp contrast to
chimpanzee infant rearing, where infants, if left without their mothers, are neglected
and vulnerable to attacks by adults in the same group.
On this view it was women, mothers, in whom language began. We should
not be surprised, accordingly, that baby talk, the speech register that adults
instinctively adopt speaking to infants, appears universally.
Mead's Loop twisted mirror neurons, in other words, are symbolic in their
initiation, control and effects (giving material carriers to meanings and multimodal
embodiments to cognitive being). Their natural selection could have had nothing
to do with straight mirror neurons.
Also, the downstream links of Mead's Loop and straight mirror neurons
differ: with twisted neurons, the thought-language-hand link; with straight
ones, actions manipulating the world. This difference reflects separate evolutions as
Whether you are persuaded by these arguments depends, ultimately, on
taking seriously gesture-speech unity, that gesture and speech comprise a single
multimodal system, and that gesture is not an accompaniment, ornament,
supplement or add-on to speech but is actually part of it. Gesture-first does not
predict this language-gesture integration. When we look at models of
speech-gesture crossovers of the kind that, in theory, gesture-first would have encountered
when speech supplanted an original gesture language, we do not find conditions for
gesture-speech unity, but instead non-co-expressiveness or mutual speech-gesture
Joining the damage is Woll's (2005/2006) argument that not only does
gesture-first leave gestures unable to integrate with speech but it also blocks, within
speech itself, the arbitrary pairing of signifiers with signifieds that is characteristic of
(or, Saussure says, defining of) a linguistic code.

FIGURE 6. Two versions of gesture-first extinction and Mead's Loop origin.

Version A: The branch leading to modern humans and Neanderthals and away from the newly
discovered Denisova hominin was the emergence of Mead's Loop. Denisova was the continuation
of the gesture-first creature, and went extinct some 30,000 to 40,000 years ago, the date of the
fossil finger bone from which its mtDNA was recovered by Krause et al. 2010. In this version,
Neanderthals and modern humans share Mead's Loop but differ in other brain areas (the
temporal lobes and possibly the prefrontal area) and ontogenesis (Neanderthals, according to
Rozzi and de Castro 2004, developing markedly faster, leaving less time for GPs to emerge
before any critical period shutdown).
Version B: Mead's Loop arose instead at the 466 kya split between modern humans and
Neanderthals, and in part defines this separation of lines. In this version, neither of the extinct
hominin species evolved Mead's Loop, which may have been a factor in their extinctions. In an
alternate version, the split at the 1 mya point separated the common ancestor of the
Neanderthal and the Denisova hominin from that of modern humans. The Neanderthal and
Denisova ancestors later left Africa while the ancestors of modern humans remained behind
(possibly in the southwest, Atkinson 2011), evolving Mead's Loop and carrying it out of Africa
in their turn. Diagram based on Brown (2010); Mead's Loop annotations added.



Finally, when did Mead's Loop evolve? The phrase, the dawn of language,
suggests that language burst forth at some definite point, say 150~200 kya (thousand
years ago), when the prefrontal expansion of the human brain was complete. But the
origin of language has elements that began long before: 5 mya (million years ago)
for bipedalism, on which things gestural depend. I think 1 mya, based on humanlike
family life dated to then, for starting the expansion of forebrain and the selection of
self-responsiveness of mirror neurons and the resulting reconfiguration of Areas
44/45. I imagine this form of living was itself the product of changes in reproduction
patterns, female fertility cycles, child rearing, neoteny, all of which must have been
emerging over long periods before.
So this says that language as we know it emerged over 1 to 2 million years
and that not much has changed since the [...] landmark of reconfiguring
Broca's area with the mirror neurons/Mead's Loop circuit (although this date could
overlook continuing evolution: there are hints that the brain has changed since the
dawn of agriculture and urban living; see Evans et al. 2005).
The Mead's Loop model doesn't say what might have been a protolanguage
before 2 mya, Lucy and all. It would have been something an apelike brain is
capable of. There are many proposals about this; Kendon, for example, proposed
that signs emerged out of ritualized incipient actions (or incomplete actions). Natural
gesture signals in modern apes have an incipient quality as well, the characteristic of
which is that an action is cut short and the resulting action-stub becomes a signifier.
The slow-to-emerge precursor from 5 mya to 2 mya may have built up a
gesture language from instrumental actions, a gesture-first type language. It would
have been an evolution track leading to pantomime.
But the human brain evolved a new system in which gesture fused with
Mead's Loop also does not say where language evolved (Atkinson 2011
points to the southwestern corner of Africa), but it does predict that wherever it
was, the languages there now would tend to be of the isolating type (and this appears
to be the case in SW Africa; see 6,4.5). In any case the origin point would have been
an area where human family life also was emerging.
A proposed time line for the origin of Mead's Loop is as follows:
a. To pick a date, the evolution of a thought-language-hand link started
5 mya with the emergence of habitual bipedalism in Australopithecus. This
freed the hands for manipulative work and gesture, but it would have been
only the beginning. Even earlier there were preadaptations such as an ability
to combine vocal and manual gestures (Hopkins & Cantero 2003), to perform
rapid sequences of meaningful hand movements, and the sorts of
iconic/pantomimic gestures we see in bonobos (Pollick 2006), but not yet an
ability to orchestrate movements of the vocal tract by gestures.
b. The period from 5 to 3~2 mya (Lucy and the long reign of
Australopithecus) would have seen the emergence of various precursors of
language, such as the protolanguage Bickerton attributes to apes, very young
children and aphasics; also, ritualized incipient actions becoming signs as
described by Kendon.
c. At some point after the 3~2 mya advent of H. habilis and later H.
erectus, there commenced the crucial selection of self-responsive mirror
neurons and the reconfiguring of areas 44 and 45, with a growing co-opting
of actions by language to form speech-integrated gestures, this emergence
being grounded in the appearance of a humanlike family life with a host of
other factors shaping the change (including cultural innovations like the
domestication of fire and cooking). The timing of this stage is not clear but
recent archeological findings strongly suggest that hominids had control of
fire, had hearths, and cooked 800 kya (Goren-Inbar et al. 2004).
d. Thus, the family as the scenario for evolving the thought-language-hand
link we see in the IW case seems plausible, commencing no more
recently than 800 kya.
Another factor would have been the physical immaturity of human infants at
birth and the resulting prolonged period of dependency giving time for cultural
exposure and GPs to emerge, an essential delay pegged to the emergence of selfaware agency (Neanderthals, in contrast, may have had a short period of
development, Rozzi & de Castro 2004).
Along with this sociocultural revolution was the expansion of the forebrain
from 2 mya, and a reconfiguring of areas 44 and 45, including Mead's Loop, into what
we now call Broca's area. This development was an exclusively human phenomenon
and was completed with H. sapiens about 200-100 kya. If a dawn occurred, it was
At least two other human species have existed, Neanderthals and the recently
discovered Denisova hominin, and each may have had a gesture-only form of
communication. Our species developed Mead's Loop. These other humans went
extinct, one factor in which could have been confinement to pantomime and a
consequent inability to reach the new form of language.
Figure 5 shows two timelines differing in when Mead's Loop evolved.
According to the timeline, language with dual semiosis came into being over the last
1 or 2 million years. Considering protolanguage and then language itself, the
timeline seems to span over five million years (far from a big bang). Meaning-controlled
manual and vocal gestures that combine under imagery emerged over the last two
million years. The evolution of self-aware agency would be one critical step
unleashing Mead's Loop and may not have emerged until 800 kya or more recently.
The entire process may have been completed not more than 100 kya, a mere 5,000
human generations, if it is not continuing.

Arbib, M. A. 2005. From monkey-like action recognition to human language: An
evolutionary framework for neurolinguistics. Behavioral and Brain Sciences,
Arbib, Michael. 2012. How the brain got language: The mirror system hypothesis. Oxford:
Oxford University Press.
Atkinson, Quentin D. 2011. Phonemic diversity supports a serial founder effect
model of language expansion from Africa. Science 332: 346-349.
Bickerton, Derek. 1990. Language and Species. University of Chicago Press.


Brown, Terrence A. 2010. Stranger from Siberia. Nature 464: 838

Corballis, Michael C. 2002. From Hand to Mouth: The Origins of Language. Cambridge,
MA: Harvard University Press.
Corballis, Michael C. 2011. The Recursive Mind: The Origin of Human Language,
Thought, and Civilization. Princeton.
De Ruiter, Jan. 1998. "Gesture and Speech Production." Max-Planck-Institute for
Donald, Merlin. 1991. Origins of the Modern Mind: Three Stages in the Evolution of
Culture and Cognition. Cambridge: Harvard University Press.
Dreyfus, H. 1994. Being-in-the-World: A Commentary on Heidegger's Being and Time,
Division I. Cambridge: MIT Press.
Duncan, Susan. 2005. Gesture in signing: A case study in Taiwan Sign Language.
Language and Linguistics 6: 279-318.
Dunn, Michael, Greenhill, Simon J., Levinson, Stephen C. and Gray, Russell D. 2011.
Evolved structure of language shows lineage-specific trends in word-order
universals. Nature on-line
(Accessed 19 April 2011).
Emmorey, Karen, Borinstein, Helsa B. and Thompson, Robin. 2005. Bimodal
bilingualism: Code-blending between spoken English and American Sign
Language, in Cohen, James, McAlister, Kara T., Rolstad, Kellie and
MacSwan, Jeff (eds.). Proceedings of the 4th International Symposium on
Bilingualism, pp. 663-673. Somerville, MA: Cascadilla Press.
Evans, Patrick D., Gilbert, Sandra L., Mekel-Bobrow, Nitzan, Vallender, Eric J.,
Anderson, Jeffrey R., Vaez-Azizi, Leila M., Tishkoff, Sarah A., Hudson,
Richard R. and Lahn, Bruce T. 2005. Microcephalin, a gene regulating brain
size, continues to evolve adaptively in humans. Science 309:1717-1720.
Freyd, Jennifer J. 1983. Shareability: The social psychology of epistemology.
Cognitive Science 7: 191-210.
Gardner, R. A. and Gardner, B. T. 1969. Teaching sign language to a chimpanzee.
Science 168: 664-672.
Gelb, A. and Goldstein, K. 1925. Über Farbennamenamnesie. Psychologische
Forschung 6: 127-186.
Gentilucci, Maurizio, and Dalla Volta, Riccardo. 2007. The motor system and the
relationship between speech and gesture. Gesture 7: 159-177.
Glick, Joseph. 1983. Piaget, Vygotsky, and Werner. In Wapner, Seymour and Kaplan,
Bernard (eds.), Toward a Holistic Developmental Psychology, pp. 35-52. Hillsdale
NJ: Lawrence Erlbaum Associates.
Goldin-Meadow, Susan. 2003. The Resilience of Language: What Gesture Creation in Deaf
Children Can Tell Us About How All Children Learn Language. New York:
Psychology Press.
Goren-Inbar, Naama, Alperson, Nira, Kislev, Mordechai E., Simchoni, Orit,
Melamed, Yoel, Ben-Nun, Adi, and Werker, Ella. 2004. Evidence of hominid
control of fire at Gesher Benot Ya'aqov, Israel. Science 304: 725-727.
Hopkins, William D. and Cantero, Monica. 2003. From hand to mouth in the
evolution of language: the influence of vocal behavior on lateralized hand use
in manual gestures by chimpanzees (Pan troglodytes). Developmental Science
6: 55-61.
Hrdy, Sarah Blaffer. 2009. Mothers and others: The evolutionary origins of mutual
understanding. Cambridge, MA: Harvard University Press.
Humboldt, Wilhelm von. 1999. On Language. P. Heath (trans.), M. Losonsky (ed.).
Jakobson, R. 1960. Concluding statement: Linguistics and poetics. In Sebeok, T.
(ed.). Style in Language, pp. 350-377. Cambridge, MA: MIT Press.
Kendon, Adam. 1988. Sign languages of aboriginal Australia: cultural, semiotic and
communicative perspectives. Cambridge: Cambridge University Press.
Kendon, Adam. 2004. Gesture: visible action as utterance. Cambridge: Cambridge
University Press.


Kendon, Adam. 2008. Some reflections on the relationship between gesture and
sign. Gesture 8: 348-366.
Kita, Sotaro & Özyürek, Asli. 2003. What does cross-linguistic variation in semantic
coordination of speech and gesture reveal? Evidence for an interface
representation of spatial thinking and speaking. Journal of Memory and
Language 48:16-32.
McNeill, David. 2012. How Language Began. Cambridge University Press.
McNeill, David, Duncan, Susan D., Cole, Jonathan, Gallagher, Shaun, and
Bertenthal, Bennett. 2008. Growth points from the very beginning. Interaction
Studies (special issue on proto-language, D. Bickerton and M. Arbib, eds.)
9(1): 117-132.
Mead, George Herbert. No Date. Typescript in University of Chicago Special Collections.
Box 13, Folder 21, top page starting with "The gesture which indicates the object."
Merleau-Ponty, Maurice. 1962. Phenomenology of Perception (C. Smith, trans.).
London: Routledge.
Nobe, Shuichi. 1996. Representational gestures, cognitive rhythms, and acoustic
aspects of speech: a network/threshold model of gesture production.
Unpublished Ph.D. Dissertation, Department of Psychology, University of
Pollick, Amy S. 2006. Gestures and Multimodal Signaling in Bonobos and
Chimpanzees. Unpublished Ph.D. dissertation, Department of Psychology,
Emory University.
Quaeghebeur, Liesbet. 2012. The All-at-Onceness of embodied, face-to-face
interaction. Journal of Cognitive Semiotics 4: 167-188.
Rizzolatti, Giacomo and Arbib, Michael. 1998. Language within our grasp. Trends
in Neurosciences 21: 188-194.
Rozzi, Fernando V. Ramirez and de Castro, José Maria Bermudez. 2004.
Surprisingly rapid growth in Neanderthals. Nature 428: 936-939.
Saussure, Ferdinand de. 1959. Course in General Linguistics (Charles Bally and Albert
Sechehaye, eds., Wade Baskin, trans.). New York: The Philosophical Library.
Silverstein, Michael. 2003. Indexical order and the dialectics of sociolinguistic life.
Language & Communication 23: 193-229.
Sweet, Henry. 1971/1900. The history of language, in Henderson, E. (ed.). The
Indispensable Foundation: a selection from the writings of Henry Sweet, pp. 1-24.
London: Oxford University Press.
Stefanini, Silvia, Caselli, Maria Cristina, & Volterra, Virginia. 2007. Spoken and
gestural production in a naming task by young children with Down
syndrome. Brain and Language 101:208-221.
Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Cambridge, MA:
Harvard University Press.
Tomasello, Michael. 2008. Origins of Human Communication. Cambridge, MA: MIT
Vygotsky, Lev S. 1987. Thought and Language. Edited and translated by E. Hanfmann
and G. Vakar (revised and edited by A. Kozulin). Cambridge: MIT Press.
Werner, Heinz & Kaplan, Bernard. 1963. Symbol Formation. New York: John Wiley &
Sons Ltd. [reprinted in 1984 by Erlbaum].
Wimmer, Heinz and Perner, Josef. 1983. Beliefs about beliefs: Representation and
constraining function of wrong beliefs in young children's understanding of
deception. Cognition 13: 103-128.
Woll, Bencie. 2005/2006. Do mouths sign? Do hands speak? in Botha, Rudie and de
Swart, Henriette (eds.). 2005/2006. Restricted Linguistic Systems as Windows on
Language Evolution. Utrecht: LOT (Netherlands Graduate School of Linguistics
Occasional Series, Utrecht University)
http://lotos.library.uu.nl/publish/articles/000287/bookpart.pdf (accessed