
ARTICLE IN PRESS

Journal of Phonetics 35 (2007) 552–563


www.elsevier.com/locate/phonetics

Letter to the Editor

Current views on Neanderthal speech capabilities: A reply to Boe et al. (2002)
Philip Lieberman
Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI, USA

More than 30 years have passed since the publication of Lieberman and Crelin (1971), "On the Speech of
Neanderthal Man," which concluded that the reconstructed Neanderthal supralaryngeal vocal tract (SVT) was
similar to that of a newborn infant and could not produce the full range of formant frequency patterns of human
speech. In the intervening years, independent studies have provided critical insights on both the development
and evolution of the vocal tract. However, it is apparent that many scholars having an interest in the evolution of
human speech are not familiar with these developments, or for that matter with the procedures and claims of
the 1971 paper. Louis-Jean Boe and his colleagues, in two published papers (Boe, Maeda, & Heim, 1999; Boe,
Heim, Honda, & Maeda, 2002) that critique the 1971 Lieberman and Crelin paper, fail to take these findings
into account. Independent studies show that the exact position of the tongue in the 1971 reconstruction may
be incorrect. However, they also show that the flexure of the cranial base is not, as Boe and his colleagues
claim, an index that demonstrates that Neanderthals possessed fully modern human vocal tracts. Moreover,
other genetic and morphological studies subsequent to 1971 show that Neanderthals were a species distinct
from humans and had skeletal features that preclude a fully human SVT. The restructuring of the human
skull which places the human face in line with the braincase did not take place in Neanderthals, resulting in a
long oral cavity. A modern vocal tract placed on a Neanderthal skull would require a tongue displaced down
so low into its neck that the creature's larynx would be in its chest, a configuration absent in any primate
species.

1. Introduction

Many independent studies have shown that the range of area functions and the overall length of the SVT of
a primate determine the formant frequencies that it can generate (e.g., Chiba & Kajiyama, 1941; Fant, 1960;
Henke, 1966; Stevens, 1972; Story & Titze, 1998; Story, Titze, & Hoffman, 1996). Computer modeling of the
SVTs of living nonhuman primates shows that the range of formant frequencies that they can produce is
restricted compared to that of an adult human SVT (Lieberman, Crelin, & Klatt, 1972; Lieberman, Klatt, &
Wilson, 1969). The human tongue, moving as an essentially undeformed body in the right-angle space defined
by the mouth and pharynx, can produce the changes in the cross-sectional area of the SVT necessary to produce
the vowels [i], [u], and [a], which almost universally occur in human languages. Acoustic analyses of the
vocalizations of nonhuman primates conducted over the course of more than 30 years (e.g., Lieberman, 1968;

E-mail address: Philip_Lieberman@brown.edu.

0095-4470/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.wocn.2005.07.002

Rendall, Kollias, Ney, & Loyd, in press) are consistent with these modeling studies. Independent, theoretical
studies of the relationship that holds between vocal tract shapes and formant frequency ranges are likewise
consistent with these findings (Carre, Lindblom, & MacNeilage, 1995; Stevens, 1972; Stevens & House, 1955).
The Boe et al. (1999, 2002) studies replicate these findings since, as noted below, they model adult human
vocal tracts producing the area functions normally used to produce human speech.
Studies of the ontogenetic development of the human vocal tract have played a key role in revealing the
relationships that hold between skeletal features that are useful in reconstructing the vocal tracts of extinct
fossil hominids, as well as the present limits on our knowledge. The tongue is positioned almost entirely in the
mouth in other species and in human neonates. In the course of human ontogenetic development the tongue
moves down into the pharynx, carrying the larynx down with it, until the horizontal oral portion (SVTh)
and vertical pharyngeal portion (SVTv) of the SVT have equal 1:1 proportions. Independent studies of the
development of the human SVT show that, while cranial base flexure reaches its adult values by age 2 years,
the tongue continues to descend; SVTh does not equal SVTv until ages 6–8 years (Fitch & Giedd, 1999;
Lieberman & McCarthy, 1999). In short, the adult-like human SVT has a tongue with an almost circular
sagittal contour forming two segments, a horizontal oral cavity (SVTh) and a vertical pharyngeal cavity
(SVTv), having 1:1 proportions and positioned at a right angle. Movements of the undistorted tongue in the
space defined by the oral cavity and pharynx can produce the abrupt midpoint 10:1 area function discontinuities
necessary to produce the formant frequency patterns of the quantal vowels [i], [u] and [a] (Carre et al., 1995;
Ladefoged, De Clerk, Lindau, & Papcun, 1972; Nearey, 1979; Stevens, 1972; Story & Titze, 1998). In contrast,
the tongues of nonhuman primates and human newborns are positioned almost entirely within their mouths
and cannot produce these SVT area functions (Bosma, 1975; Lieberman & Crelin, 1971; Negus, 1949;
Nishimura, Mikami, Suzuki, & Matsuzawa, 2003; Truby, Bosma, & Lind, 1965).

2. The historical background: Victor Negus

The restructuring of the human SVT between birth and age 6–8 years has shown that the difference between
nonhuman and adult human SVTs involves the descent of the tongue root down and into the pharynx,
carrying the larynx down with it. The displacement and ultimate position of the larynx are secondary; in
humans laryngeal position appears to be closely coupled to tongue displacement (Nishimura et al., 2003). This
developmental process was first described by Negus (1949), who noted that:

There is a gradual descent [of the larynx] through the embryo, fetus and child … The determining factor in
man is recession of the jaws; there is no prognathous snout … The tongue however retains the size it had in
Apes and more primitive types of Man, and in consequence it is curved, occupying a position partly in the
mouth and partly in the pharynx. As the larynx is closely approximated to its hinder end, there is of
necessity descent in the neck; briefly stated the tongue has pushed the larynx to a low position, opposite the
fourth, fifth and sixth cervical vertebrae. (Negus, 1949, pp. 25–26)

2.1. The descent of the tongue

The key factor, thus, is not the descent of the larynx. The descent of the tongue in the pharynx enables the
human SVT to produce either constricted or open pharyngeal airways. Several nonhuman species (including
deer and lions) have low larynges (Fitch & Reby, 2001; Weisengrubber, Forstenpointner, Peters, Kubber-Heiss,
& Fitch, 2002), but their long, relatively thin tongue bodies are positioned in their mouths; they cannot
produce the SVT shapes necessary to produce the formant frequency patterns that convey the full range of
human speech. Thus, despite the focus on the larynx in many studies on the evolution of speech, the descent
and change in the tongue's shape are the key factors in both the development and evolution of the human
SVT. This was noted in Lieberman (1984, pp. 276–280), which unambiguously states that the descent, shape
and position of the tongue in the right-angle space delimited by the pharyngeal and oral cavities is the key to
the enhanced speech-producing capabilities of the human SVT.
Recent studies of human and primate development confirm Negus's general insight; human beings and
nonhuman primates follow two different paths after birth. In humans the tongue descends into the pharynx
taking the larynx down with it. However, it has become clear that the process entails more than the recession
of the jaws. Negus's inferences concerning the jaw are correct insofar as extensive facial retraction occurs only
in humans. In the first two years of life the skeletal structures that support the face rotate backwards and the
cranial base flexes from the relatively flat contour that it has at birth (Lieberman, Ross, & Ravosa, 2000).
Therefore, the majority of the human face lies beneath the braincase, contributing to less projecting
browridges and shorter jaws than those of archaic, extinct hominids, including Neanderthals. Facial retraction
also reduces the space between the back of the hard palate and the foramen magnum (Lieberman & McCarthy,
1999), the space into which the larynx must fit in order to be locked into the nose in the standard-plan SVT
(Lieberman & Crelin, 1971; Lieberman et al., 2000; Lieberman & McCarthy, 1999). As Lieberman and
McCarthy (1999) note (replicating Lieberman and Crelin's (1971) observation), there is insufficient room for
the adult human larynx to fit into this space, the oropharynx. The total length of the human mouth is
reduced and the tongue begins to descend down into the pharynx, carrying the larynx down. Much less
rotation of the facial block occurred in Neanderthals.

3. The Lieberman and Crelin (1971) Neanderthal SVT reconstruction

But these findings were 30 years in the future when Lieberman and Crelin (1971) attempted to
reconstruct the SVT of the Neanderthal fossil found in the village of La Chapelle-aux-Saints (Boule,
1911–1913). Lieberman and Crelin (1971) relied on analogy: the similarities that exist between the complete
skeletal morphology of the base of the skull and mandible of human newborn infants and the Neanderthal.
They noted a number of similarities, including but not limited to the flexure of the base of the cranium. These
included skeletal features that support the muscles that move the tongue, such as the pterygoid process of the
sphenoid bone and the angulation of the basilar portion of the occipital bone. The total length of the
basicranium and the distance between the end of the palate and the foramen magnum (the big hole into
which the spinal column inserts) were similar in newborns and the fossil. In human newborns the larynx moves
upwards to lock into the relatively long space between the palate and foramen magnum. Lieberman and Crelin
observed that it would be impossible for an adult human larynx to fit into the shorter adult palate-to-foramen
magnum space. Other skeletal features, such as the shape of the foramen magnum and the occipital condyles
on which the cervical vertebrae rest, were similar in newborns and the fossil.
On this basis they reconstructed the Neanderthal SVT, judging it to be similar to that of a human newborn,
and modeled a range of SVT area functions similar to those executed by newborns in the cineradiographic
study of Truby et al. (1965). Henke's (1966) computer-implemented model established the relationship
between SVT configuration and formant frequencies. The 1971 paper concluded that the Neanderthal SVT
was similar to that of a human newborn and had similar phonetic limitations. Speech was possible, but the
formant frequency patterns that convey the extreme, "point" or "quantal" vowels (Stevens, 1972) of human
speech, [i], [u] and [a], could not be produced, owing to the reconstructed Neanderthal's tongue resting almost
entirely in his oral cavity. This precluded its producing the abrupt 10:1 area function SVT midpoint
discontinuities necessary to produce these quantal vowels.

4. Misconceptions regarding the flexure of the cranial base

Similarities between the embryonic and early stages of development have been used since Darwin (1859) to
make inferences concerning evolution. The evolutionary basis of the development of particular features is
now the subject of renewed attention in the field of evolutionary biology, "evo-devo" (Alberch, 1989; Carroll,
Grenier, & Weatherbee, 2001). However, the Lieberman and Crelin (1971) reconstruction was an analogy.
George (1976, 1978) therefore attempted to provide a quantitative basis for reconstructing the
SVTs of archaic hominids. George studied the Denver series of cephalometric X-rays, which tracked the
development of basicranial skeletal features and the soft tissue of the SVT in children from age 3 months to
adulthood (Maresh, 1948; McCammon, 1970; Maresh & Washburn, 1938). George correlated the flexure of
the base of the skull, which at birth is relatively flat, with the occurrence of vowels that to her ears appeared to
be exemplars of quantal vowels such as [i]. She noted that the cranial base angle becomes more acute,
reaching its adult range by age 2 years, when children appeared to produce these vowels. Since Stevens (1972)
had shown that an adult-like human SVT is necessary to produce these sounds, the logical conclusion was that
the descent of the tongue root necessary to achieve adult-like SVT proportions was closely linked to the
cranial base angle. In short, a SVT having adult SVT proportions, a SVTh (the oral, horizontal segment) and a
SVTv (the pharyngeal, vertical segment) having almost equal lengths, was achieved when cranial flexure
stabilized early in life.
However, subsequent acoustic analyses show that young children do not really produce the formant
frequency patterns that specify these vowels. Buhr (1980) derived the formant frequency patterns of the vowels
produced by children in the first years of life; they do not conform to those of adult speech. For example, the
formant frequencies of a 64-week-old infant's vowels heard as [i] were actually those of [I]. But this is not
apparent, even when trained phoneticians listen to children's vowels (e.g., Irwin, 1948). Patricia Kuhl and her
colleagues (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992) solved the mystery. When we listen to speech,
our judgments are influenced by the phonetic categories of the language that we have been exposed to. Kuhl and
her colleagues demonstrated that an ill-formed formant frequency pattern in the [i] range will be perceived as
being similar to the ideal exemplar of that vowel in the language that a person is exposed to in the early
months of life. The effect may reflect discrimination being reduced between signals that fall into the same
phonetic category (Lotto, Kluender, & Holt, 1998), and the Kuhl studies show that these categories are formed
early in life. In effect, our speech perception system "cleans up" sloppy signals for this quantal vowel, which, as
Nearey (1979) demonstrated, can serve as a robust indicator of vocal tract length. The absence of
computer-implemented digital image analysis technology in the 1970s precluded accurate measurements of tongue
position by George; the perceptual effects documented by Kuhl and her colleagues were not apparent until
more than a decade later. Thus cranial base flexure was thought to be a measure that could be used to predict
whether or not a fossil had an adult human SVT.
The supposed close relationship between SVT development and cranial base angle, derived from George's
(1976, 1978) conflating auditory vowel perception with formant frequency patterns, was accepted by myself and
other concerned parties. A series of studies followed that linked the cranial base angle with the shape of the SVT
in nonhuman primates (Laitman, Heimbuch, & Crelin, 1978). These studies also employed a statistical
procedure that factored in the length of the basicranium (which roughly indicates oral cavity length) to estimate
the SVTs of extinct fossil hominids (Laitman & Heimbuch, 1982; Laitman, Heimbuch, & Crelin, 1979). This
makes evaluation of their SVT predictions difficult in the light of subsequent studies; further study is
warranted.

4.1. Reappraising the significance of cranial base flexure

When Daniel Lieberman and McCarthy (1999) reexamined the Denver series and tracked the actual descent
of the tongue root and larynx against the cranial base angle, they found that the larynx and tongue root continue
to descend in humans for years after cranial flexure has stabilized in the adult range. Fitch and Giedd
(1999) in an MRI study independently reached the same conclusion. The proportions of SVTh (the oral,
horizontal segment) and SVTv (the pharyngeal, vertical segment) do not achieve their adult 1:1 proportion
until age 5–6 years. In short, cranial flexure cannot be used to predict tongue and larynx position. The human
flexed cranial base angle, which has been noted in independent studies (Laitman & Crelin, 1976; Lieberman
et al., 2000; Nishimura et al., 2003), appears to be linked to facial retraction and the initial stage of tongue
displacement into the pharynx. However, after the cranial base angle stabilizes between ages 2 and 3 years, the
proportion of the human tongue that is in the pharynx (SVTv) continues to increase through further descent,
relative to that in the oral cavity (SVTh), until age 6–8 years. The 1:1 proportion is then maintained as the face
and jaw attain their adult sizes at about 16–18 years (Lieberman, McCarthy, Hiiemae, & Palmer, 2001;
Lieberman, McBratney, & Krovitz, 2002). Thus the configuration of the SVT cannot be determined with
certainty by examining the cranial base angle of a fossil.
Confusion still persists in some scholars' minds between what a vowel-like formant frequency pattern
sounds like and its actual formant frequencies. One can find statements in the scientific literature that
baboons or chimpanzees produce the formant frequency patterns of [i]s, [u]s and [a]s. Fortunately,
quantitative acoustic analyses are increasingly being employed. Rendall et al. (in press), for example, show
that baboon vocalizations have a restricted vowel range close to the human schwa vowel and do not include
quantal vowels. Other quantitative acoustic analyses of nonhuman primate species having tongues positioned
in their mouths consistently show that their vocalizations are limited to the schwa vowel. Fitch's (2000) data,
for example, show that this is the case for chimpanzees and a wide range of primates. In fact, the formant
frequency dispersion metric used by Fitch (2000) to relate skull size with the absolute difference between F3
and F1 depends on the fact that all of the creatures studied produce vowel-like formant patterns that closely
approximate the neutral schwa vowel. For humans, who habitually produce a wide range of formant
frequencies, the F3–F1 dispersion of an [i] for the same individual would be substantially greater than that of an [a].
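
As a rough illustration of why a dispersion measure can track size only for schwa-like vocalizations, the sketch below uses the standard uniform-tube (quarter-wavelength) idealization of a neutral vocal tract, F_n = (2n - 1)c / 4L. The idealization, the speed-of-sound value, the example lengths and the function names are assumptions introduced here for illustration; they are not taken from the studies cited above.

```python
# Uniform-tube sketch of a neutral (schwa-like) vocal tract.
# Assumption: a lossless tube of constant cross-section, closed at the
# glottis and open at the lips. Not a model used in the cited studies.

SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed round value for air in the vocal tract


def uniform_tube_formants(length_cm, n_formants=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Return the first n resonances in Hz: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, n_formants + 1)]


def f3_minus_f1(formants):
    """Absolute difference between the third and first formants in Hz."""
    return abs(formants[2] - formants[0])


if __name__ == "__main__":
    for length_cm in (10.0, 17.0):  # hypothetical short and long tracts
        formants = uniform_tube_formants(length_cm)
        print(length_cm, [round(f) for f in formants], round(f3_minus_f1(formants)))
```

For such a tube F3 - F1 reduces to c/L, so the difference shrinks as the tract lengthens. That simple relation holds only while the tract stays close to the neutral shape; a speaker who moves between [i] and [a] changes F3 - F1 without changing length.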

4.2. Incorrect inferences based on cranial base flexure

The incorrect inference that cranial flexure in itself is an index of SVT morphology casts doubt on the
conclusions of many studies. This uncertainty applies to the Lieberman and Crelin (1971) reconstruction. The
biological mechanisms that regulate the descent of the tongue after cranial flexure stabilizes are presently
unknown, and the position of the tongue in the pharynx in the Lieberman and Crelin (1971) and other studies
cannot presently be stated with certainty. This limitation applies with equal force to the Neanderthal
reconstruction presented by Boe and his colleagues in 1999, based on Heim's measurements of the cranial
flexure of the La Chapelle-aux-Saints fossil's basicranium.
The vocal tract configurations modeled by Boe and his colleagues are those of adult human SVTs. Although
the cranial flexure of the basicranium in Heim's (1989) reconstruction of the La Chapelle-aux-Saints
Neanderthal fossil is within the human range, that fact cannot be used to determine the shape of the fossil's
SVT. Boe and his colleagues in their 1999 paper may have missed the Lieberman and McCarthy (1999) and
Fitch and Giedd (1999) papers. However, their 2002 Journal of Phonetics study selectively cites these papers,
ignoring their major finding. The relationships that hold between skeletal features and the SVT in adults noted
by Honda and Tiede (1998), which Boe and his colleagues discuss at length, are valid for adult humans, but
they do not hold even for young children. Nonetheless, Boe and his colleagues claim that they can use these
relationships to establish the SVT shape of Neanderthals, who are a nonhuman species.
Genetic evidence (Krings et al., 1997; Ovchinnikov et al., 2000) shows that Neanderthals diverged from
humans about 500,000 years ago. Their skeletal morphology differs from that of modern humans (Howells,
1976, 1989; Lieberman, 1995). In short, adult Neanderthals are not genetically or morphologically similar to
modern human adults. The relationships that hold between skulls, jaws and soft tissue correlated by Honda
and Tiede (1998) hold for adult humans; they do not apply to young children, human neonates, apes or
monkeys. They cannot arbitrarily be applied to Neanderthals. Despite these facts, Boe and his colleagues
claim that they can position the tongue root and larynx of a Neanderthal fossil with certainty. Ignoring studies
that have established the fact that Neanderthals were a species removed from humans, they claim that the
skeletal–soft tissue relationships that hold for adult humans apply to the La Chapelle-aux-Saints fossil; they
conclude that it had a fully modern, adult human SVT and then model the SVT shapes that adult human
speakers use to produce vowels. Not surprisingly, these fully human vocal tract configurations produce the full
range of human vowels. The infant SVT proposed by Goldstein, which Boe and his colleagues also model in
their 1999 and 2002 papers, does not resemble any newborn SVT in the published studies of Negus (1949),
Truby et al. (1965), Bosma (1975), Laitman and Crelin (1976), or anyone else. Its SVTv/SVTh ratio is close to
that of the 5–6-year-old children documented in both the Lieberman and McCarthy (1999) and Fitch and
Giedd (1999) studies.

5. Could a Neanderthal skull support a modern vocal tract?

We cannot predict how far the tongue would descend down into the pharynx, carrying the larynx down with
it, since the tongue continues to descend in humans after cranial base flexure stabilizes. So although rotation of
the facial block did not occur in Neanderthals to the extent evident in modern humans, it is possible that the
tongue descended into the pharynx to some degree. Nonetheless, Neanderthals could not have had a fully
human SVT. The hard skeletal evidence that demonstrates that Neanderthals did not have a normal
adult-like SVT is the long span of the oral cavity between its anterior end, marked by the prosthion
(approximately at the front incisor teeth) and the anterior margin of the foramen magnum.
Since recent papers, such as Boe et al. (1999, 2002), continue to claim that the La Chapelle-aux-Saints
Neanderthal fossil had an adult human SVT, let us go through the steps of the reconstruction, keeping in mind
the length of the oral cavity, the horizontal part of the supralaryngeal vocal tract, SVTh. The first step
entails placing the La Chapelle skull on a normal vertebral column. This follows the observation of Straus
and Cave (1957) and yields normal upright posture. Step two involves placing a human tongue on the fossil.
As noted above, the curved human tongue body forms both the floor of the oral cavity and the anterior wall of
the pharynx. Let us place on the La Chapelle-aux-Saints skull an adult human tongue from the radiographic
study of Ladefoged et al. (1972), scaled to the size that would allow the Neanderthal to swallow food. Studies
of swallowing in humans and other species show that the Neanderthal's tongue would have had to be large
enough to propel food along the length of his oral cavity to enable him to eat (Palmer et al., 1992; Hiiemae
et al., 2002).
Step three involves attaching the hyoid bone and complete larynx from the cineradiographic study of Perkell
(1969) to the Neanderthal. Human necks accommodate a SVTv that is equal in length to SVTh, but the fit is
quite close. The cartilages of the larynx just fit into the human neck. Fink and Demarest (1978, p. 25), in their
detailed study of the larynx, place the lower margin of the cricoid cartilage of the larynx during expiration at
the lowest cervical vertebra (CV7) of the human neck. Their cineradiographic data show that the cricoid
transiently descends below CV7 during inspiration. Although Negus (1949) places the lower margin of the
cricoid at CV6, Perkell's (1969) cineradiographic study locates the vocal cords at the lower margin of CV6,
placing the cricoid cartilage somewhat lower. The anatomical evidence thus shows that the larynx just fits into
our neck. Heim (1976), who studied the La Ferrassie I Neanderthal fossil, whose cervical vertebrae were
preserved, states that the length of the Neanderthal neck was no longer than that of modern adult humans,
although the sketches on page 315 of his study show that it is shorter than the neck of the single Frenchman
illustrated. Total SVT length correlates with height in many animals (Fitch, 2000) and in humans (Fitch,
1993), though to a lesser degree since body shape and height differ in human populations adapted to different
climates. Neck lengths in humans also generally are longer for taller people; Mahajan and Bharucha (1994), who
studied 2724 children living in India, derived an age-independent linear regression that correlated neck length
with height: neck length = 10 + (0.035 × height). Different scaling factors probably hold for human
populations adapted for life in cold climates, but the point to keep in mind is that as children mature, their
larynges most likely just fit into their necks. This may explain the fact that the tongue does not descend into
the pharynx, achieving SVTh = SVTv, until after age 6 years.
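
The regression quoted above, read as neck length = 10 + 0.035 × height, can be evaluated directly. The following is only a minimal sketch: the function name and the sample heights are hypothetical, and the units are whatever the source study used, since the text does not state them.

```python
def predicted_neck_length(height):
    """Neck length predicted from height by the regression quoted in the text.

    Assumption: height and neck length are in the source study's units,
    which the text does not state.
    """
    return 10.0 + 0.035 * height


if __name__ == "__main__":
    for height in (100.0, 140.0, 180.0):  # hypothetical heights
        print(height, round(predicted_neck_length(height), 2))
```

Because the slope is small relative to the intercept, predicted neck length grows only slowly with height, which is consistent with the point that the larynx has little spare room in the neck as children mature.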
Fig. 1 shows the Neanderthal skull and the putative human-like vocal tract. Note that the larynx is positioned
below the seventh cervical vertebra, below the sternum. The problem arises because of the 12.66 cm
Neanderthal basicranial span between the incisors (the usual skeletal feature marking this point is the
prosthion) and the anterior margin of the foramen magnum, the basion. This skeletal length (prosthion
to basion) appears to have been subjected to intense selective pressure during the course of human evolution,
since it has low variability across modern human populations. Howells (1989) studied the skulls of 2504 adult
males and females drawn from groups distributed around the world. The mean length for males is 100.46 mm
with a standard deviation (S.D.) of 4.6 mm. The longest male mean length, for a robust Melanesian group, is
107.04 mm, S.D. 4.76 mm; the shortest length is 93.75 mm, S.D. 5.7 mm, for a European group (Howells, 1989,
pp. 10, 125, and 141). The 126.6 mm Neanderthal length is one element of the Neanderthal skeletal complex
that Howells's statistical analyses show falls outside the human range of variation.
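
To give a sense of how far the 126.6 mm Neanderthal prosthion-to-basion length lies from the human figures quoted above, the sketch below expresses the gap in standard deviations. Framing the comparison as a z-score is an assumption made here for illustration; it is not presented as the statistical analysis Howells carried out, and the names are hypothetical.

```python
# Values quoted in the text (Howells, 1989) and the La Chapelle measurement.
HUMAN_MALE_MEAN_MM = 100.46
HUMAN_MALE_SD_MM = 4.6
LONGEST_GROUP_MEAN_MM = 107.04  # robust Melanesian group
LONGEST_GROUP_SD_MM = 4.76
NEANDERTHAL_MM = 126.6


def z_score(value, mean, sd):
    """Distance of a value from a mean, in units of standard deviation."""
    return (value - mean) / sd


z_all_males = z_score(NEANDERTHAL_MM, HUMAN_MALE_MEAN_MM, HUMAN_MALE_SD_MM)
z_longest_group = z_score(NEANDERTHAL_MM, LONGEST_GROUP_MEAN_MM, LONGEST_GROUP_SD_MM)

if __name__ == "__main__":
    print(round(z_all_males, 2), round(z_longest_group, 2))
```

On these figures the fossil lies roughly 5.7 standard deviations above the overall male mean and about 4.1 above the mean of the longest-skulled group.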

Fig. 1. La Chapelle-aux-Saints Neanderthal fossil provided with a supralaryngeal vocal tract that would be capable of producing the full
range of human speech. This entails having horizontal SVTh and vertical SVTv vocal tract lengths equal in length. Neanderthals
constitute a species that diverged from humans 500,000 years ago. Their long skull base, which determines the length of SVTh, is one of the
distinctions between this archaic species and modern humans. Therefore, SVTv must match the long Neanderthal SVTh, placing the
larynx below the neck. This position, which would impede swallowing, has not been documented in any living human or ape.

If, as a rough estimate (since the range of neck and SVT variation in Neanderthals is unknown), we use this
dimension to scale the Neanderthal SVT, it would be approximately 126/101 times as long as that of an adult
male human, if his SVTv and SVTh had equal lengths. SVT length is correlated with height in humans,
though to a lesser extent than in rhesus monkeys (Fitch, 1993), and the SVT lengths of the tall, 6 foot and
5 foot 9 inch (183 and 175 cm), adult males studied by Baer, Gore, Gracco, and Nye (1991) were 17 cm. The
formant frequencies of their [i] vowels, which are determined by SVT length, are somewhat lower than those
measured by Hillenbrand, Getty, Clark, & Wheeler (1995) for 45 adult men, which suggests that 17 cm is most
likely the longest "normal" SVT length for adult human males. (A comprehensive study of human
populations similar to that of Howells (1989) for skulls is not available.) The La Chapelle Neanderthal SVT
length thus would be about 21.3 cm, given its basion-to-prosthion length (1.25 × 17 cm). This would result
in a SVTv length that was 20 mm longer than an adult human male's, placing the lower margin of the cricoid
below D1, the first thoracic vertebra of the La Ferrassie I Neanderthal, whose length is 17 mm (Heim, 1975).
We have again tilted the exercise towards the benefit of the Neanderthal, since the SVTvs of adult males are
somewhat longer than those of females (Fitch & Giedd, 1999). A longer male SVTv would place the larynx
even lower.
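
The scaling estimate above is simple proportional arithmetic and can be checked as follows. This is a minimal sketch using the rounded values quoted in the text (126 mm, 101 mm, 17 cm); the function names are hypothetical, and the step that halves the total length relies on the text's premise that SVTv and SVTh are equal.

```python
HUMAN_SPAN_MM = 101.0        # rounded human male prosthion-to-basion mean used in the text
NEANDERTHAL_SPAN_MM = 126.0  # rounded La Chapelle value used in the text
HUMAN_SVT_CM = 17.0          # longest normal adult male SVT length taken in the text


def scaled_svt_cm(fossil_span_mm, human_span_mm, human_svt_cm):
    """Scale a human SVT length by the ratio of the two skull-base spans."""
    return human_svt_cm * fossil_span_mm / human_span_mm


def extra_svtv_mm(scaled_cm, human_cm):
    """Extra pharyngeal length in mm, assuming SVTv is half of the total SVT."""
    return (scaled_cm / 2.0 - human_cm / 2.0) * 10.0


neanderthal_svt_cm = scaled_svt_cm(NEANDERTHAL_SPAN_MM, HUMAN_SPAN_MM, HUMAN_SVT_CM)
extra_mm = extra_svtv_mm(neanderthal_svt_cm, HUMAN_SVT_CM)

if __name__ == "__main__":
    print(round(neanderthal_svt_cm, 1), round(extra_mm, 1))
```

With the unrounded ratio 126/101 the total comes to about 21.2 cm, slightly under the 21.3 cm obtained from the rounded factor 1.25, and the extra SVTv length is about 21 mm, in line with the 20 mm figure given in the text.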
In short, the human structures of the larynx just fit into the human neck. If a Neanderthal had a
functionally human SVT, its long SVTh would have to be matched by an equally long SVTv, placing the
cricoid cartilage of the larynx below the sternum. This would interfere with the laryngeal maneuvers that are
necessary to swallow food (Negus, 1949, p. 176). The hyoid moves upwards and forwards about 13 mm,
opening the esophagus and placing the larynx into a position in which food will not fall into it while
swallowing (Ishida, Palmer, & Hiiemae, 2002). A larynx placed in the neck can execute these maneuvers,
moving upwards and forwards. A larynx placed below the sternum would be blocked by this bone from
executing these movements. Since the swallowing pattern generators (the movements that are involved in
swallowing) are similar in humans and apes (Palmer et al., 1992), no human or ape descended from our
common ancestor has a larynx in its chest.
The relevance of the length of the Neanderthal oral cavity and oropharynx to the reconstruction of a
Neanderthal SVT has been discussed before (cf. Lieberman, 1984, pp. 290–296; for a detailed discussion of
this issue see also Lieberman, 1975, p. 137, and Lieberman, 1979, 1982). The subsequent studies noted above
reaffirm the conclusions noted therein. Any claim that a Neanderthal had a human SVT entails fitting the
Neanderthal mouth with the tongue of a human adult. Human tongues have equally long oral and pharyngeal
sections, and the larynx would end up placed in the creature's chest. Therefore, it is most unlikely that
Neanderthals could have produced the full range of human speech sounds.

6. The antiquity of speech

One additional point concerning Neanderthals and the evolution of human speech deserves clarification: its
antiquity. Boe and his colleagues in their 1999 paper claim that Lieberman and Crelin in 1971 stated that
Neanderthals were a speechless species. In their 2002 paper they state that Lieberman and Crelin (1971)
concluded that Neanderthals could not speak. The paper that Boe and his colleagues ostensibly cite states
that:

He [Neanderthal] was not as well equipped for language as modern man. His phonetic ability was, however,
more advanced than those of present day nonhuman primates and his brain may have been sufficiently well
developed for him to have established a language based on the speech signals at his command. The general
level of Neanderthal culture is such that this limited phonetic ability was probably used and that some form
of language existed. Neanderthal man thus represents an intermediate stage in the evolution of language.
This indicates that the evolution of language was gradual, that it was not an abrupt phenomenon. The
reason that human linguistic ability appears to be so distinct and unique is that the intermediate stages in its
evolution are represented by extinct species (Lieberman & Crelin, 1971, p. 221).

Although some recent studies continue to claim that modern speech capabilities appeared only 50,000 years
ago (Corballis, 2002), this is virtually impossible. The peculiar anatomy of the human airway, as Darwin
(1859) noted, increases the risk of choking to death when food lodges in the larynx. Current studies reaffirm
Darwin's observation: studies of swallowing show that tens of thousands of incidents of fatal choking have
occurred (Feinberg & Ekberg, 1990). This, in itself, argues against Corballis' 50,000-year date for speech.
Speech must have been in place in archaic hominids ancestral to humans and Neanderthals. There would have
been no selective advantage for retaining mutations that yielded the species-specific human speech-producing
anatomy at the cost of increased morbidity from choking, unless speech was already present. Corballis dates
speech and grammar to 50,000 years before the present (BP), citing studies that claim that advanced stone
tools and other modern artefacts unearthed in Europe date to that period. However, these artefacts most
likely were transported there by the humans who migrated to Europe from Western Asia and/or Africa, since
similar artefacts have been discovered in Africa that date back to 90,000 years BP (McBrearty & Brooks,
2000). Moreover, since Asia and Australia were settled by human immigrants between 60,000 and 40,000 BP,
with a resident human population remaining in Africa, how can we possibly account for the fact that any child
from any part of the world will acquire any human language with equal facility? There are no neural
impediments to language acquisition in any human population. If the 50,000-year date were correct, we would
have to account for the local mutation spreading back to Africa and to all other inhabited parts of the world.

7. Old business

Boe and his colleagues also cite studies disputing aspects of the 1971 Lieberman and Crelin paper to support
their critique of this 33-year-old paper, without noting that these issues were addressed many decades ago.
The appropriate references are in books that are readily available (Lieberman, 1984, 1991, 1998, 2000).
DuBrul's (1977) paper, which claimed, without providing any comparative evidence, that Neanderthals had
human airways, was addressed in Lieberman (1979). Falk (1975) claimed that a high hyoid bone (a bone that
supports the larynx) in a hominid, similar to its position in a chimpanzee, would prevent swallowing food while
standing or sitting upright. As noted in Lieberman (1982), Falk's theory can be refuted by visiting a zoo or
viewing documentary films; chimpanzees habitually swallow food while sitting upright. Arensburg et al. (1990)
claimed that the shape of an isolated hyoid bone (the fossil's skull was missing) could be used to determine the
position of the larynx. Lieberman (1993) noted the findings of independent anatomical studies which show
that the position of the hyoid bone and larynx relative to the skull base changes in the course of human
development, without any corresponding change in the hyoid bone's shape. Thus, the shape of an isolated
hyoid bone cannot be used to infer the position of the larynx or, more importantly, the tongue.
Houghton (1993) endowed Neanderthals with small-diameter tongues to avoid the larynx-in-the-chest
problem. If Neanderthals had these tongues they would not have been able to swallow food. That
was pointed out in Lieberman (1994), which took note of studies on swallowing such as Palmer, Rudin, Lara,
and Crompton (1992).
Other aspects of the Boe et al. (2002) paper also appear to derive from a failure to note the findings of
studies subsequent to 1971. Boe and his colleagues claim that a minimal vocal tract constriction of 1.0 cm²,
used to model Neanderthal SVTs by Lieberman and Crelin (1971), dramatically underestimated the range of
formant frequencies that could have been produced. The compressed ordinate scale of the sketch in the 1971
paper could lead to the conclusion that the minimal constriction was 1.0 cm². That problem was pointed out in
1971 to Crelin, Lieberman, and Dennis Klatt, who participated in the computer modeling for the 1971 paper and
the earlier 1969 monkey SVT modeling paper. Therefore, in Lieberman et al. (1972) a nonlinear ordinate scale
was used to plot SVT area (Fig. 10 on page 296 of that paper); the plot showed that the minimal area was
0.30 cm². Moreover, subsequent independent studies suggest that the minimal area issue is overstated. The
MRI study of Baer et al. (1991) reports constrictions of 0.69–0.50 cm² for two adult speakers when they
produced the quantal vowels [i], [u], and [a]. In contrast, Story, Titze, and Hoffman (1996) report a minimal
constriction of 0.10 cm² for their male subject. However, the shift in formant frequencies that might be
expected for the quantal vowel [i] for these different constriction areas would appear to be less than 50 Hz.
Beckman et al. (1995), studying the production of the vowel [i], predict a shift of less than 50 Hz for F2, the
second formant frequency of this vowel, for these different degrees of constriction. This is less than the
difference limen for F2. The changes in the formant frequencies appear to be imperceptible, which may explain
the area function variation evident in MRI studies of adult human speakers. Moreover, what matters about the
minimal constriction is, in any case, the ratio between the adjacent constricted and open areas of the SVT. The
Neanderthal SVT modeled in Lieberman and Crelin (1971) could not achieve the ratios or abrupt area
function discontinuities necessary to produce the quantal vowels [a], [i], and [u], as established in
contemporary studies (Fant, 1960; Stevens, 1972) and the subsequent studies noted above.
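
The point that area ratios matter more than the absolute minimal area can be illustrated with the textbook lossless two-tube approximation of a vocal tract, closed at the glottis and open at the lips. Its resonances satisfy tan(k·l_back)·tan(k·l_front) = A_front/A_back, where k = 2πf/c. This is a generic sketch and not the computer model used in Lieberman and Crelin (1971); the function name, the tube dimensions, and the speed of sound are illustrative assumptions.

```python
import math

SPEED_OF_SOUND_CM_S = 35000.0  # assumed round value for warm, moist air


def two_tube_resonances(l_back, a_back, l_front, a_front, f_max=4000.0, step=1.0):
    """Resonances (Hz) of a lossless two-tube tract, closed at the glottis
    and open at the lips. Lengths in cm, areas in cm^2.

    Locates sign changes of
        g(f) = a_back*sin(k*l_back)*sin(k*l_front)
               - a_front*cos(k*l_back)*cos(k*l_front)
    which is the pole-free form of
        tan(k*l_back) * tan(k*l_front) = a_front / a_back.
    """
    def g(f):
        k = 2.0 * math.pi * f / SPEED_OF_SOUND_CM_S
        return (a_back * math.sin(k * l_back) * math.sin(k * l_front)
                - a_front * math.cos(k * l_back) * math.cos(k * l_front))

    roots = []
    f_lo = step
    while f_lo < f_max:
        f_hi = f_lo + step
        if g(f_lo) * g(f_hi) < 0.0:
            lo, hi = f_lo, f_hi
            for _ in range(50):  # bisection refinement of the bracketed root
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        f_lo = f_hi
    return roots


# Illustrative dimensions only (hypothetical, not taken from any cited study).
uniform = two_tube_resonances(8.5, 3.0, 8.5, 3.0)      # area ratio 1:1
narrow_back = two_tube_resonances(8.5, 0.5, 8.5, 3.0)  # area ratio 1:6
scaled = two_tube_resonances(8.5, 1.0, 8.5, 6.0)       # same 1:6 ratio, doubled areas
print([round(f) for f in uniform[:2]])
print([round(f) for f in narrow_back[:2]])
print([round(f) for f in scaled[:2]])
```

In this idealized model only the ratio of the two areas enters the resonance condition, so doubling both areas leaves the resonances unchanged, while changing the ratio moves them. With equal areas the model reduces to a uniform tube with resonances at odd multiples of c/4L.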
Given the present state of knowledge concerning the development and evolution of the SVT, we can only be
certain about the morphology of the beginning and end points. Hominids who had snouts similar to those of
present day chimpanzees most likely had similar SVTs with phonetic capacities similar to those of
chimpanzees. Modern humans clearly have an SVT that can generate the range of formant frequencies that
conveys all of the sounds of human speech. Neanderthals, whose skeletal features supported longer mouths
than those of humans, cannot have had fully human SVTs, but we cannot specify the shapes of their SVTs with
certainty until we are able either to establish the length of their tongues or to find skeletal features that
predict the descent of the tongue. Identifying the regulatory genes that govern the development of the SVT
would prove useful.

8. Conclusion

In the years since the original Lieberman and Crelin (1971) study, several independent studies
have shown that the flexure of the cranial base angle cannot in itself be used to predict the pharyngeal
position of the tongue when reconstructing the SVT of a fossil hominid. The attribute of the human SVT that
allows it to produce the full range of human speech sounds is the fact that while the tongue is positioned
almost entirely in the mouth in other species and in human neonates, it moves down into the pharynx, carrying
the larynx down with it until the horizontal oral portion (SVTh) and vertical pharyngeal portion (SVTv) of
the SVT have equal 1:1 proportions. Studies of the development of the human SVT show that while cranial
base flexure reaches its adult values by age 2 years, the tongue continues to descend; SVTh does not equal
SVTv until ages 6–8 years. The biological mechanisms that regulate this process are presently unknown.
Therefore, the position of the tongue in the pharynx in the Lieberman and Crelin (1971) reconstruction and in
other studies cannot be stated with certainty. However, other subsequent independent studies argue against
Neanderthals being humans or having skeletal features that would support a fully human SVT. The Neanderthal SVT
reconstruction proposed by Boe and his colleagues would yield a neck, tongue, and larynx unlike those of any
human, ape, or known mammal.
Further studies, which are in progress, may establish the phonetic limitations of Neanderthal vocal tracts.
Exercises such as those published by Boe and his colleagues only confuse the issue.

References

Alberch, P. (1989). The logic of monsters. Geobios, memoire speciale, 12, 21–57.
Arensburg, B., Schepartz, L. A., Tiller, A. M., Vandermeersch, B., Duday, H., & Rak, Y. (1990). A reappraisal of the anatomical basis for
speech in middle palaeolithic hominids. American Journal of Physical Anthropology, 83, 137–146.
Baer, T., Gore, J. C., Gracco, L. C., & Nye, P. W. (1991). Analysis of vocal tract shape and dimensions using magnetic resonance imaging:
Vowels. Journal of the Acoustical Society of America, 90, 799–828.
Beckman, M. E., Jung, T.-P., Lee, S.-H., de Jong, K., Krishnamurthy, A. K., Ahalt, S. C., et al. (1995). Variability in the production of
quantal vowels revisited. Journal of the Acoustical Society of America, 97, 471–489.
Boe, L.-J., Heim, J.-L., Honda, K., & Maeda, S. (2002). The potential Neanderthal vowel space was as large as that of modern humans.
Journal of Phonetics, 30, 465–484.
Boe, L.-J., Maeda, S., & Heim, J.-L. (1999). Neanderthal man was not morphologically handicapped for speech. Evolution of
Communication, 3, 49–77.
Bosma, J. F. (1975). Anatomic and physiologic development of the speech apparatus. In D. B. Towers (Ed.), Human communication and its
disorders (pp. 469–481). New York: Raven.
Boule, M. (1911–1913). L'homme fossile de la Chapelle-aux-Saints. Annales Paleontologie, 6, 109; 7, 21, 85; 8, 1.
Buhr, R. D. (1980). The emergence of vowels in an infant. Journal of Speech and Hearing Research, 23, 75–94.
Carre, R., Lindblom, B., & MacNeilage, P. (1995). Acoustic factors in the evolution of the human vocal tract. C. R. Academie des Sciences
Paris, t320(Serie IIb), 471–476.
Carroll, S. B., Grenier, J. K., & Weatherbee, S. D. (2001). From DNA to diversity: Molecular genetics and the evolution of animal design.
Malden, MA: Blackwell Science.
Chiba, T., & Kajiyama, J. (1941). The vowel: Its nature and structure. Tokyo: Tokyo-Kaisekan Publishing Co.
Corballis, M. (2002). From hand to mouth: The origins of language. Princeton NJ: Princeton University Press.
Darwin, C. (1859). On the origin of species (Facsimile ed. 1964). Cambridge MA: Harvard University Press.
DuBrul, E. L. (1977). Origins of the speech apparatus and its reconstruction in fossils. Brain and Language, 4, 365–381.
Falk, D. (1975). Comparative anatomy of the larynx in man and chimpanzee; implications for language in Neanderthal. American Journal
of Physical Anthropology, 43, 123–132.
Fant, G. (1960). Acoustic theory of speech production. The Hague: Mouton.
Feinberg, M. J., & Ekberg, O. (1990). Deglutition after near-fatal choking episode: Radiologic evaluation. Radiology, 176, 637–640.
Fink, B. R., & Demarest, R. J. (1978). Laryngeal biomechanics. Cambridge MA: Harvard University Press.
Fitch, W. T., III (1993). Vocal tract length and the evolution of language. Ph.D. dissertation, Brown University.
Fitch, W. T. (2000). Skull dimensions in relation to body size in nonhuman mammals: The causal bases for acoustic allometry. Zoology,
103, 40–58.
Fitch, W. T., & Giedd, J. (1999). Morphology and development of the human vocal tract: A study using magnetic resonance imaging.
Journal of the Acoustical Society of America, 106, 1511–1522.
Fitch, W. T., & Reby, D. (2001). The descended larynx is not uniquely human. Proceedings of the Royal Society of London B, 268,
1669–1675.
George, S. L. (1976). The relationship between cranial base angle morphology and infant vocalizations. Sc.D. dissertation, University of
Connecticut.
George, S. L. (1978). A longitudinal and cross-sectional analysis of the growth of the postnatal cranial base angle. American Journal of
Physical Anthropology, 49, 171–178.
Heim, J.-L. (1976). Les hommes fossiles de la Ferrassie. Paris: Masson.
Heim, J.-L. (1989). La nouvelle reconstitution du crane neanderthalien de la Chapelle-aux-Saints. Methode et resultats. Bulletin et
Memoires de la Societe d'Anthropologie de Paris. n. s., I, 95–118.
Henke, W. L. (1966). Dynamic articulatory model of speech production using computer simulation. Ph.D. dissertation, MIT.
Hiiemae, K. M., Palmer, J. B., Medicis, S. W., Hegener, J., Jackson, B. S., & Lieberman, D. E. (2002). Hyoid and tongue movements in
speaking and eating. Archives of Oral Biology, 47, 11–27.
Hillenbrand, J. L., Getty, A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the
Acoustical Society of America, 97, 3099–3111.
Honda, K., & Tiede, M. K. (1998). An MRI study on the relationship between oral cavity shape and larynx position. In Proceedings of the
5th international conference on spoken language processing (Vol. 2, pp. 437–440).
Houghton, P. (1993). Neanderthal supralaryngeal vocal tract. American Journal of Physical Anthropology, 90, 139–146.
Howells, W. W. (1976). Neanderthal man: Facts and figures. In Proceedings of the ninth international congress of anthropological and
ethnological sciences, Chicago 1973. The Hague: Mouton.
Howells, W. W. (1989). Skull shapes and the map; craniometric analyses in the dispersion of modern Homo. In Papers of the Peabody
Museum of Archaeology and Ethnology (Vol. 79). Cambridge, MA: Harvard University.
Irwin, O. C. (1948). Infant speech: Development of vowel sounds. Journal of Speech and Hearing Disorders, 13, 31–34.
Ishida, R., Palmer, J. B., & Hiiemae, K. M. (2002). Hyoid motion during swallowing; factors affecting forward and upward
displacement. Dysphagia, 17, 262–272.
Krings, M., Stone, A., Schmitz, R. W., Krainitzki, H., Stoneking, M., & Paabo, S. (1997). Neanderthal DNA sequences and the origin of
modern humans. Cell, 90, 19–30.
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in
infants by 6 months of age. Science, 255, 606–608.
Ladefoged, P., De Clerk, J., Lindau, M., & Papcun, G. (1972). An auditory-motor theory of speech production. UCLA Working Papers in
Phonetics, 22, 48–76.
Laitman, J. T., & Crelin, E. S. (1976). Postnatal development of the basicranium and vocal tract region in man. In J. Bosma (Ed.),
Symposium on development of the basicranium (pp. 206–219). Washington DC: US Government Printing Office.
Laitman, J. T., & Heimbuch, R. C. (1982). The basicranium of Plio-Pleistocene hominids as an indicator of their upper respiratory
systems. American Journal of Physical Anthropology, 59, 323–344.
Laitman, J. T., Heimbuch, R. C., & Crelin, E. S. (1978). Developmental changes in a basicranial line and its relationship to the upper
respiratory system in living primates. American Journal of Anatomy, 152, 467–482.
Laitman, J. T., Heimbuch, R. C., & Crelin, E. S. (1979). The basicranium of fossil hominids as an indicator of their upper respiratory
systems. American Journal of Physical Anthropology, 51, 15–34.
Lieberman, D. E. (1995). Testing hypotheses about recent human evolution from skulls. Current Anthropology, 36, 159–198.
Lieberman, D. E., McBratney, B. M., & Krovitz, G. (2002). The evolution and development of cranial form in Homo sapiens. Proceedings
of the National Academy of Sciences (PNAS), 99, 1134–1139.
Lieberman, D. E., & McCarthy, R. C. (1999). The ontogeny of cranial base angulation in humans and chimpanzees and its implications
for reconstructing pharyngeal dimensions. Journal of Human Evolution, 36, 487–517.
Lieberman, D. E., McCarthy, R. C., Hiiemae, K. M., & Palmer, J. B. (2001). Ontogeny of postnatal hyoid and laryngeal descent:
Implications for deglutition and vocalization. Archives of Oral Biology, 46, 117–128.
Lieberman, D. E., Ross, C. F., & Ravosa, M. J. (2000). The primate cranial base: Ontogeny, function and integration. Yearbook of
Physical Anthropology, 43, 117–169.
Lieberman, P. (1968). Primate vocalizations and human linguistic ability. Journal of the Acoustical Society of America, 44, 1157–1164.
Lieberman, P. (1975). On the origins of language: An introduction to the evolution of speech. New York: Macmillan.
Lieberman, P. (1979). Hominid evolution, supralaryngeal vocal tract physiology and the fossil evidence for reconstructions. Brain and
Language, 7, 101–126.
Lieberman, P. (1982). Can chimpanzees swallow or talk? A reply to Falk. American Anthropologist, 84, 148–152.
Lieberman, P. (1984). The biology and evolution of language. Cambridge MA: Harvard University Press.
Lieberman, P. (1991). Uniquely human: The evolution of speech, thought, and selfless behavior. Cambridge MA: Harvard University Press.
Lieberman, P. (1993). The Kebara KMH-2 hyoid and Neanderthal speech. Current Anthropology, 34, 172–175.
Lieberman, P. (1994). Functional tongues and Neanderthal vocal tract reconstruction: A reply to Houghton (1993). American Journal of
Physical Anthropology, 95, 443–452.
Lieberman, P. (1998). Eve spoke; human language and human evolution. New York: W. W. Norton; London: Picador, Macmillan.
Lieberman, P. (2000). Human language and our reptilian brain: The subcortical bases of speech, syntax, and thought. Cambridge Mass:
Harvard University Press.
Lieberman, P., & Crelin, E. S. (1971). On the speech of Neanderthal man. Linguistic Inquiry, 2, 203–222.
Lieberman, P., Crelin, E. S., & Klatt, D. H. (1972). Phonetic ability and related anatomy of the newborn, adult human, Neanderthal man,
and the chimpanzee. American Anthropologist, 74, 287–307.
Lieberman, P., Klatt, D. H., & Wilson, W. H. (1969). Vocal tract limitations on the vowel repertoires of rhesus monkey and other
nonhuman primates. Science, 164, 1185–1187.
Lotto, A. J., Kluender, K. R., & Holt, L. L. (1998). Depolarizing the perceptual magnet effect. Journal of the Acoustical Society of
America, 103, 3448–3655.
Mahajan, P. V., & Bharucha, B. A. (1994). Evaluation of short neck: Percentiles and linear correlations with height and sitting height.
Indian Pediatrics, 31, 1193–1203.
Maresh, M. M. (1948). Growth of the heart related to bodily growth during childhood and adolescence. Pediatrics, 2, 382–402.
Maresh, M. M., & Washburn, A. H. (1938). Size of the heart in healthy children. American Journal of Disease in Children, 56, 33–60.
McBrearty, S., & Brooks, A. S. (2000). The revolution that wasn't: A new interpretation of the origin of modern human behavior. Journal
of Human Evolution, 39, 453–563.
McCammon, R. (1970). Human growth and development. Springfield: Thomas.
Nearey, T. (1979). Phonetic features for vowels. Bloomington: Indiana University Linguistics Club.
Negus, V. E. (1949). The comparative anatomy and physiology of the larynx. New York: Hafner.
Nishimura, T., Mikami, A., Suzuki, J., & Matsuzawa, T. (2003). Descent of the larynx in chimpanzee infants. Proceedings of the National
Academy of Sciences, USA, 100, 6930–6933.
Ovchinnikov, I. V., Gotherstrom, A., Romanova, G. P., Kharitonov, V. M., Liden, K., & Goodwin, W. (2000). Molecular analysis of
Neanderthal DNA from the northern Caucasus. Nature, 404, 490–493.
Palmer, J. B., Rudin, N. J., Lara, G., & Crompton, A. W. (1992). Coordination of mastication and swallowing. Dysphagia, 7, 187–200.
Perkell, J. S. (1969). Physiology of speech production: results and implications of a quantitative cineradiographic study. Cambridge MA.:
MIT Press.
Rendall, D., Kollias, S., Ney, C., & Loyd, P. (in press). Pitch (Fo) and formant profiles of human and vowel-like baboon grunts: The role
of vocalizer body size and voice-acoustic allometry. The Journal of the Acoustical Society of America.
Stevens, K. N. (1972). Quantal nature of speech. In E. E. David Jr., & P. B. Denes (Eds.), Human communication: A unified view
(pp. 51–66). New York: McGraw Hill.
Stevens, K. N., & House, A. S. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of
America, 27, 484–493.
Story, B. H., & Titze, I. R. (1998). Parameterization of vocal tract area functions by empirical orthogonal modes. Journal of Phonetics, 26,
223–260.
Story, B. H., Titze, I. R., & Hoffman, E. A. (1996). Vocal tract area functions from magnetic resonance imaging. Journal of the Acoustical
Society of America, 100, 537–554.
Straus, W. L., Jr., & Cave, A. J. E. (1957). Pathology and posture of Neanderthal man. Quarterly Review of Biology, 32, 348–363.
Truby, H. L., Bosma, J. F., & Lind, J. (1965). Newborn infant cry. Upsalla: Almquist and Wiksell.
Weisengrubber, G. E., Forstenpointner, G., Peters, G., Kubber-Heiss, A., & Fitch, W. T. (2002). Hyoid apparatus and pharynx in the lion
(Panthera leo), jaguar (Panthera onca), tiger (Panthera tigris), cheetah (Acinonyx jubatus) and domestic cat (Felis silvestris f. catus). Journal
of Anatomy, 201, 195–201.
