
INTRODUCTION TO THE ACOUSTICS OF SPEECH PRODUCTION

Richard M. Stern
Department of Electrical and Computer Engineering and School of Computer Science
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
Telephone: (412) 268-2535
FAX: (412) 268-3890
INTERNET: rms@cs.cmu.edu
January 15, 2004

INTRODUCTION

• We will talk about how speech is produced:
  - The source-filter theory
  - Aspects of acoustics
  - The wave equation
  - Approximate solutions to the wave equation

Carnegie Mellon

Slide 2

ECE and SCS Robust Speech Group

SOME ADDITIONAL READING

• Speech Communications: Human and Machine by D. O'Shaughnessy, Chapter 3, Wiley 1999
• Digital Processing of Speech Signals by L. R. Rabiner and R. W. Schafer, Chapter 3, Prentice-Hall 1978
• Speech Analysis, Synthesis, and Perception by J. L. Flanagan, Springer-Verlag (out of print but in E&S Library)


THE SOURCE-FILTER MODEL FOR SPEECH

A useful model for representing the generation of speech sounds:

[Figure: block diagram of the source-filter model. Labeled elements: pitch, pulse train source, noise source, amplitude, vocal tract model, p[n].]
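A minimal sketch of this model in code may help make the signal flow concrete. It assumes the vocal tract model can be approximated by a cascade of two-pole resonators; that filter structure, the function names, the sampling rate, and the formant values are illustrative assumptions, not taken from the slides.

```python
import math
import random

def excitation(n_samples, fs, f0=None, seed=0):
    """Pulse train at f0 Hz when f0 is given (voiced), else white noise (unvoiced)."""
    if f0 is None:
        rng = random.Random(seed)
        return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    period = int(round(fs / f0))  # pitch period in samples
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(x, fs, freq, bandwidth):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / fs)          # pole radius from bandwidth
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + a1 * y1 + a2 * y2
        y.append(yn)
        y2, y1 = y1, yn
    return y

def synthesize(n_samples, fs, formants, amplitude=1.0, f0=None):
    """Amplitude-scaled source passed through a cascade of resonators."""
    signal = [amplitude * s for s in excitation(n_samples, fs, f0)]
    for freq, bandwidth in formants:
        signal = resonator(signal, fs, freq, bandwidth)
    return signal

FS = 10000  # assumed sampling rate in Hz
FORMANTS = [(500.0, 60.0), (1500.0, 90.0), (2500.0, 120.0)]  # illustrative (Hz, Hz)
voiced = synthesize(2000, FS, FORMANTS, f0=100.0)    # pulse-train excitation
unvoiced = synthesize(2000, FS, FORMANTS, f0=None)   # noise excitation
```

Changing only `f0` changes the excitation while the filter stays fixed, which is the separation the model is meant to capture.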



THE SPEECH SPECTROGRAM

[Figure: speech spectrogram. Vertical axis: frequency, 0 to 5000; horizontal axis: time, 0 to 1.2.]


SEPARATING THE VOCAL-TRACT EXCITATION FROM THE FILTER

• Original speech
• Speech with 75-Hz excitation
• Speech with 150-Hz excitation
• Speech with noise excitation


WHY IS ACOUSTICAL ANALYSIS DIFFICULT?


• Methods of signal analysis depend on the size of the wavelengths compared to the size of the objects
  - Long wavelengths: lumped-parameter analysis
  - Short wavelengths: distributed-parameter analysis

• Sound waves are long compared to objects at low frequencies but short compared to objects at high frequencies
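To put rough numbers on this, the sketch below compares the wavelength c/f with an object the size of the vocal tract. The speed of sound (350 m/s) and the tract length (17.5 cm) are assumed round figures, not values from the slides.

```python
C = 350.0             # assumed speed of sound in m/s
TRACT_LENGTH = 0.175  # assumed vocal tract length in m

def wavelength(freq_hz, c=C):
    """Wavelength in metres of a sound wave at freq_hz."""
    return c / freq_hz

def wavelength_to_size_ratio(freq_hz, size=TRACT_LENGTH):
    """Ratio well above 1 suggests lumped analysis; near or below 1, distributed."""
    return wavelength(freq_hz) / size

for f in (100.0, 1000.0, 5000.0):
    print(f, wavelength(f), wavelength_to_size_ratio(f))
```

Under these assumptions the wavelength is many tract lengths at 100 Hz but shorter than the tract at 5000 Hz, so neither regime covers the whole speech band.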


THE ACOUSTIC THEORY OF SPEECH PRODUCTION: MODELING THE VOCAL TRACT

• The pressure at a distance r is

$$P_r(\omega) = S(\omega)\,T(\omega)\,R(\omega)$$

where

$$S(\omega) = U_G(\omega), \qquad T(\omega) = \frac{U_L(\omega)}{U_G(\omega)}, \qquad R(\omega) = \frac{P_r(\omega)}{U_L(\omega)}$$
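As a consistency check on these definitions, the intermediate quantities cancel in the product, and on a decibel scale the three contributions simply add. The subscripts G and L are read here as glottis and lips, which the slide does not state explicitly.

```latex
\begin{align}
S(\omega)\,T(\omega)\,R(\omega)
  &= U_G(\omega)\cdot\frac{U_L(\omega)}{U_G(\omega)}\cdot\frac{P_r(\omega)}{U_L(\omega)}
   = P_r(\omega) \\
20\log_{10}\lvert P_r(\omega)\rvert
  &= 20\log_{10}\lvert S(\omega)\rvert
   + 20\log_{10}\lvert T(\omega)\rvert
   + 20\log_{10}\lvert R(\omega)\rvert
\end{align}
```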

UNVOICED SPEECH SOURCE


• Turbulent (unvoiced) excitation sources have spectra that are approximately flat in frequency.


VOICED SPEECH SOURCE


• Glottal pulses have a spectrum that decreases with the square of frequency.
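Spectral tilt of this kind is often quoted in decibels per octave. The sketch below converts a power-law spectrum into that unit, assuming that "decreases with the square of frequency" refers to the magnitude spectrum, so that it is proportional to 1/f^2. The function name is hypothetical.

```python
import math

def slope_db_per_octave(exponent):
    """dB change per doubling of frequency for a magnitude spectrum proportional to f**exponent."""
    return 20.0 * math.log10(2.0 ** exponent)

glottal_slope = slope_db_per_octave(-2)  # magnitude falling as 1/f^2
flat_slope = slope_db_per_octave(0)      # flat spectrum, as for a turbulent source
print(glottal_slope, flat_slope)
```

Under that reading the voiced source falls by roughly 12 dB per octave, while a flat source has zero slope.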


DERIVATION OF THE WAVE EQUATION


• Consider sound propagating along a 1-dimensional tube (with cross-sectional area A whose dimensions are much smaller than a wavelength)

• Define:
  - u(x, t): particle velocity
  - U(x, t): volume velocity (U = uA)
  - p(x, t): sound pressure variation (P = P0 + p)
  - ρ: density of air
  - c: velocity of sound

• Comment: sound pressure variation and volume velocity play the same role in acoustical analysis that voltage and current play in electrical circuit analysis

DERIVATION OF THE WAVE EQUATION


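• Assuming plane-wave propagation, it can be shown that

$$-\frac{\partial p}{\partial x} = \rho\,\frac{\partial u}{\partial t} \qquad \text{(Newton's second law)}$$

and

$$-\frac{\partial u}{\partial x} = \frac{1}{\rho c^2}\,\frac{\partial p}{\partial t} \qquad \text{(from the universal gas law)}$$

where

$$\rho c^2 = \gamma P$$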
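Eliminating p between these two relations gives a single equation in u. A sketch of that step, assuming ρ and c are constant along the tube and that the order of the mixed partial derivatives can be exchanged:

```latex
\begin{align}
-\frac{\partial p}{\partial x} &= \rho\,\frac{\partial u}{\partial t},
\qquad
-\frac{\partial u}{\partial x} = \frac{1}{\rho c^{2}}\,\frac{\partial p}{\partial t} \\
\intertext{Differentiate the first with respect to $t$ and the second with respect to $x$:}
-\frac{\partial^{2} p}{\partial x\,\partial t} &= \rho\,\frac{\partial^{2} u}{\partial t^{2}},
\qquad
-\frac{\partial^{2} u}{\partial x^{2}} = \frac{1}{\rho c^{2}}\,\frac{\partial^{2} p}{\partial t\,\partial x} \\
\intertext{Substitute the mixed derivative of $p$ from the first into the second:}
\frac{\partial^{2} u}{\partial x^{2}} &= \frac{1}{c^{2}}\,\frac{\partial^{2} u}{\partial t^{2}}
\end{align}
```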


SOLUTIONS TO THE WAVE EQUATION


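• From Newton's law and the universal gas law we obtain the wave equation:

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2}$$

• Solutions to the wave equation (in one dimension) are of the form:

$$u(x,t) = u^+\!\left(t - \frac{x}{c}\right) - u^-\!\left(t + \frac{x}{c}\right)$$

$$p(x,t) = \rho c\left[u^+\!\left(t - \frac{x}{c}\right) + u^-\!\left(t + \frac{x}{c}\right)\right]$$

• Comments:
  - These are travelling-wave solutions
  - We frequently evaluate for the sinusoidal steady state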
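The travelling-wave form can be checked numerically. The sketch below takes an arbitrary smooth forward-travelling waveform and compares the two sides of the wave equation using central finite differences. The Gaussian pulse shape, the speed of sound, the evaluation point, and the step sizes are all assumed values chosen for illustration.

```python
import math

C = 350.0  # assumed speed of sound in m/s

def pulse(tau):
    """Hypothetical smooth waveform shape: a Gaussian about 1 ms wide."""
    return math.exp(-(1000.0 * tau) ** 2)

def u(x, t):
    """Forward-travelling wave u+(t - x/c)."""
    return pulse(t - x / C)

def second_difference(f, a, h):
    """Central finite-difference estimate of the second derivative of f at a."""
    return (f(a + h) - 2.0 * f(a) + f(a - h)) / (h * h)

X0, T0 = 0.1, 0.0005  # evaluation point (m, s), chosen arbitrarily
d2u_dx2 = second_difference(lambda x: u(x, T0), X0, 1e-3)
d2u_dt2 = second_difference(lambda t: u(X0, t), T0, 1e-6)

# The wave equation requires d2u/dx2 = (1/c^2) d2u/dt2.
residual = d2u_dx2 - d2u_dt2 / C ** 2
relative_error = abs(residual) / abs(d2u_dx2)
print(relative_error)
```

The relative error should be small and limited only by the finite-difference approximation. Replacing `t - x / C` with `t + x / C` gives the backward-travelling case.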



THE SINUSOIDAL STEADY STATE: COMPLEX AMPLITUDE


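• Recall that

$$e^{jx} = \cos(x) + j\sin(x), \qquad \cos(x) = \mathrm{Re}[e^{jx}], \qquad \sin(x) = \mathrm{Im}[e^{jx}]$$

• Let $u^+(x,t)$ be of the form $\mathrm{Re}[U^+ e^{j\omega(t - x/c)}]$

• Then

$$u(x,t) = u^+\!\left(t - \frac{x}{c}\right) - u^-\!\left(t + \frac{x}{c}\right)$$

$$= \mathrm{Re}\!\left[U^+ e^{j\omega(t - x/c)} - U^- e^{j\omega(t + x/c)}\right]$$

$$= \mathrm{Re}\!\left[\left(U^+ e^{-j\omega x/c} - U^- e^{+j\omega x/c}\right)e^{j\omega t}\right] \equiv \mathrm{Re}\!\left[U(x,\omega)\,e^{j\omega t}\right]$$

• $U(x,\omega)$ is referred to as the complex amplitude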
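A small numerical check can make the complex-amplitude idea concrete: the time-domain sum of two sinusoidal travelling waves should equal the real part of U(x, ω)e^{jωt}. The function names, the speed of sound, and the test values below are hypothetical.

```python
import cmath
import math

C = 350.0  # assumed speed of sound in m/s

def u_time_domain(U_plus, U_minus, omega, x, t):
    """u(x,t) = u+(t - x/c) - u-(t + x/c) with sinusoidal travelling waves."""
    forward = (U_plus * cmath.exp(1j * omega * (t - x / C))).real
    backward = (U_minus * cmath.exp(1j * omega * (t + x / C))).real
    return forward - backward

def complex_amplitude(U_plus, U_minus, omega, x):
    """U(x, omega) = U+ e^{-j omega x/c} - U- e^{+j omega x/c}."""
    return (U_plus * cmath.exp(-1j * omega * x / C)
            - U_minus * cmath.exp(1j * omega * x / C))

def u_from_phasor(U_plus, U_minus, omega, x, t):
    """Recover u(x,t) as Re[U(x, omega) e^{j omega t}]."""
    return (complex_amplitude(U_plus, U_minus, omega, x)
            * cmath.exp(1j * omega * t)).real

# Arbitrary illustrative values
U_PLUS, U_MINUS = 1.0 + 0.5j, 0.3 - 0.2j
OMEGA = 2.0 * math.pi * 300.0
difference = abs(u_time_domain(U_PLUS, U_MINUS, OMEGA, -0.1, 0.0012)
                 - u_from_phasor(U_PLUS, U_MINUS, OMEGA, -0.1, 0.0012))
print(difference)
```

The two routes agree to rounding error, which is why the time dependence can be dropped and only U(x, ω) carried through the analysis.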



COMPLEX AMPLITUDES OF GENERAL SOLUTIONS TO THE WAVE EQUATION


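• In the sinusoidal steady state we have

$$p(x,t) = \mathrm{Re}\!\left[P(x,\omega)\,e^{j\omega t}\right], \qquad u(x,t) = \mathrm{Re}\!\left[U(x,\omega)\,e^{j\omega t}\right]$$

where

$$P(x,\omega) = \rho c\left[U^+ e^{-j\omega x/c} + U^- e^{+j\omega x/c}\right]$$

$$U(x,\omega) = U^+ e^{-j\omega x/c} - U^- e^{+j\omega x/c}$$

SOUND PROPAGATION IN A UNIFORM TUBE

[Figure: uniform tube with the glottis at x = -l and the lips at x = 0.]

• Boundary conditions:

$$p(0,t) = 0 \;\Rightarrow\; U^+ = -U^-$$

$$u(-l,t) = U_0\cos(\omega_0 t) \;\Rightarrow\; U^+\left(e^{j\omega_0 l/c} + e^{-j\omega_0 l/c}\right) = U_0 \;\Rightarrow\; U^+ = \frac{U_0}{2\cos(\omega_0 l/c)}$$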


SOUND PROPAGATION IN A UNIFORM TUBE

[Figure: uniform tube with the glottis at x = -l and the lips at x = 0.]

• With boundary conditions solved, we obtain:

$$u(0,t) = \frac{U_0\cos(\omega_0 t)}{\cos(\omega_0 l/c)}$$

• Comments:
  - Resonant frequencies occur when $\omega_0 l/c = \pi/2,\ 3\pi/2,\ 5\pi/2,\ \ldots$
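Solving the resonance condition for frequency gives f = (2k - 1)c/(4l) for k = 1, 2, 3, and so on. The sketch below evaluates this for an assumed tube length of 17.5 cm and an assumed speed of sound of 350 m/s; neither value is given on the slide.

```python
import math

C = 350.0       # assumed speed of sound in m/s
LENGTH = 0.175  # assumed tube length in m

def resonant_frequencies(count, length=LENGTH, c=C):
    """First `count` frequencies in Hz at which omega*l/c = (2k - 1)*pi/2."""
    return [(2 * k - 1) * c / (4.0 * length) for k in range(1, count + 1)]

def denominator(freq_hz, length=LENGTH, c=C):
    """cos(omega*l/c), which vanishes at a resonance of the lossless tube."""
    return math.cos(2.0 * math.pi * freq_hz * length / c)

formants = resonant_frequencies(3)
print(formants)  # roughly 500, 1500 and 2500 Hz for these assumed values
```

With these numbers the resonances are evenly spaced at odd multiples of the lowest one, as expected for a uniform tube that is closed at one end and open at the other.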

UNIFORM TUBES: RESONANT FREQUENCIES

[Figure: amplitude (10 to 50) versus frequency (100 to 1000) for the uniform tube.]

• Comments:
  - $U_L(\omega)/U_G(\omega)$ is of the form $A/\cos(\omega l/c)$
  - With nonideal absorptive walls, the response is not infinite at the resonant frequencies
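The effect of losses can be illustrated with a toy model. The sketch below compares the lossless gain |1/cos(ωl/c)| with a version in which the wavenumber is given a small imaginary part. That loss model, the loss factor, and the tube dimensions are assumptions made purely for illustration; the slides do not specify how wall absorption is modeled.

```python
import cmath
import math

C = 350.0       # assumed speed of sound in m/s
LENGTH = 0.175  # assumed tube length in m

def lossless_gain(freq_hz):
    """|1/cos(omega*l/c)| with A = 1; grows without bound near a resonance."""
    return abs(1.0 / math.cos(2.0 * math.pi * freq_hz * LENGTH / C))

def lossy_gain(freq_hz, loss=0.02):
    """Same form with a hypothetical complex wavenumber k*(1 - j*loss)."""
    k = 2.0 * math.pi * freq_hz / C
    return abs(1.0 / cmath.cos(k * (1.0 - 1j * loss) * LENGTH))

peak = lossy_gain(500.0)       # at the first resonance for these assumed values
off_peak = lossy_gain(400.0)
print(peak, off_peak)
```

In this toy model the peak stays finite, and its height is set by the loss factor rather than by A alone.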

SOUND PROPAGATION IN A MORE REALISTIC UNIFORM TUBE

[Figure: uniform tube with the glottis at x = -l and the lips at x = 0.]

• Comment: the resonant frequencies are now non-uniformly spaced


VOWEL PRODUCTION IN THE VOCAL TRACT


SOME EXAMPLE VOWELS

[Figure: spectrogram of example vowels. Vertical axis: frequency, 0 to 5000; horizontal axis: time, 0 to 4.]


VOWEL PERCEPTION AND FORMANT FREQUENCIES


RADIATION IMPEDANCE: THE FINAL STEP

• Radiation impedance models the effect of the air load on the output pressure wave:

$$R(\omega) = P_r(\omega)/U_L(\omega)$$

• At most frequencies of interest, radiation impedance has the form

$$R(\omega) = P_r(\omega)/U_L(\omega) \propto j\omega$$

• Effects of air loading are typically absorbed into the filter effects of the vocal tract
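A short worked calculation shows what this form implies for spectral tilt. It assumes R(ω) is exactly proportional to jω and, for the combined result, that the voiced source magnitude is proportional to 1/ω², as stated on the voiced speech source slide.

```latex
\begin{align}
20\log_{10}\left\lvert\frac{R(2\omega)}{R(\omega)}\right\rvert
  &= 20\log_{10} 2 \approx 6.02\ \text{dB per octave} \\
\lvert S(\omega)\,R(\omega)\rvert
  &\propto \frac{1}{\omega^{2}}\cdot\omega = \frac{1}{\omega}
  \quad\Longrightarrow\quad
  \text{a net slope of about } -6.02\ \text{dB per octave}
\end{align}
```

Under these assumptions radiation offsets half of the source roll-off, leaving the vocal tract resonances superimposed on a gently falling spectrum.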


SUMMARY
• We have discussed very superficially the production of speech sounds:
  - Source-filter model
  - Vocal tract transfer functions
  - Impact on perception
  - Some attributes of acoustical analysis

• The source-filter model is used:
  - As a way to model how we produce speech sounds
  - As a way to reduce the number of parameters needed to characterize speech sounds
  - As a way of extracting features that are used by speech recognition systems
