INTRODUCTION TO THE ACOUSTICS OF SPEECH PRODUCTION
Richard M. Stern
Department of Electrical and Computer Engineering and School of Computer Science
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
Telephone: (412) 268-2535
FAX: (412) 268-3890
INTERNET: rms@cs.cmu.edu
January 15, 2004
INTRODUCTION
- Will talk about how speech is produced:
  - The source-filter theory
  - Aspects of acoustics
  - The wave equation
  - Approximate solutions to the wave equation
Carnegie Mellon
Slide 2
ECE and SCS Robust Speech Group
SOME ADDITIONAL READING
- Speech Communications: Human and Machine by D. O'Shaughnessy, Chapter 3, Wiley 1999
- Digital Processing of Speech Signals by L. R. Rabiner and R. W. Schafer, Chapter 3, Prentice-Hall 1978
- Speech Analysis, Synthesis, and Perception by J. L. Flanagan, Springer-Verlag (out of print but in E&S Library)
THE SOURCE-FILTER MODEL FOR SPEECH
A useful model for representing the generation of speech sounds:
[Block diagram with the labels: pitch, pulse train source, noise source, amplitude, vocal tract model, p[n]]

THE SPEECH SPECTROGRAM
[Figure: speech spectrogram; frequency axis 0 to 5000, time axis 0 to 1.2]
SEPARATING THE VOCAL-TRACT EXCITATION FROM THE FILTER
- Original speech
- Speech with 75-Hz excitation
- Speech with 150-Hz excitation
- Speech with noise excitation
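The examples listed here keep the vocal-tract filter fixed while swapping the excitation. A minimal sketch of that idea follows, using only the Python standard library. The sampling rate, the formant frequencies and bandwidths, and all function names are illustrative assumptions, not values taken from the slides, and the cascade of two-pole resonators is just one simple stand-in for the vocal tract model.

```python
import math
import random

FS = 10000  # assumed sampling rate in Hz


def pulse_train(f0, n_samples, fs=FS):
    """Unit impulses spaced one pitch period apart (voiced excitation)."""
    period = round(fs / f0)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Gaussian white noise (unvoiced excitation)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def resonator(x, freq, bandwidth, fs=FS):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / fs)          # pole radius (< 1, stable)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for sample in x:
        out = sample + a1 * y1 + a2 * y2
        y.append(out)
        y1, y2 = out, y1
    return y


def vocal_tract(x, formants=((500, 80), (1500, 100), (2500, 120))):
    """Cascade of resonators; (frequency, bandwidth) pairs in Hz are assumed."""
    for freq, bw in formants:
        x = resonator(x, freq, bw)
    return x


def synthesize(excitation, amplitude=1.0):
    """Source-filter synthesis: amplitude-scaled excitation through the filter."""
    return [amplitude * s for s in vocal_tract(excitation)]


n = FS // 2  # half a second of signal
speech_75 = synthesize(pulse_train(75, n))
speech_150 = synthesize(pulse_train(150, n))
speech_noise = synthesize(white_noise(n))
```

All three outputs share the same spectral envelope because the filter is unchanged; only the pitch, or its absence, differs.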
WHY IS ACOUSTICAL ANALYSIS DIFFICULT?

- Methods of signal analysis depend on the size of the wavelengths compared to the size of the objects:
  - Long wavelengths: lumped-parameter analysis
  - Short wavelengths: distributed-parameter analysis
- Sound waves are long compared to objects at low frequencies but short compared to objects at high frequencies
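To make the comparison concrete, the wavelength is the speed of sound divided by the frequency. The short sketch below assumes a speed of sound of 340 m/s and a vocal tract about 17 cm long; both are typical textbook values rather than figures from the slides.

```python
C = 340.0            # assumed speed of sound in air, m/s
TRACT_LENGTH = 0.17  # assumed vocal-tract length, m


def wavelength(freq_hz, c=C):
    """Wavelength in metres of a sound wave at the given frequency."""
    return c / freq_hz


for f in (100, 1000, 5000):
    lam = wavelength(f)
    print(f"{f:5d} Hz: wavelength {lam:.3f} m, {lam / TRACT_LENGTH:.1f} tract lengths")
```

Under these assumptions the wavelength is about 3.4 m at 100 Hz, far longer than the tract, and about 6.8 cm at 5 kHz, shorter than the tract, so no single analysis regime covers the whole speech band.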
THE ACOUSTIC THEORY OF SPEECH PRODUCTION: MODELING THE VOCAL TRACT
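- The pressure at a distance r is

$$P_r(\omega) = S(\omega)\,T(\omega)\,R(\omega)$$

where

$$S(\omega) = U_G(\omega), \qquad T(\omega) = \frac{U_L(\omega)}{U_G(\omega)}, \qquad R(\omega) = \frac{P_r(\omega)}{U_L(\omega)}$$

Here $S$ is the source, $T$ the vocal-tract transfer function, and $R$ the radiation characteristic; $U_G$ and $U_L$ are the volume velocities at the glottis and at the lips.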
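As a quick check on the factorization, using only the three definitions given on this slide, the intermediate volume velocities cancel and the product collapses to the radiated pressure:

```latex
S(\omega)\,T(\omega)\,R(\omega)
  = U_G(\omega)\cdot\frac{U_L(\omega)}{U_G(\omega)}\cdot\frac{P_r(\omega)}{U_L(\omega)}
  = P_r(\omega)
```

The factorization is therefore an identity; its value lies in letting the source, the tract, and the radiation be modeled one at a time.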
UNVOICED SPEECH SOURCE

- Turbulent (unvoiced) sources are approximately flat in frequency.
VOICED SPEECH SOURCE

- Glottal pulses have a spectrum that decreases with the square of frequency.
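A spectrum that falls with the square of frequency can be restated as a slope in decibels per octave. Assuming the idealized magnitude $|S(f)| \propto 1/f^2$ (the proportionality constant is immaterial), doubling the frequency gives:

```latex
20\log_{10}\frac{|S(2f)|}{|S(f)|}
  = 20\log_{10}\frac{f^{2}}{(2f)^{2}}
  = 20\log_{10}\frac{1}{4}
  \approx -12\ \text{dB per octave}
```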
DERIVATION OF THE WAVE EQUATION

- Consider sound propagating along a 1-dimensional tube (with cross-sectional dimensions much smaller than a wavelength)
- Define:
  - $u(x,t)$: particle velocity
  - $U(x,t)$: volume velocity ($U = uA$, where $A$ is the cross-sectional area of the tube)
  - $p(x,t)$: sound pressure variation ($P = P_0 + p$)
  - $\rho$: density of air
  - $c$: velocity of sound
- Comment: sound pressure variation and volume velocity play the same role in acoustical analysis that voltage and current play in electrical circuit analysis
DERIVATION OF THE WAVE EQUATION

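- Assuming plane-wave propagation, it can be shown that:

$$-\frac{\partial p}{\partial x} = \rho\,\frac{\partial u}{\partial t}$$

(Newton's second law)

and

$$-\frac{\partial u}{\partial x} = \frac{1}{\rho c^2}\,\frac{\partial p}{\partial t}$$

(from the universal gas law)

where $\rho c^2 = \gamma P$ and $\gamma$ is the ratio of specific heats of air.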
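As a sketch of the intermediate step, the two first-order relations can be combined into one second-order equation for $u$. Write them as $-\partial p/\partial x = \rho\,\partial u/\partial t$ and $-\partial u/\partial x = (1/\rho c^2)\,\partial p/\partial t$; the signs are assumed to be the ones that make the travelling-wave solutions consistent. Differentiating the first with respect to $t$ and the second with respect to $x$, then eliminating the mixed derivative of $p$, gives:

```latex
\begin{aligned}
-\frac{\partial^2 p}{\partial x\,\partial t} &= \rho\,\frac{\partial^2 u}{\partial t^2}, \\
-\frac{\partial^2 u}{\partial x^2}
  &= \frac{1}{\rho c^2}\,\frac{\partial^2 p}{\partial x\,\partial t}
   = -\frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2}
  \quad\Longrightarrow\quad
  \frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2}.
\end{aligned}
```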
SOLUTIONS TO THE WAVE EQUATION

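- From Newton's law and the universal gas law we obtain the wave equation:

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2}$$

- Solutions to the wave equation (in one dimension) are of the form:

$$u(x,t) = u^+\!\left(t - \frac{x}{c}\right) - u^-\!\left(t + \frac{x}{c}\right)$$

$$p(x,t) = \rho c \left[ u^+\!\left(t - \frac{x}{c}\right) + u^-\!\left(t + \frac{x}{c}\right) \right]$$

- Comments:
  - Travelling-wave solutions
  - We frequently evaluate for the sinusoidal steady state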
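The travelling-wave form can be checked numerically. The sketch below picks two arbitrary waveforms for the forward and backward waves, builds u and p from them, and evaluates by central differences how far they are from satisfying the two first-order relations (Newton's law and the gas law). The air density, speed of sound, waveforms, and step sizes are all illustrative assumptions.

```python
import math

RHO = 1.2   # assumed density of air, kg/m^3
C = 340.0   # assumed speed of sound, m/s


def u_plus(tau):
    """Arbitrary forward-travelling waveform (illustrative choice)."""
    return math.sin(2.0 * math.pi * 500.0 * tau)


def u_minus(tau):
    """Arbitrary backward-travelling waveform (illustrative choice)."""
    return 0.3 * math.cos(2.0 * math.pi * 300.0 * tau)


def u(x, t):
    return u_plus(t - x / C) - u_minus(t + x / C)


def p(x, t):
    return RHO * C * (u_plus(t - x / C) + u_minus(t + x / C))


def d_dx(f, x, t, h=1e-4):
    """Central-difference estimate of the partial derivative in x."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)


def d_dt(f, x, t, h=1e-7):
    """Central-difference estimate of the partial derivative in t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)


def residuals(x, t):
    """Left side minus right side of the two first-order equations."""
    newton = -d_dx(p, x, t) - RHO * d_dt(u, x, t)
    gas = -d_dx(u, x, t) - d_dt(p, x, t) / (RHO * C ** 2)
    return newton, gas


print(residuals(-0.1, 0.002))  # both values should be tiny relative to the terms
```

Both residuals come out many orders of magnitude smaller than the individual terms, which is what one expects if the form is a solution for any choice of the two waveforms.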

THE SINUSOIDAL STEADY STATE: COMPLEX AMPLITUDE

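- Recall that

$$e^{jx} = \cos(x) + j\sin(x), \qquad \cos(x) = \mathrm{Re}[e^{jx}], \qquad \sin(x) = \mathrm{Im}[e^{jx}]$$

- Let $u^+(x,t)$ be of the form $\mathrm{Re}[U^+ e^{j\omega(t - x/c)}]$
- Then

$$\begin{aligned}
u(x,t) &= u^+\!\left(t - \frac{x}{c}\right) - u^-\!\left(t + \frac{x}{c}\right) \\
&= \mathrm{Re}[U^+ e^{j\omega(t - x/c)}] - \mathrm{Re}[U^- e^{j\omega(t + x/c)}] \\
&= \mathrm{Re}\!\left[\left(U^+ e^{-j\omega x/c} - U^- e^{+j\omega x/c}\right) e^{j\omega t}\right] \\
&\equiv \mathrm{Re}[U(x,\omega)\, e^{j\omega t}]
\end{aligned}$$

- $U(x,\omega)$ is referred to as the complex amplitude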
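The step from the time-domain expression to the complex amplitude can be verified with a few lines of standard-library Python. The sketch below evaluates the particle velocity twice, once as the difference of two real sinusoidal travelling waves and once as the real part of the complex amplitude times the complex exponential in time. The speed of sound and the particular values chosen for the forward and backward amplitudes are arbitrary assumptions.

```python
import cmath
import math

C = 340.0  # assumed speed of sound, m/s


def u_time(x, t, U_plus, U_minus, omega):
    """u(x,t) as the difference of forward and backward real sinusoids."""
    fwd = (U_plus * cmath.exp(1j * omega * (t - x / C))).real
    bwd = (U_minus * cmath.exp(1j * omega * (t + x / C))).real
    return fwd - bwd


def complex_amplitude(x, U_plus, U_minus, omega):
    """U(x, w) = U+ exp(-j w x / c) - U- exp(+j w x / c)."""
    k = omega / C
    return U_plus * cmath.exp(-1j * k * x) - U_minus * cmath.exp(1j * k * x)


def u_phasor(x, t, U_plus, U_minus, omega):
    """u(x,t) recovered as Re[U(x, w) exp(j w t)]."""
    amp = complex_amplitude(x, U_plus, U_minus, omega)
    return (amp * cmath.exp(1j * omega * t)).real


U_PLUS = 0.8 * cmath.exp(0.3j)    # illustrative complex amplitudes
U_MINUS = 0.5 * cmath.exp(-1.1j)
OMEGA = 2.0 * math.pi * 700.0

diff = u_time(-0.1, 0.003, U_PLUS, U_MINUS, OMEGA) - u_phasor(-0.1, 0.003, U_PLUS, U_MINUS, OMEGA)
print(abs(diff))  # agreement up to floating-point rounding
```

The practical benefit is that the time dependence factors out, so boundary conditions can be applied to the complex amplitude alone.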

COMPLEX AMPLITUDES OF GENERAL SOLUTIONS TO THE WAVE EQUATION

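- In the sinusoidal steady state we have

$$p(x,t) = \mathrm{Re}[P(x,\omega)\, e^{j\omega t}], \qquad u(x,t) = \mathrm{Re}[U(x,\omega)\, e^{j\omega t}]$$

where, consistent with the travelling-wave solutions $u = u^+ - u^-$ and $p = \rho c\,(u^+ + u^-)$,

$$P(x,\omega) = \rho c \left[ U^+ e^{-j\omega x/c} + U^- e^{+j\omega x/c} \right]$$

$$U(x,\omega) = U^+ e^{-j\omega x/c} - U^- e^{+j\omega x/c}$$

SOUND PROPAGATION IN A UNIFORM TUBE
[Diagram: uniform tube with the glottis at x = -l and the lips at x = 0]
- Boundary conditions:
  - $p(0,t) = 0 \;\Rightarrow\; U^- = -U^+$
  - $u(-l,t) = U_0 \cos(\omega_0 t) \;\Rightarrow\; U^+ = \dfrac{U_0}{2\cos(\omega_0 l / c)}$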
SOUND PROPAGATION IN A UNIFORM TUBE
[Diagram: uniform tube with the glottis at x = -l and the lips at x = 0]
- With boundary conditions solved, we obtain:

$$u(0,t) = \frac{U_0 \cos(\omega_0 t)}{\cos(\omega_0 l / c)}$$

- Comments:
  - Resonant frequencies occur when $\omega_0 l / c = \pi/2,\ 3\pi/2,\ 5\pi/2,\ \ldots$
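Solving the resonance condition for frequency gives F_k = (2k - 1) c / (4 l) for k = 1, 2, 3, and so on. The sketch below evaluates this, along with the lip-to-glottis velocity gain 1/|cos(ωl/c)| away from resonance. The tube length of 17 cm and the speed of sound of 340 m/s are assumed typical values, not figures stated in the slides.

```python
import math

C = 340.0   # assumed speed of sound, m/s
L = 0.17    # assumed tube length, m


def resonances(n, length=L, c=C):
    """First n resonant frequencies (Hz) of the lossless uniform tube."""
    return [(2 * k - 1) * c / (4.0 * length) for k in range(1, n + 1)]


def lip_gain(freq_hz, length=L, c=C):
    """Amplitude of u(0,t) relative to U0: 1 / |cos(w l / c)|."""
    return 1.0 / abs(math.cos(2.0 * math.pi * freq_hz * length / c))


print([round(f) for f in resonances(3)])  # [500, 1500, 2500]
print(lip_gain(250.0))                     # about 1.414, halfway to the first resonance
```

With these assumed dimensions the resonances are evenly spaced odd multiples of 500 Hz, which is why the uniform tube is often used as a first model of a neutral vowel.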
UNIFORM TUBES: RESONANT FREQUENCIES

[Figure: amplitude (axis from 10 to 50) versus frequency (axis from 100 to 1000) for the uniform tube]
- Comments:
  - $U_L(\omega)/U_G(\omega)$ is of the form $A/\cos(\omega l / c)$
  - With nonideal absorptive walls, response is not infinite at resonant frequencies
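One simple way to see why losses keep the response finite is to give the wavenumber a small imaginary part, so that each travelling wave decays as it propagates. This is an illustrative modeling assumption rather than the wall-absorption model behind the slide; the attenuation constant `alpha`, the tube length, and the speed of sound below are all hypothetical values.

```python
import cmath
import math

C = 340.0   # assumed speed of sound, m/s
L = 0.17    # assumed tube length, m


def tube_gain(freq_hz, alpha=0.0, length=L, c=C):
    """|U_L / U_G| = 1 / |cos((w/c - j*alpha) * l)|.

    alpha (in 1/m) is a hypothetical attenuation constant; alpha = 0
    recovers the lossless form A / cos(w l / c) with A = 1.
    """
    k = 2.0 * math.pi * freq_hz / c - 1j * alpha
    return 1.0 / abs(cmath.cos(k * length))


for f in (250.0, 450.0, 500.0, 550.0):
    print(f"{f:6.1f} Hz: gain {tube_gain(f, alpha=0.5):.2f}")
```

At the first resonance the lossy gain reduces to 1/sinh(alpha * l), which is large but finite; larger values of `alpha` give lower and broader peaks.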
SOUND PROPAGATION IN A MORE REALISTIC UNIFORM TUBE
[Diagram: tube with the glottis at x = -l and the lips at x = 0]
- Comment: resonant frequencies are now non-uniformly spaced
VOWEL PRODUCTION IN THE VOCAL TRACT
SOME EXAMPLE VOWELS
[Figure: spectrogram of example vowels; frequency axis 0 to 5000, time axis 0 to 4]
VOWEL PERCEPTION AND FORMANT FREQUENCIES
RADIATION IMPEDANCE: THE FINAL STEP

- Radiation impedance models the effect of the air load on the output pressure wave:

$$R(\omega) = \frac{P_r(\omega)}{U_L(\omega)}$$

- At most frequencies of interest, radiation impedance has the form of:

$$R(\omega) = \frac{P_r(\omega)}{U_L(\omega)} \propto j\omega$$

- Effects of air loading are typically absorbed into the filter effects of the vocal tract
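A response proportional to jω rises with frequency, which partly offsets the falling spectrum of the voiced source. The sketch below measures the slope in dB per octave for an idealized radiation term, an idealized glottal source falling as the square of frequency, and their product. The function names and the unit proportionality constants are illustrative assumptions.

```python
import math


def db(x):
    """Magnitude ratio expressed in decibels."""
    return 20.0 * math.log10(x)


def slope_db_per_octave(magnitude, f=1000.0):
    """Change in level when the frequency doubles from f to 2f."""
    return db(magnitude(2.0 * f)) - db(magnitude(f))


def radiation(f):
    """|R| proportional to |j w| (constant of proportionality taken as 1)."""
    return 2.0 * math.pi * f


def glottal(f):
    """Idealized voiced source magnitude, falling as 1/f^2."""
    return 1.0 / f ** 2


def combined(f):
    """Source and radiation together, with a flat vocal tract."""
    return glottal(f) * radiation(f)


for name, mag in (("radiation", radiation), ("glottal", glottal), ("combined", combined)):
    print(f"{name}: {slope_db_per_octave(mag):+.1f} dB/octave")
```

Under these idealizations the radiation term contributes about +6 dB per octave, so the voiced source and radiation together slope at about -6 dB per octave.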
SUMMARY
- We have discussed very superficially the production of speech sounds:
  - Source-filter model
  - Vocal tract transfer functions
  - Impact on perception
  - Some attributes of acoustical analysis
- The source-filter model is used:
  - As a way to model how we produce speech sounds
  - As a way to reduce the number of parameters needed to characterize speech sounds
  - As a way of extracting features that are used by speech recognition systems
