
Vowels are produced…

Constricting the vocal tract at a node or antinode (named = perturbation)


Movement of muscles
Lips
Tongue
Pharyngeal muscles
Jaw
Vowel Perturbation
Formant Frequency changes
Node: formant frequency raised with constriction at a node
Minimum volume velocity and maximum pressure
-done through lowering the mandible
Anti-node: formant frequency lowered with constriction at an antinode
Maximum volume velocity and minimum pressure
-done via labial constriction

Formant Frequency changes based on constriction location


Increased VT length - decreases ALL formant frequencies
Lip constriction - decreases ALL formant frequencies
Mid-Oral Cavity Constriction - decrease in F3 frequency, F1/2 unknown
Hypo-Pharynx Constriction - increase in ALL formant frequencies
Pharynx constriction - increase in F1, F2/3 unknown
Anterior Oral Cavity (alveolar ridge) constriction
Increase in F2 and F3 frequencies
Decrease in F1
Oro-Pharynx Constriction
F1 unknown
Decrease in F2 frequency
Increase in F3 frequency
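
Worked example for "Increased VT length - decreases ALL formant frequencies": a minimal sketch, assuming the vocal tract behaves like a uniform tube closed at the glottis and open at the lips (a quarter-wavelength resonator), with an assumed speed of sound of 35,000 cm/s. The function name and the tube lengths are illustrative, not from the lecture.

```python
def uniform_tube_formants(length_cm, n_formants=3, speed_cm_per_s=35000.0):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_cm_per_s / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]

# Assumed lengths: 17.5 cm (neutral tract) vs. 20 cm (lengthened tract).
neutral = uniform_tube_formants(17.5)   # [500.0, 1500.0, 2500.0]
longer = uniform_tube_formants(20.0)    # [437.5, 1312.5, 2187.5]

for n, (f_short, f_long) in enumerate(zip(neutral, longer), start=1):
    print(f"F{n}: {f_short:.1f} Hz -> {f_long:.1f} Hz")
```

Every formant drops when the tube gets longer, which matches the first item in the list above. The constriction effects in the rest of the list are not modeled by this simple tube.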

VT articulatory posture is described by position of tongue


Tongue Height
High (closed)
Mid
Low (open)
Tongue Advancement
Front
Back

Vowel Quadrilateral

Take Aways from above diagram


As front vowels become more open
F1 increases in frequency
F2 decreases slightly in frequency
As back vowels become more open
F1 increases in frequency
F2 unclear
*Perception of vowel height is related to F1
Gap between F2 and F1 decreases from front to back vowels
Gap between F3 and F2 increases from front to back vowels
Normative Data: (overlapping bubbles slide)
-Take away point: No absolute values exist for F1/2/3. Formant frequencies
change from person to person based on how each individual produces them.

Tense-Lax and duration


Tense vowels
Produced with more muscle contraction
Produced with extremes of articulatory posture
Tongue higher in oral cavity
Duration
Tense vowels are longer
Lax vowels are shorter
Inherent Duration of Vowels
Kent, Dembowski and Lass (1996)

Rhotacized Vowel Quality


*Rhotacized Vowels created with Hypopharyngeal constriction

Diphthongs (combination of 2 vowels)


-Two vowels within the same syllabic nucleus
-Smooth glide from one vowel to the next
-Tongue posture and shape change so drastically while producing diphthongs that
there is no single exact tongue position for them.
However…
Rate of change of vowels is still stable
Relationship between the vowel formants is stable

Onglide = articulatory starting point of the diphthong


Offglide = articulatory ending point of the diphthong

Vocal Tract and Regulation of Intensity


Acoustic energy lost in VT due to:
Glottal opening
Absorbent walls of pharynx/mouth
Friction between air particles

Vocal Tract Transfer Function


-More widely spaced harmonics of female voice as opposed to the male voice

Singer’s Formant
Trained singers can tune the formants of the vocal tract to match the frequencies of
one or more harmonics of the sound source.
The effect of tuning is a louder sound and improved aspects of vocal quality.
Acoustically, the singer’s formant results in a spectral peak around 2500-3000 Hz.

Filters
-Vocal Tract = Band Pass Filter
-eliminates or reduces certain frequencies of the vibrations of the vocal folds.
This is why we have formants and not all the frequencies.
-Broadly tuned filters = slow attenuation of frequencies outside of the cutoff
frequency
-Sharply tuned filters = fast attenuation outside of the cutoff frequency
-Cutoff frequency = The half power point, where the amplitude of the frequency
component is decreased by 3dB.
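
Quick check of the "half power point = 3 dB" statement: a small sketch using the standard decibel formulas (10·log10 for power ratios, 20·log10 for amplitude ratios). The function names are illustrative.

```python
import math

def power_ratio_to_db(ratio):
    """Decibels for a ratio of powers."""
    return 10 * math.log10(ratio)

def amplitude_ratio_to_db(ratio):
    """Decibels for a ratio of amplitudes (pressure, voltage)."""
    return 20 * math.log10(ratio)

half_power_db = power_ratio_to_db(0.5)       # about -3.01 dB
half_power_amplitude = math.sqrt(0.5)        # about 0.707 of the peak amplitude

print(f"Half power = {half_power_db:.2f} dB")
print(f"Amplitude at half power = {half_power_amplitude:.3f} of peak")
```

Halving the power and multiplying the amplitude by about 0.707 are the same 3 dB drop, which is why the cutoff frequency is called the half power point.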

Sound Spectroscopy

Spectrogram = graphical representation of the frequency and intensity of the sound
pressure wave as a function of time.
Tells us about the source and filter of the sound
The darker parts of the spectrogram show more energy in that area
Changing Source Spectrum:
Locations of formants don’t change with bandwidth change
Because the vocal tract doesn’t change, only the vocal fold tension
As pitch changes, the harmonics move through the formants
But, the darkest areas of the spectrogram won’t change because those
areas are related to the formants and fundamental characteristics of the vocal tract.
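
Illustration of "harmonics move through the formants": a rough sketch, assuming harmonics sit at whole-number multiples of the fundamental and that one formant stays fixed at 500 Hz. The f0 values, the formant value, and the function names are all made up for the example.

```python
def harmonics(f0_hz, max_hz=4000):
    """Harmonic frequencies (integer multiples of f0) up to max_hz."""
    return [n * f0_hz for n in range(1, int(max_hz // f0_hz) + 1)]

def nearest_harmonic(f0_hz, formant_hz):
    """Harmonic closest to a formant; the formant itself does not move."""
    return min(harmonics(f0_hz), key=lambda h: abs(h - formant_hz))

formant = 500  # assumed F1, fixed by the vocal tract shape

for f0 in (100, 220):  # assumed lower-pitched and higher-pitched voices
    print(f"f0 = {f0} Hz: harmonic spacing = {f0} Hz, "
          f"harmonic nearest the {formant} Hz formant = {nearest_harmonic(f0, formant)} Hz")
```

The formant stays at 500 Hz in both cases; only the harmonic that falls closest to it changes as pitch changes. The higher f0 also gives more widely spaced harmonics.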

Waveform = graphical representation of VF vibration


Interested in the amplitude and time information (period) of the vibration as it
relates to time frequency trade off.

Time-frequency trade-off: The more time points there are in the analysis window, the lower
the time resolution and the higher the frequency resolution, and vice versa.
Smaller/narrow bandwidth = higher frequency resolution
Larger/wide bandwidth = higher time resolution
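
Numbers for the trade-off: a minimal sketch, assuming a 16,000 Hz sampling rate and two analysis window sizes chosen for illustration. It uses the basic relations window duration = N / fs and frequency spacing = fs / N; the function name is hypothetical.

```python
def window_tradeoff(n_samples, sample_rate_hz):
    """Return (window duration in ms, frequency resolution in Hz) for one analysis window."""
    duration_ms = 1000.0 * n_samples / sample_rate_hz
    freq_resolution_hz = sample_rate_hz / n_samples
    return duration_ms, freq_resolution_hz

fs = 16000  # assumed sampling rate

short_window = window_tradeoff(64, fs)    # (4.0 ms, 250.0 Hz): good time detail, wide bandwidth
long_window = window_tradeoff(512, fs)    # (32.0 ms, 31.25 Hz): good frequency detail, narrow bandwidth

print("Short window:", short_window)
print("Long window:", long_window)
```

More time points means a longer window: frequency detail gets finer (narrow bandwidth) while timing detail gets coarser, and the reverse for a short window.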

Instruments Used:
Ultrasound : mainly used for capturing tongue posture during speech production
CT (CAT) scan and MRI : benefit is we get a 3D reconstruction of tissues
Regular X-Ray:
Harmonic to Noise Ratio
-The ratio of energy in the fundamental frequency and harmonics, to the energy
in the aperiodic noise component of the speech signal, averaged over several cycles.
In irregular speech: the ratio of harmonic energy (fundamental frequency and harmonics)
to noise energy is reduced.
-Jitter, shimmer, and H/N are time-based measures
-in moderate/severe dysphonia identification of cycle boundaries is
difficult and this measure may not be reliable.
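
Example of the ratio itself: a small sketch, assuming the harmonic and noise energies have already been measured somehow. The energy values and the function name are invented for illustration; separating harmonic from noise energy in a real signal is the hard part and is not shown.

```python
import math

def hnr_db(harmonic_energy, noise_energy):
    """Harmonic-to-noise ratio in decibels from two energy values."""
    return 10 * math.log10(harmonic_energy / noise_energy)

# Made-up energy values in arbitrary units.
regular_voice = hnr_db(100.0, 1.0)    # harmonics carry 100x the noise energy
irregular_voice = hnr_db(10.0, 1.0)   # more noise relative to the harmonics

print(f"Regular voice:   {regular_voice:.1f} dB")
print(f"Irregular voice: {irregular_voice:.1f} dB")
```

A lower dB value means the noise component makes up a larger share of the signal.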

Cepstral Measures
-A Fourier Transform of the logarithm of the power spectrum that shows the extent to which
the fundamental frequency and harmonic structure stand out from the background noise.
-Relative amplitude of dominant cepstral peak correlates well with perception of
breathiness and abnormal vocal quality
*CPP (cepstral peak prominence) - does not depend on time related information,
which makes it a good measure and easy to perform.
*CPP decrease is correlated with poor voice quality
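
Sketch of a cepstrum: a toy example using the common definition (inverse Fourier transform of the log power spectrum) on an idealized impulse-train "voice" with an assumed 8,000 Hz sampling rate and a 32-sample period. It only locates the dominant cepstral peak; it does not compute CPP, which also involves a regression line over the cepstrum. All names and values are illustrative.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(N^2)); fine for one short frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log power spectrum of one frame."""
    n = len(frame)
    log_power = [math.log(abs(c) ** 2 + floor) for c in dft(frame)]
    # For a real signal the log power spectrum is real and symmetric, so the
    # inverse DFT equals the forward DFT divided by n (real part kept).
    return [c.real / n for c in dft(log_power)]

def cepstral_peak_quefrency(frame, q_min, q_max):
    """Quefrency (in samples) of the largest cepstral value in [q_min, q_max]."""
    ceps = real_cepstrum(frame)
    return max(range(q_min, q_max + 1), key=lambda q: ceps[q])

fs = 8000        # assumed sampling rate in Hz
period = 32      # assumed pitch period in samples
frame = [1.0 if t % period == 0 else 0.0 for t in range(256)]

peak = cepstral_peak_quefrency(frame, 16, 48)
print(f"Cepstral peak at {peak} samples -> f0 = {fs / peak:.1f} Hz")
```

The peak lands at the pitch period, so a strongly periodic signal gives a prominent peak. A noisier, less periodic signal would give a weaker one, which is the idea behind using the peak as a voice quality measure.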

Consonants
Stops and Fricatives

Consonants vs. vowels


Consonants
Some degree of constriction (giving less energy than vowels)
Source of sound: voiced, or voiceless
More meaning
Vowels
Open vocal tract (greater energy)
Source of sound: voiced
Less meaning

Constriction
Speech production is aerodynamic
Airflow is “egressive”

Sources of speech sounds


Nearly periodic: Vowels and other voiced sonorants (l, j)
Aperiodic: Turbulent
Fricatives (f, s)
Airflow is turbulent as it flows through supraglottal constriction
Glottis is open
Aperiodic: transient noise (p, k, g)
Rapid pressure change in supraglottal tract
Coarticulation = simultaneously articulating more than one phoneme
Very important in speech perception
The vocal tract and articulators change shape continuously during speech
Anticipatory (forward) = (boo)k
Retentive (backward) = b(ook)
Essential in perception of certain consonants

Vowel Transitions
The transition between the vowel and the consonant (VC) or the consonant and
the vowel (CV)
These allow the brain to distinguish between sounds and tell the difference
between consonant sounds (ex: /th/, /f/)
Vowel transitions are examples of coarticulation

Phonetic Description of Consonants (how we create consonant differentiation)


Presence or absence of voicing (cognates : /k/,/g/)
Place of articulation
Manner of articulation
Complete, transient cessation of airflow
Stops and affricates
Constriction with continuous airflow
Fricatives, nasals, glides and liquids

Place of Articulation
Bilabial (p,b,m,w)
Constriction at the lips/lips come together
Labiodental (f,v)
Lower lip and upper teeth constriction
Constriction by the tongue tip against other areas (“lingual” articulation)
Dental - tongue tip with front upper teeth
Alveolar (t, d, z, s, n, l) - tongue tip with alveolar ridge
Palatal - tongue tip and hard palate
Retroflex (shape of tongue can be different)
Palatal or alveolar
Velar (k, g, ŋ)
back of tongue against soft palate
Pharyngeal fricative (h)
Back of tongue and pharynx (voiceless)
Glottal
(glottal stop)
Glottal consonants are voiceless, where glottis rapidly closes
Ex: “butter” in British English, “ah-ah” in English
Phonetic Description of Consonants

V+ = voiced, V- = voiceless
Stops
Five acoustic cues important for perception of stops
Silence
Burst noise
Aspiration
Voice onset time (VOT)
CV or VC vowel formant transition

Silence = stop gap


Interval from occlusion to release of the airflow
Voiceless stops = complete silence
Voiced stops
Varying amount of silence
Voicing with low amplitude can be present
Seen as a voice bar on spectrogram

Acoustic Features of Voiceless Stops


*If given a spectrogram, be able to identify where the stop gap is located, the
voice bar in a voiced stop spectrogram, and the “burst” if asked. See pics below for
reference.

Lighter areas indicate stop gap (indicates lowest energy)


Darkened areas immediately after stop gap indicate “energy release burst”
Voice bar at bottom of stop gap area = shows low-frequency voicing energy is still present
from the previous vowel. This is the case when stops are not entirely voiceless.

Release burst: transient burst noise upon release of the occlusion and impounded air
(Ex: pop sound heard in microphone)
Duration : approx 10-30ms for voiced stops, longer for voiceless cognates
Observed in waveforms as a “sudden change in amplitude”
Observed in spectrogram as “sudden appearance of energy at many frequencies”
