
The Audio Signal

Digital Audio Processing


Cinema, Computer, and Telecommunications Engineering


Antonio Servetti servetti@polito.it
Internet Media Group http://media.polito.it
Dip. di Automatica ed Informatica
Politecnico di Torino



Outline
Reference: Lombardo, "Audio e Multimedia", Ch. 1 and 2
Where does an audio signal come from?
Waveform basics (sinusoids)
Audio objective attributes
Amplitude/intensity, frequency, duration
Perceived ranges
Audio subjective features
From sinusoids to real sounds
Loudness, pitch, timbre
Production: amplitude, frequency, waveform
Perception: loudness, pitch, timbre
Audio signal - i.e. sound
Signal
"Something that carries information with its variations in
time/space", can be manipulated, stored, transmitted
Sound
It is a mechanical wave caused by a vibrating object that
propagates in every direction in a medium (such as air or
water) through compression and rarefaction
And that can be detected by the human ear
The audio signal is the representation of that sound
Audio signal waveform
Representation of the pattern of changing air
pressure that evolves with time
Characterized by amplitude, frequency (and phase)

Time: samples, seconds,
Air pressure: [-1,1]
Zero mean
Audacity: Toms_diner.wav
Amplitude
Audacity: tuning_fork_a4.wav
Represents the intensity/energy of the sound at a
given point in time or space
Measured as sound pressure: the difference between
average local pressure and the pressure of the sound wave

Sound pressure level
Represents the sound energy level
Measured using the root mean square (RMS) amplitude over
a time period (because amplitude has zero mean)
On a logarithmic scale (decibel, dB)
The human ear can detect sounds with a wide range of
amplitudes (from $p_0 = 2 \times 10^{-5}$ N/m$^2$ to 30 N/m$^2$)
w.r.t. a reference level
Threshold of hearing at 1 kHz ($p_0 = 2 \times 10^{-5}$ N/m$^2$)
Intensity is proportional to the square of the (RMS)
pressure, so:
$\mathrm{SPL} = 20 \log_{10}(p/p_0)\ \mathrm{dB} = 10 \log_{10}(p^2/p_0^2)\ \mathrm{dB} = 10 \log_{10}(I/I_0)\ \mathrm{dB}$
Calibration
(http://www.audiocheck.net/testtones_hearingtestaudiogram.php)
Audiograms require a properly calibrated audio
system. As we have no idea how loud your sound
level has been turned to as you listen to our sound
files, running an online audiogram test requires a
trick. As imprecise as it is, it will be good enough to
provide you with a rough estimate of your hearing
loss, if any.
First, we need you to adjust your computer's level to
match a known reference. Here is the trick: rub your
hands together, in front of your nose, quickly and
firmly, and try producing the same sound as our
calibration file. You are now generating a reference
sound that is approximately 65 dBSPL. As you play
back our calibration file, adjust your computer's
volume to match the sound level you just heard from
your hands. Proceed back and forth, preferably with
your eyes closed to increase concentration, until
both levels match. Then, do not touch your
computer's volume knob anymore. Calibration is
done: your computer's volume knob has been set to
match 65 dBSPL. This procedure should give us a
confidence of approximately 10 dBHL in the next
hearing test.
Although headphones for this test are highly
recommended, they must be taken off when listening
to the reference sound made by your hands.
SPL reference table
Rubbing your hands in front of your nose is
around 65 dB SPL (calibration trick)
Useful sound levels
between 50-100 dB SPL
50: average home
60: conversational speech
100: disco music
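The levels in this table can be turned back into sound pressures with the SPL definition. The following is a minimal Python sketch, assuming the usual reference pressure of 20 micropascal; the names `P_REF`, `rms`, `spl_db`, and `pressure_from_spl` are illustrative and do not come from the slides.

```python
import math

# Assumed reference pressure: 20 micropascal (2e-5 N/m^2), the usual 0 dB SPL point.
P_REF = 2e-5


def rms(samples):
    """Root mean square of a sequence of pressure samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def spl_db(p_rms, p_ref=P_REF):
    """Sound pressure level SPL = 20 log10(p / p0), in dB."""
    return 20.0 * math.log10(p_rms / p_ref)


def pressure_from_spl(level_db, p_ref=P_REF):
    """Inverse mapping: RMS pressure (N/m^2) for a given level in dB SPL."""
    return p_ref * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    table = [("average home", 50), ("conversational speech", 60),
             ("hand rubbing (calibration)", 65), ("disco music", 100)]
    for label, level in table:
        print(f"{label}: {level} dB SPL = {pressure_from_spl(level):.5f} N/m^2")
```

Every 20 dB step corresponds to a tenfold change in pressure, which is why the 50-100 dB SPL range in the table spans more than two orders of magnitude in pressure.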
Frequency (Hz)
Frequency: number of cycles per unit of time
Related to the pitch ("altezza") of a sound (low, "grave", or high, "acuto")
Perceived frequency range:
20-20,000 Hz, but the upper limit decreases with age (e.g. 16,000 Hz)
Below 20 Hz we perceive vibration with the body
Tuning fork example
Audacity: tuning_fork_a4.wav
Fundamental frequency
It is the lowest frequency in a sound
Musical instruments (e.g. piano)
C4 (middle C, "DO") = 261.6 Hz
A4 ("LA") = 440 Hz, A5 = 880 Hz (one octave higher)
Lowest note = 27.5 Hz, highest note = 4186 Hz
Speech
Child speech ranges from 250 to 400 Hz; adult females tend
to speak at around 200 Hz on average and adult males
at around 125 Hz.
Singers
Soprano: C4 to C6 (1046.50 Hz), Tenor: C3 to C5
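The note frequencies listed for the piano can be reproduced numerically. This is a small Python sketch assuming twelve-tone equal temperament tuned to A4 = 440 Hz; the tuning system and the names `NOTE_OFFSETS` and `note_frequency` are assumptions made for illustration.

```python
# Semitone distance of each natural note from A within the same octave
# (scientific pitch notation: octaves start at C).
NOTE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def note_frequency(name, octave, a4=440.0):
    """Frequency in Hz of a natural note, assuming equal temperament."""
    semitones_from_a4 = NOTE_OFFSETS[name] + 12 * (octave - 4)
    # Each semitone multiplies the frequency by 2**(1/12); 12 semitones double it.
    return a4 * 2.0 ** (semitones_from_a4 / 12.0)


if __name__ == "__main__":
    for name, octave in [("A", 0), ("C", 4), ("A", 4), ("A", 5), ("C", 6), ("C", 8)]:
        print(f"{name}{octave}: {note_frequency(name, octave):.2f} Hz")
```

The octave relation on the slide (A4 = 440 Hz, A5 = 880 Hz) is the special case of a 12-semitone step.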
Speech, voice, music, audio
With respect to sound production we identify
General audio: all perceived sound
Speech, voice, and music each occupy a subregion
of the frequency range / dynamic range plane
Audio: frequency range 20-20,000 Hz, intensity range ~100 dB
Telephone speech: frequency range 300-3400 Hz, intensity range ~80 dB
Voice region
Some theory
Base reference: sinusoid
Most basic signal
$x(t) = A \cos(\omega_0 t + \varphi)$
Angle as a function of time, given
$A$: amplitude, $\omega_0$: radian frequency ($2\pi f_0$), $\varphi$: phase
A above middle C (A4, "LA", 440 Hz)
$x(t) = 10 \cos(2\pi \cdot 440\, t - \tfrac{1}{2}\pi)$
Period: the shortest time for the signal to repeat itself
1/440 sec = 2.27 msec
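A sampled version of the 440 Hz cosine makes the period easy to check. The sketch below is illustrative only: the sample rate of 44100 Hz and the default phase are assumptions, and the names `SAMPLE_RATE`, `sinusoid`, and `period_seconds` are hypothetical.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed value)


def sinusoid(t, amplitude=10.0, f0=440.0, phase=-math.pi / 2):
    """Evaluate A*cos(2*pi*f0*t + phase) at time t in seconds.

    The default phase is an illustrative assumption.
    """
    return amplitude * math.cos(2.0 * math.pi * f0 * t + phase)


def period_seconds(f0):
    """Shortest time after which a sinusoid of frequency f0 repeats itself."""
    return 1.0 / f0


# 10 ms of the tone, sampled at SAMPLE_RATE (441 samples).
samples = [sinusoid(n / SAMPLE_RATE) for n in range(SAMPLE_RATE // 100)]

if __name__ == "__main__":
    print(f"period of A4: {period_seconds(440.0) * 1000:.2f} ms")
```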
Phase shift and time shift
Phase (together with frequency) determines the
time locations of the maxima and minima of a
cosine wave: $\varphi = 0 \Rightarrow$ positive peak at $t = 0$
Time shifting
$x_1(t) = s(t - t_1)$
$t_1$ positive -> signal s(t) has been delayed
$t_1$ negative -> signal s(t) has been advanced
Positive peak closest to t=0
Phase shifting to time shifting
$\cos(\omega_0 t + \varphi) = \cos(\omega_0 (t - t_1))$ where $t_1 = -\varphi / \omega_0$
Phase shift is negative when time shift is positive
Reference:
McClellan, "Signal Processing First", Ch. 2 Sinusoids
Phase and delay
For a single sound source the phase value has no audible effect (it is
just a delay)
But with multiple sound sources the relative phase is important
(i.e. constructive or destructive interference, stereo image)
From phase to delay (and vice versa) as a function of the
signal frequency
$t = \varphi / (2\pi f)$
(e.g. at 440 Hz, $\varphi = \pi$ => t = 1.136 ms)
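The conversion between phase and delay is a one-line formula, sketched here in Python. The function names are hypothetical; the last helper checks numerically that a cosine delayed by t1 equals a cosine with phase -2*pi*f*t1, which is the sign convention assumed in this sketch.

```python
import math


def phase_to_delay(phase_rad, freq_hz):
    """Delay in seconds equivalent to a phase of phase_rad at freq_hz."""
    return phase_rad / (2.0 * math.pi * freq_hz)


def delay_to_phase(delay_s, freq_hz):
    """Phase in radians equivalent to a delay of delay_s at freq_hz."""
    return 2.0 * math.pi * freq_hz * delay_s


def delayed_cosine(t, freq_hz, delay_s):
    """cos(w0 * (t - t1)): a unit cosine delayed by t1 = delay_s seconds."""
    return math.cos(2.0 * math.pi * freq_hz * (t - delay_s))


if __name__ == "__main__":
    print(f"pi at 440 Hz -> {phase_to_delay(math.pi, 440.0) * 1000:.3f} ms")
```

The same phase corresponds to a shorter delay at higher frequencies, so a fixed delay between two sources produces a frequency-dependent phase difference.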
From theory to real sounds
Real sounds do not last forever
Real sounds are "transient"
They last for a finite time span: they come to life and then
die out
Audacity: tuning_fork_a4.wav
Audacity: trumpet_G4.wav
Transients: ADSR
Time envelope
Evolution of sound amplitude with time (positive peaks)
ADSR
Attack: initial run-up of level from nil to peak
Decay: subsequent run-down to the designated sustain level
Sustain: level during the main sequence of the sound's duration
Release: time for the level to decay from the sustain level to zero
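The four ADSR stages can be sketched as a piecewise-linear gain curve. This is only an illustration: straight-line segments, a peak level of 1.0, and the name `adsr_envelope` with its parameters are assumptions, and real instrument envelopes are usually curved.

```python
def adsr_envelope(n_samples, sample_rate, attack, decay, sustain_level, release):
    """Piecewise-linear ADSR envelope as a list of gains between 0 and 1.

    attack, decay and release are durations in seconds; sustain_level is a
    gain in [0, 1]. The sustain stage fills whatever time is left over.
    """
    a = round(attack * sample_rate)
    d = round(decay * sample_rate)
    r = round(release * sample_rate)
    s = n_samples - a - d - r
    if s < 0:
        raise ValueError("attack + decay + release exceed the total duration")
    envelope = [i / a for i in range(a)]                                   # 0 up to peak
    envelope += [1.0 - (1.0 - sustain_level) * i / d for i in range(d)]   # peak down to sustain
    envelope += [sustain_level] * s                                        # hold
    envelope += [sustain_level * (1.0 - i / r) for i in range(r)]          # sustain down towards 0
    return envelope


if __name__ == "__main__":
    env = adsr_envelope(1000, 1000, attack=0.01, decay=0.05, sustain_level=0.6, release=0.2)
    print(len(env), env[0], env[10], env[500], round(env[-1], 4))
```

Multiplying a steady tone sample by sample with such an envelope gives it a finite, shaped duration; a shorter `attack` makes the result sound more percussive.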
Transients: ADSR (cont)
Musical instruments have different ADSR envelopes
A rapid attack tends to be heard as a percussive sound
A slow attack is more typical of wind instruments
Note:
Even experienced musicians may have difficulty identifying
the source of a sound when its envelope is manipulated
Real sounds are not exactly periodic
Quasi-periodic: repeat (almost) identically after
some (almost) constant time
Aperiodic: no clear periodicity can be identified
(noise-like)
[Figure: waveform of La110Chitarra.wav, amplitude vs. time (s), with period T0 marked]
[Figure: waveform of castanets.wav, amplitude vs. time (s)]
Quasi-periodic signals
Complex sounds with multiple frequency
components
There is no single frequency
Fundamental frequency (F0): set by the signal period (lowest frequency)
Harmonics (Fn):
integer multiples of F0
(other peaks within the waveform cycle)
[Figure: waveform of La110Chitarra.wav, amplitude vs. time (s), with period T0 marked]
Complex sounds
Complex sounds can be approximated by sine
waves with different amplitude, frequency and
phase
Fundamental frequency: 155 Hz
2nd harmonic with amplitude 1/7 and phase +75°
3rd harmonic with amplitude 1/3.5 and phase +250°
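The three-component example can be synthesized directly as a sum of sinusoids. In this Python sketch the fundamental is assumed to have amplitude 1 and phase 0, the phases are assumed to be in degrees, and a sine form is used; `PARTIALS`, `F0`, and `complex_tone` are illustrative names.

```python
import math

F0 = 155.0  # fundamental frequency in Hz

# (harmonic number, relative amplitude, phase in degrees)
# Amplitude 1.0 and phase 0 for the fundamental are assumptions.
PARTIALS = [(1, 1.0, 0.0), (2, 1.0 / 7.0, 75.0), (3, 1.0 / 3.5, 250.0)]


def complex_tone(t):
    """Sum of the harmonic partials evaluated at time t (seconds)."""
    return sum(
        amp * math.sin(2.0 * math.pi * k * F0 * t + math.radians(phase_deg))
        for k, amp, phase_deg in PARTIALS
    )


if __name__ == "__main__":
    period = 1.0 / F0
    print(f"period: {period * 1000:.3f} ms")
    print(complex_tone(0.001), complex_tone(0.001 + period))
```

Because every partial is an integer multiple of F0, the sum repeats with the period of the fundamental, even though the waveform within one cycle is no longer a plain sinusoid.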
Complex sounds (example)
Three instruments (flute, oboe, violin)
playing the same note
have different frequency content
Timbre
Fender Stratocaster guitar
The quality of a sound that distinguishes different
types of sound production (even when they have the same
pitch and loudness)
Depends on both the signal waveform and the
spectrum
The spectral content
The time envelope
Etc.
If a sound is played backward, the spectrum
is the same, but it sounds different
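The claim about backward playback can be checked numerically: reversing a real signal in time leaves the magnitude of its Fourier transform unchanged. The sketch below uses a direct (slow) DFT on a short, hypothetical "plucked" note; the signal and all names are made up for illustration.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitude of the discrete Fourier transform, computed directly (O(N^2))."""
    n = len(x)
    return [
        abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)))
        for k in range(n)
    ]


# Hypothetical note: sharp attack followed by an exponential decay, 64 samples.
N = 64
note = [math.exp(-5.0 * m / N) * math.sin(2.0 * math.pi * 4.0 * m / N) for m in range(N)]
reversed_note = note[::-1]  # the same note played backward

forward_spectrum = dft_magnitudes(note)
backward_spectrum = dft_magnitudes(reversed_note)
max_difference = max(abs(a - b) for a, b in zip(forward_spectrum, backward_spectrum))

if __name__ == "__main__":
    print(f"largest magnitude difference: {max_difference:.2e}")
```

The two waveforms differ sample by sample (the attack moves to the end), yet the magnitude spectra agree up to rounding error, so the audible difference has to come from the time envelope.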
Timbre (audio demo)
Demonstration 29. Effect of Tone Envelope on Timbre (2:16)
You will hear a recording of a Bach chorale played on a piano.
Now the same chorale played backwards
Now the tape of the last recording is played backwards so that the
chorale is heard forwards
The purpose of this demonstration (originally presented by J. Fassett) is to show that
the temporal envelope of a tone, i.e. the time course of the tone's amplitude, has a
significant influence on the perceived timbre of the tone. By removing the attack
segment of an instrument's sound, or by substituting the attack segment of another
musical instrument, the perceived timbre of the tone may change so drastically that
the instrument is no longer recognizable.
In this demonstration, a four-part chorale by J.S. Bach ("Als der gütige Gott") is played
on a piano and recorded on tape. Next the chorale is played backward on the piano
from end to beginning, and recorded again. Finally the tape recording of the backward
chorale is played in reverse, yielding the original (forward) chorale, except that each
note is reversed in time. The instrument does not sound like a piano any more, but
rather resembles a kind of reed organ. The power spectrum of each note, measured over the
note's duration, is not changed by temporal reversal of the tone.
D29C.EffectOfToneEnvelopeOnTimbre.ChoralePlayedBackwardReversed
From math to perception
Audio technology is heavily related to audio
perception because the human hearing system
and brain are involved
Psychoacoustics: how are physical measures
related to audio perception?
The human ear
Get slides (and videos) from Audio Coding
Loudness
Psychological correlate of sound amplitude
That attribute of auditory sensation in terms of which sounds
can be ordered on a scale extending from quiet to loud
Loudness level (phon)
Correlated with the logarithm of intensity
But not uniform across frequency
Affected by frequency
bandwidth and duration
Perceived loudness (sone)
Phons scale with level in dB,
not with loudness
TODO
- Audio examples: synthetic tones with the same
amplitude and different frequency =>
different phon
Reference:
http://www.sengpielaudio.com/calculatorSonephon.htm (+10 dB)
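The relation between loudness level (phon) and perceived loudness (sone) can be sketched with the commonly used rule that 40 phon equals 1 sone and every additional 10 phon doubles the loudness. This rule is an assumption here, it is usually applied only above about 40 phon, and the function names are illustrative.

```python
import math


def phon_to_sone(phon):
    """Perceived loudness in sone for a loudness level in phon (phon >= 40)."""
    if phon < 40.0:
        raise ValueError("the doubling rule is only assumed to hold from 40 phon up")
    return 2.0 ** ((phon - 40.0) / 10.0)


def sone_to_phon(sone):
    """Loudness level in phon for a perceived loudness in sone (sone >= 1)."""
    if sone < 1.0:
        raise ValueError("the doubling rule is only assumed to hold from 1 sone up")
    return 40.0 + 10.0 * math.log2(sone)


if __name__ == "__main__":
    for phon in (40, 50, 60, 70, 80):
        print(f"{phon} phon = {phon_to_sone(phon):.0f} sone")
```

A 10 dB increase at 1 kHz therefore sounds roughly twice as loud, not ten times as loud, even though the intensity grows tenfold.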
Frequency (Hz)
Frequency: number of cycles per unit of time
Perceived frequency range:
20-20,000 Hz, but the upper limit decreases with age (e.g. 16,000 Hz)
Perception:
Pitch perception of the ear is proportional to the
logarithm of frequency rather than to frequency itself
Example: 'sine_sweep.mp3'
Reference:
Lombardo, "Audio e Multimedia", Ch. 1 Acustica
Pitch
Pitch is a perceptual property that allows the
ordering of sounds on a frequency-related scale
Human perception of pitch is approximately logarithmic
with respect to fundamental frequency
Pitch is an auditory sensation
For pure tones, pitch maps to frequency
For complex tones, pitch can be ambiguous
Labeling (scientific pitch notation)
Note + octave
(e.g. C0 16 Hz, C4 261 Hz, A4 440 Hz)

Reference:
Lombardo, "Audio e Multimedia", Ch. 1 Acustica
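The logarithmic relation between frequency and pitch is what makes note labeling possible. The following Python sketch maps a frequency to the nearest note in scientific pitch notation, assuming equal temperament with A4 = 440 Hz; `NOTE_NAMES` and `nearest_note` are illustrative names.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4=440.0):
    """Return (label, cents) for the equal-tempered note closest to freq_hz.

    cents is the deviation from that note in hundredths of a semitone.
    """
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4)
    nearest = round(semitones_from_a4)
    cents = 100.0 * (semitones_from_a4 - nearest)
    number = 69 + nearest  # MIDI-style numbering: A4 = 69, C4 = 60
    label = f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"
    return label, cents


if __name__ == "__main__":
    for f in (16.35, 261.63, 440.0, 880.0):
        label, cents = nearest_note(f)
        print(f"{f} Hz -> {label} ({cents:+.1f} cents)")
```

Equal frequency ratios give equal pitch steps: doubling the frequency always moves the label up by exactly one octave.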
Pitch of harmonic sounds
Harmonic sounds can be approximated by sine waves
with different amplitude, frequency and phase
Fundamental frequency: 155 Hz
2nd harmonic with amplitude 1/7 and phase +75°
3rd harmonic with amplitude 1/3.5 and phase +250°

Reference:
Lombardo, "Audio e Multimedia", Ch. 1 Acustica
Complex sounds (audio demo)
Cancelled Harmonics
A complex tone is presented followed by several
cancellations and restorations of a particular harmonic.
This is done for harmonics 1 through 10.
This demonstration illustrates Fourier analysis of a complex tone
consisting of 20 harmonics of a 200-Hz fundamental.
When we listen analytically, we hear the different components
separately; when we listen holistically, we focus on the whole sound and
pay little or no attention to the components.
When the relative amplitudes of all 20 harmonics remain steady (even if
the total intensity changes), we tend to hear them holistically.
However, when one of the harmonics is turned off and on,
it stands out clearly
01_cancelled_harmonics
Virtual pitch
When there is no discernible fundamental, the ear
will often create one
1st: individually, the partials sound like high-pitched sinusoids
2nd: together, they create the percept of a single sound at a lower frequency
Reference:
Sethares, "Tuning, Timbre, Spectrum, Scale", Ch. 2 The Science of Sound
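A very crude way to guess the pitch the ear "creates" is to look for the largest frequency of which all partials are integer multiples. This is only a toy model of virtual pitch, not a description of how hearing works; the partial frequencies below are hypothetical and `virtual_pitch` is an illustrative name.

```python
from functools import reduce
from math import gcd


def virtual_pitch(partials_hz):
    """Largest frequency (Hz) that divides every partial, for integer partials."""
    return reduce(gcd, partials_hz)


if __name__ == "__main__":
    partials = [600, 800, 1000]  # hypothetical partials, no energy at 200 Hz
    print(f"partials {partials} -> candidate pitch {virtual_pitch(partials)} Hz")
```

In this example none of the partials is at 200 Hz, yet all of them are harmonics (3rd, 4th, 5th) of a 200 Hz fundamental, which is the frequency such a model would report as the missing fundamental.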
Sound identification
Reference: Watkinson, "The Art of Digital Audio", Ch. 2
Location and size
Time domain response
works quickly and is
older in evolutionary
terms (< 1 ms)
Pitch and timbre
Frequency domain response
works more slowly, evolved
later, presumably after speech
evolved (> 10-30 ms)
Listening examples
Courtesy of www.audiocheck.net
Calibration
testtones_hearingtestaudiogram.php
Audiograms require a properly calibrated audio system. As we have no idea how loud your
sound level has been turned to as you listen to our sound files, running an online audiogram
test requires a trick (as imprecise as it is)
First, we need you to adjust your computer's level to match a known reference. Here is the
trick: rub your hands together, in front of your nose, quickly and firmly, and try producing the
same sound as our calibration file. You are now generating a reference sound that is
approximately 65 dBSPL.
High frequency range test (8-22 kHz)
audiotests_frequencycheckhigh.php
A -9 dBFS sweeping sine tone, from 22 kHz (supposedly inaudible) down to 8 kHz (if you can't
hear this one, consider checking your hearing). On top of the test tone, a voiceover tells you
which frequency is currently playing.
Play back the file until you start hearing the underlying high pitch tone as it descends. The
voiceover tells you the frequency you have reached. This frequency more or less represents the
upper limit of your audio system, or your hearing.
Listening examples (cont)
Courtesy of www.audiocheck.net
Dynamic range
http://www.audiocheck.net/audiotests_dynamiccheck.php
Dynamic range represents the ratio between the loudest signal you can hear and the quietest.
Dynamic range is expressed in terms of decibels (dB). Being a ratio, the decibel has no units;
everything is relative. Since it is relative, it must be relative to some reference point that has to
be defined. Our reference point here is the loudest level you can comfortably bear for one
second. This test helps you benchmark the dynamic range of your sound system.
Interestingly, much emphasis is put on 24-bit audio recordings nowadays, with a dynamic range
exceeding 140dB. Our example is only 16-bit, with a maximum dynamic range of 96dB, yet that
should be plenty. Judge for yourself.
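The 96 dB figure quoted for 16-bit audio follows from the ratio between the largest and the smallest step of a linear PCM word. The sketch below assumes this simple ratio model (about 6.02 dB per bit) and ignores dither and noise shaping; `dynamic_range_db` is an illustrative name.

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 log10(2**bits), in dB."""
    return 20.0 * math.log10(2.0 ** bits)


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of amplitude steps and adds roughly 6 dB, which gives about 96 dB at 16 bits and more than 140 dB at 24 bits.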
Sound source localization
Goal: building a sound map of the
objects around us
The first use of hearing from an evolutionary point of view
Positioning along three main directions
Front-back:
frontal plane
Left-right:
median plane
(equidistant from the ears)
Above-below:
horizontal plane
(on which the ears lie)
Fig. 3.20
Sound source position
Expressed by a vector characterized by 2
angles
Azimuth (0° front, 180° back)
Angle between the projection of the vector on the horizontal plane and
the front-back direction
Elevation (-90° below, 90° above)
Angle between the vector
and the horizontal plane
And by a scalar
Distance
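The azimuth, elevation, and distance description can be converted into Cartesian coordinates. This Python sketch assumes a listener-centered frame with x pointing to the front, y to the side, and z upward; the axis convention and the name `source_position` are assumptions made for illustration.

```python
import math


def source_position(azimuth_deg, elevation_deg, distance):
    """Cartesian (x, y, z) of a source given azimuth, elevation and distance.

    Assumed axes: x = front (+) / back (-), y = side, z = above (+) / below (-).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)
    y = distance * math.cos(el) * math.sin(az)
    z = distance * math.sin(el)
    return x, y, z


if __name__ == "__main__":
    print(source_position(0.0, 0.0, 1.0))    # straight ahead
    print(source_position(180.0, 0.0, 1.0))  # directly behind
    print(source_position(0.0, 90.0, 1.0))   # directly above
```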
Directional hearing
Two mechanisms have been identified, both
describing the difference between the sounds at the
two ears
ITD Interaural time difference (time or phase)
IID Interaural intensity difference (intensity or amplitude)
Fig. 3.23
Interaural Time Difference
Detected when a source is not
exactly on the median plane
The distance travelled by the sound to reach the opposite
ear is longer, so the sound arrives later
A precision of one degree (left/right)
can be reached; the ITD grows to about
0.6 msec for a source fully to one side
Works up to about 1000 Hz, where the
wavelength becomes comparable with the
distance between the ears
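The extra path to the far ear can be estimated with a spherical-head model. The sketch below uses the Woodworth approximation, ITD = (r / c) * (theta + sin theta), which is an assumption not taken from the slides; the head radius of 8.75 cm and the speed of sound of 343 m/s are also assumed values.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius in meters
SPEED_OF_SOUND = 343.0   # assumed speed of sound in air, m/s


def itd_seconds(azimuth_deg):
    """Interaural time difference for a far source, 0 <= azimuth_deg <= 90.

    azimuth 0 is straight ahead (on the median plane), 90 is fully to one side.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 1, 30, 60, 90):
        print(f"azimuth {az:2d} deg -> ITD {itd_seconds(az) * 1e6:6.1f} microseconds")
```

Under these assumptions a source on the median plane gives no time difference, a one-degree offset gives a difference of only a few microseconds, and a source fully to one side gives roughly 0.65 ms.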
Interaural Intensity Difference
Defined as a difference in amplitude or in
spectrum, since not all the frequencies of the sound
reach one of the two ears
They are filtered by the head
High frequencies (> 1500 Hz) are reflected
Low frequencies are diffracted and travel around
the listener
Head Related Transfer Function
Transfer function related to the head
Describes all the changes, in phase and amplitude, that the
waveform undergoes on its way to our ears
It is measured with dedicated microphones placed
in the ears of dummy heads
HRTFs are hard to generalize
One person's HRTF applies poorly to another person's perception

The outer ear (pinna) also filters the signal
Its folds make it possible to perceive the elevation of a
sound source
The pinna also helps to tell whether a sound comes from the front or the back
Precedence effect
With two (or more) sound sources in different
positions, the perceived direction corresponds,
below the sound intensity curve, roughly to the first source that
reaches the ears (Haas effect)
Above the curve, the sound source is localized towards the louder
sound
After a delay
of 30 ms an echo starts
to be perceived
Loudspeaker placement
The loudspeakers are placed at the vertices of an equilateral
triangle with the listener
Otherwise the positioning of the sound sources is less
stable
If they are too far apart,
as in a cinema,
it is easy to perceive a hole
in the central area
between the two loudspeakers
Phantom sound images
They are created at an intermediate position between the two
loudspeakers by means of intensity differences
when the time difference is very small
(0.05 < dt < 1.5 msec)
Instead of perceiving two distinct sound sources, the source
appears positioned towards the louder loudspeaker (or at the center
if the intensities are equal)
Caveat
If the listener is not at the correct distance between the two
loudspeakers, the perceived phantom source is not the
intended one
The usable frequency range is limited (< 700 Hz)
Above that, ear spacing and head filtering interfere
Pan Potting
Mathematical formulation
Alpha: perceived angle between the
phantom source and the median plane
Beta: angle subtended by the two loudspeakers
at the listener's position
(relative to the median plane)
Theta: angle that displaces the real
sound source
$\sin\alpha = \frac{L - R}{L + R} \sin\beta$
$\frac{L - R}{L + R} = \tan\theta$
(L, R: amplitudes of the left and right channels)
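A pan pot can be sketched as a pair of channel gains plus a rule that predicts where the phantom source is heard. The example below assumes constant-power (sine/cosine) panning and the stereophonic law of sines; both choices, the sign convention (positive angles towards the left loudspeaker), the 30-degree half aperture, and the function names are assumptions rather than the formulation used in the slides.

```python
import math


def pan_gains(theta_deg):
    """Left and right gains for a pan angle in [-45, 45] degrees.

    0 is the center, +45 is fully left, -45 is fully right (assumed convention).
    The gains satisfy L**2 + R**2 = 1 (constant power).
    """
    x = math.radians(45.0 - theta_deg)
    return math.cos(x), math.sin(x)


def perceived_angle_deg(left, right, half_aperture_deg=30.0):
    """Phantom source angle from the median plane, by the law of sines."""
    ratio = (left - right) / (left + right)
    return math.degrees(math.asin(ratio * math.sin(math.radians(half_aperture_deg))))


if __name__ == "__main__":
    for theta in (-45.0, -20.0, 0.0, 20.0, 45.0):
        left, right = pan_gains(theta)
        print(f"pan {theta:+5.1f} deg: L={left:.3f} R={right:.3f} "
              f"-> perceived {perceived_angle_deg(left, right):+5.1f} deg")
```

With equal gains the phantom image sits on the median plane; with one gain at zero it collapses onto the corresponding loudspeaker, here assumed to be 30 degrees off center.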
Binaural audio
Transposition of conventional stereo channels onto
headphones
Differences
Only the right channel reaches the right ear, and vice versa
There are never time differences between the signals
Binaural audio
Binaural synthesis
Producing real 3D audio effects requires computing the
HRTFs and modifying the spectrum according to the
differences measured for localized sound sources (using
interpolation)
Head tracking
Aimed at
virtual reality
applications
Binaural Effects
The recording is then played back through headphones, so
that each channel is presented independently, without
mixing or crosstalk. Thus, each of the listener's eardrums is
driven with a replica of the auditory signal it would have
experienced at the recording location

Zeno: "Nature has given man one tongue, but two ears,
that we may hear twice as much as we speak"

Binaural effects
Binaural Lateralization A.D. 37 (72,73,74)
An auditory illusion A.D. 39 (80)

Binaural beats or binaural (for both ears)
tones are auditory processing artifacts, that
is apparent sounds, the perception of which
arises in the brain independent of physical
stimuli. This effect was discovered in 1839
by Heinrich Wilhelm Dove.

In nature, two sounds that are similar but
slightly shifted in frequency will beat: their
sum behaves like a single tone at the average
of the two frequencies whose loudness pulsates
at the difference frequency. For example, a 400 Hz tone and a
410 Hz tone will form a 405 Hz tone
pulsating 10 times per second.
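The 400 Hz and 410 Hz example can be verified with the identity sin a + sin b = 2 sin((a + b) / 2) cos((a - b) / 2). The sketch below compares the plain sum of the two tones with a 405 Hz tone multiplied by a slowly varying envelope; the function names are illustrative.

```python
import math


def two_tone_sum(t, f1=400.0, f2=410.0):
    """Sum of two unit-amplitude sine tones at time t (seconds)."""
    return math.sin(2.0 * math.pi * f1 * t) + math.sin(2.0 * math.pi * f2 * t)


def beating_tone(t, f1=400.0, f2=410.0):
    """Same signal written as a tone at the mean frequency times an envelope."""
    carrier = math.sin(2.0 * math.pi * (f1 + f2) / 2.0 * t)
    envelope = 2.0 * math.cos(2.0 * math.pi * (f2 - f1) / 2.0 * t)
    return envelope * carrier


def beats_per_second(f1, f2):
    """Rate of the loudness pulsation: the difference of the two frequencies."""
    return abs(f2 - f1)


if __name__ == "__main__":
    for t in (0.0, 0.0123, 0.05, 0.0777):
        print(f"t={t:.4f}s sum={two_tone_sum(t):+.6f} product={beating_tone(t):+.6f}")
    print(f"beats per second: {beats_per_second(400.0, 410.0):.0f}")
```

The envelope itself oscillates at 5 Hz, but its magnitude peaks twice per cycle, which is why the loudness pulsates 10 times per second.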

The brain produces a similar phenomenon
internally, resulting in low-frequency
pulsations in the loudness of a perceived
sound when two tones at slightly different
frequencies are presented separately, one to
each of a subject's ears, using stereo
headphones. A beating tone will be
perceived, as if the two tones had mixed
naturally, outside the brain. The frequency of
the tones must be below about 1,000 to
1,500 hertz for the beating to be heard. The
difference between the two frequencies must
be small (below about 30 Hz) for the effect
to occur; otherwise the two tones will be
distinguishable and no beat will be
perceived.

Interest in binaural beats can be classified
into two categories. First, they are of
interest to neurophysiologists investigating
the sense of hearing. Second, some
protoscientists believe that binaural beats
may influence the brain in more subtle ways
through the entrainment of brainwaves and
can be used to produce relaxation and other
health benefits.
http://www.feilding.net/sfuad/musi3012-01/demos/audio/
Binaural Lateralization
The most important benefit we derive from binaural
hearing is the sense of localization of the sound source.
Low frequency sounds are lateralized mainly on the basis
of interaural time difference, whereas high frequency sounds
are lateralized mainly on the basis of interaural intensity
differences.
Phase difference. Tones of 500 Hz and then 2000 Hz are
heard with alternating interaural phases of plus and minus
45 degrees. At 500 Hz, the image switches from side to side
as the phase changes. At 2000 Hz, on the other hand, no
such movement is perceived. (250 µs / 62 µs).
D37.BinauralLateralization_PHASE
An auditory illusion
Tones of 400 and 800 Hz alternate in both ears in opposite
phase; that is, when the left ear receives 400 Hz, the right
ear receives 800 Hz. About 99% of listeners hear a single
low-frequency tone in one ear and a high-frequency tone in
the other ear. Quite remarkably, when the headphones are
reversed, most listeners hear the high tone and the low tone
in the same ears as before.
D39.An_Auditory_Illusion

Perception samples
Primary beats
If two pure tones have slightly different
frequencies f1 and f2, where f2 = f1 + delta-f, the phase
difference, p1-p2, changes continuously with time. The
amplitude of the resultant tone varies between A1 + A2
and A1 - A2, where A1 and A2 are the individual
amplitudes. These slow periodic variations in amplitude
at frequency delta-f are called primary beats. Beats are
easily heard when delta-f is less than 10 Hz, and may be
perceived up to about 15 Hz.
Beats are an important contributor to the sensation of
dissonance in music, and form an invaluable perceptual
tool for the tuning of musical instruments.
Secondary beats
Second-order beats: a sensation of beats also
occurs when the frequencies of two tones f1 and f2 are
nearly, but not quite, in a simple ratio. If f2 = 2f1 +
x (mistuned octave), beats are heard at a frequency x. In
general, when f2 = (n/m)f1 + x, mx beats occur each
second. These are called second-order beats or beats of
mistuned consonances because the relationship f2 =
(n/m)f1, where n and m are integers, defines consonant
musical intervals, such as a perfect fifth (3/2), a perfect
fourth (4/3), a major third (5/4), etc.
Binaural samples
The most important benefit we derive from binaural
hearing is the sense of localization of the sound source.
Although some degree of localization is possible in
monaural listening, binaural listening greatly enhances
our ability to sense the direction of the sound source.
Localization includes up-down and front-back
discrimination, but most attention is focused on side-to-side
discrimination or lateralization.
When we listen with headphones, we lose front-back
information, so that lateralization becomes exaggerated;
the image of the source appears to switch from one side
of the head to the other by moving "through the head",
or the sound source appears to be "in the head."
Low frequency sounds are lateralized mainly on the
basis of interaural time difference, whereas high
frequency sounds are lateralized mainly on the basis of
interaural intensity differences.
Phase difference
Tones of 500 Hz and then 2000 Hz are heard with alternating
interaural phases of plus and minus 45 degrees. At 500
Hz, the image switches from side to side as the phase
changes. At 2000 Hz, on the other hand, no such
movement is perceived. (250 µs / 62 µs).
Intensity difference
Tones of 250 and 4000 Hz illustrate the effects
of interaural intensity difference at low and high
frequency. The interaural intensity changes (in 1.25 s)
from 32 dB to -32 dB. In both cases, the image moves
from side to side. (At low frequencies there is little
intensity difference even when the source is located to
one side of the head.)
Vocoder samples
The vocoder is demonstrated with audio samples below.
The audio is from an original Bell Labs recording of
1939.
Vocoder introduction
The introduction to the
vocoder itself has been processed by the vocoder,
demonstrating reasonably good audio quality (by
telephone standards, which emphasize intelligibility and
speaker recognition over audio fidelity).
Unvoiced speech
Whispered speech is generated by setting the
vocoder as if all speech were unvoiced (input to the
synthesis filter is only "hiss-type energy"). For most
languages, speech is fully intelligible in unvoiced
(whispered) form. Indeed, only a few languages, like
Mandarin Chinese, contain linguistic information in the
periodicity of speech.
Voiced speech
Mechanical-sounding speech generated by setting the vocoder as if
all speech were voiced (input to the synthesis filter is
only "buzz-type energy").
Monotone speech
Here, both
voiced and unvoiced sounds are produced, but the
voiced sounds are held at a constant pitch, yielding a
monotone effect.
Bibliography
Lombardo, "Audio e Multimedia"
Ch. 1 - Acustica
Ch. 2
- Sec. 2.3.1: I parametri della percezione
- Sec. 2.4: Localizzazione delle sorgenti sonore

Interesting readings
McClellan, "Signal Processing First"
Ch. 2: Sinusoids
Sethares, "Tuning, Timbre, Spectrum, Scale"
Ch. 2: The Science of Sound (parts)

Tools
Audacity, http://audacity.sourceforge.net/

Audio samples
AES Auditory Demonstrations
http://www.feilding.net/sfuad/musi3012-01/demos/audio/
(unofficial link)

Source code
none
