The inner ear, for example, does significant signal processing in converting
sound waveforms into neural stimulus, so certain differences between waveforms
may be imperceptible.[1] MP3 and other audio compression techniques make use of
this fact.[2] In addition, the ear has a nonlinear response to sounds of
different loudness levels. Telephone networks and audio noise reduction systems
make use of this fact by nonlinearly compressing data samples before transmission,
and then expanding them for playback.[3] Another effect of the ear's nonlinear
response is that sounds that are close in frequency produce phantom beat notes,
or intermodulation distortion products.[4]
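The nonlinear companding mentioned above can be sketched with the μ-law curve used in North American telephone networks (μ = 255 is the standard parameter; the sample value is illustrative):

```python
import math

MU = 255  # mu-law parameter used in North American telephone networks

def compress(x):
    """Nonlinearly boost quiet samples before transmission (x in [-1, 1])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Invert the companding curve for playback."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

sample = 0.01                # a quiet input sample
coded = compress(sample)     # ~0.23: small amplitudes get most of the coded range
restored = expand(coded)     # round-trips back to ~0.01
```

Quiet samples occupy a disproportionately large share of the coded range, matching the ear's greater relative sensitivity at low levels.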
Limits of perception
The human ear can nominally hear sounds in the range 20 Hz to 20,000 Hz (20 kHz).
This upper limit tends to decrease with age, most adults being unable to hear above
16 kHz. The ear itself does not respond to frequencies below 20 Hz, but these can be
perceived via the body's sense of touch. Some recent research has also
demonstrated a hypersonic effect: although sounds above about 20 kHz cannot
consciously be heard, evidence suggests that ultrasonic sounds can induce
changes in EEG (electroencephalogram) readouts of listeners in controlled test
environments. In addition, though we are unable to perceive sounds above 20 kHz,
listeners in the same study gave qualitatively different judgments of sound when
ultrasonic frequencies were present.[5]
Frequency resolution of the ear is about 3.6 Hz within the octave of 1,000–2,000 Hz. That
is, changes in pitch larger than 3.6 Hz can be perceived in a clinical
setting.[6] However, even smaller pitch differences can be perceived through other
means. For example, the interference of two pitches can often be heard as a (low-)
frequency difference pitch. This effect of phase variance upon the resultant sound is
known as 'beating'.
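The beating effect can be shown numerically: summing two tones 2 Hz apart is, by the sum-to-product identity, a single tone at their mean frequency with a slowly pulsing amplitude envelope (the 440/442 Hz pair is an illustrative choice):

```python
import math

f1, f2 = 440.0, 442.0      # two tones 2 Hz apart
beat_rate = abs(f2 - f1)   # perceived beats per second: 2

def mixture(t):
    """Sum of the two tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def envelope(t):
    """The mixture equals a 441 Hz tone whose amplitude swells and
    fades inside this 2 Hz envelope -- the audible 'beat'."""
    return 2 * abs(math.cos(math.pi * beat_rate * t))
```

The envelope passes through zero twice per second, which is heard as a 2 Hz pulsation even though neither component tone changes.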
The semitone scale used in Western musical notation is not a linear frequency scale
but logarithmic. Other scales have been derived directly from experiments on human
hearing perception, such as the mel scale and Bark scale (these are used in studying
perception, but not usually in musical composition), and these are approximately
logarithmic in frequency at the high-frequency end, but nearly linear at the low-
frequency end.
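As a sketch of the logarithmic semitone scale alongside the mel scale (using O'Shaughnessy's common conversion formula, an assumption since the text does not specify which variant it means):

```python
import math

def hz_to_mel(f):
    """O'Shaughnessy's mel-scale formula: roughly linear below
    ~500 Hz, logarithmic above."""
    return 2595 * math.log10(1 + f / 700)

def semitones_between(f_low, f_high):
    """Distance on the (logarithmic) Western semitone scale."""
    return 12 * math.log2(f_high / f_low)
```

By construction hz_to_mel(1000) is about 1000 mel, and any octave (e.g. 440 to 880 Hz) spans exactly 12 semitones regardless of where it sits in the spectrum.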
The "intensity" range of audible sounds is enormous. Our ear drums are sensitive
only to variations in the sound pressure, but can detect pressure changes as small as
2×10⁻¹⁰ atm and as great as or greater than 1 atm. For this reason, Sound Pressure
Level is also measured logarithmically, with all pressures referenced to 1.97385×10⁻¹⁰
atm. The lower limit of audibility is therefore defined as 0 dB, but the upper limit is
not as clearly defined. While 1 atm (191 dB) is the largest pressure variation an
undistorted sound wave can have in Earth's atmosphere, larger sound waves can be
present in other atmospheres, or on Earth in the form of shock waves. The upper limit
is more a question of the limit where the ear will be physically harmed or with the
potential to cause a hearing disability. This limit also depends on the time exposed to
the sound. The ear can be exposed to short periods in excess of 120 dB without
permanent harm — albeit with discomfort and possibly pain; but long term exposure
to sound levels over 80 dB can cause permanent hearing loss.
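These figures follow directly from the logarithmic definition of sound pressure level; a minimal sketch (the 20 µPa reference is the same quantity as the 1.97385×10⁻¹⁰ atm cited above):

```python
import math

P_REF_PA = 2e-5  # 20 micropascals, i.e. about 1.97385e-10 atm

def spl_db(pressure_pa):
    """Sound pressure level in dB relative to the threshold of hearing."""
    return 20 * math.log10(pressure_pa / P_REF_PA)

quietest = spl_db(2e-5)                   # 0 dB: the lower limit of audibility
loudest = spl_db(101325 / math.sqrt(2))   # ~191 dB: RMS of a sine peaking at 1 atm
```

Dividing the 1 atm peak by √2 gives the RMS pressure of an undistorted sine wave, which is how the 191 dB ceiling in the text is obtained.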
A more rigorous exploration of the lower limits of audibility determines that the
minimum threshold at which a sound can be heard is frequency dependent. By
measuring this minimum intensity for testing tones of various frequencies, a
frequency dependent Absolute Threshold of Hearing (ATH) curve may be derived.
Typically, the ear shows a peak of sensitivity (i.e., its lowest ATH) between 1 kHz and
5 kHz, though the threshold changes with age, with older ears showing decreased
sensitivity above 2 kHz.
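The shape of the ATH curve is commonly approximated with Terhardt's formula; a sketch (this particular approximation is an assumption on my part, not something the text specifies):

```python
import math

def ath_db(f_hz):
    """Terhardt's approximation of the absolute threshold of hearing,
    in dB SPL. The curve is high at low frequencies, dips to its
    minimum (greatest sensitivity) near 3-4 kHz, then rises steeply."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)
```

Evaluating it at a few points reproduces the qualitative shape described above: a 100 Hz tone needs roughly 20 dB more level to be heard than a 3.3 kHz tone.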
Robinson and Dadson refined the process in 1956 to obtain a new set of equal-
loudness curves for a frontal sound source measured in an anechoic chamber. The
Robinson–Dadson curves were standardized as ISO 226 in 1986. In 2003, the ISO 226
equal-loudness contours were revised using data collected from 12 international studies.
Overview
Masking effects
Main article: Auditory masking
If two sounds occur simultaneously and one is masked by the other, this is referred to
as simultaneous masking. Simultaneous masking is also sometimes called frequency
masking. The tonality of a sound partially determines its ability to mask other sounds.
A sinusoidal masker, for example, requires a higher intensity to mask a noise-like
maskee than a loud noise-like masker does to mask a sinusoid. Computer models
which calculate the masking caused by sounds must therefore classify their individual
spectral peaks according to their tonality.
Similarly, a weak sound emitted soon after the end of a louder sound is masked by
the louder sound. Even a weak sound just before a louder sound can be masked by
the louder sound. These two effects are called forward and backward temporal
masking, respectively.
'Phantom' fundamentals
Main article: Missing fundamental
Low pitches can sometimes be heard when there is no apparent source or component
of that frequency. This perception is due to the brain interpreting repetition patterns
determined by the differences of audible harmonics that are present.[7] A harmonic
series of pitches that are related as 2×f, 3×f, 4×f, 5×f, etc., gives human hearing the
psychoacoustic impression that the pitch 1×f is present. This phenomenon is used by
some pro audio manufacturers to allow sound systems to seem to produce notes that
are lower in pitch than they are capable of reproducing.[8][9]
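A minimal sketch of the missing-fundamental effect: a waveform containing only the harmonics 2f through 5f still repeats at the period of f, and that repetition rate is what the brain hears as the pitch:

```python
import math

F0 = 100.0          # the absent fundamental, in Hz
period = 1.0 / F0   # 10 ms: the repetition period the brain latches onto

def signal(t):
    """Harmonics 2f, 3f, 4f, 5f only -- no energy at 100 Hz itself,
    yet the waveform repeats every 1/F0 seconds."""
    return sum(math.sin(2 * math.pi * k * F0 * t) for k in range(2, 6))
```

Shifting the signal by one full period leaves it unchanged, even though no component oscillates at 100 Hz.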
Software
The psychoacoustic model provides for high quality lossy signal compression by
describing which parts of a given digital audio signal can be removed (or aggressively
compressed) safely - that is, without significant losses in the (consciously) perceived
quality of the sound.
It can explain how a sharp clap of the hands might seem painfully loud in a quiet
library, but is hardly noticeable after a car backfires on a busy, urban street. This
provides great benefit to the overall compression ratio, and psychoacoustic analysis
routinely leads to compressed music files that are 1/10 to 1/12 the size of high
quality original masters with very little discernible loss in quality. Such compression is
a feature of nearly all modern audio compression formats. Some of these formats
include MP3, Ogg Vorbis, AAC, WMA, MPEG-1 Layer II (used for digital audio
broadcasting in several countries) and ATRAC, the compression used
in MiniDisc and Walkman.
Given that the ear will not be at peak perceptive capacity when dealing with these
limitations, a compression algorithm can assign a lower priority to sounds outside the
range of human hearing. By carefully shifting bits away from the unimportant
components and toward the important ones, the algorithm ensures that the sounds a
listener is most likely to perceive are of the highest quality.
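The bit-shifting idea can be sketched as a hypothetical greedy allocator (the 6 dB-per-bit rule is the standard quantizer approximation; the function and values are illustrative, not any specific codec's algorithm):

```python
def allocate_bits(smr_db, total_bits):
    """Greedy bit allocation sketch: repeatedly give one bit to the
    band with the highest remaining signal-to-mask ratio (SMR).
    Each extra bit of quantizer resolution buys roughly 6 dB of
    noise reduction; bands whose noise is already masked get nothing."""
    bits = [0] * len(smr_db)
    need = list(smr_db)
    for _ in range(total_bits):
        worst = max(range(len(need)), key=lambda i: need[i])
        if need[worst] <= 0:
            break               # all quantization noise already inaudible
        bits[worst] += 1
        need[worst] -= 6.0      # one extra bit ~ 6 dB less noise
    return bits
```

A band with a strongly audible signal above its masking threshold soaks up most of the budget, while a band masked by its neighbors receives no bits at all.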
Perceptual Coding
Use of psychoacoustic principles for the design of audio recording,
reproduction, and data reduction devices makes perfect sense. Audio equipment
is intended for interaction with humans, with all our abilities and limitations of
perception. Traditional audio equipment attempts to produce or reproduce
signals with the utmost fidelity to the original. A more appropriately directed,
and often more efficient, goal is to achieve the fidelity perceivable by humans.
Basically, this means removing the part of an audio signal we cannot hear. This
is the goal of perceptual coders.
Although one main goal of digital audio perceptual coders is data reduction,
this is not a necessary characteristic. Perceptual coding can be used to improve
the representation of digital audio through advanced bit allocation. Also, not all
data reduction schemes are perceptual coders. Some systems,
the DAT 16/12 scheme for example, achieve data reduction by simply reducing
the word length, in this case cutting off four bits from the least-significant side
of the data word, achieving a 25% reduction.
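The word-length reduction described here can be sketched directly (a simplified illustration of least-significant-bit truncation, not the full DAT 16/12 quantization scheme):

```python
def truncate_16_to_12(sample):
    """Drop the 4 least-significant bits of a 16-bit sample, keeping
    12 bits of resolution: a 25% reduction in data, with the lost
    detail becoming quantization noise."""
    return (sample >> 4) << 4
```

Unlike a perceptual coder, this discards the same low-order detail everywhere, regardless of whether the ear could have heard it.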
The Digital Compact Cassette (DCC), developed by Philips, is one of the first
commercially available forms of perceptually coded media. It achieves a 25%
data reduction through the use of the Precision Adaptive Sub-band Coding
(PASC) algorithm. The algorithm contains a psychoacoustical model of
masking effects as well as a representation of the minimum hearing threshold.
The masking function divides the frequency spectrum into 32 equally spaced
bands. Sony's ATRAC system for the MiniDisc format is similar.
Perceptual coders still have room for improvement but are headed in what
seems to be a more intelligent direction. The algorithms are not perfect models
of human perception and cognition. Of course, while the modeling of a
perceptual coder could be over-engineered in the spirit of cognitive science in
order to learn more about human cognition, all that is necessary in perceptual
coding is to develop an algorithm that operationally corresponds to human
auditory perception, not one that physically copies it.
Perceptual Coding
Compression schemes often operate on signal values like the amplitude of speech at a
specific instant (sample) or the intensity of an image at a specific location (pixel) without
regard to the way that the final reproduced signal will be heard or seen by a human user.
This is appropriate for some data such as measurements or text, but it fails to take
advantage of potentially useful information when reconstructing a signal intended for
subjective perception by humans. If, for example, greater compression can be achieved at
the cost only of loss imperceptible by the human ear or eye, then a lossy system can
appear to have as high a performance as a lossless system with far inferior compression.
Compression methods taking advantage of the nature of these phenomena are referred to
collectively as perceptual coding, and seminal work during the past decade promises
significant improvements in compression. Perceptual coding can be accomplished by a
variety of means, but it usually involves using models of human perception, such as a
human auditory system or human visual system model. These models can be quite
complex and their incorporation into compression algorithms quite involved, often
involving cooperative work among psychologists, computer scientists, and engineers. The
potential gains have been estimated at 10-50% improvements in efficiency of
compression with no perceptual distortion. One approach is to transform the raw data
using the perceptual model into features deemed important for perception. It is these
features that are then explicitly compressed and used to reconstruct the signal. Another
approach is to incorporate the perceptual knowledge into the measures of distortion and
fidelity used to design the codes. Regardless of the specific method, sensible
incorporation of quantitative aspects of human perception is likely to provide substantial
improvements in compression performance for speech, audio, images, and video with a
modest increase in cost or complexity.
Psychoacoustics
Psychoacoustics is essentially the study of the perception of sound. This includes how we listen, our
psychological responses, and the physiological impact of music and sound on the human nervous
system.
In the realm of psychoacoustics, the terms music, sound, frequency, and vibration are
interchangeable, because they are different approximations of the same essence. The study of
psychoacoustics dissects the listening experience.
Traditionally, psychoacoustics is broadly defined as “pertaining to the perception of sound and the
production of speech.” The abundant research that has been done in the field has focused primarily on
the exploration of speech and of the psychological effects of music therapy. Currently, however, there
is renewed interest in sound as vibration.
Research on the neurological component of sound is currently attracting many to the field of
psychoacoustics. A growing school of thought — based on the teachings of Dr. Alfred Tomatis —
values the examination of both neurological and psychological effects of resonance and frequencies on
the human body.
Thanks to the ground breaking findings of Dr. Tomatis (1920-2001), we have come to understand the
extraordinary power of the ear. In addition to its critical functions of communication and balance, the
ear's primary purpose is to recycle sound and so recharge our inner batteries. According to Tomatis,
the ear's first function in utero is to govern the growth of the rest of the physical organism. After
birth, sound is to the nervous system what food is to our physical bodies: Food provides nourishment
at the cellular level of the organism, and sound feeds us the electrical impulses that charge the
neocortex. Indeed, psychoacoustics cannot be described at all without reference to the man known as
the “Einstein of the ear.”
Resonance is the single most important concept in understanding the constructive or destructive role
of sound in your life. Entrainment, sympathetic vibration, resonant frequencies, and resonant systems
all fall under the rubric of resonance. Resonance can be broadly defined as “the impact of one
vibration on another.” Literally, it means “to send again, to echo.” To resonate is to “re-sound.”
Something external sets something else into motion, or changes its vibratory rate. This can have
many different effects — some subtle and some not so.
From icebergs to airport construction to the human body, sound waves have the capacity to alter, to
actually shift frequency. Simply put, sound is a powerful — yet often ignored — medium for change.
Another fascinating and important aspect of resonance is the process of entrainment. Entrainment, in
the context of psychoacoustics, concerns changing the rate of brain waves, breaths, or heartbeats
from one speed to another through exposure to external, periodic rhythms.
The most common example of entrainment is tapping your feet to the external rhythm of music. Just
try keeping your foot or your head still when you are around fun, up-tempo rhythms. You will see that
it is almost an involuntary motor response. However, tapping your feet or bopping your head to
external rhythms is just the tip of the iceberg. While your feet might be jitterbugging, your nervous
system may be getting a terrible case of the jitters!
Rhythmic entrainment is contagious: If the brain doesn't resonate with a rhythm, neither will the
breath or heart rate. In this context, rhythm takes on new meanings. Not only is it entertaining, but
rhythmic entrainment is a potent sonic tool as well — be it for motor function or other autonomic
processes such as brainwave, heart, and breath rates. Alter one pulse (such as brain waves) with
music, and the other major pulses (heart and breath) will dutifully follow.
When it comes to the intentional applications of music, the entrainment effect completes the circle of
the chain of vibration: Atomic matter —> vibration —> frequency —> sound —> sympathetic
vibration (resonance) —> entrainment.
Music alters the performance of the nervous system primarily because of entrainment. Entrainment is
the rhythmic manifestation of resonance. With entrainment, a stronger external pulse does not just
activate another pulse but actually causes the latter to move out of its own resonant frequency to
match it.
Understanding the interlocking concepts of resonance and entrainment enables us to grasp the way
external tone and rhythm can heal or create havoc. Sound affects glass and concrete as well as brain
waves, motor response, and organic cells.
Pattern Identification
Simply put, pattern identification is one of the brain’s analytical processes. Identifying a pattern
(visual, auditory, odiferous, kinesthetic) enables cerebral attention to shift from active awareness to
passive acknowledgement. Listening and looking are active functions; hearing and seeing are
passive.
In active listening mode, the middle ear function is highly engaged while the brain seeks to identify a
pattern. Once an auditory pattern is found, passive hearing begins. Habituation sets in and the brain
focuses on other things. There are specific times when active listening or passive hearing is
preferable. Active listening stimulates the nervous system. Passive hearing is neutral or
“discharging.”
Sonic Neuro-Technologies
Representing two distinct approaches to therapeutic sound, filtration/gating (F/G) and binaural beat
frequencies (BBFs) currently define the growing field of “sonic neurotechnologies.” This phrase was
coined by Joshua Leeds to describe the arena of soundwork that depends on the precise mechanical
manipulation of soundwaves to bring about desired changes in the psyche and physical body. Two
diverse approaches to the processing of sound frequencies hold great interest and are used on some
of the audio programs in Sound Remedies.
Filtration/gating (F/G) techniques have been honed in Tomatis clinics worldwide. By gradually gating
and filtering out the lower range of music (sometimes up to 8000 Hz), and then adding the
frequencies back in, a retraining of the auditory processing system occurs. The effects of filtration and
gating are felt on a psychological, neurodevelopmental, and physical level. The application of sound
stimulation has been effective in the remediation of many neurodevelopmental issues. Children and
adults with learning/attention difficulties, developmental delays, auditory processing problems,
sensory integration and perceptual challenges have experienced profound improvement.
Another approach to sound processing is the field of binaural beat frequencies (BBFs). By listening
through stereo headphones to slightly detuned tones (i.e., sound frequencies that differ by a
prescribed number of Hz), sonic brainwave entrainment takes place. Facilitating a specific range of
brainwave states may assist in arenas such as pain reduction, enhanced creativity, or accelerated
learning.
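A minimal sketch of how a binaural beat stimulus is constructed (the carrier and offset values are illustrative choices, not a prescribed protocol):

```python
import math

CARRIER = 200.0   # base tone (Hz) sent to one ear; illustrative value
OFFSET = 10.0     # detune in Hz; the perceived "beat" falls at this rate

def left(t):
    """Pure tone delivered to the left headphone channel."""
    return math.sin(2 * math.pi * CARRIER * t)

def right(t):
    """Slightly detuned pure tone delivered to the right channel."""
    return math.sin(2 * math.pi * (CARRIER + OFFSET) * t)
```

Unlike acoustic beating, the 10 Hz beat never exists in the air: each ear receives a steady pure tone, and the difference is computed neurally, which is why headphones are required.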
These two sonic neurotechnologies — used separately — have roots in neurology, physiology, and
psychology. They must be used carefully and wisely. BBF and F/G soundtracks can be powerful tools.
Consequently, proper consideration must always be afforded.
Please note: Sound products with BBFs or F/G contribute to health and wellness, but they are never
intended to replace medical diagnosis or treatment. Do not drive or operate machinery while listening
to sound programs that use these methodologies.
The therapeutic use of sound, like any new tool, requires discipline, education, and strict observance
of ethical standards. There is currently no established licensure in the use of sonic neurotechnologies.
Therefore the onus of responsibility for handling the changes that occur as a consequence of the
application of these methods (most specifically, filtration/gating) falls on the practitioner. Sound is a
marvelous adjunct to an existing profession. Therapists and educators will do well in performing due
diligence and acquiring proper training.
• Auditory tonal processing (ATP) may be defined as the ability to differentiate between the
tones utilized in language.
• Auditory sequential processing (ASP) is the ability to link pieces of auditory information
together.
Auditory tonal processing is a basis for more complex levels of auditory sequential processing. ASP is
the ability to receive, hold, process, and utilize auditory information using our short-term memory. As
the foundation for short-term memory, ASP is one of the building blocks of thinking.
Sequential processing functions are fundamental to speech, language, learning, and other perceptual
skills. The ability to interpret sound efficiently provides the neurological foundation for these
sequential functions. Per neurodevelopmental specialist Robert J. Doman Jr., “many people who have
experienced auditory processing deficits have seen their sequential functions return and/or improve
when proper tonal processing is restored.”
The primary sound application used in the remediation of impaired tonal processing was created by
Alfred Tomatis. Further discussions cannot take place without absolute acknowledgment of his
pioneering research. The current field of sound stimulation auditory retraining evolves from Tomatis's
discoveries of the powerful effect of filtration and gating of sound.
• Filtration means the removal of specific frequencies from an existing sound recording, be that
the music of Mozart or a recording of a voice. Through the use of sound processing
equipment, it is possible to isolate and mute certain frequency bandwidths. With filtration,
any part of the low, mid, or high end of a recording can be withdrawn and reintroduced at
will. On a visual level, imagine erasing the bottom part of a picture and then eventually
drawing it back in. This is filtration.
• Gating refers to the creation of a random sonic event. This is accomplished by electronically
processing a soundtrack so it unexpectedly jumps between the high and low frequencies.
While not always pretty to listen to, the net effect of this sound treatment is an extensive
exercising of the muscles of the middle ear. The combined process of filtration and gating
creates a powerful auditory workout. And for good reason! The middle ear mechanism must
work very hard to translate the complexity of the “treated” incoming sound.
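The two processes can be sketched schematically (hypothetical helper functions for illustration only; this is not Tomatis's actual clinical protocol):

```python
import random

def filtration_schedule(steps, f_max=8000.0):
    """Filtration sketch: a low-frequency cutoff that climbs toward
    f_max (progressively erasing the bottom of the recording) and
    then walks back down to 0, restoring the full spectrum."""
    up = [f_max * i / steps for i in range(steps + 1)]
    return up + up[-2::-1]

def gate(duration_s, rate_hz=1.0, seed=0):
    """Gating sketch: at each interval, randomly choose whether the
    high band or the low band of the soundtrack passes through,
    producing the unpredictable jumps described above."""
    rng = random.Random(seed)
    return ["high" if rng.random() < 0.5 else "low"
            for _ in range(int(duration_s * rate_hz))]
```

The schedule rises and falls symmetrically, mirroring the remove-then-reintroduce pattern of filtration, while the gate's randomness is what keeps the middle-ear muscles actively tracking the signal.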
“Psychoacoustics” is a brief excerpt from The Power of Sound, published by Healing Arts Press. ©
2001 Joshua Leeds. All rights reserved. Further information about psychoacoustics can be found
in The Power of Sound and other fine books at Sound-Remedies.com.