COMPRESSION USING ADSP2105
http://www.eforu.page.tl/
ABSTRACT
-authors
INTRODUCTION
SPEECH ANALYSIS
‘Phonemes’ are the smallest basic units of sound that can be recognized in
contrast with their environment. The sound that stretches from the centre of one phoneme
to the centre of the next, thereby spanning the transition region, is called a ‘diphone’.
Some common methods of speech analysis are:
♦ Short time Fourier analysis.
♦ Linear prediction coding.
♦ Homomorphic filtering.
Speech compression is a form of data compression designed to reduce the size of speech data files.
Many lossy and lossless algorithms exist to achieve this.
Shannon's lossless source coding theorem is based on the concept of block coding.
To illustrate this concept, let us consider a special information source in which the
alphabet consists of only two letters: A = {a, b}
Here, the letters 'a' and 'b' are equally likely to occur.
An n-th order block code is just a mapping which assigns to each block of n
consecutive characters a sequence of bits of varying length. The following examples
illustrate this concept.
Third-order block code: triplets of characters are mapped to bit sequences of lengths one
through six.
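Here $R = 0.68$ bits/character,
where $R = \frac{1}{n} \sum_{B_n} p(B_n)\, l(B_n)$ bits/character, $p(B_n)$ is the probability of block $B_n$ and $l(B_n)$ is the length in bits of its codeword.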
An example:
Note that 17 bits are used to represent 24 characters, an average of 17/24 ≈ 0.71 bits/character.
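The rate formula can be checked numerically. The following is a minimal Python sketch, not the code table from the original figure: the second-order codebook `SECOND_ORDER`, its block probabilities and its codewords are assumptions made up for illustration, and `block_code_rate` and `encode` are hypothetical helper names.

```python
def block_code_rate(code, n):
    """Return R = (1/n) * sum over blocks of p(B) * l(B), in bits/character.

    `code` maps each n-character block to a (probability, codeword) pair.
    """
    total = sum(p for p, _ in code.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("block probabilities must sum to 1")
    return sum(p * len(word) for p, word in code.values()) / n


def encode(text, code, n):
    """Encode text block by block; len(text) must be a multiple of n."""
    if len(text) % n:
        raise ValueError("text length must be a multiple of n")
    return "".join(code[text[i:i + n]][1] for i in range(0, len(text), n))


# Hypothetical second-order block code: probabilities and codewords are
# illustrative assumptions, not values taken from the paper.
SECOND_ORDER = {
    "aa": (0.45, "0"),
    "bb": (0.45, "10"),
    "ab": (0.05, "110"),
    "ba": (0.05, "111"),
}

rate = block_code_rate(SECOND_ORDER, 2)       # expected average rate
bits = encode("aaaabbbb", SECOND_ORDER, 2)    # actual bits for one string
```

For this assumed table the expected rate is 0.825 bits/character, and the eight-character string is encoded in six bits. The rate only drops below one bit per character because the assumed block probabilities are far from uniform.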
Linear Predictive Coding (LPC) is one of the most powerful speech analysis
techniques, and one of the most useful methods for encoding good quality speech at a low
bit rate. It provides extremely accurate estimates of speech parameters, and is relatively
efficient for computation.
LPC is frequently used for transmitting spectral envelope information, and as such
it has to be tolerant of transmission errors. Historically, digital speech signals are sampled at
a rate of 8000 samples/sec. Typically, each sample is represented by 8 bits (using mu-law).
This corresponds to an uncompressed rate of 64 kbps.
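The 64 kbps figure and the role of mu-law can be illustrated with a short Python sketch. It uses the continuous mu-law companding formula with mu = 255 followed by uniform 8-bit quantization. This is a simplification and not the segmented encoding of a telephony codec, and the function names are hypothetical.

```python
import math

MU = 255              # usual value for 8-bit mu-law companding
SAMPLE_RATE = 8000    # samples per second, as stated in the text
BITS_PER_SAMPLE = 8


def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] using logarithmic companding."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def encode_8bit(x):
    """Clip x to [-1, 1], compand it, and quantize to an integer 0..255."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return int(round((y + 1) / 2 * 255))


def decode_8bit(code):
    """Map an integer 0..255 back to an amplitude in [-1, 1]."""
    return mu_law_expand(code / 255 * 2 - 1)


bit_rate = SAMPLE_RATE * BITS_PER_SAMPLE  # 64000 bits per second
```

Companding spends more of the 256 levels on small amplitudes, which is why 8 bits per sample are usually considered enough for telephone-quality speech.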
MATHEMATICAL MODELLING:
The digital speech signal is the output of a digital filter (called the LPC filter)
whose input is either a train of impulses or a white noise sequence.
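The LPC filter is given by: $H(z) = \dfrac{1}{1 + a_1 z^{-1} + \dots + a_{10} z^{-10}}$
The input-output relationship of the filter is given by the linear difference equation: $s(n) + \sum_{k=1}^{10} a_k\, s(n-k) = u(n)$
The LPC model can be represented in vector form as: $A = \{a_1, a_2, \dots, a_{10}, G, V/UV, T\}$, where $G$ is the gain, $V/UV$ is the voiced/unvoiced flag and $T$ is the pitch period.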
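The all-pole filter H(z) corresponds to the recursion s(n) = u(n) - sum of a_k s(n-k). The Python sketch below runs that recursion on either an impulse train or a white noise sequence. The function names, the default pitch period, the Gaussian noise source, and the choice to apply the gain G as a scale factor on the excitation are all assumptions made for illustration.

```python
import random


def make_excitation(voiced, length, pitch_period=80, seed=0):
    """Impulse train (voiced) or white Gaussian noise (unvoiced)."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(length)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def lpc_synthesize(a, excitation, gain=1.0):
    """Run s(n) = gain * u(n) - sum_{k=1..p} a[k-1] * s(n-k).

    `a` holds the coefficients a_1..a_p of the denominator of H(z).
    Samples before n = 0 are taken to be zero.
    """
    out = []
    for n, u in enumerate(excitation):
        acc = gain * u
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]
        out.append(acc)
    return out


# One 160-sample frame driven by an impulse train, first-order filter only.
frame = lpc_synthesize([-0.5], make_excitation(True, 160, pitch_period=40))
```

With a single coefficient a_1 = -0.5, each impulse produces a decaying response 1, 0.5, 0.25, and so on. A ten-coefficient vector is used in the same way.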
LPC ANALYSIS
With the model written as $s(n) + \sum_{k=1}^{10} a_k\, s(n-k) = u(n)$, the ten LPC parameters $(a_1, a_2, \dots, a_{10})$ are
chosen to minimize the energy of the innovation over one frame: $f = \sum_{n=0}^{159} u^2(n)$.
Using standard calculus, we take the derivative of $f$ with respect to each $a_k$ and set it to zero:
$\partial f / \partial a_k = 0$ for $k = 1, \dots, 10$. From these ten equations the ten coefficients can be determined.
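One common way to solve these ten equations is the autocorrelation method with the Levinson-Durbin recursion. The Python sketch below uses that method as an assumption, since the text does not say which solver was used on the ADSP2105, and the function names are hypothetical. It follows the sign convention above, in which the prediction error is s(n) + sum of a_k s(n-k).

```python
def autocorrelation(frame, max_lag):
    """r[k] = sum_n frame[n] * frame[n-k] for k = 0..max_lag."""
    return [
        sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve sum_k a_k r(|i-k|) = -r(i), i = 1..order.

    Returns ([a_1, ..., a_order], residual_energy).
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining innovation energy
    return a[1:], err


# Test frame: impulse response of s(n) - 0.5 s(n-1) = u(n), 160 samples.
frame = [0.5 ** n for n in range(160)]
coeffs, energy = levinson_durbin(autocorrelation(frame, 2), 2)
```

For this test frame the recursion recovers a_1 close to -0.5 and a_2 close to 0, matching the filter that generated it. For speech, the same two calls would be made with `order=10` on each 160-sample frame.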
a6 to a10 3 each
VOICED OR UNVOICED
Voiced sounds are produced by vibrations of the vocal cords. Their spectrum is
periodic with some fundamental frequency (which corresponds to the pitch). Examples of
voiced sounds include all of the vowels. Unvoiced sounds do not have a fundamental
frequency or a harmonic structure. Instead, they resemble white noise.
For Voiced Sounds (V): The impulse train is shifted (insensitive to phase change).
For Unvoiced Sounds (UV): A different white noise sequence is used.
$E(z) = S(z)\, A(z)$
where $1/A(z)$ is the transfer function of the LPC filter,
$S(z)$ is the original speech signal and $E(z)$ is the error signal.
The spectrum of the error signal E(z) will have a different structure depending on
whether the sound it comes from is voiced or unvoiced. Thus the error signal can be used to
decide whether a frame is voiced or unvoiced.
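One simple way to make this decision is to pass the speech through the inverse filter A(z) and look for a strong periodic peak in the autocorrelation of the error signal. The Python sketch below illustrates that idea only. The lag range, the threshold of 0.5 and the function names are assumptions, not values from the paper.

```python
import random


def lpc_residual(a, speech):
    """Inverse filter: e(n) = s(n) + sum_{k=1..p} a[k-1] * s(n-k)."""
    out = []
    for n, s in enumerate(speech):
        acc = s
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * speech[n - k]
        out.append(acc)
    return out


def periodicity(e, min_lag=20, max_lag=147):
    """Largest normalized autocorrelation of e over the given lag range."""
    r0 = sum(x * x for x in e)
    if r0 == 0.0:
        return 0.0
    max_lag = min(max_lag, len(e) - 1)
    return max(
        sum(e[n] * e[n - k] for n in range(k, len(e))) / r0
        for k in range(min_lag, max_lag + 1)
    )


def is_voiced(e, threshold=0.5):
    """Assumed rule: a strongly periodic error signal means voiced."""
    return periodicity(e) > threshold


# Two synthetic 160-sample error signals for illustration.
pulses = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(160)]
```

The impulse train has a normalized autocorrelation of 0.75 at lag 40, so it is classified as voiced, while white noise has no comparable peak. Practical coders usually combine a periodicity measure with other cues such as frame energy and zero-crossing rate.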
CODEC
A CODEC is the combination of a coder and a decoder. It converts an analog signal to a
digital signal and vice versa.
Encoder:
Decoder
IMPLEMENTATION
5. LPC analysis.
6. Storing of signals.
7. LPC synthesis and de-emphasis of speech signals.
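De-emphasis in step 7 undoes a pre-emphasis filter applied before analysis. The Python sketch below assumes a first-order pre-emphasis y(n) = x(n) - alpha x(n-1) and its inverse. The coefficient 0.9375 and the function names are assumptions, since the surviving steps do not state the filter that was used.

```python
ALPHA = 0.9375  # assumed pre-emphasis coefficient, not given in the text


def pre_emphasis(x, alpha=ALPHA):
    """y(n) = x(n) - alpha * x(n-1): boosts high frequencies before analysis."""
    return [x[n] - alpha * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def de_emphasis(y, alpha=ALPHA):
    """x(n) = y(n) + alpha * x(n-1): inverse filter applied after synthesis."""
    out = []
    prev = 0.0
    for v in y:
        prev = v + alpha * prev
        out.append(prev)
    return out


samples = [1.0, 2.0, -1.0, 0.5]
restored = de_emphasis(pre_emphasis(samples))
```

Applying `de_emphasis` to the output of `pre_emphasis` returns the original samples, which is the property the synthesis stage relies on.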
TEST RESULTS
The hardware designed for speech compression was tested with five different
algorithms: ADPCM, LDCELP, CSACELP, CELP and LPC10.
COMPRESSION TEST
[Figure: COMPRESSION STATUS. Chart of bits (y-axis, 0 to 70000) for each algorithm (x-axis).]
QUALITY TEST
[Figure: QUALITY TEST. Quality in percent (y-axis, 0 to 100) of natural speech and LPC 10 for each testing method (x-axis): DRT, DRT with noise, DAM.]
DELAY TEST
[Figure: DELAY TEST. Delay in ms (y-axis, 0 to 35) for each method (x-axis).]
APPLICATIONS
CONCLUSION
LPC and its derivative methods are superior to the other methods tested: they are
easy to implement, achieve a better compression ratio and give better signal quality. LPC is one
of the most powerful speech analysis techniques and one of the most useful methods for
encoding good-quality speech at a low bit rate, and it provides extremely accurate
estimates of speech parameters. Parallel processors can also be used to process audio and
video signals separately in low-cost real-time applications.