Adaptive Multi-Rate Coder Using ACELP

Guided by: Mr. Vijayendra Desai
Prepared by: Patel Hetal (82914), Nakrani Ankita (6247), Vora Mahesh (5267), Zalavadiya Ashish (7284)
Speech coding is concerned with obtaining a compact digital representation of voice signals for more efficient transmission or smaller storage size.
The objective is to represent the speech signal with the minimum number of bits while maintaining its perceptual quality.
The human vocal apparatus consists of:
- lungs
- trachea (windpipe)
- larynx, which contains two folds of skin called the vocal cords; these blow apart and flap together as air is forced through
- oral tract
- nasal tract
Speech coder: a device that converts speech to digital form.
Types of speech coders:
- Waveform coders: convert any analog signal to digital form.
- Vocoders (parametric coders): try to exploit special properties of the speech signal to reduce the bit rate; they build a model of speech and transmit the parameters of the model.
- Hybrid coders: combine features of waveform coders and vocoders.
Types of speech codecs
- Waveform codecs (e.g. ADPCM): high quality, high bit rate
- Vocoders (e.g. LPC): low bit rate, low quality
- Hybrid codecs (e.g. CELP): medium bit rate, good quality
Waveform codecs
- Sample and code
- High quality and not complex
- Require a large amount of bandwidth
Source codecs (vocoders)
- Match the incoming signal to a mathematical model
- Linear-predictive filter model of the vocal tract
- A voiced/unvoiced flag for the excitation
- The model information is sent rather than the signal
- Low bit rates, but the speech sounds synthetic
- Higher bit rates do not improve the quality much
Hybrid codecs
- Attempt to provide the best of both
- Perform a degree of waveform matching
- Utilize the sound production model
- Quite good quality at low bit rate
Similar to images, we can also compress speech to make it smaller and easier to store and transmit. General compression methods such as DPCM can also be used. More compression can be achieved by taking advantage of the speech production model.
Why Adaptive Multi-Rate (AMR)?
Major challenge in designing a coder:
- High-quality speech throughout a wide variety of channel conditions
- Traditionally, the source/channel bit allocation is fixed
Solution:
- Variable bit rate allocation for the source coder and the channel coder
To satisfy the requirement of a variable bit rate, the quantization parameters of the fixed-rate coders are changed. In CELP (Code Excited Linear Prediction), the size of the codebook, the gain, the linear prediction parameters, etc. are changed to obtain a variable bit rate. CELP suffers from a long processing time because of its stochastic codebook. ACELP (Algebraic CELP) uses an algebraic, well-structured codebook to address this problem.
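As a rough illustration of what "well-structured" can mean, the sketch below builds an excitation vector from a few signed unit pulses placed on interleaved position tracks, so no table of random vectors has to be stored. The 40-sample subframe, the five tracks of eight positions, and the names `TRACKS` and `build_codevector` are assumptions chosen for illustration; they are not taken from the slides and are not the exact AMR pulse layout.

```python
SUBFRAME = 40    # assumed subframe length in samples
NUM_TRACKS = 5   # assumed number of interleaved tracks

# Track t holds positions t, t+5, t+10, ... (8 positions per track).
TRACKS = [list(range(t, SUBFRAME, NUM_TRACKS)) for t in range(NUM_TRACKS)]


def build_codevector(pulses):
    """Build a sparse excitation vector.

    pulses: one (position_index, sign) pair per track, sign is +1 or -1.
    """
    vec = [0] * SUBFRAME
    for track, (pos_index, sign) in zip(TRACKS, pulses):
        vec[track[pos_index]] += sign
    return vec


# One pulse per track: 3 bits select one of 8 positions, 1 bit gives the sign.
bits_per_vector = NUM_TRACKS * (3 + 1)

code = build_codevector([(0, +1), (3, -1), (7, +1), (2, -1), (5, +1)])
nonzero_positions = [i for i, v in enumerate(code) if v != 0]

print(bits_per_vector)     # 20
print(nonzero_positions)   # [0, 13, 16, 29, 37]
```

Because every codevector is fully described by its pulse positions and signs, the index sent to the decoder can be assembled directly from those fields instead of pointing into a stored stochastic table.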
Operating Modes of AMR
Performance Comparison Between some Standardized Coders
The Speech Signal
[Figure: speech waveform with the background signal, the pitch period, and an unvoiced (noise-like) segment labeled]
Speech Waveforms and Spectra
[Figure: labeled speech waveform; time scale marker 100 msec]
- S: silence or background, no speech
- U: unvoiced, no vocal cord vibration (aspiration, unvoiced sounds)
- V: voiced, quasi-periodic speech
Voiced vs. Unvoiced
- Voiced stops are transient sounds produced by building up pressure behind a total constriction in the oral tract and then suddenly releasing the pressure, resulting in a pop-like sound:
  /B/ constriction at the lips
  /D/ constriction at the back of the teeth
  /G/ constriction at the velum
- Unvoiced stops have no vocal cord vibration during the period of closure, which gives a brief period of frication (due to sudden turbulence of the escaping air) and aspiration (steady air flow from the
Pitch and formants

For certain voiced sounds, the vocal cords vibrate (open and close). The rate at which the vocal cords vibrate determines the pitch of the voice.
- For men, the pitch period is 4-20 ms (50-250 Hz)
- For women, the pitch period is 2-8 ms (125-500 Hz)
The resonant frequencies of the vocal tract tube are called formants.
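The relation between pitch period and pitch frequency can be checked with a small autocorrelation-based estimator. This is a minimal sketch, not the method used in the project: the 8 kHz sampling rate, the synthetic three-harmonic test signal, and the function name `estimate_pitch_period` are assumptions made for illustration.

```python
import math

FS = 8000  # assumed sampling rate in Hz


def estimate_pitch_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation."""
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


# Synthetic voiced-like signal: 100 Hz fundamental plus two harmonics.
f0 = 100.0
x = [
    math.sin(2 * math.pi * f0 * n / FS)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / FS)
    + 0.3 * math.sin(2 * math.pi * 3 * f0 * n / FS)
    for n in range(480)
]

# Search pitch periods between 2 ms and 20 ms.
lag = estimate_pitch_period(x, min_lag=FS * 2 // 1000, max_lag=FS * 20 // 1000)
period_ms = 1000.0 * lag / FS
pitch_hz = FS / lag

print(lag, period_ms, pitch_hz)  # 80 10.0 100.0
```

Real speech is only quasi-periodic, so practical estimators usually add normalization and smoothing across frames; those steps are left out here.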
Speech production Model
- Non-uniform probability density function (PDF)
- Non-zero autocorrelation between successive speech samples
- Existence of voiced and unvoiced segments
- Quasi-periodicity of the speech signal
- Speech signals are band-limited, so they can be sampled at a finite rate and reconstructed from those samples
Probability density function (PDF):

The PDF of a speech signal is non-uniform:
- Very high probability of near-zero amplitudes
- Significant probability of very high amplitudes
- A monotonically decreasing function in between
The PDF has a distinct peak at x = 0 because of frequent pauses and low-level speech segments. A non-uniform quantizer attempts to match the distribution of its quantization levels to the PDF of speech.
Autocorrelation function

Adjacent samples within a segment of speech are highly correlated. In every speech sample there is therefore a large component that can be predicted from the previous samples with only a small random error. All differential and predictive coders are designed around this property.
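The effect of this correlation can be seen with a one-tap predictor. The sketch below is illustrative only: it uses a synthetic low-frequency signal as a stand-in for a speech segment, and the 8 kHz sampling rate, the test frequencies, and the helper name `autocorr` are assumptions.

```python
import math

FS = 8000  # assumed sampling rate in Hz

# Stand-in for a voiced segment: two low-frequency sinusoids.
x = [
    math.sin(2 * math.pi * 200 * n / FS)
    + 0.5 * math.sin(2 * math.pi * 450 * n / FS)
    for n in range(800)
]


def autocorr(sig, lag):
    """Unnormalized autocorrelation of sig at the given lag."""
    return sum(sig[n] * sig[n - lag] for n in range(lag, len(sig)))


# Optimal one-tap predictor coefficient: normalized lag-1 autocorrelation.
rho = autocorr(x, 1) / autocorr(x, 0)

# Prediction residual e[n] = x[n] - rho * x[n-1].
residual = [x[n] - rho * x[n - 1] for n in range(1, len(x))]

signal_energy = sum(v * v for v in x[1:])
residual_energy = sum(v * v for v in residual)
gain_db = 10 * math.log10(signal_energy / residual_energy)

print(round(rho, 3), round(gain_db, 1))
```

The residual carries much less energy than the signal itself, which is why a differential coder can spend fewer bits quantizing the residual than quantizing the samples directly.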
Characteristics of the speech signal

Power spectral density function (PSD):
The non-flat PSD of speech makes it possible to obtain significant compression by coding speech in the frequency domain. The long-term average PSD of speech shows that high-frequency components contribute very little to the total speech energy, so coding speech in separate frequency bands can lead to a significant coding gain.
A quantizer removes irrelevance in the signal, and the operation is irreversible.

Amplitude-level quantizers:
1) Uniform quantization
2) Non-uniform quantization: A-law and μ-law companding
3) Adaptive quantization
4) Vector quantization
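A short sketch of μ-law companding shows how a non-uniform quantizer spends more levels on the small amplitudes that dominate speech. It uses the standard continuous μ-law curve with μ = 255 and an 8-bit mid-rise uniform quantizer; the function names and the small-amplitude test grid are assumptions made for illustration, and this is not the segmented encoding used in telephony standards.

```python
import math

MU = 255.0
LEVELS = 256  # 8 bits


def compress(x):
    """mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def uniform_quantize(v):
    """Mid-rise uniform quantizer on [-1, 1] with LEVELS levels."""
    step = 2.0 / LEVELS
    index = min(LEVELS // 2 - 1, max(-LEVELS // 2, math.floor(v / step)))
    return (index + 0.5) * step


def mu_law_quantize(x):
    """Compress, quantize uniformly, then expand."""
    return expand(uniform_quantize(compress(x)))


# Compare the mean absolute error on small amplitudes (0.001 .. 0.05).
small = [i / 1000 for i in range(1, 51)]
uniform_err = sum(abs(x - uniform_quantize(x)) for x in small) / len(small)
mu_err = sum(abs(x - mu_law_quantize(x)) for x in small) / len(small)

print(uniform_err, mu_err)
```

With the same number of bits, the companded quantizer gives a smaller error on low-level samples, at the cost of a coarser step for the rare large amplitudes.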
Vocoders
- Channel vocoder
- Formant vocoder
- Cepstrum vocoder
- LPC
  - LPC vocoder
  - Multipulse-excited LPC
  - Code-excited LPC
The channel vocoder uses an analysis-and-synthesis approach, so the signal has to be analyzed at the transmitter. The analyzer determines the envelope of the speech signal in a number of frequency bands, then samples and encodes these envelopes and multiplexes them with the encoded outputs of the other filters. The voiced/unvoiced decision, the energy information for each band, and the pitch frequency are packed and transmitted.
Speech Production Models
[Figure: physical model and mathematical model of speech production]
LPC Decoder
[Block diagram: LPC decoder. Labels: LPC bit stream; unpack; pitch period index and pitch period decoder; voicing; power index and power decoder; LPC index and LPC decoder; impulse train generator; white noise generator; voiced/unvoiced; gain computation; synthesis filter; de-emphasis; synthesized speech]
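The decoder chain named in the block diagram can be sketched in a few lines: an impulse train or white noise is chosen as excitation, scaled by the gain, passed through an all-pole synthesis filter, and de-emphasized. This is a simplified sketch under stated assumptions: the 160-sample frame, the two hypothetical predictor coefficients in `LPC`, the de-emphasis factor 0.9, and all function names are illustrative and do not come from the slides.

```python
import random


def make_excitation(n, voiced, pitch_period, rng):
    """Impulse train for voiced frames, white noise for unvoiced frames."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n)]
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def synthesis_filter(excitation, lpc, gain):
    """All-pole filter: s[n] = gain*e[n] + sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        s = gain * e
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                s += a_k * out[n - k]
        out.append(s)
    return out


def de_emphasis(x, alpha=0.9):
    """First-order de-emphasis: y[n] = x[n] + alpha * y[n-1]."""
    out, prev = [], 0.0
    for v in x:
        prev = v + alpha * prev
        out.append(prev)
    return out


def decode_frame(voiced, pitch_period, gain, lpc, n=160, seed=0):
    """Synthesize one frame from already-decoded parameters."""
    rng = random.Random(seed)
    exc = make_excitation(n, voiced, pitch_period, rng)
    return de_emphasis(synthesis_filter(exc, lpc, gain))


LPC = [1.3, -0.8]  # hypothetical stable predictor coefficients

voiced_frame = decode_frame(True, 80, 0.5, LPC)
unvoiced_frame = decode_frame(False, 80, 0.5, LPC)
print(len(voiced_frame), voiced_frame[:2])
```

A real decoder would first unpack and dequantize the pitch, power, and LPC indices from the bit stream and would carry the filter memories from one frame to the next; both steps are omitted here.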
Analysis-by-Synthesis Excitation Coding
CELP
[Figure: original speech samples]
CELP Encoder Block Diagram
[Block diagram: CELP encoder. Labels: input speech s(n); buffer and LP analysis; LP parameters; Gaussian excitation codebook with index k; gain; pitch estimate P; pitch synthesis filter P(z) (long-term analysis); excitation e(n); LP synthesis filter (short-term analysis); synthesized speech s*(n); perceptual weighting filter W(z); error energy minimization; excitation parameters; encoder; channel]

Pitch synthesis filter, with P the pitch estimate and \beta the gain:
$$P(z) = \frac{1}{1 - \beta z^{-P}}$$
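The analysis-by-synthesis loop in the block diagram can be sketched as an exhaustive codebook search: each candidate excitation is passed through the LP synthesis filter, the best gain is computed, and the entry with the smallest error energy wins. This is a minimal sketch under stated assumptions: the pitch synthesis filter and the perceptual weighting filter are left out, and the 16-entry Gaussian codebook, the 40-sample subframe, the coefficients in `LPC`, and the function names are hypothetical.

```python
import random


def lp_synthesis(excitation, lpc):
    """All-pole LP synthesis: s[n] = e[n] + sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                s += a_k * out[n - k]
        out.append(s)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error_energy) of the best codebook entry."""
    best = None
    for index, code in enumerate(codebook):
        y = lp_synthesis(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        # Gain that minimizes the squared error for this entry.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best


rng = random.Random(0)
LPC = [1.3, -0.8]  # hypothetical stable predictor coefficients
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(16)]

# Build a target that entry 5 can match exactly with gain 2.0.
target = [2.0 * v for v in lp_synthesis(codebook[5], LPC)]

index, gain, err = search_codebook(target, codebook, LPC)
print(index, round(gain, 6))
```

Every candidate has to be filtered and compared, so the cost grows with the codebook size, which is the processing burden a structured codebook is meant to reduce.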
CELP bit allocation for AMR
Algebraic Code Excited Linear Predictive (ACELP) Coder
Results

Five.wav
Bitrate (kbps)   Process delay   SNR       MSE
4.75             1.4             11.392    3.73E-04
5.15             1.958           12.3201   3.01E-04
5.9              2.484           10.0201   5.11E-04
6.7              3.555           8.7001    6.93E-04
7.4              5.955           7.6269    8.87E-04
7.95             10.234          7.8191    8.49E-04
10.2             70.658          7.3258    9.51E-04
12.2             284.194         6.2923    0.0012

Bitrate (kbps)   Process delay   SNR       MSE
4.75             3.443           1.6833    0.0195
5.15             4.496           1.9458    1.84E-02
5.9              4.978           2.5418    1.60E-02
6.7              7.297           2.7128    1.54E-02
7.4              12.721          2.7049    1.54E-02
7.95             22.891          3.7974    1.20E-02
10.2             165.706         3.7305    1.22E-02
12.2             652.134         4.312     1.07E-02

Bitrate (kbps)   Process delay   SNR       MSE
4.75             7.073           9.0667    4.03E-04
5.15             6.776           9.6937    3.49E-04
5.9              10.589          9.7603    3.44E-04
6.7              15.565          10.1021   3.19E-04
7.4              25.128          10.3836   2.98E-04
7.95             47.284          11.198    2.48E-04
10.2             387.415         11.9593   2.08E-04
12.2             1387.386        12.292    1.92E-04
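For reference, the two quality measures in the tables are commonly computed as follows. The slides do not state the exact formulas or units used, so the definitions below (MSE as the mean squared sample difference, SNR in dB as signal energy over error energy) and the function names are assumptions.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equal-length signals."""
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB: signal energy over error energy."""
    signal = sum(o * o for o in original)
    noise = sum((o - d) ** 2 for o, d in zip(original, decoded))
    return 10.0 * math.log10(signal / noise)


# Toy check: every decoded sample is off by 10 percent.
original = [1.0, -1.0, 1.0, -1.0]
decoded = [0.9, -0.9, 0.9, -0.9]

toy_mse = mse(original, decoded)
toy_snr = snr_db(original, decoded)
print(round(toy_mse, 6), round(toy_snr, 3))  # 0.01 20.0
```

Both measures compare waveforms sample by sample, so they only approximate perceived quality for coders that are tuned with a perceptual weighting filter.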
