SAMPLING OF SPEECH SIGNALS
Speech is a continuous function of a continuous time variable. It is sampled periodically in time, with sampling period T, to produce a sequence of samples x(nT). To obtain a digital representation, these sample values must also be quantized to a finite set of values. Since we are concerned with the digital representation of speech signals, we need to consider the spectral properties of speech. According to steady-state models for the production of vowel and fricative sounds, speech signals are not inherently band limited, although the spectrum does tend to fall off rapidly at high frequencies. Because some energy remains at high frequencies, accurately representing all speech sounds would require a sampling rate greater than 20 kHz.

REVIEW OF THE STATISTICAL MODEL FOR SPEECH
A speech waveform can be represented by an ergodic random process. If the signal x(t) is assumed to be a sample function of a continuous-time random process, then the sequence of samples obtained by sampling it can be thought of as a sample sequence of a discrete-time random process.
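The two steps that turn a waveform into a digital representation, periodic sampling and quantization to a finite set of values, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 200 Hz test tone, the 8 kHz sampling rate, the 8-bit word length, and the uniform mid-rise quantizer are all hypothetical choices, not values taken from these notes.

```python
import math

def sample(x, T, n_samples):
    """Return the sequence x(nT), n = 0 .. n_samples-1, for a function x(t)."""
    return [x(n * T) for n in range(n_samples)]

def quantize(samples, n_bits, x_max):
    """Uniform mid-rise quantizer with 2**n_bits levels over [-x_max, x_max)."""
    levels = 2 ** n_bits
    step = 2.0 * x_max / levels
    out = []
    for s in samples:
        idx = math.floor(s / step)
        # Clip indices that fall outside the quantizer range.
        idx = max(-levels // 2, min(levels // 2 - 1, idx))
        # Each output value is the midpoint of its quantization interval.
        out.append((idx + 0.5) * step)
    return out

# Hypothetical test signal and parameters (assumptions for illustration only).
fs = 8000.0          # sampling rate in Hz
T = 1.0 / fs         # sampling period in seconds
n_bits = 8
x_max = 1.0
samples = sample(lambda t: 0.8 * math.sin(2.0 * math.pi * 200.0 * t), T, 80)
quantized = quantize(samples, n_bits, x_max)

step = 2.0 * x_max / 2 ** n_bits
max_error = max(abs(s - q) for s, q in zip(samples, quantized))
print("step size:", step, "largest quantization error:", max_error)
```

As long as the signal stays inside the quantizer range, the error of this quantizer never exceeds half a step.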
The power spectrum of the sampled signal is an aliased version of the power spectrum of the original signal. Averages such as the mean and variance are the same for the samples as for the original signal.
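Aliasing is easiest to see with a single tone. The sketch below is a minimal illustration, assuming a hypothetical 10 kHz sampling rate and cosine test tones: a 7 kHz tone, which lies above half the sampling rate, produces the same spectral peak as a 3 kHz tone. The direct-summation DFT is used only to keep the example free of external libraries.

```python
import cmath
import math

def sample_tone(freq_hz, fs_hz, n_samples):
    """Return x(nT) for x(t) = cos(2*pi*f*t), with T = 1/fs."""
    T = 1.0 / fs_hz
    return [math.cos(2.0 * math.pi * freq_hz * n * T) for n in range(n_samples)]

def dft_peak_hz(x, fs_hz):
    """Frequency in [0, fs/2] of the largest DFT magnitude (direct summation)."""
    N = len(x)
    best_k, best_mag = 0, -1.0
    for k in range(N // 2 + 1):
        X_k = sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        if abs(X_k) > best_mag:
            best_k, best_mag = k, abs(X_k)
    return best_k * fs_hz / N

# Hypothetical parameters (assumptions for illustration only).
fs = 10_000.0
N = 200
peak_in_band = dft_peak_hz(sample_tone(3_000.0, fs, N), fs)
peak_aliased = dft_peak_hz(sample_tone(7_000.0, fs, N), fs)
print("3 kHz tone peaks at", peak_in_band, "Hz")
print("7 kHz tone peaks at", peak_aliased, "Hz")
```

Once sampled, the two tones cannot be told apart, which is why the signal must be band limited before sampling or sampled fast enough to cover its full bandwidth.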
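PROBABILITY DENSITY ESTIMATION
The probability density of speech amplitudes is estimated by computing a histogram of amplitudes over a large number of samples. A good approximation to the measured probability density is the gamma distribution, given as

$$p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\frac{\sqrt{3}\,|x|}{2\sigma_x}}$$

where \sigma_x is the standard deviation of the speech amplitude.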
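A similar approximation is the Laplacian density, with \sigma_x again the standard deviation of the speech amplitude:

$$p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma_x}}$$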
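The histogram method can be sketched as follows. Since no speech data accompanies these notes, the example draws synthetic samples from a Laplacian distribution (an assumption made purely for illustration) and compares the normalized histogram with the Laplacian density. The function names, bin settings, and sample count are hypothetical; the gamma density is included in its standard unit-area form with standard deviation `sigma`.

```python
import math
import random

def laplacian_pdf(x, sigma):
    """Laplacian density with standard deviation sigma."""
    return math.exp(-math.sqrt(2.0) * abs(x) / sigma) / (math.sqrt(2.0) * sigma)

def gamma_pdf(x, sigma):
    """Gamma density with standard deviation sigma (diverges at x = 0)."""
    k = math.sqrt(3.0)
    return math.sqrt(k / (8.0 * math.pi * sigma * abs(x))) * math.exp(
        -k * abs(x) / (2.0 * sigma))

def histogram_density(samples, lo, hi, n_bins):
    """Normalized histogram: bin centers and estimated density values."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            i = min(int((s - lo) / width), n_bins - 1)
            counts[i] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    density = [c / (len(samples) * width) for c in counts]
    return centers, density

# Synthetic "amplitudes": Laplacian with sigma = 1 (assumption, not speech data).
random.seed(0)
sigma = 1.0
b = sigma / math.sqrt(2.0)  # Laplacian scale parameter
samples = [random.choice((-1.0, 1.0)) * random.expovariate(1.0 / b)
           for _ in range(100_000)]

lo, hi, n_bins = -4.0, 4.0, 40
centers, density = histogram_density(samples, lo, hi, n_bins)
max_err = max(abs(d - laplacian_pdf(c, sigma)) for c, d in zip(centers, density))
area = sum(density) * (hi - lo) / n_bins
print("largest deviation from the Laplacian density:", max_err)
```

With real speech, `samples` would be replaced by the amplitudes of a long recording normalized by their standard deviation, and both `laplacian_pdf` and `gamma_pdf` could be overlaid on the histogram to judge the fit.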
The autocorrelation function and power spectrum of speech signals can be estimated by standard time-series analysis methods. An estimate of the autocorrelation function can be obtained by computing the time-average autocorrelation function over a long segment of the signal. The power spectrum can be estimated in a variety of ways:
1. By measuring the average output of a set of bandpass filters.
2. By estimating the long-term average power spectrum.
3. By computing the power transfer function of a recursive digital filter.
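A minimal sketch of the time-average autocorrelation estimate is given below, followed by one standard way of turning it into a power spectrum estimate: taking the Fourier transform of the windowed autocorrelation. That spectral step is a common textbook technique chosen here for illustration and is not necessarily the procedure these notes have in mind. The test signal is a synthetic first-order recursive process with a hypothetical coefficient of 0.9, standing in for a long segment of speech.

```python
import math
import random

def autocorrelation(x, max_lag):
    """Time-average estimate phi(m) = (1/N) * sum_n x(n) x(n+m), m = 0..max_lag."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) / N
            for m in range(max_lag + 1)]

def power_spectrum(phi, n_freqs):
    """Fourier transform of the Hamming-windowed autocorrelation on [0, pi]."""
    M = len(phi) - 1
    w = [0.54 + 0.46 * math.cos(math.pi * m / M) for m in range(M + 1)]
    spectrum = []
    for k in range(n_freqs):
        omega = math.pi * k / (n_freqs - 1)
        # phi is even in m, so the transform reduces to a cosine sum.
        s = w[0] * phi[0] + 2.0 * sum(
            w[m] * phi[m] * math.cos(omega * m) for m in range(1, M + 1))
        spectrum.append(s)
    return spectrum

# Synthetic stand-in for speech (assumption): x(n) = a*x(n-1) + white noise.
random.seed(1)
a = 0.9
x = [0.0]
for _ in range(20_000):
    x.append(a * x[-1] + random.gauss(0.0, 1.0))

phi = autocorrelation(x, max_lag=50)
rho1 = phi[1] / phi[0]              # normalized autocorrelation at lag 1
spectrum = power_spectrum(phi, n_freqs=65)
print("normalized autocorrelation at lag 1:", rho1)
print("estimate at omega = 0:", spectrum[0], " at omega = pi:", spectrum[-1])
```

For this process the normalized autocorrelation at lag 1 should come out close to the coefficient `a`, and the spectrum estimate should be much larger at low frequencies than at high frequencies. The window keeps the noisy high-lag autocorrelation values from dominating the estimate.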