1 Pratik Maheshwari, 2 Mihir Kulkarni, 3 Amol Madane, 4 K.T. Talele
Figure 2 Example speech signal after it has been cleaned.

Speaker Recognition
The speaker recognition process relies heavily on frequency analysis. This is possible because each person's voice has unique characteristics that can be isolated in the frequency domain.

1. The first test measures the length of speech. This test is conducted mainly to eliminate speech segments that contain too much or too little data. If the sample is too short, it was most likely recorded because background noise near the microphone tricked the computer into starting to record. In that case the results are ignored. If the sample is too long, two scenarios are possible: either background noise is causing the recording to run continuously, or the user is issuing too many commands too quickly. In both cases the other tests will still attempt to identify the speech sample. The sample passes the first test only if its length is within a percentage threshold of the length of the template. This test is mainly designed to prevent false positives.

2. The second test uses the maximum of the time-domain cross-correlation.

3. MEAN SQUARE ERROR
The formula for MSE is given by

$$Y = \frac{1}{N} \sum_{i=1}^{N} x_i^2$$

Equation 2

In this test we first find the autocorrelation of the original signal, and then the cross-correlation of the original voice and the test signal. We then compute the MSE of the difference between the two resulting outputs using the formula above. If the MSE is above a certain threshold, the sample passes the test.

4. Discrete-time Fourier transform

$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}$$

Equation 3 Discrete-time Fourier transform

The remaining three tests all examine the signal in the frequency domain, specifically the power spectrum of the signals. First the Fourier transform is computed using Equation 3. It is convenient to have a standard length for the power spectrum computation when comparing various signals. The length of the power spectrum is equal to the length of the signal in the time domain when computed with the method described in Equation 3. A plot of the power spectrum obtained with this function can be seen in Figure 4.
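The power spectrum computation and the mean square error of Equation 2 can be illustrated with a minimal Python sketch. It is not the implementation used in this work. It evaluates the sum in Equation 3 at a finite number of equally spaced frequencies (a direct discrete Fourier transform) and assumes that spectra are normalized by their peak value. The function names, the `n_points` parameter used to force a standard length, and the synthetic sine-wave inputs are all hypothetical.

```python
import cmath
import math


def power_spectrum(signal, n_points=None):
    """Evaluate Equation 3 at n_points equally spaced frequencies; return |X|^2.

    n_points is a hypothetical parameter: the signal is zero-padded or
    truncated to that length so that all spectra have the same size.
    """
    if n_points is None:
        n_points = len(signal)
    # Zero-pad or truncate to the standard length.
    x = list(signal[:n_points]) + [0.0] * max(0, n_points - len(signal))
    spectrum = []
    for k in range(n_points):
        omega = 2.0 * math.pi * k / n_points
        # X(e^{j omega}) = sum over n of x[n] * e^{-j omega n}
        X = sum(x[n] * cmath.exp(-1j * omega * n) for n in range(n_points))
        spectrum.append(abs(X) ** 2)
    return spectrum


def normalize(values):
    """Scale so the largest value is 1 (assumed normalization scheme)."""
    peak = max(values)
    return [v / peak for v in values] if peak > 0 else list(values)


def mean_square_error(a, b):
    """Equation 2 applied to the element-wise difference of a and b."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


if __name__ == "__main__":
    n = 64
    # Synthetic stand-ins for a template, a quieter copy, and a different signal.
    template = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
    quieter = [0.25 * s for s in template]
    other = [math.sin(2 * math.pi * 12 * i / n) for i in range(n)]

    p_template = normalize(power_spectrum(template, n))
    p_quieter = normalize(power_spectrum(quieter, n))
    p_other = normalize(power_spectrum(other, n))

    print("template vs quieter copy:", mean_square_error(p_template, p_quieter))
    print("template vs other signal:", mean_square_error(p_template, p_other))
```

In this sketch the quieter copy produces an error close to zero because the peak normalization removes the amplitude difference, while the signal at a different frequency produces a clearly larger error. The direct sum costs O(N²) operations; a fast Fourier transform routine would normally replace it for real recordings.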
III. CONCLUSION
We successfully implemented a speaker verification system.
We have tested this prototype system under different
environmental conditions. We observed that efficiency is 100%
in a noise-free environment. Because normalized coefficients are
used for comparison, the system works irrespective of the
input signal amplitude. When the signal level is weak,
system performance degrades. Authentication based on the Mean
Square Error of the PSD and the correlation tests was found to be the
most accurate. We have considered additional tests to
improve the efficiency and to avoid the possibility of false
acceptance.
V. BIOGRAPHIES