Presented by
Amr Medhat
Computer Engineering Department, Cairo University, 22-10-2005
Speech Why??
Speech How??
[Figure: communication model. A Sender passes a Message, encoded as Signal + Protocol, over a Channel affected by Noise to a Receiver.]
Computer Analogy
[Figure: Speech Production (text to speech) corresponds to Speech Synthesis (TTS); Speech Perception (speech to text) corresponds to speech recognition (ASR).]
Grammar
Lexicon
Phone Models
Recognizer Characteristics
Discrete words / continuous speech
Read / spontaneous speech
Speaker dependent / independent
Small / large vocabulary
Finite state / context sensitive language model
What to study
Phonetics and Phonology (Linguistics)
Speech Signal Processing (DSP)
Pattern Recognition (AI)
Hidden
Phonetics
Phonetics: the study of the production, perception, and physical properties of speech sounds.
Phonology: describes the way sounds function within a given language and how they are combined and organized.
Phoneme: the smallest phonetic unit in a language that is capable of conveying a distinction in meaning.
E.g. boat-bought, car-jar
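The phoneme definition above can be illustrated with a small check for minimal pairs, that is, words whose pronunciations differ in exactly one phoneme. This is only a sketch: the tiny `LEXICON` and the helper names are hypothetical, and the transcriptions use ARPAbet-style symbols as an assumed notation, not one taken from the slides.

```python
# Hypothetical mini-lexicon: word -> phoneme sequence (ARPAbet-style symbols,
# an assumed notation chosen for illustration).
LEXICON = {
    "boat": ["B", "OW", "T"],
    "bought": ["B", "AO", "T"],
    "car": ["K", "AA", "R"],
    "jar": ["JH", "AA", "R"],
}


def differing_positions(phones_a, phones_b):
    """Return indices where two equal-length phoneme sequences differ.

    Returns None when the lengths differ, since this simple check only
    compares sequences position by position.
    """
    if len(phones_a) != len(phones_b):
        return None
    return [i for i, (a, b) in enumerate(zip(phones_a, phones_b)) if a != b]


def is_minimal_pair(word_a, word_b, lexicon=LEXICON):
    """True if the two words differ in exactly one phoneme."""
    diffs = differing_positions(lexicon[word_a], lexicon[word_b])
    return diffs is not None and len(diffs) == 1


if __name__ == "__main__":
    for pair in [("boat", "bought"), ("car", "jar"), ("boat", "jar")]:
        print(pair, is_minimal_pair(*pair))
```

In this sketch, boat-bought differ only in the vowel and car-jar only in the initial consonant, which is what makes each a single-phoneme contrast.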
Sampling
Rate: e.g. 16 kHz
Sample size: e.g. 16 bits
Format: PCM (.wav files)
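To make the sampling parameters concrete, the sketch below writes one second of a synthetic tone as 16 kHz, 16-bit PCM into an in-memory .wav file using Python's standard `wave` module, then reads the header back. The mono channel count, the 440 Hz test tone, and the helper names are assumptions made for illustration; the slides specify only the rate, sample size, and format.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second (16 kHz)
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)
CHANNELS = 1          # mono: an assumption, not stated in the slides


def data_rate_bytes_per_second():
    """Raw PCM data rate implied by the parameters above."""
    return SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS


def synth_tone(freq_hz=440.0, seconds=1.0):
    """Generate a sine tone as little-endian signed 16-bit PCM bytes."""
    n_samples = int(SAMPLE_RATE * seconds)
    amplitude = 0.5 * 32767  # half of full scale to avoid clipping
    return b"".join(
        struct.pack(
            "<h",
            int(amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)),
        )
        for i in range(n_samples)
    )


def write_wav(pcm_bytes):
    """Wrap raw PCM bytes in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm_bytes)
    return buf.getvalue()


def describe_wav(wav_bytes):
    """Read the sampling parameters back from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return {
            "channels": r.getnchannels(),
            "sample_width_bytes": r.getsampwidth(),
            "sample_rate_hz": r.getframerate(),
            "n_frames": r.getnframes(),
        }


if __name__ == "__main__":
    print(data_rate_bytes_per_second(), "bytes of PCM data per second")
    print(describe_wav(write_wav(synth_tone())))
```

At these settings one second of mono speech occupies 16,000 samples of 2 bytes each, so the storage cost grows linearly with both the sampling rate and the sample size.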
Spectrogram
HMM
Tools
Audio Editing
Cool Edit
GoldWave
Sound Forge
HTK
MATLAB
Microsoft SAPI SDK
Java Speech API
ISIP ASR Toolkit
Torch (machine learning tool)
ASR
Speech Recognition