
Speech Enhancement |
| Introduction |
What is Speech Enhancement? Speech enhancement is the term used to describe algorithms or devices whose purpose is to improve some perceptual aspect of speech for the human listener, or to improve the speech signal so that it may be better exploited by other speech processing algorithms. The development and widespread deployment of digital communication systems during the last twenty years have brought increased attention to the role of speech enhancement in speech processing problems (see Lim 1979, Makhoul 1989, O'Shaughnessy 1989, Boll 1992, Ephraim 1992). Speech enhancement algorithms have been applied to problems as diverse as correction of reverberation, pitch modification, rate modification, reconstruction of lost speech packets in digital networks, correction of so-called "hyperbaric" speech produced by deep-sea divers breathing a helium-oxygen mixture, and correction of speech that has been distorted by pathological problems of the speaker. However, noise reduction is probably the most important and most frequently encountered speech enhancement problem. The removal of noise from degraded speech is the problem addressed on this page. |
| Objectives |
| An analysis/synthesis technique based on harmonic sinusoidal modeling of speech is used to develop a new hidden Markov model (HMM) based speech enhancement algorithm. State sequence estimation is done using a standard HMM-based approach. State-based enhancement is carried out by assuming a harmonic model for speech, i.e., by representing each block of speech as a sum of sine waves described by a set of amplitudes, phases, and harmonically related frequencies. Given the maximum a posteriori probability (MAP) state sequence, the amplitudes, phases, voicing, and fundamental frequencies are estimated. Compared to the standard HMM-based approach, the proposed algorithm was found to reduce the structured residual noise normally associated with HMM-based algorithms. |
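The harmonic model described above can be illustrated with a small sketch. It is not the estimator used in this project: it assumes a single fully voiced frame whose fundamental frequency is already known, and it replaces the HMM/MAP machinery with an ordinary least-squares fit. The function names `synthesize_harmonic` and `fit_harmonic`, the sampling rate, and all parameter values are hypothetical choices made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def synthesize_harmonic(amps, phases, f0, fs, n):
    """Build one frame as a sum of harmonically related sinusoids:
    s[t] = sum_k A_k * cos(2*pi*k*f0*t/fs + phi_k), k = 1..K."""
    amps = np.asarray(amps, dtype=float)
    phases = np.asarray(phases, dtype=float)
    t = np.arange(n) / fs                          # time axis in seconds
    k = np.arange(1, len(amps) + 1)[:, None]       # harmonic numbers 1..K
    return np.sum(amps[:, None] * np.cos(2 * np.pi * k * f0 * t + phases[:, None]), axis=0)

def fit_harmonic(frame, f0, fs, num_harmonics):
    """Least-squares estimate of harmonic amplitudes and phases.
    Assumption: f0 is known and the frame is fully voiced."""
    t = np.arange(len(frame)) / fs
    k = np.arange(1, num_harmonics + 1)
    arg = 2 * np.pi * f0 * np.outer(t, k)          # shape (n, K)
    basis = np.hstack([np.cos(arg), np.sin(arg)])  # shape (n, 2K)
    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    a, b = coef[:num_harmonics], coef[num_harmonics:]
    # A*cos(x + phi) = (A*cos(phi))*cos(x) - (A*sin(phi))*sin(x)
    amps = np.hypot(a, b)
    phases = np.arctan2(-b, a)
    return amps, phases

# Illustrative values only: a 20 ms frame at 8 kHz with a 200 Hz fundamental.
fs, f0, n = 8000, 200.0, 160
true_amps = np.array([1.0, 0.5, 0.25])
true_phases = np.array([0.3, -1.0, 2.0])
clean = synthesize_harmonic(true_amps, true_phases, f0, fs, n)

rng = np.random.default_rng(0)
noisy = clean + 0.05 * rng.standard_normal(n)      # additive white noise

est_amps, est_phases = fit_harmonic(noisy, f0, fs, num_harmonics=3)
enhanced = synthesize_harmonic(est_amps, est_phases, f0, fs, n)
```

Resynthesizing from the fitted parameters keeps only the part of the noisy frame that lies in the span of the harmonic basis, which is why the result is closer to the clean frame than the noisy input. In a real system the fundamental frequency and voicing would also have to be estimated, as the description above notes.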
| Publications |
Harmonic Modeling," ICASSP'97, Vol. 2, p. 1175, April 21-24, 1997, Munich, Germany. |
| Research Group |
The people associated with this research project include: Dr. Andreas Spanias, Principal Investigator; Michael E. Diesher (currently with Intel) |
| Affiliations |
Other sites related to this project include the following: Speech and Audio Processing Lab; Digital Signal Processing Lab; Telecommunications Research Center; College of Engineering and Applied Sciences; Arizona State University, Tempe, Arizona 85287-7206, USA
| Contacts |
For further information, direct all correspondence to: Dr. Andreas S. Spanias <spanias@asu.edu> |