CHiME aims to build a speech recogniser that can operate reliably in everyday 'acoustically cluttered' environments. The research will build on an existing framework known as speech fragment decoding. This technique, inspired by the scene-analysis account of auditory perception, operates in two stages: first, signal processing techniques are used to split the acoustic mixture into local time-frequency fragments of individual sound sources; second, statistical models are employed to select fragments belonging to the sound source of interest while rejecting fragments coming from distracting sound sources.
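The second stage can be illustrated with a deliberately small sketch. It assumes the first stage has already produced the fragments (here, sets of time-frequency cells), and it scores each speech/background labelling with a simplified missing-data likelihood: cells labelled as speech are matched directly against a speech model, while cells labelled as background only require the speech energy to lie below the observed energy. All names, the single fixed speech model, the Gaussian form and the default standard deviation are assumptions made for illustration; a real decoder would search jointly over labellings and model state sequences.

```python
import math
from itertools import product


def gauss_logpdf(x, mean, std):
    """Log density of a Gaussian: used when a cell is labelled as speech."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def gauss_logcdf(x, mean, std):
    """Log P(speech energy <= x): used when speech is assumed to be masked."""
    cdf = 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))
    return math.log(max(cdf, 1e-300))  # floor avoids log(0)


def score_labelling(obs, fragments, labels, model_mean, model_std):
    """Score one assignment of fragments to speech (True) or background (False)."""
    speech_cells = set()
    for fragment, is_speech in zip(fragments, labels):
        if is_speech:
            speech_cells |= fragment
    total = 0.0
    for cell, energy in obs.items():
        if cell in speech_cells:
            total += gauss_logpdf(energy, model_mean[cell], model_std)
        else:
            total += gauss_logcdf(energy, model_mean[cell], model_std)
    return total


def decode(obs, fragments, model_mean, model_std=0.5):
    """Exhaustively try every labelling and return the best-scoring one."""
    return max(
        product([False, True], repeat=len(fragments)),
        key=lambda labels: score_labelling(
            obs, fragments, labels, model_mean, model_std
        ),
    )


# Toy scene: cells are (channel, frame) pairs with log-energy observations.
obs = {(0, 0): 5.0, (0, 1): 5.0, (1, 0): 8.0, (1, 1): 8.0}
model_mean = {(0, 0): 5.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 1.0}
fragments = [{(0, 0), (0, 1)}, {(1, 0), (1, 1)}]
labels = decode(obs, fragments, model_mean)
```

In this toy scene the first fragment matches the speech model and is kept, while the second is far louder than the model expects and is rejected as a distracting source. The exhaustive loop over labellings is only workable for a handful of fragments.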
The project has outlined a number of key objectives which will extend the fragment decoding framework in directions needed to bridge the gap between theory and real applications:
- Segmentation models: modelling the processes that track signal properties (e.g. location and pitch) across time and frequency to group isolated sound source fragments.
- Model combination: techniques for describing complex acoustic scenes by combining models of individual sources.
- Efficient search: how can multiple models be combined without a combinatorial explosion of the search space?
- Adaptation: developing always-on systems that learn as they listen.
- Demonstration: deploying fragment decoding in a real-time distant-microphone speech-driven interface.
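To give a sense of scale for the efficient search objective, the short sketch below counts the fragment labellings a naive decoder would have to consider. It assumes each fragment is assigned independently to exactly one source, which is a simplification; the function name and the example sizes are illustrative only.

```python
def count_labellings(n_fragments, n_sources=2):
    """Number of ways to assign each fragment to exactly one source."""
    return n_sources ** n_fragments


# Growth of the naive search space for speech vs. background (2 sources)
# and for a scene described by three source models.
for n in (10, 20, 40):
    print(n, count_labellings(n), count_labellings(n, n_sources=3))
```

Even before model state sequences are taken into account, the count grows exponentially with the number of fragments, which is why exhaustive enumeration is not an option for realistic scenes.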
The CHiME speech recognition systems will be evaluated on a recognition task that simulates a speech-driven home automation system. Binaural (i.e. stereo) audio data is being recorded in a number of noisy domestic spaces (living rooms, dining rooms, kitchens). By using impulse responses carefully recorded in the same rooms, a standard speech recognition evaluation corpus (the Grid corpus) will be mixed into the data as though it had been recorded in the rooms themselves. This will allow the construction of a corpus of realistic and yet carefully controlled noisy utterances, the CHiME corpus.
The data will form the basis of an open ASR competition (the CHiME challenge) that will allow the CHiME approach to be compared with external competing systems. Details of the challenge will be announced later in 2010.
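The mixing step described above can be sketched as follows: a clean utterance is convolved with a left and a right room impulse response and then added to a stereo background recording at a chosen position. This is a minimal pure-Python illustration rather than the project's actual tooling. The function names, the `offset` and `gain` parameters and the toy sample values are assumptions, and the two impulse responses are assumed to have equal length.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix_into_background(speech, ir_left, ir_right, background, offset, gain=1.0):
    """Place reverberated speech into a stereo background.

    `background` is a list of (left, right) sample pairs; `offset` is the
    sample index at which the utterance starts. Speech that would run past
    the end of the background is truncated.
    """
    left = convolve(speech, ir_left)
    right = convolve(speech, ir_right)
    mixture = [list(frame) for frame in background]
    for n, (l, r) in enumerate(zip(left, right)):
        t = offset + n
        if t >= len(mixture):
            break
        mixture[t][0] += gain * l
        mixture[t][1] += gain * r
    return [tuple(frame) for frame in mixture]


# Toy example: a three-sample "utterance" and two-tap impulse responses.
speech = [1.0, 0.0, 2.0]
ir_left = [1.0, 0.5]
ir_right = [0.5, 0.0]
background = [(0.0, 0.0)] * 6
mixture = mix_into_background(speech, ir_left, ir_right, background, offset=1)
```

Varying `gain` would be one simple way to control the signal-to-noise ratio of the resulting utterances, although how the corpus itself sets its noise levels is not specified here.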