Robust ASR with Missing Data using Neural Networks

Research Student: Shahla Parveen
Supervisor: Phil Green

Speech recognition systems perform reasonably well when training and recognition conditions are controlled and matched. However, performance deteriorates as the mismatch between training and recognition conditions increases. Much current research focuses on making speech recognition systems more robust to different types of mismatch, e.g. environmental changes and speaker variability. Common techniques for robust Automatic Speech Recognition (ASR) include multiband recognition, speaker/noise adaptation, speech enhancement and model adaptation.

Missing data techniques offer a more principled way of dealing with data corrupted by additive noise, and they do not require ASR systems to be trained on noisy data. They work by identifying the reliable regions of the data and adapting the recognition algorithm so that classification is based on those regions.
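As an illustration of classifying on reliable regions only, the following is a minimal sketch and not the project's implementation. It assumes each class is modelled by a single diagonal-covariance Gaussian and that a binary reliability mask is already available; the function names (masked_log_likelihood, classify) and the toy values are hypothetical. Unreliable components are simply left out of the likelihood, which for a diagonal Gaussian is the same as marginalising over them.

```python
import numpy as np

def masked_log_likelihood(x, mask, means, variances):
    """Log-likelihood of x under each class, using reliable components only.

    x:         (n_dims,) observed feature vector
    mask:      (n_dims,) boolean, True where the component is judged reliable
    means:     (n_classes, n_dims) per-class Gaussian means (assumed model)
    variances: (n_classes, n_dims) per-class diagonal variances (assumed model)
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Per-component log density for every class, shape (n_classes, n_dims).
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)
                      + (x - means) ** 2 / variances)
    # Summing over reliable components only drops the unreliable ones,
    # i.e. marginalises them out under the diagonal-covariance assumption.
    return log_pdf[:, mask].sum(axis=1)

def classify(x, mask, means, variances):
    """Index of the class with the highest masked log-likelihood."""
    return int(np.argmax(masked_log_likelihood(x, mask, means, variances)))

# Toy example: the second component is swamped by noise.
means = np.array([[0.0, 0.0], [4.0, 40.0]])
variances = np.ones((2, 2))
x = np.array([0.0, 50.0])

all_reliable = classify(x, [True, True], means, variances)
first_only = classify(x, [True, False], means, variances)
```

In this toy case the noisy second component pulls the decision towards the second class when every component is trusted, while masking it out lets the clean first component decide.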

Present missing data ASR techniques developed at Sheffield use continuous density Hidden Markov Models (CDHMMs), which are generative models and cannot give direct estimates of the posterior probabilities of the classes given the acoustics. Neural networks, unlike HMMs, are discriminative models that give more accurate estimates of these posterior probabilities, and they have been used in hybrid ANN/HMM speech recognition systems. The goal of this project is to investigate neural networks for missing data ASR.
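To make the role of posterior probabilities concrete, here is a small sketch of the conversion commonly used in hybrid ANN/HMM systems: the network's class posteriors are divided by the class priors to give scaled likelihoods that an HMM decoder can use in place of state emission likelihoods. The function names, the epsilon guard and the numbers are illustrative assumptions; nothing here is taken from the project itself.

```python
import numpy as np

def softmax(logits):
    """Turn network output activations into class posteriors P(q | x)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def scaled_log_likelihoods(posteriors, priors, eps=1e-12):
    """log P(q | x) - log P(q), which equals log p(x | q) up to a term
    that does not depend on the class q."""
    posteriors = np.asarray(posteriors, dtype=float)
    priors = np.asarray(priors, dtype=float)
    # eps is an assumed guard against log(0), not part of the method itself.
    return np.log(posteriors + eps) - np.log(priors + eps)

# One frame, two classes: equal posteriors but unequal priors (made-up values).
frame_posteriors = softmax(np.array([[0.0, 0.0]]))
class_priors = np.array([0.25, 0.75])
scores = scaled_log_likelihoods(frame_posteriors, class_priors)
```

With equal posteriors, the rarer class receives the higher scaled likelihood, because the prior that the network absorbed during training has been divided out.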