RESPITE: Events: Meeting, Jan 2001: Presentations: Jon Barker
The performance of Missing Data ASR systems depends largely on how reliably we can estimate the probability that each spectrotemporal 'pixel' is uncontaminated by noise. In the past we have based these probability estimates on simple local SNR estimates derived from a stationary noise assumption.
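As a rough illustration of the baseline idea, the sketch below turns a local SNR estimate into a soft 'pixel is uncontaminated' probability. It is a minimal sketch under stated assumptions, not the system described here: the function name, the use of the leading frames as the noise estimate, the spectral-subtraction step, and the sigmoid parameters `slope` and `threshold_db` are all hypothetical choices made for the example.

```python
import numpy as np

def soft_mask_from_local_snr(noisy_power, n_noise_frames=10,
                             slope=0.5, threshold_db=0.0):
    """Map a (frames, channels) power spectrogram to probabilities in [0, 1].

    Assumption: the noise is stationary and the first `n_noise_frames`
    frames contain noise only, so their mean is the noise spectrum.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    eps = 1e-10

    # Stationary noise estimate: one value per frequency channel.
    noise = np.maximum(noisy_power[:n_noise_frames].mean(axis=0), eps)

    # Spectral subtraction gives a crude speech power estimate per pixel.
    speech = np.maximum(noisy_power - noise, eps)

    # Local SNR in dB for every spectrotemporal pixel.
    local_snr_db = 10.0 * np.log10(speech / noise)

    # Sigmoid squashes the SNR into a soft reliability value;
    # the clip keeps the exponential well inside floating-point range.
    z = np.clip(slope * (local_snr_db - threshold_db), -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(-z))
```

Pixels whose estimated local SNR lies well above `threshold_db` receive values close to 1, and pixels at or below the noise floor receive values close to 0. A hard (discrete) mask would instead threshold the local SNR directly.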
In the current work we show how the 'pixel is uncontaminated' probability can be better estimated by introducing harmonicity information derived from an autocorrelogram representation of the speech signal. The basic strategy can be described as follows:
This simple strategy works well under the assumption that little of the speech utterance is dominated by harmonic noise. Refinements would be necessary where this is not the case.
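The sketch below is not the strategy referred to above. It only illustrates one common way of obtaining per-channel harmonicity from an autocorrelogram: compute a normalised autocorrelation for each filterbank channel, sum across channels to find the dominant pitch lag, and read off each channel's autocorrelation at that lag. The function names, the 80 to 400 Hz pitch range, and the assumption that the input is one frame of filterbank outputs are all hypothetical.

```python
import numpy as np

def autocorrelogram(frame_channels, max_lag):
    """Normalised autocorrelation per channel for one analysis frame.

    frame_channels: array of shape (channels, samples), e.g. the outputs
    of an auditory filterbank (assumed here, not taken from the source).
    Returns an array of shape (channels, max_lag + 1).
    """
    x = np.asarray(frame_channels, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)
    n = x.shape[1]
    acg = np.empty((x.shape[0], max_lag + 1))
    for lag in range(max_lag + 1):
        # Correlate each channel with a copy of itself delayed by `lag`.
        acg[:, lag] = np.sum(x[:, :n - lag] * x[:, lag:], axis=1)
    # Divide by the zero-lag energy so that every channel starts at 1.
    return acg / np.maximum(acg[:, :1], 1e-12)

def channel_harmonicity(frame_channels, sample_rate,
                        f0_min=80.0, f0_max=400.0):
    """Return (pitch_lag, harmonicity per channel) for one frame."""
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    acg = autocorrelogram(frame_channels, max_lag)

    # The summary autocorrelogram pools periodicity evidence over channels.
    summary = acg.sum(axis=0)
    pitch_lag = min_lag + int(np.argmax(summary[min_lag:max_lag + 1]))

    # Channels dominated by the harmonic source peak at the pitch lag.
    harmonicity = np.clip(acg[:, pitch_lag], 0.0, 1.0)
    return pitch_lag, harmonicity
```

Under this measure a channel scores highly whenever it is periodic at the dominant pitch, whatever the source, which is why harmonic noise is the case that calls for refinement.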
Experiments with the Aurora 2000 database show that introducing harmonicity information in this way leads to consistent reductions in word error rate (WER) compared to a baseline system using local SNR estimates alone:
















[Results table not recovered: WER on Aurora 2000, with a "Local SNR (Soft)" baseline row.]