RESPITE:Events : Meeting, Sep 2000:Presentations: Georg Meyer

Does selective enhancement of speech work?

Georg Meyer

Auditory scene analysis algorithms are able to use cues, such as harmonicity, to enhance speech in noise. These algorithms fail when the signal-to-noise ratio (SNR) falls below a minimum level. If harmonicity is assumed as the signal selection cue, then some speech segments, such as high-amplitude voiced sounds, will be enhanced relative to low-amplitude or unvoiced sounds, so the amplitude contrast between vowels and consonants will be increased. Previous studies have shown that in clean speech this type of modification reduces the intelligibility of syllable-initial and syllable-final consonants (Hecker, PhD dissertation, Stamford, 1974). Other studies have shown that reducing the amplitude contrast between consonants and vowels can have a small positive effect on intelligibility (review: Balakrishnan et al., J. Acoust. Soc. Am. 99, 3758-3769, 1996). Six native English speakers were presented with a quasi-random sequence of VCV syllables (Shannon, R.V., et al., J. Acoust. Soc. Am. 106, L71-L74, 1999) in a background of low-pass filtered white noise. Experiment 1 confirms that native UK speakers are able to recognise the consonants in VCV syllables spoken by native US speakers (96.63% correct). An amplitude reduction of the consonant segment by 12 dB relative to the vowel has no significant effect in clean speech (95.5% correct). When noise is presented at 0 dB SNR, relative to the vowel amplitude, recognition performance falls to 91.52%, while only 50.19% of the consonants are correctly recognised at -12 dB SNR. If the signal is modified so that the consonant SNR remains at -12 dB while the vowel SNR is set to 0 dB, consistent with the type of enhancement one might expect using the AM-map processing strategy, consonant recognition increases to 70.23%, which is significantly better (p < 0.0001) than the recognition rate when both consonant and vowel are at -12 dB.
The experiment shows that selective enhancement of vowels can improve consonant recognition in background noise for normal-hearing subjects. We hypothesise that the formant transitions in the vowel segment are responsible for this increase in performance. This is supported by a further experiment with a cochlear implant user, where the enhancement of vowels has no effect on consonant recognition because cochlear implants do not code formant transitions well.
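
The selective-enhancement condition (vowel at 0 dB SNR, consonant at -12 dB SNR) can be illustrated with a minimal sketch. It is not the stimulus-generation code used in the experiments: the synthetic vowel and consonant segments, the sample rate, the unfiltered Gaussian noise, and all function names are assumptions made for illustration, and the SNR is taken here as the ratio of segment RMS to noise RMS.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def db_to_amplitude(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def snr_db(segment, noise):
    """SNR in dB, assumed here to be segment RMS over noise RMS."""
    return 20.0 * math.log10(rms(segment) / rms(noise))


def scale_to_snr(segment, noise, target_snr_db):
    """Scale `segment` so its SNR against `noise` equals the target."""
    gain = db_to_amplitude(target_snr_db - snr_db(segment, noise))
    return [gain * s for s in segment]


# Hypothetical stand-ins for a VCV token (not the recordings used in the study).
rng = random.Random(0)
fs = 16000  # assumed sample rate in Hz
vowel = [math.sin(2 * math.pi * 150 * n / fs) for n in range(1600)]
consonant = [rng.gauss(0.0, 1.0) for _ in range(800)]

# Stationary background noise covering the whole token (unfiltered here).
noise = [rng.gauss(0.0, 1.0) for _ in range(2 * len(vowel) + len(consonant))]

# Selective enhancement: vowel at 0 dB SNR, consonant left at -12 dB SNR.
vowel_0db = scale_to_snr(vowel, noise, 0.0)
consonant_m12db = scale_to_snr(consonant, noise, -12.0)

# Concatenate vowel-consonant-vowel and add the noise sample by sample.
token = vowel_0db + consonant_m12db + vowel_0db
mixture = [t + n for t, n in zip(token, noise)]

# A 12 dB reduction corresponds to an amplitude factor of about 0.25.
attenuation_12db = db_to_amplitude(-12.0)
```

Setting both targets to -12.0 gives the unenhanced condition against which the 70.23% result was compared. A 12 dB difference between segments means the consonant RMS is roughly a quarter of the vowel RMS.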


Jon Barker
Last modified: Mon Sep 18 15:30:08 BST 2000