Recognition of Distorted Speech
Investigator: Jeremy Goslin
Supervisor: Martin Cooke
Recently a number of studies have taken advantage of the work of Harvey Fletcher and his colleagues in the early twentieth century on the intelligibility of nonsense CVCs filtered through a variety of high-pass and low-pass filters. This work resurfaced with Allen's 1994 paper, which drew attention to it and suggested that speech perception decisions within narrow frequency sub-bands may be processed independently of each other. This also ties in with Warren et al.'s study of spectral redundancy, in which speech was filtered through narrow spectral slits as fine as 1/20-octave bandwidth. The study found that the intelligibility of speech through these filters was remarkably high, reaching nearly 80% with a filter centre frequency of 1500 Hz.
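As a rough illustration of the spectral-slit manipulation described above, the sketch below band-passes a signal through a 1/20-octave slit geometrically centred on 1500 Hz. The centre frequency and slit width come from the text; the Butterworth design, filter order, and 16 kHz sample rate are illustrative assumptions, not details of the original study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def octave_slit(signal, fs, cf=1500.0, fraction=1 / 20, order=4):
    """Band-pass `signal` through a slit `fraction` of an octave wide,
    geometrically centred on `cf` Hz (edges at cf * 2**(+/- fraction/2))."""
    lo = cf * 2 ** (-fraction / 2)   # lower edge, ~1474 Hz for the defaults
    hi = cf * 2 ** (fraction / 2)    # upper edge, ~1526 Hz for the defaults
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)  # zero-phase filtering

# Toy usage: white noise through the slit retains only energy near 1500 Hz.
fs = 16000
noise = np.random.default_rng(0).normal(size=fs)
slit = octave_slit(noise, fs)
```

In practice the slit would be applied to recorded speech rather than noise, and intelligibility measured with listeners or a recogniser.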
Recently, two recognition systems have been designed to exploit these findings, dividing the frequency spectrum into a number of sub-bands analysed by separate ANN/HMM recognisers [3,4]. However, both recognisers continue to use cross-frequency information in the recognition process. If recognition is possible using only very narrow frequency bands (say, the output of a single cochlear filter), then perhaps the necessary information could be extracted from analyses of the speech in the temporal domain. By submitting such representations to both human and artificial recognition while removing a variety of features from the speech waveform, it is hoped that insight can be gained into the differences in speech recognition at different frequencies.