RESPITE:Events : Meeting, Sep 2000:Presentations: Glotin & Berthommier

Test of several external posterior weighting functions for multiband Full Combination ASR

Hervé Glotin, Frédéric Berthommier

Presented by Laurent Varin

Information about speech reliability can be extracted and then integrated into a recogniser by various means. The full combination (FC) approach allows the posterior values estimated locally in the time-frequency representation to be weighted according to a speech reliability measure. Since most speech segments are voiced, we use a method that exploits the harmonicity of speech to derive these weights. We test this method together with the direct integration of the a priori SNR. We then run speech recognition with different kinds of weighting functions. The weights take continuous or binary values, corresponding to a soft or a hard decision function about speech reliability, which is derived from an observable harmonicity index. With a binary decision process, the effect is, for each time frame, to collapse the set of sub-band combinations into a single combination. Alternatively, we substitute empirical values for the weights, including functions of the a priori SNR, which are continuous or discrete but not based on a probabilistic estimation. We report average scores in % WER for a panel of noises at different levels, stationary or non-stationary, narrow-band or wide-band. All of these functions are found to be sub-optimal compared with constant weighting, but the FC approach is observed to be robust to narrow-band noise.
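The weighting of sub-band combinations described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that each sub-band has a reliability value in [0, 1] and that bands are independent, so that the weight of a combination is the product of the reliabilities of the bands it uses and the complements of those it leaves out. The function names, the 0.5 threshold, and the posterior values are all hypothetical.

```python
from itertools import product


def combination_weights(reliability):
    """Weight of every sub-band combination (assumed product form).

    reliability[b] is a hypothetical reliability in [0, 1] for band b.
    A combination is a tuple of 0/1 flags, one per band (1 = band used).
    """
    weights = {}
    for combo in product((0, 1), repeat=len(reliability)):
        w = 1.0
        for used, r in zip(combo, reliability):
            w *= r if used else (1.0 - r)
        weights[combo] = w
    return weights


def harden(reliability, threshold=0.5):
    """Hard decision: map each reliability to 0 or 1 (threshold is assumed)."""
    return [1.0 if r >= threshold else 0.0 for r in reliability]


def combine_posteriors(weights, posteriors):
    """Weighted sum, over combinations, of per-combination class posteriors."""
    n_classes = len(next(iter(posteriors.values())))
    combined = [0.0] * n_classes
    for combo, w in weights.items():
        for k, p in enumerate(posteriors[combo]):
            combined[k] += w * p
    return combined


# Made-up posteriors for 2 bands and 2 classes, one entry per combination.
# The empty combination (0, 0) stands in for the class priors.
posteriors = {
    (0, 0): [0.5, 0.5],
    (0, 1): [0.4, 0.6],
    (1, 0): [0.8, 0.2],
    (1, 1): [0.7, 0.3],
}
reliability = [0.9, 0.2]  # hypothetical per-band reliabilities

soft_weights = combination_weights(reliability)
hard_weights = combination_weights(harden(reliability))
soft_posterior = combine_posteriors(soft_weights, posteriors)
hard_posterior = combine_posteriors(hard_weights, posteriors)
```

With soft weights, every combination contributes to the final posterior. With hard weights, exactly one combination receives weight 1 and the others receive 0, which is the collapse to a single combination per time frame mentioned in the abstract.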

Jon Barker
Last modified: Mon Sep 18 15:29:29 BST 2000