
Roles for Confidence Measures in Automatic Speech Recognition

Investigator: Gethin Williams Supervisor: Steve Renals

It has been shown [1] that Artificial Neural Networks (ANNs), when trained as classifiers in the appropriate manner, can accurately estimate Bayesian posterior probabilities. Bayesian classifiers have many desirable qualities [2].
These include the facilitation of optimal decision making, the use of rejection thresholds to raise classification accuracy, and the combination of outputs from several Bayesian classifiers in some form of voting scheme, again to improve accuracy.
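
To make the last two of these points concrete, the short Python sketch below (not part of the original work; the function names, threshold value and toy data are illustrative assumptions) applies a rejection threshold to the winning posterior of each frame and averages the posterior estimates of several classifiers as a simple voting scheme.

    import numpy as np

    def classify_with_rejection(posteriors, threshold=0.7):
        # Return the MAP class for each frame, or -1 where the winning
        # posterior falls below the rejection threshold.
        winners = posteriors.argmax(axis=1)
        confidences = posteriors.max(axis=1)
        winners[confidences < threshold] = -1   # reject uncertain frames
        return winners

    def combine_posteriors(posterior_list):
        # Simple voting scheme: average the posterior estimates produced by
        # several independently trained classifiers.
        return np.mean(posterior_list, axis=0)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Three hypothetical classifiers, 5 frames, 4 classes; rows sum to one.
        nets = [rng.dirichlet(np.ones(4), size=5) for _ in range(3)]
        combined = combine_posteriors(nets)
        print(classify_with_rejection(combined, threshold=0.5))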

Despite these desirable qualities, ANNs have difficulty accommodating the temporal variation present in the speech signal. One technique that allows ANNs to be used to classify speech sounds is to combine them with Hidden Markov Models (HMMs), forming an ANN/HMM hybrid [3]. Such hybrids can be trained according to a Maximum A Posteriori (MAP) criterion, so as to produce optimal classifiers operating above the phone level.
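
As an illustration of how such a hybrid typically uses the network outputs, the sketch below (a toy example under stated assumptions, not the authors' system) converts ANN posterior estimates into scaled likelihoods by dividing by the class priors, and then uses these as HMM emission scores in a Viterbi decode; the priors, transition matrix and data are all hypothetical.

    import numpy as np

    def scaled_log_likelihoods(posteriors, priors, eps=1e-10):
        # Scaled likelihood: P(x|q)/P(x) is proportional to P(q|x)/P(q),
        # so the ANN posteriors are divided by the class priors.
        return np.log(posteriors + eps) - np.log(priors + eps)

    def viterbi(log_emissions, log_trans, log_init):
        # Most likely state sequence for one utterance (T frames, N states).
        T, N = log_emissions.shape
        delta = log_init + log_emissions[0]
        backptr = np.zeros((T, N), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_trans        # (from state, to state)
            backptr[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + log_emissions[t]
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(backptr[t, path[-1]]))
        return path[::-1]

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        post = rng.dirichlet(np.ones(3), size=6)       # 6 frames, 3 phone states
        priors = post.mean(axis=0)                     # priors estimated from data
        emis = scaled_log_likelihoods(post, priors)
        trans = np.log(np.full((3, 3), 1.0 / 3.0))     # uniform toy transitions
        init = np.log(np.full(3, 1.0 / 3.0))
        print(viterbi(emis, trans, init))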

References

  1. Richard & Lippman, Neural Computation, 1991.
  2. Bishop, 1995.
  3. Bourlard & Morgan, 1994.