RESPITE: Annual Report 2001: Scientific Highlights: Demonstrators

The RESPITE Demonstrators

The RESPITE industrial partners are developing demonstrator systems to carry the new techniques through to commercial systems.

Quantitative Evaluation of RESPITE Recognition Techniques

Daimler-Chrysler have constructed an off-line evaluation demonstrator that produces comparative results for the RESPITE techniques within a single system. The system incorporates missing data, multistream and tandem recognition techniques. These have been compared with the DC baseline recognition system on the Aurora 2 connected digit recognition task.

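The Tandem approach reduces the error rate relative to the DC baseline system in every condition: by a factor of 2 to 3 at SNR 10 and SNR 20, and by somewhat less (about 1.5 to 1.8) at SNR 0 and on clean speech, as shown in the table below: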

System        SNR 0 dB   SNR 10 dB   SNR 20 dB   Clean
DC Baseline   31.9       6.6         1.8         1.6
DC Tandem     21.4       2.4         0.8         0.9
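
The size of the improvement in each condition can be checked directly from the table. The following is a minimal sketch in Python that computes the reduction factor and the relative reduction per condition. It assumes the table entries are error rates in percent; the function and variable names are illustrative and are not part of the demonstrator.

```python
# Error rates copied from the table above: (DC Baseline, DC Tandem).
# Assumption: the figures are error rates in percent.
ERROR_RATES = {
    "SNR 0": (31.9, 21.4),
    "SNR 10": (6.6, 2.4),
    "SNR 20": (1.8, 0.8),
    "Clean": (1.6, 0.9),
}


def reduction_factor(baseline, tandem):
    """How many times smaller the Tandem error rate is than the baseline."""
    return baseline / tandem


def relative_reduction(baseline, tandem):
    """Percentage of the baseline errors removed by the Tandem system."""
    return 100.0 * (baseline - tandem) / baseline


def summarise(rates=ERROR_RATES):
    """Return {condition: (factor, relative reduction in %)}, rounded for display."""
    return {
        cond: (round(reduction_factor(b, t), 2), round(relative_reduction(b, t), 1))
        for cond, (b, t) in rates.items()
    }


if __name__ == "__main__":
    for cond, (factor, rel) in summarise().items():
        print(f"{cond:>6}: factor {factor:.2f}, relative reduction {rel:.1f}%")
```

On these figures the reduction factor is largest at SNR 10 and smallest at SNR 0.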

Qualitative Demonstration of RESPITE Recognition Technologies

Babel are building a modular platform which will enable other partners to add their software into an on-line small-vocabulary recognition system. The intention is to add visualisation facilities at each stage, so that we demonstrate how the system works, rather than just what it does.

This demonstration system is built as a set of plug-ins to the WAVESURFER interface developed by the CTT lab at KTH. This interface supports sound acquisition and easy display of data such as the signal, spectrograms and pitch curves. Plug-ins have been written to display ASR-specific data such as likelihoods, acoustic features and phoneme/word labellings.

Fig 1: plug-ins have been added to WAVESURFER for automatic speech recognition

The Wavesurfer interface has been extended to meet the specific needs of the RESPITE project partners. It provides a highly customisable speech recognition interface in which each module of the recognition process can be defined and configured by the user. The interface gives access to the system at different anchor points, such as the sampled signal, the acoustic features, the state likelihoods and the word hypotheses. Default modules can be readily replaced by external programs, which allows a high degree of flexibility.

Fig 2: customisable ASR interface
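
The idea of named anchor points with replaceable modules can be illustrated with a short Python sketch. This is not the demonstrator's code or the WAVESURFER plug-in API: the `Pipeline` class, the stage names and the toy stages are all hypothetical, and the "likelihoods" are a stand-in rather than a real acoustic model. It only shows how intermediate results could be exposed at each anchor point and how a default module could be swapped for another implementation (which, in a real system, might wrap an external program).

```python
from typing import Any, Callable, Dict, List, Tuple

Stage = Callable[[Any], Any]  # hypothetical module type: one input, one output


class Pipeline:
    """Chain of named stages; the output of each stage is kept as an anchor point."""

    def __init__(self, stages: List[Tuple[str, Stage]]):
        self.stages = list(stages)

    def replace(self, name: str, stage: Stage) -> None:
        """Swap the module registered under `name` for another callable."""
        for i, (existing, _) in enumerate(self.stages):
            if existing == name:
                self.stages[i] = (name, stage)
                return
        raise KeyError(name)

    def run(self, signal: Any) -> Dict[str, Any]:
        """Run all stages and return every intermediate result by anchor name."""
        anchors: Dict[str, Any] = {"signal": signal}
        data = signal
        for name, stage in self.stages:
            data = stage(data)
            anchors[name] = data
        return anchors


def toy_features(signal, frame=4):
    # Toy feature: mean absolute amplitude over non-overlapping frames.
    return [
        sum(abs(s) for s in signal[i:i + frame]) / frame
        for i in range(0, len(signal) - frame + 1, frame)
    ]


def peak_features(signal, frame=4):
    # Alternative toy feature: peak absolute amplitude per frame.
    return [
        max(abs(s) for s in signal[i:i + frame])
        for i in range(0, len(signal) - frame + 1, frame)
    ]


def toy_likelihoods(features):
    # Stand-in scores (silence, speech) per frame; not a real acoustic model.
    return [(1.0 - min(f, 1.0), min(f, 1.0)) for f in features]


def toy_decoder(likelihoods):
    # Pick the higher-scoring label for each frame.
    return ["speech" if speech > silence else "silence"
            for silence, speech in likelihoods]


def default_pipeline() -> Pipeline:
    return Pipeline([
        ("features", toy_features),
        ("likelihoods", toy_likelihoods),
        ("words", toy_decoder),
    ])


if __name__ == "__main__":
    signal = [0.0, 0.1, 0.0, 0.1, 0.9, 0.8, 0.9, 1.0]
    pipeline = default_pipeline()
    pipeline.replace("features", peak_features)  # swap in a different module
    for anchor, value in pipeline.run(signal).items():
        print(anchor, value)
```

Because `run` returns every intermediate result, a display layer could attach a visualisation to any anchor point without the stages knowing about it.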