m4 - multimodal meeting manager

M4 speech recognition

ICSI meetings recogniser

Acoustic modelling

Here are the latest models for the ICSI meetings data.

The current system is trained on about 40 hours of segmented speech and tested on one hour.

The acoustic models will eventually be adapted from the SWITCHBOARD models.

Language modelling

The language model training data consists of the transcripts of 43 meetings and does not include the sentences used for testing.

The best language model for recognition is a weighted combination of the SWITCHBOARD LM (the same one used in the SWITCHBOARD work above) and the ICSI meetings LM, with weights 0.144264 and 0.855736 respectively (a ratio of 1:5.9317363). These weights were found by minimising the perplexity on the ICSI meeting test sentences. Minimising the perplexity on the ICSI meeting training sentences instead gives weights of 0.00620704 and 0.993793.
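The weighting can be illustrated with a small sketch. It assumes the combination is a linear interpolation of the two models' word probabilities, which is consistent with the weights summing to one but is not stated explicitly here. The weight search uses expectation-maximisation, a common way of minimising perplexity for a mixture; the procedure actually used for these figures is not described. The function names and the toy probabilities are hypothetical.

```python
import math

# Weights quoted in the text (SWITCHBOARD LM, ICSI meetings LM).
W_SWB = 0.144264
W_ICSI = 0.855736


def interpolate(p_swb, p_icsi, w_swb=W_SWB):
    """Linearly interpolate two word probabilities (assumed combination rule)."""
    return w_swb * p_swb + (1.0 - w_swb) * p_icsi


def perplexity(probs):
    """Perplexity of a sequence of per-word probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def estimate_weight(swb_probs, icsi_probs, iterations=50):
    """Estimate the SWITCHBOARD weight by EM on held-out per-word probabilities.

    Each iteration cannot increase the perplexity of the mixture on this data.
    """
    w = 0.5
    for _ in range(iterations):
        # E-step: posterior that each word was generated by the SWITCHBOARD LM.
        posteriors = [
            w * a / (w * a + (1.0 - w) * b)
            for a, b in zip(swb_probs, icsi_probs)
        ]
        # M-step: the new weight is the mean posterior.
        w = sum(posteriors) / len(posteriors)
    return w


# Toy per-word probabilities (made up for illustration only).
toy_swb = [0.01, 0.2, 0.05, 0.001]
toy_icsi = [0.1, 0.05, 0.2, 0.02]

w_hat = estimate_weight(toy_swb, toy_icsi)
ppl_equal = perplexity([interpolate(a, b, 0.5) for a, b in zip(toy_swb, toy_icsi)])
ppl_tuned = perplexity([interpolate(a, b, w_hat) for a, b in zip(toy_swb, toy_icsi)])
```

In this sketch the weights would be estimated from the per-word probabilities that each model assigns to the same held-out sentences.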

The best LM scale factor during testing was 13.5.
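As a rough illustration of what the scale factor does, the sketch below assumes the usual decoder convention in which the language model log probability is multiplied by the scale factor and added to the acoustic log likelihood. The decoder used here, and any additional terms such as a word insertion penalty, are not specified, so the function names and the example scores are hypothetical.

```python
LM_SCALE = 13.5  # value quoted in the text


def combined_score(acoustic_logprob, lm_logprob, lm_scale=LM_SCALE):
    """Combine acoustic and LM log scores (assumed convention)."""
    return acoustic_logprob + lm_scale * lm_logprob


def best_hypothesis(hypotheses, lm_scale=LM_SCALE):
    """Pick the (text, acoustic_logprob, lm_logprob) tuple with the highest score."""
    return max(hypotheses, key=lambda h: combined_score(h[1], h[2], lm_scale))


# Made-up scores: "b" fits the audio better, "a" is preferred by the LM.
hyps = [("a", -100.0, -5.0), ("b", -95.0, -6.0)]
choice_scaled = best_hypothesis(hyps)[0]         # LM dominates at scale 13.5
choice_unscaled = best_hypothesis(hyps, 1.0)[0]  # acoustics dominate at scale 1
```

A larger scale factor makes the recogniser trust the language model more relative to the acoustic models, which is why the value is tuned on test data.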