The CTK Development Page


Notes for HMM Decoders CTK v1.1.0

The decoders in CTKv1.1.xx are a little more sophisticated than those in CTKv1.0.xx, and as a result a few backward-incompatible changes have had to be made to the block parameters. Specifically, LABELS has been replaced with LABEL_FILE, the format of HMM_FILE has changed, and the USE_OZ parameter has been superseded by the more general parameter HYPOTHESIS_FILTER.

These changes are described in the notes that follow.

HMM_FILE

The HMM_FILE parameter specifies the name of a file that associates HMM definitions with HMM NAMEs. This file can have one of two possible formats, depending on whether the HMMs are stored in a single file or are stored separately:

LABEL_FILE

This parameter specifies the name of a file that associates HMM NAMEs with HMM LABELs. Whereas each HMM must have a unique NAME, several HMMs can share the same LABEL. For example, there may be both a male and a female version of the digit one, with NAMEs "one_m" and "one_f", both having the LABEL "1".

Each line of the file defines a separate LABEL. The LABEL occurs as the first character on the line and is followed by the NAME of each HMM that shares this LABEL, e.g.:

1 one_m one_f
2 two_m two_f
S sil sp
etc.

This parameter supersedes the "LABELS" parameter that was used in CTKv1.0.xx.
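
The LABEL_FILE format described above can be read with a few lines of Python. This is only an illustrative sketch and not CTK's own loader: the function name parse_label_file is hypothetical, and it assumes that fields are separated by whitespace and that the LABEL is the first field on each line.

```python
def parse_label_file(text):
    """Return a dict mapping each HMM NAME to its LABEL.

    Assumptions (not confirmed by the CTK notes): fields are
    whitespace-separated and the LABEL is the first field on the line.
    """
    name_to_label = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label, names = fields[0], fields[1:]
        for name in names:
            # Each HMM NAME must be unique, so a NAME may not appear twice.
            if name in name_to_label:
                raise ValueError("HMM NAME listed more than once: " + name)
            name_to_label[name] = label
    return name_to_label


example = "1 one_m one_f\n2 two_m two_f\nS sil sp\n"
mapping = parse_label_file(example)
print(mapping["one_f"], mapping["sp"])
```

Storing the mapping by NAME rather than by LABEL reflects the rule that NAMEs are unique while LABELs may be shared.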

GRAMMAR_FILE

This parameter specifies the name of a file containing the grammar to be applied to the set of models.

The GRAMMAR_FILE specifies the grammar in terms of the NAMEs of the individual HMMs. The format is the same as that used in version 1.x of HTK. For more details see here.

If no GRAMMAR_FILE is specified, then all the models are placed in a simple loop grammar, i.e. any model can follow any other model.

HYPOTHESIS_FILTER

This is a string parameter that takes the form of a regular expression. If supplied, this expression is matched against the winning hypothesis. If it matches, the hypothesis is rejected in favour of the next best hypothesis that does not match the filter. (If no suitable result can be found in the 50-best list, then the output reverts to the overall best hypothesis regardless of the filter.)

For example, with a digit recognition task like AURORA, if HYPOTHESIS_FILTER="11" then the decoder will try to reject recognition hypotheses containing the substring "11".

The USE_OZ parameter no longer exists; its effect can be achieved by setting HYPOTHESIS_FILTER="(O.*Z)|(Z.*O)".
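
The selection rule described for HYPOTHESIS_FILTER can be sketched in Python as follows. This is an illustration of the behaviour described in these notes, not the decoder's actual code. The function name apply_hypothesis_filter is hypothetical, each hypothesis is assumed to be a string of concatenated LABELs ordered best first, and the filter is assumed to match anywhere in the string (as the "11" substring example suggests).

```python
import re


def apply_hypothesis_filter(nbest, pattern, max_depth=50):
    """Pick the best hypothesis that does not match the filter.

    nbest     -- hypotheses as label strings, best first (assumed form)
    pattern   -- regular expression, or an empty string for no filtering
    max_depth -- how far down the N-best list to look (50 in the notes)
    """
    if not nbest:
        raise ValueError("the N-best list is empty")
    if not pattern:
        return nbest[0]
    regex = re.compile(pattern)
    for hypothesis in nbest[:max_depth]:
        # A hypothesis is rejected if the filter matches anywhere in it.
        if regex.search(hypothesis) is None:
            return hypothesis
    # Nothing suitable in the list: revert to the overall best hypothesis.
    return nbest[0]


# The winning hypothesis contains "11", so the next best one is chosen.
print(apply_hypothesis_filter(["1123", "1213", "1321"], "11"))

# Equivalent of the old USE_OZ behaviour: reject any hypothesis that
# contains both "O" and "Z".
print(apply_hypothesis_filter(["1O2Z", "1O2O"], "(O.*Z)|(Z.*O)"))
```

When every hypothesis in the list matches the filter, the sketch returns the first entry, mirroring the fallback to the overall best hypothesis.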