HMMDecoderMDSoft


The HMMDecoderMDSoft is a version of HMMDecoderMD that has been generalised to accept a 'soft mask'. Whereas the mask input for HMMDecoderMD should contain only 0's and 1's, the mask input for HMMDecoderMDSoft may contain values in the inclusive range 0.0 to 1.0. The values in the mask are interpreted as the probability that the data is reliable.

For full details of how the soft masks are employed in missing data speech recognition:

As part of its probability calculation the HMMDecoderMDSoft needs to perform a bounded-marginalisation (see HMMDecoderMD). The block therefore requires lower and upper bounds for the missing data values. These bounds are supplied via the block inputs in3 and in4. The lower bound input (in3) is optional, and if not supplied the lower bound will default to 0.
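
The soft mask calculation for a single feature can be sketched as follows. This is a minimal illustration and not the CTK implementation: it assumes a single one-dimensional Gaussian, uses the mask value as the weight on the 'data present' likelihood, and weights the bounded-marginal term (the Gaussian integrated between the lower and upper bounds) by one minus the mask value. The function names, the division of the integral by the interval width, and the handling of a zero-width interval are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gaussian_cdf(x, mean, var):
    """Cumulative distribution of a one-dimensional Gaussian at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def soft_mask_likelihood(x, mask, upper, mean, var, lower=0.0):
    """Likelihood of one feature value given a soft mask value in [0, 1].

    The lower bound defaults to 0, mirroring the optional in3 input.
    """
    if not 0.0 <= mask <= 1.0:
        raise ValueError("mask value must lie in the range 0.0 to 1.0")
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")

    # Term used when the data is taken to be reliable.
    present = gaussian_pdf(x, mean, var)

    # Bounded-marginal term used when the data is taken to be unreliable.
    # Assumption: the integral is normalised by the width of the interval.
    width = upper - lower
    if width > 0.0:
        integral = gaussian_cdf(upper, mean, var) - gaussian_cdf(lower, mean, var)
        missing = integral / width
    else:
        # Zero-width interval: use the limiting value of integral / width.
        missing = gaussian_pdf(upper, mean, var)

    # Interpolate between the two terms according to the mask value.
    return mask * present + (1.0 - mask) * missing
```

In this sketch a mask value of 1 gives the ordinary Gaussian likelihood, a mask value of 0 gives the bounded-marginal term alone, and intermediate values interpolate linearly between the two.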

It is difficult to approximate good missing data bounds for delta features, so, by default, the HMMDecoderMDSoft does not apply the soft mask technique to the delta features. Instead, delta features are treated in the same manner as they are by the discrete mask HMMDecoderMD decoder, i.e. they are either taken to be fully present or fully missing, and missing values are considered unbounded. Consequently, the elements of the missing data mask corresponding to the delta features should be 0 or 1. However, the decoder can be forced to treat delta features using the same soft mask technique as applied to the static features by setting USE_DELTA_BOUNDS to true. This may give better results if good lower and upper delta feature bounds are available.

The HMMDecoderMDSoft has the same parameters as the HMMDecoderMD, except: i) there is no USE_BOUNDS parameter (bounded-marginalisation is an integral part of the soft mask technique and bounds for the static features are always used), and ii) there is an additional parameter, ONE_ZERO_ROUNDING, which is described in the parameter table below.

Note that if a discrete mask consisting of only 0's and 1's is passed to HMMDecoderMDSoft it should produce exactly the same results as HMMDecoderMD. However, due to the way in which the probability calculation has been generalised, it is more efficient to use HMMDecoderMD for discrete masks.

As with the other decoders, HMMDecoderMDSoft outputs a stream of state-likelihood frames. Each frame consists of the likelihood of each model state having generated the corresponding input feature frame. Within these frames the state likelihoods occur in the same order in which the states are defined in the HMM definition file. The mixture label frames (out2) indicate the integer label of the winning mixture for each state.
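
The relationship between the two output frames can be sketched as below. This is an illustration under stated assumptions and not the CTK code: the function names are hypothetical, the 'winning' mixture is taken to be the component with the largest weighted likelihood, labels are taken to be zero-based indices, and the max_approx flag is an assumed reading of the MAX_APPROX parameter (use the winning component alone instead of the sum over components).

```python
def state_output(component_likelihoods, weights, max_approx=False):
    """Return (state likelihood, winning mixture index) for one state."""
    if not weights or len(component_likelihoods) != len(weights):
        raise ValueError("need one weight per mixture component")

    # Weighted likelihood contributed by each mixture component.
    weighted = [w * p for w, p in zip(weights, component_likelihoods)]

    # Assumption: the winning mixture is the largest weighted contribution.
    winner = max(range(len(weighted)), key=weighted.__getitem__)

    # Either sum over all components or keep only the winning one.
    likelihood = weighted[winner] if max_approx else sum(weighted)
    return likelihood, winner


def decode_frame(states, max_approx=False):
    """Build one likelihood frame and one mixture label frame.

    `states` is a list of (component_likelihoods, weights) pairs, given in
    the order in which the states appear in the HMM definition.
    """
    results = [state_output(c, w, max_approx) for c, w in states]
    likelihood_frame = [likelihood for likelihood, _ in results]
    label_frame = [label for _, label in results]
    return likelihood_frame, label_frame
```

Both returned frames have one entry per state, in the same state order, which corresponds to the pairing of out1 and out2 described above.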

Inputs  Meaning          Sample  1-D frame  ≥2-D frame
in1     feature vectors  No      Yes        No
in2     fuzzy data mask  No      Yes        No
(in3)   lower bound      No      Yes        No
in4     upper bound      No      Yes        No

Outputs  Meaning
out1     state likelihoods
out2     state max mixture label

Parameters         Type     Default  Meaning
LOG_FILE           String   -        Name of an optional log file
LOG_FILE_2         String   -        Name of additional detailed log file
WORD_PENALTY       Float    0.0      The creation penalty
HMM_FILE           String   -        Name of the HMM file list
GRAMMAR_FILE       String   -        File storing the grammar
LABEL_FILE         String   -        File storing HMM NAME -> HMM LABEL mapping
FIRST_TOKEN        String   -        Label of a fixed first token
FINAL_TOKEN        String   -        Label of a fixed final token
TRANSCRIPTION      String   -        The correct transcription
SILENCE            String   ""       The silence label(s)
MAX_APPROX         Boolean  False    Use max mixture approximation
NBEST              Int      1        Return best N hypotheses
STATE_PATH         Boolean  False    Record HMM state path
HAS_DELTAS         Boolean  0        Models have delta parameters
USE_DELTAS         Boolean  -        Models have delta parameters
HYPOTHESIS_FILTER  String   ""       Regular expression for filtering hypotheses
OUTPUT_CONFUSIONS  Boolean  0        Output confusion matrix
DUMP_PARAMETERS    Boolean  0        Write parameters to log file
USE_DELTA_BOUNDS   Boolean  False    Apply the soft mask technique to delta features
ONE_ZERO_ROUNDING  Float    0.0      Tolerance within which to round mask to 0 or 1
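
One plausible reading of ONE_ZERO_ROUNDING is sketched below. The function name and the exact comparison rule are assumptions, since the page gives only the one-line description: mask values within the tolerance of 0 are set to 0, values within the tolerance of 1 are set to 1, and all other values are left unchanged.

```python
def round_mask(mask, tolerance=0.0):
    """Round soft mask values that lie within `tolerance` of 0 or 1."""
    if not 0.0 <= tolerance < 0.5:
        raise ValueError("tolerance must be at least 0.0 and below 0.5")

    rounded = []
    for value in mask:
        if value <= tolerance:
            rounded.append(0.0)      # close enough to 0: treat as missing
        elif value >= 1.0 - tolerance:
            rounded.append(1.0)      # close enough to 1: treat as present
        else:
            rounded.append(value)    # keep the soft value
    return rounded
```

With the default tolerance of 0.0 the mask is returned unchanged, which matches the default value listed in the table.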

Documentation for CTKv1.1.4 - Last modified: Tue Jul 3 12:09:42 BST 2001