RESPITE: Private: The CASA Toolkit Page: Aurora 2.0 Recipe

The RESPITE CASA Toolkit Project

Aurora 2.0 Recipe


The following recipe demonstrates how to use CTK to perform missing data experiments on the Aurora 2.0 database. (It assumes you already have Aurora 2.0 installed on your system.)

Step 1: Downloading CTK.

(If you already have CTK installed, you may skip to Step 3.)
  1. Download the latest version of CTK, and unpack it.
  2. Set the environment variable $CTKROOT to the path of the directory called `CTK' that is created when the tar file is unpacked.
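
The commands later in this recipe use csh syntax (setenv); in sh-compatible shells the equivalent is `VAR=value; export VAR`. As a minimal sketch, the sh function below checks that a variable such as $CTKROOT is set and names an existing directory. The helper name require_dir_var is hypothetical and is not part of CTK.

```shell
# require_dir_var NAME
# Succeeds only if the environment variable NAME is set and names an
# existing directory. Hypothetical helper; not part of CTK.
require_dir_var() {
    _name=$1
    # Indirect lookup: copy the value of the variable called $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
        echo "$_name is not set" >&2
        return 1
    fi
    if [ ! -d "$_value" ]; then
        echo "$_name=$_value is not a directory" >&2
        return 1
    fi
    echo "$_name=$_value"
}
```

Calling `require_dir_var CTKROOT` before the later steps is one way to catch a mistyped path early.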

Step 2: Installing CTK.

  1. Follow the instructions in $CTKROOT/INSTALL.

Step 3: Generating the feature data.

This step generates the features for the Aurora 2.0 Test Set A. The features are based on the output of a 32-channel gammatone filterbank and include temporal derivatives. The necessary scripts and support files are contained in the package, CTK_AURORA - Part 1, which can be downloaded below:
  1. Download and unpack the tar file - a directory called CTK_AURORA will be created.
  2. Set the environment variable $CTK_AURORAROOT to be the path of this directory.
  3. Set the environment variable $AURORAROOT to the path of the top-level directory of your Aurora installation.
  4. Change directory to $CTK_AURORAROOT/scripts.
  5. Run the script $CTK_AURORAROOT/scripts/do_make_rate32_d.
The script produces the feature data for the whole of Test Set A and writes it to the directory $CTK_AURORAROOT/data/rate32_d/testa, which is created automatically. Note that the complete test set occupies roughly 1.2 gigabytes of disk space. If the disk holding $CTK_AURORAROOT does not have enough free space, then before running the script, create $CTK_AURORAROOT/data as a symbolic link to a directory on a device with plenty of space. The script will take several hours to run.
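
The disk-space check and the symbolic link described above could be sketched in sh as follows. Both function names are hypothetical helpers rather than part of the CTK_AURORA package, and /bigdisk in the example comment is only a placeholder path.

```shell
# Hypothetical helpers; not part of CTK or the CTK_AURORA package.

# free_kb DIR: print the free space, in 1024-byte blocks, on the
# file system that holds DIR (4th column of POSIX "df -P" output).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# link_data_dir ROOT TARGET: create ROOT/data as a symbolic link to
# TARGET, refusing to touch a ROOT/data that already exists.
link_data_dir() {
    _root=$1
    _target=$2
    if [ -e "$_root/data" ] || [ -L "$_root/data" ]; then
        echo "$_root/data already exists; leaving it alone" >&2
        return 1
    fi
    mkdir -p "$_target" && ln -s "$_target" "$_root/data"
}

# Example (placeholder path; 1200000 KB is a rough stand-in for 1.2 GB):
#   [ "$(free_kb /bigdisk)" -gt 1200000 ] &&
#       link_data_dir "$CTK_AURORAROOT" /bigdisk/ctk_aurora_data
```

Refusing to overwrite an existing data directory is a deliberate choice in this sketch, so that features already generated are never hidden behind a new link.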

Step 4: Setting up and testing the clean speech models.

This step installs HMMs that have been trained on rate32_d data and tests them using traditional ASR techniques. The necessary scripts, model files and other support files are contained in the package, CTK_AURORA - Part 2, which can be downloaded below:
  1. Download the tar file and copy it to the directory above $CTK_AURORAROOT (i.e. $CTK_AURORAROOT/..).
  2. Unpack the tar file. The tar file contains model files, label files and transcription files that will be installed in directories under $CTK_AURORAROOT.
  3. Change directory to $CTK_AURORAROOT/scripts.
  4. Make sure $CTKROOT is set correctly.
  5. Type: setenv CTKWORK $CTK_AURORAROOT
  6. Type: test_trad_asr 1 clean
    There will be a short pause while the HMM definitions are read; after this, recognition output should start appearing on stdout. The script is set up to use 3-mixture models and should produce around 97.0% accuracy on clean data. With 7-mixture models the result will be closer to 99.0%.
If you examine the directory $CTK_AURORAROOT/models, you should find that it contains both 3-mixture and 7-mixture models.
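
Because the recognition output goes to stdout, it may be convenient to keep a copy in a file while still watching it on the terminal. A minimal sh sketch using the standard tee utility is shown below; run_logged and the log file name are hypothetical and not part of CTK.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, showing its stdout and stderr on the terminal while
# also saving a copy in LOGFILE. The return status is that of tee,
# not of COMMAND. Hypothetical helper; not part of CTK.
run_logged() {
    _log=$1
    shift
    "$@" 2>&1 | tee "$_log"
}

# Example (the log file name is arbitrary):
#   run_logged trad_asr_clean.log test_trad_asr 1 clean
```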

Step 5: MD-ASR with discrete and fuzzy SNR masks.

This step provides the scripts necessary to run missing data experiments with discrete and fuzzy SNR-based masks. The scripts are contained in the package, CTK_AURORA - Part 3, which can be downloaded below:
  1. Download the tar file and copy it to the directory above $CTK_AURORAROOT (i.e. $CTK_AURORAROOT/..).
  2. Unpack the tar file. The tar file contains scripts that will be installed in $CTK_AURORAROOT/scripts/.
  3. Change directory to $CTK_AURORAROOT/scripts.
  4. Make sure $CTKROOT is set correctly.
  5. To test with a discrete MD mask use the script:
    test_discrete_snr_md_asr
    To test with a fuzzy MD mask use the script:
    test_fuzzy_snr_md_asr
    Read the text at the top of the scripts to see the syntax.
Once a script has been started, there will be a short delay while the HMM definitions are read. Recognition output will then be generated on stdout. Both scripts use the 7-mixture models. The fuzzy mask test will run considerably slower than the discrete mask test.
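
To read the syntax notes without opening an editor, something like the sh sketch below could be used. It assumes that the notes are comment lines beginning with "#" at the very top of each script, which has not been checked against the scripts themselves; show_header is a hypothetical helper.

```shell
# show_header SCRIPT
# Prints the leading lines of SCRIPT that start with "#", stopping at
# the first line that does not. Hypothetical helper; not part of CTK.
show_header() {
    awk '/^#/ { print; next } { exit }' "$1"
}

# Example, run from $CTK_AURORAROOT/scripts:
#   show_header test_discrete_snr_md_asr
```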

Step 6: MD-ASR with fuzzy Harmonicity/SNR masks.

This step provides the scripts necessary for extracting harmonicity information from the Aurora data and using it to improve recognition performance. Before the ASR experiments are run, f0 estimates, degree-of-voicing estimates and 'harmonicity masks' are extracted from the Aurora data and stored. This data is then used in conjunction with the ratemap data to perform the recognition. The scripts are contained in the package, CTK_AURORA - Part 4, which can be downloaded below:
  1. Download the tar file and copy it to the directory above $CTK_AURORAROOT (i.e. $CTK_AURORAROOT/..).
  2. Unpack the tar file. The tar file contains scripts that will be installed in $CTK_AURORAROOT/scripts.
  3. Change directory to $CTK_AURORAROOT/scripts.
  4. Make sure $CTKROOT is set correctly.
  5. Make a directory called $CTK_AURORAROOT/data/harmonicity. This is where the f0, voicing and harmonicity mask data will be stored. This data requires roughly 680 MB of disk space. Make sure there is space available on the device before proceeding to the next step. If there is not enough space, you can instead make $CTK_AURORAROOT/data/harmonicity a symbolic link to a directory on a device where more space is available.
  6. To start generating the data, type:
    do_make_harmonicity_data
    Note that this is a computationally intensive task and the script will take a long time to run. Remember to check that you have sufficient storage space before running the script!
  7. Once the data generation script has finished, you are ready to run the recognition experiments. First, you need to set an environment variable called CTKWORK. Simply type:
    setenv CTKWORK $CTK_AURORAROOT
  8. Now you can run the recognition script:
    test_harmonicity_asr 1 clean
    Read the comments at the top of the script to see the syntax.
Once the recognition script has been started, there will be a short pause while the HMM definitions are read; after this, recognition output should start appearing on stdout. The script is set up to use 7-mixture models. To use 3-mixture models, edit test_harmonicity_asr and replace occurrences of `7mix' with `3mix'.
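
Rather than editing the script in place, the `7mix' to `3mix' substitution could be applied to a copy, so that the original 7-mixture script stays available. The sh sketch below uses sed for this; make_3mix_copy and the output file name are hypothetical and not part of the package.

```shell
# make_3mix_copy SRC DST
# Writes to DST a copy of SRC in which every occurrence of "7mix" is
# replaced by "3mix", and makes DST executable. SRC is left unchanged.
# Hypothetical helper; not part of CTK.
make_3mix_copy() {
    sed 's/7mix/3mix/g' "$1" > "$2" && chmod +x "$2"
}

# Example, run from $CTK_AURORAROOT/scripts (output name is arbitrary):
#   make_3mix_copy test_harmonicity_asr test_harmonicity_asr_3mix
```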


These pages are maintained by Jon Barker, jon@dcs.shef.ac.uk
Last modified: Tue Feb 10 15:34:44 GMT 2004