Disciplines involving the production and perception of sound can be brought to life through demonstrations. The field is full of so-called illusions, just-noticeable differences, thresholds and other effects. A small subset of these is routinely demonstrated on university courses in many subject areas, ranging from engineering to neurophysiology. Traditionally, most of these illustrations required some combination of expensive facilities, specialised software and dedicated tutors. More recently, CDs such as the Acoustical Society of America's Auditory Demonstrations CD and Al Bregman's Auditory Scene Analysis CD have eased the task. Now, with relatively cheap computing resources and platform-independent software, it is practical to demonstrate virtually all published phenomena in speech and hearing via interactive tools. The benefits of this approach are:
- exploration: interactive demos go beyond mere playback of examples. The user can explore the range of relevant variables underlying some effect. For example, the effect of tone repetition time and frequency separation on two-tone streaming can be perceived directly in response to user interaction (see the streamer demonstration, and the stimulus sketch after this list).
- measurement: users can measure their own thresholds (see, for example, the audiometer tool).
- 'narrowcast/broadcast': demos can be run individually, at times to suit the user, or broadcast to an audience via projection directly from the computer screen.
- upgradeability: new demos can be added to the collection as workload permits and indeed as new effects appear in the literature. Similarly, demos can be modified as more incisive ways to present them are discovered.
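To give a flavour of the two-tone streaming example above, the essence of the stimulus can be sketched in a few lines of Matlab. The fragment below is not the MAD streamer code itself, just an illustrative sketch: it builds a repeating A-B-A tone triplet whose frequency separation (df) and tone repetition time (trt) can be varied, so the listener can hear the change from a single galloping rhythm to two separate streams.

```matlab
% Illustrative sketch only (not the MAD streamer code): an A-B-A triplet
% sequence for exploring two-tone streaming.
fs  = 16000;            % sampling rate (Hz)
fA  = 500;              % frequency of tone A (Hz)
df  = 4;                % A-B separation in semitones (try 1 vs. 7)
trt = 0.12;             % tone repetition time (s) (try 0.08 vs. 0.3)
dur = 0.06;             % duration of each tone (s)
fB  = fA * 2^(df/12);   % frequency of tone B

t    = (0:round(dur*fs)-1)/fs;
ramp = min(1, (1:numel(t))/round(0.01*fs));   % 10 ms linear onset ramp
env  = ramp .* fliplr(ramp);                  % apply ramp at onset and offset
tonA = env .* sin(2*pi*fA*t);
tonB = env .* sin(2*pi*fB*t);
gap  = zeros(1, round(trt*fs) - numel(t));    % pad each tone out to one trt

triplet = [tonA gap tonB gap tonA gap gap];   % A-B-A plus a silent beat
soundsc(repmat(triplet, 1, 10), fs);          % play ten repetitions
```

With a small frequency separation and a slow repetition rate the triplets are heard as a single galloping rhythm; increasing df or shortening trt tends to split the sequence into two streams. This trade-off is exactly what the streamer demonstration lets users explore interactively.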
The Matlab Auditory Demonstrations project, which got underway in late 1997, aims to exploit the benefits afforded by interactivity to create user-centred demonstrations and associated worksheets in the following areas:
- basic psychoacoustics
- auditory scene analysis
- auditory modelling and representations
- speech processing
- speech recognition
Within each area, the scope for demonstrations is very wide. Our initial focus has been on topics in which we have most experience and which we require locally for teaching. Most of the effort to date has gone into auditory scene analysis, robust ASR, and basic speech processing demonstrations for teaching. As the project gathers steam, student projects will contribute to the pool of demos. At present, several such projects are underway in the areas of binaural and pitch effects.
This enterprise is currently being carried out largely by members of the Speech and Hearing Group in the Department of Computer Science at the University of Sheffield. If you would like to get involved (e.g. contribute ideas or code) please contact us.
As of August 1999, demos have been contributed by:
- Martin Cooke
- Guy Brown
- Dan Ellis (ICSI, Berkeley)
- Stuart Wrigley
The MAD project is largely a spare-time activity. However, we gratefully acknowledge some financial support from the ELSNET Language Engineering Training Showcase for the production of 9 demonstrations in speech signal processing (contract 98/02).