Before using this demonstration, we suggest that you investigate the wangNeuron, wangNetwork and vowelExplorer demonstrations; they will give you a feel for the components of the model described here.
Pairs of synthetic vowels that start and stop at the same time (so-called 'double vowels') have been used to demonstrate that listeners can use information about the fundamental frequency (F0) of sounds to separate them. Specifically, listeners are better able to identify the constituents of a double vowel when the two vowels have different F0s than when they have the same F0.
This application allows you to experiment with a model of double vowel segregation described by Brown and Wang (1997). An important principle in the model is that neural oscillators act as 'gates' on auditory filter channels. When an oscillator is in its active phase, the 'gate' is open and the energy in its channel contributes to a percept (in this case, a vowel). Conversely, energy in channels whose oscillators are in their silent phase does not contribute to a percept. It follows that by synchronising blocks of oscillators, groups of channels can be turned on and off together, allowing an unobstructed view of each source in a mixture of acoustic sources.
In the model, oscillators are synchronised according to periodicity information in a correlogram. The oscillators corresponding to auditory filter channels that are excited by the same fundamental frequency (F0) tend to synchronise, and tend to desynchronise from those corresponding to channels with a different F0. The model reproduces the essential finding from perceptual studies: double vowel identification improves as an F0 difference is introduced between the two vowels.
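The grouping and gating idea described above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the oscillator dynamics of the Brown and Wang model: channels are simply grouped when their dominant periods agree, and each group is then used as a set of open 'gates'. The function names, the tolerance and the example values are all assumptions made for illustration.

```python
# Hypothetical sketch: none of these names come from the demonstration itself.

def group_channels(channel_periods_ms, tolerance_ms=0.1):
    """Group channel indices whose dominant periods agree within a tolerance.

    Stand-in for oscillator synchronisation: channels in one group are
    treated as oscillators that are active in the same phase.
    """
    groups = []  # list of (reference_period, [channel indices])
    for index, period in enumerate(channel_periods_ms):
        for reference, members in groups:
            if abs(period - reference) <= tolerance_ms:
                members.append(index)
                break
        else:
            # No existing group matches, so this channel starts a new one.
            groups.append((period, [index]))
    return [members for _, members in groups]


def gate(channel_energy, active_channels):
    """Pass energy only from channels whose 'gate' is open."""
    active = set(active_channels)
    return [e if i in active else 0.0 for i, e in enumerate(channel_energy)]


# Assumed example values: six channels with dominant periods of 10 ms (100 Hz)
# or 7.94 ms (about 126 Hz), and an arbitrary energy per channel.
periods = [10.0, 10.0, 7.94, 10.0, 7.94, 7.94]
energy = [0.9, 0.4, 0.7, 0.2, 0.8, 0.5]

groups = group_channels(periods)
views = [gate(energy, g) for g in groups]  # one gated view per source
```

Each entry of `views` keeps the energy of one group of channels and zeroes the rest, which is the sense in which gating gives a separate view of each source.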
The model performs vowel identification by matching the short-period (high frequency) information in the summary correlogram function, named the 'timbre region' by Meddis and Hewitt (1992). When you select a particular state of the oscillator array (corresponding to a particular time step), the set of active channels in the correlogram is shown, together with the summary function. The timbre region is then matched against reference templates, and the best-matching vowels are indicated in rank order.
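The template-matching step can be illustrated as follows. This is a minimal sketch under stated assumptions: Euclidean distance is used as the match measure (the measure actually used by the model is not given here), and the four-point templates, the third vowel label and the function name `rank_vowels` are invented for the example.

```python
import math


def rank_vowels(timbre_region, templates):
    """Return vowel labels ordered from best to worst match.

    Match quality is Euclidean distance here, which is an assumption;
    the model's actual matching measure is not stated in this text.
    """
    def distance(template):
        return math.sqrt(
            sum((a - b) ** 2 for a, b in zip(timbre_region, template))
        )

    return sorted(templates, key=lambda label: distance(templates[label]))


# Made-up templates for illustration only; real templates would hold the
# short-lag part of the summary function for each reference vowel.
templates = {
    "AR": [1.0, 0.2, -0.5, 0.1],
    "EE": [1.0, -0.6, 0.3, -0.2],
    "OR": [1.0, 0.6, 0.1, -0.4],
}
observed = [1.0, 0.1, -0.4, 0.0]

ranking = rank_vowels(observed, templates)  # best match first
```

With these invented numbers the observed timbre region lies closest to the AR template, so AR is ranked first.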
The demonstration shows the model simulation for a mixture of two vowels, AR (with an F0 of 100 Hz) and EE (with an F0 between 100 Hz and 126 Hz, as selected from the Simulation menu). If you are unfamiliar with the Assmann and Summerfield synthetic vowel set, take a look at the vowelExplorer demonstration before going any further.
The correlogram (1) and the activity of the neural oscillator array (2) are shown at the top of the screen. Below the correlogram is the summary (or pooled) autocorrelation function (3). Peaks in this function at longer lags correspond to the period of the stimulus, and the information at shorter lags is related to vowel timbre. You can select the time step at which to view the oscillator activity by moving the slider (4). The best-matching vowels are displayed in rank order in the window at the bottom of the screen (5).
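To see why a peak at a longer lag marks the stimulus period, the following sketch builds a toy summary function. It is only an illustration: the two 'channels' are plain sinusoids at the second and third harmonics of 100 Hz rather than outputs of an auditory filterbank, and the sample rate, window length and lag limits are assumed values.

```python
import math


def autocorrelation(signal, max_lag):
    """Unnormalised autocorrelation of a list of samples for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + lag] for t in range(n - max_lag))
        for lag in range(max_lag + 1)
    ]


def summary_function(channels, max_lag):
    """Sum the per-channel autocorrelations (the rows of a correlogram)."""
    rows = [autocorrelation(ch, max_lag) for ch in channels]
    return [sum(column) for column in zip(*rows)]


SAMPLE_RATE = 8000  # Hz, assumed
F0 = 100.0          # Hz
N = 800             # samples per channel, assumed

# Two toy channels: harmonics 2 and 3 of the same F0.
channels = [
    [math.sin(2 * math.pi * h * F0 * t / SAMPLE_RATE) for t in range(N)]
    for h in (2, 3)
]

max_lag = 120  # 15 ms at the assumed sample rate
summary = summary_function(channels, max_lag)

# Search the longer lags (here 5 ms and above) for the largest peak.
min_lag = 40
peak_lag = max(range(min_lag, max_lag + 1), key=lambda lag: summary[lag])
period_ms = 1000.0 * peak_lag / SAMPLE_RATE
```

Neither toy channel contains energy at 100 Hz, yet the summed function peaks at the lag where both channels repeat together, which is the 10 ms period of the common F0.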
Use the Simulation menu to select a different condition of F0 separation between the two vowels. Six conditions are provided, corresponding to F0 differences between the two vowels of 0, 0.25, 0.5, 1, 2 and 4 semitones.
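The semitone conditions can be converted to F0 values in Hz with the usual equal-tempered relation, in which one semitone is a frequency ratio of 2^(1/12). The short sketch below assumes that relation and a base F0 of 100 Hz; the names are illustrative.

```python
BASE_F0_HZ = 100.0
SEMITONE_STEPS = [0, 0.25, 0.5, 1, 2, 4]


def shifted_f0(base_hz, semitones):
    """F0 that lies `semitones` equal-tempered semitones above base_hz."""
    return base_hz * 2.0 ** (semitones / 12.0)


for st in SEMITONE_STEPS:
    print(f"{st:>4} semitones -> {shifted_f0(BASE_F0_HZ, st):.1f} Hz")
```

The largest separation, 4 semitones, gives an F0 of about 126 Hz, which agrees with the upper end of the range quoted for the EE vowel.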
1. How many vowels are identified when the F0 of both vowels is 100 Hz (i.e. the condition in which the semitone difference in F0 is zero)?
2. What are the assumptions that the Brown and Wang model makes? How could these be overcome?
G. J. Brown and D. L. Wang (1997) Modelling the segregation of double vowels with a network of neural oscillators. Neural Networks, 8.
R. Meddis and M. J. Hewitt (1992) Modelling the identification of concurrent vowels with different fundamental frequencies. JASA, 91(1), 233-245.
See also the demonstrations for the Terman-Wang oscillator (wangNeuron), the oscillator network (wangNetwork) and the double vowel explorer (vowelExplorer).
Produced by: Guy J. Brown
Release date: June 22 1998
Permissions: This demonstration may be used and modified freely by anyone. It may be distributed in unmodified form.