Ph.D. Opportunities in SpandH
These paragraphs outline some of the research topics in the group suitable for research students.
Modelling the visual Lombard effect
Supervisor: Jon Barker
In noisy environments, talkers subconsciously alter their speaking style to make their speech more easily heard against the noise background; this is known as the 'Lombard' effect. The acoustic differences between normal speech and Lombard speech have been studied in detail, but there has been surprisingly little study of visual differences, i.e. changes in the pattern of lip, jaw and face movements. An understanding of the visual aspect of Lombard speech is necessary to inform the design of improved audio-visual automatic speech recognition (AV-ASR) systems.
The project will make use of the Speech and Hearing group's audio-visual speech recording facilities. A set of audio-visual Lombard speech recordings will be made by asking subjects to read prompts while wearing headphones that deliver noise at a variety of levels. Detailed 2-D visual information will be extracted using artificial markers and existing video-based marker tracking techniques. The data will be used to train noise-level-dependent viseme models, which will be evaluated by incorporating them into existing AV-ASR systems.
Clinical Applications of Speech Technology
Supervisor: Phil Green or Roger Moore
The projects in CAST have shown that Speech Technology can be used effectively in Speech Training and Assistive Technology. There are a number of possibilities for Ph.D. projects in the CAST area:
Information Access from Spoken Language
Supervisor: Yoshi Gotoh
We have a well-established research effort in the general area of accessing information from spoken language, particularly from broadcast speech. This research has included work in spoken document retrieval, automatic punctuation and identification of named entities. We have focussed on the use of trainable, statistical models, typically finite state models (e.g. Hidden Markov Models, HMMs).
Future research will continue these themes. There is a wide range of potential Ph.D. projects, which may focus on new models (statistical translation models, maximum entropy models), new tasks (summarization, question answering), or the incorporation of further acoustic information (prosody).
Projects of particular interest include:
Language Modelling
Supervisor: Yoshi Gotoh
Potential language modelling projects include: