HOARSE RESEARCHERS

 

Eva Bjorkner

Working at HUT

"I started my doctoral studies in Music Acoustics in March 2003 with a background as singer and singing teacher. The red thread through my thesis is to explore voice source differences between singing styles i.e., classical and non-classical singing by looking at the influence of subglottal pressure on the voice source, the Amplitude Quotient AQ (Alku & Vilkman, 1996) and the Normalized AQ, called NAQ (Alku et al., 2002). AQ and NAQ have shown to separate between tokens of different modes of phonation giving an indication about the voice quality. My studies so far are on; MRI images and acoustical characteristics for throaty voice quality, register differences in female musical theatre singers, subglottal pressure and NAQ variation in classically trained baritone singers, and subglottal pressure, AQ and NAQ in male musical theatre singers. In the future I want to study singing while being in- or out of balance, and if possible combine the acoustical measurements with measurements of brain activity."

Eva's April 05 presentation (ppt)


Julien Bourgeois

Working at Daimler-Chrysler

Topic: Blind Source Separation for Speech Enhancement 

The subject of this thesis is the development of a speech enhancement method able to blindly separate multiple speech sources. When several people speak at the same time in front of a speech-based human-machine interface, separating their individual speech becomes necessary, since only one signal should be passed as input to a speech recognition component. 
Technically, source separation is required whenever several sensors measure mixtures of multiple sources. If no prior knowledge is available about how the sources are mixed, the problem is termed blind source separation (BSS). The study aims to build upon previous work in statistical BSS, microphone arrays and speech processing. The particularity of this work lies in the use of the information contained in the acoustic response of the room and in the properties of speech, both of which provide significant prior knowledge about the sources and their mixing. The car interior will serve as the main application environment for developing and testing the research.
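To make the idea of blind separation concrete, the toy Python sketch below unmixes an instantaneous (non-reverberant) two-channel mixture with independent component analysis, using scikit-learn's FastICA. This is only an illustration of the BSS principle under a strongly simplified mixing assumption; real room acoustics are convolutive, and the method developed in this thesis is not the one shown here.

    import numpy as np
    from sklearn.decomposition import FastICA

    fs = 16000
    t = np.arange(2 * fs) / fs
    s1 = np.sin(2 * np.pi * 220 * t)              # stand-ins for two speech sources
    s2 = np.sign(np.sin(2 * np.pi * 3 * t))
    S = np.c_[s1, s2]

    A = np.array([[1.0, 0.6],                     # mixing matrix, unknown to the separator
                  [0.5, 1.0]])
    X = S @ A.T                                   # two simulated microphone signals

    ica = FastICA(n_components=2, random_state=0)
    S_est = ica.fit_transform(X)                  # recovered sources, up to scaling and permutation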


Julien's April 05 presentation (ppt)

Guillaume Lathoud

Working at IDIAP

Guillaume Lathoud has been working on speech processing with multiple microphones at IDIAP (Switzerland) for the past three years. From speech recorded with several microphones at known locations in space, it is possible to compare the various signals and infer the location of the active speaker(s). Detecting and locating the active speaker(s) can be very useful for automatic processing of spontaneous speech, for example in meeting rooms. One example application is automatic summarization of meetings, which can significantly help a person who was not present at the meeting to access the most interesting parts of its recording. Guillaume's research has followed two main directions. The first is to develop robust techniques for detecting and locating multiple sound sources at the same time. This is motivated by the large amount of overlap and interruption in spontaneous multi-party conversations, as well as by the presence of interference such as projectors and laptops. The second is to use speaker location information for higher-level applications, such as automatic segmentation of speaker turns and speaker tracking over time. The techniques developed by Guillaume are general and can be applied to other fields, as shown by collaborations with Daimler-Chrysler (speech enhancement in cars) and with computer vision researchers at IDIAP (audio-visual speaker tracking).
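As a point of reference for the localisation step, one standard building block is estimating the time difference of arrival (TDOA) between a microphone pair, for instance with the GCC-PHAT cross-correlation sketched below in Python. This is a generic textbook formulation, not the specific multi-source detection and localisation method developed at IDIAP.

    import numpy as np

    def gcc_phat_tdoa(x1, x2, fs):
        """Estimate the delay of x2 relative to x1 (in seconds) with GCC-PHAT."""
        n = len(x1) + len(x2)
        X1 = np.fft.rfft(x1, n)
        X2 = np.fft.rfft(x2, n)
        cross = X2 * np.conj(X1)
        cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep only phase information
        cc = np.fft.irfft(cross, n)
        max_shift = n // 2
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        return (np.argmax(np.abs(cc)) - max_shift) / fs

The estimated delay, together with the known microphone spacing and the speed of sound, constrains the direction from which the speech arrives.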


Guillaume's April 05 presentation

Elvira Perez

Working at Liverpool

I am interested in understanding speech perception, in particular the mediating role that attention plays in the resulting speech percept. Until now I have focused mainly on the effect of noise and on the parameters (rhythm, amplitude or degree of variability) that affect linguistic performance. Only recently have I concentrated more on the role of the listener's attention, finding this variable difficult to control when collecting behavioural data, hence the need to contrast these data with psychophysical data. Within my PhD I have employed EEG methodology to collect data regarding listeners' attention.

Elvira's October 04 presentation





John Worley

Working at Patras

My field of interest is psychoacoustics, particularly auditory scene analysis. I am interested in how the auditory system parses its input signals into coherent objects. Many cues exist to aid the auditory system in its parsing task; the most salient are probably pitch and location. During my PhD I assessed auditory stream segregation as a function of vertical location and pitch. Within the HOARSE project I have been interested in auditory stream segregation in reverberant environments. Whilst at the University of Patras my work has focused on understanding the multiple perceptual effects of room dereverberation and loudspeaker equalisation. In collaboration with the Institute of Communication Acoustics (IKA) at Ruhr University Bochum, I assessed whether participants could learn the cues associated with another individual's ears in an anechoic environment via coincident visual presentation. Also whilst at IKA, I measured localisation ability in large reverberant environments as a function of pure tone salience (the Franssen illusion). An extension of the work at IKA has been to study the integration of auditory grouping cues, specifically how localisation is mediated by pitch grouping in a reverberant environment.


John's April 2005 Presentation