Publications by HOARSE Researchers

 

Eva Björkner

  • Björkner, E. and Sundberg, J.: “MR and Area function Study on Throaty Voice Quality”, Proc. ISCA workshop on Voice Quality, Geneva, Switzerland, August 27-29, 2003
  • Eva Björkner, Johan Sundberg, Tom Cleveland and Ed Stone: “Voice source characteristics in different registers in classically trained musical theatre singers”, Proc. ICA2004, Kyoto, Japan, April 4-10, 2004; accepted for publication in Journal of Voice.
  • Björkner, E., Sundberg, J., and Alku, P. NAQ variation with Ps in classically trained baritone singers. PEVOC6. London, UK, 31 August - 3 September, 2005.
  • Björkner, Eva. The Normalized Amplitude Quotient in glottal source parameterization of the singing voice; An overview. The Voice Foundation's 34th Annual Symposium: Care of the Professional Voice, Philadelphia, USA, June 1-5, 2005.
  • Björkner, E., Sundberg, J., Alku, P. Subglottal pressure and NAQ variation in Classically Trained Baritone Singers. A STINT on voice research, York, England, April 17, 2005.
  • Björkner, E., Sundberg, J., Alku, P. Subglottal Pressure and NAQ Variation in Voice Production of Classically Trained Baritone Singers. Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology. Lisboa, Portugal, September 4-8, 2005.
  • Rosenberg, S., Sundberg, J., Björkner, E. Attracting cows - What is the dB price in “kulning”? PEVOC6. London, UK, 31 August - 3 September, 2005. 1st prize winner

 

Julien Bourgeois

  • Bourgeois, J.: A Clustering Approach to Audio Source Separation, in Proceedings of Eurospeech '03, September 1-4, 2003, pp. 1745-1748.
  • Julien Bourgeois and Klaus Linhard. Frequency-Domain Multichannel Signal Enhancement: Minimum-Variance vs. Minimum Correlation. Eusipco 2004, Vienna.
  • J. Bourgeois and J. Freudenberger, Multichannel Speech Enhancement in Cars: Implicit vs. Explicit Control, Proceedings of HSCMA2005 Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, Rutgers University, Piscataway, New Jersey, USA, March 2005
  • J. Bourgeois, G. Lathoud and J. Freudenberger, Implicit Control of Noise Canceller for Speech Enhancement, Proceedings of INTERSPEECH2005 International Conference on Speech and Language Processing, Lisbon, Portugal, September 2005
  • J. Bourgeois, An LMS viewpoint on the local stability of second-order blind source separation, Proceedings of SSP2005, IEEE Workshop on Statistical Signal Processing, Bordeaux, France, July 2005.

 

Jana Eggink

  • Eggink, J. & Brown, G.J., "A missing feature approach to instrument identification in polyphonic music," Proc. ICASSP'03, pp. 553-556, 2003
  • Eggink, J. and Brown, G.J. (2004): Instrument recognition in accompanied sonatas and concertos. Proc. International Conference on Acoustics, Speech, and Signal Processing, ICASSP'04, pp. 217-220
  • Eggink, J. and Brown, G.J. (2004): Extracting melody lines from complex audio. Proc. International Conference on Music Information Retrieval, ISMIR'04

 

Guillaume Lathoud

  • G. Lathoud, I.A. McCowan, and J.M. Odobez. Unsupervised Location-Based Segmentation of Multi-Party Speech. Proceedings of the 2004 NIST Meeting Recognition Workshop (NIST-RT04).
  • J. Ajmera, G. Lathoud and I.A. McCowan. Clustering and Segmenting Speakers and their Locations in Meetings. Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-04).
  • D. Zhang, D. Gatica-Perez, S. Bengio, I.A. McCowan and G. Lathoud. Modeling Individual and Group Actions in Meetings: a Two-Layer HMM Framework. Proceedings of CVPR 2004.
  • D. Gatica-Perez, G. Lathoud, I.A. McCowan and J.M. Odobez. A Mixed-State I-Particle Filter for Multi-Camera Speaker Tracking. Proceedings of the 2003 IEEE Int. Conf. on Computer Vision Workshop on Multimedia Technologies for E-Learning and Collaboration (ICCV-WOMTEC), 2003.
  • G. Lathoud, I.A. McCowan, and D.C. Moore. Segmenting Multiple Concurrent Speakers Using Microphone Arrays. Proceedings of Eurospeech 2003.
  • D. Gatica-Perez, G. Lathoud, I.A. McCowan, J.M. Odobez and D.C. Moore. Audio-Visual Speaker Tracking with Importance Particle Filters. Proceedings of the 2003 IEEE International Conference on Image Processing (ICIP-2003).
  • G. Lathoud and I.A. McCowan. Location based speaker segmentation. Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-03).
  • I.A. McCowan, S. Bengio, D. Gatica-Perez, G. Lathoud, F. Monay, D.C. Moore, P. Wellner and H. Bourlard. Modeling human interactions in meetings. Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-03).
  • G. Lathoud, J.M. Odobez and D. Gatica-Perez. AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking. IDIAP Research Report 04-28, 2004.
  • D. Zhang, D. Gatica-Perez, S. Bengio, I.A. McCowan and G. Lathoud. Multimodal Group Action Clustering in Meetings. IDIAP Research Report RR 04-24, 2004.
  • I.A. McCowan, D. Gatica-Perez, S. Bengio, and G. Lathoud. Automatic Analysis of Multimodal Group Actions in Meetings. IDIAP Research Report 03-27, 2003.
  • G. Lathoud, J.M. Odobez and D. Gatica-Perez. AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking. In Proceedings of the MLMI’04 Workshop, S. Bengio and H. Bourlard, Eds., Springer-Verlag, 2005.
  • D. Zhang, D. Gatica-Perez, S. Bengio, I.A. McCowan and G. Lathoud. Multimodal Group Action Clustering in Meetings. Proceedings of the ACM-VSSN’04 Workshop, 2004.
  • G. Lathoud and I.A. McCowan. A Sector-Based Approach for Localization of Multiple Speakers with Microphone Arrays. In Proceedings of the 2004 ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA-2004).
  • G. Lathoud and M. Magimai.-Doss. A Sector-Based, Frequency-Domain Approach to Detection and Localization of Multiple Speakers. In Proceedings of ICASSP’05, 2005.
  • G. Lathoud, M. Magimai.-Doss and B. Mesot. A Spectrogram Model for Enhanced Source Localization and Noise-Robust ASR. In Proceedings of Interspeech, 2005.

 

 

Viktoria Maier

  • Andrew C. Morris, Viktoria Maier and Phil Green, “From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition”, in International Conference on Spoken Language Processing (ICSLP), Jeju Island, Korea, 2004
  • Viktoria Maier and Hynek Hermansky, “Perception of synthetic consonant-vowel stimuli”, in Multimodal Interaction and Related Machine Learning Algorithms (MLMI), Martigny, Switzerland, 2004

 

 

Juha Merimaa

  • Merimaa, J. & Pulkki, V.: Perceptually-Based Processing of Directional Room Responses for Multichannel Loudspeaker Reproduction, IEEE WASPAA, New Paltz, NY, USA, October 19-22, 2003.
  • Merimaa, J: Auditorily Motivated Analysis of Directional Room Responses, 1st ISCA Tutorial & Research Workshop on Auditory Quality of Systems, Akademie Mont-Cenis, Germany, 2003. Invited talk (no written paper).
  • J. Merimaa and V. Pulkki, “Spatial Impulse Response Rendering,” in Proc. 7th Int. Conference on Digital Audio Effects, Naples, Italy, Oct. 5-8, 2004, pp. 139-144. Invited paper.
  • V. Pulkki and J. Merimaa, “Spatial Impulse Response Rendering: Listening tests and applications to continuous sound,” presented at AES 118th Convention, Barcelona, Spain, May 28-31, 2005. Preprint 6371.
  • Vassilantonopoulos, S., Merimaa, J., Worley, J., and Mourjopoulos, J. (2005). “The Acoustics of Roman Odeion of Patras: Comparing Simulations and Acoustic Measurements”, Forum Acusticum, 29th Aug. – 3rd Sept., 2005; Budapest, Hungary.
  • J. Merimaa and W. Hess, “Training of Listeners for Evaluation of Spatial Attributes of Sound,” presented at AES 117th Convention, San Francisco, CA, USA, Oct. 28-31, 2004. Preprint 6237.

 

Kalle Palomäki

  • Palomäki, K., Brown, G., and Barker, J., “Techniques For Handling Convolutional Distortion With ‘Missing Data’ Automatic Speech Recognition”, Speech Communication, Vol. 43, no. 1-2, pp. 123-142, 2004
  • Palomäki, K., Brown, G., and Wang, D., “A Binaural Processor for Missing Data Speech Recognition in the Presence of Noise and Small-Room Reverberation”, Speech Communication, 2004.
  • Brown G. J. and Palomäki K. J. (2005) "A computational model of the speech reception threshold for laterally separated speech and noise", Proceedings of Interspeech, Lisbon 4th-8th Sep, 2005.

 

Elvira Pérez

  • Pérez, E., Rodriguez-Esteban, R., Meyer, G. (31 Aug - 4 Sep, 2005). Poster title: Oreja... a MATLAB environment for the design of psychoacoustic stimuli. XIV Congress ESCOP (European Society of Cognitive Psychology), Leiden, Netherlands.
  • Meyer, G., Pérez, E. (April 10-12, 2005). Poster title: Task related differences in speech / non-speech evoked potentials. XXII Cognitive Neuroscience Society, New York, USA.

 

John Worley

  • Vassilantonopoulos, S., Merimaa, J., Worley, J., and Mourjopoulos, J. (2005). “The Acoustics of Roman Odeion of Patras: Comparing Simulations and Acoustic Measurements”, Forum Acusticum, 29th Aug. – 3rd Sept., 2005; Budapest, Hungary.
  • Worley, J. and Braasch, J. (2005). “Learning the Cues Associated with Non individualised HRTFs”, XIVth Conference of the European Society for Cognitive Psychology, 31st Aug. – 3rd Sept., 2005; Leiden, The Netherlands.
  • Worley, J., Hatziantoniou, P., and Mourjopoulos, J. (2005). “Subjective Assessments of Real-Time Dereverberation and Loudspeaker Equalization”, Paper 6461. 118th Convention of the Audio Engineering Society, 28th-31st May, 2005; Barcelona, Spain.