Multisource decoding for speech in the presence of other sound sources


References for the overview and work programme

J. Barker, L. Josifovski, M.P. Cooke and P.D. Green (2000a), Soft decisions in missing data techniques for robust automatic speech recognition, Proc. ICSLP-2000, Beijing, vol. I, pp 373-376.

J. Barker, M.P. Cooke and D.P.W. Ellis (2000b), Decoding speech in the presence of other sound sources, Proc. ICSLP-2000, Beijing, vol. IV, pp 270-273.

G.J. Brown & M.P. Cooke (1994) Computational auditory scene analysis. Computer Speech and Language 8, 297-336.

G. J. Brown & D. L. Wang (1997) Modelling the perceptual segregation of concurrent vowels with a network of neural oscillators. Neural Networks, 10 (9), pp. 1547-1558.

G. J. Brown & M. P. Cooke (1998) Temporal synchronisation in a neural oscillator model of primitive auditory stream segregation. In Computational Auditory Scene Analysis, edited by D. F. Rosenthal and H. G. Okuno. Mahwah, NJ: Lawrence Erlbaum, pp. 87-101.

G. J. Brown & D. L. Wang (2000) An oscillatory correlation framework for computational auditory scene analysis. In Advances in Neural Information Processing Systems 12, edited by S. A. Solla, T. K. Leen and K. Muller, MIT Press, pp. 747-753.

M. P. Cooke (1991) Modelling auditory processing and organisation, Ph.D. Thesis, University of Sheffield. Published by Cambridge University Press in 1993.

M. P. Cooke, P. D. Green, L. Josifovski & A. Vizinho (in press), Robust Automatic Speech Recognition with Missing and Uncertain Acoustic Data, Speech Communication.

M. P. Cooke & P. D. Green (in press), Auditory organisation and speech perception: pointers for robust ASR. In Listening to Speech, ed. Greenberg & Ainsworth, Oxford.

M. P. Cooke & D. P. W. Ellis (in press) The auditory organization of speech and other sources in listeners and computational models, accepted by Speech Communication.

M.P. Cooke, A. C. Morris & P. D. Green (1997) Missing data techniques for robust speech recognition, ICASSP-97.

P.D. Green, M. P. Cooke & M. D. Crawford (1995), Auditory Scene Analysis and HMM Recognition of Speech in Noise, ICASSP-95, Detroit, pp. 401-404.

P.D. Green, J. Barker, M.P. Cooke and L. Josifovski (2001), Handling Missing and Unreliable Information in Speech Recognition, AISTATS 2001, Florida, pp. 49-56.

L. Josifovski, M. Cooke, P. Green & A. Vizinho (1999) State based imputation of missing data for robust speech recognition and speech enhancement, Eurospeech-99, Budapest, Vol. 6, 2837-2840.

A. C. Morris, M. P. Cooke & P. D. Green (1998) Some solutions to the missing feature problem in data classification, with application to noise-robust ASR, ICASSP-98.

D.L. Wang & G.J. Brown (1999) Separation of speech from interfering sounds using oscillatory correlation. IEEE Transactions on Neural Networks, 10 (3), pp. 684-697.

A. S. Bregman (1990) Auditory Scene Analysis, MIT Press, Cambridge MA.

A. de Cheveigné (1997) Concurrent vowel identification. III. A neural model of harmonic interference cancellation, JASA 101 (5), 2857-2865.

C.J. Darwin & R.P. Carlyon (1995) Auditory Grouping. In: The Handbook of Perception & Cognition, Vol. 6 Hearing (Ed: B.C.J. Moore) Academic Press, 387-424.

C.J. Darwin & R.W. Hukin (1998) Perceptual segregation of a harmonic from a vowel by interaural time difference in conjunction with mistuning and onset asynchrony, JASA 103 (2), 1080-1084.

D.P.W. Ellis (1997). Computational Auditory Scene Analysis exploiting Speech-Recognition knowledge, Proc. IEEE workshop on Apps. of Sig. Proc. to Aud. & Acous., Mohonk.

S. Furui (1997) Recent advances in robust speech recognition, Proc. ESCA-NATO ARW on Robust Speech Recognition for Unknown Communication Channels, France, 11-20.

M.J.F. Gales & S.J. Young (1992) An improved approach to hidden Markov model decomposition of speech and noise, ICASSP-92, 233-236.

M.J.F. Gales & S.J. Young (1993) HMM recognition in noise using parallel model combination, Eurospeech-93, 837-840.

Y. Gong (1995) Speech Recognition in Noisy Environments: A Survey, Speech Communication 16, 261-291.

H. Hermansky & N. Morgan (1994) RASTA processing of speech, IEEE Trans. Speech & Audio, 2(4), 578-589.

R.P. Lippmann & B. Carlson (1997a) Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise, Proc. Eurospeech-97, pp. 37-40.

R.P. Lippmann (1997b) Speech Recognition by Humans and Machines, Speech Communication 22 (1), pp. 1-16.

P. Lockwood & J. Boudy (1992), Experiments with a Nonlinear Spectral Subtractor, Hidden Markov Models and the Projection for Robust Speech Recognition in Cars, Speech Communication, 11.

H.G. Okuno, T. Nakatani & T. Kawabata (1999), "Listening to two simultaneous speeches", Speech Communication, Elsevier (in press).

D. Pearce & H.-G. Hirsch (2000), The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions, Proc. ICSLP-2000, Beijing, vol. IV, pp 29-32.

A.P. Varga & R.K. Moore (1990) Hidden Markov model decomposition of speech and noise, Proc. ICASSP'90, pp. 845-848.