Christian Füllgrabe
MRC Institute of Hearing Research, Nottingham
Beyond audibility - The role of supra-threshold auditory and cognitive processing in speech perception across the adult lifespan
Anecdotal evidence and experimental investigations indicate that older people experience increased speech-perception difficulties, especially in noisy environments. Since peripheral hearing sensitivity declines with age, lower speech intelligibility is generally attributed to reduced audibility. However, aided speech perception in hearing-impaired listeners frequently falls short of the performance level that would be expected from the audibility of the speech signal. Given that many of these listeners are older, poor performance may be partly caused by age-related changes in supra-threshold auditory and/or cognitive processing that are not captured by an audiometric assessment. The presentation will discuss experimental evidence from clinically normal-hearing adult listeners showing that auditory temporal processing, cognition, and speech-in-noise perception are indeed linked and, independently of hearing loss, decline across the adult lifespan. These findings highlight the need to take such audibility-unrelated factors into account in the prediction and rehabilitation of speech intelligibility.
Alessandro Di Nuovo
Centre for Automation and Robotics Research, Sheffield Hallam University
Number Understanding Modelling in a Behavioural Embodied Robot
The talk will present recent cognitive developmental robotics studies of deep artificial neural network architectures that model the learning of associations between (motor) finger counting, (visual) object counting, and (auditory) number words and sequence learning, in order to explore whether finger counting, and the association of number words or digits with each finger, could serve to bootstrap the representation of number.
The results obtained in experiments with the iCub humanoid robotic platform show that learning number word sequences together with finger sequencing speeds the building of the robot's initial representation of number. Just as has been found with young children, through the use of finger-counting and verbal-counting strategies such a robotic model develops finger and word representations that subsequently support the robot's learning of the basic arithmetic operation of addition. A toy illustration of this kind of association learning is sketched below.
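To make the idea concrete, here is a minimal sketch, emphatically not the authors' iCub architecture: a tiny two-layer network (NumPy only, plain gradient descent) that learns to map cumulative finger configurations (motor) to one-hot "number word" targets (auditory) through a shared hidden layer. All sizes, learning rates, and encodings are illustrative assumptions.

```python
# Toy sketch of finger-to-word association learning (not the iCub model).
import numpy as np

rng = np.random.default_rng(0)

# Finger counting: the number n is coded as the first n fingers raised.
fingers = np.tril(np.ones((10, 10)))   # row n-1 = motor state for number n
words = np.eye(10)                     # one-hot "number word" targets

# Tiny two-layer network: 10 motor units -> 8 hidden -> 10 word units.
W1 = rng.normal(0.0, 0.5, (10, 8))
W2 = rng.normal(0.0, 0.5, (8, 10))

def forward(x):
    h = np.tanh(x @ W1)                          # shared hidden representation
    logits = h @ W2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return h, e / e.sum(axis=1, keepdims=True)   # softmax over number words

for _ in range(2000):                            # batch gradient descent
    h, p = forward(fingers)
    g_out = (p - words) / len(fingers)           # softmax cross-entropy gradient
    g_h = (g_out @ W2.T) * (1.0 - h**2)          # backprop through tanh
    W2 -= 0.5 * (h.T @ g_out)
    W1 -= 0.5 * (fingers.T @ g_h)

_, p = forward(fingers)
print("word accuracy:", (p.argmax(1) == words.argmax(1)).mean())
```

The triangular finger code gives neighbouring numbers overlapping input patterns, which is one way an embodied representation can later support ordering and addition.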
The ambition of the current work is to exploit embodied mathematical processing, often considered the archetypal example of abstract and symbolic processing, as a fundamental cognitive capability of the next generation of interactive robots with human-like learning behaviours. This should improve the acceptance of robots in socially interactive environments and thereby broaden the socio-economic applications of future robots, particularly for tasks once thought too delicate to automate, in fields such as social care, companionship, child therapy, domestic assistance, entertainment, and education.
Bibliography
- A. Di Nuovo, V. M. De La Cruz, and A. Cangelosi, “Grounding fingers, words and numbers in a cognitive developmental robot,” in 2014 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain (CCMB), 2014, pp. 9–15.
- A. Di Nuovo, V. M. De La Cruz, A. Cangelosi, and S. Di Nuovo, “The iCub learns numbers: An embodied cognition study,” in International Joint Conference on Neural Networks (IJCNN 2014), 2014, pp. 692–699.
- V. M. De La Cruz, A. Di Nuovo, S. Di Nuovo, and A. Cangelosi, “Making fingers and words count in a cognitive robot,” Front. Behav. Neurosci., vol. 8, p. 13, Feb. 2014.
- A. Di Nuovo, V. M. De La Cruz, and A. Cangelosi, “A Deep Learning Neural Network for Number Cognition: A bi-cultural study with the iCub,” in IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) 2015, 2015, pp. 320–325.
- A. Cangelosi, A. Morse, A. Di Nuovo, M. Rucinski, F. Stramandinoli, D. Marocco, V. De La Cruz, and K. Fischer, “Embodied language and number learning in developmental robots,” in Conceptual and Interactive Embodiment: Foundations of Embodied Cognition, vol. 2, Routledge, 2016, pp. 275–293.
Dr Tony Tew
Audio Lab, Department of Electronics, University of York
Around the head in 80 ways
Abstract: The complex shape (morphology) of the outer ears and their uniqueness to each listener continue to pose challenges for the successful introduction of binaural spatial audio on a large scale. This talk will outline approaches being taken in the Audio Lab Research Group at York to address some of these problems.
Morphoacoustic perturbation analysis (MPA) is a powerful technique for relating features of head-related transfer functions to their morphological origins. The principles of MPA will be described and some initial validation results presented. One way in which MPA may assist with estimating individualised HRTFs will be discussed.
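As a toy numerical illustration of the perturbation idea, and emphatically not the actual MPA method or its boundary-element machinery, consider a single hypothetical morphological parameter: a pinna-cavity depth d producing a quarter-wavelength reflection notch near c/(4d). A finite-difference perturbation of d then yields the sensitivity of that HRTF feature to the parameter; MPA addresses the far harder problem of doing this over full ear morphologies.

```python
# Toy perturbation analysis: sensitivity of a spectral-notch feature
# to one (hypothetical) morphological parameter. Illustration only.
import numpy as np

C = 343.0  # speed of sound, m/s

def notch_freq(depth_m):
    """Hypothetical morphology->feature map: quarter-wave notch frequency."""
    return C / (4.0 * depth_m)

d0 = 0.012    # nominal cavity depth: 12 mm (assumed value)
eps = 1e-5    # small morphological perturbation, in metres

# Central finite difference approximates d(feature)/d(morphology).
sens = (notch_freq(d0 + eps) - notch_freq(d0 - eps)) / (2.0 * eps)

print(f"nominal notch: {notch_freq(d0):.0f} Hz")
print(f"sensitivity: {sens / 1000.0:.0f} Hz per mm of depth change")
```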
Alternative approaches for determining the perceptual performance of a binaural audio system will be considered. An obvious problem is how to compare a virtual sound rendered binaurally with the equivalent real 3D sound without the listener knowing which they are hearing. This discussion will lead into a brief outline of recent efforts in broadcasting to improve the quality of experience for listeners to binaural spatial audio.
Biography: Tony Tew is a senior lecturer in the Department of Electronics at the University of York. He has a particular interest in auditory acoustics, spatial hearing and applications of binaural signal processing. Collaborators on the work presented in this talk include the University of Sydney, Orange Labs, BBC R&D and Meridian Audio, with additional support from EPSRC and the Australian Research Council.
Host: Guy Brown (g.j.brown@sheffield.ac.uk)
Professor Yannis Stylianou
Professor of Speech Processing at the University of Crete and Group Leader of the Speech Technology Group at Toshiba Cambridge Research Lab, UK.
Speech Intelligibility and Beyond
Abstract: Speech is highly variable in terms of its clarity and intelligibility. Especially in adverse listening contexts (noisy environment, hearing loss, level of language acquisition, etc.), speech intelligibility can be greatly reduced. The first question we will discuss is: can we modify speech before presenting it in the listening context, with the goal of increasing its intelligibility? Although simply raising the speech volume is the usual solution in such situations, it is well known that this is suboptimal, both in terms of signal distortion and of listener comfort. In this talk, I will present advances in speech signal processing that have been shown to greatly improve the intelligibility of speech in various conditions without increasing the speech level. I will show results for normal-hearing listeners in near and far field, listeners with mild to moderate hearing losses, and children with certain learning disabilities, and discuss possible applications. A minimal sketch of one such level-preserving modification follows.
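To give a flavour of this family of modifications, and not the specific algorithms from the talk or the LISTA project, the sketch below applies a simple band-emphasis spectral shaping and then rescales so the output RMS matches the input: energy is reallocated towards the 1–4 kHz region, where many consonant cues live, rather than the level being raised. The filter order, band edges, and gain are illustrative assumptions.

```python
# Level-preserving band emphasis: a hedged sketch, not the talk's method.
import numpy as np
from scipy.signal import butter, sosfilt

def enhance_equal_rms(x, fs, low=1000.0, high=4000.0, gain_db=9.0):
    """Boost the [low, high] band, then rescale to the input's RMS."""
    sos = butter(2, [low, high], btype="bandpass", fs=fs, output="sos")
    shaped = x + (10.0 ** (gain_db / 20.0) - 1.0) * sosfilt(sos, x)
    # Equal-RMS constraint: energy moves into the emphasised band
    # instead of the overall level going up.
    return shaped * np.sqrt(np.mean(x**2) / np.mean(shaped**2))

# Example on a synthetic harmonic "vowel" at 16 kHz sampling.
fs = 16000
t = np.arange(fs) / fs
x = sum(np.sin(2 * np.pi * f * t) / k
        for k, f in enumerate([150, 300, 450, 2500], start=1))
y = enhance_equal_rms(x, fs)
print("RMS in/out:", np.sqrt(np.mean(x**2)), np.sqrt(np.mean(y**2)))
```

Published approaches in this family typically combine such shaping with dynamic range compression and adapt to the noise; the sketch shows only the equal-RMS constraint.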
We will also discuss ways to evaluate intelligibility, objectively and subjectively, and comment on relatively recent results from two large-scale international evaluations, including the Hurricane Challenge (http://listening-talker.org/hurricane/). The results I will show are partly based on my group's work in an FP7 FET-OPEN project, The Listening Talker.
Finally, we will pose a second question: is it sufficient to increase the intelligibility of speech without paying attention to the effort or cognitive load of the listener? This question will not be answered during the talk; we plan to address it in a new Horizon 2020 Marie Curie ETN project (2016-2019) that is about to start. So I will only put the question on the table and advertise the project, hoping to find interested PhD candidates in the audience (to work, for example, on the beautiful island of Crete in Greece).
Biography: Yannis Stylianou is Professor of Speech Processing in the Department of Computer Science, University of Crete (CSD UOC), and Group Leader of the Speech Technology Group at Toshiba Cambridge Research Lab, UK. Until 2012 he was also an Associated Researcher in the Signal Processing Laboratory of the Institute of Computer Science (ICS) at FORTH. During the academic year 2011-2012 he was a visiting Professor at AHOLAB, University of the Basque Country, Bilbao, Spain. He received the Diploma in Electrical Engineering from the National Technical University of Athens (N.T.U.A.) in 1991, and the M.Sc. and Ph.D. degrees in Signal Processing from the École Nationale Supérieure des Télécommunications (ENST), Paris, France, in 1992 and 1996, respectively. From 1996 until 2001 he was with AT&T Labs Research (Murray Hill and Florham Park, NJ, USA) as a Senior Technical Staff Member. In 2001 he joined Bell Labs, Lucent Technologies, in Murray Hill, NJ, USA (now Alcatel-Lucent). Since 2002 he has been with the Computer Science Department at the University of Crete, and since January 2013 he has also been with Toshiba Labs in Cambridge, UK.
His current research focuses on speech signal processing algorithms for speech analysis, statistical signal processing (detection and estimation), and time-series analysis/modelling. He has (co-)authored more than 170 scientific publications and holds about 20 UK and US patents; his publications have received more than 4400 citations (excluding self-citations), with an h-index of 31. He co-edited the book “Progress in Nonlinear Speech Processing” (Springer-Verlag, 2007). He has been the P.I. and scientific director of several European and Greek research programs and has participated as a leader in US research programs.
Among other projects, he was P.I. of the FET-OPEN project LISTA, “The Listening Talker”, whose goal was to develop scientific foundations for spoken language technologies based on human communicative strategies. In LISTA, he was in charge of speech modelling and speech modification, with the aim of proposing novel techniques for generating spoken output from artificial and natural speech.
He has created a lab for voice function assessment, equipped with high-quality instruments for speech and voice recordings (e.g., a high-speed camera), for the purpose of basic research in speech and voice as well as for services, in collaboration with the Medical School at the University of Crete.
He has served on the Board of the International Speech Communication Association (ISCA), on the IEEE Multimedia Communications Technical Committee, as a member of the IEEE Speech and Language Technical Committee, and on the Editorial Board of Elsevier's Digital Signal Processing journal. He is on the Editorial Board of the Hindawi Journal of Electrical and Computer Engineering (JECE), and an Associate Editor of the EURASIP Journal on Audio, Speech, and Music Processing (ASMP) and of the EURASIP Research Letters in Signal Processing (RLSP). He was an Associate Editor of the IEEE Signal Processing Letters, Vice-Chair of COST Action 2103 (“Advanced Voice Function Assessment”, VOICE), and a member of the Management Committee of COST Action 277 (“Nonlinear Speech Processing”).
Host: Erfan Loweimi (eloweimi1@sheffield.ac.uk)
Dr Cleopatra Pike
Institute of Sound Recording, University of Surrey
Compensation for spectral envelope distortion in auditory perception
Abstract: Modifications by the transmission channel (loudspeakers, listening rooms, vocal tracts) can distort and colour sounds, hindering recognition. Human perception appears to be robust to such channel distortions, and a number of perceptual mechanisms appear to produce compensation for channel acoustics. Lab tests mimicking ‘real-world’ listening show that compensation reduces the colouration caused by the channel to a moderate to large extent. These tests also point to the psychological and physiological mechanisms that may be involved in this compensation. These mechanisms will be discussed, and further work to uncover how humans remove distortions caused by transmission channels will be put forward.
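For a rough machine-listening analogue of this compensation, and not a model of the perceptual mechanisms in the talk, the sketch below estimates the channel's colouration as the average log-magnitude spectrum over the surrounding context and subtracts it from each frame, in the spirit of cepstral mean normalisation. Frame length and the use of a simple global mean are assumptions.

```python
# Log-spectral mean subtraction as a crude channel "decolouration".
import numpy as np
from scipy.signal import stft, istft

def decolour(x, fs, nperseg=512):
    """Remove a slowly varying channel estimate from a signal's spectrum."""
    f, t, X = stft(x, fs=fs, nperseg=nperseg)
    logmag = np.log(np.abs(X) + 1e-10)
    # Channel estimate: mean log-magnitude across frames (context average).
    channel = logmag.mean(axis=1, keepdims=True)
    # Subtract in the log domain (divide out the channel), keep phase.
    Y = np.exp(logmag - channel) * np.exp(1j * np.angle(X))
    _, y = istft(Y, fs=fs, nperseg=nperseg)
    return y
```

Note that this blunt estimate also removes the talker's own long-term average spectrum, which illustrates why disentangling source from channel, as human listeners appear to do, is non-trivial.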
Biography: Cleo's interest in audio perception and the hearing system began with her studies in Music Production at the Academy of Contemporary Music. To pursue this interest further, she obtained an MSc in psychological research in 2009 and a PhD in psychoacoustics in 2015. Her PhD involved measuring the extent to which human listeners adapt to transmission channel acoustics (e.g., loudspeakers, rooms, and vocal tracts) and examining the psychological and neural mechanisms involved. Cleo has also worked as a research statistician and as a research methods and statistics lecturer at Barts and The London School of Medicine, part of Queen Mary University of London. Her ultimate research aim is to ascertain how knowledge of human hearing processes can be used to benefit machine listening algorithms and the construction of listening environments, such as concert halls.
Host: Amy Beeston (a.beeston@sheffield.ac.uk)