  • Google Speech Processing from Mobile to Farfield
    Michiel Bacchiani (Google)
    Abstract: Recent years have seen large-scale adoption of speech recognition by the public, in particular on mobile devices. Google, with its Android operating system, has integrated speech recognition as a key input modality. The century's worth of speech that our systems process each day shows how popular speech processing has become. This talk will briefly describe some of the history and highlight some of the technical challenges we faced.
    More recently, home farfield devices, as popularized by Amazon Echo, have led to a major research emphasis on speech processing in farfield conditions. This talk will describe the Google research effort that underpins the upcoming Google Home devices. It will describe how our neural network technology processes multi-channel data and implicitly learns to localize and beamform the incoming signal. We show three distinct approaches to implementing this. The first uses factored raw waveform processing in the input layers. The second processes the complex FFT signal in the input layer. The third uses an adaptive filtering approach.
