S3L: Statistical Summarization of Spoken Language

Funded by EPSRC (GR/R42405) from 15 December 2001 to 14 June 2005

Investigators: Steve Renals and Yoshi Gotoh
Research Associate: Heidi Christensen
Research Student: BalaKrishna Kolluru
Industrial Collaborators: BBC Research and Development Department; SoftSound


The main aim of the proposed research is the automatic summarization of broadcast speech. We plan to adopt a statistical approach to the problem, including the development of new models and algorithms for summarization, an investigation of the utility of prosodic features, and the construction and evaluation of demonstration systems.

This project is primarily concerned with developing methods for the non-extractive summarization of spoken language using trainable statistical models. Although rule-based approaches have had some success, they have tended to be domain-specific and typically require a large amount of effort to encode the domain knowledge as a template or script. Statistical methods have the potential to remove the bottleneck of manually encoding domain knowledge, and to increase the generality of summarization systems. Furthermore, we are specifically concerned with spoken language, which is more casual and less grammatical than text. We believe that statistical methods are well suited to this situation, particularly given the presence of speech recognition errors. Recent research in areas such as named entity identification has indicated that the relatively simple methods that have proven to be so successful in speech recognition may be applied to more demanding language processing tasks. A key scientific question that this project will address is whether such simple models may be applied to more complex tasks, such as summarization.

The main specific objectives are the development, implementation and evaluation of the following techniques for broadcast speech:

  1. Extractive summarization;
  2. Direct generative summarization using language model approaches;
  3. Content/style models for non-extractive summarization;
  4. Multi-document summarization;
  5. Incorporation of prosodic features using maximum entropy models.

A final objective is the construction of demonstration systems employing these techniques.
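
To make the term "extractive summarization" in objective 1 concrete, the following is a minimal sketch of one common baseline: score each sentence by the average document-wide frequency of its content words and keep the top-scoring sentences in their original order. It is an illustration only and is not taken from the project's systems; the function names, the tiny stopword list and the regular-expression sentence splitter are all assumptions made for this example, and real broadcast speech would first need recognition and segmentation.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "on", "in", "and", "to", "is", "will", "said"}


def split_sentences(text):
    """Split on sentence-final punctuation; a stand-in for a real segmenter."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(sentence):
    """Lower-case the sentence and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Return the num_sentences highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    words_per_sentence = [content_words(s) for s in sentences]

    # Document-wide frequency of each content word.
    freq = Counter(w for words in words_per_sentence for w in words)

    def score(words):
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(words_per_sentence[i]),
                    reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore original order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    story = ("The storm hit the coast on Monday. "
             "Officials said the storm damaged hundreds of homes along the coast. "
             "A local bakery won an award. "
             "Storm repairs along the coast will take months.")
    for sentence in extractive_summary(story, num_sentences=2):
        print(sentence)
```

Because the output consists only of sentences copied from the input, a summarizer of this kind cannot rephrase or merge material, which is the limitation that the non-extractive objectives above are meant to address.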

  • B. Kolluru, H. Christensen and Y. Gotoh
    Multi-Stage Compaction approach to Broadcast News Summarization.
    In Proc. of Interspeech 2005, Lisbon, Portugal, 2005.
    [ps | pdf].
  • S. Simpson and Y. Gotoh
    Towards Speaker Independent Features for Information Extraction from Meeting Audio Data.
    In Proc. of MLMI Workshop 2005, Edinburgh, UK, 2005.
    [ Extended Abstract | poster ]
  • B. Kolluru and Y. Gotoh
    On the Subjectivity of Human Authored Short Summaries.
    In Proc. of the ACL Workshop on Intrinsic and Extrinsic Evaluation
    Measures for Machine Translation and/or Summarization,
    Michigan, USA, 2005.
  • H. Christensen, B. Kolluru, Y. Gotoh and S. Renals
    Maximum Entropy Segmentation of Broadcast News.
    In Proc. of ICASSP 2005, Philadelphia, USA, 2005.
    [ps | pdf].
  • B. Kolluru, H. Christensen and Y. Gotoh
    Decremental Feature-based Compaction.
    In Proc. of HLT/NAACL Annual Meeting at Document Understanding Workshop
    by Document Understanding Conference, Boston, USA, 2004.
    [ps | pdf].
  • H. Christensen, B. Kolluru, Y. Gotoh and S. Renals
    From text summarisation to style-specific summarisation for broadcast news.
    In Proc. of ECIR'04, Sunderland, UK, 2004.
    [ps | pdf].
  • B. Kolluru, H. Christensen, Y. Gotoh and S. Renals
    Exploring the style-technique interaction in extractive summarization of broadcast news.
    In Proc. of Automatic Speech Recognition and Understanding Workshop, St. Thomas, 2003.
    [ps | pdf].
  • H. Christensen, Y. Gotoh, B. Kolluru and S. Renals
    Are extractive text summarisation techniques portable to broadcast news?
    In Proc. of Automatic Speech Recognition and Understanding Workshop, St. Thomas, 2003.
    [ps | pdf].