Empirically designing and evaluating a new revision-based model for summary generation

Authors:

Abstract

We present a system for summarizing quantitative data in natural language, focusing on the use of a corpus of basketball game summaries, drawn from on-line news services, to empirically shape the system design and to evaluate our approach. Our initial corpus analysis revealed characteristics of textual summaries that challenge the capabilities of current language generation systems. To meet these challenges, we developed a revision-based model for summary generation and implemented it in our prototype system STREAK. A second, detailed corpus analysis was used to identify and encode the revision rules of the system. Finally, we carried out a quantitative evaluation, using several test corpora, to measure the robustness of the new revision-based model. Our results show that the new model offers better coverage and extensibility than the traditional language generation model.

Keywords:

Review history: Available online 20 February 1999.

Official paper URL: https://doi.org/10.1016/0004-3702(95)00125-5