PE-MSC: partial entailment-based minimum set cover for text summarization

Authors: Anand Gupta, Manpreet Kaur, Sonaali Mittal, Swati Garg

Abstract

Textual Entailment (TE) is an established indicator of text connectedness that captures semantic relationships between texts. Recently, it has been used successfully to determine sentence salience in many text summarization methods. However, previous work has reported that standard textual entailment is not ideal for measuring sentence salience, because entailment relationships between sentences are quite rare in real-world texts. Therefore, we suggest using partial TE to accomplish the task of recognizing standard TE. We formulate single-document summarization as an optimization problem that is solved using a weighted Minimum Set Cover (wMSC) algorithm. In this method, sentences are broken into fragments, and partial TE is used to form sets of fragments. Finally, wMSC is applied to these sets to obtain the minimum set cover, which corresponds to the summary of the document. Results on the DUC 2002 dataset, evaluated using ROUGE and other quality metrics, show that the proposed method outperforms the state of the art.
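
To make the set-cover formulation concrete, the following is a minimal sketch of a standard greedy approximation for weighted minimum set cover. It is not the authors' implementation: the abstract does not specify how wMSC is solved, how fragments are extracted, or how weights are assigned. The function name `greedy_weighted_set_cover`, the fragment identifiers, and the weights are all hypothetical, and the sets of fragments are assumed to have already been produced by a partial TE step that is not shown here.

```python
from typing import Dict, Hashable, List, Set


def greedy_weighted_set_cover(
    universe: Set[Hashable],
    candidates: Dict[str, Set[Hashable]],
    weights: Dict[str, float],
) -> List[str]:
    """Greedy approximation of weighted minimum set cover.

    Repeatedly picks the candidate set with the lowest weight per newly
    covered element until every element of `universe` is covered.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        # Sorted iteration keeps tie-breaking deterministic.
        for name in sorted(candidates):
            gain = len(candidates[name] & uncovered)
            if gain == 0:
                continue  # this set covers nothing new
            ratio = weights[name] / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("universe cannot be covered by the given sets")
        chosen.append(best_name)
        uncovered -= candidates[best_name]
    return chosen


# Hypothetical input: each sentence maps to the fragments it covers
# (assumed to come from a partial TE step), with an assumed cost.
fragments = {"f1", "f2", "f3", "f4", "f5"}
sentence_sets = {
    "s1": {"f1", "f2", "f3"},
    "s2": {"f3", "f4"},
    "s3": {"f4", "f5"},
    "s4": {"f1", "f2"},
    "s5": {"f5"},
}
sentence_weights = {"s1": 3.0, "s2": 1.0, "s3": 2.0, "s4": 1.0, "s5": 0.5}

summary = greedy_weighted_set_cover(fragments, sentence_sets, sentence_weights)
print(summary)  # ['s2', 's4', 's5']
```

In this toy input, the selected sentences together cover every fragment at a total weight of 2.5. The greedy rule gives only an approximate solution in general; an exact solver could be substituted without changing the problem formulation.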

Keywords: Text summarization, Minimum set cover, Information retrieval, Natural language processing

Paper URL: https://doi.org/10.1007/s10115-020-01537-1