Mutual-reinforcement document summarization using embedded graph based sentence clustering for storytelling

Authors:

Abstract

This paper proposes a document summarization framework for storytelling that extracts essential sentences from a document by exploiting the mutual effects between terms, sentences, and clusters. The framework has three phases: document modeling, sentence clustering, and sentence ranking. The story document is modeled as a weighted graph whose vertices represent the sentences of the document. The sentences are clustered into groups to find the latent topics in the story. To alleviate the influence of unrelated sentences on clustering, an embedding process is employed to optimize the document model. The sentences are then ranked according to the mutual effects between terms, sentences, and clusters, and the highest-ranked sentences are selected to form the summary of the document. Experimental results on the Document Understanding Conference (DUC) data sets demonstrate the effectiveness of the proposed method for document summarization. The results also show that the embedding process for sentence clustering renders the system more robust with respect to the number of clusters.
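
The abstract does not give the formulas behind the framework, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes bag-of-words cosine similarity for the edge weights of the sentence graph and a simple HITS-style update in which terms and sentences reinforce each other. The embedding and clustering phases are omitted, and all function names (`sentence_graph`, `mutual_rank`, `summarize`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercased alphabetic tokens; a deliberately crude assumption.
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-count bags.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def sentence_graph(sentences):
    # Document modeling: weighted graph as an adjacency matrix,
    # one vertex per sentence, no self-loops.
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    return [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def mutual_rank(sentences, iterations=50):
    # Sentence ranking (assumed HITS-style scheme): a term is important
    # if it occurs in important sentences, and a sentence is important
    # if it contains important terms.
    bags = [Counter(tokenize(s)) for s in sentences]
    terms = sorted({t for b in bags for t in b})
    s_score = [1.0] * len(bags)
    for _ in range(iterations):
        t_score = {t: sum(b[t] * s_score[i] for i, b in enumerate(bags))
                   for t in terms}
        s_score = [sum(c * t_score[t] for t, c in b.items()) for b in bags]
        norm = math.sqrt(sum(v * v for v in s_score)) or 1.0
        s_score = [v / norm for v in s_score]  # keep scores bounded
    return s_score


def summarize(sentences, k=1):
    # Select the k highest-ranked sentences, kept in document order.
    scores = mutual_rank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


doc = [
    "The fox chased the hare.",
    "The hare escaped the fox.",
    "Rain fell on the hills.",
]
graph = sentence_graph(doc)
summary = summarize(doc, k=2)
```

In the framework described above, cluster membership would also feed into the ranking, and the graph would first be embedded to reduce the influence of unrelated sentences; this sketch covers only the term-sentence part of that mutual reinforcement.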

Keywords: Document summarization, Sentence ranking, Space embedding, Sentence clustering, Storytelling

Article history: Received 15 March 2011, Revised 22 October 2011, Accepted 21 December 2011, Available online 30 January 2012.

Official paper URL: https://doi.org/10.1016/j.ipm.2011.12.006