A Dynamic Programming Algorithm for Linear Text Segmentation

Authors: P. Fragkou, V. Petridis, Ath. Kehagias

Abstract

In this paper we introduce a dynamic programming algorithm that performs linear text segmentation by globally minimizing a segmentation cost function. The cost function incorporates two factors: (a) within-segment word similarity and (b) prior information about segment length. We evaluate the algorithm's segmentation accuracy using precision, recall, and Beeferman's segmentation metric. On a segmentation task involving Choi's text collection, the algorithm achieves the best segmentation accuracy reported in the literature so far. The algorithm also achieves high accuracy on a second task involving previously unused texts.
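
To make the idea concrete, the following is a minimal sketch of a dynamic program that globally minimizes a segmentation cost combining the two factors named in the abstract. The abstract does not state the exact form of the cost function, so everything specific here is an assumption made for illustration: the binary "shared word" sentence similarity, the Gaussian-style length penalty, and the hypothetical names `segment`, `mu`, `sigma`, `gamma`, and `r`. It should not be read as the authors' implementation.

```python
def segment(sentences, mu=3.0, sigma=1.0, gamma=0.3, r=1.5):
    """Return exclusive end indices of the segments with minimal total cost.

    Assumed cost of a segment covering sentences s..t-1 (illustrative only):
        gamma * (len - mu)^2 / (2 sigma^2)  -  (1 - gamma) * sim(s, t) / len^r
    where sim(s, t) counts sentence pairs in the segment that share a word.
    """
    n = len(sentences)
    bags = [set(s.lower().split()) for s in sentences]
    # sim[i][j] = 1 if sentences i and j share at least one word.
    sim = [[1 if bags[i] & bags[j] else 0 for j in range(n)] for i in range(n)]

    # 2D prefix sums so any within-segment similarity total is O(1).
    pre = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            pre[i + 1][j + 1] = sim[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j]

    def cost(s, t):
        length = t - s
        within = pre[t][t] - pre[s][t] - pre[t][s] + pre[s][s]
        length_term = (length - mu) ** 2 / (2 * sigma ** 2)
        return gamma * length_term - (1 - gamma) * within / length ** r

    # best[t] = minimal cost of segmenting the first t sentences.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            candidate = best[s] + cost(s, t)
            if candidate < best[t]:
                best[t], back[t] = candidate, s

    # Backtrack from the end to recover the segment boundaries.
    bounds, t = [], n
    while t > 0:
        bounds.append(t)
        t = back[t]
    return bounds[::-1]


docs = ["cat dog", "dog cat", "cat dog", "sun moon", "moon sun", "sun moon"]
print(segment(docs))  # [3, 6]: one boundary after the third sentence
```

Because every prefix is solved optimally before it is extended, the result is a global minimum of the assumed cost rather than a greedy approximation. In this sketch `gamma` trades off the length prior against within-segment similarity; the toy input and parameter values are chosen only to show the mechanics.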

Keywords: text segmentation, information retrieval, document retrieval, machine learning


Paper URL: https://doi.org/10.1023/B:JIIS.0000039534.65423.00