Bursty and Hierarchical Structure in Streams
Author: Jon Kleinberg
Abstract
A fundamental problem in text data mining is to extract meaningful structure from document streams that arrive continuously over time. E-mail and news articles are two natural examples of such streams, each characterized by topics that appear, grow in intensity for a period of time, and then fade away. The published literature in a particular research field can be seen to exhibit similar phenomena over a much longer time scale. Underlying much of the text mining work in this area is the following intuitive premise—that the appearance of a topic in a document stream is signaled by a “burst of activity,” with certain features rising sharply in frequency as the topic emerges.
Keywords: data stream algorithms, text mining, Markov source models
Paper URL: https://doi.org/10.1023/A:1024940629314