Mining Top-k motifs with a SAT-based framework

Authors:

Abstract

In this paper, we introduce a new problem, called Top-k SAT, which consists of enumerating the Top-k models of a propositional formula. A Top-k model is defined as a model with fewer than k models preferred to it with respect to a preference relation. We show that Top-k SAT generalizes two well-known problems: the Partial MAX-SAT problem and the problem of computing minimal models. Moreover, we propose a general algorithm for Top-k SAT. We then give an application of our declarative framework in data mining, namely the problem of mining Top-k motifs in transaction databases and in sequences. In the case of sequence data, we introduce a new mining task by considering sequences of itemsets. Thanks to the flexibility and declarative nature of our SAT-based approach, an encoding of this task is obtained by a very slight modification of the encoding for mining motifs in sequences of items.
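
To make the definition of a Top-k model concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper: it simply enumerates every assignment, which is only feasible for tiny formulas. The names `models`, `top_k_models`, and `subset_preferred`, the clause representation (lists of signed integers), and the example formula are all assumptions made for illustration.

```python
from itertools import product

def models(clauses, n_vars):
    """Yield every satisfying assignment of a CNF as a tuple of booleans."""
    for bits in product([False, True], repeat=n_vars):
        # Literal v > 0 means variable v is true; v < 0 means it is false.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            yield bits

def top_k_models(clauses, n_vars, k, preferred):
    """Return the models that have fewer than k models preferred to them.

    `preferred(a, b)` is a hypothetical strict preference test: it returns
    True when model a is strictly preferred to model b.
    """
    all_models = list(models(clauses, n_vars))
    return [m for m in all_models
            if sum(preferred(other, m) for other in all_models) < k]

def subset_preferred(a, b):
    """Illustrative preference: a's true variables are a strict subset of b's."""
    return a != b and all(y or not x for x, y in zip(a, b))

# Example formula (an assumption): (x1 or x2) and (x2 or x3)
clauses = [[1, 2], [2, 3]]
top1 = top_k_models(clauses, 3, 1, subset_preferred)
top2 = top_k_models(clauses, 3, 2, subset_preferred)
```

With this illustrative preference and k = 1, the result coincides with the subset-minimal models of the example formula; raising k admits models that are dominated by at most k - 1 others.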

Keywords: Boolean satisfiability, Data mining, Modeling, Top-k motifs

Review history: Revised 15 October 2015, Accepted 27 November 2015, Available online 30 November 2015, Version of Record 9 February 2017.

Official paper URL: https://doi.org/10.1016/j.artint.2015.11.003