Soft constraints for pattern mining

Authors: Willy Ugarte, Patrice Boizumault, Samir Loudni, Bruno Crémilleux, Alban Lepailleur

Abstract

Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc.). In practice, many constraints require threshold values whose choice is often arbitrary. The difficulty grows when several thresholds are required and must be combined. Moreover, patterns that barely miss a threshold are not extracted even though they may be relevant. This paper advocates introducing softness into the pattern discovery process. Using Constraint Programming, we propose efficient methods to relax threshold constraints as well as the constraints involved in patterns such as top-k patterns and skypatterns. We show the relevance and efficiency of our approach through a chemoinformatics case study on discovering toxicophores.
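
The idea of a pattern "barely missing a threshold" can be illustrated with a small sketch. It is not the method of the paper: it uses brute-force enumeration rather than Constraint Programming, and the toy transactions, the relative violation measure, and the names `violation`, `mine`, and `max_violation` are all assumptions made for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def frequency(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def violation(pattern, transactions, min_freq):
    """Relative shortfall below min_freq; 0.0 when the constraint holds.

    This particular measure is an assumption, not taken from the paper.
    """
    return max(0, min_freq - frequency(pattern, transactions)) / min_freq


def mine(transactions, min_freq, max_violation=0.0):
    """Brute-force enumeration of itemsets within the allowed violation.

    max_violation=0.0 gives the usual hard frequency constraint.
    """
    items = sorted(set().union(*transactions))
    result = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            if violation(pattern, transactions, min_freq) <= max_violation:
                result.append(pattern)
    return result


hard = mine(TRANSACTIONS, min_freq=3)
soft = mine(TRANSACTIONS, min_freq=3, max_violation=0.4)
```

With these toy values, the itemset {a, b, c} has frequency 2, so the hard constraint (threshold 3) rejects it, while the relaxed run accepts it because its shortfall is one third of the threshold.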

Keywords: Constraint-based pattern mining, Soft constraints, Soft skypatterns, Constraint Programming, Disjunctive relaxation, Chemoinformatics

Paper URL: https://doi.org/10.1007/s10844-013-0281-4