Mining optimized support rules for numeric attributes

作者:

Highlights:

摘要

Mining association rules on large data sets have received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support, confidence or gain of the rule is maximized. In this paper, we generalize the optimized support association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present a dynamic programming algorithm for computing optimized association rules. Furthermore, we propose bucketing technique for reducing the input size, and a divide and conquer strategy that improves the performance significantly without sacrificing optimality. We also present approximation algorithms based on dynamic programming for two numeric attributes. Our experimental results for a single numeric attribute indicate that our bucketing and divide and conquer enhancements are very effective in reducing the execution times and memory requirements of our dynamic programming algorithm. Furthermore, they show that our algorithms scale up almost linearly with the attribute's domain size as well as the number of disjunctions.

论文关键词:Optimized association rules,Data mining,Knowledge discovery

论文评审过程:Available online 7 August 2001.

论文官网地址:https://doi.org/10.1016/S0306-4379(01)00026-6