A hybrid heuristic approach for attribute-oriented mining

Authors:

Highlights:

• clusterAOI uses better heuristics than AOI and avoids overgeneralisation.

• clusterAOI has superior runtime performance (about half the runtime of classical AOI).

• clusterAOI achieves 4 times the interestingness and 1.5 times the divergence of AOI.

• clusterAOI does not fluctuate between small and large datasets; it remains steady and stable.

Abstract

We present a hybrid heuristic algorithm, clusterAOI, that generates a more interesting generalised table than that obtained via attribute-oriented induction (AOI). AOI tends to overgeneralise because it uses a fixed, global, static threshold to cluster and generalise attributes irrespective of their features, and it does not evaluate intermediate interestingness. In contrast, clusterAOI uses attribute features to dynamically recalculate attribute thresholds and applies heuristics to evaluate cluster quality and intermediate interestingness. Experimental results show improved interestingness, better output pattern distribution and expressiveness, and improved runtime.
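To make the overgeneralisation problem concrete, the following is a minimal sketch of the baseline behaviour described above: each attribute is climbed up its concept hierarchy until its number of distinct values fits under one fixed global threshold. It is not clusterAOI and not the authors' code. The hierarchies, the sample rows, and the names `HIERARCHIES`, `ROWS`, and `generalise` are hypothetical, and the one-level-at-a-time climb is a simplification of classical AOI.

```python
from collections import Counter

# Hypothetical concept hierarchies (child value -> parent concept).
# These are illustrative assumptions, not data from the paper.
HIERARCHIES = {
    "city": {
        "Leeds": "England", "York": "England",
        "Glasgow": "Scotland", "Perth": "Scotland",
        "England": "UK", "Scotland": "UK",
    },
    "degree": {
        "BSc": "Undergraduate", "BA": "Undergraduate",
        "MSc": "Postgraduate", "PhD": "Postgraduate",
    },
}

# Hypothetical input table.
ROWS = [
    {"city": "Leeds", "degree": "BSc"},
    {"city": "York", "degree": "BA"},
    {"city": "Glasgow", "degree": "MSc"},
    {"city": "Perth", "degree": "PhD"},
    {"city": "Leeds", "degree": "MSc"},
]


def generalise(rows, hierarchies, threshold):
    """Simplified AOI-style generalisation with one global threshold.

    Every attribute is climbed one hierarchy level at a time until it has
    at most `threshold` distinct values, regardless of the attribute's own
    features. Identical generalised tuples are then merged with a count.
    """
    rows = [dict(r) for r in rows]  # work on a copy
    for attr, parents in hierarchies.items():
        while len({r[attr] for r in rows}) > threshold:
            current = [r[attr] for r in rows]
            climbed = [parents.get(v, v) for v in current]
            if climbed == current:
                break  # top of the hierarchy reached; cannot generalise further
            for r, v in zip(rows, climbed):
                r[attr] = v
    attrs = sorted(hierarchies)
    return Counter(tuple(r[a] for a in attrs) for r in rows)


if __name__ == "__main__":
    for t in (2, 1):
        print(f"threshold={t}")
        for pattern, count in generalise(ROWS, HIERARCHIES, t).items():
            print("  ", pattern, count)
```

In this toy data, a threshold of 2 keeps the England/Scotland distinction, while a threshold of 1 forces every city up to "UK", which is the kind of information loss that a single static threshold can cause when it ignores how each attribute's values are distributed.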

Keywords: Induction, Heuristic, Threshold, Interestingness, Cluster, Algorithm

Review history: Received 16 February 2012, Revised 18 July 2013, Accepted 22 August 2013, Available online 4 September 2013.

Official paper URL: https://doi.org/10.1016/j.dss.2013.08.012