Efficient clustering of databases induced by local patterns

Authors:

Abstract

Many large organizations transact from multiple branches and therefore maintain multiple large databases. Most previous work on data mining is based on a single database, so it is necessary to study data mining on multiple databases. In this paper, we propose two measures of similarity between a pair of databases, together with an algorithm for clustering a set of databases. We improve the efficiency of the clustering process using three strategies: reducing the execution time of the clustering algorithm, using a more appropriate similarity measure, and storing frequent itemsets space-efficiently.
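
The abstract does not define the two similarity measures or the clustering algorithm, so the following is only an illustrative sketch of the general idea of clustering databases by their local patterns. It assumes a Jaccard-style overlap between the frequent itemsets of two databases and a simple threshold-based grouping; the function names (`frequent_itemsets`, `similarity`, `cluster_databases`), the toy branch data, and the thresholds are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets and their supports.

    Illustration only: a real system would use Apriori, FP-growth, or similar.
    `transactions` is a list of sets of items.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            # Downward closure: no frequent k-itemset means no larger one.
            break
    return result


def similarity(fis_a, fis_b):
    """Assumed measure: shared frequent itemsets over all frequent itemsets."""
    shared = fis_a.keys() & fis_b.keys()
    union = fis_a.keys() | fis_b.keys()
    return len(shared) / len(union) if union else 0.0


def cluster_databases(fis_by_db, threshold):
    """Assumed grouping: link databases whose similarity meets the threshold
    and return the connected components as sorted lists of names."""
    parent = {name: name for name in fis_by_db}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(fis_by_db, 2):
        if similarity(fis_by_db[a], fis_by_db[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in fis_by_db:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical branch databases (toy data, not from the paper).
branches = {
    "A": [{"bread", "milk"}, {"bread", "butter"},
          {"bread", "milk", "butter"}, {"milk"}],
    "B": [{"bread", "milk"}, {"bread", "milk"},
          {"milk", "butter"}, {"bread"}],
    "C": [{"pen", "ink"}, {"pen"}, {"pen", "ink"}, {"paper"}],
}

# Each branch mines its own local patterns; only these are compared.
local_patterns = {name: frequent_itemsets(db, 0.5)
                  for name, db in branches.items()}
clusters = cluster_databases(local_patterns, 0.5)
print(clusters)
```

In this sketch each branch contributes only its locally mined frequent itemsets, so the comparison never touches the raw transactions. Branches A and B share most of their patterns and end up in one cluster, while C stands alone.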

Keywords: Clustering, IS coding, Local pattern analysis, Multi-database mining, Similarity between a pair of databases

Review history: Received 16 November 2005, Revised 19 October 2007, Accepted 8 November 2007, Available online 17 November 2007.

Official paper URL: https://doi.org/10.1016/j.dss.2007.11.001