Efficient clustering of databases induced by local patterns

Authors:

Abstract

Many large organizations maintain multiple large databases because they transact from multiple branches. Most previous work is based on a single database, so data mining on multiple databases needs to be studied in its own right. In this paper, we propose two measures of similarity between a pair of databases, together with an algorithm for clustering a set of databases. The efficiency of the clustering process is improved through three strategies: reducing the execution time of the clustering algorithm, using a more appropriate similarity measure, and storing frequent itemsets space-efficiently.
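
To make the idea of similarity between a pair of databases concrete, here is a minimal Python sketch. The abstract does not define the two proposed measures, so this sketch assumes a Jaccard-style ratio over the sets of frequent itemsets of two databases. The function names `frequent_itemsets` and `jaccard_similarity`, the `max_size` cap, and the toy transactions are all hypothetical illustrations, not the paper's definitions or data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    whose support is at least min_support.

    Brute-force enumeration, adequate only for small illustrations.
    """
    n = len(transactions)
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def jaccard_similarity(fis_a, fis_b):
    """Assumed measure: shared frequent itemsets divided by all
    frequent itemsets found in either database."""
    union = set(fis_a) | set(fis_b)
    if not union:
        return 0.0
    return len(set(fis_a) & set(fis_b)) / len(union)


# Hypothetical transactions from two branches.
branch_1 = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
branch_2 = [["a", "b"], ["b", "c"], ["a", "b"]]

fis_1 = frequent_itemsets(branch_1, min_support=0.5)
fis_2 = frequent_itemsets(branch_2, min_support=0.5)
similarity = jaccard_similarity(fis_1, fis_2)
```

A pairwise similarity of this kind could then feed a clustering step that groups databases whose similarity exceeds a chosen threshold; the specific clustering algorithm of the paper is not reproduced here.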

Keywords: Clustering, IS coding, Local pattern analysis, Multi-database mining, Similarity between a pair of databases

Review history: Received 16 November 2005, Revised 19 October 2007, Accepted 8 November 2007, Available online 17 November 2007.

Official paper URL: https://doi.org/10.1016/j.dss.2007.11.001