Reducing the bottleneck of graph-based data mining by improving the efficiency of labeled graph isomorphism testing

作者:

Highlights:

摘要

Due to the complex nature of graph representations, the isomorphism testing between a pair of labeled graphs becomes one of the most time-consuming procedures during the process of graph-based data mining. In order to reduce this bottleneck, in this paper we propose a novel efficient algorithm to perform isomorphism testing of labeled graphs which in general performs less state-space tree searching. The proposed method uses graph signatures as the first-step filter, and it limits the backtracking occurring only between each pair of corresponding vertex classes, based on the proposed data structures and the vertex partition method. We compared the proposed method with state-of-the-art methods to verify its efficiency for several datasets each with different aspects of characteristics. The experimental results show that for irregular graphs, either labeled or unlabeled, the proposed method outperforms the compared methods in efficiency. For graphs with multiple labels but high regularity, the proposed method is still better than the compared methods. The result of this algorithm is directly applicable to those emerging applications related to graph-based data mining which need to perform isomorphism testing of labeled graphs in large databases.

论文关键词:Data mining,Mining methods and algorithm,Isomorphism testing,Graph signature,Search state-space tree

论文评审过程:Received 26 January 2013, Accepted 19 February 2014, Available online 3 March 2014.

论文官网地址:https://doi.org/10.1016/j.datak.2014.02.003