Upper bounds for can-tree and FP-tree

作者:Nima Shahbazi, Jarek Gryz

摘要

Two efficient tree structures known as Can-tree (Leung et al., Knowledge and Information Systems, 11(3), 287–311, 2007) and FP-tree (Han et al., 2000) are used to store a database in memory for mining frequent patterns. However, there has been no discussion on tight upper bound for the number of nodes in these trees. Instead, a very loose upper bound of 2n (where n is the number of distinct items in the database) is used. In this paper, we provide a tighter upper bound for the number of nodes in Can-tree and FP-tree. The upper bound on the number of nodes is provided through a greedy algorithm for the Can-tree and a closed form solution is derived for the FP-tree. These results are illustrated by examples both in graphical and mathematical form.

论文关键词:Data mining, Frequent item sets, Pattern mining, Upper bound

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10844-021-00673-6