RARE: Mining colossal closed itemset in high dimensional data

Abstract

Modern society has become a continuous data generator. Massive automatic data collection has produced a new kind of dataset, termed ‘high-dimensional data’, which is characterized by a relatively small number of rows compared with a large number of columns (or dimensions). Among data mining tasks, association rules have been used extensively to describe the correlations between the variables in a dataset. Mining association rules relies heavily on the efficiency of the algorithms that extract all frequent itemsets in the database. The run time and memory consumption of these algorithms are strongly influenced by the search strategy, the pruning strategies, and the method of closure checking. Without these techniques, the choice between depth-first and breadth-first search makes little difference, mainly because the search space is similar. This paper therefore investigates the strategies implemented in both row- and column-enumeration-based algorithms and proposes RARE, a breadth-first, bottom-up row-enumeration algorithm for mining colossal closed itemsets in high-dimensional data.
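To make the idea of row enumeration concrete, the following is a minimal brute-force sketch, not the RARE algorithm itself. It assumes each row is a set of items and relies on the fact that the intersection of the items of any group of rows is a closed itemset. Row groups are visited level by level (one row, then two rows, and so on), with no pruning and no efficient closure checking, so it is only practical for a handful of rows. The function name `closed_itemsets_by_rows` and the toy data are illustrative assumptions.

```python
from itertools import combinations


def closed_itemsets_by_rows(rows, min_support):
    """Brute-force row enumeration (illustrative only, not RARE).

    rows: list of frozensets, one per row of the dataset.
    Returns a dict mapping each closed itemset to its support.
    """
    closed = {}
    n = len(rows)
    # Visit row combinations level by level: 1 row, 2 rows, ...
    for size in range(1, n + 1):
        for row_ids in combinations(range(n), size):
            # Items shared by every row in this combination
            items = frozenset.intersection(*(rows[i] for i in row_ids))
            if not items:
                continue
            # Support is the number of rows containing the whole itemset
            support = sum(1 for r in rows if items <= r)
            if support >= min_support:
                closed[items] = support
    return closed


# Toy dataset: few rows, each holding several items (hypothetical data)
rows = [
    frozenset("abcde"),
    frozenset("abcf"),
    frozenset("abde"),
    frozenset("cf"),
]
result = closed_itemsets_by_rows(rows, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(itemset)), support)
```

Because the number of row combinations grows exponentially with the number of rows, a practical algorithm needs the pruning and closure-checking strategies that the abstract identifies as decisive.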

Keywords: Data mining, Closed itemset, High-dimensional data

Review history: Received 23 October 2017, Revised 1 June 2018, Accepted 13 July 2018, Available online 18 July 2018, Version of Record 31 October 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.07.025