Mining high-utility itemsets in dynamic profit databases

Authors:

Highlights:

Abstract

High-utility itemset (HUI) mining is an important data-mining task that has gained popularity in recent years owing to its applications in numerous fields. HUI mining aims to discover itemsets that have high utility (e.g., yield a high profit) in transactional databases. Although several algorithms have been designed to enumerate all HUIs, they assume that the utilities (e.g., unit profits) of items are static. This simplifying assumption does not hold in real-life situations: in a retail store, for example, the unit profits of items often vary over time because of fluctuating supply costs and promotions. Ignoring this characteristic of real-life transactional databases makes current HUI-mining algorithms inapplicable in many real-world applications. To address this limitation, this paper studies the novel problem of mining HUIs in databases with dynamic unit profits. To accurately assess the utility of any itemset in this context, a redefined utility measure is introduced. Furthermore, a novel algorithm named MEFIM (Modified EFficient high-utility Itemset Mining) is designed, which relies on a novel compact database format to discover the desired itemsets efficiently. An improved version, named iMEFIM, is also introduced; it employs a novel structure called P-set to reduce the number of transaction scans and speed up the mining process. Experimental results show that the proposed algorithms considerably outperform state-of-the-art HUI-mining algorithms on dynamic profit databases in terms of runtime, memory usage, and scalability.
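To make the setting concrete, the following is a minimal sketch of what evaluating utility under dynamic unit profits could look like. It is not the paper's redefined utility measure, nor MEFIM or iMEFIM, none of which are specified in the abstract. It assumes, purely for illustration, that each transaction records the unit profit in effect when it occurred, and it enumerates itemsets by brute force with no pruning. The names `itemset_utility`, `brute_force_huis`, and the toy database `DB` are hypothetical.

```python
from itertools import combinations

# Assumed representation: each transaction maps
# item -> (purchased quantity, unit profit in effect for that transaction).
# Item "a" has a different unit profit in each transaction below.
DB = [
    {"a": (2, 5), "b": (1, 3)},
    {"a": (1, 4), "b": (2, 3), "c": (1, 2)},
    {"a": (3, 6), "c": (2, 1)},
]


def itemset_utility(itemset, database):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for tx in database:
        if all(item in tx for item in itemset):
            total += sum(tx[item][0] * tx[item][1] for item in itemset)
    return total


def brute_force_huis(database, min_util):
    """Return every itemset whose utility is at least min_util.

    Exhaustive enumeration, exponential in the number of items; it only
    illustrates the problem statement, not an efficient mining algorithm.
    """
    items = sorted({item for tx in database for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            utility = itemset_utility(combo, database)
            if utility >= min_util:
                result[frozenset(combo)] = utility
    return result


if __name__ == "__main__":
    for itemset, utility in brute_force_huis(DB, min_util=20).items():
        print(sorted(itemset), utility)
```

With a minimum utility threshold of 20, this toy database yields three high-utility itemsets: {a} with utility 32, {a, b} with 23, and {a, c} with 26. Storing the unit profit per transaction is only one way to model changing profits; a profit table indexed by time period would serve equally well in a sketch like this.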

Keywords: High-utility itemset mining, Dynamic profit, Candidate pruning, Data mining

Review history: Received 29 August 2018, Revised 9 March 2019, Accepted 20 March 2019, Available online 26 March 2019, Version of Record 26 April 2019.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.03.022