CBC: An associative classifier with a small number of rules

Authors:

Highlights:

• Discuss the weaknesses of decision trees and associative classifiers

• Propose a novel rule-based classifier with a small number of rules

• Apply feature selection to reduce the number of rules

• Conduct extensive experiments and analysis

Abstract

Associative classifiers have been proposed to achieve an accurate model in which each individual rule is interpretable. However, existing associative classifiers often consist of a large number of rules and can therefore be difficult to interpret. We show that associative classifiers consisting of an ordered rule set can be represented as a tree model. From this view, it is clear that these classifiers are restricted in that at least one child node of every non-leaf node is never split. We propose a new tree model, the condition-based tree (CBT), to relax this restriction. We also propose an algorithm to transform a CBT into an ordered rule set with concise rule conditions. This ordered rule set is referred to as a condition-based classifier (CBC). Thus, the interpretability of an associative classifier is maintained, but more expressive models are possible. The rule transformation algorithm can also be applied to regular binary decision trees to extract an ordered set of rules with simple rule conditions. Feature selection is applied to a binary representation of conditions to further simplify and improve the models. Experimental studies show that CBC achieves competitive accuracy and has a significantly smaller number of rules (median of 10 rules per data set) than well-known associative classifiers such as CBA (median of 47) and GARC (median of 21). CBC with feature selection has an even smaller number of rules.
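The observation that an ordered rule set can be viewed as a tree can be illustrated with a minimal Python sketch. This is not the authors' CBT or CBC algorithm. It only shows a generic ordered rule list rewritten as a binary tree in which the "condition holds" child of every non-leaf node is a leaf and is therefore never split. The names `Rule`, `Node`, `Leaf`, `rules_to_tree`, and the example rules and attributes are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Record = Dict[str, float]


@dataclass
class Rule:
    description: str                      # human-readable condition
    condition: Callable[[Record], bool]   # test applied to a record
    label: str                            # class predicted when the test holds


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    rule: Rule
    if_true: Leaf                     # always a leaf: this child is never split
    if_false: Union["Node", Leaf]     # only this child can be split further


def classify_with_rules(rules: List[Rule], default: str, record: Record) -> str:
    """Ordered rule set: the first rule whose condition holds decides the class."""
    for rule in rules:
        if rule.condition(record):
            return rule.label
    return default


def rules_to_tree(rules: List[Rule], default: str) -> Union[Node, Leaf]:
    """Build the equivalent tree from the last rule back to the first."""
    tree: Union[Node, Leaf] = Leaf(default)
    for rule in reversed(rules):
        tree = Node(rule, Leaf(rule.label), tree)
    return tree


def classify_with_tree(tree: Union[Node, Leaf], record: Record) -> str:
    """Walk the tree until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.rule.condition(record) else tree.if_false
    return tree.label


# Hypothetical example rules over made-up attributes "age" and "income".
rules = [
    Rule("age < 30 and income < 40",
         lambda r: r["age"] < 30 and r["income"] < 40, "reject"),
    Rule("income >= 80", lambda r: r["income"] >= 80, "accept"),
]
tree = rules_to_tree(rules, default="review")
```

In this sketch the tree is a single chain: each rule contributes one internal node whose matching branch ends immediately. A tree model that relaxes the restriction, as the abstract describes for CBT, would allow the `if_true` child to be split as well.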

Keywords: Association rule, Decision tree, Feature selection, Rule-based classifier, Rule pruning

Review history: Received 4 January 2013, Revised 25 September 2013, Accepted 17 November 2013, Available online 4 December 2013.

Paper URL: https://doi.org/10.1016/j.dss.2013.11.004