A quality index for decision tree pruning

Abstract

A decision tree is a divide-and-conquer classification method used in machine learning. Most pruning methods for decision trees minimize the classification error rate. In uncertain domains, some sub-trees that do not decrease the error rate can still be relevant, either for pointing out populations of specific interest or for giving a representation of a large data file. A new pruning method, called DI pruning, is presented here. It takes into account the complexity of sub-trees and is able to keep sub-trees whose leaves yield relevant decision rules, even though they do not increase classification efficiency. DI pruning also makes it possible to assess the quality of the data used for the knowledge discovery task. In practice, this method is implemented in the UnDeT software.

Keywords: Decision tree, Quality, Pruning, Uncertain data

Article history: Available online 5 January 2002.

Official URL: https://doi.org/10.1016/S0950-7051(01)00119-8