An efficient algorithm for optimal pruning of decision trees

作者:

摘要

Pruning decision trees is a useful technique for improving the generalization performance in decision tree induction, and for trading accuracy for simplicity in other applications. In this paper, a new algorithm called OPT-2 for optimal pruning of decision trees is introduced. The algorithm is based on dynamic programming. In its most basic form, the time and space complexities of OPT-2 are both Θ (nC), where n is the number of test nodes in the initial decision tree, and C is the number of leaves in the target (pruned) decision tree. This is an improvement over the recently published OPT algorithm of Bohanec and Bratko (which is the only known algorithm for optimal decision tree pruning) especially in the case of heavy pruning and when the tests of the given decision tree have many outcomes. If so desired, the space required by OPT-2 can further be reduced by a factor of r at the cost of increasing the execution time by a factor that is bounded above by (r + 1)2 (this is a considerable overestimate, however). From a practical point of view, OPT-2 enjoys considerable flexibility in various aspects, and is easy to implement.

论文关键词:

论文评审过程:Available online 16 February 1999.

论文官网地址:https://doi.org/10.1016/0004-3702(95)00060-7