Constructing a multi-valued and multi-labeled decision tree

Abstract

Most decision tree classifiers are designed to classify objects whose attributes and class labels are single values. However, many practical classification problems involve multi-valued and multi-labeled data. For example, a customer record at a tour company may have multi-valued attributes, such as the customer's cars, hobbies and houses, and multiple labels corresponding to the tours the customer has joined before. If the company intends to use its customers' data to build a classifier that predicts which kinds of customers are likely to participate in which kinds of tours, it immediately needs a classification algorithm that can handle multi-valued and multi-labeled data. This research therefore develops such a classifier. We found that several major functions in our classifier must be designed differently from existing ones, including how to select the next splitting attribute, when to stop splitting a node, how to determine a node's labels, and how to predict the labels of a new record. This paper addresses each of these issues. The simulation results show that the proposed algorithm performs well in both computing time and accuracy.
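
To make the data shape in the tour-company example concrete, the following is a minimal sketch of one multi-valued, multi-labeled record. It is an illustration only, not the paper's data structure: the class name `CustomerRecord`, the helper `has_value`, and all attribute values and tour labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class CustomerRecord:
    # Hypothetical container: each attribute holds a SET of values,
    # not a single value (multi-valued attributes).
    attributes: Dict[str, FrozenSet[str]]
    # A record may carry several class labels at once (multi-labeled).
    labels: FrozenSet[str]


def has_value(rec: CustomerRecord, attribute: str, value: str) -> bool:
    """Return True if `value` is among the values of `attribute` in `rec`."""
    return value in rec.attributes.get(attribute, frozenset())


# Illustrative values only; none of them come from the paper.
record = CustomerRecord(
    attributes={
        "cars": frozenset({"sedan", "van"}),
        "hobbies": frozenset({"hiking", "golf"}),
        "houses": frozenset({"apartment"}),
    },
    labels=frozenset({"tour_A", "tour_C"}),
)
```

A conventional decision tree would expect exactly one value per attribute and one label per record. Here both are sets, which is why the splitting, stopping, labeling and prediction steps need to be redesigned.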

Keywords: Decision tree, Data mining, Classification, Multi-valued attribute, Multi-labeled attribute, Prediction, Customer relation management

Publication history: Available online 25 March 2003.

Official paper URL: https://doi.org/10.1016/S0957-4174(03)00047-2