Local and global feature selection for multilabel classification with binary relevance

Authors: André Melo, Heiko Paulheim

Abstract

Multilabel classification has become increasingly important for various use cases. Among the existing multilabel classification methods, problem transformation approaches, such as Binary Relevance, Pruned Problem Transformation, and Classifier Chains, are some of the most popular, since they break a global multilabel classification problem into a set of smaller binary or multiclass classification problems. Transformation methods enable the use of two different feature selection approaches: local, where the selection is performed independently for each of the transformed problems, and global, where the selection is performed on the original dataset, meaning that all local classifiers work on the same set of features. While global methods have been widely researched, local methods have received little attention so far. In this paper, we compare these two strategies on one of the most straightforward transformation approaches, namely Binary Relevance. We empirically compare their performance on various flat and hierarchical multilabel datasets from different application domains. We show that local feature selection outperforms global feature selection in terms of classification accuracy, without drawbacks in runtime performance.
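
The distinction between local and global feature selection under Binary Relevance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the relevance score (absolute difference of class means), the aggregation used for the global variant (averaging scores over labels), the function names, and the contrived toy data. It only shows how the two strategies pick feature subsets, not how the classifiers are trained or evaluated.

```python
from statistics import mean


def score(feature_column, label_column):
    """Hypothetical relevance score: absolute difference between the
    feature's mean over positive and over negative examples of one label."""
    pos = [x for x, y in zip(feature_column, label_column) if y == 1]
    neg = [x for x, y in zip(feature_column, label_column) if y == 0]
    if not pos or not neg:
        return 0.0
    return abs(mean(pos) - mean(neg))


def score_matrix(X, Y):
    """Return scores[j][f], the relevance of feature f for label j.
    Binary Relevance treats each label column of Y as its own binary problem."""
    feature_columns = [[row[f] for row in X] for f in range(len(X[0]))]
    label_columns = [[row[j] for row in Y] for j in range(len(Y[0]))]
    return [[score(col, lab) for col in feature_columns] for lab in label_columns]


def top_k(scores, k):
    """Indices of the k highest-scoring features, returned in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda f: -scores[f])
    return sorted(ranked[:k])


def local_selection(X, Y, k):
    """Local: each binary problem selects its own k features."""
    return [top_k(label_scores, k) for label_scores in score_matrix(X, Y)]


def global_selection(X, Y, k):
    """Global: one shared subset, chosen here by averaging scores over labels
    (an assumed aggregation), is handed to every binary problem."""
    per_label = score_matrix(X, Y)
    averaged = [mean(feature_scores) for feature_scores in zip(*per_label)]
    shared = top_k(averaged, k)
    return [shared for _ in per_label]


# Contrived toy data: 8 examples, 3 features, 2 labels.
# Feature 0 is a noisy copy of label 0, feature 1 equals label 1,
# and feature 2 is unrelated to both labels.
X = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
Y = [
    [1, 1],
    [1, 1],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
    [0, 0],
    [0, 0],
]

print("local: ", local_selection(X, Y, k=1))   # one subset per label
print("global:", global_selection(X, Y, k=1))  # the same subset for all labels
```

In this toy setting, the local variant keeps feature 0 for the first label and feature 1 for the second, while the global variant gives both binary problems feature 1 only, so the first label loses its only informative feature. This is a constructed example that shows why the two strategies can diverge; it does not reproduce the paper's experiments.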

Keywords: Multilabel classification, Transformation methods, Local feature selection, Global feature selection, Binary relevance

Paper URL: https://doi.org/10.1007/s10462-017-9556-4