Efficient implementation of class-based decomposition schemes for Naïve Bayes

Authors: Sang-Hyeun Park, Johannes Fürnkranz

Abstract

Previous studies have shown that the accuracy of a Naïve Bayes classifier in text classification can often be improved by binary decompositions such as error-correcting output codes (ECOC). The key contribution of this short note is the realization that ECOC and, in fact, all class-based decomposition schemes can be implemented efficiently in a Naïve Bayes classifier: because the classifier is additive, all binary classifiers can be trained in a single pass through the data. Whereas the straightforward implementation has a complexity of O(n⋅t⋅g), the proposed approach reduces it to O((n+t)⋅g). Large-scale learning of ensemble approaches with Naïve Bayes can benefit from this approach, as the experimental results in the paper demonstrate.
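
The single-pass idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: features are non-negative counts (as in a multinomial text model), each binary problem is given as a code column with entries +1, -1, or 0 (class ignored), and all function names are hypothetical. The abstract does not define n, t, and g; reading them as the number of binary classifiers, training examples, and attributes is likewise an assumption. The sketch accumulates feature counts once per original class, then derives the positive and negative counts of every binary problem by summing those per-class counts, without touching the training data again.

```python
def per_class_counts(X, y, num_classes, num_features):
    """Single pass over the data: accumulate feature counts per original class."""
    counts = [[0] * num_features for _ in range(num_classes)]
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            counts[label][j] += value
    return counts


def binary_counts(class_counts, code_column):
    """Derive counts for one binary problem from the per-class counts.

    code_column[c] is +1 (positive), -1 (negative), or 0 (class ignored).
    This is an assumed encoding, chosen for illustration.
    """
    num_features = len(class_counts[0])
    pos = [0] * num_features
    neg = [0] * num_features
    for c, sign in enumerate(code_column):
        if sign == 0:
            continue
        target = pos if sign > 0 else neg
        for j in range(num_features):
            target[j] += class_counts[c][j]
    return pos, neg


def naive_binary_counts(X, y, code_column, num_features):
    """Baseline for comparison: one full pass over the data per binary problem."""
    pos = [0] * num_features
    neg = [0] * num_features
    for row, label in zip(X, y):
        sign = code_column[label]
        if sign == 0:
            continue
        target = pos if sign > 0 else neg
        for j, value in enumerate(row):
            target[j] += value
    return pos, neg


# Toy data (made up for illustration): 4 documents, 3 features, 3 classes.
X = [[1, 0, 2], [0, 1, 1], [3, 0, 0], [0, 2, 1]]
y = [0, 1, 2, 1]
# Three hypothetical code columns, one per binary classifier.
CODE_COLUMNS = [[1, -1, -1], [1, 1, -1], [0, 1, -1]]

class_counts = per_class_counts(X, y, num_classes=3, num_features=3)
derived = [binary_counts(class_counts, col) for col in CODE_COLUMNS]
```

Both routes yield the same sufficient statistics; the difference is that the baseline rereads the training data for every binary classifier, whereas the derived version reads it once. Class priors can be handled the same way, by summing per-class example counts.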

Keywords: Naïve Bayes, Error-correcting output codes, Scalability

Paper URL: https://doi.org/10.1007/s10994-013-5430-z