PwAdaBoost: Possible world based AdaBoost algorithm for classifying uncertain data

Authors:

Highlights:

Abstract

The possible world model has become one of the most effective tools for handling various types of data uncertainty in uncertain data management. However, few uncertain data classification algorithms have been built on possible worlds. Most existing algorithms are simple extensions of traditional classification algorithms for certain data. They handle uncertainty under relatively idealized assumptions about probability distributions and data types, and are therefore difficult to apply across diverse application scenarios. In this paper, we propose a novel possible world based AdaBoost algorithm for classifying uncertain data, called PwAdaBoost. In the training procedure, PwAdaBoost uses the possible world set generated from the uncertain training set sampled in each iteration to train the sub-basic classifiers, and uses the possible world set generated from the whole uncertain training set to adjust the weights of the sub-basic classifiers and assess the quality of the basic classifiers. In the prediction procedure, PwAdaBoost uses the possible world set generated from the object to be predicted to obtain the results of the basic classifiers via majority voting and weighted voting. Furthermore, we analyze the stability of PwAdaBoost and give parallelization strategies for its training and prediction procedures. The proposed PwAdaBoost can handle various types of data uncertainty, and allows any existing classification algorithm for certain data to be applied to uncertain data. As far as we know, it is the first ensemble classification algorithm for uncertain data. Extensive experimental results demonstrate the superiority of the proposed algorithm in terms of effectiveness and efficiency.
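
The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each uncertain attribute is modeled as a discrete list of (value, probability) pairs, possible worlds are drawn by independent sampling per attribute, each basic classifier is a plain function with a fixed weight, majority voting is applied across possible worlds, and weighted voting is applied across classifiers. The names `sample_possible_world`, `majority_vote`, and `predict_uncertain` are hypothetical.

```python
import random
from collections import Counter


def sample_possible_world(uncertain_object, rng):
    """Draw one certain instance (a possible world) from an uncertain object.

    Assumption: the object is a list of attributes, and each attribute is a
    list of (value, probability) alternatives sampled independently.
    """
    world = []
    for alternatives in uncertain_object:
        values = [value for value, _ in alternatives]
        probs = [prob for _, prob in alternatives]
        world.append(rng.choices(values, weights=probs, k=1)[0])
    return world


def majority_vote(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def predict_uncertain(weighted_classifiers, uncertain_object, n_worlds=25, seed=0):
    """Predict a label for one uncertain object.

    Each classifier labels every sampled world and its result is the
    majority label; the final label is the one with the largest total
    classifier weight.
    """
    rng = random.Random(seed)
    worlds = [sample_possible_world(uncertain_object, rng) for _ in range(n_worlds)]
    scores = Counter()
    for classify, weight in weighted_classifiers:
        label = majority_vote([classify(world) for world in worlds])
        scores[label] += weight
    return scores.most_common(1)[0][0]


# Hypothetical usage: two attributes, the second one uncertain.
uncertain_object = [[(1.0, 1.0)], [(2.0, 0.5), (3.0, 0.5)]]
ensemble = [
    (lambda w: "pos" if w[0] > 0 else "neg", 0.7),  # weight values are made up
    (lambda w: "neg", 0.2),
    (lambda w: "neg", 0.3),
]
print(predict_uncertain(ensemble, uncertain_object))
```

Because each classifier only ever sees certain instances, any classifier for certain data could be plugged in here, which is the property the abstract emphasizes. The training-side weight updates are omitted because the abstract does not specify them.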

Keywords: Uncertain data, Classification, Possible world, AdaBoost

Review history: Received 13 March 2019, Revised 7 August 2019, Accepted 9 August 2019, Available online 12 August 2019, Version of Record 5 November 2019.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.104930