A formalized framework for incorporating expert labels in crowdsourcing environment

Authors: Qingyang Hu, Qinming He, Hao Huang, Kevin Chiew, Zhenguang Liu

Abstract

Crowdsourcing services have proven efficient for collecting large amounts of labeled data for supervised learning tasks. However, the low cost of crowd workers leads to unreliable labels, which poses a new problem for learning a reliable classifier. Although various methods have been proposed to infer the ground truth or to learn from crowd data directly, there is no guarantee that these methods work well for highly biased or noisy crowd labels. Motivated by this limitation of crowd data, we propose a novel framework for improving the performance of crowdsourcing learning tasks with additional expert labels. We treat each labeler as a personal classifier and combine all labelers’ opinions from a model combination perspective, and we summarize the evidence from crowds and experts naturally via a Bayesian classifier in the intermediate feature space formed by the personal classifiers. We also introduce active learning into our framework and propose an uncertainty sampling algorithm for actively obtaining expert labels. Experiments show that our method can significantly improve learning quality compared with methods that use crowd labels alone.
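
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: it assumes binary labels, assumes each personal classifier's output is already available as a 0/1 value (so each instance becomes a vector in the intermediate feature space), swaps in a Bernoulli naive Bayes classifier as one simple choice of Bayesian classifier, and picks the next expert query as the pool instance whose posterior is closest to 0.5. All function names and the toy data are hypothetical.

```python
import math

def fit_bernoulli_nb(features, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model (classes 0/1) with Laplace smoothing.

    features[i][j] is the 0/1 output of labeler j's personal classifier on
    instance i (assumed given); labels[i] is the expert label of instance i.
    """
    n_features = len(features[0])
    model = {}
    for c in (0, 1):
        rows = [f for f, y in zip(features, labels) if y == c]
        prior = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        # Smoothed estimate of P(feature_j = 1 | class c)
        theta = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[c] = (prior, theta)
    return model

def posterior_positive(model, x):
    """Return P(y = 1 | x) under the fitted model."""
    log_joint = {}
    for c, (prior, theta) in model.items():
        lp = math.log(prior)
        for xj, t in zip(x, theta):
            lp += math.log(t if xj == 1 else 1.0 - t)
        log_joint[c] = lp
    # Normalize in log space for numerical stability
    m = max(log_joint.values())
    z = sum(math.exp(v - m) for v in log_joint.values())
    return math.exp(log_joint[1] - m) / z

def most_uncertain(model, pool):
    """Uncertainty sampling: index of the pool instance with posterior closest to 0.5."""
    return min(range(len(pool)),
               key=lambda i: abs(posterior_positive(model, pool[i]) - 0.5))

# Toy data (hypothetical): three labelers; the third is uninformative.
expert_features = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
expert_labels = [1, 1, 0, 0]
pool = [[1, 1, 1], [0, 0, 0], [1, 0, 1]]

model = fit_bernoulli_nb(expert_features, expert_labels)
query_index = most_uncertain(model, pool)  # instance to send to an expert next
```

In this toy setup the two informative labelers disagree on the last pool instance, so it is the one selected for an expert label; after the expert answers, the instance would be moved into the expert-labeled set and the model refit.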

Keywords: Crowdsourcing, Multiple annotator, Classification, Classifier fusion, Active learning

Official paper URL: https://doi.org/10.1007/s10844-015-0371-6