Learning the truth vector in high dimensions

Abstract

Truth discovery is an important learning problem arising in fields related to data analytics. It concerns finding the most trustworthy information in a dataset acquired from a number of unreliable sources. The problem has been studied extensively and a number of techniques have been proposed. However, all of them are heuristic in nature and come with no quality guarantee. In this paper, we formulate the problem as a high-dimensional geometric optimization problem, called Entropy based Geometric Variance. Relying on a number of novel geometric techniques, we further discover new insights into this problem. We show, for the first time, that the truth discovery problem can be solved with a guaranteed quality of solution. In particular, it is possible to achieve a (1+ϵ)-approximation in nearly linear time under some reasonable assumptions. We expect that our algorithm will be useful for other data-related applications.

Keywords: Truth discovery, Entropy, High dimension, Approximation algorithm

Review history: Received 28 May 2018, Revised 19 November 2019, Accepted 17 December 2019, Available online 30 December 2019, Version of Record 16 January 2020.

Paper URL: https://doi.org/10.1016/j.jcss.2019.12.002