Truth selection for truth discovery models exploiting ordering relationship among values

Authors:

Highlights:

Abstract

Data veracity is one of the main issues with Web data. Truth Discovery models assess it by estimating value confidence and source trustworthiness, analyzing the claims that different sources make about the same real-world entities. Many studies have been conducted in this domain. Most models select as true the value with the highest estimated confidence. This naive strategy cannot identify true values when a partial order among values is taken into account to improve final performance. Indeed, in this case the resulting estimates increase monotonically with respect to the partial order of values: the highest confidence is always assigned to the most general value, which is implicitly supported by all the others. Using the highest confidence as the selection criterion is therefore inappropriate, because it always returns the most general values. To address this problem, we propose a post-processing procedure that leverages the partial order among values and their monotonic confidence estimates to identify the expected true value. Experimental results on synthetic datasets show the effectiveness of our approach.
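The selection problem described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy hierarchy `PARENT`, the claim counts in `CLAIMS`, the vote-share confidence, and in particular the `greedy_descent` heuristic, which is a hypothetical stand-in and not the post-processing procedure proposed in the paper. The sketch only shows that propagating support to more general values yields monotonic confidences, so that picking the maximum always returns the most general value.

```python
# Toy hierarchy (hypothetical): each value maps to its more general parent.
PARENT = {
    "Paris": "France",
    "Lyon": "France",
    "France": "Europe",
    "Berlin": "Germany",
    "Germany": "Europe",
    "Europe": None,
}

# Hypothetical claims: value -> number of sources asserting it directly.
CLAIMS = {"Paris": 3, "Lyon": 1, "France": 1, "Berlin": 2}


def ancestors_or_self(value):
    """Yield the value followed by every more general value above it."""
    while value is not None:
        yield value
        value = PARENT[value]


def propagated_confidence(claims):
    """Vote share per value, where a claim also supports all generalizations."""
    total = sum(claims.values())
    conf = {v: 0.0 for v in PARENT}
    for value, count in claims.items():
        for v in ancestors_or_self(value):
            conf[v] += count / total
    return conf


def naive_selection(conf):
    """Pick the value with the highest confidence."""
    return max(conf, key=conf.get)


def greedy_descent(conf, ratio=0.5):
    """Illustrative heuristic only (not the paper's procedure): walk down from
    the most general value, moving to the best child while that child keeps
    at least `ratio` of its parent's confidence."""
    current = next(v for v, p in PARENT.items() if p is None)
    while True:
        children = [v for v, p in PARENT.items() if p == current]
        if not children:
            return current
        best = max(children, key=conf.get)
        if conf[best] < ratio * conf[current]:
            return current
        current = best


conf = propagated_confidence(CLAIMS)
print(naive_selection(conf))  # Europe
print(greedy_descent(conf))   # Paris
```

With these toy numbers, the naive criterion returns the root `Europe` regardless of what the sources actually claimed, while a selection step that uses the hierarchy can settle on a more specific value. The threshold `ratio` is an arbitrary choice here.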

Keywords: Truth identification, Truth discovery, Conflicting values, Value relationships, Ontology

Review history: Received 19 December 2017, Revised 25 June 2018, Accepted 28 June 2018, Available online 31 July 2018, Version of Record 10 September 2018.

Official URL: https://doi.org/10.1016/j.knosys.2018.06.023