FacTeR-Check: Semi-automated fact-checking through semantic similarity and natural language inference

Abstract

Our society produces and shares overwhelming amounts of information through Online Social Networks (OSNs). Within this environment, misinformation and disinformation have proliferated, becoming a public safety concern in most countries. Allowing the public and professionals to efficiently find reliable evidence about the veracity of a claim is a crucial step in mitigating this harmful spread. To this end, we propose FacTeR-Check, a multilingual architecture for semi-automated fact-checking and hoax propagation analysis that can be used to build applications for both the general public and fact-checking organisations. FacTeR-Check implements three modules that rely on the XLM-RoBERTa Transformer architecture: one to evaluate semantic similarity, one to perform natural language inference, and one to build search queries through automatic keyword extraction and Named-Entity Recognition. The three modules have been validated on state-of-the-art benchmark datasets, performing well on all of them. In addition, FacTeR-Check is used to collect and label a publicly released dataset, NLI19-SP, composed of more than 40,000 tweets supporting or denying 60 hoaxes related to COVID-19. Finally, an analysis of this dataset is provided, offering deep insight into how disinformation operated during the COVID-19 pandemic in Spanish-speaking countries.
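
The retrieve-then-infer flow described above can be illustrated with a minimal sketch. This is not the authors' implementation: bag-of-words vectors stand in for XLM-RoBERTa sentence embeddings, a crude negation heuristic stands in for the natural language inference model, and every function name, threshold, and example sentence is hypothetical.

```python
import math
from collections import Counter

NEGATIONS = {"not", "no", "never"}  # assumption: toy negation cue list


def embed(text):
    """Toy stand-in for a sentence encoder: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim, hoaxes, top_k=1):
    """Semantic similarity step: rank known hoaxes against a claim."""
    query = embed(claim)
    scored = [(cosine(query, embed(h)), h) for h in hoaxes]
    return sorted(scored, reverse=True)[:top_k]


def toy_nli(premise, hypothesis, threshold=0.5):
    """Placeholder for the inference step: does the hypothesis (a tweet)
    support, deny, or not address the premise (a hoax)?"""
    p_tokens = set(premise.lower().split())
    h_tokens = set(hypothesis.lower().split())
    if cosine(embed(premise), embed(hypothesis)) < threshold:
        return "neutral"
    # Differing negation status is treated as a contradiction.
    if bool(p_tokens & NEGATIONS) != bool(h_tokens & NEGATIONS):
        return "contradiction"
    return "entailment"


hoaxes = [
    "masks cause hypoxia",
    "vaccines contain microchips",
    "5g spreads the virus",
]
tweet = "masks do not cause hypoxia"
score, best_hoax = retrieve(tweet, hoaxes)[0]
label = toy_nli(best_hoax, tweet)
print(best_hoax, round(score, 3), label)
```

In a real system the two stand-ins would be replaced by trained multilingual models; the sketch only shows how a similarity ranking narrows the candidate hoaxes before an inference step labels each tweet as supporting or denying one.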

Keywords: Misinformation, Transformers, COVID-19, Hoax, Natural language inference, Semantic similarity

Review history: Received 17 February 2022, Revised 10 June 2022, Accepted 11 June 2022, Available online 20 June 2022, Version of Record 28 June 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.109265