Automatic detection of satire in Twitter: A psycholinguistic-based approach

Authors:

Highlights:

Abstract

In recent years, substantial effort has gone into developing sophisticated methods for detecting figurative language, particularly irony and sarcasm. There is, however, a lack of new approaches and research that analyze satirical texts. Recognizing satire is extremely important for sentiment analysis and Natural Language Processing (NLP) applications because satire can influence and change the meaning of a statement in varied and complex ways. On this basis, we propose a method that employs a wide variety of psycholinguistic features to distinguish satirical from non-satirical text. We then trained a set of machine learning algorithms to classify unseen data. Finally, we conducted several experiments to identify the most relevant features for detecting satirical texts. We evaluated the effectiveness of our method on a corpus of satirical and non-satirical news collected from Mexican and Spanish Twitter accounts. Our proposal obtained encouraging results, with an F-measure of 85.5% for Mexico and 84.0% for Spain. Moreover, the experimental results showed no significant difference between Mexican and Spanish satire.
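As a rough illustration of the two ideas the abstract relies on (category-based psycholinguistic features and the F-measure), the following minimal Python sketch may help. It is not the authors' implementation. The lexicon `TOY_LEXICON`, the function names, and the example category words are hypothetical stand-ins (LIWC dictionaries are proprietary and far larger). The F-measure is assumed here to be the standard F1 for a single positive class; the abstract does not state which averaging was used.

```python
import re

# Hypothetical mini-lexicon standing in for LIWC-style categories.
# The real LIWC dictionaries are not reproduced here.
TOY_LEXICON = {
    "negation": {"no", "nunca", "jamas"},
    "certainty": {"siempre", "seguro"},
}


def category_proportions(text, lexicon=TOY_LEXICON):
    """Return, per category, the share of tokens that belong to it."""
    tokens = re.findall(r"\w+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {
        category: sum(token in words for token in tokens) / total
        for category, words in lexicon.items()
    }


def f_measure(gold, predicted, positive="satire"):
    """Standard F1 (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a pipeline of the kind the abstract outlines, vectors like those returned by `category_proportions` would be fed to a classifier, and `f_measure` would score its predictions against the gold labels.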

Keywords: Computational psycholinguistics, LIWC, Machine learning, Satire, Twitter

Review history: Received 25 October 2016, Revised 20 April 2017, Accepted 22 April 2017, Available online 24 April 2017, Version of Record 25 May 2017.

Paper URL: https://doi.org/10.1016/j.knosys.2017.04.009