A comparative study of effective approaches for Arabic sentiment analysis

Authors:

Highlights:

• We survey state-of-the-art methods for Arabic sentiment analysis.

• We replicate the most effective methods (over 10) reported in the literature for Arabic SA, introduce new ones, and compare their performance on the most popular publicly available Arabic SA benchmark datasets.

• We build the largest Arabic word embeddings, trained on 250 million unique tweets and covering the multiple Arabic dialects found on social media.

• We apply BERT-based models to Arabic SA and compare their performance with all existing state-of-the-art Arabic SA approaches, showing the superiority of Arabic-specific BERT.

• We conduct an extensive error analysis of the different approaches, including reannotation of the existing Arabic SA datasets to assess the subjectivity of the task and the presence of sarcasm.

• We empirically show the challenges that sarcasm poses for sentiment analysis systems.

• Based on our comprehensive analysis, we suggest the most important future research directions for Arabic sentiment analysis in particular and for related NLP tasks in general.

Keywords: Arabic, Sentiment Analysis, Sarcasm

Review history: Received 15 July 2020, Revised 15 October 2020, Accepted 9 November 2020, Available online 16 December 2020, Version of Record 16 December 2020.

Paper URL: https://doi.org/10.1016/j.ipm.2020.102438