Data-driven integration of multiple sentiment dictionaries for lexicon-based sentiment classification of product reviews

Authors:

Highlights:

Abstract

In lexicon-based sentiment classification, contextual polarity must be handled explicitly because it is a major cause of classification errors. One way to handle it is to revise the prior polarities in the sentiment dictionary to fit the domain. This paper presents a data-driven method for adapting sentiment dictionaries to diverse domains. Our method first merges multiple sentiment dictionaries at the entry-word level to expand the dictionary. Then, leveraging the ratio of positive to negative training data, it selectively removes entry words that do not contribute to classification. Finally, it selectively switches the sentiment polarity of entry words to adapt to the domain. In essence, our method compares the ratio of each dictionary word's occurrences in positive versus negative reviews with the ratio of positive to negative reviews itself to determine which entry words to remove and which entry words' polarity to switch. We show that the integrated sentiment dictionary constructed with our ‘merge’, ‘remove’, and ‘switch’ operations robustly outperforms individual dictionaries in the sentiment classification of product reviews across domains such as smartphones, movies, and books.
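
The three operations can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact decision rules, so the function names (`merge`, `adapt`), the majority-vote resolution of conflicting entries, the use of per-review document frequency, the `margin` threshold, and the choice to keep unseen words unchanged are all assumptions made for illustration.

```python
from collections import Counter

def merge(dictionaries):
    """Merge lexicons at the entry-word level.
    Assumption: conflicting polarities are resolved by summing +1/-1 votes;
    words whose votes cancel out are dropped."""
    votes = {}
    for lexicon in dictionaries:
        for word, polarity in lexicon.items():
            votes[word] = votes.get(word, 0) + polarity
    return {w: (1 if v > 0 else -1) for w, v in votes.items() if v != 0}

def adapt(lexicon, pos_reviews, neg_reviews, margin=0.1):
    """Remove or switch entry words by comparing each word's share of
    positive reviews with the overall share of positive reviews.
    `margin` is a hypothetical threshold, not a value from the paper."""
    base = len(pos_reviews) / (len(pos_reviews) + len(neg_reviews))
    # Count in how many positive / negative reviews each word appears.
    pos_df = Counter(w for r in pos_reviews for w in set(r.split()))
    neg_df = Counter(w for r in neg_reviews for w in set(r.split()))
    adapted = {}
    for word, polarity in lexicon.items():
        p, n = pos_df[word], neg_df[word]
        if p + n == 0:
            adapted[word] = polarity      # unseen in training data: keep prior
            continue
        share = p / (p + n)
        if abs(share - base) < margin:
            continue                      # 'remove': word does not discriminate
        # 'switch': take the polarity the data supports, which may differ
        # from the prior polarity in the merged lexicon.
        adapted[word] = 1 if share > base else -1
    return adapted

# Toy data, invented for this example only.
lex_a = {"good": 1, "cheap": -1, "phone": 1}
lex_b = {"bad": -1, "good": 1, "long": -1}
pos = ["good phone long battery", "good cheap phone"]
neg = ["bad phone", "bad screen phone"]
adapted = adapt(merge([lex_a, lex_b]), pos, neg)
```

In this toy run, "phone" occurs equally often in both classes and is removed, while "cheap" and "long" appear only in positive reviews and have their prior negative polarity switched to positive.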

Keywords: Sentiment dictionary, Domain adaptation, Integration, Lexicon-based review classification, Sentiment analysis

Review history: Available online 13 June 2014.

Official paper URL: https://doi.org/10.1016/j.knosys.2014.06.001