Using back-and-forth translation to create artificial augmented textual data for sentiment analysis models

Authors:

Highlights:

• A novel data augmentation method, namely back-and-forth translation, is proposed.

• The proposed method decreases a sentiment classification model’s total error rate.

• The resulting decrease in total error scales non-linearly with sample size.

• The method is more effective at smaller sample sizes than at larger ones.
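
The highlights name the technique but do not spell out its mechanics. As a rough illustration only, back-and-forth translation can be sketched as translating each labelled sentence into a pivot language and back, then keeping any result that differs from the original as a new training example with the same sentiment label. The function names, the pivot language `de`, the de-duplication step, and the toy dictionary "translator" below are all assumptions made for this sketch; this record does not state which translation system, languages, or filtering the authors used.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]
Sample = Tuple[str, int]  # (sentence, sentiment label)


def back_and_forth(text: str, translate: Translator,
                   source: str = "en", pivot: str = "de") -> str:
    """Translate text into the pivot language and back into the source language."""
    forward = translate(text, source, pivot)
    return translate(forward, pivot, source)


def augment(samples: Sequence[Sample], translate: Translator,
            pivots: Sequence[str] = ("de",), source: str = "en") -> List[Sample]:
    """Return the original samples plus any new round-trip paraphrases.

    Each paraphrase inherits the label of the sentence it came from.
    Round trips that reproduce an already-seen sentence are skipped
    (an assumption of this sketch, not a documented step of the paper).
    """
    augmented = list(samples)
    seen = {text for text, _ in samples}
    for text, label in samples:
        for pivot in pivots:
            candidate = back_and_forth(text, translate, source, pivot)
            if candidate not in seen:
                seen.add(candidate)
                augmented.append((candidate, label))
    return augmented


def toy_translate(text: str, source: str, target: str) -> str:
    """Toy stand-in for a real translation system (illustration only).

    The word mapping is deliberately lossy ("movie" and "film" share one
    pivot word) so that a round trip can change the wording.
    """
    table = {
        ("en", "de"): {"great": "toll", "movie": "Film", "film": "Film"},
        ("de", "en"): {"toll": "great", "Film": "film"},
    }
    mapping = table.get((source, target), {})
    return " ".join(mapping.get(word, word) for word in text.split())


if __name__ == "__main__":
    data = [("great movie", 1)]
    for sentence, label in augment(data, toy_translate):
        print(label, sentence)
```

In practice `toy_translate` would be replaced by a call to an actual machine translation system; keeping the translator as a plain callable leaves that choice open.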

Keywords: Natural language processing, Translation, Sentiment analysis, Data augmentation

Review history: Received 19 May 2020, Revised 26 November 2020, Accepted 9 April 2021, Available online 16 April 2021, Version of Record 29 April 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.115033