Detecting abusive Instagram comments in Turkish using convolutional neural network and machine learning methods

Highlights:

• The first public dataset dedicated to detecting abusive Turkish messages.

• 10,528 abusive and 19,826 not-abusive Instagram comments were collected.

• CNN, NB, SVM, DT, RF, LR, AdaBoost, and XGBoost classifiers were evaluated.

• The best performance (F1-score: 0.974) was achieved by the CNN model.
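To put the reported figures in context, the following is a minimal sketch of how an F1-score is computed and of the class balance implied by the comment counts in the highlights. It is not the authors' evaluation code. The helper `f1_score` and the "always abusive" baseline are hypothetical illustrations, and the sketch assumes that the abusive class is treated as the positive class; the highlights do not state which class or averaging scheme the reported 0.974 refers to.

```python
# Comment counts taken from the highlights.
N_ABUSIVE = 10_528
N_NOT_ABUSIVE = 19_826


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class.

    Hypothetical helper for illustration; returns 0.0 when there are
    no true positives, to avoid dividing by zero.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


total = N_ABUSIVE + N_NOT_ABUSIVE
abusive_share = N_ABUSIVE / total

# Hypothetical trivial baseline (an assumption, not from the paper):
# label every comment as abusive. Every abusive comment is then a true
# positive, every not-abusive comment a false positive, and there are
# no false negatives.
baseline_f1 = f1_score(tp=N_ABUSIVE, fp=N_NOT_ABUSIVE, fn=0)

print(f"total comments: {total}")
print(f"abusive share: {abusive_share:.3f}")
print(f"always-abusive baseline F1: {baseline_f1:.3f}")
```

The dataset is moderately imbalanced (roughly one abusive comment for every two not-abusive ones), which is one reason F1 is a more informative summary than plain accuracy when comparing the classifiers.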

Keywords: Abusive comment, Hate speech, Classification, Social media, Instagram, Dataset

Article history: Received 20 August 2020, Revised 6 January 2021, Accepted 27 February 2021, Available online 3 March 2021, Version of Record 10 March 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.114802