Towards filtering undesired short text messages using an online learning approach with semantic indexing

Authors:

Highlights:

• A new classifier is presented to detect undesired short text comments.

• The proposed approach is light, fast, multinomial and offers incremental learning.

• The impact of applying text normalization and semantic indexing is studied.

• The results indicate the proposed techniques outperformed most of the approaches.

• Text normalization and semantic indexing enhanced the classifiers' performance.

Keywords: Minimum description length, Short text messages, Semantic indexing, Text categorization, Machine learning, 00-01, 99-00

Review history: Received 17 October 2016, Revised 14 April 2017, Accepted 28 April 2017, Available online 29 April 2017, Version of Record 8 May 2017.

Official URL: https://doi.org/10.1016/j.eswa.2017.04.055