Enhancing representation in the context of multiple-channel spam filtering

Authors:

Highlights:

• Study of protocol-independent features for representing texts to fight against spam.

• Analysis of the different kinds of features that are available for representing texts.

• Measurement of the impact of 10 features in conjunction with several representation schemes.

• Use of two datasets to see the impact of the analysed features in different channels.

Keywords: Spam filtering, Feature engineering, Machine learning, Text representation

Review history: Received 13 May 2021, Revised 7 October 2021, Accepted 7 November 2021, Available online 24 November 2021, Version of Record 24 November 2021.

Official paper URL: https://doi.org/10.1016/j.ipm.2021.102812