Constrained NMF-based semi-supervised learning for social media spammer detection

Authors:

Abstract

Over the past few years, social media platforms such as Facebook, Twitter, and Sina Weibo have become important channels for information dissemination and communication. At the same time, these platforms are prone to attacks by spammers, who typically spread objectionable content such as phishing URLs, false news, and even pornography to other users. As the number of social media spammers grows rapidly, traditional spammer detection methods have become less effective. In this paper, we present a novel semi-supervised social media spammer detection approach that makes full use of message content, user behavior, and social relation information. First, we adapt the original nonnegative matrix factorization (NMF) into a constrained NMF-based semi-supervised learning (CNMF) algorithm by imposing a label information constraint and a sparseness constraint. Second, we present a novel CNMF-based integral framework for social media spammer detection that performs collaborative factorization on the message content matrix and on the user behavior and social relation information matrix. Moreover, we derive the iterative update rule (IUR) and the optimization algorithm for the spammer detection model, and we prove its convergence. Extensive experiments are conducted on a real-world dataset from Sina Weibo. The experimental results demonstrate that the proposed model performs significantly better than conventionally applied supervised classifiers for spammer detection.
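The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of label-constrained NMF, not the paper's method. It assumes a Frobenius-norm objective ||X - U (A Z)^T||^2 with standard multiplicative updates, where the hypothetical constraint matrix A forces labeled samples of the same class to share one latent representation. The sparseness constraint and the collaborative factorization over several matrices are omitted, and all function and parameter names are illustrative.

```python
import numpy as np

def build_constraint_matrix(labels, n_classes):
    """labels: length-n ints; class index for labeled samples, -1 for unlabeled."""
    labels = np.asarray(labels)
    labeled = np.flatnonzero(labels >= 0)
    unlabeled = np.flatnonzero(labels < 0)
    A = np.zeros((labels.shape[0], n_classes + unlabeled.size))
    # Labeled samples of one class all point at the same column.
    A[labeled, labels[labeled]] = 1.0
    # Each unlabeled sample keeps its own free column.
    A[unlabeled, n_classes + np.arange(unlabeled.size)] = 1.0
    return A

def constrained_nmf(X, labels, n_classes, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor nonnegative X (features x samples) as U @ (A @ Z).T."""
    rng = np.random.default_rng(seed)
    A = build_constraint_matrix(labels, n_classes)
    U = rng.random((X.shape[0], rank)) + eps
    Z = rng.random((A.shape[1], rank)) + eps
    AtA = A.T @ A
    errors = []
    for _ in range(n_iter):
        AZ = A @ Z
        # Multiplicative updates keep U and Z nonnegative.
        U *= (X @ AZ) / (U @ (AZ.T @ AZ) + eps)
        Z *= (A.T @ (X.T @ U)) / (AtA @ Z @ (U.T @ U) + eps)
        errors.append(np.linalg.norm(X - U @ (A @ Z).T))
    return U, A @ Z, errors

# Toy data: 6 features, 8 users; the first four users are labeled.
rng = np.random.default_rng(1)
X = rng.random((6, 8))
labels = np.array([0, 0, 1, 1, -1, -1, -1, -1])
U, V, errors = constrained_nmf(X, labels, n_classes=2, rank=3)
```

In this sketch the rows of V for users 0 and 1 (and for users 2 and 3) coincide by construction, which is how the label information enters the factorization. A classifier for the unlabeled users would still have to be built on top of V.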

Keywords: Sina Weibo, Spammer detection, Nonnegative matrix factorization (NMF), Semi-supervised learning

Review history: Received 26 September 2016, Revised 26 March 2017, Accepted 29 March 2017, Available online 30 March 2017, Version of Record 21 April 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.03.025