FindMal: A file-to-file social network based malware detection framework

Abstract

The rapid development of malicious software has posed severe threats to computer and Internet security, motivating anti-malware vendors and researchers to develop novel methods capable of protecting users against new threats. Existing malware detectors mostly treat file samples independently using supervised learning algorithms. However, ignoring the relationships among file samples limits the capability of malware detectors. In this paper, we present FindMal (File-to-File Social Network based Malware Detection Framework), a new malware detection framework based on the file-to-file social network that combines graph-based feature extraction, the Label Propagation algorithm, and an active learning strategy. Nearest neighbors are first chosen as adjacent nodes for each file node to construct a kNN file relation graph. Three file relation graph features are proposed to sample representative file samples for labeling. Then the Label Propagation algorithm, which propagates label information from labeled file samples to unlabeled files, is applied to learn the probability that an unknown file is malicious or benign. A batch-mode active learning method is employed to reduce the labeling cost and improve the performance of Label Propagation. Comprehensive experiments are performed on a large-scale real-world dataset obtained from an anti-malware company. The results demonstrate that the proposed FindMal outperforms other existing detection models in classifying file samples.
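
The pipeline described in the abstract (kNN file relation graph, label propagation, batch selection for labeling) can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the Gaussian edge weights, the clamped iterative propagation rule, and the uncertainty-based batch selection are all assumptions chosen for illustration, and the function names `knn_graph`, `label_propagation`, and `select_batch` are hypothetical. In particular, the paper samples files using three graph features that the abstract does not specify, so a simple uncertainty criterion stands in for them here.

```python
import numpy as np


def knn_graph(X, k):
    """Build a symmetric kNN adjacency matrix from feature vectors X (n x d).

    Assumption: Euclidean distance and Gaussian edge weights; the abstract
    does not state which similarity measure the paper uses.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)              # a node is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]      # indices of the k nearest nodes
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    sigma2 = np.median(d2[rows, cols]) + 1e-12  # bandwidth from neighbor distances
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)                 # symmetrize the graph


def label_propagation(W, labels, n_iter=200):
    """Propagate labels over the graph and return P(malicious) per node.

    labels: 1 = malicious, 0 = benign, -1 = unlabeled.
    Labeled nodes are clamped to their known value after every step.
    """
    labeled = labels >= 0
    deg = np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    P = W / deg                               # row-normalized transition matrix
    f = np.full(len(labels), 0.5)             # unlabeled nodes start undecided
    f[labeled] = labels[labeled]
    for _ in range(n_iter):
        f = P @ f
        f[labeled] = labels[labeled]
    return f


def select_batch(f, labels, batch_size):
    """Pick the unlabeled nodes whose score is closest to 0.5.

    Assumption: plain uncertainty sampling, used as a stand-in for the
    paper's graph-feature-based sampling.
    """
    unlabeled = np.flatnonzero(labels < 0)
    order = np.argsort(np.abs(f[unlabeled] - 0.5))
    return unlabeled[order[:batch_size]]
```

In an active learning loop, the indices returned by `select_batch` would be sent to an analyst for labeling, written back into `labels`, and the propagation rerun on the same graph.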

Keywords: Malware detection, File relation graph, Graph feature, Label propagation, Active learning

Article history: Received 10 June 2016, Revised 3 September 2016, Accepted 8 September 2016, Available online 9 September 2016, Version of Record 4 October 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.09.004