Community detection using hierarchical clustering based on edge-weighted similarity in cloud environment

Authors:

Highlights:

Abstract

Social networks have attracted increasing attention in recent years. Accurate community detection in social networks can support better product design, accurate information recommendation, and public services. This paper therefore proposes a community detection (CD) algorithm based on network topology and user interests. The work has two parts. In the first part, a focused crawler algorithm acquires each user's personal tags from the tags posted by other users. Tags are then selected from the tag set using the TFIDF weighting scheme, the semantic extension of tags, and the user semantic model, and the tag vector of user interests is derived with each tag weight calculated by an improved PageRank algorithm. In the second part, to detect communities, an initial social network consisting of directed, unweighted edges and vertices with interest vectors is constructed from the following/follower relationship. This initial network is then converted into a new social network with undirected, weighted edges: the edge weights are calculated from the edge directions and interest vectors of the initial network, and the similarity between edges is calculated from the edge weights. Communities are detected by a hierarchical clustering algorithm based on this edge-weighted similarity, and the number of communities is determined by the partition density. An extensive experimental study shows that the proposed user interest detection (PUID) algorithm outperforms the CF algorithm and the TFIDF algorithm with respect to F-measure, Precision, and Recall. Moreover, the Precision of the proposed community detection (PCD) algorithm is improved, on average, by up to 8.21% compared with the Newman algorithm and by up to 41.17% compared with the CPM algorithm.
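
The role of partition density in choosing the number of communities can be illustrated with a minimal sketch. It assumes the definition commonly used in link-community clustering, D = (2/M) Σ_c m_c (m_c − (n_c − 1)) / ((n_c − 2)(n_c − 1)), where M is the total number of edges and m_c and n_c are the edge and node counts of community c. The abstract does not state the exact formula the paper uses, so the function name `partition_density` and the input layout below are illustrative assumptions rather than the authors' implementation.

```python
def partition_density(edge_communities):
    """Partition density D of an edge partition (assumed standard definition).

    edge_communities: dict mapping a community label to a list of
    undirected edges (u, v) assigned to that community.
    """
    total_edges = sum(len(edges) for edges in edge_communities.values())
    if total_edges == 0:
        return 0.0

    acc = 0.0
    for edges in edge_communities.values():
        m_c = len(edges)                               # edges in the community
        n_c = len({node for e in edges for node in e}) # nodes touched by them
        if n_c <= 2:
            # A single-edge community contributes 0 by convention
            # (this also avoids dividing by zero).
            continue
        acc += m_c * (m_c - (n_c - 1)) / ((n_c - 2) * (n_c - 1))
    return 2.0 * acc / total_edges


# Hypothetical toy data: two disjoint triangles.
separate = {
    "a": [(1, 2), (2, 3), (1, 3)],
    "b": [(4, 5), (5, 6), (4, 6)],
}
merged = {"ab": separate["a"] + separate["b"]}

d_separate = partition_density(separate)
d_merged = partition_density(merged)
```

Under this assumed definition, a hierarchical clustering of edges would be cut at the dendrogram level where D is largest. In the toy data, keeping the two triangles as separate communities scores higher than merging them, which is the behaviour that makes D usable as a stopping criterion.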

Keywords: Community detection, User interests, Network topology, Hierarchical clustering

Review history: Received 25 April 2018, Revised 6 September 2018, Accepted 8 October 2018, Available online 30 October 2018, Version of Record 30 October 2018.

Official paper URL: https://doi.org/10.1016/j.ipm.2018.10.004