Inferring social network user profiles using a partial social graph

Authors: Raïssa Yapan Dougnon, Philippe Fournier-Viger, Jerry Chun-Wei Lin, Roger Nkambou

Abstract

User profile inference on online social networks is a key task for targeted advertising and for building recommender systems that rely on social network data. However, current algorithms for user profiling suffer from one or more of the following limitations: (1) assuming that the full social graph or a large training set of crawled data is available for training, (2) not exploiting the rich information that is available in social networks, such as group memberships and likes, (3) treating numeric attributes as nominal attributes, and (4) not assessing the certainty of their predictions. In this paper, to address these limitations, we propose an algorithm named Partial Graph Profile Inference+ (PGPI+). The PGPI+ algorithm can accurately infer user profiles under the constraint of a partial social graph. PGPI+ does not require training, and it lets the user select the trade-off between the amount of information to be crawled for inferring a user profile and the accuracy of the inference. In addition, PGPI+ is designed to use rich information about users when available: user profiles, friendship links, group memberships, and the "views" and "likes" from social networks such as Facebook. Moreover, to address limitations 3 and 4, PGPI+ considers numeric attributes in addition to nominal attributes, and can evaluate the certainty of its predictions. An experimental evaluation with 31,247 user profiles from the Facebook and Pokec social networks shows that PGPI+ predicts user profiles with higher accuracy than several state-of-the-art algorithms, while accessing (crawling) less information from the social graph. Furthermore, an interesting result is that some profile attributes, such as status (student/professor) and gender, can be predicted with more than 95% accuracy using PGPI+.

Keywords: Social networks, Inference, User profiles, Partial graph

Paper URL: https://doi.org/10.1007/s10844-016-0402-y