The added value of Facebook friends data in event attendance prediction

Authors:

Highlights:

• We assess the added value of a Facebook user's friends data in event attendance prediction over and above user data.

• We use five classification algorithms, five times two-fold cross-validation and the Wilcoxon signed rank test.

• Including friends data increases the AUC significantly for most algorithms.

• Among the top predictors is the number of friends that are attending the focal event.

• These findings clearly indicate that including network data is a viable strategy.

Abstract

This paper assesses the added value of a Facebook user's friends data in event attendance prediction over and above user data. For this purpose, we gathered data on users who have liked an anonymous European soccer team on Facebook, and we additionally obtained data from all of their friends. To assess the added value of friends data, we built two models for each of five algorithms (Logistic Regression, Random Forest, AdaBoost, Neural Networks and Naive Bayes): a baseline model containing only user data and an augmented model containing both user and friends data. We employed five times two-fold cross-validation and the Wilcoxon signed rank test to validate our findings. The results suggest that including friends data in the predictive model increases the area under the receiver operating characteristic curve (AUC). The increase is significant for three of the five algorithms, marginally significant for one, and not significant for one, and it ranges from 0.21 to 0.82 percentage points. The analyses show that a top predictor is the number of friends who are attending the focal event. To the best of our knowledge, this is the first study to evaluate the added value of friends network data over and above user data in event attendance prediction on Facebook. These findings clearly indicate that including network data in event prediction models is a viable strategy for improving model performance.
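The evaluation design described in the abstract (a baseline versus an augmented model, five times two-fold cross-validation, and a Wilcoxon signed rank test on the AUC values) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the "friends" feature is a made-up stand-in, and `fit_and_score` is a toy mean-difference classifier rather than any of the five algorithms used in the paper. Pairing the ten per-fold AUC values for the test is one plausible reading of the procedure, not a confirmed detail of it.

```python
import random
from itertools import product
from statistics import mean


def auc(labels, scores):
    """AUC as P(random positive scores above random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_by_two_splits(n, rng):
    """Yield 10 (train, test) index pairs: 5 reshuffles, each half tested once."""
    for _ in range(5):
        idx = list(range(n))
        rng.shuffle(idx)
        a, b = idx[: n // 2], idx[n // 2:]
        yield a, b
        yield b, a


def fit_and_score(X, y, train, test):
    """Toy stand-in classifier (NOT one of the paper's algorithms): the weights
    are the difference between the class means on the training half."""
    cols = range(len(X[0]))
    mu1 = [mean(X[i][j] for i in train if y[i] == 1) for j in cols]
    mu0 = [mean(X[i][j] for i in train if y[i] == 0) for j in cols]
    w = [a - b for a, b in zip(mu1, mu0)]
    return [sum(wj * xj for wj, xj in zip(w, X[i])) for i in test]


def wilcoxon_signed_rank(diffs):
    """Exact two-sided Wilcoxon signed rank test; returns (W_plus, p_value).
    Zero differences are dropped; tied magnitudes share their average rank."""
    d = [x for x in diffs if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    center = sum(ranks) / 2  # expected W_plus under the null hypothesis
    observed = abs(w_plus - center)
    # Under the null, each rank carries a positive sign with probability 1/2,
    # so enumerate all sign patterns (2**10 = 1024 for ten folds).
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


def compare_models(n=400, seed=0):
    """Return per-fold AUCs for the baseline and augmented models on synthetic data."""
    rng = random.Random(seed)
    y = [rng.randint(0, 1) for _ in range(n)]
    user = [[rng.gauss(0.5 * yi, 1.0)] for yi in y]  # hypothetical user feature
    # Hypothetical friends feature, e.g. a proxy for "friends attending".
    augmented = [u + [rng.gauss(1.5 * yi, 1.0)] for u, yi in zip(user, y)]
    base_aucs, aug_aucs = [], []
    for train, test in five_by_two_splits(n, rng):
        y_test = [y[i] for i in test]
        base_aucs.append(auc(y_test, fit_and_score(user, y, train, test)))
        aug_aucs.append(auc(y_test, fit_and_score(augmented, y, train, test)))
    return base_aucs, aug_aucs


base_aucs, aug_aucs = compare_models()
w_plus, p_value = wilcoxon_signed_rank([a - b for a, b in zip(aug_aucs, base_aucs)])
print(f"mean AUC gain: {mean(aug_aucs) - mean(base_aucs):.4f}, "
      f"W+ = {w_plus}, p = {p_value:.4f}")
```

Because both models are scored on identical splits, the ten AUC differences are paired, which is what makes a signed rank test applicable. The synthetic effect size here is deliberately large and says nothing about the magnitudes reported in the paper.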

Keywords: Facebook, Network data, Events, Predictive models, Social media

Article history: Received 23 April 2015, Revised 20 November 2015, Accepted 21 November 2015, Available online 28 November 2015, Version of Record 21 January 2016.

Official paper URL: https://doi.org/10.1016/j.dss.2015.11.003