Predicting literature’s early impact with sentiment analysis in Twitter

Abstract

Traditional bibliometric techniques gauge the impact of research through quantitative indices based on citation data. However, because citation-based indices lag behind publication, it may take years to comprehend the full impact of an article. This paper seeks to measure the early impact of research articles through the sentiments expressed in tweets about them. We claim that articles cited in positive or neutral tweets have a more significant impact than those not cited at all or cited in negative tweets. We used the SentiStrength tool and improved it by adding new opinion-bearing words pertaining to scientific domains to its sentiment lexicon. We then classified the sentiment of 6,482,260 tweets linked to 1,083,535 publications covered by Altmetric.com. With positive and negative tweets as independent variables and the citation count as the dependent variable, linear regression analysis showed that tweet sentiment is a weak positive predictor of high citation counts across 16 broad disciplines in Scopus. Introducing an additional indicator to the regression model, namely the number of unique Twitter users, improved the adjusted R-squared value of the regression in several disciplines. Overall, an encouraging positive correlation between tweet sentiments and citation counts showed that Twitter-based opinion may be exploited as a complementary predictor of literature’s early impact.
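The regression setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the per-article counts are invented purely for illustration, and the helper names `solve` and `fit_ols` are hypothetical. It fits an ordinary least squares model of citation counts on positive and negative tweet counts, then adds the number of unique Twitter users as a third predictor and compares the adjusted R-squared of the two models.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... + bp*xp by ordinary least squares.

    Returns (coefficients, r_squared, adjusted_r_squared).
    """
    x = [[1.0] + [float(v) for v in row] for row in rows]  # intercept column
    n, k = len(y), len(x[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    p = k - 1  # number of predictors, excluding the intercept
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, r2, adj_r2


# Invented toy data, one tuple per article:
# (positive tweets, negative tweets, unique Twitter users, citation count)
articles = [
    (3, 0, 3, 5), (10, 1, 8, 14), (1, 2, 3, 2), (7, 0, 4, 9),
    (0, 1, 1, 1), (15, 3, 12, 20), (4, 1, 5, 6), (8, 2, 6, 11),
]
citations = [a[3] for a in articles]

# Model 1: positive and negative tweet counts only.
_, r2_base, adj_base = fit_ols([a[:2] for a in articles], citations)
# Model 2: the same predictors plus the number of unique Twitter users.
_, r2_users, adj_users = fit_ols([a[:3] for a in articles], citations)

print(f"sentiment only:       R2={r2_base:.3f}  adjusted R2={adj_base:.3f}")
print(f"sentiment plus users: R2={r2_users:.3f}  adjusted R2={adj_users:.3f}")
```

Plain R-squared never decreases when a predictor is added, so the adjusted value, which penalizes each extra predictor, is the fairer basis for judging whether the unique-user indicator helps.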

Keywords: Altmetrics, Twitter, Sentiment analysis, User category, Predicting citations

Article history: Received 21 July 2019, Revised 10 December 2019, Accepted 11 December 2019, Available online 14 December 2019, Version of Record 24 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105383