Content-based Node2Vec for representation of papers in the scientific literature

Abstract

Lower-dimensional representation of scientific text has attracted much attention among researchers because of its impact on many data mining and recommendation tasks. This paper studies two main research streams in scientific literature representation. First, both the local and the distributed representation viewpoints are reviewed, and their advantages and disadvantages for lower-dimensional representation are discussed. The paper then proposes a novel hybrid distributed technique for text representation. Using scientific articles as the major source of textual information, both the article’s content and its citation network are used to build a distributed, universal lower-dimensional representation. The superiority of the new technique over traditional methods is then demonstrated by predicting the existence of links in large citation graphs.
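
The abstract does not say how article content is combined with the citation network, so the following is only a minimal sketch of the standard Node2Vec building block named in the title: second-order biased random walks over a citation graph. The toy graph, the paper identifiers, and the function name `node2vec_walk` are hypothetical, and the graph is treated as undirected for simplicity. In a full pipeline, such walks would typically be fed to a skip-gram model to learn node vectors; that step, and any use of text content, is omitted here.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk (Node2Vec style).

    graph: dict mapping each node to the set of its neighbours.
    p: return parameter (larger p makes stepping back to the previous node less likely).
    q: in-out parameter (q < 1 favours moving away from the previous node).
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = sorted(graph[cur])  # sorted for reproducible sampling
        if not neighbours:
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:              # distance 0 from prev: return step
                weights.append(1.0 / p)
            elif nxt in graph[prev]:     # distance 1 from prev: stay close
                weights.append(1.0)
            else:                        # distance 2 from prev: move outward
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical toy citation graph, treated as undirected.
citations = {
    "paper_a": {"paper_b", "paper_c"},
    "paper_b": {"paper_a", "paper_c", "paper_d"},
    "paper_c": {"paper_a", "paper_b"},
    "paper_d": {"paper_b"},
}

# One walk per starting paper, each with a fixed seed for repeatability.
walks = [
    node2vec_walk(citations, node, length=5, p=1.0, q=0.5, rng=random.Random(0))
    for node in sorted(citations)
]
```

Each walk is a sequence of paper identifiers in which consecutive entries are linked in the graph; the parameters `p` and `q` trade off local (breadth-first-like) against outward (depth-first-like) exploration.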

Keywords: Distributed representation, Artificial neural networks, Node2Vec, Link prediction

Review history: Received 21 April 2018, Revised 7 March 2019, Accepted 7 February 2020, Available online 14 February 2020, Version of Record 28 May 2020.

Official paper URL: https://doi.org/10.1016/j.datak.2020.101794