Investigating relationships within and between category networks in Wikipedia

Authors:

Abstract

This work maps and analyses cross-citations among entries in the areas of Biology, Mathematics, Physics and Medicine in the English version of Wikipedia. The cross-citations are represented as an undirected complex network in which entries correspond to nodes and citations between entries are mapped as edges. We found a high clustering coefficient for Biology and Medicine, and a low one for Mathematics and Physics. The topological organization also differs from network to network: Biology and Medicine have a modular structure, Mathematics a sparse structure, and Physics a dense core. The degree distributions of the networks can be approximated by a power law with a cut-off. The assortativity of the isolated networks was also investigated, and the results indicate a distinct pattern for each subject. We estimated the betweenness centrality of each node in the full Wikipedia network, which contains the nodes of the four subjects and the edges between them. In addition, the average shortest path length between subjects revealed a close relationship between Biology and Physics, and also between Medicine and Physics. Our results indicate that analysis of the full Wikipedia network cannot predict the behavior of the isolated categories, since their properties can be very different from those observed in the full network.
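As a rough illustration of two of the measurements named in the abstract, the sketch below builds an undirected network from a list of cross-citations, computes the average clustering coefficient of each isolated category, and computes the average shortest path length between two categories in the full network. The entry names, category labels, and all function names are hypothetical toy values chosen for this example; this is not the authors' data or code, and the paper's exact definitions may differ.

```python
from collections import deque

# Hypothetical toy data: cross-citations between entries, plus one category per entry.
edges = [
    ("Cell", "DNA"), ("DNA", "Protein"), ("Cell", "Protein"),
    ("Protein", "Entropy"), ("Entropy", "Energy"), ("Energy", "Momentum"),
]
category = {
    "Cell": "Biology", "DNA": "Biology", "Protein": "Biology",
    "Entropy": "Physics", "Energy": "Physics", "Momentum": "Physics",
}

def build_graph(edge_list):
    """Map each entry to the set of entries it shares a citation with (undirected)."""
    adj = {}
    for u, v in edge_list:
        if u == v:
            continue  # ignore self-citations
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def induced_subgraph(adj, nodes):
    """Keep only the given nodes and the edges among them (an isolated category)."""
    nodes = set(nodes)
    return {u: adj.get(u, set()) & nodes for u in nodes}

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of neighbours of u.
        links = sum(1 for v in nbrs for w in nbrs if v < w and w in adj[v])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0

def bfs_distances(adj, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_distance_between(adj, group_a, group_b):
    """Average shortest path length over reachable pairs (a in group_a, b in group_b)."""
    total, count = 0, 0
    for a in group_a:
        dist = bfs_distances(adj, a)
        for b in group_b:
            if b != a and b in dist:
                total += dist[b]
                count += 1
    return total / count if count else float("inf")

full = build_graph(edges)
biology = [n for n, c in category.items() if c == "Biology"]
physics = [n for n, c in category.items() if c == "Physics"]

print("Biology clustering:", average_clustering(induced_subgraph(full, biology)))
print("Physics clustering:", average_clustering(induced_subgraph(full, physics)))
print("Biology-Physics mean distance:", mean_distance_between(full, biology, physics))
```

Computing clustering on the induced subgraph while computing distances on the full graph mirrors the distinction the abstract draws between isolated category networks and the full network.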

Keywords: Complex network, Wikipedia, Map of science

Article history: Received 22 February 2010, Revised 10 March 2011, Accepted 15 March 2011, Available online 14 April 2011.

Official URL: https://doi.org/10.1016/j.joi.2011.03.003