Cross-language article linking with different knowledge bases using bilingual topic model and translation features

Authors:

Highlights:

Abstract

Creating links between online encyclopedia articles in different languages is crucial to the construction and integration of large multilingual knowledge bases. Most research to date has focused on linking different language versions of Wikipedia, yet other large online encyclopedias exist in a variety of languages. In this work, we present an SVM-based cross-language article-linking method that uses a bilingual topic model and translation features to link articles in English Wikipedia and Chinese Baidu Baike, the most widely used Wiki-like encyclopedia in China. To evaluate our approach, we compile data sets from Baidu Baike articles and their corresponding English Wikipedia articles. The evaluation results show that our approach achieves a mean reciprocal rank (MRR) of up to 0.8158, outperforming the baseline system by 0.1328 (+19.44%). Our method does not depend heavily on linguistic characteristics, so it can be easily extended to generate cross-language article links between online encyclopedias in other languages.
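
For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is commonly computed for a ranking task like this one, followed by a check of the relative improvement quoted in the abstract. The function name `mean_reciprocal_rank` and the toy rankings are illustrative assumptions and are not taken from the paper; only the figures 0.8158 and 0.1328 come from the abstract.

```python
def mean_reciprocal_rank(ranked_candidates, gold_targets):
    """Average of 1/rank of the correct target over all queries.

    ranked_candidates: one ranked candidate list per query (best first).
    gold_targets: the correct target for each query.
    A query whose target is missing from its list contributes 0.
    """
    total = 0.0
    for candidates, target in zip(ranked_candidates, gold_targets):
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(gold_targets)


# Toy data (hypothetical): the correct article "A" is ranked 1st, 2nd, and 3rd.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
gold = ["A", "A", "A"]
mrr = mean_reciprocal_rank(rankings, gold)  # (1 + 1/2 + 1/3) / 3

# Relative improvement implied by the abstract's figures.
best_mrr = 0.8158
absolute_gain = 0.1328
baseline_mrr = best_mrr - absolute_gain              # baseline MRR implied by the gain
relative_gain_pct = 100 * absolute_gain / baseline_mrr
```

Under these assumptions, the implied baseline MRR is 0.6830, and dividing the absolute gain by it reproduces the reported relative improvement of about 19.44%.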

Keywords: Cross-language article linking, Link discovery, Bilingual topic model

Article history: Received 8 October 2014, Revised 8 August 2016, Accepted 11 August 2016, Available online 17 August 2016, Version of Record 23 September 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.08.015