Context-aware instance matching through graph embedding in lexical semantic space

Authors:

Highlights:

Abstract

In recent years, the growing availability of open-access data (e.g., Wikipedia), combined with advances in algorithmic techniques for information extraction, has facilitated the design and structuring of information, giving rise to knowledge bases. A major challenge lies in integrating these independently designed knowledge bases. Instance matching is one of the solutions proposed to facilitate this process. It aims to link co-referent instances with an owl:sameAs connection so that knowledge bases can complement each other. In this work, we present an approach for the automatic alignment of instances in knowledge bases represented as Resource Description Framework (RDF) graphs. For each instance, our approach generates a virtual document from its local description (i.e., data-type properties) and from the instances related to it through object-type properties (i.e., its neighbors). We thereby transform the instance matching problem into a document matching problem and solve it with a vector space embedding technique. We use pre-trained word embeddings to assess word similarities at both the lexical and semantic levels. We evaluate our approach on multiple knowledge bases from the instance track of the Ontology Alignment Evaluation Initiative (OAEI). The experiments show that our approach achieves strong results compared with several existing state-of-the-art approaches.
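
The pipeline described in the abstract (virtual documents built from an instance's literals and its neighbors, then matched in an embedding space) can be illustrated with a minimal sketch. This is not the authors' implementation: the triples, the 5-dimensional word vectors in `TOY_VECTORS`, the averaging of word vectors, the cosine threshold, and all function names are assumptions made for illustration only. A real system would load pre-trained embeddings and parse actual RDF data.

```python
import math
from collections import defaultdict

# Toy triples (subject, predicate, object, object_is_literal). All data is invented.
KB_A = [
    ("a:1", "name", "Paris city", True),
    ("a:1", "locatedIn", "a:2", False),
    ("a:2", "name", "France", True),
    ("a:3", "name", "Berlin city", True),
    ("a:3", "locatedIn", "a:4", False),
    ("a:4", "name", "Germany", True),
]
KB_B = [
    ("b:1", "label", "Paris town", True),
    ("b:1", "country", "b:2", False),
    ("b:2", "label", "France", True),
    ("b:3", "label", "Berlin town", True),
    ("b:3", "country", "b:4", False),
    ("b:4", "label", "Germany", True),
]

# Hypothetical word vectors standing in for pre-trained embeddings.
# "city" and "town" are deliberately close to mimic semantic similarity.
TOY_VECTORS = {
    "paris": (1.0, 0.0, 0.0, 0.0, 0.0),
    "berlin": (0.0, 1.0, 0.0, 0.0, 0.0),
    "france": (0.0, 0.0, 1.0, 0.0, 0.0),
    "germany": (0.0, 0.0, 0.0, 1.0, 0.0),
    "city": (0.0, 0.0, 0.0, 0.0, 1.0),
    "town": (0.0, 0.0, 0.1, 0.1, 0.9),
}


def virtual_documents(triples):
    """Build one token list per instance: own literals, then neighbors' literals."""
    literals = defaultdict(list)
    neighbors = defaultdict(list)
    for subj, _pred, obj, is_literal in triples:
        if is_literal:
            literals[subj].extend(obj.lower().split())
        else:
            neighbors[subj].append(obj)
    docs = {}
    for subj in set(literals) | set(neighbors):
        tokens = list(literals[subj])
        for neighbor in neighbors[subj]:
            tokens.extend(literals[neighbor])
        docs[subj] = tokens
    return docs


def embed(tokens, vectors=TOY_VECTORS):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dim))


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is missing or has zero norm."""
    if u is None or v is None:
        return 0.0
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_instances(kb_a, kb_b, threshold=0.8):
    """Link each instance of kb_a to its most similar kb_b instance above threshold."""
    emb_a = {s: embed(d) for s, d in virtual_documents(kb_a).items()}
    emb_b = {s: embed(d) for s, d in virtual_documents(kb_b).items()}
    links = {}
    for a, vec_a in emb_a.items():
        best, score = max(
            ((b, cosine(vec_a, vec_b)) for b, vec_b in emb_b.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            links[a] = best
    return links


if __name__ == "__main__":
    for a, b in sorted(match_instances(KB_A, KB_B).items()):
        print(a, "owl:sameAs", b)
```

In this toy setting, the neighbor tokens are what separate a city from its country, and the closeness of the assumed "city" and "town" vectors lets the two descriptions match even though their wording differs. Averaging word vectors is only one simple way to embed a document and is used here purely as a stand-in.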

Keywords: Data linking, Instance matching, Lexical semantic vector, RDF graph, Semantic web

Review history: Received 17 January 2019, Revised 4 August 2019, Accepted 6 August 2019, Available online 12 August 2019, Version of Record 5 November 2019.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.104925