Exploiting non-taxonomic relations for measuring semantic similarity and relatedness in WordNet

作者：

Highlights：

•

摘要

Various applications in computational linguistics and artificial intelligence employ semantic similarity to solve challenging tasks, such as word sense disambiguation, text classification, information retrieval, machine translation, and document clustering. To our knowledge, research to date rely solely on the taxonomic relation “ISA” to evaluate semantic similarity and relatedness between terms. This paper explores the benefits of using all types of non-taxonomic relations in large linked data, such as WordNet knowledge graph, to enhance existing semantic similarity and relatedness measures. We propose a holistic poly-relational approach based on a new relation-based information content and non-taxonomic-based weighted paths to devise a comprehensive semantic similarity and relatedness measure. To demonstrate the benefits of exploiting non-taxonomic relations in a knowledge graph, we used three strategies to deploy non-taxonomic relations at different granularity levels. We conduct experiments on four well-known gold standard datasets. The results of our proposed method demonstrate an improvement over the benchmark semantic similarity methods, including the state-of-the-art knowledge graph embedding techniques, that ranged from 3.8%–23.8%, 1.3%–18.3%, 31.8%–117.2%, and 19.1%–111.1%, on all gold standard datasets MC, RG, WordSim, and Mturk, respectively. These results demonstrate the robustness and scalability of the proposed semantic similarity and relatedness measure, significantly improving existing similarity measures.

论文关键词：Semantic similarity and relatedness,Knowledge graph,Information content,WordNet

论文评审过程：Received 28 May 2020, Revised 17 August 2020, Accepted 26 October 2020, Available online 4 November 2020, Version of Record 24 December 2020.

论文官网地址：https://doi.org/10.1016/j.knosys.2020.106565