Ontology-based information content computation

Abstract

The information content (IC) of a concept estimates its degree of generality or concreteness, a dimension that enables a better understanding of the concept's semantics. As a result, IC has been successfully applied to the automatic assessment of semantic similarity between concepts. In the past, IC has been estimated from the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered by corpus dependency and data sparseness. More recently, some authors have proposed IC-based measures that use taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches to IC computation and propose several improvements aimed at better capturing the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpus-based and ontology-based) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations that correlate better with human judgements than those of related works.
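To make the two families of IC estimation mentioned in the abstract concrete, the following is a minimal sketch in Python. It is not the improved measure proposed in the paper, whose formula the abstract does not give. The taxonomy, the counts, and the function names are all hypothetical. The corpus-based variant uses the standard definition IC(c) = -log p(c); the ontology-based variant uses a common hyponym-count formulation from the literature, assumed here purely for illustration.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
}

# Hypothetical corpus counts of direct mentions of each concept.
COUNTS = {"entity": 0, "animal": 2, "plant": 1, "dog": 5, "cat": 3, "oak": 1}


def descendants(concept):
    """Return all concepts strictly below `concept` in the toy taxonomy."""
    found = set()
    frontier = [concept]
    while frontier:
        node = frontier.pop()
        for child, parent in PARENT.items():
            if parent == node and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def corpus_ic(concept):
    """Corpus-based IC: -log p(c); mentions of descendants also count for c."""
    total = sum(COUNTS.values())
    freq = COUNTS[concept] + sum(COUNTS[d] for d in descendants(concept))
    return -math.log(freq / total)


def intrinsic_ic(concept):
    """Ontology-based IC in [0, 1]: 1 - log(hypo(c) + 1) / log(N).

    Assumed hyponym-count formulation, where hypo(c) is the number of
    descendants of c and N is the number of concepts in the taxonomy.
    """
    n = len(PARENT)
    return 1.0 - math.log(len(descendants(concept)) + 1) / math.log(n)
```

In this sketch both estimators give the root an IC of zero and give leaves the highest values, matching the reading of IC as a measure of concreteness. The corpus-based function needs observed counts for every concept, whereas the intrinsic one needs only the taxonomy itself, which is the motivation the abstract gives for ontology-based approaches.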

Keywords: Information content, Semantic similarity, Ontologies, Taxonomic knowledge, WordNet

Article history: Received 20 July 2010, Revised 5 October 2010, Accepted 5 October 2010, Available online 10 October 2010.

Official paper URL: https://doi.org/10.1016/j.knosys.2010.10.001