An automatic method for reporting the quality of thesauri

Authors:

Highlights:

Abstract

Thesauri are knowledge models commonly used for information classification and retrieval, and their structure is defined by standards such as ISO 25964. However, when creators do not follow the specifications correctly, they construct models with inadequate concepts or relations that offer limited usability. This paper describes a process that automatically analyzes the properties and relations of a thesaurus with respect to the ISO 25964 specification and suggests corrections for potential problems. It performs lexical and syntactic analyses of the concept labels, and structural and semantic analyses of the relations. The process has been tested with the Urbamet and Gemet thesauri, and the results have been analyzed to determine how well the proposed process works.
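
To make the kind of analysis described above concrete, the following is a minimal sketch of three simple checks: one lexical (duplicate preferred labels) and two structural (cycles and orphans in the broader-term hierarchy). It is not the process proposed in the paper. The data layout (a `labels` dict and a `broader` dict), the function names, and the choice of checks are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical in-memory thesaurus (an assumption, not the paper's data model):
#   labels:  concept id -> preferred label
#   broader: concept id -> list of broader concept ids


def find_duplicate_labels(labels):
    """Return normalized preferred labels shared by more than one concept."""
    counts = Counter(label.strip().lower() for label in labels.values())
    return sorted(label for label, n in counts.items() if n > 1)


def find_broader_cycles(broader):
    """Return concepts that can reach themselves by following broader links."""
    in_cycle = set()
    for start in broader:
        stack = list(broader.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                in_cycle.add(start)
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(broader.get(node, ()))
    return sorted(in_cycle)


def find_orphans(labels, broader):
    """Return concepts that take part in no broader/narrower relation."""
    linked = set()
    for child, parents in broader.items():
        if parents:
            linked.add(child)
            linked.update(parents)
    return sorted(c for c in labels if c not in linked)


# Toy data, invented for this example only.
labels = {"c1": "Water", "c2": "water ", "c3": "Rivers", "c4": "Soil", "c5": "Noise"}
broader = {"c3": ["c1"], "c1": ["c4"], "c4": ["c3"]}

duplicates = find_duplicate_labels(labels)
cycles = find_broader_cycles(broader)
orphans = find_orphans(labels, broader)
print(duplicates, cycles, orphans)
```

In a real setting the two dicts would be filled from a thesaurus file, and each finding would be reported alongside a suggested correction rather than only listed.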

Keywords: Thesaurus, Digital libraries, Information retrieval, Thesaurus quality, Ontology alignment

Review history: Received 8 September 2015, Revised 10 March 2016, Accepted 9 May 2016, Available online 16 May 2016, Version of Record 19 July 2016.

Official paper URL: https://doi.org/10.1016/j.datak.2016.05.002