Decisions in thesaurus construction and use

Authors:

Highlights:

Abstract

A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of several possible sets of subclasses should be used. These decisions are made so as to maximize performance, based on retrospective measurements or estimates of future performance when thesaurus terms are used in document ordering. The decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus, consistent with the decision criteria developed here, is also described. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in “breadcrumb navigation”.

Keywords: Thesaurus, Ontology, Evaluation, Performance measurement, Controlled vocabulary

Review history: Received 5 April 2006, Revised 11 August 2006, Accepted 16 August 2006, Available online 16 November 2006.

Official URL: https://doi.org/10.1016/j.ipm.2006.08.011