Subject indexing and citation indexing, part II: An evaluation and comparison

Abstract

The extent to which four subject representations and two citation representations associate documents relevant to the same query and discriminate between documents relevant to different queries is investigated as a function of descriptor-weight and similarity thresholds. The descriptor-weight threshold selects a level of indexing exhaustivity for the document representation, and the similarity threshold selects a level from the associated single-link hierarchy. Computations of cluster-based retrieval effectiveness as a function of the threshold values reveal optimal performance levels for each representation. For the subject representations and the citation representations, optimal performance levels occur at relatively low levels of exhaustivity and are materially superior to results derived from random structures. Surprisingly, citation representations are marginally superior to subject representations based on MEDLINE® subject descriptions. The lowest performance levels are associated with exhaustive subject representations, which are biased against associating documents relevant to the same query. Composite results that include combinations of subject and citation outcomes produce meaningful improvements in retrieval performance when compared to the performance of constituent representations.
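The two thresholds described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Dice coefficient, the function names, and the toy documents are all assumptions made for illustration, since the abstract does not name the similarity measure or the data format. The descriptor-weight threshold trims each document's descriptor set (lower exhaustivity), and the similarity threshold cuts the single-link hierarchy, which is equivalent to taking connected components of the graph whose edges join documents at or above that similarity.

```python
from itertools import combinations


def apply_weight_threshold(docs, weight_threshold):
    """Keep only descriptors whose weight is at least weight_threshold."""
    return {
        doc_id: {d for d, w in descriptors.items() if w >= weight_threshold}
        for doc_id, descriptors in docs.items()
    }


def dice(a, b):
    """Dice coefficient between two descriptor sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def single_link_clusters(reps, similarity_threshold):
    """Cut the single-link hierarchy at similarity_threshold.

    Two documents share a cluster when a chain of pairwise similarities,
    each at or above the threshold, connects them (union-find over edges).
    """
    parent = {doc_id: doc_id for doc_id in reps}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(reps, 2):
        if dice(reps[a], reps[b]) >= similarity_threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for doc_id in reps:
        clusters.setdefault(find(doc_id), set()).add(doc_id)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical weighted descriptors; higher weight = more central descriptor.
docs = {
    "d1": {"heart": 3, "surgery": 2, "human": 1},
    "d2": {"heart": 3, "valve": 2, "human": 1},
    "d3": {"liver": 3, "enzyme": 2, "human": 1},
}

# Low exhaustivity: the generic descriptor "human" is dropped.
selective = single_link_clusters(apply_weight_threshold(docs, 2), 0.5)

# High exhaustivity: "human" is kept and links all three documents.
exhaustive = single_link_clusters(apply_weight_threshold(docs, 1), 0.3)
```

In this toy data the exhaustive representation merges all three documents through the shared generic descriptor, so it no longer separates them. The example only shows how the thresholds interact, and does not reproduce the paper's findings.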

Review history: Received 19 December 1989, Accepted 13 April 1990, Available online 19 July 2002.

Paper URL: https://doi.org/10.1016/0306-4573(90)90047-6