Correlation of term usage and term indexing frequencies

Authors:

Highlights:

Abstract

Several studies have examined the distributions of index terms, title terms, authors, and other elements used in searching bibliographic databases. What is needed is to relate this information to the actual selection of terms for searching. This study analyzes data gathered by monitoring the terms selected for searching an online catalog at the School of Library and Information Science Library at the University of Western Ontario. Every time a term was employed in a search expression, its count in the dictionary file was updated; if the word was not in the dictionary, it was added. As a check on other studies, the rank distribution of terms chosen for searching was fitted and found to be of a general Bradford-Zipf type. The main hypothesis was that high-frequency terms in the catalog are the ones most frequently chosen in searches. A scatterplot of the number of postings in the catalog versus the frequency in searching was examined, and Pearson's correlation and Spearman's rank correlation coefficients were calculated. The data show that, in general, searchers tend to select terms with a high number of postings when searching the catalog.
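
The bookkeeping and the two correlation measures mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the study's actual software: the function names (`record_search`, `pearson`, `average_ranks`, `spearman`), the whitespace tokenization of search expressions, and all term counts are assumptions made up for illustration. Spearman's coefficient is computed here as Pearson's coefficient applied to average ranks, which is the standard definition when ties are present.

```python
from collections import Counter
from math import sqrt


def record_search(usage, expression):
    """Update usage counts for every term in one search expression.

    Assumption: terms are separated by whitespace and case is ignored.
    """
    for term in expression.lower().split():
        usage[term] += 1  # a Counter adds previously unseen terms automatically


def pearson(xs, ys):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical catalog postings per term (invented numbers, not study data).
postings = {"library": 120, "cataloging": 45, "indexing": 30, "bradford": 3}

# Hypothetical log of search expressions entered by users.
search_log = ["library cataloging", "library indexing", "library",
              "cataloging", "bradford"]

usage = Counter()
for expression in search_log:
    record_search(usage, expression)

terms = sorted(postings)
x = [postings[t] for t in terms]  # number of postings in the catalog
y = [usage[t] for t in terms]     # frequency of use in searches

r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

With real data, `postings` would come from the catalog's inverted file and `search_log` from the monitored sessions; positive values of both coefficients would be consistent with the hypothesis that heavily posted terms are chosen more often.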

Keywords:

Review history: Received 8 September 1987, Accepted 7 December 1987, Available online 19 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(88)90023-4