On selection and combining of relevance indicators

Abstract

Identifying and representing the content of a document was, and still is, one of the main concerns of information retrieval systems. Representation of content is not dependent on the search strategy and other elements of information retrieval systems (IRS) but rather has some relationship with them.

In the conventional IRS, each document in the file is characterized by one or more index terms which supposedly describe its content. Those terms are assigned from the natural language or from a prepared list (a thesaurus). Over the years, other means of representing content were suggested. Attempts were also made to combine several of them, assuming independence.

This paper discusses the attributes of the items in the database and their qualities. It seems that no single attribute has all the desired qualities.

If the attributes are neither totally independent nor highly correlated, then combining them in a certain way may increase effectiveness. The justification for this comes from the users' information-seeking behavior: users draw on index terms, authors' names, citations, and other attributes in their searches.

A model to accommodate the above hypothesis is formulated, and a small experiment indicates that the hypothesis may be true and that this way of combining might improve effectiveness.

Review history: Received 13 February 1980; available online 13 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(80)90017-5