Using semantic components to search for domain-specific documents: An evaluation from the system perspective and the user perspective

作者:

Highlights:

摘要

We seek to leverage an expert user's knowledge about how information is organized in a domain and how information is presented in typical documents within a particular domain-specific collection, to effectively and efficiently meet the expert's targeted information needs. We have developed the semantic components model to describe important semantic content within documents. The semantic components model for a given collection (based on a general understanding of the type of information needs expected) consists of a set of document classes, where each class has an associated set of semantic components. Each semantic component instance consists of segments of text about a particular aspect of the main topic of the document and may not correspond to structural elements in the document. The semantic components model represents document content in a manner that is complementary to full text and keyword indexing. This paper describes how the semantic components model can be used to improve an information retrieval system. We present experimental evidence from a large interactive searching study that compared the use of semantic components in a system with full text and keyword indexing, where we extended the query language to allow users to search using semantic components, to a base system that did not have semantic components. We evaluate the systems from a system perspective, where semantic components were shown to improve document ranking for precision-oriented searches, and from a user perspective. We also evaluate the systems from a session-based perspective, evaluating not only the results of individual queries but also the results of multiple queries during a single interactive query session.

论文关键词:Information retrieval,Search,Digital libraries,Semantic components

论文评审过程:Available online 10 May 2009.

论文官网地址:https://doi.org/10.1016/j.is.2009.04.005