Applying query structuring in cross-language retrieval

Abstract

We explored several ways to apply query structuring in cross-language information retrieval. In the first test, English queries were translated into Finnish using an electronic dictionary and run against a Finnish newspaper database of 55,000 articles. Queries were structured by combining the Finnish translation equivalents of each English query key with the syn-operator of the InQuery retrieval system. Structured queries performed markedly better than unstructured queries. The second test examined compound-based structuring, in which a proximity operator was applied to the translation equivalents of the components of query-language compounds. This method was not useful in syn-based queries; it resulted in a decrease in retrieval effectiveness. Proper names are often spelling variants of one another in different languages, similar but not identical, which allows n-gram based translation of names not included in a dictionary. In the third test, a query structuring method in which the Boolean and-operator was used to assign more weight to keys translated through n-gram matching gave good results.
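
The first and third tests can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the index terms, the helper names (`syn`, `dice`, `best_ngram_matches`, `structured_query`), the use of bigrams with the Dice coefficient, and the exact operator spelling (`#syn`, `#sum`, `#and`) are all assumptions made for illustration. Keys found in the dictionary have their translation equivalents grouped under a syn-operator; keys missing from the dictionary (typically proper names) are matched against the target index by character n-gram overlap and then placed under an and-operator, which is one possible reading of "assigning more weight" to those keys.

```python
def char_ngrams(word, n=2):
    """Return the set of character n-grams of a lowercased word."""
    w = word.lower()
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient over the n-gram sets of two words (assumed similarity measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def best_ngram_matches(name, index_terms, k=1, n=2):
    """Pick the k index terms most similar to an untranslatable key."""
    ranked = sorted(index_terms, key=lambda t: dice(name, t, n), reverse=True)
    return ranked[:k]


def syn(equivalents):
    """Group the translation equivalents of one source key as synonyms."""
    if len(equivalents) == 1:
        return equivalents[0]
    return "#syn(" + " ".join(equivalents) + ")"


def structured_query(keys, dictionary, index_terms, k=1):
    """Build a structured target-language query string (hypothetical syntax)."""
    dict_parts, name_parts = [], []
    for key in keys:
        if key in dictionary:
            # Dictionary translation: one syn-group per source key.
            dict_parts.append(syn(dictionary[key]))
        else:
            # No dictionary entry: fall back to n-gram matching.
            name_parts.append(syn(best_ngram_matches(key, index_terms, k)))
    body = "#sum(" + " ".join(dict_parts) + ")" if dict_parts else ""
    if not name_parts:
        return body
    # n-gram translated keys become required operands of the and-operator.
    operands = name_parts + ([body] if body else [])
    return "#and(" + " ".join(operands) + ")"


# Toy data, invented for this example only.
toy_dictionary = {"election": ["vaali", "valinta"], "president": ["presidentti"]}
toy_index = ["jeltsin", "helsinki", "presidentti"]

query = structured_query(["president", "election", "yeltsin"], toy_dictionary, toy_index)
print(query)
```

An unstructured baseline would simply list every translation equivalent as a separate key, so a source word with many equivalents would dominate the query; grouping equivalents under a syn-operator treats them as instances of a single key instead.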

Keywords: Compound word processing, Cross-language information retrieval, n-Gram matching, Proper name searching, Structured queries

Article history: Received 14 December 2000, Accepted 3 October 2002, Available online 20 January 2003.

Official URL: https://doi.org/10.1016/S0306-4573(02)00091-2