Dictionary-based techniques for cross-language information retrieval

Authors:

Abstract

Cross-language information retrieval (CLIR) systems allow users to find documents written in different languages from that of their query. Simple knowledge structures such as bilingual term lists have proven to be a remarkably useful basis for bridging that language gap. A broad array of dictionary-based techniques have demonstrated utility, but comparison across techniques has been difficult because evaluation results often span only a limited range of conditions. This article identifies the key issues in dictionary-based CLIR, develops unified frameworks for term selection and term translation that help to explain the relationships among existing techniques, and illustrates the effect of those techniques using four contrasting languages for systematic experiments with a uniform query translation architecture. Key results include identification of a previously unseen dependence of pre- and post-translation expansion on orthographic cognates and development of a query-specific measure for translation fanout that helps to explain the utility of structured query methods.
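
As a rough illustration of the query translation idea summarized above, the following minimal sketch looks up each query term in a bilingual term list and counts how many translation alternatives each term receives. The toy term list, the function names `translate_query` and `mean_fanout`, and the simple averaging are all assumptions made for illustration; the article's own query-specific fanout measure and its structured query methods may be defined differently.

```python
from statistics import mean

# Hypothetical toy bilingual term list (English -> Spanish); not taken from the article.
TERM_LIST = {
    "bank": ["banco", "orilla", "ribera"],
    "river": ["río"],
    "loan": ["préstamo"],
}


def translate_query(terms, term_list):
    """Map each query term to its known translations.

    Terms missing from the list are passed through unchanged, so that
    orthographic cognates and proper names can still match documents.
    """
    return {term: term_list.get(term, [term]) for term in terms}


def mean_fanout(translated):
    """Average number of translation alternatives per query term.

    This is an illustrative measure only, not the article's definition.
    """
    return mean(len(alternatives) for alternatives in translated.values())


query = ["river", "bank", "madrid"]
translated = translate_query(query, TERM_LIST)
fanout = mean_fanout(translated)
```

Here "bank" fans out to three alternatives while "madrid" is untranslatable and is kept as-is. A high fanout of this kind is one reason a system might group the alternatives for a single query term rather than treat each translation as an independent term.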

Keywords: Cross-language information retrieval, Ranked retrieval, Dictionary-based translation

Review history: Received 10 June 2004, Accepted 14 June 2004, Available online 19 August 2004.

Official paper URL: https://doi.org/10.1016/j.ipm.2004.06.012