3.5K runs, 5K topics, 3M assessments and 70M measures: What trends in 10 years of Adhoc-ish CLEF?

作者:

Highlights:

摘要

Multilingual information access and retrieval is a key concern in today global society and, despite the considerable achievements over the past years, it still presents many challenges. In this context, experimental evaluation represents a key driver of innovation and multilinguality is tackled in several evaluation initiatives worldwide, such as CLEF in Europe, NTCIR in Japan and Asia, and FIRE in India. All these activities have run several evaluation cycles and there is a general consensus about their strong and positive impact on the development of multilingual information access systems. However, a systematic and quantitative assessment of the impact of evaluation initiatives on multilingual information access and retrieval over the long period is still missing.Therefore, in this paper we conduct the first systematic and large-scale longitudinal study on several CLEF Adhoc-ish tasks – namely the Adhoc, Robust, TEL, and GeoCLEF labs – in order to gain insights on the performance trends of monolingual, bilingual and multilingual information access systems, spanning several European and non-European languages, over a range of 10 years.We learned that monolingual retrieval exhibits a stable positive trend for many of the languages analyzed, even though the performance increase is not always steady from year to year due to the varying interests of the participants, who may not always be focused on just increasing performances. Bilingual retrieval demonstrates higher improvements in recent years – probably due to the better language resources now available – and it also outperforms monolingual retrieval in several cases. Multilingual retrieval shows improvements over the years and performances are comparable to those of bilingual and monolingual retrieval, and sometimes even better. Moreover, we have found evidence that the rule-of-thumb of a 3-year duration for an evaluation task is typically enough since top performances are usually reached by the third year and sometimes even by the second year, which then leaves room for research groups to investigate relevant research issues other than top performances.Overall, this study provides quantitative evidence that CLEF has achieved the objective which led to its establishment, i.e. making multilingual information access a reality for European languages. However, the outcomes of this paper not only indicate that CLEF has steered the community in the right direction, but they also highlight the many open challenges for multilinguality. For instance, multilingual technologies greatly depend on language resources and targeted evaluation cycles help not only in developing and improving them, but also in devising methodologies which are more and more language-independent. Another key aspect concerns multimodality, intended not only as the capability of providing access to information in multiple media, but also as the ability of integrating access and retrieval over different media and languages in a way that best fits with user needs and tasks.

论文关键词:Information retrieval,Multilingual information access,Longitudinal analysis,Experimental evaluation,CLEF

论文评审过程:Available online 13 August 2016, Version of Record 23 November 2016.

论文官网地址:https://doi.org/10.1016/j.ipm.2016.08.001