Collection fusion using Bayesian estimation of a linear regression model in image databases on the Web

Authors:

Abstract

The collection fusion problem for image databases concerns retrieving relevant images, by content-based retrieval, from image databases distributed on the Web. While database selection and collection fusion have been studied extensively for text databases, little research has addressed image databases. Image databases on the Web are heterogeneous: they use different similarity measures and process queries according to their own policies. Our previous study [Inf. Process. Lett. 75 (1–2) (2000) 35] provided three algorithms for this problem. In this paper, the metaserver selects image databases whose similarity measures are correlated with a global similarity measure and then submits the query to them. We propose a new algorithm for this metaserver that uses a probabilistic technique, Bayesian estimation of a linear regression model. It outperforms the previous approach across diverse result-set sizes, and the improvement in effectiveness is especially large for small result sets. We also provide a virtual optimal algorithm against which our algorithm is compared. Extensive experiments show the superiority of the Bayesian method over the others.
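The abstract does not spell out the model, so the following is only a minimal sketch of how a metaserver might use Bayesian linear regression for fusion, not the authors' algorithm. It assumes a textbook conjugate setup: for each database, the global similarity is modelled as `w0 + w1 * local_similarity` plus Gaussian noise, with a zero-mean Gaussian prior on the weights. The names `fit_bayes_linear`, `predict`, `fuse`, and the precision parameters `alpha` and `beta` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Posterior:
    mean: tuple  # posterior mean of (intercept, slope)
    cov: tuple   # 2x2 posterior covariance, as nested tuples


def fit_bayes_linear(local_scores, global_scores, alpha=1.0, beta=25.0):
    """Posterior over (w0, w1) in: global = w0 + w1 * local + noise.

    Assumed prior: w ~ N(0, I / alpha); assumed known noise precision: beta.
    """
    n = len(local_scores)
    sx = sum(local_scores)
    sxx = sum(x * x for x in local_scores)
    sy = sum(global_scores)
    sxy = sum(x * y for x, y in zip(local_scores, global_scores))

    # Posterior precision matrix A = alpha * I + beta * X^T X (2x2, symmetric).
    a = alpha + beta * n
    b = beta * sx
    d = alpha + beta * sxx
    det = a * d - b * b  # strictly positive because alpha > 0

    # Posterior covariance is the inverse of A.
    cov = ((d / det, -b / det), (-b / det, a / det))

    # Posterior mean = beta * cov @ X^T y.
    r0, r1 = beta * sy, beta * sxy
    mean = (cov[0][0] * r0 + cov[0][1] * r1,
            cov[1][0] * r0 + cov[1][1] * r1)
    return Posterior(mean, cov)


def predict(post, local_score):
    """Estimate the global similarity from one database's local similarity."""
    return post.mean[0] + post.mean[1] * local_score


def fuse(results, posteriors, k):
    """Merge per-database hits by estimated global similarity, keep the top k.

    results: {db_name: [(image_id, local_score), ...]}
    posteriors: {db_name: Posterior}
    """
    merged = [
        (predict(posteriors[db], score), db, image_id)
        for db, hits in results.items()
        for image_id, score in hits
    ]
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged[:k]
```

In this sketch, each database would be fitted once on sample queries for which both the local and global similarities are known. At query time, `fuse` ranks the returned images by their estimated global similarity. The posterior covariance is computed but not used for ranking here; a fuller treatment could use it to account for uncertainty in the estimate.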

Keywords: Collection fusion, Bayesian model, Similarity search, Image database

Publication history: Available online 10 December 2002.

Paper URL: https://doi.org/10.1016/S0306-4573(02)00051-1