Fast and accurate near-duplicate image search with affinity propagation on the ImageWeb

Abstract

Near-duplicate image search in very large Web databases has been a hot topic in recent years. Traditional methods widely adopt the Bag-of-Visual-Words (BoVW) model and the inverted index structure. Despite their simplicity, efficiency and scalability, these algorithms depend heavily on the accurate matching of local features. In real applications, however, many factors limit the descriptive power of low-level features, so the search results suffer from unsatisfactory precision and recall. To overcome these shortcomings, it is reasonable to re-rank the initial search results using post-processing approaches such as spatial verification, query expansion and diffusion-based algorithms.

In this paper, we investigate the re-ranking problem from a graph-based perspective. We construct ImageWeb, a sparse graph consisting of all the images in the database, in which two images are connected if and only if one is ranked among the top of the other's initial search results. Based on the ImageWeb, we use HITS, a query-dependent algorithm, to re-rank the images according to their affinity values. We verify that it is possible to discover the nature of image relationships for search result refinement without using any handcrafted methods such as spatial verification. We also consider some tradeoff strategies to intuitively guide the selection of search parameters. Experiments are conducted on large-scale image datasets with more than one million images. Our algorithm achieves state-of-the-art search performance with very fast speed at the online stage.
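The graph construction and HITS-based re-ranking described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`build_imageweb`, `local_subgraph`, `hits_rerank`), the parameters `k`, `depth` and `iterations`, and the choice to restrict HITS to the images reachable from the query within a few hops are all assumptions made for illustration. The sketch takes each image's initial ranked result list as given (for example, from a BoVW search), and uses the HITS authority score as a stand-in for the affinity value.

```python
from collections import deque
from math import sqrt


def build_imageweb(initial_results, k):
    """Link image i to image j when j is among the top-k initial results of i.

    `initial_results` maps an image id to its ranked list of result ids.
    The cutoff `k` is a hypothetical parameter, not a value from the paper.
    """
    return {
        img: [r for r in ranked if r != img][:k]
        for img, ranked in initial_results.items()
    }


def local_subgraph(graph, query, depth):
    """Return the images reachable from `query` within `depth` hops (BFS order)."""
    order, seen = [query], {query}
    frontier = deque([(query, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                frontier.append((nxt, d + 1))
    return order


def _normalize(scores):
    """Scale scores to unit L2 norm; leave an all-zero vector unchanged."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {n: v / norm for n, v in scores.items()}


def hits_rerank(graph, query, depth=2, iterations=20):
    """Re-rank the query's neighbourhood by HITS authority score."""
    nodes = local_subgraph(graph, query, depth)
    node_set = set(nodes)
    edges = [(u, v) for u in nodes for v in graph.get(u, []) if v in node_set]
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of images that link in.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_auth[v] += hub[u]
        # Hub update: sum the fresh authority scores of linked-to images.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += new_auth[v]
        auth, hub = _normalize(new_auth), _normalize(new_hub)
    candidates = [n for n in nodes if n != query]
    # Ties are broken by id so the output order is deterministic.
    return sorted(candidates, key=lambda n: (-auth[n], str(n)))
```

Because the graph is built offline from the initial result lists, the only online work in this sketch is the breadth-first expansion and a few power iterations on a small subgraph. An image that appears in the query's initial list but is not linked to by the other candidates receives a low authority score and moves down the ranking.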