Integrating wavelets with clustering and indexing for effective content-based image retrieval

Abstract

Recent developments in technology have influenced our daily lives and the way people communicate and store data. There is a clear shift from traditional methods to sophisticated techniques that maximize the utilization of widely available digital media. People can take photos with handheld devices, and the volume of digitally stored photos has increased massively. Digital devices are also shaping the medical field: scanners are available for every part of the body to help identify problems. This tremendous increase in the number of digitally captured and stored images necessitates advanced techniques capable of classifying images and effectively retrieving the relevant ones when needed. Thus, content-based image retrieval (CBIR) systems have become very popular for browsing, searching and retrieving images from a large database of digital images with minimal human intervention. The research community is competing for more efficient and effective methods, as CBIR systems may be heavily employed in time-critical monitoring applications in homeland security and in the scientific and medical domains, among others. All of this motivated the work described in this paper. We propose a novel approach that uses a well-known clustering algorithm, k-means, and a database indexing structure, the B+-tree, to retrieve relevant images efficiently and effectively. Cluster validity analysis indexes combined with majority voting are employed to verify the appropriate number of clusters. While searching for similar images, we consider images from the closest cluster and from other nearby clusters. We introduce two new parameters, cG and cS, to determine the distance range to be searched in each cluster. These parameters enable us to find similar images even if the query image is misclustered, and to further narrow down the search space for large clusters. To determine the values of cG and cS, we introduce a new formula for gain measurement; we iteratively find the best gain value and set the parameters accordingly. We use the Daubechies wavelet transformation to extract the feature vectors of images. The reported test results are promising. They demonstrate how data mining techniques can improve the efficiency of the CBIR task without sacrificing much of the accuracy of the overall process.
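The retrieval idea summarized in the abstract (cluster the feature vectors, then search only a distance range inside the closest and nearby clusters) can be illustrated with a minimal sketch. It is not the authors' implementation, and several things in it are assumptions: feature vectors are taken as already extracted (for example, from wavelet coefficients), a sorted list queried with `bisect` stands in for the B+-tree, and the hypothetical `radius` and `n_clusters` arguments are simple placeholders rather than the paper's cG and cS parameters or its gain formula, which the abstract does not define.

```python
import bisect
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[j].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def build_index(points, centroids):
    """Per cluster, a list of (distance to centroid, point id), sorted by key.

    The sorted list is an assumed stand-in for a B+-tree keyed on the
    distance between an image and its cluster centroid.
    """
    index = [[] for _ in centroids]
    for i, p in enumerate(points):
        j = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        index[j].append((dist(p, centroids[j]), i))
    for entries in index:
        entries.sort()
    return index


def search(query, points, centroids, index, radius, n_clusters=2):
    """Return (distance, point id) pairs within `radius` of the query.

    Only the `n_clusters` clusters nearest to the query are probed, so a
    misclustered query can still reach neighbours in an adjacent cluster.
    Inside each cluster only keys in [dq - radius, dq + radius] are scanned;
    by the triangle inequality no point within `radius` lies outside it.
    """
    order = sorted(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    hits = []
    for j in order[:n_clusters]:
        dq = dist(query, centroids[j])
        keys = [d for d, _ in index[j]]
        lo = bisect.bisect_left(keys, dq - radius)
        hi = bisect.bisect_right(keys, dq + radius)
        for _, i in index[j][lo:hi]:
            d = dist(query, points[i])
            if d <= radius:
                hits.append((d, i))
    return sorted(hits)
```

A typical use would call `kmeans` once on the stored feature vectors, build the index with `build_index`, and then call `search` per query. Probing fewer clusters or using a smaller `radius` trades recall for speed, which is the kind of trade-off the abstract attributes to tuning cG and cS.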

Keywords: Content-based image retrieval, Wavelet transformation, Clustering, Cluster validity analysis, Indexing

Review history: Received 5 July 2011, Revised 21 January 2012, Accepted 21 January 2012, Available online 2 February 2012.

Official paper URL: https://doi.org/10.1016/j.knosys.2012.01.013