Array-index: a plug&search K nearest neighbors method for high-dimensional data

作者:

Highlights:

摘要

Previous algorithms of data partitioning methods (DPMs) to find the exact K-nearest neighbors (KNN) at high dimensions are outperformed by a linear scan method [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of computing, 1997; R. Weber, H.-J. Schek, S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. in: Proc. of the 24th VLDB, USA, 1998]. In this paper, we present a “plug&search” method to greatly speed up the exact KNN search of existing DPMs. The idea is to linearize the data partitions produced by a DPM, rather than the points themselves, into a one-dimensional array-index, that is simple, compact and fast. Unlike most DPMs that support KNN search, which require storage space linear, or exponential [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of computing, 1997; M. Hagedoom, Nearest neighbors can be found efficiently if the dimension is small relative to the input size, ICDT 2003], in dimensions, the array-index requires a storage space that is linear in the number of mapped partitions.

论文关键词:Indexing methods,Image databases,KNN image search, array-index, plug&search method

论文评审过程:Received 4 March 2004, Revised 21 June 2004, Accepted 21 June 2004, Available online 19 August 2004.

论文官网地址:https://doi.org/10.1016/j.datak.2004.06.015