Algorithms for processing K-closest-pair queries in spatial databases

作者:

Highlights:

摘要

This paper addresses the problem of finding the K closest pairs between two spatial datasets (the so-called, K closest pairs query, K-CPQ), where each dataset is stored in an R-tree. There are two different techniques for solving this kind of distance-based query. The first technique is the incremental approach, which returns the output elements one-by-one in ascending order of distance. The second one is the non-incremental alternative, which returns the K elements of the result all together at the end of the algorithm. In this paper, based on distance functions between two MBRs in the multidimensional Euclidean space, we propose a pruning heuristic and two updating strategies for minimizing the pruning distance, and use them in the design of three non-incremental branch-and-bound algorithms for K-CPQ between spatial objects stored in two R-trees. Two of those approaches are recursive following a Depth-First searching strategy and one is iterative obeying a Best-First traversal policy. The plane-sweep method and the search ordering are used as optimization techniques for improving the naive approaches. Besides, a number of interesting extensions of the K-CPQ (K-Self-CPQ, Semi-CPQ, K-FPQ (the K-farthest pairs query), etc.) are discussed. An extensive performance study is also presented. This study is based on experiments performed with real datasets. A wide range of values for the basic parameters affecting the performance of the algorithms is examined in order to designate the most efficient algorithm for each setting of parameter values. Finally, an experimental study of the behavior of the proposed K-CPQ branch-and-bound algorithms in terms of scalability of the dataset size and the K value is also included.

论文关键词:Spatial databases,Branch-and-bound algorithms,Query processing,R-tree,Distance join,I/O and response time performance

论文评审过程:Received 4 December 2002, Revised 18 June 2003, Accepted 20 August 2003, Available online 8 October 2003.

论文官网地址:https://doi.org/10.1016/j.datak.2003.08.007