Fast similarity join for multi-dimensional data
Authors:
Abstract
The efficient processing of multidimensional similarity joins is important for a large class of applications, whose data ranges from low to high dimensionality. Most existing methods have focused on executing high-dimensional joins over large amounts of disk-based data. The increasing size of main memory on current computers, together with the need for efficient processing of spatial joins, suggests that spatial joins for a large class of problems can be processed in main memory. In this paper, we develop two new in-memory spatial join algorithms, the Grid-join and the EGO*-join, and study their performance. Through evaluation, we explore the domain of applicability of each approach and provide recommendations for choosing a join algorithm based on the dimensionality of the data and the expected selectivity of the join. We show that both proposed join techniques substantially outperform the state-of-the-art join algorithm, the EGO-join.
Keywords: Similarity join, Grid-based joins
Review history: Received 15 January 2003, Revised 20 July 2005, Accepted 22 July 2005, Available online 6 September 2005.
Official paper URL: https://doi.org/10.1016/j.is.2005.07.002