Pivot-based approximate k-NN similarity joins for big high-dimensional data
Authors:
Highlights:
• Study of approximate k-NN similarity joins for big high-dimensional data.
• Pivot-based k-NN join methods supporting various levels of approximation guarantee.
• Implementation and algorithm extensions with publicly available source code.
• Comprehensive experiments using high-dimensional data and popular Big Data systems.
Keywords: Hadoop, Spark, MapReduce, k-NN, Approximate similarity join, High-dimensional data
Review history: Received 10 May 2018, Accepted 27 June 2019, Available online 2 July 2019, Version of Record 8 August 2019.
Official paper URL: https://doi.org/10.1016/j.is.2019.06.006