PSATop-k: Approximate range top-k computation on big data

Authors:

Highlights:

• A novel algorithm is presented to compute approximate range top-k efficiently.

• A mathematical method is provided to determine the optimal sampling size.

• An algorithm is proposed to obtain random tuples by reading the dataset sequentially.

• Experiments verify that PSATop-k significantly outperforms existing algorithms.

Keywords: Big data, Approximate range top-k, Partitioning, Sampling

Review history: Received 17 June 2021, Revised 23 September 2021, Accepted 16 October 2021, Available online 21 October 2021, Version of Record 29 October 2021.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.107614