Efficient skyline query processing in SpatialHadoop

Abstract

This paper studies the problem of computing the skyline of a very large spatial dataset in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently. The problem is particularly interesting due to the advent of Big Spatial Data generated by modern applications running on mobile devices, and also because of the importance of the skyline operator for decision-making and business intelligence. To this end, we present a scalable and efficient framework for skyline query processing that operates on top of SpatialHadoop and can be parameterized by individual techniques for filtering candidate points and for merging local skyline sets. We then introduce two novel algorithms that follow the pattern of the framework and boost the performance of skyline query processing. Our algorithms employ specific optimizations based on effective filtering and efficient merging, and it is the combination of the two that yields the improved efficiency. We compare our solution against the state-of-the-art skyline algorithm in SpatialHadoop. The results show that our techniques significantly outperform the competitor, especially when the skyline output is large.
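
To make the "local skyline, then merge" pattern mentioned in the abstract concrete, the following is a minimal single-machine sketch in Python. It is not the paper's algorithm and does not use SpatialHadoop: it assumes that smaller values are better in every dimension, uses a naive nested-loop skyline, and omits the filtering step entirely. The names `dominates`, `skyline`, and `merge_local_skylines` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p dominates q when p is no worse in every dimension and
    strictly better in at least one.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Iterable[Point]) -> List[Point]:
    """Naive nested-loop skyline: keep only points no other point dominates."""
    result: List[Point] = []
    for p in points:
        # Skip p if a point already in the window dominates it.
        if any(dominates(s, p) for s in result):
            continue
        # Otherwise evict window points that p dominates, then keep p.
        result = [s for s in result if not dominates(p, s)]
        result.append(p)
    return result


def merge_local_skylines(local_skylines: Iterable[List[Point]]) -> List[Point]:
    """Merge per-partition skylines by re-running the skyline on their union."""
    candidates = [p for sky in local_skylines for p in sky]
    return skyline(candidates)


# Illustrative data: three partitions, standing in for the data blocks
# that a distributed system would process independently.
partitions = [
    [(1.0, 9.0), (3.0, 7.0), (4.0, 8.0)],
    [(5.0, 2.0), (6.0, 1.0), (7.0, 3.0)],
    [(4.0, 4.0), (8.0, 8.0)],
]

local = [skyline(part) for part in partitions]   # per-partition step
global_skyline = merge_local_skylines(local)     # merge step
```

The merge is correct because any point of the global skyline must also belong to the skyline of its own partition, so the union of the local skylines is a complete candidate set. How large that candidate set becomes is what filtering and merging optimizations of the kind the abstract describes aim to reduce.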

Keywords: Skyline query, Spatial data, MapReduce

Publication history: Available online 29 October 2014, Version of Record 3 September 2015.

Official paper URL: https://doi.org/10.1016/j.is.2014.10.003