Subspace top-k query processing using the hybrid-layer index with a tight bound

作者:

Highlights:

摘要

In this paper, we propose the Hybrid-Layer Index (simply, the HL-index) that is designed to answer top-k queries efficiently when the queries are expressed on any arbitrary subset of attributes in the database. Compared to existing approaches, the HL-index significantly reduces the number of tuples accessed during query processing by pruning unnecessary tuples based on two criteria, i.e., it filters out tuples both (1) globally based on the combination of all attribute values of the tuples like in the layer-based approach (simply, layer-level filtering) and (2) based on individual attribute values specifically used for ranking the tuples like in the list-based approach (simply, list-level filtering). Specifically, the HL-index exploits the synergic effect of integrating the layer-level filtering method and the list-level filtering method. Through an in-depth analysis of the interaction of the two filtering methods, we derive a tight bound that reduces the number of tuples retrieved during query processing while guaranteeing the correct query results. We propose the HL-index construction and retrieval algorithms and formally prove their correctness. Finally, we present the experimental results on synthetic and real datasets. Our experiments demonstrate that the query performance of the HL Index significantly outperforms other state-of-the-art indexes in most scenarios.

论文关键词:Access methods,Query,Top-k queries,Subspaces,Linear scoring functions,Layering,Listing,Hybrid

论文评审过程:Received 28 April 2011, Revised 10 July 2012, Accepted 14 July 2012, Available online 10 September 2012.

论文官网地址:https://doi.org/10.1016/j.datak.2012.07.001