Weighted top-k dominating queries on highly incomplete data

Authors:

Abstract

A top-k dominating (TKD) query retrieves the k objects that dominate the largest number of other objects in a dataset. It is a key decision-making tool for organizations because it lets data analysts discover dominant objects that can be used for recommendation. Incomplete data is common in real-world applications and arises from system failure, privacy protection, data loss, unavailability of data, and other causes. In this paper, we introduce a new approach for answering top-k dominating queries over incomplete data. In many scenarios, the dominating object is one with a very high average rating but very few ratings. We therefore apply a weighting factor when calculating the score of a dominating object, which makes realistic recommendations possible. Data bucketing is used to prune non-candidate objects. The buckets are built on a B+ tree, which speeds up retrieval. For top-k dominating queries over incomplete data, the proposed model outperforms previous methods.
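To make the idea of a weighted dominance score concrete, the following is a minimal brute-force sketch and not the method of the paper. It assumes that larger values are better, that missing values are represented as `None`, and that dominance is checked only on dimensions observed in both objects. The weight used here (the fraction of observed dimensions) is a hypothetical stand-in, since the abstract does not give the weighting formula. The names `dominates`, `weighted_tkd`, and `ratings` are illustrative, and the bucketing and B+ tree pruning are omitted.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]


def dominates(a: Row, b: Row) -> bool:
    """True if a dominates b on the dimensions observed in both rows.

    a must be at least as good everywhere and strictly better somewhere
    (larger is better). Rows with no common observed dimension are
    treated as incomparable.
    """
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions missing in either row
        if x < y:
            return False
        if x > y:
            strictly_better = True
    return strictly_better


def weighted_tkd(data: Dict[str, Row], k: int) -> List[Tuple[str, float]]:
    """Return the k objects with the highest weighted dominance score."""
    scored = []
    for name, row in data.items():
        dominated = sum(
            1 for other, r in data.items()
            if other != name and dominates(row, r)
        )
        # Hypothetical weight: share of observed values, so that an object
        # with few ratings cannot win on a single high value alone.
        weight = sum(v is not None for v in row) / len(row)
        scored.append((name, dominated * weight))
    # Highest score first; ties broken by name for a deterministic result.
    scored.sort(key=lambda t: (-t[1], t[0]))
    return scored[:k]


# Illustrative data: four items rated on four dimensions, with gaps.
ratings = {
    "A": (5, None, None, None),  # one very high rating only
    "B": (4, 4, 5, 4),
    "C": (3, 3, 4, None),
    "D": (2, None, 3, 3),
}
top2 = weighted_tkd(ratings, 2)
```

In this toy data, item A dominates every other item on its single observed dimension, yet the weight ranks the fully rated item B above it. This is the kind of correction the abstract motivates.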

Keywords: Top-k dominating query, Query processing, Skyline, Incomplete data, B+ tree, Dominance relationship

Review history: Received 11 October 2021, Revised 13 January 2022, Accepted 14 February 2022, Available online 15 February 2022, Version of Record 18 February 2022.

Official paper URL: https://doi.org/10.1016/j.is.2022.102008