Selectivity estimation with density-model-based multidimensional histogram

Authors: Meifan Zhang, Hongzhi Wang

Abstract

Histograms are widely used for selectivity estimation on one-dimensional data. Using one-dimensional histograms to estimate the selectivity of multidimensional queries results in a high estimation error unless the attribute-independence assumption holds. Constructing a multidimensional histogram is also challenging, because its storage grows exponentially with the number of dimensions. In this paper, we propose a density-model-based multidimensional histogram. Instead of storing a very large number of buckets, it uses a lightweight density model to predict the densities of a large number of regions. The experimental results indicate that our method provides highly accurate selectivity estimates while occupying little space. In addition, the advantage of our method is more evident on high-dimensional data.
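The two problems named in the abstract can be made concrete with a small sketch. It is not the authors' method and does not implement the density model. It only illustrates, on hypothetical synthetic data, how multiplying per-attribute selectivities (the attribute-independence assumption) misestimates a query on correlated attributes, and how the bucket count of a full grid histogram grows with dimensionality. All function names and the data set are assumptions made for illustration.

```python
import random


def selectivity_1d(values, lo, hi):
    """Exact fraction of values in [lo, hi); an ideal 1-D histogram would return this."""
    return sum(lo <= v < hi for v in values) / len(values)


def independence_estimate(rows, ranges):
    """Estimate multidimensional selectivity as the product of per-attribute selectivities."""
    estimate = 1.0
    for dim, (lo, hi) in enumerate(ranges):
        estimate *= selectivity_1d([row[dim] for row in rows], lo, hi)
    return estimate


def true_selectivity(rows, ranges):
    """Exact fraction of rows that satisfy every range predicate."""
    hits = sum(
        all(lo <= row[dim] < hi for dim, (lo, hi) in enumerate(ranges))
        for row in rows
    )
    return hits / len(rows)


def full_grid_buckets(buckets_per_dim, dims):
    """Number of buckets in an equi-width grid histogram over `dims` attributes."""
    return buckets_per_dim ** dims


# Hypothetical data: two perfectly correlated attributes (y == x), uniform on [0, 1).
rng = random.Random(0)
rows = [(x, x) for x in (rng.random() for _ in range(10_000))]

# Query: x in [0, 0.5) AND y in [0.5, 1). No row can match because y == x.
query = [(0.0, 0.5), (0.5, 1.0)]

estimated = independence_estimate(rows, query)
actual = true_selectivity(rows, query)
print(f"independence estimate: {estimated:.3f}, true selectivity: {actual:.3f}")

# Storage growth of a full grid with 10 buckets per attribute.
for dims in (1, 2, 4, 6):
    print(dims, "dimensions ->", full_grid_buckets(10, dims), "buckets")
```

On this data the independence-based estimate is close to 0.25 while the true selectivity is exactly 0, and a grid with 10 buckets per attribute already needs one million buckets at six dimensions. This is the gap that motivates replacing stored buckets with a compact model.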

Keywords: Selectivity estimation, Multidimensional histogram, Query processing

Paper URL: https://doi.org/10.1007/s10115-021-01547-7