Managerial decision support with knowledge of accuracy and completeness of the relational aggregate functions

Authors:

Abstract

Managers use aggregate data produced by decision support systems in their decision-making to run or improve their firm's operations. Data residing in corporate databases and data warehouses are often far from perfect, and their imperfections affect decision quality and outcomes. Knowing how data errors affect aggregate data could therefore lead to more informed decisions, reduced risk, and competitive advantage. In this paper, we present a methodology to estimate the effects of two important data quality dimensions, accuracy and completeness, on the relational aggregate functions Count, Sum, Average, Max, and Min. Our methodology defines a set of attribute value types and deploys sampling strategies to determine the maximum likelihood estimate of each value type. We show the effect of data error rates on the scalar values returned by the aggregate functions and demonstrate the efficiency of our estimates with Monte Carlo simulations.
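The sampling and simulation steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the three value types (`accurate`, `inaccurate`, `missing`), the function names, and the rates used are all assumptions made for illustration, since the abstract does not list the actual value types. The sketch treats the audited sample as multinomial, so the maximum likelihood estimate of each type's rate is simply its share of the sample, and a small Monte Carlo loop checks how close the averaged estimates come to the assumed true rates.

```python
import random
from collections import Counter

# Hypothetical value types; the paper's actual taxonomy is not given in the abstract.
VALUE_TYPES = ("accurate", "inaccurate", "missing")


def mle_proportions(audited_labels):
    """Multinomial MLE: each type's share of the audited sample."""
    counts = Counter(audited_labels)
    n = len(audited_labels)
    return {t: counts[t] / n for t in VALUE_TYPES}


def simulate_audit(true_rates, sample_size, rng):
    """Draw one audited sample whose labels follow the assumed true rates."""
    weights = [true_rates[t] for t in VALUE_TYPES]
    return rng.choices(VALUE_TYPES, weights=weights, k=sample_size)


def monte_carlo_mean_estimate(true_rates, sample_size, runs, seed=0):
    """Average the MLE over many simulated audits (seeded for repeatability)."""
    rng = random.Random(seed)
    totals = {t: 0.0 for t in VALUE_TYPES}
    for _ in range(runs):
        estimate = mle_proportions(simulate_audit(true_rates, sample_size, rng))
        for t in VALUE_TYPES:
            totals[t] += estimate[t]
    return {t: totals[t] / runs for t in VALUE_TYPES}


if __name__ == "__main__":
    # Assumed error rates, chosen only to exercise the sketch.
    assumed_rates = {"accurate": 0.80, "inaccurate": 0.15, "missing": 0.05}
    print(monte_carlo_mean_estimate(assumed_rates, sample_size=200, runs=500))
```

Estimated rates of this kind could then be used to adjust an aggregate such as Count; how the paper does this for each aggregate function is not described in the abstract.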

Keywords: Information quality, Relational aggregate functions, Sampling strategies

Article history: Received 31 March 2005, Revised 25 September 2005, Accepted 22 December 2005, Available online 13 February 2006.

Official paper URL: https://doi.org/10.1016/j.dss.2005.12.005