Efficient maintenance of basic statistical functions in data warehouses

Authors:

Highlights:

• Significant improvements in maintaining the statistical functions inside a data warehouse.

• Reduces the maintenance time from minutes to seconds.

• Efficiently maintaining basic statistical functions inside a data warehouse contributes to firm performance.

• Can also be applied to maintain a distributed data warehouse or a data mart.

Abstract

Simple but meaningful statistical functions are often used to retrieve valuable summary information from corporate databases. However, such information is commonly obtained from computerized information systems that spend a great deal of time computing over large volumes of collected data. In practice, such data is usually stored in a data warehouse, in which a large number of summary tables or materialized aggregate views are built to improve system performance. Upon changes, most notably when new transactional data are collected from various data sources, all summary tables in the data warehouse that correspond to the transactional data must be updated accordingly. Since the number of summary tables that need to be maintained is often large, maintaining them efficiently is a critical issue in managing a data warehouse. This study proposes an efficient maintenance approach to enhance the performance of a data warehouse, in which additional auxiliary tables are kept inside the data warehouse to speed up the maintenance of statistical functions such as MIN, MAX, MEAN, and MEDIAN. Finally, a comparative analysis is performed to verify the effectiveness of the proposed method.
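The general idea of keeping auxiliary state so that MIN, MAX, MEAN, and MEDIAN can be refreshed without rescanning the base data can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not spell out; the class `SummaryAux` and its methods are hypothetical names, and the auxiliary state here is simply a sorted list of values plus a running sum held in memory rather than in warehouse tables.

```python
import bisect


class SummaryAux:
    """Hypothetical auxiliary structure for one summary-table group.

    Keeps a sorted list of the group's values and a running sum so that
    MIN, MAX, MEAN, and MEDIAN can be read after each insert or delete
    without rescanning the underlying transactional data.
    """

    def __init__(self):
        self.values = []  # sorted ascending, duplicates allowed
        self.total = 0    # running sum of all values

    def insert(self, v):
        # Place the new value at its sorted position and update the sum.
        bisect.insort(self.values, v)
        self.total += v

    def delete(self, v):
        # Locate one occurrence of v by binary search and remove it.
        i = bisect.bisect_left(self.values, v)
        if i == len(self.values) or self.values[i] != v:
            raise ValueError(f"value {v!r} not present")
        self.values.pop(i)
        self.total -= v

    def minimum(self):
        return self.values[0]

    def maximum(self):
        return self.values[-1]

    def mean(self):
        return self.total / len(self.values)

    def median(self):
        n = len(self.values)
        mid = n // 2
        if n % 2:
            return self.values[mid]
        return (self.values[mid - 1] + self.values[mid]) / 2
```

In this sketch, deleting the current maximum does not force a rescan, because the next-largest value is already available in the auxiliary list. That is the kind of situation in which a summary table storing only the aggregate value would otherwise need to be recomputed from the source data.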

Keywords: Data warehouse, Maintain data warehouse, Self-maintainability

Review history: Received 5 May 2010, Revised 31 July 2013, Accepted 8 August 2013, Available online 19 August 2013.

Official paper URL: https://doi.org/10.1016/j.dss.2013.08.003