A data mining approach to knowledge discovery from multidimensional cube structures

Authors:

Highlights:

Abstract

In this research we present a novel methodology for discovering cubes of interest in large multi-dimensional datasets. Unlike previous research in this area, our approach does not rely on specialized domain knowledge. Instead, it uses robust data reduction methods such as Principal Component Analysis and Multiple Correspondence Analysis to identify a small subset of numeric and nominal variables that capture the greatest degree of variation in the data; these variables are then used to generate cubes of interest. Hierarchical clustering was integrated with data reduction to gain insight into the dynamics of relationships between variables of interest at different levels of data abstraction. Two case studies conducted on real-world datasets showed that the methodology captured regions of interest that were significant from both the application and statistical perspectives.
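As a rough illustration of the data reduction idea described in the abstract, the following sketch ranks numeric variables by how strongly they load on the first principal component and keeps the top few. It is not the paper's procedure: the function names, the toy data, the use of power iteration, and the choice to rank by the first component only are all assumptions made for this example, and the nominal-variable (Multiple Correspondence Analysis) and hierarchical clustering steps are not covered.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_component(cov, iterations=500, seed=0):
    """Approximate the first principal axis by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: no variance in the data
        v = [x / norm for x in w]
    return v


def rank_variables(rows, names, keep=2):
    """Return the `keep` variable names with the largest absolute loading.

    Assumption: variables are used unscaled, so large-scale variables
    dominate; standardizing first may be preferable in practice.
    """
    loadings = leading_component(covariance_matrix(rows))
    ranked = sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in ranked[:keep]]


# Hypothetical toy data: two strongly varying measures and one near-constant one.
names = ["sales", "profit", "noise"]
rows = [[i * 10.0, i * 5.0 + (i % 2), 0.01 * (i % 3)] for i in range(20)]
selected = rank_variables(rows, names, keep=2)
print(selected)
```

In this sketch the selected variables would then serve as candidate measures for building cubes; how the original methodology maps components to cube dimensions is not specified in the abstract.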

Keywords: Data cubes, OLAP analysis, Data mining, Ranked paths, Principal Component Analysis, Multiple Correspondence Analysis

Review history: Received 19 July 2012, Revised 4 November 2012, Accepted 23 November 2012, Available online 10 December 2012.

Official paper URL: https://doi.org/10.1016/j.knosys.2012.11.008