Statistically-driven generation of multidimensional analytical schemas from linked data

Authors:

Abstract

The ever-growing Linked Data (LD) initiative has given rise to large amounts of open, semi-structured, and rich data published on the Web. However, effective analytical tools that aid users in their analysis and go beyond browsing and querying are still lacking. To address this issue, we propose the automatic generation of multidimensional analytical stars (MDAS). The success of the multidimensional (MD) model for data analysis has been due in great part to its simplicity. Therefore, in this paper we aim to automatically discover MD conceptual patterns that summarize LD. These patterns resemble the MD star schema typical of relational data warehousing. The underlying foundation of our method is a statistical framework that takes into account both concept and instance data. We present an implementation that uses the statistical framework to generate the MDAS. We have performed several experiments that assess and validate the statistical approach on two well-known, large LD sets.

Keywords: Linked data, RDF, Multidimensional models, Statistical models

Review history: Received 8 January 2016, Revised 4 July 2016, Accepted 6 July 2016, Available online 6 July 2016, Version of Record 29 September 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.07.010