Mining combined causes in large data sets

Authors:

Highlights:

Abstract

In recent years, many methods have been developed for detecting causal relationships in observational data, and some of them can potentially handle large data sets. However, these methods fail to discover a combined cause, i.e. a multi-factor cause consisting of two or more component variables that individually are not causes. A straightforward way to uncover combined causes is to feed both the individual and the combined variables into an existing causal discovery method, but this is computationally infeasible because the number of combined variables is huge. In this paper, we propose a novel approach to this practical causal discovery problem, i.e. mining combined causes in large data sets. Experiments with both synthetic and real-world data sets show that the proposed method obtains high-quality causal discoveries with high computational efficiency.
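As an illustration only (this is not the method proposed in the paper), the following minimal Python sketch shows the two points the abstract relies on: how quickly the number of candidate combined variables grows, and how a combination can be associated with an outcome when its components individually are not. The function names, the toy XOR data, and the use of a plain conditional rate as a stand-in for a proper association test are all assumptions made for this example; association here is only a proxy and does not establish causation.

```python
from itertools import product
from math import comb


def count_combined_variables(n_vars, max_size):
    """Count variable subsets of size 2..max_size (hypothetical helper).

    Value assignments within each subset are ignored, so the real search
    space over combined variables is larger still.
    """
    return sum(comb(n_vars, k) for k in range(2, max_size + 1))


def outcome_rate(rows, condition):
    """Fraction of rows with y == 1 among rows where condition(a, b) holds."""
    selected = [y for a, b, y in rows if condition(a, b)]
    return sum(selected) / len(selected)


# Toy data (assumed for illustration): every (a, b) pair occurs equally
# often, and the outcome y is the XOR of a and b.
rows = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)] * 25

# Individually, a carries no information about y ...
p_y_given_a0 = outcome_rate(rows, lambda a, b: a == 0)
p_y_given_a1 = outcome_rate(rows, lambda a, b: a == 1)

# ... but the combined variable (a == 1 and b == 0) does.
p_y_given_combo = outcome_rate(rows, lambda a, b: a == 1 and b == 0)
p_y_given_rest = outcome_rate(rows, lambda a, b: not (a == 1 and b == 0))

if __name__ == "__main__":
    print("pairs and triples among 100 variables:",
          count_combined_variables(100, 3))
    print("P(y=1 | a=0), P(y=1 | a=1):", p_y_given_a0, p_y_given_a1)
    print("P(y=1 | combo), P(y=1 | not combo):",
          p_y_given_combo, p_y_given_rest)
```

In this toy setting, conditioning on a alone leaves the outcome rate unchanged, while conditioning on the combination shifts it sharply. Even when only pairs and triples of 100 variables are considered, the candidate count already runs into the hundreds of thousands, which is why enumerating all combined variables does not scale.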

Keywords: Causal discovery, Combined causes, Local causal discovery, HITON-PC, Multi-level HITON-PC

Article history: Received 25 April 2015, Revised 7 October 2015, Accepted 15 October 2015, Available online 22 October 2015, Version of Record 11 December 2015.

Official paper URL: https://doi.org/10.1016/j.knosys.2015.10.018