Generative modeling of repositories of health records for predictive tasks

Authors: Rui Henriques, Cláudia Antunes, Sara C. Madeira

Abstract

Repositories of health records are collections of events whose number and sparsity of occurrences vary within and among patients. Although a large number of predictive models have been proposed in the last decade, they are not yet able to simultaneously capture the cross-attribute and temporal dependencies associated with these repositories. Two major streams of predictive models can be found. On the one hand, deterministic models rely on compact subsets of discriminative events to anticipate medical conditions. On the other hand, generative models offer a more complete and noise-tolerant view based on the likelihood of the arrangements of events under test to discriminate a particular outcome. However, despite their relevance, generative predictive models are not easily extended to deal with complex grids of events. In this work, we rely on the Markov assumption to propose new predictive models able to deal with cross-attribute and temporal dependencies. Experimental results provide evidence for the utility and superior accuracy of generative models in anticipating health conditions, such as the need for surgery. Additionally, we show that the proposed generative models are able to decode temporal patterns of interest (from the learned lattices) with acceptable completeness and precision levels, and with superior efficiency for voluminous repositories.

Keywords: Predictive models, Integrated healthcare data, Hidden Markov models, Temporal dependencies, Cross-attribute dependencies, Repositories of events, Sparse temporal data

Official paper page: https://doi.org/10.1007/s10618-014-0385-7