Handling incomplete heterogeneous data using VAEs

Highlights:

• Evidence Lower Bound on incomplete datasets, computed only on the observed data, regardless of the pattern of missing data.

• Generative model that handles mixed numerical and nominal likelihood models, parametrized using deep neural networks (DNNs).

• Stable recognition model that handles incomplete datasets without increasing its complexity or promoting overfitting.

• Data-normalization input/output layer that prevents a few dimensions of the data from dominating the training of the VAE, improving training convergence.

• Comparison with state-of-the-art methods on six datasets for both missing data imputation and predictive tasks.
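
The first and fourth highlights (an objective computed only on observed entries, and a normalization layer) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `masked_standardize` and `masked_gaussian_log_lik`, the use of per-column statistics taken over observed entries only, and the Gaussian likelihood for numerical attributes are all assumptions made for illustration. Missing values are encoded as NaN, and `mask` is True where an entry is observed.

```python
import numpy as np


def masked_standardize(x, mask):
    """Standardize each column using statistics of observed entries only.

    Hypothetical helper: missing entries are set to 0 in the output so
    they carry no signal into an encoder.
    """
    n_obs = np.maximum(mask.sum(axis=0), 1)          # avoid division by zero
    mean = np.where(mask, x, 0.0).sum(axis=0) / n_obs
    centered = np.where(mask, x - mean, 0.0)         # zero out missing entries
    std = np.sqrt((centered ** 2).sum(axis=0) / n_obs) + 1e-8
    return centered / std, mean, std


def masked_gaussian_log_lik(x, mu, sigma, mask):
    """Per-row Gaussian log-likelihood summed over observed entries only.

    Assumed stand-in for the reconstruction term of an ELBO restricted
    to the observed data; missing entries contribute exactly 0.
    """
    log_p = -0.5 * np.log(2.0 * np.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2
    return np.where(mask, log_p, 0.0).sum(axis=1)


# Toy data: two numerical attributes on very different scales, one missing value.
x = np.array([[1.0, 100.0],
              [3.0, np.nan],
              [5.0, 300.0]])
mask = ~np.isnan(x)

z, mean, std = masked_standardize(x, mask)
log_lik = masked_gaussian_log_lik(z, mu=np.zeros(2), sigma=np.ones(2), mask=mask)
print(z)
print(log_lik)
```

After standardization both columns are on a comparable scale, so neither dominates the summed log-likelihood, and the row with a missing value is scored on its single observed attribute alone.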

Article history: Received 25 March 2019, Revised 23 March 2020, Accepted 12 June 2020, Available online 13 June 2020, Version of Record 25 June 2020.