Deep learning for missing value imputation of continuous data and the effect of data discretization

Authors:

Highlights:

• Deep learning for imputing missing continuous values of tabular or structured data is studied.

• In particular, multilayer perceptron (MLP) and deep belief networks (DBN) are employed.

• Two different ordered combinations of data discretization and imputation steps are examined.

• MLP and DBN significantly outperform the baseline imputation methods.

• DBN is the better choice for imputation when the discretization of continuous data is required.
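
The two orderings mentioned in the highlights (impute first and then discretize, or discretize first and then impute) can be illustrated with a small sketch. This is not the paper's method: simple mean and mode imputation stand in for the MLP and DBN imputers, equal-width binning is an assumed discretizer, and all function names are hypothetical. It only shows that the order of the two steps can change the resulting discrete value.

```python
from collections import Counter
from statistics import mean


def equal_width_bins(values, n_bins):
    """Map each observed value to a bin index in [0, n_bins - 1]; None stays None."""
    observed = [v for v in values if v is not None]
    lo, hi = min(observed), max(observed)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant column
    return [
        None if v is None else min(int((v - lo) / width), n_bins - 1)
        for v in values
    ]


def impute_mean(values):
    """Stand-in continuous imputer (assumption): fill None with the observed mean."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_mode(labels):
    """Stand-in discrete imputer (assumption): fill None with the most common bin."""
    observed = [v for v in labels if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in labels]


def impute_then_discretize(values, n_bins):
    """Ordering 1: fill continuous gaps, then bin the completed column."""
    return equal_width_bins(impute_mean(values), n_bins)


def discretize_then_impute(values, n_bins):
    """Ordering 2: bin the observed values, then fill gaps among the bin labels."""
    return impute_mode(equal_width_bins(values, n_bins))


column = [1.0, 2.0, None, 9.0, 10.0, 1.5]
imputed_first = impute_then_discretize(column, 3)
discretized_first = discretize_then_impute(column, 3)
print(imputed_first)      # [0, 0, 1, 2, 2, 0]
print(discretized_first)  # [0, 0, 0, 2, 2, 0]
```

In this toy column the missing entry lands in the middle bin under the first ordering and in the lowest bin under the second, which is the kind of difference the comparison of orderings is meant to expose.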


Keywords: Data science, Machine learning, Deep learning, Missing value imputation, Data discretization

Review history: Received 30 July 2021, Revised 20 December 2021, Accepted 24 December 2021, Available online 1 January 2022, Version of Record 14 January 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.108079