An explanatory machine learning framework for studying pandemics: The case of COVID-19 emergency department readmissions

Authors:

Highlights:

• A new framework is proposed for making informed decisions during pandemics.

• The framework involves a data-driven exploratory phase for feature selection.

• The selected features are then used to train a predictive deep neural network.

• The model predictions are used by a SHAP algorithm to provide intuitive insights for decision makers.

• The framework is showcased on the problem of predicting 7-day emergency department (ED) readmissions of COVID-19 patients.
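
As a rough illustration of the exploratory feature-selection phase, which the abstract describes as using a genetic algorithm, the sketch below evolves binary feature masks with tournament selection, one-point crossover, bit-flip mutation, and elitism. Everything in it is an assumption for illustration: the function names, the hyperparameters, and especially the toy fitness function, which stands in for whatever validation metric the authors actually optimized.

```python
import random

def genetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, mutation_rate=0.05, seed=0):
    """Evolve 0/1 feature masks; returns the fittest mask found.

    Hypothetical helper: a generic genetic algorithm, not the paper's exact setup.
    """
    rng = random.Random(seed)
    # Random initial population of binary masks (1 = feature kept).
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]  # elitism: carry the two best masks over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random candidates, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy fitness (assumption): features 0, 3, and 5 are "informative";
# every selected feature costs 0.1 to discourage large subsets.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

best_mask = genetic_feature_selection(toy_fitness, n_features=8)
print(best_mask, toy_fitness(best_mask))
```

In a real setting the fitness call would typically train and validate a model on the masked feature set, which makes each generation far more expensive than in this toy.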

Abstract:

One of the major challenges that confront medical experts during a pandemic is the time required to identify and validate the risk factors of the novel disease and to develop an effective treatment protocol. Traditionally, this process involves numerous clinical trials that can take several years, during which strict preventive measures must be in place to control the outbreak and reduce deaths. Advanced data analytics techniques, however, can be leveraged to guide and speed up this process. In this study, we combine evolutionary search algorithms, deep learning, and advanced model interpretation methods to develop a holistic exploratory-predictive-explanatory machine learning framework that can assist clinical decision-makers in reacting to the challenges of a pandemic in a timely manner. The proposed framework is showcased by studying emergency department (ED) readmissions of COVID-19 patients using ED visits from a real-world electronic health records database. After an exploratory feature selection phase using a genetic algorithm, we develop and train a deep artificial neural network to predict early (i.e., 7-day) readmissions (AUC = 0.883). Lastly, a SHAP model is formulated to estimate additive Shapley values (i.e., importance scores) of the features and to interpret the magnitude and direction of their effects. The findings are mostly in line with those reported by lengthy and expensive clinical trial studies.
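
To make the notion of additive Shapley values concrete, here is a minimal sketch that computes them exactly by enumerating feature subsets for a tiny toy model. It is not the authors' pipeline: SHAP implementations approximate these values for large models, whereas this brute-force version is only feasible for a handful of features. The model, the feature names (`age`, `prior_visits`, `spo2`), and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy risk score with one interaction term (not the paper's network).
def toy_model(z):
    age, prior_visits, spo2 = z
    return 0.02 * age + 0.3 * prior_visits - 0.05 * spo2 + 0.01 * age * prior_visits

x = [70.0, 2.0, 92.0]          # instance to explain (made-up values)
baseline = [50.0, 0.0, 97.0]   # reference input (made-up values)
phi = shapley_values(toy_model, x, baseline)

# Additivity: the scores sum to the gap between the prediction and the baseline.
gap = toy_model(x) - toy_model(baseline)
print(phi, sum(phi), gap)
```

The sign of each score gives the direction of a feature's effect on this prediction and its absolute value gives the magnitude, which is the kind of reading the abstract refers to.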

Keywords: Machine learning, Pandemic, COVID-19, SHAP, Deep learning, Genetic algorithm

Review history: Received 27 December 2020, Revised 21 August 2021, Accepted 10 January 2022, Available online 18 January 2022, Version of Record 24 August 2022.

Paper URL: https://doi.org/10.1016/j.dss.2022.113730