Automation of cleaning and ensembles for outliers detection in questionnaire data

Authors:

Highlights:

• Ensemble outliers detection in raw multivariate questionnaire data.

• Enhanced methods based on entropy, correlation, and probability.

• Uncorrelated outlier scores addressing the specific issues of questionnaires.

• A case study on the real-world questionnaire dataset (HBSC 2020).

Keywords: Anomaly detection, Outliers, Questionnaire data, Data cleaning, HBSC

Review history: Received 2 March 2022, Revised 23 May 2022, Accepted 6 June 2022, Available online 13 June 2022, Version of Record 24 June 2022.

Official paper URL: https://doi.org/10.1016/j.eswa.2022.117809