Identifying the best data-driven feature selection method for boosting reproducibility in classification tasks

Authors:

Highlights:

• Devising the first framework for selecting the feature selection (FS) method that best identifies the most reproducible features (i.e., biomarkers) for a given dataset.

• Graph-based centrality analysis for identifying the most reproducible FS method.

• Evaluating FS-Select on both small-scale and large-scale brain disorder datasets (dementia and autism).

• Proposing a multiple cross-validation strategy to evaluate the feature reproducibility of the proposed framework.


Keywords: Feature selection methods, Multi-graph topological analysis, Feature reproducibility, Biomarker discovery, Morphological brain network, Neurological disorders, Connectomics, Cross-validation

Review history: Received 24 October 2018, Revised 10 November 2019, Accepted 24 December 2019, Available online 9 January 2020, Version of Record 20 January 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2019.107183