A multiple association-based unsupervised feature selection algorithm for mixed data sets

Authors:

Highlights:

• New generic multiple association measure for categorical, numerical and mixed data.

• Proposing two algorithms to select features based on the new multiple association.

• Conducting an empirical evaluation of proposed solutions to assess their traits.

• Our approaches cater for unsupervised feature selection in mixed datasets.

• Reducing the processing time required for the unsupervised feature selection problem.

Keywords: Feature selection, Measures of association, Multiple association, Categorical data, Mixed data, Feature engineering

Review history: Received 26 October 2021, Revised 21 August 2022, Accepted 27 August 2022, Available online 5 September 2022, Version of Record 19 September 2022.

Paper URL: https://doi.org/10.1016/j.eswa.2022.118718