Genetic programming for multiple-feature construction on high-dimensional classification

Authors:

Highlights:

• Genetic programming (GP) is the most suitable technique for feature construction. This paper investigates the key factors in GP-based multiple-feature construction on high-dimensional data and how they influence the performance of different approaches.

• In terms of representation, a multi-tree representation achieves better classification performance than a single-tree representation.

• In terms of evaluation, an appropriate combination of filter measures is more effective and efficient than a hybrid combination of wrapper and filter.

• In multi-tree GP for feature construction, class-dependent constructed features achieve significantly better classification performance than class-independent ones.
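
To make the multi-tree representation in the highlights above concrete, here is a minimal sketch in which each tree of an individual builds one new feature from the original ones. The tuple encoding, the function set `OPS`, and the names `evaluate` and `construct_features` are illustrative assumptions, not the authors' implementation. A single-tree representation would correspond to an individual holding exactly one such tree.

```python
import operator

# Function set for internal nodes (an assumption; the paper's actual set is not stated here).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(tree, row):
    """Evaluate one expression tree on one instance (a sequence of original feature values)."""
    if tree[0] == "feat":  # terminal node: index into the original features
        return row[tree[1]]
    op, left, right = tree  # internal node: binary operator with two subtrees
    return OPS[op](evaluate(left, row), evaluate(right, row))

def construct_features(individual, row):
    """A multi-tree individual is a list of trees; each tree yields one constructed feature."""
    return [evaluate(tree, row) for tree in individual]

# Hypothetical individual with two trees.
individual = [
    ("add", ("feat", 0), ("mul", ("feat", 1), ("feat", 2))),  # f0 + f1 * f2
    ("sub", ("feat", 3), ("feat", 0)),                        # f3 - f0
]

row = [1.0, 2.0, 3.0, 4.0]
constructed = construct_features(individual, row)
print(constructed)
```

In a class-dependent variant, one could associate each tree (or group of trees) with a particular class so that it is evolved to separate that class from the rest. How the paper does this is not described in this listing.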

Keywords: Feature construction, Genetic programming, Classification, Class dependence, High-dimensional data

Review history: Received 26 August 2018, Revised 4 April 2019, Accepted 1 May 2019, Available online 4 May 2019, Version of Record 9 May 2019.

Paper URL: https://doi.org/10.1016/j.patcog.2019.05.006