Data pre-processing pipeline generation for AutoETL
Authors:
Highlights:
• A study of the impact of pre-processing on a set of classification algorithms.
• A method to generate effective pre-processing pipeline prototypes.
• A method for automatic pipeline instantiation as a step towards AutoETL.
• A meta-learning approach to warm-start the pipeline instantiation.
• A comprehensive set of experiments that show the effectiveness of the proposed method.
Keywords: Data pre-processing pipelines, Data analytics
Review history: Received 28 June 2021, Revised 30 September 2021, Accepted 13 November 2021, Available online 3 December 2021, Version of Record 12 May 2022.
Official paper URL: https://doi.org/10.1016/j.is.2021.101957