On the cost-effectiveness of neural and non-neural approaches and representations for text classification: A comprehensive comparative study

Authors:

Highlights:

• A critical literature review reveals serious experimental issues in the recent neural automatic text classification (ATC) literature.

• We provide a comprehensive and scientifically sound comparison of neural and non-neural methods.

• We conduct a cost-effectiveness tradeoff analysis based on more than 1500 measurements.

• Simpler and cheaper non-neural solutions beat neural network methods on smaller datasets with a shortage of training data.

• Transformer architectures are better on larger datasets, but only by small margins and at a much higher cost.

• Metafeatures are competitive with neural networks in both scenarios, with a potentially better cost-effectiveness tradeoff.

Keywords: Text classification, Comparative study, Systematic review

Review history: Received 9 July 2020, Revised 15 December 2020, Accepted 20 December 2020, Available online 5 February 2021, Version of Record 5 February 2021.

Official paper URL: https://doi.org/10.1016/j.ipm.2020.102481