Recognize the surrounding: Development and evaluation of convolutional deep networks using gammatone spectrograms and raw audio signals

Authors:

Highlights:

• Utilized models of the human brain and auditory system.

• Two separate CNN models for different input types (2D and 1D).

• Performance of the proposed 1D CNN model is similar to that of the traditional 2D CNN.

• Effective data augmentation procedure to deal with real-world scenarios.

• Proposed networks have a modest number of trainable parameters and FLOPs.

Keywords: Environmental sound, Classification, CNN, Gammatone

Review history: Received 12 August 2021, Revised 6 January 2022, Accepted 26 March 2022, Available online 31 March 2022, Version of Record 1 April 2022.

Paper URL: https://doi.org/10.1016/j.eswa.2022.116998