Stochastic Ghost Batch for Self-distillation with Dynamic Soft Label
Abstract
Deep neural networks excel at learning patterns from finite training data but often make incorrect predictions with high confidence when faced with out-of-distribution data. In this work, we propose a data-agnostic framework called Stochastic Ghost Batch Augmentation (SGBA) to address these issues. It stochastically augments activation units during training iterations to amend the model's irregular prediction behaviors by leveraging the partial generalization ability of the intermediate model. A self-distilled dynamic soft label is introduced as a regularization term to establish the aforementioned lost connection; it incorporates the similarity prior of the vicinity distribution with respect to the raw samples, rather than conforming the model to static hard labels. Moreover, the induced stochasticity reduces the unnecessary, redundant computational cost of conventional batch augmentation performed at every pass. The proposed regularization provides direct supervision via the KL-divergence between the output softmax distributions of the original and virtual data, and enforces distribution matching to fuse the complementary information in the model's predictions, which gradually become mature and stable as training progresses. In essence, it is a dynamic check, or test, of the neural network's generalization during training. Extensive performance evaluations demonstrate the superiority of our proposed framework.
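The abstract's core regularizer, KL-divergence between the model's softmax outputs on original and augmented ("virtual") data, applied stochastically rather than at every pass, can be illustrated with a minimal PyTorch-style sketch. This is an assumption-based illustration, not the authors' implementation: the function name `sgba_regularizer`, the `augment` callable, `p_apply`, and `temperature` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def sgba_regularizer(model, x, augment, p_apply=0.5, temperature=1.0):
    """Hypothetical sketch of the KL-based dynamic soft-label term described
    in the abstract: match the softmax distribution of an augmented
    ("virtual") batch to the model's own prediction on the original batch.

    `augment` stands in for the paper's ghost batch augmentation; its exact
    form is not reproduced here.
    """
    # Stochastically skip the extra forward pass, avoiding the redundant
    # cost of augmenting at every iteration.
    if torch.rand(1).item() > p_apply:
        return x.new_zeros(())

    with torch.no_grad():
        # Dynamic soft label: the intermediate model's own prediction,
        # detached so it acts as a self-distilled teacher.
        soft_label = F.softmax(model(x) / temperature, dim=1)

    virtual_logits = model(augment(x))
    log_probs = F.log_softmax(virtual_logits / temperature, dim=1)

    # KL(soft_label || p_virtual): enforce distribution matching between
    # predictions on the original and virtual data.
    return F.kl_div(log_probs, soft_label, reduction="batchmean")
```

In a training loop, such a term would presumably be added to the usual cross-entropy loss on the hard labels, so the soft-label matching acts as an auxiliary constraint rather than replacing the supervised objective.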
Keywords: Ghost batch augmentation, Self-distillation, Dynamic soft label
Article history: Received 30 March 2021, Revised 28 October 2021, Accepted 10 December 2021, Available online 19 January 2022, Version of Record 2 February 2022.
DOI: https://doi.org/10.1016/j.knosys.2021.107936