Pruning CNN filters via quantifying the importance of deep visual representations


Abstract:

The success of convolutional neural networks (CNNs) in a variety of applications is accompanied by a dramatic increase in computational cost and memory requirements. In this paper, we propose a novel framework that measures the importance of individual hidden units by computing a relevance score, identifies the most critical filters, and prunes the rest to compress and accelerate CNNs. Unlike existing methods, and inspired by recent work on neural network interpretability, we use the activation of feature maps to detect valuable information and essential semantic parts when evaluating the importance of feature maps. A majority-voting technique based on the degree of alignment between a semantic concept and individual hidden unit representations is proposed to quantitatively evaluate the importance of feature maps. We also propose a simple yet effective method to estimate new convolution kernels from the remaining, crucial channels, enabling effective CNN compression. Experimental results show the effectiveness of our filter selection criteria, which outperform state-of-the-art baselines. Furthermore, we evaluate our pruning method on the CIFAR-10, CUB-200, and ImageNet (ILSVRC 2012) datasets. The proposed method achieves a 50% FLOPs reduction on CIFAR-10 with only a 0.86% accuracy drop on the VGG-16 model. Meanwhile, pruned ResNets on CIFAR-10 achieve a 30% FLOPs reduction with accuracy drops of only 0.12% and 0.02% on ResNet-20 and ResNet-32, respectively. For ResNet-50 on ImageNet, our pruned model achieves a 50% FLOPs reduction with only a 0.27% top-5 accuracy drop, significantly outperforming state-of-the-art methods.
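The majority-voting idea in the abstract can be sketched roughly as follows: each input image "votes" for the filters it finds most important (here approximated by mean feature-map activation, an assumption for illustration), and the filters receiving the most votes are kept. This is a minimal NumPy sketch, not the authors' exact relevance measure, and the function name and `keep_ratio` parameter are hypothetical.

```python
import numpy as np

def filter_importance_votes(activations, keep_ratio=0.5):
    """Majority-voting sketch for filter selection.

    activations: array of shape (num_images, num_filters, H, W),
        the feature maps produced by one conv layer over a batch.
    Returns a boolean mask over filters: True = keep, False = prune.
    """
    n_images, n_filters = activations.shape[:2]
    k = max(1, int(n_filters * keep_ratio))

    # Per-image importance proxy: mean activation of each filter
    # (the paper's actual relevance score is more involved).
    scores = activations.mean(axis=(2, 3))        # (n_images, n_filters)

    # Each image votes for its top-k filters.
    votes = np.zeros(n_filters, dtype=int)
    for img_scores in scores:
        top = np.argsort(img_scores)[-k:]
        votes[top] += 1

    # Keep the k filters with the most votes across the batch.
    keep = np.argsort(votes)[-k:]
    mask = np.zeros(n_filters, dtype=bool)
    mask[keep] = True
    return mask
```

With `keep_ratio=0.5`, a 16-filter layer would retain 8 filters; the pruned layer's kernels would then be re-estimated from the surviving channels, as the abstract describes.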

Article history: Received 24 July 2020, Revised 7 May 2021, Accepted 8 May 2021, Available online 18 May 2021, Version of Record 20 May 2021.

DOI: https://doi.org/10.1016/j.cviu.2021.103220