DNN compression by ADMM-based joint pruning

Authors:

Highlights:

Abstract

The success of deep neural networks (DNNs) has motivated the pursuit of computationally and memory-efficient models for applications in resource-constrained systems such as embedded devices. In line with this trend, network pruning methods that reduce redundancy in over-parameterized models are being studied actively. Previous works in this area have demonstrated the ability to learn a compact network by imposing sparsity constraints on the parameters, but most of them have difficulty not only in identifying both the connections and the neurons to be pruned, but also in converging to optimal solutions. We propose a systematic DNN compression method in which weights and network architectures are jointly optimized. We solve the joint problem using the alternating direction method of multipliers (ADMM), a powerful technique capable of handling non-convex separable programming. Additionally, we provide a holistic pruning approach, an integrated form of our method, for automatically pruning networks without layer-specific hyper-parameters. To verify our work, we applied the proposed method to a variety of state-of-the-art convolutional neural networks (CNNs) on three image classification benchmark datasets: MNIST, CIFAR-10, and ImageNet. The results show that the proposed pruning method effectively compresses the network parameters and reduces the computation cost while preserving prediction accuracy.
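As a rough illustration of how ADMM handles a sparsity-constrained pruning problem of the kind the abstract describes, the sketch below alternates between a gradient step on the loss, a projection onto a sparse set, and a dual update. It is a minimal assumption-laden example, not the paper's actual formulation: the least-squares objective, the top-k sparsity level `k`, and the step sizes `rho` and `lr` are all illustrative choices.

```python
# Minimal sketch of ADMM-based unstructured pruning (hypothetical setup):
# minimize loss(W) subject to ||W||_0 <= k, split as W = Z with Z sparse.
import numpy as np

rng = np.random.default_rng(0)

# Toy "layer": fit targets Y = X @ W_true with a sparse weight matrix.
n, d_in, d_out = 200, 20, 10
X = rng.normal(size=(n, d_in))
Y = X @ rng.normal(size=(d_in, d_out))

k = 50                 # number of weights to keep (assumed sparsity budget)
rho = 1.0              # ADMM penalty parameter (illustrative)
lr = 1e-3              # gradient step size for the W-update (illustrative)
W = rng.normal(size=(d_in, d_out)) * 0.1
Z = W.copy()           # auxiliary variable constrained to the sparse set
U = np.zeros_like(W)   # scaled dual variable

def project_topk(M, k):
    """Euclidean projection onto {M : ||M||_0 <= k}: keep k largest magnitudes."""
    out = np.zeros_like(M)
    idx = np.argsort(np.abs(M), axis=None)[-k:]
    out.flat[idx] = M.flat[idx]
    return out

for it in range(500):
    # W-update: gradient step on loss(W) + (rho/2)||W - Z + U||^2.
    grad = X.T @ (X @ W - Y) / n + rho * (W - Z + U)
    W -= lr * grad
    # Z-update: project W + U onto the sparsity constraint set.
    Z = project_topk(W + U, k)
    # Dual update: accumulate the constraint violation W - Z.
    U += W - Z

W_pruned = project_topk(W, k)  # final hard pruning
loss = np.mean((X @ W_pruned - Y) ** 2)
print(f"sparsity: {np.mean(W_pruned == 0):.2%}, loss: {loss:.4f}")
```

The same alternating structure extends to structured (neuron- or channel-level) pruning by replacing the top-k projection with a projection that zeroes entire rows or columns, which is one way to read the joint weight/architecture optimization the abstract summarizes.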

Keywords: Neural network compression, Structured pruning, Unstructured pruning, Alternating direction method of multipliers (ADMM)

Article history: Received 29 August 2021, Revised 15 November 2021, Accepted 15 December 2021, Available online 23 December 2021, Version of Record 5 January 2022.

DOI: https://doi.org/10.1016/j.knosys.2021.107988