Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in CNNs

Authors: Timo Hackel, Mikhail Usvyatsov, Silvano Galliani, Jan D. Wegner, Konrad Schindler

Abstract

Convolutional neural networks (CNNs) are a powerful tool for pattern recognition and computer vision, but they do not scale well to higher-dimensional inputs, because of the associated memory demands for storing and manipulating high-dimensional tensors. This work starts from the observation that higher-dimensional data, such as 3D voxel volumes, are sparsely populated. CNNs naturally lend themselves to densely sampled data, and sophisticated, massively parallel implementations are available. In contrast, existing frameworks by and large lack the ability to efficiently process sparse data. Here, we introduce a suite of tools that exploit sparsity in both the feature maps and the filter weights of a CNN, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework, when processing data with a high degree of sparsity. Our scheme provides (i) an efficient GPU implementation of a convolution layer based on direct, sparse convolution, as well as sparse implementations of the ReLU and max-pooling layers; (ii) a filter step within the convolution layer, which we call attention, that prevents fill-in, i.e., the tendency of convolution to rapidly decrease sparsity, and guarantees an upper bound on the computational resources; and (iii) an adaptation of back-propagation that makes it possible to combine our approach with standard learning frameworks, while still benefiting from sparsity in the data as well as the model.
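
To make the two central ideas of the abstract concrete, below is a minimal NumPy sketch, not the authors' GPU implementation: a direct sparse convolution over a coordinate list of nonzeros, followed by a hypothetical top-k filter in the spirit of the paper's "attention" step, which caps how many nonzeros survive the layer and thus bounds fill-in. All function names, data layouts, and parameters here are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): direct sparse convolution on a
# coordinate-list representation, plus a top-k filter that limits fill-in.
import numpy as np

def sparse_conv2d(coords, values, kernel, shape):
    """Direct sparse convolution: scatter each nonzero input into the output
    at every kernel offset, never touching the zero entries.

    coords : (N, 2) int array of nonzero positions (row, col)
    values : (N,)   float array of the corresponding input values
    kernel : (kh, kw) dense filter
    shape  : (H, W)  spatial size of the input/output
    Returns a dict mapping output (row, col) -> accumulated response.
    """
    kh, kw = kernel.shape
    out = {}
    for (r, c), v in zip(coords, values):
        for dr in range(kh):
            for dc in range(kw):
                rr, cc = r + dr - kh // 2, c + dc - kw // 2
                if 0 <= rr < shape[0] and 0 <= cc < shape[1]:
                    out[(rr, cc)] = out.get((rr, cc), 0.0) + v * kernel[dr, dc]
    return out

def topk_filter(out, k):
    """Keep only the k largest-magnitude responses and discard the rest,
    so the number of nonzeros per layer (memory and compute) stays bounded."""
    items = sorted(out.items(), key=lambda kv: -abs(kv[1]))[:k]
    coords = np.array([p for p, _ in items], dtype=int)
    values = np.array([v for _, v in items])
    return coords, values

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = W = 32
    coords = rng.integers(0, H, size=(20, 2))   # 20 nonzeros in a 32x32 grid
    values = rng.normal(size=20)
    kernel = rng.normal(size=(3, 3))
    responses = sparse_conv2d(coords, values, kernel, (H, W))
    coords2, values2 = topk_filter(responses, k=20)  # sparsity level preserved
    print(len(responses), "nonzeros after convolution ->", len(values2), "kept")
```

Without the filtering step, each 3x3 convolution can grow the set of nonzeros by up to a factor of nine; capping it at a fixed k is what yields the guaranteed upper bound on resources mentioned in the abstract.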

Keywords: Deep learning, Sparse convolutions, Voxel grids

Paper link: https://doi.org/10.1007/s11263-020-01302-5