A review on weight initialization strategies for neural networks

Authors: Meenal V. Narkhede, Prashant P. Bartakke, Mukul S. Sutaone

Abstract

Over the past few years, neural networks have exhibited remarkable results in various machine learning and computer vision applications. Weight initialization is a significant step performed before training any neural network. The weights of a network are initialized and then adjusted repeatedly during training until the loss converges to a minimum value and an ideal weight matrix is obtained. Weight initialization thus directly drives the convergence of a network, and selecting an appropriate initialization scheme is necessary for end-to-end training. An appropriate technique initializes the weights so that training is accelerated and performance is improved. This paper discusses various advances in weight initialization for neural networks, covering techniques from the literature for feed-forward neural networks, convolutional neural networks, recurrent neural networks, and long short-term memory networks. These techniques are classified as (1) initialization techniques without pre-training, which are further divided into random initialization and data-driven initialization, and (2) initialization techniques with pre-training. Weight initialization and weight optimization techniques that select optimal weights for non-iterative training mechanisms are also discussed. We provide a detailed overview of the different initialization schemes in these categories. The paper concludes with a discussion of existing schemes and the future scope for research.

Keywords: Weight initialization, Random initialization, Interval based, Variance scaling, Data-driven initialization, Unsupervised pre-training

Paper URL: https://doi.org/10.1007/s10462-021-10033-z