Online Speech Enhancement by Retraining of LSTM Using SURE Loss and Policy Iteration

Authors: Sriharsha Koundinya, Abhijit Karmakar

Abstract

Speech enhancement is required to improve quality and intelligibility in applications such as speech recognition, hearing aids and other personal assistant devices. Because acoustic environments vary, online enhancement is essential for practical deployment: the system must observe the environment and adapt its enhancement accordingly. Adaptive filters have previously been used for online enhancement, but a neural-network-based online enhancement scheme had not been proposed. In this paper, we employ an architecture based on Long Short-Term Memory (LSTM) networks to enhance single-channel speech online. The LSTM network is retrained online in a novel way by minimizing Stein's unbiased risk estimate (SURE). This retraining lets the network learn denoising without access to a clean reference or ground truth. To avoid retraining on every sample, we use policy iteration with a reward function based on ITU-T P.563, the widely used single-ended perceptual measure. The proposed LSTM retraining increases the PESQ of the enhanced speech by 0.53 on average, and also improves intelligibility, raising the STOI metric by 0.22.
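The key idea in the abstract is that SURE lets a denoiser be trained from the noisy signal alone, since SURE is an unbiased estimate of the mean-squared error against the (unavailable) clean signal. As a minimal sketch, and not the paper's actual implementation, the snippet below computes a Monte-Carlo SURE loss (divergence estimated with a random probe, in the style of Ramani et al.) for an arbitrary denoiser under additive Gaussian noise with known variance; the toy shrinkage denoiser and all variable names are illustrative assumptions.

```python
import numpy as np

def sure_loss(denoiser, y, sigma, eps=1e-3, rng=None):
    """Monte-Carlo estimate of Stein's Unbiased Risk Estimate (SURE)
    for a denoiser f applied to y = x + n, n ~ N(0, sigma^2 I):

        SURE = ||y - f(y)||^2 / N - sigma^2 + (2 sigma^2 / N) * div_y f(y)

    The divergence is approximated by b^T (f(y + eps*b) - f(y)) / eps
    for a random Gaussian probe b, so no clean reference x is needed.
    """
    rng = np.random.default_rng(rng)
    n = y.size
    fy = denoiser(y)
    b = rng.standard_normal(n)
    div = b @ (denoiser(y + eps * b) - fy) / eps  # Monte-Carlo divergence
    return np.mean((y - fy) ** 2) - sigma**2 + 2 * sigma**2 * div / n

# Toy check on synthetic data: for a linear shrinkage denoiser
# f(y) = a*y the SURE value should track the true MSE against x.
rng = np.random.default_rng(0)
sigma = 0.1
x = np.sin(np.linspace(0, 8 * np.pi, 4000))        # stand-in "clean" signal
y = x + sigma * rng.standard_normal(x.size)        # noisy observation
shrink = lambda z: 0.9 * z                         # illustrative denoiser

sure = sure_loss(shrink, y, sigma, rng=1)
true_mse = np.mean((x - shrink(y)) ** 2)           # only available in a toy
```

In the paper's setting, `sure_loss` would play the role of the training objective for the LSTM's parameters, with the true-MSE comparison available only in synthetic experiments like this one.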

Keywords: LSTM, Reinforcement learning, Policy iteration, Stein's risk estimate, Speech enhancement, Continual learning


Paper URL: https://doi.org/10.1007/s11063-021-10535-5