Isolated word recognition using modular recurrent neural networks

Abstract

This paper describes a novel method of using recurrent neural networks (RNNs) for isolated word recognition. Each word in the target vocabulary is modeled by a fully connected recurrent network. To recognize an input utterance, the best-matching word is determined from the temporal output responses of the word networks. The system is trained in two stages. First, the RNN speech models (RSMs) are trained independently to capture the essential static and temporal characteristics of individual words. This is done with an iterative re-segmentation training algorithm, which automatically yields the optimal phonetic segmentation for each training utterance. The second stage involves mutually discriminative training among the RSMs, aiming to minimize the probability of misclassification. A series of simulation experiments has been performed to demonstrate the effectiveness of the proposed recognition method. For the recognition of (A) 20 English words, (B) 11 Cantonese digits and (C) 58 Cantonese CV syllables, the top-1 accuracies are 91.9%, 93.6% and 87.1%, respectively.
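
The recognition step described above (one recurrent network per word, with the best match chosen from the networks' temporal outputs) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the class `WordRNN`, the functions `score` and `recognize`, the use of unit 0 as the output unit, the mean-activation scoring rule, the random untrained weights, and the toy vocabulary and input frames. Neither training stage is shown.

```python
import math
import random


class WordRNN:
    """Toy fully connected recurrent network standing in for one word model."""

    def __init__(self, n_inputs, n_units, seed):
        rng = random.Random(seed)
        # Hypothetical random (untrained) weights; a real model would be trained.
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_units)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n_units)]
                      for _ in range(n_units)]
        self.bias = [0.0] * n_units

    def response(self, frames):
        """Return the activation of unit 0 at every frame (the temporal output)."""
        state = [0.0] * len(self.bias)
        outputs = []
        for frame in frames:
            new_state = []
            for i in range(len(state)):
                total = self.bias[i]
                total += sum(w * x for w, x in zip(self.w_in[i], frame))
                total += sum(w * s for w, s in zip(self.w_rec[i], state))
                new_state.append(math.tanh(total))
            state = new_state
            outputs.append(state[0])
        return outputs


def score(model, frames):
    """Assumed match score: mean output activation over the utterance."""
    outputs = model.response(frames)
    return sum(outputs) / len(outputs)


def recognize(models, frames):
    """Pick the word whose network gives the highest score."""
    scores = {word: score(model, frames) for word, model in models.items()}
    return max(scores, key=scores.get), scores


# Toy vocabulary and a random 10-frame utterance of 4-dimensional features.
vocabulary = ["yes", "no", "stop"]
models = {word: WordRNN(n_inputs=4, n_units=6, seed=i)
          for i, word in enumerate(vocabulary)}
rng = random.Random(42)
utterance = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(10)]
best_word, scores = recognize(models, utterance)
print(best_word, scores)
```

Because the weights here are random, the selected word carries no meaning; the sketch only shows the shape of the decision rule, in which every word model processes the same utterance and the highest-scoring model determines the recognized word.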

Keywords: Recurrent neural network, Isolated word recognition, Discriminative training

Article history: Received 25 March 1997, Revised 28 August 1997, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(97)00106-4