Towards designing modular recurrent neural networks in learning protein secondary structures

Abstract

Precise prediction of protein secondary structure from the associated amino acid sequence is of great importance in bioinformatics, yet it remains a challenging task for machine learning algorithms. As a major step toward predicting the ultimate three-dimensional structure, secondary structure assignment helps to specify protein function. Taking a multilayer perceptron neural network, pruned to the optimum hidden-layer size, as the reference network, this article devises advanced kinds of recurrent neural network (RNN) to enhance secondary structure prediction. To better model the strong correlations between secondary structure elements, several types of modular reciprocal recurrent neural network (MRR-NN) are examined. In addition, to account for the long-range interactions between amino acids in the formation of secondary structure, bidirectional RNNs are investigated. A multilayer bidirectional recurrent neural network (MBR-NN) is then applied to capture the predominant long-term dependencies. Finally, a modular prediction system based on the interactive combination of the MRR-NN and MBR-NN raises the percentage accuracy (Q3) to 76.91% and the segment overlap (SOV) to 68.13% when tested on the PSIPRED dataset. Integrating the MRR-NN and the MBR-NN captures both the coupling effects among secondary structure types and the sequential information of amino acids along the protein chain.
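
The Q3 figure reported in the abstract is conventionally the percentage of residues assigned the correct one of three states (helix H, strand E, coil C). A minimal Python sketch under that standard definition follows. The function name `q3_accuracy` and the example strings are illustrative assumptions, not the authors' evaluation code, and SOV is not covered here.

```python
def q3_accuracy(true_states: str, predicted_states: str) -> float:
    """Percentage of residues whose 3-state label (H, E, C) is predicted correctly.

    Assumes both arguments are per-residue label strings of equal length.
    """
    if len(true_states) != len(predicted_states):
        raise ValueError("label sequences must have equal length")
    if not true_states:
        raise ValueError("label sequences must be non-empty")
    # Count positions where the predicted state matches the observed state.
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return 100.0 * correct / len(true_states)


# Hypothetical 10-residue example: one helix residue is mispredicted as coil.
observed = "CHHHHEEECC"
predicted = "CHHHCEEECC"
print(q3_accuracy(observed, predicted))  # 9 of 10 residues match -> 90.0
```

Averaged over a test set, this per-residue measure is the kind of quantity the 76.91% figure summarizes; whether the authors average per protein or over all residues is not stated in the abstract.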

Keywords: Protein secondary structure, Reciprocal recurrent neural network, Bidirectional recurrent neural network, Modular networks, Secondary structure correlation, Amino acid interactions

Review history: Available online 24 December 2011.

Official paper URL: https://doi.org/10.1016/j.eswa.2011.12.059