Bayesian network multi-classifiers for protein secondary structure prediction

Authors:

Highlights:

Abstract

Successful secondary structure predictions provide a starting point for direct tertiary structure modelling, and can also significantly improve sequence analysis and sequence-structure threading, aiding structure and function determination. Improving the predictive accuracy of secondary structure prediction is therefore essential for the future development of the whole field of protein research.

In this work we present several multi-classifiers that combine the predictions of the best current classifiers available on the Internet. Our results show that combining the predictions of a set of classifiers into composite classifiers is a fruitful approach: the multi-classifiers we have created are more accurate than any of their component classifiers. The multi-classifiers are based on Bayesian networks and are validated on 9 different datasets. Their predictive accuracy exceeds that of the best secondary structure predictors by 1.21% on average.

Our main contributions are: (i) we improved the best known predictive accuracy by 1.21%, (ii) our best results were obtained with a new semi-naïve Bayes approach named Pazzani-EDA, and (iii) our multi-classifiers combine the predictions of previously built classifiers, obtained over the Internet by means of a Java application that we developed.
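To make the idea of a multi-classifier concrete, the following is a minimal sketch of stacked generalization with a plain naïve Bayes combiner: the per-residue labels predicted by the base classifiers are treated as features, and the combiner learns how much to trust each of them. It is not the Pazzani-EDA method named in the abstract, which is not specified here. The function names (`train_naive_bayes_combiner`, `combine`), the three-state alphabet H/E/C, the Laplace smoothing parameter `alpha`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Assumed three-state alphabet: helix, strand, coil.
STATES = ("H", "E", "C")


def train_naive_bayes_combiner(base_predictions, true_labels, alpha=1.0):
    """Count how often each base classifier predicts each state,
    conditioned on the true state of the residue.

    base_predictions: list of tuples, one tuple per residue, holding the
                      label predicted by each base classifier.
    true_labels:      list of true states, one per residue.
    alpha:            Laplace smoothing constant (an assumption here).
    """
    n_classifiers = len(base_predictions[0])
    class_counts = Counter(true_labels)
    # cond_counts[j][true_state][predicted_state] -> count
    cond_counts = [defaultdict(Counter) for _ in range(n_classifiers)]
    for preds, truth in zip(base_predictions, true_labels):
        for j, predicted in enumerate(preds):
            cond_counts[j][truth][predicted] += 1
    return class_counts, cond_counts, alpha


def combine(model, preds):
    """Return the state with the highest naive Bayes posterior score,
    given the base classifiers' predictions for one residue."""
    class_counts, cond_counts, alpha = model
    total = sum(class_counts.values())
    k = len(STATES)
    best_state, best_score = None, float("-inf")
    for state in STATES:
        # Smoothed log prior of the true state.
        score = math.log((class_counts[state] + alpha) / (total + alpha * k))
        # Add the smoothed log likelihood of each base prediction.
        for j, predicted in enumerate(preds):
            numerator = cond_counts[j][state][predicted] + alpha
            denominator = class_counts[state] + alpha * k
            score += math.log(numerator / denominator)
        if score > best_score:
            best_state, best_score = state, score
    return best_state


# Toy data (invented): classifier 0 is always right,
# classifier 1 always answers "C" regardless of the truth.
truths = ["H", "H", "E", "E", "C", "C"]
base = [("H", "C"), ("H", "C"), ("E", "C"), ("E", "C"), ("C", "C"), ("C", "C")]
model = train_naive_bayes_combiner(base, truths)
print(combine(model, ("H", "C")))  # prints "H"
```

In this toy setting the combiner follows the informative classifier and ignores the uninformative one, which is the basic reason a learned combination can outperform its components. Semi-naïve approaches relax the independence assumption between base classifiers that this sketch makes.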

Keywords: Multi-classifier, Supervised classification, Machine learning, Stacked generalization, Bayesian networks, Protein secondary structure prediction, Pazzani-EDA

Article history: Received 1 March 2003, Revised 2 May 2003, Accepted 16 January 2004, Available online 18 May 2004.

Paper URL: https://doi.org/10.1016/j.artmed.2004.01.009