It's about time: Signal recognition in staged models of protein translocation

Authors:

Highlights:

Abstract

During their synthesis, a large fraction of proteins are directed to the secretory pathway. Several models aim to distinguish between different destinations along this pathway; however, they rarely distinguish between the known stages of the translocation process.

This paper presents a translocation probability function that models the protein SRP-recruitment process, the first stage of the secretory pathway. It unifies groups of proteins with distinct final destinations, allowing more specific sorting to be done in due course and mirroring the hierarchical nature of secretory translocation.

We apply conditional random fields to evaluate the prediction accuracy of a full sequence model. Introducing the translocation function improves accuracy substantially compared to a model based only on properties that are relevant to the subsequent stages and final destinations. For the discrimination of secretory, signal peptide (SP)-equipped proteins from non-secretory proteins, a correlation coefficient of 0.98 is achieved, a level of performance that is matched only by specialized SP predictors. Transmembrane proteins cause considerable confusion in signal peptide predictors, but they fit naturally into our transparent design and reduce the performance of the translocation function only slightly.

The proposed function and model assist efforts to uncover localization and function for the growing volume of protein sequence data. Applying our model, we estimate with high confidence that about 27% of human and 29% of mouse proteins are associated with the secretory pathway.

Keywords: Bioinformatics, Machine learning, Protein secretory pathway, Signal peptide, Conditional random field, Amino acid sequence

Article history: Received 17 March 2008, Revised 4 June 2008, Accepted 12 September 2008, Available online 10 October 2008.

Paper URL: https://doi.org/10.1016/j.patcog.2008.09.020