Identification of protein-nucleotide binding residues via graph regularized k-local hyperplane distance nearest neighbor model

Authors: Yijie Ding, Chao Yang, Jijun Tang, Fei Guo

Abstract

Accurate identification of protein-nucleotide binding residues is crucial for the study of drug structure and for protein functional annotation. Identifying these residues is a typical class-imbalance problem: the minority class (binding residues) is far smaller than the majority class (non-binding residues). Traditional machine learning algorithms are poorly suited to this setting, because their results are heavily biased toward the majority class. To deal with this severe imbalance, we propose a new computational method that identifies protein-nucleotide binding residues via Graph Regularized k-local Hyperplane Distance Nearest Neighbor (GHKNN). On the training set, we compare the performance of the basic classifier, the ensemble classifier and the single classifier. On the independent test sets, we compare its performance with that of other existing models. The experimental results show that our proposed method identifies protein-nucleotide binding residues with higher accuracy than other existing models. The data and material are freely available at https://github.com/guofei-tju/GHKNN.

Keywords: Protein-nucleotide binding residues, Discrete cosine transform, Graph-based model, k-nearest neighbor, Local hyperplane

Paper URL: https://doi.org/10.1007/s10489-021-02737-0