Abstract
One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models but fail to select the best of them from a large pool of candidates as the final output. In this study, we developed a sampling-based machine-learning method that ranks protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility, and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained on different datasets. Then, the two RBF scores and five scoring functions developed by others (Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score) are combined by a sampling method. Finally, another integrated RBF model ranks the structural models according to features of the sampling distribution. We tested the proposed method on two datasets: the CASP server prediction models for all CASP8 targets and a set of models generated by our in-house software MUFOLD. The results show that our method outperforms every individual scoring function both in best-model selection and in overall correlation between the predicted ranking and the actual ranking of structural quality.
Keywords: Protein structure prediction, Structural model quality assessment, CASP, Scoring function, Sampling, Machine learning, MUFOLD, Protein function, Nuclear magnetic resonance spectroscopy, X-ray crystallography
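The abstract names two evaluation criteria: best-model selection, and the correlation between the predicted ranking and the actual ranking. The following is a minimal sketch of how such criteria could be computed for one target, not the authors' evaluation code. The function names, the toy numbers, and the convention that higher values are better for both lists are assumptions made for illustration. Energy-like scores, where lower is better, would need to be negated first.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def selection_loss(predicted, quality):
    """Quality gap between the best available model and the top-ranked one."""
    chosen = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(quality) - quality[chosen]


# Hypothetical toy data for four candidate models of one target.
predicted = [0.62, 0.35, 0.80, 0.51]  # assumed ranking scores, higher is better
quality = [0.58, 0.40, 0.71, 0.66]    # assumed true quality, higher is better

print(spearman(predicted, quality))
print(selection_loss(predicted, quality))
```

A selection loss of zero means the top-ranked model is also the best model in the pool. The rank correlation measures how well the whole ordering is recovered, so the two criteria can disagree for the same scoring function.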
Current Protein & Peptide Science
Title: A Sampling-Based Method for Ranking Protein Structural Models by Integrating Multiple Scores and Features
Volume: 12 Issue: 6
Author(s): Xiaohu Shi, Jingfen Zhang, Zhiquan He, Yi Shang and Dong Xu
Cite this article as:
Shi Xiaohu, Zhang Jingfen, He Zhiquan, Shang Yi and Xu Dong, A Sampling-Based Method for Ranking Protein Structural Models by Integrating Multiple Scores and Features, Current Protein & Peptide Science 2011; 12 (6). https://dx.doi.org/10.2174/138920311796957658
DOI: https://dx.doi.org/10.2174/138920311796957658
Print ISSN: 1389-2037
Publisher Name: Bentham Science Publisher
Online ISSN: 1875-5550
