Abstract
As the gap between the number of known protein sequences and the number of solved structures continues to widen, in silico protein structure prediction plays an increasingly critical role in the life sciences. The biennial Critical Assessment of protein Structure Prediction (CASP) experiments, the most authoritative benchmark in the field, show that most current prediction methods succeed in certain areas, such as comparative modeling. However, these methods sometimes produce incomplete models that require further refinement. The present study therefore designed an automated multi-template combination algorithm to perform such refinement. A total of 59 proteins released during the CASP9 prediction season (human group) were selected as experimental targets. Four prediction methods, HHpred, Pcons, Modeller, and SAM, were used to generate protein models, of which 318 were incomplete. For each incomplete model, the automated multi-template combination algorithm located the missing structures in other models, combined them with the original model, and produced a new recombined model. Our results indicated that the quality of 95.56% of these 318 models improved after combination, and the improvement was statistically significant. This study therefore provides an effective method for improving protein model quality.
Keywords: CASP, model quality, model refinement, multi-template combination, protein structure prediction.