Abstract
Protein structure prediction is considered a main challenge in computational biology. The biennial international competition, Critical Assessment of protein Structure Prediction (CASP), has shown in its eleventh experiment that predictions for free modelling targets still fall short of reliable accuracy; much effort is therefore needed to improve ab initio methods. Rosetta is arguably the most competitive method for targets with no homologues. Relying on fragments of length 9 and 3 taken from known structures, Rosetta creates putative structures by assembling candidate fragments. Generally, the structure with the lowest energy score, also known as the first model, is chosen as the “predicted one”.
A thorough study has been conducted on the role and diversity of 3-mers involved in Rosetta’s model “refinement” phase. Using the standard number of 3-mers (i.e. 200) has been shown to degrade alpha and alpha-beta protein conformations initially achieved by assembling 9-mers. Therefore, a new prediction pipeline is proposed for Rosetta in which the “refinement” phase is customised according to a target’s predicted structural class. An improvement of over 8% in first model structure accuracy is reported for the alpha and alpha-beta classes when the number of 3-mers is decreased.
Keywords: Rosetta, ab initio protein structure prediction, fragment-based protein structure prediction, CATH, protein structural class, 9-mers, 3-mers.
Graphical Abstract