Abstract
Explicit treatment of recurrent structural motifs, which encode local structural bias, has proven to be an important ingredient in de novo protein structure prediction. The large majority of methods for local structure are based on building blocks, which suffer from their inherently discrete nature. Instead of using building blocks, this work presents a new protocol framework for predicting local structural motifs, based on locating motifs directly along the protein sequence and sampling them probabilistically in continuous (φ,ψ) space. The protein sequence was first scanned with a sliding-window algorithm using variable window lengths of 7 to 19 residues, to match local segments to one of 82 motif patterns in the fragment library. Identified segments were then labeled, and the correlations of their backbone torsion angles were modeled with a mixture of bivariate cosine distributions in continuous (φ,ψ) space. Finally, 3D conformations of the corresponding segments were sampled by applying a backtrack algorithm to a hidden Markov model with a single (φ,ψ) output. For local motifs in the 50 proteins of the test set, about 62% of the eight-residue segments located with high confidence were predicted within 1.5 Å of their native structures. The majority of local structural motifs were identified and sampled, which indicates that the proposed protocol may at least serve as a foundation for better protein tertiary structure prediction.
Keywords: Directional statistics distribution, hidden Markov model, I-sites library, motifs identification, protein conformational sampling, structural motifs, Protein Structural Motifs, de novo protein, Protocol Framework, HMM, Parameter Estimation, 3.1. Local Structural Motifs Prediction, RMSD, MAE, Backtrack Sampling Algorithm
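As a rough illustration of two ingredients named in the abstract, the following sketch enumerates variable-length windows of 7 to 19 residues along a sequence and evaluates the unnormalized log-density of a single bivariate cosine component. It is a minimal sketch under stated assumptions, not the protocol's implementation: the function names `sliding_windows` and `log_cosine_kernel` are hypothetical, the parameterization (means μ, ν and concentrations κ1, κ2, κ3) is the commonly used form of the bivariate von Mises cosine model and is assumed here, and matching against the 82 motif patterns, the mixture weights, and the hidden Markov model are all omitted.

```python
import math
from typing import Iterator, Tuple

# Window lengths stated in the abstract (inclusive bounds).
MIN_LEN, MAX_LEN = 7, 19


def sliding_windows(sequence: str,
                    min_len: int = MIN_LEN,
                    max_len: int = MAX_LEN) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, segment) for every window of every allowed length.

    Hypothetical helper: scoring each segment against motif patterns
    is not shown here.
    """
    for length in range(min_len, max_len + 1):
        # range() is empty when the sequence is shorter than the window.
        for start in range(len(sequence) - length + 1):
            yield start, sequence[start:start + length]


def log_cosine_kernel(phi: float, psi: float,
                      mu: float, nu: float,
                      k1: float, k2: float, k3: float) -> float:
    """Unnormalized log-density of one bivariate cosine component.

    Assumed form (angles in radians):
        k1*cos(phi - mu) + k2*cos(psi - nu) - k3*cos(phi - mu - psi + nu)
    The normalizing constant is deliberately left out.
    """
    return (k1 * math.cos(phi - mu)
            + k2 * math.cos(psi - nu)
            - k3 * math.cos(phi - mu - psi + nu))


if __name__ == "__main__":
    seq = "ACDEFGHIKLMNPQRSTVWY"  # arbitrary 20-residue example sequence
    windows = list(sliding_windows(seq))
    print(len(windows), windows[0])
    # Example component centered at hypothetical helix-like angles.
    mu, nu = math.radians(-60.0), math.radians(-45.0)
    print(log_cosine_kernel(mu, nu, mu, nu, 10.0, 10.0, 2.0))
```

Because the kernel depends on the angles only through cosines of differences, it is periodic in both φ and ψ, which is the property that makes this family suitable for torsion angles in a continuous space.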