Machine Learning Algorithms for Predicting Protein Folding Rates and Stability of Mutant Proteins: Comparison with Statistical Methods

M. Michael Gromiha; Liang-Tsung Huang

doi:10.2174/138920311796957630

Abstract

Machine learning algorithms have a wide range of applications in bioinformatics and computational biology, such as the prediction of protein secondary structures, solvent accessibility, binding site residues in protein complexes, protein folding rates, and the stability of mutant proteins, as well as the discrimination of proteins based on their structure and function. In this work, we focus on two prediction problems: (i) protein folding rates and (ii) the stability of proteins upon mutation. We briefly introduce the concepts of protein folding rates and stability, along with the available databases, the features used by prediction methods, and the measures used to assess prediction performance. Subsequently, we outline the development of structure-based parameters and their relationship with protein folding rates. These structure-based parameters help to explain the physical basis of protein folding and stability. Further, we describe the basic principles of the major machine learning techniques and illustrate their application to predicting protein folding rates and the stability of mutant proteins. Machine learning techniques can achieve the highest accuracy in predicting protein folding rates and stability. In essence, statistical methods and machine learning algorithms complement each other in understanding and predicting protein folding rates and the stability of protein mutants. The available online resources on protein folding rates and stability are also listed.

Keywords: Protein folding rates, protein stability, structure based parameters, machine learning techniques, three-dimensional structure, polypeptide chain, protein folding problem, magnitude, spectroscopy, Folding rate