Abstract
Background: Protein energy functions are developed to quantitatively capture the features of the physical interactions that govern protein folding and underpin structure prediction.
Objective: An accurate energy function is vital for discriminating native-like protein structures from decoys. To this end, we develop an accurate energy function that relies on careful modelling of the reference state.
Method: We propose a novel energy function based on a three-dimensional ideal gas reference state, built on three distinct groups of hydrophobic-hydrophilic interactions between amino acids. The three groups, namely hydrophobic versus hydrophilic, hydrophobic versus hydrophobic, and hydrophilic versus hydrophilic, are controlled via a three-dimensional set of optimized alpha values. Using a genetic algorithm, we optimized the contribution of each of the three groups, along with the z-score, to discriminate native structures from decoys.
Results: This approach lets us segregate the statistics by interaction group, which in turn lets us model the interactions more accurately, without grossly averaging their impact as is done in the well-known ideal gas reference state based approach. To compute the energy scores, we use a database of 4332 known protein structures obtained from the Protein Data Bank.
Conclusion: Our energy function is highly competitive with state-of-the-art approaches and outperforms the nearest competitor by 40.9% on the most challenging Rosetta decoy-set.
Keywords: Decoy-set, energy function, genetic algorithm, optimization, protein structure.
Graphical Abstract