Abstract
Protein folding rate is a valuable clue for understanding variations in protein folding kinetics, and the ability to accurately discriminate changes in folding rate is very helpful in protein design. However, few studies have examined the influence of amino acid substitution on protein folding rates. In our earlier studies, we constructed a dataset of 467 amino acid substitution mutants and proposed novel methods for discriminating and predicting accelerating and decelerating mutants during the folding process. This study aimed to develop simple rules for discriminating accelerating mutants from decelerating ones upon single amino acid substitution. The main steps were to build a more general dataset, F661, containing 661 mutants; to analyze the dataset systematically; and to apply different data mining techniques to build discrimination rules. The rules obtained from the different methods were then interpreted, evaluated, compared, and integrated. The results indicate that the present approach can effectively develop simple rules from these mutants and that the quality of the rules may be improved by combining the statistical and learning methods. These results suggest that the present method, as well as the rules, may advance the understanding of how to discriminate protein folding rate changes.
The details of the rules, along with relevant information, have been integrated and are freely available at http://bioinformatics.myweb.hinet.net/rulefr.htm
Keywords: Data mining, discriminating rule, protein folding rate, single amino acid substitution.