Abstract
Glutathione S-transferase (GST) proteins play a vital role in living organisms, including detoxification of exogenous and endogenous chemicals and survival under stress conditions. This paper describes a method developed for predicting GST proteins. We used a dataset of 107 GST and 107 non-GST proteins for training, and the performance of the method was evaluated using five-fold cross-validation. First, a support vector machine (SVM) based method was developed using amino acid and dipeptide composition, achieving maximum accuracies of 91.59% and 95.79%, respectively. In addition, we developed an SVM-based method using tripeptide composition, which achieved a maximum accuracy of 97.66%, better than the accuracy achieved by hidden Markov model (HMM) based searching (96.26%). Based on this study, a web server, GSTPred, has been developed (http://www.imtech.res.in/raghava/gstpred/).
Keywords: GST protein, support vector machine, artificial intelligence, sensitivity, specificity, correlation
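
The amino acid, dipeptide and tripeptide composition features mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `kmer_composition`, the percent scaling, and the choice to skip windows containing non-standard residues are assumptions made for illustration only. For k = 1, 2 and 3 the resulting vectors have 20, 400 and 8000 elements respectively, which is the kind of fixed-length input an SVM requires.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes (alphabetical order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k):
    """Return the percent composition of every possible k-mer.

    k=1 gives amino acid composition (20 values), k=2 dipeptide
    composition (400 values), k=3 tripeptide composition (8000 values).
    Assumption: windows containing non-standard residues are skipped
    and do not count toward the total.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    total = 0
    # Slide a window of length k along the sequence, one residue at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid windows at all.
        return [0.0] * len(kmers)
    return [100.0 * counts[m] / total for m in kmers]


if __name__ == "__main__":
    toy = "MKVLAAGIVG"  # hypothetical toy sequence, not from the dataset
    for k in (1, 2, 3):
        vec = kmer_composition(toy, k)
        print(k, len(vec), round(sum(vec), 2))
```

Each vector could then be passed, together with a GST or non-GST label, to any SVM implementation. The kernel and parameters used in the study are not stated in the abstract and are therefore not assumed here.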