Abstract
Protein methylation is one of the most important post-translational modifications. It typically occurs on arginine or lysine residues in the protein sequence. In biological systems, methylation is catalyzed by enzymes and is involved in the modification of heavy metals, regulation of gene expression, regulation of protein function, and RNA metabolism. Accurate identification of methylation sites is therefore important. Traditional experimental approaches to identifying these sites are accurate, but they are labor-intensive and time-consuming. As a result, computational methods have received increasing attention in recent years because of their convenience and speed. In this study, we develop a computational approach to predict methylarginine and methyllysine sites. First, sequences are encoded with a scheme based on the composition of k-spaced amino acid pairs (CKSAAP). Then, a support vector machine (SVM) is used as the predictor. Experimental results show that our method achieves an average prediction accuracy of 87.46%, sensitivity of 99.09%, and specificity of 86.89% for arginine methylation sites, and an average prediction accuracy of 88.78%, sensitivity of 93.75%, and specificity of 81.79% for lysine methylation sites, which is better than the results of other state-of-the-art predictors. The online service is implemented in Java 1.4.2 and is freely available at http://202.198.129.219:8080/cksaap_methsite.
Keywords: Methylation sites, CKSAAP_Methsite, SVM, performance.
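
To make the encoding step concrete, the following is a minimal sketch of a CKSAAP-style feature vector for a single sequence window. It is an illustration only, not the implementation behind the online service: the function name cksaap_encode, the 20-letter alphabet, the default k_max, the normalization by the number of pair positions, and the example peptide are all assumptions introduced here. The resulting vector is the kind of input that an SVM classifier could then be trained on.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids, giving 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap_encode(window, k_max=4):
    """Return a CKSAAP-style feature vector for one sequence window.

    For each gap k in 0..k_max, count every ordered residue pair (a, b)
    that has exactly k residues between a and b, then divide each count
    by the number of pair positions available at that gap.
    The choice of k_max and of this normalization is an assumption.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # ignore pairs with non-standard symbols
                counts[pair] += 1
        denom = n_positions if n_positions > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


# Hypothetical 9-residue window centered on an arginine.
vec = cksaap_encode("MKRGAARKL", k_max=2)
print(len(vec))
```

Each value of k contributes 400 features, so the vector length is 400 × (k_max + 1). Larger k_max values capture longer-range residue correlations at the cost of a higher-dimensional, sparser input.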