Current Genomics

ISSN (Print): 1389-2029
ISSN (Online): 1875-5488

Research Article

LipoSVM: Prediction of Lysine Lipoylation in Proteins Based on the Support Vector Machine

Author(s): Meiqi Wu, Pengchao Lu, Yingxi Yang, Liwen Liu, Hui Wang, Yan Xu and Jixun Chu*

Volume 20, Issue 5, 2019

Page: [362 - 370] Pages: 9

DOI: 10.2174/1389202919666191014092843

Abstract

Background: Lysine lipoylation, a rare and highly conserved post-translational modification of proteins, is considered one of the most important processes in biology. Identifying lysine lipoylation sites is the key to a comprehensive understanding of the regulatory mechanism of this modification. Because experimental methods are expensive and laborious, computational methods for predicting lipoylation sites are urgently needed.

Methodology: In this work, a predictor named LipoSVM is developed to accurately predict lipoylation sites. To address the class imbalance in the data, the synthetic minority over-sampling technique (SMOTE) is used to balance the positive and negative samples. In addition, training sets with different ratios of positive to negative samples are constructed.
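
The oversampling step described above can be illustrated with a minimal pure-Python sketch of the core SMOTE idea: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function names `smote` and `balance`, the neighbour count `k=5`, and the assumption that the positive (lipoylated) samples are the minority class are all illustrative choices that do not come from the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (the core idea of SMOTE).
    Hypothetical helper; k=5 is an assumed default."""
    if n_new == 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(positives, negatives, ratio=1.0, seed=0):
    """Oversample positives (assumed minority) up to ratio * len(negatives)."""
    target = int(round(ratio * len(negatives)))
    n_new = max(0, target - len(positives))
    return positives + smote(positives, n_new, seed=seed), negatives
```

Calling `balance(pos, neg)` with the default `ratio=1.0` yields a 1:1 training set. Other values of `ratio` give different proportions of positive to negative samples.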

Results: After comparing five encoding schemes and five classification algorithms, the final LipoSVM model is built from a training set with a 1:1 ratio of positive to negative samples, combining position-specific scoring matrix features with a support vector machine. The best model achieves an accuracy of 99.98% and an AUC of 0.9996 in 10-fold cross-validation. On the independent test set, the AUC reaches 0.9997, which demonstrates the robustness of LipoSVM. A comparison of lysine lipoylation and non-lipoylation fragments shows statistically significant differences.
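
For readers unfamiliar with the evaluation terms used above, the following pure-Python sketch shows one way to assign samples to stratified folds and to compute accuracy and AUC from labels and classifier scores. It is illustrative only and does not reproduce the reported numbers. The helper names `stratified_folds`, `accuracy`, and `auc` are hypothetical, and whether the authors stratified their folds is not stated in the abstract.

```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to n_folds folds, spreading each class evenly.
    Stratification is an assumption, not a detail given in the abstract."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a 10-fold run, each fold in turn serves as the held-out set while a classifier is trained on the other nine, and the metrics are averaged or pooled over the folds. The pairwise AUC here is quadratic in the number of samples, which is acceptable for a sketch but slow for large datasets.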

Conclusion: An effective predictor for lysine lipoylation is built based on the position-specific scoring matrix and the support vector machine. LipoSVM is freely available for download at https://github.com/stars20180811/LipoSVM.

Keywords: Lysine lipoylation, prediction, amino acids, support vector machine, post-translational modifications, scoring matrix.

© 2024 Bentham Science Publishers