Abstract
Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes, including transcriptional regulation and signal transduction. SUMO-modified proteins can change their subcellular localization, activity, and stability. Abnormal sumoylation is involved in many diseases, including neurodegenerative and immune-related diseases, as well as the development of cancer. Therefore, identifying sumoylation sites (SUMO sites) is fundamental to understanding the molecular mechanisms and regulatory roles of sumoylation. In contrast to labor-intensive and costly experimental approaches, in silico prediction of sumoylation sites has attracted much attention for its accuracy, convenience, and speed. Many computational models have been developed to identify SUMO sites, but they have not yet been comprehensively summarized and reviewed. This paper therefore summarizes and discusses the research progress of these models, focusing mainly on benchmark dataset construction, feature extraction, machine learning methods, published results, and online tools. We hope that this review will be helpful to wet-lab researchers.
Keywords: SUMO modification, feature selection, machine learning, classification, post-translational modification, sequential forward selection.