Recent Development of Machine Learning Methods in Sumoylation
Sites Prediction

doi:10.2174/0929867328666210915112030
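Abstract

Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes. SUMO-modified proteins can change their subcellular localization, activity, and stability. In addition, sumoylation plays an important role in cellular processes such as transcriptional regulation and signal transduction. Aberrant sumoylation is associated with many diseases, including neurodegenerative and immune-related diseases, as well as the development of cancer. Identifying sumoylation sites (SUMO sites) is therefore fundamental to understanding their molecular mechanisms and regulatory roles. Compared with labor-intensive and expensive experimental approaches, in silico prediction of sumoylation sites has also attracted much attention for its accuracy, convenience, and speed. Many computational prediction models have been used to identify SUMO sites, but they have not yet been comprehensively summarized and reviewed. This review therefore summarizes and discusses the research progress on these models. We briefly summarize the development of bioinformatics methods for sumoylation site prediction, focusing on benchmark dataset construction, feature extraction, machine learning methods, published results, and online tools. We hope that this review will be of help to wet-lab researchers.
Keywords: SUMO modification, feature selection, machine learning, classification, post-translational modification, sequential forward selection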
[1] 
Geiss-Friedlander, R.; Melchior, F. Concepts in sumoylation: A decade on. Nat. Rev. Mol. Cell Biol.,  2007, 8(12), 947-956.
[http://dx.doi.org/10.1038/nrm2293] [PMID: 18000527] 
[2] 
Huo, H.; Li, T.; Wang, S.; Lv, Y.; Zuo, Y.; Yang, L. Prediction of presynaptic and postsynaptic neurotoxins by combining various Chou’s pseudo components. Sci. Rep.,  2017, 7(1), 5827.
[http://dx.doi.org/10.1038/s41598-017-06195-y] [PMID: 28724993] 
[3] 
Hasan, M.A.M.; Islam, M.K.B.; Julia Rahman, J.; Ahmad, S. Citrullination Site Prediction by Incorporating Sequence Coupled Effects into PseAAC and Resolving Data Imbalance Issue. Curr. Bioinform.,  2020, 15(3), 235-245.
[http://dx.doi.org/10.2174/1574893614666191202152328] 
[4] 
Seeler, J.S.; Dejean, A. Nuclear and unclear functions of SUMO. Nat. Rev. Mol. Cell Biol.,  2003, 4(9), 690-699.
[http://dx.doi.org/10.1038/nrm1200] [PMID: 14506472] 
[5] 
Steffan, J.S.; Agrawal, N.; Pallos, J.; Rockabrand, E.; Trotman, L.C.; Slepko, N.; Illes, K.; Lukacsovich, T.; Zhu, Y.Z.; Cattaneo, E.; Pandolfi, P.P.; Thompson, L.M.; Marsh, J.L. SUMO modification of Huntingtin and Huntington’s disease pathology. Science,  2004, 304(5667), 100-104.
[http://dx.doi.org/10.1126/science.1092194] [PMID: 15064418] 
[6] 
Princz, A.; Tavernarakis, N. SUMOylation in Neurodegenerative Diseases. Gerontology,  2020, 66(2), 122-130.
[http://dx.doi.org/10.1159/000502142] [PMID: 31505513] 
[7] 
Lee, L.; Sakurai, M.; Matsuzaki, S.; Arancio, O.; Fraser, P. SUMO and Alzheimer’s disease. Neuromolecular Med.,  2013, 15(4), 720-736.
[http://dx.doi.org/10.1007/s12017-013-8257-7] [PMID: 23979993] 
[8] 
Liu, G.; Jin, S.; Hu, Y.; Jiang, Q. Disease status affects the association between rs4813620 and the expression of Alzheimer’s disease susceptibility gene TRIB3. Proc. Natl. Acad. Sci. USA,  2018, 115(45), E10519-E10520.
[http://dx.doi.org/10.1073/pnas.1812975115] [PMID: 30355771] 
[9] 
Liu, G.; Zhang, Y.; Wang, L.; Xu, J.; Chen, X.; Bao, Y.; Hu, Y.; Jin, S.; Tian, R.; Bai, W.; Zhou, W.; Wang, T.; Han, Z.; Zong, J.; Jiang, Q. Alzheimer’s Disease rs11767557 Variant Regulates EPHA1 Gene Expression Specifically in Human Whole Blood. J. Alzheimers Dis.,  2018, 61(3), 1077-1088.
[http://dx.doi.org/10.3233/JAD-170468] [PMID: 29332039] 
[10] 
Dorval, V.; Fraser, P.E. Small ubiquitin-like modifier (SUMO) modification of natively unfolded proteins tau and alpha-synuclein. J. Biol. Chem.,  2006, 281(15), 9919-9924.
[http://dx.doi.org/10.1074/jbc.M510127200] [PMID: 16464864] 
[11] 
Jiang, Q.; Liu, G. Lack of association between MC1R variants and Parkinson’s disease in European descent. Ann. Neurol.,  2016, 79(5), 866-868.
[http://dx.doi.org/10.1002/ana.24627] 
[12] 
Yang, B.; Shen, J.; Xu, L.; Chen, Y.; Che, X.; Qu, X.; Liu, Y.; Teng, Y.; Li, Z. Genome-Wide Identification of a Novel Eight-lncRNA Signature to Improve Prognostic Prediction in Head and Neck Squamous Cell Carcinoma. Front. Oncol.,  2019, 9, 898.
[http://dx.doi.org/10.3389/fonc.2019.00898] [PMID: 31620361] 
[13] 
Xue, Y. SUMOsp: A web server for sumoylation site prediction. Nucleic Acids Res,  2006, 34(Web Server issue), W254-W257.
[http://dx.doi.org/10.1093/nar/gkl207] 
[14] 
Xue, Y. GPS: A comprehensive www server for phosphorylation sites prediction. Nucleic Acids Res,  2005, 33(Web Server issue), W184-W187.
[http://dx.doi.org/10.1093/nar/gki393] 
[15] 
Schwartz, D.; Gygi, S.P. An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets. Nat. Biotechnol.,  2005, 23(11), 1391-1398.
[http://dx.doi.org/10.1038/nbt1146] [PMID: 16273072] 
[16] 
Liu, B.; Li, S.; Wang, Y.; Lu, L.; Li, Y.; Cai, Y. Predicting the protein SUMO modification sites based on Properties Sequential Forward Selection (PSFS). Biochem. Biophys. Res. Commun.,  2007, 358(1), 136-139.
[http://dx.doi.org/10.1016/j.bbrc.2007.04.097] [PMID: 17470363] 
[17] 
Xu, J.; He, Y.; Qiang, B.; Yuan, J.; Peng, X.; Pan, X.M. A novel method for high accuracy sumoylation site prediction from protein sequences. BMC Bioinformatics,  2008, 9, 8.
[http://dx.doi.org/10.1186/1471-2105-9-8] [PMID: 18179724] 
[18] 
Ren, J.; Gao, X.; Jin, C.; Zhu, M.; Wang, X.; Shaw, A.; Wen, L.; Yao, X.; Xue, Y. Systematic study of protein sumoylation: Development of a site-specific predictor of SUMOsp 2.0. Proteomics,  2009, 9(12), 3409-3412.
[http://dx.doi.org/10.1002/pmic.200800646] [PMID: 29658196] 
[19] 
Teng, S.; Luo, H.; Wang, L. Predicting protein sumoylation sites from sequence features. Amino Acids,  2012, 43(1), 447-455.
[http://dx.doi.org/10.1007/s00726-011-1100-2] [PMID: 21986959] 
[20] 
Chen, Y.Z.; Chen, Z.; Gong, Y.A.; Ying, G. SUMOhydro: A novel method for the prediction of sumoylation sites based on hydrophobic properties. PLoS One,  2012, 7(6), e39195.
[http://dx.doi.org/10.1371/journal.pone.0039195] [PMID: 22720073] 
[21] 
Yavuz, A.S.; Sezerman, O.U. Predicting sumoylation sites using support vector machines based on various sequence features, conformational flexibility and disorder. BMC Genomics,  2014, 15(Suppl. 9), S18.
[http://dx.doi.org/10.1186/1471-2164-15-S9-S18] [PMID: 25521314] 
[22] 
Macauley, M.S.; Errington, W.J.; Okon, M.; Schärpf, M.; Mackereth, C.D.; Schulman, B.A.; McIntosh, L.P. Structural and dynamic independence of isopeptide-linked RanGAP1 and SUMO-1. J. Biol. Chem.,  2004, 279(47), 49131-49137.
[http://dx.doi.org/10.1074/jbc.M408705200] [PMID: 15355965] 
[23] 
Beauclair, G.; Bridier-Nahmias, A.; Zagury, J.F.; Saïb, A.; Zamborlini, A. JASSA: A comprehensive tool for prediction of SUMOylation sites and SIMs. Bioinformatics,  2015, 31(21), 3483-3491.
[http://dx.doi.org/10.1093/bioinformatics/btv403] [PMID: 26142185] 
[24] 
Sharma, A.; Lysenko, A.; López, Y.; Dehzangi, A.; Sharma, R.; Reddy, H.; Sattar, A.; Tsunoda, T. HseSUMO: Sumoylation site prediction using half-sphere exposures of amino acids residues. BMC Genomics,  2019, 19(Suppl. 9), 982.
[http://dx.doi.org/10.1186/s12864-018-5206-8] [PMID: 30999862] 
[25] 
Dehzangi, A.; López, Y.; Taherzadeh, G.; Sharma, A.; Tsunoda, T. SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure. Molecules,  2018, 23(12), E3260.
[http://dx.doi.org/10.3390/molecules23123260] [PMID: 30544729] 
[26] 
Chen, Z.; Liu, X.; Li, F.; Li, C.; Marquez-Lago, T.; Leier, A.; Akutsu, T.; Webb, G.I.; Xu, D.; Smith, A.I.; Li, L.; Chou, K.C.; Song, J. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites. Brief. Bioinform.,  2019, 20(6), 2267-2290.
[http://dx.doi.org/10.1093/bib/bby089] [PMID: 30285084] 
[27] 
Zhang, T.; Tan, P.; Wang, L.; Jin, N.; Li, Y.; Zhang, L.; Yang, H.; Hu, Z.; Zhang, L.; Hu, C.; Li, C.; Qian, K.; Zhang, C.; Huang, Y.; Li, K.; Lin, H.; Wang, D. RNALocate: A resource for RNA subcellular localizations. Nucleic Acids Res.,  2017, 45(D1), D135-D138.
[PMID: 27543076] 
[28] 
Liang, Z.Y.; Lai, H.Y.; Yang, H.; Zhang, C.J.; Yang, H.; Wei, H.H.; Chen, X.X.; Zhao, Y.W.; Su, Z.D.; Li, W.C.; Deng, E.Z.; Tang, H.; Chen, W.; Lin, H. Pro54DB: A database for experimentally verified sigma-54 promoters. Bioinformatics,  2017, 33(3), 467-469.
[PMID: 28171531] 
[29] 
Cheng, L.; Qi, C.; Zhuang, H.; Fu, T.; Zhang, X. gutMDisorder: A comprehensive database for dysbiosis of the gut microbiota in disorders and interventions. Nucleic Acids Res.,  2020, 48(D1), D554-D560.
[http://dx.doi.org/10.1093/nar/gkz843] [PMID: 31584099] 
[30] 
Hu, B.; Zheng, L.; Long, C.; Song, M.; Li, T.; Yang, L.; Zuo, Y. EmExplorer: A database for exploring time activation of gene expression in mammalian embryos. Open Biol.,  2019, 9(6), 190054.
[http://dx.doi.org/10.1098/rsob.190054] [PMID: 31164042] 
[31] 
Liu, B.; Gao, X.; Zhang, H. BioSeq-Analysis2.0: An updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches. Nucleic Acids Res.,  2019, 47(20), e127.
[http://dx.doi.org/10.1093/nar/gkz740] [PMID: 31504851] 
[32] 
Liu, Z.; Wang, Y.; Gao, T.; Pan, Z.; Cheng, H.; Yang, Q.; Cheng, Z.; Guo, A.; Ren, J.; Xue, Y. CPLM: A database of protein lysine modifications. Nucleic Acids Res.,  2014, 42(Database issue), D531-D536.
[http://dx.doi.org/10.1093/nar/gkt1093] [PMID: 24214993] 
[33] 
Bairoch, A.; Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res.,  2000, 28(1), 45-48.
[http://dx.doi.org/10.1093/nar/28.1.45] [PMID: 10592178] 
[34] 
Li, W.; Godzik, A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics,  2006, 22(13), 1658-1659.
[http://dx.doi.org/10.1093/bioinformatics/btl158] [PMID: 16731699] 
[35] 
Ahmed, M.S.; Shahjaman, M.; Kabir, E.; Kamruzzaman, M. Prediction of Protein Acetylation Sites using Kernel Naive Bayes Classifier Based on Protein Sequences Profiling. Bioinformation,  2018, 14(5), 213-218.
[http://dx.doi.org/10.6026/97320630014213] [PMID: 30108418] 
[36] 
Chang, C-C.; Tung, C.H.; Chen, C.W.; Tu, C.H.; Chu, Y.W. SUMOgo: Prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications. Sci. Rep.,  2018, 8(1), 15512.
[http://dx.doi.org/10.1038/s41598-018-33951-5] [PMID: 30341374] 
[37] 
Plewczynski, D.; Basu, S.; Saha, I. AMS 4.0: consensus prediction of post-translational modifications in protein sequences. Amino Acids,  2012, 43(2), 573-582.
[http://dx.doi.org/10.1007/s00726-012-1290-2] [PMID: 22555647] 
[38] 
Song, J.; Tan, H.; Shen, H.; Mahmood, K.; Boyd, S.E.; Webb, G.I.; Akutsu, T.; Whisstock, J.C. Cascleave: towards more accurate prediction of caspase substrate cleavage sites. Bioinformatics,  2010, 26(6), 752-760.
[http://dx.doi.org/10.1093/bioinformatics/btq043] [PMID: 20130033] 
[39] 
Song, J.; Tan, H.; Perry, A.J.; Akutsu, T.; Webb, G.I.; Whisstock, J.C.; Pike, R.N. PROSPER: An integrated feature-based tool for predicting protease substrate cleavage sites. PLoS One,  2012, 7(11), e50300.
[http://dx.doi.org/10.1371/journal.pone.0050300] [PMID: 23209700] 
[40] 
Song, J.; Burrage, K.; Yuan, Z.; Huber, T. Prediction of cis/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information. BMC Bioinformatics,  2006, 7, 124.
[http://dx.doi.org/10.1186/1471-2105-7-124] [PMID: 16526956] 
[41] 
Song, J.; Wang, Y.; Li, F.; Akutsu, T.; Rawlings, N.D.; Webb, G.I.; Chou, K.C. iProt-Sub: A comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief. Bioinform.,  2019, 20(2), 638-658.
[http://dx.doi.org/10.1093/bib/bby028] [PMID: 29897410] 
[42] 
Liu, B.; Zhu, Y.; Yan, K. Fold-LTR-TCP: protein fold recognition based on triadic closure principle. Brief. Bioinform.,  2020, 21(6), 2185-2193.
[http://dx.doi.org/10.1093/bib/bbz139] [PMID: 31813954] 
[43] 
Shao, J.; Yan, K.; Liu, B. FoldRec-C2C: protein fold recognition by combining cluster-to-cluster model and protein similarity network. Brief. Bioinform.,  2021, 22(3), bbaa144.
[http://dx.doi.org/10.1093/bib/bbaa144] [PMID: 32685972] 
[44] 
Kumar, M.; Gromiha, M.M.; Raghava, G.P. Prediction of RNA binding sites in a protein using SVM and PSSM profile. Proteins,  2008, 71(1), 189-194.
[http://dx.doi.org/10.1002/prot.21677] [PMID: 17932917] 
[45] 
Huang, G.H.; Li, J.C. Feature Extractions for Computationally Predicting Protein Post-Translational Modifications. Curr. Bioinform.,  2018, 13(4), 387-395.
[http://dx.doi.org/10.2174/1574893612666170707094916] 
[46] 
Wang, T.; Yang, J. Predicting subcellular localization of gram-negative bacterial proteins by linear dimensionality reduction method. Protein Pept. Lett.,  2010, 17(1), 32-37.
[http://dx.doi.org/10.2174/092986610789909494] [PMID: 19508203] 
[47] 
Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res.,  1997, 25(17), 3389-3402.
[http://dx.doi.org/10.1093/nar/25.17.3389] [PMID: 9254694] 
[48] 
Zheng, L.; Huang, S.; Mu, N.; Zhang, H.; Zhang, J.; Chang, Y.; Yang, L.; Zuo, Y. RAACBook: A web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule. Database (Oxford) 2019.,  2019, baz131.
[http://dx.doi.org/10.1093/database/baz131] 
[49] 
Zheng, L.; Liu, D.; Yang, W.; Yang, L.; Zuo, Y. RaacLogo: A new sequence logo generator by using reduced amino acid clusters. Brief. Bioinform.,  2021, 22(3), bbaa096.
[http://dx.doi.org/10.1093/bib/bbaa096] [PMID: 32524143] 
[50] 
Sandberg, M.; Eriksson, L.; Jonsson, J.; Sjöström, M.; Wold, S. New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J. Med. Chem.,  1998, 41(14), 2481-2491.
[http://dx.doi.org/10.1021/jm9700575] [PMID: 9651153] 
[51] 
Zhang, Z.Y.; Yang, Y.H.; Ding, H.; Wang, D.; Chen, W.; Lin, H. Design powerful predictor for mRNA subcellular location prediction in Homo sapiens. Brief. Bioinform.,  2020, 22(1), 526-535.
[http://dx.doi.org/10.1093/bib/bbz177] [PMID: 31994694] 
[52] 
Yang, H. A comparison and assessment of computational method for identifying recombination hotspots in Saccharomyces cerevisiae. Brief. Bioinform., 2019.
[http://dx.doi.org/10.1093/bib/bbz123] [PMID: 31633777] 
[53] 
Yao, Y. Recent Progress in Long Noncoding RNAs Prediction. Curr. Bioinform.,  2018, 13(4), 344-351.
[http://dx.doi.org/10.2174/1574893612666170905153933] 
[54] 
Liu, K.; Chen, W. iMRM: A platform for simultaneously identifying multiple kinds of RNA modifications. Bioinformatics,  2020, 36(11), 3336-3342.
[http://dx.doi.org/10.1093/bioinformatics/btaa155] [PMID: 32134472] 
[55] 
Liang, P.; Yang, W.; Chen, X.; Long, C.; Zheng, L.; Li, H.; Zuo, Y. Machine Learning of Single-Cell Transcriptome Highly Identifies mRNA Signature by Comparing F-Score Selection with DGE Analysis. Mol. Ther. Nucleic Acids,  2020, 20, 155-163.
[http://dx.doi.org/10.1016/j.omtn.2020.02.004] [PMID: 32169803] 
[56] 
Liu, B. BioSeq-Analysis: A platform for DNA, RNA and protein sequence analysis based on machine learning approaches. Brief. Bioinform.,  2019, 20(4), 1280-1294.
[http://dx.doi.org/10.1093/bib/bbx165] [PMID: 29272359] 
[57] 
Tang, H. Identification of Secretory Proteins of Malaria Parasite by Feature Selection Technique. Lett. Org. Chem.,  2017, 14(9), 621-624.
[http://dx.doi.org/10.2174/1570178614666170329155502] 
[58] 
Tang, H.; Yang, Y.; Zhang, C.; Chen, R.; Huang, P.; Duan, C.; Zou, P. Predicting Presynaptic and Postsynaptic Neurotoxins by Developing Feature Selection Technique. Biomed. Res. Int.,  2017, 2017, 3267325.
[http://dx.doi.org/10.1155/2017/3267325] 
[59] 
Yu, L.S.Y.; Zou, Q.; Wang, S.; Zheng, L.; Gao, L. Exploring Drug Treatment Patterns Based on the Action of Drug and Multilayer Network Model. Int. J. Mol. Sci.,  2020, 21(14), 5014.
[http://dx.doi.org/10.3390/ijms21145014] 
[60] 
Ao, C.; Jin, S.; Ding, H.; Zou, Q.; Yu, L. Application and Development of Artificial Intelligence and Intelligent Disease Diagnosis. Curr. Pharm. Des.,  2020, 26(26), 3069-3075.
[http://dx.doi.org/10.2174/1381612826666200331091156] [PMID: 32228416] 
[61] 
Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell.,  2005, 27(8), 1226-1238.
[http://dx.doi.org/10.1109/TPAMI.2005.159] [PMID: 16119262] 
[62] 
Dao, F.Y.; Lv, H.; Wang, F.; Feng, C.Q.; Ding, H.; Chen, W.; Lin, H. Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection technique. Bioinformatics,  2019, 35(12), 2075-2083.
[http://dx.doi.org/10.1093/bioinformatics/bty943] [PMID: 30428009] 
[63] 
Wang, S.P. Analysis and Prediction of Nitrated Tyrosine Sites with the mRMR Method and Support Vector Machine Algorithm. Curr. Bioinform.,  2018, 13(1), 3-13.
[http://dx.doi.org/10.2174/1574893611666160608075753] 
[64] 
Zuo, Y.; Li, Y.; Chen, Y.; Li, G.; Yan, Z.; Yang, L. PseKRAAC: A flexible web server for generating pseudo K-tuple reduced amino acids composition. Bioinformatics,  2017, 33(1), 122-124.
[http://dx.doi.org/10.1093/bioinformatics/btw564] [PMID: 27565583] 
[65] 
Zuo, Y.; Chang, Y.; Huang, S.; Zheng, L.; Yang, L.; Cao, G. iDEF-PseRAAC: Identifying the Defensin Peptide by Using Reduced Amino Acid Composition Descriptor. Evol. Bioinform. Online,  2019, 15, 1176934319867088.
[http://dx.doi.org/10.1177/1176934319867088] [PMID: 31391777] 
[66] 
Frank, E.; Hall, M.; Trigg, L.; Holmes, G.; Witten, I.H. Data mining in bioinformatics using Weka. Bioinformatics,  2004, 20(15), 2479-2481.
[http://dx.doi.org/10.1093/bioinformatics/bth261] [PMID: 15073010] 
[67] 
Xu, Z.C.; Feng, P.M.; Yang, H.; Qiu, W.R.; Chen, W.; Lin, H. iRNAD: A computational tool for identifying D modification sites in RNA sequence. Bioinformatics,  2019, 35(23), 4922-4929.
[http://dx.doi.org/10.1093/bioinformatics/btz358] [PMID: 31077296] 
[68] 
Tan, J.X.; Li, S.H.; Zhang, Z.M.; Chen, C.X.; Chen, W.; Tang, H.; Lin, H. Identification of hormone binding proteins based on machine learning methods. Math. Biosci. Eng.,  2019, 16(4), 2466-2480.
[http://dx.doi.org/10.3934/mbe.2019123] [PMID: 31137222] 
[69] 
Lin, H. Identification of hormone binding proteins based on machine learning methods. Mathematical Biosciences and Engineering,  2019, 16(4), 2466-2480.
[http://dx.doi.org/10.1109/TCBB.2017.2666141] 
[70] 
Dao, F.Y.; Lv, H.; Yang, Y.H.; Zulfiqar, H.; Gao, H.; Lin, H. Computational identification of N6-methyladenosine sites in multiple tissues of mammals. Comput. Struct. Biotechnol. J.,  2020, 18, 1084-1091.
[http://dx.doi.org/10.1016/j.csbj.2020.04.015] [PMID: 32435427] 
[71] 
Bu, H.D. Predicting Enhancers from Multiple Cell Lines and Tissues across Different Developmental Stages Based On SVM Method. Curr. Bioinform.,  2018, 13(6), 655-660.
[http://dx.doi.org/10.2174/1574893613666180726163429] 
[72] 
Chen, W.; Feng, P.; Song, X.; Lv, H.; Lin, H. iRNA-m7G: Identifying N7-methylguanosine Sites by Fusing Multiple Features. Mol. Ther. Nucleic Acids,  2019, 18, 269-274.
[http://dx.doi.org/10.1016/j.omtn.2019.08.022] [PMID: 31581051] 
[73] 
Liu, B.; Li, K. iPromoter-2L2.0: identifying promoters and their types by combining Smoothing Cutting Window algorithm and sequence-based features. Mol. Ther. Nucleic Acids,  2019, 18, 80-87.
[http://dx.doi.org/10.1016/j.omtn.2019.08.008] [PMID: 31536883] 
[74] 
Boopathi, V.; Subramaniyam, S.; Malik, A.; Lee, G.; Manavalan, B.; Yang, D.C. mACPpred: A Support Vector Machine-Based Meta-Predictor for Identification of Anticancer Peptides. Int. J. Mol. Sci.,  2019, 20(8), E1964.
[http://dx.doi.org/10.3390/ijms20081964] [PMID: 31013619] 
[75] 
Manavalan, B.; Basith, S.; Shin, T.H.; Wei, L.; Lee, G. Meta-4mCpred: A Sequence-Based Meta-Predictor for Accurate DNA 4mC Site Prediction Using Effective Feature Representation. Mol. Ther. Nucleic Acids,  2019, 16, 733-744.
[http://dx.doi.org/10.1016/j.omtn.2019.04.019] [PMID: 31146255] 
[76] 
Manavalan, B.; Lee, J. SVMQA: support-vector- machine-based protein single-model quality assessment. Bioinformatics,  2017, 33(16), 2496-2503.
[http://dx.doi.org/10.1093/bioinformatics/btx222] [PMID: 28419290] 
[77] 
Manavalan, B.; Shin, T.H.; Lee, G. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine. Front. Microbiol.,  2018, 9, 476.
[http://dx.doi.org/10.3389/fmicb.2018.00476] [PMID: 29616000] 
[78] 
Manavalan, B.; Shin, T.H.; Lee, G. DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest. Oncotarget,  2017, 9(2), 1944-1956.
[http://dx.doi.org/10.18632/oncotarget.23099] [PMID: 29416743] 
[79] 
Stephenson, N.; Shane, E.; Chase, J.; Rowland, J.; Ries, D.; Justice, N.; Zhang, J.; Chan, L.; Cao, R. Survey of Machine Learning Techniques in Drug Discovery. Curr. Drug Metab.,  2019, 20(3), 185-193.
[http://dx.doi.org/10.2174/1389200219666180820112457] [PMID: 30124147] 
[80] 
Yu, L.; Xu, F.; Gao, L. Predict New Therapeutic Drugs for Hepatocellular Carcinoma Based on Gene Mutation and Expression. Front. Bioeng. Biotechnol.,  2020, 8, 8.
[http://dx.doi.org/10.3389/fbioe.2020.00008] [PMID: 32047745] 
[81] 
Su, R.; Wu, H.; Xu, B.; Liu, X.; Wei, L. Developing a Multi-Dose Computational Model for Drug-induced Hepatotoxicity Prediction based on Toxicogenomics Data. IEEE/ACM Trans. Comput. Biol. Bioinformatics,  2019, 16(4), 1231-1239.
[PMID: 30040651] 
[82] 
Wei, L.; Zhou, C.; Chen, H.; Song, J.; Su, R. ACPred-FL: A sequence-based predictor using effective feature representation to improve the prediction of anti-cancer peptides. Bioinformatics,  2018, 34(23), 4007-4016.
[http://dx.doi.org/10.1093/bioinformatics/bty451] [PMID: 29868903] 
[83] 
Jiang, Q.; Wang, G.; Jin, S.; Li, Y.; Wang, Y. Predicting human microRNA-disease associations based on support vector machine. Int. J. Data Min. Bioinform.,  2013, 8(3), 282-293.
[http://dx.doi.org/10.1504/IJDMB.2013.056078] [PMID: 24417022] 
[84] 
Zhu, Y.H.; Hu, J.; Qi, Y.; Song, X.N.; Yu, D.J. Boosting Granular Support Vector Machines for the Accurate Prediction of Protein-Nucleotide Binding Sites. Comb. Chem. High Throughput Screen.,  2019, 22(7), 455-469.
[http://dx.doi.org/10.2174/1386207322666190925125524] [PMID: 31553288] 
[85] 
Hou, J.; Gao, H.; Xia, Q.; Qi, N. Feature Combination and the kNN Framework in Object Classification. IEEE Trans. Neural Netw. Learn. Syst.,  2016, 27(6), 1368-1378.
[http://dx.doi.org/10.1109/TNNLS.2015.2461552] [PMID: 26316223] 
[86] 
Du, X.Q. Identification and Analysis of Cancer Diagnosis Using Probabilistic Classification Vector Machines with Feature Selection. Curr. Bioinform.,  2018, 13(6), 625-632.
[http://dx.doi.org/10.2174/1574893612666170405125637] 
[87] 
Ozkan, A. Benchmarking Classification Models for Cell Viability on Novel Cancer Image Datasets. Curr. Bioinform.,  2019, 14(2), 108-114.
[http://dx.doi.org/10.2174/1574893614666181120093740] 
[88] 
Dehzangi, A. A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem. IEEE/ACM Trans Comput Biol Bioinform,  2013, 10(3), 564-575.
[http://dx.doi.org/10.1109/TCBB.2013.65] 
[89] 
Lv, H. iDNA-MS: An Integrated Computational Tool for Detecting DNA Modification Sites in Multiple Genomes. iScience,  2020, 23(4), 100991.
[90] 
Zhao, X. Predicting Drug Side Effects with Compact Integration of Heterogeneous Networks. Curr. Bioinform.,  2019, 14(8), 709-720.
[http://dx.doi.org/10.2174/1574893614666190220114644] 
[91] 
Cheng, L.; Zhao, H.; Wang, P.; Zhou, W.; Luo, M.; Li, T.; Han, J.; Liu, S.; Jiang, Q. Computational Methods for Identifying Similar Diseases. Mol. Ther. Nucleic Acids,  2019, 18, 590-604.
[http://dx.doi.org/10.1016/j.omtn.2019.09.019] [PMID: 31678735] 
[92] 
Cheng, L.; Hu, Y. Human Disease System Biology. Curr. Gene Ther.,  2018, 18(5), 255-256.
[http://dx.doi.org/10.2174/1566523218666181010101114] [PMID: 30306867] 
[93] 
Manavalan, B.; Govindaraj, R.G.; Shin, T.H.; Kim, M.O.; Lee, G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front. Immunol.,  2018, 9, 1695.
[http://dx.doi.org/10.3389/fimmu.2018.01695] [PMID: 30100904] 
[94] 
Manavalan, B.; Lee, J.; Lee, J. Random forest-based protein model quality assessment (RFMQA) using structural features and potential energy terms. PLoS One,  2014, 9(9), e106542.
[http://dx.doi.org/10.1371/journal.pone.0106542] [PMID: 25222008] 
[95] 
Manavalan, B.; Shin, T.H.; Kim, M.O.; Lee, G. PIP-EL: A New Ensemble Learning Method for Improved Proinflammatory Peptide Predictions. Front. Immunol.,  2018, 9, 1783.
[http://dx.doi.org/10.3389/fimmu.2018.01783] [PMID: 30108593] 
[96] 
Ao, C.; Zhou, W.; Gao, L.; Dong, B.; Yu, L. Prediction of antioxidant proteins using hybrid feature representation method and random forest. Genomics,  2020, 112(6), 4666-4674.
[http://dx.doi.org/10.1016/j.ygeno.2020.08.016] [PMID: 32818637] 
[97] 
Basith, S.; Manavalan, B.; Hwan Shin, T.; Lee, G. Machine intelligence in peptide therapeutics: A next-generation tool for rapid disease screening. Med. Res. Rev.,  2020, 40(4), 1276-1314.
[http://dx.doi.org/10.1002/med.21658] [PMID: 31922268] 
[98] 
Basith, S.; Manavalan, B.; Shin, T.H.; Lee, G. iGHBP: Computational identification of growth hormone binding proteins from sequences using extremely randomised tree. Comput. Struct. Biotechnol. J.,  2018, 16, 412-420.
[http://dx.doi.org/10.1016/j.csbj.2018.10.007] [PMID: 30425802] 
[99] 
Basith, S.; Manavalan, B.; Shin, T.H.; Lee, G. SDM6A: A Web-Based Integrative Machine-Learning Framework for Predicting 6mA Sites in the Rice Genome. Mol. Ther. Nucleic Acids,  2019, 18, 131-141.
[http://dx.doi.org/10.1016/j.omtn.2019.08.011] [PMID: 31542696] 
[100] 
Charoenkwan, P.; Kanthawong, S.; Nantasenamat, C.; Hasan, M.M.; Shoombuatong, W. iAMY-SCM: Improved prediction and analysis of amyloid proteins using a scoring card method with propensity scores of dipeptides. Genomics,  2021, 113(1 Pt 2), 689-698.
[http://dx.doi.org/10.1016/j.ygeno.2020.03.019] [PMID: 33017626] 
[101] 
Charoenkwan, P.; Kanthawong, S.; Nantasenamat, C.; Hasan, M.M.; Shoombuatong, W. iDPPIV-SCM: A sequence-based predictor for identifying and analyzing dipeptidyl peptidase IV (DPP-IV) inhibitory peptides using a scoring card method. J. Proteome Res.,  2020, 19(10), 4125-4136.
[http://dx.doi.org/10.1021/acs.jproteome.0c00590] [PMID: 32897718] 
[102] 
Charoenkwan, P.; Kanthawong, S.; Schaduangrat, N.; Yana, J.; Shoombuatong, W. PVPred-SCM: Improved Prediction and Analysis of Phage Virion Proteins Using a Scoring Card Method. Cells,  2020, 9(2), 353.
[http://dx.doi.org/10.3390/cells9020353] [PMID: 32028709] 
[103] 
Charoenkwan, P.; Nantasenamat, C.; Hasan, M.M.; Shoombuatong, W. Meta-iPVP: A sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation. J. Comput. Aided Mol. Des.,  2020, 34(10), 1105-1116.
[http://dx.doi.org/10.1007/s10822-020-00323-z] [PMID: 32557165] 
[104] 
Charoenkwan, P.; Shoombuatong, W.; Lee, H.C.; Chaijaruwanich, J.; Huang, H.L.; Ho, S.Y. SCMCRYS: predicting protein crystallization using an ensemble scoring card method with estimating propensity scores of P-collocated amino acid pairs. PLoS One,  2013, 8(9), e72368.
[http://dx.doi.org/10.1371/journal.pone.0072368] [PMID: 24019868] 
[105] 
Charoenkwan, P.; Yana, J.; Schaduangrat, N.; Nantasenamat, C.; Hasan, M.M.; Shoombuatong, W. iBitter-SCM: Identification and characterization of bitter peptides using a scoring card method with propensity scores of dipeptides. Genomics,  2020, 112(4), 2813-2822.
[http://dx.doi.org/10.1016/j.ygeno.2020.03.019] [PMID: 32234434] 
[106] 
Jin, S.; Zeng, X.; Xia, F.; Huang, W.; Liu, X. Application of deep learning methods in biological networks. Brief. Bioinform.,  2021, 22(2), 1902-1917.
[http://dx.doi.org/10.1093/bib/bbaa043] [PMID: 32363401] 
[107] 
Zeng, X.; Zhu, S.; Lu, W.; Liu, Z.; Huang, J.; Zhou, Y.; Fang, J.; Huang, Y.; Guo, H.; Li, L.; Trapp, B.D.; Nussinov, R.; Eng, C.; Loscalzo, J.; Cheng, F. Target identification among known drugs by deep learning from heterogeneous networks. Chem. Sci. (Camb.),  2020, 11(7), 1775-1797.
[http://dx.doi.org/10.1039/C9SC04336E] [PMID: 34123272] 
[108] 
Yang, W. A brief survey of machine learning methods in protein sub-Golgi localization. Curr. Bioinform.,  2019, 14, 234-240.
[http://dx.doi.org/10.2174/1574893613666181113131415] 
[109] 
Lai, H.Y.; Zhang, Z.Y.; Su, Z.D.; Su, W.; Ding, H.; Chen, W.; Lin, H. iProEP: A Computational Predictor for Predicting Promoter. Mol. Ther. Nucleic Acids,  2019, 17, 337-346.
[http://dx.doi.org/10.1016/j.omtn.2019.05.028] [PMID: 31299595] 
[110] 
Chen, W.; Feng, P.; Nie, F. iATP: A sequence based method for identifying anti-tubercular peptides. Med. Chem.,  2020, 16(5), 620-625.
[http://dx.doi.org/10.2174/1573406415666191002152441] [PMID: 31339073] 
[111] 
Zhao, T.; Hu, Y.; Peng, J.; Cheng, L. DeepLGP: A novel deep learning method for prioritizing lncRNA target genes. Bioinformatics,  2020, 36(16), 4466-4472.
[http://dx.doi.org/10.1093/bioinformatics/btaa428] [PMID: 32467970] 
[112] 
Cheng, L. System Biology Methods and Tools for Pharmaceutical Design. Curr. Pharm. Des.,  2020, 26(26), 3047-3048.
[http://dx.doi.org/10.2174/138161282626200714144530] [PMID: 32787750] 
[113] 
Hasan, M.M.; Manavalan, B.; Khatun, MS.; Kurata, H. Meta-i6mA: An interspecies predictor for identifying DNA N6-methyladenine sites of plant genomes by exploiting informative features in an integrative machine-learning framework. Brief. Bioinform.,  2021, 22(3), bbaa202.
[http://dx.doi.org/10.1093/bib/bbaa202] [PMID: 32910169] 
[114] 
Hasan, M.M.; Manavalan, B.; Khatun, M.S.; Kurata, H. i4mC-ROSE, a bioinformatics tool for the identification of DNA N4-methylcytosine sites in the Rosaceae genome. Int. J. Biol. Macromol.,  2019, 157, 752-758.
[http://dx.doi.org/10.1016/j.ijbiomac.2019.12.009] [PMID: 31805335] 
[115] 
Hasan, M.M.; Manavalan, B.; Shoombuatong, W.; Khatun, M.S.; Kurata, H. i6mA-Fuse: improved and robust prediction of DNA 6 mA sites in the Rosaceae genome by fusing multiple feature representation. Plant Mol. Biol.,  2020, 103(1-2), 225-234.
[http://dx.doi.org/10.1007/s11103-020-00988-y] [PMID: 32140819] 
[116] 
Tang, H. A two-step discriminated method to identify thermophilic proteins. Int. J. Biomath.,  2017, 10(4), 1750050.
[http://dx.doi.org/10.1142/S1793524517500504] 
[117] 
Yu, L.; Yao, S.; Gao, L.; Zha, Y. Conserved Disease Modules Extracted From Multilayer Heterogeneous Disease and Gene Networks for Understanding Disease Mechanisms and Predicting Disease Treatments. Front. Genet.,  2019, 9, 745.
[http://dx.doi.org/10.3389/fgene.2018.00745] [PMID: 30713550] 
[118] 
Wang, T. Mobility based trust evaluation for heterogeneous electric vehicles network in smart cities. IEEE Trans. Intell. Transp. Syst.,  2020, 22(3), 1797-1806.
[http://dx.doi.org/10.1109/TITS.2020.2997377] 
[119] 
Qiang, X.; Zhou, C.; Ye, X.; Du, P.F.; Su, R.; Wei, L. CPPred-FL: A sequence-based predictor for large-scale identification of cell-penetrating peptides by feature representation learning. Brief. Bioinform., 2018.
[http://dx.doi.org/10.1093/bib/bby091] [PMID: 30239616] 
[120] 
Wei, L.; Wan, S.; Guo, J.; Wong, K.K. A novel hierarchical selective ensemble classifier with bioinformatics application. Artif. Intell. Med.,  2017, 83, 82-90.
[http://dx.doi.org/10.1016/j.artmed.2017.02.005] [PMID: 28245947] 
[121] 
Wei, L.; Xing, P.; Zeng, J.; Chen, J.; Su, R.; Guo, F. Improved prediction of protein-protein interactions using novel negative samples, features, and an ensemble classifier. Artif. Intell. Med.,  2017, 83, 67-74.
[http://dx.doi.org/10.1016/j.artmed.2017.03.001] [PMID: 28320624] 
[122] 
Zhang, Z.M.; Tan, J.X.; Wang, F.; Dao, F.Y.; Zhang, Z.Y.; Lin, H. Early Diagnosis of Hepatocellular Carcinoma Using Machine Learning Method. Front. Bioeng. Biotechnol.,  2020, 8, 254.
[http://dx.doi.org/10.3389/fbioe.2020.00254] [PMID: 32292778] 
[123] 
Feng, C.Q.; Zhang, Z.Y.; Zhu, X.J.; Lin, Y.; Chen, W.; Tang, H.; Lin, H. iTerm-PseKNC: A sequence-based tool for predicting bacterial transcriptional terminators. Bioinformatics,  2019, 35(9), 1469-1477.
[http://dx.doi.org/10.1093/bioinformatics/bty827] [PMID: 30247625] 
[124] 
Zhao, T.; Hu, Y.; Cheng, L. Deep-DRM: A computational method for identifying disease-related metabolites based on graph deep learning approaches. Brief. Bioinform.,  2021, 22(4), 10.
[http://dx.doi.org/10.1093/bib/bbaa212] [PMID: 33048110] 
[125] 
Ijaz, A. SUMOhunt: Combining Spatial Staging between Lysine and SUMO with Random Forests to Predict SUMOylation. ISRN Bioinform.,  2013, 2013, 671269.
[http://dx.doi.org/10.1155/2013/671269] [PMID: 25937950] 
[126] 
Hendriks, I.A.; D’Souza, R.C.; Yang, B.; Verlaan-de Vries, M.; Mann, M.; Vertegaal, A.C. Uncovering global SUMOylation signaling networks in a site-specific manner. Nat. Struct. Mol. Biol.,  2014, 21(10), 927-936.
[http://dx.doi.org/10.1038/nsmb.2890] [PMID: 25218447] 
[127] 
Wang, D.; Zhang, Z.; Jiang, Y.; Mao, Z.; Wang, D.; Lin, H.; Xu, D. DM3Loc: multi-label mRNA subcellular localization prediction and analysis based on multi-head self-attention mechanism. Nucleic Acids Res.,  2021, 49(8), e46.
[http://dx.doi.org/10.1093/nar/gkab016] [PMID: 33503258] 
[128] 
Lv, H.; Dao, F.Y.; Zulfiqar, H.; Lin, H. DeepIPs: comprehensive assessment and computational identification of phosphorylation sites of SARS-CoV-2 infection using a deep learning-based approach. Brief. Bioinform.,  2021, 22(6), bbab244.
[PMID: 34184738] 
[129] 
Dao, F.Y. DeepYY1: A deep learning approach to identify YY1-mediated chromatin loops. Brief. Bioinform.,  2021, 22(4), bbaa356.
[PMID: 33279983] 
[130] 
Lv, H. Deep-Kcr: Accurate detection of lysine crotonylation sites using deep learning method. Brief. Bioinform.,  2021, 22(4), bbaa255.
[http://dx.doi.org/10.1093/bib/bbaa255] [PMID: 33099604] 
[131] 
Dao, F.Y.; Lv, H.; Su, W.; Sun, Z.J.; Huang, Q.L.; Lin, H. iDHS-Deep: An integrated tool for predicting DNase I hypersensitive sites by deep neural network. Brief. Bioinform.,  2021, 22(5), bbab047.
[http://dx.doi.org/10.1093/bib/bbab047] [PMID: 33751027] 
[132] 
Conover, M. AngularQA: protein model quality assessment with LSTM networks. Computational and Mathematical Biophysics,  2019, 7(1), 1-9.
[http://dx.doi.org/10.1515/cmb-2019-0001] 
[133] 
Cao, R.; Freitas, C.; Chan, L.; Sun, M.; Jiang, H.; Chen, Z. ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network. Molecules,  2017, 22(10), E1732.
[http://dx.doi.org/10.3390/molecules22101732] [PMID: 29039790] 
[134] 
Si, D.; Moritz, S.A.; Pfab, J.; Hou, J.; Cao, R.; Wang, L.; Wu, T.; Cheng, J. Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps. Sci. Rep.,  2020, 10(1), 4282.
[http://dx.doi.org/10.1038/s41598-020-60598-y] [PMID: 32152330] 
[135] 
Hong, Z.; Zeng, X.; Wei, L.; Liu, X. Identifying enhancer-promoter interactions with neural network based on pre-trained DNA vectors and attention mechanism. Bioinformatics,  2020, 36(4), 1037-1043.
[PMID: 31588505] 
[136] 
Hong, Q.; Yan, R.; Wang, C.; Sun, J. Memristive Circuit Implementation of Biological Nonassociative Learning Mechanism and Its Applications. IEEE Trans. Biomed. Circuits Syst.,  2020, 14(5), 1036-1050.
[http://dx.doi.org/10.1109/TBCAS.2020.3018777] [PMID: 32833643] 
[137] 
Song, B.; Zeng, X.; Jiang, M.; Perez-Jimenez, M.J. Monodirectional Tissue P Systems With Promoters. IEEE Trans. Cybern.,  2021, 51(1), 438-450.
[http://dx.doi.org/10.1109/TCYB.2020.3003060] [PMID: 32649286] 
[138] 
Wei, L.; Tang, J.; Zou, Q. Local-DPP: An Improved DNA-binding Protein Prediction Method by Exploring Local Evolutionary Information. Inf. Sci.,  2017, 384, 135-144.
[http://dx.doi.org/10.1016/j.ins.2016.06.026] 
[139] 
Wei, L.; Xing, P.; Shi, G.; Ji, Z.; Zou, Q. Fast prediction of methylation sites using sequence-based feature selection technique. IEEE/ACM Trans. Comput. Biol. Bioinformatics,  2019, 16(4), 1264-1273.
[PMID: 28222000] 
DOI: https://dx.doi.org/10.2174/0929867328666210915112030
Print ISSN: 0929-8673
Online ISSN: 1875-533X
Publisher: Bentham Science Publishers
Journal: Current Medicinal Chemistry
