Review Article

Recent Development of Machine Learning Methods in Sumoylation Sites Prediction

Volume 29, Issue 5, 2022

Published on: 12 January, 2022

Page: [894 - 907] Pages: 14

DOI: 10.2174/0929867328666210915112030

Abstract

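Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes. SUMO modification can change a protein's subcellular localization, activity, and stability. Sumoylation also plays an important role in cellular processes such as transcriptional regulation and signal transduction. Abnormal sumoylation is associated with many diseases, including neurodegenerative and immune-related diseases, as well as the development of cancer. Identifying sumoylation sites (SUMO sites) is therefore essential for understanding their molecular mechanisms and regulatory roles. Compared with labor-intensive and costly experimental approaches, in silico prediction of sumoylation sites has attracted much attention for its accuracy, convenience, and speed. Many computational prediction models have been used to identify SUMO sites, but they have not yet been comprehensively summarized and reviewed. This paper therefore summarizes and discusses the research progress of the relevant models. We briefly summarize the development of bioinformatics methods for sumoylation site prediction, focusing mainly on benchmark dataset construction, feature extraction, machine learning methods, published results, and online tools. We hope that this review will be of help to wet-lab researchers.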

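Keywords: SUMO modification, feature selection, machine learning, classification, post-translational modification, sequential forward selection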

