Review Article

Advances in Machine Learning Methods for Identifying Secretory Proteins of the Malaria Parasite

Volume 29, Issue 5, 2022

Published on: 11 January, 2022

Page: [807 - 821] Pages: 15

DOI: 10.2174/0929867328666211005140625


Abstract

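Malaria, caused by Plasmodium falciparum, is one of the world's major infectious diseases. An effective method for predicting the secretory proteins of the malaria parasite is essential for developing effective treatments. Biochemical assays can provide detailed information for the accurate identification of secretory proteins, but they are expensive and time-consuming. In this paper, we summarize machine learning-based identification algorithms and compare the construction strategies of the different computational methods. We also discuss how machine learning can be used to improve the ability of algorithms to identify proteins secreted by the malaria parasite.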

Keywords: secretory protein, malaria parasite, machine learning, prediction, algorithm, amino acid

