A Brief Review of the Computational Identification of Antifreeze Protein

Fang Wang; Zheng-Xing Guan; Fu-Ying Dao; Hui Ding
doi:10.2174/1385272823666190718145613
Abstract

Many cold-adapted organisms produce antifreeze proteins (AFPs) that counter the freezing of cell fluids by controlling the growth of ice crystals. AFPs have been found in a wide range of species, including vertebrates, invertebrates, plants, bacteria, and fungi. AFPs from fish, insects, and plants are highly diverse, so identifying them is a challenging task in computational proteomics. With the accumulation of known AFPs and the development of machine learning methods, it has become possible to construct high-throughput tools that identify AFPs in a timely manner. In this review, we briefly survey the application of machine learning methods to AFP identification from several aspects: published benchmark datasets, sequence descriptors, classification algorithms, and published methods. We hope that this review will suggest new ideas and directions for research on identifying antifreeze proteins.
Keywords: Antifreeze protein, classification, machine learning, computational proteomics, cold-adapted organisms, cell fluids.
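
As an illustration of what a "sequence descriptor" in the abstract can look like, the following is a minimal sketch of amino acid composition, one of the simplest ways to turn a protein sequence into a fixed-length feature vector for a classifier. It is not taken from the review itself; the function name `amino_acid_composition` and the toy sequence are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional amino acid composition of a protein sequence.

    Each entry is the fraction of standard residues in the sequence that are
    the corresponding amino acid. Non-standard characters are ignored.
    (Hypothetical helper, for illustration only.)
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence invented for this example (not a real AFP record).
toy = "AATAAAAT"
vec = amino_acid_composition(toy)
```

A vector like `vec` could then be passed to any standard classifier; richer descriptors mentioned in the keywords of this literature (for example pseudo amino acid composition) extend the same idea with sequence-order information.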

Journal: Current Organic Chemistry
Publisher Name: Bentham Science Publisher
Print ISSN: 1385-2728
Online ISSN: 1875-5348
DOI: https://dx.doi.org/10.2174/1385272823666190718145613
