A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods

Jiu-Xin Tan; Hao Lv; Fang Wang; Fu-Ying Dao; Wei Chen; Hui Ding
doi:10.2174/1389450119666181002143355
Abstract

Enzymes are proteins that act as biological catalysts to speed up cellular biochemical processes. According to the first digit of their Enzyme Commission (EC) numbers, enzymes are divided into six categories: EC-1: oxidoreductase; EC-2: transferase; EC-3: hydrolase; EC-4: lyase; EC-5: isomerase; and EC-6: ligase (synthetase). Different enzymes have different biological functions and act on different substrates. Therefore, knowing which family an enzyme belongs to can help infer its catalytic mechanism and provide information about the relevant biological function. With the large number of protein sequences flowing into databanks in the post-genomic age, annotating the family of each enzyme is very important. Since experimental methods are costly, bioinformatics tools can be a great help in accurately classifying enzymes into families. In this review, we summarize the application of machine learning methods to the prediction of enzyme family from different aspects. We hope that this review will provide insights and inspiration for research on enzyme family classification.
Keywords: Enzyme, family, classification, machine learning methods.
Journal: Current Drug Targets
Publisher: Bentham Science Publisher
DOI: https://dx.doi.org/10.2174/1389450119666181002143355
Print ISSN: 1389-4501
Online ISSN: 1873-5592
