[1]
Crippen GM, Maiorov VN. How many protein folding motifs are there? J Mol Biol 1995; 252(1): 144-51.
[2]
Wang ZX. How many fold types of protein are there in nature? Proteins 1996; 26(2): 186-91.
[3]
Lo Conte L, Ailey B, Hubbard TJ, Brenner SE, Murzin AG, Chothia C. SCOP: a structural classification of proteins database. Nucleic Acids Res 2000; 28(1): 257-9.
[4]
Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003; 3: 1157-82.
[5]
Wei L, Zou Q. Recent progress in machine learning-based methods for protein fold recognition. Int J Mol Sci 2016; 17(12): 2118.
[6]
Cheng J, Tegge AN, Baldi P. Machine learning methods for protein structure prediction. IEEE Rev Biomed Eng 2008; 1: 41-9.
[7]
Chen J, Guo M, Wang X, Liu B. A comprehensive review and comparison of different computational methods for protein remote homology detection. Brief Bioinform 2018; 19(2): 231-44.
[8]
Liu B, Chen J, Wang X. Application of learning to rank to protein remote homology detection. Bioinformatics 2015; 31(21): 3492-8.
[9]
Liu B, Zhang D, Xu R, et al. Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection. Bioinformatics 2014; 30(4): 472-9.
[10]
Chen J, Guo M, Li S, Liu B. ProtDec-LTR2.0: an improved method for protein remote homology detection by combining pseudo protein and supervised Learning to Rank. Bioinformatics 2017; 33(21): 3473-6.
[11]
Chen J, Long R, Wang XL, Liu B, Chou KC. dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation. Sci Rep 2016; 6: 32333. [http://dx.doi.org/10.1038/srep32333]. [PMID: 27581095].
[12]
Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997; 25(17): 3389-402.
[13]
Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res 2011; 39(Suppl_2): W29-37.
[14]
Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 2011; 9(2): 173-5.
[15]
Margelevičius M, Venclovas C. Detection of distant evolutionary relationships between protein families using theory of sequence profile-profile comparison. BMC Bioinformatics 2010; 11(1): 89.
[16]
Lindahl E, Elofsson A. Identification of related proteins on family, superfamily and fold level. J Mol Biol 2000; 295(3): 613-25.
[17]
Ding CH, Dubchak I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 2001; 17(4): 349-58.
[18]
Taguchi YH, Gromiha MM. Application of amino acid occurrence for discriminating different folding types of globular proteins. BMC Bioinformatics 2007; 8(1): 404.
[19]
Dong Q, Zhou S, Guan J. A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation. Bioinformatics 2009; 25(20): 2655-62.
[20]
Chen K, Kurgan L. PFRES: protein fold classification by using evolutionary information and predicted secondary structure. Bioinformatics 2007; 23(21): 2843-50.
[21]
Yang JY, Chen X. Improving taxonomy-based protein fold recognition by using global and local features. Proteins 2011; 79(7): 2053-64.
[22]
Fox NK, Brenner SE, Chandonia JM. SCOPe: Structural Classification of Proteins--extended, integrating SCOP and ASTRAL data and classification of new structures. Nucleic Acids Res 2014; 42(Database issue): D304-9.
[23]
Xia J, Peng Z, Qi D, Mu H, Yang J. An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier. Bioinformatics 2017; 33(6): 863-70.
[24]
Chothia C, Finkelstein AV. The classification and origins of protein folding patterns. Annu Rev Biochem 1990; 59(1): 1007-39.
[25]
Chen D, Tian X, Zhou B, Gao J. Profold: protein fold classification with additional structural features and a novel ensemble classifier. BioMed Res Int 2016; 2016: 6802832.
[26]
Fauchère JL, Charton M, Kier LB, Verloop A, Pliska V. Amino acid side chain parameters for correlation studies in biology and pharmacology. Int J Pept Protein Res 1988; 32(4): 269-78.
[27]
Grantham R. Amino acid difference formula to help explain protein evolution. Science 1974; 185(4154): 862-4.
[28]
Charton M, Charton BI. The structural dependence of amino acid hydrophobicity parameters. J Theor Biol 1982; 99(4): 629-44.
[29]
Lin C, Zou Y, Qin J, et al. Hierarchical classification of protein folds using a novel ensemble classifier. PLoS One 2013; 8(2): e56499.
[30]
Dubchak I, Muchnik I, Mayor C, Dralyuk I, Kim SH. Recognition of a protein fold in the context of the SCOP classification. Proteins 1999; 35(4): 401-7.
[31]
Ibrahim W, Abadeh MS. Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition. J Theor Biol 2017; 421: 1-15.
[32]
Eisenberg D, Schwarz E, Komaromy M, Wall R. Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J Mol Biol 1984; 179(1): 125-42.
[33]
McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics 2000; 16(4): 404-5.
[34]
Wang S, Li W, Liu S, Xu J. RaptorX-Property: a web server for protein structure property prediction. Nucleic Acids Res 2016; 44(W1): W430-5.
[35]
Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983; 22(12): 2577-637.
[36]
Cheng J, Randall AZ, Sweredoski MJ, Baldi P. SCRATCH: a protein structure and structural feature prediction server. Nucleic Acids Res 2005; 33(Suppl_2): W72-6.
[37]
Dubchak I, Muchnik I, Holbrook SR, Kim SH. Prediction of protein folding class using global description of amino acid sequence. Proc Natl Acad Sci USA 1995; 92(19): 8700-4.
[38]
Garg A, Bhasin M, Raghava GP. SVM-based method for subcellular localization of human proteins using amino acid compositions, their order and similarity search. J Biol Chem 2005; 280(15): 14427-32.
[39]
Guo J, Lin Y, Liu X. GNBSL: a new integrative system to predict the subcellular location for Gram-negative bacteria proteins. Proteomics 2006; 6(19): 5099-105.
[40]
Shamim MT, Anwaruddin M, Nagarajaram HA. Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs. Bioinformatics 2007; 23(24): 3320-7.
[41]
Liu B, Liu F, Wang X, Chen J, Fang L, Chou KC. Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Res 2015; 43(W1): W65-71.
[42]
Liu B, Liu F, Fang L, Wang X, Chou KC. repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects. Bioinformatics 2015; 31(8): 1307-9.
[43]
Chen W, Zhang X, Brooker J, Lin H, Zhang L, Chou KC. PseKNC-General: a cross-platform package for generating various modes of pseudo nucleotide compositions. Bioinformatics 2015; 31(1): 119-20.
[44]
Shen HB, Chou KC. PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition. Anal Biochem 2008; 373(2): 386-8.
[45]
Liu B. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches. Brief Bioinform 2017.
[46]
Vapnik VN. An overview of statistical learning theory. IEEE Trans Neural Netw 1999; 10(5): 988-99.
[47]
Shen H, Chou KC. Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types. Biochem Biophys Res Commun 2005; 334(1): 288-92.
[48]
Shen HB, Chou KC. Ensemble classifier for protein fold pattern recognition. Bioinformatics 2006; 22(14): 1717-22.
[49]
Nanni L. A novel ensemble of classifiers for protein fold recognition. Neurocomputing 2006; 69(16-18): 2434-7.
[50]
Guo X, Gao X. A novel hierarchical ensemble classifier for protein fold recognition. Protein Eng Des Sel 2008; 21(11): 659-64.
[51]
Schäffer AA, Aravind L, Madden TL, et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 2001; 29(14): 2994-3005.
[52]
Marchler-Bauer A, Anderson JB, Derbyshire MK, et al. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 2006; 35(Suppl_1): D237-40.
[53]
Shen HB, Chou KC. Predicting protein fold pattern with functional domain and sequential evolution information. J Theor Biol 2009; 256(3): 441-6.
[54]
Ghanty P, Pal NR. Prediction of protein folds: extraction of new features, dimensionality reduction, and fusion of heterogeneous classifiers. IEEE Trans Nanobioscience 2009; 8(1): 100-10.
[55]
Dehzangi A, Phon-Amnuaisuk S, Dehzangi O. Using Random Forest for Protein Fold Prediction Problem: An Empirical Study. J Inf Sci Eng 2010; 26(6): 1941-56.
[56]
Dehzangi A, Phon-Amnuaisuk S, Manafi M, Safa S. Using rotation forest for protein fold prediction problem: An empirical study. In: European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics; 2010 Apr 7. Berlin, Heidelberg: Springer; 217-7.
[57]
Yang T, Kecman V, Cao L, Zhang C, Huang JZ. Margin-based ensemble classifier for protein fold recognition. Expert Syst Appl 2011; 38(10): 12348-55.
[58]
Faraggi E, Xue B, Zhou Y. Improving the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins by guided-learning through a two-layer neural network. Proteins 2009; 74(4): 847-56.
[59]
Bailey TL, Boden M, Buske FA, et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 2009; 37(Suppl_2): W202-8.
[60]
Li J, Wu J, Chen K. PFP-RFSM: Protein fold prediction by using random forests and sequence motifs. J Biomed Sci Eng 2013; 6(12): 1161.
[61]
Sharma A, Lyons J, Dehzangi A, Paliwal KK. A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition. J Theor Biol 2013; 320: 41-6.
[62]
Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S. DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures. Anal Chim Acta 1993; 277(2): 239-53.
[63]
Feng Z, Hu X. Recognition of 27-class protein folds by adding the interaction of segments and motif information. BioMed Res Int 2014; 2014: 262850.
[64]
Paliwal KK, Sharma A, Lyons J, Dehzangi A. Improving protein fold recognition using the amalgamation of evolutionary-based and structural based information. BMC Bioinformatics 2014; 15(Suppl 16): S12.
[65]
Paliwal KK, Sharma A, Lyons J, Dehzangi A. A tri-gram based feature extraction technique using linear probabilities of position specific scoring matrix for protein fold recognition. IEEE Trans Nanobioscience 2014; 13(1): 44-50.
[66]
Dehzangi A, Paliwal K, Lyons J, Sharma A, Sattar A. A segmentation-based method to extract structural and evolutionary features for protein fold recognition. IEEE/ACM Trans Comput Biol Bioinform 2014; 11(3): 510-9.
[67]
Lyons J, Biswas N, Sharma A, Dehzangi A, Paliwal KK. Protein fold recognition by alignment of amino acid residues using kernelized dynamic time warping. J Theor Biol 2014; 354: 137-45.
[68]
Aram RZ, Charkari NM. A two-layer classification framework for protein fold recognition. J Theor Biol 2015; 365: 32-9.
[69]
Lyons J, Dehzangi A, Heffernan R, et al. Advancing the accuracy of protein fold recognition by utilizing profiles from hidden Markov models. IEEE Trans Nanobioscience 2015; 14(7): 761-72.
[70]
Saini H, Raicar G, Sharma A, et al. Probabilistic expression of spatially varied amino acid dimers into general form of Chou's pseudo amino acid composition for protein fold recognition. J Theor Biol 2015; 380: 291-8.
[71]
Wei L, Liao M, Gao X, Zou Q. Enhanced protein fold prediction method through a novel feature extraction technique. IEEE Trans Nanobioscience 2015; 14(6): 649-59.
[72]
Faraggi E, Zhang T, Yang Y, Kurgan L, Zhou Y. SPINE X: improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles. J Comput Chem 2012; 33(3): 259-67.
[73]
Cheung NJ, Ding XM, Shen HB. Protein folds recognized by an intelligent predictor based-on evolutionary and structural information. J Comput Chem 2016; 37(4): 426-78.
[74]
Lyons J, Paliwal KK, Dehzangi A, Heffernan R, Tsunoda T, Sharma A. Protein fold recognition using HMM–HMM alignment and dynamic programming. J Theor Biol 2016; 393: 67-74.
[75]
Raicar G, Saini H, Dehzangi A, Lal S, Sharma A. Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids. J Theor Biol 2016; 402: 117-28.
[76]
Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics 2005; 21(7): 951-60.
[77]
Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 1999; 292(2): 195-202.
[78]
Saini H, Raicar G, Lal SP, Dehzangi A, Imoto S, Sharma A. Protein Fold Recognition Using Genetic Algorithm Optimized Voting Scheme and Profile Bigram. JSW 2016; 11(8): 756-67.
[79]
Yan K, Xu Y, Fang X, Zheng C, Liu B. Protein fold recognition based on sparse representation based classification. Artif Intell Med 2017; 79: 1-8.
[80]
Guo Y, Yu L, Wen Z, Li M. Using support vector machine combined with auto covariance to predict protein-protein interactions from protein sequences. Nucleic Acids Res 2008; 36(9): 3025-30.
[81]
Xia JF, Han K, Huang DS. Sequence-based prediction of protein-protein interactions by means of rotation forest and autocorrelation descriptor. Protein Pept Lett 2010; 17(1): 137-45.
[82]
Moran PA. Notes on continuous stochastic phenomena. Biometrika 1950; 37(1-2): 17-23.
[83]
Geary RC. The contiguity ratio and statistical mapping. The Incorporated Statistician 1954; 5(3): 115-46.
[84]
Hollas B. An analysis of the autocorrelation descriptor for molecules. J Math Chem 2003; 33(2): 91-101.
[85]
Fisher RA. The use of multiple measurements in taxonomic problems. Ann Eugen 1936; 7(2): 179-88.