Abstract
The number of protein 3D structures without functional annotation in the Protein Data Bank (PDB) has steadily increased. Many of these proteins are relevant to pharmaceutical design because they may be enzymes of different classes that could become drug targets. This has in turn increased the demand for theoretical models that can characterize these proteins quickly. In this work, we review and discuss Alignment-Free Methods (AFMs) for fast prediction of the Enzyme Commission (EC) number from structural patterns. We cover both methods based on linear techniques, such as Linear Discriminant Analysis (LDA), and non-linear models, such as Artificial Neural Networks (ANN) or Support Vector Machines (SVM), in order to compare linear and non-linear classifiers. We also identified which of these models have been implemented as freely available web servers and compiled a list of some of these websites. For instance, we reviewed the servers implemented at the Bio-AIMS portal (http://miaja.tic.udc.es/Bio-AIMS/EnzClassPred.php) and the EzyPred server (http://www.csbio.sjtu.edu.cn/bioinf/EzyPred/).
Keywords: Enzyme classes, Protein Structure-Function Relationship, Predict Enzyme Function, Support Vector Machine, Gene Ontology, Markov Models, Web Servers, Alignment-Free Methods (AFMs), computational approaches, 1D techniques, alignment-free Machine Learning methods, EC number, Functional Domain Composition (FDC), Bayesian classification, Saccharomyces Genome Database (SGD), Mouse Genome Database (MGD), Enzyme Class Query (ECQ), MASCOT, Markov Chain Model, Artificial Neural Network, Protein Data Bank