Abstract
By introducing the “multi-layer scale” and hybridizing gene ontology information with sequential evolution information, a novel predictor called iLoc-Gpos has been developed for predicting the subcellular localization of Gram-positive bacterial proteins with both single and multiple location sites. To facilitate comparison, the same stringent benchmark dataset used to estimate the accuracy of Gpos-mPLoc was adopted to demonstrate the power of iLoc-Gpos. The dataset contains 519 Gram-positive bacterial proteins classified into the following four subcellular locations: (1) cell membrane, (2) cell wall, (3) cytoplasm, and (4) extracell; none of the included proteins has ≥25% pairwise sequence identity to any other in the same subset (subcellular location). The overall success rate achieved by iLoc-Gpos in the jackknife test on this stringent benchmark dataset was over 93%, which is about 11% higher than that achieved by Gpos-mPLoc. As a user-friendly web-server, iLoc-Gpos is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gpos or http://www.jci-bioinfo.cn/iLoc-Gpos. A step-by-step guide is also provided on how to use the web-server to get the desired results. Furthermore, for the user’s convenience, the iLoc-Gpos web-server accepts batch job submission, a function that is not available in the existing version of the Gpos-mPLoc web-server.
Keywords: Singleplex proteins, Multiplex proteins, Multi-layer scale, Gene ontology, KNN classifier, Absolute true success rate, post-genomic era, Gpos-PLoc, PseAAC, K-Nearest Neighbor, jackknife test, FASTA format, Mahalanobis, PSI-BLAST, PSSM