Abstract
A B-cell epitope, also known as an antigenic determinant, is the part of an antigen recognized by B cells. The ability of antibodies to recognize epitopes is widely exploited in biomedical applications, including immunodetection and immunotherapeutics. Identifying immunogenic regions helps to clarify the mechanisms of the immune system and guides these applications. Compared with laborious and time-consuming experimental approaches, computational prediction of B-cell epitopes is more convenient and efficient. In this study, a novel predictor was developed in which features were selected by combining the maximum relevance minimum redundancy (mRMR) method with incremental feature selection (IFS). The predictor was trained and tested on three B-cell epitope datasets. Eight types of features were used to encode the peptides: physicochemical and biochemical properties, residual disorder, sequence conservation, solvent accessibility, secondary structure, the propensity of amino acids to be conserved at protein-protein interfaces and protein surfaces, deviation of side-chain carbon atom number, and gain/loss of amino acids during evolution. Sequence conservation, physicochemical and biochemical properties of amino acids, solvent accessibility, and secondary structure contributed most to the identification of epitope sites. Features from the sites surrounding the central residue were also critical for the prediction. The findings of this study may shed light on the prediction of epitopes and on the mechanisms of antigen-antibody interactions.
Keywords: B-cell epitope, minimum redundancy maximum relevance, incremental feature selection, random forest algorithm.
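The mRMR and IFS combination named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names are hypothetical, the features are toy discrete columns, mutual information is estimated from raw frequencies, mRMR uses the common "relevance minus mean redundancy" form, and a simple majority vote over identical feature patterns with leave-one-out evaluation stands in for the random forest listed in the keywords.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_rank(columns, labels):
    """Greedily rank feature indices by relevance minus mean redundancy."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    ranked = []
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[k]) for k in ranked
            ) / len(ranked)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def loo_accuracy(columns, labels, subset):
    """Leave-one-out accuracy of a majority vote over identical feature patterns.

    Stand-in classifier (assumption): a sample is predicted from the labels of
    the other samples sharing its feature pattern, falling back to the overall
    majority of the other samples when no pattern matches.
    """
    n = len(labels)
    rows = [tuple(columns[j][i] for j in subset) for i in range(n)]
    correct = 0
    for i in range(n):
        votes = Counter(labels[k] for k in range(n) if k != i and rows[k] == rows[i])
        if not votes:
            votes = Counter(labels[:i] + labels[i + 1:])
        correct += votes.most_common(1)[0][0] == labels[i]
    return correct / n


def incremental_feature_selection(columns, labels, ranked):
    """Score the top-1, top-2, ... ranked subsets and keep the smallest best one."""
    scores = [loo_accuracy(columns, labels, ranked[:k]) for k in range(1, len(ranked) + 1)]
    best_k = max(range(len(scores)), key=scores.__getitem__) + 1
    return ranked[:best_k], scores


# Toy data (hypothetical): 8 non-epitope (0) and 8 epitope (1) sites.
labels = [0] * 8 + [1] * 8
f0 = [0, 0, 0, 0, 0, 0, 1, 1] + [1] * 8   # informative feature
f1 = list(f0)                              # exact copy of f0, fully redundant
f2 = [1, 1, 1, 1, 0, 0, 0, 0] + [1] * 8   # weaker, but complementary to f0
columns = [f0, f1, f2]

ranked = mrmr_rank(columns, labels)
selected, scores = incremental_feature_selection(columns, labels, ranked)
print("mRMR order:", ranked)
print("IFS scores:", scores)
print("selected features:", selected)
```

On this toy data the redundant copy is ranked last, and IFS stops at the two complementary features because adding the copy does not improve the leave-one-out accuracy. In a realistic setting the pattern-vote classifier would be replaced by a proper learner and the leave-one-out loop by whatever validation scheme the study uses.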