Abstract
The focuses of this work are: to propose a novel method for building an ensemble of classifiers for peptide classification based on substitution matrices; to show the importance to select a proper set of the parameters of the classifiers that build the ensemble of learning systems. The HIV-1 protease cleavage site prediction problem is here studied. The results obtained by a blind testing protocol are reported, the comparison with other state-of-the-art approaches, based on ensemble of classifiers, allows to quantify the performance improvement obtained by the systems proposed in this paper. The simulation based on experimentally determined protease cleavage data has demonstrated the success of these new ensemble algorithms. Particularly interesting it is to note that also if the HIV-1 protease cleavage site prediction problem is considered linearly separable we obtain the best performance using an ensemble of non-linear classifiers.
Keywords: Machine learning, ensembles of classifiers, substitution matrices, HIV-1 protease prediction