Abstract
Caspases play an important role in many critical non-apoptotic processes by cleaving relevant substrates at specific cleavage sites. Identifying caspase substrate cleavage sites is key to understanding these processes. This paper proposes a hybrid method that uses a support vector machine (SVM) in conjunction with position-specific scoring matrices (PSSM) to predict caspase substrate cleavage sites. Three encoding schemes, namely orthonormal binary encoding, the BLOSUM62 matrix profile, and the PSSM profile of the neighborhood surrounding the substrate cleavage site, were used as input to the SVM. The 10-fold cross-validation results demonstrate that the SVM-PSSM method performs well, with an overall accuracy of 97.619% on a larger dataset.
Keywords: Caspase, substrate cleavage sites prediction, SVM, PSSM profiles, hybrid SVM-PSSM method, support vector machine, orthonormal binary encoding, BLOSUM62 matrix, SVM-PSSM method, cysteine proteases, mammalian cell death, Substrate dataBAse, cathepsin B, caspase 3, granzyme B, MEROPS protease database, BLOSUM62 matrix profile, CASBAH database, P4-P3-P2-P1, GalNAc-transferase, HIV protease, alpha-turn types, serine hydrolases, B-cell epitope, radial basis function, Position Specific Score Matrix, alanine, valine, jackknife test, benchmark dataset, cross-validation, Matthews correlation coefficient
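As an illustration of the first encoding scheme named in the abstract, the following is a minimal sketch of orthonormal binary encoding, in which each residue in a window around a candidate cleavage site becomes a 20-dimensional 0/1 vector. The window size (P4 to P4'), the function name one_hot_window, and the treatment of positions outside the sequence as all-zero vectors are assumptions made for this example and are not taken from the paper.

```python
# Hypothetical sketch: orthonormal (one-hot) encoding of a cleavage-site window.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_window(sequence, p1_index, flank=4):
    """Encode residues P{flank}..P{flank}' around the P1 residue as a flat 0/1 list.

    p1_index is the 0-based position of the P1 residue (assumed convention).
    Positions outside the sequence, or non-standard residues, are encoded
    as all zeros (an assumption for this sketch).
    """
    features = []
    for pos in range(p1_index - flank + 1, p1_index + flank + 1):
        bits = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                bits[idx] = 1
        features.extend(bits)
    return features


# Example: a DEVD motif with P1 = D at index 6.
vector = one_hot_window("AAADEVDGAAA", 6)
```

With the default flank of 4, each window yields a vector of 8 x 20 = 160 features, which could then be passed to an SVM implementation of the reader's choice.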