Abstract
Background: Protein S-sulfenylation, the reversible oxidative modification of cysteine thiol groups to cysteine S-sulfenic acids, is a post-translational modification (PTM) that plays a critical role in regulating protein function and signal transduction. Identifying specific protein S-sulfenylation sites is crucial for understanding the underlying molecular mechanisms.
Objective: We sought to develop a computational method that can effectively predict S-sulfenylation sites by using optimally extracted properties.
Method: We propose DBN-Sulf, which uses a Deep Belief Network (DBN) built from Restricted Boltzmann Machines (RBMs) to reduce the dimensionality of a combination of heterogeneous features, including amino acid-related features, evolutionary features, and structure-based features. A support vector machine (SVM) based predictor is then built on the resulting optimal features.
Results: We evaluate the DBN-Sulf classifier with 5-fold cross-validation on a training dataset of 1007 positive sites and 7837 negative sites, and obtain an AUC of 0.80, an ACC of 0.85, and an MCC of 0.53, which are significantly better than those of existing methods. We further validate our method on an independent test set and obtain promising results.
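For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy (ACC) and the Matthews correlation coefficient (MCC) are computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions, not the authors' code or data. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all sites that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when the denominator is zero, a common convention
    (for example, when the classifier predicts only one class).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper):
    # a classifier that labels every site as negative.
    tp, tn, fp, fn = 0, 90, 0, 10
    print("ACC:", accuracy(tp, tn, fp, fn))
    print("MCC:", mcc(tp, tn, fp, fn))
```

Because negative sites greatly outnumber positive sites in datasets of this kind, MCC is generally a more informative summary than ACC alone: it uses all four confusion-matrix cells and stays near zero for a classifier that ignores the minority class.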
Conclusion: The superior performance of DBN-Sulf over existing S-sulfenylation site prediction approaches indicates the importance of the deep belief network-based feature extraction procedure.
Keywords: Deep belief network, support vector machine, S-sulfenylation sites, restricted Boltzmann machines.