Abstract
Introduction: Neddylation is the process by which the ubiquitin-like protein NEDD8 is attached to substrate lysine residues via isopeptide bonds. As a highly dynamic and reversible post-translational modification, lysine neddylation is involved in various biological processes and is closely associated with many diseases.
Objective: The accurate identification of neddylation sites is necessary to elucidate the underlying molecular mechanisms of neddylation. As traditional experimental methods are often expensive and time-consuming, it is imperative to design computational methods to identify neddylation sites.
Methods: In this study, a novel predictor named CKSAAP_NeddSite is developed to detect neddylation sites. An effective feature encoding scheme, the composition of k-spaced amino acid pairs (CKSAAP), is used to encode neddylation sites. The F-score feature selection method is then adopted to remove redundant features. Moreover, a fuzzy support vector machine algorithm is employed to overcome the class imbalance and noise problems.
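As a rough illustration of the encoding and feature-scoring steps named in the Methods, the following Python sketch computes a CKSAAP feature vector for a peptide window and an F-score for a single feature. The function names, the 20-letter amino acid alphabet, the default `k_max = 4`, and the handling of non-standard characters are assumptions made for illustration and are not taken from this abstract. The F-score follows the commonly used definition (between-class separation divided by the sum of within-class sample variances). The fuzzy support vector machine step is not shown.

```python
from itertools import product

# Assumption: the 20 standard amino acids; the actual predictor may also
# encode a gap/padding symbol for windows near protein termini.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap_encode(peptide, k_max=4):
    """Return the CKSAAP vector for gaps k = 0..k_max (400 values per k).

    For each k, every residue pair separated by exactly k positions is
    counted, and the counts are divided by the number of such pairs.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_windows = len(peptide) - k - 1  # number of k-spaced pairs
        for i in range(max(n_windows, 0)):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # pairs with non-standard symbols are skipped
                counts[pair] += 1
        denom = n_windows if n_windows > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


def f_score(pos_values, neg_values):
    """F-score of one feature from its values in the positive and negative sets.

    Each set needs at least two samples, because sample variances are used.
    """
    n_pos, n_neg = len(pos_values), len(neg_values)
    mean_pos = sum(pos_values) / n_pos
    mean_neg = sum(neg_values) / n_neg
    mean_all = (sum(pos_values) + sum(neg_values)) / (n_pos + n_neg)
    var_pos = sum((x - mean_pos) ** 2 for x in pos_values) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg_values) / (n_neg - 1)
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    denominator = var_pos + var_neg
    return numerator / denominator if denominator > 0 else 0.0
```

In such a pipeline, features would typically be ranked by F-score and only the top-ranked ones retained before training the classifier; the cutoff is a tuning choice that this abstract does not specify.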
Results: In 10-fold cross-validation, CKSAAP_NeddSite achieves an AUC of 0.9848. Independent tests also show that CKSAAP_NeddSite significantly outperforms the existing neddylation site predictor. Therefore, CKSAAP_NeddSite can be a useful bioinformatics tool for the prediction of neddylation sites. Feature analysis shows that some residues around neddylation sites may play an important role in the prediction.
Conclusion: The analysis and prediction results could offer useful information for elucidating the molecular mechanisms of neddylation. A user-friendly web server for CKSAAP_NeddSite is available at 123.206.31.171/CKSAAP_NeddSite.
Keywords: Post-translational modification, neddylation, feature extraction, fuzzy support vector machine, analysis, elucidating.
Graphical Abstract