Abstract
Protein fold prediction is a fundamental problem in protein structure and function research. In our view, protein fold prediction faces three main problems. The first is overfitting caused by the lack of training samples. The second is that the hierarchical information in the labels is left out. The third is the small size of current benchmarks. In this paper, we propose a structured SVM to overcome the first and second problems. We also contribute three comparatively large datasets as benchmarks for protein fold prediction. Experiments on different datasets demonstrate the performance and robustness of our structured SVM.
Keywords: structured support vector machine, protein fold prediction, protein structure, machine learning, bioinformatics, protein secondary structure.
Graphical Abstract