Abstract
Protein fold classification plays a key role in protein functional analysis, molecular biology, cell biology, biomedicine and drug design. Methods for classifying protein folds can be roughly divided into two categories: taxonomy-based methods and template-based methods. Machine learning algorithms, owing to their excellent performance, have been widely applied in taxonomy-based methods. In this review, we discuss the most popular and representative taxonomy-based methods that use machine learning, focusing on three important aspects: the dataset, the feature extraction method, and the classification algorithm. We compare the overall accuracies of methods that use the same classifier with different feature vectors, and we summarize development trends and potential research directions. This review is intended to assist researchers in choosing appropriate materials and developing new classification methods in this area.
Keywords: Classification, dataset, ensemble classifier, feature extraction, machine learning, protein fold.