Abstract
Background: Malaria, caused by Plasmodium falciparum (P. falciparum), is one of the major infectious diseases. The proteins secreted by the malarial parasite play important roles in anti-malarial drug design, so it is important to identify them accurately. Although biochemical experiments can identify these proteins, such experiments are time-consuming and costly. Computational methods provide an important tool for fast and accurate identification of the proteins secreted by the malarial parasite.
Method: The aim of this letter is to design a powerful prediction model for identifying the secretory proteins of the malarial parasite. In this model, the physicochemical properties of residues were incorporated into the traditional pseudo amino acid composition to formulate the secretory protein samples as discrete feature vectors. Subsequently, the optimal feature subset was selected by analysis of variance (ANOVA). Finally, a support vector machine was used to perform classification.
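The feature-selection step can be illustrated with a minimal sketch, written under stated assumptions. It encodes each sequence by plain amino acid composition (a simplification, not the pseudo amino acid composition with physicochemical properties used in this work) and ranks features by the one-way ANOVA F statistic. The function names amino_acid_composition, anova_f_score and rank_features are hypothetical, and the support vector machine step is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional frequency vector of standard residues.

    Simplified stand-in for pseudo amino acid composition (assumption).
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def rank_features(feature_matrix, labels):
    """Return feature indices sorted from highest to lowest F score."""
    n_features = len(feature_matrix[0])
    scores = [
        anova_f_score([row[j] for row in feature_matrix], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

A larger F score means that a feature varies more between the classes than within them, so an optimal subset could be chosen by keeping the top-ranked features and checking the classifier's accuracy as features are added.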
Results: In the 5-fold cross-validation test, the overall accuracy reached 91.3%. Comparison with other methods indicates that the proposed method is powerful and robust.
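As a rough illustration of how a 5-fold cross-validated overall accuracy is computed, the sketch below splits the samples into five folds, holds each fold out in turn, and counts the correct predictions over all held-out samples. It assumes that overall accuracy means correct predictions divided by total samples. The names k_fold_indices and cross_validated_accuracy, and the train_fn and predict_fn callables, are hypothetical placeholders for any classifier.

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k nearly equal contiguous folds.

    In practice the samples would usually be shuffled first (assumption).
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(features, labels, train_fn, predict_fn, k=5):
    """Overall accuracy: correct held-out predictions / total samples."""
    correct = 0
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # Train only on the samples outside the current fold.
        model = train_fn(
            [features[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct += sum(
            1 for i in test_idx if predict_fn(model, features[i]) == labels[i]
        )
    return correct / len(labels)
```

Every sample is used for testing exactly once, so the reported figure reflects performance on data the model did not see during training.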
Conclusion: This study demonstrates that the physicochemical properties of residues are important features for secretory protein prediction.
Keywords: ANOVA, analysis of variance, malaria parasite, physicochemical properties, secretory protein.