Abstract
Background: Anti-inflammatory peptides (AIPs) are potent therapeutic agents for inflammatory and autoimmune disorders because of their high specificity and minimal toxicity under normal conditions. Identifying AIPs is therefore an important step toward discovering novel and efficient AIP-based therapeutics. Recently, three computational approaches based on machine learning algorithms have been developed that can effectively identify potential AIPs. However, these three predictors still face several challenges.
Objective: To propose a novel machine learning method that improves the accuracy of AIP prediction.
Methods: This study attempts to improve the recognition of AIPs by employing multiple primary sequence-based feature descriptors and an efficient feature selection strategy. Features are first ranked with four enhanced minimal redundancy maximal relevance (emRMR) methods. Wrapper methods built on seven different classifiers and the sequential forward selection (SFS) algorithm are then applied to the ranked features. The resulting hybrid feature selection technique, emRMR-SFS, is used to optimize the feature vectors. Finally, by evaluating the seven classifiers trained with the optimal feature subset, we developed an Extremely Randomized Tree (ERT) based predictor, named PREDAIP, for identifying AIPs.
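The filter-then-wrapper idea in the Methods can be illustrated with a minimal sketch. It is not the PREDAIP implementation and rests on several assumptions: absolute Pearson correlation stands in for the emRMR correlation criteria (which are not specified here), a leave-one-out nearest-centroid classifier stands in for the seven classifiers, the wrapper keeps a ranked feature only if accuracy improves, and the toy data and all function names are hypothetical.

```python
from statistics import mean

def pearson(a, b):
    """Absolute Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return abs(cov) / (va * vb) ** 0.5 if va and vb else 0.0

def mrmr_rank(X, y, corr=pearson):
    """Filter stage: greedy ranking by relevance to y minus mean redundancy."""
    cols = [list(c) for c in zip(*X)]
    remaining, ranked = list(range(len(cols))), []
    while remaining:
        def score(j):
            red = mean(corr(cols[j], cols[k]) for k in ranked) if ranked else 0.0
            return corr(cols[j], y) - red
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed stand-in)."""
    hits = 0
    for i in range(len(X)):
        cents = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            cents[label] = [mean(row[f] for row in rows) for f in feats]
        dist = lambda c: sum((X[i][f] - v) ** 2 for f, v in zip(feats, c))
        hits += min(cents, key=lambda lab: dist(cents[lab])) == y[i]
    return hits / len(X)

def sfs_on_ranking(X, y, ranked):
    """Wrapper stage: walk the ranking, keep a feature only if accuracy rises."""
    chosen, best = [], 0.0
    for f in ranked:
        acc = loo_accuracy(X, y, chosen + [f])
        if acc > best:
            chosen, best = chosen + [f], acc
    return chosen, best

# Hypothetical toy data: feature 0 is informative, feature 1 is strongly
# correlated with feature 0 (redundant), features 2 and 3 carry no class signal.
X = [
    [0.9, 0.7, 0.1, 0.5], [0.8, 0.9, 0.7, 0.4],
    [0.7, 0.6, 0.3, 0.6], [0.9, 0.8, 0.9, 0.5],
    [0.2, 0.3, 0.2, 0.5], [0.1, 0.2, 0.8, 0.6],
    [0.3, 0.4, 0.4, 0.4], [0.2, 0.1, 0.6, 0.5],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

ranked = mrmr_rank(X, y)
chosen, best = sfs_on_ranking(X, y, ranked)
print("ranking:", ranked, "selected:", chosen, "LOO accuracy:", best)
```

Passing a different function as `corr` changes the ranking that the wrapper stage receives, which is the mechanism by which the choice of correlation criterion can change the final feature subset.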
Results: We systematically compared the performance of PREDAIP with that of the existing tools on an independent test dataset. The comparison demonstrates the effectiveness and power of PREDAIP.
Conclusion: The correlation criterion used in emRMR affects which optimal feature subset is selected at the SFS-wrapper stage, which justifies considering different correlation criteria in emRMR.
Keywords: Machine learning, feature selection, enhanced minimal redundancy maximal relevance, sequential forward selection algorithm, prediction, extremely randomized tree.
Graphical Abstract