Abstract
Protein misfolding results in amyloidosis, a condition in which amyloid motifs build up in neuronal tissues and lead to life-threatening organ failure. Understanding the underlying cause of incorrect protein folding, and identifying the peptide motifs involved, is therefore important. This work proposes an ensemble approach that fuses diverse structural information with sequence-based features to predict amyloid motifs computationally. The diversity of the feature space comes from two sources: structural statistics based on root mean square deviation, and sequence-centered features. The sequence-centered features comprise sequence similarity, which preserves the sequence-order effect, and physico-chemical properties optimized through a novel hybridization of a machine learning classifier followed by a swarm intelligence algorithm. In discriminating amyloid from non-amyloid motifs, the proposed approach achieves considerably better sensitivity, specificity, and balanced accuracy than available predictors. Furthermore, the nested ensemble classifier and the bootstrap evaluation protocol both play a significant role in improving prediction accuracy.
Keywords: Amyloid motifs, ensemble approach, physico-chemical properties, protein misfolding, protein sequence similarity, protein structure similarity.