Abstract
Background: Traditional quantitative structure-property/activity relationships (QSPRs/QSARs) are based on representing molecular structure by a molecular graph or by the simplified molecular input-line entry system (SMILES). Developing predictive models for large molecules in general, and for peptides in particular, is an attractive idea. However, representing these molecules by a molecular graph or SMILES is problematic owing to their large size. A possible alternative to SMILES is to represent peptides as sequences of amino acid abbreviations.
Method: Models for the hemolysis and cytotoxicity of peptides are suggested. These models represent peptides by their amino acid sequences. Correlation weights, calculated for each amino acid using the Monte Carlo method, are the basis for quantitative sequence-activity relationships (QSARs) for antimicrobial peptides. The correlation weights are used to build optimal descriptors, which are correlated with experimental data for hemolysis and cytotoxicity. The basic hypothesis is that if the optimal descriptors correlate with the endpoints of peptides in the training set, they should also correlate with the endpoints in the validation set.
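The method above can be illustrated with a minimal sketch. It is not the CORAL implementation. It assumes that the optimal descriptor is the sum of the correlation weights of the amino acids in a sequence, and that the Monte Carlo step is a simple random perturbation accepted only when the training-set correlation improves. The function names, the step size, and the toy sequences and endpoint values are all hypothetical and serve only as illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids

def descriptor(sequence, weights):
    """Assumed form of the optimal descriptor: sum of per-residue correlation weights."""
    return sum(weights[aa] for aa in sequence)

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def optimize_weights(train, n_steps=2000, step=0.1, seed=0):
    """Monte Carlo search: perturb one weight at a time, keep the change
    only if the descriptor-endpoint correlation on the training set increases."""
    rng = random.Random(seed)
    weights = {aa: rng.uniform(0.5, 1.5) for aa in AMINO_ACIDS}
    seqs = [s for s, _ in train]
    ys = [y for _, y in train]

    def score(w):
        return pearson([descriptor(s, w) for s in seqs], ys)

    best = score(weights)
    for _ in range(n_steps):
        aa = rng.choice(AMINO_ACIDS)
        old = weights[aa]
        weights[aa] = old + rng.uniform(-step, step)
        new = score(weights)
        if new > best:
            best = new          # accept the perturbation
        else:
            weights[aa] = old   # reject and restore the previous weight
    return weights, best

def fit_line(xs, ys):
    """Least-squares fit of endpoint = c0 + c1 * descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    c1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - c1 * mx, c1

# Hypothetical toy training set: (sequence, endpoint). Values are invented.
train = [("KLAKLAK", 1.2), ("GIGKFLH", 0.4), ("KWKLFKK", 1.9),
         ("AAGGSS", 0.1), ("FLPLIGR", 0.8)]

weights, r_train = optimize_weights(train)
c0, c1 = fit_line([descriptor(s, weights) for s, _ in train], [y for _, y in train])

def predict(sequence):
    """Predicted endpoint for a new sequence under the fitted one-variable model."""
    return c0 + c1 * descriptor(sequence, weights)
```

With so few peptides and twenty free weights, a sketch like this will overfit the training set, which is why the hypothesis in the text is checked on a separate validation set.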
Results: Examination of the correlations between the optimal descriptors and the antimicrobial activity of peptides (cytotoxicity or hemolysis) has shown that these models have good predictive potential.
Conclusion: The suggested approach can be used as a tool to develop predictive models of the biological activity of peptides as a mathematical function of their amino acid sequences.
Keywords: QSAR, hemolysis, cytotoxicity, antimicrobial peptides, Monte Carlo method, CORAL software.
Graphical Abstract