Abstract
Many important protein interactions in cell signaling networks and post-translational modification events are mediated by the binding of a globular domain in one protein to a short peptide stretch in another. In this study, we describe a structure-level protocol for quantitatively predicting the weak affinities of such interactions. The method uses the crystal structure of the CAL PDZ domain in complex with a CFTR C-terminus mimic peptide as a template to construct structure models of other congeneric domain-peptide complexes. Independent residue-pair interactions between the domain and the peptide in the constructed complexes are then computed and correlated with the experimentally measured affinities of 80 CAL PDZ binders using partial least squares (PLS) and random forest (RF). We demonstrate that (a) the nonlinear RF is time-consuming but performs much better than the linear PLS in modeling and predicting the binding affinity of domain-peptide interactions, (b) the proposed structure-based strategy is more effective and accurate than traditional sequence-based methods in capturing the binding behavior and interaction information of the domain with the peptide, and (c) only a few residue pairs at the complex interface contribute significantly to domain-peptide binding.
Keywords: Domain-peptide binding, residue-pair interaction, bioinformatics analysis, machine learning.