Abstract
In this paper we investigate possible correlations between the number of putative interaction patches of a given protein, as inferred by an algorithm that we have developed, and its degree (the number of edges of the protein node in a protein interaction network). We focus on the human cell cycle, which, compared with other biological processes, comprises the largest number of proteins whose structure is known at atomic resolution both as monomers and as interacting complexes. To predict interaction patches we develop an HM-SVM based method that reaches 71% overall accuracy, with a correlation coefficient of 0.43, on a non-redundant set of protein complexes. To test the biological meaning of our predictions, we also explore whether predicted interaction patches contain energetically important residues and/or disease-related mutations, and find that they are endowed with both features. Based on this, we propose that mapping all the predicted interaction patches onto a protein bridges the molecule to the interactome at the cell level. To test this hypothesis we retrieve interaction data from interaction databases and find that the number of predicted interaction patches significantly correlates (Pearson correlation value > 0.3) with the number of known interactions (edges) per protein in the human interactome, as contained in MINT and IntAct. We also show that the correlation increases (Pearson correlation value > 0.5) when the subcellular co-localization and the co-expression levels of the interacting partners are taken into account.
Keywords: prediction of protein interaction patches, protein-protein interaction, interactome, subcellular co-localization, co-expression, Pearson's correlation index, correlation, mapping, genome-wide protein interaction networks, human cell cycle, hot spot prediction, co-localization filtering, scoring indices, cyclin-dependent kinases, Pearson correlation coefficient
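The comparison summarized in the abstract, predicted patch counts against node degree with and without a co-localization filter, can be illustrated with a minimal Python sketch. This is not the pipeline used in the paper: the protein names, patch counts, edges, and compartment annotations below are hypothetical toy values chosen only to show the computation, and the helper names `degrees`, `pearson`, and `co_localized` are assumptions introduced for this example.

```python
from math import sqrt

def degrees(proteins, edges):
    """Count the edges incident to each protein (node degree)."""
    deg = {p: 0 for p in proteins}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)

# Hypothetical toy data, NOT taken from the paper.
patch_counts = {"A": 1, "B": 2, "C": 2, "D": 3}
edges = [("A", "D"), ("B", "D"), ("C", "D"), ("B", "C"), ("A", "B")]
compartments = {
    "A": {"nucleus"},
    "B": {"cytoplasm"},
    "C": {"cytoplasm"},
    "D": {"nucleus", "cytoplasm"},
}

def co_localized(a, b):
    """Keep an edge only if both partners share at least one compartment."""
    return bool(compartments[a] & compartments[b])

proteins = sorted(patch_counts)
patches = [patch_counts[p] for p in proteins]

# Correlation using every edge in the toy network.
deg_all = degrees(proteins, edges)
r_all = pearson(patches, [deg_all[p] for p in proteins])

# Correlation after discarding edges between non-co-localized partners.
filtered_edges = [(a, b) for a, b in edges if co_localized(a, b)]
deg_filtered = degrees(proteins, filtered_edges)
r_filtered = pearson(patches, [deg_filtered[p] for p in proteins])

print(f"all edges:      r = {r_all:.3f}")
print(f"co-localized:   r = {r_filtered:.3f}")
```

The toy data are deliberately constructed so that filtering removes one edge and raises the correlation; this mirrors the direction of the reported effect but says nothing about its magnitude on real interactome data. A co-expression filter could be added in the same way, as a second predicate applied to each edge.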