Abstract
A number of representations of protein networks have been reported. Although the existence of multiple types of interactomes and of relationships between proteins has been accepted and discussed extensively, these concepts and hypotheses have not yet been extensively explored using machine learning frameworks for protein interaction prediction in a multi-class setting. This is essentially due to two reasons: missing values in the features, and the heterogeneity and often unclear annotation of protein interaction data. This has motivated our attempt to build a set of universal features attributable to any set of protein pairs, generating a universal feature space in which evolutionary constraints show their effects and play a central role. We call this space the sequence properties space, and the features generating it the derived features. We have probed an integrated version of the sequence properties space for its ability to properly represent the different kinds of available interactomes.
Keywords: Feature space, protein interaction prediction, sample energy.