Abstract
High-throughput experimental technologies continue to reshape the study of systems biology, and investigators are understandably eager to harness their power. Studying protein-protein interactions on these platforms, however, presents numerous production and bioinformatics challenges. In the prediction of protein-protein interaction sites in particular, feature extraction, feature representation, the choice of prediction algorithm, and the analysis of results have become increasingly problematic. Powerful, efficient methods for inferring protein interface residues from primary sequence and/or 3D structure are therefore critical if the research community is to accelerate research and publication. Machine learning-based approaches currently draw the most attention for predicting protein interaction sites. This review describes the state of the whole pipeline when machine learning strategies are applied to infer protein interaction sites.
Keywords: Bioinformatics, machine learning, protein feature, protein interaction site, systems biology, whole pipeline, evolutionary conservation, sequence entropy, amino acid composition