Abstract
Correlated mutation is regarded as a phenomenon driven by the need to maintain the structure and/or function of a protein during its evolution. Because it is closely tied to the mechanisms underlying protein structure and function, considerable effort has been devoted to revealing the relationship between correlated mutations and the structure and function of proteins. Over the past few decades, a variety of coevolutionary analysis algorithms have been developed and applied to study different aspects of protein structure and function, such as the prediction of disulfide bonds, functionally important residues, residue-residue contacts and protein-protein interactions. Although considerable progress has been made, obstacles remain in the identification, evaluation and interpretation of correlated mutations. In this review, we discuss several issues essential to overcoming these obstacles in coevolution analysis, including alignment size bias, phylogenetic bias, algorithm evaluation and the interpretation of coevolution. In particular, we focus on the inconsistent results produced by different algorithms and discuss possible reasons for this discrepancy. We also discuss future challenges and research directions in coevolution analysis.
Keywords: Coevolution analysis, correlated mutation algorithms, molecular dynamics simulation, multiple sequence alignment, phylogenetic bias, phylogenetic correlation, residue coevolution, mutation data matrix, molecular dynamics, Conserved Domain Database