Abstract
Background and Objective: The rapidly growing volume of available protein data creates a need for low-complexity computational methods that can accurately infer protein structure, function, and evolution.
Method: A new description of proteins, based on five topological indices of a star-like graph representation and the occurrence frequencies of the 20 amino acids, is proposed for comparing the similarity of proteins.
Results: A phylogenetic tree of eight ND6 proteins was constructed to demonstrate the effectiveness and rationality of the approach. The method was likewise applied to the RNA polymerase proteins of several influenza virus subtypes to infer their phylogenetic relationships. The results show that the phylogenetic relationships among influenza virus RNA polymerases are closely related to the host species and the geographical distribution of the viruses.
Conclusion: This novel approach is based on a mapping that can be recovered mathematically without loss of information.
Keywords: RNA polymerase, star-like graph, pseudo amino acid composition, phylogenetic tree, alignment-free, topological indices.
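As an illustration of the occurrence-frequency part of the descriptor named under Method, the following minimal sketch computes a 20-dimensional amino acid frequency vector and compares two sequences. It is a sketch under stated assumptions, not the authors' implementation: the five topological indices are not specified in this abstract and are omitted, Euclidean distance is assumed as the comparison measure, and the names `aa_frequencies` and `euclidean_distance` are hypothetical.

```python
from collections import Counter
import math

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequence):
    """Return the occurrence frequency of each of the 20 amino acids.

    Characters outside the 20 standard letters (e.g. 'X') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors.

    Assumption: the abstract does not name the distance measure used.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

A pairwise distance matrix built this way could then be passed to a standard tree-building method; in the full approach each vector would also carry the five topological indices.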