Abstract
Background: In the post-genome age, understanding the functions of genes and proteins has become increasingly urgent. Because experimental methods are usually costly and time-consuming, computational prediction is recognized as an alternative approach. In developing a predictive method for functional genomics and proteomics, one of the most important steps is to represent biological sequences, which vary in length, in a fixed-length numerical form that can be further analyzed using machine learning algorithms. Chou’s pseudo-amino acid compositions and the pseudo k-nucleotide compositions are algorithms designed for this purpose.
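To illustrate what a fixed-length numerical form means, the following minimal sketch computes the plain k-nucleotide composition of a DNA sequence. It is not an implementation of the pseudo k-nucleotide composition, which additionally appends sequence-order correlation terms; the function name `kmer_composition` and its parameters are hypothetical and chosen only for this example.

```python
from itertools import product


def kmer_composition(seq, k=2, alphabet="ACGT"):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Illustrative only: this is the plain k-nucleotide composition,
    without the sequence-order terms of the pseudo composition.
    """
    seq = seq.upper()
    # Enumerate all len(alphabet)**k possible k-mers so that every
    # sequence maps to a vector of the same length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with other symbols, e.g. N
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid window: return all zeros.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


short_vec = kmer_composition("ACGTAC")
long_vec = kmer_composition("ACGTACGTTTGACCA")
print(len(short_vec), len(long_vec))
```

Both calls return a vector with 4^k entries (16 for k = 2) regardless of the input length, which is the property that lets sequences of different lengths be passed to a standard machine learning algorithm.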
Conclusion: Since these algorithms first appeared, several software tools have been developed to implement them and to facilitate their application. Because these tools were built with different technologies and for different application scenarios, this short review briefly surveys their technical aspects.
Keywords: Pseudo-amino acid compositions, pseudo k-nucleotide compositions, sequence representations, Chou’s five-step rule, functional genomics, functional proteomics.