Abstract
Background: In biology, genetic information is translated into its corresponding protein sequences using the Universal Genetic Code. Of all the possible combinations of the 20 amino acids, only a small fraction occur as natural proteins. This leaves a large number of unexplored protein sequences, which include the Never Born Proteins. A Never Born Protein is a theoretically possible protein that does not occur in nature but may be selected by evolution in the future.
Objective: In this study, the "GenNBPSeq" online web server is developed to generate Never Born Protein Sequences and to analyze their sequence and structural stability.
Methods: The “GenNBPSeq” server is based on the Gray Code and Partitioned Gray Code representations of the Universal Genetic Code, combined with a novel Toeplitz matrix approach. Sequence and structure analyses of sample Never Born Protein sequences were carried out with various bioinformatics tools.
Results: The “GenNBPSeq” server is available at http://bioinfo.bdu.ac.in/nbps, where users can generate Never Born Protein sequences and download them in FASTA format. The Never Born Protein sequences obtained by the Toeplitz matrix approach share the same amino acid composition. They also form protein secondary and 3-dimensional structures with intrinsic stability.
Conclusion: This study conjectures that the Never Born Protein sequences generated by the “GenNBPSeq” server using the Toeplitz matrix approach may exhibit intrinsic structural stability. Synthesizing these Never Born Proteins and analyzing their biological applications are major research areas in Systems and Synthetic Biology.
Keywords: Gray Code, Universal Genetic Code, Toeplitz Matrices, Never Born Protein Sequences, Molecular Modelling, Molecular Dynamics Simulations.
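The abstract names three building blocks without spelling them out: a Gray code, a Toeplitz matrix, and FASTA output. The Python sketch below illustrates each in generic form only. It is not the GenNBPSeq algorithm, whose details are not given here. The choice of a circulant matrix (a special Toeplitz matrix), the example peptide `MKTA`, and the header prefix `NBP_candidate` are all assumptions made for illustration.

```python
from typing import List


def gray_code(n_bits: int) -> List[str]:
    """Binary-reflected Gray code: neighbouring codes differ in exactly one bit."""
    return [format(i ^ (i >> 1), f"0{n_bits}b") for i in range(2 ** n_bits)]


def toeplitz(first_col: str, first_row: str) -> List[str]:
    """Build a Toeplitz matrix of symbols; entry (i, j) depends only on i - j."""
    if first_col[0] != first_row[0]:
        raise ValueError("first column and first row must share their first symbol")
    return [
        "".join(
            first_col[i - j] if i >= j else first_row[j - i]
            for j in range(len(first_row))
        )
        for i in range(len(first_col))
    ]


def circulant_rows(seq: str) -> List[str]:
    """ASSUMPTION: use a circulant matrix, a special case of a Toeplitz matrix.

    Every row is then a cyclic rearrangement of `seq`, so all rows have the
    same amino acid composition as the seed sequence.
    """
    return toeplitz(seq, seq[0] + seq[:0:-1])


def to_fasta(seqs: List[str], prefix: str = "NBP_candidate") -> str:
    """Format sequences as FASTA records; the header prefix is hypothetical."""
    return "".join(f">{prefix}_{i}\n{s}\n" for i, s in enumerate(seqs, start=1))


if __name__ == "__main__":
    print(gray_code(2))                 # 2-bit Gray code, one code per nucleotide
    rows = circulant_rows("MKTA")       # hypothetical 4-residue seed peptide
    print(to_fasta(rows), end="")
```

Under this assumption every row is a permutation of the seed, which is one simple way a matrix construction could yield sequences with identical amino acid composition, as the Results describe.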