Abstract
Background: The coronavirus disease has led to an exhaustive exploration of the SARS-CoV-2 genome. Despite the amount of information accumulated, the prediction of short RNA motifs encoding peptides that mediate protein-protein or protein-drug interactions has received limited attention.
Objective: The study aims to predict short RNA motifs that are interspersed in the SARS-CoV-2 genome.
Methods: We used a method based on 14 trinucleotide families, each composed of triplets that contain the same nucleotides in all possible configurations, to find short peptides with biological relevance. The novelty of the approach lies in using these families to examine how they are distributed across the genomes of different CoV genera and then comparing the distributions of the families with one another.
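The grouping of triplets into families can be illustrated with a minimal sketch. It assumes that a family is identified by the nucleotides a triplet contains, regardless of order, and that triplets are read in-frame without overlap. The function names `family_key` and `family_counts` are hypothetical, and the grouping is not restricted to the 14 families used in the study, which are not enumerated here.

```python
from collections import Counter


def family_key(triplet: str) -> str:
    """Return a label shared by every ordering of the same three nucleotides.

    Assumption: a family is defined by nucleotide composition only,
    so ATG, GTA and TGA all map to the label 'AGT'.
    """
    return "".join(sorted(triplet.upper()))


def family_counts(seq: str, frame: int = 0) -> Counter:
    """Count in-frame, non-overlapping triplets per family.

    Triplets containing anything other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(frame, len(seq) - 2, 3):
        triplet = seq[i:i + 3]
        if set(triplet) <= set("ACGT"):
            counts[family_key(triplet)] += 1
    return counts


# Toy sequence for illustration only (not taken from any CoV genome).
toy = "ATGGTATGACGAAAA"
print(family_counts(toy))
```

Comparing such per-family counts between genomes, or between windows of one genome, is one way to approach the distribution comparison described above; the selection criterion used in the study is not reproduced here.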
Results: We identified the distributions of trinucleotide families in different CoV genera and how they relate to one another, using a selection criterion that identified short RNA motifs. The motifs were reported to be conserved in SARS-CoVs; in the remaining CoV genomes analysed, the motifs contained exclusively different configurations of the trinucleotides A, T, G and A, C, G. Eighty-eight short RNA motifs, ranging in length from 12 to 49 nucleotides, were found: 50 in the 1a polyprotein-encoding ORF, 27 in the 1b polyprotein-encoding ORF, 5 in the spike-encoding ORF, and 6 in the nucleocapsid-encoding ORF. Although some motifs (~27%) were found to be intercalated in or attached to functional peptides, most have not yet been associated with any known function.
Conclusion: Some of the trinucleotide family distributions in different CoV genera are not random; these families are present in short peptides that, in many cases, are intercalated in or attached to functional sites of the proteome.