Abstract
Background: RNA editing adds to the diversity of post-transcriptional sequence changes. Currently, RNA editing sites are mostly detected with Sanger sequencing or second-generation sequencing. However, Sanger-based detection is limited by interfering background peaks when direct sequencing is used and by the number of clones when clone sequencing is used, while second-generation sequencing is constrained by its short read length.
Objective: We aimed to design a pipeline that accurately detects RNA editing sites in full-length long-read amplicons, for studies that focus on a few specific genes of interest.
Methods: We developed a novel high-throughput pipeline for detecting RNA editing sites based on PacBio circular consensus sequencing, which combines high accuracy with high throughput and long-read coverage. We tested the pipeline on cytosolic malate dehydrogenase in the hard-shelled mussel Mytilus coruscus and validated the results by direct Sanger sequencing.
Results: Data generated from PacBio circular consensus sequencing (CCS) amplicons from three mussels were first filtered by quality and then selected by open reading frame. After filtering, 225-2047 sequences per mussel were used to identify RNA editing sites. By comparing these with the corresponding genomic DNA sequences, we extracted 227-799 candidate RNA editing sites, excluding heterozygous sites. We then identified 7-11 final RNA editing sites using a new error model designed specifically for RNA editing site detection. All resulting RNA editing sites agreed with the Sanger sequencing validation.
Conclusion: We report a method with a near-zero error rate for identifying RNA editing sites in long-read amplicons using PacBio CCS sequencing.
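The candidate-extraction and filtering steps summarized in the Results can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the abstract does not describe the error model, so a simple one-sided binomial test against an assumed per-base error rate stands in for it. The function names (`candidate_sites`, `binom_tail`, `call_sites`), the default `error_rate` and `alpha` values, and the assumption that reads are already aligned to the genomic sequence without gaps are all hypothetical.

```python
from collections import Counter
from math import comb


def candidate_sites(reads, genomic, heterozygous=frozenset()):
    """Count cDNA/genomic mismatches per (position, genomic base, read base).

    Assumes each read is already aligned to `genomic` without gaps;
    reads of a different length are skipped. Positions listed in
    `heterozygous` are excluded, mirroring the abstract's description.
    Returns the mismatch counts and the number of reads actually used.
    """
    counts = Counter()
    depth = 0
    for read in reads:
        if len(read) != len(genomic):
            continue  # sketch only handles gap-free, full-length reads
        depth += 1
        for pos, (g, r) in enumerate(zip(genomic, read)):
            if r != g and pos not in heterozygous:
                counts[(pos, g, r)] += 1
    return counts, depth


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_sites(counts, depth, error_rate=0.01, alpha=1e-3):
    """Keep candidates whose mismatch count is unlikely under sequencing error.

    Stand-in for the paper's error model (an assumption): a candidate is
    kept if observing at least `k` mismatches in `depth` reads has
    probability below `alpha` at the assumed per-base `error_rate`.
    """
    return sorted(
        site for site, k in counts.items()
        if binom_tail(k, depth, error_rate) < alpha
    )
```

Under these assumptions, a mismatch seen in most reads at a non-heterozygous position is retained, while a mismatch seen in a single read is attributed to sequencing error and discarded.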