Abstract
Background: The standard approach to transcriptomic profiling uses high-throughput short-read sequencing, a technology dominated by Illumina. Short reads, however, are limited in transcriptome assembly and in recovering full-length transcripts, because transcripts vary in length and many genes have multiple alternatively spliced isoforms. Recent advances in long-read sequencing by Oxford Nanopore Technologies (ONT) offer both cDNA and direct RNA sequencing and have brought a paradigm change in sequencing technology to greatly improve assembly and expression estimates. ONT sequences molecules without fragmentation, yielding ultra-long reads that allow entire genes and transcripts to be fully characterized. In addition, the direct RNA sequencing method circumvents the reverse transcription and amplification steps.
Objective: In this study, RNA sequencing methods were assessed by comparing data from Illumina (ILM), ONT cDNA (OCD) and ONT direct RNA (ODR).
Methods: The sensitivity and specificity of isoform detection were determined from data generated by the Illumina, ONT cDNA and ONT direct RNA sequencing technologies, using Saccharomyces cerevisiae as a model. Comparative studies were conducted with two pipelines to detect isoforms, novel genes and variable gene length.
Results: Mapping metrics and qualitative profiles for the different pipelines are presented to aid understanding of these disruptive technologies. Variability arising from the sequencing technology and from the analysis pipeline was studied.
Keywords: Next-generation sequencing, second-generation sequencing, third-generation single-molecule sequencing, Oxford Nanopore Technologies, Illumina, principal component analysis.