Abstract
Transcriptome assembly plays a critical role in studying biological properties and examining gene expression levels in specific cells. It is also the basis of many downstream analyses. As sequencing becomes faster and cheaper, massive amounts of sequencing data continue to accumulate, and a large number of assembly strategies based on different computational methods and experiments have been developed. Performing transcriptome assembly efficiently, with high sensitivity and accuracy, has therefore become a key issue. In this work, we explore the issues in transcriptome assembly across different sequencing technologies. Specifically, transcriptome assemblies built from next-generation sequencing reads are divided into reference-based assemblies and de novo assemblies. Examples from different species illustrate that long reads produced by third-generation sequencing technologies can cover full-length transcripts without assembly. In addition, transcriptome assemblies that use Hybrid-Seq methods and other tools are summarized. Finally, we discuss future directions for transcriptome assembly.
Keywords: Transcriptome assembly, sequencing technologies, Hybrid-Seq, full-length transcript, annotation, genomes.