Abstract
In recent decades, scientific research has undergone an important change in how studies are conceived. The development of new analytical technologies capable of generating large amounts of data led to a transition from the reductionist scientific model to the holistic one. Among these “high-throughput” technologies, next-generation sequencing (NGS) has greatly expanded knowledge of complex living systems. Bioinformatics and biostatistics have developed alongside NGS platforms to support more accurate data analysis and management. NGS technology can be applied to both DNA and RNA, originally for the detection of variants and the analysis of gene expression, respectively. In recent years, however, calling variants from RNA-seq data has become increasingly feasible. Here we discuss the analytical approaches that distinguish the analysis of DNA sequencing data from that of RNA sequencing data, highlighting the informative potential of RNA-seq data beyond the quantification of gene expression. We therefore discuss the application of variant calling pipelines to transcriptome data. Furthermore, identifying single nucleotide variants from RNA samples makes it possible to characterize two important mechanisms that regulate gene expression: RNA editing and genomic imprinting. The study of these two mechanisms is probably the most stimulating opportunity offered by RNA-seq and clearly requires adequate bioinformatics support, which is now being developed.
Keywords: Alignment, ABRA2, CaSpER, Diploid-SQUID, eSNV-Detect, Editome, Epigenetics, GATK, Imprinting, JACUSA, RNA sequencing, Transcriptome analysis, SNP discovery, RNA editing, SAMtools, RNAIndel, RVboost, SNPiR, VaDiR, Variant calling.