RNA sequencing, or RNA-seq, is a high-throughput sequencing technology used to study the transcriptome of a cell. Transcriptomics helps to interpret the functional elements of a genome and identify the molecular constituents of an organism. It also helps in understanding the development of an organism and the occurrence of diseases.
Before the development of RNA-seq, microarray-based methods and Sanger sequencing were used for transcriptome analysis. Microarray-based techniques had drawbacks such as limited coverage and dependence on existing knowledge of the genome, while Sanger sequencing was limited by low throughput, high cost, and imprecise quantification of transcript abundance. In contrast, RNA-seq is a next-generation sequencing (NGS) technology that provides relatively higher coverage and higher throughput. It also generates additional data that can help discover novel transcripts, reveal allele-specific information, and identify alternatively spliced genes.
The RNA-seq process can be divided into several steps. The first step is the extraction and isolation of the RNA of interest from the sample, followed by the conversion of this RNA to complementary DNA (cDNA). This conversion makes the molecule more stable, easier to handle, and compatible with an NGS workflow. Next, sequences known as adapters are attached to the cDNA fragments to enable sequencing. The most widely used NGS platforms for RNA-seq include SOLiD, Ion Torrent, and HiSeq. The depth to which the library is sequenced varies depending on the end goal of the experiment. Sequencing can also be performed in single-read or paired-end mode. Single-read sequencing, which reads each DNA fragment from only one end, is cheaper and faster, while paired-end sequencing, which reads each fragment from both ends, is more expensive and time-consuming. In addition, information about which DNA strand was transcribed can be retained through a strand-specific protocol.
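The difference between single-read and paired-end sequencing can be illustrated with a short Python sketch. It is a simplified model that ignores adapters, sequencing errors, and base quality, and it is not how any particular sequencer or library kit works internally. The function names (`reverse_complement`, `rna_to_cdna`, `single_read`, `paired_end_reads`) and the example sequences are hypothetical and chosen only for illustration.

```python
# Simplified, illustrative model of cDNA conversion and read generation.
# All names and sequences here are assumptions made for this sketch.

# Watson-Crick base pairing for DNA; N stands for an unknown base.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def rna_to_cdna(rna: str) -> str:
    """Return the first-strand cDNA for an RNA sequence.

    The cDNA is complementary to the RNA template, so it is the reverse
    complement of the RNA written in DNA letters (U replaced by T).
    """
    return reverse_complement(rna.upper().replace("U", "T"))


def single_read(fragment: str, read_length: int) -> str:
    """Single-read mode: sequence the fragment from one end only."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return fragment[:read_length]


def paired_end_reads(fragment: str, read_length: int) -> tuple[str, str]:
    """Paired-end mode: sequence the fragment from both ends.

    Read 1 starts at the 5' end of the given strand; read 2 starts at the
    5' end of the opposite strand, so it is reported as a reverse complement.
    If read_length exceeds the fragment length, each read covers the whole
    fragment (real reads would continue into the adapter).
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2


if __name__ == "__main__":
    fragment = "ATGGCCATTGTAATGGGCCGC"  # made-up cDNA fragment
    print(single_read(fragment, 5))
    print(paired_end_reads(fragment, 5))
```

Because the two reads of a pair come from opposite ends of the same fragment, aligning both constrains where the fragment can lie on the reference, which is one reason the more expensive paired-end mode is often chosen for detecting splicing events.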
The sequencing reads are then aligned to a reference genome and used to generate a corresponding RNA sequence map. Depending on the nature of the analysis, different bioinformatics tools can be used to process the data. For example, BitSeq and RSEM can help quantify expression levels, whereas MISO can be used to quantify alternatively spliced genes.
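To give a sense of what quantifying expression level means in practice, the sketch below converts per-gene read counts into transcripts per million (TPM), a commonly reported expression unit. It is only a minimal illustration: tools such as RSEM and BitSeq use probabilistic models to assign reads that map to several transcripts, which this sketch does not attempt. The function name `counts_to_tpm`, the gene names, and all numbers are hypothetical.

```python
# Minimal TPM normalization sketch. Gene names, counts, and lengths are
# made-up values used only for illustration.


def counts_to_tpm(
    counts: dict[str, float], lengths_bp: dict[str, float]
) -> dict[str, float]:
    """Convert raw read counts to transcripts per million (TPM).

    Step 1: divide each count by the gene length in kilobases, which
            corrects for longer genes collecting more reads.
    Step 2: rescale the length-normalized values so they sum to one
            million, which corrects for sequencing depth.
    """
    reads_per_kb = {}
    for gene, count in counts.items():
        length = lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"gene length must be positive: {gene}")
        reads_per_kb[gene] = count / (length / 1000.0)

    total = sum(reads_per_kb.values())
    if total == 0:
        # No reads at all: report zero expression for every gene.
        return {gene: 0.0 for gene in counts}
    return {gene: value / total * 1_000_000 for gene, value in reads_per_kb.items()}


if __name__ == "__main__":
    counts = {"geneA": 100, "geneB": 100}
    lengths_bp = {"geneA": 1000, "geneB": 2000}
    for gene, value in counts_to_tpm(counts, lengths_bp).items():
        print(gene, round(value, 2))
```

In this example both genes receive the same number of reads, but geneB is twice as long, so its TPM is half that of geneA. This is the kind of length correction that makes expression values comparable between genes.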