Gene Expression with Targeted RNA-seq
When planning a gene expression experiment, most people start by thinking about how many microarrays they will need to order for their project. While arrays have long been the technology of choice for gene expression experiments, next-generation sequencing methods have surpassed hybridization techniques by offering the ability to detect novel transcripts, increased sensitivity and specificity, and the elimination of errors and signal noise that are common in hybridization-based experiments. Next-generation sequencing can also provide the depth you need to detect rare and low-abundance transcripts, and it provides absolute rather than relative expression levels.
While the advantages of RNA-seq in providing comprehensive and reliable gene expression data are clear, investigators are rightly concerned about cost and the complexity of RNA-seq data analysis. Many feel they lack the experience to analyze RNA-seq data, or they worry about the cost of running a large project with RNA-seq.
This is where Targeted RNA-seq comes in.
Targeted RNA-seq can use either large amplicon panels or 3′ polyadenylated transcript selection to target the mRNAs you are interested in comparing. Because these targets represent a small portion of the transcriptome, more samples can be sequenced at higher depth than with whole-transcriptome sequencing. In addition, because there is only one count per transcript, the data is simple to analyze without the need for transcript length normalization. Want another reason to try targeted RNA-seq? It also works on mildly to moderately degraded RNA extracted from formalin-fixed, paraffin-embedded (FFPE) samples.
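To make the "no length normalization" point concrete, here is a minimal sketch of scaling raw counts by library size only (counts per million, CPM). The gene names and counts are made up for illustration, and `counts_per_million` is a hypothetical helper, not part of any particular analysis package.

```python
def counts_per_million(counts):
    """Scale raw per-gene counts for one sample to counts per million (CPM).

    Only library size is corrected for. No transcript length term appears,
    on the assumption that each transcript contributes one count.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical raw counts for one sample (illustrative values only).
sample = {"GENE_A": 1500, "GENE_B": 300, "GENE_C": 200}
cpm = counts_per_million(sample)

for gene, value in cpm.items():
    print(f"{gene}\t{value:.1f}")
```

In practice you would likely use an established differential expression package rather than hand-rolled scaling; the sketch only shows that library size is the one quantity being corrected for.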
Working with multiple different investigators, we have found ways to help you optimize your targeted RNA-seq experimental design. What is the most common question we hear?
How many reads do I need for each sample?
What are your main genes of interest, and what are their expression levels in your tissues?
If your main genes of interest have low baseline expression in the samples you are studying, you will need a higher overall read depth to identify expression differences.
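The relationship between baseline expression and depth can be sketched with simple arithmetic: if a gene makes up a given share of the library, the expected number of reads for it scales linearly with total depth. The function below is a rough back-of-the-envelope sketch. The CPM values and the minimum count of 30 are assumptions chosen for illustration, not recommended thresholds.

```python
import math


def reads_needed(gene_cpm, min_count):
    """Total reads per sample so a gene is expected to reach `min_count` reads.

    gene_cpm:  the gene's share of the library in counts per million
               (assumed known from prior data or a pilot run).
    min_count: hypothetical minimum number of reads wanted for that gene.
    """
    if gene_cpm <= 0:
        raise ValueError("gene_cpm must be positive")
    return math.ceil(min_count * 1_000_000 / gene_cpm)


# Illustrative comparison: a moderately expressed gene vs. a low-expression one.
moderate = reads_needed(gene_cpm=50, min_count=30)
low = reads_needed(gene_cpm=1, min_count=30)
print(moderate, low)
```

This ignores sampling noise and biological variability, so treat the result as a lower bound on the depth to consider, not a guarantee of detecting a difference.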
How complex is your tissue?
The read depth required to identify minor expression differences in cell lines and other homogeneous cell populations is substantially lower than in a complex normal tissue like the kidney, which has up to 17 different cell types.
What is a standard number of reads to start with?
Tough question. In general, we recommend 6-8 million reads per sample for targeted RNA-seq for moderately high coverage. This is comparable to the sensitivity of microarrays. If the genes you are interested in are expressed at low levels, or if you are looking for subtle gene expression changes in a drug study of a normal organ, you may need more than 8 million reads. We often recommend running a small pilot experiment on a subset of samples to determine whether 6-8 million reads will give you the information you need.
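One possible way to use pilot data, sketched here under stated assumptions, is to computationally downsample the pilot counts and check which genes of interest still clear a minimum count at lower depth. The helper names, gene names, counts, and threshold below are all hypothetical. This is an illustration of the idea, not a description of a specific analysis pipeline.

```python
import random


def downsample_counts(counts, fraction, seed=0):
    """Simulate lower depth by keeping each read with probability `fraction`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return {
        gene: sum(1 for _ in range(n) if rng.random() < fraction)
        for gene, n in counts.items()
    }


def genes_detected(counts, min_count):
    """Return the genes whose count meets the (hypothetical) threshold."""
    return sorted(g for g, n in counts.items() if n >= min_count)


# Hypothetical pilot counts for three genes of interest in one sample.
pilot = {"GENE_A": 4000, "GENE_B": 120, "GENE_C": 12}

for fraction in (1.0, 0.5, 0.25):
    thinned = downsample_counts(pilot, fraction)
    print(fraction, genes_detected(thinned, min_count=10))
```

If genes you care about drop out as the fraction shrinks, that suggests the full pilot depth, or more, is worth budgeting for the main experiment.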