Tech Note
A complete kit for stranded RNA-seq library preparation
- Stranded information: Accurate identification of transcript strand of origin
- Ribosomal RNA depletion: Efficient rRNA removal in a complete sample preparation kit
- Integrated library preparation: High-quality sequencing data generated on Illumina platforms
Introduction
RNA sequencing (RNA-seq) is a key tool for performing expression analysis of the entire transcriptome with high sensitivity and a wide dynamic range. Random-primed cDNA synthesis kits, such as the SMARTer Stranded RNA-seq kits, are ideal for transcriptome analysis from all types of input RNA, including compromised samples. These kits are based on SMART (Switching Mechanism at 5' End of RNA Template) technology, an inherently strand-specific reverse transcription method that identifies the strand of origin with ≥99% accuracy without the need for additional preparation steps. Illumina adapters (up to 96 different indexes) are added during cDNA amplification, eliminating further library preparation steps after cDNA synthesis.
Prior to cDNA synthesis with any random-primed cDNA synthesis kit, it is important to remove ribosomal RNA (rRNA), which can represent up to 90% of total RNA. The RiboGone - Mammalian kit uses hybridization technology and RNase H digestion to bind and specifically deplete 5S, 5.8S, 18S, and 28S nuclear rRNA sequences and 12S mitochondrial RNA (mtRNA) sequences from full-length or sheared total RNA derived from human, mouse, or rat samples. (This kit does not deplete 16S mitochondrial RNA sequences, which share significant homology with some nuclear genes.) The SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian combines these two technologies into one convenient kit for complete sample preparation and cDNA synthesis.
Results
Advantages of strand-of-origin information
Maintaining strand-of-origin information in cDNA libraries for sequencing allows researchers to identify overlapping transcripts, which are common in compact bacterial genomes, and antisense transcripts, both of which would be lost with a strand-agnostic cDNA synthesis method. Using the SMARTer Stranded RNA-seq kits, we were able to correctly identify both overlapping and antisense transcripts.
Distinguishing overlapping and antisense transcripts with the SMARTer Stranded RNA-seq Kit. Panel A. RNA-seq reads from a Human Brain Poly A+ RNA cDNA library were mapped against the human genome. The SMARTer Stranded method allowed assignment of sequencing reads to the correct gene in the case of overlapping PHC1 and M6PR transcripts. Panel B. Strand-specific coverage of the CDR1 locus. Nearly all reads are antisense to the annotated transcript, a finding independently reported elsewhere (Hansen et al. 2011). Panel C. Comparison of CDR1 gene counts obtained using either a strand-agnostic or strand-aware method.
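The difference between strand-agnostic and strand-aware counting described in the caption above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used for this figure: the gene names, coordinates, and reads are hypothetical, and each read is assumed to already carry the strand of its originating transcript.

```python
from collections import Counter

# Hypothetical gene models: name -> (start, end, strand).
# The two genes overlap between positions 400 and 500 on opposite strands.
GENES = {
    "GENE_A": (100, 500, "+"),
    "GENE_B": (400, 900, "-"),
}


def count_reads(reads, genes, stranded):
    """Count reads per gene.

    reads: iterable of (position, strand) tuples, where strand is the strand
    of the originating transcript (assumed to be known from the library type).
    Reads compatible with more than one gene are tallied as 'ambiguous';
    reads compatible with no gene are left uncounted.
    """
    counts = Counter()
    for pos, strand in reads:
        hits = [
            name
            for name, (start, end, gene_strand) in genes.items()
            if start <= pos <= end and (not stranded or strand == gene_strand)
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            counts["ambiguous"] += 1
    return counts


# Hypothetical reads: three fall in the overlap, one is antisense to GENE_A.
reads = [(150, "+"), (450, "+"), (450, "-"), (480, "-"), (800, "-"), (300, "-")]

agnostic = count_reads(reads, GENES, stranded=False)
aware = count_reads(reads, GENES, stranded=True)
print("strand-agnostic:", dict(agnostic))
print("strand-aware:   ", dict(aware))
```

In this toy data set, the strand-agnostic count cannot assign the reads in the overlap and credits the antisense read at position 300 to GENE_A, whereas the strand-aware count resolves the overlap and leaves the antisense read out of the GENE_A total.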
RiboGone - Mammalian efficiently removes rRNA sequences from total RNA
Both RiboGone-treated and oligo(dT)-purified RNA sample inputs generated RNA-seq data with similarly low percentages of sequencing reads mapping to rRNA. Both intact (Human Brain Total RNA) and degraded (FFPE tissue) RNA samples are suitable for the SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian. Oligo(dT)-based methods for decreasing the number of rRNA reads also decrease the number of sequencing reads mapping to noncoding RNAs. The RiboGone method, in contrast, is based on selective hybridization to rRNA, leaving both mRNA and noncoding RNAs available as templates for the reverse transcription reaction.
Efficient rRNA removal with the RiboGone - Mammalian kit. RNA-seq libraries were generated from Human Brain Total RNA or Breast Cancer FFPE RNA using the SMARTer Stranded RNA-seq Kit. Libraries generated from RiboGone-treated RNA had rRNA read percentages as low as those from oligo(dT)-enriched RNA while retaining more noncoding reads.
High-quality sequencing data
The SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian maintains the ability of other SMARTer Stranded kits to generate RNA-seq libraries from a variety of samples including Human Universal Reference RNA (HURR; Agilent) and Human Brain Reference RNA (HBRR; Ambion). When libraries were sequenced on an Illumina MiSeq® instrument, both the HURR and HBRR samples yielded a high number of reads, with 75–76% mapped, 66–70% uniquely mapped, over 13,800 genes identified, and less than 1% of reads mapped to rRNA.
Sequence alignment metrics

| Metric | Human Universal Reference RNA (HURR) | Human Brain Reference RNA (HBRR) |
|---|---|---|
| No. of reads | 6,829,540 | 7,728,850 |
| Mapped to rRNA | 62,792 (0.9%) | 49,844 (0.7%) |
| Mapped to mitochondrial RNA | 318,006 (4.7%) | 224,939 (2.9%) |
| Mapped to RefSeq | 4,871,900 (76%) | 5,515,264 (75%) |
| Mapped uniquely to RefSeq | 4,435,123 (70%) | 4,888,340 (66%) |
| Exons | 2,311,575 (47%) | 2,712,444 (49%) |
| Introns | 2,560,325 (53%) | 2,802,820 (51%) |
| Genes identified | 14,563 | 13,839 |
Sequencing alignment metrics for HURR and HBRR libraries. 10 ng samples of intact HURR and HBRR were used as input for the SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian. RNA-seq libraries were prepared according to the kit protocol and sequenced on an Illumina MiSeq platform.
RNA-seq data obtained with the SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian for HURR and HBRR samples correlate with qPCR data for the same RNAs obtained through the MicroArray Quality Control (MAQC) study (Shi et al. 2006). The high level of correlation with MAQC (R = 0.860) suggests that the RNA-seq data were not affected by rRNA depletion with the RiboGone - Mammalian kit.
High correlation between SMARTer Stranded RNA-seq data and MAQC qPCR data. A scatter plot was used to compare differential expression data obtained from SMARTer transcriptome analysis of HURR and HBRR cDNA libraries (in Reads per Kilobase of Exon per Million Reads; RPKM) and qPCR data for HURR and HBRR (in Ct) from the MAQC project. The transcripts used in this analysis were the 623 of ~900 transcripts present in the MAQC data set that were also detected in both the HURR and HBRR SMARTer stranded RNA-seq data sets.
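The comparison described in this caption can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not specify the exact transformation applied before plotting, so the sketch assumes a log2 ratio of RPKM values (HBRR over HURR) compared against the Ct difference (HURR minus HBRR, since a lower Ct means higher expression). The transcript values are hypothetical; only the two library sizes are taken from the "Mapped to RefSeq" row of the alignment metrics table.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return read_count / (exon_length_bp / 1_000) / (total_mapped_reads / 1_000_000)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Reads mapped to RefSeq, taken from the alignment metrics table.
HURR_MAPPED = 4_871_900
HBRR_MAPPED = 5_515_264

# Hypothetical transcripts:
# (exon length in bp, HURR reads, HBRR reads, HURR Ct, HBRR Ct)
transcripts = [
    (2000, 400, 1600, 26.0, 24.1),
    (1500, 900, 450, 24.0, 25.2),
    (3000, 120, 130, 28.0, 28.1),
    (1000, 50, 800, 30.0, 26.3),
]

log2_rpkm_ratio = []
delta_ct = []
for length, hurr_reads, hbrr_reads, hurr_ct, hbrr_ct in transcripts:
    hurr = rpkm(hurr_reads, length, HURR_MAPPED)
    hbrr = rpkm(hbrr_reads, length, HBRR_MAPPED)
    log2_rpkm_ratio.append(math.log2(hbrr / hurr))
    # Assumed convention: positive value = higher expression in HBRR.
    delta_ct.append(hurr_ct - hbrr_ct)

r = pearson_r(log2_rpkm_ratio, delta_ct)
print(f"Pearson R = {r:.3f}")
```

Only transcripts detected in both libraries can enter such a comparison, because the log ratio is undefined when either RPKM value is zero. This is consistent with the caption's restriction to transcripts detected in both data sets.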
Conclusions
The SMARTer Stranded RNA-seq kits generate RNA-seq libraries from intact or degraded RNA samples that retain strand-of-origin information, which can be used to identify overlapping and antisense transcripts. Sequencing data generated with these kits identify a large number of genes, and the resulting expression measurements correlate well with MAQC qPCR data. The SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian combines the efficient rRNA removal of RiboGone - Mammalian with cDNA synthesis. The data generated with this complete kit maintain the high quality of other SMARTer Stranded RNA-seq kits.
Methods
Human Brain Poly A+ RNA was spiked with ERCC control RNA and serially diluted to prepare RNA samples containing between 100 pg and 100 ng of RNA. cDNA libraries were prepared using the SMARTer Stranded RNA-seq Kit according to the kit protocol with twelve different Illumina indexes. Libraries were sequenced on an Illumina HiSeq® 2000 instrument, with ~300M 2 x 100 bp paired-end reads.
RNA was isolated from a Breast Cancer FFPE sample (Cureline) using a NucleoSpin totalRNA FFPE kit. This RNA and Human Brain Total RNA were treated with either the RiboGone - Mammalian kit or the Magnosphere UltraPure mRNA Purification Kit, according to the specific kit protocol. Untreated total RNA was also used as input for RNA-seq library production in order to determine the percentage of rRNA reads present in the initial total RNA preparations. RNA-seq libraries were generated with the SMARTer Stranded RNA-seq Kit and sequenced on an Illumina MiSeq instrument. Reads were mapped to the hg19 genome and read distributions were determined using Picard RNA-seq Metrics.
RNA-seq libraries were generated from 10 ng samples of Human Universal Reference RNA (Agilent) and Human Brain Reference RNA (Ambion), the same RNAs used in the MAQC project (Shi et al. 2006), using the SMARTer Stranded Total RNA Sample Prep Kit - Low Input Mammalian according to the kit protocol, using 18 cycles of PCR. Libraries were sequenced on an Illumina MiSeq platform with ~7M 1 x 50 bp single-end reads per library.
Reads were trimmed with CLC Genomics Workbench and mapped to rRNA and the mitochondrial genome with CLC (% reads indicated). The unmapped reads were subsequently mapped with CLC to the human genome with RefSeq masking, producing mapped reads and uniquely mapped reads. The number of genes identified in each library was defined as the number of genes with an RPKM of at least 0.1. The number of reads that map to introns or exons is expressed as a percentage of the reads successfully mapped to RefSeq.
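As a worked check of the last two definitions, the short Python sketch below recomputes the exon and intron percentages from the counts in the alignment metrics table and applies the RPKM ≥ 0.1 detection threshold. The function names and the example RPKM values are hypothetical; the read counts are the ones reported in the table.

```python
# Read counts from the alignment metrics table.
LIBRARIES = {
    "HURR": {"refseq": 4_871_900, "exons": 2_311_575, "introns": 2_560_325},
    "HBRR": {"refseq": 5_515_264, "exons": 2_712_444, "introns": 2_802_820},
}


def percent_of_refseq(library, feature):
    """Express a feature's read count as a percentage of RefSeq-mapped reads."""
    return 100 * library[feature] / library["refseq"]


def genes_identified(rpkm_by_gene, threshold=0.1):
    """Count genes whose RPKM meets the detection threshold."""
    return sum(1 for value in rpkm_by_gene.values() if value >= threshold)


for name, library in LIBRARIES.items():
    exon_pct = round(percent_of_refseq(library, "exons"))
    intron_pct = round(percent_of_refseq(library, "introns"))
    print(f"{name}: exons {exon_pct}%, introns {intron_pct}%")

# Hypothetical per-gene RPKM values to illustrate the threshold.
example_rpkm = {"GENE_A": 12.4, "GENE_B": 0.1, "GENE_C": 0.05, "GENE_D": 0.0}
print("genes identified:", genes_identified(example_rpkm))
```

The exon and intron counts in each library sum to the RefSeq-mapped total, so the two percentages add up to 100%.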
References
Hansen, T. B. et al. miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA. EMBO J. 30, 4414–4422 (2011).
Shi, L. et al. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat. Biotechnol. 24, 1151–1161 (2006).
Takara Bio USA, Inc.
FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES. © 2024 Takara Bio Inc. All Rights Reserved. All trademarks are the property of Takara Bio Inc. or its affiliate(s) in the U.S. and/or other countries or their respective owners. Certain trademarks may not be registered in all jurisdictions. Additional product, intellectual property, and restricted use information is available at takarabio.com.