Tech Note
A complete solution for generating stranded RNA-seq libraries from high-input total RNA
Introduction
Streamlined methods provide the opportunities necessary to push each experiment to its fullest potential. By combining time-saving techniques with high-performance reagents, studies in transcriptomics can move forward efficiently and accurately. Expression analysis of the entire transcriptome by RNA-sequencing (RNA-seq) can reap great benefits from highly sensitive, versatile, and easy-to-use protocols. Traditionally, generation of RNA-seq libraries from total RNA has been challenged by the high amounts of ribosomal RNA (rRNA) in the starting material, and lengthy protocols required to incorporate platform-specific adaptors via ligation. The SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian is a unique solution for generating indexed cDNA libraries suitable for next-generation sequencing (NGS) on any Illumina platform, starting with 100 ng–1 µg of total mammalian RNA of any quality.
Results
Fast, accurate technology for rRNA removal and library generation
Our SMARTer RNA-seq kits are based on the core SMART (Switching Mechanism at 5' End of RNA Template) technology (Chenchik et al. 1998), a streamlined process that maintains strand information, and also eliminates tedious library preparation by incorporating adaptors in reverse transcription and PCR steps. The strand-specific reverse transcription reaction maintains close to 99% accurate strand-of-origin information, allowing for the identification of overlapping transcripts and antisense transcripts. Sequencing-ready libraries are generated during PCR amplification of the cDNA, using primers containing Illumina cluster-generating sequences and indexes.
Total RNA can consist of ≥90% rRNA, making it important to remove rRNA from samples before generating RNA-seq libraries. The RiboGone technology incorporated in the protocol uses hybridization technology and RNase H digestion to bind and specifically deplete nuclear rRNA sequences (5S, 5.8S, 18S, and 28S), as well as mitochondrial rRNA sequences (12S) from human, mouse, or rat total RNA (Morlan, Qu, and Sinicropi 2012). By depleting the rRNA in samples prior to library generation, sequencing costs are lowered and mapping statistics are improved.
With the combined power of SMART and RiboGone technologies, the SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian enables you to go from total RNA to Illumina-compatible RNA-seq libraries in around five hours.
Flowchart of SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian library generation. Panel A. Depletion of rRNA from total RNA samples with RiboGone technology. Panel B. First-strand cDNA synthesis with SMART technology, incorporating Illumina Read Primers 1 and 2. Panel C. Template switching and generation of sequencing libraries with Illumina cluster-generating sequences and indexes by PCR amplification.
Reproducible sequencing data
The SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian produces reproducible RNA-seq data. Two 100-ng samples of Human Universal Reference RNA (HURR; Agilent) were processed with this kit, and the data from the two resulting libraries were compared. The high correlation between them (R = 0.99) demonstrates consistency across replicates.
Reproducibility across replicates. RNA-seq libraries were generated from two samples of 100 ng of HURR. The scatterplot illustrates correlations between the FPKMs (Fragments Per Kilobase Of Exon Per Million Fragments Mapped) from the two libraries.
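The replicate comparison above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis pipeline used for the figure: the gene names and fragment counts are hypothetical, and the log2(FPKM + 1) transform is an assumption, since the text does not state whether the correlation was computed on raw or log-scaled FPKMs.

```python
import math


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data (not from the tech note):
# gene -> (exon length in bp, fragments in replicate 1, fragments in replicate 2)
counts = {
    "GENE_A": (2000, 150, 160),
    "GENE_B": (1500, 20, 18),
    "GENE_C": (3000, 900, 870),
    "GENE_D": (1000, 5, 7),
}

total_rep1 = sum(c[1] for c in counts.values())
total_rep2 = sum(c[2] for c in counts.values())

# Assumed transform: log2(FPKM + 1), so that zero-count genes stay defined.
log_rep1 = [math.log2(fpkm(f1, length, total_rep1) + 1)
            for length, f1, _ in counts.values()]
log_rep2 = [math.log2(fpkm(f2, length, total_rep2) + 1)
            for length, _, f2 in counts.values()]

r = pearson_r(log_rep1, log_rep2)
print(f"Pearson R between replicates: {r:.3f}")
```

In practice the FPKM values would come from a quantification tool run on the aligned reads, and the correlation would be computed over all expressed genes rather than a handful.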
High-quality sequencing data
With the SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian system, researchers can generate RNA-seq libraries from a variety of samples, including HURR and HBRR (Human Brain Reference RNA; Ambion), starting from 100 ng–1 µg of total RNA. Using this kit, rRNA content was depleted from samples prior to cDNA synthesis and library generation. When sequenced, both the HURR and HBRR libraries yielded a high number of quality reads, with 88–94% mapped, 84–91% uniquely mapped, and approximately 17,600 genes identified. Additionally, based on the ERCC Spike-In RNA, strand information was maintained at about 99% for both samples. The benefits of rRNA depletion are clear, with less than 0.5% of reads from the HURR library and less than 6% of reads from the HBRR library mapped to rRNA.
Sequence alignment metrics | Human Universal | Human Brain
---|---|---
Input amount | 400 ng | 400 ng
Number of reads (millions) | 8.5 (paired-end reads) | 8.5 (paired-end reads)
Percentage of reads: | |
rRNA | 0.3% | 5.3%
Mapped to genome | 94% | 88%
Mapped uniquely to genome | 91% | 84%
Exonic | 43% | 50%
Intronic | 43% | 33%
Intergenic | 14% | 12%
Number of genes identified | 17,570 | 17,600
Percentage of ERCC transcripts with correct strand | 99.3% | 98.8%
Sequence Alignment Metrics. 400 ng of HURR and HBRR with ERCC Spike-In RNA were treated with this kit. Alignment data is displayed for both libraries, with the percentage of reads that mapped to rRNA, exonic regions, intronic regions, intergenic regions, and the correct strand, as defined by Picard analysis.
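As a rough illustration of how percentages like those in the table are derived from read counts, the sketch below computes category shares and a strandedness percentage. The function names, category names, and counts are all hypothetical, and the choice of denominators (all categorized reads for the shares, correct plus incorrect strand reads for strandedness) is an assumption rather than a description of the Picard settings used here.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def strandedness_pct(correct_strand_reads, incorrect_strand_reads):
    """Share of strand-assignable reads that map to the expected strand."""
    return percent(correct_strand_reads,
                   correct_strand_reads + incorrect_strand_reads)


def category_shares(category_counts):
    """Convert per-category read counts into percentages of their total."""
    total = sum(category_counts.values())
    return {name: percent(n, total) for name, n in category_counts.items()}


# Hypothetical read counts (not taken from the tech note).
example_counts = {
    "rRNA": 30_000,
    "exonic": 4_300_000,
    "intronic": 4_200_000,
    "intergenic": 1_400_000,
}

shares = category_shares(example_counts)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")

# Hypothetical spike-in read counts on the expected and opposite strands.
print(f"strandedness: {strandedness_pct(99_300, 700):.1f}%")
```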
These same RNA-seq libraries, generated from HURR and HBRR samples, produced data that had a strong correlation (R = 0.927) with qPCR data for the same RNAs obtained through the MicroArray Quality Control (MAQC) analysis. This suggested that the RiboGone method of rRNA depletion and SMARTer cDNA synthesis and library preparation did not negatively affect the RNA-seq data and maintained exceptional accuracy.
MAQC Analysis. RNA-seq libraries were generated with 400 ng of HURR and HBRR. The scatter plot shows the Log2 ratio of FPKMs of HURR/HBRR graphed against the Log2 of the ratio of HURR/HBRR derived from qPCR TaqMan probes.
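The MAQC-style comparison can be sketched as follows: for each gene, take the Log2 of the HURR/HBRR ratio from RNA-seq and from qPCR, then correlate the two sets of ratios. All gene names and expression values below are hypothetical, and skipping genes with a zero value in either sample is an assumption made to keep the ratio defined.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log2_ratio(numerator, denominator):
    """Log2 of a ratio of two strictly positive expression values."""
    return math.log2(numerator / denominator)


# Hypothetical values (not from the tech note):
# gene -> (FPKM in HURR, FPKM in HBRR, qPCR signal in HURR, qPCR signal in HBRR)
genes = {
    "GENE_A": (120.0, 15.0, 8.0, 1.1),
    "GENE_B": (4.0, 60.0, 0.3, 4.2),
    "GENE_C": (33.0, 30.0, 2.0, 2.1),
    "GENE_D": (0.0, 12.0, 0.1, 1.5),  # zero FPKM: ratio undefined, skipped below
    "GENE_E": (250.0, 40.0, 20.0, 2.5),
}

rnaseq_ratios, qpcr_ratios = [], []
for f_hurr, f_hbrr, q_hurr, q_hbrr in genes.values():
    if min(f_hurr, f_hbrr, q_hurr, q_hbrr) <= 0:
        continue  # assumed filter: keep only genes detected by both methods
    rnaseq_ratios.append(log2_ratio(f_hurr, f_hbrr))
    qpcr_ratios.append(log2_ratio(q_hurr, q_hbrr))

r = pearson_r(qpcr_ratios, rnaseq_ratios)
print(f"Genes compared: {len(rnaseq_ratios)}, Pearson R: {r:.3f}")
```

Working with ratios rather than absolute values sidesteps the fact that FPKM and qPCR signals are on different scales; only the relative change between the two reference RNAs is compared.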
RNA-seq libraries produced with this kit provide an accurate representation of your sample. A 400-ng HBRR sample with ERCC (External RNA Controls Consortium) Spike-In RNA Mix (Life Technologies) was treated with this kit, and the library was sequenced, generating 8.5 million paired-end reads. The FPKMs showed a strong correlation (R² = 0.9199) and linearity (slope = 0.9988) with the input concentrations of the individual ERCC transcripts, indicating excellent accuracy and dynamic range.
Dynamic range and linearity of RNA-seq data. Libraries were generated from Human Brain Reference RNA with ERCC Spike-In RNA Mix2. The above graph shows strong correlation between the Log2 of input concentrations of individual ERCC transcripts vs. the Log2 of FPKMs for those transcripts.
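The slope and R² reported for the spike-in analysis correspond to an ordinary least-squares fit of Log2 FPKM against Log2 input concentration. The sketch below shows that calculation; the transcript identifiers, concentrations, and FPKM values are hypothetical, and the fitting method itself is an assumption, since the text does not name the regression used.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, with R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical spike-in data (not from the tech note):
# transcript -> (input concentration, measured FPKM)
spike_ins = {
    "SPIKE_1": (0.5, 0.9),
    "SPIKE_2": (4.0, 6.5),
    "SPIKE_3": (30.0, 70.0),
    "SPIKE_4": (250.0, 410.0),
    "SPIKE_5": (2000.0, 3900.0),
}

log_conc = [math.log2(conc) for conc, _ in spike_ins.values()]
log_fpkm = [math.log2(value) for _, value in spike_ins.values()]

slope, intercept, r_squared = linear_fit(log_conc, log_fpkm)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R^2 = {r_squared:.4f}")
```

A slope near 1 on the log-log scale means that a doubling of input concentration produces roughly a doubling of FPKM, which is what linearity across the dynamic range refers to.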
Reliable, accurate results across RNA inputs of varying quality
Libraries can be generated quickly and precisely from input RNA across a wide range of quality. Mouse Liver RNA was chemically sheared to a RIN (RNA Integrity Number) of either 3 or 7 (Mortazavi et al. 2008). Samples of each quality were used at both 100-ng and 1-µg levels and treated with this kit. All of the libraries generated from these samples had high mapping statistics, with 81–88% mapped reads, 72–77% uniquely mapped reads, and over 12,000 genes identified. Strand information for the biological RNA was maintained at high levels (95–98%), regardless of RIN value.
Sequence alignment metrics from RNA of varying quality (Mouse Liver RNA) | RIN 3, 100 ng | RIN 3, 1 µg | RIN 7, 100 ng | RIN 7, 1 µg
---|---|---|---|---
Number of reads (millions) | 1.7 (paired-end reads) | 1.7 (paired-end reads) | 1.7 (paired-end reads) | 1.7 (paired-end reads)
Percentage of reads: | | | |
rRNA | 2% | 2% | 1% | 1%
Mapped to genome | 82% | 86% | 81% | 88%
Mapped uniquely to genome | 73% | 75% | 72% | 77%
Exonic | 55% | 53% | 54% | 54%
Intronic | 32% | 31% | 33% | 32%
Intergenic | 12% | 14% | 12% | 13%
Number of genes identified | 12,079 | 12,172 | 12,099 | 12,212
Percent biological strandedness | 95.5% | 97.2% | 95.6% | 98.1%
High-quality libraries across varying levels of RNA quality. Libraries were generated from Mouse Liver RNA that had been chemically sheared to a RIN of 3 or 7. Sequencing data showed the percentage of reads that mapped to rRNA, exonic regions, intronic regions, intergenic regions, and the correct strand, as defined by Picard analysis.
These data show that the reproducibility of this kit is not affected by the quality of the input RNA. A comparison of data from the 1-µg libraries described above shows an extremely high correlation (R = 0.99), indicating that the SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian generates reliable, reproducible data across varying levels of RNA quality.
Reproducibility across RNA quality. A scatterplot illustrates the correlations between the FPKMs from two libraries generated from 1 µg of Mouse Liver RNA that was chemically sheared until it had a RIN of 3 or 7.
Conclusions
The SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian is a complete solution for preparing indexed Illumina sequencing libraries from 100 ng–1 µg of mammalian total RNA. This kit combines RiboGone and SMART technologies, pairing removal of abundant rRNA transcripts with strand-specific library generation. SMART technology allows the addition of Illumina adaptors in a ligation-free manner, significantly reducing hands-on time while also increasing efficiency. The sequencing data obtained with this kit maintain high quality and reproducibility across sample replicates and RNA quality.
Methods
Reproducibility across replicates:
Reproducibility across replicates was illustrated with two 100-ng samples of Human Universal Reference RNA (Agilent), treated with the SMARTer Stranded Total RNA Sample Prep Kit - HI Mammalian. The two replicates underwent the same protocol, except that Replicate 1 used 13 PCR cycles and Replicate 2 used 14 PCR cycles. The libraries were sequenced at 1.3 million single-end reads (1 x 50 bp) on an Illumina MiSeq® instrument, and aligned with STAR against hg19 with Ensembl annotation.
MAQC and ERCC analysis:
The quality of sequencing data was demonstrated via MAQC analysis, dynamic range analysis, and sequence alignment metrics. For this purpose, RNA-seq libraries were generated from 400 ng of Human Universal Reference RNA (HURR; Agilent) and Human Brain Reference RNA (HBRR; Ambion) with ERCC Spike-In RNA, with Mix1 used for HURR and Mix2 used for HBRR. The libraries were sequenced at 8.5 million paired-end reads (2 x 75 bp) on an Illumina MiSeq instrument, and aligned with STAR against hg19 with Ensembl annotation. The percentages of reads that mapped to rRNA, exonic regions, intronic regions, intergenic regions, and the correct strand were determined by Picard analysis. For MAQC analysis, the Log2 ratio of FPKMs from HURR/HBRR was graphed against the Log2 of the ratio of HURR/HBRR derived from qPCR TaqMan probes. For the dynamic range study, the Log2 of input concentrations of individual ERCC transcripts was graphed against the Log2 of FPKMs for those transcripts in the HBRR sample.
Library generation across RNA inputs of varying quality:
To compare data across RNA quality, Mouse Liver RNA was chemically sheared until it had a RIN (RNA Integrity Number) of 3 or 7. Either 100 ng or 1 µg of RNA at each RIN was used with this kit to generate RNA-seq libraries. These libraries were sequenced at 1.7 million paired-end reads (2 x 25 bp) on an Illumina MiSeq instrument, and aligned with STAR against mm10 with Ensembl annotation. The percentages of reads that mapped to rRNA, exonic regions, intronic regions, intergenic regions, and the correct strand were determined by Picard analysis. Correlations between the FPKMs of libraries generated from 1 µg of RNA at both RINs were illustrated in a scatterplot.
References
Chenchik, A. et al. Generation and use of high-quality cDNA from small amounts of total RNA by SMART PCR. Gene Cloning and Analysis by RT-PCR 305–319 (1998).
Morlan, J. D., Qu, K. & Sinicropi, D. V. Selective depletion of rRNA enables whole transcriptome profiling of archival fixed tissue. PLoS One 7, (2012).
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
Takara Bio USA, Inc.
FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES. © 2024 Takara Bio Inc. All Rights Reserved. All trademarks are the property of Takara Bio Inc. or its affiliate(s) in the U.S. and/or other countries or their respective owners. Certain trademarks may not be registered in all jurisdictions. Additional product, intellectual property, and restricted use information is available at takarabio.com.