Cogent NGS Analysis Pipeline v1.0 User Manual
(Last updated: 01-Oct-2020)
Cogent NGS Analysis Pipeline (CogentAP) is bioinformatic software for analyzing RNA-seq NGS data generated using the following systems or kits:
- ICELL8 cx Single-Cell System or the ICELL8 Single-Cell System on the single-cell full-length transcriptome (SMART-Seq ICELL8 workflow)
- ICELL8 cx Single-Cell System or the ICELL8 Single-Cell System on the single-cell differential expression (3′ DE or 5′ DE) workflows (ICELL8 3′ DE or ICELL8 TCR)
- SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian
The program takes input data from sequencing and outputs an HTML report, with results typical to single-cell analysis, plus other files, such as a gene matrix, to continue further analysis. R data object with pre-computed results based on recommended parameters are also output. Either the standard output files or the R data object can serve as input for Cogent NGS Discovery Software (CogentDS), another bioinformatic software package provided by Takara Bio.
CogentAP software is written in Python and can be run either via a GUI or command-line interface.
I. Before you begin
A. Supported operating systems
CogentAP software is designed to be installed on a server running Linux. The following versions of Linux have been tested and are supported for use with CogentAP software:
- CentOS 6.9 & 6.10
- RedHat 7.5
- Ubuntu 17
B. Hardware requirements
For analyzing the output of Illumina NextSeq® High-Output sequencing data analysis, the following server requirements (or better) are recommended:
- CPU: 24-cores
- RAM: 64 GB
- Free hard drive space: 500 GB
Testing was also done on MiniSeq™, MiSeq®, HiSeq®, and NovaSeq™ datasets.
- MiniSeq or MiSeq—less computational power may be needed than the specifications described above for NextSeq output
- HiSeq or NovaSeq—requires more computational power than described above for NextSeq output
Precise hardware requirements were not determined for output from these datasets. Support for performance issues of the servers in conjunction with these dataset types may be limited.
C. User account requirements
CogentAP software can be installed in two different access scenarios; the user account requirement depends on which scenario you wish to implement in your environment.
- For use by a single user (single Linux username/account), CogentAP software can be installed and run by any account type with install and executable privileges (regular or root access).
- For use by multiple users (accounts), CogentAP software can either be installed singly across the multiple accounts (regular or root access) or installed for system-wide access by an administrator with root access.
D. Additional hardware and software dependencies and recommendations
- ICELL8 cx Single-Cell System or ICELL8 Single-Cell System
- Bash UNIX shell
- Internet connectivity on the server
The installation process requires Internet connectivity as it sources scripts from GitHub, Bioconda, and CRAN and downloads genome information from Ensembl. Please ensure that internet connectivity is available on the UNIX server while installing. - Conda
CogentAP software leverages the open-source package manager Conda for installation of the pipeline and its dependencies. Any tools and applications required by the pipeline are installed through Conda inside a local environment created specifically for CogentAP software.
If Conda is not currently installed on the server, instructions to do so can be found at https://conda.io/miniconda.html. We recommend installation of the lightweight version for Python 3.7+ (typically the 64-bit bash installer), also called Miniconda3 (version 4.8.2 or later). - bcl2fastq
CogentAP takes as input raw-fastq files that must be converted from the sequencer output FASTQ files using the conversion software bcl2fastq.
If you're not sure where to find bcl2fastq in your environment, it can be downloaded from https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html. - Keyboard, monitor, and mouse directly into the server, or a remote access program
CogentAP software must be run on the Linux server. If users do not have direct console access, a remote access program that enables a Virtual Network Computing (VNC) connection is required through a program such as RealVNC (realvnc.com), TightVNC (tightvnc.com), TigerVNC (tigervnc.org), or similar.
For more information on VNC, along with other VNC clients that can be used, please see the Wikipedia entry at https://en.wikipedia.org/wiki/Virtual_Network_Computing.
E. Required input files
- Paired-end read FASTQ files
- Sample description file
- Well-list file—a well-selection text file derived from CellSelect software that contains well-level sample information. A typical CellSelect well-list file should have no more than 1,728 wells.
- Well-list-like format file—a text file that contains sample information, including columns "Barcode" and "Sample". Each column name is case sensitive. The "Barcode" column contains i7 and i5 indexes concatenated with a plus-sign ("+") (e.g., TAGCGAGT+CCGTTGCG).
II. Software overview
CogentAP software consists of two main parts, the demultiplexer (demuxer) and the analyzer.
- The demultiplexer extracts the barcode from the sequencing data (based on the protocol) and writes it into FASTQ files at the end of the read name. There are two options:
- The default leaves the barcode-assigned reads in combined FASTQ files (which saves time during subsequent analysis)
- The second option splits the data up into barcode-level FASTQ files which can be carried over to other analysis pipelines
- The analyzer takes the data sent to it by the demultiplexer and performs the following functions:
The demultiplexer and analyzer can be invoked using a graphical user interface (GUI), called the CogentAP launcher, or can be run on the command line.
III. Installation & configuration options
The CogentAP software is available for download on the ICELL8 software portal at takarabio.com/ICELL8-software. An email will be sent to you with information on downloading the install script and a unique password required to run the installation on your server.
A. Verifying Conda installation
The following steps can be used to verify that Conda is installed properly on the server.
- Type the following command in at the prompt in any directory location on the server:
conda -V
If Conda is successfully installed, it should return text with the version number.
e.g.,
conda 4.8.2
- Check to see if the base Conda environment can be activated.
- Type the following command into the command-line Linux prompt on the server:
Or, if that returns an error, try:source activate
A successful Conda install will result in a change in the prompt, as shown in Figure 2.conda activate
- If Conda is successfully installed and the prompt changed as displayed in Figure 2, type the following command to return to the default Linux prompt:
source deactivate
or
conda deactivate
This command will take the user back to the Linux prompt and out of the Conda environment.
- Type the following command into the command-line Linux prompt on the server:
- Installation of Miniconda3 typically adds the location of its installation to the user's system environment. This is also required for the successful installation of CogentAP software.
The following steps can be used to verify that the Conda
$PATH
is configured correctly.- Open the file
.bash_profile
, which will be located in the home directory (for an individual user account):
more ~/.bash_profile
- Verify a line similar to the following is showing in the file:
whereexport PATH="/home/<USERNAME>/miniconda3/bin:$PATH"
<USERNAME>
is replaced by the username of the account that installed Conda.
e.g., username is ‘myacct':
If the line isn't displaying or theexport PATH="/home/myacct/miniconda3/bin:$PATH"
.bash_profile
file does not exist, it will need to be manually created and populated. For more information on setting an environment variable, see a UNIX user manual or a forum post like https://stackoverflow.com/a/7502128.
- Open the file
B. Installation
- Download the installation script (
takarabio_Linux64_installer.sh
), following the directions on the thank you page or the email sent after submitting the sign-up form on the CogentAP product page. - Move or copy the installation script onto the Linux server into the directory location where you want to install the CogentAP software.
NOTE: The account logged into while doing the installation must have read/write privileges to the install directory chosen.
- From the same directory location in Step 2, run the following command:
bash takarabio_Linux64_installer.sh CogentAP <PASSWORD>
<PASSWORD>
will be replaced by the unique password included in the sign-up confirmation email.NOTE: The installation process at this point will take anywhere from 90 minutes to 3 hours to complete, depending on the computational capacity of the server and the download speed of the internet connection. No further user input is required until the install is complete. The installation procedure may be left to run overnight, with the user not present at the console. If desired, the user can work in another terminal window simultaneous to the install.
Once the installation is complete, a message will display saying CogentAP software is ready to use.
There should be a folder named CogentAP in the directory into which the software was installed (Step 2) which contains files and directories required by the pipeline’s scripts.
C. (Optional) Set up a $COGENT_AP_HOME environmental variable
For ease of use, it's recommended that the CogentAP software install directory location be added to the .bash_profile
as a permanent environmental variable.
Example:
If your account name is 'myacct', the absolute pathname for myacct's home directory is /home/myacct
, and CogentAP software was installed in the ~/bin
directory, edit .bash_profile
to add the following line:
export COGENT_AP_HOME=/home/myacct/bin/CogentAP
Once added to the profile, you will either need to log out and back in to the account, or load the file in with the following command:
source ~/.bash_profile
The phrase $COGENT_AP_HOME
can then be used as an alias shortcut to reference /home/myacct/bin/CogentAP
.
Example:
Running the following command logged in as 'myacct':
cd $COGENT_AP_HOME
will change directories to ~/bin/CogentAP
.
NOTE: Subsequent references to $COGENT_AP_HOME
in this document refer to the full path where CogentAP software is installed.
D. Updating the pipeline scripts
To update CogentAP, run the following two commands in sequence:
cd $COGENT_AP_HOME
sh CogentAP_setup.sh update
E. Add a genome build
The human genome (Ensembl hg38 release 94) is built as default with the pipeline, but the genomes of other species can be loaded into the software post-install. To do so, download the following two files for the genome of interest:
- The FASTA file containing all the sequences (chromosomes and contigs)
- The GTF file containing the annotation, importantly the gene information for analysis
- Run the script:
$COGENT_AP_HOME/cogent add_genome \
-g <common_species_name> \
-f <FASTA_FILENAME> \
-a <GTF_FILENAME>Where
<common_species_name>
is the name of the genome being added (e.g., fruitfly) and the<FASTA_FILENAME>
and<GTF_FILENAME>
are the exact file names and path(s) for the FASTA and GTF files, respectively, on the server. For additional help with this script, type:$COGENT_AP_HOME/cogent add_genome -h
Example, using the fruit fly genome from Ensembl.org (https://uswest.ensembl.org/info/data/ftp/index.html):
- Download the FASTA file (Drosophila_melanogaster.BDGP6.dna.chromosome.2L.fa) using FTP:
wget ftp://ftp.ensembl.org/pub/release-94/fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.dna.toplevel.fa.gz
or copy and paste the URI:
into the address bar of a web browser.ftp://ftp.ensembl.org/pub/release-94/fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.dna.toplevel.fa.gz
- Download the GTF file (Drosophila_melanogaster.BDGP6.94.gtf):
wget ftp://ftp.ensembl.org/pub/release-94/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.94.gtf.gz
or copy and paste:
into the address bar of a web browser.ftp://ftp.ensembl.org/pub/release-94/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.94.gtf.gz
- If the downloaded files are stored in
~/ensembl
directory in "myacct", run the following script:
$COGENT_AP_HOME/cogent add_genome \
-g fruitfly \
-f ~/ensembl/Drosophila_melanogaster.BDGP6.dna.chromosome.2L.fa \
-a ~/ensembl/Drosophila_melanogaster.BDGP6.94.gtf
NOTE: As FASTA and GTF files are a standard file format, files from any source, should work with this script. However, this script has only been tested on genomes downloaded from Ensembl. If a problem is encountered using files from another source it is recommended to try the import using the Ensembl file versions of the genome.
F. Uninstall
CogentAP software can be uninstalled by deleting the CogentAP/
directory defined as $COGENT_AP_HOME
from the server.
If $COGENT_AP_HOME
has been defined in .bash_profile
, edit the file to remove the reference to $COGENT_AP_HOME
as well.
IV. Running the pipeline
Before running an analysis, raw-fastq files need to be generated from the sequencer-output FASTQ files. Once those are created, CogentAP can be run in either of two ways:
- Using a graphical user interface (GUI) (Section IV.B)
- On the command line (CLI) (Section IV.C)
A. Generating raw-fastq files
CogentAP demultiplexer takes two, one-paired raw-fastq files as input (not split by barcode).
- Log in to the server that stores the sequencing run output folder. This server will typically have the program
bcl2fastq
installed (see Section I.D for more information aboutbcl2fastq
). - Change to the directory where you want to generate the raw-fastq files.
- Run
bcl2fastq
with the following syntax:
bcl2fastq -R <RUN_FOLDER> \
-o <RUN_ID> \
--no-lane-splitting \
--sample-sheet $COGENT_AP_HOME/config/SampleSheet_dummy.csv > <RUN_ID>.stdout \
2> <RUN_ID>.stderrwhere
<RUN_FOLDER>
is the path to the sequencing run folder<RUN_ID>
is the ID automatically generated by Illumina sequencer
The file
SampleSheet_dummy.csv
is stored in the CogentAPconfig
folder, under the main install folder.NOTE: Recent versions of
bcl2fastq
(2.17 and higher) have a bug where the indexes required for demultiplexing will not be inserted into the raw-fastq if a samplesheet is not specified in the command syntax. We recommend using theSampleSheet_dummy.csv
file provided for every run of thebcl2fastq
command to generate the raw-fastq files in order to prevent encountering that issue. - Retrieve the raw-fastq files. These are typically named
Undetermined_*.fastq.gz
and are generated in the<RUN_ID>
folder in the working directory from Step 2.
NOTE: It is recommended that the raw-fastq files are moved to a directory on the server on which CogentAP is installed in order to reduce processing time.
B. GUI
The CogentAP GUI is a program called CogentAP_launcher
. Since it's graphical, it should be accessed by connecting to the Linux server using a remote access tool like VNC (see Section I.D, above).
- Connect to the Linux console where CogentAP software is installed, either remotely or via a graphical interface on the server.
- Once connected, navigate using the file browser functionality to the folder in which CogentAP software is installed. In Figure 5, the CogentAP install folder is located on the user's desktop.
- In the install folder, locate the file
CogentAP_launcher
and double-click on it to run. This will open the CogentAP GUI (Figure 6).The launcher has four tabs: Run Information, Advanced Options, Demuxer Options, and Analyzer Options.
- The Run Information tab takes as input:
- The workflow (default being 'Demultiplexer & Analyzer')
- R1 and R2 raw-fastq files (generated in Section IV.A)
- The well-list file (see Section I.E for more information about this file)
- The protocol: full-length, 3′ DE, or 5′ DE
- The genome to use for alignment (default: human hg38)
- The directory location in which to output the results
- The name for a new analysis folder, which will be created in the directory location specified above and will contain the analysis output files
NOTE: Use the scrollbar on the right of the window to view all the fields.
The other three tabs are intended for advanced users only, but their purposes are listed below:
- Advanced Options can be used to modify parameters such as the number of processor cores available for analysis, the number of mismatches to consider during barcode assignment, etc.
- Demuxer Options can be used to modify parameters on the demultiplexer process. The UMI length can be changed in this tab. Please input corresponding length to the kit used, such as SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian.
- Analyzer Options tab can be used to modify parameters of the analyzer's process such as normalizing method, skip trimming, etc.
- The Run Information tab takes as input:
- Once all the fields have been appropriately filled, click on [Start] to launch the pipeline. The process typically takes a few hours to run.
- Upon successful completion, the UI displays a "Finished" message as shown in the right panel of Figure 6 (above).
C. Command-line interface
NOTE: See Section V.B for an example of the full syntax for the command-line scripts.
CogentAP software starts from the main script, cogent
, and has three subcommands: demux
, analyze
, and add_genome
. The demux
and analyze
commands will be covered in this section.
In general, you would first run the demuxer (cogent demux
) to obtain the demultiplexed FASTQ files. These files are then used as input to run the analyzer (cogent analyze
) to obtain the output files (described in detail in Section VI). These scripts can be launched from any location (working directory) on the Linux server where CogentAP software is installed.
The full list of arguments can be accessed using the syntax:
cogent <COMMAND> -h
See examples for <COMMAND>
values 'demux' and 'analyze' in Figures 7 and 8, below.
%COGENT_AP_HOME%/cogent demux -h
%COGENT_AP_HOME%/cogent analyze -h
D. Processing time
The time taken by the pipeline will vary based on the hardware specifications of the server on which it is run and the size of the raw-fastq input files. During testing, a MiSeq run (~25M read pairs) typically took about 1–1.5 hr, and a NextSeq High Ouput run (~400M read pairs) typically took ~20–25 hr to complete.
NOTE: These baselines might be exceeded if the raw-fastq files are stored on a network device instead of locally on the server where CogentAP is installed.
V. Test dataset
A mini dataset file (test dataset) is included in the CogentAP distribution package; it can be found under the main installation folder in a sub-folder called $COGENT_AP_HOME/test/
(Figure 9, below). This dataset can be used to test the running of the pipeline end-to-end and will provide a sample of the output files. The output (report and stats only) from the test dataset are also included and can be found in $COGENT_AP_HOME/test/out_test/
. This included output can be used to compare to the output of your test run to verify everything is working correctly.
NOTE: The mini dataset should not be used for inference purposes. Output statistics and plots will only be meaningful with a real dataset.
A run using the test data can be started using either the UI launcher (Section A) or the command line (Section B).
The test run using either method should take ~5–10 min to complete successfully.
A. Test data through CogentAP launcher UI
The parameters defined in the CogentAP launcher fields in Figure 10 are listed below. The paths are written as $COGENT_AP_HOME
being /usr/local/bin/
.
Field name | Input value |
Workflow | Demultiplexer & Analyzer |
R1 Fastq | /usr/local/bin/CogentAP/test/test_FL_R1.fastq.gz |
R2 Fastq | /usr/local/bin/CogentAP/test/test_FL_R2.fastq.gz |
Well List File | /usr/local/bin/CogentAP/test/99999_CogentAP |
Protocol | Full Length |
Genome | hg38 |
Output Directory | /tmp/CogentAP_testing/ |
Output Folder | install_test |
Table 1. Parameter information for example CogentAP launcher run of the provided test dataset in Figure 10.
Once completed, the output folder (install_test/
) will be located under the directory specified as 'Output Directory' (/tmp/CogentAP_testing/
).
B. Test data through the command-line interface
- Run the demultiplexer script.
$COGENT_AP_HOME/cogent demux \
-i $COGENT_AP_HOME/test/test_FL_R1.fastq.gz \
-p $COGENT_AP_HOME/test/test_FL_R2.fastq.gz \
-b $COGENT_AP_HOME/test/99999_CogentAP_test_selected_WellList.TXT \
-t Full_length \
-o Desktop/install_test \
-n 8 - Once the demuxer is finished, run the data through the analysis script.
$COGENT_AP_HOME/cogent analyze \
-i $COGENT_AP_HOME/install_test/install_test_demuxed_R1.fastq \
-p $COGENT_AP_HOME/install_test/install_test_demuxed_R2.fastq \
-g hg38 \
-o Desktop/install_test/analysis \
-n 8
-d Desktop/install_test/install_test_counts_all.csv
-t Full_length
Once completed, the output folder (install_test/
) will be located under the directory specified as 'Output Directory' (Desktop/
as part of the parameter -o Desktop/install_test
).
VI. Output files
The pipeline produces output files that serve two purposes:
- Summarization of results using typical statistics and plots
- Facilitating further analyses using our interactive R kit, Cogent NGS Discovery Software (CogentDS), or any other tertiary analysis tool
A. Output folder structure
Both the CogentAP GUI launcher and CLI are designated to output to a folder specified in the run parameters, but the file structure within that directory is slightly different between the two methods.
- The output folder from CogentAP launcher UI will resemble Figure 13:
- The folder structure from the CLI will resemble Figure 14:
B. HTML report
The HTML report is generated by the same report generator process as CogentDS using standard parameters and contains the example statistics and plots listed below. For complete details, please see the Cogent NGS Discovery Software User Manual.
- Experiment overview
- Correlation analyses
- Various counts like reads, genes, Mito, Ribo.
- Clustering tables
C. Stats file
The Stats file is provided in CSV format and contains barcode-level statistics across the analysis pipeline. Starting from barcoded reads, it summarizes the number of reads after every step in the pipeline: for example, trimmed reads, mapped reads, exon/intron/intergenic reads, mitochondrial reads, ribosomal reads, etc. It also lists the number of genes detected per barcode. The columns shown in this file depends on the reagent kit used to generate the input data. As examples, you can see columns related to UMI when you use ICELL8_3DE_UMI or Strnd_UMI. The full list of possible columns in this file is described in the Appendix.
D. Gene matrix file
The gene matrix file (sometimes called the gene table or counts matrix) is also in CSV format and contains gene counts for each barcode, with the genes in the rows and barcodes/cells in the columns. The file contains raw counts that can then be normalized and transformed using CogentDS. An example is shown below.
E. Gene info file
The gene info file is a CSV-formatted file that contains the main annotation for the genes as described in the GTF file that is part of the genome build. It includes gene ID (used in CogentDS), Ensembl gene ID, gene symbol, and gene length (used for some normalizations).
F. CogentDS.analysis.rda
During the generation of the HTML report, an R Data object, CogentDS.analysis.rda
, is created with the results of various modules. This file can be used directly as input into CogentDS to perform further analysis, which saves on processing time in that tool.
G. Extra folder
The extras/
folder contains similar result files as B–F above, in the same format, but the data is calculated by also including intron regions of genes. Additionally, when used with a UMI enabled kit, results without UMI calculation are also stored in this folder. You can also use these files for tertiary analysis.
- From the GUI,
extras/
can be found in<output folder>/analysis
- From the CLI, it will be in the folder specified by the value of the
cogent analyze
command output_dir (-o
) argument
Appendix
Columns in Stats file
The tables below document all potential columns that might appear in the Stats output file (Section VI.C).
As mentioned in that section, not all stats files will include every column listed.
Columns that will be present in all Stats files output by CogentAP (input workflow agnostic).
Column name | Description |
---|---|
Barcode | Detected barcodes. This value will usually be the sample name from the well-list or well-list-like file, but there are three exceptions, documented in the table below. |
Sample | Sample names described in sample description file. |
Barcoded_Reads | Number of reads after demultiplexing. |
Trimmed_Reads | Number of remained reads after trimming. |
Unmapped_Reads | Number of reads not mapped to genome. |
Mapped_Reads | Number of reads mapped to genome. |
Multimapped_Reads | Number of reads mapped to multiple genomic locations. |
Uniquely_Mapped_Reads | Number of reads mapped to one genomic location. These reads are used for counting. |
Exon_Reads | Number of reads assigned to an exonic region. |
Ambiguous_Exon_Reads | Number of reads assigned to exonic regions of multiple genes. |
Intron_Reads | Number of reads assigned to an intronic region. |
Ambiguous_Intron_Reads | Number of reads assigned to intronic regions of multiple genes. |
Gene_Reads | Number of reads assigned to a gene region (exon + intron). |
Intergenic_Reads | Number of reads assigned to an intergenic region. |
No_of_Genes | Number of detected genes. |
Mitochondrial_Reads | Number of reads assigned to mitochondrial chromosome. |
Ribosomal_Reads | Number of reads assigned to a ribosomal gene. |
The "Barcode" column, in addition to the samples named in the well-list or well-list-like file, will also have three additional rows, which are described in the following table.
Barcode field value | Description |
---|---|
Short | Number of reads containing N in barcode or having shorter length than barcode. |
Unselected | Number of reads having a barcode included in Chip’s description, but not included in sample description file. |
Undetermined | Number of reads having undetermined barcode. |
Additional columns in the Stats file for 3′ DE analysis with UMIs on ICELL8 system.
The table below lists additional columns that will be present in the Stats file when the input FASTQ files result from the ICELL8 3′ DE for UMI Reagent Kit (Cat # 640005) workflow on the ICELL8 Single-Cell System (Cat # 640000).
Column name | Description |
No_of_UMIs | Number of UMI variations detected after demultiplexing. |
Exon_nUMIs | Number of deduplicated reads assigned to an exonic region. Deduplication is done by UMI. |
Intron_nUMIs | Number of deduplicated reads assigned to an intronic region. Deduplication is done by UMI. |
Gene_nUMIs | Number of deduplicated reads assigned to a gene region (exon + intron). Deduplication is done by UMI. |
Additional columns in Stats file for the SMARTer Stranded Total RNA-Seq Kit v3- Pico Input Mammalian protocol.
The table below lists additional columns that will be present in the Stats file when the input FASTQ files result from the SMARTer Stranded Total RNA-Seq Kit v3- Pico Input Mammalian protocol.
Column name | Description |
No_of_UMIs | Number of UMI variations detected after demultiplexing. |
Exon_nUMIs | Number of deduplicated reads assigned to an exonic region. Deduplication is done by UMI. |
Exon_nUSSs | Number of deduplicated reads assigned to an exonic region. Deduplication is done by Unique Start&Stop Site (USS). |
Exon_nUMIs_USSs | Number of deduplicated reads assigned to an exonic region. Deduplication is done by both UMI and USS. |
Intron_nUMIs | Number of deduplicated reads assigned to an intronic region. Deduplication is done by UMI. |
Intron_nUSSs | Number of deduplicated reads assigned to an intronic region. Deduplication is done by Unique Start&Stop Site (USS). |
Intron_nUMIs_USSs | Number of deduplicated reads assigned to an intronic region. Deduplication is done by both UMI and USS. |
Gene_nUMIs | Number of deduplicated reads assigned to a gene region (exon + intron). Deduplication is done by UMI. |
Gene_nUSSs | Number of deduplicated reads assigned to a gene region (exon + intron). Deduplication is done by Unique Start&Stop Site (USS). |
Gene_nUMIs_USSs | Number of deduplicated reads assigned to a gene region (exon + intron). Deduplication is done by both UMI and USS. |
Strand_Specificity | Ratio of reads detected as correct strand after mapping to genome. |
RNA-seq
Cogent NGS Analysis Pipeline
Analyze sequencing data generated by select Takara Bio applications.
Cogent NGS Discovery Software
Visualize sequencing data using the output from the Cogent NGS Analysis Pipeline.
SMART-Seq DE3 Demultiplexer
Demultiplex sequencing data from SMART-Seq mRNA 3′ DE into sorted read data files.
Takara Bio USA, Inc.
United States/Canada: +1.800.662.2566 • Asia Pacific: +1.650.919.7300 • Europe: +33.(0)1.3904.6880 • Japan: +81.(0)77.565.6999
FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES. © 2025 Takara Bio Inc. All Rights Reserved. All trademarks are the property of Takara Bio Inc. or its affiliate(s) in the U.S. and/or other countries or their respective owners. Certain trademarks may not be registered in all jurisdictions. Additional product, intellectual property, and restricted use information is available at takarabio.com.