scATAC-seq_v3 User Manual

1. Scope of Application

scATAC-seq_v3 analyzes high-throughput sequencing data generated with the following kit:

DNBelab C Series High-throughput Single-cell ATAC Library Preparation Set.

2. Product Introduction

2.1 Analysis Flowchart

The scATAC-seq_v3 analysis workflow is an automated analysis process built on DCS Cloud. It consists of reference genome alignment, deconvolution (cell identification and merging), quality control, peak calling, downstream analysis, and report generation:

2.1.1 Reference Genome Alignment

Use chromap (v0.2.3_r407) to align the input FASTQ files to the reference genome.

2.1.2 Deconvolution

Use d2c (v1.4.4) to identify the beads that captured complete cells, and then merge beads based on the similarity of the fragments within them.
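
The idea of merging beads by fragment similarity can be illustrated with a minimal Python sketch. This is not d2c's implementation: the Jaccard index, the 0.5 threshold, and the function names `jaccard` and `merge_beads` are all assumptions chosen only to show the principle.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two fragment sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_beads(bead_fragments: dict, threshold: float = 0.5) -> dict:
    """Group beads whose fragment sets overlap strongly.

    bead_fragments maps a bead barcode to a set of (chrom, start, end) tuples.
    Returns a mapping from each bead barcode to the label of its merged group.
    The threshold is an illustrative value, not one used by d2c.
    """
    parent = {bead: bead for bead in bead_fragments}

    def find(x):
        # Follow parent links to the group root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(bead_fragments), 2):
        if jaccard(bead_fragments[a], bead_fragments[b]) >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the lexicographically smaller barcode as the group label.
                parent[max(root_a, root_b)] = min(root_a, root_b)
    return {bead: find(bead) for bead in bead_fragments}
```

Beads that end up with the same label would be treated as one cell in this simplified picture.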

2.1.3 Quality Control

Perform fragment distribution statistics and TSS region enrichment for quality control.
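
As a rough illustration of the two quality-control views named above, the following Python sketch counts fragments by length and computes the fraction of fragments that overlap a window around any TSS. It is a simplified stand-in, not the metric the workflow reports: the ±1000 bp window, the half-open coordinate convention, and the function names are assumptions.

```python
from collections import Counter


def fragment_length_counts(fragments):
    """Count fragments by length; fragments are (chrom, start, end), end exclusive."""
    return Counter(end - start for _, start, end in fragments)


def fraction_near_tss(fragments, tss_sites, window=1000):
    """Fraction of fragments overlapping [tss - window, tss + window] for any TSS.

    tss_sites is a list of (chrom, position) tuples. The window size is an
    illustrative assumption.
    """
    tss_by_chrom = {}
    for chrom, pos in tss_sites:
        tss_by_chrom.setdefault(chrom, []).append(pos)

    hits = 0
    for chrom, start, end in fragments:
        positions = tss_by_chrom.get(chrom, [])
        if any(start <= pos + window and end > pos - window for pos in positions):
            hits += 1
    return hits / len(fragments) if fragments else 0.0
```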

2.1.4 Peak Calling

Use macs2 (v2.2.7.1) to identify genomic regions enriched with aligned sequencing reads. For ATAC data, these peaks correspond to regions of open chromatin. A peak-cell matrix is then generated.
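
To make the peak-cell matrix concrete, here is a minimal Python sketch that counts, for each peak and cell barcode, the fragments overlapping that peak. It assumes half-open coordinates and a sparse dictionary representation; the function name `peak_cell_matrix` and the input layout are hypothetical and do not describe the workflow's own file formats.

```python
from collections import defaultdict


def peak_cell_matrix(fragments, peaks):
    """Count fragments per (peak index, cell barcode).

    fragments: iterable of (chrom, start, end, cell_barcode), end exclusive.
    peaks: list of (chrom, start, end), end exclusive.
    Returns a sparse dict {(peak_index, cell_barcode): count}.
    """
    peaks_by_chrom = defaultdict(list)
    for idx, (chrom, start, end) in enumerate(peaks):
        peaks_by_chrom[chrom].append((start, end, idx))

    counts = defaultdict(int)
    for chrom, f_start, f_end, cell in fragments:
        for p_start, p_end, idx in peaks_by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each starts before the other ends.
            if f_start < p_end and f_end > p_start:
                counts[(idx, cell)] += 1
    return dict(counts)
```

The linear scan over peaks keeps the sketch short; a real implementation would typically use sorted intervals or an interval index.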

2.1.5 Downstream Analysis

Perform cell quality control based on the generated matrix, and conduct downstream analyses such as dimensionality reduction, clustering, and cell type annotation.

2.1.6 Report Output

Organize and summarize the analysis results into an HTML report.

3. User Manual

The scATAC-seq_v3 analysis workflow runs entirely on the DCS Cloud platform, from sample input to report output. The following is a detailed guide to using the workflow on the platform.

3.1 Guide Overview

3.1.1 Overview

This chapter describes how to run an analysis with the scATAC-seq_v3 analysis workflow. Before use, read the content carefully to ensure that the workflow is used correctly.

3.2 Usage Scenario 1: Manual Submission

The operation has four steps: upload data, build the reference genome (optional), set parameters, and start the analysis. After setting the parameters, run the task. When the task status shows as completed, the task is finished and the report can be viewed (see Section 3.4 for details).

3.2.1 Step One: Upload Data

Click [Data] in the navigation bar to open the Data Management page, navigate to the target folder, and click [+Add files] - [Tool upload] in the upper right corner to upload data (Figure 3-1).

Click [Upload] to browse and select the required file (Figure 3-2). After the upload is complete, the file is displayed in the target folder. If this is the first upload, click [Install and start the transport client] to install the required tools first.

3.2.2 Step Two: Build Reference Genome (Optional)

Click on the navigation bar [Workflow] to enter the Workflow Analysis page, enter scATAC-seq-build-index in the search box, and click [Run].

Figure 3-3 Step One of Building the Reference Genome

Select Run workflow, enter Entity ID, and click [Next].

Figure 3-4 Step Two of Building the Reference Genome

Enter the reference genome information, and click [Next].

Figure 3-5 Step Three of Building the Reference Genome

Note

Parameter Description for Entering Reference Genome Information:

refName: Species name of the reference genome;
GTF: Genome annotation GTF file;
chrM: Mitochondrial name;
FASTA: Genome FASTA file;
blacklist: Blacklist file, fill in None if not available;
Outdir: Output file path;
Cpu: Required CPU for operation;
Mem: Required memory for operation.

Click [Run] to start the analysis.

Figure 3-6 Step Four of Building the Reference Genome

After the task is completed, the Status displays as completed. Copy the Task ID (Figure 3-7), click [Data] in the navigation bar, and search for the Task ID (Figure 3-8). The scATAC_ref directory in the results is used as the reference genome file for the scATAC-seq_v3 analysis (Figure 3-9).

Figure 3-7 Step Five of Building the Reference Genome

Figure 3-8 Step Six of Building the Reference Genome

Figure 3-9 Step Seven of Building the Reference Genome

3.2.3 Step Three: Parameter Settings

Click [Workflow] in the navigation bar to open the Workflow Analysis page, enter scATAC-seq_v3 in the search box, and click [Run].

Figure 3-10 Step One of Sample Information Entry

Select Run workflow, enter Entity ID, and click [Next].

Figure 3-11 Step Two of Sample Information Entry

Enter the sample information, then click [Next].

Figure 3-12 Step Three of Sample Information Entry

Note

Sample Information Entry Parameter Description:

Data: FASTQ format R1 and R2 end sequences. For each pair of FASTQ files, select R1 first, then select R2;
Outdir: Output file path;
SampleID: Sample name, defaults to the Entity ID;
readStructure (optional): Sequencing method, defaults to newT1;
OutBam: Whether to output the bam file, defaults to false;
model: Whether it is a model organism (mouse / human), defaults to false;
ForceFrag (optional): Minimum number of fragments for merging duplicate data's dual fragment threshold;
refDir: Reference genome file. If it was built independently in step two (3.2.2 Build Reference Genome), select the scATAC_ref directory in the Task ID folder;
BlackList: Blacklist file, defaults to None;
chrMT: Mitochondrial name, defaults to chrM;
genomeSize: Genome size, consistent with the genome size value in the reference genome construction result file ref.json;
cpp: Required CPU size for operation, default value is 4;
mem: Required memory for operation, default value is 20;
Species: Name of the species. If the value of model is true, input mm10 or hg38.
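
The defaults and constraints listed above can be summarized in a small Python sketch. The class `SampleParams` and its `problems` method are hypothetical and are not part of DCS Cloud. The field names and defaults are copied from the list above, while the field types (for example, treating genomeSize as text copied from ref.json) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Species values accepted when model is true, as stated in the parameter list.
MODEL_SPECIES = {"mm10", "hg38"}


@dataclass
class SampleParams:
    """Hypothetical container mirroring the sample information parameters."""
    Data: List[Tuple[str, str]]          # (R1, R2) pairs, R1 first
    Outdir: str
    SampleID: str
    refDir: str
    genomeSize: str                      # assumed to be copied as text from ref.json
    Species: str
    readStructure: str = "newT1"
    OutBam: bool = False
    model: bool = False
    ForceFrag: Optional[int] = None
    BlackList: str = "None"
    chrMT: str = "chrM"
    cpp: int = 4
    mem: int = 20

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none were found."""
        found = []
        if not self.Data:
            found.append("Data needs at least one (R1, R2) FASTQ pair")
        for r1, r2 in self.Data:
            if r1 == r2:
                found.append(f"R1 and R2 are the same file: {r1}")
        if self.model and self.Species not in MODEL_SPECIES:
            found.append("Species must be mm10 or hg38 when model is true")
        return found
```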

3.2.4 Step Four: Start Analysis

Click [Run] to start the analysis.

3.3 Usage Scenario 2: Table Submission

The operation has five steps: upload data, build the reference genome (optional), download the sample template, fill in and import the sample template, and start the analysis. After the sample template is imported, tasks can be run in batches. When the task status shows as completed, the task is finished and the report can be viewed (see Section 3.4 for details).

3.3.1 Step One: Upload Data

Same as Step One of Usage Scenario 1 (Manual Submission); see 3.2.1 Step One: Upload Data.

3.3.2 Step Two: Construct Reference Genome (Optional)

Same as Step Two of Usage Scenario 1 (Manual Submission); see 3.2.2 Step Two: Build Reference Genome (Optional).

3.3.3 Step Three: Sample Information Entry Table Download

Click [Data] in the navigation bar, select [Table] - [Download] (Figure 3-14), click [Data model template], and select the scATAC-seq_v3 template to download.

Figure 3-14 scATAC-seq_v3 Sample Information Template Download Navigation

The opened scATAC-seq_v3 sample template Excel file is shown in Figure 3-15.

Figure 3-15 scATAC-seq_v3 Sample Information Template Table

3.3.4 Step Four: Import Sample Information

In this usage scenario, the worksheet of the sample import table must be filled in (Figure 3-15). This scenario applies to samples whose sequencing data is already complete: once the table is imported, the samples go directly into analysis.

Note

Excel Notes:

[1] The import file path must already exist on the cloud platform.

[2] In the template, all input fields except readStructure and ForceFrag are required. In the imported data, required fields must not be empty.

[3] The SampleID in Excel must be unique.

[4] Cells in Excel cannot be merged, and cell contents must not have leading or trailing spaces or special characters.

[5] Analysis Sample Entry (Figure 3-16):

Data: R1 and R2 end sequences in FASTQ format. For each pair of FASTQ files, first fill in the absolute path for R1 (at Data1), and then fill in the absolute path for R2 (at Data2);
Outdir: Output file path;
SampleID: Sample name, defaults to the Entity ID;
readStructure (optional): Sequencing method, defaults to newT1;
OutBam: Whether to output the bam file, defaults to false;
model: Whether it is a model organism (mouse / human), defaults to false;
ForceFrag (optional): Minimum number of fragments for merging duplicate data's dual fragment threshold;
refDir: Reference genome files. If built independently in step two (3.3.2 Construct Reference Genome), select the scATAC_ref folder within the Task ID folder;
BlackList: Blacklist file, defaults to None;
chrMT: Mitochondrial name, defaults to chrM;
genomeSize: Genome size, which should match the genomesize value in the result file ref.json from the reference genome construction;
cpp: Required CPU size for operation, default value is 4;
mem: Required memory size for operation, default value is 20;
Species: Name of the species. If the value of model is true, input mm10 or hg38.

Figure 3-16 Fill in the Sample Template (Sample Information Entry)
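
Several of the Excel notes above can be checked before import. The following Python sketch assumes the worksheet rows have already been read into a list of dictionaries keyed by column name (for example with a spreadsheet library). The function `check_rows` is hypothetical, and the check for "special characters" is left out because the manual does not define which characters are meant.

```python
# Columns that may be left empty, as stated in note [2].
OPTIONAL_COLUMNS = {"readStructure", "ForceFrag"}


def check_rows(rows):
    """Return a list of problems found in the sample rows (empty if none).

    rows: list of dicts mapping column name to cell value.
    Checks required fields, leading/trailing spaces, and SampleID uniqueness.
    """
    problems = []
    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        for column, value in row.items():
            text = "" if value is None else str(value)
            if text != text.strip():
                problems.append(f"row {number}: {column} has leading or trailing spaces")
            if column not in OPTIONAL_COLUMNS and not text.strip():
                problems.append(f"row {number}: {column} is required but empty")
        sample_id = str(row.get("SampleID") or "").strip()
        if sample_id and sample_id in seen_ids:
            problems.append(f"row {number}: duplicate SampleID {sample_id!r}")
        seen_ids.add(sample_id)
    return problems
```

Whether the paths in Data1, Data2, and refDir already exist on the cloud platform (note [1]) cannot be checked locally and still has to be confirmed on the platform itself.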

After filling in the analysis sample entry worksheet of the sample template, return to the [Data] page and click [Table] - [+ Add table].

Figure 3-17 Step One of Sample Information Import

Click [Click to upload/Drop here] to browse and select the completed sample information table, then click [confirm] (Figure 3-18). After the upload is complete, the file is displayed in the target folder.

Figure 3-18 Step Two of Sample Information Import

Click [Workflow] in the navigation bar to open the Workflow Analysis page, enter scATAC-seq_v3 in the search box, and click [Run].

Figure 3-19 Step Three of Sample Information Import

Select Run workflow(s), click Please select table, choose the table imported earlier in this section, select the required rows, and click [Next].

Figure 3-20 Step Four of Sample Information Import

Click Values and select the corresponding value for each parameter. For example, for Data select ${Data1} and ${Data2}, making sure to select them in that order.

Figure 3-21 Step Five of Sample Information Import

Enter the sample information, confirm that the parameter settings are correct, and click [Next].

Figure 3-22 Sample Information Import Template

3.3.5 Step Five: Start Analysis

Click [Run] to start the analysis.

3.4 View Report and Download Result Files

3.4.1 View Report

Click [Task] in the navigation bar. When the task status shows as completed, the task is finished; click [Report] to view the report.

3.4.2 Download Report

Click [Data] in the navigation bar to open the Data Management page, search by the task's Task ID, and click to enter the Task ID folder (Figure 3-25).

Figure 3-25 Step One for Downloading Report

Click to enter the report folder (Figure 3-26).

Select the report.html file and click [Download] - [Raysync download] (Figure 3-27).

Click [Transfer] - [Download] - [Confirm], select the target directory, and download the report (Figure 3-28).

4. FAQ

What are the official single-cell workflows, and which kit versions do they correspond to?

The officially maintained scATAC-seq workflow is scATAC-seq_v3, which corresponds to the kit DNBelab C Series High-throughput Single-cell ATAC Library Preparation Set.

What are the requirements for the format of the GTF file used to build the reference?

The chromosome names in the GTF file must be the same as the chromosome names in the genome FASTA file.
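
One way to check this before building the reference is to compare the sequence names in the two files. The Python sketch below works on lines of text that have already been read from the files. The function names are hypothetical, and it assumes the usual conventions that a FASTA sequence name is the first whitespace-delimited token after ">" and that the first tab-separated GTF column holds the chromosome name.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA header lines (first token after '>')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gtf_names(lines):
    """Collect chromosome names from the first column of GTF records."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_fasta(gtf_lines, fasta_lines):
    """Return GTF chromosome names that do not occur in the FASTA file, sorted."""
    return sorted(gtf_names(gtf_lines) - fasta_names(fasta_lines))
```

An empty result means every chromosome name used in the GTF file also appears in the FASTA file; a non-empty result (for example `1` in the GTF versus `chr1` in the FASTA) points to a naming mismatch.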
