
LUSH User Manual

Chapter 1 Product Information

1.1 Product Description

The LUSH workflow is an optimized pipeline based on GATK best practices. Its main components include the aligner, bqsr, variantCaller, genotyper, and report tools from DCS Tools. The workflow features data alignment, quality control, and variant detection. Users can manage the analysis process through the graphical interface of the DCS Cloud, which simplifies parameter input and result output. The workflow utilizes high-performance server configurations and parallel task execution to shorten the analysis cycle, enabling fast and efficient delivery.

1.2 Precautions

  1. This product is intended for research purposes only and not for clinical diagnosis. Please read this manual carefully before use.
  2. This manual and the information it contains are proprietary and confidential to BGI Research. Without written permission from BGI Research, no individual or organization may reprint, reproduce, modify, disseminate, or disclose any part of this manual to others. The readers of this manual are end-users authorized by BGI Research. Unauthorized use of this manual is strictly prohibited.
  3. BGI Research makes no warranties of any kind regarding this manual, including (but not limited to) implied warranties of merchantability and fitness for a particular purpose. BGI Research has taken measures to ensure the accuracy of this manual. However, BGI Research is not responsible for errors or omissions and reserves the right to make improvements to this manual and the workflow to enhance reliability, functionality, or design.
  4. All images in this manual are schematic diagrams. There may be slight differences between the images and the actual interface. Please refer to the actual interface for accurate representation.

Chapter 2 Product Introduction

2.1 Analysis Flowchart


The workflow includes the following functions:

  • Data Alignment: Align sequences to the reference genome and generate alignment statistics.
  • Variant Detection: Analyze sample mutations, including SNPs and INDELs, based on alignment results.

The analysis flowchart is shown in Figure 2-1:

1-2.png
Figure 2-1 Analysis System Flowchart

2.1.1 Reference Genome Filtering, Alignment, Sorting, and Deduplication

Sequences are aligned to the reference genome to determine their positions, providing a foundation for variant detection. Duplicate sequences generated by PCR amplification are marked and removed to reduce false positives and improve variant detection accuracy. These steps are performed by the aligner tool. Additionally, the aligner integrates data quality control functions to filter and process sequences before alignment, reducing error rates and avoiding interference from noisy data in subsequent analyses.

2.1.2 Base Quality Score Recalibration (BQSR)

Due to systematic errors, the base quality scores reported by the sequencer are not always accurate, so they need to be recalibrated. This step corrects systematic biases, generates more accurate quality scores, improves variant detection accuracy, and reduces false positives and false negatives. The recalibration follows GATK-recommended best practices and is performed by the bqsr tool.
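The recalibration described above operates on Phred-scaled quality scores. As a reminder of the scale (a standard relationship, not specific to this workflow), a base quality score $Q$ encodes an error probability $P$ as:

```latex
\[
Q = -10 \log_{10} P, \qquad P = 10^{-Q/10}
\]
```

GATK-style BQSR compares the reported scores with the error rate actually observed in each covariate group (read group, machine cycle, sequence context). With $m$ observed mismatches at sites not listed as known variants among $n$ bases, the empirical quality is roughly:

```latex
\[
Q_{\text{empirical}} \approx -10 \log_{10} \frac{m + 1}{n + 2}
\]
```

(The $+1$/$+2$ smoothing terms are an assumption of this sketch; the exact estimator is defined by the GATK implementation.)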

2.1.3 Variant Detection

The variantCaller is a C/C++ reimplementation of GATK HaplotypeCaller, and the genotyper is a C/C++ reimplementation of GenotypeGVCFs. The variantCaller tool is used for variant detection, generating GVCF files, while the genotyper tool produces Genotype VCF files.

2.1.4 Report Generation

The report tool consolidates statistical files output from previous tasks, generating images and HTML reports.

Chapter 3 User Manual

The LUSH standard analysis workflow is managed end-to-end through the DCS Cloud, from sample input to result output. Below is a detailed operational guide for using the LUSH standard analysis workflow on the DCS Cloud.

3.1 Guide Overview

3.1.1 Overview

This chapter explains how to use the LUSH standard analysis workflow. Before use, please read and understand the content to ensure correct usage of LUSH.

3.1.2 Workflow Suite

  1. LUSH_Germline_FASTQ_WGS_Human
  2. LUSH_reference_index
  3. LUSH_Germline_FASTQ_WGS_NHS
  4. ExpansionHunter_WGS_STR
  5. CNVpytor_WGS_CNV
  6. PanGenie_WGS_SV

3.2 Use Case 1: LUSH Main Workflow - Manual Submission

The operation consists of four steps: uploading data, adding a workflow, constructing the LUSH reference genome index, and running the LUSH analysis. After running the task, when the task status displays "completed," the task is finished.

3.2.1 Step 1: Upload Data

  1. Click the left navigation bar [Data], enter the data management page, navigate to the target folder, and click the upper-right corner [+ Add files] - [Tool upload] to upload data (Figure 3-1):
1.png
Figure 3-1 File Upload Step 1
Note: This feature is exclusive to the overseas AWS environment of the cloud platform. It allows users to upload local files or folders to the cloud platform using AWS tools.

On the Data Management Files page, click the "Add file" button. Select "Tool upload" to access the tool upload interface. The steps are as follows:

  1. Download the AWS CLI tool to your local machine and double-click the installation file (first-time AWS CLI users only).
  2. Open the command prompt on your local computer (on Windows, run cmd; on macOS or Linux, open a terminal).
  3. On the cloud platform's tool upload interface, enter the path of the local folder or file you want to upload (for a file, provide the full path including the file extension, for example: D:\upLoad\test.txt).
  4. Click the "Generate upload command" button. After the command is generated on the page, click "Copy command" and paste the command into the command prompt. If you are uploading a single file, remove the "--recursive" flag from the command. Press Enter to begin uploading to the current folder on the cloud platform. After the upload completes, refresh the cloud platform page.
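The shape of the generated command can be sketched as follows. This is a hypothetical illustration only: the real bucket name and destination prefix are issued by the platform's "Generate upload command" button, not chosen by the user.

```python
# Hypothetical sketch of the upload command generated by the platform.
# The bucket and prefix below are placeholders, not real platform values.
def build_upload_command(local_path: str, s3_dest: str, is_folder: bool) -> str:
    cmd = f"aws s3 cp {local_path} {s3_dest}"
    if is_folder:
        cmd += " --recursive"  # folders need --recursive; single files do not
    return cmd

print(build_upload_command("D:/upLoad/test.txt", "s3://example-bucket/data/", False))
```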

Figure 3-2 File Upload Step 2

3.2.2 Step 2: Add Workflow

  1. Click the top navigation bar [Project] to enter the project list page. In the search box, enter the project name/project number and click to enter the project (Figure 3-3):
3.png
Figure 3-3 Add Workflow Step 1
  2. Click the left navigation bar [Workflow] to enter the workflow analysis page. Click the upper-right corner [+ Add apps] to add a workflow, then click [Copy from library] (Figure 3-4):
4.png
Figure 3-4 Add Workflow Step 2
  3. For example, for LUSH_Germline_FASTQ_WGS_NHS, enter the workflow name LUSH_Germline_FASTQ_WGS_NHS in the search box, select it, click [Copy], then click [Confirm]. Perform the same operation for other workflow suites:
5.png
5-2.png
Figure 3-5 Add Workflow Step 3
  4. The workflow is successfully added:
6.png
Figure 3-6 Add Workflow Step 4

3.2.3 Step 3: LUSH Reference Genome Index Construction

  1. For analysis of other species' genomes, the LUSH workflow allows users to provide a reference genome and construct its index.
  2. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_reference_index and click [Run]:
7.png
Figure 3-7 LUSH Build Index Step 1
  3. Enter the entity ID and click [Next] (Figure 3-8):
8.png
Figure 3-8 LUSH Build Index Analysis Step 2
  4. Enter parameter information for LUSH, then click [Next] (Figure 3-9):
9.png
Figure 3-9 LUSH Build Index Step 3

LUSH_build_index Workflow Variable Descriptions:

  • ReferenceName: Reference genome name.
  • ReferenceFasta: Reference genome FASTA file.
  • dbsnpVcf: dbSNP database VCF file for the reference genome (optional).
  • KnownSiteVcfs: VCF file containing known sites for the species (optional).
  • ReferenceAlt: Index file containing alt information for the species (optional).
  • Species: Species name.
  5. Click [Run] to start the analysis (Figure 3-10):
10.png
Figure 3-10 LUSH Build Index Step 4
  6. Click the left navigation bar [Task] to enter the task management page. Under Workflow Name, select LUSH_build_index to view the task status. Once the task is completed, the status will display "completed." Copy the task number, click the navigation bar [Data], and enter the task number in the folder/file name search box. Navigate to the result files; the reference folder there is used as the ReferenceDir parameter input for the LUSH_Germline_FASTQ_WGS_NHS workflow.
11.png
Figure 3-11 LUSH Build Index Step 5
12.png
Figure 3-12 LUSH Build Index Step 6

3.2.4 Step 4: LUSH_Germline_FASTQ_WGS_NHS Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_Germline_FASTQ_WGS_NHS and click [Run] (refer to 3.2.3).
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):
13.png
Figure 3-13 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 1
  3. Enter sample information for LUSH_Germline_FASTQ_WGS_NHS:
  • PE FASTQ Input: Based on the number of paired FASTQ files, click the "+" button to add file groups and select the corresponding FASTQ1 and FASTQ2 files for each group.
  • SE FASTQ Input: Similar to PE data, place one FASTQ file in each file group.
14-2.png
Figure 3-14 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 2

LUSH_Germline_FASTQ_WGS_NHS Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • FASTQ: Sequencing data (fq.gz or arc). PE: One file group contains a pair of FASTQ files. SE: One file group contains one FASTQ file.
  • ReferenceDir: Supports user-uploaded genome data for constructing a reference genome (refer to 3.2.3). Input is the reference directory from the LUSH_reference_index workflow result files.
  • arcDir: Folder containing FASTQ index files for arcseq decompression. Not required if FASTQ format is fq.gz.
  • Adapter1: Adapter sequence for Read1 (used for filtering).
  • Adapter2: Adapter sequence for Read2 (used for filtering).
  • SOAPnukeLowQual: SOAPnuke low-quality threshold, default is 12.
  • SOAPnukeLowQualityRate: SOAPnuke low-quality rate threshold, default is 0.5.
  • SOAPnukeNRate: SOAPnuke N ratio threshold, default is 0.1.
  • StandCallConf: Variant detection confidence threshold, default is 30.
  • OutputUnmappedReads: Whether to output unmapped reads (FASTQ files), default is no.
  • OutputSortMarkdupBam: Whether to output sorted and deduplicated BAM files, default is no.
  • OutputBqsrBam: Whether to output BQSR-processed BAM files, default is no.
  • ApplyBQSR: Whether to run BQSR, default is yes. If set to "no," the sorted and deduplicated BAM file will be used for variant detection. If the reference genome's knownsites VCF is empty, this will be forced to "no."
  • ApplyHaplotypeCaller: Whether to run HaplotypeCaller, default is yes.
  • AlignerMemorySet: Alignment memory setting, default is 128 (GB).
  • BQSRMemorySet: BQSR memory setting, default is 64 (GB).
  • HaplotypeCallerMemorySet: HaplotypeCaller memory setting, default is 64 (GB).
  4. Click [Run] to start the analysis and wait for the results (Figure 3-15):
15.png
Figure 3-15 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 3
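The three SOAPnuke thresholds listed above (SOAPnukeLowQual, SOAPnukeLowQualityRate, SOAPnukeNRate) can be illustrated with a small sketch. This is not the actual SOAPnuke implementation (which also applies adapter and length filters, among others); it only mirrors how the documented defaults interact.

```python
# Illustrative sketch of the three SOAPnuke thresholds above.
# NOT the real SOAPnuke code; defaults mirror the documented values.
def keep_read(qualities, bases, low_qual=12, low_qual_rate=0.5, n_rate=0.1):
    """Return True if a read passes the low-quality and N-content filters."""
    n_fraction = bases.upper().count("N") / len(bases)
    low_fraction = sum(q < low_qual for q in qualities) / len(qualities)
    return low_fraction <= low_qual_rate and n_fraction <= n_rate

# One low-quality base out of four (25% < 50%) and no Ns: the read is kept.
print(keep_read([30, 30, 30, 8], "ACGT"))
```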

3.3 Use Case 2: LUSH Fixed-Price Workflow - Manual Submission

3.3.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.3.2 Step 2: Add Workflow

Same as Use Case 1 (Manual Submission) Step 2 (refer to 3.2.2 Step 2: Add Workflow).

3.3.3 Step 3: Add Public hg38 Reference Genome Files

  1. Click the "Public Library" tab at the top of the page to enter the public library interface.
16.png
Figure 3-16 LUSH_Germline_FASTQ_WGS_Human Analysis Step 1
  2. On the public library page, click the Data tab to enter the data tab. In the search bar, enter "LUSH_Reference_hg38" to search. In the search results, find the LUSH_Reference_hg38 card and click the Copy button.
17.png
Figure 3-17 LUSH_Germline_FASTQ_WGS_Human Analysis Step 2
  3. In the pop-up dialog, select the project where you want to store the data. Navigate to the specified path in the project, set the specific location for data storage, and complete the public data pull.
18.png
Figure 3-18 LUSH_Germline_FASTQ_WGS_Human Analysis Step 3

3.3.4 Step 4: LUSH_Germline_FASTQ_WGS_Human Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_Germline_FASTQ_WGS_Human and click [Run] (refer to 3.2.3).
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3).
  3. Enter sample information for LUSH_Germline_FASTQ_WGS_Human:
  • PE FASTQ Input: Based on the number of paired FASTQ files, click the "+" button to add file groups and select the corresponding FASTQ1 and FASTQ2 files for each group.

  • SE FASTQ Input: Similar to PE data, place one FASTQ file in each file group.

19.png
Figure 3-19 LUSH_Germline_FASTQ_WGS_Human Analysis Step 4

LUSH_Germline_FASTQ_WGS_Human Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • FASTQ: Sequencing data (fq.gz or arc). PE: One file group contains a pair of FASTQ files. SE: One file group contains one FASTQ file.
  • ReferenceDir: Reference genome folder, which needs to be pulled from the public library into the analysis project.
  • OutputSortMarkdupBam: Whether to output sorted and deduplicated BAM files, default is no.
  4. Click [Run] to start the analysis and wait for the results (Figure 3-20):
20.png
Figure 3-20 LUSH_Germline_FASTQ_WGS_Human Analysis Step 5

3.4 Use Case 3: ExpansionHunter_WGS_STR Workflow - Manual Submission

The ExpansionHunter_WGS_STR workflow uses Expansion Hunter to identify short tandem repeat (STR) sequences. Expansion Hunter is a tool for targeted genotyping of short tandem repeats and flanking variants. It searches BAM/CRAM files for reads spanning, flanking, or fully contained within each repeat. This analysis workflow currently only supports alignment data based on the human genome hg38 reference sequence.

3.4.1 Step 1: Input Data

For LUSH_Germline_FASTQ_WGS_Human analysis, set the OutputSortMarkdupBam parameter to "true" to output the aligner BAM file in the workflow results.

3.4.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.4.3 Step 3: ExpansionHunter_WGS_STR Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter ExpansionHunter_WGS_STR and click [Run] (refer to 3.2.3):
21.png
Figure 3-21 ExpansionHunter_WGS_STR Analysis Step 4 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):
22.png
Figure 3-22 ExpansionHunter_WGS_STR Analysis Step 4 (2)
  3. Enter sample information for ExpansionHunter_WGS_STR (Figure 3-23).
23.png
Figure 3-23 ExpansionHunter_WGS_STR Analysis Step 4 (3)

ExpansionHunter_WGS_STR Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • ReferenceDir: Reference genome folder, which needs to be pulled from the public library into the analysis project.
  • Sex: Sample gender, "male" or "female."
  • Bam: Sorted and deduplicated BAM file.
  • BamIndex: Index file for the sorted and deduplicated BAM file.
  4. Click [Run] to start the analysis and wait for the results:
24.png
Figure 3-24 ExpansionHunter_WGS_STR Analysis Step 4 (4)

3.5 Use Case 4: PanGenie_WGS_SV Workflow - Manual Submission

The PanGenie_WGS_SV workflow is a graph-based pan-genome workflow designed for genotype imputation and structural variant (SV) detection using PanGenie from sequencing data. Genotype calculation is based on read k-mer counts and a set of known, fully assembled haplotypes. Compared to alignment-based methods, PanGenie achieves higher genotype concordance for almost all tested variant types and coverages. Improvements are particularly significant for large insertions (≥50 bp) and variants in repetitive regions, enabling these categories of variants to be included in genome-wide association studies.

3.5.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.5.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.5.3 Step 3: Add PanGenie Reference Genome Files

  1. Click the "Public Library" tab at the top of the page to enter the public library interface.
25.png
Figure 3-25 PanGenie_WGS_SV Analysis Step 3 (1)
  2. On the public library page, click the Data tab to enter the data tab. In the search bar, enter "PanGenie_refindex" to search. In the search results, find the PanGenie_refindex card and click the Copy button.
26.png
Figure 3-26 PanGenie_WGS_SV Analysis Step 3 (2)
  3. In the pop-up dialog, select the project where you want to store the data. Navigate to the specified path in the project, set the specific location for data storage, and complete the public data pull.
27.png
Figure 3-27 PanGenie_WGS_SV Analysis Step 3 (3)

3.5.4 Step 4: Add Human Structural Variant Annotation Database Files

Follow the method in Step 3, but change the search target to AnnotSV_annotations.

3.5.5 Step 5: PanGenie_WGS_SV Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. Enter PanGenie_WGS_SV in the search box and click [Run] (refer to 3.2.3):

    28.png
    Figure 3-28 PanGenie_WGS_SV Analysis Step 5 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):

    29.png
    Figure 3-29 PanGenie_WGS_SV Analysis Step 5 (2)
  3. PanGenie_WGS_SV: Enter sample information. Click the folder icon under FqFile to enter the data management page and select the FASTQ file.

    30.png
    Figure 3-30 PanGenie_WGS_SV Analysis Step 5 (3)
  4. PanGenie_WGS_SV: Enter the PanGenie reference genome folder. Click the folder icon under the RefDir parameter to enter the data management page and select the PanGenie reference genome folder.

    31-2.png
    Figure 3-31 PanGenie_WGS_SV Analysis Step 5 (4)
  5. PanGenie_WGS_SV: Enter the human structural variant annotation database folder. Click the folder icon under the AnnotationRef parameter to enter the data management page and select the human structural variant annotation database.

    32.png
    Figure 3-32 PanGenie_WGS_SV Analysis Step 5 (5)

PanGenie_WGS_SV Parameter Description:

  • SampleID: Sample name or unique ID, default is the same as the Entity ID.
  • FqFile: Sequencing file.
  • RefDir: PanGenie reference genome file.
  • AnnotationRef: Human structural variant annotation database.
  • Threads: Number of CPU cores for the task.
  6. Click [Run] to start the analysis and wait for the output (Figure 3-33):

    33.png
    Figure 3-33 PanGenie_WGS_SV Analysis Step 5 (6)

3.6 Use Case 5: CNVpytor_WGS_CNV Workflow - Manual Submission

The CNVpytor_WGS_CNV workflow is a CNV detection and analysis workflow based on CNVpytor, used to detect and analyze copy number variations (CNVs) from sequencing data. CNVpytor inherits the core engine of its predecessor and extends visualization, modularity, performance, and functionality. Additionally, CNVpytor utilizes B-allele frequency (BAF) likelihood information from single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) as additional evidence for CNVs/copy number aberrations (CNAs) and primary information for copy number-neutral loss of heterozygosity.

3.6.1 Step 1: Input Data

When performing LUSH_Germline_FASTQ_WGS_Human analysis, set the OutputSortMarkdupBam parameter to true to output the aligner BAM file.

3.6.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.6.3 Step 3: CNVpytor_WGS_CNV Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. Enter CNVpytor_WGS_CNV in the search box and click [Run] (refer to 3.2.3):

    34.png
    Figure 3-34 CNVpytor_WGS_CNV Analysis Step 3 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):

    35.png
    Figure 3-35 CNVpytor_WGS_CNV Analysis Step 3 (2)
  3. Click the folder icon under Bam to enter the data management page and select the BAM file.

    36.png
    Figure 3-36 CNVpytor_WGS_CNV Analysis Step 3 (3)
  4. Click on the folder icon in BamIndex to enter the data management page and select the index file for Bam.

CNVpytor_WGS_CNV parameter description:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • Bam: The aligned, deduplicated, and sorted BAM file.
  • BamIndex: The index file for the aligned, deduplicated, and sorted BAM file.
  5. Click [Run] to start the analysis and wait for the results (Figure 3-37).

    37.png
    Figure 3-37 CNVpytor_WGS_CNV Analysis Step 3 (4)

3.7 Use Case 6: Table Submission (Using LUSH_Germline_FASTQ_WGS_NHS as an Example)

The operation includes five steps: upload data, add workflow, download the table template, fill in and import the table, and start the analysis. After importing the sample template, tasks can be run in batches. When the task status shows "completed," the task is finished and results can be viewed (refer to 3.8).

3.7.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.7.2 Step 2: Add Workflow

Same as Use Case 1 (Manual Submission) Step 2 (refer to 3.2.2 Step 2: Add Workflow).

3.7.3 Step 3: Download Table Template

  1. Click the left navigation bar [Data], select [Table] - [Download] (Figure 3-38). Click [Data model template] and download the LUSH_Germline_FASTQ_WGS_NHS_V1.0.0 template (the template for LUSH_Germline_FASTQ_WGS_Human is LUSH_Germline_FASTQ_WGS_Human_V1.0.0):

    38-2.png
    Figure 3-38 Sample Template Download
  2. The opened sample template Excel is shown in Figure 3-39:

39.png
Figure 3-39 LUSH_Germline_FASTQ_WGS_NHS Sample Template Table

3.7.4 Step 4: Fill and Import the Table

  1. In this use case, the imported table must contain a completed worksheet. Analysis begins directly after the table for already-sequenced sample data is imported.

::: important

  • The imported file path must already exist on the cloud platform.
  • Do not merge cells in Excel, and avoid spaces or special characters before or after cell content.

:::

Sample analysis entry (Figure 3-40):
40.png
Figure 3-40 LUSH_Germline_FASTQ_WGS_NHS Filled Template Table
  2. After configuring the table template, return to the [Data] interface. Click [Table] - [+ Add table] (Figure 3-41):

    41.png
    Figure 3-41 Table Import Step 1
  3. Click [Click to upload / Drop here] to browse and select the table with filled sample information, then click [Confirm] (Figure 3-42). After uploading, the file will be displayed in the target folder:

    42.png
    Figure 3-42 Table Import Step 2
  4. Click the navigation bar [Workflow] to enter the workflow analysis page. Enter LUSH_Germline_FASTQ_WGS_NHS in the search box and click [Run].

  5. Select Run workflow(s), click [Please select table], select the table imported in Step 3 above, choose the required rows, and click [Next] (Figure 3-43):

    43.png
    Figure 3-43 Table Import Step 4
  6. Under Value, click the selection icon and select the corresponding value. For example, for FASTQ, select ${FASTQ1} and ${FASTQ2}. Note that ${FASTQ1} and ${FASTQ2} must be selected in order (Figure 3-44):

    44.png
    Figure 3-44 Table Import Step 5
  7. Enter sample information, then click [Next] and confirm that the parameter settings are correct (Figure 3-45):

    45.png
    Figure 3-45 Table Import Template

3.7.5 Step 5: Start Analysis

Click [Run] to start the analysis (Figure 3-46):

46.png
Figure 3-46 Start Analysis

3.8 LUSH Result Files

3.8.1 Result File Download

  1. Click the left navigation bar [Task] to enter the task management page. Select LUSH_Germline_FASTQ_WGS_NHS under Workflow Name to view the task status. When the task is completed, the status will show "completed," indicating the task is finished. Copy the Task ID (Figure 3-47):

    47.png
    Figure 3-47 Result View and Download Step 1
  2. Download the Ossutil tool zip file (for first-time use).
  3. Extract the downloaded zip file and double-click to open the Ossutil tool, entering the tool interface.
  4. On the cloud platform file page, select the files you want to download, click the "Download - Tool download" button, then click "Generate download command," and finally click "Copy command." Paste the copied command into the tool interface. The files will be automatically downloaded to the current folder.

Figure 3-48 Result View and Download Step 2
Figure 3-49 Result View and Download Step 3

3.8.2 LUSH WGS Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

├── sample.genotyper.vcf.gz
├── sample.genotyper.vcf.gz.tbi
├── sample.g.vcf.gz
├── sample.g.vcf.gz.tbi
├── sample.unmapped.1.fq.gz
├── sample.unmapped.2.fq.gz
├── QC
│   ├── Base_distributions_by_read_position_1.txt
│   ├── Base_distributions_by_read_position_2.txt
│   ├── Base_quality_value_distribution_by_read_position_1.txt
│   ├── Base_quality_value_distribution_by_read_position_2.txt
│   ├── Basic_Statistics_of_Sequencing_Quality.txt
│   ├── Distribution_of_Q20_Q30_bases_by_read_position_1.txt
│   ├── Distribution_of_Q20_Q30_bases_by_read_position_2.txt
│   ├── output_filter_files_report.txt
│   └── Statistics_of_Filtered_Reads.txt
└── report
    ├── sample.AT.xls
    ├── sample.bamstat.xls
    ├── sample.base.png
    ├── sample.cumuPlot.png
    ├── sample.fqstat.xls
    ├── sample.histPlot.png
    ├── sample.insertsize.png
    ├── sample.qual.png
    ├── sample_report_cn.html
    ├── sample_report_en.html
    └── sample.vcfstat.xls

Directory Description:

  • sample.*.vcf.gz*: Variant detection results and index files.
  • sample.unmapped.*.fq.gz: Reads that failed to align to the reference genome.
  • QC: Quality control and filtering results.
  • report: Workflow result reports.
  • report/sample_report_en.html: English version of the sample analysis report.
  • report/sample_report_cn.html: Chinese version of the sample analysis report.
  • report/sample.fqstat.xls: Statistics for quality control results.
  • report/sample.bamstat.xls: Statistics for alignment results.
  • report/sample.vcfstat.xls: Statistics for variant results.
  • report/*.png: Images for the report.
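For batch post-processing, a downloaded result directory with the layout above can be indexed programmatically. A minimal sketch (file and folder names follow the documented layout; the demo builds a mock tree as a stand-in for a real download):

```python
import pathlib
import tempfile

# Minimal sketch: locate key result files in a downloaded LUSH output
# directory, following the layout documented above.
def index_results(outdir: str) -> dict:
    root = pathlib.Path(outdir)
    return {
        "vcfs": sorted(p.name for p in root.glob("*.vcf.gz")),
        "qc": sorted(p.name for p in (root / "QC").glob("*.txt")),
        "reports": sorted(p.name for p in (root / "report").glob("*_report_*.html")),
    }

# Demo on a mock directory tree (stand-in for a real download).
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "QC").mkdir()
    (root / "report").mkdir()
    (root / "sample.genotyper.vcf.gz").touch()
    (root / "QC" / "Statistics_of_Filtered_Reads.txt").touch()
    (root / "report" / "sample_report_en.html").touch()
    print(index_results(d))
```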

Statistical Results Display:

  • Quality Control Statistics: b1.png
  • Alignment Statistics: b2.png
  • Variant Statistics: b3.png

3.8.3 ExpansionHunter_WGS_STR Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

└── STR
    ├── sample.json
    ├── sample_realigned.bam
    └── sample.vcf

Directory Description:

  • sample.json: JSON file containing sample parameters and summarized analysis results by locus.
  • sample_realigned.bam: BAM file containing realigned read data.
  • sample.vcf: VCF file containing STR positions and information.
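The sample.json summary can be inspected with standard tooling. Below is a sketch using a hypothetical, abbreviated record: the field names are illustrative only, and the authoritative schema is defined by the Expansion Hunter documentation.

```python
import json

# Hypothetical, abbreviated example of the per-locus summary in sample.json.
# Field names are illustrative; consult the Expansion Hunter docs for the
# authoritative schema.
raw = """
{
  "SampleParameters": {"SampleId": "sample", "Sex": "Female"},
  "LocusResults": {
    "HTT": {"Variants": {"HTT": {"Genotype": "17/20"}}}
  }
}
"""

data = json.loads(raw)
for locus, result in data["LocusResults"].items():
    for variant, info in result["Variants"].items():
        print(locus, variant, info["Genotype"])
```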

3.8.4 PanGenie_WGS_SV Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

├── circos.png
├── circos.svg
├── sample_AnnotSV
│   └── sample-pangenie-genotyping-biallelic-filtered-sv.annotated.tsv
├── sample-pangenie-genotyping-biallelic-filtered-sv.vcf.gz
├── sample-pangenie-genotyping-biallelic-filtered-sv.vcf.gz.tbi
├── pangenie-biallelic
│   ├── sample-pangenie-genotyping-biallelic.vcf.gz
│   └── sample-pangenie-genotyping-biallelic.vcf.gz.tbi
├── pangenie-results_genotyping.vcf
├── pangenie-results_histogram.histo
└── sv_type_counts.txt

Directory Description:

  • sample.*.vcf.gz*: Variant detection results and index; the -filtered files contain only SV information.
  • sample-pangenie-genotyping-biallelic-filtered-sv.annotated.tsv: Results file containing SV genotyping and annotation information.
  • pangenie-results_genotyping.vcf: SV variant detection results in pan-genome format.
  • sv_type_counts.txt: SV statistics.
  • *.png, *.svg, *.histo: Statistical images and plotting data.
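The per-type tallies in sv_type_counts.txt can be reproduced from the filtered VCF with a short sketch. This assumes SVTYPE=... entries in the INFO column (the usual VCF convention for structural variants); it is not the workflow's own statistics script.

```python
import collections
import re

# Count structural-variant types from VCF body lines by reading the
# SVTYPE=... key in the INFO column (standard VCF convention). This is a
# sketch, not the workflow's actual statistics script.
def count_sv_types(vcf_lines):
    counts = collections.Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue                      # skip header lines
        info = line.rstrip("\n").split("\t")[7]
        m = re.search(r"(?:^|;)SVTYPE=([^;]+)", info)
        if m:
            counts[m.group(1)] += 1
    return dict(counts)

demo = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-300",
    "chr1\t900\t.\tT\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=120",
    "chr2\t500\t.\tG\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-60",
]
print(count_sv_types(demo))   # {'DEL': 2, 'INS': 1}
```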

3.8.5 CNVpytor_WGS_CNV Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

cnv_results/
├── cnv_type_counts.txt
├── sample.CNV.call.txt
├── sample.CNV.manhattan.global.0000.png
├── sample.CNV.rdstat.stat.0000.png
└── out.pytor

Directory Description:

  • cnv_type_counts.txt: CNV statistics.
  • sample.CNV.call.txt: Results file containing CNV information.
  • sample.CNV.manhattan.*.png: Manhattan heatmap.
  • sample.CNV.rdstat.stat.*.png: Read depth (RD) analysis statistics.
  • out.pytor: CNVpytor file (HDF5 format).
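A tally like cnv_type_counts.txt can be derived from the call file with a short sketch. The column layout here is illustrative (each call line is assumed to start with the CNV type, e.g. "deletion" or "duplication", followed by region and size columns); check sample.CNV.call.txt itself for the exact columns CNVpytor emits.

```python
import collections

# Sketch: tally CNV types from a CNVpytor-style call file. The assumed
# layout (type, region, size, ...) is illustrative, not authoritative.
def count_cnv_types(lines):
    counts = collections.Counter()
    for line in lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1        # first column: CNV type
    return dict(counts)

calls = [
    "deletion\tchr1:10001-20000\t10000",
    "duplication\tchr2:5001-9000\t4000",
    "deletion\tchr3:1-3000\t3000",
]
print(count_cnv_types(calls))   # {'deletion': 2, 'duplication': 1}
```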