
LUSH User Manual

Chapter 1 Product Information

1.1 Product Description

The LUSH workflow is an optimized pipeline based on GATK best practices. Its main components include the aligner, bqsr, variantCaller, genotyper, and report tools from DCS Tools. The workflow features data alignment, quality control, and variant detection. Users can manage the analysis process through the graphical interface of the DCS Cloud, which simplifies parameter input and result output. The workflow utilizes high-performance server configurations and parallel task execution to shorten the analysis cycle, enabling fast and efficient delivery.

1.2 Precautions

  1. This product is intended for research purposes only and not for clinical diagnosis. Please read this manual carefully before use.
  2. This manual and the information it contains are proprietary and confidential to BGI Research. Without written permission from BGI Research, no individual or organization may reprint, reproduce, modify, disseminate, or disclose any part of this manual to others. The readers of this manual are end-users authorized by BGI Research. Unauthorized use of this manual is strictly prohibited.
  3. BGI Research makes no warranties of any kind regarding this manual, including (but not limited to) implied warranties of merchantability and fitness for a particular purpose. BGI Research has taken measures to ensure the accuracy of this manual. However, BGI Research is not responsible for errors or omissions and reserves the right to make improvements to this manual and the workflow to enhance reliability, functionality, or design.
  4. All images in this manual are schematic diagrams. There may be slight differences between the images and the actual interface. Please refer to the actual interface for accurate representation.

Chapter 2 Product Introduction

2.1 Analysis Flowchart


The workflow includes the following functions:

  • Data Alignment: Align sequences to the reference genome and generate alignment statistics.
  • Variant Detection: Analyze sample mutations, including SNPs and INDELs, based on alignment results.

The analysis flowchart is shown in Figure 2-1:

1-2.png
Figure 2-1 Analysis System Flowchart

2.1.1 Reference Genome Filtering, Alignment, Sorting, and Deduplication

Sequences are aligned to the reference genome to determine their positions, providing a foundation for variant detection. Duplicate sequences generated by PCR amplification are marked and removed to reduce false positives and improve variant detection accuracy. These steps are performed by the aligner tool. Additionally, the aligner integrates data quality control functions to filter and process sequences before alignment, reducing error rates and avoiding interference from noisy data in subsequent analyses.

2.1.2 Base Quality Score Recalibration (BQSR)

Due to systematic errors, the base quality scores reported by the sequencer are not always accurate, so they need to be recalibrated. This step corrects systematic biases, generates more accurate quality scores, improves variant detection accuracy, and reduces false positives and false negatives. The recalibration follows GATK-recommended best practices and is performed by the bqsr tool.
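The recalibration described above operates on Phred-scaled quality scores. As a reminder of the scale (a standard relationship, not specific to this workflow), a base quality score $Q$ encodes an error probability $P$ as:

```latex
\[
Q = -10 \log_{10} P, \qquad P = 10^{-Q/10}
\]
```

GATK-style BQSR compares the reported scores with the error rate actually observed in each covariate group (read group, machine cycle, sequence context). With $m$ observed mismatches at sites not listed as known variants among $n$ bases, the empirical quality is roughly:

```latex
\[
Q_{\text{empirical}} \approx -10 \log_{10} \frac{m + 1}{n + 2}
\]
```

(The $+1$/$+2$ smoothing terms are an assumption of this sketch; the exact estimator is defined by the GATK implementation.)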

2.1.3 Variant Detection

The variantCaller is a C/C++ reimplementation of GATK HaplotypeCaller, and the genotyper is a C/C++ reimplementation of GenotypeGVCFs. The variantCaller tool is used for variant detection, generating GVCF files, while the genotyper tool produces Genotype VCF files.

2.1.4 Report Generation

The report tool consolidates statistical files output from previous tasks, generating images and HTML reports.

Chapter 3 User Manual

The LUSH standard analysis workflow is managed end-to-end through the DCS Cloud, from sample input to result output. Below is a detailed operational guide for using the LUSH standard analysis workflow on the DCS Cloud.

3.1 Guide Overview

3.1.1 Overview

This chapter explains how to use the LUSH standard analysis workflow. Before use, please read and understand the content to ensure correct usage of LUSH.

3.1.2 Workflow Suite

  1. LUSH_Germline_FASTQ_WGS_Human
  2. LUSH_reference_index
  3. LUSH_Germline_FASTQ_WGS_NHS
  4. ExpansionHunter_WGS_STR
  5. CNVpytor_WGS_CNV
  6. PanGenie_WGS_SV

3.2 Use Case 1: LUSH Main Workflow - Manual Submission

The operation consists of four steps: uploading data, adding a workflow, constructing the LUSH reference genome index, and running the LUSH analysis. After running the task, when the task status displays "completed," the task is finished.

3.2.1 Step 1: Upload Data

  1. Click the left navigation bar [Data], enter the data management page, navigate to the target folder, and click the upper-right corner [+ Add files] - [Tool upload] to upload data (Figure 3-1):
1.png
Figure 3-1 File Upload Step 1
Note: This feature is exclusive to the overseas AWS environment of the cloud platform. It allows users to upload local files or folders to the cloud platform using AWS tools.

On the Data Management Files page, click the "Add file" button. Select "Tool upload" to access the tool upload interface. The steps are as follows:

  1. Download the AWS CLI tool to your local machine and double-click the installation file (first-time AWS CLI users only).
  2. Open the command prompt on your local computer (on Windows, run cmd; on macOS or Linux, open a terminal).
  3. On the cloud platform's tool upload interface, enter the path of the local folder or file you want to upload (for a file, provide the full path including the file extension, for example: D:\upLoad\test.txt).
  4. Click the "Generate upload command" button. After the command is generated on the page, click "Copy command" and paste the command into the command prompt. If you are uploading a single file, remove the "--recursive" flag from the command. Press Enter to begin uploading to the current folder on the cloud platform. After the upload completes, refresh the cloud platform page.
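The shape of the generated command can be sketched as follows. This is a hypothetical illustration only: the real bucket name and destination prefix are issued by the platform's "Generate upload command" button, not chosen by the user.

```python
# Hypothetical sketch of the upload command generated by the platform.
# The bucket and prefix below are placeholders, not real platform values.
def build_upload_command(local_path: str, s3_dest: str, is_folder: bool) -> str:
    cmd = f"aws s3 cp {local_path} {s3_dest}"
    if is_folder:
        cmd += " --recursive"  # folders need --recursive; single files do not
    return cmd

print(build_upload_command("D:/upLoad/test.txt", "s3://example-bucket/data/", False))
```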

Figure 3-2 File Upload Step 2

3.2.2 Step 2: Add Workflow

  1. Click the top navigation bar [Project] to enter the project list page. In the search box, enter the project name/project number and click to enter the project (Figure 3-3):
3.png
Figure 3-3 Add Workflow Step 1
  2. Click the left navigation bar [Workflow] to enter the workflow analysis page. Click the upper-right corner [+ Add apps] to add a workflow, then click [Copy from library] (Figure 3-4):
4.png
Figure 3-4 Add Workflow Step 2
  3. For example, for LUSH_Germline_FASTQ_WGS_NHS, enter the workflow name LUSH_Germline_FASTQ_WGS_NHS in the search box, select it, click [Copy], then click [Confirm]. Perform the same operation for other workflow suites:
5.png
5-2.png
Figure 3-5 Add Workflow Step 3
  4. The workflow is successfully added:
6.png
Figure 3-6 Add Workflow Step 4

3.2.3 Step 3: LUSH Reference Genome Index Construction

  1. For analysis of other species' genomes, the LUSH workflow allows users to provide a reference genome and construct its index.
  2. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_reference_index and click [Run]:
7.png
Figure 3-7 LUSH Build Index Step 1
  3. Enter the entity ID and click [Next] (Figure 3-8):
8.png
Figure 3-8 LUSH Build Index Analysis Step 2
  4. Enter parameter information for LUSH, then click [Next] (Figure 3-9):
9.png
Figure 3-9 LUSH Build Index Step 3

LUSH_build_index Workflow Variable Descriptions:

  • ReferenceName: Reference genome name.
  • ReferenceFasta: Reference genome FASTA file.
  • dbsnpVcf: dbSNP database VCF file for the reference genome (optional).
  • KnownSiteVcfs: VCF file containing known sites for the species (optional).
  • ReferenceAlt: Index file containing alt information for the species (optional).
  • Species: Species name.
  5. Click [Run] to start the analysis (Figure 3-10):
10.png
Figure 3-10 LUSH Build Index Step 4
  6. Click the left navigation bar [Task] to enter the task management page. Under Workflow Name, select LUSH_build_index to view the task status. Once the task is completed, the status will display "completed." Copy the task number, click the navigation bar [Data], and enter the task number in the folder/file name search box. Navigate to the result files; the reference folder there is used as the ReferenceDir parameter input for the LUSH_Germline_FASTQ_WGS_NHS workflow.
11.png
Figure 3-11 LUSH Build Index Step 5
12.png
Figure 3-12 LUSH Build Index Step 6

3.2.4 Step 4: LUSH_Germline_FASTQ_WGS_NHS Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_Germline_FASTQ_WGS_NHS and click [Run] (refer to 3.2.3).
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):
13.png
Figure 3-13 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 1
  3. Enter sample information for LUSH_Germline_FASTQ_WGS_NHS:
  • PE FASTQ Input: Based on the number of paired FASTQ files, click the "+" button to add file groups and select the corresponding FASTQ1 and FASTQ2 files for each group.
  • SE FASTQ Input: Similar to PE data, place one FASTQ file in each file group.
14-2.png
Figure 3-14 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 2

LUSH_Germline_FASTQ_WGS_NHS Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • FASTQ: Sequencing data (fq.gz or arc). PE: One file group contains a pair of FASTQ files. SE: One file group contains one FASTQ file.
  • ReferenceDir: Supports user-uploaded genome data for constructing a reference genome (refer to 3.2.3). Input is the reference directory from the LUSH_reference_index workflow result files.
  • arcDir: Folder containing FASTQ index files for arcseq decompression. Not required if FASTQ format is fq.gz.
  • Adapter1: Adapter sequence for Read1 (used for filtering).
  • Adapter2: Adapter sequence for Read2 (used for filtering).
  • SOAPnukeLowQual: SOAPnuke low-quality threshold, default is 12.
  • SOAPnukeLowQualityRate: SOAPnuke low-quality rate threshold, default is 0.5.
  • SOAPnukeNRate: SOAPnuke N ratio threshold, default is 0.1.
  • StandCallConf: Variant detection confidence threshold, default is 30.
  • OutputUnmappedReads: Whether to output unmapped reads (FASTQ files), default is no.
  • OutputSortMarkdupBam: Whether to output sorted and deduplicated BAM files, default is no.
  • OutputBqsrBam: Whether to output BQSR-processed BAM files, default is no.
  • ApplyBQSR: Whether to run BQSR, default is yes. If set to "no," the sorted and deduplicated BAM file will be used for variant detection. If the reference genome's knownsites VCF is empty, this will be forced to "no."
  • ApplyHaplotypeCaller: Whether to run HaplotypeCaller, default is yes.
  • AlignerMemorySet: Alignment memory setting, default is 128 (GB).
  • BQSRMemorySet: BQSR memory setting, default is 64 (GB).
  • HaplotypeCallerMemorySet: HaplotypeCaller memory setting, default is 64 (GB).
  4. Click [Run] to start the analysis and wait for the results (Figure 3-15):
15.png
Figure 3-15 LUSH_Germline_FASTQ_WGS_NHS Analysis Step 3
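The three SOAPnuke thresholds listed above (SOAPnukeLowQual, SOAPnukeLowQualityRate, SOAPnukeNRate) can be illustrated with a small sketch. This is not the actual SOAPnuke implementation (which also applies adapter and length filters, among others); it only mirrors how the documented defaults interact.

```python
# Illustrative sketch of the three SOAPnuke thresholds above.
# NOT the real SOAPnuke code; defaults mirror the documented values.
def keep_read(qualities, bases, low_qual=12, low_qual_rate=0.5, n_rate=0.1):
    """Return True if a read passes the low-quality and N-content filters."""
    n_fraction = bases.upper().count("N") / len(bases)
    low_fraction = sum(q < low_qual for q in qualities) / len(qualities)
    return low_fraction <= low_qual_rate and n_fraction <= n_rate

# One low-quality base out of four (25% < 50%) and no Ns: the read is kept.
print(keep_read([30, 30, 30, 8], "ACGT"))
```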

3.3 Use Case 2: LUSH Fixed-Price Workflow - Manual Submission

3.3.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.3.2 Step 2: Add Workflow

Same as Use Case 1 (Manual Submission) Step 2 (refer to 3.2.2 Step 2: Add Workflow).

3.3.3 Step 3: Add Public hg38 Reference Genome Files

  1. Click the "Public Library" tab at the top of the page to enter the public library interface.
16.png
Figure 3-16 LUSH_Germline_FASTQ_WGS_Human Analysis Step 1
  2. On the public library page, click the Data tab to enter the data tab. In the search bar, enter "LUSH_Reference_hg38" to search. In the search results, find the LUSH_Reference_hg38 card and click the Copy button.
17.png
Figure 3-17 LUSH_Germline_FASTQ_WGS_Human Analysis Step 2
  3. In the pop-up dialog, select the project where you want to store the data. Navigate to the specified path in the project, set the specific location for data storage, and complete the public data pull.
18.png
Figure 3-18 LUSH_Germline_FASTQ_WGS_Human Analysis Step 3

3.3.4 Step 4: LUSH_Germline_FASTQ_WGS_Human Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter LUSH_Germline_FASTQ_WGS_Human and click [Run] (refer to 3.2.3).
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3).
  3. Enter sample information for LUSH_Germline_FASTQ_WGS_Human:
  • PE FASTQ Input: Based on the number of paired FASTQ files, click the "+" button to add file groups and select the corresponding FASTQ1 and FASTQ2 files for each group.

  • SE FASTQ Input: Similar to PE data, place one FASTQ file in each file group.

19.png
Figure 3-19 LUSH_Germline_FASTQ_WGS_Human Analysis Step 4

LUSH_Germline_FASTQ_WGS_Human Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • FASTQ: Sequencing data (fq.gz or arc). PE: One file group contains a pair of FASTQ files. SE: One file group contains one FASTQ file.
  • ReferenceDir: Reference genome folder, which needs to be pulled from the public library into the analysis project.
  • OutputSortMarkdupBam: Whether to output sorted and deduplicated BAM files, default is no.
  4. Click [Run] to start the analysis and wait for the results (Figure 3-20):
20.png
Figure 3-20 LUSH_Germline_FASTQ_WGS_Human Analysis Step 5

3.4 Use Case 3: ExpansionHunter_WGS_STR Workflow - Manual Submission

The ExpansionHunter_WGS_STR workflow uses Expansion Hunter to identify short tandem repeat (STR) sequences. Expansion Hunter is a tool for targeted genotyping of short tandem repeats and flanking variants. It searches BAM/CRAM files for reads spanning, flanking, or fully contained within each repeat. This analysis workflow currently only supports alignment data based on the human genome hg38 reference sequence.

3.4.1 Step 1: Input Data

For LUSH_Germline_FASTQ_WGS_Human analysis, set the OutputSortMarkdupBam parameter to "true" to output the aligner BAM file in the workflow results.

3.4.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.4.3 Step 3: ExpansionHunter_WGS_STR Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. In the search box, enter ExpansionHunter_WGS_STR and click [Run] (refer to 3.2.3):
21.png
Figure 3-21 ExpansionHunter_WGS_STR Analysis Step 4 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):
22.png
Figure 3-22 ExpansionHunter_WGS_STR Analysis Step 4 (2)
  3. Enter sample information for ExpansionHunter_WGS_STR (Figure 3-23).
23.png
Figure 3-23 ExpansionHunter_WGS_STR Analysis Step 4 (3)

ExpansionHunter_WGS_STR Parameter Descriptions:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • ReferenceDir: Reference genome folder, which needs to be pulled from the public library into the analysis project.
  • Sex: Sample gender, "male" or "female."
  • Bam: Sorted and deduplicated BAM file.
  • BamIndex: Index file for the sorted and deduplicated BAM file.
  4. Click [Run] to start the analysis and wait for the results:
24.png
Figure 3-24 ExpansionHunter_WGS_STR Analysis Step 4 (4)

3.5 Use Case 4: PanGenie_WGS_SV Workflow - Manual Submission

The PanGenie_WGS_SV workflow is a graph-based pan-genome workflow designed for genotype imputation and structural variant (SV) detection using PanGenie from sequencing data. Genotype calculation is based on read k-mer counts and a set of known, fully assembled haplotypes. Compared to alignment-based methods, PanGenie achieves higher genotype concordance for almost all tested variant types and coverages. Improvements are particularly significant for large insertions (≥50 bp) and variants in repetitive regions, enabling these categories of variants to be included in genome-wide association studies.

3.5.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.5.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.5.3 Step 3: Add PanGenie Reference Genome Files

  1. Click the "Public Library" tab at the top of the page to enter the public library interface.
25.png
Figure 3-25 PanGenie_WGS_SV Analysis Step 3 (1)
  2. On the public library page, click the Data tab to enter the data tab. In the search bar, enter "PanGenie_refindex" to search. In the search results, find the PanGenie_refindex card and click the Copy button.
26.png
Figure 3-26 PanGenie_WGS_SV Analysis Step 3 (2)
  3. In the pop-up dialog, select the project where you want to store the data. Navigate to the specified path in the project, set the specific location for data storage, and complete the public data pull.
27.png
Figure 3-27 PanGenie_WGS_SV Analysis Step 3 (3)

3.5.4 Step 4: Add Human Structural Variant Annotation Database Files

Follow the method in Step 3, but change the search target to AnnotSV_annotations.

3.5.5 Step 5: PanGenie_WGS_SV Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. Enter PanGenie_WGS_SV in the search box and click [Run] (refer to 3.2.3):

    28.png
    Figure 3-28 PanGenie_WGS_SV Analysis Step 5 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):

    29.png
    Figure 3-29 PanGenie_WGS_SV Analysis Step 5 (2)
  3. PanGenie_WGS_SV: Enter sample information. Click the folder icon under FqFile to enter the data management page and select the FASTQ file.

    30.png
    Figure 3-30 PanGenie_WGS_SV Analysis Step 5 (3)
  4. PanGenie_WGS_SV: Enter the PanGenie reference genome folder. Click the folder icon under the RefDir parameter to enter the data management page and select the PanGenie reference genome folder.

    31-2.png
    Figure 3-31 PanGenie_WGS_SV Analysis Step 5 (4)
  5. PanGenie_WGS_SV: Enter the human structural variant annotation database folder. Click the folder icon under the AnnotationRef parameter to enter the data management page and select the human structural variant annotation database.

    32.png
    Figure 3-32 PanGenie_WGS_SV Analysis Step 5 (5)

PanGenie_WGS_SV Parameter Description:

  • SampleID: Sample name or unique ID, default is the same as the Entity ID.
  • FqFile: Sequencing file.
  • RefDir: PanGenie reference genome file.
  • AnnotationRef: Human structural variant annotation database.
  • Threads: Number of CPU cores for the task.
  6. Click [Run] to start the analysis and wait for the output (Figure 3-33):

    33.png
    Figure 3-33 PanGenie_WGS_SV Analysis Step 5 (6)

3.6 Use Case 5: CNVpytor_WGS_CNV Workflow - Manual Submission

The CNVpytor_WGS_CNV workflow is a CNV detection and analysis workflow based on CNVpytor, used to detect and analyze copy number variations (CNVs) from sequencing data. CNVpytor inherits the core engine of its predecessor and extends visualization, modularity, performance, and functionality. Additionally, CNVpytor utilizes B-allele frequency (BAF) likelihood information from single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) as additional evidence for CNVs/copy number aberrations (CNAs) and primary information for copy number-neutral loss of heterozygosity.

3.6.1 Step 1: Input Data

When performing LUSH_Germline_FASTQ_WGS_Human analysis, set the OutputSortMarkdupBam parameter to true to output the aligner BAM file.

3.6.2 Step 2: Add Workflow

Similar to Use Case 1 (Manual Submission) Step 2, search by workflow name (refer to 3.2.2 Step 2: Add Workflow).

3.6.3 Step 3: CNVpytor_WGS_CNV Analysis Workflow

  1. Click the left navigation bar [Workflow] to enter the workflow analysis page. Enter CNVpytor_WGS_CNV in the search box and click [Run] (refer to 3.2.3):

    34.png
    Figure 3-34 CNVpytor_WGS_CNV Analysis Step 3 (1)
  2. Enter the entity ID (usually the sample ID) and click [Next] (refer to 3.2.3):

    35.png
    Figure 3-35 CNVpytor_WGS_CNV Analysis Step 3 (2)
  3. Click the folder icon under Bam to enter the data management page and select the BAM file.

    36.png
    Figure 3-36 CNVpytor_WGS_CNV Analysis Step 3 (3)
  4. Click on the folder icon in BamIndex to enter the data management page and select the index file for Bam.

CNVpytor_WGS_CNV parameter description:

  • SampleID: Sample name or unique ID, defaulting to the Entity ID.
  • Bam: The aligned, deduplicated, and sorted BAM file.
  • BamIndex: The index file for the aligned, deduplicated, and sorted BAM file.
  5. Click [Run] to start the analysis and wait for the results (Figure 3-37).

    37.png
    Figure 3-37 CNVpytor_WGS_CNV Analysis Step 3 (4)

3.7 Use Case 6: Table Submission (Using LUSH_Germline_FASTQ_WGS_NHS as an Example)

The operation includes five steps: upload data, add workflow, download the table template, fill in and import the table, and start the analysis. After importing the sample template, tasks can be run in batches. When the task status shows "completed," the task is finished and results can be viewed (refer to 3.8).

3.7.1 Step 1: Upload Data

Same as Use Case 1 (Manual Submission) Step 1 (refer to 3.2.1 Step 1: Upload Data).

3.7.2 Step 2: Add Workflow

Same as Use Case 1 (Manual Submission) Step 2 (refer to 3.2.2 Step 2: Add Workflow).

3.7.3 Step 3: Download Table Template

  1. Click the left navigation bar [Data], select [Table] - [Download] (Figure 3-38). Click [Data model template] and download the LUSH_Germline_FASTQ_WGS_NHS_V1.0.0 template (the template for LUSH_Germline_FASTQ_WGS_Human is LUSH_Germline_FASTQ_WGS_Human_V1.0.0):

    38-2.png
    Figure 3-38 Sample Template Download
  2. The opened sample template Excel is shown in Figure 3-39:

39.png
Figure 3-39 LUSH_Germline_FASTQ_WGS_NHS Sample Template Table

3.7.4 Step 4: Fill and Import the Table

  1. In this use case, the imported table must contain a completed worksheet. Analysis begins directly after the table for already-sequenced sample data is imported.

::: important

  • The imported file path must already exist on the cloud platform.
  • Do not merge cells in Excel, and avoid spaces or special characters before or after cell content.

:::

Sample analysis entry (Figure 3-40):
40.png
Figure 3-40 LUSH_Germline_FASTQ_WGS_NHS Filled Template Table
  2. After configuring the table template, return to the [Data] interface. Click [Table] - [+ Add table] (Figure 3-41):

    41.png
    Figure 3-41 Table Import Step 1
  3. Click [Click to upload / Drop here] to browse and select the table with filled sample information, then click [Confirm] (Figure 3-42). After uploading, the file will be displayed in the target folder:

    42.png
    Figure 3-42 Table Import Step 2
  4. Click the navigation bar [Workflow] to enter the workflow analysis page. Enter LUSH_Germline_FASTQ_WGS_NHS in the search box and click [Run].

  5. Select Run workflow(s), click [Please select table], select the table imported in Step 3 above, choose the required rows, and click [Next] (Figure 3-43):

    43.png
    Figure 3-43 Table Import Step 4
  6. Under Value, click the selection icon and select the corresponding value. For example, for FASTQ, select ${FASTQ1} and ${FASTQ2}. Note that ${FASTQ1} and ${FASTQ2} must be selected in order (Figure 3-44):

    44.png
    Figure 3-44 Table Import Step 5
  7. Enter sample information, then click [Next] and confirm that the parameter settings are correct (Figure 3-45):

    45.png
    Figure 3-45 Table Import Template

3.7.5 Step 5: Start Analysis

Click [Run] to start the analysis (Figure 3-46):

46.png
Figure 3-46 Start Analysis

3.8 LUSH Result Files

3.8.1 Result File Download

  1. Click the left navigation bar [Task] to enter the task management page. Select LUSH_Germline_FASTQ_WGS_NHS under Workflow Name to view the task status. When the task is completed, the status will show "completed," indicating the task is finished. Copy the Task ID (Figure 3-47):

    47.png
    Figure 3-47 Result View and Download Step 1
  2. Download the Ossutil tool zip file (for first-time use).
  3. Extract the downloaded zip file and double-click to open the Ossutil tool, entering the tool interface.
  4. On the cloud platform file page, select the files you want to download, click the "Download - Tool download" button, then click "Generate download command," and finally click "Copy command." Paste the copied command into the tool interface. The files will be automatically downloaded to the current folder.

Figure 3-48 Result View and Download Step 2
Figure 3-49 Result View and Download Step 3

3.8.2 LUSH WGS Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

├── sample.genotyper.vcf.gz
├── sample.genotyper.vcf.gz.tbi
├── sample.g.vcf.gz
├── sample.g.vcf.gz.tbi
├── sample.unmapped.1.fq.gz
├── sample.unmapped.2.fq.gz
├── QC
│   ├── Base_distributions_by_read_position_1.txt
│   ├── Base_distributions_by_read_position_2.txt
│   ├── Base_quality_value_distribution_by_read_position_1.txt
│   ├── Base_quality_value_distribution_by_read_position_2.txt
│   ├── Basic_Statistics_of_Sequencing_Quality.txt
│   ├── Distribution_of_Q20_Q30_bases_by_read_position_1.txt
│   ├── Distribution_of_Q20_Q30_bases_by_read_position_2.txt
│   ├── output_filter_files_report.txt
│   └── Statistics_of_Filtered_Reads.txt
└── report
    ├── sample.AT.xls
    ├── sample.bamstat.xls
    ├── sample.base.png
    ├── sample.cumuPlot.png
    ├── sample.fqstat.xls
    ├── sample.histPlot.png
    ├── sample.insertsize.png
    ├── sample.qual.png
    ├── sample_report_cn.html
    ├── sample_report_en.html
    └── sample.vcfstat.xls

Directory Description:

  • sample.*.vcf.gz*: Variant detection results and index files.
  • sample.unmapped.*.fq.gz: Reads that failed to align to the reference genome.
  • QC: Quality control and filtering results.
  • report: Workflow result reports.
  • report/sample_report_en.html: English version of the sample analysis report.
  • report/sample_report_cn.html: Chinese version of the sample analysis report.
  • report/sample.fqstat.xls: Statistics for quality control results.
  • report/sample.bamstat.xls: Statistics for alignment results.
  • report/sample.vcfstat.xls: Statistics for variant results.
  • report/*.png: Images for the report.
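For batch post-processing, a downloaded result directory with the layout above can be indexed programmatically. A minimal sketch (file and folder names follow the documented layout; the demo builds a mock tree as a stand-in for a real download):

```python
import pathlib
import tempfile

# Minimal sketch: locate key result files in a downloaded LUSH output
# directory, following the layout documented above.
def index_results(outdir: str) -> dict:
    root = pathlib.Path(outdir)
    return {
        "vcfs": sorted(p.name for p in root.glob("*.vcf.gz")),
        "qc": sorted(p.name for p in (root / "QC").glob("*.txt")),
        "reports": sorted(p.name for p in (root / "report").glob("*_report_*.html")),
    }

# Demo on a mock directory tree (stand-in for a real download).
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "QC").mkdir()
    (root / "report").mkdir()
    (root / "sample.genotyper.vcf.gz").touch()
    (root / "QC" / "Statistics_of_Filtered_Reads.txt").touch()
    (root / "report" / "sample_report_en.html").touch()
    print(index_results(d))
```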

Statistical Results Display:

  • Quality Control Statistics: b1.png
  • Alignment Statistics: b2.png
  • Variant Statistics: b3.png

3.8.3 ExpansionHunter_WGS_STR Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

└── STR
    ├── sample.json
    ├── sample_realigned.bam
    └── sample.vcf

Directory Description:

  • sample.json: JSON file containing sample parameters and summarized analysis results by locus.
  • sample_realigned.bam: BAM file containing realigned read data.
  • sample.vcf: VCF file containing STR positions and information.
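The sample.json summary can be inspected with standard tooling. Below is a sketch using a hypothetical, abbreviated record: the field names are illustrative only, and the authoritative schema is defined by the Expansion Hunter documentation.

```python
import json

# Hypothetical, abbreviated example of the per-locus summary in sample.json.
# Field names are illustrative; consult the Expansion Hunter docs for the
# authoritative schema.
raw = """
{
  "SampleParameters": {"SampleId": "sample", "Sex": "Female"},
  "LocusResults": {
    "HTT": {"Variants": {"HTT": {"Genotype": "17/20"}}}
  }
}
"""

data = json.loads(raw)
for locus, result in data["LocusResults"].items():
    for variant, info in result["Variants"].items():
        print(locus, variant, info["Genotype"])
```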

3.8.4 PanGenie_WGS_SV Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

├── circos.png
├── circos.svg
├── sample_AnnotSV
│   └── sample-pangenie-genotyping-biallelic-filtered-sv.annotated.tsv
├── sample-pangenie-genotyping-biallelic-filtered-sv.vcf.gz
├── sample-pangenie-genotyping-biallelic-filtered-sv.vcf.gz.tbi
├── pangenie-biallelic
│   ├── sample-pangenie-genotyping-biallelic.vcf.gz
│   └── sample-pangenie-genotyping-biallelic.vcf.gz.tbi
├── pangenie-results_genotyping.vcf
├── pangenie-results_histogram.histo
└── sv_type_counts.txt

Directory Description:

  • sample.*.vcf.gz*: Variant detection results and index; the -filtered files contain only SV information.
  • sample-pangenie-genotyping-biallelic-filtered-sv.annotated.tsv: Results file containing SV genotyping and annotation information.
  • pangenie-results_genotyping.vcf: SV variant detection results in pan-genome format.
  • sv_type_counts.txt: SV statistics.
  • *.png, *.svg, *.histo: Statistical images and plotting data.
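The per-type tallies in sv_type_counts.txt can be reproduced from the filtered VCF with a short sketch. This assumes SVTYPE=... entries in the INFO column (the usual VCF convention for structural variants); it is not the workflow's own statistics script.

```python
import collections
import re

# Count structural-variant types from VCF body lines by reading the
# SVTYPE=... key in the INFO column (standard VCF convention). This is a
# sketch, not the workflow's actual statistics script.
def count_sv_types(vcf_lines):
    counts = collections.Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue                      # skip header lines
        info = line.rstrip("\n").split("\t")[7]
        m = re.search(r"(?:^|;)SVTYPE=([^;]+)", info)
        if m:
            counts[m.group(1)] += 1
    return dict(counts)

demo = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-300",
    "chr1\t900\t.\tT\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=120",
    "chr2\t500\t.\tG\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-60",
]
print(count_sv_types(demo))   # {'DEL': 2, 'INS': 1}
```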

3.8.5 CNVpytor_WGS_CNV Analysis Workflow Result Files

Upon successful execution, the following directory tree will be generated in the output directory:

cnv_results/
├── cnv_type_counts.txt
├── sample.CNV.call.txt
├── sample.CNV.manhattan.global.0000.png
├── sample.CNV.rdstat.stat.0000.png
└── out.pytor

Directory Description:

  • cnv_type_counts.txt: CNV statistics.
  • sample.CNV.call.txt: Results file containing CNV information.
  • sample.CNV.manhattan.*.png: Manhattan heatmap.
  • sample.CNV.rdstat.stat.*.png: Read depth (RD) analysis statistics.
  • out.pytor: CNVpytor file (HDF5 format).
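A tally like cnv_type_counts.txt can be derived from the call file with a short sketch. The column layout here is illustrative (each call line is assumed to start with the CNV type, e.g. "deletion" or "duplication", followed by region and size columns); check sample.CNV.call.txt itself for the exact columns CNVpytor emits.

```python
import collections

# Sketch: tally CNV types from a CNVpytor-style call file. The assumed
# layout (type, region, size, ...) is illustrative, not authoritative.
def count_cnv_types(lines):
    counts = collections.Counter()
    for line in lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1        # first column: CNV type
    return dict(counts)

calls = [
    "deletion\tchr1:10001-20000\t10000",
    "duplication\tchr2:5001-9000\t4000",
    "deletion\tchr3:1-3000\t3000",
]
print(count_cnv_types(calls))   # {'deletion': 2, 'duplication': 1}
```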