How is NGS data actually generated? | NGS Data Analysis Workflow [Part 1]
📍Key Takeaways
- NGS analysis runs through three stages — Primary, Secondary, and Tertiary — and the quality and tools used at each stage directly shape the final result.
- Variant detection is not a single-algorithm process. Different tools are applied for different variant types: SNVs, CNVs, structural variants, repeat expansions, and more.
- The same sample can yield different results depending on the pipeline and interpretation criteria used. It’s not just about whether NGS was done — it’s about how it was done.

When you order a genomic test and receive a report, you typically see a summary of a few variants along with their interpretations.
Behind those results, however, lies a far more complex analytical process than the report itself suggests.
In this article, we will walk through how WES/WGS-based testing is actually performed, focusing on the clinically relevant workflow.
NGS analysis consists of three main stages
The overall workflow can be divided into three steps:
- Primary analysis
- Secondary analysis
- Tertiary analysis
Understanding this structure is the first step toward properly interpreting NGS results.
1. Primary analysis: “Reading the data”
This stage converts raw signals generated by the sequencing instrument into actual nucleotide sequence data.
The main steps include:
- Base calling (image → sequence)
- BCL to FASTQ conversion
- Adapter trimming
- Quality control (QC)
There is one key takeaway here: the initial quality of the data determines everything that follows.
If the data quality is poor at this stage, even the most sophisticated downstream analysis cannot fully compensate for it.
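To make the QC idea concrete, here is a minimal sketch of a read-level quality check on FASTQ records. It assumes the common Phred+33 quality encoding and a mean-quality cutoff of 20; the function names, the cutoff, and the two toy reads are illustrative assumptions, not a description of any specific production QC tool.

```python
from statistics import mean


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        # Line 0: '@' + read ID, line 1: bases, line 2: '+', line 3: qualities
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def passes_qc(quality_line, min_mean_q=20):
    """Keep a read only if its mean base quality reaches the cutoff (assumed value)."""
    return mean(phred_scores(quality_line)) >= min_mean_q


# Two toy reads: 'I' encodes Q40 (high quality), '!' encodes Q0 (unusable).
records = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "!!!!",
]
kept = [read_id for read_id, seq, qual in parse_fastq(records) if passes_qc(qual)]
print(kept)
```

In practice this kind of check is done by dedicated QC software over millions of reads; the sketch only shows what "quality" means at the level of a single read.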
2. Secondary analysis: “Finding where it belongs in the genome”
In this stage, the sequenced DNA fragments are aligned to the human reference genome.
Key steps include:
- Alignment (mapping to the reference genome)
- BAM file generation
- Duplicate read removal
- Base quality score recalibration (BQSR)
The most critical step also happens here: variant calling, where the actual variants are detected.
Importantly, no single algorithm detects every type of variant. Each variant class has its own tool:
- SNV/INDEL → GATK
- CNV → 3bCNV + MANTA
- Structural variants → MANTA
- Repeat expansion → ExpansionHunter
- Mobile element insertion → MELT
In other words, a single test result is the product of multiple analytical tools working together.
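The tool assignments listed above can be written down as a simple lookup table. This is only a sketch of the mapping as stated in this article; the dictionary and function names are hypothetical, and it does not run any of the callers.

```python
# Variant type -> caller(s), exactly as listed in the article.
CALLERS_BY_VARIANT_TYPE = {
    "SNV/INDEL": ["GATK"],
    "CNV": ["3bCNV", "MANTA"],
    "Structural variant": ["MANTA"],
    "Repeat expansion": ["ExpansionHunter"],
    "Mobile element insertion": ["MELT"],
}


def callers_for(variant_type):
    """Return the caller(s) used for one variant type."""
    try:
        return CALLERS_BY_VARIANT_TYPE[variant_type]
    except KeyError:
        raise ValueError(f"No caller configured for: {variant_type}") from None


def variant_types_for(caller):
    """Return every variant type a given caller contributes to."""
    return [vt for vt, callers in CALLERS_BY_VARIANT_TYPE.items() if caller in callers]


print(callers_for("CNV"))
print(variant_types_for("MANTA"))
```

Looking the table up in both directions shows that one variant type can depend on more than one caller, and one caller can contribute to more than one variant type.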

3. Tertiary analysis: “Assigning clinical meaning”
From this stage onward, the process moves beyond data processing into interpretation.
(1) Annotation
- Identifying which gene the variant is located in
- Assessing its impact on the protein
- Using tools such as VEP and internal databases
(2) Filtering
- Removing common variants using population databases (e.g., gnomAD)
- Most clinically insignificant variants are filtered out at this step
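As a rough illustration of frequency-based filtering, the sketch below keeps only variants whose population allele frequency is at or below a cutoff. The `gnomad_af` field name, the 1% cutoff, and the three toy variants are assumptions made for the example; real pipelines choose thresholds according to the disease model and their own criteria.

```python
def is_rare(variant, max_af=0.01):
    """Return True if the variant is rare enough to keep.

    Variants with no recorded frequency (None) are kept, since absence
    from a population database is itself consistent with rarity.
    """
    af = variant.get("gnomad_af")
    return af is None or af <= max_af


# Toy annotated variants (IDs and frequencies are made up).
variants = [
    {"id": "var_common", "gnomad_af": 0.25},
    {"id": "var_rare", "gnomad_af": 0.0001},
    {"id": "var_unseen", "gnomad_af": None},
]

rare_ids = [v["id"] for v in variants if is_rare(v)]
print(rare_ids)
```

Changing `max_af` changes which variants survive, which is one reason two pipelines can report different variants from the same sample.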
(3) Variant classification
- Classification based on ACMG guidelines (Pathogenic, Likely pathogenic, VUS, etc.)
(4) Prioritization
- AI-based analysis
- Matching with the patient’s phenotype
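Phenotype matching can be pictured as scoring how well the phenotypes associated with each candidate gene overlap with the patient's. The sketch below uses a plain Jaccard overlap as a stand-in; it is not the AI-based method mentioned above, and the gene names and phenotype terms are placeholders rather than real gene-disease associations.

```python
def jaccard(a, b):
    """Overlap between two term sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(patient_terms, gene_terms):
    """Sort candidate genes by phenotype overlap with the patient, best first."""
    scored = [(gene, jaccard(patient_terms, terms)) for gene, terms in gene_terms.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder data: neither the genes nor the associations are real.
patient = {"seizures", "hypotonia", "developmental delay"}
gene_terms = {
    "GENE_A": {"seizures", "hypotonia"},
    "GENE_B": {"hearing loss"},
    "GENE_C": {"seizures", "developmental delay", "hypotonia"},
}

ranking = rank_candidates(patient, gene_terms)
for gene, score in ranking:
    print(gene, round(score, 2))
```

A variant in the top-ranked gene is not automatically causal; the score only helps decide which candidates a reviewer looks at first.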
Ultimately, one key question
After all these steps, everything comes down to a single question:
“Does this variant explain the patient’s phenotype?”
No matter how advanced the technology becomes, this question does not change.
Questions we often receive in clinical practice
In clinical settings, we often hear questions like:
- “This variant was reported by another lab—why is it not in the 3billion report?”
- “Why is this variant classified as VUS here, but pathogenic elsewhere?”
These are natural questions, but they often overlook one important fact:
NGS results are not simply ‘discovered data’—they are the result of a series of analytical processes and interpretation criteria.
In reality, results can vary depending on:
- Data quality
- Variant calling algorithms used
- Filtering criteria
- Interpretation strategy
Even with the same sample, differences in these steps can lead to different reported variants and different interpretations.
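One way to see this is to compare the variant lists two pipelines produce for the same sample. The sketch below treats each variant as a (chromosome, position, ref, alt) tuple and reports which calls are shared and which are unique to each side; the coordinates are invented and the function name is hypothetical.

```python
def compare_call_sets(lab_a, lab_b):
    """Split two variant lists into shared calls and calls unique to each lab."""
    a, b = set(lab_a), set(lab_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Toy call sets for the same sample (coordinates are made up).
lab_a = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")]
lab_b = [("chr1", 1000, "A", "G"), ("chr3", 3000, "G", "A")]

result = compare_call_sets(lab_a, lab_b)
print(result["shared"])
print(result["only_a"])
print(result["only_b"])
```

A call that appears in only one list may reflect a difference in calling, filtering, or reporting criteria rather than an error by either lab.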
What matters is not simply that “NGS was performed,” but how the data were analyzed and interpreted.
NGS results are not just test outputs—they are the outcome of a complex process of identifying meaningful signals from vast amounts of data.
Understanding this process is the starting point for accurate interpretation.
3billion’s WES/WGS tests run through every analysis stage described above using our in-house pipeline. If you have questions about our testing process or how we interpret variants, feel free to reach out.

Sohyun Lee
Clinical Genomics Scientist & Clinical Customer Support — guiding test selection, supporting variant and result interpretation, handling case inquiries, and translating field insights into service improvements.





