HTS Bioinf - ELLA MT module
Scope
This document describes the annotation of mitochondrial (MT) variants and their display in ELLA.
Summary of MT variants processing
This pipeline processes mitochondrial DNA (mtDNA) variant call format (VCF) files to generate a high-quality, filtered, and annotated data set for downstream analysis.
Processing steps
MT variant calling
See summary and details in the basepipe specification section for MT variants.
Quality control
MT variants undergo multiple filtering steps to remove low-confidence calls and artifacts. Applied filters include:
- Blacklist masking to remove known problematic sites.
- Nuclear mitochondrial DNA (NuMT) filtering to exclude nuclear sequences resembling mtDNA.
- Low heteroplasmy filtering to eliminate uncertain low-frequency variants.
In addition, QC metrics are used to assess sequencing depth, contamination, and data reliability.
Detailed descriptions and cutoffs are given in Table 1 below.
Table 1: Overview of MT variant filtering based on QC
Filtering step | Cutoff | Description |
---|---|---|
max alt allele count | 4 | Maximum alt alleles per site |
min allele fraction | 0 | Minimum allele fraction required |
contamination estimate | 0 | Estimates contamination level from non-target DNA |
blacklist masking | blacklisted_sites | Masks regions identified in a blacklist |
autosomal coverage | set to median coverage per sample | Filters variants based on autosomal coverage for contamination |
max low heteroplasmy | 3 | Limits the number of low heteroplasmy sites |
Annotation and haplogroup classification
Variants are functionally annotated using external databases to predict their potential impact. Haplogroup classification is performed to determine maternal lineage. Structural and regulatory annotations are incorporated to aid variant interpretation. Based on these annotations, the further filtering steps shown in Table 2 are applied:
Table 2: Overview of MT variant filtering based on annotations
Filtering step | Cutoff | Description |
---|---|---|
Sample-based filtering | remove maternal only | Only report variants present in proband |
Functional consequence | remove synonymous | Variants are removed when synonymous is their most severe consequence (as determined by VEP) |
Auxiliary variants | haplogroup markers | Used for assigning to haplogroups |
Frequency in HelixMTdb | >0.001 | Frequency filtering, both het and hom |
Frequency in gnomAD MT db | >0.001 | Frequency filtering, both het and hom |
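The annotation-based filters in Table 2 can be illustrated with a minimal Python sketch. It is not the pipeline's implementation: the `MtVariant` record, its field names, and `passes_annotation_filters` are hypothetical. The sketch further assumes that a variant is removed when any of its heteroplasmic or homoplasmic frequencies exceeds the cutoff, and that the frequencies have already been derived from the database counts. Handling of haplogroup markers is not modelled.

```python
from dataclasses import dataclass

FREQ_CUTOFF = 0.001  # frequency cutoff listed in Table 2


@dataclass
class MtVariant:
    # Hypothetical record; field names are illustrative only.
    pos: int
    in_proband: bool
    most_severe_consequence: str  # VEP consequence term
    helix_af_het: float
    helix_af_hom: float
    gnomad_af_het: float
    gnomad_af_hom: float


def passes_annotation_filters(v: MtVariant) -> bool:
    """Return True if the variant survives the Table 2 style filters."""
    # Sample-based filtering: drop variants seen only in the mother.
    if not v.in_proband:
        return False
    # Functional consequence: drop variants whose most severe
    # consequence is synonymous.
    if v.most_severe_consequence == "synonymous_variant":
        return False
    # Frequency filtering (assumption: any het or hom frequency above
    # the cutoff removes the variant).
    freqs = (v.helix_af_het, v.helix_af_hom, v.gnomad_af_het, v.gnomad_af_hom)
    if any(f > FREQ_CUTOFF for f in freqs):
        return False
    return True
```

Keeping each filter as a separate early return mirrors the table row by row, which makes it easy to compare the sketch against the documented cutoffs.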
Final output and reporting
A high-confidence variant data set is produced after all filtering and annotation steps. A tabular report is generated summarizing key variant information, including annotations, QC results, and haplogroup classification (see also Table 3).
If the haplogroups of the mother and proband differ, a warning will be displayed.
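The haplogroup check could be sketched as below. The function name `haplogroup_warning`, the warning text, and the use of exact string comparison are assumptions made for illustration; the actual pipeline may compare haplogroups at a different level of resolution.

```python
from typing import Optional


def haplogroup_warning(proband: str, mother: Optional[str]) -> Optional[str]:
    """Return a warning message if the two haplogroups differ, else None."""
    # Without a maternal sample there is nothing to compare.
    if mother is None:
        return None
    # Assumption: haplogroups are compared as exact labels after trimming.
    if proband.strip() != mother.strip():
        return (
            f"WARNING: proband haplogroup ({proband.strip()}) differs from "
            f"mother haplogroup ({mother.strip()})"
        )
    return None
```

Since mtDNA is maternally inherited, a mismatch is unexpected and may point to a sample mix-up or a low-confidence haplogroup call, which is why it is worth surfacing in the report.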
Summary of MT variants presentation in ELLA User Interface (UI)
The final report is displayed on ELLA's INFO page for the respective analysis. See the example screenshot in Figure 1 and the annotation explanations in Table 3 below.
Figure 1: MT analysis supplement in INFO page in ELLA
Table 3: mtDNA analysis report annotations
Annotation | Explanation | Source |
---|---|---|
POS | Genomic position in the mitochondrial genome | VCF file (Mutect2 caller) |
REF | Reference allele at that position | VCF file (Mutect2 caller) |
ALT | Alternative allele observed | VCF file (Mutect2 caller) |
FILTER | Quality control filter status (PASS indicates the variant passed all filters) | VCF file (Mutect2 caller) |
MITOMAPCPM__Associated_diseases | Disease associations from MITOMAP Clinical Presentations & Mutations database | MITOMAP |
CLINVAR_CLINSIG | Clinical significance as reported in ClinVar | ClinVar |
VEP_Consequence | Predicted functional consequence by Variant Effect Predictor (VEP) | VEP |
VEP_SYMBOL | Symbol of the gene affected by the variant | VEP |
VEP_HGVSc | Coding sequence change in HGVS notation | VEP |
VEP_HGVSp | Protein sequence change in HGVS notation | VEP |
VEP_HGVSg | Genomic sequence change in HGVS notation | VEP |
AD_Proband | Allele depth (reference, alternate) in proband | VCF file (Mutect2 caller) |
AF_Proband | Variant allele fraction (proportion of the sample's reads that support the variant allele) in proband | VCF file (Mutect2 caller) |
DP_Proband | Total read depth in proband | VCF file (Mutect2 caller) |
AD_Mother | Allele depth in mother | VCF file (Mutect2 caller) |
AF_Mother | Variant allele fraction in mother | VCF file (Mutect2 caller) |
DP_Mother | Total read depth in mother | VCF file (Mutect2 caller) |
HELIXMTDB__Het | Number of heteroplasmic individuals in HelixMTdb | HelixMTdb |
HELIXMTDB__Hom | Number of homoplasmic individuals in HelixMTdb | HelixMTdb |
GNOMAD_MT__AC_Het | Heteroplasmic allele count in gnomAD | gnomAD |
GNOMAD_MT__AC_Hom | Homoplasmic allele count in gnomAD | gnomAD |
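To illustrate how the AD, AF, and DP columns relate, the following minimal Python sketch derives a read-based allele fraction from an AD value. The function name `allele_fraction_from_ad` is hypothetical, and the sketch assumes a biallelic site with AD given as a `ref,alt` string. The AF value reported by the caller is a model-based estimate and may differ slightly from this simple read ratio.

```python
def allele_fraction_from_ad(ad: str) -> float:
    """Return alt / (ref + alt) for a biallelic AD string such as "950,50"."""
    # Assumption: exactly two comma-separated integer depths (ref, alt).
    ref_depth, alt_depth = (int(x) for x in ad.split(","))
    total = ref_depth + alt_depth
    if total == 0:
        raise ValueError("AD contains no reads")
    return alt_depth / total
```

A large gap between this ratio and the reported AF for a variant can be a useful prompt to inspect the underlying reads.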