HTS Bioinf - ELLA MT module

Scope

This document describes the annotation of mitochondrial (MT) variants and their display in ELLA.

Summary of MT variants processing

This pipeline processes mitochondrial DNA (mtDNA) variant call format (VCF) files to generate a high-quality, filtered, and annotated data set for downstream analysis.

Processing steps

MT variant calling

See the MT variants section of the basepipe specification for a summary and details.

Quality control

MT variants undergo multiple filtering steps to remove low-confidence calls and artifacts. Applied filters include:

  • Blacklist masking to remove known problematic sites.
  • Nuclear mitochondrial DNA (NuMT) filtering to exclude nuclear sequences resembling mtDNA.
  • Low heteroplasmy filtering to eliminate uncertain low-frequency variants.
  • Additional QC metrics to assess sequencing depth, contamination, and data reliability.

See Table 1 below for more detailed descriptions and cutoffs.

Table 1: Overview of MT variant filtering based on QC

| Filtering step | Cutoff | Description |
|---|---|---|
| max alt allele count | 4 | Maximum alt alleles per site |
| min allele fraction | 0 | Minimum allele fraction required |
| contamination estimate | 0 | Estimates contamination level from non-target DNA |
| blacklist masking | blacklisted_sites | Masks regions identified in a blacklist |
| autosomal coverage | set to median coverage per sample | Filters variants based on autosomal coverage for contamination |
| max low heteroplasmy | 3 | Limits the number of low heteroplasmy sites |
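
As an illustration of how site-level cutoffs like those in Table 1 can be applied, the following is a minimal sketch and not the pipeline's actual implementation. The record layout, the key names, the function `qc_failures`, and the blacklist positions are all hypothetical; only the cutoff values are taken from Table 1.

```python
# Cutoff values copied from Table 1; the key names are hypothetical.
QC_CUTOFFS = {
    "max_alt_allele_count": 4,
    "min_allele_fraction": 0.0,
}


def qc_failures(site, blacklist, cutoffs=QC_CUTOFFS):
    """Return the names of the checks that a toy site record fails."""
    failures = []
    # Blacklist masking: the site lies at a known problematic position.
    if site["pos"] in blacklist:
        failures.append("blacklist")
    # More alt alleles at one site than the allowed maximum.
    if len(site["alts"]) > cutoffs["max_alt_allele_count"]:
        failures.append("max_alt_allele_count")
    # Allele fraction below the required minimum.
    if site["af"] < cutoffs["min_allele_fraction"]:
        failures.append("min_allele_fraction")
    return failures


# Made-up positions, for illustration only.
blacklist = {301, 302, 310}

print(qc_failures({"pos": 301, "alts": ["C"], "af": 0.9}, blacklist))
print(qc_failures({"pos": 1000, "alts": ["A"], "af": 0.4}, blacklist))
```

With a minimum allele fraction of 0, the allele fraction check never fails for valid input; it is included only to show where a non-zero cutoff would take effect.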

Annotation and haplogroup classification

Variants are functionally annotated using external databases to predict their potential impact. Haplogroup classification is performed to determine maternal lineage. Structural and regulatory annotations are incorporated to enhance variant interpretation. Based on these annotations, the further filtering steps shown in Table 2 are applied.

Table 2: Overview of MT variant filtering based on annotations

| Filtering step | Cutoff | Description |
|---|---|---|
| Sample-based filtering | remove maternal only | Only report variants present in proband |
| Functional consequence | remove synonymous | Variants are removed when synonymous is the most severe consequence (as determined by VEP) |
| Auxiliary variants | haplogroup markers | Used for assigning to haplogroups |
| Frequency in HelixMTdb | >0.001 | Frequency filtering, both het and hom |
| Frequency in gnomAD MT db | >0.001 | Frequency filtering, both het and hom |
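
The annotation-based steps in Table 2 can be read as a sequence of per-variant checks. The sketch below shows one possible reading and is not the pipeline's actual code. The field names (`in_proband`, `most_severe_consequence`, and the four frequency keys) are hypothetical, and it assumes that variants with a frequency above the cutoff are the ones removed. The handling of haplogroup markers is not modeled.

```python
MAX_POPULATION_FREQUENCY = 0.001  # cutoff from Table 2

# Hypothetical keys for heteroplasmic/homoplasmic frequencies in each database.
FREQUENCY_KEYS = (
    "helixmtdb_het_freq",
    "helixmtdb_hom_freq",
    "gnomad_mt_het_freq",
    "gnomad_mt_hom_freq",
)


def keep_variant(variant):
    """Return True if a toy variant record passes the Table 2 checks."""
    # Sample-based filtering: drop variants seen only in the mother.
    if not variant["in_proband"]:
        return False
    # Functional consequence: drop variants whose most severe
    # VEP consequence is synonymous.
    if variant["most_severe_consequence"] == "synonymous_variant":
        return False
    # Frequency filtering: drop variants that are common in either
    # database, for both heteroplasmic and homoplasmic frequencies.
    # A missing frequency is treated as 0 (an assumption).
    for key in FREQUENCY_KEYS:
        if variant.get(key, 0.0) > MAX_POPULATION_FREQUENCY:
            return False
    return True


rare_missense = {
    "in_proband": True,
    "most_severe_consequence": "missense_variant",
    "helixmtdb_hom_freq": 0.0001,
}
print(keep_variant(rare_missense))
```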

Final output and reporting

A high-confidence variant data set is produced after all filtering and annotation steps. A tabular report is generated summarizing key variant information, including annotations, QC results, and haplogroup classification (see also Table 3).

If the haplogroups of the mother and proband differ, a warning will be displayed.
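
A minimal sketch of such a check is shown below. The function name and the warning text are hypothetical, and comparing the haplogroup labels by exact string equality is an assumption; the pipeline may compare them at a coarser level. The haplogroup labels in the usage lines are illustrative only.

```python
def haplogroup_warning(proband_haplogroup, mother_haplogroup):
    """Return a warning message if the two haplogroups differ, else None."""
    if proband_haplogroup != mother_haplogroup:
        return (
            "WARNING: haplogroup mismatch "
            f"(proband: {proband_haplogroup}, mother: {mother_haplogroup})"
        )
    return None


print(haplogroup_warning("H1a", "H1a"))
print(haplogroup_warning("H1a", "U5b"))
```

Since mtDNA is maternally inherited, a mismatch between mother and proband can point to a problem such as a sample mix-up, which is why it is worth surfacing in the report.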

Summary of MT variants presentation in ELLA User Interface (UI)

The final report is displayed on ELLA's INFO page for the respective analysis. See Figure 1 for an example screenshot and Table 3 for explanations of the annotations.

Figure 1: MT analysis supplement in INFO page in ELLA

Table 3: mtDNA analysis report annotations

| Annotation | Explanation | Source |
|---|---|---|
| POS | Genomic position in the mitochondrial genome | VCF file (Mutect2 caller) |
| REF | Reference allele at that position | VCF file (Mutect2 caller) |
| ALT | Alternative allele observed | VCF file (Mutect2 caller) |
| FILTER | Quality control filter status (PASS indicates the variant passed all filters) | VCF file (Mutect2 caller) |
| MITOMAPCPM__Associated_diseases | Disease associations from MITOMAP Clinical Presentations & Mutations database | MITOMAP |
| CLINVAR_CLINSIG | Clinical significance as reported in ClinVar | ClinVar |
| VEP_Consequence | Predicted functional consequence by Variant Effect Predictor (VEP) | VEP |
| VEP_SYMBOL | Symbol of the gene affected by the variant | VEP |
| VEP_HGVSc | Coding sequence change in HGVS notation | VEP |
| VEP_HGVSp | Protein sequence change in HGVS notation | VEP |
| VEP_HGVSg | Genomic sequence change in HGVS notation | VEP |
| AD_Proband | Allele depth (reference, alternate) in proband | VCF file (Mutect2 caller) |
| AF_Proband | Variant allele fraction (proportion of the sample's reads that support the variant allele) in proband | VCF file (Mutect2 caller) |
| DP_Proband | Total read depth in proband | VCF file (Mutect2 caller) |
| AD_Mother | Allele depth in mother | VCF file (Mutect2 caller) |
| AF_Mother | Variant allele fraction in mother | VCF file (Mutect2 caller) |
| DP_Mother | Total read depth in mother | VCF file (Mutect2 caller) |
| HELIXMTDB__Het | Number of heteroplasmic individuals in HelixMTdb | HelixMTdb |
| HELIXMTDB__Hom | Number of homoplasmic individuals in HelixMTdb | HelixMTdb |
| GNOMAD_MT__AC_Het | Heteroplasmic allele count in gnomAD | gnomAD |
| GNOMAD_MT__AC_Hom | Homoplasmic allele count in gnomAD | gnomAD |
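
To make the relationship between the allele depth and allele fraction columns concrete, the sketch below computes a raw allele fraction from an (reference, alternate) depth pair. The function name is hypothetical, and the AF value reported by the caller may differ slightly from this raw ratio, so treat it as an approximation for reading the report rather than a reimplementation.

```python
def raw_allele_fraction(allele_depth):
    """Fraction of allele-supporting reads that carry the alternate allele.

    allele_depth is a (reference_depth, alternate_depth) pair.
    Returns None when no reads support either allele.
    """
    ref_depth, alt_depth = allele_depth
    total = ref_depth + alt_depth
    if total == 0:
        return None
    return alt_depth / total


# 150 reference reads and 50 alternate reads give a fraction of 0.25.
print(raw_allele_fraction((150, 50)))
```

A fraction near 1 is consistent with a homoplasmic variant, while an intermediate value suggests heteroplasmy.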