
Guidelines for Internal Users: VardeDB

This document provides a detailed guide for internal users managing VardeDB. It includes instructions for importing and exporting variant classifications, generating divergent classification reports, and deploying the Apptainer container on TSD. It also outlines the technical setup and configuration.

Importing and Publishing with VardeDB

Prerequisites

  1. ClinVar release:

    Download the latest ClinVar release from the ELLA anno annotation pipeline. The VCF and tabix files can be accessed via DigitalOcean. Save them in:

    /ess/p22/data/durable/production/data/varde/clinvar/
    
  2. VCF submission:

    Ensure all participating organisations have submitted their encrypted and signed VCF files to TSD.
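The first prerequisite can be checked before starting an iteration. The following is a minimal sketch, not part of the Varde tooling: the function name check_clinvar_release is hypothetical, and it assumes the release is a bgzipped VCF (*.vcf.gz) with its tabix index stored next to it as *.vcf.gz.tbi.

```shell
# Sketch: verify that a ClinVar VCF and its tabix index are both present.
# Assumption: files are named <release>.vcf.gz and <release>.vcf.gz.tbi.
check_clinvar_release() {
    local dir="$1"
    local vcf
    local status=1
    for vcf in "$dir"/*.vcf.gz; do
        # Skip the unexpanded glob pattern when no VCF exists.
        [ -e "$vcf" ] || continue
        if [ -e "$vcf.tbi" ]; then
            echo "ok: $vcf"
            status=0
        else
            echo "missing index: $vcf.tbi" >&2
        fi
    done
    # Returns 0 only if at least one VCF with an index was found.
    return "$status"
}

# Usage (directory taken from the prerequisites above):
# check_clinvar_release /ess/p22/data/durable/production/data/varde/clinvar
```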

Running a Varde Iteration

Once all prerequisites are met, run a full Varde iteration using the task utility located at:

/ess/p22/data/durable/production/sw/datasharing/varde/

by executing the following command as the service-user:

task prod-run

This command imports all submitted VCFs into VardeDB, exports the most recent variant interpretations of every participating organisation, and generates the divergent classification report. Other utility commands are also available (e.g. psql, which opens a psql session connected to VardeDB). Run task without arguments to list all available commands.
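Because the iteration must run as the service-user, it can help to guard the call. The sketch below is an illustration only: require_user, run_iteration and the VARDE_SERVICE_USER variable are hypothetical names, and it assumes that task must be invoked from the directory given above.

```shell
# Sketch: run a Varde iteration only when invoked as the expected user.
VARDE_DIR=/ess/p22/data/durable/production/sw/datasharing/varde

require_user() {
    # Succeeds only if the current user name matches the expected one.
    local expected="$1"
    local actual
    actual="$(id -un)"
    if [ "$actual" != "$expected" ]; then
        echo "run this as $expected (current user: $actual)" >&2
        return 1
    fi
}

run_iteration() {
    require_user "$1" || return 1
    # A subshell keeps the caller's working directory unchanged.
    ( cd "$VARDE_DIR" && task prod-run )
}

# Usage (VARDE_SERVICE_USER is a placeholder for the real account name):
# run_iteration "$VARDE_SERVICE_USER"
```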

Publishing

After generating the export VCF and divergent classification report:

  1. Access the TSD publication interface.
  2. Upload the signed and encrypted VCF files and divergent classification reports.
  3. Use the Publish button to make the files available to selected external users via the TSD publication portal.

Technical Setup

Software Deployment

VardeDB is deployed on the TSD VM p22-app-01 as an Apptainer container (vardedb). The container image is stored at:

/ess/p22/data/durable/production/sw/datasharing/varde/vardedb.sif

Database

VardeDB uses PostgreSQL with a data model defined using SQLAlchemy. The database is hosted on the TSD VM p22-dbpg-prod02.

Directories

Varde relies on the following directories for persistent file storage and configuration:

  • Initial deposits:

/ess/p22/data/durable/file-import/p22-member-group/production/varde/

bind-mounted to /app/deposit/.

  • Temporary storage and archival:

/ess/p22/data/durable/production/data/varde/

bind-mounted to /app/data/. This data directory contains the following subdirectories:

/data/
      clinvar/    # ClinVar releases
      archive/    # Archived submitted VCFs
            org1/
            org2/
            ...
      publish/
            vcf/  # Exported VCFs
                  org1/
                  org2/
                  ...
            report/     # Divergent classification reports
                  org1/
                  org2/
                  ...

where each organisation's submitted VCFs and exported VCFs and reports are stored in their respective directories.
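The per-organisation layout above can be reproduced with a few mkdir calls. This is a sketch under the assumption that the directories may be created by hand; VardeDB may well create them itself, and init_org_dirs is a hypothetical helper name.

```shell
# Sketch: create the per-organisation subdirectories of the data directory.
init_org_dirs() {
    local data_root="$1"   # e.g. /ess/p22/data/durable/production/data/varde
    local org="$2"         # organisation abbreviation, e.g. org1
    mkdir -p \
        "$data_root/archive/$org" \
        "$data_root/publish/vcf/$org" \
        "$data_root/publish/report/$org"
}

# Usage:
# init_org_dirs /ess/p22/data/durable/production/data/varde org1
```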

  • Configuration

/ess/p22/data/durable/production/sw/datasharing/varde/config/

bind-mounted to /app/config/.

  • GPG home directory

/ess/p22/data/durable/production/sw/datasharing/varde/.gnupg/

bind-mounted to /app/.gnupg/.
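The four bind mounts listed above map onto Apptainer's --bind option, which accepts a comma-separated list of src:dest pairs. The sketch below only assembles that argument; bind_spec is a hypothetical helper, and the actual invocation used in production is presumably defined by the task utility rather than typed by hand.

```shell
# Sketch: assemble the --bind argument for the four mounts described above.
SW=/ess/p22/data/durable/production/sw/datasharing/varde

bind_spec() {
    # One src:dest pair per mount, joined with commas.
    printf '%s,%s,%s,%s\n' \
        "/ess/p22/data/durable/file-import/p22-member-group/production/varde:/app/deposit" \
        "/ess/p22/data/durable/production/data/varde:/app/data" \
        "$SW/config:/app/config" \
        "$SW/.gnupg:/app/.gnupg"
}

# Usage (the command run inside the container is illustrative only):
# apptainer exec --bind "$(bind_spec)" "$SW/vardedb.sif" ls /app/config
```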

GPG Encryption

VardeDB uses GnuPG to encrypt and sign VCF files. To call gpg commands directly, set the --homedir option to /app/.gnupg/ so that the appropriate keyring is used. This also requires being logged in as the service-user.
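A small wrapper avoids forgetting the --homedir option. This is a sketch, not part of VardeDB: varde_gpg and the GPG_BIN override are hypothetical names, and the wrapper is meant to be used inside the container as the service-user.

```shell
# Sketch: always call gpg with the Varde keyring.
varde_gpg() {
    # GPG_BIN allows substituting another binary (e.g. for a dry run).
    "${GPG_BIN:-gpg}" --homedir /app/.gnupg/ "$@"
}

# Usage:
# varde_gpg --list-keys          # show the public keys in the Varde keyring
# varde_gpg --list-secret-keys   # show VardeDB's own private key
```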

Configuration

Configuration files are located at:

/ess/p22/data/durable/production/sw/datasharing/varde/config/

Key files include:

  • .env.prod: Defines environment variables, including the PostgreSQL database and key directories (e.g., .gnupg and data directories)
  • .keypass: Houses the passphrase for VardeDB's private key used for decryption
  • .pgpass: Supports the automated retrieval of the PostgreSQL database password using the pgpasslib library
  • .psql_history: Stores the history of PostgreSQL commands
  • vardedb.toml: Contains the configuration settings for VardeDB
  • keys/: Stores public keys of each organisation used for signature verification
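A PostgreSQL password file consists of lines of the form hostname:port:database:username:password, and libpq ignores the file unless its permissions are 0600 or stricter. The sketch below checks the field count without printing any secrets; check_pgpass is a hypothetical helper, and it does not handle escaped colons inside fields.

```shell
# Sketch: sanity-check a .pgpass file without echoing its contents.
check_pgpass() {
    local file="$1"
    # Comment and blank lines are skipped; every other line needs 5 fields.
    awk -F: '
        BEGIN { bad = 0 }
        /^#/ || /^[[:space:]]*$/ { next }
        NF != 5 { printf "line %d: expected 5 fields, got %d\n", NR, NF; bad = 1 }
        END { exit bad }
    ' "$file"
}

# Usage:
# check_pgpass /ess/p22/data/durable/production/sw/datasharing/varde/config/.pgpass
# chmod 600 /ess/p22/data/durable/production/sw/datasharing/varde/config/.pgpass
```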

TODO: Document the configuration of rulesets for reclassifiable variants.

Adding a New Organisation

  1. Add the organisation to VardeDB:

varde-cli add-org <org> <name>

  • <org>: Organisation abbreviation
  • <name>: Full organisation name

  2. Import the public key into the Varde keyring:

varde-cli add-key <org>.asc
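The two steps can be combined so that the key is only imported if the organisation was added successfully. This is a sketch: add_organisation and the VARDE_CLI override are hypothetical, and only the add-org and add-key subcommands shown above are used.

```shell
# Sketch: register an organisation and import its public key in one step.
add_organisation() {
    local org="$1"    # organisation abbreviation
    local name="$2"   # full organisation name
    local key="$3"    # path to the ASCII-armoured public key, e.g. org1.asc
    if [ ! -r "$key" ]; then
        echo "key file not readable: $key" >&2
        return 1
    fi
    # VARDE_CLI allows substituting another binary (e.g. echo for a dry run).
    "${VARDE_CLI:-varde-cli}" add-org "$org" "$name" &&
        "${VARDE_CLI:-varde-cli}" add-key "$key"
}

# Usage:
# add_organisation org1 "Organisation One" org1.asc
```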

Container Deployment

Follow these steps to set up and deploy the container:

  1. Download the Docker Image:

    Pull the latest VardeDB image from the container registry:

    docker pull registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest
    docker save registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest > vardedb.tar
    
  2. Transfer and Build the Image:

    Import the tarball to TSD and build the Apptainer image:

    singularity build vardedb.sif docker-archive:vardedb.tar
    
  3. Launch the Container:

    Once built, use the setup described in the Importing and Publishing with VardeDB section to operate the container.
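Since the tarball crosses the TSD import boundary between steps 1 and 2, a checksum comparison before building is a cheap safeguard. The following is a sketch under stated assumptions: verify_and_build and the BUILD_TOOL override are hypothetical, sha256sum is assumed to be available on both sides, and the checksum file is assumed to be transferred alongside the tarball.

```shell
# On the machine with Docker, next to vardedb.tar:
#   sha256sum vardedb.tar > vardedb.tar.sha256

# Sketch: on TSD, verify the imported tarball and then build the image.
# Run from the directory that holds both the tarball and its .sha256 file.
verify_and_build() {
    local tarball="$1"
    # Abort if the checksum recorded before the transfer does not match.
    sha256sum -c "$tarball.sha256" || return 1
    # BUILD_TOOL allows substituting another binary (e.g. echo for a dry run).
    "${BUILD_TOOL:-singularity}" build "${tarball%.tar}.sif" "docker-archive:$tarball"
}

# Usage:
# verify_and_build vardedb.tar
```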