Guideline for internal users

This document is a guide to VardeDB for internal users. It describes how to import and export variant classifications and how to generate a report of divergent classifications. It also describes the technical setup and explains how to deploy the Apptainer/Singularity container on TSD.

Note that a Dagu job has been set up on TSD to automate these processes (import, export, and report). The job is triggered once all participating organisations have deposited their VCFs on TSD.

Import to VardeDB and publish

Once all organisations have deposited their VCFs, the variant classifications are imported into the database. The most recent variant interpretations are then exported from the database, and a report of divergent variant classifications is generated. On TSD, Task has been set up under /ess/p22/data/durable/production/sw/datasharing/varde/ with three key commands for these processes: prod-import imports the submitted VCF files, prod-export exports all variant classifications in VardeDB, and prod-report generates the divergent classification report. Other utility commands are also available (e.g. psql for running psql and connecting to VardeDB). Running task without arguments lists all available commands.
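As a rough sketch of a manual run, the three commands can be chained so that a failure stops the sequence. The function name run_varde_tasks and the TASK_BIN variable are hypothetical helpers introduced here for illustration; only the task names come from this guide.

```shell
# Hypothetical wrapper; only the task names are taken from this guide.
# TASK_BIN can be overridden (e.g. TASK_BIN=echo) to print the steps
# instead of running them.
run_varde_tasks() {
    for step in prod-import prod-export prod-report; do
        "${TASK_BIN:-task}" "$step" || {
            echo "step failed: $step" >&2
            return 1
        }
    done
}
```

When called from /ess/p22/data/durable/production/sw/datasharing/varde/, the function stops at the first failing step, so the export and report are not generated after a failed import.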

Once the VCF files have been successfully exported and the divergent classification report has been generated, access the following link within the TSD project. Then, select and upload the signed and encrypted VCF files, along with the divergent classification reports. Click the Publish symbol to make the files available for download by selected external users via the TSD publication portal.

Technical setup

Software

VardeDB is deployed on the TSD VM p22-app-01 as the Apptainer/Singularity container vardedb. The Apptainer/Singularity image can be found under /ess/p22/data/durable/production/sw/datasharing/varde/ as vardedb.sif.

Database

VardeDB is a relational PostgreSQL database with the data model defined using SQLAlchemy. The database itself runs on the TSD database VM p22-dbpg-prod02.

File storage

Deposited VCF files are initially stored in /ess/p22/data/durable/file-import/p22-member-group/production/varde/. Before they are decrypted and imported into VardeDB, the files are moved to /ess/p22/data/durable/production/data/varde/. These directories are bind mounted in the vardedb container at /app/deposit/ and /app/data/, respectively.

Upon successful import, the VCF files are archived in /ess/p22/data/durable/production/data/varde/archive/.
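The file flow described above can be illustrated with a small helper. This is only a sketch of the two moves (deposit to data, data to archive). The import command presumably performs these steps itself, and the function name move_files is hypothetical.

```shell
# Hypothetical helper: move every regular file from one directory to another.
# Subdirectories (such as archive/) are left in place.
move_files() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue   # skips directories and an unmatched glob
        mv -- "$f" "$dst"/ || return 1
    done
}

# Host-side paths taken from this guide.
DEPOSIT_DIR=/ess/p22/data/durable/file-import/p22-member-group/production/varde
DATA_DIR=/ess/p22/data/durable/production/data/varde

# 1. Before import:              move_files "$DEPOSIT_DIR" "$DATA_DIR"
# 2. After a successful import:  move_files "$DATA_DIR" "$DATA_DIR/archive"
```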

GPG keys

VardeDB uses GnuPG for encryption and signature verification of deposited VCF files. The VardeDB private key and imported public keys are stored in /ess/p22/data/durable/production/sw/datasharing/varde/.gnupg/, which is bind mounted to /app/.gnupg/ in the container. Note that any direct gpg call must set the --homedir option to /app/.gnupg/ to ensure that the appropriate keyring is used.
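One way to avoid forgetting the option is a thin wrapper around gpg. The sketch below assumes it is run inside the container. The names varde_gpg and GPG_BIN are hypothetical and not part of VardeDB.

```shell
# Hypothetical wrapper that always passes the VardeDB keyring to gpg.
# GPG_BIN is overridable so the resulting command line can be inspected.
varde_gpg() {
    "${GPG_BIN:-gpg}" --homedir /app/.gnupg/ "$@"
}

# Examples (inside the container):
#   varde_gpg --list-keys          # public keys in the VardeDB keyring
#   varde_gpg --list-secret-keys   # secret keys in the VardeDB keyring
```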

Configuration

The configuration files for VardeDB are stored in /ess/p22/data/durable/production/sw/datasharing/varde/config/ which is bind mounted to /app/config/. The directory structure is as follows:

/config/
      .env.prod
      .keypass
      .pgpass
      .psql_history
      organisations.txt
      keys/
            <org1>.asc
            <org2>.asc
            ...

where

  • .env.prod: Defines environment variables, including the PostgreSQL database and key directories (e.g. the .gnupg and data directories)
  • .keypass: Houses the passphrase for VardeDB's private key used for decryption
  • .pgpass: Supports the automated retrieval of the PostgreSQL database password using the pgpasslib library
  • .psql_history: Stores the history of PostgreSQL commands
  • organisations.txt: A semicolon-separated file containing the abbreviation and full name of each organisation
  • keys/: Stores public keys of each organisation used for signature verification

To add a new organisation to VardeDB, append the organisation's abbreviation and full name to organisations.txt and add its public key to keys/. The public key must be named <org>.asc, where <org> is the organisation's abbreviation. The public key is then imported into the VardeDB keyring

gpg --homedir .gnupg/ --import config/keys/<org>.asc

before being signed by the VardeDB private key

gpg --homedir .gnupg/ -u vardedb --sign-key <org>
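After adding an organisation, it can be useful to confirm that organisations.txt and keys/ agree. The sketch below assumes one "<abbreviation>;<full name>" entry per line, with no header line. The exact column layout is an assumption, and the function name missing_org_keys is hypothetical.

```shell
# Hypothetical check: print every abbreviation listed in organisations.txt
# that has no matching public key in keys/.
# Assumes one "<abbreviation>;<full name>" entry per line.
missing_org_keys() {
    config_dir=$1
    while IFS=';' read -r abbr name || [ -n "$abbr" ]; do
        [ -n "$abbr" ] || continue   # skip blank lines
        [ -f "$config_dir/keys/$abbr.asc" ] || echo "$abbr"
    done < "$config_dir/organisations.txt"
}
```

Inside the container this would be called as missing_org_keys /app/config. Empty output means every listed organisation has a key file.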

Container setup

The following describes the setup and deployment of Varde on TSD as an Apptainer/Singularity container. Pull the latest Docker image from the VardeDB container registry, then save it as a tarball

docker pull registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest &&
      docker save registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest > vardedb.tar

The image tarball is then imported to TSD, and an Apptainer/Singularity image is built under /ess/p22/data/durable/production/sw/datasharing/varde/

singularity build vardedb.sif docker-archive:vardedb.tar

The container can then be launched and run as described previously.
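For reference, the bind mounts listed in this guide can be passed to the container with the --bind option of singularity exec. The sketch below is an assumption about how a manual invocation might look, not the deployed configuration. The Task commands presumably set up the mounts themselves, and the names varde_exec and CONTAINER_BIN are hypothetical.

```shell
# Host directories and their mount points in the container, as listed in this guide.
SW_DIR=/ess/p22/data/durable/production/sw/datasharing/varde
BINDS="/ess/p22/data/durable/file-import/p22-member-group/production/varde:/app/deposit"
BINDS="$BINDS,/ess/p22/data/durable/production/data/varde:/app/data"
BINDS="$BINDS,$SW_DIR/.gnupg:/app/.gnupg"
BINDS="$BINDS,$SW_DIR/config:/app/config"

# Hypothetical wrapper: run a command inside vardedb.sif with the bind mounts.
# CONTAINER_BIN is overridable (e.g. CONTAINER_BIN=echo) for a dry run.
varde_exec() {
    "${CONTAINER_BIN:-singularity}" exec --bind "$BINDS" "$SW_DIR/vardedb.sif" "$@"
}
```

For example, varde_exec gpg --homedir /app/.gnupg/ --list-keys would list the keys in the VardeDB keyring from the host.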