Guideline for internal users
This document is a guide to VardeDB for internal users. It describes how to import and export variant classifications and how to generate a report of divergent classifications. It also describes the technical setup and how to deploy the Apptainer/Singularity container on TSD.
Note that a Dagu job has been set up on TSD to automate these processes (import, export, and report). The job is triggered when all participating organisations have deposited their VCFs to TSD.
Import to VardeDB and publish
Once all organisations have deposited their VCFs to VardeDB, the variant classifications are imported into the database. Then, the most recent variant interpretations are exported from the database, and a report of divergent variant classifications is generated. On TSD, Task has been set up under /ess/p22/data/durable/production/sw/datasharing/varde/ with three key commands to perform these processes: prod-import for importing the submitted VCF files, prod-export for exporting all variant classifications in VardeDB, and prod-report for generating the divergent classification report. Other utility commands are also available (e.g. psql for running psql and connecting to VardeDB). Calling task without arguments lists all available commands.
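The three commands above are meant to run in sequence, so a small wrapper can make the order explicit. The following is a minimal sketch, not the Dagu job itself: the function name run_varde_cycle and the TASK_BIN override are hypothetical, and it assumes the commands are invoked as task <name> from the directory that holds the Task configuration.

```shell
# Hypothetical helper: run the three documented Task commands in order and
# stop at the first failure. TASK_BIN is an assumption that allows a dry run
# (TASK_BIN=echo prints the command names instead of executing them).
run_varde_cycle() {
    task_bin="${TASK_BIN:-task}"
    for step in prod-import prod-export prod-report; do
        "$task_bin" "$step" || return 1
    done
}
```

To use it, change to /ess/p22/data/durable/production/sw/datasharing/varde/ and call run_varde_cycle. Calling it as TASK_BIN=echo run_varde_cycle previews the sequence without touching the database.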
Once the VCF files have been successfully exported and the divergent classification report has been generated, access the following link within the TSD project. Then, select and upload the signed and encrypted VCF files, along with the divergent classification reports. Click the Publish symbol to make the files available for download by selected external users via the TSD publication portal.
Technical setup
Software
VardeDB is deployed on the TSD VM p22-app-01 as the Apptainer/Singularity container vardedb. The Apptainer/Singularity image can be found under /ess/p22/data/durable/production/sw/datasharing/varde/ as vardedb.sif.
Database
VardeDB is a relational PostgreSQL database with the data model defined using SQLAlchemy. The database itself runs on the TSD database VM p22-dbpg-prod02.
File storage
Deposited VCF files are initially stored in /ess/p22/data/durable/file-import/p22-member-group/production/varde/. Before they are decrypted and imported into VardeDB, the files are moved to /ess/p22/data/durable/production/data/varde/. These directories are bind mounted in the vardedb container at /app/deposit/ and /app/data/, respectively.
Upon successful import, the VCF files are archived in /ess/p22/data/durable/production/data/varde/archive/.
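The file flow described above (deposit directory, then data directory, then archive) can be illustrated with two small functions. This is only a sketch of the movement between directories: the function names are hypothetical, the real staging and archiving is presumably handled by the import command, and moving every regular file regardless of name is an assumption.

```shell
# Hypothetical sketch: move every regular file from the deposit directory
# into the data directory (first step of the flow described above).
stage_deposits() {
    deposit_dir="$1"
    data_dir="$2"
    for f in "$deposit_dir"/*; do
        [ -f "$f" ] || continue   # skip subdirectories and an unmatched glob
        mv "$f" "$data_dir"/
    done
}

# Hypothetical sketch: after a successful import, move the processed files
# from the data directory into its archive/ subdirectory.
archive_imported() {
    data_dir="$1"
    mkdir -p "$data_dir/archive"
    for f in "$data_dir"/*; do
        [ -f "$f" ] || continue   # leaves the archive/ directory itself alone
        mv "$f" "$data_dir/archive"/
    done
}
```

Inside the container the two directories would be /app/deposit/ and /app/data/, for example stage_deposits /app/deposit /app/data.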
GPG keys
VardeDB uses GnuPG for encryption and signature verification of deposited VCF files. The VardeDB private key and imported public keys are stored in /ess/p22/data/durable/production/sw/datasharing/varde/.gnupg/, which is bind mounted to /app/.gnupg/ in the container. Note that to call any gpg commands directly, the --homedir option must be set to /app/.gnupg/ to ensure that the appropriate keyring is used.
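To avoid forgetting the --homedir option, direct gpg calls can go through a thin wrapper. The sketch below assumes it is run inside the container; the function name varde_gpg and the GPG_BIN and VARDE_GNUPG_HOME overrides are hypothetical and exist only for illustration and dry runs.

```shell
# Hypothetical wrapper: always pass --homedir so that gpg uses the VardeDB
# keyring rather than the calling user's default keyring.
varde_gpg() {
    "${GPG_BIN:-gpg}" --homedir "${VARDE_GNUPG_HOME:-/app/.gnupg/}" "$@"
}
```

For example, varde_gpg --list-keys lists the public keys in the VardeDB keyring.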
Configuration
The configuration files for VardeDB are stored in /ess/p22/data/durable/production/sw/datasharing/varde/config/, which is bind mounted to /app/config/. The directory contains:
.env.prod: Defines environment variables, including the PostgreSQL database and key directories (e.g., .gnupg and data directories).
.keypass: Holds the passphrase for VardeDB's private key, used for decryption.
.pgpass: Supports automated retrieval of the PostgreSQL database password using the pgpasslib library.
.psql_history: Stores the history of PostgreSQL commands.
organisations.txt: A semicolon-separated file of the abbreviations and full names of each organisation.
keys/: Stores the public keys of each organisation, used for signature verification.
To add a new organisation to VardeDB, append the organisation's abbreviation and full name to organisations.txt and add its public key to keys/. The public key must be named <org>.asc, where <org> is the organisation's abbreviation. The public key is then imported into the VardeDB keyring and signed with the VardeDB private key.
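The registration steps can be sketched as a pair of shell functions. This is an illustration under stated assumptions, not the project's own tooling: the function names are hypothetical, the line format abbreviation;full name is inferred from the description of organisations.txt, and the key is imported with a plain gpg --import inside the container. Signing the imported key (for example with gpg --sign-key) is left as a manual step because the key identifier is not known here.

```shell
# Hypothetical sketch: add one organisation to the configuration directory.
# Arguments: config directory, abbreviation, full name, path to public key.
register_org() {
    config_dir="$1"
    org="$2"
    full_name="$3"
    key_file="$4"
    # Refuse to add the same abbreviation twice.
    if grep -q "^${org};" "$config_dir/organisations.txt" 2>/dev/null; then
        echo "organisation ${org} is already registered" >&2
        return 1
    fi
    # Assumed line format: <abbreviation>;<full name>
    printf '%s;%s\n' "$org" "$full_name" >> "$config_dir/organisations.txt"
    mkdir -p "$config_dir/keys"
    cp "$key_file" "$config_dir/keys/${org}.asc"
}

# Hypothetical sketch: import the organisation's public key into the VardeDB
# keyring (paths as seen from inside the container).
import_org_key() {
    gpg --homedir /app/.gnupg/ --import "/app/config/keys/$1.asc"
}
```

A call such as register_org /app/config ABC "Alpha Beta Clinic" ./abc_public.asc followed by import_org_key ABC would cover the first two steps; the organisation and file names in this call are made up.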
Container setup
The following describes the setup and deployment of VardeDB on TSD as an Apptainer/Singularity container. Pull the latest Docker image from the VardeDB container registry, then save it as a tarball:
docker pull registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest &&
docker save registry.gitlab.com/dpipe/datasharing/varde/vardedb:latest > vardedb.tar
The image tarball is then imported to TSD, and an Apptainer/Singularity image is built under /ess/p22/data/durable/production/sw/datasharing/varde/.
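Building the image from the tarball could look like the sketch below. The exact command used on TSD is not given in this document, so the use of apptainer build with a docker-archive source is an assumption, and the function name and the APPTAINER_BIN override are hypothetical.

```shell
# Hypothetical sketch: build a SIF image from a tarball made by docker save.
# Arguments (both optional): tarball path, output image path.
build_vardedb_sif() {
    tarball="${1:-vardedb.tar}"
    image="${2:-vardedb.sif}"
    "${APPTAINER_BIN:-apptainer}" build "$image" "docker-archive:$tarball"
}
```

Run from /ess/p22/data/durable/production/sw/datasharing/varde/ with the tarball in place, build_vardedb_sif would write vardedb.sif next to it. On systems that only provide Singularity, APPTAINER_BIN=singularity selects that binary instead.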
The container can then be launched and run as described previously.
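For reference, the four bind mounts listed in the technical setup can be passed to Apptainer as one comma-separated --bind specification. The sketch below is an assumption about how a manual launch might look: the function names are hypothetical, and this document does not state whether the production setup uses exec, run, or a named instance.

```shell
# Host directory that holds the image, keyring, and configuration.
VARDE_SW="/ess/p22/data/durable/production/sw/datasharing/varde"

# Print the comma-separated source:destination pairs documented above.
varde_bind_spec() {
    printf '%s,%s,%s,%s\n' \
        "/ess/p22/data/durable/file-import/p22-member-group/production/varde:/app/deposit" \
        "/ess/p22/data/durable/production/data/varde:/app/data" \
        "$VARDE_SW/.gnupg:/app/.gnupg" \
        "$VARDE_SW/config:/app/config"
}

# Hypothetical sketch: run a command inside the container with all four
# directories mounted.
varde_exec() {
    apptainer exec --bind "$(varde_bind_spec)" "$VARDE_SW/vardedb.sif" "$@"
}
```

For example, varde_exec ls /app/config should list the configuration files if the mounts are in place.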