HTS Bioinf - Update of gene panel builder reference data

Scope

This procedure explains how to update the reference data used by the gene panel builder service.

Responsibility

A bioinformatician from GDx is responsible for performing these tasks.

The gene panel builder service

An instance of the gene panel builder service database must be running on the computer where the update is performed.

Prior to update

Notify the lab doctors (ELL@ous-hf.no) that you intend to update the reference data.

Update

External reference data

ssh gpbuilder@gpb.allele.es
# make sure that the gene panels repository is up to date with the remote
git -C genepanel-store checkout '<latest-release-tag>'
# open a Bash shell in a gene panel builder Docker container
docker compose --project-directory config exec -it gpbuilder bash

Hereafter, we will assume that the user is inside the Docker container running the GPB backend.

Before performing the update, make sure that a valid OMIM API key is stored in the OMIM_API_KEY environment variable (check with echo $OMIM_API_KEY).
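As an alternative to echoing the key to the terminal, the check can be sketched as a small shell function that reports only whether the variable is set. The name check_omim_key is hypothetical and not part of gpb. The function does not verify that the key is valid, only that it is non-empty.

```shell
# Sketch: confirm OMIM_API_KEY is set without printing the key itself.
# check_omim_key is a hypothetical helper, not part of gpb.
check_omim_key() {
  if [ -z "${OMIM_API_KEY:-}" ]; then
    echo "OMIM_API_KEY is empty or unset" >&2
    return 1
  fi
  # Report only the length so the key does not end up in terminal scrollback.
  echo "OMIM_API_KEY is set (${#OMIM_API_KEY} characters)"
}

check_omim_key || echo "Set OMIM_API_KEY before running the update" >&2
```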

Update all external data with gpb update --diff all-external.

Note: To create TSV files showing the effects of an update without committing anything to the database (useful for testing), run gpb --dry-run update --diff all-external.

Create drafts

Once the update has completed successfully, gene names, inheritance, transcripts and PanelApp panels/gene content may have changed. To avoid releasing gene panels that share a version number but differ in configuration, run the command:

gpb diff --create-drafts --skip-draft-if-no-gene-changes /panels/cur

This command bumps the version of every panel whose content differs from that of its exported counterpart.

Take note of the panels (if any) listed with the warning "These panels remain at draft stage after version bumping and require expert review".
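One way to keep a record of that warning is to save the command output to a file while still seeing it on screen. The following is a minimal sketch using tee. The helper name run_and_log and the log file name are assumptions, and the exact output format of gpb is not assumed here.

```shell
# Sketch: run a command and keep a copy of everything it prints.
# run_and_log is a hypothetical helper, not part of gpb.
run_and_log() {
  log_file=$1
  shift
  # Merge stderr into stdout so warnings are captured as well.
  "$@" 2>&1 | tee "$log_file"
}

# Possible use inside the container (the log file name is an assumption):
# run_and_log "${LOGS_DIR}/gpb-create-drafts-$(date '+%Y-%m-%d').log" \
#   gpb diff --create-drafts --skip-draft-if-no-gene-changes /panels/cur
```

Note that the exit status of the pipeline is that of tee, not of the wrapped command, so a failure of the wrapped command has to be read from the log itself.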

Generate diff

Create a diff of all active database panels against panels in production:

gpb diff -o ${LOGS_DIR}/gpb-diff-$(date '+%Y-%m-%d').tsv /panels/cur

Notify ELL

Send the generated diffs (gpb-diff-YYYY-MM-DD.tsv and gpb-diff-YYYY-MM-DD.summary.tsv) to the lab doctors. Notify them which panels (if any) have no submitted version after the update. These panels must be manually inspected and resubmitted to be included in the next panel release.
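Before sending, it may help to confirm that both files for the given date exist and are non-empty. The sketch below assumes that the summary file is written next to the main diff file in LOGS_DIR. The helper name diff_files_present is hypothetical.

```shell
# Sketch: confirm both diff files for a given date exist and are non-empty.
# diff_files_present is a hypothetical helper; it assumes both files are
# written next to each other in the same directory.
diff_files_present() {
  dir=$1
  stamp=$2
  for f in "$dir/gpb-diff-$stamp.tsv" "$dir/gpb-diff-$stamp.summary.tsv"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
  echo "both diff files found for $stamp"
}

# Check today's files; falls back to the current directory if LOGS_DIR is unset.
diff_files_present "${LOGS_DIR:-.}" "$(date '+%Y-%m-%d')" || echo "Diff files are not ready to send" >&2
```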

Internal reference data

Bulk coverage

The gene panel builder uses bulk coverage data to calculate how well individual genes, transcripts or regions are expected to be covered. The bulk coverage data should accurately reflect the current performance of library preparation, fragment sequencing and read mapping, and must therefore be updated whenever any of these processes undergoes significant changes.

The bulk coverage data are generated following the instructions in the gene panel builder repository. The choice of BAM files to use may depend on the specific circumstances warranting the update. Discuss with the rest of the bioinformatics team and decide on the best course of action.