HTS Bioinf - Update of gene panel builder reference data
Scope
This procedure explains how to update the reference data used by the gene panel builder service.
Responsibility
A bioinformatician from GDx is responsible for performing these tasks.
The gene panel builder service
An instance of the gene panel builder service database must be running on the computer where the update is performed.
Prior to update
Notify the lab doctors (ELL@ous-hf.no) that you intend to update the reference data.
Update
External reference data
ssh gpbuilder@gpb.allele.es
# make sure that the gene panels repository is up to date with the remote
git -C genepanel-store checkout '<latest-release-tag>'
# open a Bash shell in a gene panel builder Docker container
docker compose --project-directory config exec -it gpbuilder bash
Hereafter, we will assume that the user is inside the Docker container running the GPB backend.
Before performing the update, make sure that a valid OMIM API key is stored in the OMIM_API_KEY environment variable (echo $OMIM_API_KEY).
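The echo check above only prints the value. As a minimal sketch, assuming a POSIX-compatible shell inside the container, the guard below fails when the variable is unset or empty. The helper name require_env is hypothetical and not part of gpb. It only checks that a value is present, not that OMIM accepts the key.

```shell
# Hypothetical helper (not part of gpb): fail if the named
# environment variable is unset or empty.
require_env() {
    # eval reads the variable whose name is passed in $1
    eval "_value=\${$1:-}"
    if [ -z "$_value" ]; then
        echo "error: $1 is not set or is empty" >&2
        return 1
    fi
}

# Report success only when the key is present
if require_env OMIM_API_KEY; then
    echo "OMIM_API_KEY is set"
fi
```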
Update all external data:
gpb update --diff all-external
Note: To create TSV files showing the effects of an update without committing to the database (useful for testing), run:
gpb --dry-run update --diff all-external
Create drafts
Once the update is successfully completed, gene names, inheritance, transcripts and PanelApp panels/gene content may have changed. To avoid releasing gene panels with the same version, but different configuration, run the command:
This command will bump versions of all panels with content differing from that of their exported counterparts.
Take note of the panels (if any) listed with the warning "These panels remain at draft stage after version bumping and require expert review".
Generate diff
Create a diff of all active database panels against panels in production:
Notify ELL
Send the generated diffs (gpb-diff-YYYY-MM-DD.tsv and gpb-diff-YYYY-MM-DD.summary.tsv) to the lab doctors. Notify them of which panels (if any) have no submitted version after updating; these are the panels that need to be manually inspected and resubmitted to be part of the next panel release.
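Before sending, it can help to confirm that both diff files exist and are not empty. The sketch below assumes that the files are in the current directory and that YYYY-MM-DD is the date the diff was generated (today by default). Both helper names are hypothetical and not part of gpb.

```shell
# Hypothetical helper: print the two expected diff file names
# for a given date (default: today's date).
diff_files() {
    day="${1:-$(date +%Y-%m-%d)}"
    printf '%s\n' "gpb-diff-${day}.tsv" "gpb-diff-${day}.summary.tsv"
}

# Hypothetical helper: return non-zero if either file is missing or empty.
# Assumes the diffs were written to the current directory.
check_diff_files() {
    status=0
    for f in $(diff_files "${1:-}"); do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling check_diff_files without arguments checks today's files. Pass a date such as 2024-01-31 to check files generated on another day.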
Internal reference data
Bulk coverage
The gene panel builder uses bulk coverage data to calculate how well individual genes, transcripts or regions are expected to be covered. The bulk coverage data should accurately reflect the current performance of library preparation, fragment sequencing and read mapping, and must therefore be updated whenever any of these processes undergoes significant changes.
The bulk coverage data are generated following the instructions in the gene panel builder repository. The choice of BAM files to use may depend on the specific circumstances warranting the update. Discuss with the rest of the bioinformatics team and decide on the best course of action.