HTS Bioinf - Training for running pipeline
Scope
This procedure lists what a bioinformatician must do before being allowed to run and monitor pipeline analyses on patient samples.
-
Read the GATK Best Practices for germline short variant discovery (https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels) as background material for an overall understanding of the different pipeline steps. Use the latest version that matches the pipeline.
-
If you are unfamiliar with TSD, read the TSD help pages to learn how to log in, where to find which files, etc.
-
Read through the pipeline scripts and the automation system code. Ask a trained bioinformatician who knows the code well for a walkthrough.
-
Read the pipeline specification for the pipelines that will be run in the analyses. Look inside the mentioned scripts to understand how they work, study the output files and note what they should look like.
-
Get a user account on TSD, NSC and Clarity (see procedure [HTS Bioinf - Infrastructure]) if you haven't got one already.
-
Ask the TSD system administrators to add you to the following user groups if you aren't a member of them already: p22-diag-ous-bioinf-group, p22-diag-ous-lab-group, p22-export-group, p22-import-group.
-
Set up the prerequisites for using lims-exporter-api. Each new user of lims-exporter-api needs to do a one-time setup as follows:
- Ask the Clarity LIMS administrator to add you as a Clarity API user.
- On gdx-login, copy /boston/diag/transfer/sw/.genosqlrc.yaml and /boston/diag/transfer/sw/tsd-import/src/lims_exporter_api/.genologicsrc to your $HOME directory.
- Replace USERNAME and PASSWORD with your own Clarity username and password in the copy of .genologicsrc in your $HOME directory.
- On sleipnir, copy /boston/diag/transfer/sw/.s3cfg to your $HOME directory.
-
Study the following procedures and confirm that you have read them ("lesekvittere"):
- Clarity LIMS
- HTS - Mismatch between TaqMan SNP-ID and sequencing data
- HTS - Samples that fail QC in bioinformatic pipeline
- HTS - Use of reference materials for internal quality control
- HTS Bioinf - Demultiplexing and quality control of raw sequencing data
- HTS Bioinf - Deployment of vcpipe for production
- HTS Bioinf - Release and deployment of tsd-import
- HTS Bioinf - Execution and monitoring of pipeline
- HTS Bioinf - Storage and security of sensitive data
- HTS Bioinf - Basepipe pipeline
- HTS Bioinf - Trio pipeline
- HTS Bioinf - ELLA daily operations
- HTS Bioinf - anno with anno-targets
-
Run two analyses together with a bioinformatician who has already passed training.
-
Sign the "Opplæringssjekkliste" (training checklist) together with the bioinformatic coordinator or the production coordinator (see procedure [HTS Bioinf Group roles]).
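The account-related steps above (group membership and the lims-exporter-api configuration files) can be checked before the first supervised run. The following is a minimal sketch, not part of the official tooling. The function names are hypothetical. It assumes that the group names from this procedure can be looked up through the system group database on the host where the script is run, and that the host names used as dictionary keys match the steps above.

```python
import grp  # Unix-only standard library module
import os
from pathlib import Path

# User groups listed in this procedure.
REQUIRED_GROUPS = frozenset({
    "p22-diag-ous-bioinf-group",
    "p22-diag-ous-lab-group",
    "p22-export-group",
    "p22-import-group",
})

# Files that should exist in $HOME on each host after the one-time setup.
# The host-to-file mapping is taken from the setup steps in this procedure.
EXPECTED_FILES_BY_HOST = {
    "gdx-login": (".genosqlrc.yaml", ".genologicsrc"),
    "sleipnir": (".s3cfg",),
}


def current_group_names():
    """Return the names of the groups the current user belongs to."""
    names = set()
    for gid in os.getgroups():
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid without a name entry cannot match a required group.
            continue
    return names


def missing_groups(group_names):
    """Return the required groups that are absent from group_names, sorted."""
    return sorted(REQUIRED_GROUPS - set(group_names))


def missing_files(home, filenames):
    """Return those of filenames that do not exist as regular files in home."""
    return [name for name in filenames if not (Path(home) / name).is_file()]
```

On gdx-login, for example, `missing_groups(current_group_names())` and `missing_files(Path.home(), EXPECTED_FILES_BY_HOST["gdx-login"])` should both return empty lists once the setup is complete. The sketch only checks that the files exist. It does not verify that the credentials in .genologicsrc are correct.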
Background
The bioinformatics pipeline is almost fully automated. Currently, the only manual steps are quality control of the raw sequencing data (FastQC), transfer of data from NSC to TSD, and monitoring that the pipeline runs and finishes successfully on TSD.
References
- Clarity LIMS
- HTS - Mismatch between TaqMan SNP-ID and sequencing data
- HTS - Samples that fail QC in bioinformatic pipeline
- HTS - Use of reference materials for internal quality control
- HTS Bioinf Group roles
- HTS Bioinf - Demultiplexing and quality control of raw sequencing data
- HTS Bioinf - Deployment of vcpipe for production
- HTS Bioinf - Release and deployment of tsd-import
- HTS Bioinf - Execution and monitoring of pipeline
- HTS Bioinf - Quality control of processed sequencing data
- HTS Bioinf - Storage and security of sensitive data
- HTS Bioinf - Basepipe pipeline
- HTS Bioinf - Trio pipeline
- HTS Bioinf - ELLA daily operations
- HTS Bioinf - anno with anno-targets