HTS Bioinf - Training for running pipeline
Scope
This procedure lists what a bioinformatician must do before being allowed to run and monitor pipeline analyses on patient samples.
- Read the GATK best practices (https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-; use the latest version that fits the pipeline) as background material for an overall understanding of the different pipeline steps.
- If you are unfamiliar with TSD, read the TSD help pages to learn how to log in, how the TSD file structure is organized, etc.: https://www.uio.no/english/services/it/research/storage/sensitive-data/use-tsd/index.html
- Read through the pipeline scripts and the automation system code. Get a walkthrough from a trained bioinformatician who knows the code well.
- Read the pipeline specification for each pipeline that will be run in the analyses. Look inside the scripts it mentions to understand how they work, and study the output files to learn what they should look like.
- Have a user account in TSD, vali, sleipnir and Clarity (see procedure [HTS Bioinf - Infrastructure]).
- Ask the system administrator to add you to the following TSD user groups: p22-import-group, p22-export-group, p22-diag-ous-bioinf-group, p22-diag-ous-lab-group.
- Set up the prerequisites for using the lims-exporter-api. Each new user of lims-exporter-api needs to do a one-time setup as follows:
  - Ask the Clarity LIMS administrator to add you as a Clarity API user.
  - On beta, copy /boston/diag/tranfer/sw/tsd-import/src/lims_exporter_api/.genologicsrc to your $HOME and replace USERNAME and PASSWORD with your own Clarity username and password.
  - On sleipnir, copy /boston/diag/tranfer/sw/.s3cfg to your $HOME.
  - Ask a current user for a copy of .genosqlrc.yaml and put it in your $HOME on vali (or beta) without modification.
- Study the following procedures and the procedures they link to, and confirm that you have read them (lesekvittere):
  - HTS Bioinf - Quality control of processed sequencing data
  - HTS Bioinf - Deployment of vcpipe for production
  - HTS Bioinf - Release and deployment of tsd-import
  - HTS Bioinf - Execution and monitoring of pipeline
  - Clarity LIMS
  - HTS - Mismatch between TaqMan SNP-ID and sequencing data
  - HTS IT - Storage and security of sensitive data
  - HTS Bioinf - Basepipe pipeline
  - HTS Bioinf - Trio pipeline
  - HTS Bioinf - Demultiplexing and quality control of raw sequencing data
  - HTS Bioinf - ELLA production
  - HTS Bioinf - anno with anno-targets
  - HTS - Use of NA samples for quality control
  - MegaQC - quality trend analysis
  - HTS - Samples that fail QC in bioinformatic pipeline
- Run two analyses together with a bioinformatician who has already passed training.
- Sign the "Opplæringssjekkliste" (training checklist) together with the bioinformatic coordinator or the production coordinator (see procedure [HTS Bioinf Group roles]).
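
The group and configuration prerequisites in the list above can be self-checked before the supervised runs. The following is a minimal sketch and not part of the official procedure. The function names are hypothetical, and the group and file names are copied from the list above (assuming the space in "p22- diag-ous-bioinf-group" is a typo). It also assumes that TSD group membership is visible through the standard Unix group database on the host where the script runs. Because the configuration files live on different hosts, the file check only covers the host named on the command line.

```python
import grp
import os
import sys
from pathlib import Path

# Group names as listed in this procedure.
REQUIRED_GROUPS = [
    "p22-import-group",
    "p22-export-group",
    "p22-diag-ous-bioinf-group",
    "p22-diag-ous-lab-group",
]

# Files this procedure asks you to place in $HOME, per host.
# .genosqlrc.yaml may alternatively be placed on beta.
FILES_BY_HOST = {
    "beta": [".genologicsrc"],
    "sleipnir": [".s3cfg"],
    "vali": [".genosqlrc.yaml"],
}


def missing_groups(required, actual):
    """Return the required group names that are absent from `actual`, in order."""
    have = set(actual)
    return [g for g in required if g not in have]


def missing_files(home, filenames):
    """Return the names that do not exist as regular files directly under `home`."""
    return [name for name in filenames if not (Path(home) / name).is_file()]


def current_group_names():
    """Return the names of the groups the current process belongs to (Unix only)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group id without a name entry in the group database; skip it.
            pass
    return names


def report(host, home, group_names):
    """Return one human-readable line per unmet prerequisite (empty list if none)."""
    lines = [f"not a member of group: {g}"
             for g in missing_groups(REQUIRED_GROUPS, group_names)]
    lines += [f"missing in {home}: {name}"
              for name in missing_files(home, FILES_BY_HOST.get(host, []))]
    return lines


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_prerequisites.py beta
    host = sys.argv[1] if len(sys.argv) > 1 else "vali"
    problems = report(host, Path.home(), current_group_names())
    print("\n".join(problems) if problems else "all checks passed")
```

The script only checks that the files exist. It does not verify that USERNAME and PASSWORD in .genologicsrc have been replaced, and it cannot confirm the Clarity API user, which still has to be set up with the Clarity LIMS administrator.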
Background
The bioinformatics pipeline is fully automated. Currently, the only manual steps are a quality check of the raw sequencing data (FASTQC), transferring data from NSC to TSD, and monitoring that the pipeline executes and finishes successfully on TSD.
References
- HTS Bioinf - anno with anno-targets
- HTS Bioinf - Demultiplexing and quality control of raw sequencing data
- HTS Bioinf - ELLA production
- HTS Bioinf - Running annoservice
- HTS Bioinf - Trio pipeline
- HTS Bioinf Infrastructure
- HTS Bioinf - Quality control of processed sequencing data
- HTS Bioinf - Deployment of vcpipe for production
- HTS Bioinf - Release and deployment of tsd-import
- HTS Bioinf - Execution and monitoring of pipeline
- Clarity LIMS
- HTS - Mismatch between TaqMan SNP-ID and sequencing data
- HTS Bioinf - Storage and security of sensitive data
- HTS Bioinf Group roles