HTS Bioinf - Training for running pipeline

Scope

This procedure lists what a bioinformatician must do before being allowed to run and monitor pipeline analyses on patient samples.


  1. Read the GATK Best Practices workflow for germline short variant discovery as background material for an overall understanding of the different pipeline steps (use the latest version that fits the pipeline): https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-
  2. If you are unfamiliar with TSD, read the TSD help pages to understand how to log in, the TSD file structure, etc.: https://www.uio.no/english/services/it/research/storage/sensitive-data/use-tsd/index.html
  3. Read through the pipeline scripts and the automation system code. Get a walkthrough from a trained bioinformatician who knows them well.
  4. Read the pipeline specification for each pipeline that will be run in the analyses. Look inside the scripts mentioned there to understand how they work, and study the output files to learn what they should look like.
  5. Have a user account in TSD, vali, sleipnir and Clarity (see procedure [HTS Bioinf - Infrastructure])
  6. Ask the system administrator to add you to the following TSD user groups: p22-import-group, p22-export-group, p22-diag-ous-bioinf-group, p22-diag-ous-lab-group
  7. Set up prerequisites for using the lims-exporter-api

    Each new user of lims-exporter-api must do a one-time setup as follows:

    1. Ask the Clarity LIMS administrator to add you as a Clarity API user.
    2. On beta, copy /boston/diag/tranfer/sw/tsd-import/src/lims_exporter_api/.genologicsrc to your $HOME and replace USERNAME and PASSWORD with your own Clarity username and password.
    3. On sleipnir, copy /boston/diag/tranfer/sw/.s3cfg to your $HOME.
    4. Ask a current user for a copy of .genosqlrc.yaml and put it, unmodified, in your $HOME on vali (or beta).
  8. Study the following procedures and the procedures they link to, and sign them off as read (lesekvittere):

    1. HTS Bioinf - Quality control of processed sequencing data
    2. HTS Bioinf - Deployment of vcpipe for production
    3. HTS Bioinf - Release and deployment of tsd-import
    4. HTS Bioinf - Execution and monitoring of pipeline
    5. Clarity LIMS
    6. HTS - Mismatch between TaqMan SNP-ID and sequencing data
    7. HTS IT - Storage and security of sensitive data
    8. HTS Bioinf - Basepipe pipeline
    9. HTS Bioinf - Trio pipeline
    10. HTS Bioinf - Demultiplexing and quality control of raw sequencing data
    11. HTS Bioinf - ELLA production
    12. HTS Bioinf - anno with anno-targets
    13. HTS - Use of NA samples for quality control
    14. MegaQC - quality trend analysis
    15. HTS - Samples that fail QC in bioinformatic pipeline
  9. Run two analyses together with a bioinformatician who has already passed training.
  10. Sign the "Opplæringssjekkliste" (training checklist) together with the bioinformatics coordinator or the production coordinator (see procedure [HTS Bioinf Group roles]).
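
The group memberships and configuration files listed in the steps above can be checked with a small script before the first supervised run. The sketch below is not part of the official tooling. The function names are hypothetical, and the group and file names are copied from this procedure. It also assumes that the host it runs on exposes the TSD groups as ordinary Unix groups, which may not be the case. Because the configuration files are placed on different hosts, pass only the file names expected on the host being checked.

```python
import grp
import os
from pathlib import Path

# Group and file names copied from this procedure (assumed to be current).
REQUIRED_GROUPS = (
    "p22-import-group",
    "p22-export-group",
    "p22-diag-ous-bioinf-group",
    "p22-diag-ous-lab-group",
)
REQUIRED_FILES = (".genologicsrc", ".s3cfg", ".genosqlrc.yaml")


def missing_groups(current, required=REQUIRED_GROUPS):
    """Return the required group names that are absent from `current`."""
    current = set(current)
    return [name for name in required if name not in current]


def missing_files(home, required=REQUIRED_FILES):
    """Return the required file names that are not regular files in `home`."""
    home = Path(home)
    return [name for name in required if not (home / name).is_file()]


def current_group_names():
    """Names of the groups reported by os.getgroups() (Unix only)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group ID without a name entry on this host; skip it.
            continue
    return names


if __name__ == "__main__":
    print("Missing groups:", missing_groups(current_group_names()) or "none")
    print("Missing files:", missing_files(Path.home()) or "none")
```

For example, on a host where only .s3cfg is expected, `missing_files(Path.home(), required=(".s3cfg",))` returns an empty list once the file is in place.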

Background

The bioinformatics pipeline is fully automated. Currently, the only manual steps are a quality check of the raw sequencing data (FastQC), transfer of data from NSC to TSD, and monitoring that the pipeline executes and finishes successfully on TSD.

References