What is GATK

GATK, properly pronounced “Gee-ay-tee-kay” (/dʒi•eɪ•ti•keɪ/), stands for Genome Analysis Toolkit. It is a collection of command-line tools for analyzing high-throughput sequencing data, with a primary focus on variant discovery. The tools can be used individually or chained together into complete workflows.

 

Module

To set up the required environment variables, use the following command:

System     Command
HPC2021    module load gatk
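
As a quick check after loading, the commands below confirm that the module is active and show what it changes in the environment. This is a minimal sketch: `module list` and `module show` are standard subcommands of Environment Modules and Lmod, but what the gatk module actually defines on HPC2021 is not documented here, so inspect the output rather than assuming particular variable names. The `gatk_module_status` variable is purely illustrative.

```shell
# Sketch only: gatk_module_status is an illustrative variable, not part of the module.
gatk_module_status="unavailable"

if command -v module >/dev/null 2>&1; then
    # Load the GATK environment (command from the table above).
    if module load gatk; then
        gatk_module_status="loaded"
        module list          # confirm gatk appears among the loaded modules
        module show gatk     # print the environment changes the module makes
    else
        gatk_module_status="failed"
    fi
fi

echo "GATK module status: ${gatk_module_status}"
```

A status of "unavailable" means the `module` command itself was not found, which usually indicates the shell is not running on a cluster node.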

 

A sample SLURM batch script for GATK4 using a Singularity image is located under /share1/gatk/sample/ on the HPC2021 system.
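
One way to start from that sample is to copy the directory into your own space and adapt it before submitting. The sketch below assumes only the path given above; the destination directory name is arbitrary, and the script file name passed to `sbatch` is left as a placeholder because the contents of the sample directory are not listed here.

```shell
# Path from the text above; DEST_DIR is an arbitrary, illustrative destination.
SAMPLE_DIR="/share1/gatk/sample"
DEST_DIR="${HOME}/gatk-sample"

if [ -d "${SAMPLE_DIR}" ]; then
    # Copy the sample files so they can be edited without touching the shared copy.
    mkdir -p "${DEST_DIR}"
    cp -r "${SAMPLE_DIR}/." "${DEST_DIR}/"
    ls -l "${DEST_DIR}"
    echo "Edit the batch script in ${DEST_DIR}, then submit it with: sbatch <script-name>"
else
    echo "Sample directory ${SAMPLE_DIR} not found; run this on HPC2021." >&2
fi
```

Working on a private copy keeps the shared sample intact and lets you adjust resource requests and input paths for your own data.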

 

Additional Information

  1. GATK User Guides
  2. YouTube Clips for GATK Best Practices for Variant Discovery Training