What is GATK

GATK, properly pronounced “Gee-ay-tee-kay” (/dʒi•eɪ•ti•keɪ/), stands for Genome Analysis Toolkit. It is a collection of command-line tools for analyzing high-throughput sequencing data, with a primary focus on variant discovery. The tools can be used individually or chained together into complete workflows.



To set up the required environment variables, use the following command:

System     Command
HPC2021    module load gatk
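
As a quick check that the module loaded correctly, the short sketch below loads GATK and reports whether the gatk launcher is on the PATH. It assumes that the module places an executable named gatk on the PATH, which this page does not state; the variable name gatk_status is only illustrative.

```shell
#!/bin/bash
# Load GATK if the `module` command is available
# (it is on HPC2021; on other machines this step is skipped).
if command -v module >/dev/null 2>&1; then
    module load gatk
fi

# Assumption: the module exposes a launcher named `gatk` on PATH.
if command -v gatk >/dev/null 2>&1; then
    gatk_status="gatk found at $(command -v gatk)"
else
    gatk_status="gatk not found on PATH"
fi
echo "$gatk_status"
```

If the launcher is found, running gatk --list should print the available tools.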


A sample SLURM batch script for GATK4 using a Singularity image is located under /share1/gatk/sample/ on the HPC2021 system.
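
For orientation only, a job of this kind might look roughly like the sketch below. It is not the script shipped under /share1/gatk/sample/, which should be treated as the reference. The image name, input and output file names, and resource requests are all hypothetical placeholders, and HaplotypeCaller is used purely as an example tool. The script falls back to printing the command when Singularity or the image is not available.

```shell
#!/bin/bash
#SBATCH --job-name=gatk4-example
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00

# Hypothetical placeholders: replace with your own image and data files.
IMAGE="${IMAGE:-gatk4.sif}"
REF="${REF:-reference.fasta}"
BAM="${BAM:-sample.bam}"
OUT="${OUT:-sample.vcf.gz}"

# Example GATK4 command: call variants with HaplotypeCaller.
GATK_CMD=(gatk HaplotypeCaller -R "$REF" -I "$BAM" -O "$OUT")

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # Run the GATK command inside the container image.
    singularity exec "$IMAGE" "${GATK_CMD[@]}"
else
    # Singularity or the image is missing: only show what would be run.
    echo "dry run: singularity exec $IMAGE ${GATK_CMD[*]}"
fi
```

A script like this would be submitted with sbatch. Check the sample under /share1/gatk/sample/ for the actual image location and any site-specific SLURM options.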


Additional Information

  1. GATK User Guides
  2. YouTube Clips for GATK Best Practice for Variant Discovery Training