HPC Software and Programming Tools – Research Computing, HKU ITS

The following scientific computing software is available on our HPC cluster systems under the /share1 directory.

* Where several versions are installed, the version marked with an asterisk is the default.

Licensed/Open Source Software Applications

Bioinformatics

Software	HPC2021
AlphaFold	2.1.2*, 2.3.1
CD-HIT	4.8.1
GATK	4.2.4.0
HyPhy	2.5.42
PHYML	3.3.20220408
MEME	4.11.2
MEGAHIT	1.1.4
SAMtools	1.12

Chemistry and Molecular Modeling

Software	HPC2021
ADF	2014, 2019
AMBER	20
CP2K	2022.1, 2023.1
CPMD	4.3
Gaussian	g09, g16
GROMACS	2021.3
LAMMPS	20210929, 20220803
NWChem	7.0.2
ORCA	5.0.0, 5.0.2, 5.0.3*
SIESTA	4.0.1

Physics and Materials Science

Software	HPC2021
Abaqus	2020, 2021, 2022, 2023, 2024*
Quantum ESPRESSO	6.7
VASP	5.4.4, 6.2.1, 6.3.1, 6.3.2*

Mathematics and Statistics

Software	HPC2021
MATLAB	R2021a, R2021b, R2022b, R2023b, R2024b
R	4.0.4*, 4.1.2, 4.2.1, 4.3.2
STATA	16.1, 17.0, 18.0, 18.5, 19.5*

Data Analysis and Machine Learning

Software	HPC2021
Anaconda	Available

Utilities and Libraries

Compilers and programming languages

Software	HPC2021
GNU Compiler	8.3.1*, 10.2.0
Intel Compiler	2019, 2020, 2021, 2022*
PGI Compiler	evolved into NVHPC
AMD Optimizing C/C++ Compiler	3.1.0, 3.2.0*
CUDA	11.2, 11.8, 12.3, 12.8
Perl	5.34.0
Python	3.9.2, 3.9.7*, 3.12.1
Julia	1.6.1, 1.10.4*
Ruby	2.7.2, 3.0.2

Parallel Libraries

Software	HPC2021
MPICH	3.4.2, 4.1.2
Intel MPI	2019, 2020, 2021, 2022*
Open MPI	4.1.0, 4.1.4, 4.1.6
MPI for Python (mpi4py)	Available

Math Libraries

Software	HPC2021
AOCL	3.0-6, 3.1.0*, 4.1.0
FFTW	3.3.9
GMP	6.2.1
GSL	2.7
Intel MKL	Available
BLAS, LAPACK and ScaLAPACK	Available

Programming Utilities

Software	HPC2021
HDF5	1.10.7, 1.12.2*
MPC	1.2.1
MPFR	4.1.0

Visualization and Plotting

Software	HPC2021
Gnuplot	5.4.2
ParaView	5.9.0, 5.10.0*
VMD	1.9.3

List of available modules on HPC2021 system
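
The "Available versions" column in the table below lists module names separated by spaces, with the default marked by a trailing "(Default)". As a minimal sketch, such a cell could be split into module names and a default as follows. The function name `parse_versions` is hypothetical, and treating a single unmarked entry as the default is an assumption rather than something this page states.

```python
from typing import List, Optional, Tuple


def parse_versions(cell: str) -> Tuple[List[str], Optional[str]]:
    """Split an 'Available versions' cell into module names and the default."""
    modules: List[str] = []
    default: Optional[str] = None
    for token in cell.split():
        if token == "(Default)":
            # The marker refers to the module name immediately before it.
            if modules:
                default = modules[-1]
        else:
            modules.append(token)
    # Assumption: when only one version is listed and none is marked,
    # that single version is treated as the default.
    if default is None and len(modules) == 1:
        default = modules[0]
    return modules, default


# Example cell copied from the abaqus row of the table.
cell = "abaqus/2020 abaqus/2021 abaqus/2022 abaqus/2023 abaqus/2024 (Default)"
modules, default = parse_versions(cell)
print(default)
```

When several versions are listed and none is marked, the sketch returns `None` for the default rather than guessing.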

Software	Description	Available versions	Keywords
abaqus	ABAQUS – Software suite for finite element analysis and computer-aided engineering.	abaqus/2020 abaqus/2021 abaqus/2022 abaqus/2023 abaqus/2024 (Default)	Finite Element Analysis, Computer-aided Engineering
abricate	Mass screening of contigs for antimicrobial and virulence genes	abricate/1.0.0	Virus
ABySS	ABySS is a de novo sequence assembler intended for short paired-end reads and genomes of all sizes	ABySS/2.3.3	Genome Assembler
adf	ADF: Package that uses Density Functional Theory (DFT) to predict chemical structure and reactivity for electronic and molecular structure calculations.	adf/2014 adf/2019	Density Functional Theory, Spectroscopy, Transition Metal, Heavy Elements
AHRD	Automated Assignment of Human Readable Descriptions (AHRD)	AHRD/3.3.3	Gene/Protein Annotation
alphafold	AlphaFold: AI program developed by Google’s DeepMind that predicts protein structures	alphafold/2.1.0 alphafold/2.1.1 alphafold/2.1.2 (Default) alphafold/2.3.1	Structural Bioinformatics, Protein Structure Prediction, AI
anaconda	Anaconda: Python Data Science Platform for Python 3	anaconda/py3.8	Data Science, Conda, Python, Jupyter
ancestry_hmm-s	Inferring adaptive introgression from genomic data using hidden Markov models	ancestry_hmm-s/0.9.0.2	Population Genomics
AnnotSV	AnnotSV: An integrated tool for Structural Variations annotation and ranking	AnnotSV/3.1	Annotation, SV, CNV, Target Prioritization
ANNOVAR	ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes	ANNOVAR/2020-06-07	NGS, Annotation
aocc	AOCC – AMD Optimizing C/C++ Compiler	aocc/3.1.0 aocc/3.2.0 (Default) aocc/4.1.0	AMD, EPYC, Compiler
aocl/aocc	AMD Optimizing CPU Libraries (AOCL)	aocl/aocc/3.0-6 aocl/aocc/3.1.0 (Default) aocl/aocc/4.1.0	AMD, EPYC, Numerical Libraries
aocl/gcc	AMD Optimizing CPU Libraries (AOCL)	aocl/gcc/3.0-6 aocl/gcc/3.1.0 (Default) aocl/gcc/4.1.0	AMD, EPYC, Numerical Libraries
arlequin	Arlequin: An Integrated Software for Population Genetics Data Analysis	arlequin/3.5.2.2	Population Genetics, Molecular Ecology
aspera	IBM Aspera Command-Line Interface (the Aspera CLI) is a collection of Aspera tools for performing high-speed, secure data transfers from the command line.	aspera/3.9.6	Data Transfer
augustus	AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences.	augustus/3.4.0	Eukaryotic gene prediction
automake	Automake – make file builder part of autotools	automake/1.16.3	Makefile, Configure Tool
axel	axel: Lightweight CLI download accelerator	axel/2.17.11 (Default)	Data Download
bamtools	BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files.	bamtools/2.5.2	NGS, Data Format, BAM
BASTA	https://github.com/timkahlke/BASTA.	BASTA/1.4.1	Taxonomy Assignment
BayeScan	BayeScan aims at identifying candidate loci under natural selection from genetic data, using differences in allele frequencies between populations.	bayescan/2.1	bayescan
BBmap	BBMap: Short read aligner for DNA and RNA-seq data. Capable of handling arbitrarily large genomes with millions of scaffolds	BBmap/38.93	NGS, Aligner, Short-read
bc	bc is an arbitrary precision numeric processing language. https://www.gnu.org/software/bc	bc/1.07.1	Calculator
bcftools	BCFtools is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.	bcftools/1.14	NGS, Data Format, VCF
bcl2fastq	The Illumina bcl2fastq2 Conversion Software demultiplexes sequencing data and converts base call (BCL) files into FASTQ files.	bcl2fastq/2.19.0	NGS, Base-calling, Illumina
BEAGLE/4.0.0	BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages.	BEAGLE/4.0.0/amd BEAGLE/4.0.0/gpu BEAGLE/4.0.0/intel (Default)	Phylogenetics
BEAST	BEAST is a cross-platform program for Bayesian analysis of molecular sequences using MCMC. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models.	BEAST/1.10.4	Phylogenetics
BEAST2	BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences. It estimates rooted, time-measured phylogenies using strict or relaxed molecular clock models.	BEAST2/2.6.7 (Default) BEAST2/2.7.6	Phylogenetics
bedtools	bedtools – the swiss army knife for genome arithmetic	bedtools/2.30.0	NGS, Data Format, BAM, BED, GFF, GTF, VCF
berkeleydb	Oracle Berkeley DB	berkeleydb/18.1.40	embedded key-value database
bismark	Bismark is a tool to map bisulfite converted sequence reads and determine cytosine methylation states	bismark/0.23.1	NGS, Bisulfite Sequencing, Methylation Call
blast-plus	BLAST finds regions of similarity between biological sequences.	blast-plus/2.13.0	Alignment, Sequence Query
boost/gcc	Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.	boost/gcc/1.77.0 boost/gcc/1.80.0 (Default)	C++ Libraries
bowtie	Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour.	bowtie/1.3.1	NGS, Aligner
bowtie2	Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.	bowtie2/2.4.4 (Default)	NGS, Aligner
Bracken	Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.	Bracken/2.6.2	NGS, Metagenomics
BRAKER	BRAKER2 is an extension of BRAKER1 which allows for fully automated training of the gene prediction tools GeneMark-EX R14, R15, R17, F1 and AUGUSTUS from RNA-Seq and/or protein homology information, and that integrates the extrinsic evidence from RNA-Seq and protein homology information into the prediction.	BRAKER/2.1.6	Gene structure annotation
bsmap	BSMAP is a short reads mapping software for bisulfite sequencing reads.	bsmap/2.9.0	NGS, Bisulfite Sequencing, Genome Mapping
busco	BUSCO – Benchmarking sets of Universal Single-Copy Ortholog	busco/5.3.2	Ortholog
Canu	Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing.	Canu/2.2	NGS, Genome Assembler
CellPhoneDB	CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions. Subunit architecture is included for both ligands and receptors, representing heteromeric complexes accurately	CellPhoneDB/2.1.7	Receptors / Ligands database
CellProfiler	CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.	CellProfiler/4.2.1	Cell Imaging
CellRanger	Cell Ranger is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.	CellRanger/6.1.2	NGS, RNA-Seq, Single Cell
CFOUR	CFOUR (Coupled-Cluster techniques for Computational Chemistry) is a program package for performing high-level quantum chemical calculations on atoms and molecules.	cfour/2.1	Quantum Chemistry
CheckM	CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage	CheckM/1.1.3	Metagenomics, Quality Control
CheckV	CheckV is a fully automated command-line pipeline for assessing the quality of single-contig viral genomes, including identification of host contamination for integrated proviruses, estimating completeness for genome fragments, and identification of closed genomes	CheckV/0.8.1	Metagenomics, viral genomes
cmake	A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.	cmake/3.19.7 cmake/3.30.2 (Default)	Make, Configure Tool
CMSeq	CMSeq is a set of commands to provide an interface to .bam files for coverage and sequence consensus	CMSeq/1.0.4	NGS, Data Format, BAM
CNVnator	a tool for CNV discovery and genotyping from depth-of-coverage by mapped reads	CNVnator/0.4.1	NGS, Structural Variant, CNV
CNVpytor	CNVpytor is a Python extension of CNVnator, a tool for CNV analysis from depth-of-coverage by mapped reads	CNVpytor/1.0	NGS, Structural Variant, CNV
comsol	COMSOL.	comsol/6.0 comsol/6.1
conos/R-4.1.2	R package wires together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections.	conos/R-4.1.2/1.4.4	NGS, RNA-seq, Single Cell
cpmd	CPMD – Car-Parrinello Molecular Dynamics simulations.	cpmd/4.1 cpmd/4.3-impi2020u4 cpmd/4.3 (Default)	Density Functional Theory, ab-initio molecular dynamics
cp2k	CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal and biological systems.	cp2k/2023.1	Quantum Chemistry, Simulations, Atoms
CTFFIND	CTFFIND4: Fast and accurate defocus estimation from electron micrographs	CTFFIND/4.1.14	Cryo-EM, Micrograph
cuda	NVIDIA CUDA Toolkit – comprehensive development environment for C and C++ developers building GPU-accelerated applications	cuda/11.2 (Default) cuda/11.8 cuda/12.3	NVIDIA, CUDA, GPU
cuda-toolkit	NVIDIA CUDA Toolkit – comprehensive development environment for C and C++ developers building GPU-accelerated applications	cuda-toolkit/11.7	NVIDIA, CUDA, GPU
cudnn	NVIDIA CUDNN Library – CUDA-based Deep Neural Network library	cudnn/8.2.4-cuda11.4 cudnn/8.9.7-cuda11.8 cudnn/8.9.7-cuda12.3 (Default)	NVIDIA, GPU, CUDA, cuDNN
cufflinks	Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.	cufflinks/2.2.1	NGS, RNA seq
curl	CURL is an open source command line tool and library for transferring data with URL syntax	curl/7.75.0	Downloader
cutadapt	Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.	cutadapt/3.4 cutadapt/3.5 (Default)	Bioinformatics, sequence trimming
cytoscape	Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data.	cytoscape/3.9.1	Network analysis
dadi	dadi implements methods for demographic history and selection inference from genetic data, based on diffusion approximations to the allele frequency spectrum.	dadi/2.1.2	Demographic Inference
deepTools	deepTools addresses the challenge of handling the large amounts of data that are now routinely generated from DNA sequencing centers. deepTools contains useful modules to process the mapped reads data for multiple quality checks, creating normalized coverage files in standard bedGraph and bigWig file formats, that allow comparison between different files (for example, treatment and control)	deepTools/3.5.1	NGS, Quality Control, Visualization
delly	DELLY2: Structural variant discovery by integrated paired-end and split-read analysis	delly/0.9.1	NGS, Structural Variant
DensityMap	DensityMap is a Perl tool for the visualization of feature density along chromosomes	DensityMap/1.0	Chromosomes, Visualizations
DESeq2/R-4.1.2	DESeq2: Differential gene expression analysis based on the negative binomial distribution.	DESeq2/R-4.1.2/1.34.0	NGS, RNA-seq
diamond	DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.	diamond/2.0.9 diamond/2.0.13 (Default)	Aligner
dotnet-sdk	.NET is a free and open-source, managed computer software framework for Windows, Linux, and macOS operating systems.	dotnet-sdk/3.1.100	.NET runtime
DoubletFinder/R-4.1.2	DoubletFinder is an R package that predicts doublets in single-cell RNA sequencing data.	DoubletFinder/R-4.1.2/2.0	NGS, RNA-seq, Single Cell
dRep	dRep is a python program for rapidly comparing large numbers of genomes. dRep can also ‘de-replicate’ a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.	dRep/3.2.2	Metagenomics, Microbial-genomics
DROP	Detection of aberrant gene expression events in RNA sequencing data	DROP/1.1.1	NGS, RNA-Seq, Single Cell
Dsuite	Dsuite: Fast calculation of Patterson’s D (ABBA-BABA) and the f4-ratio statistics across many populations/species	Dsuite/0.5_r44	Population Genetics, Molecular Ecology
EBSeq/R-4.1.2	EBSeq: An R package for gene and isoform differential expression analysis of RNA-seq data	EBSeq/R-4.1.2/1.34.0	NGS, RNA-seq, Single Cell
eigen	Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.	eigen/3.4.0	C++ template, Linear Algebra, Matrices, Vectors
eigensoft	The EIGENSOFT package implements methods from the following 2 papers: Patterson et al. 2006 PLoS Genet 2:e190 [population structure], Price et al. 2006 Nat Genet 38:904-9 [EIGENSTRAT stratification correction]	eigensoft/7.2.1	Population Stratification
ensembl-vep	Ensembl Variant Effect Predictor (VEP) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions	ensembl-vep/103.1 ensembl-vep/104.3 (Default)	NGS, Variant Effect Annotator
entrez-direct	Entrez Direct (EDirect) is an advanced method for accessing the NCBI’s set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window.	entrez-direct/16.2	Sequence Retrieval
EthSEQ/R-4.1.2	EthSEQ: Ethnicity Annotation from Whole Exome Sequencing Data.	EthSEQ/R-4.1.2/2.1.4	NGS, Ethnicity Analysis
evidencemodeler	The EVidenceModeler (aka EVM) software combines ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures.	evidencemodeler/1.1.1	Gene prediction
exonerate	Exonerate is a generic tool for pairwise sequence comparison. It allows you to align sequences using many alignment models, with either exhaustive dynamic programming or a variety of heuristics	exonerate/2.4.0	Sequence alignment
FastANI	FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI). ANI is defined as mean nucleotide identity of orthologous gene pairs shared between two microbial genomes	FastANI/1.32	Microbiology, Genome assembly comparison
fastp	fastp is a tool designed to provide fast all-in-one preprocessing for FastQ files	fastp/0.23.2	NGS, Data Format, fastq
FastQC	FastQC is a program designed to spot potential problems in high throughput sequencing datasets. It runs a set of analyses on one or more raw sequence files in fastq or bam format and produces a report which summarises the results.	FastQC/0.11.9	NGS, fastq, Quality Control
FastQScreen	FastQ-Screen is used for detecting contamination in NGS data and multi-species analysis.	FastQScreen/0.15.2	NGS, fastq, Quality Control
FastSimCoal2	FastSimCoal2 – fast sequential markov coalescent simulation of genomic data under complex evolutionary models.	FastSimCoal2/fsc27-binary	Evolutionary Model, Genome Simulation
FastTree	FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences.	FastTree/2.1.10	Phylogenetics, 16S rRNA
FASTX-Toolkit	The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.	FASTX-Toolkit/0.0.14	NGS, Data Format, fastq
FCclasses 3	FCclasses3 computes vibronic spectra and nonradiative rates based on the harmonic approximation.	fcclasses3/3.0.2	fcclasses3
FEOS	The equation of state package FEOS for high energy density matter	FEOS/20130701	Equation of State; High Energy Density matter
ffmpeg	ffmpeg: Cross-platform solution to record, convert and stream audio and video.	ffmpeg/5.1.0 (Default)	Audio and Video conversion
fftw	FFTW – Software library implementation of the Fast Fourier Transform(FFT) algorithm for computing Discrete Fourier Transform(DFT) compiled with MPICH libraries	fftw/3.3.9-gcc10.2 fftw/3.3.9 (Default)	Fast Fourier Transform
fgbio	fgbio is a set of tools to analyze genomic data with a focus on Next Generation Sequencing	fgbio/1.4.0	NGS
FRASER/R-4.1.2	Detection of rare aberrant splicing events in transcriptome profiles. The workflow aims to assist the diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.	FRASER/R-4.1.2/1.6.0	NGS, RNA-seq, Splicing
freebayes	freebayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment	freebayes/1.3.5	NGS, SNP, Variant
FreeSurfer	FreeSurfer is a software package for the analysis and visualization of structural and functional neuroimaging data from cross-sectional or longitudinal studies	FreeSurfer/7.3.2	Neuroimaging
fsl	FSL is a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data	fsl/6.0.6.2	Neuroimaging
FusionCatcher	FusionCatcher is a finder of Somatic Fusion Genes in RNA-seq data.	FusionCatcher/1.33	NGS, RNA-Seq
gatk	GATK is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery.	gatk/4.1.5.0 gatk/4.2.4.0 (Default)	NGS, Variant, CNV, Genome Mapping
gaussian	Gaussian: A computational chemistry software of electronic structure modeling	gaussian/g09d01 gaussian/g16a03-avx2 gaussian/g16c01-avx2 (Default)	Computational Chemistry, Quantum Chemistry
gcc	GCC – GNU Compiler Collection includes Fortran, C, C++ compilers and libraries for these languages	gcc/9.2 gcc/10.2 gcc/12.3 (Default)	Compiler, C, C++, Fortran
GCTF	Gautomatch – Fully automatic accurate, convenient and extremely fast particle picking for EM	GCTF/0.56	CryoEM
gdal	GDAL is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful command line utilities for data translation and processing.	gdal/3.2.2 gdal/3.3.2 gdal/3.3.3 gdal/3.8.4 (Default)	Geospatial
GEMMA	GEMMA is a software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets	GEMMA/0.98.3	Statistical Genetics, GWAS
GeneMark-ES	GeneMark-ES algorithm identifies protein coding genes in eukaryotic genomes. This is the only eukaryotic gene finder that can perform gene prediction without curated training sets.	GeneMark-ES/4.68	Eukaryotic gene prediction
genomethreader	GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments.	genomethreader/1.7.1	Gene prediction
geos	GEOS is a C/C++ library for spatial computational geometry of the sort generally used by “geographic information systems” software. GEOS is a core dependency of PostGIS, QGIS, GDAL, and Shapely.	geos/3.8.2 geos/3.9.1 geos/3.10.0 (Default) geos/3.12.1	Geometry Engine, Geospatial
GffCompare	GffCompare is a tool to classify, merge, track and annotate GFF files by comparing to a reference annotation GFF	GffCompare/0.11.2	Genome Annotation
GISTIC	GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers.	GISTIC/2.0.23	Oncology, Oncogenomics, Somatic Variant
glimmerhmm	GlimmerHMM is a new gene finder based on a Generalized Hidden Markov Model (GHMM).	glimmerhmm/3.0.4	Eukaryotic gene prediction
glpk	The GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.	glpk/5.0	glpk
gmp	GMP – The GNU Multiple Precision Arithmetic Library	gmp/6.2.1
gnuparallel	GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input	gnuparallel/20211222 (Default)	Utilities, Parallel
gnuplot	Gnuplot is a portable command-line driven graphing utility	gnuplot/5.4.2	Visualization, Plotting
go	The Go Programming Language	go/1.18.5 go/1.19.4 (Default)	Programming Language
googletest	Google test framework for C++. Also called gtest.		Google, Testing Library
gpumd	GPUMD – Graphics Processing Units Molecular Dynamics	gpumd/2.7	GPU, Molecular Dynamics
graphviz	The Graphviz layout programs take descriptions of graphs in a simple text language, and make diagrams in several useful formats such as images and SVG for web pages, Postscript for inclusion in PDF or other documents; or display in an interactive graph browser.	graphviz/2.50.0	Plotting, Visualization
gromacs	GROMACS is a molecular dynamics package mainly designed for simulations of proteins, lipids, and nucleic acids.	gromacs/2021.3 (Default)	Molecular Dynamics, Protein, Lipid, DNA, Nucleic Acid
gsl/gcc	GSL – GNU Scientific Library	gsl/gcc/2.7 (Default) gsl/gcc/2.7.1	Numerical Library, C, C++
gsl/intel	GSL – GNU Scientific Library	gsl/intel/2.7	Numerical Library, C, C++
harfbuzz	An OpenType text shaping engine	harfbuzz/5.3.0	OpenType
harmony/R-4.1.2	harmony: Scalable integration of single cell RNAseq data for batch correction and meta analysis	harmony/R-4.1.2/0.1	NGS, RNA-seq, Single Cell
hdf5/gcc	HDF5 – suite for managing extremely large and complex data collections.	hdf5/gcc/1.10.7-gcc8.3.1 hdf5/gcc/1.12.2-gcc8.3.1 (Default)	Hierarchical Data Format
hdf5/impi	HDF5 – suite for managing extremely large and complex data collections.	hdf5/impi/1.10.7-impi2021 hdf5/impi/1.12.2-impi2022 (Default)	Hierarchical Data Format
HISAT2	HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome	HISAT2/2.2.1	NGS, aligner
HMMcopy/R-4.1.2	HMMcopy: Copy number prediction with correction for GC and mappability bias for HTS data	HMMcopy/R-4.1.2/1.36.0	NGS, Structural Variant
HMMER	HMMER is used for searching sequence databases for sequence homologs, and for making sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).	HMMER/3.3.2	Sequence Analysis, Sequence Clustering
HOMER	HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and ChIP-Seq analysis, primarily written as a de novo motif discovery algorithm that is well suited for finding 8-12 bp motifs in large scale genomics data.	HOMER/4.11	NGS, ChIP-seq
HTSeq	HTSeq is a Python library to facilitate programmatic analysis of data from high-throughput sequencing (HTS) experiments.	HTSeq/1.99.2	NGS, RNA-seq
htslib	HTSlib is an implementation of a unified C library for accessing common file formats, such as SAM, CRAM and VCF, used for high-throughput sequencing data, and is the core library used by samtools and bcftools.	htslib/1.14	NGS, Data Format, VCF
HUMAnN2	HUMAnN 2.0 is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).	HUMAnN2/2.8.1	Metagenomics, Microbial Profiling
HUMAnN3	HUMAnN 3.0 is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).	HUMAnN3/3.0.0	Metagenomics, Microbial Profiling
hyphy	An open-source software package for comparative sequence analysis using stochastic evolutionary models.	hyphy/2.5.42 hyphy/2.5.51 (Default)	Comparative Genomics, Evolution
icu	ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.	icu/73.2	unicode
IGV	The Integrative Genomics Viewer (IGV) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data. It supports flexible integration of all the common types of genomic data and metadata, investigator-generated or publicly available, loaded from local or cloud sources.	IGV/2.11.4 IGV/2.15.4 (Default)	Genome, Visualization
IGV-snapshot-automator	IGV Snapshot Automator is a script to automatically create and run IGV snapshot batch scripts. This script will first write an IGV batch script for the supplied input files, then load all supplied files for visualization (.bam, etc) in a headless IGV session and take snapshots at the locations defined in the regions .bed file.	IGV-snapshot-automator/20.11.1	Genome, Visualization
imagemagick	Software suite to create, edit, compose, or convert bitmap images.	imagemagick/7.1.0.43	Graphics, Images
impi	Intel C/C++/Fortran Compilers with Intel MPI Libraries and profiler tools.	impi/2019u4 impi/2020u4 impi/2021.1 impi/2021.4 impi/2022.1 impi/2022.2 (Default)	Intel, MPI, C, C++, Fortran, Compiler
IMPUTE2	IMPUTE version 2 (also known as IMPUTE2) is a genotype imputation and haplotype phasing program based on ideas from Howie et al. 2009	IMPUTE2/2.3.2	GWAS, Genotype Imputation
InferCNV/R-4.1.2	InferCNV: Inferring copy number alterations from tumor single cell RNA-Seq data	InferCNV/R-4.1.2/1.3.3	NGS, RNA-seq, Single Cell
Infernal	Infernal (INFERence of RNA ALignment) is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs).	Infernal/1.1.4	Homolog Search, RNA Alignment
inStrain	InStrain is a tool for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous non-synonymous identification	inStrain/1.5.5	Metagenomics
intel	Intel C/C++/Fortran Compilers with Intel MKL: Optimized compilers, math libraries with debug and tuning tools.	intel/2019u4 intel/2020u4 intel/2021.1 intel/2021.4 intel/2022.1 intel/2022.2 (Default)	Intel, Compiler, C, C++, Fortran
InterProScan	InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites.	InterProScan/5.52_86.0 InterProScan/5.54_87.0 InterProScan/5.59_91.0 (Default)	Protein functional classifications
IQ-TREE	IQ-TREE is a fast and effective stochastic algorithm to infer phylogenetic trees by maximum likelihood.	IQ-TREE/1.6.12 IQ-TREE/2.1.3 (Default)	Phylogenetics
JAGS	JAGS: Just Another Gibbs Sampler	JAGS/4.3.0	MCMC simulation, Gibbs Sampler
jsonc	A JSON implementation in C.	jsonc/0.13.1 jsonc/0.15 (Default)
json-parse/Perl-5.34.0	json-parse: A PERL module for parsing JSON	json-parse/Perl-5.34.0/0.61	Perl, JSON, Parser
julia	Julia – a high-level programming language for numerical computing.	julia/1.6.1 julia/1.10.4 (Default)	Numerical Computing
kaiju	Kaiju is a program for the taxonomic classification of high-throughput sequencing reads, e.g., Illumina or Roche/454, from whole-genome sequencing of metagenomic DNA	kaiju/1.8.2	Metagenomics, Taxonomy Classification
kallisto	kallisto is a program for quantifying abundances of transcripts from bulk and single-cell RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.	kallisto/0.46.2	NGS, RNA-seq, Single Cell
kneaddata	KneadData is a tool designed to perform quality control on metagenomic and metatranscriptomic sequencing data, especially data from microbiome experiments.	kneaddata/0.10.0	Metagenomics, Quality Control
kofamscan	KofamKOALA assigns K numbers to the user’s sequence data by HMMER/HMMSEARCH against KOfam	kofamscan/1.3.0	Annotation, Pathway
KOMB	KOMB: Taxonomy-oblivious characterization of metagenome dynamics	KOMB/1.0	Metagenomics, Functional Analysis
kraken2	Kraken is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database.	kraken2/2.1.2	Metagenomics, Taxonomy Classification
krona	Krona Tools is a set of scripts to create Krona charts from several Bioinformatics tools as well as from text and XML files.	krona/2.8.1	Metagenomics, Taxonomy Classification
lammps	LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator. It is a classical molecular dynamics simulation code that models an ensemble of particles in a liquid, solid, or gaseous state.	lammps/20210929 lammps/20220803 (Default) lammps/20230802 lammps/20230802u3	Molecular Dynamics, Simulations, Atoms
LASTZ	LASTZ is a program for aligning DNA sequences, a pairwise aligner. Originally designed to handle sequences the size of human chromosomes and from different species.	LASTZ/1.04.15	NGS, DNA Aligner
LDhat	LDhat: Estimate recombination rates from population genetic data	LDhat/2.2a	Population Genetics
LDhelmet	Software package for estimating fine-scale recombination rate.	LDhelmet/1.9	Population Genetics
libtiff	The LibTIFF software provides support for the Tag Image File Format (TIFF), a widely used format for storing image data.	libtiff/3.4.4
libgeotiff	GeoTIFF represents an effort by over 160 different remote sensing, GIS, cartographic, and surveying related companies and organizations to establish a TIFF based interchange format for georeferenced raster imagery.	libgeotiff/1.6.0
libjpeg-turbo	Libjpeg-turbo is a fork of the original IJG libjpeg which uses SIMD to accelerate baseline JPEG compression and decompression.	libjpeg-turbo/2.0.6
libpng	Libpng is the official PNG reference library.	libpng/1.6.37
libtiff	LibTIFF – Tag Image File Format(TIFF) Library and Utilities	libtiff/4.2.0
libxml2	Libxml2 is the XML C parser and toolkit developed for the Gnome project	libxml2/2.9.10
LIGGGHTS	LIGGGHTS® is an Open Source Discrete Element Method Particle Simulation Software	LIGGGHTS/3.8.0	Molecular Dynamics, Simulations, Atoms
MAFFT	MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc	MAFFT/7.490	Sequence Analysis, Sequence Clustering
maker	MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases.	maker/3.01.03	Genome Annotation
manta	Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs	manta/1.6.0	NGS, Structural Variant
MaSuRCA	The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, QuORUM error corrector for Illumina data, POLCA genome polishing software, Chromosome scaffolder, jellyfish mer counter, and MUMmer aligner	MaSuRCA/4.0.9	Genome Assembler
matlab	MATLAB – High-level technical computing language for data analysis and numerical computation.	matlab/r2021a matlab/r2021b matlab/r2022b matlab/r2023b (Default) matlab/r2024b	Numerical Computing
maxquant	MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. License restricted	maxquant/2.2.0	Proteomics, Mass Spectrometry, MS
mcl	MCL, the Markov Cluster algorithm, also known as Markov Clustering, is a method and program for clustering weighted or simple networks, a.k.a. graphs.	mcl/14.137	Graph
MEGA	MEGA: Software package for phylogenetic analysis with a graphical user interface. It allows viewing and editing of the aligned input sequence data and provides many tools for phylogenetic and statistical analysis of the alignments.	MEGA/11.0.10	Phylogenetics
MEGAHIT	MEGAHIT is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.	MEGAHIT/1.2.9	Metagenomics, Genome Assembler
meme	The MEME Suite is a set of motif-based sequence analysis tools	meme/5.4.1 (Default)	Motif Sequence Analysis
MetaBAT	MetaBAT: A robust statistical framework for reconstructing genomes from metagenomic data	MetaBAT/2.15	Metagenomics, Taxonomy Classification
MetaPhlAn	MetaPhlAn ‘Metagenomic Phylogenetic Analysis’ is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.	MetaPhlAn/3.0.13 MetaPhlAn/4.0.3 (Default)	Metagenomics, Microbial Profiling
Metaxa2	Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data	Metaxa2/2.2	Metagenomics, Taxonomy Classification
miniconda/py39	Miniconda: an open source package management system and environment management system	miniconda/py39/4.10.3	Python, Conda, Installer, Package
minimap2	minimap2 is a versatile pairwise aligner for genomic and spliced nucleotide sequences	minimap2/2.23 (Default)	Aligner
mitofinder	MitoFinder – efficient automated large-scale extraction of mitogenomic data from high throughput sequencing data	mitofinder/1.4.1	Bioinformatics, Mitochondria, NGS
mity	mity: A highly sensitive mitochondrial variant analysis pipeline for whole genome sequencing data	mity/0.3.0	Mitochondrial variant
mkl	Intel Math Kernel Library (MKL)	mkl/2020u4 mkl/2021.1 mkl/2021.4 mkl/2022.1 mkl/2022.2 (Default)	Math Routine, BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms
mlst	Scan contig files against traditional PubMLST typing schemes	mlst/2.22.1	Sequence Typing, Bacteria
momap	MOMAP – Molecular Material Property Prediction Package, a suite of programs for predicting the properties of polyatomic molecules.	momap/2021A-mpich2	Molecular Material
mosaic	Mosaic is a set of tools to analyze DNA and protein data obtained from the Mission Bio Tapestri instrument.	mosaic/3.4.0
mOTUs	mOTUs is a tool for microbial abundance, activity and population genomic profiling	mOTUs/2.1.1	Metagenomics, Microbial Profiling
mpc	MPC – The GNU Multiple Precision C Library	mpc/1.2.1
mpfr	MPFR – The GNU Multiple Precision Floating-Point Library	mpfr/4.1.0
mpich/gcc	Message Passing MPICH libraries with GNU Compiler for parallel and distributed computing.	mpich/gcc/3.4.2-gcc8.3.1 mpich/gcc/4.1.2-gcc12.3 (Default)	MPI, Parallel, Distributed
mpich/intel	Message Passing MPICH libraries with Intel Compiler for parallel and distributed computing.	mpich/intel/3.4.2-intel2021 mpich/intel/4.1.2 (Default)	MPI, Parallel, Distributed
MSGFgui/R-4.1.2	MSGFplus: This package makes it possible to perform analyses using the MSGFplus package in a GUI environment.	MSGFgui/R-4.1.2/1.28.0	Mass Spectrometry
MultiQC	MultiQC is a tool to create a single report with interactive plots for multiple bioinformatics analyses across many samples.	MultiQC/1.11	NGS, Quality Control
MUMMER	MUMmer is a versatile alignment tool for DNA and protein sequences	MUMMER/3.23	Aligner
muscle	MUSCLE: multiple sequence alignment with high accuracy and high throughput.	muscle/5.1	Multiple Sequence Alignment
nasm	NASM (Netwide Assembler) is an 80x86 assembler designed for portability and modularity. It includes a disassembler as well.	nasm/2.15.05	x86 Assembly
NCL	NCL (NCAR Command Language).	NCL/6.6.2
NeEstimator	NeEstimator V2.1 estimates contemporary effective population size (Ne) using multi-locus diploid genotypes from population samples.	NeEstimator/2.1	Population Genetics, Molecular Ecology
NetLogo	NetLogo is a multi-agent programmable modeling environment.	NetLogo/6.2.2	Modelling
nextflow	A DSL for data-driven computational pipelines	nextflow/22.10.0	Pipeline, Workflow
NextGenMap	https://github.com/philres/NextGenMap.	NextGenMap/0.5.5	Sequence Mapping
ngsLD	ngsLD is a program to estimate pairwise linkage disequilibrium (LD) taking the uncertainty of genotype’s assignation into account.	ngsLD/1.2.0
ngsRelate	ngsRelate: Program for inferring relatedness and other summary statistics	ngsRelate/2022-09-26	Next generation sequencing
ngsTools	ngsTools: Programs to analyse NGS data for population genetics purposes	ngsTools/2020-07-23	Population Genetics, Molecular Ecology
NIRVANA	Nirvana provides clinical-grade annotation of genomic variants (SNVs, MNVs, insertions, deletions, indels, and SVs, including CNVs). It can be run as a stand-alone package or integrated into larger software tools that require variant annotation.	NIRVANA/3.17.0	NGS, Annotation
nvhpc	NVIDIA HPC SDK includes compilers, libraries and software tools that support GPU-accelerated HPC applications	nvhpc/20.11 nvhpc/21.3 nvhpc/22.3 nvhpc/22.7 nvhpc/23.7 nvhpc/24.9 (Default)	NVIDIA, GPU, CUDA, Compiler
n2p2	n2p2 – A neural network potential package.	n2p2/2.2.0	Neural Network
openfoam	OpenFOAM (for Open-source Field Operation And Manipulation) is a C++ toolbox for the development of customized numerical solvers, and pre-/post-processing utilities for the solution of continuum mechanics problems, most prominently including computational fluid dynamics (CFD).	openfoam/2206	Fluid Dynamics
openjdk	OpenJDK (Open Java Development Kit) is a free and open-source implementation of the Java Platform, Standard Edition (Java SE).	openjdk/11.0.9.1 (Default) openjdk/21.0.1	java, jdk, openjdk, jar
openmpi/aocc	An open source Message Passing Interface implementation.	openmpi/aocc/4.1.0-aocc3.1 openmpi/aocc/4.1.6-aocc4.1.0	MPI
openmpi/gcc	An open source Message Passing Interface implementation.	openmpi/gcc/4.1.0-gcc8.3.1 openmpi/gcc/4.1.0-gcc10.2 openmpi/gcc/4.1.4-gcc10.2 (Default) openmpi/gcc/4.1.6-gcc9.2 openmpi/gcc/4.1.6-gcc12.3	MPI
openmpi/intel	An open source Message Passing Interface implementation.	openmpi/intel/4.1.0-intel2020u4 openmpi/intel/4.1.6-intel2023	MPI
OptiType	OptiType is a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate 4-digit HLA genotyping predictions from NGS data by simultaneously selecting all major and minor HLA Class I alleles	OptiType/1.3.5	NGS, HLA-Typing
orca	ORCA – general purpose tool for quantum chemistry with specific emphasis on spectroscopic properties of open-shell molecules.	orca/5.0.0 orca/5.0.2 orca/5.0.3 (Default)	Quantum Chemistry
orthofinder	Phylogenetic orthology inference for comparative genomics	orthofinder/2.5.4	Orthology
pandoc	Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.	pandoc/2.16.2	Doc Convert
ParallelFold	ParallelFold: Modified version of Alphafold to divide CPU part (MSA and template searching) and GPU part. This can accelerate Alphafold when predicting multiple structures	ParallelFold/2.1.2	Structural Bioinformatics, Protein Structure Prediction, AI
paraview	ParaView is an open-source, multi-platform data analysis and visualization application based on Visualization Toolkit (VTK).	paraview/5.9.0-binary paraview/5.10.0 (Default)	Visualization
pcre	PCRE – Perl-Compatible Regular Expressions	pcre/8.45
pcre2	PCRE2 – Perl-Compatible Regular Expressions	pcre2/10.40
perl	Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages.	perl/5.34.0	Programming Language, Script
perl-lib	Perl-lib allows for user-installed Perl modules in home folder	perl-lib/5.34.0	Programming Language, Script
PGDSpider	PGDSpider is a powerful automated data conversion tool for population genetic and genomics programs. It facilitates the data exchange possibilities between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances)	PGDSpider/2.1.1.5	Population Genetics, Molecular Ecology
phylip	PHYLIP is a free package of programs for inferring phylogenies.	phylip/3.697	Phylogenetics
phyloseq/R-4.2.1	phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.	phyloseq/R-4.2.1/1.42.0	Phylogenetics, Microbiome
PhyML	PhyML is a software package that uses modern statistical approaches to analyse alignments of nucleotide or amino acid sequences in a phylogenetic framework.	PhyML/3.3.20220408	Phylogenetics
picard	Picard is a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats.	picard/2.26.6 (Default)	NGS, data formats
pigz	Parallel implementation of gzip	pigz/2.6	File compression
PLINK	PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.	PLINK/1.90b6.24 PLINK/2.00a2.3 PLINK/2.00a3 (Default)	GWAS
pmi		pmi/pmix-x86_64
pnetcdf	PnetCDF is a high-performance parallel I/O library for accessing Unidata’s NetCDF, files in classic formats, specifically the formats of CDF-1, 2, and 5.	pnetcdf/impi/2020u4-intel2020u4 pnetcdf/openmpi/4.1.6-gcc12.3	netcdf
PosiGene	PosiGene is a tool that (i) detects positively selected genes on genome-scale, (ii) allows analysis of specific evolutionary branches, (iii) can be used in arbitrary species contexts and (iv) offers visualization of the candidates.	PosiGene/0.1	Bioinformatics, Genome, Bacterial, Assembly, Short-read
postgis	PostGIS extends the capabilities of the PostgreSQL relational database by adding support for storing, indexing, and querying geospatial data.	postgis/3.4.2
postgresql	PostgreSQL is a powerful, open source object-relational database system.	postgresql/13.2	Relational Database
ppanggolin	Depicting microbial species diversity via a Partitioned PanGenome Graph	ppanggolin/1.2.74	Microbiome, Bacteria
prank	PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences.	prank/170427	Multiple Sequence Alignment
proj	PROJ is generic coordinate transformation software that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.	proj/7.2.1 proj/8.0.0 proj/8.1.1 proj/9.4.0 (Default)	Coordinate Transformation
prokka	prokka: Rapid annotation of prokaryotic genomes	prokka/1.14.6	Prokaryote, Annotation
prune_graph	Fast pruning of arbitrary graphs.	prune_graph/0.3.2
psi4	Open-Source Quantum Chemistry – an electronic structure package in C++ driven by Python	psi4/1.7+6ce35a5	Quantum Chemistry
pypopgen3	Tools for population genomic analysis for Python 3	pypopgen3/2021-11-23	Population Genetics
pyrho	pyrho: Fast inference of fine-scale recombination rates based on fused-LASSO	pyrho/0.1	Population Genetics
python	Python – A widely used high-level programming language.	python/3.9.2 python/3.9.7 (Default) python/3.12.1	Programming Language, Data Science
QIIME2	QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.	QIIME2/2021.11	Microbiome, Microbiology
QualiMap	Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts	QualiMap/2.2.1	NGS, Quality Control
R	R – computing language for statistical computation and graphics.	R/4.1.2-G R/4.1.2-gcc R/4.1.2-one R/4.1.2 R/4.2.1 R/4.0.4 (Default) R/4.3.2 R/4.4.3	Statistical Computing
RAxML-NG	RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively performing a series of Subtree Pruning and Regrafting (SPR) moves.	RAxML-NG/1.1.0	Phylogenetics
RELION-cpu	RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).	RELION-cpu/4.0b2 (Default) RELION-cpu/5.0	Cryo-EM
RELION-gpu	RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).	RELION-gpu/4.0b2 (Default) RELION-gpu/5.0	Cryo-EM
repeatmasker	RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.	repeatmasker/4.1.2.p1	DNA repeat
repeatmodeler	RepeatModeler is a de-novo repeat family identification and modeling package.	repeatmodeler/2.0.2a	DNA repeat
rmats	rMATS turbo is the C/Cython version of rMATS, a computational tool to detect differential alternative splicing events from RNA-Seq data.	rmats/4.1.1	NGS, RNA-Seq
roary	Rapid large-scale prokaryote pan genome analysis	roary/3.13.0	Prokaryote
ROOT	ROOT enables statistically sound scientific analyses and visualization of large amounts of data	ROOT/6.24.6	Data Science
RSEM	RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. The RSEM package provides a user-friendly interface and supports threads for parallel computation of the EM algorithm, single-end and paired-end read data, quality scores, variable-length reads and RSPD estimation.	RSEM/1.3.3	NGS, RNA-Seq
RSeQC	RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data.	RSeQC/4.0.0	NGS, RNA-Seq, Quality Control
RStudio	The RStudio IDE is a set of integrated tools designed to help you be more productive with R and Python. It includes a console, syntax-highlighting editor that supports direct code execution, and a variety of robust tools for plotting, viewing history, debugging and managing your workspace.	RStudio/1.4.1717 RStudio/2022.02.3 RStudio/2022.07.1 RStudio/2022.12.0 RStudio/2023.03.0 (Default) RStudio/2023.09.1	Data Science, Statistics, R, Python, IDE
Ruby	Ruby is an interpreted, high-level, general-purpose programming language	Ruby/2.7.2 rustup/1.72.0-stable rustup/1.76.0-stable	Programming Language
rvtests	Rvtests, which stands for Rare Variant tests, is a flexible software package for genetic association analysis for sequence datasets. Since its inception, rvtests was developed as a comprehensive tool to support genetic association analysis and meta-analysis		NGS, Variant Caller, Rare Variant
salmon	Salmon is a wicked-fast program to produce highly accurate, transcript-level quantification estimates from RNA-seq data.	salmon/1.6.0	NGS, RNA-Seq
sambamba	Sambamba is a high performance highly parallel robust and fast tool (and library), written in the D programming language, for working with SAM and BAM files.	sambamba/0.8.1	NGS, File Format
samtools	samtools is a suite of programs for interacting with high-throughput sequencing data.	samtools/1.14 (Default) samtools/1.19	NGS, Data Format, SAM
scanpy	Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing.	scanpy/1.7.2	NGS, RNA-Seq, Single Cell
scikit-bio	scikit-bio is an open-source, BSD-licensed Python 3 package providing data structures, algorithms and educational resources for bioinformatics.	scikit-bio/0.5.6	Bioinformatics, Data Science
scotch	Static Mapping, Graph, Mesh and Hypergraph Partitioning, and Parallel and Sequential Sparse Matrix Ordering Package	scotch/6.0.9	Mesh, Mapping, Graph
scran/R-4.1.2	R packages with methods for Single-Cell RNA-Seq Data Analysis.	scran/R-4.1.2/1.23.1	NGS, RNA-seq, Single Cell
Seurat/R-4.1.2	Seurat is an R toolkit for single cell genomics	Seurat/R-4.1.2/4.0.5 Seurat/R-4.1.2/4.3.0 (Default)	NGS, RNA-seq, Single Cell
Seurat/R-4.2.1	Seurat is an R toolkit for single cell genomics	Seurat/R-4.2.1/4.3.0	NGS, RNA-seq, Single Cell
shapeit4	Segmented HAPlotype Estimation and Imputation Tools version 4	shapeit4/4.2.2	Population Genetics
Signac/R-4.1.2	Signac is a comprehensive R package for the analysis of single-cell chromatin data. Signac includes functions for quality control, normalization, dimension reduction, clustering, differential activity, and more.	Signac/R-4.1.2/1.4.0	NGS, RNA-seq, Single Cell Chromatin
simplejson	simplejson is a simple, fast, complete, correct and extensible JSON <http://json.org> encoder and decoder for Python	simplejson/3.17.6	JSON Parser
singularity	Singularity – Enable using containers in HPC environments.	singularity/3.8.0	Docker-alternative, Container, sif, simg
smcpp	SMC++ is a program for estimating the size history of populations from whole genome sequence data.	smcpp/1.15.2	NGS, WGS, Population Genetics
snakemake	The Snakemake workflow management system is a tool to create reproducible and scalable data analyses.	snakemake/6.12.1	Workflow
SNAP	SNAP is a general purpose gene finding program suitable for both eukaryotic and prokaryotic genomes. SNAP is an acronym for Semi-HMM-based Nucleic Acid Parser.	SNAP/2013_11_29	Eukaryotic and Prokaryotic gene prediction
snpEff	SnpEff is a Genetic variant annotation and functional effect prediction toolbox. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).	snpEff/5.0	Variant Functional Analysis
SNPhylo	SNPhylo is a pipeline to generate a phylogenetic tree from huge SNP data.	SNPhylo/20180901	Phylogenetics
SortMeRNA	SortMeRNA is a local sequence alignment tool for filtering, mapping and clustering.	SortMeRNA/4.3.4	NGS, Metagenomics, RNA-seq, Data Cleaning
sourmash	Sourmash is a tool that quickly searches, compares, and analyzes genomic and metagenomic data sets.	sourmash/4.2.2	Metagenomics
SPAdes	SPAdes – St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines.	SPAdes/3.15.3 SPAdes/3.15.4 (Default)	Genome Assembler
sqlite	SQLite3 is an SQL database engine in a C library. Programs that link the SQLite3 library can have SQL database access without running a separate RDBMS process.	sqlite/3.44.2 (Default) sqlite/3.35.2	Relational Database
squashfuse	FUSE filesystem to mount squashfs archives	squashfuse/0.1.104	filesystem
sra-tools	The SRA Toolkit provides a number of tools for download of data in Sequence Read Archive (SRA)	sra-tools/2.11.0	NCBI, Sequence Read Archive
srst2	Short Read Sequence Typing for Bacterial Pathogens	srst2/0.2.0	Sequence Typing, Bacteria
STAAR/R-4.1.2-gcc	An R package for performing STAAR procedure in whole-genome sequencing studies	STAAR/R-4.1.2-gcc/0.9.6.2	NGS, WGS
stacks	Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform.	stacks/2.66
stairway-plot	The stairway plot is a method for inferring detailed population demographic history using the site frequency spectrum (SFS) from DNA sequence data	stairway-plot/2.1.1	Population Genetics, Molecular Ecology
STAR	STAR: ultrafast universal RNA-seq aligner	STAR/2.7.9a	NGS, RNA-seq, Aligner
STAR-Fusion	STAR-Fusion is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT). STAR-Fusion uses the STAR aligner to identify candidate fusion transcripts supported by Illumina reads. STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set.	STAR-Fusion/1.10.0	NGS, RNA-seq, Fusion Detection
stata	STATA is a general-purpose statistical software package for data analysis, data management and graphics.	stata/16.1 stata/17.0 stata/18.0 stata/18.5 stata/19.5 (Default)	Statistical Computing
strelka	Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs.	strelka/2.9.10	NGS, Variant Caller
StringTie	Stringtie employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome.	StringTie/2.1.7	NGS, RNA-Seq, Expression Analysis
SvABA	SvABA is a method for detecting structural variants in sequencing data using genome-wide local assembly	SvABA/1.1.0	NGS, Structural Variant
tcl	The TCL programming language.	tcl/8.6.12
texlive	texlive: An easy way to get up and running with the TeX document production system.	texlive/20220503	document
tk	A dynamic programming language with GUI support. Bundles Tcl and Tk.	tk/8.6.12	GUI, Tk, Tcl
TrimGalore	Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data.	TrimGalore/0.6.7	NGS, Adaptor Trimming
trimmomatic	trimmomatic: A flexible read trimming tool for Illumina NGS data	trimmomatic/0.39 (Default)	NGS, Sequence Trimmer
trinity	Trinity assembles transcript sequences from Illumina RNA-Seq data.	trinity/2.13.2 trinity/2.14.0 (Default)	Illumina, RNA-Seq, Assembler
Trycycler	Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes	Trycycler/0.5.3	Bioinformatics, Genome, Bacterial, Assembly, Short-read
ucsc-kent	UCSC Genome Browser source tree	ucsc-kent/2021-11-18	UCSC, Genome Browser
Unicycler	Unicycler is an assembly pipeline for bacterial genomes	Unicycler/0.4.9 Unicycler/0.5.0 (Default) Unicycler/0.5.0p	Bioinformatics, Genome, Bacterial, Assembly, Short-read
USEARCH	USEARCH is a tool designed to enable high-throughput, sensitive search of very large sequence databases	USEARCH/11.0.667	Sequence Alignment, Sequence Clustering
VarDictJava	VarDictJava is a variant discovery program written in Java and Perl. It is a Java port of VarDict variant caller.	VarDictJava/1.8.3	Variant Caller
VarScan	VarScan is a platform-independent mutation caller for targeted, exome, and whole-genome resequencing data generated on Illumina, SOLiD, Life/PGM, Roche/454, and similar instruments.	VarScan/2.4.4	NGS, Mutation Caller
vasp5	VASP: Package for ab initio quantum-mechanical molecular dynamics simulation.	vasp5/5.4.4	Molecular Dynamics
vasp6	VASP: Package for ab initio quantum-mechanical molecular dynamics simulation.	vasp6/6.2.0 vasp6/6.2.1-impi2021 vasp6/6.2.1 vasp6/6.3.0 vasp6/6.3.1 vasp6/6.3.2 (Default)	Molecular Dynamics
vcftools	VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.	vcftools/0.1.17	Bioinformatics, Genome, Sequence, VCF
velvet	Velvet is a short read de novo assembler using de Bruijn graphs	velvet/1.2.10	Genome Assembler
VerifyBamID	VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.	VerifyBamID/2.0.1	NGS, Contamination Detection
VMD	Visual Molecular Dynamics (VMD) is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting	VMD/1.9.3	Molecular Dynamics, Visualization
VSCode	Visual Studio Code is a lightweight but powerful source code editor.	VSCode/1.68 VSCode/1.74 (Default)	Integrated Development Environment
VSEARCH	VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting	VSEARCH/2.18 (Default)	Metagenomics
vtk	The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, modeling, image processing, volume rendering, scientific visualization, and information visualization.	vtk/7.1.1 vtk/9.0.3	Visualization
XFuse	XFuse: Super-resolved spatial transcriptomics by deep data fusion	XFuse/0.2.1	Spatial transcriptomics
xtb	XTB is a Semiempirical Extended Tight-Binding Program Package	xtb/6.5.0
zlib	A free, general-purpose, legally unencumbered lossless data-compression library.	zlib/1.2.11