What is Anaconda?

Anaconda is an open-source package manager, environment manager and distribution of Python and R packages for data science and machine learning applications. The Anaconda distribution comes with more than 1,000 data packages and 250 popular data science packages.

Using Anaconda on the HPC system

To use Anaconda, load the Anaconda module with the command for the Python version you need:

System     Python version   Command
HPC2021    3.8              module load anaconda/py3.8
HPC2021    3.11             module load anaconda/py3.11
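
The module selection above can be wrapped in a small script. This is only a sketch: the helper name anaconda_module is hypothetical, and the load step runs only where a `module` command is available, as it would be on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: map a Python version to the module name listed in the table above.
anaconda_module() {
    case "$1" in
        3.8)  echo "anaconda/py3.8" ;;
        3.11) echo "anaconda/py3.11" ;;
        *)    echo "no Anaconda module listed for Python $1" >&2; return 1 ;;
    esac
}

# Load the module only where the `module` command exists (e.g. on the cluster).
if command -v module >/dev/null 2>&1; then
    module load "$(anaconda_module 3.11)"
    python --version    # confirm which interpreter is now first on PATH
fi
```

Calling anaconda_module with a version that is not in the table prints a message to stderr and returns a non-zero status, so a job script can stop early instead of loading the wrong module.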

Our HPC system maintains a set of Anaconda Python environments to avoid package conflicts between applications that share the same Python distribution. You can use one of the provided environments, or create your own environment and install the specific packages your project needs. The following packages are currently supported with Anaconda:

 

Package      Description
Keras        Wrapper for neural network libraries such as TensorFlow and Theano
PyTorch      Optimized tensor library for deep learning
TensorFlow   Neural network library for machine learning and deep learning research
Theano       Numerical computation library designed for machine learning

Manage environments with Anaconda

  1. Check out available environments
    conda env list or conda info --envs

    A list similar to the following is displayed:
    # conda environments:
    #
    base                     /share1/anaconda3
    deepchem-gpu             /share1/anaconda3/envs/deepchem-gpu
    tensorflow-gpu           /share1/anaconda3/envs/tensorflow-gpu
    
  2. View a list of packages in an environment
    • If the environment is not activated: conda list -n tensorflow-gpu
    • If the environment is activated: conda list
  3. Create Conda environment
    • Create an environment: conda create -n myenv
    • Create an environment with a specific Python version: conda create -n myenv python=3.8
    • Create an environment to target directory: conda create -p /path/to/dir/myenv
  4. Activate an environment
    source activate myenv (or conda activate myenv if your shell has been initialized for conda)
  5. Deactivate an environment
    conda deactivate
  6. Remove an environment
    conda remove -n myenv --all or conda env remove -n myenv
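
The environment commands above can be combined so that an environment is created only when it does not exist yet. The following is a minimal sketch, not a required workflow: env_in_list is a hypothetical helper that scans `conda env list` output for a name, and the conda calls run only where conda is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeed if environment $1 is named in `conda env list` output on stdin.
# Comment lines (starting with #) are skipped; the first column is the environment name.
env_in_list() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

ENV_NAME="myenv"

if command -v conda >/dev/null 2>&1; then
    # Create the environment only if it is not already listed.
    if ! conda env list | env_in_list "$ENV_NAME"; then
        conda create -y -n "$ENV_NAME" python=3.8
    fi
    conda list -n "$ENV_NAME"    # inspect its packages without activating it
fi
```

The -y flag answers the confirmation prompt automatically, which is useful in batch jobs where no terminal is attached.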

Manage packages

  1. Install packages into an existing environment ‘myenv’
    • If the environment is not activated: conda install --name myenv PACKAGENAME
    • If the environment is activated: conda install PACKAGENAME
    • Install multiple packages at once: conda install pkg1 pkg2 pkg3
    • Install package with specific version: conda install scipy=1.1.0
  2. Install R packages: conda install -c r r-PACKAGENAME
    For example, to install the packages r-rcpp and r-rstan: conda install -c r r-rcpp r-rstan
  3. Install packages from channels (e.g. Bioconda) into your environment
    conda create -n bioconda
    conda activate bioconda
    conda config --add channels bioconda
    conda install -c r r
    conda install bwa bowtie fastqc bioconductor-rsamtools
    conda deactivate
    
  4. Update installed packages: conda update PACKAGENAME
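
After installing a pinned version, it can help to confirm which version actually ended up in the environment. The sketch below assumes the usual `conda list` layout (comment lines starting with #, then name and version columns); pkg_version is a hypothetical helper, and the conda calls run only where conda is on the PATH and assume that ‘myenv’ already exists.

```shell
#!/bin/sh
# Hypothetical helper: print the version of package $1 from `conda list` output on stdin.
pkg_version() {
    awk -v pkg="$1" '!/^#/ && $1 == pkg { print $2 }'
}

if command -v conda >/dev/null 2>&1; then
    # Install a specific version into 'myenv' without activating it.
    conda install -y --name myenv scipy=1.1.0
    # Report the version that is now installed.
    conda list -n myenv | pkg_version scipy
fi
```

If the package is not installed, pkg_version prints nothing, so an empty result can be used as a simple "missing package" check in a script.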

Share a conda environment with members of your PI group

  • By default, conda packages are installed in each user’s own home folder and are not accessible to others
  • To share software among group members, install it under /lustre1/g/${PI_GROUP}

Example

# Get your PI Group value
PI_GROUP=$(groups | tr " " "\n" | grep _)
# Install to /lustre1/g/${PI_GROUP}/software/python/3.9.7
ml miniconda/py39/4.10.3
conda create -p /lustre1/g/${PI_GROUP}/software/python/3.9.7 -c conda-forge python=3.9.7
# Group members may run the following to use the conda environment without installing conda by themselves
ml miniconda/py39/4.10.3
conda activate /lustre1/g/${PI_GROUP}/software/python/3.9.7
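
If an account belongs to more than one group whose name contains an underscore, the PI_GROUP lookup in the example above returns several lines. The sketch below shows one possible way to guard against that by taking the first match; first_pi_group and shared_prefix are hypothetical helpers, and whether the first match is the right group for your project is an assumption you should check.

```shell
#!/bin/sh
# Hypothetical helper: print the first group name containing "_" from a space-separated list.
first_pi_group() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep _ | head -n 1
}

# Hypothetical helper: build the shared environment prefix used in the example above.
shared_prefix() {
    printf '/lustre1/g/%s/software/python/%s\n' "$1" "$2"
}

PI_GROUP=$(first_pi_group "$(groups)")
if [ -n "$PI_GROUP" ]; then
    echo "Shared environment prefix: $(shared_prefix "$PI_GROUP" 3.9.7)"
else
    echo "No PI group found for $(id -un)" >&2
fi
```

The printed prefix can then be passed to conda create -p or conda activate as in the example.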

For full usage of conda, see the Conda cheat sheet (PDF) or the Conda user guide.

Additional Information

Anaconda Home: https://anaconda.org

Anaconda Distribution packages: https://docs.anaconda.com/anaconda/packages/pkg-docs

R language packages for Anaconda: https://docs.anaconda.com/anaconda/packages/r-language-pkg-docs

Bioinformatics packages in Bioconda: https://anaconda.org/bioconda or https://bioconda.github.io/