What is Anaconda?

Anaconda is an open-source distribution of Python and R packages for data science and machine learning applications. It includes conda, which serves as both the package manager and the environment manager. The Anaconda distribution comes with more than 1000 data packages and 250 popular data science packages.

Using Anaconda on the HPC system

To use Anaconda, first load the Anaconda module with the command that matches your system and Python version:

System     Python version   Command
HPC2015    2.7              module load anaconda/py2
HPC2015    3                module load anaconda/py3
HPC2021    3.8              module load anaconda/py3.8
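
As a minimal sketch of this step, the following shell snippet looks up the module name from the table and loads it. The helper name anaconda_module is hypothetical (it is not provided by the HPC system), and the snippet assumes the module command exists only on the cluster, so elsewhere it just prints what it would run.

```shell
# anaconda_module is a hypothetical helper, not part of the HPC system.
# It maps a system name and Python version (from the table above) to a module name.
anaconda_module() {
    case "$1:$2" in
        HPC2015:2.7) echo "anaconda/py2" ;;
        HPC2015:3)   echo "anaconda/py3" ;;
        HPC2021:3.8) echo "anaconda/py3.8" ;;
        *) echo "unknown system/version: $1 $2" >&2; return 1 ;;
    esac
}

mod="$(anaconda_module HPC2021 3.8)"

if command -v module >/dev/null 2>&1; then
    # On the cluster: load the module and confirm which Python is now active.
    module load "$mod"
    python --version
else
    # Off the cluster: only show the command that would be run.
    echo "module command not found; would run: module load $mod"
fi
```

Change the two arguments to anaconda_module to select a different row of the table.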

Our HPC system maintains a list of Anaconda Python environments to avoid package conflicts between different applications that share the same Python distribution. You can use one of the provided environments, or create your own environment and install the specific packages your project needs. Here are some of the packages currently supported with Anaconda:

Package      Description
Caffe2       A lightweight and scalable deep learning framework
Deepchem     Deep learning models for drug discovery, materials science and quantum chemistry
Keras        Wrapper library for neural networks built on Tensorflow and Theano
PyTorch      An optimized tensor library for deep learning
Tensorflow   Neural network library for machine learning and deep learning research
Theano       Numerical computation library designed for machine learning

Manage environments with Anaconda

  1. List available environments
    conda env list or conda info --envs
    A list similar to the following is displayed:
    # conda environments:
    #
    base                     /share1/anaconda3
    deepchem-gpu             /share1/anaconda3/envs/deepchem-gpu
    tensorflow-gpu           /share1/anaconda3/envs/tensorflow-gpu
    
  2. View a list of packages in an environment
    If the environment is not activated: conda list -n tensorflow-gpu
    If the environment is activated: conda list
  3. Create Python environment
    Create an environment: conda create -n myenv
    Create an environment with a specific Python version: conda create -n myenv python=3.8
  4. Activate an environment
    source activate myenv (with conda 4.4 or later, conda activate myenv can be used instead)
  5. Deactivate an environment
    conda deactivate
  6. Remove an environment
    conda remove -n myenv --all or conda env remove -n myenv
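
The steps above can be combined in a script. The sketch below creates myenv only if it is not already listed and then prints its packages. The helper env_in_list is hypothetical, it assumes the listing format shown in step 1, and the -y option is used so that conda does not stop to ask for confirmation.

```shell
# env_in_list is a hypothetical helper, not a conda command.
# It reads `conda env list` output on stdin and succeeds if the
# environment named $1 appears in the first column.
env_in_list() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list myenv; then
        echo "myenv already exists"
    else
        # Create the environment with a specific Python version (step 3).
        conda create -y -n myenv python=3.8
    fi
    # List its packages without activating it (step 2).
    conda list -n myenv
else
    echo "conda not found; load an Anaconda module first"
fi
```

Checking the listing first avoids the prompt conda shows when asked to create an environment that already exists.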

Manage packages

  1. Install packages into an existing environment 'myenv'
    If the environment is not activated: conda install --name myenv PACKAGENAME
    If the environment is activated: conda install PACKAGENAME
    Install multiple packages at once: conda install pkg1 pkg2 pkg3
    Install package with specific version: conda install scipy=1.1.0
  2. Install R packages: conda install -c r r-PACKAGENAME
    For example, to install the packages r-rcpp and r-rstan: conda install -c r r-rcpp r-rstan
  3. Install packages from channels (e.g. Bioconda) into your environment
    conda create -n bioconda
    conda activate bioconda
    conda config --add channels bioconda
    conda install -c r r
    conda install bwa bowtie fastqc bioconductor-rsamtools
    conda deactivate
  4. Update installed packages: conda update PACKAGENAME
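
As a rough illustration of item 1, the sketch below wraps the install command for an environment that is not activated. The helper install_into and the DRY_RUN variable are hypothetical names introduced here; by default the helper only prints the command, so it can be tried safely before anything is installed.

```shell
# install_into is a hypothetical helper, not a conda command.
# Usage: install_into ENV_NAME PACKAGE_SPEC...
# With DRY_RUN=1 (the default here) it only prints the command it would run.
install_into() {
    env_name="$1"
    shift
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "conda install -y --name $env_name $*"
    else
        conda install -y --name "$env_name" "$@"
    fi
}

# Pin scipy to the version used in the example above and add two more packages.
install_into myenv scipy=1.1.0 pkg1 pkg2
```

Set DRY_RUN=0 before calling install_into to perform the installation; pkg1 and pkg2 are placeholders for real package names.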

For full usage of conda, see the Conda cheat sheet (PDF) or the Conda user guide.

Additional Information

Anaconda Home: https://anaconda.org

Anaconda Distribution packages: https://docs.anaconda.com/anaconda/packages/pkg-docs

R language packages for Anaconda: https://docs.anaconda.com/anaconda/packages/r-language-pkg-docs

Bioinformatics packages in Bioconda: https://anaconda.org/bioconda or https://bioconda.github.io/