AI-Research is an advanced computing platform equipped with state-of-the-art GPU accelerators to support sophisticated research across a wide range of disciplines.
This guide consists of the following major parts:
- Hardware specification
- System Login
- Accessing GPU
- SLURM scheduler which controls access to GPUs
- How to use containers to run software with enroot or Singularity
- Other local commands
(Note: The default per-user disk quota is 50GB and the default group quota is 5TB. However, the group quota is currently capped at 1TB due to technical difficulties in connecting to the Lustre storage.)
Hardware Specification
The system is an NVIDIA DGX A100 machine which consists of:
- Dual AMD EPYC 7742 CPUs, 2.25GHz (base), 3.4GHz (max boost)
(128 cores total; 256 threads due to the Simultaneous Multithreading (SMT) feature)
- 1TB DDR4 RAM
- Eight NVIDIA A100 SXM4 GPUs with 40GB HBM2 memory each
- NVSwitch
- 14TB NVMe SSD local storage
- 200Gb/s HDR InfiniBand
- Ubuntu 20.04 LTS
Further hardware details are available in its official datasheet.
System Login
The AI-Research system, ai-research.hku.hk, can be accessed via Secure Shell (SSH) from any device on the HKU campus network (either physically connected to the campus network, or using the SSID “HKU” while on campus). If you need off-campus access, please use the HKUVPN2FA service.
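For example, assuming your username is jdoe (replace it with your own HKU account), you can log in with:
$ ssh jdoe@ai-research.hku.hk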
The following SSH features are enabled:
- SFTP (for file transfer; see the example after this list)
- X Tunneling (for graphics software — not fully supported)
- Dynamic Port Tunneling
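For file transfer you can use any SFTP/SCP client; a minimal command-line example from your local machine (the username jdoe and the file name are placeholders):
$ scp results.tar.gz jdoe@ai-research.hku.hk:~/
$ sftp jdoe@ai-research.hku.hk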
Accessing GPU
Here we list some common pitfalls that new users may come across:
`nvidia-smi` shows nothing?
$ nvidia-smi
Failed to initialize NVML: Unknown Error
Users who log on to the node DO NOT have GPU access immediately (you will need to use the SLURM scheduler to request GPUs). Once you are inside an interactive session, or your submitted job script is running, you will have access to the allocated GPUs.
Do not assume no one is using the GPUs.
There may be other users using the GPUs (either interactively, or through their submitted jobs). Check the immediate availability of GPUs with the gpu_avail command. If you have requested an interactive job and there are not enough free GPUs to fulfill your request, you will be put into the waiting queue (and the session will thus appear to hang). To achieve efficient and fair use of resources, please refrain from requesting more GPUs than your job is able to use.
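A quick way to check both GPU availability and the state of your own request (squeue is a standard SLURM command; gpu_avail is the local command described under Other Local Commands):
$ gpu_avail
$ squeue -u $USER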
Why can't I use Docker?
Our scheduler (SLURM) is currently not compatible with Docker. Since GPUs are allocated by SLURM to ensure a fair share of resources, Docker access is not available to general users. Please use enroot or Singularity to pull images from Docker repositories instead.
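For instance, a public image from Docker Hub can be pulled with either tool; the Ubuntu image below is only an illustration (see the Container section for details):
$ enroot import docker://ubuntu
$ singularity build ubuntu.simg docker://ubuntu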
SLURM scheduler
Access to the GPU cards is not granted upon system login; resources are controlled and scheduled via the SLURM scheduler. To gain access to GPUs, you have to submit a SLURM job. Jobs can be interactive (you type commands during the session as if you had logged in directly) or batch (you prepare a job script containing all the commands to be executed for your tasks and submit it to run; you cannot interact with it afterwards), depending on what you want to do.
Interactive Job
An interactive job allows you to use GPUs interactively (provided, of course, that GPUs are immediately available). To submit an interactive job with 1 GPU card for a maximum runtime of 5 minutes, use:
$ srun --pty --gres gpu:1 --time 5 /bin/bash
By default, memory is allocated at a rate of 64GB per GPU. If your program requires more than that, you should explicitly request more memory (for example, 100GB for 1 GPU). Note that in SLURM, memory is specified in units of MB:
$ srun --pty --gres gpu:1 --mem=100000 --time 5 /bin/bash
Please note that we currently limit the maximum memory to 400GB per job.
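Putting the options together, a larger interactive request might look like this (2 GPUs, 200GB of memory, 60 minutes; the numbers are only an illustration and must stay within the limits above):
$ srun --pty --gres gpu:2 --mem=200000 --time 60 /bin/bash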
Before submitting an interactive job, you may check the number of available GPUs with the gpu_avail command (see Other Local Commands below). The printout of the command should be self-explanatory. If you use a terminal multiplexer (screen/tmux), start it outside (i.e. before) your srun command.
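For example, with tmux you would create the session first and then run srun inside it (the session name gpuwork is arbitrary):
$ tmux new -s gpuwork
$ srun --pty --gres gpu:1 --time 5 /bin/bash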
Currently SLURM only controls GPU and memory. Although users may still use CPU, memory and disk resources without having an active GPU job, users are advised to use them only for:
- Transferring files from/to the node
- Preparing container images (see later sections) before SLURM job submission
- Submitting jobs to SLURM
Batch Job
If you are comfortable with shell scripting, you may put your commands into a text file and submit it to the system. Below is a text file showing the basics of a SLURM job script which uses an enroot container (as described in later sections). The script downloads a container image, starts it (and runs nvidia-smi to prove it can reach the GPUs), then exits immediately:
#!/bin/bash
#
#SBATCH --get-user-env
#SBATCH --job-name=slurm_demo ## Job name
#SBATCH --partition=debug ## Job Queue
#SBATCH --output=slurm_demo.o%j ## File that STDOUT will be written to (%j:Job ID, %t:Task ID)
#SBATCH --error=slurm_demo.e%j ## File that STDERR will be written to
# Uncomment these two lines if you want e-mail to be sent
##SBATCH --mail-type=END,FAIL ## Email notification type: BEGIN,END,FAIL,ALL
##SBATCH --mail-user=user@hku.hk ## Email that notifications will be sent to
#SBATCH --nodes=1 ## Number of compute node(s)
#SBATCH --ntasks-per-node=1 ## Number of process(es) per compute node
#SBATCH --time=00:05:00 ## Runtime in D-HH:MM/HH:MM:SS/MM:SS
##SBATCH --mem=2000 ## Total memory over all of the cores (in MB)
##SBATCH --mem-per-cpu=100 ## Memory per CPU core (in MB)
#SBATCH --gres=gpu:1
echo "Submission Directory : " $SLURM_SUBMIT_DIR
echo "Submission Host : " $SLURM_SUBMIT_HOST
echo "Job User : " $SLURM_JOB_USER
echo "Job ID : " $SLURM_JOB_ID
echo "Job Name : " $SLURM_JOB_NAME
echo "Queue : " $SLURM_JOB_PARTITION
echo "Node(s) allocated : " $SLURM_JOB_NODELIST
echo "Number of Node(s) : " $SLURM_NNODES
echo "Number of CPU Task(s): " $SLURM_NTASKS
echo "Number of Process(s) : " $SLURM_NPROCS
echo "Task(s) per Node : " $SLURM_TASKS_PER_NODE
echo "CPU(s) per Task : " $SLURM_CPUS_PER_TASK
echo "Task ID : " $SLURM_ARRAY_TASK_ID
echo ===========================================================
echo "Job Start Time is `date "+%Y/%m/%d -- %H:%M:%S"`"
cd $WORK
OUTFILE=${SLURM_JOB_NAME}.${SLURM_JOB_ID}
nvidia-smi
enroot import docker://nvcr.io#nvidia/cuda:11.0-devel
enroot create nvidia+cuda+11.0-devel.sqsh
(
cat <<END
nvidia-smi
END
) | enroot start nvidia+cuda+11.0-devel /bin/bash > ${OUTFILE}
rm nvidia+cuda+11.0-devel.sqsh
enroot remove nvidia+cuda+11.0-devel
enroot list
mv ${OUTFILE} ${SLURM_SUBMIT_DIR}
echo "Job Finish Time is `date "+%Y/%m/%d -- %H:%M:%S"`"
exit 0
If you save this file with the name script.slurm, then to submit it, type the following into the shell:
$ sbatch script.slurm
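After submission, you can monitor the job and cancel it if necessary with the standard SLURM commands (12345 is a placeholder for the job ID printed by sbatch):
$ squeue -u $USER
$ scancel 12345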
Container
Currently two Docker alternatives, enroot and Singularity, are installed on the system. Both support Docker images and are discussed in detail below:
enroot
Configuration For NVIDIA GPU Cloud (Optional)
Some containers on NVIDIA GPU Cloud require authentication. Once you get your API token (check this on how to get your own token), you may store it in your environment via:
$ cat > ~/.local/share/enroot/.credentials <<END
machine nvcr.io login \$oauthtoken password MmdhYOUR_NGC_TOKEN_0aXFn_DONT_COPY_THIS_lM2Y0NjMtZGFhZi00YWRlLTk0ODYtMDNiN2U3YzBiOWE5
END
After that, you may add the following line to your ~/.bashrc, which will take effect at your next login:
$ export ENROOT_CONFIG_PATH=~/.local/share/enroot
To take effect immediately, you should run: source ~/.bashrc
Importing Images
To import the image which you would normally pull (via Docker) using docker pull nvcr.io/nvidia/cuda:11.0-devel, you should use:
$ enroot import docker://nvcr.io#nvidia/cuda:11.0-devel
After import, enroot will create a squash file named after the image, e.g. nvidia+cuda+11.0-devel.sqsh in the example above. You may add -o filename.sqsh after the “import” keyword in order to save it under another file name.
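For example, to save the image under an arbitrary file name of your choice:
$ enroot import -o cuda11-devel.sqsh docker://nvcr.io#nvidia/cuda:11.0-devel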
Creating Container from Imported Image
To create a container with a squash file, you should use:
$ enroot create nvidia+cuda+11.0-devel.sqsh
By default, the command will extract the squash file into ~/.local/share/enroot/containername, where containername is generated from your squash file name. You may add -n containername after the “create” keyword in order to give the container a name of your choice.
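For example, to name the container mycuda (an arbitrary name):
$ enroot create -n mycuda nvidia+cuda+11.0-devel.sqsh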
Listing Containers
To get a list of enroot containers under your folder, you should use:
$ enroot list
Running Container
(For enroot, the container only starts if you have a valid GPU allocation, i.e. you are inside a SLURM job with GPUs.)
To run the container with the name nvidia+cuda+11.0-devel and starting a bash shell, you should use:
$ enroot start nvidia+cuda+11.0-devel /bin/bash
Similarly you may start containers with other names and run other programs. You should add -w (write) if you would like to change files inside the container.
(Variants such as batch and exec are also supported by enroot.)
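For example, to get a writable shell in the container above (the --root option, where available in your enroot version, additionally remaps you to root so that you can install packages):
$ enroot start -w nvidia+cuda+11.0-devel /bin/bash
$ enroot start -w --root nvidia+cuda+11.0-devel /bin/bash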
Deleting Container
To delete a container called containername, you may use:
$ enroot remove containername
You will be asked to confirm deletion of the folder containing the root filesystem. You have to answer “y” or “N” (typing “yes” will do nothing).
Notes on Accessing Files in a container
A user may mount other folders into the container to make them accessible inside. For example, to mount your current working directory into the container as /mnt inside:
$ enroot start --mount .:mnt nvidia+cuda+11.0-devel /bin/bash
Users may also interact with the folder holding the container's files, even when the container is not running. The root filesystem of the container is at:
~/.local/share/enroot/containername
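For instance, you can inspect the container's files directly with ordinary shell commands (container name taken from the earlier example):
$ ls ~/.local/share/enroot/nvidia+cuda+11.0-devel/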
Further Information
Further information on enroot's usage can be found at https://github.com/NVIDIA/enroot/blob/master/doc/usage.md
Singularity
Configuration For NVIDIA GPU Cloud (Optional)
Some containers on NVIDIA GPU Cloud require authentication. Once you get your API token (check this on how to get your own token), you may add the following lines to your ~/.bashrc, which will take effect at your next login:
$ export SINGULARITY_DOCKER_USERNAME="\$oauthtoken"
$ export SINGULARITY_DOCKER_PASSWORD="MmdhYOUR_NGC_TOKEN_0aXFn_DONT_COPY_THIS_lM2Y0NjMtZGFhZi00YWRlLTk0ODYtMDNiN2U3YzBiOWE5"
To take effect immediately, you should run: source ~/.bashrc
Importing Images and Creating Container
To import the image which you would normally pull (via Docker) using docker pull nvcr.io/nvidia/cuda:11.0-devel, you should use:
$ singularity build cuda11.simg docker://nvcr.io/nvidia/cuda:11.0-devel
The command will create a “simg” file (cuda11.simg), which is the container. You may create containers with different names by creating multiple “simg” files.
Running Container
To run the container simg file, you should use:
$ singularity shell --nv cuda11.simg
You will get a shell whose prompt starts with Singularity>, which behaves like a normal shell at the same path where you ran the singularity command.
(Variants such as exec and run are also supported by Singularity.)
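For example, exec runs a single command non-interactively; here it simply verifies that the GPUs are visible:
$ singularity exec --nv cuda11.simg nvidia-smi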
Deleting Container
Simply deleting the simg file is enough:
$ rm cuda11.simg
Further Information
Further information on usage of Singularity is at https://sylabs.io/guides/3.6/user-guide/cli/singularity.html
Other Local Commands
There are several local commands for users’ convenience when using the system:
gpu_avail
This is the local command for checking immediate GPU availability. Just type “gpu_avail” on the shell and you will see something like this:
2 out of 8 GPUs are allocated:
 GPU 0  GPU 1 [GPU 2][GPU 3] GPU 4  GPU 5  GPU 6  GPU 7
Legend: [ALLOC] AVAIL
The coloured output should be self-explanatory; it allows users to check how many GPUs are immediately available before they submit interactive jobs.
gpu_smi
This is the local command for checking GPU resource usage of a user's own running jobs. Normally, if you have an interactive session running your compute loads, you would like to confirm the occupancy of your GPUs. However, simply running “nvidia-smi” at another command prompt would be barred from accessing such information (as that separate command prompt is not part of any job).
Just type “gpu_smi” on the shell and you will get something like this if you have a running job, for each of them:
Running Job ID: 5033 [ GPU2 GPU3 ]
# gpu   pwr  gtemp  mtemp   sm   mem   enc   dec   mclk   pclk
# Idx     W      C      C    %     %     %     %    MHz    MHz
    0    50     23     23    0     0     0     0   1215    210
    1    51     22     22    0     0     0     0   1215    210
# gpu   pid  type    sm   mem   enc   dec   command
# Idx     #   C/G     %     %     %     %   name
    0     -     -     -     -     -     -   -
    1     -     -     -     -     -     -   -
For each of the user's running jobs, the system lists the job number along with the physical GPUs the job is using, followed by a per-job printout of GPU usage in abridged “dmon” and “pmon” form. Note that the GPU IDs always start from “0” in the listing; these IDs do not correspond to the physical GPUs but to the GPUs visible to the user's job. This allows users to find out whether their jobs are actually using the GPUs.