System Overview

The HPC2015 system is a heterogeneous High Performance Computing Linux cluster comprising several kinds of computing resource: compute nodes with fast multicore processors for general compute-intensive work, and special-purpose compute nodes with large memory, GPU and MIC capabilities for data-intensive and accelerated computing. It is designed to support both compute- and data-intensive research. This heterogeneous environment offers researchers diverse and emerging computing technologies with which to explore new solution approaches, new research opportunities and connections among distinct research areas.

This user guide, and use of the system, assumes familiarity with the Linux/UNIX software environment. For an introduction to Linux/UNIX, please refer to the UNIX user’s guide on the ITS web page.

System Access

To ensure a secure login session, users must connect to the HPC2015 system with a secure shell (SSH) program through the HKU campus network. If you are outside the University network, connect to the HKU campus network via HKUVPN with 2FA first. SSH is not bundled with MS Windows, so you may need to download an SSH client such as PuTTY. Please visit SSH with PuTTY for more details (Host Name: hpc2015.hku.hk / hpc2015-file.hku.hk).

Logging in to the system

To log in to the HPC2015 system, use one of the two frontend nodes:

hpc2015.hku.hk, which is reserved for program modification, compilation and job queue submission/manipulation;

hpc2015-file.hku.hk, which is reserved for file transfer, file management and data analysis/visualization.

If you connect to the HPC2015 from a UNIX or Linux system with SSH:

ssh <username>@hpc2015.hku.hk or ssh -l <username> hpc2015.hku.hk

When you log in to a frontend node, you will be in your home directory ($HOME), which is also accessible from the compute nodes. Do not use the frontend nodes for computationally intensive processes; they are meant for compilation, program editing, simple data analysis and file management. All computationally intensive jobs should be submitted and run through the job scheduling system.

Transferring Data Files

Data transfer must be done using the secure commands scp/sftp or an SCP/SFTP client such as WinSCP or FileZilla. Your local machine must be connected to the HKU campus network beforehand. Please visit SSH and Secure File Transfer for the procedure on how to make an SSH/SFTP connection. Be reminded that only hpc2015-file.hku.hk supports file transfer.

If you already have some file(s) or folder(s) on another Linux system and would like to copy them to HPC2015, you may use the scp command:

scp $SRC_HOST:$SRC_FILE hpc2015-file.hku.hk:$DST_FILE
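For example, to copy a single file or an entire folder from your local Linux machine to your home directory on HPC2015 (the file name, folder name and username below are placeholders; substitute your own):

$ scp results.dat h0xxxxxx@hpc2015-file.hku.hk:~/
$ scp -r project_folder h0xxxxxx@hpc2015-file.hku.hk:~/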

Changing Your Password

You can reset the account password by changing your HKU Portal PIN. Whenever your HKU Portal PIN is changed, the HPC2015 account password will be updated correspondingly.

Editing the Program

You can use the command vi, emacs, nano or pico to edit programs. Please refer to the UNIX user’s guide for details.

Important notice for Microsoft Windows users:

Any files you transfer from Windows to the Linux HPC system may be incompatible because of the different control character sequences used to mark the end of line (EOL). To convert a file from DOS (Windows) format to UNIX format, use the command dos2unix <filename>
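For example, to check whether a transferred file still carries Windows line endings and then convert it (the file name is illustrative and the exact wording of the output may differ slightly):

$ file pbs-job.cmd
pbs-job.cmd: ASCII text, with CRLF line terminators
$ dos2unix pbs-job.cmd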

Environment Modules

The applications, software, compilers, tools, communication libraries and math libraries on the cluster system are updated regularly. HPC2015 uses Environment Modules to dynamically set up the environment for different applications. Module commands set, change or delete the environment variables needed for a particular application; the ‘module load’ command sets PATH, LD_LIBRARY_PATH and other environment variables, so users can switch between different versions of applications or libraries easily.

Useful Module commands

Command                                 Description
module list / ml                        List currently loaded module(s)
module avail / ml avail                 Show what modules are available for loading
module keyword [word1] [word2] …        Show available modules matching the search criteria
module whatis [module_name]             Show description of a particular module
module help [module_name]
module load [module_name]               Configure your environment according to the modulefile(s)
module load [module_name]/[version]
module load [mod A] [mod B] …
module unload [module_name]             Roll back configuration performed by the modulefile(s)
module unload [mod A] [mod B] …
module swap [module A] [module B]       Unload modulefile A and load modulefile B
module purge                            Unload all modules currently loaded
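For example, a typical session might look like the following (the module name intel is only illustrative; run 'module avail' to see what is actually installed on HPC2015):

$ module avail                 # list the available modules
$ module load intel            # load one of them
$ ml                           # confirm it is loaded
$ module unload intel          # remove it again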

Using modules

You must remove some modules before loading others (e.g. different MPI libraries). Some modules depend on others, so they may be loaded or unloaded as a consequence of another module command.

If there is a set of modules that you regularly use and want loaded at login, you can add the module commands to your shell configuration file (.bashrc for bash users, .cshrc for C shell users), as shown in the sketch below.
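For example, a bash user could append lines like the following to ~/.bashrc (the module names are placeholders; use the names reported by 'module avail'):

module load intel
module load openmpi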

Sometimes there may be a caching error while listing/loading modules. You can delete the cache file and then run the module command again.

$ module avail
/usr/bin/lua: /home/[user]/.lmod.d/.cache/moduleT.x86_64_Linux.lua:function expression ...
$ rm -f /home/[user]/.lmod.d/.cache/*.lua

Resource Management System

Torque Resource Manager is a queue management system for managing and monitoring the computational workload of the cluster. Users need to write a batch script and submit it to the queue manager. Submitted jobs then queue up until the requested system resources are allocated. The queue manager schedules your job to run on the queue that you designate, according to a predetermined site policy meant to balance competing user needs and to maximize efficient use of the cluster resources.

Job Queues

The HPC system is set up to support large computation jobs; the following maximum number of nodes and maximum processing time are allowed per batch job:

(A) Default queues available to all users

Queue Name   Maximum no. of nodes   Maximum Processing time (wall clock time)   Resource per node
debug        2                      30 minutes                                  20 cores, 96GB RAM
parallel     4                      24 hours                                    20 cores, 96GB RAM
fourday      6                      96 hours                                    20 cores, 96GB RAM

(B) Queues that require special approval*

Queue Name   Maximum no. of nodes   Maximum Processing time (wall clock time)   Resource per node
gaussian     1                      336 hours                                   20 cores, 96GB RAM
special      24                     336 hours                                   20 cores, 96GB RAM
gpu          2                      336 hours                                   20 cores, 96GB RAM, two K20X GPUs
mic          2                      336 hours                                   20 cores, 96GB RAM, two Xeon Phi 7120P
hugemem      1                      336 hours                                   40 cores, 512GB RAM

* Users whose programs/applications can demonstrate good efficiency and scalability may request more computation resources per job for their intensive parallel computation. Users whose programs/applications can make use of special computing resources (i.e. GPU, MIC, huge memory) may request to use those resources. Please fill in form CF162f to apply for additional computing resources for using research computing (HPC/AI/HTC) facilities.

Furthermore, job scheduling is set up such that higher priority is given to parallel jobs requiring a larger number of processors.

PBS Job command file

To execute your program on the cluster system, you need to write a batch script and submit it to the queue manager. Samples of general PBS scripts can be found in your hpc2015 home directory (~/pbs-samples/); also refer to the individual software user guides. A minimal example is sketched below.
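The following is a minimal sketch of a PBS job script for an MPI program; the job name, module name and executable are placeholders, and the queue, node/core counts and walltime should be adjusted to your needs (compare with the samples in ~/pbs-samples/):

#PBS -N myjob
#PBS -q parallel
#PBS -l nodes=2:ppn=20
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR
module load openmpi                                       # placeholder module; check 'module avail'
mpirun -np 40 -machinefile $PBS_NODEFILE ./my_program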

To utilize the GPU/MIC resources on the special compute nodes, additional generic resources (GRES) have to be requested, as shown in the following examples.

1. Four CPU cores and two GPU cards

#PBS -q gpu
#PBS -l nodes=1:ppn=4
#PBS -W x=GRES:gpu@2

2. Four CPU cores and one MIC card

#PBS -q mic
#PBS -l nodes=1:ppn=4
#PBS -W x=GRES:mic@1

Useful Commands

Submitting a Job

To submit a job, use the qsub command:

$ qsub pbs-mpi.cmd
226.hpc2015.hku.hk

Upon successful submission of a job, PBS returns a job identifier of the form JobID.hpc2015.hku.hk where JobID is an integer number assigned by PBS to that job. You’ll need the job identifier for any actions involving the job, such as checking job status or deleting the job.

While the job is being executed, the program output is stored in the file JobName.xxxx, where xxxx is the job identifier. At the end of the job, the files JobName.oxxxx and JobName.exxxx are copied to the working directory; they contain the standard output and standard error that were not explicitly redirected in the job command file.
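For example, if a job was submitted with the job name pbs-mpi (set via #PBS -N) and received job identifier 226, the working directory would afterwards contain files similar to:

pbs-mpi.o226    (standard output)
pbs-mpi.e226    (standard error)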

Manipulating a Job

There are several commands for manipulating jobs:

  1. List the status of all your jobs
    $ qa
    hpc2015.hku.hk:
                                                                  Req'd     Req'd      Elap 
    Job ID         Username Queue    Jobname    SessID  NDS  TSK  Memory    Time    S  Time
    -------------- -------- -------- ---------- ------ ---- ----- ------- --------- - ---------
    216.hpc2015-mg h0xxxxxx gaussian test       26530     1   20      --  336:00:00 R  03:33:30
    226.hpc2015-mg h0xxxxxx fourday  MIrSi50g   6859      4   80      --   96:00:00 R  01:20:03

    Job information provided

    Username : Job owner
    NDS : Number of nodes requested
    TSK : Number of processors requested
    Req’d Memory : Requested amount of memory
    Req’d Time : Requested amount of wallclock time
    Elap Time : Elapsed time in the current job state
    S : Job state (E-Exit; R-Running; Q-Queuing)
  2. List running node(s) of a job
    Command : qstat -n <JOB_ID> or qa -n <JOB_ID>

    $ qstat -n 226
    hpc2015.hku.hk:
                                                                  Req'd     Req'd      Elap 
    Job ID         Username Queue    Jobname    SessID  NDS  TSK  Memory    Time    S  Time
    -------------- -------- -------- ---------- ------ ---- ----- ------- --------- - ---------
    216.hpc2015-mg h0xxxxxx gaussian test       26530     1   20      --  336:00:00 R  03:34:30
       GP-4-20/0+GP-4-20/1+GP-4-20/2+GP-4-20/3+GP-4-20/4+GP-4-20/5+GP-4-20/6+GP-4-20/7
       +GP-4-20/8+GP-4-20/9+GP-4-20/10+GP-4-20/11+GP-4-20/12+GP-4-20/13+GP-4-20/14
       +GP-4-20/15+GP-4-20/16+GP-4-20/17+GP-4-20/18+GP-4-20/19
  3. Checking the resource utilization of a running job
    Command : ta <JOB_ID>

    $ ta 216
    JOBID: 216
    ================================ GP-4-20 ===================================
    top - 16:41:18 up 149 days, 11:54,  0 users,  load average: 20.05, 19.80, 19.73
    Tasks: 608 total,   2 running, 606 sleeping,   0 stopped,   0 zombie
    Cpu(s): 79.0%us,  1.9%sy,  0.0%ni, 16.0%id,  3.1%wa,  0.0%hi,  0.2%si,  0.0%st
    Mem:  99077612k total,  10895060k used, 88182552k free,   84436k buffers
    Swap: 122878968k total,    19552k used, 122859416k free,  7575444k cached
    
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
    29144 h0xxxxxx  20   0 97.5g  84m 6200 R 1995.2  1.7  4982:46 l502.exe          
     2667 h0xxxxxx  20   0 15932 1500 1248 S  2.0  0.0   0:00.00 top               
     2622 h0xxxxxx  20   0 98.8m 1284 1076 S  0.0  0.0   0:00.00 sshd    
     2623 h0xxxxxx  20   0  105m  896  696 S  0.0  0.0   0:00.00 g09                
     2668 h0xxxxxx  20   0  100m  836 1168 S  0.0  0.0   0:00.00 226.hpc2015               
    29800 h0xxxxxx  20   0  105m 1172  836 R  0.0  0.0   0:00.00 bash                
    29801 h0xxxxxx  20   0  100m  848  728 S  0.0  0.0   0:00.00 grep               
    29802 h0xxxxxx  20   0 98.6m  604  512 S  0.0  0.0   0:00.00 head
    
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/md2        1.5T   59G  1.4T   5% /tmp
    

    You can see the CPU utilization under the CPU stats. This example shows the process l502.exe running in parallel on the 20-core node with 1995.2% CPU utilization (2000% utilization means all 20 cores of GP-4-20 are fully used). It also provides information such as memory usage (10895060k, roughly 10GB used), runtime of the processes and local /tmp disk usage (59GB used).

  4. List all nodes
    $ pa
    GP-1-1
    GP-1-2
    GP-1-3
    ......
    
    GP-2-7
         jobs = 0/226, 1/226, 2/226, 3/226, 4/226, 5/226, 6/226, 7/226, 8/226, 9/226, 10/226, 
    11/226, 12/226, 13/226, 14/226, 15/226, 16/226, 17/226, 18/226, 19/226 
    GP-2-8
         jobs = 0/226, 1/226, 2/226, 3/226, 4/226, 5/226, 6/226, 7/226, 8/226, 9/226, 10/226, 
    11/226, 12/226, 13/226, 14/226, 15/226, 16/226, 17/226, 18/226, 19/226 
    
    ......
    
    GP-4-20
         jobs = 0/216, 1/216, 2/216, 3/216, 4/216, 5/216, 6/216, 7/216, 8/216, 9/216, 10/216,
    11/216, 12/216, 13/216, 14/216, 15/216, 16/216, 17/216, 18/216, 19/216
    ......
    
  5. Delete a job
    Command : qdel <JOB_ID>

    $ qdel 226

Checking Queue Usage

You can use the command ‘queue‘ or ‘q‘ to check the status of the queues that you are authorized to use.

$ queue
+---------------+-----------+--------+------------+---------+---------+
|               |  Cores    |  Total | Available  | Running | Queuing |
|  Queue Name   |  Per Node |  Nodes | Full Nodes | Jobs    | Jobs    |
+---------------+-----------+--------+------------+---------+---------+
        debug         20           2         2         0          0
     parallel         20          24         0        19         10
      fourday         20          24         0        15          5
     gaussian         20          32         2        30          2
      hugemem         40           3         1         4          1

Checking Disk Quota

To check your disk usage, you can use the ‘diskquota‘ command:

$ diskquota
	Disk quotas for user h0xxxxxx at Thu May 5 15:52:32 HKT 2019:
+----------------------+------------------------------+---------------------------------+
|                      | Block limits                 | File limits                     |
| Filesystem           | used   quota   limit   grace | files   quota   limit   grace   |
+----------------------+------------------------------+---------------------------------+
 /home/h0xxxxxx         5665M   20480M  21504M  0       146k    0       0       0
 /data/h0xxxxxx         37.32G  100G    105G    -       25564   0       0       -