There is one master node, gridpoint.hku.hk, which is reserved for user login, program development, compilation, and job queue submission and manipulation.

There are 144 compute nodes, used for batch processing, housed in eight Dell M1000e chassis and two IBM BladeCenter H chassis. Unlike the master node, these nodes do not accept direct user logins; instead, users submit jobs to them through the Torque batch queue system on the master node.
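
As a sketch, a minimal Torque job script might look like the following (the job name, resource limits and program name are illustrative only; check the cluster's queue configuration for the actual limits):

  #!/bin/bash
  #PBS -N myjob                  # job name
  #PBS -l nodes=1:ppn=8          # request one node with 8 processor cores
  #PBS -l walltime=01:00:00      # one hour wall-clock limit
  cd $PBS_O_WORKDIR              # start in the directory the job was submitted from
  ./myprogram

The script is submitted from the master node with "qsub myjob.pbs", and its status can be checked with "qstat".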

Of these 144 compute nodes, 32 are interconnected by a 4X DDR InfiniBand network with bandwidth up to 16 Gbps, 16 by a 4X QDR InfiniBand network with bandwidth up to 32 Gbps, and the remaining 96 by 1 Gbps switches with 10 Gbps uplinks to the 10GbE switches.

Compute nodes:

  • 128 x Dell M610 blade servers. Each node consists of:
    • A Dell M610 blade server motherboard and chipset
    • Two 64-bit quad-core Intel Nehalem CPUs running at 2.53 GHz
    • 32GB RAM (16 of these nodes have 16GB RAM)
    • Two 250 GB SATA hard disks
    • 4X DDR InfiniBand adaptor (IB-enabled nodes only)
    • Dual integrated 100/1000 Ethernet ports
  • 16 x IBM HS22 blade servers. Each node consists of:
    • An IBM HS22 blade server motherboard and chipset
    • Two 64-bit 6-core Intel Westmere CPUs running at 2.66 GHz
    • 48GB RAM
    • Two 300 GB SATA hard disks
    • 4X QDR InfiniBand adaptor
    • Dual integrated 100/1000 Ethernet ports
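
The core counts above determine sensible per-node process counts for Torque resource requests: up to 8 processes per node on the Dell M610 nodes and up to 12 on the IBM HS22 nodes. For example (whether a request lands on Dell or IBM nodes depends on the site's queue and node-property configuration, which is not shown here):

  qsub -l nodes=4:ppn=8 myjob.pbs    # 4 nodes, 8 cores each (matches the Dell M610 nodes)
  qsub -l nodes=2:ppn=12 myjob.pbs   # 2 nodes, 12 cores each (matches the IBM HS22 nodes)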

File Service:

  • 2 x dedicated NFS file servers (Dell R710) with 4TB of Dell MD1000 storage
  • 4 x dedicated NFS file servers (IBM x3620 M3) with 18TB of storage

Network:

  • A 4X DDR InfiniBand network interconnects the 32 InfiniBand-enabled Dell quad-core compute nodes.
  • A 4X QDR InfiniBand network interconnects the 16 InfiniBand-enabled IBM 6-core compute nodes.
  • Two 10 Gigabit Ethernet networks are formed: one for the private NFS file system and one for inter-processor MPI communication.
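
Since the InfiniBand and 10 Gigabit Ethernet fabrics carry the inter-processor MPI traffic, parallel jobs are normally started with an MPI launcher inside a Torque job. A minimal sketch (the MPI implementation and its environment setup are assumptions; consult the cluster's software documentation):

  #PBS -l nodes=4:ppn=8
  cd $PBS_O_WORKDIR
  # $PBS_NODEFILE lists the hosts Torque has allocated to the job, one line per core
  mpirun -np 32 -machinefile $PBS_NODEFILE ./my_mpi_program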

Directories:

  • /home1 and /home2 are used for user home directories and files.
  • /share1 and /share2 are used to store commonly used files.
  • /scratch1 and /scratch2 are readable and writable from all nodes; each is 1TB in size.
  • Each compute node has a 190GB local /tmp directory.
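
A common pattern is to stage a job's working files onto the node-local /tmp disk (or one of the scratch file systems) and copy the results back to the home directory before the job ends. A sketch, assuming the working set fits within the 190GB local disk and that "input.dat" and "myprogram" stand in for the user's own files:

  #PBS -l nodes=1:ppn=8
  WORKDIR=/tmp/$PBS_JOBID                        # per-job directory on the node's local disk
  mkdir -p $WORKDIR
  cp $PBS_O_WORKDIR/input.dat $WORKDIR/
  cd $WORKDIR
  $PBS_O_WORKDIR/myprogram input.dat > output.dat
  cp output.dat $PBS_O_WORKDIR/                  # copy results back before the job ends
  rm -rf $WORKDIR                                # clean up the local disk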