HPC cluster 2

The Department of Diagnostic Radiology manages another five servers, colloquially referred to as "GPU cluster 2".

They were set up and handed over in April 2022. The rack is physically located in the laboratory building, LG3/F.

Hardware Specifications

The servers are typically called nodes. There are four GPU computing nodes and one Windows storage node.

Name      Public IP address  Physical CPU cores  GPU    RAM (GB)  Storage (TB)  Storage mount point                        URL
gpu1      10.97.168.102      48                  A30x4  128       5.2           /                                          NA
gpu2      10.97.168.103      48                  A30x4  128       5.2           /                                          NA
gpu3      10.97.168.104      48                  A30x4  128       5.2           /                                          NA
gpu4      10.97.168.105      48                  A30x4  128       5.2           /                                          NA
storage1  10.97.168.101      18                  NA     16        100           exported as NFS to /home/[username]/share  NA

Real-time performance and usage metrics can be found at HPC Diagnostics and Statistics.

Usage

Users are currently expected to use GPU cluster 2 in the following ways:

  1. Shell access to the compute nodes
    • Users get their own user-specific home folder.

A few introductory guides are available to help users. Other software may be installed upon request, but users should note they can manually install any software in their own home directory without needing admin privileges.

  • To use the shell access features of GPU cluster 2, users need to get a server account.
  • All users must connect through the HKU VPN to access any of the servers.
  • The GPU cluster 2 nodes cannot access the internet by default, but access can be temporarily permitted upon request.
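
Because nothing on the cluster is reachable without the VPN, a quick connectivity check can save some confusion before logging in. The following is a minimal Python sketch, not an official tool: the IP addresses are copied from the specs table above, while the use of the default SSH port 22 and the helper names is_reachable and report_reachability are assumptions made for illustration.

```python
import socket

# Node addresses copied from the specs table on this page.
GPU_NODES = {
    "gpu1": "10.97.168.102",
    "gpu2": "10.97.168.103",
    "gpu3": "10.97.168.104",
    "gpu4": "10.97.168.105",
}

SSH_PORT = 22  # assumption: the nodes accept SSH on the default port


def is_reachable(host, port=SSH_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unroutable hosts.
        return False


def report_reachability(nodes=GPU_NODES):
    """Print one line per node saying whether its SSH port answers."""
    for name, ip in nodes.items():
        if is_reachable(ip):
            state = "reachable"
        else:
            state = "not reachable (is the VPN connected?)"
        print(f"{name} ({ip}): {state}")
```

Calling report_reachability() from your own computer prints one line per node. If every node shows as not reachable, a disconnected VPN is the first thing to check.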

Choosing a server

Refer to the specs table for IP address information. storage1 is not intended for direct shell access, so password login for non-admins is disabled. The storage node transparently makes its storage capacity available to all nodes as the /home/[username]/share directory.

For running code and scripts

Choose any of the four GPU servers. Optionally, check the local resource usage first with the top, ps or nvidia-smi commands.
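
To make the nvidia-smi check less manual, its query mode can be parsed by a short script. The sketch below is illustrative only: it assumes nvidia-smi is on the PATH of the node you are logged in to and that it reports a plain number for every queried field, which may not hold in every GPU configuration. The function names are hypothetical.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY = "index,utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, util, used, total = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "util_percent": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return gpus


def least_busy(gpus):
    """Pick the GPU with the lowest utilization, breaking ties on memory used."""
    return min(gpus, key=lambda g: (g["util_percent"], g["mem_used_mib"]))


def query_local_gpus():
    """Run nvidia-smi on the current node and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)
```

On a GPU node, least_busy(query_local_gpus())["index"] gives the index of the GPU that currently looks the most idle. One common way to use it is to set the CUDA_VISIBLE_DEVICES environment variable to that index before starting a job.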

For file transfer/storage

It doesn't matter which server you choose, since the share directory in your home folder (/home/[username]/share) is an NFS mount from the storage server. It is therefore suggested that you save all of your files and folders in this directory. To check the available storage, see the df and du commands.
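
The same check that df performs can also be done from inside a Python job, for example before writing a large result file. This is a minimal sketch using only the standard library; the helper names are hypothetical and the share path is taken from the description above.

```python
import shutil
from pathlib import Path


def share_dir():
    """Return the NFS share inside the current user's home folder."""
    return Path.home() / "share"


def free_space_gib(path):
    """Return (free, total) space in GiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.free / gib, usage.total / gib


def has_room_for(path, needed_gib):
    """Return True if at least `needed_gib` GiB are free at `path`."""
    free, _total = free_space_gib(path)
    return free >= needed_gib
```

For example, has_room_for(share_dir(), 50) returns False when fewer than 50 GiB are free on the share, in which case a job could stop early instead of failing halfway through writing its output.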

Security

  • Please use a strong password and protect it.
  • Non-anonymized patient data should be stored in an encrypted format.