Quick Start
Contents
- The HPC paradox
- Login
- Front-end nodes
- Operating system
- Configuration files
- Working with data
- Software
- Running batch jobs
The HPC paradox
Quick Start begins with a generic topic: the HPC paradox, which every HPC user is faced with. The rest of this page describes peculiarities of the Hummel-2 cluster.
- An HPC cluster is not automatically faster than your personal computer.
- To become faster, parallel processing techniques must be employed.
- I/O (per CPU core) will perform worse than the NVMe of your personal computer. For I/O-intensive tasks, disk usage needs planning.
Please contact the HPC team before you launch production jobs if you feel uncertain about these issues.
Login
The computer from which you want to access a login gateway must be in the network of Hamburg University. From outside the university, a VPN must be used.
There are two login gateways. Their hostnames are:
- hummel3.rrz.uni-hamburg.de
- hummel4.rrz.uni-hamburg.de
Login is possible with the Secure Shell using public key authentication.
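For example, assuming your user ID is bxy1234 and your SSH key pair is already set up:

$ ssh bxy1234@hummel3.rrz.uni-hamburg.de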
Note that you must log into a front-end node (see below) to start working. It is possible to transfer data via the login gateways, but only a minimal set of software is installed there.
For X11 forwarding see the page on X11 forwarding.
Hint for macOS users: If you get locale error messages from the Module setup, try to follow the recommendations from https://blog.remibergsma.com/2012/07/10/setting-locales-correctly-on-mac-osx-terminal-application/.
Front-end nodes
A front-end node can be reached from a login gateway via:

$ ssh front1
bxy1234@hummel2-front1:~$

or likewise:

$ ssh front2
The two ssh commands (from your computer to the gateway, and from the gateway to the front-end node) can be combined into a single one:
$ ssh -t bxy1234@hummel3.rrz.uni-hamburg.de ssh front1
The X11 forwarding option -X must be given to both ssh commands (recall that -X should only be used when needed):
$ ssh -X -t bxy1234@hummel3.rrz.uni-hamburg.de ssh -X front1
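If you connect frequently, the two hops can also be put into your local ~/.ssh/config, so that a single short command suffices. A minimal sketch (the host aliases hummel-gw and hummel-front1 are arbitrary names chosen here; bxy1234 stands for your user ID):

Host hummel-gw
    HostName hummel3.rrz.uni-hamburg.de
    User bxy1234
Host hummel-front1
    HostName front1
    User bxy1234
    ProxyJump hummel-gw

With this in place, ssh hummel-front1 (or ssh -X hummel-front1 for X11 forwarding) connects through the gateway in one step.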
Operating system
The operating system is Debian GNU/Linux.
Configuration files
Initially the home directory $HOME is almost empty, i.e. there is a .ssh directory but no other configuration files such as .profile or .bashrc.
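If you want the usual shell customizations, you can create these files yourself. A minimal sketch (the EDITOR setting is only an example):

$ touch ~/.bashrc
$ echo 'export EDITOR=vim' >> ~/.bashrc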
Working with data
Data handling is becoming an increasingly important topic. Please read also:
Summary
- The home directory ($HOME) should only/mainly be used for files that you create/edit yourself.
- The User-installed-SoftWare directory ($USW) should be used for installing software packages downloaded from the internet. $USW should also be used to unpack containers.
- Data should be kept in the /beegfs ($BEEGFS) or /nfs/ssdX.Y ($SSD) file systems, see Which file system to use when? and the small example after this list.
- In batch jobs /tmp and /dev/shm are virtual file systems that are both kept in memory and are private to the batch job.
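The variables mentioned above can be used directly in commands and batch scripts. A small sketch, assuming they are set by the standard login environment (the directory and file names are placeholders):

$ mkdir -p $BEEGFS/myproject
$ cp input.dat $BEEGFS/myproject/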
Disk quotas
Disk quotas are enabled on all file systems. Disk usage and accessibility can be checked with the RRZ tool
Software
The operating system images of Hummel-2 contain only minimal software. When you start on Hummel-2 it is therefore a good idea to switch to a pkgsrc Module, which provides software that might otherwise be missing:
module switch env env/YYYYQQ-gcc-openmpi
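The placeholder YYYYQQ stands for a year and quarter. Which environments actually exist can be listed with the standard module command; a sketch:

$ module avail env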
Please install binary software packages (like Python packages) in your $USW directory.
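For Python packages, one possible approach is to install them below $USW and extend PYTHONPATH accordingly. A sketch (the subdirectory name python is an arbitrary choice):

$ pip install --target=$USW/python <package>
$ export PYTHONPATH=$USW/python:$PYTHONPATH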
More information on software is given on the page
Compilers
Installed compilers are listed on the page
Running batch jobs
Important points are:
- The smallest scheduling unit is 8 cores or 1 GPU. In particular, the smallest CPU job size is: 8 cores, 32 GB RAM.
- Single-core jobs are not allowed.
- Very short (minutes) and very long (days) running jobs should be avoided.
- Specifying --account at job submission might be necessary.
- Submit all jobs with #SBATCH --export=NONE and source /sw/batch/init.sh (see the sketch after this list).
- Launch MPI programs with mpirun (srun does not work).
- Check your batch job limits with rrz-batch-limits.
- Check resource usage of your batch jobs with rrz-batch-jobreport.
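Putting these points together, a minimal CPU batch script could look like the following sketch (job name, time limit, account, and program name are placeholders):

#!/bin/bash
# Minimal sketch of a Hummel-2 batch script.
# The smallest CPU job size is 8 cores / 32 GB RAM.
#SBATCH --job-name=example
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --time=01:00:00
#SBATCH --export=NONE
# Add "#SBATCH --account=<project>" here if an account must be specified.

# Initialize the environment inside the job (required, see above):
source /sw/batch/init.sh

# Launch MPI programs with mpirun (srun does not work):
mpirun ./my_program

Submit it with sbatch:

$ sbatch job.sh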
Please read also: