Running batch jobs
Contents
- The concept of virtual nodes
- Job sizes
- Job runtimes
- Accounts
- File systems and log files
- Temporary/scratch files
- Job reports
The concept of virtual nodes
Hummel-2 can be viewed as a cluster consisting of virtual nodes (not to be confused with virtual machines). Each virtual node has 8 CPU cores which share the same L3 data cache. The batch system always allocates full virtual nodes. Besides 8 CPU cores each virtual node has in the
- std partition: 32 GB RAM,
- big partition: 96 GB RAM,
- gpu partition: 144 GB RAM and 1 NVIDIA H100 80 GB GPU.
Unfortunately, the batch system does not know the concept of a virtual node. Hence, the unit virtual node cannot be used to request resources from the batch system.
In practice the number of virtual nodes a batch job gets allocated is determined by the requested number of CPU cores or GPUs, respectively:
std partition
- For single-node jobs the number of virtual nodes is determined by the number of CPU cores requested: number of virtual nodes = ceiling(--ntasks × --cpus-per-task / 8); see the example script after this list.
- Full nodes (192 cores) can be requested.
- Multi-node jobs must use full nodes throughout (must specify --exclusive).
big partition
- Only full nodes (192 cores) can be requested.
gpu partition
- The number of virtual nodes is given by the --gpus parameter.
All partitions
- A memory parameter (like --mem) must never be used.
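As an illustration, the following batch script is a minimal sketch that requests 64 cores in the std partition, which corresponds to 8 virtual nodes. The account name, runtime, log path, and program are placeholders and must be adapted.

```bash
#!/bin/bash
#SBATCH --partition=std
#SBATCH --account=MyGroup_std         # placeholder account name (see "Accounts" below)
#SBATCH --ntasks=16                   # 16 tasks ...
#SBATCH --cpus-per-task=4             # ... with 4 cores each
#SBATCH --time=02:00:00               # placeholder runtime
#SBATCH --output=/path/to/job.%j.log  # placeholder: absolute path on a writable file system
# 16 x 4 = 64 cores -> ceiling(64 / 8) = 8 virtual nodes; note: no --mem parameter

srun ./my_program                     # placeholder executable
```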
Job sizes
std and big partitions
Hummel-2 is designed to run parallel programs. The goal is that each job uses at least half of the CPU cores requested or half of the RAM (implicitly) requested.
Single-core tasks should be packed and executed in parallel in order to achieve that goal. This process is called trivial parallelization. On Hummel-2 this can easily be accomplished (a generic sketch of the packing idea follows this list).
- The majority of jobs will fit into a single node.
- Multi-node jobs must use full nodes.
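A minimal, generic sketch of trivial parallelization within one single-node batch job, using plain shell job control; the account name, log path, task count, and executable are placeholders:

```bash
#!/bin/bash
#SBATCH --partition=std
#SBATCH --account=MyGroup_std          # placeholder account name
#SBATCH --ntasks=16                    # pack 16 single-core tasks into one job
#SBATCH --cpus-per-task=1
#SBATCH --time=04:00:00                # placeholder runtime
#SBATCH --output=/path/to/pack.%j.log  # placeholder: absolute path

# Run 16 independent single-core tasks concurrently on the allocated cores
# of the single node, then wait until all of them have finished.
for i in $(seq 1 16); do
    ./single_core_task "$i" &          # placeholder executable and argument
done
wait
```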
gpu partition
- Single-GPU jobs are expected to be dominant.
- Multi-GPU jobs are possible. A multi-GPU job will be scheduled onto a single node.
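A corresponding sketch for the gpu partition, where the number of virtual nodes follows directly from --gpus; the account name, runtime, log path, and program are placeholders:

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --account=MyGroup_gpu         # placeholder account name
#SBATCH --gpus=2                      # 2 GPUs -> 2 virtual nodes (16 cores, 288 GB RAM)
#SBATCH --time=08:00:00               # placeholder runtime
#SBATCH --output=/path/to/gpu.%j.log  # placeholder: absolute path

srun ./my_gpu_program                 # placeholder executable
```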
Job runtimes
Runtimes should be hours, not minutes. In order to obtain reasonable runtimes, short tasks can be packed in the same way as small tasks (see the packing sketch above).
Very long runtimes (days) should be avoided by splitting jobs whenever possible. One way of splitting is a chain of dependent jobs, as sketched below.
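A minimal sketch of such a job chain, assuming the program can checkpoint and restart itself; part.sh is a placeholder job script:

```bash
# Submit three shorter jobs that run one after another; each part starts only
# if the previous one finished successfully and resumes from its checkpoint.
jobid=$(sbatch --parsable part.sh)
jobid=$(sbatch --parsable --dependency=afterok:$jobid part.sh)
sbatch --dependency=afterok:$jobid part.sh
```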
Accounts
In order to achieve a fair distribution of computing time, the fairshare algorithm is employed, see:
In the fair tree, account subtrees were introduced per partition. As a consequence, users who are allowed to use more than one partition must specify --account in addition to --partition. The account names are:
- WorkingGroupName_std
- WorkingGroupName_big
- WorkingGroupName_gpu
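For example, a job for the big partition would be submitted along these lines (WorkingGroupName is the placeholder used above, job.sh a placeholder script):

```bash
# Submit to the big partition with the matching per-partition account.
sbatch --partition=big --account=WorkingGroupName_big job.sh
```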
File systems and log files
In batch jobs
- the
/home
file system is readonly, - the
/usw
file system is readonly.
Also log files cannot be written there. If you submit jobs from $HOME or $USW, a log filename including an absolute path must be specified with --output=, i.e. the first character following --output= must be a slash (/).
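For example (the directory is a placeholder for a writable location, job.sh a placeholder script, and %j expands to the job ID):

```bash
# When submitting from $HOME or $USW, point --output at an absolute path
# on a writable file system.
sbatch --output=/path/to/writable/dir/myjob.%j.log job.sh
```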
Temporary/scratch files
All directories mentioned in this section will be automatically created by the batch system and automatically deleted at job end.
Each job can use directories
/tmp
and/dev/shm
. Both are virtual file systems that are both kept in memory, i.e. a job can run out-of-memory if to much data is written there.$RRZ_GLOBAL_TMPDIR
is a directory in the/beegfs
file system.
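A sketch of how a job script might use it: many programs honor the TMPDIR environment variable, and my_program is a placeholder.

```bash
# Redirect temporary files to the BeeGFS scratch directory instead of the
# in-memory /tmp; the directory is deleted automatically at job end.
export TMPDIR="$RRZ_GLOBAL_TMPDIR"
srun ./my_program    # placeholder executable
```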
Job reports
Users should check whether their batch jobs use resources efficiently. This can be accomplished with the RRZ tool rrz-batch-jobreport.
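Independently of the RRZ tool, Slurm's own accounting command sacct can give a first impression of a job's resource use (1234567 is a placeholder job ID):

```bash
# Compare elapsed wall time, CPU time actually consumed, and peak memory
# of a finished job with what was requested.
sacct -j 1234567 --format=JobID,Elapsed,TotalCPU,MaxRSS,State
```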