OpenMP job
An OpenMP job runs a single process on a single node and should use all CPU cores of that node by running as many threads as the node has CPU cores. In the example this is expressed by the options --ntasks=1 and --cpus-per-task=16.
The environment variables set in the example are explained below.
openmp-job.sh

    #!/bin/bash
    #SBATCH --partition=std
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=16
    #SBATCH --time=00:02:00
    #SBATCH --export=NONE

    source /sw/batch/init.sh

    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # essential
    export OMP_PROC_BIND=spread                   # recommended
    export OMP_PLACES=cores                       # recommended
    export OMP_SCHEDULE=static                    # recommended
    export OMP_DISPLAY_ENV=verbose                # good to know
    export KMP_AFFINITY=verbose                   # good to know

    ./a.out

    exit
Explanation of environment variables
The OMP_* variables are defined in the OpenMP specifications. KMP_AFFINITY is part of the Thread Affinity Interface of the Intel runtime library.
- OMP_NUM_THREADS sets the number of threads to be used in parallel execution. In the example its value is copied from the SLURM_CPUS_PER_TASK variable, which makes the script easier to modify (the number needs to be specified only once, in --cpus-per-task).
- OMP_PROC_BIND and OMP_PLACES define the placement and binding of threads to CPUs.
- OMP_SCHEDULE defines how parallel work is distributed over threads for loops that request schedule(runtime). static is expected to be most efficient on dedicated CPU resources (which is the case on Hummel). static is the default value in the Intel and PGI runtime environments, but not in the GNU runtime environment.
- OMP_DISPLAY_ENV=verbose makes the runtime library print its OpenMP settings, including the thread binding settings, at program start, which is useful to check. It turned out that the Intel and PGI runtime libraries print more useful information if KMP_AFFINITY=verbose is set in addition.
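
Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task is given, so copying it into OMP_NUM_THREADS yields an empty value if the option is removed or the script is run outside of a batch job. A minimal sketch of a guarded variant follows. The function name set_omp_threads and the fallback to one thread are assumptions made for illustration; they are not part of the example script.

```shell
#!/bin/bash

# Hypothetical helper (not part of openmp-job.sh): set the OpenMP thread count.
# Uses SLURM_CPUS_PER_TASK if Slurm has set it; otherwise falls back to
# 1 thread (an assumed, conservative default).
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

In a job script this would replace the plain export of OMP_NUM_THREADS. Falling back to a single thread keeps a test run on a shared login node from starting one thread per core.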