OpenMP job
An OpenMP job runs a single process on a single node. It should use all
CPU cores of that node by running as many threads as the node has CPU cores.
In the example this is expressed by the options --ntasks=1 and --cpus-per-task=16.
The environment variables set in the example are explained below.
openmp-job.sh

    #!/bin/bash
    #SBATCH --partition=std
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=16
    #SBATCH --time=00:02:00
    #SBATCH --export=NONE

    source /sw/batch/init.sh

    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK  # essential
    export OMP_PROC_BIND=spread                  # recommended
    export OMP_PLACES=cores                      # recommended
    export OMP_SCHEDULE=static                   # recommended
    export OMP_DISPLAY_ENV=verbose               # good to know
    export KMP_AFFINITY=verbose                  # good to know

    ./a.out

    exit

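The OMP_NUM_THREADS line of the script relies on SLURM_CPUS_PER_TASK being set. A minimal sketch of a more defensive variant follows, assuming the script may also be started outside a batch job (for example for a quick test), where that variable is unset. The function name set_omp_threads and the fallback value of 1 are illustrative choices and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of openmp-job.sh): derive the OpenMP thread
# count from the Slurm allocation. If SLURM_CPUS_PER_TASK is unset or empty,
# fall back to a single thread (assumed default for runs outside Slurm).
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Inside a job submitted with --cpus-per-task=16 this behaves like the plain assignment in the script; the fallback only matters when the variable is missing.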
Explanation of environment variables
OMP_* variables are defined in the OpenMP specifications.
KMP_AFFINITY is part of the Thread Affinity Interface of the Intel runtime library.
- OMP_NUM_THREADS sets the number of threads to be used in parallel execution. In the example its value is copied from the SLURM_CPUS_PER_TASK variable, which makes the script easier to modify (the number needs to be specified only once, in --cpus-per-task).
- OMP_PROC_BIND and OMP_PLACES define placement and binding of threads to CPUs.
- OMP_SCHEDULE defines how parallel work is distributed over threads. static is expected to be most efficient on dedicated CPU resources (which is the case on Hummel). static is the default value in Intel and PGI runtime environments, but not in the GNU runtime environment.
- OMP_DISPLAY_ENV=verbose makes the runtime library print the OpenMP settings in effect at program start, including the thread binding settings, which are worth checking. It turned out that Intel and PGI runtime libraries print more useful information if KMP_AFFINITY=verbose is set in addition.
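In addition to the runtime library's own output, the set of CPUs available to the job can be inspected from the shell before the program starts. The following is a minimal sketch, assuming a Linux node where /proc/self/status contains a Cpus_allowed_list line. The function name show_allowed_cpus is hypothetical. Note that it reports the CPUs the process may run on, not the binding of individual OpenMP threads.

```shell
#!/bin/bash
# Hypothetical helper (not part of openmp-job.sh): print the list of CPUs the
# calling process may run on (child processes inherit this set).
# Linux-only assumption: the information is read from /proc/self/status.
show_allowed_cpus() {
    local cpus=""
    if [ -r /proc/self/status ]; then
        # The relevant line has the form "Cpus_allowed_list:<tab>LIST"
        cpus=$(awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status)
    fi
    # Print "unknown" if the file or the line is not available
    echo "${cpus:-unknown}"
}

echo "CPUs available to this job: $(show_allowed_cpus)"
```

The exact list depends on the node's CPU numbering, so it is best compared against the value given in --cpus-per-task rather than against a fixed range.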