Generic structure of a (Slurm) batch job
A batch job is a shell script that is processed by a batch system. A typical batch job is shown below. It has four sections:
- shebang (line 1)
- submit options (lines 3–5)
- initialization (lines 8–9)
- data handling and work (lines 12–15)
    line no.  batch job structure
     1        #!/bin/bash
     2        # submit options
     3        #SBATCH --ntasks=1
     4        #SBATCH --time=00:05:00
     5        #SBATCH --export=NONE
     6
     7        # initialization
     8        source /sw/batch/init.sh
     9        module load package/version
    10
    11        # data handling and work
    12        cd /path/to/working/directory               # optional, see Explanations
    13        cp /path/to/input/directory/input.dat .     # optional, see Explanations
    14        binary [arguments]
    15        cp output.dat /path/to/output/directory/    # optional, see Explanations
    16
    17        exit
Explanations
1. shebang
- The first line of every shell script is the shebang, which specifies the command line interpreter to use.
2. submit options
- In a batch job, the lines after the shebang contain submit options. Alternatively, options can be given as arguments to the submit command. The syntax for specifying options is the same in both cases. In a job script, each submit option must be preceded by a special prefix, which is #SBATCH for the Slurm batch system. Because the prefix starts with #, the shell treats such a line as a comment. The submit command stops processing these lines once it reaches the first line containing a shell command.
- We recommend always setting --export=NONE so that each job begins with a known environment. The reason is reproducibility. By default, a job inherits ALL environment variable settings that exist at submit time. Environment variable settings can influence performance and even results. Therefore, we consider it good practice to exclude side effects that might be introduced by inherited environment variables. (For completeness: the batch system sets the environment variable SLURM_EXPORT_ENV at execution time if --export=NONE was set at submit time. For MPI programs in particular, it turned out that one should unset SLURM_EXPORT_ENV in the script if --export=NONE was specified. At RRZ one does not have to take care of removing SLURM_EXPORT_ENV, because this is done in /sw/batch/init.sh.)
- Submit options are described in the man page of the sbatch command.
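The rule that submit options are only read up to the first shell command can be illustrated with a small sketch. The function name list_sbatch_directives is hypothetical, and the logic is only an approximation of the behaviour described above, not the actual parser used by sbatch.

```shell
#!/bin/bash
# Sketch: print the #SBATCH lines of a job script that lie before the
# first shell command. This only approximates the rule described above.
# list_sbatch_directives is a hypothetical helper, not part of Slurm.
list_sbatch_directives() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '#SBATCH'*) printf '%s\n' "$line" ;;  # submit option
            '#'*) ;;                              # other comment, e.g. the shebang
            *[![:space:]]*) break ;;              # first shell command: stop
            *) ;;                                 # blank line
        esac
    done < "$1"
}

# Demonstration with a temporary job script.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
# a plain comment
#SBATCH --time=00:05:00

echo "work starts here"
#SBATCH --export=NONE
EOF
list_sbatch_directives "$demo"   # prints the first two #SBATCH lines only
rm -f "$demo"
```

In the demonstration the third option comes after a shell command and is therefore not listed, which is why submit options should always be placed at the top of the script.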
3. initialization
- System specific initialization.
  It depends on the system whether system specific initialization is needed. For example, on our system source /sw/batch/init.sh provides temporary directories and the module function, which is often needed in job specific initialization.
- Job specific initialization.
  There are two typical use cases:
  - For application packages the corresponding module must be loaded.
  - For self-compiled software it might be necessary to load (or switch to) exactly the same modules that were loaded at compile time. For MPI programs that are launched with the mpirun command, the MPI module used at compile time must be loaded in any case.
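The two initialization steps can be combined into a defensive helper so that a job fails early if the environment is not set up. This is a minimal sketch: init_job is a hypothetical function name, and the return codes are an arbitrary choice made for illustration.

```shell
#!/bin/bash
# Sketch: system specific initialization followed by job specific module loads.
# init_job is a hypothetical helper; usage: init_job INIT_SCRIPT MODULE...
init_job() {
    init_script=$1
    shift
    if [ ! -r "$init_script" ]; then
        echo "init script not readable: $init_script" >&2
        return 1
    fi
    # System specific initialization (expected to provide the module function).
    . "$init_script"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available after initialization" >&2
        return 2
    fi
    # Job specific initialization: load every requested module.
    for m in "$@"; do
        module load "$m" || return 3
    done
}
```

In a job script on our system this could be called as init_job /sw/batch/init.sh package/version || exit 1, where package/version is the same placeholder as in the listing above.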
4. data handling and work
This part contains commands for handling data and the actual work to be performed. Usually one would avoid any unnecessary data movement. However, changing the working directory and copying data are included here to point out their relevance for I/O performance, which is likely to become a bottleneck in data-intensive jobs.
- Selecting the working directory.
  The default working directory is the directory in which the submit command was issued. For data-intensive jobs in particular, it should be checked whether using the local SSDs is advantageous, since they have the best I/O performance of all disk systems in the cluster. SSDs can be used by cd-ing to $RRZ_LOCAL_TMPDIR.
- Copying data in. When local disks are used, data needs to be copied there before it can be used.
- Work. In many cases the work to be performed can be specified in a single command.
- Copying data out / saving data. When local disks are used, results need to be saved to other disks (because data on the local disks will automatically be removed at the end of a job).
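The four steps above can be sketched as one helper function. This is an illustration under assumptions: run_staged is a hypothetical name, the file names input.dat and output.dat follow the listing above, and the fallback to a directory created by mktemp is only there so that the sketch also runs outside the cluster, where $RRZ_LOCAL_TMPDIR is not set.

```shell
#!/bin/bash
# Sketch: stage data in, run the work, stage results out.
# run_staged is a hypothetical helper.
# Usage: run_staged /absolute/path/to/input.dat /absolute/output/dir COMMAND [ARGS...]
# COMMAND is expected to read input.dat and write output.dat in the current directory.
run_staged() {
    input=$1
    outdir=$2
    shift 2
    # Use the local SSD if available; the mktemp fallback is an assumption
    # made so the sketch can be tried outside a batch job.
    workdir=${RRZ_LOCAL_TMPDIR:-$(mktemp -d)}
    (
        cd "$workdir" || exit 1               # select the working directory
        cp "$input" input.dat || exit 1       # copy data in
        "$@" || exit 1                        # work
        cp output.dat "$outdir"/ || exit 1    # copy data out
    )
}
```

Running the steps in a subshell leaves the caller's working directory unchanged, and a failure in any step is reported through the function's return status so that the job script can react to it.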