Example of a job in the batch system
A simple example of a job for the batch system, a classic "Hello World", is this:
    #!/bin/bash
    # Do not forget to select a proper partition if the default
    # one is no fit for the job! You can do that either in the sbatch
    # command line or here with the other settings.
    #SBATCH --job-name=hello
    #SBATCH --nodes=2
    #SBATCH --tasks-per-node=16
    #SBATCH --time=00:10:00
    #SBATCH --export=NONE
    # Never forget --export=NONE! Strange happenings ensue otherwise.

    set -e # Good Idea to stop operation on first error.

    source /sw/batch/init.sh

    # Load environment modules for your application here.

    # Actual work starting here. You might need to call
    # srun or mpirun depending on your type of application
    # for proper parallel work.
    # Example for a simple command (that might itself handle
    # parallelisation).
    echo "Hello World! I am $(hostname -s) greeting you!"
    echo "Also, my current TMPDIR: $TMPDIR"

    # Let's pretend our started processes are working on a
    # predetermined parameter set, looking up their specific
    # parameters using the set number and the process number
    # inside the batch job.
    export PARAMETER_SET=42
    # Simplest way to run an identical command on all allocated
    # cores on all allocated nodes. Use environment variables to
    # tell apart the instances.
    srun bash -c 'echo "process $SLURM_PROCID \
    (out of $SLURM_NPROCS total) on $(hostname -s) \
    parameter set $PARAMETER_SET"'
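The comments in the job script mention processes that look up their own parameters from the set number and the process number. The following is a minimal sketch of one way this could work, not the method prescribed by the batch system: it assumes a hypothetical parameter file per set (here `set_42.txt`) in which line N+1 belongs to process N. The function name `lookup_parameter`, the file name, and the `alpha=...` values are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print the parameter line belonging to one process.
# Assumption: line N+1 of the parameter file belongs to process number N.
lookup_parameter() {
    local param_file="$1"
    local proc_id="$2"
    sed -n "$((proc_id + 1))p" "$param_file"
}

# Demo data so the sketch runs anywhere; a real job would bring its own file.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf '%s\n' "alpha=0.1" "alpha=0.2" "alpha=0.5" > "$demo_dir/set_42.txt"

# Outside a batch job the Slurm variable is unset, so fall back to 0.
proc_id="${SLURM_PROCID:-0}"
set_number="${PARAMETER_SET:-42}"
param_file="$demo_dir/set_${set_number}.txt"
echo "process $proc_id: $(lookup_parameter "$param_file" "$proc_id")"
```

Saved as a script of its own, something like this could be started with srun in place of the inline echo command, so that every process reads a different line of the same file.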
If this script is saved as $HOME/hello_world.sh, submit the job from a working directory in $WORK:

    shell$ mkdir $WORK/hello_workdir
    shell$ cd $WORK/hello_workdir
    shell$ sbatch $HOME/hello_world.sh
    Submitted batch job 123456

The job should then enter the queue; sbatch reports its job ID, here 123456. While the job runs, a file containing its output is created, here slurm-123456.out.
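When submissions are scripted, the job ID can be taken from the sbatch message in order to locate the output file. This is a small sketch under two assumptions: the message has the form "Submitted batch job 123456", and the job uses the default output file name slurm-<jobid>.out. Both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from the sbatch message.
job_id_from_message() {
    local message="$1"
    # The ID is the last space-separated word of the message.
    echo "${message##* }"
}

# Hypothetical helper: default output file name for a given job ID.
output_file_for_job() {
    echo "slurm-$1.out"
}

message="Submitted batch job 123456"   # example message, as printed by sbatch
job_id=$(job_id_from_message "$message")
echo "job $job_id writes to $(output_file_for_job "$job_id")"
```

In a real session, the message variable would hold the captured output of the sbatch call rather than a fixed string.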
    shell$ cat slurm-123456.out
    module: loaded site/slurm
    module: loaded site/tmpdir
    module: loaded site/hummel
    module: loaded env/system-gcc
    Hello World! I am node223 greeting you!
    Also, my current TMPDIR: /scratch/rrztest.123456
    process 8 (out of 32 total) on node223 parameter set 42
    process 15 (out of 32 total) on node223 parameter set 42
    process 4 (out of 32 total) on node223 parameter set 42
    process 5 (out of 32 total) on node223 parameter set 42
    process 9 (out of 32 total) on node223 parameter set 42
    process 7 (out of 32 total) on node223 parameter set 42
    process 3 (out of 32 total) on node223 parameter set 42
    process 6 (out of 32 total) on node223 parameter set 42
    process 11 (out of 32 total) on node223 parameter set 42
    process 2 (out of 32 total) on node223 parameter set 42
    process 13 (out of 32 total) on node223 parameter set 42
    process 12 (out of 32 total) on node223 parameter set 42
    process 1 (out of 32 total) on node223 parameter set 42
    process 10 (out of 32 total) on node223 parameter set 42
    process 0 (out of 32 total) on node223 parameter set 42
    process 14 (out of 32 total) on node223 parameter set 42
    process 28 (out of 32 total) on node224 parameter set 42
    process 23 (out of 32 total) on node224 parameter set 42
    process 26 (out of 32 total) on node224 parameter set 42
    process 27 (out of 32 total) on node224 parameter set 42
    process 30 (out of 32 total) on node224 parameter set 42
    process 19 (out of 32 total) on node224 parameter set 42
    process 18 (out of 32 total) on node224 parameter set 42
    process 22 (out of 32 total) on node224 parameter set 42
    process 25 (out of 32 total) on node224 parameter set 42
    process 17 (out of 32 total) on node224 parameter set 42
    process 29 (out of 32 total) on node224 parameter set 42
    process 21 (out of 32 total) on node224 parameter set 42
    process 24 (out of 32 total) on node224 parameter set 42
    process 16 (out of 32 total) on node224 parameter set 42
    process 31 (out of 32 total) on node224 parameter set 42
    process 20 (out of 32 total) on node224 parameter set 42
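The process lines appear in no particular order, so checking by eye that each node ran its 16 tasks is tedious. A quick count per node can be sketched with awk. The function name `count_processes_per_node` is hypothetical, and the sketch assumes the line format produced by the echo command in the job script, where the host name is the eighth word.

```shell
#!/bin/bash
# Hypothetical helper: count "process ..." lines per node in job output.
# Reads the files given as arguments, or standard input if there are none.
count_processes_per_node() {
    # Lines look like "process N (out of M total) on HOST parameter set S",
    # so the host name is field 8. Other lines are ignored.
    awk '$1 == "process" { count[$8]++ }
         END { for (node in count) print node, count[node] }' "$@" | sort
}

# Small demo on a shortened excerpt of the output above.
count_processes_per_node <<'EOF'
Hello World! I am node223 greeting you!
process 8 (out of 32 total) on node223 parameter set 42
process 15 (out of 32 total) on node223 parameter set 42
process 28 (out of 32 total) on node224 parameter set 42
EOF
```

Called as `count_processes_per_node slurm-123456.out` on the full output file shown above, it should report 16 processes for each of node223 and node224.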