All access to the ARCHER2 compute nodes is via the Slurm workload manager, and writing batch submission scripts for standard MPI programs is relatively straightforward. However, it is useful to understand what actually happens under the hood, which is sometimes not as simple as it may first appear.
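For context, a straightforward MPI job script on ARCHER2 looks something like the sketch below. The budget code (`t01`) and program name (`my_mpi_program`) are placeholders; check the ARCHER2 documentation for the values valid on your project.

```shell
#!/bin/bash
# Minimal Slurm batch script for an MPI program on ARCHER2.
# Submit with: sbatch myscript.slurm

#SBATCH --job-name=mpi_example
#SBATCH --nodes=2                 # two 128-core ARCHER2 nodes
#SBATCH --ntasks-per-node=128     # one MPI process per CPU-core
#SBATCH --cpus-per-task=1
#SBATCH --time=00:20:00           # wall-clock limit (affects charging)
#SBATCH --account=t01             # placeholder budget code
#SBATCH --partition=standard
#SBATCH --qos=standard

# srun launches the MPI processes and distributes them across the nodes
srun --distribution=block:block --hint=nomultithread ./my_mpi_program
```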
There will be a 45-minute presentation at the start of this online tutorial, given by David Henty from ARCHER2 CSE support, which will address the following:
- how do batch scripts work?
- what happens to my Slurm job after I type sbatch?
- how do time limits and charging work?
- how do I access special queues?
- where does my job script actually run?
- how are processes and threads distributed across nodes or between the CPU-cores within a node?
- can I issue multiple srun commands in a single Slurm job?
- how do “interactive” batch jobs work?
- common issues; tips and tricks
This online session is open to all. It will use the Blackboard Collaborate platform.