Commonly, parallel applications on ARCHER2 are run using the Message Passing Interface (MPI), with one process (or task) spawned on each core. On ARCHER2, which has 128 cores per node, this means a large number of processes when running on many nodes. Using this many processes can degrade the performance of MPI communications for some applications and use cases. As an alternative, in some cases it is possible to use a combination of MPI processes and OpenMP threads (known as MPI+OpenMP). This approach naturally uses fewer processes overall, which can improve application performance by reducing communication overhead, and it can have other benefits, such as lower memory requirements.
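As a minimal illustration of the hybrid model (not taken from the talk itself), the sketch below shows the standard MPI+OpenMP pattern in C: each MPI process starts a team of OpenMP threads, so, for example, 16 processes of 8 threads each would fill one 128-core ARCHER2 node instead of 128 single-threaded processes.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    /* Request threaded MPI support. MPI_THREAD_FUNNELED means only
       the main thread makes MPI calls, the most common hybrid pattern. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each MPI process spawns a team of OpenMP threads; the team
       size is taken from OMP_NUM_THREADS at run time. */
    #pragma omp parallel
    {
        printf("Rank %d of %d, thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

On a Slurm system such as ARCHER2, one would typically launch this kind of hybrid job by setting `OMP_NUM_THREADS` (e.g. to 8) and asking the launcher for matching resources (e.g. `srun --ntasks-per-node=16 --cpus-per-task=8`); the exact flags here are illustrative rather than a prescribed ARCHER2 configuration.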

This talk will present a study of the performance of MPI+OpenMP for a variety of applications on ARCHER2, exploring different configurations of processes and threads across various benchmark systems. The aim is to provide guidance on which applications and use cases can see a performance benefit from using MPI+OpenMP. The main applications for which we will present results are CASTEP, CP2K, LAMMPS, GROMACS and Quantum ESPRESSO.

This online session is open to all. It will use the Blackboard Collaborate platform.
