Single Node Performance Optimisation

Location:

This course will take place face-to-face in room G0.3 at Edinburgh

This course will not be streamed online and a recording will not be made.

Target Audience:

This course covers techniques for improving the performance of parallel applications by optimising the code that runs within each node.

Modern HPC systems such as ARCHER2 are being constructed using increasingly powerful nodes, with ever larger numbers of cores and enhanced vector capabilities. To extract maximum performance from applications, it is therefore necessary to understand, and be able to overcome, on-node performance bottlenecks. This course will cover the main features of modern HPC nodes, including multiple cores, vector floating-point units, deep cache hierarchies, and NUMA memory systems. We will cover techniques for programming these features efficiently, using batch processing options and compiler options as well as hand tuning of code. The course will also include an introduction to the use of Cray performance analysis tools.

Prerequisites:

Participants must be familiar with software development on ARCHER2, or any other HPC facility, using C, C++ or Fortran.

This course is targeted at users interested in optimising the performance of their own applications, e.g. through compiler options or code changes.

Requirements:

Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.).

They are also required to abide by the ARCHER2 Code of Conduct.

Timetable:

Tuesday 12th November 2024 - Single Node Optimisation Day 1

09.30 – 09.45 Introduction
09.45 – 10.30 Node Architecture
10.30 – 11.00 Practical – memory performance
11.00 – 11.30 Break
11.30 – 12.30 Profiling
12.30 – 13.00 Practical – profiling
13.00 – 14.00 Lunch
14.00 – 15.00 Optimising with the compiler
15.00 – 15.30 Break
15.30 – 17.00 Practical – profiling and optimisation
17.00 – 17.10 Summary
17.10 – 17.30 Practical – profiling and optimisation

Wednesday 13th November 2024 - Single Node Optimisation Day 2

09.30 – 11.00 OpenMP optimisation
11.00 – 11.30 Break
11.30 – 12.30 Practical – OpenMP optimisation
12.30 – 13.30 Lunch
13.30 – 15.00 Vectorisation, Memory Hierarchy Optimisation
15.00 – 15.30 Break
15.30 – 16.00 Practical – memory and cache blocking

Course materials


Course Chat

Feedback

Please let us know what was great about this course and anything we can improve.