This workshop gives attendees the knowledge needed to port, run and optimize applications on the ARCHER2 GPU Test and Development platform.
The workshop is a mixture of lectures and practical hands-on sessions. Example exercises will be provided, but attendees are encouraged to bring their own applications, and one session is reserved for working on them.
Specific topics that will be covered include:
- System Architecture.
- Compiling and running applications on AMD GPUs (Programming Environment, compilers, scientific libraries, Slurm).
- OpenACC and OpenMP offloading with the Cray PE.
- GPU programming with HIP.
- Application profiling and debugging on GPUs.
Target Audience:
It is expected that attendees are familiar with the ARCHER2 environment but may not be familiar with systems having nodes with accelerators. The course will not teach GPU programming but will outline how to build and run applications using different programming models.
As places on this course are limited, priority will be given to existing ARCHER2 users. Please include your ARCHER2 username in the “Additional information” part of the registration form.
Requirements:
Participants must bring a laptop running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges.
They are also required to abide by the ARCHER2 Code of Conduct.
Timetable:
Tuesday 12th March 2024
- 09.00 Introduction (EPCC/Harvey)
- Course organisation
- Introduction to GPU development platform (EPCC)
- 09.15 Introduction to the Cray EX Hardware and Programming Environment for GPUs (Harvey)
- HPE Cray EX hardware architecture and software stack
- The Cray programming environment and compiler wrapper scripts
- An introduction to the compiler suites for GPUs
- Description of the GPU Parallel Programming models
- 09.50 First steps for running on the GPU nodes
- Examples of using the Slurm Batch system, launching jobs
- 10.00 Exercises
- 10.30 Break
- 11.00 Libraries and resource placement
- Presentation of the Cray Scientific Libraries for GPU execution
- Controls for job placement (CPU/GPU/NIC)
- GPU-aware MPI communications
- 11.30 Exercises
- 12.15 Lunch
- 13.30 OpenACC and OpenMP offload with Cray Compilation Environment
- Directive-based approach for GPU offloading execution with the Cray Compilation Environment
- Compiler feedback and variable scoping with Reveal
- 14.30 Exercises
- 15.00 Break
- 15.30 GPU Programming with HIP
- GPU Hardware intro and terminology
- Introduction to ROCm and HIP
- Porting Applications to HIP
- 16.30 Exercises
- 17.00 Close
Wednesday 13th March 2024
- 09.00 GPU Profiling
- Overview of the Cray Performance and Analysis toolkit for profiling applications
- Demo: Visualization of performance data
- AMD rocprof profiling tool
- 09.45 Exercises
- 10.15 Break
- 10.45 GPU Debugging
- AMD Debugger: ROCgdb
- Debugging at scale with gdb4hpc
- 11.30 Exercises
- 12.15 Lunch
- 13.30 Continuation of hands-on exercises
- Working on your application
- 15.00 General Questions & Answers
- 15.30 Close