Outline
This course will provide an introduction to GPU computing with CUDA aimed at scientific application programmers. The course will give a background on the difference between CPU and GPU architectures as a prelude to introductory exercises in CUDA programming. The course will discuss the execution of kernels, memory management, and shared memory operations. It will also discuss common performance issues and how to resolve them. Finally, the course will cover some of the currently available alternatives to CUDA (OpenCL, OpenACC, and Kokkos).
A separate “Hackathon Day” will be available for attendees to try out their own problems (or a ‘canned’ extended example) with the help of staff from both EPCC and NVIDIA.
Learning Outcomes
At the end of the course, attendees should be in a position to make an informed decision on how to approach GPU parallelisation in their applications in an efficient and portable manner.
Pre-requisites
Attendees must be familiar with programming in C or C++ (a number of the baseline CUDA exercises are also available using CUDA Fortran). Some knowledge of parallel/threaded programming models would be useful. Access to a GPU machine will be supplied.
Note: this course will not address machine learning or any machine learning frameworks.
Requirements:
Participants must bring a laptop running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges.
They are also required to abide by the ARCHER2 Training Code of Conduct.
Timetable:
- Monday 23 November 10:00-17:00 GMT
- Tuesday 24 November 10:00-17:00 GMT
- Thursday 26 November 10:00-17:00 GMT (“Hackathon Day”)
(Wednesday is a rest day.)
Detailed timetable to follow
Course materials
Videos
Day 1
Part 1
Part 2
Part 3
Day 2
Part 1
Part 2
Feedback
This course is part-funded by the PRACE project and is free to all.