Please note:

This is not an ARCHER2 course but is advertised here as likely to be of interest to many members of the ARCHER2 community.

This course will run on Cirrus, not ARCHER2.


This course will take place face-to-face at The Open University, Milton Keynes.

This course will not be streamed online and a recording will not be made.


This short course will provide an introduction to GPU computing with CUDA aimed at scientific application programmers. The course will give a background on the difference between CPU and GPU architectures as a prelude to introductory exercises in CUDA programming. The course will discuss the execution of kernels, memory management, and shared memory operations. Common performance issues and their solutions will be discussed. The course will also cover some of the commonly available alternatives to CUDA (OpenCL, OpenACC, and Kokkos).

Templates will be provided to do the practical examples in Python using PyCUDA, although this still requires the computational kernels to be written in C.

Note: this course will not address machine learning or any machine learning frameworks.

Learning Outcomes

At the end of the course, attendees should be in a position to make an informed decision on how to approach GPU parallelisation in their applications in an efficient and portable manner.


Attendees must be familiar with programming in C or C++ (a number of the baseline CUDA exercises are also available using CUDA Fortran). Some knowledge of parallel/threaded programming models would be useful. Access to a GPU machine will be supplied.


Participants must bring a laptop running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges.

They are also required to abide by the ARCHER2 Code of Conduct.



Course materials


This course is part-funded by the PRACE project and is free to all.


Registration is not currently available for this course.