[Figure: suggested training paths for three groups (Researchers, Data Science, RSEs / Developers), linking user profiles such as novice or experienced package users and developers, with or without HPC expertise, to the courses described below and to the Virtual Tutorials.]

Some example scenarios and suggested training paths.

 

Outline Course Descriptions

Introductory (level 1) courses

Intermediate (level 2) courses

Advanced (level 3) courses

 

Introductory (level 1) courses

Data Carpentry

Course length: 2 days. Course level: introductory.

In many domains of research, the rapid generation of large amounts of data is fundamentally changing how research is done. The deluge of data presents great opportunities, but also many challenges in managing, analysing, and sharing data. Data Carpentry aims to teach the skills that will enable researchers to be more effective and productive. This two-day introductory workshop is designed for learners with little to no prior knowledge of programming, shell scripting, or command line tools.

HPC Carpentry

Course length: 2 days. Course level: introductory.

This course provides an introduction to High Performance Computing (HPC). After completing this course, participants will:

Software Carpentry

Course length: 2 days. Course level: introductory.

Software Carpentry’s goal is to help scientists and engineers become more productive by teaching them basic computing skills like program design, version control, testing, and task automation. In this two-day introductory workshop, short tutorials will alternate with hands-on practical exercises. Participants will be encouraged both to help one another, and to apply what they have learned to their own research problems during and between sessions.

Package Use on ARCHER2

Course length: 1 day. Course level: introductory.

This one-day introductory course will cover efficient use of pre-installed research software packages on ARCHER2, including the essentials of the ARCHER2 service and how the pre-installed packages can be used. We will run the course both online and face-to-face, with practical exercises in both formats.

Introduction to CP2K

Course length: 1 day. Course level: introductory.

This introductory course aims to give a practical introduction to using CP2K on HPC systems. CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems.

Lectures introducing the theory implemented in CP2K will be interspersed with tutored practical sessions, with access to ARCHER2.

Introduction to LAMMPS

Course length: 1 day. Course level: introductory.

LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a widely used classical molecular dynamics (MD) code. This C++ code is easy to use, incredibly versatile, and parallelised to run efficiently on both small-scale personal computers and CPU/GPU/CPU&GPU HPC clusters. As of 2023, LAMMPS has been used, to some degree, in over 40,000 publications in fields as varied as chemistry, physics, material science, and granular and lubricated-granular flow.

The course will be divided into two parts:

Introduction to GROMACS

Course length: 1 day. Course level: introductory.

This introductory course aims to give a practical introduction to using GROMACS on HPC systems. GROMACS is a molecular dynamics package mainly designed for simulations of proteins, lipids, and nucleic acids. Lectures introducing the theory implemented in GROMACS will be interspersed with tutored practical sessions, with access to ARCHER2.

Introduction to Unified Model

Course length: 1 day. Course level: introductory.

The Met Office Unified Model is used for weather and climate prediction by forecasting centres worldwide. The Introduction to Unified Model course provides practical support for setting up and running experiments on this model, through a series of short lectures and hands-on workshops. This course is delivered in conjunction with NERC, with support from the ARCHER2 CSE team.

Data Science on ARCHER2

Course length: 1 day. Course level: introductory.

This one-day introductory course will cover the essentials of ARCHER2, the basic use of core data science packages (e.g., R, Pandas), and data handling best practice.

Data Analytics with HPC

Course length: 2 days. Course level: introductory.

Data Analytics, Data Science and Big Data are just a few of the many terms used in business and academic research, all referring to the manipulation, processing, and analysis of data. Fundamentally, these are all concerned with the extraction of knowledge from data that can be used for competitive advantage or to provide scientific insight. In recent years, this area has undergone a revolution in which HPC has been a key driver. This two-day course provides an overview of data science and the analytical techniques that form its basis, and explores how HPC provides the power that has driven their adoption. The course will cover key data analytical techniques, such as classification, optimisation, and unsupervised learning; key parallel patterns, such as MapReduce, for implementing analytical techniques; relevant HPC and data infrastructures; and case studies from academia and business.
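As an illustration of the MapReduce pattern named above, here is a minimal sketch in plain Python. It is not taken from the course material: the function names (map_phase, reduce_phase, word_count) and the word-counting task are assumptions chosen for illustration, and the map step runs serially here, whereas on an HPC system each chunk would typically be handled by a separate worker.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk):
    """Map: count the words in one chunk of lines, independently of other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_phase(left, right):
    """Reduce: merge two partial word counts into one."""
    return left + right


def word_count(chunks):
    # Each chunk could be processed by a different worker; the built-in map()
    # runs them one after another to keep this sketch dependency-free.
    partial_counts = map(map_phase, chunks)
    return reduce(reduce_phase, partial_counts, Counter())


if __name__ == "__main__":
    chunks = [["HPC drives data science"], ["data science needs HPC", "data data"]]
    print(word_count(chunks).most_common(2))
```

Because the map step touches only its own chunk and the reduce step is associative, the same structure can be spread across many processes without changing the result.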

Data Analysis and Visualisation in Python

Course length: 2 days. Course level: introductory.

Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data.

This is an introduction to Python designed for participants with no programming experience. This course covers:

Development on ARCHER2

Course length: 2 days. Course level: introductory.

This two-day introductory course will cover the ARCHER2 application development environment, core parallel and scientific software libraries, and the available debugging and profiling tools. The course will be available both online and face-to-face to suit the needs of attendees.

Modern Fortran

Course length: 2 days. Course level: introductory.

Fortran (a contraction of Formula Translation) was the first programming language to have a standard (in 1966), but has changed significantly over the years. More recent standards (the latest being Fortran 2018) come under the umbrella term “Modern Fortran”. Fortran retains very great significance in many areas of scientific and numerical computing, particularly in applications such as quantum chemistry, plasmas, numerical weather prediction, and climate models.

This course provides an introduction to the basics of writing Fortran. It will cover basic syntax, variables, expressions and assignments, flow of control, and introductions to I/O and user-defined types. Common Fortran idioms are introduced and contrasted with those available in C-like languages; the course will try to focus on real usage rather than formal descriptions.

At the end of the course you should be able to understand many Fortran programs and feel confident starting to write well-structured and portable Fortran. Fortran is a rather “large” language, so it is not possible to cover all its features in a two-day course. Further elements of Fortran are discussed in the “Intermediate Modern Fortran” course.

Plotting and Programming with Python

Course length: 1 day. Course level: introductory.

This lesson is an introduction to programming in Python for people with little or no previous programming experience. It uses plotting as its motivating example. Please note that this lesson uses Python 3.

This one-day course aims to answer the following questions:

How do I…

Scientific Programming with Python

Course length: 2 days. Course level: introductory.

This course is aimed at programmers with little or no Python knowledge seeking to learn how to use Python for scientific computing. We will introduce Python’s fundamental scientific libraries: NumPy, SciPy, and Matplotlib. We will also introduce how to interface Python with Fortran and C codes, along with parallel programming methods including MPI via mpi4py.

 

Intermediate (level 2) courses

Efficient Parallel IO

Course length: 1 day. Course level: intermediate.

One of the greatest challenges to running parallel applications on large numbers of processors is how to handle file IO. Standard Unix IO routines are not designed with parallelism in mind, and IO overheads can grow to dominate the overall runtime. Parallel file systems are optimised for large volumes of data, but performance can be far from optimal if every process opens its own file or if all IO is funnelled through a single controller process.

This hands-on course explores a range of issues related to parallel IO. It uses ARCHER2 and its parallel Lustre file system as a platform for the exercises; however, almost all the IO concepts and performance considerations are applicable to any parallel system.

We will give a general overview of the Lustre file system and of how parallel IO is implemented in MPI-IO, since the MPI-IO routines are ultimately used by many higher-level libraries such as HDF5 and NetCDF. A good understanding of the performance characteristics of MPI-IO is therefore very useful in optimising the IO performance of most parallel applications.

The course does not teach the detailed syntax of the various parallel IO libraries, but the Fortran source code provided for the benchmarking application used in the practical sessions should be useful reference material.

Prerequisites: The course assumes an understanding of basic MPI programming in C, C++ or Fortran. Knowledge of MPI derived datatypes would be useful but not essential.

Message Passing Programming with MPI

Course length: 2 days. Course level: intermediate.

The world’s largest supercomputers are used almost exclusively to run applications which are parallelised using Message Passing. This two-day intermediate course covers all the basic knowledge required to write parallel programs using this programming model and is directly applicable to almost every parallel computer architecture.

Parallel programming by definition involves co-operation between processors to solve a common problem. The programmer has to define the tasks that will be executed by the processors, and also how these tasks are to synchronise and exchange data with one another. In the message-passing model the tasks are separate processes that communicate and synchronise by explicitly sending each other messages. All these parallel operations are performed via calls to some message-passing interface that is entirely responsible for interfacing with the physical communication network linking the actual processors together. This course uses the de facto standard for message passing, the Message Passing Interface (MPI). It covers point-to-point communication, non-blocking operations, derived datatypes, virtual topologies, collective communication, and general design issues.

The course is taught using a variety of methods including formal lectures, practical exercises, programming examples and informal tutorial discussions. This enables lecture material to be supported by the tutored practical sessions in order to reinforce the key concepts.
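To make the message-passing model described above more concrete, here is a minimal sketch that uses only the Python standard library. It is not MPI and not course material: threads stand in for the separate processes, a queue stands in for the communication network, and the names worker and message_passing_sum are hypothetical. The point it illustrates is that each task computes on its own data and shares results only by explicitly sending a message.

```python
import queue
import threading


def worker(rank, chunk, to_controller):
    """One task: compute on private data, then send the result as a message."""
    partial = sum(chunk)                 # local work, no shared variables
    to_controller.put((rank, partial))   # explicit "send" to the controller


def message_passing_sum(values, n_tasks=4):
    to_controller = queue.Queue()        # the controller's inbox
    # Decompose the problem: task r owns every n_tasks-th element, starting at r.
    chunks = [values[rank::n_tasks] for rank in range(n_tasks)]
    tasks = [
        threading.Thread(target=worker, args=(rank, chunks[rank], to_controller))
        for rank in range(n_tasks)
    ]
    for task in tasks:
        task.start()

    total = 0
    for _ in range(n_tasks):
        _rank, partial = to_controller.get()   # blocking "receive"
        total += partial

    for task in tasks:
        task.join()
    return total


if __name__ == "__main__":
    print(message_passing_sum(list(range(1, 101))))
```

In an MPI program the same structure would be expressed with point-to-point calls such as MPI_Send and MPI_Recv, or more compactly with a collective reduction.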

Shared Memory Programming with OpenMP

Course length: 2 days. Course level: intermediate.

Almost all modern computers now have a shared-memory architecture with multiple CPUs connected to the same physical memory, for example multicore laptops or large multi-processor compute servers. This two-day intermediate course covers OpenMP, the industry standard for shared-memory programming, which enables serial programs to be parallelised easily using compiler directives. Users of desktop machines can use OpenMP on its own to improve program performance by running on multiple cores; users of parallel supercomputers can use OpenMP in conjunction with MPI to better exploit the shared-memory capabilities of the compute nodes.

This course will cover an introduction to the fundamental concepts of the shared variables model, followed by the syntax and semantics of OpenMP and how it can be used to parallelise real programs. Hands-on practical programming exercises make up a significant, and integral, part of this course.
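The shared variables model mentioned above can be sketched with Python threads. This is an analogy only, not OpenMP and not course material: the name shared_sum is hypothetical, and in standard CPython the global interpreter lock means the example illustrates the programming model rather than delivering a speed-up. All threads see the same shared variable, each keeps a private partial result, and updates to the shared variable are protected much as an OpenMP critical section or reduction would protect them.

```python
import threading


def shared_sum(values, n_threads=4):
    total = [0]                  # shared variable, visible to every thread
    lock = threading.Lock()      # protects updates to the shared variable

    def work(thread_id):
        local = 0                # private variable, one copy per thread
        # Split the loop iterations cyclically between the threads.
        for i in range(thread_id, len(values), n_threads):
            local += values[i]
        with lock:               # only one thread updates the total at a time
            total[0] += local

    threads = [threading.Thread(target=work, args=(tid,)) for tid in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total[0]


if __name__ == "__main__":
    print(shared_sum(list(range(1, 101))))
```

Without the lock, two threads could read and write the shared total at the same time and lose an update, which is the kind of race condition that shared-memory programming has to guard against.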

Modern C++ for Computational Scientists

Course length: 2 days. Course level: intermediate.

Recent revisions to the C++ language and standard library have significantly changed the ways in which it is used. Used well, these features enable the programmer to write elegant, reusable, and portable code that runs efficiently on a variety of architectures.

However, it is still a very large and complex tool. This course will cover a minimal set of features to allow an experienced non-C++ programmer to get to grips with the language. These include overloading, templates, containers, iterators, lambdas, and standard algorithms. We will also briefly cover several important libraries for numerical computing.

The course is meant to appeal to programmers with experience in another language (e.g., C, Fortran, Java, Python); it is not an introduction to programming.

Reproducible computational environments using containers

Course length: 2 days. Course level: intermediate.

This course aims to introduce the use of containers with the goal of using them to create reproducible computational environments. Such environments are useful for ensuring reproducible research outputs and for simplifying the setup of complex software dependencies across different systems. We will primarily use Docker to illustrate the use of containers but will also briefly introduce Singularity, which is designed for use on multi-user systems (such as HPC resources). This course is aimed at researchers who have no (or very little) previous experience of using containers. Attendees are expected to have basic familiarity with using a command line interface such as bash or PowerShell.

Understanding Package Performance

Course length: 1 day. Course level: intermediate.

As parallel packages for computational science become more sophisticated, it becomes more difficult for a researcher to understand the most important factors that determine end-to-end productivity from initial input data to final result. Aspects such as file IO and data transfer can be just as important in practice as the performance and parallel scalability of the application itself. This one-day intermediate course will take a holistic approach and cover tools and techniques to help researchers to improve their overall scientific productivity on large-scale HPC systems.

 

Advanced (level 3) courses

Advanced MPI

Course length: 2 days. Course level: advanced.

This course is aimed at programmers seeking to deepen their understanding of MPI and explore some of its more advanced features. We cover topics including efficient use of non-blocking communications, combining MPI and OpenMP, single-sided MPI, and the new MPI memory model. We also look at performance aspects such as which MPI routines to use for scalability, overlapping communication and calculation, and MPI internal implementation issues.

Advanced OpenMP

Course length: 2 days. Course level: advanced.

OpenMP is the industry standard for shared-memory programming, which enables serial programs to be parallelised using compiler directives. This course is aimed at programmers seeking to deepen their understanding of OpenMP and explore some of its more recent and advanced features.

This two-day advanced course will cover topics including nested parallelism, OpenMP tasks, the OpenMP memory model, performance tuning, hybrid OpenMP + MPI, OpenMP implementations, and new features in OpenMP 5. Hands-on practical programming exercises make up a significant, and integral, part of this course.

Advanced use of Code_Saturne

Course length: 1 day. Course level: advanced.

Code_Saturne is free, open-source software developed and released by EDF for computational fluid dynamics (CFD) applications. This course will focus on the use of CFD for the prediction of fluid flow and heat transfer, including turbulence modelling, near-wall modelling, and conjugate heat transfer.

Advanced use of LAMMPS

Course length: 1 day. Course level: advanced.

LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a widely used classical molecular dynamics (MD) code. This C++ code is easy to use, incredibly versatile, and parallelised to run efficiently on both small-scale personal computers and HPC clusters. As of 2018, LAMMPS has been used, to some degree, in over 14,000 publications in fields as varied as chemistry, physics, material science, and granular and lubricated-granular flow. This course will contain an in-depth discussion of the various packages LAMMPS offers and how to use them efficiently.

Efficient use of the HPE Cray EX System

Course length: 3 days. Course level: advanced.

HPE Cray’s supercomputer platforms are an advanced pairing of software and hardware that offer HPC application developers and users the opportunity for excellent scaling and high productivity. This workshop, provided by ARCHER2 staff from HPE Cray and EPCC, offers instruction and insight into using the advanced tools available for analysing and optimising applications on the new Cray EX architecture. All practical exercises will be done on the ARCHER2 system.

Parallel Performance Analysis using Scalasca

Course length: 2 days. Course level: advanced.

Current and future supercomputing architectures face a dramatic growth of parallelism and heterogeneity on multiple levels. As a result, it is almost impossible for code developers to predict which parts of their code will perform well, which development decisions impact scalability, which choices of data structures are reasonable for a specific architecture, etc. Most decisions are based upon experience, intuition, and a limited understanding of the code’s performance.

To get a better understanding of code performance and to guide performance engineering, it is essential for computational scientists and engineers to conduct measurements in order to study code performance in detail. Performance analysis tools, a generalisation of the classic profiler, are the best tools to obtain this insight. However, they themselves require a certain level of understanding, experience, and expertise to be used productively, which adds to the complexity of the underlying problem. This two-day advanced workshop introduces the Scalasca tool and provides hands-on training on how to use it in practice on large-scale HPC applications.

Performance Optimisation on AMD EPYC

Course length: 2 days. Course level: advanced.

This two-day advanced course covers the system-specific features of the ARCHER2 processing units. It includes a detailed overview of the AMD EPYC processors and Cray-provided systems software and performance tools. It is ideal both for users familiar with existing Cray supercomputers and for those porting from alternative platforms.

 

Virtual Tutorials and Webinars

The ARCHER2 virtual tutorials and webinars cover a wide range of topics and levels, from talks on research using ARCHER2 and HPC in general, through technical talks of interest to users, to more general talks on areas such as diversity and inclusion. We also actively seek feedback from the community on potential topics and presenters for virtual tutorials, so if you have any ideas, please let us know at support@archer2.ac.uk.

Most sessions last around an hour, with 40 minutes of presentation followed by 20 minutes of questions and discussion.