[Figure: diagram of user groups and courses. Groups: Researchers (novice package user without HPC expertise; experienced package user without HPC expertise; experienced package user with HPC expertise), Data Science (data scientist), RSEs / Developers (novice developer without HPC expertise; novice developer with HPC expertise; experienced developer with HPC expertise). Course labels: Package Use on..., Data Science on..., Development on..., Intermediate and Advanced Courses, Virtual Tutorials.]

Outline Course Descriptions

Introductory (level 1) courses

Intermediate (level 2) courses

Advanced (level 3) courses

 

Introductory (level 1) courses

 

Data Carpentry

Course length: 2 days. Course level: introductory.

In many domains of research, the rapid generation of large amounts of data is fundamentally changing how research is done. The deluge of data presents great opportunities, but also many challenges in managing, analysing and sharing data. Data Carpentry aims to teach the skills that will enable researchers to be more effective and productive. The course is designed for learners with little to no prior knowledge of programming, shell scripting, or command line tools.

 

HPC Carpentry

Course length: 2 days. Course level: introductory.

This course provides an introduction to High Performance Computing (HPC). After completing this course, participants will:

 

Software Carpentry

Course length: 2 days. Course level: introductory.

Software Carpentry’s goal is to help scientists and engineers become more productive by teaching them basic computing skills like program design, version control, testing, and task automation. In this two-day workshop, short tutorials will alternate with hands-on practical exercises. Participants will be encouraged both to help one another, and to apply what they have learned to their own research problems during and between sessions.

 

Package Use on ARCHER2

Course length: 1 day. Course level: introductory.

This course covers the efficient use of pre-installed research software packages on ARCHER2, including the essentials of the ARCHER2 service and how the pre-installed packages can be used. It will run both online and face-to-face, and both formats include practical exercises.

 

Data Science on ARCHER2

Course length: 1 day. Course level: introductory.

This course will cover the essentials of ARCHER2, the basic use of core data science packages (e.g. R, Pandas), and data handling best practice.

 

Development on ARCHER2

Course length: 2 days. Course level: introductory.

This course will cover the ARCHER2 application development environment, core parallel and scientific software libraries, and the available debugging and profiling tools. The course will be available both online and face-to-face to suit the needs of attendees.

 

Reproducible computational environments using containers

Course length: 2 days. Course level: introductory.

This course introduces containers with the goal of using them to create reproducible computational environments. Such environments are useful for ensuring reproducible research outputs and for simplifying the setup of complex software dependencies across different systems. We will primarily use Docker to illustrate the use of containers but will also briefly introduce Singularity, which is designed for use on multi-user systems (such as HPC resources). This course is aimed at researchers who have no (or very little) previous experience of using containers. Attendees are expected to have basic familiarity with using a command line interface such as bash or PowerShell.

 

Intermediate (level 2) courses

 

Understanding Package Performance

Course length: 1 day. Course level: intermediate.

As parallel packages for computational science become more sophisticated, it becomes more difficult for a researcher to understand the most important factors that determine end-to-end productivity from initial input data to final result. Aspects such as file IO and data transfer can be just as important in practice as the performance and parallel scalability of the application itself. This course will take a holistic approach and cover tools and techniques to help researchers to improve their overall scientific productivity on large-scale HPC systems.

 

Data Analysis using Python

Course length: 1 day. Course level: intermediate.

Data Analytics, Data Science and Big Data are just a few of the many terms used in business and academic research, all referring to the manipulation, processing and analysis of data. Fundamentally, these are all concerned with the extraction of knowledge from data that can be used for competitive advantage or to provide scientific insight. In recent years, this area has undergone a revolution in which HPC has been a key driver. This course provides an overview of data science and the analytical techniques that form its basis, and explores how HPC provides the power that has driven their adoption. The course will cover: key data analytical techniques such as classification, optimisation, and unsupervised learning; key parallel patterns, such as Map Reduce, for implementing analytical techniques; relevant HPC and data infrastructures; case studies from academia and business.

 

Data Analytics with HPC

Course length: 2 days. Course level: intermediate.

Data Analytics, Data Science and Big Data are just a few of the many terms used in business and academic research, all referring to the manipulation, processing and analysis of data. Fundamentally, these are all concerned with the extraction of knowledge from data that can be used for competitive advantage or to provide scientific insight. In recent years, this area has undergone a revolution in which HPC has been a key driver. This course provides an overview of data science and the analytical techniques that form its basis, and explores how HPC provides the power that has driven their adoption. The course will cover: key data analytical techniques such as classification, optimisation, and unsupervised learning; key parallel patterns, such as Map Reduce, for implementing analytical techniques; relevant HPC and data infrastructures; case studies from academia and business.
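
As a rough illustration of the Map Reduce pattern named above, the sketch below counts words in two steps: a map step that processes each chunk of input independently, and a reduce step that merges the partial results. It is not course material; the function names (map_chunk, merge_counts, word_count) are hypothetical, and the map step runs serially here although each chunk could in principle be handled by a separate worker.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk of the input independently."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(lines, n_chunks=2):
    """Split the input into chunks, map each chunk, then reduce the partial counts."""
    chunks = [lines[i::n_chunks] for i in range(n_chunks)]
    # Each chunk is independent, so this loop is the part that could run in parallel.
    partial = [map_chunk(chunk) for chunk in chunks]
    return reduce(merge_counts, partial, Counter())


if __name__ == "__main__":
    sample = ["big data big compute", "data science", "big science"]
    print(word_count(sample))
```

Because the reduce step only needs an operation that merges two partial results, the final counts do not depend on how the input was split into chunks.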

 

Message Passing Programming with MPI

Course length: 2 days. Course level: intermediate.

The world’s largest supercomputers are used almost exclusively to run applications which are parallelised using Message Passing. This course covers all the basic knowledge required to write parallel programs using this programming model, and is directly applicable to almost every parallel computer architecture.

Parallel programming by definition involves co-operation between processors to solve a common problem. The programmer has to define the tasks that will be executed by the processors, and also how these tasks are to synchronise and exchange data with one another. In the message-passing model the tasks are separate processes that communicate and synchronise by explicitly sending each other messages. All these parallel operations are performed via calls to some message-passing interface that is entirely responsible for interfacing with the physical communication network linking the actual processors together. This course uses the de facto standard for message passing, the Message Passing Interface (MPI). It covers point-to-point communication, non-blocking operations, derived datatypes, virtual topologies, collective communication and general design issues.
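
The message-passing model described above can be sketched in a few lines. This is not MPI: it uses Python threads and queues from the standard library as stand-ins for the separate processes and the communication network, and the names (worker, parallel_sum) are hypothetical. What it does show is the key idea that tasks share no working variables and cooperate only through explicit sends and receives, roughly the role that point-to-point send and receive calls play in MPI.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive a chunk of numbers, sum it, and send the result back."""
    chunk = inbox.get()              # blocking receive from the coordinator
    outbox.put((rank, sum(chunk)))   # explicit send of the partial result


def parallel_sum(data, n_workers=3):
    """Distribute `data` to workers by message, then collect their partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    for rank, inbox in enumerate(inboxes):
        inbox.put(data[rank::n_workers])   # send each worker its share
    # Receive exactly one reply per worker; arrival order is not guaranteed.
    partial = dict(results.get() for _ in range(n_workers))
    for t in threads:
        t.join()
    return sum(partial.values())


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

Tagging each reply with the sender's rank means the coordinator does not have to rely on the order in which messages arrive.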

The course is taught using a variety of methods including formal lectures, practical exercises, programming examples and informal tutorial discussions. This enables lecture material to be supported by the tutored practical sessions in order to reinforce the key concepts.

 

Shared Memory Programming with OpenMP

Course length: 2 days. Course level: intermediate.

Almost all modern computers now have a shared-memory architecture with multiple CPUs connected to the same physical memory, for example multicore laptops or large multi-processor compute servers. This course covers OpenMP, the industry standard for shared-memory programming, which enables serial programs to be parallelised easily using compiler directives. Users of desktop machines can use OpenMP on its own to improve program performance by running on multiple cores; users of parallel supercomputers can use OpenMP in conjunction with MPI to better exploit the shared-memory capabilities of the compute nodes.

This course will cover an introduction to the fundamental concepts of the shared variables model, followed by the syntax and semantics of OpenMP and how it can be used to parallelise real programs. Hands-on practical programming exercises make up a significant, and integral, part of this course.
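
The shared variables model can be illustrated, loosely, without OpenMP itself. The sketch below uses Python threads as a stand-in: every thread reads the same list (shared data), keeps its own running sum (private data), and adds it to a shared total under a lock. OpenMP expresses the same ideas through compiler directives in C, C++ and Fortran, which Python cannot show; the function name shared_sum is hypothetical, and in standard CPython the global interpreter lock means this example demonstrates the model rather than any speed-up.

```python
import threading


def shared_sum(values, n_threads=4):
    """Sum `values` with several threads that all read the same shared list."""
    total = 0                  # shared variable, visible to every thread
    lock = threading.Lock()    # protects updates to the shared total

    def work(thread_id):
        nonlocal total
        local = 0              # private to this thread
        # Cyclic distribution of loop iterations across threads.
        for i in range(thread_id, len(values), n_threads):
            local += values[i]
        with lock:             # only one thread at a time updates the total
            total += local

    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(shared_sum(list(range(1, 101))))
```

Accumulating into a private variable first and touching the shared total once per thread keeps the time spent holding the lock small.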

 

Advanced (level 3) courses

 

Efficient Parallel IO

Course length: 2 days. Course level: advanced.

One of the greatest challenges to running parallel applications on large numbers of processors is how to handle file IO: standard IO routines are not designed with parallelism in mind. Parallel file systems such as Lustre are optimised for large data transfers, and performance can be far from optimal if many files are opened at once.

The IO part of the MPI standard gives programmers access to efficient parallel IO in a portable fashion. However, there are a large number of different routines available and some can be difficult to use in practice. Despite its apparent complexity, MPI-IO adopts a very straightforward high-level model. If used correctly, almost all the complexities of aggregating data from multiple processes can be dealt with automatically by the library.

The first day of the course will cover the MPI-IO standard, developing IO routines for a regular domain decomposition example. It will also briefly cover higher-level standards such as HDF5 and NetCDF. The second day will concentrate on how to use the Lustre file system for best performance. Case studies from real codes will also be presented.

Although the course mainly uses the MPI-IO library and the Lustre parallel filesystem for specific examples, most of the IO concepts and performance considerations are applicable to almost any parallel system.
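
The high-level model behind parallel IO for a regular domain decomposition can be sketched without MPI: each process owns one contiguous block of a global array and writes it at the matching byte offset in a single shared file. The example below simulates the ranks with a loop and uses ordinary Python file operations, not MPI-IO; the names (block_range, write_shared_file, read_all) and the one-dimensional decomposition are assumptions made for illustration.

```python
import os
import struct
import tempfile


def block_range(rank, n_ranks, n_global):
    """Return (start, count) of the contiguous block owned by `rank`."""
    base, extra = divmod(n_global, n_ranks)
    count = base + (1 if rank < extra else 0)   # first `extra` ranks get one more
    start = rank * base + min(rank, extra)
    return start, count


def write_shared_file(path, global_data, n_ranks):
    """Each simulated rank writes only its own block at its own byte offset."""
    item = struct.calcsize("<d")                 # bytes per double
    with open(path, "wb") as f:
        f.truncate(len(global_data) * item)      # pre-size the shared file
    for rank in reversed(range(n_ranks)):        # write order does not matter
        start, count = block_range(rank, n_ranks, len(global_data))
        local = global_data[start:start + count]
        with open(path, "r+b") as f:
            f.seek(start * item)
            f.write(struct.pack(f"<{count}d", *local))


def read_all(path, n):
    """Read the whole file back as a list of n doubles."""
    with open(path, "rb") as f:
        return list(struct.unpack(f"<{n}d", f.read()))


if __name__ == "__main__":
    data = [float(i) for i in range(10)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "global.dat")
        write_shared_file(path, data, n_ranks=3)
        print(read_all(path, len(data)) == data)
```

The resulting file holds the global array in its natural order regardless of how many ranks wrote it, which is the property that lets the same file be read back on a different number of processes.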

 

Performance Optimisation on AMD EPYC

Course length: 2 days. Course level: advanced.

This course covers the system-specific features of the ARCHER2 processing units over two days. It includes a detailed overview of the AMD EPYC processors and the Cray-provided systems software and performance tools. It is ideal both for users familiar with existing Cray supercomputers and for those porting from other platforms.

 

Performance Analysis Workshop

Course length: 3 days. Course level: advanced.

Current and future supercomputing architectures face a dramatic growth of parallelism and heterogeneity on multiple levels. As a result, it is almost impossible for code developers to predict which parts of their code will perform well, which development decisions affect scalability, which choices of data structure are reasonable for a specific architecture, and so on. Most decisions are based upon experience, intuition and a limited understanding of the code’s performance.

To get a better understanding of code performance and to guide performance engineering, it is essential for computational scientists and engineers to conduct measurements in order to study code performance in detail. Performance analysis tools, a generalisation of the classic profiler, are the best tools to obtain this insight. However, they themselves require a certain level of understanding, experience and expertise to be used productively, which adds to the complexity of the underlying problem. This workshop introduces several performance analysis tools and provides hands-on training on how to use them in practice on large-scale HPC applications.
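
For readers unfamiliar with the "classic profiler" mentioned above, the following sketch shows the basic idea of measuring rather than guessing, using Python's standard cProfile and pstats modules. It is not one of the tools taught in the workshop, and the function names (slow_sum_of_squares, profile_call) are hypothetical.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """A deliberately simple workload to measure."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum_of_squares, 100_000)
    print(value)
    print(report)
```

The report lists how many times each function was called and how much time was spent in it, which is the kind of evidence that performance analysis tools extend to parallel, large-scale runs.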

 

Virtual Tutorials and Webinars

The ARCHER2 virtual tutorials and webinars cover a wide range of topics and levels, from talks on research using ARCHER2 and HPC in general, through technical talks of interest to users, to more general talks on areas such as diversity and inclusion. We also actively seek feedback from the community on potential topics and presenters for virtual tutorials; if you have any ideas, please let us know at support@archer2.ac.uk.

Most sessions last around an hour, with 40 minutes of presentation followed by 20 minutes of questions and discussion.