ARCHER2 Weekly Newsletter


Parallel Performance Analysis using Scalasca/Score-P on ARCHER2 (for CPUs and HIP on the AMD GPUs)

Edinburgh, 29 - 30 April 2024 09:30 - 17:00 GMT

Scalasca/Score-P is a portable, free and open-source software toolset that supports the performance optimisation of parallel programs by measuring and analysing their runtime behaviour. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronisation – and offers guidance in exploring their causes. Scalasca uses execution profiles and traces generated by the community-developed Score-P instrumentation and measurement infrastructure.

The tool has been specifically designed for use on large-scale systems, but is also well suited for small and medium-scale HPC platforms. The software is available for free download under the New BSD open-source license.

Scalasca/Score-P targets scientific and engineering applications based on the programming interfaces MPI, CUDA, HIP and OpenMP/OpenACC, including hybrid applications that combine these with kernel offload to GPU accelerators. Note that for the AMD GPUs on ARCHER2, only instrumentation of HIP is currently supported.

This in-person course, delivered by members of the development team, will cover how to use the tools in practice. Scalasca/Score-P is portable across HPC systems, but for this course practical exercises will be conducted on the UK National HPC Service ARCHER2 (an HPE/Cray EX system), including access to the AMD GPU nodes for profiling codes that use HIP. All attendees will be given accounts on ARCHER2 for the duration of the course. Although example parallel programs will be provided, attendees are encouraged to analyse the performance of their own applications.

Access to ARCHER2 will be available before the course starts so that attendees can port and build their applications. Those who are unfamiliar with GPU programming on ARCHER2 are encouraged to attend, or view the recordings of, recent ARCHER2 GPU online training courses, including “Introduction to GPU programming with HIP”.

Further details and registration

2024 Educational Award For Outstanding Contribution to Computational Science Education

The ACM SIGHPC Education Chapter is seeking nominations for candidates for the 2024 Educational Award For Outstanding Contribution to Computational Science Education. We are seeking candidates who have led projects or programs that have made significant contributions to computational science education defined broadly to include all disciplines and all education levels.

The award will be presented at SC24. The recipient will receive a $2,000 cash award and travel support to attend the SC24 conference. Nominations will include a statement endorsing the nominee and up to three letters of endorsement. Applications are due by Friday June 28, 2024, by end of day Anywhere on Earth. The chapter will choose at most one award winner and up to two honorable mentions.

More details, including the application forms and instructions

Questions concerning award eligibility and nominations.

Access to HPC call open

Deadline: 23 April 2024 4:00pm UK time

ARCHER2 access page

Full details of UKRI call

No TA form required at application stage

Introduction to GPU programming with HIP

Online, 18 - 19 April 2024 09:30 - 16:00

This short course will provide an introduction to GPU computing with HIP, aimed at scientific application programmers wishing to develop their own software. The course will give a background on the difference between CPU and GPU architectures as a prelude to introductory exercises in HIP programming. The course will discuss the execution of kernels and memory management, among other topics.

The course will not discuss programming with compiler directives, but it does provide a solid understanding of the underlying principles of the HIP model, which is useful for programmers ultimately wishing to make use of OpenMP or OpenACC. The course will not consider graphics programming, nor will it consider machine learning packages.

Note that the course is also appropriate for those wishing to use NVIDIA GPUs via the CUDA API, although we will not specifically use CUDA.

Further details and registration

Recently added known issues

The “Known Issues” page of the ARCHER2 Documentation https://docs.archer2.ac.uk/known-issues/ lists all currently open known issues, including a description of each issue, its symptoms and any workarounds.

  • When close to storage quota, jobs may slow down or produce corrupted files (Added: 2024-02-27). Where users are close to their user or project quotas on the work (Lustre) file systems, we have seen cases of the following behaviour:
    • Jobs run very slowly as IO slows down
    • IO calls seem to complete successfully but not all data is written (so output is corrupted)
    • No “disk quota exceeded” error is seen

If you see these symptoms (slower than expected performance or data corruption), check whether you are close to your storage quota (either user or project quota). If you are, you may be experiencing this issue. Either remove data to free up space or request more storage quota.
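
As a rough illustration of the quota check described above, the Python sketch below flags when usage is close to a limit. The function names and the 90% threshold are assumptions made for this example, not ARCHER2 policy. The usage and limit figures would come from the file system's own quota reporting (on Lustre, typically the lfs quota command).

```python
# Minimal sketch: decide whether storage usage is "close" to a quota.
# ASSUMPTIONS: the helper names and the default 0.9 threshold are
# hypothetical and chosen for illustration only.


def quota_fraction(used_bytes: int, limit_bytes: int) -> float:
    """Return the fraction of the quota in use (0.0 if no limit is set)."""
    if limit_bytes <= 0:
        # A non-positive limit is treated here as "no quota enforced".
        return 0.0
    return used_bytes / limit_bytes


def near_quota(used_bytes: int, limit_bytes: int, threshold: float = 0.9) -> bool:
    """Return True when usage is at or above `threshold` of the limit."""
    return quota_fraction(used_bytes, limit_bytes) >= threshold


if __name__ == "__main__":
    # Example figures only; substitute the values reported for your
    # own user or project quota.
    used, limit = 950 * 1024**3, 1000 * 1024**3
    if near_quota(used, limit):
        print("Close to quota: remove data or request more storage.")
    else:
        print("Usage is comfortably below the quota.")
```

Running the same check against both the user and the project quota covers the two cases mentioned in the known issue.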

Upcoming ARCHER2 Training

Further details of upcoming training

We always welcome researchers wishing to present their work in a webinar - please contact the Service Desk if you would be interested in presenting your work.

Recordings of past courses

Recordings of past virtual tutorials