ARCHER2 Weekly Newsletter


Automated service monitoring in the deployment of ARCHER2

Online webinar, Wednesday 3rd August 2022 13:00 - 14:00 BST, Kieran Leach, EPCC

The ARCHER2 service, a CPU-based HPE Cray EX system with 750,080 cores (5,860 nodes), was deployed throughout 2020 and 2021 and went into full service in December 2021.

A key part of the work during this deployment was the integration of ARCHER2 into our local monitoring systems.

As ARCHER2 was one of the very first large-scale EX deployments, this involved close collaboration and development work with the HPE team during a global pandemic, when collaboration and co-working were significantly more challenging than usual. The deployment included the creation of automated checks and visual representations of system status, which needed to be made available to external parties for diagnosis and interpretation.

We will describe how these checks were deployed and how the data gathered played a key role in the deployment of ARCHER2, the commissioning of the plant infrastructure, the conduct of HPL runs for submission to the Top500, and the contractual monitoring of the availability of the ARCHER2 service during its commissioning and early life.

More information and join link: https://www.archer2.ac.uk/training/courses/220803-service-monitoring-vt/

Efficient Parallel IO

David Henty, EPCC, 23 August 2022 09:30 - 16:30 BST, Online

One of the greatest challenges to running parallel applications on large numbers of processors is how to handle file IO. Standard Unix IO routines are not designed with parallelism in mind, and IO overheads can grow to dominate the overall runtime. Parallel file systems are optimised for large volumes of data, but performance can be far from optimal if every process opens its own file or if all IO is funnelled through a single controller process.

This hands-on course explores a range of issues related to parallel IO. It uses ARCHER2 and its parallel Lustre file system as a platform for the exercises; however, almost all the IO concepts and performance considerations are applicable to any parallel system.

Full details and registration: https://www.archer2.ac.uk/training/#upcoming-training

Pre-announcement of the HPC-AI Advisory Council UK conference: 19/20 October 2022, Leicester.

The annual UK conference of the HPC-AI Advisory Council is back to being an in-person event. The 2022 conference will be in Leicester on 19/20 October.

The three coupled themes for the 2 days are:

  1. The importance of large-scale computing
  2. Sustainability and net-zero in large-scale computing
  3. The vital role of DRI professionals

The deadline for submitting contributed talk abstracts is August 17th.

Full details, including the registration and abstract submission forms, at https://www.hpcadvisorycouncil.com/events/2022/uk-conference/

Access to HPC Call

Access to HPC Call (EPSRC remit only) opened 4th July.
Applications may request ARCHER2 or Tier-2 computing resource for a maximum duration of one year, with a minimum of 4000 CU.
TA Deadline - 20th September 16:00
Submit Deadline - 18th October 16:00
More details and application forms: https://www.archer2.ac.uk/support-access/access#calls-for-archer2-time-only

Recently added Known Issues

The “Known Issues” page of the ARCHER2 Documentation (https://docs.archer2.ac.uk/known-issues/) lists all currently open known issues, including a description of each issue, its symptoms and any workarounds.

  • No recent issues

Upcoming ARCHER2 Training

  • Message-passing Programming with MPI, Online, always-open self-service course
  • Shared Memory Programming with OpenMP, Online, always-open self-service course
  • Automated service monitoring in the deployment of ARCHER2, Online webinar, Wednesday 3rd August 2022 13:00 - 14:00 BST
  • Performance of different routing protocols on ARCHER2: OpenFabrics and UCX - (Postponed from 20th July), Online webinar, Wednesday 17th August 2022 15:00 - 16:00 BST
  • Efficient Parallel IO, Online, 23 August 2022 09:30 - 16:30 BST
  • Introduction to OpenMP, Online, 30th & 31st August, 6th September 2022 09:00 - 17:00 BST
  • Debugging and Optimizing Parallel Codes with Arm Forge - Debugging and DDT, Online webinar, Wednesday 31st August 2022 15:00 - 16:00 BST
  • ARCHER2 for Software Developers, Online, 1 - 2 September 2022 10:00 - 16:00 BST
  • Debugging and Optimizing Parallel Codes with Arm Forge - Performance optimization, MAP, and PR, Online webinar, Wednesday 7th September 2022 15:00 - 16:00 BST
  • ARCHER2 for Package Users, Online, 13 October 2022 10:00 - 16:00 BST

Further details: https://www.archer2.ac.uk/training/#upcoming-training

Twitter: https://twitter.com/ARCHER2_HPC

Recordings of past courses and virtual tutorials can be found here: https://www.archer2.ac.uk/training/materials/