- Current System Load - Full System
- Service Alerts
- Maintenance Sessions
- System Status Mailings
- FAQ
- Usage statistics
Current System Load - Full System
The plot below shows the status of nodes on the current ARCHER2 Full System service. A description of each of the status types is provided below the plot.
- alloc: Nodes running user jobs
- idle: Nodes available for user jobs
- resv: Nodes in reservation and not available for standard user jobs
- plnd: Nodes planned for use by a future job. Pending jobs that can fit in the gap before the future job is due to start may run on these nodes (often referred to as backfilling).
- down, drain, maint, drng, comp, boot: Nodes unavailable for user jobs
- mix: Nodes in multiple states
Note: the long-running reservation visible in the plot corresponds to the short QoS, which is used to support small, short jobs with fast turnaround times.
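For users who want to check node states directly on the system, the Slurm `sinfo` command reports the same state codes. The commands below are a minimal sketch; they assume a standard Slurm installation and do not use any ARCHER2-specific partition names.

```bash
# Summarise node counts grouped by state (alloc, idle, resv, drain, ...)
sinfo --summarize

# Show the number of nodes in each state per partition
sinfo --format="%15P %10t %10D"

# List nodes that are down or drained, together with the recorded reason
sinfo -R
```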
Service Alerts
The ARCHER2 documentation also covers some Known Issues which users may encounter when using the system.
Status | Type | Start | End | Scope | User Impact | Reason |
---|---|---|---|---|---|---|
Ongoing | Service Alert | 2023-12-06 09:00 | 2023-12-06 10:00 | ARCHER2 login nodes | Users will now connect using an SSH key and passphrase and a time-based one-time password (TOTP); see the illustrative example below | Enhance security |
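For illustration only, a login under the new scheme looks something like the sketch below. The key path and login address are assumptions rather than details taken from this alert; see the ARCHER2 connecting documentation for the definitive instructions.

```bash
# Connect with an SSH key: you will be prompted first for the key passphrase and
# then for the time-based one-time password (TOTP) from your authenticator app.
# ~/.ssh/archer2_key and login.archer2.ac.uk are example values.
ssh -i ~/.ssh/archer2_key username@login.archer2.ac.uk
```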
Previous Service Alerts
This section lists resolved service alerts from the past 30 days. A full list of historical resolved service alerts is available.
Status | Type | Start | End | Scope | User Impact | Reason |
---|---|---|---|---|---|---|
Resolved | Service Alert | 2023-11-22 10:00 | 2023-11-22 10:10 | /work Lustre file systems | The change should take minutes and should not impact users | Change in Lustre configuration |
Resolved | Service Alert | 2023-11-06 09:20 | 2023-11-06 11:40 | Compute nodes | All running jobs have failed and no new jobs can start on the compute nodes | An external power event has caused all ARCHER2 compute nodes to be unavailable |
Resolved | Service change | 2022-12-12 09:45 | N/A | Compute nodes | Users may see changes in application performance | The default CPU frequency for parallel jobs started using `srun` has been changed to 2.0 GHz to improve the energy efficiency of ARCHER2. We recommend that users test the energy efficiency of their applications and set the CPU frequency appropriately; a hedged example is given below this table. |
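As a sketch of how to act on that recommendation, the CPU frequency can be requested per `srun` launch in a batch script. The 2.25 GHz value below is an assumed example rather than a recommendation from this page, and the usual `#SBATCH` options for account, partition and QoS are omitted.

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --time=01:00:00
# (account, partition and QoS options omitted from this sketch)

# Request a specific CPU frequency (in kHz) for this launch instead of the
# 2.0 GHz default; 2250000 kHz = 2.25 GHz is an assumed example value.
srun --cpu-freq=2250000 ./my_application

# Equivalently, set it once for every srun call in the script:
# export SLURM_CPU_FREQ_REQ=2250000
```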
Maintenance Sessions
This section lists recent and upcoming maintenance sessions. A full list of past maintenance sessions is available.
Status | Type | Start | End | Scope | User Impact | Reason |
---|---|---|---|---|---|---|
Planned | Partial | TBC | TBC | ARCHER2 scheduler | Running jobs will not be impacted. There will be several interruptions to the Slurm scheduler, during which users will not be able to submit new jobs and new jobs will not start. If you are affected, wait a few minutes and then resubmit the job. | Updating the Slurm configuration |
Planned | Full | TBC | TBC | ARCHER2 | Users will be unable to connect to ARCHER2 and will not have access to data on ARCHER2 | Integrating the GPU nodes into ARCHER2 |
System Status Mailings
If you would like to receive email notifications about system issues and outages, please subscribe to the System Status Notifications mailing list via SAFE.
FAQ
Usage statistics
This section contains data on ARCHER2 usage for Oct 2023. Access to historical usage data is available at the end of the section.
Usage by job size and length
Queue length data
The colour indicates the scheduling coefficient, which is computed as [run time] / ([run time] + [queue time]). A scheduling coefficient of 1 indicates that the job spent no time queuing, while a coefficient of 0.5 means the job spent as long queuing as it did running. For example, a job that queued for 2 hours and then ran for 6 hours has a scheduling coefficient of 6 / (6 + 2) = 0.75.
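As an illustrative sketch (not part of the published statistics), you can estimate the same coefficient for your own completed jobs from Slurm accounting data. This assumes `sacct` access and GNU `date`; the date range matches the Oct 2023 reporting period.

```bash
# Scheduling coefficient per job: run time / (run time + queue time),
# where queue time = Start - Submit and run time = End - Start.
sacct -u "$USER" -X --state=COMPLETED \
      --starttime=2023-10-01 --endtime=2023-11-01 \
      --format=JobID,Submit,Start,End --parsable2 --noheader |
while IFS='|' read -r jobid submit start end; do
  queue=$(( $(date -d "$start" +%s) - $(date -d "$submit" +%s) ))
  run=$(( $(date -d "$end" +%s) - $(date -d "$start" +%s) ))
  [ $(( queue + run )) -gt 0 ] || continue
  awk -v job="$jobid" -v q="$queue" -v r="$run" \
      'BEGIN { printf "%s %.2f\n", job, r / (r + q) }'
done
```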
Software usage data
The plot and table below show percentage use and job step size statistics for different software on ARCHER2 for Oct 2023. This data is also available as a CSV file.
This table shows job step size statistics in cores weighted by usage, total number of job steps and percent usage broken down by different software for Oct 2023.
Software | Min (cores) | Q1 (cores) | Median (cores) | Q3 (cores) | Max (cores) | Jobs | Node hours | % Use | Users | Projects |
---|---|---|---|---|---|---|---|---|---|---|
Overall | 1 | 512.0 | 1331.0 | 12288.0 | 524288 | 1306427 | 3628968.1 | 100.0 | 798 | 122 |
Unknown | 1 | 560.0 | 1536.0 | 8192.0 | 225280 | 260520 | 583508.6 | 16.1 | 370 | 93 |
VASP | 1 | 512.0 | 640.0 | 1024.0 | 8192 | 115764 | 565846.7 | 15.6 | 142 | 13 |
LAMMPS | 1 | 1280.0 | 38400.0 | 51200.0 | 131072 | 8992 | 358759.9 | 9.9 | 61 | 21 |
SENGA | 32 | 15936.0 | 18432.0 | 24576.0 | 24576 | 153 | 290786.8 | 8.0 | 5 | 4 |
Met Office UM | 1 | 1024.0 | 1152.0 | 6165.0 | 12544 | 13985 | 277626.7 | 7.7 | 48 | 6 |
CP2K | 2 | 128.0 | 512.0 | 1024.0 | 5376 | 25451 | 247288.1 | 6.8 | 56 | 11 |
GROMACS | 1 | 1000.0 | 1056.0 | 3072.0 | 7938 | 18100 | 180772.2 | 5.0 | 34 | 6 |
Xcompact3d | 64 | 8192.0 | 8192.0 | 131072.0 | 524288 | 339 | 172369.4 | 4.7 | 17 | 9 |
CASTEP | 1 | 250.0 | 768.0 | 1024.0 | 10240 | 197511 | 127354.9 | 3.5 | 47 | 5 |
Nektar++ | 128 | 3840.0 | 6144.0 | 12800.0 | 15360 | 366 | 73298.3 | 2.0 | 8 | 2 |
GENE | 1 | 8192.0 | 8192.0 | 20480.0 | 40960 | 189 | 67140.5 | 1.9 | 5 | 4 |
ONETEP | 1 | 128.0 | 128.0 | 224.0 | 1024 | 1686 | 66641.3 | 1.8 | 7 | 1 |
MITgcm | 1 | 240.0 | 624.0 | 624.0 | 900 | 28234 | 54792.9 | 1.5 | 11 | 3 |
OpenFOAM | 1 | 512.0 | 1152.0 | 2048.0 | 8192 | 1993 | 50155.3 | 1.4 | 30 | 16 |
3DNS | 950 | 11600.0 | 30512.0 | 50217.0 | 50217 | 35 | 44141.5 | 1.2 | 2 | 1 |
NEMO | 1 | 1568.0 | 6528.0 | 6528.0 | 7232 | 16089 | 40191.5 | 1.1 | 19 | 4 |
FHI aims | 32 | 128.0 | 128.0 | 512.0 | 4096 | 9781 | 36408.8 | 1.0 | 17 | 3 |
ChemShell | 1 | 512.0 | 768.0 | 1152.0 | 5504 | 1020 | 36377.1 | 1.0 | 13 | 5 |
OpenSBLI | 4096 | 16384.0 | 131072.0 | 131072.0 | 131072 | 78 | 35782.0 | 1.0 | 2 | 2 |
Quantum Espresso | 1 | 512.0 | 1024.0 | 1024.0 | 2048 | 2982 | 34210.4 | 0.9 | 19 | 4 |
ptau3d | 8 | 400.0 | 400.0 | 400.0 | 400 | 48 | 31648.5 | 0.9 | 2 | 1 |
EPOCH | 8 | 2560.0 | 3840.0 | 11520.0 | 11520 | 708 | 29525.8 | 0.8 | 7 | 1 |
Code_Saturne | 128 | 4000.0 | 4096.0 | 4096.0 | 16384 | 157 | 28756.9 | 0.8 | 4 | 2 |
CRYSTAL | 96 | 128.0 | 608.0 | 1024.0 | 3072 | 1730 | 28489.8 | 0.8 | 9 | 2 |
Python | 1 | 512.0 | 1280.0 | 16384.0 | 16384 | 553502 | 27061.2 | 0.7 | 54 | 24 |
CASINO | 128 | 2048.0 | 8192.0 | 8192.0 | 16384 | 208 | 24903.3 | 0.7 | 2 | 1 |
EDAMAME | 64 | 1331.0 | 1331.0 | 1331.0 | 3375 | 140 | 20811.5 | 0.6 | 2 | 1 |
a.out | 1 | 2560.0 | 2560.0 | 2560.0 | 10240 | 851 | 19157.1 | 0.5 | 13 | 10 |
TPLS | 128 | 512.0 | 512.0 | 2048.0 | 2048 | 185 | 17357.1 | 0.5 | 3 | 2 |
RMT | 4 | 352.0 | 352.0 | 1536.0 | 2560 | 318 | 13080.7 | 0.4 | 5 | 1 |
GS2 | 1 | 2304.0 | 2304.0 | 2304.0 | 2304 | 37846 | 7721.2 | 0.2 | 5 | 2 |
NWChem | 4 | 256.0 | 512.0 | 1024.0 | 2560 | 638 | 7564.2 | 0.2 | 11 | 6 |
HYDRA | 1 | 2560.0 | 3840.0 | 3840.0 | 3840 | 577 | 6920.8 | 0.2 | 9 | 4 |
Nek5000 | 1280 | 2560.0 | 2560.0 | 2560.0 | 2560 | 276 | 6221.2 | 0.2 | 2 | 2 |
Smilei | 8 | 256.0 | 256.0 | 256.0 | 256 | 128 | 4135.2 | 0.1 | 3 | 1 |
Cluster | 1024 | 1024.0 | 1024.0 | 1024.0 | 1024 | 31 | 3399.5 | 0.1 | 1 | 1 |
CESM | 64 | 256.0 | 256.0 | 256.0 | 1024 | 629 | 3305.1 | 0.1 | 17 | 2 |
iIMB | 2304 | 2304.0 | 2304.0 | 3072.0 | 3840 | 18 | 2917.2 | 0.1 | 1 | 1 |
NAMD | 512 | 512.0 | 512.0 | 512.0 | 512 | 20 | 613.0 | 0.0 | 1 | 1 |
SBLI | 1 | 256.0 | 256.0 | 512.0 | 512 | 222 | 564.8 | 0.0 | 3 | 2 |
SIESTA | 1 | 128.0 | 1280.0 | 1536.0 | 1536 | 4245 | 525.5 | 0.0 | 5 | 2 |
WRF | 64 | 128.0 | 256.0 | 256.0 | 512 | 39 | 400.3 | 0.0 | 3 | 2 |
HANDE | 1 | 4.0 | 8.0 | 16.0 | 16 | 395 | 245.4 | 0.0 | 1 | 1 |
DL_POLY | 8 | 4096.0 | 16384.0 | 16384.0 | 16384 | 70 | 61.7 | 0.0 | 3 | 2 |
Arm Forge | 4 | 1536.0 | 1536.0 | 1536.0 | 2048 | 104 | 59.6 | 0.0 | 6 | 5 |
HemeLB | 10 | 768.0 | 768.0 | 768.0 | 768 | 3 | 26.3 | 0.0 | 1 | 1 |
DL_MESO | 64 | 64.0 | 64.0 | 128.0 | 128 | 9 | 21.3 | 0.0 | 1 | 1 |
AxiSEM3D | 48 | 48.0 | 48.0 | 48.0 | 48 | 18 | 14.8 | 0.0 | 1 | 1 |
SDPB | 2 | 63.0 | 63.0 | 64.0 | 64 | 41 | 6.4 | 0.0 | 1 | 1 |
FVCOM | 1 | 1.0 | 1.0 | 2.0 | 2 | 3 | 0.0 | 0.0 | 1 | 1 |