Patching and Pausing Jobs

On HPC1, use squeue to see which jobs are running on which nodes:

squeue --long
Tue May 27 13:16:59 2025
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
              8957   compute a1748355  azeezoe  PENDING       0:00 1-00:00:00      1 (Dependency)
              8958   compute a1748355  azeezoe  PENDING       0:00  20:00:00      1 (Dependency)
              8955   compute a1748355  azeezoe  RUNNING    1:25:26 7-00:00:00      1 hpc2
              8956   compute a1748355  azeezoe  RUNNING    1:26:41 7-00:00:00      1 hpc2

Take note of what is running where. In this example, suppose hpc2 is about to undergo scheduled maintenance. Suspend the jobs running on it:

scontrol suspend 8955,8956
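When a node is running many jobs, copying IDs by hand is error-prone. The following is a minimal sketch, not a site-approved tool, that builds the comma-separated job list from the `squeue --long` listing shown above. The function name `running_jobs_on_node` is hypothetical. It assumes the column layout shown in the example (STATE in the fifth column, node name in the last) and matches only jobs whose node list is exactly the given node, so multi-node jobs listed as a range such as `hpc[1-2]` are not picked up.

```shell
# Hypothetical helper: reads `squeue --long` output on stdin and prints a
# comma-separated list of RUNNING job IDs whose node list is exactly $1.
running_jobs_on_node() {
    awk -v node="$1" '
        # Column 5 is STATE and the last column is NODELIST(REASON).
        # The date line and the header line never have RUNNING in column 5.
        $5 == "RUNNING" && $NF == node {
            ids = ids (ids ? "," : "") $1
        }
        END { if (ids) print ids }
    '
}
```

Possible usage, checking the list before acting on it (the function prints nothing if no jobs match, so confirm the list is non-empty first):

    squeue --long | running_jobs_on_node hpc2
    scontrol suspend "$(squeue --long | running_jobs_on_node hpc2)"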

Once hpc2 is back online, resume the jobs:

scontrol resume 8955,8956