Introduction to Slurm
Ismael Fernández Pavón
HPC Support
13 / 12 / 2022
What is Slurm?
Resource Manager
Job Scheduler
Basic interaction
• Simple Linux Utility for Resource Management (the historic name), or simply Slurm.
Cluster manager and job scheduler
system for large and small Linux
clusters.
What is Slurm?
• Allocates access to resources for some duration of time.
• Provides a framework for starting, executing, and
monitoring work (normally a parallel job).
• Arbitrates contention for resources by managing a queue
of pending work.
What is Slurm?
Resource managers          Schedulers
ALPS (Cray), Torque        Maui, Moab
Both: LoadLeveler (IBM), LSF, Slurm, PBS Pro
What is Slurm?
✓ Open source
✓ Fault-tolerant
✓ Highly scalable
✓ Almost everywhere
What is Slurm?
Cluster: a collection of many separate servers (nodes), connected via a fast interconnect (Ethernet, InfiniBand…).
Slurm: Resource Management
Node: an individual computer, the building block of an HPC system, with CPUs (cores / threads) and possibly GPGPUs (GRES).
Nodes:
• pirineus[1-6]
• pirineus[7-50]
• pirineus[51-69]
• canigo[1,2]
• pirineusgpu[1-4]
• pirineusknl[1-4]
Slurm: Resource Management
Partitions:
• std
• std-fat
• mem
• gpu
• knl
• covid19
• exclusive
Partition: a logical group of nodes with common specs.
Slurm: Resource Management
Job: an allocation of resources (cores, memory) assigned to a user for a specified amount of time.
Job attributes:
• ID (a number)
• Name
• Time limit
• Size specification
• Dependencies on other jobs
• State
Slurm: Resource Management
Job step: a set of (possibly parallel) tasks within a job, using part of the job's allocated cores and memory.
Job step attributes:
• ID (a number)
• Name
• Time limit
• Size specification
Slurm: Resource Management
When the cluster is full: job scheduling time!
Slurm: Resource Management
Scheduling: the process of determining the next job to run and on which resources.
• FIFO Scheduling
• Backfill Scheduling
  • Job priority
  • Time limit (Important!)
[Diagram: jobs placed on a resources-vs-time chart.]
Slurm: Job Scheduling
Backfill Scheduling:
• Example: a new, lower-priority job is submitted.
[Diagram sequence: jobs on a resources-vs-time chart, each shown with its elapsed time and time limit. In one scenario the new job must wait 7 time units before it can start; in the other, backfill fits it into an idle gap ahead of higher-priority work without delaying that work, so it waits only 1.]
Slurm: Job Scheduling
Backfill Scheduling:
• Starts with job priority:
Job_priority =
    site_factor +
    (PriorityWeightQOS) * (QOS_factor) +
    (PriorityWeightPartition) * (partition_factor) +
    (PriorityWeightFairshare) * (fair-share_factor) +
    (PriorityWeightAge) * (age_factor) +
    (PriorityWeightJobSize) * (job_size_factor) +
    (PriorityWeightAssoc) * (assoc_factor) +
    SUM(TRES_weight_<type> * TRES_factor_<type>…) −
    nice_factor
The PriorityWeight* terms are fixed (site configuration), the *_factor terms are dynamic (computed per job), and the nice factor is set by the user.
Slurm: Job Scheduling
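To see how these weights and factors combine for a real queued job, Slurm's sprio command prints the per-factor breakdown; a minimal sketch, using the deck's placeholder style:
$ sprio -j <jobid>            # per-factor priority breakdown for one job
$ sprio -u $USER --long       # long format for all of your pending jobs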
Backfill Scheduling:
• Priority factor: QoS:
• Account’s Priority:
− Normal
− Low
• RES users:
− class_a
− class_b
− class_c
Slurm: Job Scheduling
Backfill Scheduling:
• Priority factor: Fairshare:
• It depends on:
• Consumption.
• Resources requested.
Slurm: Job Scheduling
Backfill Scheduling:
• Priority factor: Age:
• Priority increases the longer the job waits in the queue.
• Max 7 days.
• Not valid for dependent
jobs!
Slurm: Job Scheduling
Backfill Scheduling:
• Priority factor: Job size:
• Bigger jobs get higher priority.
• Based ONLY on resources requested, NOT on time.
Slurm: Job Scheduling
• Login:
  ssh -p 2122 <user>@hpc.csuc.cat
• Transfer files:
  scp -P 2122 <local_file> <user>@hpc.csuc.cat:<remote_path>
• Storage:
  Name                         Variable                   Availability  Lifetime  Backup
  /home/<user>                 $HOME                      Global        -
  /scratch/<user>              -                          Global        30 d
  /scratch/<user>/tmp/<jobid>  $SCRATCH / $SHAREDSCRATCH  Global        7 d
  /tmp/<user>/<jobid>          $SCRATCH / $LOCALSCRATCH   Node          Job
Cheatsheet
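For larger transfers, rsync over the same SSH port can resume interrupted copies; a small sketch, assuming rsync is installed and using hypothetical paths and file names:
$ rsync -av -e "ssh -p 2122" ./inputs/ <user>@hpc.csuc.cat:/scratch/<user>/inputs/
$ scp -P 2122 <user>@hpc.csuc.cat:<remote_path>/results.tar.gz .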
Basic: General information
• Linux commands:
Command            Description
pwd                Show current path
ls                 List the current folder's files
cd <path>          Change directory
mkdir <dir>        Create directory
cp <file> <new>    Copy
mv <file> <new>    Move
rm <file>          Remove file
man <command>      Show manual
CTRL-c             Stop current command
CTRL-r             Search history
!!                 Repeat last command
grep <p> <f>       Search for patterns in files
tail <file>        Show last 10 lines
head <file>        Show first 10 lines
cat <file>         Print file content
touch <file>       Create an empty file
Cheatsheet
$ vim submit_file.slm
Job life cycle: PENDING (CONFIGURING) → RUNNING → COMPLETING → COMPLETED
Basic: Jobs
Batch job:
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Workflow: submit → execute → obtain results (a single job, jobA, throughout).
Basic: Jobs
Batch job:
Slurm directives
-J <name>      Name of the job
-o <file>      Job's std output file
-e <file>      Job's std error file
-p <part>      Partition requested
-n <#tasks>    Number of tasks
-c <#cpus>     Number of CPUs per task
-t <time>      Time limit (mm, mm:ss, hh:mm:ss, dd-hh, dd-hh:mm)
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
Batch job:
Defaults
• std:      1 core,   3900 MB / core
• std-fat:  1 core,   7900 MB / core
• mem:      1 core,  23900 MB / core
• gpu:      24 cores, 3900 MB / core, 1 GPGPU
• knl:      whole node
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
Batch job:
• First: load the required modules
• Second: copy inputs to $SCRATCH and change the working directory
• Third: run the application
• Fourth: copy the outputs back
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
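Putting the pieces together, a filled-in version of the template above; the job name, module, application and file names are hypothetical, and the std partition, 16 tasks and 2-hour limit are just example values:
#!/bin/bash
#SBATCH -J mysim                 # hypothetical job name
#SBATCH -o mysim_%j.out          # %j expands to the job ID
#SBATCH -e mysim_%j.err
#SBATCH -p std                   # standard partition
#SBATCH -n 16                    # 16 tasks
#SBATCH -c 1                     # 1 CPU per task
#SBATCH -t 02:00:00              # 2 hours (hh:mm:ss)
module load mysim                # hypothetical module name
cp -r input/ ${SCRATCH}          # stage inputs to the scratch area
cd ${SCRATCH}
srun mysim input/case.in > case.out
cp case.out ${SLURM_SUBMIT_DIR}  # copy results back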
Batch job: Steps
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
srun <application>
srun <application> &
srun <application> &
wait
cp -r <output> ${SLURM_SUBMIT_DIR}
Workflow: submit jobA → execute its steps jobA.1, jobA.2, jobA.3 … jobA.N (possibly in parallel) → obtain the results.
Basic: Jobs
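When steps run concurrently, each srun should claim only its share of the allocation so the steps do not compete for the same CPUs; a sketch assuming a 16-task job and a hypothetical solver binary (--exact needs a recent Slurm release; older releases use --exclusive on the step for the same effect):
srun -n 8 --exact ./solver case_A &   # step 0: 8 of the 16 tasks
srun -n 8 --exact ./solver case_B &   # step 1: the other 8 tasks
wait                                  # wait for both steps to finish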
Batch job: Arrays
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file_%A_%a.out
#SBATCH -e error_file_%A_%a.err
#SBATCH --array=0-4
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Workflow: submit jobA → execute the array tasks jobA_1, jobA_2, jobA_3 … jobA_N independently → obtain one set of outputs per task.
Basic: Jobs
Batch job: Arrays
• Example: Job 50
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file_%A_%a.out
#SBATCH -e error_file_%A_%a.err
#SBATCH --array=0-4
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
First array task (index 0):         Second array task (index 1):
SLURM_JOB_ID            50          SLURM_JOB_ID            51
SLURM_ARRAY_JOB_ID      50          SLURM_ARRAY_JOB_ID      50
SLURM_ARRAY_TASK_ID     0           SLURM_ARRAY_TASK_ID     1
SLURM_ARRAY_TASK_COUNT  5           SLURM_ARRAY_TASK_COUNT  5
SLURM_ARRAY_TASK_MAX    4           SLURM_ARRAY_TASK_MAX    4
SLURM_ARRAY_TASK_MIN    0           SLURM_ARRAY_TASK_MIN    0
…
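Each array task normally uses SLURM_ARRAY_TASK_ID to select its own input; a one-line sketch with hypothetical file names:
<application> input_${SLURM_ARRAY_TASK_ID}.dat > result_${SLURM_ARRAY_TASK_ID}.out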
Batch job: Dependency
[Diagram: a dependent pipeline. Pre-processing jobs (jobA, jobB … jobX) → ok? → analysis jobs (jobL, jobM … jobN) → ok? → verification.]
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
#SBATCH --dependency=afterok:<jid>
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
Batch job: Dependency
• after:<jobid>[:<jobid>...]
  Begin after the specified jobs have started.
• afterok:<jobid>[:<jobid>...]
  Begin after the specified jobs have completed successfully.
• afterany:<jobid>[:<jobid>...]
  Begin after the specified jobs have terminated.
• afternotok:<jobid>[:<jobid>...]
  Begin after the specified jobs have failed.
• singleton
  Begin after all earlier jobs with the same name and user have ended.
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
#SBATCH --dependency=afterok:<jid>
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
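A convenient way to build such a pipeline is to capture each job ID at submission time with sbatch --parsable and pass it to the next stage; a sketch with hypothetical script names:
pre_id=$(sbatch --parsable pre.slm)                                      # pre-processing
ana_id=$(sbatch --parsable --dependency=afterok:${pre_id} analysis.slm)  # runs only if pre-processing succeeds
sbatch --dependency=afterok:${ana_id} verification.slm                   # runs only if analysis succeeds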
Job state diagram:
SUBMISSION → PENDING (CONFIGURING) → RUNNING → COMPLETING → COMPLETED
• HOLD / RELEASE move a job between PENDING and HELD; REQUEUE and RESIZE are also possible.
• Other final states: CANCELLED, TIMEOUT, FAILED, OUT OF MEMORY, NODE FAIL, SPECIAL EXIT.
Basic: Jobs
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
std* up infinite 19 mix pirineus[7,34-35,37-40,45]
std* up infinite 25 alloc pirineus[8,11-17,19-33,36,41-44,46-50]
std-fat up infinite 1 mix pirineus45
std-fat up infinite 5 alloc pirineus[46-50]
gpu up infinite 4 idle~ pirineusgpu[1-4]
knl up infinite 4 idle~ pirineusknl[1-4]
mem up infinite 2 mix canigo[1-2]
$ sinfo
Basic: Jobs
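sinfo can also be limited to one partition or list per-node details; two small examples:
$ sinfo -p std        # state of the std partition only
$ sinfo -N -l         # one line per node, long format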
+-----------+-------------+-----------------+--------------+------------+
| MACHINE | TOTAL SLOTS | ALLOCATED SLOTS | QUEUED SLOTS | OCCUPATION |
+-----------+-------------+-----------------+--------------+------------+
| std nodes | 1536 | 1468 | 2212 | 95 % |
| fat nodes | 288 | 144 | 0 | 50 % |
| mem nodes | 96 | 96 | 289 | 100 % |
| gpu nodes | 144 | 96 | 252 | 66 % |
| knl nodes | 816 | 0 | 0 | 0 % |
| res nodes | 672 | 648 | 1200 | 96 % |
+-----------+-------------+-----------------+--------------+------------+
$ system-status
Basic: Jobs
Submitted batch job 1720189
$ sbatch <file>
Basic: Jobs
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user PD 0:00 1 (Resources)
$ squeue -u <username>
• Priority: One or more higher priority jobs exist for this partition or advanced reservation.
• Dependency: The job is waiting for a dependent job to complete.
Basic: Jobs
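For a job stuck in PENDING, squeue can also report the scheduler's estimated start time:
$ squeue -u <username> --start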
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user R 1-03:44 1 pirineus27
$ squeue -j 1720189
$ sstat -aj 1720189 --format=jobid,nodelist,mincpu,maxrss,pids
JobID Nodelist MinCPU MaxRSS Pids
------------ ---------------- ------------- ---------- ----------------------
1720189.ext+ pirineus27 226474
1720189.bat+ pirineus27 00:00.000 7348K 226491,226526,226528
1720189.0 pirineus27 1-03:44:05 19171808K 226557,226577
Basic: Jobs
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user CG 2-15:56 1 pirineus27
$ squeue -j 1720189
• Move files from $LOCALSCRATCH to $SHAREDSCRATCH.
Basic: Jobs
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ------------ --------
1720189 test std account 16 COMPLETED 0:0
1720189.bat+ batch account 16 COMPLETED 0:0
1720189.ext+ extern account 16 COMPLETED 0:0
1720189.0 pre account 16 COMPLETED 0:0
1720189.1 process account 16 COMPLETED 0:0
1720189.2 post account 16 COMPLETED 0:0
$ sacct
• Final states include COMPLETED (CD), TIMEOUT (TO), OUT_OF_MEMORY (OOM), FAILED (F), NODE_FAIL (NF)…
Basic: Jobs
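sacct accepts a --format list to pull out the fields of interest for a finished job, for example:
$ sacct -j 1720189 --format=JobID,JobName,Elapsed,MaxRSS,State,ExitCode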
CANCELLED
$ scancel 1720189
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ------------ --------
1720189 test std account 16 CANCELLED+ 0:0
1720189.bat+ batch account 16 CANCELLED+ 0:0
1720189.ext+ extern account 16 COMPLETED 0:0
1720189.0 pre account 16 CANCELLED+ 0:0
$ sacct
Basic: Jobs
• Choose the most suitable partition.
• Use $SCRATCH as the working directory.
• Move only the necessary files.
• Keep important files in $HOME.
Basic: Best practices
Questions?
Thank you for your attention!
Feedback – ismael.fernandez@csuc.cat
Support – https://hpc.csuc.cat
