Introduction to Slurm
Ismael Fernández Pavón
16 / 12 / 2021
Outline:
• What is Slurm?
• Resource Manager
• Job Scheduler
• Basic interaction
What is Slurm?
A cluster manager and job scheduler for large and small Linux clusters.
• Allocates access to resources for some duration of time.
• Provides a framework for starting, executing, and monitoring work (normally a parallel job).
• Arbitrates contention for resources by managing a queue of pending work.
What is Slurm?
• The name “Slurm” is preferred, not SLURM or any other variation.
• Historically an acronym: Simple Linux Utility for Resource Management.
• The all-capitals form dates from the earlier days of the software, when Slurm was just a resource manager.
What is Slurm?
[Diagram: resource managers vs. schedulers. ALPS (Cray) and Torque are resource managers; Maui and Moab are schedulers; LoadLeveler (IBM), LSF, PBS Pro and Slurm act as both.]
Slurm is:
✓ Open source
✓ Fault-tolerant
✓ Highly scalable
Slurm: Resource Management
Cluster:
A collection of many separate servers (nodes), connected via a fast interconnect (Ethernet, InfiniBand…).
Slurm: Resource Management
Node:
An individual computer component of an HPC system, providing CPUs (cores / threads) and, on some nodes, GPGPUs (GRES).
Nodes:
• pirineus[1-44]
• pirineus[45-50]
• canigo[1,2]
• pirineusgpu[1-4]
• pirineusknl[1-4]
Slurm: Resource Management
Partition:
A logical group of nodes with common specs.
Partitions:
• std
• std-fat
• mem
• gpu
• knl
• covid19
Slurm: Resource Management
Job:
An allocation of resources (cores, memory) assigned to a user for a specified amount of time. A job has:
• ID (a number)
• Name
• Time limit
• Size specification
• Dependencies on other jobs
• State
Slurm: Resource Management
Job step:
A set of (possibly parallel) tasks within a job, using cores and memory from the job's allocation. A step has:
• ID (a number)
• Name
• Time limit
• Size specification
Slurm: Resource Management
[Figure: the cluster completely full.]
Job scheduling time!
Slurm: Job Scheduling
Scheduling: the process of determining the next job to run and on which resources.
[Figure: jobs laid out on a resources-vs-time chart.]
• FIFO Scheduling
• Backfill Scheduling, which takes into account:
  − Job priority
  − Time limit (Important!)
Backfill Scheduling:
• E.g.: a new lower-priority job is submitted.
[Figure sequence: a resources-vs-time chart in which each job is drawn with its elapsed time and its time limit. Scheduled strictly in priority order, the new job would wait 7 time units; with backfilling it is slotted into an idle gap and starts after a wait of only 1, without delaying any higher-priority job.]
Slurm: Job Scheduling
Backfill Scheduling:
• Starts with job priority:

Job_priority =
      site_factor
    + (PriorityWeightQOS)       * (QOS_factor)
    + (PriorityWeightPartition) * (partition_factor)
    + (PriorityWeightFairshare) * (fair-share_factor)
    + (PriorityWeightAge)       * (age_factor)
    + (PriorityWeightJobSize)   * (job_size_factor)
    + (PriorityWeightAssoc)     * (assoc_factor)
    + SUM(TRES_weight_<type> * TRES_factor_<type> …)
    − nice_factor

The PriorityWeight* terms are fixed values set in the cluster configuration, the *_factor terms are dynamic values computed per job, and nice_factor is a user-defined value.
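To see how these weights and factors combine for jobs that are actually queued, Slurm ships the sprio command (not shown in these slides); a minimal usage sketch, reusing the job ID from the later examples:

$ sprio                 # priority breakdown for all pending jobs
$ sprio -j 1720189      # breakdown for a single job
$ sprio -u <username>   # breakdown for one user's pending jobs
$ sprio -w              # show the configured PriorityWeight* values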
Slurm: Job Scheduling
Backfill Scheduling:
• Priority factors:
QoS:
• Account's priority: Normal or Low.
• RES users: class_a, class_b, class_c.
Fairshare:
• Depends on past consumption and on the resources requested.
Age:
• Priority increases the longer the job waits in the queue.
• Capped at 7 days.
• Not applied to dependent jobs!
Job size:
• Bigger jobs have higher priority.
• Based ONLY on resources, NOT on time.
Basic: General information
• Login:
$ ssh -p 2122 <user>@hpc.csuc.cat
• Transfer files:
$ scp -P 2122 <local_file> <user>@hpc.csuc.cat:<remote_path>
• Storage:
Name                         Variable                    Availability  Time limit  Backup
/home/<user>                 $HOME                       Global        -
/scratch/<user>              -                           Global        30 d
/scratch/<user>/tmp/<jobid>  $SCRATCH / $SHAREDSCRATCH   Global        7 d
/tmp/<user>/<jobid>          $SCRATCH / $LOCALSCRATCH    Node          Job
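For larger or repeated transfers, rsync (a standard tool, not covered in these slides) can reuse the same SSH port, assuming it is installed on both ends; the paths below are placeholders:

$ rsync -av -e "ssh -p 2122" <local_dir>/ <user>@hpc.csuc.cat:<remote_path>/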
Basic: General information
• Linux commands:
Command           Description
pwd               Show current path
ls                List current folder's files
cd <path>         Change directory
mkdir <dir>       Create directory
cp <file> <new>   Copy
mv <file> <new>   Move
rm <file>         Remove file
man <command>     Show manual
CTRL-c            Stop current command
CTRL-r            Search history
!!                Repeat last command
grep <p> <f>      Search for patterns in files
tail <file>       Show last 10 lines
head <file>       Show first 10 lines
cat <file>        Print file content
touch <file>      Create an empty file
Basic: Jobs
Batch job:
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
[Diagram: submit jobA → execute jobA → obtain the results.]
Basic: Jobs
Batch job:
Slurm directives
-J <name>    Name of the job
-o <file>    Job's standard output file
-e <file>    Job's standard error file
-p <part>    Partition requested
-n <#tasks>  Number of tasks
-c <#cpus>   Number of CPUs per task
-t <time>    Time limit (dd-hh:mm, hh:mm, mm)
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
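As a concrete illustration of the template above, a hypothetical run of a threaded application called my_app on the std partition might look as follows; the module, program and file names are placeholders, not actual site modules:

#!/bin/bash
#SBATCH -J my_app_run
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p std
#SBATCH -n 1
#SBATCH -c 4
#SBATCH -t 60

module load my_app                 # hypothetical module name
cp -r input ${SCRATCH}             # stage the inputs
cd ${SCRATCH}
my_app input/case.dat > case.log   # hypothetical application
cp case.log ${SLURM_SUBMIT_DIR}    # retrieve the results

It would be submitted with:

$ sbatch my_app.sbatch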
Basic: Jobs
Batch job:
Defaults
• std:     1 core, 3900 MB / core
• std-fat: 1 core, 7900 MB / core
• mem:     1 core, 23900 MB / core
• gpu:     24 cores, 3900 MB / core, 1 GPGPU
• knl:     the whole node
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
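When a job needs more than the default memory per core, it can be requested explicitly in the same header with the standard sbatch options; the value below is purely illustrative, and the partition limits still apply:

#SBATCH --mem-per-cpu=7800   # MB per allocated core (alternative: --mem=<size> for total memory per node)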
Basic: Jobs
Batch job:
• First: load the module.
• Second: copy the inputs to $SCRATCH and change the working directory.
• Third: run the application.
• Fourth: copy the outputs back.
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
Batch job: Steps
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
srun <application>
srun <application> &
srun <application> &
wait
cp -r <output> ${SLURM_SUBMIT_DIR}
[Diagram: submit jobA → execute steps jobA.1, jobA.2 … jobA.N → obtain the results.]
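Each srun line in the script launches one job step inside the job's allocation; backgrounded steps run concurrently and wait keeps the batch script alive until all of them finish. A sketch, assuming the job requested at least 8 tasks and a hypothetical program called solver:

srun -n 4 ./solver case1.inp &   # step 0: 4 tasks
srun -n 4 ./solver case2.inp &   # step 1: 4 more tasks, runs alongside step 0
wait                             # block until both steps have ended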
Basic: Jobs
Batch job: Arrays
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file_%A_%a.out
#SBATCH -e error_file_%A_%a.err
#SBATCH --array=0-4
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
[Diagram: submit jobA → array tasks jobA_1, jobA_2, jobA_3 … jobA_N execute independently → obtain one result per task.]
Basic: Jobs
Batch job: Arrays
• Example: Job 50
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file_%A_%a.out
#SBATCH -e error_file_%A_%a.err
#SBATCH --array=0-4
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Environment seen by the first array task (job 50):
SLURM_JOB_ID            50
SLURM_ARRAY_JOB_ID      50
SLURM_ARRAY_TASK_ID     0
SLURM_ARRAY_TASK_COUNT  5
SLURM_ARRAY_TASK_MAX    4
SLURM_ARRAY_TASK_MIN    0

Environment seen by the second array task (job 51):
SLURM_JOB_ID            51
SLURM_ARRAY_JOB_ID      50
SLURM_ARRAY_TASK_ID     1
SLURM_ARRAY_TASK_COUNT  5
SLURM_ARRAY_TASK_MAX    4
SLURM_ARRAY_TASK_MIN    0
…
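Inside the script, SLURM_ARRAY_TASK_ID is typically used to pick a different input per task; an illustrative line (the file names are placeholders):

<application> input_${SLURM_ARRAY_TASK_ID}.dat > result_${SLURM_ARRAY_TASK_ID}.out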
Batch job: Dependency
Basic: Jobs
[Diagram: a pipeline of dependent jobs: pre-processing jobs (jobA, jobB … jobX), an "ok?" check, analysis jobs (jobL, jobM … jobN), another "ok?" check, then verification.]
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
#SBATCH --dependency=afterok:<jid>
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
Basic: Jobs
Batch job: Dependency
• after:<jobid>[:<jobid>...]
The specified jobs have started.
• afterok:<jobid>[:<jobid>...]
The specified jobs have completed successfully (used in the example).
• afterany:<jobid>[:<jobid>...]
The specified jobs have terminated.
• afternotok:<jobid>[:<jobid>...]
The specified jobs have failed.
• singleton
All earlier jobs with the same name and user have ended.
#!/bin/bash
#SBATCH -J <job_name>
#SBATCH -o output_file.out
#SBATCH -e error_file.err
#SBATCH -p <partition>
#SBATCH -n <#tasks>
#SBATCH -c <#cpus_per_task>
#SBATCH -t 60
#SBATCH --dependency=afterok:<jid>
module load <module>
cp -r <input> ${SCRATCH}
cd ${SCRATCH}
<application>
cp -r <output> ${SLURM_SUBMIT_DIR}
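The job ID can be captured at submission time instead of typing it by hand: sbatch --parsable prints only the job ID, so a chain can be wired in two lines (the script names are hypothetical):

$ jid=$(sbatch --parsable preprocess.sbatch)
$ sbatch --dependency=afterok:${jid} analysis.sbatch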
[Diagram: job life cycle. SUBMISSION → PENDING (CONFIGURING) → RUNNING → COMPLETING → a final state: COMPLETED, TIMEOUT, FAIL, OUT OF MEMORY, NODE FAIL or SPECIAL EXIT. A job can also be CANCELLED, HELD (hold / release), REQUEUEd or RESIZEd along the way.]
Basic: Jobs
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
std*      up     infinite   19     mix    pirineus[7,34-35,37-40,45]
std*      up     infinite   25     alloc  pirineus[8,11-17,19-33,36,41-44,46-50]
std-fat   up     infinite   1      mix    pirineus45
std-fat   up     infinite   5      alloc  pirineus[46-50]
gpu       up     infinite   4      idle~  pirineusgpu[1-4]
knl       up     infinite   4      idle~  pirineusknl[1-4]
mem       up     infinite   2      mix    canigo[1-2]
Basic: Jobs
$ system-status
+-----------+-------------+-----------------+--------------+------------+
| MACHINE   | TOTAL SLOTS | ALLOCATED SLOTS | QUEUED SLOTS | OCCUPATION |
+-----------+-------------+-----------------+--------------+------------+
| std nodes | 1536        | 1468            | 2212         | 95 %       |
| fat nodes | 288         | 144             | 0            | 50 %       |
| mem nodes | 96          | 96              | 289          | 100 %      |
| gpu nodes | 144         | 96              | 252          | 66 %       |
| knl nodes | 816         | 0               | 0            | 0 %        |
| res nodes | 672         | 648             | 1200         | 96 %       |
+-----------+-------------+-----------------+--------------+------------+
Basic: Jobs
$ sbatch <file>
Submitted batch job 1720189
Basic: Jobs
$ sbatch <file>
Submitted batch job 1720189

$ squeue -u <username>
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user PD 0:00 1 (Resources)

Pending reasons include:
• Resources: the job is waiting for resources to become available.
• Priority: one or more higher priority jobs exist for this partition or advanced reservation.
• Dependency: the job is waiting for a dependent job to complete.
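While a job is still pending, squeue can also report the scheduler's estimated start time; the estimate only appears once the backfill scheduler has evaluated the job:

$ squeue -u <username> --start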
Basic: Jobs
$ squeue -j 1720189
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user R 1-03:44 1 pirineus27

$ sstat -aj 1720189 --format=jobid,nodelist,mincpu,maxrss,pids
JobID        Nodelist         MinCPU        MaxRSS     Pids
------------ ---------------- ------------- ---------- ----------------------
1720189.ext+ pirineus27                                226474
1720189.bat+ pirineus27       00:00.000     7348K      226491,226526,226528
1720189.0    pirineus27       1-03:44:05    19171808K  226557,226577
Basic: Jobs
$ squeue -j 1720189
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1720189 std test user CG 2-15:56 1 pirineus27
• Move files from $LOCALSCRATCH to $SHAREDSCRATCH.
Basic: Jobs
Slurm: Job Life
$ sacct
JobID        JobName    Partition  Account    AllocCPUS  State        ExitCode
------------ ---------- ---------- ---------- ---------- ------------ --------
1720189      test       std        account    16         COMPLETED    0:0
1720189.bat+ batch                 account    16         COMPLETED    0:0
1720189.ext+ extern                account    16         COMPLETED    0:0
1720189.0    pre                   account    16         COMPLETED    0:0
1720189.1    process               account    16         COMPLETED    0:0
1720189.2    post                  account    16         COMPLETED    0:0
• Final states include COMPLETED (CD), TIMEOUT (TO), OUT_OF_MEMORY (OOM), FAILED (F), NODE_FAIL (NF)…
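For finished jobs, sacct can also be asked for specific accounting fields, for example elapsed time and peak memory per step:

$ sacct -j 1720189 --format=JobID,JobName,Elapsed,MaxRSS,State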
$ scancel 1720189

$ sacct
JobID        JobName    Partition  Account    AllocCPUS  State        ExitCode
------------ ---------- ---------- ---------- ---------- ------------ --------
1720189      test       std        account    16         CANCELLED+   0:0
1720189.bat+ batch                 account    16         CANCELLED+   0:0
1720189.ext+ extern                account    16         COMPLETED    0:0
1720189.0    pre                   account    16         CANCELLED+   0:0
Basic: Jobs
Basic: Best practices
• Choose the most suitable partition.
• Use $SCRATCH as the working directory.
• Move only the necessary files.
• Keep important files in $HOME.
Questions?
Thank you for your attention!
feedback – ismael.fernandez@csuc.cat
support – http://hpc.csuc.cat
