HPC Exercises
Interacting with High Performance Computing Systems
Presented by: Virginia Trueheart, MSIS
Texas Advanced Computing Center
vtrueheart@tacc.utexas.edu
Logging In
• To access the TACC machines you will need to log in using a terminal or SSH client.
• SSH is an encrypted network protocol for accessing a secure system over an unsecured network.
• The following example uses the XSEDE single sign-on login.
• You can also log directly into the TACC machine, but your password may be different (see the example below).
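For reference, a direct login to Stampede2 (rather than going through the XSEDE single sign-on hub) would look something like the line below; this is a sketch that assumes you already have a TACC account and password, and uses the standard Stampede2 login hostname:

$ ssh <username>@stampede2.tacc.utexas.edu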
Logging In (Mac Terminal)
$ ssh -l <username> login.xsede.org
Please login to this system using your XSEDE username and password:
password:
Duo two-factor login for <username>
Enter a passcode or select one of the following options
1. Duo Push to XXX-XXX-XXXX
2. Phone call to XXX-XXX-XXXX
Passcode or option (1-2):
Logging In pt. 2
# For example, to login to the Comet system at SDSC, enter: gsissh comet
#
# Email help@xsede.org if you require assistance in the use of this system.
[username@ssohub ~]$ gsissh stampede2
Interacting with the System
After logging in you will see the TACC info box, which tells you which projects you are associated with and how much of the file system you have used.
Welcome to Stampede2, *please* read these important system notes:
--> Stampede2, Phase 2 Skylake nodes are now available for jobs
--> Stampede2 user documentation is available at:
https://portal.tacc.utexas.edu/user-guides/stampede2
----------------------- Project balances for user vtrue -----------------------
| Name Avail SUs Expires | |
| A-ccsc 189624 2018-12-31 | |
------------------------- Disk quotas for user vtrue --------------------------
| Disk Usage (GB) Limit %Used File Usage Limit %Used |
| /home1 1.9 10.0 19.43 39181 200000 19.59 |
| /work 311.8 1024.0 30.45 225008 3000000 7.50 |
| /scratch 0.0 0.0 0.00 4 0 0.00 |
-------------------------------------------------------------------------------
Creating a File
Command-line text editors can be very useful when interacting with the system, so let's use a very simple one (nano) to create a file that we can use to execute some code.
Create a File
login1.stampede2$ cd $WORK
login1.stampede2$ pwd
/work/03658/vtrue/stampede2
login1.stampede2$ nano helloWorld.py
Editing a File
• You should now be in the nano editing environment for the helloWorld.py file.
• Type in the code found on the next slide to create the contents of the file.
• Press Ctrl+X to exit, then type Y when prompted to save the file.
A Very Small File
#!/usr/bin/env python
"""
Hello World
"""
import datetime as DT
today = DT.datetime.today()
print "Hello World! Today is:"
print today.strftime("%d %b %Y")
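The code above uses Python 2 print statements, which matches the system python the deck assumes. If you are following along with a Python 3 interpreter instead, a minimal equivalent sketch (standard library only) would be:

#!/usr/bin/env python3
"""
Hello World (Python 3 variant of the example above)
"""
import datetime as DT

# Grab the current date and print it in day-month-year form,
# e.g. "07 Jun 2018"; print is a function in Python 3.
today = DT.datetime.today()
print("Hello World! Today is:")
print(today.strftime("%d %b %Y"))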
Executing Our File
• Running code on the login nodes is prohibited because they are a shared resource.
• To run the little code we have written, we first need to start an idev (interactive development) session.
helloWorld.py
staff.stampede2(1005)$ idev
-> Checking on the status of development queue. OK
-> Defaults file : ~/.idevrc
-> System : stampede2
-> Queue : development (idev default )
[...]
c455-012[knl](1019)$
helloWorld.py
staff.stampede2(1005)$ idev
-> Checking on the status of development queue. OK
-> Defaults file : ~/.idevrc
-> System : stampede2
-> Queue : development (idev default )
[...]
c455-012[knl](1019)$ python helloWorld.py
Hello World! Today is:
17 Jun 2018
c455-012[knl](1020)$
Do More with Our File
• Now that we have seen that helloWorld.py runs on the compute node and produces output, let's test the parallel aspects of running on a node, namely accessing all of the cores on the node.
• Type 'nano helloWorld.py' to reopen your file and begin editing it again. (The later examples run the parallel version as helloParallel.py, so you may prefer to save it under that name to match the commands.)
• Input the Python code you see on the next slide, then press Ctrl+X again to save your changes.
#!/usr/bin/env python
"""
Parallel Hello World
"""
from mpi4py import MPI
import sys
size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()
sys.stdout.write(
"Hello, World! I am process %d of %d on %s.n"
% (rank, size, name))
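On TACC systems this script is launched with ibrun, as shown on the next slide. If you want to sanity-check it elsewhere first (assuming you have an MPI installation and mpi4py available locally), the generic MPI launcher can stand in for ibrun; the task count of 4 here is arbitrary and the command is illustrative, not a TACC-specific one:

$ mpiexec -n 4 python helloWorld.py   # hypothetical local test; on Stampede2 use ibrun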
Running in Parallel
• Even though this is still Python code, we are now taking advantage of the parallel capabilities of the node.
• As such, you need to launch your code with 'ibrun' instead of just typing 'python'.
• When you run this code you will receive output from each core on the node.
Parallel helloWorld.py
c455-012[knl](1019)$ ibrun python helloParallel.py
TACC: Starting up job 1595632
TACC: Starting parallel tasks...
Hello, World! I am process 1 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 49 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 66 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 67 of 68 on c456-042.stampede2.tacc.utexas.edu.
Hello, World! I am process 64 of 68 on c456-042.stampede2.tacc.utexas.edu.
...
TACC: Shutdown complete. Exiting.
Exiting an idev Session
• Great! Now that you have seen what interactive jobs look like, we can move on to more advanced job submission.
• To leave an idev session, simply type 'exit'. Let's do that now.
Submitting a Job
• The previous two examples ran a job interactively, meaning you could be on the node and see the output generated as it happened.
• This isn't always practical, though, so we need a way to submit jobs and then leave them for the system to run whenever nodes become available.
• To do this we'll take advantage of the SLURM batch scheduler.
Create a New Nano File
• Create a new file called batchJob.sh
• Input the code found on the next slide and save the file.
An Example SLURM Batch File
#!/bin/bash
#SBATCH -J myJob # Job name
#SBATCH -o myJob.o%j # Name of stdout output file
#SBATCH -e myJob.e%j # Name of stderr error file
#SBATCH -p development # Queue (partition) name
#SBATCH -N 1 # Total # of nodes
#SBATCH -n 68 # Total # of mpi tasks
#SBATCH -t 00:05:00 # Run time (hh:mm:ss)
#SBATCH -A myproject # Allocation name (req'd if you have more than 1)
#SBATCH --mail-user=hkang@austin.utexas.edu
#SBATCH --mail-type=all # Send email at begin and end of job
# Other commands must follow all #SBATCH directives...
module list
pwd
date
# Launch code...
ibrun python helloParallel.py
Submitting the Batch Job
• To submit the SLURM batch job, use the following command:
• sbatch batchJob.sh
• This should print some text describing the parameters of the job and then provide you with a job ID once the job has been admitted to the queues (see the sketch below).
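As a minimal sketch of what the submission looks like, the standard SLURM confirmation line is shown below; the job ID is illustrative, and TACC systems print additional informational text (omitted here) before it:

login1.stampede2$ sbatch batchJob.sh
[...]
Submitted batch job 1604500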
Checking the Status of a Job
• Once a job is submitted to the queues, you can't see it running the way you could when running interactively.
• Instead, we use monitoring commands to see what state our job is in.
• The command 'squeue' is very useful for this and has several flags available to control its output. Let's use the -u flag to see what all of the jobs under our username are doing.
staff.stampede2(1009)$ squeue -u vtrue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1604426 development idv20717 vtrue R 16:57 1 c455-001
staff.stampede2(1010)$ scontrol show job=1604426
JobId=1604426 JobName=idv20717
UserId=vtrue(829572) GroupId=G-815499(815499) MCS_label=N/A
Priority=400 Nice=0 Account=A-ccsc QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:18:08 TimeLimit=00:30:00 TimeMin=N/A
SubmitTime=2018-06-09T21:27:33 EligibleTime=2018-06-09T21:27:33
StartTime=2018-06-09T21:27:36 EndTime=2018-06-09T21:57:36 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2018-06-09T21:27:36
...
Cleaning Up Jobs
• When your batch job has finished running, it will automatically be cleared from the queues.
• Your output will be in the folder you pointed it to within your batch job file.
• If for some reason you wish to cancel your job while it is still running, you can do so with 'scancel <jobID>' (see the example below).
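For example, using the job ID reported by squeue on the previous slide, cancelling that job would look like this (substitute whatever ID squeue shows for your own job):

$ scancel 1604426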

Editor's Notes

  1. Inputs at password and token will likely appear blank so type carefully
  2. Inputs at password and token will likely appear blank so type carefully
  3. Pay attention to your command prompt. Of course you can change this if you want, but many systems have a default that is designed to be helpful.
  4. Single processor
  5. Single node/task = one output. Shift + ZZ to save and exit; ls to see if the file was saved. We'll come back to this later when we start running some examples, but for now make sure it's saved and try to remember where you put it.
  6. Single processor per task (multithreaded) but not yet hyperthreaded. Great! Now you know how to run jobs interactively.
  7. “slurm batch”