Scaling Systems for Research Computing
adam@bioteam.net
1 Intro to BioTeam: Who, What, Why
2 The ‘Meta’ Issue: What is driving all of this?
3 Scalable Infrastructure
4 Scalable Software
5 Compliance
6 Q&A
Who, What, Why ...
BioTeam
‣ Independent consulting shop
‣ Staffed by scientists forced to
learn IT, SW & HPC to get our
own research done
‣ 10+ years bridging the “gap”
between science, IT & high
performance computing
‣ Our wide-ranging work is what
gets us invited to speak at
events like this ...
Bioinformatics and Big Iron
Culture
BioTeam
‣ We are a distributed company
• BioTeam is 100% REMOTE
• All employees are MANAGERS
• Workflow is mostly ASYNCHRONOUS
‣ Prefer small interdisciplinary TEAMS
• Value placed on TRUST and PERFORMANCE
Today
BioTeam
‣ 10 full-time employees in 2014
• 2 dedicated to HPC Infrastructure
• 2 dedicated to Software Development
• 1 dedicated to Products
• 1 dedicated to Government Services
• 1 dedicated to Cloud Computing
‣ 10+ years supporting Life Sciences Research
The ‘meta’ issue
Science is changing faster than
IT infrastructure
Cloud Computing
Amazon vs. Other Clouds
‣ AWS has by far the most useful IaaS building
blocks today
• First choice for most Bio-IT use cases
‣ AWS quietly rolls out killer features
• Spot Market
• Virtual Private Cloud
‣ Provider decision may be based on where your
data actually resides
Real world simulation project
Massive resources and APIs galore
Google
‣ Google started with PaaS and worked down
‣ Google Exacycle for Visiting Faculty (closed)
• 1 billion core hours on demand; what’s next?	
‣ Google is DEVELOPER centric; everything has
an API
‣ Culture is based on Science and Engineering
Tools and Techniques
Devops
Configuration Management
‣ Required in almost every cloud project
‣ Chef/Puppet/Ansible/Fabric
• Domain specific languages; Agent-based versus SSH; Abstraction
‣ Key is reducing institutionalized knowledge and sharing
recipes
‣ Docker/LXC could be disruptive
• Lightweight differential images; not very HPC friendly at this point
‣ Orchestration tools lagging behind provisioning and
configuration
‣ Best techniques are making their way back into HPC
Devops
open-source cluster computing toolkit
MIT StarCluster
‣ Ideal for most HPC use cases
• Includes Grid Engine, NFS, and MPI
• NEW Support for Virtual Private Cloud!
‣ Works with Spot Instances
‣ Extensible via plugins
• Hadoop
• HTCondor
• GlusterFS
• IPython Notebook
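The plugin mechanism can be sketched as follows. StarCluster plugins are Python classes derived from `ClusterSetup`, and their `run` method is called once the cluster is up. The `PackageInstaller` class, the `apt-get` command, and the fallback base class used when StarCluster is not installed are illustrative assumptions, not something shown in the talk.

```python
# Sketch of a StarCluster plugin. The fallback base class is an assumption
# that lets the file be imported where StarCluster is not installed.
try:
    from starcluster.clustersetup import ClusterSetup
except Exception:
    class ClusterSetup(object):
        pass


class PackageInstaller(ClusterSetup):
    """Hypothetical plugin: install one package on every node."""

    def __init__(self, pkg_to_install):
        self.pkg_to_install = pkg_to_install

    def run(self, nodes, master, user, user_shell, volumes):
        # StarCluster calls run() after the cluster has been configured.
        for node in nodes:
            node.ssh.execute("apt-get -y install %s" % self.pkg_to_install)
```

A plugin like this would typically be registered in a `[plugin ...]` section of the StarCluster config file through its `SETUP_CLASS` setting and then listed under the cluster template's `PLUGINS` entry.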
Private Clouds
Where is your datacenter?
Private Cloud
AWS Regions
Public Cloud
Google Datacenters
Public Cloud
Scalable Software
In modern processors and coprocessors
Types of Parallelism
‣ Instruction level
• Micro-architectural techniques such as pipelined execution, out-of-order/in-order execution, superscalar execution, branch prediction…
‣ Vector level
• Using SIMD vector processing instructions for SSE, AVX, Phi
‣ Thread level
• Multi-core architectures with or without Hyper-Threading
• Many-core architecture with smart round-robin hardware multithreading
‣ Node level
• Distributed computing
• Cluster computing
Fully functional multi-thread execution unit
Intel Xeon Phi Coprocessor
‣ 50+ cores with a ring interconnect
‣ 64-bit addressing
‣ Scalar unit based on Intel Pentium family
‣ Vector unit 512-bit SIMD Instructions
‣ 4 hardware threads per core
‣ Highly Parallel device
‣ SMP on-a-chip
Choices
Programming Xeon Phi
‣ Offloaded
• Pragma/directives based
• Better serial processing
• More memory
• Better file access
• Makes full use of available resources
‣ Native
• Simpler programming model
• Quicker to test key kernels
• Some constraints: memory availability, file I/O access
Mapping with Burrows-Wheeler Aligner (BWA)
Intel Optimization Example
[Chart: relative performance: Xeon (baseline) 1, Xeon (optimized) 1.24, Xeon + Phi 1.86]
‣ Replace pthreads with OpenMP
‣ Better load balancing
‣ Overlap I/O and Compute
‣ Better thread usage
‣ Efficient memory allocation
‣ Vectorized performance-critical
loops
‣ Data prefetch to reduce memory
latency
Source: Life Sciences Optimization - Intel - SC13
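To read relative-performance figures like the ones on this slide, it can help to convert each speedup into the share of baseline runtime that is saved. A small sketch, assuming the chart values (1, 1.24 and 1.86) are speedups over the unoptimized Xeon run; the function name is illustrative.

```python
def runtime_fraction(speedup):
    """Fraction of baseline wall-clock time left after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Assumed pairing of chart labels and values (baseline = 1).
speedups = {
    "Xeon (baseline)": 1.0,
    "Xeon (optimized)": 1.24,
    "Xeon + Phi": 1.86,
}

for label, s in speedups.items():
    saved = 1.0 - runtime_fraction(s)
    print("%-18s %.2fx -> %.0f%% less runtime" % (label, s, saved * 100))
```

Under that reading, the optimized Xeon build saves roughly a fifth of the runtime, and adding the Phi saves a little under half.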
Protein sequence analysis with MPI-HMMER
Intel Optimization Example
[Chart: relative performance: Xeon 1, Xeon + Phi 1.56]
‣ No source code changes required
‣ Use #pragma unroll to improve
loop performance
‣ Double nested loop in Viterbi
algorithm is auto-vectorized for
Xeon and Phi by Intel compilers
Source: Life Sciences Optimization - Intel - SC13
Assembly with Velour
Intel Optimization Example
‣ Intel and UIUC released open-source
alternative to velveth
‣ > 10x reduction in memory usage
• Intelligently caching portions of assembly to disk
• 700GB to 60GB
‣ https://github.com/jjcook/velour
‣ Cook, Jeffrey J. 2011. Scaling short read de novo
DNA sequence assembly to gigabase genomes.
Recommendations
Programming Xeon Phi
‣ Host can have multiple Phi cards
‣ MKL libraries are pre-optimized
‣ OpenMP is applicable to multi-core and many-
core programming
• Offload with #pragma offload target(mic)
‣ MPI supports distributed computation and
combines with other models
• OpenMP within nodes and MPI between nodes
‣ Xeon optimizations translate well to Phi
In the Life Sciences
Parallel Programming
‣ Targets: CPU, Coprocessors, GPU, FPGA, ASIC
‣ There is no silver bullet
‣ Problem decomposition is the most critical step
‣ Think in parallel
‣ Using Intel compilers can yield ~30% speedup
in many cases
• VTune and other analysis tools are available
‣ Must optimize at one or more levels
Recommendations
Parallel Programming	
‣ Leaving performance on the table
• Low hanging fruit; splitting input files into parts
• Avoid languages with a poor concurrency model (e.g. a global interpreter lock, GIL)
‣ Exploit thread-level parallelism
• Use multi-threading and multi-processing to fully utilize multicore
processors
‣ Use Intel’s Auto-Vectorizing compiler
• Take advantage of SIMD parallelism and wider vectors on Phi
‣ Prepare for a heterogeneous many-core future
• Hybrid Programming (OpenMP + MPI)
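The "low hanging fruit" of splitting input into parts can be sketched as below: chunk the records, hand each chunk to a separate worker process, and combine the partial results. Python is used only for brevity, and processes rather than threads are used so that the GIL does not serialize the work. The function names and the GC-counting workload are illustrative assumptions.

```python
from concurrent.futures import ProcessPoolExecutor


def split_records(records, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    if n_parts < 1:
        raise ValueError("n_parts must be >= 1")
    size, extra = divmod(len(records), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def count_gc(chunk):
    """Per-chunk work: count G and C bases across all sequences."""
    return sum(seq.upper().count("G") + seq.upper().count("C") for seq in chunk)


def parallel_gc(records, workers=4):
    """Process each chunk in its own process and sum the partial counts."""
    chunks = split_records(records, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_gc, chunks))


if __name__ == "__main__":
    reads = ["ACGT", "GGCC", "ATAT", "CGCG", "TTTT"]
    print(parallel_gc(reads, workers=2))
```

The same split, map and combine pattern applies when the parts are separate files submitted as independent cluster jobs.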
Platforms
Parallel Programming
‣ Intel Distribution for Apache Hadoop
• Enhances open-source Hadoop on Xeon processors
• More efficient; faster startup times
• Management tools
‣ Intel Enterprise Edition Lustre
• Enhances open-source Lustre
• REST API
• Hadoop Adapter
A fresh approach to technical computing
I <3 Julia
‣ Homoiconic; Dynamic type
system
‣ Designed for parallelism and
distributed computation
‣ MATLAB-like syntax and
extensive math library
‣ Call C functions directly
‣ Call Python functions
‣ IJulia Notebook
‣ Open Source
Compliance
Overview
Compliance
‣ Need a compliance apparatus
‣ Often a barrier to competition
‣ Compute and Storage are easy
• Policy and procedures are harder
‣ AWS and Google will now sign a BAA (Business Associate Agreement)
Strategy
Compliance
‣ The keys are protecting data and preventing unauthorized access
‣ Data management - points of control
‣ Encrypt data in flight and at rest
• Use S3 server-side encryption
• Google Persistent Disks are automatically encrypted
‣ Use credential rotation policies
‣ Lock down security groups and firewalls
‣ Use VPN for all public connections
‣ Log everything and audit often
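As one concrete instance of encrypting data at rest, S3 server-side encryption can be requested per object at upload time. A minimal sketch, assuming the present-day `boto3` client (which postdates this talk); the helper names are hypothetical, and the upload itself needs `boto3` and valid AWS credentials.

```python
def encrypted_put_args(bucket, key, body):
    """Build put_object arguments that request S3 server-side encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "AES256",  # SSE with S3-managed keys
    }


def upload_encrypted(bucket, key, body):
    """Upload one object with SSE enabled (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without it

    s3 = boto3.client("s3")
    return s3.put_object(**encrypted_put_args(bucket, key, body))
```

Building the arguments in one place makes it harder for an individual upload path to forget the encryption setting.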
http://biote.am/storage
ACK
http://bioteam.net
http://psc.edu
http://software.intel.com/en-us/mic-developer
http://julialang.org

 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

Scaling systems for research computing

  • 1. Scaling Systems for Research Computing adam@bioteam.net
  • 2. (1) Intro to BioTeam: Who, What, Why (2) The ‘Meta’ Issue: What is driving all of this? (3) Scalable Infrastructure (4) Scalable Software (5) Compliance (6) Q&A
  • 3. Who, What, Why ... BioTeam ‣ Independent consulting shop ‣ Staffed by scientists forced to learn IT, SW & HPC to get our own research done ‣ 10+ years bridging the “gap” between science, IT & high performance computing ‣ Our wide-ranging work is what gets us invited to speak at events like this ...
  • 5. Culture BioTeam ‣ We are a distributed company • BioTeam is 100% REMOTE • All employees are MANAGERS • Workflow is mostly ASYNCHRONOUS ‣ Prefer small interdisciplinary TEAMS • Value placed on TRUST and PERFORMANCE
  • 6. Today BioTeam ‣ 10 full-time employees in 2014 • 2 dedicated to HPC Infrastructure • 2 dedicated to Software Development • 1 dedicated to Products • 1 dedicated to Government Services • 1 dedicated to Cloud Computing ‣ 10+ years supporting Life Sciences Research
  • 8. Science is changing faster than IT infrastructure
  • 10. Amazon vs. Other Clouds ‣ AWS has by far the most useful IaaS building blocks today • First choice for most Bio-IT use cases ‣ AWS quietly rolls out killer features • Spot Market • Virtual Private Cloud ‣ Provider decision may be based on where your data actually resides
  • 12. Massive resources and API’s galore Google ‣ Google started with PaaS and worked down ‣ Google Exacycle for Visiting Faculty (closed) • 1 billion core hours on demand; what’s next? ‣ Google is DEVELOPER centric; everything has an API ‣ Culture is based on Science and Engineering
  • 14. Devops Configuration Management ‣ Required in almost every cloud project ‣ Chef/Puppet/Ansible/Fabric • Domain-specific languages; Agent-based versus SSH; Abstraction ‣ Key is reducing institutionalized knowledge and sharing recipes ‣ Docker/LXC could be disruptive • Lightweight differential images; not very HPC friendly at this point ‣ Orchestration tools lagging behind provisioning and configuration ‣ Best techniques are making their way back into HPC
  • 16. open-source cluster computing toolkit MIT StarCluster ‣ Ideal for most HPC use cases • Includes Grid Engine, NFS, and MPI • NEW Support for Virtual Private Cloud! ‣ Works with Spot Instances ‣ Extensible via plugins • Hadoop • HTCondor • GlusterFS • IPython Notebook
  • 18. Where is your datacenter? Private Cloud
  • 22. In modern processors and coprocessors Types of Parallelism ‣ Instruction Level: micro-architectural techniques such as pipelined execution, out-of/in-order execution, super-scalar execution, branch prediction… ‣ Vector Level: using SIMD vector processing instructions for SSE, AVX, Phi ‣ Thread Level: multi-core architectures with or without Hyper-Threading; many-core architecture with smart round robin hardware multithreading ‣ Node Level: distributed computing; cluster computing
  • 23. Fully functional multi-thread execution unit Intel Xeon Phi Coprocessor ‣ 50+ cores with a ring interconnect ‣ 64-bit addressing ‣ Scalar unit based on Intel Pentium family ‣ Vector unit 512-bit SIMD Instructions ‣ 4 hardware threads per core ‣ Highly Parallel device ‣ SMP on-a-chip
  • 24. Choices Programming Xeon Phi ‣ Offloaded • Pragma/directives based • Better serial processing • More memory • Better file access • Makes full use of available resources ‣ Native • Simpler programming model • Quicker to test key kernels • Some constraints: memory availability, file I/O access
  • 25. Mapping with Burrows-Wheeler Aligner (BWA) Intel Optimization Example [Bar chart: Xeon (baseline), Xeon (optimized), Xeon + Phi; plotted values 1.86, 1.24, 1] ‣ Replace pthreads with OpenMP ‣ Better load balancing ‣ Overlap I/O and Compute ‣ Better thread usage ‣ Efficient memory allocation ‣ Vectorized performance critical loops ‣ Data prefetch to reduce memory latency Source: Life Sciences Optimization - Intel - SC13
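The "Overlap I/O and Compute" item on slide 25 can be illustrated with a small producer/consumer sketch. This is not BWA's code or Intel's change; the function names (`read_chunks`, `gc_fraction`, `process_overlapped`) and the toy GC-content workload are assumptions made for illustration. A background thread reads batches of lines into a bounded queue while the main thread computes on batches that have already arrived.

```python
import io
import queue
import threading

_DONE = object()  # sentinel marking the end of the input stream


def read_chunks(stream, chunk_lines, out_queue):
    """Producer: read `chunk_lines` lines at a time and enqueue each batch."""
    batch = []
    for line in stream:
        batch.append(line.strip())
        if len(batch) == chunk_lines:
            out_queue.put(batch)
            batch = []
    if batch:
        out_queue.put(batch)
    out_queue.put(_DONE)


def gc_fraction(seq):
    """Toy compute step (hypothetical workload): fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def process_overlapped(stream, chunk_lines=2, max_pending=4):
    """Consumer: compute on each batch while the reader fetches the next."""
    # Bounded queue, so the reader cannot run arbitrarily far ahead of compute.
    pending = queue.Queue(maxsize=max_pending)
    reader = threading.Thread(
        target=read_chunks, args=(stream, chunk_lines, pending)
    )
    reader.start()
    results = []
    while True:
        batch = pending.get()
        if batch is _DONE:
            break
        results.extend(gc_fraction(seq) for seq in batch)
    reader.join()
    return results


if __name__ == "__main__":
    reads = io.StringIO("GGCC\nATAT\nGCAT\n")
    print(process_overlapped(reads))
```

The bounded queue is the design choice that matters here: it caps memory use while still letting the next read proceed during the current computation.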
  • 26. Protein sequence analysis with MPI-HMMER Intel Optimization Example [Bar chart: Xeon, Xeon + Phi; plotted values 1.56, 1] ‣ No source code changes required ‣ Use #pragma unroll to improve loop performance ‣ Double nested loop in Viterbi algorithm is auto-vectorized for Xeon and Phi by Intel compilers Source: Life Sciences Optimization - Intel - SC13
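For readers unfamiliar with the loop nest mentioned on slide 26, here is a minimal Viterbi sketch for a toy two-state HMM. It is not MPI-HMMER's code: HMMER uses a profile HMM with a more constrained topology, and the state names and probabilities below are made-up illustration values. The outer loop walks sequence positions and the inner loop walks model states, which is the kind of nest a vectorizing compiler targets in a C implementation.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_probability, best_state_path) for an observation sequence."""
    # best[t][s]: probability of the most likely path ending in state s at step t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that most likely path

    for t in range(1, len(observations)):  # outer loop: sequence positions
        best.append({})
        back.append({})
        for s in states:  # inner loop: model states
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s][observations[t]]
            back[t][s] = prev

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical toy model, for illustration only.
STATES = ("Healthy", "Fever")
START_P = {"Healthy": 0.6, "Fever": 0.4}
TRANS_P = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
EMIT_P = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

if __name__ == "__main__":
    print(viterbi(("normal", "cold", "dizzy"), STATES, START_P, TRANS_P, EMIT_P))
```

Real implementations usually work with log probabilities so that long sequences do not underflow; plain products are used here only to keep the sketch short.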
  • 27. Assembly with Velour Intel Optimization Example ‣ Intel and UIUC released open-source alternative to velveth ‣ > 10x reduction in memory usage • Intelligently caching portions of assembly to disk • 700GB to 60GB ‣ https://github.com/jjcook/velour ‣ Cook, Jeffrey J. 2011. Scaling short read de novo DNA sequence assembly to gigabase genomes.
  • 28. Recommendations Programming Xeon Phi ‣ Host can have multiple Phi cards ‣ MKL libraries are pre-optimized ‣ OpenMP is applicable to multi-core and many-core programming • omp offload target(mic) ‣ MPI supports distributed computation and combines with other models • OpenMP within nodes and MPI between nodes ‣ Xeon optimizations translate well to Phi
  • 29. In the Life Sciences Parallel Programming ‣ Targets: CPU, Coprocessors, GPU, FPGA, ASIC ‣ There is no silver bullet ‣ Problem decomposition is the most critical step ‣ Think in parallel ‣ Using Intel compilers can yield ~30% speedup in many cases • VTune and other analysis tools are available ‣ Must optimize at one or more levels
  • 31. Recommendations Parallel Programming ‣ Leaving performance on the table • Low hanging fruit; splitting input files into parts • Avoid using languages with a poor concurrency model or a GIL ‣ Exploit thread-level parallelism • Use multi-threading and multi-processing to fully utilize multicore processors ‣ Use Intel’s Auto-Vectorizing compiler • Take advantage of SIMD parallelism and wider vectors on Phi ‣ Prepare for a heterogeneous many-core future • Hybrid Programming (OpenMP + MPI)
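The "splitting input files into parts" and multi-processing recommendations on slide 31 can be sketched in a few lines. This is an illustration under assumptions: the helper names (`split_into_parts`, `count_gc`, `count_gc_parallel`) and the GC-counting workload are hypothetical, and in-memory lists stand in for input files. Separate processes are used so that CPU-bound work is not serialized by Python's GIL.

```python
from multiprocessing import Pool


def split_into_parts(items, n_parts):
    """Split a list into at most `n_parts` contiguous, near-equal parts."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            parts.append(items[start:end])
        start = end
    return parts


def count_gc(sequences):
    """Worker: count G and C bases in one part of the input."""
    return sum(seq.count("G") + seq.count("C") for seq in sequences)


def count_gc_parallel(sequences, workers=2):
    """Process each part in its own process, then combine the partial counts."""
    parts = split_into_parts(sequences, workers)
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_gc, parts))


if __name__ == "__main__":
    reads = ["GGCC", "ATAT", "GCAT", "CCCC", "TTTT"]
    print(split_into_parts(reads, 2))
    print(count_gc_parallel(reads))
```

The pattern only pays off when each part carries enough work to outweigh process start-up and the cost of sending data to the workers.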
  • 32. Platforms Parallel Programming ‣ Intel Distribution for Apache Hadoop • Enhances open-source Hadoop on Xeon processors • More efficient; faster startup times • Management tools ‣ Intel Enterprise Edition Lustre • Enhances open-source Lustre • REST API • Hadoop Adapter
  • 33. A fresh approach to technical computing I <3 Julia ‣ Homoiconic; Dynamic type system ‣ Designed for parallelism and distributed computation ‣ MATLAB-like syntax and extensive math library ‣ Call C functions directly ‣ Call Python functions ‣ IJulia Notebook ‣ Open Source
  • 35. Overview Compliance ‣ Need a compliance apparatus ‣ Often a barrier to competition ‣ Compute and Storage are easy • Policy and procedures are harder ‣ AWS and Google will now sign BAA
  • 36. Strategy Compliance ‣ Keys are protecting data and preventing access ‣ Data management - points of control ‣ Encrypt data in flight and at rest • Use S3 server-side encryption • Google Persistent Disks are automatically encrypted ‣ Use credential rotation policies ‣ Lock down security groups and firewalls ‣ Use VPN for all public connections ‣ Log everything and audit often
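The "Lock down security groups and firewalls" and "audit often" items on slide 36 lend themselves to a mechanical check. A minimal sketch, assuming the rules are available as dictionaries shaped like the `IpPermissions` entries of the EC2 `DescribeSecurityGroups` response (`FromPort`, `ToPort`, `IpRanges`, `CidrIp`); the function name, the allowed-port set, and the sample group are hypothetical and not taken from the slides.

```python
OPEN_CIDR = "0.0.0.0/0"


def find_open_rules(security_groups, allowed_ports=frozenset({443})):
    """Return (group_id, from_port, to_port) for rules open to the whole internet.

    A rule is reported when it allows 0.0.0.0/0 and its port range is not a
    single port listed in `allowed_ports`.
    """
    findings = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            from_port = rule.get("FromPort")
            to_port = rule.get("ToPort")
            world_open = any(
                r.get("CidrIp") == OPEN_CIDR for r in rule.get("IpRanges", [])
            )
            single_allowed = from_port == to_port and from_port in allowed_ports
            if world_open and not single_allowed:
                findings.append((group["GroupId"], from_port, to_port))
    return findings


# Hypothetical sample data, for illustration only.
SAMPLE_GROUPS = [
    {
        "GroupId": "sg-example",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
             "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
        ],
    }
]

if __name__ == "__main__":
    for finding in find_open_rules(SAMPLE_GROUPS):
        print(finding)
```

The sketch checks IPv4 ranges only; a fuller audit would also inspect IPv6 ranges and egress rules, and would log its findings for later review.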