QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

In this deck, Christian Kniep presents: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads.

Watch the video presentation: http://wp.me/p3RLHQ-dvM

  1. QNIBTerminal plus InfiniBand: Containerized MPI Workloads (insideHPC Edition)
     Slides slightly modified in comparison to the HPC Advisory Council presentation
     Christian Kniep, 2014-11-05
  2. Agenda
     • Docker in a Nutshell
     • QNIBTerminal
     • Testbed
     • MPI Benchmark
     • HPCG-Results
     • Future Work
     • Conclusion
  3. Docker in a Nutshell
     • (chroot on steroids)²
  4. Docker in a Nutshell
     • (chroot on steroids)²
     • Builds on top of Linux Containers (LXC)
     • Kernel namespaces (isolation)
     • cgroups (resource mgmt)
  5. Docker in a Nutshell
     • (chroot on steroids)²
     • Builds on top of Linux Containers (LXC)
     • Kernel namespaces (isolation)
     • cgroups (resource mgmt)
     • Intuitive build system
  6. Docker in a Nutshell
     • (chroot on steroids)²
     • Builds on top of Linux Containers (LXC)
     • Kernel namespaces (isolation)
     • cgroups (resource mgmt)
     • Intuitive build system
     • Public repositories
     • Red Hat backing
  7. Traditional vs. Lightweight Layers
     [Figure: two layer stacks, each with an IB label. Traditional Virtualisation: SERVER, HOST KERNEL, HYPERVISOR, then per guest KERNEL, Userland (OS), SERVICE. Containerisation: SERVER, HOST KERNEL, then per container Userland (OS), SERVICE.]
  8. QNIBTerminal: Motivation
     • Plain Metrics
  9. QNIBTerminal: Motivation
     • Plain Log Events
  10. QNIBTerminal: Motivation
      • Overlap Metrics/Log Events
  11. QNIBTerminal: Overview
      One Node Setup
      • All network traffic over bridge
      • Crippled MPI workload
      [Figure: containers labelled elk (elasticsearch, logstash, kibana), dns, helixdns, etcd, haproxy, slurmctld, grafana, graphite-api, graphite-web, carbon, and compute0 … compute<N> (slurmd); group labels: Log/Events, Services, Performance, Compute]
  12. Testbed
      • 8 nodes (CentOS 7, 2x 4-core Xeon, 32 GB, Mellanox ConnectX-2)
      • 3 containers on top (CentOS 6, CentOS 7, Ubuntu 12)
      • SLURM Resource Scheduler
      • 1 native partition
      • 3 container partitions
      • Multiple Open MPI versions installed
      • gcc versions
  13. MPI Benchmark
      • osu-micro-benchmarks-4.4.1
      • osu_alltoall with two tasks on two hosts

          $ mpirun -np 2 -H venus001,venus002 $(pwd)/osu_alltoall
          # OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
          # Size    Avg Latency(us)
          1         1.83
          2         1.82
          4         1.74
          8         1.63
          16        1.62
          32        1.68
          64        1.80
          128       2.77
          256       3.11
          512       3.51

      Note: MPI benchmark was not in original HPC Advisory Council Presentation
  14. MPI Benchmark: distribution's results [2 tasks @ 2 nodes]
      [Chart: latency [us], 0 to 5, over Message Size (KB), 4 to 1024; series: native, cos7, cos6, u12]
      Note: MPI benchmark was not in original HPC Advisory Council Presentation
  15. MPI Benchmark: Open MPI comparison [2 tasks @ 2 nodes, avg(1B->64B)]
      [Chart: latency [us], 0 to 2.8, for native, cos7, cos6, u12; legend: distribution, 1.5.4, 1.6.4, 1.8.3; labels under the four groups: oMPI 1.6.4, oMPI 1.6.4, oMPI 1.5.4, oMPI 1.5.4]
  16. HPCG Benchmark
      • Mimics thermodynamic application workload
      • Linpack corrective / successor in the long term?
  17. HPCG Benchmark: distribution's results
      [Chart: GFLOP/s, 3 to 6, for native, cos7, cos6, u12; annotation: CentOS 7.0, oMPI 1.6.4, gcc 4.8.2]
  18. HPCG Benchmark: distribution's results
      [Chart: GFLOP/s, 3 to 6, for native, cos7, cos6, u12; annotations: CentOS 7.0, oMPI 1.6.4, gcc 4.8.2; CentOS 6.5, oMPI 1.5.4, gcc 4.4.7; Ubuntu 12.04, oMPI 1.5.4, gcc 4.6.3]
  19. HPCG Benchmark: Open MPI comparison
      [Chart: GFLOP/s, 3 to 6, for native, cos7, cos6, u12; legend: distribution; labels under the four groups: oMPI 1.6.4, oMPI 1.6.4, oMPI 1.5.4, oMPI 1.5.4]
  20. HPCG Benchmark: Open MPI comparison
      [Chart: GFLOP/s, 3 to 6, for native, cos7, cos6, u12; legend: distribution, 1.6.4, 1.8.4; labels under the four groups: oMPI 1.6.4 / gcc 4.8.2, oMPI 1.6.4 / gcc 4.8.2, oMPI 1.5.4 / gcc 4.4.7, oMPI 1.5.4 / gcc 4.6.3]
  21. HPCG Benchmark: Open MPI comparison
      [Chart: GFLOP/s, 3 to 6, for native, cos7, cos6, u12; legend: distribution, 1.5.4, 1.6.4, 1.8.4; labels under the four groups: oMPI 1.6.4 / gcc 4.8.2, oMPI 1.6.4 / gcc 4.8.2, oMPI 1.5.4 / gcc 4.4.7, oMPI 1.5.4 / gcc 4.6.3]
  22. Future Work
      • Compare with tuned bare-metal
      • Tune docker installation
      • Use of SV-IOR (Keynote earlier today)
      • Compare different frameworks to orchestrate
      • Security evaluations
      • Benchmark real-world applications
  23. Conclusion
      • Abstraction bare-metal / application works fine
      • Bare-metal kernel provides access to IB
      • Container in charge from MPI upwards
      • Out-of-the-box: container beats bare-metal
      • Continuous testing/deployment of containerized workloads
      • Bunch of tooling within docker ecosystem
      • Low performance overhead
  24. La Fin
      • Contact
      • @CQnib / @qnibinc
      • christian@qnib.org
      • http://qnib.org
      https://www.flickr.com/photos/dharmabum1964/3108162671
  25. La Fin
      • Contact
      • @CQnib / @qnibinc
      • christian@qnib.org
      • http://qnib.org
      • Paper: http://doc.qnib.org/
      https://www.flickr.com/photos/dharmabum1964/3108162671
  26. La Fin
      • Contact
      • @CQnib / @qnibinc
      • christian@qnib.org
      • http://qnib.org
      • Paper: http://doc.qnib.org/
      • Interested?
      • Docker Pitch today
      • Internal Evaluations
      • Workshops / Talks
      https://www.flickr.com/photos/dharmabum1964/3108162671
  27. La Fin
      • Contact
      • @CQnib / @_qnib
      • christian@qnib.org
      • http://qnib.org
      • Paper: http://doc.qnib.org/
      • Interested?
      • Docker Pitch today
      • Internal Evaluations
      • Workshops / Talks
      • Questions?
      https://www.flickr.com/photos/dharmabum1964/3108162671
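
Slide 15 reports latency as avg(1B->64B). The deck does not say how that figure was computed, so the following is only a minimal sketch under the assumption that it is the arithmetic mean of the osu_alltoall "Avg Latency(us)" column over message sizes from 1 to 64 bytes. The sample text is the output shown on slide 13; the names `parse_osu_latency` and `mean_latency` are hypothetical helpers, not part of the OSU benchmarks or of QNIBTerminal.

```python
# Sample osu_alltoall output, copied from slide 13 of the deck.
SAMPLE_OUTPUT = """\
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
# Size Avg Latency(us)
1 1.83
2 1.82
4 1.74
8 1.63
16 1.62
32 1.68
64 1.80
128 2.77
256 3.11
512 3.51
"""


def parse_osu_latency(text):
    """Return (message_size_bytes, avg_latency_us) pairs from OSU output.

    Lines starting with '#' are headers and are skipped.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        size, latency = line.split()[:2]
        rows.append((int(size), float(latency)))
    return rows


def mean_latency(rows, min_size=1, max_size=64):
    """Arithmetic mean of latency over min_size <= size <= max_size.

    Assumption: this is what 'avg(1B->64B)' on slide 15 refers to.
    """
    selected = [lat for size, lat in rows if min_size <= size <= max_size]
    if not selected:
        raise ValueError("no rows in the requested message-size range")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    rows = parse_osu_latency(SAMPLE_OUTPUT)
    print(f"avg(1B->64B) = {mean_latency(rows):.2f} us")
```

For the slide 13 sample this averages the seven rows from 1 to 64 bytes (12.12 us in total) and prints 1.73 us. Running the same helper on the output of each partition (native, cos7, cos6, u12) would give one number per bar, if the assumption about the metric holds.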
