Toward a practical “HPC Cloud”:
Performance tuning of a virtualized HPC cluster

Ryousei Takano, Tsutomu Ikegami, Takahiro Hirofuchi, Yoshio Tanaka

Information Technology Research Institute,
National Institute of Advanced Industrial Science and Technology (AIST), Japan

CUTE2011@Seoul, Dec. 15, 2011
Background
•  Cloud computing is getting increased attention
   from the High Performance Computing community.
   –  e.g., Amazon EC2 Cluster Compute Instances
•  Virtualization is a key technology.
   –  Providers rely on virtualization to consolidate
      computing resources.
•  Virtualization provides not only opportunities,
   but also challenges for HPC systems and
   applications.
   –  Concern: performance degradation due to the
      overhead of virtualization
Contribution
•  Goal:
   –  To realize a practical HPC Cloud whose performance
      is close to that of bare metal (i.e., non-virtualized)
      machines
•  Contributions:
   –  A feasibility study evaluating the HPC Challenge
      benchmark on a 16-node InfiniBand cluster
   –  An evaluation of the effect of three performance
      tuning techniques:
      •  PCI passthrough
      •  NUMA affinity
      •  VMM noise reduction
Outline
•  Background
•  Performance tuning techniques for HPC Cloud
  –  PCI passthrough
  –  NUMA affinity
  –  VMM noise reduction
•  Performance evaluation
  –  HPC Challenge benchmark suite
  –  Results
•  Summary

Toward a practical HPC Cloud
[Figure: the path from the current HPC Cloud, whose performance is not
good and unstable, to a “true” HPC Cloud, whose performance is close to
that of bare metal machines, via three tuning steps:
 - Use PCI passthrough: the guest OS in the VM uses the physical driver
   to access the NIC directly, bypassing the VMM.
 - Set NUMA affinity: bind the VCPU threads of the VM (a QEMU process
   running on the Linux kernel with KVM) to physical CPU sockets.
 - Reduce VMM noise: reduce the overhead of interrupt virtualization,
   and disable unnecessary services on the host OS (e.g., ksmd).]
IO architectures of VMs
[Figure: two VM I/O architectures compared.
 - IO emulation: the guest driver in each VM talks to a vSwitch and the
   physical driver inside the VMM, which in turn drives the NIC.
 - PCI passthrough: the physical driver runs inside the guest OS and
   accesses the NIC directly (VMM-bypass access).]
IO emulation degrades performance due to the overhead of VMM
processing. PCI passthrough achieves performance comparable to bare
metal machines.
(VMM: Virtual Machine Monitor)
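The deck shows the architecture but not the commands. As a rough sketch of how a device could be handed to a KVM guest with the qemu of that era, using the legacy pci-stub/pci-assign mechanism (the PCI address 0000:06:00.0 and the vendor/device ID pair are placeholders; check `lspci -nn` on the real host):

```shell
# Detach the InfiniBand HCA from the host driver and reserve it for a guest.
# The vendor/device IDs and PCI address below are placeholders.
modprobe pci_stub
echo "15b3 673c" > /sys/bus/pci/drivers/pci-stub/new_id
echo "0000:06:00.0" > /sys/bus/pci/devices/0000:06:00.0/driver/unbind
echo "0000:06:00.0" > /sys/bus/pci/drivers/pci-stub/bind

# Boot the guest with the device passed through (qemu-kvm 0.12 syntax;
# 8 VCPUs and 45 GB of memory as in the experimental setting).
qemu-kvm -smp 8 -m 46080 \
         -device pci-assign,host=06:00.0 \
         -drive file=guest.img
```

This is a host-side configuration fragment, not a portable script; modern kernels would use VFIO instead of pci-stub/pci-assign.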
NUMA affinity
[Figure: on a bare metal machine, numactl binds Linux threads to CPU
sockets (P0–P3) ahead of the process scheduler; each socket has its own
local memory.]
On NUMA systems, memory affinity is an important performance factor:
local memory accesses are faster than remote memory accesses. To avoid
inter-socket memory transfer, binding a thread to a CPU socket can be
effective.
(NUMA: Non-Uniform Memory Access)
NUMA affinity: KVM
[Figure: compared with bare metal, KVM adds one more layer. Inside the
VM (a QEMU process), numactl binds guest threads to virtual sockets
(V0–V3); on the host, taskset pins each VCPU thread Vn to physical CPU
Pn (Vn = Pn), ahead of the Linux process scheduler.]
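The two-level binding on the slide could be sketched as follows. This is an illustrative host-side sequence, not the authors' scripts; the thread enumeration is simplified (a QEMU process also has non-VCPU threads, so the real VCPU thread IDs should be identified first, e.g., via the QEMU monitor):

```shell
# Host side: pin each thread of the qemu-kvm process to CPU n (Vn = Pn).
# Simplified -- assumes the task list order matches the VCPU order.
QEMU_PID=$(pgrep -f qemu-kvm | head -n 1)
n=0
for tid in $(ls /proc/$QEMU_PID/task); do
    taskset -pc $n $tid     # bind thread $tid to physical CPU $n
    n=$((n + 1))
done

# Guest side: bind a process's threads and memory to one virtual socket.
numactl --cpunodebind=0 --membind=0 ./hpcc
```

The host step corresponds to "pin vCPU to CPU" and the guest step to "bind threads to vSocket" in the figure.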
NUMA affinity: Xen
[Figure: on Xen, the VM is a DomU guest running alongside Dom0. The
domain scheduler in the Xen hypervisor pins each VCPU Vn to physical
CPU Pn (Vn = Pn). numactl cannot run on the guest OS, because Xen does
not disclose the physical NUMA topology to the guest.]
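On Xen 4.0, the hypervisor-level pinning shown in the figure could be done with the xm toolstack; a minimal sketch, where the domain name "hpcvm" is a placeholder:

```shell
# Pin each VCPU of the DomU to the matching physical CPU (Vn = Pn).
for v in 0 1 2 3 4 5 6 7; do
    xm vcpu-pin hpcvm $v $v
done

# Verify the resulting VCPU-to-CPU mapping.
xm vcpu-list hpcvm
```

Unlike the KVM case, only this host-side step is possible: no numactl binding can be done inside the guest.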
VMM noise
•  OS noise is a well-known problem for large-scale
   system scalability.
  –  OS activities and some daemon programs take up
     CPU time, consume cache and TLB entries, and delay
     the synchronization of parallel processes.
•  VMM-level noise, called VMM noise, can cause
   the same problem for a guest OS.
  –  The overhead of interrupt virtualization, which results
     in VM exits (i.e., VM-to-VMM switching)
  –  Unnecessary services on the host OS (e.g., ksmd)
•  At present, we do not address VMM noise.
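For the second noise source, host-side KSM scanning can be switched off through its sysfs interface; a minimal sketch (run as root on the host):

```shell
# Stop the ksmd page-scanning daemon, one of the unnecessary host
# services the slides mention as a source of VMM noise.
echo 0 > /sys/kernel/mm/ksm/run

# 0 means ksmd is stopped.
cat /sys/kernel/mm/ksm/run
```

This trades the memory savings of page deduplication for lower host-side CPU activity.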
Experimental setting
Evaluation of the HPC Challenge benchmark on a 16-node InfiniBand
cluster. Only 1 VM runs on 1 host.

Blade server: Dell PowerEdge M610
  CPU          Intel quad-core Xeon E5540/2.53 GHz x2
  Chipset      Intel 5520
  Memory       48 GB DDR3
  InfiniBand   Mellanox ConnectX (MT26428)

Blade switch
  InfiniBand   Mellanox M3601Q (QDR 16 ports)

Host machine environment
  OS             Debian 6.0.1
  Linux kernel   2.6.32-5-amd64
  KVM            0.12.50
  Xen            4.0.1
  Compiler       gcc/gfortran 4.4.5
  MPI            Open MPI 1.4.2

VM environment
  VCPU     8
  Memory   45 GB
HPC Challenge Benchmark Suite
We measure spatial and temporal locality boundaries by evaluating the
HPC Challenge benchmark suite.
[Figure: “Motivation of the HPCC Design” — benchmarks placed on axes of
spatial locality (x) vs. temporal locality (y), with mission partner
applications in the middle. HPL and DGEMM are compute intensive, PTRANS
and STREAM are memory intensive, and FFT and RandomAccess are
communication intensive.
From: Piotr Luszczek, et al., “The HPC Challenge (HPCC) Benchmark
Suite,” SC2006 Tutorial.]
HPC Challenge: Result
[Figure: radar chart of normalized results over HPL(G) (compute
intensive), PTRANS(G) and STREAM(EP) (memory intensive),
RandomAccess(G) and FFT(G), Random Ring Bandwidth and Random Ring
Latency (communication intensive), for six configurations: BMM,
BMM+pin, KVM, KVM+pin+bind, Xen, Xen+pin.]
Comparing Xen and KVM, the performance is almost the same.
G: Global, EP: Embarrassingly parallel.
Higher is better, except for Random Ring Latency.
HPC Challenge: Result
[Figure: two radar charts over the same axes (HPL(G), PTRANS(G),
STREAM(EP), RandomAccess(G), FFT(G), Random Ring Bandwidth, Random Ring
Latency). Left: Xen, comparing BMM, Xen, and Xen+pin. Right: KVM,
comparing BMM, KVM, and KVM+pin+bind.]
NUMA affinity is important even on a VM, but the effect of VCPU pinning
is uncertain.
G: Global, EP: Embarrassingly parallel.
Higher is better, except for Random Ring Latency.
HPL: High Performance LINPACK
•  BMM: The LINPACK efficiency is 57.7% on 16
   nodes (63.1% on a single node).
•  BMM, KVM: setting NUMA affinity is effective.
•  The virtualization overhead is 6 to 8%.

Configuration      1 node          16 nodes       (Gflops; ratio to BMM in parentheses)
BMM                50.24 (1.00)    706.21 (1.00)
BMM + bind         51.07 (1.02)    747.88 (1.06)
Xen                49.44 (0.98)    700.23 (0.99)
Xen + pin          49.37 (0.98)    698.93 (0.99)
KVM                48.03 (0.96)    671.97 (0.95)
KVM + pin + bind   49.33 (0.98)    684.96 (0.97)
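The efficiency figures can be reproduced from the BMM + bind row and the hardware specification, assuming the usual Nehalem peak of 4 double-precision flops per cycle per core (2 sockets x 4 cores x 2.53 GHz per node):

```shell
# Recompute the LINPACK efficiency (Rmax / Rpeak) from the table above.
awk 'BEGIN {
    rpeak1  = 8 * 2.53 * 4       # Gflops per node: 8 cores x 2.53 GHz x 4 flops/cycle
    rpeak16 = 16 * rpeak1        # Gflops for 16 nodes
    printf "1 node:   %.1f%%\n", 51.07  / rpeak1  * 100
    printf "16 nodes: %.1f%%\n", 747.88 / rpeak16 * 100
}'
# -> 1 node:   63.1%
#    16 nodes: 57.7%
```

This matches the stated 63.1% single-node and 57.7% 16-node efficiency.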
Discussion
•  The performance of the global benchmarks, except
   for FFT(G), is almost comparable with that of
   bare metal machines.
  –  FFT performance decreased by 11% to 20% due to the
     virtualization overhead related to inter-node
     communication and/or VMM noise.
  –  PCI passthrough brings MPI communication
     throughput close to that of bare metal machines,
     but interrupt injection, which results in VM exits, can
     disturb application execution.
Discussion (cont.)
•  The performance of Xen is marginally better than
   that of KVM, except for Random Ring Bandwidth.
  –  The bandwidth decreases by 4% in KVM and by 20% in Xen.
•  KVM: The performance of STREAM(EP)
   decreases by 27%.
  –  Heavy memory contention among processes (TLB
     misses) may occur. This is the worst case for EPT
     (Extended Page Tables), because an EPT page walk
     takes longer than a shadow page table walk. This
     suggests a virtual machine is more sensitive to
     memory contention than a bare metal machine.
Summary
HPC Cloud is promising!
•  The performance of coarse-grained parallel
   applications is comparable to that of bare metal machines.
•  We plan to adopt these performance tuning
   techniques in our private cloud service, “AIST Cloud.”
•  Open issues:
  –  VMM noise reduction
  –  Live migration with VMM-bypass devices
HPC Cloud
HPC Cloud utilizes cloud resources for High Performance Computing (HPC)
applications.
[Figure: users request resources according to their needs; the provider
allocates each user a dedicated virtual cluster on demand, carved out
of a shared physical cluster.]
Amazon EC2 CCI in TOP500
[Figure: LINPACK efficiency (%) vs. TOP500 rank for the Nov. 2011 list,
by interconnect. Average efficiency: InfiniBand 76%, 10 Gigabit
Ethernet 72%, Gigabit Ethernet 52%. GPGPU machines fall below the
trend; the Amazon EC2 cluster compute instances entry is ranked #42.]

LINPACK Efficiency
[Figure: the same plot for the June 2011 list. InfiniBand 79%,
10 Gigabit Ethernet 74%, Gigabit Ethernet 54%; the Amazon EC2 cluster
compute instances entry is ranked #451. Virtualization causes the
performance degradation!]
Efficiency = maximum LINPACK performance Rmax / theoretical peak
performance Rpeak.

Error Permissive ComputingRyousei Takano
 
Opportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIOpportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIRyousei Takano
 
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentRyousei Takano
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価Ryousei Takano
 
USENIX NSDI 2016 (Session: Resource Sharing)
USENIX NSDI 2016 (Session: Resource Sharing)USENIX NSDI 2016 (Session: Resource Sharing)
USENIX NSDI 2016 (Session: Resource Sharing)Ryousei Takano
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network ProcessingRyousei Takano
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraRyousei Takano
 
A Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksA Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksRyousei Takano
 
クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術Ryousei Takano
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...Ryousei Takano
 
IEEE CloudCom 2014参加報告
IEEE CloudCom 2014参加報告IEEE CloudCom 2014参加報告
IEEE CloudCom 2014参加報告Ryousei Takano
 
Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchRyousei Takano
 
Exploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC CloudExploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC CloudRyousei Takano
 
不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何か不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何かRyousei Takano
 
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...Ryousei Takano
 
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~Ryousei Takano
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
 
高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud
高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud
高性能かつスケールアウト可能なHPCクラウド AIST Super Green CloudRyousei Takano
 
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data Center
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data CenterIris: Inter-cloud Resource Integration System for Elastic Cloud Data Center
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data CenterRyousei Takano
 

More from Ryousei Takano (20)

Error Permissive Computing
Error Permissive ComputingError Permissive Computing
Error Permissive Computing
 
Opportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIOpportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCI
 
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
 
ABCI Data Center
ABCI Data CenterABCI Data Center
ABCI Data Center
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価
 
USENIX NSDI 2016 (Session: Resource Sharing)
USENIX NSDI 2016 (Session: Resource Sharing)USENIX NSDI 2016 (Session: Resource Sharing)
USENIX NSDI 2016 (Session: Resource Sharing)
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network Processing
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
 
A Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksA Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center Networks
 
クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...
 
IEEE CloudCom 2014参加報告
IEEE CloudCom 2014参加報告IEEE CloudCom 2014参加報告
IEEE CloudCom 2014参加報告
 
Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software research
 
Exploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC CloudExploring the Performance Impact of Virtualization on an HPC Cloud
Exploring the Performance Impact of Virtualization on an HPC Cloud
 
不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何か不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何か
 
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...
 
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~
クラウドの垣根を超えた高性能計算に向けて~AIST Super Green Cloudでの試み~
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 
高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud
高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud
高性能かつスケールアウト可能なHPCクラウド AIST Super Green Cloud
 
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data Center
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data CenterIris: Inter-cloud Resource Integration System for Elastic Cloud Data Center
Iris: Inter-cloud Resource Integration System for Elastic Cloud Data Center
 

Recently uploaded

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Recently uploaded (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster

  • 6. Toward a practical HPC Cloud
    –  Current KVM-based HPC Cloud: its performance is not good and unstable.
    –  “True” HPC Cloud: performance close to that of bare metal machines.
    –  Three tunings bridge the gap: use PCI passthrough, set NUMA affinity (bind guest threads and VCPU threads to CPU sockets), and reduce VMM noise (reduce the overhead of interrupt virtualization; disable unnecessary services on the host OS, e.g., ksmd).
  • 7. IO architectures of VMs
    –  IO emulation: the guest driver talks to a virtual device, and the VMM (through a vSwitch and the physical driver) mediates every IO. This degrades performance due to the overhead of VMM processing.
    –  PCI passthrough: the guest runs the physical driver and accesses the NIC directly (VMM-bypass access), achieving performance comparable to bare metal machines.
    –  VMM: Virtual Machine Monitor
  • 8. NUMA affinity
    –  On NUMA (Non-Uniform Memory Access) systems, memory affinity is an important performance factor: local memory accesses are faster than remote memory accesses.
    –  To avoid inter-socket memory transfers, binding a thread to a CPU socket (e.g., with numactl, alongside the process scheduler) can be effective.
  • 9. NUMA affinity: KVM
    –  Inside the guest OS, numactl binds application threads to a virtual socket (V0–V3), just as on bare metal.
    –  On the host, taskset pins each VCPU thread of the QEMU process to a physical CPU (Vn = Pn), so the virtual topology matches the physical one.
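The host-side pinning step can be sketched from user space. The helper below is an illustrative Python sketch (the name `pin_to_cpu` is ours, not from the slides) built on the Linux-only `os.sched_setaffinity`, the same mechanism `taskset` uses:

```python
import os

def pin_to_cpu(cpu: int) -> set:
    """Pin the calling process/thread to a single CPU, as
    `taskset -pc <cpu> <pid>` would for a VCPU thread; returns the
    previous affinity mask so it can be restored afterwards."""
    prev = os.sched_getaffinity(0)   # 0 = the calling process
    os.sched_setaffinity(0, {cpu})
    return prev
```

In the actual setup, each QEMU VCPU thread would be pinned this way so that virtual CPU Vn always runs on physical CPU Pn.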
  • 10. NUMA affinity: Xen
    –  The Xen hypervisor's domain scheduler pins VCPUs to physical CPUs (Vn = Pn).
    –  numactl cannot run on a guest OS (DomU), because Xen does not disclose the physical NUMA topology to guests.
  • 11. VMM noise
    –  OS noise is a well-known obstacle to large-scale system scalability: OS activities and daemon programs take up CPU time, consume cache and TLB entries, and delay the synchronization of parallel processes.
    –  VMM-level noise, called VMM noise, can cause the same problem for a guest OS: the overhead of interrupt virtualization, which results in VM exits (i.e., VM-to-VMM switching), and unnecessary services on the host OS (e.g., ksmd).
    –  For now, we do not address VMM noise.
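OS/VMM noise can be observed with a fixed-work-quantum microbenchmark: time many identical chunks of pure-CPU work and examine the spread of the samples. The sketch below is our own illustration of the idea, not the deck's methodology:

```python
import time

def noise_probe(iters: int = 1000, work: int = 20000):
    """Time `iters` identical chunks of CPU-bound work. On a quiet core
    the samples cluster tightly; OS/VMM activity (interrupts, daemons,
    VM exits) shows up as outliers far above the median."""
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        acc = 0
        for i in range(work):
            acc += i
        samples.append(time.perf_counter() - t0)
    samples.sort()
    median = samples[len(samples) // 2]
    # Worst-case slowdown factor: how much noise stretched one quantum.
    return median, samples[-1] / median
```

A large worst-to-median ratio hints at noise sources worth hunting down on the host.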
  • 12. Outline
    –  Background
    –  Performance tuning techniques for HPC Cloud: PCI passthrough, NUMA affinity, VMM noise reduction
    –  Performance evaluation: HPC Challenge benchmark suite, results
    –  Summary
  • 13. Experimental setting: the HPC Challenge benchmark on a 16-node InfiniBand cluster.
    –  Hardware: Dell PowerEdge M610 blade servers; CPU: two quad-core Intel Xeon E5540/2.53 GHz; chipset: Intel 5520; memory: 48 GB DDR3; InfiniBand: Mellanox ConnectX (MT26428); blade switch: Mellanox M3601Q (QDR, 16 ports).
    –  Host environment: Debian 6.0.1, Linux kernel 2.6.32-5-amd64, KVM 0.12.50, Xen 4.0.1, gcc/gfortran 4.4.5, Open MPI 1.4.2.
    –  VM environment: 8 VCPUs, 45 GB memory. Only 1 VM runs on 1 host.
  • 14. HPC Challenge Benchmark Suite
    –  We measure spatial and temporal locality boundaries by evaluating the HPC Challenge benchmark suite.
    –  [Figure: “Motivation of the HPCC design” — benchmarks placed along spatial/temporal locality axes: compute intensive (HPL, DGEMM), memory intensive (STREAM, PTRANS), communication intensive (FFT, RandomAccess), with Mission Partner Applications in between]
    –  From: Piotr Luszczek, et al., “The HPC Challenge (HPCC) Benchmark Suite,” SC2006 Tutorial.
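The memory-intensive corner of the suite is anchored by STREAM, whose “triad” kernel is a[i] = b[i] + s·c[i]. The pure-Python toy below only illustrates the access pattern and the bandwidth accounting (three 8-byte arrays touched per element); it is not a tuned benchmark:

```python
import time

def stream_triad(n: int = 1_000_000, scalar: float = 3.0) -> float:
    """One STREAM-triad pass; returns nominal bandwidth in GB/s."""
    b = [1.0] * n
    c = [2.0] * n
    t0 = time.perf_counter()
    a = [b[i] + scalar * c[i] for i in range(n)]  # a = b + s*c
    dt = time.perf_counter() - t0
    assert len(a) == n
    # 3 arrays x n elements x 8 bytes per double, over elapsed time.
    return (3 * n * 8) / dt / 1e9
```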
  • 15. HPC Challenge: Result
    –  [Radar chart, normalized 0–1.4: HPL(G) (compute intensive), PTRANS(G) (memory intensive), STREAM(EP), Random Ring BW, Random Ring Latency, FFT(G) and RandomAccess(G) (communication intensive); configurations: BMM, BMM+pin, KVM, KVM+pin+bind, Xen, Xen+pin]
    –  Comparing Xen and KVM, the performances are almost the same.
    –  G: Global, EP: Embarrassingly Parallel. Higher is better, except for Random Ring Latency.
  • 16. HPC Challenge: Result (Xen vs. KVM)
    –  [Two radar charts, normalized 0–1.2: BMM vs. Xen and Xen+pin; BMM vs. KVM and KVM+pin+bind]
    –  NUMA affinity is important even on a VM, but the effect of VCPU pinning is uncertain.
    –  G: Global, EP: Embarrassingly Parallel. Higher is better, except for Random Ring Latency.
  • 17. HPL: High Performance LINPACK
    –  BMM: the LINPACK efficiency is 57.7% on 16 nodes (63.1% on a single node).
    –  For both BMM and KVM, setting NUMA affinity is effective.
    –  Virtualization overhead is 6 to 8%.
    –  Results in GFLOPS (relative to BMM):
       Configuration    | 1 node       | 16 nodes
       BMM              | 50.24 (1.00) | 706.21 (1.00)
       BMM + bind       | 51.07 (1.02) | 747.88 (1.06)
       Xen              | 49.44 (0.98) | 700.23 (0.99)
       Xen + pin        | 49.37 (0.98) | 698.93 (0.99)
       KVM              | 48.03 (0.96) | 671.97 (0.95)
       KVM + pin + bind | 49.33 (0.98) | 684.96 (0.97)
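The efficiency figures follow from Rmax/Rpeak. Per node, Rpeak = 2 sockets × 4 cores × 2.53 GHz × 4 DP flops/cycle ≈ 80.96 GFLOPS (the 4-flops/cycle rate for Nehalem-era Xeons is our assumption; it is not stated on the slide). A quick check against the BMM + bind rows:

```python
def linpack_efficiency(rmax_gflops: float, nodes: int = 1,
                       sockets: int = 2, cores: int = 4,
                       ghz: float = 2.53,
                       flops_per_cycle: int = 4) -> float:
    """Efficiency = Rmax / Rpeak for the Xeon E5540 cluster above."""
    rpeak = nodes * sockets * cores * ghz * flops_per_cycle  # GFLOPS
    return rmax_gflops / rpeak

# BMM + bind: 51.07 GFLOPS on 1 node, 747.88 GFLOPS on 16 nodes
single = linpack_efficiency(51.07)              # ≈ 0.631 (63.1%)
cluster = linpack_efficiency(747.88, nodes=16)  # ≈ 0.577 (57.7%)
```

Both values reproduce the percentages quoted on the slide.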
  • 18. Discussion
    –  The performance of the global benchmarks, except for FFT(G), is almost comparable with that of bare metal machines.
    –  FFT performance decreased by 11% to 20% due to virtualization overhead related to inter-node communication and/or VMM noise.
    –  PCI passthrough brings MPI communication throughput close to that of bare metal machines, but interrupt injection, which results in VM exits, can disturb application execution.
  • 19. Discussion (cont.)
    –  The performance of Xen is marginally better than that of KVM, except for Random Ring Bandwidth: the bandwidth decreases by 4% in KVM and 20% in Xen.
    –  KVM: STREAM(EP) performance decreases by 27%. Heavy memory contention among processes (TLB misses) may occur. This is the worst case for EPT (Extended Page Tables), because an EPT page walk takes more time than a shadow-page-table walk; a virtual machine is thus more sensitive to memory contention than a bare metal machine.
  • 20. Outline
    –  Background
    –  Performance tuning techniques for HPC Cloud: PCI passthrough, NUMA affinity, VMM noise reduction
    –  Performance evaluation: HPC Challenge benchmark suite, results
    –  Summary
  • 21. Summary: HPC Cloud is promising!
    –  The performance of coarse-grained parallel applications is comparable to bare metal machines.
    –  We plan to adopt these performance tuning techniques in our private cloud service, “AIST Cloud.”
    –  Open issues: VMM noise reduction; live migration with VMM-bypass devices.
  • 23. HPC Cloud
    –  HPC Cloud utilizes cloud resources for High Performance Computing (HPC) applications.
    –  Users request resources according to their needs; the provider allocates each user a dedicated virtual cluster on demand, carved out of a physical cluster.
  • 24. Amazon EC2 CCI in TOP500 (Nov. 2011)
    –  [Scatter plot of LINPACK efficiency vs. TOP500 rank, by interconnect; GPGPU machines fall below the trend]
    –  InfiniBand: 76%; 10 Gigabit Ethernet: 72%; Gigabit Ethernet: 52%.
    –  Amazon EC2 Cluster Compute Instances rank #42.
    –  Efficiency = maximum LINPACK performance (Rmax) / theoretical peak performance (Rpeak).
  • 25. LINPACK Efficiency in TOP500 (June 2011)
    –  [Scatter plot of LINPACK efficiency vs. TOP500 rank, by interconnect; GPGPU machines fall below the trend]
    –  InfiniBand: 79%; 10 Gigabit Ethernet: 74%; Gigabit Ethernet: 54%.
    –  Amazon EC2 Cluster Compute Instances rank #451. Virtualization causes the performance degradation!
    –  Efficiency = maximum LINPACK performance (Rmax) / theoretical peak performance (Rpeak).