COLO: COarse-grain LOck-stepping
    Virtual Machine for Non-stop Service


                      Eddie Dong, Yunhong Jiang




                               Software & Services Group

Legal Disclaimer
 • INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR
   IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT.
   EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
   WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL®
   PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE,
   MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL
   PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
 • Intel may make changes to specifications and product descriptions at any time, without notice.
 • All products, dates, and figures specified are preliminary based on current expectations, and are subject to change
   without notice.
 • Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause
   the product to deviate from published specifications. Current characterized errata are available on request.
 • Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United
   States and other countries.
 • *Other names and brands may be claimed as the property of others.
 • Copyright © 2012 Intel Corporation.




Agenda

    •   Background
    •   COarse-grain LOck-stepping
    •   Performance Optimization
    •   Evaluation
    •   Summary




Non-Stop Service with VM Replication

    • Typical Non-stop Service Requires
       – Expensive hardware for redundancy
       – Extensive software customization
    • VM Replication: Cheap Application-agnostic Solution




Existing VM Replication Approaches

    • Replication Per Instruction: Lock-stepping
      – Execute in parallel for deterministic instructions
      – Lock and step for non-deterministic instructions
    • Replication Per Epoch: Continuous Checkpoint
      – Secondary VM is synchronized with Primary VM once per epoch
      – Output is buffered within an epoch
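
The cost of buffering output within an epoch can be made concrete with a minimal sketch. It assumes fixed-length epochs starting at time 0 and ignores the time needed to transfer the checkpoint itself. The function names and the 20 ms epoch are illustrative only and are not taken from any replication tool.

```python
def release_time_ms(generated_ms: int, epoch_ms: int) -> int:
    """Time at which a buffered packet leaves the host.

    Assumption: epochs are the half-open intervals [0, E), [E, 2E), ...
    and output generated during an epoch is released at that epoch's end.
    """
    return (generated_ms // epoch_ms + 1) * epoch_ms


def added_latency_ms(generated_ms: int, epoch_ms: int) -> int:
    """Extra delay caused purely by waiting for the epoch boundary."""
    return release_time_ms(generated_ms, epoch_ms) - generated_ms


# A packet generated 5 ms into a 20 ms epoch waits another 15 ms.
print(added_latency_ms(5, 20))
```

Under these assumptions every packet is delayed by up to one full epoch, which is the extra network latency that buffering introduces.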




Problems

    • Lock-stepping
      – Excessive replication overhead
         • Memory access in an MP (multiprocessor) guest is non-deterministic
    • Continuous Checkpoint
      – Extra network latency
      – Excessive VM checkpoint overhead




Why COarse-grain LOck-stepping (COLO)

    • VM Replication is an overly strong condition
      – Why do we care about the VM state?
         • The client cares only about the response
      – Can control fail over without "precise VM state replication"?
    • Coarse-grain lock-stepping VMs
      – The secondary VM is a valid replica as long as it has generated the
        same responses as the primary so far
         • It can then take over without stopping the service

        Non-stop service focuses on server response, not internal machine state!


How COLO Works

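    • Response Model for Client/Server (C/S) System
       [equation image not preserved in this transcript]
       – The two symbols defined here (lost with the equation image) are the
         request and the execution result of a non-deterministic instruction
       – Each response packet from the equation is a semantic response
    • Successful failover at the kth packet if
       [condition image not preserved in this transcript]
           (C is the packet series the client received)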
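
The idea that the secondary may take over at packet k when it has produced the same responses as the primary so far can be sketched as follows. This is a minimal illustration that treats response packets as opaque byte strings. The function names, and the assumption that the client sees the primary's packets up to k followed by the secondary's, are hypothetical and do not reproduce the lost formula.

```python
from typing import List, Sequence


def can_fail_over(primary_out: Sequence[bytes],
                  secondary_out: Sequence[bytes],
                  k: int) -> bool:
    """True if both VMs produced identical response packets 1..k."""
    if k > len(primary_out) or k > len(secondary_out):
        return False
    return all(p == s for p, s in zip(primary_out[:k], secondary_out[:k]))


def client_view(primary_out: Sequence[bytes],
                secondary_out: Sequence[bytes],
                k: int) -> List[bytes]:
    """Assumed packet series seen by the client when failover follows packet k."""
    return list(primary_out[:k]) + list(secondary_out[k:])


primary = [b"a", b"b", b"c"]
secondary = [b"a", b"b", b"x", b"d"]
print(can_fail_over(primary, secondary, 2))  # the first two packets match
print(can_fail_over(primary, secondary, 3))  # the third packet differs
```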




Architecture of COLO




     COarse-grain LOck-stepping Virtual Machine for Non-stop Service

Why Better

     • Compared with Continuous VM Checkpoint
       – No buffering-introduced latency
       – Lower checkpoint frequency
          • On demand vs. periodic
     • Compared with Lock-stepping
       – Eliminates the excessive overhead of non-deterministic
         instruction execution caused by MP-guest memory access
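
The difference between on-demand and periodic checkpointing can be illustrated with a toy model. It assumes that an on-demand scheme takes a checkpoint only when the two VMs' output packets differ, while a periodic scheme takes one at the end of every full epoch. The function names and sample data are hypothetical.

```python
from typing import Iterable, Tuple


def count_on_demand(pairs: Iterable[Tuple[bytes, bytes]]) -> int:
    """Count checkpoints in a toy on-demand scheme.

    pairs yields (primary_packet, secondary_packet). Assumption: a
    checkpoint is triggered only when the two packets differ.
    """
    return sum(1 for p, s in pairs if p != s)


def count_periodic(duration_ms: int, epoch_ms: int) -> int:
    """Count checkpoints when one is taken at the end of every full epoch."""
    return duration_ms // epoch_ms


outputs = [(b"a", b"a"), (b"b", b"x"), (b"c", b"c")]
print(count_on_demand(outputs))   # one divergence in this sample
print(count_periodic(1000, 20))   # a 20 ms epoch over one second
```

In this model the on-demand count depends only on how often the outputs diverge, which is why response similarity matters so much for checkpoint frequency.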




Performance Challenges

     • Frequency of Checkpoint
       – Highly dependent on output similarity (response similarity)
          • Key focus: TCP packets
     • Cost of Checkpoint
       – Xen/Remus uses passive checkpointing
          • Secondary VM is not resumed until failover -> slow path
       – COLO implements active checkpointing
          • Secondary VM resumes frequently




Improving Response Similarity

     • Minor Modifications to Guest TCP/IP Stack
       – Coarse-grain Time Stamp
       – Highly deterministic ACK mechanism
       – Coarse-grain Notification Window Size
       – Per-Connection Comparison
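
Two of the items above can be illustrated with a small sketch. The slide does not say how these changes are implemented, so everything here is an assumption: timestamps are coarsened by rounding down to a fixed granularity, and packets are compared per connection rather than in global arrival order. The names `coarsen`, `by_connection`, and `same_per_connection` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Packet = Tuple[Hashable, bytes]  # (connection id, payload)


def coarsen(ts_ms: int, granularity_ms: int) -> int:
    """Round a timestamp down so nearby values compare as equal."""
    return ts_ms - ts_ms % granularity_ms


def by_connection(packets: Iterable[Packet]) -> Dict[Hashable, List[bytes]]:
    """Group payloads by connection, keeping per-connection order."""
    streams: Dict[Hashable, List[bytes]] = defaultdict(list)
    for conn, payload in packets:
        streams[conn].append(payload)
    return dict(streams)


def same_per_connection(primary: Iterable[Packet],
                        secondary: Iterable[Packet]) -> bool:
    """True if every connection carries the same payload sequence."""
    return by_connection(primary) == by_connection(secondary)


primary = [("c1", b"a"), ("c2", b"x"), ("c1", b"b")]
secondary = [("c2", b"x"), ("c1", b"a"), ("c1", b"b")]
print(primary == secondary)                     # global order differs
print(same_per_connection(primary, secondary))  # each connection matches
```

Both steps remove differences that do not matter to the client, so fewer comparisons fail for incidental reasons.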




Similarities after Optimization
     [Figure: two panels, each plotting Packets # and Duration]
     • Web Server: Number of Packets (axis 0 to 16000) and Time (ms, axis 0 to 600) vs. Number of Threads (1, 2, 4, 8, 16, 32)
     • FTP Server: Packets # (axis 0 to 4000) and Time (ms, axis 0 to 400) for PUT and GET
     *Run Web Bench in Client
     For more complete information about performance and benchmark results, visit Performance Test Disclosure

Reducing the Cost of Active-checkpoint

     • Lazy Device State Update
       – Lazy network interface up/down
       – Lazy event channel up/down
     • Fast Path Communication




Checkpoint Cost with Optimizations
     [Figure: time spent per checkpoint, Spent Time (ms), axis 0 to 1800]
     • Components: Suspend, Resume, Netif-Up, Netif-Down, Mem Xmit
     • Configurations: Baseline, Lazy Netif, Lazy Event Channel, Efficient Comm.
     • Annotations: Leave Net UP; Leave EventChannel up; Replace XenStore Access with Eventchannel




      Final cost: 74 ms per checkpoint (1/3 on page transmission, 2/3 on suspend/resume)
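
As a quick worked breakdown of the final figure, the sketch below splits the 74 ms into the two stated shares. It assumes the 1/3 and 2/3 fractions are exact, although the slide most likely gives them as approximations.

```python
from fractions import Fraction

FINAL_COST_MS = 74  # per-checkpoint cost stated on the slide

# Assumption: the stated shares are treated as exact fractions.
page_transmission_ms = FINAL_COST_MS * Fraction(1, 3)
suspend_resume_ms = FINAL_COST_MS * Fraction(2, 3)

print(round(float(page_transmission_ms), 1))  # about 24.7 ms
print(round(float(suspend_resume_ms), 1))     # about 49.3 ms
```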

Configurations

     • Hardware
       – Intel® Core™ i7 platform with a 2.8 GHz quad-core processor
       – 2 GB RAM
       – 2 × Intel® 82576 1 Gbps NICs (internal & external)
     • Software
       – Xen 4.1
       – Domain 0: RHEL5U5
       – Guest: 32-bit BusyBox 1.20.0, Linux kernel 2.6.32
          • 256 MB RAM, with a ramdisk for storage


Bandwidth of NetPerf

     [Figure: two panels (TCP, UDP) plotting Bandwidth (Mb/s), axis 0 to 1000, vs. Message Size (54, 1500, 64K)]
     • Series: Native, Remus-20ms, Remus-40ms, COLO

FTP Server
     [Figure: Time (s) for PUT and GET, axis ticks doubling from 0.5 to 128]
     • Series: Remus-20ms, Remus-40ms, COLO, Native

Web Server - Concurrency
     [Figure: Throughput (Mbps), logarithmic axis 1 to 1000, vs. Threads (1, 2, 4, 8, 16, 32)]
     • Series: Native, Remus-20ms, Remus-40ms, COLO
     Run Web Bench in Client

Web Server - Throughput
     [Figure: Throughput (response/s), axis 0 to 1200, vs. Request / second (100, 500, 1000, 1500)]
     • Series: Native, Remus-20ms, Remus-40ms, COLO
     Run httperf in Client

Latency in Netperf/Ping
     [Figure: Average Latency (ms), axis 0 to 40, for Netperf-TCP-RR, Netperf-UDP-RR, and Ping]
     • Series: Native, Remus-20ms, Remus-40ms, COLO
     • Data labels shown on the chart: 0.28 and 0.40 (Netperf-TCP-RR), 0.28 and 0.4 (Netperf-UDP-RR), 0.38 and 0.55 (Ping)

Web Server - Latency
     [Figure: Response Latency (ms), axis 0 to 1200, vs. Request/second (100, 500, 1000, 1500)]
     • Series: Native, Remus-20ms, Remus-40ms, COLO
     Run httperf in Client

Summary

     • COLO is an ideal Application-agnostic Solution
       for Non-stop service
       – Web server: 67% of native performance
       – CPU, memory and netperf: near-native performance


     • Next steps:
       – Merge into Xen
       – More optimizations




                                            Software & Services Group

27
Software & Services Group

28

More Related Content

What's hot

Ron Broersma dren-stavanger-22 nov2011
Ron Broersma dren-stavanger-22 nov2011Ron Broersma dren-stavanger-22 nov2011
Ron Broersma dren-stavanger-22 nov2011
IPv6no
 
Service Function Chaining in Openstack Neutron
Service Function Chaining in Openstack NeutronService Function Chaining in Openstack Neutron
Service Function Chaining in Openstack Neutron
Michelle Holley
 

What's hot (20)

Introduction to the Helium release of OpenDaylight
Introduction to the Helium release of OpenDaylightIntroduction to the Helium release of OpenDaylight
Introduction to the Helium release of OpenDaylight
 
EAP TLS, the Rolls-Royce of extensible authentication protocol (EAP) methods ...
EAP TLS, the Rolls-Royce of extensible authentication protocol (EAP) methods ...EAP TLS, the Rolls-Royce of extensible authentication protocol (EAP) methods ...
EAP TLS, the Rolls-Royce of extensible authentication protocol (EAP) methods ...
 
IPv6 Adressvergabe und Adressierung
IPv6 Adressvergabe und AdressierungIPv6 Adressvergabe und Adressierung
IPv6 Adressvergabe und Adressierung
 
PLNOG15: Practical deployments of Kea, a high performance scalable DHCP - Tom...
PLNOG15: Practical deployments of Kea, a high performance scalable DHCP - Tom...PLNOG15: Practical deployments of Kea, a high performance scalable DHCP - Tom...
PLNOG15: Practical deployments of Kea, a high performance scalable DHCP - Tom...
 
PLNOG16: IOS XR – 12 lat innowacji, Krzysztof Mazepa
PLNOG16: IOS XR – 12 lat innowacji, Krzysztof MazepaPLNOG16: IOS XR – 12 lat innowacji, Krzysztof Mazepa
PLNOG16: IOS XR – 12 lat innowacji, Krzysztof Mazepa
 
Ron Broersma dren-stavanger-22 nov2011
Ron Broersma dren-stavanger-22 nov2011Ron Broersma dren-stavanger-22 nov2011
Ron Broersma dren-stavanger-22 nov2011
 
IPv6 Security und Hacking
IPv6 Security und HackingIPv6 Security und Hacking
IPv6 Security und Hacking
 
FreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceFreeSWITCH as a Microservice
FreeSWITCH as a Microservice
 
LAN, WAN, SAN upgrades: hyperconverged vs traditional vs cloud
LAN, WAN, SAN upgrades: hyperconverged vs traditional vs cloudLAN, WAN, SAN upgrades: hyperconverged vs traditional vs cloud
LAN, WAN, SAN upgrades: hyperconverged vs traditional vs cloud
 
FreeSWITCH on Docker
FreeSWITCH on DockerFreeSWITCH on Docker
FreeSWITCH on Docker
 
IPv6 i det mobile nettet: Pete Vickers, Network Engineer, Network Norway
IPv6 i det mobile nettet: Pete Vickers, Network Engineer, Network NorwayIPv6 i det mobile nettet: Pete Vickers, Network Engineer, Network Norway
IPv6 i det mobile nettet: Pete Vickers, Network Engineer, Network Norway
 
Evolution of network automation at Imperial College London
Evolution of network automation at Imperial College LondonEvolution of network automation at Imperial College London
Evolution of network automation at Imperial College London
 
OpenStack Neutron IPv6 Lessons
OpenStack Neutron IPv6 LessonsOpenStack Neutron IPv6 Lessons
OpenStack Neutron IPv6 Lessons
 
Distribution, redundancy and high availability using OpenSIPS
Distribution, redundancy and high availability using OpenSIPSDistribution, redundancy and high availability using OpenSIPS
Distribution, redundancy and high availability using OpenSIPS
 
Google and IPv6: Steinar H. Gunderson, Software engineer, Google
Google and IPv6: Steinar H. Gunderson, Software engineer, GoogleGoogle and IPv6: Steinar H. Gunderson, Software engineer, Google
Google and IPv6: Steinar H. Gunderson, Software engineer, Google
 
Service Function Chaining in Openstack Neutron
Service Function Chaining in Openstack NeutronService Function Chaining in Openstack Neutron
Service Function Chaining in Openstack Neutron
 
IPv6 at Mythic Beasts - Networkshop44
IPv6 at Mythic Beasts - Networkshop44IPv6 at Mythic Beasts - Networkshop44
IPv6 at Mythic Beasts - Networkshop44
 
BRKSDN-2115
BRKSDN-2115 BRKSDN-2115
BRKSDN-2115
 
HPNFVの取組みとMWC2015 – OpenStack最新情報セミナー 2015年4月
HPNFVの取組みとMWC2015 – OpenStack最新情報セミナー 2015年4月HPNFVの取組みとMWC2015 – OpenStack最新情報セミナー 2015年4月
HPNFVの取組みとMWC2015 – OpenStack最新情報セミナー 2015年4月
 
IPv6 deployment planning Jordi Palet
IPv6 deployment planning Jordi PaletIPv6 deployment planning Jordi Palet
IPv6 deployment planning Jordi Palet
 

Similar to COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service

Packet shaper datasheet 81
Packet shaper datasheet 81Packet shaper datasheet 81
Packet shaper datasheet 81
Zalli13
 
Packet shaper datasheet 81
Packet shaper datasheet 81Packet shaper datasheet 81
Packet shaper datasheet 81
Zalli13
 
TSM seminar 6.3-6.4
TSM seminar 6.3-6.4TSM seminar 6.3-6.4
TSM seminar 6.3-6.4
Solv AS
 

Similar to COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service (20)

Performance Oriented Design
Performance Oriented DesignPerformance Oriented Design
Performance Oriented Design
 
Packet shaper datasheet 81
Packet shaper datasheet 81Packet shaper datasheet 81
Packet shaper datasheet 81
 
Packet shaper datasheet 81
Packet shaper datasheet 81Packet shaper datasheet 81
Packet shaper datasheet 81
 
The Real World - Plugging the Enterprise Into It (nodejs)
The Real World - Plugging  the Enterprise Into It (nodejs)The Real World - Plugging  the Enterprise Into It (nodejs)
The Real World - Plugging the Enterprise Into It (nodejs)
 
Hadoop World 2011: BI on Hadoop in Financial Services - Stefan Grschupf, Data...
Hadoop World 2011: BI on Hadoop in Financial Services - Stefan Grschupf, Data...Hadoop World 2011: BI on Hadoop in Financial Services - Stefan Grschupf, Data...
Hadoop World 2011: BI on Hadoop in Financial Services - Stefan Grschupf, Data...
 
Nonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinNonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the Coin
 
Optimizing Performance of your Oracle Database using 8Gb Fibre Channel
Optimizing Performance of your Oracle Database using 8Gb Fibre ChannelOptimizing Performance of your Oracle Database using 8Gb Fibre Channel
Optimizing Performance of your Oracle Database using 8Gb Fibre Channel
 
Layer 7 and Oracle -
Layer 7 and Oracle - Layer 7 and Oracle -
Layer 7 and Oracle -
 
Betting On Data Grids
Betting On Data GridsBetting On Data Grids
Betting On Data Grids
 
Server Day 2009: Oracle/Bea Fusion Middleware by Paolo Ramasso
Server Day 2009: Oracle/Bea Fusion Middleware by Paolo RamassoServer Day 2009: Oracle/Bea Fusion Middleware by Paolo Ramasso
Server Day 2009: Oracle/Bea Fusion Middleware by Paolo Ramasso
 
Cisco UCS Solution EMC World 2015
Cisco UCS Solution EMC World 2015Cisco UCS Solution EMC World 2015
Cisco UCS Solution EMC World 2015
 
Effectively Plan for Your Move to the Cloud
Effectively Plan for Your Move to the CloudEffectively Plan for Your Move to the Cloud
Effectively Plan for Your Move to the Cloud
 
NTTs Journey with Openstack-final
NTTs Journey with Openstack-finalNTTs Journey with Openstack-final
NTTs Journey with Openstack-final
 
MBL303 Scalable Mobile and Web Apps - AWS re: Invent 2012
MBL303 Scalable Mobile and Web Apps - AWS re: Invent 2012MBL303 Scalable Mobile and Web Apps - AWS re: Invent 2012
MBL303 Scalable Mobile and Web Apps - AWS re: Invent 2012
 
TSM seminar 6.3-6.4
TSM seminar 6.3-6.4TSM seminar 6.3-6.4
TSM seminar 6.3-6.4
 
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
 
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
PLNOG15 :Assuring Performance, Scalability and Reliability in NFV Deployments...
 
CA Spectrum® Just Keeps Getting Better and Better
CA Spectrum® Just Keeps Getting Better and BetterCA Spectrum® Just Keeps Getting Better and Better
CA Spectrum® Just Keeps Getting Better and Better
 
Veloxum corporate introduction for crowdfunder may 29 2012
Veloxum corporate introduction for crowdfunder may 29 2012Veloxum corporate introduction for crowdfunder may 29 2012
Veloxum corporate introduction for crowdfunder may 29 2012
 
VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series
 

More from The Linux Foundation

More from The Linux Foundation (20)

ELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made Simple
 
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
 
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
 
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
 
XPDDS19 Keynote: Unikraft Weather Report
XPDDS19 Keynote:  Unikraft Weather ReportXPDDS19 Keynote:  Unikraft Weather Report
XPDDS19 Keynote: Unikraft Weather Report
 
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
 
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxXPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
 
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
 
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderXPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
 
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
 
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making... OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service

  • 1. COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Eddie Dong, Yunhong Jiang Software & Services Group 1
  • 2. Legal Disclaimer
      • INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
      • Intel may make changes to specifications and product descriptions at any time, without notice.
      • All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
      • Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
      • Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
      • *Other names and brands may be claimed as the property of others.
      • Copyright © 2012 Intel Corporation.
  • 3. Agenda
      • Background
      • COarse-grain LOck-stepping
      • Performance Optimization
      • Evaluation
      • Summary
  • 4. Non-Stop Service with VM Replication
      • Typical non-stop service requires
          – Expensive hardware for redundancy
          – Extensive software customization
      • VM replication: a cheap, application-agnostic solution
  • 5. Existing VM Replication Approaches
      • Replication per instruction: lock-stepping
          – Execute in parallel for deterministic instructions
          – Lock and step for non-deterministic instructions
      • Replication per epoch: continuous checkpoint
          – The secondary VM is synchronized with the primary VM once per epoch
          – Output is buffered within an epoch
  • 6. Problems
      • Lock-stepping
          – Excessive replication overhead
              • Memory access in an MP guest is non-deterministic
      • Continuous checkpoint
          – Extra network latency
          – Excessive VM checkpoint overhead
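The "extra network latency" of continuous checkpointing follows from the output buffering described on slide 5: a response produced early in an epoch cannot leave the host until that epoch's checkpoint completes. A minimal sketch of that effect, assuming a hypothetical `EpochBuffer` class and illustrative timings (neither comes from the slides):

```python
from typing import List, Tuple


class EpochBuffer:
    """Toy model: outgoing packets are held until the epoch's checkpoint completes."""

    def __init__(self) -> None:
        self.pending: List[Tuple[float, str]] = []

    def send(self, now_ms: float, packet: str) -> None:
        # The guest "sends" a packet, but it is only queued, not released.
        self.pending.append((now_ms, packet))

    def checkpoint(self, now_ms: float) -> List[Tuple[str, float]]:
        # Once the secondary is synchronized, release everything that was
        # buffered and report how long each packet waited.
        released = [(pkt, now_ms - sent_at) for sent_at, pkt in self.pending]
        self.pending.clear()
        return released


buf = EpochBuffer()
buf.send(now_ms=2.0, packet="resp-A")    # produced early in the epoch
buf.send(now_ms=15.0, packet="resp-B")   # produced late in the epoch
released = buf.checkpoint(now_ms=20.0)   # assumed 20 ms epoch boundary

for pkt, delay in released:
    print(f"{pkt} delayed by {delay:.1f} ms")
```

In this toy model the added delay for a packet is the time remaining in its epoch, so shorter epochs reduce latency but mean more frequent checkpoints.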
  • 8. Why COarse-grain LOck-stepping (COLO)
      • VM replication is an overly strong condition
          – Why do we care about the VM state?
              • The client cares only about the response
          – Can control fail over without "precise VM state replication"?
      • Coarse-grain lock-stepping VMs
          – The secondary VM is a valid replica as long as it has generated the same responses as the primary so far
              • It can then take over without stopping the service
      • Non-stop service focuses on the server response, not the internal machine state!
  • 9. How COLO Works
      • Response model for a client/server (C/S) system [equation not preserved in the transcript]
          – The two symbols in the equation (not preserved) are the request and the execution result of a non-deterministic instruction
          – Each response packet produced by the equation is a semantic response
      • Failover at the kth packet succeeds if [condition not preserved in the transcript], where C is the packet series the client received
  • 10. Architecture of COLO [Figure: COarse-grain LOck-stepping Virtual Machine for Non-stop Service]
  • 11. Why Better
      • Compared with continuous VM checkpoint
          – No buffering-introduced latency
          – Lower checkpoint frequency
              • On demand vs. periodic
      • Compared with lock-stepping
          – Eliminates the excessive overhead of non-deterministic instruction execution caused by MP-guest memory access
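The slides above imply an output-comparison loop: as long as the secondary produces the same response as the primary, nothing needs to be synchronized, and a checkpoint happens only on demand. The sketch below is one reading of that idea, not the authors' implementation. The function name `colo_release` is hypothetical, and triggering a checkpoint when responses differ is an assumption inferred from "on demand vs. periodic".

```python
from typing import Iterable, List, Tuple


def colo_release(primary: Iterable[bytes],
                 secondary: Iterable[bytes]) -> Tuple[List[bytes], int]:
    """Toy model: return the packets sent to the client and the number of
    on-demand checkpoints triggered by diverging responses."""
    released: List[bytes] = []
    checkpoints = 0
    for p_pkt, s_pkt in zip(primary, secondary):
        if p_pkt != s_pkt:
            # Assumption: a mismatch means the replica is no longer valid,
            # so the secondary is resynchronized from the primary here.
            checkpoints += 1
        # The client always receives the primary's response.
        released.append(p_pkt)
    return released, checkpoints


primary_out = [b"HTTP 200 a", b"HTTP 200 b", b"HTTP 200 c"]
secondary_out = [b"HTTP 200 a", b"HTTP 200 X", b"HTTP 200 c"]
released, checkpoints = colo_release(primary_out, secondary_out)
print(len(released), "packets released,", checkpoints, "checkpoint(s)")
```

The toy uses precomputed packet lists, so it does not model how a checkpoint changes the secondary's later output. It only illustrates why checkpoint frequency depends on how often the two response streams differ.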
  • 13. Performance Challenges
      • Frequency of checkpoints
          – Highly dependent on output similarity (response similarity)
              • The key focus is on TCP packets!
      • Cost of a checkpoint
          – Xen/Remus uses passive checkpointing
              • The secondary VM is not resumed until failover → slow path
          – COLO implements active checkpointing
              • The secondary VM resumes frequently
  • 14. Improving Response Similarity
      • Minor modifications to the guest TCP/IP stack
          – Coarse-grain timestamp
          – Highly deterministic ACK mechanism
          – Coarse-grain notification window size
          – Per-connection comparison
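Two of the items above can be illustrated with a small sketch: rounding timestamps to a coarse grain so that small clock differences between the two VMs do not change packet contents, and comparing output per connection so that different interleaving across connections does not count as a mismatch. The slides give no details, so everything here is assumed for illustration: the `Packet` type, the `(client IP, client port)` connection key, and the 100 ms grain.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

ConnKey = Tuple[str, int]  # hypothetical key: (client IP, client port)


class Packet(NamedTuple):
    conn: ConnKey
    ts_ms: int      # timestamp carried in the packet, in milliseconds
    payload: bytes


def coarsen(ts_ms: int, grain_ms: int = 100) -> int:
    """Round a timestamp down to a multiple of grain_ms (assumed granularity)."""
    return ts_ms - ts_ms % grain_ms


def per_connection(packets: List[Packet]) -> Dict[ConnKey, List[Tuple[int, bytes]]]:
    """Group packets by connection, keeping the order within each connection."""
    streams: Dict[ConnKey, List[Tuple[int, bytes]]] = defaultdict(list)
    for pkt in packets:
        streams[pkt.conn].append((coarsen(pkt.ts_ms), pkt.payload))
    return dict(streams)


a, b = ("10.0.0.2", 4000), ("10.0.0.3", 4001)
primary = [Packet(a, 1003, b"A1"), Packet(b, 1010, b"B1"), Packet(a, 1050, b"A2")]
secondary = [Packet(b, 1012, b"B1"), Packet(a, 1007, b"A1"), Packet(a, 1061, b"A2")]

same_global = primary == secondary                                   # raw, global order
same_per_conn = per_connection(primary) == per_connection(secondary)
print(same_global, same_per_conn)
```

Rounding does not guarantee identical values: two timestamps that straddle a grain boundary still differ after coarsening, so this only makes matching output more likely.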
  • 15. Similarities after Optimization [Figure, two panels. Web Server: number of packets and duration (ms) vs. number of threads (1, 2, 4, 8, 16, 32), running Web Bench in the client. FTP Server: number of packets and duration (ms) for PUT and GET.] For more complete information about performance and benchmark results, visit Performance Test Disclosure
  • 16. Reducing the Cost of Active-checkpoint
      • Lazy device state update
          – Lazy network interface up/down
          – Lazy event channel up/down
      • Fast-path communication
  • 17. Checkpoint Cost with Optimizations [Figure: stacked bars of time spent (ms) in Suspend, Resume, Netif-Up, Netif-Down and Mem Xmit for four configurations: Baseline, Lazy Netif, Lazy Event Channel, Efficient Comm. Annotations: leave the network interface up; leave the event channel up; replace XenStore access with an event channel.]
      • Final cost: 74 ms per checkpoint (1/3 on page transmission, 2/3 on suspend/resume)
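The final cost quoted above can be split into its two stated parts with simple arithmetic. The per-second ceiling at the end is an illustrative extrapolation, not a figure from the slides: it assumes checkpoints run back to back with no guest execution in between.

```python
total_ms = 74.0                       # final cost per checkpoint, from the slide

page_xmit_ms = total_ms * 1 / 3       # share spent on page transmission
suspend_resume_ms = total_ms * 2 / 3  # share spent on suspend/resume

# Illustrative upper bound only (assumption: checkpoints run back to back).
max_checkpoints_per_s = 1000.0 / total_ms

print(f"page transmission: {page_xmit_ms:.1f} ms")
print(f"suspend/resume:    {suspend_resume_ms:.1f} ms")
print(f"ceiling:           {max_checkpoints_per_s:.1f} checkpoints/s")
```

So roughly 25 ms of each checkpoint goes to transmitting pages and roughly 49 ms to suspending and resuming the VM.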
  • 19. Configurations
      • Hardware
          – Intel® Core™ i7 platform, 2.8 GHz quad-core processor
          – 2 GB RAM
          – Two Intel® 82576 1 Gbps NICs (internal and external)
      • Software
          – Xen 4.1
          – Domain 0: RHEL5U5
          – Guest: 32-bit BusyBox 1.20.0, Linux kernel 2.6.32
              • 256 MB RAM, with a ramdisk for storage
  • 20. Bandwidth of NetPerf [Figure, two panels (TCP and UDP): bandwidth (Mb/s) vs. message size (54, 1500, 64K) for Native, Remus-20ms, Remus-40ms and COLO.]
  • 21. FTP Server [Figure: time (s) for PUT and GET, axis ticks from 0.5 to 128 doubling at each step, for Native, Remus-20ms, Remus-40ms and COLO.]
  • 22. Web Server - Concurrency [Figure: throughput (Mbps), axis ticks at 1, 10, 100 and 1000, vs. number of threads (1, 2, 4, 8, 16, 32) for Native, Remus-20ms, Remus-40ms and COLO; Web Bench runs in the client.]
  • 23. Web Server - Throughput [Figure: throughput (responses/s) vs. request rate (100, 500, 1000, 1500 requests/s) for Native, Remus-20ms, Remus-40ms and COLO; httperf runs in the client.]
  • 24. Latency in Netperf/Ping [Figure: average latency (ms) for Netperf-TCP-RR, Netperf-UDP-RR and Ping, for Native, Remus-20ms, Remus-40ms and COLO. Bar labels visible in the source: 0.28, 0.40, 0.28, 0.4, 0.38, 0.55; their assignment to series is not recoverable.]
  • 25. Web Server - Latency [Figure: response latency (ms) vs. request rate (100, 500, 1000, 1500 requests/s) for Native, Remus-20ms, Remus-40ms and COLO; httperf runs in the client.]
  • 27. Summary
      • COLO is an ideal application-agnostic solution for non-stop service
          – Web server: 67% of native performance
          – CPU, memory and netperf: near-native performance
      • Next steps
          – Merge into Xen
          – More optimizations