SlideShare a Scribd company logo
1 of 29
Brought to you by
Cutting Through the Fog
of Virtualization
Bernd Bandemer
Head of Data Science at Clockwork.io
Bernd Bandemer
Head of Data Science at Clockwork.io
■ Accurate, scalable, and stable clock sync is
a game changer in distributed computing
■ My prior lives:
Aruba Networks, Stanford, originally from Germany
The Promise of Virtualization
■ Dynamic demand automatically drives scale-up / scale-down
■ Resource guarantees
● Every VM behaves the same, independent of location, date, time of day
● Every network link between VMs behaves the same
■ Resource isolation
● Your neighbors won't bother you
● Your own VMs won't affect each other
VM Colocation
Top of Rack Switch
Fabric switch Fabric switch
Fabric switch
Top of Rack Switch Top of Rack Switch
Physical machines
(hosts)
VM Colocation
Top of Rack Switch
Physical machines
(hosts)
Fabric switch Fabric switch
Fabric switch
Top of Rack Switch Top of Rack Switch
Virtual machines
VM Colocation
Top of Rack Switch
Fabric switch Fabric switch
Fabric switch
Top of Rack Switch Top of Rack Switch
Virtual machines
YOUR virtual machines
Physical machines
(hosts)
VM Colocation
Top of Rack Switch
Fabric switch Fabric switch
Fabric switch
Top of Rack Switch Top of Rack Switch
Virtual machines
YOUR virtual machines
Physical machines
(hosts)
VM Colocation
Top of Rack Switch
Fabric switch Fabric switch
Fabric switch
Top of Rack Switch Top of Rack Switch
Virtual machines
YOUR virtual machines
Colocated on same host
Physical machines
(hosts)
VM Colocation vs. Shared Tenancy
■ Shared Tenancy
● Load on VMs is independent of each other
● Load spikes average out across the VMs
VM Colocation is much worse than shared tenancy
■ VM Colocation
● VMs participate in the same workload at the same time
● Simultaneous load spikes, averaging does not help
Potential Effects of VM Colocation
■ CPU and memory are well isolated between VMs
■ Networking is not well isolated
● Bandwidth: colocated VMs share a physical network interface (NICs)
● Latency: packets between colocated VMs don't travel through the network
● Packet drops: packets between colocated VMs can't get dropped in the network
How do these effects play out
in practice?
Real-world Data
■ ~3,000 test cluster instances
■ Amazon EKS, Google GKE, Azure AKS
■ Three Geo regions:
Eastern US (Virginia), Western Europe (London), Southeast Asia (Singapore)
■ Each cluster
● Bring up 50 virtual machine instances
● Instrumented with Clockwork's clock sync system
● Latency Sensei Audit, which includes several phases of network load
Determining VM Colocation
■ Clock sync system measures relative
clock offsets and clock drifts
● This is done by exchanging small UDP
packets in a probe mesh and measuring
their transit times
■ Colocated VMs share a physical
system clock
● This can be detected and used to
reverse-engineer the colocation
● Validated by many experiments, including
sole-tenant hosts
← Example colocation structure
Network Bandwidth
Network Bandwidth
■ We measure network bandwidth by sending long TCP flows between the VMs
Colocation structure Egress bandwidth
10 Gbps →
5 Gbps →
0 Gbps →
Colocated VMs have severely lower network bandwidth
Network Bandwidth on Google Cloud
n1-standard-4 n2-standard-4
Bandwidth is impacted when 3 or more VMs are colocated
Network Bandwidth on AWS
m4.xlarge m5.xlarge
Colocation is purposely limited; no bandwidth impact
Network Bandwidth on Azure
Standard_D4s_v3 Standard_D4s_v4
Bandwidth impact appears when colocation > 4
Network Bandwidth on Azure
During low-load times, Azure lifts the speed limit
20 Gbps →
10 Gbps →
0 Gbps →
Network Latency
Network Latency
■ We measure two-way delay between any two VMs with high accuracy
● Two-way delay is the sum of two one-way delays
● Exclude the effect of ACK turn-around time and sender/receiver stack delays
Network Latency on AWS
AWS virtual networking hides any potential latency benefit
■ No visible difference between
colocated and non-colocated
pairs of VM
Network Latency on AWS
AWS virtual networking hides any potential latency benefit
■ Two distinct modes, probably
explained by different generations
of networking implementation
Network Latency on Azure
In Azure, colocated VMs have higher latency
■ Azure's accelerated networking
optimizes the typical case
■ VMs on the same physical host raise an
exception that is handled in software
Network Latency on Google Cloud
In Google cloud, each region/instance type behaves differently
Packet Drops
Network Packet Drops
■ In each measurement run, we send 10s of millions of probe packets
■ UDP packets may get lost on the way
Non-colocated VMs Colocated VMs
Azure 68 ppm 60 ppm
AWS 220 ppm 213 ppm
Google 60 ppm 62 ppm
Packet drops rates are independent of VM colocation
Conclusion
■ VM Colocation has performance impact and no upside
● Highly colocated VMs have lower network bandwidth
● Colocation has no latency or reliability benefit
■ For optimal cloud system performance, VM colocation should be avoided
■ Clockwork Latency Sensei provides visibility into your cloud system
● Accurate clock sync makes VM colocation visible
● Latency Sensei audit reports highlight the impact on YOUR cloud system
Brought to you by
Bernd Bandemer
bernd@clockwork.io
www.clockwork.io
Thank you!

More Related Content

Similar to Cutting Through the Fog of Virtualization

Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.
Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.
Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.Semihalf
 
Testing the limits of cloud networks
Testing the limits of cloud networksTesting the limits of cloud networks
Testing the limits of cloud networksPLUMgrid
 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenParticular Software
 
Challenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM MigrationChallenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM MigrationSarmad Makhdoom
 
Aws Architecture Fundamentals
Aws Architecture FundamentalsAws Architecture Fundamentals
Aws Architecture Fundamentals2nd Watch
 
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITThings You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITOpenStack
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Igor Sfiligoi
 
Cloud interconnection networks basic .pptx
Cloud interconnection networks basic .pptxCloud interconnection networks basic .pptx
Cloud interconnection networks basic .pptxRahulBhole12
 
Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Paul Brebner
 
Tokyo azure meetup #12 service fabric internals
Tokyo azure meetup #12   service fabric internalsTokyo azure meetup #12   service fabric internals
Tokyo azure meetup #12 service fabric internalsTokyo Azure Meetup
 
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...VMworld
 
Edge Zones In CloudStack
Edge Zones In CloudStackEdge Zones In CloudStack
Edge Zones In CloudStackShapeBlue
 
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...ETCenter
 
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18The lies we tell our code, LinuxCon/CloudOpen 2015-08-18
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18Casey Bisson
 
OpenStack and OpenContrail for FreeBSD platform by Michał Dubiel
OpenStack and OpenContrail for FreeBSD platform by Michał DubielOpenStack and OpenContrail for FreeBSD platform by Michał Dubiel
OpenStack and OpenContrail for FreeBSD platform by Michał Dubieleurobsdcon
 
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary SlidesRise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary SlidesDiUS
 
Migrate to Microservices Judiciously!
Migrate to Microservices Judiciously!Migrate to Microservices Judiciously!
Migrate to Microservices Judiciously!pflueras
 
Dependable Cloud Comuting
Dependable Cloud ComutingDependable Cloud Comuting
Dependable Cloud ComutingKazuhiko Kato
 

Similar to Cutting Through the Fog of Virtualization (20)

Vpc aws meetup
Vpc   aws meetupVpc   aws meetup
Vpc aws meetup
 
Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.
Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.
Software Defined Networks (SDN) na przykładzie rozwiązania OpenContrail.
 
Testing the limits of cloud networks
Testing the limits of cloud networksTesting the limits of cloud networks
Testing the limits of cloud networks
 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves Goeleven
 
Challenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM MigrationChallenges in Cloud Computing – VM Migration
Challenges in Cloud Computing – VM Migration
 
Aws Architecture Fundamentals
Aws Architecture FundamentalsAws Architecture Fundamentals
Aws Architecture Fundamentals
 
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITThings You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
 
Cloud interconnection networks basic .pptx
Cloud interconnection networks basic .pptxCloud interconnection networks basic .pptx
Cloud interconnection networks basic .pptx
 
Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...
 
Tokyo azure meetup #12 service fabric internals
Tokyo azure meetup #12   service fabric internalsTokyo azure meetup #12   service fabric internals
Tokyo azure meetup #12 service fabric internals
 
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...
VMworld 2013: Silent Killer: How Latency Destroys Performance...And What to D...
 
Edge Zones In CloudStack
Edge Zones In CloudStackEdge Zones In CloudStack
Edge Zones In CloudStack
 
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...
Shoot the Bird: Linear Broadcast Distribution on AWS by Usman Shakeel of Amaz...
 
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18The lies we tell our code, LinuxCon/CloudOpen 2015-08-18
The lies we tell our code, LinuxCon/CloudOpen 2015-08-18
 
OpenStack and OpenContrail for FreeBSD platform by Michał Dubiel
OpenStack and OpenContrail for FreeBSD platform by Michał DubielOpenStack and OpenContrail for FreeBSD platform by Michał Dubiel
OpenStack and OpenContrail for FreeBSD platform by Michał Dubiel
 
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary SlidesRise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
 
Migrate to Microservices Judiciously!
Migrate to Microservices Judiciously!Migrate to Microservices Judiciously!
Migrate to Microservices Judiciously!
 
Dependable Cloud Comuting
Dependable Cloud ComutingDependable Cloud Comuting
Dependable Cloud Comuting
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Recently uploaded

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Recently uploaded (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Cutting Through the Fog of Virtualization

  • 1. Brought to you by Cutting Through the Fog of Virtualization Bernd Bandemer Head of Data Science at Clockwork.io
  • 2. Bernd Bandemer Head of Data Science at Clockwork.io ■ Accurate, scalable, and stable clock sync is a game changer in distributed computing ■ My prior lives: Aruba Networks, Stanford, originally from Germany
  • 3. The Promise of Virtualization ■ Dynamic demand automatically drives scale-up / scale-down ■ Resource guarantees ● Every VM behaves the same, independent of location, date, time of day ● Every network link between VMs behaves the same ■ Resource isolation ● Your neighbors won't bother you ● Your own VMs won't affect each other
  • 4. VM Colocation Top of Rack Switch Fabric switch Fabric switch Fabric switch Top of Rack Switch Top of Rack Switch Physical machines (hosts)
  • 5. VM Colocation Top of Rack Switch Physical machines (hosts) Fabric switch Fabric switch Fabric switch Top of Rack Switch Top of Rack Switch Virtual machines
  • 6. VM Colocation Top of Rack Switch Fabric switch Fabric switch Fabric switch Top of Rack Switch Top of Rack Switch Virtual machines YOUR virtual machines Physical machines (hosts)
  • 7. VM Colocation Top of Rack Switch Fabric switch Fabric switch Fabric switch Top of Rack Switch Top of Rack Switch Virtual machines YOUR virtual machines Physical machines (hosts)
  • 8. VM Colocation Top of Rack Switch Fabric switch Fabric switch Fabric switch Top of Rack Switch Top of Rack Switch Virtual machines YOUR virtual machines Colocated on same host Physical machines (hosts)
  • 9. VM Colocation vs. Shared Tenancy ■ Shared Tenancy ● Load on VMs is independent of each other ● Load spikes average out across the VMs VM Colocation is much worse than shared tenancy ■ VM Colocation ● VMs participate in the same workload at the same time ● Simultaneous load spikes, averaging does not help
  • 10. Potential Effects of VM Colocation ■ CPU and memory are well isolated between VMs ■ Networking is not well isolated ● Bandwidth: colocated VMs share a physical network interface (NICs) ● Latency: packets between colocated VMs don't travel through the network ● Packet drops: packets between colocated VMs can't get dropped in the network
  • 11. How do these effects play out in practice?
  • 12. Real-world Data ■ ~3,000 test cluster instances ■ Amazon EKS, Google GKE, Azure AKS ■ Three Geo regions: Eastern US (Virginia), Western Europe (London), Southeast Asia (Singapore) ■ Each cluster ● Bring up 50 virtual machine instances ● Instrumented with Clockwork's clock sync system ● Latency Sensei Audit, which includes several phases of network load
  • 13. Determining VM Colocation ■ Clock sync system measures relative clock offsets and clock drifts ● This is done by exchanging small UDP packets in a probe mesh and measuring their transit times ■ Colocated VMs share a physical system clock ● This can be detected and used to reverse-engineer the colocation ● Validated by many experiments, including sole-tenant hosts ← Example colocation structure
  • 15. Network Bandwidth ■ We measure network bandwidth by sending long TCP flows between the VMs Colocation structure Egress bandwidth 10 Gbps → 5 Gbps → 0 Gbps → Colocated VMs have severely lower network bandwidth
  • 16. Network Bandwidth on Google Cloud n1-standard-4 n2-standard-4 Bandwidth is impacted when 3 or more VMs are colocated
  • 17. Network Bandwidth on AWS m4.xlarge m5.xlarge Colocation is purposely limited; no bandwidth impact
  • 18. Network Bandwidth on Azure Standard_D4s_v3 Standard_D4s_v4 Bandwidth impact appears when colocation > 4
  • 19. Network Bandwidth on Azure During low-load times, Azure lifts the speed limit 20 Gbps → 10 Gbps → 0 Gbps →
  • 21. Network Latency ■ We measure two-way delay between any two VMs with high accuracy ● Two-way delay is the sum of two one-way delays ● Exclude the effect of ACK turn-around time and sender/receiver stack delays
  • 22. Network Latency on AWS AWS virtual networking hides any potential latency benefit ■ No visible difference between colocated and non-colocated pairs of VM
  • 23. Network Latency on AWS AWS virtual networking hides any potential latency benefit ■ Two distinct modes, probably explained by different generations of networking implementation
  • 24. Network Latency on Azure In Azure, colocated VMs have higher latency ■ Azure's accelerated networking optimizes the typical case ■ VMs on the same physical host raise an exception that is handled in software
  • 25. Network Latency on Google Cloud In Google cloud, each region/instance type behaves differently
  • 27. Network Packet Drops ■ In each measurement run, we send 10s of millions of probe packets ■ UDP packets may get lost on the way Non-colocated VMs Colocated VMs Azure 68 ppm 60 ppm AWS 220 ppm 213 ppm Google 60 ppm 62 ppm Packet drops rates are independent of VM colocation
  • 28. Conclusion ■ VM Colocation has performance impact and no upside ● Highly colocated VMs have lower network bandwidth ● Colocation has no latency or reliability benefit ■ For optimal cloud system performance, VM colocation should be avoided ■ Clockwork Latency Sensei provides visibility into your cloud system ● Accurate clock sync makes VM colocation visible ● Latency Sensei audit reports highlight the impact on YOUR cloud system
  • 29. Brought to you by Bernd Bandemer bernd@clockwork.io www.clockwork.io Thank you!