Jiannan Ouyang, John Lange
Department of Computer Science
University of Pittsburgh
VEE ’13
03/17/2013
Preemptable Ticket Spinlocks
Improving Consolidated Performance in the Cloud
Motivation
—  VM interference in overcommitted environments
—  OS synchronization overhead
—  Lock holder preemption (LHP)
Contributions
—  Lock Waiter Preemption
—  significance analysis of lock waiter preemption
—  Preemptable Ticket Spinlock
—  implementation inside Linux
—  Evaluation
—  significant speedup over Linux
Spinlocks
—  Basics
—  lock() & unlock()
—  Busy waiting lock
—  generic spinlock: random order, unfair (starvation)
—  ticket spinlock: FIFO order, fair
—  Designed for fast mutual exclusion
—  busy waiting vs. sleep/wakeup
—  spinlocks for short & fast critical sections (~1us)
—  OS assumptions
—  use spinlocks for short critical sections only
—  never preempt a thread holding or waiting on a kernel spinlock
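The FIFO behaviour of a ticket spinlock can be sketched with two counters. What follows is a minimal single-threaded Python model, not kernel code: the class and method names are hypothetical, and a real lock would take tickets with an atomic fetch-and-add and busy-wait instead of polling from a loop.

```python
class TicketLock:
    """Single-threaded model of a ticket spinlock (illustrative names)."""

    def __init__(self):
        self.head = 0  # ticket currently being served
        self.tail = 0  # next ticket to hand out

    def take_ticket(self):
        # Real code would do this with one atomic fetch-and-add.
        ticket = self.tail
        self.tail += 1
        return ticket

    def may_enter(self, ticket):
        # A waiter spins until its ticket is the one being served.
        return ticket == self.head

    def unlock(self):
        # Releasing passes the lock to the next ticket in FIFO order.
        self.head += 1


lock = TicketLock()
a, b, c = lock.take_ticket(), lock.take_ticket(), lock.take_ticket()

# Poll the waiters in reverse arrival order; the lock still serves them FIFO.
order = []
for _ in range(3):
    for name, ticket in (("c", c), ("b", b), ("a", a)):
        if lock.may_enter(ticket):
            order.append(name)
            lock.unlock()
            break
```

Because only the waiter whose ticket equals head may enter, the order of service is fixed at enqueue time. That is what makes the lock fair, and also what makes it fragile when one waiter stops running.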
Preemption in VMs
—  Lock Holder Preemption (LHP)
—  virtualization breaks the OS assumption
—  vCPU holding a lock is unscheduled by the VMM
—  preemption prolongs the critical section (~1ms vs. ~1us)
—  Proposed Solutions
—  Co-scheduling and variants
—  Hardware-assisted scheme (Pause Loop Exiting)
—  Paravirtual spinlocks
Preemption in Ticket Lock
Animation frames, in order (each ticket is drawn as either a scheduled waiter or a preempted waiter):
—  queue 0 1 2, head = 0, tail = 2
—  queue 0 1 2 3, head = 0, tail = 3
—  queue 0 1 2 3 4, head = 0, tail = 4
—  queue 1 2 3 4, head = 1, tail = 4: Lock Holder Preemption!
—  queue 2 3 4, head = 2, tail = 4
—  queue 3 4, head = 3, tail = 4
—  queue 3 4 5, head = 3, tail = 5
—  queue 3 4 5 6, head = 3, tail = 6: Lock Waiter Preemption (wait on available resource)
Lock Waiter Preemption
—  Lock waiter is preempted
—  Later waiters wait on an available lock
—  Possible to adapt to it, if we
—  detect preempted waiter
—  acquire lock out of order
How significant is it?
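The waiter-preemption condition can be illustrated with a small helper. This is a sketch under stated assumptions: the function name and arguments are hypothetical, and the example assumes that ticket 3 in the queue 3 4 5 6 is the preempted waiter, which the slide text does not state explicitly.

```python
def stalled_waiters(queue, lock_held, preempted):
    """Return the scheduled tickets that spin although the lock is free.

    queue     -- tickets still waiting, in FIFO order
    lock_held -- True while some vCPU is inside the critical section
    preempted -- set of tickets whose vCPU is currently descheduled
    """
    if not queue or lock_held:
        return []  # nothing waiting, or the lock is genuinely busy
    next_in_line = queue[0]
    if next_in_line not in preempted:
        return []  # the next waiter is running and will take the lock
    # The lock is free, but FIFO order reserves it for a preempted waiter.
    return [t for t in queue[1:] if t not in preempted]


# Assumed scenario: the lock is free and ticket 3 (next in line) is preempted.
blocked = stalled_waiters([3, 4, 5, 6], lock_held=False, preempted={3})
```

In this scenario every later waiter is runnable and the lock is available, yet none of them can make progress until ticket 3 is scheduled again.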
Waiter Preemption Dominates
Benchmark      LHP + LWP   LWP     LWP / (LHP + LWP)
hackbench x1   1089        452     41.5%
hackbench x2   44342       39221   88.5%
ebizzy x1      294         166     56.5%
ebizzy x2      1017        980     96.4%
Table 2: Lock Waiter Preemption Problem in the Linux Kernel
Lock waiter preemption dominates in overcommitted environments
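The percentage column of Table 2 can be rechecked from the two count columns. The snippet below only redoes that arithmetic; the dictionary layout and function name are illustrative.

```python
# (LHP + LWP, LWP) counts copied from Table 2.
rows = {
    "hackbench x1": (1089, 452),
    "hackbench x2": (44342, 39221),
    "ebizzy x1": (294, 166),
    "ebizzy x2": (1017, 980),
}


def lwp_share(total, lwp):
    """Lock waiter preemptions as a percentage of all preemptions."""
    return round(100.0 * lwp / total, 1)


shares = {name: lwp_share(total, lwp) for name, (total, lwp) in rows.items()}
```

The recomputed shares match the table to one decimal place.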
Challenges & Approach
—  How to identify a preempted waiter?
—  timeout threshold
—  How to violate order constraints?
—  allow timed-out waiters to get the lock in random order
—  ensure mutual exclusion between them
—  How NOT to break the whole ordering mechanism?
—  timeout threshold proportional to queue position
Queue Position Index
N = ticket - queue_head
—  ticket: copy of queue tail value upon enqueue
—  N: number of earlier waiters
Example: with head = n and tail = n+2, the waiter holding ticket = n+2 has N = 2
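As a worked instance of the index, the sketch below evaluates N for each ticket in the example queue. The function name and the concrete value chosen for n are illustrative only.

```python
def queue_position(ticket, queue_head):
    """N: how many waiters are ahead of this ticket."""
    return ticket - queue_head


n = 7  # arbitrary value standing in for the slide's symbolic n
head = n
positions = {ticket: queue_position(ticket, head) for ticket in (n, n + 1, n + 2)}
# positions maps ticket -> N for tickets n, n+1, n+2
```

The ticket equal to head has N = 0, and the last ticket in the example has N = 2, as on the slide.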
Proportional Timeout Threshold
T = N × t
—  t is a constant parameter
—  large enough to avoid false detection
—  small enough to save waiting time
—  Performance is NOT sensitive to the value of t
—  most locks take ~1us & most spinning time is wasted on locks that wait ~1ms
—  larger t does not harm & smaller t does not gain much
Example: with head = n and tail = n+2, tickets n, n+1, n+2 have timeout thresholds 0, t, 2t
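The proportional threshold T = N × t can be written as two short functions. This is a sketch: the function names are hypothetical, and the unit of t (spin-loop iterations here) and its value of 1024 are assumptions made for the example, not figures from the slides.

```python
def timeout_threshold(ticket, head, t):
    """T = N * t, where N = ticket - head."""
    return (ticket - head) * t


def has_timed_out(ticket, head, t, spun):
    """True once a waiter has spun longer than its threshold."""
    return spun > timeout_threshold(ticket, head, t)


T_UNIT = 1024  # assumed value of t, in spin-loop iterations
n = 100        # arbitrary head value standing in for the slide's n
thresholds = [timeout_threshold(ticket, n, T_UNIT) for ticket in (n, n + 1, n + 2)]
```

Waiters further back tolerate proportionally longer waits, so a waiter only times out after everyone ahead of it has had a chance to take the lock in order.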
Preemptable Ticket Spinlock
Animation frames, in order (each ticket is drawn as either a scheduled waiter or a preempted waiter; timeout thresholds follow N = ticket - head):
—  queue 0 1 2 3 4 5, head = 0, tail = 5; timeout thresholds 0 t 2t 3t 4t 5t
—  queue 1 2 3 4 5, head = 1, tail = 5; timeout thresholds 0 t 2t 3t 4t
—  queue 1 3 4 5, head = 2, tail = 5; timeout thresholds t 2t 3t
—  queue 1 3 5, head = 3, tail = 5; timeout thresholds 0 2t
Summary
—  Preemptable Ticket Lock adapts to preemption
—  preserves order in the absence of preemption
—  violates order upon preemption
—  Preemptable Ticket Lock preserves fairness
—  order violations are restricted
—  priority is always given to timed-out waiters
—  number of timed-out waiters is bounded by the number of vCPUs in a VM
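The behaviour summarized above can be sketched as a simplified single-threaded model. It is not the authors' Linux implementation. The class and method names are hypothetical, and the rules for updating head on acquire and release are assumptions chosen to keep the model small. The priority given to timed-out waiters is not modelled.

```python
from dataclasses import dataclass


@dataclass
class PreemptableTicketLock:
    t: int              # per-position timeout constant (the slides' t)
    head: int = 0       # ticket currently allowed to enter in order
    tail: int = 0       # next ticket to hand out
    held: bool = False  # mutual exclusion among all contenders

    def enqueue(self):
        ticket = self.tail
        self.tail += 1
        return ticket

    def try_acquire(self, ticket, spun):
        n = ticket - self.head       # number of earlier waiters
        in_order = n <= 0            # at the head, or already skipped
        timed_out = spun > n * self.t
        if self.held or not (in_order or timed_out):
            return False
        self.held = True
        # Assumption: an out-of-order holder moves head up to its ticket.
        self.head = max(self.head, ticket)
        return True

    def release(self, ticket):
        self.held = False
        # Assumption: head never moves backwards on a late release.
        self.head = max(self.head, ticket + 1)


lock = PreemptableTicketLock(t=1000)
t0, t1, t2 = lock.enqueue(), lock.enqueue(), lock.enqueue()

first = lock.try_acquire(t0, spun=0)       # in order, succeeds
lock.release(t0)                           # head -> 1; ticket 1 is preempted
early = lock.try_acquire(t2, spun=500)     # N = 1, threshold 1000: keep waiting
late = lock.try_acquire(t2, spun=1500)     # timed out: acquire out of order
blocked = lock.try_acquire(t1, spun=9000)  # ticket 1 wakes up, lock is held
lock.release(t2)
resumed = lock.try_acquire(t1, spun=9000)  # skipped waiter gets in afterwards
```

Order is kept while every waiter is running (ticket 2 is refused before its threshold) and is violated only after a timeout. The skipped waiter is not starved once it is scheduled again.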
Implementation
—  Drop-in replacement
—  lock(), unlock(), is_locked(), trylock(), etc.
—  Correct
—  race condition free: atomic updates
—  Fast
—  performance is sensitive to lock efficiency
—  ~60 lines of C/inline-assembly in Linux 3.5.0
Paravirtual Spinlocks
—  Lock holder preemption is unaddressed
—  semantic gap between guest and host
—  paravirtualization: guest/host cooperation
—  guest signals a long-waiting lock / host puts the vCPU to sleep
—  guest notifies to wake up a vCPU / host wakes up the vCPU
—  paravirtual preemptable ticket spinlock
—  sleep when waiting too long after timing out
—  wake up all sleeping waiters upon lock release
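The spin-then-sleep policy of the paravirtual variant can be pictured as a decision over how long a waiter has been spinning. This is a sketch only: the function name, the `sleep_after` parameter, and the returned labels are hypothetical, and the slides do not say how the sleep threshold is chosen.

```python
def waiter_action(spun, n, t, sleep_after):
    """Decide what a waiter does after spinning for `spun` time units.

    n           -- queue position (number of earlier waiters)
    t           -- per-position timeout constant
    sleep_after -- assumed extra wait, past the timeout, before sleeping
    """
    timeout = n * t
    if spun <= timeout:
        return "spin"     # keep waiting in FIFO order
    if spun <= timeout + sleep_after:
        return "compete"  # timed out: try to take the lock out of order
    return "sleep"        # ask the host to deschedule this vCPU


actions = [waiter_action(spun, n=2, t=1000, sleep_after=4000)
           for spun in (500, 2500, 7000)]
```

A sleeping waiter stops burning CPU time that other vCPUs could use, and the lock release wakes the sleepers so they can compete again.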
Evaluation
—  Host
—  8-core 2.6GHz Intel Core i7 CPU, 8 GB RAM, 1Gbit NIC, Fedora 17 (Linux 3.5.0)
—  Guest
—  8 cores, 1 GB RAM, Fedora 17 (Linux 3.5.0)
—  Benchmarks
—  hackbench, ebizzy, Dell DVD Store
—  Lock implementations
—  baseline: ticket lock, paravirtual ticket lock (pv-lock)
—  preemptable ticket lock
—  paravirtual (pv) preemptable ticket lock
Hackbench
—  Average Speedup
—  preemptable-lock vs. ticket lock: 4.82X
—  pv-preemptable-lock vs. ticket lock: 7.08X
—  pv-preemptable-lock vs. pv-lock: 1.03X
Ebizzy
Less variance than ticket lock and pv-lock
—  in-VM preemption adaptivity
—  less VM interference
—  variance: 80.36 vs. 10.94
—  variance: 131.62 vs. 16.09
Dell DVD Store (apache/mysql)
—  Average Speedup
—  preemptable-lock vs. ticket lock: 11.68X
—  pv-preemptable-lock vs. ticket lock: 19.52X
—  pv-preemptable-lock vs. pv-lock: 1.11X
Evaluation Summary
—  Preemptable Ticket Spinlock speedup
—  5.32X over ticket lock
—  Paravirtual Preemptable Ticket Spinlock speedup
—  7.91X over ticket lock
—  1.08X over paravirtual ticket lock
Average speedup across cases for all benchmarks
Conclusion
—  Lock Waiter Preemption
—  most significant preemption problem in queue-based locks under overcommitted environments
—  Preemptable Ticket Spinlock
—  implementation with ~60 lines of code in Linux
—  better performance in overcommitted environments
—  5.32X average speedup over ticket lock without VMM support
—  1.08X average speedup over pv-lock with less variance
Thank You
— Jiannan Ouyang
—  ouyang@cs.pitt.edu
—  http://www.cs.pitt.edu/~ouyang/

Preemptable ticket spinlocks: improving consolidated performance in the cloud

  • 1. Jiannan Ouyang, John Lange Department of Computer Science University of Pittsburgh VEE ’13 03/17/2013 Preemptable Ticket Spinlocks Improving Consolidated Performance in the Cloud
  • 2. Motivation 2 —  VM interference in overcommitted environments —  OS synchronization overhead —  Lock holder preemption (LHP)
  • 3. —  Lock Waiter Preemption —  significance analysis of lock waiter preemption —  PreemptableTicket Spinlock —  implementation inside Linux —  Evaluation —  significant speedup over Linux Contributions 3
  • 4. Spinlocks 4 —  Basics —  lock() & unlock() —  Busy waiting lock —  generic spinlock: random order, unfair (starvation) —  ticket spinlock: FIFO order, fair —  Designed for fast mutual exclusion —  busy waiting vs. sleep/wakeup —  spinlocks for short & fast critical sections (~1us) —  OS assumptions —  use spinlocks for short critical section only —  never preempt a thread holding or waiting a kernel spinlock
  • 5. Preemption in VMs 5 —  Lock Holder Preemption (LHP) —  virtualization breaks the OS assumption —  vCPU holding a lock is unscheduled byVMM —  preemption prolongs critical section (~1m v.s. ~1us) —  Proposed Solutions —  Co-scheduling and variants —  Hardware-assisted scheme (Pause Loop Exiting) —  Paravirtual spinlocks
  • 6. Preemption in Ticket Lock
      - queue: 0 1 2 (head = 0, tail = 2)
      - legend: a scheduled waiter with ticket 0; a preempted waiter with ticket 1
  • 7. Preemption in Ticket Lock
      - queue: 0 1 2 (head = 0, tail = 2)
  • 8. Preemption in Ticket Lock
      - queue: 0 1 2 3 (head = 0, tail = 3)
  • 9. Preemption in Ticket Lock
      - queue: 0 1 2 3 4 (head = 0, tail = 4)
  • 10. Preemption in Ticket Lock
      - queue: 1 2 3 4 (head = 1, tail = 4)
  • 11. Preemption in Ticket Lock
      - queue: 1 2 3 4 (head = 1, tail = 4)
      - callout: Lock Holder Preemption!
  • 12. Preemption in Ticket Lock
      - queue: 1 2 3 4 (head = 1, tail = 4)
  • 13. Preemption in Ticket Lock
      - queue: 1 2 3 4 (head = 1, tail = 4)
  • 14. Preemption in Ticket Lock
      - queue: 2 3 4 (head = 2, tail = 4)
  • 15. Preemption in Ticket Lock
      - queue: 3 4 (head = 3, tail = 4)
  • 16. Preemption in Ticket Lock
      - queue: 3 4 5 (head = 3, tail = 5)
  • 17. Preemption in Ticket Lock
      - queue: 3 4 5 6 (head = 3, tail = 6)
      - callout: Lock Waiter Preemption: wait on available resource
  • 18. Lock Waiter Preemption
      - Lock waiter is preempted
      - Later waiters wait on an available lock
      - Possible to adapt to it, if we
          - detect the preempted waiter
          - acquire the lock out of order
      - How significant is it?
  • 19. Waiter Preemption Dominates
      Table 2: Lock Waiter Preemption Problem in the Linux Kernel

      Benchmark       LHP + LWP    LWP      LWP / (LHP + LWP)
      hackbench x1    1089         452      41.5%
      hackbench x2    44342        39221    88.5%
      ebizzy x1       294          166      56.5%
      ebizzy x2       1017         980      96.4%

      - Lock waiter preemption dominates in overcommitted environments
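The last column of Table 2 is the share of lock waiter preemptions among all counted preemptions. A minimal check of that arithmetic, using only the counts from the table (the names `TABLE_2` and `lwp_share` are hypothetical):

```python
# Counts copied from Table 2: (LHP + LWP, LWP) per benchmark run.
TABLE_2 = {
    "hackbench x1": (1089, 452),
    "hackbench x2": (44342, 39221),
    "ebizzy x1": (294, 166),
    "ebizzy x2": (1017, 980),
}


def lwp_share(total, lwp):
    """Percentage of preemptions that hit a waiter rather than the holder."""
    return round(100.0 * lwp / total, 1)


shares = {name: lwp_share(total, lwp) for name, (total, lwp) in TABLE_2.items()}
```

The share roughly doubles for hackbench and rises by about 40 points for ebizzy when going from x1 to x2, which is the basis for the slide's claim about overcommitted environments.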
  • 20. Challenges & Approach
      - How to identify a preempted waiter?
          - timeout threshold
      - How to violate order constraints?
          - allow timed-out waiters to get the lock randomly
          - ensure mutual exclusion between them
      - How NOT to break the whole ordering mechanism?
          - timeout threshold proportional to queue position
  • 21. Queue Position Index
      - N = ticket - queue_head
          - ticket: copy of queue tail value upon enqueue
          - N: number of earlier waiters
      - example queue: n, n+1, n+2 (head = n, tail = n+2); the waiter with ticket = n+2 has N = 2
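The queue position index can be computed as below. The slide gives only the subtraction. The 16-bit counter width and the masking that handles counter wraparound are assumptions added for illustration, since ticket counters in a real kernel are fixed-width integers. The names `TICKET_BITS` and `queue_position` are hypothetical.

```python
TICKET_BITS = 16                       # assumed counter width, for illustration only
TICKET_MASK = (1 << TICKET_BITS) - 1


def queue_position(ticket, head):
    """N = ticket - queue_head, computed modulo the counter width."""
    return (ticket - head) & TICKET_MASK


# Example from the slide: head = n, ticket = n + 2 gives N = 2.
n = 40
assert queue_position(n + 2, n) == 2
```

With the mask, a ticket taken just after the counter wraps (for example ticket 1 with head 65535) still yields N = 2.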
  • 22. Proportional Timeout Threshold
      - T = N x t
      - t is a constant parameter
          - large enough to avoid false detection
          - small enough to save waiting time
      - Performance is NOT sensitive to the value of t
          - most locks take ~1 µs & most spinning time is wasted on locks that wait ~1 ms
          - larger t does not harm & smaller t does not gain much
      - example queue: n, n+1, n+2 (head = n, tail = n+2) with timeout thresholds 0, t, 2t
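The proportional threshold T = N x t can be tabulated for a whole queue as a quick sanity check. In this sketch the unit of t (spin-loop iterations) and the function names are assumptions. Following the slides, tail is treated as the last ticket handed out.

```python
def timeout_threshold(ticket, head, t):
    """T = N x t, where N = ticket - head is the waiter's queue position."""
    return (ticket - head) * t


def thresholds(head, tail, t):
    """Thresholds for every ticket from head to tail inclusive."""
    return [timeout_threshold(ticket, head, t) for ticket in range(head, tail + 1)]


# With t = 1000 (hypothetical unit: spin-loop iterations):
# head = 0, tail = 5 gives 0, t, 2t, 3t, 4t, 5t
six_waiters = thresholds(head=0, tail=5, t=1000)
# after one release, head = 1, tail = 5 gives 0, t, 2t, 3t, 4t
five_waiters = thresholds(head=1, tail=5, t=1000)
```

Every release shifts all remaining thresholds down by t, so a waiter that keeps spinning behind a preempted ticket eventually crosses its own threshold.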
  • 23. Preemptable Ticket Spinlock
      - queue: 0 1 2 3 4 5 (head = 0, tail = 5)
      - timeout thresholds: 0 t 2t 3t 4t 5t
      - legend: a scheduled waiter with ticket 0; a preempted waiter with ticket 1
  • 24. Preemptable Ticket Spinlock
      - queue: 1 2 3 4 5 (head = 1, tail = 5)
      - timeout thresholds: 0 t 2t 3t 4t
  • 25. Preemptable Ticket Spinlock
      - queue: 1 2 3 4 5 (head = 1, tail = 5)
      - timeout thresholds: 0 t 2t 3t 4t
  • 26. Preemptable Ticket Spinlock
      - queue: 1 3 4 5 (head = 2, tail = 5)
      - timeout thresholds: t 2t 3t
      - N = ticket - head
  • 27. Preemptable Ticket Spinlock
      - queue: 1 3 4 5 (head = 2, tail = 5)
      - timeout thresholds: t 2t 3t
  • 28. Preemptable Ticket Spinlock
      - queue: 1 3 5 (head = 3, tail = 5)
      - timeout thresholds: 0 2t
  • 29. Preemptable Ticket Spinlock
      - queue: 1 3 5 (head = 3, tail = 5)
      - timeout thresholds: 0 2t
  • 30. Preemptable Ticket Spinlock
      - queue: 1 3 5 (head = 3, tail = 5)
      - timeout thresholds: 0 2t
  • 31. Summary
      - Preemptable Ticket Lock adapts to preemption
          - preserve order in absence of preemption
          - violate order upon preemption
      - Preemptable Ticket Lock preserves fairness
          - order violations are restricted
          - priority is always given to timed-out waiters
          - timed-out waiters are bounded by the number of vCPUs of a VM
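The acquire rule summarized above (in order when nobody is preempted, out of order once a waiter's proportional timeout expires) can be modeled with a deterministic, single-threaded sketch. This is one possible reading of the slides, not the authors' kernel code. The class, the `waited` argument, and the rule for advancing head are all assumptions, and the model does not implement the priority given to timed-out waiters.

```python
class PreemptableTicketLockModel:
    """Single-threaded model of the acquire/release rules (assumed details)."""

    def __init__(self, t):
        self.t = t            # per-position timeout unit (hypothetical: spin iterations)
        self.head = 0         # ticket currently expected to be served
        self.tail = 0         # next ticket to hand out (model convention)
        self.holder = None    # ticket holding the lock; enforces mutual exclusion

    def enqueue(self):
        ticket = self.tail
        self.tail += 1
        return ticket

    def try_acquire(self, ticket, waited):
        if self.holder is not None:
            return False                      # someone else is in the critical section
        n = ticket - self.head                # queue position index N
        if waited >= n * self.t:              # N == 0 means in order; else timed out
            self.holder = ticket
            self.head = max(self.head, ticket)  # assumed: head follows the holder
            return True
        return False

    def release(self):
        self.head = max(self.head, self.holder + 1)
        self.holder = None


def replay(t=100):
    """Ticket 1 is preempted; return the order in which tickets get the lock."""
    lock = PreemptableTicketLockModel(t)
    for _ in range(5):
        lock.enqueue()                        # tickets 0..4
    order = []
    # (ticket, spins waited) attempts in the order they happen.
    attempts = [(0, 0), (2, t - 1), (2, t), (3, 0), (1, 10 * t), (4, 0)]
    for ticket, waited in attempts:
        if lock.try_acquire(ticket, waited):
            order.append(ticket)
            lock.release()
    return order
```

In `replay()`, ticket 2 fails after waiting t - 1 iterations, succeeds after t, and the preempted ticket 1 still gets the lock once it runs again, so the acquisition order is 0, 2, 3, 1, 4.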
  • 32. Implementation
      - Drop-in replacement
          - lock(), unlock(), is_locked(), trylock(), etc.
      - Correct
          - race-condition free: atomic updates
      - Fast
          - performance is sensitive to lock efficiency
          - ~60 lines of C/inline assembly in Linux 3.5.0
  • 33. Paravirtual Spinlocks
      - Lock holder preemption is unaddressed
          - semantic gap between guest and host
          - paravirtualization: guest/host cooperation
          - signal long waiting lock / put a vCPU to sleep
          - notify to wake up a vCPU / wake up a vCPU
      - Paravirtual preemptable ticket spinlock
          - sleep when waiting too long after timed out
          - wake up all sleeping waiters upon lock releasing
  • 34. Evaluation
      - Host
          - 8-core 2.6 GHz Intel Core i7 CPU, 8 GB RAM, 1 Gbit NIC, Fedora 17 (Linux 3.5.0)
      - Guest
          - 8 cores, 1 GB RAM, Fedora 17 (Linux 3.5.0)
      - Benchmarks
          - hackbench, ebizzy, Dell DVD Store
      - Lock implementations
          - baseline: ticket lock, paravirtual ticket lock (pv-lock)
          - preemptable ticket lock
          - paravirtual (pv) preemptable ticket lock
  • 35. Hackbench
      - Average Speedup
          - preemptable-lock vs. ticket lock: 4.82X
          - pv-preemptable-lock vs. ticket lock: 7.08X
          - pv-preemptable-lock vs. pv-lock: 1.03X
  • 36. Ebizzy
      - Less variance than ticket lock and pv-lock
          - in-VM preemption adaptivity
          - less VM interference
      - figure annotations: variance 80.36 vs. 10.94; variance 131.62 vs. 16.09
  • 37. Dell DVD Store (apache/mysql)
      - Average Speedup
          - preemptable-lock vs. ticket lock: 11.68X
          - pv-preemptable-lock vs. ticket lock: 19.52X
          - pv-preemptable-lock vs. pv-lock: 1.11X
  • 38. Evaluation Summary
      - Preemptable Ticket Spinlocks speedup
          - 5.32X over ticket lock
      - Paravirtual Preemptable Ticket Spinlocks speedup
          - 7.91X over ticket lock
          - 1.08X over paravirtual ticket lock
      - Average speedup across cases for all benchmarks
  • 39. Conclusion
      - Lock Waiter Preemption
          - most significant preemption problem in queue-based locks under overcommitted environments
      - Preemptable Ticket Spinlock
          - implementation with ~60 lines of code in Linux
      - Better performance in overcommitted environments
          - 5.32X average speedup over ticket lock w/o VMM support
          - 1.08X average speedup over pv-lock with less variance
  • 40. Thank You
      - Jiannan Ouyang
          - ouyang@cs.pitt.edu
          - http://www.cs.pitt.edu/~ouyang/
      - figure: Preemptable Ticket Spinlock, queue 1 2 3 4 5 with timeout thresholds 0 t 2t 3t 4t