Virtualized Databases?
This is a rehash of the talk I gave at Open Database Camp, toned down to fit into the 25-minute timeslot at FOSDEM (it still didn't). I added some notes again to hopefully clarify some things.


    Virtualized Databases? Document Transcript

    • VIRTUALIZED DATABASES?
      Notes: The approach here is the mechanics of virtualization. "Certain big players" will not be mentioned. The talk is general, mostly about hardware issues, which are the same for any platform.
    • ME
      • Liz van Dijk (@lizztheblizz)
      • Working at Sizing Servers Research Lab
      • First-timer at FOSDEM!
      • Not really a developer, not really a sysadmin, not really a DBA
      • I just like knowing how stuff works.
    • SO... VIRTUALIZATION, HUH.
      • It’s far too broad a term
      • It’s a pretty old concept (about half a century, actually)
      • Its main purposes are abstraction and security
      • Making use of the correct CPU execution mode
      • Managing Virtual Memory
      Notes: History! "Virtualization" is a broad term with 100 different meanings. Full-system virtualization ran on the mainframes in the 60s (IBM M44, trap and emulate).
      Recently:
      • x86 did not support full virtualization; trap and emulate did not work.
      • Multicore hardware, single-threaded software: inefficient datacenters.
      Full virtualization is not the only virtualization; real solutions combine different methods. Who uses RAID? Who uses virtual memory?
      There are 2 big issues that all solutions try to work around. Focus on these, and the next steps should be more or less logical.
      Problem 1: a matter of privileges. Kernels assume full control over the hardware. How does the hardware deal with this? With a layer-based security system (an onion): a 2-bit code goes with the memory address, and the CPU verifies the code and either does or doesn't execute the instruction. x86 has 4 layers: code 00 is supervisor mode, code 11 is user mode.
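
The privilege check in the notes above can be sketched as a toy model. This is only an illustration of the idea, not how a real CPU is wired: the instruction names and the `PrivilegeFault` exception are made up for the example, and only the two ring codes named in the notes (00 and 11) are modelled.

```python
from enum import IntEnum


class Ring(IntEnum):
    SUPERVISOR = 0b00  # code 00: supervisor (kernel) mode
    USER = 0b11        # code 11: user mode


class PrivilegeFault(Exception):
    """Raised where real hardware would refuse to run the instruction."""


# Hypothetical instruction names, chosen only for illustration.
PRIVILEGED = {"load_page_table", "disable_interrupts", "halt"}


def execute(instruction: str, current_ring: int) -> str:
    """Run the instruction only if the 2-bit privilege code allows it."""
    level = int(current_ring)
    if not 0 <= level <= 0b11:
        raise ValueError("privilege level must fit in 2 bits")
    if instruction in PRIVILEGED and level != Ring.SUPERVISOR:
        raise PrivilegeFault(f"{instruction} refused in ring {level}")
    return f"{instruction} executed in ring {level}"
```

A kernel running in ring 0 gets its privileged instruction executed; the same instruction issued from ring 3 is refused. That refusal is exactly what a deprivileged guest kernel runs into.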
    • X86 VIRTUALIZATION
      • Binary Translation, aka “faking it”
        • Applies ring deprivileging, and translates “bad calls” on the fly
      • “Full” Hardware Virtualization
        • Introduced Ring -1: Hypervisor mode
        • Only intervenes when absolutely necessary
      Notes: BT is old but awesome, and is employed by QEMU; Wine is often mentioned alongside it, though it translates API calls rather than instructions. BT is less relevant now for full virtualization. Ring deprivileging: look it up!
      Intel and AMD caught up and implemented VT-x and AMD-V, adding ring -1 for the hypervisor. This lets OSes do whatever they want, but relies on trap and emulate: an extra roundtrip means extra overhead. The CPU has more tasks to perform, and they also take longer. A newer CPU is better.
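
Trap and emulate, as described in the notes above, can be sketched like this. It is a toy model under stated assumptions: `cli` and `sti` stand in for the whole class of sensitive instructions, and the `VirtualCPU` class and its `exits` counter are hypothetical names used only to make the extra roundtrips visible.

```python
class VirtualCPU:
    """Per-guest state the hypervisor keeps in place of the real hardware."""

    def __init__(self):
        self.interrupts_enabled = True
        self.exits = 0  # number of guest-to-hypervisor roundtrips


# Stand-ins for sensitive instructions (assumption: only these two trap).
SENSITIVE = {"cli", "sti"}


def emulate(vcpu, instruction):
    """Hypervisor side: apply the instruction's effect to the virtual state."""
    vcpu.exits += 1
    if instruction == "cli":
        vcpu.interrupts_enabled = False
    elif instruction == "sti":
        vcpu.interrupts_enabled = True


def run_guest(vcpu, instructions):
    """Guest side: ordinary instructions run directly, sensitive ones trap."""
    executed_directly = 0
    for instruction in instructions:
        if instruction in SENSITIVE:
            emulate(vcpu, instruction)  # trap: the extra roundtrip
        else:
            executed_directly += 1      # no hypervisor involvement
    return executed_directly
```

The fewer sensitive instructions a workload issues, the fewer exits it pays for, which is why kernel-heavy workloads feel virtualization the most.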
    • VIRTUAL MEMORY
      [Diagram: the CPU and OS work with virtual memory pages 1 to 12 (managed by software); memory blocks 0xA to 0xH are the actual hardware. A page table and a TLB sit in between.]
      Page Table (virtual page | physical block): 1 | 0xD, 2 | 0xC, 3 | 0xF, 4 | 0xA, 5 | 0xH, 6 | 0xG, 7 | 0xB, 8 | 0xE, etc.
      TLB (cached entries): 1 | 0xD, 5 | 0xH, 2 | 0xC, etc.
      Notes: Problem 2: virtual memory. Physical memory comes in 4 KB segments with physical addresses; software sees pages. This is very easy to manage in the OS, and all software gets a contiguous block. The page table keeps track of the virtual-to-physical mapping. The TLB caches these mappings and is very fast, but it needs to be flushed on every context switch.
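
To make the page table and TLB interplay concrete, here is a small sketch using the mappings from the slide. The capacity of 3 entries and the least-recently-used eviction are assumptions for the example; real TLBs differ in size and replacement policy.

```python
from collections import OrderedDict

# Virtual page -> physical block, as drawn on the slide.
PAGE_TABLE = {1: "0xD", 2: "0xC", 3: "0xF", 4: "0xA",
              5: "0xH", 6: "0xG", 7: "0xB", 8: "0xE"}


class TLB:
    """Tiny cache in front of the page table (LRU eviction is an assumption)."""

    def __init__(self, page_table, capacity=3):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_page):
        if virtual_page in self.entries:
            self.hits += 1
            self.entries.move_to_end(virtual_page)  # mark as recently used
            return self.entries[virtual_page]
        self.misses += 1
        physical = self.page_table[virtual_page]    # slow path: page table lookup
        self.entries[virtual_page] = physical
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)        # evict least recently used
        return physical

    def flush(self):
        """What a context switch does in the notes: drop every cached mapping."""
        self.entries.clear()
```

After a `flush()`, every page has to be looked up the slow way again, which is why frequent context switches hurt.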
    • SPT VS HAP
      [Diagram: VM A (pages 1 to 5 ... N) and VM B (pages 1 to 4 ... 12) each keep their own page table, managed by the VM OS; the hypervisor manages what sits between those tables and the actual hardware (CPU, memory blocks 0xA to 0xH).]
      “Read-only” Page Table: VM A: 1 | 0xD, 5 | 0xH, 2 | 0xC. VM B: 12 | 0xB, 10 | 0xE, 9 | 0xA.
      “Shadow” Page Table (SPT): A: 1 | 0xG, 5 | 0xD, 2 | 0xF. B: 12 | 0xE, 10 | 0xB, 9 | 0xC.
      TLB (HAP): A1 | 0xD, A5 | 0xH, A2 | 0xC, B12 | 0xB, B10 | 0xE, B9 | 0xA, etc.
      Notes: There are 2 methods.
      • SPT: the guest's page table is locked, access to it generates a trap, and the VMM handles the memory access. Memory access is much slower.
      • EPT/RVI/HAP: make the TLB much bigger and smarter (VM-aware). It is much more complex to fill up, though, so initial memory access is slow. A filled TLB is very fast.
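
The difference between the two methods can be sketched with the numbers from the slide. One assumption to flag: the per-VM `HOST_MAPS` table (guest "physical" block to real host block) is not drawn on the slide; it is inferred here by comparing the read-only and shadow tables. The function names are hypothetical.

```python
# Guest page tables from the slide: guest page -> what the guest thinks is physical.
GUEST_TABLES = {
    "A": {1: "0xD", 5: "0xH", 2: "0xC"},
    "B": {12: "0xB", 10: "0xE", 9: "0xA"},
}

# Assumption: guest "physical" -> real host block, inferred from the shadow tables.
HOST_MAPS = {
    "A": {"0xD": "0xG", "0xH": "0xD", "0xC": "0xF"},
    "B": {"0xB": "0xE", "0xE": "0xB", "0xA": "0xC"},
}


def build_shadow_table(vm):
    """SPT: the hypervisor precomputes guest page -> host block for one VM.

    It must redo this whenever the guest edits its own table, hence the traps.
    """
    return {page: HOST_MAPS[vm][guest_block]
            for page, guest_block in GUEST_TABLES[vm].items()}


def nested_translate(vm, page, tlb):
    """HAP: walk both tables on a miss, then cache under a VM-tagged key."""
    key = (vm, page)
    if key not in tlb:
        tlb[key] = HOST_MAPS[vm][GUEST_TABLES[vm][page]]  # slow two-step walk
    return tlb[key]
```

With SPT the cost sits in keeping the shadow copy in sync; with HAP it sits in the longer walk on a TLB miss, and a VM-tagged TLB means entries from VM A and VM B can coexist.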
    • WHAT DOES THIS TEACH US?
      • All “kernel” activity is a lot more costly:
        • Interrupts
        • System Calls (I/O)
        • Memory page management
      Notes: So, 3 kinds of actions are slower under virtualization. Interrupts: hardware asking for attention. System calls: software asking for kernel attention. Page management: memory access.
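
One rough way to see the system call part for yourself is to time a syscall-heavy loop against a pure userspace loop, once on native hardware and once inside a VM. This is a crude sketch, not a proper benchmark: interpreter overhead dominates the absolute numbers, so only the change in the ratio between the two environments says anything. The function names are made up for the example.

```python
import os
import time


def time_loop(work, iterations):
    """Return the wall-clock seconds spent calling work() repeatedly."""
    start = time.perf_counter()
    for _ in range(iterations):
        work()
    return time.perf_counter() - start


def userspace_work():
    return sum((1, 2, 3))  # stays entirely in user mode


def syscall_work():
    return os.getppid()    # asks the kernel for the parent process id


def compare(iterations=200_000):
    """Time both loops; run on native and virtual hosts and compare ratios."""
    return {
        "userspace_s": time_loop(userspace_work, iterations),
        "syscall_s": time_loop(syscall_work, iterations),
    }
```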
    • IN THE WILD...
      • From best to worst case scenario...
        • Bare-metal (Xen, KVM, ESX, Hyper-V)
        • Host-based (VirtualBox, VMware Workstation, etc.)
        • Cloud-based (Amazon, Terremark, etc.)
    • BARE-METAL OPTIONS
      • Know your my.cnf inside out
      • Use hardware-assisted paging + Large Pages! (InnoDB: large-pages)
      • Make use of paravirtualized HW options
      • Take care of all your caching levels
      • Use DirectIO (innodb_flush_method=O_DIRECT)
      Notes: Small mistakes in a native environment get bigger in a virtual one. Memory allocations are expensive, so optimize your my.cnf!!! tools.percona.com is a good starting point. Watch the connection-specific buffers (join-buffer, sort-buffer, etc.); to find the sweet spot = test!! SWAPPING = EVIL, so check swappiness. Use Large Pages and DirectIO.
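
As a starting point for the my.cnf advice above, here is a small sketch that renders the two options named on the slide and estimates a worst-case memory footprint from the per-connection buffers. The numbers are placeholders, `innodb_buffer_pool_size` is added here as an assumption (it is not on the slide), and the estimate is a common rule of thumb rather than an exact accounting. Test against your own workload.

```python
def render_my_cnf(buffer_pool_mb):
    """Render a [mysqld] fragment with the options named on the slide."""
    lines = [
        "[mysqld]",
        "large-pages",
        "innodb_flush_method = O_DIRECT",
        f"innodb_buffer_pool_size = {buffer_pool_mb}M",  # assumption, not on the slide
    ]
    return "\n".join(lines)


def worst_case_memory_mb(buffer_pool_mb, max_connections, per_connection_mb):
    """Rule of thumb: global buffers plus every connection using all its buffers."""
    return buffer_pool_mb + max_connections * sum(per_connection_mb.values())


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the Linux swappiness setting, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None
```

If `worst_case_memory_mb(...)` comes out above the memory assigned to the VM, the guest can end up swapping, which is the thing to avoid.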
    • HARDWARE CHOICES
      • Choosing the right CPUs
        • Intel 5500/7500 and later types (Nehalem) / All AMD quadcore Opterons (HW-assisted/MMU virtualization)
      • Choosing the right NICs (VMDQ)
      • Choosing the right storage system (iSCSI vs FC SAN)
      Notes: The CPUs listed here support both HW-assist and HAP. VMDQ stands for Virtual Machine Device Queues.
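
On a Linux host, whether a CPU offers HW-assist and HAP can be read from the flags in /proc/cpuinfo. A minimal sketch, assuming the usual Linux flag names (`vmx` or `svm` for VT-x or AMD-V, `ept` or `npt` for hardware-assisted paging); the function name is made up for the example.

```python
def virtualization_features(cpuinfo_text):
    """Report HW-assist and HAP support from the text of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    return {
        "hw_assist": bool(flags & {"vmx", "svm"}),  # VT-x or AMD-V
        "hap": bool(flags & {"ept", "npt"}),        # EPT or RVI/NPT
    }
```

On the host itself this would be called as `virtualization_features(open("/proc/cpuinfo").read())`. Inside a guest the flags are often hidden, so check on the hypervisor host.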
    • HOST-BASED
      • All of the above, if possible :)
      • IO becomes the bigger issue on standard client hardware
      • Focus on moving database IO away from the same disk you run the host- and guest-OS on.
      • Consider installing an SSD :)
      Notes: Keep in mind all of the previous things. IO is a bigger issue here: 2 OSes plus a DB running on the same disk is always a problem. Use a separate disk, or maybe an iSCSI LUN? Buy an SSD!
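
A quick way to check whether the database files share a filesystem with the OS is to compare device ids. This is only a sketch: the function name is made up, `/var/lib/mysql` in the usage note is just a common default, and separate partitions on the same physical disk will still show different ids, so treat a "different" result as necessary rather than sufficient.

```python
import os


def same_device(path_a, path_b):
    """True if both paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For example, `same_device("/", "/var/lib/mysql")` returning `True` means the database IO competes directly with the guest OS.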
    • CLOUD-BASED
      • No control whatsoever over host-system :(
      • Sometimes unreliable IO
      • Change strategy! Design for easy sharding and replication!
      • Caching caching caching!
      • Consider RDS to reduce operational overhead?
      Notes: You can't escape the hurt: disk IO is unreliable. CACHING. Use sharding and replication to spread the write and read load. Very write-heavy workloads may be more trouble than they're worth. Asynchronous writes? Not very durable. Use RDS to cut back on operational cost.
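
The "design for sharding, replication and caching" advice can be sketched roughly as follows. Everything here is hypothetical (the shard layout, the host names in the usage note, the class and function names); it only shows the shape of the idea: hash the key to pick a shard, send writes to the primary and reads to a replica, and keep hot reads in a cache.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard index from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(key, is_write, shards):
    """Writes go to the shard's primary; reads go to a replica if there is one."""
    shard = shards[shard_for(key, len(shards))]
    if is_write or not shard["replicas"]:
        return shard["primary"]
    replicas = shard["replicas"]
    return replicas[shard_for(key + ":read", len(replicas))]


class ReadThroughCache:
    """Serve repeated reads from memory; only misses reach the database."""

    def __init__(self, load):
        self.load = load   # function that fetches a value from the database
        self.store = {}
        self.loads = 0     # how many times the database was actually hit

    def get(self, key):
        if key not in self.store:
            self.loads += 1
            self.store[key] = self.load(key)
        return self.store[key]

    def invalidate(self, key):
        """Call after a write so the next read fetches fresh data."""
        self.store.pop(key, None)
```

With `shards = [{"primary": "db0", "replicas": ["db0-r1"]}, {"primary": "db1", "replicas": []}]`, a call like `route("user:42", False, shards)` returns one of those host names. Replicas lag behind the primary, so reads that must see the latest write should still go to the primary.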
    • THANKS!