Your SlideShare is downloading. ×
FlurryDB: A Dynamically Scalable Relational Database with Virtual Machine Cloning
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

FlurryDB: A Dynamically Scalable Relational Database with Virtual Machine Cloning

408
views

Published on

Presentation for SYSTOR 2011 at IBM Haifa

Presentation for SYSTOR 2011 at IBM Haifa

Published in: Technology, Spiritual

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
408
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
4
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. FlurryDBA Dynamically Scalable RelationalDatabase with Virtual Machine CloningMichael Mior and Eyal de LaraUniversity of Toronto
  • 2. Cluster Databases DBMS Load VM balancer DBMS VM• We use read-one write all (ROWA) replication • Send reads to any server instance • Send writes to all server instances• Two-phase commit is required for writes
  • 3. Problem• Loads may fluctuate drastically over time• Pay-as-you-go IaaS clouds should be idealHowever We want to adapt the size of the cluster to match the load
  • 4. Problem• Databases have large state and are hard to scale• A complete copy may be required for new instances• Copying large databases may take hoursQueriesReplication copy boot catch add data replica up replica• Overprovisioning is necessary to maintain service
  • 5. Solution – Virtual Machine Fork• Analogous to fork() for OS processes• Clones start immediately• State fetched on-demand • Page faults fetch the memory or disk page This can reduce instantiation time from minutes or hours to seconds
  • 6. VM Fork Master VM Clone VM
  • 7. FlurryDB • Use VM fork to provision new instances and add elasticity to unmodified MySQL • Making distributed commit cloning-aware to handle in-flight transactionsQueriesReplication clone add replica
  • 8. FlurryDB Challenges1. Incorporate new worker into cluster2. Preserve application semantics
  • 9. Incorporate new worker• Clone must connect to the load balancer• Load balancer must begin sending transactions to the clone
  • 10. Preserve application semantics• Transactions may be in-progress at the time of cloning• Clone gets new IP address• Doing nothing drops connections and transaction status is unknown New IP Address DBMS T DBMS ?T’ Master VM Clone VM
  • 11. Solutions to consistency1. Employ a write barrier DBMS DBMS Master VM Clone VM2. Roll back all transactions in progress DBMS DBMS T’ Master VM Clone VM3. Allow all transactions to complete DBMS DBMS T T’ Master VM Clone VM
  • 12. FlurryDB: Consistency beyond VM forkSolutionUse a proxy which is aware of VM fork insidethe virtual machine to maintain the databaseconnection
  • 13. Two-phase commit duringcloning Client T Load Balancer T T’ Proxy T Proxy T’ fork Database T Database T’ Master VM Clone VM
  • 14. Replica addition delay• Update 10,000 rows concurrently• Clone and measure replica addition delay in two cases • Write barrier - wait for outstanding writes to complete before cloning • FlurryDB - use double-proxying to allow completion on the clone
  • 15. Replica addition delay
  • 16. Proxy overhead• Measurement of large SELECT transfer times shows ~5% drop in bandwidth• Reconnection to new servers ~10x faster with no authentication
  • 17. RUBBoS• Simulates a news website such as Slashdot• Users read, post, and comment on stories• We run server at full capacity (25 clients)• After 5 minutes, we clone to two servers and double the number of clients• Throughput in queries per second is measured at the load balancer
  • 18. RUBBOS Results cloning at 300s
  • 19. Summary• FlurryDB adds elasticity to unmodified MySQL by interposing a cloning-aware proxy• VM fork handles most of the issues with replica addition• A proxy handles in-flight requests• New replicas can be made available in seconds while maintaining consistency
  • 20. Future Work• Experiment with varying consistency protocols (e.g. master-slave, eventual consistency)• Test scalability with larger numbers of clones• Optimize virtual disk performance• Provide support for transactional workloads
  • 21. Questions?
  • 22. Related Work• “NoSQL” (perhaps more accurately, NoACID) systems using eventual consistency or key-value data models such as Cassandra or Dynamo• Relational Cloud uses dynamic partitioning across a cluster of MySQL/PostgreS instances• Urgaonkar et al. use rapid changes in resource allocate to provision a multi-tier Internet application• Soundararajan et al. present dynamic replication policies for scaling database servers• HyPer uses process fork and memory snapshots to enable a hybrid OLAP/OLTP workload