Scaling Your Database in the Cloud


RightScale Webinar: The database is the number one cause of poor performance in scalable web applications. The problem is magnified in cloud environments, where I/O and bandwidth are generally slower and less predictable than in dedicated data centers. Database sharding is a highly effective way to remove the database scalability barrier, and it operates on top of proven RDBMS products such as MySQL and PostgreSQL.

In this webinar, you'll learn what it really takes to implement sharding, the role it plays in the effective end-to-end lifecycle management of your entire database environment, and why it is crucial for ensuring reliability.

We will:

- Guide you on how to choose the best technology for your specific application
- Show you how to shard your existing database
- Review a case study on a Top 20 Facebook application built on dbShards

Published in: Technology, Business
  • Writes are linear, reads can be faster – depending on your database architecture.
  • How did your database perform 6 months or a year ago?
  • The difference between black-box sharding/NoSQL and app-aware sharding with an RDBMS: with app-aware sharding you can get sets of related data from the same shard; otherwise you need to retrieve a row at a time and consolidate.

    1. Scaling Your Database in the Cloud
       Watch the video of this webinar
       July 21, 2011
    2. Your Panel Today
       Presenting:
       • Uri Budnik: Director, ISV Partner Program, RightScale @uribudnik
       • Cory Isaacson: CEO & Founder, CodeFutures @dbShards
       • David Blinder: CTO, Family Builder
       Q&A:
       • Jason Altobelli, Inside Sales Representative, RightScale
       Please use the chat box window to ask questions anytime!
       Webinar Recordings:
    3. Agenda
       • Introduction to RightScale
       • Introduction to CodeFutures
       • Live Demo
       • Live Q&A
       Please use the chat box window to ask questions anytime!
    4. RightScale: Real Customers, Real Deployments, Real Benefits
       • Managed Cloud Deployments for 4 Years
       • More than 30,000 users; launched over 2.7MM servers
       • Behind the largest production deployments on that cloud to date
    5. Complete Systems Management
    6. RightScale: Core Focus
       • Improved Agility
         - Reduce complexity with ServerTemplates™
         - Manage Systems, not Servers
         - Orchestrate and Automate
       • Maintain Choice
         - Multi-cloud
         - Configuration Asset Marketplace
         - ISV Partner Solutions
       • Control & Security
         - User Access and Roles
         - Cost Control and Allocation
         - Complete Transparency
    7. ServerTemplates: Built-to-Order Servers vs. Image bundling and maintenance
    8. RightScripts in Multi-Cloud Marketplace
       • Two RightScripts you can use to analyze your application and determine whether it is “shard-safe”:
         1. Logging Driver for Native MySQL®
         2. dbShards/Analyze Driver for JDBC
       • Installed in your app server to gather SQL statistics
       • An in-depth analysis of what is needed to shard your database
       • The report lists each unique SQL statement and how it will function once sharded
       • Run once and generate a report that CodeFutures will review with you at no charge
    9. Introduction
       • Who I am
         - Cory Isaacson, CEO of CodeFutures
         - Providers of dbShards
         - Author of Software Pipelines
       • Partnerships:
         - RightScale, the leading Cloud Management Platform
       • Leaders in database scalability, performance, and high-availability for the cloud
         - based on real-world experience with dozens of cloud-based applications
         - social networking, gaming, data collection, mobile, analytics
       • Objective is to provide useful experience you can apply to scaling (and managing) your database tier…
         - especially for high volume applications
         - and an overview of dbShards technology
    10. Challenges of cloud computing
        • Cloud provides highly attractive service environment
          - Flexible, scales with need (up or down)
          - No need for dedicated IT staff, fixed facility costs
          - Pay-as-you-go model
        • Cloud services occasionally fail
          - Partial network outages
          - Server failures (by their nature cloud servers are “transient”)
          - Disk volume issues
        • Cloud-based resources are constrained
          - CPU
          - I/O Rates (the “Cloud I/O Barrier”)
    11. Typical Application Architecture
    12. Scaling in the Cloud
        • Scaling Load Balancers is easy
          - Stateless routing to app server
          - Can add redundant Load Balancers if needed
          - If one goes down, fail over to another
        • Scaling Application Servers is easy
          - Stateless
          - Sessions can easily transition to another server
          - Add or remove servers as need dictates
          - If one goes down, fail over to another
    13. Scaling in the Cloud
        • Scaling the Database tier is hard
          - “Stateful” by definition (and necessity)
          - Large, integrated data sets: 10s of GBs to TBs (or more)
          - Difficult to move, reload
          - I/O dependent: adversely affected by cloud service failures and slow cloud I/O
          - If one goes down: ouch!
    14. Scaling in the Cloud
        • Databases form the “last mile” of true application scalability
          - Start with simple optimizations
          - Implement a follow-on scalability strategy for long-term performance goals
          - A high-availability strategy is a must
          - Ensure your databases can fail over for unplanned outages and planned maintenance
        • The best time to plan your database scalability strategy is now
          - don't wait until it's a “3-alarm fire”
    15. Familybuilder
        • Innovator in Facebook applications
        • Among first 500 apps worldwide
        • David Blinder, CTO
    16. All CPUs wait at the same speed… The Cloud I/O Barrier
    17. Database slowdown is not linear…
        [Chart: Database Load Curve. Load time (y-axis, 0 to 10000) against data file size in GB (x-axis, 0 to 40), with an exponential trend line.]
        [Table columns: GB, Load Time (Min). Values: .9 1 1.3 2.5 3.5 11.7 39.0 10 days…]
    18. Challenges apply to all types of databases
        • Traditional RDBMS (MySQL, PostgreSQL, Oracle…)
          - I/O bound
          - Multi-user, lock contention
          - High-availability
          - Lifecycle management: backup/restore, schema changes, index maintenance
        • NoSQL Databases (In-memory, Caching, Document)
          - Reliability, High-availability
          - Limits of a single server and a single thread
          - Data dumps to disk
          - Replication
          - Lifecycle Management
    19. Challenges apply to all types of databases
        • No matter what the technology, big databases are hard to manage
          - elastic scaling is a real challenge
          - degradation from growth in size and volume is a certainty
          - application-specific database requirements add to the challenge
        • Sound database design is key…
          - balance performance vs. convenience vs. data size
    20. The Laws of Databases
        • Law #1: Small Databases are fast
        • Law #2: Big Databases are slow
        • Law #3: Keep databases small
    21. What is the answer?
        • Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management
          - regardless of your database technology
    22. What is Database Sharding?
        • “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” (Wikipedia)
    23. What is Database Sharding?
        • Start with a big monolithic database
          - break it into smaller databases
          - across many servers
          - using a key value
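The idea on slide 23 (split one big database into smaller ones across servers, using a key value) can be sketched in a few lines. This is a minimal illustration, not dbShards code: the shard names in `SHARDS`, the helper names `shard_index` and `route`, and the choice of modulo placement are all assumptions made for the example.

```python
# Minimal sketch of key-based shard routing (illustrative only).
# SHARDS, shard_index and route are hypothetical names, and modulo placement
# is just one common scheme; the slides do not say which scheme dbShards uses.

SHARDS = ["shard0", "shard1", "shard2", "shard3"]  # hypothetical shard names


def shard_index(shard_key: int, shard_count: int) -> int:
    """Return the index of the shard that owns this key (modulo placement)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return shard_key % shard_count


def route(customer_id: int) -> str:
    """Return the name of the shard that stores the given customer."""
    return SHARDS[shard_index(customer_id, len(SHARDS))]


# The same key always maps to the same shard, so all rows for one
# customer land in one small database.
for cid in (1234, 2345, 4567):
    print(cid, "->", route(cid))
```

Modulo placement is easy to reason about, but changing the number of shards remaps most keys, which is one reason real deployments often use a lookup table or a range-based scheme instead.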
    24. The key to Database Sharding…
    25. dbShards Architecture
    26. Database Sharding… the results
    27. Why does Database Sharding work?
        • Maximize CPU/Memory per database instance
          - as compared to database size
        • Reduce the size of index trees
          - speeds writes dramatically
          - reads are faster too
          - aggregate, list queries are generally much faster
        • No contention between servers
          - locking, disk, memory, CPU
        • Allows for intelligent parallel processing
          - Go Fish queries across shards
        • Keep CPUs busy and productive
    28. Breaking the Cloud I/O Barrier
    29. Familybuilder
        • Top 50 Facebook Application
        • 100,000 New Users Daily
        • Doubled Users in 12 months to over 40MM
        • David Blinder, CTO
    30. Relational Sharding
        • Shard-Tree Root Table
        • Shard-Tree Child Tables
        • Global Tables
    31. How Relational Sharding works
    32. How Relational Sharding works
        • Shard key recognition in SQL
          - SELECT * FROM customer WHERE customer_id = 1234
          - INSERT INTO customer (customer_id, first_name, last_name, addr_line1,…) VALUES (2345, 'John', 'Jones', '123 B Street',…)
          - UPDATE customer SET addr_line1 = '456 C Avenue' WHERE customer_id = 4567
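To make "shard key recognition in SQL" concrete, here is a rough sketch that pulls a `customer_id` value out of statements shaped like the three on slide 32. It assumes `customer_id` is the shard key and uses regular expressions rather than a real SQL parser; the function name `extract_shard_key` is hypothetical, and nothing here reflects how the dbShards drivers actually parse SQL.

```python
import re
from typing import Optional

SHARD_KEY = "customer_id"  # assumed shard key column for this example

# Matches "customer_id = 1234" as found in a WHERE clause.
_WHERE_KEY = re.compile(rf"\b{SHARD_KEY}\s*=\s*(\d+)", re.IGNORECASE)

# Matches "INSERT INTO t (col, ...) VALUES (val, ...)".
_INSERT = re.compile(
    r"INSERT\s+INTO\s+\w+\s*\(([^)]*)\)\s*VALUES\s*\(([^)]*)\)",
    re.IGNORECASE | re.DOTALL,
)


def extract_shard_key(sql: str) -> Optional[int]:
    """Return the shard key value found in the statement, or None.

    None means the statement carries no recognizable shard key and would
    have to be sent to every shard.
    """
    match = _WHERE_KEY.search(sql)
    if match:
        return int(match.group(1))

    match = _INSERT.search(sql)
    if match:
        columns = [c.strip().lower() for c in match.group(1).split(",")]
        # Naive split: breaks on commas inside string literals, which is
        # acceptable for a sketch but not for production use.
        values = [v.strip() for v in match.group(2).split(",")]
        if SHARD_KEY in columns:
            position = columns.index(SHARD_KEY)
            if position < len(values) and values[position].isdigit():
                return int(values[position])
    return None


print(extract_shard_key("SELECT * FROM customer WHERE customer_id = 1234"))
print(extract_shard_key("SELECT country_id, count(*) FROM customer GROUP BY country_id"))
```

A statement that yields a key can be sent to exactly one shard; a statement that yields `None` is a candidate for a cross-shard query.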
    33. What about Cross-Shard result sets?
    34. Cross-shard result set example
        • Go Fish (no shard key)
          - SELECT country_id, count(*) FROM customer GROUP BY country_id
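A "Go Fish" query has no shard key, so it has to run on every shard and the partial results have to be merged. The sketch below illustrates that for the `GROUP BY country_id` query on slide 34, using in-memory SQLite databases as stand-in shards. The helper names `make_shard` and `go_fish_count`, the two-column table, and the sample rows are all made up for the example; the slides do not describe how dbShards performs the merge.

```python
import sqlite3
from collections import Counter

# The cross-shard query from the slide; it contains no shard key.
QUERY = "SELECT country_id, count(*) FROM customer GROUP BY country_id"


def make_shard(rows):
    """Create an in-memory SQLite database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country_id INTEGER)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    return conn


def go_fish_count(shards):
    """Run the query on every shard and add up the per-country counts."""
    totals = Counter()
    for conn in shards:
        for country_id, count in conn.execute(QUERY):
            totals[country_id] += count
    return dict(totals)


# Hypothetical sample data: (customer_id, country_id) pairs spread over 3 shards.
shards = [
    make_shard([(1, 1), (4, 44)]),
    make_shard([(2, 1), (5, 1)]),
    make_shard([(3, 44)]),
]
result = go_fish_count(shards)
print(result)
```

Counts merge by simple addition. Other aggregates need more care: an average, for instance, has to be rebuilt from per-shard sums and counts rather than by averaging the per-shard averages. The loop here is sequential for clarity; the per-shard queries are independent and could be issued in parallel.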
    35. Moving to Database Sharding with dbShards
    36. dbShards/Analyze
        • Review Database Schema
        • Define your initial shard strategy
        • Run dbShards/Analyze Driver
          - on your app in a test environment
          - generate logs of all application SQL
        • Generate dbShards/Analyze reports
          - with your data model
          - your shard strategy
          - your SQL logs as input
        • Ensure your application is shard-safe
          - before you shard your database
          - and identify optimization opportunities
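The workflow on slide 36 (log all application SQL, then report on each unique statement) can be imitated in miniature. The sketch below is not dbShards/Analyze and does not reproduce its report format; it is an assumed, simplified stand-in that collapses literals so that statements differing only in values count as one, then labels each unique statement by whether it carries the assumed shard key `customer_id`. The names `normalize` and `analyze` and the labels `single-shard` and `cross-shard` are hypothetical.

```python
import re
from collections import Counter

SHARD_KEY = "customer_id"  # assumed shard key column for this example

# After normalization, a keyed statement has "customer_id = ?" in it,
# or is an INSERT whose column list names the shard key.
_KEY_IN_WHERE = re.compile(rf"\b{SHARD_KEY}\s*=\s*\?", re.IGNORECASE)
_KEY_IN_INSERT = re.compile(
    rf"^INSERT\s+INTO\s+\w+\s*\([^)]*\b{SHARD_KEY}\b", re.IGNORECASE
)


def normalize(sql: str) -> str:
    """Replace literals with '?' so equivalent statements compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip()


def analyze(log_lines):
    """Return (statement, occurrences, label) for each unique statement."""
    counts = Counter(normalize(line) for line in log_lines if line.strip())
    report = []
    for statement, occurrences in counts.most_common():
        keyed = bool(
            _KEY_IN_WHERE.search(statement) or _KEY_IN_INSERT.search(statement)
        )
        report.append(
            (statement, occurrences, "single-shard" if keyed else "cross-shard")
        )
    return report


sample_log = [
    "SELECT * FROM customer WHERE customer_id = 1234",
    "SELECT * FROM customer WHERE customer_id = 4567",
    "SELECT country_id, count(*) FROM customer GROUP BY country_id",
]
for statement, occurrences, label in analyze(sample_log):
    print(occurrences, label, statement)
```

Even a crude report like this shows which statements would need attention before sharding: anything labeled `cross-shard` either needs a shard key added or has to be handled as a query across all shards.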
    37. Demo
    38. No-charge Shard Analysis
        • Drop-in dbShards/Analyze Drivers
          - Native MySQL
          - JDBC
          - ODBC
        • Available as RightScale templates
          - search Multi-Cloud Marketplace for CodeFutures
          - Logging Driver for Native MySQL®
          - dbShards/Analyze Driver for JDBC
        • Run driver in your environment, with your app
          - ship us the logs, schema
          - a dbShards consultant takes you through the analysis
        • Find out exactly what it takes to shard your database
          - regardless of the technology you select
    39. Wrap-up
        • Database Sharding is the tool for scaling your database
        • dbShards is a complete, drop-in sharding solution
          - Plug-compatible database drivers
          - nothing between you and your database
          - Intelligent agents for shard management, processing
          - Database agnostic, pick the DBMS you prefer
        • Use dbShards for existing applications
          - new ones too
        • dbShards supports the entire Database Sharding infrastructure
          - Analyze, Shard, Manage
          - 24X7 Monitoring and Support for all customers
    40. Questions/Answers
        Cory Isaacson, CodeFutures Corporation
    41. We Appreciate Your Time
        Contacts
        • Cory Isaacson: CodeFutures Corporation
        • RIGHTSCALE: (866) sales@rightscale.com
        More Info:
        Webinar archive:
        Whitepapers:
        Free Edition: