Scaling Your Database in the Cloud

RightScale Webinar: The number one cause of poor performance in scalable web applications is the database. This problem is magnified in cloud environments, where I/O and bandwidth are generally slower and less predictable than in dedicated data centers. Database sharding is a highly effective method of removing the database scalability barrier, and it operates on top of proven RDBMS products such as MySQL and PostgreSQL.

In this webinar, you'll learn what it really takes to implement sharding, the role it plays in the effective end-to-end lifecycle management of your entire database environment, and why it is crucial for ensuring reliability.

In this webinar, we will:

- Guide you on how to choose the best technology for your specific application
- Show you how to shard your existing database
- Review a case study on a Top 20 Facebook application built on dbShards

  • Writes are linear, reads can be faster – depending on your database architecture.
  • How did your database perform 6 months or a year ago?
  • The difference between black-box sharding/NoSQL and application-aware sharding with an RDBMS: with application-aware sharding you can get sets of related data from the same shard; otherwise you need to retrieve a row at a time and consolidate.

Transcript

  • 1. Scaling Your Database in the Cloud. Watch the video of this webinar. July 21, 2011
  • 2. Your Panel Today
    Presenting:
      • Uri Budnik: Director, ISV Partner Program, RightScale @uribudnik
      • Cory Isaacson: CEO & Founder, CodeFutures @dbShards
      • David Blinder: CTO, Family Builder
    Q&A:
      • Jason Altobelli, Inside Sales Representative, RightScale
    Please use the chat box window to ask questions anytime!
    Webinar Recordings: www.rightscale.com/webinars
  • 3. Agenda
      • Introduction to RightScale
      • Introduction to CodeFutures
      • Live Demo
      • Live Q&A
    Please use the chat box window to ask questions anytime!
  • 4. RightScale: Real Customers, Real Deployments, Real Benefits
      • Managed Cloud Deployments for 4 Years
      • More than 30,000 users; launched over 2.7MM servers
      • Behind the largest production deployments on that cloud to date
  • 5. Complete Systems Management
  • 6. RightScale: Core Focus
      • Improved Agility
          • Reduce complexity with ServerTemplates™
          • Manage Systems, not Servers
          • Orchestrate and Automate
      • Maintain Choice
          • Multi-cloud
          • Configuration Asset Marketplace
          • ISV Partner Solutions
      • Control & Security
          • User Access and Roles
          • Cost Control and Allocation
          • Complete Transparency
  • 7. ServerTemplates: Built-to-Order Servers vs. Image bundling and maintenance
  • 8. RightScripts in Multi-Cloud Marketplace
      • Two RightScripts you can use to analyze your application to determine if it's “shard-safe”
          1. Logging Driver for Native MySQL®
          2. dbShards/Analyze Driver for JDBC
      • Installed in your app server to gather SQL statistics
      • It's an in-depth analysis of what is needed to shard your database
      • Report lists each unique SQL statement and how it will function once sharded
      • Run once and generate a report that CodeFutures will review with you at no charge
  • 9. Introduction
      • Who I am
          • Cory Isaacson, CEO of CodeFutures
          • Providers of dbShards
          • Author of Software Pipelines
      • Partnerships:
          • RightScale
          • The leading Cloud Management Platform
      • Leaders in database scalability, performance, and high-availability for the cloud
          • based on real-world experience with dozens of cloud-based applications
          • social networking, gaming, data collection, mobile, analytics
      • Objective is to provide useful experience you can apply to scaling (and managing) your database tier…
          • especially for high volume applications
          • and an overview of dbShards technology
  • 10. Challenges of cloud computing
      • Cloud provides highly attractive service environment
          • Flexible, scales with need (up or down)
          • No need for dedicated IT staff, fixed facility costs
          • Pay-as-you-go model
      • Cloud services occasionally fail
          • Partial network outages
          • Server failures
              • by their nature cloud servers are “transient”
          • Disk volume issues
      • Cloud-based resources are constrained
          • CPU
          • I/O Rates
              • the “Cloud I/O Barrier”
  • 11. Typical Application Architecture
  • 12. Scaling in the Cloud
      • Scaling Load Balancers is easy
          • Stateless routing to app server
          • Can add redundant Load Balancers if needed
          • If one goes down
              • failover to another
      • Scaling Application Servers is easy
          • Stateless
          • Sessions can easily transition to another server
          • Add or remove servers as need dictates
          • If one goes down
              • failover to another
  • 13. Scaling in the Cloud
      • Scaling the Database tier is hard
          • “Stateful” by definition (and necessity)
          • Large, integrated data sets
              • 10s of GBs to TBs (or more)
              • Difficult to move, reload
          • I/O dependent
              • adversely affected by cloud service failures
              • and slow cloud I/O
          • If one goes down
              • ouch!
  • 14. Scaling in the Cloud
      • Databases form the “last mile” of true application scalability
          • Start with simple optimizations
          • implement a follow-on scalability strategy for long-term performance goals
          • and a high-availability strategy is a must
          • Ensure your databases can fail over
              • unplanned outages
              • and planned maintenance
      • The best time to plan your database scalability strategy is now
          • don't wait until it's a “3-alarm fire”
  • 15. Familybuilder
      • Innovator in Facebook applications
      • Among first 500 apps worldwide
      • David Blinder, CTO
  • 16. All CPUs wait at the same speed… The Cloud I/O Barrier
  • 17. Database slowdown is not linear…
    [Chart: Database Load Curve. Load Time plotted against Data File (GB), with an exponential trendline.]
      Data File (GB)    Load Time (Min)
      0.9               1
      1.3               2.5
      3.5               11.7
      39.0              10 days…
  • 18. Challenges apply to all types of databases
      • Traditional RDBMS (MySQL, PostgreSQL, Oracle…)
          • I/O bound
          • Multi-user, lock contention
          • High-availability
          • Lifecycle management…
              • backup/restore
              • schema changes
              • index maintenance
      • NoSQL Databases (In-memory, Caching, Document)
          • Reliability, High-availability
          • Limits of a single server
              • and a single thread
          • Data dumps to disk
          • Replication
          • Lifecycle Management
  • 19. Challenges apply to all types of databases
      • No matter what the technology, big databases are hard to manage
          • elastic scaling is a real challenge
          • degradation from growth in size and volume is a certainty
          • application-specific database requirements add to the challenge
      • Sound database design is key…
          • balance performance vs. convenience vs. data size
  • 20. The Laws of Databases
      • Law #1: Small Databases are fast
      • Law #2: Big Databases are slow
      • Law #3: Keep databases small
  • 21. What is the answer?
      • Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management
          • regardless of your database technology
  • 22. What is Database Sharding?
      • “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” (Wikipedia)
  • 23. What is Database Sharding?
      • Start with a big monolithic database
          • break it into smaller databases
          • across many servers
          • using a key value
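As a rough illustration of "using a key value" to spread rows across servers, the sketch below maps a shard key to one of several shards with a stable hash. The shard names, the shard count, and the choice of hash-modulo placement are assumptions made for this example; the slides do not say which placement scheme dbShards uses.

```python
import hashlib

# Hypothetical shard names; placeholders, not taken from the webinar.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(key_value, shards=SHARDS):
    """Map a shard key value to one shard using a stable hash.

    A cryptographic digest is used only because it is stable across
    processes and machines (unlike Python's built-in hash() for strings).
    """
    digest = hashlib.md5(str(key_value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for customer_id in (1234, 2345, 4567):
        print(customer_id, "->", shard_for(customer_id))
```

One design consequence of plain hash-modulo placement is that changing the number of shards moves most keys, so schemes that must grow elastically often use a lookup table or consistent hashing instead.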
  • 24. The key to Database Sharding…
  • 25. dbShards Architecture
  • 26. Database Sharding… the results
  • 27. Why does Database Sharding work?
      • Maximize CPU/Memory per database instance
          • as compared to database size
      • Reduce the size of index trees
          • speeds writes dramatically
          • reads are faster too
          • aggregate, list queries are generally much faster
      • No contention between servers
          • locking, disk, memory, CPU
      • Allows for intelligent parallel processing
          • Go Fish queries across shards
      • Keep CPUs busy and productive
  • 28. Breaking the Cloud I/O Barrier
  • 29. Familybuilder
      • Top 50 Facebook Application
      • 100,000 New Users Daily
      • Doubled Users in 12 months to over 40MM
      • David Blinder, CTO
  • 30. Relational Sharding
      • Shard-Tree Root Table
      • Shard-Tree Child Tables
      • Global Tables
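The three table categories on this slide can be pictured with a small data structure. The sketch below is only an illustration under stated assumptions: that child rows are stored on the same shard as their root row (consistent with the slide note about getting related data from one shard), and that global tables are copied to every shard. The table names `customer_order` and `country`, the class name `ShardTree`, and the modulo placement are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ShardTree:
    """Hypothetical description of a relational sharding scheme."""
    root_table: str
    shard_key: str
    child_tables: set = field(default_factory=set)
    global_tables: set = field(default_factory=set)

    def kind(self, table):
        """Classify a table as 'root', 'child', or 'global'."""
        if table == self.root_table:
            return "root"
        if table in self.child_tables:
            return "child"
        if table in self.global_tables:
            return "global"
        raise KeyError(f"table {table!r} is not part of the scheme")

    def shards_for_write(self, table, key_value, num_shards):
        """Return the shard indexes a write to `table` would touch.

        Assumption: root and child rows follow the shard key, while
        global tables are written to every shard.
        """
        if self.kind(table) == "global":
            return list(range(num_shards))
        return [key_value % num_shards]


# Example scheme: 'customer' comes from the slides, the rest is made up.
tree = ShardTree(
    root_table="customer",
    shard_key="customer_id",
    child_tables={"customer_order"},
    global_tables={"country"},
)
```

With this layout, a customer and that customer's orders land on the same shard, so a join between them never has to leave it.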
  • 31. How Relational Sharding works
  • 32. How Relational Sharding works
      • Shard key recognition in SQL
          • SELECT * FROM customer WHERE customer_id = 1234
          • INSERT INTO customer (customer_id, first_name, last_name, addr_line1,…) VALUES (2345, 'John', 'Jones', '123 B Street',…)
          • UPDATE customer SET addr_line1 = '456 C Avenue' WHERE customer_id = 4567
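To make "shard key recognition" concrete, here is a minimal sketch that pulls the `customer_id` value out of statements shaped like the three above and picks a shard from it. It is not how the dbShards drivers work internally (the slides do not show that); the regular expressions, the function names, and the four-shard modulo routing are assumptions for illustration, and a real driver would need a proper SQL parser.

```python
import re

SHARD_KEY = "customer_id"
NUM_SHARDS = 4  # hypothetical shard count

# Matches "customer_id = 1234" in a WHERE clause.
_WHERE_RE = re.compile(r"\b%s\s*=\s*(\d+)" % re.escape(SHARD_KEY), re.IGNORECASE)
# Matches "INSERT INTO t (col, ...) VALUES (val, ...)".
_INSERT_RE = re.compile(
    r"INSERT\s+INTO\s+\w+\s*\(([^)]*)\)\s*VALUES\s*\(([^)]*)\)",
    re.IGNORECASE | re.DOTALL,
)


def find_shard_key(sql):
    """Return the integer shard key value found in `sql`, or None."""
    insert = _INSERT_RE.search(sql)
    if insert:
        columns = [c.strip().lower() for c in insert.group(1).split(",")]
        # Naive split: breaks if a string literal contains a comma or ')'.
        values = [v.strip() for v in insert.group(2).split(",")]
        if SHARD_KEY in columns:
            position = columns.index(SHARD_KEY)
            if position < len(values) and values[position].isdigit():
                return int(values[position])
        return None
    match = _WHERE_RE.search(sql)
    return int(match.group(1)) if match else None


def route(sql):
    """Return a shard index for `sql`, or None if no shard key is present."""
    key = find_shard_key(sql)
    return None if key is None else key % NUM_SHARDS
```

A statement for which `route` returns `None` carries no shard key, so it cannot be sent to a single shard and has to be handled as a cross-shard query.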
  • 33. What about Cross-Shard result sets?
  • 34. Cross-shard result set example
      • Go Fish (no shard key)
          • SELECT country_id, count(*) FROM customer GROUP BY country_id
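A "Go Fish" query like the one above has to run on every shard and have its partial results merged. The sketch below shows one way that could look, using in-memory SQLite databases as stand-ins for the shards; SQLite, the sample rows, and the function names are assumptions for the example, and the slides do not describe how dbShards performs the merge.

```python
import sqlite3
from collections import Counter


def make_shard(rows):
    """Create an in-memory stand-in for one shard holding `rows`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country_id INTEGER)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    return conn


def go_fish_count_by_country(shards):
    """Run the GROUP BY on every shard and sum the per-shard counts."""
    sql = "SELECT country_id, count(*) FROM customer GROUP BY country_id"
    totals = Counter()
    for conn in shards:  # sequential here; shards could be queried in parallel
        for country_id, count in conn.execute(sql):
            totals[country_id] += count
    return dict(totals)


# Made-up sample data: (customer_id, country_id) pairs split over two shards.
shards = [
    make_shard([(1, 10), (2, 20), (3, 10)]),
    make_shard([(4, 20), (5, 30)]),
]
result = go_fish_count_by_country(shards)
```

Counts merge by simple addition. Other aggregates need more care: an average, for example, has to be rebuilt from per-shard sums and counts rather than by averaging the per-shard averages.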
  • 35. Moving to Database Sharding with dbShards
  • 36. dbShards/Analyze
      • Review Database Schema
      • Define your initial shard strategy
      • Run dbShards/Analyze Driver
          • on your app in a test environment
          • generate logs of all application SQL
      • Generate dbShards/Analyze reports
          • with your data model
          • your shard strategy
          • your SQL logs as input
      • Ensure your application is shard-safe
          • before you shard your database
          • and identify optimization opportunities
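The analysis step above can be pictured as grouping logged SQL by statement shape and checking each shape for the shard key. The following is a rough sketch of that idea only; it is not the dbShards/Analyze implementation, and the log format, the labels "single-shard" and "cross-shard", and the function names are assumptions.

```python
import re
from collections import Counter

SHARD_KEY = "customer_id"  # taken from the slide examples

_KEY_FILTER = re.compile(r"\b%s\s*=\s*\?" % re.escape(SHARD_KEY), re.IGNORECASE)
_KEY_INSERT = re.compile(
    r"INSERT\s+INTO\s+\w+\s*\([^)]*\b%s\b" % re.escape(SHARD_KEY), re.IGNORECASE
)


def normalize(sql):
    """Replace literals with '?' so identical statement shapes group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numbers
    return re.sub(r"\s+", " ", sql).strip()


def classify(statement):
    """Label a normalized statement by whether it names the shard key."""
    if _KEY_FILTER.search(statement) or _KEY_INSERT.search(statement):
        return "single-shard"
    return "cross-shard"


def analyze(log_lines):
    """Return (statement, occurrences, label) tuples, most frequent first."""
    counts = Counter(normalize(line) for line in log_lines if line.strip())
    return [(stmt, n, classify(stmt)) for stmt, n in counts.most_common()]


# Hypothetical log: one SQL statement per line.
SAMPLE_LOG = [
    "SELECT * FROM customer WHERE customer_id = 1234",
    "SELECT * FROM customer WHERE customer_id = 99",
    "INSERT INTO customer (customer_id, first_name) VALUES (2345, 'John')",
    "SELECT country_id, count(*) FROM customer GROUP BY country_id",
]
report = analyze(SAMPLE_LOG)
```

Statements labeled "cross-shard" are the ones worth reviewing first, since they either need a shard key added or have to run as Go Fish queries.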
  • 37. Demo
  • 38. No-charge Shard Analysis
      • Drop-in dbShards/Analyze Drivers
          • Native MySQL
          • JDBC
          • ODBC
      • Available as RightScale templates
          • search Multi-Cloud Marketplace for CodeFutures
          • Logging Driver for Native MySQL®
          • dbShards/Analyze Driver for JDBC
      • Run driver in your environment, with your app
          • ship us the logs, schema
          • a dbShards consultant takes you through the analysis
      • Find out exactly what it takes to shard your database
          • regardless of the technology you select
  • 39. Wrap-up
      • Database Sharding is the tool for scaling your database
      • dbShards is a complete, drop-in sharding solution
          • Plug-compatible database drivers
              • nothing between you and your database
          • Intelligent agents for shard management, processing
          • Database agnostic, pick the DBMS you prefer
      • Use dbShards for existing applications
          • new ones too
      • dbShards supports the entire Database Sharding infrastructure
          • Analyze, Shard, Manage
          • 24X7 Monitoring and Support for all customers
  • 40. Questions/Answers
    Cory Isaacson, CodeFutures Corporation
    sales@codefutures.com
    http://www.dbshards.com
  • 41. We Appreciate Your Time
    Contacts
      • Cory Isaacson, CodeFutures Corporation: sales@codefutures.com, http://www.dbshards.com
      • RightScale: (866) 720-0208, sales@rightscale.com, http://www.rightscale.com
    More Info:
      • Webinar archive: RightScale.com/webinars
      • Whitepapers: RightScale.com/whitepapers
      • Free Edition: RightScale.com/free