"How Sharding turned MySQL into the Internet de-facto database standard?", Moshe Kaplan, RockeTier

A common belief in the enterprise software world is that MySQL cannot scale to large database sizes. The Internet industry proved it can be done. These days many of the Internet giants, processing billions of events every day, are based on MySQL. Most of these giants were able to turn MySQL into a mighty database machine by implementing Sharding.
What is Sharding? What kinds of Sharding can you implement? What are the best practices? All these issues will be addressed in this lecture by Moshe Kaplan from RockeTier, the performance experts.

  • First of all, thank you for this very insightful presentation!

    I have a question about slide 5: here it is stated that one of MySQL limitations is to support 50 queries/second: this number surprised me since it seems quite low.
    Of course it very much depends on the query and on the amount of data that is fetched each time.

    In any case, both in my experience and from the evidence that I've found googling on the subject, it seems that this limit can be set to a much higher value.

    Can you please comment on how you came up with it?
    Thanks!
  • Initial setup of a u-Page from toolbar
    • 14M users today, 20M users 2009Q1
    • 600K downloads per day
    • Moving from 1:N templates to 10-15% of users who update the page (from a 1:1000 templates/users ratio to 1:7 templates/users)
    • Statistics: user changes: 1 per week, upload: 2 times a day
    • Wants to use the default if the user has not modified the template. However, wants to support changes pushed from templates to users' pages, even if those pages were changed
    • Have the same situation now, but now it's saved on the user's desktop
    Current applicative cache:
    • int: toolbar_id, list of int: widget_id[]
    • int: widget_id, object: widget
    • int: user_id, object: user
    Table design:
    • User id, tab id and user_tab_id are GUIDs, meaning that they can be distributed between databases
    • Others are not (may be needed to support?)
    • Expected XML schema size: 45KB (up to 500KB)
    Why migration to MySQL will be problematic (they are already using the current SQL Server SP features):
    • Maximal row size
      • http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html
      • The maximum row length, except for VARBINARY, VARCHAR, BLOB and TEXT columns, is slightly less than half of a database page. That is, the maximum row length is about 8000 bytes
    • Scope identity
    • For XML
    • Bulk Insert from XML
    • Rollback/Transaction
    • BEGIN TRY
    Applicative changes by the user:
    • Move
    • Delete
    • Add
    • Minimize
    • Change internal parameters
    Options to be considered:
    • SQL Server based large machine
    • Sharding based on MySQL, SQL Server
    • Gigaspaces solution: felt that it's a large machine, and they prefer the database way
    Other options:
    • Saving only XML in the current way
    • Save the summed page configuration in an XML, so that little needs to be read from the DB (Tab based)
    • Write 20K files of 40KB each on my laptop HD: 149s
    • Read XML: 109.5s
    • Write to DB: 1789s
    • Use serialization of .Net
    • Save the XML on the disk in order to avoid variable length fields
    • Use memcached to hold the hash of users?
    Things to be considered:
    • What horizontal sharding algorithm should be selected
    • Hibernate Shards – provided by Google. Still in beta-testing phase
    • What vertical sharding: which tables should be split into different databases
    • How do you manage so many databases (distribute data and so on)
    • There is not really an option to do that
    • Defining optimal table sizes
    • Retrieval of data from the disk vs. getting data from the tables: Tab (1 per displayed page), Zone (3 per displayed page), Widget instance (10-300 per displayed page + should be extracted with/out zones)
    • OLAP solution to merge data
    OLAP solution
    Toolbar design:
    • Saving

Transcript

  • 1. MySQL Sharding MySQL User Group 4/3/2009 [email_address] http://top-performance.blogspot.com
  • 2. RockeTier Methodology
    • Detect
    • Rate
    • Immediate Effective Relief
    • Roadmap
    • Scale up and Scale out
  • 3. Presentation Objectives
    • Who is using MySQL?
    • MySQL Limitations
    • How to get over this?
      • Move to another DB and scale up…
      • Vertical Sharding
      • Horizontal Sharding
    • Sharding test case
  • 4. Who is Using MySQL?
  • 5. MySQL Limitations
    • Table sizes: 50-100M records per table
    • Reads: 50 queries/second
  • 6. Why Do I Care?
    • From 0 to 100 (US mass adoption)
      • Phone: 100 yrs
      • Radio: 40 yrs
      • TV: 30 yrs
      • Mobile: 20 yrs
      • Internet: 10 yrs
      • Facebook: 2 yrs
  • 7. 100K New Users/Week
  • 8. The Network Effect
  • 9. What Should I Do?
    • Oracle
    • SQL Server
    • $$$
  • 10. Sharding
  • 11. Vertical Sharding
  • 12. Horizontal Sharding
    • Static Hashing
      • Complex growth
      • Simple
    Mod 10 = 0 Mod 10 = 1 Mod 10 = 2 Mod 10 = 3 Mod 10 = 4 Mod 10 = 5 Mod 10 = 6 Mod 10 = 7 Mod 10 = 8 Mod 10 = 9
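
The "Mod 10" layout on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's code: `shard_for` and `moved_fraction` are hypothetical names, and keys are assumed to be non-negative integers.

```python
from typing import Iterable

NUM_SHARDS = 10  # matches the "Mod 10" layout on the slide


def shard_for(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Static hashing: the shard index is simply key modulo the shard count."""
    if key < 0:
        raise ValueError("key must be non-negative")
    return key % num_shards


def moved_fraction(keys: Iterable[int], old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)
```

Under these assumptions, `shard_for(1234)` is 4, and `moved_fraction(range(110), 10, 11)` shows that roughly 91% of keys would land on a different shard after adding an eleventh database. That is one way to read the "complex growth" bullet: the lookup is simple, but resizing forces most data to move.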
  • 13. Horizontal Sharding
    • Key locations are defined in a directory
      • Simple growth
      • Directory is SPOF
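
A directory-based scheme can be sketched as an explicit key-to-shard map. This is an assumption-laden illustration: `ShardDirectory` is a hypothetical class, shard names such as `"db0"` are made up, and an in-memory dict stands in for what would be a directory database in practice.

```python
from typing import Dict


class ShardDirectory:
    """Maps every key to the shard that holds it (hypothetical sketch)."""

    def __init__(self) -> None:
        self._location: Dict[int, str] = {}

    def assign(self, key: int, shard: str) -> None:
        """Record (or change) the shard that stores this key."""
        self._location[key] = shard

    def lookup(self, key: int) -> str:
        """Return the shard for a key; raises KeyError if it was never assigned."""
        return self._location[key]


directory = ShardDirectory()
directory.assign(1, "db0")
directory.assign(2, "db1")
# Growth: a new shard "db2" takes new keys; keys 1 and 2 stay where they are.
directory.assign(3, "db2")
```

Growth is simple because existing entries never have to change when a shard is added. The cost is that every query goes through `lookup`, which is why the slide flags the directory as a single point of failure.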
  • 14. Horizontal Sharding
    • Static hashing with directory mapping
      • Simple growth
      • Small Directory still SPOF
    Mod 1000 = 4
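
One possible reading of "static hashing with directory mapping" is a two-step lookup: hash the key into one of 1000 fixed buckets, then map buckets to databases through a small directory. The sketch below assumes that reading; `BucketDirectory` and its methods are hypothetical names.

```python
NUM_BUCKETS = 1000  # matches the "Mod 1000" label on the slide


class BucketDirectory:
    """Small directory: one entry per bucket rather than one per key."""

    def __init__(self, num_shards: int) -> None:
        # Initial layout: spread the buckets evenly over the existing shards.
        self._bucket_to_shard = [b % num_shards for b in range(NUM_BUCKETS)]

    def bucket_for(self, key: int) -> int:
        """Static hashing step: the bucket never changes for a given key."""
        return key % NUM_BUCKETS

    def shard_for(self, key: int) -> int:
        """Directory step: look up which shard currently owns the bucket."""
        return self._bucket_to_shard[self.bucket_for(key)]

    def move_bucket(self, bucket: int, shard: int) -> None:
        """Growth: reassign one bucket to a (possibly new) shard."""
        self._bucket_to_shard[bucket] = shard
```

With two shards, key 1004 falls in bucket 4 and starts on shard 0; calling `move_bucket(4, 2)` relocates only that bucket to a new shard 2. The directory holds 1000 entries regardless of how many keys exist, which fits the "small directory" wording, although it remains a single point of failure.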
  • 15. Horizontal Sharding
    • Each key signed by DB# generated on creation
      • Simple growth
      • New key generation is SPOF
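
Embedding the database number in the key at creation time might look like the sketch below. The `"<shard>-<sequence>"` key format, `KeyGenerator`, and `parse_key` are assumptions made for illustration; the slide does not specify an encoding.

```python
import itertools
from typing import Dict, Iterator, Tuple


class KeyGenerator:
    """Central generator that stamps each new key with its shard number."""

    def __init__(self) -> None:
        self._counters: Dict[int, Iterator[int]] = {}

    def new_key(self, shard: int) -> str:
        """Create a key of the form '<shard>-<sequence>' for the given shard."""
        counter = self._counters.setdefault(shard, itertools.count(1))
        return f"{shard}-{next(counter)}"


def parse_key(key: str) -> Tuple[int, int]:
    """Recover (shard, local id) from a key without any directory lookup."""
    shard, local_id = key.split("-", 1)
    return int(shard), int(local_id)
```

Reads need no directory because the key itself says where the row lives, and a new shard only affects keys created afterwards. The generator is the one shared component, which corresponds to the "new key generation is SPOF" bullet.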
  • 16. Sharding Management
    • No mature tools in the market
    • Hibernate Shards – not recommended
      • Hibernate…
      • Beta
    • Required Mechanisms
      • Distribution of changes in DB schema
  • 17. Reporting
  • 18. Best Practices
    • $connection = new_db_connection("customer://1234");
    • $statement = $connection->prepare( $sql_statement, $params );
    • $result = $statement->execute();
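
One way to read the snippet on this slide is that application code asks for a connection by logical resource name (`customer://1234`) and a small layer decides which shard serves it. The Python sketch below illustrates that idea only: `resolve_shard`, the modulo rule, and the shard count are assumptions, not part of the presentation.

```python
from typing import Tuple
from urllib.parse import urlsplit

NUM_SHARDS = 10  # hypothetical shard count


def resolve_shard(resource: str, num_shards: int = NUM_SHARDS) -> Tuple[str, int]:
    """Turn 'entity://id' into (entity, shard index) using modulo hashing."""
    parts = urlsplit(resource)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected 'entity://id', got {resource!r}")
    return parts.scheme, int(parts.netloc) % num_shards
```

Here `resolve_shard("customer://1234")` returns `("customer", 4)`. Keeping this resolution behind a single function means the sharding rule can change without touching the calling code.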
  • 19. Lessons
    • Vertical Sharding:
      • User Actions, Users, Comments, Items
    • Horizontal Sharding
    • Denormalization
    • MySQL Replication
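
Vertical sharding along the lines listed on this slide can be sketched as a table-to-database routing map. The table and database names below are hypothetical; only the four groups (User Actions, Users, Comments, Items) come from the slide.

```python
# Hypothetical routing table: each functional area lives in its own database.
TABLE_TO_DATABASE = {
    "user_actions": "db_user_actions",
    "users": "db_users",
    "comments": "db_comments",
    "items": "db_items",
}


def database_for(table: str) -> str:
    """Return the database that holds the given table."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise ValueError(f"no database configured for table {table!r}") from None
```

A query against `comments` is then sent to `db_comments`, independent of the load on the other three databases. Joins across these groups have to be done in the application, which may be one reason the slide lists denormalization next to sharding.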
  • 20. Lessons
    • Vertical Sharding:
      • User Actions, Users, Comments, Items
    • Horizontal Sharding
    • Denormalization
    • MySQL Replication
  • 21. Lessons
    • 100M views per day
    • The path to Sharding:
      • Single server
      • Single master with multiple read slaves
      • Partitioned
      • Sharding
  • 22. Lessons
    • Master-Master replication
    • Each Shard is 50% loaded
    • 40K queries/second
  • 23. Ad Network Reference Architecture
  • 24. The Bottom Line: Grow ∞
  • 25. Startup your Engines Thank you [email_address] http://top-performance.blogspot.com