Sharding Architectures

Published in: Technology

  1. Sharding Architectures. International PHP Conference 2008
  2. Who the f*** is talking? David Soria Parra <dsp@php.net>; using PHP for 7 years; Software Engineer at Sun Microsystems (Sun Microsystems Web Stack); PHP source committer; various other OSS projects (Mercurial, etc.)
  3. Agenda: Problem; Possible solutions ...and what not; What is sharding?; Implementation; Optimization; Migration
  4. Problem: Scalability. What kind of scalability? High read and write traffic; huge amount of data
  5. Solutions
  6. Master/Slave replication (diagram): Master, Slave, Slave
  7. Master/Master replication (diagram): Master, Master, Master, Slave, Slave
  8. Master/Master vs. Master/Slave. Master/Master: unstable, no official support; replication, not distribution. Master/Slave: no write optimization; replication, not distribution
  9. Sharding (diagram): Database; Shard, Shard, Shard; Aggregation
  10. What is sharding? Application-side implementation; application-side aggregation; splitting data across databases; parallelization
  11. Sharding is not...: NOT replication / backup; NOT clustering; NOT just spreading
  12. Implementation
  13. Diagram: PHP application; shard 0, shard 1, shard 2
  14. Diagram: user ID: 9; PHP application; shard 0, shard 1, shard 2
  15. Diagram: user ID: 9; 9 % 3 = 0; PHP application; shard 0, shard 1, shard 2
  16. Diagram: user ID: 9; 9 % 3 = 0; PHP application; shard 0, shard 1, shard 2
  17. Diagram: user ID: 9; 9 % 3 = 0; PHP application; shard 0, shard 1, shard 2
  18. Shard: tables with a lot of writes; tables with a lot of reads; related data on one shard
  19. Splitting: Tables (diagram): Database holding User, Feeds, Posts
  20. Splitting: Tables (diagram): User, Feeds, Posts spread over Shard1 and Shard2
  21. Splitting: Data (diagram): Database holding User, Config, Thread, Post
  22. Splitting: Data (diagram): Shard1 and Shard2 each hold User, Config, Thread, Post; one shard for even user IDs, one for odd user IDs
  23. Data distribution
  24. Criteria: fast lookup; few database connections; inexpensive reorganization; equal distribution
  25. Modulo (diagram): three shards holding Uid 1-9, Uid 10-19, Uid 20-30
  26. Modulo (diagram, labels 2, 4, 6): four shards holding Uid 1-7, Uid 8-15, Uid 16-23, Uid 24-30
  27. Lookup table (diagram): Lookup pointing to three shards holding Uid 1-9, Uid 10-18, Uid 19-28
  28. Optimizing
  29. Optimizing: caches (memcache); model optimization; normalization? no!; connection pools; persistent connections (using mysqli + mysqlnd)
  30. Optimizing II: aggregation daemons (persistent); MySQL Proxy with Lua; asynchronous queries (mysqlnd)
  31. Optimizing III (diagram): App, App; Aggregation; Shard 0, Shard 1, Shard 2
  32. Problems: joins, unions, intersections; grouping; selecting and projecting on groups; aggregation; primary keys; referential integrity (foreign keys); (de-)normalization
  33. Problems: global tables; sharing; tagging; invitations; search; lots of relations (due to normalization) between tables
  34. Migration: find the bottleneck; refactoring of DB model needed?; prepare implementation (prepare DBA layer, write unit tests, split existing data)
  35. Migration II: set up shards; migrate your code; test, test, test; check that everything runs smoothly (MySQL Proxy, DTrace)
  36. Scale!
  37. Some code: $conn->getConnection($userId)->query(); $conn->getGlobalConnection()->query();
  38. Questions? david.soriaparra@sun.com http://blog.experimentalworks.net
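
Slides 15-17 route user ID 9 to shard 0 via 9 % 3 = 0, and slide 24 lists "inexpensive reorganization" as a criterion. The following is a minimal sketch of modulo routing, not code from the talk: the function names shardFor and movedIds are hypothetical, and the ID range 1-30 is only borrowed from the diagram labels. It also counts how many IDs would land on a different shard when a fourth shard is added, which hints at why plain modulo can make reorganization costly.

```php
<?php
// Hypothetical helper: map a non-negative numeric user ID to a shard index.
function shardFor(int $userId, int $shardCount): int
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('shardCount must be at least 1');
    }
    if ($userId < 0) {
        throw new InvalidArgumentException('userId must not be negative');
    }
    return $userId % $shardCount;
}

// Hypothetical helper: count the IDs in 1..$maxId whose shard index changes
// when the number of shards goes from $from to $to.
function movedIds(int $maxId, int $from, int $to): int
{
    $moved = 0;
    for ($id = 1; $id <= $maxId; $id++) {
        if (shardFor($id, $from) !== shardFor($id, $to)) {
            $moved++;
        }
    }
    return $moved;
}

echo shardFor(9, 3), "\n";      // 0, matching slides 15-17
echo movedIds(30, 3, 4), "\n";  // 22 of the 30 IDs change shards
```

Under these assumptions, going from three to four shards moves 22 of 30 users, so most rows would have to be copied during a resize.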
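
Slide 27 shows a lookup table in front of three shards holding Uid 1-9, 10-18, and 19-28. One way to picture it, as a sketch only: the table is an ordered list of inclusive ID ranges, each pointing at a shard. The function names and the shard names shard0 to shard2 are made up for illustration; in practice the table might live in a global database or a cache rather than in a PHP array.

```php
<?php
// Hypothetical lookup table using the ranges from slide 27.
// The shard names are placeholders, not taken from the talk.
function slideLookupTable(): array
{
    return [
        ['from' => 1,  'to' => 9,  'shard' => 'shard0'],
        ['from' => 10, 'to' => 18, 'shard' => 'shard1'],
        ['from' => 19, 'to' => 28, 'shard' => 'shard2'],
    ];
}

// Return the shard whose inclusive range contains $userId.
function lookupShard(array $table, int $userId): string
{
    foreach ($table as $row) {
        if ($userId >= $row['from'] && $userId <= $row['to']) {
            return $row['shard'];
        }
    }
    throw new OutOfRangeException("No shard registered for user ID $userId");
}

echo lookupShard(slideLookupTable(), 12), "\n"; // shard1
```

Compared with modulo, moving a range of users to a new shard here only means editing a table row and copying that range, at the price of an extra lookup per request.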
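
Slide 10 names application-side aggregation, and slide 32 lists joins and grouping among the problems. A rough sketch of what that aggregation could look like follows. It assumes each shard has already returned its rows as a PHP array; the data, the field names id and created, and the function names are invented for the example. The application merges the per-shard results and applies the ordering and limit itself, since no single database sees all rows.

```php
<?php
// Made-up stand-in for the result sets of three shards.
function demoShardResults(): array
{
    return [
        [['id' => 3, 'created' => 100], ['id' => 6, 'created' => 250]],
        [['id' => 1, 'created' => 300]],
        [['id' => 2, 'created' => 50], ['id' => 5, 'created' => 200]],
    ];
}

// Merge rows from all shards, sort newest first, and keep $limit rows.
function aggregateRecent(array $shardResults, int $limit): array
{
    $merged = [];
    foreach ($shardResults as $rows) {
        // In a real setup, $rows would come from one query per shard.
        foreach ($rows as $row) {
            $merged[] = $row;
        }
    }
    usort($merged, function (array $a, array $b): int {
        return $b['created'] <=> $a['created'];
    });
    return array_slice($merged, 0, $limit);
}

$top = aggregateRecent(demoShardResults(), 3);
echo implode(',', array_column($top, 'id')), "\n"; // 1,6,5
```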
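
Slide 37 shows only two calls, getConnection($userId) and getGlobalConnection(). The class behind them is not in the slides, so the following is a guess at a possible shape rather than the speaker's implementation. The class name ShardRouter, the constructor arguments, the modulo routing, and the DSN strings are all assumptions. The connection factory is passed in as a callable so the sketch runs without a database; in a real application it might be something like function ($dsn) { return new PDO($dsn); }.

```php
<?php
// Hypothetical router: only the two public method names come from slide 37.
class ShardRouter
{
    private $shardDsns;
    private $globalDsn;
    private $connect;
    private $cache = [];

    public function __construct(array $shardDsns, string $globalDsn, callable $connect)
    {
        if (count($shardDsns) === 0) {
            throw new InvalidArgumentException('At least one shard is required');
        }
        $this->shardDsns = array_values($shardDsns);
        $this->globalDsn = $globalDsn;
        $this->connect = $connect;
    }

    // Connection to the shard that holds this user's data (modulo routing assumed).
    public function getConnection(int $userId)
    {
        $index = $userId % count($this->shardDsns);
        return $this->open($this->shardDsns[$index]);
    }

    // Connection to the database holding unsharded, global tables.
    public function getGlobalConnection()
    {
        return $this->open($this->globalDsn);
    }

    // Open each DSN at most once to keep the number of connections low.
    private function open(string $dsn)
    {
        if (!isset($this->cache[$dsn])) {
            $this->cache[$dsn] = ($this->connect)($dsn);
        }
        return $this->cache[$dsn];
    }
}

// Stub factory for demonstration: returns an object that just records the DSN.
$stub = function (string $dsn) {
    return (object) ['dsn' => $dsn];
};
$conn = new ShardRouter(
    ['mysql:host=shard0', 'mysql:host=shard1', 'mysql:host=shard2'],
    'mysql:host=global',
    $stub
);
echo $conn->getConnection(9)->dsn, "\n";      // mysql:host=shard0
echo $conn->getGlobalConnection()->dsn, "\n"; // mysql:host=global
```

Caching one connection per DSN is one way to address the "few database connections" criterion from slide 24.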