Sharding Architectures

  • 20,787 views
Uploaded on

 

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
  • obat kista coklat http://obatkistacoklatsite.wordpress.com/
    Are you sure you want to
    Your message goes here
No Downloads

Views

Total Views
20,787
On Slideshare
0
From Embeds
0
Number of Embeds
9

Actions

Shares
Downloads
594
Comments
1
Likes
37

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Sharding Architectures International PHP Conference 2008
  • 2. Who the f*** is talking? David Soria Parra <dsp@php.net> PHP since 7 years Software Engineer at Sun Microsystems Sun Microsystems Web Stack PHP Source Committer Various other OSS Projects (Mercurial, etc)
  • 3. Agenda Problem Possible Solutions ...and what not What is sharding? Implementation Optimization Migration
  • 4. Problem Scalability What kind of scalability? High read and write traffic Huge amount of data
  • 5. Solutions
  • 6. Master/Slave replication Master Slave Slave
  • 7. Master/Master replication Master Master Master Slave Slave
  • 8. Master/Master Master/Slave Master / Master unstable / no official support replication not distribution Master / Slave no write optimization replication not distribution
  • 9. Sharding Shard Database Shard Shard Aggregation
  • 10. What is sharding? Application side implementation Application side aggregation Splitting data across databases Parallelization
  • 11. Sharding is not... sharding is... NOT replication / backup NOT clustering NOT just spreading
  • 12. Implementation
  • 13. shard 0 PHP shard 1 application shard 2
  • 14. shard 0 user ID: 9 PHP shard 1 application shard 2
  • 15. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  • 16. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  • 17. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  • 18. Shard Tables with a lot of writes Tables with a lot of reads Related data on one shard
  • 19. Splitting: Tables User Feeds Database Posts
  • 20. Splitting: Tables User Feeds Shard1 Shard2 Posts
  • 21. Splitting: Data User Config Database Thread Post
  • 22. Splitting: Data User Config User Config Shard1 Shard2 Thread Thread Post Post even user id odd user id
  • 23. Data distribution
  • 24. Criteria Fast lookup Few database connections Inexpensive reorganization Equal distribution
  • 25. Modulo Shard Shard Shard Uid: 1-9 Uid: 10-19 Uid: 20-30
  • 26. Modulo 2 4 6 Shard Shard Shard Shard Uid: 1-7 Uid: 8-15 Uid: 16-23 Uid: 24-30
  • 27. Lookup table Shard Uid: 1-9 Lookup Shard Uid: 10-18 Shard Uid: 19-28
  • 28. Optimizing
  • 29. Optimizing caches (memcache) model optimization normalization? no! connection pools persistant connections (using mysqli+mysqlnd)
  • 30. Optimizing II aggregation daemons persistent MySQL Proxy with LUA asynchronous queries (mysqlnd)
  • 31. Optimizing III Shard 0 App Aggregation Shard 1 App Shard 2
  • 32. Problems Joins, Unions, Intersections Grouping Selecting and projecting on groups Aggregation Primary keys Referential integrity (foreign keys) (De-)Normalization
  • 33. Problems Global tables Sharing Tagging Invitations Search Lots of relations (due to normalization) between tables
  • 34. Migration Find the bottleneck Refactoring of DB model needed? Prepare implementation prepare DBA layer write unit tests split existing data
  • 35. Migration II Setup shards Migrate your code Test, Test, Test Check if everything runs smooth MySQL proxy DTrace
  • 36. Scale!
  • 37. Some code $conn->getConnection($userId)->query(); $conn->getGlobalConnection()->query();
  • 38. Questions? david.soriaparra@sun.com http://blog.experimentalworks.net