Sharding Architectures	
International PHP Conference 2008
Who the f*** is talking?
 David Soria Parra <dsp@php.net>
 PHP for 7 years
 Software Engineer at Sun Microsystems
 Sun Microsystems Web Stack
 PHP Source Committer
 Various other OSS Projects (Mercurial, etc)
Agenda
Problem
Possible Solutions ...and what not
What is sharding?
Implementation
Optimization
Migration
Problem

Scalability
  What kind of scalability?
High read and write traffic
Huge amount of data
Solutions
Master/Slave replication



               Master
       Slave            Slave
Master/Master replication



           Master Master Master
   Slave                          Slave
Master/Master Master/Slave
Master / Master
  unstable / no official support
  replication not distribution
Master / Slave
  no write optimization
  replication not distribution
Sharding


                     Shard
  Database   Shard           Shard



                Aggregation
What is sharding?

Application side implementation
Application side aggregation
Splitting data across databases
Parallelization
Sharding is not...

 sharding is...
   NOT replication / backup
   NOT clustering
   NOT just spreading
Implementation
shard 0



  PHP
              shard 1
application



              shard 2
shard 0

user ID: 9
               PHP
                           shard 1
             application



                           shard 2
9%3=0   shard 0

user ID: 9
               PHP
                                   shard 1
             application



                                   shard 2
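
The routing step on this slide (9%3=0, so user 9 goes to shard 0) can be sketched in a few lines of PHP. This is only an illustration: the function name `shardIndexFor` and the input checks are assumptions, not code from the talk.

```php
<?php
declare(strict_types=1);

/**
 * Pick a shard index by taking the user ID modulo the number of shards.
 * Hypothetical helper; the talk only shows the arithmetic 9 % 3 = 0.
 */
function shardIndexFor(int $userId, int $shardCount): int
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('shardCount must be at least 1');
    }
    if ($userId < 0) {
        throw new InvalidArgumentException('userId must not be negative');
    }
    return $userId % $shardCount;
}

echo shardIndexFor(9, 3), "\n"; // prints 0: user 9 lives on shard 0
```

Every request for the same user lands on the same shard, which is what keeps a user's related data together.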
Shard


Tables with a lot of writes
Tables with a lot of reads
Related data on one shard
Splitting: Tables

         User            Feeds


              Database
            Posts
Splitting: Tables

User                         Feeds


      Shard1        Shard2
   Posts
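
A table-based split amounts to a map from table name to shard. The sketch here reads the diagram as User and Posts on Shard1 and Feeds on Shard2; that reading, the constant `TABLE_TO_SHARD`, and the function `shardForTable` are all assumptions for illustration.

```php
<?php
declare(strict_types=1);

// Which shard owns which table (assumed reading of the diagram).
const TABLE_TO_SHARD = [
    'User'  => 'Shard1',
    'Posts' => 'Shard1',
    'Feeds' => 'Shard2',
];

/** Return the shard that holds the given table. */
function shardForTable(string $table): string
{
    if (!isset(TABLE_TO_SHARD[$table])) {
        throw new InvalidArgumentException("No shard configured for table $table");
    }
    return TABLE_TO_SHARD[$table];
}

echo shardForTable('Feeds'), "\n"; // prints Shard2
```

With this layout a query joining User and Feeds can no longer run inside one database; the application has to combine the two results itself.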
Splitting: Data

         User       Config


               Database
            Thread
                          Post
Splitting: Data

 User       Config         User          Config

       Shard1                    Shard2
    Thread                    Thread
          Post                      Post
     even user id              odd user id
Data distribution
Criteria

 Fast lookup
 Few database connections
 Inexpensive reorganization
 Equal distribution
Modulo



     Shard      Shard       Shard
     Uid: 1-9   Uid: 10-19 Uid: 20-30
Modulo

      2
                  4
                              6


     Shard      Shard       Shard        Shard
     Uid: 1-7   Uid: 8-15   Uid: 16-23   Uid: 24-30
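
To get a feel for the "inexpensive reorganization" criterion, the following sketch counts how many user IDs change shard when a fourth shard is added. It assumes placement by plain `user ID % shard count`, as on the earlier 9%3=0 slide; the function name `countMovedKeys` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Count the IDs in [$firstId, $lastId] whose shard changes when the
 * shard count goes from $oldCount to $newCount under modulo placement.
 */
function countMovedKeys(int $firstId, int $lastId, int $oldCount, int $newCount): int
{
    $moved = 0;
    for ($id = $firstId; $id <= $lastId; $id++) {
        if ($id % $oldCount !== $id % $newCount) {
            $moved++;
        }
    }
    return $moved;
}

echo countMovedKeys(1, 30, 3, 4), "\n"; // prints 22: 22 of the 30 IDs move
```

Under that assumption most rows have to be copied to a different shard when the shard count changes, so plain modulo is cheap to look up but expensive to reorganize.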
Lookup table

               Shard
                Uid: 1-9



                                    Lookup
       Shard
        Uid: 10-18

       Shard
        Uid: 19-28
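
A lookup table can be sketched as a list of ID ranges, each pointing at a shard. The ranges are the ones on the slide; the shard names, the constant `SHARD_LOOKUP`, and the function `lookupShard` are assumptions. In a real setup the table would probably live in a database or cache rather than in a PHP constant.

```php
<?php
declare(strict_types=1);

// Uid ranges from the slide; shard names are made up for this sketch.
const SHARD_LOOKUP = [
    ['from' => 1,  'to' => 9,  'shard' => 'shard0'],
    ['from' => 10, 'to' => 18, 'shard' => 'shard1'],
    ['from' => 19, 'to' => 28, 'shard' => 'shard2'],
];

/** Find the shard whose range contains the user ID. */
function lookupShard(int $userId, array $table): string
{
    foreach ($table as $row) {
        if ($userId >= $row['from'] && $userId <= $row['to']) {
            return $row['shard'];
        }
    }
    throw new OutOfBoundsException("No shard assigned to user $userId");
}

echo lookupShard(12, SHARD_LOOKUP), "\n"; // prints shard1
```

Moving a range to another shard only means copying those rows and changing one table entry, at the price of an extra lookup per request.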
Optimizing
Optimizing

caches (memcache)
model optimization
  normalization? no!
connection pools
persistent connections (using mysqli+mysqlnd)
Optimizing II

 aggregation daemons
   persistent
 MySQL Proxy with Lua
 asynchronous queries (mysqlnd)
Optimizing III
                        Shard 0

    App

          Aggregation   Shard 1

    App

                        Shard 2
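
What the aggregation layer does can be sketched as: ask every shard for its own top rows, then merge and trim. The sketch runs inside one PHP process with in-memory closures standing in for per-shard queries; the function name `newestAcrossShards` and the row shape (`id`, `created`) are assumptions, and a real aggregation daemon would be a separate persistent process.

```php
<?php
declare(strict_types=1);

/**
 * Ask every shard for its newest rows, then merge and trim in one place.
 *
 * @param array<string, callable(int): array> $shards hypothetical per-shard fetchers
 */
function newestAcrossShards(array $shards, int $limit): array
{
    $merged = [];
    foreach ($shards as $fetchNewest) {
        // Each shard only has to return its own top $limit rows.
        $merged = array_merge($merged, $fetchNewest($limit));
    }
    // Newest first, then keep the global top $limit.
    usort($merged, fn (array $a, array $b): int => $b['created'] <=> $a['created']);
    return array_slice($merged, 0, $limit);
}

// Stand-ins for "SELECT ... ORDER BY created DESC LIMIT ?" on each shard.
$shards = [
    'shard0' => fn (int $limit): array => array_slice([
        ['id' => 3, 'created' => 300],
        ['id' => 6, 'created' => 100],
    ], 0, $limit),
    'shard1' => fn (int $limit): array => array_slice([
        ['id' => 4, 'created' => 250],
        ['id' => 7, 'created' => 200],
    ], 0, $limit),
];

$top = newestAcrossShards($shards, 2);
echo implode(',', array_column($top, 'id')), "\n"; // prints 3,4
```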
Problems
Joins, Unions, Intersections
Grouping
Selecting and projecting on groups
Aggregation
Primary keys
Referential integrity (foreign keys)
(De-)Normalization
Problems
Global tables
  Sharing
  Tagging
  Invitations
  Search
Lots of relations (due to normalization) between tables
Migration
 Find the bottleneck
 Refactoring of DB model needed?
 Prepare implementation
   prepare DBA layer
   write unit tests
   split existing data
Migration II
 Setup shards
 Migrate your code
 Test, Test, Test
 Check if everything runs smoothly
   MySQL proxy
   DTrace
Scale!
Some code


$conn->getConnection($userId)->query();
$conn->getGlobalConnection()->query();
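
One way the object behind these two calls could look is sketched here. Only the method names `getConnection` and `getGlobalConnection` come from the slide; the class name `ShardConnections`, the modulo routing, the DSNs, and the injected `$connect` closure are assumptions. The sketch needs PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

final class ShardConnections
{
    /** @var array<string, object> connections opened so far, keyed by DSN */
    private array $open = [];

    /**
     * @param list<string> $shardDsns one DSN per shard
     * @param Closure(string): object $connect opens a connection for a DSN
     */
    public function __construct(
        private array $shardDsns,
        private string $globalDsn,
        private Closure $connect
    ) {
        if ($shardDsns === []) {
            throw new InvalidArgumentException('At least one shard is required');
        }
    }

    /** Connection to the shard holding this user's data (modulo routing assumed). */
    public function getConnection(int $userId): object
    {
        if ($userId < 0) {
            throw new InvalidArgumentException('userId must not be negative');
        }
        return $this->connectOnce($this->shardDsns[$userId % count($this->shardDsns)]);
    }

    /** Connection to the database holding global (unsharded) tables. */
    public function getGlobalConnection(): object
    {
        return $this->connectOnce($this->globalDsn);
    }

    /** Open each database at most once per request to keep connections few. */
    private function connectOnce(string $dsn): object
    {
        return $this->open[$dsn] ??= ($this->connect)($dsn);
    }
}

$conn = new ShardConnections(
    ['mysql:host=shard0.example', 'mysql:host=shard1.example', 'mysql:host=shard2.example'],
    'mysql:host=global.example',
    // Stub for the sketch; a real closure might return new PDO($dsn, $user, $password).
    fn (string $dsn): object => (object) ['dsn' => $dsn]
);

echo $conn->getConnection(9)->dsn, "\n"; // prints mysql:host=shard0.example
```

Connecting lazily means a request that touches one user opens one shard connection, not one per shard.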
Questions?


  david.soriaparra@sun.com
     http://blog.experimentalworks.net
Transcript of "Sharding Architectures"

  1. Sharding Architectures International PHP Conference 2008
  2. Who the f*** is talking? David Soria Parra <dsp@php.net> PHP for 7 years Software Engineer at Sun Microsystems Sun Microsystems Web Stack PHP Source Committer Various other OSS Projects (Mercurial, etc)
  3. Agenda Problem Possible Solutions ...and what not What is sharding? Implementation Optimization Migration
  4. Problem Scalability What kind of scalability? High read and write traffic Huge amount of data
  5. Solutions
  6. Master/Slave replication Master Slave Slave
  7. Master/Master replication Master Master Master Slave Slave
  8. Master/Master Master/Slave Master / Master unstable / no official support replication not distribution Master / Slave no write optimization replication not distribution
  9. Sharding Shard Database Shard Shard Aggregation
  10. What is sharding? Application side implementation Application side aggregation Splitting data across databases Parallelization
  11. Sharding is not... sharding is... NOT replication / backup NOT clustering NOT just spreading
  12. Implementation
  13. shard 0 PHP shard 1 application shard 2
  14. shard 0 user ID: 9 PHP shard 1 application shard 2
  15. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  16. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  17. 9%3=0 shard 0 user ID: 9 PHP shard 1 application shard 2
  18. Shard Tables with a lot of writes Tables with a lot of reads Related data on one shard
  19. Splitting: Tables User Feeds Database Posts
  20. Splitting: Tables User Feeds Shard1 Shard2 Posts
  21. Splitting: Data User Config Database Thread Post
  22. Splitting: Data User Config User Config Shard1 Shard2 Thread Thread Post Post even user id odd user id
  23. Data distribution
  24. Criteria Fast lookup Few database connections Inexpensive reorganization Equal distribution
  25. Modulo Shard Shard Shard Uid: 1-9 Uid: 10-19 Uid: 20-30
  26. Modulo 2 4 6 Shard Shard Shard Shard Uid: 1-7 Uid: 8-15 Uid: 16-23 Uid: 24-30
  27. Lookup table Shard Uid: 1-9 Lookup Shard Uid: 10-18 Shard Uid: 19-28
  28. Optimizing
  29. Optimizing caches (memcache) model optimization normalization? no! connection pools persistent connections (using mysqli+mysqlnd)
  30. Optimizing II aggregation daemons persistent MySQL Proxy with Lua asynchronous queries (mysqlnd)
  31. Optimizing III Shard 0 App Aggregation Shard 1 App Shard 2
  32. Problems Joins, Unions, Intersections Grouping Selecting and projecting on groups Aggregation Primary keys Referential integrity (foreign keys) (De-)Normalization
  33. Problems Global tables Sharing Tagging Invitations Search Lots of relations (due to normalization) between tables
  34. Migration Find the bottleneck Refactoring of DB model needed? Prepare implementation prepare DBA layer write unit tests split existing data
  35. Migration II Setup shards Migrate your code Test, Test, Test Check if everything runs smoothly MySQL proxy DTrace
  36. Scale!
  37. Some code $conn->getConnection($userId)->query(); $conn->getGlobalConnection()->query();
  38. Questions? david.soriaparra@sun.com http://blog.experimentalworks.net