Sharding using PHP


Mats Kindahl (Senior Principal Software Developer)




Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
About the Presentation



           After this presentation you should know what sharding
           is and the basic caveats surrounding sharding. You
           should also have an idea of what is needed to develop
           a sharding solution.




Program Agenda


     Why do we shard
     Introduction to sharding
     High-level sharding architecture
     Elements of a sharding solution
     Sharding planning



What is sharding?
      (Figure labels: splintering, horizontal partitioning)



      ●      Slice your database into independent data “shards”
      ●      Queries execute only on one shard
      ●      Shards can be stored on different servers

Sharding for locality




      “Big Data” close to user
Sharding for performance

       ●      Reduced working set
       ●      Database vs. cache
       ●      Parallel processing


Sharding Limitations

    ●     Auto-increment
            –      Composite key
            –      Distributed key generation
            –      UUID?
    ●     Cross-shard joins
            –      Very expensive: avoid them
            –      Federated tables?
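
The composite-key idea can be sketched as follows. This is a minimal illustration, not something the slides prescribe: the function names, the 10-bit shard field and the packing layout are all assumptions. Each shard keeps its own local counter (for example an ordinary auto-increment column), and the shard number is packed into the low bits so that IDs generated independently never collide.

```php
<?php
// Hypothetical composite-key helpers. SHARD_BITS is an assumption:
// 10 bits allow at most 1024 shards.
const SHARD_BITS = 10;

// Combine a shard number with a shard-local counter value into one integer.
function make_global_id(int $shard_id, int $local_id): int {
    if ($shard_id < 0 || $shard_id >= (1 << SHARD_BITS)) {
        throw new InvalidArgumentException("shard id out of range");
    }
    if ($local_id < 0) {
        throw new InvalidArgumentException("local id must be non-negative");
    }
    return ($local_id << SHARD_BITS) | $shard_id;
}

// Recover the shard number from a global ID (the low SHARD_BITS bits).
function shard_of(int $global_id): int {
    return $global_id & ((1 << SHARD_BITS) - 1);
}

// Recover the shard-local counter value from a global ID.
function local_of(int $global_id): int {
    return $global_id >> SHARD_BITS;
}
```

A side effect of this layout is that the shard can be read back from the ID itself, which saves a lookup when a row is fetched by primary key.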



Developing a Sharding Solution




High-level Architecture


       ●      Broker
                –      Distributes queries
       ●      Sharding Database
                –      Information about the shards
                –      If it goes down, all goes down
                –      Need to be HA




Running Example: Employees sample database


                                                                            Table               Rows
                                                                            salaries       284 404 700
                                                                            titles          44 330 800
                                                                            employees       30 002 400
                                                                            dept_emp        33 160 300
                                                                            dept_manager         2 400
                                                                            departments           900



Areas to cover



       ●      Sharding
                –      Data
                –      Meta-Data
                –      Query
                –      Operations




Data




       ●      Partition Data
                –      Key Columns
                –      Dependent Columns
                –      Tables to Shard
       ●      Mapping Keys
                –      Range Mapping
                –      Hash Mapping
                –      List Mapping
       ●      Shard Allocation
                –      Single Shard
                –      Multiple Shards




Partitioning the data


                                                                            Table               Rows
                                                                            salaries       284 404 700
                                                                            titles          44 330 800
                                                                            employees       30 002 400
                                                                            dept_emp        33 160 300
                                                                            dept_manager         2 400
                                                                            departments           900



Partitioning the data: sharding column(s)
     emp_no  birth_date  first_name  last_name  gender  hire_date
       4711  1989-06-13  John        Smith      M       2009-12-24
      19275  1954-11-12  Sally       Smith      F       1975-01-01
      27593  1477-05-19  Mats        Kindahl    M       2002-02-27
     587003  1830-08-28  Charles     Bell       M       2003-11-31
     ●     Sharding columns dictated by queries
             –      Queries should give same result before and after sharding
     ●     One or more columns
             –      Does not have to be primary key, but easier if it is
     ●     Sharding key is needed for re-sharding
Partitioning the data: sharding column(s)
       ●      Choice of sharding columns
                –      Distribution
                –      Locality
       ●      Avoid non-unique keys
                –      Difficult to get good distribution
                –      Avoid: Country
                –      Prefer: Employee ID

              (Figure: SE, 9 million; US, 200 million)




Partitioning the data: dependent columns


       (Figure: tables marked with "?"; label: Foreign keys)

                                                                            Table               Rows
                                                                            salaries       284 404 700
                                                                            titles          44 330 800
                                                                            employees       30 002 400
                                                                            dept_emp        33 160 300
                                                                            dept_manager         2 400
                                                                            departments           900



Partitioning the data: dependent columns

      SELECT first_name, last_name, salary
      FROM salaries JOIN employees
        USING (emp_no)
      WHERE emp_no = 21012
        AND CURRENT_DATE BETWEEN from_date AND to_date;

       ●      Referential Integrity Constraint
                –      Example query joining salaries and employees
                –      Same key, same shard
       ●      JOIN within a shard

Partitioning the data: dependent columns

       ●      Referential Integrity
                –      Foreign Keys
       ●      Dependent rows
                –      Same shard
                –      Join on equality
       ●      Sharding Columns
                –      Follow foreign keys

       Handy query to find all dependent columns:

       mysql> SELECT table_schema, table_name, column_name
           -> FROM
           ->      information_schema.key_column_usage
           ->   JOIN
           ->      information_schema.table_constraints
           ->   USING
           ->      (table_schema, table_name, constraint_name)
           -> WHERE constraint_type = 'FOREIGN KEY'
           ->   AND referenced_table_schema = 'employees'
           ->   AND referenced_table_name = 'employees'
           ->   AND referenced_column_name = 'emp_no';
       +--------------+--------------+-------------+
       | table_schema | table_name   | column_name |
       +--------------+--------------+-------------+
       | employees    | dept_emp     | emp_no      |
       | employees    | dept_manager | emp_no      |
       | employees    | salaries     | emp_no      |
       | employees    | titles       | emp_no      |
       +--------------+--------------+-------------+
       4 rows in set (0.56 sec)



Partitioning the data: unsharded tables


                                                                            Table               Rows
                                                                            salaries       284 404 700
                                                                            titles          44 330 800
                                                                            employees       30 002 400
                                                                            dept_emp        33 160 300
                                                                            dept_manager         2 400
                                                                            departments           900

       (Figure: "?" next to departments)



Partitioning the data: unsharded tables

      SELECT first_name, last_name, GROUP_CONCAT(dept_name)
        FROM employees JOIN dept_emp    USING (emp_no)
                       JOIN departments USING (dept_no)
      WHERE emp_no = 21012 GROUP BY emp_no;

       ●     Referential Integrity Constraint
               –     Join with sharded tables
               –     Tables dept_emp (and dept_manager) reference two tables
       ●     Shard table departments?
               –     Not necessary: small table
               –     Difficult to get right: keeping shards of two tables in same location
Partitioning the data: unsharded tables

      SELECT first_name, last_name, GROUP_CONCAT(dept_name)
        FROM employees JOIN dept_emp USING (emp_no)
                       JOIN departments USING (dept_no)
      WHERE emp_no = 21012 GROUP BY emp_no;

       ●      Solution: do not shard departments
                –      Keep table on all shards
                –      Joins will only need to address one shard
       ●      You need to consider
                       … how to update unsharded table

Mapping Keys to Shards


       ●      Given
                –      Sharding key value
                –      Optional other information (tables accessed, RO or RW, etc.)
       ●      Provide the following
                –      Shard location (host, port)
                –      Shard identifier (if you have multiple shards for each server)




Mapping Keys to Shards


       ●      Range Mapping: range of values for each shard
                –      Type-dependent
       ●      Hash Mapping: hash of key to find shard
                –      Type-independent
                –      Complicated?
       ●      List Mapping: list of keys for each shard
                –      Does not offer good distribution
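
The three mapping schemes can be compared side by side in a short sketch. The function names and the country lists are made up for illustration; the range bounds reuse the 0 / 20000 / 50000 example from the range mapping slide. crc32() is used as a stand-in hash only because it returns an integer directly, and 64-bit PHP is assumed so that its result is non-negative.

```php
<?php
// Range mapping: a shard owns the keys from its lower bound up to the next
// shard's lower bound. $lower_bounds maps shard ID => lower bound, ascending.
function range_shard(int $key, array $lower_bounds): int {
    $found = null;
    foreach ($lower_bounds as $shard_id => $lower) {
        if ($key >= $lower) {
            $found = $shard_id;   // keep the last bound that is still <= key
        }
    }
    if ($found === null) {
        throw new OutOfRangeException("key below the lowest bound");
    }
    return $found;
}

// Hash mapping: type-independent, since any key can be hashed as a string.
function hash_shard(string $key, int $shard_count): int {
    return crc32($key) % $shard_count;
}

// List mapping: every key value is listed explicitly under its shard.
function list_shard(string $key, array $lists): int {
    foreach ($lists as $shard_id => $keys) {
        if (in_array($key, $keys, true)) {
            return $shard_id;
        }
    }
    throw new OutOfRangeException("key not listed on any shard");
}

$bounds = [0 => 0, 1 => 20000, 2 => 50000];
echo range_shard(2345, $bounds), "\n";                        // shard 0
echo list_shard('SE', [0 => ['SE', 'NO'], 1 => ['US']]), "\n"; // shard 0
```

Range mapping needs an ordering on the key type, which is why it is type-dependent; list mapping distributes only as well as the hand-written lists do.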

Shard Allocation: Single Shard per Server


       ●     Idea: there is only one shard on each server
       ●     Advantage: Cross-database queries do not require rewrite
       ●     Disadvantage: Expensive to balance server load
                     … moving hot data from server requires re-sharding

           SELECT first_name, last_name
           FROM
             employees.employees JOIN expenses.reciepts USING (emp_no)
           WHERE
             currency = 'USD'


Shard Allocation: Multiple Shards per Server


       ●      Idea: Keep several “virtual shards” on each server
       ●      Advantages
                –      Easier to balance load of servers
                                … move hot virtual shards to other server
                –      Improves performance
                –      Increases availability




Shard Allocation: Multiple Shards per Server


       ●     Disadvantage: cross-database queries require rewrite
                –     Error-prone
                –     Expensive?
       ●     Queries that go to one database not a problem
           SELECT first_name, last_name
           FROM
             employees.employees JOIN expenses.reciepts USING (emp_no)
           WHERE
             currency = 'USD'


Shard Allocation: Multiple Shards per Server


       ●     Idea: Add suffix to database name (optionally table name)
                      employees_N.employees
                      employees_N.employees_N
       ●     Idea: Keep substitution pattern in query string
           SELECT first_name, last_name
           FROM
             {employees.employees} JOIN {expenses.reciepts} USING (emp_no)
           WHERE
             currency = 'USD'


Shard Allocation: Multiple Shards per Server


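       class my_mysqli extends mysqli {
         public $shard_id;

         // Rewrite every {db.table} placeholder to db_N.table, where N is the
         // shard ID, and then run the query as usual.
         public function query($query,
                               $resultmode = MYSQLI_STORE_RESULT)
         {
           $real_query = preg_replace('/\{(\w+)\.(\w+)\}/',
                                      '${1}_' . $this->shard_id . '.${2}',
                                      $query);
           return parent::query($real_query, $resultmode);
         }
       }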
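
The placeholder substitution can be tried without a MySQL connection. The standalone function below is a hypothetical extraction of the same idea (the name rewrite_for_shard is an assumption); it uses the query from the earlier slide unchanged.

```php
<?php
// Hypothetical standalone version of the {db.table} substitution.
// {db.table} becomes db_N.table, where N is the virtual shard number.
function rewrite_for_shard(string $query, int $shard_id): string {
    // ${1} and ${2} are preg_replace back-references; the braces keep the
    // reference number separate from the text that follows it.
    return preg_replace('/\{(\w+)\.(\w+)\}/',
                        '${1}_' . $shard_id . '.${2}',
                        $query);
}

$query = 'SELECT first_name, last_name FROM {employees.employees} '
       . 'JOIN {expenses.reciepts} USING (emp_no) WHERE currency = \'USD\'';

echo rewrite_for_shard($query, 7), "\n";
```

Queries without placeholders pass through unchanged, so tables that are kept unsharded can be addressed by their plain names.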



Meta Data




       ●      Mapping Methods
                –      Static Sharding
                –      Dynamic Sharding
       ●      Shard Information
                –      Shard ID
                –      Shard Host
                –      Shard Specifics*
       ●      Mapping Schemes
                –      Range Mapping
                –      Hash Mapping
                –      List Mapping

        * If you use multiple shards per server


Mapping Methods: Static Sharding

     ●     Idea: Compute shard statically
     ●     Advantages
             –      Simple
             –      No extra lookups
             –      No single point of failure
     ●     Disadvantage
             –      Lack of flexibility



Mapping Methods: Static Sharding, in code


       ●      Dictionary class
       ●      Input: sharding key
       ●      Output: connection
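       class Dictionary {
         private $shardinfo;   // list of shard descriptors (host, user, passwd, db, port)
         private $emp_no;
         public function __construct() { ... }
         public function set_key($emp_no) { $this->emp_no = $emp_no; }

           public function get_connection() {
             // Static mapping: the key modulo the number of shards picks the shard.
             $i = $this->shardinfo[$this->emp_no % count($this->shardinfo)];
             return new mysqli("p:{$i->host}", $i->user, $i->passwd, $i->db, $i->port);
           }
       }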

Mapping Methods: Static Sharding, in code

         $HIRED = <<<END_OF_QUERY
         SELECT first_name, last_name, hire_date, salary
           FROM employees AS e, salaries AS s
         WHERE s.emp_no = e.emp_no AND e.emp_no = ?
           AND CURRENT_DATE BETWEEN s.from_date AND s.to_date
         END_OF_QUERY;

         $DICTIONARY = new Dictionary();

         $DICTIONARY->set_key($emp_no);
         $link = $DICTIONARY->get_connection();
         if ($stmt = $link->prepare($HIRED)) {
           $stmt->bind_param('i', $emp_no);
           $stmt->execute();
           $stmt->bind_result($first, $last, $hire, $salary);
           while ($stmt->fetch())
             printf("%s %s was hired on %s and has a salary of %s\n",
                    $first, $last, $hire, $salary);
         }



Mapping Methods: Dynamic Sharding


       ●      Idea: use a sharding database to keep track of shard locations
       ●      Advantages:
                –      Easy to migrate shards
                –      Easy to re-shard
       ●      Disadvantages:
                –      Complex
       ●      Performance?

Dynamic sharding, in code
       $FETCH_SHARD = <<<END_OF_QUERY
       shard selection query
       END_OF_QUERY;

       class Dictionary {
         private $dict; private $emp_no;

           public function __construct() {
             $this->dict = new mysqli('shardinfo.example.com', ...);
           }

           public function set_key($emp_no) { $this->emp_no = $emp_no; }

           public function get_connection() {
             global $FETCH_SHARD;   // the query string is defined at file scope
             // Look the shard up in the sharding database, then connect to it.
             $stmt = $this->dict->prepare($FETCH_SHARD);
             $stmt->bind_param('i', $this->emp_no);
             $stmt->execute();
             $stmt->bind_result($no, $host, $user, $passwd, $db, $port);
             $stmt->fetch();
             return new mysqli("p:{$host}", $user, $passwd, $db, $port);
           }
       }
Mapping Schemes: Range Mapping


       ●      Most basic scheme
       ●      One row for each range
       ●      Just store lower bound

              Shard ID   Lower
              0          0
              1          20000
              2          50000

         SELECT shard_id, hostname, port
         FROM shard_ranges JOIN shard_locations USING (shard_id)
         WHERE key_id = 1 AND 2345 >= shard_ranges.lower_bound
         ORDER BY shard_ranges.lower_bound DESC LIMIT 1;


Mapping Schemes: Regular Hashing


       ●      Computing a hash from the key
                       ShardID = SHA1(key) mod N
       ●      Adding (or removing) a shard
                       … can require moving rows between many shards
                       … often a lot of rows
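
How many rows "a lot" is can be estimated with a short experiment. The sketch below counts the keys whose shard changes when the shard count grows from N to N+1 under "hash mod N" placement. The function names and the key range are made up, and crc32() stands in for the SHA1 on the slide only to keep the arithmetic short (64-bit PHP is assumed so that crc32() is non-negative).

```php
<?php
// Shard for a key under regular hashing with $n shards.
function shard_for(int $key, int $n): int {
    return crc32((string) $key) % $n;
}

// Number of keys that land on a different shard once one shard is added.
function moved_keys(array $keys, int $n): int {
    $moved = 0;
    foreach ($keys as $key) {
        if (shard_for($key, $n) !== shard_for($key, $n + 1)) {
            $moved++;
        }
    }
    return $moved;
}

$keys  = range(10001, 20000);   // 10 000 made-up employee numbers
$moved = moved_keys($keys, 4);
printf("%d of %d keys move when going from 4 to 5 shards\n",
       $moved, count($keys));
```

A key stays in place only when its hash has the same remainder modulo N and modulo N+1. For a well-mixed hash that holds for about 1 key in N+1, so roughly N/(N+1) of all keys move: about 80% when going from 4 to 5 shards.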




Mapping Schemes: Regular Hashing
       (Figure: keys emp_no=20101, emp_no=43210, emp_no=23456 and emp_no=36912
       placed on shards 0 to 4 by HASH(key) mod N, shown for N and for N+1 shards)

Mapping Schemes: Consistent Hashing


       ●      Computing a hash from the key
                       SHA1(key)
       ●      Adding (or removing) a shard
                       … only requires moving rows from one shard to the new shard

              Shard ID   Hash
              6          08b1286ad1bebe6...
              2          1c2d4132144211a...
              4          9893238ed75cfc9...
              1          989bb9d2bc381f4...
              5          cab8c76b85c4e24...
              3          eccf30f69fe850f...
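
A minimal hash ring along these lines might look as follows. The class name, the shard names and the choice to place each shard at the SHA1 of its name are assumptions made for the sketch; the slides only show that every shard has a position in hash space. A key belongs to the first shard whose position is at or after the key's hash, wrapping around to the lowest position.

```php
<?php
// Hypothetical hash ring: one position per shard, keyed by SHA1 hex string.
class HashRing {
    private $ring = [];   // position (40-char hex string) => shard name

    public function add_shard(string $name): void {
        $this->ring[sha1($name)] = $name;
        ksort($this->ring, SORT_STRING);   // keep positions in ring order
    }

    public function shard_for(string $key): string {
        if (!$this->ring) {
            throw new RuntimeException("no shards on the ring");
        }
        $hash = sha1($key);
        foreach ($this->ring as $position => $name) {
            // Equal-length lowercase hex strings compare like the numbers
            // they encode, so strcmp gives the ring order.
            if (strcmp($hash, (string) $position) <= 0) {
                return $name;
            }
        }
        return reset($this->ring);   // past the last position: wrap around
    }
}

$ring = new HashRing();
foreach (['shard1', 'shard2', 'shard3', 'shard4'] as $shard) {
    $ring->add_shard($shard);
}
echo "emp_no=20101 lives on ", $ring->shard_for('20101'), "\n";
```

Adding a shard splits exactly one arc of the ring, so only keys from the shard that used to own that arc move, and they all move to the new shard. With a single position per shard the arcs can be quite uneven; giving each shard several positions is a common refinement.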



Mapping Schemes: Consistent Hashing
       (Figure: hash ring with shard1, shard2, shard3, shard4 and shard5; keys
       emp_no=20101, emp_no=43210, emp_no=23456 and emp_no=36912 placed around the ring)



Query Handling

       ●      Query Dispatch
                –      Mechanism
                –      Single/Multi Cast
                –      Handling Reads
                –      Handling Updates
                –      Transaction Handling
       ●      Connector Caches
                –      Time (TTL)
                –      On Error
                –      Explicit
       ●      Sharding Key
                –      Parsing
                –      Application provided



Query Dispatch: Mechanism


       ●      Proxy
                –      Sharding key extracted from query
                –      Requires extra hop
       ●      Application level
                –      Application provides sharding key
                –      No extra hop




Query Dispatch: Query Type


       ●      Read Query
                –      How do you ensure that it is executing on the right shard?
                –      How do you ensure that it is not cross-shard?
       ●      Update Query
                –      Updating an unsharded table – think about consistency




Query Dispatch: Handling Transactions


       ●      All statements of a transaction should go to the same session
                –      Sharding key on start of transaction?
                –      Is it a read-only or read-write transaction?
       ●      Statements for different transactions can go to different sessions
                –      How to detect transaction boundaries
       ●      Maintaining the session state
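
One way to keep all statements of a transaction together is to pin the shard chosen at the start of the transaction until it ends. The sketch below is a hypothetical dispatcher, not a design from the slides: the class name, the modulo placement and the rule of looking only at the first keyword are assumptions. It expects the application to supply a non-negative sharding key with every statement, and it ignores implicit commits and autocommit settings.

```php
<?php
// Hypothetical dispatcher that pins a transaction to one shard.
class TransactionRouter {
    private $shard_count;
    private $pinned = null;   // shard of the open transaction, or null

    public function __construct(int $shard_count) {
        $this->shard_count = $shard_count;
    }

    // Returns the shard number the statement should be sent to.
    public function route(string $statement, int $key): int {
        preg_match('/^\s*(\w+)/', $statement, $m);
        $verb = strtoupper($m[1] ?? '');

        if ($this->pinned !== null) {
            $shard = $this->pinned;            // inside a transaction: stay put
            if ($verb === 'COMMIT' || $verb === 'ROLLBACK') {
                $this->pinned = null;          // boundary reached: release the pin
            }
            return $shard;
        }

        $shard = $key % $this->shard_count;
        if ($verb === 'BEGIN' || $verb === 'START') {
            $this->pinned = $shard;            // pin until COMMIT or ROLLBACK
        }
        return $shard;
    }
}

$router = new TransactionRouter(4);
$statements = ['BEGIN',
               'SELECT salary FROM salaries WHERE emp_no = 20101',
               'COMMIT'];
foreach ($statements as $sql) {
    printf("shard %d <- %s\n", $router->route($sql, 20101), $sql);
}
```

The sketch makes the first question on the slide concrete: the key has to be known when BEGIN arrives, because that is the moment the shard is chosen.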



Query Dispatch: Handling Transactions
       BEGIN                                                       -- Sharding key?
       SELECT salary INTO @s FROM salaries WHERE emp_no = 20101;   -- Hmm... looks like a read transaction. Ah, there it is!
       SET @s = 1.1 * @s;                                          -- Session state?
       INSERT INTO salaries VALUES (20101, @s);                    -- Oops.. it was a write transaction!
       COMMIT                                                      -- Transaction done! Clear session state?
       BEGIN                                                       -- New transaction! Different connection?
       INSERT INTO ...
       COMMIT

50   Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
Query Handling


       ●      Query Dispatch
                –      Mechanism
                –      Single/Multi Cast
                –      Handling Reads
                –      Handling Updates
                –      Transaction Handling
       ●      Connector Caches
                –      Time (TTL)
                –      On Error
                –      Explicit
       ●      Sharding Key
                –      Parsing
                –      Application provided



Extracting Sharding Key


       ●      Parsing the query
                –      Locating the key
                –      Handling Transactions
       ●      Application-provided sharding key
                –      Annotating queries
                –      Separate function in connector




Extracting Sharding Key: Parsing Query


       ●      Problem: Locating the key
       ●      No generic parser
                –      Application-specific parser
                –      Constrain application developer
       ●      Transactions
                –      Key needed for first statement

       Example statements:

              INSERT INTO titles(emp_no, title, from_date)
              SELECT emp_no, '', CURRENT_DATE
              FROM titles JOIN employees USING (emp_no)
              WHERE first_name = 'Keith'

              BEGIN
              SELECT …
              INSERT …
              COMMIT;
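
To see why locating the key is a problem, consider a deliberately naive locator that looks for `<column> = <integer>` in the statement text. The function name `locate_sharding_key` is made up for this sketch and it is not a SQL parser. It finds the key in a simple SELECT but not in the INSERT ... SELECT from this slide, where the key value is never written as a literal.

```php
<?php
// Hypothetical helper: finds "<column> = <integer>" in a statement.
// Not a SQL parser; it only illustrates why key location is hard.
function locate_sharding_key(string $sql, string $column): ?int
{
    $pattern = '/\b' . preg_quote($column, '/') . '\s*=\s*(\d+)\b/i';
    return preg_match($pattern, $sql, $m) === 1 ? (int) $m[1] : null;
}

$simple = 'SELECT salary FROM salaries WHERE emp_no = 20101';
$hard   = "INSERT INTO titles(emp_no, title, from_date)
           SELECT emp_no, '', CURRENT_DATE
           FROM titles JOIN employees USING (emp_no)
           WHERE first_name = 'Keith'";

var_dump(locate_sharding_key($simple, 'emp_no')); // int(20101)
var_dump(locate_sharding_key($hard, 'emp_no'));   // NULL
```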


Extracting Sharding Key: Application Provided


       ●      Idea: Provide key explicitly
       ●      Annotate the statement

              /* emp_no=20101 */ BEGIN;
              SELECT …
              INSERT …
              COMMIT;

       ●      Extend connection manager
                –      Demonstrated previously

              …
              $DICT->set_key($key);
              $link = $DICT->get_connection();
              …
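
A minimal sketch of how a connector could read the annotation, assuming it always comes first in the statement and has the form `/* column=integer */`. The helper name `key_from_annotation` is hypothetical and is not part of the connection manager shown earlier.

```php
<?php
// Hypothetical helper: reads a leading "/* name=value */" annotation.
// Returns null when the statement carries no such annotation.
function key_from_annotation(string $sql): ?array
{
    if (preg_match('~^\s*/\*\s*(\w+)\s*=\s*(\d+)\s*\*/~', $sql, $m) !== 1) {
        return null;
    }
    return ['column' => $m[1], 'value' => (int) $m[2]];
}

$annotated = key_from_annotation('/* emp_no=20101 */ BEGIN;');
// $annotated is ['column' => 'emp_no', 'value' => 20101]
$plain = key_from_annotation('BEGIN;');
// $plain is null: the connector has no key and cannot pick a shard
```

Compared with parsing the query, the annotation makes the key available on the first statement of a transaction, which is exactly where dispatch needs it.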




Areas that we need to cover



       ●      Sharding
                –      Data
                –      Meta-Data
                –      Query
                –      Operations




Operations: Monitoring the System


       ●      Monitor load of each node
                                  … to see if any node gets an unfair number of queries
       ●      Monitor load of each shard (multiple shards per node)
                                  … to see if a shard gets an unfair number of queries
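
As a rough illustration of both checks, the sketch below flags any node or shard whose query count exceeds a multiple of the average. The function `find_hot`, the factor 1.5, and the sample counters are all assumptions made for this example. Where the counters come from is left open.

```php
<?php
// Hypothetical sketch: flags entries whose query count exceeds
// $factor times the average. Threshold and counters are assumptions.
function find_hot(array $queryCounts, float $factor = 1.5): array
{
    if ($queryCounts === []) {
        return [];
    }
    $average = array_sum($queryCounts) / count($queryCounts);
    $hot = array_filter($queryCounts, fn($n) => $n > $factor * $average);
    return array_keys($hot);
}

// Queries per shard, grouped by node (multiple shards per node).
$perShard = [
    'one.example.com' => ['shard_1' => 900, 'shard_2' => 100],
    'two.example.com' => ['shard_3' => 120, 'shard_4' => 80],
];
$perNode   = array_map('array_sum', $perShard);           // totals per node
$allShards = array_merge(...array_values($perShard));     // flat shard => count map

print_r(find_hot($perNode));    // one.example.com
print_r(find_hot($allShards));  // shard_1
```

Here the node is hot only because one of its shards is hot, which is the case where moving a shard alone may not be enough.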




Operations: Re-balancing the System


       ●     If an instance is hot:
                –     Move Shard: Move one shard to another instance
       ●     If a shard is hot:
                –     Split Shard: Split the shard into multiple shards
                –     Move Shard: Move one of the shards to another instance
       ●     If a shard is cold:
                –     Merge Shard: Merge a shard with other shards
                         ●    Avoid it – very difficult to do on-line

Operations: Moving a Shard


       ●      Offline (trivial)
                –      Bring source and target nodes down
                –      Copy shard from source to target
                –      Update dictionary
       ●      Online (tricky)
                –      We go through it on the following slides




Operations: Online Move of Shard

       1. Backup shard
                –      Might be multiple databases
                –      Note down binary log position
                          ●     “Backup position”
                –      Online backup
                          ●     mysqldump
                          ●     MySQL Enterprise Backup
       2. Restore backup on destination

       [Diagram: Application, Src, Dst, @Pos]
Operations: Online Move of Shard

       3. Start replication
                –      Source to target
                –      Start replication from backup position
                –      Only replicate shard?
                          replicate-wild-do-table=db_1.*

       [Diagram: Application, Dst, Src, @Pos]




Operations: Online Move of Shard

       4. Wait until destination is close enough
       5. Write lock on source
                       LOCK TABLES
       6. Note binary log position
                –      “Catch-up Position”

       [Diagram: Application, Dst, Src, @Pos]




Operations: Online Move of Shard

       7. Wait for destination to reach catch-up position
                       START SLAVE UNTIL
                       MASTER_POS_WAIT

       [Diagram: Application, Dst, Src, @Pos]




Operations: Online Move of Shard

       8. Update sharding database
                       … will re-direct queries
       9. Stop replication
                       RESET SLAVE
       10. Drop old shard
                       … unless you just wanted a copy

       [Diagram: Application, Dst, Src, @Pos]
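
The replication part of the move (steps 3 to 9) can be sketched as an ordered list of statements. The function below only builds the statements and does not run them. Its name, the host name, the table list, and the binary log coordinates are placeholders for illustration. It uses the statement names from the slides; newer MySQL releases offer equivalents with REPLICA and SOURCE in their names. Credentials, error handling, and releasing the lock are left out.

```php
<?php
// Hypothetical sketch: builds (does not run) the statements behind steps 3 to 9.
// All arguments are placeholders; the catch-up coordinates stand for the values
// that SHOW MASTER STATUS reported on the source in step 6.
// Values are interpolated without escaping, so they must be trusted.
function online_move_statements(
    string $srcHost, string $backupFile, int $backupPos,
    array $tables, string $catchupFile, int $catchupPos
): array {
    $locks = implode(', ', array_map(fn(string $t): string => "$t WRITE", $tables));
    return [
        // [step, server, statement]
        [3, 'dst', sprintf(
            "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d",
            $srcHost, $backupFile, $backupPos
        )],
        [3, 'dst', 'START SLAVE'],
        [5, 'src', "LOCK TABLES $locks"],
        [6, 'src', 'SHOW MASTER STATUS'],
        [7, 'dst', sprintf("SELECT MASTER_POS_WAIT('%s', %d)", $catchupFile, $catchupPos)],
        // Step 8 (update the sharding database) happens in the dictionary, not here.
        [9, 'dst', 'STOP SLAVE'],
        [9, 'dst', 'RESET SLAVE'],
    ];
}

$plan = online_move_statements(
    'one.example.com', 'mysql-bin.000003', 107,
    ['db_1.salaries'], 'mysql-bin.000003', 4711
);
foreach ($plan as [$step, $server, $sql]) {
    printf("step %d on %s: %s\n", $step, $server, $sql);
}
```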




Operations: Splitting a Shard


       ●      Application dependent
                –      Change sharding key?
                –      Change sharding scheme?
       ●      Can be expensive
       ●      You will have to do it
                                  … eventually



Operations: Splitting a Shard

       1. Copy shard to new location
                –      Use on-line move described on previous slides
       2. Update sharding database
                –      Will re-direct queries
       3. Remove rows from both shards
                –      Remove rows that do not belong to the shard

       [Diagram: one.example.com and two.example.com, annotated with steps 1, 2 and 3]
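
Step 3 could look like the sketch below if the shards are defined by key ranges, which is an assumption; the slides leave the sharding scheme open. After the copy both servers hold all rows, so each side deletes the half that now belongs to the other. The function name and the split point 30000 are made up for the example.

```php
<?php
// Hypothetical sketch for step 3 of a range-based split at $splitAt.
// 'lower' is run on the shard that keeps keys below the split point,
// 'upper' on the shard that keeps keys at or above it.
// Table and column names are interpolated as-is and must be trusted identifiers.
function split_cleanup_sql(string $table, string $keyColumn, int $splitAt): array
{
    return [
        'lower' => sprintf('DELETE FROM %s WHERE %s >= %d', $table, $keyColumn, $splitAt),
        'upper' => sprintf('DELETE FROM %s WHERE %s < %d', $table, $keyColumn, $splitAt),
    ];
}

$cleanup = split_cleanup_sql('salaries', 'emp_no', 30000);
echo $cleanup['lower'], "\n";   // DELETE FROM salaries WHERE emp_no >= 30000
echo $cleanup['upper'], "\n";   // DELETE FROM salaries WHERE emp_no < 30000
```

Running the cleanup only after the sharding database has been updated means queries are already directed to the right shard when the rows disappear.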

Great! Let's Shard!

       Wait a minute...




When to shard?


       ●      Inherently more complex
                –      Requires careful planning
                –      Application design?
       ●      Alternatives?
                –      Functional partitioning?
                –      Archiving old data?




Preparations for sharding

     ●      Monitor the system
              –      Types of queries
                        ●     What are the join queries
              –      Access patterns
                        ●     What tables are accessed
     ●      Find natural partition keys
              –      Robust and easy to implement
              –      Watch out for cross-shard joins


Summary

     ●     What are your goals?
     ●     Do your homework
     ●     Don't be too eager
     ●     Plan
     ●     Develop sharding solution
     ●     Revise the plans



Thanks for attending!

     ●     Questions? Comments?
     ●     Download MySQL!
                               http://dev.mysql.com
     ●     Read our book!
             –      Covers replication, sharding, scale-out, and much
                    much more





More Related Content

What's hot

Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseMike Dirolf
 
Oracle ACFS High Availability NFS Services (HANFS)
Oracle ACFS High Availability NFS Services (HANFS)Oracle ACFS High Availability NFS Services (HANFS)
Oracle ACFS High Availability NFS Services (HANFS)Anju Garg
 
Advanced messaging with Apache ActiveMQ
Advanced messaging with Apache ActiveMQAdvanced messaging with Apache ActiveMQ
Advanced messaging with Apache ActiveMQdejanb
 
Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)Slawomir Chodnicki
 
BloodHound: Attack Graphs Practically Applied to Active Directory
BloodHound: Attack Graphs Practically Applied to Active DirectoryBloodHound: Attack Graphs Practically Applied to Active Directory
BloodHound: Attack Graphs Practically Applied to Active DirectoryAndy Robbins
 
CEH - Module 5 : System Hacking
CEH - Module 5 : System HackingCEH - Module 5 : System Hacking
CEH - Module 5 : System HackingAvirot Mitamura
 
Webinaire : sécurité informatique sur le web - Jérôme Thémée
Webinaire : sécurité informatique sur le web - Jérôme ThéméeWebinaire : sécurité informatique sur le web - Jérôme Thémée
Webinaire : sécurité informatique sur le web - Jérôme ThéméeMarie Tapia
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMaris Elsins
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLEDB
 
MySQL Backup and Security Best Practices
MySQL Backup and Security Best PracticesMySQL Backup and Security Best Practices
MySQL Backup and Security Best PracticesLenz Grimmer
 
Java Deserialization Vulnerabilities - The Forgotten Bug Class
Java Deserialization Vulnerabilities - The Forgotten Bug ClassJava Deserialization Vulnerabilities - The Forgotten Bug Class
Java Deserialization Vulnerabilities - The Forgotten Bug ClassCODE WHITE GmbH
 
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...NTT DATA Technology & Innovation
 
From Traditional Data Warehouse To Real Time Data Warehouse
From Traditional Data Warehouse To Real Time Data WarehouseFrom Traditional Data Warehouse To Real Time Data Warehouse
From Traditional Data Warehouse To Real Time Data WarehouseOsama Hussein
 
PCF-VxRail-ReferenceArchiteture
PCF-VxRail-ReferenceArchiteturePCF-VxRail-ReferenceArchiteture
PCF-VxRail-ReferenceArchitetureVuong Pham
 
Oracle User Management
Oracle User ManagementOracle User Management
Oracle User ManagementArun Sharma
 
Hadoop -ResourceManager HAの仕組み-
Hadoop -ResourceManager HAの仕組み-Hadoop -ResourceManager HAの仕組み-
Hadoop -ResourceManager HAの仕組み-Yuki Gonda
 
Transparent sharding with Spider: what's new and getting started
Transparent sharding with Spider: what's new and getting startedTransparent sharding with Spider: what's new and getting started
Transparent sharding with Spider: what's new and getting startedMariaDB plc
 
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)CODE WHITE GmbH
 
DB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource ManagerDB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource ManagerMaris Elsins
 

What's hot (20)

Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
Oracle ACFS High Availability NFS Services (HANFS)
Oracle ACFS High Availability NFS Services (HANFS)Oracle ACFS High Availability NFS Services (HANFS)
Oracle ACFS High Availability NFS Services (HANFS)
 
Advanced messaging with Apache ActiveMQ
Advanced messaging with Apache ActiveMQAdvanced messaging with Apache ActiveMQ
Advanced messaging with Apache ActiveMQ
 
Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)
 
BloodHound: Attack Graphs Practically Applied to Active Directory
BloodHound: Attack Graphs Practically Applied to Active DirectoryBloodHound: Attack Graphs Practically Applied to Active Directory
BloodHound: Attack Graphs Practically Applied to Active Directory
 
CEH - Module 5 : System Hacking
CEH - Module 5 : System HackingCEH - Module 5 : System Hacking
CEH - Module 5 : System Hacking
 
Webinaire : sécurité informatique sur le web - Jérôme Thémée
Webinaire : sécurité informatique sur le web - Jérôme ThéméeWebinaire : sécurité informatique sur le web - Jérôme Thémée
Webinaire : sécurité informatique sur le web - Jérôme Thémée
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for Oracle
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
 
MySQL Backup and Security Best Practices
MySQL Backup and Security Best PracticesMySQL Backup and Security Best Practices
MySQL Backup and Security Best Practices
 
Java Deserialization Vulnerabilities - The Forgotten Bug Class
Java Deserialization Vulnerabilities - The Forgotten Bug ClassJava Deserialization Vulnerabilities - The Forgotten Bug Class
Java Deserialization Vulnerabilities - The Forgotten Bug Class
 
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...
データインターフェースとしてのHadoop ~HDFSとクラウドストレージと私~ (NTTデータ テクノロジーカンファレンス 2019 講演資料、2019...
 
From Traditional Data Warehouse To Real Time Data Warehouse
From Traditional Data Warehouse To Real Time Data WarehouseFrom Traditional Data Warehouse To Real Time Data Warehouse
From Traditional Data Warehouse To Real Time Data Warehouse
 
Introduction to sqoop
Introduction to sqoopIntroduction to sqoop
Introduction to sqoop
 
PCF-VxRail-ReferenceArchiteture
PCF-VxRail-ReferenceArchiteturePCF-VxRail-ReferenceArchiteture
PCF-VxRail-ReferenceArchiteture
 
Oracle User Management
Oracle User ManagementOracle User Management
Oracle User Management
 
Hadoop -ResourceManager HAの仕組み-
Hadoop -ResourceManager HAの仕組み-Hadoop -ResourceManager HAの仕組み-
Hadoop -ResourceManager HAの仕組み-
 
Transparent sharding with Spider: what's new and getting started
Transparent sharding with Spider: what's new and getting startedTransparent sharding with Spider: what's new and getting started
Transparent sharding with Spider: what's new and getting started
 
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)
Java Deserialization Vulnerabilities - The Forgotten Bug Class (RuhrSec Edition)
 
DB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource ManagerDB12c: All You Need to Know About the Resource Manager
DB12c: All You Need to Know About the Resource Manager
 

Viewers also liked

MySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMats Kindahl
 
MySQL Enterprise Cloud
MySQL Enterprise Cloud MySQL Enterprise Cloud
MySQL Enterprise Cloud Mark Swarbrick
 
Building Scalable High Availability Systems using MySQL Fabric
Building Scalable High Availability Systems using MySQL FabricBuilding Scalable High Availability Systems using MySQL Fabric
Building Scalable High Availability Systems using MySQL FabricMats Kindahl
 
[스마트스터디]MongoDB 의 역습
[스마트스터디]MongoDB 의 역습[스마트스터디]MongoDB 의 역습
[스마트스터디]MongoDB 의 역습smartstudy_official
 
SunshinePHP 2017 - Making the most out of MySQL
SunshinePHP 2017 - Making the most out of MySQLSunshinePHP 2017 - Making the most out of MySQL
SunshinePHP 2017 - Making the most out of MySQLGabriela Ferrara
 
Strip your TEXT fields - Exeter Web Feb/2016
Strip your TEXT fields - Exeter Web Feb/2016Strip your TEXT fields - Exeter Web Feb/2016
Strip your TEXT fields - Exeter Web Feb/2016Gabriela Ferrara
 
MySQL Cluster Whats New
MySQL Cluster Whats NewMySQL Cluster Whats New
MySQL Cluster Whats NewMark Swarbrick
 
LaravelSP - MySQL 5.7: introdução ao JSON Data Type
LaravelSP - MySQL 5.7: introdução ao JSON Data TypeLaravelSP - MySQL 5.7: introdução ao JSON Data Type
LaravelSP - MySQL 5.7: introdução ao JSON Data TypeGabriela Ferrara
 
20171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v120171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v1Ivan Ma
 
Coding like a girl - DjangoCon
Coding like a girl - DjangoConCoding like a girl - DjangoCon
Coding like a girl - DjangoConGabriela Ferrara
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherObjectRocket
 
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Ontico
 
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ivan Zoratti
 
LAMP: Desenvolvendo além do trivial
LAMP: Desenvolvendo além do trivialLAMP: Desenvolvendo além do trivial
LAMP: Desenvolvendo além do trivialGabriela Ferrara
 
MySQL 5.7 - 
Tirando o Máximo Proveito
MySQL 5.7 - 
Tirando o Máximo ProveitoMySQL 5.7 - 
Tirando o Máximo Proveito
MySQL 5.7 - 
Tirando o Máximo ProveitoGabriela Ferrara
 
The MySQL Server Ecosystem in 2016
The MySQL Server Ecosystem in 2016The MySQL Server Ecosystem in 2016
The MySQL Server Ecosystem in 2016Colin Charles
 

Viewers also liked (20)

MySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal ScalingMySQL Sharding: Tools and Best Practices for Horizontal Scaling
MySQL Sharding: Tools and Best Practices for Horizontal Scaling
 
MySQL Enterprise Cloud
MySQL Enterprise Cloud MySQL Enterprise Cloud
MySQL Enterprise Cloud
 
Building Scalable High Availability Systems using MySQL Fabric
Building Scalable High Availability Systems using MySQL FabricBuilding Scalable High Availability Systems using MySQL Fabric
Building Scalable High Availability Systems using MySQL Fabric
 
MEAN Stack
MEAN StackMEAN Stack
MEAN Stack
 
[스마트스터디]MongoDB 의 역습
[스마트스터디]MongoDB 의 역습[스마트스터디]MongoDB 의 역습
[스마트스터디]MongoDB 의 역습
 
SunshinePHP 2017 - Making the most out of MySQL
SunshinePHP 2017 - Making the most out of MySQLSunshinePHP 2017 - Making the most out of MySQL
SunshinePHP 2017 - Making the most out of MySQL
 
Strip your TEXT fields - Exeter Web Feb/2016
Strip your TEXT fields - Exeter Web Feb/2016Strip your TEXT fields - Exeter Web Feb/2016
Strip your TEXT fields - Exeter Web Feb/2016
 
MySQL Cluster Whats New
MySQL Cluster Whats NewMySQL Cluster Whats New
MySQL Cluster Whats New
 
LaravelSP - MySQL 5.7: introdução ao JSON Data Type
LaravelSP - MySQL 5.7: introdução ao JSON Data TypeLaravelSP - MySQL 5.7: introdução ao JSON Data Type
LaravelSP - MySQL 5.7: introdução ao JSON Data Type
 
Mongodb
MongodbMongodb
Mongodb
 
20171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v120171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v1
 
Coding like a girl - DjangoCon
Coding like a girl - DjangoConCoding like a girl - DjangoCon
Coding like a girl - DjangoCon
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better Together
 
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
 
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
 
LAMP: Desenvolvendo além do trivial
LAMP: Desenvolvendo além do trivialLAMP: Desenvolvendo além do trivial
LAMP: Desenvolvendo além do trivial
 
MySQL 5.7 - 
Tirando o Máximo Proveito
MySQL 5.7 - 
Tirando o Máximo ProveitoMySQL 5.7 - 
Tirando o Máximo Proveito
MySQL 5.7 - 
Tirando o Máximo Proveito
 
Laravel 5 and SOLID
Laravel 5 and SOLIDLaravel 5 and SOLID
Laravel 5 and SOLID
 
Strip your TEXT fields
Strip your TEXT fieldsStrip your TEXT fields
Strip your TEXT fields
 
The MySQL Server Ecosystem in 2016
The MySQL Server Ecosystem in 2016The MySQL Server Ecosystem in 2016
The MySQL Server Ecosystem in 2016
 

Similar to Sharding using MySQL and PHP

Oracle Optimized Datacenter - Storage
Oracle Optimized Datacenter - StorageOracle Optimized Datacenter - Storage
Oracle Optimized Datacenter - StorageWalter Moriconi
 
Cloud Consolidation with Oracle (RAC) - How much is too much?
Cloud Consolidation with Oracle (RAC) - How much is too much?Cloud Consolidation with Oracle (RAC) - How much is too much?
Cloud Consolidation with Oracle (RAC) - How much is too much?Markus Michalewicz
 
Cómo construimos Oracle TimesTen
Cómo construimos Oracle TimesTenCómo construimos Oracle TimesTen
Cómo construimos Oracle TimesTenSoftware Guru
 
OSI_MySQL_Performance Schema
OSI_MySQL_Performance SchemaOSI_MySQL_Performance Schema
OSI_MySQL_Performance SchemaMayank Prasad
 
Integrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseIntegrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseDataWorks Summit
 
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseHadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseCloudera, Inc.
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Jonathan Seidman
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Jonathan Seidman
 
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...DataWorks Summit
 
Turning Relational Database Tables into Hadoop Datasources by Kuassi Mensah
Turning Relational Database Tables into Hadoop Datasources by Kuassi MensahTurning Relational Database Tables into Hadoop Datasources by Kuassi Mensah
Turning Relational Database Tables into Hadoop Datasources by Kuassi MensahData Con LA
 
Ebs troubleshooting con9019_pdf_9019_0001
Ebs troubleshooting con9019_pdf_9019_0001Ebs troubleshooting con9019_pdf_9019_0001
Ebs troubleshooting con9019_pdf_9019_0001jucaab
 
Lesson05 从多表中查询数据
Lesson05 从多表中查询数据Lesson05 从多表中查询数据
Lesson05 从多表中查询数据renguzi
 
Ria2010 workshop dev mobile
Ria2010 workshop dev mobileRia2010 workshop dev mobile
Ria2010 workshop dev mobileMichael Chaize
 
Intelligent mro for intelligent aircraft
Intelligent mro for intelligent aircraftIntelligent mro for intelligent aircraft
Intelligent mro for intelligent aircraftMichael Parsons
 
SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)Gert Drapers
 
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql ClusterSanto Leto - MySQL Connect 2012 - Getting Started with Mysql Cluster
Santo Leto - MySQL Connect 2012 - Getting Started with Mysql ClusterSanto Leto
 
Replication Tips & Trick for SMUG
Replication Tips & Trick for SMUGReplication Tips & Trick for SMUG
Replication Tips & Trick for SMUGMats Kindahl
 
Ebs performance tune2_con9030_pdf_9030_0002
Ebs performance tune2_con9030_pdf_9030_0002Ebs performance tune2_con9030_pdf_9030_0002
Ebs performance tune2_con9030_pdf_9030_0002jucaab
 

Similar to Sharding using MySQL and PHP (20)

Oracle Optimized Datacenter - Storage
Oracle Optimized Datacenter - StorageOracle Optimized Datacenter - Storage
Oracle Optimized Datacenter - Storage
 
Cloud Consolidation with Oracle (RAC) - How much is too much?
Cloud Consolidation with Oracle (RAC) - How much is too much?Cloud Consolidation with Oracle (RAC) - How much is too much?
Cloud Consolidation with Oracle (RAC) - How much is too much?
 
Cómo construimos Oracle TimesTen
Cómo construimos Oracle TimesTenCómo construimos Oracle TimesTen
Cómo construimos Oracle TimesTen
 
OSI_MySQL_Performance Schema
OSI_MySQL_Performance SchemaOSI_MySQL_Performance Schema
OSI_MySQL_Performance Schema
 
Integrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseIntegrating Hadoop Into the Enterprise
Integrating Hadoop Into the Enterprise
 
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseHadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
 


Sharding using MySQL and PHP

  • 1. Sharding using PHP. Mats Kindahl (Senior Principal Software Developer). Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
  • 2. About the Presentation: After this presentation you should know what sharding is and the basic caveats surrounding sharding. You should also have an idea of what is needed to develop a sharding solution.
  • 3. Program Agenda
      ● Why do we shard
      ● Introduction to sharding
      ● High-level sharding architecture
      ● Elements of a sharding solution
      ● Sharding planning
  • 4. What is sharding? (Figure labels: Splintering, Horizontal Partitioning)
      ● Slice your database into independent data “shards”
      ● Queries execute only on one shard
      ● Shards can be stored on different servers
  • 5. Sharding for locality: “Big Data” close to user
  • 6. Sharding for performance
      ● Reduced working set
      ● Database vs. cache
      ● Parallel processing
  • 7. Sharding Limitations
      ● Auto-increment
        – Composite key
        – Distributed key generation
        – UUID?
      ● Cross-shard joins
        – Very expensive: avoid them
        – Federated tables?
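One possible reading of the “composite key” option is to combine a shard identifier with a per-shard auto-increment value, so that identifiers stay unique across shards without any coordination between servers. The sketch below works under that assumption only; the function names and the 12-bit shard field are hypothetical and do not come from the slides.

```php
<?php
// Assumed layout: the low 12 bits hold the shard ID (up to 4096 shards),
// the remaining bits hold the per-shard auto-increment value.
const SHARD_BITS = 12;

// Pack a shard ID and a per-shard auto-increment value into one integer.
function make_global_id(int $shard_id, int $local_id): int
{
    if ($shard_id < 0 || $shard_id >= (1 << SHARD_BITS) || $local_id < 0) {
        throw new InvalidArgumentException('shard ID or local ID out of range');
    }
    return ($local_id << SHARD_BITS) | $shard_id;
}

// Recover [shard_id, local_id] from a global ID.
function split_global_id(int $global_id): array
{
    return [$global_id & ((1 << SHARD_BITS) - 1), $global_id >> SHARD_BITS];
}
```

A side effect of this layout is that the shard can be recovered from the ID itself, which may or may not be desirable if shards are later split or merged.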
  • 8. Developing a Sharding Solution
  • 9. High-level Architecture
      ● Broker
        – Distributes queries
      ● Sharding Database
        – Information about the shards
        – If it goes down, all goes down
        – Needs to be highly available (HA)
  • 10. Running Example: Employees sample database
      Table          Rows
      salaries       284 404 700
      titles          44 330 800
      employees       30 002 400
      dept_emp        33 160 300
      dept_manager         2 400
      departments            900
  • 11. Areas to cover in sharding: Data, Meta-Data, Query, Operations
  • 12. Data
      Partition Data      Mapping Keys     Shard Allocation
      Key Columns         Range Mapping    Single Shard
      Dependent Columns   Hash Mapping     Multiple Shards
      Tables to Shard     List Mapping
  • 13. Partitioning the data
      Table          Rows
      salaries       284 404 700
      titles          44 330 800
      employees       30 002 400
      dept_emp        33 160 300
      dept_manager         2 400
      departments            900
  • 14. Partitioning the data: sharding column(s)
      emp_no   birth_date   first_name   last_name   gender   hire_date
      4711     1989-06-13   John         Smith       M        2009-12-24
      19275    1954-11-12   Sally        Smith       F        1975-01-01
      27593    1477-05-19   Mats         Kindahl     M        2002-02-27
      587003   1830-08-28   Charles      Bell        M        2003-11-31
      ● Sharding columns dictated by queries
        – Queries should give same result before and after sharding
      ● One or more columns
        – Does not have to be primary key, but easier if it is
      ● Sharding key is needed for re-sharding
  • 15. Partitioning the data: sharding column(s)
      ● Choice of sharding columns
        – Distribution
        – Locality
      ● Avoid non-unique keys
        – Difficult to get good distribution
        – Avoid: Country
        – Prefer: Employee ID
      (Figure labels: SE, 9 million; US, 200 million)
  • 16. Partitioning the data: dependent columns (figure: foreign keys between the tables, marked “?”)
      Table          Rows
      salaries       284 404 700
      titles          44 330 800
      employees       30 002 400
      dept_emp        33 160 300
      dept_manager         2 400
      departments            900
  • 17. Partitioning the data: dependent columns
      SELECT first_name, last_name, salary
      FROM salaries JOIN employees USING (emp_no)
      WHERE emp_no = 21012
        AND CURRENT_DATE BETWEEN from_date AND to_date;
      ● Referential Integrity Constraint
        – Example query joining salaries and employees
        – Same key, same shard
      ● JOIN within a shard
  • 18. Partitioning the data: dependent columns
      ● Referential Integrity
        – Foreign Keys
      ● Dependent rows
        – Same shard
        – Join on equality
      ● Sharding Columns
        – Follow foreign keys
      Handy query to find all dependent columns:
      mysql> SELECT table_schema, table_name, column_name
          -> FROM information_schema.key_column_usage
          -> JOIN information_schema.table_constraints
          -> USING (table_schema, table_name, constraint_name)
          -> WHERE constraint_type = 'FOREIGN KEY'
          -> AND referenced_table_schema = 'employees'
          -> AND referenced_table_name = 'employees'
          -> AND referenced_column_name = 'emp_no';
      +--------------+--------------+-------------+
      | table_schema | table_name   | column_name |
      +--------------+--------------+-------------+
      | employees    | dept_emp     | emp_no      |
      | employees    | dept_manager | emp_no      |
      | employees    | salaries     | emp_no      |
      | employees    | titles       | emp_no      |
      +--------------+--------------+-------------+
      4 rows in set (0.56 sec)
  • 19. Partitioning the data: unsharded tables (figure: departments marked “?”)
      Table          Rows
      salaries       284 404 700
      titles          44 330 800
      employees       30 002 400
      dept_emp        33 160 300
      dept_manager         2 400
      departments            900
  • 20. Partitioning the data: unsharded tables
      SELECT first_name, last_name, GROUP_CONCAT(dept_name)
      FROM employees JOIN dept_emp USING (emp_no)
           JOIN departments USING (dept_no)
      WHERE emp_no = 21012 GROUP BY emp_no;
      ● Referential Integrity Constraint
        – Join with sharded tables
        – Tables dept_emp (and dept_manager) reference two tables
      ● Shard table departments?
        – Not necessary: small table
        – Difficult to get right: keeping shards of two tables in same location
  • 21. Partitioning the data: unsharded tables (same query as on the previous slide)
      ● Solution: do not shard departments
        – Keep table on all shards
        – Joins will only need to address one shard
      ● You need to consider
        … how to update unsharded table
  • 22. Data
      Partition Data      Mapping Keys     Shard Allocation
      Key Columns         Range Mapping    Single Shard
      Dependent Columns   Hash Mapping     Multiple Shards
      Tables to Shard     List Mapping
  • 23. Mapping Keys to Shards
      ● Given
        – Sharding key value
        – Optional other information (tables accessed, RO or RW, etc.)
      ● Provide the following
        – Shard location (host, port)
        – Shard identifier (if you have multiple shards for each server)
  • 24. Mapping Keys to Shards
      ● Range Mapping: range of values for each shard
        – Type-dependent
      ● Hash Mapping: hash of key to find shard
        – Type-independent
        – Complicated?
      ● List Mapping: list of keys for each shard
        – Does not offer good distribution
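The three mapping schemes can be compared side by side in a few lines of PHP. This is a minimal sketch, not the presenter's implementation: the function names and the array layouts are assumptions made for illustration, and the hash variant uses only the first 7 hex digits of SHA-1 so that the value fits a PHP integer on any platform.

```php
<?php
// Range mapping: $lower_bounds maps shard ID => lowest key stored on that shard.
function range_shard(int $key, array $lower_bounds): int
{
    arsort($lower_bounds);                       // highest lower bound first
    foreach ($lower_bounds as $shard_id => $lower) {
        if ($key >= $lower) {
            return $shard_id;
        }
    }
    throw new OutOfRangeException("no range covers key $key");
}

// Hash mapping: works for any key that can be turned into a string.
function hash_shard(string $key, int $shard_count): int
{
    return hexdec(substr(sha1($key), 0, 7)) % $shard_count;
}

// List mapping: $lists maps shard ID => explicit list of key values.
function list_shard(string $key, array $lists): int
{
    foreach ($lists as $shard_id => $keys) {
        if (in_array($key, $keys, true)) {
            return $shard_id;
        }
    }
    throw new OutOfRangeException("key $key is not listed on any shard");
}
```

Range mapping needs an ordering on the key type, which is why the slide calls it type-dependent; the hash variant only needs a string representation of the key.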
  • 25. Data
      Partition Data      Mapping Keys     Shard Allocation
      Key Columns         Range Mapping    Single Shard
      Dependent Columns   Hash Mapping     Multiple Shards
      Tables to Shard     List Mapping
  • 26. Shard Allocation: Single Shard per Server
      ● Idea: there is only one shard on each server
      ● Advantage: Cross-database queries do not require rewrite
      ● Disadvantage: Expensive to balance server load
        … moving hot data from server requires re-sharding
      SELECT first_name, last_name
      FROM employees.employees JOIN expenses.reciepts USING (emp_no)
      WHERE currency = 'USD'
  • 27. Shard Allocation: Multiple Shards per Server
      ● Idea: Keep several “virtual shards” on each server
      ● Advantages
        – Easier to balance load of servers
          … move hot virtual shards to other server
        – Improves performance
        – Increases availability
  • 28. Shard Allocation: Multiple Shards per Server
      ● Disadvantage: cross-database queries require rewrite
        – Error-prone
        – Expensive?
      ● Queries that go to one database are not a problem
      SELECT first_name, last_name
      FROM employees.employees JOIN expenses.reciepts USING (emp_no)
      WHERE currency = 'USD'
  • 29. Shard Allocation: Multiple Shards per Server
      ● Idea: Add suffix to database name (optionally table name)
          employees_N.employees
          employees_N.employees_N
      ● Idea: Keep substitution pattern in query string
      SELECT first_name, last_name
      FROM {employees.employees} JOIN {expenses.reciepts} USING (emp_no)
      WHERE currency = 'USD'
  • 30. Shard Allocation: Multiple Shards per Server
      class my_mysqli extends mysqli {
        public $shard_id;   // suffix appended to the database name

        public function query($query, $resultmode = MYSQLI_STORE_RESULT)
        {
          // Rewrite each {database.table} pattern to database_<shard_id>.table
          $real_query = preg_replace('/\{(\w+)\.(\w+)\}/',
                                     "$1_{$this->shard_id}.$2",
                                     $query);
          return parent::query($real_query, $resultmode);
        }
      }
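The substitution step can also be tried on its own, without a database connection. The sketch below isolates it in a free function; `rewrite_for_shard` is a hypothetical name, and `preg_replace_callback` is used here instead of a replacement string simply to sidestep quoting questions, not because the slides do so.

```php
<?php
// Replace every {database.table} placeholder with database_<shard_id>.table.
function rewrite_for_shard(string $query, int $shard_id): string
{
    return preg_replace_callback(
        '/\{(\w+)\.(\w+)\}/',
        fn (array $m): string => "{$m[1]}_{$shard_id}.{$m[2]}",
        $query
    );
}

echo rewrite_for_shard(
    'SELECT first_name, last_name FROM {employees.employees} ' .
    'JOIN {expenses.reciepts} USING (emp_no)',
    3
), "\n";
```

Queries without placeholders pass through unchanged, so the same function can be applied to every statement.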
  • 31. Areas that we need to cover in sharding: Data, Meta-Data, Query, Operations
  • 32. Meta Data
      Mapping Methods     Shard Information    Mapping Schemes
      Static Sharding     Shard ID             Range Mapping
      Dynamic Sharding    Shard Host           Hash Mapping
                          Shard Specifics*     List Mapping
      * If you use multiple shards per server
  • 33. Mapping Methods: Static Sharding
      ● Idea: Compute shard statically
      ● Advantages
        – Simple
        – No extra lookups
        – No single point of failure
      ● Disadvantage
        – Lack of flexibility
  • 34. Mapping Methods: Static Sharding, in code
      ● Dictionary class
      ● Input: sharding key
      ● Output: connection
      class Dictionary {
        private $shardinfo;   // list of shard descriptors, set up by the constructor
        private $emp_no;

        public function __construct() { ... }

        public function set_key($emp_no) { $this->emp_no = $emp_no; }

        public function get_connection() {
          // Static mapping: shard index is the key modulo the number of shards
          $i = $this->shardinfo[$this->emp_no % count($this->shardinfo)];
          return new mysqli("p:{$i->host}", $i->user, $i->passwd, $i->db, $i->port);
        }
      }
  • 35. Mapping Methods: Static Sharding, in code
      $HIRED = <<<END_OF_QUERY
      SELECT first_name, last_name, hire_date, salary
      FROM employees AS e, salaries AS s
      WHERE s.emp_no = e.emp_no AND e.emp_no = ?
        AND CURRENT_DATE BETWEEN s.from_date AND s.to_date
      END_OF_QUERY;

      $DICTIONARY = new Dictionary();
      $DICTIONARY->set_key($emp_no);
      $link = $DICTIONARY->get_connection();
      if ($stmt = $link->prepare($HIRED)) {
        $stmt->bind_param('i', $emp_no);
        $stmt->execute();
        $stmt->bind_result($first, $last, $hire, $salary);
        while ($stmt->fetch())
          printf("%s %s was hired on %s and has a salary of %s\n",
                 $first, $last, $hire, $salary);
      }
  • 36. Mapping Methods: Dynamic Sharding
      ● Idea: use a sharding database to keep track of shard locations
      ● Advantages:
        – Easy to migrate shards
        – Easy to re-shard
      ● Disadvantages:
        – Complex
      ● Performance?
  • 37. Dynamic sharding, in code
      $FETCH_SHARD = <<<END_OF_QUERY
      shard selection query
      END_OF_QUERY;

      class Dictionary {
        private $dict;     // connection to the sharding database
        private $emp_no;

        public function __construct() {
          $this->dict = new mysqli('shardinfo.example.com', ...);
        }

        public function set_key($emp_no) { $this->emp_no = $emp_no; }

        public function get_connection() {
          global $FETCH_SHARD;   // the query string is defined at file scope
          $stmt = $this->dict->prepare($FETCH_SHARD);
          $stmt->bind_param('i', $this->emp_no);
          $stmt->execute();
          $stmt->bind_result($no, $host, $user, $passwd, $db, $port);
          $stmt->fetch();
          return new mysqli("p:{$host}", $user, $passwd, $db, $port);
        }
      }
  • 38. Meta Data
      Mapping Methods     Shard Information    Mapping Schemes
      Static Sharding     Shard ID             Range Mapping
      Dynamic Sharding    Shard Host           Hash Mapping
                          Shard Specifics*     List Mapping
      * If you use multiple shards per server
  • 39. Mapping Schemes: Range Mapping
      ● Most basic scheme
      ● One row for each range
      ● Just store lower bound
      Shard ID   Lower
      0          0
      1          20000
      2          50000
      SELECT shard_id, hostname, port
      FROM shard_ranges JOIN shard_locations USING (shard_id)
      WHERE key_id = 1 AND 2345 >= shard_ranges.lower_bound
      ORDER BY shard_ranges.lower_bound DESC LIMIT 1;
  • 40. Mapping Schemes: Regular Hashing
      ● Computing a hash from the key: ShardID = SHA1(key) mod N
      ● Adding (or removing) a shard
        … can require moving rows between many shards
        … often a lot of rows
  • 41. Mapping Schemes: Regular Hashing (figure: keys emp_no=20101, emp_no=43210, emp_no=23456 and emp_no=36912 assigned to shards 0 to 4 by HASH(key) mod N, comparing N and N+1 shards)
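How many rows move under regular hashing can be estimated with a short experiment. With a uniformly distributed hash, a key keeps its shard only when hash mod N equals hash mod (N+1), which happens for about 1/(N+1) of the keys, so going from 4 to 5 shards is expected to move roughly 80% of them. The sketch below measures this; the function names are hypothetical, and only the first 7 hex digits of SHA-1 are used so the value fits a PHP integer.

```php
<?php
// ShardID = SHA1(key) mod N, computed from a 28-bit prefix of the digest.
function regular_shard(int $emp_no, int $n): int
{
    return hexdec(substr(sha1((string) $emp_no), 0, 7)) % $n;
}

// Fraction of keys whose shard changes when growing from $n to $n + 1 shards.
function fraction_moved(array $keys, int $n): float
{
    $moved = 0;
    foreach ($keys as $key) {
        if (regular_shard($key, $n) !== regular_shard($key, $n + 1)) {
            $moved++;
        }
    }
    return $moved / count($keys);
}

printf("%.0f%% of keys change shard when going from 4 to 5 shards\n",
       100 * fraction_moved(range(1, 10000), 4));
```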
  • 42. Mapping Schemes: Consistent Hashing
      ● Computing a hash from the key: SHA1(key)
      ● Adding (or removing) a shard
        … only requires moving rows from one shard to the new shard
      Shard ID   Hash
      6          08b1286ad1bebe6...
      2          1c2d4132144211a...
      4          9893238ed75cfc9...
      1          989bb9d2bc381f4...
      5          cab8c76b85c4e24...
      3          eccf30f69fe850f...
  • 43. Mapping Schemes: Consistent Hashing (figure: hash ring with shard1 to shard5 and the keys emp_no=20101, emp_no=36912, emp_no=43210 and emp_no=23456 placed on it)
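A hash ring of this kind can be sketched in a few lines. The version below is an illustration under assumptions, not the presenter's code: the class name `HashRing` is hypothetical, each shard gets exactly one ring position derived from SHA-1 of its name, and a key belongs to the first shard whose position is at or after the key's hash, wrapping around at the end.

```php
<?php
final class HashRing
{
    /** @var array<string,string> ring position (SHA-1 hex digest) => shard name */
    private array $ring = [];

    public function addShard(string $name): void
    {
        $this->ring[sha1($name)] = $name;
        ksort($this->ring, SORT_STRING);     // keep positions in ring order
    }

    public function lookup(string $key): string
    {
        if ($this->ring === []) {
            throw new RuntimeException('no shards on the ring');
        }
        $hash = sha1($key);
        foreach ($this->ring as $position => $name) {
            // Equal-length lowercase hex digests compare correctly as strings.
            if (strcmp($hash, (string) $position) <= 0) {
                return $name;
            }
        }
        // Past the last position: wrap around to the first shard on the ring.
        return $this->ring[array_key_first($this->ring)];
    }
}
```

With this structure, adding a shard only takes over the keys between its predecessor's position and its own, so every other key keeps its current shard.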
  • 44. Areas that we need to cover in sharding: Data, Meta-Data, Query, Operations
  • 45. Query Handling
      Query Dispatch          Connector Caches    Sharding Key
      Mechanism               Time (TTL)          Parsing
      Single/Multi Cast       On Error            Application provided
      Handling Reads          Explicit
      Handling Updates
      Transaction Handling
  • 46. Query Dispatch: Mechanism
      ● Proxy
        – Sharding key extracted from query
        – Requires extra hop
      ● Application level
        – Application provides sharding key
        – No extra hop
  • 47. Query Dispatch: Query Type
      ● Read Query
        – How do you ensure that it is executing on the right shard?
        – How do you ensure that it is not cross-shard?
      ● Update Query
        – Updating an unsharded table: think about consistency
  • 48. Query Dispatch: Handling Transactions
      ● All statements of a transaction should go to the same session
        – Sharding key on start of transaction?
        – Is it a read-only or read-write transaction?
      ● Statements for different transactions can go to different sessions
        – How to detect transaction boundaries
      ● Maintaining the session state
  • 49. Query Dispatch: Handling Transactions (statements with the slide's callouts)
      BEGIN
      SELECT salary INTO @s FROM salaries WHERE emp_no = 20101;   <- Sharding key? Ah, there it is! Hmm... looks like a read transaction
      SET @s = 1.1 * @s;                                          <- Session state?
      INSERT INTO salaries VALUES (20101, @s);                    <- Oops.. it was a write transaction!
      COMMIT                                                      <- Transaction done! Clear session state?
      BEGIN                                                       <- New transaction! Different connection?
      INSERT INTO ...
      COMMIT
  • 50. Query Handling
      Query Dispatch          Connector Caches    Sharding Key
      Mechanism               Time (TTL)          Parsing
      Single/Multi Cast       On Error            Application provided
      Handling Reads          Explicit
      Handling Updates
      Transaction Handling
  • 51. Extracting Sharding Key
      ● Parsing the query
        – Locating the key
        – Handling Transactions
      ● Application-provided sharding key
        – Annotating queries
        – Separate function in connector
  • 52. Extracting Sharding Key: Parsing Query
      ● Problem: Locating the key
          INSERT INTO titles(emp_no, title, from_date)
          SELECT emp_no, '', CURRENT_DATE
          FROM titles JOIN employees USING (emp_no)
          WHERE first_name = 'Keith'
      ● No generic parser
        – Application specific parser
        – Constrain application developer
      ● Transactions
        – Key needed for first statement
          BEGIN
          SELECT …
          INSERT …
          COMMIT;
  • 53. Extracting Sharding Key: Application Provided
      ● Idea: Provide key explicitly
      ● Annotate the statement
          /* emp_no=20101 */ BEGIN;
          SELECT …
          INSERT …
          COMMIT;
      ● Extend connection manager
        – Demonstrated previously
          …
          $DICT->set_key($key);
          $link = $DICT->get_connection();
          …
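Reading an annotation of the form `/* emp_no=20101 */` is far simpler than parsing SQL. The sketch below shows one way a proxy or connector might do it; the function name `extract_sharding_key` and the restriction to a leading comment with an integer value are assumptions made for this illustration.

```php
<?php
// Return the annotated sharding column and value, or null if the statement
// does not start with an annotation such as /* emp_no=20101 */.
function extract_sharding_key(string $sql): ?array
{
    if (preg_match('~^\s*/\*\s*(\w+)\s*=\s*(\d+)\s*\*/~', $sql, $m) !== 1) {
        return null;
    }
    return ['column' => $m[1], 'value' => (int) $m[2]];
}

var_dump(extract_sharding_key('/* emp_no=20101 */ BEGIN;'));
```

Because the annotation is attached to the first statement, the key is available at the start of the transaction, which is the case the parsing approach struggles with.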
  • 54. Areas that we need to cover in sharding: Data, Meta-Data, Query, Operations
  • 55. Operations: Monitoring the System
      ● Monitor load of each node
        … to see if any node gets an unfair number of queries
      ● Monitor load of each shard (multiple shards per node)
        … to see if a shard gets an unfair number of queries
  • 56. Operations: Re-balancing the System
      ● If an instance is hot:
        – Move Shard: Move one shard to another instance
      ● If a shard is hot:
        – Split Shard: Split the shard into multiple shards
        – Move Shard: Move one of the shards to another instance
      ● If a shard is cold:
        – Merge Shard: Merge a shard with other shards (avoid it: very difficult to do on-line)
  • 57. Operations: Moving a Shard
      ● Offline (trivial)
        – Bring source and target nodes down
        – Copy shard from source to target
        – Update dictionary
      ● Online (tricky)
        – We go through it on the following slides
  • 58. Operations: Online Move of Shard (figure: Application, Src, Dst, @Pos)
      1. Backup shard
        – Might be multiple databases
        – Note down binary log position (“Backup position”)
        – Online backup: mysqldump or MySQL Enterprise Backup
      2. Restore backup on destination
  • 59. Operations: Online Move of Shard
      3. Start replication
        – Source to target
        – Start replication from backup position
        – Only replicate shard? replicate-wild-do-table=db_1.%
  • 60. Operations: Online Move of Shard
      4. Wait until destination is close enough
      5. Write lock on source: LOCK TABLES
      6. Note binary log position (“Catch-up Position”)
  • 61. Operations: Online Move of Shard
      7. Wait for destination to reach catch-up position
        – START SLAVE UNTIL
        – MASTER_POS_WAIT
  • 62. Operations: Online Move of Shard
      8. Update sharding database
        … will re-direct queries
      9. Stop replication: RESET SLAVE
      10. Drop old shard
        … unless you just wanted a copy
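The destination-side part of steps 3, 7 and 9 can be written down as the SQL statements a script would send. The sketch below only builds the statement strings; the function names and example values are hypothetical, replication credentials (MASTER_USER, MASTER_PASSWORD) are left out, the inputs are assumed to be trusted, and the statement names are those of the MySQL versions current when the slides were written.

```php
<?php
// Step 3: on the destination, start replicating from the backup position.
function start_replication_sql(string $src_host, int $src_port,
                               string $log_file, int $log_pos): array
{
    return [
        sprintf("CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%d, " .
                "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d",
                $src_host, $src_port, $log_file, $log_pos),
        'START SLAVE',
    ];
}

// Steps 7 and 9: on the destination, wait for the catch-up position noted on
// the locked source, then stop and reset replication. Step 8 (updating the
// sharding database) happens between the wait and the stop.
function finish_move_sql(string $log_file, int $log_pos): array
{
    return [
        sprintf("SELECT MASTER_POS_WAIT('%s', %d)", $log_file, $log_pos),
        'STOP SLAVE',
        'RESET SLAVE',
    ];
}
```

Keeping the two phases in separate functions mirrors the procedure: the catch-up position is not known until the source has been locked in step 5.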
  • 63. Operations: Splitting a Shard
      ● Application dependent
        – Change sharding key?
        – Change sharding scheme?
      ● Can be expensive
      ● You will have to do it
        … eventually
  • 64. Operations: Splitting a Shard (figure: one.example.com and two.example.com, with the steps numbered)
      1. Copy shard to new location
        – Use on-line move described on previous slides
      2. Update sharding database
        – Will re-direct queries
      3. Remove rows from both shards
        – Remove rows that do not belong to the shard
  • 65. Great! Let's Shard! Wait a minute...
  • 66. When to shard?
      ● Inherently more complex
        – Requires careful planning
        – Application design?
      ● Alternatives?
        – Functional partitioning?
        – Archiving old data?
  • 67. Preparations for sharding
      ● Monitor the system
        – Types of queries: what are the join queries?
        – Access patterns: what tables are accessed?
      ● Find natural partition keys
        – Robust and easy to implement
        – Watch out for cross-shard joins
  • 68. Summary
      ● What are your goals?
      ● Do your homework
      ● Don't be too eager
      ● Plan
      ● Develop sharding solution
      ● Revise the plans
  • 69. Thanks for attending!
      ● Questions? Comments?
      ● Download MySQL! http://dev.mysql.com
      ● Read our book!
        – Covers replication, sharding, scale-out, and much much more