Consolidated Sharded Indexes in Real-Time
Jeff Mace



©Continuent 2012.
About Continuent

              • The leading provider of clustering and
                   replication for open source DBMS

              • Tungsten Clustering - Commercial-grade
                   HA, performance scaling and data
                   management for MySQL

              • Tungsten Replication - Flexible, high-
                   performance replication




Working with Sharded Data

              • Schema sharding is a proven strategy for
                   many SaaS application providers

              • Easily add new servers to handle growth
              • Costly operations by one customer do not
                   affect all other customers

              • But it’s difficult to work with data spread
                   across all schemas




An Example

              • A franchise business with many locations
              • A consolidated table of all accounts
                   and their balances is needed for BI

              • Some lag is OK, but not more than a few
                   seconds

              • This example can be applied to many
                   scenarios using the same techniques




An Example

                   NYC Branch                SFO Branch
              id      balance           id      balance
               1      $234.78            1      $820.20
               2      $892.24            2      $240.27
               3    $1,023.76            3      $527.63
               4      $521.08
               5      $982.62




An Example

              id      balance      schema
               1      $234.78      nyc
               2      $892.24      nyc
               1      $820.20      sfo
               2      $240.27      sfo
               3    $1,023.76      nyc
               4      $521.08      nyc
               5      $982.62      nyc
               3      $527.63      sfo
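
The consolidated table above can be sketched in a few lines of Python. This is only an illustration of the target shape, not how Tungsten builds it: the `branches` dictionary and the `consolidate` function are hypothetical names, and the balances are the sample values from the slides.

```python
from decimal import Decimal

# Hypothetical per-branch account tables, keyed by schema name.
# Each row is (id, balance); values are the sample data from the slides.
branches = {
    "nyc": [(1, Decimal("234.78")), (2, Decimal("892.24")),
            (3, Decimal("1023.76")), (4, Decimal("521.08")),
            (5, Decimal("982.62"))],
    "sfo": [(1, Decimal("820.20")), (2, Decimal("240.27")),
            (3, Decimal("527.63"))],
}


def consolidate(tables):
    """Return (id, balance, schema) rows for every account in every schema."""
    return [(account_id, balance, schema)
            for schema, rows in tables.items()
            for account_id, balance in rows]


consolidated = consolidate(branches)
for row in consolidated:
    print(row)
```

Because account ids repeat across branches, the extra schema column is what keeps each consolidated row unique.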




What can we do about it?

              • Make the application do it
              • Run a batch process to dump and load
                   data

              • Replicate into a central schema with
                   Tungsten Replicator




Tungsten Replicator

              • GPL v2
              • Global Transaction IDs
              • Multi-Master Replication
              • Parallel Replication
              • Heterogeneous Replication
              • Supports MySQL 5.0 and up


Tungsten Replicator
          Topologies (diagram): Master-Slave, Heterogeneous, Direct,
          Fan-In, All-Masters, Star




Replication Services


    Pipeline (diagram):
    Master DBMS
      -> Stage (Extract, Filter, Apply) -> Transaction History Log
      -> Stage (Extract, Filter, Apply) -> In-Memory Queue
      -> Stage (Extract, Filter, Apply) -> Slave DBMS
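
A minimal sketch of the stage idea, assuming each stage simply extracts events from one store, runs them through its filters, and applies the survivors to the next store. The names `run_stage` and `only_account_table`, the dictionary event shape, and the choice to filter in the last stage are all hypothetical; this is not the replicator's Java API.

```python
from collections import deque


def run_stage(extract, filters, apply):
    """Extract events, pass each through the filters, apply the survivors."""
    for event in extract():
        for event_filter in filters:
            event = event_filter(event)
            if event is None:      # a filter may drop the whole event
                break
        else:
            apply(event)           # reached only if no filter dropped it


def only_account_table(event):
    """Hypothetical filter: keep events for the 'account' table only."""
    return event if event["table"] == "account" else None


# Stand-ins for the master log, the THL, the in-memory queue and the slave.
master_log = [
    {"seqno": 0, "schema": "e1", "table": "account"},
    {"seqno": 1, "schema": "e1", "table": "history"},
    {"seqno": 2, "schema": "e3", "table": "account"},
]
thl, queue, slave = [], deque(), []

run_stage(lambda: master_log, [], thl.append)
run_stage(lambda: thl, [], queue.append)
run_stage(lambda: list(queue), [only_account_table], slave.append)

print([e["seqno"] for e in slave])
```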




Transaction History Log
          Log file layout
              Header Record (version & first seqno)
              Event Record
              Event Record
              ...
              Event Record
              Log Rotation Record (name of next file)

          Event Record layout
              Event Header (Java Primitive Types)
                  Seqno
                  Fragno
                  Last_frag
                  Epoch_number
                  Source_id
                  Event_id
                  Shard_id
                  Tstamp
                  Data_length
              Serialized Event (Google Protobufs, up to 2 GB)
              CRC (Java Primitive Types)
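
The event record can be modelled roughly as a header, a payload and a checksum. The sketch below only mirrors the field names listed on the slide; the Python types, the use of CRC-32 from `zlib`, and the helpers `make_record` and `verify` are assumptions for illustration and do not describe the real on-disk THL format.

```python
import zlib
from dataclasses import dataclass


@dataclass
class EventHeader:
    # Field names follow the slide; the types are assumptions.
    seqno: int
    fragno: int
    last_frag: bool
    epoch_number: int
    source_id: str
    event_id: str
    shard_id: str
    tstamp: int
    data_length: int


def make_record(header_fields, payload):
    """Build (header, payload, crc); CRC-32 is an assumed checksum."""
    header = EventHeader(data_length=len(payload), **header_fields)
    return header, payload, zlib.crc32(payload)


def verify(record):
    """Check that the payload length and checksum match the record."""
    header, payload, crc = record
    return header.data_length == len(payload) and zlib.crc32(payload) == crc


record = make_record(
    {"seqno": 27, "fragno": 0, "last_frag": True, "epoch_number": 0,
     "source_id": "mdb-1.local", "event_id": "hypothetical-event-id",
     "shard_id": "e3", "tstamp": 1348833036},
    b"serialized event bytes",
)
print(verify(record))
```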



Filters

              • Modify THL events during replication
              • Can be written in Java or JavaScript
              • JavaScript is compiled at runtime
              • Drop all or part of a THL event
              • Modify the contents of a THL event
              • Insert new statements or rows into a THL
                   event

              • Use your imagination
Putting It All Together

              • Filtering allows us to modify ROW
                   replication events before applying them
                   •   dropstatementdata

                   •   replicate

                   •   replicatecolumns

                   •   buildindextable

              • Apply to the master server with log slave
                   updates

              • Integrate with other servers to ensure
Drop Statement Data Filter
          Definition
  # Remove statement events and drop the event if the result is empty

  replicator.filter.dropstatementdata=com.continuent.tungsten.replicator.filter.JavaScriptFilter
  replicator.filter.dropstatementdata.script=${replicator.home.dir}/samples/extensions/javascript/dropstatementdata.js
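
The behaviour described in the comment above can be sketched as follows. This is not the shipped dropstatementdata.js script: the dictionary event shape and the `drop_statement_data` function are hypothetical, and only the rule (remove statement data, drop the event if nothing is left) comes from the slide.

```python
def drop_statement_data(event):
    """Remove statement items; return None if the event ends up empty."""
    kept = [item for item in event["data"] if item["type"] != "statement"]
    if not kept:
        return None                      # nothing left, drop the event
    return {**event, "data": kept}


# Hypothetical events: one mixed, one containing only a statement.
mixed = {"seqno": 1, "data": [
    {"type": "statement", "sql": "CREATE TABLE account (...)"},
    {"type": "row", "table": "account", "action": "INSERT"},
]}
statement_only = {"seqno": 2, "data": [
    {"type": "statement", "sql": "DROP TABLE scratch"},
]}

print(drop_statement_data(mixed))
print(drop_statement_data(statement_only))
```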




Replicate Filter Definition
  # Filter to forward or ignore particular schemas and/or databases. Entries
  # are comma-separated lists of the form schema[.table] where the table is
  # optional. List entries may use * and ? as wild cards. When both
  # filter lists are empty updates on all tables are allowed.

  replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
  replicator.filter.replicate.ignore=
  replicator.filter.replicate.do=*.account
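
A rough sketch of how the do and ignore lists could be evaluated, using Python's `fnmatch` for the * and ? wild cards. The function names are hypothetical, and the precedence (ignore is checked first, then do) is an assumption; the slide only states the entry format and that empty lists allow everything.

```python
from fnmatch import fnmatchcase


def _matches(entry, schema, table):
    """Match one schema[.table] entry; an omitted table matches any table."""
    schema_pattern, _, table_pattern = entry.partition(".")
    if not fnmatchcase(schema, schema_pattern):
        return False
    return table_pattern == "" or fnmatchcase(table, table_pattern)


def allowed(schema, table, do=(), ignore=()):
    """Decide whether an update on schema.table should be forwarded."""
    if any(_matches(entry, schema, table) for entry in ignore):
        return False
    if do:
        return any(_matches(entry, schema, table) for entry in do)
    return True                          # both lists empty: allow everything


print(allowed("e3", "account", do=["*.account"]))
print(allowed("e3", "history", do=["*.account"]))
```

With `do=*.account` as configured above, only the account table of each schema passes this check.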




ReplicateColumns Filter Definition
  # Forward or ignore particular columns; here account.filler is dropped

  replicator.filter.replicatecolumns=com.continuent.tungsten.replicator.filter.ReplicateColumnsFilter
  replicator.filter.replicatecolumns.do=
  replicator.filter.replicatecolumns.ignore=account.filler




ReplicateColumns : Before
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(4: filler) =
  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  XX
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0



ReplicateColumns : After
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0
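
The before and after dumps can be reproduced with a small sketch. The dictionary row shape and the `drop_columns` function are hypothetical; only the `table.column` entry form and the sample values come from the slides.

```python
def drop_columns(change, ignore):
    """Return a copy of the row change without the ignored table.column entries."""
    ignored = {tuple(entry.split(".", 1)) for entry in ignore}
    kept = [col for col in change["columns"]
            if (change["table"], col["name"]) not in ignored]
    return {**change, "columns": kept}


# Hypothetical model of the row shown in the "Before" dump.
before = {
    "action": "INSERT", "schema": "e3", "table": "account",
    "columns": [
        {"index": 1, "name": "account_id", "value": 27},
        {"index": 2, "name": "branch_id", "value": 0},
        {"index": 3, "name": "account_balance", "value": 0},
        {"index": 4, "name": "filler", "value": "X" * 100},
        {"index": 5, "name": "time_stamp", "value": "2012-09-28 11:50:36.0"},
    ],
}

after = drop_columns(before, ["account.filler"])
print([col["name"] for col in after["columns"]])
```

The remaining columns keep their original indexes (1, 2, 3, 5), matching the dump above.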




Index Filter Definition
  # Join rows from one table in many schemas, to a table in a single schema

  replicator.filter.buildindextable=com.continuent.tungsten.replicator.filter.BuildIndexTable
  replicator.filter.buildindextable.target_schema_name=bi




Index Filter : Before
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0




Index Filter : After
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = test
   - TABLE = account
   - ROW# = 0
    - COL(0: account_id) = 27
    - COL(0: branch_id) = 0
    - COL(0: account_balance) = 0
    - COL(0: time_stamp) = 2012-09-28 11:50:36.0
    - COL(0: schema) = e3
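
The rewrite shown in the dumps amounts to two steps: point the row at the target schema and append the source schema as an extra column. The sketch below uses a hypothetical `build_index_row` function and dictionary row shape; it is not the BuildIndexTable filter's code. It uses the target schema "bi" from the filter definition, whereas the demo output above shows "test".

```python
def build_index_row(change, target_schema):
    """Retarget a row change and record its source schema as a new column."""
    columns = list(change["columns"])
    columns.append({"name": "schema", "value": change["schema"]})
    return {**change, "schema": target_schema, "columns": columns}


# Hypothetical model of the row shown in the "Before" dump.
before = {
    "action": "INSERT", "schema": "e3", "table": "account",
    "columns": [
        {"name": "account_id", "value": 27},
        {"name": "branch_id", "value": 0},
        {"name": "account_balance", "value": 0},
        {"name": "time_stamp", "value": "2012-09-28 11:50:36.0"},
    ],
}

indexed = build_index_row(before, "bi")
print(indexed["schema"], indexed["columns"][-1])
```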




Requirements

              • Standard Tungsten Requirements
                   •   Java

                   •   Ruby

              • Row Replication
              • Unique schema names
              • Target schema with an additional
                   ‘schema’ column




Installation - Master

              • Normal




Installation - Slave

              • Some hacks to write back to the master
              • Some filters to make the changes




Demo




Replicating the index table

              • Set up a normal master-slave pair
              • Could be a Tungsten cluster
              • Set up the indexing replicator to apply
                   events to the master or through the
                   connector




Provisioning

              • Not as simple as restoring a backup
              • New loader command to initialize the
                   master THL or slave database

              • Creates a set of row inserts for each table
              • Supports extracting from MySQL
              • Uses FLUSH TABLES WITH READ LOCK
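
The "set of row inserts for each table" can be pictured with the sketch below. It is an assumption-heavy illustration, not the loader's implementation: the snapshot dictionary stands in for data read from MySQL under the read lock, and `provision_events`, the event shape and the one-event-per-table grouping are hypothetical.

```python
def provision_events(snapshot, start_seqno=0):
    """Yield one row-insert event per table in a consistent snapshot."""
    seqno = start_seqno
    for (schema, table), rows in sorted(snapshot.items()):
        yield {"seqno": seqno, "action": "INSERT",
               "schema": schema, "table": table, "rows": list(rows)}
        seqno += 1


# Hypothetical snapshot keyed by (schema, table); rows are (id, balance).
snapshot = {
    ("e2", "account"): [(1, "820.20")],
    ("e1", "account"): [(1, "234.78"), (2, "892.24")],
}

events = list(provision_events(snapshot))
for event in events:
    print(event["seqno"], event["schema"], event["table"], len(event["rows"]))
```

Events shaped like these could then pass through the same filters as ordinary replication traffic.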


Provisioning Master THL
  $ /opt/continuent/tungsten/tungsten-replicator/bin/loader \
  -extractor \
  com.continuent.tungsten.replicator.loader.MySQLLoader \
  -extractor.uri "mysql://mdb-1.local:3306/" \
  -extractor.user tungsten \
  -extractor.password secret \
  -extractor.includeSchemas e1,e2,e3,e4,e5




Provisioning Slave Database
  $ /opt/indexer/tungsten/tungsten-replicator/bin/loader \
  -extractor \
  com.continuent.tungsten.replicator.loader.MySQLLoader \
  -extractor.uri "mysql://mdb-2.local:3306/" \
  -extractor.user tungsten \
  -extractor.password secret \
  -extractor.includeSchemas e1,e2,e3,e4,e5 \
  -extractor.tungstenServiceSchema tungsten_globalidx




Demo




Next Steps

              • Loading big data
              • Replicate from many masters
              • Support non-row statements
                   •   Drop schema

                   •   Drop table

                   •   Truncate table

              • Expand the provisioning support to
                   extract from Oracle

              • https://docs.continuent.com/wiki/x/uIAz
We’re Hiring

              • Cluster Implementation Engineer
              • Quality Assurance Engineer
              • Technical Writer




Jeff Mace
       jeff.mace@continuent.com
       sales@continuent.com
       560 S. Winchester Blvd. Suite 500
       San Jose, CA 95128
       Tel (866) 998-3642
       Fax (408) 668-1009


                              http://www.continuent.com
                   http://code.google.com/p/tungsten-replicator

