Consolidated Sharded Indexes in Real-Time


    1. Consolidated Sharded Indexes in Real-Time. Jeff Mace. ©Continuent 2012.
    2. About Continuent
        • The leading provider of clustering and replication for open source DBMS
        • Tungsten Clustering - Commercial-grade HA, performance scaling and data management for MySQL
        • Tungsten Replication - Flexible, high-performance replication
    3. Working with Sharded Data
        • Schema sharding is a proven strategy for many SaaS application providers
        • Easily add new servers to handle growth
        • Costly operations by one customer do not affect all other customers
        • But it’s difficult to work with data spread across all schemas
    4. An Example
        • A franchise business with many locations
        • Needs a consolidated table of all accounts and their balances for BI
        • Some lag is OK, but not more than a few seconds
        • This example can be applied to many scenarios using the same techniques
    5. An Example
        NYC Branch              SFO Branch
        id  balance             id  balance
        1   $234.78             1   $820.20
        2   $892.24             2   $240.27
        3   $1,023.76           3   $527.63
        4   $521.08
        5   $982.62
    6. An Example
        id  balance     schema
        1   $234.78     nyc
        2   $892.24     nyc
        1   $820.20     sfo
        2   $240.27     sfo
        3   $1,023.76   nyc
        4   $521.08     nyc
        5   $982.62     nyc
        3   $527.63     sfo
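
The consolidated table in the example can be sketched in a few lines of Python. This is only an illustration of the target shape (one row per account, tagged with its source schema), not how Tungsten Replicator builds it; the `branches` dictionary and the `consolidate` helper are hypothetical names introduced here.

```python
from decimal import Decimal

# Per-branch account tables from the example slides (id -> balance).
branches = {
    "nyc": {1: Decimal("234.78"), 2: Decimal("892.24"), 3: Decimal("1023.76"),
            4: Decimal("521.08"), 5: Decimal("982.62")},
    "sfo": {1: Decimal("820.20"), 2: Decimal("240.27"), 3: Decimal("527.63")},
}


def consolidate(tables):
    """Merge per-schema tables into one list of rows tagged with their schema."""
    rows = []
    for schema, accounts in sorted(tables.items()):
        for account_id, balance in sorted(accounts.items()):
            rows.append({"id": account_id, "balance": balance, "schema": schema})
    return rows


consolidated = consolidate(branches)
for row in consolidated:
    print(row["id"], row["balance"], row["schema"])
```

Because account ids repeat across branches, the pair (schema, id) is what identifies a row in the consolidated table.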
    7. What can we do about it?
        • Make the application do it
        • Run a batch process to dump and load data
        • Replicate into a central schema with Tungsten Replicator
    8. Tungsten Replicator
        • GPL v2
        • Global Transaction IDs
        • Multi-Master Replication
        • Parallel Replication
        • Heterogeneous Replication
        • Supports MySQL 5.0 and up
    9. Tungsten Replicator
        [Diagram: Master-Slave, Heterogeneous, Direct, Fan-In, All-Masters, Star]
    10. Replication Services
        [Diagram: three stages, each with Extract, Filter and Apply steps, linking Master DBMS, Transaction History Log, In-Memory Queue and Slave DBMS]
    11. Transaction History Log
        [Diagram: log file layout]
        • Header Record (version & first seqno)
        • Event Record, Event Record, ..., Event Record
        • Log Rotation Record (name of next file)
        [Diagram: event record layout]
        • Event Header (Java Primitive Types): Seqno, Fragno, Last_frag, Epoch_number, Source_id, Event_id, Shard_id, Tstamp, Data_length
        • Serialized Event (Google Protobufs, up to 2Gb)
        • CRC (Java Primitive Types)
    12. Filters
        • Modify THL events during replication
        • Can be written in Java or JavaScript
        • JavaScript is compiled at runtime
        • Drop all or part of a THL event
        • Modify the contents of a THL event
        • Insert new statements or rows into a THL event
        • Use your imagination
    13. Putting It All Together
        • Filtering allows us to modify ROW replication events before applying them
            • dropstatementdata
            • replicate
            • replicatecolumns
            • buildindextable
        • Apply to the master server with log slave updates
        • Integrate with other servers to ensure
    14. Drop Statement Data Filter Definition
        # Remove statement events and drop the event if the result is empty
        replicator.filter.dropstatementdata=com.continuent.tungsten.replicator.filter.JavaScriptFilter
        replicator.filter.dropstatementdata.script=${replicator.home.dir}/samples/extensions/javascript/dropstatementdata.js
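
The behaviour described in that comment (remove statement entries, and drop the whole event if nothing is left) might look like the following Python sketch. The event model is an assumption made for illustration: a plain dictionary with a `data` list whose items carry a `type` of `"row"` or `"statement"`. It is not the Tungsten filter API and not the contents of dropstatementdata.js.

```python
def drop_statement_data(event):
    """Return a copy of the event without statement entries, or None if empty.

    `event` is a hypothetical dict: {"seqno": int, "data": [{"type": ...}, ...]}.
    """
    kept = [item for item in event["data"] if item["type"] != "statement"]
    if not kept:
        return None  # nothing left to apply, so the whole event is dropped
    return {**event, "data": kept}


mixed = {"seqno": 27, "data": [{"type": "statement", "sql": "CREATE TABLE t (id INT)"},
                               {"type": "row", "table": "account"}]}
only_statements = {"seqno": 28, "data": [{"type": "statement", "sql": "DROP TABLE t"}]}

print(drop_statement_data(mixed))
print(drop_statement_data(only_statements))
```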
    15. Replicate Filter Definition
        # Filter to forward or ignore particular schemas and/or databases. Entries
        # are comma-separated lists of the form schema[.table] where the table is
        # optional. List entries may use * and ? as wild cards. When both
        # filter lists are empty updates on all tables are allowed.
        replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
        replicator.filter.replicate.ignore=
        replicator.filter.replicate.do=*.account
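
The matching rules in that comment can be approximated with Python's `fnmatch` module. This is a sketch under assumptions, not the ReplicateFilter implementation: it assumes the ignore list takes precedence over the do list, and that a non-empty do list excludes everything it does not match. Note also that `fnmatchcase` accepts `[seq]` patterns in addition to `*` and `?`. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated property value into non-empty entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def matches(entry, schema, table):
    """Check one schema[.table] entry; the table part is optional."""
    schema_pattern, _, table_pattern = entry.partition(".")
    if not fnmatchcase(schema, schema_pattern):
        return False
    return table_pattern == "" or fnmatchcase(table, table_pattern)


def should_replicate(schema, table, do="", ignore=""):
    """Assumed semantics: ignore wins, then do; both empty allows everything."""
    do_list, ignore_list = parse_list(do), parse_list(ignore)
    if any(matches(entry, schema, table) for entry in ignore_list):
        return False
    if do_list:
        return any(matches(entry, schema, table) for entry in do_list)
    return True


print(should_replicate("e3", "account", do="*.account"))
print(should_replicate("e3", "history", do="*.account"))
```

With `do=*.account` as in the slide, only the `account` table of each schema passes, which is what the consolidated index needs.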
    16. ReplicateColumns Filter Definition
        # Join rows from one table in many schemas, to a table in a single schema
        replicator.filter.replicatecolumns=com.continuent.tungsten.replicator.filter.ReplicateColumnsFilter
        replicator.filter.replicatecolumns.do=
        replicator.filter.replicatecolumns.ignore=account.filler
    17. ReplicateColumns : Before
        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(4: filler) = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
    18. ReplicateColumns : After
        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
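
The before and after dumps show the `filler` column being removed from the row event. A minimal Python sketch of that effect follows, assuming ignore entries have the form `table.column` as in `account.filler`. The dictionary event model and the `drop_columns` helper are hypothetical and stand in for the real THL event classes.

```python
def drop_columns(row_event, ignore):
    """Return a copy of the row event without the ignored table.column entries."""
    ignored = {tuple(entry.strip().split(".", 1))
               for entry in ignore.split(",") if entry.strip()}
    columns = {name: value for name, value in row_event["columns"].items()
               if (row_event["table"], name) not in ignored}
    return {**row_event, "columns": columns}


before = {
    "action": "INSERT", "schema": "e3", "table": "account",
    "columns": {"account_id": 27, "branch_id": 0, "account_balance": 0,
                "filler": "X" * 10, "time_stamp": "2012-09-28 11:50:36.0"},
}
after = drop_columns(before, "account.filler")
print(list(after["columns"]))
```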
    19. Index Filter Definition
        # Join rows from one table in many schemas, to a table in a single schema
        replicator.filter.buildindextable=com.continuent.tungsten.replicator.filter.BuildIndexTable
        replicator.filter.buildindextable.target_schema_name=bi
    20. Index Filter : Before
        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
    21. Index Filter : After
        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = test
        - TABLE = account
        - ROW# = 0
        - COL(0: account_id) = 27
        - COL(0: branch_id) = 0
        - COL(0: account_balance) = 0
        - COL(0: time_stamp) = 2012-09-28 11:50:36.0
        - COL(0: schema) = e3
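
The transformation shown in the two dumps (point the row at the target schema and append the original schema name as an extra column) can be sketched in Python as below. This is an illustration only, not the BuildIndexTable filter source; the dictionary event model and the `build_index_row` helper are hypothetical, and the target schema is passed in as a parameter, here using the `bi` value from the filter definition.

```python
def build_index_row(row_event, target_schema):
    """Redirect a row event to the target schema, recording where it came from."""
    columns = dict(row_event["columns"])
    columns["schema"] = row_event["schema"]  # keep the source schema as a column
    return {**row_event, "schema": target_schema, "columns": columns}


before = {
    "action": "INSERT", "schema": "e3", "table": "account",
    "columns": {"account_id": 27, "branch_id": 0, "account_balance": 0,
                "time_stamp": "2012-09-28 11:50:36.0"},
}
after = build_index_row(before, "bi")
print(after["schema"], after["columns"]["schema"])
```

This is also why the requirements call for unique schema names and a target table with an additional 'schema' column: the added column is what keeps rows from different shards distinguishable.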
    22. Requirements
        • Standard Tungsten Requirements
            • Java
            • Ruby
        • Row Replication
        • Unique schema names
        • Target schema with an additional ‘schema’ column
    23. Installation - Master
        • Normal
    24. Installation - Slave
        • Some hacks to write back to the master
        • Some filters to make the changes
    25. Demo
    26. Replicating the index table
        • Set up a normal master-slave pair
            • Could be a Tungsten cluster
        • Set up the indexing replicator to apply events to the master or through the connector
    27. Provisioning
        • Not as simple as restoring a backup
        • New loader command to initialize the master THL or slave database
        • Creates a set of row inserts for each table
        • Supports extracting from MySQL
        • Uses FLUSH TABLES WITH READ LOCK
    28. Provisioning Master THL
        $ /opt/continuent/tungsten/tungsten-replicator/bin/loader \
            -extractor com.continuent.tungsten.replicator.loader.MySQLLoader \
            -extractor.uri "mysql://mdb-1.local:3306/" \
            -extractor.user tungsten \
            -extractor.password secret \
            -extractor.includeSchemas e1,e2,e3,e4,e5
    29. Provisioning Slave Database
        $ /opt/indexer/tungsten/tungsten-replicator/bin/loader \
            -extractor com.continuent.tungsten.replicator.loader.MySQLLoader \
            -extractor.uri "mysql://mdb-2.local:3306/" \
            -extractor.user tungsten \
            -extractor.password secret \
            -extractor.includeSchemas e1,e2,e3,e4,e5 \
            -extractor.tungstenServiceSchema tungsten_globalidx
    30. Demo
    31. Next Steps
        • Loading big data
        • Replicate from many masters
        • Support non-row statements
            • Drop schema
            • Drop table
            • Truncate table
        • Expand the provisioning support to extract from Oracle
        • https://docs.continuent.com/wiki/x/uIAz
    32. We’re Hiring
        • Cluster Implementation Engineer
        • Quality Assurance Engineer
        • Technical Writer
    33. Jeff Mace
        jeff.mace@continuent.com
        sales@continuent.com
        560 S. Winchester Blvd. Suite 500, San Jose, CA 95128
        Tel (866) 998-3642
        Fax (408) 668-1009
        http://www.continuent.com
        http://code.google.com/p/tungsten-replicator
