Consolidated sharded indexes in real time

Transcript

  • 1. Consolidated Sharded Indexes in Real-Time. Jeff Mace. © Continuent 2012
  • 2. About Continuent
      • The leading provider of clustering and replication for open source DBMS
      • Tungsten Clustering - Commercial-grade HA, performance scaling and data management for MySQL
      • Tungsten Replication - Flexible, high-performance replication
  • 3. Working with Sharded Data
      • Schema sharding is a proven strategy for many SaaS application providers
      • Easily add new servers to handle growth
      • Costly operations by one customer do not affect all other customers
      • But it’s difficult to work with data spread across all schemas
  • 4. An Example
      • A franchise business with many locations
      • A consolidated table of all accounts and their balances is needed for BI
      • Some lag is OK, but not more than a few seconds
      • This example can be applied to many scenarios using the same techniques
  • 5. An Example

        NYC Branch            SFO Branch
        id   balance          id   balance
        1    $234.78          1    $820.20
        2    $892.24          2    $240.27
        3    $1,023.76        3    $527.63
        4    $521.08
        5    $982.62
  • 6. An Example

        id   balance      schema
        1    $234.78      nyc
        2    $892.24      nyc
        1    $820.20      sfo
        2    $240.27      sfo
        3    $1,023.76    nyc
        4    $521.08      nyc
        5    $982.62      nyc
        3    $527.63      sfo
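The consolidated table in this example can be sketched in a few lines of Python. This is only an illustration of the target shape, not how Tungsten Replicator builds it: the `branches` dictionary and the `consolidate` function are hypothetical names, and the data is copied from the two branch tables in the example.

```python
from decimal import Decimal

# Per-schema account tables from the example: {schema: {id: balance}}.
branches = {
    "nyc": {1: Decimal("234.78"), 2: Decimal("892.24"), 3: Decimal("1023.76"),
            4: Decimal("521.08"), 5: Decimal("982.62")},
    "sfo": {1: Decimal("820.20"), 2: Decimal("240.27"), 3: Decimal("527.63")},
}


def consolidate(tables):
    """Flatten per-schema tables into rows that carry their source schema."""
    return [
        {"id": account_id, "balance": balance, "schema": schema}
        for schema, accounts in tables.items()
        for account_id, balance in accounts.items()
    ]


rows = consolidate(branches)
```

Because `id` values repeat across branches, a row in the consolidated table is only unique on the pair (`schema`, `id`), which is why the extra `schema` column is needed.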
  • 7. What can we do about it?
      • Make the application do it
      • Run a batch process to dump and load data
      • Replicate into a central schema with Tungsten Replicator
  • 8. Tungsten Replicator
      • GPL v2
      • Global Transaction IDs
      • Multi-Master Replication
      • Parallel Replication
      • Heterogeneous Replication
      • Supports MySQL 5.0 and up
  • 9. Tungsten Replicator (diagram of topologies): Master-Slave, Heterogeneous, Direct, Fan-In, All-Masters, Star
  • 10. Replication Services (diagram): Master DBMS -> Stage (Extract, Filter, Apply) -> Transaction History Log -> Stage (Extract, Filter, Apply) -> In-Memory Queue -> Stage (Extract, Filter, Apply) -> Slave DBMS
  • 11. Transaction History Log (diagram)
      • Log file: Header Record (version & first seqno), Event Record, Event Record, ..., Event Record, Log Rotation Record (name of next file)
      • Event Record: Event Header, Serialized Event (Google Protobufs, Up to 2Gb), CRC (Java Primitive Types)
      • Event Header (Java Primitive Types): Seqno, Fragno, Last_frag, Epoch_number, Source_id, Event_id, Shard_id, Tstamp, Data_length
  • 12. Filters
      • Modify THL events during replication
      • Can be written in Java or JavaScript
      • JavaScript is compiled at runtime
      • Drop all or part of a THL event
      • Modify the contents of a THL event
      • Insert new statements or rows into a THL event
      • Use your imagination
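The idea of a filter that can drop all or part of an event may be easier to see in a small model. The sketch below is in Python rather than Java or JavaScript and does not use the Tungsten filter API: the event layout (a dictionary with a `changes` list) and the names `run_filters` and `drop_statement_data` are assumptions made for illustration only.

```python
def run_filters(event, filters):
    """Pass an event through each filter in order; None means 'dropped'."""
    for apply_filter in filters:
        event = apply_filter(event)
        if event is None:
            return None
    return event


def drop_statement_data(event):
    """Keep only row changes; drop the whole event if nothing is left."""
    rows = [change for change in event["changes"] if change["type"] == "row"]
    if not rows:
        return None
    return {**event, "changes": rows}


# Hypothetical event mixing one statement change and one row change.
mixed_event = {
    "seqno": 27,
    "changes": [
        {"type": "statement", "sql": "CREATE TABLE t (id INT)"},
        {"type": "row", "schema": "e3", "table": "account", "action": "INSERT"},
    ],
}
filtered = run_filters(mixed_event, [drop_statement_data])
```

Here `filtered` keeps only the row change, while an event containing nothing but statements would come back as `None` and never reach the applier.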
  • 13. Putting It All Together
      • Filtering allows us to modify ROW replication events before applying them
          • dropstatementdata
          • replicate
          • replicatecolumns
          • buildindextable
      • Apply to the master server with log slave updates
      • Integrate with other servers to ensure
  • 14. Drop Statement Data Filter Definition

        # Remove statement events and drop the event if the result is empty
        replicator.filter.dropstatementdata=com.continuent.tungsten.replicator.filter.JavaScriptFilter
        replicator.filter.dropstatementdata.script=${replicator.home.dir}/samples/extensions/javascript/dropstatementdata.js
  • 15. Replicate Filter Definition

        # Filter to forward or ignore particular schemas and/or databases. Entries
        # are comma-separated lists of the form schema[.table] where the table is
        # optional. List entries may use * and ? as wild cards. When both
        # filter lists are empty updates on all tables are allowed.
        replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
        replicator.filter.replicate.ignore=
        replicator.filter.replicate.do=*.account
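The matching rules described in the comment above can be approximated with Python's standard `fnmatch` module. This is a sketch of the described behaviour, not the ReplicateFilter implementation: the function names are hypothetical, giving `ignore` precedence over `do` is an assumption, and `fnmatchcase` also accepts `[...]` character sets, which the slide does not mention.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated filter list, skipping empty entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def matches(entry, schema, table):
    """Match a schema[.table] entry; a missing table part matches any table."""
    schema_pattern, _, table_pattern = entry.partition(".")
    if not fnmatchcase(schema, schema_pattern):
        return False
    return table_pattern == "" or fnmatchcase(table, table_pattern)


def is_replicated(schema, table, do="", ignore=""):
    """Decide whether a change to schema.table should be forwarded."""
    do_list, ignore_list = parse_list(do), parse_list(ignore)
    # Assumption: an ignore match wins over a do match.
    if any(matches(entry, schema, table) for entry in ignore_list):
        return False
    if do_list:
        return any(matches(entry, schema, table) for entry in do_list)
    # Both lists empty (or only ignore set and not matched): allow.
    return True
```

With `do="*.account"`, as in the definition above, only the `account` table of each schema passes through, so `is_replicated("e3", "account", do="*.account")` is true and the same call for any other table is false.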
  • 16. ReplicateColumns Filter Definition

        # Join rows from one table in many schemas, to a table in a single schema
        replicator.filter.replicatecolumns=com.continuent.tungsten.replicator.filter.ReplicateColumnsFilter
        replicator.filter.replicatecolumns.do=
        replicator.filter.replicatecolumns.ignore=account.filler
  • 17. ReplicateColumns : Before

        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(4: filler) = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
  • 18. ReplicateColumns : After

        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
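The before/after listings above amount to removing one column from a row event. A minimal Python model of that step follows; it is not the ReplicateColumnsFilter code. The event dictionary, the `drop_columns` name, and reading the `ignore` entry `account.filler` as `table.column` are all assumptions for illustration.

```python
def drop_columns(event, ignore):
    """Return a copy of a row event without the ignored table.column entries."""
    ignored = {
        tuple(entry.strip().split(".", 1))
        for entry in ignore.split(",")
        if entry.strip()
    }
    kept = [
        column for column in event["columns"]
        if (event["table"], column["name"]) not in ignored
    ]
    return {**event, "columns": kept}


# Row event modelled on the "Before" listing (filler value shortened).
before = {
    "action": "INSERT",
    "schema": "e3",
    "table": "account",
    "columns": [
        {"index": 1, "name": "account_id", "value": 27},
        {"index": 2, "name": "branch_id", "value": 0},
        {"index": 3, "name": "account_balance", "value": 0},
        {"index": 4, "name": "filler", "value": "X" * 10},
        {"index": 5, "name": "time_stamp", "value": "2012-09-28 11:50:36.0"},
    ],
}
after = drop_columns(before, "account.filler")
```

As in the listings, the remaining columns keep their original indexes (1, 2, 3, 5), so the gap left by `filler` is visible in the filtered event.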
  • 19. Index Filter Definition

        # Join rows from one table in many schemas, to a table in a single schema
        replicator.filter.buildindextable=com.continuent.tungsten.replicator.filter.BuildIndexTable
        replicator.filter.buildindextable.target_schema_name=bi
  • 20. Index Filter : Before

        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = e3
        - TABLE = account
        - ROW# = 0
        - COL(1: account_id) = 27
        - COL(2: branch_id) = 0
        - COL(3: account_balance) = 0
        - COL(5: time_stamp) = 2012-09-28 11:50:36.0
  • 21. Index Filter : After

        - SQL(27) =
        - ACTION = INSERT
        - SCHEMA = test
        - TABLE = account
        - ROW# = 0
        - COL(0: account_id) = 27
        - COL(0: branch_id) = 0
        - COL(0: account_balance) = 0
        - COL(0: time_stamp) = 2012-09-28 11:50:36.0
        - COL(0: schema) = e3
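The index filter step shown above does two things: it redirects the row to the target schema and appends the source schema as an extra column. The Python sketch below models only that transformation. It is not the BuildIndexTable source, and the event dictionary and the `build_index_row` name are hypothetical.

```python
def build_index_row(event, target_schema):
    """Retarget a row event and record its source schema as a new column."""
    columns = list(event["columns"])
    columns.append({"name": "schema", "value": event["schema"]})
    return {**event, "schema": target_schema, "columns": columns}


# Row event modelled on the "Before" listing.
before = {
    "action": "INSERT",
    "schema": "e3",
    "table": "account",
    "columns": [
        {"name": "account_id", "value": 27},
        {"name": "branch_id", "value": 0},
        {"name": "account_balance", "value": 0},
        {"name": "time_stamp", "value": "2012-09-28 11:50:36.0"},
    ],
}
# "test" matches the target schema shown in the "After" listing.
after = build_index_row(before, "test")
```

The target table therefore needs one more column than the source tables, which matches the requirement for a target schema with an additional `schema` column.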
  • 22. Requirements
      • Standard Tungsten Requirements
          • Java
          • Ruby
      • Row Replication
      • Unique schema names
      • Target schema with an additional ‘schema’ column
  • 23. Installation - Master
      • Normal
  • 24. Installation - Slave
      • Some hacks to write back to the master
      • Some filters to make the changes
  • 25. Demo
  • 26. Replicating the index table
      • Set up a normal master-slave pair
      • Could be a Tungsten cluster
      • Set up the indexing replicator to apply events to the master or through the connector
  • 27. Provisioning
      • Not as simple as restoring a backup
      • New loader command to initialize the master THL or slave database
      • Creates a set of row inserts for each table
      • Supports extracting from MySQL
      • Uses FLUSH TABLES WITH READ LOCK
  • 28. Provisioning Master THL

        $ /opt/continuent/tungsten/tungsten-replicator/bin/loader \
            -extractor com.continuent.tungsten.replicator.loader.MySQLLoader \
            -extractor.uri "mysql://mdb-1.local:3306/" \
            -extractor.user tungsten \
            -extractor.password secret \
            -extractor.includeSchemas e1,e2,e3,e4,e5
  • 29. Provisioning Slave Database

        $ /opt/indexer/tungsten/tungsten-replicator/bin/loader \
            -extractor com.continuent.tungsten.replicator.loader.MySQLLoader \
            -extractor.uri "mysql://mdb-2.local:3306/" \
            -extractor.user tungsten \
            -extractor.password secret \
            -extractor.includeSchemas e1,e2,e3,e4,e5 \
            -extractor.tungstenServiceSchema tungsten_globalidx
  • 30. Demo
  • 31. Next Steps
      • Loading big data
      • Replicate from many masters
      • Support non-row statements
          • Drop schema
          • Drop table
          • Truncate table
      • Expand the provisioning support to extract from Oracle
      • https://docs.continuent.com/wiki/x/uIAz
  • 32. We’re Hiring
      • Cluster Implementation Engineer
      • Quality Assurance Engineer
      • Technical Writer
  • 33. Jeff Mace
        jeff.mace@continuent.com
        sales@continuent.com
        560 S. Winchester Blvd. Suite 500, San Jose, CA 95128
        Tel (866) 998-3642
        Fax (408) 668-1009
        http://www.continuent.com
        http://code.google.com/p/tungsten-replicator