Edb 100k-trans-qcon-rev1

1,214 views

Published on

Achieving 100 transactions a second with MarkLogic

Published in: Technology
  • Be the first to comment

Edb 100k-trans-qcon-rev1

  1. 1. Achieving 100,000Transactions Per Second with a NoSQL Database Eric David Bloch @eedeebee 19 jun 2012
  2. 2. A bit about me• I’ve written software used by millions of people. Apps, libraries, compilers, device drivers, operating systems• This is my second QCon and my first talk• I’m the Community Director at MarkLogic, last 2 years.• Born here in NY in 1965; now in CA• I survived having 3 kids in less than 2 years.
  3. 3. Me, 1967
  4. 4. Me, today
  5. 5. Musical Form for the TalkA: Why?B: How? Whirl-wind database architecture tour Melody from part A again Techniques to get to 100K
  6. 6.  It’s about money. Top 5 bank needed to manage trades Trades look more like documents than tables Schemas for trades change all the time Transactions Scale and velocity (“Big Data”)
  7. 7. Trades and Positions 1 million trades per day Followed by 1 million position reports at end of day  Roll up trades of current date for each “book, instrument” pair  Group-by, with key = “date, book, instrument” 1M positions 1M trades Day Day Day Day Day ... 1 2 3 4 5
  8. 8. Trades and Positions<trade> <quantity>8540882</quantity> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade><position> <instrument>EAAFX</instrument> <book>679</book> <quantity>3</quantity> <business-date>2011-03-25Z</business-date> <position-date>2011-03-24Z</position-date></position>
  9. 9. Now show us• 15K inserts per second• Linear scalability
  10. 10. Requirements NoSQL flexibility, performance & scale Enterprise-gradetransactional guarantees
  11. 11. What NoSQL, OldSQL, or NewSQLdatabase out there can we use?
  12. 12. Your database has a largeeffect on how you see the world
  13. 13. What is MarkLogic? Non-relational, document-oriented, distributed database  Shared nothing clustering, linear scale out  Multi-Version Concurrency Control (MVCC)  Transactions (ACID) Search Engine  Web scale (Big Data)  Inverted indexes (term lists)  Real-time updates  Compose-able queriesSlide 18 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  14. 14. Architecture Question Data model What does data model Indexing have to do with scalability? Clustering Query execution TransactionsSlide 19 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  15. 15. Key (URI) Value (Document)/trade/153748994 <trade> <id>8</id> <time>2012-02-20T14:00:00</time> <instrument>BYME AAA</instrument> <price cur=“usd”>600.27</price> </trade>/user/eedeebee { “name” : “Eric Bloch”, “age” : 47, “hair” : “gray”, “kids” : [ “Grace”, “Ryan”, “Owen” ] }/book5293 It was the best of times, it was the worst of times, it was the age of wisdom, .../2012-02-20T14:47:53/01445 .mp3 .avi [your favorite binary format]
  16. 16. Inverted Index“which” 123, 127, 129, 152, 344, 791 . . .“uniquely” 122, 125, 126, 129, 130, 167 . . .“identify” 123, 126, 130, 142, 143, 167 . . .“each” 123, 130, 131, 135, 162, 177 . . . Document“uniquely identify” 126, 130, 167, 212, 219, 377 . . . References<article> ... 126, 130, 167, 212, 219, 377 . . .article/abstract/@author ...<product>IMS</product> Slide 21 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  17. 17. Range Index <trade> <trader_id>8</trader_id> Value Docid <time>2012-02-20T14:00:00</time> <instrument>IBM</instrument> 0 287 … </trade> 8 1129 13 531 Docid Value <trade> <trader_id>13</trader_id> … … 287 0 <time>2012-02-20T14:30:00</time>Rows <instrument>AAPL</instrument> … … 531 13 … 1129 8 </trade> … … <trade> <trader_id>0</trader_id> … … <time>2012-02-20T15:30:00</time> <instrument>GOOG</instrument> • Column Oriented … • Memory Mapped </trade> Slide 22 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  18. 18. Shared-Nothing Clustering Host E1 Host D1 Host D2 Host D3 … Host DjSlide 23 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  19. 19. Query Evaluation – “Map” Q Host E1 Host E2 Host E3 … Host Ei Host D1 Host D2 Host D3 Host Dj …Q Q Q Q F1 … Fn F1 … Fn F1 … Fn F1 … Fn Slide 24 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  20. 20. Query Evaluation – “Reduce” Top 10 Host E1 Host E2 Host E3 … Host Ei Host D1 Host D2 Host D3 Host Dj …Top Top Top Top10 10 10 10 F1 … Fn F1 … Fn F1 … Fn F1 … Fn Slide 25 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  21. 21. Queries/Updates with MVCC Every query has a timestamp Documents do not change Reads are lock-free Inserts – see next slide Deletes – mark as deleted Edits –  copy  edit  insert the copy  mark the original as deletedSlide 26 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  22. 22. Insert Mechanics1) New URI+Document arrive at E-node2) URI Probe – determine whether URI exists in any forest3) URI Lock – write locks taken on D node(s)4) Forest Assignment – URI is deterministically placed in Forest5) Indexing6) Journaling7) Commit – transaction complete8) Release URI Locks – D node(s) are notified to release lockSlide 27 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  23. 23. Save-and-merge (Log Structured Tree Merge) Database Insert/ Update Forest 1 Forest 2 S0Memory SaveDisk Journaled S1 S2 S1 S2 S3 S3 Merge J Slide 28 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  24. 24. Back tothe money
  25. 25. Trades and Positions 1 million trades per day followed by 1 million position reports at end of day  Roll up trades of the current “date” for each “book:instrument” pair  Group-by, with key = “book:date:instrument” 1M positions 1M trades Day Day Day Day Day ... 1 2 3 4 5Slide 30 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  26. 26. Naive Query Pseudocodefor each bookfor each instrument in that book position = position(yesterday, book, instrument) for each trade of that instrument in this book position += trade(today, book, instrument).quantity insert(today, book, instrument, position) Slide 32 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  27. 27. Initial resultsSingle node – 19,000 inserts per secondSlide 33 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  28. 28. Initial results with cluster 70000 60000 50000 40000 doc/sec 30000 Report Query 2 20000 10000 0 1DE 2DE 3DE # of nodesSlide 34 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  29. 29. Techniques to get to 100K Insert Query for Computing New Positions  Materialized compound key, Co-Occurrence Query and Aggregation Insert of New Positions  Batching  Optimized insert mechanicsSlide 35 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  30. 30. Materializing a compound key<trade> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade> Slide 36 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  31. 31. Materializing a compound key<trade> <roll-up book-date-instrument=“151333445566782303”/> <quantity2>1193.71</quantity2> <instrument>WASAX</instrument> <book>679</book> <trade-date>2011-03-13-07:00</trade-date> <settle-date>2011-03-17-07:00</settle-date></trade> Slide 37 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  32. 32. Co-Occurrence and Distributed Aggregation<trade> <roll-up book-date-instrument=“151373445566703”/> V D D V <quantity>8540882</quantity> … … <instrument>WASAX</instrument> … … … 1129 <book>679</book> … … <trade-date>2011-03-13-07:00</trade-date> … … 15137… … <settle-date>2011-03-17-07:00</settle-date> … …</trade> … … … … … …• Co-occurrences: V D Find pairings of range indexed values D V … …• Aggregate on the D nodes (Map/Reduce): 8540882 … 1129 … Sum up the quantities above … … … … … 8540882• Similar to a Group-by, … … • in a column-oriented, in-memory … … … … database … … Slide 38 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  33. 33. Initial results with new query > 30K inserts/second on a single nodeSlide 39 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  34. 34. Co-Occurrence + Aggregate versusNaïve approach 70000 60000 50000 40000 doc/sec Report Query 3 30000 Report Query 2 20000 10000 0 1DE 2DE 3DE # of nodesSlide 40 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  35. 35. Techniques Computing Positions  Materialized compound key, Co-Occurrence Query and Aggregation Updates  Batching  Optimized insert mechanicsSlide 41 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  36. 36. Transaction Size andThroughput 35000 30000 25000 #docs per sec. 20000 15000 insert throughput 10000 5000 0 1 2 4 8 16 32 100 500 1000 10000 # doc inserts per transactionSlide 42 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  37. 37. Techniques Computing Positions  Materialized compound key, Co-Occurrence Query and Aggregation Updates (Transaction)  Batching  Optimized insert mechanicsSlide 43 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  38. 38. Insert Mechanics, Again1) New URI+Document arrive at E-node2) *URI Probe – determine whether URI exists in any forest3) *URI Lock – write locks are created4) Forest Assignment – URI is deterministically placed in Forest5) Indexing6) Journaling7) Commit – transaction complete8) *Release URI Locks – D nodes are notified to release lock* Overhead of these operations increases with cluster sizeSlide 44 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  39. 39. Deterministic Placement4) Forest Assignment – URI is deterministically placed in Forest Hash 64-bit Bucketed Fi URI function number into a Forest • Done in C++ within server • But… • Can also be done in the client • Server allows queries to be evaluated against only one forest…Slide 45 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  40. 40. Optimized Insert Q E D1 D2 Q F1 F2 F3 F4Slide 46 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  41. 41. Optimized Insert Mechanics1) New URI+Document arrive at E-node a) Compute Fi using b) Ask server to evaluated the insert query only against Fi2) URI Probe – Fi Only3) URI Lock – Fi Only4) Forest Assignment – Fi Only5) Indexing6) Journaling7) Commit – transaction complete8) Lock Release - Fi OnlySlide 47 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  42. 42. Regular Insert Vs. Optimized Insert 70000 60000 50000 docs/second 40000 regular insert 30000 in-forest-eval In-Forest Eval 20000 10000 0 1DE 2DE 3DE # of nodesSlide 48 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  43. 43. Tada! Achieving 100K Updates In-Forest EvalSlide 49 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  44. 44. We weren’t content with 15KSo we showed them… A quote from the bank“We threw everything we had at MarkLogic and it didn’t break a sweat”Slide 50 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  45. 45. Enterprise-grade NoSQL document-oriented database with real-time full-text search and transactions Doing 100K transactions per second
  46. 46. Thank You!@eedeebeehttp://community.marklogic.comeric.bloch@marklogic.comSlide 55 Copyright © 2012 MarkLogic® Corporation. All rights reserved.

×