
Supercharge your RDBMS with MongoDB Superpowers

ToroDB is an open source, NoSQL, document database. It is compatible with the MongoDB protocol, but uses a relational database for data storage.

ToroDB can transform your RDBMS into a MongoDB-on-steroids server, adding support for atomic operations, consistent reads and a native SQL layer, while preserving your RDBMS data, tools and expertise.

ToroDB also offers a significant speed advantage over MongoDB, both in data insertion (1.5x) and especially in aggregate queries on sharded clusters (25x-75x speedup when used with a Greenplum backend).

Published in: Software

  1. ToroDB @NoSQLonSQL About $self and 8Kdata *8Kdata*
  2. The world has changed... http://chasingafterdear.com/wp-content/uploads/2013/05/how-the-world-has-changed.png
  3. Say you were…
     ● A happy DBA, managing your RDBMS
     ● Bofhing your users when required
     ● Just having to fight devs who don't know who Mr. Bobby Tables is
  4. … and then NoSQL came
     And you started receiving questions like: I want NoSQL! Install MongoDB! My app is web scale!
  5. Fear no more! You can now supercharge your RDBMS with MongoDB superpowers
  6. (no transcribed text)
  7. ToroDB in one slide
     ● Document-oriented, JSON, NoSQL db
     ● Open source (AGPL)
     ● MongoDB compatibility (wire protocol level)
  8. (no transcribed text)
  9. Mapping unstructured data to relational
  10. ToroDB storage internals
      {
        "name": "ToroDB",
        "data": { "a": 42, "b": "hello world!" },
        "nested": {
          "j": 42,
          "deeper": { "a": 21, "b": "hello" }
        }
      }
  11. ToroDB storage internals
      The document is split into the following subdocuments:
      { "name": "ToroDB", "data": {}, "nested": {} }
      { "a": 42, "b": "hello world!" }
      { "j": 42, "deeper": {} }
      { "a": 21, "b": "hello" }
  12. ToroDB storage internals
      ┌─────┬───────┬────────────────────────────┬────────┐
      │ did │ index │ _id                        │ name   │
      ├─────┼───────┼────────────────────────────┼────────┤
      │ 0   │ ¤     │ x5451a07de7032d23a908576d  │ ToroDB │
      └─────┴───────┴────────────────────────────┴────────┘
      ┌─────┬───────┬────┬──────────────┐
      │ did │ index │ a  │ b            │
      ├─────┼───────┼────┼──────────────┤
      │ 0   │ ¤     │ 42 │ hello world! │
      │ 0   │ 1     │ 21 │ hello        │
      └─────┴───────┴────┴──────────────┘
      ┌─────┬───────┬────┐
      │ did │ index │ j  │
      ├─────┼───────┼────┤
      │ 0   │ ¤     │ 42 │
      └─────┴───────┴────┘
  13. ToroDB storage internals
      select * from demo.structures
      ┌─────┬────────────────────────────────────────────────────────────────────────────┐
      │ sid │ _structure                                                                 │
      ├─────┼────────────────────────────────────────────────────────────────────────────┤
      │ 0   │ {"t": 3, "data": {"t": 1}, "nested": {"t": 2, "deeper": {"i": 1, "t": 1}}} │
      └─────┴────────────────────────────────────────────────────────────────────────────┘
      select * from demo.root;
      ┌─────┬─────┐
      │ did │ sid │
      ├─────┼─────┤
      │ 0   │ 0   │
      └─────┴─────┘
  14. How data is stored in schema-less
      Data normalization
  15. This is how we store in ToroDB
  16. Advantages over MongoDB
  17. ToroDB: native SQL
  18. ToroDB VIEWs
      torodb$ select * from toroviews.person ;
      ┌─────┬───────────┬────────┬─────┐
      │ did │ surname   │ name   │ age │
      ├─────┼───────────┼────────┼─────┤
      │ 0   │ Hernandez │ Alvaro │ ¤   │
      │ 1   │ Surname   │ Name   │ 31  │
      └─────┴───────────┴────────┴─────┘
      (2 rows)
      torodb$ select * from toroviews."person.contact";
      ┌─────┬──────────┬────────────────────────┐
      │ did │ verified │ email                  │
      ├─────┼──────────┼────────────────────────┤
      │ 0   │ t        │ aht@torodb.com         │
      │ 1   │ ¤        │ name.surname@gmail.com │
      └─────┴──────────┴────────────────────────┘
      (2 rows)
  19. VIEWs, ToroDB from any SQL tool
  20. Mix-and-match relational & NoSQL
      ● Use the same database for both your relational data and ToroDB
      ● Just use separate schemas (if you will)
      ● Don't write to ToroDB data or metadata tables
      ● Query with SQL, do joins, whatever!
  21. And much more!
      ● Atomic batch-operations
      ● Clean reads
      ● Within node… transactions! (coming soon)
  22. Data discoverability, SQL connectors
      ● They are two of the major announcements for MongoDB 3.2
      ● To discover data, MongoDB samples data. ToroDB: just look at table structures! (and join with root if you want a count)
      ● SQL connectors: native, no emulation
  23. Replication
  24. ToroDB v0.4
      ● ToroDB works as a secondary slave of a MongoDB master (or slave, chained rep)
      ● Implements the full replication protocol (not as an oplog tailable query)
      ● Open source github.com/torodb/torodb (devel branch, version 0.4-SNAPSHOT)
  25. Horizontal scalability (aka sharding)
  26. Write scalability (sharding)
      ● MongoDB's sharding API not implemented yet (roadmap: ToroDB 0.8)
      ● Will use MongoDB's mongos without modification, as well as config servers
      ● That might change in the future (pg_shard?)
  27. Horizontal scalability (storage level)
      ● Another non-exclusive option is to have ToroDB store data in a distributed database
      ● Requires a distributed database like Greenplum, CitusDB or Redshift
      ● Paired with replication as a slave: DW in NoSQL enabler
  28. Enabling Data Warehousing for the NoSQL World
  29. Benchmark
      ● Amazon reviews dataset: "Image-based recommendations on styles and substitutes", J. McAuley, C. Targett, J. Shi, A. van den Hengel, SIGIR, 2015
      ● AWS c4.xlarge (4vCPU, 8GB RAM) 4KIOPS SSD
      ● 4x shards, 3x config; 4x segments GP
      ● 83M records, 65GB plain json
  30. Disk usage
      [Chart: storage requirements in bytes, MongoDB vs ToroDB on Greenplum. Configurations: Mongo 3.0, WT, Snappy; GP columnar, zlib level 9. Series: table size, index size, total size. Axis from 0 to 80000000000 bytes.]
  31. Queries: which one is easier?
      SELECT count( distinct( "reviewerID" ) ) FROM reviews;

      db.reviews.aggregate([
        { $group: { _id: "$reviewerID" } },
        { $group: { _id: 1, count: { $sum: 1 } } }
      ])
  32. Queries: which one is easier?
      SELECT "reviewerName", count(*) as reviews
      FROM reviews
      GROUP BY "reviewerName"
      ORDER BY reviews DESC
      LIMIT 10;

      db.reviews.aggregate(
        [
          { $group : { _id : '$reviewerName', r : { $sum : 1 } } },
          { $sort : { r : -1 } },
          { $limit : 10 }
        ],
        { allowDiskUse: true }
      )
  33. Query times, 3 different queries
      [Chart: query duration (s), MongoDB vs ToroDB on Greenplum. Data labels as transcribed: 969, 1007, 0 and 35, 13, 31 seconds; speedup 27.95, 74.87, 0. Q3 on MongoDB: aggregate fails.]
  34. Announcing today… MyToro! (experimental)
  35. (no transcribed text)
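
The storage-internals slides (10 to 12) describe splitting a document into subdocuments and storing subdocuments with the same fields in the same table. The following Python sketch illustrates that idea only; it is not ToroDB's implementation. The function names `split_document` and `group_by_shape` are hypothetical, arrays are not handled, and the `did`, `index` and `_id` columns shown on slide 12 are omitted.

```python
from collections import defaultdict


def split_document(doc, path=()):
    """Yield (path, subdocument) pairs in pre-order.

    Nested objects are replaced by an empty {} placeholder in the parent,
    mirroring the split shown on slide 11.
    """
    flat = {k: ({} if isinstance(v, dict) else v) for k, v in doc.items()}
    yield path, flat
    for key, value in doc.items():
        if isinstance(value, dict):
            yield from split_document(value, path + (key,))


def group_by_shape(subdocs):
    """Group subdocuments by their scalar field names.

    Each distinct set of field names stands in for one relational table
    (an assumption made for illustration).
    """
    tables = defaultdict(list)
    for _path, sub in subdocs:
        # Drop the {} placeholders; only scalar fields become columns.
        scalars = {k: v for k, v in sub.items() if v != {}}
        tables[tuple(sorted(scalars))].append(scalars)
    return dict(tables)


doc = {
    "name": "ToroDB",
    "data": {"a": 42, "b": "hello world!"},
    "nested": {"j": 42, "deeper": {"a": 21, "b": "hello"}},
}

subdocs = list(split_document(doc))
tables = group_by_shape(subdocs)

for columns, rows in tables.items():
    print(columns, rows)
```

In this sketch, the `data` and `nested.deeper` subdocuments both have the fields `a` and `b`, so they end up as two rows of the same group, which matches the two-row `a`/`b` table on slide 12.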
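
Slide 31 contrasts a SQL distinct count with a two-stage aggregation pipeline. As a rough illustration of why the two are meant to give the same number, the sketch below runs the SQL form against an in-memory SQLite database (a stand-in, not ToroDB or Greenplum) and emulates the two grouping stages in plain Python. The sample rows are invented for the example.

```python
import sqlite3

# Invented sample data: (reviewerID, reviewerName)
rows = [("A1", "Ann"), ("A1", "Ann"), ("B2", "Bob"), ("C3", "Cy")]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE reviews ("reviewerID" TEXT, "reviewerName" TEXT)')
conn.executemany("INSERT INTO reviews VALUES (?, ?)", rows)

# SQL form: one statement.
(sql_count,) = conn.execute(
    'SELECT count(DISTINCT "reviewerID") FROM reviews'
).fetchone()

# Pipeline form, emulated in Python:
# stage 1 groups by the reviewerID field, leaving one entry per distinct ID;
stage1 = {reviewer_id for reviewer_id, _name in rows}
# stage 2 groups everything under a constant key and sums 1 per entry.
pipeline_count = sum(1 for _ in stage1)

print(sql_count, pipeline_count)
conn.close()
```

The first stage only produces one entry per reviewer if it groups on the field's value rather than on a constant, which is why the field reference matters in the pipeline version.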
