NoSQL Riak MongoDB Elasticsearch - All The Same?

Gives a general introduction to NoSQL and to modeling data with JSON, then compares MongoDB, Riak and Elasticsearch, which seem to be the same at first sight but are in fact quite different. Presented at JavaLand.

Published in: Technology

  1. MongoDB, Elasticsearch, Riak – all the same? Eberhard Wolff, Freelancer, Head Technology Advisory Board, adesso AG, http://ewolff.com
  2. Eberhard Wolff - @ewolff • Sample chapter (Leseprobe): http://bit.ly/CD-Buch
  3. Modeling: Relational Databases vs. JSON
  4. Financial System • Different financial products • Mapping objects / database • Inheritance
  5. E/R Model • > 20 database tables • Up to 25 attributes (diagram labels: Asset, Stock, Zero Bond, Option, Country, Currency)
  6. JOINs :(
  7. Get all assets with interest rate x
  9. JSON
  10. Asset model (diagram labels): Asset, Type, ID, Zero Bond, Interest Rate, Fixed Rate Bond, Interest Rate, Stock, Option, …, Preferred, Underlying asset, Country, Price, Country, Currency
  11. { "ID" : "42", "type" : "Fixed Rate Bond", "Country" : "DE", "Currency" : "EUR", "ISIN" : "DE0001141562", "Interest Rate" : "2.5" }
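
The query from slide 7 ("get all assets with interest rate x") can be sketched against JSON documents like the one on slide 11. This is only an in-memory illustration, not how any of the stores discussed here executes queries. The first document is the one from the slide; the other two documents and the function name `assets_with_interest_rate` are assumptions made up for this example.

```python
import json

# The first document is taken from the slide; the other two are
# hypothetical sample data added only for illustration.
ASSETS_JSON = """
[
  {"ID": "42", "type": "Fixed Rate Bond", "Country": "DE", "Currency": "EUR",
   "ISIN": "DE0001141562", "Interest Rate": "2.5"},
  {"ID": "43", "type": "Zero Bond", "Country": "DE", "Currency": "EUR",
   "Interest Rate": "2.5"},
  {"ID": "44", "type": "Stock", "Country": "US", "Currency": "USD"}
]
"""


def assets_with_interest_rate(assets, rate):
    """Return all assets whose "Interest Rate" field equals `rate`.

    Assets without that field (e.g. stocks) are skipped, so no JOIN
    across per-type tables is needed.
    """
    return [asset for asset in assets if asset.get("Interest Rate") == rate]


assets = json.loads(ASSETS_JSON)
matches = assets_with_interest_rate(assets, "2.5")
print([asset["ID"] for asset in matches])  # prints ['42', '43']
```

Note that the slide stores the interest rate as a string, so the sketch compares strings; a numeric field would allow range queries as well.
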
  12. All stores in this presentation support JSON
  13. Scaling Relational Databases
  14. Larger Server • Expensive server • Limited (diagram labels: DB Server, DB Server)
  15. Common Storage • Expensive storage • Limited • e.g. Oracle RAC (diagram labels: four DB Servers)
  16. Replication • Cheap server • Almost unlimited • Inconsistent data • Conflict resolution or read only (diagram labels: four DB Servers)
  17. Replication • MySQL Master-Slave • Oracle Advanced Replication (diagram labels: four DB Servers)
  18. Network Failure • Either answer & provide outdated data • or don’t answer, i.e. always provide up-to-date data
  19. CAP • Consistency • Availability • Network Partition Tolerance • If the network fails: provide a potentially incorrect answer or no answer at all?
  20. BASE • Basically Available • Soft State • Eventually (= in the end) consistent • i.e. give a potentially incorrect answer
  21. BASE and Relational DBs • Very limited • Stand by • Read only replica • No truly distributed DB
  22. Relational & BASE • Most relational operations cover multiple tables • Needs locks across multiple servers • Not realistically possible
  23. NoSQL & BASE • Typical operation covers one data structure • …that contains more information • No complex locking • More sophisticated BASE
  24. Naïve View on NoSQL
  25. Key / Value Stores • Map key to value • For simple data structures • Retrieval only by key • Easy scalability • Only for simple applications • Example: key 42 → value "Some data"
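
The naïve key/value model from slide 25 can be sketched as a thin wrapper around a dictionary. The class name `KeyValueStore` and its methods are hypothetical and do not correspond to the API of Riak or any other store; the point is only that the value is opaque and the key is the sole access path.

```python
class KeyValueStore:
    """Toy in-memory key/value store: values are opaque, lookup is by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store the value under the key, replacing any previous value.
        self._data[key] = value

    def get(self, key):
        # Return the stored value, or None if the key is unknown.
        return self._data.get(key)


store = KeyValueStore()
store.put("42", b"Some data")
print(store.get("42"))  # prints b'Some data'
print(store.get("43"))  # prints None
```
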
  26. Document Oriented • Documents e.g. JSON • Complex structures & queries • Still great scalability • For more complex applications • Example: { "author": { "name": "Eberhard Wolff", "email": "eberhard.wolff@gmail.com" }, "title": "Continuous Delivery" }
  27. Graph, Column Oriented…
  28. Educated View on NoSQL
  29. Key / value, document-based, search engine: all the same?
  30. MongoDB elasticsearch Riak
  31. MongoDB elasticsearch Riak
  32. What is Riak? • Key / value • Truly distributed database
  33. Riak: Technologies • Erlang • Open Source (Apache 2.0) • Company: Basho
  34. More indices • Allows secondary indices • Riak Search 2.0: Solr integration • Solr: Lucene-based search engine • API compatible with Solr • Key / value or document based?
  35. More Features • Map/reduce • Scans all datasets • Can store large binary objects
  36. Scaling Riak • Based on the Dynamo paper • Well understood • …and battle-tested at Amazon
  37. Scaling Riak • Server A: Shard1, Shard3, Shard4 • Server B: Shard2, Shard1, Shard4 • Server C: Shard3, Shard2, Shard1 • Server D: Shard4, Shard2, Shard3
  38. Scaling Riak • Server A: Shard1, Shard3, Shard4 • Server B: Shard2, Shard1, Shard4 • Server C: Shard3, Shard2, Shard1 • Server D: Shard4, Shard2, Shard3
  39. Scaling Riak • Server A: Shard1, Shard3, Shard4 • Server B: Shard2, Shard1, Shard4 • Server C: Shard3, Shard2, Shard1 • Server D: Shard4, Shard2, Shard3 • New Server
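
The shard layout drawn on the "Scaling Riak" slides can be sketched as a lookup table: a key is hashed to a shard, and the shard is served by three of the four servers. This is a simplification under stated assumptions. The hash-modulo mapping and the function names `shard_for` and `servers_for` are made up for illustration; Riak itself places data on a consistent-hashing ring as described in the Dynamo paper.

```python
import hashlib

# Shard placement as drawn on the slides: each shard lives on three servers.
PLACEMENT = {
    "Shard1": ["A", "B", "C"],
    "Shard2": ["B", "C", "D"],
    "Shard3": ["A", "C", "D"],
    "Shard4": ["A", "B", "D"],
}
SHARDS = sorted(PLACEMENT)


def shard_for(key):
    """Map a key to a shard with a stable hash (simplified: hash modulo shard count)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def servers_for(key, down=()):
    """Return the servers holding the key's shard, skipping servers that are down."""
    return [server for server in PLACEMENT[shard_for(key)] if server not in down]


# With any single server down, every key still has at least two live replicas.
for failed in "ABCD":
    assert all(len(servers_for(f"key-{i}", down={failed})) >= 2 for i in range(100))
```

Because every shard is stored on three servers, losing one server never makes a shard unavailable; adding a new server means moving some shard copies onto it.
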
  40. Tuning BASE • N: number of nodes holding a replica • R: number of nodes read from • W: number of nodes written to • Trade-off
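
The trade-off between N, R and W can be made concrete with the usual quorum rule from Dynamo-style systems: if R + W > N, every read set shares at least one node with every write set, so a read sees the latest acknowledged write. The sketch below checks the rule by brute force; the function names are assumptions and not part of any Riak API.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """Quorum rule: read and write sets must intersect when R + W > N."""
    return r + w > n


def always_intersect(n, r, w):
    """Brute-force check that every R-node read set meets every W-node write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


print(quorums_overlap(3, 2, 2))  # prints True: reads see the latest write
print(quorums_overlap(3, 1, 1))  # prints False: faster, but reads may be stale
```

Small R and W give low latency and high availability; large R and W give stronger consistency at the cost of both.
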
  41. Is it bulletproof?
  42. Jepsen • Test suite for network failures etc. • https://aphyr.com/tags/jepsen • Riak succeeds • …if tuned correctly • …might still need to merge versions • https://aphyr.com/posts/285-call-me-maybe-riak
  43. MongoDB elasticsearch Riak
  44. MongoDB elasticsearch Riak
  45. MongoDB elasticsearch Riak
  46. What is MongoDB? ("Humongous") • Document-oriented • MMAPv1: memory-mapped files + journal • New in 3.0: WiredTiger for complex loads
  47. MongoDB: Technologies • C++ • Open Source (AGPL) • Company: MongoDB, Inc.
  48. More Features • Can store large binary objects • Its own full text search
  49. More Features • Map / Reduce • JavaScript • Aggregation framework
  50. Scaling MongoDB • Shard 1: Replica 1, Replica 2, Replica 3 • Shard 2: Replica 1, Replica 2, Replica 3
  51. Availability • Shard 1: Replica 1, Replica 2, Replica 3 • Shard 2: Replica 1, Replica 2, Replica 3
  52. Scaling MongoDB • Shard 1: Replica 1, Replica 2, Replica 3 • Shard 2: Replica 1, Replica 2, Replica 3 • Shard 3: Replica 1, Replica 2, Replica 3
  53. Scaling MongoDB • Shard 1: Replica 1, Replica 2, Replica 3 • Shard 2: Replica 1, Replica 2, Replica 3 • ?
  54. Tuning BASE • Write concerns • How many nodes should acknowledge the write? • Read from primary • …or also secondaries
  55. Jepsen • Mongo loses writes • A bug – might still be there • Also: non-acknowledged writes might still survive • …and overwrite other data • https://aphyr.com/posts/284-call-me-maybe-mongodb
  56. MongoDB elasticsearch Riak
  57. MongoDB elasticsearch Riak
  58. MongoDB elasticsearch Riak
  59. Database = Storage + Search
  60. elasticsearch = Storage + Search
  61. What is elasticsearch? • Search engine • Also stores original documents • Based on the Lucene search library • Easy scaling
  62. elasticsearch: Technologies • Java • REST • Open Source (Apache) • Backed by company elasticsearch
  63. elasticsearch Internals • Append-only file • Many benefits • But not too great for updates
  64. Scaling elasticsearch (diagram labels: Server, Server, Server, Shard 1, Replica 1, Replica 2, Shard 2, Replica 3, Shard 3)
  65. Tuning BASE • Write acknowledge: 1, majority, all • Including indexing • Read from primary • …or also secondaries
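
The three write acknowledgement levels named on the slide (1, majority, all) translate into a number of copies that must confirm a write before it counts as successful. The following is a minimal sketch of that arithmetic; `required_acks` and its string arguments are hypothetical and are not elasticsearch or MongoDB API names.

```python
def required_acks(level, copies):
    """Return how many of `copies` shard copies must acknowledge a write.

    `level` is one of "1", "majority" or "all", as listed on the slide.
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    if level == "1":
        return 1
    if level == "majority":
        return copies // 2 + 1
    if level == "all":
        return copies
    raise ValueError(f"unknown acknowledgement level: {level!r}")


# One primary plus two replicas = three copies.
print(required_acks("1", 3))         # prints 1
print(required_acks("majority", 3))  # prints 2
print(required_acks("all", 3))       # prints 3
```

Waiting for more acknowledgements lowers the risk of losing a write during a network partition but makes writes slower and less available.
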
  66. Jepsen • Loses data even if just one node is partitioned (June 2014) • Actively worked on • It’s a search engine… • https://aphyr.com/posts/317-call-me-maybe-elasticsearch • http://www.elasticsearch.org/guide/en/elasticsearch/resiliency/current/
  67. Scenarios elasticsearch
  68. Search • Powerful query language • Configurable index • Text analysis • Stop words • Stemming
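
Text analysis with stop words and stemming can be illustrated with a deliberately crude analyzer. This is a sketch only: the stop-word list and the suffix-stripping rule are assumptions made up for this example and are far simpler than the analyzers that Lucene provides.

```python
import re

# Hypothetical, tiny stop-word list and suffix rules for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are"}
SUFFIXES = ("ing", "s")


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]


print(analyze("Scaling the databases"))  # prints ['scal', 'database']
print(analyze("database scaling"))       # prints ['database', 'scal']
```

Because the document and the query are reduced to the same terms, a search for "database scaling" matches a document that says "Scaling the databases".
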
  69. Facets • Number of hits by category • Useful for statistics • & Big Data • Statistical facet (+ computation) • Range facets etc.
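
What a facet computes can be sketched in a few lines: count the hits of a search per category, or per value range. The sample hits and the function names `term_facet` and `range_facet` are assumptions for illustration; they do not reproduce the elasticsearch facet API.

```python
from collections import Counter

# Hypothetical search hits, loosely following the financial example.
HITS = [
    {"type": "Fixed Rate Bond", "price": 101.5},
    {"type": "Zero Bond", "price": 87.0},
    {"type": "Stock", "price": 12.3},
    {"type": "Stock", "price": 250.0},
]


def term_facet(hits, field):
    """Number of hits per distinct value of `field`."""
    return Counter(hit[field] for hit in hits)


def range_facet(hits, field, bounds):
    """Number of hits per half-open range [low, high) of `field`."""
    return {
        (low, high): sum(1 for hit in hits if low <= hit[field] < high)
        for low, high in bounds
    }


print(term_facet(HITS, "type")["Stock"])                  # prints 2
print(range_facet(HITS, "price", [(0, 100), (100, 1000)]))  # prints {(0, 100): 2, (100, 1000): 2}
```
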
  70. MongoDB elasticsearch Riak
  71. MongoDB elasticsearch Riak
  72. Conclusion • Relational databases might be BASE • NoSQL embraces BASE better • Key / value stores, document stores and search engines: very similar features • Care about scaling • Care about resilience
  74. Thank You!
