Nonrelational Databases

My improvised/copied preso for some short talk I gave.

Published in: Technology

  1. 1. Non-relational Databases: a new kind of database for handling web scale
  2. 2. Agenda <ul><li>The problem
  3. 3. The solution
  4. 4. Benefits
  5. 5. Cost
  6. 6. Example: Cassandra </li></ul>
  7. 7. The problem <ul><li>The Web introduces a new scale for applications, in terms of: </li><ul><li>Concurrent users (millions of requests/second)
  8. 8. Data (petabytes generated daily)
  9. 9. Processing (all this data needs processing)
  10. 10. Exponential growth (surging, unpredictable demand) </li></ul></ul>
  11. 11. The problem (contd.) <ul><li>Websites with very high traffic cannot handle this scale with existing RDBMS solutions: </li><ul><li>Oracle
  12. 12. MS SQL
  13. 13. Sybase
  14. 14. MySQL
  15. 15. PostgreSQL </li></ul><li>Even with their high-end clustering solutions </li></ul>
  16. 16. The problem (contd.) <ul><li>Why? </li><ul><li>Applications using a normalized database schema require joins, which don't perform well across large data volumes and/or many nodes
  17. 17. Existing RDBMS clustering solutions rely on scaling up, which has hard limits & cannot keep pace with exponential growth
  18. 18. Machines have upper limits on capacity, & sharding the data & processing across machines is very complex & app-specific </li></ul></ul>
  19. 19. The problem (contd.) <ul><li>Why not just use sharding? </li><ul><li>Very problematic when adding/removing nodes
  20. 20. Basically, you end up denormalizing everything & losing all the benefits of relational databases </li></ul></ul>
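
Why adding or removing nodes is so painful with hand-rolled sharding can be illustrated with a small sketch. It assumes the simplest possible scheme, `hash(key) % number_of_nodes`, which is not taken from the slides; the function names and the key format are hypothetical.

```python
import hashlib

def shard_for(key: str, num_nodes: int) -> int:
    """Naive sharding (an assumed scheme): hash the key, take it modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose shard changes when the cluster is resized."""
    changed = sum(
        1 for k in keys if shard_for(k, old_nodes) != shard_for(k, new_nodes)
    )
    return changed / len(keys)

# Hypothetical keys, for illustration only.
keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_nodes=4, new_nodes=5)
print(f"{moved:.0%} of keys change shard when going from 4 to 5 nodes")
```

A key keeps its shard only when its hash gives the same remainder modulo 4 and modulo 5, so roughly four out of five keys have to be copied to a different machine for a single added node.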
  21. 21. Who faced this problem? <ul><li>Web applications dealing with high traffic, massive data, large user-base & user-generated content, such as: </li><ul><li>Google
  22. 22. Yahoo!
  23. 23. Amazon
  24. 24. Facebook
  25. 25. Twitter
  26. 26. LinkedIn
  27. 27. & many more </li></ul></ul>
  28. 28. One difference, though <ul><li>Compared to traditional large applications (telco, financial, etc.), these web applications are usually free & therefore: </li><ul><li>can sacrifice data integrity / consistency </li><ul><li>No one will sue them for not receiving the most current: </li><ul><li>Status of their friends (Facebook/Twitter)
  29. 29. Web search result (Google/Yahoo!)
  30. 30. Item added to cart (Amazon) </li></ul></ul></ul></ul>
  31. 31. The solution <ul><li>These companies had to come up with a new kind of DBMS, capable of handling web scale </li><ul><li>Possibly sacrificing some level of consistency or some other feature </li></ul></ul>
  32. 32. Must we sacrifice something? <ul><li>In 2000, Eric Brewer (co-founder of Inktomi) formulated the CAP theorem, claiming that a distributed system can fully provide only 2 of these 3: </li><ul><li>Consistency
  33. 33. Availability
  34. 34. Partition-tolerance </li></ul><li>The theorem was formally proved in 2002 by Seth Gilbert & Nancy Lynch of MIT </li></ul>
  35. 35. Simple example <ul><li>When you have a lot of data that needs to be highly available, you'll usually need to partition it across machines & also replicate it to be more fault-tolerant
  36. 36. This means that when a record is written, all its replicas must be updated too
  37. 37. Now you need to choose between: </li><ul><li>Lock all relevant replicas during the update => be less available
  38. 38. Don't lock the replicas => be less consistent </li></ul></ul>
  39. 39. The consequence <ul><li>You need to either: </li><ul><li>Drop partition tolerance (CA)
  40. 40. Drop availability (CP)
  41. 41. Drop consistency (AP) </li></ul><li>“Drop” here is usually not meant as binary, but rather tunable </li></ul>
  42. 42. Non-relational databases <ul><li>The solution these companies came up with is a family of databases for handling web scale: </li><ul><li>BigTable (developed at Google)
  43. 43. HBase (started at Powerset, modeled on BigTable)
  44. 44. Dynamo (developed at Amazon)
  45. 45. Cassandra (developed at Facebook)
  46. 46. Voldemort (developed at LinkedIn)
  47. 47. & a few more: </li><ul><li>Riak, Redis, CouchDB, MongoDB, Hypertable </li></ul></ul></ul>
  48. 48. Benefits <ul><li>Massively scalable
  49. 49. Extremely fast
  50. 50. Highly available, decentralized & fault tolerant (no single-point-of-failure)
  51. 51. Transparent sharding (consistent hashing)
  52. 52. Elasticity
  53. 53. Parallel processing
  54. 54. Dynamic schema
  55. 55. Automatic conflict resolution </li></ul>
  56. 56. Consistent hashing
  57. 57. Replication
  58. 58. Replication – node joining
  59. 59. Replication – node leaving
  60. 60. Scale-out / elasticity? <ul><li>O(1) Distributed Hashtable
  61. 61. Runs on a large number of cheap commodity machines
  62. 62. Replication
  63. 63. Gossip protocol
  64. 64. Transparently handles adding/removing nodes </li></ul>
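
The consistent hashing behind transparent sharding and elasticity can be sketched as follows. This is a minimal illustration under assumptions not stated in the slides: MD5 as the hash function, 100 virtual nodes per machine, and hypothetical names (`HashRing`, `node-a`, and so on). It is not how any particular database implements its ring.

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    """Map a string to a position (token) on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._tokens = []   # sorted token positions on the ring
        self._owners = {}   # token -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed on the ring at several pseudo-random positions.
        for i in range(self.vnodes):
            token = _hash(f"{node}#{i}")
            self._owners[token] = node
            bisect.insort(self._tokens, token)

    def remove_node(self, node: str) -> None:
        self._tokens = [t for t in self._tokens if self._owners[t] != node]
        self._owners = {t: n for t, n in self._owners.items() if n != node}

    def node_for(self, key: str) -> str:
        # A key belongs to the first token clockwise from its hash (wrapping around).
        idx = bisect.bisect_right(self._tokens, _hash(key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-e")
after = {k: ring.node_for(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k]) / len(keys)
print(f"{moved:.0%} of keys move when a fifth node joins")
```

Only the keys that now fall to the new node change owner, so roughly one fifth of the data moves instead of most of it, and every moved key lands on the joining node.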
  65. 65. Tunable consistency? <ul><li>Levels of consistency: </li><ul><li>Strict consistency
  66. 66. Read your writes consistency
  67. 67. Session consistency
  68. 68. Monotonic read consistency
  69. 69. Eventual consistency </li></ul><li>Tunable means choosing how many replicas must take part in each read & write: </li><ul><li>N, R, W parameters (N replicas, R contacted per read, W per write)
  70. 70. Quorum </li></ul></ul>
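
The N, R, W idea can be made concrete with a toy simulation. It assumes a single writer with a global version counter and randomly chosen replicas, neither of which comes from the slides; `ReplicatedRegister` is a hypothetical name. The point it illustrates is the quorum rule: when R + W > N, every read set overlaps the latest write set.

```python
import random

class ReplicatedRegister:
    """One value stored on N replicas; writes reach W of them, reads ask R of them."""

    def __init__(self, n: int, r: int, w: int, seed: int = 0):
        if not (1 <= r <= n and 1 <= w <= n):
            raise ValueError("R and W must be between 1 and N")
        self.n, self.r, self.w = n, r, w
        self._replicas = [(0, None)] * n    # (version, value) per replica
        self._version = 0                   # simplification: a single global counter
        self._rng = random.Random(seed)

    def write(self, value) -> None:
        self._version += 1
        # Only W randomly chosen replicas receive the write.
        for i in self._rng.sample(range(self.n), self.w):
            self._replicas[i] = (self._version, value)

    def read(self):
        # Ask R randomly chosen replicas and keep the newest version seen.
        contacted = self._rng.sample(range(self.n), self.r)
        newest = max((self._replicas[i] for i in contacted), key=lambda vv: vv[0])
        return newest[1]

def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """Read and write sets are guaranteed to intersect when R + W > N."""
    return r + w > n

strict = ReplicatedRegister(n=3, r=2, w=2)
strict.write("v1")
strict.write("v2")
print(quorum_overlaps(3, 2, 2), strict.read())

loose = ReplicatedRegister(n=3, r=1, w=1)
loose.write("v1")
loose.write("v2")
print(quorum_overlaps(3, 1, 1), [loose.read() for _ in range(5)])
```

With N=3, R=2, W=2 every read returns the latest write. With R=1, W=1 reads are cheaper but may return a stale or missing value, which is the consistency versus latency trade-off.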
  71. 71. Dealing with inconsistency <ul><li>Read-repair (when encountering inconsistency)
  72. 72. Vector clock conflict resolution </li></ul>
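
A vector clock lets a replica decide whether one version of a record supersedes another or whether the two were written concurrently and must be reconciled. The sketch below is a minimal illustration; the function names and the node names are hypothetical, and real systems attach these clocks to stored values and reconcile during read-repair.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock

def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "concurrent"   # neither saw the other: a conflict to resolve

def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled value: the element-wise maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}

base = increment({}, "node-a")        # first write, coordinated by node-a
left = increment(base, "node-b")      # update accepted by node-b
right = increment(base, "node-c")     # another update of base, accepted by node-c

print(compare(left, base))            # after
print(compare(left, right))           # concurrent
print(merge(left, right))
```

When two versions compare as "concurrent", the store cannot pick a winner by itself; it either applies a rule such as last-write-wins or hands both versions back to the application to merge.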
  73. 73. Dynamic schema <ul><li>Column families (basically a sparse table) </li></ul>
  74. 74. Dynamic schema (contd.) <ul><li>“Supercolumn” is a collection of columns
  75. 75. Record can have several “supercolumns” </li></ul>
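
The column-family and supercolumn model can be pictured as nested maps. The sketch below uses plain Python dictionaries and assumes last-write-wins by timestamp when the same column is written twice; the helper `put_column`, the row keys, and the column names are hypothetical and only illustrate the shape of the data.

```python
from collections import defaultdict

def put_column(family, row_key, name, value, timestamp):
    """Store a column as (value, timestamp); the newest timestamp wins (assumed rule)."""
    row = family[row_key]
    current = row.get(name)
    if current is None or timestamp >= current[1]:
        row[name] = (value, timestamp)

# Column family: row key -> {column name: (value, timestamp)}.
# Rows are sparse: each row holds only the columns that were written to it.
users = defaultdict(dict)
put_column(users, "alice", "email", "alice@example.com", timestamp=1)
put_column(users, "alice", "city", "London", timestamp=1)
put_column(users, "bob", "twitter", "@bob", timestamp=1)
put_column(users, "alice", "email", "alice@example.org", timestamp=2)   # newer, replaces
put_column(users, "alice", "email", "stale@example.com", timestamp=1)   # older, ignored

# Super column family: row key -> {supercolumn name: {column name: (value, timestamp)}}.
address_book = defaultdict(lambda: defaultdict(dict))
put_column(address_book["alice"], "home", "street", "1 Main St", timestamp=1)
put_column(address_book["alice"], "work", "street", "2 Office Rd", timestamp=1)

print(sorted(users["alice"]), sorted(users["bob"]))
print(sorted(address_book["alice"]))
```

No schema is declared anywhere: "alice" and "bob" simply carry different columns, and one record holds several supercolumns, each grouping its own columns.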
  76. 76. Data processing <ul><li>Map/Reduce: an API exposed by non-relational databases to process data </li><ul><li>A functional programming pattern for parallelizing work
  77. 77. Brings the workers to the data – excellent fit for non-relational databases
  78. 78. Minimizes the programming to 2 simple functions (map & reduce)
  79. 79. Example: count appearances of a word in a giant table of large texts </li></ul></ul>
  80. 80. Map/Reduce (contd.)
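
The word-count example can be sketched in a few lines. This is a single-process illustration of the pattern, not the API of any particular database: `map_reduce`, `map_words`, and `reduce_counts` are hypothetical names, and a real system would run the map step on the nodes that hold the data and shuffle the intermediate pairs between machines.

```python
from collections import defaultdict

def map_words(doc_id, text):
    """Map step: emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1

def reduce_counts(word, counts):
    """Reduce step: combine all the values emitted for one word."""
    return word, sum(counts)

def map_reduce(records, mapper, reducer):
    """Run the mapper over every record, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    return dict(reducer(k, values) for k, values in grouped.items())

# A tiny stand-in for the "giant table of large texts".
table = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog the end"),
]
counts = map_reduce(table, map_words, reduce_counts)
print(counts["the"], counts["fox"])   # 3 1
```

The programmer writes only the two small functions; grouping, distribution, and parallelism belong to the framework.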
  81. 81. Storage
  82. 82. Cost <ul><li>Sacrifices consistency (ACID guarantees) in certain circumstances, though the resulting conflicts can be dealt with
  83. 83. Non-standard new API model
  84. 84. Non-standard new Schema model
  85. 85. New knowledge required to tune/optimize
  86. 86. Less mature </li></ul>
  87. 87. API model <ul><li>Usually similar to a key-value map: </li><ul><li>Get(key)
  88. 88. Put(key, value)
  89. 89. Delete(key)
  90. 90. Execute(operation, key_list) </li></ul><li>“value” can be </li><ul><li>an opaque serialized object
  91. 91. a record (list of “columns”: <name, value, timestamp>) </li></ul></ul>
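
The four-call API model can be sketched as an in-memory class. Everything here is an assumption made for illustration: the class name, the method signatures, and in particular the meaning of `execute`, which is taken to mean "apply an operation to each key in the list". The slides do not define it further.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

class KeyValueStore:
    """Toy single-node store exposing Get / Put / Delete / Execute."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)   # deleting a missing key is a no-op

    def execute(self, operation: Callable[[str, Any], Any],
                key_list: Iterable[str]) -> List[Any]:
        # Assumed semantics: run the operation next to the data, once per key.
        return [operation(key, self._data.get(key)) for key in key_list]

store = KeyValueStore()
# The value can be an opaque serialized object...
store.put("user:1", b"opaque-serialized-bytes")
# ...or a record: a list of <name, value, timestamp> columns.
store.put("user:2", [("email", "bob@example.com", 1), ("city", "Paris", 1)])
store.delete("user:1")

names = store.execute(
    lambda key, record: [column[0] for column in record or []],
    ["user:1", "user:2"],
)
print(names)   # [[], ['email', 'city']]
```

There are no joins or ad hoc queries in this model: everything is addressed by key, which is what makes it straightforward to spread across machines.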
  92. 92. Schema model <ul><li>Kind of sparse table
  93. 93. No schema </li></ul>
  94. 94. Example: Cassandra <ul><li>Features: </li><ul><li>O(1) DHT
  95. 95. Eventual consistency </li><ul><li>tunable: consistency vs. latency </li></ul><li>Values are structured, indexed
  96. 96. Columns / column families
  97. 97. Slicing with predicates (queries)
  98. 98. Order-preserving partitioner </li></ul></ul>
  99. 99. Cassandra performance <ul><li>Benchmark against MySQL (50GB) </li><ul><li>MySQL: </li><ul><li>300ms write
  100. 100. 350ms read </li></ul><li>Cassandra: </li><ul><li>0.12ms write
  101. 101. 15ms read </li></ul></ul><li>How come writes are so fast? </li><ul><li>Writes involve no reads/seeks
  102. 102. Use any node (closest to you) </li></ul></ul>
  103. 103. Cassandra API
  104. 104. Cassandra API (contd.)
  105. 105. Example: Cassandra (contd.) <ul><li>Java API </li><ul><li>Simple DAO
  106. 106. Simple client </li></ul></ul>
  107. 107. Cassandra usage <ul><li>Very high-traffic sites: </li><ul><li>Facebook
  108. 108. Digg
  109. 109. Twitter </li></ul></ul>
  110. 110. Further information <ul><li>The Dynamo paper: </li><ul><li>http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html </li></ul><li>NoSQL patterns: </li><ul><li>http://horicky.blogspot.com/2009/11/nosql-patterns.html </li></ul><li>NoSQL conference videos: </li><ul><li>https://nosqleast.com/2009/ </li></ul><li>Hebrew podcast covering NoSQL & Cassandra (episodes 56, 57 & more): </li><ul><li>http://www.reversim.com/ </li></ul></ul>
  111. 111. Further information (contd.) <ul><li>Ran Tavori's lecture (video + slides): </li><ul><li>http://prettyprint.me/2010/01/09/introduction-to-nosql-and-cassandra-part-1/
  112. 112. http://prettyprint.me/2010/01/20/introduction-to-nosql-and-cassandra-part-2/ </li></ul></ul>
