CodeFest 2014. K. Osipov. NoSQL: let's predict the future together


  1. NoSQL: let's predict the future together! CodeFest 2014, 2014-03-29, Konstantin Osipov
  2. Variety, Velocity, Volume. nuff-nuff says: is #bigdata #nosql?
  3. s/3v/3d/g: data model, data consistency, data access
  4. What's wrong with Relational DBMS?
     ● Rigidity of schema change
     ● Data normalization vs. data distribution
     ● The Web market is vastly bigger than OLTP
     ● New hardware and software stack: it's time for a complete rewrite
  5. Data model
     ● NoSQL: key/value, document store, JSON store, BigTable (columnar store)
     ● Traditional: XML, Object-oriented, Relational
     ● Outliers: Graph databases
  6. Relational vs. JSON. Schema or schemaless
  7. XML vs. JSON
  8. XML vs. JSON
     person["Children"][0]["Name"] =
     Schemaless or implicit schema?
  9. Column family (traditional)
  10. Column family: BigTable/Cassandra
  11. Column family in Cassandra (2)
  12. Graph data model
  13. The idea of an aggregate
      CUSTORDER is the main aggregate of this application domain
  14. Data models: distilled through the idea of aggregate
      ● Document
      ● Graph-oriented
      ● Key/Value
      ● Column store
  15. Dimension 2: data consistency
      ● ACID is not usable for long operations anyway
      ● Consistency is all about the money
      ● … and CAP is not really the dilemma you have
  16. What's atomicity?
      ● In relational and graph DBMS = ACID transactions
      ● Aggregate database = atomic update of an aggregate
      ● Distributed database?
  17. Idea: logical vs. physical consistency
      ● As long as you have multiple copies of the data, you need to worry about physical consistency
      ● Consistency and availability go hand in hand
      ● But sometimes you have to choose between consistency and availability and/or performance
  18. Case study 1: CouchDB, Lotus Notes
  19. Case study 2: Amazon Shopping Basket
      The customers must be able to shop!
      Version evolution of an object over time
  20. Case study 3: Airline/hotel booking
  21. To sum up: business sets the rules
      ● Lotus Notes and CouchDB: eventual consistency of document and email edits
      ● DynamoDB: vector clocks for customers who should always be able to shop!
      ● Hotel and airline reservation and distributed queuing as a case for long-running operations, which can naturally result in inconsistency
  22. Data models: distilled through the idea of aggregate
      ● Eventually consistent
      ● Transactional
      ● Aggregate-atomic
  23. CAP: what's the fuss about?
      ● ACID vs. BASE
      ● To CAP or not to CAP is not a single binary choice
      ● A lot of the time you're trading consistency for response time
      ● Dynamo sure works hard! (c)
  24. Dimension 3: data storage
      ● In-memory index: high velocity
      ● 2-level B-trees: simple use cases
      ● B-trees: retro & classic
      ● LSM trees: high write/read ratio
      ● Fractional cascading/Fractal trees
  25. Data storage: the map of approaches
      ● 2-level B-tree
      ● Fractal tree/LSM
      ● B-tree
      ● In-memory
      ● Sophia
  26. Putting it all together: 3 ideas
      ● Consistent hashing
      ● Relaxed consistency and vector clocks
      ● Log structured merge trees
  27. OK! Now tell us what to do!
      Keep an eye on:
      ● WiredTiger
      ● WebScaleSQL
      ● RocksDB
      ● Sophia & Tarantool
      Wait for the second coming in the form of:
      ● NuoDB
      ● VoltDB
      ● MemSQL
      ● FoundationDB
      Use:
      ● MySQL, PostgreSQL & MariaDB
      ● TokuMX
      ● Hadoop
      ● Redis^W^W^W well, you get the idea :)
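
Slide 26 names consistent hashing as the first of the three ideas, but the transcript does not show how it works. The following is a minimal Python sketch, not taken from the talk: the `HashRing` class, the node names, the MD5-based hash, and the choice of 64 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per physical node (assumed value)
        self._points = []      # sorted hash positions on the ring
        self._owner = {}       # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of MD5 as an integer position on the ring.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            point = self._hash(f"{node}#{i}")
            if point in self._owner:
                continue  # hash collision: keep the existing owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        """Drop every point that belongs to this node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._owner[self._points[idx % len(self._points)]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])  # hypothetical node names
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property the sketch is meant to show: when a node joins, the only keys that change owner are the ones the new node takes over, so the rest of the data stays where it is. A plain `hash(key) % number_of_nodes` scheme would remap most keys instead.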

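Slides 19, 21 and 26 mention vector clocks as the way to track how versions of an object (such as a shopping basket) evolve on different replicas. The sketch below is one common formulation, written here as an assumption rather than as the scheme shown in the talk: a clock is a plain dict from replica name to counter, and the function names and the `server-x`/`server-y`/`server-z` replicas are hypothetical.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Component-wise maximum: the clock that has seen both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def descends(a, b):
    """True if `a` has seen every event that `b` has seen."""
    return all(a.get(n, 0) >= count for n, count in b.items())


def compare(a, b):
    """Classify two versions as 'equal', 'after', 'before' or 'concurrent'."""
    a_sees_b, b_sees_a = descends(a, b), descends(b, a)
    if a_sees_b and b_sees_a:
        return "equal"
    if a_sees_b:
        return "after"
    if b_sees_a:
        return "before"
    return "concurrent"  # neither version saw the other: a conflict


if __name__ == "__main__":
    base = increment({}, "server-x")        # basket created on server-x
    v1 = increment(base, "server-y")        # item added via server-y
    v2 = increment(base, "server-z")        # another item added via server-z
    print(compare(v1, v2))                  # the two edits did not see each other
    resolved = increment(merge(v1, v2), "server-x")
    print(compare(resolved, v1), compare(resolved, v2))
```

When `compare` reports `concurrent`, the store cannot pick a winner on its own. The application has to reconcile the two versions (for a basket, for example, by taking the union of the items) and write back a version whose clock descends from both.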