
DataStax: The Whys of NoSQL


As the {no,new,not only} SQL market cements itself as a backbone for today's Internet Enterprise applications, it behooves us, as both technologists and enterprise leaders, to understand the "whys" behind the design choices of NoSQL systems. This talk reflects on the rationale for design choices within traditional RDBMSs (e.g. 3NF, joins, transactions) and contrasts them with the corresponding foundational concepts in NoSQL systems such as Cassandra.

Published in: Technology


  1. The Whys of NoSQL
  2. 1 Jargon Galore; 2 Schema; 3 Modeling and Internals; 4 Deployment; 5 Conclusion. © 2015. All Rights Reserved.
  3. SQL Jargon. ©2015 DataStax Confidential. Do not distribute without consent.
  4. NoSQL Noise?
  5. Schema. Diagram labels: Rigid Schema; Schema Free; Schema on read; Schema Easy to change; Inflexible; Writes are schema free, reads are freaking slow; Reads/Writes are schema aware; Schema changes are O(1) operations; BLOBs; Too Slow. Optimized for agility of change when needed, not theoretical extremes.
  6. Normalization, Joins, Referential Integrity. Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. Referential integrity is a property of data which, when satisfied, requires every value of one column of a table to exist as a value of another column in a different table. A JOIN is a means for combining fields from two tables (or more) by using values common to each. Source - https://en.wikipedia.org/
  7. Not all Data Access is equal. 1:168K random vs. sequential; 1:10 random vs. sequential. Source - https://queue.acm.org/detail.cfm?id=1563874
  8. Disk Density. Source - http://silvertonconsulting.com/blog/2010/04/22/save-the-planet-buy-fatter-disks-and-flash/#sthash.sh2nwqtX.dpbs
  9. Minimize Data Redundancy? [Chart: HDD Price / GB by year, 1980 to 2014, on a price axis running from $0.01 to $1,000,000.00 in powers of ten.]
  10. C* Read and Write paths. Diagram labels: OS Cache; In Process Memory; Off Heap; Memtable 1, Memtable 2, ..., Memtable N; Commit Log; Persistent Storage; SSTable 1, SSTable 2, ..., SSTable N; SSTable Compacted; Accounting; Writes; Reads (memtable + N SSTables where N >= 1); Mandatory Flush; Max # of SSTables = N (based on compaction); Creation of new memtable during flush operation (cleanup tombstones, cleanup token ranges, etc.); Time (memtable_flush_in_ms controls the frequency); RANDOM ACCESS; SEQUENTIAL ACCESS.
  11. Execution Engine
  12. Key takeaways. Optimal utilization of physical resources (random access, sequential IO and CPU). No Read before Write (well mostly!). Plan for Compaction (like commercial paper, you need a regular pay back). De-Normalize for optimal application response (use 2NF instead of 3NF).
  13. Deployment Semantics. Diagram labels: R/W; R; Single Box; DR; GR; Scale Up by Sharding; Replication; GR + DR; San Francisco; New York; Stockholm; DC1; DC2. ©2014 DataStax Confidential. Do not distribute without consent.
  14. Linear Scaling. End Point Report Excerpt: Balanced Read/Write YCSB Test. http://www.datastax.com/apache-cassandra-leads-nosql-benchmark
  15. So what's the catch?
  16. Conclusion. Best in class performance, backed by physics. Enables pragmatic business agility. Delivering delightful customer experience. Always on, Linear Scale architecture delivering optimal ROI.
  17. Thank you
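
The read and write paths on slide 10 (writes go to a commit log and a memtable, memtables are flushed to SSTables, reads consult the memtable plus N SSTables, and compaction bounds N) can be sketched roughly as follows. This is a toy in-memory model for illustration only, not Cassandra's implementation: the class `TinyLSM`, its method names, and the `flush_threshold` parameter are all hypothetical.

```python
class TinyLSM:
    """Toy log-structured store: commit log + memtable + immutable SSTables."""

    def __init__(self, flush_threshold=2):
        # Hypothetical knob for this sketch; not a real Cassandra setting.
        self.flush_threshold = flush_threshold
        self.commit_log = []  # append-only, stands in for sequential disk writes
        self.memtable = {}    # mutable, in-memory view of recent writes
        self.sstables = []    # immutable sorted runs, oldest first

    def write(self, key, value):
        # No read before write: append to the log, then update memory.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable run and start a new one.
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Reads check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all runs into one, keeping the newest value for each key.
        merged = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []
```

In this sketch a read may touch every SSTable, which is why the number of SSTables matters; `compact()` collapses them into a single run so later reads have less to scan, echoing the "Plan for Compaction" takeaway on slide 12.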

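
The "De-Normalize for optimal application response" takeaway on slide 12 trades extra storage for reads that need no join. A minimal sketch of that trade-off, using plain Python dictionaries in place of tables; the `users` and `orders` data and both function names are hypothetical and chosen only for illustration:

```python
# Normalized layout: each fact is stored once, so reads must join.
users = {1: {"name": "Ada", "city": "London"}}
orders = [
    {"order_id": 100, "user_id": 1, "item": "disk"},
    {"order_id": 101, "user_id": 1, "item": "flash"},
]


def orders_with_user_joined(user_id):
    """Read-time join: scan the orders, then look up the matching user row."""
    user = users[user_id]
    return [
        {"order_id": o["order_id"], "item": o["item"], "name": user["name"]}
        for o in orders
        if o["user_id"] == user_id
    ]


# Denormalized layout: the user's name is copied into each order row at
# write time, and rows are grouped by the key the application queries on.
orders_by_user = {}


def record_order(user_id, order_id, item):
    """Write-time duplication: store everything the read will need."""
    row = {"order_id": order_id, "item": item, "name": users[user_id]["name"]}
    orders_by_user.setdefault(user_id, []).append(row)


for o in orders:
    record_order(o["user_id"], o["order_id"], o["item"])
```

After this runs, `orders_by_user[1]` holds the same rows that `orders_with_user_joined(1)` computes, but the read is a single key lookup. The cost is redundant copies of the name, which the disk price chart on slide 9 suggests is an increasingly cheap price to pay.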