Introducing Cassandra NoSQL database Vitalii Tymchyshyn (tivv00@gmail.com) <ul><li>NoSQL key-value vs RDBMS – why and when
Cassandra architecture
Cassandra data model
Life without joins or HDD space is cheap today
When standard is good or which client to use </li></ul>
RDBMS problems <ul><li>Sometimes you reach the point where a single server can't cope
Replication </li><ul><li>Not write scalable
Written data is not instantly visible on replicas </li></ul><li>Sharding </li><ul><li>No foreign keys or joins across shards
No transactions
Reduced reliability (multiple servers) </li></ul><li>Schema update is a pain </li></ul>
NoSQL key-value <ul><li>Master-Master Replication + Sharding in one bottle
Eventual consistency as a standard
All data in one record – no need to join
Flexible schema </li></ul>
Cassandra ring - server - client
Ring partitioner types <ul>Order Preserving <li>Each server serves a key range
Range queries are possible
Read/write/disk-space hot spots are possible </li></ul><ul>Random <li>Data is evenly distributed across servers
No range queries
No hot spots </li></ul>
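The trade-off between the two partitioner types can be sketched with a toy token ring. This is only an illustration, not Cassandra's implementation: the `Ring` class, the node names `A`/`B`/`C`, the string tokens, and the `user_NNNN` keys are all assumptions made up for the example. With an order-preserving token (the key itself), similar keys land on one node, which makes range scans possible but creates a hot spot. With a hashed token the same keys spread across nodes, but key order is lost.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: a node owns every token up to and including its own."""

    def __init__(self, node_tokens, token_fn):
        # node_tokens maps node name -> token; keep (token, name) pairs sorted.
        self.tokens = sorted((tok, name) for name, tok in node_tokens.items())
        self.token_fn = token_fn

    def node_for(self, key):
        tok = self.token_fn(key)
        idx = bisect.bisect_left(self.tokens, (tok, ""))
        # Tokens past the last node wrap around to the first one.
        return self.tokens[idx % len(self.tokens)][1]


def md5_token(key):
    """Hash the key to a 128-bit integer token (random-style partitioning)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


# Order-preserving: the token is the key itself, nodes split the alphabet.
ordered_ring = Ring({"A": "h", "B": "p", "C": "z"}, token_fn=lambda key: key)

# Random: nodes split the hash space into three equal parts.
random_ring = Ring(
    {name: (i + 1) * 2**128 // 3 for i, name in enumerate("ABC")},
    token_fn=md5_token,
)

keys = ["user_%04d" % i for i in range(1000)]
ordered_nodes = {ordered_ring.node_for(k) for k in keys}
random_nodes = {random_ring.node_for(k) for k in keys}

print("order-preserving uses nodes:", sorted(ordered_nodes))
print("random uses nodes:", sorted(random_nodes))
```

All `user_` keys sort between `p` and `z`, so the order-preserving ring sends every one of them to the same node, while the hashed ring uses all three.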
Runtime CAP-solving <ul><li>The whole thing is about replication
CAP: Consistency, Availability, Partition tolerance – choose two.
With Cassandra you can choose at runtime. </li></ul>
Runtime CAP-solving (diagram labels): Quorum read/write; Fast writes; Fast reads; Fast, less consistency
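The runtime choice comes down to how many replicas must acknowledge a write (W) and how many are consulted on a read (R), out of N replicas. A read is guaranteed to touch at least one replica holding the latest write only when R + W > N. The sketch below checks that rule by brute force. The function name and the mapping of the four labels above to concrete R/W values at N = 3 are assumptions made for illustration; they are not taken from the slides.

```python
from itertools import combinations


def read_always_sees_latest_write(n, r, w):
    """Check every possible write set and read set for a shared replica."""
    replicas = range(n)
    for write_set in combinations(replicas, w):
        for read_set in combinations(replicas, r):
            if not set(write_set) & set(read_set):
                # A read could miss every replica that has the new value.
                return False
    return True


N = 3
# Assumed interpretation of the four settings, as (R, W) pairs.
settings = {
    "quorum read/write": (2, 2),
    "fast writes": (3, 1),
    "fast reads": (1, 3),
    "fast, less consistency": (1, 1),
}

results = {
    label: read_always_sees_latest_write(N, r, w)
    for label, (r, w) in settings.items()
}
for label, consistent in results.items():
    print(label, "->", "consistent" if consistent else "may return stale data")
```

Only the last setting can return stale data. It is the cheapest for both reads and writes, which is the trade-off being chosen per request.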
Data model <ul><li>Keyspaces – much like databases in an RDBMS
Column Families – storage elements, like tables in an RDBMS
Columns – you can have millions per row and names are flexible, but they are still like columns in an RDBMS
Super Columns – columns with structured content (other columns) </li></ul>
Data model <ul><li>You can have the same key in multiple column families
You can have a different set of columns for different keys in the same column family
You can query a range of columns for a key (columns are sorted), with pagination
You can have columns without values (and it is useful) </li></ul>
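One way to picture this model is as nested maps: keyspace, then column family, then row key, then sorted columns. The sketch below uses plain Python dicts and is not a client API. The `column_page` helper and the sample rows are hypothetical, chosen to show per-row column sets and paging through a row's sorted columns by resuming after the last column name seen.

```python
from bisect import bisect_right

# keyspace -> column family -> row key -> {column name: value}
keyspace = {
    "people": {
        # Rows in one column family may carry different columns.
        "Fred": {"phone": "2223355", "phone2": "4445566", "fax": "9998877"},
        "John": {"phone": "4445566", "mobile": "099123456"},
    },
}


def column_page(row, after=None, limit=2):
    """Return up to `limit` (name, value) pairs following column `after`."""
    names = sorted(row)  # columns are kept sorted by name
    start = 0 if after is None else bisect_right(names, after)
    return [(name, row[name]) for name in names[start:start + limit]]


fred = keyspace["people"]["Fred"]
page1 = column_page(fred)
page2 = column_page(fred, after=page1[-1][0])
print(page1)
print(page2)
```

Paging by "last column seen" rather than by offset works because the columns within a row are sorted.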
“Index” example <ul><li>Column family people </li><ul><li>Key: Fred [phone=2223355, phone2=4445566, fax=9998877]
Key: John [phone=4445566, mobile=099123456] </li></ul><li>Column family phone_directory </li><ul><li>Key: 2223355 [Fred]
Key: 4445566 [Fred, John]
Key: 9998877 [Fred]
Key: 099123456 [John] </li></ul></ul>
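The index above is maintained by the application: every write to `people` is paired with writes to `phone_directory`, where the person's name is a column with no value. Here is a minimal sketch of that pattern with plain dicts standing in for column families. The `add_person` and `who_has` helpers are hypothetical names, not part of any client library.

```python
keyspace = {"people": {}, "phone_directory": {}}


def add_person(keyspace, name, columns):
    """Store the person and add one valueless index column per number."""
    keyspace["people"][name] = dict(columns)
    for number in columns.values():
        # The column name carries the information; the value stays empty.
        keyspace["phone_directory"].setdefault(number, {})[name] = ""


def who_has(keyspace, number):
    """Look up owners of a number by reading a single index row."""
    return sorted(keyspace["phone_directory"].get(number, {}))


add_person(keyspace, "Fred", {"phone": "2223355", "phone2": "4445566", "fax": "9998877"})
add_person(keyspace, "John", {"phone": "4445566", "mobile": "099123456"})

print(who_has(keyspace, "4445566"))
```

A real application would also have to remove stale index columns when a number changes. That step is omitted here.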
“Join” example <ul><li>Column family customer </li><ul><li>Key: Boeing [email: [email_address] ]
Key: Oracle [skype: java] </li></ul><li>Column family orders </li><ul><li>Key: 1 [customer: Boeing, total: 200m]
Key: 2 [customer: Oracle, total: 300m]
Key: 3 [customer: Boeing, total: 500m] </li></ul><li>Column family customer_order_totals </li><ul><li>Key: Boeing [1: 200m, 3: 500m]
Key: Oracle [2: 300m] </li></ul></ul>
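Without joins, the "join" is precomputed at write time: each order is written twice, once under its own key and once as a column in the customer's row of `customer_order_totals`. The sketch below illustrates this with plain dicts as stand-ins for column families. The `place_order` helper is a hypothetical name, and the duplicated data costs extra disk space.

```python
keyspace = {"orders": {}, "customer_order_totals": {}}


def place_order(keyspace, order_id, customer, total):
    """Write the order, then denormalize it into the customer's totals row."""
    keyspace["orders"][order_id] = {"customer": customer, "total": total}
    # Column name = order id, column value = order total.
    keyspace["customer_order_totals"].setdefault(customer, {})[order_id] = total


place_order(keyspace, 1, "Boeing", "200m")
place_order(keyspace, 2, "Oracle", "300m")
place_order(keyspace, 3, "Boeing", "500m")

# All of Boeing's orders come back from a single row read, with no join.
boeing_totals = keyspace["customer_order_totals"]["Boeing"]
print(boeing_totals)
```

The two writes are not atomic in this sketch, so an application using this pattern has to tolerate or repair a partially applied order.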
Java clients <ul><li>Thrift
Hector
Kundera
Pelops
Cassandrelle </li></ul>
Real-world example <ul><li>Two-phase processing </li><ul><li>Phase 1: Download and parse a few pages from each internet site (billions of pages)
Phase 2: Analyze the gathered data multiple times </li></ul></ul>
Q&A Author: Vitalii Tymchyshyn [email_address] © Triibes