Cassandra Prophecy

An introduction to the Apache Cassandra distributed database

Published in: Technology

  1. Cassandra Prophecy
      Igor Khotin, e-mail: khotin@gmx.com
  2. Background
      ● 11+ years in the IT industry
      ● 6+ years with Java
      ● Flexible design promoter
      ● Agile junkie
  3. Highly scalable, eventually consistent, distributed, structured key-value store
  4. Decentralized
      ● P2P
      ● No SPOF
      ● No network bottlenecks
  5. Fault Tolerant
      ● High availability
      ● Replication and redundancy
      ● Node replacement with no downtime
      ● Multiple racks and datacenters
  6. Elastic Scalability
      ● Scales up and down
      ● Just add or remove nodes
      ● Linear scalability
      ● Low maintenance cost
  7. Tunable Consistency
      ● Different consistency levels
      ● Consistency vs. latency
  8. Rich Data Model
      ● Goes beyond simple key-value
      ● Values can be indexed
      ● Flexible schema
  9. Scale up problem
  10. Sharding doesn't solve it
  11. Google File System & Google BigTable
  12. Amazon Dynamo
  13. Cassandra, by Avinash Lakshman and Prashant Malik
  14. Cassandra used in Inbox Search
  15. Open sourced in July 2008
  16. March 2009: accepted to Apache Incubator
  17. February 2010: top-level Apache project
  18. Late 2010: Cassandra abandoned, Messaging moved to HBase
  19. October 2011: release 1.0
  20. November 30, 2011: release 1.0.5 (current stable)
  21. Moving forward fast...
  22. Brewer's CAP Theorem
  23. Data Model
  24. Column Family
  25. Column sorting (design decision)
      ● ASCII
      ● UTF8
      ● Bytes
      ● Long
      ● LexicalUUID
      ● TimeUUID
      ● Custom
  26. Denormalization
  27. Denormalization
  28. Design for queries
  29-36. Keyring
  37. Gossip
  38. Optimized for writes
  39. Optimized for writes
      ● No reads
      ● No seeks
      ● No B-trees
      ● Fast
      ● Atomic at the row level
  40. Tunable Consistency
  41. Tombstone
  42. Low Level Clients
      ● Thrift
          ● IDL and binary communication protocol
          ● Support for multiple languages
          ● Really sucks
      ● Avro
          ● Better than Thrift, but sucks anyway
  43. High Level Clients
      ● Feature-rich
          ● Connection pool
          ● Load balancing
          ● Fail-over
      ● Hector, Pelops... (Java)
      ● Pycassa... (Python)
      ● Fauna (Ruby)
      ● ...
  44. CQL
      ● SQL for NoSQL
      ● CREATE KEYSPACE, CREATE COLUMNFAMILY, CREATE INDEX
      ● USE, SELECT, UPDATE, DELETE...
      SELECT population FROM city WHERE KEY = 'Paris' USING CONSISTENCY QUORUM
  45. Understand your problem
  46. Understand your problem. Find an appropriate solution
  47. Don't let default solutions be imposed on you
  48. Hard to choose?
  49. Leaders will emerge
  50. Resources
      ● http://cassandra.apache.org
      ● Dynamo: Amazon's Highly Available Key-value Store
      ● Cassandra - A Decentralized Structured Storage System
      ● Bigtable: A Distributed Storage System for Structured Data
  51. Contacts
      ● E-mail: khotin@gmx.com
      ● Blog: www.ikhotin.com
      ● Twitter: chaostarter
  52. Questions?
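
The Keyring slides (29-36) survive in this transcript as titles only. As a rough illustration of the general idea behind a token ring, consistent hashing, here is a minimal Python sketch. It is not Cassandra's partitioner code: the `Ring` class, its method names, the node names, and the choice of a single MD5-derived token per node are all assumptions made for the example.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring (hypothetical helper, not a Cassandra API).

    Each node owns one token. A key is stored on the first node whose token
    is >= the key's token, wrapping around at the end of the ring, plus the
    next nodes clockwise until the replication factor is reached.
    """

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sorted list of (token, node) pairs: this is the ring.
        self.tokens = sorted((self._token(node), node) for node in nodes)

    @staticmethod
    def _token(value):
        # Assumption: derive the token from an MD5 digest of the name or key.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # A new node takes over only the range just before its own token.
        bisect.insort(self.tokens, (self._token(node), node))

    def replicas(self, key):
        # Index of the first node token >= key token (may equal len, which wraps).
        start = bisect.bisect_left(self.tokens, (self._token(key), ""))
        count = min(self.replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    print(ring.replicas("Paris"))
```

The property worth noticing is what happens on `add_node`: a key's first replica either stays where it was or moves to the new node, which is why nodes can be added or removed without reshuffling the whole data set.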

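Slides 7 and 40 name tunable consistency without spelling out the trade-off. The sketch below assumes the usual rule of thumb: a read is guaranteed to touch at least one replica holding the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names are hypothetical, and only the ONE, QUORUM, and ALL levels are modelled.

```python
def required_acks(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %s" % level)


def read_sees_latest_write(read_level, write_level, replication_factor):
    """True when the read and write replica sets are guaranteed to overlap."""
    reads = required_acks(read_level, replication_factor)
    writes = required_acks(write_level, replication_factor)
    return reads + writes > replication_factor


if __name__ == "__main__":
    # Print the overlap guarantee for every read/write combination at RF = 3.
    for read_level in ("ONE", "QUORUM", "ALL"):
        for write_level in ("ONE", "QUORUM", "ALL"):
            guaranteed = read_sees_latest_write(read_level, write_level, 3)
            print(read_level, write_level, guaranteed)
```

Lower levels wait for fewer replicas and so respond faster, which is the "consistency vs. latency" trade-off from slide 7.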