Cassandra nosql eu 2010


Transcript

  • 1. Cassandra Jonathan Ellis @spyced jbellis@riptano.com Tuesday, April 20, 2010
  • 5. “NoSQL”: Performance, Reliability, Scaling, B&D
  • 6. Performance. Each Cassandra node manages its storage locally. It is not limited by obsolete systems, and not slowed by layering on top of a DFS.
  • 7. b-trees: read-before-write, index in RAM, random I/O
  • 8. Memtable / SSTable
  • 9. Durable
      • Write to commitlog
          • fsync is cheap since it’s append-only
      • Write to memtable
      • [amortized] flush memtable to sstable
    Cassandra is one of the few NoSQL systems that is suitable for use when data loss is unacceptable.
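The write path on the slide above can be sketched roughly as follows. This is a toy illustration only, not Cassandra's implementation: the `TinyStore` class, its JSON-lines file layout, and the `memtable_limit` parameter are all assumptions made up for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical toy store: commit log append, memtable write, flush to a sorted file."""

    def __init__(self, directory, memtable_limit=3):
        self.directory = directory
        self.memtable_limit = memtable_limit  # assumed flush threshold, in rows
        self.memtable = {}
        self.sstables = []
        self.commitlog_path = os.path.join(directory, "commitlog")
        self.commitlog = open(self.commitlog_path, "a")

    def write(self, key, value):
        # 1. Append to the commit log and fsync. Appends are sequential I/O.
        self.commitlog.write(json.dumps([key, value]) + "\n")
        self.commitlog.flush()
        os.fsync(self.commitlog.fileno())
        # 2. Update the in-memory table. No read-before-write is needed.
        self.memtable[key] = value
        # 3. Occasionally flush, so the flush cost is amortized over many writes.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as an immutable file sorted by key.
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}")
        with open(path, "w") as f:
            for key in sorted(self.memtable):
                f.write(json.dumps([key, self.memtable[key]]) + "\n")
        self.sstables.append(path)
        self.memtable = {}


store = TinyStore(tempfile.mkdtemp(), memtable_limit=2)
for key, value in [("b", "2"), ("a", "1"), ("c", "3")]:
    store.write(key, value)
```

After these three writes the first two rows have been flushed to one sorted file, and the third is held in the memtable and the commit log.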
  • 10. SSTable format, briefly
      • Data: <row data 0> <row data 1> ... <row data 127> ... <row data 255> ...
      • Index: <key 127> <key 255> ...
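One way to read the slide above is as a sparse index: rows are stored sorted by key, and only one key per block of rows is indexed. The sketch below illustrates that idea under assumptions: the block size of 128 is inferred from the key numbers on the slide, indexing the last key of each block is a choice made for the example, and the function names are hypothetical.

```python
import bisect

INDEX_INTERVAL = 128  # assumed from the slide: keys 127, 255, ... are indexed


def build_sparse_index(sorted_keys, interval=INDEX_INTERVAL):
    """Return (last_key_in_block, block_start_position) for each block of rows."""
    index = []
    for start in range(0, len(sorted_keys), interval):
        block = sorted_keys[start:start + interval]
        index.append((block[-1], start))
    return index


def lookup(sorted_keys, rows, index, key, interval=INDEX_INTERVAL):
    """Binary-search the sparse index, then scan only the one candidate block."""
    block_last_keys = [last_key for last_key, _ in index]
    i = bisect.bisect_left(block_last_keys, key)
    if i == len(index):
        return None  # key sorts after every stored key
    start = index[i][1]
    for pos in range(start, min(start + interval, len(sorted_keys))):
        if sorted_keys[pos] == key:
            return rows[pos]
    return None


keys = [f"key{n:04d}" for n in range(300)]
rows = [f"row data {n}" for n in range(300)]
index = build_sparse_index(keys)
found = lookup(keys, rows, index, "key0200")
```

The index stays small enough to hold in memory, and a lookup touches at most one block of the data.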
  • 11. Scaling. How managing our own data helps scaling.
  • 12. Scaling
      • Facebook: grew from fewer than 80 machines to 150+
      • SimpleGEO: from 20 EC2 Large instances to 50+
  • 13. How it works
  • 14. W A T L
  • 15. W A F T L
  • 16. W A (A-F] F T (F-L] L
  • 17. Key “C” W A F T L
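The ring slides above can be sketched as a lookup over sorted tokens: each node owns the range from the previous token (exclusive) up to its own token (inclusive), so key “C” falls in (A-F]. This is a minimal illustration using single-letter tokens as on the slides; the `owner` function is a hypothetical name, and real deployments hash keys to tokens first.

```python
import bisect


def owner(tokens, key):
    """Return the token of the node whose range (previous, token] contains key."""
    ring = sorted(tokens)
    i = bisect.bisect_left(ring, key)
    # Keys past the last token wrap around to the first node on the ring.
    return ring[i % len(ring)]


before = owner(["W", "A", "T", "L"], "C")      # ring from slide 14
after = owner(["W", "A", "F", "T", "L"], "C")  # ring after F joins
```

Adding node F splits only the range previously owned by L; keys owned by the other nodes stay where they are.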
  • 18. Reliability
      • No single points of failure
      • Multiple datacenters
      • Monitorable
  • 19. Design
  • 20. The opposite of heroes
      • “If your software wakes people up at 4 AM to fix it, you’re doing it wrong.”
  • 21. W A T L. Every node is equal.
  • 22. Y Key “C” A W U F T L P. Always at least one copy in each datacenter. Alternate datacenters on the ring.
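A rough sketch of the placement idea on the slide above: walk the ring clockwise from the key's owner, first taking one node from each datacenter, then filling up to the replication factor. The datacenter labels `dc1` and `dc2`, the assignment of letters to datacenters, and the `replicas` function are assumptions for illustration; this is not presented as Cassandra's actual placement strategy.

```python
import bisect


def replicas(ring, key, rf):
    """ring: list of (token, datacenter) pairs. Returns the chosen replica nodes."""
    ring = sorted(ring)
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key) % len(ring)
    # Nodes in clockwise order, starting at the owner of the key.
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    chosen, seen_dcs = [], set()
    # Pass 1: the first node encountered in each datacenter gets a copy.
    # (If there are more datacenters than rf, this returns more than rf nodes.)
    for node in walk:
        if node[1] not in seen_dcs:
            chosen.append(node)
            seen_dcs.add(node[1])
    # Pass 2: fill the remaining slots with the next nodes clockwise.
    for node in walk:
        if len(chosen) >= rf:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


# Datacenters alternate around the ring (assumed assignment).
alternating = [("A", "dc1"), ("F", "dc2"), ("L", "dc1"), ("P", "dc2"),
               ("T", "dc1"), ("U", "dc2"), ("W", "dc1"), ("Y", "dc2")]
placed = replicas(alternating, "C", rf=3)
```

With datacenters alternating on the ring, the first two nodes clockwise from any key already sit in different datacenters, so the first pass never has to skip far ahead.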
  • 23. Monitorable
  • 24. Events
  • 25. JMX
  • 26. Bondage & Discipline
      • Twitter: “Fifteen months ago, it took two weeks to perform ALTER TABLE on the statuses [tweets] table.”
  • 27. ColumnFamilies Columns
  • 28. SuperColumns SuperColumns
  • 29. Twissandra
        User = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                'id': 'a4a70900-24e1-11df-8924-001ff3591711',
                'username': 'ericflo',
                'password': '****',
            },
        }

        Followers = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # friend id: timestamp of when the followership was added
                '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
                '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
                '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
            },
        }
  • 30. Twissandra, continued
        Tweet = {
            '7561a442-24e2-11df-8924-001ff3591711': {
                'id': '89da3178-24e2-11df-8924-001ff3591711',
                'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
                'body': 'Trying out Twissandra. This is awesome!',
                '_ts': '1267414173047880',
            },
        }

        Timeline = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # timestamp of tweet: tweet id
                1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
                1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
                1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
                1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
            },
        }
  • 32. Denormalize
        Userline = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # timestamp of tweet: tweet id
                1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
                1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
                1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
                1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
            },
        }
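To make the denormalization concrete, here is a sketch of what posting a tweet might look like against the structures shown on the slides, using plain Python dicts in place of column families. The `post_tweet` function and its `ts` parameter are hypothetical and written for this example; this is not the actual Twissandra code.

```python
import time
import uuid

# Plain dicts standing in for the column families named on the slides.
Followers, Tweet, Timeline, Userline = {}, {}, {}, {}


def post_tweet(user_id, body, ts=None):
    """Store the tweet once, then write its id into every row that will be read later."""
    tweet_id = str(uuid.uuid1())
    if ts is None:
        ts = int(time.time() * 1e6)  # microseconds, like the timestamps on the slides
    Tweet[tweet_id] = {
        'id': tweet_id,
        'user_id': user_id,
        'body': body,
        '_ts': str(ts),
    }
    # The author's own tweets, ordered by timestamp.
    Userline.setdefault(user_id, {})[ts] = tweet_id
    # One extra write per follower, so reading a timeline needs no join.
    for follower_id in Followers.get(user_id, {}):
        Timeline.setdefault(follower_id, {})[ts] = tweet_id
    return tweet_id


Followers['alice'] = {'bob': '1267413962580791', 'carol': '1267413990076949'}
tweet_id = post_tweet('alice', 'Trying out Twissandra. This is awesome!', ts=1267414247561777)
```

The trade is more writes at post time in exchange for reads that fetch a single row.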
  • 33. A note on UUIDs
      • TimeUUID = Version 1 UUID
      • LexicalUUID = any UUID
          • usually version 4
    UUIDs are better than timestamps.
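The distinction above can be seen with Python's standard `uuid` module: a version 1 UUID carries a timestamp that can be recovered, while a version 4 UUID is random. The sketch below uses the user id from the Twissandra slides; the helper name `timeuuid_to_datetime` is made up for the example.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(u):
    """Recover the embedded creation time of a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    seconds = (u.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


user_id = uuid.UUID('a4a70900-24e1-11df-8924-001ff3591711')
created = timeuuid_to_datetime(user_id)  # a version 1 UUID has a time component

random_id = uuid.uuid4()  # version 4: random, no time component
```

Because a version 1 UUID also includes clock-sequence and node bits, two ids generated in the same instant on different machines still differ, which a bare timestamp cannot guarantee.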
  • 34. Questions