Cassandra nosql eu 2010

  1. Cassandra. Jonathan Ellis, @spyced, jbellis@riptano.com. Tuesday, April 20, 2010
  5. “NoSQL”: Performance, Reliability, Scaling, B&D
  6. Performance: each Cassandra node manages its own storage locally. It is not limited by obsolete systems and not slowed by layering on top of a distributed file system (DFS).
  7. B-trees: read-before-write, index in RAM, random I/O
  8. Memtable / SSTable
  9. Durable
       • Write to commitlog
       • fsync is cheap since it’s append-only
       • Write to memtable
       • [amortized] flush memtable to SSTable
     Cassandra is one of the few NoSQL systems suitable for use when data loss is unacceptable.
  10. SSTable format, briefly
        Data:  <row data 0> <row data 1> ... <row data 127> ... <row data 255> ...
        Index: <key 127> <key 255> ...
  11. Scaling: how managing our own data helps scaling
  12. Scaling
        • Facebook: grew from fewer than 80 machines to 150+
        • SimpleGeo: grew from 20 EC2 Large instances to 50+
  13. How it works
  14. Ring diagram: nodes A, L, T, W
  15. Ring diagram: node F added between A and L
  16. Ring diagram: F owns the range (A-F], L owns the range (F-L]
  17. Ring diagram: key “C” placed on the ring of nodes A, F, L, T, W
  18. Reliability
        • No single points of failure
        • Multiple datacenters
        • Monitorable
  19. Design
  20. The opposite of heroes
        • “If your software wakes people up at 4 AM to fix it, you’re doing it wrong.”
  21. Ring diagram: nodes A, L, T, W. Every node is equal.
  22. Ring diagram: nodes A, F, L, P, T, U, W, Y, with key “C”. Always at least one copy in each datacenter. Alternate datacenters on the ring.
  23. Monitorable
  24. Events
  25. JMX
  26. Bondage & Discipline
        • Twitter: “Fifteen months ago, it took two weeks to perform ALTER TABLE on the statuses [tweets] table.”
  27. ColumnFamilies and Columns
  28. SuperColumns
  29. Twissandra

        User = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                'id': 'a4a70900-24e1-11df-8924-001ff3591711',
                'username': 'ericflo',
                'password': '****',
            },
        }

        Followers = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # friend id: timestamp of when the followership was added
                '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
                '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
                '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
            },
        }

  30. Twissandra, continued

        Tweet = {
            '7561a442-24e2-11df-8924-001ff3591711': {
                'id': '89da3178-24e2-11df-8924-001ff3591711',
                'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
                'body': 'Trying out Twissandra. This is awesome!',
                '_ts': '1267414173047880',
            },
        }

        Timeline = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # timestamp of tweet: tweet id
                1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
                1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
                1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
                1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
            },
        }

  32. Denormalize

        Userline = {
            'a4a70900-24e1-11df-8924-001ff3591711': {
                # timestamp of tweet: tweet id
                1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
                1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
                1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
                1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
            },
        }

  33. A note on UUIDs
        • TimeUUID = Version 1 UUID
        • LexicalUUID = any UUID, usually version 4
      UUIDs are better than timestamps.
  34. Questions
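
The write path on slide 9 (commitlog append, memtable update, amortized flush) can be illustrated with a toy store. This is a minimal sketch, not Cassandra's implementation: the class name `TinyStore`, the `memtable_limit` threshold, and the JSON file layout are all assumptions made for the example, and a real system would also recycle commitlog segments after a flush.

```python
import json
import os


class TinyStore:
    """Toy write path: commitlog append + fsync, memtable update, periodic flush."""

    def __init__(self, directory, memtable_limit=3):
        self.directory = directory
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}
        self.sstables = []  # paths of flushed files, never modified afterwards
        self.commitlog = open(os.path.join(directory, "commitlog"), "a")

    def write(self, key, value):
        # 1. Append to the commitlog and force it to disk. The log is
        #    append-only, so this is sequential I/O rather than random I/O.
        self.commitlog.write(json.dumps([key, value]) + "\n")
        self.commitlog.flush()
        os.fsync(self.commitlog.fileno())
        # 2. Apply the write to the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush only once enough writes have accumulated, so the cost of
        #    writing a sorted file is amortized over many writes.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}.json")
        with open(path, "w") as f:
            # Rows are written once, in key order.
            json.dump(sorted(self.memtable.items()), f)
            f.flush()
            os.fsync(f.fileno())
        self.sstables.append(path)
        self.memtable = {}

    def close(self):
        self.commitlog.close()
```

Note that no step reads existing data before writing, which is the contrast with the read-before-write B-tree behavior on slide 7.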

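The ring walkthrough on slides 14 to 17 can be sketched as a lookup over sorted tokens. This is an illustrative sketch under simplifying assumptions: the `Ring` class and its method names are hypothetical, and keys are compared to node tokens directly, as the letters on the slides suggest. A real cluster derives a token from each key through its partitioner.

```python
from bisect import bisect_left


class Ring:
    """Toy token ring: each node owns the range (previous token, own token]."""

    def __init__(self, tokens):
        self.tokens = sorted(tokens)

    def add_node(self, token):
        # Adding a node only splits one existing range; other ranges are untouched.
        self.tokens = sorted(set(self.tokens) | {token})

    def owner(self, key):
        # The owner is the first token >= key. Past the largest token,
        # the lookup wraps around to the smallest one.
        i = bisect_left(self.tokens, key)
        return self.tokens[i % len(self.tokens)]


ring = Ring(["A", "L", "T", "W"])
before = ring.owner("C")   # L owns (A-L] while F is absent
ring.add_node("F")
after = ring.owner("C")    # F now owns (A-F]
```

With the four-node ring, key "C" belongs to L. After F joins, only the range (A-L] is split, and "C" moves to F while keys in (F-L] stay on L.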
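The denormalization on slides 29 to 32 means a tweet is written once per reader, so that reading a timeline is a single row lookup. The sketch below uses plain dictionaries as stand-ins for the ColumnFamilies; it does not use a Cassandra client. The function names `post_tweet` and `read_timeline` are hypothetical, and including the author in their own timeline is an assumption based on the sample data on slide 30.

```python
import time
import uuid
from collections import defaultdict

# Plain dicts standing in for the ColumnFamilies shown on the slides.
Tweet = {}
Followers = defaultdict(dict)  # user id -> {follower id: timestamp}
Userline = defaultdict(dict)   # user id -> {timestamp: tweet id}, own tweets
Timeline = defaultdict(dict)   # user id -> {timestamp: tweet id}, tweets to read


def post_tweet(user_id, body, ts=None):
    """Store the tweet once, then write a reference into every row that reads it."""
    if ts is None:
        ts = int(time.time() * 1_000_000)  # microseconds, as in the slide data
    tweet_id = str(uuid.uuid1())
    Tweet[tweet_id] = {"id": tweet_id, "user_id": user_id, "body": body, "_ts": str(ts)}
    Userline[user_id][ts] = tweet_id
    # Fan out on write: one extra column per follower, paid at write time.
    for reader in [user_id, *Followers[user_id]]:
        Timeline[reader][ts] = tweet_id
    return tweet_id


def read_timeline(user_id, limit=20):
    """Newest first; only the reader's own Timeline row is consulted."""
    row = Timeline[user_id]
    return [Tweet[row[ts]] for ts in sorted(row, reverse=True)[:limit]]
```

The trade-off is more writes per tweet in exchange for reads that need no joins, which suits a store where writes are cheap.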