Cassandra nosql eu 2010
Cassandra nosql eu 2010 Presentation Transcript

  • 1. Cassandra. Jonathan Ellis, @spyced, jbellis@riptano.com. Tuesday, April 20, 2010
  • 5. “NoSQL”: Performance, Reliability, Scaling, B&D
  • 6. Performance
    Notes: Each Cassandra node manages its storage locally. Not limited by obsolete systems, and not slowed by layering on top of a DFS.
  • 7. B-trees
    Notes: read-before-write; index in RAM; random I/O
  • 8. Memtable / SSTable
  • 9. Durable
      • Write to commitlog
      • fsync is cheap since it’s append-only
      • Write to memtable
      • [amortized] flush memtable to sstable
    Notes: Cassandra is one of the few NoSQL systems that is suitable for use when data loss is unacceptable.
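The write path on slide 9 can be sketched in a few lines of Python. This is a toy illustration, not Cassandra's implementation: the `TinyStore` class, its tab-separated file layout, and the `flush_threshold` parameter are all assumptions made for the example. A real system would also replay the commit log on restart and discard log segments once the data they cover has been flushed; both are left out here.

```python
import os
import tempfile


class TinyStore:
    """Toy write path: commit log append, memtable update, threshold flush.

    Hypothetical class for illustration only; not Cassandra's code.
    """

    def __init__(self, directory, flush_threshold=3):
        self.directory = directory
        self.flush_threshold = flush_threshold
        self.memtable = {}   # in-memory, most recent value per key
        self.sstables = []   # paths of flushed, immutable files
        self.commitlog = open(os.path.join(directory, "commitlog"), "a")

    def write(self, key, value):
        # 1. Append to the commit log and force it to disk. The log is
        #    append-only, so this is sequential I/O.
        self.commitlog.write(f"{key}\t{value}\n")
        self.commitlog.flush()
        os.fsync(self.commitlog.fileno())
        # 2. Apply the write to the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush occasionally, so the cost is spread over many writes.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out in key order as one immutable file.
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}")
        with open(path, "w") as out:
            for key in sorted(self.memtable):
                out.write(f"{key}\t{self.memtable[key]}\n")
            out.flush()
            os.fsync(out.fileno())
        self.sstables.append(path)
        self.memtable = {}

    def close(self):
        self.commitlog.close()
```

Because every write reaches the log before it is acknowledged, a crash loses at most what was still in flight; the memtable can be rebuilt from the log.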
  • 10. SSTable format, briefly
    Data file: <row data 0> <row data 1> ... <row data 127> ... <row data 255> ...
    Index: <key 127> <key 255> ...
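One reading of slide 10 is that rows are stored in key order and only one key in every 128 is kept as an index entry. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the index samples the first key of each block (the slide does not say which key of a block is sampled), and the "file" is just a Python list.

```python
from bisect import bisect_right


def build_sparse_index(sorted_keys, interval=128):
    """Keep every `interval`-th key together with its row position."""
    return [(sorted_keys[i], i) for i in range(0, len(sorted_keys), interval)]


def lookup(rows, index, key, interval=128):
    """Find `key` in `rows`, a list of (key, value) pairs sorted by key.

    The sparse index narrows the search to one block of at most
    `interval` rows, which is then scanned sequentially.
    """
    sampled = [k for k, _ in index]
    slot = bisect_right(sampled, key) - 1
    if slot < 0:
        return None  # key sorts before the first row
    start = index[slot][1]
    for k, v in rows[start:start + interval]:
        if k == key:
            return v
        if k > key:
            break  # rows are sorted, so the key is absent
    return None
```

The index stays small enough to hold in memory, and a read costs one seek plus a short sequential scan.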
  • 11. Scaling
    Notes: How managing our own data helps scaling
  • 12. Scaling
      • Facebook: grew from fewer than 80 machines to 150+
      • SimpleGEO: from 20 EC2 Large instances to 50+
  • 13. How it works
  • 14. W A T L
  • 15. W A F T L
  • 16. W A (A-F] F T (F-L] L
  • 17. Key “C” W A F T L
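Slides 14 to 17 place nodes on a ring by token, give each node the range from the previous token (exclusive) up to its own (inclusive), and route key “C” to the node that owns that range. A minimal sketch of the lookup follows; the `owner` function is a hypothetical name, and the slides' single letters are used directly as tokens, whereas a deployment may hash keys to tokens first.

```python
from bisect import bisect_left


def owner(tokens, key):
    """Return the token of the node whose range (previous, token] holds key."""
    ring = sorted(tokens)
    # First token >= key; past the last token, wrap around to the first.
    i = bisect_left(ring, key)
    return ring[i % len(ring)]


ring = ["W", "A", "F", "T", "L"]
print(owner(ring, "C"))  # key "C" falls in (A-F]
```

Adding node F to the ring on slide 15 only changes ownership of the range (A-F]; keys in other ranges stay where they were.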
  • 18. Reliability
      • No single points of failure
      • Multiple datacenters
      • Monitorable
  • 19. Design
  • 20. The opposite of heroes
      • “If your software wakes people up at 4 AM to fix it, you’re doing it wrong.”
  • 21. W A T L
    Notes: Every node is equal
  • 22. Y Key “C” A W U F T L P
    Notes: Always at least one copy in each datacenter. Alternate datacenters on the ring.
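Slide 22's rule, at least one copy in each datacenter, can be illustrated by walking the ring clockwise from the key's owner. The sketch below is not Cassandra's replication strategy; the `replicas` function, the `dc1`/`dc2` labels, and the assignment of nodes to datacenters are assumptions made for the example.

```python
from bisect import bisect_left


def replicas(ring, key, rf):
    """Pick `rf` replica tokens for `key`.

    `ring` is a list of (token, datacenter) pairs. The walk starts at the
    key's owner and proceeds clockwise. The first node met in each
    datacenter is chosen first; remaining slots are filled in ring order.
    """
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, key) % len(ring)
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    chosen, seen_dcs = [], set()
    for token, dc in walk:
        if dc not in seen_dcs and len(chosen) < rf:
            chosen.append(token)
            seen_dcs.add(dc)
    for token, _ in walk:
        if len(chosen) == rf:
            break
        if token not in chosen:
            chosen.append(token)
    return chosen


# Hypothetical layout: datacenters alternate around the ring.
alternating = [("A", "dc1"), ("F", "dc2"), ("L", "dc1"), ("P", "dc2"),
               ("T", "dc1"), ("U", "dc2"), ("W", "dc1"), ("Y", "dc2")]
print(replicas(alternating, "C", 3))
```

When datacenters alternate on the ring, any two consecutive nodes already span both datacenters, so the walk never has to skip far ahead to satisfy the rule.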
  • 23. Monitorable
  • 24. Events
  • 25. JMX
  • 26. Bondage & Discipline
      • Twitter: “Fifteen months ago, it took two weeks to perform ALTER TABLE on the statuses [tweets] table.”
  • 27. ColumnFamilies Columns
  • 28. SuperColumns SuperColumns
  • 29. Twissandra
    User = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            'id': 'a4a70900-24e1-11df-8924-001ff3591711',
            'username': 'ericflo',
            'password': '****',
        },
    }
    Followers = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # friend id: timestamp of when the followership was added
            '10cf667c-24e2-11df-8924-001ff3591711': '1267413962580791',
            '343d5db2-24e2-11df-8924-001ff3591711': '1267413990076949',
            '3f22b5f6-24e2-11df-8924-001ff3591711': '1267414008133277',
        },
    }
  • 30.
    Tweet = {
        '7561a442-24e2-11df-8924-001ff3591711': {
            'id': '89da3178-24e2-11df-8924-001ff3591711',
            'user_id': 'a4a70900-24e1-11df-8924-001ff3591711',
            'body': 'Trying out Twissandra. This is awesome!',
            '_ts': '1267414173047880',
        },
    }
    Timeline = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # timestamp of tweet: tweet id
            1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
            1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
            1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
            1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
        },
    }
  • 32. Denormalize
    Userline = {
        'a4a70900-24e1-11df-8924-001ff3591711': {
            # timestamp of tweet: tweet id
            1267414247561777: '7561a442-24e2-11df-8924-001ff3591711',
            1267414277402340: 'f0c8d718-24e2-11df-8924-001ff3591711',
            1267414305866969: 'f9e6d804-24e2-11df-8924-001ff3591711',
            1267414319522925: '02ccb5ec-24e3-11df-8924-001ff3591711',
        },
    }
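The denormalized layout on slides 29 to 32 trades extra writes for single-row reads. The sketch below shows that trade with plain in-memory dicts standing in for the column families; it does not use a Cassandra client, and the `post_tweet` and `read_timeline` functions are hypothetical names rather than Twissandra's actual code. It also assumes that a tweet is fanned out only to the author's followers.

```python
import time
import uuid
from collections import defaultdict

# In-memory stand-ins for the column families named on the slides.
Tweet = {}
Followers = defaultdict(dict)   # user id -> {follower id: timestamp}
Userline = defaultdict(dict)    # user id -> {timestamp: tweet id}
Timeline = defaultdict(dict)    # user id -> {timestamp: tweet id}


def post_tweet(user_id, body, ts=None):
    """Store one tweet and copy its id into every row that will read it."""
    tweet_id = str(uuid.uuid1())
    if ts is None:
        ts = int(time.time() * 1e6)  # microseconds, as on the slides
    Tweet[tweet_id] = {"id": tweet_id, "user_id": user_id,
                       "body": body, "_ts": str(ts)}
    # One write to the author's own line ...
    Userline[user_id][ts] = tweet_id
    # ... and one write per follower, instead of a join at read time.
    for follower_id in Followers.get(user_id, {}):
        Timeline[follower_id][ts] = tweet_id
    return tweet_id


def read_timeline(user_id, limit=20):
    """Read the newest tweets for a user from a single row."""
    row = Timeline.get(user_id, {})
    newest_first = sorted(row, reverse=True)[:limit]
    return [Tweet[row[ts]] for ts in newest_first]
```

Posting costs one write per follower, but reading a timeline touches one row plus the tweets it references, with no join across users.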
  • 33. A note on UUIDs
      • TimeUUID = Version 1 UUID
      • LexicalUUID = any UUID
          • usually version 4
    Notes: UUIDs are better than timestamps
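The distinction on slide 33 can be checked with Python's standard `uuid` module. A version 1 UUID embeds a 60-bit timestamp (100-nanosecond intervals since 15 October 1582) along with a clock sequence and node id, so it orders by time while staying unique when two events share a timestamp. A version 4 UUID is random and carries no time. The helper name `uuid1_to_datetime` is an assumption made for this example.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
GREGORIAN_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid1_to_datetime(u):
    """Recover the UTC creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    micros = (u.time - GREGORIAN_OFFSET) // 10
    return UNIX_EPOCH + timedelta(microseconds=micros)


# The user id from the Twissandra slides is a version 1 UUID.
user_id = uuid.UUID("a4a70900-24e1-11df-8924-001ff3591711")
print(user_id.version, uuid1_to_datetime(user_id).date())

# A random UUID has no recoverable time.
print(uuid.uuid4().version)
```

The user id from the slides decodes to 1 March 2010, in line with the microsecond timestamps in the Followers row.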
  • 34. Questions