Hello @world #cassandra

My talk from http://wdcnz.com 2012.

I took a brief look at Cassandra and then stepped through building a Twitter clone. Very rough code is at https://github.com/amorton/wdcnz-2012-site

Hello @world #cassandra Presentation Transcript

  • 1. HELLO @WORLD #CASSANDRA APACHE CASSANDRA IN ACTION WDCNZ 2012 Aaron Morton, Apache Cassandra Committer @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2. The Code is at... github.com/amorton/wdcnz-2012-site
  • 3. Cassandra?
  • 4. Cassandra? Started at Facebook.
  • 5. Cassandra? Top Level Apache project since 2010.
  • 6. Used by... Netflix, Twitter, Reddit, Rackspace...
  • 7. Commercial support by... DataStax, Acunu, PalominoDB, Impetus...
  • 8. Why Cassandra? Scale
  • 9. Why Cassandra? Operations
  • 10. Why Cassandra? Data Model
  • 11. Cluster
  • 12. Store ‘foo’ key with Replication Factor 3. Node 1 - foo Node 4 Node 2 - foo Node 3 - foo
  • 13. Consistent Hashing... Evenly map keys to nodes.
  • 14. Consistent Hashing... Minimise key movements when nodes join or leave.
  • 15. Partitioner... RandomPartitioner transforms Keys to Tokens using MD5. (Default Partitioner, there are others.)
  • 16. Keys and Tokens? Tokens run from 0 to 99. Key fop: token 10. Key foo: token 90.
  • 17. Token Ring. Ring from 0 to 99. foo token: 90; fop token: 10.
  • 18. Token Ranges. Node 1 token: 0 (range 76-0); Node 2 token: 25 (range 1-25); Node 3 token: 50; Node 4 token: 75.
  • 19. Locate Token Range. foo token: 90. Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75.
  • 20. Replication Strategy selects Replication Factor number of nodes for a row.
  • 21. SimpleStrategy with RF 3. foo token: 90. Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75.
  • 22. Clients connect to any node in the cluster.
  • 23. The Client and the Coordinator. foo token: 90. Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; Client.
  • 24. Client specified Consistency Level.
  • 25. Consistency Level... Any*, One, Two, Three,
  • 26. Consistency Level... QUORUM, LOCAL_QUORUM, EACH_QUORUM*
  • 27. QUORUM at Replication Factor... Replication Factor 2 or 3: QUORUM 2. Replication Factor 4 or 5: QUORUM 3. Replication Factor 6 or 7: QUORUM 4.
  • 28. Node Down. foo token: 90. Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; Client.
  • 29. Write ‘foo’ at QUORUM with Hinted Handoff. Node 1 foo foo token: 90 Node 4 Node 2 foo for #3 foo Node 3 Client
  • 30. Read ‘foo’ at QUORUM. Node 1 foo foo token: 90 Node 4 Node 2 foo Node 3 Client
  • 31. Consistency Level nodes must agree.
  • 32. Column Timestamps used to resolve differences.
  • 33. Consistent read for ‘foo’ at QUORUM. Node 1 Node 1 cromulent cromulent Node 4 Node 2 Node 4 Node 2 embiggins cromulent cromulent Client Client Node 3 Node 3
  • 34. R + W > N (#Read Nodes + #Write Nodes > Replication Factor)
  • 35. Data Model
  • 36. Data Model so far. Row Key: Column Column Column (Incomplete.)
  • 37. Data Model. Keyspace Column Family Column Family Column Family Column Column Column Row Key: Column Column Column Column Column Column (Excludes Super Columns.)
  • 38. Data Model... Keyspace Column Family Column: name, value, timestamp Row Key: Column: name, value, timestamp Column: name, value, timestamp (Also TTL and Tombstone Columns.)
  • 39. Code
  • 40. Tweet Storage... CF / Row Key. Column families: User, Tweet, User Tweets, User Timeline, User Metrics, Global Timeline. Row Key user_name: ✓ ✓ ✓ ✓. Row Key tweet_id: ✓.
  • 41. Followers Storage... CF / Row Key. Column families: Ordered Relationships, Relationships, TweetDelivery. Row Key (user_name, rel_type): ✓ ✓. Row Key tweet_id: ✓.
  • 42. Data Driven Wellington (It’s a meet-up on MeetUp.Com)
  • 43. Aaron Morton @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
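
The cluster mechanics on slides 12 to 34 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation and not code from the talk's repository: the ring is the four-node, 0 to 99 ring from the slides, and the function names (toy_token, replicas_for_token, quorum, is_consistent, resolve) are hypothetical. The real RandomPartitioner uses a far larger token space, so toy_token only imitates the idea of hashing a key with MD5, and the timestamps passed to resolve are made up for illustration.

```python
import bisect
import hashlib

# Assumption: the toy ring from the slides, four nodes on a 0-99 token ring.
RING = [(0, "Node 1"), (25, "Node 2"), (50, "Node 3"), (75, "Node 4")]
RING_SIZE = 100


def toy_token(key):
    """Hypothetical stand-in for a partitioner: MD5 of the key, reduced to 0-99."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for_token(token, replication_factor):
    """Locate the token range, then walk clockwise, SimpleStrategy style."""
    tokens = [t for t, _ in RING]
    # A node owns the range (previous token, own token], so take the first
    # node token >= the key token and wrap around to the start if none.
    start = bisect.bisect_left(tokens, token) % len(RING)
    count = min(replication_factor, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


def quorum(replication_factor):
    """QUORUM is a majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def is_consistent(read_nodes, write_nodes, replication_factor):
    """R + W > N: the read set and write set share at least one replica."""
    return read_nodes + write_nodes > replication_factor


def resolve(columns):
    """Given (value, timestamp) pairs from replicas, the highest timestamp wins."""
    return max(columns, key=lambda column: column[1])


if __name__ == "__main__":
    print(replicas_for_token(90, 3))   # replicas for 'foo' at token 90
    print([quorum(rf) for rf in range(2, 8)])
    print(is_consistent(quorum(3), quorum(3), 3))
    print(resolve([("embiggins", 1), ("cromulent", 2)]))
```

With token 90 and a Replication Factor of 3, replicas_for_token returns Node 1, Node 2 and Node 3, which matches slide 12. quorum reproduces the table on slide 27, and reading and writing at QUORUM with a Replication Factor of 3 satisfies R + W > N because 2 + 2 > 3.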