Hello @world #cassandra

My talk from http://wdcnz.com 2012.

I took a brief look at Cassandra and then stepped through building a twitter clone. Very rough code is at https://github.com/amorton/wdcnz-2012-site

Hello @world #cassandra Presentation Transcript

  • HELLO @WORLD #CASSANDRA APACHE CASSANDRA IN ACTION WDCNZ 2012 Aaron Morton, Apache Cassandra Committer @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • The Code is at... github.com/amorton/wdcnz-2012-site
  • Cassandra?
  • Cassandra? Started at Facebook.
  • Cassandra? Top Level Apache project since 2010.
  • Used by... Netflix, Twitter, Reddit, Rackspace...
  • Commercial support by... DataStax, Acunu, PalominoDB, Impetus...
  • Why Cassandra? Scale
  • Why Cassandra? Operations
  • Why Cassandra? Data Model
  • Cluster
  • Store ‘foo’ key with Replication Factor 3. Node 1 - foo Node 4 Node 2 - foo Node 3 - foo
  • Consistent Hashing... Evenly map keys to nodes.
  • Consistent Hashing... Minimise key movements when nodes join or leave.
  • Partitioner... RandomPartitioner transforms Keys to Tokens using MD5. (Default Partitioner, there are others.)
  • Keys and Tokens? key fop foo token 0 10 90 99
  • Token Ring. 99 0 foo fop token: 90 token: 10
  • Token Ranges. Node 1 token: 0 76-0 1-25 Node 4 Node 2 token: 75 token: 25 Node 3 token: 50
  • Locate Token Range. Node 1 token: 0 foo token: 90 Node 4 Node 2 token: 75 token: 25 Node 3 token: 50
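The key-to-node lookup on the slides above can be sketched in Python. This is a toy model, not Cassandra's code: the 0-99 token space and the four node tokens come from the slides, while `token_for` and `owner_of` are hypothetical helper names, and reducing the MD5 digest modulo 100 is an assumption made only to fit the toy ring (RandomPartitioner uses a far larger token space).

```python
import bisect
import hashlib

# Node tokens from the slides. Each node owns the range
# (previous node's token, its own token], wrapping around at the top.
RING = [(0, "Node 1"), (25, "Node 2"), (50, "Node 3"), (75, "Node 4")]
TOKENS = [token for token, _ in RING]


def token_for(key, ring_size=100):
    """Hash a row key to a token with MD5 (toy 0-99 token space, an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ring_size


def owner_of(token):
    """Find the first node whose token is >= the given token, wrapping to Node 1."""
    index = bisect.bisect_left(TOKENS, token)
    return RING[index % len(RING)][1]
```

With the tokens used on the slides, `owner_of(90)` lands on Node 1 (range 76-0) and `owner_of(10)` lands on Node 2 (range 1-25).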
  • Replication Strategy selects Replication Factor number of nodes for a row.
  • SimpleStrategy with RF 3. Node 1 token: 0 foo token: 90 Node 4 Node 2 token: 75 token: 25 Node 3 token: 50
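As a rough sketch of the replica placement shown on the slide, SimpleStrategy can be modelled as "start at the node that owns the token, then keep walking clockwise around the ring". The ring layout is taken from the slides; `replicas_for` is a hypothetical name, and the sketch ignores racks and data centres.

```python
import bisect

# Same toy ring as on the slides: (token, node name), sorted by token.
RING = [(0, "Node 1"), (25, "Node 2"), (50, "Node 3"), (75, "Node 4")]


def replicas_for(token, replication_factor=3):
    """Pick the owning node, then the next nodes clockwise, up to RF nodes."""
    tokens = [node_token for node_token, _ in RING]
    start = bisect.bisect_left(tokens, token) % len(RING)
    count = min(replication_factor, len(RING))
    return [RING[(start + step) % len(RING)][1] for step in range(count)]
```

For 'foo' at token 90 with RF 3 this gives Node 1, Node 2 and Node 3, which matches the earlier "Store 'foo' key with Replication Factor 3" slide.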
  • Clients connect to any node in the cluster.
  • The Client and the Coordinator. Node 1 token: 0 foo token: 90 Node 4 Node 2 token: 75 token: 25 Node 3 Client token: 50
  • Client specified Consistency Level.
  • Consistency Level... Any*, One, Two, Three,
  • Consistency Level... QUORUM, LOCAL_QUORUM, EACH_QUORUM*
  • QUORUM at Replication Factor... Replication Factor 2 or 3: QUORUM 2. Replication Factor 4 or 5: QUORUM 3. Replication Factor 6 or 7: QUORUM 4.
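The figures on this slide follow from QUORUM being a majority of the replicas, that is floor(RF / 2) + 1. A minimal check, where `quorum` is just an illustrative helper name:

```python
def quorum(replication_factor):
    """Number of replicas that form a majority for the given replication factor."""
    return replication_factor // 2 + 1


# Replication factors from the slide, mapped to their quorum size.
table = {rf: quorum(rf) for rf in range(2, 8)}
```

This reproduces the slide: RF 2 or 3 needs 2 nodes, RF 4 or 5 needs 3, and RF 6 or 7 needs 4.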
  • Node Down. Node 1 token: 0 foo token: 90 Node 4 Node 2 token: 75 token: 25 Node 3 Client token: 50
  • Write ‘foo’ at QUORUM with Hinted Handoff. Node 1 foo foo token: 90 Node 4 Node 2 foo for #3 foo Node 3 Client
  • Read ‘foo’ at QUORUM. Node 1 foo foo token: 90 Node 4 Node 2 foo Node 3 Client
  • Consistency Level nodes must agree.
  • Column Timestamps used to resolve differences.
  • Consistent read for ‘foo’ at QUORUM. Node 1 Node 1 cromulent cromulent Node 4 Node 2 Node 4 Node 2 embiggins cromulent cromulent Client Client Node 3 Node 3
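A minimal sketch of the timestamp rule from the previous slides: when replicas return different versions of a column, the version with the highest timestamp wins. The `Column` tuple, the `resolve` helper and the timestamp values are assumptions for illustration, and tie-breaking between equal timestamps is left out.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # supplied by the client, larger means newer


def resolve(responses):
    """Return the column version with the highest timestamp."""
    return max(responses, key=lambda column: column.timestamp)


# Two replicas disagree about 'foo'; the timestamps here are made up.
responses = [
    Column("foo", "embiggins", 1),  # older value
    Column("foo", "cromulent", 2),  # newer value
]
winner = resolve(responses)
```

Here `winner.value` is "cromulent", the value the client sees on the slide.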
  • R + W > N (#Read Nodes + #Write Nodes > Replication Factor)
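The rule can be checked with simple arithmetic: if the read and write sets together are larger than the replication factor, they must share at least one replica that holds the latest write. A small sketch, with `reads_see_latest_write` as a hypothetical helper name:

```python
def reads_see_latest_write(read_nodes, write_nodes, replication_factor):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_nodes + write_nodes > replication_factor


# QUORUM reads and QUORUM writes at RF 3: 2 + 2 > 3.
quorum_both = reads_see_latest_write(2, 2, 3)
# ONE for reads and ONE for writes at RF 3: 1 + 1 is not > 3.
one_both = reads_see_latest_write(1, 1, 3)
```

So QUORUM for both reads and writes satisfies the rule at RF 3, while ONE for both does not.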
  • Data Model
  • Data Model so far. Row Key: Column Column Column (Incomplete.)
  • Data Model. Keyspace Column Family Column Family Column Family Column Column Column Row Key: Column Column Column Column Column Column (Excludes Super Columns.)
  • Data Model... Keyspace Column Family Column: name, value, timestamp Row Key: Column: name, value, timestamp Column: name, value, timestamp (Also TTL and Tombstone Columns.)
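One way to picture the data model on these slides is as nested maps: a keyspace holds column families, a column family holds rows by key, and a row holds columns by name, each with a value and a timestamp. The sketch below is an in-memory illustration only; the `User` column family, the `fred` row key and the `insert` helper are made-up names, and it does not use a Cassandra client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    value: str
    timestamp: int
    ttl: Optional[int] = None  # optional time to live, in seconds


# keyspace -> column family -> row key -> {column name: Column}
keyspace = {"User": {}}


def insert(column_family, row_key, column):
    """Store a column in a row, keeping the version with the newest timestamp."""
    row = keyspace[column_family].setdefault(row_key, {})
    current = row.get(column.name)
    if current is None or column.timestamp >= current.timestamp:
        row[column.name] = column


insert("User", "fred", Column("real_name", "Fred", timestamp=1))
insert("User", "fred", Column("real_name", "Frederick", timestamp=2))
```

After the two inserts the `fred` row holds a single `real_name` column with the newer value.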
  • Code
  • Tweet Storage... CF / User User User Global User Tweet Row Key Tweets Timeline Metrics Timeline user_name ✓ ✓ ✓ ✓ tweet_id ✓
  • Followers Storage... CF / Ordered Relationships TweetDelivery Row Key Relationships (user_name, rel_type) ✓ ✓ tweet_id ✓
  • Data Driven Wellington (It’s a meet-up on MeetUp.Com)
  • Aaron Morton @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License