Hello @world #cassandra

My talk from http://wdcnz.com 2012.

I took a brief look at Cassandra and then stepped through building a Twitter clone. Very rough code is at https://github.com/amorton/wdcnz-2012-site

Published in: Technology, Business
  • Hello @world #cassandra

    1. HELLO @WORLD #CASSANDRA APACHE CASSANDRA IN ACTION WDCNZ 2012 Aaron Morton, Apache Cassandra Committer @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
    2. The Code is at... github.com/amorton/wdcnz-2012-site
    3. Cassandra?
    4. Cassandra? Started at Facebook.
    5. Cassandra? Top Level Apache project since 2010.
    6. Used by... Netflix, Twitter, Reddit, Rackspace...
    7. Commercial support by... DataStax, Acunu, PalominoDB, Impetus...
    8. Why Cassandra? Scale
    9. Why Cassandra? Operations
    10. Why Cassandra? Data Model
    11. Cluster
    12. Store ‘foo’ key with Replication Factor 3. Node 1 - foo, Node 2 - foo, Node 3 - foo, Node 4.
    13. Consistent Hashing... Evenly map keys to nodes.
    14. Consistent Hashing... Minimise key movements when nodes join or leave.
    15. Partitioner... RandomPartitioner transforms Keys to Tokens using MD5. (Default Partitioner, there are others.)
    16. Keys and Tokens? Token scale 0 to 99: key fop has token 10, key foo has token 90.
    17. Token Ring. Tokens 99 and 0 are adjacent on the ring; foo at token: 90, fop at token: 10.
    18. Token Ranges. Node 1 (token: 0, range 76-0), Node 2 (token: 25, range 1-25), Node 3 (token: 50), Node 4 (token: 75).
    19. Locate Token Range. Ring: Node 1 (token: 0), Node 2 (token: 25), Node 3 (token: 50), Node 4 (token: 75); foo at token: 90.
    20. Replication Strategy selects Replication Factor number of nodes for a row.
    21. SimpleStrategy with RF 3. Ring: Node 1 (token: 0), Node 2 (token: 25), Node 3 (token: 50), Node 4 (token: 75); foo at token: 90.
    22. Clients connect to any node in the cluster.
    23. The Client and the Coordinator. Ring: Node 1 (token: 0), Node 2 (token: 25), Node 3 (token: 50), Node 4 (token: 75); foo at token: 90; Client.
    24. Client specified Consistency Level.
    25. Consistency Level... Any*, One, Two, Three,
    26. Consistency Level... QUORUM, LOCAL_QUORUM, EACH_QUORUM*
    27. QUORUM at Replication Factor... Replication Factor 2 or 3: QUORUM 2. Replication Factor 4 or 5: QUORUM 3. Replication Factor 6 or 7: QUORUM 4.
    28. Node Down. Ring: Node 1 (token: 0), Node 2 (token: 25), Node 3 (token: 50), Node 4 (token: 75); foo at token: 90; Client.
    29. Write ‘foo’ at QUORUM with Hinted Handoff. Diagram: Client; Nodes 1 to 4; foo at token: 90; stored copies labelled foo, plus a hint labelled ‘foo for #3’.
    30. Read ‘foo’ at QUORUM. Diagram: Client; Nodes 1 to 4; foo at token: 90; stored copies labelled foo.
    31. Consistency Level nodes must agree.
    32. Column Timestamps used to resolve differences.
    33. Consistent read for ‘foo’ at QUORUM. Two diagrams, each with a Client and Nodes 1 to 4; node values shown: cromulent, embiggins.
    34. R + W > N (#Read Nodes + #Write Nodes > Replication Factor)
    35. Data Model
    36. Data Model so far. Row Key: Column, Column, Column (Incomplete.)
    37. Data Model. Keyspace; Column Family, Column Family, Column Family; Row Key: Column, Column, Column. (Excludes Super Columns.)
    38. Data Model... Keyspace; Column Family; Row Key: Column (name, value, timestamp), Column (name, value, timestamp), Column (name, value, timestamp). (Also TTL and Tombstone Columns.)
    39. Code
    40. Tweet Storage... Column Families: User, Tweet, User Tweets, User Timeline, User Metrics, Global Timeline. Row Key user_name: ✓ ✓ ✓ ✓. Row Key tweet_id: ✓.
    41. Followers Storage... Column Families: Ordered Relationships, Relationships, TweetDelivery. Row Key (user_name, rel_type): ✓ ✓. Row Key tweet_id: ✓.
    42. Data Driven Wellington (It’s a meet-up on MeetUp.Com)
    43. Aaron Morton @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
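
The token ring slides (partitioner, token ranges, SimpleStrategy) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's code: it uses the 0 to 99 token space and the four node tokens from the slides, and it takes the tokens for 'foo' (90) and 'fop' (10) as given. The names `RING`, `toy_token`, `owner_index` and `replicas` are hypothetical. `toy_token` only illustrates the idea of hashing a key with MD5 into a token space; it does not reproduce the slide's example tokens or RandomPartitioner's real token range.

```python
import bisect
import hashlib

# Node tokens taken from the slides (toy 0-99 token space).
RING = {0: "Node 1", 25: "Node 2", 50: "Node 3", 75: "Node 4"}
TOKENS = sorted(RING)


def toy_token(key, space=100):
    """Hypothetical toy partitioner: MD5 of the key folded into 0..space-1."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % space


def owner_index(token):
    """Index of the first node whose token is >= the key token.

    Tokens above the highest node token wrap around to the first node,
    which is how Node 1 (token 0) ends up with the range 76-0.
    """
    return bisect.bisect_left(TOKENS, token) % len(TOKENS)


def replicas(token, rf):
    """SimpleStrategy-style placement: the owner, then the next nodes clockwise."""
    start = owner_index(token)
    count = min(rf, len(TOKENS))  # cannot place more replicas than nodes
    return [RING[TOKENS[(start + i) % len(TOKENS)]] for i in range(count)]


if __name__ == "__main__":
    print(replicas(90, 3))  # 'foo' at token 90
    print(replicas(10, 3))  # 'fop' at token 10
```

With the slide's values, `replicas(90, 3)` returns `['Node 1', 'Node 2', 'Node 3']`, which matches the "Store 'foo' key with Replication Factor 3" slide.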

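
The consistency slides (QUORUM sizes, R + W > N, and column timestamps resolving differences) reduce to a little arithmetic. The sketch below is an illustration only: `quorum`, `overlaps` and `resolve` are hypothetical helper names, the timestamps are invented, and the resolution rule shown is plain "highest timestamp wins", with no attempt to model how ties are handled.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def overlaps(read_nodes, write_nodes, replication_factor):
    """R + W > N: every read set shares at least one node with every write set."""
    return read_nodes + write_nodes > replication_factor


def resolve(responses):
    """Pick the value with the highest timestamp from (value, timestamp) pairs."""
    value, _timestamp = max(responses, key=lambda pair: pair[1])
    return value


if __name__ == "__main__":
    # QUORUM size for each Replication Factor from 2 to 7.
    for rf in range(2, 8):
        print(rf, quorum(rf))

    # QUORUM reads plus QUORUM writes at RF 3 overlap; ONE plus ONE does not.
    print(overlaps(quorum(3), quorum(3), 3))
    print(overlaps(1, 1, 3))

    # Two replicas disagree; the newer column (invented timestamps) wins.
    print(resolve([("embiggins", 1), ("cromulent", 2)]))
```

For Replication Factors 2 through 7 this gives quorum sizes 2, 2, 3, 3, 4, 4, in line with the "QUORUM at Replication Factor" slide.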