• Like
Cassandra Basics: Indexing
Upcoming SlideShare
Loading in...5
×
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
  • useful! thanks
    Are you sure you want to
    Your message goes here
  • this is rad, am going to use this when explaining Cassandra to colleagues!
    Are you sure you want to
    Your message goes here
  • Very nice introduction to how to turn a SQL 'select where' into something on Cassandra. I was initially a little confused by the slide title since cassandra has a concept of an index for columns now but this is actually very useful.
    One more thing...I can't read keynote slides, would have preferred the slides to be posted in pdf or openoffice format.
    Are you sure you want to
    Your message goes here
  • user ColumnFamily maintains secondary indexes
    Are you sure you want to
    Your message goes here
No Downloads

Views

Total Views
27,268
On Slideshare
0
From Embeds
0
Number of Embeds
6

Actions

Shares
Downloads
759
Comments
4
Likes
40

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide












































Transcript

  • 1. Cassandra Basics Indexing Benjamin Black, b@b3k.us
  • 2. Relational stores are SCHEMA ORIENTED
  • 3. Start from your SCHEMA & WORK FORWARDS
  • 4. Column stores are QUERY ORIENTED
  • 5. Start from your QUERIES & WORK BACKWARDS
  • 6. AT SCALE
  • 7. AT SCALE Denormalization is THE NORM
  • 8. AT SCALE
  • 9. AT SCALE Everything depends on THE INDICES
  • 10. Cassandra is an INDEX CONSTRUCTION KIT
  • 11. Column Family
  • 12. Two-level Map key: { column: value, column: value, ... }
  • 13. Super Column Family
  • 14. Three-level Map key: { supercolumn: { column:value, column: value }, supercolumn: { ... } }
  • 15. column sorting defined by CompareWith/ CompareSubcolumnsWith
  • 16. TimeUUIDType UTF8Type ASCIIType LongType LexicalUUIDType
  • 17. row placement determined by Partitioner
  • 18. RandomPartitioner Place based on MD5 of key OrderPreservingPartitioner Place based on actual key
  • 19. Rows are sorted by key on each node Regardless of partitioner
  • 20. One example in TWO ACTS
  • 21. Prelude A USER DATABASE
  • 22. <ColumnFamily Name=”Users” CompareWith=”UTF8Type” />
  • 23. “b”: {“name”:”Ben”, “street”:”1234 Oak St.”, “city”:”Seattle”, “state”:”WA”} “jason”: {”name”:”Jason”, “street”:”456 First Ave.”, “city”:”Bellingham”, “state”:”WA”} “zack”: {”name”: “Zack”, “street”: “4321 Pine St.”, “city”: “Seattle”, “state”: “WA”} “jen1982”: {”name”:”Jennifer”, “street”:”1120 Foo Lane”, “city”:”San Francisco”, “state”:”CA”} “albert”: {”name”:”Albert”, “street”:”2364 South St.”, “city”:”Boston”, “state”:”MA”}
  • 24. SELECT name FROM Users WHERE state=”WA”
  • 25. SELECT name FROM Users WHERE state=”WA” How is WHERE clause formed?
  • 26. Act One Supercolumn Indexing
  • 27. <ColumnFamily Name=”LocationUserIndexSCF” CompareWith=”UTF8Type” CompareSubcolumnsWith=”UTF8Type” ColumnType=”Super” />
  • 28. [state]: { [city1]: {[name1]:[user1], [name2]:[user2], ... }, [city2]: {[name3]:[user3], [name4]:[user4], ... }, ... [cityX]: {[name5]:[user5], [name6]:[user6], ... } }
  • 29. “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 30. Row Key “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 31. Row Key Super Column “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 32. Row Key Colum Super Column n “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 33. Row Key Colum Super Column Value n “CA”: { “San Francisco”: {”Jennifer”: “jen1982”} } “MA”: { “Boston”: {”Albert”: “albert”} } “WA”: { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 34. Show me EVERYONE IN WASHINGTON
  • 35. get(:LocationUserIndexSCF, ‘WA’)
  • 36. { “Bellingham”: {”Jason”: “jason”}, “Seattle”: {”Ben”: “b”, ”Zack”: “zack”} }
  • 37. Act Two Composite Key Indexing
  • 38. Order Preserving Partitioner + Range Queries
  • 39. <ColumnFamily Name=”LocationUserIndexCF” CompareWith=”UTF8Type” />
  • 40. [state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... } [state1]/[city2]: {[name3]:[user3], [name4]:[user4], ... } [state2]/[city1]: {[name5]:[user5], [name6]:[user6], ... } ... [stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }
  • 41. “CA/San Francisco”: {”Jennifer”: “jen1982”} “MA/Boston”: {”Albert”: “albert”} “WA/Bellingham”: {”Jason”: “jason”} “WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”}
  • 42. Show me EVERYONE IN WASHINGTON
  • 43. get_range(:LocationUserIndexCF, {:start: 'WA', :finish:'WB'})
  • 44. { ”WA/Bellingham”: {”Jason”: “jason”}, “WA/Seattle”: {”Ben”: “b”, “Zack”: “zack”} }
  • 45. Finale BUILD SOMETHING AWESOME
  • 46. (This part is up to you)
  • 47. Appendix EXAMPLE KEYSPACE
  • 48. <Keyspace Name="UserDb"> <ColumnFamily Name="Users" CompareWith="UTF8Type" /> <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" /> <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" /> <ReplicaPlacementStrategy> org.apache.cassandra.locator.RackUnawareStrategy </ReplicaPlacementStrategy> <ReplicationFactor>1</ReplicationFactor> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch> </Keyspace>