Cassandra Basics: Indexing

An introduction to indexing with supercolumns and range queries in Cassandra.

Published in: Technology, Education, Business

  1. Cassandra Basics: Indexing. Benjamin Black, b@b3k.us
  2. Relational stores are SCHEMA ORIENTED
  3. Start from your SCHEMA & WORK FORWARDS
  4. Column stores are QUERY ORIENTED
  5. Start from your QUERIES & WORK BACKWARDS
  6. AT SCALE
  7. AT SCALE Denormalization is THE NORM
  8. AT SCALE
  9. AT SCALE Everything depends on THE INDICES
  10. Cassandra is an INDEX CONSTRUCTION KIT
  11. Column Family
  12. Two-level Map: key: { column: value, column: value, ... }
  13. Super Column Family
  14. Three-level Map: key: { supercolumn: { column: value, column: value }, supercolumn: { ... } }
  15. Column sorting is defined by CompareWith / CompareSubcolumnsWith
  16. TimeUUIDType, UTF8Type, ASCIIType, LongType, LexicalUUIDType
  17. row placement determined by Partitioner
  18. RandomPartitioner: place based on MD5 of key. OrderPreservingPartitioner: place based on actual key.
  19. Rows are sorted by key on each node, regardless of partitioner
  20. One example in TWO ACTS
  21. Prelude: A USER DATABASE
  22. <ColumnFamily Name="Users" CompareWith="UTF8Type" />
  23. "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"} "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"} "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"} "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"} "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"}
  24. SELECT name FROM Users WHERE state="WA"
  25. SELECT name FROM Users WHERE state="WA" (How is the WHERE clause formed?)
  26. Act One: Supercolumn Indexing
  27. <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" />
  28. [state]: { [city1]: {[name1]: [user1], [name2]: [user2], ... }, [city2]: {[name3]: [user3], [name4]: [user4], ... }, ... [cityX]: {[name5]: [user5], [name6]: [user6], ... } }
  29. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  30. Row Key: "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  31. Row Key, Super Column: "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  32. Row Key, Super Column, Column: "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  33. Row Key, Super Column, Column, Value: "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  34. Show me EVERYONE IN WASHINGTON
  35. get(:LocationUserIndexSCF, 'WA')
  36. { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
  37. Act Two: Composite Key Indexing
  38. Order Preserving Partitioner + Range Queries
  39. <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
  40. [state1]/[city1]: {[name1]: [user1], [name2]: [user2], ... } [state1]/[city2]: {[name3]: [user3], [name4]: [user4], ... } [state2]/[city1]: {[name5]: [user5], [name6]: [user6], ... } ... [stateX]/[cityY]: {[name7]: [user7], [name8]: [user8], ... }
  41. "CA/San Francisco": {"Jennifer": "jen1982"} "MA/Boston": {"Albert": "albert"} "WA/Bellingham": {"Jason": "jason"} "WA/Seattle": {"Ben": "b", "Zack": "zack"}
  42. Show me EVERYONE IN WASHINGTON
  43. get_range(:LocationUserIndexCF, {:start => 'WA', :finish => 'WB'})
  44. { "WA/Bellingham": {"Jason": "jason"}, "WA/Seattle": {"Ben": "b", "Zack": "zack"} }
  45. Finale: BUILD SOMETHING AWESOME
  46. (This part is up to you)
  47. Appendix: EXAMPLE KEYSPACE
  48. <Keyspace Name="UserDb"> <ColumnFamily Name="Users" CompareWith="UTF8Type" /> <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" /> <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" /> <ReplicaPlacementStrategy> org.apache.cassandra.locator.RackUnawareStrategy </ReplicaPlacementStrategy> <ReplicationFactor>1</ReplicationFactor> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch> </Keyspace>
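
The two index layouts from the deck can be tried out without a cluster. What follows is a minimal sketch in plain Ruby, not the author's code and not the Cassandra client API: ordinary Hashes stand in for column families, and the helper names (build_supercolumn_index, build_composite_index, range_scan) are hypothetical. It builds both indexes from the Users rows on slide 23 and answers the "everyone in Washington" question each way. The range scan sorts the row keys to imitate what an order-preserving partitioner provides, and treats both bounds as inclusive, which is an assumption of this sketch.

```ruby
# Plain-Ruby model of the two index layouts; no Cassandra client is involved.
USERS = {
  'b'       => { 'name' => 'Ben',      'street' => '1234 Oak St.',   'city' => 'Seattle',       'state' => 'WA' },
  'jason'   => { 'name' => 'Jason',    'street' => '456 First Ave.', 'city' => 'Bellingham',    'state' => 'WA' },
  'zack'    => { 'name' => 'Zack',     'street' => '4321 Pine St.',  'city' => 'Seattle',       'state' => 'WA' },
  'jen1982' => { 'name' => 'Jennifer', 'street' => '1120 Foo Lane',  'city' => 'San Francisco', 'state' => 'CA' },
  'albert'  => { 'name' => 'Albert',   'street' => '2364 South St.', 'city' => 'Boston',        'state' => 'MA' }
}

# Act One: three-level map, state => { city => { name => user key } }.
def build_supercolumn_index(users)
  index = Hash.new { |h, state| h[state] = Hash.new { |c, city| c[city] = {} } }
  users.each do |key, cols|
    index[cols['state']][cols['city']][cols['name']] = key
  end
  index
end

# Act Two: two-level map with a composite row key, "state/city" => { name => user key }.
def build_composite_index(users)
  index = Hash.new { |h, row_key| h[row_key] = {} }
  users.each do |key, cols|
    index["#{cols['state']}/#{cols['city']}"][cols['name']] = key
  end
  index
end

# Emulates a key range scan over rows held in key order.
# Both bounds are inclusive here (an assumption of this sketch).
def range_scan(index, start, finish)
  index.keys.sort.select { |k| k >= start && k <= finish }
       .map { |k| [k, index[k]] }.to_h
end

scf = build_supercolumn_index(USERS)
cf  = build_composite_index(USERS)

p scf['WA']                   # one row lookup, like slide 35
p range_scan(cf, 'WA', 'WB')  # key range, like slide 43
```

The design difference shows up in the access pattern: the supercolumn index answers the question with a single row read, while the composite-key index depends on rows being stored in key order so that a contiguous key range covers every city in the state.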
