Cassandra Basics: Indexing

An introduction to indexing with supercolumns and range queries in Cassandra.

Published in: Technology, Education, Business

  • Cassandra Basics: Indexing

    1. Cassandra Basics: Indexing. Benjamin Black, b@b3k.us
    2. Relational stores are SCHEMA ORIENTED
    3. Start from your SCHEMA & WORK FORWARDS
    4. Column stores are QUERY ORIENTED
    5. Start from your QUERIES & WORK BACKWARDS
    6. AT SCALE
    7. AT SCALE Denormalization is THE NORM
    8. AT SCALE
    9. AT SCALE Everything depends on THE INDICES
    10. Cassandra is an INDEX CONSTRUCTION KIT
    11. Column Family
    12. Two-level Map key: { column: value, column: value, ... }
    13. Super Column Family
    14. Three-level Map key: { supercolumn: { column:value, column: value }, supercolumn: { ... } }
    15. column sorting defined by CompareWith / CompareSubcolumnsWith
    16. TimeUUIDType UTF8Type ASCIIType LongType LexicalUUIDType
    17. row placement determined by Partitioner
    18. RandomPartitioner: place based on MD5 of key. OrderPreservingPartitioner: place based on actual key.
    19. Rows are sorted by key on each node, regardless of partitioner
    20. One example in TWO ACTS
    21. Prelude A USER DATABASE
    22. <ColumnFamily Name="Users" CompareWith="UTF8Type" />
    23. "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"} "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"} "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"} "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"} "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"}
    24. SELECT name FROM Users WHERE state="WA"
    25. SELECT name FROM Users WHERE state="WA". How is the WHERE clause formed?
    26. Act One Supercolumn Indexing
    27. <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" />
    28. [state]: { [city1]: {[name1]:[user1], [name2]:[user2], ... }, [city2]: {[name3]:[user3], [name4]:[user4], ... }, ... [cityX]: {[name5]:[user5], [name6]:[user6], ... } }
    29. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    30. Row Key. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    31. Row Key, Super Column. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    32. Row Key, Super Column, Column. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    33. Row Key, Super Column, Column, Value. "CA": { "San Francisco": {"Jennifer": "jen1982"} } "MA": { "Boston": {"Albert": "albert"} } "WA": { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    34. Show me EVERYONE IN WASHINGTON
    35. get(:LocationUserIndexSCF, 'WA')
    36. { "Bellingham": {"Jason": "jason"}, "Seattle": {"Ben": "b", "Zack": "zack"} }
    37. Act Two Composite Key Indexing
    38. Order Preserving Partitioner + Range Queries
    39. <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" />
    40. [state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... } [state1]/[city2]: {[name3]:[user3], [name4]:[user4], ... } [state2]/[city1]: {[name5]:[user5], [name6]:[user6], ... } ... [stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }
    41. "CA/San Francisco": {"Jennifer": "jen1982"} "MA/Boston": {"Albert": "albert"} "WA/Bellingham": {"Jason": "jason"} "WA/Seattle": {"Ben": "b", "Zack": "zack"}
    42. Show me EVERYONE IN WASHINGTON
    43. get_range(:LocationUserIndexCF, {:start => 'WA', :finish => 'WB'})
    44. { "WA/Bellingham": {"Jason": "jason"}, "WA/Seattle": {"Ben": "b", "Zack": "zack"} }
    45. Finale BUILD SOMETHING AWESOME
    46. (This part is up to you)
    47. Appendix EXAMPLE KEYSPACE
    48. <Keyspace Name="UserDb"> <ColumnFamily Name="Users" CompareWith="UTF8Type" /> <ColumnFamily Name="LocationUserIndexSCF" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" ColumnType="Super" /> <ColumnFamily Name="LocationUserIndexCF" CompareWith="UTF8Type" /> <ReplicaPlacementStrategy> org.apache.cassandra.locator.RackUnawareStrategy </ReplicaPlacementStrategy> <ReplicationFactor>1</ReplicationFactor> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch> </Keyspace>
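
The two acts can be modelled in a few lines of plain Python. This is only a sketch of the data layouts shown in the slides, using ordinary dicts in place of column families; it does not talk to Cassandra. The function names (`build_supercolumn_index`, `build_composite_index`, `get_range`) are hypothetical, and the inclusive range bounds are an assumption made for the sketch.

```python
from bisect import bisect_left, bisect_right

# Rows of the Users column family from the prelude (row key -> columns).
users = {
    "b": {"name": "Ben", "street": "1234 Oak St.", "city": "Seattle", "state": "WA"},
    "jason": {"name": "Jason", "street": "456 First Ave.", "city": "Bellingham", "state": "WA"},
    "zack": {"name": "Zack", "street": "4321 Pine St.", "city": "Seattle", "state": "WA"},
    "jen1982": {"name": "Jennifer", "street": "1120 Foo Lane", "city": "San Francisco", "state": "CA"},
    "albert": {"name": "Albert", "street": "2364 South St.", "city": "Boston", "state": "MA"},
}


def build_supercolumn_index(users):
    """Act One layout: state -> city -> {name: user key} (three-level map)."""
    index = {}
    for user_key, cols in users.items():
        cities = index.setdefault(cols["state"], {})
        cities.setdefault(cols["city"], {})[cols["name"]] = user_key
    return index


def build_composite_index(users):
    """Act Two layout: "state/city" -> {name: user key} (two-level map)."""
    index = {}
    for user_key, cols in users.items():
        row_key = f'{cols["state"]}/{cols["city"]}'
        index.setdefault(row_key, {})[cols["name"]] = user_key
    return index


def get_range(index, start, finish):
    """Return rows whose key satisfies start <= key <= finish.

    Sorting the keys stands in for the ordered row placement that an
    order-preserving partitioner provides; inclusive bounds are an assumption.
    """
    keys = sorted(index)
    lo = bisect_left(keys, start)
    hi = bisect_right(keys, finish)
    return {k: index[k] for k in keys[lo:hi]}


scf = build_supercolumn_index(users)
cf = build_composite_index(users)

# "Show me everyone in Washington", both ways.
washington_act_one = scf["WA"]
washington_act_two = get_range(cf, "WA", "WB")
```

In the first layout the query is a single row lookup by state. In the second it is a scan over a contiguous key range, which only works when rows with the same `state/` prefix sit next to each other in key order.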

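Why the range query in Act Two calls for the order-preserving partitioner can be illustrated with a small sketch. It compares ordering by the actual key with ordering by an MD5-derived token. The helper `md5_token` is hypothetical and simplified: it treats the 16-byte MD5 digest as an unsigned integer, which is an assumption and not a reproduction of Cassandra's own token computation.

```python
import hashlib


def md5_token(key: str) -> int:
    """Simplified stand-in for a RandomPartitioner token: MD5 digest as an integer."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


row_keys = ["CA/San Francisco", "MA/Boston", "WA/Bellingham", "WA/Seattle"]

# Order-preserving placement: rows are laid out by the key itself,
# so all "WA/..." rows are adjacent and a start/finish range can find them.
by_key = sorted(row_keys)

# Hash-based placement: rows are laid out by token, so keys sharing a
# prefix are scattered and a key range no longer maps to one contiguous span.
by_token = sorted(row_keys, key=md5_token)

for key in by_token:
    print(f"{md5_token(key):039d}  {key}")
```

Printing the tokens shows that the hash order has no relation to the `state/city` prefix, which is the trade-off between even distribution and usable key ranges.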