Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Milan 2016

235 views

Published on

Apache Cassandra is a scalable database with high availability features. But they come with severe limitations in term of querying capabilities. Since the introduction of SASI in Cassandra 3.4, the limitations belong to the pass. Now you can create performant indices on your columns as well as benefit from full text search capabilities with the introduction of the new LIKE %term% syntax. To illustrate how SASI works, we'll use a database of 100 000 albums and artists.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

SASI, Cassandra on the full text search ride - DuyHai Doan - Codemotion Milan 2016

  1. 1. @doanduyhai SASI, Cassandra on the full text search ride DuyHai DOAN Apache Cassandra™ Evangelist Apache Zeppelin™ Commiter MILAN 25-26 NOVEMBER 2016
  2. 2. @doanduyhai Datastax •  Founded in April 2010 •  We contribute a lot to Apache Cassandra™ •  400+ customers (25 of the Fortune 100), 400+ employees •  Headquarter in San Francisco Bay area •  EU headquarter in London, offices in France and Germany •  Datastax Enterprise = better Apache Cassandra + extra features 2
  3. 3. @doanduyhai 1 5 minutes introduction to Apache Cassandra™ 2 SASI introduction 3 SASI cluster-wide 4 SASI local read/write path 5 Query planner 6 Take away 3
  4. 4. @doanduyhai Trademark Policy 4 From now on … Cassandra ⩵ Apache Cassandra™
  5. 5. @doanduyhai 5 minutes introduction to Apache Cassandra™
  6. 6. @doanduyhai The tokens 6 Random hash of #partition à token = hash(#p) Hash: ] –x, x ] hash range: 264 values x = 264/2 C* C* C* C* C* C* C* C*
  7. 7. @doanduyhai Token ranges 7 A: −x,− 3x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ B: − 3x 4 ,− 2x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ C: − 2x 4 ,− x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ D: − x 4 ,0 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ E: 0, x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ F: x 4 , 2x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ G: 2x 4 , 3x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ H : 3x 4 ,x ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ H A E D B C G F
  8. 8. @doanduyhai Distributed tables 8 H A E D B C G F user_id1 user_id2 user_id3 user_id4 user_id5 CREATE TABLE users( user_id int, …, PRIMARY KEY(user_id) ),
  9. 9. @doanduyhai Distributed tables 9 H A E D B C G F user_id1 user_id2 user_id3 user_id4 user_id5
  10. 10. @doanduyhai Coordinator node 10 Responsible for handling requests (read/write) Every node can be coordinator • masterless • no SPOF • proxy role H A E D B C G F coordinator request 1 2 3
  11. 11. @doanduyhai11 Q & A ! "
  12. 12. @doanduyhai SASI introduction
  13. 13. @doanduyhai What is SASI ? 13 •  SSTable-Attached Secondary Index à new 2nd index impl that follows SSTable life-cycle •  Objective: provide more performant & capable 2nd index
  14. 14. @doanduyhai Who created it ? 14 Open-source contribution by an engineers team
  15. 15. @doanduyhai Why is it better than native 2nd index ? 15 •  follow SSTable life-cycle (flush, compaction, rebuild …) à more optimized •  new data-strutures •  range query (<, ≤, >, ≥) possible •  full text search options
  16. 16. @doanduyhai16 Demo
  17. 17. @doanduyhai SASI cluster-wide
  18. 18. @doanduyhai Distributed index 18 On cluster level, SASI works exactly like native 2nd index H A E D B C G F UK user1 user102 … user493 US user54 user483 … user938 UK user87 user176 … user987 UK user17 user409 … user787
  19. 19. @doanduyhai Distributed search algorithm 19 H A E D B C G F coordinator 1st round Concurrency factor = 1
  20. 20. @doanduyhai Distributed search algorithm 20 H A E D B C G F coordinator Not enough results ?
  21. 21. @doanduyhai Distributed search algorithm 21 H A E D B C G F coordinator 2nd round Concurrency factor = 2
  22. 22. @doanduyhai Distributed search algorithm 22 H A E D B C G F coordinator Still not enough results ?
  23. 23. @doanduyhai Distributed search algorithm 23 H A E D B C G F coordinator 3rd round Concurrency factor = 4
  24. 24. @doanduyhai Caveat 1: non restrictive filters 24 H A E D B C G F coordinator Hit all nodes eventually L
  25. 25. @doanduyhai Caveat 1 solution : always use LIMIT 25 H A E D B C G F coordinator SELECT * FROM … WHERE ... LIMIT 1000
  26. 26. @doanduyhai Caveat 2: 1-to-1 index (user_email) 26 H A E D B C G F coordinator Not found WHERE user_email = ‘xxx'
  27. 27. @doanduyhai Caveat 2: 1-to-1 index (user_email) 27 H A E D B C G F coordinator Still no result WHERE user_email = ‘xxx'
  28. 28. @doanduyhai Caveat 2: 1-to-1 index (user_email) 28 H A E D B C G F coordinator At best 1 user found At worst 0 user found WHERE user_email = ‘xxx'
  29. 29. @doanduyhai Caveat 2 solution: materialized views 29 For 1-to-1 index/relationship, use materialized views instead CREATE MATERIALIZED VIEW user_by_email AS SELECT * FROM users WHERE user_id IS NOT NULL and user_email IS NOT NULL PRIMARY KEY (user_email, user_id) But range queries ( <, >, ≤, ≥) not possible …
  30. 30. @doanduyhai Caveat 3: fetch all rows for analytics use-case 30 H A E D B C G F coordinator Client
  31. 31. @doanduyhai Caveat 3 solution: use co-located Spark 31 H A E D B C G F Local index filtering in Cassandra Aggregation in Spark Local index query
  32. 32. @doanduyhai SASI local read/write path
  33. 33. @doanduyhai SASI Life-cycle: in-memory 33 Commit log1 . . . 1 Commit log2 Commit logn Memory . . . MemTable Table1 MemTable Table2 MemTable TableN 2 Index MemTable1 Index MemTable2 . . . Index MemTableN 3 ACK the client
  34. 34. @doanduyhai Local write path data structures 34 Index mode, data type Data structure Usage PREFIX, text Guava ConcurrentRadixTree name LIKE 'John%' CONTAINS, text Guava ConcurrentSuffixTree name LIKE ’%John%' name LIKE ’%ny’ PREFIX, other JDK ConcurrentSkipListSet age = 20 age >= 20 AND age <= 30 SPARSE, other JDK ConcurrentSkipListSet age = 20 age >= 20 AND age <= 30 suitable for 1-to-N index with N ≤ 5
  35. 35. @doanduyhai SASI Life-cycle: flush to SSTable 35 Commit log1 . . . 1 Commit log2 Commit logn Memory Table1 SStable1 Table2 Table3 SStable2 SStable3 4 OnDiskIndex1 OnDiskIndex2 OnDiskIndex3
  36. 36. @doanduyhai OnDiskIndex files 36 SStable1 SStable2 user_id4 FR user_id1 US user_id5 FR user_id3 UK user_id2 DE OnDiskIndex1 FR US OnDiskIndex2 UK DE B+Tree-like data structures
  37. 37. @doanduyhai Local read path 37 •  first, optimize query using Query Planer (see later) •  then load chunks (4k) of index files from disk into memory •  perform binary search to find the indexed value(s) •  retrieve the offset of the partition from the original SSTable •  read the original data
  38. 38. @doanduyhai Query Planner
  39. 39. @doanduyhai Query planner 39 •  build predicates tree •  predicates push-down & re-ordering •  predicate fusions for != operator
  40. 40. @doanduyhai Query optimization example 40 WHERE age < 100 AND fname LIKE 'p%' AND fname != 'pa%' AND age > 21
  41. 41. @doanduyhai Query optimization example 41 AND is associative and commutative
  42. 42. @doanduyhai Query optimization example 42 != transformed to exclusion on range scan
  43. 43. @doanduyhai Query optimization example 43 AND is associative and commutative
  44. 44. @doanduyhai Take Away
  45. 45. @doanduyhai Index mode caveat Beware of disk space usage for full text search !!! Table albums with ≈ 110 000 records, 6.8Mb data size 45
  46. 46. @doanduyhai SASI vs search engines SASI vs Solr/ElasticSearch ? •  Cassandra is not a search engine !!! (database = durability) •  always slower because 2 passes (SASI index read + original Cassandra data) •  no scoring •  no ordering (ORDER BY) •  no grouping (GROUP BY) à Apache Spark for analytics If you don’t need the above features, SASI is for you! 46
  47. 47. @doanduyhai SASI sweet spots SASI is a relevant choice if •  you need multi criteria search and you don't need ordering/grouping/scoring •  you mostly need 100 to 10000 of rows for your search queries •  you always know the partition keys of the rows to be searched for (this one applies to native secondary index too) 47
  48. 48. @doanduyhai SASI blind spots SASI is a poor choice if •  you have strong SLA on search latency, for example few millisecs requirement •  ordering of the search results is important for you 48
  49. 49. @doanduyhai49 Q & A ! "
  50. 50. @doanduyhai50 Thank You @doanduyhai duy_hai.doan@datastax.com

×