Couchbase TLV Dev track 04 - power techniques with indexing


Published in: Technology, Business

  • Built on the JavaScript V8 engine. Our query language is plain JavaScript, so map functions are very easy to write.
  • How we can write our map functions: We could use a Single Element key, or “Primary key.”
  • Group_level queries: We can use the built-in dateToArray JavaScript helper function to get a rollup of, e.g., documents edited by date (by year, month, day, hour etc.). With group_level=2 we'd segment by year, month, and with group_level=3 we'd segment by year, month, day, and so on.
  • Per data bucket, we have multiple Design Docs, each containing the view definitions for a number of views. This means the views in a Design Doc are batched together and incrementally updated as a unit. Best practice is to split views up by ownership or writer. For example, one Design Document holds all the views for the Frontend UI of the website, and another holds the views for the Backend Admin interface (used to list and edit users, posts etc.). In a worst-case scenario there would be 1 view in each of a dozen design documents, meaning we have 12 view functions to run, whereas we should structure it as multiple views per design document. Getting the balance right is important, though, as we also wouldn't want a design document with 100 views in it! When we change 1 view definition, the index for the ENTIRE design doc is updated, which is why it's logical to split views into relevant Design Doc categories.
  • First, walk through the options. Then mention Observe.
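The group_level rollup described in the notes above can be tried outside the server with a small simulation. This is only a sketch under stated assumptions: `dateToArraySketch`, `compareKeys`, `rollup`, and the in-memory `emit` are hypothetical stand-ins written for this example (inside a real view, `emit` and `dateToArray` are provided by the view engine), the sample documents are made up, and timestamps are assumed to be ISO 8601 UTC strings.

```javascript
// Stand-in for the view engine: emit() appends a row to an in-memory index.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Hypothetical local substitute for the built-in dateToArray() helper.
// Assumption: output is [year, month, day, hour, minute, second] in UTC.
function dateToArraySketch(timestamp) {
  var d = new Date(timestamp);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
          d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()];
}

// Map function in the style of the deck: compound array key, value 1.
function map(doc, meta) {
  emit(dateToArraySketch(doc.updated), 1);
}

// Compare two array keys element by element (numbers only in this sketch).
function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// Emulates a summing reduce queried with group_level=N: rows whose keys
// share the first N elements are summed into one result row.
function rollup(indexRows, groupLevel) {
  var sorted = indexRows.slice().sort(function (x, y) {
    return compareKeys(x.key, y.key);
  });
  var result = [];
  sorted.forEach(function (row) {
    var groupKey = row.key.slice(0, groupLevel);
    var last = result[result.length - 1];
    if (last && compareKeys(last.key, groupKey) === 0) {
      last.value += row.value;
    } else {
      result.push({ key: groupKey, value: row.value });
    }
  });
  return result;
}

// Hypothetical sample documents (IDs and timestamps are made up).
const docs = [
  { id: "c1", updated: "2012-07-09T18:45:00Z" },
  { id: "c2", updated: "2012-07-21T08:00:00Z" },
  { id: "c3", updated: "2012-08-26T11:15:00Z" }
];
docs.forEach(function (doc) { map(doc, { id: doc.id }); });

console.log(JSON.stringify(rollup(rows, 2))); // monthly rollup
console.log(JSON.stringify(rollup(rows, 3))); // daily rollup
```

With these three sample documents, group level 2 collapses the two July entries into a single `[2012,7]` row with value 2, while group level 3 keeps one row per day. The same emitted index serves both queries, which is the point the notes make.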

    1. 1. Developing with Couchbase: Power Techniques with Indexing Michael Nitschinger Engineer, Developer Solutions
    2. 2. Agenda • Introduction to Indexing and Querying in Couchbase • Understand Map/Reduce Basics • Architectural Overview • Simple Indexes • Simple Queries
    3. 3. Indexing and Querying
    4. 4. Views are Indexes • Indexes help to speed up access to data [Figure: an Index over documents Doc1 to Doc5]
    5. 5. Couchbase Server 2.0: Views • Storing and Indexing Data are separate processes • In RDBMS, Indexes are optimized based on fixed data types. • Map-Reduce is a flexible approach helping to Index unstructured data.
    6. 6. Map-Reduce in General • The map function locates data items and outputs optimized data structures • The reduce function aggregates the output from a map function. • Together: very good for semi-structured and distributed data. [Figure: several Map Outputs feeding a Reduce]
    7. 7. Couchbase Server Map-Reduce • In Couchbase, Map-Reduce is specifically used to create an Index. • Map functions are applied to JSON Documents and they output or “emit” a data structure designed to be rapidly queried and traversed. [Figure labels: CRUD Operations, MAP(), emit(), (processed)]
    8. 8. Couchbase Server Views • Create a View of beer names • Filter only Documents that have a JSON key type == beer and also have the JSON keys brewery_id and name • Output the beer name and an Alcohol By Volume (ABV) value
    9. 9. Couchbase Server Views • Views can cover a few different use cases - Simple secondary indexes (the most common) - Complex secondary, tertiary and composite indexes - Aggregation functions (reduction), for example counting the number of North American Ales - Organizing related data
    10. 10. Map() Function => Index • Every changed document goes through all map functions • Map: function(doc, meta) { emit(doc.username, …) } • doc is the document Content, meta its Metadata • emit() creates a row from the indexed key and the output value(s)
    11. 11. Single Element Keys (Text Key) • Map: function(doc, meta) { emit(, null) } • Text keys: u::1, u::2, u::3
    12. 12. Compound Keys (Array) • Array Based Index Keys get sorted as Strings, but can be grouped by array elements • Map: function(doc, meta) { emit(dateToArray(doc.timestamp), 1) } • Rows (array key dateToArray(doc.timestamp), value): [2012,7,9,18,45], 1; [2012,8,26,11,15], 1; [2012,9,13,2,12], 1
    13. 13. Indexing Architecture • Doc Updated in RAM Cache First • All Documents & Updates Pass Through View Engine • Indexer Updates Indexes After On Disk, in Batches [Figure: App Server writing Doc 1 to a Couchbase Server Node, showing Managed Cache, Replication Queue (to other node), Disk Queue, Disk and View Engine]
    14. 14. Buckets >> Design Documents >> Views • Indexers Are Allocated Per Design Doc • The views in a Design Doc are All Updated at Same Time [Figure: bucket Beer-Sample with design documents Beers and Breweries; view labels by_name, by_abv, all, location, beers]
    15. 15. Querying Views: Parameters
    16. 16. Parameters used in View Querying • key = "" - used for exact match of index-key • keys = [] - used for matching set of index-keys • startkey/endkey = "" - used for range queries on index-keys • startkey_docID/endkey_docID = "" - used for range queries on • stale=[false, update_after, true] - used to decide indexer behavior from client • group/group_level - used with reduces to aggregate with grouping
    17. 17. Query Pattern: Range
    18. 18. Index-Key Matching • Match a Single Index-Key: ?key="" • Index-Keys shown: u::1, u::7, u::2, u::5, u::6, u::4, u::3
    19. 19. Range Query • ?startkey="b1"&endkey="zn" and ?startkey="bz"&endkey="zz": pulls the Index-Keys in the UTF-8 range specified by the startkey and endkey. • ?startkey=""&endkey="": range of a single item (can also be done with the key= parameter). • Index-Keys shown: u::1, u::7, u::2, u::5, u::6, u::4, u::3
    20. 20. Index-Key Set Matches • Query Multiple in the Set (Array Notation): ?keys=["", ""] • Index-Keys shown: u::1, u::7, u::2, u::5, u::6, u::4, u::3
    21. 21. Query Pattern: Basic Aggregations
    22. 22. Simple secondary Index • Find the ABV for each brewery
    23. 23. Aggregation: Reducing doc.abv with _stats
    24. 24. Group reduce (reduce by unique key)
    25. 25. Querying from Views Querying from Ruby Client
    26. 26. Query Pattern: Time Based Rollups
    27. 27. Find Comment Counts By Time • Document (the "updated" field is the timestamp): { "type": "comment", "about_id": "beer_Enlightened_Black_Ale", "user_id": 525, "text": "tastes like college!", "updated": "2010-07-22 20:00:20" } • Metadata: { "id": "u525_c1" }
    28. 28. dateToArray() converts DateTime strings to Array of values • String or Integer based timestamps • Output optimized for group_level queries • Generates an array of JSON numbers: [2012,9,21,11,30,44]
    29. 29. Query with group_level=2 to get monthly rollups
    30. 30. group_level=3 - daily results - great for graphing • Daily, hourly, minute or second rollup all possible with the same index.
    31. 31. Query Pattern: Leaderboard
    32. 32. Aggregate value stored in a document • Let's find the top-rated beers! • Document (note the "ratings" field): { "brewery": "New Belgium Brewing", "name": "1554 Enlightened Black Ale", "style": "Other Belgian-Style Ales", "updated": "2010-07-22 20:00:20", "ratings": { "ingenthr": 5, "jchris": 4, "scalabl3": 5, "damienkatz": 1 }, "comments": [ "f1e62", "6ad8c" ] }
    33. 33. Sort each beer by its average rating • Let's find the top-rated beers!
    34. 34. Q&A
    35. 35. Thanks!
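The code for the leaderboard pattern (slides 32 and 33) did not survive in this transcript, so here is one way it might look. It is a minimal sketch, not the presenter's code: the map function emits each beer's average rating as the index key, and a small in-memory harness stands in for the view engine. The harness `emit`, the `topRated` helper, the document IDs, and the second beer are all hypothetical. Sorting in reverse and truncating mimics what the `descending` and `limit` view query parameters would do on a real view.

```javascript
// Stand-in for the view engine: emit() appends a row to an in-memory index.
const indexRows = [];
function emit(key, value) {
  indexRows.push({ key: key, value: value });
}

// Map function sketch: key = average rating, value = beer name.
// Documents without a name or without any ratings emit nothing.
function map(doc, meta) {
  if (doc.ratings && doc.name) {
    var total = 0;
    var count = 0;
    for (var user in doc.ratings) {
      total += doc.ratings[user];
      count += 1;
    }
    if (count > 0) {
      emit(total / count, doc.name);
    }
  }
}

// Emulates querying the view with descending order and a row limit.
function topRated(rows, limit) {
  return rows
    .slice()
    .sort(function (a, b) { return b.key - a.key; })
    .slice(0, limit);
}

// Sample data: the first document follows slide 32; the others are made up.
const beers = [
  { id: "beer_1554", doc: {
      name: "1554 Enlightened Black Ale",
      ratings: { ingenthr: 5, jchris: 4, scalabl3: 5, damienkatz: 1 } } },
  { id: "beer_hypothetical", doc: {
      name: "Hypothetical Pale Ale",
      ratings: { reviewer_a: 5, reviewer_b: 4 } } },
  { id: "beer_unrated", doc: { name: "Unrated Stout" } }
];
beers.forEach(function (entry) { map(entry.doc, { id: entry.id }); });

topRated(indexRows, 2).forEach(function (row) {
  console.log(row.key.toFixed(2) + "  " + row.value);
});
```

Because the average is the index key, the view's own sort order produces the ranking and no reduce function is needed. The trade-off is that a beer's row moves within the index every time its ratings change.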