
Couchbase 103 - Views and Map-Reduce

Learn the architecture and use of Views, the structure of Map-Reduce functions, design documents, querying views and view query parameters, primary aggregate reduces and grouping, eventual consistency of indexes and strategies of use.

What will be covered during this training:

What are Indexes
What is Map-Reduce
Understanding Design Documents
Admin Console Overview
Anatomy of Map Functions
Batch Processing
Range Querying, Index-Key Querying, Set Querying
RDBMS Queries vs. Map-Reduce Queries
Grouping and Group Level
Eventual Consistency and Stale Parameter
Tips for Creating Views and Sandboxing Tests

Published in: Technology, Business

Transcript

  • 1. Monday, October 14, 13
  • 2. Couchbase 103: Views. Jasdeep Jaitla, Technical Evangelist. email: jasdeep@couchbase.com, twitter: @scalabl3
  • 5. WHAT IS A VIEW?
  • 6. Views are Indexes • Indexes are methods for speeding up access to information • Examples: - Dewey Decimal System - Card Catalogs - Hierarchical File Folders • In databases, Indexes are specialized structures for searching data, typically on one or two key fields
  • 7. Indexing Subsystem • Storing data and Indexing data are separate systems in all databases • In explicit schema scenarios (RDBMS), Indexes are optimized based on the data type(s) • In flexible schema scenarios, Map-Reduce is used to create indexes
  • 8. What is Map-Reduce? • Map-Reduce is a technique designed for dealing with Big Data and processing in parallel in distributed systems • Map-Reduce is also specifically designed for dealing with unstructured or semi-structured data • Map functions identify data with collections, process them, and output transformed values • Reduce functions take the output of Map functions and perform numeric aggregate calculations on them
  • 9. Views: Map-Reduce Indexes • In Couchbase, Map-Reduce is specifically used to create Indexes. • Map functions are applied to JSON documents and they output or "emit" data that is organized in an Index [diagram labels: CRUD Operations, MAP(), emit(), (processed)]
  • 10. Sample View
        function (doc, meta) {
          // if the JSON doc has these fields, emit the doc.name field
          if (doc.type == "beer" && doc.brewery_id && doc.name) {
            emit(doc.name, doc.abv);
          }
        }
    • Creates an Index of Beer Names (doc.name) and the Alcohol By Volume values (doc.abv)
      - Filters Documents
        • Only JSON Documents with json key doc.type == "beer"
        • and doc.brewery_id is non-null
        • and doc.name is non-null
      - Outputs
        • Beer Name (doc.name) [searchable]
        • Beer Alcohol By Volume (doc.abv) [row value]
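To make the filter-and-emit behaviour of the sample view concrete, here is a minimal sketch that runs the same map logic over a few in-memory documents. It is not how the Couchbase view engine is implemented: `buildIndex`, the sample documents, and passing `emit` as a third argument are assumptions made so the snippet runs on its own (in Couchbase, `emit` is provided as a global).

```javascript
// Hypothetical stand-in for the view engine: run a map function over every
// document, collect what it emits, and sort the rows by key.
function buildIndex(docs, mapFn) {
  const rows = [];
  for (const { meta, doc } of docs) {
    // emit is passed in explicitly here; in Couchbase it is a global.
    const emit = (key, value) => rows.push({ id: meta.id, key, value });
    mapFn(doc, meta, emit);
  }
  // Plain string comparison; Couchbase uses Unicode collation instead.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// The sample map function from the slide, with emit as a third argument.
function beerByName(doc, meta, emit) {
  if (doc.type == "beer" && doc.brewery_id && doc.name) {
    emit(doc.name, doc.abv);
  }
}

// Made-up documents: two beers and one brewery (which the filter skips).
const sampleDocs = [
  { meta: { id: "b1" }, doc: { type: "beer", brewery_id: "x", name: "Stout", abv: 6.1 } },
  { meta: { id: "b2" }, doc: { type: "brewery", name: "X Brewing" } },
  { meta: { id: "b3" }, doc: { type: "beer", brewery_id: "x", name: "Amber", abv: 5.0 } },
];

const beerIndex = buildIndex(sampleDocs, beerByName);
console.log(beerIndex);
```

Only the two beer documents produce rows, and the rows come back ordered by beer name, which is what makes the name searchable.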
  • 12. ARCHITECTURE
  • 14. Storage to Index [diagram labels: Application Server; storage ops; Couchbase Server with RAM Cache, EP Engine, Disk Write Queue, Replication Queue, and View Engine Indexers; Replica Couchbase Cluster Machine]
  • 18. Views: Eventual Consistency [same diagram with callouts: Time 1 (RAM Cache: storage ops, get), Time 2]
  • 19. Why Use Map-Reduce Indexes? • Index (Find) Documents by different JSON Values • Query Documents by JSON Values • Create Statistics and Aggregates. When are Indexes Necessary? • Documents are Keyed by Random Properties (UUID, GUID, etc.) • Iterating through Lists of Documents with Random Keys • Iterating through Lists of Documents on different JSON Properties (i.e. all User docs, all Product docs, by Timestamp, etc.)
  • 21. ANATOMY OF A VIEW
  • 32. Buckets >> Design Documents >> Views: a Couchbase Bucket contains Design Documents, and each Design Document contains Views (in the diagram, Design Document 1 has three Views and Design Document 2 has two). • Indexers Are Allocated Per Design Doc • The Views in a Design Document are All Updated at Same Time • Views Can Only Access Data in the Bucket Namespace
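As an illustration of the bucket, design document, and view hierarchy, the sketch below builds the JSON body of a design document holding two views. The `views` / `map` / `reduce` layout is the usual design document shape; the view names and map bodies are hypothetical and chosen only for this example.

```javascript
// Hypothetical design document with two views. View names are made up.
const designDoc = {
  views: {
    by_name: {
      // The map function is stored as source text.
      map: "function (doc, meta) { if (doc.type == 'beer' && doc.name) { emit(doc.name, doc.abv); } }",
    },
    abv_by_brewery: {
      map: "function (doc, meta) { if (doc.type == 'beer' && doc.brewery_id) { emit(doc.brewery_id, doc.abv); } }",
      // A built-in reduce is referenced by name.
      reduce: "_stats",
    },
  },
};

// Serialize to the JSON text that would be stored for the design document.
const designDocBody = JSON.stringify(designDoc, null, 2);
console.log(designDocBody);
```

Because both views live in one design document, they would be indexed together, which is why experimental views are better kept in a design document of their own.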
  • 40. Map() Function => Index: Every Document passes through View Map() functions
        function(doc, meta) {
          emit(doc.username, doc.email)
        }
    Callouts: doc = json doc; meta = doc metadata; emit() = create row; first argument = indexed key; second argument = output value(s)
  • 43. Single Element Keys (Text Key)
        function(doc, meta) {
          emit(doc.email, doc.points)
        }
    Callout: doc.email = text key
    Index rows (doc.email, doc.points): jasdeep@couchbase.com 1200; abba@couchbase.com 1000; zorro@couchbase.com 900
    meta.id values shown: u::1, u::35, u::20
  • 46. Compound Keys (Array): Array Based Index Keys get sorted as Strings, but can be grouped by array elements
        function(doc, meta) {
          emit(dateToArray(doc.timestamp), 1)
        }
    Callout: dateToArray(doc.timestamp) = array key
    meta.id   dateToArray(doc.timestamp)   value
    u::20     [2012,10,9,18,45]            1
    u::1      [2012,9,26,11,15]            1
    u::35     [2012,8,13,2,12]             1
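The compound-key idea can be sketched as follows. `dateToArraySketch` is a hypothetical stand-in for the `dateToArray` helper named on the slide (it stops at minutes because the slide's rows show five elements), and `countByPrefix` is an assumed illustration of grouping by the leading array elements, not the server's implementation.

```javascript
// Hypothetical equivalent of the dateToArray helper used on the slide:
// turns a timestamp into [year, month, day, hour, minute] (UTC).
function dateToArraySketch(timestamp) {
  const d = new Date(timestamp);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // JavaScript months are zero-based
    d.getUTCDate(),
    d.getUTCHours(),
    d.getUTCMinutes(),
  ];
}

// Group array keys by their first `level` elements and count the rows,
// roughly what a count reduce with grouping on array elements produces.
function countByPrefix(keys, level) {
  const counts = new Map();
  for (const key of keys) {
    const prefix = JSON.stringify(key.slice(0, level));
    counts.set(prefix, (counts.get(prefix) || 0) + 1);
  }
  return counts;
}

// Made-up timestamps for illustration.
const arrayKeys = [
  "2012-10-09T18:45:00Z",
  "2012-09-26T11:15:00Z",
  "2012-09-02T08:00:00Z",
].map(dateToArraySketch);

const perMonth = countByPrefix(arrayKeys, 2);
console.log(arrayKeys[0]);   // [2012, 10, 9, 18, 45]
console.log([...perMonth]);  // [["[2012,10]", 1], ["[2012,9]", 2]]
```

Grouping on the first two elements gives counts per month; using one element would give counts per year, and three would give counts per day.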
  • 48. QUERYING VIEWS
  • 49. View Query Parameters • key = "" - used for exact match of index-key • keys = [] - used for matching set of index-keys • startkey/endkey = "" - used for range queries on index-keys • startkey_docid/endkey_docid = "" - used for range queries on meta.id • stale = [false, update_after, ok] - used to decide indexer behavior from client • group/group_level - used with reduces to aggregate with grouping
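A rough sketch of how these parameters end up in a view query URL is shown below. The host, port, bucket, design document, and view names are assumptions for illustration, and `buildViewUrl` is a hypothetical helper rather than part of any SDK. The point it illustrates is that key-type parameters are sent as JSON values, so strings keep their quotes.

```javascript
// Parameters whose values are index keys and are sent as JSON text.
const JSON_PARAMS = new Set(["key", "keys", "startkey", "endkey"]);

// Hypothetical helper: assemble a view query URL from its parts.
function buildViewUrl(base, bucket, designDoc, view, params) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    query.set(name, JSON_PARAMS.has(name) ? JSON.stringify(value) : String(value));
  }
  const path = [bucket, "_design", designDoc, "_view", view]
    .map(encodeURIComponent)
    .join("/");
  return `${base}/${path}?${query.toString()}`;
}

// Assumed names: a local node, the beer-sample bucket, a "beer" design
// document, and a "by_name" view.
const viewUrl = buildViewUrl("http://localhost:8092", "beer-sample", "beer", "by_name", {
  startkey: "a",
  endkey: "b",
  stale: "false",
});
console.log(viewUrl);
```

In practice a client SDK builds this request for you; the sketch only shows where each parameter from the list goes.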
  • 51. Most Common Queries Are Ranges
    doc.email                meta.id
    abba@couchbase.com       u::1
    beta@couchbase.com       u::7
    jasdeep@couchbase.com    u::2
    math@couchbase.com       u::5
    matt@couchbase.com       u::6
    yeti@couchbase.com       u::4
    zorro@couchbase.com      u::3
    ?startkey="b1" & endkey="zZ" pulls the Index-Keys in the UTF-8 range specified by the startkey and endkey.
  • 52. Most Common Queries Are Ranges: a second example on the same doc.email index is ?startkey="bz" & endkey="zn".
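A range query over a sorted index can be sketched like this. The rows are the email index shown on the slides; `rangeQuery` is a hypothetical helper, and it uses plain JavaScript string comparison, which is an assumption that differs from Couchbase's collation for mixed-case keys. For that reason the example uses the simple bounds "b" and "n" instead of the slide's values.

```javascript
// Rows of the doc.email index, already sorted by index key.
const emailRows = [
  { key: "abba@couchbase.com", id: "u::1" },
  { key: "beta@couchbase.com", id: "u::7" },
  { key: "jasdeep@couchbase.com", id: "u::2" },
  { key: "math@couchbase.com", id: "u::5" },
  { key: "matt@couchbase.com", id: "u::6" },
  { key: "yeti@couchbase.com", id: "u::4" },
  { key: "zorro@couchbase.com", id: "u::3" },
];

// Inclusive range scan using plain JavaScript string comparison.
function rangeQuery(rows, startkey, endkey) {
  return rows.filter((row) => row.key >= startkey && row.key <= endkey);
}

const rangeHits = rangeQuery(emailRows, "b", "n").map((row) => row.id);
console.log(rangeHits); // ["u::7", "u::2", "u::5", "u::6"]
```

Everything from "beta" through "matt" falls inside the range; "abba" sorts before the start key and "yeti" and "zorro" sort after the end key.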
  • 55. Index-Key Matching: Match a Single Index-Key on the doc.email index with ?key="math@couchbase.com"
  • 56. Index-Key Set Matches: Query Multiple in the Set (Array Notation) with ?keys=["math@couchbase.com", "yeti@couchbase.com"]
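Exact-key and key-set matching can be sketched in the same style. `queryKey` and `queryKeys` are hypothetical helpers over an in-memory copy of a few index rows. This sketch returns matches in index order, which is an assumption and may not match the order a server returns.

```javascript
// A few rows of the doc.email index from the slides.
const indexRows = [
  { key: "jasdeep@couchbase.com", id: "u::2" },
  { key: "math@couchbase.com", id: "u::5" },
  { key: "matt@couchbase.com", id: "u::6" },
  { key: "yeti@couchbase.com", id: "u::4" },
];

// ?key="..." : rows whose index key equals the given key.
function queryKey(rows, key) {
  return rows.filter((row) => row.key === key);
}

// ?keys=[...] : rows whose index key is in the given set.
function queryKeys(rows, keys) {
  const wanted = new Set(keys);
  return rows.filter((row) => wanted.has(row.key));
}

const single = queryKey(indexRows, "math@couchbase.com").map((r) => r.id);
const several = queryKeys(indexRows, ["math@couchbase.com", "yeti@couchbase.com"]).map((r) => r.id);
console.log(single);  // ["u::5"]
console.log(several); // ["u::5", "u::4"]
```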
  • 57. Understanding Collation Order
    Byte Order: 0-9 < A-Z < a-z
    Unicode Collation: 0-9 < aAbBcCdDeEfFgGhHiIjJkKlLmM... and a < á < A < Á < b
    If it were Byte Order, 2 Queries Merged: startkey="y"&endkey="z" merged with startkey="Y"&endkey="Z"
    With Unicode Collation, one query gets both y and Y: startkey="y"&endkey="z"
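The difference between the two orderings can be seen with a small sketch. `byteOrder` is JavaScript's default string comparison (by code unit), and `simpleCollation` is a deliberately simplified, hypothetical stand-in for Unicode collation: it compares case-insensitively and then places lowercase before uppercase. It is not the ICU algorithm and ignores accents.

```javascript
// Byte-order style comparison: JavaScript's default string ordering
// compares UTF-16 code units, so "Y" (89) sorts before "y" (121).
const byteOrder = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Simplified stand-in for Unicode collation: compare case-insensitively
// first, then put lowercase before uppercase on a tie.
function simpleCollation(a, b) {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  if (la !== lb) return la < lb ? -1 : 1;
  return a === b ? 0 : a > b ? -1 : 1;
}

// Inclusive range test under a given comparison function.
function inRange(cmp, key, startkey, endkey) {
  return cmp(key, startkey) >= 0 && cmp(key, endkey) <= 0;
}

const names = ["Zed", "yeti", "Yak", "zorro"];

const byteSorted = [...names].sort(byteOrder);
const collated = [...names].sort(simpleCollation);
const byteRange = names.filter((n) => inRange(byteOrder, n, "y", "z"));
const collatedRange = names.filter((n) => inRange(simpleCollation, n, "y", "z"));

console.log(byteSorted);     // ["Yak", "Zed", "yeti", "zorro"]
console.log(collated);       // ["Yak", "yeti", "Zed", "zorro"]
console.log(byteRange);      // ["yeti"]
console.log(collatedRange);  // ["yeti", "Yak"]
```

Under byte order the range "y" to "z" misses "Yak", so a second query on "Y" to "Z" would be needed; under the collation-style ordering a single range picks up both cases.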
  • 58. Understanding Stale
    stale = UPDATE_AFTER (default if nothing is specified)
      - always get fastest response
      - can take two queries to read your own writes
    stale = OK
      - auto update will trigger eventually
      - might not see your own writes for a few minutes
      - least frequent updates -> least resource impact
    stale = FALSE
      - use with Persistence observe if data needs to be included in view results
      - but be aware of the delay it adds; only use when really required
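The three stale settings can be illustrated with a toy model. `ViewSim` is entirely hypothetical: it keeps a list of pending writes and an index, and updates the index synchronously, whereas a real server updates asynchronously and on its own schedule. It only shows the ordering of "answer" and "update" for each setting.

```javascript
// Toy model of an eventually consistent index (not the real view engine).
class ViewSim {
  constructor() {
    this.pending = []; // written but not yet indexed
    this.index = [];   // what queries can see
  }

  write(key) {
    this.pending.push(key);
  }

  updateIndex() {
    this.index.push(...this.pending);
    this.index.sort();
    this.pending = [];
  }

  query(stale) {
    if (stale === "false") this.updateIndex();         // update first, then answer
    const result = [...this.index];
    if (stale === "update_after") this.updateIndex();  // answer, then update
    return result;                                     // "ok": answer only
  }
}

const sim = new ViewSim();
sim.write("a");
const okResult = sim.query("ok");                  // [] : write not indexed yet
const firstAfter = sim.query("update_after");      // [] : answered before the update
const secondAfter = sim.query("update_after");     // ["a"] : second query sees the write
sim.write("b");
const freshResult = sim.query("false");            // ["a", "b"] : waits for the update
console.log(okResult, firstAfter, secondAfter, freshResult);
```

The two `update_after` calls mirror the slide's remark that it can take two queries to read your own writes.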
  • 59. Built-In Reduces • Are faster than creating your own reduces for the same information - _count • gives count for number of items in Index - _sum • sums value parameters (for numeric values only) - _stats • gives sum, count, min, max and sum of squares for statistics
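What the three built-in reduces compute can be written out in plain JavaScript. These functions are illustrative equivalents only, not the server's implementations, and the `sumsqr` field name for the sum of squares is an assumption about the result shape.

```javascript
// Illustrative equivalents of the built-in reduces over a list of values.
const countReduce = (values) => values.length;

const sumReduce = (values) => values.reduce((total, v) => total + v, 0);

function statsReduce(values) {
  return {
    sum: sumReduce(values),
    count: values.length,
    min: Math.min(...values),
    max: Math.max(...values),
    sumsqr: values.reduce((total, v) => total + v * v, 0), // sum of squares
  };
}

// Made-up abv values emitted by a map function.
const abvValues = [5, 8, 6];
const abvStats = statsReduce(abvValues);
console.log(countReduce(abvValues)); // 3
console.log(sumReduce(abvValues));   // 19
console.log(abvStats);               // { sum: 19, count: 3, min: 5, max: 8, sumsqr: 125 }
```

An average is not returned directly; it can be derived on the client as sum divided by count.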
  • 60. Custom Reduces • Are a bit tricky at first; it's a skill! • Learn about them through our docs and practice first. The most common problem in custom reduces is that they don't "reduce" the data • Can be creatively used! • Always build them in a separate Design Document to sandbox them from your existing Views; if you have a logic problem or error, it won't interrupt existing Views
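A minimal sketch of a custom reduce is shown below, assuming the usual `(key, values, rereduce)` function shape. It reimplements counting purely for illustration (the built-in `_count` should be preferred); the point is that the rereduce pass receives partial results rather than raw values, and that the function returns a single small value instead of passing the input through.

```javascript
// Custom reduce sketch: counts rows.
// First pass: values are the emitted values, so count them.
// Rereduce pass: values are earlier counts, so add them up.
function customCount(key, values, rereduce) {
  if (rereduce) {
    return values.reduce((total, partial) => total + partial, 0);
  }
  return values.length;
}

// Rough simulation of two partial reductions being combined.
const partialA = customCount(null, [5.0, 8.2], false);       // 2
const partialB = customCount(null, [6.1, 4.2, 7.0], false);  // 3
const totalCount = customCount(null, [partialA, partialB], true);
console.log(totalCount); // 5
```

A reduce that simply returned `values` would not shrink the data at all, which is the problem the slide warns about.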
  • 62. BEER SAMPLE VIEW
  • 64. Beer Sample Database Example
    meta
        {
          "id": "110f37fa30",
          "rev": "1-000000000",
          "expiration": 0,
          "flags": 0,
          "type": "json"
        }
    doc
        {
          "name": "Aventinus Weizenstarkbier / Doppel Weizen Bock",
          "abv": 8.2,
          "ibu": 0,
          "srm": 0,
          "upc": 0,
          "type": "beer",
          "brewery_id": "110f1f2012",
          "updated": "2010-07-22 20:00:20",
          "description": "Dark-ruby, almost black-colored and streaked with fine top-fermenting yeast, this beer has a compact and persistent head. This is a very intense wheat doppelbock with a complex spicy chocolate-like arome with a hint of banana and raisins. On the palate, you experience a soft touch and on the tongue it is very rich and complex, though fresh with a hint of caramel. It finishes in a rich soft and lightly bitter impression.",
          "style": "South German-Style Weizenbock",
          "category": "German Ale"
        }
    Callouts: document key ("id"), alcohol by volume ("abv"), brewery_id (key)
  • 68. Map Function - Index Definition [screenshot; callouts: +row, indexed key, value(s)]
  • 70. Result Set - Brewery IDs by Beer [screenshot; callouts: brewery_id, alcohol by volume (abv), document key (of the beer)]
  • 72. Reduce Values (doc.abv) with _stats [screenshot; callout: add _stats built-in reduction]
  • 74. Query with Group and Reduce: Find average alcohol by volume per brewery. [screenshot; callouts: add _stats built-in reduction, set group=true & reduce=true]
  • 76. Groups Brewery IDs, Reduces for Stats: Brewery IDs are Grouped, and _stats collected (Reduced) with group=true & reduce=true [screenshot; callouts: number of beers by this brewery, min abv, max abv]
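Since the screenshots did not survive in the transcript, the grouping step can be sketched in code. The rows below are made-up stand-ins for what a map function emitting (brewery_id, abv) would produce, and `groupStats` is a hypothetical helper that mimics grouping by key and collecting stats per group. It is an illustration under those assumptions, not the server's implementation.

```javascript
// Made-up rows, as if a map function had emitted (doc.brewery_id, doc.abv).
const emittedRows = [
  { key: "brewery_a", value: 4 },
  { key: "brewery_b", value: 5 },
  { key: "brewery_a", value: 8 },
];

// Group rows by key and collect stats for each group.
function groupStats(rows) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const s = groups.get(key) ||
      { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 };
    s.sum += value;
    s.count += 1;
    s.min = Math.min(s.min, value);
    s.max = Math.max(s.max, value);
    s.sumsqr += value * value;
    groups.set(key, s);
  }
  return groups;
}

const statsByBrewery = groupStats(emittedRows);

// Average abv per brewery is derived from sum and count.
const averageAbv = new Map(
  [...statsByBrewery].map(([brewery, s]) => [brewery, s.sum / s.count])
);
console.log([...averageAbv]); // [["brewery_a", 6], ["brewery_b", 5]]
```

The count in each group is the number of beers for that brewery, and min and max give the abv range, matching the callouts on the result slide.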
  • 78. INTERFACE DEMO
  • 80. Q & A