CouchConf Tokyo 2013_App Development with Documents Indexes and Queries
 

  • Example: running the mapper and reducer over all of the docs. Caveat: use careful guards to be sure view execution doesn't stop due to unbound variables.
  • First, walk through the options. Then mention Observe.
  • Mention in summary… Also, note there are other systems, such as Hadoop and ElasticSearch, that have a lot of merit to add, and there are systems like relational databases that don't address contemporary needs.

CouchConf Tokyo 2013_App Development with Documents Indexes and Queries Presentation Transcript

  • 1. App Development: Documents, Indexes and Queries. Matt Ingenthron, Director, Developer Solutions
  • 2. Agenda
      • Introduction to Indexing and Querying in Couchbase
      • The lifecycle of Couchbase Views
      • Indexing and Querying with related documents
      • Patterns
  • 3. INDEXING AND QUERYING
  • 4. Couchbase Server 2.0: Views
      • Views can cover a few different use cases
          – Primary Index
          – Simple secondary indexes (the most common)
          – Complex secondary, tertiary and composite indexes
          – Aggregation functions (reduction)
              • Example: count the number of North American Ales
          – Organizing related data
      • Built using Map/Reduce
          – Map function creates a matrix from document fields
          – Reduce function summarizes (reduces) information
          – Written using superfast JavaScript
  • 5. Querying from Views: Querying from Ruby Client
        blog = c.design_docs["blog"]
        blog.views          #=> ["recent_posts"]
        blog.recent_posts   #=> [#<Couchbase::ViewRow:9855800 @id="hello-world" @key="2009/01/15 15:52:20" @value="Hello World" @doc=nil @meta={} @views=[]>, ...]
        blog.recent_posts.each do |doc|
          # do something
          # with doc object
          doc.key     # gives the key argument of the emit()
          doc.value   # gives the value argument of the emit()
        end
  • 6. VIEW LIFECYCLE: DEFINE – BUILD – QUERY
  • 7. View Definition (in JavaScript), like: CREATE INDEX city ON brewery city;
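The map function from this slide is not in the transcript. A minimal sketch of what an index on a brewery's city might look like follows, assuming documents carry a `type` field and a `city` field. The `emit` stand-in and the sample documents are hypothetical and exist only so the snippet runs outside the server; inside Couchbase Server the view engine supplies `emit`.

```javascript
// Stand-in for the view engine: collects whatever the map function emits.
// Inside Couchbase Server, emit() is supplied by the view engine itself.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function, roughly in the spirit of: CREATE INDEX city ON brewery city;
function map(doc, meta) {
  // Guards: skip documents that are not breweries or have no city,
  // so a missing field never stops view execution.
  if (doc.type === "brewery" && doc.city) {
    emit(doc.city, null);
  }
}

// Hypothetical sample documents, for illustration only.
map({ type: "brewery", name: "Example Brewing", city: "Juneau" }, { id: "example_brewing" });
map({ type: "beer", name: "Example Ale" }, { id: "example_ale" });

console.log(rows);
```

Emitting `null` as the value keeps the index small; the guard reflects the caveat in the speaker notes about unbound variables.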
  • 8. Distributed Index Build Phase
      • Optimized for lookups, in-order access and aggregations
      • All view reads come from disk (different performance profile)
      • View builds against every document on every node
          – This is why you should group them in a design document
      • Automatically kept up to date
      [Diagram: three servers, each holding a set of active docs and a set of replica docs, Doc 1 through Doc 9 distributed across them]
  • 9. Dynamic Range Queries with Optional Aggregation
      • Efficiently fetch a row or group of related rows.
      • Queries use cached values from B-tree inner nodes when possible
      • Take advantage of in-order tree traversal with group_level queries
      Query: ?startkey="J"&endkey="K"
      Result: {"rows":[{"key":"Juneau","value":null}]}
      [Diagram: three servers, each holding a set of active docs and a set of replica docs, Doc 1 through Doc 9 distributed across them]
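To make the `startkey`/`endkey` semantics concrete, here is a small sketch that mimics a range query over sorted index rows. It is an illustration under assumptions: the row data is hypothetical, keys are compared with plain JavaScript string ordering rather than the server's collation rules, and a linear filter stands in for the B-tree traversal the server actually performs.

```javascript
// Mimics ?startkey=...&endkey=... over index rows (both bounds inclusive).
// A real index walks a B-tree in key order; sort + filter is only a stand-in.
function rangeQuery(rows, startkey, endkey) {
  const sorted = rows.slice().sort(function (a, b) {
    if (a.key < b.key) return -1;
    if (a.key > b.key) return 1;
    return 0;
  });
  return sorted.filter(function (row) {
    return row.key >= startkey && row.key <= endkey;
  });
}

// Hypothetical rows as a city index might hold them.
const indexRows = [
  { key: "Portland", value: null },
  { key: "Juneau", value: null },
  { key: "Kalamazoo", value: null },
  { key: "Boston", value: null }
];

const result = rangeQuery(indexRows, "J", "K");
console.log(JSON.stringify({ rows: result }));
```

Note that "Kalamazoo" falls outside the range because it sorts after the bare string "K".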
  • 10. Queries run against stale indexes by default
      • stale=update_after (default if nothing is specified)
          – always get fastest response
          – can take two queries to read your own writes
      • stale=ok
          – auto update will trigger eventually
          – might not see your own writes for a few minutes
          – least frequent updates -> least resource impact
      • stale=false
          – Use with Persistence observe if data needs to be included in view results
          – BUT be aware of the delay it adds; only use when really required
  • 11. Development vs. Production Views
      • Development views index a subset of the data.
      • Publishing a view builds the index across the entire cluster.
      • Queries on production views are scattered to all cluster members and results are gathered and returned to the client.
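The `stale` option is passed as a query parameter on the view request. The sketch below only builds the request URL; the host, port, bucket, design document and view names are all assumptions chosen for illustration, and `viewUrl` is a hypothetical helper, not part of any client library.

```javascript
// Hypothetical helper that assembles a view query URL with optional parameters.
function viewUrl(base, bucket, designDoc, view, params) {
  const query = new URLSearchParams(params).toString();
  const path = base + "/" + bucket + "/_design/" + designDoc + "/_view/" + view;
  return query ? path + "?" + query : path;
}

// Assumed names: localhost:8092, bucket "beer-sample", design doc "beer", view "by_city".
const defaultUrl = viewUrl("http://localhost:8092", "beer-sample", "beer", "by_city", {});
const freshUrl = viewUrl("http://localhost:8092", "beer-sample", "beer", "by_city", { stale: "false" });

console.log(defaultUrl); // no stale parameter: the server default applies
console.log(freshUrl);   // asks the server to update the index before answering
```

Leaving the parameter off is the cheap path; `stale=false` trades latency for fresher results, as the slide warns.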
  • 12. Emergent Schema
      • Falls out of your key-value usage
      • Helps to know what's efficient
      • Mostly you can relax
      "Capture the user's intent"
      JSON.org, Github API, Twitter API
  • 13. QUERY PATTERN: BASIC AGGREGATIONS
  • 14. Use a built-in reduce function with a group query
      • Let's find average abv for each brewery!
  • 15. We are reducing doc.abv with _stats
  • 16. Group reduce (reduce by unique key)
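The view code for these three slides is not in the transcript. The sketch below shows the general shape of the pattern: a map function emits `(brewery, abv)` pairs, and a grouped reduce summarizes them per key. The `emit`, `stats` and `groupReduce` functions are local stand-ins written for this example (on the server, the built-in `_stats` reduce and `group=true` do this work), and the `type` field and sample documents are assumptions.

```javascript
// Stand-in for the view engine's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: one row per beer, keyed by brewery, with abv as the value.
function map(doc, meta) {
  if (doc.type === "beer" && doc.brewery && typeof doc.abv === "number") {
    emit(doc.brewery, doc.abv);
  }
}

// Local imitation of what a _stats-style reduce reports for a list of numbers.
function stats(values) {
  return {
    sum: values.reduce(function (a, b) { return a + b; }, 0),
    count: values.length,
    min: Math.min.apply(null, values),
    max: Math.max.apply(null, values),
    sumsqr: values.reduce(function (a, b) { return a + b * b; }, 0)
  };
}

// Local imitation of a group query: reduce once per unique key, in key order.
function groupReduce(allRows) {
  const groups = new Map();
  allRows.forEach(function (row) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row.value);
  });
  return Array.from(groups.keys()).sort().map(function (key) {
    return { key: key, value: stats(groups.get(key)) };
  });
}

// Hypothetical sample documents.
map({ type: "beer", brewery: "New Belgium Brewing", abv: 5.5 }, { id: "b1" });
map({ type: "beer", brewery: "New Belgium Brewing", abv: 6.5 }, { id: "b2" });
map({ type: "beer", brewery: "Example Brewery", abv: 4.0 }, { id: "b3" });

const grouped = groupReduce(rows);
grouped.forEach(function (g) {
  console.log(g.key, "average abv:", g.value.sum / g.value.count);
});
```

The average is not stored directly; the application derives it as `sum / count` from the reduced value.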
  • 17. QUERY PATTERN: TIME BASED ROLLUPS
  • 18. Find patterns in beer comments by time
        {
          "type": "comment",
          "about_id": "beer_Enlightened_Black_Ale",
          "user_id": 525,
          "text": "tastes like college!",
          "updated": "2010-07-22 20:00:20"
        }
        { "id": "u525_c1" }
      Callout: timestamp (the "updated" field)
  • 19. Query with group_level=2 to get monthly rollups
  • 20. dateToArray() is your friend
      • String or Integer based timestamps
      • Output optimized for group_level queries
      • array of JSON numbers: [2012,9,21,11,30,44]
  • 21. group_level=2 results
      • Monthly rollup
      • Sorted by time; sort the query results in your application if you want to rank by value (no chained map-reduce)
  • 22. group_level=3 - daily results - great for graphing
      • Daily, hourly, minute or second rollup all possible with the same index.
      • http://crate.im/posts/couchbase-views-reddit-data/
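The rollup idea can be sketched as follows: emit the timestamp as an array, then group on the first N elements of the key. Everything here is a local imitation under assumptions. `dateToArrayStandIn` is a hypothetical replacement for the server's `dateToArray()` that only handles "YYYY-MM-DD hh:mm:ss" strings, `rollup` imitates a count reduce combined with `group_level`, and the second and third comments are invented sample data.

```javascript
// Hypothetical stand-in for dateToArray(): splits "YYYY-MM-DD hh:mm:ss" into numbers.
function dateToArrayStandIn(str) {
  return str.match(/\d+/g).map(Number);
}

// Stand-in for the view engine's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: one row per comment, keyed by its timestamp array.
function map(doc, meta) {
  if (doc.type === "comment" && doc.updated) {
    emit(dateToArrayStandIn(doc.updated), 1);
  }
}

// Imitates a count reduce with group_level=N: sum values per key prefix.
// Assumes rows are already in key order, as they would be in an index.
function rollup(allRows, groupLevel) {
  const counts = new Map();
  allRows.forEach(function (row) {
    const prefix = JSON.stringify(row.key.slice(0, groupLevel));
    counts.set(prefix, (counts.get(prefix) || 0) + row.value);
  });
  return Array.from(counts, function (entry) {
    return { key: JSON.parse(entry[0]), value: entry[1] };
  });
}

map({ type: "comment", updated: "2010-07-22 20:00:20" }, { id: "u525_c1" });
map({ type: "comment", updated: "2010-07-23 09:15:00" }, { id: "u526_c1" });
map({ type: "comment", updated: "2010-08-01 12:00:00" }, { id: "u527_c1" });

const monthly = rollup(rows, 2); // group_level=2: [year, month]
const daily = rollup(rows, 3);   // group_level=3: [year, month, day]
console.log(JSON.stringify(monthly));
console.log(JSON.stringify(daily));
```

The same emitted rows serve both rollups; only the group level changes at query time.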
  • 23. QUERY PATTERN: LEADERBOARD
  • 24. Aggregate value stored in a document
      • Let's find the top-rated beers!
        {
          "brewery": "New Belgium Brewing",
          "name": "1554 Enlightened Black Ale",
          "abv": 5.5,
          "description": "Born of a flood...",
          "category": "Belgian and French Ale",
          "style": "Other Belgian-Style Ales",
          "updated": "2010-07-22 20:00:20",
          "ratings": {
            "ingenthr": 5,
            "jchris": 4,
            "scalabl3": 5,
            "damienkatz": 1
          }
        }
      Callout: ratings (the "ratings" object)
  • 25. Sort each beer by its average rating
      • Let's find the top-rated beers!
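The slide's map function is missing from the transcript. One plausible sketch, assuming the `ratings` object shown on the previous slide, computes the average inside the map function and emits it as the key, so the index itself is ordered by rating. The `emit` stand-in, the descending sort (which imitates a `descending=true` query) and the second beer are hypothetical.

```javascript
// Stand-in for the view engine's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: key each beer by the average of its ratings.
function map(doc, meta) {
  if (doc.ratings && typeof doc.ratings === "object") {
    var total = 0;
    var count = 0;
    for (var user in doc.ratings) {
      if (typeof doc.ratings[user] === "number") {
        total += doc.ratings[user];
        count += 1;
      }
    }
    // Guard against beers with no numeric ratings (avoids dividing by zero).
    if (count > 0) {
      emit(total / count, doc.name);
    }
  }
}

map({
  name: "1554 Enlightened Black Ale",
  ratings: { ingenthr: 5, jchris: 4, scalabl3: 5, damienkatz: 1 }
}, { id: "beer_Enlightened_Black_Ale" });
// Hypothetical second beer, for illustration only.
map({ name: "Example Stout", ratings: { alice: 4, bob: 5 } }, { id: "beer_Example_Stout" });

// Imitates querying the view in descending key order.
const leaderboard = rows.slice().sort(function (a, b) { return b.key - a.key; });
console.log(leaderboard);
```

Because the average lives in the key, a descending query with a limit would return the top-rated beers without any reduce step.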
  • 26. QUERY PATTERN: COLLATION OF RELATED DOCS
  • 27. Join Through Collation. See Bradley Holt's presentation from CouchConf Boston: http://www.couchbase.com/couchconf-boston
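The referenced presentation is not reproduced here, so the following is only one common form of the collation technique, not necessarily the one it describes. The idea is to emit compound keys so that a parent document and its related documents sort next to each other. The `emit` stand-in, the `compareKeys` comparator (a simplification of index collation), the `type: "beer"` field and the second comment are assumptions.

```javascript
// Stand-in for the view engine's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: beers emit [id, 0]; comments emit [id of the beer, 1].
function map(doc, meta) {
  if (doc.type === "beer") {
    emit([meta.id, 0], doc.name);
  } else if (doc.type === "comment" && doc.about_id) {
    emit([doc.about_id, 1], doc.text);
  }
}

// Simplified ordering for [string, number] keys; the real index collates keys itself.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

// Documents arrive in arbitrary order.
map({ type: "comment", about_id: "beer_Enlightened_Black_Ale", user_id: 525, text: "tastes like college!" }, { id: "u525_c1" });
map({ type: "beer", name: "1554 Enlightened Black Ale" }, { id: "beer_Enlightened_Black_Ale" });
map({ type: "comment", about_id: "beer_Enlightened_Black_Ale", user_id: 526, text: "smooth" }, { id: "u526_c1" });

const collated = rows.slice().sort(function (a, b) { return compareKeys(a.key, b.key); });
console.log(JSON.stringify(collated, null, 2));
```

After sorting, the beer row comes first and its comments follow, so a single range query over `[id, 0]` to `[id, 1]` would fetch both.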
  • 28. Anti-patterns
      • Emitting the document or too much data into a view
          – Especially avoid including the doc itself in an emit() call
      • Reduces that don't reduce
          – If you implement a custom reduce, make sure it doesn't expand!
      • Expecting a query on an index to be as fast as a key-value lookup
          – Secondary indexes need to be built, happen asynchronously, and are cached at the filesystem level
      • Trying to do too much with one view
          – Instead, co-locate views in design documents, or have separate design documents
      • Note that sometimes you may need to make requests of multiple views
          – There is no direct method of doing a join, but there is a technique
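As an illustration of a reduce that does reduce, the sketch below keeps its output at a fixed size no matter how many rows it sees, and handles the rereduce pass. It follows the usual `(keys, values, rereduce)` shape of a custom reduce function; the input numbers are hypothetical and the direct calls at the bottom only imitate how the server would invoke it.

```javascript
// Custom reduce whose result is always a two-field object, never a growing list.
function reduce(keys, values, rereduce) {
  var out = { sum: 0, count: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // values are partial results produced by earlier reduce calls
      out.sum += values[i].sum;
      out.count += values[i].count;
    } else {
      // values are the raw numbers emitted by the map function
      out.sum += values[i];
      out.count += 1;
    }
  }
  return out;
}

// Imitates the server reducing two chunks and then combining them.
var partialA = reduce(null, [5.5, 6.5], false);
var partialB = reduce(null, [4.0], false);
var total = reduce(null, [partialA, partialB], true);
console.log(total);
```

A reduce that instead returned, say, the concatenated list of all values would grow with the data set, which is the expansion the slide warns against.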
  • 29. COUCHBASE INTEGRATION
  • 30. Integration with ElasticSearch
      1. ElasticSearch Query
      2. ElasticSearch Result
      3. Couchbase Multi-GET
      4. Couchbase Result
  • 31. The Learning Portal
      • Designed and built as a collaboration between MHE Labs and Couchbase
      • Serves as proof-of-concept and testing harness for Couchbase + ElasticSearch integration
      • Available for download and further development as open source code: https://github.com/couchbaselabs/learningportal
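The four steps can be sketched with in-memory stand-ins for both systems. Nothing below talks to a real ElasticSearch or Couchbase instance: `search`, `multiGet`, `searchDocuments` and the sample data are hypothetical and only show the order of operations, in which the search engine returns document IDs and the full documents come from a multi-get.

```javascript
// Hypothetical search index: maps searchable text to document IDs.
const searchIndex = [
  { id: "beer_Enlightened_Black_Ale", text: "1554 enlightened black ale" },
  { id: "beer_Example_Stout", text: "example stout" }
];

// Hypothetical key-value store holding the full documents.
const bucket = new Map([
  ["beer_Enlightened_Black_Ale", { name: "1554 Enlightened Black Ale", abv: 5.5 }],
  ["beer_Example_Stout", { name: "Example Stout", abv: 6.0 }]
]);

// Steps 1 and 2: query the search engine, get back matching IDs.
function search(term) {
  return searchIndex
    .filter(function (entry) { return entry.text.indexOf(term) !== -1; })
    .map(function (entry) { return entry.id; });
}

// Steps 3 and 4: fetch the documents for those IDs in one batch.
function multiGet(ids) {
  return ids.map(function (id) { return bucket.get(id); });
}

function searchDocuments(term) {
  return multiGet(search(term));
}

const found = searchDocuments("black ale");
console.log(found);
```

Keeping only IDs in the search results means the documents themselves are always read from the primary store.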
  • 32. Integration with Hadoop
      [Diagram: Ad Targeting Platform. Labels: Couchbase Server Cluster, Hadoop Cluster, Logs, flume flow, sqoop import, sqoop export]
  • 33. In Summary
      • Couchbase has Views for Indexing and Querying: Views are incremental map-reduce code that runs across all documents.
      • Views Allow Common Methods of Querying: Common patterns such as simple secondary indexes, count and average aggregations, and time series rollups are simple and fast.
      • Couchbase Integrates for Full Text and Large Analytics: Couchbase integrates with ElasticSearch, Hadoop and other systems.
  • 34. Q&A
  • 35. THANKS!