Deep Dive into N1QL: Power Features and Internals in Couchbase Server 4.0 – Couchbase Connect 2015

N1QL is a rich query language for JSON data. N1QL provides the following enhanced SQL statements: SELECT, INSERT, UPDATE, DELETE, MERGE. We’ll explain the advanced select-join-project-nest-unnest operations as well as data modification features in N1QL. We’ll also discuss basics of index selection and query planning in N1QL.

Published in: Technology

  1. 1. DEEP DIVE INTO N1QL: INTERNALS AND POWER FEATURES IN COUCHBASE 4.0 Keshav Murthy Couchbase Engineering keshav@couchbase.com @N1QL @rkeshavmurthy
  2. 2. ©2015 Couchbase Inc. Agenda: Query Service Overview; Query Service Architecture; N1QL Power Features; Q&A
  3. 3. Query Service Overview
  4. 4. Couchbase Server Cluster Architecture [Diagram: six nodes, Couchbase Server 1 to 6. Every node runs a Cluster Manager, a managed cache, and storage holding data shards, together with the Data Service, Index Service, and Query Service.]
  5. 5. Couchbase Server Cluster Service Deployment [Diagram: the same cluster with each service deployed on its own nodes: Data Service on Couchbase Server 1 to 3, Query Service on Couchbase Server 4 and 5, and Index Service on the remaining two nodes (both labeled Couchbase Server 6 in the diagram). Every node runs a Cluster Manager.]
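  6. 6. N1QL: Query Execution Flow
      1. Clients submit the query to the Query Service over the REST API.
      2. The Query Service parses and analyzes the query and creates a plan.
      3. The Query Service sends a scan request with index filters to the Index Service.
      4. The Index Service returns the qualified document keys.
      5. The Query Service sends a fetch request with those document keys to the Data Service.
      6. The Data Service returns the documents.
      7. The Query Service evaluates the documents to produce results.
      8. The query result is returned to the client.
      Example query:
        SELECT c_id, c_first, c_last, c_max FROM CUSTOMER WHERE c_id = 49165;
      Example result:
        { "c_first": "Joe", "c_id": 49165, "c_last": "Montana", "c_max": 50000 }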
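The eight steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the dictionaries `DATA_SERVICE` and `INDEX_SERVICE`, the `cust::` key format, and all function names are hypothetical stand-ins, not Couchbase APIs.

```python
# Toy in-memory stand-ins for the services (all names are hypothetical).
DATA_SERVICE = {
    "cust::49165": {"c_id": 49165, "c_first": "Joe", "c_last": "Montana", "c_max": 50000},
    "cust::10": {"c_id": 10, "c_first": "Ann", "c_last": "Lee", "c_max": 1000},
}
# A secondary index on c_id: indexed value -> document keys.
INDEX_SERVICE = {49165: ["cust::49165"], 10: ["cust::10"]}


def index_scan(c_id):
    """Steps 3-4: send the filter to the index and get qualified document keys."""
    return INDEX_SERVICE.get(c_id, [])


def fetch(doc_keys):
    """Steps 5-6: fetch the documents for the given keys from the data service."""
    return [DATA_SERVICE[k] for k in doc_keys if k in DATA_SERVICE]


def run_query(c_id, fields):
    """Steps 2-8 for: SELECT <fields> FROM CUSTOMER WHERE c_id = <c_id>."""
    keys = index_scan(c_id)
    docs = fetch(keys)
    # Step 7: re-apply the filter, then project the requested attributes.
    matching = [d for d in docs if d.get("c_id") == c_id]
    return [{f: d[f] for f in fields if f in d} for d in matching]


result = run_query(49165, ["c_id", "c_first", "c_last", "c_max"])
print(result)
```

The point of the sketch is the division of labor: the index only ever returns keys, and the documents themselves always come from the data service.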
  7. 7. Query Service Architecture
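  8. 8. Inside a Query Service [Diagram: the client sends a query to the Query Service, which runs it through the operator pipeline Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, Project. The Query Service talks to the Index Service and the Data Service.]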
  9. 9. Client to Query Service: REST API
      • The communication protocol is REST on top of HTTP.
      • The database protocol structure is embedded within the REST API.
      • The Query Service is stateless: all query information is embedded within the REST request.
      • REST is open: any REST client works with N1QL.
      • All N1QL clients, including the JDBC and ODBC drivers, use REST.
      Example (Python):
        import requests

        url = "http://localhost:8093/query"
        s1 = "SELECT * FROM CUSTOMER WHERE C_ID = 1284"
        # Send the query text as the "statement" request parameter.
        r = requests.post(url, data={"statement": s1}, auth=("Administrator", "abc"))
        print(r.json())
  10. 10. Query Execution: Parse & Semantic Check
      • Analyzes the query for syntax and grammar.
      • Verifies only that the referenced buckets exist.
      • Because the schema is flexible, you can refer to arbitrary attribute names.
      • Use IS MISSING or IS NOT MISSING to check whether a key name is present.
      • Full reference to the JSON structure:
        – Nested reference: CUSTOMER.contact.address.state
        – Array reference: CUSTOMER.c_contact.phone_number[0]
      • SQL is enhanced to access and manipulate arrays.
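A minimal Python sketch of how nested and array references can resolve against a schemaless document, and why a missing attribute differs from a null one. The `MISSING` sentinel and the `resolve` helper are illustrative assumptions, not part of any Couchbase SDK.

```python
MISSING = object()  # sentinel: the attribute is absent (distinct from JSON null / None)


def resolve(doc, path):
    """Follow a list of attribute names and array positions through a document."""
    current = doc
    for step in path:
        if isinstance(step, int):
            if not isinstance(current, list) or not (0 <= step < len(current)):
                return MISSING
            current = current[step]
        else:
            if not isinstance(current, dict) or step not in current:
                return MISSING
            current = current[step]
    return current


customer = {
    "contact": {"address": {"state": "CA"}},
    "c_contact": {"phone_number": ["555-123-1234"]},
    "c_middle": None,  # present, but null
}

state = resolve(customer, ["contact", "address", "state"])
first_phone = resolve(customer, ["c_contact", "phone_number", 0])
fax_is_missing = resolve(customer, ["c_contact", "fax"]) is MISSING
middle_is_missing = resolve(customer, ["c_middle"]) is MISSING
print(state, first_phone, fax_is_missing, middle_is_missing)
```

No schema is consulted at any point: a path that does not exist simply evaluates to the missing marker instead of raising an error.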
  11. 11. Query Execution: Parse & Semantic Check
        SELECT c_zip, COUNT(c_id), AVG(c_balance)
        FROM CUSTOMER
        WHERE c_state = 'CA' AND c_year = 2014
        GROUP BY c_zip
        ORDER BY COUNT(c_id) DESC
        LIMIT 100
      • SELECT list: simple references to attribute names, just like columns; use expressions, just like SQL.
      • FROM: table/keyspace/bucket references.
      • WHERE: filters on the JSON document work just like SQL.
      • ORDER BY: sorting of the result set.
      • LIMIT: top-N clause.
  12. 12. Query Execution: Plan
      • Each query can be executed in several ways.
      • Create the query execution plan:
        – Choose the access path for each keyspace reference.
        – Decide which filters to push down.
        – Determine the join order and join method.
        – Create the execution tree.
      • For each keyspace reference:
        – Look at the available indices.
        – Match the filters in the query with the index keys.
        – Choose one or more indices for each keyspace.
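To make "match the filters in the query with the index keys" concrete, here is a deliberately simplified Python sketch. The scoring rule (prefer the index with the longest run of leading keys covered by filters) is an assumption chosen for illustration; it is not a description of the actual N1QL planner.

```python
def usable_prefix(index_keys, filter_attrs):
    """Count how many leading index keys are covered by query filters."""
    count = 0
    for key in index_keys:
        if key not in filter_attrs:
            break
        count += 1
    return count


def choose_index(indexes, filter_attrs):
    """Pick the index whose leading keys match the most filters, or None."""
    best_name, best_len = None, 0
    for name, keys in indexes.items():
        n = usable_prefix(keys, filter_attrs)
        if n > best_len:
            best_name, best_len = name, n
    return best_name


indexes = {
    "CU_W_ID_D_ID_LAST": ["c_w_id", "c_d_id", "c_last"],
    "idx_cust_city": ["c_city"],
}
chosen = choose_index(indexes, {"c_w_id", "c_d_id", "c_last"})
print(chosen)
```

A filter on `c_last` alone selects nothing in this sketch, because `c_last` is not a leading key of any index.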
  13. 13. Query Execution: Plan
        EXPLAIN SELECT c_id, c_first, c_middle, c_last, c_balance
        FROM CUSTOMER
        WHERE c_w_id = 49 AND c_d_id = 16 AND c_last = 'Montana';
      • EXPLAIN provides the JSON representation of the query plan.
      • Focus on the index selection and the predicates pushed down.
  14. 14. Query Execution: Plan
        {
          "#operator": "IndexScan",
          "index": "CU_W_ID_D_ID_LAST",
          "keyspace": "CUSTOMER",
          …
          "spans": [
            {
              "Range": {
                "High": [ "49", "16", "\"Montana\"" ],
                "Inclusion": 3,
                "Low": [ "49", "16",
  15. 15. Query Execution: Plan
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "Fetch",
              "keyspace": "CUSTOMER",
              "namespace": "default"
            },
            {
              "#operator": "Filter",
              "condition": "((((`CUSTOMER`.`C_W_ID`) = 49) and ((`CUSTOMER`.`C_D_ID`) = 16)) and ((`CUSTOMER`.`C_LAST`) = \"Montana\"))"
            },
  16. 16. Query Execution: Project
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "Project",
              "keyspace": "CUSTOMER",
              "namespace": "default"
            },
            {
  17. 17. Query Execution: Scan [Diagram: the Query Service, using the cluster map, reaches the data through one of several scan paths: a KeyScan directly against the Data Service, or an IndexScan against a Global Secondary Index or a View Index. The scan is followed by the data fetch from the Data Service.]
  18. 18. Query Execution: Fetch
      • The list of qualified document keys is grouped into batches.
      • The list of document keys is obtained from the index or specified directly via the USE KEYS clause.
      • Fetch requests are issued in parallel.
      • The join operation uses the fetch operation to get the matching documents.
      • Fetch results are streamed into the next operators.
      • For big queries, scan, fetch, join, filter, and aggregation execute in parallel.
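The batching and parallel fetch described above might look roughly like the following Python sketch. The in-memory `STORE`, the batch size, and the worker count are all hypothetical; a thread pool stands in for whatever concurrency the Query Service really uses.

```python
from concurrent.futures import ThreadPoolExecutor

STORE = {f"1.10.{i}": {"C_ID": i} for i in range(10)}  # hypothetical data service


def batches(keys, size):
    """Split the list of document keys into fixed-size batches."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def fetch_batch(batch):
    """Fetch one batch; keys that no longer exist are skipped."""
    return [STORE[k] for k in batch if k in STORE]


def parallel_fetch(keys, batch_size=4, workers=3):
    """Fetch all batches concurrently and stream documents to the caller."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for docs in pool.map(fetch_batch, batches(keys, batch_size)):
            yield from docs


fetched = list(parallel_fetch(list(STORE)))
print(len(fetched))
```

Because `parallel_fetch` is a generator, downstream operators (join, filter, aggregate) can start consuming documents before every batch has arrived.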
  19. 19. Query Execution: Join
      • You can join any two keyspaces if one has the document key of the other.
      • You can store multiple entities within the same bucket and join between distinct groups.
      • Currently uses nested loop joins.
      • Joins are performed in the order specified in the query.
      • Index selection is important for the first keyspace in the FROM clause.
      • Qualified documents from that scan are joined with the other keyspace using the document keys.
  20. 20. Query Execution: Join
      CUSTOMER document (document key: "1.10.1938"):
        "CUSTOMER": { "C_D_ID": 10, "C_ID": 1938, "C_W_ID": 1, "C_BALANCE": -10, "C_CITY": "San Jose", "C_CREDIT": "GC", "C_DELIVERY_CNT": 0, "C_DISCOUNT": 0.3866, "C_FIRST": "Jay", "C_LAST": "Smith", "C_MIDDLE": "OE", "C_PAYMENT_CNT": 1, "C_PHONE": "555-123-1234", "C_SINCE": "2015-03-22 00:50:42.822518", "C_STATE": "CA", "C_STREET_1": "555, Tideway Drive", "C_STREET_2": "Alameda", "C_YTD_PAYMENT": 10, "C_ZIP": "94501" }
      ORDERS document (document key: "1.10.143"):
        "ORDERS": { "O_CUSTOMER_KEY": "1.10.1938", "O_D_ID": 10, "O_ALL_LOCAL": 1, "O_CARRIER_ID": 2, "O_C_ID": 1938, "O_ENTRY_D": "2015-05-19 16:22:08.544472", "O_ID": 143, "O_OL_CNT": 10, "O_W_ID": 1 }
      ORDERS document (document key: "1.10.1355"):
        "ORDERS": { "O_CUSTOMER_KEY": "1.10.1938", "O_ALL_LOCAL": 1, "O_CARRIER_ID": 2, "O_C_ID": 1938, "O_D_ID": 10, "O_ENTRY_D": "2015-05-19 16:22:08.544472", "O_ID": 1355, "O_OL_CNT": 10, "O_W_ID": 3 }
  21. 21. Query Execution: Join
        SELECT COUNT(o.O_ORDER_CNT) AS CNT_O_OL_CNT
        FROM ORDERS o INNER JOIN CUSTOMER c ON KEYS (o.O_CUSTOMER_KEY)
        WHERE o.O_CARRIER_NAME = "Penske" AND c.C_STATE = "CA";
      • FROM ... INNER JOIN: a join of two keyspaces.
      • ON KEYS: the ON clause for the join.
  22. 22. N1QL: Join
        SELECT *
        FROM ORDERS o
          INNER JOIN CUSTOMER c ON KEYS (o.O_C_ID)
          LEFT JOIN PREMIUM p ON KEYS (c.C_PR_ID)
          LEFT JOIN demographics d ON KEYS (c.c_DEMO_ID)
      • Supports INNER and LEFT OUTER joins.
      • The join order follows the order in the FROM clause.
      • N1QL currently supports nested loop joins.
      • A join always goes from a key stored in one document (outer table) to the document key of the second document (inner table).
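A key-based nested loop join can be pictured with this small Python sketch. The two dictionaries model keyspaces keyed by document key; the helper `join_on_keys` and the sample data are made up for illustration.

```python
CUSTOMER = {"1.10.1938": {"C_ID": 1938, "C_STATE": "CA"}}
ORDERS = {
    "1.10.143": {"O_ID": 143, "O_CUSTOMER_KEY": "1.10.1938"},
    "1.10.999": {"O_ID": 999, "O_CUSTOMER_KEY": "1.10.0000"},  # no such customer
}


def join_on_keys(outer_docs, inner_keyspace, key_attr, left_outer=False):
    """Nested loop join: for each outer document, look up the inner document by key."""
    rows = []
    for outer in outer_docs:
        inner = inner_keyspace.get(outer.get(key_attr))
        if inner is not None:
            rows.append({"o": outer, "c": inner})
        elif left_outer:
            rows.append({"o": outer})  # keep the outer row; the inner side is absent
    return rows


inner_rows = join_on_keys(ORDERS.values(), CUSTOMER, "O_CUSTOMER_KEY")
left_rows = join_on_keys(ORDERS.values(), CUSTOMER, "O_CUSTOMER_KEY", left_outer=True)
print(len(inner_rows), len(left_rows))
```

The inner lookup is a direct key fetch rather than a scan, which is why the outer document must carry the document key of the inner one.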
  23. 23. Query Execution: Filter
      • Filters that were not pushed down to the index scan still have to be applied.
      • Since the indices are maintained asynchronously, the filters are applied again after the fetch to ensure the integrity of the result set.
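Why re-applying the filter matters can be shown with a tiny Python sketch. The stale index below is a contrived assumption: it still lists a key whose document has since changed.

```python
documents = {
    "k1": {"c_last": "Montana"},
    "k2": {"c_last": "Rice"},  # updated after the index entry was written
}
# Hypothetical stale index: it still lists k2 under "Montana".
stale_index = {"Montana": ["k1", "k2"]}


def scan_fetch_filter(last_name):
    """Scan the index, fetch the documents, then re-check the predicate."""
    keys = stale_index.get(last_name, [])
    docs = [(k, documents[k]) for k in keys if k in documents]
    return [k for k, d in docs if d.get("c_last") == last_name]


matched = scan_fetch_filter("Montana")
print(matched)
```

Without the final predicate check, `k2` would be returned even though its current document no longer satisfies the filter.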
  24. 24. Query Execution: Aggregate, Sort, Offset, Limit
      • Each stream creates partial groupings and aggregates.
      • The result set is sorted to evaluate the ORDER BY clause.
      • The sort is done in parallel.
      • OFFSET and LIMIT are typically used for pagination.
      • They are evaluated after the ORDER BY clause.
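As a rough Python illustration of partial aggregation followed by sort, OFFSET, and LIMIT: each stream computes its own counts, the partial results are merged, and pagination is applied last. The two sample streams and the tie-breaking rule are assumptions made for the example.

```python
from collections import Counter


def partial_counts(stream):
    """Each stream builds its own partial GROUP BY C_ZIP count."""
    return Counter(doc["C_ZIP"] for doc in stream)


def merge_counts(partials):
    """Combine the partial aggregates into the final groups."""
    total = Counter()
    for p in partials:
        total.update(p)
    return total


def page(groups, offset, limit):
    """ORDER BY count DESC (ties broken by zip), then apply OFFSET and LIMIT."""
    ordered = sorted(groups.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[offset:offset + limit]


stream_a = [{"C_ZIP": "94501"}, {"C_ZIP": "94501"}, {"C_ZIP": "10001"}]
stream_b = [{"C_ZIP": "94501"}, {"C_ZIP": "10001"}, {"C_ZIP": "60601"}]
groups = merge_counts([partial_counts(stream_a), partial_counts(stream_b)])
first_page = page(groups, offset=0, limit=2)
print(first_page)
```

Pagination has to come after the sort: slicing before ordering would return an arbitrary subset rather than the top rows.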
  25. 25. Query Execution: Project
        SELECT C_ZIP, COUNT(*) AS NUMCUSTOMERS
        FROM CUSTOMER
        GROUP BY C_ZIP
        ORDER BY COUNT(*) DESC
        LIMIT 10;
      Response:
        {
          "requestID": "ff49a6e6-35f0-4eac-8d74-aa8a0aab58e7",
          "signature": { "C_ZIP": "json", "NUMCUSTOMERS": "number" },
          "results": [
            { "C_ZIP": "304811111", "NUMCUSTOMERS": 12 },
            ...
            { "C_ZIP": "709811111", "NUMCUSTOMERS": 10 }
          ],
          "status": "success",
          "metrics": {
            "elapsedTime": "1.57600634s",
            "executionTime": "1.575851088s",
            "resultCount": 10,
            "resultSize": 228
          }
        }
      • "signature": the signature of the result set.
      • "results": the projection.
      • "status" and "metrics": query execution and result set information.
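A client might unpack a response of this shape as in the Python sketch below. The sample text is a trimmed, hand-edited version of the response on the slide (two rows instead of ten), and `rows_from_response` is a hypothetical helper, not an SDK function.

```python
import json

# Trimmed sample shaped like the response on the slide (values shortened by hand).
response_text = """
{
  "requestID": "ff49a6e6-35f0-4eac-8d74-aa8a0aab58e7",
  "signature": {"C_ZIP": "json", "NUMCUSTOMERS": "number"},
  "results": [
    {"C_ZIP": "304811111", "NUMCUSTOMERS": 12},
    {"C_ZIP": "709811111", "NUMCUSTOMERS": 10}
  ],
  "status": "success",
  "metrics": {"elapsedTime": "1.57600634s", "resultCount": 2}
}
"""


def rows_from_response(text):
    """Return the result rows, or raise if the status is not 'success'."""
    body = json.loads(text)
    if body.get("status") != "success":
        raise RuntimeError(f"query failed with status {body.get('status')!r}")
    return body["results"]


rows = rows_from_response(response_text)
for row in rows:
    print(row["C_ZIP"], row["NUMCUSTOMERS"])
```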
  26. 26. N1QL Power Features: USE KEYS
  27. 27. Power Features: USE KEYS [Diagram: the Query Service, using the cluster map, can reach the data through a KeyScan directly against the Data Service, or through an IndexScan against a Global Secondary Index or a View Index, followed by the data fetch.]
  28. 28. Power Features: USE KEYS
        SELECT c_id, c_first, c_middle, c_last, (c_max - c_balance)
        FROM CUSTOMER USE KEYS ['1.10.1938'];
      • KeyScan: uses the Couchbase cluster map directly to get the document.
      • You can give one or more values in the array.
      • From N1QL, get the keys via META(CUSTOMER).id.
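The idea of locating a document from its key alone, with no index involved, can be sketched in Python as follows. Everything here is an assumption made for illustration: the partition count, the node names, and the use of `zlib.crc32` as a stand-in hash. The server's real key-to-partition mapping may differ.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical; chosen small for the example
NODES = ["server1", "server2", "server3"]
# Hypothetical cluster map: partition number -> node that owns it.
CLUSTER_MAP = {p: NODES[p % len(NODES)] for p in range(NUM_PARTITIONS)}


def partition_for(key):
    """Hash the document key to a partition (CRC32 is a stand-in hash here)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def node_for(key):
    """Look up the owning node directly from the cluster map, with no index scan."""
    return CLUSTER_MAP[partition_for(key)]


keys = ["1.10.1938", "1.1.1634", "1.1.1639"]
placement = {k: node_for(k) for k in keys}
print(placement)
```

Since the location is a pure function of the key and the map, a KeyScan needs no round trip to the Index Service.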
  29. 29. Power Features: USE KEYS
        EXPLAIN SELECT * FROM CUSTOMER USE KEYS ['1.1.1634', '1.1.1639'];
        { …[
          {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "KeyScan",
                "keys": "[\"1.1.1634\", \"1.1.1639\"]"
              },
              {
                "#operator": "Parallel",
                "~child": {
                  "#operator": "Sequence",
                  "~children": [
                    {
                      "#operator": "Fetch",
                      "keyspace": "CUSTOMER",
                      "namespace": "default"
                    },
  30. 30. Power Features: USE KEYS
        UPDATE customer USE KEYS ['1.20.981', '12.42.196'] SET c_balance = c_balance + 200;
        DELETE FROM customer USE KEYS ['1.20.198', '12.42.2848'];
      • Even when you modify documents through USE KEYS, the indexes are maintained automatically.
  31. 31. N1QL Power Features: UNNEST
  32. 32. UNNEST: Denormalized CUSTOMER Document
        {
          "C_ZIP" : "828011111",
          "C_STATE" : "vt",
          "C_FIRST" : "ykfdbqku",
          "C_CREDIT" : "GC",
          "C_DELIVERY_CNT" : 0,
          "C_W_ID" : 1,
          "C_CITY" : "quhpismkzumehqhr",
          "C_STREET_1" : "rmtxadlsxqefdcwf",
          "C_D_ID" : 1,
          "ORDERS" : [
            {
              "ORDER_LINE" : [
                { "OL_AMOUNT" : 0, "OL_DELIVERY_D" : "2015-02-11T14:55:25.480Z", "OL_DIST_INFO" : "yptiwgjdelfxmathbjzirvye", "OL_I_ID" : 35828, "OL_SUPPLY_W_ID" : 1, "OL_QUANTITY" : 5 },
                { "OL_AMOUNT" : 0, "OL_DELIVERY_D" : "2015-02-11T14:55:25.480Z", "OL_DIST_INFO" : "dxhqulhcgksjgqsicujzqhdb", "OL_I_ID" : 26024, "OL_SUPPLY_W_ID" : 1, "OL_QUANTITY" : 5 },
                …
  33. 33. Power Features: UNNEST operation
        SELECT COUNT(my_order_line) AS total_orders,
               MAX(my_order_line.OL_DELIVERY_D) AS max_delivery_date,
               MAX(my_order_line.OL_QUANTITY) AS max_order_quantity,
               MAX(my_orders.O_ENTRY_D) AS max_customer_entry,
               MAX(my_orders.O_OL_CNT) AS max_orderline_entry,
               COUNT(my_customer) AS total_customers
        FROM CUSTOMER my_customer
             UNNEST my_customer.ORDERS AS my_orders
             UNNEST my_orders.ORDER_LINE AS my_order_line;
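What two chained UNNEST operations do to a denormalized document can be approximated in Python as follows. The `unnest` helper and the shortened sample customer are illustrative assumptions; the sketch models the inner form, where a parent without the array produces no rows.

```python
def unnest(rows, alias, array_of):
    """Produce one output row per array element, paired with its parent row."""
    out = []
    for row in rows:
        elements = array_of(row)
        if not isinstance(elements, list):
            continue  # no array to unnest: the parent row produces nothing
        for element in elements:
            out.append({**row, alias: element})
    return out


customer = {
    "C_FIRST": "ykfdbqku",
    "ORDERS": [
        {"ORDER_LINE": [{"OL_QUANTITY": 5}, {"OL_QUANTITY": 7}]},
        {"ORDER_LINE": [{"OL_QUANTITY": 2}]},
    ],
}

rows = [{"my_customer": customer}]
rows = unnest(rows, "my_orders", lambda r: r["my_customer"].get("ORDERS"))
rows = unnest(rows, "my_order_line", lambda r: r["my_orders"].get("ORDER_LINE"))

total_order_lines = len(rows)
max_order_quantity = max(r["my_order_line"]["OL_QUANTITY"] for r in rows)
print(total_order_lines, max_order_quantity)
```

Each output row still carries its parent customer and order, which is what lets a single query aggregate over all three levels at once.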
  34. 34. N1QL Power Features: Named Prepared Statement
  35. 35. Power Features: Named Prepared Statement [Diagram: the Query Service operator pipeline Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, Project, between the client and the Index Service and Data Service.]
  36. 36. Named Prepared Statement
        import requests

        url = "http://localhost:8093/query"
        s = requests.Session()  # a Session reuses the HTTP connection across requests
        s.auth = ("Administrator", "password")

        # Prepare ONCE.
        query = {"statement": "prepare select * from `beer-sample` where name = $1"}
        r = s.post(url, data=query, stream=False)
        prepared = str(r.json()["results"][0]["name"])

        # Bind values many times.
        for i in range(0, 5):
            query = {"prepared": '"' + prepared + '"', "args": '["old_hat_brewery"]'}
            r = s.post(url, data=query, stream=False)
            print(i, r.json()["metrics"]["executionTime"])
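The benefit of "prepare once, bind many times" is that parsing and planning are paid a single time. The Python toy below illustrates only that idea; the `QueryService` class, its plan counter, and the textual parameter substitution are all hypothetical and do not reflect how the real service stores or executes plans.

```python
import re


class QueryService:
    """Toy model: planning happens once per prepared name, not once per execution."""

    def __init__(self):
        self.prepared = {}
        self.plans_built = 0

    def _plan(self, statement):
        self.plans_built += 1  # stands in for the parse-and-plan cost
        return {"statement": statement}

    def prepare(self, name, statement):
        self.prepared[name] = self._plan(statement)
        return name

    def execute_prepared(self, name, args):
        plan = self.prepared[name]  # reuse the stored plan, no re-planning
        # Bind positional parameters $1, $2, ... into the statement text.
        return re.sub(r"\$(\d+)", lambda m: repr(args[int(m.group(1)) - 1]),
                      plan["statement"])


service = QueryService()
name = service.prepare("p1", "select * from `beer-sample` where name = $1")
bound = [service.execute_prepared(name, ["old_hat_brewery"]) for _ in range(5)]
print(service.plans_built, bound[0])
```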
  37. 37. Functional Indices
  38. 38. Functional Indices
        "contacts": {
          "age": 46,
          "children": [
            { "age": 17, "fname": "Aiden", "gender": "m" },
            { "age": 2, "fname": "Bill", "gender": "f" }
          ],
          "email": "dave@gmail.com",
          "fname": "Dave",
          "hobbies": [ "golf", "surfing" ],
          "lname": "Smith",
          "relation": "friend",
          "title": "Mr.",
          "type": "contact"
        }
        CREATE INDEX idx_lname_lower ON contacts(LOWER(lname)) USING GSI;
        SELECT COUNT(*) FROM contacts WHERE LOWER(lname) = 'smith';
  39. 39. Functional Indices
      • The value indexed is the result of the function or expression.
      • The query has to use the same expression in the WHERE clause for the planner to consider using the index.
      • Use EXPLAIN to verify that the index is used.
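The first bullet, that the indexed value is the result of the expression, can be pictured with a Python dictionary keyed by `lname.lower()`. The sample contacts and the `build_index` helper are invented for this sketch.

```python
from collections import defaultdict

contacts = {
    "c1": {"fname": "Dave", "lname": "Smith"},
    "c2": {"fname": "Jane", "lname": "SMITH"},
    "c3": {"fname": "Earl", "lname": "Jones"},
}


def build_index(docs, expression):
    """Index each document under the value of the expression, not the raw attribute."""
    index = defaultdict(list)
    for key, doc in docs.items():
        index[expression(doc)].append(key)
    return index


def lower_lname(doc):
    """The indexed expression, corresponding to LOWER(lname)."""
    return doc["lname"].lower()


idx_lname_lower = build_index(contacts, lower_lname)
# WHERE LOWER(lname) = 'smith' uses the same expression, so the index applies.
smith_keys = sorted(idx_lname_lower["smith"])
print(smith_keys, len(smith_keys))
```

The index holds no entry for the raw value "Smith", which is why a query filtering on `lname` directly could not be answered from it.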
  40. 40. N1QL Power Features: Multi-Index Scans
  41. 41. Power Features: IntersectScan (Multi-Index Scan)
        SELECT * FROM customer WHERE c_last = 'Smith' AND c_city = 'Santa Clara';
      • Index scan using a composite index:
        – The query must use the first N keys of the index for that index to be chosen.
        – You will need multiple indexes on the same set of columns to support filter push down.
      • IntersectScan using multiple indices:
        – Multiple indices are scanned in parallel.
        – Provides more flexibility in using the indices for filters.
        – Requires fewer indexes to be defined on the table, which can save disk space and memory as well.
  42. 42. Multi-Index Scan
        SELECT * FROM customer WHERE c_last = 'Smith' AND c_city = 'Santa Clara';
        "#operator": "IntersectScan",
        "scans": [
          {
            "#operator": "IndexScan",
            "index": "idx_cust_city",
            "keyspace": "CUSTOMER",
            "limit": 9.223372036854776e+18,
            "namespace": "default",
            "spans": [
              {
                "Range": { "High": [ "\"Santa Clara\"" ], "Inclusion": 3, "Low": [ "\"Santa Clara\"" ] },
                "Seek": null
              }
            ]
          },
          {
            "#operator": "IndexScan",
            "index": "idx_last_name",
            "keyspace": "CUSTOMER",
            "limit": 9.223372036854776e+18,
            "namespace": "default",
            "spans": [
              {
                "Range": { "High": [ "\"Smith\"" ], "Inclusion": 3, "Low": [ "\"Smith\"" ] },
                "Seek": null
              }
            ],
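Conceptually, an IntersectScan keeps only the document keys that every participating index scan returns. A minimal Python sketch, assuming two single-key indexes modeled as dictionaries with made-up document keys:

```python
def index_scan(index, value):
    """Return the set of document keys whose indexed attribute equals value."""
    return set(index.get(value, []))


def intersect_scan(scans):
    """Keep only the document keys returned by every index scan."""
    results = [index_scan(index, value) for index, value in scans]
    return set.intersection(*results) if results else set()


idx_last_name = {"Smith": ["k1", "k2", "k3"], "Jones": ["k4"]}
idx_cust_city = {"Santa Clara": ["k2", "k3", "k4"], "Alameda": ["k1"]}

qualified = intersect_scan([
    (idx_last_name, "Smith"),
    (idx_cust_city, "Santa Clara"),
])
print(sorted(qualified))
```

Two single-key indexes can serve this query, a query on last name only, and a query on city only, whereas a composite index on (c_last, c_city) would not help a filter on city alone.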
  43. 43. query.couchbase.com @N1QL Keshav Murthy keshav@couchbase.com @rkeshavmurthy
