N1QL WORKSHOP:
INDEXING AND QUERY TUNING IN COUCHBASE 4.0
Keshav Murthy
Couchbase Engineering
keshav@couchbase.com
@N1QL @rkeshavmurthy
©2015 Couchbase Inc.
Agenda
• Indexing Overview
• View Index
• GSI Index
• Multi Index Scan
• Hands On N1QL
• Query Tuning with Hands On N1QL
• Index Selection Hints
• Key-Value Access
• Joins
• Hands On N1QL
Indexing Overview
Couchbase Server Cluster Service Deployment
[Diagram: a six-node Couchbase Server cluster. Each node runs a Cluster Manager and has its own managed cache and storage. Servers 1-3 run the Data Service, servers 4-5 run the Query Service, and server 6 runs the Index Service.]
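N1QL: Query Execution Flow
1. The client submits the query over the REST API.
2. The Query Service parses and analyzes the query and creates a plan.
3. The Query Service sends a scan request, with index filters, to the Index Service.
4. The Index Service returns the qualified document keys.
5. The Query Service sends a fetch request, with those document keys, to the Data Service.
6. The Data Service returns the documents.
7. The Query Service evaluates the documents to produce results.
8. The Query Service returns the query result to the client.

SELECT firstName,
       lastName,
       state
FROM customer
WHERE customerId = "customer494";

{
  "firstName": "Nicolette",
  "lastName": "Wilderman",
  "state": "IL"
}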
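The eight steps above can be mimicked with a toy in-memory model. This is only a sketch in Python, not Couchbase code: the dictionaries standing in for the Data Service and the Index Service, and the function `run_query`, are hypothetical names introduced for illustration.

```python
# Toy model of the query execution flow. All names here are hypothetical.

# Stand-in for the Data Service: document key -> document.
DATA_SERVICE = {
    "customer494": {"customerId": "customer494", "firstName": "Nicolette",
                    "lastName": "Wilderman", "state": "IL"},
    "customer534": {"customerId": "customer534", "firstName": "Mckayla",
                    "lastName": "Brown", "state": "VT"},
}

# Stand-in for the Index Service: a secondary index on customerId,
# mapping each indexed value to the keys of the documents that hold it.
INDEX_SERVICE = {}
for doc_key, doc in DATA_SERVICE.items():
    INDEX_SERVICE.setdefault(doc["customerId"], []).append(doc_key)


def run_query(customer_id, fields):
    """SELECT <fields> FROM customer WHERE customerId = <customer_id>."""
    # Steps 3-4: scan request with the index filter; get qualified doc keys.
    doc_keys = INDEX_SERVICE.get(customer_id, [])
    # Steps 5-6: fetch request with the doc keys; get the documents.
    docs = [DATA_SERVICE[k] for k in doc_keys]
    # Step 7: re-apply the filter on the fetched documents, then project.
    matches = [d for d in docs if d["customerId"] == customer_id]
    return [{f: d[f] for f in fields if f in d} for d in matches]


result = run_query("customer494", ["firstName", "lastName", "state"])
print(result)
```

The point of the model is the division of labor: the index only yields document keys, and the documents themselves always come from the data store.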
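N1QL: Inside a Query Service
[Diagram: the operator pipeline inside the Query Service, between the client request and the result: Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, Project. Scan talks to the Index Service; Fetch talks to the Data Service.]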
Index Overview: Primary Index
• Primary Index
  – CREATE PRIMARY INDEX PIX_CUST ON customer;
  – The document key is unique within the bucket.
  – The primary index is used when no other qualifying index is available or when the query has no predicate.
  – A PrimaryScan is the equivalent of a full table scan.

"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212--1234",
    "cardType": "americanexpress"
  },
  "customerId": "customer534",
  "dateAdded": "2014-04-06",
  "dateLastActive": "2014-05-02",
  "emailAddress": "iles@kertz.name",
  "firstName": "Mckayla",
  "lastName": "Brown",
  "phoneNumber": "1-533-290-6403",
  "postalCode": "92341",
  "state": "VT",
  "type": "customer"
}
Document key: "customer534"
Index Overview: Secondary Index
• A secondary index can be created on any combination of attribute names.
  – CREATE INDEX idx_cust_cardnum ON customer(ccInfo.cardNumber);
  – CREATE INDEX idx_cust_postalCode ON customer(postalCode);
• Useful for speeding up queries.
• Queries need matching indices with the right key ordering, for example:
  – (ccInfo.cardExpiry, postalCode)
  – (type, state, lastName, firstName)

"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212-232-1234",
    "cardType": "americanexpress"
  },
  "customerId": "customer534",
  "dateAdded": "2014-04-06",
  "dateLastActive": "2014-05-02",
  "emailAddress": "iles@kertz.name",
  "firstName": "Mckayla",
  "lastName": "Brown",
  "phoneNumber": "1-533-290-6403",
  "postalCode": "92341",
  "state": "VT",
  "type": "customer"
}
Document key: "customer534"
Couchbase Indexes for N1QL
[Diagram: the Query Service operator pipeline (Parse, Plan, Scan, Fetch, Join, Filter, Aggregate, Sort, Offset, Limit, Project). IndexScan operators read from Global Secondary Indexes and View Indexes; data access goes to the Data Service using the cluster map.]
View Index
View Index
[Diagram: application server, view indexer, query set]
• DCP-based replication: updates are queued for the indexer.
• View indexer: executes incremental map/reduce on a batch of updates.
• Couchstore-based storage: updates are queued for storage.
• View query engine: REST-based queries with filters, limit, and more, executed with scatter-gather.
• N1QL view index: created via N1QL's CREATE INDEX statement.
• For Beta, the view index is the default.

CREATE PRIMARY INDEX px_customer ON customer USING VIEW;
CREATE INDEX idx_cust_postalCode ON customer(postalCode) USING VIEW;
DROP INDEX customer.idx_cust_postalCode USING VIEW;
GSI Index
Global Secondary Index
[Diagram: in the Data Service, a projector and router sends changes from Bucket#1 and Bucket#2 over a DCP stream to the Index Service. In the Index Service, a supervisor handles index maintenance and scan coordination for Index#1 through Index#4, which are stored in the ForestDB storage engine. The Query Service scans the indexes through the Index Service.]
GSI Index: Key details
• Supported types
  – String, Boolean, Numeric, Nil, Array, Sub-document
• Total length of the keys
  – 4 KB: actual length of the key indexed
  – How is the length calculated? Does it include the "key name"?
• Number of keys
  – 4096!

CREATE PRIMARY INDEX px_customer ON customer USING GSI;
CREATE INDEX idx_cust_postalCode ON customer(postalCode) USING GSI;
DROP INDEX customer.idx_cust_postalCode USING GSI;
Query Execution: Plan
• Each query can be executed in several ways.
• Create the query execution plan:
  – Access path for each keyspace reference
  – Decide on the filters to push down
  – Determine join order and join method
  – Create the execution tree
• For each keyspace reference:
  – Look at the available indices
  – Match the filters in the query with index keys
  – Choose one or more indices for each keyspace
  – Create index filters and post-scan, post-join filters
Query Execution: Plan
EXPLAIN SELECT c_id,
       c_first,
       c_middle,
       c_last,
       c_balance
FROM CUSTOMER
WHERE c_w_id = 49
  AND c_d_id = 16
  AND c_last = 'Montana';

{
  "#operator": "IndexScan",
  "index": "CU_W_ID_D_ID_LAST",
  "keyspace": "CUSTOMER",
  …
  "spans": [
    {
      "Range": {
        "High": [
          "49",
          "16",
          "\"Montana\""
        ],
        "Inclusion": 3,
        "Low": [
          "49",
          "16",
          …
Query Execution: Plan
SELECT productid,
       rating,
       COUNT(productid)
FROM reviews
WHERE rating < 3
  AND productid BETWEEN "product300" AND "product400"
GROUP BY productid,
         rating;

"#operator": "Sequence",
"~children": [
  {
    "#operator": "PrimaryScan",
    "index": "#primary",
    "keyspace": "reviews",
    "namespace": "default",
    "using": "gsi"
  },
  {
    "#operator": "Parallel",
    "~child": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Fetch",
          "keyspace": "reviews",
          "namespace": "default"
        },
        {
          …
Query Execution: Index Scan
• PrimaryScan
  – Equivalent of a full table scan in an RDBMS
  – Uses the primary index to scan from start to finish
• IndexScan
  – Index selection is based on the filters and the available matching indexes
  – Indices with expressions are matched with query expressions
  – N1QL can use one or more indices per table per query
[Diagram: the Query Service issues IndexScans against Global Secondary Indexes and View Indexes, then fetches data from the Data Service using the cluster map.]
Composite Indexes
Power Features: Composite Indexes
CREATE INDEX ix_last_postal ON product(lastName, postalCode, city);

SELECT * FROM product
WHERE lastName = "Smith"
  AND postalCode = '58292';

• Index scan using a composite index:
  – The query must use the first N (leading) keys of the index for the planner to choose it.
  – You will need multiple indexes on the same set of columns to support filter pushdown.
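The leading-key rule for composite indexes can be illustrated with a small Python sketch. It is a simplified model that only considers equality predicates, not the planner's actual algorithm, and the function name `usable_prefix` is hypothetical.

```python
def usable_prefix(index_keys, predicate_fields):
    """Return the leading index keys covered by the query's predicates.

    Matching stops at the first index key that has no predicate, because
    keys after a gap cannot narrow the scan range in this simple model.
    """
    prefix = []
    for key in index_keys:
        if key not in predicate_fields:
            break
        prefix.append(key)
    return prefix


ix_last_postal = ["lastName", "postalCode", "city"]

# WHERE lastName = ... AND postalCode = ...  -> first two keys are usable.
both = usable_prefix(ix_last_postal, {"lastName", "postalCode"})

# WHERE postalCode = ...  -> the leading key is missing, nothing is usable.
postal_only = usable_prefix(ix_last_postal, {"postalCode"})

print(both, postal_only)
```

In this model a query that filters only on postalCode gets no benefit from ix_last_postal, which is why a second index with a different key order may be needed.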
Power Features:
Index Intersection aka Multi Index Scan
Power Features: IntersectScan (Multi-Index Scan)
SELECT * FROM customer
WHERE lastName = 'Smith' AND postalCode = '94040';

• IntersectScan using multiple indices:
  – Multiple indices are scanned in parallel.
  – Provides more flexibility in using the indices for filters.
  – Requires fewer indexes to be defined on the table.
    • Can save disk space and memory as well.
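The idea behind an intersect scan can be sketched in Python: each single-key index returns a set of document keys, and only keys present in every set qualify. This is an illustrative model with hypothetical names (`build_index`, `ix_last`, `ix_postal`); it runs the two scans one after the other rather than in parallel.

```python
customers = {
    "c1": {"lastName": "Smith", "postalCode": "94040"},
    "c2": {"lastName": "Smith", "postalCode": "10001"},
    "c3": {"lastName": "Brown", "postalCode": "94040"},
}


def build_index(docs, field):
    """Map each value of `field` to the set of document keys holding it."""
    index = {}
    for doc_key, doc in docs.items():
        index.setdefault(doc[field], set()).add(doc_key)
    return index


ix_last = build_index(customers, "lastName")
ix_postal = build_index(customers, "postalCode")

# Scan each index with its own filter, then intersect the key sets.
last_keys = ix_last.get("Smith", set())
postal_keys = ix_postal.get("94040", set())
qualified = last_keys & postal_keys

print(sorted(qualified))
```

Two single-key indexes serve this query here, and each of them can also serve queries that filter on only one of the two fields.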
Switch to Hands On N1QL
Page #55
Power Features:
Index Selection Hints
Query Execution: USE INDEX
CREATE INDEX ix_last_postal ON
  product(lastName, postalCode);
CREATE INDEX ix_postal_category ON
  product(postalCode, lastName);

SELECT *
FROM product
WHERE lastName = "Smith"
  AND postalCode = '58292';

SELECT *
FROM product
  USE INDEX (ix_last_postal USING GSI)
WHERE lastName = "Smith"
  AND postalCode = '58292';
Query Execution: USE INDEX
CREATE INDEX ix_last ON
  product(lastName) USING GSI;
CREATE INDEX ix_postal ON
  product(postalCode) USING GSI;

SELECT *
FROM product
  USE INDEX (ix_last USING GSI,
             ix_postal USING GSI)
WHERE lastName = "Smith"
  AND postalCode = '58292';
N1QL Power Features: USE KEYS
Power Features: USE KEYS
[Diagram: the Query Service can reach documents through a KeyScan, which goes to the Data Service using the cluster map, or through IndexScans over Global Secondary Indexes and View Indexes followed by a data fetch.]
Power Features: USE KEYS
SELECT customerId,
       lastName,
       firstName
FROM customer USE KEYS ['customer494'];

• KeyScan: uses the Couchbase cluster map directly to get the document.
• You can give one or more values in the array.
• From N1QL, get keys via META(customer).id.
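A KeyScan amounts to a direct lookup by document key with no index involved. The Python sketch below models that with a plain dictionary; `bucket` and `key_scan` are hypothetical names, and skipping a key that has no document is an assumption of this model.

```python
bucket = {
    "customer494": {"customerId": "customer494", "lastName": "Wilderman",
                    "firstName": "Nicolette", "state": "IL"},
    "customer534": {"customerId": "customer534", "lastName": "Brown",
                    "firstName": "Mckayla", "state": "VT"},
}


def key_scan(bucket, keys, fields):
    """Fetch documents by key and project the requested fields."""
    rows = []
    for key in keys:
        doc = bucket.get(key)   # direct key lookup, no index scan
        if doc is None:
            continue            # assumption: a key with no document yields no row
        rows.append({f: doc[f] for f in fields if f in doc})
    return rows


rows = key_scan(bucket, ["customer494", "customer000"],
                ["customerId", "lastName", "firstName"])
print(rows)
```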
Functional Indices
Functional Indices
"contacts": {
  "age": 46,
  "children": [
    {
      "age": 17,
      "fname": "Aiden",
      "gender": "m"
    },
    {
      "age": 2,
      "fname": "Bill",
      "gender": "f"
    }
  ],
  "email": "dave@gmail.com",
  "fname": "Dave",
  "hobbies": [
    "golf",
    "surfing"
  ],
  "lname": "Smith",
  "relation": "friend",
  "title": "Mr.",
  "type": "contact"
}

CREATE INDEX idx_lname_lower ON
  contacts(LOWER(lname)) USING GSI;

SELECT COUNT(*)
FROM contacts
WHERE LOWER(lname) = 'smith';
Functional Indices
• The value indexed is the result of the function or expression.
• The query has to use the same expression in the WHERE clause for the planner to consider using the index.
• Use EXPLAIN to verify that the index is used.
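A functional index stores the result of an expression rather than the raw field value. The Python sketch below models an index on the lower-cased last name; the sample documents and the name `idx_lname_lower` are used only for illustration, and this is a model of the idea rather than of GSI internals.

```python
contacts = {
    "k1": {"fname": "Dave", "lname": "Smith"},
    "k2": {"fname": "Erin", "lname": "SMITH"},
    "k3": {"fname": "Finn", "lname": "Jones"},
}

# Index entries are keyed by the value of the expression LOWER(lname).
idx_lname_lower = {}
for doc_key, doc in contacts.items():
    idx_lname_lower.setdefault(doc["lname"].lower(), []).append(doc_key)

# A lookup has to supply a value of the same expression, i.e. a
# lower-cased name; the raw spelling "Smith" is not an index entry.
count_lowered = len(idx_lname_lower.get("smith", []))
count_raw = len(idx_lname_lower.get("Smith", []))

print(count_lowered, count_raw)
```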
Indexes on Arrays
Indexes on Arrays
cbq> select contacts.children from contacts limit 1;
{
  "requestID": "e61a011f-2387-47d3-aee0-7bfd874ed2bf",
  "signature": {
    "children": "json"
  },
  "results": [
    {
      "children": [
        {
          "age": 17,
          "fname": "Aiden",
          "gender": "m"
        },
        {
          "age": 2,
          "fname": "Bill",
          "gender": "f"
        }
      ]
    }
  ],
  …

CREATE INDEX idx_ctx_children ON
  contacts(children) USING GSI;

select * from system:indexes where name = "idx_ctx_children";

"indexes": {
  "datastore_id": "http://127.0.0.1:8091",
  "id": "ea6023a0dd24bedc",
  "index_key": [
    "`children`"
  ],
  "keyspace_id": "contacts",
  "name": "idx_ctx_children",
  "namespace_id": "default",
  "state": "online",
  "using": "gsi"
}
Indexes on Arrays
Compare a single array element with an object:

select * from contacts where children[0] =
  {"age": 17, "fname": "Xena", "gender": "f"};

Compare any array element with an object, using UNNEST:

select *
from contacts c unnest c.children as anychild
where anychild = {"age": 17, "fname": "Xena", "gender": "f"};
Indexes on Arrays
Compare the whole array:

select * from contacts where children =
[
  {"age": 17, "fname": "Xena", "gender": "f"},
  {"age": 2, "fname": "Yuri", "gender": "m"}
];
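The three query shapes for arrays (first element, any element, whole array) differ in what they compare. The Python sketch below mirrors them on the sample contact document; Python's `==` on dicts and lists stands in for N1QL value comparison here, which is an assumption of the sketch.

```python
contact = {
    "children": [
        {"age": 17, "fname": "Aiden", "gender": "m"},
        {"age": 2, "fname": "Bill", "gender": "f"},
    ]
}

aiden = {"age": 17, "fname": "Aiden", "gender": "m"}
bill = {"age": 2, "fname": "Bill", "gender": "f"}

# Like: WHERE children[0] = {...}   (only the first element is compared)
first_is_bill = contact["children"][0] == bill

# Like: UNNEST c.children AS anychild WHERE anychild = {...}
any_is_bill = any(child == bill for child in contact["children"])

# Like: WHERE children = [ {...}, {...} ]   (the whole array must match)
whole_matches = contact["children"] == [aiden, bill]
partial_matches = contact["children"] == [aiden]

print(first_is_bill, any_is_bill, whole_matches, partial_matches)
```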
JOINS in N1QL
Query Execution: Join
Document key: "customer285"
"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212-1221-1121-1234",
    "cardType": "americanexpress"
  },
  "customerId": "customer285",
  "dateAdded": "2014-04-06T15:52:16Z",
  "dateLastActive": "2014-05-06T15:52:16Z",
  "emailAddress": "jason_skiles@kertzmann.name",
  "firstName": "Mckayla",
  "lastName": "Brown",
  "phoneNumber": "1-533-290-6403 x2729",
  "postalCode": "92341",
  "state": "VT",
  "type": "customer"
}

Document key: "purchase1492"
"purchases": {
  "customerId": "customer285",
  "lineItems": [
    {"count": 3, "product": "product55"},
    {"count": 4, "product": "product169"}
  ],
  "purchaseId": "purchase7049",
  "type": "purchase"
}

Document key: "purchase583"
"purchases": {
  "customerId": "customer285",
  "lineItems": [
    {"count": 5, "product": "prod551"},
    {"count": 3, "product": "product549"}
  ],
  "purchaseId": "purchase3648",
  "purchasedAt": "2013-11-07T15:52:38Z",
  "type": "purchase"
}
Joins
SELECT c.customerId,
       COUNT(*) totpurchases
FROM purchases p
     INNER JOIN customer c
     ON KEYS p.customerId
GROUP BY c.customerId
ORDER BY COUNT(*) DESC LIMIT 10;

• Two-keyspace join
• ON KEYS clause for the join
Joins
SELECT c_id,
       c_first,
       c_middle,
       c_last,
       (c_max - c_balance)
FROM CUSTOMER USE KEYS ['1.10.1938'];

SELECT c_id,
       c_first,
       c_middle,
       c_last,
       (c_max - c_balance)
FROM CUSTOMER USE KEYS
  [to_string($1) || "." || to_string($2) || "." || to_string($3)];
N1QL: Join
SELECT *
FROM ORDERS o INNER JOIN CUSTOMER c
       ON KEYS (o.O_C_ID)
     LEFT JOIN PREMIUM p
       ON KEYS (c.C_PR_ID)
     LEFT JOIN demographics d
       ON KEYS (c.c_DEMO_ID);

• Supports INNER and LEFT OUTER joins.
• Join order follows the order in the FROM clause.
• N1QL currently supports nested-loop joins.
• A join is always from a key in one document (outer table) to the document key of the second document (inner table).
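The key-based nested-loop join described above can be sketched in Python: for every outer document, the join field is used as a document key to look up the inner document. The function `join_on_keys` and the sample data (including the dangling reference "customer999") are hypothetical.

```python
customers = {
    "customer285": {"customerId": "customer285", "lastName": "Brown"},
}

purchases = {
    "purchase1492": {"customerId": "customer285", "type": "purchase"},
    "purchase583": {"customerId": "customer285", "type": "purchase"},
    "purchase9": {"customerId": "customer999", "type": "purchase"},
}


def join_on_keys(outer_docs, inner_bucket, key_field, left_outer=False):
    """Nested-loop join from a field of the outer doc to the inner doc's key."""
    rows = []
    for outer in outer_docs:
        inner = inner_bucket.get(outer.get(key_field))  # lookup by document key
        if inner is not None:
            rows.append((outer, inner))
        elif left_outer:
            rows.append((outer, None))  # LEFT OUTER keeps unmatched outer docs
    return rows


inner_rows = join_on_keys(purchases.values(), customers, "customerId")
left_rows = join_on_keys(purchases.values(), customers, "customerId",
                         left_outer=True)

print(len(inner_rows), len(left_rows))
```

Because the inner side is always reached by document key, this kind of join needs no index on the inner keyspace; it depends on the outer document storing the inner document's key.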
Switch to Hands On N1QL
Page #71
Summary
• Create indices with the right set of keys.
• It is important to analyze your workload.
• Primary scan = full table scan.
• Indexes are maintained asynchronously.
• Set the right consistency level for your query.
• Design your primary keys and document references correctly.
• USE KEYS will get you the data fast.
• Use EXPLAIN to understand the query plan.
• Take control when you must: use the USE INDEX hint.
query.couchbase.com
@N1QL

Editor's Notes

  • #8 Data-parallel: query latency scales up with cores. Memory-bound.
  • #11 View Indexes: incremental map/reduce with custom JavaScript for complex indexing logic, for online reporting and analytics. GSI (Global Secondary Indexes): efficient indexes for secondary lookups and ad-hoc query processing.
  • #15 Projector and Router: one projector and router per node; one stream of changes per bucket per supervisor. Supervisor: one supervisor per node; many indexes per supervisor.
  • #27 PrimaryScan: equivalent of a full table scan in an RDBMS; uses the primary index to scan from start to finish.