N1QL workshop: Indexing & Query tuning.

N1QL WORKSHOP:
INDEXING AND QUERY TUNING IN COUCHBASE 4.0
Keshav Murthy
Couchbase Engineering
keshav@couchbase.com
@N1QL @rkeshavmurthy

©2015 Couchbase Inc.
Agenda
 Indexing Overview
 View Index
 GSI Index
 Multi Index Scan
 Hands On N1QL
 Query Tuning with Hands On N1QL
 Index Selection Hints
 Key-Value Access
 Joins
 Hands On N1QL

Couchbase Server Cluster Service Deployment
(Diagram: six Couchbase Server nodes. Servers 1 to 3 run the Data Service, servers 4 and 5 run the Query Service, and server 6 runs the Index Service. Every node has a Cluster Manager, a managed cache, and storage, with shards shown on the nodes.)

N1QL: Query Execution Flow
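(Diagram: clients, Query Service, Index Service, Data Service.)
1. Submit the query over REST API (client to Query Service)
2. Parse, analyze, create plan
3. Scan request with index filters (Query Service to Index Service)
4. Get qualified doc keys
5. Fetch request with doc keys (Query Service to Data Service)
6. Fetch the documents
7. Evaluate: documents to results
8. Query result (Query Service to client)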
SELECT firstName,
       lastName,
       state
FROM customer
WHERE customerId = "customer494";
{
  "firstName": "Nicolette",
  "lastName": "Wilderman",
  "state": "IL"
}
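
The numbered flow above can be modeled in a few lines of Python. This is only a toy sketch: plain dictionaries stand in for the Data Service and Index Service, and every name in it (data_service, index_on_customer_id, run_query) is an illustrative assumption, not part of Couchbase. It also assumes a secondary index on customerId exists, which the slide does not state.

```python
# Toy model of the query execution flow; dicts stand in for the services.
# All names are hypothetical and chosen only for this illustration.

# "Data Service": document key -> document
data_service = {
    "customer494": {"customerId": "customer494", "firstName": "Nicolette",
                    "lastName": "Wilderman", "state": "IL"},
    "customer534": {"customerId": "customer534", "firstName": "Mckayla",
                    "lastName": "Brown", "state": "VT"},
}

# "Index Service": indexed value -> list of qualifying document keys
# (assumes an index on customerId has been created)
index_on_customer_id = {}
for doc_key, doc in data_service.items():
    index_on_customer_id.setdefault(doc["customerId"], []).append(doc_key)


def run_query(customer_id, fields):
    # Steps 3-4: scan the index with the filter value, get qualifying doc keys
    doc_keys = index_on_customer_id.get(customer_id, [])
    # Steps 5-6: fetch the full documents for those keys
    docs = [data_service[key] for key in doc_keys]
    # Step 7: evaluate, keeping only the projected fields
    return [{field: doc[field] for field in fields} for doc in docs]


results = run_query("customer494", ["firstName", "lastName", "state"])
print(results)
```

The point of the split is that the index returns only document keys; the documents themselves always come from the Data Service in a separate fetch step.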

N1QL: Inside a Query Service
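(Diagram: a client calls the Query Service, whose pipeline stages are Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, and Project. Scan uses the Index Service; Fetch uses the Data Service.)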

Index Overview: Primary Index
 Primary Index
 CREATE PRIMARY INDEX PIX_CUST ON customer;
 Document key is unique for the bucket.
 Primary index is used when no other qualifying index is available or when the query has no predicate.
 PrimaryScan is the equivalent of a full table scan.
"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212--1234",
    "cardType": "americanexpress"
  },
  "customerId": "customer534",
  "dateAdded": "2014-04-06",
  "dateLastActive": "2014-05-02",
  "emailAddress": "iles@kertz.name",
  "firstName": "Mckayla",
  "lastName": "Brown",
  "phoneNumber": "1-533-290-6403",
  "postalCode": "92341",
  "state": "VT",
  "type": "customer"
}
Document key: “customer534”

Index Overview: Secondary Index
 Secondary Index can be created on any
combination of attribute names.
 CREATE INDEX idx_cust_cardnum ON customer(ccInfo.cardNumber);
 CREATE INDEX idx_cust_postalCode ON customer(postalCode);
 Useful for speeding up queries.
 Needs matching indices with the right key ordering, for example:
 (ccInfo.cardExpiry, postalCode)
 (type, state, lastName, firstName)
"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212-232-1234",
    "cardType": "americanexpress"
  },
  "dateAdded": "2014-04-06",
  "dateLastActive": "2014-05-02",
  "emailAddress": "iles@kertz.name",
  "phoneNumber": "1-533-290-6403",
  "state": "VT",
  "type": "customer"
}
Document key: “customer534”

Couchbase Indexes for N1QL
(Diagram: the Query Service pipeline of Parse, Plan, Scan, Fetch, Join, Filter, Aggregate, Sort, Offset, Limit, and Project. IndexScans go to Global Secondary Indexes and View Indexes; data access goes to the Data Service, located through the cluster map.)

View Index
(Diagram: application server, view indexer, query set.)
 DCP based Replication: updates
queued for the indexer
 View Indexer: Executes incremental
map/reduce on a batch of updates
 Couchstore based Storage: updates
queued for storage
 View Query Engine: REST-based queries with filters, limit, and more, executed with scatter-gather
 N1QL View Index: created via N1QL's CREATE INDEX statement
 For Beta, View index is the default
CREATE PRIMARY INDEX px_customer ON customer USING VIEW;
CREATE INDEX idx_cust_postalCode ON customer(postalCode) USING VIEW;
DROP INDEX customer.idx_cust_postalCode USING VIEW;

Global Secondary Index
(Diagram: in the Data Service, a projector and router send a DCP stream from Bucket#1 and Bucket#2 to the Index Service. In the Index Service, a supervisor handles index maintenance and scan coordination for Index#1 to Index#4, stored in the ForestDB storage engine. The Query Service scans the Index Service.)

GSI Index: Key details
 Supported Types
 String, Boolean, Numeric, Nil, Array, Sub-document
 Total length of the keys
 4 KB – actual length of the key indexed
 How is the length calculated? Does it include the “key name”?
 Number of keys
 4096!
CREATE PRIMARY INDEX px_customer ON customer USING GSI;
CREATE INDEX idx_cust_postalCode ON customer(postalCode) USING GSI;
DROP INDEX customer.idx_cust_postalCode USING GSI;

Query Execution: Plan
 Each query can be executed in several ways
 Create the query execution plan
   Access path for each keyspace reference
   Decide on the filters to push down
   Determine join order and join method
   Create the execution tree
 For each keyspace reference:
   Look at the available indices
   Match the filters in the query with index keys
   Choose one or more indices for each keyspace
   Create index filters, post-scan filters, and post-join filters

EXPLAIN SELECT c_id,
       c_first,
       c_middle,
       c_last,
       c_balance
FROM CUSTOMER
WHERE c_w_id = 49
  AND c_d_id = 16
  AND c_last = 'Montana';
{
  "#operator": "IndexScan",
  "index": "CU_W_ID_D_ID_LAST",
  "keyspace": "CUSTOMER",
  …
  "spans": [
    {
      "Range": {
        "High": [
          "49",
          "16",
          "\"Montana\""
        ],
        "Inclusion": 3,
        "Low": [
          "49",
          "16",
          …

SELECT productid,
       rating,
       Count(productid)
FROM reviews
WHERE rating < 3
  AND productid BETWEEN "product300" AND "product400"
GROUP BY productid,
         rating;
"#operator": "Sequence",
"~children": [
  {
    "#operator": "PrimaryScan",
    "index": "#primary",
    "keyspace": "reviews",
    "namespace": "default",
    "using": "gsi"
  },
  {
    "#operator": "Parallel",
    "~child": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Fetch",
          "keyspace": "reviews",
          "namespace": "default"
        },
        {
        …

Query Execution: Index Scan
 PrimaryScan
 Equivalent of full table scan in RDBMS
 Uses the primary index to scan from start to finish
 Index Scan
 Index selection is based on the filters and available
matching index
 Indices with expressions are matched with query
expressions
 N1QL can use one or more indices per table per
query

Power Features: Composite Indexes
CREATE INDEX ix_last_postal ON product(lastName, postalCode, city);
SELECT * FROM product
WHERE lastName = "Smith"
  AND postalCode = "58292";
 Index scan using composite index:
– The query must filter on the first N (leading) keys for the index to be chosen
– May need multiple indexes on the same set of columns to support filter push-down
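
The leading-key rule for composite indexes can be sketched as follows. This is a simplified model for illustration, not the actual N1QL planner; the helper name usable_leading_keys is hypothetical, and only equality filters are considered.

```python
def usable_leading_keys(index_keys, filter_fields):
    """Count how many leading index keys are covered by the query's filters.

    Scanning stops at the first index key that the query does not filter on,
    because later keys cannot narrow the scan without the earlier ones.
    """
    count = 0
    for key in index_keys:
        if key not in filter_fields:
            break
        count += 1
    return count


# Key order of the composite index from the slide
ix_last_postal = ["lastName", "postalCode", "city"]

# WHERE lastName = ... AND postalCode = ...  -> first two keys are usable
both = usable_leading_keys(ix_last_postal, {"lastName", "postalCode"})

# WHERE postalCode = ...  -> leading key is missing, index does not qualify
postal_only = usable_leading_keys(ix_last_postal, {"postalCode"})

print(both, postal_only)
```

Under this model a query that filters only on postalCode gets no benefit from ix_last_postal, which is why a second index with a different key order may be needed.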

Power Features:
Index Intersection aka Multi Index Scan

Power Features: IntersectScan (Multi-Index Scan)
SELECT * FROM customer
WHERE lastName = 'Smith' AND postalCode = '94040';
 IntersectScan using multiple indices:
– Multiple indices are scanned in parallel
– Provides more flexibility in using the indices for filters
– Requires fewer indexes to be defined on the keyspace.
• Can save disk space and memory as well.
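
A minimal sketch of the idea behind IntersectScan, assuming two single-key indexes on lastName and postalCode. The sample documents and the helper names (build_index, intersect_scan) are made up for illustration, and the two lookups run sequentially here rather than in parallel.

```python
# Hypothetical sample documents keyed by document key
customers = {
    "customer1": {"lastName": "Smith", "postalCode": "94040"},
    "customer2": {"lastName": "Smith", "postalCode": "58292"},
    "customer3": {"lastName": "Brown", "postalCode": "94040"},
}


def build_index(docs, field):
    """Map each value of `field` to the set of document keys holding it."""
    index = {}
    for doc_key, doc in docs.items():
        index.setdefault(doc[field], set()).add(doc_key)
    return index


ix_last = build_index(customers, "lastName")
ix_postal = build_index(customers, "postalCode")


def intersect_scan(last_name, postal_code):
    # Each index is scanned independently for its own filter ...
    keys_by_name = ix_last.get(last_name, set())
    keys_by_postal = ix_postal.get(postal_code, set())
    # ... and only keys returned by both scans qualify for the fetch
    return sorted(keys_by_name & keys_by_postal)


matches = intersect_scan("Smith", "94040")
print(matches)
```

Two single-key indexes can serve filters on either field alone or on both together, which is the flexibility the slide refers to.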

Switch to Hands On N1QL
Page #55

Power Features:
Index Selection Hints

Query Execution: USE INDEX
CREATE INDEX ix_last_postal ON
product(lastName, postalCode);
CREATE INDEX ix_postal_category ON
product(postalCode, lastName);
SELECT * FROM product
WHERE lastName = "Smith"
  AND postalCode = "58292";
SELECT *
FROM product
USE INDEX (ix_last_postal USING GSI)
WHERE lastName = "Smith"
  AND postalCode = "58292";

Query Execution: USE INDEX
CREATE INDEX ix_last ON
product(lastName) USING GSI;
CREATE INDEX ix_postal ON
product(postalCode) USING GSI;
SELECT *
FROM product
USE INDEX (ix_last USING GSI,
           ix_postal USING GSI)
WHERE lastName = "Smith"
  AND postalCode = "58292";

Power Features: USE KEYS
(Diagram: the Query Service uses the cluster map for a KeyScan and data fetch against the Data Service, alongside IndexScans against Global Secondary Indexes and View Indexes.)

Power Features: USE KEYS
SELECT customerId,
lastName,
firstName
FROM customer USE KEYS ["customer494"];
 KeyScan: Directly use the Couchbase cluster map to get the
document
 You can give one or more values in the array
 From N1QL, get keys via: META(customer).id

Functional Indices
"contacts": {
  "age": 46,
  "children": [
    {
      "age": 17,
      "fname": "Aiden",
      "gender": "m"
    },
    {
      "age": 2,
      "fname": "Bill",
      "gender": "f"
    }
  ],
  "email": "dave@gmail.com",
  "fname": "Dave",
  "hobbies": [
    "golf",
    "surfing"
  ],
  "lname": "Smith",
  "relation": "friend",
  "title": "Mr.",
  "type": "contact"
}
CREATE INDEX idx_lname_lower ON
contacts(LOWER(lname)) USING GSI;
SELECT count(*)
FROM contacts
WHERE LOWER(lname) = "smith";

 The value indexed is the result of the function or
expression.
 The query has to use the same expression in the
WHERE clause for the planner to consider using
the index.
 Use EXPLAIN to verify that the index is used.
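
The behavior of a functional index can be illustrated with a small Python model. The sample contacts and the helper count_by_lower_lname are hypothetical; the dictionary simply plays the role of an index whose keys are the result of LOWER(lname).

```python
# Hypothetical sample documents keyed by document key
contacts = {
    "contact1": {"fname": "Dave", "lname": "Smith"},
    "contact2": {"fname": "Earl", "lname": "SMITH"},
    "contact3": {"fname": "Fred", "lname": "Jones"},
}

# The index stores the RESULT of the expression, here lname lower-cased,
# not the raw field value.
idx_lname_lower = {}
for doc_key, doc in contacts.items():
    idx_lname_lower.setdefault(doc["lname"].lower(), []).append(doc_key)


def count_by_lower_lname(value):
    """Count contacts whose lower-cased lname equals `value`."""
    return len(idx_lname_lower.get(value, []))


# Looking up the lower-cased value finds both spellings.
count_smith = count_by_lower_lname("smith")

# Looking up the raw value finds nothing: the index holds only lower-cased keys.
count_raw = count_by_lower_lname("Smith")

print(count_smith, count_raw)
```

This is why the query's predicate has to be written against the same expression as the index definition: the raw lname values are simply not in the index.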

Indexes on Arrays
cbq> select contacts.children from contacts limit 1;
{
  "requestID": "e61a011f-2387-47d3-aee0-7bfd874ed2bf",
  "signature": {
    "children": "json"
  },
  "results": [
    {
      "children": [
        {
          "age": 17,
          "fname": "Aiden",
          "gender": "m"
        },
        {
          "age": 2,
          "fname": "Bill",
          "gender": "f"
        }
      ]
    }
  ],
  …
CREATE INDEX idx_ctx_children ON
contacts(children) USING GSI;
select * from system:indexes where name =
"idx_ctx_children";
"indexes": {
  "datastore_id": "http://127.0.0.1:8091",
  "id": "ea6023a0dd24bedc",
  "index_key": [
    "`children`"
  ],
  "keyspace_id": "contacts",
  "name": "idx_ctx_children",
  "namespace_id": "default",
  "state": "online",
  "using": "gsi"
}

Indexes on Arrays
select * from contacts where children[0] =
{"age":17, "fname":"Xena", "gender":"f"};
select *
from contacts c unnest c.children as anychild
where anychild = {"age":17, "fname":"Xena",
"gender":"f"};

Indexes on Arrays
select * from contacts where children =
[
  {"age": 17, "fname": "Xena", "gender": "f"},
  {"age": 2, "fname": "Yuri", "gender": "m"}
];

Query Execution: Join
CUSTOMER
"customer": {
  "ccInfo": {
    "cardExpiry": "2015-11-11",
    "cardNumber": "1212-1221-1121-1234",
    "cardType": "americanexpress"
  },
  "dateAdded": "2014-04-06T15:52:16Z",
  "dateLastActive": "2014-05-06T15:52:16Z",
  "emailAddress": "jason_skiles@kertzmann.name",
  "phoneNumber": "1-533-290-6403 x2729",
  "state": "VT",
  "type": "customer"
}
Document key: “customer285”
"purchases": {
  "lineItems": [
    {"count": 3, "product": "product55"},
    {"count": 4, "product": "product169"}
  ],
  "purchaseId": "purchase7049",
  "type": "purchase"
}
Document key: “purchase1492”
"purchases": {
  "lineItems": [
    {"count": 5, "product": "prod551"},
    {"count": 3, "product": "product549"}
  ],
  "purchaseId": "purchase3648",
  "purchasedAt": "2013-11-07T15:52:38Z",
  "type": "purchase"
}
Document key: “purchase583”

Joins
SELECT c.customerId,
       Count(*) totpurchases
FROM purchases p
INNER JOIN customer c
ON KEYS p.customerId
GROUP BY c.customerId
ORDER BY Count(*) DESC LIMIT 10;
 Two-keyspace join
 ON clause for the join

Joins
SELECT c_id,
       c_first,
       c_middle,
       c_last,
       (c_max - c_balance)
FROM CUSTOMER USE KEYS ["1.10.1938"];
SELECT c_id,
       c_first,
       c_middle,
       c_last,
       (c_max - c_balance)
FROM CUSTOMER USE KEYS
[to_string($1) || "." || to_string($2) || "." || to_string($3)];
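
The second query builds the document key from three parameters and then reads the document directly, with no index involved. A small Python sketch of the same idea follows. The reading of the three parts as warehouse, district, and customer IDs is an assumption based on the c_w_id and c_d_id fields used earlier, and the stored values and the name customer_key are invented for illustration.

```python
def customer_key(w_id, d_id, c_id):
    """Build a dotted document key, mirroring
    to_string($1) || "." || to_string($2) || "." || to_string($3)."""
    return ".".join(str(part) for part in (w_id, d_id, c_id))


# Hypothetical keyspace contents: document key -> document
bucket = {
    "1.10.1938": {"c_id": 1938, "c_max": 5000.0, "c_balance": 1200.0},
}

key = customer_key(1, 10, 1938)

# Direct key-value lookup: no index scan is needed when the key is known
doc = bucket.get(key)
available = doc["c_max"] - doc["c_balance"]

print(key, available)
```

Designing document keys so the application can compute them from values it already has is what makes USE KEYS access possible.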

N1QL: Join
SELECT *
FROM ORDERS o INNER JOIN CUSTOMER c
ON KEYS (o.O_C_ID)
LEFT JOIN PREMIUM p
ON KEYS (c.C_PR_ID)
LEFT JOIN demographics d
ON KEYS (c.c_DEMO_ID)
 Supports INNER and LEFT OUTER joins.
 Join order follows the order in the FROM clause.
 N1QL currently supports nested-loop joins.
 A join is always from a key in one document (outer table) to the document key of the second document (inner table).
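
A key-based nested-loop join as described above can be sketched like this. It is a simplified model, not the Query Service implementation: the helper join_on_keys is hypothetical, and the sample purchases are assumed to carry a customerId field holding the customer's document key.

```python
# Hypothetical keyspaces: document key -> document
customers = {
    "customer285": {"state": "VT"},
    "customer534": {"state": "IL"},
}
purchases = {
    "purchase1492": {"customerId": "customer285"},
    "purchase583": {"customerId": "customer285"},
    "purchase9": {"customerId": "customer999"},  # no matching customer
}


def join_on_keys(outer, inner, key_field, left_outer=False):
    """For each outer document, look up the inner document by document key."""
    rows = []
    for outer_key, outer_doc in outer.items():
        # The join value from the outer document IS the inner document key
        inner_doc = inner.get(outer_doc.get(key_field))
        if inner_doc is not None:
            rows.append((outer_key, inner_doc))
        elif left_outer:
            # LEFT OUTER keeps the outer row with a missing inner side
            rows.append((outer_key, None))
    return rows


inner_rows = join_on_keys(purchases, customers, "customerId")
left_rows = join_on_keys(purchases, customers, "customerId", left_outer=True)

print(len(inner_rows), len(left_rows))
```

Because the inner side is always reached by document key, each outer row costs one key-value lookup rather than a scan of the inner keyspace.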

Switch to Hands On N1QL
Page #71

Summary
 Create indices with the right set of keys
 Important to analyze your workload
 Primary Scan = Full Table Scan
 Indexes are maintained asynchronously.
 Set the right consistency level for your query
 Design your primary keys and document
references correctly.
 USE KEYS will get you the data fast
 EXPLAIN to understand query plan
 Take control when you must, USE INDEX hint
