Tuning N1QL Query Performance with Couchbase Server 4.0

TUNING N1QL QUERY PERFORMANCE & SCALE
IN COUCHBASE SERVER 4.0
Cihan Biyikoglu
Director, Product Management

©2015 Couchbase Inc.
Goals
• Take a deeper look at query performance and scale
• Look at Query and Index Service scale characteristics
• Understand the query execution flow
• Understand index usage
• Tune queries with a few techniques

Agenda
• Part I - Architectural Overview
  • New Cluster Architecture with Couchbase Server 4.0
  • Query Processing & Indexing
• Part II - Optimizing Queries
  • Execution Plans and Operators
  • Optimizing Queries - Filtering, Index Selection and Joins
  • Optimizing Apps - Consistency Dials
• Q&A
Demos & More Demos…

Disclaimer
Couchbase Server 4.0 and ForestDB are still in development, and the final
versions of the products may not be identical in the details discussed in
this session.

Couchbase Server Cluster Architecture
[Diagram: a six-node cluster (Couchbase Server 1 through 6). Each node runs a Cluster Manager together with the Data Service, Index Service and Query Service, backed by a managed cache and storage. Data shards (for example shards 5, 7 and 9) are distributed across the nodes.]

Query Execution
• Submitting queries in N1QL
  • Stateless connectivity through REST
  • Load-balance across Query Service nodes
  • Prepared vs. ad-hoc query execution
  • Consistency dials (more on this later…)
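
As a rough illustration of stateless REST access with client-side load balancing, the sketch below builds (but does not send) N1QL requests and rotates them across Query Service nodes. The host names are hypothetical. The `/query/service` path, port 8093 and the `statement` parameter are assumed from the Query Service REST defaults and should be checked against your deployment.

```python
import itertools
import json

# Hypothetical Query Service nodes (assumption, not from the slides).
QUERY_NODES = ["node1.example.com", "node2.example.com", "node3.example.com"]
_next_node = itertools.cycle(QUERY_NODES)


def build_query_request(statement):
    """Return (url, json_body) for one stateless N1QL request.

    Each call targets the next node round-robin. No session state is kept
    between calls, so any Query Service node can serve any request.
    """
    host = next(_next_node)
    url = "http://%s:8093/query/service" % host  # assumed default port/path
    body = json.dumps({"statement": statement})
    return url, body
```

Because every request carries the full statement, a client (or a load balancer in front of the nodes) is free to send consecutive queries to different nodes.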

Query Execution
• The parallelization factor is the number of cores on the Query Service node
[Diagram: Execution Flow]

Query Service - Capacity Management
Scaling the Query Service
• Pro: load-balance queries across all nodes
• Con: competes with Index and Data workloads
[Diagram: Couchbase Cluster, node1 through node8, with Index Service, Query Service and Data Service]

Scaling the Query Service
• Added CPU: higher intra-query parallelization
• Added RAM: improved caching with larger result sets
• Added node: better availability and load balancing
[Diagram: Couchbase Cluster, node1 through node8, with Data Service, Index Service and Query Service]

Indexing in Couchbase Server 4.0
• Multiple Indexers
  • GSI (Index Service)
    New indexing for N1QL, for low-latency queries without compromising on mutation performance (insert/update/delete).
    Independently partitioned and independently scalable indexes in the Index Service.
  • Map/Reduce Views (Data Service)
    Powerful programmable indexer for complex reporting and indexing logic.
    Full partition alignment and paired scalability with the Data Service.
  • Spatial Views (Data Service)
    Incremental R-tree indexing for powerful bounding-box queries.
    Full partition alignment and paired scalability with the Data Service.
[Slide callouts: "New", "Index Scan"]

Which to choose – GSI vs. Views
Workloads              New GSI in v4.0                                  Map/Reduce Views
Complex Reporting      Just in time                                     Pre-aggregated
Workload Optimization  Optimized for scan latency & throughput          Optimized for insertion
Flexible Index Logic   N1QL functions                                   JavaScript
Secondary Lookups      Single-node lookup                               Scatter-gather
Tunable Consistency    Staleness false or ok, or everything in between  Staleness false or ok

Which to choose – GSI vs. Views
Capabilities          New GSI in v4.0                            Map/Reduce Views
Partitioning Model    Independent – Index Service                Aligned to data – Data Service
Scale Model           Independently scale Index Service          Scale with Data Service
Fetch with Index Key  Single node                                Scatter-gather
Range Scan            Single node                                Scatter-gather
Grouping, Aggregates  With N1QL                                  Built-in with Views API
Caching               Managed                                    Not managed
Storage               ForestDB                                   Couchstore
Availability          Multiple identical indexes, load balanced  Replica based

Scaling the Index Service
• Pro: load-balance scans across all nodes
• Con: competes with Query and Data workloads
[Diagram: Couchbase Cluster, node1 through node8, with Index Service, Query Service and Data Service]

Index Service Capacity Management
Scaling the Index Service
• Added RAM: better caching of indexes
• Added CPU: faster index maintenance & parallelized index scans
• Added faster I/O path: faster index persistence
• Added node: better availability and load balancing
[Diagram: Couchbase Cluster, node1 through node8, with Data Service, Index Service and Query Service]

Execution Plans & Explain
• EXPLAIN query
• The plan is assembled into an execution flow expressed through operators
• Operators stream results up and down the flow
[Diagram: example plan. Sequence: Primary Scan, then Parallel (Fetch followed by Initial Project, repeated …), then Limit]
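
To make the streaming idea concrete, here is a toy model of such a plan using Python generators. It is only a sketch: the in-memory `BUCKET`, the operator function names and the single-threaded execution are assumptions for illustration, not how the Query Service is implemented. It shows how a Limit at the end of the flow stops upstream operators from doing further work.

```python
from itertools import islice

# Hypothetical in-memory stand-in for a bucket: document key -> document.
BUCKET = {"beer_%d" % i: {"type": "beer", "name": "Beer %d" % i, "abv": 5.0}
          for i in range(1000)}
fetch_count = 0  # counts how many documents Fetch actually reads


def primary_scan():
    """PrimaryScan: stream every document key."""
    for key in sorted(BUCKET):
        yield key


def fetch(keys):
    """Fetch: read each document by key as keys arrive from the scan."""
    global fetch_count
    for key in keys:
        fetch_count += 1
        yield BUCKET[key]


def initial_project(docs, fields):
    """InitialProject: keep only the fields the query needs."""
    for doc in docs:
        yield {f: doc[f] for f in fields}


def limit(rows, n):
    """Limit: stop pulling from upstream after n rows."""
    return islice(rows, n)


plan = limit(initial_project(fetch(primary_scan()), ["name"]), 10)
rows = list(plan)
```

In this model only 10 of the 1000 documents are fetched, because each operator produces a row only when the operator after it asks for one.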

Operators
• Main Operations
  • Scans
    PrimaryScan: scan of the primary index based on document keys
    IndexScan: scan of a secondary index based on a predicate
  • Fetch
    Fetch: reach into the Data Service with a document key
  • Projection Operations
    InitialProject: reduces the stream size to the fields involved in the query
    FinalProject: final shaping of the result to the requested JSON shape

Operators cont.
• Operator Assembly
    Parallel: execute all child operations in parallel
    Sequence: execute child items in a sequence
• Filtering Operators
    Filter: apply a filter expression (e.g. WHERE field = "value")
    Limit: limit the number of items returned to N
    Offset: start returning items from a specified item count

Operators cont.
• Join Operators
    Join: joins left and right keyspaces on attributes and document key
    Unnest: join operation between a parent and its nested child array, where the parent is repeated for each child array item
    Nest: grouping operation between a parent and a child array, where the child array is embedded into the parent
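
The difference between Unnest and Nest is easiest to see on a small example. The following is a simplified sketch of the output shapes only; the helper names, field names and documents are hypothetical, and details such as how unmatched children are handled are left out.

```python
def unnest(parent, array_field):
    """UNNEST-like: one output row per array element, parent repeated."""
    return [{"parent": parent["name"], "child": item}
            for item in parent.get(array_field, [])]


def nest(parent, key_field, bucket):
    """NEST-like: look up child documents by key and embed them as an array."""
    children = [bucket[k] for k in parent.get(key_field, []) if k in bucket]
    return {"parent": parent["name"], "children": children}


# Hypothetical documents for illustration.
brewery = {"name": "Brewery A", "styles": ["IPA", "Stout"],
           "beer_ids": ["beer_1", "beer_2"]}
bucket = {"beer_1": {"name": "Beer 1"}, "beer_2": {"name": "Beer 2"}}

unnested = unnest(brewery, "styles")        # two rows, parent repeated
nested = nest(brewery, "beer_ids", bucket)  # one row, children embedded
```

Unnest multiplies rows (one per array item), while Nest keeps one row per parent and attaches the matching children to it.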

Common Techniques for Tuning Queries

Minimize Items Scanned
• Primary Index Scan vs. Index Scan
  • The primary index can only filter on document keys, so a primary scan typically means a "full scan" of the bucket
  • A secondary index scan is typically driven by predicates, and the index is smaller in size, so it is cheaper to scan
• Index selection: based on matching the expressions in the index against those in the WHERE clause
DEMO #2
SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
Vs.
CREATE INDEX i_type on `beer-sample`(type) USING GSI;
SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;

• Hint index usage in queries
  • There can be multiple indexes to choose from, and you can hint your index choice to the query engine.
SELECT name,updated FROM `beer-sample` USE INDEX(i_type using gsi)
WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
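
The effect of a secondary index on items scanned can be sketched with a toy model. Everything below is an assumption for illustration (an in-memory dict as the bucket, a dict of lists standing in for `i_type`); it only counts how many documents each access path has to examine for the DEMO #2 predicate.

```python
from collections import defaultdict

# Hypothetical mini-bucket that mixes document types, like `beer-sample`.
BUCKET = {}
for i in range(100):
    BUCKET["brewery_%d" % i] = {"type": "brewery", "name": "Brewery %d" % i}
for i in range(20):
    BUCKET["beer_%d" % i] = {"type": "beer", "name": "Beer %d" % i,
                             "abv": 5.0 + i}


def query_with_primary_scan():
    """Visit every document, then filter: effectively a full bucket scan."""
    examined, rows = 0, []
    for doc in BUCKET.values():
        examined += 1
        if doc["type"] == "beer" and doc.get("abv", 0) > 0:
            rows.append(doc["name"])
    return examined, rows


# Stand-in for CREATE INDEX i_type ON `beer-sample`(type): type -> keys.
I_TYPE = defaultdict(list)
for key, doc in BUCKET.items():
    I_TYPE[doc["type"]].append(key)


def query_with_index_scan():
    """Use the index to visit only type="beer" documents, then filter abv."""
    examined, rows = 0, []
    for key in I_TYPE["beer"]:
        doc = BUCKET[key]
        examined += 1
        if doc.get("abv", 0) > 0:
            rows.append(doc["name"])
    return examined, rows


primary_examined, primary_rows = query_with_primary_scan()
index_examined, index_rows = query_with_index_scan()
```

Both paths return the same rows, but in this model the primary scan examines all 120 documents while the index scan examines only the 20 beers.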

• Limit & filters help eliminate rows early in the execution plan
  • With Limit, upstream operators are signaled to stop once enough rows have accumulated
  • Ex.: remember to filter on document type in buckets that contain multiple types
DEMO #3
SELECT b1.name as beer_name, b2.name as brewery_name, b2.country
FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 on KEYS b1.brewery_id
WHERE abv>0;
vs
SELECT b1.name as beer_name, b2.name as brewery_name, b2.country
FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 on KEYS b1.brewery_id
WHERE b1.type="beer" AND abv>0;

Joins
• Joins are efficient by nature
  • The left-hand value is joined to the right-hand document key with a nested loop.
Query: get the brewery location for each beer:
SELECT …
FROM `beer-sample` AS b1
JOIN `beer-sample` AS b2 ON KEYS b1.brewery_id
WHERE b1.type="beer";
For each document with type="beer", take b1.brewery_id and look for an
equal document key in b2.
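
The key-based join described above can be modeled in a few lines. This is a minimal sketch over a hypothetical in-memory bucket, not the engine's implementation: the outer loop walks the left side, and the "inner loop" is a single lookup by document key on the right side.

```python
# Hypothetical documents keyed by document key.
BUCKET = {
    "brewery_a": {"type": "brewery", "name": "Brewery A", "country": "US"},
    "brewery_b": {"type": "brewery", "name": "Brewery B", "country": "BE"},
    "beer_1": {"type": "beer", "name": "Beer 1", "brewery_id": "brewery_a"},
    "beer_2": {"type": "beer", "name": "Beer 2", "brewery_id": "brewery_b"},
    "beer_3": {"type": "beer", "name": "Beer 3", "brewery_id": "brewery_a"},
}


def join_beers_to_breweries(bucket):
    """Join each beer (b1) to its brewery (b2) on b1.brewery_id = key of b2."""
    rows = []
    for b1 in bucket.values():             # outer loop over the left side
        if b1.get("type") != "beer":       # WHERE b1.type = "beer"
            continue
        b2 = bucket.get(b1["brewery_id"])  # key lookup on the right side
        if b2 is not None:                 # inner join: skip unmatched beers
            rows.append({"beer_name": b1["name"],
                         "brewery_name": b2["name"],
                         "country": b2["country"]})
    return rows


rows = join_beers_to_breweries(BUCKET)
```

Because the right side is reached by document key, each left-hand row costs one lookup rather than a scan of the right keyspace.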

New Consistency Settings!
• View staleness
  • Ok: unbounded; query what's available in the index/view now
  • False: query after all changes up to the request timestamp (and maybe more) have been indexed for a given index or view
• New indexes with Couchbase Server 4.0
  • Improves the granularity of the consistency logical timestamp
  • New: ScanConsistency can be set to any logical timestamp
    Choose stale=false, stale=ok, or anything in between

Flexible Consistency Settings
Time
  t1  insert (k1, v1)
  …
  t2  do other business logic computation
  …
  t3  issue query/read on (k1, v1) with t3 vs. t1
• With t3: catch up all the indexes to t3 and then issue the query. Identical to "stale=false".
• With t1: catch up all the indexes to t1 and then issue the query. Improved efficiency over "stale=false".
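
The saving from waiting only up to t1 can be illustrated with a toy index that applies mutations in timestamp order. This is a sketch under stated assumptions: logical timestamps are modeled as plain integers, and `ToyIndex`, `scan` and the mutation log are hypothetical names, not Couchbase APIs.

```python
class ToyIndex:
    """Toy index that applies logged mutations one at a time, in order."""

    def __init__(self):
        self.indexed_up_to = 0  # highest logical timestamp indexed so far
        self.entries = {}

    def scan(self, log, key, wait_for):
        """Catch up to `wait_for`, then look up `key`.

        Returns (value, applied), where `applied` is the number of
        mutations that had to be indexed before the scan could run.
        """
        applied = 0
        while self.indexed_up_to < wait_for:
            seq = self.indexed_up_to + 1
            k, v = log[seq]
            self.entries[k] = v
            self.indexed_up_to = seq
            applied += 1
        return self.entries.get(key), applied


# Hypothetical workload: 100 mutations; ours, (k1, v1), is the first.
log = {seq: ("k%d" % seq, "v%d" % seq) for seq in range(1, 101)}
t1, t3 = 1, 100  # insert happened at t1; the query is issued at t3

stale_ok = ToyIndex().scan(log, "k1", wait_for=0)   # no waiting
at_t1 = ToyIndex().scan(log, "k1", wait_for=t1)     # wait for our write only
at_t3 = ToyIndex().scan(log, "k1", wait_for=t3)     # like stale=false
```

In this model both the t1 and t3 scans see the application's own write, but the t1 scan waits for 1 mutation instead of 100, and the scan that does not wait may miss the write entirely.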

Recap
• New, unique query and indexing architecture
  • Workload isolation with MDS gives you a great performance and scale advancement
• Familiar concepts from your past life will help tune queries
  • Understand execution plans
  • Understand indexes and index selection
  • Filter & limit aggressively
  • Understand JOINs
• Use the powerful new consistency dials for best efficiency

Q&A
Cihan Biyikoglu
cihan@couchbase.com
@cihangirb
