©2016 Couchbase Inc.
{ "Utilizing Arrays" :
["Modeling", "Querying", "Indexing"] }
Keshav Murthy
Director, Couchbase R&D
Agenda
• Introduction to Arrays
• Data Modeling with Arrays
• Query Performance With Arrays
• Array Indexing
• Fun With Arrays
• Query Performance
• Tag Search
• String Search
Introduction To Arrays
Every N1QL query returns Arrays
cbq> select distinct type from `travel-sample`;
{
…
"results": [
{ "type": "route" },
{ "type": "airport" },
{ "type": "hotel" },
{ "type": "airline" },
{ "type": "landmark" }
] ,
"status": "success",
"metrics": {
"elapsedTime": "840.518052ms",
"executionTime": "840.478414ms",
"resultCount": 5,
"resultSize": 202
}
}
The result of every query is an array, even when no documents match.
cbq> SELECT * FROM `travel-sample` WHERE type = 'airport' AND faa = 'BLR';
{
"results": [],
"metrics": {
"elapsedTime": "9.606755ms",
"executionTime": "9.548749ms",
"resultCount": 0,
"resultSize": 0
}
}
Introduction to Arrays
• An arrangement of quantities or symbols in rows and columns; a matrix
• An indexed set of related elements
JSON Arrays
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"hobbies" : ["lego", "piano", "badminton", "robotics"],
"scores" : [3.4, 2.9, 9.2, 4.1],
"legos" : [
true,
9292,
"fighter 2",
{
"name" : "Millenium Falcon",
"type" : "Starwars"
}
]
}
• Arrays in JSON can contain simple values, or any combination of JSON types within the same array.
• No type or structure enforcement within the array.
JSON Arrays
{
"Name": "Jane Smith",
"DOB" : "1990-01-30",
"phones" : [
"+1 510-523-3529", "+1 650-392-4923"
],
"Billing": [
{
"type": "visa",
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03"
},
{
"type": "master",
"cardnum": "6274-2542-5847-3949",
"expiry": "2018-12"
}
]
}
Billing has two credit card
entries, stored as an ARRAY
Two phone number entries
JSON Arrays: Syntax Diagram
Data Modeling with Arrays
Properties of Real-World Data
• Rich structure
• Attributes, Sub-structure
• Relationships
• To other data
• Value evolution
• Data is updated
• Structure evolution
• Data is reshaped
Customer
Name
DOB
Billing
Connections
Purchases
Modeling Data in the Relational World
Billing
Connections
Purchases
Contacts
Customer
• Rich structure
• Normalize & JOIN Queries
• Relationships
• JOINS and Constraints
• Value evolution
• INSERT, UPDATE, DELETE
• Structure evolution
• ALTER TABLE
• Application Downtime
• Application Migration
• Application Versioning
Using JSON for Real-World Data
CustomerID Name DOB
CBL2015 Jane Smith 1990-01-30
Table: Customer
Customer DocumentKey: CBL2015
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30"
}
OR
{
"Name" : {
"fname": "Jane",
"lname": "Smith"
},
"DOB" : "1990-01-30"
}
• The primary key (CustomerID) becomes the document key.
• Each column name and column value becomes a key-value pair.
Using JSON to Store Data
CustomerID Name DOB
CBL2015 Jane Smith 1990-01-30
Table: Customer
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
}
]
}
CustomerID Type Cardnum Expiry
CBL2015 visa 5827… 2019-03
Table: Billing
• Rich Structure & Relationships
• Billing information is stored as a sub-document
• There could be more than a single credit card. So, use an array.
Customer DocumentKey: CBL2015
Using JSON to Store Data
CustomerID Name DOB
CBL2015 Jane Smith 1990-01-30
Table: Customer
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2542-5847-3949",
"expiry" : "2018-12"
}
]
}
Customer DocumentKey: CBL2015
CustomerID Type Cardnum Expiry
CBL2015 visa 5827… 2019-03
CBL2015 master 6274… 2018-12
Table: Billing
Value evolution
• Simply add an array element or update a value.
Using JSON to Store Data
CustomerID ConnId Name
CBL2015 XYZ987 Joe Smith
CBL2015 SKR007 Sam Smith
CBL2015 RGV492 Rav Smith
Table: Connections
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2542-5847-3949",
"expiry" : "2018-12"
}
],
"Connections" : [
{
"ConnId" : "XYZ987",
"Name" : "Joe Smith"
},
{
"ConnId" : "SKR007",
"Name" : "Sam Smith"
},
{
"ConnId" : "RGV491",
"Name" : "Rav Smith"
}
]
}
Structure evolution
• Simply add new key-value pairs
• No downtime to add new KV pairs
• Applications can validate data
• Structure evolves over time
Relations via Reference
Customer DocumentKey: CBL2015
Using JSON to Store Data
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Billing" : [
{
"type" : "visa",
"cardnum" : "5827-2842-2847-3909",
"expiry" : "2019-03"
},
{
"type" : "master",
"cardnum" : "6274-2542-5847-3949",
"expiry" : "2018-12"
}
],
"Connections" : [
{
"ConnId" : "XYZ987",
"Name" : "Joe Smith"
},
{
"ConnId" : "SKR007",
"Name" : "Sam Smith"
},
{
"ConnId" : "RGV491",
"Name" : "Rav Smith"
}
],
"Purchases" : [
{ "id":12, "item": "mac", "amt": 2823.52 },
{ "id":19, "item": "ipad2", "amt": 623.52 }
]
}
Table: Customer
CustomerID | Name | DOB
CBL2015 | Jane Smith | 1990-01-30
Table: Billing
CustomerID | Type | Cardnum | Expiry
CBL2015 | visa | 5827… | 2019-03
CBL2015 | master | 6274… | 2018-12
Table: Connections
CustomerID | ConnId | Name
CBL2015 | XYZ987 | Joe Smith
CBL2015 | SKR007 | Sam Smith
CBL2015 | RGV492 | Rav Smith
Table: Purchases
CustomerID | item | amt
CBL2015 | mac | 2823.52
CBL2015 | ipad2 | 623.52
CustomerID | ConnId | Name
CBL2015 | XYZ987 | Joe Smith
CBL2015 | SKR007 | Sam Smith
Relational tables in the diagram: Contacts, Customer, Billing, Connections, Purchases
Customer DocumentKey: CBL2015
Models for Representing Data
Data Concern | Relational Model | JSON Document Model (NoSQL)
Rich Structure | Multiple flat tables; constant assembly / disassembly | Documents; no assembly required!
Relationships | Represented; queried (SQL) | Represented; N1QL, MongoDB, CQL
Value Evolution | Data can be updated | Data can be updated
Structure Evolution | Uniform and rigid; manual change (disruptive) | Flexible; dynamic change
Querying Arrays
• Array Access
• Expressions
• Functions
• Aggregates
• Statements
• Array Clauses
Array Access: Expressions, Functions and Aggregates
• EXPRESSIONS
• ARRAY
• ANY
• EVERY
• IN
• WITHIN
• Construct [elem]
• Slice array[start:end]
• Selection array[#pos]
• FUNCTIONS
• ISARRAY
• TYPE
• ARRAY_APPEND
• ARRAY_CONCAT
• ARRAY_CONTAINS
• ARRAY_DISTINCT
• ARRAY_IFNULL
• ARRAY_FLATTEN
• ARRAY_INSERT
• ARRAY_INTERSECT
• ARRAY_LENGTH
• ARRAY_POSITION
• ARRAY_PREPEND
• ARRAY_PUT
• ARRAY_RANGE
• ARRAY_REMOVE
• ARRAY_REPEAT
• ARRAY_REPLACE
• ARRAY_REVERSE
• ARRAY_SORT
• ARRAY_STAR
• AGGREGATES
• ARRAY_AVG
• ARRAY_COUNT
• ARRAY_MIN
• ARRAY_MAX
• ARRAY_SUM
Array access
{
"Name": "Jane Smith",
"DOB" : "1990-01-30",
"phones" : [
"+1 510-523-3529", "+1 650-392-4923"
],
"Billing": [
{
"type": "visa",
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03"
},
{
"type": "master",
"cardnum": "6274-2542-5847-3949",
"expiry": "2018-12"
}
]
}
SELECT phones from t;
[
{
"phones": [
"+1 510-523-3529",
"+1 650-392-4923"
]
}
]
SELECT phones[1] from t;
[
{
"$1": "+1 650-392-4923"
}
]
SELECT phones[0:1] from t;
[
{
"$1": [
"+1 510-523-3529"
]
}
]
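The element selection and slice expressions above behave like zero-based list indexing. A minimal Python sketch of the same semantics (this is not N1QL; the `doc` variable is a hypothetical in-memory stand-in for the document in `t`):

```python
# Hypothetical in-memory copy of the document stored in keyspace t.
doc = {
    "Name": "Jane Smith",
    "phones": ["+1 510-523-3529", "+1 650-392-4923"],
}

# phones[1]: zero-based element selection.
second_phone = doc["phones"][1]

# phones[0:1]: slice from position 0 up to, but not including, position 1.
first_slice = doc["phones"][0:1]

print(second_phone)  # +1 650-392-4923
print(first_slice)   # ['+1 510-523-3529']
```

As in the N1QL results, the selection yields a single value while the slice yields an array.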
Array access: Expressions and functions
{
"Name": "Jane Smith",
"DOB" : "1990-01-30",
"phones" : [
"+1 510-523-3529", "+1 650-392-4923"
],
"Billing": [
{
"type": "visa",
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03"
},
{
"type": "master",
"cardnum": "6274-2542-5847-3949",
"expiry": "2018-12"
}
]
}
SELECT Billing[0].cardnum from t;
[
{
"cardnum": "5827-2842-2847-3909"
}
]
SELECT Billing[*].cardnum from t;
[
{
"cardnum": [
"5827-2842-2847-3909",
"6274-2542-5847-3949"
]
}
]
SELECT ISARRAY(Name) name, ISARRAY(phones) phones from t;
[
{
"name": false,
"phones": true
}
]
Array access: Functions
{
"Name": "Jane Smith",
"DOB" : "1990-01-30",
"phones" : [
"+1 510-523-3529", "+1 650-392-4923"
],
"Billing": [
{
"type": "visa",
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03"
},
{
"type": "master",
"cardnum": "6274-2542-5847-3949",
"expiry": "2018-12"
}
]
}
SELECT ARRAY_CONCAT(phones, ["+1 408-284-2921"]) from t;
[
{
"$1": [
"+1 510-523-3529",
"+1 650-392-4923",
"+1 408-284-2921"
]
}
]
SELECT ARRAY_COUNT(Billing) billing,
ARRAY_COUNT(phones) phones from t;
[
{
"billing": 2,
"phones": 2
}
]
Array access: Functions
SELECT phones, ARRAY_REVERSE(phones) reverse from t;
[
{
"phones": [
"+1 510-523-3529",
"+1 650-392-4923"
],
"reverse": [
"+1 650-392-4923",
"+1 510-523-3529"
]
}
]
SELECT phones, ARRAY_INSERT(phones, 0, "+1 415-439-4923") newlist from t;
[
{
"newlist": [
"+1 415-439-4923",
"+1 510-523-3529",
"+1 650-392-4923"
],
"phones": [
"+1 510-523-3529",
"+1 650-392-4923"
]
}
]
Array access: Aggregates
SELECT ARRAY_MIN(Billing) AS minbill FROM t;
[
{
"minbill": {
"cardnum": "5827-2842-2847-3909",
"expiry": "2019-03",
"type": "visa"
}
}
]
SELECT name,
ARRAY_AVG(reviews[*].ratings[*].Overall) AS avghotelrating
FROM `travel-sample`
WHERE type = 'hotel'
ORDER BY avghotelrating desc
LIMIT 3;
[
{
"avghotelrating": 5,
"name": "Culloden House Hotel"
},
{
"avghotelrating": 5,
"name": "The Bulls Head"
},
{
"avghotelrating": 5,
"name": "La Pradella"
}
]
SELECT: ARRAY & FIRST Expression
ARRAY: The ARRAY operator lets you map and filter the elements or attributes of a collection, object, or objects. It evaluates to an array holding the operand expression for each element that satisfies the WHEN clause, if provided.
SELECT ARRAY [name, r.ratings.`Value`]
FOR r IN reviews
WHEN r.ratings.`Value` = 4
END
FROM `travel-sample`
WHERE type = 'hotel';
FIRST: The FIRST operator enables you to map and filter the elements or attributes of a collection, object, or objects. It evaluates to a single element: the operand expression for the first element that satisfies the WHEN clause, if provided.
SELECT FIRST [name, r.ratings.`Value`]
FOR r IN reviews
WHEN r.ratings.`Value` = 4
END
FROM `travel-sample`
WHERE type = 'hotel';
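The map-and-filter behaviour of ARRAY and FIRST is close to a list comprehension. The Python sketch below models it under stated assumptions: the `hotel` document is made-up sample data, and `None` is only a stand-in for the case where no element passes the WHEN filter.

```python
# Made-up hotel document; only the fields used by the expressions are present.
hotel = {
    "name": "Example Hotel",
    "reviews": [
        {"ratings": {"Value": 4}},
        {"ratings": {"Value": 2}},
        {"ratings": {"Value": 4}},
    ],
}

def array_expr(doc):
    """Models ARRAY [name, r.ratings.Value] FOR r IN reviews WHEN ... = 4 END."""
    return [
        [doc["name"], r["ratings"]["Value"]]  # the mapped operand expression
        for r in doc["reviews"]               # FOR r IN reviews
        if r["ratings"]["Value"] == 4         # WHEN filter
    ]

def first_expr(doc):
    """Models FIRST ...: only the first mapped element that passes the filter."""
    return next(
        (
            [doc["name"], r["ratings"]["Value"]]
            for r in doc["reviews"]
            if r["ratings"]["Value"] == 4
        ),
        None,  # placeholder when nothing matches (an assumption of this sketch)
    )

print(array_expr(hotel))  # [['Example Hotel', 4], ['Example Hotel', 4]]
print(first_expr(hotel))  # ['Example Hotel', 4]
```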
Statements
• INSERT
• INSERT documents with arrays
• INSERT multiple documents with arrays
• INSERT result of documents from SELECT
• UPDATE
• UPDATE specific elements and objects within an array
• DELETE
• DELETE documents based on values within one or more arrays
• MERGE
• MERGE documents to INSERT, UPDATE or DELETE documents.
• SELECT
• Fetch documents given an array of keys
• JOIN based on array of keys
• Predicates (filters) on arrays
• Array expressions, functions and aggregates
• UNNEST, NEST operations
Statements: INSERT
INSERT INTO customer VALUES ("KEY01", { "cid": "ABC01", "orders": ["LG012", "LG482", "LG134"] });
INSERT INTO customer VALUES ("KEY02", { "cid": "XYC21", "orders": ["LG92", "LG859"] }),
VALUES ("KEY04", { "cid": "PQR49", "orders": ["LG47", "LG09", "LG134"] }),
VALUES ("KEY09", { "cid": "KDL29", "orders": ["LG082"] });
INSERT INTO customer
(
KEY uuid(),
value c
)
SELECT mycustomers AS c
FROM newcustomers AS n
WHERE n.type = "premium";
Statements: DELETE
DELETE
FROM customer
WHERE orders = ["LG012", "LG482", "LG134"];
DELETE
FROM customer
WHERE ANY o IN orders SATISFIES o = "LG012" END;
DELETE
FROM customer
WHERE ANY o IN orders SATISFIES o = "LG012" END
RETURNING meta().id, *;
Statements: UPDATE
UPDATE customer USE KEYS ["KEY091"] SET orders = ["LG012", "LG482", "LG134"];
UPDATE customer USE KEYS ["KEY091"]
SET orders = ARRAY_REMOVE(orders, "LG012") ;
UPDATE customer USE KEYS ["KEY091"]
SET orders = ARRAY_APPEND(orders, "LG892") ;
Statements: SELECT
• SELECT
• Array predicates
• NEST, UNNEST
• Fetch documents given an array of keys
• JOIN based on array of keys
SELECT statement
ARRAY PREDICATES
SELECT: Array predicates
• ANY
• EVERY
• SATISFIES
• IN
• WITHIN
• WHEN
SELECT: Array predicates
• Arrays and Objects: Arrays are compared element-wise. Objects are first compared by length; objects of equal length are compared pairwise, with the pairs sorted by name.
• IN clause: Use this when you want to evaluate based on a specific field of the array's elements.
SELECT *
FROM `travel-sample`
WHERE type = 'hotel'
AND ANY r IN reviews
SATISFIES r.ratings.`Value` >= 3
END;
• WITHIN clause: Use this when you don't know which field contains the value you're looking for. The WITHIN operator evaluates to TRUE if the right-side value contains the left-side value as a child or descendant. The NOT WITHIN operator evaluates to TRUE if the right-side value does not contain the left-side value as a child or descendant.
SELECT *
FROM `travel-sample`
WHERE type = 'hotel'
AND ANY r WITHIN reviews
SATISFIES r LIKE '%Ozella%'
END;
• EVERY: EVERY is a range predicate that tests a Boolean condition over the elements or attributes of a collection, object, or objects. It uses the IN and WITHIN operators to range through the collection.
SELECT *
FROM `travel-sample`
WHERE type = 'hotel'
AND EVERY r IN reviews
SATISFIES r.ratings.Cleanliness >= 4
END;
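The difference between ranging with IN (direct elements) and WITHIN (children and descendants at any depth) can be modelled in a few lines of Python. This is a sketch of the semantics only, with made-up review data; `descendants` is a hypothetical helper, not a Couchbase function.

```python
def descendants(value):
    """Yield every child and deeper descendant of a nested JSON-like value."""
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return  # scalars have no children
    for child in children:
        yield child
        yield from descendants(child)

# Made-up reviews array for one hotel document.
reviews = [
    {"author": "Ozella Example", "ratings": {"Value": 4, "Cleanliness": 5}},
    {"author": "Another Reviewer", "ratings": {"Value": 2, "Cleanliness": 4}},
]

# ANY r IN reviews SATISFIES r.ratings.Value >= 3 END
any_value = any(r["ratings"]["Value"] >= 3 for r in reviews)

# EVERY r IN reviews SATISFIES r.ratings.Cleanliness >= 4 END
every_clean = all(r["ratings"]["Cleanliness"] >= 4 for r in reviews)

# EVERY r IN reviews SATISFIES r.ratings.Value >= 3 END
every_value = all(r["ratings"]["Value"] >= 3 for r in reviews)

# ANY r WITHIN reviews SATISFIES r LIKE '%Ozella%' END
any_within = any(
    isinstance(v, str) and "Ozella" in v for v in descendants(reviews)
)

print(any_value, every_clean, every_value, any_within)  # True True False True
```

Note that the WITHIN version finds the author string without naming the `author` field, which is the point of using it when the field is unknown.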
SELECT: Array predicates
• ARRAY_CONTAINS
• Returns true if the array contains the value.
SELECT name, t.public_likes
FROM `travel-sample` t
WHERE type = "hotel" AND
ARRAY_CONTAINS(t.public_likes, "Vallie Ryan") = true;
[
{
"name": "Medway Youth Hostel",
"public_likes": [
"Julius Tromp I",
"Corrine Hilll",
"Jaeden McKenzie",
"Vallie Ryan",
"Brian Kilback",
"Lilian McLaughlin",
"Ms. Moses Feeney",
"Elnora Trantow"
]
}
]
SELECT statement
UNNEST and NEST
Querying Arrays: UNNEST
• UNNEST: If a document or object contains an array, UNNEST performs a join of the nested array with its parent document. Each resulting joined object becomes an input to the query. UNNEST and JOIN operations can be chained.
SELECT r.author, COUNT(r.author) AS authcount
FROM `travel-sample` t UNNEST reviews r
WHERE t.type="hotel"
GROUP BY r.author
ORDER BY COUNT(r.author) DESC
LIMIT 5;
[
{
"authcount": 2,
"author": "Anita Baumbach"
},
{
"authcount": 2,
"author": "Uriah Gutmann"
},
{
"authcount": 2,
"author": "Ashlee Champlin"
},
{
"authcount": 2,
"author": "Cassie O'Hara"
},
{
"authcount": 1,
"author": "Zoe Kshlerin"
}
]
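Conceptually, UNNEST pairs each parent document with each element of its array, after which the usual GROUP BY applies. A small Python model of the query above, using made-up documents (author names `A` and `B` are placeholders):

```python
from collections import Counter

# Made-up keyspace contents; only the fields the query touches are included.
documents = [
    {"type": "hotel", "reviews": [{"author": "A"}, {"author": "B"}]},
    {"type": "hotel", "reviews": [{"author": "A"}]},
    {"type": "airport"},  # no reviews array, so it contributes no joined rows
]

# FROM `travel-sample` t UNNEST reviews r: one (t, r) pair per array element.
unnested = [(t, r) for t in documents for r in t.get("reviews", [])]

# WHERE t.type = "hotel" GROUP BY r.author, counting rows per author.
counts = Counter(r["author"] for t, r in unnested if t["type"] == "hotel")

# ORDER BY count DESC LIMIT 5
top = counts.most_common(5)
print(top)  # [('A', 2), ('B', 1)]
```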
Querying Arrays: NEST
• NEST is conceptually the inverse of UNNEST.
• Nesting performs a join across two keyspaces. Instead of producing a cross-product of the left and right inputs, it produces a single result for each left input; the corresponding right inputs are collected into an array and nested as a single array-valued field in the result object.
SELECT *
FROM `travel-sample` route
NEST `travel-sample` airline
ON KEYS route.airlineid
WHERE route.type = 'route' LIMIT 1;
[
{
"airline": [
{
"callsign": "AIRFRANS",
"country": "France",
"iata": "AF",
"icao": "AFR",
"id": 137,
"name": "Air France",
"type": "airline"
}
],
"route": {
"airline": "AF",
"airlineid": "airline_137",
"destinationairport": "MRS",
"distance": 2881.617376098415,
"equipment": "320",
"id": 10000,
"schedule": [
{
"day": 0,
"flight": "AF198",
"utc": "10:13:00"
},
{
"day": 0,
"flight": "AF547",
"utc": "19:14:00"
},
{
"day": 0,
"flight": "AF943",
Query Performance with Arrays
Array Indexing
• Before Couchbase 4.5, creating an index on an array attribute indexed the entire array as a single scalar value.
CREATE INDEX i1 ON
`travel-sample`(schedule);
"schedule": [
{
"day" : 0,
"flight" : "AI111",
"utc" : "1:11:11"
},
{
"day": 1,
"flight": "AF552",
"utc": "14:41:00"
},
{
"day": 2,
"flight": "AF166",
"utc": "08:59:00"
}, …
]
Array Indexing - motivation
[
{ "day" : 0,
"special_flights" :
[
{ "flight" : "AI111", "utc" : "1:11:11"},
{ "flight" : "AI222", "utc" : "2:22:22" }
]
},
{
"day": 1,
"flight": "AF552",
"utc": "14:41:00"
}, …
]
"London":[
"London",
"Tokyo",
"NewYork",
…
]
Why array indexing?
• When NoSQL databases asked customers to denormalize, the child-table data moved into arrays in the parent documents.
• E.g. each customer document holds all phone numbers, contacts and orders in arrays.
• Not easy to query: the full array value had to be specified in WHERE predicates.
• Ex: list of users who purchased a product (unknown values, large list)
• It was not possible to index part of an array of objects
• Bloated index size (the whole array value is indexed)
• Example: Index just the day field in the array of flights in schedule.
• Performance limitation
• ANY…IN or WITHIN array
• Ease of querying: the full array value must be specified in the WHERE clause
• Manageable for known values or a handful of values
• Difficult for unknown values or large lists of values.
Who wants array indexing?
• Find my crew based on the airline.
WHERE ANY p IN ods.pilot SATISFIES p.filen = "XYZ1012" END ;
• Find my customer based on one of the emails on the customer
WHERE ANY a IN u.telecom SATISFIES a.system = 'email' AND a.value = 'a@b.com' END ;
• Find service qualification based on arrays of arrays.
WHERE ANY c_0 IN `item`.`blackoutserviceblocklist` SATISFIES
ANY c_1 IN c_0.`blackoutserviceblock`.`ppvservicelist` SATISFIES
c_1.`ppvservice`.`eventcode` = "E001"
END
END ;
What is Array Indexing?
• Enables visibility into the array structure
• A subset of array elements can be indexed and searched efficiently
schedule =
[
{ "day" : 0,
"special_flights" :
[
{ "flight" : "AI111" , "utc" : "1:11:11"},
{ "flight" : "AI932" , "utc" : "2:22:22"}
]
},
{
"day": 1,
"flight": "AF552",
"utc": "14:41:00"
}, …
]
How Does Array Indexing Help?
• Index only the required elements or attributes in the array
• Efficient in index storage and search time
• Benefits are much more significant for nested arrays/objects
How Array Indexing Helps: Example
"schedule":
[ { "day" : 0,
"special_flights" : [
{ "flight" : "AI111", "utc" : "1:11:11"},
{ "flight" : "AI222", "utc" : "2:22:22"}
]
},
{
"day": 1,
"flight": "AF552",
"utc": "14:41:00"
},
{
"day": 2,
"flight": "AF166",
"utc": "08:59:00"
}, …
]
"flight":"AF552",
"flight":"AF166",
…
Array Index in Couchbase
Create Array Index
• No syntax changes to DML statements
• Supports all DML statements with a WHERE-clause
• SELECT, UPDATE, DELETE, etc.
• Array indexes are supported only for GSI indexes.
• Supports both standard secondary and memory-optimized indexes.
CREATE INDEX isched ON `travel-sample`
(DISTINCT ARRAY v.flight FOR v IN schedule END) WHERE type = "route";
Array Index syntax
CREATE INDEX isched ON `travel-sample`
(ALL ARRAY p FOR p IN public_likes END)
WHERE type = "hotel" ;
"Julius Smith", [DocID]
"Corrine Hill", [DocID]
"Jaeden McKenzie", [DocID]
"Vallie Ryan", [DocID]
"Brian Kilback", [DocID]
"Lilian McLaughlin", [DocID]
"Ms. Moses Feeney", [DocID]
"Elnora Trantow", [DocID]
"public_likes": [
"Julius Smith",
"Corrine Hill",
"Jaeden McKenzie",
"Vallie Ryan",
"Brian Kilback",
"Lilian McLaughlin",
"Ms. Moses Feeney",
"Elnora Trantow"
]
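The picture above (one index entry per array element, each pointing back to the document ID) can be modelled as an inverted map. The Python sketch below is conceptual only and does not describe the actual GSI storage format; the document IDs and the `build_array_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents keyed by document ID.
docs = {
    "hotel_1": {"type": "hotel", "public_likes": ["Vallie Ryan", "Brian Kilback"]},
    "hotel_2": {"type": "hotel", "public_likes": ["Vallie Ryan"]},
    "airport_1": {"type": "airport"},
}

def build_array_index(documents):
    """Map each public_likes element to the IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        if doc.get("type") != "hotel":        # WHERE type = "hotel"
            continue
        for p in doc.get("public_likes", []):  # ... FOR p IN public_likes END
            index[p].add(doc_id)
    return index

index = build_array_index(docs)

# A predicate such as ANY p IN public_likes SATISFIES p = "Vallie Ryan" END
# becomes a single lookup instead of a scan over every document.
matches = sorted(index.get("Vallie Ryan", set()))
print(matches)  # ['hotel_1', 'hotel_2']
```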
Example - Indexing individual attributes/elements
• "Find the total number of flights scheduled on 3rd day"
CREATE INDEX isched ON `travel-sample`
(DISTINCT ARRAY v.day FOR v IN schedule END) WHERE type = "route" ;
SELECT count(*) FROM `travel-sample`
WHERE type = "route" AND
ANY v IN schedule SATISFIES v.day = 3 END;
Example - Indexing individual attributes/elements
explain SELECT count(1) FROM `travel-sample`
WHERE type = "route" AND
ANY v IN schedule SATISFIES v.day = 3 END;
{
"#operator": "DistinctScan",
"scan": {
"#operator": "IndexScan",
"index": "isched",
"index_id": "2b24c681fa54d83f",
"keyspace": "travel-sample",
"namespace": "default",
"spans": [
{
"Range": {
"High": [
"3"
],
"Inclusion": 3,
"Low": [
"3"
]
Example - Index with Array Elements and Other Attributes
• "Find all scheduled flights with hops, and group by number of stops"
CREATE INDEX iflight_stops ON `travel-sample`
( stops, DISTINCT ARRAY v.flight FOR v IN schedule END )
WHERE type = "route" ;
SELECT * FROM `travel-sample`
WHERE type = "route"
AND ANY v IN schedule SATISFIES v.flight LIKE 'AA%' END
AND stops >= 0;
Example - Indexing Nested Arrays
"schedule" : [
{"day" : 0,
"special_flights" : [
{"flight" : "AI111",
"utc" : "1:11:11"},
{"flight" : "AI222",
"utc" : "2:22:22" }
]
},
{"day": 1,
"flight": "AF552",
"utc": "14:41:00"
} …
]
Example - Indexing Nested Arrays
• "Find the total number of special flights scheduled"
CREATE INDEX inested ON `travel-sample`
(DISTINCT ARRAY
(DISTINCT ARRAY y.flight
FOR y IN x.special_flights END)
FOR x IN schedule END)
WHERE type = "route" ;
SELECT count(*) FROM `travel-sample`
WHERE type ="route" AND
ANY x IN schedule SATISFIES
(ANY y IN x.special_flights
SATISFIES y.flight IS NOT NULL END)
END ;
"schedule":
[ { "day" : 0,
"special_flights" : [
{ "flight" : "AI111", "utc":"1:11:11"},
{ "flight" : "AI222", "utc":"2:22:22"}
]
},
{
"day": 1,
"flight": "AF552",
"utc": "14:41:00"
},
{
"day": 2,
"flight": "AF166",
"utc": "08:59:00"
}, …
]
Example – UNNEST
• N1QL array indexing supports both collection predicates:
• ANY
• ANY AND EVERY
• UNNEST queries can also exploit array indexes
CREATE INDEX idx_crew ON flight
(DISTINCT ARRAY c FOR c IN crew_ids END);
SELECT *
FROM flight UNNEST crew_ids AS c
WHERE c = "Joe Smith" ;
Restrictions in 4.5
Variable names and index keys, such as v and v.day, must be the same in the CREATE INDEX statement and in subsequent SELECT statements.
CREATE INDEX isched ON `travel-sample`
(DISTINCT ARRAY v.day FOR v IN schedule END) WHERE type = "route" ;
SELECT count(*) FROM `travel-sample`
WHERE type = "route" AND
ANY v IN schedule SATISFIES v.day = 3 END;
Restrictions in 4.5
• Supported operators:
DISTINCT ARRAY
ALL ARRAY
ARRAY
ANY
ANY AND EVERY
IN, WITHIN
UNNEST
• NOT supported operators: EVERY
Fun with Arrays
SELECT: Fetch Documents
SELECT * FROM customer USE KEYS ["KEY01"] ;
SELECT * FROM customer USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258"] ;
SELECT status, COUNT(status)
FROM customer c USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258" ]
WHERE c.region = 'US'
GROUP BY status;
SELECT product, COUNT(product)
FROM customer c USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258" ]
INNER JOIN
locations ON KEYS c.lid
WHERE c.region = 'US'
GROUP BY product;
SELECT: JOIN
SELECT COUNT(1)
FROM `beer-sample` beer
INNER JOIN
`beer-sample` brewery ON KEYS beer.brewery_id
WHERE state = 'CA';
• The JOIN operation combines documents from two keyspaces.
• The JOIN criteria are based on the ON KEYS clause.
• The outer table uses the index scan, if possible.
• The inner table (brewery) is fetched document by document.
• Couchbase 4.6 improves this by fetching in batches.
SELECT: JOIN
SELECT COUNT(1)
FROM (
SELECT RAW META().id
FROM `beer-sample` beer
WHERE state = 'CA') as blist
INNER JOIN
`beer-sample` brewery ON KEYS blist;
SELECT COUNT(1)
FROM (
SELECT ARRAY_AGG(META().id) karray
FROM `beer-sample` beer
WHERE state = 'CA') as b
INNER JOIN
`beer-sample` brewery ON KEYS b.karray;
• Why not get all of the required document IDs from the index scan and then do one big bulk get on the outer table?
• Two ways to do it:
a) Use the array aggregate (ARRAY_AGG()) to create the list.
b) Use RAW to create the array and then use that to JOIN.
Data.gov: New York Names
{
"meta": {
"view": {
"id": "25th-nujf",
"name": "Most Popular Baby Names by Sex and Mother's Ethnic Group, New York City",
"category": "Health",
"createdAt": 1382724894,
"description": "The most popular baby names by sex and mother's ethnicity in New York City.",
"displayType": "table",
…
"columns": [{
"id": -1,
"name": "sid",
"dataTypeName": "meta_data",
"fieldName": ":sid",
"position": 0,
"renderTypeName": "meta_data",
"format": {}
}, {
"id": -1,
"name": "id",
"dataTypeName": "meta_data",
"fieldName": ":id",
"position": 0,
"renderTypeName": "meta_data",
"format": {}
}
...
]
"data": [
[1, "EB6FAA1B-EE35-4D55-B07B-8E663565CCDF", 1, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE",
"HISPANIC", "GERALDINE", "13", "75"],
[2, "2DBBA431-D26F-40A1-9375-AF7C16FF2987", 2, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE",
"HISPANIC", "GIA", "21", "67"],
[3, "54318692-0577-4B21-80C8-9CAEFCEDA8BA", 3, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE",
"HISPANIC", "GIANNA", "49", "42"]
...
]
}
Data.gov: New York Names
INSERT INTO nynames (KEY UUID(), VALUE kname)
SELECT {":sid":d[0],
":id":d[1],
":position":d[2],
":created_at":d[3],
":created_meta":d[4],
":updated_at":d[5],
":updated_meta":d[6],
":meta":d[7],
"brth_yr":d[8],
"gndr":d[9],
"ethcty":d[10],
"nm":d[11],
"cnt":d[12],
"rnk":d[13]} kname
FROM (SELECT d FROM datagov UNNEST data d) as u1;
Data.gov: New York Names
INSERT INTO nynames
(
KEY UUID(),
value o
)
SELECT o
FROM (
SELECT meta.`view`.columns[*].fieldName f,
data
FROM datagov) d
UNNEST data d1
LET o = OBJECT p:d1[ARRAY_POSITION(d.f, p)] FOR p IN d.f END ;
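The OBJECT ... FOR ... END construction above pairs each column name with the value at the same position in a data row. A Python sketch of that pairing, assuming a shortened, made-up column list and row (the real dataset has more columns):

```python
# Hypothetical, shortened versions of meta.view.columns[*].fieldName and one data row.
field_names = [":sid", ":id", "brth_yr", "nm"]
row = [1, "row-id-1", "2011", "GERALDINE"]

# OBJECT p : d1[ARRAY_POSITION(d.f, p)] FOR p IN d.f END
# list.index plays the role of ARRAY_POSITION here.
record = {p: row[field_names.index(p)] for p in field_names}

print(record)
# {':sid': 1, ':id': 'row-id-1', 'brth_yr': '2011', 'nm': 'GERALDINE'}
```

When the names are unique, `dict(zip(field_names, row))` gives the same result; the explicit position lookup is kept here only to mirror the N1QL expression.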
SPLIT & CONQUER:
SELECT name FROM `travel-sample`
WHERE type = 'hotel' LIMIT 5;
[
{
"name": "Medway Youth Hostel"
},
{
"name": "The Balmoral Guesthouse"
},
{
"name": "The Robins"
},
{
"name": "Le Clos Fleuri"
},
{
"name": "Glasgow Grand Central"
}
]
• Problem: Search for a word within a string
SPLIT & CONQUER:
select name
from `travel-sample`
where type = 'hotel' and
lower(name) LIKE '%grand%';
[
{
"name": "Glasgow Grand Central"
},
{
"name": "Horton Grand Hotel"
},
{
"name": "Manchester Grand Hyatt"
},
{
"name": "Grande Colonial Hotel"
},
{
"name": "Grand Hotel Serre Chevalier"
},
{
"name": "The Sheraton Grand Hotel"
}
]
• Use the LIKE predicate
• Runs in about 81 milliseconds to search 917 documents
SPLIT & CONQUER:
CREATE INDEX idxtravelname ON `travel-sample`
(DISTINCT ARRAY wrd FOR wrd IN SPLIT(LOWER(name)) END)
WHERE type = 'hotel';
SELECT name FROM `travel-sample`
WHERE ANY wrd IN SPLIT(LOWER(name)) SATISFIES wrd = 'grand' END
AND type = 'hotel';
[
{
"name": "The Sheraton Grand Hotel"
},
{
"name": "Horton Grand Hotel"
},
{
"name": "Grand Hotel Serre Chevalier"
},
{
"name": "Glasgow Grand Central"
},
{
"name": "Manchester Grand Hyatt"
}
]
• Convert the name to lower case.
• Split the name into words.
• SPLIT() returns an ARRAY of these words.
• Create the INDEX on this array.
• Query using the array predicate.
• Query runs in 10 ms.
• Benefits grow with the number of documents.
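The two result sets differ: the LIKE query also returns "Grande Colonial Hotel", while the SPLIT-based query matches whole words only. A short Python sketch of that difference, using three of the hotel names from the results as sample data:

```python
names = ["Glasgow Grand Central", "Grande Colonial Hotel", "The Robins"]

# lower(name) LIKE '%grand%': substring match anywhere in the string.
like_matches = [n for n in names if "grand" in n.lower()]

# ANY wrd IN SPLIT(LOWER(name)) SATISFIES wrd = 'grand' END: whole-word match.
word_matches = [n for n in names if "grand" in n.lower().split()]

print(like_matches)  # ['Glasgow Grand Central', 'Grande Colonial Hotel']
print(word_matches)  # ['Glasgow Grand Central']
```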
Bucket: article
[
{
"tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL",
"title": "What's in a New York Name? Unlock data.gov Using N1QL "
}, {
"tags": "TWITTER,NOSQL,SQL,QUERIES,ANALYSIS,HASHTAGS,JSON,COUCHBASE,ANALYTICS,INDEX",
"title": "SQL on Twitter: Analysis Made Easy Using N1QL"
}, {
"tags": "CONCURRENCY,MONGODB,COUCHBASE,INDEX,READ,WRITE,PERFORMANCE,SNAPSHOT,CONSISTENCY",
"title": "Concurrency Behavior: MongoDB vs. Couchbase"
}, {
"tags": "COUCHBASE,N1QL,JOIN,PERFORMANCE,INDEX,DATA MODEL,FLEXIBLE,SCHEMA",
"title": "JOIN Faster With Couchbase Index JOINs"
}, {
"tags": "NOSQL,NOSQL,BENCHMARK,SQL,JSON,COUCHBASE,MONGODB,YCSB,PERFORMANCE,QUERY,INDEX",
"title": "How Couchbase Won YCSB"
}
]
Questions:
Find all the articles with N1QL in their title
Find all the articles with COUCHBASE in their tags
[
{
"tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL",
"title": "What's in a New York Name? Unlock data.gov Using N1QL "
}, {
"tags": "TWITTER,NOSQL,SQL,QUERIES,ANALYSIS,HASHTAGS,JSON,COUCHBASE,ANALYTICS,INDEX",
"title": "SQL on Twitter: Analysis Made Easy Using N1QL"
}, {
"tags": "CONCURRENCY,MONGODB,COUCHBASE,INDEX,READ,WRITE,PERFORMANCE,SNAPSHOT,CONSISTENCY",
"title": "Concurrency Behavior: MongoDB vs. Couchbase"
}, {
"tags": "COUCHBASE,N1QL,JOIN,PERFORMANCE,INDEX,DATA MODEL,FLEXIBLE,SCHEMA",
"title": "JOIN Faster With Couchbase Index JOINs"
}, {
"tags": "NOSQL,NOSQL,BENCHMARK,SQL,JSON,COUCHBASE,MONGODB,YCSB,PERFORMANCE,QUERY,INDEX",
"title": "How Couchbase Won YCSB"
}
]
Basic Framework
"tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL"
[
"JSON",
"N1QL",
"COUCHBASE",
"BIGDATA",
"NAME",
"data.gov",
"SQL"
]
SPLIT() into an array, then index this array (Array Index):
DISTINCT ARRAY wrd FOR wrd IN SPLIT(tags, ",") END
Query through the N1QL Query Service:
SELECT *
FROM articles
WHERE ANY wrd IN SPLIT(tags, ",")
SATISFIES wrd = "COUCHBASE"
END
Basic Framework
"title": "What's in a New York Name? Unlock data.gov Using N1QL "
[
"What's",
"in",
"a",
"New",
"York",
"Name?",
"Unlock",
"data.gov",
"Using",
"N1QL"
]
SPLIT() into an array
Array Index: ???
N1QL Query Service: ???
New Function: TOKENS() in Couchbase 4.6 – out in the Developer Preview (DP) now.
TOKENS(expression [, parameter])
expression : JSON expression
parameter : options
{"names":true} Include the key names from each JSON "key":value pair.
{"case":"lower"} Return the values in upper or lower case.
{"specials":true} Recognize special characters such as @ and - when forming tokens.
select title, tags, tokens(tags, {"case":"lower"}) tagsarray, tokens(title) titlearray from articles limit 1;
"tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL",
"tagsarray": [
  "data",
  "gov",
  "bigdata",
  "n1ql",
  "couchbase",
  "sql",
  "json",
  "name"
],
"title": "What's in a New York Name? Unlock data.gov Using N1QL ",
"titlearray": [
  "s",
  "Unlock",
  "data",
  "N1QL",
  "gov",
  "in",
  "Using",
  "New",
  "What",
  "York",
  "a",
  "Name"
]
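To see why the output above differs from a plain SPLIT(), here is a rough Python stand-in for the tokenization step. It assumes that a token is any run of letters or digits, which reproduces the two arrays shown on this slide. The real TOKENS() function has more behavior (the "names" and "specials" options, non-string values), so treat the regular expression as an assumption, not as Couchbase's algorithm.

```python
import re

def tokens(text, case=None):
    """Rough stand-in for TOKENS(): split on anything that is not a letter
    or digit, optionally change case, and return the distinct tokens."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if case == "lower":
        words = [w.lower() for w in words]
    elif case == "upper":
        words = [w.upper() for w in words]
    return set(words)

title = "What's in a New York Name? Unlock data.gov Using N1QL "
tags = "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL"

print(sorted(tokens(tags, case="lower")))
# ['bigdata', 'couchbase', 'data', 'gov', 'json', 'n1ql', 'name', 'sql']

# Splitting on spaces keeps punctuation attached, so an exact match on "Name" fails.
print("Name" in title.split())   # False
print("Name" in tokens(title))   # True
```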
Using TOKENS() – Index on title, lower case
create index ititlesearch on articles(distinct array wrd for wrd in tokens(title, {"case":"lower"}) end);
explain select title from articles where any wrd in tokens(title, {"case":"lower"}) satisfies wrd = 'n1ql' end;
{
  "#operator": "DistinctScan",
  "scan": {
    "#operator": "IndexScan",
    "index": "ititlesearch",
    "index_id": "7a162af1199565b5",
    "keyspace": "articles",
    "namespace": "default",
    "spans": [
      {
        "Range": {
          "High": [
            "\"n1ql\""
          ],
          "Inclusion": 3,
          "Low": [
            "\"n1ql\""
          ]
        }
      }
    ],
    "using": "gsi"
  }
},
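The span in this plan has the same Low and High value with "Inclusion": 3, meaning both ends are included, so the equality predicate becomes a range scan over a single token. The Python sketch below models that lookup on a sorted list. The index entries and document keys are hypothetical, and the functions range_scan and distinct_scan are illustrative names, not indexer APIs.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries: (token, document key) pairs kept in sorted order.
entries = sorted([
    ("json", "a3"), ("n1ql", "a1"), ("n1ql", "a2"), ("sql", "a2"), ("twitter", "a2"),
])

def range_scan(entries, low, high):
    """Return document keys whose token lies in [low, high], both ends inclusive."""
    keys = [token for token, _ in entries]
    start = bisect_left(keys, low)    # first entry >= low
    stop = bisect_right(keys, high)   # one past the last entry <= high
    return [doc_id for _, doc_id in entries[start:stop]]

def distinct_scan(doc_ids):
    """Model of the DistinctScan operator: drop repeated document keys, keep order."""
    return list(dict.fromkeys(doc_ids))

print(distinct_scan(range_scan(entries, "n1ql", "n1ql")))  # ['a1', 'a2']
```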
Using TOKENS() – Index on the WHOLE document
create index ititlesearch2 on articles
(distinct array wrd for wrd in tokens(articles, {"case":"lower", "names":true}) end);
explain select title from articles where
any wrd in tokens(articles, {"case":"lower", "names":true}) satisfies wrd = 'title' end;
"#operator": "DistinctScan",
"scan": {
  "#operator": "IndexScan",
  "index": "ititlesearch2",
  "index_id": "c60792ca9f957cfd",
  "keyspace": "articles",
  "namespace": "default",
  "spans": [
    {
      "Range": {
        "High": [
          "\"title\""
        ],
        "Inclusion": 3,
        "Low": [
          "\"title\""
        ]
      }
    }
  ],
  "using": "gsi"
}
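With "names":true the field names themselves become tokens, so the predicate wrd = 'title' matches any document that has a title field. A minimal Python sketch of that effect follows. It assumes tokens are runs of letters or digits and it skips numbers, booleans and nulls, which the real TOKENS() may treat differently. The helper doc_tokens is a hypothetical name.

```python
import re

def doc_tokens(value, names=False, case=None):
    """Collect distinct tokens from every string in a JSON-like value.
    With names=True, field names are tokenized as well."""
    found = set()

    def add_words(text):
        for word in re.findall(r"[A-Za-z0-9]+", text):
            found.add(word.lower() if case == "lower" else word)

    def walk(node):
        if isinstance(node, dict):
            for key, child in node.items():
                if names:
                    add_words(key)
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)
        elif isinstance(node, str):
            add_words(node)
        # numbers, booleans and null are skipped in this sketch

    walk(value)
    return found

article = {"tags": "COUCHBASE,YCSB", "title": "How Couchbase Won YCSB"}
print("title" in doc_tokens(article, names=True, case="lower"))  # True
print("title" in doc_tokens(article, case="lower"))              # False
```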
Keshav Murthy
Director, Couchbase Engineering
keshav@couchase.com
Thank You!
Goal of N1QL
Give developers and enterprises an expressive,
powerful, and complete language for querying,
transforming, and manipulating JSON data.
Array Indexing – How array is expanded in GSI
Sl. | Create Index Expression | Key versions generated by Projector | Index Entries in GSI storage
1. | age | [K1] | [K1]docid
2. | age, name, children | [K1, K2, [c1, c2, c3]] | [K1, K2, [c1, c2, c3]]docid
3. | ALL ARRAY c FOR c IN cities END | [[K11, K12, K13]] | [K11]docid, [K12]docid, [K13]docid
4. | ALL ARRAY c FOR c IN cities END, age | [[K11, K12, K13], K2] | [K11, K2]docid, [K12, K2]docid, [K13, K2]docid
4.1 | age, ALL ARRAY c FOR c IN cities END, name | [K1, [K21, K22, K23], K3] | [K1, K21, K3]docid, [K1, K22, K3]docid, [K1, K23, K3]docid
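The expansion in the table can be sketched in Python as follows. This is an illustration of the rows above, not the projector's actual code. The AllArray marker class and the index_entries function are hypothetical names, and the sketch assumes at most one ALL ARRAY component per index.

```python
from itertools import product

class AllArray:
    """Marks the index key component written as ALL ARRAY ... END."""
    def __init__(self, elements):
        self.elements = list(elements)

def index_entries(key_version, docid):
    """Expand one key version into the entries stored for docid.
    The ALL ARRAY component contributes one entry per element;
    every other component (including a plain array) is copied unchanged."""
    choices = [k.elements if isinstance(k, AllArray) else [k] for k in key_version]
    return [(list(combo), docid) for combo in product(*choices)]

# Row 2: children is indexed as a whole array, so there is a single entry.
print(index_entries(["K1", "K2", ["c1", "c2", "c3"]], "docid"))
# [(['K1', 'K2', ['c1', 'c2', 'c3']], 'docid')]

# Row 4.1: the array component sits between two scalar keys.
for entry in index_entries(["K1", AllArray(["K21", "K22", "K23"]), "K3"], "docid"):
    print(entry)
# (['K1', 'K21', 'K3'], 'docid')
# (['K1', 'K22', 'K3'], 'docid')
# (['K1', 'K23', 'K3'], 'docid')
```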
Array Indexing – How array is expanded in GSI
Sl. | Create Index Expression | Key versions generated by Projector | Index Entries in GSI storage
5. | ALL ARRAY c FOR c IN cities END, children | [[K11, K12, K13], [c1, c2, c3]] | [K11, [c1, c2, c3]]docid, [K12, [c1, c2, c3]]docid, [K13, [c1, c2, c3]]docid
6. | ALL ARRAY (ALL ARRAY y FOR y IN c END) FOR c IN cities END | [[K1, K2, K3, K4, K5]] | [K1]docid, [K2]docid, [K3]docid, [K4]docid, [K5]docid
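Row 6 flattens an array of arrays before the entries are written. A short Python model of that nested expression is shown below. The shape chosen for cities (three inner arrays) is an assumption made for the example, since the slide only shows the flattened result.

```python
def flatten_nested(cities):
    """Model of ALL ARRAY (ALL ARRAY y FOR y IN c END) FOR c IN cities END:
    every element of every inner array becomes its own index key."""
    return [y for c in cities for y in c]

# Assumed input shape: cities is an array of arrays.
cities = [["K1", "K2"], ["K3"], ["K4", "K5"]]

entries = [([key], "docid") for key in flatten_nested(cities)]
for entry in entries:
    print(entry)
# (['K1'], 'docid')
# (['K2'], 'docid')
# (['K3'], 'docid')
# (['K4'], 'docid')
# (['K5'], 'docid')
```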
Array Indexing Performance in ForestDB (3.6K sets)
Metrics | KPI | Measured | Comments
Array Q2 (stale=ok) | 13000 | 15140 | Single doc match & fetch
Array Q2 (stale=false) | 700 | 9420 | Same with consistency
Array Q3 (stale=ok) | 1100 | 1435 | 100 doc match and fetch
Array Q3 (stale=false) | 428 | 1084 | Same with consistency
Array Indexing Performance in MOI with 30K sets
Metrics | KPI | Measured | Comments
Array Q2 (stale=ok) | 13000 | 15251 | Single doc match & fetch
Array Q2 (stale=false) | 700 | 7545 | Same with consistency
Array Q3 (stale=ok) | 1100 | 1371 | 100 doc match and fetch
Array Q3 (stale=false) | 428 | 1580 | Same with consistency
UNITED – POC on 4.0
Response times in milliseconds
Query | 1 Thread | 10 Threads | 50 Threads
Q1 | 13 | 35.1 | 197.84
Q2 | 28 | 66.8 | 285.32
Q3 - 7d | 160 | 606 | 2960.2
Q3 - 28d | 1725 | 8240.3 | 41439.86
Response times in milliseconds
Query | 1 Thread | 5 Threads
Q1 | 1500 | 31000
Q2 | Timed out.
Q3 | 23000 | 90000
MongoDB Query
Couchbase Query
UNITED – POC
• Query 2 – Get the selected flight using the document key. For each crew member (pilot and flight attendant) found in the flight details:
  • Fetch the previous flight assigned to the crew member
  • Fetch the next flight assigned to the crew member
select
  ods.GMT_EST_DEP_DTM, ods.PRFL_ACT_GMT_DEP_DTM, ods.PRFL_SCHED_GMT_DEP_DTM, ods.GMT_EST_ARR_DTM,
  ods.PRFL_ACT_GMT_ARR_DTM, ods.PRFL_SCHED_GMT_ARR_DTM, ods.FLT_LCL_ORIG_DT, ods.PRFL_FLT_NBR,
  ods.PRFL_TAIL_NBR, PILOT.PRPS_RSV_IND
from ods unnest ods.PILOT
where ods.TYPE = 'CREW_ON_FLIGHT' and
  ((ods.PRFL_ACT_GMT_DEP_DTM is not missing and
    ods.PRFL_ACT_GMT_DEP_DTM > "2015-07-15T02:45:00Z") OR
   (ods.PRFL_ACT_GMT_DEP_DTM is missing and ods.GMT_EST_DEP_DTM is not null
    and ods.GMT_EST_DEP_DTM > "2015-07-15T02:45:00Z"))
  and any p in ods.PILOT satisfies p.FILEN = "U110679" end
order by ods.GMT_EST_DEP_DTM limit 1
UNITED – POC Queries on 4.5
• 422,137 documents.
• Query 2: BEFORE array indexing
  • Primary index scan
  • 38.91 seconds
create index idx_odspilot on ods(DISTINCT ARRAY p.FILEN FOR p IN PILOT END);
• Query 2: AFTER array indexing
  • Array index scan [DistinctScan]
  • 8.51 milliseconds
  • Improvement of about 4,572 times
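The improvement factor follows directly from the two timings once they are in the same unit. A quick check in Python:

```python
before_ms = 38.91 * 1000   # 38.91 seconds with the primary index scan
after_ms = 8.51            # 8.51 milliseconds with the array index scan

speedup = before_ms / after_ms
print(round(speedup))      # 4572
```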
Array Indexing – Size and numbers
• There is no limit on the number of elements in the array.
• The total size of the array index key should not exceed the max_array_seckey_size setting (default = 10K).
CREATE INDEX i1 ON default(ALL flights, airlineid). Let's say a given document is:
{
  "flights": ["AF552", "AF166", "AF268", "AF422"],
  "airlineid": "airline_137"
}
The indexable array keys for the document are:
[ ["AF552", "airline_137"], ["AF166", "airline_137"], ["AF268", "airline_137"], ["AF422", "airline_137"] ]
The sum of the lengths of the items above should be less than max_array_seckey_size. The setting can be increased but not decreased.
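The size check can be approximated in Python. This sketch assumes that "10K" means 10 * 1024 bytes and that each key is measured as compact JSON text. The indexer's real encoding and byte accounting may differ, so the numbers are only indicative. The function array_key_size is a hypothetical helper.

```python
import json

MAX_ARRAY_SECKEY_SIZE = 10 * 1024  # assumption: the "10K" default read as 10240 bytes

def array_key_size(doc):
    """Build the indexable keys for an index on (ALL flights, airlineid)
    and return them with their combined length as compact JSON text."""
    keys = [[flight, doc["airlineid"]] for flight in doc["flights"]]
    total = sum(len(json.dumps(key, separators=(",", ":"))) for key in keys)
    return keys, total

doc = {
    "flights": ["AF552", "AF166", "AF268", "AF422"],
    "airlineid": "airline_137",
}

keys, total = array_key_size(doc)
print(len(keys), total)                 # 4 92
print(total < MAX_ARRAY_SECKEY_SIZE)    # True
```

Because airlineid is repeated in every entry, the total grows with the number of array elements multiplied by the size of the other keys, not just with the array itself.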
Statements: MERGE
BIG MERGE statement – Use travel-sample
explain merge into b1 using b2 on key "11" when matched then update set b1.o3=1;
merge into b1 using (select id from b2 where x < 10) as b3 on key b3.id when matched then update set b1.o4=1;
merge into `travel-sample` using default on key "2" when matched then update set `travel-sample`.name="aaa";
MERGE into WAREHOUSE using `beer-sample` ON KEY to_string("yakima_brewing_and_malting_grant_s_ales-deep_powder_winter_ale") when matched then delete;
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 

Utilizing Arrays: Modeling, Querying and Indexing

  • 1. ©2016 Couchbase Inc. { "Utilizing Arrays" : ["Modeling", "Querying", "Indexing"] } Keshav Murthy, Director, Couchbase R&D
  • 2. Agenda • Introduction to Arrays • Data Modeling with Arrays • Query Performance with Arrays • Array Indexing • Fun with Arrays • Query Performance • Tag Search • String Search
  • 3. Introduction to Arrays
  • 4. Every N1QL query returns arrays. cbq> SELECT DISTINCT type FROM `travel-sample`; { … "results": [ { "type": "route" }, { "type": "airport" }, { "type": "hotel" }, { "type": "airline" }, { "type": "landmark" } ], "status": "success", "metrics": { "elapsedTime": "840.518052ms", "executionTime": "840.478414ms", "resultCount": 5, "resultSize": 202 } } The result of every query is an array, even when no document matches: cbq> SELECT * FROM `travel-sample` WHERE type = 'airport' AND faa = 'BLR'; { "results": [], "metrics": { "elapsedTime": "9.606755ms", "executionTime": "9.548749ms", "resultCount": 0, "resultSize": 0 } }
  • 6. Introduction to Arrays • An arrangement of quantities or symbols in rows and columns; a matrix • An indexed set of related elements
  • 7. JSON Arrays { "Name" : "Jane Smith", "DOB" : "1990-01-30", "hobbies" : ["lego", "piano", "badminton", "robotics"], "scores" : [3.4, 2.9, 9.2, 4.1], "legos" : [ true, 9292, "fighter 2", { "name" : "Millenium Falcon", "type" : "Starwars" } ] } • Arrays in JSON can contain simple values, or any combination of JSON types within the same array. • No type or structure is enforced within the array.
  • 8. JSON Arrays { "Name": "Jane Smith", "DOB" : "1990-01-30", "phones" : [ "+1 510-523-3529", "+1 650-392-4923" ], "Billing": [ { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" }, { "type": "master", "cardnum": "6274-2542-5847-3949", "expiry": "2018-12" } ] } • phones holds two phone number entries. • Billing has two credit card entries, stored as an array.
  • 9. JSON Arrays: Syntax Diagram
  • 10. Data Modeling with Arrays
  • 11. Properties of Real-World Data • Rich structure: attributes, sub-structure • Relationships: to other data • Value evolution: data is updated • Structure evolution: data is reshaped • Example: a Customer with Name, DOB, Billing, Connections, Purchases
  • 12. Modeling Data in the Relational World (tables: Customer, Billing, Connections, Purchases, Contacts) • Rich structure: normalize and JOIN queries • Relationships: JOINs and constraints • Value evolution: INSERT, UPDATE, DELETE • Structure evolution: ALTER TABLE, application downtime, application migration, application versioning
  • 13. Using JSON for Real-World Data. Table: Customer (CustomerID, Name, DOB): (CBL2015, Jane Smith, 1990-01-30). Customer DocumentKey: CBL2015 { "Name" : "Jane Smith", "DOB" : "1990-01-30" } OR { "Name" : { "fname": "Jane", "lname": "Smith" }, "DOB" : "1990-01-30" } • The primary key (CustomerID) becomes the DocumentKey. • Each column name and column value become a key-value pair.
  • 14. Using JSON to Store Data. Table: Customer (CustomerID, Name, DOB): (CBL2015, Jane Smith, 1990-01-30). Table: Billing (CustomerID, Type, Cardnum, Expiry): (CBL2015, visa, 5827…, 2019-03). Customer DocumentKey: CBL2015 { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" } ] } • Rich structure and relationships • Billing information is stored as a sub-document. • There could be more than one credit card, so use an array.
  • 15. Using JSON to Store Data. Table: Customer (CustomerID, Name, DOB): (CBL2015, Jane Smith, 1990-01-30). Table: Billing (CustomerID, Type, Cardnum, Expiry): (CBL2015, visa, 5827…, 2019-03), (CBL2015, master, 6274…, 2018-12). Customer DocumentKey: CBL2015 { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2542-5847-3949", "expiry" : "2018-12" } ] } • Value evolution: simply add another array element or update a value.
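The table-to-document mapping on slides 13 to 15 can be sketched in Python. This is only an illustration of the modeling idea: the row dictionaries and the `to_documents` helper are hypothetical names, not part of Couchbase or of the original deck.

```python
# Hypothetical rows, shaped like the Customer and Billing tables on the slides.
customer_rows = [
    {"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"},
]
billing_rows = [
    {"CustomerID": "CBL2015", "type": "visa",
     "cardnum": "5827-2842-2847-3909", "expiry": "2019-03"},
    {"CustomerID": "CBL2015", "type": "master",
     "cardnum": "6274-2542-5847-3949", "expiry": "2018-12"},
]


def to_documents(customers, billing):
    """Fold child rows into an array on the parent; return {document_key: document}."""
    docs = {}
    for row in customers:
        # The primary key becomes the document key, the columns become key-value pairs.
        docs[row["CustomerID"]] = {"Name": row["Name"], "DOB": row["DOB"], "Billing": []}
    for row in billing:
        # The foreign key is implied by nesting, so it is dropped from the element.
        card = {k: v for k, v in row.items() if k != "CustomerID"}
        docs[row["CustomerID"]]["Billing"].append(card)
    return docs


docs = to_documents(customer_rows, billing_rows)
```

Adding a third card later is a single append to `docs["CBL2015"]["Billing"]`, which is the "value evolution" point made on slide 15.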
  • 16. Using JSON to Store Data. Table: Connections (CustomerID, ConnId, Name): (CBL2015, XYZ987, Joe Smith), (CBL2015, SKR007, Sam Smith), (CBL2015, RGV492, Rav Smith). Customer DocumentKey: CBL2015 { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2542-5847-3949", "expiry" : "2018-12" } ], "Connections" : [ { "ConnId" : "XYZ987", "Name" : "Joe Smith" }, { "ConnId" : "SKR007", "Name" : "Sam Smith" }, { "ConnId" : "RGV491", "Name" : "Rav Smith" } ] } Structure evolution: • Simply add new key-value pairs • No downtime to add new KV pairs • Applications can validate data • Structure evolves over time • Relations via reference
  • 17. Using JSON to Store Data. Customer DocumentKey: CBL2015 { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2542-5847-3949", "expiry" : "2018-12" } ], "Connections" : [ { "ConnId" : "XYZ987", "Name" : "Joe Smith" }, { "ConnId" : "SKR007", "Name" : "Sam Smith" }, { "ConnId" : "RGV491", "Name" : "Rav Smith" } ], "Purchases" : [ { "id": 12, "item": "mac", "amt": 2823.52 }, { "id": 19, "item": "ipad2", "amt": 623.52 } ] } Table: Customer (CustomerID, Name, DOB): (CBL2015, Jane Smith, 1990-01-30). Table: Billing (CustomerID, Type, Cardnum, Expiry): (CBL2015, visa, 5827…, 2019-03), (CBL2015, master, 6274…, 2018-12). Table: Connections (CustomerID, ConnId, Name): (CBL2015, XYZ987, Joe Smith), (CBL2015, SKR007, Sam Smith), (CBL2015, RGV492, Rav Smith). Table: Purchases (CustomerID, item, amt): (CBL2015, mac, 2823.52), (CBL2015, ipad2, 623.52). Table: Contacts (CustomerID, ConnId, Name): (CBL2015, XYZ987, Joe Smith), (CBL2015, SKR007, Sam Smith).
  • 18. Models for Representing Data
      Data Concern | Relational Model | JSON Document Model (NoSQL)
      Rich Structure | Multiple flat tables; constant assembly / disassembly | Documents; no assembly required
      Relationships | Represented; queried (SQL) | Represented; N1QL, MongoDB, CQL
      Value Evolution | Data can be updated | Data can be updated
      Structure Evolution | Uniform and rigid; manual change (disruptive) | Flexible; dynamic change
  • 19. Querying Arrays
  • 20. Querying Arrays • Array Access • Expressions • Functions • Aggregates • Statements • Array Clauses
  • 21. Array Access: Expressions, Functions and Aggregates • EXPRESSIONS: ARRAY, ANY, EVERY, IN, WITHIN, construct [elem], slice array[start:end], selection array[#pos] • FUNCTIONS: ISARRAY, TYPE, ARRAY_APPEND, ARRAY_CONCAT, ARRAY_CONTAINS, ARRAY_DISTINCT, ARRAY_IFNULL, ARRAY_FLATTEN, ARRAY_INSERT, ARRAY_INTERSECT, ARRAY_LENGTH, ARRAY_POSITION, ARRAY_PREPEND, ARRAY_PUT, ARRAY_RANGE, ARRAY_REMOVE, ARRAY_REPEAT, ARRAY_REPLACE, ARRAY_REVERSE, ARRAY_SORT, ARRAY_STAR • AGGREGATES: ARRAY_AVG, ARRAY_COUNT, ARRAY_MIN, ARRAY_MAX, ARRAY_SUM
  • 22. Array access. Document in t: { "Name": "Jane Smith", "DOB" : "1990-01-30", "phones" : [ "+1 510-523-3529", "+1 650-392-4923" ], "Billing": [ { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" }, { "type": "master", "cardnum": "6274-2542-5847-3949", "expiry": "2018-12" } ] } SELECT phones FROM t; Result: [ { "phones": [ "+1 510-523-3529", "+1 650-392-4923" ] } ] SELECT phones[1] FROM t; Result: [ { "$1": "+1 650-392-4923" } ] SELECT phones[0:1] FROM t; Result: [ { "$1": [ "+1 510-523-3529" ] } ]
  • 23. Array access: Expressions and functions. Document in t: { "Name": "Jane Smith", "DOB" : "1990-01-30", "phones" : [ "+1 510-523-3529", "+1 650-392-4923" ], "Billing": [ { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" }, { "type": "master", "cardnum": "6274-2542-5847-3949", "expiry": "2018-12" } ] } SELECT Billing[0].cardnum FROM t; Result: [ { "cardnum": "5827-2842-2847-3909" } ] SELECT Billing[*].cardnum FROM t; Result: [ { "cardnum": [ "5827-2842-2847-3909", "6274-2542-5847-3949" ] } ] SELECT ISARRAY(Name) name, ISARRAY(phones) phones FROM t; Result: [ { "name": false, "phones": true } ]
  • 24. Array access: Functions. Document in t: { "Name": "Jane Smith", "DOB" : "1990-01-30", "phones" : [ "+1 510-523-3529", "+1 650-392-4923" ], "Billing": [ { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" }, { "type": "master", "cardnum": "6274-2542-5847-3949", "expiry": "2018-12" } ] } SELECT ARRAY_CONCAT(phones, ["+1 408-284-2921"]) FROM t; Result: [ { "$1": [ "+1 510-523-3529", "+1 650-392-4923", "+1 408-284-2921" ] } ] SELECT ARRAY_COUNT(Billing) billing, ARRAY_COUNT(phones) phones FROM t; Result: [ { "billing": 2, "phones": 2 } ]
  • 25. Array access: Functions SELECT phones, ARRAY_REVERSE(phones) reverse FROM t; Result: [ { "phones": [ "+1 510-523-3529", "+1 650-392-4923" ], "reverse": [ "+1 650-392-4923", "+1 510-523-3529" ] } ] SELECT phones, ARRAY_INSERT(phones, 0, "+1 415-439-4923") newlist FROM t; Result: [ { "newlist": [ "+1 415-439-4923", "+1 510-523-3529", "+1 650-392-4923" ], "phones": [ "+1 510-523-3529", "+1 650-392-4923" ] } ]
  • 26. Array access: Aggregates SELECT ARRAY_MIN(Billing) AS minbill FROM t; Result: [ { "minbill": { "cardnum": "5827-2842-2847-3909", "expiry": "2019-03", "type": "visa" } } ] SELECT name, ARRAY_AVG(reviews[*].ratings[*].Overall) AS avghotelrating FROM `travel-sample` WHERE type = 'hotel' ORDER BY avghotelrating DESC LIMIT 3; Result: [ { "avghotelrating": 5, "name": "Culloden House Hotel" }, { "avghotelrating": 5, "name": "The Bulls Head" }, { "avghotelrating": 5, "name": "La Pradella" } ]
  • 27. SELECT: ARRAY & FIRST Expression. ARRAY: The ARRAY operator lets you map and filter the elements or attributes of a collection, object, or objects. It evaluates to an array of the operand expression for the elements that satisfy the WHEN clause, if provided. SELECT ARRAY [name, r.ratings.`Value`] FOR r IN reviews WHEN r.ratings.`Value` = 4 END FROM `travel-sample` WHERE type = 'hotel'; FIRST: The FIRST operator lets you map and filter the elements or attributes of a collection, object, or objects. It evaluates to a single element: the operand expression for the first element that satisfies the WHEN clause, if provided. SELECT FIRST [name, r.ratings.`Value`] FOR r IN reviews WHEN r.ratings.`Value` = 4 END FROM `travel-sample` WHERE type = 'hotel';
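As a rough mental model, ARRAY ... FOR ... WHEN ... END behaves like a filtered list comprehension, and FIRST like taking only the first item of that comprehension. The Python sketch below mirrors the two queries on slide 27 against a hypothetical document (it is not taken from `travel-sample`), and `None` merely stands in for "no element matched".

```python
# Hypothetical hotel document with the same shape as the slide's queries expect.
hotel = {
    "name": "Example Hotel",
    "reviews": [
        {"author": "a", "ratings": {"Value": 4}},
        {"author": "b", "ratings": {"Value": 2}},
        {"author": "c", "ratings": {"Value": 4}},
    ],
}

# ARRAY [name, r.ratings.Value] FOR r IN reviews WHEN r.ratings.Value = 4 END
array_result = [
    [hotel["name"], r["ratings"]["Value"]]
    for r in hotel["reviews"]
    if r["ratings"]["Value"] == 4
]

# FIRST [name, r.ratings.Value] FOR r IN reviews WHEN r.ratings.Value = 4 END
# Same mapping and filter, but evaluation stops at the first matching element.
first_result = next(
    (
        [hotel["name"], r["ratings"]["Value"]]
        for r in hotel["reviews"]
        if r["ratings"]["Value"] == 4
    ),
    None,
)
```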
  • 28. Statements • INSERT: INSERT documents with arrays; INSERT multiple documents with arrays; INSERT the documents returned by a SELECT • UPDATE: UPDATE specific elements and objects within an array • DELETE: DELETE documents based on values within one or more arrays • MERGE: MERGE documents to INSERT, UPDATE or DELETE documents • SELECT: fetch documents given an array of keys; JOIN based on an array of keys; predicates (filters) on arrays; array expressions, functions and aggregates; UNNEST and NEST operations
  • 29. Statements: INSERT INSERT INTO customer VALUES ("KEY01", { "cid": "ABC01", "orders": ["LG012", "LG482", "LG134"] }); INSERT INTO customer VALUES ("KEY01", { "cid": "XYC21", "orders": ["LG92", "LG859"] }), VALUES ("KEY04", { "cid": "PQR49", "orders": ["LG47", "LG09", "LG134"] }), VALUES ("KEY09", { "cid": "KDL29", "orders": ["LG082"] }); INSERT INTO customer ( KEY UUID(), VALUE c ) SELECT mycustomers AS c FROM newcustomers AS n WHERE n.type = "premium";
  • 30. Statements: DELETE DELETE FROM customer WHERE orders = ["LG012", "LG482", "LG134"]; DELETE FROM customer WHERE ANY o IN orders SATISFIES o = "LG012" END; DELETE FROM customer WHERE ANY o IN orders SATISFIES o = "LG012" END RETURNING meta().id, *;
  • 31. Statements: UPDATE UPDATE customer USE KEYS ["KEY091"] SET orders = ["LG012", "LG482", "LG134"]; UPDATE customer USE KEYS ["KEY091"] SET orders = ARRAY_REMOVE(orders, "LG012"); UPDATE customer USE KEYS ["KEY091"] SET orders = ARRAY_APPEND(orders, "LG892");
  • 32. Statements: SELECT • Array predicates • NEST, UNNEST • Fetch documents given an array of keys • JOIN based on an array of keys
  • 33. SELECT statement: Array Predicates
  • 34. SELECT: Array predicates • ANY • EVERY • SATISFIES • IN • WITH • WHEN
  • 35. SELECT: Array predicates • Arrays and objects: Arrays are compared element-wise. Objects are first compared by length; objects of equal length are compared pairwise, with the pairs sorted by name. • IN clause: Use this when you want to evaluate based on a specific field. SELECT * FROM `travel-sample` WHERE type = 'hotel' AND ANY r IN reviews SATISFIES r.ratings.`Value` >= 3 END; • WITHIN clause: Use this when you don't know which field contains the value you're looking for. The WITHIN operator evaluates to TRUE if the right-side value contains the left-side value as a child or descendant. The NOT WITHIN operator evaluates to TRUE if the right-side value does not contain the left-side value as a child or descendant. SELECT * FROM `travel-sample` WHERE type = 'hotel' AND ANY r WITHIN reviews SATISFIES r LIKE '%Ozella%' END; • EVERY: EVERY is a range predicate that tests a Boolean condition over the elements or attributes of a collection, object, or objects. It uses the IN and WITHIN operators to range through the collection. SELECT * FROM `travel-sample` WHERE type = 'hotel' AND EVERY r IN reviews SATISFIES r.ratings.Cleanliness >= 4 END;
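The difference between IN (direct elements only) and WITHIN (children and all deeper descendants) can be sketched in Python. This is a simplified model rather than N1QL's implementation: the `descendants` helper and the review data are hypothetical, and N1QL's NULL/MISSING handling is not modeled.

```python
def descendants(value):
    """Yield every child and deeper descendant of a JSON-like value."""
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, list):
        children = value
    else:
        return  # scalars have no descendants
    for child in children:
        yield child
        yield from descendants(child)


# Hypothetical reviews array, shaped like the hotel documents queried on the slide.
reviews = [
    {"author": "Ozella Sipes", "ratings": {"Value": 4, "Cleanliness": 5}},
    {"author": "Barton Marks", "ratings": {"Value": 2, "Cleanliness": 3}},
]

# ANY r IN reviews SATISFIES r.ratings.Value >= 3 END
any_in = any(r["ratings"]["Value"] >= 3 for r in reviews)

# EVERY r IN reviews SATISFIES r.ratings.Cleanliness >= 4 END
every_in = all(r["ratings"]["Cleanliness"] >= 4 for r in reviews)

# ANY r WITHIN reviews SATISFIES r LIKE '%Ozella%' END
# Only string descendants can match a LIKE pattern.
any_within = any(
    isinstance(v, str) and "Ozella" in v for v in descendants(reviews)
)
```

Like Python's `all([])`, EVERY is true for an empty array, which is why N1QL also offers ANY AND EVERY for cases that need at least one element.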
  • 36. SELECT: Array predicates • ARRAY_CONTAINS: returns true if the array contains the value. SELECT name, t.public_likes FROM `travel-sample` t WHERE type = "hotel" AND ARRAY_CONTAINS(t.public_likes, "Vallie Ryan") = true; Result: [ { "name": "Medway Youth Hostel", "public_likes": [ "Julius Tromp I", "Corrine Hilll", "Jaeden McKenzie", "Vallie Ryan", "Brian Kilback", "Lilian McLaughlin", "Ms. Moses Feeney", "Elnora Trantow" ] } ]
  • 37. Array Expressions, Functions and Aggregates • EXPRESSIONS: ARRAY, ANY, EVERY, IN, WITHIN, construct [elem], slice array[start:end], selection array[#pos] • FUNCTIONS: ISARRAY, TYPE, ARRAY_APPEND, ARRAY_CONCAT, ARRAY_CONTAINS, ARRAY_DISTINCT, ARRAY_IFNULL, ARRAY_FLATTEN, ARRAY_INSERT, ARRAY_INTERSECT, ARRAY_LENGTH, ARRAY_POSITION, ARRAY_PREPEND, ARRAY_PUT, ARRAY_RANGE, ARRAY_REMOVE, ARRAY_REPEAT, ARRAY_REPLACE, ARRAY_REVERSE, ARRAY_SORT, ARRAY_STAR • AGGREGATES: ARRAY_AVG, ARRAY_COUNT, ARRAY_MIN, ARRAY_MAX, ARRAY_SUM
  • 38. SELECT: ARRAY & FIRST Expression. ARRAY: The ARRAY operator lets you map and filter the elements or attributes of a collection, object, or objects. It evaluates to an array of the operand expression for the elements that satisfy the WHEN clause, if provided. SELECT ARRAY [name, r.ratings.`Value`] FOR r IN reviews WHEN r.ratings.`Value` = 4 END FROM `travel-sample` WHERE type = 'hotel'; FIRST: The FIRST operator lets you map and filter the elements or attributes of a collection, object, or objects. It evaluates to a single element: the operand expression for the first element that satisfies the WHEN clause, if provided. SELECT FIRST [name, r.ratings.`Value`] FOR r IN reviews WHEN r.ratings.`Value` = 4 END FROM `travel-sample` WHERE type = 'hotel';
  • 39. SELECT statement: UNNEST and NEST
  • 40. Querying Arrays: UNNEST • UNNEST: If a document or object contains an array, UNNEST performs a join of the nested array with its parent document. Each resulting joined object becomes an input to the query. UNNESTs and JOINs can be chained. SELECT r.author, COUNT(r.author) AS authcount FROM `travel-sample` t UNNEST reviews r WHERE t.type = "hotel" GROUP BY r.author ORDER BY COUNT(r.author) DESC LIMIT 5; Result: [ { "authcount": 2, "author": "Anita Baumbach" }, { "authcount": 2, "author": "Uriah Gutmann" }, { "authcount": 2, "author": "Ashlee Champlin" }, { "authcount": 2, "author": "Cassie O'Hara" }, { "authcount": 1, "author": "Zoe Kshlerin" } ]
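The UNNEST query on slide 40 can be read as "flatten each parent with each of its array elements, then group". A minimal Python sketch of that reading follows; the documents and author names are made up for illustration, and the default (inner) UNNEST behavior is assumed.

```python
from collections import Counter

# Hypothetical documents; only the fields used by the query are present.
hotels = [
    {"type": "hotel", "name": "H1", "reviews": [{"author": "Ann"}, {"author": "Bob"}]},
    {"type": "hotel", "name": "H2", "reviews": [{"author": "Ann"}]},
    {"type": "hotel", "name": "H3", "reviews": []},
    {"type": "airport", "name": "A1"},
]

# FROM `travel-sample` t UNNEST reviews r WHERE t.type = "hotel"
# One (parent, element) pair per array element; a parent with an empty or
# missing array contributes no rows.
unnested = [
    (t, r)
    for t in hotels
    if t["type"] == "hotel"
    for r in t.get("reviews", [])
]

# GROUP BY r.author ORDER BY COUNT(r.author) DESC LIMIT 5
authcount = Counter(r["author"] for _, r in unnested).most_common(5)
```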
  • 41. Querying Arrays: NEST • NEST is conceptually the inverse of UNNEST. Nesting performs a join across two keyspaces, but instead of producing a cross-product of the left and right inputs, a single result is produced for each left input, while the corresponding right inputs are collected into an array and nested as a single array-valued field in the result object. SELECT * FROM `travel-sample` route NEST `travel-sample` airline ON KEYS route.airlineid WHERE route.type = 'route' LIMIT 1; Result: [ { "airline": [ { "callsign": "AIRFRANS", "country": "France", "iata": "AF", "icao": "AFR", "id": 137, "name": "Air France", "type": "airline" } ], "route": { "airline": "AF", "airlineid": "airline_137", "destinationairport": "MRS", "distance": 2881.617376098415, "equipment": "320", "id": 10000, "schedule": [ { "day": 0, "flight": "AF198", "utc": "10:13:00" }, { "day": 0, "flight": "AF547", "utc": "19:14:00" }, { "day": 0, "flight": "AF943", …
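The NEST ... ON KEYS behavior described on slide 41 (one result per left document, with the matching right documents collected into an array) can be sketched as follows. The `nest_on_keys` helper and the tiny keyspace are hypothetical, and the sketch assumes the default inner NEST, where left documents without a match are dropped.

```python
# Hypothetical keyspace: document key -> document.
keyspace = {
    "airline_137": {"type": "airline", "name": "Air France", "iata": "AF"},
    "route_10000": {"type": "route", "airline": "AF", "airlineid": "airline_137"},
    "route_10001": {"type": "route", "airline": "ZZ", "airlineid": "airline_999"},
}


def nest_on_keys(left_docs, right_keyspace, key_of):
    """For each left document, collect the right documents its key(s) point to."""
    results = []
    for left in left_docs:
        keys = key_of(left)
        if not isinstance(keys, list):
            keys = [keys]  # ON KEYS accepts a single key or an array of keys
        matches = [right_keyspace[k] for k in keys if k in right_keyspace]
        if matches:  # inner NEST: skip left documents with nothing to nest
            results.append({"route": left, "airline": matches})
    return results


routes = [d for d in keyspace.values() if d["type"] == "route"]
nested = nest_on_keys(routes, keyspace, lambda r: r["airlineid"])
```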
  • 42. Query Performance with Arrays
  • 43. Array Indexing • Before 4.5, creating an index on an array attribute would index the entire array as a single scalar value. CREATE INDEX i1 ON `travel-sample`(schedule); "schedule": [ { "day" : 0, "flight" : "AI111", "utc" : "1:11:11" }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, { "day": 2, "flight": "AF166", "utc": "08:59:00" }, … ]
  • 44. Array Indexing: motivation [ { "day" : 0, "special_flights" : [ { "flight" : "AI111", "utc" : "1:11:11" }, { "flight" : "AI222", "utc" : "2:22:22" } ] }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, … ] "London": [ "London", "Tokyo", "NewYork", … ]
  • 45. Why array indexing? • When NoSQL databases asked customers to denormalize, child-table information moved into arrays in the parent documents. • E.g. each customer document holds all phone numbers, contacts and orders in arrays. • Not easy to query: the full array value had to be specified in WHERE predicates. • Ex: list of users who purchased a product (unknown values and a large list) • It was not possible to index part of an array of objects • Bloated index size (the whole array value is indexed) • Example: index just the day field in the array of flights in schedule. • Performance limitation • ANY…IN or WITHIN array • Ease of querying: must specify the full array value in the WHERE clause • Manageable for known values or a handful of values • Difficult for unknown values or a large list of values.
  • 46. Who wants array indexing? • Find my crew based on the airline. WHERE ANY p IN ods.pilot SATISFIES p.filen = "XYZ1012" END; • Find my customer based on one of the customer's emails. WHERE ANY a IN u.telecom SATISFIES a.system = 'email' AND a.value = 'a@b.com' END; • Find service qualification based on arrays of arrays. WHERE ANY c_0 IN `item`.`blackoutserviceblocklist` SATISFIES ANY c_1 IN c_0.`blackoutserviceblock`.`ppvservicelist` SATISFIES c_1.`ppvservice`.`eventcode` = "E001" END END;
  • 47. What is Array Indexing? • Enables visibility into the array structure • A subset of array elements can be indexed and searched efficiently. schedule = [ { "day" : 0, "special_flights" : [ { "flight" : "AI111", "utc" : "1:11:11" }, { "flight" : "AI932", "utc" : "2:22:22" } ] }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, … ]
  • 48. How Does Array Indexing Help? • Index only the required elements or attributes in the array • Efficient in index storage and search time • Benefits are much more significant for nested arrays/objects
  • 49. How Array Indexing Helps: Example "schedule": [ { "day" : 0, "special_flights" : [ { "flight" : "AI111", "utc" : "1:11:11" }, { "flight" : "AI222", "utc" : "2:22:22" } ] }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, { "day": 2, "flight": "AF166", "utc": "08:59:00" }, … ] Array index in Couchbase: "flight":"AF552", "flight":"AF166", …
  • 50. Create Array Index • No syntax changes to DML statements • Supports all DML statements with a WHERE clause: SELECT, UPDATE, DELETE, etc. • Array indexes are supported only for GSI indexes. • Supports both standard secondary and memory-optimized indexes. CREATE INDEX isched ON `travel-sample` (DISTINCT ARRAY v.flight FOR v IN schedule END) WHERE type = "route";
  • 51. Array Index syntax CREATE INDEX isched ON `travel-sample` (ALL ARRAY p FOR p IN public_likes END) WHERE type = "hotel"; Document: "public_likes": [ "Julius Smith", "Corrine Hill", "Jaeden McKenzie", "Vallie Ryan", "Brian Kilback", "Lilian McLaughlin", "Ms. Moses Feeney", "Elnora Trantow" ] Index entries: "Julius Smith", [DocID]; "Corrine Hill", [DocID]; "Jaeden McKenzie", [DocID]; "Vallie Ryan", [DocID]; "Brian Kilback", [DocID]; "Lilian McLaughlin", [DocID]; "Ms. Moses Feeney", [DocID]; "Elnora Trantow", [DocID]
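Slide 51 shows an array index as one entry per array element, each pointing back to the document ID. The Python sketch below models that idea as an inverted index and contrasts ALL ARRAY with DISTINCT ARRAY. It is a conceptual model only: `build_array_index` and the sample documents are hypothetical, and the real GSI storage format is not represented.

```python
from collections import defaultdict

# Hypothetical documents keyed by document ID.
hotels = {
    "hotel_1": {"type": "hotel",
                "public_likes": ["Vallie Ryan", "Brian Kilback", "Vallie Ryan"]},
    "hotel_2": {"type": "hotel", "public_likes": ["Brian Kilback"]},
    "route_1": {"type": "route"},
}


def build_array_index(docs, distinct):
    """Map each array element to the IDs of the documents that contain it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if doc.get("type") != "hotel":  # the index's WHERE type = "hotel" filter
            continue
        elements = doc.get("public_likes", [])
        if distinct:
            # DISTINCT ARRAY: one entry per distinct element per document.
            elements = list(dict.fromkeys(elements))
        for element in elements:
            index[element].append(doc_id)
    return index


distinct_index = build_array_index(hotels, distinct=True)
all_index = build_array_index(hotels, distinct=False)

# ANY p IN public_likes SATISFIES p = "Vallie Ryan" END becomes a key lookup.
matching_ids = set(distinct_index["Vallie Ryan"])
```

A lookup by element value replaces a scan of every document's array, which is where the storage and search-time savings claimed on slide 48 come from.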
  • 52. Example: Indexing individual attributes/elements • "Find the total number of flights scheduled on the 3rd day" CREATE INDEX isched ON `travel-sample` (DISTINCT ARRAY v.day FOR v IN schedule END) WHERE type = "route"; SELECT count(*) FROM `travel-sample` WHERE type = "route" AND ANY v IN schedule SATISFIES v.day = 3 END;
  • 53. Example: Indexing individual attributes/elements EXPLAIN SELECT count(1) FROM `travel-sample` WHERE type = "route" AND ANY v IN schedule SATISFIES v.day = 3 END; { "#operator": "DistinctScan", "scan": { "#operator": "IndexScan", "index": "isched", "index_id": "2b24c681fa54d83f", "keyspace": "travel-sample", "namespace": "default", "spans": [ { "Range": { "High": [ "3" ], "Inclusion": 3, "Low": [ "3" ] …
  • 54. Example: Index with Array Elements and Other Attributes • "Find all scheduled flights with hops, and group by number of stops" CREATE INDEX iflight_stops ON `travel-sample` ( stops, DISTINCT ARRAY v.flight FOR v IN schedule END ) WHERE type = "route"; SELECT * FROM `travel-sample` WHERE type = "route" AND ANY v IN schedule SATISFIES v.flight LIKE 'AA%' END AND stops >= 0;
  • 55. Example: Indexing Nested Arrays "schedule" : [ { "day" : 0, "special_flights" : [ { "flight" : "AI111", "utc" : "1:11:11" }, { "flight" : "AI222", "utc" : "2:22:22" } ] }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, … ]
  • 56. Example: Indexing Nested Arrays • "Find the total number of special flights scheduled" CREATE INDEX inested ON `travel-sample` (DISTINCT ARRAY (DISTINCT ARRAY y.flight FOR y IN x.special_flights END) FOR x IN schedule END) WHERE type = "route"; SELECT count(*) FROM `travel-sample` WHERE type = "route" AND ANY x IN schedule SATISFIES (ANY y IN x.special_flights SATISFIES y.flight IS NOT NULL END) END; "schedule": [ { "day" : 0, "special_flights" : [ { "flight" : "AI111", "utc" : "1:11:11" }, { "flight" : "AI222", "utc" : "2:22:22" } ] }, { "day": 1, "flight": "AF552", "utc": "14:41:00" }, { "day": 2, "flight": "AF166", "utc": "08:59:00" }, … ]
  • 57. Example: UNNEST • N1QL array indexing supports both collection predicates: ANY, and ANY AND EVERY • Array indexes are also exploited by UNNEST. CREATE INDEX idx_crew ON flight (DISTINCT ARRAY c FOR c IN crew_ids END); SELECT * FROM flight UNNEST crew_ids AS c WHERE c = "Joe Smith";
  • 58. Restrictions in 4.5 • Variable names and index keys, such as v and v.day, used in CREATE INDEX and in subsequent SELECT statements must be the same. CREATE INDEX isched ON `travel-sample` (DISTINCT ARRAY v.day FOR v IN schedule END) WHERE type = "route"; SELECT count(*) FROM `travel-sample` WHERE type = "route" AND ANY v IN schedule SATISFIES v.day = 3 END;
  • 59. Restrictions in 4.5 • Supported operators: DISTINCT ARRAY, ALL ARRAY, ARRAY, ANY, ANY AND EVERY, IN, WITHIN, UNNEST • NOT supported operators: EVERY
  • 60. Fun with Arrays
  • 61. SELECT: Fetch Documents SELECT * FROM customer USE KEYS ["KEY01"]; SELECT * FROM customer USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258" ]; SELECT status, COUNT(status) FROM customer c USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258" ] WHERE c.region = 'US' GROUP BY status; SELECT product, COUNT(product) FROM customer c USE KEYS [ "CUST:09", "CUST:29", "CUST:234", "CUST:852", "CUST:258" ] INNER JOIN locations ON KEYS c.lid WHERE c.region = 'US' GROUP BY product;
  • 62. SELECT: JOIN SELECT COUNT(1) FROM `beer-sample` beer INNER JOIN `beer-sample` brewery ON KEYS beer.brewery_id WHERE state = 'CA'; • The JOIN operation combines documents from two keyspaces. • The JOIN criterion is given by the ON KEYS clause. • The outer table uses an index scan, if possible. • The inner table (brewery) is fetched document by document. • Couchbase 4.6 improves this by fetching in batches.
  • 63. SELECT: JOIN • Why not get all of the required document IDs from the index scan and then do one big bulk get on the outer table? • There are two ways to do it: a) use the array aggregate ARRAY_AGG() to create the list of keys; b) use RAW to create the array and then use that to JOIN. SELECT COUNT(1) FROM ( SELECT RAW META().id FROM `beer-sample` beer WHERE state = 'CA') AS blist INNER JOIN `beer-sample` brewery ON KEYS blist; SELECT COUNT(1) FROM ( SELECT ARRAY_AGG(META().id) karray FROM `beer-sample` beer WHERE state = 'CA') AS b INNER JOIN `beer-sample` brewery ON KEYS b.karray;
  • 64. Data.gov: New York Names { "meta": { "view": { "id": "25th-nujf", "name": "Most Popular Baby Names by Sex and Mother's Ethnic Group, New York City", "category": "Health", "createdAt": 1382724894, "description": "The most popular baby names by sex and mother's ethnicity in New York City.", "displayType": "table", … "columns": [ { "id": -1, "name": "sid", "dataTypeName": "meta_data", "fieldName": ":sid", "position": 0, "renderTypeName": "meta_data", "format": {} }, { "id": -1, "name": "id", "dataTypeName": "meta_data", "fieldName": ":id", "position": 0, "renderTypeName": "meta_data", "format": {} } ... ] } }, "data": [ [1, "EB6FAA1B-EE35-4D55-B07B-8E663565CCDF", 1, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE", "HISPANIC", "GERALDINE", "13", "75"], [2, "2DBBA431-D26F-40A1-9375-AF7C16FF2987", 2, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE", "HISPANIC", "GIA", "21", "67"], [3, "54318692-0577-4B21-80C8-9CAEFCEDA8BA", 3, 1386853125, "399231", 1386853125, "399231", "{n}", "2011", "FEMALE", "HISPANIC", "GIANNA", "49", "42"] ... ] }
  • 65. Data.gov: New York Names INSERT INTO nynames (KEY UUID(), VALUE kname) SELECT {":sid":d[0], ":id":d[1], ":position":d[2], ":created_at":d[3], ":created_meta":d[4], ":updated_at":d[5], ":updated_meta":d[6], ":meta":d[7], "brth_yr":d[8], "gndr":d[9], "ethcty":d[10], "nm":d[11], "cnt":d[12], "rnk":d[13]} kname FROM (SELECT d FROM datagov UNNEST data d) as u1;
  • 66. Data.gov: New York Names INSERT INTO nynames ( KEY UUID(), value o ) SELECT o FROM ( SELECT meta.`view`.columns[*].fieldName f, data FROM datagov) d UNNEST data d1 LET o = OBJECT p:d1[ARRAY_POSITION(d.f, p)] FOR p IN d.f END ;
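The OBJECT ... FOR ... END construct on this slide pairs each field name with the value at the same position in a data row. A minimal Python sketch of that idea follows; it is an analogue, not N1QL, and the field list and rows are a trimmed, illustrative subset of the columns shown on the earlier slide. The helper name `row_to_object` is hypothetical.

```python
# Field names as they might come from meta.view.columns[*].fieldName.
# Trimmed to five columns for illustration (an assumption, not the full list).
field_names = [":sid", ":id", "brth_yr", "nm", "cnt"]

# Each row of "data" is a positional array; these rows keep only the
# values that correspond to the five field names above.
rows = [
    [1, "EB6FAA1B-EE35-4D55-B07B-8E663565CCDF", "2011", "GERALDINE", "13"],
    [2, "2DBBA431-D26F-40A1-9375-AF7C16FF2987", "2011", "GIA", "21"],
]


def row_to_object(names, row):
    """Build one document by looking up each name's position in the name list,
    mirroring OBJECT p : d1[ARRAY_POSITION(d.f, p)] FOR p IN d.f END."""
    return {name: row[names.index(name)] for name in names}


documents = [row_to_object(field_names, row) for row in rows]
print(documents[0])
```

The point of the second INSERT on the slides is that the field names are read from the document's own metadata, so the statement does not need to hard-code each position.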
  • 67. SPLIT & CONQUER: • Problem: Search for a word within a string SELECT name FROM `travel-sample` WHERE type = 'hotel' LIMIT 5; [ { "name": "Medway Youth Hostel" }, { "name": "The Balmoral Guesthouse" }, { "name": "The Robins" }, { "name": "Le Clos Fleuri" }, { "name": "Glasgow Grand Central" } ]
  • 68. SPLIT & CONQUER: select name from `travel-sample` where type = 'hotel' and lower(name) LIKE '%grand%'; [ { "name": "Glasgow Grand Central" }, { "name": "Horton Grand Hotel" }, { "name": "Manchester Grand Hyatt" }, { "name": "Grande Colonial Hotel" }, { "name": "Grand Hotel Serre Chevalier" }, { "name": "The Sheraton Grand Hotel" } ] • Use the LIKE predicate • Runs in about 81 milliseconds to search 917 documents
  • 69. SPLIT & CONQUER: CREATE INDEX idxtravelname ON `travel-sample` (DISTINCT ARRAY wrd FOR wrd IN SPLIT(LOWER(name)) END) WHERE type = 'hotel'; SELECT name FROM `travel-sample` WHERE ANY wrd IN SPLIT(LOWER(name)) SATISFIES wrd = 'grand' END AND type = 'hotel'; [ { "name": "The Sheraton Grand Hotel" }, { "name": "Horton Grand Hotel" }, { "name": "Grand Hotel Serre Chevalier" }, { "name": "Glasgow Grand Central" }, { "name": "Manchester Grand Hyatt" } ] • Convert the name to lower case. • Split the name into words. • SPLIT() returns an ARRAY of these words. • Create the INDEX on this array. • Query using the array predicate. • The query runs in 10 ms. • Benefits grow with the number of docs.
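The difference between the LIKE scan and the SPLIT-based array index can be modeled outside the database. The sketch below is a plain Python illustration, not Couchbase code: the dictionary stands in for the array index, the hotel list is a small sample taken from the results above, and the function names are hypothetical.

```python
from collections import defaultdict

# Small sample of hotel names taken from the slide results.
hotel_names = [
    "Glasgow Grand Central",
    "Horton Grand Hotel",
    "Grande Colonial Hotel",
    "Medway Youth Hostel",
]


def build_word_index(names):
    """Map each distinct lower-cased word to the set of documents containing it,
    similar in spirit to DISTINCT ARRAY wrd FOR wrd IN SPLIT(LOWER(name)) END."""
    index = defaultdict(set)
    for doc_id, name in enumerate(names):
        for word in set(name.lower().split()):
            index[word].add(doc_id)
    return index


def substring_scan(names, needle):
    """Model of LOWER(name) LIKE '%needle%': every name must be examined."""
    return [name for name in names if needle in name.lower()]


def word_lookup(names, index, word):
    """Model of the ANY ... SATISFIES wrd = word predicate: one index lookup."""
    return [names[doc_id] for doc_id in sorted(index.get(word, ()))]


word_index = build_word_index(hotel_names)
like_matches = substring_scan(hotel_names, "grand")
word_matches = word_lookup(hotel_names, word_index, "grand")
print(like_matches)
print(word_matches)
```

Note that the two predicates are not equivalent: the substring scan also matches "Grande Colonial Hotel", while the word lookup matches only the exact word. The slide results show the same effect, with six rows for LIKE and five for the SPLIT query.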
  • 70. Bucket: article { { "tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL", "title": "What's in a New York Name? Unlock data.gov Using N1QL " }, { "tags": "TWITTER,NOSQL,SQL,QUERIES,ANALYSIS,HASHTAGS,JSON,COUCHBASE,ANALYTICS,INDEX", "title": "SQL on Twitter: Analysis Made Easy Using N1QL" }, { "tags": "CONCURRENCY,MONGODB,COUCHBASE,INDEX,READ,WRITE,PERFORMANCE,SNAPSHOT,CONSISTENCY", "title": "Concurrency Behavior: MongoDB vs. Couchbase" }, { "tags": "COUCHBASE,N1QL,JOIN,PERFORMANCE,INDEX,DATA MODEL,FLEXIBLE,SCHEMA", "title": "JOIN Faster With Couchbase Index JOINs" }, { "tags": "NOSQL,NOSQL,BENCHMARK,SQL,JSON,COUCHBASE,MONGODB,YCSB,PERFORMANCE,QUERY,INDEX", "title": "How Couchbase Won YCSB" } }
  • 71. Questions: • Find all the articles with N1QL in their title • Find all the articles with COUCHBASE in their tags { { "tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL", "title": "What's in a New York Name? Unlock data.gov Using N1QL " }, { "tags": "TWITTER,NOSQL,SQL,QUERIES,ANALYSIS,HASHTAGS,JSON,COUCHBASE,ANALYTICS,INDEX", "title": "SQL on Twitter: Analysis Made Easy Using N1QL" }, { "tags": "CONCURRENCY,MONGODB,COUCHBASE,INDEX,READ,WRITE,PERFORMANCE,SNAPSHOT,CONSISTENCY", "title": "Concurrency Behavior: MongoDB vs. Couchbase" }, { "tags": "COUCHBASE,N1QL,JOIN,PERFORMANCE,INDEX,DATA MODEL,FLEXIBLE,SCHEMA", "title": "JOIN Faster With Couchbase Index JOINs" }, { "tags": "NOSQL,NOSQL,BENCHMARK,SQL,JSON,COUCHBASE,MONGODB,YCSB,PERFORMANCE,QUERY,INDEX", "title": "How Couchbase Won YCSB" } }
  • 72. Basic Framework "tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL" SPLIT() into an array: [ "JSON", "N1QL", "COUCHBASE", "BIGDATA", "NAME", "data.gov", "SQL" ] Index this array (Array Index): DISTINCT ARRAY wrd FOR wrd IN SPLIT(tags, ",") END N1QL Query Service: SELECT * FROM articles WHERE ANY wrd IN SPLIT(tags, ",") SATISFIES wrd = "COUCHBASE" END
  • 73. Basic Framework "title": "What's in a New York Name? Unlock data.gov Using N1QL " SPLIT() into an array: [ "What's", "in", "a", "New", "York", "Name?", "Unlock", "data.gov", "Using", "N1QL" ] Array Index: ??? N1QL Query Service: ???
  • 74. New Function: TOKENS() in Couchbase 4.6 – out in DP now. TOKENS(expression [, parameter]) expression: JSON expression. parameter: options. {"names":true} Include the key names in the JSON "key":value pair. {"case":"lower"} Return the values in upper/lower case. {"specials":true} Recognize special characters like @, - to form tokens. select title, tags, tokens(tags, {"case":"lower"}) tagsarray, tokens(title) titlearray from articles limit 1; "tags": "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL", "tagsarray": [ "data", "gov", "bigdata", "n1ql", "couchbase", "sql", "json", "name" ], "title": "What's in a New York Name? Unlock data.gov Using N1QL ", "titlearray": [ "s", "Unlock", "data", "N1QL", "gov", "in", "Using", "New", "What", "York", "a", "Name" ]
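To see why TOKENS() solves the "Name?" and "data.gov" problem that SPLIT() leaves open, a rough Python approximation may help. This is only a sketch under a stated assumption: it treats every run of ASCII letters and digits as a token, which happens to reproduce the two arrays shown on the slide. It is not the server's implementation and ignores the "names" and "specials" options; `approximate_tokens` is a hypothetical name.

```python
import re


def approximate_tokens(text, lower=False):
    """Approximate TOKENS() for a plain string: split on anything that is not
    a letter or digit and return the distinct tokens (order is not significant)."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if lower:
        # Rough stand-in for the {"case": "lower"} option.
        words = [word.lower() for word in words]
    return set(words)


tags = "JSON,N1QL,COUCHBASE,BIGDATA,NAME,data.gov,SQL"
title = "What's in a New York Name? Unlock data.gov Using N1QL "

tags_tokens = approximate_tokens(tags, lower=True)
title_tokens = approximate_tokens(title)
print(sorted(tags_tokens))
print(sorted(title_tokens))
```

Unlike the whitespace split on the previous slide, punctuation no longer sticks to the words, so a predicate such as wrd = 'Name' can match.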
  • 75. Using TOKENS() – Index on title, lower case create index ititlesearch on articles(distinct array wrd for wrd in tokens(title, {"case":"lower"}) end); explain select title from articles where any wrd in tokens(title, {"case":"lower"}) satisfies wrd = 'n1ql' end; { "#operator": "DistinctScan", "scan": { "#operator": "IndexScan", "index": "ititlesearch", "index_id": "7a162af1199565b5", "keyspace": "articles", "namespace": "default", "spans": [ { "Range": { "High": [ "\"n1ql\"" ], "Inclusion": 3, "Low": [ "\"n1ql\"" ] } } ], "using": "gsi" } },
  • 76. Using TOKENS() – Index on the WHOLE document create index ititlesearch2 on articles (distinct array wrd for wrd in tokens(articles, { "case":"lower", "names":true }) end); explain select title from articles where any wrd in tokens(articles, {"case":"lower", "names":true }) satisfies wrd = 'title' end; "#operator": "DistinctScan", "scan": { "#operator": "IndexScan", "index": "ititlesearch2", "index_id": "c60792ca9f957cfd", "keyspace": "articles", "namespace": "default", "spans": [ { "Range": { "High": [ "\"title\"" ], "Inclusion": 3, "Low": [ "\"title\"" ] } } ], "using": "gsi" }
  • 77. Keshav Murthy, Director, Couchbase Engineering, keshav@couchbase.com
  • 79. Goal of N1QL: Give developers and enterprises an expressive, powerful, and complete language for querying, transforming, and manipulating JSON data.
  • 80. Array Indexing – How array is expanded in GSI
      Sl. | Create Index Expression | Key versions generated by Projector | Index Entries in GSI storage
      1. | age | [K1] | [K1]docid
      2. | age, name, children | [K1, K2, [c1, c2, c3]] | [K1, K2, [c1, c2, c3]]docid
      3. | ALL ARRAY c FOR c IN cities END | [[K11, K12, K13]] | [K11]docid, [K12]docid, [K13]docid
      4. | ALL ARRAY c FOR c IN cities END, age | [[K11, K12, K13], K2] | [K11, K2]docid, [K12, K2]docid, [K13, K2]docid
      4.1 | age, ALL ARRAY c FOR c IN cities END, name | [K1, [K21, K22, K23], K3] | [K1, K21, K3]docid, [K1, K22, K3]docid, [K1, K23, K3]docid
  • 81. Array Indexing – How array is expanded in GSI
      Sl. | Create Index Expression | Key versions generated by Projector | Index Entries in GSI storage
      5. | ALL ARRAY c FOR c IN cities END, children | [[K11, K12, K13], [c1, c2, c3]] | [K11, [c1, c2, c3]]docid, [K12, [c1, c2, c3]]docid, [K13, [c1, c2, c3]]docid
      6. | ALL ARRAY (ALL ARRAY y FOR y IN c END) FOR c IN cities END | [[K1, K2, K3, K4, K5]] | [K1]docid, [K2]docid, [K3]docid, [K4]docid, [K5]docid
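The expansion rule in these two tables can be expressed compactly: the one key marked as the array key is fanned out into one index entry per element, while every other key, including a plain array such as children, is stored whole in each entry. The Python sketch below models only that rule; it is not GSI code, and the function name, the flag-based input format, and the sample values are all hypothetical.

```python
def expand_index_entries(key_parts, doc_id):
    """Expand one document's index key into its index entries.

    key_parts is a list of (value, is_array_key) pairs in index key order.
    The part flagged True plays the role of ALL ARRAY ... END; at most one
    such part is expected, as in the rows of the table above.
    """
    array_positions = [i for i, (_, is_array_key) in enumerate(key_parts) if is_array_key]
    if not array_positions:
        # Rows 1 and 2: no array key, so a single entry holds the values as-is.
        return [([value for value, _ in key_parts], doc_id)]

    position = array_positions[0]
    entries = []
    for element in key_parts[position][0]:
        key = [value for value, _ in key_parts]
        key[position] = element  # one entry per array element
        entries.append((key, doc_id))
    return entries


# Row 4.1 shape: age, ALL ARRAY c FOR c IN cities END, name (sample values).
row_4_1 = expand_index_entries(
    [(25, False), (["Bangalore", "Mysore"], True), ("Ann", False)], "doc1"
)
# Row 2 shape: children is an array but not an array key, so it stays whole.
row_2 = expand_index_entries(
    [(25, False), ("Ann", False), (["c1", "c2"], False)], "doc1"
)
print(row_4_1)
print(row_2)
```

For the nested case in row 6, the same rule would apply after flattening the inner arrays into a single list of elements.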
  • 82. Array Indexing Performance in ForestDB (3.6K sets)
      Metrics | KPI | Measured | Comments
      Array Q2 (stale=ok) | 13000 | 15140 | Single doc match & fetch
      Array Q2 (stale=false) | 700 | 9420 | Same with consistency
      Array Q3 (stale=ok) | 1100 | 1435 | 100 doc match and fetch
      Array Q3 (stale=false) | 428 | 1084 | Same with consistency
  • 83. Array Indexing Performance in MOI with 30K sets
      Metrics | KPI | Measured | Comments
      Array Q2 (stale=ok) | 13000 | 15251 | Single doc match & fetch
      Array Q2 (stale=false) | 700 | 7545 | Same with consistency
      Array Q3 (stale=ok) | 1100 | 1371 | 100 doc match and fetch
      Array Q3 (stale=false) | 428 | 1580 | Same with consistency
  • 84. UNITED – POC on 4.0. Response times in milliseconds.
      MongoDB Query | 1 Thread | 10 Threads | 50 Threads
      Q1 | 13 | 35.1 | 197.84
      Q2 | 28 | 66.8 | 285.32
      Q3 - 7d | 160 | 606 | 2960.2
      Q3 - 28d | 1725 | 8240.3 | 41439.86
      Couchbase Query | 1 Thread | 5 Threads
      Q1 | 1500 | 31000
      Q2 | Timed out.
      Q3 | 23000 | 90000
  • 85. UNITED – POC • Query 2 – Get the selected flight using the document key. For each crew member (pilot and flight attendant) found in the flight details: • Fetch the previous flight assigned to the crew member • Fetch the next flight assigned to the crew member select ods.GMT_EST_DEP_DTM, ods.PRFL_ACT_GMT_DEP_DTM, ods.PRFL_SCHED_GMT_DEP_DTM, ods.GMT_EST_ARR_DTM, ods.PRFL_ACT_GMT_ARR_DTM, ods.PRFL_SCHED_GMT_ARR_DTM, ods.FLT_LCL_ORIG_DT, ods.PRFL_FLT_NBR, ods.PRFL_TAIL_NBR, PILOT.PRPS_RSV_IND from ods unnest ods.PILOT where ods.TYPE='CREW_ON_FLIGHT' and ((ods.PRFL_ACT_GMT_DEP_DTM is not missing and ods.PRFL_ACT_GMT_DEP_DTM > "2015-07-15T02:45:00Z") OR (ods.PRFL_ACT_GMT_DEP_DTM is missing and ods.GMT_EST_DEP_DTM is not null and ods.GMT_EST_DEP_DTM > "2015-07-15T02:45:00Z")) and any p in ods.PILOT satisfies p.FILEN = "U110679" end order by ods.GMT_EST_DEP_DTM limit 1
  • 86. UNITED – POC Queries on 4.5 • 422,137 documents. • Query 2: BEFORE array indexing • Primary index scan • 38.91 seconds. create index idx_odspilot on ods(DISTINCT ARRAY p.FILEN FOR p IN PILOT END); • Query 2: AFTER array indexing • Array index scan [DistinctScan] • 8.51 milliseconds • Improvement of about 4572 times
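The improvement factor on this slide follows directly from the two measured timings once they are in the same unit. A short Python check, using only the numbers given above:

```python
# Timings from the slide, converted to a common unit (milliseconds).
before_seconds = 38.91       # Query 2 with a primary index scan
after_milliseconds = 8.51    # Query 2 with the array index scan

speedup = (before_seconds * 1000) / after_milliseconds
print(round(speedup))
```

The result rounds to the 4572 quoted on the slide.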
  • 87. Array Indexing – Size and numbers • There is no limit on the number of elements in the array. • The total size of the array index key should not exceed the setting max_array_seckey_size (Default = 10K). CREATE INDEX i1 ON default(ALL flights, airlineid). Let's say a given document is: { "flights": ["AF552", "AF166", "AF268", "AF422"], "airlineid": "airline_137" } The indexable array keys for the document are: [ ["AF552", "airline_137"], ["AF166", "airline_137"], ["AF268", "airline_137"], ["AF422", "airline_137"] ] The sum of the lengths of the above items should be < max_array_seckey_size. The setting can be increased but not decreased.
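The size rule can be illustrated with a rough estimate for the sample document. The sketch below rests on two assumptions that the slide does not state: that "10K" means 10 * 1024 bytes, and that the length of each key is approximated by its JSON-encoded byte length. The indexer's real encoding may differ, so treat the number as an order-of-magnitude check only. Function names are hypothetical.

```python
import json

# Assumption: the slide's default of "10K" interpreted as 10 * 1024 bytes.
MAX_ARRAY_SECKEY_SIZE = 10 * 1024


def array_index_keys(doc):
    """Keys for an index on (ALL flights, airlineid): one per flight."""
    return [[flight, doc["airlineid"]] for flight in doc["flights"]]


def total_key_size(keys):
    """Assumption: approximate each key's length by its JSON-encoded bytes."""
    return sum(len(json.dumps(key).encode("utf-8")) for key in keys)


doc = {
    "flights": ["AF552", "AF166", "AF268", "AF422"],
    "airlineid": "airline_137",
}

keys = array_index_keys(doc)
size = total_key_size(keys)
fits = size < MAX_ARRAY_SECKEY_SIZE
print(keys)
print(size, fits)
```

Because the non-array key (airlineid) is repeated in every entry, the total grows with both the number of array elements and the size of the other index keys.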
  • 88. Statements: MERGE BIG MERGE statement – Use travel-sample explain merge into b1 using b2 on key "11" when matched then update set b1.o3=1; merge into b1 using (select id from b2 where x < 10) as b3 on key b3.id when matched then update set b1.o4=1; merge into `travel-sample` using default on key "2" when matched then update set `travel-sample`.name="aaa"; MERGE into WAREHOUSE using `beer-sample` ON KEY to_string("yakima_brewing_and_malting_grant_s_ales-deep_powder_winter_ale") when matched then delete;

Editor's Notes

  1. Arrays can be simple; arrays can be complex. JSON arrays give you a method to collapse the data model while retaining structure flexibility. Arrays of scalars, objects, and arrays are common structures in a JSON data model. Once you have this, you need to write queries to update and retrieve the data you need efficiently. This talk will discuss modeling and querying arrays. Then, it will discuss using array indexes to help run those queries on arrays faster.
  3. cbq> select distinct type from `travel-sample`; { "requestID": "458b7651-53a3-4a83-9abe-b65959420010", "signature": { "type": "json" }, "results": [ { "type": "route" }, { "type": "airport" }, { "type": "hotel" }, { "type": "airline" }, { "type": "landmark" } ], "status": "success", "metrics": { "elapsedTime": "840.518052ms", "executionTime": "840.478414ms", "resultCount": 5, "resultSize": 202 } }
  4. An array is a way to hold more than one value at a time. It’s like a list of items. Think of an array as the columns in a spreadsheet. You can have a spreadsheet with only one column or lots of columns.
  6. An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma). An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma). A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
  9. Let’s look at modeling Customer data.
  10. Rich structure: In a relational database, this customer data would be stored in five normalized tables. Each time you want to construct a customer object, you JOIN the data in these tables; each time you persist, you find the appropriate rows in the relevant tables and insert or update them. Relationship enforcement is via referential constraints. Objects are constructed by JOINs, every time. Value evolution: Additional values of the same type (e.g. an additional phone or an additional address) are managed by additional rows in one of the tables. Customer:contacts will have a 1:n relationship. Structure evolution: This is the most difficult part. Changing the structure is difficult, both within a table and across tables. While you can do this via ALTER TABLE, it requires downtime, migration, and application versioning. This is one of the problems document databases try to handle by representing data in JSON.
  11. Let’s see how to represent customer data in JSON.
  12. So, finally, you have a JSON document that represents a CUSTOMER. In a single JSON document, the relationship between the data is implicit in the use of sub-structures, arrays, and arrays of sub-structures.
  13. Before 4.5, the whole array was indexed as a single blob value. A query had to specify the entire array to find a match, which was not practical.
  14. Let's see how array indexing helps. First, it gives visibility into the array structure, so an index can be created on a subset of the array elements or attributes. With array indexing, a subset of the array elements or attributes can be individually indexed and searched.
  15. We can index only the required subset of the array, and hence be efficient on index storage and search times. The benefits are even more visible with nested arrays and objects.
  16. For example, an index created in earlier versions would look like the blue triangle, with the whole array indexed. With array indexes in 4.5, only the flight attributes within the array can be indexed, which is much more efficient in storage and performance. In summary, array indexing brings performance and ease of querying with arrays.
  17. For example, this SELECT statement finds the total number of flights scheduled on the 3rd day of the week. It iterates using the ANY operator to find matching index keys. Note that the DML statement uses the exact array variables and predicates that are used in the CREATE INDEX statement.
  18. This example creates a composite index with attributes in the array, such as 'v.flight', where v is an array element, and a non-array attribute such as 'stops'. The SELECT query finds all scheduled flights with one or more stops and groups the result by the number of stops. Note how the array elements can be iterated in the projection list of the SELECT.
  19. Let's look at an example with nested arrays. Consider the schedule array in travel-sample, with the nested array special-flights. The CREATE INDEX statement also uses a nested DISTINCT ARRAY construct to create the index on each distinct special flight.
  20. Here is a SELECT statement to find the total number of scheduled special flights. Again, note the nested form of the ANY construct and the use of variable names and index keys that match the corresponding CREATE INDEX statement.
  21. This feature has a few restrictions. First, the variable names and index keys, such as v and v.day, that are used in CREATE INDEX and SELECT must match exactly. The query predicate, which must appear in the WHERE clause of a SELECT, UPDATE, or DELETE statement, must have exactly the same format as the variable in the array index key, including the name of the variable, such as v.
  22. Only the operators… are supported.
  23. #3: SELECT * FROM default WHERE ANY c IN cities SATISFIES c = "Bombay" END; #4: SELECT * FROM default WHERE ANY c IN cities SATISFIES c = "Bombay" END AND age < 35; The SELECT in #4 can also be served by the index in #3, but the range low and high differ depending on the index created. If the #4 SELECT is used with the #3 index, the range is: High = ["\"Bombay\""], Low = ["\"Bombay\""], Inclusion: 3. If the #4 SELECT is used with the #4 index, the range is: High = ["\"Bombay\"", "35"], Low = ["\"Bombay\"", "null"], Inclusion: 0.
  24. #6: If two docs are: D1 = { "age": 25, "cities": [["Bangalore","Mysore"],["Chennai","Ooty"]] } D2 = { "age": 30, "cities": [["Siliguri","Kolkata"],["Kohlapur","Mumbai"]] } then the CREATE INDEX statement would be: CREATE INDEX idcities_nested ON default(ALL ARRAY (ALL ARRAY y FOR y IN c END) FOR c IN cities END)
  25. The query throughput above (queries per second) was measured for ForestDB with 3.6K set ops per second.
  26. The query throughput above (queries per second) was measured for MOI with 30K set ops per second.
  27. Q1 took 13 ms but with Couchbase query, it took about 1500 ms.