New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2

Antonios Giannopoulos
Database Administrator at Rackspace
Indexing & Aggregation
in MongoDB 4.2
#what_is_new
Antonios Giannopoulos
DBA @ ObjectRocket by Rackspace
Connect: linkedin.com/in/antonis/ Follow: @iamantonios
Introduction
www.objectrocket.com
Antonios Giannopoulos
Database troubleshooter aka troublemaker
@ObjectRocket
Troubleshoot: MongoDB, CockroachDB & Postgres
Troublemaking: All the things
Overview
• Index builds
• Index Limitations
• Wildcard Indexes
• Materialized views
• Updates using aggregations
• Server side updates
The art of typing…
How many of you have mistyped a mongo command?
How many of you have mistyped the background option on an index build?
Index builds
db.collection.createIndex({keys},{options})
db.collection.createIndex({keys},{options, background:true})
In MongoDB 4.2 the background flag is deprecated!
MongoDB 4.2 uses a new hybrid approach (best of both worlds)
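As a small illustration, existing scripts could drop the deprecated flag before calling createIndex. This is only a sketch: the helper name `stripBackgroundOption`, the sample options, and the `users` collection are assumptions, not part of the talk.

```javascript
// Hypothetical helper: returns a copy of the index options without `background`.
function stripBackgroundOption(options) {
  const { background, ...rest } = options || {};
  return rest;
}

const legacyOptions = { name: "email_1", unique: true, background: true };
const options42 = stripBackgroundOption(legacyOptions);
// In the mongo shell this could then be used as (collection name assumed):
// db.users.createIndex({ email: 1 }, options42)
```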
Hybrid Index - Build Stages
Initialization
o Exclusive lock against the collection being indexed
o Application is blocked
Data Ingestion and Processing
o Intent locks against the collection being indexed
o Application can read/write
Cleanup
o Exclusive lock (same as initialization)
Completion
o Makes Index available
Build Stages - Verbose
• Lock - obtains an exclusive X lock
• Initialization
• Lock - downgrades the exclusive X collection lock to an intent exclusive IX lock
• Scan Collection
• Process Side Writes Table
• Lock - upgrades the intent exclusive IX lock to a shared S lock
• Finish Processing Temporary Side Writes Table
• Lock - Upgrades the shared S lock on the collection to an exclusive X lock
• Drop Side Write Table
• Process Constraint Violation Table
• Mark the Index as Ready
• Lock - Releases the X lock on the collection
Index builds - Animated
[Animation: Index Metadata Entry, Collection, Side Writes Table, Constraint Violation Table]
Index builds - Logs
Index builds – Nothing fancy
Index builds – Measure Impact
Index builds – Impact (4.0)
Foreground Index in MongoDB 4.0
o One big lock
o Locks the database, whereas Hybrid locks only the collection
Background Index in MongoDB 4.0
o Execution time wasn’t affected
o Latency was affected
o Doesn’t block, whereas Hybrid needs short exclusive locks
Index builds – Size & Time
Time
Foreground Index: 43950 ms (43.95 seconds)
Background Index: 675243 ms (675.24 seconds)
Hybrid Index: 116467 ms (116.47 seconds)
Size
Foreground Index: 176635904 bytes (168.45 MiB)
Background Index: 365649920 bytes (348.71 MiB)
Hybrid Index: 176934912 bytes (168.74 MiB)
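To put the measurements in proportion, the ratios can be worked out directly from the figures above. In this test the hybrid build took roughly 2.65 times as long as the foreground build, the background build took about 5.8 times as long as the hybrid one, and the background index was about twice the size. The helper name `ratio` is an assumption for illustration.

```javascript
// Figures copied from the slide (milliseconds and bytes).
const buildTimeMs = { foreground: 43950, background: 675243, hybrid: 116467 };
const indexSizeBytes = { foreground: 176635904, background: 365649920, hybrid: 176934912 };

// Ratio of two measurements, rounded to two decimals.
function ratio(a, b) {
  return Math.round((a / b) * 100) / 100;
}

const hybridVsForeground = ratio(buildTimeMs.hybrid, buildTimeMs.foreground);
const backgroundVsHybrid = ratio(buildTimeMs.background, buildTimeMs.hybrid);
const backgroundSizeVsHybrid = ratio(indexSizeBytes.background, indexSizeBytes.hybrid);
```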
Index Limitations
• Index Key Limit
• Index Name
Index Key Limit
Before 4.2: The total size of an index entry, which can include structural overhead depending
on the BSON type, must be less than 1024 bytes.
> q = db.limitations.findOne({}, {payload: 1, _id: 0})
> Object.bsonsize(q)/1024
10.5458984375
With: db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } )
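The size printed above is in KiB, so the sample payload is far beyond the pre-4.2 limit. A rough check, assuming the document size is a fair stand-in for the index entry size (the real entry also carries structural overhead); the constant and function names are hypothetical:

```javascript
const LEGACY_INDEX_KEY_LIMIT_BYTES = 1024; // limit quoted on the slide for pre-4.2
const payloadKiB = 10.5458984375;          // Object.bsonsize(q)/1024 from the slide
const payloadBytes = payloadKiB * 1024;

// True when a value of this size could not be indexed under FCV "4.0"
// (an entry must be less than 1024 bytes).
function exceedsLegacyKeyLimit(sizeBytes) {
  return sizeBytes >= LEGACY_INDEX_KEY_LIMIT_BYTES;
}

const sampleTooLarge = exceedsLegacyKeyLimit(payloadBytes);
```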
Index Key Limit
Starting in version 4.2, MongoDB removes the Index Key Limit for FCV set to "4.2" or greater.
> q = db.limitations.findOne({}, {payload: 1, _id: 0})
> Object.bsonsize(q)/1024
10.5458984375
With: db.adminCommand( { setFeatureCompatibilityVersion: "4.2" } )
Index Name Length Limit
Before 4.2: fully qualified index names, which include the namespace and the dot separators
(i.e. <database name>.<collection name>.$<index name>), cannot be longer than 127 bytes
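A minimal sketch of how the pre-4.2 limit could be checked before creating an index. The function names and the sample database, collection, and index names are assumptions.

```javascript
const LEGACY_INDEX_NAME_LIMIT_BYTES = 127;

// Builds "<database name>.<collection name>.$<index name>" and measures it in UTF-8 bytes.
function fullyQualifiedIndexNameBytes(dbName, collName, indexName) {
  const fqName = `${dbName}.${collName}.$${indexName}`;
  return new TextEncoder().encode(fqName).length;
}

// True when the fully qualified name is not longer than 127 bytes.
function fitsLegacyNameLimit(dbName, collName, indexName) {
  return fullyQualifiedIndexNameBytes(dbName, collName, indexName) <= LEGACY_INDEX_NAME_LIMIT_BYTES;
}
```

Long auto-generated names on compound indexes are the usual way to run into this limit, which is why pre-4.2 deployments often passed an explicit short `name` option.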
Index Name Length Limit
Starting in version 4.2, MongoDB removes the Index Name Length Limit for MongoDB
versions with FCV set to "4.2" or greater.
Wildcard Indexes
• State prior to 4.2
• Definition
• Use cases
• Limitations
Indexing metadata
There can be no more than 32 fields in a compound index
A single collection can have no more than 64 indexes
The above limitations may cause issues in a data model like:
o Too many combos,
o Too many indexes,
o Too many fields for a single index,
o Sparse fields, new fields, etc.
Indexing metadata
The key-value store approach:
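A minimal sketch of the key-value (attribute) model: every top-level field other than _id is moved into an array of key/value pairs, so one compound multikey index can serve any attribute. The field name `attributes`, the collection name `products`, and the sample document are assumptions.

```javascript
// Converts a "natural" document into the key-value shape,
// keeping _id and moving every other top-level field into { k, v } pairs.
function toKeyValueModel(doc) {
  const { _id, ...fields } = doc;
  return {
    _id,
    attributes: Object.entries(fields).map(([k, v]) => ({ k, v })),
  };
}

const natural = { _id: 1, color: "red", size: "XL", price: 20 };
const keyValue = toKeyValueModel(natural);
// keyValue.attributes could then be indexed in the mongo shell with:
// db.products.createIndex({ "attributes.k": 1, "attributes.v": 1 })
```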
Wildcard Indexes
Create a wildcard index on a <field>
db.collection.createIndex( { "<field>.$**" : 1 } )
Create a wildcard index on all fields (excluding _id)
db.collection.createIndex( { "$**" : 1 } )
Specify fields to index
db.collection.createIndex(
  { "$**" : 1 },
  { "wildcardProjection" :
    { "<field>" : 1, "<field>.<subfield>" : 1 }
  })
Specify fields to exclude from index
db.collection.createIndex(
  { "$**" : 1 },
  { "wildcardProjection" :
    { "<field>" : 0, "<field>.<subfield>" : 0 }
  })
Wildcard Indexes
Same example as the key-value approach
Considerations / Notes
o Support at most one field in any given query predicate.
o The featureCompatibilityVersion must be 4.2
o Wildcard indexes omit the _id field by default
o You can create multiple wildcard indexes in a collection
o A wildcard index may cover the same fields as other indexes in the collection
o Wildcard indexes are Sparse Indexes
Considerations / Notes
Wildcard indexes can support a covered query only if all of the following are true:
o The query planner selects the wildcard index for satisfying the query predicate.
o The query predicate specifies exactly one field covered by the wildcard index.
o The projection explicitly excludes _id and includes only the query field.
o The specified query field is never an array.
Considerations / Notes
MongoDB can use a wildcard index for satisfying the sort() only if all of the following are true:
o The query planner selects the wildcard index for satisfying the query predicate.
o The sort() specifies only the query predicate field.
o The specified field is never an array
Considerations / Notes
Wildcard indexes can support at most one query predicate field. That is:
o MongoDB cannot use a non-wildcard index to satisfy one part of a query predicate and a
wildcard index to satisfy another.
o MongoDB cannot use one wildcard index to satisfy one part of a query predicate and
another wildcard index to satisfy another.
o Even if a single wildcard index could support multiple query fields, MongoDB can use the
wildcard index to support only one of the query fields. All remaining fields are resolved
without an index.
$or is not restricted by the above limitation (query and aggregation).
Considerations / Notes
Unsupported query patterns
Wildcard indexes cannot support:
o query condition that checks if a field does not exist.
o query condition that checks if a field is or is not equal to a document or an array
o query condition that checks if a field is not equal to null.
o the $min or $max aggregation operators.
Restrictions
You cannot shard a collection using a wildcard index
You cannot create a compound wildcard index.
You cannot specify the following properties for a wildcard index:
o TTL
o Unique
You cannot create the following index types using wildcard syntax:
o 2d (Geospatial)
o 2dsphere (Geospatial)
o Hashed
Comparison – Nothing fancy II
Natural vs Key-Value Model
Natural vs Key-Value Model
Both indexes scan only one branch of the $and
Natural vs Key-Value Model
Let’s add price to the equation - Wildcard doesn’t support compound indexes
Scans the index for Red
Scans the index for Price
Scans for Price & Red
Natural vs Key-Value Model
o Key-Value adds overhead to the collection (doc size)
o Both indexing models can utilize one field (combo for the k-v)
o $exists: false can only be satisfied by the key-value model
o Key-value supports compound
o Natural can cover a query (see considerations), key-value can’t (multikey)
o Key-value looks more flexible… (with a lot of buts…)
o Natural is a good idea for selective fields / unpredictable queries
Aggregation Framework - New Operators
• Trigonometry Expressions
• Arithmetic Expressions
• Regular Expressions
• New Stages
Trigonometry Expressions
$sin Returns the sine of a value that is measured in radians.
$cos Returns the cosine of a value that is measured in radians.
$tan Returns the tangent of a value that is measured in radians.
$degreesToRadians Converts a value from degrees to radians.
$radiansToDegrees Converts a value from radians to degrees.
Full list of trigonometry expressions https://bit.ly/2knln2D
Example:
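A minimal sketch of how these expressions could be combined in a pipeline stage, with plain JavaScript equivalents to sanity-check the expected values. The field name `angleDegrees` and the helper names are assumptions.

```javascript
// Pipeline stage (a plain object) that converts an angle field to radians and takes its sine.
const trigStage = {
  $project: {
    sine: { $sin: { $degreesToRadians: "$angleDegrees" } },
  },
};

// Local equivalents of the two expressions, for checking expected values.
const degreesToRadians = (deg) => (deg * Math.PI) / 180;
const sineOfDegrees = (deg) => Math.sin(degreesToRadians(deg));
```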
Arithmetic Expressions
MongoDB 4.2 adds the $round aggregation expression.
MongoDB 4.2 adds expanded functionality and new syntax to $trunc
Example:
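A minimal sketch of the two-argument forms, where the second argument is the decimal place. The field name `value` and the local helper `truncTo` are assumptions. Note that, per the MongoDB documentation, $round rounds halves to the nearest even value, which differs from JavaScript's Math.round, so only $trunc is mirrored locally here.

```javascript
// Stage using the 4.2 forms: both take [ <number>, <place> ].
const roundingStage = {
  $project: {
    rounded: { $round: ["$value", 1] },
    truncated: { $trunc: ["$value", 1] },
  },
};

// Local stand-in for $trunc with a non-negative place: cut digits, never round.
function truncTo(value, place) {
  const factor = Math.pow(10, place);
  return Math.trunc(value * factor) / factor;
}
```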
Regular Expressions
$regexFind Applies a regular expression (regex) to a string and returns information on the first matched substring
$regexFindAll Applies a regular expression (regex) to a string and returns information on all matched substrings.
$regexMatch Applies a regular expression (regex) to a string and returns true if a match is found and false if a
match is not found.
Example:
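A minimal sketch of the regex expressions in a $project stage, plus a local stand-in that mirrors the `{ match, idx, captures }` result shape of $regexFind. The field name `email`, the pattern, and the helper name `regexFindLocal` are assumptions.

```javascript
// Stage (a plain object) extracting the domain part of an email address.
const regexStage = {
  $project: {
    firstMatch: { $regexFind: { input: "$email", regex: "@(.+)$" } },
    hasMatch: { $regexMatch: { input: "$email", regex: "@(.+)$" } },
  },
};

// Local stand-in: returns null when nothing matches, like $regexFind.
function regexFindLocal(input, pattern) {
  const m = new RegExp(pattern).exec(input);
  if (m === null) return null;
  return { match: m[0], idx: m.index, captures: m.slice(1) };
}
```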
New Stages
$merge Writes the aggregation results to a collection
$planCacheStats Provides plan cache information for a collection
$replaceWith Replaces the input document with the specified document. Alias for the $replaceRoot stage
$set Adds new fields to documents. Alias for the $addFields stage
$unset Excludes fields from documents. Alias for the $project stage when it only removes fields
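The alias relationships can be made concrete by rewriting each new stage into its longer equivalent. This is a sketch; the function name `expandAlias` is hypothetical.

```javascript
// Rewrites $set, $unset and $replaceWith into the stages they are aliases for.
function expandAlias(stage) {
  if ("$set" in stage) return { $addFields: stage.$set };
  if ("$replaceWith" in stage) return { $replaceRoot: { newRoot: stage.$replaceWith } };
  if ("$unset" in stage) {
    // $unset takes one field name or an array of field names.
    const fields = Array.isArray(stage.$unset) ? stage.$unset : [stage.$unset];
    const projection = {};
    for (const f of fields) projection[f] = 0;
    return { $project: projection };
  }
  return stage; // any other stage is left untouched
}
```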
Materialized Views
• Logical vs Physical view
• Definition
• Implementation
• Use cases
• Under the hood
• Considerations
Materialized Views
MongoDB 3.4 added support for views on a collection:
o MongoDB computes the view contents by executing the aggregation on demand
during read operations
o Views are not associated with data structures on disk
o db.runCommand( { create: <view>, viewOn: <source>, pipeline: <pipeline> } )
MongoDB 4.2 adds support for materialized views on a collection:
o Introduces $merge stage for the aggregation pipeline
o Materialized views are associated with data structures on disk (collections)
{ $merge: {
into: <collection> -or- { db: <db>, coll: <collection> }, //Mandatory
on: <identifier field> -or- [ <identifier field1>, ...], // Optional
let: <variables>, // Optional
whenMatched: <replace|keepExisting|merge|fail|pipeline>, // Optional
whenNotMatched: <insert|discard|fail> // Optional
} }
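A minimal sketch of assembling a $merge stage with the defaults made explicit. The helper name `mergeStage` and the sample `sales` and `reporting.salesByCustomer` namespaces are assumptions.

```javascript
// Builds a $merge stage; only `into` is mandatory.
function mergeStage(into, options = {}) {
  if (!into) throw new Error("`into` is mandatory");
  const stage = {
    into,
    whenMatched: options.whenMatched || "merge",        // default per the syntax above
    whenNotMatched: options.whenNotMatched || "insert", // default per the syntax above
  };
  if (options.on !== undefined) stage.on = options.on;
  return { $merge: stage };
}

const pipeline = [
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  mergeStage({ db: "reporting", coll: "salesByCustomer" }, { whenMatched: "replace" }),
];
// In the mongo shell: db.sales.aggregate(pipeline)
```

Re-running the same pipeline on a schedule is what refreshes the materialized view.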
MV – Definition - Into
Into: The collection name.
Format: into: "myOutput" or into: { db:"myDB", coll:"myOutput" }
If the output collection does not exist, $merge creates the collection:
o For a replica set, if the output database does not exist, $merge also creates the database
o For a sharded cluster, the specified output database must already exist
The output collection cannot be the same collection as the collection being
aggregated
The output collection cannot appear in any other stages of the pipeline
The output collection can be a sharded collection
MV – Definition - On
On: (Optional) Field or fields that act as a unique identifier for a document.
Format: on: "_id" or on: [ "date", "customerId" ]
The order of the fields in the array does not matter, and you cannot specify the
same field multiple times.
For the specified field (or fields):
o The aggregation results documents must contain the field(s) specified in the on, unless the
on field is the _id field
o The specified field or fields cannot contain a null or an array value.
o $merge requires a unique index with keys that correspond to the on identifier fields.
o For output collections that already exist, the corresponding index must already exist.
The default value for on depends on the output collection
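The document-level rules above can be sketched as a pre-flight check on one result document. It only covers top-level fields and does not verify the unique index requirement; the function name `checkOnFields` is hypothetical.

```javascript
// Returns "ok" or a short description of the first rule the `on` spec violates.
function checkOnFields(on, resultDoc) {
  const fields = Array.isArray(on) ? on : [on];
  if (new Set(fields).size !== fields.length) return "duplicate field in on";
  for (const f of fields) {
    if (f === "_id" && !(f in resultDoc)) continue; // _id may be absent from the results
    const v = resultDoc[f];
    if (v === undefined) return `missing field ${f}`;
    if (v === null || Array.isArray(v)) return `field ${f} is null or an array`;
  }
  return "ok";
}
```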
MV–Definition - WhenMatched
WhenMatched: (Optional): The behavior of $merge if a result document and an
existing document in the collection have the same value for the specified on
field(s).
Options:
replace: Replace the existing document in the output collection with the matching
results document.
keepExisting: Keep the existing document in the output collection.
merge (default): Merge the matching documents (similar to the $mergeObjects
operator)
MV–Definition - WhenMatched
Options (continued):
fail: Stop and fail the aggregation operation. Any changes to the output collection
from previous documents are not reverted.
pipeline: An aggregation pipeline to update the document in the collection. Can
only contain $addFields and its alias $set, $project and its alias $unset,
$replaceRoot and its alias $replaceWith
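The stage restriction on a whenMatched pipeline can be sketched as a simple check. The names `ALLOWED_STAGES` and `isValidWhenMatchedPipeline` are assumptions.

```javascript
// Stages permitted inside a whenMatched pipeline, as listed above.
const ALLOWED_STAGES = ["$addFields", "$set", "$project", "$unset", "$replaceRoot", "$replaceWith"];

// True when every stage is a single-key object naming an allowed stage.
function isValidWhenMatchedPipeline(pipeline) {
  return Array.isArray(pipeline) && pipeline.every((stage) => {
    const keys = Object.keys(stage);
    return keys.length === 1 && ALLOWED_STAGES.includes(keys[0]);
  });
}
```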
MV–Definition - WhenNotMatched
WhenNotMatched: (Optional) The behavior of $merge if a result document does
not match an existing document in the output collection.
Options
insert (Default): Insert the document into the output collection.
discard: Discard the document; i.e. $merge does not insert the document into the
output collection.
fail: Stop and fail the aggregation operation. Any changes to the output collection
from previous documents are not reverted.
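The whenMatched and whenNotMatched options can be sketched with an in-memory simulation over arrays of documents. It matches on a single field, omits the pipeline option, and, unlike the server, returns nothing on "fail" rather than leaving earlier changes in place. The function name `simulateMerge` is hypothetical.

```javascript
// Applies `results` to a copy of `target`, matching documents on the field `on`.
function simulateMerge(target, results, on, whenMatched = "merge", whenNotMatched = "insert") {
  const out = target.map((d) => ({ ...d }));
  for (const doc of results) {
    const i = out.findIndex((d) => d[on] === doc[on]);
    if (i >= 0) {
      if (whenMatched === "replace") out[i] = { ...doc };
      else if (whenMatched === "merge") out[i] = { ...out[i], ...doc };
      else if (whenMatched === "fail") throw new Error("whenMatched: fail");
      // keepExisting: leave out[i] untouched
    } else {
      if (whenNotMatched === "insert") out.push({ ...doc });
      else if (whenNotMatched === "fail") throw new Error("whenNotMatched: fail");
      // discard: drop the result document
    }
  }
  return out;
}
```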
MV – Combine collections
MV – Under the hood
merge performs a $set update
replace performs a full update (updateobj)
keepExisting performs a $setOnInsert
MV – Under the hood
insert performs a $set update, with {upsert:true}
discard performs a $set update, with {upsert:false}
What about fail?
MV - $merge vs $out
$merge:
o Available starting in MongoDB 4.2
o Can output to a collection in the same or a different database
o Creates a new collection if the output collection does not already exist
o Can incorporate results into an existing collection (previous slides)
o Input and output collections can be sharded
$out:
o Available starting in MongoDB 2.6
o Can output to a collection in the same database
o Creates a new collection if the output collection does not already exist
o Replaces the output collection completely if it already exists
o Only the input collection can be sharded
MV–$merge restrictions
o The output collection cannot be:
- the same collection as the collection being aggregated
- a collection that appears in any other stages of the pipeline ($lookup)
o An aggregation pipeline cannot use $merge inside a transaction.
o View definition cannot include the $merge stage
o $lookup or $facet stage’s nested pipeline cannot include the $merge stage
o The $merge stage cannot be used in conjunction with read concern
"linearizable"
MV – Use Cases
Reporting: Rolling up a summary of sales daily
Pre-compute aggregation: Aggregating averages of events every N <time unit>.
Data warehouse: Merging new different sources of data on a single view
Caching: Keep a subset of documents that meet read requirements
A use case based on: The Concept of Materialized Views in MongoDB Sharded Clusters
https://www.percona.com/community-blog/2019/07/16/concept-materialized-views-mongodb-sharded-clusters/
MV – Scatter Gather
Scenario: An update heavy user profile collection, with reads on various fields –
including _id & email
Our Goal: Avoid scatter-gather as much as possible
Two collections:
o Users: Contains all user related information (sharded on _id: "hashed")
o Cache: Contains static content (sharded on email)
o Two queries instead of one using the _id from cache
o Refresh on regular intervals
o On “fail” app retries on users collection (scatter-gather)
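The two-query read path can be sketched with in-memory maps standing in for the two sharded collections. The sample data and the function name `findUserByEmail` are assumptions, and the `path` label only marks which route a real cluster would take.

```javascript
// Stand-ins for the two collections: cache (keyed by email) and users (keyed by _id).
const cacheByEmail = new Map([["ann@example.com", { _id: 42 }]]);
const usersById = new Map([
  [42, { _id: 42, email: "ann@example.com", plan: "pro" }],
  [43, { _id: 43, email: "bob@example.com", plan: "free" }],
]);

function findUserByEmail(email) {
  // Query 1: targeted, because the cache is sharded on email.
  const cached = cacheByEmail.get(email);
  // Query 2: targeted, because users is sharded on _id.
  if (cached) return { user: usersById.get(cached._id), path: "targeted" };
  // Cache miss: search users by email, a scatter-gather on a real cluster.
  for (const user of usersById.values()) {
    if (user.email === email) return { user, path: "scatter-gather" };
  }
  return { user: null, path: "scatter-gather" };
}
```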
More expressive Update Language
• Aggregation framework & updates
• Server-side updates
Aggr. Expressions on Updates
Starting in MongoDB 4.2, you can use the aggregation pipeline for update
operations
The statements that can use the aggregation pipeline are:
o db.collection.findAndModify()
o db.collection.findOneAndUpdate()
o db.collection.updateOne()
o db.collection.updateMany()
o db.collection.update()
Meaning:
o Updates can be specified with the aggregation pipeline
o All fields from the existing document can be accessed
o More powerful but slower…
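A minimal sketch of an update written as a pipeline: the second argument is an array of stages rather than an update document, so new values can be computed from existing fields. The `orders` collection, the field names, and the local helper `applyLocally` are assumptions.

```javascript
// Update expressed as an aggregation pipeline (an array), not an update document.
const filter = { _id: 1 };
const updatePipeline = [
  {
    $set: {
      total: { $add: ["$price", "$tax"] },        // computed from existing fields
      status: { $ifNull: ["$status", "new"] },    // default when missing or null
    },
  },
];
// In the mongo shell: db.orders.updateOne(filter, updatePipeline)

// Local stand-in for what this particular pipeline computes on one document.
function applyLocally(doc) {
  return {
    ...doc,
    total: doc.price + doc.tax,
    status: doc.status != null ? doc.status : "new",
  };
}
```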
Examples-Handle exceptions
Handle missing/default values
Upsert an array with documents
The oplog… Update
The oplog… Aggregation
Recap & Takeaways
Hybrid indexes:
Less impactful than foreground, faster than background (Best of both worlds!)
We still recommend the secondary build method for large collections
Wildcard Indexes:
Very powerful when the queries are unknown
Better than the key-value model when no other field is involved in the query predicates
Aggregation framework:
New operators and stages introduced
New stage $merge creates materialized views
Aggregation framework language can be used on update operations
More powerful updates but slower…
Rate My Session
I still have problems sleeping,
but counting bugs in 4.2 helps me sleep
I had problems sleeping, but I took a quick nap
I had problems sleeping but not anymore
We’re Hiring!
Looking to join a dynamic & innovative team?
https://www.objectrocket.com/careers/
Questions?
Thank you!
Address:
9001 N Interstate Hwy 35 #150, Austin, TX 78753
Support:
US Toll free: 1-855-722-8165
UK Toll free +448081686840
support@objectrocket.com
Sales:
1-888-440-3242
sales@objectrocket.com
www.objectrocket.com
71
1 of 71

More Related Content

What's hot(20)

Triggers In MongoDBTriggers In MongoDB
Triggers In MongoDB
Jason Terpko205 views
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
Jason Terpko621 views
MongoDB Scalability Best PracticesMongoDB Scalability Best Practices
MongoDB Scalability Best Practices
Jason Terpko115 views
What's new in Redis v3.2What's new in Redis v3.2
What's new in Redis v3.2
Itamar Haber2.2K views
Tag based sharding presentationTag based sharding presentation
Tag based sharding presentation
Juan Antonio Roy Couto1.8K views
ELK stack at weibo.comELK stack at weibo.com
ELK stack at weibo.com
琛琳 饶2.9K views
StackExchange.redisStackExchange.redis
StackExchange.redis
Larry Nung2K views
Extend Redis with ModulesExtend Redis with Modules
Extend Redis with Modules
Itamar Haber2K views
Time Series Processing with Solr and SparkTime Series Processing with Solr and Spark
Time Series Processing with Solr and Spark
Josef Adersberger527 views
Prometheus StoragePrometheus Storage
Prometheus Storage
Fabian Reinartz9.7K views
Redis modules 101Redis modules 101
Redis modules 101
Dvir Volk1.9K views
Tweaking performance on high-load projectsTweaking performance on high-load projects
Tweaking performance on high-load projects
Dmitriy Dumanskiy2.5K views
JEEConf. Vanilla javaJEEConf. Vanilla java
JEEConf. Vanilla java
Dmitriy Dumanskiy770 views
MongoDB ConceptsMongoDB Concepts
MongoDB Concepts
Juan Antonio Roy Couto1.4K views
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a month
Dmitriy Dumanskiy1.5K views

Similar to New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2

Mongo indexesMongo indexes
Mongo indexesparadokslabs
2.2K views17 slides
Mongo db basicsMongo db basics
Mongo db basicsDhaval Mistry
1.1K views26 slides

Similar to New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2(20)

Mongo indexesMongo indexes
Mongo indexes
paradokslabs2.2K views
Mongo db basicsMongo db basics
Mongo db basics
Dhaval Mistry1.1K views
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
Raghvendra Parashar306 views
Nosql part 2Nosql part 2
Nosql part 2
Ruru Chowdhury765 views
Query Optimization in MongoDBQuery Optimization in MongoDB
Query Optimization in MongoDB
Hamoon Mohammadian Pour750 views
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
Michael Bright150 views
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
Mike Bright1.5K views
Indexing and Query OptimizerIndexing and Query Optimizer
Indexing and Query Optimizer
MongoDB847 views
MongoDB FabLab LeónMongoDB FabLab León
MongoDB FabLab León
Juan Antonio Roy Couto612 views
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Raghunath A115 views
Mongo db tutorialsMongo db tutorials
Mongo db tutorials
Anuj Jain488 views
Mongo Nosql CRUD OperationsMongo Nosql CRUD Operations
Mongo Nosql CRUD Operations
anujaggarwal49593 views
Data Types/Structures in DivConqData Types/Structures in DivConq
Data Types/Structures in DivConq
eTimeline, LLC644 views

More from Antonios Giannopoulos(6)

Elastic 101 tutorial - Percona Europe 2018 Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018
Antonios Giannopoulos5.9K views
Percona Live 2017 ­- Sharded cluster tutorialPercona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorial
Antonios Giannopoulos330 views
Introduction to Polyglot Persistence Introduction to Polyglot Persistence
Introduction to Polyglot Persistence
Antonios Giannopoulos642 views
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
Antonios Giannopoulos1.1K views

Recently uploaded(20)

New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2

  • 1. Indexing & Aggregation in MongoDB 4.2 #what_is_new Antonios Giannopoulos DBA @ ObjectRocket by Rackspace Connect:linkedin.com/in/antonis/ Follow:@iamantonios 1
  • 2. Introduction www.objectrocket.com 2 Antonios Giannopoulos Database troubleshooter aka troublemaker @ObjectRocket Troubleshoot: MongoDB, CockroachDB & Postgres Troublemaking: All the things
  • 3. Overview • Index builds • Index Limitations • Wildcard Indexes • Materialized views • Updates using aggregations • Server side updates www.objectrocket.com 3
  • 4. The art of typing… www.objectrocket.com 4 How many of you have mistype a mongo command? How many of you have mistype the background on an index build?
  • 5. Index builds www.objectrocket.com 5 db.createIndex({keys},{options}) db.createIndex({keys},{options, background:true}) In MongoDB 4.2 the backgrund flag is deprecated!!! MongoDB 4.2 is using a new hybrid approach (best of two worlds)
  • 6. Hybrid Index - Build Stages www.objectrocket.com 6 Initialization o Exclusive lock against the collection being indexed o Application is blocked Data Ingestion and Processing o Intent locks against the collection being indexed o Application can read/write Cleanup o Exclusive lock (same as initialization) Completion o Makes Index available
  • 7. Build Stages - Verbose www.objectrocket.com 7 • Lock - obtains an exclusive X lock • Initialization • Lock - downgrades the exclusive X collection lock to an intent exclusive IX lock • Scan Collection • Process Side Writes Table • Lock - shared S lock • Finish Processing Temporary Side Writes Table • Lock - Upgrades the shared S lock on the collection to an exclusive X lock • Drop Side Write Table • Process Constraint Violation Table • Mark the Index as Ready • Lock - Releases the X lock on the collection
  • 8. Index builds - Animated www.objectrocket.com 8 Index Metadata Entry Collection Side Writes Table Constraint Violation Table
  • 9. Index builds - Logs www.objectrocket.com 9
  • 10. Index builds – Nothing fancy www.objectrocket.com 10
  • 11. Index builds – Measure Impact 11
  • 12. Index builds – Impact (4.0) www.objectrocket.com 12 Foreground Index in MongoDB 4.0 o One big lock o Locks the database vs Collection with Hybrid Background Index in MongoDB 4.0 o Execution time didn’t affected o Latency was affected o Doesn’t lock vs the locks Hybrid needs
  • 13. Index builds – Size & Time www.objectrocket.com 13 Time Foreground Index: 43950ms (43,95 seconds) Background Index: 675243ms (675,24 seconds) Hybrid Index: 116467ms (116,46 seconds) Size Foreground Index : 176635904 (168.45 MiB) Background Index : 365649920 (348.71 MiB) Hybrid Index : 176934912 (168.73 MiB)
  • 14. Index Limitations • Index Key Limit • Index Name www.objectrocket.com 14
  • 15. Index Key Limit www.objectrocket.com 15 Before 4.2: The total size of an index entry, which can include structural overhead depending on the BSON type, must be less than 1024 bytes. >q=db.limitations.findOne({},{payload:1,_id:0}) > Object.bsonsize(q)/1024 10.5458984375 With > db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } )
  • 16. Index Key Limit www.objectrocket.com 16 Starting in version 4.2, MongoDB removes the Index Key Limit for FCV set to "4.2" or greater. >q=db.limitations.findOne({},{payload:1,_id:0}) > Object.bsonsize(q)/1024 10.5458984375 With > db.adminCommand( { setFeatureCompatibilityVersion: "4.2" } )
  • 17. Index Name Length Limit www.objectrocket.com 17 Before 4.2: fully qualified index names, which include the namespace and the dot separators (i.e. <database name>.<collection name>.$<index name>), cannot be longer than 127 bytes
  • 18. Index Name Length Limit www.objectrocket.com 18 Starting in version 4.2, MongoDB removes the Index Name Length Limit for MongoDB versions with FVC set to "4.2" or greater.
  • 19. Wildcard Indexes • State prior to 4.2 • Definition • Use cases • Limitations www.objectrocket.com 19
  • 20. Indexing metadata www.objectrocket.com 20 There can be no more than 32 fields in a compound index A single collection can have no more than 64 indexes The above limitations may cause issues in a data model like: o Too many combos, o Too many indexes, o too many fields for a single index, o sparse fields, new fields … etc
  • 22. Wildcard Indexes www.objectrocket.com 22 Create a wildcard index on a <field> db.collection.createIndex( { ”<field>.$**" : 1 } ) Create a wildcard index on all fields (excluding _id) db.collection.createIndex( { "$**" : 1 } ) Specify fields to index db.collection.createIndex( { "$**" : 1 }, { "wildcardProjection" : { ”<field>" : 1, ”<field>.<subfield>" : 1 } }) Specify fields to exclude from index db.collection.createIndex( { "$**" : 1 }, { "wildcardProjection" : { ”<field>" : 0, ”<field>.<subfield>" : 0 } })
  • 24. Considerations / Notes www.objectrocket.com 24 o Support at most one field in any given query predicate. o The featureCompatibilityVersion must be 4.2 o Wildcard indexes omit the _id field by default o You can create multiple wildcard indexes in a collection o A wildcard index may cover the same fields as other indexes in the collection o Wildcard indexes are Sparse Indexes
  • 25. Considerations / Notes www.objectrocket.com 25 Wildcard indexes can support a covered query only if all of the following are true: o The query planner selects the wildcard index for satisfying the query predicate. o The query predicate specifies exactly one field covered by the wildcard index. o The projection explicitly excludes _id and includes only the query field. o The specified query field is never an array.
  • 26. Considerations / Notes www.objectrocket.com 26 MongoDB can use a wildcard index for satisfying the sort() only if all of the following are true: o The query planner selects the wildcard index for satisfying the query predicate. o The sort() specifies only the query predicate field. o The specified field is never an array
  • 27. Considerations / Notes www.objectrocket.com 27 Wildcard indexes can support at most one query predicate field. That is: o MongoDB cannot use a non-wildcard index to satisfy one part of a query predicate and a wildcard index to satisfy another. o MongoDB cannot use one wildcard index to satisfy one part of a query predicate and another wildcard index to satisfy another. o Even if a single wildcard index could support multiple query fields, MongoDB can use the wildcard index to support only one of the query fields. All remaining fields are resolved without an index. $or is not restricted by the above limitation (query and aggregation).
  • 28. Considerations / Notes www.objectrocket.com 28 Unsupported query patterns Wildcard indexes, cannot support : o query condition that checks if a field does not exist. o query condition that checks if a field is or is not equal to a document or an array o query condition that checks if a field is not equal to null. o the $min or $max aggregation operators.
  • 29. Restrictions www.objectrocket.com 29 You cannot shard a collection using a wildcard index You cannot create a compound index. You cannot specify the following properties for a wildcard index: o TTL o Unique You cannot create the following index types using wildcard syntax: o 2d (Geospatial) o 2dsphere (Geospatial) o Hashed
  • 30. Comparison – Nothing fancy II 30
  • 31. Natural vs Key-Value Model www.objectrocket.com 31
  • 32. Natural vs Key-Value Model www.objectrocket.com 32 Both indexes scan only one branch of the $and
  • 33. Natural vs Key-Value Model www.objectrocket.com 33 Lets’ add price to the equation - Wildcard doesn’t support compound indexes Scans the index for Red Scans the index for Price Scans for Price & Red
  • 34. Natural vs Key-Value Model www.objectrocket.com 34 o Key-Value adds overhead to the collection (doc size) o Both indexing models can utilize one field (combo for the k-v) o $exists:false only can be satisfied by the key-value model o Key-value supports compound o Natural can cover a query(see considerations), key-value don’t (multikey) o Key-value looks more flexible…(with lot buts…) o Natural is a good idea for selective fields / unpredicted queries
  • 35. Aggregation Framework - New Operators • Trigonometry Expressions • Arithmetic Expressions • Regular Expressions • New Stages www.objectrocket.com 35
  • 36. Trigonometry Expressions www.objectrocket.com 36 $sin Returns the sine of a value that is measured in radians. $cos Returns the cosine of a value that is measured in radians. $tan Returns the tangent of a value that is measured in radians. $degreesToRadians Converts a value from degrees to radians. $radiansToDegrees Converts a value from radians to degrees. Full list of trigonometry expressions https://bit.ly/2knln2D Example:
  • 37. Arithmetic Expressions www.objectrocket.com 37 MongoDB 4.2 adds the $round aggregation expression. MongoDB 4.2 adds expanded functionality and new syntax to $trunc Example:
  • 38. Regular Expressions www.objectrocket.com 38 $regexFind Applies a regular expression (regex) to a string and returns information on the first matched substring $regexFindAll Applies a regular expression (regex) to a string and returns information on all matched substrings. $regexMatch Applies a regular expression (regex) to a string and returns true if a match is found and false if a match is not found. Example:
  • 39. New Stages www.objectrocket.com 39 $merge Writes the aggregation results to a collection $planCacheStats Provides plan cache information for a collection $replaceWith Replaces the input document with the specified document. Alias to the $replaceRoot stage $set Adds new fields to documents. Alias to the $addFields $unset Excludes fields from documents. Alias to the $project stage
  • 40. Materialized Views • Logical vs Physical view • Definition • Implementation • Use cases • Under the hood • Considerations www.objectrocket.com 40
  • 41. Materialized Views www.objectrocket.com 41 MongoDB 3.4 adds support of views on a collection: o MongoDB computes the view contents by executing the aggregation on- demand during read operations o Views are not associated with data structures on disk o db.runCommand( { create: <view>, viewOn: <source>, pipeline: <pipeline> } ) MongoDB 4.2 adds support of materialized views on a collection: o Introduces $merge stage for the aggregation pipeline o Materialized views are associated with data structures on disk (collections) { $merge: { into: <collection> -or- { db: <db>, coll: <collection> }, //Mandatory on: <identifier field> -or- [ <identifier field1>, ...], // Optional let: <variables>, // Optional whenMatched: <replace|keepExisting|merge|fail|pipeline>, // Optional whenNotMatched: <insert|discard|fail> // Optional } }
  • 42. MV – Definition - Into
      Into: The output collection name.
      Format: into: "myOutput" or into: { db: "myDB", coll: "myOutput" }
      If the output collection does not exist, $merge creates the collection:
      o For a replica set, if the output database does not exist, $merge also creates the database
      o For a sharded cluster, the specified output database must already exist
      The output collection cannot be the same collection as the collection being aggregated.
      The output collection cannot appear in any other stage of the pipeline.
      The output collection can be a sharded collection.
  • 44. MV – Definition - On
      On: (Optional) Field or fields that act as a unique identifier for a document.
      Format: on: "_id" or on: [ "date", "customerId" ]
      The order of the fields in the array does not matter, and you cannot specify the same field multiple times.
      For the specified field (or fields):
      o The aggregation result documents must contain the field(s) specified in on, unless the on field is the _id field
      o The specified field or fields cannot contain a null or an array value
      o $merge requires a unique index with keys that correspond to the on identifier fields
      o For output collections that already exist, the corresponding index must already exist
      The default value for on depends on the output collection.
  • 46. MV – Definition - WhenMatched
      WhenMatched: (Optional) The behavior of $merge if a result document and an existing document in the collection have the same value for the specified on field(s).
      Options:
      o replace: Replace the existing document in the output collection with the matching results document.
      o keepExisting: Keep the existing document in the output collection.
      o merge (default): Merge the matching documents (similar to the $mergeObjects operator).
  • 47. MV – Definition - WhenMatched
      Options (continued):
      o fail: Stop and fail the aggregation operation. Any changes to the output collection from previous documents are not reverted.
      o pipeline: An aggregation pipeline to update the document in the collection. Can only contain $addFields and its alias $set, $project and its alias $unset, $replaceRoot and its alias $replaceWith.
  • 50. MV – Definition - WhenNotMatched
      WhenNotMatched: (Optional) The behavior of $merge if a result document does not match an existing document in the output collection.
      Options:
      o insert (default): Insert the document into the output collection.
      o discard: Discard the document; i.e. $merge does not insert the document into the output collection.
      o fail: Stop and fail the aggregation operation. Any changes to the output collection from previous documents are not reverted.
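The whenMatched and whenNotMatched options described above can be modelled with a small in-memory simulation. This is plain JavaScript, not MongoDB code: simulateMerge is a hypothetical helper, it handles a single on field only, and it leaves out the pipeline option.

```javascript
// Simplified in-memory model of $merge semantics (not the server implementation).
function simulateMerge(output, results, opts = {}) {
  const { on = "_id", whenMatched = "merge", whenNotMatched = "insert" } = opts;
  const matchedModes = ["replace", "keepExisting", "merge", "fail"];
  const notMatchedModes = ["insert", "discard", "fail"];
  if (!matchedModes.includes(whenMatched) || !notMatchedModes.includes(whenNotMatched)) {
    throw new Error("unsupported option in this sketch");
  }
  for (const doc of results) {
    const i = output.findIndex(existing => existing[on] === doc[on]);
    if (i >= 0) {
      if (whenMatched === "replace") output[i] = { ...doc };
      else if (whenMatched === "merge") output[i] = { ...output[i], ...doc };
      else if (whenMatched === "fail") throw new Error(`duplicate value for ${on}`);
      // keepExisting: the existing document is left untouched.
    } else {
      if (whenNotMatched === "insert") output.push({ ...doc });
      else if (whenNotMatched === "fail") throw new Error(`no match for ${on}`);
      // discard: the result document is dropped.
    }
  }
  return output;
}

// Defaults (merge + insert): matched fields are overlaid, new documents appended.
const view = simulateMerge(
  [{ _id: 1, a: 1, b: 2 }],
  [{ _id: 1, a: 9 }, { _id: 2, a: 5 }]
);
// view is now [{ _id: 1, a: 9, b: 2 }, { _id: 2, a: 5 }]
```

Because the loop mutates the output as it goes, a fail partway through leaves earlier writes in place, which mirrors the "not reverted" note on the slides.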
  • 52. MV – Combine collections
  • 53. MV – Under the hood
      whenMatched:
      o merge performs a $set update
      o replace performs a full update (updateobj)
      o keepExisting performs a $setOnInsert
  • 54. MV – Under the hood
      whenNotMatched:
      o insert performs a $set update, with {upsert:true}
      o discard performs a $set update, with {upsert:false}
      What about fail?
  • 55. MV - $merge vs $out
      $merge:
      o Available starting in MongoDB 4.2
      o Can output to a collection in the same or a different database
      o Creates a new collection if the output collection does not already exist
      o Can incorporate results into an existing collection (see previous slides)
      o Input and output collections can be sharded
      $out:
      o Available starting in MongoDB 2.6
      o Can output to a collection in the same database
      o Creates a new collection if the output collection does not already exist
      o Replaces the output collection completely if it already exists
      o Only the input collection can be sharded
  • 56. MV – $merge restrictions
      o The output collection cannot be:
        - the same collection as the collection being aggregated
        - a collection that appears in any other stage of the pipeline (e.g. $lookup)
      o An aggregation pipeline cannot use $merge inside a transaction
      o A view definition cannot include the $merge stage
      o The nested pipeline of a $lookup or $facet stage cannot include the $merge stage
      o The $merge stage cannot be used in conjunction with read concern "linearizable"
  • 57. MV – Use Cases
      o Reporting: Rolling up a daily summary of sales
      o Pre-computed aggregation: Aggregating averages of events every N <time unit>
      o Data warehouse: Merging different sources of data into a single view
      o Caching: Keeping a subset of documents that meets read requirements
      A use case based on: The Concept of Materialized Views in MongoDB Sharded Clusters
      https://www.percona.com/community-blog/2019/07/16/concept-materialized-views-mongodb-sharded-clusters/
  • 58. MV – Scatter Gather
      Scenario: An update-heavy user profile collection, with reads on various fields, including _id & email
      Our goal: Avoid scatter-gather as much as possible
      Two collections:
      o Users: Contains all user-related information (sharded on _id: "hashed")
      o Cache: Contains static content (sharded on email)
      How it works:
      o Two queries instead of one, using the _id from the cache
      o The cache is refreshed at regular intervals
      o On "fail", the app retries on the users collection (scatter-gather)
  • 59. More expressive Update Language
      o Aggregation framework & updates
      o Server-side updates
  • 60. Aggr. Expressions on Updates
      Starting in MongoDB 4.2, you can use the aggregation pipeline for update operations.
      The statements that can use the aggregation pipeline are:
      o db.collection.findAndModify()
      o db.collection.findOneAndUpdate()
      o db.collection.updateOne()
      o db.collection.updateMany()
      o db.collection.update()
      Meaning:
      o Updates can be specified with the aggregation pipeline
      o All fields of the existing document can be accessed
      o More powerful but slower…
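A sketch of what a pipeline-style update could look like. The "orders" collection and the "price", "tax" and "total" fields are hypothetical; the point is that the update is an array of stages and can read the document's current field values.

```javascript
// Hypothetical collection "orders" with documents such as { _id: 1, price: 100, tax: 8 }.
const orderFilter = { _id: 1 };

// An array (instead of an update document) tells the server to treat it as a pipeline.
const updatePipeline = [
  {
    $set: {
      // The new value is derived from existing fields of the same document.
      total: { $add: ["$price", "$tax"] },
      // $$NOW is the aggregation system variable for the current datetime (4.2+).
      lastModified: "$$NOW"
    }
  },
  // Stages run in order, so "tax" is removed only after "total" was computed.
  { $unset: "tax" }
];

// In the mongo shell (MongoDB 4.2+): db.orders.updateOne(orderFilter, updatePipeline)
```

Only $addFields/$set, $project/$unset and $replaceRoot/$replaceWith are accepted as stages in an update pipeline.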
  • 63. Upsert an array with documents
  • 66. Recap & Takeaways
      Hybrid index builds:
      o Less impactful than foreground, faster than background (best of both worlds!)
      o We still recommend the secondary build method for large collections
      Wildcard indexes:
      o Very powerful when the queries are unknown
      o Better than the key-value model when no other field is involved in the query predicates
      Aggregation framework:
      o New operators and stages introduced
      o The new $merge stage creates materialized views
      o The aggregation framework language can be used in update operations
      o More powerful updates but slower…
  • 68. Rate My Session
      o I still have problems sleeping, but counting bugs in 4.2 helps me sleep
      o I had problems sleeping, but I took a quick nap
      o I had problems sleeping, but not anymore
  • 69. We're Hiring! Looking to join a dynamic & innovative team? https://www.objectrocket.com/careers/
  • 71. Thank you!
      Address: 9001 N Interstate Hwy 35 #150, Austin, TX 78753
      Support: US toll free: 1-855-722-8165, UK toll free: +448081686840, support@objectrocket.com
      Sales: 1-888-440-3242, sales@objectrocket.com
      www.objectrocket.com