Whether you are running MongoDB on-premises, self-managing in the cloud, or using MongoDB Atlas, it's critical that you have dependable backups of your data for when things go sideways. This takes infrastructure, storage, and coordination, which can be complex and costly. In MongoDB 4.2, we are changing how backup is architected, helping you reduce the required storage footprint and remove architectural complexities to increase performance and decrease costs. Come to this session to see how we're accomplishing this.
2. #MDBLocal
Safe Harbor Statement
This presentation contains “forward-looking statements” within the meaning of Section 27A of the Securities Act
of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended. Such forward-looking
statements are subject to a number of risks, uncertainties, assumptions and other factors that could cause
actual results and the timing of certain events to differ materially from future results expressed or implied by the
forward-looking statements. Factors that could cause or contribute to such differences include, but are not
limited to, those identified in our filings with the Securities and Exchange Commission. You should not rely upon
forward-looking statements as predictions of future events. Furthermore, such forward-looking statements
speak only as of the date of this presentation.
In particular, the development, release, and timing of any features or functionality described for MongoDB
products remains at MongoDB’s sole discretion. This information is merely intended to outline our general
product direction and it should not be relied on in making a purchasing decision nor is this a commitment,
promise or legal obligation to deliver any material, code, or functionality. Except as required by law, we
undertake no obligation to update any forward-looking statements to reflect events or circumstances after the
date of such statements.
3. Quick level set — We’re not going to talk
about disaster recovery
18. #MDBLocal
What is it?
● Enhanced WiredTiger to take checkpoints of itself
● Checkpoints are then moved to long term storage
● HeadDBs are completely eliminated, so no more initial syncs!
○ Reduces storage requirements
○ Reduces architectural complexity
○ Reduces the infrastructure required
● Consolidated agents, from three agents to one
19. #MDBLocal
On-Premises Backup Architecture Today
[Diagram components: Ops Manager; OM Group; OM01 to OM06; six Backup Agent and Monitoring Agent pairs; BackupLoadBalancer; three App Servers for Backup; three Backup Daemons, each with a HeadDB; Oplog01 to Oplog06; Blockstore01 to Blockstore06.]
20. #MDBLocal
New Backup Architecture
[Diagram components: Ops Manager; OM Group; OM01 to OM06; six MongoDB Agents; BackupLoadBalancer; two App Server/Backup Daemon nodes; Oplog01 and Oplog02; Blockstore; S3 Snapshot Store.]
21. #MDBLocal
Backup Node Selection Order
1. Hidden Secondaries
2. A secondary we have already taken a snapshot from
3. Secondary closest to the time of the snapshot
4. Any available secondary
5. Primary
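The selection order above can be read as a ranking over the members of a replica set. The following is a minimal sketch of that idea, not the Ops Manager implementation: the `Node` fields, the `seconds_from_snapshot_time` measure, and the choice to treat rule 4 as the fallback for secondaries with no known timing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical description of one replica set member."""
    name: str
    is_primary: bool = False
    is_hidden: bool = False
    previously_snapshotted: bool = False
    # Assumed measure of how close this secondary is to the snapshot time.
    seconds_from_snapshot_time: Optional[float] = None
    available: bool = True


def selection_rank(node: Node) -> Tuple[int, float]:
    """Map a node to (rule number, tie-breaker); lower sorts first."""
    if node.is_primary:
        return (5, 0.0)          # rule 5: primary is the last resort
    if node.is_hidden:
        return (1, 0.0)          # rule 1: hidden secondaries
    if node.previously_snapshotted:
        return (2, 0.0)          # rule 2: secondary already used for a snapshot
    if node.seconds_from_snapshot_time is not None:
        # rule 3: the secondary closest to the snapshot time wins
        return (3, node.seconds_from_snapshot_time)
    return (4, 0.0)              # rule 4: any available secondary


def choose_backup_node(nodes: List[Node]) -> Optional[Node]:
    """Return the preferred available node, or None if nothing is available."""
    candidates = [n for n in nodes if n.available]
    return min(candidates, key=selection_rank) if candidates else None


members = [
    Node("primary", is_primary=True),
    Node("secondary-a", seconds_from_snapshot_time=4.0),
    Node("hidden", is_hidden=True),
]
chosen = choose_backup_node(members)
print(chosen.name if chosen else "no node available")
```

Encoding each rule as the first element of a tuple keeps the priority order in one place, and the second element only matters for breaking ties within rule 3.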
22. #MDBLocal
Phase 1 Details
• A backup agent must run on every node
• Ops Manager (OM) 4.2 is required to back up MongoDB 4.2
• Backwards compatible with 4.0 and below
• Daemons are used for background tasks
• Replica sets only
• Sharded clusters coming later this year
23. #MDBLocal
New Backup Architecture: Future
[Diagram components: Ops Manager; OM Group; OM01 to OM06; six MongoDB Agents; BackupLoadBalancer; two App Servers for Backup; Oplog01; S3 Oplog; Blockstore; S3 Snapshot Store.]
25. #MDBLocal
New Backup Architecture: Future
[Diagram components: Ops Manager; OM Group; OM01 to OM06; six MongoDB Agents; S3 Oplog; S3 Snapshot Store.]
26. #MDBLocal
Future Improvements
● Agents read/write directly to the oplog and snapshot stores
● Leave a checkpoint behind
● Incremental Checkpoints
● Data copy optimizations
30. #MDBLocal
Atlas Backups: Data Recovery
Roll back the clock when you
run into issues triggered by
user or application errors that
are replicated from the
primary to the rest of your
cluster.
31. #MDBLocal
Point-in-Time Data Recovery
• Lets you select a restore time based on your PIT window
• Restores the closest snapshot and rolls ahead
• Reduces the possibility of data loss
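"Restores the closest snapshot and rolls ahead" can be pictured as two steps: pick the latest snapshot taken at or before the requested time, then replay the operations recorded between that snapshot and the target. The sketch below illustrates only that planning logic under simplifying assumptions; the function name, the in-memory lists, and the `(timestamp, operation)` tuples are hypothetical stand-ins, not Atlas or Ops Manager data structures.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Tuple


def plan_point_in_time_restore(
    snapshot_times: List[datetime],
    oplog: List[Tuple[datetime, str]],
    target: datetime,
) -> Tuple[datetime, List[str]]:
    """Return (base snapshot time, operations to replay).

    snapshot_times must be sorted ascending; oplog holds
    (timestamp, operation) pairs in timestamp order.
    """
    # Index of the first snapshot strictly after the target.
    idx = bisect_right(snapshot_times, target)
    if idx == 0:
        raise ValueError("no snapshot at or before the requested time")
    base = snapshot_times[idx - 1]
    # Roll ahead: everything after the snapshot, up to and including the target.
    to_replay = [op for ts, op in oplog if base < ts <= target]
    return base, to_replay


snapshots = [
    datetime(2019, 6, 1, 0, 0),
    datetime(2019, 6, 1, 6, 0),
    datetime(2019, 6, 1, 12, 0),
]
operations = [
    (datetime(2019, 6, 1, 5, 0), "insert a"),
    (datetime(2019, 6, 1, 7, 0), "update b"),
    (datetime(2019, 6, 1, 9, 30), "delete c"),
    (datetime(2019, 6, 1, 11, 0), "insert d"),
]
base, replay = plan_point_in_time_restore(
    snapshots, operations, datetime(2019, 6, 1, 10, 0)
)
print(base, replay)
```

In this example the 06:00 snapshot is the base and the two operations recorded at 07:00 and 09:30 are replayed, so the operation at 11:00 is left out of the restored data.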
32. #MDBLocal
What About Small Disasters?
• The application is working fine
• But data is missing or has
been altered
• No time to do a full restore
33. #MDBLocal
Queryable Backups
• Ability to query your snapshots
and restore data at the
document level in minutes.
• Reduces the operational
overhead associated with:
• Identifying whether data of
interest has been altered
• Pinpointing the best point in
time to restore a database
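34. Sample Script
# Copy matching documents from a queryable backup into the live cluster.
# Assumption: both connection strings are placeholders. `source` should point
# at the queryable snapshot and `destination` at the cluster being repaired.
from pymongo import MongoClient
from pymongo.errors import PyMongoError

source = MongoClient("mongodb://localhost:27017")       # placeholder URI
destination = MongoClient("mongodb://localhost:27018")  # placeholder URI

db = source.locations
db2 = destination.locations
zips = db.zipcodes
zips2 = db2.zipcodes

def restore():
    print("Finding Missing Data")
    query = {"state": "IL"}
    try:
        # find() is lazy, so errors surface while iterating the cursor
        for doc in zips.find(query):
            zips2.insert_one(doc)
    except PyMongoError as e:
        print("Unexpected error:", type(e), e)

Script is available on my GitHub - https://github.com/bencefalo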
37. #MDBLocal
Cloud Provider Snapshots (CPS)
• Utilizes each provider's native snapshot
capabilities
• Granular backup region selection
• Satisfy data sovereignty requirements
• Supports replica sets and sharded clusters
• Pricing is based on snapshot size, not data size
• Less expensive, starting at $0.08 per GB of
snapshot size (varies per provider and region)
• Now available on all cloud providers!
• AWS and GCP snapshots are incremental
[Figure: Incremental snapshots for Atlas customers deploying on AWS. Three successive snapshots: 10 GB, 15 GB (5 new), 20 GB (5 new).]
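The 10 GB, 15 GB, 20 GB sequence in the figure shows why incremental snapshots take less space than full copies: only the new data is stored after the first snapshot. The sketch below works through that arithmetic. It assumes the data set only grows and that no existing blocks change, which is a simplification, and it uses the slide's "starting at" rate of $0.08 per GB purely as an illustrative input, since the actual rate varies by provider and region.

```python
from typing import List


def incremental_storage_gb(data_sizes_gb: List[float]) -> List[float]:
    """Return the GB stored by each snapshot when only new data is kept.

    Simplifying assumption: data only grows between snapshots, so the
    increment is the difference from the previous snapshot's data size.
    """
    stored = []
    previous = 0.0
    for size in data_sizes_gb:
        stored.append(max(size - previous, 0.0))
        previous = size
    return stored


sizes = [10, 15, 20]                       # data size at each snapshot, from the figure
increments = incremental_storage_gb(sizes)  # [10.0, 5.0, 5.0]
total_incremental = sum(increments)         # 20.0 GB stored incrementally
total_full_copies = sum(sizes)              # 45 GB if every snapshot were a full copy

RATE_PER_GB = 0.08  # illustrative starting rate from the slide; varies in practice
estimated_charge = round(total_incremental * RATE_PER_GB, 2)

print(increments, total_incremental, total_full_copies, estimated_charge)
```

Under these assumptions the three snapshots occupy 20 GB in total rather than the 45 GB that three full copies would need.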
40. #MDBLocal
Cloud Provider Snapshots (CPS) Updates
• Bring your own keys
• Backup Policies
• On-demand snapshots (quicksave)
• Takes a snapshot immediately if there’s not already one in progress
• API for pipeline integrations
• M2/M5 Backups
• NEW! - Point in Time Restore on AWS
• Restore to a point in time within the window, with one-minute granularity
• Configurable Point in Time Window (Continuous backups only have 24 hours)
• Available first on AWS; GCP and Azure are coming soon!
42. #MDBLocal
Cloud Provider Snapshots – Point in Time Restore
• Restore to a point in time within the window, with one-minute
granularity
• Customizable Point in Time Window (Continuous backups only had 24 hours)
• Available first on AWS; GCP and Azure are coming soon!
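A restore request with one-minute granularity and a configurable window implies two checks: truncate the requested time to the minute, and confirm it falls inside the window. The following is a minimal sketch of that validation only. The function name and the 48-hour window in the example are hypothetical, and this is not the Atlas API.

```python
from datetime import datetime, timedelta


def resolve_restore_time(
    requested: datetime, now: datetime, window: timedelta
) -> datetime:
    """Truncate the request to the minute and check it lies within the window."""
    target = requested.replace(second=0, microsecond=0)
    earliest = now - window
    if not (earliest <= target <= now):
        raise ValueError(
            f"restore time {target} is outside the window {earliest} to {now}"
        )
    return target


now = datetime(2019, 6, 10, 12, 0)
window = timedelta(hours=48)  # hypothetical configured window
target = resolve_restore_time(datetime(2019, 6, 9, 8, 15, 42), now, window)
print(target)
```

With a 24-hour window the same request would be rejected, which is the practical difference a configurable window makes.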
45. #MDBLocal
Cloud Provider Snapshots – Coming Soon
● Bring point in time restores to GCP and Azure clusters
● Build a new cluster from a backup
● Optimized path for queryable use cases
● Selective restore
● Snapshot Distribution
47. #MDBlocal
Modern Day Backup and Recovery from On-Premises to Public Cloud