Moving to a new home is daunting: packing up all your things, getting a vehicle to move it all, unpacking it, updating your mailing address, and making sure you did not leave anything behind. The move to MongoDB Atlas is similar, but all the logistics are already figured out for you by MongoDB.
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
1. Packing up your data and moving to Atlas (with a quick detour to the Grand Canyon along the way)
2. FACILITRON is a public spaces marketplace and provider of facility management and data solutions for public schools and colleges. Through a unique partnership strategy that provides software, setup, and support at no cost, Facilitron helps districts create, develop, and manage facility use programs that maximize utilization and cost recovery, enabling a completely new approach to data-driven facility management.
Dima Bronin
Steve Gaylord
Introduction
3. Overview
● Review Existing Home
● Criteria for Our New Home
● Planning The Voyage To Our Dream Home
● Discovery Tour
● We Made the Journey
● Did We Bring Everything We Needed? Did We Hit All The Sites Along The Way? What Is It Like Living In Our Dream Home?
● Forwarding Our Mail
● Expanding the Living Room
5. Current Home
● Application Server
○ Hosted at Heroku
■ Dynamic IP Environment
○ Node
○ Mongoose
● Database Server
○ Hosted at Compose
■ Web based UI for access
○ MongoDB 3.2
■ Multiple Databases for different parts of our application
6. Looking For Our New Home
● Performance / Scalability
● Reporting and transparency
● Fine grain control
● Support Staff / Knowledge / Expertise
● API access
7. Reporting & Transparency
Compose
● Impossible to quickly and easily see slow running ops or any other pertinent stats
● Near zero visibility into underlying hardware stats, like CPU usage, memory, disk, etc.
Atlas
● Can quickly see real time query execution times, slow running ops, hot collections, indexing issues, etc.
● Dashboard metrics exposing 32 stats, including 22 MongoDB metrics and 10 underlying hardware metrics
10. Performance & Scalability
Compose
● Black box hardware, making educated scaling extremely difficult
● Weird memory issues
Atlas
● Documented cluster tiers using all 3 major cloud providers
● Seamless scaling ability from the dashboard, including ability to scale by region and number of nodes
12. API Access
● Provides API access to global functionality of Atlas
● We set it up using AWS Lambda and CloudWatch Events (a sketch follows below)
● We use it primarily to:
○ Pause/Resume dev clusters ($$)
○ Scale clusters on demand ($$)
● It allows us to do much more in the future
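For the pause/resume and scaling automation, the Atlas Admin API is called from a scheduled Lambda. Below is a minimal sketch of what such a handler could look like, assuming the v1.0 API (HTTP digest auth with a public/private API key pair) and the urllib npm package; the environment variable names, project ID, and cluster name are placeholders, not our exact setup.

    // Hypothetical Lambda handler, triggered by a CloudWatch Events schedule,
    // that pauses a development cluster. PATCHing { paused: false } resumes it.
    const urllib = require('urllib');

    exports.handler = async () => {
      const { ATLAS_PUBLIC_KEY, ATLAS_PRIVATE_KEY, GROUP_ID, CLUSTER_NAME } = process.env;
      const url = `https://cloud.mongodb.com/api/atlas/v1.0/groups/${GROUP_ID}/clusters/${CLUSTER_NAME}`;

      // The Atlas Admin API authenticates with HTTP digest auth
      const { res, data } = await urllib.request(url, {
        method: 'PATCH',
        digestAuth: `${ATLAS_PUBLIC_KEY}:${ATLAS_PRIVATE_KEY}`,
        contentType: 'json',
        dataType: 'json',
        data: { paused: true },
      });

      console.log('Atlas responded', res.statusCode, data && data.stateName);
      return res.statusCode;
    };

A second scheduled rule sending paused: false (or a modified providerSettings.instanceSizeName, for on-demand scaling) covers the resume and scale-up cases.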
13. Fine Grain Control
● More access to “tune” our environment and scale only the necessary component
● Greater access to logs for debugging
● Logs not delayed by a day
● Difficult to sync application issues with MongoDB issues
● No web based complex query tool / just collection finds
14. Support Staff / Knowledge / Expertise
● Supported by the company that built MongoDB
● Access to the most knowledgeable individuals
● Highly focused on MongoDB
● Ability to raise and resolve bugs, if issues found
● No “Finger Pointing”
● Quick Access to New Versions
16. “Mapping” out the journey
● Understanding of the overall directions to get from point A to point B
○ Initial sync, then oplog tail
● Review information from others that have already taken the journey
How To: https://docs.mongodb.com/guides/cloud/migrate-from-compose/
● Understand the “vehicles” that we will be using to make the move and how they work (a sketch of the invocation follows below)
Mongomirror: https://docs.atlas.mongodb.com/reference/mongomirror/index.html
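For reference, the shape of a mongomirror invocation (per the docs linked above) looks roughly like the following; the replica set names, hosts, and credentials here are placeholders.

    mongomirror \
      --host "sourceReplSet/source-host.example.com:27017" \
      --ssl \
      --destination "atlas-shard-0/atlas-host-0.example.net:27017,atlas-host-1.example.net:27017" \
      --destinationUsername "migrationUser" \
      --destinationPassword "changeme"

It performs the initial sync and then tails the source oplog until you cut the application over.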
18. “Try” A Similar Home In That Town
● Started with our development database
○ Database Access by Project / Cluster
● Test Migration with production database
○ "error applying oplog entries during initial sync: renameCollection command encountered during initial sync. Please restart mongomirror."
○ “Clear any existing data on your target cluster?”
● Complete environment using the development / production migrated databases
20. Did We Bring Everything We Needed? Did We Hit All The Sites Along The Way? What Is It Like Living In Our Dream Home?
21. $lookup / $unwind document size
● Aggregations that worked on 3.2 began to produce errors on 3.6
● After the $lookup we had a $project to reduce document size
“Total size of documents in <collection name> matching pipeline <pipeline> exceeds maximum document size”
● SERVER-31755: Raise intermediate $lookup document size to 100MB, and make it configurable
● The workaround was to add a $unwind (and a $group if required); see the sketch below
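A rough mongo-shell sketch of the workaround, with placeholder collection and field names: a $unwind placed directly after the $lookup is coalesced with it, so the joined array never has to fit inside a single intermediate document, and a $group restores one result per parent where needed.

    // "orders", "lineItems" and the field names are hypothetical, not our schema
    db.orders.aggregate([
      { $lookup: { from: 'lineItems', localField: '_id', foreignField: 'orderId', as: 'items' } },
      // $unwind right after $lookup streams the joined documents one at a time
      { $unwind: '$items' },
      // Re-group to one document per order, keeping only the small fields we need
      { $group: { _id: '$_id', itemCount: { $sum: 1 }, total: { $sum: '$items.amount' } } }
    ]);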
22. retryWrites error on Aggregation with $out
● We were on Mongoose 5.2.10, which uses MongoDB driver 3.1.4
● “FailedToParse: unrecognized field 'retryWrites' error when running aggregations with $out.”
● Upgrading to Mongoose 5.2.16, which uses the 3.1.6 MongoDB driver, resolved the issue
● There were other changes in 5.2.16, so we had to run with retryWrites set to false (see the connection sketch below)
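A minimal sketch of running with retryWrites disabled from the application side; the SRV string and database name are placeholders.

    const mongoose = require('mongoose');

    // retryWrites is turned off on the connection string, as described above
    mongoose.connect(
      'mongodb+srv://appUser:secret@cluster0.example.mongodb.net/appdb?retryWrites=false',
      { useNewUrlParser: true }
    );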
23. Views and Match
● We had data in different collections that we wanted to combine, and planned to use a view so we would not have to pass the aggregation from the application (see the sketch below)
● Views run the defined aggregation when called
● A $match is needed to avoid a COLLSCAN, but this did not work with our use case
● The focus of views is to restrict the data seen by the user of the view
● We had to implement an application-level solution
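For illustration only (all names are placeholders), a view combining two collections is defined in the mongo shell like this; a later $match or find() filter on the view only avoids a COLLSCAN if it can be pushed down onto an indexed field of the underlying collection, which is where our use case broke down.

    // View that joins "orders" to "customers" on every read
    db.createView('ordersWithCustomers', 'orders', [
      { $lookup: { from: 'customers', localField: 'customerId', foreignField: '_id', as: 'customer' } },
      { $unwind: '$customer' }
    ]);

    // Re-runs the pipeline; only index-pushable predicates avoid a full scan
    db.ordersWithCustomers.find({ customerId: 42 });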
24. $lookup from collection: documents with and without the key
[Diagram: documents in Collection 1 (fields _id, Element-A, Element-B) may or may not carry a Collection2_id referencing documents in Collection 2 (fields _id, Element-1, Element-2, Element-3)]
“Collection2IdToMatch”: {"$ifNull": ["$Collection2_id", "NoMatch"]} (expanded into a full pipeline below)
25. Instance heaven
● How many CPUs does your instance need?
○ Long running aggregations can be CPU hogs
● Memory? How big is your working set?
○ As always, if your working set can fit into memory, you’ll have way better performance
○ Severe bottlenecks will happen once WiredTiger cache exceeds 95% of available memory
● SSD (Local NVMe SSD on M40+)?
○ What if your working set is too large to fit into memory?
○ Atlas cannot scale the drive in this case and forces you to move to a higher instance
○ Our internal benchmarks showed a 10% performance improvement right out of the box after warming up the cache (though IOPS are 4 orders of magnitude greater)
● Bandwidth throughput?
○ Sometimes the bottleneck is in the bandwidth
○ Different instances have different bandwidth throughputs
26. Bandwidth considerations
● Are you moving a ton of data between your web server and your database?
○ You never see transfer speeds anywhere near the advertised limits
● MongoDB offers two network compression variants: snappy and zlib
○ Don’t confuse these with the WiredTiger storage compression (snappy)
○ Network compression can be set on the connection string (see the connection sketch below)
■ “compressors=zlib,snappy” (indicates both are supported, with zlib preferred)
○ Zlib is only supported in the latest Node driver (3.1.11+) (bug in prior versions)
○ Snappy is faster
○ Zlib provides a better level of compression
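As a sketch (the SRV string is a placeholder), the compressor list is simply appended to the connection string and the driver negotiates the first listed algorithm the server also supports:

    const { MongoClient } = require('mongodb');

    // zlib listed first is preferred; snappy is the fallback
    const client = new MongoClient(
      'mongodb+srv://appUser:secret@cluster0.example.mongodb.net/appdb?compressors=zlib,snappy',
      { useNewUrlParser: true }
    );

    client.connect().then(() => console.log('connected with wire compression enabled'));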
27. Storage gotchas
● How much storage are you using and what do the storage numbers mean?
● This is tricky depending on:
○ Mongo storage engine (mmapv1 vs WiredTiger)
○ Compressed data (snappy by default in WiredTiger) vs uncompressed data
○ Accumulated cluster “garbage” over time
■ These are “bumper” files that serve as reserves when disk space is low (giving you time to upgrade)
○ Data on disk (compressed) vs system cache (compressed) vs WiredTiger internal cache (uncompressed)
■ WiredTiger will not release free space when docs are removed from a collection (it retains this space for future use)
○ Disk Space Used (compressed) vs DB Storage (mostly uncompressed)
28. Performance advisor
● Are you missing indexes and doing collection scans?
● Do you need compound indexes?
● Be mindful of the performance advisor’s reporting delay
● Need insight into your aggregations?
29. MongoDB Compass
● Compass vs Compass Community Edition
○ May be included in your existing services
● Issue with Sub-documents with an _id field
[Document shape from the slide: { _id: …, Status: …, Details: { _id: … } }]
31. Receiving Our Critical Mail
● Automatic Proactive Outreach Emails
○ Email - to any standard email client
○ SMS - Text message to alert your phone
○ HipChat / Slack / Flowdock - chat notifications pushed via the MongoDB Atlas API directly to a specified channel
○ PagerDuty - integration with on-call schedule and alerting
○ Webhooks - Sends an HTTP POST request to an endpoint for programmatic processing
33. Oh for the love of reporting ...
● Considerations:
○ Performance
○ Data Sources
○ Ease of Use
○ Cost
● Tech:
○ MongoDB Aggregation Framework
○ MongoDB BI connector
○ Amazon Redshift via Panoply
○ Stitchdata
34. Reporting: Performance
● Time sensitive real time reporting?
○ Is your user waiting for the report to finish in real-time?
● Fresh data a must?
○ Can your data be stale?
● Panoply refresh
○ How big is your data and how often will you ETL?
● Amazon Redshift cache
○ Are your queries repeating enough to utilize caching?
○ Are you index heavy?
● Scaling and Data Size
○ How often are you generating reports?
○ How big is your data set?
35. Reporting: Data Sources
● Where does your data reside?
○ Is most of your data in MongoDB?
○ How easy is it to transform your data?
● How do you combine your data?
○ How do you associate data between sources?
○ How useful is it?
36. Reporting: Ease of Use
● Engineering
○ Configuration and on-going maintenance
○ NoSQL to SQL
○ Data ETL
○ Existing query code base
● End User
○ BI connector
○ Tools like Power BI, Tableau, etc.
○ In-house report generation
37. Reporting: Cost
● Atlas BI Connector
● Amazon Redshift via Panoply
○ Data size considerations
● ETL Tools
○ Stitch Data (severe MongoDB version limitations)
○ Alooma, etc. (can be very costly)
● Reporting Tools