Packing up your data and
moving to Atlas
(with a quick detour to the Grand Canyon along the way)
FACILITRON is a public spaces marketplace and provider of facility management and data solutions
for public schools and colleges. Through a unique partnership strategy which provides software,
setup and support at no cost, Facilitron helps districts create, develop and manage facility use
programs that maximize utilization and recovery costs and enables a completely new approach to
data-driven facility management.
Dima Bronin
Steve Gaylord
Introduction
● We Made the Journey
● Did We Bring Everything We Needed? Did We Hit
All The Sites Along The Way? What Is It Like Living
In Our Dream Home?
● Forwarding Our Mail
● Expanding the Living Room
Overview
● Review Existing Home
● Criteria for Our New Home
● Planning The Voyage To Our Dream Home
● Discovery Tour
What Does Our Current
Home Look Like?
Current Home
● Application Server
○ Hosted at Heroku
■ Dynamic IP Environment
○ Node
○ Mongoose
● Database Server
○ Hosted at Compose
■ Web based UI for access
○ MongoDB 3.2
■ Multiple Databases for different parts of our application
Looking For Our New Home
Performance / Scalability
Reporting and transparency
Fine grain control
Support Staff / Knowledge / Expertise
API access
Reporting & Transparency
Compose
● Impossible to quickly and easily see
slow running ops or any other
pertinent stats
● Near zero visibility into underlying
hardware stats, like CPU usage,
memory, disk, etc.
Atlas
● Can quickly see real time query
execution times, slow running ops, hot
collections, indexing issues, etc.
● Dashboard metrics exposing 32 stats,
including 22 MongoDB metrics and 10
underlying hardware metrics
Performance & Scalability
Compose
● Black box hardware, making
educated scaling extremely
difficult
● Weird memory issues
Atlas
● Documented cluster tiers using
all 3 major cloud providers
● Seamless scaling ability from the
dashboard, including ability to
scale by region and number of
nodes
API Access
● Provides API access to global functionality of Atlas
● We set it up using AWS Lambda and CloudWatch Events
● We use it primarily to:
○ Pause/Resume dev clusters ($$)
○ Scale clusters on demand ($$)
● It allows us to do much more in the future
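A minimal sketch of the pause/resume call, assuming the Atlas Admin API v1.0 "modify cluster" endpoint (a PATCH on /groups/{GROUP-ID}/clusters/{CLUSTER-NAME} with a `paused` flag) and an API key sent with HTTP digest authentication. The function name, group ID and cluster name are hypothetical, and the code only builds the request; sending it from a Lambda handler triggered by a CloudWatch Events schedule is left out.

```javascript
// Base URL of the Atlas Admin API v1.0 (assumed; check the current API docs).
const ATLAS_BASE = "https://cloud.mongodb.com/api/atlas/v1.0";

// Build (but do not send) the request that pauses or resumes a cluster.
// groupId and clusterName are hypothetical placeholders supplied by the caller.
function buildPauseRequest(groupId, clusterName, paused) {
  const group = encodeURIComponent(groupId);
  const cluster = encodeURIComponent(clusterName);
  return {
    method: "PATCH",
    url: `${ATLAS_BASE}/groups/${group}/clusters/${cluster}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ paused: Boolean(paused) }),
  };
}

// Example: the request a nightly job might issue to pause a dev cluster.
const pauseDev = buildPauseRequest("exampleGroupId", "dev-cluster", true);
console.log(pauseDev.method, pauseDev.url, pauseDev.body);
```

Passing `false` as the third argument builds the matching resume request, so one scheduled function can cover both directions.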
Fine Grain Control
● More access to “tune” our
environment and scale only the
necessary component
● Greater access to logs for debugging
● Logs not delayed by a day
● Difficult to sync application issues with
MongoDB issues
● No web based complex query tool / just
collection finds
Support Staff / Knowledge / Expertise
● Supported by the company
that built MongoDB
● Access to the most
knowledgeable individuals
● Highly focused on
MongoDB
● Ability to raise and resolve
bugs, if issues found
● No “Finger Pointing”
● Quick Access to New
Versions
Planning The Voyage To
Our Dream Home
“Mapping” out the journey
● Understanding of the overall directions to
get from point A to point B.
○ Initial Sync then oplog tail
● Review information from
others that had already
taken the journey.
How To : https://docs.mongodb.com/guides/cloud/migrate-from-compose/
● Understand the “vehicles” that
we will be using to make the
move and how they work
Mongomirror :
https://docs.atlas.mongodb.com/reference/mongomirror/index.html
Discovery Tour
“Try” A Similar Home In That Town
● Started with our development database
○ Database Access by Project / Cluster
● Test Migration with production
database
○ "error applying oplog
entries during initial sync:
renameCollection
command encountered
during initial sync. Please
restart mongomirror."
○ “Clear any existing data on
your target cluster?”
● Complete environment using
the development / production
migrated databases
We Made
The Journey
Did We Bring Everything We
Needed? Did We Hit All The
Sites Along The Way? What
Is It Like Living In Our
Dream Home?
$lookup / $unwind document size
● Aggregations on 3.2 that were working began to produce errors on 3.6
● After $lookup had $project to reduce document size
“Total size of documents in <collection name> matching pipeline <pipeline>
exceeds maximum document size”
● SERVER-31755: Raise intermediate $lookup document size to 100MB, and
make it configurable
● Workaround was to add a $unwind and $group if required
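A minimal sketch of the workaround above, assuming the usual pattern of placing $unwind directly after $lookup (which lets the server coalesce the two stages rather than build one large joined array) and regrouping only when needed. The helper name, the `keep` parameter and the collection and field names are hypothetical, not taken from the talk.

```javascript
// Build a pipeline that joins `from`, immediately unwinds the joined array,
// and optionally regroups a single joined field per original document.
// All collection and field names passed in are hypothetical examples.
function buildLookupPipeline({ from, localField, foreignField, as, keep }) {
  const pipeline = [
    { $lookup: { from, localField, foreignField, as } },
    // $unwind on the `as` field right after $lookup can be coalesced with it,
    // so the full joined array is not materialized in one document.
    { $unwind: `$${as}` },
  ];
  if (keep) {
    // Regroup per original _id, collecting only the one field we still need.
    pipeline.push({
      $group: { _id: "$_id", [keep]: { $push: `$${as}.${keep}` } },
    });
  }
  return pipeline;
}

const pipeline = buildLookupPipeline({
  from: "reservations",
  localField: "_id",
  foreignField: "facilityId",
  as: "reservation",
  keep: "status",
});
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed unchanged to `aggregate()` in Mongoose or the Node driver.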
retryWrites error on Aggregation with $out
● Mongoose 5.2.10, which uses the MongoDB driver 3.1.4
● “FailedToParse: unrecognized field 'retryWrites' error when running
aggregations with $out.”
● Upgrading from Mongoose 5.2.10 (3.1.4 driver) to Mongoose
5.2.16, which uses the 3.1.6 MongoDB driver, resolved the issue.
● There were other changes in 5.2.16, so we had to run with retryWrites set
to false
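A minimal sketch of switching retryable writes off through the connection string, assuming the standard `retryWrites` URI option. The helper name and the example host are hypothetical.

```javascript
// Return a copy of a MongoDB connection string with retryWrites forced to false.
// Only the query-string portion is changed; the rest of the URI is untouched.
function withRetryWritesDisabled(uri) {
  const i = uri.indexOf("?");
  const base = i === -1 ? uri : uri.slice(0, i);
  const query = i === -1 ? "" : uri.slice(i + 1);
  const params = query
    .split("&")
    // Drop empty fragments and any existing retryWrites setting.
    .filter((p) => p !== "" && !/^retryWrites=/i.test(p));
  params.push("retryWrites=false");
  return `${base}?${params.join("&")}`;
}

// Example host and database name are placeholders.
console.log(
  withRetryWritesDisabled(
    "mongodb+srv://user:pass@cluster0.example.mongodb.net/app?retryWrites=true&w=majority"
  )
);
```

The same setting can instead be passed as `{ retryWrites: false }` in the options object given to `mongoose.connect`.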
Views and Match
● We had data in different collections that we wanted to combine, and planned
to use a view so the application would not have to pass the aggregation
● A view runs its defined aggregation each time it is queried
● $match is needed to avoid a COLLSCAN, but this did not work with our use
case
● The focus of views is to restrict the data seen by the user of the view
● We had to implement an application-level solution
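One way an application-level alternative to a view could look, sketched under assumptions: the application builds the combining pipeline itself and always puts the caller's $match first, so the filter can use an index before any join runs. This is not necessarily what the team built; the function name, the collection and the field names are hypothetical.

```javascript
// Build the "combined" pipeline in the application instead of a view, so the
// caller's filter is always the first stage and can be served by an index.
function buildCombinedPipeline(filter, join) {
  if (!filter || Object.keys(filter).length === 0) {
    // Refuse an empty filter: without it the pipeline would scan the collection.
    throw new Error("a non-empty $match filter is required");
  }
  return [{ $match: filter }, { $lookup: join }];
}

// Hypothetical usage: filter on an indexed field, then join a second collection.
const combined = buildCombinedPipeline(
  { ownerId: "owner-1" },
  { from: "details", localField: "_id", foreignField: "parentId", as: "details" }
);
console.log(JSON.stringify(combined));
```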
$lookup from collection: documents with and without the key
● Collection 1 (some documents carry the key, some do not)
○ With the key: _id, Collection2_id, Element-A, Element-B
○ Without the key: _id, Element-A, Element-B
● Collection 2
○ _id, Element-1, Element-2, Element-3
● Field computed for the match:
“Collection2IdToMatch”: {"$ifNull": ["$Collection2_id", "NoMatch"]},
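A minimal sketch of how the $ifNull expression above might sit in a pipeline. The stage type ($addFields), the collection name `collection2` and the output field `collection2Docs` are assumptions; only the `Collection2IdToMatch` expression comes from the slide.

```javascript
// Stage 1 copies Collection2_id into Collection2IdToMatch, substituting the
// sentinel "NoMatch" when the key is missing or null. Stage 2 joins on it.
const lookupWithSentinel = [
  {
    $addFields: {
      Collection2IdToMatch: { $ifNull: ["$Collection2_id", "NoMatch"] },
    },
  },
  {
    $lookup: {
      from: "collection2", // assumed collection name
      localField: "Collection2IdToMatch",
      foreignField: "_id",
      as: "collection2Docs", // assumed output field
    },
  },
];

// Plain-JavaScript equivalent of the $ifNull expression, for illustration.
function ifNull(value, fallback) {
  return value === null || value === undefined ? fallback : value;
}

console.log(ifNull(undefined, "NoMatch"), ifNull("abc", "NoMatch"));
```

Documents without the key then carry the sentinel value, which is not expected to equal any `_id` in the second collection.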
Instance heaven
● How many CPUs does your instance need?
○ Long running aggregations can be CPU hogs
● Memory? How big is your working set?
○ As always, if your working set can fit into memory, you’ll have way better performance
○ Severe bottlenecks will happen once WiredTiger cache exceeds 95% of available memory
● SSD (Local NVMe SSD on M40+)?
○ What if your working set is too large to fit into memory?
○ Atlas cannot scale the drive in this case and forces you to move to a higher instance
○ Our internal benchmarks showed a 10% performance improvement right out of the box
after warming up the cache (though IOPS are 4 orders of magnitude greater)
● Bandwidth throughput?
○ Sometimes the bottleneck is in the bandwidth
○ Different instances have different bandwidth throughputs
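A rough sizing sketch for the memory question above, assuming the default WiredTiger internal cache of a self-managed mongod (the larger of 50% of RAM minus 1 GB, or 256 MB). Atlas tiers may configure a different value, the filesystem cache is ignored here, and the function names are hypothetical.

```javascript
// Default WiredTiger internal cache size for a self-managed mongod:
// the larger of 50% of (RAM - 1 GB) and 256 MB. Managed tiers may differ.
function defaultCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

// Rough check: does an estimated working set fit in that cache?
function workingSetFits(workingSetGB, ramGB) {
  return workingSetGB <= defaultCacheGB(ramGB);
}

// Example: a 16 GB instance compared against two working-set estimates.
console.log(defaultCacheGB(16), workingSetFits(6, 16), workingSetFits(10, 16));
```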
Bandwidth considerations
● Are you moving a ton of data between your web server and your
database?
○ Never see transfer speed anywhere near the advertised limits
● MongoDB offers two network compression variants: snappy and zlib
○ Don’t confuse these with the WiredTiger storage compression (snappy)
○ Network compression can be set on the connection string
■ “compressors=zlib,snappy” (indicates both are supported, with zlib preferred)
○ zlib is only supported in the latest Node driver (3.1.11+) (bug in prior versions)
○ Snappy is faster
○ Zlib provides a better level of compression
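A simplified model of how the ordered `compressors` list is used, assuming the first compressor in the client's list that the server also supports is the one chosen. The function names are hypothetical and no driver code is involved.

```javascript
// Simplified model of compressor negotiation: the client offers an ordered
// list and the first entry the server also supports is used.
function pickCompressor(clientPreference, serverSupported) {
  const supported = new Set(serverSupported);
  const chosen = clientPreference.find((name) => supported.has(name));
  return chosen === undefined ? null : chosen;
}

// Render the preference list as the connection-string option from the slide.
function compressorsOption(clientPreference) {
  return `compressors=${clientPreference.join(",")}`;
}

console.log(compressorsOption(["zlib", "snappy"]));
console.log(pickCompressor(["zlib", "snappy"], ["snappy", "zlib"]));
console.log(pickCompressor(["zlib", "snappy"], ["snappy"]));
```

Listing zlib first favors the better compression ratio; putting snappy first would favor speed.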
Storage gotchas
● How much storage are you using and what do the storage numbers
mean?
● This is tricky depending on:
○ MongoDB storage engine (MMAPv1 vs WiredTiger)
○ Compressed data (snappy by default in WiredTiger) vs uncompressed data
○ Accumulated cluster “garbage” over time
■ These are “bumper” files that serve as reserves when disk space is low (giving you
time to upgrade)
○ Data on disk (compressed) vs system cache (compressed) vs WiredTiger internal cache
(uncompressed)
■ WiredTiger will not release free space when docs are removed from a collection
(retains this space for future use)
○ Disk Space Used (compressed) vs DB Storage (mostly uncompressed)
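A minimal sketch for comparing the compressed and uncompressed numbers, assuming a `dbStats`-style document as returned by `db.stats()`, where `dataSize` is the uncompressed data size and `storageSize` is the space the collections occupy on disk. The function name and the sample values are invented for illustration.

```javascript
// Summarize a dbStats-style document (as returned by db.stats()).
// dataSize is uncompressed; storageSize is what the collections occupy on disk.
function summarizeStorage(stats) {
  const { dataSize, storageSize, indexSize } = stats;
  return {
    // Ratio above 1 means the data takes less room on disk than uncompressed.
    compressionRatio: storageSize > 0 ? dataSize / storageSize : null,
    onDiskBytes: storageSize + indexSize,
  };
}

// Sample numbers are invented for illustration only.
const sample = { dataSize: 8000, storageSize: 2000, indexSize: 500 };
console.log(summarizeStorage(sample));
```

Because WiredTiger keeps freed space for reuse, `storageSize` can stay high after deletes, which pushes the ratio down without any change in the data itself.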
Performance advisor
● Are you missing indexes and doing collection scans?
● Do you need compound indexes?
● Be mindful of the performance advisor’s reporting delay
● Need insight into your aggregations?
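Given the advisor's reporting delay, a query plan can also be checked directly. This sketch assumes the classic `explain()` output shape, where `queryPlanner.winningPlan` is a tree of stages linked by `inputStage` or `inputStages`; sharded and newer plan formats differ. The function name and the sample plans are hypothetical.

```javascript
// Recursively check whether an explain() winning plan contains a COLLSCAN stage.
function hasCollScan(plan) {
  if (!plan || typeof plan !== "object") return false;
  if (plan.stage === "COLLSCAN") return true;
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  return children.some(hasCollScan);
}

// Trimmed, hand-written plan shapes for illustration (not real server output).
const indexed = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "status_1" },
};
const scanned = { stage: "SORT", inputStage: { stage: "COLLSCAN" } };

console.log(hasCollScan(indexed), hasCollScan(scanned));
```

In practice the argument would be the `queryPlanner.winningPlan` object from an explain result.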
MongoDB Compass
● Compass vs Compass
Community Edition
○ May be included in
your existing services
_id:
Status:
Details: {
_id:
...
}
● Issue with Sub-documents with an _id field
Forwarding Our Mail
Receiving Our Critical Mail
● Automatic Proactive Outreach
Emails
○ Email - to any standard email client
○ SMS - Text message to alert your phone
○ HipChat / Slack / Flowdock - chat
notifications pushed via the MongoDB
Atlas API directly to a specified channel
○ PagerDuty - integration with on-call
schedule and alerting
○ Webhooks - Sends an HTTP POST request
to an endpoint for programmatic
processing
Expanding the
Living Room
Oh for the love of reporting ...
● Considerations:
○ Performance
○ Data Sources
○ Ease of Use
○ Cost
● Tech:
○ MongoDB Aggregation Framework
○ MongoDB BI connector
○ Amazon Redshift via Panoply
○ Stitch Data
Reporting: Performance
● Time sensitive real time reporting?
○ Is your user waiting for the report to finish in real-time?
● Fresh data a must?
○ Can your data be stale?
● Panoply refresh
○ How big is your data and how often will you ETL?
● Amazon Redshift cache
○ Are your queries repeating enough to utilize caching?
○ Are you index heavy?
● Scaling and Data Size
○ How often are you generating reports?
○ How big is your data set?
Reporting: Data Sources
● Where does your data reside?
○ Is most of your data in MongoDB?
○ How easy is it to transform your data?
● How do you combine your data?
○ How do you associate data between sources?
○ How useful is it?
Reporting: Ease of Use
● Engineering
○ Configuration and on-going maintenance
○ NoSQL to SQL
○ Data ETL
○ Existing query code base
● End User
○ BI connector
○ Tools like Power BI, Tableau, etc.
○ In-house report generation
Reporting: Cost
● Atlas BI Connector
● Amazon Redshift via Panoply
○ Data size considerations
● ETL Tools
○ Stitch Data (severe MongoDB version limitations)
○ Alooma, etc. (can be very costly)
● Reporting Tools
Questions?

MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀DianaGray10
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Thierry Lestable
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupCatarinaPereira64715
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlPeter Udo Diehl
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxDavid Michel
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...Elena Simperl
 

Recently uploaded (20)

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas

  • 1. Packing up your data and moving to Atlas (with a quick detour to the Grand Canyon along the way)
  • 2. Introduction
      FACILITRON is a public spaces marketplace and provider of facility management and data solutions for public schools and colleges. Through a unique partnership strategy which provides software, setup and support at no cost, Facilitron helps districts create, develop and manage facility use programs that maximize utilization and recovery costs and enables a completely new approach to data-driven facility management.
      Dima Bronin
      Steve Gaylord
  • 3. Overview
      ● Review Existing Home
      ● Criteria for Our New Home
      ● Planning The Voyage To Our Dream Home
      ● Discovery Tour
      ● We Made the Journey
      ● Did We Bring Everything We Needed? Did We Hit All The Sites Along The Way? What Is It Like Living In Our Dream Home?
      ● Forwarding Our Mail
      ● Expanding the Living Room
  • 4. What Does Our Current Home Look Like?
  • 5. Current Home
      ● Application Server
          ○ Hosted at Heroku
              ■ Dynamic IP Environment
          ○ Node
          ○ Mongoose
      ● Database Server
          ○ Hosted at Compose
              ■ Web based UI for access
          ○ MongoDB 3.2
              ■ Multiple Databases for different parts of our application
  • 6. Looking For Our New Home
      ● Performance / Scalability
      ● Reporting and transparency
      ● Fine grain control
      ● Support Staff / Knowledge / Expertise
      ● API access
  • 7. Reporting & Transparency
      Compose
      ● Impossible to quickly and easily see slow running ops or any other pertinent stats
      ● Near zero visibility into underlying hardware stats, like CPU usage, memory, disk, etc.
      Atlas
      ● Can quickly see real time query execution times, slow running ops, hot collections, indexing issues, etc.
      ● Dashboard metrics exposing 32 stats, including 22 MongoDB metrics and 10 underlying hardware metrics
  • 10. Performance & Scalability
      Compose
      ● Black box hardware, making educated scaling extremely difficult
      ● Weird memory issues
      Atlas
      ● Documented cluster tiers using all 3 major cloud providers
      ● Seamless scaling ability from the dashboard, including ability to scale by region and number of nodes
  • 12. API Access
      ● Provides API access to global functionality of Atlas
      ● We set it up using AWS Lambda and CloudWatch Events
      ● We use it primarily to:
          ○ Pause/Resume dev clusters ($$)
          ○ Scale clusters on demand ($$)
      ● It allows us to do much more in the future
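A minimal sketch of the pause/resume and scale calls mentioned on this slide, assuming the Atlas Administration API v1.0 cluster endpoint (`PATCH /groups/{GROUP-ID}/clusters/{CLUSTER-NAME}`). The helper names, project ID and cluster name are hypothetical. The code only builds the request; sending it with HTTP Digest authentication from a Lambda function on a CloudWatch Events schedule is left out. The body fields should be checked against the current API reference before use.

```javascript
// Base URL of the Atlas Administration API (v1.0 is an assumption here).
const ATLAS_BASE = "https://cloud.mongodb.com/api/atlas/v1.0";

// Build (but do not send) a PATCH request that modifies one cluster.
// groupId and clusterName are placeholders for your own project and cluster.
function buildClusterPatch(groupId, clusterName, changes) {
  const group = encodeURIComponent(groupId);
  const cluster = encodeURIComponent(clusterName);
  return {
    method: "PATCH",
    url: `${ATLAS_BASE}/groups/${group}/clusters/${cluster}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(changes),
  };
}

// Pause or resume a cluster by toggling its "paused" flag.
function pauseCluster(groupId, clusterName) {
  return buildClusterPatch(groupId, clusterName, { paused: true });
}

function resumeCluster(groupId, clusterName) {
  return buildClusterPatch(groupId, clusterName, { paused: false });
}

// Scale a cluster by changing its instance size, for example "M30".
// Including providerName alongside instanceSizeName is an assumption.
function scaleCluster(groupId, clusterName, providerName, instanceSizeName) {
  return buildClusterPatch(groupId, clusterName, {
    providerSettings: { providerName, instanceSizeName },
  });
}
```

A scheduled job could call `pauseCluster` for dev clusters in the evening and `resumeCluster` in the morning, which is where the cost savings on the slide would come from.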
  • 13. Fine Grain Control
      ● More access to “tune” in our environment and scale only the necessary component
      ● Greater access to logs for debugging
      ● Logs not delayed by a day
      ● Difficult to sync application issues with MongoDB issues
      ● No web based complex query tool / just collection finds
  • 14. Support Staff / Knowledge / Expertise
      ● Supported by the company that built MongoDB
      ● Access to the most knowledgeable individuals
      ● Highly focused on MongoDB
      ● Ability to raise and resolve bugs, if issues found
      ● No “Finger Pointing”
      ● Quick Access to New Versions
  • 15. Planning The Voyage To Our Dream Home
  • 16. “Mapping” out the journey
      ● Understand the overall directions to get from point A to point B.
          ○ Initial Sync then oplog tail
      ● Review information from others who have already taken the journey.
          How To: https://docs.mongodb.com/guides/cloud/migrate-from-compose/
      ● Understand the “vehicles” that we will be using to make the move and how they work
          Mongomirror: https://docs.atlas.mongodb.com/reference/mongomirror/index.html
  • 18. “Try” A Similar Home In That Town
      ● Started with our development database
          ○ Database Access by Project / Cluster
      ● Test Migration with production database
          ○ "error applying oplog entries during initial sync: renameCollection command encountered during initial sync. Please restart mongomirror."
          ○ “Clear any existing data on your target cluster?”
      ● Complete environment using the development / production migrated databases
  • 20. Did We Bring Everything We Needed? Did We Hit All The Sites Along The Way? What Is It Like Living In Our Dream Home?
  • 21. $lookup / $unwind document size
      ● Aggregations on 3.2 that were working began to produce errors on 3.6
      ● After $lookup had $project to reduce document size
          “Total size of documents in <collection name> matching pipeline <pipeline> exceeds maximum document size”
      ● SERVER-31755: Raise intermediate $lookup document size to 100MB, and make it configurable
      ● Workaround was to add a $unwind and $group if required
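The workaround above can be sketched as a pipeline builder. This is an illustration under assumptions, not the speakers' actual pipeline: the function name and the `keepField` parameter are hypothetical. The idea is that an `$unwind` placed directly after `$lookup` on the joined field lets the server emit one document per match instead of building a single document holding the whole joined array, and an optional `$group` then reassembles only the field that is actually needed.

```javascript
// Build a $lookup pipeline that unwinds the joined array immediately.
// All parameter values are supplied by the caller; keepField is optional.
function lookupThenUnwind({ from, localField, foreignField, as, keepField }) {
  const pipeline = [
    { $lookup: { from, localField, foreignField, as } },
    // One output document per joined document, rather than one large array.
    { $unwind: `$${as}` },
  ];
  if (keepField) {
    // Regroup by the source _id, collecting only one small field from each
    // joined document. Note: this $group drops the other source fields.
    pipeline.push({
      $group: {
        _id: "$_id",
        [as]: { $push: `$${as}.${keepField}` },
      },
    });
  }
  return pipeline;
}
```

The returned array can be passed to `Model.aggregate(...)` in Mongoose or `collection.aggregate(...)` in the Node driver.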
  • 22. retryWrites error on Aggregation with $out
      ● We were on Mongoose 5.2.10, which uses the MongoDB driver 3.1.4
      ● “FailedToParse: unrecognized field 'retryWrites' error when running aggregations with $out.”
      ● Upgrading to Mongoose 5.2.16, which uses the 3.1.6 MongoDB driver, resolved the issue.
      ● There were other changes in 5.2.16, so we had to run with retryWrites set to false
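A small sketch of running with retryWrites set to false, assuming the setting is carried in the connection string (Atlas-generated connection strings typically include `retryWrites=true`). The helper name and the example host are hypothetical.

```javascript
// Return a copy of a MongoDB connection string with retryWrites=false.
// Any existing retryWrites option is removed; other options are kept as-is.
function withRetryWritesDisabled(uri) {
  const [base, query = ""] = uri.split("?");
  const kept = query
    .split("&")
    .filter((pair) => pair !== "" && pair.split("=")[0] !== "retryWrites");
  kept.push("retryWrites=false");
  return `${base}?${kept.join("&")}`;
}
```

The result would then be handed to `mongoose.connect(...)` in place of the original string.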
  • 23. Views and Match
      ● We had data in different collections that we wanted to combine, and planned to use a view so that the application would not have to pass the aggregation
      ● Views run their defined aggregation when called
      ● $match is needed to avoid COLLSCAN, but this did not work with our use case
      ● Focus of views is to restrict the data seen by the user of the view
      ● Had to implement an application level solution
  • 24. $lookup from collection: documents with and without the key
      Collection 2: _id, Element-1, Element-2, Element-3
      Collection 1 (without the key): _id, Element-A, Element-B
      Collection 1 (with the key): _id, Collection2_id, Element-A, Element-B
      “Collection2IdToMatch”: {"$ifNull": ["$Collection2_id", "NoMatch"]},
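One way the `$ifNull` expression on this slide might be used, assuming it feeds the `localField` of a `$lookup`. Documents in Collection 1 that have no `Collection2_id` receive the sentinel value "NoMatch", so they join against nothing rather than being matched on a null key. The function name, the collection name `collection2` and the output field `collection2Docs` are hypothetical.

```javascript
// Build a pipeline that joins Collection 1 documents to Collection 2,
// tolerating Collection 1 documents that lack the Collection2_id key.
function lookupWithFallbackKey() {
  return [
    {
      // Missing or null Collection2_id becomes the literal string "NoMatch".
      $addFields: {
        Collection2IdToMatch: { $ifNull: ["$Collection2_id", "NoMatch"] },
      },
    },
    {
      $lookup: {
        from: "collection2", // hypothetical collection name
        localField: "Collection2IdToMatch",
        foreignField: "_id",
        as: "collection2Docs", // hypothetical output field
      },
    },
  ];
}
```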
  • 25. Instance heaven
      ● How many CPUs does your instance need?
          ○ Long running aggregations can be CPU hogs
      ● Memory? How big is your working set?
          ○ As always, if your working set can fit into memory, you’ll have way better performance
          ○ Severe bottlenecks will happen once WiredTiger cache exceeds 95% of available memory
      ● SSD (Local NVMe SSD on M40+)?
          ○ What if your working set is too large to fit into memory?
          ○ Atlas cannot scale the drive in this case and forces you to move to a higher instance
          ○ Our internal benchmarks showed a 10% performance improvement right out of the box after warming up the cache (though IOPS are 4 orders of magnitude greater)
      ● Bandwidth throughput?
          ○ Sometimes the bottleneck is in the bandwidth
          ○ Different instances have different bandwidth throughputs
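A rough sketch of checking cache pressure from a `serverStatus` result, using the `wiredTiger.cache` counters "bytes currently in the cache" and "maximum bytes configured". Comparing against the configured cache maximum is an assumption about what the slide's 95% figure refers to, and the function names are hypothetical.

```javascript
// Threshold taken from the slide; treat it as a starting point, not a rule.
const CACHE_WARNING_RATIO = 0.95;

// Fraction of the configured WiredTiger cache that is currently in use.
// serverStatus is the document returned by the serverStatus command.
function wiredTigerCacheFill(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  return used / max;
}

// True when the cache fill ratio is above the warning threshold.
function isCacheUnderPressure(serverStatus, threshold = CACHE_WARNING_RATIO) {
  return wiredTigerCacheFill(serverStatus) > threshold;
}
```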
  • 26. Bandwidth considerations
      ● Are you moving a ton of data between your web server and your database?
          ○ Never see transfer speeds anywhere near the advertised limits
      ● MongoDB offers two network compression variants: snappy and zlib
          ○ Don’t confuse these with the WiredTiger storage compression (snappy)
          ○ Network compression can be set on the connection string
              ■ “compressors=zlib,snappy” (indicates both are supported, with zlib preferred)
          ○ Zlib is only supported in the latest node driver (3.1.11+) (bug in prior versions)
          ○ Snappy is faster
          ○ Zlib provides a better level of compression
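A minimal sketch of setting the `compressors` option on a connection string as described above. The helper name and example host are hypothetical; the order of the list expresses preference, so the default here puts zlib first as on the slide.

```javascript
// Return a copy of a MongoDB connection string with network compression set.
// Any existing compressors option is replaced; other options are kept as-is.
function withNetworkCompression(uri, compressors = ["zlib", "snappy"]) {
  const [base, query = ""] = uri.split("?");
  const kept = query
    .split("&")
    .filter((pair) => pair !== "" && pair.split("=")[0] !== "compressors");
  kept.push(`compressors=${compressors.join(",")}`);
  return `${base}?${kept.join("&")}`;
}
```

Passing `["snappy"]` instead would trade some compression ratio for speed, in line with the last two bullets.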
  • 27. Storage gotchas
      ● How much storage are you using and what do the storage numbers mean?
      ● This is tricky depending on:
          ○ Mongo storage engine (mmapv1 vs WiredTiger)
          ○ Compressed data (snappy by default in WiredTiger) vs uncompressed data
          ○ Accumulated cluster “garbage” over time
              ■ These are “bumper” files that serve as reserves when disk space is low (giving you time to upgrade)
          ○ Data on disk (compressed) vs system cache (compressed) vs WiredTiger internal cache (uncompressed)
              ■ WiredTiger will not release free space when docs are removed from a collection (retains this space for future use)
          ○ Disk Space Used (compressed) vs DB Storage (mostly uncompressed)
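To make the compressed versus uncompressed distinction concrete, here is a small sketch that summarizes a `dbStats` result. It relies on the `dataSize` (uncompressed data), `storageSize` (space allocated on disk, after compression) and `indexSize` fields; the function name and the output field names are hypothetical.

```javascript
// Summarize the storage numbers from a dbStats result document.
function summarizeStorage(dbStats) {
  const { dataSize, storageSize, indexSize } = dbStats;
  return {
    uncompressedDataBytes: dataSize,
    onDiskDataBytes: storageSize,
    indexBytes: indexSize,
    // How many uncompressed bytes each on-disk byte represents.
    // Free space retained by WiredTiger after deletes lowers this figure.
    compressionRatio: storageSize > 0 ? dataSize / storageSize : null,
  };
}
```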
  • 28. Performance advisor
      ● Are you missing indexes and doing collection scans?
      ● Do you need compound indexes?
      ● Be mindful of the performance advisor’s reporting delay
      ● Need insight into your aggregations?
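Because of the advisor's reporting delay, it can help to check a query plan directly. The sketch below walks the `queryPlanner.winningPlan` tree of an explain result and reports whether any stage is a `COLLSCAN`. The function names are hypothetical, and the explain output shape assumed here is the classic `stage` / `inputStage` / `inputStages` layout.

```javascript
// Collect the stage names of a plan node and all of its children.
function planStages(plan) {
  if (!plan) return [];
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  return [plan.stage, ...children.flatMap(planStages)];
}

// True when the winning plan contains a full collection scan.
function usesCollectionScan(explainOutput) {
  const winningPlan = explainOutput.queryPlanner.winningPlan;
  return planStages(winningPlan).includes("COLLSCAN");
}
```

The explain document can come from `collection.find(filter).explain()` in the Node driver or `Model.find(filter).explain()` in Mongoose.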
  • 29. MongoDB Compass
      ● Compass vs Compass Community Edition
          ○ May be included in your existing services
      ● Issue with Sub-documents with an _id field
          _id: Status: Details: { _id: ... }
  • 31. Receiving Our Critical Mail
      ● Automatic Proactive Outreach Emails
          ○ Email - to any standard email client
          ○ SMS - Text message to alert your phone
          ○ HipChat / Slack / Flowdock - chat notifications pushed via the MongoDB Atlas API directly to a specified channel
          ○ PagerDuty - integration with on-call schedule and alerting
          ○ Webhooks - Sends an HTTP POST request to an endpoint for programmatic processing
  • 33. Oh for the love of reporting ...
      ● Considerations:
          ○ Performance
          ○ Data Sources
          ○ Ease of Use
          ○ Cost
      ● Tech:
          ○ MongoDB Aggregation Framework
          ○ MongoDB BI connector
          ○ Amazon Redshift via Panoply
          ○ Stitchdata
  • 34. Reporting: Performance
      ● Time sensitive real time reporting?
          ○ Is your user waiting for the report to finish in real-time?
      ● Fresh data a must?
          ○ Can your data be stale?
      ● Panoply refresh
          ○ How big is your data and how often will you ETL?
      ● Amazon Redshift cache
          ○ Are your queries repeating enough to utilize caching?
          ○ Are you index heavy?
      ● Scaling and Data Size
          ○ How often are you generating reports?
          ○ How big is your data set?
  • 35. Reporting: Data Sources
      ● Where does your data reside?
          ○ Is most of your data in MongoDB?
          ○ How easy is it to transform your data?
      ● How do you combine your data?
          ○ How do you associate data between sources?
          ○ How useful is it?
  • 36. Reporting: Ease of Use
      ● Engineering
          ○ Configuration and on-going maintenance
          ○ NoSQL to SQL
          ○ Data ETL
          ○ Existing query code base
      ● End User
          ○ BI connector
          ○ Tools like Power BI, Tableau, etc.
          ○ In-house report generation
  • 37. Reporting: Cost
      ● Atlas BI Connector
      ● Amazon Redshift via Panoply
          ○ Data size considerations
      ● ETL Tools
          ○ Stitch Data (severe MongoDB version limitations)
          ○ Alooma, etc. (can be very costly)
      ● Reporting Tools