Daniel Coupal, Curriculum Team, MongoDB
A Complete Methodology to Data Modeling for MongoDB
@danielcoupal
Daniel Coupal
Curriculum Engineer, Education Department, Palo Alto, CA
https://university.mongodb.com
Goals of the Presentation
§ Document vs Tabular: recognize the differences
§ Methodology: summarize the steps when modeling for MongoDB
§ Patterns: recognize when to apply
Document versus Tabular
Recognize the differences when modeling for a Document Database versus a Relational/Tabular Database
Thinking in Documents
§ Polymorphism
  § different documents may contain different fields
§ Array
  § represents a "one-to-many" relation
  § each entry is indexed separately
§ Sub Document
  § groups some fields together
§ JSON/BSON
  § documents are shown as JSON
  § BSON is the physical format
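The ideas on this slide can be sketched with plain Python dictionaries. The collection and field names (`title`, `tags`, `author`, `video_url`) are made up for illustration, and only the JSON text form is shown; producing BSON would need a MongoDB driver, which this sketch deliberately avoids.

```python
import json

# Two documents from the same hypothetical "articles" collection.
# Polymorphism: each document carries a field the other one lacks.
article = {
    "title": "Data modeling basics",
    "tags": ["mongodb", "schema"],                          # array: one-to-many
    "author": {"name": "Ada", "email": "ada@example.com"},  # sub-document
}
video = {
    "title": "Data modeling basics, the talk",
    "tags": ["mongodb"],
    "video_url": "https://example.com/talk",
}

collection = [article, video]

# Fields present in some documents but not in all of them.
all_fields = set().union(*(doc.keys() for doc in collection))
common_fields = set.intersection(*(set(doc.keys()) for doc in collection))
optional_fields = all_fields - common_fields

# Documents are displayed as JSON text.
as_json = json.dumps(article, indent=2)
print(sorted(optional_fields))
print(as_json)
```

An application reading such a collection has to tolerate the optional fields, which is the price of the flexibility.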
Example: Modeling a blog
CRDs: Collection-Relationship-Diagrams for two solutions
Solution A OR Solution B
§ Queries by articles or users
§ Queries by articles
§ Duplication of user information
§ Simpler
Example: Modeling a Social Network
Solution A   Solution B
(Fan Out on writes)   (Fan Out on reads)
✓ Slower writes
✓ More storage space
✓ Duplication
✓ Faster reads
Pre-aggregated Data
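The trade-off between the two approaches can be sketched in memory. Everything here (the `followers` map, the function names, the sample posts) is hypothetical and stands in for collections in a real database.

```python
from collections import defaultdict

followers = {"alice": ["bob", "carol"], "bob": ["carol"]}  # author -> followers

# Fan out on reads: store each post once, assemble the timeline at read time.
posts_by_author = defaultdict(list)

def post_fan_out_on_read(author, text):
    posts_by_author[author].append(text)  # a single write

def timeline_fan_out_on_read(user):
    # One lookup per followed author: cheap writes, more work per read.
    followed = [a for a, fans in followers.items() if user in fans]
    return [text for a in followed for text in posts_by_author[a]]

# Fan out on writes: copy each post into every follower's timeline.
timelines = defaultdict(list)

def post_fan_out_on_write(author, text):
    for fan in followers.get(author, []):
        timelines[fan].append(text)  # one write per follower (duplication)

def timeline_fan_out_on_write(user):
    return list(timelines[user])  # a single read of pre-aggregated data

for author, text in [("alice", "hello"), ("bob", "espresso time")]:
    post_fan_out_on_read(author, text)
    post_fan_out_on_write(author, text)

print(timeline_fan_out_on_read("carol"))
print(timeline_fan_out_on_write("carol"))
```

Both calls return the same timeline; the difference is where the work and the storage go. With these sample posts, fan out on writes stores three copies against two stored posts for fan out on reads.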
Differences: Tabular vs Document
 | Tabular | MongoDB
Steps to create the model | 1 – define schema; 2 – develop app and queries | 1 – identify the queries; 2 – define schema
Initial schema | 3rd normal form; one possible solution | many possible solutions
Final schema | likely denormalized | few changes
Schema evolution | difficult and not optimal; likely downtime | easy; no downtime
Performance | mediocre | optimized
Methodology
Summarize the steps of a methodology when modeling for MongoDB
Main Tradeoff in Modeling
Methodology
1. Describe the Workload
2. Identify and Model the Relationships
3. Apply Patterns
Flexible Methodology
Use Case
Let's start a franchise of coffee shops…
Case Study: Coffee Shop Franchises
Name: Beyond the Stars Coffee
Objective:
§ 10 000 stores in the United States
§ … then we expand to the rest of the World
Keys to success:
1. Best coffee in the world
2. Best Technology
Key to Success 1:
Make the Best Coffee in the World
23g of ground coffee in, 20g of extracted coffee out, in approximately 20 seconds
1. Fill a small or regular cup to 80% with hot water (not boiling but pretty hot). Your cup should be 150ml to 200ml in total volume, 80% of which will be hot water.
2. Grind 23g of coffee into your portafilter using the double basket. We use a scale.
3. Draw 20g of coffee over the hot water: place your cup on a scale, press tare and extract your shot.
Key to Success 2:
Best Technology
a) Intelligent Shelves
§ Measure inventory in real time
b) Intelligent Coffee Machines
§ Weighings, temperature, time to produce, …
§ Coffee perfection
c) Intelligent Data Storage
§ MongoDB
1 – Workload: List Queries
Query | Operation | Description
1. Coffee weight on the shelves | write | A shelf sends information when coffee bags are added or removed
2. Coffee to deliver to stores | read | How much coffee do we have to ship to the store in the next days
3. Anomalies in the inventory | read | Analytics
4. Making a cup of coffee | write | A coffee machine reports on the production of a coffee cup
5. Analysis of cups of coffee | read | Analytics
6. Technical Support | read | Helping our franchisees
1 – Workload: quantify/qualify the queries
Query | Quantification | Qualification
1. Coffee weight on the shelves | 10/day*shelf*store => 1/sec | <1s; critical write
2. Coffee to deliver to stores | 1/day*store => 0.1/sec | <60s
3. Anomalies in the inventory | 24 reads/day | <5mins; "collection scan"
4. Making a cup of coffee | 10 000 000 writes/day; 115 writes/sec | <100ms; non-critical write
… cups of coffee at rush hour | 3 000 000 writes/hr; 833 writes/sec | <100ms; non-critical write
5. Analysis of cups of coffee | 24 reads/day | stale data is fine; "collection scan"
6. Technical Support | 1000 reads/day | <1s
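The per-second rates in the quantification column follow from simple arithmetic. The sketch below reproduces them under two assumptions that the slides imply but do not state outright: one reporting shelf per store, and 1000 cups per store per day (the figure used on the disk space slide).

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60

stores = 10_000                # objective stated in the case study
weighings_per_store_day = 10   # assumption: one shelf per store
cups_per_store_day = 1_000     # assumption: taken from the disk space estimate

def per_second(events, seconds):
    """Average event rate over the given period."""
    return events / seconds

shelf_writes = per_second(stores * weighings_per_store_day, SECONDS_PER_DAY)
delivery_reads = per_second(stores * 1, SECONDS_PER_DAY)
cup_writes = per_second(stores * cups_per_store_day, SECONDS_PER_DAY)
rush_hour_writes = per_second(3_000_000, SECONDS_PER_HOUR)

print(f"shelf weighings: {shelf_writes:.2f} writes/sec")
print(f"delivery reads:  {delivery_reads:.2f} reads/sec")
print(f"cups of coffee:  {cup_writes:.0f} writes/sec")
print(f"rush hour:       {rush_hour_writes:.0f} writes/sec")
```

The rush hour row matters most for sizing: the average over a day hides a peak about seven times higher.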
Disk Space
Cups of coffee
§ one year of data
§ 10000 x 1000/day x 365
§ 3.7 billion/year
§ 370 GB (100 bytes/cup of coffee)
Weighings
§ one year of data
§ 10000 x 10/day x 365
§ 36.5 million/year
§ 3.7 GB (100 bytes/weighing)
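The disk space estimate can be checked the same way. This sketch uses the slide's inputs (10000 stores, 365 days, 100 bytes per document) and counts a gigabyte as 10^9 bytes, which is an assumption about the slide's rounding. For weighings, the product 10000 x 10 x 365 comes to 36.5 million documents, which is what yields the 3.7 GB figure.

```python
stores = 10_000
days = 365
bytes_per_document = 100  # document size assumed on the slide

def yearly(per_store_per_day):
    """Return (documents per year, gigabytes per year) for one store-level rate."""
    count = stores * per_store_per_day * days
    return count, count * bytes_per_document / 1e9  # GB taken as 10**9 bytes

cups, cups_gb = yearly(1_000)
weighings, weighings_gb = yearly(10)

print(f"cups:      {cups:,} documents, {cups_gb:.0f} GB")
print(f"weighings: {weighings:,} documents, {weighings_gb:.2f} GB")
```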
2 - Relations are still important
Type of Relation | one-to-one (1-1) | one-to-many (1-N) | many-to-many (N-N)
Document embedded in the parent document | one read; no joins | one read; no joins | one read; no joins; duplication of information
Document referenced in the parent document | smaller reads; many reads | smaller reads; many reads | smaller reads; many reads
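A small sketch of the two options for a one-to-many relation, using stores and coffee machines from the case study. The field names and identifiers (`serial`, `store_id`, `M-100`) are hypothetical, and dictionaries stand in for collections.

```python
# Embedded: the "store has machines" relation lives inside one document.
store_embedded = {
    "_id": "store-1",
    "city": "Palo Alto",
    "machines": [
        {"serial": "M-100", "model": "2A"},
        {"serial": "M-101", "model": "2A"},
    ],
}

# Referenced: machines are separate documents that point back to their store.
stores = {"store-1": {"_id": "store-1", "city": "Palo Alto"}}
machines = {
    "M-100": {"_id": "M-100", "model": "2A", "store_id": "store-1"},
    "M-101": {"_id": "M-101", "model": "2A", "store_id": "store-1"},
}

def machines_embedded(store):
    # One read returns the parent and all its children; no join is needed.
    return [m["serial"] for m in store["machines"]]

def machines_referenced(store_id):
    # Smaller documents, but a second lookup stands in for the join.
    # (A real database would use an index on store_id rather than a scan.)
    return [m["_id"] for m in machines.values() if m["store_id"] == store_id]

print(machines_embedded(store_embedded))
print(machines_referenced("store-1"))
```

Which form fits depends on the workload from step 1: embedding favors reads that always need the children, referencing favors children that are large or queried on their own.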
2 - Entities for Beyond the Stars Coffee
Entities:
§ Coffee cups
§ Stores
§ Coffee machines
§ Shelves
§ Weighings
§ Coffee bags
Patterns
Recognize the need and when to apply Schema Design Patterns
Schema Design Patterns Resources
A. Advanced Schema Design Patterns
§ MongoDB World 2017
B. Blogs on Patterns, with Ken Alger
§ https://www.mongodb.com/blog/post/building-with-patterns-a-summary
C. MongoDB University: M320 – Data Modeling
§ https://university.mongodb.com/courses/M320/about
D. Schema Design, Builder Fest PODs
§ Wednesday, with our Consulting Engineers
Schema Versioning
Computed Pattern
Subset Pattern
Bucket Pattern
Bucket per Day
{
  "device_id": "000123456",
  "type": "2A",
  "date": ISODate("2018-03-02"),
  "temp": [ [ 20.0, 20.1, 20.2, ... ],
            [ 22.1, 22.1, 22.0, ... ],
            ...
          ]
}
{
  "device_id": "000123456",
  "type": "2A",
  "date": ISODate("2018-03-03"),
  "temp": [ [ 20.1, 20.2, 20.3, ... ],
            [ 22.4, 22.4, 22.3, ... ],
            ...
          ]
}
Bucket per Hour
{
  "device_id": "000123456",
  "type": "2A",
  "date": ISODate("2018-03-02T13"),
  "temp": { 1: 20.0, 2: 20.1, 3: 20.2, ... }
}
{
  "device_id": "000123456",
  "type": "2A",
  "date": ISODate("2018-03-02T14"),
  "temp": { 1: 22.1, 2: 22.1, 3: 22.0, ... }
}
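How raw readings end up in hourly buckets can be sketched as a grouping step. The sample readings and the `bucket_per_hour` helper are hypothetical; a real deployment would more likely append to the bucket with an upsert at write time than regroup in application code.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw readings: (device_id, timestamp, temperature).
readings = [
    ("000123456", datetime(2018, 3, 2, 13, 1), 20.0),
    ("000123456", datetime(2018, 3, 2, 13, 2), 20.1),
    ("000123456", datetime(2018, 3, 2, 14, 1), 22.1),
]

def bucket_per_hour(rows):
    """Group readings into one document per device per hour."""
    buckets = defaultdict(dict)
    for device_id, ts, temp in rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(device_id, hour)][ts.minute] = temp  # minute -> temperature
    return [
        {"device_id": device_id, "date": hour, "temp": temps}
        for (device_id, hour), temps in buckets.items()
    ]

docs = bucket_per_hour(readings)
for doc in docs:
    print(doc)
```

Three readings become two documents here; at one reading per minute, an hourly bucket replaces sixty documents with one.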
Solution with Patterns
• Schema Versioning
• Subset
• Computed
• Bucket
Data Modeling
Patterns
Use Cases
https://university.mongodb.com/courses/M320/about
Conclusion
Takeaways from the Presentation
§ Document vs Tabular: recognize the differences
§ Methodology: summarize the steps when modeling for MongoDB
§ Patterns: recognize when to apply
Thank you for taking our FREE
MongoDB classes at
university.mongodb.com
Register Now!
https://university.mongodb.com/courses/M320/about
Appendix A
Schema Versioning
Pattern
Nightmare: Alter Table
This is what your dreams should be when thinking about a schema upgrade!
Schema Revision
 | Relational | MongoDB
Versioned Unit | Schema | Document
Migration Procedure | Difficult | Easy
Service Uptime | Interrupted | No interruption
Rollback | Difficult to nightmare-ish | Easy
Application Lifecycle
Modify Application
§ Can read/process all versions of documents
§ Have a different handler per version
§ Reshape the document before processing it
Update all Application servers
§ Install updated application
§ Remove old processes
Once migration is completed
§ Remove the code that processes old versions
Document Lifecycle
New Documents:
§ Application writes them in latest version
Existing Documents
A) Use updates to documents
§ to transform to latest version
§ keep forever documents that never need an update
B) or transform all documents in batch
§ no worry even if process takes days
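The application and document lifecycles above can be sketched together: one handler per old version, and a reshape step that brings any document to the latest shape before it is processed. The `schema_version` field name comes from the pattern itself; the version 1 to version 2 change (a `phone` string moved into a `contacts` sub-document) and treating a missing field as version 1 are assumptions made for illustration.

```python
LATEST_VERSION = 2

def upgrade_v1_to_v2(doc):
    # Hypothetical change: v1 stored a single "phone" string,
    # v2 stores it inside a "contacts" sub-document.
    upgraded = {k: v for k, v in doc.items() if k != "phone"}
    upgraded["contacts"] = {"phone": doc.get("phone")}
    upgraded["schema_version"] = 2
    return upgraded

# One handler per old version.
UPGRADERS = {1: upgrade_v1_to_v2}

def reshape(doc):
    """Bring a document of any known version to the latest shape."""
    doc = dict(doc)  # leave the caller's copy untouched
    version = doc.get("schema_version", 1)  # assumption: no field means v1
    while version < LATEST_VERSION:
        doc = UPGRADERS[version](doc)
        version = doc["schema_version"]
    return doc

old = {"_id": 1, "name": "Store 1", "phone": "555-0100"}
new = {"_id": 2, "name": "Store 2",
       "contacts": {"phone": "555-0101"}, "schema_version": 2}

print(reshape(old))
print(reshape(new))
```

Writing the reshaped document back on its next update gives option A; running `reshape` over the whole collection in the background gives option B.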
Timeline of the migration
Schema Versioning Pattern
Problem
● Avoid downtime while doing schema upgrades
● Upgrading all documents can take hours, days or even weeks when dealing with big data
● Don't want to update all documents
Solution
● Each document gets a "schema_version" field
● Application can handle all versions
● Choose your strategy to migrate the documents
Use Cases Examples
● Every application that uses a database, is deployed in production and is heavily used
● System with a lot of legacy data
Benefits and Trade-Offs
✓ No downtime needed
✓ Feel in control of the migration
✓ Less future technical debt
🆇 May need 2 indexes for same field while in migration period
Appendix B
Computed Pattern
Mathematical Operations
"Fan Out" Operations
"Roll Up" Operations
Computed Pattern
Problem
● Costly computation or manipulation of data
● Executed frequently on the same data, producing the same result
Solution
● Perform the operation and store the result in the appropriate document and collection
● If the operations may need to be redone, keep their source data
Use Cases Examples
● Internet Of Things (IOT)
● Event Sourcing
● Time Series Data
● Frequent Aggregation Framework queries
Benefits and Trade-Offs
✓ Read queries are faster
✓ Saving on resources like CPU and Disk
🆇 May be difficult to identify the need
🆇 Avoid applying or overusing it unless needed
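A minimal sketch of the Computed Pattern applied to the coffee case study: per-store totals are updated when a cup is recorded, so reads do not have to re-aggregate every cup. The names (`cups`, `store_totals`, `record_cup`) and the `grams` field are hypothetical.

```python
# Source data is kept so that the result can be recomputed if needed.
cups = []          # one entry per cup: {"store_id": ..., "grams": ...}
store_totals = {}  # computed documents: store_id -> {"cups": n, "grams": total}

def record_cup(store_id, grams):
    cups.append({"store_id": store_id, "grams": grams})
    # Update the stored result at write time instead of aggregating on each read.
    total = store_totals.setdefault(store_id, {"cups": 0, "grams": 0.0})
    total["cups"] += 1
    total["grams"] += grams

def recompute(store_id):
    # Full recomputation from the source, e.g. to verify or rebuild the totals.
    mine = [c["grams"] for c in cups if c["store_id"] == store_id]
    return {"cups": len(mine), "grams": sum(mine)}

for grams in (20.0, 19.5, 20.5):
    record_cup("store-1", grams)

print(store_totals["store-1"])
print(recompute("store-1"))
```

The stored total costs a little extra on every write, which fits a workload where the same aggregate is read far more often than it changes.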
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 

Recently uploaded (20)

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 

Data Modeling Methodology for MongoDB Coffee Franchise Case Study

  • 1. Daniel Coupal, Curriculum Team, MongoDB A Complete Methodology to Data Modeling for MongoDB @danielcoupal
  • 2. Daniel Coupal Curriculum Engineer, Education Department, Palo Alto, CA
  • 4. Goals of the Presentation Document vs Tabular Recognize the differences Methodology Summarize the steps when modeling for MongoDB Patterns Recognize when to apply
  • 7. Document versus Tabular Recognize the differences when modeling for a Document Database versus a Relational/Tabular Database
  • 9. Thinking in Documents
      - Polymorphism: different documents may contain different fields
      - Array: represents a "one-to-many" relation; each entry is indexed separately
      - Sub-document: groups some fields together
      - JSON/BSON: documents are shown as JSON; BSON is the physical format
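The four ideas on this slide can be seen in a pair of small documents. The sketch below uses plain Python dictionaries as stand-ins for documents; every field name and value is a hypothetical illustration, not taken from the presentation.

```python
import json

# Hypothetical blog documents; all field names and values are illustrative.
article = {
    "_id": 1,
    "type": "article",
    "title": "Thinking in Documents",
    "tags": ["mongodb", "schema"],                # array: a one-to-many relation
    "author": {"name": "Ada", "city": "Paris"},   # sub-document: grouped fields
}
video = {
    "_id": 2,
    "type": "video",
    "title": "Modeling a blog",
    "duration_sec": 300,                          # field that articles do not have
}

# Polymorphism: documents in the same collection may carry different fields.
posts = [article, video]


def fields_by_type(docs):
    """Return the sorted field names of each document, keyed by its type."""
    return {doc["type"]: sorted(doc.keys()) for doc in docs}


# Documents are shown as JSON; the stored form in MongoDB is BSON.
article_as_json = json.dumps(article, indent=2)
```

Calling `fields_by_type(posts)` returns a different field list for each type, which is the polymorphism the slide describes.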
  • 11. CRDs: Collection-Relationship-Diagrams for two solutions: Solution A OR Solution B. Queries by articles or users; Queries by articles; Duplication of users information; Simpler
  • 13. Example: Modeling a Social Network ✓ Slower writes ✓ More storage space ✓ Duplication ✓ Faster reads Pre-aggregated Data Solution A Solution B (Fan Out on writes) (Fan Out on reads)
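The two strategies named on this slide can be contrasted in a few lines. This is a minimal in-memory sketch, assuming a hypothetical `followers` map and timeline structure; it does not reproduce the schemas drawn on the slide. Fan out on writes copies each post to every follower, which is where slower writes, duplication and faster reads come from.

```python
from collections import defaultdict

# Hypothetical follower graph: author -> list of followers.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}


def fan_out_on_write(timelines, author, text):
    """Copy the post into each follower's timeline at write time."""
    for follower in followers.get(author, []):
        timelines[follower].append({"author": author, "text": text})


def fan_out_on_read(posts_by_author, reader):
    """Assemble the timeline at read time from every author the reader follows."""
    followed = [a for a, fans in followers.items() if reader in fans]
    return [post for a in followed for post in posts_by_author.get(a, [])]


# Fan out on writes: one write per follower, then a single lookup on read.
timelines = defaultdict(list)
fan_out_on_write(timelines, "alice", "hi")

# Fan out on reads: one stored copy per post, gathered when the reader asks.
posts_by_author = {
    "alice": [{"author": "alice", "text": "hi"}],
    "bob": [{"author": "bob", "text": "yo"}],
}
carol_timeline = fan_out_on_read(posts_by_author, "carol")
```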
  • 18. Differences: Tabular vs Document (Tabular | MongoDB)
      - Steps to create the model: 1 – define schema, 2 – develop app and queries | 1 – identify the queries, 2 – define schema
      - Initial schema: 3rd normal form, one possible solution | many possible solutions
      - Final schema: likely denormalized | few changes
      - Schema evolution: difficult and not optimal, likely downtime | easy, no downtime
      - Performance: mediocre | optimized
  • 19. Methodology Summarize the steps of a methodology when modeling for MongoDB
  • 20. Main Tradeoff in Modeling
  • 27. Methodology 1. Describe the Workload 2. Identify and Model the Relationships 3. Apply Patterns
  • 29. Use Case Let's start a franchise of coffee shops…
  • 34. Case Study: Coffee Shop Franchises
      - Name: Beyond the Stars Coffee
      - Objective: 10 000 stores in the United States, … then we expand to the rest of the World
      - Keys to success: 1. Best coffee in the world; 2. Best Technology
  • 35. Key to Success 1: Make the Best Coffee in the World
  • 36. Make the Best Coffee in the World: 23g of ground coffee in, 20g of extracted coffee out, in approximately 20 seconds
      1. Fill a small or regular cup with 80% hot water (not boiling but pretty hot). Your cup should be 150ml to 200ml in total volume, 80% of which will be hot water.
      2. Grind 23g of coffee into your portafilter using the double basket. We use a scale that you can get here.
      3. Draw 20g of coffee over the hot water by placing your cup on a scale, press tare and extract your shot.
  • 39. Key to Success 2: Best Technology
      a) Intelligent Shelves: measure inventory in real time
      b) Intelligent Coffee Machines: weighings, temperature, time to produce, …; coffee perfection
      c) Intelligent Data Storage: MongoDB
  • 46. 1 – Workload: List Queries (Query | Operation | Description)
      1. Coffee weight on the shelves | write | A shelf sends information when coffee bags are added or removed
      2. Coffee to deliver to stores | read | How much coffee do we have to ship to the store in the next days
      3. Anomalies in the inventory | read | Analytics
      4. Making a cup of coffee | write | A coffee machine reporting on the production of a coffee cup
      5. Analysis of cups of coffee | read | Analytics
      6. Technical Support | read | Helping our franchisees
  • 47. 1 – Workload: quantify/qualify the queries (Query | Quantification | Qualification)
      1. Coffee weight on the shelves | 10/day*shelf*store => 1/sec | <1s, critical write
      2. Coffee to deliver to stores | 1/day*store => 0.1/sec | <60s
      3. Anomalies in the inventory | 24 reads/day | <5mins, "collection scan"
      4. Making a cup of coffee | 10 000 000 writes/day, 115 writes/sec | <100ms, non-critical write
         … cups of coffee at rush hour | 3 000 000 writes/hr, 833 writes/sec | <100ms, non-critical write
      5. Analysis of cups of coffee | 24 reads/day | stale data is fine, "collection scan"
      6. Technical Support | 1000 reads/day | <1s
  • 49. Disk Space
      - Cups of coffee: one year of data; 10000 x 1000/day x 365 = 3.7 billion/year; 370 GB (100 bytes/cup of coffee)
      - Weighings: one year of data; 10000 x 10/day x 365 = 36.5 million/year; 3.7 GB (100 bytes/weighing)
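The rates and sizes quoted in the workload and disk space slides can be re-derived with a few lines of arithmetic. The sketch below uses only the slide's own inputs (10 000 stores, 100 bytes per record) and assumes one shelf per store, which is what the "=> 1/sec" figure implies. Evaluating 10000 x 10 x 365 gives 36.5 million weighings per year, which agrees with the 3.7 GB figure.

```python
STORES = 10_000
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(events_per_day):
    """Convert a daily event count into an average rate per second."""
    return events_per_day / SECONDS_PER_DAY


# Write rates from the workload slide.
shelf_writes = per_second(STORES * 10)    # 10 weighings/day per store, one shelf assumed
delivery_reads = per_second(STORES * 1)   # 1 delivery query/day per store
cup_writes = per_second(10_000_000)       # cups of coffee, averaged over the day
rush_hour_writes = 3_000_000 / 3600       # cups of coffee during the rush hour


def yearly(events_per_store_per_day, bytes_per_event=100):
    """Return (records per year, size in GB) for one year of data."""
    count = STORES * events_per_store_per_day * 365
    return count, count * bytes_per_event / 1e9


cups_per_year, cups_gb = yearly(1000)
weighings_per_year, weighings_gb = yearly(10)
```

Rounded, these values match the slides: about 1 and 0.1 operations per second for the shelves and deliveries, 115 and 833 writes per second for cups, and roughly 370 GB and 3.7 GB of yearly storage.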
  • 51. 2 - Relations are still important (Type of Relation: one-to-one/1-1 | one-to-many/1-N | many-to-many/N-N)
      - Document embedded in the parent document: one read, no joins | one read, no joins | one read, no joins, duplication of information
      - Document referenced in the parent document: smaller reads, many reads | smaller reads, many reads | smaller reads, many reads
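The "one read" versus "many reads" trade-off can be made concrete with a toy store that counts lookups. This is a sketch only: `CountingStore`, the collection names and the store/shelf documents are hypothetical, and a real application could fetch several referenced documents in a single query.

```python
class CountingStore:
    """Tiny in-memory stand-in for a database that counts document reads."""

    def __init__(self, collections):
        self.collections = collections
        self.reads = 0

    def find_one(self, collection, _id):
        self.reads += 1
        return self.collections[collection][_id]


db = CountingStore({
    # One-to-many with the shelves embedded in the parent document.
    "stores_embedded": {
        1: {"name": "Store 1", "shelves": [{"shelf_id": "a"}, {"shelf_id": "b"}]},
    },
    # Same relation with the shelves referenced from the parent document.
    "stores_referenced": {1: {"name": "Store 1", "shelf_ids": ["a", "b"]}},
    "shelves": {"a": {"shelf_id": "a"}, "b": {"shelf_id": "b"}},
})


def load_embedded(db, store_id):
    """One read returns the store together with its shelves."""
    return db.find_one("stores_embedded", store_id)


def load_referenced(db, store_id):
    """One small read for the store, then one read per referenced shelf."""
    store = db.find_one("stores_referenced", store_id)
    shelves = [db.find_one("shelves", sid) for sid in store["shelf_ids"]]
    return {"name": store["name"], "shelves": shelves}
```

Both functions return the same store, but the embedded version costs one read while the referenced version costs three in this example.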
  • 52. 2 - Entities for Beyond the Stars Coffee: Coffee cups, Stores, Coffee machines, Shelves, Weighings, Coffee bags
  • 54. Patterns Recognize the need and when to apply Schema Design Patterns
  • 55. Schema Design Patterns Resources
      A. Advanced Schema Design Patterns: MongoDB World 2017
      B. Blogs on Patterns, with Ken Alger: https://www.mongodb.com/blog/post/building-with-patterns-a-summary
      C. MongoDB University: M320 – Data Modeling: https://university.mongodb.com/courses/M320/about
      D. Schema Design, Builder Fest PODs: Wednesday, with our Consulting Engineers
  • 61. Bucket Pattern
      - Bucket per Day:
        { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02"), "temp": [ [ 20.0, 20.1, 20.2, ... ], [ 22.1, 22.1, 22.0, ... ], ... ] }
        { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-03"), "temp": [ [ 20.1, 20.2, 20.3, ... ], [ 22.4, 22.4, 22.3, ... ], ... ] }
      - Bucket per Hour:
        { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02T13"), "temp": { 1: 20.0, 2: 20.1, 3: 20.2, ... } }
        { "device_id": 000123456, "type": "2A", "date": ISODate("2018-03-02T14"), "temp": { 1: 22.1, 2: 22.1, 3: 22.0, ... } }
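The hourly layout above can be produced by grouping raw readings by device and hour. The following is a minimal in-memory sketch: the function name, the input tuple shape and the choice to key each value by its minute within the hour are assumptions for illustration, not details confirmed by the slide.

```python
from datetime import datetime


def bucket_per_hour(readings):
    """Group (device_id, timestamp, temp) readings into one document per device and hour."""
    buckets = {}
    for device_id, ts, temp in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        doc = buckets.setdefault(
            (device_id, hour),
            {"device_id": device_id, "date": hour, "temp": {}},
        )
        # Assumption: each value is keyed by its minute within the hour.
        doc["temp"][str(ts.minute)] = temp
    return list(buckets.values())


readings = [
    (123456, datetime(2018, 3, 2, 13, 1), 20.0),
    (123456, datetime(2018, 3, 2, 13, 2), 20.1),
    (123456, datetime(2018, 3, 2, 14, 1), 22.1),
]
docs = bucket_per_hour(readings)
```

Three readings become two documents here, one for 13:00 and one for 14:00, so the number of documents grows with the number of hours rather than the number of measurements.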
  • 62. Solution with Patterns • Schema Versioning • Subset • Computed • Bucket
  • 65. Takeaways from the Presentation Document vs Tabular Recognize the differences Methodology Summarize the steps when modeling for MongoDB Patterns Recognize when to apply
  • 68. Thank you for taking our FREE MongoDB classes at university.mongodb.com
  • 73. This is what your dreams should be when thinking about a schema upgrade!
  • 74. Schema Revision (Relational | MongoDB)
      - Versioned Unit: Schema | Document
      - Migration Procedure: Difficult | Easy
      - Service Uptime: Interrupted | No interruption
      - Rollback: Difficult to nightmare-ish | Easy
  • 77. Application Lifecycle
      - Modify the application: it can read/process all versions of documents; it has a different handler per version; it reshapes the document before processing it
      - Update all application servers: install the updated application; remove old processes
      - Once the migration is completed: remove the code that processes old versions
  • 78. Document Lifecycle
      - New documents: the application writes them in the latest version
      - Existing documents: A) use updates to documents to transform them to the latest version, and keep forever the documents that never need an update; or B) transform all documents in batch, with no worry even if the process takes days
  • 79. Timeline of the migration
  • 80. Schema Versioning Pattern
      - Problem: avoid downtime while doing schema upgrades; upgrading all documents can take hours, days or even weeks when dealing with big data; don't want to update all documents
      - Solution: each document gets a "schema_version" field; the application can handle all versions; choose your strategy to migrate the documents
      - Use Cases Examples: every application that uses a database, is deployed in production and is heavily used; systems with a lot of legacy data
      - Benefits: no downtime needed; feel in control of the migration; less future technical debt
      - Trade-Offs: may need 2 indexes for the same field during the migration period
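The idea of one handler per version, with documents reshaped before processing, might look like the sketch below. Only the "schema_version" field comes from the slides. The v1-to-v2 change (splitting a `name` field), the function names and the rule that a missing version means version 1 are hypothetical choices for illustration.

```python
LATEST_VERSION = 2


def upgrade_v1_to_v2(doc):
    """Hypothetical change: v1 stored 'name', v2 splits it into first and last name."""
    first, _, last = doc["name"].partition(" ")
    new_doc = {k: v for k, v in doc.items() if k != "name"}
    new_doc.update({"first_name": first, "last_name": last, "schema_version": 2})
    return new_doc


# One upgrade handler per source version.
UPGRADES = {1: upgrade_v1_to_v2}


def reshape(doc):
    """Bring a document of any known version to the latest shape before processing."""
    # Assumption: documents written before versioning carry no field and count as v1.
    version = doc.get("schema_version", 1)
    while version < LATEST_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    return doc


old_doc = {"_id": 1, "name": "Ada Lovelace"}
new_doc = {"_id": 2, "first_name": "Alan", "last_name": "Turing", "schema_version": 2}
reshaped = [reshape(old_doc), reshape(new_doc)]
```

The application can call `reshape` on every read and write the result back whenever it updates a document, or a batch job can apply the same function to all documents. Either way the handlers for old versions can be removed once no old documents remain.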
  • 86. Computed Pattern
      - Problem: costly computation or manipulation of data; executed frequently on the same data, producing the same result
      - Solution: perform the operation and store the result in the appropriate document and collection; if the operations may need to be redone, keep their source data
      - Use Cases Examples: Internet of Things (IoT); Event Sourcing; Time Series Data; frequent Aggregation Framework queries
      - Benefits: read queries are faster; saves resources like CPU and disk
      - Trade-Offs: may be difficult to identify the need; avoid applying or overusing it unless needed
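A computed value is typically updated at write time so that reads do not repeat the aggregation. The sketch below keeps a per-store summary up to date as each cup of coffee is recorded. The summary document, its field names and the `coffee_grams` measurement are hypothetical, and the raw cup records would still be kept as the source in case the totals need to be recomputed.

```python
def record_cup(store_summary, cup):
    """Update the stored totals when a cup is written, instead of re-aggregating on read."""
    store_summary["cups"] += 1
    store_summary["total_grams"] += cup["coffee_grams"]
    store_summary["avg_grams"] = store_summary["total_grams"] / store_summary["cups"]
    return store_summary


# Hypothetical summary document for one store.
summary = {"store_id": 1, "cups": 0, "total_grams": 0.0, "avg_grams": 0.0}
for grams in (23.0, 21.0):
    record_cup(summary, {"coffee_grams": grams})
```

After the two writes, the summary already holds the count, the total and the average, so an analytics read only has to fetch one small document.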