SlideShare a Scribd company logo
The Fine Art of 
Schema Design: Dos 
and Don'ts 
Matias Cascallares 
Senior Solutions Architect, MongoDB Inc. 
matias@mongodb.com
Who am I? 
• Originally from Buenos Aires, 
Argentina 
• Solutions Architect at MongoDB 
Inc based in Singapore 
• Software Engineer, most of my 
experience in web environments 
• In my toolbox I have Java, Python 
and Node.js
RDBMs 
• Relational databases are made up of tables 
• Tables are made up of rows: 
• All rows have identical structure 
• Each row has the same number of columns 
• Every cell in a column stores the same type of data
MONGODB IS A 
DOCUMENT 
ORIENTED 
DATABASE
Show me a document 
{ 
"name" : "Matias Cascallares", 
"title" : "Senior Solutions Architect", 
"email" : "matias@mongodb.com", 
"birth_year" : 1981, 
"location" : [ "Singapore", "Asia"], 
"phone" : { 
"type" : "mobile", 
"number" : "+65 8591 3870" 
} 
}
Document Model 
• MongoDB is made up of collections 
• Collections are composed of documents 
• Each document is a set of key-value pairs 
• No predefined schema 
• Keys are always strings 
• Values can be any (supported) data type 
• Values can also be an array 
• Values can also be a document
Benefits of 
document 
model ..?
Flexibility 
• Each document can have different fields 
• No need of long migrations, easier to be agile 
• Common structure enforced at application level
Arrays 
• Documents can have field with array values 
• Ability to query and index array elements 
• We can model relationships with no need of different 
tables or collections
Embedded documents 
• Documents can have field with document values 
• Ability to query and index nested documents 
• Semantic closer to Object Oriented Programming
Indexing an array of documents
Relational 
Schema Design 
Document 
Schema Design
Relational 
Schema Design 
Focus on 
data 
storage 
Document 
Schema Design 
Focus on 
data 
usage
SCHEMA 
DESIGN IS 
AN ART 
https://www.flickr.com/photos/76377775@N05/11098637655/
Implementing 
Relations 
https://www.flickr.com/photos/ravages/2831688538
A task 
tracking app
Requirement #1 
"We need to store user information like name, email 
and their addresses… yes they can have more than 
one.” 
— Bill, a project manager, contemporary
Relational 
id name email title 
1 Kate 
Powell 
kate.powell@somedomain.c 
om 
Regional Manager 
id street city user_id 
1 123 Sesame Street Boston 1 
2 123 Evergreen Street New York 1
Let’s use the document model 
> db.user.findOne( { email: "kate.powell@somedomain.com"} ) 
{ 
_id: 1, 
name: "Kate Powell", 
email: "kate.powell@somedomain.com", 
title: "Regional Manager", 
addresses: [ 
{ street: "123 Sesame St", city: "Boston" }, 
{ street: "123 Evergreen St", city: "New York" } 
] 
}
Requirement #2 
"We have to be able to store tasks, assign them to 
users and track their progress…" 
— Bill, a project manager, contemporary
Embedding tasks 
> db.user.findOne( { email: "kate.powell@somedomain.com"} ) 
{ 
name: "Kate Powell", 
// ... previous fields 
tasks: [ 
{ 
summary: "Contact sellers", 
description: "Contact agents to specify our needs 
and time constraints", 
due_date: ISODate("2014-08-25T08:37:50.465Z"), 
status: "NOT_STARTED" 
}, 
{ // another task } 
] 
}
Embedding tasks 
• Tasks are unbounded items: initially we do not know 
how many tasks we are going to have 
• A user along time can end with thousands of tasks 
• Maximum document size in MongoDB: 16 MB ! 
• It is harder to access task information without a user 
context
Referencing tasks 
> db.user.findOne({_id: 1}) 
{ 
_id: 1, 
name: "Kate Powell", 
email: "kate.powell@...", 
title: "Regional Manager", 
addresses: [ 
{ // address 1 }, 
{ // address 2 } 
] 
} 
> db.task.findOne({user_id: 1}) 
{ 
_id: 5, 
summary: "Contact sellers", 
description: "Contact agents 
to specify our ...", 
due_date: ISODate(), 
status: "NOT_STARTED", 
user_id: 1 
}
Referencing tasks 
• Tasks are unbounded items and our schema supports 
that 
• Application level joins 
• Remember to create proper indexes (e.g. user_id)
Embedding 
vs 
Referencing
One-to-many relations 
• Embed when you have a few number of items on ‘many' 
side 
• Embed when you have some level of control on the 
number of items on ‘many' side 
• Reference when you cannot control the number of items 
on the 'many' side 
• Reference when you need to access to ‘many' side items 
without parent entity scope
Many-to-many relations 
• These can be implemented with two one-to-many 
relations with the same considerations
RECIPE #1 
USE EMBEDDING 
FOR ONE-TO-FEW 
RELATIONS
RECIPE #2 
USE REFERENCING 
FOR ONE-TO-MANY 
RELATIONS
Working with 
arrays 
https://www.flickr.com/photos/kishjar/10747531785
Arrays are 
great!
List of sorted elements 
> db.numbers.insert({ 
_id: "even", 
values: [0, 2, 4, 6, 8] 
}); 
> db.numbers.insert({ 
_id: "odd", 
values: [1, 3, 5, 7, 9] 
});
Access based on position 
db.numbers.find({_id: "even"}, {values: {$slice: [2, 3]}}) 
{ 
_id: "even", 
values: [4, 6, 8] 
} 
db.numbers.find({_id: "odd"}, {values: {$slice: -2}}) 
{ 
_id: "odd", 
values: [7, 9] 
}
Access based on values 
// is number 2 even or odd? 
> db.numbers.find( { values : 2 } ) 
{ 
_id: "even", 
values: [0, 2, 4, 6, 8] 
}
Like sorted sets 
> db.numbers.find( { _id: "even" } ) 
{ 
_id: "even", 
values: [0, 2, 4, 6, 8] 
} 
> db.numbers.update( 
{ _id: "even"}, 
{ $addToSet: { values: 10 } } 
); 
Several times…! 
> db.numbers.find( { _id: "even" } ) 
{ 
_id: "even", 
values: [0, 2, 4, 6, 8, 10] 
}
Array update operators 
• pop 
• push 
• pull 
• pullAll
But…
Storage 
DocA DocB DocC 
{ 
_id: 1, 
name: "Nike Pump Air 180", 
tags: ["sports", "running"] 
} 
db.inventory.update( 
{ _id: 1}, 
{ $push: { tags: "shoes" } } 
)
Empty 
Storage 
DocA DocB DocC 
IDX IDX IDX
Empty 
Storage 
DocA DocC DocB 
IDX IDX IDX
Why is expensive to move a doc? 
1. We need to write the document in another location ($$) 
2. We need to mark the original position as free for new 
documents ($) 
3. We need to update all those index entries pointing to the 
moved document to the new location ($$$)
Considerations with arrays 
• Limited number of items 
• Avoid document movements 
• Document movements can be delayed with padding 
factor 
• Document movements can be mitigated with pre-allocation
RECIPE #3 
AVOID EMBEDDING 
LARGE ARRAYS
RECIPE #4 
USE DATA MODELS 
THAT MINIMIZE THE 
NEED FOR 
DOCUMENT 
GROWTH
Denormalization 
https://www.flickr.com/photos/ross_strachan/5146307757
Denormalization 
"…is the process of attempting to optimise the 
read performance of a database by adding 
redundant data …” 
— Wikipedia
Products and comments 
> db.product.find( { _id: 1 } ) 
{ 
_id: 1, 
name: "Nike Pump Air Force 180", 
tags: ["sports", "running"] 
} 
> db.comment.find( { product_id: 1 } ) 
{ score: 5, user: "user1", text: "Awesome shoes" } 
{ score: 2, user: "user2", text: "Not for me.." }
Denormalizing 
> db.product.find({_id: 1}) 
{ 
_id: 1, 
name: "Nike Pump Air Force 180", 
tags: ["sports", “running"], 
comments: [ 
{ user: "user1", text: "Awesome shoes" }, 
{ user: "user2", text: "Not for me.." } 
] 
} 
> db.comment.find({product_id: 1}) 
{ score: 5, user: "user1", text: "Awesome shoes" } 
{ score: 2, user: "user2", text: "Not for me.."}
RECIPE #5 
DENORMALIZE 
TO AVOID 
APP-LEVEL JOINS
RECIPE #6 
DENORMALIZE ONLY 
WHEN YOU HAVE A 
HIGH READ TO WRITE 
RATIO
Bucketing 
https://www.flickr.com/photos/97608671@N02/13558864555/
What’s the idea? 
• Reduce number of documents to be retrieved 
• Less documents to retrieve means less disk seeks 
• Using arrays we can store more than one entity per 
document 
• We group things that are accessed together
An example 
Comments are showed in 
buckets of 2 comments 
A ‘read more’ button 
loads next 2 comments
Bucketing comments 
> db.comments.find({post_id: 123}) 
.sort({sequence: -1}) 
.limit(1) 
{ 
_id: 1, 
post_id: 123, 
sequence: 8, // this acts as a page number 
comments: [ 
{user: user1@somedomain.com, text: "Awesome shoes.."}, 
{user: user2@somedomain.com, text: "Not for me..”} 
] // we store two comments per doc, fixed size bucket 
}
RECIPE #7 
USE BUCKETING TO 
STORE THINGS THAT 
ARE GOING TO BE 
ACCESSED AS A 
GROUP
谢谢

More Related Content

What's hot

Back to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documentsBack to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documents
MongoDB
 
Practical Ruby Projects With Mongo Db
Practical Ruby Projects With Mongo DbPractical Ruby Projects With Mongo Db
Practical Ruby Projects With Mongo Db
Alex Sharp
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
Mike Friedman
 
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentosConceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
MongoDB
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDB
Marakana Inc.
 
User Data Management with MongoDB
User Data Management with MongoDB User Data Management with MongoDB
User Data Management with MongoDB
MongoDB
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
Back to Basics Webinar 3: Schema Design Thinking in Documents
MongoDB
 
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
MongoDB
 
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
Socialite, the Open Source Status Feed Part 2: Managing the Social GraphSocialite, the Open Source Status Feed Part 2: Managing the Social Graph
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
MongoDB
 
Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status Feed
MongoDB
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
MongoDB
 
Building Your First MongoDB App ~ Metadata Catalog
Building Your First MongoDB App ~ Metadata CatalogBuilding Your First MongoDB App ~ Metadata Catalog
Building Your First MongoDB App ~ Metadata Catalog
hungarianhc
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
Abhijeet Vaikar
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
SpringPeople
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
MongoDB
 
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data FeedSocialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
MongoDB
 
Building Apps with MongoDB
Building Apps with MongoDBBuilding Apps with MongoDB
Building Apps with MongoDB
Nate Abele
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
Norberto Leite
 
Mongo db operations_v2
Mongo db operations_v2Mongo db operations_v2
Mongo db operations_v2
Thanabalan Sathneeganandan
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDB
Stennie Steneker
 

What's hot (20)

Back to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documentsBack to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documents
 
Practical Ruby Projects With Mongo Db
Practical Ruby Projects With Mongo DbPractical Ruby Projects With Mongo Db
Practical Ruby Projects With Mongo Db
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
 
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentosConceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDB
 
User Data Management with MongoDB
User Data Management with MongoDB User Data Management with MongoDB
User Data Management with MongoDB
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
Back to Basics Webinar 3: Schema Design Thinking in Documents
 
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
MongoDB Schema Design (Event: An Evening with MongoDB Houston 3/11/15)
 
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
Socialite, the Open Source Status Feed Part 2: Managing the Social GraphSocialite, the Open Source Status Feed Part 2: Managing the Social Graph
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
 
Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status Feed
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 
Building Your First MongoDB App ~ Metadata Catalog
Building Your First MongoDB App ~ Metadata CatalogBuilding Your First MongoDB App ~ Metadata Catalog
Building Your First MongoDB App ~ Metadata Catalog
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorialsMongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data FeedSocialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
 
Building Apps with MongoDB
Building Apps with MongoDBBuilding Apps with MongoDB
Building Apps with MongoDB
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
 
Mongo db operations_v2
Mongo db operations_v2Mongo db operations_v2
Mongo db operations_v2
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDB
 

Viewers also liked

The What and Why of NoSql
The What and Why of NoSqlThe What and Why of NoSql
The What and Why of NoSql
Matias Cascallares
 
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
MongoDB
 
Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011
Steven Francia
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling
rogerbodamer
 
Кратко о MongoDB
Кратко о MongoDBКратко о MongoDB
Кратко о MongoDB
Gleb Lebedev
 
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...phpdevby
 
Преимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDBПреимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDB
UNETA
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
Tyler Brock
 
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Alexey Zinoviev
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
Tony Tam
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
MongoDB
 
Webinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your BusinessWebinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your Business
MongoDB
 

Viewers also liked (12)

The What and Why of NoSql
The What and Why of NoSqlThe What and Why of NoSql
The What and Why of NoSql
 
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...
 
Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling
 
Кратко о MongoDB
Кратко о MongoDBКратко о MongoDB
Кратко о MongoDB
 
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...MongoDB. Области применения, преимущества и узкие места, тонкости использован...
MongoDB. Области применения, преимущества и узкие места, тонкости использован...
 
Преимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDBПреимущества NoSQL баз данных на примере MongoDB
Преимущества NoSQL баз данных на примере MongoDB
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
Выбор NoSQL базы данных для вашего проекта: "Не в свои сани не садись"
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Webinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your BusinessWebinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your Business
 

Similar to The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
NETWAYS
 
MongoDB & NoSQL 101
 MongoDB & NoSQL 101 MongoDB & NoSQL 101
MongoDB & NoSQL 101
Jollen Chen
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
MongoDB
 
MediaGlu and Mongo DB
MediaGlu and Mongo DBMediaGlu and Mongo DB
MediaGlu and Mongo DB
Sundar Nathikudi
 
MongoDB using Grails plugin by puneet behl
MongoDB using Grails plugin by puneet behlMongoDB using Grails plugin by puneet behl
MongoDB using Grails plugin by puneet behl
TO THE NEW | Technology
 
Introduction to RavenDB
Introduction to RavenDBIntroduction to RavenDB
Introduction to RavenDB
Sasha Goldshtein
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
Uwe Printz
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
DATAVERSITY
 
MongoDB
MongoDBMongoDB
SQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The MoveSQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The Move
IBM Cloud Data Services
 
Change RelationalDB to GraphDB with OrientDB
Change RelationalDB to GraphDB with OrientDBChange RelationalDB to GraphDB with OrientDB
Change RelationalDB to GraphDB with OrientDB
Apaichon Punopas
 
MongoDB Basics
MongoDB BasicsMongoDB Basics
MongoDB Basics
Sarang Shravagi
 
Webinar: Building Your First Application with MongoDB
Webinar: Building Your First Application with MongoDBWebinar: Building Your First Application with MongoDB
Webinar: Building Your First Application with MongoDB
MongoDB
 
Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016
Mukesh Tilokani
 
Mongo DB
Mongo DB Mongo DB
How to Achieve Scale with MongoDB
How to Achieve Scale with MongoDBHow to Achieve Scale with MongoDB
How to Achieve Scale with MongoDB
MongoDB
 
Mongo db eveningschemadesign
Mongo db eveningschemadesignMongo db eveningschemadesign
Mongo db eveningschemadesign
MongoDB APAC
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Ravi Teja
 
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
kiwilkins
 

Similar to The Fine Art of Schema Design in MongoDB: Dos and Don'ts (20)

Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
 
MongoDB & NoSQL 101
 MongoDB & NoSQL 101 MongoDB & NoSQL 101
MongoDB & NoSQL 101
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 
MediaGlu and Mongo DB
MediaGlu and Mongo DBMediaGlu and Mongo DB
MediaGlu and Mongo DB
 
MongoDB using Grails plugin by puneet behl
MongoDB using Grails plugin by puneet behlMongoDB using Grails plugin by puneet behl
MongoDB using Grails plugin by puneet behl
 
Introduction to RavenDB
Introduction to RavenDBIntroduction to RavenDB
Introduction to RavenDB
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
MongoDB
MongoDBMongoDB
MongoDB
 
SQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The MoveSQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The Move
 
Change RelationalDB to GraphDB with OrientDB
Change RelationalDB to GraphDB with OrientDBChange RelationalDB to GraphDB with OrientDB
Change RelationalDB to GraphDB with OrientDB
 
MongoDB Basics
MongoDB BasicsMongoDB Basics
MongoDB Basics
 
Webinar: Building Your First Application with MongoDB
Webinar: Building Your First Application with MongoDBWebinar: Building Your First Application with MongoDB
Webinar: Building Your First Application with MongoDB
 
Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016Tech Gupshup Meetup On MongoDB - 24/06/2016
Tech Gupshup Meetup On MongoDB - 24/06/2016
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
 
How to Achieve Scale with MongoDB
How to Achieve Scale with MongoDBHow to Achieve Scale with MongoDB
How to Achieve Scale with MongoDB
 
Mongo db eveningschemadesign
Mongo db eveningschemadesignMongo db eveningschemadesign
Mongo db eveningschemadesign
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
 

Recently uploaded

Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
Philip Schwarz
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
ISH Technologies
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
SOCRadar
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
kalichargn70th171
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 

Recently uploaded (20)

Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 

The Fine Art of Schema Design in MongoDB: Dos and Don'ts

  • 1. The Fine Art of Schema Design: Dos and Don'ts Matias Cascallares Senior Solutions Architect, MongoDB Inc. matias@mongodb.com
  • 2. Who am I? • Originally from Buenos Aires, Argentina • Solutions Architect at MongoDB Inc based in Singapore • Software Engineer, most of my experience in web environments • In my toolbox I have Java, Python and Node.js
  • 3. RDBMs • Relational databases are made up of tables • Tables are made up of rows: • All rows have identical structure • Each row has the same number of columns • Every cell in a column stores the same type of data
  • 4. MONGODB IS A DOCUMENT ORIENTED DATABASE
  • 5. Show me a document { "name" : "Matias Cascallares", "title" : "Senior Solutions Architect", "email" : "matias@mongodb.com", "birth_year" : 1981, "location" : [ "Singapore", "Asia"], "phone" : { "type" : "mobile", "number" : "+65 8591 3870" } }
  • 6. Document Model • MongoDB is made up of collections • Collections are composed of documents • Each document is a set of key-value pairs • No predefined schema • Keys are always strings • Values can be any (supported) data type • Values can also be an array • Values can also be a document
  • 8. Flexibility • Each document can have different fields • No need of long migrations, easier to be agile • Common structure enforced at application level
  • 9. Arrays • Documents can have field with array values • Ability to query and index array elements • We can model relationships with no need of different tables or collections
  • 10. Embedded documents • Documents can have field with document values • Ability to query and index nested documents • Semantic closer to Object Oriented Programming
  • 11. Indexing an array of documents
  • 12. Relational Schema Design Document Schema Design
  • 13. Relational Schema Design Focus on data storage Document Schema Design Focus on data usage
  • 14. SCHEMA DESIGN IS AN ART https://www.flickr.com/photos/76377775@N05/11098637655/
  • 17. Requirement #1 "We need to store user information like name, email and their addresses… yes they can have more than one.” — Bill, a project manager, contemporary
  • 18. Relational id name email title 1 Kate Powell kate.powell@somedomain.c om Regional Manager id street city user_id 1 123 Sesame Street Boston 1 2 123 Evergreen Street New York 1
  • 19. Let’s use the document model > db.user.findOne( { email: "kate.powell@somedomain.com"} ) { _id: 1, name: "Kate Powell", email: "kate.powell@somedomain.com", title: "Regional Manager", addresses: [ { street: "123 Sesame St", city: "Boston" }, { street: "123 Evergreen St", city: "New York" } ] }
  • 20. Requirement #2 "We have to be able to store tasks, assign them to users and track their progress…" — Bill, a project manager, contemporary
  • 21. Embedding tasks > db.user.findOne( { email: "kate.powell@somedomain.com"} ) { name: "Kate Powell", // ... previous fields tasks: [ { summary: "Contact sellers", description: "Contact agents to specify our needs and time constraints", due_date: ISODate("2014-08-25T08:37:50.465Z"), status: "NOT_STARTED" }, { // another task } ] }
  • 22. Embedding tasks • Tasks are unbounded items: initially we do not know how many tasks we are going to have • A user along time can end with thousands of tasks • Maximum document size in MongoDB: 16 MB ! • It is harder to access task information without a user context
  • 23. Referencing tasks > db.user.findOne({_id: 1}) { _id: 1, name: "Kate Powell", email: "kate.powell@...", title: "Regional Manager", addresses: [ { // address 1 }, { // address 2 } ] } > db.task.findOne({user_id: 1}) { _id: 5, summary: "Contact sellers", description: "Contact agents to specify our ...", due_date: ISODate(), status: "NOT_STARTED", user_id: 1 }
  • 24. Referencing tasks • Tasks are unbounded items and our schema supports that • Application level joins • Remember to create proper indexes (e.g. user_id)
  • 26. One-to-many relations • Embed when you have a few number of items on ‘many' side • Embed when you have some level of control on the number of items on ‘many' side • Reference when you cannot control the number of items on the 'many' side • Reference when you need to access to ‘many' side items without parent entity scope
  • 27. Many-to-many relations • These can be implemented with two one-to-many relations with the same considerations
  • 28. RECIPE #1 USE EMBEDDING FOR ONE-TO-FEW RELATIONS
  • 29. RECIPE #2 USE REFERENCING FOR ONE-TO-MANY RELATIONS
  • 30. Working with arrays https://www.flickr.com/photos/kishjar/10747531785
  • 32. List of sorted elements > db.numbers.insert({ _id: "even", values: [0, 2, 4, 6, 8] }); > db.numbers.insert({ _id: "odd", values: [1, 3, 5, 7, 9] });
  • 33. Access based on position db.numbers.find({_id: "even"}, {values: {$slice: [2, 3]}}) { _id: "even", values: [4, 6, 8] } db.numbers.find({_id: "odd"}, {values: {$slice: -2}}) { _id: "odd", values: [7, 9] }
  • 34. Access based on values // is number 2 even or odd? > db.numbers.find( { values : 2 } ) { _id: "even", values: [0, 2, 4, 6, 8] }
  • 35. Like sorted sets > db.numbers.find( { _id: "even" } ) { _id: "even", values: [0, 2, 4, 6, 8] } > db.numbers.update( { _id: "even"}, { $addToSet: { values: 10 } } ); Several times…! > db.numbers.find( { _id: "even" } ) { _id: "even", values: [0, 2, 4, 6, 8, 10] }
  • 36. Array update operators • pop • push • pull • pullAll
  • 38. Storage DocA DocB DocC { _id: 1, name: "Nike Pump Air 180", tags: ["sports", "running"] } db.inventory.update( { _id: 1}, { $push: { tags: "shoes" } } )
  • 39. Empty Storage DocA DocB DocC IDX IDX IDX
  • 40. Empty Storage DocA DocC DocB IDX IDX IDX
  • 41. Why is expensive to move a doc? 1. We need to write the document in another location ($$) 2. We need to mark the original position as free for new documents ($) 3. We need to update all those index entries pointing to the moved document to the new location ($$$)
  • 42. Considerations with arrays • Limited number of items • Avoid document movements • Document movements can be delayed with padding factor • Document movements can be mitigated with pre-allocation
  • 43. RECIPE #3 AVOID EMBEDDING LARGE ARRAYS
  • 44. RECIPE #4 USE DATA MODELS THAT MINIMIZE THE NEED FOR DOCUMENT GROWTH
  • 46. Denormalization "…is the process of attempting to optimise the read performance of a database by adding redundant data …” — Wikipedia
  • 47. Products and comments > db.product.find( { _id: 1 } ) { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", "running"] } > db.comment.find( { product_id: 1 } ) { score: 5, user: "user1", text: "Awesome shoes" } { score: 2, user: "user2", text: "Not for me.." }
  • 48. Denormalizing > db.product.find({_id: 1}) { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", “running"], comments: [ { user: "user1", text: "Awesome shoes" }, { user: "user2", text: "Not for me.." } ] } > db.comment.find({product_id: 1}) { score: 5, user: "user1", text: "Awesome shoes" } { score: 2, user: "user2", text: "Not for me.."}
  • 49. RECIPE #5 DENORMALIZE TO AVOID APP-LEVEL JOINS
  • 50. RECIPE #6 DENORMALIZE ONLY WHEN YOU HAVE A HIGH READ TO WRITE RATIO
  • 52. What’s the idea? • Reduce number of documents to be retrieved • Less documents to retrieve means less disk seeks • Using arrays we can store more than one entity per document • We group things that are accessed together
  • 53. An example Comments are showed in buckets of 2 comments A ‘read more’ button loads next 2 comments
  • 54. Bucketing comments > db.comments.find({post_id: 123}) .sort({sequence: -1}) .limit(1) { _id: 1, post_id: 123, sequence: 8, // this acts as a page number comments: [ {user: user1@somedomain.com, text: "Awesome shoes.."}, {user: user2@somedomain.com, text: "Not for me..”} ] // we store two comments per doc, fixed size bucket }
  • 55. RECIPE #7 USE BUCKETING TO STORE THINGS THAT ARE GOING TO BE ACCESSED AS A GROUP