SlideShare a Scribd company logo
MongoDB
PRESENTED BY
Jörg Reichert
Licensed under cc-by v3.0 (any jurisdiction)
Introduction
● Name derived from humongous (= gigantic)
● NoSQL (= not only SQL) database
● Document oriented database
– documents stored as binary JSON (BSON)
● Ad-hoc queries
● Server side Javascript execution
● Aggregation / MapReduce
● High performance, availability, scalability
MongoDB
Relational vs. document based: concepts
SQL
Person
Name AddressId
MongoDB
1
2
Mueller 1
Id
Address
City Street
1
2
<null> 2
Leipzig Burgstr. 1
Dresden <null>
Person
{
_id: ObjectId(“...“),
Name: “Mueller“,
Address: {
City: “Leipzig“,
Street: “Burgstr. 1“,
},
}, {
_id: ObjectId(“...“),
Address: {
City: “Leipzig“,
},
}
DB DB
Table CollectionColumn
Row
Document
Key: Value
FieldPK
FK
Relation
Embedded document
PK
PK: primary key, FK: foreign key
MongoDB
SELECT * FROM Person;
SELECT * FROM Person
WHERE name = “Mueller“;
SELECT * FROM Person
WHERE name like “M%“;
SELECT name FROM Person;
SELECT distinct(name)
FROM Person
WHERE name = “Mueller“;
Relational vs. document based: syntax (1/3)
db.getCollection(“Person“).find();
db.Person.find({ “name“: "Mueller“ });
db.Person.find({ “name“: /M.*/ });
db.Person.find({}, {name: 1, _id: 0});
db.Person.distinct(
“name“, { “name“: "Mueller“ });
MongoDB
SELECT * FROM Person
WHERE id > 10
AND name <> “Mueller“;
SELECT p.name FROM Person p
JOIN Address a
ON p.address = a.id
WHERE a.city = “Leipzig“
ORDER BY p.name DESC;
SELECT * FROM
WHERE name IS NOT NULL;
SELECT COUNT(*) FROM PERSON
WHERE name = “Mueller“;
Relational vs. document based: syntax (2/3)
db.Person.find({ $and: [
{ _id: { $gt: ObjectId("...") }},
{ name: { $ne: "Mueller" }}]});
db.Person.find(
{ Address.city: “Leipzig“ },
{ name: 1, _id: 0 }
).sort({ name: -1 });
db.Person.find( { name: {
$not: { $type: 10 }, $exists: true }});
db.Person.count({ name: “Mueller“ });
db.Person.find(
{ name: “Mueller“ }).count();
MongoDB
UPDATE Person
SET name = “Müller“
WHERE name = “Mueller“;
DELETE Person
WHERE name = “Mueller“;
INSERT Person (name, address)
VALUES (“Mueller“, 3);
ALTER TABLE PERSON
DROP COLUMN name;
DROP TABLE PERSON;
Relational vs. document based: syntax (3/3)
db.Person.updateMany(
{ name: “Mueller“ },
{ $set: { name: “Müller“} });
db.Person.remove( { name: “Mueller“ } );
db.Person.insert(
{ name: “Mueller“, Address: { … } });
db.Person.updateMany( {},
{ $unset: { name: 1 }} );
db.Person.drop();
MongoDB
● principle of least cardinality
● Store what you query for
schema design principles
MongoDB
● applicable for 1:1 and 1:n when
n can‘t get to large
● Embedded document cannot get
too large
● Embedded document not very
likely to change
● arrays that grow without bound
should never be embedded
schema design: embedded document
{
_id: ObjectId(“...“),
City: “Leipzig“,
Street: “Burgstr. 1“,
Person: [
{
Name: “Mueller“,
},
{
Name: “Schneider“,
},
]
}
Address
MongoDB
● applicable for :n when n can‘t
get to large
● Referenced document likely to
change often in future
● there are many referenced
documents expected, so storing
only the reference is cheaper
● there are large referenced
documents expected, so storing
only the reference is cheaper
● arrays that grow without bound
should never be embedded
● Address should be accessible on
its own
schema design: referencing
{
_id: ObjectId(“...“),
City: “Leipzig“,
Street: “Burgstr. 1“,
Person: [
ObjectId(“...“), ObjectId(“...“),
]
}
{
_id: ObjectId(“...“),
Name: “Mueller“,
}
Address
Person
MongoDB
● applicable for :n relations when
n can get very large (note: a
MongoDB document isn‘t
allowed to exceed 16MB)
● Joins are done on application
level
schema design: parent-referencing
{
_id: ObjectId(“...“),
City: “Dubai“,
Street: “1 Sheikh Mohammed
bin Rashid Blvd“,
}
{
_id: ObjectId(“...“),
Name: “Mueller“,
Address: ObjectId(“...“),
}
Address
Person
MongoDB
● applicable for m:n when n and m
can‘t get to large and application
requires to navigate both ends
● disadvantage: need to update
operations when changing
references
schema design: two way referencing
{
_id: ObjectId(“...“),
City: “Leipzig“,
Street: “Burgstr. 1“,
Person: [
ObjectId(“...“), ObjectId(“...“),
]
}
{
_id: ObjectId(“...“),
Name: “Mueller“,
Address: [
ObjectId(“...“), ObjectId(“...“),
]
}
Address
Person
MongoDB
● queries expected to filter by
certain fields of the referenced
document, so including this field
already in the hosts saves an
additional query at application
level
● disadvantage: two update
operations for duplicated field
● disadvantage: additional
memory consumption
schema design: denormalization
{
_id: ObjectId(“...“),
City: “Leipzig“,
Street: “Burgstr. 1“,
}
{
_id: ObjectId(“...“),
Name: “Mueller“,
Address: [
{
id: ObjectId(“...“),
city: “Leipzig“,
}, ...
]
}
Address
Person
MongoDB
● applicable for :n relations when
n can get very large and it‘s
expected that application will
use pagination anyway
● DB schema will already create
the chunks, the application will
later query for
schema design: bucketing
{
_id: ObjectId(“...“),
City: “Leipzig“,
Street: “Burgstr. 1“,
}
{
_id: ObjectId(“...“),
Address: ObjectId(“...“),
Page: 13,
Count: 50,
Persons: [
{ Name: “Mueller“ }, ...
]
}
Address
Person
MongoDB
Aggregation Framework
● Aggregation pipeline consisting of (processing) stages
– $match, $group, $project, $redact, $unwind, $lookup, $sort, ...
● Aggregation operators
– Boolean: $and, $or, $not
– Aggregation: $eq, $lt, $lte, $gt, $gte, $ne, $cmp
– Arithmetic: $add, $substract, $multiply, $divide, ...
– String: $concat, $substr, …
– Array: $size, $arrayElemAt, ...
– Aggregation variable: $map, $let
– Group Accumulator: $sum, $avg, $addToSet, $push, $min, $max
$first, $last, …
– ...
MongoDB
Aggregation Framework
db.Person.aggregate( [
{ $match: { name: { $ne: "Fischer" } } },
{ $group: {
_id: "$name",
city_occurs: { $addToSet: "$Address.city" }
} },
{ $project: {
_id: "$_id",
city_count: { $size: "$city_occurs" }
}},
{ $sort: { name: 1 } }
{ $match: { city_count: { $gt: 1 } }},
{ $out: "PersonCityCount"}
] );
PersonCityCount
{
_id: Mueller,
city_count: 2,
},
{
_id: Schmidt,
city_count: 3,
}, ...
MongoDB
Map-Reduce
● More control than aggregation framework, but slower
var map = function() {
if(this.name != "Fischer") emit(this.name, this.Address.city);
}
var reduce = function(key, values) {
var distinct = [];
for(value in values) {
if(distinct.indexOf(value) == -1) distinct.push(value);
}
return distinct.length;
}
db.Person.mapReduce(map, reduce,
{
out: "PersonCityCount2"
});
MongoDB
● Default _id index, assuring uniqueness
● Single field index: db.Person.createIndex( { name: 1 } );
● Compound index: db.Address.createIndex( { city: 1, street: -1 } );
– index sorts first asc. by city then desc. by street
– Index will also used when query only filters by one of the fields
● Multikey index: db.Person.createIndex( { Address.city: 1 } )
– Indexes content stored in arrays, an index entry is created foreach
● Geospatial index
● Text index
● Hashed index
Indexes
MongoDB
● uniqueness: insertion of duplicate field value will be rejected
● partial index: indexes only documents matching certain filter criteria
● sparse index: indexes only documents having the indexed field
● TTL index: automatically removes documents after certain time
● Query optimization: use db.MyCollection.find({ … }).explain() to check
whether query is answered using an index, and how many documents had
still to be scanned
● Covered queries: if a query only contains indexed fields, the results will
delivered directly from index without scanning or materializing any
documents
● Index intersection: can apply different indexes to cover query parts
Index properties
MongoDB
● Since MongoDB 3.0 WiredTiger is the default storage engine
– locking at document level enables concurrent writes on collection
– durability ensured via write-ahead transaction log and checkpoints (
Journaling)
– supports compression of collections and indexes (via snappy or zlib)
● MMAPv1 was the default storage until MongoDB 3.0
– since MongoDB 3.0 supports locking at collection level, before only
database level
– useful for selective updates, as WiredTiger always replace the hole
document in a update operation
Storage engines
MongoDB
Clustering, Sharding, Replication
Shard 1
Primary
(mongod)
Secondary
(mongod)
Secondary
(mongod)
Config server
(replica set)
App server
(mongos)
Client app
(driver)
Heartbeat
Replication Replication
writes
reads
MongoDB
Shard key selection
Shard 1 Shard 2 Shard 3
{
key: 12,
...
}
{
key: 21,
...
}
{
key: 35,
...
}
min <= key < 15 15 <= key < 30 30 <= key < max
Sharded Collection
(Hash function)
MongoDB
● ACID → MongoDB is compliant to this only at document level
– Atomicity
– Consistency
– Isolation
– Durability
● CAP → MongoDB assures CP
– Consistency
– Availability
– Partition tolerance
transactions
BASE:
Basically Available, Soft state,
Eventual consistency
MongoDB doesn't support transactions
multi document updates can be
performed via Two-Phase-Commit
MongoDB
● Javascript: Mongo Node.js driver
● Java: Java MongoDB Driver
● Python: PyMongo, Motor (async)
● Ruby: MongoDB Ruby Driver
● C#: Mongo Csharp Driver
● ...
Driver
Object-document mappers
● Javascript: mongoose, Camo, MEAN.JS
● Java: Morphia, SpringData MongoDB
● Python: Django MongoDB engine
● Ruby: MongoMapper, Mongoid
● C#: LinQ
● ...
MongoDB
● CKAN
● MongoDB-Hadoop connector
● MongoDB Spark connector
● MongoDB ElasticSearch/Solr connector
● ...
Extensions and connectors
Tool support
● Robomongo
● MongoExpress
● ...
MongoDB
● Who uses MongoDB
● Case studies
● Arctic TimeSeries and Tick store
● uptime
Real world examples
MongoDB in Code For Germany projects
● Politik bei uns (Offenes Ratsinformationssystem), gescrapte Stadtratsdaten
werden gemäß dem OParl-Format in einer MongoDB gespeichert, siehe
auch Daten, Web-API und Oparl-Client
MongoDB
●
Choose
– mass data processing, like event data
– dynamic scheme
●
Not to choose
– static scheme with lot of relations
– strict transaction requirements
When to choose, when not to choose
MongoDB
●
MongoDB Schema Simulation
●
6 Rules of Thumb for MongoDB Schema Design
●
MongoDB Aggregation
●
MongoDB Indexes
●
Sharding
●
MongoDB University
●
Why Relational Databases are not the Cure-All
Links

More Related Content

What's hot

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
NodeXperts
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB Compass
MongoDB
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
Mike Dirolf
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
MongoDB
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
MongoDB
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsSteven Francia
 
Common MongoDB Use Cases
Common MongoDB Use CasesCommon MongoDB Use Cases
Common MongoDB Use CasesDATAVERSITY
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Design
MongoDB
 
Sharding
ShardingSharding
Sharding
MongoDB
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
Bishal Khanal
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
Mongo DB
Mongo DB Mongo DB
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
MongoDB
 
Introduction of sql server indexing
Introduction of sql server indexingIntroduction of sql server indexing
Introduction of sql server indexing
Mahabubur Rahaman
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
Puneet Behl
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
MongoDB
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
Mike Dirolf
 
Mastering the MongoDB Shell
Mastering the MongoDB ShellMastering the MongoDB Shell
Mastering the MongoDB Shell
MongoDB
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLMorgan Tocker
 

What's hot (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB Compass
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data Presentation
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
 
Common MongoDB Use Cases
Common MongoDB Use CasesCommon MongoDB Use Cases
Common MongoDB Use Cases
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Design
 
Sharding
ShardingSharding
Sharding
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
 
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
 
Introduction of sql server indexing
Introduction of sql server indexingIntroduction of sql server indexing
Introduction of sql server indexing
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
 
Mastering the MongoDB Shell
Mastering the MongoDB ShellMastering the MongoDB Shell
Mastering the MongoDB Shell
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 

Similar to Mongo DB schema design patterns

MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseRuben Inoto Soto
 
Storage dei dati con MongoDB
Storage dei dati con MongoDBStorage dei dati con MongoDB
Storage dei dati con MongoDB
Andrea Balducci
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
MongoDB
MongoDBMongoDB
MongoDB
wiTTyMinds1
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDB
ElieHannouch
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
Norberto Leite
 
Mongo DB
Mongo DBMongo DB
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
Mike Bright
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
Michael Bright
 
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
NETWAYS
 
Mongodb Introduction
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
Raghvendra Parashar
 
MongoDB and Play! Framework workshop
MongoDB and Play! Framework workshopMongoDB and Play! Framework workshop
MongoDB and Play! Framework workshop
João Vazão Vasques
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentationMurat Çakal
 
MongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and ProfilingMongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and Profiling
Manish Kapoor
 
introtomongodb
introtomongodbintrotomongodb
introtomongodbsaikiran
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
MongoDB
 
Postgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDBPostgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDB
Mason Sharp
 
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
OpenExpoES
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
César Trigo
 
MongoDB FabLab León
MongoDB FabLab LeónMongoDB FabLab León
MongoDB FabLab León
Juan Antonio Roy Couto
 

Similar to Mongo DB schema design patterns (20)

MongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL DatabaseMongoDB - A Document NoSQL Database
MongoDB - A Document NoSQL Database
 
Storage dei dati con MongoDB
Storage dei dati con MongoDBStorage dei dati con MongoDB
Storage dei dati con MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
MongoDB
MongoDBMongoDB
MongoDB
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDB
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
 
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
 
Mongodb Introduction
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
 
MongoDB and Play! Framework workshop
MongoDB and Play! Framework workshopMongoDB and Play! Framework workshop
MongoDB and Play! Framework workshop
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentation
 
MongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and ProfilingMongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and Profiling
 
introtomongodb
introtomongodbintrotomongodb
introtomongodb
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
Postgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDBPostgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDB
 
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
MongoDB FabLab León
MongoDB FabLab LeónMongoDB FabLab León
MongoDB FabLab León
 

More from joergreichert

OKLab Leipzig - 2023 Update
OKLab Leipzig - 2023 UpdateOKLab Leipzig - 2023 Update
OKLab Leipzig - 2023 Update
joergreichert
 
SDGs und wo sind die Daten?
SDGs und wo sind die Daten?SDGs und wo sind die Daten?
SDGs und wo sind die Daten?
joergreichert
 
Gieß a bit more the Bäume
Gieß a bit more the BäumeGieß a bit more the Bäume
Gieß a bit more the Bäume
joergreichert
 
OKLab Leipzig 2022
OKLab Leipzig 2022OKLab Leipzig 2022
OKLab Leipzig 2022
joergreichert
 
FAIRe Sensordaten
FAIRe SensordatenFAIRe Sensordaten
FAIRe Sensordaten
joergreichert
 
OKLab Leipzig 2021
OKLab Leipzig 2021OKLab Leipzig 2021
OKLab Leipzig 2021
joergreichert
 
Leipzig Giesst (Dezember 2020)
Leipzig Giesst (Dezember 2020)Leipzig Giesst (Dezember 2020)
Leipzig Giesst (Dezember 2020)
joergreichert
 
Road to mauAR
Road to mauARRoad to mauAR
Road to mauAR
joergreichert
 
OKLab Leipzig - Schwerpunkt Mobilität
OKLab Leipzig - Schwerpunkt MobilitätOKLab Leipzig - Schwerpunkt Mobilität
OKLab Leipzig - Schwerpunkt Mobilität
joergreichert
 
Die Stadt als Schule der Demokratie
Die Stadt als Schule der DemokratieDie Stadt als Schule der Demokratie
Die Stadt als Schule der Demokratie
joergreichert
 
OKLab Leipzig (2019 Update)
OKLab Leipzig (2019 Update)OKLab Leipzig (2019 Update)
OKLab Leipzig (2019 Update)
joergreichert
 
A Pattern Language - Patterns for Javascript
A Pattern Language - Patterns for JavascriptA Pattern Language - Patterns for Javascript
A Pattern Language - Patterns for Javascript
joergreichert
 
Unit testing mit Javascript
Unit testing mit JavascriptUnit testing mit Javascript
Unit testing mit Javascript
joergreichert
 
damals.in/leipzig
damals.in/leipzigdamals.in/leipzig
damals.in/leipzig
joergreichert
 
OkLab Leipzig (2018 Update)
OkLab Leipzig (2018 Update)OkLab Leipzig (2018 Update)
OkLab Leipzig (2018 Update)
joergreichert
 
Map technologies
Map technologiesMap technologies
Map technologies
joergreichert
 
OkLab Leipzig (state: 2017)
OkLab Leipzig (state: 2017)OkLab Leipzig (state: 2017)
OkLab Leipzig (state: 2017)
joergreichert
 
MOOCs
MOOCsMOOCs
Log4j2
Log4j2Log4j2
Using openArchitectureWare 4.0 in domain "registration"
Using openArchitectureWare 4.0 in domain "registration"Using openArchitectureWare 4.0 in domain "registration"
Using openArchitectureWare 4.0 in domain "registration"
joergreichert
 

More from joergreichert (20)

OKLab Leipzig - 2023 Update
OKLab Leipzig - 2023 UpdateOKLab Leipzig - 2023 Update
OKLab Leipzig - 2023 Update
 
SDGs und wo sind die Daten?
SDGs und wo sind die Daten?SDGs und wo sind die Daten?
SDGs und wo sind die Daten?
 
Gieß a bit more the Bäume
Gieß a bit more the BäumeGieß a bit more the Bäume
Gieß a bit more the Bäume
 
OKLab Leipzig 2022
OKLab Leipzig 2022OKLab Leipzig 2022
OKLab Leipzig 2022
 
FAIRe Sensordaten
FAIRe SensordatenFAIRe Sensordaten
FAIRe Sensordaten
 
OKLab Leipzig 2021
OKLab Leipzig 2021OKLab Leipzig 2021
OKLab Leipzig 2021
 
Leipzig Giesst (Dezember 2020)
Leipzig Giesst (Dezember 2020)Leipzig Giesst (Dezember 2020)
Leipzig Giesst (Dezember 2020)
 
Road to mauAR
Road to mauARRoad to mauAR
Road to mauAR
 
OKLab Leipzig - Schwerpunkt Mobilität
OKLab Leipzig - Schwerpunkt MobilitätOKLab Leipzig - Schwerpunkt Mobilität
OKLab Leipzig - Schwerpunkt Mobilität
 
Die Stadt als Schule der Demokratie
Die Stadt als Schule der DemokratieDie Stadt als Schule der Demokratie
Die Stadt als Schule der Demokratie
 
OKLab Leipzig (2019 Update)
OKLab Leipzig (2019 Update)OKLab Leipzig (2019 Update)
OKLab Leipzig (2019 Update)
 
A Pattern Language - Patterns for Javascript
A Pattern Language - Patterns for JavascriptA Pattern Language - Patterns for Javascript
A Pattern Language - Patterns for Javascript
 
Unit testing mit Javascript
Unit testing mit JavascriptUnit testing mit Javascript
Unit testing mit Javascript
 
damals.in/leipzig
damals.in/leipzigdamals.in/leipzig
damals.in/leipzig
 
OkLab Leipzig (2018 Update)
OkLab Leipzig (2018 Update)OkLab Leipzig (2018 Update)
OkLab Leipzig (2018 Update)
 
Map technologies
Map technologiesMap technologies
Map technologies
 
OkLab Leipzig (state: 2017)
OkLab Leipzig (state: 2017)OkLab Leipzig (state: 2017)
OkLab Leipzig (state: 2017)
 
MOOCs
MOOCsMOOCs
MOOCs
 
Log4j2
Log4j2Log4j2
Log4j2
 
Using openArchitectureWare 4.0 in domain "registration"
Using openArchitectureWare 4.0 in domain "registration"Using openArchitectureWare 4.0 in domain "registration"
Using openArchitectureWare 4.0 in domain "registration"
 

Recently uploaded

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
abdulrafaychaudhry
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
Hornet Dynamics
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 

Recently uploaded (20)

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 

Mongo DB schema design patterns

  • 1. MongoDB PRESENTED BY Jörg Reichert Licensed under cc-by v3.0 (any jurisdiction)
  • 2. Introduction ● Name derived from humongous (= gigantic) ● NoSQL (= not only SQL) database ● Document oriented database – documents stored as binary JSON (BSON) ● Ad-hoc queries ● Server side Javascript execution ● Aggregation / MapReduce ● High performance, availability, scalability
  • 3. MongoDB Relational vs. document based: concepts SQL Person Name AddressId MongoDB 1 2 Mueller 1 Id Address City Street 1 2 <null> 2 Leipzig Burgstr. 1 Dresden <null> Person { _id: ObjectId(“...“), Name: “Mueller“, Address: { City: “Leipzig“, Street: “Burgstr. 1“, }, }, { _id: ObjectId(“...“), Address: { City: “Leipzig“, }, } DB DB Table CollectionColumn Row Document Key: Value FieldPK FK Relation Embedded document PK PK: primary key, FK: foreign key
  • 4. MongoDB SELECT * FROM Person; SELECT * FROM Person WHERE name = “Mueller“; SELECT * FROM Person WHERE name like “M%“; SELECT name FROM Person; SELECT distinct(name) FROM Person WHERE name = “Mueller“; Relational vs. document based: syntax (1/3) db.getCollection(“Person“).find(); db.Person.find({ “name“: "Mueller“ }); db.Person.find({ “name“: /M.*/ }); db.Person.find({}, {name: 1, _id: 0}); db.Person.distinct( “name“, { “name“: "Mueller“ });
  • 5. MongoDB SELECT * FROM Person WHERE id > 10 AND name <> “Mueller“; SELECT p.name FROM Person p JOIN Address a ON p.address = a.id WHERE a.city = “Leipzig“ ORDER BY p.name DESC; SELECT * FROM WHERE name IS NOT NULL; SELECT COUNT(*) FROM PERSON WHERE name = “Mueller“; Relational vs. document based: syntax (2/3) db.Person.find({ $and: [ { _id: { $gt: ObjectId("...") }}, { name: { $ne: "Mueller" }}]}); db.Person.find( { Address.city: “Leipzig“ }, { name: 1, _id: 0 } ).sort({ name: -1 }); db.Person.find( { name: { $not: { $type: 10 }, $exists: true }}); db.Person.count({ name: “Mueller“ }); db.Person.find( { name: “Mueller“ }).count();
  • 6. MongoDB UPDATE Person SET name = “Müller“ WHERE name = “Mueller“; DELETE Person WHERE name = “Mueller“; INSERT Person (name, address) VALUES (“Mueller“, 3); ALTER TABLE PERSON DROP COLUMN name; DROP TABLE PERSON; Relational vs. document based: syntax (3/3) db.Person.updateMany( { name: “Mueller“ }, { $set: { name: “Müller“} }); db.Person.remove( { name: “Mueller“ } ); db.Person.insert( { name: “Mueller“, Address: { … } }); db.Person.updateMany( {}, { $unset: { name: 1 }} ); db.Person.drop();
  • 7. MongoDB ● principle of least cardinality ● Store what you query for schema design principles
  • 8. MongoDB ● applicable for 1:1 and 1:n when n can‘t get to large ● Embedded document cannot get too large ● Embedded document not very likely to change ● arrays that grow without bound should never be embedded schema design: embedded document { _id: ObjectId(“...“), City: “Leipzig“, Street: “Burgstr. 1“, Person: [ { Name: “Mueller“, }, { Name: “Schneider“, }, ] } Address
  • 9. MongoDB ● applicable for :n when n can‘t get to large ● Referenced document likely to change often in future ● there are many referenced documents expected, so storing only the reference is cheaper ● there are large referenced documents expected, so storing only the reference is cheaper ● arrays that grow without bound should never be embedded ● Address should be accessible on its own schema design: referencing { _id: ObjectId(“...“), City: “Leipzig“, Street: “Burgstr. 1“, Person: [ ObjectId(“...“), ObjectId(“...“), ] } { _id: ObjectId(“...“), Name: “Mueller“, } Address Person
  • 10. MongoDB ● applicable for :n relations when n can get very large (note: a MongoDB document isn‘t allowed to exceed 16MB) ● Joins are done on application level schema design: parent-referencing { _id: ObjectId(“...“), City: “Dubai“, Street: “1 Sheikh Mohammed bin Rashid Blvd“, } { _id: ObjectId(“...“), Name: “Mueller“, Address: ObjectId(“...“), } Address Person
  • 11. MongoDB ● applicable for m:n when n and m can‘t get to large and application requires to navigate both ends ● disadvantage: need to update operations when changing references schema design: two way referencing { _id: ObjectId(“...“), City: “Leipzig“, Street: “Burgstr. 1“, Person: [ ObjectId(“...“), ObjectId(“...“), ] } { _id: ObjectId(“...“), Name: “Mueller“, Address: [ ObjectId(“...“), ObjectId(“...“), ] } Address Person
  • 12. MongoDB ● queries expected to filter by certain fields of the referenced document, so including this field already in the hosts saves an additional query at application level ● disadvantage: two update operations for duplicated field ● disadvantage: additional memory consumption schema design: denormalization { _id: ObjectId(“...“), City: “Leipzig“, Street: “Burgstr. 1“, } { _id: ObjectId(“...“), Name: “Mueller“, Address: [ { id: ObjectId(“...“), city: “Leipzig“, }, ... ] } Address Person
  • 13. MongoDB ● applicable for :n relations when n can get very large and it‘s expected that application will use pagination anyway ● DB schema will already create the chunks, the application will later query for schema design: bucketing { _id: ObjectId(“...“), City: “Leipzig“, Street: “Burgstr. 1“, } { _id: ObjectId(“...“), Address: ObjectId(“...“), Page: 13, Count: 50, Persons: [ { Name: “Mueller“ }, ... ] } Address Person
  • 14. MongoDB Aggregation Framework ● Aggregation pipeline consisting of (processing) stages – $match, $group, $project, $redact, $unwind, $lookup, $sort, ... ● Aggregation operators – Boolean: $and, $or, $not – Aggregation: $eq, $lt, $lte, $gt, $gte, $ne, $cmp – Arithmetic: $add, $substract, $multiply, $divide, ... – String: $concat, $substr, … – Array: $size, $arrayElemAt, ... – Aggregation variable: $map, $let – Group Accumulator: $sum, $avg, $addToSet, $push, $min, $max $first, $last, … – ...
  • 15. MongoDB Aggregation Framework db.Person.aggregate( [ { $match: { name: { $ne: "Fischer" } } }, { $group: { _id: "$name", city_occurs: { $addToSet: "$Address.city" } } }, { $project: { _id: "$_id", city_count: { $size: "$city_occurs" } }}, { $sort: { name: 1 } } { $match: { city_count: { $gt: 1 } }}, { $out: "PersonCityCount"} ] ); PersonCityCount { _id: Mueller, city_count: 2, }, { _id: Schmidt, city_count: 3, }, ...
  • 16. MongoDB Map-Reduce ● More control than aggregation framework, but slower var map = function() { if(this.name != "Fischer") emit(this.name, this.Address.city); } var reduce = function(key, values) { var distinct = []; for(value in values) { if(distinct.indexOf(value) == -1) distinct.push(value); } return distinct.length; } db.Person.mapReduce(map, reduce, { out: "PersonCityCount2" });
  • 17. MongoDB ● Default _id index, assuring uniqueness ● Single field index: db.Person.createIndex( { name: 1 } ); ● Compound index: db.Address.createIndex( { city: 1, street: -1 } ); – index sorts first asc. by city then desc. by street – Index will also used when query only filters by one of the fields ● Multikey index: db.Person.createIndex( { Address.city: 1 } ) – Indexes content stored in arrays, an index entry is created foreach ● Geospatial index ● Text index ● Hashed index Indexes
  • 18. MongoDB ● uniqueness: insertion of duplicate field value will be rejected ● partial index: indexes only documents matching certain filter criteria ● sparse index: indexes only documents having the indexed field ● TTL index: automatically removes documents after certain time ● Query optimization: use db.MyCollection.find({ … }).explain() to check whether query is answered using an index, and how many documents had still to be scanned ● Covered queries: if a query only contains indexed fields, the results will delivered directly from index without scanning or materializing any documents ● Index intersection: can apply different indexes to cover query parts Index properties
  • 19. MongoDB ● Since MongoDB 3.0 WiredTiger is the default storage engine – locking at document level enables concurrent writes on collection – durability ensured via write-ahead transaction log and checkpoints ( Journaling) – supports compression of collections and indexes (via snappy or zlib) ● MMAPv1 was the default storage until MongoDB 3.0 – since MongoDB 3.0 supports locking at collection level, before only database level – useful for selective updates, as WiredTiger always replace the hole document in a update operation Storage engines
  • 20. MongoDB Clustering, Sharding, Replication Shard 1 Primary (mongod) Secondary (mongod) Secondary (mongod) Config server (replica set) App server (mongos) Client app (driver) Heartbeat Replication Replication writes reads
  • 21. MongoDB Shard key selection Shard 1 Shard 2 Shard 3 { key: 12, ... } { key: 21, ... } { key: 35, ... } min <= key < 15 15 <= key < 30 30 <= key < max Sharded Collection (Hash function)
  • 22. MongoDB ● ACID → MongoDB is compliant to this only at document level – Atomicity – Consistency – Isolation – Durability ● CAP → MongoDB assures CP – Consistency – Availability – Partition tolerance transactions BASE: Basically Available, Soft state, Eventual consistency MongoDB doesn't support transactions multi document updates can be performed via Two-Phase-Commit
  • 23. MongoDB ● Javascript: Mongo Node.js driver ● Java: Java MongoDB Driver ● Python: PyMongo, Motor (async) ● Ruby: MongoDB Ruby Driver ● C#: Mongo Csharp Driver ● ... Driver Object-document mappers ● Javascript: mongoose, Camo, MEAN.JS ● Java: Morphia, SpringData MongoDB ● Python: Django MongoDB engine ● Ruby: MongoMapper, Mongoid ● C#: LinQ ● ...
  • 24. MongoDB ● CKAN ● MongoDB-Hadoop connector ● MongoDB Spark connector ● MongoDB ElasticSearch/Solr connector ● ... Extensions and connectors Tool support ● Robomongo ● MongoExpress ● ...
  • 25. MongoDB ● Who uses MongoDB ● Case studies ● Arctic TimeSeries and Tick store ● uptime Real world examples MongoDB in Code For Germany projects ● Politik bei uns (Offenes Ratsinformationssystem), gescrapte Stadtratsdaten werden gemäß dem OParl-Format in einer MongoDB gespeichert, siehe auch Daten, Web-API und Oparl-Client
  • 26. MongoDB ● Choose – mass data processing, like event data – dynamic scheme ● Not to choose – static scheme with lot of relations – strict transaction requirements When to choose, when not to choose
  • 27. MongoDB ● MongoDB Schema Simulation ● 6 Rules of Thumb for MongoDB Schema Design ● MongoDB Aggregation ● MongoDB Indexes ● Sharding ● MongoDB University ● Why Relational Databases are not the Cure-All Links