MongoDB
A NoSQL Document Oriented Database
Agenda
● RelationalDBs
● NoSQL
– What, Why
– History
– Features
– Types
● MongoDB
– Indexes
– Replication
– Sharding
– Querying
– Mapping
– MapReduce
● Use Case: RealNetworks
Relational DBs
● Born in the 70s
– storage was expensive
– schemas were simple
● Based on Relational Model
– Mathematical model for describing data structure
– Data represented in „tuples“, grouped into „relations“
● Queries based on Relational Algebra
– union, intersection, difference, cartesian product, selection,
projection, join, division
● Constraints
– Foreign Keys, Primary Keys, Indexes
– Domain Integrity (DataTypes)
Joins
Relational DBs
● Normalization
– minimize redundancy
– avoid duplication
Normalization
Relational DBs - Transactions
● Atomicity
– If one part of the transaction fails, the whole transaction fails
● Consistency
– Transaction leaves the DB in a valid state
● Isolation
– One transaction doesn't see an intermediate state of the other
● Durability
– Transaction gets persisted
Relational DBs - Use
NoSQL – Why?
● Web2.0
– Huge DataVolumes
– Need for Speed
– Accessibility
● RDBMS are difficult to scale
● Storage gets cheap
● Commodity machines get cheap
NoSQL – What?
● Simple storage of data
● Looser consistency model (eventual consistency), in
order to achieve:
– higher availability
– horizontal scaling
● No JOINs
● Optimized for big data, when no relational features are
needed
Vertical Scale vs. Horizontal Scale
● Horizontal scaling enforces parallel computing
Eventual Consistency
● RDBMS: all users see a consistent view
of the data
● ACID gets difficult when distributing
data across nodes
● Eventual Consistency: inconsistencies
are transitory. The DB may have some
inconsistencies at a point of time, but will
eventually get consistent.
● BASE (in contrast to ACID): Basically
Available, Soft state, Eventually consistent
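The idea of a transitory inconsistency can be sketched with a toy model. This is only an illustration under simple assumptions (one primary, one replica, replication triggered by hand); the `Node` and `Cluster` classes are hypothetical names and do not correspond to any real database API.

```javascript
// Toy model of eventual consistency: one primary and one lagging replica.
// All names here are hypothetical and exist only for this illustration.
class Node {
  constructor() {
    this.data = new Map();
  }
  read(key) {
    return this.data.get(key);
  }
}

class Cluster {
  constructor() {
    this.primary = new Node();
    this.replica = new Node();
    this.pending = []; // writes accepted by the primary but not yet replicated
  }
  write(key, value) {
    this.primary.data.set(key, value);
    this.pending.push([key, value]);
  }
  // Ship queued writes to the replica; a real system would do this in the background.
  replicate() {
    for (const [key, value] of this.pending) {
      this.replica.data.set(key, value);
    }
    this.pending = [];
  }
}

const cluster = new Cluster();
cluster.write("age", 36);
const staleRead = cluster.replica.read("age"); // replica has not caught up yet
cluster.replicate();
const freshRead = cluster.replica.read("age"); // replica has converged
```

Between `write` and `replicate`, a reader of the replica sees the old state; afterwards both nodes agree, which is the "eventually" in eventual consistency.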
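CAP Theorem
● Consistency: all nodes see the same data at the same time
● Availability: requests always get an immediate response
● Partition tolerance: the system continues to work, even if a part of it breaks
● A distributed system can guarantee at most two of the three at the same time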
NoSQL - History
● Term first used in 1998 by C. Strozzi to name
his relational DB that didn't use SQL
● Term reused in 2009 by E. Evans to name the
distributed DBs that didn't provide ACID
● Some people read it as „Not Only SQL“
● Should actually be called „NoRel“ (not
relational)
NoSQL – Some Features
● Auto-Sharding
● Replication
● Caching
● Dynamic Schema
NoSQL - Types
● Document
– „Map“ key-value, with a „Document“ (xml, json, pdf, ..) as
value
– MongoDB, CouchDB
● Key-Value
– „Map“ key-value, with an „Object“ (Integer, String, Order, ..)
as value
– Cassandra, Dynamo, Voldemort
● Graph
– Data stored in a graph structure: nodes hold pointers to
adjacent ones
– Neo4J
MongoDB
● OpenSource NoSQL Document DB written in
C++
● First released in 2009
● Commercial Support by 10gen
● From humongous (huge)
● http://www.mongodb.org/
MongoDB – Document Oriented
● No fixed document structure (schemaless)
● Atomicity: only at document level (no
transactions across documents)
● Normalization is not easy to achieve:
– Embed: +duplication, +performance
– Reference: -duplication, +roundtrips
MongoDB
● Insert a document
– > db.users.save( { name: 'ruben', surname: 'inoto', age: 36 } )
● Read it back
– > db.users.find()
– { "_id" : ObjectId("519a3dd65f03c7847ca5f560"), "name" : "ruben", "surname" : "inoto", "age" : 36 }
● Update a single field
– > db.users.update( { name: 'ruben' }, { $set: { age: 24 } } )
● Documents are stored in BSON format
MongoDB - Querying
● find(): Returns a cursor containing a number of documents
– All users
– db.users.find()
– User with id 42
– db.users.find({ _id: 42})
– Age between 20 and 30
– db.users.find( { age: { $gt: 20, $lt: 30 } } )
– Subdocuments: ZIP 5026 (dotted field names must be quoted)
– db.users.find( { "address.zip": "5026" } )
– OR: ruben or younger than 30
– db.users.find({ $or: [
{ name : "ruben" },
{ age: { $lt: 30 } }
]})
– Projection: Deliver only name and age
– db.users.find({ }, { name: 1, age: 1 })
{
"_id": 42,
"name": "ruben",
"surname": "inoto",
"age": 36,
"address": {
"street": "Glaserstraße",
"zip": "5026" }
}
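To make the query documents above more concrete, here is a minimal in-memory sketch of how such a filter could be evaluated against a document. It is not MongoDB's implementation: it only covers equality, `$gt`, `$lt`, `$or` and dotted paths, and the helper names (`getPath`, `matchesCondition`, `matches`) as well as the second and third sample users are made up for illustration.

```javascript
// Resolve a dotted path such as "address.zip" inside a nested document.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Check one field condition: either a literal value or an operator object.
function matchesCondition(value, condition) {
  if (condition !== null && typeof condition === "object") {
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$gt") return value > operand;
      if (op === "$lt") return value < operand;
      throw new Error("operator not covered by this sketch: " + op);
    });
  }
  return value === condition;
}

// A document matches when every top-level clause of the query holds.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$or") return condition.some((sub) => matches(doc, sub));
    return matchesCondition(getPath(doc, key), condition);
  });
}

const users = [
  { _id: 42, name: "ruben", age: 36, address: { zip: "5026" } },
  { _id: 43, name: "maria", age: 25, address: { zip: "1010" } }, // hypothetical
  { _id: 44, name: "hans", age: 40, address: { zip: "1010" } }, // hypothetical
];

const inTwenties = users.filter((u) => matches(u, { age: { $gt: 20, $lt: 30 } }));
const byZip = users.filter((u) => matches(u, { "address.zip": "5026" }));
const rubenOrYoung = users.filter((u) =>
  matches(u, { $or: [{ name: "ruben" }, { age: { $lt: 30 } }] })
);
```

Note that the range comparison only behaves as expected when `age` is stored as a number; a string such as "36" would not fall into a numeric range.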
MongoDB - Saving
● Insert
– db.test.save( { _id: "42", name: "ruben" } )
● Update
– db.test.update( { _id : "42" }, { name : "harald" } )
– db.test.update( { _id : "42" }, { name : "harald", age : 39 } )
● Atomic Operators ($inc)
– db.test.update( { _id : "42" }, { $inc: { age : 1 } } )
● Arrays
– { _id : "48", name : "david", hobbies : [ "bike", "judo" ] }
– Add element to array atomic ($push)
● db.test.update( { _id : "48" }, { $push: { hobbies : "swimming" } } )
– $each, $pop, $pull, $addToSet...
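The difference between a plain update document and the operators above can be sketched in memory. This is an illustration of the semantics only, not the server's code: `applyUpdate` is a hypothetical helper that covers just `$set`, `$inc` and `$push`, and it assumes the legacy `update()` behaviour in which a document without operators replaces everything except `_id`.

```javascript
// Apply an update document to a copy of doc and return the result.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((key) => key.startsWith("$"));
  if (!usesOperators) {
    // Plain document: all fields except _id are replaced.
    return { _id: doc._id, ...update };
  }
  const result = JSON.parse(JSON.stringify(doc)); // simple deep copy
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value;
  }
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount;
  }
  for (const [field, item] of Object.entries(update.$push ?? {})) {
    result[field] = [...(result[field] ?? []), item];
  }
  return result;
}

const replaced = applyUpdate({ _id: "42", name: "ruben", age: 36 }, { name: "harald" });
const incremented = applyUpdate({ _id: "42", name: "harald", age: 39 }, { $inc: { age: 1 } });
const pushed = applyUpdate(
  { _id: "48", name: "david", hobbies: ["bike", "judo"] },
  { $push: { hobbies: "swimming" } }
);
```

In `replaced` the `age` field is gone, which is why operators such as `$set` are usually preferred when only one field should change.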
MongoDB - Delete
● db.test.remove( { _id : "42" } )
MongoDB – Indexes
● Indexes on any attribute
– > db.users.ensureIndex( { 'age' : 1 } )
● Compound indexes
– > db.users.ensureIndex( { 'age' : 1, 'name' : 1 } )
● Unique Indexes
● v2.4 and later: text indexing (search)
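Why an index on `age` speeds up range queries can be sketched with a sorted array and a binary search. This is a simplification for illustration (MongoDB uses tree-based indexes on disk); `buildIndex`, `lowerBound` and `findRange` are hypothetical helpers and the sample users are made up.

```javascript
// Build an ascending index on one numeric field: sorted [key, position] pairs.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => a[0] - b[0]);
}

// Binary search for the first index slot whose key is >= target.
function lowerBound(index, target) {
  let low = 0;
  let high = index.length;
  while (low < high) {
    const mid = (low + high) >> 1;
    if (index[mid][0] < target) low = mid + 1;
    else high = mid;
  }
  return low;
}

// Range scan: documents with min <= field < max, returned in key order.
function findRange(docs, index, min, max) {
  const found = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] < max; i++) {
    found.push(docs[index[i][1]]);
  }
  return found;
}

const users = [
  { name: "ruben", age: 36 },
  { name: "maria", age: 25 },
  { name: "hans", age: 40 },
  { name: "eva", age: 29 },
];
const ageIndex = buildIndex(users, "age");
const twenties = findRange(users, ageIndex, 20, 30).map((u) => u.name);
```

Without the index every document has to be inspected; with it, the scan starts at the first matching key and stops as soon as the range is left.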
SQL → Mongo Mapping (I)
SQL Statement → Mongo Query Language
● CREATE TABLE USERS (a Number, b Number) → implicit
● INSERT INTO USERS VALUES(1,1) → db.users.insert({a:1,b:1})
● SELECT a,b FROM users → db.users.find({}, {a:1,b:1})
● SELECT * FROM users → db.users.find()
● SELECT * FROM users WHERE age=33 → db.users.find({age:33})
● SELECT * FROM users WHERE age=33 ORDER BY name → db.users.find({age:33}).sort({name:1})
SQL → Mongo Mapping (II)
SQL Statement → Mongo Query Language
● SELECT * FROM users WHERE age>33 → db.users.find({age:{$gt:33}})
● CREATE INDEX myindexname ON users(name) → db.users.ensureIndex({name:1})
● SELECT * FROM users WHERE a=1 and b='q' → db.users.find({a:1,b:'q'})
● SELECT * FROM users LIMIT 10 OFFSET 20 → db.users.find().limit(10).skip(20)
● SELECT * FROM users LIMIT 1 → db.users.findOne()
● EXPLAIN PLAN FOR SELECT * FROM users WHERE z=3 → db.users.find({z:3}).explain()
● SELECT DISTINCT last_name FROM users → db.users.distinct('last_name')
● SELECT COUNT(*) FROM users WHERE age > 30 → db.users.find({age: {$gt: 30}}).count()
Embed vs Reference
[Figure: the same data modelled as relational tables]
Document:
user: {
id: "1",
name: "ruben"
}
order: {
id: "a",
user_id: "1",
items: [ {
product_id: "x",
quantity: 10,
price: 300
},
{
product_id: "y",
quantity: 5,
price: 300
}]
}
● referenced: order.user_id points to the user document
● embedded: the items live inside the order document
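The trade-off between the two styles can be sketched with the documents from this slide. The snippet is an in-memory illustration, not driver code: the `users` map stands in for a second collection, and the lookup on it stands in for the extra round trip a reference costs.

```javascript
// Stand-in for the users collection, keyed by id.
const users = new Map([["1", { id: "1", name: "ruben" }]]);

const order = {
  id: "a",
  user_id: "1", // referenced: only the key of the user is stored
  items: [
    // embedded: the items are stored inside the order itself
    { product_id: "x", quantity: 10, price: 300 },
    { product_id: "y", quantity: 5, price: 300 },
  ],
};

// Referenced data needs a second lookup (a second query against a real DB).
const customer = users.get(order.user_id);

// Embedded data is already there once the order has been read.
const orderTotal = order.items.reduce(
  (sum, item) => sum + item.quantity * item.price,
  0
);
```

Embedding keeps everything needed for the total in one read, while the reference avoids copying user data into every order.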
MongoDB – Replication (I)
● Master-slave replication: primary and secondary nodes
● replica set: cluster of mongod instances that replicate amongst one
another and ensure automated failover
WriteConcern
MongoDB – Replication (II)
● adds redundancy
● helps to ensure high availability – automatic
failover
● simplifies backups
WriteConcerns
● Errors Ignored
– even network errors are ignored
● Unacknowledged
– at least network errors are handled
● Acknowledged
– constraints are handled (default)
● Journaled
– persisted to journal log
● Replica ACK
– 1..n
– Or 'majority'
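The "Replica ACK" level can be read as a small counting rule. The sketch below assumes that 'majority' means more than half of the replica set members; `requiredAcks` and `isSatisfied` are hypothetical helper names, not driver functions.

```javascript
// Number of acknowledgements a write must collect for a given level w.
function requiredAcks(w, members) {
  if (w === "majority") return Math.floor(members / 2) + 1;
  return w; // a plain number: 1..n
}

// Has the write collected enough acknowledgements yet?
function isSatisfied(w, members, acksReceived) {
  return acksReceived >= requiredAcks(w, members);
}

const majorityOfThree = requiredAcks("majority", 3);
const majorityOfFive = requiredAcks("majority", 5);
const primaryOnlyOk = isSatisfied(1, 3, 1);
const twoOfFiveOk = isSatisfied("majority", 5, 2);
```

A higher level makes a write slower to confirm but less likely to be lost when the primary fails before the data has been replicated.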
MongoDB – Sharding (I)
● Scale Out
● Distributes data to nodes automatically
● Balances data and load across machines
MongoDB – Sharding (II)
● A sharded Cluster is composed of:
– Shards: hold the data.
● Either one mongod instance (primary daemon process –
handles data requests), or a replica set
– config Servers:
● mongod instance holding cluster metadata
– mongos instances:
● route application calls to the shards
● No single point of failure
MongoDB – Sharding (III)
MongoDB – Sharding (IV)
MongoDB – Sharding (V)
● Collection has a shard key: existing field(s) in
all documents
● Documents get distributed according to ranges
● In a shard, documents are partitioned into
chunks
● Mongo tries to keep all chunks at the same size
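Range-based distribution can be sketched as a lookup in a table of chunks. The chunk boundaries, shard names and helper functions below are made up for illustration; in a real cluster this metadata lives on the config servers and the routing is done by mongos.

```javascript
// Hypothetical chunk table: each chunk owns the shard key range [min, max).
const chunks = [
  { min: -Infinity, max: 30, shard: "shard0" },
  { min: 30, max: 60, shard: "shard1" },
  { min: 60, max: Infinity, shard: "shard0" },
];

// Conceptual routing step: find the chunk whose range contains the key.
function routeToShard(shardKeyValue) {
  const chunk = chunks.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  return chunk.shard;
}

// Count chunks per shard; a balancer would move chunks when these drift apart.
function chunksPerShard() {
  const counts = {};
  for (const { shard } of chunks) {
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

const shardForAge24 = routeToShard(24);
const shardForAge30 = routeToShard(30);
const distribution = chunksPerShard();
```

Here `shard0` holds two chunks and `shard1` one, so a balancer would have little to do; with a larger imbalance it would migrate chunks to the emptier shard.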
MongoDB – Sharding (VI)
● Shard Balancing
– When a shard has too many chunks, mongo moves
chunks to other shards
● Only makes sense with huge amounts of data
Object Mappers
● C#, PHP, Scala, Erlang, Perl, Ruby
● Java
– Morphia
– Spring MongoDB
– mongo-jackson-mapper
– jongo
● ..
Jongo - Example
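import com.mongodb.DB;
import com.mongodb.MongoClient;
import org.jongo.Jongo;
import org.jongo.MongoCollection;

// Connect to a local mongod and wrap the "jongo" database
DB db = new MongoClient().getDB("jongo");
Jongo jongo = new Jongo(db);
MongoCollection users = jongo.getCollection("users");
// Save a plain Java object; Jongo maps it to a document
User user = new User("ruben", "inoto", new Address("Musterstraße", "5026"));
users.save(user);
// Query with a shell-like string and map the result back to a User
User ruben = users.findOne("{name: 'ruben'}").as(User.class);

public class User {
    private String name;
    private String surname;
    private Address address;
    // constructors omitted
}
public class Address {
    private String street;
    private String zip;
    // constructors omitted
}

Resulting document:
{
"_id" : ObjectId("51b0e1c4d78a1c14a26ada9e"),
"name" : "ruben",
"surname" : "inoto",
"address" : {
"street" : "Musterstraße",
"zip" : "5026"
}
}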
TTL (TimeToLive)
● Data with an expiryDate
● After the specified TimeToLive, the data will be
removed from the DB
● Implemented as an Index
● Useful for logs, sessions, ..
db.broadcastMessages.ensureIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
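The expiry rule behind such an index can be sketched as a date comparison. This is an in-memory illustration only: `isExpired` and `purgeExpired` are hypothetical helpers, the sample messages are made up, and the real removal is done by a background task on the server rather than at the exact second of expiry.

```javascript
// A document expires once createdAt + expireAfterSeconds lies in the past.
function isExpired(doc, expireAfterSeconds, now) {
  return doc.createdAt.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

// Keep only the documents that are still alive at the given time.
function purgeExpired(docs, expireAfterSeconds, now) {
  return docs.filter((doc) => !isExpired(doc, expireAfterSeconds, now));
}

const now = new Date("2013-06-01T12:00:00Z");
const messages = [
  { text: "old", createdAt: new Date("2013-06-01T10:30:00Z") },
  { text: "recent", createdAt: new Date("2013-06-01T11:30:00Z") },
];

// Same lifetime as in the index definition above: 3600 seconds.
const remaining = purgeExpired(messages, 3600, now);
```

With a one hour lifetime, the message created at 10:30 is dropped at noon while the one created at 11:30 survives.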
MapReduce
● Programming model for processing large data sets with a
parallel, distributed algorithm.
● Handles complex aggregation tasks
● The problem is split into smaller tasks that are distributed across
nodes
● map phase: selects the data
– Emits key-value pairs
– Values are grouped by key and passed to the reduce function
● reduce phase: transforms the data
– Accepts two arguments: key and values
– Reduces to a single object all the values associated with the key
MapReduce
MapReduce Use Example
● Problem: Compute how much money each
customer has paid across all of their orders
Solution - Relational
select customer_id, sum(price * quantity)
from orders
group by customer_id
order_id customer_id price quantity
a 1 350 2
b 2 100 2
c 1 20 1
customer_id total
1 720
2 200
Solution - Sequential
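import java.util.HashMap;

// Running total per customer, keyed by customer id
var customerTotals = new HashMap<String, Integer>();
for (Order order : orders) {
    int newTotal = order.price * order.quantity;
    if (customerTotals.containsKey(order.customerId)) {
        // Add what this customer has paid so far
        newTotal += customerTotals.get(order.customerId);
    }
    customerTotals.put(order.customerId, newTotal);
}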
[{
order_id: "a",
customer_id: "1",
price: 350,
quantity: 2
},
{
order_id: "b",
customer_id: "2",
price: 100,
quantity: 2
},
{
order_id: "c",
customer_id: "1",
price: 20,
quantity: 1
}]
{ "1": 720 }
{ "2": 200 }
Solution - MapReduce
db.orders.insert([
{
order_id: "a",
customer_id: "1",
price: 350,
quantity: 2
},
{
order_id: "b",
customer_id: "2",
price: 100,
quantity: 2
},
{
order_id: "c",
customer_id: "1",
price: 20,
quantity: 1
}
]);
var mapOrders = function() {
var totalPrice = this.price * this.quantity;
emit(this.customer_id, totalPrice);
};
var reduceOrders = function(customerId, tempTotal) {
return Array.sum(tempTotal);
};
db.orders.mapReduce(
mapOrders,
reduceOrders,
{ out: "map_reduce_orders" }
);
> db.map_reduce_orders.find().pretty();
{ "_id" : "1", "value" : 720 }
{ "_id" : "2", "value" : 200 }
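What the server does with the two functions can be emulated in memory. The sketch below is a single-process stand-in, not MongoDB's implementation: the hypothetical `mapReduce` helper passes `emit` as an argument instead of providing it as a global, and it calls `reduce` for every key, even those with a single value.

```javascript
// Run map over every document, group emitted values by key, then reduce each group.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    map.call(doc, emit); // `this` is the current document, as in the shell
  }
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const orders = [
  { order_id: "a", customer_id: "1", price: 350, quantity: 2 },
  { order_id: "b", customer_id: "2", price: 100, quantity: 2 },
  { order_id: "c", customer_id: "1", price: 20, quantity: 1 },
];

// map phase: emit one (customer, order total) pair per order
const mapOrders = function (emit) {
  emit(this.customer_id, this.price * this.quantity);
};

// reduce phase: sum all order totals of one customer
const reduceOrders = function (customerId, totals) {
  return totals.reduce((sum, total) => sum + total, 0);
};

const totals = mapReduce(orders, mapOrders, reduceOrders);
```

The grouping step in the middle is the part that a distributed system performs across nodes; the map and reduce functions themselves stay the same.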
MapReduce
Who is using Mongo?
● Craigslist
● SourceForge
● Disney
● TheGuardian
● Forbes
● CERN
● ….
„Real“ Use Case – Android Notifications
● App to send „notifications“ (messages) to devices
with an installed RealNetworks application (Music,
RBT)
● Scala, Scalatra, Lift, Jersey, Guice,
ProtocolBuffers
● MongoDB, Casbah, Salat
● Mongo Collections
– Devices: deviceId, msisdn, application
– Messages: message, audience
– SentMessages: deviceId, message, status
Criticism
● Loss of data
– Especially in a cluster
Conclusion
● Not a silver bullet
● Makes sense when:
– Eventual consistency is acceptable
– Prototyping
– Performance
– Object model doesn't fit well in a relational DB
● Easy to learn

MongoDB - A Document NoSQL Database

  • 1. MongoDB A NoSQL Document Oriented Database
  • 2. Agenda ● RelationalDBs ● NoSQL – What, Why – Types – History – Features – Types ● MongoDB – Indexes – Replication – Sharding – Querying – Mapping – MapReduce ● Use Case: RealNetworks
  • 3. Relational DBs ● Born in the 70s – storage is expensive – schemas are simple ● Based on Relational Model – Mathematical model for describing data structure – Data represented in „tuples“, grouped into „relations“ ● Queries based on Relational Algebra – union, intersection, difference, cartesian product, selection, projection, join, division ● Constraints – Foreign Keys, Primary Keys, Indexes – Domain Integrity (DataTypes)
  • 4.
  • 6. Relational Dbs ● Normalization – minimize redundancy – avoid duplication
  • 8. Relational DBs - Transactions ● Atomicity – If one part of the transaction fails, the whole transaction fails ● Consistency – Transaction leaves the DB in a valid state ● Isolation – One transaction doesn't see an intermediate state of the other ● Durability – Transaction gets persisted
  • 10. NoSQL – Why? ● Web2.0 – Huge DataVolumes – Need for Speed – Accessibility ● RDBMS are difficult to scale ● Storage gets cheap ● Commodity machines get cheap
  • 11. NoSQL – What? ● Simple storage of data ● Looser consistency model (eventual consistency), in order to achieve: – higher availability – horizontal scaling ● No JOINs ● Optimized for big data, when no relational features are needed
  • 14. Eventual Consistency ● RDBMS: all users see a consistent view of the data ● ACID gets difficult when distributing data across nodes ● Eventual Consistency: inconsistencies are transitory. The DB may be inconsistent at a given point in time, but will eventually become consistent. ● BASE (in contrast to ACID) – Basically Available, Soft-state, Eventually consistent
  • 15. CAP Theorem ● Consistency: all nodes see the same data at the same time ● Availability: requests always get an immediate response ● Partition tolerance: the system continues to work, even if a part of it breaks
  • 16. NoSQL - History ● Term first used in 1998 by C. Strozzi to name his relational DB that didn't use SQL ● Term reused in 2009 by E. Evans to name the distributed DBs that didn't provide ACID ● Some people read it as „Not Only SQL“ ● Should actually be called „NoRel“ (not relational)
  • 17. NoSQL – Some Features ● Auto-Sharding ● Replication ● Caching ● Dynamic Schema
  • 18. NoSQL - Types ● Document – „Map“ key-value, with a „Document“ (xml, json, pdf, ..) as value – MongoDB, CouchDB ● Key-Value – „Map“ key-value, with an „Object“ (Integer, String, Order, ..) as value – Cassandra, Dynamo, Voldemort ● Graph – Data stored in a graph structure – nodes hold pointers to adjacent nodes – Neo4J
  • 19. MongoDB ● OpenSource NoSQL Document DB written in C++ ● Started in 2009 ● Commercial Support by 10gen ● From humongous (huge) ● http://www.mongodb.org/
  • 20. MongoDB – Document Oriented ● No Document Structure - schemaless ● Atomicity: only at document level (no transactions across documents) ● Normalization is not easy to achieve: – Embed: +duplication, +performance – Reference: -duplication, +roundtrips
  • 21. MongoDB ● > db.users.save( { name: 'ruben', surname : 'inoto', age : '36' } ) ● > db.users.find() – { "_id" : ObjectId("519a3dd65f03c7847ca5f560"), "name" : "ruben", "surname" : "inoto", "age" : "36" } ● > db.users.update( { name: 'ruben' }, { $set: { 'age' : '24' } } ) ● Documents are stored in BSON format
  • 22. MongoDB - Querying
    ● find(): returns a cursor over the matching documents
      – All users: db.users.find()
      – User with id 42: db.users.find( { _id: 42 } )
      – Age between 20 and 30: db.users.find( { age: { $gt: 20, $lt: 30 } } )
      – Subdocuments, ZIP 5026: db.users.find( { "address.zip": "5026" } )
      – OR, ruben or younger than 30: db.users.find( { $or: [ { name : "ruben" }, { age: { $lt: 30 } } ] } )
      – Projection, deliver only name and age: db.users.find( { }, { name: 1, age: 1 } )
    ● Sample document
      { "_id": 42, "name": "ruben", "surname": "inoto", "age": 36, "address": { "street": "Glaserstraße", "zip": "5026" } }
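To make the operators on this slide concrete, here is a minimal in-memory sketch in plain JavaScript of how a filter selects documents. The helpers `getPath` and `matches` are hypothetical, the second user is made-up sample data, and only equality, $gt, $lt, $or and dotted paths are handled. It illustrates the query semantics and is not how MongoDB evaluates queries internally.

```javascript
// Hypothetical helper: resolve a dotted path such as "address.zip".
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Hypothetical helper: test one document against a small subset of the
// query language (equality, $gt, $lt, $or).
function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    if (field === "$or") {
      return cond.some((sub) => matches(doc, sub));
    }
    const value = getPath(doc, field);
    if (cond !== null && typeof cond === "object") {
      return Object.entries(cond).every(([op, operand]) => {
        if (op === "$gt") return value > operand;
        if (op === "$lt") return value < operand;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === cond;
  });
}

// Sample data (assumption: age stored as a number so range filters apply).
const users = [
  { _id: 42, name: "ruben", age: 36, address: { zip: "5026" } },
  { _id: 43, name: "anna", age: 24, address: { zip: "1010" } },
];

const between20and30 = users.filter((u) =>
  matches(u, { age: { $gt: 20, $lt: 30 } })
);
const byZip = users.filter((u) => matches(u, { "address.zip": "5026" }));
const rubenOrYoung = users.filter((u) =>
  matches(u, { $or: [{ name: "ruben" }, { age: { $lt: 30 } }] })
);
```

Note that the range filter only matches when age is stored as a number; a string such as "36" would not satisfy a numeric $gt or $lt comparison.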
  • 23. MongoDB - Saving
    ● Insert
      – db.test.save( { _id: "42", name: "ruben" } )
    ● Update (without operators, replaces the whole document)
      – db.test.update( { _id : "42" }, { name : "harald" } )
      – db.test.update( { _id : "42" }, { name : "harald", age : 39 } )
    ● Atomic Operators ($inc)
      – db.test.update( { _id : "42" }, { $inc: { age : 1 } } )
    ● Arrays
      – { _id : "48", name : "david", hobbies : [ "bike", "judo" ] }
      – Add an element to an array atomically ($push)
        ● db.test.update( { _id : "48" }, { $push: { hobbies : "swimming" } } )
      – $each, $pop, $pull, $addToSet...
  • 24. MongoDB - Delete ● db.test.remove( { _id : "42" } )
  • 25. MongoDB – Indexes ● Indexes on any attribute – > db.users.ensureIndex( { 'age' : 1 } ) ● Compound indexes – > db.users.ensureIndex( { 'age' : 1, 'name': 1 } ) ● Unique Indexes ● Since v2.4 → Text Indexing (search)
  • 26. SQL → Mongo Mapping (I)
    SQL Statement → Mongo Query Language
      – CREATE TABLE USERS (a Number, b Number) → implicit
      – INSERT INTO USERS VALUES(1,1) → db.users.insert({a:1,b:1})
      – SELECT a,b FROM users → db.users.find({}, {a:1,b:1})
      – SELECT * FROM users → db.users.find()
      – SELECT * FROM users WHERE age=33 → db.users.find({age:33})
      – SELECT * FROM users WHERE age=33 ORDER BY name → db.users.find({age:33}).sort({name:1})
  • 27. SQL → Mongo Mapping (II)
    SQL Statement → Mongo Query Language
      – SELECT * FROM users WHERE age>33 → db.users.find({'age':{$gt:33}})
      – CREATE INDEX myindexname ON users(name) → db.users.ensureIndex({name:1})
      – SELECT * FROM users WHERE a=1 and b='q' → db.users.find({a:1,b:'q'})
      – SELECT * FROM users LIMIT 10 SKIP 20 → db.users.find().limit(10).skip(20)
      – SELECT * FROM users LIMIT 1 → db.users.findOne()
      – EXPLAIN PLAN FOR SELECT * FROM users WHERE z=3 → db.users.find({z:3}).explain()
      – SELECT DISTINCT last_name FROM users → db.users.distinct('last_name')
      – SELECT COUNT(*) FROM users where AGE > 30 → db.users.find({age: {'$gt': 30}}).count()
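  • 30. Document ● user: { id: "1", name: "ruben" } ● order: { id: "a", user_id: "1", items: [ { product_id: "x", quantity: 10, price: 300 }, { product_id: "y", quantity: 5, price: 300 } ] } ● Referenced: the order points to its user through user_id ● Embedded: the items are stored inside the order document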
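The trade-off between the two styles (embedding costs duplication, referencing costs roundtrips) can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the two Maps stand in for collections, and `findById` is a hypothetical helper that counts each lookup as one simulated roundtrip. No database is involved.

```javascript
// Stand-ins for two collections, keyed by id (assumption: plain Maps).
const usersById = new Map([["1", { id: "1", name: "ruben" }]]);
const ordersById = new Map([
  ["a", {
    id: "a",
    user_id: "1", // reference to a user document
    items: [      // embedded sub-documents
      { product_id: "x", quantity: 10, price: 300 },
      { product_id: "y", quantity: 5, price: 300 },
    ],
  }],
]);

let lookups = 0; // counts simulated roundtrips

// Hypothetical helper: one call stands for one roundtrip to the database.
function findById(collection, id) {
  lookups += 1;
  return collection.get(id);
}

// Embedded data: the items arrive together with the order in one lookup.
const order = findById(ordersById, "a");
const orderTotal = order.items.reduce(
  (sum, item) => sum + item.quantity * item.price,
  0
);
const lookupsForItems = lookups;

// Referenced data: resolving the user's name needs a second lookup.
const customer = findById(usersById, order.user_id);
const lookupsForCustomer = lookups;
```

Reading the embedded items takes a single lookup, while following user_id adds a second one. Embedding the user's name in every order would save that lookup at the price of duplicating it across orders.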
  • 31. MongoDB – Replication (I) ● Master-slave replication: primary and secondary nodes ● Replica set: cluster of mongod instances that replicate amongst one another and ensure automated failover
  • 32. MongoDB – Replication (II) ● adds redundancy ● helps to ensure high availability – automatic failover ● simplifies backups
  • 33. WriteConcerns ● Errors Ignored – even network errors are ignored ● Unacknowledged – at least network errors are handled ● Acknowledged – constraints are handled (default) ● Journaled – persisted to journal log ● Replica ACK – 1..n – Or 'majority'
  • 34. MongoDB – Sharding (I) ● Scale Out ● Distributes data to nodes automatically ● Balances data and load across machines
  • 35. MongoDB – Sharding (II) ● A sharded cluster is composed of: – Shards: hold the data. ● Either one mongod instance (primary daemon process – handles data requests), or a replica set – Config servers: ● mongod instances holding cluster metadata – mongos instances: ● route application calls to the shards ● No single point of failure
  • 38. MongoDB – Sharding (V) ● Collection has a shard key: existing field(s) in all documents ● Documents get distributed according to ranges ● In a shard, documents are partitioned into chunks ● Mongo tries to keep all chunks at the same size
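Range-based distribution can be sketched in a few lines of plain JavaScript. The chunk table, the shard names and the helpers `shardFor` and `chunksPerShard` are all made up for illustration, assuming a numeric shard key and half-open ranges [min, max). A real cluster keeps this metadata on the config servers and routes through mongos.

```javascript
// Hypothetical chunk table: each chunk covers [min, max) of the shard key
// and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" },
];

// Route a shard key value to the shard owning the chunk that contains it.
function shardFor(shardKeyValue) {
  const chunk = chunks.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  return chunk.shard;
}

// Count chunks per shard, the kind of figure a balancer would look at.
function chunksPerShard() {
  const counts = {};
  for (const c of chunks) {
    counts[c.shard] = (counts[c.shard] || 0) + 1;
  }
  return counts;
}

const routed = [42, 100, 250].map(shardFor);
const distribution = chunksPerShard();
```

Because the ranges cover the whole key space without overlap, every document maps to exactly one chunk, which is why the shard key has to exist in all documents.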
  • 39. MongoDB – Sharding (VI) ● Shard Balancing – When a shard has too many chunks, mongo moves chunks to other shards ● Only makes sense with huge amount of data
  • 40. Object Mappers ● C#, PHP, Scala, Erlang, Perl, Ruby ● Java – Morphia – Spring MongoDB – mongo-jackson-mapper – jongo ● ..
  • 41. Jongo - Example
      DB db = new MongoClient().getDB("jongo");
      Jongo jongo = new Jongo(db);
      MongoCollection users = jongo.getCollection("users");
      User user = new User("ruben", "inoto", new Address("Musterstraße", "5026"));
      users.save(user);
      User ruben = users.findOne("{name: 'ruben'}").as(User.class);

      public class User {
          private String name;
          private String surname;
          private Address address;
          // constructors not shown on the slide
      }

      public class Address {
          private String street;
          private String zip;
          // constructors not shown on the slide
      }

    ● Stored document
      { "_id" : ObjectId("51b0e1c4d78a1c14a26ada9e"), "name" : "ruben", "surname" : "inoto", "address" : { "street" : "Musterstraße", "zip" : "5026" } }
  • 42. TTL (TimeToLive) ● Data with an expiryDate ● After the specified TimeToLive, the data will be removed from the DB ● Implemented as an Index ● Useful for logs, sessions, .. db.broadcastMessages.ensureIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
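The effect of a TTL index can be imitated in plain JavaScript. This is a sketch under stated assumptions: `removeExpired` is a hypothetical helper, the timestamps are sample data, and the clock is passed in explicitly. In MongoDB the removal is done by a background task that runs periodically, so expired documents may linger briefly past their deadline.

```javascript
// Hypothetical helper mimicking what a TTL index achieves: documents whose
// indexed date is older than expireAfterSeconds are dropped.
function removeExpired(docs, field, expireAfterSeconds, now) {
  const cutoff = now.getTime() - expireAfterSeconds * 1000;
  return docs.filter((doc) => doc[field].getTime() > cutoff);
}

// Sample data: one message is 90 minutes old, the other 30 minutes old.
const now = new Date("2013-06-01T12:00:00Z");
const broadcastMessages = [
  { text: "old", createdAt: new Date("2013-06-01T10:30:00Z") },
  { text: "fresh", createdAt: new Date("2013-06-01T11:30:00Z") },
];

// Same settings as the index on the slide: createdAt, 3600 seconds.
const remaining = removeExpired(broadcastMessages, "createdAt", 3600, now);
```

With a one-hour TTL only the 30-minute-old message survives, which is the behaviour that makes the feature a fit for logs and sessions.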
  • 43. MapReduce ● Programming model for processing large data sets with a parallel, distributed algorithm ● Handles complex aggregation tasks ● The problem is split into smaller tasks that are distributed across nodes ● map phase: selects the data – Emits key-value pairs – Values are grouped by key and passed to the reduce function ● reduce phase: transforms the data – Accepts two arguments: a key and its values – Reduces all the values associated with the key to a single object
  • 45. MapReduce Use Example ● Problem: Count how much money each customer has paid across all of their orders
  • 46. Solution - Relational
      select customer_id, sum(price * quantity) from orders group by customer_id
    ● orders
      order_id  customer_id  price  quantity
      a         1            350    2
      b         2            100    2
      c         1            20     1
    ● result
      customer_id  total
      1            720
      2            200
  • 47. Solution - Sequential
      // needs java.util.Map and java.util.HashMap
      Map<String, Integer> customerTotals = new HashMap<>();
      for (Order order : orders) {
          int newTotal = order.price * order.quantity;
          if (customerTotals.containsKey(order.customerId)) {
              newTotal += customerTotals.get(order.customerId);
          }
          customerTotals.put(order.customerId, newTotal);
      }
    ● Input
      [ { order_id: "a", customer_id: "1", price: 350, quantity: 2 },
        { order_id: "b", customer_id: "2", price: 100, quantity: 2 },
        { order_id: "c", customer_id: "1", price: 20, quantity: 1 } ]
    ● Result
      { "1": 720 } { "2": 200 }
  • 48. Solution - MapReduce
      db.orders.insert([
        { order_id: "a", customer_id: "1", price: 350, quantity: 2 },
        { order_id: "b", customer_id: "2", price: 100, quantity: 2 },
        { order_id: "c", customer_id: "1", price: 20, quantity: 1 }
      ]);
      // map: emit one (customer, order total) pair per order
      var mapOrders = function() {
        var totalPrice = this.price * this.quantity;
        emit(this.customer_id, totalPrice);
      };
      // reduce: sum all order totals emitted for one customer
      var reduceOrders = function(customerId, tempTotal) {
        return Array.sum(tempTotal);
      };
      db.orders.mapReduce( mapOrders, reduceOrders, { out: "map_reduce_orders" } );
      > db.map_reduce_orders.find().pretty();
      { "_id" : "1", "value" : 720 }
      { "_id" : "2", "value" : 200 }
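The same map and reduce logic can be traced without a server. The sketch below is a plain JavaScript stand-in, not MongoDB's implementation: `runMapReduce` is a hypothetical driver, and `emit` is passed to the map function as an argument, whereas the mongo shell provides it as a global.

```javascript
const orders = [
  { order_id: "a", customer_id: "1", price: 350, quantity: 2 },
  { order_id: "b", customer_id: "2", price: 100, quantity: 2 },
  { order_id: "c", customer_id: "1", price: 20, quantity: 1 },
];

// Map step: `this` is the current document; emit one pair per order.
function mapOrders(emit) {
  emit(this.customer_id, this.price * this.quantity);
}

// Reduce step: collapse all values emitted for one key into a single value.
function reduceOrders(customerId, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Hypothetical in-memory driver: run map on every document, group the
// emitted values by key, then reduce each group once.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit);
  return Array.from(groups, ([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const totals = runMapReduce(orders, mapOrders, reduceOrders);
```

This driver calls reduce exactly once per key. MongoDB may call it several times for the same key and skips it for keys with a single value, so a reduce function should return the same kind of value that map emits, as the sum does here.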
  • 50. Who is using Mongo? ● Craigslist ● SourceForge ● Disney ● TheGuardian ● Forbes ● CERN ● ….
  • 51. „Real“ Use Case – Android Notifications ● App to send „notifications“ (messages) to devices with an installed RealNetworks application (Music, RBT) ● Scala, Scalatra, Lift, Jersey, Guice, ProtocolBuffers ● MongoDB, Casbah, Salat ● Mongo Collections – Devices: deviceId, msisdn, application – Messages: message, audience – SentMessages: deviceId, message, status
  • 52. Criticism ● Loss of data – Especially in a cluster
  • 53. Conclusion ● Not a silver bullet ● Makes sense when: – Eventual consistency is acceptable – Prototyping – Performance matters – The object model doesn't fit well into a relational DB ● Easy to learn