Mongo - an intermediate introduction

MongoDB
“Walking on water and developing software from a specification are easy if both are frozen.”
By: NklMish (@nklmish)

Agenda
Why & What?
Features
Aggregation framework overview
Sharding and Replica Set
Summary

NOTE
This talk is not about:
NoSQL
Relational vs. non-relational
Comparison with other NoSQL flavors

What is MongoDB?
“Nothing endures but change”
The “one size fits all” approach no longer applies
NoSQL database
Schemaless != no schema
Document-based approach
Non-relational DBs scale more easily, especially horizontally
In a nutshell: a focus on speed, performance, flexibility and scalability.

Why?
“Luck is not a factor. Hope is not a strategy. Fear is not an option”
Agility.
High scalability => horizontal => thousands of nodes, in the cloud or across multiple data centers
Rich indexing
Real-life examples:
Server Density
OTTO
Expedia, Forbes, MetLife, Bosch, etc.

Mapping
“You improvise. You adapt. You overcome.”
RDBMS                     MongoDB
Tables                    Collections
Records/rows              Documents/objects
Queries return records    Queries return a cursor
Why a cursor? For performance and efficiency.
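To picture why returning a cursor helps, here is a minimal sketch in plain JavaScript. It is not the MongoDB driver API: `fakeCursor` and `batchSize` are hypothetical names, and a generator stands in for a server handing out results in batches as the caller iterates.

```javascript
// Plain-JavaScript emulation of a cursor (hypothetical, not a MongoDB API):
// documents are produced lazily, one batch at a time.
function* fakeCursor(docs, batchSize) {
  for (let i = 0; i < docs.length; i += batchSize) {
    // Each slice stands in for one batch fetched from the server.
    yield* docs.slice(i, i + batchSize);
  }
}

const books = [{ name: "Java" }, { name: "Php" }, { name: "Javascript" }];
const cursor = fakeCursor(books, 2);

// The caller pulls documents on demand; nothing is materialized up front.
const first = cursor.next().value;
const rest = [...cursor].map((doc) => doc.name);
console.log(first.name, rest);
```

The caller can stop after the first document without paying for the rest, which is the point of handing back a cursor instead of a full result set.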

Features
“Stable Velocity. Sustainable Pace.”
Standard DB features
Docs are stored in BSON => Mongo understands JSON natively => any valid JSON can be imported and queried (e.g. mongoimport --file foo.json).
Map-Reduce
Aggregation framework
GridFS (for efficient storage of large binary objects)
GeoNear

Aggregation in a Nutshell
“Talk is cheap. Show me the code”
$match – filter docs
$project – reshape docs
$group – summarize docs
$unwind – expand docs
$sort – order docs
$limit/$skip – paginate docs
$redact – restrict docs
$geoNear – proximity sort docs
$let, $map – define variables

Matching
Input:
{ name : "Java", price : 250, Type : "ebook" }
{ name : "Php", price : 200, Type : "ebook" }
{ name : "Javascript", price : 150, Type : "hardCopy" }
Stage:
{$match : {
Type : "ebook"
}}
Output:
{ name : "Java", price : 250, Type : "ebook" }
{ name : "Php", price : 200, Type : "ebook" }
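What an equality $match does can be emulated in plain JavaScript. This is only a sketch of the semantics, not how MongoDB implements the stage; `matchEquals` is a hypothetical helper.

```javascript
// Sample docs mirroring the slide.
const books = [
  { name: "Java", price: 250, Type: "ebook" },
  { name: "Php", price: 200, Type: "ebook" },
  { name: "Javascript", price: 150, Type: "hardCopy" },
];

// Hypothetical helper: keep docs whose fields equal every value in `criteria`.
function matchEquals(docs, criteria) {
  return docs.filter((doc) =>
    Object.entries(criteria).every(([field, value]) => doc[field] === value)
  );
}

// Emulates {$match: {Type: "ebook"}}.
const ebooks = matchEquals(books, { Type: "ebook" });
console.log(ebooks.map((doc) => doc.name));
```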

Query Operator
Input:
{ name : "Java", price : 250, Type : "ebook" }
{ name : "Php", price : 200, Type : "ebook" }
{ name : "Javascript", price : 150, Type : "hardCopy" }
Stage:
{$match : {
price : {$gt : 200}
}}
Output ($gt is strict, so the doc priced at exactly 200 is excluded):
{ name : "Java", price : 250, Type : "ebook" }
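The difference between $gt and $gte is easy to get wrong, so here is a small plain-JavaScript emulation of comparison operators inside a $match. The `comparators` table and `matchWithOperators` are hypothetical names and cover only four operators.

```javascript
// Sample docs mirroring the slide.
const books = [
  { name: "Java", price: 250, Type: "ebook" },
  { name: "Php", price: 200, Type: "ebook" },
  { name: "Javascript", price: 150, Type: "hardCopy" },
];

// Hypothetical operator table covering a few comparison operators.
const comparators = {
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
};

// Keep docs where every {field: {operator: operand}} condition holds.
function matchWithOperators(docs, criteria) {
  return docs.filter((doc) =>
    Object.entries(criteria).every(([field, condition]) =>
      Object.entries(condition).every(([op, operand]) =>
        comparators[op](doc[field], operand)
      )
    )
  );
}

// Emulates {$match: {price: {$gt: 200}}} and the $gte variant.
const above200 = matchWithOperators(books, { price: { $gt: 200 } });
const atLeast200 = matchWithOperators(books, { price: { $gte: 200 } });
console.log(above200.map((d) => d.name), atLeast200.map((d) => d.name));
```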

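Including/excluding parts of a document
Input:
{ name : "Java", price : 250, Type : "ebook" }
{ name : "Php", price : 200, Type : "ebook" }
Stage:
{$project : {
name : 1,
price : 1
}}
Fields that are not listed (here Type) are dropped; inclusion and exclusion cannot be mixed in one $project, except for _id.
Output:
{ name : "Java", price : 250 }
{ name : "Php", price : 200 }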
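An inclusion-style $project can be pictured as copying only the listed fields. The sketch below is plain JavaScript; `projectInclude` is a hypothetical helper, and it ignores the special handling MongoDB gives to _id.

```javascript
// Sample docs mirroring the slide.
const books = [
  { name: "Java", price: 250, Type: "ebook" },
  { name: "Php", price: 200, Type: "ebook" },
];

// Hypothetical helper: copy only the fields marked with 1 in `spec`.
function projectInclude(docs, spec) {
  const wanted = Object.keys(spec).filter((field) => spec[field] === 1);
  return docs.map((doc) =>
    Object.fromEntries(
      wanted.filter((field) => field in doc).map((field) => [field, doc[field]])
    )
  );
}

// Emulates {$project: {name: 1, price: 1}}; Type is dropped because it is not listed.
const slim = projectInclude(books, { name: 1, price: 1 });
console.log(slim);
```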

Custom Field Computation
Input:
{ name : "Java", price : 250, Type : "ebook", quantity : 3 }
{ name : "Php", price : 200, Type : "ebook", quantity : 2 }
Stage:
{$project : {
fullStock : {
$multiply : ["$price", "$quantity"]
},
Title : "$name"
}}
Output:
{ fullStock : 750, Title : "Java" }
{ fullStock : 400, Title : "Php" }

Generating Sub-Document
Input:
{ name : "Java", price : 250, Type : "ebook", quantity : 3 }
{ name : "Php", price : 200, Type : "ebook", quantity : 2 }
Stage:
{$project : {
fullStock : {
$multiply : ["$price", "$quantity"]
},
details : {Title : "$name", quantity : "$quantity"}
}}
Output:
{ fullStock : 750, details : {Title : "Java", quantity : 3} }
{ fullStock : 400, details : {Title : "Php", quantity : 2} }
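Computed fields and sub-documents in $project amount to building a new object per document. A minimal plain-JavaScript sketch of that reshaping (not MongoDB's implementation; the variable names are illustrative):

```javascript
// Sample docs mirroring the slide.
const books = [
  { name: "Java", price: 250, Type: "ebook", quantity: 3 },
  { name: "Php", price: 200, Type: "ebook", quantity: 2 },
];

// Emulates a $project with a computed field (price * quantity)
// and a nested sub-document built from existing fields.
const projected = books.map((doc) => ({
  fullStock: doc.price * doc.quantity,
  details: { Title: doc.name, quantity: doc.quantity },
}));
console.log(projected);
```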

$group
Groups docs by value
Processed in memory by default
Helpful accumulator operators that go with it:
$max, $min, $avg, $sum
$addToSet, $push
$first, $last

Compute Average
Input:
{ name : "Java Fun", publisher : "manning", price : 1000 }
{ name : "Oracle Fun", publisher : "manning", price : 1000 }
{ name : "Php Fun", publisher : "O'Reilly", price : 400 }
Stage:
{$group : {
_id : "$publisher",
avgPrice : {$avg : "$price"}
}}
Output:
{ _id : "manning", avgPrice : 1000 }
{ _id : "O'Reilly", avgPrice : 400 }
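Grouping with $avg boils down to keeping a running sum and count per key. The sketch below emulates that in plain JavaScript; `groupAverage` is a hypothetical helper, and MongoDB does not guarantee the output order of $group the way this Map-based version does.

```javascript
// Sample docs mirroring the slide.
const books = [
  { name: "Java Fun", publisher: "manning", price: 1000 },
  { name: "Oracle Fun", publisher: "manning", price: 1000 },
  { name: "Php Fun", publisher: "O'Reilly", price: 400 },
];

// Hypothetical helper: average `valueField` per distinct `keyField`.
function groupAverage(docs, keyField, valueField) {
  const totals = new Map();
  for (const doc of docs) {
    const entry = totals.get(doc[keyField]) || { sum: 0, count: 0 };
    entry.sum += doc[valueField];
    entry.count += 1;
    totals.set(doc[keyField], entry);
  }
  return [...totals].map(([key, { sum, count }]) => ({
    _id: key,
    avgPrice: sum / count,
  }));
}

// Emulates {$group: {_id: "$publisher", avgPrice: {$avg: "$price"}}}.
const averages = groupAverage(books, "publisher", "price");
console.log(averages);
```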

$unwind
Useful for docs containing array fields:
Creates one doc per array entry
In each new doc, the array is replaced by that single entry's value

$unwind
Input:
{ publisher : "manning", title : ["java", "Php"], discount : "50%" }
Stage:
{$unwind : "$title"}
Output:
{ publisher : "manning", title : "java", discount : "50%" }
{ publisher : "manning", title : "Php", discount : "50%" }
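The effect of $unwind can be emulated with a flat map over the array field. This is a plain-JavaScript sketch; `unwind` is a hypothetical helper, and it assumes every doc actually holds an array in that field.

```javascript
// Sample doc mirroring the slide.
const catalog = [
  { publisher: "manning", title: ["java", "Php"], discount: "50%" },
];

// Hypothetical helper: emit one copy of the doc per array entry,
// with the array field replaced by that entry.
// Assumes doc[field] is always an array.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((entry) => ({ ...doc, [field]: entry }))
  );
}

// Emulates {$unwind: "$title"}.
const unwound = unwind(catalog, "title");
console.log(unwound);
```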

$redact
Restricts access to docs based on fields inside the docs that define privileges
Useful system variables: $$DESCEND, $$PRUNE, $$KEEP

$redact
Input:
{
  _id : 123,
  name : "logo",
  security : "ANYONE",
  Profit : {
    security : "MARKETING",
    revenue : "500%",
    ProfitByCountry : {
      security : "BOARD_OF_DIRECTOR",
      PL : "800%",
      NO : "700%",
      DK : "600%",
      SW : "500%"
    }
  }
}
Pipeline:
db.products.aggregate([
  {$match : {name : "logo"}},
  {$redact : {
    $cond : {
      if : {$eq : …},
      then : "$$DESCEND",
      else : "$$PRUNE"
    }
  }}
])
Output:
{...}
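The $$DESCEND / $$PRUNE behaviour can be pictured as a recursive walk over the document. The slide elides the actual condition, so the sketch below uses its own assumed rule: a level is kept only if its `security` label is in a list of allowed labels. `redact` and `allowed` are hypothetical names, the code is plain JavaScript, and arrays are left untouched for brevity.

```javascript
// Sample doc mirroring the slide.
const product = {
  _id: 123,
  name: "logo",
  security: "ANYONE",
  Profit: {
    security: "MARKETING",
    revenue: "500%",
    ProfitByCountry: {
      security: "BOARD_OF_DIRECTOR",
      PL: "800%", NO: "700%", DK: "600%", SW: "500%",
    },
  },
};

// Hypothetical emulation with an assumed access rule (not the slide's elided one).
function redact(doc, allowed) {
  // Like $$PRUNE: drop this level entirely when its label is not allowed.
  if (!allowed.includes(doc.security)) return undefined;
  // Like $$DESCEND: keep this level's plain fields and re-check each sub-document.
  const kept = {};
  for (const [field, value] of Object.entries(doc)) {
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      const child = redact(value, allowed);
      if (child !== undefined) kept[field] = child;
    } else {
      kept[field] = value;
    }
  }
  return kept;
}

const marketingView = redact(product, ["ANYONE", "MARKETING"]);
const publicView = redact(product, ["ANYONE"]);
console.log(JSON.stringify(marketingView), JSON.stringify(publicView));
```

With the marketing labels the revenue stays visible but the per-country breakdown is pruned; with only "ANYONE" the whole Profit sub-document disappears.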

Sharding
Allows data to be stored across multiple machines.
OK, but what for?
Database systems with large data sets and high-throughput applications can challenge the capacity of a single server.
High query rates can exhaust the CPU capacity of the server.
Larger data sets exceed the storage capacity of a single machine.
Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives.
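One way to picture spreading data across machines is routing each document by a hash of its shard key. The sketch below is plain JavaScript with a toy hash; `hashKey`, `shardFor` and the shard count are hypothetical, and MongoDB's own hashed sharding uses its own hash function and chunk ranges rather than a simple modulo.

```javascript
// Toy string hash (hypothetical, for illustration only).
function hashKey(key) {
  let hash = 0;
  for (const ch of String(key)) {
    // Keep the running value inside the unsigned 32-bit range.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// Map a shard key to one of `shardCount` shards.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// The same key always lands on the same shard, so reads know where to look.
const shardCount = 3;
const placement = ["user-1", "user-2", "user-3", "user-4"].map((key) => ({
  key,
  shard: shardFor(key, shardCount),
}));
console.log(placement);
```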

Replica Set
A group of mongod processes that maintain the same data set. Provides redundancy and high availability; in a nutshell, the basis for all production deployments.
A minimum of 3 nodes is recommended.
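The 3-node minimum follows from majority voting: a primary can only be elected while a majority of voting members can reach each other. A small sketch of that arithmetic in plain JavaScript (the function names are hypothetical):

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be lost while a majority still remains.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [1, 2, 3, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`);
}
```

Two members tolerate no failures at all, so three is the smallest set that survives the loss of one node.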

Summary
Document-oriented DB.
Scales and performs well.
Provides a powerful aggregation framework.
Tested on massive datasets.
Supports map-reduce.
