MongoDB Live Hacking

MongoDB Live Coding Session
tobias.trelle@codecentric.de
@tobiastrelle
codecentric AG

"It's not my fault the chapters are short,
MongoDB is just easy to learn"

from "The Little MongoDB Book"

MongoDB User Groups by codecentric

MongoDB User Group Düsseldorf
https://www.xing.com/net/mongodb-dus
@MongoDUS
Contact: Tobias Trelle

MongoDB User Group Frankfurt/Main
https://www.xing.com/net/mongodb-ffm
Contact: Uwe Seiler

What is MongoDB?

Name derived from "humongous" (= gigantic): http://www.mongodb.org

Open-source NoSQL datastore: https://github.com/mongodb
Commercial support from the vendor, 10gen: http://www.10gen.com

Highly scalable (scale-out)

Stores so-called "documents"

Supports replication & sharding

Map/Reduce

Geospatial indexes / queries

Basic structure of a MongoDB server

Server
  Database

Relational counterpart   MongoDB      But …
Table                    Collection   Flexible schema
Row                      Document
Column                   Field        Arrays, recursive (nested) values

What's a document?

Single record that can be stored in a collection

JSON = JavaScript Object Notation (internal representation BSON = Binary JSON)

var doc = {
  title: "MongoDB_Live_Hacking.pptx",
  tags: [ "cc", "mongodb", "nosql" ],
  slides: [
    { nr: 1, header: "MongoDB User Groups by codecentric" },
    { nr: 2, header: "MongoDB at codecentric Wiki" },
    …
  ]
};
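
As a rough illustration of the structure above, the following plain JavaScript sketch builds the same document with only the two slides shown and reads nested values. It runs without a MongoDB server; the JSON round trip merely approximates storage, and the BSON encoding itself is not modeled.

```javascript
// Sketch only: plain JavaScript, no MongoDB server or driver involved.
var doc = {
  title: "MongoDB_Live_Hacking.pptx",
  tags: ["cc", "mongodb", "nosql"],
  slides: [
    { nr: 1, header: "MongoDB User Groups by codecentric" },
    { nr: 2, header: "MongoDB at codecentric Wiki" }
  ]
};

// Fields can hold arrays, and array elements can be documents again.
var secondHeader = doc.slides[1].header;
var hasNosqlTag = doc.tags.indexOf("nosql") !== -1;

// A document like this one survives a JSON round trip unchanged.
// (BSON supports additional types, which this sketch does not cover.)
var copy = JSON.parse(JSON.stringify(doc));

console.log(secondHeader, hasNosqlTag, copy.slides.length);
```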

Live Session

CRUD operations

Queries

Geospatial Queries

Map/Reduce

Replication

Sharding

Raw Java API & Spring Data API

Geospatial Queries

Queries based on 2-dimensional coordinates:

{ _id: "A", position: [0.001, -0.002] }
{ _id: "B", position: [0.75, 0.75] }
{ _id: "C", position: [0.5, 0.5] }
{ _id: "D", position: [-0.5, -0.5] }

Queries based on distances & shapes

Details:
http://blog.codecentric.de/en/2012/02/spring-data-mongodb-geospatial-queries/
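
To make the two query types concrete, here is a plain JavaScript sketch over the four sample points. It only mimics what a distance query and a box-shaped query compute on flat 2D coordinates; it is not the MongoDB query API, and the helper names `near`, `withinBox` and `ids` as well as the chosen center, radius and box are made up for this illustration.

```javascript
var points = [
  { _id: "A", position: [0.001, -0.002] },
  { _id: "B", position: [0.75, 0.75] },
  { _id: "C", position: [0.5, 0.5] },
  { _id: "D", position: [-0.5, -0.5] }
];

// Euclidean distance between two [x, y] pairs.
function distance(p, q) {
  var dx = p[0] - q[0], dy = p[1] - q[1];
  return Math.sqrt(dx * dx + dy * dy);
}

// Distance query (hypothetical helper): documents within maxDistance
// of center, nearest first.
function near(docs, center, maxDistance) {
  return docs
    .filter(function (d) { return distance(d.position, center) <= maxDistance; })
    .sort(function (a, b) {
      return distance(a.position, center) - distance(b.position, center);
    });
}

// Shape query (hypothetical helper): documents inside an axis-aligned box.
function withinBox(docs, lowerLeft, upperRight) {
  return docs.filter(function (d) {
    var x = d.position[0], y = d.position[1];
    return x >= lowerLeft[0] && x <= upperRight[0] &&
           y >= lowerLeft[1] && y <= upperRight[1];
  });
}

function ids(docs) { return docs.map(function (d) { return d._id; }); }

var nearResult = ids(near(points, [0.1, 0.1], 0.6));
var boxResult = ids(withinBox(points, [0.25, 0.25], [1, 1]));
console.log(nearResult, boxResult);
```

With these assumed parameters, the distance query returns A and C (nearest first), and the box query returns B and C.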

Map/Reduce

Data processing algorithm based on two phases: map & reduce

Code execution co-located with the data

Map phase can be run in parallel (on multiple nodes etc.) on huge data sets

MongoDB map / reduce:

runs on a subset of / all documents of a collection

Map / Reduce algorithms are JS functions

Output documents of the map function are input to the reduce function

Results are documents stored in a target collection

Map/Reduce example

We want to count occurrences of tags assigned to our documents:

{name: "Doc 1", tags: [ "cc", "mongodb", "nosql" ] }
{name: "Doc 2", tags: [ "cc", "agile" ] }
{name: "Doc 3", tags: [ "cc" ] }

Map function:

function() {
  this.tags.forEach( function(tag) {
    emit( tag, {count: 1} );
  });
}

Map output:

key = "cc", value = {count: 1}
key = "mongodb", value = {count: 1}
key = "nosql", value = {count: 1}
key = "cc", value = {count: 1}
key = "agile", value = {count: 1}
key = "cc", value = {count: 1}

Reduce function:

function(key, values) {
  var result = {count: 0};
  values.forEach(function(value) {
    result.count += value.count;
  });
  return result;
}

Reduce input:

key = "cc", values = [ {count: 1}, {count: 1}, {count: 1} ]
key = "mongodb", values = [ {count: 1} ]
key = "nosql", values = [ {count: 1} ]
key = "agile", values = [ {count: 1} ]
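
The flow from map output to reduce input can be tried out without a server. The sketch below reuses the map and reduce functions from the slide and adds a hypothetical stand-in for the server side: an `emit` that groups values by key, and a loop that calls reduce once per key. It is a simplification; the real server may skip reduce for keys with a single value or call it repeatedly on partial results, which is not modeled here.

```javascript
var docs = [
  { name: "Doc 1", tags: ["cc", "mongodb", "nosql"] },
  { name: "Doc 2", tags: ["cc", "agile"] },
  { name: "Doc 3", tags: ["cc"] }
];

// Map and reduce functions as on the slide.
function map() {
  this.tags.forEach(function (tag) {
    emit(tag, { count: 1 });
  });
}

function reduce(key, values) {
  var result = { count: 0 };
  values.forEach(function (value) {
    result.count += value.count;
  });
  return result;
}

// Hypothetical stand-in for the server: collect emitted values by key.
var grouped = {};
function emit(key, value) {
  (grouped[key] = grouped[key] || []).push(value);
}

// Map phase: call map with each document bound to `this`.
docs.forEach(function (doc) { map.call(doc); });

// Reduce phase: one reduce call per key (simplified).
var results = {};
Object.keys(grouped).forEach(function (key) {
  results[key] = reduce(key, grouped[key]);
});

console.log(results);
```

After the map phase, `grouped` holds exactly the reduce input listed above; `results` then maps each tag to its count.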

MongoDB Replication

A cluster is called a "replica set"
Uses master/slave replication
Writes from clients go to the master only
If the master goes down, the slaves elect a new master (requires n > 2 members)

Replica set with n = 3:

Client --> Master --> Slave 1
                  --> Slave 2
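
The condition n > 2 follows from majority voting: a new master needs votes from more than half of all members of the set. The sketch below shows just that arithmetic; the function names are illustrative, and arbiters, priorities and other election details are ignored.

```javascript
// A new master needs votes from a strict majority of all n members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Can the members that are still reachable elect a new master?
function canElect(n, reachable) {
  return reachable >= majority(n);
}

console.log(canElect(3, 2)); // n = 3, master down, two slaves left
console.log(canElect(2, 1)); // n = 2, master down, one slave left
```

With three members, the two remaining slaves still form a majority; with only two members, the single remaining slave does not.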

MongoDB Sharding

Data is distributed over n nodes; each record is persisted only once
Data resides only on the shard nodes
Config server = bookkeeper, knows where the data is
Switch (mongos): gateway for clients

Sharding setup:

Client --> Switch --> Shard 1, Shard 2
Switch <--> Config Server

MongoDB Sharding in Production

Each shard is a replica set; in addition, there are 3 config servers

Source: http://www.mongodb.org/display/DOCS/Sharding+Introduction

MongoDB Sharding Example: Initial State

mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:    (slide callout: 2 shards)
    { "_id" : "shard0000", "host" : "tmp-pc:9000" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "data", "partitioned" : true, "primary" : "shard0000" }
      data.foo chunks:
        shard0000 1
        { "age" : { $minKey : 1 } } -->> { "age" : { $maxKey : 1 } } on : shard0000 { "t" : 1000, "i" : 0 }

MongoDB Sharding Example: Multiple Chunks

Chunks are equally distributed:

mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:    (slide callout: 2 shards)
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "data", "partitioned" : true, "primary" : "shard0000" }
      data.foo chunks:
        shard0001 4
        shard0000 5
        { "age" : { $minKey : 1 } } -->> { "age" : 50 } on : shard0001 { "t" : 2000, "i" : 0 }
        { "age" : 50 } -->> { "age" : 53 } on : shard0001 { "t" : 3000, "i" : 0 }
        { "age" : 53 } -->> { "age" : 54 } on : shard0001 { "t" : 4000, "i" : 0 }
        { "age" : 69 } -->> { "age" : { $maxKey : 1 } } on : shard0000 { "t" : 1000, "i" : 4 }
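
How such chunk ranges map a shard key value to a shard can be sketched in plain JavaScript. The ranges below are copied from the sh.status() output above, limited to the chunks visible on the slide, so some ages have no known owner here. The function name `shardFor` and the use of ±Infinity for $minKey / $maxKey are assumptions made for this illustration, not how the router is implemented.

```javascript
// Stand-ins for $minKey / $maxKey (assumption for this sketch).
var MIN = -Infinity, MAX = Infinity;

// Only the chunks visible on the slide; the remaining ones are not listed.
var chunks = [
  { min: MIN, max: 50, shard: "shard0001" },
  { min: 50, max: 53, shard: "shard0001" },
  { min: 53, max: 54, shard: "shard0001" },
  { min: 69, max: MAX, shard: "shard0000" }
];

// A chunk covers its min (inclusive) up to its max (exclusive).
function shardFor(age) {
  for (var i = 0; i < chunks.length; i++) {
    if (age >= chunks[i].min && age < chunks[i].max) {
      return chunks[i].shard;
    }
  }
  return null; // range not listed on the slide
}

console.log(shardFor(42), shardFor(53), shardFor(70), shardFor(60));
```

Ages 42 and 53 resolve to shard0001 and age 70 to shard0000, while age 60 falls into a range that the slide does not show.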

MongoDB API
Drivers for many languages (Java, Ruby, PHP, C++, …)
Low level Java API: MongoDB Java Driver
Spring Data MongoDB: Repository Support + Object/Collection Mapping

Spring Data (common interfaces): CrudRepository, PagingAndSortingRepository

Module                Repository        Template        Access via           Datastore
Spring Data JPA       JpaRepository                     JPA, JDBC            RDBMS
Spring Data MongoDB   MongoRepository   MongoTemplate   Mongo Java Driver    MongoDB
Spring Data Neo4j     GraphRepository   Neo4jTemplate   Embedded, REST       Neo4j
Spring Data …                                                                …

Questions?

Tobias Trelle

codecentric AG
Merscheider Str. 1
42699 Solingen

tel +49 (0) 212.233628.47
fax +49 (0) 212.233628.79
mail Tobias.Trelle@codecentric.de
twitter @tobiastrelle

www.codecentric.de
www.mbg-online.de
blog.codecentric.de
www.xing.com/net/mongodb-dus

codecentric AG, 20.08.2012
