Reducing Development Time with MongoDB vs. SQL

Reducing Development Time
with MongoDB vs. SQL
Buzz Moschetti
EnterpriseArchitect, MongoDB
buzz.moschetti@mongodb.com
@buzzmoschetti

Who is your Presenter?
• Yes, I use “Buzz” on my business cards
• Former Investment Bank Chief Architect at
JPMorganChase and Bear Stearns before that
• Over 27 years of designing and building systems
• Big and small
• Super-specialized to broadly useful in any vertical
• “Traditional” to completely disruptive
• Advocate of language leverage and strong factoring
• Still programming – using emacs, of course

What Are Your Developers Doing All
Day?
Adding and testing business features
OR
“Integrating/modifying other components and tools”
• Database(s)
• ETL and other data transfer operations
• Messaging
• Services (web & other)
• Other open source frameworks incl.
ORMs

Why Can’t We Just Save and Fetch
Data?
Because the way we think about data at the
business use case level…
…which traditionally is VERY different than the
way it is implemented at the database level
…is different than the way it is implemented at
the application/code level…

This Problem Isn’t New…
…but for the past 40 years, innovation at the business & application layers
has outpaced innovation at the database layer
1974 2014
Business
Data Goals
Capture my company’s
transactions daily at
5:30PM EST, add them up
on a nightly basis, and print
a big stack of paper
Capture my company’s global transactions in realtime
plus everything that is happening in the world
(customers, competitors, business/regulatory/weather),
producing any number of computed results, and passing
this all in realtime to predictive analytics with model
feedback; results in realtime to 10000s of mobile
devices, multiple GUIs, and b2b and b2c channels
Release
Schedule
Semi-Annually Yesterday
Application
/Code
COBOL, Fortran, Algol,
PL/1, assembler,
proprietary tools
C, C++, VB, C#, Java, javascript, groovy, ruby, perl
python, Obj-C, SmallTalk, Clojure, ActionScript, Flex,
DSLs, spring, AOP, CORBA, ORM, third party software
ecosystem, the whole open source movement, … and
COBOL and Fortran
Database I/VSAM, early RDBMS Mature RDBMS, legacy I/VSAM
Column & key/value stores, and…mongoDB

Exactly How Does MongoDB Change
Things?
• MongoDB is designed from the ground up to
address rich structure (maps of maps of lists
of…), not rectangles
• Standard RDBMS interfaces (i.e. JDBC) do not exploit features
of contemporary languages
• Rapid Application Development (RAD) and scripting in
Javascript, Python, Perl, Ruby, and Scala is impedance-
matched to mongoDB
• In MongoDB, the data is the schema
• Shapes of data go in the same way they come
out

Rectangles are 1974. Maps and Lists are
2014
{ customer_id : 1,
first_name : "Mark",
last_name : "Smith",
city : "San Francisco",
phones: [ {
type : “work”,
number: “1-800-555-1212”
},
{ type : “home”,
number: “1-800-555-1313”,
DNC: true
},
number: “1-800-555-1414”,
DNC: true
}
]
}

An Actual Code Example (Finally!)
Let’s compare and contrast RDBMS/SQL to MongoDB
development using Java over the course of a few weeks.
Some ground rules:
1. Observe rules of Software Engineering 101: Assume separation of application,
Data Access Layer, and persistor implementation
2. Data Access Layer must be able to
a. Expose simple, functional, data-only interfaces to the application
• No ORM, frameworks, compile-time bindings, special tools
b. Exploit high performance features of persistor
3. Focus on core data handling code and avoid distractions that require the same
amount of work in both technologies
a. No exception or error handling
b. Leave out DB connection and other setup resources
4. Day counts are a proxy for progress, not actual time to complete indicated task

The Task: Saving and Fetching Contact
data
Map m = new HashMap();
m.put(“name”, “buzz”);
m.put(“id”, “K1”);
Start with this simple,
flat shape in the Data
Access Layer:
save(Map m)
And assume we
save it in this way:
Map m = fetch(String id)
And assume we
fetch one by primary
key in this way:
Brace yourself…..

Day 1: Initial efforts for both technologies
DDL: create table contact ( … )
init()
{
contactInsertStmt = connection.prepareStatement
(“insert into contact ( id, name ) values ( ?,? )”);
fetchStmt = connection.prepareStatement
(“select id, name from contact where id = ?”);
}
save(Map m)
{
contactInsertStmt.setString(1, m.get(“id”));
contactInsertStmt.setString(2, m.get(“name”));
contactInsertStmt.execute();
}
Map fetch(String id)
{
Map m = null;
fetchStmt.setString(1, id);
rs = fetchStmt.execute();
if(rs.next()) {
m = new HashMap();
m.put(“id”, rs.getString(1));
m.put(“name”, rs.getString(2));
}
return m;
}
SQL
DDL: none
save(Map m)
{
collection.insert(m);
}
MongoDB
{
Map m = null;
DBObject dbo = new BasicDBObject();
dbo.put(“id”, id);
c = collection.find(dbo);
if(c.hasNext()) }
m = (Map) c.next();
}
return m;
}

Day 2: Add simple fields
m.put(“id”, “K1”);
m.put(“title”, “Mr.”);
m.put(“hireDate”, new Date(2011, 11, 1));
• Capturing title and hireDate is part of adding a new
business feature
• It was pretty easy to add two fields to the structure
• …but now we have to change our persistence code
Brace yourself (again) …..

SQL Day 2 (changes in bold)
DDL: alter table contact add title varchar(8);
alter table contact add hireDate date;
init()
{
(“insert into contact ( id, name, title, hiredate ) values
( ?,?,?,? )”);
(“select id, name, title, hiredate from contact where id =
?”);
}
save(Map m)
{
contactInsertStmt.setString(3, m.get(“title”));
contactInsertStmt.setDate(4, m.get(“hireDate”));
}
{
Map m = null;
if(rs.next()) {
m = new HashMap();
m.put(“title”, rs.getString(3));
m.put(“hireDate”, rs.getDate(4));
}
return m;
}
Consequences:
1. Code release schedule linked
to database upgrade (new
code cannot run on old
schema)
2. Issues with case sensitivity
starting to creep in (many
RDBMS are case insensitive
for column names, but code is
case sensitive)
3. Changes require careful mods
in 4 places
4. Beginning of technical debt

MongoDB Day 2
save(Map m)
{
}
{
Map m = null;
if(c.hasNext()) }
m = (Map) c.next();
}
return m;
}
Advantages:
1. Zero time and money spent on
overhead code
2. Code and database not physically
linked
3. New material with more fields can
be added into existing collections;
backfill is optional
4. Names of fields in database
precisely match key names in
code layer and directly match on
name, not indirectly via positional
offset
5. No technical debt is created
✔ NO
CHANGE

Day 3: Add list of phone numbers
m.put(“id”, “K1”);
m.put(“title”, “Mr.”);
m.put(“hireDate”, new Date(2011, 11,
1));
n1.put(“type”, “work”);
n1.put(“number”, “1-800-555-1212”));
list.add(n1);
n2.put(“type”, “home”));
n2.put(“number”, “1-866-444-3131”));
list.add(n2);
m.put(“phones”, list);
• It was still pretty easy to add this data to the structure
• .. but meanwhile, in the persistence code …
REALLY brace yourself…

SQL Day 3 changes: Option 1: Assume
just 1 work and 1 home phone number
DDL: alter table contact add work_phone varchar(16);
alter table contact add home_phone varchar(16);
init()
{
(“insert into contact ( id, name, title, hiredate,
work_phone, home_phone ) values ( ?,?,?,?,?,? )”);
(“select id, name, title, hiredate, work_phone,
home_phone from contact where id = ?”);
}
save(Map m)
{
for(Map onePhone : m.get(“phones”)) {
String t = onePhone.get(“type”);
String n = onePhone.get(“number”);
if(t.equals(“work”)) {
contactInsertStmt.setString(5, n);
} else if(t.equals(“home”)) {
contactInsertStmt.setString(6, n);
}
}
}
{
Map m = null;
if(rs.next()) {
m = new HashMap();
Map onePhone;
onePhone = new HashMap();
onePhone.put(“type”, “work”);
onePhone.put(“number”, rs.getString(5));
list.add(onePhone);
onePhone = new HashMap();
onePhone.put(“type”, “home”);
list.add(onePhone);
}
This is just plain bad….

SQL Day 3 changes: Option 2:
Proper approach with multiple phone
numbersDDL: create table phones ( … )
init()
{
(“insert into contact ( id, name, title, hiredate )
values ( ?,?,?,? )”);
c2stmt = connection.prepareStatement(“insert into
phones (id, type, number) values (?, ?, ?)”;
(“select id, name, title, hiredate, type, number from
contact, phones where phones.id = contact.id and
contact.id = ?”);
}
save(Map m)
{
startTrans();
c2stmt.setString(1, m.get(“id”));
c2stmt.setString(2, onePhone.get(“type”));
c2stmt.setString(3, onePhone.get(“number”));
c2stmt.execute();
}
endTrans();
}
{
Map m = null;
int i = 0;
List list = new ArrayList();
while (rs.next()) {
if(i == 0) {
m = new HashMap();
}
Map onePhone = new HashMap();
onePhone.put(“type”, rs.getString(5));
list.add(onePhone);
i++;
}
return m;
}
This took time and money

SQL Day 5: Zombies! (zero or more between entities)
init()
{
values ( ?,?,?,? )”);
(“select A.id, A.name, A.title, A.hiredate, B.type,
B.number from contact A left outer join phones B on
(A.id = B. id) where A.id = ?”);
}
Whoops! And it’s also wrong!
We did not design the query accounting
for contacts that have no phone number.
Thus, we have to change the join to an
outer join.
But this ALSO means we have to change
the unwind logic
This took more time and
money!
while (rs.next()) {
if(i == 0) {
// …
}
String s = rs.getString(5);
if(s != null) {
Map onePhone = new HashMap();
onePhone.put(“type”, s);
list.add(onePhone);
}
}
…but at least we have a DAL…
right?

MongoDB Day 3
Advantages:
overhead code
2. No need to fear fields that are
“naturally occurring” lists
containing data specific to the
parent structure and thus do not
benefit from normalization and
referential integrity
3. Safe from zombies and other
undead distractions from productivity
save(Map m)
{
}
{
Map m = null;
if(c.hasNext()) }
m = (Map) c.next();
}
return m;
}
✔ NO
CHANGE

By Day 14, our structure looks like this:
n4.put(“geo”, “US-EAST”);
n4.put(“startupApps”, new String[] { “app1”, “app2”, “app3” } );
list2.add(n4);
n4.put(“geo”, “EMEA”);
n4.put(“startupApps”, new String[] { “app6” } );
n4.put(“useLocalNumberFormats”, false):
list2.add(n4);
m.put(“preferences”, list2)
n6.put(“optOut”, true);
n6.put(“assertDate”, someDate);
seclist.add(n6);
m.put(“attestations”, seclist)
m.put(“security”, mapOfDataCreatedByExternalSource);
• It was still pretty easy to add this data to the structure
• Want to guess what the SQL persistence code looks like?
• How about the MongoDB persistence code?

SQL Day 14
Error: Could not fit all the code into this space.
…actually, I didn’t want to spend 2 hours putting the code together..
But very likely, among other things:
• n4.put(“startupApps”,new String[]{“app1”,“app2”,“app3”});
was implemented as a single semi-colon delimited string
• m.put(“security”, anotherMapOfData);
was implemented by flattening it out and storing a subset of fields

MongoDB Day 14 – and every other day
Advantages:
overhead code
2. Persistence is so easy and flexible
and backward compatible that the
persistor does not upward-
influence the shapes we want to
persist i.e. the tail does not wag
the dog
save(Map m)
{
}
{
Map m = null;
if(c.hasNext()) }
m = (Map) c.next();
}
return m;
}
✔ NO
CHANGE

But what if we must do a join?
Both RDBMS and MongoDB will have a PhoneTransactions
table/collection
{ customer_id : 1,
first_name : "Mark",
last_name : "Smith",
city : "San Francisco",
phones: [ {
type : “work”,
number: “1-800-555-1212”
},
number: “1-800-555-1313”,
DNC: true
},
number: “1-800-555-1414”,
DNC: true
}
]
}
{ number: “1-800-555-1212”,
target: “1-999-238-3423”,
duration: 20
}
{ number: “1-800-555-1212”,
target: “1-444-785-6611”,
duration: 243
}
{ number: “1-800-555-1414”,
target: “1-645-331-4345”,
duration: 132
}
{ number: “1-800-555-1414”,
target: “1-990-875-2134”,
duration: 71
}
PhoneTransactions

SQL Join Attempt #1
select A.id, A.lname, B.type, B.number, C.target, C.duration
from contact A, phones B, phonestx C
Where A.id = B.id and B.number = C.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
How to turn this into a list of names –
each with a list of numbers, each of those with a list of target
numbers?

SQL Unwind Attempt #1
Map idmap = new HashMap();
ResultSet rs = fetchStmt.execute();
while (rs.next()) {
String id = rs.getString(“id");
String nmbr = rs.getString("number");
List tnum;
Map snum;
if((snum = (List) idmap.get(id)) == null) {
snum = new HashMap();
idmap.put(did, snum);
}
if((tnum = snum.get(nmbr)) == null) {
tnum = new ArrayList();
snum.put(number, tnum);
}
Map info = new HashMap();
info.put("target", rs.getString("target"));
info.put("duration", rs.getInteger("duration"));
tnum.add(info);
}
// idmap[“g9”][“1-900-555-1212”] = ({target:1-222-707-7070,duration:23…)

SQL Join Attempt #2
select A.id, A.lname, B.type, B.number, C.target, C.duration
Fromcontact A, phones B, phonestx C
Where A.id = B.id and B.number = C.number order by A.id, B.number
id | lname | type | number | target | duration
-----+--------------+------+----------------+----------------+----------
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7072 | 9
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7070 | 23
g10 | Kalan | work | 1-999-444-9999 | 1-222-907-7071 | 7
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
g9 | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
g9 | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
g9 | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
“Early bail out” from cursor is now possible –
but logic to construct list of source and target numbers is similar

SQL is about Disassembly
String s = “select A, B, C, D,
E, F from T1,T2,T3 where T1.col
= T2.col and T2.col2 = T3.col2
and X = Y and X2 != Y2 and G >
10 and G < 100 and TO_DATE(‘ …”;
while(ResultSet.next()) {
if(new column1 value from T1) {
set up new Object;
}
set up new Object2
}
set up new Object3
}
populate maps, lists and scalars
}
ResultSet rs = execute(s);
Design a Big Query
including business logic
to grab all the data up
front
Throw it at the engine
Disassemble Big
Rectangle into usable
objects with logic implicit
in change in column
values

MongoDB is about Assembly
Cursor c = coll1.find({“X”:”Y”});
while(c.hasNext()) {
populate maps, lists and scalars;
Cursor c2 = coll2.find(logic+key from c);
while(c2.hasNext()) {
Cursor c3 = coll3.find(logic+key from c2);
}
}
Assemble usable
objects incrementally
with explicit logic

MongoDB ”Join” using simple n+1
DBCursor c = contacts.find();
DBObject item = c.next();
String id = item.get(“id”);
Map nummap = new HashMap();
for(Map phone : (List)item.get(”phones”)) {
String pnum = phone.get(“number”);
DBObject q = new BasicDBObject(“number”, pnum);
DBCursor c2 = phonestx.find(q);
List txs = new ArrayList();
txs.add((Map)c2.next());
}
nummap.put(pnum, txs);
}
idmap.put(id, nummap);
}
// idmap[“g9”][“1-900-555-1212”] = ({target:1-222-707-7070,duration:23…)

MongoDB ”Join” in Efficient Two-Pass
Mode
Set uniquePhones = new HashSet();
DBCursor c = contacts.find();
idmap.put("id", (String)item.get(“id”));
uniquePhones.add((String)item.get(“number”));
}
String[] phoneTxKeys = (String[]) uniquePhones.toArray(new String[0]);
DBObject q = new BasicDBObject(“number”, new BasicDBObject("$in", phoneTxKeys));
DBCursor c2 = phonestx.find(q);
ArrayListMultimap<String,Map> numMap = ArrayListMultimap.create();
numMap.put((String)item.get(“number”), (Map)item);
}
// String numkey = idmap["g9"].number;
// List phoneTx = numMap[numkey];

But what about “real” queries?
• MongoDB query language is a physical map-of-
map based structure, not a String
• Operators (e.g. AND, OR, GT, EQ, etc.) and arguments are
keys and values in a cascade of Maps
• No grammar to parse, no templates to fill in, no whitespace,
no escaping quotes, no parentheses, no punctuation
• Same paradigm to manipulate data is used to
manipulate query expressions
• …which is also, by the way, the same paradigm
for working with MongoDB metadata and
explain()

MongoDB Queries use familiar operators
SQL CLI select * from contact A, phones B where
A.did = B.did and B.type = 'work’;
MongoDB CLI db.contact.find({"phones.type”:”work”});
SQL in Java String s = “select * from contact A, phones B
where A.did = B.did and B.type = 'work’”;
ResultSet rs = execute(s);
MongoDB via
Java driver
DBObject expr = new BasicDBObject();
expr.put(“phones.type”, “work”);
Cursor c = contact.find(expr);
Find all contacts with at least one work phone

MongoDB Queries are Expressive and
Typed
SQL select A.did, A.lname, A.hiredate, B.type,
B.number from contact A left outer join phones B
on (B.did = A.did) where b.type = 'work' or
A.hiredate > '2014-02-02'::date
MongoDB CLI db.contacts.find({"$or”: [
{"phones.type":”work”},
{"hiredate": {”$gt": new ISODate("2014-02-
02")}}
]});
Find all contacts with at least one work phone or
hired after 2014-02-02

MongoDB Queries Welcome
Programmatic Construction
Fetch a given set of fields for contacts hired
between date A and date B, inclusive:
fetch(String[] fields, Date date1, Date date2) {
DBObject projection = new BasicDBObject();
for(String f: fields) {
projection.put(f, 1);
}
List exprs = new List();
exprs.add(new BasicDBObject("hdate", new BasicDBObject("$gte",date1)));
exprs.add(new BasicDBObject("hdate", new BasicDBObject("$lte",date2)));
DBObject predicate = new BasicDBObject("$and", exprs);
DBCursor c2 = contacts.find(predicate, projection);
}

SQL Queries Require Construction of a
String
StringBuffer sb = new StringBuffer();
sb.append("select ");
for(int i = 0; i < fields.length; i++) {
if(i != 0) { sb.append(","); }
sb.append(fields[i]);
}
DateFormat df = new SimpleDateFormat("YYYY-MM-DD");
sb.append("from contact where ");
sb.append("hiredate >= ");
sb.append("'");
sb.append(df.format(date1));
sb.append("'::date");
sb.append(" and ");
sb.append("hiredate <= ");
sb.append("'");
sb.append(df.format(date2));
sb.append("'::date");
ResultSet resultSet = stmt.executeQuery(sb.toString());
}
Careful! What happens if
the fetch involved a join
and the incoming field
names needed to be
prefixed, e.g.
select A.fields[0],
B.fields[1],
B.fields[2]

PreparedStatement ease the pain a bit….
sb.append("select ");
for(int i = 0; i < fields.length; i++) {
if(i != 0) { sb.append(","); }
sb.append(fields[i]);
}
sb.append(” from contact where ");
sb.append("hiredate >= ? and hiredate <= ?");
px = conn.prepareStatement( sb.toString() );
px.setString(1, new java.sql.Date(date1));
px.setString(2, new java.sql.Date(date2));
ResultSet resultSet = px.executeQuery();
}

MongoDB Facilitates Versatile Filtering
Pass an arbitrary filter expression to contacts:
/**
* Contract: Filter only that which you could see if you
* applied NO filter. So allow Mongo query language
* (MQL) to be used ABOVE the DAL as a filter spec.
* No MongoDB physical dependencies!
*/
fetch(Map mql) {
DBCursor c2 = contacts.find(new BasicDBObject(mql));
}

SQL Requires … MongoDB Query
Language!
Pass an arbitrary filter expression to contacts:
fetch(Map mql) {
sb.append("select * from contact where ");
walkMap(sb, mql);
ResultSet resultSet = stmt.executeQuery(sb.toString());
}
walkMap(StringBuffer sb, Map fragment) {
java.util.Iterator<String> ii = m.keySet().iterator();
while(ii.hasNext()) {
String key = ii.next();
Object ov = m.get(key);
process(sb, ov, key); // e.g. (hdate >= ‘2013-04-01’::date)
}
}

…and before you ask…
Yes, MongoDB query expressions
support
1. Sorting
2. Cursor size limit
3. Aggregation (“GROUP BY”) functions

Day 30: RAD on MongoDB with Python
import pymongo
def save(data):
coll.insert(data)
def fetch(id):
return coll.find_one({”id": id } )
myData = {
“name”: “jane”,
“id”: “K2”,
# no title? No problem
“hireDate”: datetime.date(2011, 11, 1),
“phones”: [
{ "type": "work",
"number": "1-800-555-1212"
},
{ "type": "home",
"number": "1-866-444-3131"
}
]
}
save(myData)
print fetch(“K2”)
expr = {"$or": [ {"phones.type": “work”}, {”hireDate": {“$gt”: datetime.date(2014,2,2)}} ]}
for c in coll.find(expr):
print [ k.upper() for k in sorted(c.keys()) ]
Advantages:
1. Far easier and faster to create
scripts due to “fidelity-parity” of
mongoDB map data and python
(and perl, ruby, and javascript)
structures
1. Data types and structure in scripts
are exactly the same as that read and
written in Java and C++

Day 30: Polymorphic RAD on MongoDB with
Python
import pymongo
item = fetch("K8")
# item is:
{
“name”: “bob”,
“id”: “K8”,
"personalData": {
"preferedAirports": [ "LGA", "JFK" ],
"travelTimeThreshold": { "value": 3,
"units": “HRS”}
}
}
item = fetch("K9")
# item is:
{
“name”: “steve”,
“id”: “K9”,
"personalData": {
"lastAccountVisited": {
"name": "mongoDB",
"when": datetime.date(2013,11,4)
},
"favoriteNumber": 3.14159
}
}
Advantages:
1. Scripting languages easily digest
shapes with common fields and
dissimilar fields
2. Easy to create an information
architecture where placeholder fields
like personalData are “known” in the
software logic to be dynamic

Day 30: (Not) RAD on top of SQL with
Python
init()
{
values ( ?,?,?,? )”);
(“select id, name, title, hiredate, type, number from
contact, phones where phones.id = contact.id and
contact.id = ?”);
}
save(Map m)
{
startTrans();
c2stmt.setString(1, onePhone.get(“type”));
c2stmt.setString(2, onePhone.get(“number”));
c2stmt.execute();
}
endTrans();
}
Consequences:
1. All logic coded in Java interface
layer (unwinding contact, phones,
preferences, etc.) needs to be
rewritten in python (unless Jython
is used) … AND/or perl, C++,
Scala, etc.
2. No robust way to handle
polymorphic data other than
BLOBing it
3. …and that will take real time and
money!

The Fundamental Change with mongoDB
RDBMS designed in era when:
• CPU and disk was slow &
expensive
• Memory was VERY expensive
• Network? What network?
• Languages had limited means to
dynamically reflect on their types
• Languages had poor support for
richly structured types
Thus, the database had to
• Act as combiner-coordinator of
simpler types
• Define a rigid schema
• (Together with the code) optimize
at compile-time, not run-time
In mongoDB, the
data is the schema!

mongoDB and the Rich Map Ecosystem
Generic comparison of two
records
Map expr = new HashMap();
expr.put("myKey", "K1");
DBObject a = collection.findOne(expr);
expr.put("myKey", "K2");
DBObject b = collection.findOne(expr);
List<MapDiff.Difference> d = MapDiff.diff((Map)a, (Map)b);
Getting default values for a thing
on a certain date and then
overlaying user preferences (like
for a calculation run)
Map expr = new HashMap();
expr.put("myKey", "DEFAULT");
expr.put("createDate", new Date(2013, 11, 1));
DBObject a = collection.findOne(expr);
expr.clear();
expr.put("myKey", "user1");
DBObject b = otherCollectionPerhaps.findOne(expr);
MapStack s = new MapStack();
s.push((Map)a);
s.push((Map)b);
Map merged = s.project();
Runtime reflection of Maps and Lists enables generic powerful utilities
(MapDiff, MapStack) to be created once and used for all kinds of shapes,
saving time and money

Lastly: A CLI with teeth
> db.contact.find({"SeqNum": {"$gt”:10000}}).explain();
{
"cursor" : "BasicCursor",
"n" : 200000,
//...
"millis" : 223
}
Try a query and show the
diagnostics
> for(v=[],i=0;i<3;i++) {
… n = i*50000;
… expr = {"SeqNum": {"$gt”: n}};
… v.push( [n, db.contact.find(expr).explain().millis)] }
Run it 3 times with smaller and
smaller chunks and create a
vector of timing result pairs
(size,time)
> v
[ [ 0, 225 ], [ 50000, 222 ], [ 100000, 220 ] ]
Let’s see that vector
> load(“jStat.js”)
> jStat.stdev(v.map(function(p){return p[1];}))
2.0548046676563256
Use any other javascript you
want inside the shell
> for(i=0;i<3;i++) {
… expr = {"SeqNum": {"$gt":i*1000}};
… db.foo.insert(db.contact.find(expr).explain()); }
Party trick: save the explain()
output back into a collection!

What Does All This Add Up To?
• MongoDB easier than RDBMS/SQL for many
problems
• Quicker to change
• Much better harmonized with modern languages
• Comprehensive indexing (arbitrary non/unique
secondaries, compound keys, geospatial, text
search, TTL, etc….)
• Horizontally scalable to petabytes
• Isomorphic HA and DR
Modern Database for Modern
Solutions
+
=

Reducing Development Time with MongoDB vs. SQL

Recommended

Recommended

More Related Content

What's hot

What's hot (19)

Similar to Reducing Development Time with MongoDB vs. SQL

Similar to Reducing Development Time with MongoDB vs. SQL (20)

More from MongoDB

More from MongoDB (20)

Recently uploaded

Recently uploaded (20)

Reducing Development Time with MongoDB vs. SQL