10. Simplified Blog: a SQL approach
Three tables of rows:
posts        posts_comments   posts_tags
post_id      post_id          post_id
name         content          name
content
INSERT INTO posts (post_id, name, content)
  VALUES (42, 'Lost Series Finale Approaching!',
          '<p>It''s going to be pretty exciting.</p>');
INSERT INTO posts_comments (post_id, content) VALUES (42, 'Cool!');
INSERT INTO posts_comments (post_id, content) VALUES (42, 'Awesome!');
INSERT INTO posts_tags (post_id, name) VALUES (42, 'Lost');
INSERT INTO posts_tags (post_id, name) VALUES (42, 'Television');
11. Simplified Blog: a MongoDB approach
One collection of documents:
posts
_id
name
content
comments
tags
db.posts.insert( {
  _id: 42,
  name: "Lost Series Finale Approaching!",
  content: "<p>It's going to be pretty exciting.</p>",
  comments: [ { content: "Cool!" },
              { content: "Awesome!" } ],
  tags: ["Lost", "Television"]
} );
12. Simplified Blog: Queries
Get a post and all its tags and comments
SQL
SELECT * FROM posts WHERE post_id = 42;
SELECT * FROM posts_comments WHERE post_id = 42;
SELECT * FROM posts_tags WHERE post_id = 42;
MongoDB
db.posts.findOne( {_id: 42} );
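A minimal sketch in plain JavaScript, with no database involved, of why the document layout needs one lookup where the relational layout needs three. The helper names (`loadPostRelational`, `loadPostDocument`) and the in-memory arrays are illustrative assumptions, not part of the deck.

```javascript
// Plain-JavaScript model of the two layouts; no database is involved.

// Relational layout: three arrays of rows standing in for the three tables.
const postRows = [
  { post_id: 42, name: "Lost Series Finale Approaching!",
    content: "<p>It's going to be pretty exciting.</p>" },
];
const commentRows = [
  { post_id: 42, content: "Cool!" },
  { post_id: 42, content: "Awesome!" },
];
const tagRows = [
  { post_id: 42, name: "Lost" },
  { post_id: 42, name: "Television" },
];

// Hypothetical helper: three lookups, then the application stitches
// the separate result sets into one object.
function loadPostRelational(postId) {
  const post = postRows.find((row) => row.post_id === postId);
  if (!post) return null;
  return {
    _id: post.post_id,
    name: post.name,
    content: post.content,
    comments: commentRows
      .filter((row) => row.post_id === postId)
      .map((row) => ({ content: row.content })),
    tags: tagRows
      .filter((row) => row.post_id === postId)
      .map((row) => row.name),
  };
}

// Document layout: one collection keyed by _id, already in its final shape.
const postDocuments = new Map([
  [42, {
    _id: 42,
    name: "Lost Series Finale Approaching!",
    content: "<p>It's going to be pretty exciting.</p>",
    comments: [{ content: "Cool!" }, { content: "Awesome!" }],
    tags: ["Lost", "Television"],
  }],
]);

// Hypothetical helper: a single lookup returns the whole post.
function loadPostDocument(postId) {
  return postDocuments.get(postId) || null;
}
```

Both helpers return the same object for post 42; the difference is how much assembly the application has to do to get there.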
13. Simplified Blog: Queries
Get all of the posts tagged a certain way
SQL
SELECT * FROM posts
JOIN posts_tags USING (post_id)
WHERE posts_tags.name = 'Television';
MongoDB
db.posts.find( { tags: "Television" } );
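The MongoDB query works without a JOIN because an equality condition on an array field matches when any element of the array equals the value. A rough plain-JavaScript model of that matching rule follows; `matchesField` and `samplePosts` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper modelling an equality match on a field:
// an array field matches if it contains the value,
// a scalar field matches if it equals the value.
function matchesField(doc, field, value) {
  const stored = doc[field];
  return Array.isArray(stored) ? stored.includes(value) : stored === value;
}

// Illustrative in-memory documents (only post 42 comes from the slides).
const samplePosts = [
  { _id: 42, name: "Lost Series Finale Approaching!", tags: ["Lost", "Television"] },
  { _id: 43, name: "Election Recap", tags: ["Politics"] },
  { _id: 44, name: "Fall Schedule", tags: "Television" },
  { _id: 45, name: "Untagged Post" },
];

// Plain-JavaScript counterpart of db.posts.find( { tags: "Television" } )
const televisionPosts = samplePosts.filter((post) =>
  matchesField(post, "tags", "Television")
);
```

Posts 42 and 44 match; post 45 has no `tags` field at all and is simply skipped.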
14. Easier Development
• Complex data structures (hashes, arrays)
stored directly in their natural form
• No need for Object-Relational Mapping
• No SQL strings to assemble, so no classic SQL injection (user input in queries still needs validating)
• Fewer collections/tables in system
• Easy for new developers to pick up
15. Easier Deployment
• ALTERs are a pain and require downtime
on large datasets
• You don’t need ALTERs in MongoDB!
• Though you occasionally still need migration
scripts
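A sketch of what such a migration script might look like, modelled on plain JavaScript objects rather than a live collection. The scenario (older posts that lack a `tags` array) and the name `backfillTags` are assumptions for illustration; the deck does not show its own scripts.

```javascript
// Hypothetical one-off migration: make sure every post has a `tags` array.
// Documents written before the field existed simply lack it; no ALTER is
// needed for new documents to start carrying it.
function backfillTags(docs) {
  let changed = 0;
  for (const doc of docs) {
    if (!("tags" in doc)) {
      doc.tags = [];
      changed += 1;
    }
  }
  return changed; // number of documents that were updated
}

// Against a real collection, a current shell would express this roughly as:
//   db.posts.updateMany({ tags: { $exists: false } }, { $set: { tags: [] } })
```

The script is idempotent: running it a second time changes nothing, which makes it safe to rerun after a partial failure.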
16. Horizontal Scaling
• No JOINs encourages scalable practices
• Denormalization and scalable design get baked
in early
• Hard for a sane design NOT to scale
19. High Performance
• As fast as a cache when retrieving individual
documents
• Limited use of caching (posts are pulled live
from MongoDB)
• Every pageview on Business Insider does
multiple writes
• Just using a simple master/slave setup, running
at about 5% of capacity
21. Realtime Analytics
• MongoDB is highly optimized for realtime
updates
• Up-to-the-second data on pageviews,
referrers, click tracking
• Minimal development time, huge value to
editorial
• Could be bolted onto a SQL-based website
or traditional CMS
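One common way to keep up-to-the-second counters is an upsert that increments a per-URL, per-minute document in place. The sketch below models that behaviour in plain JavaScript. The deck does not show its actual implementation, so the `pageviews` collection name, the field names, and `recordPageview` are all assumptions.

```javascript
// In-memory stand-in for a counters collection, keyed by URL and minute.
const pageviewCounters = new Map();

// Hypothetical helper modelling an upsert with an increment:
// create the counter if it is absent, otherwise bump it in place.
function recordPageview(url, timestampMs) {
  const minute = Math.floor(timestampMs / 60000);
  const key = `${url}|${minute}`;
  const counter = pageviewCounters.get(key) || { url, minute, views: 0 };
  counter.views += 1;
  pageviewCounters.set(key, counter);
  return counter.views;
}

// Against a real (hypothetical) `pageviews` collection, roughly:
//   db.pageviews.updateOne(
//     { url: url, minute: minute },
//     { $inc: { views: 1 } },
//     { upsert: true }
//   )
```

Because each pageview is a single small write, the current minute's totals are always available to read without a batch job.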
22. Database File Storage (GridFS)
• Every image stored/served in the database
• Eliminates awkwardness/duplicate systems
for replication, backups, test datasets, etc.
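GridFS stores a file as a metadata document plus a series of chunk documents, each carrying `files_id`, a sequence number `n`, and a slice of the bytes in `data`. A minimal sketch of that split-and-reassemble idea in plain Node.js follows. The function names and the tiny chunk size are illustrative assumptions, and real drivers handle this for you.

```javascript
// Hypothetical helper: split a file's bytes into GridFS-style chunk documents.
function splitIntoChunks(filesId, fileBytes, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < fileBytes.length; offset += chunkSize, n += 1) {
    chunks.push({
      files_id: filesId,                                    // which file this chunk belongs to
      n,                                                    // position of the chunk in the file
      data: fileBytes.subarray(offset, offset + chunkSize), // this chunk's bytes
    });
  }
  return chunks;
}

// Hypothetical helper: rebuild the file by ordering chunks on `n`.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((chunk) => chunk.data));
}

// Usage with an arbitrary 4-byte chunk size, chosen only to keep the example small.
const imageBytes = Buffer.from("0123456789");
const imageChunks = splitIntoChunks("image-1", imageBytes, 4);
const rebuiltImage = reassemble(imageChunks.slice().reverse());
```

Since the chunks are ordinary documents, they replicate and back up along with everything else in the database.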
23. Other Cool Stuff
• Map/Reduce
• Capped Collections
• Geospatial Indexes
• ... but we don’t use them (yet!)
25. The Future (IMHO)
• NoSQL adoption grows rapidly
• Some sites use it for specific systems, some
go all NoSQL
• Open-source CMSs support NoSQL
(already happening)
• More diversity in datastores
• RDBMS still useful but no longer the only
option