1. Practical Use of MongoDB for
Node.js
Or: Going Faster with Node.js and MongoDB
MongoDC June, 2012
Jonathan Altman
@async_io
2. Going Faster with Node.js and
MongoDB
• Getting started with Node.js and MongoDB development fast
• Building a web application with Node.js and MongoDB fast
• Making your Node.js/MongoDB-powered web application faster,
better, stronger
4. To a Front-End Developer:
• node:
var util = require('util'),
    path = require('path'),
    fs = require('fs');
var fileName = process.argv[2];
path.exists(fileName, function(exists) {
  if (!exists) {
    util.puts(fileName + ' not found');
    return;
  }
  util.puts('w00t! ' + fileName + ' exists!');
});
// https://gist.github.com/2988106
• jQuery:
$('.station_map').live("pagecreate", function() {
  if(navigator.geolocation) {
    navigator.geolocation.getCurrentPosition(function(position){
      initializeMap(position.coords.latitude,position.coords.longitude);
    });
  }
});
// snippet taken from https://github.com/mikeymckay/Capital-Bikeshare/blob/master/capitalbikeshare.js
Doesn’t the code look just like the event-driven javascript DHTML-
based web apps have been written in for years?
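For readers on a current Node.js release: as far as I know, path.exists and util.puts used in the node snippet are no longer available there. The sketch below redoes the same existence check with fs.access and console.log. The helper name checkFile and the fallback to the current directory are assumptions made for this example, not part of the original gist.

```javascript
var fs = require('fs');

// checkFile is a hypothetical helper name, not part of the original gist.
// It reports true/false through a callback, like the old path.exists did.
function checkFile(fileName, callback) {
  // fs.access calls back with an error when the path cannot be reached.
  fs.access(fileName, function (err) {
    callback(!err);
  });
}

// Assumption: fall back to the current directory when no argument is given.
var fileName = process.argv[2] || '.';
checkFile(fileName, function (exists) {
  if (!exists) {
    console.log(fileName + ' not found');
    return;
  }
  console.log('w00t! ' + fileName + ' exists!');
});
```

The shape is unchanged from the slide: ask for something, hand over a callback, and carry on while the filesystem does its work.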
5. Front-End Similarity is Cool
• Node, like front-end web development, is built from the ground-up to
be asynchronous javascript coding
• Specifically, a lot of node.js’ power is its ability to deal with lots of
activity where most of it is just waiting on I/O: user input, network,
database, filesystem
• This capability is based on the very assumptions Ryan Dahl, node.js’
creator, made in building the system
• But the asynchronous nature imposes some costs in terms of coding
style and library design: you must buy in to the asynchronous
6. To A Network Admin:
var http = require('http');
http.createServer(function(request, response) {
  var proxy = http.createClient(80, request.headers['host']);
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function (proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);
// from http://www.catonmat.net/http-proxy-in-nodejs/
7. Why Node Works Well for
Networked Services
• Async nature and native support for sockets/http makes it easy
to write performant proxies
• The sample on the previous slide is a fully-functioning http
proxy, 20 lines of code
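The proxy sample relies on http.createClient, which is not available on current Node.js releases. A minimal sketch of the same idea with http.request and stream piping follows. The helper name createProxy, the 502 error response, and the simple host:port parsing are assumptions for this example, not part of the catonmat code.

```javascript
var http = require('http');

// createProxy is a hypothetical helper. Like the slide's code, it forwards
// each request to whatever host the request's Host header names.
function createProxy() {
  return http.createServer(function (request, response) {
    // Assumption: Host looks like "name" or "name:port" (no IPv6 literals).
    var target = (request.headers.host || '').split(':');
    var proxyRequest = http.request({
      hostname: target[0],
      port: target[1] || 80,
      method: request.method,
      path: request.url,
      headers: request.headers
    }, function (proxyResponse) {
      // Copy status and headers, then stream the body back to the client.
      response.writeHead(proxyResponse.statusCode, proxyResponse.headers);
      proxyResponse.pipe(response);
    });
    // Assumption: report an unreachable upstream as 502.
    proxyRequest.on('error', function () {
      response.writeHead(502);
      response.end('Bad gateway');
    });
    // Stream the client's request body to the upstream server.
    request.pipe(proxyRequest);
  });
}
```

To run it like the slide does, call createProxy().listen(8080). The pipe calls replace the hand-written data/end listeners but do the same job.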
8. To A Site With Many
Simultaneous Clients
// server.js
var app = require('http').createServer(handler),
    io = require('socket.io').listen(app),
    fs = require('fs');
// Start an HTTP server on port 8080
app.listen(8080);
function handler(req, res) {
  // Hardcode *all* HTTP requests to this server to serve up index.html
  fs.readFile(
    __dirname + '/index.html',
    function (err, data) {
      if (err) { res.writeHead(500); return res.end('Error loading index.html'); }
      res.writeHead(200);
      res.end(data);
    }
  );
}
// After any socket connects, SEND it a custom 'news' event
io.sockets.on(
  'connection',
  function (socket) {
    socket.emit('news', {hello: 'world'});
  }
);
// from http://www.zachstronaut.com/posts/2011/08/17/node-socket-io-curious.html
9. How Node.js Works Well for
Simultaneous Clients
• Implements websockets, which is a standard for handling more
interactive web connectivity
• Frequently used by game and chat environments
• Asynchronous nature again makes it easy to handle lots of
connections because they can be multiplexed easily since they just
become one more resource like disk, backend network, and database
to wait on
• Like many other things, support for websockets was easy because
node enables support out of the box for lots of libs/modules
10. What Does Node.js Give Me?
• It is very good/fast when your application spends time waiting on
expensive resources: networks, databases, disks
• Javascript is nice, programming model familiar to some
• Good libraries for many things
• Headaches and/or slowdowns when your bottlenecks are not
already handled by node.js
• You must think and code in asynchronous javascript/callbacks
11. Example of a Headache
• If you have something that doesn’t have a good async module to do
what you need, and it’s resource intensive, then you have to build the
library to do it...asynchronously
• HeatNode: https://github.com/jonathana/heatNode is web middleware
I wrote. Part of it does some intensive image processing
• Image processing is slow, and this code abuses node.js’ processing
• Also, if you do have code that needs ordering or sequencing, you will
need to incorporate a library like promise or async to handle that for
you
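To make the ordering point concrete, here is a minimal sketch of running asynchronous steps strictly one after another using only callbacks. runSeries and delayed are hypothetical helpers written for this example. Libraries such as async offer more complete versions of the same idea.

```javascript
// runSeries is a hypothetical helper: it runs node-style tasks in order and
// collects their results, stopping at the first error.
function runSeries(tasks, done) {
  var results = [];
  function next(index) {
    if (index === tasks.length) { return done(null, results); }
    tasks[index](function (err, value) {
      if (err) { return done(err); }
      results.push(value);
      next(index + 1);
    });
  }
  next(0);
}

// delayed builds a task that calls back with `value` after `ms` milliseconds.
function delayed(value, ms) {
  return function (callback) {
    setTimeout(function () { callback(null, value); }, ms);
  };
}

// The delays differ, but the results still arrive in task order.
runSeries([delayed('first', 30), delayed('second', 10), delayed('third', 20)],
  function (err, results) {
    if (err) { throw err; }
    console.log(results.join(', ')); // first, second, third
  });
```

Without a helper like this, the three steps would be three nested callbacks, and each additional step would add another level.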
13. What is MongoDB?
• Really? You’re at a MongoDB conference
• Seriously:
“MongoDB is an open source, document-oriented database
designed with both scalability and developer agility in mind.
Instead of storing your data in tables and rows as you would
with a relational database, in MongoDB you store JSON-like
documents with dynamic schemas.”
http://www.10gen.com/what-is-mongodb
14. What is MongoDB to
node.js?
It is a high-fidelity datastore for node.js
15. What Do I Mean By High
Fidelity?
• High fidelity on several axes:
• Emphasis on speed
• Similarity of processing model: node.js async meshes well with mongodb
fire-and-forget insert/update
• Plasticity: node.js is built using scripting languages, MongoDB is document-
oriented and therefore schema-less
• JSON as the lingua franca data format in both node.js and MongoDB
(ignoring the BSON internally). It’s JSON all the way down—until you hit
the turtles
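A small sketch of the plasticity and JSON points, using plain JavaScript with no database involved. The field names and values are made up for this example. The Date behaviour shown is one place where the "ignoring the BSON" caveat matters: JSON has no date type, while BSON does.

```javascript
// A document is just a JavaScript object (field names are illustrative).
var link = {
  title: 'Node.js',
  url: 'http://nodejs.org/',
  tags: ['javascript', 'server'],
  created: new Date(0)
};

// Dynamic schema: add a field whenever you need it, no migration step.
link.description = 'Evented I/O for V8 JavaScript';

// JSON carries the same shape between browser, node, and datastore.
var wire = JSON.stringify(link);
var copy = JSON.parse(wire);

console.log(copy.tags.length);    // 2
console.log(typeof copy.created); // string: JSON turned the Date into text
```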
16. Getting Started Faster with
Node.js
We are going to cover getting your environment up and running
17. Use nave for virtual
environments
jonathan@ogmore ~ $ nave use stable
######################################################################## 100.0%
fetched from http://nodejs.org/dist/v0.8.0/node-v0.8.0.tar.gz
'variables': { 'host_arch': 'x64',
'node_install_npm': 'true',
'node_use_openssl': 'true',
'strict_aliasing': 'true',
'target_arch': 'x64',
'v8_use_snapshot': 'true'}}
# Several hundreds of lines of build output deleted...
cp -rf out/Release/node /home/jonathan/.nave/installed/0.8.0/bin/node
# some more stuff deleted
ln -sf ../lib/node_modules/npm/bin/npm-cli.js /home/jonathan/.nave/installed/0.8.0/bin/npm
shebang #!/home/jonathan/.nave/installed/0.8.0/bin/node /home/jonathan/.nave/installed/0.8.0/lib/node_modules/npm/bin/npm-cli.js
using 0.8.0
jonathan@ogmore ~ $ exit
exit
jonathan@ogmore ~ $ nave use 0.6.18
Already installed: 0.6.18
using 0.6.18
jonathan@ogmore ~ $
18. Why nave?
• Allows you to manage versioning of your node install
• Similar to python’s virtualenv or ruby’s rbenv
• You can build multiple predictable environments for your code
• Example shows installing 0.8.0, but 0.6.19 was stable up until the day
I ran that command
• My example code needed 0.6.18. Nave is good, QED
• There is also nvm, but I like the fidelity to virtualenv and rbenv
19. npm Manages Packages
• Similar to ruby’s rubygems or python’s pip/easy_install
• 2 main modes: global and local
• You will want some packages globally, but most locally
• Rule of thumb: global for packages usable from command line or generally
applicable, local for everything else
• npm install -g node-inspector # enables graphical debugging
• npm install -g express # this is a web framework, more in a second
21. You Have Some Choices
• Connect/Express: these are the granddaddies
• Geddy: looks cool to me
• Mojito: From Yahoo, tight integration with YUI client AND server
• Derby: newer, emerging, interesting
• Meteor: newer, emerging, interesting
• RailwayJS or TowerJS: add a fuller Rails- or Django-style web stack
on top of express and connect, respectively
23. Ecosystem, Ecosystem,
Ecosystem
• Everyauth is a canonical example (previous slide hid lots of
authentication methods it supports, btw)
• There are npm packages for everything for connect/express. I might
be exaggerating about everything
• RailwayJS is built on top of express
• TowerJS is built on top of connect
• We will go with express: it does not provide a full MVC framework,
but rolling your own on top is fairly easy
24. Adding MongoDB to Our Stack
• connect-mongodb: web framework middleware for a MongoDB-
backed session
• node-mongodb-native: native node driver for MongoDB
• mongolia: “non-magic” layer on the mongodb native driver
• mongoose: Javascript<->MongoDB object mapper
• mongoose-auth: everyauth integration with mongoose. It
supports a subset of everyauth’s authentication systems
• jugglingdb: multi-dbms support, including MongoDB. However, this
adds another layer of indirection on top of Mongoose
25. My Example Stack Choice
• express
• mongoose + connect-mongodb: not a slam dunk choice, but it is
fast to get started
• mongoose-auth+everyauth: everyauth integration with mongoose
• ejs: embedded javascript page templating <% // yay! %>. Syntax may not
be everybody’s cup of tea, but it does make it easy to adapt existing static
HTML
• bootstrap: twitter’s ui toolkit. Works very well with single-page apps,
local caching on full page loads should help, and again we get to start up
fast
26. And Away We Go...
jonathan@ogmore /mnt/hgfs/src $ express create expspreleton
create : expspreleton
create : expspreleton/package.json
create : expspreleton/app.js
[...]
create : expspreleton/public/images
dont forget to install dependencies:
$ cd expspreleton && npm install
jonathan@ogmore /mnt/hgfs/src $ ls expspreleton
app.js package.json public routes views
27. Now Add Extra Packages
• package.json is where you specify what packages you’ll depend
on
• It is like Rails’ Gemfile
• mongoose and mongoose-auth give us access to MongoDB
• I snuck ejs in here too
30. Build It!
• Set up bootstrap static resources
• Set up a bootstrap layout and index page template
• Add conf.js
• Wire conf.js into app.js
• Add mongoose-auth to app.js
• Build models
31. Set Up Static Bootstrap
Resources
• Download bootstrap from http://twitter.github.com/bootstrap/
• Unpack it to your public directory
• Personal preference: I put it inside the express app’s public directory at
public/tbs:
jonathan@ogmore /mnt/hgfs/src/expspreleton $ ls public/tbs
css img js
32. Set Up Bootstrap Layout and
Index Page
• Bootstrap’s examples page is a good place to get nice
starting templates: http://twitter.github.com/bootstrap/examples.html.
Click an example to view it, then save its page source
• The bulk of the page becomes your layout.ejs. Then you find the
part that changes and it becomes your index.ejs--and eventually
the other pages:
jonathan@ogmore /mnt/hgfs/src/expspreleton $ ls views
index.ejs layout.ejs
33. Add a conf.js File
• Good place to store application-specific configuration
• Frequently not a good file to commit to the code repository: values differ
between environments (production versus development), and you do not want
the production values visible
• Example:
jonathan@ogmore /mnt/hgfs/src/expspreleton $ cat conf.js
module.exports = {
sessionSecret: 'YouHadBetterNotStoreYourSecretThisWay'
, google: {
clientId: '_your_client_id_goes_here_'
, clientSecret: '_your_client_secret_goes_here_'
}
};
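One common way to keep real secrets out of the repository is to have conf.js read them from environment variables and fall back to placeholders in development. This is a minimal sketch, not how the example app does it. The helper fromEnv and the variable names SESSION_SECRET, GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are assumptions for illustration.

```javascript
// fromEnv is a hypothetical helper: use the environment variable when it is
// set and non-empty, otherwise fall back to the given default.
function fromEnv(name, fallback) {
  var value = process.env[name];
  return (value === undefined || value === '') ? fallback : value;
}

// Same shape as the conf.js on the slide; the variable names are assumptions.
var conf = {
    sessionSecret: fromEnv('SESSION_SECRET', 'dev-only-secret')
  , google: {
        clientId: fromEnv('GOOGLE_CLIENT_ID', '_your_client_id_goes_here_')
      , clientSecret: fromEnv('GOOGLE_CLIENT_SECRET', '_your_client_secret_goes_here_')
    }
};

module.exports = conf;
```

The rest of the app still does require('./conf') and reads conf.sessionSecret, so nothing else has to change.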
34. Wire in conf.js
• app.js is the “main” node.js file in an express app
• You can also see some other setup stuff
/**
* Module dependencies.
*/
var express = require('express')
, routes = require('./routes');
var conf = require('./conf');
35. Add Authentication with
mongoose-auth
• See the main file (server.js) of the example application in the mongoose-auth source code:
https://github.com/bnoguchi/mongoose-auth/blob/master/example/server.js
• The magic here is mongoose-auth wiring itself into the UserSchema
• Here is the crux; what follows this snippet in that file is the part you configure yourself:
var everyauth = require('everyauth')
, Promise = everyauth.Promise;
everyauth.debug = true;
var mongoose = require('mongoose')
, Schema = mongoose.Schema
, ObjectId = mongoose.SchemaTypes.ObjectId;
var UserSchema = new Schema({})
, User;
var mongooseAuth = require('mongoose-auth');
UserSchema.plugin(mongooseAuth, {
36. Enable and Use Authentication
• The everyauth code here is what is doing the authentication checking
• Here is some example code that implements a login button:
<div class="btn-group pull-right">
  <% if (!everyauth.loggedIn) { %>
    <a class="btn" href="/auth/google">
      <i class="icon-cog"></i>Log in
  <% } else { %>
    <a class="btn" href="/logout">
      <i class="icon-user"></i> <%= everyauth.google.user.name %>
  <% } %>
    </a>
</div>
37. Make Your Other Schemas
• For example, create and add into your app a Link schema. This is the Link.js module:
var mongoose = require('mongoose')
, Schema = mongoose.Schema
, ObjectId = mongoose.SchemaTypes.ObjectId;
var LinkSchema = new Schema({
    id          : ObjectId
  , title       : String
  , url         : String
  , description : String
  , created     : Date
  , tags        : [String]
  , owner       : ObjectId
});
module.exports = mongoose.model('Link', LinkSchema);
38. Run Your App
jonathan@ogmore /mnt/hgfs/src/expspreleton $ nave use 0.6.18
Already installed: 0.6.18
using 0.6.18
jonathan@ogmore /mnt/hgfs/src/expspreleton $ node app.js
Express server listening on port 3000 in development mode
42. Go Faster: Speeding Up and
Fixing Your Site
• node-debugger is your friend
• Understand the performance implications of:
• Layers of abstraction
• The node event loop
• As your application scales up, speed means runtime performance, not
development velocity: replace easy choices with faster but more complex ones
43. Node Debugger
• Here’s a stack trace. What now?
SyntaxError: Unexpected identifier
at Object.Function (unknown source)
at Object.compile (/mnt/hgfs/src/expspreleton/node_modules/ejs/lib/ejs.js:209:12)
at Function.compile (/mnt/hgfs/src/expspreleton/node_modules/express/lib/view.js:68:33)
at ServerResponse._render (/mnt/hgfs/src/expspreleton/node_modules/express/lib/view.js:417:18)
at ServerResponse.render (/mnt/hgfs/src/expspreleton/node_modules/express/lib/view.js:318:17)
at /mnt/hgfs/src/expspreleton/routes/index.js:9:7
at callbacks (/mnt/hgfs/src/expspreleton/node_modules/express/lib/router/index.js:272:11)
at param (/mnt/hgfs/src/expspreleton/node_modules/express/lib/router/index.js:246:11)
at pass (/mnt/hgfs/src/expspreleton/node_modules/express/lib/router/index.js:253:5)
at Router._dispatch (/mnt/hgfs/src/expspreleton/node_modules/express/lib/router/index.js:280:4)
46. Performance: Levels of
Indirection
• Learn the underlying implementation of the libraries you use. They
may impose a performance overhead for things you do not need
• For example, Mongoose was great for getting started but involves
extra indirection to map json keys to object members
• Mongoose does not optimize its MongoDB access patterns for
performance by default: you may want to specify subsets/index
ordering
• You may find as performance needs grow that you will want to replace
Mongoose with your own custom classes using the native driver
47. Node Asynchronous Is Not
Free
• The first rule of node.js is do not block the event loop
• The second rule of node.js is do not block the event loop
• The first time you write something in node.js, you will block the
event loop
* Apologies to Chuck Palahniuk
48. Be Aware of Event Loop
Blocking
• Basically: if you do something that is processor-intensive you will block the loop
and your performance will plummet
• If you do something with slow resources that does not use a well-designed
asynchronous library and callbacks, you will block and performance will
plummet
• Learn how the libraries you use are implemented. If you have to implement
async libraries, understand how well-behaved node.js modules use the node api
• If you need to impose ordering on a large number of asynchronous operations
or have deeply nested callbacks upon callbacks, consider a library like async.js
or futures to implement more orderly sequencing
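One way to keep processor-intensive work from starving the loop is to cut it into slices and yield between them. This is a minimal sketch under the assumption that your Node version provides setImmediate, which I believe arrived after the 0.6/0.8 releases used in this deck. sumInChunks is a hypothetical helper, and the numbers are arbitrary.

```javascript
// sumInChunks is a hypothetical helper: it adds up 0..limit-1 in slices of
// chunkSize iterations, yielding to the event loop between slices.
function sumInChunks(limit, chunkSize, done) {
  var total = 0;
  var i = 0;
  function work() {
    var end = Math.min(i + chunkSize, limit);
    for (; i < end; i++) { total += i; }
    if (i < limit) {
      setImmediate(work); // let pending I/O and timers run, then continue
    } else {
      done(total);
    }
  }
  work();
}

// A timer standing in for "other clients": it can only fire if the loop
// gets a turn while the sum is being computed.
var ticks = 0;
var timer = setInterval(function () { ticks++; }, 1);

sumInChunks(5000000, 100000, function (total) {
  clearInterval(timer);
  console.log('total:', total);
  console.log('timer callbacks that ran during the computation:', ticks);
});
```

Written as one plain for loop over the whole range, the same sum would give the timer no chance to fire until it finished. For really heavy work, moving it out of the process entirely is the more robust choice.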
50. Links to the Components
Discussed
• node.js: http://nodejs.org/
• MongoDB: http://www.mongodb.org/ and http://www.10gen.com/
• nave: https://github.com/isaacs/nave/
• npm: installs with node now, but info is at http://npmjs.org/