node.js in production:
Reflections on three years
of riding the unicorn
Bryan Cantrill
SVP, Engineering
bryan@joyent.com
@bcantrill
Tuesday, December 3, 13
Production systems

•

Production systems are ones doing real work: when
they misbehave, users or other systems are affected

•

Production systems value reliability, performance and
ease of deployment — usually in that order

•

Contrast with development systems, which value ease of
development and speed of development, in that order

•

These values can be in tension: new languages and
environments typically arise for their development
values, not their production ones

•

Would node.js be any different?

node.js advantages

•

In terms of production suitability, node.js had, and still
has, a couple of major advantages going for it:

•

It’s built on a VM (V8) that itself was designed for
performance

•

It leverages extant (Unix) abstractions

•

It’s not a new language

•

Its pure event-oriented model aligns ease of
programming with scalability with respect to load

•

As the stewards of both node and SmartOS, Joyent had
another advantage: we could change, improve or
leverage SmartOS to accommodate node in production

node.js challenges

•

But node.js also has a couple of major challenges:

•

JavaScript closures make it easy to accidentally
reference memory

•

Because node.js is often used to connect backend
components, failure to propagate back pressure can
induce memory explosion and death

•

Single-threaded execution of JavaScript means that
compute-bound code can entirely impede progress

•

High performance VM also implies inscrutable core
dumps and very limited instrumentation

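To make the back pressure challenge concrete, here is a minimal sketch of propagating back pressure by hand between a fast source and a slow sink, using only node's core stream module. The helper name copyWithBackPressure, the chunk sizes and the deliberately slow sink are assumptions made for illustration; they do not come from the talk.

```javascript
'use strict';
const { Readable, Writable } = require('stream');

// Hypothetical helper: copy `source` into `sink`, pausing the source
// whenever the sink reports that its internal buffer is full.
function copyWithBackPressure(source, sink, done) {
    source.on('data', function (chunk) {
        // write() returns false once the sink has buffered at least
        // highWaterMark bytes; stop reading until it drains.
        if (!sink.write(chunk)) {
            source.pause();
            sink.once('drain', function () { source.resume(); });
        }
    });
    source.on('end', function () { sink.end(); });
    sink.on('finish', done);
}

// Illustrative use: a fast in-memory producer and a slow consumer.
const chunks = [];
for (let i = 0; i < 100; i++) {
    chunks.push(Buffer.alloc(1024, i));
}
const source = Readable.from(chunks);
let received = 0;
const sink = new Writable({
    highWaterMark: 4096,
    write(chunk, encoding, callback) {
        received += chunk.length;
        setImmediate(callback);    // simulate a slow backend
    }
});

copyWithBackPressure(source, sink, function () {
    console.log('copied %d bytes', received);
});
```

Ignoring the return value of write() is what lets a fast producer queue unbounded data in memory. In practice source.pipe(sink) does this bookkeeping for you.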
August 2010: DTrace in node.js

•

Added simple user-level statically defined tracing
(USDT) probes for node.js on platforms that support
DTrace (e.g., Mac OS X, SmartOS)

•

Probes were around connection establishment, serving
HTTP requests, etc.

•

Allowed questions to be dynamically asked of running,
production node.js servers, e.g.:
dtrace -n 'node*:::http-server-request{
    printf("%s of %s from %s\n", args[0]->method,
    args[0]->url, args[1]->remoteAddress)}'
dtrace -n http-server-request'{
    @[args[1]->remoteAddress] = count()}'
dtrace -n gc-start'{self->ts = timestamp}' \
    -n gc-done'/self->ts/{@ = quantize(timestamp - self->ts)}'

August 2010: Deploying 0.2.x

•

In August 2010, we deployed our first node.js-based
service into production: a Node Knockout leaderboard
that used node.js DTrace probes to geolocate
connections to contestants in real time

•

Results were promising; surprisingly easy to develop
and deploy a node.js based service — and service
consumed very little CPU

•

Watching the Node Knockout contestants in production
revealed they were all light on CPU:

•

But there was a storm cloud...

August 2010: Deploying 0.2.x, cont.

•

We had a memory leak that resulted in heap exhaustion
after several hours under heavy load

•

Our service was stateless and load balanced for HA, so
this was more disconcerting than debilitating...

•

...but we also had quite a few contestants that would run
their RSS up and crash; there was clearly a larger issue:

February 2011: 0.4.0

•

In February 2011, we deployed our first major node.js-based
service (on 0.4.0)

•

Service was able to be built remarkably quickly — but
with some pain-points around Connect

•

Despite being potentially a compute-bound service,
CPU consumption was (again) a non-issue

•

And with an updated node (and many fixed node leaks),
memory consumption wasn’t necessarily as acute...

•

…but we hit our first “spinning black hole” problem

January 2011: node-dtrace-provider

•

Our DTrace probes in node were proving to be too low-level
for higher-level services; we needed to allow
USDT probes to be expressed in JavaScript

•

Fortunately, DTrace community member Chris Andrews
extended his libusdt to node.js, allowing statically
defined probes in JavaScript, e.g.:
var d = require('dtrace-provider');
var dtp = d.createDTraceProvider('foo');
// 'json' declares one argument, rendered as a JSON string
var probe = dtp.addProbe('foo-start', 'json');
dtp.enable();
probe.fire(function (p) {
    return ([ { bar: 123, baz: 'bar' } ]);
});

April 2011: Restify

•

Based on our experiences with Connect/Express, we
wanted to build a node module that was purpose-built to
implement HTTP-based API endpoints

•

Based on Chris Andrews’ work, we wanted to have
first-class support for DTrace

•

Joyent’s Mark Cavage developed node-restify, which
quickly became the foundation for all of our services

•

Built-in DTrace support allows full observability into
per-route/per-handler latency, a capability that we could
not live without at this point

November 2011: MDB support for V8

•

In mid-2011, Joyent’s Dave Pacheco dared to dream the
impossible dream: full postmortem support for V8 for
MDB, the debugger native to SmartOS

•

Several unspeakable layer violations later, mdb_v8 brought
postmortem debugging to node.js

•

::jsstack prints full stack including both native C++
frames and JavaScript frames

•

::jsprint prints JavaScript objects from the dump

•

Thanks to mdb_v8, we were able to go back to a core
dump from that infinite loop in our service deployed
several months earlier and nail it

December 2011: DTrace ustack helper

•

mdb_v8 was actually a way station to an even bolder
dream: a DTrace ustack helper for node.js

•

A ustack helper is a bit of code that accompanies a
binary and assists DTrace in probe context to resolve
stack frames to their higher-level names

•

Once completed, allows user-level stack traces to be
associated with in-kernel events — like profiling events

•

Can use the DTrace profile provider to determine how a
node.js program is consuming CPU via stack sampling

December 2011: Flame graphs

•

Poring over stack traces can make hot functions
difficult to visualize

•

Joyent’s Brendan Gregg developed flame graphs, which
allow us to easily visualize thousands of sampled
stacks:

January 2012: Bunyan

•

Logging was becoming more and more of a problem for
us — especially as we were developing distributed
systems in node.js

•

Joyent’s Trent Mick developed node-bunyan, a simple
and fast JSON logging library for node.js

•

Provides standardized, JSON, line-based log output that
can be easily processed with JSON tools, e.g.:
{"name":"moray","hostname":"d1cfb6c7-c975-4ed8-a689-fb18f94b6bfc","pid":8393,"component":"manatee","path":"/manatee/sdc/election","level":20,"db":{"available":2,"max":15,"size":2,"waiting":0},"options":{"async":false,"read":true},"msg":"pg: entered","time":"2013-12-03T02:54:24.565Z","v":0}

•

Also includes a command-line tool, bunyan, for displaying
Bunyan logs

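The record shape shown above can be approximated in a few lines. This is a hand-rolled sketch, not node-bunyan's implementation or API: makeLogger is a hypothetical name, and the field set (name, hostname, pid, level, msg, time, v) simply mirrors the sample record, where a debug message carries "level":20.

```javascript
'use strict';
const os = require('os');

// Numeric levels follow the convention visible in the sample record,
// where a debug message is logged with "level":20.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

// Hypothetical stand-in for a Bunyan-style logger: every call writes
// exactly one JSON object, terminated by a newline, to `out`.
function makeLogger(name, out) {
    const logger = {};
    Object.keys(LEVELS).forEach(function (levelName) {
        logger[levelName] = function (fields, msg) {
            const record = Object.assign({
                name: name,
                hostname: os.hostname(),
                pid: process.pid,
                level: LEVELS[levelName]
            }, fields, {
                msg: msg,
                time: new Date().toISOString(),
                v: 0
            });
            out.write(JSON.stringify(record) + '\n');
        };
    });
    return logger;
}

// Illustrative use, echoing the sample record's name and component.
const log = makeLogger('moray', process.stdout);
log.debug({ component: 'manatee' }, 'pg: entered');
```

Because each record is a single line of JSON, any tool that reads newline-delimited JSON can filter or reshape the output without a custom parser.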
February 2012: npm shrinkwrap

•

npm allows for fine-grained semver control over
package dependencies, but we found that nested
dependencies could result in non-replicable installs

•

“npm shrinkwrap” generates a file that shrinkwraps all
nested dependencies into npm-shrinkwrap.json,
thereby locking down all nested versions

•

Guarantees that all installs will have the same semver
versions of dependencies

•

Doesn’t necessarily guarantee identical installs,
however; for this, one needs private npm repositories

April 2012: node-vasync

•

There are a number of modules that deal with some of
the mechanics of asynchronous control flow…

•

But we found we needed one that emphasized debugging

•

node-vasync captures a number of popular flow patterns
and allows state to be inspected via MDB
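To illustrate what inspectable state can mean for a flow-control helper, here is a small sketch. It is not node-vasync's code or API: trackedParallel and the field names on the status object are hypothetical. The idea is that the helper returns a plain object that is updated in place as operations complete, so the outstanding work can be found from a debugger or a core dump.

```javascript
'use strict';

// Hypothetical helper (not node-vasync's API): run callback-style
// functions in parallel and return a status object that is updated
// in place as each operation completes.
function trackedParallel(funcs, callback) {
    const status = {
        operations: funcs.map(function (func) {
            return {
                funcname: func.name || '(anon)',
                status: 'pending',
                err: undefined,
                result: undefined
            };
        }),
        ndone: 0,
        nerrors: 0
    };

    if (funcs.length === 0) {
        setImmediate(callback, null, status);
        return status;
    }

    funcs.forEach(function (func, i) {
        func(function (err, result) {
            const op = status.operations[i];
            op.status = err ? 'fail' : 'ok';
            op.err = err;
            op.result = result;
            status.ndone++;
            if (err) {
                status.nerrors++;
            }
            if (status.ndone === funcs.length) {
                callback(status.nerrors > 0 ?
                    new Error(status.nerrors + ' operation(s) failed') : null,
                    status);
            }
        });
    });
    return status;
}

// Illustrative use: the second operation never calls back.
const state = trackedParallel([
    function fetchUser(cb) { setImmediate(cb, null, 'alice'); },
    function fetchOrders(cb) { /* hangs: callback is never invoked */ }
], function (err, result) {
    console.log('all done', err, result);
});

setImmediate(function () {
    // Report each operation's current status by function name.
    console.log(state.operations.map(function (op) {
        return op.funcname + ': ' + op.status;
    }));
});
```

Named functions matter here: the status object records each function's name, so a hung operation shows up by name rather than as an anonymous closure.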

May 2012: ::findjsobjects

•

Building on Dave Pacheco’s mdb_v8, we implemented a
debugger command that iterates over all of memory in a
core dump, looking for JavaScript objects

•

Entirely brute force, but allows one to take a swing at a
nasty node.js issue: semantic memory leaks
> ::findjsobjects
  OBJECT #OBJECTS #PROPS CONSTRUCTOR: PROPS
95709ac1      195      3 Object: socket, type, handle
957093f9       66      9 Object: uid, windowsVerbatimArguments, stdio, …
95f13181      130      5 <anonymous> (as exports.StringDecoder): …
8432ff55      222      3 Buffer: length, offset, parent
843304dd       91      9 Object: refreservation, creation, name, type, …
8432cc55       99      9 Object: time, msg, level, hostname, pid, action, …
95f08545       66     14 ChildProcess: _closesNeeded, stdio, …
8432f2e1      546      2 Array
9570cafd       47     24 Object: <sliced string>, <sliced string>, …
8432be95      415      3 Array
8432fb09       67     19 Socket: errorEmitted, _bytesDispatched, …

May 2012: ::findjsobjects -p

•

Searching by property name allows one to find particular
objects in the JavaScript heap, e.g.:
> ::findjsobjects -p ip4addr | ::findjsobjects | ::jsprint -a
8432b109: {
ip4addr: 9aee115d: "10.88.88.200",
VLAN: 9aee1199: "0",
Host Interface: 9aee1185: "e1000g0",
Link Status: 9aee1175: "up",
MAC Address: 9aee113d: "02:08:20:47:93:82",
}
…

•

While designed for postmortem debugging, this allows
mdb_v8 to be used for in situ debugging in development

•

Also guides one to a best practice: towards unique
property names (which we have historically done in the
operating system via structure prefixing)

July 2012: node-fast

•

While HTTP makes it very easy to put together a
distributed system, parsing and connection
management can become prohibitively expensive

•

In building Manta, we found that we needed something
lighter/faster; Joyent’s Mark Cavage built node-fast

•

Only what you need: fully async/duplex/persistent
connections, simple on-wire protocol (JSON), etc.

•

None of what you don’t want: no IDL madness, no object
model, no binary translation madness, etc.

•

Deliberately light and limited — HTTP is still the right
answer until it isn’t

October 2012: Bunyan + DTrace

•

With all of our services using Bunyan, we could enable
dynamic logging by adding DTrace USDT probes

•

Can use the raw DTrace probes:
# dtrace -qn log-debug'{printf("%s\n", copyinstr(arg0))}' -x strsize=8k
{"name":"wf-moray-backend","hostname":"414ffb35-adee-47b7-bdf4-d21cb039386c","pid":10952,"component":"MorayClient","host":"10.99.99.17","port":2020,"req_id":"bddb180f-1770-edcf-8df2-b3a81d97e9b1","level":20,"bucket":"wf_runners","key":"414ffb35-adee-47b7-bdf4-d21cb039386c","value":{"active_at":"2013-12-03T07:22:25.125Z","idle":false},"msg":"putObject: entered","time":"2013-12-03T07:22:25.135Z","v":0}
...

•

Added the json() subroutine to DTrace to make this
easier to process

•

Can also use “bunyan -p” and avoid the lower-level
DTrace details entirely

May 2013: --abort-on-uncaught-exception

•

Crash dumps are great, but aborting only after an
uncaught exception has unwound the stack makes it very
difficult to determine the true origin of the exception

•

Dave Pacheco implemented a V8 patch to induce a
process abort (and a core dump) on an uncaught
exception

•

This allows us to use postmortem debugging to debug
our everyday logic errors

•

Available starting in 0.10.x — we use it wherever we
have it!

July 2013: Thoth

•

One of the most important systems we have built in
node is Manta, our object store featuring in situ compute

•

Manta is an excellent platform for building data-based
services — especially for large data objects

•

We built manta-thoth, a platform for core and crash
dump analysis that allows us to debug core dumps
without moving them

•

Thoth has become critically important for us to track and
automatically debug production node.js services

December 2013: Dump analysis on Linux

•

Postmortem debugging has been a (the) tremendous
breakthrough for node.js in production…

•

...but despite node’s postmortem support all being
open source, it has been limited to SmartOS

•

Some have toyed with porting MDB to Linux; this is in
principle possible, but will be rough sledding

•

Joyent’s TJ Fontaine (of node core fame) observed what
we had done with dump analysis on Manta and had a
simpler idea…

•

What about making Linux dumps consumable on
SmartOS — and therefore Manta?

December 2013: Linux support in libproc

•

Over the course of a multiday engineering hackathon,
TJ and Joyent’s Max Bruning added support for Linux
crash dumps in SmartOS’s libproc

•

Fortunately, because of the way the postmortem work
was done by Dave Pacheco, it Just Works

•

Do this yourself:
https://gist.github.com/tjfontaine/de104fe058300a51f7cf

•

For Linux users: put your Linux dumps to Manta, and
you can finally debug those pesky leaks and crashes!

•

Use --abort-on-uncaught-exception and you
can use Manta and postmortem debugging to debug
more quotidian programming errors!

Node.js in production!

•

For us at Joyent, the tooling that we have built into
node.js has resulted in what we believe to be the best
dynamic environment for production use

•

Yes, even when compared to much older platforms like
Java and Erlang...

•

There is still work to be done, especially around add-on
development (see TJ’s shim work!) and potentially better
bundling of objects…

•

We will continue to emphasize production deployment
and use in our stewardship of node.js!

Thank you

•

@dapsays, the Patron Saint of node.js in production, for
DTrace support, MDB support, node-vasync, Manta, etc.

•

@mcavage for node-restify, node-fast, Manta, etc.

•

@trentmick for node-bunyan

•

@chrisandrews for node-dtrace-provider

•

@brendangregg for flame graphs

•

@tjfontaine for bringing postmortem debugging to an
entirely new audience with Linux support for libproc!

More Related Content

What's hot

The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbookbcantrill
 
Bringing the Unix Philosophy to Big Data
Bringing the Unix Philosophy to Big DataBringing the Unix Philosophy to Big Data
Bringing the Unix Philosophy to Big Databcantrill
 
Papers We Love: Jails and Zones
Papers We Love: Jails and ZonesPapers We Love: Jails and Zones
Papers We Love: Jails and Zonesbcantrill
 
Why it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalWhy it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalbcantrill
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guidebcantrill
 
The dream is alive! Running Linux containers on an illumos kernel
The dream is alive! Running Linux containers on an illumos kernelThe dream is alive! Running Linux containers on an illumos kernel
The dream is alive! Running Linux containers on an illumos kernelbcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondbcantrill
 
Down Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab AllocatorDown Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab Allocatorbcantrill
 
Triton + Docker, July 2015
Triton + Docker, July 2015Triton + Docker, July 2015
Triton + Docker, July 2015Casey Bisson
 
Cloud stack design camp on jun 15
Cloud stack design camp on jun 15Cloud stack design camp on jun 15
Cloud stack design camp on jun 15Isaac Chiang
 
BayLISA meetup: 8/16/12
BayLISA meetup: 8/16/12BayLISA meetup: 8/16/12
BayLISA meetup: 8/16/12bcantrill
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of databcantrill
 
Instrumenting the real-time web
Instrumenting the real-time webInstrumenting the real-time web
Instrumenting the real-time webbcantrill
 
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarry
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarryCeph, Xen, and CloudStack: Semper Melior-XPUS13 McGarry
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarryThe Linux Foundation
 
Deploying Apache CloudStack from API to UI
Deploying Apache CloudStack from API to UIDeploying Apache CloudStack from API to UI
Deploying Apache CloudStack from API to UIJoe Brockmeier
 
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCP
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCPOscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCP
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCPThe Linux Foundation
 
BACD July 2012 : The Xen Cloud Platform
BACD July 2012 : The Xen Cloud Platform BACD July 2012 : The Xen Cloud Platform
BACD July 2012 : The Xen Cloud Platform The Linux Foundation
 

What's hot (20)

The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
 
Bringing the Unix Philosophy to Big Data
Bringing the Unix Philosophy to Big DataBringing the Unix Philosophy to Big Data
Bringing the Unix Philosophy to Big Data
 
Papers We Love: Jails and Zones
Papers We Love: Jails and ZonesPapers We Love: Jails and Zones
Papers We Love: Jails and Zones
 
Why it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalWhy it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metal
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
 
The dream is alive! Running Linux containers on an illumos kernel
The dream is alive! Running Linux containers on an illumos kernelThe dream is alive! Running Linux containers on an illumos kernel
The dream is alive! Running Linux containers on an illumos kernel
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
 
Down Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab AllocatorDown Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab Allocator
 
Triton + Docker, July 2015
Triton + Docker, July 2015Triton + Docker, July 2015
Triton + Docker, July 2015
 
Cloud stack design camp on jun 15
Cloud stack design camp on jun 15Cloud stack design camp on jun 15
Cloud stack design camp on jun 15
 
BayLISA meetup: 8/16/12
BayLISA meetup: 8/16/12BayLISA meetup: 8/16/12
BayLISA meetup: 8/16/12
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of data
 
Instrumenting the real-time web
Instrumenting the real-time webInstrumenting the real-time web
Instrumenting the real-time web
 
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarry
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarryCeph, Xen, and CloudStack: Semper Melior-XPUS13 McGarry
Ceph, Xen, and CloudStack: Semper Melior-XPUS13 McGarry
 
Xen and Apache cloudstack
Xen and Apache cloudstack  Xen and Apache cloudstack
Xen and Apache cloudstack
 
Deploying Apache CloudStack from API to UI
Deploying Apache CloudStack from API to UIDeploying Apache CloudStack from API to UI
Deploying Apache CloudStack from API to UI
 
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCP
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCPOscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCP
Oscon 2012 : From Datacenter to the Cloud - Featuring Xen and XCP
 
Hyper v r2 deep dive
Hyper v r2 deep diveHyper v r2 deep dive
Hyper v r2 deep dive
 
CloudStack Architecture
CloudStack ArchitectureCloudStack Architecture
CloudStack Architecture
 
BACD July 2012 : The Xen Cloud Platform
BACD July 2012 : The Xen Cloud Platform BACD July 2012 : The Xen Cloud Platform
BACD July 2012 : The Xen Cloud Platform
 

Similar to Node.js in production: Reflections on three years of riding the unicorn

Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionbcantrill
 
Introduction to node.js aka NodeJS
Introduction to node.js aka NodeJSIntroduction to node.js aka NodeJS
Introduction to node.js aka NodeJSJITENDRA KUMAR PATEL
 
Accra MongoDB User Group
Accra MongoDB User GroupAccra MongoDB User Group
Accra MongoDB User GroupMongoDB
 
Новый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоныНовый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоныTimur Safin
 
Practical Use of MongoDB for Node.js
Practical Use of MongoDB for Node.jsPractical Use of MongoDB for Node.js
Practical Use of MongoDB for Node.jsasync_io
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Roopa Tangirala
 
Docker dev ops for cd meetup 12-14
Docker dev ops for cd meetup 12-14Docker dev ops for cd meetup 12-14
Docker dev ops for cd meetup 12-14Simon Storm
 
Introduction to cloud and openstack
Introduction to cloud and openstackIntroduction to cloud and openstack
Introduction to cloud and openstackShivaling Sannalli
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleDmytro Semenov
 
Capstone Project Final Presentation
Capstone Project Final PresentationCapstone Project Final Presentation
Capstone Project Final PresentationMatthew Chang
 
Questions On The Code And Core Module
Questions On The Code And Core ModuleQuestions On The Code And Core Module
Questions On The Code And Core ModuleKatie Gulley
 
#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDB#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDBdan-p-kimmel
 
An Introduction to Node.js Development with Windows Azure
An Introduction to Node.js Development with Windows AzureAn Introduction to Node.js Development with Windows Azure
An Introduction to Node.js Development with Windows AzureTroy Miles
 
Art and Science of Web Sites Performance: A Front-end Approach
Art and Science of Web Sites Performance: A Front-end ApproachArt and Science of Web Sites Performance: A Front-end Approach
Art and Science of Web Sites Performance: A Front-end ApproachJiang Zhu
 

Similar to Node.js in production: Reflections on three years of riding the unicorn (20)

18_Node.js.ppt
18_Node.js.ppt18_Node.js.ppt
18_Node.js.ppt
 
18_Node.js.ppt
18_Node.js.ppt18_Node.js.ppt
18_Node.js.ppt
 
Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in production
 
Introduction to node.js aka NodeJS
Introduction to node.js aka NodeJSIntroduction to node.js aka NodeJS
Introduction to node.js aka NodeJS
 
Accra MongoDB User Group
Accra MongoDB User GroupAccra MongoDB User Group
Accra MongoDB User Group
 
Новый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоныНовый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоны
 
Practical Use of MongoDB for Node.js
Practical Use of MongoDB for Node.jsPractical Use of MongoDB for Node.js
Practical Use of MongoDB for Node.js
 
F8 tech talk_pinterest_v4
F8 tech talk_pinterest_v4F8 tech talk_pinterest_v4
F8 tech talk_pinterest_v4
 
Node js
Node jsNode js
Node js
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup)
 
Docker dev ops for cd meetup 12-14
Docker dev ops for cd meetup 12-14Docker dev ops for cd meetup 12-14
Docker dev ops for cd meetup 12-14
 
Introduction to cloud and openstack
Introduction to cloud and openstackIntroduction to cloud and openstack
Introduction to cloud and openstack
 
Node azure
Node azureNode azure
Node azure
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scale
 
Capstone Project Final Presentation
Capstone Project Final PresentationCapstone Project Final Presentation
Capstone Project Final Presentation
 
Questions On The Code And Core Module
Questions On The Code And Core ModuleQuestions On The Code And Core Module
Questions On The Code And Core Module
 
#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDB#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDB
 
An Introduction to Node.js Development with Windows Azure
An Introduction to Node.js Development with Windows AzureAn Introduction to Node.js Development with Windows Azure
An Introduction to Node.js Development with Windows Azure
 
Surge2012
Surge2012Surge2012
Surge2012
 
Art and Science of Web Sites Performance: A Front-end Approach
Art and Science of Web Sites Performance: A Front-end ApproachArt and Science of Web Sites Performance: A Front-end Approach
Art and Science of Web Sites Performance: A Front-end Approach
 

More from bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Presentbcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmakingbcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...bcantrill
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsbcantrill
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systemsbcantrill
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolutionbcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Agebcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesbcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Lawbcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineeringbcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemapsbcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarebcantrill
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?bcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the unionbcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsbcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after darkbcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadershipbcantrill
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathbcantrill
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindbcantrill
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionbcantrill
 

Node.js in production: Reflections on three years of riding the unicorn

  • 1. node.js in production: Reflections on three years of riding the unicorn Bryan Cantrill SVP, Engineering bryan@joyent.com @bcantrill Tuesday, December 3, 13
  • 2. Production systems
      • Production systems are ones doing real work: when they misbehave, users or other systems are affected
      • Production systems value reliability, performance and ease of deployment, usually in that order
      • Contrast with development systems, which value ease of development and speed of development, in that order
      • These values can be in tension: new languages and environments typically arise for their development values, not their production ones
      • Would node.js be any different?
  • 3. node.js advantages
      • In terms of production suitability, node.js had, and still has, a couple of major advantages going for it:
          • It’s not a new language
          • It’s built on a VM (V8) that itself was designed for performance
          • It leverages extant (Unix) abstractions
          • Its pure event-oriented model aligns ease of programming with scalability with respect to load
      • As the stewards of both node and SmartOS, Joyent had another advantage: we could change, improve or leverage SmartOS to accommodate node in production
  • 4. node.js challenges
      • But node.js also has a couple of major challenges:
          • Single-threaded execution of JavaScript means that compute-bound code can entirely impede progress
          • JavaScript closures make it easy to accidentally reference memory
          • Because node.js is often used to connect backend components, failure to propagate back pressure can induce memory explosion and death
          • High performance VM also implies inscrutable core dumps and very limited instrumentation
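The back-pressure challenge above can be sketched with node's own stream API. This is a minimal illustration, not code from the talk: `writeWithBackPressure` and `makeSlowSink` are hypothetical names, and the slow sink stands in for whatever backend component is being fed. The producer stops writing when `write()` returns false and resumes on `'drain'`, so data does not pile up in memory.

```javascript
'use strict';
const { Writable } = require('stream');

// Write every chunk to `dest`, pausing whenever dest.write() reports that its
// internal buffer is full and resuming only when 'drain' fires.
// Calls done(pauses) once the destination has finished.
function writeWithBackPressure(dest, chunks, done) {
  let next = 0;
  let pauses = 0;

  function pump() {
    while (next < chunks.length) {
      const accepted = dest.write(chunks[next++]);
      if (!accepted && next < chunks.length) {
        // The consumer is behind: stop producing until it catches up.
        pauses++;
        dest.once('drain', pump);
        return;
      }
    }
    dest.once('finish', () => done(pauses));
    dest.end();
  }

  pump();
}

// Hypothetical slow consumer with a tiny buffer (4 bytes), standing in for a
// backend that cannot keep up with the producer.
function makeSlowSink(received) {
  return new Writable({
    highWaterMark: 4,
    write(chunk, encoding, callback) {
      received.push(chunk.toString());
      setImmediate(callback); // simulate asynchronous work per chunk
    }
  });
}
```

Ignoring the return value of `write()` would still deliver every chunk, but the stream would buffer them all in memory, which is the failure mode the slide describes.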
  • 5. August 2010: DTrace in node.js
      • Added simple user-level statically defined tracing (USDT) probes for node.js on platforms that support DTrace (e.g., Mac OS X, SmartOS)
      • Probes were around connection establishment, serving HTTP requests, etc.
      • Allowed questions to be dynamically asked of running, production node.js servers, e.g.:
          dtrace -n 'node*:::http-server-request{ printf("%s of %s from %s\n", args[0]->method, args[0]->url, args[1]->remoteAddress)}'
          dtrace -n http-server-request'{ @[args[1]->remoteAddress] = count()}'
          dtrace -n gc-start'{self->ts = timestamp}' -n gc-done'/self->ts/{@ = quantize(timestamp - self->ts)}'
  • 6. August 2010: Deploying 0.2.x
      • In August 2010, we deployed our first node.js-based service into production: a Node Knockout leader-board that used node.js DTrace probes to geolocate connections to contestants in real-time
      • Results were promising; it was surprisingly easy to develop and deploy a node.js-based service, and the service consumed very little CPU
      • Watching the Node Knockout contestants in production revealed they were all light on CPU
      • But there was a storm cloud...
  • 7. August 2010: Deploying 0.2.x, cont.
      • We had a memory leak that resulted in heap exhaustion after several hours under heavy load
      • Our service was stateless and load balanced for HA, so this was more disconcerting than debilitating...
      • ...but we also had quite a few contestants that would run their RSS up and crash; there was clearly a larger issue
  • 8. February 2011: 0.4.0
      • In February 2011, we deployed our first major node.js-based service (on 0.4.0)
      • Service was able to be built remarkably quickly, but with some pain-points around Connect
      • Despite being potentially a compute-bound service, CPU consumption was (again) a non-issue
      • And with an updated node (and many fixed node leaks), memory consumption wasn’t necessarily as acute...
      • …but we hit our first “spinning black hole” problem
  • 9. January 2011: node-dtrace-provider
      • Our DTrace probes in node were proving to be too low-level for higher-level services; we needed to allow USDT probes to be expressed in JavaScript
      • Fortunately, DTrace community member Chris Andrews extended his libusdt to node.js, allowing statically defined probes in JavaScript, e.g.:
          var d = require('dtrace-provider');        // the node-dtrace-provider module
          var dtp = d.createDTraceProvider('foo');
          var probe = dtp.addProbe('foo-start');
          dtp.enable();                              // make the provider visible to DTrace
          // The callback only runs when the probe is enabled by a consumer
          probe.fire(function(p) { return ([ { bar: 123, baz: 'bar' } ]); });
  • 10. April 2011: Restify
      • Based on our experiences with Connect/Express, we wanted to build a node module that was purpose-built to implement HTTP-based API endpoints
      • Based on Chris Andrews’ work, we wanted to have first-class support for DTrace
      • Joyent’s Mark Cavage developed node-restify, which quickly became the foundation for all of our services
      • Built-in DTrace support allows full observability into per-route/per-handler latency, a capability that we could not live without at this point
  • 11. November 2011: MDB support for V8
      • In mid-2011, Joyent’s Dave Pacheco dared to dream the impossible dream: full postmortem support for V8 for MDB, the debugger native to SmartOS
      • Several unspeakable layer violations later, mdb_v8 brought postmortem debugging to node.js
          • ::jsstack prints full stack including both native C++ frames and JavaScript frames
          • ::jsprint prints JavaScript objects from the dump
      • Thanks to mdb_v8, we were able to go back to a core dump from that infinite loop in our service deployed several months earlier and nail it
  • 12. December 2011: DTrace ustack helper
      • mdb_v8 was actually a way station to an even bolder dream: a DTrace ustack helper for node.js
      • A ustack helper is a bit of code that accompanies a binary and assists DTrace in probe context to resolve stack frames to their higher-level names
      • Once completed, allows user-level stack traces to be associated with in-kernel events, like profiling events
      • Can use the DTrace profile provider to determine how a node.js program is consuming CPU via stack sampling
  • 13. December 2011: Flame graphs
      • Poring over stack traces can make hot functions difficult to visualize
      • Joyent’s Brendan Gregg developed flame graphs, which allow us to easily visualize thousands of sampled stacks
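The aggregation behind a flame graph can be sketched in a few lines. This is an illustrative sketch, not Brendan Gregg's tooling: `foldStacks` and `buildFlameTree` are hypothetical names, and each sample is assumed to be an array of frame names with the outermost caller first. Identical stacks are folded into counts, and a frame's width is the number of samples that passed through it.

```javascript
'use strict';

// Collapse identical stacks into "frame;frame;frame" keys with a sample count.
function foldStacks(samples) {
  const counts = new Map();
  for (const frames of samples) {
    const key = frames.join(';');
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Merge samples into a call tree. Each node's `value` is the number of samples
// whose stack included that frame at that position, i.e. its width when drawn.
function buildFlameTree(samples) {
  const root = { name: 'all', value: 0, children: new Map() };
  for (const frames of samples) {
    let node = root;
    node.value++;
    for (const name of frames) {
      if (!node.children.has(name)) {
        node.children.set(name, { name: name, value: 0, children: new Map() });
      }
      node = node.children.get(name);
      node.value++;
    }
  }
  return root;
}
```

With thousands of samples, the widest nodes near the top of the tree are the functions most often on CPU, which is what the visualization makes obvious at a glance.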
  • 14. January 2012: Bunyan
      • Logging was becoming more and more of a problem for us, especially as we were developing distributed systems in node.js
      • Joyent’s Trent Mick developed node-bunyan, a simple and fast JSON logging library for node.js
      • Provides standardized, JSON, line-based log output that can be easily processed with JSON tools, e.g.:
          {"name":"moray","hostname":"d1cfb6c7-c975-4ed8-a689-fb18f94b6bfc","pid":8393,"component":"manatee","path":"/manatee/sdc/election","level":20,"db":{"available":2,"max":15,"size":2,"waiting":0},"options":{"async":false,"read":true},"msg":"pg: entered","time":"2013-12-03T02:54:24.565Z","v":0}
      • Also includes command line tool, bunyan, for displaying Bunyan logs
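Because each Bunyan record is one JSON object per line, ordinary JSON tooling is enough to process a log. The sketch below uses only `JSON.parse` rather than the bunyan module itself; `filterLog`, `formatRecord` and the two sample records are hypothetical, while the numeric levels (10 trace through 60 fatal) are Bunyan's.

```javascript
'use strict';

// Bunyan's numeric log levels.
const LEVEL_NAMES = { 10: 'TRACE', 20: 'DEBUG', 30: 'INFO', 40: 'WARN', 50: 'ERROR', 60: 'FATAL' };

// Parse line-delimited JSON and keep records at or above minLevel.
function filterLog(text, minLevel) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line))
    .filter((rec) => rec.level >= minLevel);
}

// Render one record as a compact human-readable line.
function formatRecord(rec) {
  const level = LEVEL_NAMES[rec.level] || String(rec.level);
  return '[' + rec.time + '] ' + level + ' ' + rec.name + '/' + rec.pid + ': ' + rec.msg;
}

// Hypothetical sample records, shaped like the one shown on the slide.
const SAMPLE_LOG = [
  JSON.stringify({ name: 'moray', hostname: 'host-a', pid: 8393, level: 20,
    msg: 'pg: entered', time: '2013-12-03T02:54:24.565Z', v: 0 }),
  JSON.stringify({ name: 'moray', hostname: 'host-a', pid: 8393, level: 50,
    msg: 'pg: query failed', time: '2013-12-03T02:54:25.001Z', v: 0 })
].join('\n');

filterLog(SAMPLE_LOG, 40).forEach((rec) => console.log(formatRecord(rec)));
```

In practice the bunyan command line tool mentioned above does this kind of filtering and rendering; the point here is only that the format needs no custom parser.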
  • 15. February 2012: npm shrinkwrap
      • npm allows for fine-grained semver control over package dependencies, but we found that nested dependencies could result in non-replicable installs
      • “npm shrinkwrap” generates a file that shrinkwraps all nested dependencies into npm-shrinkwrap.json, thereby locking down all nested versions
      • Guarantees that all installs will have the same semver versions of dependencies
      • Doesn’t necessarily guarantee identical installs, however; for this, one needs private npm repositories
  • 16. April 2012: node-vasync
      • There are a number of modules that deal with some of the mechanics of asynchronous control flow…
      • But we found we needed one that emphasized debugging, and in particular, …
      • node-vasync captures a number of popular flow patterns and allows state to be inspected via MDB
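The idea of a control-flow helper whose state can be inspected might look roughly like the sketch below. It is not node-vasync's implementation: `parallelWithStatus` and its field names are assumptions, loosely modelled on the idea of returning a status object. The helper hands back a plain object that records each operation's progress, so a debugger (or a core dump) can show which callback never fired.

```javascript
'use strict';

// Run every function in `funcs` (each takes a Node-style callback) and return
// a status object immediately. The object is updated in place as operations
// complete, so whoever holds a reference can see what is still pending.
function parallelWithStatus(funcs, callback) {
  const status = {
    ndone: 0,
    nerrors: 0,
    operations: funcs.map((func) => ({
      funcname: func.name || '(anonymous)',
      status: 'pending',
      err: undefined,
      result: undefined
    }))
  };

  if (funcs.length === 0) {
    setImmediate(callback, null, status);
    return status;
  }

  funcs.forEach((func, i) => {
    func((err, result) => {
      const op = status.operations[i];
      if (op.status !== 'pending') {
        return; // ignore a callback that is invoked twice
      }
      op.status = err ? 'fail' : 'ok';
      op.err = err;
      op.result = result;
      status.ndone++;
      if (err) {
        status.nerrors++;
      }
      if (status.ndone === funcs.length) {
        const failure = status.nerrors > 0
          ? new Error(status.nerrors + ' operation(s) failed')
          : null;
        callback(failure, status);
      }
    });
  });

  return status;
}
```

A hung request then shows up as an operation whose `status` is still `'pending'`, with the function name attached, rather than as an anonymous closure somewhere in the heap.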
  • 17. May 2012: ::findjsobjects
      • Building on Dave Pacheco’s mdb_v8, we implemented a debugger command that iterates over all of memory in a core dump, looking for JavaScript objects
      • Entirely brute force, but allows one to take a swing at a nasty node.js issue: semantic memory leaks
          > ::findjsobjects
            OBJECT #OBJECTS #PROPS CONSTRUCTOR: PROPS
          95709ac1      195      3 Object: socket, type, handle
          957093f9       66      9 Object: uid, windowsVerbatimArguments, stdio, …
          95f13181      130      5 <anonymous> (as exports.StringDecoder): …
          8432ff55      222      3 Buffer: length, offset, parent
          843304dd       91      9 Object: refreservation, creation, name, type, …
          8432cc55       99      9 Object: time, msg, level, hostname, pid, action, …
          95f08545       66     14 ChildProcess: _closesNeeded, stdio, …
          8432f2e1      546      2 Array
          9570cafd       47     24 Object: <sliced string>, <sliced string>, …
          8432be95      415      3 Array
          8432fb09       67     19 Socket: errorEmitted, _bytesDispatched, …
  • 18. May 2012: ::findjsobjects -p
      • Searching by property name allows one to find particular objects in the JavaScript heap, e.g.:
          > ::findjsobjects -p ip4addr | ::findjsobjects | ::jsprint -a
          8432b109: {
              ip4addr: 9aee115d: "10.88.88.200",
              VLAN: 9aee1199: "0",
              Host Interface: 9aee1185: "e1000g0",
              Link Status: 9aee1175: "up",
              MAC Address: 9aee113d: "02:08:20:47:93:82",
          }
          …
      • While designed for postmortem debugging, this allows mdb_v8 to be used for in situ debugging in development
      • Also guides one to a best practice: towards unique property names (which we have historically done in the operating system via structure prefixing)
  • 19. July 2012: node-fast
      • While HTTP makes it very easy to put together a distributed system, parsing and connection management can become prohibitively expensive
      • In building Manta, we found that we needed something lighter/faster; Joyent’s Mark Cavage built node-fast
      • Only what you need: fully async/duplex/persistent connections, simple on-wire protocol (JSON), etc.
      • None of what you don’t want: no IDL madness, no object model, no binary translation madness, etc.
      • Deliberately light and limited; HTTP is still the right answer until it isn’t
  • 20. October 2012: Bunyan + DTrace
      • With all of our services using Bunyan, we could enable dynamic logging by adding DTrace USDT probes
      • Can use the raw DTrace probes:
          # dtrace -qn log-debug'{printf("%s\n", copyinstr(arg0))}' -x strsize=8k
          {"name":"wf-moray-backend","hostname":"414ffb35-adee-47b7-bdf4-d21cb039386c","pid":10952,"component":"MorayClient","host":"10.99.99.17","port":2020,"req_id":"bddb180f-1770-edcf-8df2-b3a81d97e9b1","level":20,"bucket":"wf_runners","key":"414ffb35-adee-47b7-bdf4-d21cb039386c","value":{"active_at":"2013-12-03T07:22:25.125Z","idle":false},"msg":"putObject: entered","time":"2013-12-03T07:22:25.135Z","v":0}
          ...
      • Added the json() subroutine to DTrace to make this easier to process
      • Can also use “bunyan -p” and avoid the lower-level DTrace details entirely
  • 21. May 2013: --abort-on-uncaught-exception
      • Crash dumps are great, but aborting after an uncaught exception makes it very difficult to determine the true origin of the exception
      • Dave Pacheco implemented a V8 patch to induce a process abort (and a core dump) on an uncaught exception
      • This allows us to use postmortem debugging to debug our everyday logic errors
      • Available starting in 0.10.x; we use it wherever we have it!
  • 22. July 2013: Thoth
      • One of the most important systems we have built in node is Manta, our object store featuring in situ compute
      • Manta is an excellent platform for building data-based services, especially for large data objects
      • We built manta-thoth, a platform for core and crash dump analysis that allows us to debug core dumps without moving them
      • Thoth has become critically important for us to track and automatically debug production node.js services
  • 23. December 2013: Dump analysis on Linux
      • Postmortem debugging has been a (the) tremendous breakthrough for node.js in production…
      • ...but despite node’s postmortem support all being open source, it has been limited to SmartOS
      • Some have toyed with porting MDB to Linux; this is in principle possible, but will be rough sledding
      • Joyent’s TJ Fontaine (of node core fame) observed what we had done with dump analysis on Manta and had a simpler idea…
      • What about making Linux dumps consumable on SmartOS, and therefore Manta?
  • 24. December 2013: Linux support in libproc
      • Over the course of a multiday engineering hackathon, TJ and Joyent’s Max Bruning added support for Linux crash dumps in SmartOS’s libproc
      • Fortunately, because of the way the postmortem work was done by Dave Pacheco, it Just Works
      • Do this yourself: https://gist.github.com/tjfontaine/de104fe058300a51f7cf
      • For Linux users: put your Linux dumps to Manta, and you can finally debug those pesky leaks and crashes!
      • Use --abort-on-uncaught-exception and you can use Manta and postmortem debugging to debug more quotidian programming errors!
  • 25. Node.js in production!
      • For us at Joyent, the tooling that we have built into node.js has resulted in what we believe to be the best dynamic environment for production use
      • Yes, even when compared to much older platforms like Java and Erlang...
      • There is still work to be done, especially around add-on development (see TJ’s shim work!) and potentially better bundling of objects…
      • We will continue to emphasize production deployment and use in our stewardship of node.js!
  • 26. Thank you
      • @dapsays, the Patron Saint of node.js in production, for DTrace support, MDB support, node-vasync, Manta, etc.
      • @mcavage for node-restify, node-fast, Manta, etc.
      • @trentmick for node-bunyan
      • @chrisandrews for node-dtrace-provider
      • @brendangregg for flame graphs
      • @tjfontaine for bringing postmortem debugging to an entirely new audience with Linux support for libproc!