Node.js Postmortem
Working Group Update
Yunong Xiao, Netflix
Michael Dawson, IBM
Yunong Xiao
Platform Architect, Netflix
@yunongx
http://yunong.io
About Michael
About The Postmortem Workgroup
Howard Hellyer
@hhellyer
David Pacheco
@davepacheco
Julien Gilli
@misterdjules
Michael Dawson
@mhdawson
Chris Bailey
@seabaylea
Daniel Khan
@danielkhan
Joshua Clulow
@jclulow
Yunong Xiao
@yunong
James Bellenger
@jbellenger
Bradley Meck
@bmeck
Luca Maraschi
@lucamaraschi
David Clements
@davidmarkclements
Richard Chamberlain
@rnchamberlain
Mission Statement
The working group is dedicated to the support and improvement of
postmortem debugging for Node.js.
Debugging Node.js
Test Environment
What about production?
“The method described in this article was designed to
provide a core dump… with a minimal impact on the
spacecraft… as the resumption of data acquisition from
the spacecraft is the highest priority.”
- Chafin, R. "Pioneer F & G Telemetry and Command Processor Core Dump
Program." JPL Technical Report XVI, no. 32-1526 (1971): 174.
Core Dumps: Brief History
● Magnetic core memory
● Dump out the contents of “core” memory for debugging
● “Core dump” was coined
● Initially printed on paper
● Postmortem debugging was born
Production Constraints
● Uptime is critical
● Not easily reproducible
● Can’t simulate environment
● Resume normal operations ASAP
Postmortem Debugging
└─[0] <> node --abort_on_uncaught_exception throw.js
Uncaught Error
FROM
Object.<anonymous> (/Users/yunong/throw.js:1:63)
Module._compile (module.js:435:26)
Object.Module._extensions..js (module.js:442:10)
Module.load (module.js:356:32)
Function.Module._load (module.js:311:12)
Function.Module.runMain (module.js:467:10)
startup (node.js:134:18)
node.js:961:3
[1] 4131 illegal hardware instruction (core dumped) node --abort_on_uncaught_exception throw.js
Where: Inspect stack trace
Why: Inspect heap and stack variable state
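A minimal sketch of the behavior shown in the transcript above: it starts a child Node.js process with --abort-on-uncaught-exception and checks that the child ends abnormally. The inline script and the runWithAbort helper are hypothetical stand-ins for throw.js. Whether a core file is actually written depends on the operating system's core dump settings (for example ulimit -c), which this sketch does not change.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run a one-line script in a child process with the
// abort flag set, so the child aborts on the first uncaught exception.
function runWithAbort(source) {
  return spawnSync(
    process.execPath,
    ['--abort-on-uncaught-exception', '-e', source],
    { encoding: 'utf8' }
  );
}

const result = runWithAbort('throw new Error("boom")');

// A process killed by a signal reports status === null plus a signal name.
// The exact signal (SIGABRT, SIGILL, SIGTRAP) varies by platform and version.
console.log('exit status:', result.status, 'signal:', result.signal);
```

Running the failing code in a child process keeps the parent alive, so the parent can record the signal and locate the core file afterwards.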
Generate Core Dump Ad-hoc
root@demo:~# gcore `pgrep node`
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7facaeffd700 (LWP 5650)]
[New Thread 0x7facaf7fe700 (LWP 5649)]
[New Thread 0x7facaffff700 (LWP 5648)]
[New Thread 0x7facbc967700 (LWP 5647)]
[New Thread 0x7facbd168700 (LWP 5617)]
[New Thread 0x7facbd969700 (LWP 5616)]
[New Thread 0x7facbe16a700 (LWP 5615)]
[New Thread 0x7facbe96b700 (LWP 5614)]
0x00007facbea5b5a9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
Saved corefile core.5602
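The same ad-hoc capture can be scripted. The sketch below assumes gcore (shipped with gdb) is on the PATH and that the caller has permission to attach to the target process. gcoreArgs and dumpCore are hypothetical helper names. The -o option sets the output prefix, and gcore appends the PID to it.

```javascript
const { execFile } = require('child_process');

// Build the argument list for: gcore -o <prefix> <pid>
function gcoreArgs(pid, prefix = 'core') {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new TypeError('pid must be a positive integer');
  }
  return ['-o', prefix, String(pid)];
}

// Run gcore against a live process; resolves with the expected core file name.
function dumpCore(pid, prefix = 'core') {
  return new Promise((resolve, reject) => {
    execFile('gcore', gcoreArgs(pid, prefix), (err) => {
      if (err) return reject(err);
      resolve(`${prefix}.${pid}`);
    });
  });
}

console.log(gcoreArgs(5602)); // [ '-o', 'core', '5602' ]
```

Passing the arguments as an array to execFile avoids shell interpolation of the PID, unlike the backtick form used at the prompt.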
Example
Netflix API Request
Node to API RPC
[2016-09-09T16:25:48.388Z] WARN: reactive socket/rs-pool/17352 on lgud-yunong: TcpLoadBalancer._connect: no more free connections
Postmortem Debugging
Connection Pool State
> ::findjsobjects -p _connections | ::jsprint
{
  "connected": {},
  "connecting": {},
  "free": {
    "100.82.188.185:7001": [...],
    "100.82.37.181:7001": [...],
    "100.82.41.121:7001": [...],
    "100.82.102.157:7001": [...],
    "100.82.106.115:7001": [...],
    "100.82.129.239:7001": [...],
    "100.82.102.158:7001": [...],
    "100.82.74.237:7001": [...],
    ...
  }
}
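The log reports no free connections, yet the dumped pool lists hosts under "free" and none under "connected". That kind of mismatch is what a core dump makes visible. A small sketch of summarizing such an object once it has been copied out of the debugger: the summarizePool name and the sample data are illustrative, and the connection lists (elided as [...] in the ::jsprint output above) are stood in for by empty arrays.

```javascript
// Count how many hosts appear in each bucket of a pool-state object shaped
// like the ::jsprint output above.
function summarizePool(pool) {
  const summary = {};
  for (const [bucket, hosts] of Object.entries(pool)) {
    summary[bucket] = Object.keys(hosts).length;
  }
  return summary;
}

// Illustrative sample: two of the hosts from the dump, connection lists elided.
const sample = {
  connected: {},
  connecting: {},
  free: {
    '100.82.188.185:7001': [],
    '100.82.37.181:7001': [],
  },
};

console.log(summarizePool(sample)); // { connected: 0, connecting: 0, free: 2 }
```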
Postmortem Debugging is Critical to Large Scale
Production Node Deployments
Postmortem WG
github.com/nodejs/post-mortem/
Postmortem WG - Mission
Guide improvements in postmortem
● Interfaces/APIs
● Dump formats
● Tools and Techniques
State of key tools today
Heap dump - snapshot of heap
● heapdump module - https://github.com/bnoordhuis/node-heapdump
● Chrome developer tools
● Limitations
○ Need to modify application
○ Slow to generate (minutes or hours)
○ O(N) memory usage
○ Limited content
○ Output is large
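For comparison, a minimal sketch of taking a heap snapshot from inside the application. It uses v8.writeHeapSnapshot(), which is built into later Node.js releases (v11.13.0 and up), rather than the heapdump module named above, so treat it as a substitute technique. The call is synchronous and blocks the event loop while the snapshot is written, which is one face of the "slow to generate" limitation. The file name is an arbitrary choice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const v8 = require('v8');

// Write a snapshot to a temp file. The resulting .heapsnapshot file can be
// loaded in the Chrome developer tools.
const target = path.join(os.tmpdir(), `example-${process.pid}.heapsnapshot`);
const written = v8.writeHeapSnapshot(target);

console.log('snapshot:', written, 'bytes:', fs.statSync(written).size);
```

Even for a tiny script the file is typically megabytes in size, since it describes every object on the heap.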
State of key tools today
Core dump - memory image
● Creation
○ Crash, signal
○ --abort-on-uncaught-exception
○ Fast (relative to heap dumps)
○ Size matches process memory
● OS debuggers
○ Examination at C/C++ or assembler level
○ No knowledge of Node/V8 structures
● Node core dump inspectors
○ MDB (limited platform support)
○ IDDE (IBM SDK specific)
○ LLNODE (newer, less complete)
Example commands
Task                                  MDB_V8 command     LLNODE command              IDDE command
Print a stack trace                   jsstack, jsframe   v8 bt                       !stack, !frame
Find objects                          findjsobjects      v8 findjsobjects,           !jslistobjects, !jsgroupobjects,
                                                         v8 findjsinstances <type>   !jsfindbyproperty, !jsobjectsmatching
Print an object                       jsprint            v8 inspect                  !jsobject
Print function source                 jssource           v8 source (prints source    !jsobject, !string + work
                                                         for a stack frame)
Find constructor for an object        jsconstructor      n/a                         !jsconstructor
Print elements of a FixedArray        v8array            v8 inspect <instance>       !array
Find native memory backing a buffer   nodebuffer         v8 inspect <instance>       !nodebuffers
How to make this better?
● Improve ease of use
● Common APIs to introspect dumps
● Cross platform support
● Common command set
● Lightweight dump
The Postmortem WG is working on...
Common Heap Dump Format
Improved Core Dump Analysis
● Library in C & JS
● Tools: mdb_v8(mdb), llnode(lldb), ...
Node Report
Common Heap Dump Format
Enabler for new tools
Generation
● mdb
● llnode
Consumption
● Conversion to existing V8 format -> Chrome dev tools
● C/JavaScript APIs
Core Dump Analysis
Currently working on
● Platform coverage
● Re-use of command implementation
● Common APIs
Soon to be nodejs/llnode!
https://github.com/nodejs/post-mortem/issues/37
Working to get to…
Node Report
Lightweight Dump
● Fast
● Small
● Human readable
● Key information to start investigating
● Triggers: exception, fatal error, signal, JavaScript API
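One way to picture the triggers listed above is a small helper that calls a report function on an uncaught exception or a signal, and can also be called directly from JavaScript. This illustrates the idea only and is not NodeReport's implementation. installTriggers, writeReport, and the choice of SIGUSR2 are assumptions. Fatal errors such as heap exhaustion are raised inside V8 and cannot be caught from JavaScript, so they are left out.

```javascript
const { EventEmitter } = require('events');

// Register report triggers on an emitter (the process object by default).
// Note: handling 'uncaughtException' on process suppresses the default
// crash, so a real handler should exit after writing its report.
function installTriggers(writeReport, target = process) {
  target.on('uncaughtException', (err) => {
    writeReport('exception', err);
  });
  target.on('SIGUSR2', () => {
    writeReport('signal');
  });
  // The returned function is the on-demand "JavaScript API" trigger.
  return () => writeReport('api');
}

// Demonstration against a plain emitter, so no real signal is needed.
const events = [];
const fake = new EventEmitter();
const trigger = installTriggers((reason) => events.push(reason), fake);
fake.emit('SIGUSR2');
trigger();
console.log(events); // [ 'signal', 'api' ]
```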
NodeReport example - heap out of memory error
NodeReport content:
● Event summary
● Node.js and OS versions
● JavaScript stack trace
● Native stack trace
● Heap and GC statistics
● Resource usage
● libuv handle summary
● Environment variables
● OS ulimit settings
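Several of these sections can be approximated with built-in Node.js modules. The sketch below is not NodeReport, and its field names are invented for illustration; it only shows where comparable data is available. Native stack traces, libuv handle summaries, and ulimit settings need native code or external commands and are omitted.

```javascript
const os = require('os');
const v8 = require('v8');

// Collect a small diagnostic summary from built-in APIs.
function collectReport(event) {
  return {
    event,                               // event summary
    nodeVersion: process.version,        // Node.js version
    os: `${os.type()} ${os.release()}`,  // OS version
    jsStack: new Error(event).stack,     // JavaScript stack trace
    heap: v8.getHeapStatistics(),        // heap statistics
    memory: process.memoryUsage(),       // resource usage (memory)
    cpu: process.cpuUsage(),             // resource usage (CPU time)
    env: { ...process.env },             // environment variables
  };
}

const report = collectReport('example trigger');

// Print only the section names; the env section may contain secrets.
console.log(Object.keys(report).join(', '));
```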
JavaScript API
API in JavaScript
● More accessible
● Leverages
○ llnode
○ Common Heap Dump (future)
JavaScript API - example application
Summary
What is postmortem debugging
Example of where it’s helpful
Activities of the working group
● Common heap format
● APIs (C/JS)
● Tools (lldb, mdb_v8, NodeReport)
Get Involved!
Great chance to learn
● Low level machine details
● Key debugging techniques
● Different platforms/operating systems
Where
● Most work done through GitHub issues/Pull Requests
● http://github.com/nodejs/post-mortem/
Postmortem Debugging is Critical to Large Scale
Production Node Deployments
Some production problems are otherwise impossible to debug
Save complete process state for debugging later
Copyrights and Trademarks
IBM, the IBM logo, ibm.com are trademarks or registered
trademarks of International Business Machines Corp.,
registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at
“Copyright and trademark information” at
www.ibm.com/legal/copytrade.shtml
Node.js is an official trademark of Joyent. IBM SDK for Node.js is not formally related to or endorsed by the official
Joyent Node.js open source or commercial project.
Java, JavaScript and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United
States, other countries, or both.

Post mortem talk - Node Interactive EU
