Node.js was initially challenging to use in production due to memory leaks and lack of debugging tools. Over three years, Joyent developed tools like DTrace probes, MDB for debugging core dumps, Bunyan for logging, and node-restify for building HTTP services to make node.js more reliable and observable in production. These tools helped Joyent successfully deploy many internal services using node.js and identify issues through postmortem analysis. Joyent continues working to improve node.js for production use.
node.js in production: Reflections on three years of riding the unicorn
1. node.js in production: Reflections on three years of riding the unicorn
Bryan Cantrill
SVP, Engineering
bryan@joyent.com
@bcantrill
Tuesday, December 3, 13
2. Production systems
• Production systems are ones doing real work: when they misbehave, users or other systems are affected
• Production systems value reliability, performance and ease of deployment, usually in that order
• Contrast with development systems, which value ease of development and speed of development, in that order
• These values can be in tension: new languages and environments typically arise for their development values, not their production ones
• Would node.js be any different?
3. node.js advantages
• In terms of production suitability, node.js had (and still has) a couple of major advantages going for it:
  • It’s not a new language
  • It’s built on a VM (V8) that itself was designed for performance
  • It leverages extant (Unix) abstractions
  • Its pure event-oriented model aligns ease of programming with scalability with respect to load
• As the stewards of both node and SmartOS, Joyent had another advantage: we could change, improve or leverage SmartOS to accommodate node in production
4. node.js challenges
• But node.js also has a couple of major challenges:
  • JavaScript closures make it easy to accidentally reference memory
  • Because node.js is often used to connect backend components, failure to propagate back pressure can induce memory explosion and death
  • Single-threaded execution of JavaScript means that compute-bound code can entirely impede progress
  • High performance VM also implies inscrutable core dumps and very limited instrumentation
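The back-pressure point can be illustrated with node's own stream API. The following is a minimal sketch, not code from the talk: `writeWithBackPressure` and `makeSlowSink` are hypothetical helper names, and the sink is an artificially slow consumer with a tiny buffer. The producer stops writing whenever `write()` returns false and resumes only on the `drain` event, so the amount of buffered data stays bounded instead of growing with the input.

```javascript
const { Writable } = require('stream');

// Hypothetical helper: write every chunk to `dest`, pausing whenever
// write() reports a full buffer and resuming only after 'drain'.
function writeWithBackPressure(dest, chunks, onDone) {
  let i = 0;
  let maxBuffered = 0; // largest amount of data seen queued in `dest`

  function pump() {
    while (i < chunks.length) {
      const ok = dest.write(chunks[i++]);
      maxBuffered = Math.max(maxBuffered, dest.writableLength);
      if (!ok) {
        // Buffer is at or above highWaterMark: stop producing for now.
        dest.once('drain', pump);
        return;
      }
    }
    dest.end();
  }

  dest.once('finish', () => onDone(maxBuffered));
  pump();
}

// Hypothetical slow consumer: small buffer, each write completes on a
// later turn of the event loop. Received chunks are appended to `received`.
function makeSlowSink(received, highWaterMark) {
  return new Writable({
    highWaterMark,
    write(chunk, encoding, callback) {
      received.push(chunk.toString());
      setImmediate(callback);
    }
  });
}

// Usage: the buffered amount never exceeds the sink's highWaterMark.
const seen = [];
writeWithBackPressure(makeSlowSink(seen, 4), ['ab', 'cd', 'ef'], (max) => {
  console.log('chunks delivered:', seen.length, 'max buffered bytes:', max);
});
```

A producer that ignores the return value of `write()` would still deliver every chunk, but the sink's internal buffer would grow with the size of the input, which is the memory explosion the slide describes.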
5. August 2010: DTrace in node.js
• Added simple user-level statically defined tracing (USDT) probes for node.js on platforms that support DTrace (e.g., Mac OS X, SmartOS)
• Probes were around connection establishment, serving HTTP requests, etc.
• Allowed questions to be dynamically asked of running, production node.js servers, e.g.:

    dtrace -n 'node*:::http-server-request{
        printf("%s of %s from %s\n", args[0]->method,
            args[0]->url, args[1]->remoteAddress)}'

    dtrace -n http-server-request'{
        @[args[1]->remoteAddress] = count()}'

    dtrace -n gc-start'{self->ts = timestamp}' \
        -n gc-done'/self->ts/{@ = quantize(timestamp - self->ts)}'
6. August 2010: Deploying 0.2.x
• In August 2010, we deployed our first node.js-based service into production: a Node Knockout leaderboard that used node.js DTrace probes to geolocate connections to contestants in real time
• Results were promising: it was surprisingly easy to develop and deploy a node.js-based service, and the service consumed very little CPU
• Watching the Node Knockout contestants in production revealed they were all light on CPU:
• But there was a storm cloud...
7. August 2010: Deploying 0.2.x, cont.
• We had a memory leak that resulted in heap exhaustion after several hours under heavy load
• Our service was stateless and load balanced for HA, so this was more disconcerting than debilitating...
• ...but we also had quite a few contestants that would run their RSS up and crash; there was clearly a larger issue:
8. February 2011: 0.4.0
• In February 2011, we deployed our first major node.js-based service (on 0.4.0)
• The service was built remarkably quickly, but with some pain points around Connect
• Despite being potentially a compute-bound service, CPU consumption was (again) a non-issue
• And with an updated node (and many fixed node leaks), memory consumption wasn’t necessarily as acute...
• ...but we hit our first “spinning black hole” problem
9. January 2011: node-dtrace-provider
• Our DTrace probes in node were proving to be too low-level for higher-level services: we needed to allow USDT probes to be expressed in JavaScript
• Fortunately, DTrace community member Chris Andrews extended his libusdt to node.js, allowing statically defined probes in JavaScript, e.g.:

    var d = require('dtrace-provider');
    var dtp = d.createDTraceProvider('foo');
    // 'json' declares one argument that is rendered as a JSON string
    var probe = dtp.addProbe('foo-start', 'json');
    dtp.enable();
    probe.fire(function(p) {
        return ([ { bar: 123, baz: 'bar' } ]);
    });
10. April 2011: Restify
• Based on our experiences with Connect/Express, we wanted to build a node module that was purpose-built to implement HTTP-based API endpoints
• Based on Chris Andrews’ work, we wanted to have first-class support for DTrace
• Joyent’s Mark Cavage developed node-restify, which quickly became the foundation for all of our services
• Built-in DTrace support allows full observability into per-route/per-handler latency, a capability that we could not live without at this point
11. November 2011: MDB support for V8
• In mid-2011, Joyent’s Dave Pacheco dared to dream the impossible dream: full postmortem support for V8 for MDB, the debugger native to SmartOS
• Several unspeakable layer violations later, mdb_v8 brought postmortem debugging to node.js
• ::jsstack prints the full stack, including both native C++ frames and JavaScript frames
• ::jsprint prints JavaScript objects from the dump
• Thanks to mdb_v8, we were able to go back to a core dump from that infinite loop in our service deployed several months earlier, and nail it
12. December 2011: DTrace ustack helper
•
mdb_v8 was actually a way station to an even bolder
dream: a DTrace ustack helper for node.js
•
A ustack helper is a bit of code that accompanies a
binary and assists DTrace in probe context to resolve
stack frames to their higher-level names
•
Once completed, allows user-level stack traces to be
associated with in-kernel events — like profiling events
•
Can use the DTrace profile provider to determine how a
node.js program is consuming CPU via stack sampling
13. December 2011: Flame graphs
•
Poring over raw stack traces makes hot functions
difficult to pick out
•
Joyent’s Brendan Gregg developed flame graphs, which
allow us to easily visualize thousands of sampled
stacks
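The input to a flame graph is essentially a tally of identical stacks. As a minimal sketch of that aggregation step (this is not Brendan Gregg's actual tooling; the foldStacks function and the sample frames are hypothetical), each sampled stack can be folded into a semicolon-joined key and counted:

```javascript
// Hypothetical sampled stacks, outermost frame first.
var samples = [
    [ 'main', 'handleRequest', 'parseBody' ],
    [ 'main', 'handleRequest', 'parseBody' ],
    [ 'main', 'handleRequest', 'render' ],
    [ 'main', 'gc' ]
];

// Fold each stack into a "frame;frame;frame" key and count occurrences.
function foldStacks(stacks) {
    var counts = {};
    stacks.forEach(function (stack) {
        var key = stack.join(';');
        counts[key] = (counts[key] || 0) + 1;
    });
    return (counts);
}

var folded = foldStacks(samples);
Object.keys(folded).forEach(function (key) {
    console.log(key + ' ' + folded[key]);
});
```

In a flame graph, the width of each box is proportional to counts like these, which is why the hottest code paths stand out visually.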
14. January 2012: Bunyan
•
Logging was becoming more and more of a problem for
us — especially as we were developing distributed
systems in node.js
•
Joyent’s Trent Mick developed node-bunyan, a simple
and fast JSON logging library for node.js
•
Provides standardized, JSON, line-based log output that
can be easily processed with JSON tools, e.g.:
{"name":"moray","hostname":"d1cfb6c7-c975-4ed8-a689-fb18f94b6bfc","pid":8393,"component":"manatee","path":"/manatee/sdc/election","level":20,"db":{"available":2,"max":15,"size":2,"waiting":0},"options":{"async":false,"read":true},"msg":"pg: entered","time":"2013-12-03T02:54:24.565Z","v":0}
•
Also includes a command-line tool, bunyan, for displaying
Bunyan logs
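Because each record is a single line of JSON, ordinary JSON tooling is enough to filter a log. The sketch below uses only JSON.parse and does not use the bunyan module itself; the filterByLevel helper and the two sample records are made up for illustration, and the level numbers 20 (debug) and 30 (info) follow Bunyan's convention:

```javascript
// Two hypothetical Bunyan-style records, one JSON object per line.
var lines = [
    '{"name":"moray","level":20,"msg":"pg: entered","v":0}',
    '{"name":"moray","level":30,"msg":"listening","v":0}'
].join('\n');

// Parse each non-empty line and keep records at or above minLevel.
function filterByLevel(text, minLevel) {
    return (text.split('\n').filter(function (line) {
        return (line.length > 0);
    }).map(function (line) {
        return (JSON.parse(line));
    }).filter(function (rec) {
        return (rec.level >= minLevel);
    }));
}

var infoAndUp = filterByLevel(lines, 30);
infoAndUp.forEach(function (rec) {
    console.log(rec.name + ': ' + rec.msg);
});
```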
15. February 2012: npm shrinkwrap
•
npm allows for fine-grained semver control over
package dependencies, but we found that nested
dependencies could result in non-replicable installs
•
“npm shrinkwrap” generates a file that shrinkwraps all
nested dependencies into npm-shrinkwrap.json,
thereby locking down all nested versions
•
Guarantees that all installs will have the same semver
versions of dependencies
•
Doesn’t necessarily guarantee identical installs,
however; for this, one needs private npm repositories
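To make the nesting concrete, here is a small sketch that walks a shrinkwrap-style tree and lists every pinned version. The object below is a hand-written, hypothetical fragment in the general shape of npm-shrinkwrap.json (a dependencies map whose entries carry a version and may nest further dependencies); the package names and versions are invented, and listPinned is not part of npm:

```javascript
// Hypothetical shrinkwrap-style tree; names and versions are invented.
var shrinkwrap = {
    name: 'myservice',
    version: '1.0.0',
    dependencies: {
        alpha: {
            version: '2.1.0',
            dependencies: {
                beta: { version: '0.3.4' }
            }
        },
        gamma: { version: '1.4.2' }
    }
};

// Recursively collect "path@version" for every nested dependency.
function listPinned(node, prefix, out) {
    var deps = node.dependencies || {};
    Object.keys(deps).sort().forEach(function (name) {
        var path = prefix ? prefix + ' > ' + name : name;
        out.push(path + '@' + deps[name].version);
        listPinned(deps[name], path, out);
    });
    return (out);
}

var pinned = listPinned(shrinkwrap, '', []);
console.log(pinned.join('\n'));
```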
16. April 2012: node-vasync
•
There are a number of modules that deal with some of
the mechanics of asynchronous control flow…
•
But we found that we needed one that emphasized
debugging in particular
•
node-vasync captures a number of popular flow patterns
and allows state to be inspected via MDB
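The idea of keeping flow-control state in an ordinary object, where a debugger can find it, can be sketched without any library. The following is not vasync's implementation or API; runPipeline and its field names are hypothetical, chosen only for illustration:

```javascript
// Run async functions in sequence, recording progress in a plain object
// that stays reachable (and therefore inspectable) while work is pending.
function runPipeline(funcs, arg, callback) {
    var state = {
        operations: funcs.map(function (f) {
            return ({ funcname: f.name || '(anon)', status: 'waiting' });
        }),
        ndone: 0
    };

    function next(i) {
        if (i === funcs.length) {
            callback(null, state);
            return;
        }
        state.operations[i].status = 'pending';
        funcs[i](arg, function (err) {
            state.operations[i].status = err ? 'fail' : 'ok';
            state.ndone++;
            if (err) {
                // Stop at the first failure; later steps stay 'waiting'.
                callback(err, state);
                return;
            }
            next(i + 1);
        });
    }

    next(0);
    return (state);
}

// Hypothetical steps: the second one fails.
function stepOne(_, cb) { cb(null); }
function stepTwo(_, cb) { cb(new Error('boom')); }
function stepThree(_, cb) { cb(null); }

var result = runPipeline([ stepOne, stepTwo, stepThree ], null,
    function (err, state) {
        console.log(err ? err.message : 'ok',
            JSON.stringify(state.operations));
    });
```

Because every step's name and status live in one object, a stuck or failed pipeline can be understood by examining that object, whether in a log or in a core dump.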
17. May 2012: ::findjsobjects
•
Building on Dave Pacheco’s mdb_v8, we implemented a
debugger command that iterates over all of memory in a
core dump, looking for JavaScript objects
•
Entirely brute force, but allows one to take a swing at a
nasty node.js issue: semantic memory leaks
> ::findjsobjects
  OBJECT #OBJECTS #PROPS CONSTRUCTOR: PROPS
95709ac1      195      3 Object: socket, type, handle
957093f9       66      9 Object: uid, windowsVerbatimArguments, stdio, …
95f13181      130      5 <anonymous> (as exports.StringDecoder): …
8432ff55      222      3 Buffer: length, offset, parent
843304dd       91      9 Object: refreservation, creation, name, type, …
8432cc55       99      9 Object: time, msg, level, hostname, pid, action, …
95f08545       66     14 ChildProcess: _closesNeeded, stdio, …
8432f2e1      546      2 Array
9570cafd       47     24 Object: <sliced string>, <sliced string>, …
8432be95      415      3 Array
8432fb09       67     19 Socket: errorEmitted, _bytesDispatched, …
18. May 2012: ::findjsobjects -p
•
Searching by property name allows one to find particular
objects in the JavaScript heap, e.g.:
> ::findjsobjects -p ip4addr | ::findjsobjects | ::jsprint -a
8432b109: {
ip4addr: 9aee115d: "10.88.88.200",
VLAN: 9aee1199: "0",
Host Interface: 9aee1185: "e1000g0",
Link Status: 9aee1175: "up",
MAC Address: 9aee113d: "02:08:20:47:93:82",
}
…
•
While designed for postmortem debugging, this allows
mdb_v8 to be used for in situ debugging in development
•
Also guides one to a best practice: towards unique
property names (which we have historically done in the
operating system via structure prefixing)
19. July 2012: node-fast
•
While HTTP makes it very easy to put together a
distributed system, parsing and connection
management can become prohibitively expensive
•
In building Manta, we found that we needed something
lighter/faster; Joyent’s Mark Cavage built node-fast
•
Only what you need: fully async/duplex/persistent
connections, simple on-wire protocol (JSON), etc.
•
None of what you don’t want: no IDL madness, no object
model, no binary translation madness, etc.
•
Deliberately light and limited — HTTP is still the right
answer until it isn’t
20. October 2012: Bunyan + DTrace
•
With all of our services using Bunyan, we could enable
dynamic logging by adding DTrace USDT probes
•
Can use the raw DTrace probes:
# dtrace -qn log-debug'{printf("%s\n", copyinstr(arg0))}' -x strsize=8k
{"name":"wf-moray-backend","hostname":"414ffb35-adee-47b7-bdf4-d21cb039386c","pid":10952,"component":"MorayClient","host":"10.99.99.17","port":2020,"req_id":"bddb180f-1770-edcf-8df2-b3a81d97e9b1","level":20,"bucket":"wf_runners","key":"414ffb35-adee-47b7-bdf4-d21cb039386c","value":{"active_at":"2013-12-03T07:22:25.125Z","idle":false},"msg":"putObject: entered","time":"2013-12-03T07:22:25.135Z","v":0}
...
•
Added the json() subroutine to DTrace to make this
easier to process
•
Can also use “bunyan -p” and avoid the lower-level
DTrace details entirely
21. May 2013: --abort-on-uncaught-exception
•
Crash dumps are great — but aborting after an
uncaught exception makes it very difficult to determine
the true origin of the exception
•
Dave Pacheco implemented a V8 patch to induce a
process abort (and a core dump) on an uncaught
exception
•
This allows us to use postmortem debugging to debug
our everyday logic errors
•
Available starting in 0.10.x — we use it wherever we
have it!
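As a small illustration of the kind of everyday logic error this helps with, consider the hypothetical handler below, which dereferences a field that may be missing. Run under node --abort-on-uncaught-exception, an uncaught throw like this one aborts the process at the point of the throw, so the resulting core dump still contains the offending stack. The function name and record shapes are invented for the example:

```javascript
// Hypothetical handler with a logic error: it assumes every record has
// a 'db' property.
function availableConnections(rec) {
    return (rec.db.available);
}

console.log(availableConnections({ db: { available: 2 } }));

// A record without 'db' makes the call throw a TypeError. The try/catch
// is only here so the example can run to completion; in a real service
// the exception would go uncaught and the flag would trigger the abort.
var caught = null;
try {
    availableConnections({ msg: 'no db here' });
} catch (err) {
    caught = err;
}
console.log(caught instanceof TypeError);
```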
22. July 2013: Thoth
•
One of the most important systems we have built in
node is Manta, our object store featuring in situ compute
•
Manta is an excellent platform for building data-based
services — especially for large data objects
•
We built manta-thoth, a platform for core and crash
dump analysis that allows us to debug core dumps
without moving them
•
Thoth has become critically important for us to track and
automatically debug production node.js services
23. December 2013: Dump analysis on Linux
•
Postmortem debugging has been a (the) tremendous
breakthrough for node.js in production…
•
...but despite node’s postmortem support all being
open source, it has been limited to SmartOS
•
Some have toyed with porting MDB to Linux; this is in
principle possible, but will be rough sledding
•
Joyent’s TJ Fontaine (of node core fame) observed what
we had done with dump analysis on Manta and had a
simpler idea…
•
What about making Linux dumps consumable on
SmartOS — and therefore Manta?
24. December 2013: Linux support in libproc
•
Over the course of a multiday engineering hackathon,
TJ and Joyent’s Max Bruning added support for Linux
crash dumps in SmartOS’s libproc
•
Fortunately, because of the way the postmortem work
was done by Dave Pacheco, it Just Works
•
Do this yourself:
https://gist.github.com/tjfontaine/de104fe058300a51f7cf
•
For Linux users: put your Linux dumps to Manta, and
you can finally debug those pesky leaks and crashes!
•
Use --abort-on-uncaught-exception and you
can use Manta and postmortem debugging to debug
more quotidian programming errors!
25. Node.js in production!
•
For us at Joyent, the tooling that we have built into
node.js has resulted in what we believe to be the best
dynamic environment for production use
•
Yes, even when compared to much older platforms like
Java and Erlang...
•
There is still work to be done, especially around add-on
development (see TJ’s shim work!) and potentially better
bundling of objects…
•
We will continue to emphasize production deployment
and use in our stewardship of node.js!
26. Thank you
•
@dapsays, the Patron Saint of node.js in production, for
DTrace support, MDB support, node-vasync, Manta, etc.
•
@mcavage for node-restify, node-fast, Manta, etc.
•
@trentmick for node-bunyan
•
@chrisandrews for node-dtrace-provider
•
@brendangregg for flame graphs
•
@tjfontaine for bringing postmortem debugging to an
entirely new audience with Linux support for libproc!