SlideShare a Scribd company logo
Debugging microservices in
production
CTO
bryan@joyent.com
Bryan Cantrill
@bcantrill
Debugging in the beginning...
Debugging in the beginning...
— Sir Maurice Wilkes, 1913 - 2010
Debugging in the beginning...
— Sir Maurice Wilkes, 1913 - 2010
Debugging in the beginning...
— Sir Maurice Wilkes, 1913 - 2010
“As soon as we started programming,
we found to our surprise that it wasn't
as easy to get programs right as we
had thought. Debugging had to be
discovered. I can remember the exact
instant when I realized that a large
part of my life from then on was going
to be spent in finding mistakes in my
own programs.”
Debugging
• The first non-trivial program for the EDSAC (a program to
calculate a table of Airy integrals) had 120 lines and 20 errors —
including one not debugged until four decades later!
• This experience remains modern for anyone in software today,
and many spend much of their career debugging
• Yet there is little formalized about debugging: few books on it;
little research; no conferences — and no university courses!
• Is it any surprise that debugging anti-patterns persist?
Debugging anti-patterns
• For too many, debugging is the process of making problems go
away rather than understanding the system!
• The view of bugs-as-nuisance has many knock-on effects:
• Fixes that don’t fix the problem (or introduce new ones!)
• Bug reports closed out as “will not fix” or “works for me”
• Users who are told to “restart” or “reboot” or “log out” or
anything else that amounts to wishful thinking
• And this is only when the process has obviously failed...
Darker debugging anti-patterns
• More insidious effects are felt when the problem appears to have
been resolved, but hasn’t actually been fully understood
• These are the fixes that amount to a fresh coat of paint over a
crack in the foundation — and they are worse than nothing
• Not only do these fixes not actually resolve the problem, they give
the engineer a false sense of confidence that spreads virally
• “Debugging” devolves into an oral tradition: folk tales of problems
that were made to go away
Thinking methodically
• The way we think about debugging is fundamentally wrong; we
need to think methodically about debugging!
• When we think of debugging as the quest for understanding our
(misbehaving) systems, it allows us to consider it more abstractly
• Namely, how do we explain the phenomena that affect our world?
• We have found that the most powerful explanations reflect an
understanding of underlying structure — beyond what to why
• This deeper understanding allows us to not only to explain but
make predictions
Predictive power
• Valuing predictive power allows us to test our explanations: if our
predictions are wrong, our understanding is incomplete
• We can use the understanding from failed predictions to develop
new explanations and new predictions
• We can then test these new predictions to test our understanding
• If all of this is sounding familiar, it’s because it’s science — and
the methodical exploration of it is the scientific method
The scientific method
• The scientific method is to:
• Make observations
• Formulate a question
• Formulate a hypothesis that answers the question
• Formulate predictions that test the hypothesis
• Test the predictions by conducting an experiment
• Refine the hypothesis and repeat as needed
Science, seriously?!
Science, seriously.
• Software debugging is a pure distillation of scientific thinking
• The limitless amount of data from software systems allows
experiments in seconds instead of weeks/months/years
• The systems we’re reasoning about are entirely synthetic,
discrete and mutable — we made it, we can understand it
• Software is mathematical machine; the conclusions of software
debugging are often mathematical in their unequivocal power!
• Software debugging is so pure, it requires us to refine the
scientific method slightly to reflect its capabilities...
The software debugging method
• Make observations
• Based on observations, formulate a question
• If the question can be answered through subsequent observation,
answer the question through observation and refine/iterate
• If the question cannot be answered through observation, make a
hypothesis as to the answer and formulate predictions
• If predictions can be tested through subsequent observation, test
the predictions through observation and refine/iterate
• Otherwise, test predictions through experiment and refine/iterate
Observation is the heart of debugging!
• The essence — and art! — of debugging software is making
observations and asking questions, not formulating hypotheses!
• Observations are facts — they constrain hypotheses in that any
hypothesis contradicted by facts can be summarily rejected
• As facts beget questions which beget observations and more
facts, hypotheses become more tightly constrained — like a
cordon being cinched around the truth
• Or, in the words of Sir Arthur Conan Doyle’s Sherlock Holmes,
“when you have eliminated all which is impossible, then whatever
remains, however improbable, must be the truth”
Making the hypothetical leap
• Once observation has sufficiently narrowed the gap between what
is known and what is wrong, a hypothetical leap should be made
• Debugging is inefficient when this leap is made too early — like
making a specific guess too early in Twenty Questions
• A hypothesis is only as good as its ability to form a prediction
• A prediction should be tested with either subsequent observation
or by conducting an experiment
• If the prediction proves to be incorrect, understanding is
incomplete; the hypothesis must be rejected — or refined
Experiments in software
• A beauty of software is that it is highly amenable to experiment
• Many experiments are programs — and the most satisfying
experiments test predictions about how failure can be induced
• Many “non-reproducible” problems are merely unusual!
• Debugging a putatively non-reproducible problem to the point of a
reproducible test case is a joy unique in software engineering
Software debugging in practice
• The specifics of observation depends on the nature of the failure
• Software has different kinds of failure modes:
• Fatal failure (segmentation violation, uncaught exception)
• Non-fatal failure (gives the wrong answer, performs terribly)
• Explicit failure (assertion failure, error message)
• Implicit failure (cheerfully does the wrong thing)
Taxonomizing software failure
Implicit
Explicit
Non-fatal Fatal
Gives the wrong answer
Returns the wrong result
Leaks resources
Stops doing work
Performs pathologically
Emits an error message
Returns an error code
Assertion failure
Process explicitly aborts
Exits with an error code
Segmentation violation
Bus Error
Panic
Type Error
Uncaught Exception
Microservices prehistory
• The late 1990s saw the rise of three-tier architectures consisting
of presentation, application logic and data tiers
• Many names for roughly the same notion: “Service-oriented
architecture”, “Model/View/Controller”, etc.
• The AJAX+REST revolution of the mid-2000s gave rise to true
web applications in which application logic could live on the edge
• Led to some broader architectural questioning...
Post-AJAX questions
• Why should HTTP be restricted to the web?
• Why should REST be restricted to web apps?
• Instead of having one monolithic architecture, why not have a
series of (smaller) services that merely did one thing well?
• In case this sounds vaguely familiar...
The Unix Philosophy
• The Unix philosophy, as articulated by Doug McIlroy:
• Write programs that do one thing and do it well
• Write programs to work together
• Write programs that handle text streams, because that is a
universal interface
• The single most important revolution in software systems thinking!
• Applying it to HTTP-based services...
Microservices
• Microservices do one thing, and strive to do it well
• Replace a small number of monoliths with many services that
have well-documented, small HTTP-based APIs
• Larger systems can be composed of these smaller services
• While the trend it describes is real, the term “microservices” isn’t
without its controversy...
Microservices
Microservices
Debugging microservices
• Veteran nerd rage may be being provoked by proponents of
microservices not fully appreciating the risks…
• Microservices turn a monolithic system into a distributed one
• While resilient to certain classes of force majeure failures,
distributed systems remain vulnerable to software defects
• Distributed systems are infamously nasty to debug — not least
because they often must be debugged in production
Microservices in production
• Microservices are tautologically small — they don’t need their
own dedicated physical hardware, or even dedicated virtual
hardware!
• Microservices are a particularly good fit for containers, virtual OS
instances pioneered by FreeBSD jails and Solaris zones
Containers at Joyent
• Joyent runs OS containers in the cloud via SmartOS — and we
have run containers in multi-tenant production since ~2006
• Adding support for hardware-based virtualization circa 2011
strengthened our resolve with respect to OS-based virtualization
• OS containers are lightweight and efficient — which is especially
important as services become smaller and more numerous:
overhead and latency become increasingly important!
• We emphasized their operational characteristics — performance,
elasticity, tenancy — and for many years, we were a lone voice...
Containers as PaaS foundation?
• Some saw the power of OS containers to facilitate up-stack
platform-as-a-service abstractions
• For example, dotCloud — a platform-as-a-service provider — built
their PaaS on OS containers
• Struggling as a PaaS, dotCloud pivoted — and open sourced
their container-based orchestration layer...
...and Docker was born
Docker revolution
• Docker has used the rapid provisioning + shared underlying
filesystem of containers to allow developers to think operationally
• Developers can encode deployment procedures via an image
• Images can be reliably and reproducibly deployed as a container
• Images can be quickly deployed — and re-deployed
• Docker complements the small-system ethos of microservices!
Docker at Joyent
• We wanted to create a best-of-all-worlds platform: the developer
ease of Docker on the production-grade substrate of SmartOS
• We developed a Linux system call interface for SmartOS,
allowing SmartOS to run Linux binaries at bare-metal speed
• In March 2015, we introduced Triton, our (open source!) stack
that deploys Docker containers directly on the metal
• Triton virtualizes the notion of a Docker host (i.e., “docker ps”
shows all of one’s containers datacenter-wide)
• Brings full debugging (DTrace, MDB) to Docker containers
When microservices fail?
A more apt metaphor...
Microservice failure modes
Implicit
Explicit
Non-fatal Fatal
Gives the wrong answer
Returns the wrong result
Leaks resources
Stops doing work
Performs pathologically
Emits an error message
Returns an error code
Assertion failure
Process explicitly aborts
Exits with an error code
Segmentation violation
Bus Error
Panic
Type Error
Uncaught Exception
Cascading microservice failure modes
Implicit
Explicit
Non-fatal Fatal
Gives the wrong answer
Returns the wrong result
Leaks resources
Stops doing work
Performs pathologically
Emits an error message
Returns an error code
Assertion failure
Process explicitly aborts
Exits with an error code
Segmentation violation
Bus Error
Panic
Type Error
Uncaught Exception
Debugging fatal failure
• When software fails fatally, we know that the software itself is
broken — its state has become inconsistent
• By saving in-memory state to stable storage, the software can be
debugged postmortem
• To debug, one starts with the invalid state and reasons backwards
to discover a transition from a valid state to an invalid one
• This technique is so old, that the terms for this saved state dates
back to the dawn of the computing age: a core dump
• Not as low-level as the name implies! Modern high-level
languages (e.g., node.js and Go) allow postmortem debugging!
Debugging fatal failure: microservices
• Postmortem analysis lends itself very well to microservices:
• There is no run-time overhead; overhead (such as it is) is only
at the time of death
• The microservice/container can be safely (automatically!)
restarted; the core dump can be analyzed asynchronously
• Tooling need not be in container, can be made arbitrarily rich
• In Triton, all core dumps are automatically stored and then
uploaded into a system that allows for analysis, tagging, etc.
• This has been invaluable for debugging our own services!
Debugging non-fatal failure
• There is a solace in fatal failure: it always represents a software
defect at some level — and the inconsistent state is static
• Non-fatal failure can be more challenging: the state is valid and
dynamic — it’s difficult to separate symptom from cause
• Non-fatal failure must still be understood empirically!
• Debugging in vivo requires that data be extracted from the system
— either of its own volition (e.g., via logs) or by coercion (e.g., via
instrumentation)
Debugging explicit, non-fatal failure
• When failure is explicit (e.g., an error or warning message), it
provides a very important data point
• If failure is non-reproducible or otherwise transient, analysis of
explicit software activity becomes essential
• Action in one container will often need to be associated with
failures in another
• Especially for distributed systems, this becomes log analysis, and
is an essential forensic tool for understanding explicit failure
• Essential observation: a time line of events!
Debugging implicit, non-fatal failure
• Problems that are both implicit and non-fatal represent the most
time-consuming, most difficult problems to debug because the
system must be understood against its will
• Wherever possible make software explicit about failure!
• Where errors are programmatic (and not operational), they
should always induce fatal failure!
• Microservices break at the boundaries: two services each think
that they are operating correctly, but together they’re broken
• Data must be coerced from the system via instrumentation
Instrumenting production systems
• Traditionally, software instrumentation was hard-coded and static
(necessitating software restart or — worse — recompile)
• Dynamic system instrumentation was historically limited to system
call table (strace/truss) or packet capture (tcpdump/snoop)
• Effective for some problems, but a poor fit for ad hoc analysis
• In 2003, Sun developed DTrace, a facility for arbitrary, dynamic
instrumentation of production systems that has since been ported
to Mac OS X, FreeBSD, NetBSD and (to a degree) Linux
• DTrace has inspired dynamic instrumentation in other systems
(see @brendangregg’s talk!)
Instrumenting Docker containers
• In Docker, instrumentation is a challenge as containers may not
include the tooling necessary to understand the system
• Docker host-based techniques for instrumentation may be
tempting, but they should be considered an anti-pattern!
• DTrace has a privilege model that allows it to be safely (and
usefully) used from within a container
• In Triton, DTrace is available from within every container — one
can “docker exec -it bash” and then debug interactively
Instrumenting node.js-based microservices
• We have invested heavily in node.js-based infrastructure to allow
us to meaningfully instrument microservices in production:
• We developed Bunyan, a logging facility for node.js that
includes DTrace support
• Added DTrace support for node.js profiling
• An essential vector for iterative observation: turning up the
logging level on a running microservice!
Debugging microservices in production
• Debugging methodically requires us to shift our thinking — and
learn how to carefully observe the systems we build
• Different types of failures necessitate different techniques:
• Fatal failure is best debugged via postmortem analysis —
which is particular appropriate in an all-container world
• Non-fatal failure necessitates log analysis and dynamic
instrumentation
• The ability to debug problems in production is essential to
successfully deploy and scale microservices!

More Related Content

What's hot

Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Tzach Livyatan
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
Brendan Gregg
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
Ruben Badaró
 
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Sandesh Rao
 
Containerized Stream Engine to Build Modern Delta Lake
Containerized Stream Engine to Build Modern Delta LakeContainerized Stream Engine to Build Modern Delta Lake
Containerized Stream Engine to Build Modern Delta Lake
Databricks
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
Alluxio, Inc.
 
Ceph on arm64 upload
Ceph on arm64   uploadCeph on arm64   upload
Ceph on arm64 upload
Ceph Community
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
Databricks
 
Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1
PgTraining
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA
EDB
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
Amazon Web Services
 
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenApache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Databricks
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
ScyllaDB
 
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RACThe Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
Markus Michalewicz
 
Azure redis cache
Azure redis cacheAzure redis cache
Azure redis cache
Shahriar Hossain
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
SATOSHI TAGOMORI
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 

What's hot (20)

Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
 
Containerized Stream Engine to Build Modern Delta Lake
Containerized Stream Engine to Build Modern Delta LakeContainerized Stream Engine to Build Modern Delta Lake
Containerized Stream Engine to Build Modern Delta Lake
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Ceph on arm64 upload
Ceph on arm64   uploadCeph on arm64   upload
Ceph on arm64 upload
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenApache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RACThe Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
 
Azure redis cache
Azure redis cacheAzure redis cache
Azure redis cache
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
 

Viewers also liked

Why it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalWhy it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metal
bcantrill
 
Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!
bcantrill
 
Papers We Love: Jails and Zones
Papers We Love: Jails and ZonesPapers We Love: Jails and Zones
Papers We Love: Jails and Zones
bcantrill
 
The Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decadeThe Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decade
bcantrill
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in production
bcantrill
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
bcantrill
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sense
bcantrill
 
The Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
The Peril and Promise of Early Adoption: Arriving 10 Years Early to ContainersThe Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
The Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
bcantrill
 
Down Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab AllocatorDown Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab Allocator
bcantrill
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generations
bcantrill
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote API
bcantrill
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destruction
bcantrill
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)
bcantrill
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
bcantrill
 
node.js and Containers: Dispatches from the Frontier
node.js and Containers: Dispatches from the Frontiernode.js and Containers: Dispatches from the Frontier
node.js and Containers: Dispatches from the Frontier
bcantrill
 
Expect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservicesExpect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservices
Bhakti Mehta
 
Deploying JHipster Microservices
Deploying JHipster MicroservicesDeploying JHipster Microservices
Deploying JHipster Microservices
Joe Kutner
 
Pushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home AutomationPushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home Automation
David Delabassee
 
Vert.X: Microservices Were Never So Easy (Clement Escoffier)
Vert.X: Microservices Were Never So Easy (Clement Escoffier)Vert.X: Microservices Were Never So Easy (Clement Escoffier)
Vert.X: Microservices Were Never So Easy (Clement Escoffier)
Red Hat Developers
 
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
Daniel Bryant
 

Viewers also liked (20)

Why it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalWhy it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metal
 
Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!
 
Papers We Love: Jails and Zones
Papers We Love: Jails and ZonesPapers We Love: Jails and Zones
Papers We Love: Jails and Zones
 
The Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decadeThe Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decade
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in production
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sense
 
The Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
The Peril and Promise of Early Adoption: Arriving 10 Years Early to ContainersThe Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
The Peril and Promise of Early Adoption: Arriving 10 Years Early to Containers
 
Down Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab AllocatorDown Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab Allocator
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generations
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote API
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destruction
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
 
node.js and Containers: Dispatches from the Frontier
node.js and Containers: Dispatches from the Frontiernode.js and Containers: Dispatches from the Frontier
node.js and Containers: Dispatches from the Frontier
 
Expect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservicesExpect the unexpected: Prepare for failures in microservices
Expect the unexpected: Prepare for failures in microservices
 
Deploying JHipster Microservices
Deploying JHipster MicroservicesDeploying JHipster Microservices
Deploying JHipster Microservices
 
Pushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home AutomationPushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home Automation
 
Vert.X: Microservices Were Never So Easy (Clement Escoffier)
Vert.X: Microservices Were Never So Easy (Clement Escoffier)Vert.X: Microservices Were Never So Easy (Clement Escoffier)
Vert.X: Microservices Were Never So Easy (Clement Escoffier)
 
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
J1 2015 "Debugging Java Apps in Containers: No Heavy Welding Gear Required"
 

Similar to Debugging microservices in production

Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
bcantrill
 
Empirical Methods in Software Engineering - an Overview
Empirical Methods in Software Engineering - an OverviewEmpirical Methods in Software Engineering - an Overview
Empirical Methods in Software Engineering - an Overview
alessio_ferrari
 
Perspectives on salesforce architecture Forcelandia talk 2017
Perspectives on salesforce architecture   Forcelandia talk 2017Perspectives on salesforce architecture   Forcelandia talk 2017
Perspectives on salesforce architecture Forcelandia talk 2017
Steven Herod
 
devops, microservices, and platforms, oh my!
devops, microservices, and platforms, oh my!devops, microservices, and platforms, oh my!
devops, microservices, and platforms, oh my!
Andrew Shafer
 
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
VMware Tanzu
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
Uwe Friedrichsen
 
Feedback loops between tooling and culture
Feedback loops between tooling and cultureFeedback loops between tooling and culture
Feedback loops between tooling and culture
Chris Winters
 
Arch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesArch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesIgor Moochnick
 
Agile software development
Agile software developmentAgile software development
Agile software developmentHemangi Talele
 
Scaling a Web Site - OSCON Tutorial
Scaling a Web Site - OSCON TutorialScaling a Web Site - OSCON Tutorial
Scaling a Web Site - OSCON Tutorial
duleepa
 
Making disaster routine
Making disaster routineMaking disaster routine
Making disaster routine
Peter Varhol
 
Lec 1 Introduction to Software Engg.pptx
Lec 1 Introduction to Software Engg.pptxLec 1 Introduction to Software Engg.pptx
Lec 1 Introduction to Software Engg.pptx
Abdullah Khan
 
Cleaning Code - Tools and Techniques for Large Legacy Projects
Cleaning Code - Tools and Techniques for Large Legacy ProjectsCleaning Code - Tools and Techniques for Large Legacy Projects
Cleaning Code - Tools and Techniques for Large Legacy Projects
Mike Long
 
Reactive Microservice Architecture with Groovy and Grails
Reactive Microservice Architecture with Groovy and GrailsReactive Microservice Architecture with Groovy and Grails
Reactive Microservice Architecture with Groovy and GrailsSteve Pember
 
What do the "Cool Kids" know about DevOps?
What do the "Cool Kids" know about DevOps?What do the "Cool Kids" know about DevOps?
What do the "Cool Kids" know about DevOps?
Bill Holtshouser
 
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting LeftDevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
DevSecCon
 
Software Defects and SW Reliability Assessment
Software Defects and SW Reliability AssessmentSoftware Defects and SW Reliability Assessment
Software Defects and SW Reliability Assessment
Kristine Hejna
 
From 1 to 100
From 1 to 100From 1 to 100
From 1 to 100
Eric Schultz
 
Agile Software Development.ppt
Agile Software Development.pptAgile Software Development.ppt
Agile Software Development.ppt
abdulwehab2
 
CS101- Introduction to Computing- Lecture 45
CS101- Introduction to Computing- Lecture 45CS101- Introduction to Computing- Lecture 45
CS101- Introduction to Computing- Lecture 45
Bilal Ahmed
 

Similar to Debugging microservices in production (20)

Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
 
Empirical Methods in Software Engineering - an Overview
Empirical Methods in Software Engineering - an OverviewEmpirical Methods in Software Engineering - an Overview
Empirical Methods in Software Engineering - an Overview
 
Perspectives on salesforce architecture Forcelandia talk 2017
Perspectives on salesforce architecture   Forcelandia talk 2017Perspectives on salesforce architecture   Forcelandia talk 2017
Perspectives on salesforce architecture Forcelandia talk 2017
 
devops, microservices, and platforms, oh my!
devops, microservices, and platforms, oh my!devops, microservices, and platforms, oh my!
devops, microservices, and platforms, oh my!
 
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
Cloud Foundry Summit 2015: Devops, microservices and platforms, oh my!
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
 
Feedback loops between tooling and culture
Feedback loops between tooling and cultureFeedback loops between tooling and culture
Feedback loops between tooling and culture
 
Arch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesArch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best Practices
 
Agile software development
Agile software developmentAgile software development
Agile software development
 
Scaling a Web Site - OSCON Tutorial
Scaling a Web Site - OSCON TutorialScaling a Web Site - OSCON Tutorial
Scaling a Web Site - OSCON Tutorial
 
Making disaster routine
Making disaster routineMaking disaster routine
Making disaster routine
 
Lec 1 Introduction to Software Engg.pptx
Lec 1 Introduction to Software Engg.pptxLec 1 Introduction to Software Engg.pptx
Lec 1 Introduction to Software Engg.pptx
 
Cleaning Code - Tools and Techniques for Large Legacy Projects
Cleaning Code - Tools and Techniques for Large Legacy ProjectsCleaning Code - Tools and Techniques for Large Legacy Projects
Cleaning Code - Tools and Techniques for Large Legacy Projects
 
Reactive Microservice Architecture with Groovy and Grails
Reactive Microservice Architecture with Groovy and GrailsReactive Microservice Architecture with Groovy and Grails
Reactive Microservice Architecture with Groovy and Grails
 
What do the "Cool Kids" know about DevOps?
What do the "Cool Kids" know about DevOps?What do the "Cool Kids" know about DevOps?
What do the "Cool Kids" know about DevOps?
 
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting LeftDevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
DevSecCon Asia 2017 Shannon Lietz: Security is Shifting Left
 
Software Defects and SW Reliability Assessment
Software Defects and SW Reliability AssessmentSoftware Defects and SW Reliability Assessment
Software Defects and SW Reliability Assessment
 
From 1 to 100
From 1 to 100From 1 to 100
From 1 to 100
 
Agile Software Development.ppt
Agile Software Development.pptAgile Software Development.ppt
Agile Software Development.ppt
 
CS101- Introduction to Computing- Lecture 45
CS101- Introduction to Computing- Lecture 45CS101- Introduction to Computing- Lecture 45
CS101- Introduction to Computing- Lecture 45
 

More from bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
bcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
bcantrill
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
bcantrill
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
bcantrill
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
bcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
bcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
bcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
bcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
bcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
bcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
bcantrill
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
bcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
bcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
bcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
bcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
bcantrill
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
bcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
bcantrill
 

More from bcantrill (19)

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
 

Recently uploaded

Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
Srikant77
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 

Recently uploaded (20)

Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 

Debugging microservices in production

  • 2. Debugging in the beginning...
  • 3. Debugging in the beginning... — Sir Maurice Wilkes, 1913 - 2010
  • 4. Debugging in the beginning... — Sir Maurice Wilkes, 1913 - 2010
  • 5. Debugging in the beginning... — Sir Maurice Wilkes, 1913 - 2010 “As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.”
  • 6. Debugging • The first non-trivial program for the EDSAC (a program to calculate a table of Airy integrals) had 120 lines and 20 errors — including one not debugged until four decades later! • This experience remains modern for anyone in software today, and many spend much of their career debugging • Yet there is little formalized about debugging: few books on it; little research; no conferences — and no university courses! • Is it any surprise that debugging anti-patterns persist?
  • 7. Debugging anti-patterns • For too many, debugging is the process of making problems go away rather than understanding the system! • The view of bugs-as-nuisance has many knock-on effects: • Fixes that don’t fix the problem (or introduce new ones!) • Bug reports closed out as “will not fix” or “works for me” • Users who are told to “restart” or “reboot” or “log out” or anything else that amounts to wishful thinking • And this is only when the process has obviously failed...
  • 8. Darker debugging anti-patterns • More insidious effects are felt when the problem appears to have been resolved, but hasn’t actually been fully understood • These are the fixes that amount to a fresh coat of paint over a crack in the foundation — and they are worse than nothing • Not only do these fixes not actually resolve the problem, they give the engineer a false sense of confidence that spreads virally • “Debugging” devolves into an oral tradition: folk tales of problems that were made to go away
  • 9. Thinking methodically • The way we think about debugging is fundamentally wrong; we need to think methodically about debugging! • When we think of debugging as the quest for understanding our (misbehaving) systems, it allows us to consider it more abstractly • Namely, how do we explain the phenomena that affect our world? • We have found that the most powerful explanations reflect an understanding of underlying structure — beyond what to why • This deeper understanding allows us to not only to explain but make predictions
  • 10. Predictive power • Valuing predictive power allows us to test our explanations: if our predictions are wrong, our understanding is incomplete • We can use the understanding from failed predictions to develop new explanations and new predictions • We can then test these new predictions to test our understanding • If all of this is sounding familiar, it’s because it’s science — and the methodical exploration of it is the scientific method
  • 11. The scientific method • The scientific method is to: • Make observations • Formulate a question • Formulate a hypothesis that answers the question • Formulate predictions that test the hypothesis • Test the predictions by conducting an experiment • Refine the hypothesis and repeat as needed
  • 13. Science, seriously. • Software debugging is a pure distillation of scientific thinking • The limitless amount of data from software systems allows experiments in seconds instead of weeks/months/years • The systems we’re reasoning about are entirely synthetic, discrete and mutable — we made it, we can understand it • Software is mathematical machine; the conclusions of software debugging are often mathematical in their unequivocal power! • Software debugging is so pure, it requires us to refine the scientific method slightly to reflect its capabilities...
  • 14. The software debugging method • Make observations • Based on observations, formulate a question • If the question can be answered through subsequent observation, answer the question through observation and refine/iterate • If the question cannot be answered through observation, make a hypothesis as to the answer and formulate predictions • If predictions can be tested through subsequent observation, test the predictions through observation and refine/iterate • Otherwise, test predictions through experiment and refine/iterate
  • 15. Observation is the heart of debugging! • The essence — and art! — of debugging software is making observations and asking questions, not formulating hypotheses! • Observations are facts — they constrain hypotheses in that any hypothesis contradicted by facts can be summarily rejected • As facts beget questions which beget observations and more facts, hypotheses become more tightly constrained — like a cordon being cinched around the truth • Or, in the words of Sir Arthur Conan Doyle’s Sherlock Holmes, “when you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth”
  • 16. Making the hypothetical leap • Once observation has sufficiently narrowed the gap between what is known and what is wrong, a hypothetical leap should be made • Debugging is inefficient when this leap is made too early — like making a specific guess too early in Twenty Questions • A hypothesis is only as good as its ability to form a prediction • A prediction should be tested with either subsequent observation or by conducting an experiment • If the prediction proves to be incorrect, understanding is incomplete; the hypothesis must be rejected — or refined
  • 17. Experiments in software • A beauty of software is that it is highly amenable to experiment • Many experiments are programs — and the most satisfying experiments test predictions about how failure can be induced • Many “non-reproducible” problems are merely unusual! • Debugging a putatively non-reproducible problem to the point of a reproducible test case is a joy unique in software engineering
  • 18. Software debugging in practice • The specifics of observation depends on the nature of the failure • Software has different kinds of failure modes: • Fatal failure (segmentation violation, uncaught exception) • Non-fatal failure (gives the wrong answer, performs terribly) • Explicit failure (assertion failure, error message) • Implicit failure (cheerfully does the wrong thing)
  • 19. Taxonomizing software failure Implicit Explicit Non-fatal Fatal Gives the wrong answer Returns the wrong result Leaks resources Stops doing work Performs pathologically Emits an error message Returns an error code Assertion failure Process explicitly aborts Exits with an error code Segmentation violation Bus Error Panic Type Error Uncaught Exception
  • 20. Microservices prehistory • The late 1990s saw the rise of three-tier architectures consisting of presentation, application logic and data tiers • Many names for roughly the same notion: “Service-oriented architecture”, “Model/View/Controller”, etc. • The AJAX+REST revolution of the mid-2000s gave rise to true web applications in which application logic could live on the edge • Led to some broader architectural questioning...
  • 21. Post-AJAX questions • Why should HTTP be restricted to the web? • Why should REST be restricted to web apps? • Instead of having one monolithic architecture, why not have a series of (smaller) services that merely did one thing well? • In case this sounds vaguely familiar...
  • 22. The Unix Philosophy • The Unix philosophy, as articulated by Doug McIlroy: • Write programs that do one thing and do it well • Write programs to work together • Write programs that handle text streams, because that is a universal interface • The single most important revolution in software systems thinking! • Applying it to HTTP-based services...
  • 23. Microservices • Microservices do one thing, and strive to do it well • Replace a small number of monoliths with many services that have well-documented, small HTTP-based APIs • Larger systems can be composed of these smaller services • While the trend it describes is real, the term “microservices” isn’t without its controversy...
  • 26. Debugging microservices • Veteran nerd rage may be being provoked by proponents of microservices not fully appreciating the risks… • Microservices turn a monolithic system into a distributed one • While resilient to certain classes of force majeure failures, distributed systems remain vulnerable to software defects • Distributed systems are infamously nasty to debug — not least because they often must be debugged in production
  • 27. Microservices in production • Microservices are tautologically small — they don’t need their own dedicated physical hardware, or even dedicated virtual hardware! • Microservices are a particularly good fit for containers, virtual OS instances pioneered by FreeBSD jails and Solaris zones
  • 28. Containers at Joyent • Joyent runs OS containers in the cloud via SmartOS — and we have run containers in multi-tenant production since ~2006 • Adding support for hardware-based virtualization circa 2011 strengthened our resolve with respect to OS-based virtualization • OS containers are lightweight and efficient — which is especially important as services become smaller and more numerous: overhead and latency become increasingly important! • We emphasized their operational characteristics — performance, elasticity, tenancy — and for many years, we were a lone voice...
  • 29. Containers as PaaS foundation? • Some saw the power of OS containers to facilitate up-stack platform-as-a-service abstractions • For example, dotCloud — a platform-as-a-service provider — built their PaaS on OS containers • Struggling as a PaaS, dotCloud pivoted — and open sourced their container-based orchestration layer...
  • 31. Docker revolution • Docker has used the rapid provisioning + shared underlying filesystem of containers to allow developers to think operationally • Developers can encode deployment procedures via an image • Images can be reliably and reproducibly deployed as a container • Images can be quickly deployed — and re-deployed • Docker complements the small-system ethos of microservices!
  • 32. Docker at Joyent • We wanted to create a best-of-all-worlds platform: the developer ease of Docker on the production-grade substrate of SmartOS • We developed a Linux system call interface for SmartOS, allowing SmartOS to run Linux binaries at bare-metal speed • In March 2015, we introduced Triton, our (open source!) stack that deploys Docker containers directly on the metal • Triton virtualizes the notion of a Docker host (i.e., “docker ps” shows all of one’s containers datacenter-wide) • Brings full debugging (DTrace, MDB) to Docker containers
  • 34. A more apt metaphor...
  • 35. Microservice failure modes Implicit Explicit Non-fatal Fatal Gives the wrong answer Returns the wrong result Leaks resources Stops doing work Performs pathologically Emits an error message Returns an error code Assertion failure Process explicitly aborts Exits with an error code Segmentation violation Bus Error Panic Type Error Uncaught Exception
  • 36. Cascading microservice failure modes Implicit Explicit Non-fatal Fatal Gives the wrong answer Returns the wrong result Leaks resources Stops doing work Performs pathologically Emits an error message Returns an error code Assertion failure Process explicitly aborts Exits with an error code Segmentation violation Bus Error Panic Type Error Uncaught Exception
  • 37. Debugging fatal failure • When software fails fatally, we know that the software itself is broken — its state has become inconsistent • By saving in-memory state to stable storage, the software can be debugged postmortem • To debug, one starts with the invalid state and reasons backwards to discover a transition from a valid state to an invalid one • This technique is so old, that the terms for this saved state dates back to the dawn of the computing age: a core dump • Not as low-level as the name implies! Modern high-level languages (e.g., node.js and Go) allow postmortem debugging!
  • 38. Debugging fatal failure: microservices • Postmortem analysis lends itself very well to microservices: • There is no run-time overhead; overhead (such as it is) is only at the time of death • The microservice/container can be safely (automatically!) restarted; the core dump can be analyzed asynchronously • Tooling need not be in container, can be made arbitrarily rich • In Triton, all core dumps are automatically stored and then uploaded into a system that allows for analysis, tagging, etc. • This has been invaluable for debugging our own services!
  • 39. Debugging non-fatal failure • There is a solace in fatal failure: it always represents a software defect at some level — and the inconsistent state is static • Non-fatal failure can be more challenging: the state is valid and dynamic — it’s difficult to separate symptom from cause • Non-fatal failure must still be understood empirically! • Debugging in vivo requires that data be extracted from the system — either of its own volition (e.g., via logs) or by coercion (e.g., via instrumentation)
  • 40. Debugging explicit, non-fatal failure • When failure is explicit (e.g., an error or warning message), it provides a very important data point • If failure is non-reproducible or otherwise transient, analysis of explicit software activity becomes essential • Action in one container will often need to be associated with failures in another • Especially for distributed systems, this becomes log analysis, and is an essential forensic tool for understanding explicit failure • Essential observation: a time line of events!
  • 41. Debugging implicit, non-fatal failure • Problems that are both implicit and non-fatal represent the most time-consuming, most difficult problems to debug because the system must be understood against its will • Wherever possible make software explicit about failure! • Where errors are programmatic (and not operational), they should always induce fatal failure! • Microservices break at the boundaries: two services each think that they are operating correctly, but together they’re broken • Data must be coerced from the system via instrumentation
  • 42. Instrumenting production systems • Traditionally, software instrumentation was hard-coded and static (necessitating software restart or — worse — recompile) • Dynamic system instrumentation was historically limited to system call table (strace/truss) or packet capture (tcpdump/snoop) • Effective for some problems, but a poor fit for ad hoc analysis • In 2003, Sun developed DTrace, a facility for arbitrary, dynamic instrumentation of production systems that has since been ported to Mac OS X, FreeBSD, NetBSD and (to a degree) Linux • DTrace has inspired dynamic instrumentation in other systems (see @brendangregg’s talk!)
  • 43. Instrumenting Docker containers • In Docker, instrumentation is a challenge as containers may not include the tooling necessary to understand the system • Docker host-based techniques for instrumentation may be tempting, but they should be considered an anti-pattern! • DTrace has a privilege model that allows it to be safely (and usefully) used from within a container • In Triton, DTrace is available from within every container — one can “docker exec -it bash” and then debug interactively
  • 44. Instrumenting node.js-based microservices • We have invested heavily in node.js-based infrastructure to allow us to meaningfully instrument microservices in production: • We developed Bunyan, a logging facility for node.js that includes DTrace support • Added DTrace support for node.js profiling • An essential vector for iterative observation: turning up the logging level on a running microservice!
  • 45. Debugging microservices in production • Debugging methodically requires us to shift our thinking — and learn how to carefully observe the systems we build • Different types of failures necessitate different techniques: • Fatal failure is best debugged via postmortem analysis — which is particular appropriate in an all-container world • Non-fatal failure necessitates log analysis and dynamic instrumentation • The ability to debug problems in production is essential to successfully deploy and scale microservices!