Distributed tracing in practice
Ivo Mägi, CEO & product manager @ Plumbr
September 2019
What are we going to cover today?
Understanding the need for distributed traces and the general concepts
Examples of how distributed traces help you locate the root cause
Advanced examples of how distributed traces map root causes to real user impact
Different ways to add distributed tracing to your production services
Plumbr - sign up for your free trial at https://www.plumbr.io
How did we get to distributed services?
Software is eating the world
More and more major businesses and industries are being run on software and delivered as online services.
----- Marc Andreessen, 2011
Software is eating the world faster
Large companies are forced to take plays from start-ups’ playbooks to stay competitive. Enterprises are under pressure to innovate faster in order to stay in business.
----- McKinsey, 2019
Implications for the IT teams
Moving from monoliths to microservices to enable innovation in individual teams.
Adopting DevOps practices within IT to support faster innovation.
Distributed tracing – why bother?
Support tickets like this:
From: John
To: support@example.com
Subject: Cannot complete checkout
I just tried to complete the order #32828, but was unable to finish the checkout. Your app stalled for 20 seconds and then gave me an error.
… turning into this in two weeks:
From: John
To: support@example.com
Subject: Re:Re:Re:Re:Re:Cannot complete the checkout
I finally managed to capture the HAR file from my browser using the instructions you altered. However, it is too big to be sent as an email attachment. Please advise.
The power of distributed tracing
What would such a trace look like?
Cornerstone of any distributed trace: UUID
Universally Unique Identifier (UUID)
• 128-bit number, 122 bits of which are random in a version 4 UUID
• Requires no central coordinator
• For practical purposes, unique
• You are 460,000,000 times more likely to die from a meteorite impact than to clash on UUIDs
Example: 68a9ab9d-f457-4dc8-98b0-645ef476fda6
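To make the "no central coordinator" point concrete, here is a minimal Python sketch of generating such an identifier locally. The helper names new_trace_id and collision_probability are made up for this illustration, and the probability uses the standard birthday-bound approximation rather than anything from the talk.

```python
import uuid


def new_trace_id() -> str:
    """Return a random (version 4) UUID as text; no coordination with other nodes is needed."""
    return str(uuid.uuid4())


def collision_probability(n_ids: int, random_bits: int = 122) -> float:
    """Birthday-bound approximation of a clash among n_ids random IDs: n^2 / (2 * 2^bits)."""
    return n_ids * n_ids / (2 * 2 ** random_bits)


trace_id = new_trace_id()
print(trace_id)                         # a fresh value on every run
print(collision_probability(10 ** 9))   # chance of any clash among a billion trace IDs
```

Every service can call such a function on its own, which is why the identifier works as the glue between otherwise independent nodes.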
Passing the UUID: HTTP headers
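The diagrams for this part are not preserved in the transcript, so here is a hedged Python sketch of the idea: a service reuses the trace ID found in the incoming request headers, or starts a new trace, and copies the ID onto every outgoing call. The header name X-Trace-Id and both function names are assumptions for illustration only; real tracers use their own headers, such as Zipkin's X-B3-TraceId or the W3C traceparent header.

```python
import uuid
from typing import Dict, Mapping

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name, chosen for this sketch


def extract_or_create_trace_id(headers: Mapping[str, str]) -> str:
    """Reuse the caller's trace ID if present (header names are case-insensitive), else start a new trace."""
    for name, value in headers.items():
        if name.lower() == TRACE_HEADER.lower():
            return value
    return str(uuid.uuid4())


def inject_trace_id(trace_id: str, headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the outgoing headers with the trace ID attached."""
    outgoing = dict(headers)
    outgoing[TRACE_HEADER] = trace_id
    return outgoing


incoming = {"x-trace-id": "68a9ab9d-f457-4dc8-98b0-645ef476fda6"}
trace_id = extract_or_create_trace_id(incoming)
outgoing = inject_trace_id(trace_id, {"Accept": "application/json"})
print(outgoing)
```

As long as every hop repeats these two steps, all work done on behalf of one user interaction carries the same ID.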
Outcome: distributed trace
• Consisting of spans
• Registering the duration and outcome of the trace
• Enriched with additional metadata at span/trace level:
  • User ID
  • Cluster the span belongs to
  • Node ID of the span
  • …
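One possible way to model the outcome described above, sketched in Python. The class and field names (Span, Trace, tags and so on) are assumptions for illustration and do not mirror any particular tracer's data model; the 20-second figure is borrowed from the support ticket earlier in the deck.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None for the root span
    name: str
    start_ms: int
    end_ms: int
    ok: bool
    tags: Dict[str, str] = field(default_factory=dict)  # user ID, cluster, node ID, ...

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        """Time from the earliest span start to the latest span end."""
        return max(s.end_ms for s in self.spans) - min(s.start_ms for s in self.spans)

    @property
    def ok(self) -> bool:
        """In this sketch a trace succeeds only if every span succeeded."""
        return all(s.ok for s in self.spans)


checkout = Trace("68a9ab9d-f457-4dc8-98b0-645ef476fda6", [
    Span("68a9ab9d-f457-4dc8-98b0-645ef476fda6", "a", None, "POST /checkout", 0, 20000, False,
         {"user_id": "john", "node": "frontend-1"}),
    Span("68a9ab9d-f457-4dc8-98b0-645ef476fda6", "b", "a", "charge card", 100, 20000, False,
         {"cluster": "payments", "node": "payments-3"}),
])
print(checkout.duration_ms, checkout.ok)
```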
Summary: three building blocks for distributed tracing
Put the distributed traces to good use
Removing the need to manually reproduce and gather evidence when responding to support tickets
Fully understanding the impact of user-facing issues
Prioritizing the improvements based on the impact to the end user
Proactively responding to issues via alerting based on the tracing information
Hypothetical support case landing on your desk
From: John
To: support@example.com
Subject: Cannot complete checkout
I just tried to complete the order #32828, but was unable to finish the checkout. Your app stalled for 20 seconds and then gave me an error.
… two weeks later
From: John
To: support@example.com
Subject: Re:Re:Re:Re:Re:Cannot complete the checkout
I finally managed to capture the HAR file from my browser using the instructions you altered. However, it is too big to be sent as an email attachment. Please advise.
What happened during the two weeks?
Could it have been different?
Yes. Let's walk through examples of how distributed tracing helps you by:
• Verifying the claim
• Prioritizing the response
• Understanding the true impact
• Proactively handling such problems
Example #1: verifying the complaint
Example #1: complaint verified
Metadata added to the trace allowed us to search for the evidence
Spans linked to the trace allowed us to verify the failure had indeed occurred
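The screenshots for this example are not part of the transcript, so the following Python sketch only illustrates the kind of lookup that trace metadata makes possible. The records, field names and the find_failed_traces helper are invented for illustration; a real tracing backend would expose its own query interface.

```python
from typing import Dict, List

# Hypothetical, hand-written trace records standing in for a tracing backend.
TRACES: List[Dict[str, object]] = [
    {"trace_id": "t-1", "user_id": "john", "endpoint": "/checkout", "duration_ms": 20000, "failed": True},
    {"trace_id": "t-2", "user_id": "john", "endpoint": "/cart", "duration_ms": 120, "failed": False},
    {"trace_id": "t-3", "user_id": "mary", "endpoint": "/checkout", "duration_ms": 310, "failed": False},
]


def find_failed_traces(traces: List[Dict[str, object]], user_id: str, endpoint: str) -> List[Dict[str, object]]:
    """Return the failed traces of one user on one endpoint."""
    return [
        t for t in traces
        if t["user_id"] == user_id and t["endpoint"] == endpoint and t["failed"]
    ]


matches = find_failed_traces(TRACES, "john", "/checkout")
for trace in matches:
    print(trace["trace_id"], trace["duration_ms"])
```

With the user ID stored on the trace, the support engineer can confirm John's 20-second failure without asking him for a HAR file.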
Example #2: prioritizing the response
Example #2: priorities assigned based on the impact
Unique identification of an error coupled with distributed tracing allows you to objectively quantify the priority for a particular error.
In this specific situation, a high-priority response is likely not justified.
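A hedged Python sketch of what "objectively quantify" could mean in practice: group failed traces by error and count the distinct users each error affected. The error names, records and the impact_by_error helper are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical failed traces, each tagged with the error that caused the failure.
FAILED_TRACES: List[Dict[str, str]] = [
    {"error": "PaymentTimeout", "user_id": "john"},
    {"error": "PaymentTimeout", "user_id": "john"},
    {"error": "CartServiceError", "user_id": "mary"},
    {"error": "CartServiceError", "user_id": "ann"},
    {"error": "CartServiceError", "user_id": "bob"},
]


def impact_by_error(traces: List[Dict[str, str]]) -> List[Tuple[str, int]]:
    """Rank errors by the number of distinct users they affected, most impactful first."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for trace in traces:
        users[trace["error"]].add(trace["user_id"])
    return sorted(((error, len(ids)) for error, ids in users.items()),
                  key=lambda pair: pair[1], reverse=True)


ranking = impact_by_error(FAILED_TRACES)
print(ranking)
```

In this made-up data the error behind John's ticket affected a single user, so it ranks below the error that hit three users.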
Example #3: zooming out to see what real users experience
Example #3: true impact only reveals itself if traces go all the way to the real user
Distributed tracing can and should leave the server rooms
End-to-end traces are the way to expose both the impact and root cause correctly
Example #4: becoming proactive
Example #4: do not rely upon end users. Harness the true power of distributed traces
Trigger alerts based on the impact
Send the alerts to channels in use
Respond to incidents using the root causes
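As a rough Python sketch of an impact-based trigger: compute the share of users who hit a failure within a time window and alert once it crosses a threshold. The 5% default, the record layout and the function names are assumptions for illustration, not Plumbr's actual alerting rules.

```python
from typing import Dict, List


def affected_user_ratio(traces: List[Dict[str, object]]) -> float:
    """Share of distinct users in the window who experienced at least one failed trace."""
    all_users = {t["user_id"] for t in traces}
    failed_users = {t["user_id"] for t in traces if t["failed"]}
    return len(failed_users) / len(all_users) if all_users else 0.0


def should_alert(traces: List[Dict[str, object]], threshold: float = 0.05) -> bool:
    """Alert when the affected share reaches the (assumed) threshold."""
    return affected_user_ratio(traces) >= threshold


# Hypothetical traces collected during one monitoring window.
window = [
    {"user_id": "john", "failed": True},
    {"user_id": "mary", "failed": False},
    {"user_id": "ann", "failed": False},
    {"user_id": "bob", "failed": False},
]
print(affected_user_ratio(window), should_alert(window))
```

Because the trigger is expressed in affected users rather than raw error counts, one noisy user cannot page the team on their own unless the threshold says so.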
Adopting distributed tracing: different solutions available
Open-source distributed tracing solutions
Capturing a trace with Zipkin: example
/* Assumption: Request is Symfony's HttpFoundation request class */
use Symfony\Component\HttpFoundation\Request;
use Zipkin\Propagation\Map;
use Zipkin\Timestamp;

/* create_tracing() is assumed to be an application-level helper that builds the Zipkin tracing component */
$tracing = create_tracing('php-frontend', '127.0.0.1');
$tracer = $tracing->getTracer();
$request = Request::createFromGlobals();

/* Extract the context from HTTP headers */
$carrier = array_map(function ($header) {
    return $header[0];
}, $request->headers->all());
$extractor = $tracing->getPropagation()->getExtractor(new Map());
$extractedContext = $extractor($carrier);

/* Create a span and set its attributes */
$span = $tracer->newChild($extractedContext);
$span->start(Timestamp\now());
$span->setName('parse_request');
$span->setKind(Zipkin\Kind\SERVER);
Open-source solutions: flexible but obtrusive
• You can tailor the metadata and model to match your specific needs
• As a result, your application code is now dependent on the framework
• In addition, there is the human factor: if you forget to instrument a particular endpoint, it will be missing from traces
• Usability-wise, there are limited ways to query and visualize the data
Commercial distributed tracing solutions
Capturing a trace with Plumbr: example
$ java -javaagent:/path/to/plumbr.jar com.example.YourExecutable
Commercial solutions: cost attached, but they do the heavy lifting for you
• Installation is easy
• No dependencies at source code level
• Fewer nuances to deal with
Tying it together
You now understand how distributed tracing works
You got a sneak peek into how different open-source and commercial vendors can help you capture distributed traces
You are equipped with examples of how hard questions can be paired with simple answers thanks to distributed tracing
And of course, when you set out on your journey with distributed tracing …
… Plumbr will be the solution to consider
We integrate with your existing ecosystem
And all the information exposed is based on the distributed traces
Thank you!
Ivo Mägi, CEO & product manager @ Plumbr

The Evolving Role of the Developer in 2021
 
Service Mesh: Two Big Words But Do You Need It?
Service Mesh: Two Big Words But Do You Need It?Service Mesh: Two Big Words But Do You Need It?
Service Mesh: Two Big Words But Do You Need It?
 
Secure Data Sharing in OpenShift Environments
Secure Data Sharing in OpenShift EnvironmentsSecure Data Sharing in OpenShift Environments
Secure Data Sharing in OpenShift Environments
 
How to Govern Identities and Access in Cloud Infrastructure: AppsFlyer Case S...
How to Govern Identities and Access in Cloud Infrastructure: AppsFlyer Case S...How to Govern Identities and Access in Cloud Infrastructure: AppsFlyer Case S...
How to Govern Identities and Access in Cloud Infrastructure: AppsFlyer Case S...
 
Elevate Your Enterprise Python and R AI, ML Software Strategy with Anaconda T...
Elevate Your Enterprise Python and R AI, ML Software Strategy with Anaconda T...Elevate Your Enterprise Python and R AI, ML Software Strategy with Anaconda T...
Elevate Your Enterprise Python and R AI, ML Software Strategy with Anaconda T...
 

Distributed Tracing in Practice

  • 1. Distributed tracing in practice Ivo Mägi, CEO & product manager @ Plumbr September 2019
  • 2. What are we going to cover today? • Understanding the need for distributed traces and the general concepts • Examples of how distributed traces help you locate the root cause • Advanced examples of how distributed traces map root causes to real user impact • Different ways to add distributed tracing to your production services Plumbr - sign up for your free trial at https://www.plumbr.io
  • 3. How did we get to distributed services? Software is eating the world: more and more major businesses and industries are being run on software and delivered as online services. (Marc Andreessen, 2011)
  • 4. Software is eating the world faster: large companies are forced to take plays from start-ups’ playbooks to stay competitive. Enterprises are under pressure to innovate faster in order to stay in business. (McKinsey, 2019)
  • 5. Implications for the IT teams • Moving from monoliths to microservices to enable innovation in individual teams • Adopting DevOps practices within IT to support faster innovation
  • 6. Distributed tracing – why bother?
  • 7. Distributed tracing – why bother? Support tickets like this. From: John To: support@example.com Subject: Cannot complete checkout I just tried to complete the order #32828, but was unable to finish the checkout. Your app stalled for 20 seconds and then gave me an error.
  • 8. … turning into this in two weeks From: John To: support@example.com Subject: Re:Re:Re:Re:Re:Cannot complete the checkout Finally managed to capture the HAR file from my browser using the instructions you provided. However, it is too big to be sent as an email attachment. Please advise
  • 9. The power of distributed tracing
  • 10. What would such a trace look like?
  • 11. Cornerstone of any distributed trace: UUID Universally Unique Identifier (UUID) • 128-bit, randomly generated number • Requires no central coordinator • For practical purposes, unique • You are 460,000,000 times more likely to die from a meteorite impact than to clash on UUIDs • Example: 68a9ab9d-f457-4dc8-98b0-645ef476fda6
  • 15. Passing the UUID: HTTP headers
  • 21. Outcome: distributed trace • Consisting of spans • Registering the duration and outcome of the trace • Enriched with additional metadata at span/trace level: user ID, cluster the span belongs to, node ID of the span, …
  • 22. Summary: three building blocks for distributed tracing
  • 23. Putting distributed traces to good use • Removing the need to manually reproduce and gather evidence when responding to support tickets • Fully understanding the impact of user-facing issues • Prioritizing improvements based on the impact on end users • Proactively responding to issues via alerting based on the tracing information
  • 24. Hypothetical support case landing on your desk From: John To: support@example.com Subject: Cannot complete checkout I just tried to complete the order #32828, but was unable to finish the checkout. Your app stalled for 20 seconds and then gave me an error.
  • 25. … two weeks later From: John To: support@example.com Subject: Re:Re:Re:Re:Re:Cannot complete the checkout Finally managed to capture the HAR file from my browser using the instructions you provided. However, it is too big to be sent as an email attachment. Please advise
  • 26. What happened during the two weeks?
  • 27. Could it have been different? Yes. Let's walk through examples of how distributed tracing helps you by: • Verifying the claim • Prioritizing the response • Understanding the true impact • Proactively handling such problems
  • 28. Example #1: verifying the complaint
  • 29. Example #1: verifying the complaint
  • 30. Example #1: complaint verified • Metadata added to the trace allowed us to search for the evidence • Spans linked to the trace allowed us to verify that the failure had indeed occurred
  • 31. Example #2: prioritizing the response
  • 32. Example #2: prioritizing the response
  • 33. Example #2: prioritizing the response
  • 34. Example #2: priorities assigned based on the impact Unique identification of an error, coupled with distributed tracing, allows you to objectively quantify the priority of a particular error. In this specific situation, a high-priority response is likely not justified.
  • 35. Example #3: zooming out to see what real users experience
  • 36. Example #3: zooming out to see what real users experience
  • 37. Example #3: true impact only reveals itself if traces go all the way to the real user • Distributed tracing can and should leave the server rooms • End-to-end traces are the way to expose both the impact and the root cause correctly
  • 38. Example #4: becoming proactive
  • 39. Example #4: becoming proactive
  • 40. Example #4: do not rely upon end users. Harness the true power of distributed traces • Trigger alerts based on the impact • Send the alerts to the channels already in use • Respond to incidents using the root causes
  • 41. Adopting distributed tracing: different solutions available
  • 42. Open-source distributed tracing solutions
  • 43. Capturing a trace with Zipkin: example
        $tracing = create_tracing('php-frontend', '127.0.0.1');
        $tracer = $tracing->getTracer();
        $request = \Component\Request::createFromGlobals();

        /* Extract the context from HTTP headers */
        $carrier = array_map(function ($header) {
            return $header[0];
        }, $request->headers->all());
        $extractor = $tracing->getPropagation()->getExtractor(new Map());
        $extractedContext = $extractor($carrier);

        /* Create a span and set its attributes */
        $span = $tracer->newChild($extractedContext);
        $span->start(Timestamp\now());
        $span->setName('parse_request');
        $span->setKind(Zipkin\Kind\SERVER);
  • 44. Capturing a trace with Zipkin: example
  • 45. Open-source (OS) solutions: flexible but obtrusive • You can tailor the metadata and model to match your specific needs • As a result, your application code is now dependent on the framework • In addition, there is the human factor: if you forget to add a particular endpoint, it will be missing from the traces • Usability-wise, there are limited ways to query and visualize the data
  • 46. Commercial distributed tracing solutions
  • 47. Capturing a trace with Plumbr: example $ java -javaagent:/path/to/plumbr.jar com.example.YourExecutable
  • 48. Capturing a trace with Plumbr: example
  • 49. Commercial solutions: cost attached, but they do the heavy lifting for you • Installation is easy • No dependencies at source code level • Fewer nuances to deal with
  • 50. Tying it together • You now understand how distributed tracing works • You got a sneak peek into how different open-source and commercial vendors can help you capture distributed traces • You are equipped with examples of how distributed tracing gives simple answers to hard questions
  • 51. And of course, when you set out on your journey with distributed tracing …
  • 52. … Plumbr will be the solution to consider
  • 53. We integrate with your existing ecosystem
  • 54. And all the information exposed is based on the distributed traces
  • 55. Thank you! Ivo Mägi, CEO & product manager @ Plumbr
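
The three building blocks from the slides (a unique trace ID, passing it along in HTTP headers, and spans gathered in one place) can be sketched in a few lines of Python. This is a minimal illustration and not how Plumbr or Zipkin implement it: the header name `X-Trace-Id`, the `handle` function, the `collected` list standing in for the central server, and the counter used as a clock are all assumptions made for the example.

```python
import itertools
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name, not taken from the slides


@dataclass
class Span:
    """One node's share of the work done for a single request."""
    trace_id: str
    node: str
    start: int
    end: int
    metadata: Dict[str, str] = field(default_factory=dict)

    @property
    def duration(self) -> int:
        return self.end - self.start


collected: List[Span] = []   # stands in for the central monitoring server
_clock = itertools.count()   # deterministic fake clock, one tick per event


def handle(node: str,
           headers: Dict[str, str],
           downstream: Optional[Callable[[Dict[str, str]], str]] = None,
           **metadata: str) -> str:
    """Process a request on `node`, reusing or creating the trace ID."""
    # Reuse the caller's ID if the header is present, otherwise start a trace.
    trace_id = headers.get(TRACE_HEADER) or str(uuid.uuid4())
    start = next(_clock)
    if downstream is not None:
        # Pass the same ID on to the next node in an HTTP-style header.
        downstream({TRACE_HEADER: trace_id})
    end = next(_clock)
    collected.append(Span(trace_id, node, start, end, dict(metadata)))
    return trace_id


# A request enters node-1, which calls node-2, which calls node-3.
trace_id = handle(
    "node-1", {},
    downstream=lambda h: handle(
        "node-2", h,
        downstream=lambda h2: handle("node-3", h2),
    ),
    user_id="john",
)

for span in collected:
    print(span.node, span.trace_id, span.duration)
```

Every span carries the same trace ID, so the central store can stitch the three records back into one trace even though each node reported independently.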

Editor's Notes

  1. Highlight the cost of downtime, and perhaps the growing number of businesses relying on digital channels.
  2. Remember the old days, when the entire application under management consisted of one big box? In reality you most likely had a few of those running in a load-balanced cluster, but every node was identical. Now, instead of a few stable services, you need to govern hundreds of fast-changing microservices. As a result, services break more frequently. To give you some idea: if every service is 99% available and a request depends on all 30 microservices under management, the end-to-end availability drops to 0.99^30 ≈ 74%.
  3. To help you fully comprehend and appreciate distributed tracing, let’s dive into a few details about what constitutes a trace. A trace is the complete processing of a request. The trace represents the whole journey of a request as it moves through all of the services or components of a distributed system. All trace events generated by a request share a trace ID that tools use to organize, filter, and search for specific traces. Distributed traces help IT and DevOps teams monitor applications, especially those composed of microservices. Distributed tracing helps pinpoint where failures occur and what causes suboptimal performance.
  5. Can we animate this process in some simple way, on top of the microservices picture? A request comes in from the top. At the first node an ID is created (something automatic, à la sdv0894vöeb8sv) and the start time is registered. The request moves to the second node. The second node uses the same ID and registers its start time. The request moves to the third node. The third node uses the same ID and registers both start and end time. The request moves back to the second node, which gets its end time. It moves back to the first node, which gets its end time. It exits through the top. Information from all the nodes flows to the central monitoring server.
  6. The last piece of any tracing infrastructure is the monitoring agents themselves. Monitoring agents are a work of software craft in their own right. The common denominator among all web applications is HTTP or HTTPS traffic. It is therefore common practice to have agents that operate at the lowest levels so that they can capture the complete details of traffic between all the nodes in an application. Agents must be able to capture and analyze traffic in a manner that is agnostic to languages, frameworks, and other infrastructure. Such agents are either built on language-specific APIs at the bytecode level (as Java or .NET agents are) or dig deeper and hook into system library calls via LD_PRELOAD at the native-code level.
  7. So we covered the concept. Distributed tracing builds up a data model, consisting of traces and spans which are uniquely identified and contain valuable metadata. This data is captured by agents, deployed per microservice under monitoring. The data is sent to the central server where it is processed and made available for querying and visualization.
  8. I bet your mind is already racing a million miles a minute, thinking about all the cool things that can be done given such information, right? Let me show you three examples, going from a simple and straightforward use case to something I bet you never even thought about:
  9. Just as easily, you are now able to confirm that the customer complaint is real - evidence is right in front of you. No more trying to reproduce or gathering additional evidence. You see the failure right in front of you. So you can confirm the presence of the issue and proceed with fixing the bug right away. Right?
  11. Hm. Now you still have the evidence right in front of you. John Smith indeed was unable to complete the checkout, but apparently he was the only one experiencing this error. Should you really spend your time on this issue, considering all the other bugs and features waiting for your attention in the backlog?
  12. Whoa. So it was not the checkout all along. It was the subscription details that was the culprit. Apparently the failure to fetch subscription details was not handled properly and thus a non-existing subscription ID was passed to the checkout. But the subscription detail API has been failing for hours now, for hundreds of users. Houston, P1!
  14. Like every chef, I happen to be proud of my own menu. Plumbr APM and RUM solutions are especially good at doing everything we described today, and more. If you were inspired, go and grab your free trial and check out how we can change the way you handle availability and performance issues in production.
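
The prioritization discussed in the notes above (one user hitting a checkout error versus hundreds of users hitting the subscription details failure) comes down to counting distinct affected users per error and alerting once that count crosses a threshold. The sketch below illustrates the idea under stated assumptions: the error names, user IDs, the `users_affected` and `errors_to_alert` helpers, and the threshold of two users are all made up for the example and do not describe any vendor's alerting logic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical (error ID, user ID) pairs pulled from failed traces.
failures: List[Tuple[str, str]] = [
    ("checkout-error", "john"),
    ("subscription-details-error", "u1"),
    ("subscription-details-error", "u2"),
    ("subscription-details-error", "u1"),  # same user failing twice
    ("subscription-details-error", "u3"),
]


def users_affected(events: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct users per error, so repeat failures are not double-counted."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for error_id, user_id in events:
        users[error_id].add(user_id)
    return {error_id: len(ids) for error_id, ids in users.items()}


def errors_to_alert(events: Iterable[Tuple[str, str]], min_users: int) -> List[str]:
    """Return errors affecting at least `min_users` users, most impactful first."""
    impact = users_affected(events)
    ranked = sorted(impact.items(), key=lambda item: item[1], reverse=True)
    return [error_id for error_id, count in ranked if count >= min_users]


impact = users_affected(failures)
alerts = errors_to_alert(failures, min_users=2)
print(impact)
print(alerts)
```

Counting users rather than raw failures keeps one retrying user from inflating the priority of an otherwise isolated error.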