OpenTelemetry For Architects
Presented by Kevin Brockhoff
Apache 2.0 Licensed
Our Agenda
● Where are current observability patterns falling short?
● Who is OpenTelemetry and why should I care?
● What are some recommended OpenTelemetry deployment architectures?
● How can I use OpenTelemetry to incrementally improve telemetry collection in applications?
Level Setting
● Have you used the ELK stack or another log aggregator?
● Have you used an APM system?
● Have you used distributed tracing before?
● Have you used OpenCensus?
● Have you used OpenTracing?
Who am I?
● Kevin Brockhoff - Senior Consultant, Daugherty Business Solutions
○ Solving difficult cloud adoption challenges for Daugherty's Fortune 500 clients
○ OpenTelemetry committer since the early stages of the project
○ GitHub: https://github.com/kbrockhoff
○ LinkedIn: https://www.linkedin.com/in/kevin-brockhoff-a557877/
Observability 2.0
Why observability?
● Microservices create complex interactions.
● Failures don't exactly repeat.
● Debugging multi-tenancy is painful.
● Monitoring can no longer help us.
Cynefin Framework: Complex domain
Observability 1.0
Metrics Concepts
● Gauges
○ Instantaneous point-in-time value (e.g. CPU utilization)
● Cumulative counters
○ Cumulative sums of data since process start (e.g. request counts)
● Cumulative histograms
○ Grouped counters for a range of buckets (e.g. 0-10ms, 11-20ms)
● Rates
○ Typically the derivative of a counter (e.g. requests per second)
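To make the counter, rate, and histogram concepts above concrete, here is a minimal Python sketch. The sample values, bucket bounds, and function names are illustrative assumptions and are not part of any OpenTelemetry API.

```python
from bisect import bisect_left


def rate_per_second(samples):
    """Approximate a rate from (timestamp_seconds, cumulative_count) samples.

    Only the first and last samples are used, and the sketch assumes the
    counter was not reset (no process restart) in between.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (c1 - c0) / (t1 - t0)


def bucket_counts(values, upper_bounds):
    """Count observations per bucket; the final slot is the overflow bucket."""
    counts = [0] * (len(upper_bounds) + 1)
    for value in values:
        # bisect_left returns the index of the first bound >= value,
        # so each bucket covers (previous bound, bound].
        counts[bisect_left(upper_bounds, value)] += 1
    return counts


# Hypothetical scrapes of a cumulative request counter, 10 seconds apart.
scrapes = [(0, 100), (10, 150), (20, 220)]
print(rate_per_second(scrapes))

# Hypothetical request latencies in milliseconds, bucketed at 10 ms and 20 ms.
print(bucket_counts([3, 10, 12, 25, 8], [10, 20]))
```

A gauge needs no such processing: it is simply the most recent observed value.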
Basic Observability Metrics Methods
● USE - Utilization, Saturation, and Errors
○ Resource-scoped
● RED - Rate, Errors, and Duration
○ Request-scoped
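As a rough illustration of the RED method, the sketch below summarizes one window of requests. The `Request` record, the `red_summary` function, and the choice of a nearest-rank 95th percentile for duration are assumptions made for this example; real systems usually derive these values from counters and histograms instead of raw request lists.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    ok: bool


def red_summary(requests, window_seconds):
    """Summarize a window of requests as Rate, Errors, and Duration."""
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    durations = sorted(r.duration_ms for r in requests)
    errors = sum(1 for r in requests if not r.ok)
    # Nearest-rank percentile: the smallest value covering 95% of requests.
    rank = max(1, math.ceil(0.95 * len(durations)))
    return {
        "rate_per_s": len(requests) / window_seconds,
        "error_ratio": errors / len(requests),
        "p95_ms": durations[rank - 1],
    }


# Hypothetical two-second window containing four requests, one of which failed.
window = [Request(10, True), Request(20, True), Request(30, False), Request(40, True)]
print(red_summary(window, window_seconds=2))
```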
Tracing Concepts
● Span
○ Represents a single unit of work in a system.
● Trace
○ Defined implicitly by its spans. A trace can be thought of as a directed acyclic graph of spans where the edges between spans are defined as parent/child relationships.
● Distributed Context
○ Contains the tracing identifiers, tags, and options that are propagated from parent to child spans.
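The idea that a trace is defined implicitly by its spans can be sketched in a few lines of Python. The `Span` fields and the span names below are hypothetical and much simpler than the OpenTelemetry data model; the point is only that the parent/child edges are enough to rebuild the whole trace.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str


def render_trace(spans):
    """Rebuild the parent/child structure and return indented span names."""
    children = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Hypothetical spans reported independently by three services.
spans = [
    Span("t1", "a", None, "GET /checkout"),
    Span("t1", "b", "a", "charge-card"),
    Span("t1", "c", "a", "SELECT orders"),
    Span("t1", "d", "b", "POST /payments"),
]
print("\n".join(render_trace(spans)))
```

No service ever sends a "trace" object; a backend groups spans by trace identifier and follows the parent identifiers, which is why the distributed context has to carry both.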
Observability 1.0 Limitations
● Data ends up in 3 different datastores.
● Different types of data not correlated with each other.
● Observability is not necessarily insight.
Operational Complexity Growth
● Circuit Breaker
○ 2010: Homegrown w/ 3 configs
○ 2020: Resilience4J w/ 14 configs
● Retries
○ 2010: End user clicks submit again
○ 2020: Resilience4J w/ 7 configs
● Health Check
○ 2010: HTTP server and DB are live
○ 2020: Kubernetes liveness, readiness, and startup probes with 5 timing configs per probe
● Alerts
○ 2010: Unread count on circuit breaker opened email folder
○ 2020: ???
From Observability 1.0 to 2.0
Observability 2.0 - PoC
● Deep Linking Metrics and Traces with OpenTelemetry, OpenMetrics and M3 - Rob Skillington (Presentation @ KubeCon North America 2019)
○ Click on point in metrics graph to get representative traces
○ Click on trace span to get system metrics from server that produced the span
○ Click on trace span to get all application logs emitted during span
OpenTelemetry Project
Sandbox Project
OpenCensus + OpenTracing = OpenTelemetry
● OpenCensus:
○ Provides APIs and instrumentation that allow you to collect application metrics and distributed tracing.
○ Provides oc-service and oc-agent middleware.
● OpenTracing:
○ Provides APIs for distributed tracing with implementations provided by tracing backend vendors.
● OpenTelemetry:
○ An effort to combine distributed tracing, metrics and logging into a single set of system components and language-specific libraries.
OpenTelemetry Project
● Specification
○ API (for application developers)
○ SDK Implementations
○ Transport Protocol (Protobuf - gRPC)
● Collector (middleware)
● SDKs (various stages of maturity)
○ C++
○ C# (Auto-instrument/Manual)
○ Erlang
○ Go
○ JavaScript (Browser/Node)
○ Java (Auto-instrument/Manual)
■ Android compatibility
○ PHP
○ Python (Auto-instrument/Manual)
○ Ruby
○ Rust
○ Swift
Open Source Observability Platforms Supported
Contributors
W3C Distributed Tracing Working Group
● Trace Context – Level 1 - Recommendation
● Propagation format for distributed trace context: Baggage (rec-track)
● Trace Context: AMQP protocol (rec-track)
● Trace Context: MQTT protocol (rec-track)
● Trace Response Headers (rec-track)
● Trace Context Protocols Registry – Group Note
● Trace Context: binary protocol (rec-track)
● Trace Interchange Format (rec-track)
● Trace State Ids Registry (note)
Trace Context HTTP Headers
traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01
○ Fields: version, trace-id (128 bit), parent-id (64 bit), trace-flags (8 bit)
tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE
○ Vendor-specific key/value pairs
Baggage: userId=sergey,serverNode=DF:28,isProduction=false
○ Draft Baggage header specification
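The header layouts above can be parsed with a few lines of standard-library Python. This is a minimal sketch, not a full W3C implementation: the function names are made up for this example, only the four-field version 00 layout of traceparent is handled, and the list parser ignores percent-encoding and the optional properties that the Baggage draft allows.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Split a traceparent header into its fields, rejecting invalid values."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError("malformed traceparent header")
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero identifiers are defined as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("invalid traceparent value")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # The lowest bit of trace-flags is the sampled flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


def parse_list_header(header):
    """Parse a comma-separated key=value header such as tracestate or Baggage."""
    entries = {}
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            entries[key] = value
    return entries


print(parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01"))
print(parse_list_header("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE"))
print(parse_list_header("userId=sergey,serverNode=DF:28,isProduction=false"))
```

In practice the OpenTelemetry SDK propagators handle these headers; a hand-written parser like this one is mainly useful for debugging or for understanding what a proxy forwards.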
Architecture
Deployment Architectures
Kubernetes Deployment - Proof of Concept
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]
    metrics:
      receivers: [otlp, prometheus]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [prometheus]
Kubernetes Deployment - External Backends
service:
  pipelines:
    traces:
      receivers: [otlp, zipkin]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]
    metrics:
      receivers: [otlp, prometheus]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
Kubernetes Deployment - Service Mesh
service:
  pipelines:
    traces:
      receivers: [zipkin]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]
    metrics:
      receivers: [statsd, prometheus]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
Application Server on VM Deployment
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]
    metrics:
      receivers: [statsd, otlp]
      processors: [memory_limiter, resource, ...]
      exporters: [otlp]

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [commercial...]
Instrumentation Strategies
Greenfield Project Evolution
● Proof of Concept Demos
○ Sample App w/auto-instrumentation & direct exporters -> Jaeger & Prometheus
● Initial Development
○ Application libraries w/manual instrumentation -> In-memory and/or logging exporter
● Deployments during Development
○ Application w/SDK -> Collector (OTLP receiver) -> Cloud platform native monitoring
● Production
○ Applications w/SDK on hybrid cloud -> Collector (OTLP receiver) -> Latest and greatest enterprise-wide observability platform
Already Instrumented Applications
● OpenCensus
○ Application -> Collector (OpenCensus receiver) -> Backend
● OpenTracing
○ Application w/OT + OpenTracing shim + SDK -> Collector (OTLP receiver) -> Backend
● Spring Boot
○ Application w/Micrometer -> Collector (Prometheus receiver) -> Backend
○ Application w/Spring Cloud Sleuth -> Collector (Zipkin receiver) -> Backend
● AWS
○ Application w/X-Ray SDK -> Collector (X-Ray receiver) -> Backend(s)
Non-instrumented Applications
● Java
○ Launch with OpenTelemetry Java Agent (support for 61 widely-used frameworks and libraries)
● JavaScript/TypeScript
○ Add handlers/wrappers at key places or Node auto-instrumentation
● Microservice in any language
○ Deploy Envoy proxy as sidecar
● Infrastructure
○ Move to public cloud. AWS, Azure, GCP are all incorporating the OpenTelemetry collector in their infrastructure
Thank you!

Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 

OpenTelemetry For Architects

  • 1. OpenTelemetry For Architects Presented by Kevin Brockhoff Apache 2.0 Licensed
  • 2. Our Agenda ● Where are current observability patterns falling short? ● Who is OpenTelemetry and why should I care? ● What are some recommended OpenTelemetry deployment architectures? ● How can I use OpenTelemetry to incrementally improve telemetry collection in applications?
  • 3. Level Setting ● Have you used ELK stack or other log aggregator? ● Have you used an APM system? ● Have you used distributed tracing before? ● Have you used OpenCensus? ● Have you used OpenTracing?
  • 4. Who am I? ● Kevin Brockhoff - Senior Consultant, Daugherty Business Solutions ○ Solving difficult cloud adoption challenges for Daugherty's Fortune 500 clients ○ OpenTelemetry committer since early stages of the project ○ GitHub: https://github.com/kbrockhoff ○ LinkedIn: https://www.linkedin.com/in/kevin-brockhoff-a557877/
  • 6. Why observability? ● Microservices create complex interactions. ● Failures don't exactly repeat. ● Debugging multi-tenancy is painful. ● Monitoring can no longer help us. (Figure: Cynefin Framework, "Complex")
  • 8. Metrics Concepts ● Gauges ○ Instantaneous point-in-time value (e.g. CPU utilization) ● Cumulative counters ○ Cumulative sums of data since process start (e.g. request counts) ● Cumulative histogram ○ Grouped counters for a range of buckets (e.g. 0-10ms, 11-20ms) ● Rates ○ Typically the derivative of a counter (e.g. requests per second)
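The counter, histogram, and rate concepts on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any particular metrics backend does it. The function names, the sample layout of `(timestamp_seconds, cumulative_count)`, and the handling of counter resets are all assumptions made for the example.

```python
from bisect import bisect_left


def rate_per_second(samples):
    """Derive rates from a cumulative counter.

    samples: list of (timestamp_seconds, cumulative_count), ordered by time.
    Returns one rate per consecutive pair of samples.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = c1 - c0
        if delta < 0:
            # Assumption: a decrease means the process restarted and the
            # counter began again from zero.
            delta = c1
        rates.append(delta / (t1 - t0))
    return rates


def bucket_counts(values_ms, upper_bounds_ms):
    """Count values per histogram bucket.

    upper_bounds_ms must be sorted. A value lands in the first bucket whose
    upper bound is >= the value; the final slot collects everything larger.
    """
    counts = [0] * (len(upper_bounds_ms) + 1)
    for value in values_ms:
        counts[bisect_left(upper_bounds_ms, value)] += 1
    return counts


request_counter = [(0, 0), (10, 50), (20, 150), (30, 20)]  # reset before t=30
print(rate_per_second(request_counter))
print(bucket_counts([3, 10, 11, 25], [10, 20]))
```

A gauge needs no such processing: it is simply the latest observed value.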
  • 9. Basic Observability Metrics Methods ● USE - Utilization, Saturation, and Errors ○ Resource-scoped ● RED - Rate, Errors, and Duration ○ Request-scoped
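As a rough illustration of the request-scoped RED method, the sketch below summarizes a window of request records into rate, error ratio, and a duration percentile. The `Request` record, the `red_summary` name, and the choice of a nearest-rank 95th percentile are assumptions for the example. The slide does not prescribe a particular duration statistic.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    error: bool


def red_summary(requests, window_seconds):
    """Summarize one observation window as Rate, Errors, and Duration."""
    total = len(requests)
    if total == 0:
        return {"rate": 0.0, "error_ratio": 0.0, "p95_ms": None}
    errors = sum(1 for r in requests if r.error)
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank percentile: the smallest value with at least 95% of
    # observations at or below it.
    p95 = durations[math.ceil(0.95 * total) - 1]
    return {
        "rate": total / window_seconds,
        "error_ratio": errors / total,
        "p95_ms": p95,
    }


window = [Request(10, False), Request(20, False), Request(30, True), Request(40, False)]
print(red_summary(window, window_seconds=2))
```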
  • 10. Tracing Concepts ● Span ○ Represents a single unit of work in a system. ● Trace ○ Defined implicitly by its spans. A trace can be thought of as a directed acyclic graph of spans where the edges between spans are defined as parent/child relationships. ● Distributed Context ○ Contains the tracing identifiers, tags, and options that are propagated from parent to child spans.
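The relationship between spans, traces, and propagated context can be modeled with plain data classes. This is a simplified sketch, not the OpenTelemetry API. `SpanContext`, `Span`, `start_span`, and `children_by_parent` are hypothetical names, and the identifier sizes (16-byte trace id, 8-byte span id) follow the W3C Trace Context sizes shown later in the deck.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    """The identifiers that travel from a parent span to its children."""
    trace_id: str
    span_id: str


@dataclass
class Span:
    """A single unit of work."""
    name: str
    context: SpanContext
    parent_span_id: Optional[str] = None


def start_span(name, parent=None):
    # A child inherits the trace id from the propagated parent context;
    # a root span starts a new trace.
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_span_id = parent.span_id if parent else None
    return Span(name, SpanContext(trace_id, secrets.token_hex(8)), parent_span_id)


def children_by_parent(spans):
    """The trace is implicit: group span names by the parent span id."""
    tree = {}
    for span in spans:
        tree.setdefault(span.parent_span_id, []).append(span.name)
    return tree


root = start_span("GET /checkout")
query = start_span("SELECT orders", parent=root.context)
payment = start_span("POST /payments", parent=root.context)
print(children_by_parent([root, query, payment]))
```

Nothing stores the trace as an object: it is reconstructed from the parent/child edges, which is what "defined implicitly by its spans" means.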
  • 11. Observability 1.0 Limitations ● Data ends up in 3 different datastores. ● Different types of data not correlated with each other. ● Observability is not necessarily insight.
  • 12. Operational Complexity Growth (2010 -> 2020)
      ● Circuit Breaker: Homegrown w/ 3 configs (2010) -> Resilience4J w/ 14 configs (2020)
      ● Retries: End user clicks submit again (2010) -> Resilience4J w/ 7 configs (2020)
      ● Health Check: HTTP server and DB are live (2010) -> Kubernetes liveness, readiness, and startup probes with 5 timing configs per probe (2020)
      ● Alerts: Unread count on circuit breaker opened email folder (2010) -> ??? (2020)
  • 14. Observability 2.0 - PoC ● Deep Linking Metrics and Traces with OpenTelemetry, OpenMetrics and M3 - Rob Skillington (Presentation @ KubeCon North America 2019) ○ Click on point in metrics graph to get representative traces ○ Click on trace span to get system metrics from server that produced the span ○ Click on trace span to get all application logs emitted during span
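Jumping from a trace to the logs emitted during it only works if each log record carries the trace identifier. The sketch below shows one possible way to do that with the Python standard library alone. It is not taken from the referenced presentation, and `current_trace_id`, `TraceIdFilter`, `ListHandler`, and `logs_for_trace` are hypothetical names.

```python
import contextvars
import logging

# Holds the trace id of the work currently being executed ("-" when idle).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Stamp every log record with the active trace id."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


class ListHandler(logging.Handler):
    """Keep records in memory; a stand-in for a real log backend."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def logs_for_trace(records, trace_id):
    """Return the messages logged while the given trace was active."""
    return [r.getMessage() for r in records if r.trace_id == trace_id]


logger = logging.getLogger("otel_for_architects_demo")
logger.setLevel(logging.INFO)
handler = ListHandler()
handler.addFilter(TraceIdFilter())
logger.addHandler(handler)

token = current_trace_id.set("0af7651916cd43dd8448eb211c80319c")
logger.info("charging card")
current_trace_id.reset(token)
logger.info("idle housekeeping")

print(logs_for_trace(handler.records, "0af7651916cd43dd8448eb211c80319c"))
```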
  • 16. OpenCensus + OpenTracing = OpenTelemetry ● OpenCensus: ○ Provides APIs and instrumentation that allow you to collect application metrics and distributed tracing. ○ Provides oc-service and oc-agent middleware. ● OpenTracing: ○ Provides APIs for distributed tracing with implementations provided by tracing backend vendors. ● OpenTelemetry: ○ An effort to combine distributed tracing, metrics and logging into a single set of system components and language-specific libraries.
  • 17. OpenTelemetry Project ● Specification ○ API (for application developers) ○ SDK Implementations ○ Transport Protocol (Protobuf - gRPC) ● Collector (middleware) ● SDKs (various stages of maturity) ○ C++ ○ C# (Auto-instrument/Manual) ○ Erlang ○ Go ○ JavaScript (Browser/Node) ○ Java (Auto-instrument/Manual) ■ Android compatibility ○ PHP ○ Python (Auto-instrument/Manual) ○ Ruby ○ Rust ○ Swift
  • 18. Open Source Observability Platforms Supported
  • 20. W3C Distributed Tracing Working Group ● Trace Context – Level 1 - Recommendation ● Propagation format for distributed trace context: Baggage (rec-track) ● Trace Context: AMQP protocol (rec-track) ● Trace Context: MQTT protocol (rec-track) ● Trace Response Headers (rec-track) ● Trace Context Protocols Registry – Group Note ● Trace Context: binary protocol (rec-track) ● Trace Interchange Format (rec-track) ● Trace State Ids Registry (note)
  • 21. Trace Context HTTP Headers
      ● traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01
        ○ Fields, in order: version, trace-id (128 bit), parent-id (64 bit), trace-flags (8 bit)
      ● tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE
        ○ Vendor-specific key/value pairs
      ● Baggage: userId=sergey,serverNode=DF:28,isProduction=false
        ○ Draft Baggage header specification
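The header layout on this slide can be made concrete with a small parser. This is a sketch under simplifying assumptions: it accepts only the four-field `traceparent` layout shown above, and it treats `tracestate` and `Baggage` as plain comma-separated `key=value` lists, ignoring percent-encoding and baggage properties. `parse_traceparent` and `parse_list_header` are hypothetical helper names.

```python
import re

# version - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(value):
    """Split a traceparent header into its four fields."""
    match = TRACEPARENT_RE.match(value.strip())
    if not match:
        raise ValueError("malformed traceparent: %r" % value)
    version, trace_id, parent_id, flags = match.groups()
    # The W3C Recommendation forbids version ff and all-zero identifiers.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("invalid traceparent: %r" % value)
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # The lowest bit of trace-flags is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


def parse_list_header(value):
    """Parse a comma-separated list of key=value members, keeping order."""
    pairs = {}
    for member in value.split(","):
        member = member.strip()
        if member:
            key, _, val = member.partition("=")
            pairs[key.strip()] = val.strip()
    return pairs


print(parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01"))
print(parse_list_header("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE"))
print(parse_list_header("userId=sergey,serverNode=DF:28,isProduction=false"))
```

In practice an OpenTelemetry SDK's propagators handle these headers, so application code rarely needs to parse them by hand.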
  • 24. Kubernetes Deployment - Proof of Concept

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]
            metrics:
              receivers: [otlp, prometheus]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [jaeger]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter]
              exporters: [prometheus]
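The slide shows two Collector configurations. One way to read them, assuming the first Collector forwards to the second (the transcript does not label their roles), is as a two-tier pipeline in which the first tier's exporters must match the second tier's receivers. The sketch below models only the receivers and exporters from the slide. Processors are left out because the slide elides part of that list. The names `first_tier`, `second_tier`, `shared_protocols`, and `unlinked_signals` are hypothetical.

```python
# Receivers and exporters copied from the Proof of Concept slide.
# Assumption: the first configuration forwards to the second.
first_tier = {
    "traces": {"receivers": ["otlp"], "exporters": ["otlp"]},
    "metrics": {"receivers": ["otlp", "prometheus"], "exporters": ["otlp"]},
}
second_tier = {
    "traces": {"receivers": ["otlp"], "exporters": ["jaeger"]},
    "metrics": {"receivers": ["otlp"], "exporters": ["prometheus"]},
}


def shared_protocols(upstream, downstream, signal):
    """Protocols the upstream tier exports that the downstream tier accepts."""
    exported = set(upstream[signal]["exporters"])
    received = set(downstream[signal]["receivers"])
    return sorted(exported & received)


def unlinked_signals(upstream, downstream):
    """Signals that the upstream tier could not hand to the downstream tier."""
    return [
        signal
        for signal in upstream
        if signal not in downstream
        or not shared_protocols(upstream, downstream, signal)
    ]


for signal in first_tier:
    print(signal, "->", shared_protocols(first_tier, second_tier, signal))
print("unlinked:", unlinked_signals(first_tier, second_tier))
```

Under that reading, OTLP is the only protocol between the tiers, so swapping Jaeger or Prometheus for another backend changes only the second configuration.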
  • 25. Kubernetes Deployment - External Backends

        service:
          pipelines:
            traces:
              receivers: [otlp, zipkin]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]
            metrics:
              receivers: [otlp, prometheus]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
  • 26. Kubernetes Deployment - Service Mesh

        service:
          pipelines:
            traces:
              receivers: [zipkin]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]
            metrics:
              receivers: [statsd, prometheus]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
  • 27. Application Server on VM Deployment

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]
            metrics:
              receivers: [statsd, otlp]
              processors: [memory_limiter, resource, ...]
              exporters: [otlp]

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, batch, queued_retry]
              exporters: [commercial...]
  • 29. Greenfield Project Evolution ● Proof of Concept Demos ○ Sample App w/auto-instrumentation & direct exporters -> Jaeger & Prometheus ● Initial Development ○ Application libraries w/manual instrumentation -> In-memory and/or logging exporter ● Deployments during Development ○ Application w/SDK -> Collector (OTLP receiver) -> Cloud platform native monitoring ● Production ○ Applications w/SDK on hybrid cloud -> Collector (OTLP receiver) -> Latest and greatest enterprise-wide observability platform
  • 30. Already Instrumented Applications ● OpenCensus ○ Application -> Collector (OpenCensus receiver) -> Backend ● OpenTracing ○ Application w/OT + OpenTracing shim + SDK -> Collector (OTLP receiver) -> Backend ● Spring Boot ○ Application w/Micrometer -> Collector (Prometheus receiver) -> Backend ○ Application w/Spring Cloud Sleuth -> Collector (Zipkin receiver) -> Backend ● AWS ○ Application w/X-Ray SDK -> Collector (X-Ray receiver) -> Backend(s)
  • 31. Non-instrumented Applications ● Java ○ Launch with OpenTelemetry Java Agent (support for 61 widely-used frameworks and libraries) ● JavaScript/TypeScript ○ Add handlers/wrappers at key places or Node auto-instrumentation ● Microservice in any language ○ Deploy Envoy proxy as sidecar ● Infrastructure ○ Move to public cloud. AWS, Azure, and GCP are all incorporating the OpenTelemetry Collector in their infrastructure

Editor's Notes

  1. Copyright 2020, The OpenTelemetry Authors Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.