1
The State of
OpenTelemetry
Dotan Horovits
@horovits
2
How many tools do
companies use to
collect telemetry data?
3
4
Dotan Horovits
@horovits
• Developer Advocate at Logz.io
• 20 years in the hi-tech industry
• Developer, architect, product
• CNCF advocate, meetup organizer
• OpenObservability Talks podcast
5
The vision: unified observability
• Metrics: detect
• Logs: diagnose
• Traces: isolate & improve
6
The reality: fragmented instrumentation
Per tool and vendor: API, SDK, agent, collector, protocol, backend
7
OpenTelemetry (a.k.a. OTel)
“OpenTelemetry is an observability framework –
software and tools that assist in generating and
capturing telemetry data from cloud-native software.”
Across Traces, Metrics, Logs
OpenTracing + OpenCensus = OpenTelemetry
8
Second most active CNCF project
OpenTelemetry 2 66 30 3 22K
Source: CNCF Dev Stats
9
OpenTelemetry
A unified set of vendor-agnostic APIs, SDKs and tools
for generating and collecting telemetry data, and
then exporting it to a variety of analysis tools.
Generate → Emit (APPLICATION) → OTLP → Collect → Process → Export (COLLECTOR) → OTLP
10
OpenTelemetry
Cross-language requirements for all
OpenTelemetry implementations
API specification | SDK specification | Data specification
For traces, metrics and logs
11
OpenTelemetry Client Libraries
12
OpenTelemetry Collector
• Receivers: Jaeger, Zipkin, through Receiver N
• Processors: Processor 0 through Processor N
• Fan out
• Exporters: Prometheus, Logz.io, through Exporter N
13
OpenTelemetry Protocol (OTLP)
Generate → Emit (APPLICATION) → OTLP → Collect → Process → Export → OTLP
Transport: gRPC and HTTP/1.1 | Encoding: Protobuf | Telemetry data model
Is OpenTelemetry GA?
YES
NO
IT DEPENDS
Status for...?
• Signals: Tracing, Metrics, Logging
• Components: API, SDK, Collector, Protocol
• Collector: Receivers, Exporters
• Languages: Java, Go, .NET, C++
State of the signals
Observability: Metrics, Logs, Traces
State of the signals
Logs: Experimental (some specs still in draft)
• Protocol and Collector are experimental
• API and SDK are in draft stage
• Focusing first on integration with existing logging systems
• Log appenders are under development in many languages
Metrics: Expecting stability by end of 2021
• Protocol and API are stable
• SDK is in feature freeze (soon stable)
• Prototyped in Java, .NET, and Python
• Collector is still experimental, incl. Prometheus support
Traces: Stable (i.e. GA)
• API, SDK, Protocol, and Collector are stable
• Client libraries >v1.0 for Java, Go, .NET, Python, C++, JavaScript
• Working on Ruby, PHP, Erlang
• Long-term support, backwards compatibility, and dependency isolation
How do I get started with OpenTelemetry?
Know your stack
• Which protocols?
• Which analytics tools?
• Which languages?
• Which signals?
Then check status for each and follow respective guides
https://opentelemetry.io/status/
20
Google “OpenTelemetry Guide”, or go to:
https://logz.io/learn/opentelemetry-guide/
21
Thank you!
Dotan Horovits,
@horovits

THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io

Editor's Notes

  • #3 How many tools does a company use (on average) to collect telemetry data from its systems? Logs, metrics, traces? App, infra? Organizations use 5-10 different tools to collect telemetry from their systems (think Fluentd, Filebeat, Metricbeat, Datadog, New Relic, Prometheus, StatsD, collectd…). You can reduce that to one standard, unified platform. Here’s the story of OTel.
  • #5 Logz.io provides a cloud-native observability platform based on popular open source (Elasticsearch, Prometheus, Jaeger, OpenTelemetry…). WE’RE RECRUITING. Pay us a visit. I am an advocate of open source software, open standards and communities. I organize the local CNCF chapter in Tel Aviv, with a monthly meetup. I run a podcast, OpenObservability Talks (OOTalks), one of the finalists for Best DevOps Podcast Series in the 2021 DevOps Dozen² Awards!
  • #6 Observability: the ability to understand the state of our system based on the telemetry data it emits. The vision: unified observability across different signal types (logs/metrics/traces) and across different sources (frontend code, backend code, open source tools, cloud services...).
  • #7 The reality is much more fragmented: we use many tools for our observability. Each tool and each vendor has proprietary APIs and SDKs for instrumenting (Datadog, Splunk, Zipkin, New Relic, Jaeger... client libraries), then proprietary agents, daemons and collectors to collect the data and run aggregations, sampling and other processing, and then a protocol and data model to transmit the telemetry to the analytics backend. This is not only an operational headache and a vendor lock-in issue: it creates tight coupling between telemetry collection and the telemetry storage and analysis backend, and it makes it very difficult to correlate data and gain unified observability across these data silos. That’s what OpenTelemetry comes to solve.
  • #8 A.k.a. OTel. It spans all the observability pillars: traces, metrics and logs. One framework to rule them all. It’s an incubating project under the CNCF, a merge of OpenTracing and OpenCensus. OTel has been adopted by all the major vendors, monitoring tools and cloud providers (AWS, Azure).
  • #9 OpenTelemetry is the second most active CNCF project, behind Kubernetes. Source: CNCF DevStats https://all.devstats.cncf.io/d/1/activity-repository-groups?orgId=1
  • #10 Dive deeper into what OTel provides us: OpenTelemetry provides the libraries, agents, and other components that you need to capture telemetry from your services. Specifically, it captures metrics, distributed traces, resource metadata, and logs (logging support is incubating now) from your backend and client applications, and sends this data to backends like Prometheus, Jaeger, Zipkin, and others for processing. The OpenTelemetry specification describes the cross-language requirements and expectations for all OpenTelemetry implementations. It includes the API spec, SDK spec and Data spec (e.g. semantic conventions). https://github.com/open-telemetry/opentelemetry-specification
  • #11 The specification is not a component but rather governs all the components. It describes the cross-language requirements and expectations for all OpenTelemetry implementations, and includes the API spec, SDK spec and Data spec (e.g. semantic conventions) for traces, metrics and logs. This comes to solve the pain of fragmentation, where each vendor, each programming language and each signal has its own convention. A unified data spec and semantic conventions will also enable correlation across signals and sources. https://github.com/open-telemetry/opentelemetry-specification
  • #12 For instrumenting your app, the client libraries (based on a unified specification) provide: (1) one API and SDK per language, which include the interfaces and implementations that define and create distributed traces and metrics, manage sampling and context propagation, etc.; (2) language-specific integrations for popular web frameworks, storage clients, RPC libraries, etc. that (when enabled) automatically capture relevant traces and metrics and handle context propagation; (3) automatic instrumentation agents that can collect telemetry from some applications without requiring code changes; (4) language-specific exporters that allow SDKs to send captured traces and metrics to any supported backend.
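To make "context propagation" concrete, here is a minimal sketch in plain Python, not the OpenTelemetry SDK. It assumes the W3C Trace Context `traceparent` header layout (`00-<trace-id>-<span-id>-<flags>`), which the talk does not name; `SpanContext`, `inject`, `extract` and `start_child` are hypothetical helper names used only for illustration, and the validation is deliberately simplified.

```python
import re
import secrets
from typing import Dict, NamedTuple, Optional


class SpanContext(NamedTuple):
    trace_id: str  # 32 lowercase hex characters, shared by the whole trace
    span_id: str   # 16 lowercase hex characters, unique per span
    sampled: bool


# Assumed header layout: version "00", trace id, parent span id, flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Write the caller's span context into outgoing request headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Read a span context from incoming headers; None if absent or malformed."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 1))


def start_child(parent: Optional[SpanContext]) -> SpanContext:
    """Start a span: reuse the parent's trace id, or begin a new trace."""
    if parent is None:
        return SpanContext(secrets.token_hex(16), secrets.token_hex(8), True)
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)


# Client side: start a trace and attach it to the outgoing request.
client_span = start_child(None)
request_headers: Dict[str, str] = {}
inject(client_span, request_headers)

# Server side: continue the same trace from the incoming headers.
server_span = start_child(extract(request_headers))
print(server_span.trace_id == client_span.trace_id)
```

Because both services carry the same trace id, a backend can stitch their spans into one distributed trace; the real client libraries and integrations do this bookkeeping for you.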
  • #13 The OpenTelemetry Collector can collect data from OpenTelemetry SDKs and other sources, and then export this telemetry to any supported backend. It is built like a data processing pipeline: receivers in multiple protocols, then processing and aggregation, then exporters in multiple protocols. It can collect telemetry from our app (backend/frontend) or from other infra components (k8s, Docker, Kafka, MySQL, Redis, httpd, AWS X-Ray, GCP Pub/Sub, collectd…). Processors can do things like filter, modify, batch, sample, etc. Exporters exist for AWS X-Ray, Azure Monitor, Google, Datadog, Dynatrace, Splunk, Sumo Logic, Logz.io, Prometheus, Jaeger, Zipkin, Kafka…
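The receiver, processor, exporter flow described above can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not the Collector's actual implementation or configuration: `Pipeline`, `drop_health_checks`, `add_environment` and the two in-memory "backends" are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Processor = Callable[[Record], Optional[Record]]  # return None to drop a record
Exporter = Callable[[List[Record]], None]


@dataclass
class Pipeline:
    processors: List[Processor] = field(default_factory=list)
    exporters: List[Exporter] = field(default_factory=list)

    def receive(self, batch: List[Record]) -> None:
        """Run each record through the processors in order, then fan out."""
        processed: List[Record] = []
        for record in batch:
            current: Optional[Record] = record
            for process in self.processors:
                current = process(current)
                if current is None:  # a processor filtered this record out
                    break
            if current is not None:
                processed.append(current)
        for export in self.exporters:
            export(list(processed))  # every exporter gets the full batch


def drop_health_checks(record: Record) -> Optional[Record]:
    """Filtering processor: discard noisy health-check spans."""
    return None if record.get("name") == "GET /health" else record


def add_environment(record: Record) -> Optional[Record]:
    """Modifying processor: enrich each record with an attribute."""
    return {**record, "env": "staging"}


# Two in-memory lists stand in for two different backends.
tracing_backend: List[Record] = []
second_backend: List[Record] = []

pipeline = Pipeline(
    processors=[drop_health_checks, add_environment],
    exporters=[tracing_backend.extend, second_backend.extend],
)
pipeline.receive([{"name": "GET /health"}, {"name": "GET /orders"}])
print(tracing_backend)
```

The point of the design is the decoupling: the application emits once, and adding or swapping a backend only changes the exporter list, not the instrumentation.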
  • #14 OTLP is a general-purpose telemetry data delivery protocol between telemetry sources, intermediate nodes such as collectors, and telemetry backends. It defines the encoding of telemetry data and the protocol used to exchange data between the client and the server. It’s a request/response style protocol for client-server communication, and it includes the data model. OTLP is implemented over gRPC and HTTP/1.1 transports. It currently supports binary Protobuf encoding of the payload, with JSON encoding support to be added later. OTLP provides wire-level compatibility for the binary Protobuf serialization: you can get the .proto files and generate raw gRPC client libraries from them yourself. NOTE: the OTel Collector is NOT limited to OTLP; as said, it has receivers and exporters for many protocols. Still, OTel as a project strives to provide a unified protocol as part of the holistic framework, and to enable correlation across the telemetry.
  • #15 OpenTelemetry is an aggregate of multiple groups, each working on a different component of this huge endeavor: different groups handle the specification for the different telemetry signals (tracing, logging and metrics), and other groups focus on the programming-language-specific clients, to name a few. Each group has its own release cadence, which means that different components of OpenTelemetry may be in different stages of the maturity lifecycle: Draft → Experimental → Stable → Deprecated. Stable is the equivalent of GA (generally available), which is what you’d be seeking in order to run it in a production environment; Stable is covered by long-term support. Experimental is a beta stage, which should enable testing the technology in evaluations and PoCs towards integration. When evaluating OpenTelemetry for your project, you should map the status of the relevant components for your system: the standard for the signal type of interest (traces/metrics/logs); the protocol for the signal type of interest; the client library for the programming language(s) you use; and potentially also agents for instrumenting the programming frameworks you use in your code.
  • #16 Same talking points as #15, with one addition on long-term support: Stable is covered by long-term support (e.g. all instrumentation written against the tracing API will be compatible with future minor versions, and supported for a minimum of three years after the next major version of the OpenTelemetry API).
  • #17 OpenTelemetry clients are versioned to v1.0 once their tracing implementation is complete. Metrics: The data model is stable and released as part of the OTLP protocol. Experimental support for metric pipelines is available in the Collector. Collector support for Prometheus is under development, in collaboration with the Prometheus community. The metric API and SDK specification is currently being prototyped in Java, .NET, and Python. Status: API: feature-freeze; SDK: experimental; Protocol: stable; Collector: experimental. Logging: The data model is experimental and released as part of the OTLP protocol. Log processing for many data formats has been added to the Collector, thanks to the donation of Stanza to the OpenTelemetry project. Log appenders are currently under development in many languages. Log appenders allow OpenTelemetry tracing data, such as trace and span IDs, to be appended to existing logging systems. An OpenTelemetry logging SDK is currently under development. This allows OpenTelemetry clients to ingest logging data from existing logging systems, outputting logs as part of OTLP along with tracing and metrics. An OpenTelemetry logging API is not currently under development. We are focusing first on integration with existing logging systems. When metrics is complete, focus will shift to development of an OpenTelemetry logging API. Status: API: draft; SDK: draft; Protocol: experimental; Collector: experimental.
  • #18 Traces: The tracing API, SDK and Protocol specifications are stable, and the Collector is stable. OpenTelemetry clients are versioned to v1.0 once their tracing implementation is complete. Metrics: Protocol: stable. The data model is stable and released as part of the OTLP protocol. API: stable; SDK: feature-freeze. The metric API and SDK specification is currently being prototyped in Java, .NET, and Python. Collector: experimental. Collector support for Prometheus is under development, in collaboration with the Prometheus community. Experimental support for metric pipelines is available in the Collector. Logging: Protocol: experimental. The data model is experimental and released as part of the OTLP protocol. Collector: experimental. Log processing for many data formats has been added to the Collector, thanks to the donation of Stanza to the OpenTelemetry project. API and SDK: both still in draft stage. Log appenders are currently under development in many languages. Log appenders allow OpenTelemetry tracing data, such as trace and span IDs, to be appended to existing logging systems. An OpenTelemetry logging SDK is currently under development. This allows OpenTelemetry clients to ingest logging data from existing logging systems, outputting logs as part of OTLP along with tracing and metrics. An OpenTelemetry logging API is not currently under development. We are focusing first on integration with existing logging systems. When metrics is complete, focus will shift to development of an OpenTelemetry logging API.
  • #20 When evaluating OpenTelemetry for your project, you should map the status of the relevant components for your system: the client library for the programming language(s) you use, and potentially also agents for instrumenting the programming frameworks you use in your code (we use Node.js with hapi/Express, or Java with Spring); the signal type of interest (traces/metrics/logs); the protocol for the signal type of interest (especially if brownfield); and the backend tool.
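The "map the status of each component" exercise can be sketched as a small lookup. This is a hedged illustration only: the `STATUS` table is a snapshot transcribed from the "State of the signals" slide in this talk and will be out of date, so check https://opentelemetry.io/status/ for real values. `Maturity`, `STATUS` and `blockers` are hypothetical names, and treating only Stable as production-ready follows the talk's "Stable is the equivalent of GA" framing.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Maturity(Enum):
    DRAFT = "draft"
    EXPERIMENTAL = "experimental"
    FEATURE_FREEZE = "feature-freeze"  # status label used on the slide for the metrics SDK
    STABLE = "stable"                  # the talk equates this with GA
    DEPRECATED = "deprecated"


COMPONENTS = ("api", "sdk", "protocol", "collector")

# Snapshot taken from the slide; NOT current data.
STATUS: Dict[str, Dict[str, Maturity]] = {
    "traces": dict.fromkeys(COMPONENTS, Maturity.STABLE),
    "metrics": {
        "api": Maturity.STABLE,
        "sdk": Maturity.FEATURE_FREEZE,
        "protocol": Maturity.STABLE,
        "collector": Maturity.EXPERIMENTAL,
    },
    "logs": {
        "api": Maturity.DRAFT,
        "sdk": Maturity.DRAFT,
        "protocol": Maturity.EXPERIMENTAL,
        "collector": Maturity.EXPERIMENTAL,
    },
}


def blockers(signals: Iterable[str]) -> List[Tuple[str, str, Maturity]]:
    """List every (signal, component, maturity) that is not yet Stable."""
    return [
        (signal, component, STATUS[signal][component])
        for signal in signals
        for component in COMPONENTS
        if STATUS[signal][component] is not Maturity.STABLE
    ]


# Example: a team that wants traces and metrics in production.
for signal, component, maturity in blockers(["traces", "metrics"]):
    print(f"{signal}/{component} is {maturity.value}")
```

A fuller version of this check would add a per-language dimension (client library version) and the backend's supported protocols, mirroring the four "which?" questions on the slide.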
  • #22 Get involved in the open source project. Send feedback on the guide. Reach out to me: @horovits