chronosphere.io
MTTS - Sleep more, slog less with automated cloud native o11y platforms
Eric D. Schabell
Director Evangelism, Chronosphere
@ericschabell{.bsky.social | @fosstodon.org}
Rundeck by PagerDuty Meetup
12 Nov, Salt Lake City
Open Source O11y with the CNCF (metrics)
Prometheus for metrics, alerting, queries
[Diagram: instrumented services 1, 2, and 3 each embed a clientlib and are scraped by Prometheus + TSDB; Service Discovery supplies the targets; Dashboards (visualization) and PromLens (querying) read from Prometheus; Alert Manager receives alerts and notifies Email, PagerDuty, Slack, etc…]
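The clientlib and scrape steps in this diagram can be sketched in a few lines. This is a minimal, standard-library-only illustration, assuming a hypothetical `Counter` class that stands in for a real Prometheus client library (it is not the `prometheus_client` API). It shows the text exposition format a scrape of an instrumented service returns; the metric name and labels are made up, and label-value escaping is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Counter:
    """Hypothetical stand-in for a client library counter (not a real client API)."""
    name: str
    help_text: str
    values: dict = field(default_factory=dict)  # label tuple -> running total

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Each distinct label combination is its own time series.
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self) -> str:
        """Render the counter in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            selector = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{selector} {value}")
        return "\n".join(lines) + "\n"


# Illustrative metric; the name and labels are assumptions for this sketch.
requests_total = Counter("http_requests_total", "Total HTTP requests handled.")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="POST", code="500")

# Roughly what a scrape of this service's metrics endpoint would return.
print(requests_total.render(), end="")
```

In a real service the client library keeps these values in memory and serves them over HTTP, and Prometheus pulls them on its scrape interval and stores the samples in the TSDB.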
What’s happening (for the techies)?
[Diagram labels: Prometheus; scrape targets; TSDB (ingest); PromQL Engine (read/query, read recording rule results); Dashboards (visualization) and PromLens (read/query); Alert Manager (alerts)]
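The alerts path in this picture can be illustrated with a small state machine. This is a simplified sketch, not Prometheus code: the `AlertRule` class, its field names, and the sample values are all hypothetical. It assumes a threshold condition that must hold for a set duration before the alert fires, which is the role the `for` clause plays in a Prometheus alerting rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    """Hypothetical, simplified model of a threshold alert with a hold duration."""
    name: str
    threshold: float
    hold_seconds: float  # plays the role of the `for` clause in an alerting rule
    active_since: Optional[float] = None

    def evaluate(self, now: float, value: float) -> str:
        """Return 'inactive', 'pending', or 'firing' for one evaluation cycle."""
        if value <= self.threshold:
            self.active_since = None  # condition cleared, reset the timer
            return "inactive"
        if self.active_since is None:
            self.active_since = now  # condition just became true
        if now - self.active_since >= self.hold_seconds:
            return "firing"
        return "pending"


# Illustrative rule and samples: (timestamp in seconds, observed error ratio).
rule = AlertRule(name="HighErrorRate", threshold=0.05, hold_seconds=120)
samples = [(0, 0.01), (60, 0.09), (120, 0.08), (180, 0.07), (240, 0.02)]
states = [rule.evaluate(t, v) for t, v in samples]
print(states)
```

Only alerts that reach the firing state are handed to Alert Manager, which then takes care of notification; the hold duration is what keeps a brief spike from paging anyone.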
Need to trace your service calls?
OpenTelemetry (Auto) instrumentation
[Diagram: Applications (Java) with OTel Auto Instrumentation (libraries), the OTel API, and the OTel SDK send telemetry over OTLP to the OTel Collector]
OpenTelemetry Collector (agent)
[Diagram: on a Host, Applications with OTel Auto Instrumentation, the OTel API, and the OTel SDK send telemetry over OTLP to an OTel Collector Agent, which sends it on to the Observability Backend (Prometheus, Jaeger, Fluent Bit, etc.)]
Collecting telemetry data with a pipeline?
Telemetry pipelines
[Diagram: Input -> Parser -> Filter -> Buffer -> Routing -> Output 1, Output 2, ... Output N]
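The pipeline phases shown here (Input, Parser, Filter, Buffer, Routing, Output) can be sketched as plain functions. This is an illustrative toy under stated assumptions, not Fluent Bit configuration or any real pipeline API: the function names, the JSON input format, the `level` field, and the `archive` and `alerts` output names are all hypothetical.

```python
import json
from collections import deque
from typing import Optional

Record = dict


def parse(raw: str) -> Record:
    """Parser: turn a raw JSON line into a structured record."""
    return json.loads(raw)


def drop_debug(record: Record) -> Optional[Record]:
    """Filter: discard debug-level records, pass everything else through."""
    return None if record.get("level") == "debug" else record


def route(record: Record) -> list:
    """Routing: choose which named outputs receive a record."""
    outputs = ["archive"]
    if record.get("level") == "error":
        outputs.append("alerts")
    return outputs


def run_pipeline(raw_lines, outputs) -> None:
    buffer = deque()
    for raw in raw_lines:                # Input
        record = drop_debug(parse(raw))  # Parser, then Filter
        if record is not None:
            buffer.append(record)        # Buffer
    while buffer:
        record = buffer.popleft()
        for name in route(record):       # Routing
            outputs[name].append(record) # Output 1 .. Output N


# Illustrative input lines and in-memory outputs.
lines = [
    '{"level": "info", "msg": "started"}',
    '{"level": "debug", "msg": "noise"}',
    '{"level": "error", "msg": "boom"}',
]
sinks = {"archive": [], "alerts": []}
run_pipeline(lines, sinks)
print({name: [r["msg"] for r in records] for name, records in sinks.items()})
```

Filtering before buffering is the design point worth noticing: dropped records never consume buffer space or reach any output, which is where a pipeline reduces data volume.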
What to do when you scale?
At cloud native scale…
Collector (gateway)
[Diagram: on multiple Hosts, Applications with OTel Auto Instrumentation, the OTel API, and the OTel SDK send telemetry over OTLP to an OTel Collector Agent; the agents send OTLP to an OTel Collector Gateway, which sends it on to the Observability Backend (Prometheus, Jaeger, Fluent Bit, etc.)]
Chronosphere products
Control costs and improve productivity
Observability Platform
DATA COLLECTION CONTROL PLANE STORE LENS
Telemetry Pipeline
Reduce
Enrich
Secure
TRANSFORM AND ROUTE DATA IN YOUR ENVIRONMENT
STORE DATA IN THIRD PARTY LOG & SIEM SOLUTIONS
https://o11y-workshops.gitlab.io
Thank You
Eric D. Schabell
Director Evangelism, Chronosphere
@ericschabell{@fosstodon.org}

MTTS - Sleep more, slog less with automated cloud native o11y platforms

Editor's Notes

  • #1 Join us for a journey through the CNCF landscape, starting from open source cloud native o11y beginnings. We have all seen the struggle to maintain our sleep patterns when carrying the pager. Sleep more and slog less is about how you ensure your tools and platforms are working for you, not the other way around. Rundeck from PagerDuty and keeping control of your o11y data with Chronosphere give you all the MTTS you ever dreamed of!
  • #2 Let’s start with the open source ecosystem in the Cloud Native Computing Foundation (CNCF) and, to narrow the scope of this talk, begin with metrics.
  • #3-#14 Widely adopted and accepted standards for metrics can be found in the Prometheus project, including time-series storage, communication protocols to scrape (pull) data from targets, and PromQL, the query language used to query and visualize the data.
  • #15 Once you’ve scaled up and become very successful, you’ll notice that you spend a lot of time managing your o11y infrastructure, that telemetry data is growing out of control, that costs are rising, and that you are losing time better spent on supporting product development for your customers. What can you do? Leave the o11y at scale to Chronosphere!
  • #16 Instrumentation Libraries: OpenTelemetry supports a broad number of components that generate relevant telemetry data from popular libraries and frameworks for supported languages. For example, inbound and outbound HTTP requests from an HTTP library will generate data about those requests. It is a long-term goal that popular libraries are authored to be observable out of the box, such that pulling in a separate component is not required. For more information, see Instrumenting Libraries. Automatic Instrumentation: If applicable, a language-specific implementation of OpenTelemetry will provide a way to instrument your application without touching your source code. While the underlying mechanism depends on the language, at a minimum this will add the OpenTelemetry API and SDK capabilities to your application. Additionally, it may add a set of Instrumentation Libraries and exporter dependencies. For more information, see Instrumenting.
  • #17 The OpenTelemetry Collector is a vendor-agnostic proxy that can receive, process, and export telemetry data. It supports receiving telemetry data in multiple formats (for example, OTLP, Jaeger, Prometheus, as well as many commercial/proprietary tools) and sending data to one or more backends. It also supports processing and filtering telemetry data before it gets exported. Collector contrib packages bring support for more data formats and vendor backends. Agent: A Collector instance running with the application or on the same host as the application (e.g. binary, sidecar, or daemonset). For more information, see Collector.
  • #18 Once you’ve scaled up and become very successful, you’ll notice that you spend a lot of time managing your o11y infrastructure, that telemetry data is growing out of control, that costs are rising, and that you are losing time better spent on supporting product development for your customers. What can you do? Leave the o11y at scale to Chronosphere!
  • #19 This is the telemetry pipeline overview: all of the phases that data goes through, which you will learn in this workshop.
  • #20-#25 Fluent Bit Telemetry Pipeline
  • #26 Once you’ve scaled up and become very successful, you’ll notice that you spend a lot of time managing your o11y infrastructure, that telemetry data is growing out of control, that costs are rising, and that you are losing time better spent on supporting product development for your customers. What can you do? Leave the o11y at scale to Chronosphere!
  • #28 The OpenTelemetry Collector is a vendor-agnostic proxy that can receive, process, and export telemetry data. It supports receiving telemetry data in multiple formats (for example, OTLP, Jaeger, Prometheus, as well as many commercial/proprietary tools) and sending data to one or more backends. It also supports processing and filtering telemetry data before it gets exported. Collector contrib packages bring support for more data formats and vendor backends. Gateway: One or more Collector instances running as a standalone service (e.g. container or deployment) typically per cluster, data center or region. For more information, see Collector.
  • #29 Chronosphere offers two products: observability platform and telemetry pipeline. We are leaders in reliability and scalability. Our products help you control data volumes and let developers solve problems faster.
  • #30 Explore the collection of o11y workshops here: https://o11y-workshops.gitlab.io
  • #31 Join us for a journey through the CNCF landscape, starting from open source cloud native o11y beginnings. We have all seen the struggle to maintain our sleep patterns when carrying the pager. Sleep more and slog less is about how you ensure your tools and platforms are working for you, not the other way around. Rundeck from PagerDuty and keeping control of your o11y data with Chronosphere give you all the MTTS you ever dreamed of!