Are you collecting every metric under the sun, and the kitchen sink too? Understanding the cost of collecting metrics and the usefulness of those metrics is the only way to scale in a cloud native world. You can’t get away with collecting everything as you grow. Your observability teams need to decide what to collect, what to drop, and what to aggregate, while still being able to alert, triage, remediate, and do root cause analysis on a daily basis. Gain immediate insights into high-cost data (measured in data points per second, DPPS), when to drop time series data, and how to determine when the value of that data is at its lowest. The session includes a recorded demo video of this in action.
4. chronosphere.io
“It’s remarkable how common this situation is,
where an organization is paying more for their
observability data, than they do for their
production infrastructure.”
9. chronosphere.io
Dedicated FinOps
“By 2023, 80% of organizations
using cloud services will
establish a dedicated FinOps
function to automate policy-driven
observability and
optimization of cloud resources
to maximize value.”
-- Source: IDC 2022
11. Centralized Governance - It Starts Here
Centralized Governance
Give teams ownership and control of their metrics to control
cardinality and growth
Quotas - Allocate portions of the licensed persisted write
capacity amongst teams and services
Priorities - Prioritize which data is impacted if over
capacity
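A minimal sketch of how quotas and priorities could interact, not Chronosphere's implementation. It assumes each team has a fixed quota in data points per second and a priority number, and that when total demand exceeds licensed capacity, only the traffic above a team's quota is shed, starting with the lowest-priority team. The names `TeamQuota`, `quota_dpps`, `priority`, and `admit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamQuota:
    team: str
    quota_dpps: int  # hypothetical: this team's slice of licensed write capacity
    priority: int    # hypothetical: higher number = shed first when over capacity


def admit(capacity_dpps, quotas, demand):
    """Return the DPPS admitted per team for the given demand.

    Assumes the quotas sum to capacity_dpps. Under capacity, everything is
    admitted. Over capacity, only traffic above a team's quota is shed,
    lowest-priority team first.
    """
    admitted = dict(demand)
    excess = sum(demand.values()) - capacity_dpps
    if excess <= 0:
        return admitted
    for q in sorted(quotas, key=lambda q: q.priority, reverse=True):
        over = max(0, admitted.get(q.team, 0) - q.quota_dpps)
        if over == 0:
            continue
        cut = min(over, excess)
        admitted[q.team] -= cut
        excess -= cut
        if excess == 0:
            break
    return admitted


quotas = [
    TeamQuota("checkout", 500, priority=1),
    TeamQuota("search", 300, priority=2),
    TeamQuota("batch", 200, priority=3),
]
demand = {"checkout": 400, "search": 500, "batch": 300}
result = admit(1000, quotas, demand)
```

In this example the total demand is 1,200 DPPS against a capacity of 1,000, so `batch` is trimmed back to its quota first, and `search` gives up the remainder. `checkout` is under its quota and is untouched.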
12. Analyze Data
Analyze
Understand the value of the observability data to identify what is
useful and what is waste
Metrics Traffic Analyzer - Provides a real-time view of incoming
metrics grouped by label, and their relative frequency
Metrics Usage Analyzer - View all metrics in Chronosphere
ranked from least used to most used to understand the value
each metric delivers
Trace Analyzer - Provides a real-time view of incoming traces
grouped by tag and their relative frequency
13. The Metrics Traffic Analyzer helps
users:
● Understand metrics traffic
patterns and scale
● Break down biggest and
smallest contributors to traffic
scale (by metric name, label,
application, etc)
● Troubleshoot cardinality
spikes
Metrics Traffic Analyzer
14. Real-time view of incoming metrics
View Live or Pause to
investigate specific metrics
and their labels
View traffic before
it is stored to help
make decisions
about traffic shape
before you pay for
it
16. Troubleshoot high cardinality metrics & labels
Metrics with
‘instance’ label
‘instance’ label is
on 100% of metrics
and has 62 unique
values
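The kind of figure shown in the callout (a label present on some percentage of metrics, with some number of unique values) can be sketched as follows. This is an illustration only, assuming each incoming time series is available as a plain dictionary of label names to values; `label_stats` is a hypothetical helper, not a Chronosphere API.

```python
from collections import defaultdict


def label_stats(series):
    """Compute, per label name, its coverage and its number of unique values.

    series: list of dicts mapping label name -> label value, one per time series.
    """
    values = defaultdict(set)
    count = defaultdict(int)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
            count[name] += 1
    total = len(series)
    return {
        name: {
            "coverage_pct": 100.0 * count[name] / total,
            "unique_values": len(values[name]),
        }
        for name in values
    }


incoming = [
    {"__name__": "http_requests_total", "instance": "10.0.0.1", "pod": "api-1"},
    {"__name__": "http_requests_total", "instance": "10.0.0.2", "pod": "api-2"},
    {"__name__": "cpu_seconds_total", "instance": "10.0.0.1"},
    {"__name__": "cpu_seconds_total", "instance": "10.0.0.3"},
]
stats = label_stats(incoming)
```

A label with high coverage and many unique values, like `instance` here, multiplies the number of stored series across nearly every metric, which makes it a natural first candidate when troubleshooting a cardinality spike.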
17. Metrics Usage Analyzer
The Metrics Usage Analyzer
allows users to:
● Understand the value
each metric delivers
● Identify unused and
underutilized metrics
● Know if a metric is being
used, where, and by
whom
● Help make better
shaping decisions
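As a rough illustration of ranking metrics from least used to most used, the sketch below scores each metric by how often it is referenced. The scoring is an assumption made for this example (an unweighted sum of dashboard references, alert references, and recent queries); the actual scoring used by the Metrics Usage Analyzer is not described in these slides. `MetricUsage` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricUsage:
    name: str
    dashboards: int   # hypothetical: dashboards referencing the metric
    alerts: int       # hypothetical: alerts referencing the metric
    queries_30d: int  # hypothetical: ad hoc queries in the last 30 days
    dpps: int         # data points per second the metric costs


def utility(m):
    # Assumption: an unweighted count of references stands in for "value".
    return m.dashboards + m.alerts + m.queries_30d


def rank_least_valuable(metrics):
    """Sort least used first; among equally used metrics, most expensive first."""
    return sorted(metrics, key=lambda m: (utility(m), -m.dpps))


metrics = [
    MetricUsage("http_requests_total", dashboards=4, alerts=2, queries_30d=120, dpps=900),
    MetricUsage("debug_cache_probe", dashboards=0, alerts=0, queries_30d=0, dpps=5000),
    MetricUsage("legacy_queue_depth", dashboards=0, alerts=0, queries_30d=0, dpps=300),
    MetricUsage("gc_pause_seconds", dashboards=1, alerts=0, queries_30d=3, dpps=700),
]
ranked = [m.name for m in rank_least_valuable(metrics)]
```

Sorting unused metrics by descending cost puts the best candidates for dropping or aggregating at the top of the list.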
18. What is and is not valuable?
Default sort is Least Valuable
Click for more Usage Details
24. Refine - Shape and Transform Data
Refine
Once you understand the cost and value of your data, you
can take action without touching source code or
redeploying.
We do this by allowing you to aggregate or
downsample data, remove high cardinality labels, or
drop non-valuable data. This is done in real time at
ingest (streaming), meaning no delay in alerts or
need to store raw data.
The result is reduced cost & improved performance
without alert or query impact.
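One of the shaping actions described above, removing a high cardinality label by aggregating across it, can be sketched like this. It is a simplified illustration that assumes samples arrive as (labels, value) pairs and that summing is the right aggregation for the metric in question (it usually is for request counts, but not for every metric type). `drop_label_and_sum` is a hypothetical helper, not the product's rule syntax.

```python
from collections import defaultdict


def drop_label_and_sum(samples, label):
    """Aggregate samples by every label except `label`, summing the values.

    samples: iterable of (labels_dict, value) pairs from one scrape interval.
    Returns a dict keyed by the sorted remaining (name, value) label pairs.
    """
    out = defaultdict(float)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        out[key] += value
    return dict(out)


samples = [
    ({"service": "api", "instance": "a"}, 3.0),
    ({"service": "api", "instance": "b"}, 4.0),
    ({"service": "web", "instance": "a"}, 5.0),
]
aggregated = drop_label_and_sum(samples, "instance")
```

Three input series collapse to two stored series here. With dozens of instances per service the reduction is much larger, and because the aggregation happens as the data streams in, only the aggregated series needs to be persisted.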
25. Operate
The Control Plane has built-in capabilities to ensure queries
perform optimally and require no user intervention, while reducing
idle time and improving engineer productivity
Query Accelerator - Automatically ensures every possible
dashboard is fast and performant – no manual optimizations
needed.
Query Scheduler - Automatically ensures that query resources are
fairly shared so that no single user or group of users can crowd out others.
Shaping Rules UI - Understand current shaping rules
configuration and value. Preview new policies before they are
implemented.
Operate - Continuously Adjust for Efficiency
26. Why Optimize Observability Spend
The need is real
● According to a study by ESG, 69% of companies are concerned
with the rate of their observability data growth
● When able to control and optimize their data, companies are:
○ Expanding visibility and coverage
○ Increasing instrumentation of customer
experience to improve business
outcomes
○ Freeing up observability team time to
tackle strategic projects
28. "With Chronosphere, we were able to
not only significantly improve
reliability and performance of our
observability solution, but we've also
saved millions of dollars a year. With
the Chronosphere Control Plane, we're
reducing our observability data
volumes by more than 80%."
Yash Kumaraswamy, Senior Staff Engineer, Robinhood
29. chronosphere.io
Learn More
Resources
● Introducing: The Observability Data Optimization Cycle
● Metrics Usage Analyzer: Understand the value of each metric in your system
● How cloud native workloads affect cardinality over time
● Metrics Quotas: Protect yourself from cardinality explosions and budget overruns
Case Studies
● Why DoorDash Needed True Cloud Native Monitoring
● Top FinTech company chooses Chronosphere observability for industry-leading
reliability and performance
Talk to an Observability expert at Chronosphere
○ Schedule a conversation