Optimizing
Observability
Spend: Metrics
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
Aug 2023
DataOps Day 2023
chronosphere.io
Observability…
Cloud Native
Observability at Scale
“It’s remarkable how common this situation is,
where an organization is paying more for their
observability data, than they do for their
production infrastructure.”
Data volume
Experiment:
- Hello World app on a 4-node
Kubernetes cluster with
Tracing, End User Metrics
(EUM), Logs, Metrics
(containers / nodes)
- 30 days == +450 GB
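The arithmetic behind that figure can be sketched in a few lines. This is only an illustration: it assumes the 450 GB is spread evenly across the 30 days, and the function name is hypothetical. Because the slide reports "+450 GB", the results are best read as lower bounds.

```python
def daily_and_yearly_volume(total_gb: float, days: int) -> tuple[float, float]:
    """Return (GB per day, GB per 365-day year), assuming a constant ingest rate."""
    per_day = total_gb / days
    return per_day, per_day * 365


per_day, per_year = daily_and_yearly_volume(450, 30)
print(f"{per_day:.1f} GB/day, about {per_year:,.0f} GB/year")
# prints "15.0 GB/day, about 5,475 GB/year"
```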
“If you have to ask,
you can’t afford it…”
Only if we get better
outcomes…
10 hours
on average, per week, trying
to triage and understand
incidents - a quarter
of a 40-hour work week
Know the cost of
observability
metrics data?
Dedicated FinOps
“By 2023, 80% of organizations
using cloud services will
establish a dedicated FinOps
function to automate policy-driven
observability and
optimization of cloud resources
to maximize value.”
-- Source: IDC 2022
Observability Data Optimization Cycle
Centralized Governance - It Starts Here
Centralized Governance
Give teams ownership and control of their metrics to control
cardinality and growth
Quotas - Allocate portions of the licensed persisted write
capacity amongst teams and services
Priorities - Prioritize which data is impacted if over
capacity
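One way to picture quotas and priorities working together is the sketch below. It is not Chronosphere's implementation. The class, field names, and the "lower number means more important" convention are all assumptions made for illustration. Each team is capped at its share of licensed capacity, and if the capped total still exceeds capacity, the least important teams are cut first.

```python
from dataclasses import dataclass


@dataclass
class TeamTraffic:
    team: str
    priority: int         # lower number = more important (assumed convention)
    quota_share: float    # fraction of licensed capacity allocated to the team
    incoming_rate: float  # what the team is currently sending, in capacity units


def apply_quotas(teams: list[TeamTraffic], licensed: float) -> dict[str, float]:
    """Return the accepted rate per team.

    Each team is first capped at its quota. If the capped total still
    exceeds licensed capacity (quotas can be oversubscribed), load is shed
    starting with the lowest-priority team.
    """
    accepted = {t.team: min(t.incoming_rate, t.quota_share * licensed) for t in teams}
    overflow = sum(accepted.values()) - licensed
    # Walk from least important to most important, shedding load until we fit.
    for t in sorted(teams, key=lambda t: t.priority, reverse=True):
        if overflow <= 0:
            break
        cut = min(accepted[t.team], overflow)
        accepted[t.team] -= cut
        overflow -= cut
    return accepted


teams = [
    TeamTraffic("checkout", priority=1, quota_share=0.5, incoming_rate=700.0),
    TeamTraffic("batch", priority=2, quota_share=0.75, incoming_rate=600.0),
]
print(apply_quotas(teams, licensed=1000.0))
```

In this example the quotas are deliberately oversubscribed (0.5 + 0.75), so the priority step matters: "checkout" is capped at its quota, and the lower-priority "batch" team absorbs the remaining overage.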
Analyze Data
Analyze
Understand the value of the observability data to identify what is
useful and what is waste
Metrics Traffic Analyzer - Provides a real-time view of incoming
metrics grouped by label, and their relative frequency
Metrics Usage Analyzer - View all metrics in Chronosphere
ranked from least used to most used to understand the value
each metric delivers
Trace Analyzer - Provides a real-time view of incoming traces
grouped by tag and their relative frequency
The Metrics Traffic Analyzer helps
users:
● Understand metrics traffic
patterns and scale
● Break down biggest and
smallest contributors to traffic
scale (by metric name, label,
application, etc)
● Troubleshoot cardinality
spikes
Metrics Traffic Analyzer
Real-time view of incoming metrics
View Live or Pause to
investigate specific metrics
and their labels
View traffic before
it is stored to help
make decisions
about traffic shape
before you pay for
it
Breakdown traffic by metric name and label
Labels
Metrics
Troubleshoot high cardinality metrics & labels
Metrics with
‘instance’ label
‘instance’ label is
on 100% of metrics
and has 62 unique
values
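The two numbers on this slide, how widely a label is applied and how many unique values it has, can be computed from a sample of incoming series. The sketch below is a rough illustration, not how the Metrics Traffic Analyzer works internally. The function name and sample data are hypothetical, and each series is assumed to arrive as a plain dictionary of labels, with the metric name under the Prometheus-style `__name__` key.

```python
from collections import defaultdict


def label_stats(series: list[dict[str, str]]) -> dict[str, tuple[float, int]]:
    """Map each label name to (share of series carrying it, number of unique values)."""
    values: dict[str, set[str]] = defaultdict(set)
    counts: dict[str, int] = defaultdict(int)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
            counts[name] += 1
    total = len(series)
    return {name: (counts[name] / total, len(values[name])) for name in counts}


sample = [
    {"__name__": "http_requests_total", "instance": "10.0.0.1:9090", "path": "/cart"},
    {"__name__": "http_requests_total", "instance": "10.0.0.2:9090", "path": "/pay"},
    {"__name__": "cpu_seconds_total", "instance": "10.0.0.1:9090"},
]
for name, (coverage, unique) in sorted(label_stats(sample).items()):
    print(f"{name}: on {coverage:.0%} of series, {unique} unique values")
```

A label with broad coverage and many unique values is a likely candidate when troubleshooting a cardinality spike.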
Metrics Usage Analyzer
The Metrics Usage Analyzer
allows users to:
● Understand the value
each metric delivers
● Identify unused and
underutilized metrics
● Know if a metric is being
used, where, and by
whom
● Help make better
shaping decisions
What is and is not valuable?
Default
sort is
Least
Valuable
Click for more
Usage Details
Resolving uncertainty about value
Where is it being
used?
Select 14 or 30
days
How much is it
used?
Keeping it in context
Utility Score + DPPS
Where and how
much it’s used
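The "least valuable first" ordering can be approximated with a simple sort. The scoring below is a stand-in invented for illustration (references plus query executions). It is not the formula behind Chronosphere's Utility Score, and DPPS is assumed here to mean data points per second. All class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricUsage:
    name: str
    references: int  # dashboards, alerts, etc. that mention the metric
    executions: int  # query executions in the lookback window (e.g. 14 or 30 days)
    dpps: float      # ingest rate, assumed to mean data points per second


def rank_least_valuable(metrics: list[MetricUsage]) -> list[MetricUsage]:
    """Sort least used first; among equally used metrics, the costliest comes first."""
    return sorted(metrics, key=lambda m: (m.references + m.executions, -m.dpps))


metrics = [
    MetricUsage("http_requests_total", references=12, executions=340, dpps=50.0),
    MetricUsage("debug_cache_events", references=0, executions=0, dpps=900.0),
    MetricUsage("legacy_queue_depth", references=0, executions=2, dpps=15.0),
]
for m in rank_least_valuable(metrics):
    print(m.name, m.references + m.executions, m.dpps)
```

Sorting on usage and ingest rate together surfaces the scenario the following slides describe: a metric with little or no usage but a high ingest rate rises to the top of the list.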
Scenario - Low Utility Score but High Ingest
Low Utility Score,
but high DPPS.
How is it being
used?
Let’s take a look at
the Usage Details
Scenario - No references, but some executions
Not being used in
Dashboard, Alerts,
etc… but two
users. Who are
they?
Underutilized metric discovered!
Two of our top
SREs!
What are they
using it for?
Should others be
using it as well?
Refine - Shape and Transform Data
Refine
After understanding cost & value of data, we enable
you to take action without touching source code or
redeploying.
We do this by allowing you to aggregate or
downsample data, remove high cardinality labels, or
drop non-valuable data. This is done in real time at
ingest (streaming), meaning no delay in alerts or
need to store raw data.
The result is reduced cost & improved performance
without alert or query impact.
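To make "remove high cardinality labels" and "aggregate" concrete, here is a minimal sketch of the idea, assuming each data point arrives as a (labels, value) pair. It is not Chronosphere's shaping engine. The function name and data are hypothetical, and summing is only appropriate for values that are meaningful to add, such as request counts.

```python
from collections import defaultdict

Sample = tuple[dict[str, str], float]  # (labels, value) for one data point
LabelSet = tuple[tuple[str, str], ...]


def drop_label_and_sum(samples: list[Sample], label: str) -> dict[LabelSet, float]:
    """Remove `label` from every sample and sum values that now share a label set."""
    totals: dict[LabelSet, float] = defaultdict(float)
    for labels, value in samples:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        totals[kept] += value
    return dict(totals)


incoming = [
    ({"path": "/cart", "instance": "a"}, 3.0),
    ({"path": "/cart", "instance": "b"}, 4.0),
    ({"path": "/pay", "instance": "a"}, 5.0),
]
for label_set, total in drop_label_and_sum(incoming, "instance").items():
    print(dict(label_set), total)
```

Three incoming series collapse to two once `instance` is dropped, which is where the storage saving comes from. Per-path totals are preserved, so a query or alert that only groups by `path` still sees the same values.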
Operate
The Control Plane has built-in capabilities to ensure queries
perform optimally and require no user intervention, while reducing
idle time and improving engineer productivity
Query Accelerator - Automatically ensures every possible
dashboard is fast and performant – no manual optimizations
needed.
Query Scheduler - Automatically ensures that query resources are
fairly shared so one user, or group of users, can’t crowd out others.
Shaping Rules UI - Understand current shaping rules
configuration and value. Preview new policies before they are
implemented.
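The fair-sharing idea behind a query scheduler can be illustrated with plain round-robin ordering. This is a generic technique chosen for the example; the slides do not describe how the Query Scheduler is actually implemented, and the function and user names are hypothetical.

```python
from collections import deque


def fair_order(pending: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Interleave queued queries round-robin across users."""
    queues = deque((user, deque(queries)) for user, queries in pending.items() if queries)
    order = []
    while queues:
        user, queries = queues.popleft()
        order.append((user, queries.popleft()))
        if queries:
            # The user still has work queued, so they rejoin at the back of the line.
            queues.append((user, queries))
    return order


print(fair_order({"ana": ["q1", "q2", "q3"], "bo": ["q4"]}))
```

Even though "ana" queued three queries before "bo" queued one, "bo" is served second rather than last, so one heavy user cannot crowd out the others.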
Operate - Continuously Adjust for Efficiency
Why Optimizing Observability Spend
The need is real
● According to a study by ESG, 69% of companies are concerned
with the rate of their observability data growth
● When able to control and optimize their data, companies are:
○ Expanding visibility and coverage
○ Increasing instrumentation of customer
experience to improve business
outcomes
○ Freeing up observability team time to
tackle strategic projects
Customer Impact
50%
data volume
reduction
90%
reduction in on-call
pages
80%
data volume
reduction
8x
query latency
improvement
98%
data volume
reduction
8x
MTTD
improvement
"With Chronosphere, we were able to
not only significantly improve
reliability and performance of our
observability solution, but we've also
saved millions of dollars a year. With
the Chronosphere Control Plane, we're
reducing our observability data
volumes by more than 80%."
Yash Kumaraswamy, Senior Staff Engineer, Robinhood
Learn More
Resources
● Introducing: The Observability Data Optimization Cycle
● Metrics Usage Analyzer: Understand the value of each metric in your system
● How cloud native workloads affect cardinality over time
● Metrics Quotas: Protect yourself from cardinality explosions and budget overruns
Case Studies
● Why DoorDash Needed True Cloud Native Monitoring
● Top FinTech company chooses Chronosphere observability for industry-leading
reliability and performance
Talk to an Observability expert at Chronosphere
○ Schedule a conversation
Questions?
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
Aug 2023

Optimizing Observability Spend: Metrics

  • 1. Optimizing Observability Spend: Metrics Eric D. Schabell Director Evangelism @ericschabell{@fosstodon.org} Aug 2023 DataOps Day 2023
  • 4. chronosphere.io “It’s remarkable how common this situation is, where an organization is paying more for their observability data, than they do for their production infrastructure.”
  • 5. Data volume Experiment: - Hello World app on 4 node Kubernetes cluster with Tracing, End User Metrics (EUM), Logs, Metrics (containers / nodes) - 30 days == +450 GB
  • 6. “If you have to ask, you can’t afford it…” Only if we get better outcomes…
  • 7. 10 hours on average, per week, trying to triage and understand incidents - a quarter of a 40 hour work week
  • 8. Know the cost of observability metrics data?
  • 9. Dedicated FinOps “By 2023, 80% of organizations using cloud services will establish a dedicated FinOps function to automate policy-driven observability and optimization of cloud resources to maximize value.” -- Source: IDC 2022
  • 11. Centralized Governance - It Starts Here. Give teams ownership and control of their metrics to control cardinality and growth. Quotas - Allocate portions of the licensed persisted write capacity amongst teams and services. Priorities - Prioritize which data is impacted if over capacity.
  • 12. Analyze Data. Understand the value of the observability data to identify what is useful and what is waste. Metrics Traffic Analyzer - Provides a real-time view of incoming metrics grouped by label, and their relative frequency. Metrics Usage Analyzer - View all metrics in Chronosphere ranked from least used to most used to understand the value each metric delivers. Trace Analyzer - Provides a real-time view of incoming traces grouped by tag and their relative frequency.
  • 13. Metrics Traffic Analyzer. The Metrics Traffic Analyzer helps users: ● Understand metrics traffic patterns and scale ● Break down biggest and smallest contributors to traffic scale (by metric name, label, application, etc.) ● Troubleshoot cardinality spikes
  • 14. Real-time view of incoming metrics. View Live or Pause to investigate specific metrics and their labels. View traffic before it is stored to help make decisions about traffic shape before you pay for it.
  • 15. Break down traffic by metric name and label. Labels. Metrics.
  • 16. Troubleshoot high cardinality metrics & labels. Metrics with ‘instance’ label: the ‘instance’ label is on 100% of metrics and has 62 unique values.
  • 17. Metrics Usage Analyzer. The Metrics Usage Analyzer allows users to: ● Understand the value each metric delivers ● Identify unused and underutilized metrics ● Know if a metric is being used, where, and by whom ● Make better shaping decisions
  • 18. What is and is not valuable? Default sort is Least Valuable. Click for more Usage Details.
  • 19. Resolving uncertainty about value. Where is it being used? Select 14 or 30 days. How much is it used?
  • 20. Keeping it in context. Utility Score + DPPS. Where and how much it’s used.
  • 21. Scenario - Low Utility Score but High Ingest. Low Utility Score, but high DPPS. How is it being used? Let’s take a look at the Usage Details.
  • 22. Scenario - No references, but some executions. Not being used in Dashboard, Alerts, etc… but two users. Who are they?
  • 23. Underutilized metric discovered! Two of our top SREs! What are they using it for? Should others be using it as well?
  • 24. Refine - Shape and Transform Data. After understanding the cost & value of data, we enable you to take action without touching source code or redeploying. We do this by allowing you to aggregate or downsample data, remove high cardinality labels, or drop non-valuable data. This is done in real time at ingest (streaming), meaning no delay in alerts or need to store raw data. The result is reduced cost & improved performance without alert or query impact.
  • 25. Operate - Continuously Adjust for Efficiency. The Control Plane has built-in capabilities to ensure queries perform optimally and require no user intervention, while reducing idle time and improving engineer productivity. Query Accelerator - Automatically ensures every possible dashboard is fast and performant, with no manual optimizations needed. Query Scheduler - Automatically ensures that query resources are fairly shared so one user or group of users can’t crowd out others. Shaping Rules UI - Understand current shaping rules configuration and value. Preview new policies before they are implemented.
  • 26. Why Optimizing Observability Spend: The need is real ● In a study by ESG, 69% of companies are concerned with the rate of their observability data growth ● When able to control and optimize their data: ○ Expanding visibility and coverage ○ Increasing instrumentation of customer experience to improve business outcomes ○ Freeing up observability team time to tackle strategic projects
  • 27. Customer Impact: 50% data volume reduction, 90% reduction in on-call pages; 80% data volume reduction, 8x query latency improvement; 98% data volume reduction, 8x MTTD improvement
  • 28. "With Chronosphere, we were able to not only significantly improve reliability and performance of our observability solution, but we've also saved millions of dollars a year. With the Chronosphere Control Plane, we're reducing our observability data volumes by more than 80%." Yash Kumaraswamy, Senior Staff Engineer, Robinhood
  • 29. Learn More. Resources: ● Introducing: The Observability Data Optimization Cycle ● Metrics Usage Analyzer: Understand the value of each metric in your system ● How cloud native workloads affect cardinality over time ● Metrics Quotas: Protect yourself from cardinality explosions and budget overruns. Case Studies: ● Why DoorDash Needed True Cloud Native Monitoring ● Top FinTech company chooses Chronosphere observability for industry-leading reliability and performance. Talk to an Observability expert at Chronosphere ○ Schedule a conversation
  • 30. Questions? Eric D. Schabell Director Evangelism @ericschabell{@fosstodon.org} Aug 2023
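
The cardinality check on slide 16 (how widely a label is used and how many unique values it has) and the label removal described on slide 24 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Chronosphere's implementation or API: the sample series, the metric name, and the helper names `label_cardinality` and `drop_label` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data (not from the talk): one counter reported by
# 4 instances for 2 paths, each series carrying a current value of 1.0.
samples = [
    ("http_requests_total", {"instance": f"node-{i}", "path": path}, 1.0)
    for i in range(4)
    for path in ("/cart", "/checkout")
]


def label_cardinality(series):
    """Map each label to (share of series carrying it, count of unique values)."""
    seen_values = defaultdict(set)
    carriers = defaultdict(int)
    for _name, labels, _value in series:
        for key, value in labels.items():
            seen_values[key].add(value)
            carriers[key] += 1
    return {
        key: (carriers[key] / len(series), len(seen_values[key]))
        for key in seen_values
    }


def drop_label(series, label):
    """Remove one label and sum the values of series that become identical."""
    merged = defaultdict(float)
    for name, labels, value in series:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        merged[(name, kept)] += value
    return [(name, dict(kept), total) for (name, kept), total in merged.items()]


shaped = drop_label(samples, "instance")
print(label_cardinality(samples))
print(len(samples), "series before,", len(shaped), "series after")
```

In this toy data the `instance` label sits on every series and multiplies the series count by four. Dropping it leaves one series per path, each holding the sum across instances, which is the kind of trade-off a shaping rule makes between detail and stored volume.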