SlideShare a Scribd company logo
1
2
Daniel Krook
Senior Certified IT Specialist, IBM
The IBM dashboard for operational metrics
3
We run Cloud Foundry on dozens of OpenStack VMs
Two intranet clusters
In the past year, we’ve learned how to
Classic: 38 huge VMs deployed with Chef: 1,302 users, 1,710 apps
NG: 41 medium VMs deployed with BOSH: 123 users, 247 apps
Not counting Dev deployments
All on 50+ Nova Compute nodes
• Keep Cloud Foundry running smoothly
• Discover and prevent impending problems
• Resolve unexpected issues quickly
4
1. Show the key data points we track
2. Show how our metrics dashboard helps us monitor that data
3. Share ideas on how to find better data in NG and beyond
4. Spark discussion on improved visibility for CF admins and customers.
Goals of this lightning talk
We are looking to get better at this, and help the community get better as well.
5
1. The key data
6
What are the important metrics?
Data that can be
tracked over time to see
trends and behaviors
Data that can help
us predict problems
before they happen
DEAs and apps health
 Memory reserved as a proportion of the
memory available
General health of all components
 Health of the virtual machines
 Status of the processes running on them
Database nodes and services
 Number of provisioned services against
capacity available
At the PaaS layer, that means:
7
 Deliver continuous
availability in the cloud
 Proactively solve
problems rather than
react to them
 Understand the behavior
of the system to
automate it
Why do we need metrics?
8
 NATS message bus
• Discover the components to interrogate
• Best for dynamically changing data
Where can we find them?
 Cloud Controller database (CCDB)
• Longer lived data that isn’t in the varz endpoints
9
2. Monitoring that data
10
1. Views of component health
2. Resource usage details
3. Ongoing growth trends
4. Access to logs and raw varz
5. Email notifications
Our metrics dashboard provides…
11
 Components nearing capacity or failure
 Already failed components
 Out of control apps and noisy users
 Active/inactive users and apps
 Growth trends and runtime/service adoption
It helps us find (and fix) problems
It helps us see patterns
12
User and app trends
There is also one unauthenticated page for high level stats
13
DEA list
14
DEA details
15
Service node list
16
Service node details
17
User list
18
User details
19
App list
20
App details
21
Log list
22
Log details
23
Email notifications
24
3. Finding and acting on better data
25
 NG provides granular user/org/space views…
• This enables better BSS potential in terms of QoS and departmental billing
 …But we lost user and app data linkages from the health manager
• Can’t see what DEA my app resides on (not currently enabled in our NG version)
• Can’t see how many apps a user has (replaced by orgs and spaces, but still
valuable to trace)
• See https://github.com/cloudfoundry/cloud_controller_ng/issues/81
 We’d like to restore that data, either surface it
• in varz endpoints (dynamic data, preferred) or
• CC_DB (static data, could be a security concern)
Let’s resolve gaps in data captured from NG
26
 Detect errors in applications that are traceable to users/orgs
• Preemptively reach out to them to see if they need help
• Think customer service and proactive support!
• Can we hook into to BOSH or Jenkins for automation?
 Automate (and expand links to the IaaS and SaaS stacks)
• Self healing systems (out of disk, move apps)
• Self scaling systems (detect when nearing thresholds)
• Evolving topologies (replace unused service nodes with popular ones)
Let’s begin to link metrics to automation
27
 Admins are the primary beneficiary right now
• But data is almost completely read only
• Should we provide UAA based tiers of access to admins?
 Others can and should benefit
• Customers
• End users
• Developers
• Management
• Executives, line of business owners
• Finance
Let’s expand the broadcast of metrics to more users
28
Thanks!
29
The metrics dashboard innovators
Chris Peters Russell Boykin
Doug Davis Wei Feng
30
We’re hiring!
Search Jobs at IBM by:
SmartCloud Application Services
31

More Related Content

What's hot

January 2015 Webinar - Wins and Successes from 2014
January 2015 Webinar -  Wins and Successes from 2014January 2015 Webinar -  Wins and Successes from 2014
January 2015 Webinar - Wins and Successes from 2014
RapidScale
 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing Data
Ian Foster
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
SoftServe
 
Towards Personalization in Global Digital Health
Towards Personalization in Global Digital HealthTowards Personalization in Global Digital Health
Towards Personalization in Global Digital Health
Databricks
 
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5Splunk
 
Splunk Distributed Management Console
Splunk Distributed Management Console                                         Splunk Distributed Management Console
Splunk Distributed Management Console
Splunk
 
Modern management of data pipelines made easier
Modern management of data pipelines made easierModern management of data pipelines made easier
Modern management of data pipelines made easier
CloverDX
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto
 
Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS) Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS)
Splunk
 
RapidScale CloudMail
RapidScale CloudMailRapidScale CloudMail
RapidScale CloudMail
RapidScale
 
Three Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking ObservabilityThree Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking Observability
DevOps.com
 
Migrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systemsMigrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systems
Markus Eisele
 
Event-driven architecture
Event-driven architectureEvent-driven architecture
Event-driven architecture
Andrew Easter
 
IBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive DevelopmentIBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive Development
Lightbend
 
SplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSASplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSA
Splunk
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - Staples
Splunk
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk
 
Conferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic ObservabilityConferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic Observability
Elasticsearch
 

What's hot (20)

January 2015 Webinar - Wins and Successes from 2014
January 2015 Webinar -  Wins and Successes from 2014January 2015 Webinar -  Wins and Successes from 2014
January 2015 Webinar - Wins and Successes from 2014
 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing Data
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
 
Towards Personalization in Global Digital Health
Towards Personalization in Global Digital HealthTowards Personalization in Global Digital Health
Towards Personalization in Global Digital Health
 
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
 
Splunk Distributed Management Console
Splunk Distributed Management Console                                         Splunk Distributed Management Console
Splunk Distributed Management Console
 
Modern management of data pipelines made easier
Modern management of data pipelines made easierModern management of data pipelines made easier
Modern management of data pipelines made easier
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of Engagement
 
Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS) Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS)
 
RapidScale CloudMail
RapidScale CloudMailRapidScale CloudMail
RapidScale CloudMail
 
Three Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking ObservabilityThree Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking Observability
 
Migrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systemsMigrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systems
 
Event-driven architecture
Event-driven architectureEvent-driven architecture
Event-driven architecture
 
IBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive DevelopmentIBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive Development
 
SplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSASplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSA
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - Staples
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search Dojo
 
Dev ops toronto
Dev ops torontoDev ops toronto
Dev ops toronto
 
Conferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic ObservabilityConferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic Observability
 

Viewers also liked

Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Earley Information Science
 
Best Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support MetricsBest Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support Metricsdreamforce2006
 
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
Cloud Foundry Deployment Tools:  BOSH vs Juju CharmsCloud Foundry Deployment Tools:  BOSH vs Juju Charms
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
Altoros
 
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques WarrenWebinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
AT Internet
 
Regulatory Reporting Dashboard
Regulatory Reporting DashboardRegulatory Reporting Dashboard
Regulatory Reporting Dashboard
accenture
 
The difference between a KPI and a Metric
The difference between a KPI and a MetricThe difference between a KPI and a Metric
The difference between a KPI and a Metric
Dennis Mortensen
 
Stress management in hr
Stress management in hrStress management in hr
Stress management in hr
'Anuraag Ghosh
 
KPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HRKPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HR
Yodhia Antariksa
 
Microservices with Spring and Cloud Foundry
Microservices with Spring and Cloud FoundryMicroservices with Spring and Cloud Foundry
Microservices with Spring and Cloud Foundry
mimacom
 
The 10 Most Important Banking Metrics
The 10 Most Important Banking MetricsThe 10 Most Important Banking Metrics
The 10 Most Important Banking Metrics
John J. Maxfield
 
Project Metrics & Measures
Project Metrics & MeasuresProject Metrics & Measures
Project Metrics & Measures
Anand Subramaniam
 
Developing Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance IndicatorsDeveloping Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance Indicators
Victor Holman
 
Learning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training ScorecardLearning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training Scorecard
Cornerstone OnDemand Foundation
 
KEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATORKEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATOR
speedcars
 

Viewers also liked (14)

Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
 
Best Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support MetricsBest Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support Metrics
 
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
Cloud Foundry Deployment Tools:  BOSH vs Juju CharmsCloud Foundry Deployment Tools:  BOSH vs Juju Charms
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
 
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques WarrenWebinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
 
Regulatory Reporting Dashboard
Regulatory Reporting DashboardRegulatory Reporting Dashboard
Regulatory Reporting Dashboard
 
The difference between a KPI and a Metric
The difference between a KPI and a MetricThe difference between a KPI and a Metric
The difference between a KPI and a Metric
 
Stress management in hr
Stress management in hrStress management in hr
Stress management in hr
 
KPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HRKPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HR
 
Microservices with Spring and Cloud Foundry
Microservices with Spring and Cloud FoundryMicroservices with Spring and Cloud Foundry
Microservices with Spring and Cloud Foundry
 
The 10 Most Important Banking Metrics
The 10 Most Important Banking MetricsThe 10 Most Important Banking Metrics
The 10 Most Important Banking Metrics
 
Project Metrics & Measures
Project Metrics & MeasuresProject Metrics & Measures
Project Metrics & Measures
 
Developing Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance IndicatorsDeveloping Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance Indicators
 
Learning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training ScorecardLearning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training Scorecard
 
KEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATORKEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATOR
 

Similar to The IBM dashboard for operational metrics

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summit
Matt Carroll
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
DATAVERSITY
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
Rajesh Menon
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
DATAVERSITY
 
Whitepaper factors to consider when selecting an open source infrastructure ...
Whitepaper  factors to consider when selecting an open source infrastructure ...Whitepaper  factors to consider when selecting an open source infrastructure ...
Whitepaper factors to consider when selecting an open source infrastructure ...
apprize360
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptx
RATISHKUMAR32
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools
Mickey Boxell
 
Whitepaper factors to consider commercial infrastructure management vendors
Whitepaper  factors to consider commercial infrastructure management vendorsWhitepaper  factors to consider commercial infrastructure management vendors
Whitepaper factors to consider commercial infrastructure management vendors
apprize360
 
The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015
Chip Childers
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
Lynn Langit
 
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Adin Ermie
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native apps
VMware Tanzu
 
How to improve your system monitoring
How to improve your system monitoringHow to improve your system monitoring
How to improve your system monitoring
Andrew White
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
Kevin Crawley
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdf
Datacademy.ai
 
Big Data
Big DataBig Data
Big Data
Neha Mehta
 
Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0
alok khobragade
 
Emerging IT Trends and Innovation Concepts.pptx
Emerging IT Trends and Innovation Concepts.pptxEmerging IT Trends and Innovation Concepts.pptx
Emerging IT Trends and Innovation Concepts.pptx
Roshni814224
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devops
Ulf Mattsson
 
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y KubernetesIntroducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
SUSE España
 

Similar to The IBM dashboard for operational metrics (20)

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summit
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
Whitepaper factors to consider when selecting an open source infrastructure ...
Whitepaper  factors to consider when selecting an open source infrastructure ...Whitepaper  factors to consider when selecting an open source infrastructure ...
Whitepaper factors to consider when selecting an open source infrastructure ...
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptx
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools
 
Whitepaper factors to consider commercial infrastructure management vendors
Whitepaper  factors to consider commercial infrastructure management vendorsWhitepaper  factors to consider commercial infrastructure management vendors
Whitepaper factors to consider commercial infrastructure management vendors
 
The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
 
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native apps
 
How to improve your system monitoring
How to improve your system monitoringHow to improve your system monitoring
How to improve your system monitoring
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdf
 
Big Data
Big DataBig Data
Big Data
 
Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0
 
Emerging IT Trends and Innovation Concepts.pptx
Emerging IT Trends and Innovation Concepts.pptxEmerging IT Trends and Innovation Concepts.pptx
Emerging IT Trends and Innovation Concepts.pptx
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devops
 
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y KubernetesIntroducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
 

More from Platform CF

The Platform for Building Great Software
The Platform for Building Great SoftwareThe Platform for Building Great Software
The Platform for Building Great SoftwarePlatform CF
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to StackatoPlatform CF
 
Continuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIContinuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIPlatform CF
 
The Journey to Cloud Foundry
The Journey to Cloud FoundryThe Journey to Cloud Foundry
The Journey to Cloud FoundryPlatform CF
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePlatform CF
 
What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?Platform CF
 
Cloud Foundry at VMware
Cloud Foundry at VMwareCloud Foundry at VMware
Cloud Foundry at VMwarePlatform CF
 
Go Within Cloud Foundry
Go Within Cloud FoundryGo Within Cloud Foundry
Go Within Cloud FoundryPlatform CF
 
Continuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryContinuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryPlatform CF
 
From Zero To Factory
From Zero To FactoryFrom Zero To Factory
From Zero To FactoryPlatform CF
 
Service Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsService Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsPlatform CF
 
Cloud Foundry Marketplace Powered by AppDirect
Cloud Foundry MarketplacePowered by AppDirectCloud Foundry MarketplacePowered by AppDirect
Cloud Foundry Marketplace Powered by AppDirectPlatform CF
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to StackatoPlatform CF
 
Multi-site Architecture Considerations
Multi-site Architecture ConsiderationsMulti-site Architecture Considerations
Multi-site Architecture ConsiderationsPlatform CF
 
Cloud Foundry at NTT
Cloud Foundry at NTTCloud Foundry at NTT
Cloud Foundry at NTTPlatform CF
 
Building Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitectureBuilding Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitecturePlatform CF
 
Extending Cloud Foundry to .NET
Extending Cloud Foundry to .NETExtending Cloud Foundry to .NET
Extending Cloud Foundry to .NETPlatform CF
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at RakutenPlatform CF
 

More from Platform CF (19)

The Platform for Building Great Software
The Platform for Building Great SoftwareThe Platform for Building Great Software
The Platform for Building Great Software
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to Stackato
 
Continuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIContinuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CI
 
The Journey to Cloud Foundry
The Journey to Cloud FoundryThe Journey to Cloud Foundry
The Journey to Cloud Foundry
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry Service
 
What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?
 
Cloud Foundry at VMware
Cloud Foundry at VMwareCloud Foundry at VMware
Cloud Foundry at VMware
 
Go Within Cloud Foundry
Go Within Cloud FoundryGo Within Cloud Foundry
Go Within Cloud Foundry
 
Continuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryContinuous Delivery with Cloud Foundry
Continuous Delivery with Cloud Foundry
 
From Zero To Factory
From Zero To FactoryFrom Zero To Factory
From Zero To Factory
 
Service Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsService Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud Elements
 
Cloud Foundry Marketplace Powered by AppDirect
Cloud Foundry MarketplacePowered by AppDirectCloud Foundry MarketplacePowered by AppDirect
Cloud Foundry Marketplace Powered by AppDirect
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to Stackato
 
Multi-site Architecture Considerations
Multi-site Architecture ConsiderationsMulti-site Architecture Considerations
Multi-site Architecture Considerations
 
Intro to MoPaaS
Intro to MoPaaSIntro to MoPaaS
Intro to MoPaaS
 
Cloud Foundry at NTT
Cloud Foundry at NTTCloud Foundry at NTT
Cloud Foundry at NTT
 
Building Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitectureBuilding Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud Architecture
 
Extending Cloud Foundry to .NET
Extending Cloud Foundry to .NETExtending Cloud Foundry to .NET
Extending Cloud Foundry to .NET
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at Rakuten
 

Recently uploaded

Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

The IBM dashboard for operational metrics

  • 1. 1
  • 2. 2 Daniel Krook Senior Certified IT Specialist, IBM The IBM dashboard for operational metrics
  • 3. 3 We run Cloud Foundry on dozens of OpenStack VMs Two intranet clusters In the past year, we’ve learned how to Classic: 38 huge VMs deployed with Chef: 1,302 users, 1,710 apps NG: 41 medium VMs deployed with BOSH: 123 users, 247 apps Not counting Dev deployments All on 50+ Nova Compute nodes • Keep Cloud Foundry running smoothly • Discover and prevent impending problems • Resolve unexpected issues quickly
  • 4. 4 1. Show the key data points we track 2. Show how our metrics dashboard helps us monitor that data 3. Share ideas on how to find better data in NG and beyond 4. Spark discussion on improved visibility for CF admins and customers. Goals of this lightning talk We are looking to get better at this, and help the community get better as well.
  • 6. 6 What are the important metrics? Data that can be tracked over time to see trends and behaviors Data that can help us predict problems before they happen DEAs and apps health  Memory reserved as a proportion of the memory available General health of all components  Health of the virtual machines  Status of the processes running on them Database nodes and services  Number of provisioned services against capacity available At the PaaS layer, that means:
  • 7. 7  Deliver continuous availability in the cloud  Proactively solve problems rather than react to them  Understand the behavior of the system to automate it Why do we need metrics?
  • 8. 8  NATS message bus • Discover the components to interrogate • Best for dynamically changing data Where can we find them?  Cloud Controller database (CCDB) • Longer lived data that isn’t in the varz endpoints
  • 10. 10 1. Views of component health 2. Resource usage details 3. Ongoing growth trends 4. Access to logs and raw varz 5. Email notifications Our metrics dashboard provides…
  • 11. 11  Components nearing capacity or failure  Already failed components  Out of control apps and noisy users  Active/inactive users and apps  Growth trends and runtime/service adoption It helps us find (and fix) problems It helps us see patterns
  • 12. 12 User and app trends There is also one unauthenticated page for high level stats
  • 24. 24 3. Finding and acting on better data
  • 25. 25  NG provides granular user/org/space views… • This enables better BSS potential in terms of QoS and departmental billing  …But we lost user and app data linkages from the health manager • Can’t see what DEA my app resides on (not currently enabled in our NG version) • Can’t see how many apps a user has (replaced by orgs and spaces, but still valuable to trace) • See https://github.com/cloudfoundry/cloud_controller_ng/issues/81  We’d like to restore that data, either surface it • in varz endpoints (dynamic data, preferred) or • CC_DB (static data, could be a security concern) Let’s resolve gaps in data captured from NG
  • 26. 26  Detect errors in applications that are traceable to users/orgs • Preemptively reach out to them to see if they need help • Think customer service and proactive support! • Can we hook into to BOSH or Jenkins for automation?  Automate (and expand links to the IaaS and SaaS stacks) • Self healing systems (out of disk, move apps) • Self scaling systems (detect when nearing thresholds) • Evolving topologies (replace unused service nodes with popular ones) Let’s begin to link metrics to automation
  • 27. 27  Admins are the primary beneficiary right now • But data is almost completely read only • Should we provide UAA based tiers of access to admins?  Others can and should benefit • Customers • End users • Developers • Management • Executives, line of business owners • Finance Let’s expand the broadcast of metrics to more users
  • 29. 29 The metrics dashboard innovators Chris Peters Russell Boykin Doug Davis Wei Feng
  • 30. 30 We’re hiring! Search Jobs at IBM by: SmartCloud Application Services
  • 31. 31