A Lean Approach to Monitoring
September 15, 2015
About Ernest
• Product Manager at IDERA in Austin, TX
• 20 years of IT experience, from startups to
enterprise shops
• Runs CloudAustin user
group, DevOpsDays Austin
conference
• Twitter: @ernestmueller
• Blog: theagileadmin.com
Agenda
The Monitoring Landscape
What Is Lean?
MVP Monitoring Areas
Next Steps
Monitoring Your Systems
Monitoring Your Applications
Monitoring Tools
• Network (SNMP, Netflow)
• Server (SNMP, WMI, system)
• Virtualization/Cloud/Container
• Real User Monitoring (network, browser)
• Service Endpoint (simple/transactional, local/remote)
• Application (management interface, instrumentation)
• Software metrics (database, web/app server)
• Custom metrics (application)
• Logging, Security, Analytics, Reporting, More…
What To Do?
• Monitor it all?
– Expensive
– Complex
• How deep?
– Monitor parts of it?
– Gaps in visibility
– Which parts?
Monitoring Pitfalls
• “I have 100,000 metrics, but still can’t tell if the
site is down?”
• “Did you know we’re generating 30% of our
system load from monitoring?”
• “It’s going to cost how much? Maybe, but the
procurement cycle will be 9 months…”
• “We’re spending 2 headcount just on
maintaining our monitoring systems!”
• “We get so many alerts we need a secondary
triage system so we know which ones to pay
attention to.”
What Is Lean?
• Eliminate Waste
• Amplify Learning
• Decide as late as possible
• Deliver as fast as possible
• Empower the team
• Build quality in
• See the whole
Lean Principles
Your Monitoring Is A Product
• Build – Minimum Viable Monitoring
• Measure – All the Monitoring Points
• Learn – About the App and the Monitoring
• Repeat – Go Deeper Where It’s Needed
Iterate Through A Development Cycle
Monitoring MVP Areas
1. Service Performance and Uptime
2. Software Component Metrics
3. System and Network Metrics
4. Application Metrics
What are the most important areas to cover?
Service Performance and Uptime
• Remember lean principle “see the whole”
• “What do my users see?”
• MVP: external synthetic probe of the end service
• Next: RUM, waterfalls, transactions
• Later: transaction warehousing, cross-tier
transaction tracing
The end user view is always the most critical
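A minimal sketch of the MVP probe above, assuming a plain HTTP(S) endpoint and Python's standard library. The names probe, ProbeResult and max_seconds are hypothetical, not taken from any tool in this deck. The opener argument exists only so the probe can be exercised without a network.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ProbeResult:
    url: str
    up: bool                 # answered with a non-error status, fast enough
    status: Optional[int]    # HTTP status, or None if no response arrived
    seconds: float           # wall-clock time for the whole fetch
    error: Optional[str] = None


def probe(url: str, timeout: float = 5.0, max_seconds: float = 2.0,
          opener: Callable[..., Any] = urllib.request.urlopen) -> ProbeResult:
    """Fetch url once and report what an outside user would have seen."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # pull the full body, not just the headers
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return ProbeResult(url, False, exc.code,
                           time.monotonic() - start, str(exc))
    except OSError as exc:
        # DNS failure, refused connection, timeout, TLS error and similar.
        return ProbeResult(url, False, None,
                           time.monotonic() - start, str(exc))
    elapsed = time.monotonic() - start
    # "Up" here means a non-error status AND a response within the budget.
    up = 200 <= status < 400 and elapsed <= max_seconds
    return ProbeResult(url, up, status, elapsed)
```

To match the "external" part of the MVP, something like probe("https://www.example.com/") would run on a schedule from outside your own network, with example.com standing in for your service. A run from inside the data center can pass while users are locked out.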
Remember the Process
• Build – Minimum Viable Monitoring
• Measure – All the Monitoring Points
• Learn – About the App and the Monitoring
• Repeat – Go Deeper Where It’s Needed
Lean Development Cycle
Software Component Metrics
• “Is my service up?”
• Check ports/processes for actionable outages
• MVP: local probes
• Next: More metrics beyond uptime and response
time (most have a set they expose)
• Later: Advanced deep dive database and other app
component APM
What you can page people on
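A local probe of this kind can be as small as a TCP connect to each component's port. The sketch below assumes components that listen on TCP. The function names and the component map are hypothetical. A successful connect only shows that something is listening, not that the component is healthy.

```python
import socket
from typing import Dict, List, Tuple


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all of these as "down".
        return False


def failed_components(components: Dict[str, Tuple[str, int]]) -> List[str]:
    """Return the names of components whose port did not accept a connection."""
    return sorted(name for name, (host, port) in components.items()
                  if not port_open(host, port))
```

A call such as failed_components({"web": ("127.0.0.1", 8080), "db": ("127.0.0.1", 5432)}) returns the list of names to page on. Those ports are placeholders for whatever your components use.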
System and Network Metrics
• “What is the root cause?”
• Load on your systems and network devices
• MVP: basic system metrics
(CPU/mem/disk/network)
• Next: More depth, cloud/virt/container layer stats
• Later: Netflow, deeper dive into specific hardware
platform metrics (SANs, etc.)
Diagnosing Issues
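As a rough sketch of the MVP tier, the snippet below gathers a few basic system metrics using only Python's standard library. It covers CPU load and disk only. Memory and network counters have no portable standard-library source, so they would need a third-party package such as psutil or a platform-specific source. The function name and metric keys are hypothetical.

```python
import os
import shutil
from typing import Dict


def basic_system_metrics(path: str = os.path.abspath(os.sep)) -> Dict[str, float]:
    """Collect a few coarse host metrics; keys are absent when unavailable."""
    metrics: Dict[str, float] = {}

    cpus = os.cpu_count() or 1
    metrics["cpu_count"] = float(cpus)

    # os.getloadavg() exists on Unix-like systems only.
    if hasattr(os, "getloadavg"):
        try:
            load_1m, _, _ = os.getloadavg()
            # Normalize so that 1.0 means "about one runnable task per CPU".
            metrics["load_1m_per_cpu"] = load_1m / cpus
        except OSError:
            pass  # load average could not be obtained on this host

    usage = shutil.disk_usage(path)
    if usage.total:
        metrics["disk_used_pct"] = 100.0 * usage.used / usage.total

    return metrics
```

Even this small set is enough to answer "is the box overloaded or full?" during an outage. The Next and Later tiers add depth only where that answer keeps coming up short.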
Application Metrics
• “What is really going on?”
• The app knows, get the app to tell you
• MVP: Logging and log aggregation
• Next: Specific app metric emission, application
instrumentation (Management API or bytecode)
• Later: Better logging
Business value and troubleshooting specifics
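One way to "get the app to tell you" is to log one structured line per business event, which a log aggregator can later count and chart. This is a sketch under assumptions: the logger name, the emit and timed helpers, and the field names are all hypothetical, and JSON lines are just one common choice of format.

```python
import json
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("app.metrics")  # hypothetical logger name


def emit(event: str, **fields) -> str:
    """Log one event as a single JSON line and return that line."""
    line = json.dumps({"event": event, **fields}, sort_keys=True)
    log.info(line)
    return line


@contextmanager
def timed(event: str, **fields):
    """Time the wrapped block and emit its duration and outcome."""
    start = time.monotonic()
    outcome = "ok"
    try:
        yield
    except Exception:
        outcome = "error"
        raise  # the caller still sees the original exception
    finally:
        elapsed_ms = round((time.monotonic() - start) * 1000, 1)
        emit(event, outcome=outcome, ms=elapsed_ms, **fields)
```

Wrapping a business operation, for example with timed("place_order", customer="c1"): around the order code, yields a count, an error rate and a latency figure for that operation from the logs alone.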
Think About The Principles
• Eliminate Waste
• Amplify Learning
• Decide as late as possible
• Deliver as fast as possible
• Empower the team
• Build quality in
• See the whole
Lean Principles
Quick Demo
• CopperEgg – Ultra quick-start SaaS-based
monitoring with basics on systems, endpoints,
RUM, custom
• Uptime – Download and install infrastructure and
application monitoring
• Precise – APM suite with deep support for
everything from SAP to Java to SQL
Monitor At the Right Depth
Questions?
Monitor the Lean way…

Geek Sync | A Lean Approach To Application Performance Monitoring


Editor's Notes

  1. Your systems are complex, and there are many points at which you can instrument them for monitoring, and various methods you can use to perform the measurements.
  2. And the same goes for your applications.
  3. There are many, many different ways to monitor your system and applications and for each type there are various instrumentation approaches and levels of depth.
  4. An experienced IT person can make educated guesses at this – but they’re just guesses, and every system is unique. And there is a tendency for experienced folks to say “monitor it all! Maximum resolution forever!” But it’s easy for a solution to be too complex for an operator, and everything you have running has a logistics tail all of its own – maintenance, data storage, etc. Plus, you end up with a flood of data and especially alerts that you may not be prepared to properly handle.
  5. These are all specific real monitoring issues I’ve seen with my own eyes.
  6. Lean Manufacturing, as popularized by Taiichi Ohno’s Toyota Production System, is a method for eliminating waste in a system. This has been adapted into Lean IT and Lean Software Development. Lean software development is characterized by seven principles. It is designed to promote visibility, shorten cycle time, and ensure you’re delivering the highest value first.
  7. Eric Ries applied lean to product development in his book The Lean Startup (2011), which characterizes the core loop inside the product development cycle as “Build – Measure – Learn.” Lean is not cost-cutting; lean is about bringing your maximum force to bear on the item with the highest leverage at any given time.
  8. I’d like to recommend a sample roadmap of where I think you should start your monitoring. Rather than spending $100k in any one area, you want to get broad coverage in many areas first and then deepen those and/or move into adjacencies as needed. These are the four key areas to nail down first, assuming you’re starting from scratch (or trying to learn/redesign a complex set of monitoring solutions already in place).
  9. The most important attribute of your system is not CPU on some box, or a queue length, or whether a process is running. It’s whether users are able to access and use your service from out there in the world – period. That’s the first thing to address.
  10. Remember, we’re iterating here, so that you can learn what you really need (or don’t need), and you and your team can learn how to use what you have well before getting more. You don’t half-bake each step (remember the “build quality in” principle), but you add one type of monitoring at a time and see what it tells you and what you need next.
  11. Next, you need more internally focused outage detection to tell you if you have an issue, and where that breakage is in your system.
  12. Now we segue to metrics more useful for root cause analysis – a service is down or slow, why?
  13. The more customized the metric is to your business, the more value it has for troubleshooting and for business purposes. Most issues lie within applications, not system components, but you have to rely on either the application telling you what’s wrong or external profiling.
  14. As you go through each iteration, ask yourself how you are achieving these principles. Usually there are many consumers of a monitoring system, of all kinds of different skill levels. How do you empower those people and help them learn with the system you are constructing? Are you building in quality, and is your monitoring integrated to where it really lets you see the state of the system and not just “a bunch of line graphs”?