Webinar | June 24, 2019
Adam Frank
Senior Product Manager
Mick Miller
Senior DevOps Architect
How KeyBank Liberated its IT Ops
from Rules-Based Event Management
Breaking the
Rules
15 States
1,200+ Branches
1,500+ ATMs
20,000 Employees
$135B Assets
$5B Revenue
2 Datacenters
Systems have become more…
IT System
Complexity
• Modular
• Distributed
• Dynamic
• Ephemeral
What is
Driving Digital
Transformation?
• Increased demand
• Increasing change velocity
• Customer expectation
• Customer mobility
• Customer choice
What is Driving
Change Velocity?
• Expansion of digital services
• Emergence of containers
• High availability architectures
• Volume: 100k+ logins per
second, etc.
Increased monitoring breaks down
legacy approaches…
• Increasing staff does not scale with the
rate of data ingestion
• Legacy systems do not learn
Keeping
Customers…
…and attracting new ones through
improved customer experience (Cx)
• Near 100% uptime has become
expected
• Restoration of services is measured
in seconds not hours
• Capturing click-level events to
discover how customers are using
your systems
• Continuous delivery
The Weight
• Legacy rules-based filtering (if,
then, else, etc.) won’t scale with
exponential growth
• Too many interdependencies
between complex systems and
rules supporting the telemetry
Legacy Monitoring Can’t Scale
Obsolescence: Planned
and Unplanned
• Software/Hardware: at the core of
ideas, which change as we advance
information/data/technology
• Languages: Over 25 languages in 60
years (1948–2009)
• Data: Flat files -> ISAM -> Relational ->
No-SQL -> Clusters -> etc.
• Software: ad hoc -> Structured
programming -> Object -> Functional ->
etc.
• IT Operations: ad hoc -> ITIL v1-3 ->
ITIL v4 -> DevOps -> etc.
• And on, and on …
The Only
Constant is
Change
If you don’t like change, you are
really going to hate extinction.
New Strategy Required for IT System
Monitoring
Graph based on StackState monitoring maturity model for IT operations
[Figure: visibility and intelligence plotted against maturity level. Level 1: individual component monitoring; level 2: full-breadth monitoring; level 3: end-to-end monitoring and correlation; level 4: AIOps. Stage labels: reactive monitoring, proactive monitoring, predictive monitoring. Region labels: Rules-based, AIOps.]
• As the number of systems increases,
so does the volume of data. This
means the number of rules will
increase causing exponential
complexity.
• Increased number of rules becomes
unpredictable and untestable.
This Parrot Is
No More
Rules-based event correlation is
past its time
This Parrot Is No More
• Multiple rules interacting is a factorial problem:
(n! = n × (n−1)!)
o 5 rules: 5! = 120 possible combinations
o 6 rules: 6! = 720 possible combinations
o 10 rules: 10! = 3,628,800 possible combinations
o 100 rules: 100! ≈ 9.3 × 10^157
(a 158-digit number) possible
combinations
• While easy to understand and implement,
rules-based monitoring implodes at the enterprise
scale as complexity increases
Rules-based event correlation is past its time
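The factorial figures on this slide can be checked directly, since n! counts the orders in which n rules can be evaluated. A minimal Python sketch using only the standard library's `math.factorial`; the helper name `orderings` is ours for illustration and does not come from the webinar:

```python
import math


def orderings(rule_count: int) -> int:
    """Number of distinct orders in which rule_count rules can be evaluated (n!)."""
    return math.factorial(rule_count)


for n in (5, 6, 10):
    print(f"{n} rules -> {orderings(n):,} orderings")

# 100! is too long to read comfortably, so report its size instead.
digits = len(str(orderings(100)))
print(f"100 rules -> a {digits}-digit number of orderings")
```

Python integers are arbitrary precision, so the value for 100 rules is exact rather than a floating-point approximation.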
n n!
0 1
1 1
2 2
3 6
4 24
5 120
6 720
7 5,040
8 40,320
9 362,880
10 3,628,800
11 39,916,800
12 479,001,600
13 6,227,020,800
14 87,178,291,200
15 1,307,674,368,000
16 20,922,789,888,000
17 355,687,428,096,000
18 6,402,373,705,728,000
19 121,645,100,408,832,000
20 2,432,902,008,176,640,000
21 51,090,942,171,709,440,000
22 1,124,000,727,777,607,680,000
23 25,852,016,738,884,976,640,000
24 620,448,401,733,239,439,360,000
25 15,511,210,043,330,985,984,000,000
26 403,291,461,126,605,635,584,000,000
27 10,888,869,450,418,352,160,768,000,000
Growth in Relationships Between Rules Is
Not Linear
• Trying to understand and
test all the relationships
between rules is not
possible.
• Computer scientists call this kind of
problem intractable, in the same spirit
as “NP-complete” problems (no efficient
general solution is known)
• Virtually impossible to
understand effect of alert
exceptions in a collection
of rules, even at 10 rules.
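Why rule interactions are so hard to reason about can be seen even with three rules. The sketch below is an illustration under our own assumptions: the alert, the three rules, and the first-match-wins evaluation are all hypothetical, not taken from KeyBank's or Moogsoft's systems. It evaluates the same alert under every possible rule order and counts how many different actions result:

```python
from itertools import permutations

# Hypothetical alert, invented for illustration only.
alert = {"host": "db01", "severity": 4, "maintenance": True}


def rule_escalate(a):
    return "page on-call" if a["severity"] >= 4 else None


def rule_maintenance(a):
    return "suppress" if a["maintenance"] else None


def rule_database(a):
    return "ticket to DBA queue" if a["host"].startswith("db") else None


RULES = (rule_escalate, rule_maintenance, rule_database)


def first_match(rules, a):
    """Evaluate rules in order; the first rule that fires decides the action."""
    for rule in rules:
        action = rule(a)
        if action is not None:
            return action
    return "no action"


# Try every ordering of the rules against the same alert.
outcomes = {}
for order in permutations(RULES):
    names = " > ".join(r.__name__ for r in order)
    outcomes[names] = first_match(order, alert)

for names, action in outcomes.items():
    print(f"{names}: {action}")

distinct_actions = set(outcomes.values())
print(f"{len(outcomes)} orderings, {len(distinct_actions)} different actions")
```

With only three rules there are already six orderings to test, and the action taken depends entirely on which rule happens to run first.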
You don’t know
what you don’t
know
• You can’t predict unusual events (events
that no rule was written to catch)
• Rules-based approaches need to change
to AIOps
• ML and AI: all event data can be
processed
• Modern AIOps uses algorithms to identify
when something is unusual
In data science a “black swan event” is
something you can’t predict.
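To make "identify when something is unusual" concrete, here is a deliberately simple statistical baseline: flag a value that sits far from the historical mean. This is a sketch under our own assumptions, not the algorithm any AIOps product uses; the function name `is_unusual`, the threshold of 3 standard deviations, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def is_unusual(history, current, threshold=3.0):
    """Flag current if it lies more than `threshold` sample standard
    deviations away from the mean of history."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # flat history: any change is unusual
    return abs(current - mu) / sigma > threshold


# Hypothetical events-per-minute counts for one service.
baseline = [102, 98, 105, 97, 101, 99, 103, 100]

print(is_unusual(baseline, 104))  # within normal variation
print(is_unusual(baseline, 180))  # far outside the baseline
```

No rule had to name the value 180 in advance; the check adapts as the history changes, which is the contrast with fixed thresholds the slide is drawing.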
Whodunit?
• Rules-based approaches cannot
decide on root cause of system
failures
• Real-world failures in highly distributed
systems are random in nature and can
have multiple root causes
• Unlike rules-based systems, AIOps
platforms have built-in learning models.
You don’t need to constantly add new
rules
Root cause probability
Take the red pill…
• Deceptively simple
• Expensive
• Unpredictable
• Undecidable
Rules-based systems cannot meet
the demands of complex
distributed computing
Take the red pill…
• Processes all events
• Does not separate data from systems
• Algorithms are deterministic
• Algorithms don’t care about order
• Single algorithm can replace
hundreds of rules
AIOps (AI and ML) liberates IT
from the limitations of
rules-based systems
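The claims that algorithms "don't care about order" and that one algorithm can stand in for many rules can be illustrated with a very simple grouping heuristic: cluster alerts that arrive close together in time. This is only a sketch under our own assumptions; the function `cluster_by_time`, the 60-second gap, and the sample alerts are hypothetical, and real AIOps platforms use more sophisticated methods.

```python
import random


def cluster_by_time(alerts, gap_seconds=60):
    """Group alerts whose timestamp is within gap_seconds of the previous
    alert. Sorting first makes the result independent of arrival order."""
    clusters = []
    for alert in sorted(alerts, key=lambda a: (a["ts"], a["source"])):
        if clusters and alert["ts"] - clusters[-1][-1]["ts"] <= gap_seconds:
            clusters[-1].append(alert)
        else:
            clusters.append([alert])
    return clusters


# Hypothetical alerts: ts is seconds since the start of the window.
alerts = [
    {"ts": 0, "source": "switch-7"},
    {"ts": 12, "source": "db01"},
    {"ts": 40, "source": "web03"},
    {"ts": 900, "source": "atm-gw"},
    {"ts": 930, "source": "auth"},
]

shuffled = alerts[:]
random.shuffle(shuffled)

# The clusters are the same no matter what order the alerts arrived in.
assert cluster_by_time(alerts) == cluster_by_time(shuffled)

for group in cluster_by_time(shuffled):
    print([a["source"] for a in group])
```

A rules-based equivalent would need a separate rule for each pairing of sources; here a single function covers any source that appears.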
Start Today
• Now is the time to start your AIOps journey
• Move beyond legacy rules-based systems
• Start using the modern machine learning
of AIOps to deliver continuous service
assurance to your enterprise
Get Started by Reading the AIOps Manifesto:
• https://www.aiops-exchange.org/wp-content/uploads/2019/05/aiops-manifesto.pdf
• https://www.moogsoft.com/resources/aiops/ebook/aiops-liberates-it
DEMO Moogsoft AIOps
Continuous Service Delivery, Optimal Business Agility
TIME
Quickly focus on and
resolve the most critical
issues, at scale
Improve your economics by making
teams faster, smarter, and more
productive
COST
Get real-time visibility into your
existing data sources, tools and
workflow
RISK
ALL
DATA
Any
SCALE
Purpose-Built AI for IT and DevOps
Moogsoft Is the Platform for Agile and Proactive Event
Resolution Workflow
Industrialized data
ingestion from
multiple sources
Proactively and
automatically detects
Incidents and probable root
causes (reduces MTTD)
Triggers automation
to restore services
Predictive insights
(reduces support
escalations and
MTTR)
Enables collaborative
workflows (reduces
MTTR and adverse
business impact)
Automatically resolves
signals from alert
noise
Early Detection, fewer tickets, reduced MTTR
[Diagram labels: Events; sources: Application, Network, Infrastructure, Custom Alerts, Logs/Syslogs, Synthetic, Cloud (AMW, ESX, NetApp, AWS X-Ray); AIOps; Notifications; Incident Cases; Diagnostics; Custom Scripts; Existing Runbooks; Runbook Automation; NSO Orchestration; Continuous Deployment; Known Remediation]
Seamless Integration With Your Existing Tools
and Workflows
Mick Miller
mick_miller@keybank.com
Adam Frank
adam.frank@moogsoft.com
Q & A
