Confidential and Proprietary
DevOps Today? Self-Service NoOps Tomorrow!
Using AI and Automation to Build Self-Healing Pipelines
February 19, 2020
Today’s Presenters
Travis DePuy, Head Product Evangelist, @xM_Tinkerer
Andreas Grabner, DevOps & ACE Activist, @grabnerandi
Agenda
● What is Self-Healing and NoOps, and why are teams making the move?
● Moving to NoOps as a Self-Service
● Self-Healing and NoOps Use Cases
● Next Steps
● Q & A
What is Self-Healing and NoOps, and why are teams making the move?
DevOps: Infrastructure as Code. Ops to work like Devs. Automate Delivery (Launch Control). Speed: 1h Code to Prod.
NoOps: Everything as Code. Dev think like Ops. Automate Operations (Mission Control). Stability: 93% fewer impacting issues.
ACE: Self-Service. Everyone thinks Biz. Platforms (Autonomous Cloud). Scale (out): 800+ engineers working autonomously.
NoOps is a Mindset!
“What can possibly go wrong? How can I prevent it using automation?”
MTTI (Mean Time to Innovation)
MTTR (Mean Time to Remediate)
4.8 days, 4 hours, ~10 min
12.5 days, 2 days, ~1 hour
But there is a long way to go based on https://dynatrace.ai/acsurvey
Only < 5%
Incidents in the Age of Customer Experience
4 Golden Signals of SRE
● Latency
● Errors
● Saturation
● Traffic
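As an illustration of how the four signals can be derived from raw data, here is a minimal sketch that summarises one observation window. The `Request` record, the `golden_signals` function, the choice of a nearest-rank 95th percentile for latency, and CPU usage as the saturation measure are all assumptions made for this example; they are not taken from the presentation or from any monitoring product.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    ok: bool


def golden_signals(requests, window_s, cpu_used, cpu_capacity):
    """Summarise one observation window into the four golden signals."""
    count = len(requests)
    durations = sorted(r.duration_ms for r in requests)
    if durations:
        # Nearest-rank 95th percentile of the request durations.
        rank = max(1, math.ceil(0.95 * count))
        p95 = durations[rank - 1]
    else:
        p95 = 0.0
    failed = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": p95,
        "error_rate": failed / count if count else 0.0,
        "saturation": cpu_used / cpu_capacity,  # assumed: CPU as the constrained resource
        "traffic_rps": count / window_s,
    }
```

Each value could then be compared against a threshold, which is the kind of check a quality gate automates.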
How to move to NoOps as a Self-Service?
Path to NoOps – 4 Building Blocks
● Automate Monitoring
● Automate Quality
● Automate Delivery
● Automate Operations
www.keptn.sh - @keptnProject
Part of the CNCF Landscape for Application Delivery
How Keptn manages Continuous Delivery with Blue/Green Deployments
New artifact
Update Config
Update Environment
Run Tests
Validate Quality Gate (SLI/SLO)
Rollback if failed: Total 4/8 (50%)
Promote if passed: Total 7/8 (87.5%)
Repeat for other stages
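The arithmetic behind the two totals on this slide can be sketched as follows. This is a simplified illustration, not Keptn's actual evaluation logic: the function name `evaluate_gate`, the one-point-per-objective scoring, the upper-bound-only objectives, and the 75% pass threshold are all assumptions made for the example.

```python
def evaluate_gate(sli_values, objectives, pass_pct=75.0):
    """Score measured SLIs against SLO upper bounds.

    objectives maps an SLI name to its allowed maximum. Each objective
    that is met earns one point. pass_pct is an assumed threshold.
    """
    if not objectives:
        raise ValueError("at least one objective is required")
    achieved = sum(
        1
        for name, limit in objectives.items()
        # A missing measurement counts as a failed objective.
        if sli_values.get(name, float("inf")) <= limit
    )
    total = len(objectives)
    score = 100.0 * achieved / total
    decision = "promote" if score >= pass_pct else "rollback"
    return achieved, total, score, decision
```

With eight objectives, meeting four gives a score of 50% and a rollback, while meeting seven gives 87.5% and a promotion under the assumed threshold.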
Keptn is more than CD: Automatic Remediation
Alert by Problem
Get remediation action
Execute remediation action
Re-validate Quality Gate
Resolved? Escalate?
Versions shown in the diagram: v1, v2
Examples:
• Rollback to old version
• Toggle feature flag
• Scale up Deployment/restart Pods
• Clear disk
• YOUR manual operation tasks
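The remediation flow on this slide (get an action, execute it, re-validate the quality gate, then resolve or escalate) can be sketched as a small loop. Everything named here is hypothetical: `remediate`, the `actions` mapping, and the `execute` and `gate_passes` callables stand in for whatever tooling actually performs those steps, and the `max_attempts` limit is an assumption.

```python
def remediate(problem, actions, execute, gate_passes, max_attempts=3):
    """Try the configured actions for a problem until the gate passes.

    actions maps a problem type to an ordered list of action names.
    execute(action) performs one action; gate_passes() re-validates
    the quality gate and returns True or False.
    """
    attempted = []
    for action in actions.get(problem, [])[:max_attempts]:
        execute(action)
        attempted.append(action)
        if gate_passes():
            return "resolved", attempted
    # No action fixed the problem, or none was configured.
    return "escalate", attempted
```

A mapping such as `{"full_disk": ["clear_disk", "scale_up"]}` would try clearing the disk first and scaling up second, and hand the problem to a human only if the gate still fails after both.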
Blue/Green Deployment
Example 2: Process Crash Remediation
… all while Ansible rolls back the last commit.
Full Disk Remediation
Next Steps
https://www.xmatters.com/dynatrace
@keptnProject
github.com/keptn
github.com/keptn/community
www.keptn.sh
Questions?
Thank you!
