Jorge Salamero Sanz <jsalamero@serverdensity.com>
IncontroDevOps 1 April 2016
War Games - Flight training for DevOps
● Infrastructure automation
● Configuration automation
● Continuous testing
● Continuous deployment / delivery
● Monitoring
● Logs, error handling
● Feedback
● Human Ops
DevOps lifecycle
● Humans are part of any system
● Initial design, ongoing improvements
● Maintenance
● Upgrades
● Issues, Incident response
Humans in DevOps
● System issues = error rates + SLA + ...
● Human issues = alerts out of hours + interruptions + ...
● System issues = Human issues
Human issues = system issues
● System health impacts human health
● Human health impacts system health
Humans impact systems
● Downtime = loss of users, reputation, revenue
● Downtime caused by unreliable systems
● Unhealthy teams reduce reliability
● Unhealthy teams = loss of users, reputation, revenue
Humans impact business
● Slip
● Lapse
● Mistake
● Violation
● (Always, again, again)
Human risk
● Prepare and practice
● Respond
● Postmortem
Expect downtime
Real example
(small war story, won’t be long)
● Power failure to half of our servers
● Automated failover unavailable
(known failure condition)
● Manual DNS switch required
● Expected impact: 20 min
● Actual impact: 43 min
Incident example
Lesson learned?
● Unfamiliarity with the process
● Pressure of a time-sensitive event
(panic effect)
● Escalation introduces delays
The Human Factor
Handling the Human factor
● First responder, acknowledge alert
● Load incident response checklist
● Log into #ops-war-room in Slack
● Log incident into JIRA
● Begin investigation
General response process
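
The response steps above could be kept as a checklist that timestamps each completed step. This is only a minimal sketch in Python: the `Checklist` class, its methods, and the exact step wording are hypothetical and not taken from any Server Density tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class Checklist:
    """An ordered incident checklist that timestamps each completed step."""
    steps: List[str]
    done: Dict[str, datetime] = field(default_factory=dict)

    def complete(self, step: str) -> None:
        # Reject unknown steps so a typo does not silently count as progress.
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        self.done[step] = datetime.now(timezone.utc)

    def remaining(self) -> List[str]:
        # Steps still open, in their original order.
        return [s for s in self.steps if s not in self.done]


# Steps paraphrased from the slide; the wording is illustrative.
RESPONSE = Checklist([
    "Acknowledge alert",
    "Load incident response checklist",
    "Join #ops-war-room in Slack",
    "Log incident in JIRA",
    "Begin investigation",
])
```

Calling `RESPONSE.complete("Acknowledge alert")` records the time of that step, and `RESPONSE.remaining()` shows what is still open, which also gives the postmortem a timeline.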
1. Extended use of checklists
Documented procedures
● The “limits of human memory and attention”
○ Complexity
○ Stress and fatigue
○ Ego
● Pilots, doctors, divers:
Bruce Willis Ruins All Films
(BCD, weights, releases, air, final)
Pre-flight checklists
1. Extended use of checklists
2. Not to be followed blindly: use knowledge and experience
3. Independent system
4. Searchable
5. List of known issues and documented workarounds/fixes
Documented procedures
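
Points 4 and 5 above (searchable, with known issues and workarounds) can be illustrated with a very small lookup. This is a sketch under assumptions: the entries in `KNOWN_ISSUES` and the `search` helper are invented for illustration, and a real team would more likely use a wiki or runbook tool hosted independently of the production systems.

```python
# Hypothetical known-issues list; titles and workarounds are examples only.
KNOWN_ISSUES = [
    {"title": "Power failure to half of the servers",
     "workaround": "Switch DNS manually to the secondary site"},
    {"title": "Replica lag after failover",
     "workaround": "Pause batch jobs until lag clears"},
]


def search(query, issues=KNOWN_ISSUES):
    """Return issues whose title or workaround contains every query word."""
    words = query.lower().split()
    matches = []
    for issue in issues:
        text = (issue["title"] + " " + issue["workaround"]).lower()
        if all(word in text for word in words):
            matches.append(issue)
    return matches
```

For example, `search("dns")` returns only the first entry, so a responder under pressure finds the documented workaround without reading the whole list.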
● Replica environment
● or mock command line
● Record actions and timing
● Multiple failures
● Unexpected results
Realistic scenarios: War Games
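
One way to picture the "mock command line" that records actions and timing is the sketch below. It is an assumption about how such a drill tool might look, not the tool used in the talk: `MockShell`, its canned outputs, and the injectable clock are all hypothetical names.

```python
import time


class MockShell:
    """A pretend command line: nothing is executed, every command is logged."""

    def __init__(self, canned, clock=time.monotonic):
        self.canned = canned      # maps command string -> scripted output
        self.clock = clock        # injectable so drills can be replayed in tests
        self.start = clock()      # drill start time
        self.log = []             # list of (seconds since start, command)

    def run(self, command):
        elapsed = self.clock() - self.start
        self.log.append((round(elapsed, 1), command))
        # Unscripted commands get a generic error, like a real shell would give.
        return self.canned.get(command, "command not found")
```

The scenario author fills `canned` with scripted outputs, including misleading ones to simulate multiple failures or unexpected results, and `log` gives the timeline for the review afterwards.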
Results
● Team and individual test of response
● Run real commands
● Training the people
● Training the procedures
● Training the tools
Results
● Increase confidence
● Reduce panic
● Better coordination
● Trust relationships
● Improves time to resolution
Humans results
● Review
● Suggestions for improvements
● Do it again
● Scenario evolves
● People forget
loop(): review and repeat
● On-call rotation design
● Alert prioritization
● Notification optimization
What else?
Human Ops
1. Humans are part of the system
2. Humans impact systems
3. Humans impact business
4. Human issues count as system issues
Human Ops principles
meetup.com/humanops-london/
Human Ops Meetup
www.CloudStatusApp.com
Jorge Salamero Sanz
Chief Developer Advocate
@bencerillo
@serverdensity
our DevOps stories
blog.serverdensity.com

Flight training for DevOps & HumanOps - IncontroDevOps 2016
