2. -
Desert Code Camp 2019
Before DevOps
Team Ops
Team Dev
Image Courtesy: Kieran Jacobsen, Readify, Microsoft
3. Before DevOps
Team Dev (Engg)
• Requirements -> design
• SCM & code revisioning
• Coding, feature dev
• Testing, QA
• Delivering release candidate
• Bug fixes and/or triage
Team (Sys)Ops
• Release management and deployments
• IT admin and InfoSec
• Infrastructure, DBA and maintenance
• Reliability Engineering
• Business Operations
4. • Outperforming teams are 54% more likely to collaborate extensively with their counterparts (Developers, IT Ops, Business)
• Collaboration blockers: no executive support (26.7%), cultural inhibitors (56.7%), fragmented processes (43.3%)
• DevOps was being initiated by more development teams than IT Ops teams, by about a 40% to 33% margin
• 3/4 of teams have adopted Agile methodologies
• The average cost of infrastructure failure is $100,000 per hour
• It takes on average 200 minutes to diagnose and repair a production issue
• A bug caught in production ends up costing 100x more than if the same bug was found earlier in the development cycle
• 1 in 6 IT decision makers are still unfamiliar with the term DevOps
• 40% of implementations end up getting reworked because they don’t meet the users’ original requirements
• 41% of development budgets for software, IT staff and external professional services will be consumed by poor requirements
• IT drives business success! High IT performance correlates with strong business performance and helps boost productivity, market share and profit.
• Dual goals: responding to ongoing needs for efficiency and growth; always keeping all systems safe and secure
• 80% failure rate for companies that try to adapt their existing tools for DevOps practices
• 70% of CIOs would increase risk to reduce IT costs and accelerate business agility
5. DevOps Cycle
By 2022, DevOps will be the norm for the majority of software developed.
HP Enterprise in 2017
- Ship code 30x faster
- 55% more responsive to business needs
- 50% fewer failures
- 38% improved code quality
Puppetlabs in 2013
DevOps means caring about your
job enough to not pass the buck,
wanting to learn all the parts as a
whole, and not just your little
world.
— John Vincent
According to Statista, DevOps adoption among business organizations rose to 17% in 2018, up from about 10% in 2017.
Image source: Kieran Jacobsen, Readify & Microsoft
6. What is DevOps?
Slide source: Thiago Almeida| @nzthiago | talmeida.net
DevOps is development and operations
DevOps is treating your …
DevOps is using … for Ops?
DevOps is feature …
DevOps is … deployments
7. What is DevOps?
• Not merely development and operations collaborating
• A culture and mindset of collaboration between developers and operations
• Developing with ops/tools/usage in mind
• Deploying with automation and emergency fixes in mind
• Test-driven development with user experience frustrations in mind
• Bug triaging with fix cost estimation and plan in mind
• Provisioning/procurement with automatic scaling in mind
• Release planning with an A/B production switch in mind
• Faster deployments, even faster response times, improved quality and health of systems
• The correct people, processes and tools/products leveraged
• Reduced costs overall, reinforced trust across the organization
8. What DevOps Isn’t
DevOps means caring about your job enough to not pass the buck, wanting
to learn all the parts as a whole, and not just your little world.
— John Vincent
• Caring for your system does not require you to be an expert in everything. You continue doing what you are good at while paying more attention to other areas of the system
• Owner vs. renter analogy: owners don’t walk away from a problem
• Specialization and domain expertise are still more valuable than generalist work; DevOps merely asks for cross-awareness (cross-pollinated skills)
• Documentation, training and communication tools help overcome challenges
9. Tools of the Trade
Image source: https://eduinpro.com/blog/top-devops-tools-in-the-digital-market/ & medium.com
10. Tools of the Trade
• Dashboards, traceability, incremental delivery of value
• Agile methods like Scrum and Kanban used effectively
• Continuous Integration and release pipelines
• Automation where needed, IaC (Infrastructure as Code)
• Application monitoring and alerting, incident management
• Business and support in co-ordination with developers
• Shared responsibility for ops, same as security
11. Infrastructure as Code
• Treat templates, scripts, orchestration code or provisioning like code artifacts (yaml/json/xml)
• Any tools or config scripts also go in the codebase/SCM
• Follow change management practices for infrastructure as well (version, manifest, CM approvals)
• Record changes in a visible log (Slack channel/Jira work log)
• Security concerns called out in planning and properly tracked during implementation
12. DevSecOps
• What about security? IT InfoSec used to take care of it.
• Security is a shared responsibility as well
• Never treat security as an afterthought (reactionary)
• DevSecOps (DevOps with security in mind)
• Clear Communication Pathways
• Streamlined Communication
• Security As Code
• Training
• Integrate Security into DevOps cycle
13. Communication
Diagram: information exchanged between Development, Operations and Security
• Ops tools, metrics, alerts
• Security review, data classification, security fixes
• Major defects, highlight pain points, drive improvements/incident action items
• Pen test code, compliance, security action items, policy
• Security monitoring tools, firewall review, access log scan, vulnerability, outdated hardware/software
• Application scan, pen test infra, access control rules
NO:
⨯ Excel checklists
⨯ Word document reports or policy documents
⨯ Email attachments
⨯ Private communication – ad hoc cc list
⨯ Private chat/tribal knowledge, verbal approval
YES:
✓ Backlogs/boards (like Jira/Scrum tools/MS Project)
✓ Support ticketing (like Remedy/Zendesk)
✓ Markup and Git (readme.md, Confluence)
✓ Traceable tool, CM (Confluence, Google Docs with versioning, author, Slack history, work logs)
14. Security as Code
• Application source code incorporates security libraries/platforms
• Infrastructure follows security guidelines (CloudFormation, templates)
• Server configuration – Chef, Puppet, DSC, Wazuh
• Traceable code checked into a repository (leverage git + CI/CD)
• Check in not just source, but also policy-as-code artifacts
• Monitoring/operations configuration should also be checked in as code, in the form of a script/template
• Testing & scanning tools/policy can also be checked in/automated
• Document the process to deploy and run the above for easy reuse
• Firewall rules, access control changes, permission requests
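As a minimal sketch of policy as code, assume firewall rules are checked in as plain data and a CI step rejects changes that open sensitive ports to the whole internet. The rule fields (`name`, `port`, `cidr`), the restricted port list and the function name are hypothetical and not taken from any particular tool.

```python
# Hypothetical policy: these ports must never be open to the whole internet.
RESTRICTED_PORTS = {22, 3389, 5432}
OPEN_CIDR = "0.0.0.0/0"


def find_violations(rules):
    """Return the rules that expose a restricted port to any address."""
    return [
        rule for rule in rules
        if rule["port"] in RESTRICTED_PORTS and rule["cidr"] == OPEN_CIDR
    ]


# Example rule set, as it might be checked in next to the application code.
proposed_rules = [
    {"name": "web", "port": 443, "cidr": "0.0.0.0/0"},
    {"name": "ssh-anywhere", "port": 22, "cidr": "0.0.0.0/0"},
    {"name": "db-internal", "port": 5432, "cidr": "10.0.0.0/8"},
]

violations = find_violations(proposed_rules)
for rule in violations:
    print(f"policy violation: {rule['name']} opens port {rule['port']} to {rule['cidr']}")
```

Because both the rules and the policy live in the repository, a firewall or access control change goes through the same review and history as any other code change.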
15. Training
• We can’t be experts in Dev, Sec and Ops at once
• We need cross pollination of skills
• Developer that understands app vulnerability
• IT/Ops that can understand code
• Security expert that can review infrastructure
• Starts at day 0 (Can’t be postponed)
• Leverage existing tools used in DevOps for security
• Common training with DevOps tools
• Don’t assume non-technical staff (or any one group in the org) are the only source of security issues
16. Monitoring and Alerting
• What to measure in your code? (And why)
• Latency, Volume, Errors and Exceptions
• Understand the repercussions of failure
• Fault tolerance and logging necessary details
• What constitutes an alert?
• Business impairment/impact
• System impairment/load
• Severity
• Log triage, root cause analysis, forensics
• Red herrings and known outlying cases
• Statistics – Average, worst case, best case, 99th percentile
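The statistics named on this slide (average, worst case, best case, 99th percentile) can be sketched as follows. This is an illustration only: the sample latencies are made up, the function name is hypothetical, and the percentile uses the simple nearest-rank method on a non-empty list, which may differ slightly from what a given monitoring tool reports.

```python
import math
import statistics


def latency_summary(samples_ms):
    """Summarize latency samples: average, best, worst and 99th percentile."""
    ordered = sorted(samples_ms)
    # Nearest-rank percentile: smallest value with at least 99% of samples at or below it.
    rank = math.ceil(len(ordered) * 99 / 100)
    return {
        "average": statistics.mean(ordered),
        "best": ordered[0],
        "worst": ordered[-1],
        "p99": ordered[rank - 1],
    }


# Hypothetical request latencies in milliseconds: 98 fast requests and 2 slow outliers.
samples = [20] * 98 + [900, 1500]
summary = latency_summary(samples)
print(summary)
```

With these samples the average is 43.6 ms, which hides the slow requests, while the 99th percentile is 900 ms and the worst case is 1500 ms. That gap is why percentiles are usually better alert inputs than averages.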
17. Incident Management
• Incident priority and severity, business impact
• Pager alerts, response protocol
• Monitoring, dashboards, analysis tools
• Post Mortems
• Ops Tools
• Communication
Image: PagerDuty.com
18. Incident Management
• During Incident
• Standard Operating Procedure (SOP)
• Notetaker and liaison
• Paging hierarchy
• Log each action with timestamp, record effect
• After Incident
• Post Mortem / Correction of Errors – trackable document
• Deeper dive, provide graphs/logs
• Immediate actions to prevent repeat occurrence (Kanban)
• Longer term actions (Scrum)
• Continuous Improvement
• Tune alarms, update SOP (ops proc)
• Review dashboards
• Automate manual steps, ops tools
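The practice of logging each action with a timestamp and recording its effect during an incident could look roughly like the sketch below. The `IncidentLog` class, its fields and the example incident are hypothetical. In practice the notetaker would more likely write to a shared, traceable tool than to an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentLog:
    """Append-only record of actions taken during an incident."""
    title: str
    entries: list = field(default_factory=list)

    def record(self, action, effect, at=None):
        # Timestamp each entry in UTC so the post mortem timeline is unambiguous.
        at = at or datetime.now(timezone.utc)
        self.entries.append({"at": at, "action": action, "effect": effect})

    def timeline(self):
        """Render entries in chronological order for the post mortem document."""
        return [
            f"{e['at'].isoformat()} | {e['action']} | {e['effect']}"
            for e in sorted(self.entries, key=lambda e: e["at"])
        ]


log = IncidentLog("Checkout errors")  # hypothetical incident
log.record("Restarted payment worker", "Error rate unchanged",
           at=datetime(2019, 10, 12, 9, 5, tzinfo=timezone.utc))
log.record("Rolled back latest release", "Error rate back to normal",
           at=datetime(2019, 10, 12, 9, 20, tzinfo=timezone.utc))
print("\n".join(log.timeline()))
```

Recording the effect next to each action makes it easier to tell, after the incident, which step actually resolved the problem and which manual steps are worth automating.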
19. Monitoring and Alerting
• Sample dashboard (GitLab)
• AWS CloudWatch & PagerDuty walkthrough
• Sumo Logic walkthrough (log analysis)
• Sentry and real-time exception watches
• Reviewing and tracking alarms and dashboards
• Red/orange lines for warnings and alerts
• Standard ops procedure consults dashboard & vice versa
20. Final Thoughts
• Dealing with Operations Overload/Security Events Overload
• Eisenhower Decision Matrix for backlog prioritization
• Web Application Firewalls (AWS WAF)
• Forensics after outages/events
• Speed up log analysis – share triage information
• Vulnerability management – urgent upgrades
• Don’t postpone critical vulnerability patches
• A/B labs for runtime switches (management)
• Deploy new features to production hidden behind an on/off switch
• Allow "dial up" of a feature to a certain percentage of customers
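One common way to implement the on/off switch and the percentage "dial up" is to hash each customer into a stable bucket. The sketch below assumes that approach. The function name, feature name and customer IDs are hypothetical, and a real system would read the rollout percentage from managed configuration rather than pass it as an argument.

```python
import hashlib


def is_feature_enabled(feature, customer_id, rollout_percent):
    """Return True if the feature is switched on for this customer."""
    if rollout_percent <= 0:
        return False  # switch is off: feature is deployed but hidden
    if rollout_percent >= 100:
        return True
    # Hash feature + customer so each customer gets a stable bucket in 0..99.
    digest = hashlib.sha256(f"{feature}:{customer_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


customers = [f"customer-{n}" for n in range(1000)]  # hypothetical IDs
enabled_at_10 = {c for c in customers if is_feature_enabled("new-checkout", c, 10)}
enabled_at_50 = {c for c in customers if is_feature_enabled("new-checkout", c, 50)}
print(len(enabled_at_10), len(enabled_at_50))
```

Because the bucket depends only on the feature and customer, dialing up from 10% to 50% keeps the feature on for everyone who already had it, and setting the percentage back to 0 hides it again without a redeploy.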