3. WHY DEVOPS?
© Agile Network India, All Rights Reserved. www.agilenetworkindia.com
4. @sudiptal
SETTING THE CONTEXT…
• From the manifesto: Early continuous delivery of valuable software!
• What would be a single KPI if you want to focus on this objective?
– Lead time (LT): From the time when the customer gave you some idea
of what he/she wanted, how long did it take him/her to get it?
• Keeping focus on LT helps all teams come together to serve a
common objective: the customer!
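A minimal sketch of measuring lead time as defined above: elapsed time from the customer's request to delivery. The work items and their dates are hypothetical, chosen only for illustration.

```python
from datetime import datetime
from statistics import mean

def lead_time_days(requested_at: datetime, delivered_at: datetime) -> float:
    """Elapsed days between the customer's request and delivery to them."""
    return (delivered_at - requested_at).total_seconds() / 86400

# Hypothetical work items: (requested by customer, delivered to customer)
items = [
    (datetime(2017, 1, 2), datetime(2017, 1, 12)),
    (datetime(2017, 1, 5), datetime(2017, 1, 25)),
]

# Average lead time across the items, in days.
average_lead_time = mean(lead_time_days(r, d) for r, d in items)
```

Tracking this one number per work item keeps the measurement tied to what the customer experiences rather than to any single team's stage.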
5. @sudiptal
HOW DO WE BRING LEAD TIME DOWN?
Ideation to Specification (including the front end)
Specification to Dev Complete
Dev Complete to Production Deployment
Agile methods focus here.
DevOps extends the spectrum
Can we make this a non-issue by putting adequate processes and infrastructure in place?
7. @sudiptal
NET RESULT: TRIPLE CONSTRAINT
BUSTED!
• Agile Methods/DevOps
establish the opposite
– We can deliver the highest
quality in the least time with
the least effort (minimize
waste)
– Slam dunk in green field
development initiatives
– Takes time with legacy systems
8. @sudiptal
WHAT MAKES HIGH PERFORMANCE IT
ORGANIZATIONS?
• Continuous Delivery
• Lean Management Practices
9. @sudiptal
CONTINUOUS DELIVERY:
MAKES WORK BETTER AND MAKES IT “FEEL” BETTER
• Practices: Test and Deployment Automation; Continuous Integration; (All) Production artefacts in Version Control
• These make up Continuous Delivery
• Outcomes shown: IT Performance; Lower Deployment Pain; Lower Change Fail Rates; Organization Performance
10. @sudiptal
LEAN MANAGEMENT PRACTICES:
MAKES WORK BETTER AND MAKES IT “FEEL” BETTER
• Practices: WIP Limits to drive improvement; Visualisations to monitor work; Monitoring to make business decisions
• These make up Lean Management
• Outcomes shown: IT Performance; Decreased Burnout; Improved Organization Culture; Organization Performance
11. WHY IS IT
DIFFICULT?
13. @sudiptal
FROM THE “CULT” CLASSIC:
3 TAKEAWAYS
• DevOps is applied kanban. To run
DevOps well, you must understand
kanban.
• DevOps is not a collection of
tools
• If your people are 100% utilized, you
are introducing waste
http://daveondevops.com/2016/03/17/takeawaysfromphoenixproject/
15. CONTINUOUS
DELIVERY
YOUR WORK PRODUCT IS ALWAYS
READY FOR DELIVERY
EXCLUSIONS: LEAN MANAGEMENT
PRACTICES, CULTURE
16. @sudiptal
“ALWAYS DEPLOYABLE” MEANS…
• Ability to get changes (features, configuration changes, bug fixes,
experiments) into Production safely and quickly, and to sustain that pace
– Make Releases boring; no one stays awake at night!
– No need to use the latest tools; bring people together to get this done
– Not just functionally ready – it should also meet all non-functional requirements
(performance, security)
• Eliminate separate integration, testing and hardening phases
– A good 5-10% of the overall lead time
17. HOW DO WE GET
THERE?
18. 1. QUALITY
IMPROVING QUALITY IS
EVERYONE’S RESPONSIBILITY
19. @sudiptal
THE QUALITY BIBLE:
DEMING’S 14 POINTS ON QUALITY MANAGEMENT
• Create constancy of purpose for improving products and services.
• Adopt the new philosophy.
• Cease dependence on inspection to achieve quality; eliminate inspection;
BUILD QUALITY INTO THE PRODUCT IN THE FIRST PLACE
• End the practice of awarding business on price alone; instead, minimize total
cost by working with a single supplier, on a long term relationship of loyalty
and trust!
• Improve constantly and forever every process for planning, production and
service.
• Institute training on the job.
• Adopt and institute leadership.
20. @sudiptal
THE BIBLE:
DEMING’S 14 POINTS ON QUALITY MANAGEMENT
• Drive out fear.
• Break down barriers between staff areas.
• Eliminate slogans, exhortations and targets for the workforce.
• Eliminate numerical quotas for the workforce and numerical goals for
management; eliminate MBO
• Remove barriers that rob people of pride of workmanship, and eliminate the
annual rating or merit system. The responsibility of supervisors must be
changed from sheer numbers to quality.
• Institute a vigorous program of education and self-improvement for everyone.
• Put everybody in the company to work accomplishing the transformation.
Transformation is everyone’s job!
21. @sudiptal
WHAT DOES IT MEAN TO US?
• There is no Agility/DevOps with crappy software
• There is no Agility/DevOps with manual regression testing that takes
days/weeks/months
• There is no Agility/DevOps with Dev and Ops in their own cocoons,
with handoffs from one team to another
• Developers should be writing tests; if you don’t have this, you are not
ready for DevOps
22. @sudiptal
WHAT TO DO?
• Treat Tests as first class citizens of
your project
– Use tools to build and manage them…
just like you do with your source code
• Follow Agile Testing body of
knowledge
– If needed, get rid of the separate Testing
team; Dev understands that there is no
insurance to cover them for their
crappy code!
– Inverted Testing Pyramid is a non-starter!
• Revitalize the tester
– Tester is a role; not a person
• Definitely, not a failed developer
– Advocates for the user; makes quality
transparent
– Preferably, not doing manual testing
• Focussed on exploratory testing +
maintaining automated acceptance test
cases
23. @sudiptal
AGILE TESTING QUADRANTS
Diagram invented by Brian Marick
Unless you do TDD,
test automation post
deployment is
expensive and hard
Cannot automate
this stuff! You need
people…
Should be doing this from
the beginning.
These things are testing
the architecture.
You need to know if you
have this right
24. @sudiptal
TEST DATA MANAGEMENT
• Suffers from low focus (compared to Test Automation)
• Need adequate test data + ability to create test data on demand
• Choose low volume test data combinations that cover large volume
scenarios
– Avoid loading/unloading of DB dumps
– Don’t use production data dumps (except for Staging and Performance)
– Start from a clean DB and programmatically create data using the application APIs
=> Works nicely with an automation led strategy
26. @sudiptal
ACCEPTANCE TESTING
• Writing “good” acceptance test
cases is hard
• It’s “good” IF you have
confidence in the quality of the
software when the Acceptance Test
Cases pass
• If you get a failure, introduce an
automation script at the right level
27. @sudiptal
ACCEPTANCE TEST SUITES
• Very hard to maintain
• Decay over time…
– Just like code! Refactor relentlessly
• Ownership is always the issue
– Not owned by the tester but by the
team!
• Treat test code as Production
• Flaky tests are no good!
– You lose the trust in your existing
Test Suite
– Move them to a different test suite
– Quarantine them till they are
refactored and consistent
• External Systems
– Move to a separate suite
– Parameterize the connections
– Run them before the full acceptance
suite
28. @sudiptal
BROWSER BASED TESTING IS UNRELIABLE
• If you hear “it failed in CI… but it ran manually”, you know there is
a difference between test mechanics and interaction pattern
• If you have AJAX based testing where tests need to wait for a
response from the server, then you will run into issues
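One common mitigation for this timing problem, offered here as an assumption rather than something the slides prescribe, is to replace fixed sleeps with an explicit polling wait. The helper name `wait_until` and its parameters are hypothetical.

```python
import time

def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)
```

A test would pass in a callable that checks for the server's response (for example, that an element has appeared), so the test waits only as long as it needs to and fails with a clear timeout otherwise.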
29. @sudiptal
PERSONA BASED USER-JOURNEYS
• Persona based user-journeys
– Extract them from existing Acceptance Test Cases
– Move to server side testing (away from browser based UI testing)
– Journey should only cover the most likely path
• Extract negative/edge scenarios to a separate suite to run after the journey scenarios
are done
32. @sudiptal
CONWAY’S LAW
• Organizations which design systems are constrained to produce
designs which are copies of the communication structures of these
organizations
• If you don’t want your product to look like your organization,
change your organization or change your product
Rebecca Parsons
34. @sudiptal
AMAZON’S DIRECTIVE TO ITS TEAMS…
• All teams will henceforth expose their data and functionality through service
interfaces.
• Teams must communicate with each other through these interfaces.
• There will be no other form of interprocess communication allowed: no direct linking,
no direct reads of another team's data store, no shared-memory model, no back-doors
whatsoever. The only communication allowed is via service interface calls over
the network.
• It doesn't matter what technology they use. HTTP, Corba, Pubsub, custom protocols --
doesn't matter. Bezos doesn't care.
• All service interfaces, without exception, must be designed from the ground up to be
externalizable. That is to say, the team must plan and design to be able to expose the
interface to developers in the outside world. No exceptions.
• Anyone who doesn't do this will be fired.
https://plus.google.com/+RipRowan/posts/eVeouesvaVX
35. @sudiptal
YOU BUILD IT, YOU RUN IT!
“… Giving developers operational responsibilities has greatly enhanced
the quality of the services, both from a customer and a technology point
of view. The traditional model is that you take your software to the wall
that separates development and operations, and throw it over and then
forget about it. Not at Amazon. You build it, you run it. This brings
developers into contact with the day-to-day operation of their software. It
also brings them into day-to-day contact with the customer. This customer
feedback loop is essential for improving the quality of the service.”
Werner Vogels, CTO, Amazon
June 2006
http://queue.acm.org/detail.cfm?id=1142065
36. @sudiptal
ARCHITECTING FOR
REMOTE APPLICATIONS
• Circuit Breakers: handle remote calls
• Wrap the function call in a circuit breaker object
– Once the failures reach a threshold, it trips
– Further calls return with an error, without calling the
function
• Monitor and alert when it trips; have the breaker
itself detect when it’s ready again
https://martinfowler.com/bliki/CircuitBreaker.html
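The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not a production implementation: the class and parameter names (`CircuitBreaker`, `failure_threshold`, `reset_timeout`, `clock`) are hypothetical, and the "ready again" detection is modelled as a simple timeout after which one trial call is allowed.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker is open and the wrapped call is not attempted."""

class CircuitBreaker:
    def __init__(self, func, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.func = func
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Tripped: return an error without calling the function.
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = self.func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the breaker
            raise
        # Success: close the breaker and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Monitoring would hook into the point where `opened_at` is set, since a trip is exactly the event worth alerting on.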
37. @sudiptal
ARCHITECTING FOR
LEGACY APPLICATIONS
• Strangler Applications
• Start by building new functionality in
new modules, using SOA
– Don’t rewrite existing code except to
simplify it or remove bugs
– If you need to extend, write wrappers
• Deliver fast
https://www.martinfowler.com/bliki/StranglerApplication.html
38. @sudiptal
STRANGLER APPLICATION
• Benefits:
– Reduced risk.
– Give value steadily
– Frequent releases allow you to
monitor its progress more
carefully
– You can avoid a lot of the
unnecessary features that cut-over
rewrites often generate
• For new applications:
– All new applications today will
be legacy tomorrow!
– When designing a new
application, design it in such a
way as to make it easier for it
to be strangled in the future
39. @sudiptal
ARCHITECTED FOR RECOVERY
• Even high-performing organizations report failure rates of up to 15%
– However, they can recover in <1hr
• Applications have to be designed for recovery
– Strategies could vary from the application layer to the DB layer
• If the build is broken, roll back first! You don’t need to stay up all night or
late evening to fix it.
– Take time to fix it. Fixes made in a hurry create more problems/technical debt!
40. @sudiptal
OPTIMISE FOR MTRS (MEAN TIME TO RESTORE SERVICE)
• Think Lead Time!
– How quickly can I detect?
– How quickly can I find the
cause?
– How quickly can I fix the
problem?
– How quickly can I roll out the
fix?
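The four questions above map onto timestamps that can be recorded per incident. A small sketch, with hypothetical stage names and a made-up incident, showing how the time to restore breaks down by stage:

```python
from datetime import datetime

# Hypothetical stage names, in the order the questions above are asked.
STAGES = ["failed", "detected", "diagnosed", "fixed", "rolled_out"]

def stage_durations(incident: dict) -> dict:
    """Time spent between consecutive stages of one incident."""
    return {
        f"{a}->{b}": incident[b] - incident[a]
        for a, b in zip(STAGES, STAGES[1:])
    }

# Made-up incident timestamps, for illustration only.
incident = {
    "failed": datetime(2017, 3, 1, 10, 0),
    "detected": datetime(2017, 3, 1, 10, 5),
    "diagnosed": datetime(2017, 3, 1, 10, 35),
    "fixed": datetime(2017, 3, 1, 10, 50),
    "rolled_out": datetime(2017, 3, 1, 11, 0),
}

durations = stage_durations(incident)
time_to_restore = incident["rolled_out"] - incident["failed"]
slowest_stage = max(durations, key=durations.get)
```

Looking at the slowest stage, rather than only the total, indicates where to invest: detection, diagnosis, the fix itself, or the rollout.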
41. 3. CONTINUOUS
INTEGRATION
INTEGRATE EARLY AND OFTEN
43. @sudiptal
ESSENTIALS FOR CI
• Maintain a Single Source
Repository.
• Automate the Build
• Make Your Build Self-Testing
• Everyone Commits To the
Mainline Every Day
• Every Commit Should Build the
Mainline on an Integration
Machine
• Fix Broken Builds Immediately
• Keep the Build Fast
• Test in a Clone of the
Production Environment
• Make it Easy for Anyone to Get
the Latest Executable
• Everyone can see what's
happening
• Automate Deployment
https://martinfowler.com/articles/continuousIntegration.html#EveryoneCommitsToTheMainlineEveryDay
44. @sudiptal
FEATURE BRANCHES + CI
[Figure: Feature Branch versus Continuous Integration]
45. @sudiptal
AWAY FROM (LONG-LIVED) FEATURE
BRANCHES
• Most teams got driven to Feature
Based Development
– Until the branches explode… and they age!
• The longer you build in your
own branch, the greater the risk of all
sorts of incompatibilities
• Move to:
– Short branches (less than a day)
– Less than 3 active branches
– Merge to trunk/master on a daily
basis
• Emphasis on main line
development
46. @sudiptal
SAY NO TO “LONG-LIVED” BRANCHES
• Rarely needed; its value
diminishes dramatically over
time
• Avoid “environment(itis)”
[Chart: Real/Potential Testing Value plotted against Age of Branch]
48. @sudiptal
CATEGORIES OF TOGGLES
• Release Toggles: allow incomplete and untested code paths to be shipped to
production as latent code, which might never be turned on!
• Experiment Toggles: used to perform A/B testing
• Ops Toggles: control operational aspects of the application, e.g., turning
down load-intensive processing when there is high transaction load
• Permissioning Toggles: change the features that certain users receive, e.g., for a
set of internal users (“Champagne Brunch”)
49. @sudiptal
CATEGORIES OF TOGGLES
• Static toggles OR Short
longevity Toggles would need
a simple on/off configuration
• Dynamic or High Longevity
Toggles need sophisticated
“Toggle Routers”
https://martinfowler.com/articles/feature-toggles.html
50. @sudiptal
TOGGLE ROUTING
• Prefer static routes that are
baked into the source code via
configuration
– All the benefits of infrastructure as code
– Simpler testing
• Dynamic Toggle Routing
patterns
– Hard coded toggle configuration
– Parameterised toggle
configuration (command line or
env variables)
– Toggle Configuration file
– Toggle via App DB
– Overriding configuration
• Per request overrides
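A rough sketch of how several of these configuration sources might be layered: hard-coded defaults, then environment variables, then per-request overrides. The toggle names, the `TOGGLE_` environment-variable prefix, and the `is_enabled` helper are all hypothetical.

```python
import os

# Hard-coded defaults (hypothetical toggle names).
DEFAULTS = {"new-checkout-flow": False, "beta-search": True}

def _env_name(toggle: str) -> str:
    """Map a toggle name to an assumed environment variable name."""
    return "TOGGLE_" + toggle.upper().replace("-", "_")

def is_enabled(toggle, request_overrides=None, environ=None):
    """Resolve a toggle: per-request override, then env variable, then default."""
    environ = os.environ if environ is None else environ
    if request_overrides and toggle in request_overrides:
        return bool(request_overrides[toggle])
    raw = environ.get(_env_name(toggle))
    if raw is not None:
        return raw.strip().lower() in ("1", "true", "on", "yes")
    return DEFAULTS[toggle]
```

The more dynamic the source, the later it is consulted last-wins style here; a team preferring static routes would simply stop at `DEFAULTS`.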
51. @sudiptal
FEATURE TOGGLES ADD TESTING
COMPLEXITY
• Both options need to be tested
for each Toggle!
• (Might) explode with multiple
toggle options!
– Use Toggle Configuration files
– Add meta-data to track
audit/governance information for
that toggle
– In general, there's no need to test
all combinations of features.
• For release toggles, test 2
combinations
– All toggles on that are expected
to be on in the next release
– All toggles on
• Build an ability to generate the
listing of all the “active” toggles
at runtime
52. @sudiptal
TOGGLES COME AT A COST!
• View Feature Toggles as inventory
– There is a carrying cost; keep this inventory as low as possible.
• Team must be proactive in removing feature toggles; retire them once the
pending features have bedded in on Production
– Add a toggle removal task onto backlog whenever it is introduced
– Put "expiration dates" on toggles.
• Create "time bombs" that will fail a test (or even refuse to start an application!) if a
toggle is still around after its expiration date
– Apply a Lean approach by placing a limit on the number of toggles
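The expiration-date and toggle-limit ideas can be sketched as a check that runs in the test suite or at application start-up. The registry contents, the `MAX_TOGGLES` value, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical toggle registry: name -> expiration date.
TOGGLE_EXPIRY = {
    "new-checkout-flow": date(2017, 6, 30),
    "beta-search": date(2017, 9, 30),
}
MAX_TOGGLES = 5  # hypothetical team limit on toggle inventory

def expired_toggles(registry, today):
    """Names of toggles still registered after their expiration date."""
    return sorted(name for name, expires in registry.items() if today > expires)

def check_toggles(registry, today, limit=MAX_TOGGLES):
    """Raise if the inventory is over the limit or contains expired toggles."""
    if len(registry) > limit:
        raise RuntimeError(f"{len(registry)} toggles registered; limit is {limit}")
    stale = expired_toggles(registry, today)
    if stale:
        raise RuntimeError("expired toggles still present: " + ", ".join(stale))
```

Calling `check_toggles(TOGGLE_EXPIRY, date.today())` from a unit test gives the "fail a test" variant; calling it during start-up gives the stricter "refuse to start" variant.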
53. 4. DEPLOYMENT
PIPELINE
DEPLOYMENT IS THE FINAL STAGE
OF CI
55. LOW RISK RELEASE
PATTERNS
YOU HAVE TO ARCHITECT FOR THESE!
56. @sudiptal
DEPLOYING DB CHANGES
• If you want to
• Then
– Make the DB change in the 1st release by adding the incremental fields
– The UI does a conditional read; if new field is blank, read from the old field
– The UI always writes to both fields.
– Now, introduce the new feature... if it fails, you are ready to roll back immediately.
– Much later, delete the old field (called the "contract" phase).
[Diagram: fields Address, Address1, Address2]
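The steps above can be sketched with an in-memory SQLite database. This assumes, based only on the field names shown, that a single `address` field is being replaced by `address1` and `address2`; the table name, sample data, and helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT)")
conn.execute("INSERT INTO customer (id, address) VALUES (1, '12 Main St, Pune')")

# 1st release: add the new fields; the old field stays in place.
conn.execute("ALTER TABLE customer ADD COLUMN address1 TEXT")
conn.execute("ALTER TABLE customer ADD COLUMN address2 TEXT")

def read_address(customer_id):
    """Conditional read: use the new fields if populated, else the old field."""
    address, address1, address2 = conn.execute(
        "SELECT address, address1, address2 FROM customer WHERE id = ?",
        (customer_id,),
    ).fetchone()
    if address1:
        return ", ".join(part for part in (address1, address2) if part)
    return address

def write_address(customer_id, address1, address2):
    """Always write to both the old and the new fields."""
    combined = ", ".join(part for part in (address1, address2) if part)
    conn.execute(
        "UPDATE customer SET address = ?, address1 = ?, address2 = ? WHERE id = ?",
        (combined, address1, address2, customer_id),
    )
```

Because the old field is kept up to date on every write, rolling the application back to the previous release needs no data repair; the old column is dropped only in the later clean-up step.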
57. @sudiptal
CI FOR DB
• Data is persistent
• Often, large datasets; rollback
is not an option
– Some changes are irreversible
• Make DDL/DML scripts part
of version control
– In test environment, build DB
from scratch
– Then, run acceptance test cases
• Scripts are ordered; they run
in a sequence
58. @sudiptal
DEPLOY ON PRODUCTION
• Apply the same scripts on
Production
• Incremental scripts to be pulled
from version control
• For each script, build the rollback
script (as far as possible)
• Check dbdeploy.com
• Maintain a metadata table that
indicates what scripts have been run
on the database
• Log the success/failure of each script
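A minimal sketch of the metadata-table idea, using SQLite for illustration. It is not dbdeploy's implementation: the `changelog` table, the script names, and `apply_pending` are hypothetical stand-ins for ordered scripts pulled from version control.

```python
import sqlite3

# Hypothetical ordered change scripts, as they might be pulled from version control.
SCRIPTS = [
    ("001_create_customer.sql", "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_add_email.sql", "ALTER TABLE customer ADD COLUMN email TEXT"),
]

def apply_pending(conn, scripts):
    """Run, in order, every script not yet recorded as successful in the changelog."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (script TEXT PRIMARY KEY, status TEXT)"
    )
    done = {row[0] for row in conn.execute(
        "SELECT script FROM changelog WHERE status = 'success'")}
    applied = []
    for name, sql in sorted(scripts):
        if name in done:
            continue
        try:
            conn.execute(sql)
            status = "success"
        except sqlite3.Error:
            status = "failure"
        # Record the outcome so the table shows what has been run on this database.
        conn.execute(
            "INSERT OR REPLACE INTO changelog (script, status) VALUES (?, ?)",
            (name, status),
        )
        conn.commit()
        if status == "failure":
            raise RuntimeError(f"{name} failed; later scripts not applied")
        applied.append(name)
    return applied
```

Running the same function against test and production databases applies only what each one is missing, which is what makes building a test DB from scratch and upgrading production use the same scripts.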
59. @sudiptal
BLUE-GREEN DEPLOYMENTS
• Deployments often become a batch operation
waiting for the next “good” time
• Solution: Blue Green deployments
• Minimizes cutover time from one version to the next;
Fast rollback if needed
• Use the “other” as staging environment
• Multiple approaches to handle “live”
transactions on the earlier system
https://martinfowler.com/bliki/BlueGreenDeployment.html
• For Database Changes:
– Separate from the application rollout
– Use the earlier pattern
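A toy model of the blue-green switch, to make the cutover and rollback mechanics concrete. It only tracks which environment is live; the class name, version strings, and smoke-test callback are hypothetical, and real routing would happen in a load balancer or router.

```python
class BlueGreenRouter:
    """Toy model of a router that sends all live traffic to one of two environments."""

    def __init__(self, blue_version, green_version=None):
        self.versions = {"blue": blue_version, "green": green_version}
        self.live = "blue"

    @property
    def idle(self):
        """The environment that is not currently serving live traffic."""
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        """Install the new version on the idle side, which doubles as staging."""
        self.versions[self.idle] = version

    def cut_over(self, smoke_test):
        """Switch traffic to the idle side only if it passes the smoke test."""
        candidate = self.idle
        if not smoke_test(self.versions[candidate]):
            return False
        self.live = candidate
        return True

    def rollback(self):
        """Fast rollback: point traffic back at the previous environment."""
        self.live = self.idle
```

Because the previous version keeps running on the other side, rollback is a routing change rather than a redeployment.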
60. @sudiptal
RELEASE != DEPLOYMENT
• Deployment might release to all environments at the same time.
• Release process controls who sees what
– In FB, this person is called “Gatekeeper”
63. @sudiptal
DEFINE THE ENVIRONMENT
• What is it?
• What to use it for?
• How long to retain that environment?
• All environments are an “approximation” of the Production
– Hence, it is only good for a purpose – not good for any other purpose
– That defines the “What to use it for?”
64. @sudiptal
INFRASTRUCTURE
• All environments and supporting
services
– Networking, Storage, Mail, DNS…
• Desired state in version control
• Self corrects to the desired state
(Autonomic)
• State should be known via continuous
monitoring
• Protect from “Configuration Drift”
– Ad hoc changes to the system that go
unrecorded
– Test yourself like a “Fire Drill”
• Solution:
– Use Software that automatically syncs
with a “baseline”
– Limited to the extent that you have
artefacts under version control
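A small sketch of drift detection against a version-controlled baseline. The setting names and values are hypothetical, and real tools compare far richer state; the point is only the compare-and-converge loop.

```python
# Hypothetical desired state, as it would be kept in version control.
DESIRED = {
    "nginx.worker_processes": "4",
    "ntp.server": "time.example.org",
    "ssh.port": "22",
}

def detect_drift(desired, actual):
    """Return {key: (desired, actual)} for every setting that differs from the baseline."""
    return {
        key: (want, actual.get(key))
        for key, want in desired.items()
        if actual.get(key) != want
    }

def converge(desired, actual):
    """Reset drifted settings to the baseline; returns the corrected copy."""
    corrected = dict(actual)
    corrected.update({key: desired[key] for key in detect_drift(desired, actual)})
    return corrected
```

Settings that are absent from `DESIRED` are neither detected nor corrected, which mirrors the limitation noted above: the sync is only as complete as what is under version control.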
65. @sudiptal
THE TALE OF 2 COMPANIES…
http://radar.oreilly.com/2007/10/operations-is-a-competitive-ad.html
66. @sudiptal
“…it takes about 80 hours to bootstrap a startup. This generally means
installing and configuring an automated infrastructure management system
(puppet), version control system (subversion), continuous build and test
(frequently cruisecontrol.rb), software deployment (capistrano),
monitoring (currently evaluating Hyperic, Zenoss, and Groundwork). Once
this is done the “install time” is reduced to nearly zero and requires no
specialized knowledge.”
Jesse Robbins
(ex) Master of Disaster, Amazon
Founder of Chef
67. @sudiptal
THE NETFLIX SIMIAN ARMY
https://insights.sei.cmu.edu/devops/2015/04/devops-case-study-netflix-and-the-chaos-monkey.html
68. @sudiptal
AMAZON GAME DAYS
• Inject failures into critical
systems
• Discover flaws and critical
dependencies
• Accept that a reliable software
platform is built on top of
components that are unreliable
• Need to keep testing services
against failure all year round
• Fail systems whose recovery
brings together people who
otherwise don’t interact with
each other
69. @sudiptal
RESILIENCE ENGINEERING:
A FUNDAMENTAL CULTURAL SHIFT
• From a steadfast belief that
systems should never fail—and if
they do, focusing on who's to
blame—to actually forcing systems
to fail
• Rather than expending resources
on building systems that don't fail,
the emphasis is on how to deal
with systems swiftly and expertly
once they do fail—because fail they
will.
• Much of the value comes from
changing the collective mindset of
the engineers who design and build
– It's not easy to watch their systems
fail and see the consequences
– Over time, they gain confidence in the
systems and practices
• It invokes a more just culture in
which people can be held
accountable without being blamed,
or punished, for failure.
70. SOME OTHER
THOUGHTS…
… THAT I COULDN’T FIT ANYWHERE
ELSE!
71. @sudiptal
PRACTICES
• Developers should be able to run acceptance tests on their
environments
• Virtualize all environments
• If anything fails, stop the line!
72. @sudiptal
PITFALLS
• Configuration(itis)!
– Once you start automating your
configurations, you will see an
explosion in their number
– For every small incremental functional
OR NFT, you will get a request for a
new configuration… and then, these
will stay
– Set some STANDARDS and
governance around this (similar to
temporary branching)!
• What should be a standard env for
Test, Staging, including test data
• Ops people are generally automation
savvy
– Don’t get too obsessed with frameworks,
trying to make it too generic
– Don’t build a huge framework with tons
of scripts and tools! Don’t have hundreds
of metadata config points that become a
nightmare to manage
• Don’t create Configuration
Management “Sherpas” whose only job
is to manage and track configuration
– Avoid your Brent!
73. @sudiptal
PITFALLS
• Change in a “stealth” mode:
– Just start by saying that you are automating
what you do today; one organization called
it “Rapid Release”
– Don’t lead with the message “we will change
everything”
– Fast delivery in a safe way keeping all the
gray suits/ITIL happy!
– Focus on better work-life balance for
everyone
• If you are starting your DevOps
initiative with tools, you are almost
certain to fail
• Event and Alert monitoring:
– With an explosion of tools and
environments, it will be a nightmare to
analyse and track all that is happening
– Get all feeds into one activity stream like
Slack
– Build parsers to filter out what you really
want to see
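A minimal sketch of such a filter over a merged activity feed. The rules and the sample event strings are hypothetical; a real setup would tune them to its own tools and channels.

```python
import re

# Hypothetical rules: which events deserve attention in the shared activity stream.
RULES = [
    re.compile(r"\b(ERROR|CRITICAL)\b"),
    re.compile(r"deploy(ment)? (failed|rolled back)", re.IGNORECASE),
]

def worth_showing(event: str) -> bool:
    """True if the event matches at least one rule."""
    return any(rule.search(event) for rule in RULES)

def filter_feed(events):
    """Keep only events that match at least one rule, preserving their order."""
    return [event for event in events if worth_showing(event)]
```

Keeping the rules as data makes it easy to review and prune them, so the filter itself does not become another unmanaged pile of configuration.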
74. IN CLOSING…