© 2014 IBM Corporation
Session: 2427
What the Cool Kids are Doing
with DevOps
Bill Holtshouser
Senior Strategist, Mobile, DevOps, Cloud
IBM Rational
Please note…
IBM’s statements regarding its plans, directions, and intent are subject to change
or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general
product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment,
promise, or legal obligation to deliver any material, code or functionality.
Information about potential future products may not be incorporated into any
contract. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance
that any user will experience will vary depending upon many factors, including
considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results
similar to those stated here.
Introduction
• This session is based on an examination of a series of “born on the
web” companies to see what common patterns and other learnings can
be derived from their DevOps journeys, with the goal of extracting
guidance for IBM’s clients
• We used only publicly available information such as published
conference presentations, company blogs, videos, news stories and
white papers
• Important: Everything here is strictly our opinion; none of the
companies mentioned reviewed or endorsed these opinions in any way!
Key Takeaways
• “Born on the Web” startups like Etsy, Netflix and others have been
leaders in applying a DevOps approach to SW development and delivery
– but they are essentially built from the ground up to do so
• These companies display numerous common DevOps-related traits in
the areas of Culture, Organization, Practices, Automation and
Measurements
• Although your enterprise won’t be able to replicate all aspects of these
“cool kid” companies and how they have applied DevOps (nor should
you even try), there are some important learnings from them that
can inform your own DevOps approach
Does this story sound familiar?
One way to address the issue…
Believe it or not, Dev and Ops weren’t always separate
“Back in the dawn of the computer
age, there was no distinction between
dev and ops. If you developed, you
operated. You mounted the tapes, you
flipped the switches on the front panel,
you rebooted when things crashed, and
possibly even replaced the burned out
vacuum tubes. And you got to wear a
geeky white lab coat…”
“Dev and ops started to separate in the
‘60s, when programmers dumped boxes of
punch cards into readers and “computer
operators” scurried around mounting tapes
in response to IBM JCL. The operators also
pulled printouts from line printers and put
them in labeled cubbyholes, where you got
your output filed under your last name.”
– John Allspaw, Etsy
So…just who are these “Cool Kids” anyway?
Sidebar: Continuous Delivery is more than just “fast
Continuous Integration”
Continuous Delivery
• Websites, SaaS offerings
• Multiple pushes to production per day
• Highly decoupled, independent feature sets
• Single image/single stream
• New practices and patterns
Continuous Integration
• Traditional applications, appliances, mobile apps, Web APIs
• Delivery to production every few days to weeks
• Coordinated releases, multiple version streams
• Established Agile practices
Continuous Engineering
• Complex embedded systems
• Complex product release and update cycles
• Management of variants and versions
• Engineering practices
Five essential elements of “Cool Kids” DevOps success
• Culture
• Organization
• Practices
• Automation
• Measurement
Cool Kids and Culture – key learnings
• Trust leads to an acceptance of “reasonable” risk
– Organization, tools, automation and instrumentation can all reduce risk
• Risk = PROBABILITY of Error x COST of Error
– Not all risks are created equal; zero risk is unattainable
– Cost depends on Time to Fix
• Learning from mistakes > blame
– …but there is still Karma: repeated mistakes may lead to loss of privilege
At Etsy, employees have a high degree of creative freedom and, when things go wrong,
accountability without blame. “We actually trust people,” CTO Chad Dickerson says. He
calls the approach a “radical decentralization of authority.” – Inc. Magazine, 12/13
• ALL exhibit a high degree of delegation
– …which leads to velocity
• In order to delegate, the Cool Kids trust… but verify
– E.g. via instrumentation, measurement
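The risk formula on this slide (Risk = probability of error x cost of error, with cost depending on time to fix) can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the probabilities, cost per minute and time-to-fix figures are invented for the example rather than taken from any of the companies discussed.

```python
def cost_of_error(minutes_to_fix: float, cost_per_minute: float) -> float:
    """Cost of an error, modeled simply as time to fix times cost per minute."""
    return minutes_to_fix * cost_per_minute


def risk(probability_of_error: float, cost: float) -> float:
    """Risk = probability of error x cost of error."""
    if not 0.0 <= probability_of_error <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability_of_error * cost


# Hypothetical numbers, chosen only to illustrate the trade-off.
# A large, infrequent release: less likely to fail, but slow to fix.
big_release = risk(0.25, cost_of_error(minutes_to_fix=240, cost_per_minute=100))

# A small change behind fast rollback: more likely to fail, but quick to fix.
small_release = risk(0.5, cost_of_error(minutes_to_fix=4, cost_per_minute=100))

print(big_release, small_release)
```

Under these assumed figures the small change carries far less risk even though it is twice as likely to fail, because reducing time to fix shrinks the cost term.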
Re-defining the attitude towards “failure”
• Netflix allows failure to happen continuously and wants its SW to be able to deal with it; in fact, it takes steps to deliberately induce errors (Simian Army)
• In reality, it looks at “failure” as simply another STEP in the SW development process
http://techblog.netflix.com/2011/07/netflix-simian-army.html
Cool Kids and Culture – more learnings
• Adopt an “Ops First” design mentality
– Don’t build what you can’t manage
• Recognize the importance of build
– They don’t just give the build system to the “worst programmer” or newest hire, but establish a focused role
Bottom line: a culture of trust is required
Rapid delivery requires low risk
• Small feature sets
• Independent services
• Progressive exposure
• Rapid feedback
• Reliable rollback
• High delegation & trust
Risk = Probability of error x Cost of error
Adrian Cockcroft of Netflix on Culture
“Culture is very hard to create or modify but easy to destroy.
This is because everyone has to buy into it for it to be effective,
and then every manager has to hire only people who are
compatible with the culture, and also get rid of people who turn
out not to fit in, even if they are doing good work.
So the short answer is: start a new company from scratch
with the culture you want, and pay a lot of attention to who
you hire. I don't think it is possible to do a culture shift if
there are more than a roomful of people involved.
Even with a roadmap and a guide, you probably won't be able
to follow this path if you are in a large established company.
Your existing culture won't let you.”
http://perfcap.blogspot.com/2012/03/ops-devops-and-noops-at-netflix.html
Organization follows Culture
Traditional Culture
– “My priority is to deliver code… fast.”
– “My priority is to keep the site up and running.”
DevOps Culture
– “We’re all on the same team! Want some pizza?”
• Conway’s Law (you build what you are) applies
– …also applies to how you’re organized
• Feature teams, not platform teams
– Small teams: “two pizza” rule
• Organize for an “end-to-end” responsibility for delivery
– Positive approach to fixing mistakes – learning, not “blame and shame”
• Many common patterns are seen in QA…
– Shared responsibility across a team, everybody does QA, or co-located QA
– Small Quality Engineering CoE team provides common tools/practices
– But NOT a separate/antagonistic QA org (“clean up your own mess”)
• Small DevOps “toolsmith” teams
– A.K.A. Systems Release Engineering
– Provide common tools & processes for automation, logging, monitoring…
– There to help, NOT to do it for you
• Finally - no “throwing it over the wall”…
…basically, you need to be getting away from this
Practices that “make perfect” for the Cool Kids
Practices
• “Light” planning and specs
– Etsy does high-level planning in 60-day chunks and two-week periods; specs are kept very light, no more than what is required
• Cut the cord with traditional release process
– Developers coordinate and drive the release of their own
code without need for a centralized release cycle
– Netflix goes farther than most: “NoOps”
• Speed, speed, speed
– It’s all about rapid deployment; some deploy updates to their site 25x per day
• Progressive rollout of new features, “dark” releases
– “Config flags”: new features are deployed but not yet enabled, then launched with a simple switch in the code
• They talk about it…a LOT
– Lots of internal and external forums / blogs among the Cool Kids
– Example: Etsy “Code as Craft” site www.codeascraft.com
Practices: a single image simplifies things
• Most of these companies manage a single production image that they completely control
– They don’t have to worry about shipping releases to customers who might or might not install those releases
• …therefore there are no branches in their version control – everything is checked into the trunk
Practices: “Continuous Delivery Heresy” (Yes, you can do too much testing)
• Testing everything on every check-in is good…but it isn’t the endgame
– LinkedIn has only a few thousand unit tests
• Testing in a non-production environment can reach a point of diminishing returns
– Ever-growing lists of unit tests, often testing very obscure scenarios, often overlapping and redundant
– Limited by your ability to predict real world scenarios
• LinkedIn practice: get to production environment as soon as practical
– Progressive rollout minimizes the risk when deploying to production…
Practices: Progressive Rollout
• Progressive rollout of new features, “dark” releases:
– Deploy to one server with all features disabled to ensure no performance or resource regressions (also known as “canarying”)
– Turn on features for a small population, and measure (“smoke test”)
– Turn it on for up to 1% of users, and measure
– Progressively roll out to all servers, continuing to measure
– Config Flags (also known as feature flags or gatekeepers [LinkedIn]) control which users see which features
• In order to successfully do Progressive Rollout, you’ll need two more of our five essential elements:
– Automation, both to progressively roll out and to roll back if a problem is discovered
– Measurement (tied to Instrumentation), in order to be able to rapidly measure the impact
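The rollout steps on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any of the companies mentioned implement it: the flag store is a plain dictionary, the stage percentages in `ROLLOUT_STAGES` are invented, and `bucket`, `is_enabled` and `next_stage` are hypothetical helper names. Users are hashed into stable buckets so that someone who sees a feature at 1% still sees it at 10%.

```python
import hashlib

# Hypothetical rollout stages, as percentages of users.
ROLLOUT_STAGES = [0.0, 1.0, 10.0, 50.0, 100.0]


def bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in [0, 10000) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 10_000


def is_enabled(flags: dict, feature: str, user_id: str) -> bool:
    """Return True if the feature is switched on for this user."""
    percent = flags.get(feature, 0.0)  # unknown flags stay "dark"
    return bucket(user_id, feature) < percent * 100


def next_stage(current: float, error_rate: float, max_error_rate: float) -> float:
    """Advance one stage if the measured error rate is healthy, else roll back to 0%.

    `current` must be one of ROLLOUT_STAGES.
    """
    if error_rate > max_error_rate:
        return 0.0
    idx = ROLLOUT_STAGES.index(current)
    return ROLLOUT_STAGES[min(idx + 1, len(ROLLOUT_STAGES) - 1)]


flags = {"new_checkout": 1.0}  # feature name is made up for the example
print(is_enabled(flags, "new_checkout", "user-42"))
flags["new_checkout"] = next_stage(1.0, error_rate=0.001, max_error_rate=0.01)
```

The rollback branch is the important design choice: the same measured signal that permits the next stage also switches the feature off, which is why the slide pairs progressive rollout with automation and measurement.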
Progressive Rollout console at Facebook
Practices: Fire When Ready!
• These companies tend to avoid “release-defining features” that can hold up the entire release
• Cool Kids pattern: release features when they are ready - the release train waits for nobody
– Also known as date-based releases - the date of release is fixed, but the features in that release are flexible
• For this to work, you must respect forward and backwards compatibility of API (service) interfaces
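One common way to respect forward and backward compatibility between independently released services is a “tolerant reader”: ignore fields you do not know and default fields that are missing. The sketch below assumes a made-up `Order` message and a hypothetical `parse_order` helper; it is not drawn from any of the companies discussed.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    amount_cents: int
    # Assumed to have been added in a later version; the default keeps
    # older senders, which never set it, working.
    currency: str = "USD"


def parse_order(payload: dict) -> Order:
    """Read only the fields this version knows about.

    Unknown fields are ignored, so newer senders do not break this reader.
    Missing optional fields fall back to defaults, so older senders still work.
    """
    return Order(
        order_id=str(payload["order_id"]),
        amount_cents=int(payload["amount_cents"]),
        currency=str(payload.get("currency", "USD")),
    )


old_sender = {"order_id": "A1", "amount_cents": 1250}
new_sender = {"order_id": "A2", "amount_cents": 900, "currency": "EUR", "gift_wrap": True}

print(parse_order(old_sender))
print(parse_order(new_sender))
```

With readers written this way, a service on the release train can ship a new field without waiting for every consumer to upgrade first.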
Cool Kids and Automation
• In general, the Cool Kids automate as much as possible
– Etsy has invested a lot in automated unit / functional testing, dev tooling and monitoring, use of dashboards
– Netflix has a heavy degree of automation across the board
• Automate even the infrastructure, but keep it simple
– LinkedIn, Flickr and Netflix generally build up their infrastructure from just a single OS image
– From here, configure individual servers using automated scripts driven by tool of choice (e.g. IBM UrbanCode)
– Also commonly seen was use of “Phoenix” servers (vs. “Snowflakes”), which can be re-built at any time then “burned to the ground” if needed
• … but only automate what can be measured
Think you don’t need to keep an eye on automation?
http://windowsitpro.com/windows-7/aggressive-configmgr-based-windows-7-deployment-takes-down-emory-university
“During TechEd 2014, the Emory University IT department prepared and deployed
Windows 7 upgrades to the campuses computers. If you've worked with ConfigMgr
at all, you know that there are checks-and-balances that can be employed to ensure
that only specifically targeted systems will receive an OS upgrade. In Emory
University's case, the check-and-balance method failed and instead of delivering
the upgrade to applicable computers, delivered Windows 7 to ALL computers
including laptops, desktops, and even servers.
I'll stop for a second to let you take that in.
Yes, even servers.
By the time it was realized what exactly had happened, the Windows 7 sequence
had repartitioned, reformatted, and installed Windows 7. Emory IT powered off the
ConfigMgr server, hoping to stop the deployment before it was too late, but – it was
too late. Even the ConfigMgr server had been repartitioned and reformatted…”
– Windows IT Pro, May 19, 2014
Finally: Instrument and Measure
• LinkedIn: “Measurement is better than prediction”
• Provide a common framework to make it easy for developers to
choose what to log simply by tagging or registering it
– “Push” from services works better than “pull” or polling
– In many cases, developers need do no more than push key/value pairs
to a logging system
– LinkedIn collects 500K+ metrics per minute at an average of 400
metrics per service
• Instrument user behaviors to improve the user experience
– Esurance: “we mined the data to figure out what people were doing
most often, make those tasks the most prominent and make them
addressable in as few clicks as possible”
• Metrics dashboards also display deployment activity
– So if there’s a problem, you can easily tie the start time of the issue to
the preceding pushes
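The “push key/value pairs” and “tie the issue to the preceding pushes” ideas above can be sketched as follows. Everything here is an assumption made for illustration: `MetricsSink` is an in-memory stand-in for a real logging or metrics system, and the service names, metric names and timestamps are invented.

```python
import time
from collections import defaultdict


class MetricsSink:
    """In-memory stand-in for a logging/metrics system (hypothetical)."""

    def __init__(self):
        self.samples = defaultdict(list)  # metric name -> [(timestamp, value)]
        self.deployments = []             # [(timestamp, description)]

    def push(self, service, key, value, timestamp=None):
        """Services push key/value pairs; nothing polls them."""
        ts = time.time() if timestamp is None else timestamp
        self.samples[f"{service}.{key}"].append((ts, value))

    def record_deployment(self, description, timestamp=None):
        """Deployment events are stored alongside the metrics."""
        ts = time.time() if timestamp is None else timestamp
        self.deployments.append((ts, description))

    def pushes_before(self, issue_start, window_seconds):
        """Deployments in the window leading up to issue_start, most recent first."""
        hits = [d for d in self.deployments
                if issue_start - window_seconds <= d[0] <= issue_start]
        return sorted(hits, reverse=True)


sink = MetricsSink()
sink.record_deployment("search v41", timestamp=1000)
sink.record_deployment("profile v7", timestamp=1900)
sink.record_deployment("feed v3", timestamp=2500)
sink.push("search", "latency_ms", 120, timestamp=1950)

# An issue is noticed starting at t=2000: which pushes preceded it?
print(sink.pushes_before(issue_start=2000, window_seconds=600))
```

Keeping deployments on the same timeline as the metrics is what makes the dashboard correlation on this slide a simple lookup rather than an investigation.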
Monitoring at LinkedIn
• LinkedIn developed and then open sourced tools for monitoring and graphing data being pushed to its logs…
inGraph, inFormed
So…what are the Cool Kids DevOps takeaways?
Culture
• Cultural change takes time – take reasonable steps
– Team-building, cross-training, improved communication
– Maybe include your Ops team in requirements / feature reviews and planning (e.g. via IBM RRC, RTC)
Organization
• Don’t turn your organization upside down
– Experiment on a few smaller, low-risk projects
– Maybe create a DevOps "center of excellence"
– Tear down walls between teams
Practices
• “Continuous Integration” is a good starting point
– Push all builds to the last stage before release
– Eat your own dog food (get employees involved to test)
– Try progressive rollout or dark release of features
Automation
• Start by automating a few areas whose results you can easily see and track
– E.g. Test / build pipeline, possibly using UrbanCode Deploy
Measurement
• First, assess your current process and consider the changes you want to make – then consider how to measure them
– Instrument and measure anything you intend to automate
• But above all, be honest
– Assess your own DevOps maturity and aspirations – where are you and where do you want to be?
IBM can help: DevOps Adoption Framework delivers measurable outcomes
Enable lean adoption of DevOps capabilities
[Figure: DevOps Adoption Framework. Components: Adoption Model (self-assessments, adoption paths, adoption services); Solutions (practices, tooling, services); Community (stories, enablement, feedback). Captions: Where and How to Get Lean; Expertise and Technologies; Knowledge sharing. Lifecycle phases: Steer, Develop/Test, Deploy, Operate, linked by DevOps Continuous Feedback. Capabilities: Continuous Business Planning, Collaborative Development, Continuous Testing, Continuous Release and Deployment, Continuous Monitoring, Continuous Customer Feedback & Optimization. Progression: from process-based, process-heavy, manual and siloed (inefficient) to product-based, agile, automated, collaborative and optimizing (leaner, then leaner and smarter); more predictable, more transparent, more continuous.]
Where to start: DevOps Adoption Roadmap
Assess desired outcome and supporting practices to drive strategy and rollout
Step 1: What am I trying to achieve? (Business Goal Determination)
• Think through business-level drivers for improvement
• Define measurable goals for your organizational investment
• Look across silos and include key Dev and Ops stakeholders
Step 2: Where am I currently? (Current Practice Assessment)
• What do you measure and currently achieve
• What don’t you measure, but should to improve
• What practices are difficult, incubating, well-scaled
• How do your team members agree with these findings
Step 3: What are my priorities? (Objective & Prioritized Capabilities)
• Start where you are today and where your improvement goals
• Consider changes to People, Practices and Technology
• Prioritize change using goals, complexities and dependencies
Step 4: What new practices should help me grow? (Strategy/Roadmap)
• Understand your appetite for cross-functional change
• Target improvements with the biggest bang for the buck
• Roadmap and agree on an actionable plan
• Use measurable milestones that include early wins
Connect with me on Twitter at @BillHoltshouser or LinkedIn at
www.linkedin.com/pub/bill-holtshouser/4/815/66a/
Acknowledgements and Disclaimers
© Copyright IBM Corporation 2012. All rights reserved.
– U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
IBM, the IBM logo, ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United
States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a
trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is
available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all
countries in which IBM operates.
The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are
provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice
to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is
provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of,
or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the
effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the
applicable license agreement governing the use of IBM software.
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may
have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these
materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific
sales, revenue growth or other results.

What do the "Cool Kids" know about DevOps?

  • 1.
    © 2014 IBMCorporation Session: 2427 What the Cool Kids are Doing with DevOps Bill Holtshouser Senior Strategist, Mobile, DevOps, Cloud IBM Rational
  • 2.
    Please note… IBM’s statementsregarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here. 1
  • 3.
    Introduction • This sessionis based on an examination of a series of “born on the web” companies to see what common patterns and other learnings can be derived from their DevOps journeys, with the goal of extracting guidance for IBM’s clients • We used only publicly available information such as published conference presentations, company blogs, videos, news stories and white papers • Important: Everything here is strictly our opinion; none of the companies mentioned reviewed or endorsed these opinions in any way! 2
  • 4.
    Key Takeaways • “Bornon the Web” startups like Etsy, Netflix and others have been leaders in applying a DevOps approach to SW development and delivery – but they are essentially built from the ground up to do so • These companies display numerous common DevOps-related traits in the areas of Culture, Organization, Practices, Automation and Measurements • Although your enterprise won’t be able to replicate all aspects of these “cool kid” companies and how they have applied DevOps (nor should you even try), there are some important learnings from them that can inform your own DevOps approach 3
  • 5.
    4 Does this storysound familiar?
  • 6.
    One way toaddress the issue… 5
  • 7.
    Believe it ornot, Dev and Ops weren’t always separate “Back in the dawn of the computer age, there was no distinction between dev and ops. If you developed, you operated. You mounted the tapes, you flipped the switches on the front panel, you rebooted when things crashed, and possible even replaced the burned out vacuum tubes. And you got to wear a geeky white lab coat…” “Dev and ops started to separate in the ‘60s, when programmers dumped boxes of punch cards into readers and “computer operators” scurried around mounting tapes in response to IBM JCL. The operators also pulled printouts from line printers and put them in labeled cubbyholes, where you got your output filed under your last name.” – John Alspaw, Etsy 6
  • 8.
    So…just who arethese “Cool Kids” anyway? 7
  • 9.
    Sidebar: Continuous Deliveryis more than just “fast Continuous Integration” Continuous Delivery • Websites, SaaS offerings • Multiple pushes to production per day • Highly decoupled, independent feature sets • Single image/single stream • New practices and patterns Continuous Integration • Traditional applications, appliances, mobile apps, Web APIs • Delivery to production every few days to weeks • Coordinated releases, multiple version streams • Established Agile practices Continuous Engineering • Complex embedded systems • Complex product release and update cycles • Management of variants and versions • Engineering practices 8
  • 10.
    Five essential elementsof “Cool Kids” DevOps success Organization Practices Culture Automation Measure- ment 9
  • 11.
    • Trust leadsto an acceptance of “reasonable” risk – Organization, tools, automation, instrumentation can all reduce risk • Risk = PROBABILITY of Error x COST of Error – Not all risks are created equal; zero risk is unattainable – Cost depends on Time to Fix • Learning from mistakes > blame – …but there is still Karma: repeated mistakes may lead to loss of privilege Cool Kids and Culture - key learnings Culture At Etsy, employees have a high degree of creative freedom and, when things go wrong, accountability without blame. “We actually trust people,” CTO Chad Dickerson says. He calls the approach a “radical decentralization of authority.” – Inc. Magazine, 12/13 1 0 • ALL exhibit a high degree of delegation – …which leads to velocity • In order to delegate, the Cool Kids trust… but verify – E.g. via instrumentation, measurement
  • 12.
    Re-defining the attitudetowards “failure” 11 • NetFlix allows failure to happen continuously, and want their SW to be able to deal with it; in fact they take steps to encourage errors (Simian Army) • In reality they look at “failure” as simply another STEP in the SW development process http://techblog.netflix.com/2011/07/netflix-simian-army.html
  • 13.
    • Adopt an“Ops First” design mentality – Don’t build what you can’t manage • Recognize the importance of build – They don’t just give the build system to the “worst programmer” or newest hire, but establish a focused role Cool Kids and Culture – more learnings Culture 12
  • 14.
    Bottom line: aculture of trust is required 13 Rapid delivery requires low risk Small feature sets Independent services Progressive exposure Rapid feedback Reliable rollback High delegation & trust Risk = Probability of error x Cost of error Culture
  • 15.
    Adrian Cockcroft ofNetflix on Culture “Culture is very hard to create or modify but easy to destroy. This is because everyone has to buy into it for it to be effective, and then every manager has to hire only people who are compatible with the culture, and also get rid of people who turn out not to fit in, even if they are doing good work. So the short answer is: start a new company from scratch with the culture you want, and pay a lot of attention to who you hire. I don't think it is possible to do a culture shift if there are more than a roomful of people involved. Even with a roadmap and a guide, you probably won't be able to follow this path if you are in a large established company. Your existing culture won't let you.” http://perfcap.blogspot.com/2012/03/ops-devops-and-noops-at-netflix.html 14
  • 16.
    Organization follows Culture TraditionalCulture DevOps Culture My priority is to deliver code… fast. My priority is to keep the site up and running. We’re all on the same team! Want some pizza? 15 Organ- ization
  • 17.
    • Conway’s Law(you build what you are) applies – …also applies to how you’re organized • Feature teams, not platform teams – Small teams: “two pizza” rule • Organize for an “end-to-end” responsibility for delivery – Positive approach to fixing mistakes – learning, not “blame and shame” • Many common patterns are seen in QA… – Shared responsibility across a team, everybody does QA, or co-located QA – Small Quality Engineering CoE team provides common tools/practices – But NOT a separate/antagonostic QA org (“clean up your own mess”) • Small DevOps “toolsmith” teams – A.K.A. Systems Release Engineering – Provide common tools & processes for automation, logging, monitoring… – There to help, NOT to do it for you • Finally - no “throwing it over the wall”… Organization follows Culture 16 Organ- ization
  • 18.
    …basically, you needto be getting away from this 17
  • 19.
    Practices that “makeperfect” for the Cool Kids Practices • “Light” planning and specs – Etsy high level planning done in 60 day chunks and two week periods; specs kept very light – no more than what is required • Cut the cord with traditional release process – Developers coordinate and drive the release of their own code without need for a centralized release cycle – Netflix goes farther than most: “NoOps” • Speed, speed, speed – Its all about rapid deployment; some deploy updates to their site 25x per day • Progressive rollout of new features, “dark” releases – Concept of “config flags”, new features there but not yet enabled, then launched with simple switch in the code • They talk about it…a LOT – Lots of internal and external forums / blogs among the Cool Kids – Example: Etsy “Code as Craft” site www.codeasdraft.com 18
  • 20.
    • Most ofthese companies manage a single production image that they completely control – The don’t have to worry about shipping releases to customers who might or might not install those releases • …therefore there are no branches in their version control – everything is checked into the trunk Practices: a single image simplifies things Practices 19
  • 21.
    • Testing everythingon every check-in is good…but it isn’t the endgame – LinkedIn has only a few thousand unit tests • Testing in a non-production environment can reach a point of diminishing returns – Ever-growing lists of unit tests, often testing very obscure scenarios, often overlapping and redundant – Limited by your ability to predict real world scenarios • LinkedIn practice: get to production environment as soon as practical – Progressive rollout minimizes the risk when deploying to production… Practices: “Continuous Delivery Heresy” (Yes, you can do too much testing) Practices 20
  • 22.
    • Progressive rolloutof new features, “dark” releases: – Deploy to one server with all features disabled to ensure no performance or resource regressions (also known as “canarying”) – Turn on features for a small population, and measure (“smoke test”) – Turn it on for up to 1% of users, and measure – Progressively roll out to all servers, continuing to measure – Config Flags (also known as feature flags or gatekeepers [LinkedIn]) control which users see which features • In order to successfully do Progressive Rollout, you’ll need two more of our five essential elements: – Automation, both to progressively roll out and to roll back if a problem is discovered – Measurement (tied to Instrumentation), in order to be able to rapidly measure the impact Practices: Progressive Rollout Practices 21
  • 23.
    Progressive Rollout consoleat Facebook Practices 22
  • 24.
    • These companiestend to avoid “release-defining features” that can hold up the entire release • Cool Kids pattern: release features when they are ready - the release train waits for nobody – Also known as date-based releases - the date of release is fixed, but the features in that release are flexible • For this to work, you must respect forward and backwards compatibility of API (service) interfaces Practices: Fire When Ready! Practices 23
  • 25.
    • In general,the Cool Kids automate as much as possible – Etsy has invested a lot in automated unit / functional testing, dev tooling and monitoring, use of dashboards – Netflix has a heavy degree of automation across the board • Automate even the infrastructure, but keep it simple – LinkedIn, Flickr and Netflix generally build up their infrastructure from just a single OS image – From here, configure individual servers using automated scripts driven by tool of choice (e.g. IBM UrbanCode) – Also commonly seen was use of “Phoenix” servers (vs. “Snowflakes”), which can be re-built at any time then “burned to the ground” if needed • … but only automate what can be measured Cool Kids and Automation Auto- mation 24
  • 26.
    Think you don’tneed to keep an eye on automation? http://windowsitpro.com/windows-7/aggressive-configmgr-based-windows-7-deployment-takes-down-emory-university “During TechEd 2014, the Emory University IT department prepared and deployed Windows 7 upgrades to the campuses computers. If you've worked with ConfigMgr at all, you know that there are checks-and-balances that can be employed to ensure that only specifically targeted systems will receive an OS upgrade. In Emory University's case, the check-and-balance method failed and instead of delivering the upgrade to applicable computers, delivered Windows 7 to ALL computers including laptops, desktops, and even servers. I'll stop for a second to let you take that in. Yes, even servers. By the time it was realized what exactly had happened, the Windows 7 sequence had repartitioned, reformatted, and installed Windows 7. Emory IT powered off the ConfigMgr server, hoping to stop the deployment before it was too late, but – it was too late. Even the ConfigMgr server had been repartitioned and reformatted…” – Windows IT Pro, May 19, 2014
  • 27.
    Finally: Instrument and Measure
    • LinkedIn: “Measurement is better than prediction”
    • Provide a common framework to make it easy for developers to choose what to log simply by tagging or registering it
      – “Push” from services works better than “pull” or polling
      – In many cases, developers need do no more than push key/value pairs to a logging system
      – LinkedIn collects 500K+ metrics per minute at an average of 400 metrics per service
    • Instrument user behaviors to improve the user experience
      – Esurance: “we mined the data to figure out what people were doing most often, make those tasks the most prominent and make them addressable in as few clicks as possible”
    • Metrics dashboards also display deployment activity
      – So if there’s a problem, you can easily tie the start time of the issue to the preceding pushes
  • 28.
    Monitoring at LinkedIn
    • LinkedIn developed and then open sourced tools for monitoring and graphing data being pushed to its logs…
      – Shown on the slide: inGraph, inFormed
  • 29.
    So… what are the Cool Kids DevOps takeaways?
    Culture
    • Cultural change takes time – take reasonable steps
      – Team-building, cross-training, improved communication
      – Maybe include your Ops team in requirements / feature reviews and planning (e.g. via IBM RRC, RTC)
    Organization
    • Don’t turn your organization upside down
      – Experiment on a few smaller, low-risk projects
      – Maybe create a DevOps "center of excellence"
      – Tear down walls between teams
    Practices
    • “Continuous Integration” is a good starting point
      – Push all builds to the last stage before release
      – Eat your own dog food (get employees involved to test)
      – Try progressive rollout or dark release of features
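Progressive rollout, mentioned under Practices above, is often implemented by hashing each user into a stable bucket and enabling a feature only for buckets below a configured percentage. The Python sketch below shows that general technique under stated assumptions; `in_rollout`, the feature name and the user IDs are hypothetical and not drawn from any company in this deck.

```python
import hashlib


def in_rollout(feature: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets for a feature.

    The same user always lands in the same bucket, so raising `percent`
    only adds users; nobody who already has the feature loses it.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


users = [f"user-{n}" for n in range(1000)]

# percent=0 keeps the code deployed but invisible to everyone.
print(sum(in_rollout("new-checkout", u, 0) for u in users))    # 0

# Widen exposure step by step while watching the metrics.
for pct in (5, 25, 100):
    enabled = sum(in_rollout("new-checkout", u, pct) for u in users)
    print(f"{pct:>3}% target -> {enabled} of {len(users)} users enabled")
```

Including the feature name in the hash means different features get different user subsets, so the same early adopters are not exposed to every experiment.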
  • 30.
    So… what are the Cool Kids DevOps takeaways?
    Automation
    • Start by automating a few areas that you can easily see and track the results from
      – E.g. test / build pipeline, possibly using UrbanCode Deploy
    Measurement
    • First, assess your current process and consider the changes you want to make – then consider how to measure them
      – Instrument and measure anything you intend to automate
    • But above all, be honest
      – Assess your own DevOps maturity and aspirations – where are you and where do you want to be?
  • 31.
    IBM can help: DevOps Adoption Framework delivers measurable outcomes
    Enable lean adoption of DevOps capabilities
    [Figure: DevOps adoption framework diagram. Recoverable labels: Adoption Model (Self-assessments, Adoption paths, Adoption services); Solutions (Practices, Tooling, Services); Community (Stories, Enablement, Feedback); lifecycle phases Steer, Develop/Test, Deploy, Operate with Continuous Feedback; capabilities Continuous Business Planning, Collaborative Development, Continuous Testing, Continuous Release and Deployment, Continuous Monitoring, Continuous Customer Feedback & Optimization; a progression from Process-based (Process-heavy, Manual, Silo-ed; Inefficient) through More Predictable, More Transparent, More Continuous (Leaner, Leaner and Smarter) to Product-based (Agile, Automated, Collaborative; Optimizing); Where and How to Get Lean; Expertise and Technologies; Knowledge sharing]
  • 32.
    Where to start: DevOps Adoption Roadmap
    Assess desired outcome and supporting practices to drive strategy and rollout
    Step 1: Business Goal Determination - What am I trying to achieve?
      • Think through business-level drivers for improvement
      • Define measurable goals for your organizational investment
      • Look across silos and include key Dev and Ops stakeholders
    Step 2: Current Practice Assessment - Where am I currently?
      • What do you measure and currently achieve?
      • What don’t you measure, but should to improve?
      • What practices are difficult, incubating, well-scaled?
      • How do your team members agree with these findings?
    Step 3: Objective & Prioritized Capabilities - What are my priorities?
      • Start with where you are today and where your improvement goals are
      • Consider changes to People, Practices and Technology
      • Prioritize change using goals, complexities and dependencies
    Step 4: Strategy/Roadmap - What new practices should help me grow?
      • Understand your appetite for cross-functional change
      • Target improvements with the biggest bang for the buck
      • Roadmap and agree on an actionable plan
      • Use measurable milestones that include early wins
  • 33.
    Connect with me on Twitter at @BillHoltshouser or LinkedIn at www.linkedin.com/pub/bill-holtshouser/4/815/66a/
  • 34.
    Acknowledgements and Disclaimers ©Copyright IBM Corporation 2012. All rights reserved. – U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. IBM, the IBM logo, ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Other company, product, or service names may be trademarks or service marks of others. Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. 
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.