Speakers:
Uri Budnik - Cloud Evangelist, RightScale
Arindam Mukherjee - Sr. Manager DevOps, Blackhawk Network
IT organizations are turning to processes and practices often referred to as DevOps in order to speed up application delivery, shorten release cycles, improve quality, and better meet the needs of their business. We will present a real-life story of an organization implementing DevOps and leave you with best practices for use in your own organization.
DevOps Stories: Getting to Agile - RightScale Compute 2013
1. April 25-26, San Francisco
Cloud success starts here
DevOps and Cloud Management
Arindam Mukherjee, Sr. Manager, Engineering Cloud
Services, Blackhawk
Uri Budnik, Cloud Evangelist, RightScale. @UriBudnik
2.
#RightscaleCompute
What is DevOps?
A company's ability to compete is limited by its ability to realize its product
vision as quickly and efficiently as possible
Hence: Agile Development
Traditional IT infrastructure requires large commitments of time, money, and
minds
Hence: Cloud Computing
The most successful developers of modern applications drive
controlled, high-tempo change to their user experiences at unprecedented
scales
Hence: DevOps
3.
• Does this happen in your IT dept. when something
breaks?
• Ops: It's not my machines,
it's your code!
• Developer: It's not my code,
it's your machines!
• Traditionally:
• The developers' job is to add new features
• The ops job is to keep the site stable and fast
How Does DevOps Help?
4.
• Business requires change
• But, change is the root of most outages
• Discourage change in the interest of stability?
• Build tools and culture to allow change to happen as often as
it needs to
How Does DevOps Help?
5.
How Does DevOps Help?
• DevOps is to operations what agile has been to
development
• Replace big changes with constant, repeatable
incremental change
• This offers more control and predictability
6.
Lower the risk of change with tools and culture
• Cloud: automated infrastructure
• Single step builds
• One step deploys
• ServerTemplates
• Small frequent changes, easier to recover if something
goes wrong
• Deploy log – Who? When? What?
• Healthy attitude about failure
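The "Who? When? What?" deploy log mentioned above can be pictured as a small append-only record. The following is only a minimal sketch in Python, not something shown in the talk: the names `DeployEntry`, `DeployLog`, `record`, and `before` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class DeployEntry:
    """One line of a deploy log (hypothetical structure)."""
    who: str        # person or service that triggered the deploy
    when: datetime  # timestamp of the deploy (UTC)
    what: str       # what changed, e.g. an application name plus version


@dataclass
class DeployLog:
    """Append-only list of deploys, kept in memory for illustration."""
    entries: List[DeployEntry] = field(default_factory=list)

    def record(self, who: str, what: str,
               when: Optional[datetime] = None) -> DeployEntry:
        # Default to "now" so callers only need to say who and what.
        entry = DeployEntry(who=who, when=when or datetime.now(timezone.utc), what=what)
        self.entries.append(entry)
        return entry

    def before(self, moment: datetime) -> List[DeployEntry]:
        """Deploys at or before `moment`, most recent first."""
        hits = [e for e in self.entries if e.when <= moment]
        return sorted(hits, key=lambda e: e.when, reverse=True)
```

When something breaks, asking the log for the deploys that preceded the outage (`log.before(outage_time)`) gives a short list of candidate changes to inspect, most recent first.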
8.
Blackhawk IT before DevOps
Classic Development
& Operations division
of labor
Ops takes 6-8 weeks
to deliver despite
best intentions
Top priority is
maintaining
production
9.
Blackhawk IT before DevOps
Devs don’t have
timely access to
environments
Must submit detailed
requests
Confidence level—is
what is delivered the
same as requested?
11.
DevOps at Blackhawk
Solution provisioning mindset;
instead of request processing and
incident handling
Take ownership of
environments/applications, not just
IT assets
Embed in development process to
create and iterate on software stack
12.
Cloud + DevOps + RightScale
DevOps team maintains a
catalog of ServerTemplates
that developers can use
Self-service portal, no need
to ask permission when a
new server is needed
Developers are no longer
tied to actual servers
13.
Agile Deployments
Provisioning time now
minutes instead of
months!
Environments are created
programmatically as part
of continuous integration
Focus shifts to lifecycle
management of server
templates – iterate, fine
tune, code manage
14.
Cloud Instance Sprawl
Side effect of agile
programming + DevOps +
Cloud — lots of cloud
instances running that may
not be in use
Robust, targeted and frequent
reporting of chargeback
allocation and cost trending
Alarms can alert you when
the spend on a particular
deployment crosses a
threshold
PlanForCloud.com helps you
forecast costs
15.
Lessons Learned
• Take ownership of applications
• Embed Ops people into the development process
• Enable developers to self-provision environments
• DevOps + RightScale can simplify application lifecycle
management — ServerTemplates
• Create dashboard for production operation tasks
• Surface cost information to people that manage
budgets
• Think about how to architect for the cloud where
adding more infrastructure is no longer a bottleneck
DevOps is enabled by the confluence of several major trends
Not siloed, although treated as such in some organizations
Discuss DevOps in small shops vs. enterprise. We are focused on existing larger organizations here
Ops: says no all the time. Afraid changing a machine will break something. Developers don’t talk to them because they feel they’ll just say no
Ops feels that no one tells them anything
Ops takes too long to give me a QA environment
Lower the risk of change through good use of tools and good working culture
Increase the confidence that any given change is not going to cause an outage
Increase the confidence that you can recover from those outages, and quickly
Developers that think like operations people and vice versa
If you get only one takeaway, it’s: automation
Source control, but for your server configurations too
Embed operations people into the dev team to document their requirements early on and have a role in defining them
Devs, talk to ops about the impact of your code changes
Ops: provide constructive feedback on current aches and pains
Deploy log is really important for decomposing problems. You cannot ship software faster if you break your app every time you do
Ops people, let dev look at performance metrics of the production servers
Developers, realize that others will be responsible for fixing systems when your code breaks something
Avoid having an argumentative, combative culture
No finger pointing
Like automated infrastructure, a culture of cooperation and respect is a must-have
Assume good faith, even if you have worked with a cowboy in the past who was not the friendliest
Don’t just say no. That means “I don’t care about your problem.” Find out what the real issue is. The idea is for developers and operations to come together to find unique solutions
Hiding solutions from someone who just says no is a bad idea. They will find out eventually. “I don’t want to tell Bob because he is going to freak out” is a very bad sign and a horrible practice
Read-only access to RS. It’s not every developer having root on every production system. They are writing the code that runs on the production machines
Failure will happen. Consider how you will respond. Airline pilots spend a lot of time on simulators practicing for different scenarios. Develop the competency to deal with problems efficiently. Fire drills
In summary:
Automation
Embed ops people into dev
Iterate quickly
Self-provision
Deploy log / audit trail
Give developers visibility into production
No finger pointing
Leader in prepaid and financial payment products
Gift Card Mall™ has 160MM customer visits each week
120 developers and ~30 people in IT
Looking to reduce time-to-market for new products. Need to be more nimble and flexible
Transforming IT from a request organization and incident handling to service delivery: DevOps
Arindam is leading the way in this metamorphosis
Challenges are technical and cultural/political
Classic Development and Operations division of labor
Devs request infrastructure and must submit detailed required specifications
Ops takes 6-8 weeks to deliver (despite best intentions)
Development requests are low priority and often get delayed
Top Ops priority is maintaining the production environments
Devs don’t have timely access to environments to test, experiment and code
Confidence level: is what is delivered the same as requested?
Cost prohibitive to have many different environments
Autonomy = speed and flexibility
Lack of structured engagement with IT
Iterate on the fly – agile development methodology
Need systems or environments to try an idea
Fail faster
Make and break, try and evolve – less analysis and theorizing
Control environments directly from anywhere
Self-service, not IT help desk or request gatherers and processors
Dev+Ops; new face of IT
Team has hybrid skills in development (specifically automation tasks) and IT operations
Solution provisioning mindset, instead of request processing and incident handling
Take ownership of environments/applications, not just IT assets
Embed in development process to create and iterate on software stack per server
Fully document all software installations and post-installation configurations – own the build process per server
Create developer self-service portals; predictable tasks can be automated with ON/OFF switches
--------------------------------------------
DevOps (IT) people immerse themselves in the software stack with the developers to learn what it is going to take to support those apps (sometimes they call them platforms)
Dev and ops need to be joined at the hip as they create new apps
Ops takes user stories out of the scrum
And ops has deliverables into the scrum
They work with the architecture teams and with the developers. E.g. what version of Java should they use? What version of an app server like Tomcat? Difficult to change in production.
Public cloud delivers raw servers at your fingertips
Using AWS VPC so machines look like they are part of the data center
DevOps team works closely with architects and devs to collaborate on server configurations
Cookbooks and Chef recipes in RightScale mean that machine build and configuration is automated
DevOps team maintains a catalog of machines that developers can use
Self-service portal, no need to ask permission when a new server is needed
Developers are no longer tied to actual servers
Provisioning time now minutes instead of months!
Predictable environments made up of servers created from unique server templates
Environments are created programmatically as part of continuous integration
An environment only stays up for the duration of workload processing
All assets are disposable
Focus shifts to lifecycle management of server templates – iterate, fine tune, code manage
The servers have no value; it’s the cookbooks and recipes, the ServerTemplates, that matter
-----------------------------------------------
Compare the time to create a brand new server template vs. just launching from one in the library
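The idea that an environment is created programmatically, stays up only for the duration of the workload, and is then thrown away can be sketched as a Python context manager. This is an illustrative assumption, not the RightScale API or the Blackhawk setup: `FakeCloud` is a hypothetical in-memory stand-in for a real provisioning service, and `ephemeral_environment` is a made-up helper name.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeCloud:
    """Hypothetical in-memory stand-in for a cloud provisioning API."""

    def __init__(self) -> None:
        self.running: List[str] = []
        self._next_id = 0

    def launch(self, template: str) -> str:
        # Pretend to boot a server from a named template.
        self._next_id += 1
        server_id = f"{template}-{self._next_id}"
        self.running.append(server_id)
        return server_id

    def terminate(self, server_id: str) -> None:
        self.running.remove(server_id)


@contextmanager
def ephemeral_environment(cloud: FakeCloud, templates: List[str]) -> Iterator[List[str]]:
    """Launch one server per template, then always tear everything down."""
    servers: List[str] = []
    try:
        for template in templates:
            servers.append(cloud.launch(template))
        yield servers
    finally:
        # Runs even if the workload (e.g. a CI test run) raises.
        for server_id in servers:
            cloud.terminate(server_id)
```

A CI job would wrap its test run in `with ephemeral_environment(cloud, ["app", "db"]) as servers:`. Because teardown lives in the `finally` clause, a failed run does not leave instances behind.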
Side effect of agile programming + DevOps + Cloud: lots of cloud instances running that may not be in use
Capture and pass cost information to those responsible for budgeting
Robust, targeted and frequent reporting of chargeback allocation and cost trending
RightScale makes it possible to track costs on a per-deployment basis and to automatically send that information to those who need it
Alarms can alert you when the spend on a particular deployment crosses a threshold
PlanForCloud.com helps you forecast costs
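Per-deployment cost tracking with a spend alarm reduces to summing charges by deployment and comparing each total with a limit. The sketch below assumes charges arrive as `(deployment, cost)` pairs. The function names and the data shape are hypothetical and do not reflect how RightScale implements its alarms.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def spend_per_deployment(charges: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum individual charges into a total per deployment name."""
    totals: Dict[str, float] = defaultdict(float)
    for deployment, cost in charges:
        totals[deployment] += cost
    return dict(totals)


def deployments_over_threshold(totals: Dict[str, float],
                               thresholds: Dict[str, float]) -> List[str]:
    """Names of deployments whose spend exceeds their configured threshold.

    Deployments without a configured threshold are ignored.
    """
    return sorted(
        name
        for name, total in totals.items()
        if name in thresholds and total > thresholds[name]
    )
```

Running this on a schedule and sending the resulting list to the budget owner is one simple way to surface sprawl before it shows up on the monthly bill.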
Take ownership of applications, not just handling requests and incidents
Embed Ops people into the development process
Enable developers to self-provision environments
DevOps + RightScale can simplify application lifecycle management
Use ServerTemplates to automate all machine configurations and for server version control, just like for code
Complete software code management of server templates allows spin-up of any mix-and-match version of an environment for specific troubleshooting and root cause isolation
Instrumentation is shared amongst Development, QA, Pre-production and Production environments – adds predictability
Create dashboard for production operation tasks – e.g. new instance, kill instance, scale instance, change-managed code rollout
Surface cost information to people that manage budgets
Think about how to architect for the cloud where adding more infrastructure is no longer a bottleneck
There are many elements here from other presentations on DevOps
Not possible to cite everyone
Nevertheless, this is an acknowledgment that many ideas here came from other people (and all the images too)
For those we borrowed from, thanks for sharing. We are making this work public so that others may find it useful too