The document discusses how microservice architectures, continuous delivery, and several AWS tools such as CodePipeline, CodeDeploy, and Lambda can enable teams to deliver software faster and more reliably, with up to 50 million deployments per year. It also covers how others can implement similar CI/CD practices using AWS cloud services.
7. What is DevOps?
DevOps = efficiency that increases the speed of this cycle
developers, customers
build, test, deliver
plan, monitor
delivery pipeline
feedback loop
SDLC
12. Microservices and the SDLC
developers, services, delivery pipelines
build, test, deliver (one such pipeline shown per service)
13. = 50 million deployments per year
Thousands of teams
× Microservice architecture
× Continuous delivery
× Multiple environments
- if we go back to 2001, the amazon.com website was an architectural monolith
- don't get me wrong, it had multiple tiers and those tiers had many components, but they were so tightly-coupled that they behaved like a monolith
- this monolith-first architecture is not uncommon
- many fast growing startups or early projects make tradeoffs early that optimize for short term speed
- but these can cause longer term issues as you add more developers and add more code
- over time the project is crushed by its own weight
- this happened to Amazon, and as we scaled the website and the team, we started to get bogged down
- to visualize why it was getting bogged down, let's look at the development lifecycle
- when you're working with a monolithic app, you have many developers all pushing changes through a shared release pipeline
- this causes friction at many points of the lifecycle
- upfront during development, engineers need to coordinate their changes to make sure they're not making changes that will break someone else's code
- if you want to upgrade a shared library to take advantage of a new feature, you need to convince everyone else to upgrade at the same time – good luck with that
- and if you want to quickly push an important fix for your feature, you still need to merge it in with everyone else's in-progress changes
- this leads to "merge Fridays", or worse yet "merge weeks", where all the developers have to compile their changes and resolve any conflicts for the next release
- even after development, you also face overhead when you're pushing the changes through the delivery pipeline
- you need to re-build the entire app, run all of the test suites to make sure there are no regressions, and re-deploy the entire app
- to give you an idea of this overhead, Amazon had a central team whose sole job it was to deploy this monolithic app into production
- even if you're just making a one-line change in a tiny piece of code you own, you still need to go through this heavyweight process and wait to catch the next train leaving the station
- for a fast-growing company trying to innovate and compete, this overhead and sluggishness was unacceptable
- the monolith became too big to scale efficiently, so we made a couple of big changes
- one was architectural, and the other was organizational
- we took the monolith and broke it apart into a service-oriented architecture
- factored the app into small, focused, single-purpose services, which we call "primitives"
- for example, we had a primitive for displaying the buy button on a product page, and we had one for calculating taxes
- every primitive was packaged as a standalone web service, and got an HTTP interface
- these building blocks only communicated with each other through the web service interfaces
- this created a highly decoupled architecture where these services could be iterated on independently as long as they adhered to their web service interface
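To make the "primitive behind an HTTP interface" idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `/tax` path, the JSON field names, the flat `TAX_RATE_PERCENT`, and the helper names `start_tax_service` and `fetch_tax` are hypothetical and are not taken from Amazon's actual services.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

TAX_RATE_PERCENT = 10  # hypothetical flat rate, for illustration only


class TaxHandler(BaseHTTPRequestHandler):
    """Hypothetical 'calculate taxes' primitive exposed over HTTP."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        # Internal logic can change freely as long as the JSON contract holds.
        tax_cents = request["amount_cents"] * TAX_RATE_PERCENT // 100
        body = json.dumps({"tax_cents": tax_cents}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_tax_service():
    """Start the service on a free local port and return the server object."""
    server = HTTPServer(("127.0.0.1", 0), TaxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_tax(port, amount_cents):
    """Caller side: knows only the address and the JSON shape, nothing else."""
    connection = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request(
            "POST",
            "/tax",
            body=json.dumps({"amount_cents": amount_cents}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(connection.getresponse().read())["tax_cents"]
    finally:
        connection.close()
```

The caller depends only on the request and response shapes, so the team owning the service could rewrite its internals, or switch languages entirely, without coordinating with any consumer.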
- to give you an idea of the scope of these small services, I've included this graphic
- this is the constellation of services that delivered the Amazon.com website back in 2009, 6 years ago
- this term didn't exist back then, but today you'd call this a microservice architecture
- in conjunction with breaking apart the architecture, we also broke apart the organization
- we split up the hierarchical org into small teams
- we called them 2-pizza teams, because if they got larger than you could feed with 2 pizzas, we'd break them up
- in reality, the target number is about 8 people per team, so I personally think the 2 pizza goal is maybe a little too frugal
- another important change that went along with this is cultural
- when we split up the org, we gave the teams full autonomy
- they became small startups that owned every aspect of their service
- they worked directly with their customers (internal or external), set their roadmap, designed their features, wrote the code, ran the tests, deployed to production, and operated it
- if there was pain anywhere in the process they felt it
- if there was an operational issue in the middle of the night, the team was paged
- if a lack of tests was breaking customers, the team got a bunch of support tickets
- that motivation ensured the team focused on all aspects of the software lifecycle, broke down any barriers between the phases, and made the process flow as efficiently as possible
- we didn't have this term at the time, but this was the start of our "DevOps" culture
- but before we get started, I have a confession to make – I used to hate the term "DevOps"
- it bugged me because it's a very fuzzy term
- people use it in many different ways to mean many different things, so no one really knows what other people are talking about when they hear it
- earlier this year, I finally caved to the momentum, and started using it in my talks
- I had to admit that "DevOps" came the closest to describing this new modern style of rapid cloud development and delivery that I want to talk about today
- since I'm using this fuzzy term, it's now my responsibility to define it, so at least we're on the same page
- but rather than try to define it directly, I'm going to put it in the context of something that all of us are familiar with - the software development lifecycle
- here's the general development lifecycle for a web application or service
- on one side is the development team, and on the other side are the customers
- every new feature or bug fix goes through this same process
- a developer writes code, the code is built and unit tested, the app is deployed to a testing environment for deeper testing, and finally it is given a thumbs up and deployed to production where customers can use it
- after that happens, the company can collect feedback from customers, make decisions, and continue to iterate and improve the product
- there are a few important things to note here
- the speed of completing this loop determines your business agility: to go from an idea, to a delivered feature, to learning about it and coming up with the next idea
- the faster you can complete that loop, the faster you can innovate
- if you can only complete this cycle once a month, you will be outmaneuvered by competitors that can do this every day
- another point is that you're only adding value when you're writing code for new changes
- the effort you spend in this middle section is lost time
- don't get me wrong, you need to ensure high quality releases, but the less time your team spends releasing software, the more time they can spend writing code
- to me, that's the essence of DevOps – to make this process as efficient as possible, and speed up the learning cycle
- this is why DevOps is fuzzy, because there are many different ways to optimize this cycle
- you can make process changes, organization changes, culture changes, tool changes
- to me, they all count, and I think it's fine to classify them all as DevOps
- these two changes decoupled the teams and made a dramatic improvement to the front end of the lifecycle
- it was very easy for them to make decisions and write new code for their microservice
- but when they went to deploy their code to production, they struggled with trying to handle this themselves
- we had a tooling gap, and the old way of having a central team push out the entire codebase was no longer workable
- that wouldn't scale to be able to serve thousands of different teams with different technologies and release schedules
- to fix this, Amazon started a new central tools team to build a new breed of developer tools
- these new tools had some unique characteristics
- the tools had to be self-service, because there's no other way to be able to scale to that many customers
- the tools had to be technology agnostic, because the teams chose many different types of platforms and programming languages for their services
- the tools had to encourage best practices: while we allow autonomy, we also want to support shared learning across the teams so everyone can improve
- and of course, in the service-oriented mindset, the tools were delivered as primitive services
- one of the first primitives to emerge was Apollo, a name that we clearly borrowed from NASA
- Apollo is the deployment engine for Amazon, everything from the retail site to AWS services
- it's how we roll out software changes across our servers
- we first launched Apollo over a dozen years ago
- in that time we've been continually learning about how to manage deployments and baking that knowledge back into the service
- one capability was zero downtime deployments
- there's no way we would allow taking the retail site down just to push a software change
- Apollo supports rolling out a software change without taking down an application
- we also can't let a deployment bug take down the app, so Apollo tracks deployment health and stops bad deployments
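The two ideas above, rolling out a change without taking the whole application down and stopping a bad deployment based on health, can be sketched as a batch-by-batch rollout. This is a minimal illustration under stated assumptions, not Apollo's implementation: the `fleet` dictionary, the `is_healthy` callback, and the `rolling_deploy` name are all hypothetical.

```python
def rolling_deploy(fleet, new_version, is_healthy, batch_size=1):
    """Update `fleet` (a dict of host -> version) one batch at a time.

    Only one batch is being changed at any moment, so the remaining hosts
    keep serving traffic. Returns True if every host was updated, or False
    if the rollout was halted because a batch failed its health check.
    """
    hosts = sorted(fleet)
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        previous = {host: fleet[host] for host in batch}
        for host in batch:
            fleet[host] = new_version  # stand-in for installing the new build
        if not all(is_healthy(host, new_version) for host in batch):
            # Halt: restore this batch and leave later hosts untouched.
            fleet.update(previous)
            return False
    return True
```

In this sketch a failed batch is restored to its previous version and later batches are never touched; batches that already passed their health check stay on the new version. Whether to roll those back as well is a policy choice the sketch leaves open.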
- another primitive that emerged was Pipelines, our continuous delivery service
- it started after we did a study of how long a software change took to go from a developer check-in to running in production
- I'm not going to share any numbers, but let's say that it was embarrassing how long that took
- we found that it wasn't the builds, tests, or deployments that were taking so long, but rather the human processes that tied them all together
- one person would notify another person that a task was ready, eventually they'd see the request and batch it with others, finally they'd start a job and let it run, they'd come back later to see if it completed successfully or needed to be re-run, then they'd finally route the task onto another group for the next job
- this process added in a ton of human delay, and for a company with an insane focus on efficiency, this was unacceptable
- since we're automating our fulfillment centers, we thought we should automate our software delivery
- we created Pipelines to automate that end-to-end release process, from code check-in to build to test to production
- this tool is used pervasively across Amazon, by well over 90% of the teams
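The core idea of automating the handoffs between build, test, and deployment can be sketched as a list of stages that run back to back and stop at the first failure. This is only an illustration of the concept, not how Pipelines or CodePipeline is implemented; the `run_pipeline` function and the stage names are hypothetical.

```python
def run_pipeline(stages, change):
    """Run each (name, action) stage in order with no manual handoff.

    Each action takes the change and returns True on success. The pipeline
    stops at the first failing stage. Returns (succeeded, stages_that_ran).
    """
    ran = []
    for name, action in stages:
        ran.append(name)
        if not action(change):
            return False, ran
    return True, ran


# Hypothetical stages; real ones would invoke build, test, and deploy tools.
example_stages = [
    ("build", lambda change: True),
    ("test", lambda change: "bug" not in change),
    ("deploy-to-production", lambda change: True),
]
```

Calling `run_pipeline(example_stages, "small fix")` runs all three stages in sequence, while a change containing "bug" stops at the test stage and never reaches production. The point is that each stage starts the moment the previous one succeeds, with no person in between to notice, batch, and forward the work.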
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- what does success look like?
- there are a lot of ways that you can measure the process, and no one way is perfect
- but here's one data point
- when you have thousands of independent teams
- producing highly-factored microservices
- that are deployed across multiple dev, test, and production environments
- in a continuous delivery process
- you get a lot of deployments
- at Amazon in 2014, we ran over 50M deployments
- that's an average of about 137 thousand deployments each day, almost 6 thousand every hour, or roughly 1.6 deployments every second
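The per-day, per-hour, and per-second rates follow from dividing the yearly total, as this short calculation shows. It assumes exactly 50 million deployments spread evenly over a 365-day year; since the talk says "over 50M", the real averages would be somewhat higher.

```python
def deployment_rates(deployments_per_year):
    """Return average deployments per day, per hour, and per second."""
    per_day = deployments_per_year / 365
    per_hour = per_day / 24
    per_second = per_hour / 3600
    return per_day, per_hour, per_second


per_day, per_hour, per_second = deployment_rates(50_000_000)
print(f"{per_day:,.0f} per day")        # 136,986 per day
print(f"{per_hour:,.0f} per hour")      # 5,708 per hour
print(f"{per_second:.2f} per second")   # 1.59 per second
```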
- after we tell customers the story of our DevOps transformation, they typically ask us how they can do the same
- there are many angles to that answer: architecture, organization, process, tools
- that's the focus for the remainder of this talk
- I'm going to talk about the new AWS Code services and how you can use these to set up your own DevOps pipelines
- I demonstrated using a few partner solutions with our AWS Code services
- Here's the full list of partners who have integrated their tools with CodePipeline and CodeDeploy
- And this list is growing as we welcome more integrations into our tool suite
- Many of these partners have booths in the expo hall
- I encourage everyone to explore their solutions to see how they might benefit your cloud development projects
- We even have some partners here at this talk, so when we take Q&A (up front or out in the hallway) after the talk, you'll be able to ask questions of them as well
No servers to manage
Continuous scaling
Subsecond metering