Intro CloudBees, myself. Java PaaS and principal sponsor of the Jenkins open source project. Talk about how the combo of PaaS and CI can change the way you develop, deploy, and deliver applications. Real-world examples.
Let’s start with PaaS. What it is, what it does, a bit about what’s under the covers.
Canonical taxonomy of IaaS, PaaS, SaaS, with some familiar examples: Amazon, Salesforce. PaaS follows the Salesforce model, but for the platform you typically build to. No install, no maintenance. Examples.
Compare PaaS to the traditional stack. You control all these layers. You have to deal with all these layers.
By adopting one of the *aaS models, you offload some parts of this stack. You also offload the maintenance and most ops aspects of that part of the stack. It depends on your choice, though, as you may just be taking on another piece of software that is supposed to offload you, etc.
In short, this lets you focus on what you want to focus on – the applications and the data – and leave the rest of it to CloudBees and the PaaS. Just like you get on-demand, pay-as-you-go elasticity of infrastructure resources using AWS, you get the same for applications in PaaS. You don’t deal with the maintenance and update issues for the stack below. And this is a service, not another piece of packaged software you install.
Switching from Legos to more traditional PowerPoint, then… Services and Partner Ecosystem. Again, no install/maintain/config of the partner services either.
Side trip: do you want to build this yourself? After all, you have an IT department, VM images, Chef/Puppet? Example: Netflix.
From a technology standpoint, it’s not just about racking and stacking VM images. In our case as an example, it’s about a set of shared services to address… And an architectural approach that gives flexibility in deployment models and IaaS providers.
Aside from never having to argue with my IT department again, what can I get out of this? Well, there is time-to-market, better lower-case agility to go with your Agile practice. There is lower cost. But let me show you an operational practice that is nearly impossible to do without PaaS and the cloud.
This is a cloud variation on something that probably takes weeks of time, extensive planning, and meetings to do in most places. This is what they do at Netflix. You have an existing version of your app running in production and you want to roll a new one out.
Since you’re paying on-demand and you can provision new systems simply, quickly, and reliably, you set up a 100% duplicate of your original environment, with the same number of resources to handle the full load. Don’t cannibalize your existing pool – why should you?
You roll over load to the new version until you can stop using the old app.
When something goes wrong…
You just shift the load back to the old system. It’s all still there and can do what it was doing the day before, because it hasn’t changed at all.
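This rollover-and-rollback flow can be sketched as a simple traffic-weighting loop. This is a hypothetical illustration of the blue/green idea, not any CloudBees or Netflix API – the function names are made up:

```python
# Sketch of the blue/green rollover described above (hypothetical names).
# "blue" is the existing production version, "green" the new one.
# Traffic shifts gradually; on failure, all load snaps back to blue,
# which is untouched and can do exactly what it did the day before.

def shift_traffic(weights, step=25):
    """Move `step` percent of the load from blue to green."""
    moved = min(step, weights["blue"])
    weights["blue"] -= moved
    weights["green"] += moved
    return weights

def rollback(weights):
    """Something went wrong: send all load back to the old version."""
    weights["blue"], weights["green"] = 100, 0
    return weights

weights = {"blue": 100, "green": 0}   # start: the old app takes everything
for _ in range(4):
    shift_traffic(weights)            # roll over 25% at a time
assert weights == {"blue": 0, "green": 100}

rollback(weights)                     # bad release: old system takes over again
assert weights == {"blue": 100, "green": 0}
```

The key property is that rollback is trivial because the old environment was never modified or cannibalized.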
So now let’s switch gears away from PaaS itself and into CI and the cloud.
To start with, I’d just make the observation that there are some tradeoffs of working in the cloud, particularly for development. You want all the benefits we just spoke about, but you also want the local speed of your hot new machine, local control, and so on. Furthermore, you have to make some tradeoff between cloud benefits and things like latency and turnaround.
So here’s a way I like to think about it. Across the spectrum of activities during your development, QA, and overall release process, there are some things that play to the cloud’s advantages more than others. Human-intensive things, like cutting code, debugging, and so on, at least to me seem biased toward local systems, whereas activities that can benefit from larger resources, like stress testing, find a very natural home in the cloud.
It turns out that CI is one of those things, for a couple of reasons. First, activity is spiky. It’s tied to the time your team is awake, and when builds kick off, and when you do more compute-intensive operations. Second, if you had all the resources you could possibly make use of, you’d be in good shape. But typically you don’t, so you have to make do with what you have. So you get a double whammy of being resource constrained, but even with the resources you have, you can’t use them efficiently because of the spiky nature of CI activities.
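To make the double whammy concrete, here is a back-of-the-envelope sketch with illustrative, made-up numbers (not measurements): a fixed pool of build machines must be sized for the peak, so its utilization is only the average demand divided by that peak.

```python
# Illustrative hourly demand for build executors over one day (made-up
# numbers): quiet overnight, spiky while the team is awake and committing.
demand = [0, 0, 0, 0, 0, 0, 1, 2, 8, 4, 2, 10,
          3, 2, 9, 4, 2, 6, 1, 1, 0, 0, 0, 0]

peak = max(demand)                    # a fixed pool must be sized for this
average = sum(demand) / len(demand)   # what you actually use on average
utilization = average / peak          # how busy that fixed pool really is

print(f"peak={peak} executors, average={average:.2f}, utilization={utilization:.0%}")
```

With numbers like these, a pool sized for the peak sits idle most of the time – exactly the kind of load that on-demand, pay-as-you-go resources fit.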
And if you look beyond a sort of normal day, the situation is just as bad. You end up with spiky load that is coupled to the stage of your release.
In a lot of ways, you feel like you’re always fighting traffic, but the road is really quite empty.
PaaS is the solution to the problem of how to get these resources when you want them and pay only for the ones you use. But you still have some problems. How do you harness all that compute resource? And once you’ve harnessed it, how do you keep it under control and running at peak efficiency, without bad things happening?
And it turns out that Jenkins, even if it went by another name in an earlier life, has been driving distributed builds for five years now. Jenkins can easily control MANY servers, and at the heart of its value is a community-driven set of plugins that make that simple.
So let’s talk about some specific techniques to take advantage of this abundance of compute resource. The first is gated commits.
So there is a problem with CI that people discover, particularly in complex, interrelated projects. You have to commit code to cause Jenkins to test things. But you don’t want to commit something that breaks things. So you end up doing more testing locally before committing. Seems like a problem.
And the problem gets worse in large projects, where the mathematics work against you.
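The “mathematics work against you” point can be made concrete with illustrative numbers: even if each commit has only a small chance of breaking the build, the chance that at least one of a day’s commits leaves trunk broken grows quickly with team size.

```python
# If each commit independently has probability p of breaking the build,
# the chance that at least one of n commits breaks it is 1 - (1 - p)**n.
def p_broken(p, n):
    return 1 - (1 - p) ** n

# A 2% per-commit breakage rate sounds harmless for a single commit...
assert round(p_broken(0.02, 1), 2) == 0.02
# ...but across 50 commits a day on a large project, trunk is more
# likely to end up broken than not.
assert p_broken(0.02, 50) > 0.5
```

That is why, as the project grows, “test harder locally before committing” stops scaling and you want the CI system itself to gate commits.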
The solution is to set up an arrangement where developers commit to branches, causing Jenkins to test the branch, and if it passes, to merge to the trunk.
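The gate itself boils down to very simple logic – build and test the branch, and only merge to trunk on green. Here is a toy model of that logic (a sketch, not Jenkins configuration; `run_tests` stands in for whatever test job Jenkins runs):

```python
# Toy model of a gated commit: a branch's changes only reach trunk
# if the test run on the merged result passes.

trunk = ["rev1", "rev2"]              # known-good history

def gated_merge(trunk, branch_revs, run_tests):
    """Merge branch_revs into trunk only if tests pass on the candidate."""
    candidate = trunk + branch_revs
    if run_tests(candidate):          # Jenkins builds and tests the branch
        trunk[:] = candidate          # green: merge to trunk
        return True
    return False                      # red: trunk stays untouched

# A passing branch lands; a failing one never reaches trunk.
gated_merge(trunk, ["rev3"], run_tests=lambda revs: True)
assert trunk == ["rev1", "rev2", "rev3"]
gated_merge(trunk, ["bad-rev"], run_tests=lambda revs: False)
assert trunk == ["rev1", "rev2", "rev3"]
```

The payoff is that trunk is always in a known-good state: broken work stays on the branch where it can’t hurt anyone else.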
You might think this could be worse than dealing with the occasional bad commit. It turns out it’s not. Typical developers remain close to the tip, and that helps. And the reality is that your coding cycle is slower than the test suites you would use to gate merges, so merges would more typically interleave.
This approach has some real advantages.
A team that uses this kind of approach is the NetBeans team. They use multiple team repositories they merge to using the team’s gating tests; those in turn kick off tests asynchronously which, if successful, merge the code to the master repo.
Next, let’s look at ways to automate deployment.
You have all these compute resources, you should take advantage of them. Why build something if you’re not going to run it?
But, you need to test it before deploying it. Just because it compiles, it ain’t ready to ship. So, you should set up a build pipeline, typically of more and more expensive tests. When something fails at any stage of the pipeline, you want to throw out the checkin that caused the problem and avoid going any further. You get feedback upstream more quickly, and you avoid wasting cycles.
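The fail-fast pipeline just described can be sketched as a sequence of increasingly expensive stages that stops at the first failure (a toy model for illustration, not Jenkins job configuration):

```python
# Toy model of a fail-fast build pipeline: stages run in order from
# cheap to expensive, and a failure stops everything downstream.

def run_pipeline(stages):
    """Run (name, fn) stages in order; return the names of stages that ran."""
    ran = []
    for name, stage in stages:
        ran.append(name)
        if not stage():
            break                      # throw out the checkin, go no further
    return ran

stages = [
    ("compile",     lambda: True),
    ("unit tests",  lambda: True),
    ("integration", lambda: False),    # this checkin breaks integration
    ("stress",      lambda: True),     # never reached: no wasted cycles
]
assert run_pipeline(stages) == ["compile", "unit tests", "integration"]
```

Ordering stages by cost is what buys you the fast upstream feedback: the cheap stages catch most problems before the expensive ones ever run.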
Typically, people just start with a waterfall-type model. An upstream job triggers a downstream one if it’s successful, using the binaries from the upstream job.
If you look at this on a time basis, it ends up looking more like this. A checkin triggers a build; that succeeds and triggers a more complicated downstream test, which triggers another one if it’s successful.
It turns out there is a very nice plugin – the Build Pipeline plugin – that gives you a nice visual representation of your pipeline as each build is kicked off. So you can see very quickly when your pipeline is stalled and why. You can kick off retries manually.
There is another way to build your pipeline beyond build triggers. In the case of promotions, you’re not thinking in terms of build process flow as much as state transitions. So you think about “When are my development team’s bits ready to be picked up by the QA team?” or “When is my product ready to be rolled out for stability testing?”
So here is an example of combining pipeline and parallel processing with promotions.
Promotions get you in a mode where code advances asynchronously, more at its own pace, in a way that makes sense to you and the teams you work with. They give you a lot more flexibility in the way you construct a pipeline and take advantage of the abundance of resources.
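One way to picture a promotion is as a small state machine over a build, rather than a chain of triggers. The sketch below is illustrative only – the state names are made up, not the Promoted Builds plugin’s actual API:

```python
# Sketch of promotion as state transition: a build advances through
# named states only when the corresponding gate (e.g. QA sign-off) is met.

PROMOTIONS = ["built", "dev-approved", "qa-approved", "released"]

def promote(state, gate_passed):
    """Advance to the next promotion level if its gate passed."""
    i = PROMOTIONS.index(state)
    if gate_passed and i + 1 < len(PROMOTIONS):
        return PROMOTIONS[i + 1]
    return state                      # gate not met: the build simply waits

build = "built"
build = promote(build, gate_passed=True)    # dev gating tests pass
assert build == "dev-approved"
build = promote(build, gate_passed=False)   # QA isn't ready yet
assert build == "dev-approved"              # build waits, at its own pace
```

Because each team only looks at builds that have reached “its” state, the teams are decoupled from each other’s schedules.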
Finally, let’s talk about traceability. When you have so many resources working on your behalf…
How do you cope when bad things happen?
You have a complicated pipeline that may be consumed by teams in different ways. As a developer, you commit code and create a binary artifact. During QA, you may create other artifacts, like logs, that are then consumed elsewhere or will contain some kind of incriminating evidence of a problem. And so on with other areas.
What you can do is let Jenkins track these artifacts as they progress through your pipeline. In Jenkins, it’s called fingerprints. Basically, Jenkins computes a checksum for the artifacts and tracks them for every build. What you can do is just be very liberal in your use of this fingerprinting capability at every stage. It’s not like you have to fingerprint just a binary. You can do it with any artifact, like logs, at any stage of the lifecycle you use Jenkins to drive. Then when a problem arises, you can find out where the artifact that caused the problem showed up.
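At bottom, a fingerprint is just an MD5 checksum recorded per artifact per build, so the same artifact can be matched wherever it reappears. A minimal sketch of the idea (the bookkeeping here is hypothetical, not Jenkins internals):

```python
import hashlib

# Sketch of Jenkins-style fingerprinting: record an MD5 per artifact
# per build, then look up every build an artifact passed through.

fingerprints = {}                     # md5 -> list of (job, build number)

def fingerprint(job, build, artifact_bytes):
    md5 = hashlib.md5(artifact_bytes).hexdigest()
    fingerprints.setdefault(md5, []).append((job, build))
    return md5

war = b"example .war file contents"
fp = fingerprint("build", 3, war)             # produced by build #3
fingerprint("integration-test", 2, war)       # consumed by integration test #2

# Given a suspect artifact, trace everywhere it showed up.
assert fingerprints[fp] == [("build", 3), ("integration-test", 2)]
```

Because the checksum is content-based, it doesn’t matter whether the artifact is a binary, a log, or anything else – identical bytes get the same fingerprint at every stage.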
Here’s what it looks like in our example from before. The fingerprint is associated with artifacts you identify at each step, so you can easily tell in this example that the 3rd build was used in the 2nd integration test. Jenkins has built-in capabilities to track fingerprints and make it easy to chase them down.
Finally, I want to talk briefly about a customer of ours who is putting a lot of this into practice.
This is a “quote” major retailer, building a cataloging application providing easier access to their many stores, both from desktops as well as mobile devices. They had a top-down directive to move things to the cloud. He surveyed a variety of solutions, coming in with his own prejudices. He and his team were completely committed to agile practice and CI. They didn’t want to deal with managing infrastructure, etc. They were building the app using Grails and wanted to use standard war file delivery. So they very smartly chose CloudBees. They use a continuous delivery approach to deliver the app into a Dev and Test environment. Then they “manually” push out a new version after every two week sprint.
And this is what their build pipeline looks like. This happens to be their production pipeline, which is an exact mirror of their DEV pipeline, except that there is a manual trigger to deploy to production. Note they have exploited parallelism right off the bat.
And here’s another interesting view into their project. They use Jenkins to, as Marco put it, keep the pressure up on things they value. So, they require tests to be checked in with all code, and they track coverage all the time. If coverage falls below some threshold, it triggers a failure. Similarly with some code quality metrics they measure with CodeNarc.
So, in summary… PaaS is serving up the old platform capabilities you used to have to install and maintain yourself as hosted and fully managed services. This ability to get resources delivered to you on-demand really changes the way you work and drives even more need for a proper CI and CD process. Jenkins itself provides the kind of extensibility and distributed build capability you need to harness the abundance of riches.