Chrome Release Cycle
Author: Anthony Laforge (firstname.lastname@example.org)
Our Philosophy
• Think of any major website, even the fancy Web 2.HTML5 ones... do they have version numbers?
• We took the same approach to our client software that an online web service would.
• That is... we treat releases as a means of getting features out to users, not as goals in and of themselves.
• It's about flow.
How Our Users Work
• Users sign up for Chrome release channels based on the level of stability they want.
• Updates happen automatically in the background.
• New features and functionality just flow in without any effort on the user's part.
How the Chrome Team Works
• Developers work almost exclusively on a single central trunk (possible due to our try bots and continuous build infrastructure).
• Our branches stem from points on the trunk.
• We stabilize branches by pulling in changes from trunk (i.e. everything lands on trunk first).
In practice, for that final beta... we'd
• Cut a branch (when we thought we had everything)
• Merge ~500 patches (since we didn't have everything)
• Spend weeks stabilizing and re-stabilizing issues
• End up working 1-3 months to get a release out the door, almost certainly missing our 13-week plan
• And at the end, finally ship something we were happy with... but that left us pretty drained (i.e. the bad flow)
The chain of events
• We'd hold releases, and sometimes our branch points, until features were done, which meant...
• Lots of changes got queued up on the trunk, which caused...
• Engineers merging lots of changes to the branch, which led to...
• Instability on the branch that we had to deal with, and that meant...
That chain of events...
• Long-lived branches, which...
• Made supporting the branch (i.e. merging changes) more difficult as we drifted further from the trunk...
• All of which led to...
  o Unpredictable release cycles, with date targets we could never hit, and...
  o Engineers who were always in a rush to get their features into "this" release, since we couldn't make any promises about the schedule
Which led me to think...
• I need a long vacation.
• There has to be a better way.
• What's causing things not to flow smoothly?
...Yes... in that order.
Primary Goals
• Shorten the release cycle and reduce the life span of a branch (make merges easier)
• Make releases more predictable and easier to scope
• Reduce the strain on the entire team
Goal to Plan - Getting down the cycle
• Originally set out a plan for a simplified 12-week cycle: 6 weeks on dev, 6 weeks on beta, and then to stable.
• Once diagrammed, using blocks to represent the weeks, we realized that with 3 channels you could have two overlapping releases running at once, which could get us a stable release roughly every 6 weeks.
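The overlap described above can be sketched as a small date calculation. This is only an illustration, assuming each release spends exactly 6 weeks on dev and 6 weeks on beta and that a new release enters dev every 6 weeks; the function name `release_schedule` and the start date are hypothetical and not taken from the original deck.

```python
from datetime import date, timedelta

PHASE_WEEKS = 6  # assumed length of each of the dev and beta phases


def release_schedule(first_dev_start, count):
    """Return a list of (release_no, dev_start, beta_start, stable_date)."""
    phase = timedelta(weeks=PHASE_WEEKS)
    schedule = []
    for n in range(count):
        # A new release enters dev as soon as the previous one moves to beta.
        dev_start = first_dev_start + n * phase
        beta_start = dev_start + phase
        stable_date = beta_start + phase
        schedule.append((n + 1, dev_start, beta_start, stable_date))
    return schedule


# Hypothetical start date, chosen only to make the example concrete.
for number, dev, beta, stable in release_schedule(date(2024, 1, 1), 4):
    print(f"release {number}: dev {dev}, beta {beta}, stable {stable}")
```

Each release still takes 12 weeks from the start of dev to stable, but because the next release is on dev while the current one is on beta, consecutive stable dates land 6 weeks apart.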
It was a start, but it wasn't the full answer
• Sure, turning the wheel faster would make some things, like merges, easier.
• But without addressing the scope problems, the flow wouldn't be any better, and scope would continue to interrupt releases.
• Things wouldn't be any more predictable, and the whole cycle would just fall apart (more puddles, less of a stream).
Goal to Plan - Controlling Scope
• The pace of the schedule sets the boundaries for the amount of work that can be completed.
• It's important to have specific points in the schedule to review features and cut scope.
• Establish clear expectations (and engineering practice) with developers that any features not ready to ship at branch will be disabled (i.e. we only cut post branch, never add).
Plan to Pitch
• Reached out to the various cross-functional groups on the Chrome team, who would all be impacted, before approaching our executives.
  o Engineering, QA, Product, Marketing, Support, Localization, Security, etc.
• There were a lot of concerns to address, but the exercise forced a lot of thought on the implications of the schedule change.
• It took time, but it made the pitch easier to present to the leadership and the rest of the team.
Concerns
• Would large feature development still be possible? "Yes. Engineers would have to work behind flags; however, they can work for as many releases as they need to and remove the flag when they are done."
• Can the engineers keep up? "Their pace won't need to change. Since features can be disabled, there should be no milestone pressure; things ship when they are ready."
• What would a world look like where we didn't base our marketing on releases? "We market features, not releases."
And so we implemented it: 11-week overlapping schedules.
Our General Rules
• The branch point is the end of our development cycle.
• Features only get disabled post branch point.
• Features should be engineered so that they can be disabled easily (1 patch).
• Only stability, security, and critical regressions can ever block a release.
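One way to read the "disabled easily (1 patch)" rule is that each feature sits behind a single switch, so turning it off after the branch point is a one-line change. The sketch below is a generic illustration of that idea, not Chrome's actual flag system; the flag names, `is_enabled`, and `render_page` are all hypothetical.

```python
# Hypothetical flag table. Disabling a feature post branch means flipping
# one value here, which keeps the change to a single small patch.
FEATURE_FLAGS = {
    "new_tab_page": True,
    "experimental_sync": False,  # not ready at branch point, so switched off
}


def is_enabled(name):
    """Unknown features default to off, so unfinished work never ships by accident."""
    return FEATURE_FLAGS.get(name, False)


def render_page():
    """Build the list of UI parts, adding only the features that are enabled."""
    parts = ["toolbar"]
    if is_enabled("new_tab_page"):
        parts.append("new tab page")
    if is_enabled("experimental_sync"):
        parts.append("sync panel")
    return parts


print(render_page())
```

Keeping the check in one place per feature is what makes multi-release development behind a flag practical: the code lands on trunk continuously, and the flag is removed only when the feature is done.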
Things we struggled with
• Overcoming team inertia (particularly when we cut scope)
• Saying no consistently to the team
• Fighting the urge to return to date-driven management decisions
• We initially didn't solve the trunk->dev->branch time-to-patch problem (the daily channel helped)
• Setting the right burn-down point (i.e. branch point)
• Disabling features
Key Lessons
• Having clear and concise rules is important.
• Predictable schedules cut down on communication costs and team confusion.
• It takes work and forethought to disable features.
• Diligent feature tracking is important.
Conclusion
• Speed alone isn't always the right answer; it's about keeping things running smoothly.
• For us, scope was getting in the way of that goal. We basically wanted to operate more like trains leaving Grand Central Station (regularly scheduled and always on time), and less like taxis leaving the Bronx (ad hoc and unpredictable).