The Evolution of Continuous Delivery at Scale @ LinkedIn

THE EVOLUTION OF
CONTINUOUS DELIVERY AT SCALE
QCon SF
Nov 2014
Jason Toy
jtoy@linkedin.com
Watch the video with slide synchronization on InfoQ.com:
http://www.infoq.com/presentations/cd-linkedin
Presented at QCon San Francisco
www.qconsf.com
How did we evolve our solution to allow developers to quickly iterate on creating product as LinkedIn engineering grew from 30 to 1800 technologists?
We will be talking about that evolution today.
• How we have improved developer productivity and the release pipeline
• The pitfalls we’ve seen
• How we’ve tackled them
• What it took
• What we have learned
What have we accomplished as we scaled?
• Scaling: From 2007 to Today
• 5 services -> 550+ services
• 30 -> 1800+ technologists
• 13 million members -> 332 million members
• At the same time
• Monolithic deployments to prod once every several weeks -> Independent deployments when ready
• Manual -> Automated commit to production pipeline
• Faster iterations on the technology stack
LinkedIn 2007
• ~30 developers, 5-10 services
• Trunk based development
• Testing
• Mostly manual
• Nightly regressions: automated JUnit, manual functional
• Release (every couple of weeks)
• Create branch and deployment ordering
• Rehearse deployment, run tests in staging
• Site downtime to push release (All eng + ops party)
Problems in 2007
• Testing and Development
• Trunk stability: large changes, manual/local/nightly testing
• Codebase increasing in size
• Release
• Infrequent, and time consuming
LinkedIn 2008-2011
• ~300 developers, ~300 services
• Branch based development, merge for release
• Testing
• Added automated ‘Feature Branch Readiness’
• Before merge, prove the branch had 0 test failures / issues
• Release (every couple of weeks)
• Exactly as before:
• Create, rehearse, and execute a deployment ordering.
Improvements in 2008-2011
• Branches supported more developers
• More automated testing
Tradeoff: Branch Hell
• Qualifying 20-40 branches
• Stabilizing the release branch was hard
• Point of friction: fragile/flaky/unmaintained tests
• Impact:
• Frustrating process became a power struggle
Problem: Deployment Hell
• Monolithic change with 29 levels of ordering
• Must fix forward: too complex to roll back
• Manual prod deployment did not scale:
• Dangerous, painful, and long (2 days)
• Impact:
• Operations very expensive and distracting
• Missing a release became expensive for developers
• More hotfixes and alternative processes were created
LinkedIn 2011: The Turning Point
• Company-wide Project Inversion
• Build a well-defined release process
• Move to trunk development
• Automated deployment process
• Build the tooling to support this!
• Enforcing good engineering practices.
• No more isolated development (no branches)
• No backward-incompatible changes
• Remove deployment dependencies
• Simplify architecture (complexity has a cascading effect)
• Code must be able to go out at any time
LinkedIn 2011
• ~600 developers, ~250 services
• Trunk based development
• Testing:
• Mostly automated
• Source code validation: post-commit test automation
• Artifact validation: automated jobs in the test environment
• Release:
• On your own timeline per service
• One button to push to deploy to testing or prod
How did we make this work?
(A mixture of people, process, and tooling)
Commit Pipeline
• Pre/Post commit (PCX) machinery
• On each commit, tests are run
• Focused test effort: scope based on change set
• Automated remediation: either block or roll back
• Small team maintains machinery and stability
• Creates new artifact upon success
• Working Copy Test
• PCX machinery to test local changes before commit
• Great for qualifying massive/horizontal changes
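The two ideas on this slide, scoping tests to the change set and automated remediation, can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the directory-to-suite mapping, the suite names, and the rule that a pre-commit failure blocks while a post-commit failure rolls back are assumptions for illustration, since the slides do not describe how the PCX machinery decides either.

```python
# Hypothetical mapping from source directories to the test suites covering them.
TEST_SCOPES = {
    "profile/": ["profile-unit", "profile-integration"],
    "search/": ["search-unit"],
    "common/": ["common-unit", "profile-unit", "search-unit"],
}


def select_tests(changed_files):
    """Return the sorted set of suites whose scope overlaps the change set."""
    selected = set()
    for path in changed_files:
        for prefix, suites in TEST_SCOPES.items():
            if path.startswith(prefix):
                selected.update(suites)
    return sorted(selected)


def remediate(stage, tests_passed):
    """Decide what happens to a commit after its scoped tests have run.

    Assumed policy: a failure before commit blocks the change, a failure
    after commit rolls it back, and success produces a new artifact.
    """
    if tests_passed:
        return "publish-artifact"
    return "block" if stage == "pre" else "rollback"
```

A change touching only `search/` would run a single suite under this mapping, while a change to `common/` fans out to every suite that depends on it.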
Shared Test Environment
• Continuously test artifacts with automated jobs
• Stability treated with the same respect as trunk
• Can test local changes against environment
Deployment vs Release
• New distinction:
• Deployment (new change to the site)
• Trunk must be deployable at all times
• Release (new feature for customers)
• Feature exposure ramped through configs
• Predictable schedule for releasing change
• Product teams can release functionality at will without interfering with change
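One common way to ramp feature exposure through configs, separately from deployment, is a percentage flag evaluated per member. The sketch below assumes that approach; the function names, the config layout, and the hashing scheme are illustrative and are not taken from the talk.

```python
import hashlib


def bucket(member_id, feature):
    """Map a member to a stable bucket in [0, 100) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{member_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_exposed(member_id, feature, ramp_config):
    """Return True if the feature is switched on for this member.

    ramp_config maps feature names to an exposure percentage (0 to 100).
    Code for the feature is already deployed; only the config changes
    when the feature is released to more members.
    """
    percent = ramp_config.get(feature, 0)
    return bucket(member_id, feature) < percent
```

Because the bucket is derived from a hash rather than a random draw, a member who sees the feature at 10% keeps seeing it as the ramp moves to 50% and 100%, and setting the percentage back to 0 withdraws the feature without a new deployment.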
Deployment Process
• Deployment Sequence:
1. Canary Deployment (New!)
2. Full rollout
3. Ramp feature exposure (New!)
4. Problem? Revert step. (New!)
• No deployment dependencies allowed
• Fully automated
• Owners or automation nominate a deployment or rollback
• All the deployment / rollback information is in plans
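The deployment sequence above can be sketched as a small plan executor in Python. The step names, the callback signatures, and the choice to revert only the failing step are assumptions made for illustration; the slides say only that the plan holds the deployment and rollback information and that a problem triggers a revert.

```python
# Hypothetical plan mirroring the sequence on the slide.
PLAN = ["canary", "full_rollout", "ramp_feature"]


def execute(plan, apply_step, revert_step, is_healthy):
    """Run each step in order; on an unhealthy step, revert it and stop.

    apply_step, revert_step: callables taking a step name.
    is_healthy: callable taking a step name and returning a bool.
    Returns a log of (action, step) tuples describing what was done.
    """
    log = []
    for step in plan:
        apply_step(step)
        log.append(("apply", step))
        if not is_healthy(step):
            revert_step(step)
            log.append(("revert", step))
            break
    return log
```

Keeping the apply and revert actions behind callbacks means the same executor can drive a canary host, a full fleet, or a config ramp, which is one way to read "no deployment dependencies allowed": each step can be undone on its own.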
People
• Everyone had to be willing to change
• Greater engineering responsibility
• No backward-incompatible changes
• Rethink architecture, practices (piecewise features)
• In return gave ownership of products and quality back to engineers
• Release on your own schedule
• Local decision making
• You are responsible for your quality, not a central team
• You own a piece of the codebase, not a branch (ACLs)
Tooling
• ACLs for code review
• Pre/Post commit CI framework / pipeline
• CRT: Change Request Tracker
• Developer commit lifecycle management
• Deployment automation plans / Canaries
• Performance
• e.g. evaluate canaries on things like exceptions
• Test Manager
• Manage automated tests (mostly in test environment)
• Monitoring for environment / service stability
• Config changes to ramp features
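Evaluating a canary "on things like exceptions" could look like the following Python sketch, which compares the canary's exception rate against the rest of the fleet. The thresholds, parameter names, and the minimum-traffic rule are assumptions for illustration, not values from the talk.

```python
def canary_passes(baseline_errors, baseline_requests,
                  canary_errors, canary_requests,
                  max_ratio=1.5, floor=0.001, min_requests=100):
    """Return True if the canary's exception rate is acceptable.

    The canary fails if it has served too little traffic to judge, or if
    its exception rate exceeds both an absolute floor and max_ratio times
    the baseline rate.
    """
    if canary_requests < min_requests:
        return False  # not enough data to promote the canary
    baseline_rate = baseline_errors / max(baseline_requests, 1)
    canary_rate = canary_errors / canary_requests
    threshold = max(baseline_rate * max_ratio, floor)
    return canary_rate <= threshold
```

The absolute floor keeps a near-zero baseline from failing every canary on a single stray exception; whether that tradeoff is right depends on the service.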
Improvements in 2011
• No merge hell
• Find failures faster
• Keep testing sane and automated
• Independent and easy deployment and release
• Create greater ownership
• More control over, and responsibility for, your decisions
• Breaking the barriers: Easier to work with others
Challenges in 2011 (Overcome)
• Breakages immediately affect others, so find and remove failures fast
• Pre and post commit automation
• Hard to save off work in progress
• Break down your feature into commits that are safe to push to production. Use configs to ramp
Problems in 2011
• Monolithic Codebase
• Not flexible enough to accommodate
• Acquisitions
• Exploration
• Iterations needed to be even faster (non global block)
• Ownership could be clearer
• Of code
• Of failures
• Developer count and codebase grew significantly (again)
Multiproduct
• ~1500 products, ~1800 devs, ~550 services
• Ecosystem of smaller individual products, each with an individual release cycle
• Can depend on artifacts from other products
• Uniform process of lifecycle and tasks
• Abstractions allow us to build generic tooling to accommodate a variety of technologies and products
• Lifecycle / tasks (e.g. build, test, deploy) owner defined
• Testing and Release mostly the same
• During your postcommit we test everything that depends on you – to ensure you aren’t breaking anything
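Testing "everything that depends on you" amounts to walking the product dependency graph in reverse. A minimal Python sketch, assuming the dependencies are available as a mapping from each product to the products it depends on (the product names and the data layout are hypothetical):

```python
from collections import deque


def dependents_of(product, depends_on):
    """Return every product that directly or transitively depends on `product`.

    depends_on maps a product name to the list of products it depends on.
    """
    # Invert the graph: for each dependency, record who depends on it.
    reverse = {}
    for prod, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(prod)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([product])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(product)  # guard against cycles leading back to the start
    return sorted(seen)
```

With this list in hand, a postcommit job would run the tests of each returned product against the new artifact before it is published.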
Improvements with Multiproduct
• No monolithic codebase
• Flexible
• Easier, faster to validate and not block
Challenges with Multiproduct
• Architecture
• Versioning Hell
• Circular Dependencies
• How to work across many products
• How to work with others
• Give people full control (no central police)
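Circular dependencies between products can be detected mechanically before they cause trouble. The following Python sketch uses a depth-first search over a hypothetical product-to-dependencies mapping; it is one standard technique, not necessarily what LinkedIn's tooling does.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of products, or None.

    depends_on maps a product name to the list of products it depends on.
    The returned list starts and ends with the same product.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for dep in depends_on.get(node, ()):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path, so we closed a loop.
                return stack[stack.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = DONE
        return None

    for node in depends_on:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Running a check like this when a product declares a new dependency would reject the cycle at the point it is introduced, rather than leaving it to surface at build or release time.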
Conclusion: Key Successes
• 0 Test Failures
• Multitude of automated testing options
• Automated, independent, frequent deployments
• Distinguish between Deployments and Release
• More accountability and ownership for teams
Conclusion: Takeaways
• Notice any trends?
• Validate fast, early, often
• Simplify
• Build the tooling to succeed
• Creating more digestible pieces, giving more control to owners
• It’s all a matter of tradeoffs and priorities
• They change over time
• Ours seem to be getting better!
• It’s not only about technology: culture matters
• Change, Ownership, Craftsmanship
• People, process, technology
• Invest in improvements, and stick with it
Thanks!
Questions?
29
30
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/cd-
linkedin
1 of 33

Recommended

How to contribute to an open source project and don’t die during the Code Rev... by
How to contribute to an open source project and don’t die during the Code Rev...How to contribute to an open source project and don’t die during the Code Rev...
How to contribute to an open source project and don’t die during the Code Rev...Victor Morales
122 views12 slides
Validating latest changes with XCI by
Validating latest changes with XCIValidating latest changes with XCI
Validating latest changes with XCIVictor Morales
197 views17 slides
Effective .NET Core Unit Testing with SQLite and Dapper by
Effective .NET Core Unit Testing with SQLite and DapperEffective .NET Core Unit Testing with SQLite and Dapper
Effective .NET Core Unit Testing with SQLite and DapperMike Melusky
174 views23 slides
Jenkins Reviewbot by
Jenkins ReviewbotJenkins Reviewbot
Jenkins ReviewbotYardena Meymann
4.8K views16 slides

More Related Content

What's hot

Effective .NET Core Unit Testing with SQLite and Dapper by
Effective .NET Core Unit Testing with SQLite and DapperEffective .NET Core Unit Testing with SQLite and Dapper
Effective .NET Core Unit Testing with SQLite and DapperMike Melusky
2.6K views23 slides
Cross Community CI project by
Cross Community CI projectCross Community CI project
Cross Community CI projectVictor Morales
186 views14 slides
Play 2 Java Framework with TDD by
Play 2 Java Framework with TDDPlay 2 Java Framework with TDD
Play 2 Java Framework with TDDBasav Nagur
1K views26 slides
ONAP on Vagrant for ONAPers by
ONAP on Vagrant for ONAPersONAP on Vagrant for ONAPers
ONAP on Vagrant for ONAPersVictor Morales
250 views13 slides
Embracing Observability in CI/CD with OpenTelemetry by
Embracing Observability in CI/CD with OpenTelemetryEmbracing Observability in CI/CD with OpenTelemetry
Embracing Observability in CI/CD with OpenTelemetryCyrille Le Clerc
465 views17 slides
JENKINS Training by
JENKINS TrainingJENKINS Training
JENKINS TrainingNithin Kumar
246 views8 slides

What's hot(20)

Effective .NET Core Unit Testing with SQLite and Dapper by Mike Melusky
Effective .NET Core Unit Testing with SQLite and DapperEffective .NET Core Unit Testing with SQLite and Dapper
Effective .NET Core Unit Testing with SQLite and Dapper
Mike Melusky2.6K views
Play 2 Java Framework with TDD by Basav Nagur
Play 2 Java Framework with TDDPlay 2 Java Framework with TDD
Play 2 Java Framework with TDD
Basav Nagur1K views
Embracing Observability in CI/CD with OpenTelemetry by Cyrille Le Clerc
Embracing Observability in CI/CD with OpenTelemetryEmbracing Observability in CI/CD with OpenTelemetry
Embracing Observability in CI/CD with OpenTelemetry
Cyrille Le Clerc465 views
Continuous integration / deployment with Jenkins by cherryhillco
Continuous integration / deployment with JenkinsContinuous integration / deployment with Jenkins
Continuous integration / deployment with Jenkins
cherryhillco2.3K views
Louisville Software Engineering Meet Up: Continuous Integration Using Jenkins by James Strong
Louisville Software Engineering Meet Up: Continuous Integration Using JenkinsLouisville Software Engineering Meet Up: Continuous Integration Using Jenkins
Louisville Software Engineering Meet Up: Continuous Integration Using Jenkins
James Strong419 views
Building a loosely coupled toolchain with Rundeck and Puppet by smeunier114
Building a loosely coupled toolchain with Rundeck and PuppetBuilding a loosely coupled toolchain with Rundeck and Puppet
Building a loosely coupled toolchain with Rundeck and Puppet
smeunier1144.4K views
Start with Angular framework by Knoldus Inc.
Start with Angular frameworkStart with Angular framework
Start with Angular framework
Knoldus Inc.269 views
Continuous integration by Lior Tal
Continuous integrationContinuous integration
Continuous integration
Lior Tal2.4K views
Modern Tools for Building Progressive Web Apps by All Things Open
Modern Tools for Building Progressive Web AppsModern Tools for Building Progressive Web Apps
Modern Tools for Building Progressive Web Apps
All Things Open102 views
Docker в автоматизации тестирования by COMAQA.BY
Docker в автоматизации тестированияDocker в автоматизации тестирования
Docker в автоматизации тестирования
COMAQA.BY 936 views
Microsoft ASP.NET 5 - The new kid on the block by Christos Matskas
Microsoft ASP.NET 5 - The new kid on the block Microsoft ASP.NET 5 - The new kid on the block
Microsoft ASP.NET 5 - The new kid on the block
Christos Matskas1.3K views
Expedia 3x3 presentation by Drew Hannay
Expedia 3x3 presentationExpedia 3x3 presentation
Expedia 3x3 presentation
Drew Hannay445 views

Similar to The Evolution of Continuous Delivery at Scale @ Linkedin

Dev/Test scenarios in DevOps world by
Dev/Test scenarios in DevOps worldDev/Test scenarios in DevOps world
Dev/Test scenarios in DevOps worldDavide Benvegnù
1.1K views28 slides
2016 09-dev opsjourney-devopsdaysoslo by
2016 09-dev opsjourney-devopsdaysoslo2016 09-dev opsjourney-devopsdaysoslo
2016 09-dev opsjourney-devopsdaysosloJon Arild Tørresdal
203 views24 slides
Continuous Delivery for the Rest of Us by
Continuous Delivery for the Rest of UsContinuous Delivery for the Rest of Us
Continuous Delivery for the Rest of UsC4Media
768 views58 slides
Adopting Continuous Integration in an Ops Group by
Adopting Continuous Integration in an Ops GroupAdopting Continuous Integration in an Ops Group
Adopting Continuous Integration in an Ops Groupcolleenfry
2.2K views44 slides
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CD by
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CDDevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CD
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CDDevOps_Fest
154 views42 slides
Road to Continuous Delivery - Wix.com by
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comAviran Mordo
3K views70 slides

Similar to The Evolution of Continuous Delivery at Scale @ Linkedin(20)

Dev/Test scenarios in DevOps world by Davide Benvegnù
Dev/Test scenarios in DevOps worldDev/Test scenarios in DevOps world
Dev/Test scenarios in DevOps world
Davide Benvegnù1.1K views
Continuous Delivery for the Rest of Us by C4Media
Continuous Delivery for the Rest of UsContinuous Delivery for the Rest of Us
Continuous Delivery for the Rest of Us
C4Media768 views
Adopting Continuous Integration in an Ops Group by colleenfry
Adopting Continuous Integration in an Ops GroupAdopting Continuous Integration in an Ops Group
Adopting Continuous Integration in an Ops Group
colleenfry2.2K views
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CD by DevOps_Fest
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CDDevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CD
DevOps Fest 2020. Kohsuke Kawaguchi. GitOps, Jenkins X & the Future of CI/CD
DevOps_Fest154 views
Road to Continuous Delivery - Wix.com by Aviran Mordo
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.com
Aviran Mordo3K views
DevOps-as-a-Service: Towards Automating the Automation by Keith Pleas
DevOps-as-a-Service: Towards Automating the AutomationDevOps-as-a-Service: Towards Automating the Automation
DevOps-as-a-Service: Towards Automating the Automation
Keith Pleas6.7K views
Gartner Infrastructure and Operations Summit Berlin 2015 - DevOps Journey by Kelly Looney
Gartner Infrastructure and Operations Summit Berlin 2015 - DevOps JourneyGartner Infrastructure and Operations Summit Berlin 2015 - DevOps Journey
Gartner Infrastructure and Operations Summit Berlin 2015 - DevOps Journey
Kelly Looney614 views
Lean-Agile Development with SharePoint - Bill Ayers by SPC Adriatics
Lean-Agile Development with SharePoint - Bill AyersLean-Agile Development with SharePoint - Bill Ayers
Lean-Agile Development with SharePoint - Bill Ayers
SPC Adriatics829 views
Wix Dev-Centric Culture And Continuous Delivery by Aviran Mordo
Wix Dev-Centric Culture And Continuous DeliveryWix Dev-Centric Culture And Continuous Delivery
Wix Dev-Centric Culture And Continuous Delivery
Aviran Mordo3.1K views
Constant Contact SF's Road to CD by Solano Labs
Constant Contact SF's Road to CDConstant Contact SF's Road to CD
Constant Contact SF's Road to CD
Solano Labs835 views
Dev ops != Dev+Ops by Shalu Ahuja
Dev ops != Dev+OpsDev ops != Dev+Ops
Dev ops != Dev+Ops
Shalu Ahuja1.7K views
SQL Server DevOps Jumpstart by Ori Donner
SQL Server DevOps JumpstartSQL Server DevOps Jumpstart
SQL Server DevOps Jumpstart
Ori Donner1.6K views

More from C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video by
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
2.5K views171 slides
Next Generation Client APIs in Envoy Mobile by
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
845 views107 slides
Software Teams and Teamwork Trends Report Q1 2020 by
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
530 views11 slides
Understand the Trade-offs Using Compilers for Java Applications by
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
494 views59 slides
Kafka Needs No Keeper by
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
579 views127 slides
High Performing Teams Act Like Owners by
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
409 views75 slides

More from C4Media(20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video by C4Media
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
C4Media2.5K views
Next Generation Client APIs in Envoy Mobile by C4Media
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
C4Media845 views
Software Teams and Teamwork Trends Report Q1 2020 by C4Media
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
C4Media530 views
Understand the Trade-offs Using Compilers for Java Applications by C4Media
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
C4Media494 views
Kafka Needs No Keeper by C4Media
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
C4Media579 views
High Performing Teams Act Like Owners by C4Media
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
C4Media409 views
Does Java Need Inline Types? What Project Valhalla Can Bring to Java by C4Media
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
C4Media339 views
Service Meshes- The Ultimate Guide by C4Media
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
C4Media269 views
Shifting Left with Cloud Native CI/CD by C4Media
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
C4Media300 views
CI/CD for Machine Learning by C4Media
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
C4Media355 views
Fault Tolerance at Speed by C4Media
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
C4Media286 views
Architectures That Scale Deep - Regaining Control in Deep Systems by C4Media
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
C4Media323 views
ML in the Browser: Interactive Experiences with Tensorflow.js by C4Media
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
C4Media1.7K views
Build Your Own WebAssembly Compiler by C4Media
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
C4Media297 views
User & Device Identity for Microservices @ Netflix Scale by C4Media
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
C4Media1.2K views
Scaling Patterns for Netflix's Edge by C4Media
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
C4Media588 views
Make Your Electron App Feel at Home Everywhere by C4Media
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
C4Media587 views
The Talk You've Been Await-ing For by C4Media
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
C4Media250 views
Future of Data Engineering by C4Media
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
C4Media1.7K views
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More by C4Media
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
C4Media842 views

Recently uploaded

Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...ShapeBlue
114 views12 slides
Digital Personal Data Protection (DPDP) Practical Approach For CISOs by
Digital Personal Data Protection (DPDP) Practical Approach For CISOsDigital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOsPriyanka Aash
103 views59 slides
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... by
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...ShapeBlue
74 views17 slides
Future of AR - Facebook Presentation by
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook PresentationRob McCarty
54 views27 slides
DRBD Deep Dive - Philipp Reisner - LINBIT by
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBITShapeBlue
110 views21 slides
Ransomware is Knocking your Door_Final.pdf by
Ransomware is Knocking your Door_Final.pdfRansomware is Knocking your Door_Final.pdf
Ransomware is Knocking your Door_Final.pdfSecurity Bootcamp
81 views46 slides

Recently uploaded(20)

Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by ShapeBlue
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
ShapeBlue114 views
Digital Personal Data Protection (DPDP) Practical Approach For CISOs by Priyanka Aash
Digital Personal Data Protection (DPDP) Practical Approach For CISOsDigital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOs
Priyanka Aash103 views
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... by ShapeBlue
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
ShapeBlue74 views
Future of AR - Facebook Presentation by Rob McCarty
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook Presentation
Rob McCarty54 views
DRBD Deep Dive - Philipp Reisner - LINBIT by ShapeBlue
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBIT
ShapeBlue110 views
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit... by ShapeBlue
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
ShapeBlue86 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue128 views
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue138 views
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by ShapeBlue
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue178 views
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue by ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue147 views
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue by ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
ShapeBlue75 views
Webinar : Desperately Seeking Transformation - Part 2: Insights from leading... by The Digital Insurer
Webinar : Desperately Seeking Transformation - Part 2:  Insights from leading...Webinar : Desperately Seeking Transformation - Part 2:  Insights from leading...
Webinar : Desperately Seeking Transformation - Part 2: Insights from leading...
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
ShapeBlue191 views
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue by ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue63 views
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue by ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
ShapeBlue68 views

The Evolution of Continuous Delivery at Scale @ Linkedin

  • 1. THE EVOLUTION OF CONTINUOUS DELIVERY AT SCALE QCon SF Nov 2014 Jason Toy jtoy@linkedin.com 1
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /cd-linkedin
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4. How did we evolve our solution to allow developers to quickly iterate on creating product as LinkedIn engineering grew from 30 to 1800 technologists? 2 ?
  • 5. We will be talking about that evolution today. 3 • How we have improved developer productivity and the release pipeline • The pitfalls we’ve seen • How we’ve tackled them • What it took • What we have learned
  • 6. 4 What have we accomplished as we scaled?? • Scaling: From 2007 to Today • 5 services -> 550+ services • 30 -> 1800+ technologists • 13 million members -> 332 million members • At the same time • Monolithic deployments to prod once every several weeks -> Independent deployments when ready • Manual -> Automated commit to production pipeline • Faster iterations on the technology stack
  • 7. 5 LinkedIn 2007 • ~30 developers, 5-10 services • Trunk based development • Testing • Mostly manual • Nightly regressions: automated junit, manual functional • Release (Every couple weeks) • Create branch and deployment ordering • Rehearse deployment, run tests in staging • Site downtime to push release (All eng + ops party)
  • 8. Problems in 2007 • Testing and Development • Trunk stability: large changes, manual/local/nightly testing • Codebase increasing in size • Release • Infrequent, and time consuming 6
  • 9. LinkedIn 2008-2011 • ~ 300 developers, ~300 services • Branch based development, merge for release • Testing • Added automated ‘Feature Branch Readiness’ • Before merge prove branch had 0 test failures / issues • Release (Every couple weeks) • Exactly as before: • Create, rehearse, and execute a deployment ordering. 7
  • 10. Improvements in 2008-2011 • Branches supported more developers • More automated testing 8
  • 11. Tradeoff: Branch Hell • Qualifying 20-40 branches • Stabilizing release branch hard • Point of friction: fragile/flaky/unmaintained tests • Impact: • frustrating process became power struggle 9
  • 12. Problem: Deployment Hell • Monolithic change with 29 levels of ordering • Must fix forward: too complex to roll back • Manual prod deployment did not scale: • Dangerous, painful, and long (2 days) • Impact: • Operations very expensive and distracting • Missing a release became expensive to developers • More hotfixes and alternative processes created
  • 13. LinkedIn 2011: The Turning Point • Company-wide Project Inversion • Build a well-defined release process • Move to trunk development • Automated deployment process • Build the tooling to support this! • Enforcing good engineering practices • No more isolated development (no branches) • No backwards-incompatible changes • Remove deployment dependencies • Simplify architecture (complexity has a cascading effect) • Code must be able to go out at any time
  • 14. LinkedIn 2011 • ~600 developers, ~250 services • Trunk-based development • Testing: • Mostly automated • Source code validation: post-commit test automation • Artifact validation: automated jobs in the test environment • Release: • On your own timeline per service • One button to push to deploy to testing or prod
  • 15. How did we make this work? (A mixture of people, process, and tooling)
  • 16. Commit Pipeline • Pre/Post commit (PCX) machinery • On each commit, tests are run • Focused test effort: scope based on change set • Automated remediation: either block or rollback • Small team maintains machinery and stability • Creates new artifact upon success • Working Copy Test • PCX machinery to test local changes before commit • Great for qualifying massive/horizontal changes
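The commit pipeline bullets above (scope tests to the change set, then block or roll back on failure, and create an artifact on success) can be illustrated with a minimal sketch. This is not LinkedIn's PCX machinery: the `TEST_SCOPES` mapping, the `Commit` type, the suite names, and the artifact label format are all hypothetical, and the test runner is passed in as a plain callable.

```python
from dataclasses import dataclass

# Hypothetical mapping from source directory prefix to the suites covering it.
TEST_SCOPES = {
    "profile/": ["profile-unit", "profile-integration"],
    "search/": ["search-unit"],
}


@dataclass
class Commit:
    revision: int
    changed_paths: list


def select_tests(commit, scopes=TEST_SCOPES):
    """Return the sorted test suites whose scope matches the change set."""
    selected = set()
    for path in commit.changed_paths:
        for prefix, suites in scopes.items():
            if path.startswith(prefix):
                selected.update(suites)
    return sorted(selected)


def process_commit(commit, run_suite, pre_commit):
    """Run the scoped suites and decide what happens to the commit.

    run_suite is a callable(suite_name) -> bool supplied by the caller.
    A pre-commit failure blocks the change; a post-commit failure rolls it back.
    A passing commit yields a new artifact (represented here by a label only).
    """
    failures = [s for s in select_tests(commit) if not run_suite(s)]
    if failures:
        action = "block" if pre_commit else "rollback"
        return {"action": action, "failed": failures}
    return {"action": "publish", "artifact": f"build-r{commit.revision}"}
```

Calling `process_commit` with `pre_commit=True` would correspond to the Working Copy Test case, where local changes are qualified before they reach trunk.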
  • 17. Shared Test Environment • Continuously test artifacts with automated jobs • Stability treated with the same respect as trunk • Can test local changes against environment
  • 18. Deployment vs Release • New distinction: • Deployment (new change to the site) • Trunk must be deployable at all times • Release (new feature for customers) • Feature exposure ramped through configs • Predictable schedule for releasing change • Product teams can release functionality at will without interfering with change
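One common way to ramp feature exposure through configs, separately from deployment, is a percentage gate keyed on a stable hash of the member. The sketch below assumes that approach; the talk does not say how LinkedIn's ramp configs work, and `RAMP_CONFIG`, the feature names, and the hashing scheme are hypothetical.

```python
import hashlib

# Hypothetical ramp config: feature name -> percent of members exposed.
RAMP_CONFIG = {"new-inbox": 10, "old-feature": 100, "dark-launch": 0}


def bucket(member_id, feature):
    """Map a member to a stable bucket in [0, 100) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{member_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature, member_id, config=RAMP_CONFIG):
    """True if the member falls inside the feature's current ramp percentage.

    Features missing from the config are treated as fully off.
    """
    return bucket(member_id, feature) < config.get(feature, 0)
```

Because the bucket depends only on the feature name and member id, raising a percentage in the config only adds members to the exposed set, and the code itself can be deployed dark (0%) long before the feature is released.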
  • 19. Deployment Process • Deployment Sequence: 1. Canary Deployment (New!) 2. Full rollout 3. Ramp feature exposure (New!) 4. Problem? Revert step. (New!) • No deployment dependencies allowed • Fully automated • Owners / Auto nominate deployment or rollback • All the deployment / rollback information is in plans
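The deployment sequence above can be sketched as a plan that is executed step by step with a health check after each step. This is an illustration under assumptions, not the actual plan format: the step names and the `execute` and `healthy` callables are hypothetical, and reverting every completed step in reverse order is this sketch's choice (the slide only says "revert step").

```python
def run_deployment(plan, execute, healthy):
    """Run plan steps in order; on an unhealthy step, revert completed steps.

    plan: list of step names, for example canary, full rollout, ramp.
    execute: callable(step, direction) where direction is "apply" or "revert".
    healthy: callable(step) -> bool, checked after each applied step.
    Returns a log of (step, direction) tuples in the order they ran.
    """
    log = []
    done = []
    for step in plan:
        execute(step, "apply")
        log.append((step, "apply"))
        done.append(step)
        if not healthy(step):
            # Undo the failing step first, then earlier steps, newest to oldest.
            for prev in reversed(done):
                execute(prev, "revert")
                log.append((prev, "revert"))
            break
    return log
```

Keeping both the apply and revert actions behind one plan mirrors the slide's point that all deployment and rollback information lives in plans.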
  • 20. People • Everyone had to be willing to change • Greater engineering responsibility • No backwards-incompatible changes • Rethink architecture, practices (piecewise features) • In return, gave ownership of products and quality back to engineers • Release on your own schedule • Local decision making • You are responsible for your quality, not a central team • You own a piece of the codebase, not a branch (ACLs)
  • 21. Tooling • ACLs for code review • Pre/Post commit CI framework / pipeline • CRT: Change Request Tracker • Developer commit lifecycle management • Deployment automation plans / Canaries • Performance • e.g., evaluate canaries on things like exceptions • Test Manager • Manage automated tests (mostly in test environment) • Monitoring for environment / service stability • Config changes to ramp features
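Evaluating canaries on exceptions, as mentioned in the tooling list above, could look roughly like the comparison below. The thresholds (`max_ratio`, `min_requests`, and the absolute floor) and the input shape are assumptions for illustration; the talk does not describe the actual evaluation criteria.

```python
def evaluate_canary(baseline, canary, max_ratio=1.5, min_requests=100):
    """Compare the exception rate of canary hosts against baseline hosts.

    baseline / canary: dicts with "requests" and "exceptions" counts.
    Returns "insufficient-data", "pass", or "fail".
    """
    if canary["requests"] < min_requests or baseline["requests"] < min_requests:
        return "insufficient-data"
    base_rate = baseline["exceptions"] / baseline["requests"]
    canary_rate = canary["exceptions"] / canary["requests"]
    # Small absolute floor so a zero-exception baseline does not fail every canary.
    allowed = max(base_rate * max_ratio, 0.001)
    return "pass" if canary_rate <= allowed else "fail"
```

A "fail" result would be the trigger for the automated revert step, while "insufficient-data" suggests waiting for more traffic before nominating a full rollout.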
  • 22. Improvements in 2011 • No merge hell • Find failures faster • Keep testing sane and automated • Independent and easy deployment and release • Create greater ownership • More control over, and responsibility for, your decisions • Breaking the barriers: easier to work with others
  • 23. Challenges in 2011 (Overcome) • Breakages immediately affect others, so find and remove failures fast • Pre and post commit automation • Hard to save off work in progress • Break down your feature into commits that are safe to push to production. Use configs to ramp
  • 24. Problems in 2011 • Monolithic Codebase • Not flexible enough to accommodate • Acquisitions • Exploration • Iterations needed to be even faster (no global block) • Ownership could be clearer • Of code • Of failures • Developer count and codebase grew significantly (again)
  • 25. Multiproduct • ~1500 products, ~1800 devs, ~550 services • Ecosystem of smaller individual products, each with an individual release cycle • Can depend on artifacts from other products • Uniform process of lifecycle and tasks • Abstractions allow us to build generic tooling to accommodate a variety of technologies and products • Lifecycle / tasks (e.g., build, test, deploy) owner defined • Testing and Release mostly the same • During your postcommit we test everything that depends on you – to ensure you aren’t breaking anything
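Testing everything that depends on a changed product amounts to walking the dependency graph in reverse. The sketch below assumes transitive dependents are included (the slide does not say whether only direct consumers are tested), and the product names and the `depends_on` mapping are hypothetical.

```python
from collections import deque


def dependents_to_test(changed, depends_on):
    """Return every product that directly or transitively depends on `changed`.

    depends_on maps product -> list of products whose artifacts it consumes.
    """
    # Invert the graph: for each product, who consumes it?
    reverse = {}
    for product, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(product)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)
```

For example, with `{"web": ["profile-api"], "profile-api": ["common-lib"], "search": ["common-lib"]}`, a commit to `common-lib` would select `profile-api`, `search`, and `web` for post-commit testing, while a commit to `web` would select nothing downstream.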
  • 26. Improvements with Multiproduct • No monolithic codebase • Flexible • Easier, faster to validate and not block
  • 27. Challenges with Multiproduct • Architecture • Versioning Hell • Circular Dependencies • How to work across many products • How to work with others • Give people full control (no central police)
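Circular dependencies between products, one of the challenges listed above, can be detected with a depth-first search over the product dependency graph. This is a generic sketch, not a description of LinkedIn's tooling; the `depends_on` mapping is hypothetical, and the recursive form is only suitable for graphs of modest depth.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of products, or None if acyclic.

    depends_on maps product -> list of products it depends on.
    The returned list starts and ends with the same product.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    state = {}
    stack = []

    def visit(node):
        state[node] = GREY
        stack.append(node)
        for dep in depends_on.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: we closed a loop.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in list(depends_on):
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Running a check like this when a product declares a new dependency would surface a cycle before it reaches the build, which fits the earlier takeaway of validating fast, early, and often.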
  • 28. Conclusion: Key Successes • 0 Test Failures • Multitude of automated testing options • Automated, independent, frequent deployments • Distinguish between Deployments and Release • More accountability and ownership for teams
  • 29. Conclusion: Takeaways • Notice any trends? • Validate fast, early, often • Simplify • Build the tooling to succeed • Creating more digestible pieces, giving more control to owners • It’s all a matter of tradeoffs and priorities • They change over time • Ours seem to be getting better! • It’s not only about technology: culture matters • Change, Ownership, Craftsmanship • People, process, technology • Invest in improvements, and stick with it