Building a Modern Enterprise SOA at LinkedIn
Agenda
 Building Code at LinkedIn
 Product Development with Multiproduct
 Build Automation with Gradle
 A Peek at the Future
©2013 LinkedIn Corporation. All Rights Reserved. 2
Building Code at LinkedIn
In the beginning there was Network
 Single, relatively homogeneous code base
– Build from source, little dependency management
 Java, Spring, Ant, …
 JavaScript, HTML, JSPs, CSS, …
 JARs, WARs, Jetty, Tomcat, …
Then everything went exponential
Number of developers, programming languages, build systems, frameworks, lines of code, servers, users, page views, …
… and pretty much everything else
Hypergrowth isn’t all fun and games
 Acquisitions came in with their own technology and processes
 Releases became more and more painful
 Productivity and stability suffered
Scaling a software development organization
 Requires sensible code and dependency management
– Source code APIs
– Service APIs
– Versioned dependencies
 Performance is king
– Iterative improvements
 Divide and conquer
– Split and isolate failures
Code Isolation
Network Trunk Development
 All development in Network shifted to trunk
– No branches, no merging
 Continuous releases from trunk
– Deploy multiple times per day
 Work started on break up and clean up
– Migrating build logic to Gradle
Product Development with Multiproduct
Traditional Software Development
 Use a well-established technology stack
– Homogeneity => Simplicity
 To adopt a new technology:
– Requires “out of the box” thinking and effort
– Do a proof-of-concept implementation
– Present to decision makers to demonstrate ROI => get approval
 Slow-moving by design
– New technology integration expensive
– Top-down management decisions used as barrier
LinkedIn Software Development
 We don’t ever want to be in the box
– Technical experimentation and diversity encouraged
– Living on the bleeding edge, often defining it
 The Need for Speed
– Pace of iteration
– Automation (the human is slow)
– Continuous delivery
 Build versus buy
Chaos Theory
Multiproduct
 Toolset that is architected for a heterogeneous technology world
 Agnostic to version control, build system and programming stack
– Future-proof
 Abstracts common software tasks
– For example: “build”, “test”, “release”
 Provide a default implementation, allow users to override
– e.g. Gradle w/ LI plug-ins
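
The "default implementation, allow users to override" idea can be sketched as a small task resolver. This is only an illustration, not Multiproduct's actual code: the function name `resolve_command`, the `DEFAULT_TASKS` table, and the command lists are all hypothetical placeholders, since the slides do not show the real interface.

```python
from typing import Dict, List

# Hypothetical default command for each abstract task (assumption: the
# slides only say the default is Gradle with LinkedIn plug-ins).
DEFAULT_TASKS: Dict[str, List[str]] = {
    "build": ["gradle", "build"],
    "test": ["gradle", "test"],
    "release": ["gradle", "release"],  # placeholder task name
}


def resolve_command(task: str, overrides: Dict[str, List[str]]) -> List[str]:
    """Return the product's own command for a task, else the default."""
    if task in overrides:
        return overrides[task]
    if task in DEFAULT_TASKS:
        return DEFAULT_TASKS[task]
    raise KeyError(f"unknown task: {task}")


# A product that keeps the default build but overrides how tests run.
product_overrides = {"test": ["make", "check"]}
build_cmd = resolve_command("build", product_overrides)
test_cmd = resolve_command("test", product_overrides)
```

The tooling only ever asks for "build", "test", or "release", so a product can swap its build system without changing how the pipeline invokes it.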
Key Concepts
 Elevate tooling from artifact to product level
 Metadata ties the tooling together
– Ivy
– Version and Build specification
 Pluggable implementation of subsystems
 Version management
 Continuous automated delivery
[Diagram: Source Control, Build System, Deployment]
Version Management
 End-of-life dates
– Graceful deprecation and upgrades
 Push version upgrades to consumers
 Dependency Reports
– What products depend on me
– What products do I depend on
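
The two dependency reports above amount to walking a dependency graph in both directions. A minimal sketch, assuming the product-level metadata can be read as a simple mapping from product to its direct dependencies; the product names and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def dependents_of(deps: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert the graph: for each product, who depends on it directly."""
    reverse: Dict[str, Set[str]] = defaultdict(set)
    for product, requires in deps.items():
        for dep in requires:
            reverse[dep].add(product)
    return dict(reverse)


def transitive_dependencies(product: str, deps: Dict[str, List[str]]) -> Set[str]:
    """Everything a product depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(deps.get(product, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # also guards against cycles
            seen.add(current)
            stack.extend(deps.get(current, []))
    return seen


# Hypothetical products and their direct dependencies.
deps = {
    "search-frontend": ["profile-service"],
    "profile-service": ["data-access", "logging"],
    "data-access": ["logging"],
    "logging": [],
}
who_uses_logging = dependents_of(deps)["logging"]
what_frontend_needs = transitive_dependencies("search-frontend", deps)
```

The reverse map answers "what products depend on me", which is also what pushing a version upgrade to consumers needs as input.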
Push My Upgrade
Tracking Upgrades
Continuous Delivery in Multiproduct
 Automated pipeline triggered on developer change
– No other developer action needed
 Publishing 10,000+ artifacts per day for 300+ products
– Mean time for a good commit: ~10 minutes
– Mean time counting failures: ~25 minutes
Continuous Delivery
Build Automation with Gradle
Why LinkedIn uses Gradle
 Dependency resolution engine
 Rich plug-in system w/ real programming language
– DSL has a steep learning curve, but is powerful
 Visions align
– Automation
– Continuous delivery
LinkedIn Gradle plug-ins
 Customize built-in plug-ins for LinkedIn’s environment
– e.g. Java, Scala, War, FindBugs, Cobertura
 Add custom artifact types
– For example database patches, static content, and Hadoop workflows
 Create metadata for publishing and deployment tooling to consume
 Elevate concepts from artifact to product level
Dependency Graph powered by Gradle
End-of-life enforcement
Our own gradlew: ligradle
 Our own custom Gradle wrapper
 Provisions Gradle and plug-ins
– Allows each product to define versions it uses
 Provides lifecycle management for Gradle and plug-ins
– End-of-life
 Usage data
– Used to track usage and discover problems
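
End-of-life enforcement in a wrapper can be pictured as a date check against the version a product has pinned. The sketch below is an assumption about how such a check might look, not ligradle's implementation; the version numbers, dates, `EOL_DATES` table, and 30-day warning window are invented for illustration.

```python
from datetime import date
from typing import Dict

# Hypothetical end-of-life dates per tool version (invented values).
EOL_DATES: Dict[str, date] = {
    "1.0": date(2013, 6, 30),
    "1.1": date(2014, 1, 31),
}


def check_version(version: str, today: date, warn_days: int = 30) -> str:
    """Classify a pinned version as 'ok', 'warn', or 'expired'."""
    eol = EOL_DATES.get(version)
    if eol is None:
        return "ok"  # no end-of-life date announced yet
    if today > eol:
        return "expired"  # a wrapper could refuse to run here
    if (eol - today).days <= warn_days:
        return "warn"  # nudge the product owner to upgrade
    return "ok"


status = check_version("1.0", date(2013, 6, 15))
```

A warning window ahead of the hard cutoff is one way to get the "graceful deprecation" that the version management slide asks for.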
Usage Data
Source vs Binary Dependencies
 Source offers incremental updates and flexibility intra-product
 Binary offers stability and speed inter-product
There’s no right answer, but there are plenty of wrong answers!
Network migration to Gradle
 3,600 build.xml files to convert
– Many of them with custom logic
 300 developers to train
 Performance targets
– 2x speed-up for clean builds
– 5x speed-up for incremental builds
Proof of Concept
 Migrated 1,100 modules
 Tested single large build versus isolated segments
– Single large build simpler
 Scale feasible
– Requires scalability and performance work in Gradle core
Gradle features required for migration
 Configuration on demand
– Only configure the task graph you need
 Refactored cache logic for performance
– Task history
– Dependency descriptors
 Candidate performance improvements
– Parallel configuration
– Daemon stores project model
– Daemon performs continuous up-to-date checks
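
To make "task history" concrete: an incremental build skips a task when its inputs match what was recorded on the last successful run. The following is a simplified sketch of that general technique, not Gradle's actual cache logic; the `TaskHistory` class and `fingerprint` function are hypothetical names, and inputs are modeled as in-memory bytes rather than files.

```python
import hashlib
from typing import Dict


def fingerprint(inputs: Dict[str, bytes]) -> str:
    """Hash input names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(inputs):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


class TaskHistory:
    """Remembers the input fingerprint of each task's last successful run."""

    def __init__(self) -> None:
        self._last: Dict[str, str] = {}

    def is_up_to_date(self, task: str, inputs: Dict[str, bytes]) -> bool:
        return self._last.get(task) == fingerprint(inputs)

    def record(self, task: str, inputs: Dict[str, bytes]) -> None:
        self._last[task] = fingerprint(inputs)


history = TaskHistory()
sources = {"A.java": b"class A {}"}
first_run_skipped = history.is_up_to_date("compileJava", sources)
history.record("compileJava", sources)
second_run_skipped = history.is_up_to_date("compileJava", sources)
```

At the scale of thousands of modules, the cost of storing and comparing these records on every build becomes significant, which is presumably why the cache logic appears on the list of required work.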
Daemon Heap Usage
Project Timeline – 1 year
 Q1: Proof-of-concept and prep work
 Q2: Implementation
 Q3: Roll-out
 Q4: Clean-up
A Peek at the Future
Gradle Features
 Faster
– Use the daemon effectively in development and CI
– Intra-project parallel execution
 More scalable
– Heap usage
 Ease of use
– IDE integration
More Multiproduct Intelligence
 Analytics
– Common exceptions
– Usage and error patterns
 Verification suite
– CheckStyle, FindBugs and Cobertura
 Automation and Integration
 Ease of use
Distributed Build Automation
 Distribute build and testing on a cluster
 Automatic provisioning
 Artifact sharing at scale

Editor's Notes

  • #7 Homogeneous -> Heterogeneous. Build vs buy: started taking on more tasks internally. Started open-sourcing our own stuff to pay it forward.
  • #8 When there are more people who are new than established it’s hard to maintain anything
  • #9 We needed to scale ourselves. The answer was to establish a tools team to focus on internal productivity, code management and process. Performance improvements: from waterfall releases to every 2 weeks, to every day, to N times per day, to continuous development.
  • #10 If your code doesn’t impact my code then failures in your unit tests should not stop me from releasing my code. Products move at different paces: experimentation, heavy development, stable and mature, maintenance mode, etc. Public APIs at the code and service layers abstract the messiness underneath.
  • #14 “Out of the box” thinking should make you ask why the heck you were in the box in the first place. Automation requires trust in the system.
  • #15 But there must be basic sanity checks: our operations team must be able to configure and operate the products at scale, the organization must maintain focus and prioritize, and we have to consider compliance (legal, government, ethical). There’s constructive and destructive chaos. Multiproduct tries to define constructive chaos.
  • #17 A product can publish one or many artifacts
  • #19 EOL dates solve several problems: I need certain versions to go away because they have bugs or flaws. I need to not maintain too many versions, so I can focus my resources where they matter.
  • #22 Currently a manual step for product owners to promote to production (need more trust in verification to remove this step)
  • #26 Dependency management is probably what we need the build tool to do the most. XML is not a programming language. The Groovy DSL has an extreme learning curve, but our audience is programmers anyway.
  • #28 We are able to ask Gradle complex dependency questions and get the answers we need
  • #29 An example of where we elevate functionality to the product level using the Gradle plug-ins
  • #31 ~5,000 Gradle executions per workday. ~10,000 mint executions
  • #33 We need daemon and parallel execution to make this possible
  • #36 Already made great progress: both in Gradle and in our plug-ins