PUPPET AT SCALE:
CASE STUDY OF LEARNINGS AT PAYPAL
Stan Hsu, Harendra Narayan, Chris Huang
August 23, 2013
PAYPAL SCALE
2 Confidential and Proprietary
• PayPal is part of eBay Inc
• 132 million active registered accounts
• 25 currencies in 193 markets
• Net Total Payment Volume: $43 Billion in Q2
• 7.6 million payments per day
• $5,277 in Total Payment Volume every second
THE CHALLENGE
• 100GB+ and counting…
• 4000+ packages
• 20-50 new packages introduced every release
• Complex dependency graph across domains and services
• Build a new system for deploying application and system software
• Massive scale in production across multiple data centers
• Thousands of stages in QA
• 3000+ dev & QE in 10+ offices across time zones and geographic regions
SCALABILITY CHALLENGE
Challenges:
• The traditional "one app, one module" approach does not scale
• High-velocity environment with ever-increasing speed of change
• New packages, sunset packages, dependency changes, and dev staff operating 24x5
• Lack of Puppet expertise to complement 3000+ technology staff across geographic regions
Solution:
• Ninja engine generates resources dynamically
• Dependency discovery
• No Puppet code change required
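The slides do not describe how the Ninja engine works internally. As a rough illustration of the idea only, the sketch below orders packages from a discovered dependency graph and renders Puppet package resources from it, so that adding a package means adding data rather than writing Puppet code. The package names and the helper functions install_order and package_resources are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def install_order(dependencies):
    """Return package names ordered so each package follows its dependencies."""
    # TopologicalSorter expects {node: predecessors}; it raises CycleError on a cycle.
    return list(TopologicalSorter(dependencies).static_order())


def package_resources(dependencies):
    """Render one Puppet package resource per package, with require edges."""
    blocks = []
    for name in install_order(dependencies):
        requires = ", ".join(
            f"Package['{dep}']" for dep in sorted(dependencies.get(name, ()))
        )
        body = "ensure => installed"
        if requires:
            body += f", require => [{requires}]"
        blocks.append(f"package {{ '{name}': {body} }}")
    return "\n".join(blocks)


# Hypothetical dependency data, not taken from the talk.
deps = {
    "homeweb": {"homesvc-client", "httpd"},
    "homesvc-client": {"openssl"},
    "httpd": {"openssl"},
    "openssl": set(),
}
order = install_order(deps)
manifest = package_resources(deps)
print(manifest)
```

In this sketch a new package or a changed dependency only alters the deps mapping; the generated resources follow from it.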
ROLES & LABELS
Role
• One role per pool
• Define a set of packages to install
Label
• A set of versioned packages
• Backed by a yum repository
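One way to picture the split between roles and labels is as two small data structures: the role names which packages a pool gets, and the label pins their versions. The sketch below is an assumption about the shape of that data, not PayPal's implementation; the class names, the repository URL, and the version strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A set of versioned packages, backed by a yum repository."""
    name: str
    repo_url: str   # hypothetical URL of the backing yum repository
    versions: dict  # package name -> version string


@dataclass(frozen=True)
class Role:
    """One role per pool: the set of packages to install."""
    pool: str
    packages: tuple


def resolve(role, label):
    """Pin every package of the role to the version the label provides."""
    missing = [p for p in role.packages if p not in label.versions]
    if missing:
        raise KeyError(f"label {label.name} has no version for: {missing}")
    return {p: label.versions[p] for p in role.packages}


# Hypothetical example data.
label = Label(
    name="release-42",
    repo_url="http://yum.example.com/labels/release-42",
    versions={"axis": "1.4-2", "httpd": "2.2.15-1", "unused": "0.1-1"},
)
role = Role(pool="abWeb", packages=("axis", "httpd"))
pinned = resolve(role, label)
print(pinned)
```

Deploying a different label to the same pool changes the versions without touching the role.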
[Diagram: Pool, Host, and Label; a label is deployed to each of the pools abWeb and abSvc]
SYSTEM HIERARCHY
• Geo: NA
• DC: SF, LAX
• Env: Prod
• Pool: HomeWeb, HomeSvc, HomeBE
• Host: HomeWeb 1001, HomeWeb 1002, HomeSvc 1001, HomeSvc 1002, HomeBE 2020, HomeBE 2021
ENC / HIERA
• MongoDB as the hierarchical datastore
• Reduced multiple Hiera calls to one for class, role, and parameter lookups
• More efficient and easier to debug
• Created a web-based tool to visualize Hiera data
• REST API for CRUD operations on Hiera data
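The single-lookup idea can be sketched as one function that walks the hierarchy once and returns classes, role, and parameters together. This is a minimal sketch under stated assumptions: a plain dict stands in for the MongoDB collection, the level order (geo, dc, env, pool, host) is inferred from the hierarchy slide, and the rule that more specific levels override less specific ones is an assumption, as are all keys and values.

```python
LEVELS = ("geo", "dc", "env", "pool", "host")  # least to most specific (assumed)


def lookup(store, node):
    """Merge data for one node across all levels in a single pass."""
    merged = {"classes": [], "role": None, "parameters": {}}
    for level in LEVELS:
        doc = store.get((level, node[level]), {})
        for cls in doc.get("classes", []):
            if cls not in merged["classes"]:
                merged["classes"].append(cls)  # classes accumulate, no duplicates
        if "role" in doc:
            merged["role"] = doc["role"]       # more specific level wins
        merged["parameters"].update(doc.get("parameters", {}))
    return merged


# Hypothetical data; a dict keyed by (level, name) stands in for the datastore.
store = {
    ("geo", "NA"): {"classes": ["base"], "parameters": {"ntp": "ntp.na.example.com"}},
    ("dc", "SF"): {"parameters": {"ntp": "ntp.sf.example.com"}},
    ("pool", "HomeWeb"): {"classes": ["deploy"], "role": "HomeWeb"},
    ("host", "HomeWeb1001"): {"parameters": {"debug": True}},
}
node = {"geo": "NA", "dc": "SF", "env": "Prod", "pool": "HomeWeb", "host": "HomeWeb1001"}
result = lookup(store, node)
print(result)
```

Returning everything from one call also gives a single place to inspect when a node ends up with an unexpected value.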
SCALING ACTIVEMQ
[Diagram: ActiveMQ clients, load balancer, ActiveMQ cluster]
MCOLLECTIVE AT SCALE
• Query systems through facts, agents, or regular expressions
[peadmin@puppet ~]$ mco find -F processorcount=24
• Verify package versions in all systems for simple auditing purposes
[peadmin@puppet ~]$ mco package status python-qpid
---- package agent summary ----
Nodes: 3366 / 3366
Versions: 3356 * 0.14-6.el5, 10 * absent
Elapsed Time: 34.68 s
• Kick off Puppet runs
• Replacement for ssh scripts
• REST API exposes MCollective to web and other tools
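For auditing, the summary printed by the package agent can be turned into counts per version. The sketch below parses only the "Versions:" line in the format shown in the sample output above; the function name parse_versions is hypothetical, and other output layouts are not handled.

```python
import re


def parse_versions(summary):
    """Return {version: node_count} from the 'Versions:' line of the summary."""
    for line in summary.splitlines():
        line = line.strip()
        if line.startswith("Versions:"):
            counts = {}
            for item in line[len("Versions:"):].split(","):
                match = re.fullmatch(r"\s*(\d+)\s*\*\s*(\S+)\s*", item)
                if match:
                    counts[match.group(2)] = int(match.group(1))
            return counts
    return {}  # no Versions line found


# Sample output copied from the slide.
sample = """---- package agent summary ----
Nodes: 3366 / 3366
Versions: 3356 * 0.14-6.el5, 10 * absent
Elapsed Time: 34.68 s"""

counts = parse_versions(sample)
print(counts)
```

A check such as counts.get("absent", 0) > 0 would then flag the nodes that still lack the package.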
• Large number of applications and services
• Real-time status updates of Puppet runs
WHY’S IT TAKING SO LONG? ETA?
PROGRESS PUPPET MODULE
Sample Update Message (JSON):
{
"host": "stage2vmppsm02.sc4.paypal.com",
"time": "2012-04-25T10:00:37Z",
"type": "puppet_run",
"catalog_version": 1335348026,
"puppet_run_status": "running",
"package": {
"status": "successful",
"name": "axis"
}
}
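A consumer of these messages could reduce each one to a one-line status for display. The sketch below assumes only the fields visible in the sample message above; the function name describe is hypothetical, and it treats the "package" object as optional, which is an assumption.

```python
import json


def describe(raw):
    """Turn one progress message (a JSON string) into a one-line status."""
    msg = json.loads(raw)
    text = f'{msg["host"]}: {msg["type"]} {msg["puppet_run_status"]}'
    pkg = msg.get("package")  # assumed optional
    if pkg:
        text += f' ({pkg["name"]}: {pkg["status"]})'
    return text


# Sample message copied from the slide.
sample = """{
  "host": "stage2vmppsm02.sc4.paypal.com",
  "time": "2012-04-25T10:00:37Z",
  "type": "puppet_run",
  "catalog_version": 1335348026,
  "puppet_run_status": "running",
  "package": {"status": "successful", "name": "axis"}
}"""

line = describe(sample)
print(line)
```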
PROGRESS PUPPET MODULE
• Open source code available at https://github.com/hunner/progress_mq
• Modified since then:
  • To enable/disable messages for different Puppet runs
  • To format JSON messages when writing to a file
Puppet at Scale – Case Study of PayPal's Learnings - PuppetConf 2013
