The Road to the White House
with Puppet & AWS
Leo Zhadanovsky – Solutions Architect – leo@amazon.com
@leozh
What am I talking about today?
What was OFA Tech?
• Who did it?
• What did they build?
How did they do that?
• Technologies and Tradeoffs
• Services vs. Software
How did they leverage puppet?
What did they learn from building something so big?
Who Am I?
I work for AWS
I worked for the DNC 2009-2012
I was embedded at OFA
AWS does not endorse
political candidates
I love Star Trek (TNG is the best)
So here’s the Idea
~30th biggest E-commerce operation, globally
~200 distinct new applications, many mobile
Hundreds of new, untested analytical approaches
Processing hundreds of TB of data on thousands of servers
Spikes of hundreds of thousands of concurrent users
FUN FUN FUN
a few constraints…
Critically compressed budget
Less than a year to execute
Volunteer and near-volunteer development team
Core systems will be used for a single critical day
Constitutionally-mandated completion date
Built by guys and gals like these: Obama for America
Business as usual..
…for a technology startup
Election Day – OFA Headquarters
So they built it all, and it worked
Typical Charts
How?
The old approach, even from Amazon 
The old approach.. Might have some problems..
Cloud Computing Benefits
• No Up-Front Capital Expense
• Pay Only for What You Use
• Self-Service Infrastructure
• Easily Scale Up and Down
• Improve Agility & Time-to-Market
• Low Cost
Deploy
OFA’s Infrastructure
awsofa.info
Web-Scale Applications
500k+ IOPS DB Systems
Services API
Ingredients
Ubuntu nginx boundary Unity jQuery SQLServer hbase
NewRelic EC2 node.js Cybersource hive ElasticSearch
Ruby Twilio EE S3 ELB boto Magento PHP EMR SES
Route53 SimpleDB Campfire nagios Paypal CentOS
CloudSearch levelDB mongoDB python securitygroups
Ushahidi PostgreSQL Github apache bootstrap SNS
cloudformation Jekyll RoR EBS FPS VPC Mashery
Vertica RDS Optimizely MySQL puppet tsunamiUDP R
asgard cloudwatch ElastiCache cloudopt SQS cloudinit
DirectConnect BSD rsync STS Objective-C DynamoDB
Data Stores
Development Frameworks
Infrastructure, Configuration Management & Monitoring
Configuration Management: Puppet
In mid-2011, we looked at options for configuration management and chose Puppet
We needed to make it scale, and to get it to work with stateless, horizontally scalable infrastructure
How did we do this?
Bootstrapping Puppet with CloudInit
CloudInit is built into Ubuntu and Amazon Linux
• Allows you to pass bootstrap parameters in the Amazon EC2 user-data field, in YAML format
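A minimal sketch of what such user-data could look like, generated with Python so each piece is explicit. The master hostname and the role prefix are hypothetical, not OFA's actual values; the `puppet` section and the `%i` certname placeholder are cloud-init features, but check the cloud-init documentation for the version on your image.

```python
def build_user_data(puppet_master: str, role: str) -> str:
    """Return a cloud-config document that points the Puppet agent at a master.

    puppet_master and role are hypothetical inputs used for illustration.
    """
    lines = [
        "#cloud-config",  # first line tells cloud-init how to parse the rest
        "puppet:",
        "  conf:",
        "    agent:",
        f'      server: "{puppet_master}"',
        # cloud-init replaces %i with the EC2 instance id at boot
        f'      certname: "{role}-%i"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_user_data("puppet.example.internal", "web"))
```

The resulting text goes into the user-data field when the instance is launched.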
Bootstrapping Puppet with CloudInit
Don’t store creds in puppet manifests, store them in private
Amazon S3 buckets
Either pass Amazon S3 creds through CloudInit:
Even better – avoid this by using AWS Identity and Access
Management (IAM) roles and the version of s3cmd on GitHub
Bootstrapping Puppet with CloudInit
Built-in puppet support
Use certname with %i for instance id to name the node
Puppetmaster must have auto sign turned on
• Use security groups and/or NACLs for network-level security
In nodes.pp, use regex to match node names
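The regex matching can be illustrated outside Puppet. This sketch assumes a hypothetical certname scheme of `<role>-<instance id>` (for example `web-i-0abc1234`); the role names are made up. In nodes.pp the same patterns would appear as regex node definitions.

```python
import re

# Hypothetical certname scheme: "<role>-<instance id>", e.g. "web-i-0abc1234".
NODE_PATTERNS = [
    (re.compile(r"^web-i-[0-9a-f]+$"), "webserver"),
    (re.compile(r"^queue-i-[0-9a-f]+$"), "queue_worker"),
]


def classify(certname: str) -> str:
    """Return the role for a node name, or "default" when nothing matches."""
    for pattern, role in NODE_PATTERNS:
        if pattern.match(certname):
            return role
    return "default"


if __name__ == "__main__":
    for name in ("web-i-0abc1234", "queue-i-deadbeef", "db-i-1234"):
        print(name, "->", classify(name))
```

Because the instance id is part of the name, new instances from an Auto Scaling group match a pattern without any per-node entry.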
Puppet Tips
Use a base class to define your standard install
Puppet Tips
Use run stages
Don’t store credentials in Puppet, store them in private Amazon S3 buckets
• Use AWS IAM to secure the credentials bucket/folders within that bucket
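One way to scope access to a credentials folder is an IAM policy that allows reads under a single prefix. A minimal sketch, assuming a hypothetical bucket name and a per-role prefix; the policy an actual deployment needs may differ.

```python
import json


def credentials_read_policy(bucket: str, prefix: str) -> str:
    """Build an IAM policy that allows reading objects under one prefix only.

    bucket and prefix are hypothetical names chosen for illustration.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(credentials_read_policy("example-credentials", "web"))
```

Attached to an IAM role for the web tier, a policy like this lets those instances fetch their own credentials and nothing else in the bucket.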
Puppet Tips
Use puppet only for configuration files and what makes your
apps unique
For undifferentiated parts of apps, use Amazon S3 backed
RPM/Debian repositories
• Can be either public or private repos, depending on your needs
• Amazon S3 Private RPM Repos: http://git.io/YAcsbg
• Amazon S3 Private Debian Repos: http://git.io/ecCjWQ
Puppet Tips
By using packages for application deploys, you can set ensure
=> latest, and just bump the package in the repo to update
Log everything with rsyslog/graylog/loggly/NewRelic/splunk
Scaling the Puppet Masters
Use an Auto Scaling group for puppet masters
• Min size => 2, use multiple Availability Zones
Either have them build themselves off of existing puppet
masters in the group or off packages stored in Amazon S3 and
bootstrapped through user-data
Auto-sign must be on
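The group settings above can be written down as a CloudFormation resource. This is a sketch under assumptions: the launch configuration name, the zones, and the maximum size of 4 are hypothetical, and only the minimum of 2 across multiple Availability Zones comes from the slide.

```python
import json


def puppet_master_asg(launch_config: str, zones: list) -> dict:
    """Return a CloudFormation resource for a puppet master Auto Scaling group."""
    if len(zones) < 2:
        raise ValueError("use at least two Availability Zones")
    return {
        "Type": "AWS::AutoScaling::AutoScalingGroup",
        "Properties": {
            "AvailabilityZones": zones,
            "LaunchConfigurationName": launch_config,
            "MinSize": "2",  # from the slide: never fewer than two masters
            "MaxSize": "4",  # assumption for illustration
        },
    }


if __name__ == "__main__":
    resource = puppet_master_asg("puppetmaster-lc", ["us-east-1a", "us-east-1b"])
    print(json.dumps({"Resources": {"PuppetMasters": resource}}, indent=2))
```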
Sites
Communications
Ad Targeting
Ops Tools
Analytics
Apps
Micro-targeting
Micro-listening
Reporting
Registrations
Volunteer Coordination
Etc, etc, etc.
Technology Choice: Polyglot Development
• Expected Tradeoff: More Complex Ops
• Upside: Build as little as possible, rev-1 faster, reuse dev skills
Technology Choice: Cloud Hosting
• Expected Tradeoff: Less Infra Control, performance
• Upside: Scale, Speed, Cost
Technology Choice: Diverse, App-centered Databases
• Expected Tradeoff: More Complex Ops, Fragility, Data Corruption
• Upside: Heterogeneous Resilience, right tools for the job
Technology Choice: SOA, queue-based system integrations
• Expected Tradeoff: Dev Complexity, slower system performance
• Upside: Scalability, serviceability, operational flexibility, and substantially faster in aggregate
2003
• $5.2B retail business
• 7,800 employees
• A whole lot of servers
2012
• Every day, AWS adds enough server capacity to power this $5B enterprise
2006-8: Amazon Simple Queue Service (SQS)
• Thousands of customers
• A whole lot of servers
• Over 5 Billion Queued Events
2012: OFA
• Produced 8.4 Billion Amazon SQS Queued Events
• Just the last month of the campaign
No time to waste
This applies to lots of services!
Elastic Load Balancing
Amazon ElastiCache
Amazon RDS
Amazon CloudSearch
Amazon Route53
Amazon S3
Amazon CloudFront
Amazon DynamoDB
You can mostly do these on your own…
But do you have extra: focus, expertise, time, research, money, risk-tolerance, staff, dedication to innovate, operations coverage, scalability in design...
Looks pretty simple.
Inserts 7.5m records in Amazon DynamoDB, in 8 minutes
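Worked out, that is a sustained rate of roughly 15,600 writes per second, assuming the load was spread evenly over the 8 minutes:

```python
records = 7_500_000  # from the slide
minutes = 8          # from the slide

# Average write rate, assuming an even spread over the whole run
records_per_second = records / (minutes * 60)

print(f"{records_per_second:,.0f} records/second")
```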
One thing that is difficult to prepare for…
No pressure…
They had this built for the previous 3 months, all on the East Coast.
We built this part in 9 hours to be safe.
AWS + Puppet + Netflix Asgard + CloudOpt + DevOps = Cross-Continent Fault-Tolerance On-Demand
Replication across the continent..
http://tsunami-udp.sourceforge.net/
• 478.18 Mbps: cross-continental data transit rate for a single cc2.8xlarge instance
• 1.72 Tb an hour
• 27 Tb of data to move
• 3.92 hours: required to move the data across the continent with four cc2.8xlarge instances
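The figures above fit together if "Tb" is read as decimal terabits and throughput is assumed to scale linearly across the four instances. A short check of the arithmetic under those assumptions:

```python
MBPS_PER_INSTANCE = 478.18  # megabits per second, single cc2.8xlarge (from the slide)
INSTANCES = 4               # from the slide
DATA_TERABITS = 27.0        # from the slide, read as terabits

# megabits/second -> terabits/hour (1 Tb = 1,000,000 Mb)
tb_per_hour_per_instance = MBPS_PER_INSTANCE * 3600 / 1_000_000

# Assumes the four instances transfer in parallel with no shared bottleneck
hours = DATA_TERABITS / (tb_per_hour_per_instance * INSTANCES)

print(f"{tb_per_hour_per_instance:.2f} Tb/hour per instance")
print(f"{hours:.2f} hours with {INSTANCES} instances")
```

This prints 1.72 Tb/hour per instance and 3.92 hours, matching the slide.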
So what did they learn?
HA in Depth: Amazon S3 static pages, de-coupled UI, jekyll/hyde
Game Day: Practice failures so you know what to do. ( http://www.awsgameday.com )
Loose-Coupling: Ops easy, scale easy, test easy, fix easy…
Fail-Forward: features, quality, and focus are all critical.
Cloud works.
We showed it to the world at re:Invent 2012
together with the OFA DevOps crew
We presented in Tokyo…
Born from the Campaign
What will you do next?
Maybe look at some of their Ruby code?
Register Now!
reinvent.awsevents.com
$200 Off Discount Code:
Zoltan2013
Gain New Skills & Knowledge
Choose from 175+ technical sessions,
training bootcamps, hands-on labs,
and hackathons.
Dive Deeper into AWS
Dive deep into foundational AWS
services and learn about the latest
services and features.
Get Your Questions Answered
Get your technical questions answered
by AWS architects, engineers, and
product leads.
Learn Best Practices
Discover best practices, tips and
tricks, and lessons learned from
expert customers.
Thank you!
Questions?
• Come talk to an AWS Solutions Architect at Table 22
Contact me!
• @leozh
• leo@amazon.com



Editor's Notes

  1. Not your normal technology professionals
  2. Not your normal office environment
  3. A few friends in high places
  4. Cloud computing is a better way to run your business. The cloud helps companies of all sizes become more agile. Instead of running your applications yourself, you can run them on the cloud, where IT infrastructure is offered as a service like a utility. With the cloud, your company saves money: there are no up-front capital expenses as you don’t have to buy hardware for your projects. The massive scale and fast pace of innovation of the cloud drive the costs down for you. In the cloud, you pay only for what you use, just like electricity. The cloud can also help your company save time and improve agility – it’s faster to get started: you can build new environments in minutes as you don’t need to wait for new servers to arrive. The elastic nature of the cloud makes it easy to scale up and down as needed. At the end of the day you have more resources left for innovation, which allows you to focus on projects that can really impact your business, like building and deploying more applications. “With the high growth nature of our business, we were looking for a cloud solution to enable us to scale fast. Think twice before buying your next server. Cloud computing is the way forward.” - Sami Lababidi, CTO, Playfish