Changing Etsy's Architectural Foundation with Continuous Deployment
Matt Graham
Core Engineer @ Etsy
Continuous Deployer
#surgecon
September 28, 2012
Marketplace for
Handmade Goods

Gross Sales 2011:    $537 million
Total Members:       19 million
Items For Sale:      15 million
Uniques / month:     40 million
Page Views / month:  1.4 billion
Architecture is Relative
Organic Architecture
Premature Architecture
Premature Architecture
Passing Time => Change
● Scale
● Product
● Technology
● Engineering Team
● The Correct Architecture Changes
Architectural Change
     Antipattern
A Brief History of
    Deployment
The Internet
Agility
Continuous
What it's all about

● Reduce Failure Time
● Start with Culture


● Tools Help


● Enable the Unfeasible
A Tale of Six Bugs
Six Bugs with Monthly Deploys
4 caught
2 missed
fix live: 24 hours
Six Bugs with Continuous Deploys
2 caught
4 missed
fix live: 6 hours
Failure Time

2 Bugs * 24 Hours = 48 BH
4 Bugs *   6 Hours = 24 BH


 Minimize BugHours
      24 < 48
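
The arithmetic on this slide can be checked with a short sketch. The function name `bug_hours` is an assumption made for illustration; the figures are the ones from the two "Six Bugs" slides.

```python
def bug_hours(missed_bugs: int, hours_to_fix_live: float) -> float:
    """Time bugs spend in production: missed bugs times hours until the fix is live."""
    return missed_bugs * hours_to_fix_live


# Monthly deploys: more bugs caught up front, but each missed bug lives longer.
monthly = bug_hours(missed_bugs=2, hours_to_fix_live=24)

# Continuous deploys: more bugs slip through, but each is fixed much sooner.
continuous = bug_hours(missed_bugs=4, hours_to_fix_live=6)

print(f"monthly deploys:    {monthly} BH")
print(f"continuous deploys: {continuous} BH")
```

Under these numbers, missing twice as many bugs still costs half the BugHours, because the time to get a fix live drops by a factor of four.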
MTTR vs MTTF
Cost of Recovery

Photons              Minimal
Electrons            Low
Protons & Neutrons   High
Humans               Prohibitive
Cost of Recovery




$6 million in 1973 = $31m today
Good Excuses


● Infrequent Changes
● Infrequent Executions


● Life and Death


● Physical Investment
Medical Devices?




       No
NASA?




 No
Enterprise Software?




        Yes!
Printing of Cards?




      No
App Store?




    No
Financial Transactions?




         Yes!
Getting Started
Culture Before Tools


●   Throw out the deploy schedule

●   Ship changes when tested & ready

●   Software is stable & supported
Tools of Etsy Deployment
Jenkins

● Unit Tests
● Functional Tests


● Manual Testing
Nagios & Naglite2




github.com/lozzd/Naglite2
tail -f | grep
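
As a rough illustration of what this slide is pointing at, here is a minimal Python sketch of following a log and filtering its lines. The function names `follow` and `grep` and the sample log lines are assumptions for illustration only; the deck itself simply uses the shell tools.

```python
import os
import re
import time
from typing import Iterable, Iterator, TextIO


def follow(stream: TextIO, poll_seconds: float = 0.5) -> Iterator[str]:
    """Yield lines as they are appended to an open file, like `tail -f`.

    This generator never returns; stop it by breaking out of the loop.
    """
    stream.seek(0, os.SEEK_END)  # start at the end, ignoring existing content
    while True:
        line = stream.readline()
        if line:
            yield line.rstrip("\n")
        else:
            time.sleep(poll_seconds)  # nothing new yet, wait and retry


def grep(lines: Iterable[str], pattern: str, invert: bool = False) -> Iterator[str]:
    """Yield lines matching `pattern`, or the non-matching ones when `invert` is set."""
    regex = re.compile(pattern)
    for line in lines:
        if bool(regex.search(line)) != invert:
            yield line


# Filtering a fixed list of made-up log lines instead of a live file.
sample = ["GET / 200", "ERROR db timeout", "GET /listing 200"]
errors = list(grep(sample, "ERROR"))
quiet = list(grep(sample, "ERROR", invert=True))
```

With a real file, the two pieces compose as `grep(follow(open(path)), "ERROR")`, where `path` is whatever log you want to watch.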
github.com/etsy/deployinator
IRC
Graphs!!!
Ganglia
Graphite
Event Overlay
StatsD
// Record how long a successful query took; count each failure.
if ($success) {
  StatsD::timing('query.runtime', $time);
} else {
  StatsD::increment('query.failure');
}




     github.com/etsy/statsd
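
For readers curious what such calls put on the wire, here is a minimal Python sketch of the StatsD plain-text UDP format (`name:value|ms` for timers, `name:value|c` for counters). The helper names, the localhost address and the use of the conventional port 8125 are assumptions for illustration; this is not the client library shown on the slide.

```python
import socket


def timing_payload(stat: str, millis: float) -> bytes:
    """Format a timer metric, e.g. b'query.runtime:320|ms'."""
    return f"{stat}:{millis}|ms".encode("ascii")


def counter_payload(stat: str, delta: int = 1) -> bytes:
    """Format a counter increment, e.g. b'query.failure:1|c'."""
    return f"{stat}:{delta}|c".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric over UDP; fire-and-forget, so a missing server is not an error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def record_query(success: bool, runtime_ms: float) -> bytes:
    """Mirror the slide's logic: time successful queries, count failed ones."""
    if success:
        payload = timing_payload("query.runtime", runtime_ms)
    else:
        payload = counter_payload("query.failure")
    send_metric(payload)
    return payload
```

Because the transport is UDP, instrumenting a code path this way adds very little overhead and cannot block the application if the metrics server is down.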
github.com/etsy/logster
Practices @ Etsy


     Feature Flags
Customer Communication
Feature Flags


Deploy != Product Launch
Dark Launch

def get_payment_link():
  return ...
Dark Launch

def get_payment_link():
  if enabled('creditcards'):
    return creditcard_link()
  else:
    return check_link()
Dark Launch

application_config:
  - creditcards: admin
  - NewFeatureB: off
  - NewFeatureC: on
Ramp Up

application_config:
  - creditcards: 1%
  - NewFeatureB: off
  - NewFeatureC: on
Ramp Up

application_config:
  - creditcards: 5%
  - NewFeatureB: off
  - NewFeatureC: on
Whoops!

application_config:
  - creditcards: admin
  - NewFeatureB: off
  - NewFeatureC: on
Ramp Up

application_config:
  - creditcards: 5%
  - NewFeatureB: off
  - NewFeatureC: on
Ramp Up

application_config:
  - creditcards: 25%
  - NewFeatureB: off
  - NewFeatureC: on
Credit Cards are ON

application_config:
  - creditcards: 100%
  - NewFeatureB: off
  - NewFeatureC: on
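
The slides do not show how `enabled()` interprets these settings, so the following is only a sketch of one way it could work. The extra `user_id` and `is_admin` parameters, the `bucket` helper and the hashing scheme are all assumptions for illustration, not Etsy's implementation.

```python
import hashlib

# Hypothetical in-memory copy of the config shown on the slides.
CONFIG = {"creditcards": "5%", "NewFeatureB": "off", "NewFeatureC": "on"}


def bucket(feature: str, user_id: str) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def enabled(feature: str, user_id: str, is_admin: bool = False,
            config: dict = CONFIG) -> bool:
    """Decide whether `feature` is on for this user under the given config."""
    setting = config.get(feature, "off")  # unknown features stay dark
    if setting == "on":
        return True
    if setting == "off":
        return False
    if setting == "admin":
        return is_admin
    if setting.endswith("%"):
        # Users whose bucket falls below the percentage see the feature.
        return bucket(feature, user_id) < int(setting[:-1])
    raise ValueError(f"unknown setting for {feature}: {setting!r}")
```

Hashing the user into a fixed bucket means a user who sees the feature at 5% keeps seeing it at 25% and 100%, and dropping the setting back to `admin` (the "Whoops!" slide) hides it again without a code deploy.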
AB Testing


●   Prove success of interface
    changes

●   Prove interest in new
    features
Community Communication
Forums / Message Boards
etsystatus.com
twitter.com/etsystatus
Deployment is First Class



    Deployment is a
  First Class Feature
Engineers are Users Too
Examples from Etsy

●   Photos From Twisted to PHP

●   PostgreSQL to MySQL Shards
From Twisted to PHP
● Run Apache/PHP on a new port
● Implement one service in PHP


● Ramp up users on new service


● Repeat for each service


● Shut down Twisted version
PostgreSQL to MySQL Shards
● Migrate table by table
● Tee writes to both DBs


● Copy old data from PostgreSQL


● Verify data matches


● Ramp up reads from MySQL


● Stop PostgreSQL writes
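
The migration steps above can be sketched in a few lines of Python. Plain dicts stand in for the PostgreSQL and MySQL tables, and the class name `MigratingTable` and its methods are hypothetical; the deck does not show Etsy's actual code.

```python
import random


class MigratingTable:
    """One table during migration, wrapping an old store and a new store."""

    def __init__(self, old: dict, new: dict,
                 read_new_percent: int = 0, write_old: bool = True):
        self.old = old                            # stand-in for PostgreSQL
        self.new = new                            # stand-in for the MySQL shards
        self.read_new_percent = read_new_percent  # ramp-up dial for reads, 0..100
        self.write_old = write_old                # set to False for the final step

    def write(self, key, row) -> None:
        """Tee writes: the new store always gets them, the old one until cut-over."""
        self.new[key] = row
        if self.write_old:
            self.old[key] = row

    def backfill(self) -> None:
        """Copy rows that predate the tee without overwriting newer teed writes."""
        for key, row in self.old.items():
            self.new.setdefault(key, row)

    def mismatches(self) -> list:
        """Return the keys whose rows differ between the two stores."""
        keys = set(self.old) | set(self.new)
        return sorted(k for k in keys if self.old.get(k) != self.new.get(k))

    def read(self, key):
        """Send the configured percentage of reads to the new store."""
        use_new = random.randrange(100) < self.read_new_percent
        return (self.new if use_new else self.old).get(key)
```

A typical sequence would be: start teeing writes, call `backfill()`, check that `mismatches()` is empty, raise `read_new_percent` in steps to 100, then set `write_old = False`.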
Continuous Deploy Pattern
● Change in small steps
● Dark launch via config


● Iterations to prod while dark


● Maintain old & new in parallel


● Ramp up new architecture


● Remove old architecture
Once Again
● Minimize BugHours
● Trash the Schedule


● Iterate on the Tools


● Make Big Changes
Mean Time To Addiction
Changing Etsy's
Architectural Foundation
              with
 Continuous Deployment
             Matt Graham
     http://twitter.com/lapsu
              http://lapsu.tv

  Core Engineer @ Etsy
  Continuous Deployer
  http://codeascraft.etsy.com
  http://www.etsy.com/careers

Editor's Notes

  1. I'm here to talk about continuous deployment and how it helps with BIG architectural changes. I'm an engineer on Etsy's Core Team, and I'm coming at this from the perspective of an engineer involved in making fundamental changes to Etsy's infrastructure. I've found continuous deployment to be a very good way to work, and I'm here to spread the good word and hopefully trigger some ideas about how it could help you and how to get started with it.
  2. As the business grows, there is change pressuring the software from all over.
  3. Start with an axiom: good architecture is not static. The same company had 2 different architectures at 2 points in its history. While the 2005 architecture never would have scaled to 2012, the 2012 architecture would have been complete overkill in 2005. Etsy wouldn't have had time to build the features that got it started.
  4. Here's a very-not-to-scale graph of how the business and architecture grew together.
  5. Here's what it would look like if they had shot straight to the 2012 architecture back in 2005. First, they probably wouldn't have gotten it right, but let's assume they did. We're done, right?
  6. What now? We overshot the 2005 architecture to be able to handle 2012, but we're still not prepared for 2017. What do we do now? The point here is that you can't escape architectural change. All you can do is try to make it easier. Continuous deployment makes architectural change easier.
  7. As the business grows, there is change pressuring the software from all over.
  8. Ultimately, the correct architecture needs to change too.
  9. We have to be able to make big architectural changes, and we need them to go better than this. The key to success here is breaking down big changes into many smaller changes. When we write code, we break it down into manageable modules. But when it comes time to deploy it, we mash it back together into an unmanageable chunk. This limits the scale of changes. With continuous deployment we remove that limit.
  10. Let's step back and look at how deployment got to where it is. And we'll start here, in the 80s. In the olden days, software had to be copied onto floppy disks, put in a box, shipped to a store and then finally purchased from the store. And you wouldn't want to give people updates for free, so what happened? They'd all be batched up in an “upgrade” release or a new version altogether. Deploys were understandably rare.
  11. And then this happened. Likewise, we even started distributing our thick client software over the internet: Windows Update is a good example. We took applications that used to be physically distributed thick clients and made them “web applications”. For the most part we were still using the old school deploy cycle.
  12. But some people wanted to go faster. We realized, “Hey, it's a website, we can deploy it every month.”
  13. But... why can't we just deploy and deploy and deploy? Well, we can. We can invert the unit of measurement from days per deploy to deploys per day.
  14. Makes it possible to do what might otherwise be too risky
  15. Most modern software deals primarily with electrons. The impact on the real world is minimal and indirect. In these cases, optimizing for MTTR (mean time to recovery) is far cheaper than optimizing for MTTF (mean time to failure). Say you're coming up to a monthly release and you have 3 people spend 6 days testing for 100 different bugs. They find 4 and miss 2. The 2 that were missed take 24 hours to get fixed and deployed. With continuous deployment, say we find 2 and miss 4.
  16. Most modern software deals primarily with electrons. The impact on the real world is minimal and indirect. In these cases, optimizing for MTTR (mean time to recovery) is far cheaper than optimizing for MTTF (mean time to failure).
  17. Continuous Deployment minimizes bug hours
  18. Not all bugs are equal, though. With MTTF, you're telling yourself that if we test it enough, there won't be any bugs. With MTTR, you're saying that we know there will be bugs, so let's fix them as quickly as possible.
  19. Cost to recover Steve Austin: $6 million in 1973, $31 million today.
  20. Cost to recover Steve Austin: $6 million in 1973, $31 million today.
  21. In most other cases, continuous deployment may help.
  22. GE MRI? No
  23. NASA? No
  24. Enterprise software? Yes!
  25. Printing Health Insurance, Credit Cards? No
  26. Continuously deploy to the App Store?
  27. What about when it comes to processing financial transactions?
  28. Etsy is PCI compliant, so we are financial software. The process is different for our credit card processing software, but we don't deploy on a schedule. We push code whenever it's necessary.
  29. This is all it is, and they're all cultural. Everything else is an implementation detail. Doing it is cultural; the technical part is just improving how well you're doing it. I'll talk about a few things that Etsy does to help us, but they're not necessary if you want to start continuous deployment. They're just a few of the things you're likely to find helpful once you do start. Continuous deployment is like everything else in software: ship early and iterate.
  30. This is all it is, and they're all cultural. Everything else is an implementation detail. Doing it is cultural; the technical part is just improving how well you're doing it. I'll talk about a few things that Etsy does to help us, but they're not necessary if you want to start continuous deployment. They're just a few of the things you're likely to find helpful once you do start. Continuous deployment is like everything else in software: ship early and iterate.
  31. This is all it is, and they're all cultural. Everything else is an implementation detail. Doing it is cultural; the technical part is just improving how well you're doing it. I'll talk about a few things that Etsy does to help us, but they're not necessary if you want to start continuous deployment. They're just a few of the things you're likely to find helpful once you do start. Continuous deployment is like everything else in software: ship early and iterate.
  32. Everything else measures how good you are at continuous deployment.
  33. We don't have the mythical 100% automated test coverage, so we do manual testing too.
  34. This is a great incentive to add automated tests. Manual testing once a month or once a week can be tolerated. Manual testing for multiple deploys per day is painful.
  35. Laurie Denness
  36. grep and Enter the Dragon were both released in 1973. Tailing a log and using grep to filter out the uninteresting stuff is a great way to monitor the health of the system.
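The tail-and-filter idea in this note can be sketched in Python as well. This is a minimal illustration only, not something from the talk: the function name `interesting_lines` and the sample patterns are hypothetical, and the behavior is modeled on `grep -v` (drop every line that matches an "uninteresting" pattern).

```python
import re


def interesting_lines(lines, ignore_patterns):
    """Yield only the lines that match none of the ignore patterns.

    Roughly what `tail -f some.log | grep -v -e P1 -e P2` does.
    `lines` can be any iterable of strings, such as an open log file.
    """
    ignored = [re.compile(pattern) for pattern in ignore_patterns]
    for line in lines:
        if not any(pattern.search(line) for pattern in ignored):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "INFO health check ok\n",
        "ERROR could not connect to memcache\n",
        "DEBUG cache hit\n",
    ]
    for line in interesting_lines(sample, [r"^INFO", r"^DEBUG"]):
        print(line)
```

Keeping the filter as a generator means it works on an endless stream the same way it works on a list, which is what watching a live log needs.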
  37. We use a tool called Deployinator to actually execute our deploys. Deployinator has buttons on it to kick off each stage of the deploy. It triggers a shell script that uses dsh to do stuff on each server, and it logs what's happening on deploys. It's designed to allow only minimal human input, as a feature. All we do is say, “Start.” There are no options that we might screw up. The “deploy button” is probably the tool that contributes most to cutting down the time spent on deploys.
  38. With ~100 developers, there is going to be contention for doing deploys. We use another advanced technology for resolving that contention: the topic of an IRC channel. I'm mattg and I'm at the front of the queue. will_gallego is sharing my deploy, and Michael Horowitz will do his deploy when we're done.
  39. Ganglia is a common graphing tool. It's great for looking at a pool of machines. Each band here is a separate machine.
  40. Graphite is another graphing tool. It lets you easily apply functions or stack graphs, and it is better for displaying system-level and business metrics. These two graphs actually show the same event, where we switched to an optimized version of libjpeg.
  41. Here's another Ganglia graph, and there's a clear drop in memcache connections. What caused that? We draw these vertical lines at each deploy, and this one is blue. That means there was a configuration change at that time that led to the drop. Now I know I can check the deploy logs to see what went out.
  42. Graphs are great to look at, but they don't help if there's not an easy way for developers to get the right data into them. We use a tool that we've open sourced called StatsD. It's a node.js UDP server that just listens for incoming data and sends it to Graphite. From our application code, the only thing we need to write is this little bit.
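The code shown on the slide is not part of these notes. As a rough stand-in, here is a minimal Python sketch of what sending a counter to StatsD can look like, assuming a StatsD server listening on its default UDP port 8125 on localhost. The helper names `format_counter` and `statsd_increment` and the metric name are hypothetical; only the `name:value|c` counter line format comes from StatsD itself.

```python
import socket


def format_counter(metric, count=1):
    """Build a StatsD counter datagram, e.g. b"logins.success:1|c"."""
    return f"{metric}:{count}|c".encode("ascii")


def statsd_increment(metric, count=1, host="127.0.0.1", port=8125):
    """Send one counter update over UDP (fire and forget).

    host and port are assumptions: 8125 is the StatsD default port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(metric, count), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    statsd_increment("logins.success")
```

Because the transport is UDP, the application never waits on the metrics server, which is a large part of why instrumenting code this way stays cheap.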
  43. Logster is another tool we use to easily get data into graphs. It scans production logs, uses plugins to parse out interesting information, and pushes it to Graphite or Ganglia.
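The scan, parse, push flow described for Logster can be illustrated with a short Python sketch. This is not Logster's plugin API: the function names, the log format, and the `web` metric prefix are all assumptions made for the example. The output lines follow Graphite's plaintext format of `metric value timestamp`.

```python
import re
from collections import Counter

# Assumes a common access-log layout where the status code follows the
# quoted request, e.g. ... "GET / HTTP/1.1" 200 512
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[f"http_{match.group(1)[0]}xx"] += 1
    return counts


def to_graphite_lines(counts, prefix, timestamp):
    """Format counts as Graphite plaintext lines: 'metric value timestamp'."""
    return [
        f"{prefix}.{name} {value} {timestamp}"
        for name, value in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '1.2.3.4 - - [28/Sep/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
        '1.2.3.4 - - [28/Sep/2012:10:00:01 +0000] "GET /x HTTP/1.1" 500 0',
    ]
    for line in to_graphite_lines(count_status_classes(sample), "web", 1348826400):
        print(line)
```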
  45. A deploy is not the same as a product launch. Just because you deploy frequently doesn't mean you have to give up control of when software is “launched.” Feature flags are the tools that give you that control through a “dark launch.” Credit card processing.
  46. Imagine you have this function to get a link for feature A. It formats the string and returns it.
  47. So now, to dark launch it, change that function to check a config value. If it's enabled, return the value from a new function; otherwise, return the value from the old function. This is just to generate a link, but we use these all over our code, so there's not really any limit to how you use feature flags.
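The function from the slides is not included in these notes, so the following is only a sketch of the pattern just described, written in Python. Every name here (`CONFIG`, `new_feature_a`, the link formats) is hypothetical.

```python
# Hypothetical config: the flag starts off, so behavior is unchanged.
CONFIG = {"new_feature_a": {"enabled": False}}


def old_feature_a_link(item_id):
    """The original function: format the link and return it."""
    return f"/feature-a/{item_id}"


def new_feature_a_link(item_id):
    """The new code path, deployed but not yet launched."""
    return f"/new-feature-a/{item_id}"


def feature_a_link(item_id, config=CONFIG):
    """Check the config value and dispatch to the new or old function."""
    if config.get("new_feature_a", {}).get("enabled", False):
        return new_feature_a_link(item_id)
    return old_feature_a_link(item_id)


if __name__ == "__main__":
    print(feature_a_link(42))
    print(feature_a_link(42, {"new_feature_a": {"enabled": True}}))
```

Callers keep calling `feature_a_link`, so launching or rolling back is a configuration change rather than a code change.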
  48. A sample of the configuration. At Etsy, we use admin to mean Etsy employees only. So FeatureA is dark launched for employees only, which lets us see how the development is progressing.
  49. Now we're ready to let real users start seeing the new feature, so we increase it to 1% of all users. We can also whitelist which specific users get the new feature. This ramp-up is a powerful way to reduce risk in a change, and it is why continuous deployment could work for financial trading software.
  50. If at any point we see a problem, we just roll back to admin only.
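The admin-only, whitelist, and percentage stages described in these notes could be evaluated along the following lines. This is a hedged Python sketch, not Etsy's configuration format: the keys `admin`, `whitelist`, and `percent`, and the CRC32 bucketing scheme, are assumptions chosen for the example.

```python
import zlib


def is_enabled(flag, user_id, is_admin=False):
    """Decide whether a user sees a feature under a hypothetical flag config.

    flag is a dict such as {"admin": True, "whitelist": [7], "percent": 1}.
    """
    if is_admin and flag.get("admin", False):
        return True
    if user_id in flag.get("whitelist", ()):
        return True
    # Stable bucket in [0, 100): a given user always lands in the same
    # bucket, so raising the percentage only ever adds users.
    bucket = zlib.crc32(str(user_id).encode("utf-8")) % 100
    return bucket < flag.get("percent", 0)


if __name__ == "__main__":
    admin_only = {"admin": True, "percent": 0}
    one_percent = {"admin": True, "whitelist": [7], "percent": 1}
    print(is_enabled(admin_only, 42, is_admin=True))
    print(is_enabled(one_percent, 7))
```

Rolling back to admin only is then just setting `percent` back to 0 and clearing the whitelist; no deploy of application code is involved.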
  53. Now 100% of people are using NewFeatureA
  54. Have an interface change and want to see if it moves metrics? Split users across the different options and see what happens. Have a new feature and want to see if people like it before getting behind it? Let people select themselves into a beta group. Google does this with Labs. With continuous deployment, you can make these changes, instantly see results, and then make more changes.
  55. Communication is a very helpful tool in both directions.
  56. First, if something small breaks, you want to have a feedback path for users to inform you. This is not specific to continuous deployment, but changes are spread with low intensity over a period of time, so you want to have a good low-intensity channel. Forums or message boards are a great, low-intensity way for users to send feedback.
  57. If something big breaks, you want to be able to inform users out of band of a potentially non-functional site. At Etsy, we have a blog hosted by WordPress where we post about outages or even slowness on the site.
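Splitting users across options, as this note describes, is often done with a deterministic hash so that each user keeps seeing the same option between visits. The sketch below is one way to do it in Python; the function name, the experiment name, and the variant labels are hypothetical and not taken from the talk.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one of the variants.

    Hashing the experiment name together with the user id keeps the
    assignment stable per user and independent across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


if __name__ == "__main__":
    # Hypothetical experiment and variants, for illustration only.
    for user_id in (1, 2, 3):
        print(user_id, assign_variant(user_id, "new_search_ui", ["control", "treatment"]))
```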
  58. We also have an etsystatus Twitter account.
  59. And here's one reason to have two channels.
  60. What we're doing with all these tools is making deployment a first-class member of the system. Compare that to tech support features or business intelligence.
  61. Listing photos are a core part of our site, as they're what lets buyers see what people are selling, and they get 400k uploads per day. The Postgres DB was our central DB, and we're migrating all of the data there over to the shards. All this happens live.
  62. Ultimately, the correct architecture needs to change too.
  64. Finally, get rid of the old stuff. This is the most satisfying step. It's also very important, because it keeps unused stuff from causing confusion.