
Scaling Etsy: What Went Wrong, What Went Right

Slides for the talk given at Surge 2011.

Published in: Technology


  1. Scaling Etsy: What Went Wrong, What Went Right. Ross Snyder, ross@etsy.com, @beamrider9. Sept. 30, 2011
  2. Etsy is the world’s handmade marketplace. (vintage and supplies, too)
  3. Etsy was founded in mid-2005 and is constantly growing. [Chart: Gross Merchandise Sales ($MM)]
  4. From humble beginnings... June 2005: Four employees, one web*, one db, founder’s apartment. (* until getting slashdotted by a link from Boing Boing in Aug. 2005)
  5. ... to today’s handmade juggernaut. Sept. 2011: 250+ employees, multiple offices, billions of pageviews. (NYC Mayor Mike Bloomberg visited Etsy in June 2011)
  6. How’d we get here?
  7. Answer: with some difficulty. “There is no education like adversity.” - Benjamin Disraeli
  8. A few disclaimers
  9. Hindsight is 20/20
  10. “History is written by the victors”
  11. Etsy thrives today because of what its early employees accomplished
  12. Your narrator wasn’t present for most of the events covered in this talk
  13. Etsy Architecture: 2007
  14. Etsy Architecture: 2007. Operating System, Database, Webserver, Languages (values not captured in this transcript)
  15. Etsy Architecture: 2007. Most business logic in Postgres stored procedures
  16. Etsy Architecture: 2007. Front end / database interaction = stored procedure calls wrapped with PHP functions
  17. Etsy Architecture: 2007. Some database partitioning by feature, but still with a large central DB
  18. Etsy Architecture: 2007. Site uptime = not great
  19. Etsy Architecture: 2007. “How do we scale?”
  20. Etsy Architecture: 2007. “Let’s write some middleware!” (runners up: “Let’s rewrite the site in Java!” and “Let’s rewrite the site in Python!”)
  21. Conway’s Law: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” - Melvin Conway, 1968
  22. Etsy Engineering: 2007. Dev | DBA | Ops
  23. Etsy Engineering: 2007. Dev | DBA | Ops. Devs write code
  24. Etsy Engineering: 2007. Dev | DBA | Ops. DBAs write SQL
  25. Etsy Engineering: 2007. Dev | DBA | Ops. Ops deploys code & touches prod
  26. SILOS
  27. Etsy’s big bet: “Sprouter” (the Stored Procedure Router)
  28. Sprouter: Web (PHP) -> Sprouter (Python) -> DB (Postgres). Runs on each webserver, listens on port 8010
  29. Sprouter: Maps name/arguments to a Postgres stored procedure, calls it, returns results
  30. Sprouter: Caches things
  31. Sprouter: Supports sharding (in theory)
  32. Sprouter: Devs write PHP, DBAs write SQL, meet somewhere in the middle
  33. SILOS
  34. Sprouter: The hope: easier to scale Sprouter than to scale the database itself
  35. Sprouter: (scaling the db when everything’s in stored procedures = somewhere between hard and impossible)
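
Slides 28 to 30 describe Sprouter only in outline: it maps a name and arguments to a Postgres stored procedure, calls it, and caches things. The sketch below is one way such a router could look, not Sprouter's actual code. The names `build_call` and `Router`, the name-validation pattern, and the cache-everything policy are all assumptions made for illustration; the `execute` callable stands in for a real database driver.

```python
import re

# Hypothetical rule for what counts as a legal stored-procedure name.
_PROC_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def build_call(proc_name, args):
    """Return (sql, params) for calling a stored procedure by name."""
    if not _PROC_NAME.match(proc_name):
        raise ValueError(f"illegal procedure name: {proc_name!r}")
    # One %s placeholder per argument; values travel separately from the SQL.
    placeholders = ", ".join(["%s"] * len(args))
    return f"SELECT * FROM {proc_name}({placeholders})", tuple(args)


class Router:
    """Maps (name, args) to a stored-procedure call and caches the result."""

    def __init__(self, execute):
        # execute(sql, params) is supplied by the caller (assumption: it
        # runs the statement and returns the rows).
        self._execute = execute
        self._cache = {}

    def call(self, proc_name, *args):
        key = (proc_name, args)
        if key not in self._cache:
            sql, params = build_call(proc_name, args)
            self._cache[key] = self._execute(sql, params)
        return self._cache[key]
```

Caching every call forever, as this sketch does, is only safe for read-only procedures. Any real router would need an invalidation policy, which the slides do not describe.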
  36. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production
  37. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated
  38. What happened?
  39. Sprouter: “Good” Parts. Web (PHP) -> Sprouter (Python) -> DB (Postgres). Forcibly centralizes database access
  40. Sprouter: “Good” Parts. Hides data store implementation from caller
  41. Sprouter: “Good” Parts. Opens the door for “clever” automatic caching
  42. Sprouter: “Good” Parts. Prevents developers from writing SQL (?)
  43. (no text on this slide)
  44. Sprouter: Not-As-Good Parts. Web (PHP) -> Sprouter (Python) -> DB (Postgres). Creates substantial developer friction
  45. Sprouter: Not-As-Good Parts. Homegrown daemon + dependencies for Ops to maintain
  46. Sprouter: Not-As-Good Parts. Lack of community support / provability
  47. Sprouter: Not-As-Good Parts. Complex synchronization required to deploy (due to tight coupling with Postgres)
  48. Sprouter: Not-As-Good Parts. Database remains single point of failure (sharding features never fully formed)
  49. Sprouter: Summary. Extra barriers to development
  50. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability
  51. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful
  52. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful + Requires extra Ops/Dev resources
  53. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful + Requires extra Ops/Dev resources =
  54. How did attitudes change so quickly?
  55. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated
  56. The Great Etsy Culture Shift
  57. The Great Etsy Culture Shift. Just as Sprouter went live, many of its strongest proponents departed Etsy
  58. The Great Etsy Culture Shift. Taking with them...
  59. The Great Etsy Culture Shift. Devotion to Postgres stored procedures / types
  60. The Great Etsy Culture Shift. Fear of developers writing SQL
  61. The Great Etsy Culture Shift. Fear of developers touching prod
  62. The Great Etsy Culture Shift. Infrequent / large deploys to production
  63. The Great Etsy Culture Shift. “Not developed here”
  64. The Great Etsy Culture Shift. Then | Now (dividing line: Fall ’08)
  65. DevOps
  66. DevOps. Silos = bad
  67. DevOps. Trust, cooperation, transparency, shared responsibility = good
  68. DevOps. “We’re all in this together”
  69. The Way Forward: Part 1. Stabilize the site
  70. The Way Forward: Part 1. Stabilize the site. Improve metrics & monitoring
  71. The Way Forward: Part 1. Stabilize the site. StatsD: http://github.com/etsy/statsd
  72. The Way Forward: Part 1. Stabilize the site. Upgrade database hardware vertically as far as possible
  73. The Way Forward: Part 1. Stabilize the site. Give developers production access to help troubleshoot problems
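
Slide 71 names StatsD without showing how an application reports to it. As a rough illustration, StatsD accepts plain-text datagrams of the form `bucket:value|type` over UDP (port 8125 by default), where `c` marks a counter and `ms` a timer. The helper names and the bucket names in the usage note are made up for this sketch; only the wire format comes from StatsD itself.

```python
import socket


def format_metric(bucket, value, metric_type):
    """Build one StatsD line, e.g. 'site.logins:1|c'."""
    return f"{bucket}:{value}|{metric_type}"


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one metric line to a StatsD daemon over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))
```

For example, `send_metric(format_metric("site.logins", 1, "c"))` would increment a counter, and `format_metric("db.query_time", 320, "ms")` builds a 320 ms timer sample. Because the transport is UDP, a missing or overloaded daemon does not slow down the code being measured.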
  74. The Way Forward: Part 2. Continuous Deployment
  75. The Way Forward: Part 2. Continuous Deployment. Any engineer can deploy to prod (generally happens 25+ times per day)
  76. The Way Forward: Part 2. Continuous Deployment. Deployinator: http://github.com/etsy/deployinator
  77. The Way Forward: Part 2. Continuous Deployment. One button that deploys the site
  78. The Way Forward: Part 2. Continuous Deployment. Small changesets, deployed frequently
  79. The Way Forward: Part 2. Continuous Deployment. Requires solid tests, good communication
  80. The Way Forward: Part 2. Continuous Deployment. Distributed developer-driven QA
  81. The Way Forward: Part 3. Circumvent Sprouter
  82. The Way Forward: Part 3. Circumvent Sprouter. Object-Relational Mapping (ORM)
  83. The Way Forward: Part 3. Circumvent Sprouter. aka “The Vietnam of Computer Science” (Google it)
  84. The Way Forward: Part 3. Circumvent Sprouter. Front-end PHP talks directly to database via ORM (also written in PHP)
  85. The Way Forward: Part 3. Circumvent Sprouter. ORM can cache where appropriate (as can front end)
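
Slides 84 and 85 say the front end talks to the database through an ORM that can cache where appropriate. The sketch below shows that idea in miniature. It is not Etsy's ORM, which was written in PHP: the `Listing` class, the `ListingMapper` name, the `listings` table, and the use of Python's built-in `sqlite3` as a stand-in database are all assumptions chosen so the example runs on its own.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Listing:
    id: int
    title: str


class ListingMapper:
    """Maps rows of a hypothetical `listings` table to Listing objects."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}  # read-through cache keyed by primary key

    def find(self, listing_id):
        if listing_id in self._cache:
            return self._cache[listing_id]
        row = self._conn.execute(
            "SELECT id, title FROM listings WHERE id = ?", (listing_id,)
        ).fetchone()
        if row is None:
            return None
        listing = Listing(*row)
        self._cache[listing_id] = listing
        return listing

    def save(self, listing):
        self._conn.execute(
            "INSERT OR REPLACE INTO listings (id, title) VALUES (?, ?)",
            (listing.id, listing.title),
        )
        # Drop any cached copy so the next find() re-reads the row.
        self._cache.pop(listing.id, None)
```

The application code never writes SQL for ordinary reads and writes, yet the queries live in the same codebase and language as the front end, which is the contrast with Sprouter that the slides draw.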
  86. The Way Forward: Part 4. Database Sharding
  87. The Way Forward: Part 4. Database Sharding. Etsy has a lot of DNA from Flickr, including their DB sharding scheme
  88. The Way Forward: Part 4. Database Sharding. Based on MySQL
  89. The Way Forward: Part 4. Database Sharding. Battle-tested, well-known
  90. The Way Forward: Part 4. Database Sharding. Scales horizontally to infinity (or close enough)
  91. The Way Forward: Part 4. Database Sharding. No single points of failure (master-master replication)
  92. The Way Forward: Part 4. Database Sharding. Gradually phase out Sprouter, phase in ORM / sharded data
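
Slides 87 to 91 give only the outline of the sharding scheme: MySQL, horizontally scaled, with master-master pairs so that no single host is a point of failure. The slides do not say how data is assigned to shards, so the sketch below assumes a lookup table from an owner id to a shard number. `ShardDirectory`, `pick_host`, the round-robin assignment, and the host names in the usage note are all hypothetical.

```python
class ShardDirectory:
    """Lookup table from an owner id to a shard number (assumed design)."""

    def __init__(self, shard_pairs):
        # Each shard is a (host_a, host_b) master-master pair.
        self._pairs = list(shard_pairs)
        self._index = {}

    def shard_for(self, owner_id):
        # A new owner is assigned round-robin; afterwards the stored
        # mapping always wins, so data can be moved between shards.
        if owner_id not in self._index:
            self._index[owner_id] = len(self._index) % len(self._pairs)
        return self._index[owner_id]

    def move(self, owner_id, shard):
        if not 0 <= shard < len(self._pairs):
            raise ValueError(f"no such shard: {shard}")
        self._index[owner_id] = shard

    def hosts_for(self, owner_id):
        return self._pairs[self.shard_for(owner_id)]


def pick_host(pair, owner_id, down=frozenset()):
    """Choose one side of a master-master pair, failing over if it is down."""
    preferred = owner_id % 2
    for i in (preferred, 1 - preferred):
        if pair[i] not in down:
            return pair[i]
    raise RuntimeError("both hosts in the pair are down")
```

With `ShardDirectory([("db1a", "db1b"), ("db2a", "db2b")])`, a query for owner 10 would go to `pick_host(directory.hosts_for(10), 10)`. Adding capacity means appending another pair and moving some owners to it, which is what lets this kind of layout grow horizontally.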
  93. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated
  94. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated. Spring ’11: Sprouter turned off
  95. (no text on this slide)
  96. Lessons Learned
  97. Etsy Architecture: 2007. Operating System, Database, Webserver, Languages (values not captured in this transcript)
  98. Etsy Architecture: 2011. Operating System, Database, Webserver, Languages (values not captured in this transcript)
  99. Open & trusting > closed & afraid (DevOps DevOps DevOps)
  100. Front end/database interaction is too critical to take chances on novel/untested solutions
  101. Side corollary: If you’re doing something “clever”, you’re probably doing it wrong
  102. The architectural decisions you make today will have large impact long after you’re gone
  103. No architectural hole is so deep that proven scaling strategies don’t exist for digging out
  104. Acknowledgement: We are probably making decisions today that will be the subject of a similar talk in 2015
  105. Learn More: http://codeascraft.etsy.com/ @codeascraft
  106. Etsy is hiring! http://www.etsy.com/careers @etsy
