Scaling Etsy: What Went Wrong, What Went Right

Slides for the talk given at Surge 2011.

  1. Scaling Etsy: What Went Wrong, What Went Right. Ross Snyder, ross@etsy.com, @beamrider9. Sept. 30, 2011
  2. Etsy is the world’s handmade marketplace. (vintage and supplies, too)
  3. Etsy was founded in mid-2005 and is constantly growing. [Chart: Gross Merchandise Sales ($MM)]
  4. From humble beginnings... June 2005: Four employees, one web*, one db, founder’s apartment (* until getting slashdotted by a link from Boing Boing in Aug. 2005)
  5. ... to today’s handmade juggernaut. Sept. 2011: 250+ employees, multiple offices, billions of pageviews (NYC Mayor Mike Bloomberg visited Etsy in June 2011)
  6. How’d we get here?
  7. Answer: with some difficulty. “There is no education like adversity.” - Benjamin Disraeli
  8. A few disclaimers
  9. Hindsight is 20/20
  10. “History is written by the victors”
  11. Etsy thrives today because of what its early employees accomplished
  12. Your narrator wasn’t present for most of the events covered in this talk
  13. Etsy Architecture: 2007
  14. Etsy Architecture: 2007. Operating System: Database: Webserver: Languages:
  15. Etsy Architecture: 2007. Most business logic in Postgres stored procedures
  16. Etsy Architecture: 2007. Front end / database interaction = stored procedure calls wrapped with PHP functions
  17. Etsy Architecture: 2007. Some database partitioning by feature, but still with a large central DB
  18. Etsy Architecture: 2007. Site uptime = not great
  19. Etsy Architecture: 2007. “How do we scale?”
  20. Etsy Architecture: 2007. “Let’s write some middleware!” (runners up: “Let’s rewrite the site in Java!” and “Let’s rewrite the site in Python!”)
  21. Conway’s Law: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” - Melvin Conway, 1968
  22. Etsy Engineering: 2007. Dev | DBA | Ops
  23. Etsy Engineering: 2007. Dev | DBA | Ops. Devs write code
  24. Etsy Engineering: 2007. Dev | DBA | Ops. DBAs write SQL
  25. Etsy Engineering: 2007. Dev | DBA | Ops. Ops deploys code & touches prod
  26. SILOS
  27. Etsy’s big bet: “Sprouter” (the Stored Procedure Router)
  28. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Runs on each webserver, listens on port 8010
  29. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Maps name/arguments to a Postgres stored procedure, calls it, returns results
  30. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Caches things
  31. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Supports sharding (in theory)
  32. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Devs write PHP, DBAs write SQL, meet somewhere in the middle
  33. SILOS
  34. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: The hope: easier to scale Sprouter than to scale the database itself
  35. Sprouter [Web (PHP) | Sprouter (Python) | DB (Postgres)]: (scaling the db when everything’s in stored procedures = somewhere between hard and impossible)
  36. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production.
  37. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated.
  38. What happened?
  39. Sprouter: “Good” Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Forcibly centralizes database access
  40. Sprouter: “Good” Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Hides data store implementation from caller
  41. Sprouter: “Good” Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Opens the door for “clever” automatic caching
  42. Sprouter: “Good” Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Prevents developers from writing SQL (?)
  43. (no text)
  44. Sprouter: Not-As-Good Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Creates substantial developer friction
  45. Sprouter: Not-As-Good Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Homegrown daemon + dependencies for Ops to maintain
  46. Sprouter: Not-As-Good Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Lack of community support / provability
  47. Sprouter: Not-As-Good Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Complex synchronization required to deploy (due to tight coupling with Postgres)
  48. Sprouter: Not-As-Good Parts [Web (PHP) | Sprouter (Python) | DB (Postgres)]: Database remains single point of failure (sharding features never fully formed)
  49. Sprouter: Summary. Extra barriers to development
  50. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability
  51. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful
  52. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful + Requires extra Ops/Dev resources
  53. Sprouter: Summary. Extra barriers to development + Negligible (negative?) effect on site reliability + Deploys even more painful + Requires extra Ops/Dev resources =
  54. How did attitudes change so quickly?
  55. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated.
  56. The Great Etsy Culture Shift
  57. The Great Etsy Culture Shift: Just as Sprouter went live, many of its strongest proponents departed Etsy
  58. The Great Etsy Culture Shift: Taking with them...
  59. The Great Etsy Culture Shift: Devotion to Postgres stored procedures / types
  60. The Great Etsy Culture Shift: Fear of developers writing SQL
  61. The Great Etsy Culture Shift: Fear of developers touching prod
  62. The Great Etsy Culture Shift: Infrequent / large deploys to production
  63. The Great Etsy Culture Shift: “Not developed here”
  64. The Great Etsy Culture Shift: Then, Now, Fall ’08
  65. DevOps
  66. DevOps: Silos = bad
  67. DevOps: Trust, cooperation, transparency, shared responsibility = good
  68. DevOps: “We’re all in this together”
  69. The Way Forward: Part 1. Stabilize the site
  70. The Way Forward: Part 1. Stabilize the site: Improve metrics & monitoring
  71. The Way Forward: Part 1. Stabilize the site: StatsD, http://github.com/etsy/statsd
  72. The Way Forward: Part 1. Stabilize the site: Upgrade database hardware vertically as far as possible
  73. The Way Forward: Part 1. Stabilize the site: Give developers production access to help troubleshoot problems
  74. The Way Forward: Part 2. Continuous Deployment
  75. The Way Forward: Part 2. Continuous Deployment: Any engineer can deploy to prod (generally happens 25+ times per day)
  76. The Way Forward: Part 2. Continuous Deployment: Deployinator, http://github.com/etsy/deployinator
  77. The Way Forward: Part 2. Continuous Deployment: One button that deploys the site
  78. The Way Forward: Part 2. Continuous Deployment: Small changesets, deployed frequently
  79. The Way Forward: Part 2. Continuous Deployment: Requires solid tests, good communication
  80. The Way Forward: Part 2. Continuous Deployment: Distributed developer-driven QA
  81. The Way Forward: Part 3. Circumvent Sprouter
  82. The Way Forward: Part 3. Circumvent Sprouter: Object-Relational Mapping (ORM)
  83. The Way Forward: Part 3. Circumvent Sprouter: aka “The Vietnam of Computer Science” (Google it)
  84. The Way Forward: Part 3. Circumvent Sprouter: Front-end PHP talks directly to database via ORM (also written in PHP)
  85. The Way Forward: Part 3. Circumvent Sprouter: ORM can cache where appropriate (as can front end)
  86. The Way Forward: Part 4. Database Sharding
  87. The Way Forward: Part 4. Database Sharding: Etsy has a lot of DNA from Flickr, including their DB sharding scheme
  88. The Way Forward: Part 4. Database Sharding: Based on MySQL
  89. The Way Forward: Part 4. Database Sharding: Battle-tested, well-known
  90. The Way Forward: Part 4. Database Sharding: Scales horizontally to infinity (or close enough)
  91. The Way Forward: Part 4. Database Sharding: No single points of failure (master-master replication)
  92. The Way Forward: Part 4. Database Sharding: Gradually phase out Sprouter, phase in ORM / sharded data
  93. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated.
  94. Sprouter: Timeline. Fall ’07: Idea first discussed. Spring ’08: Alpha version debuts. Fall ’08: Released in production. Spring ’09: Sprouter deprecated. Spring ’11: Sprouter turned off.
  95. (no text)
  96. Lessons Learned
  97. Etsy Architecture: 2007. Operating System: Database: Webserver: Languages:
  98. Etsy Architecture: 2011. Operating System: Database: Webserver: Languages:
  99. Open & trusting > closed & afraid (DevOps DevOps DevOps)
  100. Front end/database interaction is too critical to take chances on novel/untested solutions
  101. Side corollary: If you’re doing something “clever”, you’re probably doing it wrong
  102. The architectural decisions you make today will have large impact long after you’re gone
  103. No architectural hole is so deep that proven scaling strategies don’t exist for digging out
  104. Acknowledgement: We are probably making decisions today that will be the subject of a similar talk in 2015
  105. Learn More: http://codeascraft.etsy.com/ @codeascraft
  106. Etsy is hiring! http://www.etsy.com/careers @etsy
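
Slides 28 and 29 describe Sprouter as a Python daemon that maps a name and arguments to a Postgres stored procedure, calls it, and returns the results. The following is only a rough sketch of that mapping step, not Etsy's actual code. The procedure names, the `PROCEDURES` whitelist, and the `build_call` helper are all hypothetical, and the `%s` placeholders assume a DB-API driver that uses that parameter style.

```python
# Hypothetical sketch of the "name/arguments -> stored procedure" mapping.
# Nothing here is taken from Sprouter itself.
from typing import Sequence, Tuple

# Assumed whitelist: procedure name -> number of arguments it accepts.
# These procedure names are invented for illustration.
PROCEDURES = {
    "get_listing": 1,
    "add_favorite": 2,
}


def build_call(name: str, args: Sequence[object]) -> Tuple[str, tuple]:
    """Turn a procedure name and arguments into a parameterized SQL call."""
    if name not in PROCEDURES:
        raise KeyError(f"unknown stored procedure: {name}")
    expected = PROCEDURES[name]
    if len(args) != expected:
        raise ValueError(f"{name} expects {expected} argument(s), got {len(args)}")
    # The name comes from the whitelist; values travel separately as parameters.
    placeholders = ", ".join(["%s"] * len(args))
    return f"SELECT * FROM {name}({placeholders})", tuple(args)


if __name__ == "__main__":
    sql, params = build_call("get_listing", [42])
    print(sql, params)
```

A real router would hand the returned SQL and parameters to a database cursor. Even in this small form, every new query needs a matching procedure and registry entry, which hints at the developer friction noted on slide 44.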

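Slide 71 points to StatsD as part of improving metrics and monitoring. As a small illustration, StatsD accepts plain-text metrics of the form `bucket:value|type` over UDP, with 8125 as its usual default port. The sketch below formats and sends such lines. The helper names and the example bucket names are assumptions made for illustration, not part of the talk.

```python
# Minimal sketch of emitting StatsD-style metrics over UDP.
# Helper names and bucket names are hypothetical.
import socket
from typing import Optional


def format_metric(bucket: str, value: int, metric_type: str = "c",
                  sample_rate: Optional[float] = None) -> str:
    """Build one StatsD line, e.g. a counter ("c") or a timer ("ms")."""
    line = f"{bucket}:{value}|{metric_type}"
    if sample_rate is not None and sample_rate < 1:
        # Optional sampling suffix tells the server to scale the count up.
        line += f"|@{sample_rate}"
    return line


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget: UDP means a missing server does not block the caller."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    print(format_metric("deploys.count", 1))
    print(format_metric("page.render_time", 320, "ms"))
```

Because sending is a single UDP datagram, instrumenting a code path costs very little, which fits the deck's emphasis on measuring broadly while stabilizing the site.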