Crossing the Production Barrier: Development at Scale
Document Transcript

  • 1. jgoulah@etsy.com / @johngoulah. Crossing the Production Barrier: Development at Scale
  • 2. The world's handmade marketplace: a platform for people to sell handmade, craft, and vintage goods.
  • 3-6. 42MM unique visitors/mo. 1.5B+ pageviews/mo. 850K shops across 200 countries. 895MM sales in 2012.
  • 7. A big cluster: 20 shards, and adding 5 more.
  • 8-13. 60K+ queries/sec avg. 4TB InnoDB buffer pool. 20TB+ data stored. 99.99% of queries under 1ms. ~1.2Gbps outbound (plaintext). That is over a 40% increase in QPS from last year (25K last year), with an additional 30K moving over from Postgres; 1/3 of RAM is not dedicated to the pool (OS, disk, network buffers, etc.).
  • 14. 50+ MySQL servers / 800 CPUs. Server spec: HP DL380 G7, 96GB RAM, 16 spindles (16 x 146GB) in 1TB RAID10, 24 cores.
  • 15. The Problem: we've been around since '05 and hit this a few years ago; every big company probably has this issue.
  • 16. DATA: sync prod to dev, until prod data gets too big. http://www.flickr.com/photos/uwwresnet/6280880034/sizes/l/in/photostream/
  • 17-19. Some Approaches: subsets of data; generated data. But subsets have to end somewhere (a shop has favorites that are connected to people, connected to shops, etc.), and generated data can be time-consuming to fake.
  • 20. But... there is a problem with both of those approaches.
  • 21. Edge Cases: what about testing edge cases and difficult-to-diagnose bugs? It is hard to model the same data set that produced a user-facing bug. http://www.flickr.com/photos/sovietuk/141381675/sizes/l/in/photostream/
  • 22. Perspective: another issue is testing problems at scale, with complex and large gobs of data. A real social-network ecosystem is difficult to generate (favorites, follows), and features like the activity feed and "similar items" search give better results with real data. http://www.flickr.com/photos/donsolo/2136923757/sizes/l/in/photostream/
  • 23. Prod to Dev? This is what most people do before the data gets too big: it takes almost 2 days to sync 20TB over a 1Gbps link, 5 hrs over 10Gbps. Bringing the prod dataset to dev meant expensive hardware and maintenance, keeping parity with prod, and applying schema changes would take at least as long.
  • 24-25. Use Production (sometimes): so we did what we saw as the last resort and used production. Not for greenfield development; more for mature features and diagnosing bugs. We still have a dev database, but its data is sparse and unreliable.
  • 26. It goes without saying that this can be dangerous, and it is also difficult to do right; we have been working on this for a year. http://www.flickr.com/photos/stuckincustoms/432361985/sizes/l/in/photostream/
  • 27. Approach: two big things, cultural and technical.
  • 28. Solve Culture Issues First: part of figuring this out was exhausting all other options and getting buy-in from major stakeholders.
  • 29. Two "Simple" Technical Issues
  • 30. Step 0: failure recovery.
  • 31. Step 1: make it safe. How do we have test data in production and prevent stupid mistakes?
  • 32-35. Phased rollout: read-only; then r/w on the dev shard only; then full r/w.
  • 36. How did we do it?
  • 37. Quick Overview: a high-level view. http://www.flickr.com/photos/h-k-d/7852444560/sizes/o/in/photostream/
  • 38-41. tickets | index | shard1, shard2, ... shardN. The tickets server provides unique IDs, the index provides the shard lookup, and the shards store/retrieve the data.
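The tickets/index/shard flow can be sketched in a few lines. This is a minimal model of the idea only: all class and variable names are hypothetical, the placement policy is a stand-in, and the real system is of course backed by MySQL rather than dicts.

```python
# Minimal sketch of the three-tier lookup: a ticket server hands out
# globally unique IDs, an index maps an object ID to its shard, and the
# shard holds the actual rows. All names here are hypothetical.
import itertools


class TicketServer:
    """Hands out monotonically increasing unique IDs."""
    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def next_id(self):
        return next(self._counter)


class ShardedStore:
    def __init__(self, num_shards):
        self.tickets = TicketServer()
        self.index = {}                              # object id -> shard number
        self.shards = [dict() for _ in range(num_shards)]

    def create(self, record, shard=None):
        obj_id = self.tickets.next_id()
        if shard is None:
            shard = obj_id % len(self.shards)        # stand-in placement policy
        self.index[obj_id] = shard                   # record the shard lookup
        self.shards[shard][obj_id] = record          # store the data
        return obj_id

    def fetch(self, obj_id):
        shard = self.index[obj_id]                   # shard lookup first...
        return self.shards[shard][obj_id]            # ...then retrieve the row


store = ShardedStore(num_shards=3)
listing_id = store.create({"title": "hand-knit scarf"})
print(store.fetch(listing_id))
```

The key property this models is that reads never scan all shards: one index lookup gives the authoritative home of the object.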
  • 42. Introducing... the dev shard: a shard used for the initial writes of data created from the dev environment.
  • 43-47. tickets | index | shard1, shard2, ... shardN, plus the DEV shard. Initial writes from www.etsy.com land on the production shards; initial writes from www.goulah.vm land on the dev shard.
  • 48. mysql proxy
  • 49. The proxy sits in front of all of the shards, the index, and the tickets servers. http://www.oreillynet.com/pub/a/databases/2007/07/12/getting-started-with-mysql-proxy.html
  • 50-52. Dangerous/unnecessary queries: filter dangerous queries (queries without a WHERE clause) and remove unnecessary ones (instead of DELETE, set a flag; ALTER statements don't run from dev). For example:
(DEV) etsy_rw@jgoulah [test]> select * from fred_test;
ERROR 9001 (E9001): Selects from tables must have where clauses
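The filtering rules above can be sketched as a small function. Note the real MySQL Proxy is scripted in Lua; this Python version only models the decision logic, and the exact rule set (which verbs require a WHERE, which are blocked outright) is an assumption based on the examples on the slides.

```python
import re

# Sketch of the dev-proxy filtering rules: statements must carry a WHERE
# clause, and destructive statements that dev should never issue (DELETE,
# ALTER) are rejected outright. The rule lists here are assumptions.
BLOCKED = ("DELETE", "ALTER")
NEEDS_WHERE = ("SELECT", "UPDATE")


def check_query(sql):
    """Return None if the query may pass, else an error message."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in BLOCKED:
        return "ERROR 9001 (E9001): %s statements are not allowed from dev" % verb
    if verb in NEEDS_WHERE and not re.search(r"\bWHERE\b", sql, re.IGNORECASE):
        return ("ERROR 9001 (E9001): %ss from tables must have where clauses"
                % verb.capitalize())
    return None


print(check_query("select * from fred_test"))            # rejected: no WHERE
print(check_query("SELECT id FROM shops WHERE id = 1"))  # passes: None
```

Rejecting at the proxy rather than in application code means every path from dev, including ad-hoc shell sessions, goes through the same safety net.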
  • 53. Known ingress/egress funnel: we know where all of the queries from dev originate. http://www.flickr.com/photos/medevac71/4875526920/sizes/l/in/photostream/
  • 54. Explicitly enabled, not on all the time:
% dev_proxy on
Dev-Proxy config is now ON. Use dev_proxy off to turn it off.
  • 55-56. Visual notifications: notify engineers that they are using the proxy and that this is read-only mode.
  • 57-58. Read/write mode: needed for login and other things that write data.
  • 59. Stealth data: hiding data from users (favorites go on both the dev and prod shards, and we make sure test users/shops don't show up). http://www.flickr.com/photos/davidyuweb/8063097077/sizes/h/in/photostream/
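A minimal sketch of the stealth-data idea: rows created from dev live in production, so user-facing reads filter them out. How test accounts are actually flagged is not described on the slides; the registry set and function names below are hypothetical.

```python
# Sketch of "stealth data": user-facing reads hide rows belonging to
# dev/test accounts. TEST_USER_IDS is a hypothetical stand-in for however
# such accounts are really flagged.
TEST_USER_IDS = {901, 902}


def visible_favorites(rows, viewer_is_dev=False):
    """Hide favorites created by test accounts from ordinary users."""
    if viewer_is_dev:
        return rows
    return [r for r in rows if r["user_id"] not in TEST_USER_IDS]


rows = [{"user_id": 1, "listing": 10},
        {"user_id": 901, "listing": 11}]   # 901 is a test account
print(visible_favorites(rows))             # the test-account row is hidden
```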
  • 60. Security. http://www.flickr.com/photos/sidelong/3878741556/sizes/l/in/photostream/
  • 61-62. PCI: off-limits. Token exchange only; locked down for most people.
  • 63. Anomaly detection: another part of our security setup is detection.
  • 64. Logging: the basis of anomaly detection is log collection.
  • 65-72. Each proxied query is logged with its provenance:
2013-04-22 18:05:43 485370821 devproxy -- /* DEVPROXY source=10.101.194.19:40198 uuid=c309e8db-ca32-4171-9c4a-6c37d9dd3361 [htSp8458VmHlC] [etsy_index_B] [browse.php] */ SELECT id FROM table;
The fields are: date; thread id; source IP; a unique id generated by the proxy; the app request id ([htSp8458VmHlC]); the destination shard ([etsy_index_B]); and the script ([browse.php]).
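The log line above is regular enough to parse mechanically, which is what makes it usable for anomaly detection. A sketch parser, with the field layout taken from the slide annotations (the regex itself is my own, not Etsy's):

```python
import re

# Sketch parser for the dev-proxy log line shown above. Field names follow
# the slide annotations: date, thread id, source ip, proxy-generated uuid,
# app request id, destination shard, script.
LINE = ('2013-04-22 18:05:43 485370821 devproxy -- '
        '/* DEVPROXY source=10.101.194.19:40198 '
        'uuid=c309e8db-ca32-4171-9c4a-6c37d9dd3361 '
        '[htSp8458VmHlC] [etsy_index_B] [browse.php] */ '
        'SELECT id FROM table;')

PATTERN = re.compile(
    r'(?P<date>\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) '
    r'(?P<thread_id>\d+) devproxy -- '
    r'/\* DEVPROXY source=(?P<source>\S+) '
    r'uuid=(?P<uuid>\S+) '
    r'\[(?P<request_id>[^\]]+)\] \[(?P<shard>[^\]]+)\] \[(?P<script>[^\]]+)\] \*/ '
    r'(?P<query>.*)')

fields = PATTERN.match(LINE).groupdict()
print(fields["shard"], fields["script"])   # etsy_index_B browse.php
```

With every field structured, anomaly detection reduces to aggregating by source IP, shard, or script and flagging outliers.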
  • 73-74. login-as (read-only, logged with a reason for access); the reason is recorded and reviewed.
  • 75. Recovery
  • 76-79. Sources of restore data: Hadoop; backups; delayed slaves.
  • 80. Delayed Slaves: pt-slave-delay watches a slave and starts and stops its replication SQL thread as necessary to hold it back. http://www.flickr.com/photos/xploded/141295823/sizes/o/in/photostream/
  • 81-84. The role of the delayed slaves: they run a 4-hour delay behind the master, produce row-based binary logs, and allow for quick recovery. They are also a source of BCP (business continuity planning: prevention of and recovery from threats).
  • 85. pt-slave-delay --daemonize --pid /var/run/pt-slave-delay.pid --log /var/log/pt-slave-delay.log --delay 4h --interval 1m --nocontinue
The last 3 options are the most important: a 4h delay; the interval is how frequently it should check whether the slave should be started or stopped; and --nocontinue means don't resume replication normally on exit. User/pass options are eliminated for brevity.
  • 86-88. Shard pair: two R/W masters, plus a slave held back by pt-slave-delay and producing row-based binlogs.
  • 89. Shard pair with slave, then parse/transform, then HDFS/Vertica. In addition, we can use the slaves to send data to other stores for offline queries: 1) parse each binlog file to generate a sequence file of row changes; 2) apply the row changes to a previous set to get the latest version.
  • 90. Something bad happens... a bad query is run (a bad update, etc.). http://www.flickr.com/photos/focalintent/1332072795/sizes/o/in/photostream/
  • 91-94. A/B shard pair with slave, before restoration: 1) stop delayed-slave replication; 2) pull side A; 3) stop master-master replication. master.info should be pointing to the right place, and step 2 could be flipping the physical box (for faster recovery, such as for index servers).
  • 95. On the delayed slave, get the relay position:
> SHOW SLAVE STATUS
Relay_Log_File: dbslave-relay.007178
Relay_Log_Pos: 8666654
  • 96. On the delayed slave, SHOW RELAYLOG EVENTS will show the statements from the relay log; pass it the relay log and position to start:
mysql> show relaylog events in "dbslave-relay.007178" from 8666654 limit 1\G
*************************** 1. row ***************************
Log_name: dbslave-relay.007178
Pos: 8666654
Event_type: Query
Server_id: 1016572
End_log_pos: 8666565
Info: use `etsy_shard`; /* [CVmkWxhD7gsatX8hLbkDoHk29iKo] [etsy_shard_001_B] [/your/activity/index.php] */ UPDATE `news_feed_stats` SET `time_last_viewed` = 1366406780, `update_time` = 1366406780 WHERE `owner_id` = 30793071 AND `owner_type_id` = 2 AND `feed_type` = 'owner'
2 rows in set (0.00 sec)
  • 97. Filter the bad queries: cycle through all of the logs and analyze the Query events; Rotate events point to the next log file; the last relay log points to a binlog on the master (its server_id is the master's, and its binlog coordinates match master_log_file/pos). http://www.flickr.com/photos/chriswaits/6607823843/sizes/l/in/photostream/
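The filtering step above can be sketched as a pure function over relay-log events. This is a simplified model: events are represented as (type, info) pairs, whereas a real implementation would read them via SHOW RELAYLOG EVENTS or mysqlbinlog, and the bad-query pattern shown is just the example from the earlier slide.

```python
# Sketch of recovery-side filtering: keep replayable Query events, drop the
# statement(s) that caused the damage, and follow Rotate events to the next
# relay-log file. The event tuples and pattern here are illustrative only.
BAD_PATTERN = "UPDATE `news_feed_stats`"   # the statement being filtered out


def filter_events(events, bad_pattern):
    """Return (statements to replay, name of the next relay log, if any)."""
    keep, next_log = [], None
    for ev_type, info in events:
        if ev_type == "Rotate":
            next_log = info                # Rotate points at the next file
        elif ev_type == "Query" and bad_pattern not in info:
            keep.append(info)              # safe statement: replay it
    return keep, next_log


events = [
    ("Query", "INSERT INTO favorites VALUES (1, 2)"),
    ("Query", "UPDATE `news_feed_stats` SET time_last_viewed = 0"),
    ("Rotate", "dbslave-relay.007179"),
]
good, next_log = filter_events(events, BAD_PATTERN)
print(len(good), next_log)   # 1 dbslave-relay.007179
```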
  • 98-102. After the delayed-slave data is restored: 1) stop mysql on A and on the slave; 2) copy the data files to A; 3) restart B-to-A replication and let A catch up to B; 4) restart A-to-B replication, put A back in, then pull B. Again, master.info should be pointing to the right place.
  • 103. Other Forms of Recovery: migrate a single object (user/shop/etc.) from the delayed slave (similar to a shard migration); generate deltas from Hadoop; or, if the delayed slave has already "played" the bad data, go from last night's backup plus binlogs (slower).
  • 104. Use Cases: what are some use cases? http://www.flickr.com/photos/seatbelt67/502255276/sizes/o/in/photostream/
  • 105. A user reports a bug... a user files a bug, and I can trace the code for the exact page they're on right from my dev machine.
  • 106. Testing "dry" writes: testing how the application would run a write. In r/o mode, an exception is thrown containing the exact query it would have attempted to run, the values it tried to use, etc.
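A sketch of that dry-write behavior: in read-only mode the DB layer raises instead of executing, carrying the SQL and bind values so the engineer can inspect what would have run. The class and exception names are hypothetical, as is the crude write-detection heuristic.

```python
# Sketch of read-only "dry write" mode: writes raise an exception carrying
# the exact SQL and bind values instead of executing. All names here are
# hypothetical; the verb check is a simplification.
class DryWriteAttempt(Exception):
    def __init__(self, sql, params):
        super().__init__("dry write blocked: %s %r" % (sql, params))
        self.sql, self.params = sql, params


class Connection:
    def __init__(self, read_only=True):
        self.read_only = read_only

    def execute(self, sql, params=()):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if self.read_only and verb not in ("SELECT", "SHOW"):
            raise DryWriteAttempt(sql, params)   # surface query + values
        return []                                # stand-in for real execution


conn = Connection(read_only=True)
try:
    conn.execute("UPDATE shops SET name = %s WHERE id = %s", ("Test", 42))
except DryWriteAttempt as e:
    print(e.sql, e.params)
```

The payoff is that an engineer can exercise a real production code path end to end and see precisely what it would have written, without writing anything.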
  • 107. Search ads campaign consistency: starting campaigns and maintaining consistency for the entire ad system is nearly impossible in dev. Search-ads data is stored in more than a dozen DB tables, and state changes are driven by a combination of browsers triggering ads, sellers managing their campaigns, and a slew of crons running anywhere from once per 5 minutes to once a month. For example, to test pausing campaigns that run out of money mid-day, we can pull large numbers of campaigns from prod and operate on those to verify that the data will still be consistent.
  • 108. Google Product Listing Ads: GPLA is where we syndicate our listings to Google for use in Google product search ads. We can test edge cases in GPLA syndication where it would be difficult to recreate the state in dev.
  • 109. Testing prototypes: features like similar-items search give better results in production because of the amount of data; this allowed us to test the quality of the listings a prototype was displaying.
  • 110. Performance testing: we need a real data set to test pages like treasury search with lots of threads, avatars, etc. The dev data is too sparse, so xhprof traces don't mean anything, and missing avatars change the perf characteristics.
  • 111. Hadoop-generated datasets: datasets produced from Hadoop (recommendations for users, or statistics about usage). Since Hadoop holds prod data, the results reference prod users/listings/shops, so we have to check them against prod; a sync to dev would fill the dev DBs and the data still wouldn't line up.
  • 112-113. Browse slices: browse slices have a complex population, so it is easier to test an experiment against prod data; there are not enough listings in dev to populate the narrower subcategories, and it just takes too long.
  • 114. Thank You. We're hiring: etsy.com/jobs