
Scaling Wanelo.com 100x in Six Months



A collection of problems and solutions worked through by the Wanelo team as they scaled the site to keep up with rapidly growing user demand. By Konstantin Gredeskoul and Eric Saxby.



  1. Scaling 100x in six months
     Eric Saxby & Konstantin Gredeskoul, April 2013
  2. What is Wanelo?
     ■ Wanelo (“wah-nee-lo”, from Want, Need, Love) is a global platform for shopping.
  3. ■ It’s marketing-free shopping across hundreds of thousands of unique stores
  4. Personal Activity Feed...
  5. iOS + Android
  6. Early Decisions
     ■ Optimize for iteration speed, not performance
     ■ Keep scalability in mind, track metrics, and fix as needed
     ■ Introduce many levels of caching early
  7. Technology Timeline
     ■ 2010-2011: Wanelo v1 stack is Java, JSP, MySQL, Hibernate; 90K lines of code, 53+ DB tables, no tests
     ■ May 2012-June 2012: rewrite from scratch to RoR on PostgreSQL (v2)
     ■ Ruby app is 10K LOC, full test coverage, 8 database tables, fewer features
  8. The “Big” Rewrite
     More info here: http://building.wanelo.com/
  9. Growth Timeline
     ■ 06/2012: RoR app relaunches; 2-3K requests per minute (RPM) peak
     ■ 08/2012: iOS app is launched; 10-40K RPM peak
     ■ 12/2012: Android app launched; 40-120K RPM peak
     ■ 03/2013: #24 among top free apps on iTunes; 80-200K RPM peak
  10. Requests Per Minute (RPM) (graph)
  11. Current Numbers...
     ■ 4M active monthly users
     ■ 5M products, saved 700M times
     ■ 8M products saved per day
     ■ 200K stores
  12. Backend Stack & Key Vendors
     ■ MRI Ruby 1.9.3 & Rails 3.2
     ■ PostgreSQL 9.2.4, Solr 3.6
     ■ Joyent Cloud, SmartOS: ZFS, ARC, raw IO performance, CPU bursting, DTrace
     ■ Circonus, Chef + Opscode: monitoring, graphing, alerting, automation
     ■ Amazon S3 + Fastly CDN
     ■ NewRelic, statsd, Graphite, Nagios
  13. Wanelo Web Architecture (diagram)
     nginx (6 x 2GB) → haproxy → unicorn x14 (20 x 8GB) and sidekiq (4 x 8GB) → haproxy / pgbouncer / twemproxy → Solr, PostgreSQL, Redis, Memcached
  14. This talk is about:
     1. How much traffic can your database handle?
     2. Special report on counters
     3. Scaling database reads
     4. Scaling database writes
  15. Part 1: How much traffic can your database handle?
  16. PostgreSQL is Awesome!
     ■ Does a fantastic job of not corrupting your data
     ■ Streaming replication in 9.2 is extremely reliable
     ■ Won’t write to a read-only replica
     ■ But... no master/master replication (good!)
  17. Is the database healthy?
  18. What’s healthy?
     ■ Able to respond quickly to queries from the application (< 4ms disk seek time)
     ■ Has enough room to grow
     ■ How do we know when we’re approaching a dangerous threshold?
  19. Oops! NewRelic latency graph (yellow = database)
  20. pg_stat_statements
     ■ Maybe your app is to blame for performance...
        select query, calls, total_time
        from pg_stat_statements
        order by total_time desc limit 12;
     ■ Similar to Percona Toolkit, but runs all the time, collecting stats.
  21. pg_stat_statements (query output screenshot)
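     For context (not from the slides): pg_stat_statements ships as a PostgreSQL contrib extension and has to be preloaded before it collects anything. A minimal sketch of enabling it and running the report above from a Rails console; all names are standard PostgreSQL/ActiveRecord:

        # postgresql.conf (requires a restart):
        #   shared_preload_libraries = 'pg_stat_statements'
        # Then, once per database:
        ActiveRecord::Base.connection.execute("CREATE EXTENSION pg_stat_statements")

        # Top 12 queries by cumulative time, as on the previous slide:
        rows = ActiveRecord::Base.connection.select_rows(<<-SQL)
          SELECT query, calls, total_time
          FROM pg_stat_statements
          ORDER BY total_time DESC LIMIT 12
        SQL
        rows.each do |query, calls, total_time|
          puts format("%10s calls %12s ms  %s", calls, total_time, query[0, 60])
        end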
  22. pg_stat_user_indexes
     ■ Using indexes as much as you think you are?
     ■ Using indexes at all?
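     A hedged way to answer both questions (not from the slides): idx_scan in pg_stat_user_indexes counts how many times each index has been used, so sorting ascending surfaces indexes that earn nothing but still cost every write:

        ActiveRecord::Base.connection.select_rows(<<-SQL).each { |row| puts row.join("  ") }
          SELECT relname, indexrelname, idx_scan
          FROM pg_stat_user_indexes
          ORDER BY idx_scan ASC
          LIMIT 20
        SQL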
  23. pg_stat_user_tables
     ■ Full table scans? (seq_scan)
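     Similarly (again a sketch, not from the slides): seq_scan and seq_tup_read in pg_stat_user_tables show which tables are being sequentially scanned and how many rows those scans chew through:

        ActiveRecord::Base.connection.select_rows(<<-SQL).each { |row| puts row.join("  ") }
          SELECT relname, seq_scan, seq_tup_read, idx_scan
          FROM pg_stat_user_tables
          ORDER BY seq_tup_read DESC
          LIMIT 20
        SQL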
  24. Throw that in a graph: reads/second for one large table, daily
  25. Non-linear changes: suspicious spike!
  26. Correlate different data: deployments! Aha!
  27. Utilization vs Saturation: # of active PostgreSQL connections
  28. Utilization vs Saturation: red line = % of max connections established; purple = % of connections in query
  29. Disk reads/writes (green: reads, red: writes)
     Usage increases, but are the disks saturated?
  30. Utilization vs Saturation (graphs)
  31. Utilization vs Saturation: how much are you waiting on disk?
  32. File system cache (ARC)
  33. Watch the right things: hit ratio of the file system cache (ARC)
  34. Room to grow... Size (including indexes) of a key table
  35. Working set in RAM? Adding an index increases the size
  36. Collect all the data you can
     Once we knew where to look, graphs added later could explain behavior we could only guess at earlier
  37. Part 2: Special report on Counters and Pagination
  38. Problem #1: DB Latency Up...
     ■ iostat shows disks 100% busy:
        device     r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
        sd1      384.0   1157.5   48.0  116.8   0.0   8.8    5.7   2  100
        sd1      368.0   1117.9   45.7  106.3   0.0   8.0    5.4   2  100
        sd1      330.3   1357.5   41.3  139.1   0.0   9.5    5.6   2  100
  39. Problem #1: Diagnostics
     ■ The database is running very, very hot. Initial investigation shows a large number of counts.
     ■ It turns out that anytime you page with Kaminari, it always does a count(*)!
        SELECT "stores".* FROM "stores"
          WHERE (state = 'approved')
          LIMIT 20 OFFSET 0;

        SELECT COUNT(*) FROM "stores" WHERE (state = 'approved');
  40. Problem #1: Pagination
     ■ Doing count(*) is pretty expensive, as the DB must scan many rows (either the actual table or an index)
  41. Problem #1: Pagination
     ■ We are paginating everything! Even infinite scroll is a paged view behind the scenes.
     ■ But we really DON’T want to run count(*) for every paged view.
  42. Problem #1: Pagination
     ■ We are showing the most popular stores
     ■ Maybe it’s OK to hard-code the total number to, say, 1000?
     ■ How do we tell Kaminari NOT to issue a count query in this case?
  43. Problem #1: Pagination (ctd.) (screenshot)
  44. Solution #1: Monkey Patch!! (code screenshot; a hedged reconstruction follows)
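     The patch itself was shown as a screenshot; here is a hedged reconstruction of the idea rather than Wanelo’s actual code. Kaminari::PageScopeMethods is the real module that defines total_count on paginated scopes (it must already be loaded); the preset_total_count method name is hypothetical:

        module Kaminari
          module PageScopeMethods
            # Let the caller supply a fixed total so no COUNT(*) is ever issued
            def preset_total_count(n)
              @preset_total_count = n
              self
            end

            alias_method :original_total_count, :total_count
            def total_count(*args)
              @preset_total_count || original_total_count(*args)
            end
          end
        end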
  45. Solution #1: Pass in the counter
        SELECT "stores".* FROM "stores" WHERE (state = 'approved') LIMIT 20 OFFSET 0;
     (only the SELECT remains; the COUNT(*) query is gone)
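     Usage of the patch above then looks something like this (Store.approved is a hypothetical scope). For what it’s worth, later Kaminari versions added built-in equivalents, so the monkey patch is no longer necessary:

        # With the patch above:
        stores = Store.approved.page(params[:page]).per(20).preset_total_count(1000)

        # Modern Kaminari (added after this talk):
        Store.approved.page(params[:page]).per(20).without_count          # skip COUNT(*) entirely
        Kaminari.paginate_array([], total_count: 1000).page(1).per(20)    # supply a fixed total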
  46. Problem #2: Count Draculas
     ■ AKA: we are still doing too many counts!
     ■ Rails makes it so easy to do it the lazy way.
  47. Problem #2: Too Many Counts!
     ■ But it just doesn’t scale well
     ■ Fortunately, Rails has just the feature for this...
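     The feature is ActiveRecord’s counter_cache (the slide showed it as a screenshot). A minimal example with hypothetical Save/Product models; saves_count is the conventional column name Rails maintains automatically:

        class Save < ActiveRecord::Base
          # Rails increments/decrements products.saves_count in a callback
          belongs_to :product, counter_cache: true
        end

        class Product < ActiveRecord::Base
          has_many :saves   # reads use product.saves_count instead of COUNT(*)
        end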
  48. Counter Caches
     ■ Unfortunately, it has one massive issue:
     ■ It causes database deadlocks at high volume
     ■ Because many Ruby processes are creating child records concurrently
     ■ Each is executing a callback, trying to update the counter_cache column on the parent, requiring a row-level lock
     ■ Deadlocks ensue
  49. Possible Solution: Use Background Jobs
     ■ It works like this:
     ■ As the record is created, we enqueue a request to recalculate the counter_cache on the parent
     ■ The job performs a complete recalculation of the counter cache and is idempotent
  50. Solution #2: Explained
     ■ Sidekiq with the UniqueJob extension
     ■ Short wait for “buffering”
     ■ Serialize updates via a small number of workers
     ■ Can temporarily stop workers (in an emergency) to alleviate DB load
  51. Solution #2: Code (screenshot; a hedged sketch follows)
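     The code slide was a screenshot; below is a hedged sketch of the approach it describes, not Wanelo’s actual worker. It assumes Sidekiq plus the sidekiq-unique-jobs gem (uniqueness option names vary by version); reset_counters is standard ActiveRecord:

        class CounterCacheWorker
          include Sidekiq::Worker
          # Unique jobs collapse duplicate enqueues for the same record, and a
          # small pool of :counters workers serializes the updates
          sidekiq_options queue: :counters, unique: true

          def perform(parent_class, parent_id, counter)
            klass = parent_class.constantize
            # Full recalculation from the child table: idempotent, so it is
            # safe to run this job any number of times, in any order
            klass.reset_counters(parent_id, counter.to_sym)
          end
        end

        # Enqueued from an after_create callback with a short delay ("buffering"):
        # CounterCacheWorker.perform_in(2.minutes, 'Product', product.id, :saves)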
  52. Things are better. BUT... Still too many fucking counts!
     ■ Even doing count(*) from workers is too much load on the databases
     ■ We need to stop doing count(*) in the DB, but keep counter_caches. How?
     ■ We could use Redis for this.
  53. Solution #3: Count Deltas
     (diagram: unicorn → 1. INCR product_id in Redis, 2. enqueue ProductCountWorker → sidekiq → 3. dequeue, 4. GET delta from Redis, 5. RESET it and UPDATE the counter_cache column in PostgreSQL by N)
     ■ The web request increments a counter value in Redis
     ■ It also enqueues a request to update the counter_cache
     ■ A background job picks it up a few minutes later, reads the Redis delta value, and removes it
     ■ It updates the counter_cache column by incrementing it by the delta
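     A hedged sketch of this flow, not the internal gem’s code. It assumes a redis-rb client in $redis plus the hypothetical Product model and key naming; GETSET gives an atomic read-and-reset, and update_counters is standard ActiveRecord:

        class ProductCountWorker
          include Sidekiq::Worker
          sidekiq_options queue: :counters

          def perform(product_id)
            key   = "counters:product:#{product_id}:saves"
            delta = $redis.getset(key, 0).to_i   # read the delta and reset it atomically
            return if delta.zero?
            # One UPDATE, incrementing the counter_cache column by the whole delta
            Product.update_counters(product_id, saves_count: delta)
          end
        end

        # In the web request:
        # $redis.incr("counters:product:#{product.id}:saves")
        # ProductCountWorker.perform_in(5.minutes, product.id)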
  54. Define counter_cache_on...
     ■ Internal gem, will open source soon!
  55. Can now use counter caches in pagination!
  56. Part 3: Scaling reads
  57. Multiple optimization cycles
     ■ Caching: action caching, fragment caching, CDN
     ■ Personalization via AJAX: cache the entire page, then add personalized details
     ■ 25ms/req of memcached time is cheaper than 12ms/req of database time
  58. Cache optimization: 40% hit ratio! Woo! Wait... is that even good?
  59. Cache optimization: increasing your hit ratio means fewer queries against your database
  60. Cache optimization caveat: even low-hit-ratio caches can save your ass. You’re removing load from the DB, remember?
  61. Cache saturation (blue: cache writes, red: automatic evictions): how long before your caches start evicting data?
  62. Ajax personalization (screenshots)
  63. Nice!
     ■ Rails action caching runs before_filters, so A/B experiments can still run
     ■ Extremely fast pages: 4ms application time for some of our computationally heaviest pages
     ■ Could be served via CDN in the future
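     A hedged sketch of the action-caching-plus-AJAX pattern just described (controller and filter names are hypothetical; caches_action is the real Rails 3 API, extracted into the actionpack-action_caching gem as of Rails 4):

        class ProductsController < ApplicationController
          # before_filters still run on cache hits, so A/B bucketing keeps working
          before_filter :assign_ab_test_bucket
          caches_action :show, layout: false   # cache the anonymous version of the page

          def show
            @product = Product.find(params[:id])
          end
        end

        # Anything user-specific (login state, saved products) is fetched by a
        # separate AJAX call and rendered on the client.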
  64. Sad trombone...
     ■ Are you actually logged in? Pages don’t know until Ajax successfully runs
     ■ Selenium AND Jasmine tests!
  65. Read/write splitting
     ■ Sometime in December 2012...
     ■ Database reaching 100% saturation
     ■ Latency starting to increase non-linearly
     ■ We need to distribute database load
     ■ We need to use read replicas!
  66. DB adapters for read/write splitting
     ■ Looked at several, including DbCharmer
     ■ Features / configurability / stability
     ■ Thread safety? This may be Ruby, but some people do actually use threads.
     ■ If I tell you it’s a read-only replica, DON’T ISSUE WRITES
     ■ Failover on errors?
  67. Chose Makara, by TaskRabbit (https://github.com/taskrabbit/makara)
     ■ Used in production
     ■ We extended it to work with PostgreSQL
     ■ Works with Sidekiq (thread-safe!)
     ■ Failover code is very simple. Simple is sometimes better.
  68. We rolled out Makara and...
     ■ 1 master, 3 read-only async replicas
     Wait, what?
  69. A note about graphs
     ■ NewRelic is great!
     ■ But it’s not easy to predict when your systems are about to fall over
     ■ Use something else to visualize database and disk saturation
  70. 3 days later, in production
     ■ 3 read replicas distributing load from the master
     ■ App servers and sidekiqs create lots of connections to the DB backends
     ■ Mysterious spikes in errors at high traffic
  71. Replication! Doh! Replication lag (yellow) correlates with application errors (red)
  72. Replication lag! Doh!
     ■ Track latency sending xlog to slaves (on the master):
        select client_addr,
               pg_xlog_location_diff(sent_location, write_location)
        from pg_stat_replication;
     ■ Track latency applying xlogs (on each slave):
        select pg_xlog_location_diff(pg_last_xlog_receive_location(),
                                     pg_last_xlog_replay_location()),
               extract(epoch from now()) -
               extract(epoch from pg_last_xact_replay_timestamp());
  73. Eventual Consistency
     ■ Some code paths should always go to the master for reads (e.g., right after signup)
     ■ The application should be resilient to RecordNotFound, to tolerate replication delays
     ■ Not enough to scale reads. Writes become the bottleneck.
  74. Write load delays replication: replicas are busy trying to apply XLOGs and serve heavy read traffic
  75. Part 4: Scaling database writes
  76. First, No-Brainers:
     ■ Move stuff out of the DB. Easiest first.
     ■ Tracking user activity is very easy to do with a database table. But slow.
     ■ 2000 inserts/sec while also handling site-critical data? Not a good idea.
     ■ Solution: UDP packets to rsyslog, ASCII-delimited files, logrotate, analyze them later
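     A hedged sketch of that pipeline’s client side (host, port, and payload format are hypothetical; rsyslog and logrotate handle the rest on the receiving end):

        require 'socket'
        require 'json'

        TRACKING_SOCKET = UDPSocket.new

        def track_event(user_id, event, metadata = {})
          # One ASCII-delimited line per event; rsyslog appends it to a file
          payload = [Time.now.to_i, user_id, event, metadata.to_json].join("\x01")
          TRACKING_SOCKET.send(payload, 0, '127.0.0.1', 514)
        rescue SystemCallError
          # UDP is fire-and-forget: tracking must never take down a request
        end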
  77. Next: Async Commits
     ■ PostgreSQL supports delayed (batched) commits
     ■ Delays fsync for some number of microseconds
     ■ At high volume this helps disk IO
  78. PostgreSQL Async Commits (configuration screenshot; a hedged sketch follows)
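     The configuration slide was a screenshot; the relevant PostgreSQL settings are synchronous_commit (off means commits return before the WAL fsync) and commit_delay/commit_siblings (group commit). A hedged sketch with example values, plus the per-transaction form, which can be scoped to non-critical writes; ActivityLog is a hypothetical model:

        # postgresql.conf (example values, not the ones from the slides):
        #   synchronous_commit = off
        #   commit_delay       = 10000   # microseconds to wait for companion commits
        #   commit_siblings    = 5       # min concurrent transactions before delaying

        # Or relax durability for a single transaction from the application:
        ActivityLog.transaction do
          ActivityLog.connection.execute("SET LOCAL synchronous_commit TO off")
          ActivityLog.create!(user_id: user.id, action: 'save')
        end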
  79. ZFS Block Size
     ■ Default ZFS block size is 128KB
     ■ PostgreSQL block size is 8KB
     ■ Small writes require lots of bandwidth:
        device     r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
        sd1      384.0   1157.5   48.0  116.8   0.0   8.8    5.7   2  100
        sd1      368.0   1117.9   45.7  106.3   0.0   8.0    5.4   2  100
        sd1      330.3   1357.5   41.3  139.1   0.0   9.5    5.6   2  100
  80. ZFS Block Size (ctd.)
     Before, at the 128KB default:
        device     r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
        sd1      384.0   1157.5   48.0  116.8   0.0   8.8    5.7   2  100
        sd1      368.0   1117.9   45.7  106.3   0.0   8.0    5.4   2  100
        sd1      330.3   1357.5   41.3  139.1   0.0   9.5    5.6   2  100
     ■ Solution: change the ZFS block size to 8KB. After:
        device     r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
        sd1      130.3    219.9    1.0    4.4   0.0   0.7    2.1   0   37
        sd1      329.3    384.1    2.6   11.3   0.0   1.9    2.6   1   78
        sd1      335.3    357.0    2.6    8.7   0.0   1.8    2.6   1   80
        sd1      354.0    200.3    2.8    4.9   0.0   1.6    3.0   0   84
        sd1      465.3    100.7    3.6    1.7   0.0   2.1    3.7   0   91
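     For reference (not shown on the slides): the ZFS property involved is recordsize, set per dataset with a command along these lines; the dataset name here is hypothetical, and the new size only applies to blocks written after the change:

        zfs set recordsize=8k zones/postgres-data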
  81. Next: Vertical Sharding
     ■ Move the largest table out into its own master database (150 inserts/sec)
     ■ Remove any SQL joins; do them in the application, drop foreign keys
     ■ Switch the model to establish_connection to another DB. Fix many broken tests.
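     A hedged sketch of the establish_connection move (model name and database.yml entry are hypothetical):

        # config/database.yml gains a second entry, e.g. saves_production
        class Save < ActiveRecord::Base
          # Point this one model at its own database
          establish_connection "saves_#{Rails.env}".to_sym
        end

        # SQL joins across the two databases no longer work; do them in Ruby:
        product_ids = Save.where(user_id: user.id).pluck(:product_id)
        products    = Product.where(id: product_ids)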
  82. Vertical Sharding (diagram)
     unicorns → haproxy / pgbouncer / twemproxy → a dedicated PostgreSQL saves master alongside the PostgreSQL main master and its two main replicas (streaming replication)
  83. Vertical Sharding: Results
     ■ Deploy All Things!
  84. Future: Services Approach (diagram)
     unicorns → haproxy / pgbouncer / twemproxy → http/json → sinatra services app → main master and replicas (streaming replication) plus Shard1 / Shard2 / Shard3
  85. In Conclusion. Tasty gems :)
     ■ https://github.com/wanelo/pause: distributed rate limiting using Redis
     ■ https://github.com/wanelo/spanx: rate-limit-based IP blocker for nginx
     ■ https://github.com/wanelo/redis_with_failover: attempts another Redis server if available
     ■ https://github.com/kigster/ventable: observable pattern with a twist
  86. Thanks. Comments? Questions?
     https://github.com/wanelo · https://github.com/wanelo-chef
     @kigster & @sax
