Scaling Wanelo.com 100x in Six Months

A collection of problems and solutions worked through by the Wanelo team as they scaled the site to keep up with rapidly growing user demand. By Konstantin Gredeskoul and Eric Saxby.

  • Our mission is to democratize and transform the world's commerce by reorganizing shopping around people.
  • Some of the stores have close to half a million followers. Some are big and known, and some aren’t at all, outside of Wanelo.
  • Near real time updates to your feed, as people post products to stores you follow, or collections. Following a hashtag is very powerful.
  • Rails backend API, simple JSON in/out, using RABL for rendering JSON back (slow!). JSON.generate() is so much faster than to_json
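The note above mentions swapping RABL/to_json for JSON.generate. A minimal sketch of that idea, assuming a hypothetical Api::ProductsController and Product model (not Wanelo's actual code): build the response hash explicitly and serialize it with JSON.generate, which skips ActiveSupport's as_json traversal.

      # Hypothetical controller; field names are illustrative.
      class Api::ProductsController < ApplicationController
        def show
          product = Product.find(params[:id])
          payload = {
            id:          product.id,
            name:        product.name,
            price:       product.price,
            saves_count: product.saves_count
          }
          # Passing a String to `render json:` sends it through as-is,
          # so to_json is never called on the hash.
          render json: JSON.generate(payload)
        end
      end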
  • included in /contrib in the Postgres source. Very easy to install. If a package does not come with pg_stat_statements, this is a reason to compile it yourself.
  • This is why we like Postgres: visibility tools
  • Sometimes you throw everything in a single graph, not knowing if it’s useful. Sometimes that graph saves your ass when you happen to see it out of the corner of your eye.
  • Extremely useful to correlate different data points visually
  • Why are you even waiting on disks? Postgres relies heavily on the file cache
  • Adaptive Replacement Cache This is why we like SmartOS/Illumos/Solaris: visibility tools
  • Great thing about ARC: even when your query misses in-RAM db cache, you hit an in-RAM file cache
  • Slowed down the site to the point where errors started happening
  • purple is hit ratio of cache servers
  • blue: writes; red: automatic evictions
  • Hard to do this after the fact
  • This is why you want to already be on Postgres. You can take risks knowing that PG will throw errors, not corrupt data.
  • When you pull aside the curtain of a Ruby DB adapter, you can get a sense of... betrayal. Why is it written like this? Why method_missing? Why????? ActiveRecord is a finely crafted pile of code defined after the fact. Unfortunately, the DB adapters that don’t use crazy metaprogramming do even worse things to avoid it. Error handling is a set of regexes. Easy to extend. Requests after a write read from the master.
  • Putting everything into a class namespace per thread is not thread safety. Threaded code often spawns new threads.
  • New Relic application graph for month of December
  • Graphite / Circonus
  • Postgres 9.2-specific. On 9.1 you basically have to connect to both master and replica and do binary math.
  • By the way, these error spikes are 10/minute.

Scaling Wanelo.com 100x in Six Months Presentation Transcript

  • 1. Scaling 100x in six months, by Eric Saxby & Konstantin Gredeskoul, April 2013
  • 3. What is Wanelo? ■ Wanelo (“Wah-nee-lo”, from Want, Need, Love) is a global platform for shopping.
  • 4. ■ It’s marketing-free shopping across hundreds of thousands of unique stores
  • 6. Personal Activity Feed...
  • 8. iOS + Android
  • 12. Early Decisions ■ Optimize for iteration speed, not performance ■ Keep scalability in mind, track metrics, and fix as needed ■ Introduce many levels of caching early
  • 16. Technology Timeline ■ 2010 - 2011: Wanelo v1 stack is Java, JSP, MySQL, Hibernate; 90K lines of code, 53+ DB tables, no tests ■ May - June 2012: rewrite from scratch to RoR on PostgreSQL (v2) ■ Ruby app is 10K LOC, full test coverage, 8 database tables, fewer features
  • 20. The “Big” Rewrite ■ More info here: http://building.wanelo.com/
  • 29. Growth Timeline ■ 06/2012 - RoR app relaunches ■ 2-3K requests per minute (RPM) peak ■ 08/2012 - iOS app launched ■ 10-40K RPM peak ■ 12/2012 - Android app launched ■ 40-120K RPM peak ■ 03/2013 - #24 top free app on iTunes ■ 80-200K RPM peak
  • 30. Requests Per Minute (RPM)
  • 31. Current Numbers... ■ 4M active monthly users ■ 5M products, saved 700M times ■ 8M product saves per day ■ 200K stores
  • 32. Backend Stack & Key Vendors ■ MRI Ruby 1.9.3 & Rails 3.2 ■ PostgreSQL 9.2.4, Solr 3.6 ■ Joyent Cloud: SmartOS, ZFS, ARC, raw IO performance, CPU bursting, DTrace ■ Circonus, Chef + Opscode: monitoring, graphing, alerting, automation ■ Amazon S3 + Fastly CDN ■ New Relic, statsd, Graphite, Nagios
  • 33. Wanelo Web Architecture [architecture diagram: nginx (6 x 2GB), haproxy, unicorn (x 14) and sidekiq workers (20 x 8GB and 4 x 8GB hosts), haproxy / pgbouncer / twemproxy, Solr, PostgreSQL, Redis, Memcached]
  • 38. This talk is about: 1. How much traffic can your database handle? 2. Special report on counters 3. Scaling database reads 4. Scaling database writes
  • 39. 1. How much traffic can your database handle?
  • 45. PostgreSQL is Awesome! ■ Does a fantastic job of not corrupting your data ■ Streaming replication in 9.2 is extremely reliable ■ Won’t write to a read-only replica ■ But... no master/master replication (good!)
  • 46. Is the database healthy?
  • 50. What’s healthy? ■ Able to respond quickly to queries from the application (< 4ms disk seek time) ■ Has enough room to grow ■ How do we know when we’re approaching a dangerous threshold?
  • 52. Oops! New Relic latency (yellow = database)
  • 54. pg_stat_statements ■ Maybe your app is to blame for performance... select query, calls, total_time from pg_stat_statements order by total_time desc limit 12; ■ Similar to Percona Toolkit, but runs all the time, collecting stats.
  • 55. pg_stat_statements
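As the speaker note mentions, pg_stat_statements ships in the Postgres contrib directory. A sketch of enabling it, plus the slide's query reformatted (standard settings; exact values are up to you):

      -- postgresql.conf (requires a restart):
      --   shared_preload_libraries = 'pg_stat_statements'
      CREATE EXTENSION pg_stat_statements;

      -- The query from the slide above:
      SELECT query, calls, total_time
        FROM pg_stat_statements
       ORDER BY total_time DESC
       LIMIT 12;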
  • 57. pg_stat_user_indexes ■ Using indexes as much as you think you are? ■ Using indexes at all?
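A sketch of the kind of query behind these slides: surface indexes that are rarely or never scanned, using the standard statistics view.

      SELECT schemaname, relname, indexrelname, idx_scan
        FROM pg_stat_user_indexes
       ORDER BY idx_scan ASC
       LIMIT 20;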
  • 59. pg_stat_user_tables ■ Full table scans? (seq_scan)
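Similarly, a sketch for spotting full table scans: order tables by how many rows sequential scans are reading.

      SELECT relname, seq_scan, seq_tup_read, idx_scan
        FROM pg_stat_user_tables
       ORDER BY seq_tup_read DESC
       LIMIT 20;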
  • 60. Throw that in a graph: reads/second for one large table, daily
  • 61. Non-linear changes: suspicious spike!
  • 62. Correlate different data: deployments! Aha!
  • 63. Utilization vs. Saturation: # of active PostgreSQL connections
  • 64. Utilization vs. Saturation: red line: % of max connections established; purple: % of connections in query
  • 66. Disk reads/writes (green: reads, red: writes). Usage increases, but are the disks saturated?
  • 68. Utilization vs. Saturation
  • 69. Utilization vs. Saturation: how much are you waiting on disk?
  • 72. File system cache (ARC)
  • 74. Watch the right things: hit ratio of the file system cache (ARC)
  • 75. Room to grow... Size (including indexes) of a key table
  • 77. Working set in RAM? Adding an index increases the size
  • 79. Collect all the data you can: once we knew where to look, graphs added later could explain behavior we could only guess at earlier
  • 80. 2. Special report on Counters and Pagination
  • 85. Problem #1: DB Latency Up... ■ iostat shows 100% disk busy:
      device       r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
      sd1        384.0   1157.5   48.0  116.8   0.0   8.8    5.7   2  100
      sd1        368.0   1117.9   45.7  106.3   0.0   8.0    5.4   2  100
      sd1        330.3   1357.5   41.3  139.1   0.0   9.5    5.6   2  100
  • 89. Problem #1: Diagnostics ■ Database is running very, very hot. Initial investigation shows a large number of counts. ■ Turns out any time you page with Kaminari, it always issues a count(*)!
      SELECT "stores".* FROM "stores" WHERE (state = 'approved') LIMIT 20 OFFSET 0
      SELECT COUNT(*) FROM "stores" WHERE (state = 'approved')
  • 91. Problem #1: Pagination ■ Doing count(*) is pretty expensive, as the DB must scan many rows (either the actual table or an index)
  • 94. Problem #1: Pagination ■ We are paginating everything! Even infinite scroll is a paged view behind the scenes. ■ But we really DON’T want to run count(*) for every paged view.
  • 96. Problem #1: Pagination ■ We are showing the most popular stores ■ Maybe it’s OK to hard-code the total number to, say, 1000? ■ How do we tell Kaminari NOT to issue a count query in this case?
  • 97. Problem #1: Pagination (ctd.)
  • 99. Solution #1: Monkey Patch!!
  • 101. Solution #1: Pass in the counter, so that only the data query runs:
      SELECT "stores".* FROM "stores" WHERE (state = 'approved') LIMIT 20 OFFSET 0
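One way to "pass in the counter" (a minimal sketch, not Wanelo's actual monkey patch, assuming a hypothetical approved scope on Store): paginate as usual, then override total_count on the relation with a hard-coded ceiling so Kaminari never issues COUNT(*).

      stores = Store.approved.page(params[:page]).per(20)

      # Kaminari calls total_count to build total_pages, last_page?, etc.
      # Stubbing it here means no COUNT(*) query is ever run.
      def stores.total_count
        1_000   # "most popular stores": a fixed ceiling is good enough
      end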
  • 104. Problem #2: Count Draculas ■ AKA: we are still doing too many counts! ■ Rails makes it so easy to do it the lazy way.
  • 107. Problem #2: Too Many Counts! ■ But it just doesn’t scale well ■ Fortunately, Rails has just the feature for this: counter caches
  • 112. Counter Caches ■ Unfortunately, it has one massive issue: it causes database deadlocks at high volume ■ Because many Ruby processes are creating child records concurrently ■ Each is executing a callback, trying to update the counter_cache column on the parent, which requires a row-level lock ■ Deadlocks ensue
  • 116. Possible Solution: Use Background Jobs ■ It works like this: as the record is created, we enqueue a request to recalculate the counter_cache on the parent ■ The job performs a complete recalculation of the counter cache and is idempotent
  • 121. Solution #2: Explained ■ Sidekiq with UniqueJob extension ■ Short wait for “buffering” ■ Serialize updates via a small number of workers ■ Can temporarily stop workers (in an emergency) to alleviate DB load
  • 122. Solution #2: Code
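The code slide itself did not survive the transcript, so here is a minimal sketch of the idempotent recalculation job described above (ProductCountWorker, saves and saves_count are illustrative names, not Wanelo's code):

      class ProductCountWorker
        include Sidekiq::Worker
        # The unique-jobs extension collapses duplicate enqueues for the same
        # product; the exact option name depends on the gem version in use.
        sidekiq_options queue: :counters, unique: true

        def perform(product_id)
          product = Product.find(product_id)
          # Full recount: idempotent, safe to run repeatedly or out of order.
          product.update_column(:saves_count, product.saves.count)
        end
      end

      # Enqueued from an after_create callback with a short delay, so bursts
      # of saves collapse into fewer recalculations:
      #   ProductCountWorker.perform_in(2.minutes, product.id)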
  • 127. Things are better. BUT... Still too many fucking counts! ■ Even doing count(*) from workers is too much on the databases ■ We need to stop doing count(*) in the DB, but keep counter_caches. How? ■ We could use Redis for this.
  • 129. Solution #3: Counts Deltas ■ Web request increments a counter value in Redis (INCR product_id) and enqueues a ProductCountWorker job to update the counter_cache ■ A background job picks it up a few minutes later, reads the Redis delta value, and removes it ■ Updates the counter_cache column by incrementing it by the delta [diagram: unicorn → Redis counters + Sidekiq queue → sidekiq worker → SQL update of the counter_cache column in PostgreSQL]
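A sketch of the delta flow in the diagram, with the same worker reworked to apply deltas instead of recounting (key names, the REDIS client constant and the models are illustrative). The web request only touches Redis; PostgreSQL is updated later, in batches, by a small number of workers:

      # In the request path, after a save is created:
      REDIS.incr("product:#{product.id}:saves_delta")
      ProductCountWorker.perform_in(5.minutes, product.id)

      class ProductCountWorker
        include Sidekiq::Worker

        def perform(product_id)
          # GETSET atomically reads the accumulated delta and resets it to
          # zero, so increments arriving mid-update are not lost.
          delta = REDIS.getset("product:#{product_id}:saves_delta", 0).to_i
          return if delta.zero?
          # Issues a single UPDATE ... SET saves_count = saves_count + delta
          Product.update_counters(product_id, saves_count: delta)
        end
      end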
  • 130. Define counter_cache_on... ■ Internal gem, will open source soon!
  • 131. Can now use counter caches in pagination!
  • 132. 3. Scaling reads
  • 133. Multiple optimization cycles ■ Caching: action caching, fragment caching, CDN ■ Personalization via AJAX: cache the entire page, then add personalized details ■ 25ms/req of memcached time is cheaper than 12ms/req of database time
  • 134. Cache optimization: 40% hit ratio! Woo! Wait... is that even good?
  • 135. Cache optimization: increasing your hit ratio means fewer queries against your database
  • 136. Cache optimization. Caveat: even low-hit-ratio caches can save your ass. You’re removing load from the DB, remember?
  • 139. Cache saturation (blue: cache writes, red: automatic evictions). How long before your caches start evicting data?
  • 142. Ajax personalization
  • 143. Nice! ■ Rails action caching runs before_filters, so A/B experiments can still run ■ Extremely fast pages: 4ms application time for some of our computationally heaviest pages ■ Could be served via CDN in the future
  • 144. Sad trombone... ■ Are you actually logged in? Pages don’t know until the Ajax successfully runs ■ Selenium AND Jasmine tests!
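A sketch of the pattern in the two slides above (controller, action and helper names are illustrative, not Wanelo's code): action-cache the whole page for everyone, and let a small uncached JSON endpoint layer in the per-user details via Ajax.

      class ProductsController < ApplicationController
        # Action caching still runs before_filters, so A/B bucketing keeps working.
        caches_action :show, expires_in: 5.minutes

        def show
          @product = Product.find(params[:id])
        end

        # Not cached: returns only what the page personalizes with JavaScript.
        def personalization
          render json: JSON.generate(
            logged_in: current_user.present?,
            saved:     current_user.present? && current_user.saved_product?(params[:id])
          )
        end
      end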
  • 149. Read/write splitting ■ Sometime in December 2012... ■ Database reaching 100% saturation ■ Latency starting to increase non-linearly ■ We need to distribute database load ■ We need to use read replicas!
  • 151. DB adapters for read/write ■ Looked at several, including DbCharmer ■ Features / configurability / stability ■ Thread safety? This may be Ruby, but some people do actually use threads. ■ If I tell you it’s a read-only replica, DON’T ISSUE WRITES ■ Failover on errors?
  • 152. Chose Makara, by TaskRabbit ■ Used in production ■ We extended it to work with PostgreSQL ■ Works with Sidekiqs (thread-safe!) ■ Failover code is very simple. Simple is sometimes better. https://github.com/taskrabbit/makara
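The requirements above boil down to something like this deliberately simplified sketch (not Makara's actual API): send reads to replicas, send everything else to the master, and stick to the master once a write has happened so the request sees its own writes.

      # `master` and `replicas` are assumed to be connection-like objects
      # that respond to #execute.
      class ReadWriteRouter
        def initialize(master, replicas)
          @master, @replicas = master, replicas
          @stuck_to_master   = false
        end

        def execute(sql)
          if read_query?(sql) && !@stuck_to_master
            @replicas.sample.execute(sql)   # spread reads across replicas
          else
            @stuck_to_master = true         # subsequent reads go to the master
            @master.execute(sql)
          end
        end

        private

        def read_query?(sql)
          sql =~ /\A\s*select/i && sql !~ /for\s+update/i
        end
      end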
  • 154. We rolled out Makara and... ■ 1 master, 3 read-only async replicas. Wait, what?
  • 155. A note about graphs ■ New Relic is great! ■ Not easy to predict when your systems are about to fall over ■ Use something else to visualize database and disk saturation
  • 158. 3 days later, in production ■ 3 read replicas distributing load from the master ■ App servers and sidekiqs create lots of connections to DB backends ■ Mysterious spikes in errors at high traffic
  • 159. Replication! Doh! Replication lag (yellow) correlates with application errors (red)
  • 160. Replication lag! Doh! ■ Track latency sending xlog to slaves:
      select client_addr,
             pg_xlog_location_diff(sent_location, write_location)
        from pg_stat_replication;
    ■ Track latency applying xlogs on slaves:
      select pg_xlog_location_diff(pg_last_xlog_receive_location(),
                                   pg_last_xlog_replay_location()),
             extract(epoch from now())
               - extract(epoch from pg_last_xact_replay_timestamp());
  • 164. Eventual Consistency ■ Some code paths should always go to the master for reads (e.g., right after signup) ■ The application should be resilient to RecordNotFound, to tolerate replication delays ■ It’s not enough to scale reads: writes become the bottleneck.
  • 165. Write load delays replication: replicas are busy trying to apply xlogs and serve heavy read traffic
  • 166. 4. Scaling database writes
  • 170. First, No-Brainers ■ Move stuff out of the DB. Easiest first. ■ Tracking user activity is very easy to do with a database table. But slow. ■ 2,000 inserts/sec while also handling site-critical data? Not a good idea. ■ Solution: UDP packets to rsyslog, ASCII-delimited files, logrotate, analyze them later
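A sketch of the "UDP packets to rsyslog" idea (port, delimiter and field list are illustrative): fire-and-forget event tracking that never blocks a request and never touches PostgreSQL.

      require 'socket'

      TRACKING_SOCKET = UDPSocket.new

      def track_event(name, user_id, attrs = {})
        # One ASCII-delimited line; rsyslog appends it to a file that
        # logrotate rotates and offline jobs analyze later.
        line = [Time.now.to_i, name, user_id, attrs.to_a.join(',')].join("\t")
        TRACKING_SOCKET.send(line, 0, '127.0.0.1', 514)
      rescue SystemCallError
        # Tracking is best-effort; never let it break the request.
      end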
  • 174. Next: Async Commits ■ PostgreSQL supports delayed (batched) commits ■ Delays fsync for some number of microseconds ■ At high volume, this helps disk IO
  • 175. PostgreSQL Async Commits
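The settings behind "async commits", as a sketch (values are illustrative, not Wanelo's production configuration):

      # postgresql.conf
      synchronous_commit = off    # commit returns before the WAL is flushed;
                                  # a crash can lose the last few hundred ms of
                                  # transactions, but never corrupts data
      wal_writer_delay = 200ms    # how often the WAL writer flushes in the background

      # Alternative: keep synchronous commits but batch the fsyncs (group commit)
      # commit_delay = 1000       # microseconds to wait for companion commits
      # commit_siblings = 5       # only delay when this many transactions are active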
  • 180. ZFS Block Size ■ Default ZFS block size is 128KB ■ PostgreSQL block size is 8KB ■ Small writes require lots of bandwidth:
      device       r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
      sd1        384.0   1157.5   48.0  116.8   0.0   8.8    5.7   2  100
      sd1        368.0   1117.9   45.7  106.3   0.0   8.0    5.4   2  100
      sd1        330.3   1357.5   41.3  139.1   0.0   9.5    5.6   2  100
  • 184. ZFS Block Size (ctd.) ■ Solution: change the ZFS block size to 8K:
      device       r/s      w/s   Mr/s   Mw/s  wait  actv  svc_t  %w   %b
      sd1        130.3    219.9    1.0    4.4   0.0   0.7    2.1   0   37
      sd1        329.3    384.1    2.6   11.3   0.0   1.9    2.6   1   78
      sd1        335.3    357.0    2.6    8.7   0.0   1.8    2.6   1   80
      sd1        354.0    200.3    2.8    4.9   0.0   1.6    3.0   0   84
      sd1        465.3    100.7    3.6    1.7   0.0   2.1    3.7   0   91
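The change itself is one ZFS property (dataset name is illustrative). Note that recordsize only affects newly written files, so existing data files need to be rewritten, for example by rebuilding from a replica, to benefit:

      zfs set recordsize=8k zones/db/pgdata
      zfs get recordsize zones/db/pgdata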
  • 188. Next: Vertical Sharding ■ Move the largest table out into its own master database (150 inserts/sec) ■ Remove any SQL joins, do them in the application, drop foreign keys ■ Switch the model to establish_connection to another DB. Fix many broken tests.
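A sketch of the establish_connection move described above (the "saves" naming and the query are illustrative): the extracted model points at its own database entry, and former SQL joins become two queries in the application.

      # config/database.yml gains a saves_production / saves_development entry.
      class Save < ActiveRecord::Base
        establish_connection "saves_#{Rails.env}"
      end

      # Former JOIN, now two queries against two databases:
      product_ids = Save.where(user_id: user.id).order('id DESC').limit(20).pluck(:product_id)
      products    = Product.where(id: product_ids)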
  • 189. Vertical Sharding [diagram: unicorns → haproxy / pgbouncer / twemproxy → a dedicated PostgreSQL “saves” master alongside the main master, which streams replication to the main replicas]
  • 191. Vertical Sharding: Results ■ Deploy All Things!
  • 192. Future: Services Approach [diagram: unicorns → haproxy / pgbouncer / twemproxy → Sinatra services app over HTTP/JSON → main master with streaming-replication replicas, plus Shard1, Shard2, Shard3]
  • 197. In Conclusion. Tasty gems :) ■ https://github.com/wanelo/pause - distributed rate limiting using Redis ■ https://github.com/wanelo/spanx - rate-limit-based IP blocker for nginx ■ https://github.com/wanelo/redis_with_failover - attempt another Redis server if available ■ https://github.com/kigster/ventable - observable pattern with a twist
  • 198. Thanks. Comments? Questions? https://github.com/wanelo https://github.com/wanelo-chef @kig & @sax, @kig & @ecdysone, @kigster & @sax