A collection of problems and solutions worked through by the Wanelo team as they scaled the site under rapidly growing user demand. By Konstantin Gredeskoul and Eric Saxby.
Konstantin Gredeskoul: CTO, Alpha Geek, Speaker, Coder, Musician, DJ at ReinventONE, Inc.
1. Scaling
100x
in six months
by Eric Saxby & Konstantin Gredeskoul
April 2013
3. What is Wanelo?
■ Wanelo (“Wah-nee-lo”, from Want, Need, Love) is a global platform for shopping.
4. ■ It’s marketing-free shopping across 100s of thousands of unique stores
8. iOS + Android
12. Early Decisions
■ Optimize for iteration speed, not performance
■ Keep scalability in mind, track metrics, and fix as needed
■ Introduce many levels of caching early
16. Technology Timeline
■ 2010 - 2011: Wanelo v1 stack is Java, JSP, MySQL, Hibernate; 90K lines of code, 53+ DB tables, no tests
■ May 2012 - June 2012: Rewrite from scratch to RoR on PostgreSQL (v2)
■ Ruby app is 10K LOC, full test coverage, 8 database tables, fewer features
20. The “Big” Rewrite
More info here: http://building.wanelo.com/
25. Growth Timeline
■ 06/2012 - RoR App Relaunches
■ 2-3K requests per minute (RPM) peak
■ 08/2012 - iOS App is launched
■ 10-40K RPM peak
30. Requests Per Minute (RPM)
31. Current Numbers...
■ 4M active monthly users
■ 5M products saved 700M times
■ 8M products saved per day
■ 200k stores
33. Wanelo Web Architecture
■ nginx (6 x 2GB) → haproxy → unicorn x 14 (20 x 8GB) + sidekiq (4 x 8GB)
■ both tiers connect through haproxy / pgbouncer / twemproxy to Solr, PostgreSQL, Redis, and Memcached
38. This talk is about:
1. How much traffic can your database handle?
2. Special report on counters
3. Scaling database reads
4. Scaling database writes
39. 1. How much traffic can your database handle?
45. PostgreSQL is Awesome!
■ Does a fantastic job of not corrupting your data
■ Streaming replication in 9.2 is extremely reliable
■ Won’t write to a read-only replica
■ But... No master/master replication (good!)
46. Is the database healthy?
50. What’s healthy?
■ Able to respond quickly to queries from the application (< 4ms disk seek time)
■ Has enough room to grow
■ How do we know when we’re approaching a dangerous threshold?
52. Oops!
NewRelic Latency (yellow = database)
54. pg_stat_statements
■ Maybe your app is to blame for performance...

select query, calls, total_time
from pg_stat_statements
order by total_time desc
limit 12;

Similar to Percona Toolkit, but runs all the time collecting stats.
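The extension ships in contrib but is not on by default; the stock way to enable it (a server restart is required for the preload):

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'

-- then, once, in the database:
CREATE EXTENSION pg_stat_statements;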
57. pg_stat_user_indexes
■ Using indexes as much as you think you are?
■ Using indexes at all?
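The slides showed query output; a query of this shape surfaces rarely used indexes from that view:

select schemaname, relname, indexrelname, idx_scan
from pg_stat_user_indexes
order by idx_scan asc
limit 10;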
59. pg_stat_user_tables
■ Full table scans? (seq_scan)
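And the matching look at sequential scans, for example:

select relname, seq_scan, seq_tup_read, idx_scan
from pg_stat_user_tables
order by seq_tup_read desc
limit 10;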
60. Throw that in a graph
Reads/second for one large table, daily
61. Non-linear changes
Suspicious spike!
62. Correlate different data
Deployments! Aha!
63. Utilization vs Saturation
# of Active PostgreSQL connections
64. Utilization vs Saturation
Red line: % of max connections established
Purple: % of connections in query
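One way to sample those two percentages on 9.2 (where pg_stat_activity gained the state column):

select
  (select count(*) from pg_stat_activity) as established,
  (select count(*) from pg_stat_activity where state = 'active') as in_query,
  (select setting::int from pg_settings where name = 'max_connections') as max_conns;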
66. Disk reads/writes
green: reads, red: writes
Usage increases, but are the disks saturated?
69. Utilization vs Saturation
How much are you waiting on disk?
72. File system cache (ARC)
74. Watch the right things
Hit ratio of the file system cache (ARC)
75. Room to grow...
Size (including indexes) of a key table
77. Working set in RAM?
Adding an index increases the size
79. Collect all the data you can
Once we knew where to look, graphs added later could explain behavior we could only guess at earlier
80. 2. Special report on Counters and Pagination
82. Problem #1: DB Latency Up...
■ iostat shows 100% disk busy
89. Problem #1: Diagnostics
■ Database is running very, very hot. Initial investigation shows a large number of counts.
■ Turns out any time you page with Kaminari, it always does a count(*)!

SELECT "stores".* FROM "stores" WHERE (state = 'approved') LIMIT 20 OFFSET 0

SELECT COUNT(*) FROM "stores" WHERE (state = 'approved')
91. Problem #1: Pagination
■ Doing count(*) is pretty expensive, as the DB must scan many rows (either the actual table or an index)
94. Problem #1: Pagination
■ We are paginating everything! Even infinite scroll is a paged view behind the scenes.
■ But we really DON’T want to run count(*) for every paged view.
96. Problem #1: Pagination
■ We are showing most popular stores
■ Maybe it’s OK to hard-code the total number to, say, 1000?
■ How do we tell Kaminari NOT to issue a count query in this case?
99. Solution #1: Monkey Patch!!
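The slide showed Wanelo's actual patch as a screenshot; a minimal sketch of the same idea in plain Ruby (the scope and the hard-coded 1,000 are illustrative): override the method Kaminari uses to compute the total, so no COUNT(*) is ever issued.

stores = Store.where(state: 'approved').page(params[:page]).per(20)

# Kaminari calls total_count on the relation to lay out the page links;
# overriding it on this one object suppresses the COUNT(*) query.
def stores.total_count
  1_000
end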
101. Solution #1: Pass in the counter

SELECT "stores".* FROM "stores" WHERE (state = 'approved') LIMIT 20 OFFSET 0
104. Problem #2: Count Draculas
■ AKA: We are still doing too many counts!
■ Rails makes it so easy to do it the lazy way.
107. Problem #2: Too Many Counts!
■ But it just doesn’t scale well
■ Fortunately, Rails has just the feature for this...
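That feature is counter_cache. A standard declaration looks like this (the model and column names here are assumptions based on Wanelo's domain):

class Save < ActiveRecord::Base
  # Every create/destroy increments/decrements products.saves_count
  # inside an ActiveRecord callback.
  belongs_to :product, counter_cache: true
end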
112. Counter Caches
■ Unfortunately, it has one massive issue:
■ It causes database deadlocks at high volume
■ Because many Ruby processes are creating child records concurrently
■ Each is executing a callback, trying to update the counter_cache column on the parent, requiring a row-level lock
■ Deadlocks ensue
116. Possible Solution: Use Background Jobs
■ It works like this:
■ As the record is created, we enqueue a request to recalculate counter_cache on the parent
■ The job performs a complete recalculation of the counter cache and is idempotent
121. Solution #2: Explained
■ Sidekiq with the UniqueJob extension
■ Short wait for “buffering”
■ Serialize updates via a small number of workers
■ Can temporarily stop workers (in an emergency) to alleviate DB load
122. Solution #2: Code
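The original slide was a code screenshot; here is a minimal sketch of the shape of such a worker, assuming Sidekiq plus the sidekiq-unique-jobs extension (class, queue, and column names are illustrative, not Wanelo's actual code):

require 'sidekiq'

class ProductCountWorker
  include Sidekiq::Worker
  # unique: true (sidekiq-unique-jobs) collapses duplicate enqueues,
  # so a burst of saves becomes one recalculation.
  sidekiq_options queue: :counters, unique: true

  def perform(product_id)
    # Full recalculation: idempotent, safe to run any number of times.
    count = Save.where(product_id: product_id).count
    Product.where(id: product_id).update_all(saves_count: count)
  end
end

# Enqueued from the model callback with a short delay, to buffer bursts:
# ProductCountWorker.perform_in(2.minutes, product.id)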
127. Things are better. BUT...
Still too many fucking counts!
■ Even doing count(*) from workers is too much load on the databases
■ We need to stop doing count(*) in the DB, but keep counter_caches. How?
■ We could use Redis for this.
129. Solution #3: Counts Deltas
■ Web request increments a counter value in Redis
■ Enqueues a request to update counter_cache
■ A background job picks it up a few minutes later, reads the Redis delta value, and removes it
■ Updates the counter_cache column by incrementing it by the delta

The flow (unicorn → Redis → Sidekiq → PostgreSQL):
1. INCR product_id in Redis
2. ProductCountWorker.enqueue product_id
3. Sidekiq dequeues
4. GET the delta from Redis
5. RESET the Redis counter; SQL UPDATE of the counter_cache column, INCR by N
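A minimal sketch of that flow with redis-rb and Sidekiq (key names, class names, and the buffering delay are illustrative; GETSET stands in for whatever atomic read-and-reset the real code used):

require 'redis'
require 'sidekiq'

REDIS = Redis.new

class ProductSavesDeltaWorker
  include Sidekiq::Worker

  def perform(product_id)
    # GETSET atomically reads the delta and resets it to zero, so no
    # increments are lost between the read and the reset.
    delta = REDIS.getset("product:#{product_id}:saves_delta", 0).to_i
    return if delta.zero?
    # One UPDATE: SET saves_count = saves_count + delta
    Product.update_counters(product_id, saves_count: delta)
  end
end

# In the web request: an O(1) Redis increment, then a buffered enqueue.
REDIS.incr("product:#{product.id}:saves_delta")
ProductSavesDeltaWorker.perform_in(2.minutes, product.id)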
130. Define counter_cache_on...
■ Internal GEM, will open source soon!
131. Can now use counter caches in pagination!
133. Multiple optimization cycles
■ Caching: action caching, fragment caching, CDN
■ Personalization via AJAX: cache the entire page, then add personalized details
■ 25ms/req of memcached time is cheaper than 12ms/req of database time
134. Cache optimization
40% hit ratio! Woo!
Wait... is that even good?
135. Cache optimization
Increasing your hit ratio means fewer queries against your database
136. Cache optimization
Caveat: even low hit ratio caches can save your ass. You’re removing load from the DB, remember?
139. Cache saturation
Blue: cache writes. Red: automatic evictions.
How long before your caches start evicting data?
143. Nice!
■ Rails Action Caching: runs before_filters, so A/B experiments can still run
■ Extremely fast pages: 4ms application time for some of our computationally heaviest pages
■ Could be served via CDN in the future
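A minimal Rails 3 sketch of that combination (controller, filter, and expiry are illustrative):

class ProductsController < ApplicationController
  # Action caching checks the cache only after before_filters run,
  # so A/B bucketing still executes on every request.
  before_filter :assign_ab_test_bucket
  caches_action :show, expires_in: 1.hour

  def show
    @product = Product.find(params[:id])
  end

  private

  def assign_ab_test_bucket
    @ab_bucket = cookies[:ab_bucket] || 'control'
  end
end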
144. Sad trombone...
■ Are you actually logged in? Pages don’t know until Ajax successfully runs
■ Selenium AND Jasmine tests!
149. Read/write splitting
■ Sometime in December 2012...
■ Database reaching 100% saturation
■ Latency starting to increase non-linearly
■ We need to distribute database load
■ We need to use read replicas!
151. DB adapters for read/write
■ Looked at several, including DbCharmer
■ Features / Configurability / Stability
■ Thread safety? This may be Ruby, but some people do actually use threads.
■ If I tell you it’s a read-only replica, DON’T ISSUE WRITES
■ Failover on errors?
152. Chose Makara, by TaskRabbit
■ Used in production
■ We extended it to work with PostgreSQL
■ Works with Sidekiq (thread-safe!)
■ Failover code is very simple. Simple is sometimes better.
https://github.com/taskrabbit/makara
154. We rolled out Makara and...
■ 1 master, 3 read-only async replicas
Wait, what?
155. A note about graphs
■ NewRelic is great!
■ Not easy to predict when your systems are about to fall over
■ Use something else to visualize database and disk saturation
158. 3 days later, in production
■ 3 read replicas distributing load from master
■ app servers and sidekiqs create lots of connections to DB backends
■ Mysterious spikes in errors at high traffic
159. Replication! Doh!
Replication lag (yellow) correlates with application errors (red)
160. Replication lag! Doh!
■ Track latency sending xlog to slaves
select client_addr,
pg_xlog_location_diff(sent_location, write_location)
from pg_stat_replication;
■ Track latency applying xlogs on slaves
select pg_xlog_location_diff(
pg_last_xlog_receive_location(),
pg_last_xlog_replay_location()),
extract(epoch from now()) -
extract(epoch from pg_last_xact_replay_timestamp());
164. Eventual Consistency
■ Some code paths should always go to master for reads (e.g., after signup)
■ The application should be resilient to getting RecordNotFound, to tolerate replication delays
■ Not enough to scale reads. Writes become the bottleneck.
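A generic sketch of lag-tolerant reads (not Wanelo's code; the single retry and the sleep interval are arbitrary choices):

def find_tolerating_lag(klass, id)
  attempts = 0
  begin
    klass.find(id)
  rescue ActiveRecord::RecordNotFound
    # A record we just wrote may not be on the replica yet;
    # retry once before treating the miss as real.
    attempts += 1
    raise if attempts > 1
    sleep 0.25 # give async replication a moment to catch up
    retry
  end
end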
165. Write load delays replication
Replicas are busy trying to apply XLOGs and serve heavy read traffic
166. 4. Scaling database writes
170. First, No-Brainers:
■ Move stuff out of the DB. Easiest first.
■ Tracking user activity is very easy to do with a database table. But slow.
■ 2000 inserts/sec while also handling site-critical data? Not a good idea.
■ Solution: UDP packets to rsyslog, ASCII-delimited files, logrotate, analyze them later
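A sketch of fire-and-forget activity tracking over UDP (host, port, and the field layout are all illustrative):

require 'socket'

TRACKER = UDPSocket.new

def track_activity(user_id, action, object_id)
  # One ASCII-delimited line per event; rsyslog appends it to a file,
  # logrotate cycles the files, and analysis happens offline.
  line = [Time.now.to_i, user_id, action, object_id].join("\t")
  TRACKER.send(line, 0, 'rsyslog.internal', 514)
rescue SystemCallError
  # Dropping a tracking event beats blocking or failing the request.
end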
174. Next: Async Commits
■ PostgreSQL supports delayed (batched) commits
■ Delays the fsync for some number of microseconds
■ At high volume, this helps disk IO
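The knobs in question live in postgresql.conf (values here are illustrative, not Wanelo's):

commit_delay = 10000        # wait up to 10,000 microseconds so concurrent commits share one fsync
commit_siblings = 5         # ...but only when at least 5 other transactions are active
# The blunter alternative trades a small crash-loss window for throughput:
# synchronous_commit = off  # acknowledge commits before WAL reaches disk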
188. Next: Vertical Sharding
■ Move the largest table into its own master database (150 inserts/sec)
■ Remove any SQL joins, do them in the application, drop foreign keys
■ Switch the model to establish_connection to another DB. Fix many broken tests.
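A Rails 3 sketch of that model switch (the database.yml key and the queries are illustrative):

# config/database.yml gains a saves_production entry for the new master.
class Save < ActiveRecord::Base
  establish_connection "saves_#{Rails.env}"
end

# Former SQL joins become two queries in application code:
product_ids = Save.where(user_id: user.id).pluck(:product_id)  # saves DB
products    = Product.where(id: product_ids)                   # main DB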
189. Vertical Sharding
■ unicorns → haproxy / pgbouncer / twemproxy
■ saves → its own PostgreSQL saves master
■ everything else → PostgreSQL main master, with streaming replication to two main replicas
192. Future: Services Approach
■ unicorns → haproxy / pgbouncer / twemproxy → PostgreSQL main master + replicas (streaming replication)
■ unicorns → http / json → sinatra services app → Shard1 / Shard2 / Shard3
197. In Conclusion. Tasty gems :)
https://github.com/wanelo/pause
■ distributed rate limiting using redis
https://github.com/wanelo/spanx
■ rate-limit-based IP blocker for nginx
https://github.com/wanelo/redis_with_failover
■ attempt another redis server if available
https://github.com/kigster/ventable
■ observable pattern with a twist
Our mission is to democratize and transform the world's commerce by reorganizing shopping around people.
Some of the stores have close to half a million followers. Some are big and well known, and some aren’t known at all outside of Wanelo.
Near-real-time updates to your feed as people post products to stores or collections you follow. Following a hashtag is very powerful.
Rails backend API, simple JSON in/out, using RABL to render JSON responses (slow!). JSON.generate() is so much faster than to_json.
Included in /contrib in the Postgres source. Very easy to install. If a package does not come with pg_stat_statements, that is a reason to compile it yourself.
This is why we like Postgres: visibility tools
Sometimes you throw everything into a single graph, not knowing if it’s useful. Sometimes that graph saves your ass when you happen to see it out of the corner of your eye.
Extremely useful to correlate different data points visually
Why are you even waiting on disks? Postgres relies heavily on the file cache
Adaptive Replacement Cache. This is why we like SmartOS/Illumos/Solaris: visibility tools.
Great thing about ARC: even when your query misses in-RAM db cache, you hit an in-RAM file cache
Slowed down the site to the point where errors started happening
purple is hit ratio of cache servers
blue: writes; red: automatic evictions
Hard to do this after the fact
This is why you want to already be on Postgres. You can take risks knowing that PG will throw errors, not corrupt data.
When you pull aside the curtain of a Ruby DB adapter, you can get a sense of... betrayal. Why is it written like this? Why method_missing? Why????? ActiveRecord is a finely crafted pile of code defined after the fact. Unfortunately, the DB adapters that don’t use crazy metaprogramming do even worse things to avoid it. Error handling is a set of regexes. Easy to extend. Requests after a write read from the master.
Putting everything into a class namespace per thread is not thread safety. Threaded code often spawns new threads.
New Relic application graph for month of December
Graphite / Circonus
Postgres 9.2-specific. On 9.1 you basically have to connect to both master and replica and do the binary math yourself.