shopify
How Shopify Scales Rails
John Duff
Overview
• The Shopify stack
• Knowing what to scale
• How we cache
• Scaling beyond caching
• Splitting things up
What is Shopify?
The Stack
• Ruby 1.9.3-p385
• Rails 3.2
• Percona MySQL 5.5
• Unicorn 4.5
• Memcached 1.4.14
• Redis 2.6
The Stack
• 53 App Servers
• 1590 Unicorn Workers
• 5 Job Servers
• 370 Job Workers
• Nginx
• Unicorn
• Rails 3.2
• Ruby 1.9.3-p385
The Stack
• Firewall
• Load Balancer
• App Servers
• Redis
• Job Servers
• Database
• Memcached
• Search
The Stack
• 55,873 lines of application code
• 15,968 lines of CoffeeScript application code
• 81,892 lines of test code
• 211 controllers
• 468 models
Current Scale
9.9 M Orders
An order every 3.2 seconds
2,008 Sales per Minute
Cyber Monday
50,000 RPM
45 ms response time
13.3 billion requests
Looking Back, to Look Ahead
• First line of code written in 2004
• Shopify released June, 2006
• Same codebase
• Over 9 years of Rails upgrades, improvements and changes
Looking Back, to Look Ahead
• 6,702 lines of application code then (55,873 now)
• 4,386 lines of test code then (81,892 now)
• 38 controllers then (211 now)
• 77 models then (468 now)
Looking Back, to Look Ahead
• Ruby 1.8.2
• Rails 0.13.1
• MySQL 4.1
• Lighttpd
• Memcached
Know The System
One Request, One Process
RPM = W * 60 / R   (W = workers, R = average response time in seconds)
RPM = 1590 * 60 / 0.072 = 1,325,000
↑ Workers
↓ Response Time
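A small worked example of the throughput formula may help. It assumes one request per worker at a time, as the slide states; the method name `max_rpm` and the "what if" figures (half the response time, double the workers) are illustrative assumptions, not numbers from the talk.

```ruby
# Upper bound on requests per minute when each worker serves one request
# at a time: workers * (60 seconds / seconds per request).
def max_rpm(workers, response_time_seconds)
  workers * 60.0 / response_time_seconds
end

current      = max_rpm(1590, 0.072)  # figures from the slide
faster       = max_rpm(1590, 0.036)  # hypothetical: halve the response time
more_workers = max_rpm(3180, 0.072)  # hypothetical: double the workers

puts "current ceiling:      #{current.round} RPM"
puts "half response time:   #{faster.round} RPM"
puts "twice as many workers: #{more_workers.round} RPM"
```

Either lever doubles the ceiling, which is why the slide lists both more workers and lower response time.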
Know The System
• Avoid network calls during requests
• Speed up unavoidable network calls
• The Storefront and Checkout
• The Chive
Chive Flash Sale
Measure ALL THE THINGS
• New Relic
• Splunk
• StatsD
• Cacti
• Conan
New Relic
Splunk
Caching
cacheable
• https://github.com/Shopify/cacheable
• serve gzip’d content
• ETag and 304 Not Modified
• generational caching
• no explicit expiry
cacheable
class PostsController < ApplicationController
  def show
    response_cache do
      @post = @shop.posts.find(params[:id])
      respond_with(@post)
    end
  end

  def cache_key_data
    {
      :action => action_name,
      :format => request.format,
      :params => params.slice(:id),
      :shop_version => @shop.version
    }
  end
end
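The "generational caching, no explicit expiry" idea behind `:shop_version` can be sketched in plain Ruby. This is not the cacheable gem's implementation: the `STORE` hash stands in for memcached, and `Shop`, `cache_key`, and `cached_response` are hypothetical names used only for illustration.

```ruby
require "digest"

# In-memory stand-in for memcached (assumption for this sketch).
STORE = {}

# Hypothetical shop with a version counter that is bumped on every write.
Shop = Struct.new(:id, :version)

# The version is part of the key, so bumping it makes every old entry
# unreachable without deleting anything.
def cache_key(shop, action, id)
  Digest::MD5.hexdigest([shop.id, shop.version, action, id].join(":"))
end

# Return the cached body, or run the block and cache its result.
def cached_response(shop, action, id)
  STORE[cache_key(shop, action, id)] ||= yield
end

shop = Shop.new(1, 1)
renders = 0
cached_response(shop, "show", 42) { renders += 1; "<html>v1</html>" }  # miss
cached_response(shop, "show", 42) { renders += 1; "<html>v1</html>" }  # hit
shop.version += 1                                                       # a write happened
cached_response(shop, "show", 42) { renders += 1; "<html>v2</html>" }  # miss, new generation
puts renders
```

Stale entries are never expired explicitly; in a real memcached deployment they would simply age out through LRU eviction.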
Caching Dynamic 404s
Identity Cache
• https://github.com/Shopify/identity_cache
• cache full model objects in memcached
• can include associated objects in cache
• must opt in to the cache
• explicit, but automatic expiry
Identity Cache
class Product < ActiveRecord::Base
  include IdentityCache

  has_many :images

  cache_index [:shop_id, :id]
  cache_has_many :images, :embed => true
end

@product = Product.fetch_by_shop_id_and_id(shop_id, id)
@images = @product.fetch_images
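The "opt in" and "explicit, but automatic expiry" bullets describe a read-through cache that is invalidated on write. A minimal sketch of that pattern, assuming in-memory hashes in place of memcached and MySQL; `ProductRecord`, `CACHE`, and `DB` are hypothetical names, and this is not how IdentityCache itself is implemented.

```ruby
# In-memory stand-ins for memcached and the database (assumptions).
CACHE = {}
DB = { 1 => { :id => 1, :title => "Shirt" } }

class ProductRecord
  class << self
    attr_reader :db_reads
  end
  @db_reads = 0

  attr_reader :id
  attr_accessor :title

  def initialize(id, title)
    @id = id
    @title = title
  end

  # Opt-in cached read: callers choose fetch over a plain database find.
  def self.fetch(id)
    CACHE["product:#{id}"] ||= begin
      @db_reads += 1
      row = DB.fetch(id)
      new(row[:id], row[:title])
    end
  end

  # Writing the record expires its cache entry, so the next fetch reloads it.
  def save
    DB[id] = { :id => id, :title => title }
    CACHE.delete("product:#{id}")
  end
end

product = ProductRecord.fetch(1)  # reads the database
ProductRecord.fetch(1)            # served from the cache
product.title = "Hat"
product.save                      # expires the cached copy
ProductRecord.fetch(1)            # reads the database again
puts ProductRecord.db_reads
```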
Get Out of My Process
Delayed Job
• Jobs stored in the db
• Workers run in their own process
• Workers poll for jobs periodically
• https://github.com/collectiveidea/delayed_job
Resque
• Redis backed
• O(1) operation to pop jobs
• Faster (300 jobs/sec vs 120 jobs/sec)
• Extensible
• https://github.com/defunkt/resque
Resque
• Sending Email
• Processing Payments
• Geolocation
• Import / Export
• Indexing for Search
• 86 Other things...
Background Payment Processing
Resque
class AddressGeolocationJob
  max_retries 3

  # Resque serializes job arguments as JSON, so hash keys arrive as strings.
  def self.perform(params)
    object = params['model'].constantize.find(params['id'])
    object.latitude, object.longitude = Geocoder.geocode(object)
    object.save!
  end
end
Resque.enqueue(AddressGeolocationJob, :id => 1, :model => 'Address')
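One detail worth seeing in isolation: Resque stores each job as a JSON payload holding the class name and the arguments, so a hash enqueued with symbol keys reaches `perform` with string keys. The sketch below imitates that round trip with the standard `json` library only; `EchoJob` is a hypothetical job, and no Redis or Resque is involved.

```ruby
require "json"

# Hypothetical job that simply returns the arguments it receives.
class EchoJob
  def self.perform(params)
    params
  end
end

# Roughly what happens between enqueue and perform: the job class name and
# its arguments are written out as JSON, then parsed again by the worker.
payload = JSON.generate("class" => "EchoJob",
                        "args"  => [{ :id => 1, :model => "Address" }])
job     = JSON.parse(payload)
result  = Object.const_get(job["class"]).perform(*job["args"])

p result            # keys are now strings
p result[:model]    # symbol lookup finds nothing
p result["model"]   # string lookup works
```

Inside a Rails app, calling `with_indifferent_access` on the params hash is one way to accept either key style.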
Redis
• Inventory reservation system
• Sessions
• Theme uploads
• Throttling
• Carts
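As an illustration of the throttling use case, here is a fixed-window counter. The talk does not say how Shopify's throttle works, so the algorithm, the `Throttle` class, and its parameters are all assumptions; a plain Hash stands in for Redis, where the increment would typically be an `INCR` on a key that expires with the window.

```ruby
# Fixed-window rate limiter. The store only needs to map keys to counters,
# which is what Redis provides atomically across processes.
class Throttle
  def initialize(limit:, window_seconds:, store: Hash.new(0))
    @limit = limit
    @window = window_seconds
    @store = store
  end

  # True while the client is within its limit for the current window.
  def allow?(client_id, now: Time.now.to_i)
    key = "throttle:#{client_id}:#{now / @window}"  # one counter per window
    @store[key] += 1
    @store[key] <= @limit
  end
end

throttle = Throttle.new(limit: 2, window_seconds: 60)
p throttle.allow?("shop-1", now: 0)   # first request in the window
p throttle.allow?("shop-1", now: 10)  # second request, still allowed
p throttle.allow?("shop-1", now: 20)  # third request, over the limit
p throttle.allow?("shop-1", now: 60)  # next window starts fresh
```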
All Roads Lead To MySQL
MySQL Hardware
• 4 x 8 Core Processor
• SSD
• 256 GB Ram
• Full working set in memory
MySQL Query Optimization
• pt-query-digest
• Avoid queries that generate temp tables
• Adding the right indexes
• Forcing / Ignoring Indexes
MySQL Tuning
• disable innodb_stats_on_metadata
• increase table_open_cache
• replace glibc memory allocator with tcmalloc
• innodb_autoinc_lock_mode = 2 (interleaved)
after_commit
A DB transaction's best friend
after_commit
• After transaction has been committed
• Webhooks
• Cache expiry
• Update associated objects
after_commit
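class OrderObserver < ActiveRecord::Observer
  observe :order

  # Runs inside the transaction: only record that follow-up work is needed.
  # (The keys of order.changes are strings, so use the attribute predicate.)
  def after_save(order)
    if order.financial_status_changed?
      order.flag_for_after_commit(:update_customer)
    end
  end

  # Runs once the transaction has committed: safe to enqueue the job.
  def after_commit(order)
    if order.flagged_for_after_commit?(:update_customer)
      Resque.enqueue(UpdateCustomerJob, :id => order.id)
    end
  end
end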
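The reason for deferring the enqueue is that a worker could otherwise pick up the job before the row is visible, or after the transaction has rolled back. A toy model of that ordering, with no ActiveRecord involved; `TinyTransaction` is a hypothetical class written only for this sketch.

```ruby
# Collects callbacks during a unit of work and runs them only on success.
class TinyTransaction
  def initialize
    @after_commit = []
  end

  # Register work that must wait until the transaction has committed.
  def after_commit(&block)
    @after_commit << block
  end

  # Runs the block. Callbacks fire only if it finishes without raising;
  # on an error they are discarded, as they would be on a rollback.
  def run
    yield self
    @after_commit.each(&:call)
    true
  rescue StandardError
    @after_commit.clear
    false
  end
end

enqueued = []

TinyTransaction.new.run do |tx|
  tx.after_commit { enqueued << [:update_customer, 1] }
end

TinyTransaction.new.run do |tx|
  tx.after_commit { enqueued << [:update_customer, 2] }
  raise "payment declined"  # simulated failure: nothing is enqueued
end

p enqueued
```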
Services
• Split out standalone services as needed
• Independently scaled
• Segmented metrics
• Overall system is more complex
• Limit to what is necessary
Imagery
Adapt and Evolve as Needed
Using data and knowledge of the system to drive decisions
Summary
• Know your application and infrastructure.
• Keep slow IO or CPU tasks out of the main process.
• Measure your optimizations. You can make it worse.
Thanks.
@johnduff | john.duff@shopify.com
