DATABASE SHARDING
IN RAILS APPLICATIONS
RUBY MEDITATION #23
September 2018
ABOUT ME
Work: @Talkable
Ruby: since 2012
Rails: since v3.2
Open Source: about 0 contributions
@justmacc
@vitalikdanchenko
Hobby: Running
Marathon Finisher: 4 times
Marathon PB: 3:05:10
AGENDA
• A LITTLE BIT ABOUT SCALING
• DATABASE SHARDING
• A LOT OF DIAGRAMS
• ALMOST NO CODE
CAN RAILS SCALE?
“"TWITTER SCALE" WASN'T SO MUCH A RUBY/RAILS PROBLEM AS IT WAS AN RDBMS”
— @busterarm at Hacker News
https://news.ycombinator.com/item?id=17217210
SCALING
SCALABILITY VS PERFORMANCE
[Diagrams: a Rails Application made of WEB WORKERS, BACKGROUND JOBS, and a DATABASE. In the last diagram there are four WEB WORKERS and four JOBS, but still a single DATABASE.]
DATABASE SCALING
[Diagram: one DATABASE MASTER with one DATABASE SLAVE]
[Diagram: one DATABASE MASTER with three DATABASE SLAVES]
[Diagram: three DATABASES, each holding PARTIAL DATA]
DATABASE SHARDING
FUNCTIONAL          EXPRESSIONAL             METADATA
Separate Modules    id % N, year, country    row -> shard
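The three sharding models can be compared in a minimal Ruby sketch. Every name and value here (the module map, the shard directory, the method names) is hypothetical and chosen only for illustration; none of it comes from the talk.

```ruby
# Functional: whole modules (groups of tables) live in separate databases.
# The module names and database labels are made up for this sketch.
FUNCTIONAL_MAP = { "billing" => :billing_db, "analytics" => :analytics_db }.freeze

def functional_shard(module_name)
  FUNCTIONAL_MAP.fetch(module_name)
end

# Expressional: the shard is computed from the row itself, e.g. id % N.
def expressional_shard(id, shard_count)
  id % shard_count
end

# Metadata: a lookup table records which shard each row lives on.
# The directory contents are made up for this sketch.
SHARD_DIRECTORY = { 101 => 0, 102 => 1 }.freeze

def metadata_shard(row_id)
  SHARD_DIRECTORY.fetch(row_id)
end
```

The expressional model needs no extra storage but ties placement to the formula. The metadata model costs a lookup but lets a row be moved by updating its directory entry.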
THEORETICAL PART
PRACTICAL PART
DATABASE SHARDING IN RAILS
MULTIPLE DATABASES MODEL
    # Abstract base class: every model inheriting from it uses the shard connection.
    class ShardedBase < ActiveRecord::Base
      establish_connection SHARD_DB_CONFIG
      self.abstract_class = true
    end

    class Post < ShardedBase
    end

    class Comment < ShardedBase
    end
DATABASE SHARDING IN RAILS
GEMS
• thiagopradi/octopus
• hsgubert/rails-sharding
• zendesk/active_record_shards
    ActiveRecord::Base.on_shard(:number_two) do
      Post.find_by(author: author)
    end
DATABASE SHARDING IN TALKABLE
STEP BY STEP
1. How to shard?
BY CUSTOMER
1. How to shard? — expressional + metadata
2. Which tables have to be sharded?
THE LARGEST
    class Visitor < ShardedRecord
      has_many :purchases
    end

    class Purchase < ShardedRecord
      belongs_to :person
      has_one :referral
    end

    class Referral < ShardedRecord
      has_many :rewards
    end

    class Reward < ShardedRecord
      belongs_to :incentive
    end
[Diagram: DATABASE (NOT SHARDED) holds CONFIG DATA; DATABASE SHARD #1 and DATABASE SHARD #2 each hold REFERRAL DATA]
2. Which tables have to be sharded? — config vs referral
3. Get rid of joins
    module ShardingJoinFinder
      # Captures the table name that follows FROM or JOIN.
      TABLE_REFERENCE = /(^|\s+)(join|from)\s+`?([a-z_]+)`?/i

      def execute(query, name = nil)
        ensure_sharding_match(query)
        super(query, name)
      end

      def select_all(query, *args)
        ensure_sharding_match(query)
        super
      end

      # Raises when one query touches both sharded and unsharded tables.
      def ensure_sharding_match(query)
        tables = query.scan(TABLE_REFERENCE).map(&:last)
        # Assumption: the slide omits how `klasses` is built; resolving each
        # table name to its model class is one plausible reading.
        klasses = tables.map { |table| table.classify.safe_constantize }.compact
        sharded, unsharded = klasses.partition(&:is_sharded?)
        raise Error if sharded.any? && unsharded.any?
      end
    end

    ActiveRecord::ConnectionAdapters::Mysql2Adapter.prepend(ShardingJ
    ActiveRecord::Relation.prepend(ShardingJoinFinder)
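The detection idea can be tried without Rails. The sketch below is a simplified stand-in, not the Talkable implementation: it checks table names against a hypothetical `SHARDED_TABLES` set instead of asking model classes, and `MixedShardingError` is a made-up error class.

```ruby
require "set"

# Hypothetical list of sharded tables; the talk does not enumerate the real one.
SHARDED_TABLES = Set["visitors", "purchases", "referrals", "rewards"].freeze

# Matches the table name that follows FROM or JOIN, with optional backticks.
TABLE_REFERENCE = /(?:^|\s)(?:join|from)\s+`?([a-z_]+)`?/i

class MixedShardingError < StandardError; end

# Returns the referenced table names, raising if the query mixes
# sharded and unsharded tables.
def ensure_sharding_match(query)
  tables = query.scan(TABLE_REFERENCE).flatten.map(&:downcase).uniq
  sharded, unsharded = tables.partition { |table| SHARDED_TABLES.include?(table) }
  if sharded.any? && unsharded.any?
    raise MixedShardingError,
          "query mixes sharded #{sharded} and unsharded #{unsharded} tables"
  end
  tables
end
```

A regex-based check like this only sees table names written literally after FROM or JOIN, so it is a development-time safety net rather than a full SQL parser.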
3. Get rid of joins — joins detector
4. Introduce connection management
4. Introduce connection management — gem
5. Add utility helpers
    class DatabaseUtils
      class SlaveError < StandardError; end

      class << self
        def use_shard(shard, &block); end
        def current_shard; end
        def uses_slave?; end
        def config_for(slave: uses_slave?, shard: current_shard); end
        def connection(slave: uses_slave?, shard: current_shard); end
        def sharded_tables; end
        def on_all_shards(&block); end
        def processlist; end
        def slave_lag_for(connection); end
      end
    end

    Rails.application.console do
      def shard(shard)
        shard = shard.database_shard if shard.respond_to?(:database_shard)
        DatabaseUtils.use_shard!(shard)
      end
    end
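The slide shows only method stubs. As a rough illustration of what block-scoped shard selection might look like, here is a pure-Ruby sketch with no database involved. `ShardContext`, the `SHARDS` list, and the use of `Thread.current` are all assumptions made for this example, not the speaker's code.

```ruby
# Pure-Ruby stand-in for shard-selection helpers; it only tracks which
# shard is selected and never opens a connection.
class ShardContext
  SHARDS = [1, 2].freeze # hypothetical shard identifiers

  class << self
    # The shard currently selected (stored in Thread.current), or nil
    # when the unsharded database is in use.
    def current_shard
      Thread.current[:current_shard]
    end

    # Runs the block with `shard` selected and restores the previous
    # selection afterwards, even if the block raises.
    def use_shard(shard)
      previous = current_shard
      Thread.current[:current_shard] = shard
      yield
    ensure
      Thread.current[:current_shard] = previous
    end

    # Runs the block once per shard and collects the results.
    def on_all_shards
      SHARDS.map { |shard| use_shard(shard) { yield shard } }
    end
  end
end
```

Restoring the previous value in `ensure` is what makes nested `use_shard` calls safe: the inner block cannot leak its selection into the outer one.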
5. Add utility helpers — depends on individual needs
6. Migrate
    # Shard 1 shares its hosts with the primary database.
    production:
      host: 'master.production.com'
      database: 'primary'
      slave:
        host: 'slave.production.com'
        database: 'primary'

      shards:
        1:
          host: 'master.production.com'
          database: 'shard_1'
          slave:
            host: 'slave.production.com'
            database: 'shard_1'
        2:
          host: 'shard_2.production.com'
          database: 'shard_2'
          slave:
            host: 'slave.shard_2.production.com'
            database: 'shard_2'
    # Shard 1 now has hosts of its own.
    production:
      host: 'master.production.com'
      database: 'primary'
      slave:
        host: 'slave.production.com'
        database: 'primary'

      shards:
        1:
          host: 'shard1.production.com'
          database: 'shard_1'
          slave:
            host: 'slave.shard_1.production.com'
            database: 'shard_1'
        2:
          host: 'shard_2.production.com'
          database: 'shard_2'
          slave:
            host: 'slave.shard_2.production.com'
            database: 'shard_2'
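To show how a configuration of this shape could be consumed, here is a small Ruby sketch of a lookup along the lines of `config_for`. It assumes that `shards` is nested under `production` (the slide text does not preserve indentation), and the standalone `config_for` method below is hypothetical rather than the speaker's implementation.

```ruby
require "yaml"

# A trimmed copy of the configuration, with `shards` assumed to be
# nested under `production`.
PRODUCTION = YAML.safe_load(<<~CONFIG).fetch("production")
  production:
    host: 'master.production.com'
    database: 'primary'
    slave:
      host: 'slave.production.com'
      database: 'primary'
    shards:
      1:
        host: 'shard1.production.com'
        database: 'shard_1'
        slave:
          host: 'slave.shard_1.production.com'
          database: 'shard_1'
      2:
        host: 'shard_2.production.com'
        database: 'shard_2'
        slave:
          host: 'slave.shard_2.production.com'
          database: 'shard_2'
CONFIG

# Picks the connection settings for a shard (or the unsharded database
# when shard is nil), optionally for its slave.
def config_for(config, shard: nil, slave: false)
  base = shard ? config.fetch("shards").fetch(shard) : config
  selected = slave ? base.fetch("slave") : base
  selected.slice("host", "database")
end
```

With a lookup like this, moving a shard to new hardware is a configuration change: the application keeps asking for shard 1 and simply receives a different host.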
DATABASE SHARDING IN TALKABLE
STEP BY STEP
1. How to shard? — expressional + metadata
2. Which tables have to be sharded? — config vs referral
3. Get rid of joins — joins detector
4. Introduce connection management — gem
5. Add utility helpers — depends on individual needs
6. Migrate — profit!!!
DATABASE SHARDING SOLUTIONS
MySQL        PostgreSQL
Vitess       Citus
Jetpants     ???
ScaleBase
DATABASE SHARDING
CONCLUSION
PROS                  CONS
Easier to Manage      Not Easy to Implement
Smaller and Faster    Lost Abilities
Reduces Costs
• Do not shard your data until you really need it
• Choose a sharding model that fits your context (project)
• Each additional shard increases the chance of a random crash
SUNDAY MORNING RUN
When: 08:00 am
Where: 46°28'44.5"N 30°45'41.0"E
How far: doesn’t matter
How fast: doesn’t matter
Q&A

Database Sharding in Rails Applications – Vitalik Danchenko | Ruby Meditation #23
