Cache is King: Get the Most Bang for Your Buck From Ruby

Sometimes your fastest queries can cause the most problems. I will take you beyond slow-query optimization and instead zero in on the performance impact of the sheer quantity of your datastore hits. Using real-world examples involving Elasticsearch, MySQL, and Redis, I will demonstrate how many fast queries can wreak just as much havoc as a few big slow ones. With each example I will use the simple tools available in Ruby to reduce or eliminate the need for these fast, seemingly innocuous datastore hits.

Published in: Engineering

  1. Cache is King: Get the Most Bang for Your Buck From Ruby
  2. Site Reliability Engineer
  3. Adding Indexes, Using SELECT statements, Batch Processing
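      These are the classic first-line query optimizations the talk assumes as prior art. A minimal ActiveRecord sketch of each (model and column names are illustrative, not from the talk):

        # Adding indexes: in a migration, index the columns you filter on.
        class AddClientIdIndex < ActiveRecord::Migration[5.2]
          def change
            add_index :vulnerabilities, :client_id
          end
        end

        # Using SELECT statements: load only the columns you need.
        Vulnerability.select(:id, :client_id)

        # Batch processing: walk large tables in fixed-size batches
        # instead of loading everything into memory at once.
        Vulnerability.find_each(batch_size: 1_000) do |vuln|
          # per-record work here
        end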
  4. Elasticsearch::Transport::Errors::GatewayTimeout 504
     { "statusCode": 200, "took": "100ms" }
  5. Resque
  6. Demo Time!
  7. Quantity of Datastore Hits
  8. The average company has... 60 thousand assets, 24 million vulnerabilities?
  9. MySQL, Elasticsearch Cluster
  10. Serialization
  11. MySQL, Elasticsearch Cluster, ActiveModelSerializers
  12. module Beehive
        module Serializers
          class Vulnerability < ActiveModel::Serializer
            attributes :id, :client_id, :created_at, :updated_at,
                       :priority, :details, :notes, :asset_id,
                       :solution_id, :owner_id, :ticket_id
          end
        end
      end
  13. 200 MILLION
  14. 11 hours and counting...
  15. (1.6ms) (0.9ms) (4.1ms) (5.2ms) (5.2ms) (1.3ms) (3.1ms) (2.9ms) (2.2ms) (4.9ms) (6.0ms) (0.3ms) (1.6ms) (0.9ms) (2.2ms) (3.0ms) (2.1ms) (1.3ms) (2.1ms) (8.1ms) (1.4ms)
  16. MySQL
  17. Bulk Serialization
  18. class BulkVulnerabilityCache
        attr_accessor :vulnerabilities, :client, :vulnerability_ids
        def initialize(vulns, client)
          self.vulnerabilities = vulns
          self.vulnerability_ids = vulns.map(&:id)
          self.client = client
        end
        # MySQL Lookups
      end
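      The "# MySQL Lookups" comment elides the bulk queries themselves. One plausible shape, assuming the cache exposes the fetch helper shown on slide 21 (method names here are hypothetical): each association is loaded once for the whole batch and grouped by vulnerability id, so per-record serialization becomes pure hash access.

        # One query per association per batch, instead of one per record.
        def cache_custom_fields
          @custom_fields = CustomField.where(:vulnerability_id => vulnerability_ids)
                                      .group_by(&:vulnerability_id)
        end

        # fetch('custom_fields', vuln_id) => the preloaded rows, or [].
        def fetch(association, vuln_id)
          instance_variable_get("@#{association}")[vuln_id] || []
        end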
  19. module Serializers
        class Vulnerability
          attr_accessor :vulnerability, :cache
          def initialize(vuln, bulk_cache)
            self.cache = bulk_cache
            self.vulnerability = vuln
          end
        end
      end
  20. class Vulnerability
        has_many :custom_fields
      end
  21. Before: CustomField.where(:vulnerability_id => vuln.id)
      After:  cache.fetch('custom_fields', vuln.id)
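      Inside the serializer, each association reader then goes through the cache instead of issuing its own query; a sketch building on the class from slide 19:

        module Serializers
          class Vulnerability
            # One hash lookup against the batch-preloaded data,
            # instead of one CustomField query per vulnerability.
            def custom_fields
              cache.fetch('custom_fields', vulnerability.id)
            end
          end
        end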
  22. The Result...
      (pry)> vulns = Vulnerability.limit(300);
      (pry)> Benchmark.realtime { vulns.each(&:serialize) }
      => 6.022452222998254
      (pry)> Benchmark.realtime do
           >   BulkVulnerability.new(vulns, [], client).serialize
           > end
      => 0.7267019419959979
  23. Decrease in database hits. Individual Serialization: 2,100. Bulk Serialization: 7
  24. Vulnerability Batches: 1k vulns, 1k vulns, 1k vulns
  25. Vulnerability Batches: 1k vulns, 1k vulns, 1k vulns. 7k queries vs 7
  26. MySQL Queries: Bulk Serialization Deployed
  27. RDS CPU Utilization: Bulk Serialization Deployed
  28. Process in Bulk
  29. Elasticsearch Cluster + Redis + MySQL: Vulnerabilities
  30. Redis: Client 1 Index, Client 2 Index, Client 3 & 4 Index
  31. (pry)> Redis.get("elasticsearch_index_#{client_id}")
      DEBUG -- : [Redis] command=GET args="elasticsearch_index_1234"
      DEBUG -- : [Redis] call_time=1.07 ms GET
  32. client_indexes = Hash.new do |h, client_id|
        h[client_id] = Redis.get("elasticsearch_index_#{client_id}")
      end
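      The Hash.new block only runs on a missing key, so within a job each client's index name costs at most one Redis GET; every later lookup is a plain in-memory hash read:

        client_indexes[1234]   # first access: one Redis GET, memoized into the hash
        client_indexes[1234]   # later accesses: hash read only, no Redis hit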
  33. indexing_hashes = vuln_hashes.map do |hash|
        {
          :_index => client_indexes[hash[:client_id]],
          :_type  => hash[:doc_type],
          :_id    => hash[:id],
          :data   => hash[:data]
        }
      end
  34. 1 + 1 + 2 Redis hits: Client 1, Client 2, Client 3 & 4, for 1k + 1k + 1k documents
  35. 1000x
  36. 65% job speed up
  37. Local Cache
  38. Redis
  39. Process in Bulk, Hash Cache
  40. Sharded Databases: CLIENT 1, CLIENT 2, CLIENT 3
  41. Asset.with_shard(client_id).find(1)
  42. Sharding Configuration:
      {
        'client_123' => 'shard_123',
        'client_456' => 'shard_456',
        'client_789' => 'shard_789'
      }
  43. Sharding Configuration Size: 20 bytes, 1 KB, 13 KB
  44. 285 Workers
  45. 7.8 MB/second
  46. ActiveRecord::Base.connection
  47. (pry)> ActiveRecord::Base.connection
      => #<Octopus::Proxy:0x000055b38c697d10
       @proxy_config= #<Octopus::ProxyConfig:0x000055b38c694ae8
  48. module Octopus
        class Proxy
          attr_accessor :proxy_config
          delegate :current_shard, :current_shard=,
                   :current_slave_group, :current_slave_group=,
                   :shard_names, :shards_for_group, :shards, :sharded,
                   :config, :initialize_shards, :shard_name,
                   to: :proxy_config, prefix: false
        end
      end
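      Because every ActiveRecord::Base.connection call goes through Octopus's proxy, code that fetches the connection inside a hot loop pays that proxy and sharding-config overhead on every record. A hedged illustration of the framework-cache idea (not Octopus's actual fix, and the query is hypothetical): hoist the lookup out of the loop.

        # Fetch the (Octopus-proxied) connection once, not once per record.
        connection = ActiveRecord::Base.connection
        vulnerability_ids.each do |id|
          connection.select_value("SELECT client_id FROM vulnerabilities WHERE id = #{id}")
        end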
  49. Know your gems
  50. Process in Bulk, Framework Cache, Hash Cache
  51. Avoid making datastore hits you don’t need
  52. User.where(:id => user_ids).each do |user|
        # Lots of user processing
      end
  53. FALSE
  54. (pry)> User.where(:id => [])
      User Load (1.0ms) SELECT `users`.* FROM `users` WHERE 1=0
      => []
  55. return unless user_ids.any?
      User.where(:id => user_ids).each do |user|
        # Lots of user processing
      end
  56. (pry)> Benchmark.realtime do
           >   10_000.times { User.where(:id => []) }
           > end
      => 0.5508159045130014
      (pry)> Benchmark.realtime do
           >   10_000.times do
           >     next unless ids.any?
           >     User.where(:id => [])
           >   end
           > end
      => 0.0006368421018123627
  57. Not “Ruby is slow”: hitting the database is slow!
      (pry)> Benchmark.realtime do
           >   10_000.times { User.where(:id => []) }
           > end
      => 0.5508159045130014
  58. User.where(:id => user_ids).each do |user|
        # Lots of user processing
      end
  59. User.where(:id => user_ids).each do |user|
        # Lots of user processing
      end
      users = User.where(:id => user_ids).active.short.single
  60. .none
  61. .none in action...
      (pry)> User.where(:id => []).active.tall.single
      User Load (0.7ms) SELECT `users`.* FROM `users` WHERE 1=0 AND
        `users`.`active` = 1 AND `users`.`short` = 0 AND `users`.`single` = 1
      => []
      (pry)> User.none.active.tall.single
      => []
  62. Logging
      pry(main)> Rails.logger.level = 0
      $ redis-cli monitor > commands-redis-2018-10-01.txt
      pry(main)> Search.connection.transport.logger = Logger.new(STDOUT)
  63. Preventing useless datastore hits
  64. Report: Elasticsearch, MySQL, Redis
  65. Investigating Existing Reports
      (pry)> Report.blank_reports.count
      => 10805
      (pry)> Report.active.count
      => 25842
      (pry)> Report.average_asset_count
      => 1657
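      Those counts motivated a database guard: bail out before touching any of the three datastores when a report has nothing to process. A minimal sketch (method names hypothetical):

        def generate_report(report)
          # Skip the Elasticsearch, MySQL, and Redis work entirely
          # for reports with no assets.
          return if report.asset_count.zero?
          # ... expensive multi-datastore report generation ...
        end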
  66. Report: Elasticsearch, MySQL, Redis
  67. 10+ hrs
  68. 3 hrs
  69. Process in Bulk, Framework Cache, Database Guards, Hash Cache
  70. Resque Workers, Redis
  71. 45 workers, 45 workers, 45 workers
  72. 70 workers, 70 workers, 70 workers
  73. 48 MB, 16 MB
  74. Redis Requests: 70 workers, 100k, 200k
  75. ?
  76. Resque Throttling
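      The slides don't show the throttling code, but one knob stock Resque already exposes is the worker polling interval: every poll is a Redis request, so polling less often directly cuts Redis traffic. The default interval is 5 seconds; a sketch:

        # Check the queue every 10 seconds instead of every 5;
        # each skipped poll is one fewer Redis request per worker.
        worker = Resque::Worker.new(:default)
        worker.work(10)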
  77. Redis Requests: 100k, 200k
  78. Redis Network Traffic: 48 MB, 16 MB
  79. Process in Bulk, Framework Cache, Database Guards, Remove Datastore Hits, Hash Cache
  80. Every datastore hit COUNTS
  81. Questions
  82. Contact
      https://www.linkedin.com/in/mollystruve/
      https://github.com/mstruve
      @molly_struve
      molly.struve@gmail.com
