Memory Issues in Ruby on Rails Applications


Boston Ruby Meetup presentation by Joe Ferris, CTO of thoughtbot, and Simeon Simeonov, CTO of Swoop, on ways to optimize the memory footprint of data intensive Ruby on Rails applications.

Published in: Technology

Memory Issues in Ruby on Rails Applications

  1. Memory Issues in Rails Applications
  2. I am @simeons
  3. Recruit amazing people, solve hard problems, ship, make users happy, repeat
  4. Problems of Success (good problems): too many users, too much traffic, too much data
  5. Memory Issues in Rails Applications: Common Problem of Success
  6. Display Advertising Makes the Web Suck. User-focused optimization. Tens of millions of users. 1000+% better than average. 200+% better than Google. Swoop Fixes That.
  7. Mobile SDKs: iOS & Android. Web SDK: RequireJS & jQuery. Components: AngularJS. NLP, etc.: Python. Targeting: High-Perf Java. Analytics: Ruby 2.0. Internal Apps: Ruby 2.0 / Rails 3. Pub Portal: Ruby 2.0 / Rails 3. Ad Portal: Ruby 2.0 / Rails 4.
  8. Before: 1 hr @ 4 GB
  9. Before: 1 hr @ 4 GB. When problems grow faster than the rate at which you can throw hardware at them, you actually have to solve them.
  10. Before: 1 hr @ 4 GB. After: 5 min @ 230 MB.
  11. Resolving Memory Issues in Rails Applications Using Streams
  12. CSV
  13. [Chart: Memory (MB) vs. Rows]
  14. [Chart: Memory (MB) vs. Rows, annotated "You are here"]
  15. [Chart: Memory (MB) vs. Rows, annotated "You are here" and "This sucks"]
  16. [Chart: Memory (MB) vs. Rows, annotated "You are here", "This sucks" and "Start thinking here"]
  17. Memory Leaks
  18. class AddDomainsStep
        def call(hashes)
          hashes.map do |hash|
            transform_and_return(hash)
          end
        end
      end
  19. class AddDomainsStep
        def initialize
          @domain_config = DomainConfig.instance
        end

        def call(hashes)
          hashes.each do |hash|
            hash['domain'] = @domain_config.domain_for(hash['domain_id'])
          end
        end
      end
  20. class DomainConfig
        include Singleton

        def initialize
          @domains = {}
        end

        def domain_for(id)
          @domains[id] ||= Domain.name_for(id) || ''
        end
      end
  21. @domains[id] ||= Domain.name_for(id) || ''
  22. Memory Leak
        • Memory that will never be released by the garbage collector.
        • Memory usage grows the longer the process runs.
  23. Avoid Global State
        • Global variables
        • Class variables
        • Singletons
        • Per-process instance state
  24. Memory Churn
  25. hashes.map do |hash|
        hash['domain'].downcase.strip
      end

      vs

      hashes.each do |hash|
        hash['domain'].downcase!
        hash['domain'].strip!
      end
  26. Memory Churn
        • Allocating and deallocating tons of objects slows down processing.
        • Mutation limits allocations, but makes it easier to introduce bugs.
  27. hashes.each do |hash|
        hash['domain'].downcase!
        hash['domain'].strip!
      end

      Spot the Bug!
  28. # In shared state:
      @domains[id] ||= Domain.name_for(id) || ''

      # Much later:
      hash['domain'].downcase!
      hash['domain'].strip!
  29. Good News!
        • Allocating and freeing objects is fairly fast in Ruby.
        • Keeping your stack frame light will limit the effects of memory churn.
  30. Memory Bloat
  31. def to_csv
        csv = [CSV.generate_line(headers)]

        rows.each do |row|
          values = headers.map do |header|
            row[header] || defaults[header]
          end

          csv << CSV.generate_line(values)
        end

        csv.join('')
      end
  32. (Same code as slide 31.)
  33. (Same code as slide 31.)
  34. Memory Bloat
        • Memory usage grows with data set
        • Loading too much data at once
  35. Laziness
  36. rename_report_fields(
        squash(
          add_domains(
            add_properties(
              unwind_variations(rows)
            )
          )
        )
      )
  37. def duplicate(number, count)
        if count > 0
          [number] + duplicate(number, count - 1)
        else
          []
        end
      end

      def sum(list)
        list.inject(0) do |result, number|
          result + number
        end
      end
  38. sum(duplicate(5, 10)) # => 50
  39. duplicate :: Int -> Int -> [Int]
      duplicate number count
        | count <= 0 = []
        | otherwise = number:duplicate number (count - 1)

      sum :: [Int] -> Int
      sum [x] = x
      sum (x:remaining) = x + sum remaining
  40. > sum $ duplicate 5 10
      50
  41. Be Proactive About Being Lazy
  42. Enumerable
  43. class AddDomainsStep
        def initialize(source)
          @source = source
        end

        def each
          @source.each do |hash|
            hash['domain'] = DomainConfig.instance.domain_for(hash['domain_id'])
            yield hash
          end
        end
      end
  44. RenameReportFieldsStep.new(
        SquashStep.new(
          AddDomainsStep.new(
            AddPropertiesStep.new(
              UnwindVariationsStep.new(rows)
            )
          )
        )
      )
  45. Buffering
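
The slides contrast a `to_csv` that collects every line in an array (slide 31) with steps that expose `each` and pass rows along one at a time (slide 43). A minimal sketch of how those two ideas might fit together follows. It is not the presenters' implementation: `UpcaseDomainStep`, `write_csv`, and the sample rows are hypothetical names and data invented for illustration, and the output goes to a `StringIO` only so the example is self-contained (a file or response stream would take its place in practice).

```ruby
require 'csv'
require 'stringio'

# Hypothetical step, modeled on the shape of the steps in slide 43: it wraps
# any source that responds to #each and yields transformed rows one by one.
class UpcaseDomainStep
  include Enumerable

  def initialize(source)
    @source = source
  end

  def each
    return enum_for(:each) unless block_given?

    @source.each do |hash|
      # Build a new hash rather than mutating the caller's data.
      yield hash.merge('domain' => hash['domain'].to_s.upcase)
    end
  end
end

# Hypothetical helper: writes one CSV line per row to an IO-like object,
# so the full document is never assembled in an intermediate array.
def write_csv(io, headers, rows, defaults = {})
  io << CSV.generate_line(headers)
  rows.each do |row|
    values = headers.map { |header| row[header] || defaults[header] }
    io << CSV.generate_line(values)
  end
  io
end

# Made-up sample data.
rows = [
  { 'domain' => 'example.com', 'clicks' => 3 },
  { 'domain' => 'example.org' }
]

out = StringIO.new
write_csv(out, %w[domain clicks], UpcaseDomainStep.new(rows), { 'clicks' => 0 })
puts out.string
# domain,clicks
# EXAMPLE.COM,3
# EXAMPLE.ORG,0
```

Because each step only needs `each`, steps can be nested as in slide 44, and the writer pulls rows through the whole chain without any step holding the full data set.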

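One reading of the "Spot the Bug!" slides (27 and 28) is that `downcase!` and `strip!` modify the very string object that the singleton cache handed out, so the cached value changes for every later caller. The sketch below reproduces that effect under stated assumptions: `DomainCache`, its `freeze_values:` option, and the `lookup` stand-in for `Domain.name_for` are hypothetical and not taken from the talk.

```ruby
# Hypothetical stand-in for the singleton cache in slide 20.
class DomainCache
  def initialize(freeze_values: false)
    @domains = {}
    @freeze_values = freeze_values
  end

  # Returns the same String object on every call for a given id.
  def domain_for(id)
    @domains[id] ||= begin
      name = lookup(id)
      @freeze_values ? name.freeze : name
    end
  end

  private

  # Stand-in for Domain.name_for(id); returns a fresh, unfrozen String.
  def lookup(_id)
    String.new(' Example.COM ')
  end
end

# Mutating the value taken from the cache changes the cache itself.
leaky = DomainCache.new
hash = { 'domain' => leaky.domain_for(1) }
hash['domain'].downcase!
hash['domain'].strip!
leaky.domain_for(1) # => "example.com"

# Frozen cached values plus non-mutating calls leave the cache intact,
# at the cost of allocating new strings.
safe = DomainCache.new(freeze_values: true)
hash = { 'domain' => safe.domain_for(1) }
cleaned = hash['domain'].downcase.strip
safe.domain_for(1) # => " Example.COM "
```

Freezing is only one possible guard. With frozen values, an accidental `downcase!` raises `FrozenError` instead of silently corrupting shared state, which trades a little memory churn for a bug that surfaces immediately.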