How to avoid hanging yourself with Rails

Presentation given to Toronto Rails Project Night, performance tips for ActiveRecord usage

Published in: Technology
  • These slides are part of a presentation I gave, so you don't really have the context to go with each slide - sorry about that. Glad they helped out though.
  • It's really interesting. I've had problems with the speed of a Rails app, and after reading this presentation I really improved it. Thanks. But ... the beginning is a little confusing.

  1. work.rowanhick.com How to avoid hanging yourself with Rails Using ActiveRecord right the first time
  2. Discussion tonight • Intended for new Rails developers • People that think Rails is slow • Focus on simple steps to improve common :has_many performance problems • Short - 15 mins • All links/references up on http://work.rowanhick.com tomorrow
  3. About me • New Zealander (not Australian) • Product Development Mgr for a startup in Toronto • Full time with Rails for 2 years • Previously PHP/MySQL for 4 years • 6 years prior in QA/BA/PM for an enterprise CAD/CAM software dev company
  4. Disclaimer • For the sake of brevity and understanding, the SQL shown here is cut down to “pseudo SQL” • This is not an exhaustive in-depth analysis, just meant as a heads up • Times were measured using ApacheBench through Mongrel in production mode • ab -n 1000 http://127.0.0.1/orders/test_xxxx
  5. ActiveRecord lets you get in trouble far too quickly.
     • Super easy syntax comes at a cost.
         @orders = Order.find(:all)
         @orders.each do |order|
           puts order.customer.name
           puts order.customer.country.name
         end
     ✴ Congratulations, you just overloaded your DB with (total number of Orders x 2) unnecessary SQL calls
  6. What happened there?
     • One query to get the orders
         @orders = Order.find(:all)
         “SELECT * FROM orders”
     • For every item in the orders collection
         customer.name: “SELECT * FROM customers WHERE id = x”
         customer.country.name: “SELECT * FROM countries WHERE id = y”
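The query arithmetic on this slide can be checked with a small plain-Ruby simulation. This is only a sketch: `FakeDb`, `build_db`, and `lazy_walk` are hypothetical names made up for illustration and do not use ActiveRecord; the fake database simply counts how many "queries" it receives.

```ruby
# Plain-Ruby simulation (no Rails needed): a fake DB that counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(tables)
    @tables = tables
    @query_count = 0
  end

  # Stands in for "SELECT * FROM table"
  def all(table)
    @query_count += 1
    @tables.fetch(table).values
  end

  # Stands in for "SELECT * FROM table WHERE id = x"
  def find(table, id)
    @query_count += 1
    @tables.fetch(table).fetch(id)
  end
end

# Builds a fake database with one country, one customer and N orders.
def build_db(order_count)
  countries = { 1 => { id: 1, name: "Canada" } }
  customers = { 1 => { id: 1, name: "Acme", country_id: 1 } }
  orders = {}
  (1..order_count).each { |i| orders[i] = { id: i, customer_id: 1 } }
  FakeDb.new(orders: orders, customers: customers, countries: countries)
end

# Mirrors the loop on the slide: two lookups per order.
def lazy_walk(db)
  db.all(:orders).map do |order|
    customer = db.find(:customers, order[:customer_id])
    country = db.find(:countries, customer[:country_id])
    "#{customer[:name]} (#{country[:name]})"
  end
end

db = build_db(100)
lazy_walk(db)
puts db.query_count # 1 query for the orders + 100 * 2 lookups
```

With 100 orders the counter ends at 201, which is the "total number of Orders x 2" extra calls on top of the initial query.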
  7. Systemic problem in web development I’ve seen: - 15 second page reloads - 10,000 queries per page “<insert name here> language performs really poorly, we’re going to get it redeveloped in <insert new language here>”
  8. A typical root cause • Failure to build the application with *real* data • i.e. “It worked fine on my machine”, but the developer never loaded up 100,000 records to see what would happen • Use Rake tasks to build realistic data sets • Test, test, test • tail -f log/development.log
  9. Faker to the rescue
     • in lib/xchain.rake
         namespace :xchain do
           desc "Load fake customers"
           task :load_customers => :environment do
             require 'faker'
             # Remove customers generated by a previous run
             Customer.find(:all, :conditions => "email LIKE('%XCHAIN_%')").each { |c| c.destroy }
             300.times do
               c = Customer.new
               c.status_id = rand(3) + 1
               c.country_id = rand(243) + 1
               c.name = Faker::Company.name
               c.alternate_name = Faker::Company.name
               c.phone = Faker::PhoneNumber.phone_number
               c.email = "XCHAIN_" + Faker::Internet.email
               c.save
             end
           end
         end
     • $ rake xchain:load_customers
  10. Eager loading
      • By using :include in finds you create SQL joins
      • Pull all required records in one query
          find(:all, :include => [ :customer, :order_lines ])
          ✓ order.customer, order.order_lines
          find(:all, :include => [ { :customer => :country }, :order_lines ])
          ✓ order.customer, order.customer.country, order.order_lines
  11. Improvement
      • Let’s start optimising ...
          @orders = Order.find(:all, :include => { :customer => :country })
      • Resulting SQL ...
          “SELECT orders.*, countries.* FROM orders LEFT JOIN customers ON (customers.id = orders.customer_id) LEFT JOIN countries ON (countries.id = customers.country_id)”
      ✓ 7.70 req/s, 1.4x faster
  12. Select only what you need
      • Using the :select parameter in the find options, you can limit the columns you are requesting back from the database
      • No point grabbing all columns if you only want :id and :name
          Order.find(:all, :select => 'orders.id, orders.name')
  13. The last slide was very important • Not using :select is *okay* provided you have very small columns, and never any binary or large text data • Otherwise you can suddenly saturate your DB connection • Imagine our Orders table had an Invoice column on it storing a PDF of the invoice...
  14. Oops • Can’t show a benchmark • :select and :include don’t work together! The finder reverts to selecting all columns • For a long time the core team has not included patches to make it work • One little sentence in the ActiveRecord rdoc: “Because eager loading generates the SELECT statement too, the :select option is ignored.”
  15. ‘mrj’ to the rescue • http://dev.rubyonrails.org/attachment/ticket/7147/init.5.rb • Monkey patch to fix the select/include problem • Produces much more efficient SQL
  16. Updated finder
      • Now :select and :include playing nice:
          @orders = Order.find(:all,
            :select => 'orders.id, orders.created_at, customers.name, countries.name, order_statuses.name',
            :include => [{:customer[:name] => :country[:name]}, :order_status[:name]],
            :conditions => conditions,
            :order => 'order_statuses.sort_order ASC, order_statuses.id ASC, orders.id DESC')
      ✓ 15.15 req/s, 2.88x faster
  17. r8672 change
      • http://blog.codefront.net/2008/01/30/living-on-the-edge-of-rails-5-better-eager-loading-and-more/
      • The following uses the new, improved association loading (12 req/s)
          @orders = Order.find(:all, :include => [{:customer => :country}, :order_status])
      • The following does not
          @orders = Order.find(:all, :include => [{:customer => :country}, :order_status], :order => 'order_statuses.sort_order')
  18. r8672 output...
      • Here’s the SQL
          Order Load (0.000837) SELECT * FROM `orders` WHERE (order_status_id < 100) LIMIT 10
          Customer Load (0.000439) SELECT * FROM `customers` WHERE (customers.id IN (2106,2018,1920,2025,2394,2075,2334,2159,1983,2017))
          Country Load (0.000324) SELECT * FROM `countries` WHERE (countries.id IN (33,17,56,150,194,90,91,113,80,54))
          OrderStatus Load (0.000291) SELECT * FROM `order_statuses` WHERE (order_statuses.id IN (10))
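The log above shows one query per table, each using an IN list of the ids collected from the previous result. A rough plain-Ruby sketch of that batching idea follows. It is not how ActiveRecord implements it internally; `CountingDb`, `where_id_in`, and `preload_walk` are hypothetical names invented for this illustration.

```ruby
# Plain-Ruby simulation of batched association loading (no Rails needed).
class CountingDb
  attr_reader :query_count

  def initialize(tables)
    @tables = tables
    @query_count = 0
  end

  # Stands in for "SELECT * FROM table"
  def all(table)
    @query_count += 1
    @tables.fetch(table).values
  end

  # Stands in for "SELECT * FROM table WHERE id IN (...)".
  # Returns the matching rows as a hash keyed by id.
  def where_id_in(table, ids)
    @query_count += 1
    @tables.fetch(table).select { |id, _row| ids.include?(id) }
  end
end

# One query per table, no matter how many orders there are.
def preload_walk(db)
  orders = db.all(:orders)
  customer_ids = orders.map { |o| o[:customer_id] }.uniq
  customers = db.where_id_in(:customers, customer_ids)
  country_ids = customers.values.map { |c| c[:country_id] }.uniq
  countries = db.where_id_in(:countries, country_ids)

  orders.map do |order|
    customer = customers.fetch(order[:customer_id])
    country = countries.fetch(customer[:country_id])
    "#{customer[:name]} (#{country[:name]})"
  end
end

sample = CountingDb.new(
  orders: { 1 => { id: 1, customer_id: 7 }, 2 => { id: 2, customer_id: 8 } },
  customers: { 7 => { id: 7, name: "Acme", country_id: 3 },
               8 => { id: 8, name: "Globex", country_id: 3 } },
  countries: { 3 => { id: 3, name: "Canada" } }
)
puts preload_walk(sample).inspect
puts sample.query_count
```

In this sketch the query count stays at three (orders, customers, countries) however many orders are loaded, which is the property that makes the approach scale better than per-row lookups.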
  19. But I want more • Okay, this still isn’t blazing fast. I’m building the next killr web2.0 app • Forget about associations, just load it via SQL; depending on the application, it makes a huge difference • Concentrate on commonly used pages
  20. Catch 22 • Hard coding SQL is the fastest solution • No construction of SQL, no generation of ActiveRecord associated classes • If your DB changes, you have to update SQL ‣ Keep SQL with models where possible
  21. It ain’t pretty.. but it’s fast
      • Find by SQL
          class Order < ActiveRecord::Base
            def self.find_current_orders
              find_by_sql("SELECT orders.id, orders.created_at,
                  customers.name AS customer_name,
                  countries.name AS country_name,
                  order_statuses.name AS status_name
                FROM orders
                LEFT OUTER JOIN `customers` ON `customers`.id = `orders`.customer_id
                LEFT OUTER JOIN `countries` ON `countries`.id = `customers`.country_id
                LEFT OUTER JOIN `order_statuses` ON `order_statuses`.id = `orders`.order_status_id
                WHERE order_status_id < 100
                ORDER BY order_statuses.sort_order ASC, order_statuses.id ASC, orders.id DESC")
            end
          end
      • 28.90 req/s (5.49x faster)
  22. And the results • find(:all): 5.26 req/s • find(:all, :include): 7.70 req/s (1.4x) • find(:all, :select, :include): 15.15 req/s (2.88x) • find_by_sql(): 28.90 req/s (5.49x)
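The multipliers on this slide are each throughput figure divided by the find(:all) baseline. A short sketch of that arithmetic, using only the req/s numbers from the slide (the constant and method names are made up for illustration):

```ruby
# Speedup = requests per second / baseline requests per second.
BASELINE = 5.26 # find(:all), req/s

RESULTS = {
  "find(:all)"                    => 5.26,
  "find(:all, :include)"          => 7.70,
  "find(:all, :select, :include)" => 15.15,
  "find_by_sql()"                 => 28.90
}

def speedup(req_per_sec, baseline)
  (req_per_sec / baseline).round(2)
end

RESULTS.each do |label, rps|
  puts format("%-30s %6.2f req/s  %.2fx", label, rps, speedup(rps, BASELINE))
end
```

Rounded to two places, the :include figure works out to about 1.46x, which the slide shows as 1.4x.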
  23. Don’t forget indexes
      • 64000 orders
          OrderStatus.find(:all).each { |os| puts os.orders.count }
      • Avg 0.61 req/s with no indexes
      • EXPLAIN your SQL
          ALTER TABLE `xchain_test`.`orders` ADD INDEX order_status_idx(`order_status_id`);
      • Avg 23 req/s after index (37x improvement)
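As a loose analogy only (a real database index is a persistent on-disk structure, not a Ruby hash), the sketch below compares scanning every order for each status with looking orders up through a prebuilt grouping. `count_by_scan` and `count_by_index` are hypothetical helper names, and "rows examined" is a stand-in for the work the database does.

```ruby
# Without an index: every status lookup walks the whole orders list.
def count_by_scan(orders, status_ids)
  examined = 0
  counts = status_ids.map do |sid|
    matching = orders.count do |o|
      examined += 1
      o[:order_status_id] == sid
    end
    [sid, matching]
  end.to_h
  [counts, examined]
end

# With an "index": group the orders once, then each lookup is a hash fetch.
def count_by_index(orders, status_ids)
  index = orders.group_by { |o| o[:order_status_id] }
  counts = status_ids.map { |sid| [sid, index.fetch(sid, []).size] }.to_h
  [counts, orders.size] # each order is touched once, while building the index
end

orders = (1..12).map { |i| { id: i, order_status_id: (i % 3) + 1 } }
status_ids = [1, 2, 3]

scan_counts, scan_examined = count_by_scan(orders, status_ids)
index_counts, index_examined = count_by_index(orders, status_ids)

puts scan_counts == index_counts
puts "scan examined #{scan_examined} rows, index examined #{index_examined}"
```

The gap between the two "rows examined" figures grows with the number of statuses, which is the same shape of saving the slide reports after adding the index.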
  24. Avoid .count
      • It’s damned slow
          OrderStatus.find(:all).each { |os| puts os.orders.count }
      • Add column orders_count + update code
          OrderStatus.find(:all).each { |os| puts os.orders_count }
      ✓ 34 req/s vs 108 req/s (3x faster)
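The orders_count idea is to keep a stored counter up to date at write time instead of counting rows at read time. Rails offers a :counter_cache option on belongs_to for maintaining such a column; the plain-Ruby sketch below only illustrates the principle, and `StatusSketch` with its methods is a hypothetical class, not part of any library.

```ruby
# Plain-Ruby illustration of a cached counter (no Rails needed).
class StatusSketch
  attr_reader :orders_count

  def initialize
    @orders = []
    @orders_count = 0
  end

  # Writes pay a small cost to keep the counter in sync.
  def add_order(order)
    @orders << order
    @orders_count += 1
  end

  def remove_order(order)
    @orders_count -= 1 if @orders.delete(order)
  end

  # Stands in for "SELECT COUNT(*) FROM orders WHERE order_status_id = ?"
  def slow_count
    @orders.count
  end
end

status = StatusSketch.new
status.add_order(:order_a)
status.add_order(:order_b)
status.remove_order(:order_a)
puts status.orders_count # read is a plain attribute lookup
puts status.slow_count   # same answer, but recounts every time
```

The trade-off is that every code path that adds or removes orders has to go through the counter update, which is what "update code" on the slide refers to.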
  25. For the speed freaks • Merb - http://merbivore.com • 38.56 req/s - 7x performance improvement • Nearly identical code • Blazingly fast
  26. work.rowanhick.com The End