Enhance your APDEX.. naturally!


Common performance problems + solutions for medium Rails sites

Published in: Technology
  1. ENHANCE YOUR APDEX.. NATURALLY! Proven methods to enhance your Rails performance when traffic increases X times. Vlad ZLOTEANU (@vladzloteanu), Software Engineer, Dimelo. #ParisRB, March 6, 2002. Copyright Dimelo SA
  2. Be Warned! Surprise coming up… at the end of this talk! ;)
  3. Dimelo
     Software vendor: SaaS platforms, social media CR
     • Frontend platforms: collaborative, 'forum/SO-like', white-labeled platforms for big accounts (a kind of GetSatisfaction / UserVoice)
     • Backend product: SocialAPIs, a kind of TweetDeck, but multi-channel and designed for multiple users/teams
  4. Technical details (frontend product)
     • 30+ dynamic (Rails) req/s on average (web + API)
     • Peaks of 80 req/s
     • 2M+ dynamic requests / day
     • 700k+ unique visitors / day
  5. Tools
     • Load/stress tests: ab, httperf, siege
     • htop, iftop, mytop
     • passenger-status, passenger-memory-stats
     • MySQL: EXPLAIN query, SHOW PROCESSLIST
     • Application logs
     • New Relic
  6. Demo env
     • Rails 3.2
     • REE + Passenger + Apache (3 workers)
     • MySQL 5.5.x + InnoDB tables
     • OS X Lion, MacBook Pro 2010, 8 GB RAM

     class Post < ActiveRecord::Base # 500K posts
       belongs_to :author
       has_and_belongs_to_many :categories
       # state -> [ moderation, published, answered ]
       …
     end

     class Category < ActiveRecord::Base
       has_and_belongs_to_many :posts
     end
  7. A. External services: timeouts [DEMO]
     # EventMachine app on port 8081
     operation = proc do
       sleep 2 # simulate a long-running request
       resp.status = 200
       resp.content = "Hello World!"
     end
     EM.defer(operation, callback)

     # AggregatesController on main site (port 8080)
     uri = URI('…') # address of the EventMachine app above
     http = Net::HTTP.new(uri.host, uri.port)
     @rss_feeds = http.get("/").body
  8. A. External services: timeouts
     Problem
     • Page depends on an external resource (e.g. RSS, Twitter API, FB API, auth servers, …)
     • External resource responds very slowly, or the connection hangs
     • In Ruby, Net::HTTP's default read timeout is 60s!
     • Ruby 1.8: the Timeout library is not reliable
     Solution
     • Move it to a background request
     • Put timeouts EVERYWHERE!
     • Enable timeouts on API clients
     • Cache parts that involve external resources
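
The "put timeouts everywhere" advice from this slide can be sketched with Net::HTTP's own `open_timeout` and `read_timeout` settings. This is a minimal sketch, not the code from the talk: `fetch_with_timeout`, its keyword arguments and the `fallback` value are hypothetical names chosen for illustration.

```ruby
require "net/http"
require "uri"

# Fetch a URL with explicit connect and read timeouts.
# Returns the response body, or `fallback` when the remote service is too slow.
# (fetch_with_timeout and its parameters are hypothetical, for illustration only.)
def fetch_with_timeout(url, open_timeout: 1, read_timeout: 2, fallback: nil)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = open_timeout # seconds allowed to open the connection
  http.read_timeout = read_timeout # seconds allowed to wait for each read
  # Do not silently retry the GET after a timeout (setter exists on Ruby 2.5+).
  http.max_retries = 0 if http.respond_to?(:max_retries=)
  http.get(uri.request_uri).body
rescue Net::OpenTimeout, Net::ReadTimeout
  fallback
end
```

Only timeouts are rescued here; other failures such as a refused connection still raise, so the caller can log them separately. A controller could pass the last cached copy of the feed as `fallback`.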
  9. A. Internal services (2)
     Problem
     • Same conditions, but this time two services from the same server/application call each other
     Solution
     • Same solutions as for external services, but note the added risk of deadlock!
  10. B. DB: Queries containing 'OR' conditions [Demo]
      # Request: also list my posts (the current user's posts), even if they are not published
      # Current index: on [state, created_at]
      @posts.where("state = :state OR author_id = :author_id",
                   {:state => 'published', :author_id => params[:author_id]})
  11. B. DB: Queries containing 'OR' conditions
      Problem
      • Queries containing "OR" conditions, e.g. 'visible_or_mine' (state = published OR author_id = 42)
      • .. make an index on [a, b, c] unusable for (a OR condition) AND b AND c
      Solution
      • Don't use it!
      • Cache the result
      • Put the index only on the sort column
      • For (a OR cond) AND b AND c, put the index on [b, c]
  12. C. Filtering on HABTM relations [Demo]
      # Request: filter by one (or more) categories
      # Model
      @posts = @posts.joins(:categories).
        where(:categories => {:id => params[:having_categories]})

      # OR: create a join model, use only one join
      # Model
      has_many :post_categorizations
      has_many :categories, :through => :post_categorizations
      # Controller
      @posts.joins(:post_categorizations).
        where(:post_categorizations => {:category_id => params[:having_categories]})
  13. C. Filtering on HABTM relations
      Problem
      • Filtering on HABTM relations creates a double join
      • .. and double joins are (usually) expensive
      Solution
      • Rewrite double joins
      • Use an intermediary model
      • Join on the intermediary model
  14. D. DB: Pagination/count on large tables [Demo]
      # Nothing fancy, just implement pagination
      # Controller
      @posts = @posts.paginate(:page => params[:page]).
        order('posts.created_at desc')
      # View
      <%= will_paginate @posts %>
  15. D. DB: Pagination/count on large tables
      Problem
      • Count queries are expensive on large tables
      • Each time pagination is displayed, a count query is run
      • Displaying a distant page (i.e. using a big OFFSET) is very expensive
      • MyISAM: counts LOCK the TABLE!
  16. D. DB: Pagination/count on large tables (2)
      Solution
      • Cache the count result
      • .. and don't display 'last' pages
      • Limit the count:
        SELECT COUNT(*) FROM a_table WHERE some_conditions
        → SELECT COUNT(*) FROM (SELECT 1 FROM a_table WHERE some_conditions LIMIT x) t;
      • Drop the isolation: NOLOCK / READ UNCOMMITTED
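
The "limit count" rewrite from this slide can be wrapped in a small helper so that the pager never asks for more than a fixed number of rows. The following is a sketch under stated assumptions: `bounded_count_sql` and `page_info` are hypothetical helpers, not part of will_paginate or ActiveRecord, and the table name and conditions are assumed to come from trusted code, never from user input.

```ruby
# Build a COUNT query that stops scanning after `cap` matching rows.
# (bounded_count_sql is a hypothetical helper; `table` and `conditions`
# must be trusted strings, they are interpolated as-is.)
def bounded_count_sql(table, conditions, cap)
  cap = Integer(cap) # refuse anything that is not a whole number
  "SELECT COUNT(*) FROM (SELECT 1 FROM #{table} WHERE #{conditions} LIMIT #{cap}) t"
end

# Turn a bounded count into pager data. `capped` tells the view that the real
# total may be larger, so it should not render links to the 'last' pages.
def page_info(bounded_count, cap, per_page)
  { pages: (bounded_count.to_f / per_page).ceil, capped: bounded_count >= cap }
end

puts bounded_count_sql("posts", "state = 'published'", 1000)
```

With a cap of 1000 rows and 30 posts per page, the view shows at most 34 pages and can display "more results" instead of an exact total once `capped` is true.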
  17. E. Fragment caching: Thundering herd
      # Let's implement time-expired fragment caching for displaying the previous page
      # (no pagination optimisations were enabled)
      <% cache_key = ("posts::" + Digest::MD5.hexdigest(params.inspect))
         cache cache_key, :expires_in => 20.seconds do %>
        <h1>Posts#index</h1>
        ….
      <% end %>
  18. E. Fragment caching: Thundering herd
      Problem
      • Using: fragment cache, time-expired
      • Cache for a resource-intensive page expires → multiple processes try to recalculate the key
      • Effects: resource consumption peaks, Passenger worker pool starvation
      [Timeline diagram over t: cache validity; cache unavailable during cache computation]
  19. E. Fragment caching: Thundering herd (2)
  20. E. Fragment caching: Thundering herd (3)
      • Backgrounded calculation/sweeping is hard/messy/buggy
      Solution
      • Before the expiration time is reached (t - delta), obtain a lock and trigger cache recalculation
      • The next processes won't obtain the lock and will serve the still-valid cache
      • Rails 2:
      • Rails 3: implemented.
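
The lock-before-expiry idea on this slide can be illustrated with a small in-memory cache. This is a sketch of the technique, not the Rails implementation: `EarlyRefreshCache`, its `fetch` signature and the injectable `clock` are all hypothetical, and a real deployment would keep both the entries and the lock in a shared store such as memcached rather than in one process. Rails 3's cache store exposes a related `:race_condition_ttl` option on `fetch`.

```ruby
# In-memory cache that lets exactly one caller recompute a value shortly
# before it expires, while everyone else keeps serving the still-valid copy.
# (EarlyRefreshCache is a hypothetical class written for this illustration.)
class EarlyRefreshCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Time.now.to_f })
    @entries = {}
    @refreshing = {}
    @mutex = Mutex.new
    @clock = clock
  end

  def fetch(key, ttl:, delta:)
    now = @clock.call
    entry = nil
    won_lock = false
    @mutex.synchronize do
      entry = @entries[key]
      # Fresh hit: more than `delta` seconds of validity left.
      return entry.value if entry && now < entry.expires_at - delta
      unless @refreshing[key]
        @refreshing[key] = true # this caller takes the refresh lock
        won_lock = true
      end
    end

    unless won_lock
      # Someone else is recomputing: serve the still-valid value if there is one.
      return entry.value if entry && now < entry.expires_at
      return yield # nothing valid to serve: compute without caching
    end

    begin
      value = yield # the expensive computation runs outside the mutex
      @mutex.synchronize { @entries[key] = Entry.new(value, @clock.call + ttl) }
      value
    ensure
      @mutex.synchronize { @refreshing.delete(key) }
    end
  end
end
```

Note that this sketch only protects a warm cache: when no valid entry exists at all, callers that lose the lock still compute the value themselves, so a cold start can produce a herd.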
  21. F. API and Web on same server
      Problem
      • API and Web don't have complementary usage patterns
      • Web slows down APIs, which should respond fast
      • APIs are much more prone to peaks
      • Worker thread starvation
      Solution
      • Put API and Web on different servers
      • Log/throttle API calls
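
One simple way to throttle API calls, as this slide suggests, is a fixed-window counter per client. The sketch below is an assumption about how such a throttle could look, not what Dimelo ran: `ApiThrottle`, `allow?` and the `client_id` key are hypothetical, and counters live in process memory, so with several Passenger workers they would need to move to a shared store.

```ruby
# Fixed-window rate limiter: each client may make `limit` calls per `window` seconds.
# (ApiThrottle is a hypothetical class written for this illustration.)
class ApiThrottle
  def initialize(limit:, window:, clock: -> { Time.now.to_f })
    @limit = limit
    @window = window
    @clock = clock
    @counters = Hash.new(0)
    @mutex = Mutex.new
  end

  # True while the client is still under its quota for the current window.
  def allow?(client_id)
    bucket = (@clock.call / @window).floor
    @mutex.synchronize do
      # Drop counters that belong to past windows so the hash stays small.
      @counters.delete_if { |(_client, b), _count| b < bucket }
      count = (@counters[[client_id, bucket]] += 1)
      count <= @limit
    end
  end
end
```

A before-filter or Rack middleware could call `allow?` with the API key and answer with an error status when it returns false, logging the rejected call at the same time.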
  22. G. API: dynamic queries
      Problem
      • REST APIs usually expose a proxy to your DB
      • Clients can make any available combination of filter + sort
      • And because they can, they will.
      Solution
      • Don't give them too many options
      • Use one DB per client (prepare to shard per client); you will then be able to add custom indexes
      • Log/throttle API calls
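
"Don't give them too many options" can be enforced with an explicit whitelist, so that every filter and sort combination the API accepts is one you have an index for. This is a minimal sketch: `ALLOWED_FILTERS`, `ALLOWED_SORTS` and `sanitize_query` are hypothetical names, and the column names are borrowed from the Post model shown earlier in the deck.

```ruby
# Filters and sort orders the API is willing to serve (hypothetical lists).
ALLOWED_FILTERS = %w[state author_id].freeze
ALLOWED_SORTS = {
  "newest" => "created_at DESC",
  "oldest" => "created_at ASC"
}.freeze

# Keep only whitelisted filters and map the requested sort to a known ORDER BY.
# Unknown sorts fall back to "newest"; raw client input never reaches the SQL.
def sanitize_query(params)
  filters = params.select { |key, _value| ALLOWED_FILTERS.include?(key.to_s) }
  order = ALLOWED_SORTS.fetch(params["sort"].to_s, ALLOWED_SORTS["newest"])
  { filters: filters, order: order }
end
```

Mapping sort names to fixed SQL fragments also means a client can never inject an arbitrary expression into the ORDER BY clause.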
  23. H. Ruby GC: not adapted for Web frameworks
      Problem
      • Ruby GC is not optimized for large web frameworks
      • On medium Rails apps, ~50% of time can be spent in GC
      Solution
      • Use REE or Ruby 1.9.3
      • Activate GC.stats & benchmark your app
      • Tweak GC params: trade memory for CPU
      • Previous conf: 40%+ speed for 20%+ memory
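
To check how much time an app spends in GC before tweaking anything, the share can be measured around a block of work. This sketch uses `GC::Profiler` and `GC.count` from current MRI rather than the REE-specific statistics the slide refers to; `gc_share` is a hypothetical helper and the workload at the bottom is only a stand-in for a real request.

```ruby
# Run a block and report how much of its wall-clock time was spent in GC.
# (gc_share is a hypothetical helper; it relies on MRI's GC::Profiler.)
def gc_share
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  gc_seconds = GC::Profiler.total_time # seconds recorded by the profiler
  {
    gc_runs: GC.count - runs_before,
    gc_seconds: gc_seconds,
    total_seconds: elapsed,
    gc_ratio: elapsed.zero? ? 0.0 : gc_seconds / elapsed
  }
ensure
  GC::Profiler.disable
end

# Stand-in workload: allocate many short strings, as rendering a page would.
report = gc_share { 200_000.times.map { |i| "post-#{i}" } }
puts report.inspect
```

Running the same measurement before and after changing GC settings gives the kind of speed-versus-memory comparison the slide mentions.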
  24. I. Other recommendations
      Solution
      • Use MyISAM (on MySQL) unless you really need transactions
      • Design your models with sharding in mind (and shard the DB when it becomes the bottleneck)
      • Perf refactoring: improve where it matters
      • Benchmark (before and after)
      • Btw.. don't run perf tests on OS X :P
  26. The Dimelo Contest is back! Write a Rack middleware that determines the URLs accessed through Rack and computes the number of unique visitors, in real time, aggregated over the last 5 minutes.
  27. The prize!
  29. .end Thank you! ?