Lessons learned while building Omroep.nl

Presentation given by @bartzon and @tieleman at the Ruby en Rails conference 2009.


  1. 1. Lessons learned while building omroep.nl Bart Zonneveld (@bartzon) Sjoerd Tieleman (@tieleman)
  2. 2. Nederlandse Publieke Omroep Dutch Public broadcasting Company AVRO Joodse Omroep NMO Teleac BNN KRO NOS TROS BOS LLiNK NPS VARA EO MAX OHM VPRO HUMAN NCRV RKK ZvK IKON NIO RVU
  3. 3. Rails sites • Beetlejuice • Radio 1 • Centrale • Z@PP navigatie • Omroep.nl • Z@ppelin • Nederland 1 • Zelda • Nederland 3 • Various tools
  4. 4. Team • 2 coders • 1 designer • 1 editor • 1 project manager 6 months, CMS built from scratch
  5. 5. Requirements • Handle 30,000-40,000 pageviews per day • Handle traffic spikes • Flexible, multi-user CMS • Loads of external data
  6. 6. Daily spread
  7. 7. Some numbers
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Controllers          |  1864 |  1535 |      41 |     185 |   4 |     6 |
      | Helpers              |   797 |   631 |       1 |      75 |  75 |     6 |
      | Models               |  1303 |  1055 |      40 |     153 |   3 |     4 |
      | Libraries            |   814 |   620 |      15 |      79 |   5 |     5 |
      | Integration tests    |     0 |     0 |       0 |       0 |   0 |     0 |
      | Functional tests     |     0 |     0 |       0 |       0 |   0 |     0 |
      | Unit tests           |     0 |     0 |       0 |       0 |   0 |     0 |
      | Model specs          |  1932 |  1573 |       0 |       9 |   0 |   172 |
      | View specs           |  7322 |  5950 |       0 |     153 |   0 |    36 |
      | Controller specs     |  7292 |  5846 |       0 |     175 |   0 |    31 |
      | Helper specs         |   900 |   676 |       0 |       2 |   0 |   336 |
      | Library specs        |    56 |    45 |       2 |      12 |   6 |     1 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Total                | 22280 | 17931 |      99 |     843 |   8 |    19 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      Code LOC: 3841     Test LOC: 14090     Code to Test Ratio: 1:3.7
  8. 8. Moar numbers • 410 Cucumber scenarios, 600 step definitions • 2235 RSpec specifications so it must be bug-free, right? ;-)
  9. 9. Tools • Ruby on Rails 2.3.4 • Apache 2.2 • SVN • MySQL 5 • Memcache • RSpec + Webrat + Cucumber • Paperclip • App monitoring: RPM, Hoptoad • Service monitoring: Nagios
  10. 10. Tools: app monitoring Hoptoad New Relic RPM
  11. 11. Architecture • Apache 2.2 with mod_proxy • Rails 2.3.4 running on Phusion Passenger 2.2.5 with REE • 4 hosts, each running 4 instances (per app) • Apdex: 1.0, avg. response time 40ms, 130 rpm, db load 0.6%
  12. 12. Servers • Quad-core Intel Xeon E542, 32 GB RAM • Fedora 8 • Other mumbo jumbo
  13. 13. Architecture (diagram): 2 front proxies, 4 application servers, database, memcache
  14. 14. Workflow • BDD • Shared behaviours • Performance testing • Staging and production environment
  15. 15. BDD • RSpec • Cucumber • (Webrat)
  16. 16. 3 slide intro to BDD: RSpec
      describe Article do
        it_should_behave_like "all objects with userstamps"
        it_should_behave_like "all objects that can be published"
        it_should_behave_like "all objects that have an url"
        it_should_behave_like "all objects that can be searched"
        it_should_behave_like "all objects with related articles"

        # @article and @valid_attributes are assumed to be set up in a
        # before block that is not shown on the slide.
        it "should not be valid without a name" do
          @article.attributes = @valid_attributes.except(:name)
          @article.should_not be_valid
        end

        it "should not be valid without contents" do
          @article.attributes = @valid_attributes.except(:contents)
          @article.should_not be_valid
        end
      end
  17. 17. 3 slide intro to BDD: Cucumber features
      Feature: Articles on the homepage
        As a visitor
        I want to view articles on the homepage
        So that I can see the latest content

        Scenario: 5 most recent articles
          Given there are 8 articles
          When I visit the homepage
          Then I should see the 5 last published articles
  18. 18. 3 slide intro to BDD: Cucumber steps
      Given "there are $num articles" do |num|
        num.to_i.times { create_article }
      end

      When "I visit the homepage" do
        visit root_path
      end

      Then "I should see the $num last published articles" do |num|
        Article.last_published(num).each do |article|
          response.should contain(article.title)
        end
      end
  19. 19. Shared behaviours • Tags • User stamps (created by, updated by) • Searching • “Related” articles • Publication timestamps (on/offline at)
  20. 20. Shared behaviours
      module UserStamps
        def self.included(klass)
          klass.instance_eval do
            include InstanceMethods
          end
        end

        module InstanceMethods
          def created_by
            User.find_by_id(creator_id)
          end

          def updated_by
            User.find_by_id(updater_id)
          end
        end
      end

      class Article < ActiveRecord::Base
        include Shared::UserStamps
        include Shared::Published
        include Shared::Url
        include Shared::Search
        include Shared::RelatedArticles
        # stuff
      end
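The mixin pattern on this slide can be tried outside Rails. The sketch below is a plain-Ruby approximation, not the project's code: the in-memory `User` class and the `Special` model with plain accessors are hypothetical stand-ins for the ActiveRecord models, and `klass.send(:include, ...)` replaces the slide's `instance_eval` form with the same effect.

```ruby
# Hypothetical in-memory stand-in for the ActiveRecord User model.
class User
  attr_reader :id, :name

  @records = {}

  def self.create(id, name)
    @records[id] = new(id, name)
  end

  def self.find_by_id(id)
    @records[id]
  end

  def initialize(id, name)
    @id = id
    @name = name
  end
end

module Shared
  module UserStamps
    # Runs whenever a class includes this module and pulls in the
    # instance methods defined below.
    def self.included(klass)
      klass.send(:include, InstanceMethods)
    end

    module InstanceMethods
      def created_by
        User.find_by_id(creator_id)
      end

      def updated_by
        User.find_by_id(updater_id)
      end
    end
  end
end

# Two unrelated models share the same behaviour through one include.
class Article
  include Shared::UserStamps
  attr_accessor :creator_id, :updater_id
end

class Special
  include Shared::UserStamps
  attr_accessor :creator_id, :updater_id
end
```

Because the behaviour lives in one module, a single shared example group can cover every model that includes it, which is what the `it_should_behave_like` lines in the RSpec slide rely on.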
  21. 21. Workflow: performance testing • ab, httperf, autobench, cURL • NewRelic RPM • Safari Web Inspector • http://railslab.newrelic.com/scaling-rails
  22. 22. Autobench
  23. 23. Challenges • Content Management System • Loads and loads and loads of external data
  24. 24. CMS • Articles, Themas, Specials, Subsites • Multiple feeds, images, links • Version control • Media database
  25. 25. CMS: Articles Subsite Thema Page Special Article Link Feed Image
  26. 26. CMS: Version control
  27. 27. Media DB • Implemented as REST app • To be used as REST service • Files, folders, crops
  28. 28. External data • RSS feeds • EPG data • Zelda • Babel • News/sport/teletekst • Twitter • Lots of custom XML formats
  29. 29. External data: XML/RSS • Empty feeds • Encodings are off (Windows-1252, ISO-8859-1, UTF-8) • “Custom” fields • Incorrect fields (dates, unescaped HTML) • Timeouts • Everything that can go wrong, will go wrong
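One way to cope with feeds whose encoding is off is to normalise every payload to UTF-8 before parsing. The sketch below is an illustration under stated assumptions, not the approach used for omroep.nl: `to_utf8` is a hypothetical helper, it assumes that anything which is not valid UTF-8 is Windows-1252 (which also covers the printable range of ISO-8859-1), and it relies on `String#encode` from Ruby 1.9 or later, whereas the deck's stack ran on the 1.8-based REE.

```ruby
# Returns a valid UTF-8 copy of the raw bytes of a feed.
# Assumption: input that is not valid UTF-8 is treated as Windows-1252.
def to_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Reinterpret the bytes as Windows-1252 and transcode; bytes that have no
  # mapping in that code page are replaced by "?" instead of raising.
  raw.dup
     .force_encoding(Encoding::Windows_1252)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")
end
```

A feed that is already UTF-8 passes through unchanged, so the helper can be applied to every feed without checking its declared encoding first.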
  30. 30. External data: Twitter
  31. 31. External data: EPG data Zelda don’t sue us Nintendo... please? :)
  32. 32. External data • Empty feeds • Retrieving the feed while someone is updating it • Required fields that are empty • DTD?
  33. 33. <!ELEMENT aflevering ( prid?, tite?, medium?, icon?, aankondiging?, inkl?, ingl?, infi?, inak?, inds?, inbb?, kykw?, orti?, aant?, land?, lcod?, psrt?, prem?, inh1?, afle?, atit?, inh2?, bron?, prij?, inh3?, mail?, webs?, inhk?, gids_tekst?, omroepen?, genres?, personen?, streams?, fragmenten?, serie? )>
  34. 34. <!ELEMENT inkl (#PCDATA)> <!ELEMENT ingl (#PCDATA)> <!ELEMENT intt (#PCDATA)> <!ELEMENT inhh (#PCDATA)> <!ELEMENT omro (#PCDATA)> <!ELEMENT lcod (#PCDATA)> <!ELEMENT herh (#PCDATA)> <!ELEMENT inds (#PCDATA)> <!ELEMENT infi (#PCDATA)> <!ELEMENT inbb (#PCDATA)> <!ELEMENT genr (#PCDATA)> <!ELEMENT kykw (#PCDATA)> <!ELEMENT afle (#PCDATA)>
  35. 35. Lessons learned • Cache the crap out of everything • Rescue everything • Test everything (frontend and backend)
  36. 36. Caching • Cache the homepage • Page cache → Fragment cache • Don’t cache forms • Cache as much as possible
  37. 37. Case: article views • Article is page cached • Update the number of views in realtime?
  38. 38. Use AJAX!
      <% javascript_tag do %>
        <%= remote_function :url => update_views_article_path(@article) %>
      <% end %>
  39. 39. Case: banner items
  40. 40. Case: banner items • Fast requests (<10ms) • ETags (304 Not Modified) • Static resource → page cache • Move to front proxy, frees up Rails cluster • 1100rpm → 130rpm • 20ms → 40ms • Average response time going up? Oh nooooes! (Only because the fast banner requests no longer pull the Rails average down.)
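The ETag step in the banner case can be sketched in plain Ruby. This is a minimal illustration of a conditional GET, not the project's implementation: `conditional_response` is a hypothetical function, and using an MD5 digest of the body as the ETag is an assumption.

```ruby
require "digest/md5"

# Returns [status, headers, body] for a payload.
# if_none_match is the value of the client's If-None-Match header, if any.
def conditional_response(body, if_none_match = nil)
  etag = %("#{Digest::MD5.hexdigest(body)}")
  headers = { "ETag" => etag }

  if if_none_match == etag
    # The client already holds this exact body: send no content.
    [304, headers, ""]
  else
    [200, headers, body]
  end
end
```

In a Rails controller the same check is usually delegated to the framework's conditional GET helpers (`stale?` and `fresh_when`) rather than written by hand.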
  41. 41. Caching external data • Don’t expire cache (preferably) • Explicitly overwrite • Update in background (feeeeeeeds) • memcache FTW!
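The "don't expire, explicitly overwrite" rule can be sketched as follows. This is an illustration under assumptions, not the project's code: `FeedCache` is a hypothetical class, a plain Hash stands in for the memcache client, and the block passed to `refresh` stands in for whatever fetches and parses the feed in the background.

```ruby
class FeedCache
  def initialize(store = {})
    @store = store # a Hash stands in for a memcache client here
  end

  # Meant to be called from a background job. The block fetches fresh items.
  # The cached value is only overwritten when the fetch returns something,
  # so an empty or failing feed never wipes out the last good copy.
  def refresh(key)
    items = yield
    @store[key] = items unless items.nil? || items.empty?
    read(key)
  rescue StandardError
    read(key)
  end

  # Web requests only ever read; they never trigger a fetch.
  def read(key)
    @store.fetch(key, [])
  end
end
```

With this shape a broken feed costs freshness rather than availability: visitors keep seeing the previous items until a later refresh succeeds.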
  42. 42. memcache • Escape your keys using CGI::escape • Max key length is 250 bytes • Max value size is 1MB
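The first two memcache limits can be handled in one small helper. The sketch below is an assumption-laden illustration: `memcache_key` is a hypothetical name, and falling back to a SHA1 digest for over-long keys is a common workaround that the slides do not mention.

```ruby
require "cgi"
require "digest/sha1"

MAX_KEY_LENGTH = 250

# Builds a key that memcache accepts: no whitespace or control characters,
# and never longer than 250 bytes.
def memcache_key(raw)
  escaped = CGI.escape(raw)
  return escaped if escaped.length <= MAX_KEY_LENGTH

  # Escaped key is too long: use a fixed-length digest of the raw key.
  "sha1:#{Digest::SHA1.hexdigest(raw)}"
end
```

Feed URLs are a typical input: they are the natural cache key for external data, and long query strings can push them past the length limit once escaped.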
  43. 43. Rescuing
      require 'open-uri'
      require 'rss'

      def self.get_feed_contents(url)
        content = ""
        open(url) { |s| content = s.read }
        RSS::Parser.parse(content, false).items
      rescue => e
        logger.warn "Feed #{url} raised error: #{e.message}"
        []
      rescue Timeout::Error => e
        logger.warn "Feed #{url} timed out: #{e.message}"
        []
      end
      Timeout::Error is an interrupt: on Ruby 1.8 it inherits from Interrupt, not StandardError, so a bare rescue does not catch it.
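The same "rescue everything" idea can be wrapped in a reusable helper. This is a sketch rather than the project's code: `with_fallback` is a hypothetical name and the 5 second default is an arbitrary assumption. `Timeout::Error` is listed explicitly so the rescue also works on Ruby 1.8, where a bare rescue would miss it.

```ruby
require "timeout"

# Runs the block with a time limit and returns `fallback` when the block
# raises or takes too long, so one broken feed cannot break a page.
def with_fallback(fallback, seconds = 5)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error, StandardError => e
  warn "external call failed: #{e.class}: #{e.message}"
  fallback
end
```

A call such as `with_fallback([]) { get_feed_contents(url) }` then always returns an array, whatever the remote side does.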
  44. 44. Testing • rcov • Refactor your tests • Peer reviews, external audits • Run specs/features (continuously) in parallel (your Cucumber is too slooooow!)
  45. 45. Cucumber salad
      num_of_processes.times do |count|
        pids << Process.fork do
          setup_database(conn, count)
          Cucumber::Cli::Main.execute(
            ["-f", "progress", "-l", "nl", "-r", "features"] + feature_sets[count]
          )
        end
      end
      MacBook Pro “Regular” Mac Pro (8) (4) 12:12 4:34 2:12
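The slide does not show how `feature_sets` is built. One simple possibility, given here as an assumption and not as the authors' method, is to deal the feature files out round-robin over the processes; `split_features` is a hypothetical helper.

```ruby
# Distributes feature files over num_of_processes roughly equal sets.
# Sorting first keeps the split stable between runs.
def split_features(files, num_of_processes)
  sets = Array.new(num_of_processes) { [] }
  files.sort.each_with_index do |file, index|
    sets[index % num_of_processes] << file
  end
  sets
end
```

With something like `feature_sets = split_features(Dir["features/**/*.feature"], num_of_processes)`, each forked process receives its own slice. Round-robin ignores how long each feature takes, so balancing on recorded run times would be a natural refinement.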
  46. 46. Conclusions • Test • Optimize • Monitor
  47. 47. What’s next for us? • Building a high-performance backend • Uitzending Gemist statistics API • 250+ reqs/s at minimum
  48. 48. @questions.any?
