How The Guardian Embraced the Internet using Content, Search, and Open Source

This talk covers how The Guardian opened up its business, enriched it, and reached new markets with its Open Platform strategy. Stephen covers the technical architecture, the implementation of Solr (the key technology powering the platform), and how The Guardian has used it to embrace disruption in the media space while finding new sources of revenue and innovation.

Published in: Technology, News & Politics

  1. 1. From publisher to platform: How the Guardian embraced the internet using content, search, and Open Source. Stephen Dunn, Guardian News and Media, stephen.dunn@guardian.co.uk, 25th May, 2011. Twitter: @cuica, @openplatform. Thursday, 26 May 2011
  2. 2. From publisher to platform: How the Guardian embraced the Internet using content, search, and Open Source. Stephen Dunn, Guardian News and Media
  3. 3. The publishing era
  4. 4. We started a long time ago:
  5. 5. Keyword page, Live blogs, Apps, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Open platform API
  6. 6. To secure the financial and editorial independence of the Guardian in perpetuity. To promote freedom in the press and liberal journalism globally. To become the world's leading liberal voice.
  7. 7. Open Web Principles
  8. 8. 2009
  9. 9. 1. Permanent (photo: http://www.flickr.com/photos/fstorr/) • “A cool URI is one that does not change” (Tim Berners-Lee, 1998) • 1.5 million resources redirected to new scheme
  10. 10. 2. Addressable ★ Resources are “about” something: ready for the social web. ★ We live in “the age of point-at-things” (Coates 2005)
  11. 11. 3. Discoverable ★ Multiple routes to content ★ Tagging drives discovery
  12. 12. 4. Open
  13. 13. Example: The Hackable Guardian. http://www.guardian.co.uk/.... • /technology/internet/rss • /technology/all/rss • /environment/climatechange+business/globaleconomy/rss
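The URL scheme on slide 13 lends itself to a small helper. The following is a minimal Python sketch, assuming (the slide shows only three examples, so this is an inference) that a feed address is formed by joining one or more tag paths with `+` and appending `/rss`. The names `feed_url` and `BASE_URL` are hypothetical and are not taken from the talk.

```python
BASE_URL = "http://www.guardian.co.uk"  # host as shown on the slide


def feed_url(*tag_paths: str) -> str:
    """Build an RSS address from one or more tag paths.

    Assumption: several tags are combined with '+' and the feed is
    selected by the trailing '/rss', as in the slide's examples.
    """
    if not tag_paths:
        raise ValueError("at least one tag path is required")
    # Tolerate leading or trailing slashes in the caller's input.
    cleaned = [path.strip("/") for path in tag_paths]
    return f"{BASE_URL}/{'+'.join(cleaned)}/rss"
```

Calling `feed_url("environment/climatechange", "business/globaleconomy")` reproduces the combined-tag address shown on the slide.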
  14. 14. Results...
  15. 15. Site traffic growth [chart: unique users, Sep 2005 to Dec 2008, y-axis up to 30,000,000; labels: Pre-project, First release, Final release, 40M]
  16. 16. However...
  17. 17. 1 Billion+ Internet Users!
  21. 21. ...“How I stopped worrying about my website and learned to love the whole internet.” Matt McAlister
  22. 22. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  24. 24. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
  34. 34. Jack Shenker: “The Guardian alongside Al Jazeera was the one news source that everybody on the streets in Tahrir - not just in Cairo but in surrounding cities and major centers of revolutionary activity - that people were talking about.”
  35. 35. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  36. 36. The Open Platform
  37. 37. The suite of services enabling partners to build applications with the Guardian
  38. 38. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  39. 39. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  40. 40. Mutualised news!
  41. 41. Mutualised news!
  42. 42. Mutualised news!
  47. 47. DATA STORE: A directory of useful data curated by Guardian editors
  48. 48. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
  49. 49. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
  50. 50. <OBLIGATORY DOGFOOD SLIDE>
  56. 56. Open for Business
  57. 57. 3 tiers of access, 3 revenue models • Keyless: Take our headlines. You keep associated revenues. • Approved: Take our full article content, but with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue. • Bespoke: Take, reformat, augment our content. Revenue model to be negotiated. Combination of Media, Fees, Downloads.
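The three tiers on slide 57 can be written down as a small lookup table. This is only an illustrative Python sketch of the slide's wording: the type names `Tier` and `AccessPolicy`, the field names, and the helper `may_serve_full_body` are hypothetical, and the slide does not say whether bespoke content carries a Guardian advert, so that value is left as `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    KEYLESS = "keyless"
    APPROVED = "approved"
    BESPOKE = "bespoke"


@dataclass(frozen=True)
class AccessPolicy:
    content: str                     # what a partner may take
    guardian_advert: Optional[bool]  # None means "not stated on the slide"
    revenue: str                     # who keeps which revenue


# Values paraphrase slide 57; nothing here is an official API contract.
POLICIES = {
    Tier.KEYLESS: AccessPolicy(
        "headlines", False, "partner keeps associated revenues"),
    Tier.APPROVED: AccessPolicy(
        "full article content", True,
        "Guardian keeps ad revenue; partner keeps rest-of-page revenue"),
    Tier.BESPOKE: AccessPolicy(
        "content that may be reformatted and augmented", None,
        "negotiated: combination of media, fees, downloads"),
}


def may_serve_full_body(tier: Tier) -> bool:
    """Per the slide, only the keyless tier is limited to headlines."""
    return tier is not Tier.KEYLESS
```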
  59. 59. What this means • Open Out: Developers can now access full content APIs on demand, with keys post-approved • Platform is positioned as a place to do business • So rapid scalability, reliability and performance are now core requirements
  60. 60. OPEN IN: Bring in data and apps from the internet. OPEN OUT: Allow partners to build applications using Guardian content and services for other platforms.
  61. 61. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk • Simple REST/HTTP framework allows lightweight development • Applications proxied for performance • Apps generally hosted in the cloud, allows hot deployment into production
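Slide 61 says microapps are plain HTTP services that are proxied for performance. The slides do not describe how the proxy works, so the following Python sketch shows just one plausible reading: a proxy that fetches a fragment over HTTP and serves it from a short-lived in-memory cache. `FragmentProxy`, `http_fetch`, and the TTL value are all hypothetical names and choices.

```python
import time
import urllib.request
from typing import Callable, Dict, Tuple


def http_fetch(url: str) -> str:
    """Fetch a microapp fragment over plain HTTP GET."""
    with urllib.request.urlopen(url, timeout=2.0) as response:
        return response.read().decode("utf-8")


class FragmentProxy:
    """Serve microapp fragments, caching each one for a short time."""

    def __init__(self,
                 fetch: Callable[[str], str] = http_fetch,
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]          # still fresh: skip the remote call
        body = self._fetch(url)       # missing or stale: go to the microapp
        self._cache[url] = (now, body)
        return body
```

Passing `fetch` and `clock` as parameters keeps the sketch testable without a network connection; a production proxy would also need error handling and cache size limits.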
  62. 62. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk
  63. 63. • What could I cook?
  64. 64. Bringing it together
  66. 66. App showcase
  67. 67. From publisher to platform • Seeking massive growth, but no longer only broadcasting content on the website • User/partner engagement & contribution on journalism, data, software applications, revenue and ads • Support developers and partners with data and APIs; need scalability, reliability, speed
  68. 68. Evolving the architecture
  69. 69. [diagram: three web servers, three app servers, Memcached (added later), Oracle, CMS]
  70. 70. Why RDBMS? • 5 years ago, fewer alternatives • Understand operations procedures • Can easily recruit DBAs / devs • Developer/ops tools • Business critical system: a safe choice [diagram: three web servers, three app servers, Memcached, Oracle, CMS]
  71. 71. Scaling traffic [chart: unique users, Sep 2005 to Sep 2008, y-axis up to 30,000,000]
  78. 78. We chose Solr/Lucene • Can perform complex queries, including full-text search • We can change the schema with no downtime • Most queries are of similar cost • Scales very well horizontally • “Just worked” in the cloud • No strange control processes/engines • Developers just loved working with it!
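To make "complex queries, including full-text search" on slide 78 concrete, here is a minimal Python sketch that builds a request URL for Solr's standard select handler. `q`, `fq`, `rows` and `wt` are standard Solr query parameters; the function name `solr_select_url`, the field name `tag`, and the base URL used in the test are assumptions, since the slides do not show the Guardian's schema or endpoints.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, text: str, tags=(), rows: int = 10) -> str:
    """Build a Solr select URL: full-text query plus one filter per tag.

    Assumption: documents carry a 'tag' field. Tag values are quoted but
    not otherwise escaped, so they must not contain double quotes.
    """
    params = [("q", text)]
    # Each fq narrows the result set without affecting relevance scoring.
    params += [("fq", f'tag:"{tag}"') for tag in tags]
    params += [("rows", str(rows)), ("wt", "json")]
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"
```

Filter queries suit tag restrictions because Solr caches them independently of the main query.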
  80. 80. [diagram: web servers, app server, Memcached, RDBMS and CMS; API backed by six Solr instances in the cloud (EC2)]
  81. 81. What about Open In? OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  82. 82. [diagram: web servers, app server, Memcached, RDBMS and CMS; a proxy in front of apps on external hosting (app engine etc.)]
  83. 83. [diagram in three columns. Core: web servers, app server, Memcached, CMS, RDBMS. Out: Solr instances in the cloud (EC2). In: a proxy in front of apps on external hosting (app engine etc.)]
