
From Publisher To Platform: How The Guardian Used Content, Search, and Open Source To Build a Powerful New Business Model


Last year The Guardian launched The Open Platform, a suite of services and tools that enable content partners and developers to build applications leveraging The Guardian's rich content.

This talk will cover how The Guardian opened up their content, enriched it, and reached new markets with its platform strategy.

We cover the background platform strategy, technical architecture, implementation of Solr, and how the new release of the Guardian's Open Platform, launched May 20th, 2010, has embraced disruption in the media space, while at the same time accelerating revenue.


  1. From publisher to platform: How the Guardian used content, search, and open source to build a powerful new business model. Stephen Dunn, Guardian News and Media. Apache Lucene EuroCon, 21 May 2010
  2. The publishing era
  3. We started a long time ago:
  4. “To secure the financial and editorial independence of the Guardian in perpetuity.” “To promote freedom in the press and liberal journalism globally.”
  5. 2010
  6. 2010 Keyword page Live blogs iPhone app Mobile site Twitter updates Swine flu Comment Content partnerships Newspapers Audio Video Data API
  7. 1996
  8. 1999
  10. 01-> 06
  11. 2009: 1.5M pages and counting; 250M+ pages/month; 30M visitors/month; 4x Webby award winner (best newspaper site)
  15. Part of the Web
  16. 1. Permanent http://www.flickr.com/photos/fstorr/ • “A cool URI is one that does not change” Tim Berners-Lee 1998 • 1.5 million resources redirected to new scheme
  17. 2. Addressable ★ Resources are “about” something - ready for the social web. ★ We live in “the age of point-at-things” (Coates 2005)
  18. 3. Discoverable ★ Multiple routes to content ★ Tagging drives discovery
  26. The hackable guardian.co.uk http://www.guardian.co.uk/.... /technology/internet /technology/all /environment/climatechange+business/globaleconomy
  27. The hackable guardian.co.uk http://www.guardian.co.uk/.... /technology/internet/rss /technology/all/rss /environment/climatechange+business/globaleconomy/rss
  28. Results...
  32. Site traffic growth [chart: unique users, Sep 2005 to Jan 2009, y-axis 3,750,000 to 30,000,000; annotations: Pre-project, First release, Final release, 36M]
  33. However...
  34. 1 Billion+ Internet Users!
  38. “How I stopped worrying about my website and learned to love the whole Internet.” Matt McAlister
  39. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other digital platforms.
  43. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
  44. The Open Platform
  45. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Allow partners to build applications using Guardian content and services for other digital platforms.
  47. The suite of services enabling partners to build applications with the Guardian
  49. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  50. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. [Diagram: Your App Here! / REST API / Search engine / CMS / Guardian database]
  52. Stamen Design - APIMaps.org
  54. DATA STORE: A directory of useful data curated by Guardian editors
  55. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
  57. Open for Business
  59. 1. 3 Tiers of access, 3 Revenue models. BESPOKE: Take, reformat, augment our content. Same access as Guardian. Revenue model to be negotiated. Combination of Media, Fees, Downloads. APPROVED: Take our full article content, with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue. KEYLESS: Take our headlines. You keep associated revenues.
  61. What this means. OPEN OUT: Developers can now access our full content APIs on demand, with keys post-approved. We are now positioning the platform as a place to do business with us. So rapid scalability, reliability, and performance are now core requirements.
  63. 2. Open In. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk.
  64. OPEN OUT: Allow partners to build applications using Guardian content and services for other digital platforms. OPEN IN: Bring in data and apps from the Internet.
  67. App showcase
  68. What this means. Open In: Partners can now more easily integrate into our core. The Open Platform will become key to our commercial future.
  69. Evolving the architecture
  70. From Publisher to Platform ★ Seeking massive growth, but no longer only broadcasting content ★ User/partner engagement & contribution on: journalism, data, software, applications, revenue and ads ★ Support developers and partners with data and APIs; need scalability, reliability, speed
  71. [Architecture diagram: Web servers (x3), App servers (x3), Memcached, Oracle, CMS]
  72. Why RDBMS? 5 years ago, fewer alternatives. Understand operations procedures. Can easily recruit DBAs / devs. Developer/ops tools. Business critical system: a safe choice. [Architecture diagram: Web servers (x3), App servers (x3), Memcached, Oracle, CMS, Data feeds]
  73. Scaling
  75. Unique Users [chart: Sep 2005 to Jan 2009, y-axis 3,750,000 to 30,000,000]
  77. Unique Users [chart: May 2008 to Jan 2009, y-axis 12,250,000 to 28,000,000]
  78. What's going on? ★ We tag our content (multifaceted) ★ Guardian.co.uk is a faceted browse through our tag-space, with editorial teams “spotlighting” key resources on selected nodes. ★ Can apply multiple facets in queries faster in a search-like architecture than in an RDBMS
  81. “Related content” from search engine
  83. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. [Diagram: Your App Here! / REST API / Search engine / CMS / Guardian database]
  85. We used Solr/Lucene. Can perform complex queries, including full text search. We can change the schema with no downtime. On our dataset most queries are of a similar cost. Scales very well horizontally. Replication makes it easy to work in the cloud.
  86. [Architecture diagram] Core: Web servers, App server, Memcached, RDBMS, CMS
  87. [Architecture diagram] Core: Web servers, App server, Memcached, RDBMS, CMS. Content API: multiple Solr instances (Cloud, EC2)
  88. Open in? MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk. Simple REST/HTTP framework allows lightweight development. Applications proxied for performance. Apps generally hosted in the cloud, hot deployment into production.
  90. [Architecture diagram] Core: Web servers, App server, Memcached, RDBMS, CMS. Apps: Proxy in front of multiple Apps (external hosting, app engine etc)
  91. [Architecture diagram] OPEN IN: Proxy, Apps (external hosting, app engine etc). Core: Web servers, App servers, Memcached, CMS, RDBMS. OPEN OUT: multiple Solr instances (Cloud, EC2)
  92. [Diagram, partly illegible: CONTENT / ???????]
  93. Thank you http://www.guardian.co.uk/open-platform Twitter: @openplatform @cuica (Stephen Dunn)
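
Slides 26 and 27 show the "hackable" URL scheme: a tag path such as /technology/internet, two tag paths joined with "+", and an optional /rss suffix for the feed form. The following is a minimal sketch of how such a path could be taken apart and rebuilt. It is an illustration only, not the Guardian's implementation; `TagQuery`, `parse_hackable_path` and `to_path` are hypothetical names, and the sketch assumes "+" and a trailing "/rss" are the only special markers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TagQuery:
    """Hypothetical model of one hackable URL path."""
    tags: Tuple[str, ...]  # tag paths combined with "+" in the URL
    rss: bool              # True when the path ends in "/rss"


def parse_hackable_path(path: str) -> TagQuery:
    """Split a path like '/a/b+c/d/rss' into its tag paths and feed flag."""
    trimmed = path.strip("/")
    rss = trimmed == "rss" or trimmed.endswith("/rss")
    if rss:
        # Drop the trailing "rss" segment and the slash before it.
        trimmed = trimmed[: -len("rss")].rstrip("/")
    tags = tuple(t for t in trimmed.split("+") if t)
    return TagQuery(tags=tags, rss=rss)


def to_path(query: TagQuery) -> str:
    """Rebuild the path form from a TagQuery (inverse of the parser)."""
    return "/" + "+".join(query.tags) + ("/rss" if query.rss else "")


if __name__ == "__main__":
    q = parse_hackable_path("/environment/climatechange+business/globaleconomy/rss")
    print(q.tags, q.rss)
    print(to_path(q))
```

Because the same tag list drives both the page and the feed, adding "/rss" to any path in this sketch only flips one flag, which mirrors the symmetry between slides 26 and 27.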

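Slide 78 says that applying multiple facets is faster in a search-like architecture than in an RDBMS. One way to see why is the inverted index that search engines such as Lucene are built on: each tag maps directly to the set of documents carrying it, so combining facets is a set intersection. The toy sketch below uses made-up article ids, reuses tag names from slide 26, and uses hypothetical function names; it illustrates the idea and makes no claim about Solr's internals or the Guardian's data.

```python
from collections import defaultdict


def build_tag_index(articles):
    """Map each tag to the set of article ids that carry it."""
    index = defaultdict(set)
    for article_id, tags in articles.items():
        for tag in tags:
            index[tag].add(article_id)
    return index


def faceted_query(index, required_tags):
    """Return ids of articles that carry every tag in required_tags."""
    postings = [index.get(tag, set()) for tag in required_tags]
    if not postings:
        return set()
    # Intersect starting from the smallest set to keep intermediate results small.
    postings.sort(key=len)
    result = set(postings[0])
    for ids in postings[1:]:
        result &= ids
    return result


def facet_counts(index, result_ids):
    """Count, per tag, how many of the matching articles carry that tag."""
    counts = {}
    for tag, ids in index.items():
        overlap = len(ids & result_ids)
        if overlap:
            counts[tag] = overlap
    return counts


# Made-up sample data for illustration only.
ARTICLES = {
    "a1": {"technology/internet", "business/globaleconomy"},
    "a2": {"environment/climatechange", "business/globaleconomy"},
    "a3": {"environment/climatechange"},
}

if __name__ == "__main__":
    idx = build_tag_index(ARTICLES)
    hits = faceted_query(idx, ["environment/climatechange", "business/globaleconomy"])
    print(hits)
    print(facet_counts(idx, hits))
```

The equivalent relational query would typically join a tagging table once per facet, whereas here each extra facet is one more set intersection.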
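
Slides 83 and 85 place a REST API in front of a Solr/Lucene search engine. As a hedged sketch of what a tag-filtered, faceted request to Solr can look like, the function below builds a select URL from standard Solr query parameters (`q`, `fq`, `facet`, `facet.field`, `rows`, `wt`). The base URL, the `tag` field name and the function name are assumptions made for illustration; the talk does not describe the Guardian's actual schema or endpoints.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_solr_select_url(base_url, text, tags, rows=10):
    """Build a Solr select URL: free-text query plus one filter query per tag.

    Assumes a hypothetical multi-valued field named "tag" and tag values
    that contain no double quotes.
    """
    params = [("q", text or "*:*")]  # "*:*" is Solr's match-all query
    for tag in tags:
        params.append(("fq", f'tag:"{tag}"'))  # each fq narrows the result set
    params += [
        ("facet", "true"),       # ask Solr to return facet counts
        ("facet.field", "tag"),  # counted over the assumed "tag" field
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr endpoint; no request is sent.
    url = build_solr_select_url(
        "http://localhost:8983/solr/select",
        "climate",
        ["environment/climatechange", "business/globaleconomy"],
        rows=5,
    )
    print(url)
    print(parse_qs(urlsplit(url).query)["fq"])
```

Keeping each tag in its own `fq` parameter, rather than folding everything into `q`, lets Solr cache the filters independently, which fits the slide's point that most queries on the dataset have a similar cost.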