The Guardian embraced the internet by developing an open platform and open web principles. It moved from being solely a publisher to also being a platform, opening up its content through APIs to allow third-party developers to build applications. This helped drive significant traffic growth. To support its platform ambitions and developer partners, the Guardian evolved its technical architecture to be more scalable, reliable and high performing, adopting technologies like Solr, Memcached and cloud hosting.
The document discusses innovation in education and society through technology. It covers several topics including Moore's Law and the development of the World Wide Web. Live tracking of sporting events is presented as an example of an innovative educational project combining GPS, the internet and mobile technologies. Challenges to traditional media are discussed as well as the need for media innovation through event-based projects. One such project involved live coverage of a UEFA U21 tournament through streaming, social media and mobile journalists.
The document discusses how Atlassian improved their software development processes over time to increase developer productivity and the speed at which they can build and release new features. Initially, growing complexity from technical debt slowed their development. They then instituted measures like reducing build times, improving testing, prioritizing technical debt work, and "dogfooding" their own products to streamline their processes. These changes helped boost innovation and lower the effort required for new features.
The document discusses a project called Weather for Schools that aims to involve students in scientific weather monitoring projects. The project provides weather sensor kits for schools to collect local weather data and upload it to a central portal. Students and teachers can access all of the weather data collected and use it for further analysis. The goal is to help teach subjects like computer science, geography, mathematics and physics while having students engage in real-world data collection and analysis. Over 150 schools and 6000 students have participated since the project started in 2006.
A presentation on the evolution of the custom publisher, moving from a core of print publishing to an all-purpose content agency specializing in listening and content marketing.
Alastair Dant, lead interactive technologist, the Guardian (joelgunter)
This document discusses how news websites can use interactive content like galleries, slideshows, timelines, maps, charts and graphics. It provides examples of how interactivity assists in data visualization for election results and replaying social activity from the World Cup. The document also discusses building interactives without Flash and options for adding interactive visuals to websites using tools from Google, Tableau and Dipity.
Alastair Dant, lead interactive technologist, the Guardian [pdf] (joelgunter)
This document discusses how news websites can use interactive content like galleries, slideshows, timelines, maps, charts and graphics. It provides examples of how interactivity assists in data visualization for election results and replaying social activity from the World Cup. The document also discusses building interactives without Flash by using dynamic graphics, audiovisual integration, multi-user support, authoring tools, packaging and cross-browser compatibility. Finally, it recommends three easy ways to add interactive visuals using Fusion Tables from Google, Tableau to embed charts, and Dipity to create timelines.
The document discusses using Drupal to build social websites and applications. It notes that Drupal has "social" features built into its core, like user profiles, commenting, and taxonomy. There are also many third-party modules that can add additional social functionality for features like sharing, Twitter integration, Facebook connectivity, dashboards, and feeds. While Drupal is flexible and extensible for building various social solutions, the speaker cautions that one needs to consider their goals, resources, and technical requirements to determine if Drupal is realistic for the project.
In 2010 Panasonic decided to replace their legacy enterprise search tool and switched the search for all their European websites to an Apache Solr-based solution.
Their customers now benefit from an incredibly fast, feature-rich solution that is much more than just search and has become a valuable sales-driving tool for Panasonic. Features like relevancy manipulation, autosuggest, and contextual filtering on properties such as color or product category were implemented under less-than-ideal circumstances, chiefly the lack of access to structured data. The search has been rolled out in close to 30 countries so far, also putting Solr's multilingual handling to the test.
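A small sketch of how such contextual filtering and autosuggest requests might be assembled against Solr's HTTP API. The endpoint path, core name, and field names here are assumptions for illustration, not Panasonic's actual schema:

```python
from urllib.parse import urlencode

def build_product_search(term, color=None, category=None, suggest=False):
    """Build Solr query parameters for a hypothetical product search.

    Contextual filters go into fq (filter query) parameters so Solr can
    cache them independently of the main query; autosuggest is sketched
    as a simple prefix query against an assumed 'name' field.
    """
    params = [("q", f"name:{term}*" if suggest else term), ("wt", "json")]
    if color:
        params.append(("fq", f"color:{color}"))
    if category:
        params.append(("fq", f"category:{category}"))
    return "/solr/products/select?" + urlencode(params)

print(build_product_search("lum", color="black", suggest=True))
```

Putting each filter in its own fq clause (rather than folding it into q) lets Solr's filterCache reuse the filter across queries, which matters at the traffic levels described above.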
This document summarizes a conference workshop about knowledge sharing. It discusses how knowledge hubs can facilitate sharing through communities of practice with over 80,000 members and 75,000 monthly visits. The workshop highlighted examples of collaborative learning through social media like Twitter chats and Local-pedia wikis. It emphasized capturing learning from conferences and distributing guidance through online channels to more broadly disseminate information.
An introduction to the core concepts of open science and Science 2.0 for informatics grad students. Originally presented Feb. 18, 2010, at the University of Pittsburgh.
You’ve used all the server-side caching tricks in the book: memcache, APC, database caching and so on to squeeze out every millisecond, and now your site is as fast as it will ever get. Well, guess again!
Those technologies cache and generate the HTML, which, even when done correctly, accounts for only 10–20% of the user response time, so there is a lot of room for improvement. Learn how to optimize your JavaScript, CSS, images, cookies and a whole slew of other things that make frontend caching a magical place.
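Much of the frontend win comes down to HTTP caching policy for static assets. A minimal sketch of such a policy; the extensions, max-age value, and fingerprinting convention are illustrative assumptions, not taken from the talk:

```python
def cache_headers(path):
    """Return HTTP caching headers for a response, as a dict.

    Static assets (JS/CSS/images/fonts) get a far-future max-age so
    browsers and CDNs can serve them without revalidation; HTML, which
    changes often, is revalidated on every request.
    """
    static_exts = (".js", ".css", ".png", ".jpg", ".gif", ".woff2")
    if path.endswith(static_exts):
        # One year, immutable: pair this with fingerprinted filenames
        # (e.g. app.3f9a1c.js) so a new deploy busts the cache.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    return {"Cache-Control": "no-cache"}  # always revalidate HTML

print(cache_headers("/static/app.3f9a1c.js"))
```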
From Publisher To Platform: How The Guardian Used Content, Search, and Open S... (The Guardian Open Platform)
Last year The Guardian launched The Open Platform, a suite of services and tools that enable content partners and developers to build applications leveraging The Guardian's rich content.
This talk will cover how The Guardian opened up their content, enriched it, and reached new markets with its platform strategy.
We cover the background platform strategy, technical architecture, implementation of Solr, and how the new release of the Guardian's Open Platform, launched May 20th, 2010, has embraced disruption in the media space, while at the same time accelerating revenue.
This document summarizes a presentation about MontySolr, an extension that allows embedding CPython in Solr. It was created by Roman Chyla of CERN to connect Python and Java applications without compromises. MontySolr uses JCC to embed a Python interpreter in Java, allowing Python code to interface with Solr. This provides a robust, tested integration that works for any Python or C/C++ application and leverages the strengths of both Solr and Invenio.
SPIRES is the biggest bibliographic database for High Energy Physics, arXiv is the biggest repository of full-text papers in High Energy Physics, and INSPIRE is the biggest digital library that merges the two.
The document discusses various topics related to trends in mobile technology and the internet. It covers the growth of internet-connected TVs and online video sharing platforms like YouTube. Statistics are presented on YouTube's rapid growth and daily video uploads. The potential for companies to create their own video channels on platforms like Yubby is explored. The document also examines how people spend their time between TV, radio, the internet, newspapers and magazines, and the revenues generated from advertising on each medium.
The document discusses Living Labs and the European Network of Living Labs (ENoLL). It describes Living Labs as user-driven open innovation ecosystems where users help develop new technologies, products, and services. ENoLL aims to foster collaboration between public organizations, businesses, and users to accelerate innovation and address global challenges through open innovation. The network promotes cooperation between its members and helps position them internationally.
Onde KH? (where to poop?) Pitch Keynote at SWRIO (Bruno Marinho)
The document proposes an app that helps users locate nearby restrooms and share reviews. It would use one's location to show the best restroom option and allow adding/rating restrooms. Users could download different versions for various phones. Revenue would come from toilet paper brand ads and mobile games. It would compete with similar apps but aims to be simpler. The team needs funding to create stickers/ads, hire help, and cover hosting costs to launch before competitors.
A new trend toward extreme information exchange is emerging on the international scene. In the light of the open mind principle, public and private institutions around the world are joining the chorus in search of new meaning for existing data. Governments everywhere, influenced by the Web 2.0 wave, are progressively encouraging the construction of mashups on top of their databases. Dubbed Open Data, this movement has arrived in Brazil and is gaining strength in the public sphere. For us as software developers, this wave brings countless economic, political, and social opportunities. On the one hand, using our technical know-how, we have the chance to create solutions that add transparency to political action, bring society closer to public administration, and enable the exercise of true digital citizenship. On the other hand, it opens a broad range of new business opportunities. In this talk, we will discuss the OpenData and OpenGovData philosophy, present initiatives from the hacker community, examine the government's proposals, and analyze some of the techniques and technologies that pave the way to open data, from scraping to the semantic web.
Social media presents both security risks and opportunities for businesses. While it can expose confidential information if not managed carefully, 90% of sales now come from word-of-mouth or digital promotion. Most Australian adults use Facebook or LinkedIn, and many follow brands and research companies via social media. For accountants and financial professionals, social networks like Twitter and LinkedIn can help with networking, recruiting, and developing business if used properly while maintaining confidentiality and professionalism. Managing security risks requires strong policies, education, and monitoring of employee social media use.
The document discusses whether media queries can help make websites responsive to different devices. It argues that media queries alone are not enough and that a mobile-first approach is needed. Key points covered include using responsive images, designing for mobile sizes first before larger screens, and combining media queries with device detection. The presentation provides examples of how to implement responsive design techniques.
Beyond the Encyclopedia: The Frontiers of Free Knowledge (ErikMoeller)
The document discusses opportunities for expanding free knowledge and culture on Wikimedia projects and beyond. It analyzes the characteristics of successful free culture efforts like appropriate technology, small work units, and volunteer gratification. Wikimedia projects are assessed in terms of these factors, finding opportunities but also difficulties to overcome through improvements to technology, processes, funding, and inclusion. New projects are proposed to address gaps in structured data, real-time tools, physical spaces, and content types like designs and practices.
The document discusses the Guardian Open Platform which provides access to the Guardian's content through an API. The API offers rights-cleared content in three tiers of access and has been successful in increasing developer happiness, productivity, and the internal usage of the API. However, the document notes that more can still be done to improve external data references and reduce the gap between self-service and high-touch support. It promotes further opening of the Guardian's content creation process, rights management, and commercial models.
This document is a presentation about New York City's use of APIs and open data. It discusses how NYC has opened up access to city data and services through APIs to foster civic innovation. It notes that NYC now has over 750 public datasets available via API. The presentation encourages other cities to follow NYC's example in unlocking their data and services through APIs to engage citizens and developers.
This document discusses the importance of networks for journalists in the digital age. It provides examples of print, magazine, broadcast, and online journalists who have leveraged networks successfully. Networks are key for finding contacts, being findable by contacts, distributing work, and building trust and relationships. The document encourages establishing a presence on Twitter, sharing links on Delicious, and using RSS to link accounts. It frames journalism as making the hidden findable, giving voice to the voiceless, connecting communities, and verifying information. Building networks is emphasized as the first step for this online journalism module.
This webinar provided an overview of social media strategies and success stories for the automotive industry. It discussed key industry statistics showing growth in social media and provided examples of how automotive companies are using social platforms. The webinar outlined several success stories of automotive brands that saw increases in website traffic and sales by engaging customers on social media. It concluded with tips for automotive marketers, emphasizing the importance of listening to customers and committing ongoing resources to see results from social media strategies.
Text Classification Powered by Apache Mahout and Lucene (lucenerevolution)
Presented by Isabel Drost-Fromm, Software Developer, Apache Software Foundation/Nokia Gate 5 GmbH at Lucene/Solr Revolution 2013 Dublin
Text classification automates the task of filing documents into pre-defined categories based on a set of example documents. The first step in automating classification is to transform the documents into feature vectors. Though this step is highly domain-specific, Apache Mahout provides a lot of easy-to-use tooling to help you get started, most of which relies heavily on Apache Lucene for analysis, tokenisation and filtering. This session shows how to use faceting to quickly get an understanding of the fields in your documents. It walks you through the steps necessary to convert your text documents into feature vectors that Mahout classifiers can use, including a few anecdotes on drafting domain-specific features.
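The vectorization step can be illustrated with a toy hashed bag-of-words in plain Python. Real Mahout pipelines use Lucene analyzers and far larger vector spaces, so treat this purely as a sketch of the idea:

```python
import re
from hashlib import md5

def hashed_features(text, dims=16):
    """Turn a document into a fixed-width feature vector via the
    hashing trick: tokenize, then bucket each token into one of
    `dims` slots by hashing its bytes. A toy stand-in for what
    Mahout's vectorization tooling does at scale."""
    vec = [0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        slot = int(md5(token.encode()).hexdigest(), 16) % dims
        vec[slot] += 1
    return vec

v = hashed_features("Lucene analyzes text; Mahout classifies the vectors")
print(len(v), sum(v))  # vector width, total token count
```

The appeal of hashing over a vocabulary dictionary is that the vector width is fixed up front, which is exactly what downstream classifiers need.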
Presented by Markus Klose, Search + Big Data Consultant SHI Elektronische Medien GmbH at Lucene/Solr Revolution 2013 Dublin
Kibana4Solr is search-driven, scalable, browser-based and extremely user-friendly (also for non-technical users). Logs are everywhere: any device, system or human can produce a huge amount of information saved in logs. The sheer volume of available logs and their semi-structured nature make meaningful real-time processing quite a difficult task, so valuable business insights stored in logs may never be found. Kibana4Solr is a search-driven approach to that challenge. It offers a user-friendly, browser-based dashboard that can easily be customized to particular needs. The session introduces Kibana4Solr, sheds some light on its architecture, offers ideas for possible business use cases, and closes with a live demo.
More Related Content
Similar to Keynote: from publisher to platform, How The Guardian Embraced the Internet using Content, Search, and Open Source - By Stephen Dunn
The document describes Twitter's search architecture. It discusses how Twitter uses modified versions of Lucene called Earlybird to build real-time and archive search indexes. The real-time indexes are partitioned and replicated across clusters. New tweets are continuously added and searchable with low latency. Archive indexes contain older tweets on HDFS and are optimized for throughput over low latency. The system uses an analyzer to preprocess tweets before indexing and a service called the Blender to merge search results.
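The Blender's result merging can be sketched as combining per-index hit lists, deduplicating by id, and re-ranking by score. This in-process toy, with invented (id, score) tuples, only gestures at what the real service does across network boundaries:

```python
def blend(realtime, archive, limit=10):
    """Merge result lists from a real-time index and an archive index,
    loosely mimicking Twitter's Blender: combine per-index hits,
    dedupe by tweet id keeping the best score, rank by score.
    Each hit is a (tweet_id, score) pair; all names are illustrative.
    """
    best = {}
    for tweet_id, score in realtime + archive:
        if tweet_id not in best or score > best[tweet_id]:
            best[tweet_id] = score  # keep the higher-scoring duplicate
    ranked = sorted(best.items(), key=lambda hit: hit[1], reverse=True)
    return ranked[:limit]

print(blend([(1, 0.9), (2, 0.5)], [(2, 0.7), (3, 0.4)], limit=2))
```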
Building Client-side Search Applications with Solr (lucenerevolution)
Presented by Daniel Beach, Search Application Developer, OpenSource Connections
Solr is a powerful search engine, but creating a custom user interface can be daunting. In this fast-paced session I will present an overview of how to implement a client-side search application using Solr. Using open-source frameworks like SpyGlass (to be released in September) can be a powerful way to jumpstart your development by giving you out-of-the-box results views with support for faceting, autocomplete, and detail views. During this talk I will also demonstrate how we have built and deployed lightweight applications that remain performant under large user loads, with minimal server resources.
Integrate Solr with real-time stream processing applications (lucenerevolution)
The document discusses integrating Apache Storm with Apache Solr for real-time stream processing applications. It provides an example of building a Storm topology that listens to click events from a URL shortener, counts the frequency of pages in a time window, ranks the top sites, and persists the results to Solr for visualization. The key points covered are using Spring to simplify building Storm topologies, integrating with Solr for indexing and search, and unit testing streaming data providers.
Configure your Solr cluster to handle hundreds of millions of documents without even noticing, handle queries in milliseconds, and use Near Real Time indexing and searching with document versioning. Scale your cluster both horizontally and vertically by using shards and replicas. In this session you'll learn how to make your indexing process blazing fast and your queries efficient even with large amounts of data in your collections. You'll also see how to optimize your queries to leverage caches as much as your deployment allows and how to observe your cluster with the Solr administration panel, JMX, and third-party tools. Finally, learn how to make changes to already deployed collections: split their shards and alter their schema by using the Solr API.
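Shard splitting goes through Solr's Collections API (`action=SPLITSHARD`). A minimal sketch of building such a request URL, without actually sending it — the host, collection and shard names here are made-up placeholders:

```python
from urllib.parse import urlencode

def collections_api_url(base, action, **params):
    # Build a Solr Collections API call, e.g. SPLITSHARD on a live collection.
    query = urlencode({"action": action, **params})
    return f"{base}/admin/collections?{query}"

url = collections_api_url(
    "http://localhost:8983/solr",   # assumed local Solr; adjust for your cluster
    "SPLITSHARD",
    collection="articles",          # hypothetical collection name
    shard="shard1",
)
```

In practice you would issue this URL with an HTTP client and poll the async request status, since splitting a large shard can take a while.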
Presented by Rafal Kuć, Consultant and Software Engineer, Sematext Group, Inc.
Even though Solr can run without causing any trouble for long periods of time, it is very important to monitor and understand what is happening in your cluster. In this session you will learn how to use various tools to monitor how Solr is behaving at a high level, but also at the Lucene, JVM, and operating system levels. You'll see how to react to what you see and how to make changes to configuration, index structure and shard layout using the Solr API. We will also discuss the performance metrics that deserve extra attention. Finally, you'll learn what to do when things go awry: we will share a few examples of troubleshooting, then dissect what was wrong and what had to be done to make things work again.
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled (lucenerevolution)
In a recent project with the United States Patent and Trademark Office, OpenSource Connections was asked to prototype the next generation of patent search using Solr and Lucene. An important aspect of this project was the implementation of BRS, a specialized search syntax used by patent examiners during the examination process. In this fast-paced session we will relate our experiences and describe how we used a combination of Parboiled (a Parsing Expression Grammar [PEG] parser), Lucene Queries and SpanQueries, and an extension of Solr's QParserPlugin to build BRS search functionality in Solr. First we will characterize the patent search problem and define the BRS syntax itself. We will then introduce the Parboiled parser and discuss various considerations one must make when designing a syntax parser. Following this we will describe the methodology used to implement the search functionality in Lucene/Solr. Finally, we will include an overview of our syntactic and semantic testing strategies. The audience will leave this session with an understanding of how Solr, Lucene, and Parboiled may be used to implement their own custom search parser.
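To give a flavour of what designing such a syntax parser involves: Parboiled is a Java PEG library, but the core idea — one parsing function per grammar rule — can be shown with a hand-rolled recursive-descent parser for a toy AND/OR subset (this is an illustrative stand-in, not the BRS grammar itself):

```python
import re

def parse(query):
    # Toy grammar:  expr := term (('AND' | 'OR') term)*
    #               term := WORD | '(' expr ')'
    # Returns a nested tuple AST.
    tokens = re.findall(r"\(|\)|\w+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def term():
        nonlocal pos
        if peek() == "(":
            pos += 1
            node = expr()
            assert peek() == ")", "expected closing parenthesis"
            pos += 1
            return node
        word = tokens[pos]
        pos += 1
        return ("TERM", word)

    def expr():
        nonlocal pos
        node = term()
        while peek() in ("AND", "OR"):
            op = tokens[pos]
            pos += 1
            node = (op, node, term())   # left-associative
        return node

    tree = expr()
    assert pos == len(tokens), "trailing input"
    return tree

ast = parse("valve AND (pump OR rotor)")
```

In a real implementation each AST node would then be translated into a Lucene Query or SpanQuery, which is where the QParserPlugin extension mentioned above comes in.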
Enhancing relevancy through personalization & semantic search (lucenerevolution)
I. The document discusses how CareerBuilder uses Solr for search at scale, handling over 1 billion documents and 1 million searches per hour across 300 servers.
II. It then covers traditional relevancy scoring in Solr, which is based on TF-IDF, as well as ways to boost documents, fields, and terms.
III. Advanced relevancy techniques are described, including using custom functions to incorporate domain-specific knowledge into scoring, and context-aware weighting of relevancy parameters. Personalization and recommendation approaches are also summarized, including attribute-based and collaborative filtering methods.
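The TF-IDF-with-boosts scoring mentioned in II can be shown as a toy worked example (simplified; Lucene's actual similarity adds length normalisation and other factors):

```python
import math

def tfidf_score(query_terms, doc_terms, doc_freq, num_docs, boosts=None):
    # Score = sum over query terms of tf * idf * per-term boost.
    boosts = boosts or {}
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)
        if tf == 0 or term not in doc_freq:
            continue
        idf = math.log(num_docs / doc_freq[term])
        score += tf * idf * boosts.get(term, 1.0)
    return score

# Hypothetical job-search numbers purely for illustration.
doc = ["java", "developer", "java", "spring"]
dfs = {"java": 100, "developer": 400, "spring": 50}
base = tfidf_score(["java", "developer"], doc, dfs, num_docs=1000)
boosted = tfidf_score(["java", "developer"], doc, dfs, num_docs=1000,
                      boosts={"java": 2.0})   # boost the term we care about
```

Rarer terms (low document frequency) get a higher IDF and so dominate the score; a boost then scales individual terms on top of that, which is the basic lever behind document, field and term boosting.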
Real-time Inverted Search in the Cloud Using Lucene and Storm (lucenerevolution)
Building real-time notification systems is often limited to basic filtering and pattern matching against incoming records. Allowing users to query incoming documents using Solr's full range of capabilities is much more powerful. In our environment we needed a way to allow for tens of thousands of such query subscriptions, meaning we needed to find a way to distribute the query processing in the cloud. By creating in-memory Lucene indices from our Solr configuration, we were able to parallelize our queries across our cluster. To achieve this distribution, we wrapped the processing in a Storm topology to provide a flexible way to scale and manage our infrastructure. This presentation will describe our experiences creating this distributed, real-time inverted search notification framework.
Solr's Admin UI - Where does the data come from? (lucenerevolution)
Like many web applications of the past, the Solr Admin UI up until 4.0 was entirely server-based. It used separate code on the server to generate its dashboards, overviews and statistics. All that code had to be maintained, and still you weren't really able to use that kind of data for the things you needed it for: it was wrapped in HTML, most of the time difficult to extract, and it changed structure from time to time without announcement. After a short look back, we're going to look at the current state of the Solr Admin UI - a client-side application running completely in your browser. We'll see how it works, where it gets its data from, and how you can get that very same data and wire it into your own custom applications, dashboards and/or monitoring systems.
Schemaless Solr allows documents to be indexed without pre-configuring fields in the schema. As documents are indexed, previously unknown fields are automatically added to the schema with inferred field types. This is implemented using Solr's managed schema, field value class guessing to infer types, and automatic schema field addition. The schema and newly added fields can be accessed via the Schema REST API, and the schema can be modified at runtime when configured as mutable. However, schemaless mode has limitations such as single field analyses and no way to change field types after initial inference.
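The "field value class guessing" step can be pictured as trying progressively broader parses of each incoming value (a simplified illustration, not Solr's actual type-guessing chain; the type names approximate Solr's defaults):

```python
from datetime import datetime

def guess_field_type(value):
    # Try the most specific interpretation first; fall back to text.
    if value.lower() in ("true", "false"):
        return "boolean"
    candidates = (
        ("plong", int),
        ("pdouble", float),
        ("pdate", lambda v: datetime.strptime(v, "%Y-%m-%dT%H:%M:%SZ")),
    )
    for solr_type, parse in candidates:
        try:
            parse(value)
            return solr_type
        except ValueError:
            continue
    return "text_general"
```

This also makes the limitation noted above concrete: once the first value of a field has fixed its inferred type, later values that parse differently cannot change it.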
High Performance JSON Search and Relational Faceted Browsing with Lucene (lucenerevolution)
This document discusses high performance JSON search and relational faceted browsing using Lucene. It introduces SIREn, a Lucene plugin for indexing and searching JSON documents with a nested data model. SIREn uses tree labeling techniques to represent the JSON document structure and enable both full-text and structural queries. It also allows for relational faceted browsing across multiple record collections through pivot navigation and query rewriting. While BlockJoin supports some nested data in Lucene, SIREn has better scalability through its compression techniques and more flexibility through its schema-agnostic approach.
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM (lucenerevolution)
In this session we will show how to build a text classifier using Apache Lucene/Solr together with the libSVM library. We classify our corpus of job offers into a number of predefined categories; each indexed document (a job offer) then belongs to zero, one or more categories. Known machine learning techniques for text classification include the naïve Bayes model, logistic regression, neural networks, support vector machines (SVM), etc. We use Lucene/Solr to construct the feature vectors. Then we use libSVM, known as the reference implementation of the SVM model, to classify the documents. We construct as many one-vs-all SVM classifiers as there are classes in our setting, then using the Hadoop MapReduce framework we reconcile the results of our classifiers. The end result is a scalable multi-class classifier. Finally we outline how the classifier is used to enrich basic Solr keyword search.
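The one-vs-all reconciliation step can be sketched as follows — a toy stand-in where each "classifier" is just a linear scoring function (the talk uses trained libSVM models) and reconciliation runs locally (the talk uses Hadoop MapReduce):

```python
def one_vs_all(doc_vector, classifiers, threshold=0.0):
    # Run every binary class-vs-rest scorer; keep classes above threshold.
    # A document may end up in zero, one, or several categories.
    scores = {label: clf(doc_vector) for label, clf in classifiers.items()}
    return sorted(label for label, s in scores.items() if s > threshold)

def linear_scorer(weights, bias):
    # Toy linear decision function, standing in for a trained SVM.
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias

# Hypothetical job-offer categories for illustration.
classifiers = {
    "engineering": linear_scorer([1.0, 0.0], bias=-0.5),
    "sales": linear_scorer([0.0, 1.0], bias=-0.5),
}
labels = one_vs_all([1.0, 0.8], classifiers)
```

Because each binary decision is independent, the scheme naturally supports the "zero, one or more categories" behaviour described above.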
Faceted search is a powerful technique to let users easily navigate the search results. It can also be used to develop rich user interfaces, which give an analyst quick insights about the documents space. In this session I will introduce the Facets module, how to use it, under-the-hood details as well as optimizations and best practices. I will also describe advanced faceted search capabilities with Lucene Facets.
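At its core, faceting is counting the distinct values of a field across the documents that matched a query; a minimal sketch of that counting step:

```python
from collections import Counter

def facet_counts(matching_docs, field):
    # Count each value of `field` across the matched documents,
    # most frequent first - the numbers shown next to each facet entry.
    counts = Counter()
    for doc in matching_docs:
        values = doc.get(field, [])
        counts.update(values if isinstance(values, list) else [values])
    return counts.most_common()

docs = [
    {"title": "A", "category": "search"},
    {"title": "B", "category": "search"},
    {"title": "C", "category": ["search", "analytics"]},
]
facets = facet_counts(docs, "category")
```

The Facets module does this far more efficiently (using per-segment data structures rather than re-scanning stored fields), but the user-visible result is the same kind of value-to-count list.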
Presented by Shai Erera, Researcher, IBM
Lucene's arsenal has recently expanded to include two new modules: Index Sorting and Replication. Index sorting lets you keep an index consistently sorted based on some criteria (e.g. modification date). This allows for efficient search early-termination as well as achieve better index compression. Index replication lets you replicate a search index to achieve high-availability, fault tolerance as well as take hot index backups. In this talk we will introduce these modules, discuss implementation and design details as well as best practices.
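Why a sorted index enables early termination can be pictured like this (a simplified sketch: because documents are stored pre-sorted by the ranking key, collection can stop after k hits instead of scanning everything):

```python
def top_k_early_terminate(sorted_docs, matches, k):
    # sorted_docs is already ordered by the index sort key (e.g. newest first),
    # so the first k matching docs are exactly the top-k result.
    hits = []
    scanned = 0
    for doc in sorted_docs:
        scanned += 1
        if matches(doc):
            hits.append(doc)
            if len(hits) == k:
                break   # early termination: later docs cannot rank higher
    return hits, scanned

docs = [{"id": i, "date": 100 - i} for i in range(100)]  # sorted by date desc
hits, scanned = top_k_early_terminate(docs, lambda d: d["id"] % 2 == 0, k=3)
```

Only a handful of documents are scanned instead of all 100 — the saving grows with index size, which is the point of keeping the index consistently sorted.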
As part of their work with large media monitoring companies, Flax has developed a technique for applying tens of thousands of stored Lucene queries to a document in under a second. We'll talk about how we built intelligent filters to reduce the number of actual queries applied and how we extended Lucene to extract the exact hit positions of matches, the challenges of implementation, and how it can be used, including applications that monitor hundreds of thousands of news stories every day.
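The "intelligent filters" idea can be sketched as indexing each stored query by the terms it requires, so that only queries sharing a term with the incoming document are actually evaluated (a toy illustration of the approach, not Flax's implementation):

```python
from collections import defaultdict

def build_query_index(stored_queries):
    # Map each required term to the queries that mention it.
    index = defaultdict(set)
    for qid, terms in stored_queries.items():
        for term in terms:
            index[term].add(qid)
    return index

def match_document(doc_terms, stored_queries, index):
    # Prefilter: collect only queries sharing a term with the document...
    candidates = set()
    for term in doc_terms:
        candidates |= index.get(term, set())
    # ...then run the full (here: all-terms-must-match) check on candidates only.
    return sorted(q for q in candidates if stored_queries[q] <= doc_terms)

queries = {"q1": {"oil", "spill"}, "q2": {"election"}, "q3": {"oil", "price"}}
index = build_query_index(queries)
hits = match_document({"oil", "spill", "cleanup"}, queries, index)
```

With tens of thousands of stored queries, the prefilter means each incoming news story triggers evaluation of only a small candidate set rather than every subscription.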
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke... (lucenerevolution)
Presented by Xavier Sanchez Loro, Ph.D, Trovit Search SL
This session explains the implementation and use case for spellchecking in the Trovit search engine. Trovit is a classified-ads search engine supporting several different sites, one for each country and vertical. Our search engine supports multiple indexes in multiple languages, each with several million indexed ads. Those indexes are segmented into several different sites depending on the type of ads (homes, cars, rentals, products, jobs and deals). We have developed a multi-language spellchecking system using Solr and Lucene in order to help our users better find the desired ads and avoid the dreaded 0 results as much as possible. As such, our goal is not pure orthographic correction, but also suggestion of correct searches for a given site.
The document discusses how Intelligent Software Solutions (ISS) uses Apache Solr and natural language processing (NLP) techniques to help their customers analyze large amounts of unstructured data. ISS develops innovative solutions for government customers dealing with thousands of data sources. Their approach involves acquiring content, indexing it in Solr for search and discovery, semantically enriching it using NLP techniques like named entity recognition and clustering, and presenting focused "data perspectives" for analysis. They leverage multiple NLP approaches like GATE/Gazetteers and OpenNLP/machine learning to complement each other's strengths in finding both known and unknown relevant information.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Monitoring and Managing Anomaly Detection on OpenShift.pdf (Tosin Akinosho)
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
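As a taste of the fundamentals in topic 1, a minimal statistical detector flags readings that deviate strongly from the rest of the window (a toy z-score sketch; edge deployments like the one described would typically use trained models instead):

```python
import statistics

def zscore_anomalies(readings, threshold=3.0):
    # Flag readings more than `threshold` standard deviations from the mean.
    mean = statistics.mean(readings)
    stdev = statistics.stdev(readings)
    return [(i, x) for i, x in enumerate(readings)
            if stdev and abs(x - mean) / stdev > threshold]

# Hypothetical temperature readings from an edge sensor, with one spike.
sensor = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 55.0, 20.1]
anomalies = zscore_anomalies(sensor, threshold=2.0)
```

In the pipeline above, a detector like this would consume readings from the Kafka stream and emit flagged events as metrics for Prometheus to alert on.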
Best 20 SEO Techniques To Improve Website Visibility In SERP (Pixlogix Infotech)
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf (Techgropse Pvt. Ltd.)
In this blog post, we'll delve into the intersection of AI and app development in Saudi Arabia, focusing on the food delivery sector. We'll explore how AI is revolutionizing the way Saudi consumers order food, how restaurants manage their operations, and how delivery partners navigate the bustling streets of cities like Riyadh, Jeddah, and Dammam. Through real-world case studies, we'll showcase how leading Saudi food delivery apps are leveraging AI to redefine convenience, personalization, and efficiency.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case share much more than that.
Join the presentation to dive into a story of interoperability, standards and open formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations and training activities. She previously worked on LibreOffice migrations and training courses for various public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when not pursuing her passion for computers and for Geeko she cultivates her curiosity about astronomy (hence her nickname deneb_alpha).
HCL Notes and Domino license cost reduction in the world of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefit it brings you. Above all, you certainly want to stay within budget and save costs wherever possible. We understand that, and we want to help!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove redundant or unused accounts to save money. There are also approaches that can lead to unnecessary spending, e.g. when a person document is used instead of a mail-in database for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It gives you the tools and the know-how to keep an overview. You will be able to reduce your costs through an optimised Domino configuration and keep them low in the future.
Topics covered
- Reducing license costs by finding and fixing misconfigurations and redundant accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices to apply immediately
Ocean Lotus threat actors project by John Sitima 2024 (1).pptx (SitimaJohn)
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Generating privacy-protected synthetic data using Secludy and Milvus (Zilliz)
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers (akankshawande)
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Infrastructure Challenges in Scaling RAG with Custom AI models (Zilliz)
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
UiPath Test Automation using UiPath Test Suite series, part 6 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI test automation with Open AI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
OpenID AuthZEN Interop Read Out - Authorization (David Brossard)
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
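The seed-trimming idea can be sketched as: remove a byte, and if the coverage fingerprint is unchanged, the byte was uninteresting. This is a hypothetical stand-in for DIAR's analysis — `toy_coverage` here is a made-up function, not AFL instrumentation:

```python
def trim_seed(seed: bytes, coverage):
    # Greedily drop bytes whose removal leaves the coverage fingerprint intact.
    baseline = coverage(seed)
    i = 0
    while i < len(seed):
        candidate = seed[:i] + seed[i + 1:]
        if coverage(candidate) == baseline:
            seed = candidate          # byte i was uninteresting: drop it
        else:
            i += 1                    # byte i matters: keep it
    return seed

def toy_coverage(data: bytes):
    # Toy "coverage": which interesting markers the parser under test would see.
    return frozenset(m for m in (b"<", b">", b"elf") if m in data)

trimmed = trim_seed(b"xx<abc>yy elf zz", toy_coverage)
```

The trimmed seed keeps exactly the bytes that drive distinct program behaviour, so subsequent mutations are spent where they can matter.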
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Keynote: from publisher to platform, How The Guardian Embraced the Internet using Content, Search, and Open Source - By Stephen Dunn
1. From publisher to platform:
How the Guardian embraced the internet
using content, search, and Open Source
Stephen Dunn, Guardian News and Media
stephen.dunn@guardian.co.uk, 25th May, 2011
Twitter: @cuica, @openplatform
Thursday, 26 May 2011
4. We started a long time ago:
5. Keyword page, Live blogs, Apps, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Open Platform API
6. To secure the financial and editorial independence of the Guardian in perpetuity.
To promote freedom in the press and liberal journalism globally.
To become the world's leading liberal voice.
9. 1. Permanent
(photo: http://www.flickr.com/photos/fstorr/)
• “A cool URI is one that does not change” (Tim Berners-Lee, 1998)
• 1.5 million resources redirected to the new scheme
10. 2. Addressable
★ Resources are “about” something - ready for the social web.
★ We live in “the age of point-at-things” (Coates 2005)
11. 3. Discoverable
★ Multiple routes to content
★ Tagging drives discovery
15. Site traffic growth (chart): unique users from Sep 2005 to Dec 2008, with pre-project, first release and final release milestones marked; growth annotated at 40M unique users.
21. ...“How I stopped worrying about my website and learned to love the whole internet.” (Matt McAlister)
22. The Open Strategy
OPEN IN: bring in data and apps from the Internet.
OPEN OUT: enable partners to build applications using Guardian content and services for other platforms.
24. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
34. Jack Shenker: “The Guardian alongside Al Jazeera was the one news source that everybody on the streets in Tahrir - not just in Cairo but in surrounding cities and major centers of revolutionary activity - that people were talking about.”
37. The suite of services enabling partners to build applications with the Guardian
38. OPEN IN: Bring in data and apps from the Internet.
OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
39. CONTENT API: a service for selecting and collecting content from the Guardian for re-use.
DATA STORE: a directory of useful data curated by Guardian editors.
POLITICS API: an open database of candidates, voting records, constituencies, election results and live data on election day.
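In practice, "selecting and collecting content for re-use" means building a search request against the Content API's public endpoint, as sketched below. The `/search` endpoint and `api-key` parameter follow the API as launched; the key value and the example tag are placeholders.

```python
from urllib.parse import urlencode

# Build a Content API search request: a full-text query, optionally
# narrowed by a tag. The api-key value here is a placeholder.
BASE = "https://content.guardianapis.com/search"

def search_url(query: str, tag=None, api_key: str = "YOUR-KEY") -> str:
    params = {"q": query, "api-key": api_key}
    if tag:
        params["tag"] = tag  # tags drive discovery, as on the site itself
    return BASE + "?" + urlencode(params)

print(search_url("election", tag="politics/politics"))
```

Because the same tag vocabulary powers both the website's navigation and the API's filters, partners can slice the archive along the routes the Guardian's own editors use.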
57. 3 tiers of access, 3 revenue models
Keyless: Take our headlines. You keep associated revenues.
Approved: Take our full article content, but with an advert. Guardian keeps ad revenue; you keep rest-of-page revenue.
Bespoke: Take, reformat and augment our content. Revenue model to be negotiated: a combination of media, fees and downloads.
59. What this means
Open Out: developers can now access full content APIs on demand with keys post-approved
Platform is positioned as a place to do business
So rapid scalability, reliability and performance are now core requirements
60. OPEN IN: Bring in data and apps from the internet.
OPEN OUT: Allow partners to build applications using Guardian content and services for other platforms.
61. MICROAPPS: a framework for integrating 3rd party applications into guardian.co.uk
Simple REST/HTTP framework allows lightweight development
Applications proxied for performance
Apps generally hosted in the cloud, allowing hot deployment into production
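A toy version of the proxying idea, assuming details the slides leave out: the page requests an HTML fragment from a microapp over HTTP with a hard timeout, and falls back to empty output if the app is slow or down, so a third-party failure cannot stall the page. The function name, timeout value and injectable opener are invented for this sketch.

```python
import urllib.request

# Fetch a microapp's HTML fragment through the proxy, with a hard timeout
# and an empty-string fallback so a slow 3rd-party app can't stall the page.
# The opener parameter exists so the fetch can be stubbed out in tests.
def fetch_fragment(url: str, timeout: float = 0.5,
                   opener=urllib.request.urlopen) -> str:
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except OSError:
        return ""  # degrade gracefully: render the page without the microapp
```

Caching the proxied response (e.g. in Memcached) on top of this would also let the site absorb traffic spikes without hammering the partner's app.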
62. MICROAPPS: a framework for integrating 3rd party applications into guardian.co.uk
67. From publisher to platform
Seeking massive growth, but no longer only broadcasting content on the website
User/partner engagement & contribution on: journalism, data, software, applications, revenue and ads
Support developers and partners with data and APIs; need scalability, reliability, speed
68. Evolving the architecture
69. [Architecture diagram] Three web servers in front of three app servers, backed by Memcached (added later) and an Oracle database fed by the CMS.
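The Memcached tier retrofitted here is a cache-aside in front of Oracle: roughly the pattern below, sketched with a plain dict standing in for a real memcached client. Key names and the lookup shape are illustrative.

```python
# Cache-aside: try the cache first, fall back to the database on a miss,
# then populate the cache so subsequent requests skip the round trip.
# A dict stands in for a real memcached client; keys are illustrative.
cache = {}

def load_article(article_id, db_lookup):
    key = f"article:{article_id}"
    if key in cache:
        return cache[key]            # cache hit: no database round trip
    value = db_lookup(article_id)    # cache miss: query the RDBMS
    cache[key] = value               # populate for subsequent requests
    return value

calls = []
def fake_db(aid):
    calls.append(aid)
    return {"id": aid, "headline": "Example"}

load_article(7, fake_db)
load_article(7, fake_db)
print(len(calls))  # the database was queried only once
```

A real deployment would also set a TTL on each entry and invalidate keys when the CMS republishes an article.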
70. Why RDBMS?
5 years ago, fewer alternatives
Well-understood operations procedures
Can easily recruit DBAs / devs
Developer/ops tools
Business critical system: a safe choice
78. We chose Solr/Lucene
Can perform complex queries, including full-text search
We can change the schema with no downtime
Most queries are of similar cost
Scales very well horizontally
“Just worked” in the cloud
No strange control processes/engines
Developers just loved working with it!
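A sketch of the kind of query this enables, built against Solr's standard /select interface over HTTP. The `q`, `fq`, `wt` and `rows` parameters are standard Solr request parameters; the host, core name and field names are invented for illustration.

```python
from urllib.parse import urlencode

# Build a Solr /select request: full-text search on a body field, plus a
# filter query that Solr caches independently of the main query.
# Host, core ("content") and field names are hypothetical.
SOLR = "http://localhost:8983/solr/content/select"

def solr_query(text: str, section: str) -> str:
    params = {
        "q": f'body:"{text}"',       # full-text phrase match on the body field
        "fq": f"section:{section}",  # filter query, cached and reused
        "wt": "json",                # JSON response writer
        "rows": 10,                  # page size
    }
    return SOLR + "?" + urlencode(params)

print(solr_query("open platform", "technology"))
```

Putting the cheap, repeated constraint in `fq` rather than `q` is what keeps most queries at a similar cost: the filter's result set is cached once and intersected with each full-text query.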
80. [Architecture diagram] The API and web servers sit in front of the app server, Memcached and the RDBMS (fed by the CMS), with a horizontally scaled pool of Solr instances running in the cloud on EC2.
81. What about Open In?
OPEN IN: Bring in data and apps from the Internet.
OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
82. [Architecture diagram] For Open In, third-party apps run on external hosting (App Engine etc.) and are pulled into the site through a proxy in front of the web servers; the app server, Memcached, RDBMS and CMS core is unchanged.
83. [Architecture diagram] The combined picture: the core web servers, app server, Memcached, RDBMS and CMS in the middle; Open Out served by the Solr pool in the cloud on EC2; Open In apps on external hosting (App Engine etc.) behind the proxy.