  From publisher to platform
   How the Guardian used content, search, and open source to build a powerful new business model
   Stephen Dunn, Guardian News and Media

Apache Lucene EuroCon                                          21 May 2010
The publishing era




We started a long
    time ago:




“To secure the financial and editorial independence of the Guardian in perpetuity.”
“To promote freedom in the press and liberal journalism globally.”
2010
[Image labels: Keyword page, Live blogs, iPhone app, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Data API]
1996




1999




2001 -> 2006
2009
     1.5M pages and counting
     250M+ pages/month
     30M visitors/month
     4x Webby award winner (best newspaper site)
Part of the Web




1. Permanent




                                            http://www.flickr.com/photos/fstorr/


  •     “A cool URI is one that does not change” (Tim Berners-Lee, 1998)
  •     1.5 million resources redirected to new scheme
2. Addressable
      ★ Resources are “about” something - ready for the social web.
      ★ We live in “the age of point-at-things” (Coates 2005)
3. Discoverable
      ★ Multiple routes to content
      ★ Tagging drives discovery
The hackable guardian.co.uk
http://www.guardian.co.uk/....

/technology/internet
/technology/all
/environment/climatechange
/environment/climatechange+business/globaleconomy

/technology/internet/rss
/technology/all/rss
/environment/climatechange+business/globaleconomy/rss
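
The URL patterns above compose mechanically: a section/keyword path, an optional second tag joined with `+`, and an optional `/rss` suffix. A minimal Python sketch of that composition, assuming only the patterns shown on the slides; the function name `guardian_url` and its parameters are hypothetical and not part of any Guardian library:

```python
BASE = "http://www.guardian.co.uk"


def guardian_url(*tags, rss=False):
    """Build a guardian.co.uk URL from one or more 'section/keyword' tags.

    Several tags are joined with '+', mirroring the combined-tag example
    on the slides; rss=True appends the '/rss' suffix to get a feed URL.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    path = "+".join(tag.strip("/") for tag in tags)
    url = f"{BASE}/{path}"
    return url + "/rss" if rss else url


print(guardian_url("technology/internet"))
print(guardian_url("environment/climatechange", "business/globaleconomy", rss=True))
```

Because the scheme is this regular, a reader can "hack" the address bar by hand and land on a valid page or feed.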

Results...



Site traffic growth
[Chart: monthly unique users, Sep 2005 to Jan 2009, y-axis up to 30,000,000; annotations: Pre-project, First release, Final release, 36M]
However...


1 Billion+ Internet Users!
“How I stopped worrying about my website and learned to love the whole Internet.”

  Matt McAlister
The Open Strategy
            OPEN IN: Bring in data and apps from the Internet
            OPEN OUT: Enable partners to build applications using Guardian content and services for other digital platforms
“Our most interesting experiments lie in combining
what we know with the experience, opinions and
expertise of the people who want to participate
rather than passively receive.”
The Open Platform
The suite of services enabling partners to build applications with the Guardian
CONTENT API: A service for selecting and collecting content from the Guardian for re-use
DATA STORE: A directory of useful data curated by Guardian editors
POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
CONTENT API
A service for selecting and collecting content from the Guardian for re-use
[Diagram, top to bottom: Your App Here!, REST API, Search engine, CMS, Guardian database]
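
From the application's side, the stack above reduces to one HTTP request against the REST API. A minimal sketch of how a client might assemble such a request. The host `api.example.invalid` is a placeholder, and the parameter names (`q`, `tag`, `page-size`, `api-key`) are illustrative assumptions rather than details from the slides; check the Open Platform documentation for the real ones.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, not the real API host.
API_ROOT = "http://api.example.invalid/search"


def build_search_url(query, tags=(), api_key=None, page_size=10):
    """Assemble a search request URL for a content API.

    All parameter names here are hypothetical stand-ins.
    """
    params = [("q", query), ("page-size", str(page_size))]
    if tags:
        # Several tags travel as one comma-separated parameter.
        params.append(("tag", ",".join(tags)))
    if api_key:
        params.append(("api-key", api_key))
    # Keep '/' and ',' readable in the query string.
    return f"{API_ROOT}?{urlencode(params, safe='/,')}"


print(build_search_url("swine flu", tags=["world/swine-flu"], api_key="demo"))
```

The sketch only builds the URL, so it runs without network access; sending the request and parsing the response are left to whatever HTTP client the application already uses.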
• Stamen Design - APIMaps.org
DATA STORE
A directory of useful data curated by Guardian editors
POLITICS API
Open database of candidates, voting records, constituencies, election results, live data on election day
Open for Business
1. 3 tiers of access, 3 revenue models

                BESPOKE: Take, reformat, augment our content. Same access as Guardian. Revenue model to be negotiated. Combination of Media, Fees, Downloads.

                APPROVED: Take our full article content, with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue.

                KEYLESS: Take our headlines. You keep associated revenues.
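
The three tiers can be read as a small lookup table keyed by the kind of access a caller has. A minimal sketch of that reading. The names `Tier`, `TierTerms`, `TERMS` and `tier_for` are hypothetical, and the rule that any non-bespoke key maps to APPROVED is an assumption made for illustration, not something the slides state.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    KEYLESS = "keyless"
    APPROVED = "approved"
    BESPOKE = "bespoke"


@dataclass(frozen=True)
class TierTerms:
    content: str   # what the partner may take
    revenue: str   # who keeps which revenue


# Terms paraphrased from the slide text.
TERMS = {
    Tier.KEYLESS: TierTerms("headlines", "partner keeps associated revenues"),
    Tier.APPROVED: TierTerms(
        "full article content, with an advert",
        "Guardian keeps ad revenue, partner keeps rest-of-page revenue",
    ),
    Tier.BESPOKE: TierTerms(
        "same access as Guardian; may reformat and augment",
        "negotiated (media, fees, downloads)",
    ),
}


def tier_for(api_key=None, bespoke_keys=frozenset()):
    """Map a caller's key to a tier (assumed rule, see lead-in)."""
    if api_key is None:
        return Tier.KEYLESS
    if api_key in bespoke_keys:
        return Tier.BESPOKE
    return Tier.APPROVED


print(tier_for(), TERMS[tier_for()].content)
```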

What this means
            OPEN OUT: Developers can now access our full content APIs on demand, with post-approved keys.

            We are now positioning the platform as a place to do business with us.

            So rapid scalability, reliability, and performance are now core requirements.
2. Open In
  CONTENT API: A service for selecting and collecting content from the Guardian for re-use
  DATA STORE: A directory of useful data curated by Guardian editors
  POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
  MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk.
App showcase




What this means
        Open In: Partners can now more easily integrate into our core.

        The Open Platform will become key to our commercial future.
Evolving the architecture
From Publisher to Platform
       ★ Seeking massive growth, but no longer only broadcasting content

       ★ User/partner engagement & contribution on
                        ★ journalism
                        ★ data
                        ★ software
                        ★ applications
                        ★ revenue and ads

       ★ Support developers and partners with data and APIs; need scalability, reliability, speed
[Architecture diagram, top to bottom: three web servers, three app servers, Memcached, Oracle, CMS]
Why RDBMS?
      5 years ago, fewer alternatives
      Understand operations procedures
      Can easily recruit DBAs / devs
      Developer/ops tools
      Business critical system: a safe choice
[Architecture diagram behind the text, top to bottom: three web servers, three app servers, Memcached, Oracle, CMS and Data feeds]
Scaling




Unique Users
[Chart: monthly unique users, Sep 2005 to Jan 2009, y-axis up to 30,000,000]
[Chart: monthly unique users, May 2008 to Jan 2009, y-axis 12,250,000 to 28,000,000]
What's going on?
      ★ We tag our content (multifaceted)

      ★ Guardian.co.uk is a faceted browse through our tag-space, with editorial teams “spotlighting” key resources on selected nodes.

      ★ Can apply multiple facets in queries faster in a search-like architecture than in an RDBMS
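
The last point is easiest to see with the data structure a search engine is built on. A minimal sketch of an inverted index over tags, where applying several facets is a set intersection over posting lists instead of a multi-way join. The class `TagIndex` and the sample document ids are hypothetical; this illustrates the general technique, not the Guardian's implementation.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index: tag -> set of ids of documents carrying that tag."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, tags):
        for tag in tags:
            self._postings[tag].add(doc_id)

    def query(self, *tags):
        """Return ids of documents that carry every one of the given tags."""
        if not tags:
            return set()
        # Start from the smallest posting list so intermediate sets stay small.
        postings = sorted((self._postings.get(t, set()) for t in tags), key=len)
        result = set(postings[0])
        for posting in postings[1:]:
            result &= posting
        return result


index = TagIndex()
index.add("a1", ["environment/climatechange", "business/globaleconomy"])
index.add("a2", ["environment/climatechange"])
index.add("a3", ["business/globaleconomy", "technology/internet"])

print(index.query("environment/climatechange", "business/globaleconomy"))
```

Each extra facet adds one more intersection over already-materialised sets, which is why the cost grows gently as facets are combined.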




“Related content” from search engine




We used Solr/Lucene
         Can perform complex queries, including full text search

         We can change the schema with no downtime.

         On our dataset most queries are of a similar cost

         Scales very well horizontally

         Replication makes it easy to work in the cloud
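
As an illustration of the first point, a query that mixes full-text search with several tag facets maps onto Solr's standard request parameters: `q` for the text, one `fq` filter per facet, plus `rows` and `wt`. A minimal sketch that builds such a request URL; the field name `tag`, the base URL and the helper name `solr_select_url` are assumptions, since the slides do not show the Guardian's schema.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def solr_select_url(base, text, tags=(), rows=10):
    """Build a Solr /select URL with one filter query (fq) per tag.

    The 'tag' field name is hypothetical; q, fq, rows and wt are
    standard Solr request parameters.
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    params += [("fq", f'tag:"{tag}"') for tag in tags]
    return f"{base}/select?{urlencode(params)}"


url = solr_select_url(
    "http://localhost:8983/solr",
    "carbon trading",
    tags=["environment/climatechange", "business/globaleconomy"],
)
print(url)
# Decode the query string again to show the filters that were sent.
print(parse_qs(urlsplit(url).query)["fq"])
```

Keeping each facet in its own `fq` lets Solr cache the filters independently of the text query.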




Core
[Diagram, top to bottom: Web servers, App server, Memcached, rdbms, CMS]
[Diagram: the same core stack, with a Solr instance alongside the rdbms and six further Solr instances labelled "Content API" and "Cloud, EC2"]
Open in?

MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk.

      Simple REST/HTTP framework allows lightweight development
      Applications proxied for performance
      Apps generally hosted in the cloud, hot deployment into production
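
One plausible reading of "proxied for performance" is a caching proxy between the core site and the externally hosted apps, so that a slow app is not called on every page view. The slides do not describe the proxy's behaviour, so the sketch below is an assumption throughout: `CachingProxy`, its time-to-live rule and the fake app are all hypothetical.

```python
import time


class CachingProxy:
    """Cache responses from an external app for a fixed time-to-live."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # callable: path -> response body
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._cache = {}             # path -> (stored_at, body)

    def get(self, path):
        now = self._clock()
        hit = self._cache.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # still fresh: the app is not called
        body = self._fetch(path)
        self._cache[path] = (now, body)
        return body


calls = []


def fake_app(path):
    """Stand-in for an HTTP call to an externally hosted microapp."""
    calls.append(path)
    return f"<div>fragment for {path}</div>"


fake_now = [0.0]
proxy = CachingProxy(fake_app, ttl_seconds=60.0, clock=lambda: fake_now[0])
proxy.get("/weather")
proxy.get("/weather")                # served from cache
fake_now[0] = 61.0
proxy.get("/weather")                # expired, fetched again
print(len(calls))
```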




Core
[Diagram: the core stack (Web servers, App server, Memcached, rdbms, CMS) with a Proxy connecting it to six Apps labelled "external hosting, app engine etc"]
[Diagram: OPEN IN on the left (Apps on external hosting, app engine etc, reached through a Proxy); the core in the middle (Web servers, App servers, Memcached, CMS, rdbms); OPEN OUT on the right (Solr instances labelled "Cloud, EC2")]
Thank you
http://www.guardian.co.uk/open-platform
Twitter: @openplatform
         @cuica (Stephen Dunn)





More Related Content

Similar to From Publisher To Platform: How The Guardian Used Content, Search, and Open Source To Build a Powerful New Business Model

9th Content Providers Community Call\
9th Content Providers Community Call\9th Content Providers Community Call\
9th Content Providers Community Call\
OpenAIRE
 
Derailed chef update-oct2010
Derailed chef update-oct2010Derailed chef update-oct2010
Derailed chef update-oct2010
jtimberman
 
iTunes U and UCL
iTunes U and UCLiTunes U and UCL
iTunes U and UCL
Jeremy Speller
 
You Sir, Sir Vey
You Sir, Sir VeyYou Sir, Sir Vey
You Sir, Sir Vey
Everett Toews
 
Open Source Software Wikipedia 2008
Open Source Software Wikipedia 2008Open Source Software Wikipedia 2008
Open Source Software Wikipedia 2008
Thomas G Henry
 
Netscape and Opera Software Business Case Analysis UMB School Of Business And...
Netscape and Opera Software Business Case Analysis UMB School Of Business And...Netscape and Opera Software Business Case Analysis UMB School Of Business And...
Netscape and Opera Software Business Case Analysis UMB School Of Business And...
Rune Haugestad
 
Developing SOA Services with Red Hat JBoss and Eclipse tools
Developing SOA Services with Red Hat JBoss and Eclipse toolsDeveloping SOA Services with Red Hat JBoss and Eclipse tools
Developing SOA Services with Red Hat JBoss and Eclipse tools
Eclipse Day 2010 in Rome
 
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
Europeana
 
Creating OpenSocial Apps
Creating OpenSocial AppsCreating OpenSocial Apps
Creating OpenSocial Apps
Bastian Hofmann
 
"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016
shintaro mizuno
 
OpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro MizunoOpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro Mizuno
shintaro mizuno
 
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber..."Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
eLiberatica
 
A Match Made In The Cloud
A Match Made In The CloudA Match Made In The Cloud
A Match Made In The Cloud
Chapter Three
 
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
Europeana
 

Similar to From Publisher To Platform: How The Guardian Used Content, Search, and Open Source To Build a Powerful New Business Model (14)

9th Content Providers Community Call\
9th Content Providers Community Call\9th Content Providers Community Call\
9th Content Providers Community Call\
 
Derailed chef update-oct2010
Derailed chef update-oct2010Derailed chef update-oct2010
Derailed chef update-oct2010
 
iTunes U and UCL
iTunes U and UCLiTunes U and UCL
iTunes U and UCL
 
You Sir, Sir Vey
You Sir, Sir VeyYou Sir, Sir Vey
You Sir, Sir Vey
 
Open Source Software Wikipedia 2008
Open Source Software Wikipedia 2008Open Source Software Wikipedia 2008
Open Source Software Wikipedia 2008
 
Netscape and Opera Software Business Case Analysis UMB School Of Business And...
Netscape and Opera Software Business Case Analysis UMB School Of Business And...Netscape and Opera Software Business Case Analysis UMB School Of Business And...
Netscape and Opera Software Business Case Analysis UMB School Of Business And...
 
Developing SOA Services with Red Hat JBoss and Eclipse tools
Developing SOA Services with Red Hat JBoss and Eclipse toolsDeveloping SOA Services with Red Hat JBoss and Eclipse tools
Developing SOA Services with Red Hat JBoss and Eclipse tools
 
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
Europeana Generic Services Projects Meeting, 29-30 October 2018, The Hague, E...
 
Creating OpenSocial Apps
Creating OpenSocial AppsCreating OpenSocial Apps
Creating OpenSocial Apps
 
"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016
 
OpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro MizunoOpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro Mizuno
 
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber..."Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
"Understanding Open Source and Ubuntu Part 2 of 2" by Kurt von Finck @ eLiber...
 
A Match Made In The Cloud
A Match Made In The CloudA Match Made In The Cloud
A Match Made In The Cloud
 
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
Europeana Network Association AGM 2016 - 8 November - Merete Sanderhoff & Har...
 


From Publisher To Platform: How The Guardian Used Content, Search, and Open Source To Build a Powerful New Business Model

  • 1. From publisher to platform: How the Guardian used content, search, and open source to build a powerful new business model. Stephen Dunn, Guardian News and Media. Apache Lucene EuroCon, 21 May 2010
  • 2. The publishing era
  • 3. We started a long time ago:
  • 4. “To secure the financial and editorial independence of The Guardian in perpetuity.” “To promote freedom in the press and liberal journalism globally.”
  • 6. 2010: Keyword page, Live blogs, iPhone app, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Data API
  • 10. 01 -> 06
  • 11. 2009: 1.5M pages and counting; 250M+ pages/month; 30M visitors/month; 4x Webby award winner (best newspaper site)
  • 15. Part of the Web
  • 16. 1. Permanent ★ “A cool URI is one that does not change” (Tim Berners-Lee, 1998) ★ 1.5 million resources redirected to new scheme (image credit: http://www.flickr.com/photos/fstorr/)
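The redirect idea on slide 16 can be sketched roughly as follows: when a URL scheme changes, each retired address keeps working by answering with a permanent redirect to its new home. This is only an illustration. The paths and the `LEGACY_TO_CANONICAL` table are invented, and the slides do not show how the Guardian actually implemented its 1.5 million redirects.

```python
from typing import Optional

# Hypothetical mapping from retired URLs to their permanent replacements.
# Real paths are not given in the slides; these are made up for illustration.
LEGACY_TO_CANONICAL = {
    "/old/section/politics": "/politics",
    "/old/story/12345.html": "/world/2009/apr/27/example-story",
}


def permanent_redirect_for(path: str) -> Optional[str]:
    """Return the new location for a retired path, or None if the path
    is not a known legacy address and should be served normally."""
    return LEGACY_TO_CANONICAL.get(path)


def respond(path: str) -> tuple:
    """Return (status, headers) for a request path.

    Legacy paths get an HTTP 301 (Moved Permanently) so that bookmarks
    and inbound links keep resolving; everything else is passed through.
    """
    target = permanent_redirect_for(path)
    if target is not None:
        return 301, {"Location": target}
    return 200, {}
```

A lookup table keeps the old addresses alive without keeping the old pages; a site with a regular legacy scheme might derive the target with a rule instead of listing every entry.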
  • 17. 2. Addressable ★ Resources are “about” something - ready for the social web. ★ We live in “the age of point-at-things” (Coates 2005)
  • 18. 3. Discoverable ★ Multiple routes to content ★ Tagging drives discovery
  • 32. Site traffic growth (chart): unique users from Sep 2005 to Jan 2009, on an axis from 3,750,000 to 30,000,000, with the markers “Pre-project”, “First release”, “Final release” and “36M”
  • 34. 1 Billion+ Internet Users!
  • 38. “How I stopped worrying about my website and learned to love the whole Internet.” Matt McAlister
  • 39. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other digital platforms.
  • 43. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
  • 44. [BETA] The Open Platform
  • 45. [BETA] OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Allow partners to build applications using Guardian content and services for other digital platforms.
  • 47. [BETA] The suite of services enabling partners to build applications with the Guardian
  • 49. [BETA] CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  • 50. [BETA] CONTENT API: A service for selecting and collecting content from the Guardian for re-use. Diagram layers: Your App Here!, REST API, Search engine, CMS, Guardian database
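As a rough illustration of the "Your App Here!" layer on slide 50, a client of a REST content service mostly needs to build a query URL and read the response. The sketch below only builds the URL. The host `content-api.example.com` and the parameter names `q`, `tag` and `api-key` are placeholders chosen for this example, not the documented Guardian interface.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the slides do not give the real URL or parameters.
BASE_URL = "https://content-api.example.com/search"


def build_search_url(query, tags=(), api_key=None):
    """Build a GET URL that selects content by free text and optional tags.

    query   -- free-text search terms
    tags    -- optional iterable of tag identifiers to filter on
    api_key -- optional key; omitted for unauthenticated requests
    """
    params = [("q", query)]
    if tags:
        params.append(("tag", ",".join(tags)))
    if api_key:
        params.append(("api-key", api_key))
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("swine flu", tags=["world/swine-flu"]))
```

Keeping the request a plain HTTP GET means any platform with an HTTP client can consume the content, which fits the "open out" goal of the deck.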
  • 52. Stamen Design - APIMaps.org
  • 54. [BETA] DATA STORE: A directory of useful data curated by Guardian editors
  • 55. [BETA] POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
  • 57. [BETA] Open for Business
  • 58. Open for Business
  • 59. 1. 3 tiers of access, 3 revenue models. BESPOKE: Take, reformat, augment our content. Same access as Guardian. Revenue model to be negotiated. Combination of media, fees, downloads. APPROVED: Take our full article content, with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue. KEYLESS: Take our headlines. You keep associated revenues.
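The three tiers on slide 59 can be pictured as a small lookup from the kind of caller to what that caller may take and who keeps which revenue. The sketch below is an assumption-laden model: the `Tier` class, the `tier_for` function and the idea of selecting a tier from an API key and a set of bespoke keys are invented for illustration, and only the tier names and descriptions come from the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    content: str   # what a partner may take
    revenue: str   # who keeps which revenue


# Descriptions paraphrase slide 59; the structure itself is hypothetical.
TIERS = {
    "keyless": Tier("KEYLESS", "headlines",
                    "partner keeps associated revenues"),
    "approved": Tier("APPROVED", "full article content, with an advert",
                     "Guardian keeps ad revenue, partner keeps rest-of-page revenue"),
    "bespoke": Tier("BESPOKE", "take, reformat, augment; same access as Guardian",
                    "negotiated: combination of media, fees, downloads"),
}


def tier_for(api_key: Optional[str],
             bespoke_keys: FrozenSet[str] = frozenset()) -> Tier:
    """Pick a tier for a request (assumed rule, not from the slides):
    no key -> KEYLESS, key with a bespoke agreement -> BESPOKE,
    any other key -> APPROVED."""
    if api_key is None:
        return TIERS["keyless"]
    if api_key in bespoke_keys:
        return TIERS["bespoke"]
    return TIERS["approved"]
```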
  • 61. What this means. OPEN OUT: Developers can now access our full content APIs on demand, with keys post-approved. We are now positioning the platform as a place to do business with us, so rapid scalability, reliability and performance are now core requirements.
  • 63. 2. Open In. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk.
  • 64. OPEN OUT: Allow partners to build applications using Guardian content and services for other digital platforms. OPEN IN: Bring in data and apps from the Internet.
  • 67. App showcase
  • 68. What this means. OPEN IN: Partners can now more easily integrate into our core. The Open Platform will become key to our commercial future.
  • 69. Evolving the architecture
  • 70. From Publisher to Platform ★ Seeking massive growth, but no longer only broadcasting content ★ User/partner engagement & contribution on journalism, data, software, applications, revenue and ads ★ Support developers and partners with data and APIs; need scalability, reliability, speed
  • 71. Architecture diagram: Web servers (x3), App servers (x3), Memcached, Oracle, CMS
  • 72. Why RDBMS? 5 years ago, fewer alternatives. Understand operations procedures. Can easily recruit DBAs / devs. Developer/ops tools. Business critical system: a safe choice. (Same architecture diagram, with Data feeds added.)
  • 75. Unique Users (chart): Sep 2005 to Jan 2009, axis from 3,750,000 to 30,000,000
  • 77. Unique Users (chart): May 2008 to Jan 2009, axis from 12,250,000 to 28,000,000
  • 78. What's going on? ★ We tag our content (multifaceted) ★ Guardian.co.uk is a faceted browse through our tag-space, with editorial teams “spotlighting” key resources on selected nodes. ★ Multiple facets can be applied in queries faster in a search-like architecture than in an RDBMS
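One way to see why a search-like architecture suits faceted browsing: a search index keeps, for each tag, the set of documents carrying it, so applying several facets is a set intersection rather than a chain of joins. The sketch below is a toy in-memory version of that idea, not how Lucene stores its index, and the article ids and tags are invented.

```python
from collections import defaultdict


def build_tag_index(articles):
    """Build an inverted index: tag -> set of article ids.

    articles -- dict mapping article id to an iterable of tags
    """
    index = defaultdict(set)
    for article_id, tags in articles.items():
        for tag in tags:
            index[tag].add(article_id)
    return index


def faceted_query(index, *tags):
    """Return the ids of articles carrying every one of the given tags."""
    postings = [index.get(tag, set()) for tag in tags]
    if not postings:
        return set()
    # Start from the smallest set so each intersection stays cheap.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


if __name__ == "__main__":
    articles = {
        "a1": ["world/swine-flu", "tone/news"],
        "a2": ["world/swine-flu", "tone/comment"],
        "a3": ["politics/election", "tone/news"],
    }
    index = build_tag_index(articles)
    print(faceted_query(index, "world/swine-flu", "tone/news"))
```

Adding a further facet only adds one more intersection, which is the property the slide contrasts with an RDBMS.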
  • 81. “Related content” from search engine
  • 83. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. Diagram layers: Your App Here!, REST API, Search engine, CMS, Guardian database
  • 85. We used Solr/Lucene. Can perform complex queries, including full text search. We can change the schema with no downtime. On our dataset most queries are of a similar cost. Scales very well horizontally. Replication makes it easy to work in the cloud.
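To make "complex queries, including full text search" a little more concrete, here is a hedged sketch of assembling a Solr select request that combines free text with tag filters and asks for facet counts. The parameters `q`, `fq`, `facet`, `facet.field`, `rows` and `wt` are standard Solr request parameters; the host, the core name `content` and the field name `tag` are assumptions, since the slides do not show the Guardian's schema.

```python
from urllib.parse import urlencode

# Hypothetical Solr location and core name.
SOLR_SELECT = "http://solr.example.internal:8983/solr/content/select"


def solr_params(text, tags=(), rows=10):
    """Return Solr request parameters as a list of (name, value) pairs.

    text -- full-text query string
    tags -- tag values; each becomes its own filter query (fq)
    rows -- number of results to return
    """
    params = [("q", text)]
    for tag in tags:
        # One fq per tag narrows the result set to documents with all tags.
        params.append(("fq", 'tag:"%s"' % tag))
    params += [
        ("facet", "true"),       # ask Solr to return facet counts
        ("facet.field", "tag"),  # ...computed over the (assumed) tag field
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    return params


def solr_query_url(text, tags=(), rows=10):
    """Encode the parameters into a GET URL for the select handler."""
    return SOLR_SELECT + "?" + urlencode(solr_params(text, tags, rows))
```

Because every such request is a plain HTTP GET against a read-only replica, it is straightforward to spread load over several Solr instances, which matches the horizontal-scaling point on the slide.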
  • 86. Core: Web servers, App server, Memcached, rdbms, CMS
  • 87. Core: Web servers, App server, Memcached, rdbms, CMS. Content API: Solr (x6), Cloud, EC2
  • 88. Open in? MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk. Simple REST/HTTP framework allows lightweight development. Applications proxied for performance. Apps generally hosted in the cloud, hot deployment into production.
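"Applications proxied for performance" could look something like the sketch below: the core site asks a proxy for a microapp's HTML fragment, and the proxy caches the answer for a short time and shields the page from a slow or failing app. This is an assumption about what the proxy does, not a description of the Guardian's framework. The class name `MicroappProxy`, the time-to-live and the stale-on-error behaviour are all invented for illustration, and the fetch function is injected so the example needs no network.

```python
import time
from typing import Callable, Dict, Tuple


class MicroappProxy:
    """Cache microapp fragments and degrade gracefully on failure."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # function that retrieves a fragment by URL
        self._ttl = ttl_seconds    # how long a cached fragment stays fresh
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def fragment(self, url: str) -> str:
        """Return the fragment for url, from cache when still fresh."""
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            body = self._fetch(url)
        except Exception:
            # The microapp is down: serve the stale copy if there is one,
            # otherwise an empty fragment so the host page still renders.
            return cached[1] if cached is not None else ""
        self._cache[url] = (now, body)
        return body
```

In use, the page template would call `proxy.fragment(app_url)` for each embedded microapp, so a third-party app hosted elsewhere cannot slow down or break the core site on every request.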
  • 90. Core: Web servers, App server, Memcached, rdbms, CMS. Proxy. Apps: App (x6), external hosting, app engine etc
  • 91. OPEN IN: Proxy, App (x6), external hosting, app engine etc. OPEN OUT: Solr (x6), Cloud, EC2. Core: Web servers, App servers, Memcached, CMS, rdbms
  • 93. Thank you. http://www.guardian.co.uk/open-platform Twitter: @openplatform, @cuica (Stephen Dunn)