Mashing up

        Michael Brunton-Spall
michael.brunton-spall@guardian.co.uk
            @mibgames
About Us

 Online since 1995
 250M+ pages per month
 30M+ visitors per month
1995 - Guardian Online
1999 - Guardian Unlimited
1999 - Removed the registration wall
2007 - Rebuild and Redesign
Developing The Guardian
The Hackable Guardian

 Url Hacking
 Keyword Combiners
 RSS Feeds
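Url Hacking

http://www.guardian.co.uk/
    [section]/[keyword]
        technology/internet
    [section]/all
        environment/all
    [publication]/[date]/[newspapersection]
        theguardian/2009/jun/11/technologyguardian
    profile/[name]
        profile/bobbiejohnson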
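RSS Feeds

RSS Everywhere!
  [section]/[keyword]/rss
      technology/internet/rss
  [section]/all/rss
      environment/all/rss
  [publication]/[date]/[newspapersection]/rss
      theguardian/2009/jun/11/technologyguardian/rss
  profile/[name]/rss
      profile/bobbiejohnson/rss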
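
The URL patterns from the Url Hacking and RSS Feeds slides can be assembled mechanically. A minimal sketch, assuming only the patterns shown in the talk; the helper names `section_url` and `rss_url` are made up for illustration and are not part of any Guardian library.

```python
BASE = "http://www.guardian.co.uk/"


def section_url(section, keyword="all"):
    """Build a [section]/[keyword] page URL; keyword defaults to 'all'."""
    return BASE + section.strip("/") + "/" + keyword.strip("/")


def rss_url(page_url):
    """Append /rss to a page URL, following the 'RSS Everywhere!' pattern."""
    return page_url.rstrip("/") + "/rss"


if __name__ == "__main__":
    page = section_url("technology", "internet")
    print(page)
    print(rss_url(page))
    print(rss_url(section_url("environment")))
```

Because these URLs are built by string convention rather than looked up, they share the fragility the talk goes on to describe.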
Full Fat!


http://www.flickr.com/photos/snapperwolf/
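Url Combiners

  A Logical AND
  Almost any combination from before
     technology/internet+profile/bobbiejohnson
     theguardian/technologyguardian+technology/internet
  Except things that aren't actual tags:
     Dated newspaper sections
        theguardian/2009/jun/11... + anything = 404
     the all page
        .../all + anything = 404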
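
A combiner URL is two or more tag paths joined with `+`. The talk notes that dated newspaper sections and the all page are not real tags, so combining them returns a 404. The sketch below rejects those cases before building the URL. The function name `combine` and the date pattern (modelled on `2009/jun/11`) are assumptions for illustration, not Guardian code.

```python
import re

BASE = "http://www.guardian.co.uk/"

# Matches a dated path segment such as 2009/jun/11. The format is an
# assumption based on theguardian/2009/jun/11/technologyguardian.
_DATED = re.compile(r"(^|/)\d{4}/[a-z]{3}/\d{1,2}(/|$)")


def combine(*paths):
    """Join tag paths with '+' (a logical AND), rejecting non-tag paths."""
    if len(paths) < 2:
        raise ValueError("need at least two paths to combine")
    cleaned = [p.strip("/") for p in paths]
    for p in cleaned:
        if p == "all" or p.endswith("/all"):
            raise ValueError("%r is an 'all' page, not a tag" % p)
        if _DATED.search(p):
            raise ValueError("%r is a dated section, not a tag" % p)
    return BASE + "+".join(cleaned)


if __name__ == "__main__":
    print(combine("technology/internet", "profile/bobbiejohnson"))
```

Failing early with a `ValueError` is a design choice for this sketch; the site itself simply answers with a 404.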
Problems with this approach

  Fragile
  No Discovery
  Not well documented
  Copyright Issues
Build applications with the Guardian
Open Platform

  Opening up how we work with people both internally
  and externally
  A suite of services enabling partners to build
  applications with the Guardian
     Content API
     Data Store
Data Store
  A directory of useful data curated by Guardian editors
Content API

  A service for selecting and collecting content from the
  Guardian for re-use
     Search content
     XML and JSON
     Tag Metadata
     Full Article Body
     Filters
So why?

 guardian.co.uk has an amazing amount of
 quality content
 Incredible amounts of meta-data, curated by
 editors
 Aim to allow the Guardian to become a rich source
 of facts and journalism for the web
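How?
 Backed by our search platform
 Provides access to 10 years of article content
 and metadata
 Supports multiple output formats: XML, JSON, ATOM
 Supports free text search across content
 Search for keywords
 Guardian supported API projects in Java, PHP, Python and Ruby
 Community supported API projects in Perl, ActionScript
 and ColdFusion (and probably more?)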
Pricing

  FREE!
How free is free?

  You can publish full articles from the Guardian on your
  website
  5k queries per day limit
  24 hour maximum cache lifetime
  Online support
  Partner with us on advertising
Beta trial


  Limited number of keys
  Collecting feedback
  Will open more widely at end of beta program
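How do I use it?

  Home page
     http://www.guardian.co.uk/open-platform
  Sign up for an API key at
     http://guardian.mashery.com
  Use the API Explorer at
     http://api.guardianapis.com/docs/
  Use the Python library at
     http://code.google.com/p/openplatform-python/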
What can I do with it?

  Search our tag hierarchy
  Find content by tag
  Find content by search terms

  Display on your website!
MP's Expenses

 Written in Django
 Hosted on EC2
 Easy to modify

 An MP's Page
View

def mp(request, id):
  mp = get_object_or_404(MP, pk=id)
  ...

  return render(request, 'mp.html', {
     'mp': mp,
     'documents': mp.documents.all(),
     'top_users': top_users[:5],
  })
Using the Guardian API

  Getting all articles about an MP
     results = client.search(q='"%s"' % (name))
     returns a paginating iterator
     for x in results:
         print(x['headline'])
     Shows only first 10 by default
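
The search call above wraps the MP's name in double quotes so it is matched as a phrase, and the result is an iterator. The sketch below shows one way to cap how many results are consumed. The only library calls taken from the talk are `client.search(q=...)` and the `headline` field; `phrase_query`, `headlines_about` and `FakeClient` are hypothetical names introduced so the example runs without an API key.

```python
from itertools import islice


def phrase_query(name):
    """Wrap a name in double quotes so it is searched as an exact phrase."""
    # Dropping embedded quotes is an assumption; the talk does not say how
    # the API treats quotes inside a phrase.
    return '"%s"' % name.replace('"', "")


def headlines_about(client, name, limit=10):
    """Return up to `limit` headlines from client.search for the given name."""
    results = client.search(q=phrase_query(name))
    return [item["headline"] for item in islice(results, limit)]


class FakeClient(object):
    """Hypothetical stand-in for the real client, used only for this demo."""

    def search(self, q):
        for i in range(25):
            yield {
                "headline": "%s story %d" % (q, i),
                "webUrl": "http://example.invalid/%d" % i,
            }


if __name__ == "__main__":
    for headline in headlines_about(FakeClient(), "An Example MP", limit=3):
        print(headline)
```

Using `islice` stops iteration after `limit` items, so a paginating iterator never needs to fetch pages the page will not display.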
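Get the client

def mp(request, id):
  mp = get_object_or_404(MP, pk=id)
  ...
  client = Client(settings.GUARDIAN_APIKEY)
  return render(request, 'mp.html', {
     'mp': mp,
     'documents': mp.documents.all(),
     'top_users': top_users[:5],
  })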
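Make the request

def mp(request, id):
  mp = get_object_or_404(MP, pk=id)
  ...
  client = Client(settings.GUARDIAN_APIKEY)
  return render(request, 'mp.html', {
     'mp': mp,
     'documents': mp.documents.all(),
     'top_users': top_users[:5],
     'articles': client.search(q='"%s"' % (mp.name)),
  })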
Update the template

<div id="about-mp">
  <ul>
     {% for article in articles %}
     <li>
        <a href="{{article.webUrl}}">{{article.headline}}</a>
     </li>
     {% endfor %}
  </ul>
</div>
And the result!
Additions?

  Use Memcached to cache the response from guardian API
  Don't create a new client each time
  Article bodies
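
The first two additions can be sketched together: cache each search result and reuse one client. The talk suggests Memcached; to stay self-contained this sketch substitutes a small in-memory cache with the same 24 hour limit the usage terms give. `TTLCache` and `get_or_fetch` are hypothetical names, not part of Django or the Guardian library.

```python
import time

# The 24 hour maximum cache lifetime from the usage terms, in seconds.
MAX_AGE_SECONDS = 24 * 60 * 60


class TTLCache(object):
    """Tiny in-memory stand-in for Memcached: values expire after max_age."""

    def __init__(self, max_age=MAX_AGE_SECONDS, clock=time.time):
        self._max_age = max_age
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() on a miss or expiry."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]
        value = fetch()
        self._store[key] = (now, value)
        return value
```

In the view this might be used as `cache.get_or_fetch(mp.name, lambda: list(client.search(q='"%s"' % mp.name)))`, with `cache` and `client` created once at module level rather than per request. The `list(...)` call matters because a consumed iterator cannot be replayed from the cache.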
Who else is using it?
Public API Key?

 jbynv3fwdp8ju5625mt2axw3
 Only 5k queries a day
 Will be closed on Monday 6th July
 Will be closed if abused
 Strongly encourage you to sign up for your own key
Questions?

Mashing Up The Guardian

  1. Mashing up Michael Brunton-Spall michael.brunton-spall@guardian.co.uk @mibgames
  2. About Us Online since 1995 250M+ pages per month 30M+ visitors per month
  3. 1995 - Guardian Online
  4. 1999 - Guardian Unlimited
  5. 1999 - Removed the registration wall
  6. 2007 - Rebuild and Redesign
  7. 2007 - Rebuild and Redesign
  8. 2007 - Rebuild and Redesign
  9. Developing The Guardian
  10. The Hackable Guardian Url Hacking Keyword Combiners RSS Feeds
  11. Url Hacking http://www.guardian.co.uk/ [section]/[keyword] technology/internet [section]/all environment/all [publication]/[date]/[newspapersection] theguardian/2009/jun/11/technologyguardian profile/[name] profile/bobbiejohnson
  12. RSS Feeds RSS Everywhere! [section]/[keyword]/rss technology/internet/rss [section]/all/rss environment/all/rss [publication]/[date]/[newspapersection]/rss theguardian/2009/jun/11/technologyguardian/rss profile/[name]/rss profile/bobbiejohnson/rss
  13. Full Fat! http://www.flickr.com/photos/snapperwolf/
  14. Url Combiners A Logical AND Almost any combination from before technology/internet+profile/bobbiejohnson theguardian/technologyguardian+technology/internet Except things that aren't actual tags: Dated newspaper sections theguardian/2009/jun/11... + anything = 404 the all page .../all + anything = 404
  15. Problems with this approach Fragile No Discovery Not well documented Copyright Issues
  16. Build applications with the Guardian
  17. Open Platform Opening up how we work with people both internally and externally A suite of services enabling partners to build applications with the Guardian Content API Data Store
  18. Data Store A directory of useful data curated by Guardian editors
  19. Data Store A directory of useful data curated by Guardian editors
  20. Data Store
  21. Data Store
  22. Content API A service for selecting and collecting content from the Guardian for re-use
  23. Content API A service for selecting and collecting content from the Guardian for re-use
  24. Content API Search content
  25. Content API XML and JSON
  26. Content API Tag Metadata
  27. Content API Full Article Body
  28. Content API Filters
  29. So why? guardian.co.uk has an amazing amount of quality content Incredible amounts of meta-data, curated by editors Aim to allow the Guardian to become a rich source of facts and journalism for the web
  30. How? Backed by our search platform Provides access to 10 years of article content and metadata Supports multiple output formats: XML, JSON, ATOM Supports free text search across content Search for keywords Guardian supported API projects in Java, PHP, Python and Ruby Community supported API projects in Perl, ActionScript and ColdFusion (and probably more?)
  31. Pricing
  32. Pricing FREE!
  33. How free is free? You can publish full articles from the Guardian on your website 5k queries per day limit 24 hour maximum cache lifetime Online support Partner with us on advertising
  34. Beta trial Limited number of keys Collecting feedback Will open more widely at end of beta program
  35. How do I use it? Home page http://www.guardian.co.uk/open-platform Sign up for an API key at http://guardian.mashery.com Use the API Explorer at http://api.guardianapis.com/docs/ Use the Python library at http://code.google.com/p/openplatform-python/
  36. What can I do with it? Search our tag hierarchy Find content by tag Find content by search terms Display on your website!
  37. MP's Expenses
  38. MP's Expenses Written in Django Hosted on EC2 Easy to modify
  39. MP's Expenses An MP's Page
  40. View def mp(request, id): mp = get_object_or_404(MP, pk=id) ... return render(request, 'mp.html', { 'mp': mp, 'documents': mp.documents.all(), 'top_users': top_users[:5], })
  41. View def mp(request, id): mp = get_object_or_404(MP, pk=id) ... return render(request, 'mp.html', { 'mp': mp, 'documents': mp.documents.all(), 'top_users': top_users[:5], })
  42. View def mp(request, id): mp = get_object_or_404(MP, pk=id) ... return render(request, 'mp.html', { 'mp': mp, 'documents': mp.documents.all(), 'top_users': top_users[:5], })
  43. Using the Guardian API Getting all articles about an MP results = client.search(q='"%s"' % (name)) returns a paginating iterator for x in results: print(x['headline']) Shows only first 10 by default
  44. Get the client def mp(request, id): mp = get_object_or_404(MP, pk=id) ... client = Client(settings.GUARDIAN_APIKEY) return render(request, 'mp.html', { 'mp': mp, 'documents': mp.documents.all(), 'top_users': top_users[:5], })
  45. Make the request def mp(request, id): mp = get_object_or_404(MP, pk=id) ... client = Client(settings.GUARDIAN_APIKEY) return render(request, 'mp.html', { 'mp': mp, 'documents': mp.documents.all(), 'top_users': top_users[:5], 'articles': client.search(q='"%s"' % (mp.name)), })
  46. Update the template <div id="about-mp"> <ul> {% for article in articles %} <li> <a href="{{article.webUrl}}">{{article.headline}}</a> </li> {% endfor %} </ul> </div>
  47. And the result!
  48. Additions? Use Memcached to cache the response from guardian API Don't create a new client each time Article bodies
  49. Who else is using it?
  50. Who else is using it?
  51. Who else is using it?
  52. Who else is using it?
  53. Public API Key? jbynv3fwdp8ju5625mt2axw3 Only 5k queries a day Will be closed on Monday 6th July Will be closed if abused Strongly encourage you to sign up for your own key
  54. Questions?
