New information for new journalists pt2: data

Presentation to ESCACC, Barcelona, 2010

  1. Data journalism: Introduction (Paul Bradshaw)
  2. Ivy Lee
  3. “Each weekday, my computer program goes to the Chicago Police Department's website and gathers all crimes reported in Chicago.” (Adrian Holovaty)
  4. Why? Great stories; engagement; targeting/relevance
  5. “The Tribune’s biggest magnet by far has been its more than three dozen interactive databases, which collectively have drawn three times as many page views as the site’s stories.” http://bit.ly/dj2dmz
  6. Times film genres
  7. Data Journalism Continuum
  8. 1. Finding data
  9. What is data?
  10. Numbers; text; connections; live data; behavioural data; images, audio, video; anything that a computer can work with
  11. Passive vs active data journalism: Start with the data and look for the stories (MPs’ expenses)? Or start with a lead and look for the data?
  12. Some UK projects: Data.gov.uk; What Do They Know; Openlylocal, Scraperwiki; disclosure logs; RSS feeds, XML, structured data
  13. CAR: Delicious.com/paulb/car
  14. Advanced search by file type: “Performance figures”; filetype:pdf; filetype:xls; filetype:doc; filetype:ppt; filetype:rdf OR xml
  15. Advanced search by domain: “Disclosure logs” site:.gov.es; Database site:.org.cat OR .org; +Tables -chairs; site: health, police, military domains
  16. Use overseas sources: US medicine databases; EU subsidy databases; Swedish people data; international police agency correspondence
  17. Scraping: Scraping can automate and schedule the gathering process if there are multiple sources. Tools: OutWit Hub plugin, Yahoo! Pipes, Scraperwiki, Google Spreadsheets formulae
  18. Interrogating data
  19. Time spent now... Humans collect data; humans enter data; human error
  20. ...saves time later. Different words for the same thing; double spaces, punctuation; wrong data type; mistyped; duplicate entries; default entries (1/1/00)
  21. "Because we take the time to clean the data, we are able to do lobbying stories no other news organisation can do." (David Donald, Center for Public Integrity)
  22. Cleaning methods: Group by term, then sort to see duplications; find and replace double spaces, etc.; select column/row and check data type; sort to find unusually large/small values and neighbouring misspellings
  23. Check: Never publish a name from data without running a background check
  24. Other tools: Freebase Gridworks (see http://vimeo.com/10081183)
  25. Visualising data
  26. or http://chartchooser.juiceanalytics.com/
  27. (trends, dips, correlations)
  28. (comparison, themes)
  29. (proportions, comparison)
  30. Mashing data
  31. Combining data sources: Geocoded data with map (live data, e.g. Twitter API; static data, e.g. Google Docs; dynamic data, e.g. Google Form); 2 spreadsheets with common data (tools: MySQL, Access, etc.)
  32. Combining data sources: Twittermap; Wikipedia map; NYT Property; Guardian vs Nature; BBC Most Read; BBC Olympic Village
  33. What mashes well? Big events (protests, Olympics, inauguration); comparisons; geocoded data; connections
  34. Yahoo! Pipes: Aggregates; maps; filters; counts; cleans or reformats (regex)
  35. Other tools: Scraperwiki – mapping library; Maptube – combine maps; Google Docs – publish in different formats; +++
  36. Semantic web & linked data: Computer-readable data; Paris – France, Texas, or Hilton?; unique identifiers – usually URI; RDF, RDFa, XML, etc.
  37. API: Application Programming Interface; build on top of data; Google Maps, Twitter, Facebook, Digg, Guardian, NYT, NPR, They Work For You, etc.
  38. Q&A: Slideshare.net/onlinejournalist; Twitter.com/paulbradshaw
  39. Bookmarks: Delicious.com/paulb/datajournalism; Delicious.com/paulb/visualisation; Delicious.com/paulb/statistics
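
The cleaning problems and methods listed on slides 20 and 22 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the presenter's own method: the sample rows, the column names, the day/month/year date format and every function name are hypothetical, and the slides themselves point to spreadsheet tools and Freebase Gridworks rather than code.

```python
import re
from collections import defaultdict
from datetime import date, datetime

# Hypothetical rows, invented purely for illustration.
ROWS = [
    {"name": "Acme  Ltd.", "amount": "1200", "date": "03/02/2010"},
    {"name": "acme ltd", "amount": "1,200", "date": "01/01/2000"},
    {"name": "Bolt & Co", "amount": "n/a", "date": "15/04/2010"},
    {"name": "Bolt & Co", "amount": "950000", "date": "16/04/2010"},
]

# Assumed placeholder date, standing in for the "1/1/00" default entry.
DEFAULT_DATE = date(2000, 1, 1)


def normalise(term):
    """Collapse double spaces, drop punctuation (except &), lower-case."""
    term = re.sub(r"\s+", " ", term.strip())
    term = re.sub(r"[^\w\s&]", "", term)
    return term.lower()


def group_variants(rows, column):
    """Group raw spellings by normalised term to expose duplications."""
    groups = defaultdict(set)
    for row in rows:
        groups[normalise(row[column])].add(row[column])
    # Keep only terms that were written in more than one way.
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}


def parse_amount(text):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None


def wrong_type_rows(rows, column):
    """Row indexes whose cell does not hold the expected number."""
    return [i for i, row in enumerate(rows)
            if parse_amount(row[column]) is None]


def default_date_rows(rows, column, fmt="%d/%m/%Y"):
    """Row indexes whose date equals the assumed default entry."""
    return [i for i, row in enumerate(rows)
            if datetime.strptime(row[column], fmt).date() == DEFAULT_DATE]


def sorted_amounts(rows, column):
    """Numeric cells sorted ascending, so extremes sit at either end."""
    values = [(parse_amount(row[column]), row["name"]) for row in rows]
    return sorted(v for v in values if v[0] is not None)


if __name__ == "__main__":
    print(group_variants(ROWS, "name"))
    print(wrong_type_rows(ROWS, "amount"))
    print(default_date_rows(ROWS, "date"))
    print(sorted_amounts(ROWS, "amount")[-1])
```

Each helper only flags rows for a human to inspect; none of them edits the data, which fits the advice on slide 23 to check before publishing.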