Data Journalism (City Online Journalism wk8)




Week 8 lecture to students on the 8 MAs at City University






Upload Details

Uploaded as Microsoft PowerPoint

Usage Rights

CC Attribution-NonCommercial License


Presentation Transcript

  • Data journalism. Online Journalism, City University. Paul Bradshaw
  • Themes
      • 1. What is it?
      • 2. Where to get it
      • 3. How to get it
  • “Each weekday, my computer program goes to the Chicago Police Department's website and gathers all crimes reported in Chicago.” (Adrian Holovaty)
  • Times film genres
      • Times Data Blog
  • ” QUOTE” Now is a good time.
  • “The Tribune’s more than three dozen interactive databases collectively have drawn three times as many page views as the site’s stories. [75% of traffic]”
  • What is data?
  • Numbers; text; live data; behavioural data; images, audio, video; anything that a computer can work with
  • Passive vs active data journalism
      • Start with the data and look for the stories? (MPs’ expenses)
      • Or start with a lead and look for the data?
  • Data Journalism Continuum
  • Finding
      • Guardian datastore
      • Openlylocal, Open Corporates, Open Charities, Who's Lobbying etc.
      • FOI requests (WDTK), disclosure logs
      • Books - British Political Facts
  • Finding – data communities
      • WDMMG forums
      • MySociety mailing lists
      • Open Data Cookbook
      • Wolfram Alpha forum
  • Government - national and local; 'Monitors' - regulators & other bodies; charities, pressure groups; institutions - academic, scientific, health; business, finance; media, entertainment, sport; other secondary sources
  • Advanced search
      • (etc)
      • filetype:pdf (etc)
      • Imagine the page you hope to find, including jargon etc.
      • Database contents are invisible
      • Google News alerts: report OR review
  • Advanced search
      • "quotes" search for exact phrases, e.g. "disclosure logs"
      • + ensures page contains word: +logs
      • - omits results with word: -wooden
      • * wildcard, e.g. "deaths * custody"
      • ~ synonyms, e.g. ~deaths
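The operators above can be combined programmatically. The following is a minimal sketch, not a tool from the lecture: `build_query` and `search_url` are hypothetical helper names, and the Google search URL pattern with a `q` parameter is an assumption about where the query would be sent.

```python
from urllib.parse import urlencode


def build_query(phrase=None, include=(), exclude=(), filetype=None):
    """Assemble a search string from the operators listed on the slide."""
    parts = []
    if phrase:
        # Double quotes ask for the exact phrase.
        parts.append('"%s"' % phrase)
    # Plain keywords are passed through unchanged.
    parts.extend(include)
    # A leading minus omits results containing the word.
    parts.extend("-" + word for word in exclude)
    if filetype:
        # filetype: restricts results to one document format.
        parts.append("filetype:" + filetype)
    return " ".join(parts)


def search_url(query):
    """Build a search URL (assumed endpoint; adjust for other engines)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(phrase="disclosure logs", exclude=["wooden"], filetype="pdf")
print(query)
print(search_url(query))
```

Keeping the query builder separate from the URL builder means the same query string can be pasted into a search box by hand or reused for a news alert.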
  • Tip: use overseas sources
      • US medicine databases
      • EU subsidy databases
      • Swedish people data
      • International police agency correspondence with UK
  • Feeds and scrapers
      • RSS, XML, JSON, RDF - and APIs
      • Scraperwiki
      • Outwit Hub
      • Yahoo! Pipes
      • Spreadsheet formulae (look them up)
  • 'Structured' data: Format? Table? Pattern? URL?
  • 'Structured' HTML? (Use Firebug)
    • <p>       <strong>
    • Case Ref: FS50295557 <br />Date: 04/11/2010 <br />Public Authority: London Borough of Southwark <br />Summary: </strong>
    • The complainant requested a copy of the authorities approved business plan  [...]<br /><strong>Section of Act/EIR &amp; Finding: </strong>FOI 1 - Complaint Upheld , FOI 10 - Complaint Upheld <br />
    • <a title="Opens in new window" href="~/media/documents/decisionnotices/2010/fs_50295557.ashx" target="_blank">View PDF of Decision Notice FS50295557</a></p>
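The fragment above is 'structured' in the sense that each field follows a "Label: value" pattern separated by <br /> tags. A minimal Python sketch of exploiting that pattern follows; it uses only the standard library, the function names `parse_notice` and `find_pdf_link` are hypothetical, and it assumes every notice on the page follows the same layout as this one example.

```python
import re
from html import unescape

# The fragment from the slide, joined into one string.
FRAGMENT = (
    '<p><strong>Case Ref: FS50295557 <br />Date: 04/11/2010 <br />'
    'Public Authority: London Borough of Southwark <br />Summary: </strong>'
    'The complainant requested a copy of the authorities approved '
    'business plan [...]'
    '<br /><strong>Section of Act/EIR &amp; Finding: </strong>'
    'FOI 1 - Complaint Upheld , FOI 10 - Complaint Upheld <br />'
    '<a title="Opens in new window" '
    'href="~/media/documents/decisionnotices/2010/fs_50295557.ashx" '
    'target="_blank">View PDF of Decision Notice FS50295557</a></p>'
)


def parse_notice(fragment):
    """Turn one decision-notice paragraph into a dict of label -> value."""
    # Treat each <br /> as a line break, then drop all remaining tags.
    text = re.sub(r"<br\s*/?>", "\n", fragment)
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities such as &amp; back into plain characters.
    text = unescape(text)
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; lines without one are skipped.
        label, sep, value = line.partition(":")
        if sep and label.strip():
            fields[label.strip()] = value.strip()
    return fields


def find_pdf_link(fragment):
    """Return the first href in the fragment, or None if there is none."""
    match = re.search(r'href="([^"]+)"', fragment)
    return unescape(match.group(1)) if match else None


notice = parse_notice(FRAGMENT)
print(notice["Case Ref"], notice["Date"], notice["Public Authority"])
print(find_pdf_link(FRAGMENT))
```

Regular expressions are enough here because the pattern is simple and regular; for messier pages a proper HTML parser is usually the safer choice.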
  • Spreadsheet formulae
      • =ImportHTML("[...]", "table", 1)
      • =ImportXML("[...]")
      • =ImportFeed("[...]"&A2)
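To show roughly what a formula like ImportHTML with "table" and 1 is doing, here is a minimal Python sketch that pulls the first table out of a page's HTML. It is an illustration under stated assumptions, not the spreadsheet's actual implementation: `TableExtractor` and `import_html_table` are hypothetical names, the sample page is made up, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # row currently being filled, if any
        self._cell = None  # text pieces of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, index):
    """Return table number `index` (counting from 1, as in the formula)."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Made-up sample page standing in for a real URL.
SAMPLE = (
    "<table><tr><th>Case Ref</th><th>Date</th></tr>"
    "<tr><td>FS50295557</td><td>04/11/2010</td></tr></table>"
)
for row in import_html_table(SAMPLE, 1):
    print(row)
```

In practice the HTML would first be downloaded from the page's address; the sketch takes a string so that the table logic can be tried without a network connection.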
  • Yahoo! Pipes
      • Fetch Page module
      • Regex
  • Ethics: "A problem for sites who want to provide privacy while allowing new users to join easily. Scraping services may constitute a violation of terms of service; tactics often resemble a denial-of-service attack or a security exploit."
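One practical response to the concern quoted above is to make a scraper check a site's robots.txt and pause between requests, so that it does not resemble a denial-of-service attack. The sketch below is one possible approach using Python's standard library, not a method from the lecture; `make_robot_rules`, `Throttle`, the user-agent name, and the example.org addresses are all assumptions for illustration. Respecting robots.txt does not replace reading a site's terms of service.

```python
import time
from urllib.robotparser import RobotFileParser


def make_robot_rules(robots_txt):
    """Parse the text of a robots.txt file into a rules object."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


class Throttle:
    """Enforce a minimum gap, in seconds, between successive requests."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval still outstanding.
                time.sleep(remaining)
        self._last = time.monotonic()


# Made-up robots.txt content; a real scraper would download the site's own.
ROBOTS = "User-agent: *\nDisallow: /private/\n"
rules = make_robot_rules(ROBOTS)
throttle = Throttle(min_interval=0.1)

for url in ["http://example.org/data/", "http://example.org/private/x"]:
    if rules.can_fetch("student-scraper", url):
        throttle.wait()
        print("would fetch", url)
    else:
        print("skipping (disallowed by robots.txt)", url)
```

A real scraper would use a much longer interval than 0.1 seconds; the short value only keeps the example quick to run.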
  • Questions?
  • Links
    • Use advanced search to find data
    • Use tools to scrape data
    • Visualise a politician's speeches using Wordle or Many Eyes
    • Read up on some of the tools or technologies before the lab
  • Books
      • Darrell Huff - How To Lie With Statistics
      • Blastland & Dilnot - The Tiger That Isn't
      • Donna Wong - The WSJ Guide to Information Graphics
      • Brian Suda - A Practical Guide to Designing with Data
  • Assignments
  • Enough time?
      • 10 credits = 100 hours
      • Lectures = 15 hours
      • Group blog = 60 hours (75%)
      • Strategy = 20 hours (25%)
      • (Some in labs)
      • + 5 hours on other issues
  • Enough time? Blog
      • Just an example: 10 posts ranging from simple links to interviews, analysis, experiment
      • 5.5 hours average per week x 10 weeks = 55 hours
      • + 5 hours to write evaluation
  • Enough time? Strategy
      • Just an example:
      • 12.5 hours researching community
      • 30 mins per week x10 weeks with community (2.5 hours)
      • 5 hours analysis & write up
  • Group blogs
    • 8 areas:
    • 1. Online video; 2. Online audio
    • 3. Data; 4. UGC
    • 5. Community management
    • 6. Mobile; 7. Social media
    • 8. Infographics and photography
  • Criteria
      • Assignment 1: Newsgathering/research; Production; Law, ethics and strategy
      • Assignment 2: Research; Analysis; Execution