Open Data Journalism
Alex@oreilly.com

@digiphile

radar.oreilly.com/alexh
2013: a networked public sphere
Natural disasters
#Sidibouzid
#Jan25
How did we get here?
In the 1990s, government and civil
society spread the Internet globally
In the 2000s, mobile phones and social
  networking connected us ever more
Open Journalism
The stream
In the 2010s, big data will change
        everything again.




     Image Credit: Real Time Rome from Senseable.MIT.edu
An expanding number of data sources
Commercial and industry data
Social data and crisis data
Open government data platforms
Open data allows citizens to be
   generative in new ways
230 apps now use or are based on
        open health data
What about journalism?
“We used to call it CAR” - DeBarros




         Bob Woodward, via Cliff1066
“Data-driven journalism is the future”




         Source: Tim Berners-Lee in the Guardian
Is “data journalism” just
computer-assisted reporting (CAR)?
     •   Spreadsheets
     •   Databases
     •   Text and code editors
     •   Statistics
“Trendy but not new” - Simon Rogers, Guardian
Show, don’t tell




  A “Sankey diagram”
What’s changed?
•   Online spreadsheets and data tools
•   Data visualization tools
•   Open source frameworks
•   Code sharing
•   Agile development
•   Cloud storage and processing (EC2 & Heroku)
•   The amount of data
“Newspapers are either going
  to start doing what we do, or
  they're going to be bypassed
  and out of date.”

  - Elliot Jaspin

  That was 1986, in Time.
More than 36 interactive databases published
Data sets account for 75% of overall traffic
                           [Source: CJR]
Global leaders
ProPublica
A tangled web
Dollars for Docs
New York Times
“Make small things faster, make big things possible.”
           - Derek Willis, NYT




TimesMachine.nytimes.com cost a few hundred dollars. Hosted on Amazon EC2.
The Guardian
Guardian Datablog
Chicago Tribune

• Flame retardants
Center for Public Integrity
International Consortium of
          Investigative Journalists
Offshoring $
  80 journalists
  40 countries
  260 gigabytes
  2.5 million files
Reuters: Connected China
La Nacion
Storytelling still matters.
“We use these tools to find and tell stories.
 We use them like we use a telephone.
 The story is still the thing.”

     - Anthony DeBarros
           USA Today



                         Source: Data Journalism and the Big Picture
Los Angeles Times
SOPA Opera
Best practices?
Understand the context
     for the data
Show your data
Show your work
Share your code
Plan for reuse
Build on open standards
Citizen-centric
Keeping citizens safe

“Traffic on the NYC Health Department’s
restaurant inspection site has gone from
10,000 hits per month to 124,000”

                      - New York Times
Make data find the people.
Helps citizens who need it most
Privacy challenges
Security challenges
• Protect your sources? Protect your data!
Bridge the data divide




              Digital signage on the cheap
FOIA & Press Freedom
Fauxpen Data
In an age of “openwashing”…

We need to:

Evaluate licenses.

Peruse the Terms of Service.

Review the governance.

Look at community.

Check the format.
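
One way to make the "check the format" step concrete is a quick script that flags files which are hard to reuse. This is only a minimal sketch: the two extension lists and the `check_format` helper are hypothetical, chosen for illustration rather than taken from any standard, and a real review would still need to look at licenses, terms of service, and governance by hand.

```python
from pathlib import PurePosixPath

# Hypothetical groupings, chosen for illustration only (not an official standard).
OPEN_MACHINE_READABLE = {".csv", ".tsv", ".json", ".geojson", ".xml"}
HARD_TO_REUSE = {".pdf", ".doc", ".docx", ".jpg", ".png"}


def check_format(filename: str) -> str:
    """Classify a published data file by its extension alone."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in OPEN_MACHINE_READABLE:
        return "open"
    if suffix in HARD_TO_REUSE:
        return "hard to reuse"
    # Anything not listed above needs a human to look at it.
    return "unknown"


if __name__ == "__main__":
    for name in ["inspections.csv", "budget.pdf", "payments.xlsx"]:
        print(name, "->", check_format(name))
```

An extension is only a hint: a CSV can still sit behind a restrictive license, which is why the other items on the checklist matter as much as the format.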
Wired Italy
Emerging trends
Political tensions over open data
• Gun map graphic
Robo-journalism?
Data journalists, meet civic hackers




              Source: BuzzData
Now it’s “Hacks and Hackers”




Photo by Dennis Crowley, from “Hack to Hacker: Rise of the Journalist-Programmer”
Homicide Watch
Citizens as Sensors: Andhra Pradesh
Citizensourcing
Makers and open source hardware
Safecast
 open source Geiger counter
Networked accountability
Sensor Journalism
“If Stage 1 of data journalism was ‘find and
   scrape data,’ then…

  Stage 2 was ‘ask government agencies to
  release data’ in easy to use formats.

  Stage 3 is going to be ‘make your own data’,
  and those sources of data are going to be
  automated and updated in real-time.”

                      - Javaun Moradi, NPR
Data creation
Data journalism with a purpose
Co-create a stronger union
Government of the
  people, for the
  people, by the
 people, with the
     people.
