WikiLeaks and the Myth of
     (Data-Driven) Citizen
               Journalism
Catalina Iorga, Research MA in Media Studies ’11, University of Amsterdam
The Benefits of Open Govt. Data
 External Contributions

  •   Specialised skills, local knowledge, ‘professional
        amateurs’ (Jennifer Bell, VisibleGovernment.ca, 2009)

 Citizen Empowerment

  •   “these applications arm citizens with the information
        they need to make decisions every day” (data.gov,
        2010)

 Software Innovation

  •   Open licenses, existing technologies and communities
       (Jonathan Gray, Open Knowledge Foundation, 2010)
Data-driven Journalism
 Large datasets available online

  •   The Afghan War Diary 2004 - 2010 (WikiLeaks,
        2010)

 Information visualisation tools

  •   Guardian Data Explorer (Tony Hirst)

 Narratives powered by the Web

  •   Using the Web “to tell a story, not just as a
        delivery medium” (Alan Maclean, The New York
        Times, 2010)
Data-driven Journalism




Image source: http://upload.wikimedia.org/wikipedia/commons/4/48/Data_driven_journalism_process.jpg
Research Question
What kind of stories do (data-driven)
citizen journalists tell about the war in
Afghanistan by referencing WikiLeaks
documents?
Intended WikiLeaks
‘Assange himself has stated that WikiLeaks has
deliberately moved away from the "egocentric"
blogosphere and assorted social media and
nowadays collaborates only with professional
journalists and human rights activists.’

(Geert Lovink, ‘Twelve Theses on WikiLeaks’, 2010)
Distinction
data-driven mainstream journalism
                               vs.
    data-driven citizen journalism
Why Links?
• Forms of citation
• Indicators of (citizen / user) engagement
Afghan War Diary
• “an extraordinary secret
    compendium of over 91,000 reports
    covering the war in Afghanistan
    from 2004 to 2010.”

• “the most significant archive about
    the reality of war to have ever been
    released during the course of a
    war.”

(WikiLeaks, 2010)
Image source: http://wardiary.wikileaks.org
Afghan War Diary - Der Spiegel




“Deadly Toll”

(Der Spiegel, 2010: http://www.spiegel.de/international/world/bild-708314-114716.html)
Afghan War Diary - The
Guardian




“Afghanistan war logs: IED attacks on civilians, coalition and Afghan troops”

(The Guardian, 2010: http://www.guardian.co.uk/world/datablog/interactive/2010/jul/26/ied-afghanistan-war-logs)
Methodology
• Observe the common root of all Afghan War Diary 2004 - 2010
  document URLs ('http://wardiary.wikileaks.org/afg/event').

• Query Google with the Google Scraper to get the first 1000
  results which contain this common root as a textual
  component.

• Submit the top 100 results to the Link Ripper to extract all
  outlinks to specific Afghan War Diary 2004 - 2010 document
  pages.

• Insert the Link Ripper output into the Harvester to remove
  textual descriptions and alphabetize the obtained URL list.

• Manually clean the output by searching for
  'http://wardiary.wikileaks.org/afg/event' and produce a list of
  Afghan War Diary 2004 - 2010 document URLs.

• Select all documents that receive at least two links and
  compile a final list of the 'most mentioned' warlogs.
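
The last two steps (filtering on the common root and keeping documents with at least two links) can be sketched in Python. This is only an illustration, not the tooling used in the research: the function name `most_mentioned`, the sample pages and the `EXAMPLE-A` / `EXAMPLE-B` document identifiers are hypothetical, and counting each document at most once per referring page is an assumption the slides do not state.

```python
from collections import Counter

# Common root of all Afghan War Diary document URLs (from the methodology).
COMMON_ROOT = "http://wardiary.wikileaks.org/afg/event"


def most_mentioned(outlinks_by_page, min_links=2):
    """Return (document URL, link count) pairs with at least `min_links` links.

    `outlinks_by_page` maps a referring page URL to the list of outlinks
    extracted from it. Assumption: a document is counted once per page,
    even if the page links to it several times.
    """
    counts = Counter()
    for links in outlinks_by_page.values():
        # Keep only links to War Diary documents, de-duplicated per page.
        documents = {
            link.strip()
            for link in links
            if link.strip().startswith(COMMON_ROOT)
        }
        counts.update(documents)
    selected = [(url, n) for url, n in counts.items() if n >= min_links]
    # Most-linked first; ties broken alphabetically by URL.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


# Hypothetical sample input standing in for the Link Ripper output.
sample = {
    "http://blog-one.example/post": [
        COMMON_ROOT + "/EXAMPLE-A",
        COMMON_ROOT + "/EXAMPLE-B",
        "http://other.example/",
    ],
    "http://news-two.example/story": [COMMON_ROOT + "/EXAMPLE-A"],
    "http://forum-three.example/thread": [
        COMMON_ROOT + "/EXAMPLE-A",
        COMMON_ROOT + "/EXAMPLE-A",
    ],
}

result = most_mentioned(sample)
for url, n in result:
    print(n, url)
```

With this sample, only the hypothetical `EXAMPLE-A` document passes the two-link threshold; `EXAMPLE-B` is linked from a single page and is dropped.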
Limitations
• Searching for inlinks with Yahoo! Site Explorer
  or Google yielded similar results.

• Finding inlinks with different descriptions is
  very difficult.
Types of Accounts
Local Interest Descriptive Lists




(James Barlow, Jul 26 2010: http://jamesbarlow.co.uk/british-entries-afghan-war-diaries)
Types of Accounts
‘Connect the Dots’ / Conspiratorial Reasoning




(Peak of Elephants, Jul 26 2010: http://peakofelephants.posterous.com/post/861912878)
Most Linked Logs
  Idaho Soldier Captured in Afghanistan

• WTOP (news radio, Washington, DC)
• KomoNews (news radio, Seattle, WA)
• SF Examiner (daily paper, San Francisco, CA)
• Newser (US-based news site)
• Lebanon Daily News (daily paper, Lebanon County, PA)
• Las Vegas Sun (daily paper, Las Vegas, NV)
• Yahoo! News
• AP (press agency)

• 8 mentions
• Only mainstream stories
Most Linked Logs
Four Canadians Killed in Friendly Fire

• UberVu
• Ottawa Forums
• Wikipedia
• CyberPresse

• 4 mentions
• Overlap of mainstream and alternative comments
Conclusion

Data-Driven Citizen Journalism = Absent
Possible Reasons

 •   too much data

 •   technical military terms

 •   mainstream media filters
Credits
Research done by:

 • Camilo Cristancho (PhD candidate in Political Science at the
   Universitat Autònoma de Barcelona)

 • Matteo Cernison (PhD candidate in Social and Political Science at the
   European University Institute, Florence)

 • Catalina Iorga

Wiki: https://wiki.digitalmethods.net/Dmi/DataDrivenUserJournalism
P.S.
Go to http://wikileaks.ch/ (instead of the official website)
for:

• Guantanamo Files: http://wikileaks.ch/gitmo/
• Cablegate: http://wikileaks.ch/cablegate.html
• Iraq and Afghanistan War Logs: http://wikileaks.ch/iraq/diarydig/
Thank you for your time and attention!
E-mail: catalina.iorga@gmail.com

Web: http://catalinaiorga.wordpress.com

DMI Workshop: Wikileaks and the Myth of (Data-Driven) Citizen Journalism (wiki-leaks workshop)
