“Mining Data, Refining Journalism? Data Journalism’s Development and Critical Potential”, talk given by Julius Reimer and Wiebke Loosen at the 67th Annual Conference of the International Communication Association (ICA) on 25 May 2017 in San Diego, USA.
2. ► @julius_reimer & @WLoosen | Mining Data, Refining Journalism? | 29 May 2017 | ICA | San Diego
Journalism in the Data-Driven Society
▶ Datafication’s double relevance for journalism:
1. Topic that must be covered to enhance public’s awareness, understanding and debate and to enable (re-)action
2. “Quantitative and computational turn” of journalistic practices
In terms of reporting style: emerging journalistic sub-field of data-driven journalism (DDJ)
DDJ: the Future of Journalism (in Terms of Reporting Style)?
▶ Obviously, expecting DDJ to completely replace traditional practices of news gathering and reporting is a rather “naïve” position.
▶ If not a full replacement, which functions of journalism could DDJ potentially fulfil, given its recent development & current state?
▶ Approach: re-examine previously collected data on the development & state of DDJ (Loosen et al., 2017) for evidence of how well suited DDJ is to fulfilling journalism’s functions.
Functions of Journalism
Adversarial function
▶ Act as adversary of officials
▶ Act as adversary of business
Disseminator function
▶ Get information to public quickly
▶ Avoid unverified facts
▶ Reach widest possible audience
▶ Provide entertainment & relaxation
Populist mobilizer function
▶ Let people express views
▶ Develop cultural interests
▶ Motivate people to get involved
▶ Point to possible solutions
▶ Set the political agenda
Interpretive function
▶ Investigate official claims
▶ Analyze complex problems
▶ Discuss (inter-)national policy
Synchronizing function
▶ Synchronize different social domains (politics, economy, law, etc.)
(Weaver et al., 2007: pp. 139–146; Görke & Scholl, 2006: p. 650)
Method
▶ Standardised content analysis of projects nominated for the Data Journalism Awards (DJA) 2013–2016
▶ Nominees are regarded as the “gold standard” of DDJ (cf. Borges-Rey, 2016; De Maeyer et al., 2015; Fink & Anderson, 2015)
▶ n = 225
Variables coded, by dimension:
▶ Authorship: medium; type of medium; external partners; number of people involved mentioned by name
▶ Story properties: headline; topic; reference to a specific event; question(s) posed to data; number of related articles; length of article; language; winner of DJA
▶ Data: data source(s); type(s) of data source(s); access to data; kind of data; additional information on data; geographical reference; changeability of dataset; time period covered; unit of analysis
▶ Analysis & journalistic editing: personalised case example; call for public intervention or criticism; focus of data analysis; visualisation
▶ Interactivity: interactive functions; online access to the database; opportunities for communication
Reliability of Coding
▶ Cases from 2013 & 2014 were coded by two student coders
▶ Test with 10% of cases: intercoder reliability coefficients (Holsti or Krippendorff’s α) ≥ 0.7 for all variables
▶ Cases of 2015 were coded by one of the abovementioned coders
▶ No additional reliability test
▶ Cases of 2016 were coded by two different student coders who had been instructed by the authors and the abovementioned coder
▶ Test with 10% of cases: intercoder reliability coefficients (Holsti or Krippendorff’s α) ≥ 0.7 for 84 of 89 variables
▶ Type of medium, aggregated unit of analysis, other visualisation: Hr = 0.67
▶ Number of articles: α = 0.42; length of article: α = 0.27
▶ Additional measure to secure reliability: consensual coding (2 coders + 1 author)
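For readers unfamiliar with the Holsti coefficient reported above, the following is a minimal sketch of how it is commonly computed for two coders: twice the number of agreeing decisions divided by the total number of decisions made by both coders. The function name `holsti` and the example codings are hypothetical and are not taken from the study's data.

```python
from typing import Sequence


def holsti(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Holsti's agreement coefficient for two coders: 2M / (N1 + N2)."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must have coded the same cases")
    if not coder_a:
        raise ValueError("at least one coded case is required")
    # M: number of cases on which both coders assigned the same code.
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 2 * agreements / (len(coder_a) + len(coder_b))


# Hypothetical codings of "type of medium" for three cases.
codes_a = ["newspaper", "broadcaster", "newspaper"]
codes_b = ["newspaper", "broadcaster", "magazine"]
agreement = holsti(codes_a, codes_b)  # 2 agreements out of 3 cases
```

Holsti's coefficient is a plain percentage agreement and does not correct for chance, which is one reason Krippendorff’s α is often reported alongside it.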
Types of Media
▶ DDJ widely adopted across the field: not only by new actors, but also by legacy media
▶ → Resilient organisational structures ensure sustainability of reporting style
▶ But: 5.0 authors on average
▶ → DDJ is resource-intensive
[Bar chart: types of media (%; n = 225). Bar values: 43.1, 18.2, 8.4, 8.4, 5.8, 5.3, 4.4, 4.0, 3.1, 2.7; category labels are missing from the extracted text.]
Countries of Origin
▶ Projects from 33 countries on all 5 continents + 5 international projects
▶ 49 % US; 13 % UK
▶ → DDJ is a widespread phenomenon, but dominated by Anglo-American actors (in our sample, at least)
Topics
▶ → Focus on domains important to journalism’s function
▶ But: 1.5 different topics in a piece on average
▶ → DDJ compares different perspectives only sometimes
▶ → Potential problem for synchronizing function
[Bar chart] Topics (%; multiple coding possible; n = 224): Politics 48.2; Society 36.6; Business 28.1; Health & science 21.4; Education 5.4; Sports 3.1; Culture 2.7
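Because multiple coding is possible, the topic shares add up to more than 100%. If every coded topic category appears in the chart, the sum of the shares divided by 100 equals the average number of topics per piece. The sketch below checks this against the chart values; the pairing of values with labels follows their order in the chart and is an assumption, as is the variable naming.

```python
# Topic shares in percent (multiple coding possible; n = 224).
# The value-to-label pairing is assumed from the order of bars and labels.
topic_shares = {
    "Politics": 48.2,
    "Society": 36.6,
    "Business": 28.1,
    "Health & science": 21.4,
    "Education": 5.4,
    "Sports": 3.1,
    "Culture": 2.7,
}

# Each piece contributes once to every topic it is coded with, so the
# summed shares (as proportions) give the mean number of topics per piece.
average_topics_per_piece = sum(topic_shares.values()) / 100
print(f"{average_topics_per_piece:.3f}")
```

The result is close to the reported average of 1.5 topics per piece; any residual gap could stem from categories not shown in the chart.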
Data Sources
▶ Strong dependence on official state institutions
▶ → Potential problem for watchdog function
▶ But: Private companies’ share constantly growing (n.s.)
▶ → DDJ increasingly looking for new sources
▶ 1.5 different kinds of sources on average
▶ → DDJ does not always contrast one source’s data with another one’s
▶ → Potential problem for watchdog & synchronizing functions
[Bar chart: types of data sources (%; multiple coding possible; n = 225). Bar values: 68.4, 41.8, 20.4, 20.4, 7.1; category labels are missing from the extracted text.]
Access to Data
▶ Strong dependence on data already available
▶ Small shares of more “investigative” ways of collecting data
▶ → Potential problem for watchdog function
[Bar chart: kinds of access to data (%; multiple coding possible; n = 224). Bar values: 43.3, 44.2, 22.3, 8.9, 7.1, 3.6; category labels are missing from the extracted text.]
Kind of Data
▶ 2.3 different kinds of data on average
▶ → DDJ combines data types, which enhances analytical performance
[Bar chart: kinds of data (%; multiple coding possible; n = 222). Bar values: 47.3, 45.0, 38.3, 35.1, 30.2, 15.8, 12.6; category labels are missing from the extracted text.]
Focus of Data Analysis
▶ 1.7 different foci on average
▶ → DDJ regularly performs complex analyses
▶ Also, 52% of pieces include criticism &/or call for public intervention
▶ → Assumption of watchdog function
[Bar chart] Focus of data analysis (%; multiple coding possible; n = 225): Compare groups 85.3; Show changes over time 48.4; Show connections & flows 31.6
Visualizations
▶ Average number of different visualizations grew constantly (2013: 2.1 – 2016: 3.1; p < .05)
▶ → Explanatory, analytical, & entertaining function
[Bar chart: kinds of visualizations (%; multiple coding possible; n = 225). Bar values: 66.7, 60.0, 49.8, 31.6, 27.1, 18.7, 3.1, 0.9; category labels are missing from the extracted text.]
Interactive Features
▶ → Explanatory & involvement function
▶ Also: 22.3% of projects included data-related participative options beyond comments
▶ → Involvement & expression of views function
[Bar chart: interactive features (%; multiple coding possible; n = 224). Bar values: 17.0, 63.8, 52.7, 28.1, 16.5, 4.0, 1.3; category labels are missing from the extracted text.]
Also, the average DJA-nominated piece contains...
▶ …1.7 different foci of analysis (e.g., compare groups, show developments over time)
▶ → DDJ regularly performs complex analyses
▶ …criticism &/or call for public intervention (52% of projects)
▶ → Assumption of watchdog function
▶ …a growing number of different visualizations, but rather simple ones (images, simple static charts, maps)
▶ → Explanatory, analytical, & entertaining potential, but limited performance
▶ …a feature allowing for data-related interactivity, especially zoom into maps/details on demand, filtering of data
▶ → Explanatory & involvement function
▶ …rarely data-related participative options beyond comments (22.3% of projects)
▶ → Missed opportunity for better involvement & letting users express their views
Trends & Developments
Shares & average numbers of aspects mostly stable or without linear trend. E.g.:
▶ First, average number of authors grew (2013: 4.1 – 2015: 5.7), then fell again (2016: 4.4)
▶ First, average number of different analytical foci grew (2013: 1.6 – 2015: 1.8), then fell again (2016: 1.4)
Exceptions:
▶ Growing share of business pieces (2014: 18.8% – 2016: 46.7% [χ² = 11.210, df = 3, p < .05]) → Artifact of nominee selection through jury?
▶ Average number of different kinds of visualizations grew constantly & significantly (2013: 2.1 – 2016: 3.1 [ANOVA: F = 8.161, df = 244, p < .001])
▶ Average number of different kinds of access to data grew constantly & significantly (2013: 1.1 – 2016: 1.6 [χ² = 10.984, df = 3, p < .05])
▶ Constantly growing share of pieces incl. criticism/call for public intervention (2013: 46.4% – 2016: 63.0%; n.s.)
Award-Winners (vs. Projects only Nominated)
Few (& mostly not statistically significant) differences:
▶ Higher average number of authors (M = 6.3 vs 4.8 [M calculated without extreme cases “Swiss Leaks” and “Panama Papers” with 171 and 377 contributors, resp.]; n.s.)
▶ More societal issues; less politics & business (n.s.)
▶ Less data from other, non-commercial organizations; more from private companies (n.s.)
▶ More requested, self-collected, & leaked data (n.s.)
▶ More geo-, financial, & personal data; fewer polls (n.s.)
▶ Significantly more different visualisations (3.0 vs 2.5; t = 2.656, df = 223, p < .01)
▶ (Significantly) higher shares of all kinds of visualisations, except simple static charts & other visualisations (images & animated vis.: p < .05 [Fisher’s exact test]; rest: n.s.)
▶ Fewer without interactive features (p[1-sided] < .05 [Fisher’s exact test]); more zoom/details & personalization (n.s.)
▶ Significantly more with data-related participative options beyond comments (37.8% vs 19.3%; p < .05 [Fisher’s exact test])
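To illustrate the kind of comparison behind the last bullet, here is a minimal sketch of a two-sided Fisher’s exact test for a 2×2 table, written with the standard library only. The cell counts are an assumption: the slides report percentages only, and 14 of 37 winners versus 36 of 187 other nominees are chosen merely because they reproduce 37.8% and 19.3%. The function name is likewise hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Assumed counts (not from the slides): rows = winners / only nominated,
# columns = with / without participative options beyond comments.
p_value = fisher_exact_two_sided(14, 23, 36, 151)
print(f"p = {p_value:.4f}")
```

Under these assumed counts the p-value falls below .05, in line with the significance level reported on the slide.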
Conclusion
(Best practice) DDJ through the lens of journalism’s functions: a mixed picture
Adversarial function
▶ Strong focus on politics & business
▶ But watchdog performance limited by dependence on available data from official/commercial sources & rare contrasting of data types/sources & perspectives
Disseminator function
▶ Strong focus on verified facts
▶ But DDJ is personnel-intensive, time-consuming & depends on availability of data → limited ability to react to breaking news & disseminate information quickly
▶ Untapped entertainment potential through visualizations & interactivity
Populist mobilizer function
▶ Strong, but untapped potential to involve & let people express their views
▶ No developing of cultural interests
▶ Not as “investigative” as often implied, but strong critical stance & occasional pointing towards solutions
Interpretive function
▶ Strong & evolving analytical power
▶ But fact-checking of claims with data only in some cases
Synchronizing function
▶ Rare contrasting of data types & sources as well as perspectives
Conclusion
Critical potential: chances for expansion & innovation
▶ Broaden coverage of under-reported topics
▶ Strengthen investigative & watchdog reporting by…
▶ …increasing own data collection efforts (cf. also Tabary et al., 2016: 81)
▶ …comparing data of different types/from different sources & perspectives of different social domains
Overall conclusion:
▶ In a datafied society, DDJ is a necessary complement to traditional journalistic practices – nothing more, nothing less.
Conclusion
DDJ: an increasingly necessary complement
▶ The more the social domains that journalism is supposed to observe & control are datafied, i.e. the more their social construction relies on data,
▶ & the more these social domains engage in “data-spin” to influence public communication related to them,
▶ the more journalism itself needs to be able to “make sense of data” to fulfil its functions, i.e. the more important DDJ becomes as a complement to traditional practices of news gathering & reporting.
References
Literature:
▶ Borges-Rey, E. (2016). Unravelling data journalism. A study of data journalism practice in British newsrooms. Journalism Practice, 10(7), 833–843.
▶ De Maeyer, J., Libert, M., Domingo, D., Heinderyckx, F., & Le Cam, F. (2015). Waiting for data journalism. A qualitative assessment of the anecdotal take-up of data journalism in French-speaking Belgium. Digital Journalism, 3(3), 432–446.
▶ Fink, K., & Anderson, C. W. (2015). Data journalism in the United States. Beyond the “usual suspects.” Journalism Studies, 16(4), 467–481.
▶ Görke, A., & Scholl, A. (2006). Niklas Luhmann’s theory of social systems and journalism research. Journalism Studies, 7(4), 644–655.
▶ Loosen, W., Reimer, J., & De Silva-Schmidt, F. (2017). Data-driven reporting – an on-going (r)evolution? A longitudinal analysis of projects nominated for the Data Journalism Awards 2013–2015. URL: http://www.hans-bredow-institut.de/webfm_send/1181.
▶ Tabary, C., Provost, A.-M., & Trottier, A. (2016). Data journalism’s actors, practices and skills: A case study from Quebec. Journalism: Theory, Practice, and Criticism, 17(1), 66–84.
▶ Weaver, D. H., Beam, R. A., Brownlee, B. J., Voakes, P. S., & Wilhoit, G. C. (2007). The American journalist in the 21st century. U.S. news people at the dawn of a new millennium. Mahwah: L. Erlbaum Associates.
References
Media logos:
▶ The Guardian: https://commons.wikimedia.org/wiki/File:The_Guardian.svg
▶ ICIJ: https://offshoreleaks.icij.org/
▶ Mother Jones: http://www.underconsideration.com/brandnew/archives/new_logo_for_mother_jones_done_in_house.php
▶ NYT: https://commons.wikimedia.org/wiki/File:New_York_Times_logo_variation.jpg
▶ Pro Publica: https://en.wikipedia.org/wiki/File:Propublica_logo.jpg
▶ The Wall Street Journal: http://www.hartleyglobal.com/wall-street-journal/
▶ BBC: http://www.bbc.com/news
▶ La Nación: https://en.wikipedia.org/wiki/File:La_Nacion_Logo.svg
Project screenshots:
▶ “Female population”: https://qz.com/335183/heres-why-men-on-earth-outnumber-women-by-60-million/
▶ “Deaths by group”: http://www.bbc.com/news/world-30080914
▶ “Rede de escândalos”: http://veja.abril.com.br/infograficos/painel_rede_escandalos/network_of_scandals.html