challenges in the field of Data Journalism 	
	

PAST PRESENT FUTURE
Hello everyone!
	


Hille van der Kaa	
Twitter: @Hillevanderkaa	


Researcher Data Journalism at Tilburg
University	
Professorship Media, an
interdisciplinary institute of journalism,
ICT, communication, arts &
economics at Fontys University of
Applied Sciences	

what is data journalism?
Data journalism is a journalism specialty
reflecting the increased role of numerical
data in the production and
distribution of information in the digital era.

It reflects the increased interaction
between content producers and several
other fields such as design, computer
science and statistics.

(Poynter Institute, 2011)

	
  
past: computer assisted reporting
present: big data driven journalism
future: automated algorithmic storytelling
1952	
The Univac was used by
CBS to predict the result
of the 1952 presidential
election.	
	
With a sample of just 1%
of the voting population
it famously predicted
an Eisenhower landslide
while the conventional
wisdom favored
Stevenson.
Computer-assisted reporting describes
the use of computers to gather and
analyze the data necessary to write
news stories.	
	
Collectively this has become known as
computer-assisted reporting, or CAR.	
	
  
1967
1992
entering the jungle of Big Data	
an ‘information explosion’ that
generates exabytes of data every year	
for example: info generated by open
data government policies, digital
archiving, and human interactions in
social networks …	
	
… all these data tell stories
velocity
•  data comes in quickly through multiple sources (online systems, social media etc.)

volume
•  from terabytes to petabytes (and more) of information

variety
•  of data types (structured, semi-structured, and unstructured data)

complexity
•  geographical and multi-data center data distribution, cloud computing
data journalism
=
the craft of finding stories
in complex datasets
computer assisted reporting	
data-driven journalism	
database journalism	
automated journalism
data-driven journalism	
data-driven journalism is a process
whereby journalists build stories using
numerical data or databases 	
	
as a primary material	
	
  
database journalism	
database journalism or structured
journalism is a principle in information
management whereby news content is
organized around structured pieces of
data, as opposed to news stories	
	
it focuses on the constitution and
maintenance of the database
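
A minimal sketch of the idea, assuming a newsroom keeps incidents as structured records rather than finished articles. The `Incident` fields, the helper names and the sample rows are all invented for illustration and are not taken from the talk:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Incident:
    """One structured piece of data; every field name here is hypothetical."""
    day: date
    municipality: str
    category: str
    count: int


def total_by_municipality(records):
    """Sum the incident counts per municipality."""
    totals = {}
    for record in records:
        totals[record.municipality] = totals.get(record.municipality, 0) + record.count
    return totals


def to_sentence(municipality, total):
    """Render one aggregated value as a plain sentence for a story."""
    return f"{municipality} recorded {total} incidents in the period covered."


# Invented sample rows, used only to show the mechanics.
records = [
    Incident(date(2013, 1, 5), "Town A", "burglary", 4),
    Incident(date(2013, 1, 9), "Town B", "burglary", 2),
    Incident(date(2013, 2, 1), "Town A", "burglary", 3),
]

for name, total in sorted(total_by_municipality(records).items()):
    print(to_sentence(name, total))
```

The database is the maintained asset here. Stories are generated from it on demand, so correcting one record updates every story built on top of it.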
 challenge 1	
the line between activism and journalism has become
even fuzzier in the digital age	
	
growing interest in investigative news organisations
that operate on a non-profit model	
	

some new players are producing news
to serve a particular agenda

	
  
 challenge 2	
when does a study, or dataset, constitute a reliable
source for a news story?
	
what does a journalist need besides the data to
create a trustworthy story?	
	

how to evaluate data?	
as a news source… or?	
	
	
  
	
  
	
  
 challenge 3	
how to analyze all these data?	
	
data journalism = social science?	
	
	
	
  
	
  
	
  

	
  
challenge
only skilled data analysts can bridge the
gap between data and knowledge, and
find the stories underneath	
a new set of skills: 	
advanced research methods +
computational methods	
(data mining, data processing)
ability to select data from a broad range of data sources	
	
ability to analyze and abstract data from a scientific perspective	
	
ability to explore and detect anomalies in data

familiarity with various data standards & the ability to convert between them

ability to visualize data in graphics and text


ability to transform data
into a journalistic storyline
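
One of the skills above, spotting unusual values in a dataset, can be sketched as follows. This is one common approach (a modified z-score based on the median absolute deviation), not a method named in the talk. The function name, the conventional 3.5 threshold and the sample figures are assumptions made for illustration:

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The score uses the median and the median absolute deviation (MAD),
    which are less distorted by extreme values than mean and standard deviation.
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median; the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Invented monthly counts with one suspicious entry.
monthly_counts = [10, 12, 11, 13, 12, 11, 95]
print(flag_anomalies(monthly_counts))
```

A flagged value is a lead, not a finding: it may be a typo, a change in how the data was collected, or a real story, and only further reporting can tell which.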
all journalists = data journalists	
	

	
NO	
	
But....
future journalists need to know
	
(at least a little bit…)	
	
about data 	
	
and algorithms
computer assisted reporting	
data-driven journalism	
database journalism 	
automated journalism
all journalists = data journalists	
	

	
NO	
	
But....
journalism, as one of the key professions
specialized in making information visible
and accessible to large audiences, 	
	
MUST BE	
	
at the forefront of the 	
‘data revolution’
Goodbye!

Storytelling in a digital age - challenges of a Data Journalist
