Data Journalism lecture - Week 5: Storytelling with Data
Lecture date: 7 Oct 2015
MA in Journalism
National University of Ireland, Galway
Title slide image from The Data Journalism Handbook
5. The New York City metropolitan area is home to the largest
Jewish community outside Israel. It is also home to nearly a quarter of the
nation's Indian Americans and 15% of all Korean Americans and the
largest Asian Indian population in the Western Hemisphere; the largest
African American community of any city in the country; and including 6
Chinatowns in the city proper, comprised as of 2008 a population of
659,596 overseas Chinese, the largest outside of Asia. New York City
alone, according to the 2010 Census, has now become home to
more than one million Asian Americans, greater than the combined totals
of San Francisco and Los Angeles. New York contains the highest total
Asian population of any U.S. city proper. 6.0% of New York City is of
Chinese ethnicity, with about forty percent of them living in the
borough of Queens alone. Koreans make up 1.2% of the city's population,
and Japanese at 0.3%. Filipinos are the largest southeast Asian ethnic
group at 0.8%, followed by Vietnamese who make up only 0.2% of New
York City's population. Indians are the largest South Asian group,
comprising 2.4% of the city's population, and Bangladeshis and Pakistanis
at 0.7% and 0.5%, respectively. / Demographics of New York, Wikipedia
9. Charles Minard, 1812
Napoleon's March on Moscow
Six types of data: (1) the number of Napoleon's troops; (2) distance; (3) temperature; (4) the latitude and longitude; (5) direction of travel; (6) location relative to specific dates.
10. TYPE OF DATA ANALYSIS
TEMPORAL
GEOSPATIAL
TOPICAL
NETWORK
12. To understand temporal distribution of datasets;
To identify growth rate, latency to peak times, or decay rates;
To see patterns in time-series data, such as seasonality or bursts.
Visual Insights, by Katy Borner and David E. Polley, 2014
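The temporal measures named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a method from Visual Insights: the monthly counts are made-up values, and the helper names `growth_rates` and `latency_to_peak` are hypothetical.

```python
# Hypothetical monthly counts; illustrative values only, not real data.
monthly_counts = [12, 15, 21, 30, 24, 18]


def growth_rates(values):
    """Period-over-period change, as a fraction of the previous value."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


def latency_to_peak(values):
    """Number of periods between the start of the series and its maximum."""
    return values.index(max(values))


rates = growth_rates(monthly_counts)
peak_index = latency_to_peak(monthly_counts)

# Positive rates show growth towards the peak, negative rates show decay after it.
print([round(r, 2) for r in rates])
print("Periods until peak:", peak_index)
```

A run of positive rates followed by negative ones is the kind of burst pattern that a line chart makes visible at a glance.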
14. Napoleon's March on Moscow
Six types of data: (1) the number of Napoleon's troops; (2) distance; (3) temperature; (4) the latitude and longitude; (5) direction of travel; (6) location relative to specific dates.
Charles Minard, 1812
15. Visual Insights, by Katy Borner and David E. Polley, 2014
http://scimaps.org/maps/map/history_flow_visuali_56/detail
21. Uses location information to identify positions, movements, [trends or patterns] over geographical space.
Visual Insights, by Katy Borner and David E. Polley, 2014
30. Uses text to identify major topics, their interrelations, and their evolution over time, [and space].
Visual Insights, by Katy Borner and David E. Polley, 2014
31. Map of Science
http://cns.iu.edu/images/teaching/ivmoocbook14/4.12.pdf
37. To identify (highly) connected entities and the relationship between them;
Network properties, such as size and density;
Structure such as clusters and backbones.
Visual Insights, by Katy Borner and David E. Polley, 2014
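As a rough sketch of the network properties listed here, the snippet below computes size, density and the most connected entity for a tiny undirected network. The author names and links are invented for illustration, `network_summary` is a hypothetical helper, and the code assumes there are no self-links.

```python
# Hypothetical collaboration links between four made-up authors.
edges = [("Ann", "Ben"), ("Ann", "Cat"), ("Ben", "Cat"), ("Cat", "Dan")]


def network_summary(edge_list):
    """Return node count, edge count, density and degree per node
    for an undirected network (assumes no self-links)."""
    nodes = set()
    unique_edges = set()
    for a, b in edge_list:
        nodes.update((a, b))
        unique_edges.add(frozenset((a, b)))  # ignore direction and duplicates

    n = len(nodes)
    m = len(unique_edges)
    possible = n * (n - 1) / 2  # maximum number of undirected links
    density = m / possible if possible else 0.0

    degree = {node: 0 for node in nodes}
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return n, m, density, degree


size, links, density, degree = network_summary(edges)
most_connected = max(degree, key=degree.get)
print(size, links, round(density, 2), most_connected)
```

Density here is the share of possible links that actually exist, so a value near 1 means almost everyone is connected to everyone else.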
38. Map of science collaborations 2008-2012
Olivier H. Beauchesne (2014)
43. Why do we visualise?
To tell a story and communicate
Visualise to analyse
44. Some chart types
Bar, Line, Area, Map, Pie, Scatter Plot, Bubble, Heat map, Box Plot, More
Source: infogram training and Tableau
45. Bar: comparing data across categories
Most common way to visualise data. Good for showing differences in values and categories that don’t add up to 100%.
Examples: percent of spending by department, website traffic by origination site.
Poor choice for showing time-series data, as line charts give a smoother representation.
Source: infogram training and Tableau
46. Pie: compare proportions out of 100%
Good for showing contrast when two or three components of something differ greatly in size.
Examples: percentage of budget spent on different departments, response categories from a survey.
Poor choice if you have too many variables or if their values are similar in size.
Source: infogram training and Tableau
47. Line: view trends in data over time
Got some lengthy data, like oil prices?
Best choice for time-series data and highlighting trends, with no more than three data sets per chart.
Examples: stock price change over a five-year period, website page views during a month, revenue growth by quarter.
May be visually misleading when used to show data that is not a time series.
Source: infogram training and Tableau
48. Map: to show a geographical comparison
A great choice to show regional differences in certain variables, when there is a clear correlation.
Examples: driving penalties by county, product export destinations by country, car accidents by postcode.
Not optimal when the differences are small in size or when time-series data has to be displayed.
Source: infogram training
49. Scatter Plot: investigate the relationship between two variables
An effective way to get a sense of trends, concentrations, correlations and outliers.
Examples: relationship between the weight of a vehicle and its max speed, speeding tickets and death rate.
Not so easy to read for everyday users.
Source: Tableau
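Before drawing a scatter plot, it can help to put a number on the relationship it is meant to show. The sketch below computes Pearson's correlation coefficient by hand; the choice of Pearson's r is an assumption of this example (the slide does not name a measure), and the weights and speeds are made-up values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical vehicle weights (kg) and top speeds (km/h); made-up values.
weights = [900, 1100, 1300, 1500, 1700]
speeds = [210, 200, 190, 185, 170]

r = pearson_r(weights, speeds)
print(round(r, 2))  # close to -1: heavier vehicles are slower in this toy data
```

A value near +1 or -1 suggests the scatter plot will show a clear line of points; a value near 0 suggests a cloud with no obvious trend.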
50. Box Plot: to show the distribution of a set of data
Suitable for understanding your data at a glance, seeing how data is skewed towards one end, and identifying outliers in your data.
Not so easy to read for everyday users.
Source: Tableau
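The numbers a box plot draws can be computed directly, which may help when explaining the chart to readers. This is a minimal sketch using Python's standard `statistics` module. The 1.5 × IQR rule for flagging outliers is a common convention assumed here, not something stated on the slide, and the sample values are invented.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary plus outliers flagged by the 1.5 * IQR convention."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one unusually large value.
sample = [2, 4, 4, 5, 7, 9, 30]
summary = box_plot_summary(sample)
print(summary)
```

The box spans the first to the third quartile, the line inside it marks the median, and any values listed under `outliers` would be drawn as separate points.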
51. Bubble: to show concentration of data
To give weight to concentration of data on scatter plots or maps.
Not so easy to understand for everyday users, particularly when comparing data on two axes.
Source: Tableau
52. Picto: another way of comparing categories
Works well when 2-3 groups of people, objects or categories are compared, and when differences are significant.
A line chart is a better option with more than three groups and when differences are small.
Source: infogram training
58. Hands-on
Visualise the number of deaths per county and the rate of deaths per county in Ireland.
Start with Excel
Then Google Spreadsheets
Then move on to Datawrapper
Data:
RSA 2013 road death statistics
Any other?
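The difference between a count and a rate can be sketched before opening a spreadsheet. The figures below are placeholders for illustration only and are NOT the RSA 2013 statistics; the county labels, the `rate_per_100k` helper and the choice of a per-100,000 population base are all assumptions of this example.

```python
# Placeholder figures for illustration only; NOT the RSA 2013 statistics.
counties = {
    "County A": {"deaths": 20, "population": 500_000},
    "County B": {"deaths": 6, "population": 60_000},
    "County C": {"deaths": 10, "population": 200_000},
}


def rate_per_100k(deaths, population):
    """Deaths per 100,000 residents."""
    return deaths * 100_000 / population


rates = {name: rate_per_100k(**row) for name, row in counties.items()}

# Rank the counties twice: by raw count and by rate.
by_count = sorted(counties, key=lambda name: counties[name]["deaths"], reverse=True)
by_rate = sorted(rates, key=rates.get, reverse=True)

print("By count:", by_count)
print("By rate: ", by_rate)
```

In this toy data the county with the most deaths has the lowest rate, which is why the exercise asks for both views of the same dataset.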
59. Resources:
Visual Insights: A Practical Guide to Making Sense of Data, by Katy Borner and David E. Polley, 2014
Facts are Sacred, by Simon Rogers, 2013
London: The Information Capital, by James Cheshire and Oliver Uberti, 2014
Which chart or graph is right for you?, by Maila Hardin, Daniel Hom, Ross Perez and Lori Williams, Tableau whitepaper