This document provides an overview of data journalism techniques, including data collection, cleaning, analysis and visualisation. It discusses using correlation analysis and significance tests to examine relationships between variables in data. As an example, it describes analysing the correlation between penalty points and road fatality rates in Irish counties and determining whether the relationship is statistically significant. The document concludes by noting that students have now completed all the data analysis and visualisation needed for a story on penalty points.
2. What we have learned so far
What Data Journalism is about
Finding Data
Data collection
Data scraping
Data mashing and summarisation
Data cleaning
Data analysis
Data visualisation with graphs, charts and infographics
Data visualisation with maps
FOI
Social Media as a source
6. Correlation analysis
Correlation concerns the strength of the relationship between the values of two variables.
Are height and weight correlated?
Are engine size and maximum speed in cars correlated?
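As a rough illustration of what "strength of relationship" means in practice, the sketch below computes Pearson's correlation coefficient r by hand. The function name `pearson_r` and the height and weight values are invented for this example and are not part of the course data. A value of r near +1 or -1 indicates a strong relationship; a value near 0 indicates a weak one.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Invented values, for illustration only.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 85]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.2f}")  # close to +1 for this toy sample
```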
15. Significance test
A significance test determines whether an observed relationship is real, or just one that we would expect to see quite often by chance.
We start out assuming that there is no real relationship between the two variables: the null hypothesis.
16. p value
p value: the probability of seeing a relationship at least as strong as the one observed if only chance were at work. The smaller the p value, the more significant the relationship.
Put another way, the p value is the calculated probability of an observed difference or relationship occurring by chance when really no difference or relationship exists (the null hypothesis).
If the p value is small enough, we can reject the null hypothesis.
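One way to make the idea of "happening by chance" concrete is a permutation test, sketched below. This is a swapped-in illustration and may not match the method a statistics package uses by default. It shuffles one variable many times, which breaks any real link between the two, and counts how often the shuffled data show a correlation at least as strong as the observed one. The names `permutation_p_value`, `n_shuffles` and `seed`, and the sample values, are assumptions made for this example.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate a two-sided p value for the correlation between xs and ys."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # simulate the null hypothesis: no real link
        # Small tolerance guards against floating-point rounding in ties.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts avoids reporting a p value of exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Invented values, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 10, 12, 14, 16]
print(permutation_p_value(x, y))
```

A small result means shuffled data rarely look as strongly related as the real data, which is the situation in which the null hypothesis can be rejected.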
17. p value cut offs
p < 0.05, or the 0.05 level: significant (*)
p < 0.01, or the 0.01 level: highly significant (**)
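The cut-offs can be written as a small lookup, sketched below. The function name `significance_label` is hypothetical, and treating the cut-offs as strict "less than" comparisons is an assumption; some texts use "less than or equal to".

```python
def significance_label(p):
    """Map a p value to the conventional star notation."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p value must lie between 0 and 1")
    if p < 0.01:
        return "**"   # highly significant (0.01 level)
    if p < 0.05:
        return "*"    # significant (0.05 level)
    return ""         # not significant at either level

for p in (0.2, 0.03, 0.005):
    print(p, repr(significance_label(p)))
```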
22. Hands-on
Correlation analysis and significance test for:
penalty points in counties in Ireland and the rate of road fatalities.
Use SPSS or PSPP.
Go back to your penalty points and road fatalities story/data.
23. You have now completed all the data
analysis and visualisation needed for our
penalty points story.
Well done!
24. Resources:
Statistics without Tears: A Primer for Non-mathematicians, Derek Rowntree, first published 1981
Statistics Done Wrong, Alex Reinhart, 2015
http://www.statisticsdonewrong.com/