Workshop on Data Journalism
February 17, 2014
Ghent

How to get the
data and how to
process them?

Lorenzo Pellizzari
DataJournalism: How To get data and process them?


Workshop on data journalism given at the DataDays, organised by the Open Knowledge Foundation, on 17 February 2014.

Published in: Technology

  1. Workshop on Data Journalism, February 17, 2014, Ghent. How to get the data and how to process them? Lorenzo Pellizzari
  2. About me …
  3. Get the data. How to get the data? Receive it; advanced search techniques; FOI laws; scrape it
  4. (1) Receive it: Analyzing the War Logs (Associated Press)
  5. (2) Advanced search techniques: Google. 79.300.000 results; 5 results
  6. (2) Advanced search techniques: SPARQL. http://dbpedia.org/sparql
  7. (2) Advanced search techniques: SPARQL
  8. (2) Advanced search techniques: SPARQL. http://latemar.science.unitn.it/spacetime/spacetime.html
  9. (3) Freedom of Information laws
  10. (3) Freedom of Information laws
  11. (4) Scrape your data: “Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites.” (Wikipedia) http://www-news.iaea.org/
  12. (4) Scrape your data
  13. (4) Scrape your data
  14. Process the data. What Analytics, Data mining, Big Data software you used in the past 12 months for a real project (not just evaluation) [798 voters] http://www.kdnuggets.com/
  15. The software for data analysis: share of R- or SAS-related posts to Stack Overflow by week. http://r4stats.com/articles/popularity/
  16. The software for data analysis
  17. Example: ABC News. Interactive map of gas wells and leases in Australia. Scraping: main data coming from governmental websites. FOI: data on chemical releases. Variety of reports: data on salt and water. http://datajournalismhandbook.org/
  18. Example: ABC News
      • A web developer and designer
      • A lead journalist
      • A part-time researcher with expertise in data extraction, Excel spreadsheets and data cleaning
      • A part-time junior journalist
      • A consultant executive producer
      • An academic consultant with expertise in data mining, graphic visualization and advanced research skills
      • The services of a project manager and the administrative assistance of the ABC’s multi-platform unit
      • Importantly, we also had a reference group of journalists and others whom we consulted on a needs basis
      http://datajournalismhandbook.org/
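
Slides 6 to 8 point to SPARQL endpoints such as http://dbpedia.org/sparql, but the transcript does not preserve the queries shown. The sketch below is one possible illustration, not the query from the workshop: it builds a request URL for that endpoint and flattens a result in the standard SPARQL JSON results format. The query text (Belgian cities above a population threshold) and the helper names `build_sparql_url` and `rows_from_results` are assumptions made for this example. The `format` parameter is an assumption about how the endpoint selects JSON output.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on slide 6

# Hypothetical query written for this example; it is not taken from the slides.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city ?population WHERE {
  ?city a dbo:City ;
        dbo:country dbr:Belgium ;
        dbo:populationTotal ?population .
  FILTER (?population > 100000)
}
ORDER BY DESC(?population)
LIMIT 10
"""


def build_sparql_url(endpoint, query):
    """Return a GET URL that sends `query` to a SPARQL endpoint."""
    params = {
        "query": query.strip(),
        # Assumption: the endpoint accepts a `format` parameter for JSON output.
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


def rows_from_results(results):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


url = build_sparql_url(ENDPOINT, QUERY)
```

To run the query, the URL could be fetched with `urllib.request.urlopen(url)` and the response decoded with `json.load` before passing it to `rows_from_results`. The network call is left out here so the sketch runs offline.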
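
Slides 11 to 13 define web scraping as extracting information from websites, without preserving the tool or code used. A minimal sketch of the idea follows, using only the Python standard library. It assumes the data sits in an HTML table whose first row is the header. `SAMPLE_HTML` is made-up markup standing in for a downloaded page, and `TableScraper` and `scrape_table` are names chosen for this example.

```python
from html.parser import HTMLParser

# Made-up markup standing in for a downloaded page; not real data.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Location</th></tr>
  <tr><td>2014-01-10</td><td>Example A</td></tr>
  <tr><td>2014-02-03</td><td>Example B</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return table rows as dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *records = parser.rows
    return [dict(zip(header, record)) for record in records]


records = scrape_table(SAMPLE_HTML)
```

Real pages are rarely this tidy, so a scraper usually needs page-specific handling for nested tables, missing cells and pagination. The result is a list of dicts that can be written out with the `csv` module for further processing.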
