Content created by
The Open Data Institute
Using Open Data
Dr David Tarrant | @davetaz | The Open Data Institute
Agenda
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
data.gov.XX
Google advanced
site: Get results only from certain sites or domains
link: Find pages that link to a certain page
related: Find sites similar to one you already know
filetype: Find certain file types only
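As an illustration of combining these operators, here is a minimal sketch that builds a search query and the matching search URL. The helper names `build_query` and `search_url`, the example keywords, and the `data.gov.uk` domain are assumptions chosen for the example; they do not come from the slides.

```python
from urllib.parse import urlencode


def build_query(keywords, site=None, filetype=None):
    """Combine plain keywords with optional site: and filetype: operators."""
    parts = [keywords]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL with the query safely encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical example: CSV files about crop yield on one government portal.
query = build_query("crop yield", site="data.gov.uk", filetype="csv")
print(query)
print(search_url(query))
```

Restricting by both domain and file type tends to cut out news articles and reports, leaving mostly downloadable data files.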
Aggregators and portals
Collect together data from across the web into one place.
FAO, World Bank
Scraping
If you can’t obtain data in a usable format (CSV, XLS) then you may have to
resort to scraping.
Tools: pdftables.com, magic.import.io
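When a hosted tool is not an option, a simple HTML table can often be scraped with a few lines of code. The following is a minimal sketch using only the Python standard library; the `TableParser` class, the `table_to_csv` helper and the sample table are invented for illustration and are not part of the original slides. Real pages are usually messier (nested tables, merged cells) and need more care.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample data, standing in for a table on a real web page.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Yield</th></tr>
  <tr><td>A</td><td>3.2</td></tr>
  <tr><td>B</td><td>4.1</td></tr>
</table>
"""


def table_to_csv(html):
    """Convert the table rows found in an HTML string to CSV text."""
    parser = TableParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


print(table_to_csv(SAMPLE_HTML))
```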
Agenda
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
Guidelines
5-Stars
★★★★★
Open Data Certificate
http://certificates.theodi.org
Establishing trust in data
Who
Collected it?
Owns it?
Publishes it?
Is the Audience?
What
Is it (title/description)?
Type of data is it?
Type of objects?
When
Collected?
Published?
Updated?
Due next update?
Where
Was it collected?
Is it used?
Is it described?
Is it located?
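The questions above can be treated as a checklist to run against any dataset before relying on it. Below is a minimal sketch that records the answers and reports which questions remain open. The `Provenance` class, its field names and the example values are assumptions made for illustration; they cover only some of the questions and are not a standard metadata schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Provenance:
    """Answers to a subset of the who/what/when/where questions."""
    collected_by: Optional[str] = None     # Who collected it?
    owner: Optional[str] = None            # Who owns it?
    publisher: Optional[str] = None        # Who publishes it?
    title: Optional[str] = None            # What is it?
    collected_on: Optional[str] = None     # When was it collected?
    published_on: Optional[str] = None     # When was it published?
    next_update_due: Optional[str] = None  # When is the next update due?
    collected_where: Optional[str] = None  # Where was it collected?
    located_at: Optional[str] = None       # Where is it located?


def unanswered(record):
    """Return the names of the questions that have no answer yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Hypothetical dataset with only partial documentation.
record = Provenance(
    title="Example crop survey",
    publisher="Example Agency",
    published_on="2016-01-01",
)
print(unanswered(record))
```

The longer the list of unanswered questions, the more caution the data deserves.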
http://5stardata.info/
★★★★★
5-Stars
Open Refine
http://openrefine.org
A free power
tool for cleaning
messy data
Agenda
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
Data analysis
Quantitative Qualitative
Remember
• Not all data is structured
• Not all numeric data is structured
• Some text data is structured
Analysing quantitative data
Beware!
• Targets
• Fluctuation
• Chance
• Correlation != Causation
https://xkcd.com/925/
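To make the last warning concrete, here is a minimal sketch that computes the Pearson correlation coefficient between two series. The `pearson` helper and both series are invented for illustration. The two series rise together, so the coefficient comes out very close to 1, yet that number says nothing about whether one causes the other; both could simply depend on a third factor such as hot weather.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented figures: both series increase over the same five periods.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [3, 5, 7, 9, 11]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.2f}")
```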
Analysing qualitative data
Entity recognition can
help with coding and
thematic network
analysis.
Try Open Calais
Search: open calais
Visualisation
Not all data
visualisations are
good!
Picking the right visualisation
1) Audience
• Who is your audience and what do they expect?
2) Purpose
• What story are you trying to tell?
3) Data
• What types of visualisation suit the data?
Keep it simple!
Which country achieved the greatest crop yield in 2014?
Nothing wrong with a bar chart
Observe how you don’t need unnecessary clutter such as axes and labels you can’t read
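A bar chart that answers a "which is greatest?" question can be very plain. The sketch below prints a sorted text bar chart with the value written next to each bar, so no axis is needed. The `bar_chart` helper, the country names and the yield figures are all invented for illustration and are not the data shown on the slide.

```python
def bar_chart(values, width=20):
    """Return a text bar chart, largest value first, scaled to `width`."""
    top = max(values.values())
    if top <= 0:
        raise ValueError("need at least one positive value")
    lines = []
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    for name, value in ranked:
        bar = "#" * round(value / top * width)
        lines.append(f"{name:<10} {bar} {value}")
    return "\n".join(lines)


# Invented yields, for illustration only.
yields = {"Country A": 3.2, "Country B": 4.0, "Country C": 2.0}
print(bar_chart(yields))
```

Sorting the bars puts the answer on the first line, which is the point of keeping it simple.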
Simple lines and interactivity
https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html?_r=0
Agenda
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
The policy cycle
Open data helps at
every stage of the
policy cycle!
Example policy
Agenda: To publish more open data from universities on agriculture.
Why? To increase the benefit from this data to improve agriculture
worldwide.
But what is the benefit to those who already hold the data?
Understanding researchers
Universities are ranked on the quality of their research, which is
linked to publication.
Therefore, if data publication can hold the same value and benefit,
then we should see more data.
How research creates impact
1) The journal of publication
2) The number of citations the paper has
Doing the same for research data
1) Create reputable places to share data
2) Create a way to link/reference the data,
including an index
3) Mandate the publication of research data
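Point 2 is what makes a dataset citable in the same way as a paper. As a rough sketch, a dataset reference can be assembled from a handful of metadata fields plus a persistent identifier such as a DOI. The `format_citation` helper, the field order and every value in the example are assumptions for illustration; the DOI is a placeholder and does not resolve. Follow the citation style required by the repository or journal in practice.

```python
def format_citation(creator, year, title, publisher, doi):
    """Build a one-line dataset reference ending in a resolvable DOI link."""
    return f"{creator} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Hypothetical dataset; every value here is a placeholder.
citation = format_citation(
    creator="Example, A.",
    year=2016,
    title="Example crop yield survey",
    publisher="Example University Data Repository",
    doi="10.1234/example",
)
print(citation)
```

Because the identifier is part of the reference, citations of the dataset can be counted in the same way as citations of a paper.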
https://blog.datacite.org/general-assembly-2016/
Recap
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
Thank you
Dr David Tarrant | @davetaz | The Open Data Institute
https://xkcd.com/552/
