Natasha Simons
Managing Research Data Workshop
Data discovery and metadata
iSchools Data Science Winter Institute
Hong Kong
7 December 2017
Why do people search for data?
Why do people search for data*?
•Exploratory/Scoping
•Reuse/Secondary data analysis
•Can be starting point or ad hoc
•Peer review
•Reproduce/extend results
•Repurpose (e.g. for mashups, visualisations, simulations)
•Verify claims (e.g. report findings)
*Not in any order; not exhaustive!
How do people find data?
How do people find data*?
•Google
•Ask a colleague
•Find link to data in a journal article
•Data journals
•Data registries e.g. re3data
•Open data portals e.g. data.gov
•Institutional repositories
•Data / Discipline repositories e.g. Dryad
•Project website
•Data discovery aggregators like Research Data Australia
•Library catalogues, databases
*Not in any order; not exhaustive!
Characteristics of finding data
When creating metadata records, keep in mind that finding data is:
● Movable feast / changing beast
● No standard practice, universal standard or vocab
● Databases are non-exhaustive
● Search methods and terms are driven by why people are looking and how the data is stored
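To illustrate the point about vocabulary, here is a minimal Python sketch. It assumes a made-up list of metadata records and a naive keyword match; the record titles, keywords and the `search` function are invented for illustration and do not reflect how any particular portal searches.

```python
# Hypothetical metadata records; titles and keywords are invented for illustration.
RECORDS = [
    {"title": "Marine oil spill incidents", "keywords": ["oil spill", "marine pollution"]},
    {"title": "Koala hospital admissions", "keywords": ["koala", "wildlife care"]},
]


def search(records, term):
    """Return titles of records whose title or keywords contain the term (case-insensitive)."""
    term = term.lower()
    hits = []
    for record in records:
        searchable = [record["title"]] + record["keywords"]
        if any(term in text.lower() for text in searchable):
            hits.append(record["title"])
    return hits


print(search(RECORDS, "koala"))          # the everyday term appears in the record
print(search(RECORDS, "Phascolarctos"))  # the scientific name does not, so nothing is found
```

A searcher who uses a term the record creator did not include gets no result, which is one reason to think about likely search terms when writing titles, descriptions and keywords.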
FAIR Data
To aid discovery and reuse, data needs to be:
● Findable
● Accessible
● Interoperable
● Reusable
More on FAIR Data:
● FAIR Data Principles (FORCE11): https://www.force11.org/group/fairgroup/fairprinciples
● ANDS and FAIR Data: https://www.ands.org.au/working-with-data/fairdata
● FAIR Data ANDS Webinar series: https://www.youtube.com/user/andsdata (FAIR Data Playlist)
ANDS/Nectar/RDS “FAIRground” booth at eResearch Australasia 2017
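As a rough illustration of the four FAIR headings, the sketch below checks a metadata record for one field per heading. The field names (`identifier`, `access_url`, `format`, `licence`) and the one-field-per-heading mapping are assumptions made for this example; the FAIR principles are more detailed and do not prescribe this schema.

```python
# Hypothetical field names; the FAIR principles themselves do not prescribe this schema.
FAIR_HINTS = {
    "Findable": "identifier",      # e.g. a persistent identifier such as a DOI
    "Accessible": "access_url",    # where the data can be retrieved
    "Interoperable": "format",     # a stated file format
    "Reusable": "licence",         # a stated licence
}


def missing_fair_hints(record):
    """Return the FAIR headings whose illustrative field is absent or empty."""
    return [heading for heading, field in FAIR_HINTS.items() if not record.get(field)]


example = {
    "title": "Example dataset",    # invented record
    "identifier": "",              # no identifier assigned yet
    "access_url": "https://example.org/data.csv",
    "format": "CSV",
    "licence": "CC BY 4.0",
}
print(missing_fair_hints(example))
```

A check like this only flags obviously missing fields; it says nothing about the quality of what is filled in.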
Hands-on exercise: data description
Your task:
1. Divide into pairs
2. Each pair take one of the CSV data files
3. Describe the data by creating a metadata record. Think about: title, creators, date, short description and so on.
You have 15 minutes - go!!
If you are unfamiliar with metadata, take a few minutes to view the introductory video at:
https://www.youtube.com/watch?v=ABF2FvSPVYE
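For anyone who prefers to capture the record in a structured form, the task above can be sketched in Python. This is only one possible shape for a record: the sample CSV content, the `describe_csv` helper and all field values are invented for illustration, and the fields follow the prompts in the task (title, creators, date, short description) plus the column names and row count read from the file.

```python
import csv
import io
import json

# Stand-in for one of the workshop CSV files; the columns and values are invented.
SAMPLE_CSV = "date,location,volume_litres\n2017-01-05,Port A,120\n2017-02-11,Port B,40\n"


def describe_csv(csv_text, title, creators, date, description):
    """Build a simple metadata record, adding column names and row count from the CSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data_rows = rows[0], rows[1:]
    return {
        "title": title,
        "creators": creators,
        "date": date,
        "description": description,
        "columns": header,
        "row_count": len(data_rows),
    }


record = describe_csv(
    SAMPLE_CSV,
    title="Example spill records",  # hypothetical values throughout
    creators=["Example Agency"],
    date="2017",
    description="Invented sample of spill records by date and location.",
)
print(json.dumps(record, indent=2))
```

The human-written fields (title, creators, date, description) still have to come from you; only the column names and row count can be read from the file itself.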
Class discussion
How did you go?
What did you learn?
Here are the original metadata descriptions:
CSV dataset #1 - https://data.qld.gov.au/dataset/marine-oil-spills-data
CSV dataset #2 - https://data.qld.gov.au/dataset/koala-hospital-data
Australian data discovery portals
Open data case study
University of Tasmania - IMAS Marine Data
https://www.youtube.com/watch?v=_Bs56PnYK9g
More Open Data project stories: https://www.youtube.com/user/andsdata (Open Data Playlist)
Research Data Australia
https://researchdata.ands.org.au/
TERN - Terrestrial/ecology data
http://portal.tern.org.au/#/00629597
AURIN - urban research data
https://data.aurin.org.au/
Atlas of Living Australia
https://www.ala.org.au/
National Library’s TROVE
http://trove.nla.gov.au/
re3data includes Australian data repositories
With the exception of third party images or where otherwise indicated, this work is licensed under the Creative Commons Attribution 4.0 International Licence.
ANDS, Nectar and RDS are supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program (NCRIS).
Natasha.simons@ands.org.au
@n_simons
orcid.org/0000-0003-0635-1998
Natasha Simons

iSchools workshop - 4 - data discovery


Editor's Notes

  • #19 [Kate] Thank you and please feel free to contact us.