CSC – Finnish ICT competence centre for research, education, culture and public administration
Supporting FAIR data.
Categorization of research data as a tool in data management
Jessica Parland-von Essen https://orcid.org/0000-0003-4460-3906
Katja Fält https://orcid.org/0000-0002-6172-5377
Zubair Maalick https://orcid.org/0000-0002-0975-1471
Miika Alonen https://orcid.org/0000-0002-0065-0017
Eduardo Gonzalez https://orcid.org/0000-0003-1400-0995
The FAIR principles for research data
Persistent identifiers
a) Cite a specific slice or subset (the set of updates to the
dataset made during a particular period of time or to a particular
area of the dataset).
b) Cite a specific snapshot (a copy of the entire dataset made at
a specific time).
c) Cite the continuously updated dataset, but add Access Date
and Time to the citation. (Does not necessarily ensure
reproducibility.)
d) Cite a query, time-stamped for re-execution against a
versioned database.
DYNAMIC DATASETS
IMMUTABLE DATASETS
Maybe we need to be more specific and find common ground in concepts?
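Option d) above, citing a time-stamped query, can be illustrated with a small record that stores the query text, the execution time and a checksum of the result set, so that a later re-execution against the versioned database can be compared with what was originally cited. This is only a sketch under stated assumptions: the function name `make_query_citation`, the field names and the `urn:example:` identifier are hypothetical and do not come from the slides or from any particular service.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_query_citation(dataset_pid, query, rows, executed_at=None):
    """Build a citation record for a time-stamped query (hypothetical schema)."""
    executed_at = executed_at or datetime.now(timezone.utc)
    # Sort the rows before hashing so that row order does not change the checksum.
    canonical = json.dumps(sorted(rows), separators=(",", ":"))
    return {
        "dataset": dataset_pid,
        "query": query,
        "executed_at": executed_at.isoformat(),
        "result_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


if __name__ == "__main__":
    citation = make_query_citation(
        "urn:example:weather-obs",  # placeholder identifier
        "SELECT station, temp FROM obs WHERE year = 2017",
        [("helsinki", 5.3), ("oulu", 2.1)],
    )
    print(json.dumps(citation, indent=2))
```

Re-running the stored query as of `executed_at` and recomputing the checksum would show whether the cited subset can still be reproduced. How the database exposes historical states is left open here.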
CHUNKING UP RESEARCH DATA
Categorization according to technical properties
• Modality, DCMI types
  o Dublin Core type of thinking
• Format, DCMI format
  o MIME types
  o Software related
• Language, coding
  o Human interpretation
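As an illustration of the technical categories above, the format of a file can often be guessed from its name as a MIME type, and a coarse modality can then be derived from that. The sketch below uses Python's standard `mimetypes` module. The mapping from MIME top-level types to DCMI Type terms and the fallback to "Dataset" are assumptions made for this example, not an official crosswalk.

```python
import mimetypes

# Assumed, illustrative mapping from MIME top-level type to a DCMI Type term.
DCMI_BY_MIME_TOP_LEVEL = {
    "text": "Text",
    "image": "StillImage",
    "audio": "Sound",
    "video": "MovingImage",
}


def technical_category(filename, default_dcmi="Dataset"):
    """Guess format (MIME type) and modality (DCMI Type) from a file name."""
    mime, _encoding = mimetypes.guess_type(filename)
    top_level = mime.split("/", 1)[0] if mime else None
    return {
        "format": mime,
        # Unknown or unmapped formats fall back to the assumed default term.
        "modality": DCMI_BY_MIME_TOP_LEVEL.get(top_level, default_dcmi),
    }


if __name__ == "__main__":
    for name in ("scan.png", "interview.wav", "measurements.unknownext"):
        print(name, technical_category(name))
```

A guess based on the file name says nothing about language or coding, which the slide lists separately as matters of human interpretation.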
By Lin Kristensen from New Jersey, USA (Timeless Books) [CC BY 2.0 (https://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
Categorization according to contextual traits
• Origin
  o Observational, experimental, simulation, derived, etc.
• Use category
  o Source, output, method
• Provenance, lifecycle
  o Primary, secondary, data levels, qualitative, quantitative
By David Monniaux CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/), from Wikimedia Commons
Categorization according to inherent traits
• Access type (availability)
  o Open data, sensitive data
• Semantic structure
  o Coherence, levels of measurement, groupings, classifications
• Research data type (stability)
  o Generic data, generic research data, research data publications
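One way to make such categories usable in a data management system is to record them as controlled values next to each dataset. The following is a minimal sketch, assuming a flat record with one value per facet. The class and field names are hypothetical, and the enumerated values are taken from the bullet lists on the slides rather than from any published vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    OBSERVATIONAL = "observational"
    EXPERIMENTAL = "experimental"
    SIMULATION = "simulation"
    DERIVED = "derived"


class UseCategory(Enum):
    SOURCE = "source"
    OUTPUT = "output"
    METHOD = "method"


class AccessType(Enum):
    OPEN = "open"
    SENSITIVE = "sensitive"


class ResearchDataType(Enum):
    GENERIC_DATA = "generic data"
    GENERIC_RESEARCH_DATA = "generic research data"
    RESEARCH_DATA_PUBLICATION = "research data publication"


@dataclass(frozen=True)
class DataCategory:
    """Hypothetical record combining contextual and inherent traits."""
    origin: Origin
    use: UseCategory
    access: AccessType
    data_type: ResearchDataType

    def as_metadata(self):
        # Flatten the enum members to plain strings for a metadata record.
        fields = ("origin", "use", "access", "data_type")
        return {name: getattr(self, name).value for name in fields}


if __name__ == "__main__":
    category = DataCategory(
        Origin.OBSERVATIONAL,
        UseCategory.SOURCE,
        AccessType.OPEN,
        ResearchDataType.GENERIC_RESEARCH_DATA,
    )
    print(category.as_metadata())
```

Real datasets may need several values per facet, for example data that is partly open and partly sensitive. The single-value fields are a simplification.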
Dynamic and growing datasets
• URN allows use of fragments
• Avoid PID inflation
• Consider costs and sustainability
• Ad hoc creation rather than automatic minting and allocation?
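The point about fragments can be sketched as follows: instead of minting a new identifier for every part of a growing dataset, a single URN is kept for the dataset and a part is addressed by appending a fragment after `#`, which the URN syntax in RFC 8141 permits. The helper name `with_fragment`, the `urn:example:` identifier and the fragment wording are assumptions for illustration. What a fragment means, and whether a resolver honours it, depends on the namespace and service.

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in a fragment, beyond letters,
# digits and "-._~", which quote() never escapes.
_FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def with_fragment(urn, fragment):
    """Append a percent-encoded fragment to an existing URN (no new PID is minted)."""
    if not urn.lower().startswith("urn:"):
        raise ValueError(f"not a URN: {urn!r}")
    if "#" in urn:
        raise ValueError("URN already carries a fragment")
    return f"{urn}#{quote(fragment, safe=_FRAGMENT_SAFE)}"


if __name__ == "__main__":
    dataset = "urn:example:weather-obs"  # placeholder identifier
    print(with_fragment(dataset, "station=helsinki"))
    print(with_fragment(dataset, "year 2017"))
```

Only the dataset-level identifier needs to be registered and maintained, which is one way to keep PID numbers and the associated costs down.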
Operational data
  Description: Data for any use, private or government owned; might fall within PSI (public sector information).
  Format: May be dynamic, mature solutions; active or even hot data.
  Examples:
  - weather data
  - data catalogue
  - big data from social media

Generic research data
  Description: Produced by, with or for researchers; validated, good quality, well documented; might be raw or processed.
  Format: Coherent and well-documented formats. Data should be quite stable, with versioning. Should be possible to cite and enable reproducible research.
  Examples:
  - corpora
  - time series of experimental or observational data from technical instruments
  - similar social or clinical surveys

Research dataset
  Description: Dataset produced for a certain research question. Might be highly processed; reuse difficult unless the field is mature. The main purpose is assessment and reproducibility.
  Format: Usually in files, but might also be a database with applications. Citation does not require a date. Two-tier resolver for identifier and landing page, with metadata available even after the data is gone. Might have a defined lifespan.
  Examples:
  - data paper
  - data cited in an article and published in Zenodo, EUDAT B2Share, or another repository or journal repository
Using research data types …
… makes it easier to describe services
… makes it easier for researchers to plan the data life cycle
… makes developing solutions for citation and FAIR data creation and use easier
… makes it easier to describe and manage research data
facebook.com/CSCfi
twitter.com/CSCfi
youtube.com/CSCfi
linkedin.com/company/csc---it-center-for-science
Images: CSC archive and Thinkstock
github.com/CSCfi
Jessica PvE parland@csc.fi

一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
 
SAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content DocumentSAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content Document
 
Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
 
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
 
Q4FY24 Investor-Presentation.pdf bank slide
Q4FY24 Investor-Presentation.pdf bank slideQ4FY24 Investor-Presentation.pdf bank slide
Q4FY24 Investor-Presentation.pdf bank slide
 
Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)
 
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
Bangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts ServiceBangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts Service
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
 

Supporting FAIR data principles with data categorization

  • 1. CSC – Finnish ICT competence centre for research, education, culture and public administration
      Supporting FAIR data. Categorization of research data as a tool in data management
      Jessica Parland-von Essen https://orcid.org/0000-0003-4460-3906, Katja Fält https://orcid.org/0000-0002-6172-5377, Zubair Maalick https://orcid.org/0000-0002-0975-1471, Miika Alonen https://orcid.org/0000-0002-0065-0017, Eduardo Gonzalez https://orcid.org/0000-0003-1400-0995
  • 3. Persistent identifiers
      a) Cite a specific slice or subset (the set of updates to the dataset made during a particular period of time or to a particular area of the dataset).
      b) Cite a specific snapshot (a copy of the entire dataset made at a specific time).
      c) Cite the continuously updated dataset, but add Access Date and Time to the citation. (Does not necessarily ensure reproducibility.)
      d) Cite a query, time-stamped for re-execution against a versioned database.
      DYNAMIC DATASETS
      IMMUTABLE DATASETS
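Option (d) on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `store`, the `RecordVersion` fields, the helper names `as_of` and `cite_query`, and the `query:<hash>@<timestamp>` citation format are all hypothetical and not taken from the slides or from any particular repository service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from hashlib import sha256
from typing import List, Optional


@dataclass(frozen=True)
class RecordVersion:
    key: str
    value: float
    valid_from: datetime           # when this version was written
    valid_to: Optional[datetime]   # None means the version is still current


def as_of(versions: List[RecordVersion], when: datetime) -> List[RecordVersion]:
    """Return the record versions that were current at time `when`."""
    return [
        v for v in versions
        if v.valid_from <= when and (v.valid_to is None or when < v.valid_to)
    ]


def cite_query(query_text: str, executed_at: datetime) -> str:
    """Build a hypothetical citation string from the query text and its timestamp."""
    digest = sha256(query_text.encode("utf-8")).hexdigest()[:12]
    return f"query:{digest}@{executed_at.isoformat()}"


t1 = datetime(2018, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2018, 6, 1, tzinfo=timezone.utc)
t3 = datetime(2018, 9, 1, tzinfo=timezone.utc)

# A tiny versioned "database": old versions are closed, never overwritten.
store = [
    RecordVersion("station-1", 4.0, t1, t2),    # superseded at t2
    RecordVersion("station-1", 4.5, t2, None),  # corrected value, current
    RecordVersion("station-2", 7.0, t2, None),  # added after the citation
]

citation_time = datetime(2018, 3, 1, tzinfo=timezone.utc)
citation = cite_query("SELECT * FROM observations", citation_time)

result_then = as_of(store, citation_time)  # what the citing author saw
result_now = as_of(store, t3)              # what the dataset holds later
```

Re-running the query with the cited timestamp returns the single original record, even though the dataset has since been corrected and extended. That is the property that makes a time-stamped query reproducible where a plain access date (option c) is not.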
  • 4. Maybe we need to be more specific and find common ground in concepts?
      CHUNKING UP RESEARCH DATA
  • 5. Categorization according to technical properties
      • Modality, DCMI types
          o Dublin Core-type thinking
      • Format, DCMI format
          o MIME types
          o Software related
      • Language, coding
          o Human interpretation
      Image: Lin Kristensen from New Jersey, USA (Timeless Books), CC BY 2.0 (https://creativecommons.org/licenses/by/2.0), via Wikimedia Commons
  • 6. Categorization according to contextual traits
      • Origin
          o Observational, experimental, simulation, derived, etc.
      • Use category
          o Source, output, method
      • Provenance, lifecycle
          o Primary, secondary, data levels, qualitative, quantitative
      Image: David Monniaux, CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0/), from Wikimedia Commons
  • 7. Categorization according to inherent traits
      • Access type (availability)
          o Open data, sensitive data
      • Semantic structure
          o Coherence, levels of measurement, groupings, classifications
      • Research data type (stability)
          o Generic data, generic research data, research data publications
  • 9. Dynamic and growing datasets
      • URN allows use of fragments
      • Avoid PID inflation
      • Consider costs and sustainability
      • Ad hoc creation rather than automatic minting and allocation?
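As a rough illustration of the fragment idea, the sketch below addresses several parts of one dataset through a `#` fragment on a single identifier instead of minting a new PID for each part. The `urn:example:` identifiers, the monthly part names, and the `split_fragment` helper are hypothetical; how fragments are resolved in practice depends on the resolver service.

```python
from typing import Tuple


def split_fragment(urn: str) -> Tuple[str, str]:
    """Split an identifier into its base part and its fragment (may be empty)."""
    base, _, fragment = urn.partition("#")
    return base, fragment


# One identifier is minted for the whole growing dataset ...
dataset_urn = "urn:example:dataset-42"

# ... and parts of it are referenced ad hoc through fragments.
references = [f"{dataset_urn}#{part}" for part in ("2018-01", "2018-02", "2018-03")]

# All three references resolve through the same minted identifier.
minted = {split_fragment(ref)[0] for ref in references}
```

Three references share one minted identifier here, which is the sense in which fragments help avoid PID inflation.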
  • 10. Operational data
          Description: Data for any use, private or government owned; might fall within PSI (public sector information).
          Format: May be dynamic, mature solutions, active or even hot data.
          Examples: weather data; data catalogue; big data from social media
        Generic research data
          Description: Produced by/with/for researchers, validated, good quality, well documented; might be raw or processed.
          Format: Coherent and well-documented formats. Data should be quite stable, with versioning. Should be possible to cite and enable reproducible research.
          Examples: corpora; time series of experimental or observational data from technical instruments; similar social or clinical surveys
        Research dataset
          Description: Dataset produced for a certain research question. Might be highly processed; reuse is difficult unless the field is mature. The main purpose is assessment and reproducibility.
          Format: Usually in files, but might also be a database with applications. Citation does not require date. Two-tier resolver for identifier and landing page, with metadata available even after data is gone. Might have defined lifespan.
          Examples: data paper; data cited in an article and published in Zenodo, EUDAT B2Share, or another or journal repository
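One way to see how the three data types could drive tooling is to let the type decide what a citation needs. The sketch below is a hedged reading of the table, not the authors' implementation: the class and function names, the `urn:example:` identifiers, and the rule that every type other than a research dataset needs an access date are assumptions made for illustration (the table itself only says that citing a research dataset does not require a date).

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class DataType(Enum):
    OPERATIONAL = "operational data"
    GENERIC_RESEARCH = "generic research data"
    RESEARCH_DATASET = "research dataset"


@dataclass(frozen=True)
class Dataset:
    title: str
    identifier: str
    data_type: DataType
    version: Optional[str] = None


def format_citation(ds: Dataset, accessed: Optional[date] = None) -> str:
    """Build a citation string; data that may still change needs an access date."""
    parts = [ds.title]
    if ds.version:
        parts.append(f"version {ds.version}")
    parts.append(ds.identifier)
    # Assumption: only a stable research dataset can be cited without a date.
    if ds.data_type is not DataType.RESEARCH_DATASET:
        if accessed is None:
            raise ValueError("access date needed for data that may change")
        parts.append(f"accessed {accessed.isoformat()}")
    return ", ".join(parts)


survey = Dataset("Example survey data", "urn:example:1234", DataType.RESEARCH_DATASET)
weather = Dataset("Example weather feed", "urn:example:5678", DataType.OPERATIONAL)

survey_citation = format_citation(survey)
weather_citation = format_citation(weather, date(2018, 10, 1))
```

Making the data type an explicit field means a service can reject an incomplete citation early, rather than leaving the decision to each researcher.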
  • 11. Using research data types …
      … makes it easier to describe services
      … makes it easier for researchers to plan data life cycle
      … makes developing solutions for citation and FAIR data creation and use easier
      … makes it easier to describe and manage research data