Basics in good research data
management (RDM) for
reviewing DMPs
FOSTER & OpenAIRE webinar, 22nd October 2018
https://www.openaire.eu/open-access-week-2018
S. Venkataraman
Digital Curation Centre, Edinburgh
s.venkataraman@ed.ac.uk
https://doi.org/10.5281/zenodo.1461601
WHAT IS RESEARCH
DATA
MANAGEMENT?
What is Research Data Management?
Create
Document
Use
Store
Share
Preserve
“the active management
and appraisal of data
over the lifecycle of
scholarly and scientific
interest”
Data management is
part of
good research
practice
Concepts to cover
•Data formats
•Metadata
•Licensing
•Data repositories
•Persistent identifiers
These aspects are addressed specifically in Data Management Plans, so here
we will help you review them.
Choose
appropriate file
formats
Data Formats
Different formats are good for different things
- open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
- proprietary and/or compressed formats are less preservable but are
often in widespread use e.g. doc, jpg, mp3
Use one format for analysis, then convert to a standard format for preservation
Data centres may suggest preferred formats for deposit
https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
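As a rough illustration of converting an analysis export into a standard format, the sketch below re-serialises delimited text as CSV using only the Python standard library. It assumes the analysis software can already export tab-delimited text; the function name and the sample rows are hypothetical and not part of the webinar.

```python
import csv
import io


def convert_delimited_to_csv(source_text: str, delimiter: str = "\t") -> str:
    """Re-serialise delimited text as comma-separated values (hypothetical helper)."""
    reader = csv.reader(io.StringIO(source_text), delimiter=delimiter)
    out = io.StringIO()
    # The csv writer quotes any field that itself contains a comma.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Made-up sample data, for illustration only.
sample = "id\tsite\tvalue\n1\tEdinburgh, UK\t3.2\n"
print(convert_delimited_to_csv(sample))
```

A real conversion from a statistical package would also need to carry over variable labels and missing-value definitions, which plain CSV cannot hold.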
Data Formats
Formats by type of data (recommended, then acceptable)

Tabular data with extensive metadata (variable labels, code labels, and defined missing values)
Recommended formats:
- SPSS portable format (.por)
- delimited text and command ('setup') file (SPSS, Stata, SAS, etc.)
- structured text or mark-up file of metadata information, e.g. DDI XML file
Acceptable formats:
- proprietary formats of statistical packages: SPSS (.sav), Stata (.dta), MS Access (.mdb/.accdb)

Tabular data with minimal metadata (column headings, variable names)
Recommended formats:
- comma-separated values (.csv)
- tab-delimited file (.tab)
- delimited text with SQL data definition statements
Acceptable formats:
- delimited text (.txt) with characters not present in data used as delimiters
- widely-used formats: MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf), OpenDocument Spreadsheet (.ods)

Geospatial data (vector and raster data)
Recommended formats:
- ESRI Shapefile (.shp, .shx, .dbf, .prj, .sbx, .sbn optional)
- geo-referenced TIFF (.tif, .tfw)
- CAD data (.dwg)
- tabular GIS attribute data
- Geography Markup Language (.gml)
Acceptable formats:
- ESRI Geodatabase format (.mdb)
- MapInfo Interchange Format (.mif) for vector data
- Keyhole Mark-up Language (.kml)
- Adobe Illustrator (.ai), CAD data (.dxf or .svg)
- binary formats of GIS and CAD packages

Textual data
Recommended formats:
- Rich Text Format (.rtf)
- plain text, ASCII (.txt)
- eXtensible Mark-up Language (.xml) text according to an appropriate Document Type Definition (DTD) or schema
Acceptable formats:
- Hypertext Mark-up Language (.html)
- widely-used formats: MS Word (.doc/.docx)
- some software-specific formats: NUD*IST, NVivo and ATLAS.ti

Image data
Recommended formats:
- TIFF 6.0 uncompressed (.tif)
Acceptable formats:
- JPEG (.jpeg, .jpg, .jp2) if original created in this format
- GIF (.gif)
- TIFF other versions (.tif, .tiff)
- RAW image format (.raw)
- Photoshop files (.psd)
- BMP (.bmp)
- PNG (.png)
- Adobe Portable Document Format (PDF/A, PDF) (.pdf)

Audio data
Recommended formats:
- Free Lossless Audio Codec (FLAC) (.flac)
Acceptable formats:
- MPEG-1 Audio Layer 3 (.mp3) if original created in this format
- Audio Interchange File Format (.aif)
- Waveform Audio Format (.wav)

Video data
Recommended formats:
- MPEG-4 (.mp4)
- OGG video (.ogv, .ogg)
- motion JPEG 2000 (.mj2)
Acceptable formats:
- AVCHD video (.avchd)

Documentation and scripts
Recommended formats:
- Rich Text Format (.rtf)
- PDF/UA, PDF/A or PDF (.pdf)
- XHTML or HTML (.xhtml, .htm)
- OpenDocument Text (.odt)
Acceptable formats:
- plain text (.txt)
- widely-used formats: MS Word (.doc/.docx), MS Excel (.xls/.xlsx)
- XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHTML 1.0
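When reviewing a DMP, a listing like the one above can be turned into a simple lookup. The sketch below is only an illustration: it covers a small excerpt (the minimal-metadata tabular row and the audio row), and the dictionary and function names are hypothetical. File extensions alone can be ambiguous (for example, .tif is both the recommended TIFF 6.0 format and an acceptable "other version"), so treat the result as a prompt for a closer look, not a verdict.

```python
from pathlib import Path

# Excerpt only: tabular data with minimal metadata, plus audio data.
RECOMMENDED = {".csv", ".tab", ".flac"}
ACCEPTABLE = {".xls", ".xlsx", ".ods", ".mp3", ".aif", ".wav"}


def classify_format(filename: str) -> str:
    """Return 'recommended', 'acceptable' or 'not listed' for a file name."""
    ext = Path(filename).suffix.lower()
    if ext in RECOMMENDED:
        return "recommended"
    if ext in ACCEPTABLE:
        return "acceptable"
    return "not listed"


# Hypothetical file names, as they might appear in a DMP under review.
for name in ["survey_results.csv", "interview_01.mp3", "model.xyz"]:
    print(name, "->", classify_format(name))
```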
Document your
data as fully as
possible
Metadata and documentation
At a basic level, metadata supports data discovery, disambiguation and
citation
Rich metadata and documentation will support interoperability & reuse
Standards should be used. These can be general – such as Dublin Core, or
discipline specific
Data Documentation Initiative (DDI) – social science
Ecological Metadata Language (EML) – ecology
Flexible Image Transport System (FITS) – astronomy
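To make the idea of a general standard concrete, here is a minimal sketch of a Dublin Core description built with the Python standard library. The element names (title, creator, date, identifier) come from the Dublin Core element set and the values are taken from this presentation's title slide; the wrapping `metadata` element and the function name are assumptions made for illustration, since repositories define their own container formats.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dublin_core(fields: dict[str, str]) -> str:
    """Serialise simple Dublin Core fields as XML (hypothetical helper)."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # assumed wrapper element, not part of Dublin Core
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = build_dublin_core({
    "title": "Basics in good research data management (RDM) for reviewing DMPs",
    "creator": "S. Venkataraman",
    "date": "2018-10-22",
    "identifier": "https://doi.org/10.5281/zenodo.1461601",
})
print(record)
```

Even a record this small supports discovery and citation; discipline-specific standards such as DDI or EML add the richer detail needed for reuse.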
Where to find relevant standards?
Metadata Standards Directory
Broad, disciplinary listing of
standards and tools. Maintained
by RDA group
https://rdamsc.dcc.ac.uk
FAIRsharing
•A portal of data standards, databases,
and policies
•Focused on life, environmental and
biomedical sciences, but expanding to
other disciplines
https://fairsharing.org
Value of controlled vocabularies
“MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……”
Example courtesy of Ken Haug, European
Bioinformatics Institute (EMBL-EBI)
Controlled vocabularies
• e.g. SNOMED CT (clinical terms) or MeSH
• Include ontologies as well
• Defined terms + taxonomy
• Useful for selecting keywords to tag datasets
• Example: compare anatomical components in two distinct species of organism…
➢Organism A
➢Term A1
➢Term A2
➢Term A3
➢Term B1
➢Term B2
➢Term C4
➢.
➢.
➢.
➢Term n
►Organism B
►Term A1
►Term A2
►Term A3
►Term B1
►Term B2
►Term C4
►.
►.
►.
►Term n
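The comparison above only works because both organisms are annotated with the same defined terms. A minimal sketch of that idea, using hypothetical term sets, is a plain set comparison:

```python
def compare_terms(terms_a: set[str], terms_b: set[str]) -> dict[str, set[str]]:
    """Split two annotation sets into shared and organism-specific terms."""
    return {
        "shared": terms_a & terms_b,
        "only_a": terms_a - terms_b,
        "only_b": terms_b - terms_a,
    }


# Hypothetical annotations drawn from one controlled vocabulary.
organism_a = {"Term A1", "Term A2", "Term B1"}
organism_b = {"Term A1", "Term B1", "Term C4"}

result = compare_terms(organism_a, organism_b)
print(sorted(result["shared"]))
```

With free-text labels the same comparison would fail on spelling variants and synonyms, which is the practical argument for tagging datasets from a controlled vocabulary.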
Ensure your data is
as visible as
possible
Dataset licensing
Horizon 2020 guidelines point to:
CC BY or CC0
EUDAT licensing tool
https://ufal.github.io/public-license-selector
Choose a suitable
repository
Data repositories
www.re3data.org
The EC guidelines point to Re3data as one of the registries that can be
searched to find a home for data
www.fosteropenscience.eu/content/re3data-demo
Considerations when selecting repositories
• Often preferable to use a subject specific repository if available
• Useful if repositories assign a persistent identifier
• Look for certification as a ‘Trustworthy Digital Repository’ with an explicit
ambition to keep the data available in the long term.
• Generic repositories are also available e.g. Zenodo or institutional
repositories
Icons to note: open access, licenses, PIDs, certificates…
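One way to apply these considerations consistently when reviewing several DMPs is to record them as a small checklist. The sketch below is an assumption-laden illustration: the class, its field names and the example repository are all hypothetical, and the three checks simply mirror the bullets above.

```python
from dataclasses import dataclass


@dataclass
class RepositoryProfile:
    """Hypothetical summary of a repository named in a DMP."""
    name: str
    subject_specific: bool
    assigns_pid: bool
    certified: bool


def unmet_considerations(repo: RepositoryProfile) -> list[str]:
    """List the considerations a repository does not satisfy."""
    checks = [
        (repo.subject_specific, "not subject specific"),
        (repo.assigns_pid, "does not assign a persistent identifier"),
        (repo.certified, "no trustworthy-repository certification"),
    ]
    return [message for ok, message in checks if not ok]


# Made-up example values, not a statement about any real repository.
example = RepositoryProfile("Example Repository", False, True, False)
print(unmet_considerations(example))
```

An unmet item is not automatically a problem: a generic repository is a reasonable choice when no subject-specific one exists.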
Make sure that data
can be accessed in
perpetuity
Persistent Identifiers
• a long-lasting reference to a document, file or other object
• PIDs come in various forms e.g. ARK, DOI, URN, PURL, Handles...
• Typically they’re actionable i.e. type one into a web browser to access the object
• Many repositories will assign them on deposit
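As a small illustration of what "actionable" means for a DOI, the sketch below turns a DOI written in a few common ways into a resolvable https://doi.org/ link. The function name is hypothetical, and the example uses the DOI of this presentation.

```python
def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI given bare, as 'doi:...', or as a URL."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://doi.org/{doi}"


print(doi_to_url("doi:10.5281/zenodo.1461601"))
```

The link keeps working if the repository reorganises its web pages, because the resolver redirects to wherever the object currently lives.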
Persistent Identifiers
A specific example: ORCID
https://orcid.org/blog/2017/10/04/building-information-infrastructure-research-institutions
https://orcid.org/blog/2016/10/31/organization-identifier-project-way-forward
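An ORCID iD is a 16-character identifier whose last character is a check digit computed with the ISO 7064 MOD 11-2 algorithm, so a reviewer can catch mistyped iDs in a DMP before looking them up. The sketch below is an illustration, not an official validator; the function names are hypothetical, and the sample iD is the one ORCID uses in its own documentation.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


print(is_valid_orcid("0000-0002-1825-0097"))
```

A passing check only shows that the iD is well formed; it does not confirm that the iD belongs to the person named in the plan.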
Thanks for watching!
More info at:
www.dcc.ac.uk/resources/
https://www.fosteropenscience.eu/
https://www.openaire.eu/

 
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
 
Lubrication System in forced feed system
Lubrication System in forced feed systemLubrication System in forced feed system
Lubrication System in forced feed system
 
TEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdfTEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdf
 
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
 
GBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interactionGBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interaction
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdf
 
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptxSaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
Triploidy ...............................pptx
Triploidy ...............................pptxTriploidy ...............................pptx
Triploidy ...............................pptx
 
VILLAGE ATTACHMENT For rural agriculture PPT.pptx
VILLAGE ATTACHMENT For rural agriculture  PPT.pptxVILLAGE ATTACHMENT For rural agriculture  PPT.pptx
VILLAGE ATTACHMENT For rural agriculture PPT.pptx
 

Basics of Research Data Management

  • 1. Basics in good research data management (RDM) for reviewing DMPs
    FOSTER & OpenAIRE webinar, 22nd October 2018
    https://www.openaire.eu/open-access-week-2018
    S. Venkataraman, Digital Curation Centre, Edinburgh
    s.venkataraman@ed.ac.uk
    https://doi.org/10.5281/zenodo.1461601
  • 3. What is Research Data Management?
    Create, Document, Use, Store, Share, Preserve
    “the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
    Data management is part of good research practice
  • 5. Concepts to cover
    • Data formats
    • Metadata
    • Licensing
    • Data repositories
    • Persistent identifiers
    These aspects are addressed specifically in Data Management Plans, so here we will help you review them
  • 7. Data Formats
    Different formats are good for different things:
    - open, lossless formats are more sustainable, e.g. rtf, xml, tif, wav
    - proprietary and/or compressed formats are less preservable but are often in widespread use, e.g. doc, jpg, mp3
    Use one format for analysis, then convert to a standard format
    Data centres may suggest preferred formats for deposit:
    https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
  • 8. Data Formats (by type of data: recommended formats, then acceptable formats)
    Tabular data with extensive metadata (variable labels, code labels, and defined missing values)
      Recommended: SPSS portable format (.por); delimited text and command ('setup') file (SPSS, Stata, SAS, etc.); structured text or mark-up file of metadata information, e.g. DDI XML file
      Acceptable: proprietary formats of statistical packages: SPSS (.sav), Stata (.dta), MS Access (.mdb/.accdb)
    Tabular data with minimal metadata (column headings, variable names)
      Recommended: comma-separated values (.csv); tab-delimited file (.tab); delimited text with SQL data definition statements
      Acceptable: delimited text (.txt) with characters not present in data used as delimiters; widely-used formats: MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf), OpenDocument Spreadsheet (.ods)
    Geospatial data (vector and raster data)
      Recommended: ESRI Shapefile (.shp, .shx, .dbf, .prj, .sbx, .sbn optional); geo-referenced TIFF (.tif, .tfw); CAD data (.dwg); tabular GIS attribute data; Geography Markup Language (.gml)
      Acceptable: ESRI Geodatabase format (.mdb); MapInfo Interchange Format (.mif) for vector data; Keyhole Mark-up Language (.kml); Adobe Illustrator (.ai), CAD data (.dxf or .svg); binary formats of GIS and CAD packages
    Textual data
      Recommended: Rich Text Format (.rtf); plain text, ASCII (.txt); eXtensible Mark-up Language (.xml) text according to an appropriate Document Type Definition (DTD) or schema
      Acceptable: Hypertext Mark-up Language (.html); widely-used formats: MS Word (.doc/.docx); some software-specific formats: NUD*IST, NVivo and ATLAS.ti
    Image data
      Recommended: TIFF 6.0 uncompressed (.tif)
      Acceptable: JPEG (.jpeg, .jpg, .jp2) if original created in this format; GIF (.gif); TIFF other versions (.tif, .tiff); RAW image format (.raw); Photoshop files (.psd); BMP (.bmp); PNG (.png); Adobe Portable Document Format (PDF/A, PDF) (.pdf)
    Audio data
      Recommended: Free Lossless Audio Codec (FLAC) (.flac)
      Acceptable: MPEG-1 Audio Layer 3 (.mp3) if original created in this format; Audio Interchange File Format (.aif); Waveform Audio Format (.wav)
    Video data
      Recommended: MPEG-4 (.mp4); OGG video (.ogv, .ogg); motion JPEG 2000 (.mj2)
      Acceptable: AVCHD video (.avchd)
    Documentation and scripts
      Recommended: Rich Text Format (.rtf); PDF/UA, PDF/A or PDF (.pdf); XHTML or HTML (.xhtml, .htm); OpenDocument Text (.odt)
      Acceptable: plain text (.txt); widely-used formats: MS Word (.doc/.docx), MS Excel (.xls/.xlsx); XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHTML 1.0
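A reviewer checking the file formats listed in a DMP could automate a first pass along these lines. This is only a minimal sketch: the two extension sets are copied from the examples on the "Data Formats" slide (rtf, xml, tif, wav versus doc, jpg, mp3), and the function names `classify_format` and `review_file_list` are hypothetical helpers, not part of any existing tool.

```python
from pathlib import Path

# Extensions taken from the slide's examples; a real check would use a
# fuller list, such as the one published by the chosen data centre.
OPEN_LOSSLESS = {".rtf", ".xml", ".tif", ".wav"}
PROPRIETARY_OR_COMPRESSED = {".doc", ".jpg", ".mp3"}


def classify_format(filename: str) -> str:
    """Return a coarse preservation category for a file name."""
    extension = Path(filename).suffix.lower()
    if extension in OPEN_LOSSLESS:
        return "open/lossless"
    if extension in PROPRIETARY_OR_COMPRESSED:
        return "proprietary/compressed"
    return "not listed"


def review_file_list(filenames):
    """Group file names by category (hypothetical helper for reviewers)."""
    report = {}
    for name in filenames:
        report.setdefault(classify_format(name), []).append(name)
    return report


if __name__ == "__main__":
    for category, names in review_file_list(
        ["interview.WAV", "notes.doc", "scan.tif", "table.csv"]
    ).items():
        print(category, names)
```

Files that land in "proprietary/compressed" or "not listed" are the ones where a DMP should explain whether and how they will be converted before deposit.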
  • 9. Document your data as fully as possible
  • 10. Metadata and documentation
    At a basic level, metadata supports data discovery, disambiguation and citation
    Rich metadata and documentation will support interoperability & reuse
    Standards should be used. These can be general, such as Dublin Core, or discipline-specific:
    - Data Documentation Initiative (DDI): social science
    - Ecological Metadata Language (EML): ecology
    - Flexible Image Transport System (FITS): astronomy
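To make "general" metadata concrete, here is a minimal sketch of a Dublin Core description of this very presentation, built with Python's standard library. The field values come from the title slide; the wrapper element name `metadata` and the helper `dublin_core_record` are assumptions made for illustration, since repositories each define their own container format.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 classic Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_record(fields: dict) -> str:
    """Serialise a dict of Dublin Core fields as a small XML record."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "Basics in good research data management (RDM) for reviewing DMPs",
    "creator": "S. Venkataraman",
    "date": "2018-10-22",
    "identifier": "https://doi.org/10.5281/zenodo.1461601",
})

if __name__ == "__main__":
    print(record)
```

Even these four fields cover the basic functions named on the slide: discovery (title), disambiguation (creator, date) and citation (identifier).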
  • 11. Where to find relevant standards?
    Metadata Standards Directory
    • Broad, disciplinary listing of standards and tools
    • Maintained by RDA group
    https://rdamsc.dcc.ac.uk
    FAIRsharing
    • A portal of data standards, databases, and policies
    • Focused on life, environmental and biomedical sciences, but expanding to other disciplines
    https://fairsharing.org
  • 12. Value of controlled vocabularies “MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……” Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI)
  • 13. Controlled vocabularies
    • e.g. SNOMED CT (clinical terms) or MeSH
    • Include ontologies as well
    • Defined terms + taxonomy
    • Useful for selecting keywords to tag datasets
    • Example: compare anatomical components in two distinct species of organism…
    ➢ Organism A: Term A1, Term A2, Term A3, Term B1, Term B2, Term C4, ..., Term n
    ► Organism B: Term A1, Term A2, Term A3, Term B1, Term B2, Term C4, ..., Term n
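The comparison on this slide only works if both datasets are tagged with the same defined terms. The sketch below shows the idea with an entirely hypothetical toy vocabulary (it is not SNOMED CT or MeSH): free-text labels are first mapped to preferred terms, and the two organisms are then compared with ordinary set operations.

```python
# Toy vocabulary mapping free-text labels to a preferred term.
# Hypothetical content, for illustration only.
TOY_VOCABULARY = {
    "heart": "heart",
    "cardiac organ": "heart",
    "liver": "liver",
    "hepatic tissue": "liver",
    "gill": "gill",
}


def to_preferred_terms(labels, vocabulary):
    """Map labels to preferred terms; also return labels with no match."""
    preferred, unmatched = set(), set()
    for label in labels:
        key = label.strip().lower()
        if key in vocabulary:
            preferred.add(vocabulary[key])
        else:
            unmatched.add(label)
    return preferred, unmatched


organism_a, unknown_a = to_preferred_terms(
    ["Heart", "hepatic tissue"], TOY_VOCABULARY
)
organism_b, unknown_b = to_preferred_terms(
    ["cardiac organ", "Liver", "gill", "fin ray"], TOY_VOCABULARY
)

shared_terms = organism_a & organism_b   # tagged in both organisms
only_in_b = organism_b - organism_a      # tagged in organism B only

if __name__ == "__main__":
    print("shared:", sorted(shared_terms))
    print("only in B:", sorted(only_in_b))
    print("needs a curator:", sorted(unknown_b))
```

Without the mapping step, "Heart" and "cardiac organ" would look like different components, which is exactly the ambiguity a controlled vocabulary removes.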
  • 14. Ensure your data is as visible as possible
  • 18. Data repositories
    www.re3data.org
    The EC guidelines point to Re3data as one of the registries that can be searched to find a home for data
    www.fosteropenscience.eu/content/re3data-demo
  • 19. Considerations when selecting repositories
    • Often preferable to use a subject-specific repository if available
    • Useful if repositories assign a persistent identifier
    • Look for certification as a ‘Trustworthy Digital Repository’ with an explicit ambition to keep the data available in the long term
    • Generic repositories are also available, e.g. Zenodo or institutional repositories
    Icons to note open access, licenses, PIDs, certificates…
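When a DMP names several candidate repositories, the considerations above can be turned into a simple checklist. The following is a minimal sketch under stated assumptions: the three criteria are weighted equally (the slides do not say how to weigh them), and the repository names and their properties are invented placeholders, not statements about real services.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    name: str
    subject_specific: bool
    assigns_pid: bool
    certified: bool  # e.g. holds a Trustworthy Digital Repository certificate


def checklist_score(repo: Repository) -> int:
    """Count how many of the three considerations a repository meets."""
    return sum([repo.subject_specific, repo.assigns_pid, repo.certified])


def rank(repositories):
    """Order repositories from most to fewest criteria met."""
    return sorted(repositories, key=checklist_score, reverse=True)


# Hypothetical candidates named in an imaginary DMP.
candidates = [
    Repository("Generic Repo", subject_specific=False, assigns_pid=True, certified=False),
    Repository("Discipline Archive", subject_specific=True, assigns_pid=True, certified=True),
    Repository("Lab Web Server", subject_specific=False, assigns_pid=False, certified=False),
]

if __name__ == "__main__":
    for repo in rank(candidates):
        print(checklist_score(repo), repo.name)
```

A score is only a prompt for discussion; a reviewer would still confirm each property against the repository's registry entry.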
  • 20. Make sure that data can be accessed in perpetuity
  • 21. Persistent Identifiers
    • A long-lasting reference to a document, file or other object
    • PIDs come in various forms, e.g. ARK, DOI, URN, PURL, Handles...
    • Typically they’re actionable, i.e. type it into a web browser to access
    • Many repositories will assign them on deposit
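"Actionable" can be illustrated with the DOI of this presentation: prefixing a DOI with the `https://doi.org/` resolver gives a link that a browser can follow. The helper below is a minimal sketch; `doi_to_url` is a hypothetical name, and the only validation is the check that the DOI starts with `10.`.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI in any common written form into a resolvable URL."""
    cleaned = doi.strip()
    # Strip the prefixes people commonly write in front of a DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not cleaned.startswith("10."):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


if __name__ == "__main__":
    print(doi_to_url("doi:10.5281/zenodo.1461601"))
```

A reviewer can use the same normalisation to check that identifiers cited in a DMP are written in a form that actually resolves.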
  • 22. Persistent Identifiers
    A specific example: ORCID
    https://orcid.org/blog/2017/10/04/building-information-infrastructure-research-institutions
    https://orcid.org/blog/2016/10/31/organization-identifier-project-way-forward
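An ORCID iD is a 16-character identifier whose last character is a checksum (ISO 7064 MOD 11-2), so typing errors in a DMP can be caught without any network lookup. The sketch below assumes the iD is written in the usual hyphenated form; the function names are hypothetical, and the iD in the usage lines is the sample iD commonly used in ORCID's own documentation.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the digits and the final check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD
    print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

A passing checksum only shows that the iD is well formed; it does not confirm that the iD belongs to the named researcher.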
  • 23. Thanks for watching!
    More info at:
    www.dcc.ac.uk/resources/
    https://www.fosteropenscience.eu/
    https://www.openaire.eu/