Research Data Management
DMP & repository
Mireia Alcalá
Consortium of University Services in Catalonia. Open Science Area
Workshop: Research Data Management & Open Science (02.11.20) – IDIBELL
Publications and data
 DATA: ‘any information that has been
collected, observed, generated or
created to validate original research
findings’
 OPEN (RESEARCH) DATA: ‘are
online, free of cost, accessible data
that can be used, reused and
distributed provided that the data
source is attributed’ (FOSTER)
Making data FAIR
 In 2014, a core set of principles was drafted to optimize the
reusability of research data, named the FAIR Data Principles:
• Findable: Assign persistent IDs, provide rich metadata, register in a searchable
resource...
• Accessible: Retrievable by their ID using a standard protocol, metadata remain
accessible even if data aren’t...
• Interoperable: Use formal, broadly applicable languages, use standard
vocabularies, qualified references...
• Reusable: Rich, accurate metadata, clear licenses, provenance, use of
community standards…
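As a rough illustration of what these principles ask of a dataset record, the sketch below checks a metadata dictionary for a few FAIR-related fields. The field names (`identifier`, `access_url`, `license`, etc.) and the grouping under each principle are hypothetical simplifications chosen for this example, not an official FAIR assessment.

```python
# Illustrative only: field names and rules are assumptions, not an official FAIR test.
FAIR_CHECKS = {
    "Findable": ["identifier", "title", "keywords"],
    "Accessible": ["access_url"],
    "Interoperable": ["metadata_standard"],
    "Reusable": ["license", "provenance"],
}


def missing_fair_fields(record):
    """Return, per principle, the fields that are absent or empty in `record`."""
    gaps = {}
    for principle, fields in FAIR_CHECKS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


example = {
    "identifier": "doi:10.5072/FK2/EXAMPLE",  # placeholder DOI, not a real dataset
    "title": "Example dataset",
    "keywords": ["example"],
    "access_url": "https://repository.example.org/dataset/1",  # hypothetical URL
    "metadata_standard": "DataCite",
    "license": "",  # left empty on purpose to show a gap
    "provenance": "Collected for illustration",
}

print(missing_fair_fields(example))
```

A record with no gaps returns an empty dictionary; here the empty `license` field is reported under "Reusable".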
Horizon 2020 and the RDM
 In 2014-16, the Open Research Data pilot covered only selected areas
of Horizon 2020
 In 2017, the pilot was extended to cover all the thematic areas
of the programme.
 It addresses two things at once:
• Streamlining RDM as a standard practice through Data Management Plans
(DMPs): required as deliverable
• Open access to research data: ‘as open as possible, as closed as necessary’
(opt-out options for IPR, confidentiality/privacy and security reasons, or if
open access would run against the main objective of the project)
Research Data Management: CSUC’s vision
[Diagram labels: Size, Projects, Policies, DMP, Training & Promotion, Repository]
We support universities & research centers to ensure that the management of their research
data meets the requirements of funding bodies and follows the FAIR principles
Training and promotion
 Training
• At CSUC level (inviting experts)
coordinated with i-CERCA
– 5 courses (177 participants)
– 2020-2021: 4 courses scheduled
• At university level (organizing staff
courses)
 Promotion
• Article in LQ (DOI: 10.18352/lq.10253)
• Infographics
• Videos
Data policy
Promote the implementation of research data policies
• Framework agreement for open access to research data (2016)
supported by the vice-rectors for research of the Universities of
Catalonia
– Topics:
– Open Access as a default
– Responsibilities
– Retention and storage
– Data Management Plan
– Costs
– Preservation
– Support and monitoring
• Template (2018) for drawing up an institutional data policy, following
the above recommendations and others proposed by LEARN
– UB and UAB have approved their policies following this template
Data Management Plan (DMP)
 A DMP is a formal document that outlines how data will be handled both during a
research project and after the project is completed. It helps you define:
• Which data will be created
• Standards and methodologies to be used, documentation
• How ethics and intellectual property will be addressed
• Plans for storage and backup
• Plans for data sharing and access
• Strategy for long-term preservation
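The topics a DMP covers can be pictured as a simple checklist. The sketch below is a minimal illustration only: the class and field names are hypothetical and do not correspond to any funder template or to the eiNa DMP tool.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical skeleton with one free-text answer per DMP topic."""
    data_description: str = ""
    standards_and_documentation: str = ""
    ethics_and_ip: str = ""
    storage_and_backup: str = ""
    sharing_and_access: str = ""
    long_term_preservation: str = ""

    def unanswered(self):
        """List the topics that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


plan = DataManagementPlan(
    data_description="Survey responses in CSV",
    storage_and_backup="Institutional storage with daily backup",
)
print(plan.unanswered())
```

Listing the unanswered topics is a quick way to see what is still missing before a plan is submitted as a deliverable.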
Supporting DMPs
 Many funders require DMPs
 For this reason, the first action focused on supporting
researchers in creating their DMPs
 2016: 1st version of eiNa DMP
• Updated in 2018 and revised periodically
 eiNa DMP is
• Created collectively
• Used at personal / institutional level
 The tool includes guides, descriptions and real examples
for:
• Horizon 2020
• European Research Council (ERC)
• Plan Estatal de Investigación Científica
• Software Sustainability Institute
• PhDs
eiNa DMP
 eiNa DMP (dmp.csuc.cat) is a free online tool that allows you to:
• Create: answer the questions and you’ll get a FAIR DMP.
• Share: collaborate with other researchers by giving them read-only, editing or co-owner
permissions.
• Export: convert your document to DOC, PDF, XML…
 Instance of DMPRoadmap
• Open source software
• Distributed under the MIT license
• Developed by the Digital Curation
Centre (DCC) and the University of
California Curation Center (UC3)
DMP – Collective template jointly maintained
eiNa DMP
Sharing data
This guide provides sources for consulting:
• disciplinary repositories (directories, publishers'
recommendations, etc.).
• multidisciplinary repositories (a comparative table
showing the type of data allowed, the file size, the
associated licenses, the cost of depositing, etc.).
Creating a FAIR data repository
[Chart axes: Nº of projects; size and complexity]
 The vice-rectors for research of the Universities of Catalonia decided to
commission a report that would determine the reasonable functional
requirements that a data repository must have in order to comply with the
FAIR requirements.
 The report “Feasible, Affordable and Implementable Requirements for a
FAIR data repository”:
• Analyzes the references from other countries
• Studies some technical reports
• Interviews 25 experts
• Determines 25 functional requirements
• Proposes final recommendations
http://hdl.handle.net/2072/356460
DataverseCAT
 FAIR research data repository
 Federated
 Multidisciplinary
 For open research data
 For researchers from
CSUC universities and
CERCA centers
Dataverse software
 An open-source platform to publish, cite, and archive research data
 Built to support multiple types of data, users, and workflows
 Developed at Harvard’s Institute for Quantitative Social Science (IQSS)
 More than 60 installations around the world
• In Europe, federated cases: DataverseNL (Netherlands), DataverseNO (Norway)
Instances, organization & datasets
[Chart labels: UB, UAB, URV, UPC, UdG, UPF, i-CERCA, Others]
Deposit process
FAIR implementation in Dataverse
 Findable
• Assign DataCite DOIs
• Metadata standards in human and machine-readable formats
• ID is in the metadata tab of the dataset and file landing page
• DataCite metadata is registered and indexed by DataCite Search (& Google
Dataset Search)
 Accessible
• Support for HTTP, APIs...
• Open metadata even if data are restricted
 Interoperable
• Linked data (JSON, XML...)
• Controlled vocabularies
• Metadata templates
 Reusable
• Licenses
• Access and use terms
• Full data citation
• Versioning
• File format conversion
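The "retrievable by ID over a standard protocol" idea can be sketched with plain HTTP URLs. The example below only builds the URLs and makes no network call. The endpoint paths follow the Dataverse native and search APIs as commonly documented, but the base URL and DOI are placeholders, and the paths should be checked against the API guide of the installation you use.

```python
from urllib.parse import urlencode

BASE_URL = "https://dataverse.example.org"  # hypothetical installation, not a real server


def dataset_metadata_url(doi, base_url=BASE_URL):
    """URL for a dataset's JSON metadata, looked up by persistent identifier."""
    query = urlencode({"persistentId": doi})
    return f"{base_url}/api/datasets/:persistentId/?{query}"


def export_url(doi, exporter="dataverse_json", base_url=BASE_URL):
    """URL for exporting the dataset metadata in a given format."""
    query = urlencode({"exporter": exporter, "persistentId": doi})
    return f"{base_url}/api/datasets/export?{query}"


def search_url(text, base_url=BASE_URL):
    """URL for a keyword search restricted to datasets."""
    query = urlencode({"q": text, "type": "dataset"})
    return f"{base_url}/api/search?{query}"


doi = "doi:10.5072/FK2/EXAMPLE"  # placeholder DOI, not a real dataset
print(dataset_metadata_url(doi))
print(export_url(doi))
print(search_url("open data"))
```

Because the metadata stay open even when files are restricted, requests of this kind can return a description of a dataset whose files cannot be downloaded.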
FAIR Data in Dataverse
• Data citation with persistent identifier (DOI)
• Data files
• Metadata
• Data licenses, user agreements
• Versioning
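A data citation combines the elements listed above into one line. The sketch below assumes a layout of authors, year, title, DOI link, repository and version, modelled on the citations that Dataverse installations typically display; the function name and the sample values are hypothetical.

```python
def format_citation(authors, year, title, doi, repository, version):
    """Build a one-line data citation (assumed layout, see the note above)."""
    author_text = "; ".join(authors)
    return (
        f'{author_text}, {year}, "{title}", '
        f"https://doi.org/{doi}, {repository}, V{version}"
    )


citation = format_citation(
    authors=["Doe, Jane", "Roe, Richard"],  # invented names
    year=2020,
    title="Example dataset",
    doi="10.5072/FK2/EXAMPLE",  # placeholder DOI, not a real dataset
    repository="Example Repository",
    version=1,
)
print(citation)
```

Including the version number lets readers cite exactly the state of the data they used.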
DataverseCAT, a repository to develop expertise and good practices
• DataCite metadata
• Standards and vocabularies
• File naming and organizing folders
• Readme.txt
• Licenses
• Data citation
• Storing data
• Anonymization
• Open formats
• …
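As one possible illustration of the file naming practice, the sketch below builds file names from a project name, a short description, a date and a version number. The pattern itself is an assumption made for this example, not a CSUC or DataverseCAT rule.

```python
import re
from datetime import date


def dataset_filename(project, description, day, version, extension):
    """Compose project_description_YYYYMMDD_vNN.ext (an assumed convention)."""

    def slug(text):
        # Lowercase, and replace any run of non-alphanumeric characters with "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{slug(project)}_{slug(description)}_"
        f"{day:%Y%m%d}_v{version:02d}.{extension}"
    )


name = dataset_filename("Soil Survey", "pH measurements", date(2020, 11, 2), 1, "csv")
print(name)
```

Names without spaces or special characters, with a sortable date and an explicit version, remain readable across operating systems and in repository file listings.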
Thank you!
Mireia Alcalá
mireia.alcala@csuc.cat
