RDA and Adoption 
Early WG Report back session 
September 23, 2014

Happy Birthday!
http://cdn.cakecentral.com/d/d3/900x900px-LL-d3548099_gallery6680631282672149.jpeg

What did we learn?
• Motivated groups of people can do a lot
• But we are relying too much on volunteer labour contributed on top of over-full lives
• Looks like the RDA-challenge goal of 12-18 months is achievable
• But IGs also provide valuable space for longer-term interaction
• We need to reduce friction in our processes
• But the organisation is maturing rapidly

RDA and Outputs
• RDA will only deliver on its promise if it produces deliverables, and those deliverables become adopted outside the groups that created them
• Consequential TAB foci:
  • proposals for new groups – adoption plans?
  • tracking groups underway – fit for purpose?
  • monitoring of adoption once groups conclude – actually adopted?
• So, how can we most usefully think through the process of adoption?

Diffusion and RDA
• Adoption can be seen as the end result of a diffusion process. This diffusion process involves
  • awareness
  • interest
  • evaluation
  • trial
  • adoption
• RDA has a role to play in
  • supporting each stage
  • making the transitions from one stage to the next more likely

Important questions
1. How do we talk about data?
2. How can we describe the data?
3. Can we optimize addressing the data?
4. How can we build trust in our infrastructure?

What
• Base infrastructure
• (Coincidence, also social groups!)
• Let's agree on terms. (DFT)
• Descriptions for interoperability. (DTR)
• Scaling across PID systems. (PIT)
• Building policies into the infrastructure. (PP)

These groups
• Amplify each other
• Use each other's outputs
• Have to interlock properly
• Will continue the effort after they finish.

Data Foundation and Terminology 
Chairs: Gary Berg-Cross, Raphael Ritz, Peter Wittenburg

Task
Bob Kahn: You need to know what you are talking about.
DFT mission: understand what the core of the data domain is, develop definitions of core terms based on data models.
DFT is part of coming to an agreed culture in RDA.
Scope: we only speak about the domain of registered data,
• knowing that there is a lot of non-registered data
• knowing that some disciplines are still far from what we are discussing as a necessity

DFT WG Activities & Accomplishments
• Drafted 4 related Model Documents on core work:
  1. Data Models 1: Overview – 22+ models
  2. Data Models 2: Analysis & Synthesis
  3. Data Models 3: Term Snapshot
  4. Data Models 4: Use Cases (work with other RDA WGs on use cases to illustrate data concepts)
• Developed a Semantic MediaWiki Term Tool to capture an initial list of terms and definitions for discussion; demo held at P3 (open for others and "persistent")
• The candidate list evolved into a consolidated list

Our Core Terms in Simple Words :-)
• digital object (DO)
• persistent identifier
• PID resolution system
• metadata
• aggregation
• digital collection
• (digital) repository
• bitstream
• state information
Need to put the relations between terms into the documents.
On purpose no formal ontology (yet) and no terminologist's exactness, since we made definitions for data practitioners first.

Definitions & Process
• A digital collection is an aggregation of DOs that is identified by a PID and described by metadata.
  • Note: A digital collection is a (complex) DO.
• Variants
  • A collection is a form of aggregation of elements that has an identity of its own separate from the identity of the elements.
  • Collection is defined as a "group of objects gathered together for some intellectual, artistic curatorial purpose."
  • A digital collection is a type of aggregation formed by a collection process on existing data and data sets where the collected data is in digital form.
  • Collection is a type of aggregation obeying part-role relations and is a digital object since it has a PID to be referable and metadata describing its properties.
  • A Digital Collection is an organized aggregation or other grouping of distinct DOs that are related by some criteria and where the collection is described by metadata. A Digital Collection may also be identified by a unique persistent identifier, in which case the collection may be construed as a DO. (Kahn et al.)
• Conclusion points
  • purpose and process of aggregation/collection building and part relations not relevant for definition
  • remember: only speak about domain of registered DOs.
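
The working definition above can be sketched in a few lines of Python. This is only an illustration, not a DFT deliverable: the class names, field names, and the `hdl:example/...` identifiers are all hypothetical. It shows a collection that aggregates member DOs by PID and, because it has its own PID and metadata, is itself a DO.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A registered DO: a bitstream identified by a PID and described by metadata."""
    pid: str                 # persistent identifier (hypothetical format)
    metadata: dict           # descriptive metadata
    bitstream: bytes = b""   # the content itself


@dataclass
class DigitalCollection(DigitalObject):
    """An aggregation of DOs that has its own PID and metadata, so it is a DO too."""
    members: list = field(default_factory=list)  # PIDs of the member DOs

    def add(self, do: DigitalObject) -> None:
        # Members are referenced by PID, so the collection's identity
        # stays separate from the identity of its elements.
        if do.pid not in self.members:
            self.members.append(do.pid)


scan = DigitalObject(pid="hdl:example/1", metadata={"title": "Scan 1"})
corpus = DigitalCollection(pid="hdl:example/corpus", metadata={"title": "Corpus"})
corpus.add(scan)
```

Note that the sketch stores nothing about why or how the collection was built, in line with the conclusion that purpose and process are not relevant for the definition.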

Interactions with others
• Interacted with RDA WGs and IGs.
• Participated in Munich meeting and Chairs telcos.
• Part of WG forum discussions
• also "active" interactions with about 120 groups

Discipline              RDA/EU & EUDAT Interviews   Interactions   Total
Humanities & Soc Sci    8                           13             21
Environmental           7                           2              9
Life Sciences           10                          7              17
Natural Sciences        11                          13             24
Engineering & CS        -                           14             14
Various disciplines     -                           24             24
others                  4                           3              7
Total                   40                          74             114

Adoption
• What does adoption mean in case of a set of terms?
  • it's about the interaction process itself within and outside of RDA
  • it's about influencing conceptualization and thus harmonizing "language"
  • it's about changing cultures
• we have done a lot – many departments & communities
• why so relevant:
  • report from 120 interactions tells us that data practices are a nightmare (report is available)
  • data organizations are so different that data federation including "logical information" is too expensive
  • current data science is not reproducible

Objectives until/for P4
1. Go out and intensify interaction based on Snapshot
  • create condensed statements for different groups (2-page flyer)
  • interact with other groups in RDA and early adopters
  • interact with the many communities (outside RDA) we already contacted (in Europe ESFRI RI projects: 17th October, Brussels)
  • encourage people to use the term wiki
2. Come to new consolidated agreements
  • consolidated definitions by P5
  • present the consolidated definitions and tend core term set
  • identify some people from communities to give adoption talks (no PR!)
3. Finish some unsolved issues
  • synthesis: a generic model flexible enough to capture terms and their relationships
  • add more use cases
  • see how to continue maintenance
Thanks for your attention.
Data Type Registries WG 
Outcomes

Problem: Implicit Assumptions in Data
• Data sharing requires that data can be parsed, understood, and reused by people and applications other than those that created the data
• How do we do this now?
  • For documents – formats are enough, e.g., PDF, and then the document explains itself to humans
  • This doesn't work well with data – numbers are not self-explanatory
  • What does the number 7 mean in cell B27?
• Data producers may not have explicitly specified certain details in the data: measurement units, coordinate systems, variable names, etc.
• Need a way to precisely characterize those assumptions such that they can be identified by humans and machines that were not closely involved in its creation
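
To make the "number 7 in cell B27" problem concrete, here is a minimal sketch of how a type record could carry the assumptions a bare value leaves out. It does not use the actual Type Registry prototype or its API: the `TypeRecord` fields, the in-memory `REGISTRY` dictionary, and the type identifier are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeRecord:
    """Hypothetical type record that states assumptions explicitly."""
    type_id: str      # identifier under which the type is registered
    name: str         # human-readable variable name
    unit: str         # measurement unit
    description: str  # free-text explanation for humans


# Stand-in for a registry lookup; a real registry would be a network service.
REGISTRY = {
    "example/sea-surface-temp": TypeRecord(
        type_id="example/sea-surface-temp",
        name="sea surface temperature",
        unit="degC",
        description="Temperature of the ocean surface layer.",
    ),
}


def describe(value, type_id, registry=REGISTRY):
    """Turn a bare value into a self-explanatory statement via its type record."""
    record = registry.get(type_id)
    if record is None:
        return f"{value} (unknown type {type_id!r})"
    return f"{value} {record.unit} ({record.name})"


print(describe(7, "example/sea-surface-temp"))
```

With the type identifier attached to the cell, both a person and a program can resolve the unit and variable name without having been involved in creating the data.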

Goals: Explicate and Share Assumptions using Types and Type Registries
• Evaluate and identify a few assumptions in data that can be codified and shared in order to…
• Produce a functioning Registry system that can easily be evaluated by organizations before adoption
  • Highly configurable for changing scope of captured and shared assumptions depending on the domain or organization
  • Supports several Type record dissemination variations
  • Designed to allow federation between multiple Registry instances
• The group's emphasis is not on
  • Identifying every possible assumption and data characteristic applicable for all domains
  • Technology

Results
• Produced a community consensus system – in this case the consensus was between the group members
  • Input from people with different backgrounds, including technologists, scientists, policy analysts, etc., was considered
• Released a functioning prototype that can be adapted (with no s/w changes) for domain-specific use
  • Not a turnkey solution
  • Adapt – Evaluate – Adopt cycle is expected at each organization or community
• Federation between different instances is technically possible
  • Organizational policies were not discussed due to the lack of time
• CNRI, a member of the group, has designed and implemented a prototype, the latest of which is at: http://typeregistry.org
• With the help of an RDA-provided scholar, we seeded the Registry with Types that pertain to the geosciences community

Points to Keep in Mind
• Data Type Registry is neither a turnkey system nor an immediate ROI application
• Every organization should nominate a domain expert for defining the scope of Type records and for seeding their Registry instance
• Cross-domain interpretation beyond some basic computability needs social processes in place
• Data systems such as Type Registries are low-level infrastructure systems with wide applicability
• Network effect plays a significant role in the success of any infrastructure

Adoption and Impact
• We expect multiple groups to put significant efforts into exercising the prototype:
  • the EUDAT project in Europe,
  • National Institute of Standards and Technology (NIST) in the US,
  • the International DOI Foundation
• (Wo Chang, Digital Data Manager at NIST, shares his evaluation plans)

Conclusion – For Now
• Adoption plans will continue
• The group, or some part of it, will continue to work, we hope with RDA's blessing and maybe support. We will have more to say at P5
• Future-proofing data is hard work, but is essential for long-term data-driven science

WG PID Information Types 
Outcomes

Problem & Goal
• PIDs are associated with additional information and this information needs to be typed
• Harmonization across disciplines and PID providers
• What are PID Information Types?
• Specify a framework for defining types
• Agree on some essential types
• Provide technical solutions for interaction with PID types
• Provide the tools first, then create types individually

Results
Insights gained:
• Types depend on use cases and semantics differ between disciplines
• There is no single set of types fitting all cases
• Community processes must define types from practical adoption
Final deliverables available:
• Type examples and illustrating use cases
• Types registered in the Type Registry prototype
• API description and prototypic implementation
• Client demonstrator GUI

Registered types enable cross-services
[Figure: PID records with typed Format, Checksum and Size fields, used by a verification service]
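
As a rough sketch of the cross-service idea, the following shows a verification service that only needs the typed Format, Checksum and Size fields of a PID record to check an object, whatever repository it came from. This is not the WG's API: the record layout, the function names, and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib


def make_pid_record(data: bytes, fmt: str) -> dict:
    """Build a hypothetical PID record carrying three typed fields."""
    return {
        "format": fmt,
        "size": len(data),
        "checksum": hashlib.sha256(data).hexdigest(),  # algorithm is an assumption
    }


def verify(data: bytes, record: dict) -> list:
    """Return a list of problems found; an empty list means the object verifies."""
    problems = []
    if len(data) != record["size"]:
        problems.append("size mismatch")
    if hashlib.sha256(data).hexdigest() != record["checksum"]:
        problems.append("checksum mismatch")
    return problems


record = make_pid_record(b"temperature,7\n", "text/csv")
print(verify(b"temperature,7\n", record))  # intact copy
print(verify(b"temperature,8\n", record))  # altered copy of the same length
```

Because the service depends only on agreed field types, it can work across PID providers without knowing anything about the object's contents.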

Adoption & Impact
• Register your types so they can be adopted and reused, making it easier for others to use your data
  • Information on how to register new types available in the report
• Adopt types already being used in your domain to increase interoperability
• Decouple object management from contents
• Simplify client access to data across domains, implementations and changes in information models
• More lightweight access to information on less accessible objects

Possible follow-ups
• Adoption of these capabilities by PID infrastructure providers
• Discipline-specific types, preferably from practical adoption
• Establish a type ecosystem
• Refine data model
• Enhance REST API

Conclusions
• Draft final report available via the website
• Demonstrator web GUI: http://smw-rda.esc.rzg.mpg.de/PitApiGui/

Practical Policies 
Outcomes
WG Practical Policies

Scenario
• Create research data repository
• Data: 2 TB, 500,000 files + growing
  + integrity
  + access (IG FIM)
  + publish (publication + PID)
  + …
• Some assertions: policies & rules attached to the data
Policy: Assertion or assurance that is enforced about a collection or a dataset

Problem
Computer actionable policies
• Enforce management,
• Automate administrative tasks,
• Validate assessment criteria,
• Automate scientific analyses
• etc.
A generic set of policies that can be revised and adapted by user communities and site managers does not exist.
• Domain scientists who want to build up a collection or a repository
• Data centers for automating policies

Goals
• To bring together practitioners in policy making and policy implementation (nearly all RDA WG/IGs)
• To identify typical application scenarios for policies such as replication, preservation etc.
• To collect and to register practical policies
• To enable sharing, revising, adapting, and re-using of computer actionable policies
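
To illustrate what "computer actionable" can mean, here is a minimal sketch of a policy as an assertion about a collection plus a check that enforces it. It is not taken from the WG's templates or its iRODS and GPFS implementations: the `Policy` class, the `min_replicas` rule, and the dictionary layout of the collection are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Policy:
    """A policy: a human-readable assertion and a machine-checkable rule."""
    name: str
    assertion: str
    check: Callable[[dict], bool]  # returns True when an object complies


def min_replicas(n: int) -> Policy:
    """Hypothetical integrity policy: every object has at least n replicas."""
    return Policy(
        name="integrity/replication",
        assertion=f"every object has at least {n} replicas",
        check=lambda obj: len(obj.get("replicas", [])) >= n,
    )


def enforce(policy: Policy, collection: list) -> list:
    """Return the ids of objects in the collection that violate the policy."""
    return [obj["id"] for obj in collection if not policy.check(obj)]


collection = [
    {"id": "a", "replicas": ["site1", "site2"]},
    {"id": "b", "replicas": ["site1"]},
]
print(enforce(min_replicas(2), collection))
```

Keeping the assertion text next to the rule is one way to make a policy shareable: another site can read what is asserted and swap in a check that suits its own data management system.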

Survey of 30 Institutions for Highest Priority Policies
In close cooperation with the Engagement Group

Policy                   Importance
Integrity                217
Preservation             150
Access control           126
Provenance               108
Data management plans    99
Publication              75
Replication              66
Data staging             52
Federation               37
Metadata sharing         23
Regulatory               16
Collection properties    7
Identifiers              7
Data sharing             7
Versioning               7
Licensing                6
Format                   6
Data life cycle          6
Arrangement              5
Processing               5

[Figure: collection-based policies, identification of 11 important policy areas]

Results
Identification of 11 important policy areas:
• Contextual metadata extraction
• Data access control
• Data backup
• Data format control
• Data retention
• Disposition
• Integrity (including replication)
• Notification
• Restricted searching
• Storage cost reports
• Use agreements

Results
Templates
https://www.rd-alliance.org/filedepot?cid=104&fid=556
• Interactions of policies and DO attributes
• Policy descriptions
• Technology independent
• Reviews of the provided policy areas in progress

Results
https://www.rd-alliance.org/filedepot?cid=104&fid=553
• Examples for implementations:
  • English language descriptions
  • iRODS
  • GPFS
• ~50 pages

Impact
Result: List of policy categories and policies
• Improved data center administration
• By sharing policies, communities can interoperate and share data more effectively
• Transparency: basis of establishing trust
• Implemented policies: can be used as examples and be adapted to specific requirements and other data management systems

Adoption
Target Communities:
• Groups managing data collections
• Data centers
First adopters are the institutions/organizations who contributed to the results, e.g. RENCI, KIT, OSC, DARIAH, RZG, etc.:
• EUDAT
• CESNET
• (DataNet Federation Consortium, WDS?)

Conclusions
• "Outcomes Policy Templates: Practical Policy Working Group, September 2014" https://www.rd-alliance.org/filedepot?cid=104&fid=556
• "Implementations: Practical Policy Working Group, September 2014" https://www.rd-alliance.org/filedepot?cid=104&fid=553
• Work in Progress: Reviews

Conclusions: Next Steps
• More interaction with other technical groups
  → Data Fabric
  → Publication policies
• More interaction with domain-specific groups
  → Adopters
For information please contact
• Reagan Moore rwmoore@renci.org and
• Rainer Stotzka rainer.stotzka@kit.edu

WG Practical Policies
Breakout Session: Tuesday September 23, 14:00 – 15:30
Agenda:
1. Introduction
2. Presentation of deliverables
3. David Antos & Petr Benedikt: "Policy implementations on GPFS"
4. Discussions:
  • Policy reviews
  • Adding new policies
  • Interoperability with other WG/IGs
  • Adoption

P5 and Adoption Day
• More groups will be presenting at P5
• Starting to see how different WG outputs can fit together
  • Ex: Data Fabric
• Planning to have a major focus at P5 on adoption of WG outputs
• Also thinking through how best to accelerate adoption and support groups that want to integrate RDA outputs

How you can help!
• Get involved in WGs, IGs to ensure outputs meet your needs and the needs of your organisation
• Encourage your organisation to become aware of RDA outputs and evaluate or trial them
• Look for places where RDA can make a difference

More Related Content

Similar to RDA Work Groups Outputs and Adoption - Early WG Report back session

Keynote: Mark Parsons - Plans are Useless, But Planning is Essential
Keynote: Mark Parsons - Plans are Useless, But Planning is EssentialKeynote: Mark Parsons - Plans are Useless, But Planning is Essential
Keynote: Mark Parsons - Plans are Useless, But Planning is EssentialCASRAI
Ā 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance
Ā 
Towards Generating Policy-compliant Datasets (poster)
Towards GeneratingPolicy-compliant Datasets (poster)Towards GeneratingPolicy-compliant Datasets (poster)
Towards Generating Policy-compliant Datasets (poster)Christophe Debruyne
Ā 
Making DMPs actionable and public
Making DMPs actionable and publicMaking DMPs actionable and public
Making DMPs actionable and publicStephanie Simms
Ā 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance
Ā 
Towards Generating Policy-compliant Datasets
Towards Generating Policy-compliant DatasetsTowards Generating Policy-compliant Datasets
Towards Generating Policy-compliant DatasetsChristophe Debruyne
Ā 
Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...amiraryani
Ā 
Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Research Data Alliance
Ā 
Sentara Linked Data Workshop - Sept 10, 2012
Sentara Linked Data Workshop - Sept 10, 2012Sentara Linked Data Workshop - Sept 10, 2012
Sentara Linked Data Workshop - Sept 10, 20123 Round Stones
Ā 
The Research Data Alliance--Creating the culture and technology for an intern...
The Research Data Alliance--Creating the culture and technology for an intern...The Research Data Alliance--Creating the culture and technology for an intern...
The Research Data Alliance--Creating the culture and technology for an intern...Research Data Alliance
Ā 
Open Data is not Enough (final version)
Open Data is not Enough (final version)Open Data is not Enough (final version)
Open Data is not Enough (final version)Research Data Alliance
Ā 
Writing a successful data management plan with the DMPTool
Writing a successful data management plan with the DMPToolWriting a successful data management plan with the DMPTool
Writing a successful data management plan with the DMPToolkfear
Ā 
Interdisciplinary Processes at the Digital Repository of Ireland
Interdisciplinary Processes at the Digital Repository of IrelandInterdisciplinary Processes at the Digital Repository of Ireland
Interdisciplinary Processes at the Digital Repository of Irelanddri_ireland
Ā 
Supporting team coordination of software development across multiple companies
Supporting team coordination of software development across multiple companiesSupporting team coordination of software development across multiple companies
Supporting team coordination of software development across multiple companiesAnh Nguyen Duc
Ā 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM
Ā 
1. Overview_of_data_analytics (1).pdf
1. Overview_of_data_analytics (1).pdf1. Overview_of_data_analytics (1).pdf
1. Overview_of_data_analytics (1).pdfAyele40
Ā 

Similar to RDA Work Groups Outputs and Adoption - Early WG Report back session (20)

Keynote: Mark Parsons - Plans are Useless, But Planning is Essential
Keynote: Mark Parsons - Plans are Useless, But Planning is EssentialKeynote: Mark Parsons - Plans are Useless, But Planning is Essential
Keynote: Mark Parsons - Plans are Useless, But Planning is Essential
Ā 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016
Ā 
Towards Generating Policy-compliant Datasets (poster)
Towards GeneratingPolicy-compliant Datasets (poster)Towards GeneratingPolicy-compliant Datasets (poster)
Towards Generating Policy-compliant Datasets (poster)
Ā 
Making DMPs actionable and public
Making DMPs actionable and publicMaking DMPs actionable and public
Making DMPs actionable and public
Ā 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015
Ā 
Open Data is not Enough
Open Data is not EnoughOpen Data is not Enough
Open Data is not Enough
Ā 
Towards Generating Policy-compliant Datasets
Towards Generating Policy-compliant DatasetsTowards Generating Policy-compliant Datasets
Towards Generating Policy-compliant Datasets
Ā 
PlanetData Project Overview
PlanetData Project OverviewPlanetData Project Overview
PlanetData Project Overview
Ā 
Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...Data Description Registry Interoperability WG at Research Data Alliance Third...
Data Description Registry Interoperability WG at Research Data Alliance Third...
Ā 
Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016
Ā 
Sentara Linked Data Workshop - Sept 10, 2012
Sentara Linked Data Workshop - Sept 10, 2012Sentara Linked Data Workshop - Sept 10, 2012
Sentara Linked Data Workshop - Sept 10, 2012
Ā 
The Research Data Alliance--Creating the culture and technology for an intern...
The Research Data Alliance--Creating the culture and technology for an intern...The Research Data Alliance--Creating the culture and technology for an intern...
The Research Data Alliance--Creating the culture and technology for an intern...
Ā 
Open Data is not Enough (final version)
Open Data is not Enough (final version)Open Data is not Enough (final version)
Open Data is not Enough (final version)
Ā 
Writing a successful data management plan with the DMPTool
Writing a successful data management plan with the DMPToolWriting a successful data management plan with the DMPTool
Writing a successful data management plan with the DMPTool
Ā 
Rda in a_nutshell_january_2017
Rda in a_nutshell_january_2017Rda in a_nutshell_january_2017
Rda in a_nutshell_january_2017
Ā 
Interdisciplinary Processes at the Digital Repository of Ireland
Interdisciplinary Processes at the Digital Repository of IrelandInterdisciplinary Processes at the Digital Repository of Ireland
Interdisciplinary Processes at the Digital Repository of Ireland
Ā 
Supporting team coordination of software development across multiple companies
Supporting team coordination of software development across multiple companiesSupporting team coordination of software development across multiple companies
Supporting team coordination of software development across multiple companies
Ā 
Rda in a_nutshell_march_2017
Rda in a_nutshell_march_2017Rda in a_nutshell_march_2017
Rda in a_nutshell_march_2017
Ā 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech Proposals
Ā 
1. Overview_of_data_analytics (1).pdf
1. Overview_of_data_analytics (1).pdf1. Overview_of_data_analytics (1).pdf
1. Overview_of_data_analytics (1).pdf
Ā 

More from Research Data Alliance

RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020Research Data Alliance
Ā 
RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020Research Data Alliance
Ā 
RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020Research Data Alliance
Ā 
Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Research Data Alliance
Ā 
Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Research Data Alliance
Ā 
RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019Research Data Alliance
Ā 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsResearch Data Alliance
Ā 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsResearch Data Alliance
Ā 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersResearch Data Alliance
Ā 
Rda in a nutshell september 2019
Rda in a nutshell september 2019Rda in a nutshell september 2019
Rda in a nutshell september 2019Research Data Alliance
Ā 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchResearch Data Alliance
Ā 
The Value of the RDA for Funders
The Value of the RDA for FundersThe Value of the RDA for Funders
The Value of the RDA for FundersResearch Data Alliance
Ā 

More from Research Data Alliance (20)

RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020
Ā 
RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020
Ā 
RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020
Ā 
RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020
Ā 
RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020
Ā 
RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020
Ā 
RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020
Ā 
RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020
Ā 
RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020
Ā 
Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019
Ā 
Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019
Ā 
RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019
Ā 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
Ā 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
Ā 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure Providers
Ā 
Rda in a nutshell september 2019
Rda in a nutshell september 2019Rda in a nutshell september 2019
Rda in a nutshell september 2019
Ā 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing Research
Ā 
RDA Value for Libraries
RDA Value for LibrariesRDA Value for Libraries
RDA Value for Libraries
Ā 
The Value of the RDA for Funders
The Value of the RDA for FundersThe Value of the RDA for Funders
The Value of the RDA for Funders
Ā 
Rda in a nutshell august 2019
Rda in a nutshell august 2019Rda in a nutshell august 2019
Rda in a nutshell august 2019
Ā 

RDA Work Groups Outputs and Adoption - Early WG Report back session

  • 1. RDA and Adoption Early WG Report back session September 23, 2014
  • 2. Happy Birthday! http://cdn.cakecentral.com/d/d3/900x900px-LL-d3548099_gallery6680631282672149.jpeg
  • 3. What did we learn?
      • Motivated groups of people can do a lot
      • But we are relying too much on volunteer labour contributed on top of over-full lives
      • Looks like the RDA-challenge goal of 12-18 months is achievable
      • But IGs also provide valuable space for longer-term interaction
      • We need to reduce friction in our processes
      • But the organisation is maturing rapidly
  • 4. RDA and Outputs
      • RDA will only deliver on its promise if it produces deliverables, and those deliverables become adopted outside the groups that created them
      • Consequential TAB foci:
          • proposals for new groups – adoption plans?
          • tracking groups underway – fit for purpose?
          • monitoring of adoption once groups conclude – actually adopted?
      • So, how can we most usefully think through the process of adoption?
  • 5. Diffusion and RDA
      • Adoption can be seen as the end result of a diffusion process. This diffusion process involves
          • awareness
          • interest
          • evaluation
          • trial
          • adoption
      • RDA has a role to play in
          • supporting each stage
          • making the transitions from one stage to the next more likely
  • 6. Important questions
      1. How do we talk about data?
      2. How can we describe the data?
      3. Can we optimize addressing the data?
      4. How can we get trust in our infrastructure?
  • 7. What
      • Base infrastructure
      • (Coincidence, also social groups!)
      • Let's agree on Terms. (DFT)
      • Descriptions for Interoperability. (DTR)
      • Scaling across PID systems. (PIT)
      • Building policies into the infrastructure. (PP)
  • 8. These groups
      • Amplify each other
      • Use each other's outputs
      • Have to interlock properly
      • Will continue the effort after they finish.
  • 9. Data Foundation and Terminology Chairs: Gary Berg-Cross, Raphael Ritz, Peter Wittenburg
  • 10. Task
      • Bob Kahn: You need to know where you are talking about.
      • DFT mission: understand what the core of the data domain is, develop definitions of core terms based on data models.
      • DFT is part of coming to an agreed culture in RDA.
      • Scope: AND only speak about domain of registered data.
          • knowing that there is a lot of non-registered data
          • knowing that some disciplines are further away from what we are discussing as necessity
  • 11. DFT WG Activities & Accomplishments
      • Drafted 4 related Model Documents on core work:
          1. Data Models 1: Overview – 22+ models
          2. Data Models 2: Analysis & Synthesis
          3. Data Models 3: Term Snapshot
          4. Data Models 4: Use Cases (Work with other RDA WGs on use cases to illustrate data concepts)
      • Developed Semantic Media Wiki Term Tool to capture initial list of terms and definitions for discussions, demo held at P3 (open for others and “persistent”)
      • Candidate List Evolved to Consolidated List
  • 12. Our Core Terms in simple Words :-)
      • digital object (DO)
      • persistent identifier
      • PID resolution system
      • metadata
      • aggregation
      • digital collection
      • (digital) repository
      • bitstream
      • state information
      Need to put relation between terms into the documents.
      On purpose no formal ontology (yet) and no terminologist’s exactness since we made definitions for data practitioners first.
  • 13. Definitions & Process
      • A digital collection is an aggregation of DOs that is identified by a PID and described by metadata.
          • Note: A digital collection is a (complex) DO.
      • Variants
          • A collection is a form of aggregation of elements that has an identity of its own separate from the identity of the elements.
          • Collection is defined as a “group of objects gathered together for some intellectual, artistic curatorial purpose.”
          • A digital collection is a type of aggregation formed by a collection process on existing data and data sets where the collected data is in digital form.
          • Collection is a type of aggregation obeying part-role relations and is a digital object since it has a PID to be referable and metadata describing its properties.
          • A Digital Collection is an organized aggregation or other grouping of distinct DOs that are related by some criteria and where the collection is described by metadata. A Digital Collection may also be identified by a unique persistent identifier, in which case the collection may be construed as a DO. (Kahn et al.)
      • Conclusion points
          • purpose and process of aggregation/collection building and part relations not relevant for definition
          • remember: only speak about domain of registered DOs.
  • 14. Interactions with others
      • Interacted with RDA WGs and IGs.
      • Participated in Munich meeting and Chairs telcos.
      • Part of WG forum discussions
      • also “active” interactions with about 120 groups

      Discipline             RDA/EU & EUDAT Interviews   Interactions   Total
      Humanities & Soc Sci   8                           13             21
      Environmental          7                           2              9
      Life Sciences          10                          7              17
      Natural Sciences       11                          13             24
      Engineering & CS       -                           14             14
      Various disciplines    -                           24             24
      others                 4                           3              7
      Total                  40                          74             114
  • 15. Adoption
      • What does adoption mean in case of a set of terms?
          • it’s about the interaction process itself within and outside of RDA
          • it’s about influencing conceptualization and thus harmonizing “language”
          • it’s about changing cultures
          • we have done a lot – many departments & communities
      • why so relevant:
          • report from 120 interactions tells us that data practices are a nightmare (report is available)
          • data organizations are so different that data federation including “logical information” is too expensive
          • current data science is not reproducible
  • 16. Objectives until/for P4
      1. Go out and intensify interaction based on Snapshot
          • create condensed statements for different groups (2-page flyer)
          • interact with other groups in RDA and early adopters
          • interact with the many communities (outside RDA) we already contacted (in Europe ESFRI RI projects: 17th October, Brussels)
          • encourage people using the term wiki
      2. Come to new consolidated agreements
          • consolidated definitions until P5
          • present the consolidated definitions and tend core term set
          • identify some people from communities that have adoption talks (no PR!)
      3. Finish some unsolved issues
          • synthesis: generic flexible enough model to capture terms and their relationships
          • add more use cases
          • see how to continue maintenance
  • 17. Thanks for your attention.
  • 18. Data Type Registries WG Outcomes
  • 19. Problem: Implicit Assumptions in Data
      • Data sharing requires that data can be parsed, understood, and reused by people and applications other than those that created the data
      • How do we do this now?
          • For documents – formats are enough, e.g., PDF, and then the document explains itself to humans
          • This doesn’t work well with data – numbers are not self-explanatory
          • What does the number 7 mean in cell B27?
      • Data producers may not have explicitly specified certain details in the data: measurement units, coordinate systems, variable names, etc.
      • Need a way to precisely characterize those assumptions such that they can be identified by humans and machines that were not closely involved in its creation
  • 20. Goals: Explicate and Share Assumptions using Types and Type Registries
      • Evaluate and identify a few assumptions in data that can be codified and shared in order to…
      • Produce a functioning Registry system that can easily be evaluated by organizations before adoption
          • Highly configurable for changing scope of captured and shared assumptions depending on the domain or organization
          • Supports several Type record dissemination variations
          • Design for allowing federation between multiple Registry instances
      • The group’s emphasis is not on
          • Identifying every possible assumption and data characteristic applicable for all domains
          • Technology
  • 21. Results
      • Produced a community consensus system – in this case the consensus was between the group members
          • Input from folks from different backgrounds including technologists, scientists, policy analysts, etc., is considered
      • Released a functioning prototype that can be adapted (with no s/w changes) for domain-specific use
          • Not a turnkey solution
          • Adapt – Evaluate – Adopt cycle is expected at each organization or community
      • Federation between different instances is technically possible
          • Organizational policies were not discussed due to the lack of time
      • CNRI, a member of the group, has designed and implemented a prototype, the latest of which is at: http://typeregistry.org
          • With the help of RDA provided scholar, we seeded the Registry with Types that pertain to geosciences community
  • 22. Points to Keep in Mind
      • Data Type Registry is neither a turnkey system nor an immediate ROI application
      • Every organization should nominate a domain expert for defining the scope of Type records and for seeding their Registry instance
      • Cross-domain interpretation beyond some basic computability needs social processes in place
      • Data systems such as Type Registries are low-level infrastructure systems with wide applicability
      • Network effect plays a significant role in the success of any infrastructure
  • 23. Adoption and Impact
      • We expect multiple groups to put significant efforts into exercising the prototype:
          • the EUDAT project in Europe,
          • National Institute of Standards and Technology (NIST) in the US,
          • the International DOI Foundation
      • (Wo Chang, Digital Data Manager at NIST, shares his evaluation plans)
  • 24. Conclusion – For Now
      • Adoption plans will continue
      • The group, or some part of it, will continue to work, we hope with RDA’s blessing and maybe support. We will have more to say at P5
      • Future-proofing data is hard work, but is essential for long-term data-driven science
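The Data Type Registry idea from slides 19 to 22 (making implicit assumptions such as measurement units explicit) can be illustrated with a minimal in-memory sketch. Everything here is an assumption for illustration: the `TypeRecord` fields, the `TypeRegistry` class, and the identifier `example/sea-surface-temperature` are hypothetical and do not reflect the record schema or API of the prototype at typeregistry.org.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeRecord:
    """Hypothetical type record holding assumptions a producer might leave implicit."""
    type_id: str      # placeholder identifier, not a real PID
    name: str
    unit: str
    description: str


class TypeRegistry:
    """Toy registry: maps type identifiers to type records."""

    def __init__(self) -> None:
        self._records: dict[str, TypeRecord] = {}

    def register(self, record: TypeRecord) -> None:
        # Refuse silent redefinition so a type identifier keeps one meaning.
        if record.type_id in self._records:
            raise ValueError(f"type already registered: {record.type_id}")
        self._records[record.type_id] = record

    def resolve(self, type_id: str) -> TypeRecord:
        return self._records[type_id]


def describe(value: float, type_id: str, registry: TypeRegistry) -> str:
    """Answer 'what does this number mean?' by looking up its registered type."""
    record = registry.resolve(type_id)
    return f"{value} {record.unit} ({record.name})"


registry = TypeRegistry()
registry.register(
    TypeRecord(
        type_id="example/sea-surface-temperature",
        name="sea surface temperature",
        unit="degC",
        description="Illustrative record only.",
    )
)

# The bare value 7 from "cell B27" becomes interpretable once it carries a type.
print(describe(7, "example/sea-surface-temperature", registry))
```

In this sketch the data file would only need to carry the type identifier alongside the value; the unit and meaning live in the registry, which is the separation the slides argue for.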
  • 25. WG PID Information Types Outcomes
  • 26. Problem & Goal
      • PIDs are associated with additional information and this information needs to be typed
          • Harmonization across disciplines and PID providers
      • What are PID Information Types?
          • Specify a framework for defining types
          • Agree on some essential types
          • Provide technical solutions for interaction with PID types
      • Provide the tools first, then create types individually
  • 27. Results
      Insights gained:
      • Types depend on use cases and semantics differ between disciplines
      • There is no single set of types fitting all cases
      • Community processes must define types from practical adoption
      Final deliverables available:
      • Type examples and illustrating use cases
      • Types registered in the Type Registry prototype
      • API description and prototypic implementation
      • Client demonstrator GUI
  • 28. Registered types enable cross-services
      [Diagram: PID records carrying typed Format, Checksum, and Size fields, read by a verification service]
  • 29. Adoption & Impact
      • Register your types so they can be adopted and reused, making it easier for others to use your data
          • Information on how to register new types available in the report
      • Adopt types already being used in your domain to increase interoperability
      • Decouple object management from contents
          • Simplify client access to data across domains, implementations and changes in information models
          • More lightweight access to information on less accessible objects
  • 30. Possible follow-ups
      • Adoption of these capabilities by PID infrastructure providers
      • Discipline-specific types, preferably from practical adoption
      • Establish a type ecosystem
      • Refine data model
      • Enhance REST API
  • 31. Conclusions
      • Draft final report available via the website
      • Demonstrator web GUI: http://smw-rda.esc.rzg.mpg.de/PitApiGui/
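Slide 28 suggests that once the information attached to a PID is typed (format, checksum, size), a generic verification service can work across repositories without knowing the contents. A minimal sketch of that idea follows. It assumes a PID record is a plain dictionary and that the checksum is stored as `algorithm:hexdigest`; the key names and the functions `make_record` and `verify` are hypothetical and are not taken from the WG's API description.

```python
import hashlib


def make_record(data: bytes, fmt: str) -> dict:
    """Build a hypothetical typed PID record for a bitstream."""
    return {
        "format": fmt,
        "size": len(data),
        "checksum": "sha256:" + hashlib.sha256(data).hexdigest(),
    }


def verify(data: bytes, record: dict) -> bool:
    """Generic verification service: relies only on the typed fields."""
    # Cheap check first: a size mismatch already proves the bytes differ.
    if len(data) != record["size"]:
        return False
    algorithm, _, expected = record["checksum"].partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected


original = b"temperature,7\n"
record = make_record(original, "text/csv")

print(verify(original, record))            # unchanged bytes
print(verify(b"temperature,8\n", record))  # same size, altered content
```

Because `verify` never inspects the payload format, the same service could check objects from any domain, which is the decoupling of object management from contents mentioned on slide 29.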
  • 34. WG Practical Policies: Scenario
      • Create research data repository
      • Data: 2 TB, 500,000 files + growing + integrity + access (IG FIM) + publish (publication+PID) + …
      • Some assertions: policies & rules attached to the data
      • Policy: Assertion or assurance that is enforced about a collection or a dataset
  • 35. WG Practical Policies: Problem
      • Computer actionable policies
          • Enforce management,
          • Automate administrative tasks,
          • Validate assessment criteria,
          • Automate scientific analyses
          • etc.
      • A generic set of policies that can be revised and adapted by user communities and site managers does not exist.
          • Domain scientists who want to build-up a collection or a repository
          • Data centers for automating policies
  • 36. WG Practical Policies: Goals
      • To bring together practitioners in policy making and policy implementation (nearly all RDA WG/IGs)
      • To identify typical application scenarios for policies such as replication, preservation etc.
      • To collect and to register practical policies
      • To enable sharing, revising, adapting, and re-using of computer actionable policies
  • 37. WG Practical Policies: Survey of 30 Institutions for Highest Priority Policies
      In close cooperation with the Engagement Group

      Policy                   Importance
      Integrity                217
      Preservation             150
      Access control           126
      Provenance               108
      Data Management plans    99
      Publication              75
      Replication              66
      Data staging             52
      Federation               37
      Metadata sharing         23
      Regulatory               16
      Collection properties    7
      Identifiers              7
      Data sharing             7
      Versioning               7
      Licensing                6
      Format                   6
      Data Life Cycle          6
      Arrangement              5
      Processing               5
  • 38. Identification of 11 important policy areas:
      [Diagram: Collection-based Policies, surrounded by Contextual Metadata Extraction, Data Retention, Disposition, Integrity, Storage Cost Reports, Restricted Searching, Notification, Data Access Control, Use Agreements, Data Backup, Data Format Control]
  • 39. WG Practical Policies: Results
      Identification of 11 important policy areas:
      • Contextual metadata extraction
      • Data access control
      • Data backup
      • Data format control
      • Data retention
      • Disposition
      • Integrity (including replication)
      • Notification
      • Restricted searching
      • Storage cost reports
      • Use agreements
  • 40. WG Practical Policies: Results
      Templates: https://www.rd-alliance.org/filedepot?cid=104&fid=556
      • Interactions of policies and DO attributes
      • Policy descriptions
      • Technology independent
      • Reviews of the provided policy areas in progress
  • 41. WG Practical Policies: Results
      https://www.rd-alliance.org/filedepot?cid=104&fid=553
      • Examples for implementations:
          • English language descriptions
          • iRODS
          • GPFS
      • ~50 pages
  • 42. WG Practical Policies: Impact
      Result: List of policy categories and policies
      • Improved data center administration
      • By sharing policies, communities can interoperate and share data more effectively
      • Transparency: basis of establishing trust
      • Implemented policies: can be used as examples and be adapted to specific requirements and other data management systems
  • 43. WG Practical Policies: Adoption
      • Target Communities:
          • Groups managing data collections
          • Data centers
      • First adopters are the institutions/organizations who contributed to the results, e.g. RENCI, KIT, OSC, DARIAH, RZG, etc.:
          • EUDAT
          • CESNET
          • (DataNet Federation Consortium, WDS ? )
  • 44. WG Practical Policies: Conclusions
      • “Outcomes Policy Templates: Practical Policy Working Group, September 2014” https://www.rd-alliance.org/filedepot?cid=104&fid=556
      • “Implementations: Practical Policy Working Group, September 2014” https://www.rd-alliance.org/filedepot?cid=104&fid=553
      • Work in Progress: Reviews
  • 45. WG Practical Policies: Conclusions: Next Steps
      • More interaction with other technical groups
          → Data Fabric
          → Publication policies
      • More interaction with domain specific groups
          → Adopters
      For information please contact
      • Reagan Moore rwmoore@renci.org and
      • Rainer Stotzka rainer.stotzka@kit.edu
  • 46. WG Practical Policies Breakout Session: Tuesday September 23, 14:00 – 15:30
      Agenda:
      1. Introduction
      2. Presentation of deliverables
      3. David Antos & Petr Benedikt: “Policy implementations on GPFS”
      4. Discussions:
          • Policy reviews
          • Adding new policies
          • Interoperability with other WG/IGs
          • Adoption
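The Practical Policies slides describe a policy as an assertion that is enforced about a collection, with integrity (including replication) ranked as the highest priority. As a rough illustration of what "computer actionable" could mean, the sketch below checks one such assertion: every object must have a minimum number of replicas whose checksum matches the recorded one. This is a technology-independent toy, not one of the WG's iRODS or GPFS implementations; `Replica`, `check_integrity`, and the `min_replicas` parameter are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical replica of an object held at a named storage location."""
    location: str
    content: bytes


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def check_integrity(collection: dict, min_replicas: int = 2) -> list[str]:
    """Return policy violations for a collection.

    `collection` maps an object name to (expected_checksum, [Replica, ...]).
    """
    violations = []
    for name, (expected, replicas) in sorted(collection.items()):
        valid = 0
        for replica in replicas:
            if sha256_hex(replica.content) == expected:
                valid += 1
            else:
                violations.append(f"{name}: checksum mismatch at {replica.location}")
        # Replication part of the policy: enough intact copies must exist.
        if valid < min_replicas:
            violations.append(
                f"{name}: {valid} valid replica(s), policy requires {min_replicas}"
            )
    return violations


payload = b"observation data"
collection = {
    "obs.csv": (sha256_hex(payload), [Replica("site-a", payload), Replica("site-b", payload)]),
    "log.txt": (sha256_hex(b"log"), [Replica("site-a", b"log"), Replica("site-b", b"l0g")]),
}

for violation in check_integrity(collection):
    print(violation)
```

Returning a list of violations rather than raising on the first one reflects the reporting use case on the slides (validating assessment criteria); an enforcing variant might instead trigger re-replication from a valid copy.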
  • 47. P5 and Adoption Day
      • More groups will be presenting at P5
      • Starting to see how different WG outputs can fit together
          • Ex: Data Fabric
      • Planning to have a major focus at P5 on adoption of WG outputs
      • Also thinking through how best to accelerate adoption and support groups that want to integrate RDA outputs
  • 48. How you can help!
      • Get involved in WGs, IGs to ensure outputs meet your needs and the needs of your organisation
      • Encourage your organisation to become aware of RDA outputs and evaluate or trial them
      • Look for places where RDA can make a difference