Principles for proper Data Management and
Re-Use – an RDA view
Peter Wittenburg
Max Planck Society
2
Clarification
▪ Does RDA have one view? Yes and no.
▪ RDA is basically a bottom-up organization driven by the many "creative" minds who want to change data practices.
▪ RDA now has about 2000 members, so do we have 2000 opinions?
▪ There has been an intensive discussion process since 2012 (ICRI Conference, Copenhagen), and a number of trends and principles have emerged that all or most members seem to agree with.
▪ Still, RDA is a very young initiative and needs much attention and grease.
3
Why is this all relevant?
▪ Naoyuki Tsunematsu (JST):
• data exchange (and thus the need for proper data management) is difficult to convey in Japanese science
• parallel trends observed for Japanese science:
• not so often included in collaborations anymore
• not so often represented in the top papers
• enormous decrease in international ranking
• serious worries about counterproductive encapsulation
• this concern seems to be relevant for all of us
4
Trends I – Volume, Complexity
from simple structures ... towards complex relationships
5
Trends II - Anonymity
direct exchange between known colleagues
Domain of Repositories
6
Trends III – Re-Usage
Domain of trusted Repositories
• Data will be re-used in different contexts
• Data needs to be findable, accessible, combinable and
interpretable for others
7
Data Practices I – Survey
 ~120 Interviews/Interactions
 2 Workshops with Leading Scientists (EU, US)
 too much manual or via ad hoc scripts
 too much in Legacy formats (no PID & MD)
 there are lighthouse projects etc. but ...
 DM and DP not efficient and too expensive
(Biologist for 75% of his time data manager)
 federating data incl. logical information much too expensive
 hardly usage of automated workflows and lack of
reproducibility
9
Data Practices III - Metadata
[Figure: bar chart "Metadata standards". Categories: DIF, DwC, DC, EML, FGDC, Open GIS, ISO, My Lab, none. Bar values as extracted: 12, 21, 26, 95, 95, 96, 97, 266, 676.]
Slide by Bill Michener, DataONE
10
Data Practices II – Data Entropy
▪ lack of proper documentation, schemas, semantics, relations, etc.
▪ directory structures, spreadsheets, etc. are ad hoc creations, and the knowledge about them fades away
▪ etc.
11
Changes needed – EUDAT and others
[Diagram labels: Community Center, Common Data Center]
Many excellent projects are working on changes: ESFRI projects, DataNet projects, e-Infrastructures, national projects.
RDA needs to build on experiences and expertise.
12
RDA widely agreed I – time to change
▪ management of data objects is largely independent of type and discipline
▪ still, every project defines its own strategies, leading to a huge stack of software that will not be maintainable
13
RDA widely agreed II – time to change
[Diagram labels: Value Added Services (Analysis, Citation, Apps, Custom Clients, Plug-Ins); Persistent Identifiers, Persistent Reference (Resolution System, Typing, PID); Data Sources (Local Storage, Cloud, Computed, Data Sets, RDBMS, Files)]
[Diagram: Digital Objects. PID record attributes: point to instances, describe properties. Bit sequence (instance). Metadata attributes: describe properties and context. PID record and metadata point to each other.]
14
RDA Results I: common data model
• PIDs are at the beginning of the trust chain
• have a worldwide, independent and robust PID system (DONA Handles; DOIs are Handles)!
• metadata are essential in an anonymous data world
Taken from RDA WG Data Foundation & Terminology
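The data model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names and the PID strings are hypothetical and are not taken from the RDA working group output. It shows a PID record that points to the stored instances and to the metadata, and a metadata record that points back to the object.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Metadata:
    """Describes properties and context of a digital object."""
    pid: str        # PID of the metadata record itself
    describes: str  # PID of the object this record describes
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class PidRecord:
    """Attributes stored with a PID: points to instances and to metadata."""
    pid: str
    locations: List[str]  # where instances of the bit sequence are stored
    metadata_pid: str     # points to the metadata record
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class DigitalObject:
    record: PidRecord
    metadata: Metadata

    def is_consistent(self) -> bool:
        """PID record and metadata must point to each other."""
        return (self.record.metadata_pid == self.metadata.pid
                and self.metadata.describes == self.record.pid)


# Hypothetical identifiers and locations, for illustration only.
obj = DigitalObject(
    record=PidRecord(
        pid="example/obj-1",
        locations=["file:///archive/obj-1.dat"],
        metadata_pid="example/md-1",
    ),
    metadata=Metadata(
        pid="example/md-1",
        describes="example/obj-1",
        attributes={"creator": "unknown", "discipline": "linguistics"},
    ),
)
consistent = obj.is_consistent()
```

The mutual check in `is_consistent` is one way to read "point to each other": a consumer who only has the PID can reach the metadata, and one who only has the metadata can reach the object.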
15
RDA Results II: Data Type Registry
▪ result: a registry for data types (DTR)
▪ you get an unknown file, pass it to the DTR, and its content is visualized
▪ an extended MIME type concept
▪ no free lunch: someone needs to register and define each type
▪ code available from the beginning of 2015
▪ PIT demo already working with the DTR
[Diagram: Federated Set of Type Registries. Labels: Data Set, Dissemination, Terms, Rights, Agree, Visualization, Data Processing, Processing, Interpretation, Domain of Services, Human or Machine Consumers; steps numbered 1 to 4.]
• NIST is already working with communities on far-reaching ideas
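The registry idea can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class names, the type identifier `example/csv-table` and the visualizer are hypothetical and do not reflect the actual DTR code or API. The sketch shows a consumer handing an unknown type identifier to a federation of registries, which returns a way to visualize the data.

```python
from typing import Callable, Dict, List, Optional

Visualizer = Callable[[bytes], str]


class TypeRegistry:
    """One registry: maps a type identifier to a definition and a visualizer."""

    def __init__(self) -> None:
        self._types: Dict[str, dict] = {}

    def register(self, type_id: str, description: str, visualize: Visualizer) -> None:
        # No free lunch: someone has to register and define the type.
        self._types[type_id] = {"description": description, "visualize": visualize}

    def lookup(self, type_id: str) -> Optional[dict]:
        return self._types.get(type_id)


class FederatedRegistries:
    """Asks each member registry in turn until one knows the type."""

    def __init__(self, members: List[TypeRegistry]) -> None:
        self._members = members

    def visualize(self, type_id: str, data: bytes) -> str:
        for registry in self._members:
            entry = registry.lookup(type_id)
            if entry is not None:
                return entry["visualize"](data)
        raise KeyError(f"type {type_id!r} is not registered anywhere")


registry_a = TypeRegistry()
registry_b = TypeRegistry()
registry_b.register(
    "example/csv-table",  # hypothetical type identifier
    "comma-separated table with a header row",
    lambda data: f"{len(data.decode().splitlines()) - 1} data rows",
)

federation = FederatedRegistries([registry_a, registry_b])
summary = federation.visualize("example/csv-table", b"x,y\n1,2\n3,4\n")
```

An unregistered type raises `KeyError`, which mirrors the "no free lunch" point: nothing can be interpreted until someone has defined the type.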
16
RDA Results III: PID Information Types
▪ result: a generic API and a set of basic attributes
▪ a PID record is like a passport (number, photo, expiry date, etc.)
▪ if all PID service providers agree on one API and speak the same language (registered terms), software development will become easy
▪ a test installation is in operation together with the DTR
Basic attributes:
LOC: location, path
CKSM: checksum
CKSM_T: checksum type
RoR: owning repository
MD: path to MD
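A minimal sketch of how a client might use these attributes, assuming the record is available as a plain dictionary keyed by the attribute names on the slide. The helper names `make_record` and `verify`, the URLs and the repository name are hypothetical, and this is not the API produced by the working group.

```python
import hashlib


def make_record(data: bytes, loc: str, ror: str, md: str) -> dict:
    """Build a PID record holding the basic attributes for one bit sequence."""
    return {
        "LOC": loc,                                # location, path
        "CKSM": hashlib.sha256(data).hexdigest(),  # checksum
        "CKSM_T": "sha256",                        # checksum type
        "RoR": ror,                                # owning repository
        "MD": md,                                  # path to metadata
    }


def verify(record: dict, data: bytes) -> bool:
    """Recompute the checksum with the recorded type and compare."""
    digest = hashlib.new(record["CKSM_T"], data).hexdigest()
    return digest == record["CKSM"]


payload = b"measurement series 42"
record = make_record(
    payload,
    loc="https://repo.example.org/objects/42",  # hypothetical location
    ror="Example Repository",                   # hypothetical repository
    md="https://repo.example.org/md/42",        # hypothetical metadata path
)
intact = verify(record, payload)
tampered = verify(record, payload + b"!")
```

Because the checksum type travels with the record, any client that understands the registered terms can run the same check without knowing which repository produced the data.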
17
RDA Results IV: Practical Policies
▪ due to unforeseen circumstances, the work needs until P5
▪ Practical Policies (PPs) = executable workflow statements
▪ result at P5: a set of best-practice PPs for a number of typical DM/DP tasks (integrity check, replication, etc.)
▪ there is currently a large collection of PPs, which is being evaluated
▪ you could add your policies
[Diagram labels: Policy Inventory (replication policy X, replication policy Y, integrity policy A, integrity policy B, integrity policy C, md extraction policy l, md extraction policy k, etc.); Repository; data manager; selection, implementation, execution]
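The notion of policies as executable statements can be sketched as follows. This is an illustration only: the in-memory stores, the function names and the two policies are assumptions, not policies taken from the working group's collection. It shows an inventory of named policies, a selection by the data manager, and their execution.

```python
import hashlib
from typing import Callable, Dict, List

Store = Dict[str, bytes]  # object name -> bit sequence


def replicate(source: Store, target: Store) -> List[str]:
    """Replication policy: copy every object missing from the target."""
    copied = [name for name in source if name not in target]
    for name in copied:
        target[name] = source[name]
    return copied


def check_integrity(store: Store, checksums: Dict[str, str]) -> List[str]:
    """Integrity policy: list objects whose SHA-256 digest does not match."""
    return [name for name, data in store.items()
            if hashlib.sha256(data).hexdigest() != checksums.get(name)]


# Hypothetical repository content, for illustration only.
primary: Store = {"a.dat": b"alpha", "b.dat": b"beta"}
replica: Store = {"a.dat": b"alpha"}
checksums = {name: hashlib.sha256(data).hexdigest()
             for name, data in primary.items()}

# Policy inventory: named, executable statements.
inventory: Dict[str, Callable[[], List[str]]] = {
    "replication policy X": lambda: replicate(primary, replica),
    "integrity policy A": lambda: check_integrity(replica, checksums),
}

# Selection by the data manager, then execution in the chosen order.
selected = ["replication policy X", "integrity policy A"]
report = {name: inventory[name]() for name in selected}
```

Keeping each policy as a named callable means the same inventory can be shared between repositories, while each data manager only decides which entries to select.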
18
RDA ongoing: Data Fabric
▪ many RDA WGs and IGs need to be placed on a common landscape, since in the end everything needs to fit together: the Data Fabric
19
Changes take long ...
[Timeline: years 1973, 1977, 1990, 1993; milestones: TCP/IP Specification, TCP/IP Stress-test, WWW-Mosaic available, worldwide adoption. 20 years!]
▪ many different suggestions and protocols
▪ at first no advantage for TCP/IP
▪ at the beginning, discussion about different email systems
▪ at the beginning, no interest from researchers or from industry (toy of some freaks)
▪ some top-down decisions were required to enforce unification
20
RDA is about global bridge building
RDA is about building the social and technical bridges that
enable global open sharing of data.
Researchers, scientists, data practitioners from around the
world are invited to work together to achieve the vision
Funders: NSF, EC, AU Gov, Japan, Brazil, DE?, UK?, ZA?, FI?,
etc.
21
Thanks for your attention.
http://www.rd-alliance.org
http://europe.rd-alliance.org
22
Trends IV: research is changing
▪ see the Science 2.0 Initiative of the EC
▪ the number of researchers is increasing enormously
▪ there is pressure in the direction of Grand Challenges and topics relevant for societies
▪ research is increasingly data intensive
▪ border-crossing research is a fact (countries, disciplines)
▪ faster cycles (hypothesis – analysis – publications – reviews)
23
RDA is about global bridge building
[Diagram labels: bottom-up process, top-down process, uptake to come]
24
EUDAT Services
EUDAT Box (B2DROP): dropbox-like service, easy sharing, local synching
Semantic Anno (B2NOTE): checking, referencing and annotating
Dynamic Data: immediate handling
Generic Workflow: automating data processing

 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 

Principles for proper data management and reuse--An RDA view

  • 1. Principles for proper Data Management and Re-Use – an RDA view. Peter Wittenburg, Max Planck Society
  • 2. Clarification: Does RDA have one view? Yes and no. RDA is basically a bottom-up organization driven by the many “creative” minds who want to change data practices. RDA now has about 2000 members, so do we have 2000 opinions? We have had an intensive discussion process since 2012 (ICRI Conference, Copenhagen), and we can see a number of trends and principles that all or most seem to agree with. Still, RDA is a very young initiative and needs much attention and grease.
  • 3. Why is this all relevant? Naoyuki Tsunematsu (JST): data exchange (and thus the need for proper data management) is difficult to convey in Japanese science; parallel trends observed for Japanese science; not so often included in collaborations anymore; not so often represented in the top papers; enormous decrease in international ranking; serious worries about counterproductive encapsulation; this concern seems to be relevant for all of us.
  • 4. Trends I – Volume, Complexity: from simple structures towards complex relationships
  • 5. Trends II – Anonymity: direct exchange between known colleagues; domain of repositories
  • 6. Trends III – Re-Usage: domain of trusted repositories. Data will be re-used in different contexts; data needs to be findable, accessible, combinable and interpretable for others.
  • 7. Data Practices I – Survey: ~120 interviews/interactions; 2 workshops with leading scientists (EU, US); too much is done manually or via ad hoc scripts; too much is in legacy formats (no PID & MD); there are lighthouse projects etc., but ...; DM and DP are not efficient and too expensive (a biologist spends 75% of his time as a data manager); federating data incl. logical information is much too expensive; hardly any use of automated workflows, and a lack of reproducibility.
  • 9. Data Practices III – Metadata (slide from Bill Michener, DataONE). Metadata standards, chart values in order of extraction: 12, 21, 26, 95, 95, 96, 97, 266, 676 for DIF, DwC, DC, EML, FGDC, Open GIS, ISO, My Lab, none
  • 10. Data Practices II – Data Entropy: lack of proper documentation, schemas, semantics, relations, etc.; directory structures, spreadsheets etc. are ad hoc creations, and knowledge fades away; etc.
  • 11. Changes needed – EUDAT and others. Diagram labels: Community Center; Common Data Center. Many excellent projects are working on changes: ESFRI projects, DataNet projects, e-Infrastructures, national projects. RDA needs to build on experiences and expertise.
  • 12. RDA widely agreed I – time to change: management of data objects is largely independent of type and discipline; still, every project defines its own strategies, leading to a huge stack of software that will not be maintainable.
  • 13. RDA widely agreed II – time to change what. Diagram labels: Value Added Services; Data Sources; Persistent Identifiers; Persistent Reference; Analysis; Citation; Apps; Custom Clients; Plug-Ins; Resolution System; Typing; PID; Local Storage; Cloud; Computed Data Sets; RDBMS; Files; Digital Objects. Data model labels: PID record (attributes); bit sequence (instance); metadata (attributes); relations: points to instances; describes properties; describes properties & context; point to each other.
  • 14. RDA Results I: common data model (taken from RDA WG Data Foundation & Terminology): PIDs are at the beginning of the trust chain; have a worldwide, independent and robust PID system (DONA Handles; DOIs are Handles)!; metadata are essential in an anonymous data world.
  • 15. RDA Results II: Data Type Registry: result: a registry for data types; you get an unknown file, pull it onto the DTR, and the content is visualized; extended MIME type concept; no free lunch: someone needs to register and define the type; code available at the beginning of 2015; PIT demo already working with DTR; NIST is already working with communities on far-reaching ideas. Diagram labels: Federated Set of Type Registries; Visualization; Data Processing; Data Set; Dissemination; Terms; Rights; Agree; Visualization; Processing; Interpretation; Domain of Services; Human or Machine Consumers; steps 1 to 4.
  • 16. RDA Results III: PID Information Types: result: a generic API and a set of basic attributes; a PID record is like a passport (number, photo, expiry date, etc.); if all PID service providers agree on one API and talk the same language (registered terms), software development will become easy; test installation in operation together with DTR. Attributes shown: LOC (location, path); CKSM (checksum); CKSM_T (checksum type); RoR (owning repository); MD (path to MD).
  • 17. RDA Results IV: Practical Policies: due to unforeseen circumstances, more time is needed, until P5; Practical Policies (PPs) = executable workflow statements; result at P5: a set of best-practice PPs for a number of typical DM/DP tasks (integrity check, replication, etc.); a large collection of PPs is currently being evaluated; you could add your policies. Diagram labels: Policy Inventory (replication policy X; replication policy Y; integrity policy A; integrity policy B; integrity policy C; md extraction policy l; md extraction policy k; etc.); Repository; selection; implementation; execution; data manager.
  • 18. RDA ongoing: Data Fabric: the many RDA WGs & IGs need to be placed on a common landscape, since in the end everything needs to fit together -> Data Fabric
  • 19. Changes take long ... 20 years! Timeline labels: 1973; 1977; 1990; 1993; TCP/IP specification; TCP/IP stress test; WWW-Mosaic available; worldwide adoption. Many different suggestions & protocols; at first no advantage for TCP/IP; at the beginning, discussion about different email systems; at the beginning, no interest from researchers or from industry (a toy of some freaks); some top-down decisions were required to enforce unification.
  • 20. RDA is about global bridge building: RDA is about building the social and technical bridges that enable global open sharing of data. Researchers, scientists and data practitioners from around the world are invited to work together to achieve the vision. Funders: NSF, EC, AU Gov, Japan, Brazil, DE?, UK?, ZA?, FI?, etc.
  • 21. Thanks for your attention. http://www.rd-alliance.org http://europe.rd-alliance.org
  • 22. Trends IV: research is changing: see the Science 2.0 Initiative of the EC; the number of researchers increases enormously; there is pressure in the direction of Grand Challenges and topics relevant for societies; research is increasingly often data intensive; border-crossing research is a fact (countries, disciplines); faster cycles (hypothesis – analysis – publications – reviews).
  • 24. EUDAT Services: EUDAT Box (B2DROP): dropbox-like service, easy sharing, local synching; Semantic Anno (B2NOTE): checking, referencing and annotating; Dynamic Data: immediate handling; Generic Workflow: automating data processing
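
The PID record attributes on slide 16 (LOC, CKSM, CKSM_T, RoR, MD) and the integrity-check policy mentioned on slide 17 can be illustrated with a small Python sketch. This is not RDA code or the API of any PID service: the class name `PidRecord`, the helper functions and all example values are assumptions made only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PidRecord:
    """Hypothetical PID record holding the attributes listed on slide 16."""
    pid: str      # the persistent identifier itself (made-up value in any demo)
    loc: str      # LOC: location or path of the bit sequence
    cksm: str     # CKSM: checksum of the bit sequence, as a hex digest
    cksm_t: str   # CKSM_T: checksum type, e.g. "sha256"
    ror: str      # RoR: owning repository
    md: str       # MD: path to the metadata description


def compute_checksum(data: bytes, cksm_t: str) -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(cksm_t, data).hexdigest()


def integrity_ok(record: PidRecord, data: bytes) -> bool:
    """Toy integrity policy: recompute the checksum and compare it with the record."""
    return compute_checksum(data, record.cksm_t) == record.cksm
```

A repository could run a check like `integrity_ok` after every replication step; because the checksum type is stored in the record, the verifying side does not need to guess which algorithm was used.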

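The Data Type Registry idea on slide 15 (someone registers and defines a type, and a consumer who meets an unknown file looks the type up to get it interpreted) can be sketched in the same spirit. The class `DataTypeRegistry`, the type identifier `example/csv-table` and the handler `describe_csv` are hypothetical names, not part of the actual DTR software.

```python
from typing import Callable, Dict

# A handler turns a raw bit sequence into a human-readable description.
Handler = Callable[[bytes], str]


class DataTypeRegistry:
    """Hypothetical in-memory registry mapping type identifiers to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, type_id: str, handler: Handler) -> None:
        # "No free lunch": a type must be registered before it can be resolved.
        if type_id in self._handlers:
            raise ValueError(f"type {type_id!r} is already registered")
        self._handlers[type_id] = handler

    def render(self, type_id: str, data: bytes) -> str:
        # Look up the handler for the type and let it interpret the data.
        try:
            handler = self._handlers[type_id]
        except KeyError:
            raise LookupError(f"type {type_id!r} is not registered") from None
        return handler(data)


def describe_csv(data: bytes) -> str:
    """Example handler: summarize a small comma-separated table."""
    lines = data.decode("utf-8").splitlines()
    if not lines:
        return "empty table"
    header = lines[0].split(",")
    return f"table with {len(lines) - 1} data rows and columns {header}"
```

In a federated setup the dictionary lookup would be replaced by a query to one of several registries, but the contract stays the same: a type identifier goes in, an interpretation service comes out.
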
Editor's Notes

  1. Suzie Scientists want to be able to use other scientists’ datasets; they are willing to share their own data, and they feel it is appropriate to create new datasets from shared data.