On community-standards, data curation and
scholarly communication
Susanna-Assunta Sansone, PhD
@SusannaASansone
13th Annual Meeting of the Bioinformatics Italian Society, University of Salerno, Italy, 15-17 June 2016.
Data Consultant,
Founding Academic Editor
Associate Director,
Principal Investigator
Member,
Executive Committee
•  Better data better science – the FAIR meme
•  Publication of digital research outputs – why it matters
•  Interoperability standards – as enablers
Outline
Research as a Connected Digital Enterprise aka The Commons
•  Researcher X is automatically made aware of researcher Y through commonalities
in their respective data located in the Commons.
•  Researcher X locates researcher Y’s data sets with their associated usage
statistics, navigates to the associated publications and starts to explore various
ideas to engage with researcher Y and their research network.
•  A fruitful collaboration ensues and they generate publications, data sets and
software; their output is captured in PubMed and the Commons, and is indexed by
the data and software catalogs.
•  Company Z identifies relevant data and software that, based on the metrics from
the catalogs, have utilization above a threshold indicating that those data and
software are heavily utilized by the community. An open source version remains, but
the company adds services on top of the software and revenue flows back to the
labs of researchers X and Y, which is used to develop new innovative software for
open distribution.
•  Researchers X and Y provide hands-on advice in the use of their new version and
their course is offered as a MOOC (Massive Open Online Course).
The vision - P. Bourne (NIH Associate Director for Data Science)
https://datascience.nih.gov/commons
A Data Discovery Index prototype that:
•  Helps users find and access shared data
•  Interoperates in the NIH Commons
[Figure: the Data Discovery Index connecting aggregators and data sources A, B and C]
Dashed lines: mapping of metadata standards, links to aggregators, data
Data: digital research objects
Pilot projects
Core development team
Designed as an element of the
ecosystem
medicine, agriculture, bioindustries, environment
ELIXIR connects national bioinformatics centres and EMBL-EBI into a sustainable
European infrastructure for biological research data
Building a pan-European infrastructure
to do better science!
more efficiently!
Credit to: https://projects.ac/blog/five-top-reasons-to-protect-your-data-and-practise-safe-science/ 2014
“Over 50% of completed studies in biomedicine do not
appear in the published literature… Often because
results do not conform to author's hypotheses”
“Only half the health-related studies funded by the
European Union between 1998 and 2006 - an
expenditure of €6 billion - led to identifiable reports”
Selective reporting is still an unfortunate practice
•  Small independent efforts, yielding a rich variety of specialty data sets
o  Most of these data (such as null findings) are unpublished
o  These dark data hold a potential wealth of knowledge
•  Researchers still have no, or insufficient, motivation
•  Hypothesis-confirming results get prioritized
•  Agreements, disagreements and timing
•  Loose requirements and monitoring by journals and
funders
But why?
•  Most researchers are
sharing data, and using the
data of others
•  Direct contact* between
researchers (on request) is
a common way of sharing
data
•  Repositories are second
most common method of
sharing
Kratz JE, Strasser C (2015) Researcher Perspectives on Publication and Peer Review of Data. PLoS ONE 10(2): e0117619.
Current approaches to sharing
* Data associated with published works disappears at a rate of ~17% per year (Vines et al. 2014, doi:10.1016/j.cub.2013.11.014)
Datasets not referenced in a manuscript are essentially invisible and data producers do not get appropriate credit for their work
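To put the ~17% annual loss quoted above in perspective, the sketch below compounds it over time. It assumes, purely for illustration, that a constant 17% of the remaining datasets is lost every year; the function names are hypothetical and the numbers are arithmetic, not new findings.

```python
import math

# Annual loss rate quoted on the slide (Vines et al. 2014).
ANNUAL_LOSS = 0.17


def fraction_remaining(years, annual_loss=ANNUAL_LOSS):
    """Fraction of datasets still available after `years`,
    assuming the same proportion is lost every year."""
    return (1.0 - annual_loss) ** years


def half_life(annual_loss=ANNUAL_LOSS):
    """Years until half of the datasets are gone under the same assumption."""
    return math.log(0.5) / math.log(1.0 - annual_loss)


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(years, round(fraction_remaining(years), 2))
    print("half-life in years:", round(half_life(), 1))
```

Under this simplifying assumption, about half of the datasets would no longer be obtainable after a little under four years.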
•  Outputs are multi-dimensional, not always well cited, stored
o  Software, codes, workflows are hard(er) to get hold of
•  Poorly described for third party reuse
o  Different levels of detail and annotation
•  Curation activities are perceived as time consuming
o  Collection and harmonization of detailed methods and
experimental steps is done/rushed at publication stage
Shared data is not always understandable, reusable
A B C D E
1 Group1 Group2
2 Day 0
3 Sodium 139 142
4 Potassium 3.3 4.8
5 Chloride 100 108
6 BUN 18 18
7 Creatine 1.2 1.2
8 Uric acid 5.5* 6.2*
9 Day 7
10 Sodium 140 146
11 Potassium 3.4 5.1
12 Chloride 97 108
S1Sh.cuo
Sharing starts with good metadata…
Credit to: Iain Hrynaszkiewicz
A B C D E
1 Group1 Group2
2 Day 0
3 Sodium 139 142
4 Potassium 3.3 4.8
5 Chloride 100 108
6 BUN 18 18
7 Creatine 1.2 1.2
8 Uric acid 5.5* 6.2*
9 Day 7
10 Sodium 140 146
11 Potassium 3.4 5.1
12 Chloride 97 108
S1Sh.cuo
•  Meaningless column titles
•  Special characters can cause text mining errors
•  No units
•  Unhelpful document name
•  Undefined abbreviation
•  Formatting for information that should be in metadata
…but this is not!
Credit to: Iain Hrynaszkiewicz
    A          B    C        D        E      F
1   Parameter  Day  Control  Treated  Units  P
2   Sodium     0    139      142      mEq/l  0.82
3   Sodium     7    140      146      mEq/l  0.70
4   Sodium     14   140      158      mEq/l  0.03
5   Sodium     21   143      160      mEq/l  0.02
6   Potassium  0    3.3      4.8      mEq/l  0.06
7   Potassium  7    3.4      5.1      mEq/l  0.07
8   Potassium  14   3.7      4.7      mEq/l  0.10
9   Potassium  21   3.1      3.6      mEq/l  0.52
10  Chloride   0    100      108      mEq/l  0.56
11  Chloride   7    97       108      mEq/l  0.68
12  Chloride   14   101      106      mEq/l  0.79
Table_S1_Shanghai_blood.xls
…this is much clearer!
Credit to: Iain Hrynaszkiewicz
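The move from the first spreadsheet to the clearer one can be sketched in a few lines of Python. This is only an illustration: it assumes that Group1 is the control arm and Group2 the treated arm (the values in the two tables agree with that reading), it takes the units from the clearer table, and the function and variable names are hypothetical.

```python
import csv
import io

# Rows as they appear in the untidy sheet: a "Day N" row acts as a hidden header.
UNTIDY_ROWS = [
    ["", "Group1", "Group2"],
    ["Day 0"],
    ["Sodium", "139", "142"],
    ["Potassium", "3.3", "4.8"],
    ["Chloride", "100", "108"],
    ["Day 7"],
    ["Sodium", "140", "146"],
    ["Potassium", "3.4", "5.1"],
    ["Chloride", "97", "108"],
]

# Assumptions: meaning of the group labels and the units of each parameter.
GROUP_NAMES = {"Group1": "Control", "Group2": "Treated"}
UNITS = {"Sodium": "mEq/l", "Potassium": "mEq/l", "Chloride": "mEq/l"}
FIELDS = ["Parameter", "Day", "Control", "Treated", "Units"]


def tidy(rows):
    """Turn the untidy rows into one self-describing record per measurement row."""
    groups = [GROUP_NAMES[label] for label in rows[0][1:]]
    day = None
    records = []
    for row in rows[1:]:
        if len(row) == 1 and row[0].startswith("Day "):
            day = int(row[0].split()[1])  # carry the day forward as a real column
            continue
        record = {"Parameter": row[0], "Day": day}
        for group, value in zip(groups, row[1:]):
            record[group] = float(value)
        record["Units"] = UNITS[row[0]]
        records.append(record)
    return records


def to_csv(records):
    """Serialise the tidy records with explicit column titles."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(tidy(UNTIDY_ROWS)))
```

Every row of the output carries its own day, group labels and units, so nothing depends on layout or formatting to be understood.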
Without context, data is meaningless
The International Conference on Systems Biology (ICSB), 22-28 August, 2008 Susanna-Assunta
Sansone www.ebi.ac.uk/net-project
…breadth and depth of the
context is pivotal…
…including capturing
experimental design and
statistical analysis
Among these, publishers occupy a
leverage point, because of the importance of
formal publications in the academic
incentive structure
Stakeholder mobilization: old and new driving forces
•  Incentive, credit for sharing
o  Big and small data
o  Unpublished data
o  Long tail of data
o  Curated aggregation
•  Peer review of data
•  Value of data vs. analysis
•  Discoverability and reusability
o  Complementing community
databases
Growing number of data papers and data journals
nature.com/scientificdata
Honorary Academic Editor: Susanna-Assunta Sansone, PhD
Managing Editor: Andrew L Hufton, PhD
Editorial Curator: Varsha Khodiyar
Publisher: Iain Hrynaszkiewicz
A new open-access, online-only publication for
descriptions of scientifically valuable datasets
Supported by
A new article type
A new category of publication that provides detailed
descriptors of scientifically valuable datasets
Mandates open data, without unnecessary
restrictions, as a condition of submission
Research
papers
Data
records
Data
Descriptors
Value added component – complementing
articles and repositories
Scientific hypotheses:
Synthesis
Analysis
Conclusions
Methods and technical analyses supporting the quality
of the measurements:
What did I do to generate the data?
How was the data processed?
Where is the data?
Who did what when
Relation with traditional articles – content
Citation of and links to data files and databases
Experimental metadata or
structured component
(in-house curated, machine-
readable formats)
Article or
narrative component
(PDF and HTML)
Data Descriptors have two components
The Data Curation Editor is responsible for creating and
curating the machine-readable structured component
•  Enables browsing and searching the articles
•  Facilitates links to related journal articles and repository
records
Curation and discoverability
Created with the input of the
authors, includes value-added
semantic annotation of the
experimental metadata
analysis
method script
Data file or
record in a
database
Data Descriptors: structured component
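As a rough illustration of what a machine-readable structured component might hold, the sketch below models one as a plain Python dictionary and serialises it to JSON. Every field name, identifier and value is hypothetical; it does not reproduce the journal's actual curated format.

```python
import json

# Hypothetical structured component for one Data Descriptor.
descriptor = {
    "title": "Example blood electrolyte dataset",
    "narrative_article": "https://example.org/articles/example-descriptor",
    "samples": [{"organism": "Homo sapiens", "material": "blood"}],
    "methods": ["sample collection", "electrolyte panel"],
    "data_records": [
        {
            "repository": "ExampleRepository",
            "identifier": "EX-0001",
            "file": "Table_S1_Shanghai_blood.xls",
        }
    ],
    "related_articles": [],
}


def matches(component, keyword):
    """Return True if the keyword occurs anywhere in the serialised metadata."""
    return keyword.lower() in json.dumps(component).lower()


# Machine-readable form that a search or browse interface could index.
serialised = json.dumps(descriptor, indent=2, sort_keys=True)

if __name__ == "__main__":
    print(serialised)
    print(matches(descriptor, "blood"))
```

Because the metadata is structured rather than buried in narrative text, browsing, searching and linking to repository records become simple lookups.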
Browse, search, view Data Descriptors
Why data papers? Credit for data producers!
Credit to: Varsha Khodiyar
“The Data Descriptor made it easier to use
the data, for me it was critical that everything
was there…all the technical details like voxel
size.”
Professor Daniele Marinazzo
Why data papers? Data reuse is easier!
Credit to: Varsha Khodiyar
•  Decades old dataset
•  Aggregated or curated data resources
•  Computationally produced data products
•  Large consortium dataset
•  Data from a single experiment
•  Data associated with a high impact analysis article
What makes a good Data Descriptor?
Credit to: Andrew Hufton
•  Better data better science – the FAIR meme
•  Publication of digital research outputs – why it matters
•  Interoperability standards – as enablers
Outline
[Figure: standards developers, de jure and de facto: standard organizations and grass-roots groups, e.g. the Nanotechnology Working Group]
•  To structure, enrich and report the description of the datasets and the
experimental context under which they were produced
•  To facilitate discovery, sharing, understanding and reuse of datasets
Community-developed content standards
Content standards as enablers for better-described data
•  Including minimum information reporting requirements, or checklists, to report the same core, essential information
•  Including controlled vocabularies, taxonomies, thesauri, ontologies etc. to use the same word and refer to the same ‘thing’
•  Including conceptual model, conceptual schema from which an exchange format is derived to allow data to flow from one system to another
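The first two kinds of content standard can be illustrated with a small check of one metadata record against a checklist and a controlled vocabulary. The required fields and allowed terms below are made up for the example; they are not taken from any published guideline or ontology.

```python
# Hypothetical minimum-information checklist: fields every record must report.
REQUIRED_FIELDS = {"organism", "sample_type", "assay", "units"}

# Hypothetical controlled vocabulary: allowed terms for selected fields.
CONTROLLED_TERMS = {"units": {"mEq/l", "mg/dl"}}


def check_record(record):
    """Return a list of problems found in one metadata record (empty if none)."""
    problems = []
    for field in sorted(REQUIRED_FIELDS):
        if not str(record.get(field, "")).strip():
            problems.append(f"missing required field: {field}")
    for field, allowed in CONTROLLED_TERMS.items():
        value = record.get(field)
        if value and value not in allowed:
            problems.append(f"{field}: '{value}' is not a controlled term")
    return problems


if __name__ == "__main__":
    good = {
        "organism": "Homo sapiens",
        "sample_type": "blood",
        "assay": "electrolyte panel",
        "units": "mEq/l",
    }
    bad = {"organism": "Homo sapiens", "units": "meq per litre"}
    print(check_record(good))
    print(check_record(bad))
```

A checklist catches what is missing, while a vocabulary catches what is present but worded in a way other systems will not recognise.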
203
105
345
[Figure: word clouds of example standards]
Reporting guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT
Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB
Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
Complex and evolving landscape
data policies databases
data/metadata standards
Is there a database, implementing standards, where I can deposit my metagenomics dataset?
My funder’s data sharing policy recommends the use of established standards, but which ones are widely endorsed and applicable to my toxicological and clinical data?
Am I using the most up-to-date version of this terminology to annotate cell-based assays?
I understand this format has been deprecated; what has it been replaced by, and who is leading the work?
Are there databases implementing this exchange format, whose development we have funded?
What are the mature standards and standards-compliant databases we should recommend to our authors?
But how do we help users to make informed decisions?
A web-based, curated and searchable registry ensuring that
standards and databases are registered, informative and
discoverable; monitoring development and evolution of standards,
their use in databases and adoption of both in data policies
An informative and educational resource
1,400 records and growing
Tracking evolution, e.g. deprecations and substitutions
Model/format formalizing reporting guideline -->
<-- Reporting guideline used by model/format
Cross-linking standards to standards and databases
Standards and databases recommended by publishers in
their data policies
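The cross-linking and deprecation tracking described above can be pictured as a small graph of records. The sketch below is a toy model, not the registry's actual data model or API; the class, relation names and all example records are hypothetical.

```python
from collections import defaultdict


class Registry:
    """Toy registry of standards, databases and policies with typed links."""

    def __init__(self):
        self.records = {}               # name -> {"kind": ..., "replaced_by": ...}
        self.links = defaultdict(set)   # (source, relation) -> set of targets

    def add(self, name, kind, replaced_by=None):
        self.records[name] = {"kind": kind, "replaced_by": replaced_by}

    def link(self, source, relation, target):
        self.links[(source, relation)].add(target)

    def current_version(self, name):
        """Follow 'replaced_by' pointers from a deprecated record to its successor."""
        seen = set()
        while self.records[name]["replaced_by"] is not None and name not in seen:
            seen.add(name)
            name = self.records[name]["replaced_by"]
        return name

    def sources(self, relation, target):
        """Records that point at `target` through `relation`, sorted by name."""
        return sorted(
            source
            for (source, rel), targets in self.links.items()
            if rel == relation and target in targets
        )


def build_example():
    registry = Registry()
    registry.add("ExampleChecklist", "standard")
    registry.add("ExampleFormat 1.0", "standard", replaced_by="ExampleFormat 2.0")
    registry.add("ExampleFormat 2.0", "standard")
    registry.add("ExampleDB", "database")
    registry.add("ExampleJournal data policy", "policy")
    registry.link("ExampleFormat 2.0", "formalizes", "ExampleChecklist")
    registry.link("ExampleDB", "implements", "ExampleFormat 2.0")
    registry.link("ExampleJournal data policy", "recommends", "ExampleDB")
    registry.link("ExampleJournal data policy", "recommends", "ExampleFormat 2.0")
    return registry


registry = build_example()

if __name__ == "__main__":
    print(registry.current_version("ExampleFormat 1.0"))
    print(registry.sources("implements", "ExampleFormat 2.0"))
    print(registry.sources("recommends", "ExampleDB"))
```

With links stored this way, questions such as "what replaced this format?" or "which databases implement it?" reduce to short traversals.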
Interactive graph to inform and educate, e.g. database
standard
policy
Linking standards and databases to training material
Advised by the ELIXIR Training Coordinators Group,
including:
A collaboration between:
Data, Software, Standards, Databases, Workflow, Publications, Training material
Philippe Rocca-Serra, PhD, Senior Research Lecturer
Alejandra Gonzalez-Beltran, PhD, Research Lecturer
Milo Thurston, DPhil, Research Software Engineer
Massimiliano Izzo, PhD, Research Software Engineer
Peter McQuilton, PhD, Knowledge Engineer
Allyson Lister, PhD, Knowledge Engineer
Eamonn Maguire, DPhil, Software Engineer contractor
David Johnson, PhD, Research Software Engineer
Susanna-Assunta Sansone, PhD
Principal Investigator, Associate Director
We also acknowledge our network of collaborators
in the following active projects: H2020 PhenoMeNal,
H2020 ELIXIR-EXCELERATE, H2020 MultiMot,
NIH bioCADDIE, NIH CEDAR and IMI eTRIKS

More Related Content

What's hot

IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...Amanda Whitmire
 
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Anita de Waard
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsPaul Groth
 
Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Amanda Whitmire
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicinePaul Groth
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data ManagementAmanda Whitmire
 
Reproducible research: First steps.
Reproducible research: First steps. Reproducible research: First steps.
Reproducible research: First steps. Richard Layton
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data managementCunera Buys
 
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014Microsoft Azure for Research
 
NSF Data Management Plan Case Study: UVa’s Response.
NSF Data Management Plan Case Study:  UVa’s Response.NSF Data Management Plan Case Study:  UVa’s Response.
NSF Data Management Plan Case Study: UVa’s Response.Andrew Sallans
 
Minimal viable-datareuse-czi
Minimal viable-datareuse-cziMinimal viable-datareuse-czi
Minimal viable-datareuse-cziPaul Groth
 
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...The Kaleidoscope of Impact: same data, different perspectives, constantly cha...
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...Kudos
 
Data peer review workshop
Data peer review workshopData peer review workshop
Data peer review workshopVarsha Khodiyar
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Best practices data collection
Best practices data collectionBest practices data collection
Best practices data collectionSherry Lake
 
THOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOSTHOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOSMaaike Duine
 
RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015William Gunn
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsCarole Goble
 
Developing data services: a tale from two Oregon universities
Developing data services: a tale from two Oregon universitiesDeveloping data services: a tale from two Oregon universities
Developing data services: a tale from two Oregon universitiesAmanda Whitmire
 

What's hot (20)

IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
 
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
 
Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
 
Reproducible research: First steps.
Reproducible research: First steps. Reproducible research: First steps.
Reproducible research: First steps.
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data management
 
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
 
NSF Data Management Plan Case Study: UVa’s Response.
NSF Data Management Plan Case Study:  UVa’s Response.NSF Data Management Plan Case Study:  UVa’s Response.
NSF Data Management Plan Case Study: UVa’s Response.
 
Valen Metadata and the [Data] Repository
Valen Metadata and the [Data] RepositoryValen Metadata and the [Data] Repository
Valen Metadata and the [Data] Repository
 
Minimal viable-datareuse-czi
Minimal viable-datareuse-cziMinimal viable-datareuse-czi
Minimal viable-datareuse-czi
 
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...The Kaleidoscope of Impact: same data, different perspectives, constantly cha...
The Kaleidoscope of Impact: same data, different perspectives, constantly cha...
 
Data peer review workshop
Data peer review workshopData peer review workshop
Data peer review workshop
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Best practices data collection
Best practices data collectionBest practices data collection
Best practices data collection
 
THOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOSTHOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOS
 
RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
Developing data services: a tale from two Oregon universities
Developing data services: a tale from two Oregon universitiesDeveloping data services: a tale from two Oregon universities
Developing data services: a tale from two Oregon universities
 

Similar to On community-standards, data curation and scholarly communication - BITS, Italy, 2016

bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...dkNET
 
The Rocky Road to Reuse
The Rocky Road to ReuseThe Rocky Road to Reuse
The Rocky Road to ReuseAnita de Waard
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New ScienceAnita de Waard
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - IntroductionMaaike Duine
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersIncisive_Events
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhilip Bourne
 
ODIN: Connecting research and researchers
ODIN: Connecting research and researchersODIN: Connecting research and researchers
ODIN: Connecting research and researchersSergio Ruiz
 
Parsec 191119 slideshare
Parsec 191119 slideshareParsec 191119 slideshare
Parsec 191119 slideshareAlison Specht
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?Anita de Waard
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things dataARDC
 
Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Robin Rice
 
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseMendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseAnita de Waard
 
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...Susanna-Assunta Sansone
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeLizLyon
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Susanna-Assunta Sansone
 
Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...EDINA, University of Edinburgh
 

Similar to On community-standards, data curation and scholarly communication - BITS, Italy, 2016 (20)

bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
bioCADDIE Webinar: The NIDDK Information Network (dkNET) - A Community Resear...
 
The Rocky Road to Reuse
The Rocky Road to ReuseThe Rocky Road to Reuse
The Rocky Road to Reuse
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New Science
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - Introduction
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producers
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early Thoughts
 
ODIN: Connecting research and researchers
ODIN: Connecting research and researchersODIN: Connecting research and researchers
ODIN: Connecting research and researchers
 
Parsec 191119 slideshare
Parsec 191119 slideshareParsec 191119 slideshare
Parsec 191119 slideshare
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
 
On community-standards, data curation and scholarly communication - BITS, Italy, 2016

  • 1. On community-standards, data curation and scholarly communication Susanna-Assunta Sansone, PhD @SusannaASansone 13th Annual Meeting of the Bioinformatics Italian Society, University of Salerno, Italy, 15-17 June 2016. Data Consultant, Founding Academic Editor Associate Director, Principal Investigator Member, Executive Committee
  • 2. •  Better data better science – the FAIR meme •  Publication of digital research outputs – why it matters •  Interoperability standards – as enablers Outline
  • 8. Research as a Connected Digital Enterprise aka The Commons •  Researcher X is automatically made aware of researcher Y through commonalities in their respective data located in the Commons. •  Researcher X locates researcher Y’s data sets with their associated usage statistics, navigates to the associated publications and starts to explore various ideas to engage with researcher Y and their research network. •  A fruitful collaboration ensues and they generate publications, data sets and software; their output is captured in PubMed and the Commons, and is indexed by the data and software catalogs. •  Company Z identifies relevant data and software that, based on the metrics from the catalogs, have utilization above a threshold indicating that those data and software are heavily utilized by the community. An open source version remains, but the company adds services on top of the software and revenue flows back to the labs of researchers X and Y, which is used to develop new innovative software for open distribution. •  Researchers X and Y provide hands-on advice in the use of their new version and their course is offered as a MOOC (Massive Open Online Course). The vision - P. Bourne (NIH Associate Director for Data Science)
  • 9. Research as a Connected Digital Enterprise aka The Commons The vision - P. Bourne (NIH Associate Director for Data Science) https://datascience.nih.gov/commons
  • 10. A Data Discovery Index prototype that: •  Helps users find and access shared data •  Interoperates in the NIH Commons
  • 11. Designed as an element of the ecosystem. [Figure: the Data Discovery Index connected to aggregators (A, B, C) and data. Dashed lines: mapping of metadata standards, links to aggregators, data. Data: digital research objects. Pilot projects; Core development team]
  • 14. …to do better science, more efficiently!
  • 16. Selective reporting is still an unfortunate practice. “Over 50% of completed studies in biomedicine do not appear in the published literature… Often because results do not conform to author's hypotheses” “Only half the health-related studies funded by the European Union between 1998 and 2006 - an expenditure of €6 billion - led to identifiable reports” •  Small independent efforts, yielding a rich variety of specialty data sets o  Most of these data (such as null findings) are unpublished o  These dark data hold a potential wealth of knowledge
  • 17. But why? •  Researchers still have no or insufficient motivation •  Hypothesis-confirming results get prioritized •  Agreements, disagreements and timing •  Loose requirements and monitoring by journals and funders
  • 18. Current approaches to sharing •  Most researchers are sharing data, and using the data of others •  Direct contact* between researchers (on request) is a common way of sharing data •  Repositories are the second most common method of sharing. Kratz JE, Strasser C (2015) Researcher Perspectives on Publication and Peer Review of Data. PLoS ONE 10(2): e0117619. * Data associated with published works disappears at a rate of ~17% per year (Vines et al. 2014, doi:10.1016/j.cub.2013.11.014). Datasets not referenced in a manuscript are essentially invisible and data producers do not get appropriate credit for their work
  • 19. Shared data is not always understandable, reusable •  Outputs are multi-dimensional, not always well cited, stored o  Software, code, workflows are hard(er) to get hold of •  Poorly described for third party reuse o  Different levels of detail and annotation •  Curation activities are perceived as time consuming o  Collection and harmonization of detailed methods and experimental steps is done/rushed at publication stage
  • 20. Sharing starts with good metadata… File name: S1Sh.cuo. Spreadsheet with columns A to E (spreadsheet row number shown first):
        1               Group1   Group2
        2   Day 0
        3   Sodium      139      142
        4   Potassium   3.3      4.8
        5   Chloride    100      108
        6   BUN         18       18
        7   Creatine    1.2      1.2
        8   Uric acid   5.5*     6.2*
        9   Day 7
        10  Sodium      140      146
        11  Potassium   3.4      5.1
        12  Chloride    97       108
    Credit to: Iain Hrynaszkiewicz
  • 21. …but this is not! The same spreadsheet (S1Sh.cuo, columns A to E, rows Day 0 and Day 7 with Sodium, Potassium, Chloride, BUN, Creatine and Uric acid for Group1 and Group2), annotated with its problems: •  Meaningless column titles •  Special characters can cause text mining errors •  No units •  Unhelpful document name •  Undefined abbreviation •  Formatting for information that should be in metadata. Credit to: Iain Hrynaszkiewicz
  • 22. …this is much clearer! File name: Table_S1_Shanghai_blood.xls
        Parameter   Day   Control   Treated   Units   P
        Sodium      0     139       142       mEq/l   0.82
        Sodium      7     140       146       mEq/l   0.70
        Sodium      14    140       158       mEq/l   0.03
        Sodium      21    143       160       mEq/l   0.02
        Potassium   0     3.3       4.8       mEq/l   0.06
        Potassium   7     3.4       5.1       mEq/l   0.07
        Potassium   14    3.7       4.7       mEq/l   0.10
        Potassium   21    3.1       3.6       mEq/l   0.52
        Chloride    0     100       108       mEq/l   0.56
        Chloride    7     97        108       mEq/l   0.68
        Chloride    14    101       106       mEq/l   0.79
    Credit to: Iain Hrynaszkiewicz
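One practical payoff of the clearer layout on slide 22 is that a program can read it without guessing. The sketch below is only an illustration, not part of the talk: it assumes the table has been exported as CSV text with the slide's column names, copies a few of the slide's rows, and uses a hypothetical helper `load_measurements` and a hypothetical `REQUIRED_COLUMNS` list to check that every row carries its units.

```python
import csv
import io

# A few rows copied from the slide's table, written as CSV text (assumed export format).
TIDY_CSV = """Parameter,Day,Control,Treated,Units,P
Sodium,0,139,142,mEq/l,0.82
Sodium,7,140,146,mEq/l,0.70
Potassium,0,3.3,4.8,mEq/l,0.06
Chloride,14,101,106,mEq/l,0.79
"""

# Hypothetical list of columns we expect; taken from the slide's header row.
REQUIRED_COLUMNS = ["Parameter", "Day", "Control", "Treated", "Units", "P"]


def load_measurements(csv_text):
    """Parse tidy CSV text into a list of dicts, converting numeric fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if not (row["Units"] or "").strip():
            raise ValueError(f"line {line_no}: no units given")
        rows.append({
            "parameter": row["Parameter"],
            "day": int(row["Day"]),
            "control": float(row["Control"]),
            "treated": float(row["Treated"]),
            "units": row["Units"],
            "p": float(row["P"]),
        })
    return rows


measurements = load_measurements(TIDY_CSV)
for m in measurements:
    print(m["parameter"], m["day"], m["control"], m["treated"], m["units"], m["p"])
```

The earlier spreadsheet (slide 20) would fail this kind of check at the first step, because its header row does not name the measured parameter, the day, or the units.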
  • 23. Without context data is meaningless
  • 24. …breadth and depth of the context is pivotal… including capturing experimental design and statistical analysis. The International Conference on Systems Biology (ICSB), 22-28 August, 2008, Susanna-Assunta Sansone, www.ebi.ac.uk/net-project
  • 25. Stakeholder mobilization, old and new driving forces. Among these, publishers occupy a leverage point, because of the importance of formal publications in the academic incentive structure
  • 26. •  Incentive, credit for sharing o  Big and small data o  Unpublished data o  Long tail of data o  Curated aggregation •  Peer review of data •  Value of data vs. analysis •  Discoverability and reusability o  Complementing community databases Growing number of data papers and data journals
  • 27. nature.com/scientificdata. A new open-access, online-only publication for descriptions of scientifically valuable datasets. Honorary Academic Editor: Susanna-Assunta Sansone, PhD. Managing Editor: Andrew L Hufton, PhD. Editorial Curator: Varsha Khodiyar. Publisher: Iain Hrynaszkiewicz. Supported by
  • 28. A new article type A new category of publication that provides detailed descriptors of scientifically valuable datasets Mandates open data, without unnecessary restrictions, as a condition of submission
  • 29. Research papers Data records Data Descriptors Value added component – complementing articles and repositories
  • 30. Scientific hypotheses: Synthesis Analysis Conclusions Methods and technical analyses supporting the quality of the measurements: What did I do to generate the data? How was the data processed? Where is the data? Who did what when Relation with traditional articles – content
  • 31. Citation of and links to data files and databases
  • 32. Data Descriptors have two components: •  Article or narrative component (PDF and HTML) •  Experimental metadata or structured component (in-house curated, machine-readable formats)
  • 33. The Data Curation Editor is responsible for creating and curating the machine-readable structured component •  Enables browsing and searching the articles •  Facilitates links to related journal articles and repository records Curation and discoverability
  • 34. Data Descriptors: structured component. Created with the input of the authors, includes value-added semantic annotation of the experimental metadata. [Figure labels: analysis, method, script, data file or record in a database]
  • 35. Browse, search, view Data Descriptors
  • 38. Why data papers? Credit for data producers! Credit to: Varsha Khodiyar
  • 39. “The Data Descriptor made it easier to use the data, for me it was critical that everything was there…all the technical details like voxel size.” Professor Daniele Marinazzo Why data papers? Data reuse is easier! Credit to: Varsha Khodiyar
  • 40. What makes a good Data Descriptor? •  Decades old dataset •  Aggregated or curated data resources •  Computationally produced data products •  Large consortium dataset •  Data from a single experiment •  Data associated with a high impact analysis article. Credit to: Andrew Hufton
  • 41. •  Better data better science – the FAIR meme •  Publication of digital research outputs – why it matters •  Interoperability standards – as enablers Outline
  • 42. Community-developed content standards •  To structure, enrich and report the description of the datasets and the experimental context under which they were produced •  To facilitate discovery, sharing, understanding and reuse of datasets. [Figure labels: de jure, de facto, grass-roots groups, standard organizations, Nanotechnology Working Group]
  • 43. Content standards as enabler for better described data •  Including minimum information reporting requirements, or checklists to report the same core, essential information •  Including controlled vocabularies, taxonomies, thesauri, ontologies etc. to use the same word and refer to the same ‘thing’ •  Including conceptual model, conceptual schema from which an exchange format is derived to allow data to flow from one system to another. [Figure labels: de jure, de facto, grass-roots groups, standard organizations, Nanotechnology Working Group]
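The three kinds of content standard on slide 43 (a minimum information checklist, a controlled vocabulary, and an exchange format) can be illustrated with a small sketch. Everything named here is hypothetical and chosen only for illustration: the field names, the allowed terms, and the helpers `validate_record` and `to_exchange_format` do not come from any real standard, and JSON merely stands in for a community exchange format.

```python
import json

# Hypothetical minimum information checklist: fields every record must report.
REQUIRED_FIELDS = {"organism", "sample_type", "assay", "units"}

# Hypothetical controlled vocabulary: allowed terms for selected fields.
CONTROLLED_TERMS = {
    "organism": {"Homo sapiens", "Mus musculus"},
    "sample_type": {"blood", "urine", "tissue"},
}


def validate_record(record):
    """Return a list of problems found in one metadata record (empty if none)."""
    problems = []
    # Checklist: every required field must be present and non-empty.
    for field in sorted(REQUIRED_FIELDS):
        if not str(record.get(field, "")).strip():
            problems.append(f"missing required field: {field}")
    # Vocabulary: values given for controlled fields must be allowed terms.
    for field, allowed in sorted(CONTROLLED_TERMS.items()):
        value = record.get(field)
        if value and value not in allowed:
            problems.append(f"{field}: '{value}' is not a controlled term")
    return problems


def to_exchange_format(record):
    """Serialize a valid record as JSON, standing in for an exchange format."""
    problems = validate_record(record)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(record, sort_keys=True)


good = {"organism": "Mus musculus", "sample_type": "blood",
        "assay": "clinical chemistry", "units": "mEq/l"}
bad = {"organism": "mouse", "sample_type": "blood",
       "assay": "clinical chemistry"}

print(validate_record(good))  # prints []
print(validate_record(bad))
print(to_exchange_format(good))
```

The point of the sketch is the division of labour: the checklist says what must be reported, the vocabulary says which words to use, and the exchange format lets a record that passes both move between systems.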
  • 45. But how do we help users to make informed decisions? •  Is there a database, implementing standards, where to deposit my metagenomics dataset? •  My funder’s data sharing policy recommends the use of established standards, but which ones are widely endorsed and applicable to my toxicological and clinical data? •  Am I using the most up-to-date version of this terminology to annotate cell-based assays? •  I understand this format has been deprecated; what has it been replaced by and who is leading the work? •  Are there databases implementing this exchange format, whose development we have funded? •  What are the mature standards and standards-compliant databases we should recommend to our authors?
  • 46. A web-based, curated and searchable registry ensuring that standards and databases are registered, informative and discoverable; monitoring development and evolution of standards, their use in databases and adoption of both in data policies An informative and educational resource 1,400 records and growing
  • 47. An informative and educational resource
  • 48. Tracking evolution, e.g. deprecations and substitutions
  • 49. Model/format formalizing reporting guideline --> <-- Reporting guideline used by model/format Cross-linking standards to standards and databases
  • 50. Standards and databases recommended by publishers in their data policies
  • 51. Interactive graph to inform and educate, e.g. database standard policy
  • 54. Linking standards and databases to training material
  • 55. Advised by the ELIXIR Training Coordinators Group, including: A collaboration between:
  • 57. Philippe Rocca-Serra, PhD, Senior Research Lecturer; Alejandra Gonzalez-Beltran, PhD, Research Lecturer; Milo Thurston, DPhil, Research Software Engineer; Massimiliano Izzo, PhD, Research Software Engineer; Peter McQuilton, PhD, Knowledge Engineer; Allyson Lister, PhD, Knowledge Engineer; Eamonn Maguire, DPhil, Software Engineer contractor; David Johnson, PhD, Research Software Engineer; Susanna-Assunta Sansone, PhD, Principal Investigator, Associate Director. We also acknowledge our network of collaborators in the following active projects: H2020 PhenoMeNal, H2020 ELIXIR-EXCELERATE, H2020 MultiMot, NIH bioCADDIE, NIH CEDAR and IMI eTRIKS