High quality data publications: drives and needs - Sansone, BDebate, 12 Nov 2014
1. High quality data publications: drives and needs
Susanna-Assunta Sansone, PhD
@biosharing
@isatools
@scientificdata
B-DEBATE: Big Data in Biomedicine. Challenges and Opportunities, 12 Nov, 2014
Data Consultant,
Honorary Academic Editor
Associate Director,
Principal Investigator
3. Plagued by selective reporting of data and methods
• Over 50% of completed studies in biomedicine do not appear in the published literature
• Often because results do not conform to the authors' hypotheses
“Only half the health-related studies funded by the European Union between 1998 and 2006 - an expenditure of €6 billion - led to identifiable reports”
4. Incentivizing individual contributors to share data
• Big science efforts
o data are often better organized, reported and shared
• Small independent efforts, yielding a rich variety of specialty data sets
o Most of these data (such as null findings) are unpublished
o These dark data hold a potential wealth of knowledge
5. From made reproducible to born reproducible
“Reproducing the method took several months of effort, and
required using new versions and new software that posed
challenges to reconstructing and validating the results”
10. Data/reproducibility at NPG
Wang et al, Nature, 2013
doi:10.1038/nature12730
• Figure source data
o putting data behind figures/graphs
11. Data/reproducibility at NPG
• Figure source data
o putting data behind figures/graphs
• Data citation
o tackling both styling and format; monitoring community developments, such as the Data Citation Synthesis Group
• Code reproducibility
o peer review, availability and reuse
• NPG’s Linked Data release – CC0
• A new data journal
12. Role of data papers and data journals
• Incentive, credit for sharing
• Peer review focus
• Value of data vs. analysis
• Discoverability and reusability
13. Market research (2011)
• What do researchers want from a data publication?
o 96% - increased visibility and discovery
o 95% - increased usability of their research data
o 93% - credit mechanism for deposit of data
o 80% - peer review of content/datasets
Respondent characteristics
387 respondents (329 active researchers)
Physics (24%)
Earth and environmental science (21%)
Biology (20%)
Chemistry (19%)
Others (16%)
14. Helping you publish, discover and reuse research data
Credit for sharing your data
Focused on reuse and reproducibility
Peer reviewed, curated
Promoting community data and code repositories
Open Access
• Currently covering life, natural and environmental sciences
• Big and small data
o the power of small data is in their aggregation and integration with other datasets
• New and previously published individual datasets, curated collections and citizen science
o a fuller, more in-depth look at the data processing steps, additional data files, code, etc.
o tutorial-like information for scientists interested in reusing or integrating the data with their own
15. Introducing a new content type: Data Descriptor
Methods and technical analyses supporting the quality of the measurements:
What did I do to generate the data?
How was the data processed?
Where is the data?
Who did what when?
How can the data be used or reused?
Designed to make data more discoverable, interpretable and reusable
16. Relation with traditional article - content
Scientific hypotheses:
Synthesis
Analysis
Conclusions
Methods and technical analyses supporting the quality of the measurements:
What did I do to generate the data?
How was the data processed?
Where is the data?
Who did what when?
How can the data be used or reused?
17. Relation with traditional article - time
Publish
Data
AFTER: expand on your research articles, adding further information for reuse of the data
AT THE SAME TIME: publish your Data Descriptor(s) alongside research article(s)
OR BEFORE
18. Share your data, get credited and cited
Code in GitHub
Data in OpenfMRI
19. Data Descriptor: narrative and structure
Experimental metadata or structured component (in-house curated, machine-readable formats)
Article or narrative component (PDF and HTML)
20. Data Descriptor: narrative
Focus on data reuse
Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.
Does not contain tests of new scientific hypotheses
Sections:
• Title
• Abstract
• Background & Summary
• Methods
• Technical Validation
• Data Records
• Usage Notes
• Figures & Tables
• References
• Data Citations
Joint Declaration of Data Citation Principles by the
Data Citation Synthesis Group
21. Data Descriptor: narrative
Focus on data reuse
Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.
Does not contain tests of new scientific hypotheses
Sections:
• Title
• Abstract
• Background & Summary
• Methods
• Technical Validation
• Data Records
• Usage Notes
• Figures & Tables
• References
• Data Citations
In traditional publications this information is not provided in a sufficiently detailed manner. However, this information is essential for understanding, reusing, and reproducing datasets.
22. Data Descriptor: structure (CC0)
In-house editorial curator:
• helps users submit the structured content via simple templates and an internal authoring tool
• performs value-added semantic annotation of the experimental metadata
For advanced users/service providers willing to export ISA-Tab for direct submission, we have released a technical specification:
Data file or record in a database
analysis
method
script
23. Adding value to research articles and data records
Research papers
Data Descriptors
Data records
We currently recognize over 60 public data repositories
25. Peer review process focused on quality and reuse
Evaluation is not based on the perceived impact or novelty of the findings or the size of the data
• Experimental rigour and technical data quality
o Methodologically sound
o Technical validation experiments and statistical analyses
o Depth, coverage, size, and/or completeness of data sufficient for the types of applications
• Completeness of the description
o Sufficient details to allow others to reproduce the results, and to reuse the data or integrate them with other data
o Compliance with relevant minimum information or reporting standards
• Integrity of the data files and repository record
o Data files match the descriptions in the Data Descriptor
o Deposited in the most appropriate available databases
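The integrity check above (data files match the descriptions in the Data Descriptor) could be partly automated. The following is only a minimal sketch, not the journal's review tooling: it assumes a hypothetical manifest that maps each described file name to a SHA-256 checksum, and the names `sha256_of`, `check_data_records` and `demo` are invented for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_data_records(data_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare files on disk with a hypothetical manifest of name -> checksum."""
    on_disk = {p.name for p in data_dir.iterdir() if p.is_file()}
    listed = set(manifest)
    return {
        # Described in the manifest but absent from the deposit.
        "missing": sorted(listed - on_disk),
        # Present in the deposit but not described anywhere.
        "undescribed": sorted(on_disk - listed),
        # Present and described, but the content differs from the description.
        "mismatched": sorted(
            name for name in listed & on_disk
            if sha256_of(data_dir / name) != manifest[name]
        ),
    }


def demo() -> dict[str, list[str]]:
    """Build a throwaway deposit with made-up files and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        (data_dir / "samples.csv").write_text("id,value\n1,0.5\n")
        (data_dir / "notes.txt").write_text("not listed in the descriptor\n")
        manifest = {
            "samples.csv": sha256_of(data_dir / "samples.csv"),
            "images.zip": "0" * 64,  # described, but never deposited
        }
        return check_data_records(data_dir, manifest)


if __name__ == "__main__":
    print(demo())
```

A check like this covers only file presence and content; whether the data are methodologically sound or sufficiently described still needs a human reviewer.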