Open Science Framework – Jeff Spies, Center for Open Science
Active research from lab to publication – Simon Coles, University of Southampton
Managing active research in the university – Robin Rice, University of Edinburgh
Making research available: FAIR principles and Force 11 - David De Roure, Oxford e-Research Centre
Jisc and CNI conference, 6 July 2016
13. Norms
Communality
Open sharing
Universalism
Evaluate research on own merit
Disinterestedness
Motivated by knowledge and
discovery
Counternorms
Secrecy
Closed
Particularism
Evaluate research by reputation
Self-interestedness
Treat science as a competition
14. Norms
Communality
Open sharing
Universalism
Evaluate research on own merit
Disinterestedness
Motivated by knowledge and
discovery
Organized skepticism
Consider all new evidence, even
against one’s prior work
Counternorms
Secrecy
Closed
Particularism
Evaluate research by reputation
Self-interestedness
Treat science as a competition
Organized dogmatism
Invest career promoting one’s own
theories, findings
15. Norms
Communality
Open sharing
Universalism
Evaluate research on own merit
Disinterestedness
Motivated by knowledge and
discovery
Organized skepticism
Consider all new evidence, even
against one’s prior work
Quality
Counternorms
Secrecy
Closed
Particularism
Evaluate research by reputation
Self-interestedness
Treat science as a competition
Organized dogmatism
Invest career promoting one’s own
theories, findings
Quantity
41. OpenSesame
Soon
29 grants to develop open tools and services: https://cos.io/pr/2015-09-24/
Publish Report
Search/Discovery
Develop Idea
Design Study
Collect Data
Store Data
Analyze Data
Write Report
64. Active research from lab to
publication
Prof. Simon Coles (s.j.coles@soton.ac.uk)
Director, UK National Crystallography Service
65. What is “active” here?
Continual management throughout the whole
experimental process
Immediate feedback during the experiment to
inform next steps or direction
67. Working Across Facilities
[Diagram: the research workflow (Conceive Research, Propose, Experiment, Analyse, Publish) shown across three lanes: NCS User, NCS and Central Facility. Step labels in the diagram: Proposal, Approval, Submission, Experiment, Analysis, Publication; Proposal, Approval, Schedule, Experiment, Archive, Analyse, Publish.]
72. In the context of ‘traditional’
publishing
• ELN as Supplementary Information for conventional
publication (Chemistry Central Journal 2013, 7:182)
73. Can we make metadata
do more for us (actively)?
Formal frameworks for real-time capture required…
74. A semantic framework for chemistry
• Describes and relates different types of process information
– elnItemManifest: high-level semantic description of ELN record
– Core Scientific Metadata model
– SIMS
– Reaction procedures: S88
– Analytical data: Allotrope Foundation
75. elnItemManifest
• Layered metadata model for description, export & packaging
• This is the first (information) layer – leads into knowledge
• Published through Dial-a-Molecule
at http://wp.me/p2JoQ6-xF & in J. Cheminform. 2013, 5:52
76. Core Scientific Metadata model
as a Starting Point
• Doesn’t cover all, but…
• Forms the basis for extensions:
- To derived data
- To laboratory based science
- To secondary analysis data
- To preservation information
- To publication data
[Diagram: entities in the model: Investigation, Publication, Keyword, Topic, Sample, Sample Parameter, Dataset, Dataset Parameter, Datafile, Datafile Parameter, Investigator, Related Datafile, Parameter, Authorisation]
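The entities named on this slide can be read as a nested hierarchy. The following is a minimal Python sketch, assuming the common reading in which an investigation holds samples and datasets, each dataset holds datafiles, and free-form parameters can be attached at each level. The class names, field names and example values are illustrative assumptions, not the model's published schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    # Free-form name/value pair; the slide shows parameters attached
    # to samples, datasets and datafiles.
    name: str
    value: str
    units: str = ""


@dataclass
class Datafile:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    investigators: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)

    def datafile_names(self) -> List[str]:
        # Walk investigation -> dataset -> datafile.
        return [f.name for d in self.datasets for f in d.datafiles]


# Hypothetical example values, for illustration only.
inv = Investigation(
    title="Example crystal structure determination",
    investigators=["A. Researcher"],
    keywords=["crystallography"],
    samples=[Sample("sample-001", [Parameter("colour", "pale yellow")])],
    datasets=[
        Dataset(
            "raw-frames",
            datafiles=[Datafile("frame_0001.img"), Datafile("frame_0002.img")],
            parameters=[Parameter("temperature", "100", "K")],
        )
    ],
)
print(inv.datafile_names())
```

Extensions such as derived data or preservation information would, under this reading, be added as further fields or linked classes rather than by changing the core hierarchy.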
77. SIMS:
Sample Information Management System
• A standard/format for crystallographic sample and
experiment data management and archival
• Supported by CrystalClear and NCS Portal, providing
interaction between facility, instruments and CIF, ImgCIF etc
78. Standards for reactions: S88
• Group arising from Dial-a-Molecule consisting of Mettler
Toledo, Pfizer, GlaxoSmithKline, AstraZeneca, Johnson &
Johnson, Southampton University, NextMove, Royal
Society of Chemistry looking to:
– Provide guidance for S88 implementations for synthetic organic
chemistry reaction procedures
– Provide example set
– Agree on controlled vocabularies for elements
– Generate a schema
– IUPAC uptake?
79. Standards for reactions: S88
16 (0.101 g, 0.132 mmol) and Cu(OTf)2 (0.006 g, 0.01 mmol) were added to a round-bottom flask under a N2
atmosphere. In a separate vial, 2 (0.155 g, 0.753 mmol) was dissolved in C2H4Cl2 (1.3 mL) and transferred to the reaction
flask. CF3CO2H (0.030 mL, 3 equiv) was added to the reaction mixture, which was refluxed at 100 °C for 1 h. The
reaction mixture was washed with saturated NaHCO3 (15 mL) and extracted with C2H4Cl2 (3 x 5 mL). The organic
fractions were collected, dried (MgSO4), and filtered to give a dark red solution. The solvent was removed, and the
product was purified by column chromatography (SiO2, 30:70 CH2Cl2 : hexane) to yield 17 as a pale yellow powder (0.096
g, 68% yield).
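To illustrate what a structured encoding of the free-text procedure above might look like, here is a minimal Python sketch that breaks it into ordered steps. The `Step` class, its field names and the action terms ("add", "reflux", and so on) are assumptions made for illustration; they are not the working group's schema or controlled vocabulary. Amounts are copied from the text as strings rather than parsed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Step:
    action: str  # hypothetical controlled-vocabulary term
    materials: Dict[str, str] = field(default_factory=dict)   # label -> amount
    conditions: Dict[str, str] = field(default_factory=dict)  # name -> value


# The procedure from the slide, one entry per operation in the text.
procedure: List[Step] = [
    Step("add",
         {"16": "0.101 g, 0.132 mmol", "Cu(OTf)2": "0.006 g, 0.01 mmol"},
         {"vessel": "round-bottom flask", "atmosphere": "N2"}),
    Step("dissolve",
         {"2": "0.155 g, 0.753 mmol", "C2H4Cl2": "1.3 mL"},
         {"vessel": "separate vial"}),
    Step("transfer", conditions={"to": "reaction flask"}),
    Step("add", {"CF3CO2H": "0.030 mL, 3 equiv"}),
    Step("reflux", conditions={"temperature": "100 °C", "duration": "1 h"}),
    Step("wash", {"saturated NaHCO3": "15 mL"}),
    Step("extract", {"C2H4Cl2": "3 x 5 mL"}),
    Step("dry", conditions={"drying agent": "MgSO4"}),
    Step("filter", conditions={"result": "dark red solution"}),
    Step("remove solvent"),
    Step("purify",
         conditions={"method": "column chromatography",
                     "stationary phase": "SiO2",
                     "eluent": "30:70 CH2Cl2 : hexane"}),
    Step("isolate",
         {"17": "0.096 g, 68% yield"},
         {"appearance": "pale yellow powder"}),
]


def actions(steps: List[Step]) -> List[str]:
    """Return the ordered list of action terms."""
    return [s.action for s in steps]


def materials_used(steps: List[Step]) -> Set[str]:
    """Return every material label mentioned in any step."""
    return {label for s in steps for label in s.materials}


print(actions(procedure))
print(sorted(materials_used(procedure)))
```

Once a procedure is held in this form, agreeing on the allowed action terms and field names is what makes records from different laboratories comparable.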
83. Research Data Alliance
• Chemistry data interest group
• Joint RDA/IUPAC Charter drafted
– Characterise chemical data types
– Leverage to establish standards
– Examine workflows in disciplines interacting with
chemistry
– Cultivate a sharing culture
85. Recording process
• Plan (Prospective provenance)
• Enactment (Retrospective provenance)
• Realisation
86. oreChem Plan for eCrystals
• Machine-readable
representation of methodology
• Describes requirements for
software and data products
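As a rough illustration of why a machine-readable plan is useful, the Python sketch below compares a plan (prospective provenance) with a log of what was actually done (retrospective provenance) and reports which required data products are still missing. All class names, step names, software labels and product names are hypothetical; this is not the oreChem vocabulary.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class PlannedStep:
    name: str
    software: str   # software the plan says is required
    produces: str   # data product the step must yield


@dataclass(frozen=True)
class EnactedStep:
    name: str
    software: str   # software actually used
    produced: str   # data product actually recorded


def unmet_products(plan: List[PlannedStep], log: List[EnactedStep]) -> Set[str]:
    """Data products required by the plan but absent from the enactment log."""
    required = {step.produces for step in plan}
    produced = {step.produced for step in log}
    return required - produced


# Hypothetical plan and partial enactment, for illustration only.
plan = [
    PlannedStep("collect", "instrument-control", "raw frames"),
    PlannedStep("solve", "structure-solution", "structure model"),
    PlannedStep("report", "cif-writer", "CIF"),
]
log = [
    EnactedStep("collect", "instrument-control", "raw frames"),
    EnactedStep("solve", "structure-solution", "structure model"),
]

print(unmet_products(plan, log))
```

The same comparison could be extended to check that the software recorded in the log matches what the plan required.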
87. CREAM:
Collaboration for Research
Enhancement using Active Metadata
• How to collect and use
metadata actively to
capture tacit information
• Active metadata: assemblage of
metadata and annotations used
actively within the process that
generates it (capable of being
reused by another process).
• Central Facilities; Chemistry;
Geosciences; Art; Music…
• Uptake: CODATA; Research
Data Alliance
https://blog.soton.ac.uk/cream/
89. Managing active research in
the University of Edinburgh
Robin Rice
University of Edinburgh
(@sparrowbarley)
90. Elements of the presentation
• Funded & unfunded research, PGR
students, collaborators
• Managing & support for …
– research grants
– research outputs
– research data
• Simplified research lifecycle
– (before – during – after)
91. New Research Management and Administration
System (grants)
Worktribe - Empowering and supporting research administrators
and investigators from idea through costing, approvals, award,
post-award management and closure.
• A recent requirements and procurement project focused on
delivery of a new 'Worktribe Research Management' system to
support improved pre-award and post-award business processes
across the University.
– 3 pilot schools / institutes November 2015
– Go-live across university April 2016
• The new 'Worktribe Research Management' system is integrated
with the existing Finance, HR and PURE systems.
• https://www.projects.ed.ac.uk/programme/rmas
92. Managing research outputs:
guidance for authors; OJS
1. Make your work Open Access
2. Use your name consistently
3. Use an ORCID identifier
4. Institutional affiliation
5. Acknowledge your funder
6. Statement on research data
7. Cite the DOI and OA links
8. Claim your digital space
‘Ensure your research reaches the widest possible global audience, is
eligible for submission in research assessment exercises, and fulfils funder
requirements.’
93. Managing Research Data
(from researcher training)
Day-to-day management of data
• Type, format, volume of data, chosen software for long-term access, existing data, file naming, structure, versioning, quality assurance process.
• Information needed for the data to be read and interpreted in future, metadata standards, methodology, definition of variables, format & file type of data.
• Restrict access to data, risks to data security, appropriate methods to transfer / share data, encryption.
• Secure & sufficient storage for active data, regular backups, disaster recovery
• Make data publicly available (where possible) at the end of a project, license data, any restrictions on sharing, access controls?
• Select data to keep, decide how long data will be kept, in which repository, costs involved in long-term storage?
94. Who manages active data?
• RDM Policy (2011) sets out roles and responsibilities
for researchers & the institution
– Researchers are responsible for their work
– Enabling role of institution
• Services, including stewardship
• Monitoring compliance
• Principal Investigators
• Research Institutes & Schools
96. From RDM programme to
Research Data Service…
• RDM programme a result of both bottom-up (‘action
group’) activity and top-down policy implementation
• Services had various providers and their purpose and
names were confusing
• New single service has
SO & SOM with Virtual Team
across Information Services
100. Finding and analysing data
• What is Data Library & Consultancy?
• The Data Library & Consultancy team assists researchers to
discover and use datasets for analysis, learning and teaching.
• Data librarians are available to help you find answers to data-
related questions.
• Tools include a data catalogue and the Survey Documentation
and Analysis online data browser
101. Storing data
• What is DataStore?
• DataStore is file storage for active research data, and is
available to all research staff and postgraduate research
students (PGRs).
• DataStore provides a free individual allocation for each
researcher, as well as shared group spaces. Additional capacity
of virtually any size is available.
102. Transferring data
• What is DataSync?
• DataSync is a tool to synchronise and share research data with
collaborators. It has an app to synchronise data to computers
and mobile devices, and a web interface to allow access to
data from any web browser.
• Data can be shared with anyone who has an email address, via
the web interface.
103. Versioning software
• What is Subversion?
• Subversion is a version control tool which allows users to store code. It is
also available as an extension called SourcEd which provides a web based
collaboration tool integrated with your repository.
• When documents stored in a Subversion repository are updated the old
versions are kept so you can revert if necessary. The service also allows
multiple people to collaborate on documents.
104. Data management support
One-to-one support is available on the following areas of RDM:
• Writing and reviewing DMPs;
• Creating SOPs for metadata collection and publication;
• Creating SOPs for good data management practice;
• Choosing a data repository and preparing data for deposit and
publication
105. Training - Online
• MANTRA: MANTRA is a free, non-credit, self-paced course
designed for postgraduate students and early career
researchers which provides guidelines for good practice in
research data management
• Research Data Management and Sharing MOOC: This free five-
week course - created by the Universities of Edinburgh and
North Carolina - is designed to reach learners across disciplines
and continents.
113. Training workshops
• Creating a data management plan for your grant application
• Managing your research data: why is it important and what
should you do?
• Working with personal & sensitive research data
• Good practice in research data management
• Handling data using SPSS
116. Making research available - FAIR
principles and Force 11
David De Roure, Oxford e-Research Centre
14/07/2016
117. David De Roure
@dder
Making research available:
FAIR principles and FORCE11
DIRECTOR, UNIVERSITY OF OXFORD E-RESEARCH CENTRE
118. A Brief History of Force11
● 2008/2009:
– Elsevier Grand Challenge
● 2010:
– Found & connected to Phil Bourne
– Planned Dagstuhl meeting
● 2011:
– January: Beyond the PDF, San Diego: 97 attendees
– August: Force11 at Dagstuhl: 34 attendees
– November: Manifesto is published >> Force11!
● 2012: Funding Sloan
● 2013: Beyond the PDF2, Amsterdam:
– 148 attendees, great discussion
● 2014: Working groups take off:
– Data Citation Principles Working group
– Resource Identifier Working group
● 2015: Force15, Oxford:
– 257 attendees
● 2016: Force 2016, Portland, Oregon
122. A diverse set of stakeholders -
representing academia, industry,
funding agencies, and scholarly
publishers - have come together to
design and jointly endorse a concise
and measurable set of principles,
for those wishing to enhance the
reusability of their data holdings
Including, but not limited to:
European Open Science Cloud –
High Level Expert Group
123. These put emphasis on enhancing the ability of
machines to automatically find and use the data, in
addition to supporting its reuse by individuals.
NOTE: The Principles are high-level; do not suggest
any specific technology, standard, or implementation-