Workshop 4: Open Science & Open Data for Librarians/Ina Smith

Workshop 4:
Open Science & Open Data for Librarians
24 April 2018 14:00 – 17:30
XXIII SCECSAL Conference, Entebbe, Uganda

Programme
14:00 – 14:30 Introduction to Open Science/Open Data
14:30 – 15:00 Data informing the library profession
15:00 – 15:40 Data in support of research
15:40 – 16:00 Health Break
16:00 – 17:00 Working with data – tools & applications
17:00 – 17:30 Towards a data strategy for your library & institution

Data Stakeholders
• Governments (policy)
• Institutions (policy & strategy)
• Research Offices (reporting, impact)
• Researchers (collecting data in an ethical and
trusted way so that it can be re-used)
• Statisticians (processing, analysing and visualising
data)
• System engineers (to maintain a network and
allow for data to be digitally transmitted)
• Librarians (managing and organizing the data, and
making sure it is digitally preserved for the
long term)

Why Librarians as Data Partners?
• Information standards
• Organizational skills
• Setting up file structures (organizing
information)
• Knowledge of workflows
• Knowledge of collection management
• Describing data using established metadata
schemes & controlled vocabulary
• Collection curation/preservation

Role of Librarians
• Advocate for transparency, openness in research,
access to data
• Initiate the conversation on Open Science/Open Data
policy & strategy, and help implement it
• Develop own data skills (technical skills, but also
knowledge of copyright, licensing, citation)
• Increase visibility of research data
• Manage & register trusted data repositories
• Recommend trusted data repositories
• Promote & support proper research data
management planning among researchers

Data Skills for Librarians (1)
• Data terminology
• Unix-style command line interface, allowing librarians to
efficiently work with directories and files, and find and manipulate
data
• Cleaning and enhancing data in OpenRefine and spreadsheets
• Git version control system and the GitHub collaboration tool
• Web scraping and extracting data from websites
• Scientific writing in useful, powerful, and open mark-up
languages such as LaTeX, XML, and Markdown
• Formulating and managing citation data, publication lists, and
bibliographies in open formats such as BibTeX, JSON and XML, and
using open source reference management tools such as JabRef
and Zotero
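
As a minimal sketch of the "citation data in open formats" skill above, the Python below renders one citation record as both BibTeX and JSON. The citation key, the entry type and the helper names (to_bibtex, to_json) are illustrative assumptions, not part of the workshop material.

```python
import json

# Illustrative record describing this workshop; the key and entry type
# ("smith2018workshop", "misc") are assumptions chosen for the example.
record = {
    "key": "smith2018workshop",
    "type": "misc",
    "author": "Smith, Ina",
    "title": "Open Science & Open Data for Librarians",
    "year": "2018",
    "howpublished": "Workshop 4, XXIII SCECSAL Conference, Entebbe, Uganda",
}


def to_bibtex(rec):
    """Render a flat dict as a BibTeX entry, escaping '&' for LaTeX."""
    fields = {k: v for k, v in rec.items() if k not in ("key", "type")}
    lines = [f"@{rec['type']}{{{rec['key']},"]
    for name, value in fields.items():
        escaped = value.replace("&", r"\&")
        lines.append(f"  {name} = {{{escaped}}},")
    lines.append("}")
    return "\n".join(lines)


def to_json(rec):
    """Serialise the same record as indented, human-readable JSON."""
    return json.dumps(rec, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(to_bibtex(record))
    print(to_json(record))
```

Both outputs are plain text, so they can be kept under version control and imported by reference managers such as JabRef or Zotero.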

Data Skills for Librarians (2)
• Transforming metadata documenting research outputs into open plain
text formats for easy reuse in research information systems in support of
funder compliance mandates and institutional reporting
• Scholarly identity with ORCID and managing reputation with ORCID-
enabled scholarly sharing platforms such as ScienceOpen
• Authorship, contributorship, and copyright ownership in collaborative
research projects
• Demonstrating best practices in attribution, acknowledgement, and
citation, particularly for non-traditional research outputs (software,
datasets)
• Identifying reputable Open Access publications and Open
Institutional/Open Data repositories
• Scholarly annotation and open peer review
• Investigating and managing copyright status of a work, and evaluating
conditions for Fair Use

Introduction to Open Science/
Open Data

Types of data
• Government data
• Communication data (mobile phones)
• Internet data
• Statistical data
• Research data (social & natural sciences)
• Discipline specific
• And more …

What is “data” and why “data”?

Open Science Defined
“Open Science is the practice of science in such a
way that others can collaborate and contribute,
where research data, lab notes and other
research processes are freely available, under
terms that enable reuse, redistribution and
reproduction of the research and its
underlying data and methods.” - FOSTER Project,
funded by the European Commission

Open Science Research Lifecycle
(FOSTER)

[Figure: Research Data Lifecycle. Original image from University of California, Santa Cruz,
http://guides.library.ucsc.edu/datamanagement/
Labels added to the image: Repositories, Tools, Plan, Policy & Infrastructure]

Activity
http://bit.ly/scecsal2018

Fears Researchers Experience
• Getting scooped
• Time & effort by researcher
• Someone else finding a path-breaking application
of the data that researcher hasn’t considered
• Fear of problems/errors in the measurement
process being exposed
• Confidentiality/privacy of respondents - ethics
clearance
• Intellectual Property Rights – signed away, little
understanding, no IP in place

• When should research data be open?
• When should research data be closed?

Implement & Manage Trusted Data
Repositories
• IP, Copyright, Licensing, Citations, Persistent
Identifiers (DOIs), Metadata Standards
• Dataverse
https://dataverse.org/
• CKAN
https://ckan.org/
• DKAN
https://getdkan.org/
• Nesstar
http://www.nesstar.com/software/publisher.html
• CoreTrustSeal
https://www.coretrustseal.org/about/
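
Persistent identifiers such as DOIs, mentioned on this slide, tend to reach a repository in several spellings (bare, with a "doi:" prefix, or as a resolver URL). The sketch below normalises them to one form before they are stored as metadata. The regular expression is a deliberately loose assumption, not the full DOI syntax, and the DOI used in the usage example is made up.

```python
import re

# Loose check: "10." + 4-9 digit registrant code + "/" + non-empty suffix.
# This is a simplification for illustration, not the complete DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written before the identifier itself.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Return the bare, lowercase DOI, or None if it does not look like one."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    value = value.strip().lower()  # DOIs are case-insensitive
    return value if DOI_PATTERN.match(value) else None


def doi_url(doi):
    """Build the resolver link for an already normalised DOI."""
    return f"https://doi.org/{doi}"


if __name__ == "__main__":
    # Hypothetical DOI, used only to show the behaviour.
    for raw in ["doi:10.1234/Example.5678", "not a doi"]:
        print(raw, "->", normalise_doi(raw))
```

Storing the bare identifier and generating the link on display keeps the metadata independent of any one resolver address.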

Data Repositories vs Social Media
• Social media sites/3rd party software:
• Connect researchers sharing interests
• Marketing data
• Sites, and the data on them, belong to third parties
• Repository:
• Supports export/harvesting of metadata
• Offers long-term preservation
• Non-profit – no advertisements
• Uses open standards and protocols
• Copyright
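
To illustrate "export/harvesting of metadata" with open protocols, here is a minimal sketch assuming the repository exposes OAI-PMH with Dublin Core (oai_dc) records. The slide does not name a protocol, so OAI-PMH is an assumption; the endpoint URL and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH responses and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response used instead of a live harvest.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical survey dataset</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_records(SAMPLE_RESPONSE))
```

A social media site typically offers no comparable machine-readable export, which is the contrast the slide draws.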

Recommend Trusted Data
Repositories
https://www.re3data.org/
Find more repositories, datasets

Register Data Initiatives
• re3data.org
https://www.re3data.org/
• Open Data Barometer
https://opendatabarometer.org/
• Global Open Data Index
https://index.okfn.org/
• African Open Science Platform
http://africanopenscience.org.za/
• Dataverse … and more …

Research Data Management
https://github.com/DMPRoadmap
Research Proposal
Ethics Committee
Funder
Data Server &
Repository
Etc.

Working with data – tools &
applications

Working with Data
• Using R, Python, ggplot2 and more …
• Collection e.g. Survey
• Normalisation & Cleaning e.g. OpenRefine
• Analysis
• Visualisation
• Preservation
• Mining
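
The "Normalisation & Cleaning" step above can be sketched with nothing more than Python's standard library. The survey rows below are invented for illustration, and the rules (trim whitespace, unify capitalisation, treat blank counts as missing, drop duplicates) are assumptions about what a typical clean-up might involve; a tool such as OpenRefine offers the same operations interactively.

```python
import csv
import io

# Hypothetical raw survey export with stray spaces, mixed case,
# a duplicate row and a missing value.
RAW = """institution,country,datasets
 University A ,uganda,12
University A,Uganda,12
university b,South Africa ,
University C,KENYA,7
"""


def clean_rows(text):
    """Normalise names, convert counts to integers and drop duplicate rows."""
    reader = csv.DictReader(io.StringIO(text))
    seen = set()
    cleaned = []
    for row in reader:
        # Collapse whitespace and use consistent capitalisation.
        institution = " ".join(row["institution"].split()).title()
        country = " ".join(row["country"].split()).title()
        count = row["datasets"].strip()
        datasets = int(count) if count.isdigit() else None  # None = missing
        key = (institution, country)
        if key in seen:
            continue  # skip rows that duplicate an earlier institution
        seen.add(key)
        cleaned.append(
            {"institution": institution, "country": country, "datasets": datasets}
        )
    return cleaned


if __name__ == "__main__":
    for row in clean_rows(RAW):
        print(row)
```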

Data Visualisation
• Static: http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html
• Dynamic: https://blog.profitbricks.com/39-data-visualization-tools-for-big-data/

African Open Science Platform (AOSP)
Landscape Study
https://www.targetmap.com/viewer.aspx?reportId=56245
Please note: this is only a preview; the data still has to be cleaned,
updated and corrected.

Data Mining
• A set of methods for analysing data from various
dimensions and perspectives: finding previously
unknown, hidden patterns, classifying and grouping
the data, and summarizing the identified
relationships
The tasks of data mining are twofold:
• Predictive: use some features to predict
unknown or future values of the same or another
feature
• Descriptive: find interesting,
human-interpretable patterns that describe the
data
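
The two tasks can be illustrated on a toy scale. In the sketch below, the descriptive step groups and summarises invented deposit counts per discipline, and the predictive step fits an ordinary least-squares line to extrapolate the next year. The data, the function names and the choice of a straight-line model are all assumptions made for illustration; real data mining tools use far richer methods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (discipline, year, datasets deposited).
OBSERVATIONS = [
    ("health", 2014, 10),
    ("health", 2015, 14),
    ("health", 2016, 18),
    ("agriculture", 2014, 4),
    ("agriculture", 2015, 6),
    ("agriculture", 2016, 8),
]


def describe(observations):
    """Descriptive task: average deposits per discipline."""
    groups = defaultdict(list)
    for discipline, _year, count in observations:
        groups[discipline].append(count)
    return {discipline: mean(counts) for discipline, counts in groups.items()}


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(observations, discipline, year):
    """Predictive task: extrapolate deposits for one discipline and year."""
    xs = [y for d, y, _ in observations if d == discipline]
    ys = [c for d, _, c in observations if d == discipline]
    slope, intercept = fit_line(xs, ys)
    return slope * year + intercept


if __name__ == "__main__":
    print(describe(OBSERVATIONS))
    print(predict(OBSERVATIONS, "health", 2017))
```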

https://www.youtube.com/watch?v=W44q6qszdqY

https://my.rapidminer.com/nexus/account/index.html

Self- & Lifelong Learning
• Bachelor of Science in Data Science, Sol Plaatje University (South Africa)
• Coursera Data Science
• Coursera Research Data Management and Sharing
• Foster Open Science Courses
• Masters Program in Biodiversity Informatics, Prof Jean Ganglo, University of Abomey-
Calavi (Benin)
• MANTRA for Researchers
• MANTRA for Librarians
• Agricultural Information Management Standards (AIMS)
• Author Carpentry
• Data Carpentry
• Library Carpentry
• WDS Training Resources
• UCT eResearch

http://www.dcc.ac.uk/resources/metadata-standards/list

Towards a data strategy for your
library & institution

Open Science Open Data Statement

Open Science Open Data Policy
http://learn-rdm.eu/wp-content/uploads/red_LEARN_Elements_of_the_Content_of_a_RDM_Policy.pdf

Endorse the Accord
Call to Endorse

AOSP Focus Areas
• Policy
• Infrastructure
• Capacity Building
• Incentives

Library Frameworks
• Policy
• Infrastructure
• Capacity Building/CPD
• Incentives

Awareness – start the conversation
• To begin ….
• What data repositories? Which data type? Which
metadata standards?
• Data web page
• Market data support services
• Meet with stakeholders at institution
• Form a committee to implement strategy, policy,
etc.
• Implement Research Data Management Plans
• Implement Institutional Data Repository

http://internationaldataweek.org/

Thank you
Ina Smith
Project Manager, African Open Science Platform Project, Academy of
Science of South Africa (ASSAf)
ina@assaf.org.za
Susan Veldsman
Director, Scholarly Publishing Programme, Academy of Science of
South Africa (ASSAf)
susan@assaf.org.za
Visit http://africanopenscience.org.za
